Standardized tests aren't biased, says new data—but scores reflect society's biases

Standardized testing has been under fire in recent years, including in Massachusetts schools, where high school students must take the MCAS to graduate.

Kana Ruhalter is the associate producer of GBH’s All Things Considered. Feedback? Questions? Story ideas? Reach out to Kana at kana_ruhalter@wgbh.org.

Arun Rath hosts the local broadcast of GBH’s All Things Considered.

January 23, 2024

In the last decade, public opinion on standardized testing nationwide has shifted dramatically, with critics citing racial and economic disparities in scores. Efforts to do away with testing requirements have gained significant traction, and the pandemic only seemed to accelerate that momentum—so much so that dozens of universities dropped SAT and ACT scores as part of their admissions requirements.

Here in Massachusetts, the fate of MCAS—the Massachusetts Comprehensive Assessment System, which must be passed to graduate from high school—is in limbo. The state’s largest teachers’ union obtained over 130,000 signatures in support of a ballot question that proposes dropping the requirement entirely.

Advocates commend the step away from standardized testing as a victory for equity in education, but new research is showing that those test scores are a much stronger indicator of success in higher education than grade point averages alone.

So what does this mean? Should admissions officers reinstate standardized testing requirements? Brown University economics professor John Friedman, who co-authored the new study, joined GBH’s All Things Considered host Arun Rath to discuss. What follows is a lightly edited transcript.

Arun Rath: Before we dig in, talk a bit about your study’s methodology and what it reveals about the relationship between test scores and academic performance.

John Friedman: Our study is really pretty straightforward. We’re looking at students who have attended a set of Ivy-Plus institutions over the past five or six years and simply asking: “What information, what variables that were available at the time when they were admitted, does the best job at predicting a student’s level of academic success once they are in school?” So, we look at standardized test scores. We look at high school GPAs.

What we find is in some contrast to what might be the received wisdom out there. In this setting, standardized test scores do substantially better than high school GPA at predicting, for instance, a student’s grade point average or the chances that a student will academically struggle once they’re in school.

That’s not only true just in the raw data; it’s true when you adjust for differences by family income, background, race, gender and what high schools students have gone to. It’s important that we know these correlations are not merely driven by racial disparities or something like that. What we find is that these test scores are more predictive than high school grades, even when comparing students of the same family income, race and gender.

Rath: I don’t want to sound too basic about it, but it sounds like that essentially confirms the reason why we had standardized tests to begin with.

Friedman: I think that’s right. Obviously, there’s a lot that goes into admissions at these institutions, but assessing a student’s level of academic preparation is clearly one of those things. It turns out that test scores—they’re far from perfect. They’re not a perfect predictor, but they do seem to be the best piece of information we have.

Rath: Do you have a sense of why grade point averages aren’t a better indicator of student success and why that doesn’t correlate as well?

Friedman: That’s a great question. I think there are two factors that contribute to this.

First of all, a 4.0, or an A, means so many different things across different high schools. Different high schools have different grading standards. Often, grades are given out more on a relative basis than on an absolute basis. Students are taking different courses, so it can be very difficult to compare students with high school GPAs in one setting versus another. That’s especially difficult for these highly selective private universities that are considering tens of thousands of applicants from not only the entire country, but the world.

Second, there’s been a lot of grade inflation in the past couple of decades. The fraction of students that are, essentially, receiving a perfect GPA is higher than ever before. Now, still, there’s a lot of variation between students that are getting top scores and students who are not. But for those colleges that are admitting only a small share of students who are among the very most academically prepared students, grades just aren’t as helpful, I think, as they used to be in distinguishing, say, who’s the first or second student in a class from who might be the ninth or 10th.

Rath: One of your collaborators on the study, Raj Chetty, said that regarding standardized testing, your findings seem to “strengthen the argument that the disparities in SAT scores are a symptom, not a cause, of inequality in the US.” Could you expand on that?

Friedman: Sure. One major concern of those who have been opposed to the use of testing is that tests are biased against students coming from less advantaged backgrounds. I think it’s intuitive to see why that could be a reasonable concern.

If students from, say, higher-income families are able to hire tutors to study for the test or are able to take the test many different times, there’s a real concern that a given score might not mean the same thing for a student coming from a very high-income background as opposed to a student who is maybe the first in their family to go to college. We empirically examined this hypothesis by asking exactly that question. Take two students who have the same test score.

Do they perform similarly or not when they get to college? The answer is that yes—they perform similarly. If anything, the students from more well-resourced backgrounds outperform a little bit. But I think the big takeaway is that a [score of] 1400 on the SAT or a 34 on the ACT kind of means the same thing in terms of predicting success in college.

That leads to the next question: How can we account for these very large disparities across racial groups, across children from high- and low-income families, that we see in average test scores? I think the answer is that it reflects enormous disparities in the quality of education and in many other environmental factors that these children experience through the first 17 or 18 years of their lives.

It’s not that the test itself is biased; it’s that this is picking up enormous disparities, but we shouldn’t address it just by calling the test biased and ignoring it. We need to address it by attacking those underlying disparities that exist in so many aspects of life.

Rath: I know your work isn’t trying to propose a solution, but if I could put you on the spot and ask you, in your professional opinion, do you think standardized testing requirements should be upheld or, in some cases, restored?

Friedman: This finding—that test scores are much better predictors of academic outcomes than high school GPAs—doesn’t need to hold in every single higher education setting. What I really encourage educational policymakers to do is examine what the data look like in their setting.

For instance, the University of California at Berkeley has produced a very similar set of results. That shows that in that setting, too, test scores are about an equal predictor, but you can do much better predicting academic success with both test scores and high school grades, rather than only with high school grades.

But if, in other circumstances, colleges look at the data and find that test scores are not a useful part of the prediction of academic preparation, then I see no reason to insist on that being a part of the admissions process, just out of principle.
