Is the Academic Achievement Gap a Racist Idea?

In this post I’m going to examine two of the main points from a 2016 article where Ibram Kendi argues that “the academic achievement gap between white and black students is a racist idea.” Similar arguments are made in this 2021 article from the National Education Association, which addresses “the racist beginnings of standardized testing.”

I agree that score gaps, our methods for measuring them, and our continuous discussion of them can perpetuate educational inequities. Fixating on gaps can be counterproductive. However, I disagree somewhat with the claim from Kendi and others that the tests themselves are the main problem. They argue that the tests 1) have origins in intelligence testing and 2) assess the wrong kinds of stuff.

Before I dig into these two points, a few preliminaries.

  • I recognize that the articles I’ve linked above are opinion pieces, intended to push the discussion forward while advocating for change, and that their formats may not allow for a comprehensive treatment of these points. My response has more to do with these points needing elaboration and context, and less to do with them being totally incorrect or unfounded.
  • NPR On Point did a series in 2019 on the achievement gap, with one of the interviews featuring Ibram Kendi and Prudence Carter, and both acknowledge the potential benefits of standardized testing. I recognize that Kendi’s 2016 article may not fully capture his perspective on gaps or testing.
  • The term achievement gap can hide the fact that differential academic performance by student group results from differential access and opportunity, the effects of which compound over time. I’ll use achievement here to be consistent with previous work.

Intelligence vs achievement

In his 2016 article, Kendi doesn’t make a clear distinction between intelligence and achievement. He transitions from the former to the latter while summarizing the history of standardized testing, but he refers to the achievement gap throughout. The implication is that differences in intelligence are the same as, or close enough to, differences in achievement that the two can be treated interchangeably.

Intelligence and achievement are two moderately correlated constructs, as far as we can measure them accurately. They overlap, but they aren’t the same. Achievement can be improved through teaching and learning, whereas intelligence is thought to be more stable over time (though the Flynn effect raises questions here). Achievement is usually linked to concrete content that is the focus of instruction (eg, fractions, reading comprehension), whereas intelligence is more related to abstract aptitudes (eg, memory, pattern recognition).

An achievement gap is then an average difference in achievement for two or more groups of students, typically measured via standardized tests, with groups defined based on student demographics like race or gender.
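
To make the definition above concrete, here is a minimal sketch of how a gap is often summarized: as a raw difference in group means, and as that difference divided by a pooled standard deviation (a standardized mean difference). The function name, group labels, and scores are all hypothetical and made up for illustration. They are not data from any real test, and this is one common summary rather than the method used by any particular testing program.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_gap(scores_a, scores_b):
    """Return (raw mean difference, standardized mean difference) for two groups."""
    n_a, n_b = len(scores_a), len(scores_b)
    raw_gap = mean(scores_a) - mean(scores_b)
    # Pooled variance: each group's sample variance weighted by its degrees of freedom
    pooled_var = (
        (n_a - 1) * stdev(scores_a) ** 2 + (n_b - 1) * stdev(scores_b) ** 2
    ) / (n_a + n_b - 2)
    return raw_gap, raw_gap / sqrt(pooled_var)


# Hypothetical scale scores for two student groups (illustration only)
group_a = [510, 495, 530, 520, 505]
group_b = [500, 480, 515, 510, 495]

raw, standardized = standardized_gap(group_a, group_b)
print(f"raw gap: {raw} points, standardized gap: {standardized:.2f} SD")
```

Note that a summary like this only describes the size of a difference in scores. It says nothing about why the difference exists, which is the question the rest of this post is concerned with.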

Data show that groups differ in variables related both to achievement and intelligence, but how and whether we can or need to interpret these group differences is up for debate. We set instructional and education policy goals based on achievement results. It’s not clear what we do with group differences in intelligence, which leads many to question the utility of analyzing intelligence by race, especially when differences are attributed to heredity (this Slate article by William Saletan summarizes the issue well).

Why is a distinction between constructs important? Because the limitations of intelligence testing don’t necessarily carry over into achievement. Both areas of testing involve standardization, but they differ in essential ways, including in design, content, administration, scoring, and use. Intelligence tests need not connect to a specific education system, whereas most achievement tests do (eg, see California content standards, the foundation of its annual end-of-year achievement tests, currently SBAC).

Both of the articles I linked at the start highlight some of the eugenic and racist origins of intelligence testing. Following the history into the 1960s and then 1990s, Kendi notes that genetic explanations for racial differences in intelligence have been disproven, but he still presents achievement testing and the achievement gap as a continuation of the original racist idea.

While intelligence as a construct is roughly 100 years old, standardized testing has actually been around for hundreds if not thousands of years (eg, Chinese civil service exams, from Wikipedia). This isn’t to say achievement tests haven’t been used in racist ways in the US or elsewhere, but the methods themselves aren’t necessarily irredeemable simply because they resemble those used in intelligence testing.

Charles Murray, co-author of the controversial 1994 book The Bell Curve (mentioned by Kendi), also seems to conflate intelligence with achievement. Murray claims that persistent achievement gaps confirm his prediction that intelligence differences will remain relatively stable (see his comments at AEI.org). However, studies show that racial achievement gaps are to a large extent explained by other background variables and can be reduced through targeted intervention (summarized in this New York Magazine article, which is where I saw the Murray comments above; see also this article by Linda Darling-Hammond and this one by Prudence Carter). This research tells us achievement is malleable and should be treated separately from intelligence.

Kinds vs levels of achievement

Kendi and others argue that the contents of standardized tests don’t represent the kinds of achievement that are relevant to all students. The implication here is that differences in levels of achievement (ie, gaps) arise from biased test content, and can be explained by an absence of the kinds of achievement that are valued by or aligned with the experiences of underrepresented students. Kendi says:

Gathering knowledge of abstract items, from words to equations, that have no relation to our everyday lives has long been the amusement of the leisured elite. Relegating the non-elite to the basement of intellect because they do not know as many abstractions has been the conceit of the elite.

What if we measured literacy by how knowledgeable individuals are about their own environment: how much individuals knew all those complex equations and verbal and nonverbal vocabularies of their everyday life?

This sounds like culturally responsive pedagogy (here’s the Wikipedia entry), where instruction, instructional materials, and even test content seek to represent and engage students of diverse cultures and backgrounds. We should aim to teach with our entire student population in mind, especially underrepresented groups, rather than via one-size-fits-all approaches that default to tradition or the majority. But we’re still figuring out how this applies to standards-based systems. And, though culturally responsive pedagogy may be optimal, we don’t know that achievement gaps hinge on it.

While I have seen examples of standardized achievement tests that rely on outdated or irrelevant content, I haven’t seen evidence showing that gaps would shrink significantly if we measured different kinds of achievement. Kendi doesn’t reference any evidence to support this claim.

Continuing on this theme, Kendi targets standardized tests themselves as perpetuating a racial hierarchy. He says:

The testing movement does not value multiculturalism. The testing movement does not value the antiracist equality of difference. The testing movement values the racist hierarchy of difference, and its bastard 100-year-old child: the academic achievement gap.

This might be true to some extent, but if our tests are constructed to assess generally the content that is taught in schools, an achievement gap should result more from inequitable access to quality instruction in that content, or the appropriateness of that content, than from testing itself. In this case, other variables like high school grade point average and graduation rate will also reflect achievement gaps to some extent. So, the concern may have more to do with standardized education not valuing multiculturalism than with standardized testing.

Whatever the reasons, I agree that multiculturalism hasn’t been a priority in the testing movement over the past century. This has bothered me since I started psychometric work over ten years ago. Standardization pushes us to materials devoid of context that is meaningful at the individual or subgroup levels. Fortunately, I am seeing more discussion of this issue in the educational and psychological measurement literature (eg, this article by Stephen Sireci) and am excited for the potential.

Final thoughts

Although my comments here have been critical of the anti-testing and anti-gap arguments, I agree with the general concern around how we discuss and interpret achievement gaps. I wouldn’t say that standardized testing is solely to blame, but I do question the utility of spending so much time measuring and reporting on achievement differences by student groups, especially when we know that these differences mostly reflect access and opportunity gaps. The pandemic has only heightened these concerns.

Returning to the question in the title of this post (is the academic achievement gap a racist idea?), I would say yes, sometimes. Gaps can be misinterpreted in racist ways as being heritable and immutable. To the extent that documenting achievement gaps contributes to inequities, I would agree that the process itself can become a racist one.

That said, research indicates that we can document and address achievement gaps in productive ways, in which case valid measurement is essential. As you might guess, I would aim for better testing instead of zero testing, including measures that are less standardized and more individualized and culturally responsive. The challenge here will be convincing test developers and users that we can move away from norm-referenced score comparisons without losing valuable information.

I didn’t really get into achievement gap research here, outside of a narrow critique of standardized testing. If you’re looking for more, I recommend the articles by Linda Darling-Hammond and Prudence Carter linked above, as well as the NPR On Point series. There’s also this 2006 article by Gloria Ladson-Billings based on her presidential address to the American Educational Research Association. Amy Stuart Wells continues the discussion in her 2019 presidential address, on YouTube.