Eliminating standardized tests in college admissions: the new affirmative action?

Author: Zwick, Rebecca
Source: Phi Delta Kappan v. 81 no4 (Dec. 1999) p. 320-4
ISSN: 0031-7217 Number: BEDI99035341

College enrollment figures aren't ordinarily big news, but the 1998 freshman enrollment numbers for the University of California's most prestigious campuses were startling enough to warrant headlines. At the University of California, Berkeley, African American enrollment dropped by more than 60% from 1997 levels, and Latino enrollment dropped by nearly 50%. UCLA experienced dramatic decreases as well.(1)

Since the passage in 1996 of California's Proposition 209, which banned consideration of race or ethnicity in admissions decisions at public colleges and universities, University of California educators have feared just such a plunge in minority representation and have been considering ways to counteract it. In 1997 the university settled on an apparently simple solution: eliminate the SAT as a criterion for admissions. "We ... have evidence that the SAT loses us 2,000 Latino students this year alone," said Eugene Garcia, dean of the School of Education at Berkeley, in a 1997 interview.(2)

Although the university's enthusiasm for eliminating the SAT may have faded, admissions testing remains a source of controversy. A new document from the U.S. Department of Education, "Nondiscrimination in High-Stakes Testing" (still in draft form), advises that colleges may be in legal jeopardy if they rely too heavily on standardized test scores in making admissions or financial aid decisions. The president of the University of California, Richard Atkinson, said in a March 1999 interview that he "would be prepared to forget the SAT" if the newly approved California high school exit examination proves to be a good test.(3) And a bill that would deemphasize the role of standardized testing in admissions decisions (S.B. 145), introduced for the second time in January 1999, awaits action in the California senate. (An earlier version of the bill, introduced in 1998, passed both houses of the legislature but was vetoed by the outgoing governor, Pete Wilson.)

Meanwhile, Texas has been grappling with the effects of the Hopwood decision, which banned the use of race in admissions programs, and the state of Washington has been faced with the consequences of Initiative 200, a Prop 209 clone that was passed in 1998. These political developments have provoked a reconsideration of the role of tests in college admissions and have focused serious attention on two questions: Are standardized admissions tests biased against minorities, as is often argued? Would eradicating these tests produce a more ethnically diverse freshman class?

THE QUESTION OF BIAS

Differences between racial and ethnic groups in their performance on standardized tests -- including the SAT (from the Educational Testing Service) and its competition, the ACT (from ACT, Inc.) -- have been analyzed extensively both in academic journals and in the popular press. Researchers, social theorists, and politicians have offered an array of reasons for these score differences, ranging from socioeconomic, cultural, linguistic, and genetic factors to test bias. A recent inflammatory contribution to this literature was The Bell Curve, by Richard Herrnstein and Charles Murray, which was published in 1994 and encouraged consideration of genetic explanations for group differences in test scores.(4) But the controversy has not been limited to the reasons for the differences in performance. Even the matter of determining which groups are advantaged by standardized tests is less straightforward than it first appears.

In the popular press, the existence of bias in admissions tests is typically assumed to be demonstrated by the persistent pattern of differences between racial groups in average test scores. The idea that score differences are sufficient evidence to establish bias is reflected in the original language of the California standardized testing legislation that is currently under consideration. According to the initial version of the bill, "a test discriminates ... if there is a statistically significant difference in the outcome on test performance when test subjects are compared on the basis of gender, ethnicity, race, or economic status."(5) Another example of the view that score differences are sufficient evidence for test bias can be found at a website maintained by Time and the Princeton Review, a test preparation company: "Studies show persistent ... race bias in both the SAT and the ACT. ... The SAT favors white males, who tend to score better than all other groups except Asian-American males."(6)

When academic researchers investigate the fairness of the SAT, however, they don't ordinarily focus on the average scores achieved by each ethnic group. Instead, they consider another aspect of the test results: How well does the SAT predict college grades for each group? Researchers have typically found that using the SAT to predict first-year college grade-point averages (GPAs) results in a more positive prediction for black and Latino test-takers than is warranted; that is, the predicted grades tend to exceed the actual grades for these groups.

For example, a 1994 College Board study found that "there were, on average, underpredictions of college GPAs for Asian American students (and to a lesser extent, white students) and overpredictions for American Indian, black and Hispanic students."(7) In other words, SAT scores tended to predict higher college grades than were actually attained by African American, Latino, and American Indian students and lower grades than were actually attained by Asian American and white students. In discussing the recurrent finding of inflated predictions for African Americans, Robert Linn, an eminent educational researcher, noted in 1983 that this result is "contrary to a commonly held expectation that tests are unfair to certain minority groups in the sense that they give a misleadingly low indication of the likely performance ... in school. The overprediction finding suggests that, if anything, just the opposite is true."(8) In their widely acclaimed 1998 book, The Shape of the River, William Bowen and Derek Bok also include an extensive discussion of this phenomenon.(9)

What's the real story about differences in ethnic group performance on the SAT? Do black and Latino test-takers tend to score lower, or are predictions of their college grades based on their SAT performance inflated? Paradoxical as it may seem, both these patterns have characterized SAT results for many years.

The 1994 College Board study provides a useful context for illustrating these seemingly contradictory results. This research, based on 1985 data from 45 colleges, represents the most detailed and painstaking analysis of the utility of the SAT as a predictor of college grades. A portion of the results -- those for African Americans, Asian Americans, Latinos, and whites -- is given here. The much smaller American Indian group is not included. (See Table 1.)

The average SAT scores, high school GPAs, and college GPAs show substantial differences across groups. Average SAT scores are higher for Asian American and white students than for African American and Latino students. The difference is more dramatic for the math score than for the verbal score. The average SAT math score for Asian Americans is about 130 points higher than the average SAT math score for African Americans.(10) (The 1998 SAT results reveal similar patterns.) If a difference in average performance were considered sufficient to demonstrate test bias, then these findings would appear to show bias against African American and Latino test-takers. (If this were the sole criterion, we would have to conclude that high school and college grades were biased as well.)

However, in the world of psychometrics, the assessment of test bias is conceptualized differently. Group performance differences can arise for many reasons that are not a function of the test itself -- unequal educational opportunity being the most obvious -- so the absence of such differences is not considered a criterion for test fairness. Instead, traditional psychometric analysis focuses on another question: Is the test an effective and accurate predictor of college GPAs for all groups? (Here we consider only ethnic groups, but other demographic groups -- males, females, native and non-native speakers of English -- are ordinarily examined as well.)

The first step in the psychometric investigation is to assess the validity of the test for students as a whole. Does the SAT lead to better prediction of college grades than could be obtained using high school grades alone? Typically, the effectiveness with which SAT verbal scores, SAT math scores, and high school grades can jointly predict college grades is evaluated through linear regression analysis, a standard statistical procedure that is used in a variety of prediction applications. The regression analysis yields an equation for predicting college grades from high school grades, SAT math scores, and SAT verbal scores (each multiplied by a weighting factor and then added up). Predictive effectiveness is measured by the degree of correspondence between the predicted college grades and the actual college grades. The analysis can then be repeated using high school grades alone as a predictor. Comparing the results of the two analyses yields an estimate of the "value added" by using SAT scores.
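
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student records are synthetic (generated from a made-up "preparation" variable, not drawn from the College Board data), SAT scores are expressed in hundreds of points, and R-squared is used as one common measure of correspondence between predicted and actual grades. The helper names fit_ols and r_squared are hypothetical.

```python
import random

def fit_ols(rows, y):
    """Least-squares weights (intercept first) from the normal equations."""
    X = [[1.0, *r] for r in rows]                 # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(rows, y, w):
    """Share of the variance in y reproduced by the fitted equation."""
    pred = [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Synthetic students (an assumption for illustration, not real data).
rng = random.Random(0)
students = []
for _ in range(2000):
    prep = rng.gauss(0, 1)                        # hypothetical latent preparation
    hs_gpa = 3.0 + 0.4 * prep + rng.gauss(0, 0.3)
    sat_v = 5.0 + 0.8 * prep + rng.gauss(0, 0.7)  # hundreds of points
    sat_m = 5.2 + 0.8 * prep + rng.gauss(0, 0.7)
    college_gpa = 2.6 + 0.5 * prep + rng.gauss(0, 0.5)
    students.append((hs_gpa, sat_v, sat_m, college_gpa))

y = [s[3] for s in students]
hs_only = [(s[0],) for s in students]
full = [(s[0], s[1], s[2]) for s in students]

r2_hs = r_squared(hs_only, y, fit_ols(hs_only, y))
r2_full = r_squared(full, y, fit_ols(full, y))
value_added = r2_full - r2_hs                     # gain from adding SAT scores

print(f"High school GPA alone: R^2 = {r2_hs:.3f}")
print(f"High school GPA + SAT: R^2 = {r2_full:.3f}")
print(f"Value added by SAT:    {value_added:.3f}")
```

Because the second equation contains every predictor in the first, its in-sample fit can never be worse; the substantive question is how large the gain is, which in real data depends on how much information the test scores carry beyond high school grades.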

After these analyses are completed for the entire group of students, the next step is to perform a separate prediction analysis within each ethnic group and to compare the resulting equations across groups. The College Board study evaluated various combinations of the three key predictors of college grades. Consistent with earlier research, the results showed that high school grades and SAT scores are important predictors in all ethnic groups and that including the SAT did lead to better prediction than using high school grades alone.(11) Research conducted at the University of California in 1997 produced the same conclusion.(12) In the College Board study, prediction was somewhat more effective for white and Asian American test-takers than for African American and Latino test-takers, regardless of which combination of predictors was used. In the African American group, unlike the other groups, SAT scores alone provided slightly more effective prediction than high school grades alone.

Although test validity research involves the computation of separate prediction equations for each ethnic group, admissions decisions within a college are ordinarily made by means of a common prediction equation for all ethnic groups. Will the use of a single equation result in systematic over- or underprediction of college grades for certain groups? This can be determined by comparing the actual first-year college grades to the predicted grades (obtained using the equation based on all students). Table 2 shows the average differences between actual college GPA and predicted college GPA for each group. A minus sign indicates overprediction (actual grades lower than predicted grades); a plus sign, underprediction (actual grades higher than predicted grades).
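
A minimal sketch of this check, again on synthetic data, may make the sign convention concrete. Everything here is an assumption made for illustration: two hypothetical groups "A" and "B", a single predictor score, and a built-in gap that gives group B lower grades at the same score so that the common equation overpredicts for it.

```python
import random

rng = random.Random(1)

# Hypothetical records: (group, predictor score, college GPA).
# Group B is deliberately given a lower GPA at any given score.
records = []
for group, n, mean_score, gpa_shift in (("A", 900, 5.0, 0.0),
                                        ("B", 100, 4.5, -0.3)):
    for _ in range(n):
        score = rng.gauss(mean_score, 1.0)
        gpa = 1.0 + 0.3 * score + gpa_shift + rng.gauss(0, 0.4)
        records.append((group, score, gpa))

def fit_line(xs, ys):
    """Intercept and slope of the least-squares line through (xs, ys)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# One common equation, fitted to all students together.
intercept, slope = fit_line([r[1] for r in records], [r[2] for r in records])

def mean_gap(rows):
    """Average of actual GPA minus predicted GPA (minus = overprediction)."""
    return sum(gpa - (intercept + slope * score)
               for _, score, gpa in rows) / len(rows)

overall_gap = mean_gap(records)
gap_by_group = {g: mean_gap([r for r in records if r[0] == g])
                for g in ("A", "B")}

print(f"overall: {overall_gap:+.3f}")
for g, gap in gap_by_group.items():
    print(f"group {g}: {gap:+.3f}")
```

In this toy setup the overall gap is zero up to rounding, the smaller group shows a negative gap (overprediction), and the larger group a slightly positive one, which is the same qualitative pattern the minus and plus signs in Table 2 record.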

By definition, the equation will, on average, predict perfectly for the overall group. The white results will necessarily be similar since whites constitute about 82% of the total group in the study. But how do the results stack up for the remaining ethnic groups? Whether SAT score, high school GPA, or a combination is included in the equation, the results for Asian American test-takers are slightly underpredicted, while the results for African American and Latino test-takers are overpredicted. It is worth noting that overprediction is mitigated by the use of the SAT -- it's even worse when only high school GPA is used. For example, college GPAs for African Americans are overpredicted by an average of .35 when only high school GPAs are used as predictors. When SAT scores are included in the prediction equation, the average overprediction is reduced to .16.(13)
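
These statements can be checked against the published figures. The short script below uses only the group sizes from Table 1 and the average gaps from Table 2; the variable names are illustrative. Because the tables omit the small American Indian group and report gaps rounded to two decimals, the size-weighted average of the four listed groups should come out close to zero rather than exactly zero.

```python
# Group sizes from Table 1 (the overall count also includes the small
# American Indian group, which the tables omit).
n = {"African American": 2475, "Asian American": 3848,
     "Latino": 1599, "White": 36743}
overall_n = 44849

# Average actual-minus-predicted college GPA from Table 2, in the same
# group order as the sizes above.
gaps = {
    "High school GPA": [-0.35, 0.02, -0.24, 0.03],
    "SAT (verbal and math)": [-0.23, 0.08, -0.13, 0.01],
    "High school GPA plus SAT": [-0.16, 0.04, -0.13, 0.01],
}

# Share of the study group that is white.
white_share = n["White"] / overall_n

# Size-weighted average gap across the four listed groups, per equation.
weighted_gap = {
    row: sum(size * gap for size, gap in zip(n.values(), values))
         / sum(n.values())
    for row, values in gaps.items()
}

# How much adding the SAT shrinks overprediction for African Americans.
reduction = (abs(gaps["High school GPA"][0])
             - abs(gaps["High school GPA plus SAT"][0]))

print(f"white share of study group: {white_share:.1%}")
for row, gap in weighted_gap.items():
    print(f"{row}: weighted average gap {gap:+.4f}")
print(f"reduction in overprediction: {reduction:.2f} grade points")
```

The weighted gaps land within a few thousandths of a grade point of zero for all three equations, consistent with an equation that predicts correctly on average for the group as a whole.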

What explains the overprediction? A variety of reasons have been advanced, including differences across groups in high school courses taken or in the stringency of high school grading practices, differences across groups in the choice of college curriculum, and a greater incidence in ethnic minority groups of life difficulties that interfere with academic performance in college.

The results of the College Board study mirror the general findings of SAT validity research from the last several decades. First, for all ethnic groups, tests do contribute to the prediction of college performance as measured by college GPA. Second, there's some evidence of ethnic group differences in the effectiveness and accuracy of prediction. Third, it's possible for a group to have lower average test scores than other groups and still receive inflated predictions of later performance. The overriding conclusion is neither new nor earthshaking: in crafting a college admissions policy, tests serve as useful, but far from perfect, tools.

WOULD ELIMINATING THE SAT IMPROVE ETHNIC DIVERSITY?

If colleges removed the SAT from admissions criteria, what would be the likely result? This is the very question addressed in a December 1997 report issued by the Office of the President, University of California.(14) It was based on supplementary analyses of data from a study conducted by the California Postsecondary Education Commission (CPEC).(15) Transcripts, test scores, and demographic information from a 6% random sample of 1996 graduates of California public high schools were analyzed to determine the effect of applying various admissions criteria. The study issued by the Office of the President considered how eliminating standardized admissions tests would affect the rates of "UC eligibility," which is based on the completion of certain college-preparatory courses, the GPA for those courses, and (if the GPA is below 3.3) scores on the SAT or ACT.

The study's conclusion was surprising to some: eliminating the admissions test requirement, when combined with other mandated features of admissions policy at the University of California, would produce very small changes in the eligibility rates for Latinos (from 3.8% to 4.0%), African Americans (from 2.8% to 2.3%), and Asian Americans (from 30% to 29%). The largest change would be an increase in the eligibility rate for whites (from 12.7% to 14.8%).(16)

The analysis that produced these projections of eligibility rates incorporated the provisions of the Master Plan for Higher Education in California, which mandates that 12.5% of the state's high school graduates be declared "UC eligible." If the admissions test requirement were dropped, the minimum GPA for the required college-preparatory courses would need to be raised, a change that leads to the predicted effects on eligibility rates. Dropping the SAT, while simultaneously ignoring the "12.5%" requirement, increased eligibility to 18.7% overall, while leaving the pattern of ethnic-group eligibility virtually unchanged. (This analysis, as well as many of the conjectures in this article, is based on the implicit assumption that eliminating the SAT would not have a substantial impact on high school grading practices. Some educators have raised the concern that rampant grade inflation would occur if the SAT requirement were lifted, rendering high school grades useless as an admissions criterion.)
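
The link between a fixed eligibility share and a higher GPA floor can be illustrated with a deliberately simplified model. The sketch below assumes that eligibility depends on GPA alone, ignores course requirements and the university's actual eligibility rules, and uses synthetic GPAs rather than CPEC data; the 3.3 starting floor is borrowed from the GPA figure mentioned earlier purely as a convenient reference point.

```python
import random

rng = random.Random(2)

# Hypothetical GPAs for 10,000 graduates (synthetic, clamped to 0-4).
gpas = [min(4.0, max(0.0, rng.gauss(2.8, 0.6))) for _ in range(10_000)]

def eligible_share(gpas, floor):
    """Fraction of graduates whose GPA is at or above the floor."""
    return sum(g >= floor for g in gpas) / len(gpas)

def floor_for_share(gpas, share):
    """GPA of the last graduate inside the top `share` of the ranking."""
    ranked = sorted(gpas, reverse=True)
    k = int(len(ranked) * share)
    return ranked[k - 1]

# With no test requirement and a GPA floor of 3.3, the pool is too large.
share_without_test = eligible_share(gpas, 3.3)

# Raising the floor restores the mandated 12.5% pool.
new_floor = floor_for_share(gpas, 0.125)
share_at_new_floor = eligible_share(gpas, new_floor)

print(f"share eligible at a 3.3 floor: {share_without_test:.1%}")
print(f"floor needed for 12.5%:        {new_floor:.2f}")
print(f"share eligible at that floor:  {share_at_new_floor:.1%}")
```

In this toy population a GPA-only rule with a 3.3 floor admits more than one graduate in eight, so holding the pool at 12.5% requires a higher floor. The actual magnitudes in California depend on the real distribution of grades and on the course requirements, which this sketch does not model.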

The minimal changes in the predicted eligibility rates for African American and Latino students are less remarkable in light of the finding that "low test scores rarely are the only reason for a student's ineligibility."(17) In fact, the CPEC report on eligibility shows that only 2.5% of California public high school graduates were ineligible solely on the basis of inadequate test scores. Most students -- 62.6% of graduates overall -- were ineligible because they had "major course omissions" or grade deficiencies or because they attended "schools that did not have a college-preparatory curriculum approved by the University." The percentage of students ineligible for these reasons was higher for African Americans (77%) and Latinos (73.6%) and lower for whites (58.7%) and Asian Americans (39%). Another 13.7% of graduates overall were ineligible because they were missing "only a few" (no more than three) of the required college-preparatory courses.(18)

Because the pattern of ethnic group differences in average high school GPA is usually similar to the pattern of average admissions test scores, an admissions policy that excludes tests but continues to include high school grades is unlikely to produce dramatic change. A case in point is the so-called 4% plan, which will go into effect at the University of California in 2001. The plan offers admission to the top 4% of graduates of every California high school who have completed the required college-preparatory courses, regardless of their test scores. Analyses have predicted that the plan will have "little impact on racial proportions at UC, since any increases in numbers of black, urban students will be matched by increases in white, rural students."(19) Keith Widaman, chair of the universitywide committee that developed the plan, told the San Francisco Chronicle that implementing the plan will probably have only a minor effect on the percentage of black and Latino applicants admitted.(20)

The indisputable fact is that both high school grades and scores on admissions tests are reflections of the same education system, with all its flaws and inequities. In a recent colloquium on the future of affirmative action, Christopher Edley, a professor of law at Harvard University and a consultant to President Clinton on issues of race, noted, "The SAT simply recapitulates ... all of the class advantages, all of the access advantages ... in the K-12 experiences of the student."(21) The same can also be said of high school grades. By using grades rather than SAT scores as an admissions criterion, said sociologist Christopher Jencks in a 1989 essay, "You are simply substituting tests designed by high school teachers for tests designed by the Educational Testing Service."(22) A college admissions system that relies heavily on either tests or high school grades, then, cannot be the path to the eventual elimination of disparities in educational opportunity.

While there is little basis for concluding that standardized admissions tests are biased against ethnic minorities in the psychometric sense -- in fact, they tend to overpredict performance for African American and Latino students -- it is clear that an overreliance on tests and other traditional measures of achievement in admissions can perpetuate the underrepresentation of certain groups by, as author Ellis Cose has put it, rewarding "those who have already been well schooled."(23) The Hopwood decision, Proposition 209, and similar initiatives exacerbate the problem by removing one method of increasing access to higher education for people of color.

A point on which individuals of every political stripe can agree is that, ultimately, we must fix "the pipeline" -- that is, improve K-12 education so that college applicants will be better prepared. But this viewpoint has drawn an impatient response from some educators. "Obviously," says Edley, "we all would prefer the great day in which the pipeline is repaired and students of all kinds show up at our doorsteps prepared, ready, eager to take the best of what we have to offer. But that day is not with us. What do we do in the meantime?"(24)

One avenue for change in the admissions process is the consideration of alternative definitions of college success. Although it has long been argued that the first-year college GPA is not the only outcome of interest, no other criterion has gained wide use. Remaining within the realm of grades, GPA in a student's area of specialization and GPA at graduation have been proposed as alternative criteria. The 1994 College Board study found that the grades earned in individual college courses may be more promising outcome measures than GPA. Other possible criteria are successful completion of the first year of college or successful completion of the bachelor's degree. What distinguishes students who attain these milestones from those who do not? Among the student attributes that warrant further investigation are motivation, perseverance, ability to overcome an adverse environment, and "spike talents" in particular areas. We need research to determine how best to measure these characteristics and how to assess their predictive value.

Of course, none of these approaches is guaranteed to improve the ethnic balance on U.S. campuses. As a society we must determine whether we believe that diversity is beneficial per se -- a view that is distinct from the argument that diversity be promoted as a way of righting past or present wrongs. If we support President Clinton's contention that "there are independent educational virtues to a diverse student body,"(25) then we should adopt the goal of diversity explicitly by considering an applicant's membership in an underrepresented group to be a "plus" in the admissions process.

Mounting legal barriers to such explicit consideration of ethnicity have given rise to the idea that eliminating the SAT can serve as a form of covert affirmative action. Although it is certainly possible to design a workable admissions policy that does not include standardized tests, as some 15% of four-year colleges have done, it is not sound policy to eliminate admissions tests in the hope of indirectly furthering a social policy goal. In California, the perennial hotbed of the affirmative action debate, we now know that failure to complete required college-preparatory courses -- rather than low test scores -- is the main barrier to admission to the University of California for members of all ethnic groups. In any case, both test scores and high school grades are reflections of the very same disparities in educational opportunity. Eliminating standardized tests and relying more heavily on high school achievement in admissions decisions simply cannot result in a dramatic change in the ethnic diversity of the student body. In short, dismantling admissions test requirements as a backdoor affirmative action policy cannot work.

REBECCA ZWICK is a professor in the Department of Education, University of California, Santa Barbara. Previously, she spent 12 years as a researcher at Educational Testing Service. She is working on a book on the use of standardized tests in college, graduate school, and professional school admissions. © 1999, Rebecca Zwick.

Illustration by Kris Hackleman.

Illustration by Christopher Burke.

TABLE 1. Average SAT Scores, High School GPAs, and College GPAs, by Ethnic Group.

                         African     Asian
                         American    American    Latino    White     Overall
SAT (verbal)             436         484         462       513       505
SAT (math)               466         595         516       564       559
High school GPA          3.18        3.58        3.43      3.40      3.41
College GPA              2.14        2.80        2.37      2.66      2.63
Number of test-takers    2,475       3,848       1,599     36,743    44,849

Source: Adapted from Leonard Ramist, Charles Lewis, and Laura McCamley-Jenkins, Student Group Differences in Predicting College Grades: Sex, Language, and Ethnic Groups (New York: College Entrance Examination Board, College Board Report No. 93-1; ETS Research Report No. 94-27, 1994), p. 9.

TABLE 2. Average College GPA Minus Average Predicted College GPA.

Predictors in                                 African     Asian
Equation                                      American    American    Latino    White    Overall
High school GPA                               -.35        +.02        -.24      +.03     0
SAT (verbal and math)                         -.23        +.08        -.13      +.01     0
High school GPA plus SAT (verbal and math)    -.16        +.04        -.13      +.01     0

The scale of the GPAs is 0-4.