High-stakes testing: barriers to gifted girls in mathematics and science?.
Author: Rebhorn, Leslie S.; Miles, Dorothy D. Source: School Science and Mathematics v. 99 no6 (Oct. 1999) p. 313-19 ISSN: 0036-6803 Number: BEDI99029052 Copyright: The magazine publisher is the copyright holder of this article and it is reproduced with permission. Further reproduction of this article in violation of the copyright is prohibited.
Currently, programs that use the Talent Search model for identifying academically talented middle school students employ the Scholastic Assessment Test (SAT) when selecting students for admission to special courses and programs typically offered in the summer at colleges and universities. However, on the mathematics portion of the SAT (SAT-M) mean scores for females are significantly lower than for males, with important implications for participation in special programs. This article will explore multiple factors contributing to the situation, critique proposed solutions, and suggest two new solutions.
When the outcome of a single test has significant consequences, the testing process is described as "high stakes." In programs for gifted students in which selection is based solely on SAT scores, the consequence is an inequitable selection of girls and boys for advanced programs in mathematics and science. Presently, approximately 160,000 middle school students each year participate in a nationwide talent search effort (VanTassel-Baska, 1998). Of those scoring above 500 (out of 800) on the mathematics portion of the SAT, approximately two thirds are boys (Benbow & Stanley, 1983; Center for Talent Development, 1994). Consequently, spaces in fast-paced university summer programs designed specifically for gifted students are filled with many more boys than girls.
Twenty-five years ago, Dr. Julian Stanley at Johns Hopkins University piloted the first "Talent Search" program (Assouline & Lupkowski-Shoplik, 1997). The Study of Mathematically Precocious Youth (SMPY) invited academically talented seventh graders who had scored at or above the 97th percentile on in-grade standardized achievement tests to take the Scholastic Aptitude Test (since renamed the Scholastic Assessment Test). The SAT, usually taken by high school seniors for college admission purposes, is a widely used instrument with a long and solid history. It now yields scores on two subtests, mathematics and verbal, ranging from 200 to 800 each. The rationale for using the SAT in the initial Talent Search program was that ceiling effects for grade-level tests would disguise or suppress the true extent of the abilities of the most academically talented students.
The Talent Search students' off-level test scores are used to invite participants to special courses and programs designed for their learning abilities. The four regional Talent Search sites and many other institutions offer summer courses that often accomplish the equivalent of a year of high school study in 2 or 3 weeks. Students can complete high school mathematics and/or science curricula in fewer than 4 years and can begin college level work while still in high school. All of the Talent Search sites operate similarly, using comparable criteria for participation in the search (95th to 97th percentile on in-grade standardized achievement tests) and comparable criteria for participation in specially designed educational opportunities.
THE GENDER GAP
There is a gender gap in the SAT-M scores of high school seniors who take the test for college admission (the use for which it was intended): The 1996 girls' average was 492, and the boys' 527, a difference of 35 points (Education Week, 1997, p. 13). The difference may be decreasing: In 1989, the senior girls' average was 46 points lower than the boys' (Callahan, 1991).
Comparison of scores before and after 1995 is difficult (Paulos, 1995), because the score scale of the SAT was adjusted on April 1, 1995, to recenter both the mathematics and verbal scales to 500, the original mean score of the test. The effects of the recentering--unprecedented and quite controversial--could have affected the relative scores for boys and girls, resulting in an apparent reduction of the gender gap.
The four Talent Search sites have found similar patterns of scoring over the years: Boys outscore girls by 30 to 50 points on the mathematics portion of the SAT. The Talent Identification Program (TIP) at Duke University recently identified 795 students for a 3-week summer residential program: 496 boys (62%) and 299 girls (38%). The boys' SAT-M average was 594.36, and the girls' was 548.14, a difference of 46 points (Stocking & Goldstein, 1992).
The results from the 1994 participants in the Midwest Talent Search, another of the regional sites, supported earlier findings but reflected some progress in reducing the gender gap: The boys' average SAT-M score was 446, 29 points higher than the girls' average of 417. In addition, boys outnumbered girls by 1.6:1 among those scoring above 500, by over 2:1 among those scoring above 600, and by almost 4:1 among those scoring 700 or more (Center for Talent Development, 1994).
For almost 20 years, Stanley has been studying a subgroup of Talent Search participants. Interestingly, the criteria for membership in this special study differ for the sexes: Boys must score 700 on the SAT-M before they turn 13, but girls must score at or above 640 at the same age to qualify because "it's harder to find girls who are mathematically talented at the higher level" (Stanley, in Kirschenbaum, 1992, p. 12).
Although some might argue that the gender gap is relatively small, the differences in the average SAT-M scores of Talent Search boys and girls translate into sex differences in the composition of the accelerated summer courses, as the TIP example illustrated. For programs using eligibility cutoffs on the SAT-M of 500, 520, or some other number, different numbers of boys and girls will be eligible, with proportionately fewer girls qualifying as the cutoff rises.
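The effect of a rising cutoff can be illustrated with a short sketch. It assumes, purely for illustration, that each group's scores are normally distributed around the 1994 Midwest Talent Search means reported above (446 and 417), that both groups share a standard deviation of 80 points (a hypothetical value that does not come from the cited sources), and that equal numbers of boys and girls take the test. The function names are likewise hypothetical.

```python
from statistics import NormalDist

# Mean SAT-M scores from the 1994 Midwest Talent Search summary cited above.
BOYS_MEAN = 446.0
GIRLS_MEAN = 417.0

# ASSUMPTION: a common standard deviation for both groups. This value is
# hypothetical and is not reported in the sources cited in this article.
ASSUMED_SD = 80.0


def share_above(mean, sd, cutoff):
    """Fraction of a normal(mean, sd) population scoring above `cutoff`."""
    return 1.0 - NormalDist(mean, sd).cdf(cutoff)


def boys_to_girls_ratio(cutoff, sd=ASSUMED_SD):
    """Ratio of eligible boys to eligible girls, assuming equal group sizes."""
    boys = share_above(BOYS_MEAN, sd, cutoff)
    girls = share_above(GIRLS_MEAN, sd, cutoff)
    return boys / girls


# Print the modeled ratio at three illustrative cutoffs.
for cutoff in (500, 600, 700):
    print(cutoff, round(boys_to_girls_ratio(cutoff), 2))
```

Under these assumptions the ratio of eligible boys to eligible girls grows as the cutoff moves up the scale, even though the difference in means stays fixed. The sketch is a model of the mechanism only; it is not a reanalysis of the Talent Search data.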
The opportunity to participate in the special summer programs is of significant academic benefit to the students. Stanley and Benbow (1983) listed some outcomes of such participation: increased enthusiasm for learning and life and improved attitude toward school; enhanced self-esteem; less egotism and arrogance, due to working with intellectual peers for the first time; much better educational preparation resulting in improved qualifications for highly selective colleges; and better opportunities for graduate school and fellowships, due to better preparation, networking experience with professors, and research skills. In addition, particular benefits accrue to girls who participate in summer mathematics programs: They tend to take more Advanced Placement (AP) courses later in high school (Olszewski-Kubilius & Grant, 1996). AP courses themselves confer additional benefits on girls, including career stimulation, college and career counseling, and encouragement from older girls in mathematics and the physical sciences (Casserly, 1980).
But are girls being denied equal access to the summer programs on the basis of their gender? Yes, there is a gender gap in SAT-M scores among Talent Search participants--a large and consistent gap of 30 to 50 points. But does this mean that the SAT is biased, or that it is measuring or reflecting something else that must be examined more closely? Two hundred years ago, the literacy rates for men were twice those for women (Leder, 1992). Hindsight allows us to place that fact in context and understand that it was not a matter of capability but of opportunity. Perhaps the current gender gap on the SAT is similarly a question of nurture and not nature.
Many explanations have been offered in the professional literature for the differences in girls' and boys' scores on measures of mathematics ability. Underlying the various explanations is a more basic disagreement over whether the SAT is in fact the culprit or merely the magnifying glass. The following are proposed explanations for the gender gap in mathematics scores.
Explanation 1: The SAT is biased against females in that differences in scores do not reflect actual differences in ability.
Sadker and Sadker (1994) referred to this phenomenon in their book, Failing at Fairness, noting that male and female high school students often receive the same grades in courses but score very differently on the SAT. Similarly, Subotnik and Strauss (1994/1995) described the outcomes of a study of several sections of an AP BC-level calculus course: Boys and girls scored equally well on the AP exam, but the girls scored lower on the SAT-M, used in the study as a predictor of the AP exam scores.
However, a mathematician viewed the data with less alarm:
Girls consistently score lower than boys ... on the SAT. Nevertheless, the frequent charges of cultural and gender bias are, in my opinion, overstated.... The test is "biased," but only toward the educationally prepared, the physically healthy, and the psychologically receptive. (Paulos, 1995, p. 64).
Another analyst found the "cure" to be worse than the ailment:
In March the Center for Women Policy Studies in Washington decided to shoot the messenger. The SAT mathematics test, it argued, is biased against girls. The cure it recommends is that all questions on which boys consistently do better than girls should be eliminated from the SAT. (Ravitch, 1997, p. 68).
Ravitch included examples of SAT items in dispute: A high school basketball team has won 40% of its first 15 games. Beginning with the 16th game, how many games in a row does the team now have to win in order to have a 55% winning record? (A) 3; (B) 5; (C) 6; (D) 11; (E) 15.
This question produced a 27% gap between boys and girls, the single largest in the study, which might suggest that girls are discriminated against when questions are asked about sports or basketball. Note, however, that the team might well be a women's basketball team since the players' gender is not mentioned at all. (Ravitch, 1997, p. 68).
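For reference, the disputed item reduces to a single linear equation. The working below is an illustration added here and is not part of Ravitch's text; the symbol n, an assumption of this sketch, denotes the number of consecutive additional wins required.

```latex
\begin{align*}
\text{wins so far} &= 0.40 \times 15 = 6 \\
\frac{6 + n}{15 + n} &= 0.55 \\
6 + n &= 8.25 + 0.55\,n \\
0.45\,n &= 2.25 \\
n &= 5
\end{align*}
```

Five more wins give 11 wins in 20 games, or 55%, which corresponds to choice (B). Nothing in the solution depends on familiarity with basketball.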
It is difficult, looking at the language and contexts of SAT test items, to conclude that gender bias is at work. It is not clear how the items themselves could contribute to girls' lower success rate. It is possible that the items, or the strategies that would be most effective in their solution, are somehow clearer or more accessible to boys than girls. Recently, research has pointed to a specific difference between the sexes in complex problem solving at high ability levels, attributed to "the tendency for boys to be risk-takers and flexible in their problem-solving procedures but for girls to stick to standard algorithms" (Gallagher, 1996, p. 462). It would be useful to pursue this hypothesis and to share the results with those who teach mathematics in schools and those who train future teachers of mathematics.
Explanation 2: Boys have genetically superior mathematics ability and/or aptitude.
This view, while popular among the public and perpetuated by the media, is not substantiated by research (Hyde, 1993; Leder, 1992). A recent study concluded that gender differences in mathematics achievement are small and becoming smaller with time. At age 9, virtually no gender differences in mathematics performance are found, and only minimal differences at age 13. A larger difference in favor of males is observed at age 17 (American Association of University Women, 1992). A British mathematician, summarizing results of research in this country, reached the same conclusion over a decade ago: "It is during the high school years that any developing sex differences in mathematics achievement are observed in the USA. This is a clear indicator of environmental rather than genetic explanations for such differences" (Burton, 1986, p. 3).
Explanation 3: Boys' scores are more variable, so there are more high scorers among boys.
There is some evidence to support this explanation, but it is not widespread. Callahan (1979) observed greater variability in males than females on many different variables, including cognitive and personality attributes. Eysenck (1995) echoed this observation more recently, as did Hood and Johnson (1991), who extended their discussion to implications for testing programs:
In the case of the National Merit Testing Program, different cutoff scores are already used for different states, so that the top 1% of the students in each state qualify. If this practice were extended to the sexes, then qualifying scores could be established to ensure that the top 1% of both men and women would qualify. (Hood & Johnson, 1991, p. 233).
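Hood and Johnson's suggestion amounts to computing a separate qualifying score within each group. The sketch below shows one way this could be done. The function name and the score lists are invented for illustration, and a 20% share is used instead of 1% only so that the tiny example lists yield a meaningful result.

```python
import math


def top_percent_cutoff(scores, percent):
    """Lowest score among the top `percent` percent of `scores`.

    Hypothetical helper for illustration: at least one student always
    qualifies, and ties at the cutoff are not given special handling.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ranked = sorted(scores, reverse=True)
    n_qualifying = max(1, math.ceil(percent * len(ranked) / 100))
    return ranked[n_qualifying - 1]


# ASSUMPTION: invented scores, not data from any testing program.
boys_scores = [700, 650, 640, 600, 560, 520, 500, 480, 450, 400]
girls_scores = [660, 630, 600, 580, 540, 510, 490, 470, 430, 390]

cutoffs = {
    "boys": top_percent_cutoff(boys_scores, 20),
    "girls": top_percent_cutoff(girls_scores, 20),
}
print(cutoffs)  # {'boys': 650, 'girls': 630}
```

Each group's cutoff is set by that group's own score distribution, so the same proportion of boys and girls qualifies regardless of any difference in means or variability.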
Explanation 4: The timed nature of the test contributes to lower scores for girls.
There has been some exploratory investigation into this hypothesis, but significant effects were not found. The authors of the study noted that differences between males' and females' SAT-M scores were reduced under untimed conditions, but that "the small sample size and resulting small cell size could obscure effects that may be significant when found in a larger sample" (Dreyden & Gallagher, 1989, p. 195). An additional complication is that the study used Talent Search students who volunteered to retake the SAT. The results, even if they had been significant, could not be generalized to girls and boys encountering the test for the first time, with accompanying levels of anxiety that might affect their performance.
As Paulos (1995) noted:
There are many dimensions of scholastic ability that aren't measured by the SAT.... The premium the SAT places on speed is especially difficult to defend. Anytime one tries to collapse a multifaceted, amorphous concept...along a linear scale, one is going to lose important information. (p. 65).
Educators would respond that "the purpose of the time limit on the SAT-M is to create a larger range of test scores" (Dreyden & Gallagher, 1989, p. 195). However, another view might be useful when using scores to identify students likely to do well in an advanced course or program.
Explanation 5: Girls take fewer mathematics courses so are less prepared for the test.
In a recent statement, the College Board said that gender differences in average SAT scores "are not the result of bias, but reflect differences in academic preparation as well as other educational and socioeconomic factors. For instance, female students take fewer and less rigorous mathematics and science courses than male students" (Education Week, 1997, p. 13). However, the AAUW has pointed to 1991 National Assessment of Educational Progress results for 37 states, which found that "up to Algebra III/PreCalculus and Calculus there were no gender differences in either course-taking or average proficiency" (AAUW, 1992, pp. 42-43).
Explanation 6: Parents' expectations for their daughters are lower than for their sons.
Parents of mathematically talented youth are instrumental in shaping the schools' responses to their children's needs. However, parents do exhibit gender stereotypes in communicating their expectations to their children, including gifted children (Callahan, 1991; Hollinger, 1995). Additionally, "parents of gifted girls sometimes inadvertently undermine their daughters' motivation and achievement by conveying the idea that the true measure of success for women is a happy marriage and motherhood" (McCormick & Wolf, 1993, p. 85).
Explanation 7: Expectations of schools/teachers/counselors/peers are different for girls.
Leder (1992) compiled research findings from a plethora of studies that found small, subtle differences in the ways girls and boys are treated at school. These include the tendency for teachers to praise males more frequently for correct answers, monitor males' work more closely, participate in more interactions with males, and wait longer for males to answer higher level questions, but wait longer for females to answer less challenging questions.
Silverman (1986) warned of the cumulative effects of this phenomenon:
Gifted girls' advanced abilities...may be diminishing as a consequence of the educational process.... The purported differences in mathematical and scientific abilities favoring the boys do not appear until the intermediate grades.... The progressive quality of this type of underachievement ... supports the hypothesis that females achieve less than males because they are gradually conditioned by the educational system to view themselves as less capable than males. (pp. 59-60)
Leder (1992) also noted the impact of "the subtle ways in which students who contravene prevailing norms are disapproved by the peer group" (p. 612), which would contribute to girls' tendency to downplay or devalue mathematics-related pursuits.
Fennema (1984) reviewed research on gender differences in mathematics achievement and on intervention programs designed to encourage girls' participation in mathematics and science. She concluded that:
The causation of sex-related differences in mathematics rests within the schools.... While laying the blame for sex-related differences on schools seems unduly harsh to many people, it is the most positive thing that has been said. If schools cause the problem, this means they can solve the problem. Educators have the power to effect change. (p. 161).
PROPOSED SOLUTIONS
Although the various explanations for gender differences in mathematics scores have their proponents and detractors, research results have apparently persuaded most professionals to relinquish the hypothesis of significant biological differences in mathematics ability. Given this fact, it appears that the SAT, whatever the reason, must measure differences in boys' and girls' performance on a specific task and that performance must vary for some environmentally based reason or combination of reasons. Along with the many explanations, myriad solutions have been proposed:
1. Administer the SAT without time limits, to equalize boys' and girls' performances.
2. Supplement the SAT with other measures in the Talent Search process.
3. Eliminate items from the SAT on which boys consistently do better than girls.
4. Educate parents about the influence of parenting styles and attitudes on girls' self-perceptions regarding mathematics.
5. Inform teachers and guidance counselors about girls' true potential in mathematics, and help them support and encourage girls' progress in that area.
All of these ideas fit one of two categories: Do something to or about the SAT or do something to or about the environment around girls.
Some of these proposed solutions have merit. The first idea, administering the SAT without time limits, is intriguing. As Paulos (1995) noted, the imposition of time limits on a complex conceptual assessment is bound to confound the results. It has been 9 years since Dreyden and Gallagher (1989) examined the effects of time and direction changes on the SAT performance of academically talented adolescents. The idea should be pursued.
The SAT could be supplemented with other instruments to help include groups that traditionally score less well on the SAT, in this case, girls. This second proposal fits well with the overall orientation of the field of gifted education, away from single-score identification strategies and toward multiple criteria. It would be valuable to experiment with alternative and additional criteria, to find a combination of instruments that would be as effective as the SAT-M at identifying mathematics talent, while avoiding the problem of gender differences associated with the SAT-M alone. This solution would take time to test and implement.
The third solution, eliminating selected items from the SAT if boys answer them correctly more often than girls, seems a risky strategy. After all, the hypothesis that biology is to blame has been eliminated, and researchers have concluded that environmental factors are causing the problem. Altering the SAT would eliminate an important source of data about the nature and extent of the problem. For example, perhaps there is a specific type of item, or combination of types of items, that girls do not seem to have mastered to the degree boys have. That would be valuable information to share with classroom teachers of mathematics and also with teacher education faculty at colleges and universities. Eliminating items on the SAT-M would also prevent educators from seeing the results of implementing long-term social strategies to encourage girls in mathematics and science.
The fourth and fifth proposed solutions--to educate parents and other groups in society that could do more to encourage girls' efforts and success in mathematics--should be implemented regardless of which other solutions are attempted. Educators can learn from the successes of the accelerated summer courses about what kinds of instructional programs and approaches are beneficial (VanTassel-Baska, 1998), including recognizing that mastery is better conceptualized as skill development or knowledge attainment than as time spent studying a subject (Olszewski-Kubilius, 1998). In short, modifying the SAT or the Talent Search identification process is a situation-specific measure at best. Correcting the influences of the environment on girls' self-confidence, attitudes, and perceptions will do more for more girls in more areas than will solving the immediate problem of gender differences on the SAT.
PROPOSALS
Two additional solutions to the SAT-M problem are proposed:
First, programs could establish different cutoff scores for girls and boys, cutoffs that reflect the current extent of the gender gap. For instance, if in a particular year the difference in mean SAT-M scores for boys and girls is 35 points, summer programs could enroll boys who score at 500 and above and girls who score at 465 and above. This is analogous to the different growth charts that pediatricians use to monitor the height and weight of male and female babies.
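The arithmetic of this first proposal can be written as a short sketch. The function name is hypothetical, and the example reuses the 1996 senior averages quoted earlier (527 and 492) only because they produce the 35-point gap used in the illustration above. A program would presumably substitute the means for its own applicant pool in a given year.

```python
def gap_adjusted_cutoffs(base_cutoff, boys_mean, girls_mean):
    """Return separate cutoffs that offset the current gap in mean scores.

    Hypothetical helper: the boys' cutoff stays at `base_cutoff`, and the
    girls' cutoff is lowered by the difference between the two means.
    """
    gap = boys_mean - girls_mean
    return {"boys": base_cutoff, "girls": base_cutoff - gap}


# Example with a 35-point gap (527 - 492), matching the illustration above.
print(gap_adjusted_cutoffs(500, 527, 492))  # {'boys': 500, 'girls': 465}
```

Because the girls' cutoff is recomputed from the current gap, it converges on the boys' cutoff automatically if the gap narrows over time.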
The advantages of this plan, which echoes Hood and Johnson's (1991) proposal for addressing score variability and Stanley's (Kirschenbaum, 1992) modification for the SAT-M-700-before-age-13 group, are its immediate availability and the preservation of an otherwise successful Talent Search model. This plan also fits well with the research evidence that girls may score lower than boys but perform at the same level in coursework. And this idea preserves the power of the SAT to mirror societal or environmental influences on girls. If gender differences on the SAT reflect environmental effects, not genetic differences, and if U.S. schools and families are successfully reoriented to support girls' potential in mathematics, the gender gap should narrow over the years, and the different cutoffs should become unnecessary. Perhaps the SAT-M will reflect this era's struggle to close its equivalent of the literacy gap.
The second new proposal is that girls themselves reject score cutoffs printed in program guides and propose to summer program directors that they be admitted based on evidence other than SAT-M scores. Girls can compile portfolios of their work, accomplishments, awards, grades, and other data and should make the case that this other evidence demonstrates their fitness for the course they would like to take. Clearly, program administrators must be open to considering alternative measures when selecting students for the summer programs.
These proposed solutions are not aimed at correcting the apparent gender gap in SAT-M scores; rather they are designed to accomplish the goals of the programs themselves--identify students who are likely to benefit from and be successful in the special programs designed for academically talented youth. Instead of moving hastily to correct the problem apparent on the surface, a more reflective response to the situation as a whole is proposed.
Editor's Note: Correspondence concerning this article should be addressed to Leslie S. Rebhorn, Department of Education, Saint Louis University, 116 McGannon Hall, 3750 Lindell Blvd., St. Louis, MO 63108-3412.
Electronic mail may be sent via Internet to lrebhorn@ibm.net.
REFERENCES
American Association of University Women. (1992). How schools shortchange girls: The AAUW report. New York: Marlowe & Co.
Assouline, S. G., & Lupkowski-Shoplik, A. (1997). Talent Searches: A model for the discovery and development of academic talent. In N. Colangelo & G. A. Davis (Eds.), Handbook of gifted education (2nd ed., pp. 170-179). Boston: Allyn and Bacon.
Benbow, C. P., & Stanley, J. C. (1983). Sex differences in mathematical reasoning ability: More facts. Science, 222, 1029-1031.
Burton, L. (1986). Introduction. In L. Burton (Ed.), Girls into maths can go (pp. 1-20). London: Holt, Rinehart and Winston.
Callahan, C. M. (1979). The gifted and talented woman. In A. H. Passow (Ed.), The gifted and the talented: Their education and development (pp. 401-423). Chicago: Univ. of Chicago Press.
Callahan, C. M. (1991). An update on gifted females. Roeper Review, 14(3), 284-311.
Casserly, P. L. (1980). Factors affecting female participation in Advanced Placement programs in mathematics, chemistry, and physics. In L. Fox, L. Brody, & D. Tobin (Eds.), Women and the mathematical mystique (pp. 138-163). Baltimore, MD: Johns Hopkins Univ. Press.
Center for Talent Development. (1994). 1994 Midwest Talent Search statistical summary. Northwestern University, Evanston, IL: Center for Talent Development.
Dreyden, J. I., & Gallagher, S. A. (1989). The effects of time and direction changes on the SAT performance of academically talented adolescents. Journal for the Education of the Gifted, 12(3), 187-204.
Education Week. (1997). Testing. 16(26), 13.
Eysenck, H. J. (1995). Genius: The natural history of creativity. London: Cambridge University Press.
Fennema, E. (1984). Girls, women, and mathematics. In E. Fennema & M. J. Ayer (Eds.), Women and education: Equity or equality? (pp. 137-164). Berkeley, CA: McCutchan.
Gallagher, S. A. (1996). A new look (again) at gifted girls and mathematics achievement. Journal of Secondary Gifted Education, 7(4), 459-475.
Hollinger, C. L. (1995). Stress as a function of gender: Special needs of gifted girls and women. In J. L. Genshaft, M. Bireley, & C. L. Hollinger (Eds.), Serving gifted and talented students: A resource for school personnel (pp. 269-283). Austin, TX: Pro-Ed.
Hood, A. B., & Johnson, R. W. (1991). Assessment in counseling: A guide to the use of psychological assessment procedures. Alexandria, VA: American Counseling Association.
Hyde, J. S. (1993). Gender differences in mathematics ability, anxiety, and attitudes: What do meta-analyses tell us? In L. A. Penner, G. M. Batsche, H. M. Knoff, & D. L. Nelson (Eds.), The challenge in mathematics and science education: Psychology's response (pp. 237-249). Washington, DC: American Psychological Association.
Kirschenbaum, R. J. (1992). An interview with Julian C. Stanley: Part II. The Gifted Child Today, 15(5), 12-14.
Leder, G. C. (1992). Mathematics and gender: Changing perspectives. In D. A. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 597-622). New York: Macmillan.
McCormick, M. E., & Wolf, J. S. (1993). Intervention programs for gifted girls. Roeper Review, 16(2), 85-88.
Olszewski-Kubilius, P. (1998). Talent Search: Purposes, rationale, and role in gifted education. Journal of Secondary Gifted Education, 9(3), 106-113.
Olszewski-Kubilius, P., & Grant, B. (1996). Academically talented women and mathematics: The role of special programs and support from others on acceleration, achievement, and aspirations. In K. D. Arnold, K. D. Noble, & R. F. Subotnik (Eds.), Remarkable women: Perspectives on female talent development (pp. 281-294). Cresskill, NJ: Hampton Press, Inc.
Paulos, J. A. (1995). SAT top quartile score declines: Correlation, prediction, and improvement. In A mathematician reads the newspaper (pp. 63-66). New York: BasicBooks.
Ravitch, D. (1997, April 7). Showdown at gender gap. Forbes Magazine, p. 68.
Sadker, M., & Sadker, D. (1994). Failing at fairness: How our schools cheat girls. New York: Simon & Schuster.
Silverman, L. K. (1986). What happens to the gifted girl? In C. J. Maker (Ed.), Critical issues in gifted education (Vol. 1, pp. 43-89). Rockville, MD: Aspen.
Stanley, J. C., & Benbow, C. P. (1983). Educating mathematically precocious youths: Twelve policy recommendations. Educational Researcher, 11(5), 4-9.
Stocking, V. B., & Goldstein, D. (1992). Course selection and performance of very high ability students: Is there a gender gap? Roeper Review, 15(1), 48-51.
Subotnik, R. F., & Strauss, S. M. (1994/1995). Gender differences in classroom participation and achievement: An experiment involving Advanced Placement calculus classes. Journal of Secondary Gifted Education, 6(2), 77-85.
VanTassel-Baska, J. (1998). A critique of the Talent Searches: Issues, problems, and possibilities. Journal of Secondary Gifted Education, 9(3), 139-144.