EDDRA


Education Disinformation Detection and Reporting Agency


-- a Gerald Bracey Report on the Condition of Education



TEST SCORES IN THE LONG RUN:

NOT IMPORTANT

Gerald W. Bracey

 

In a number of states, students are promoted, retained, or denied graduation on the basis of test scores. In some states, teachers and administrators are punished for low test scores and rewarded for high ones. Against this backdrop stands a body of literature that strongly implies that, in the long run, tests don't predict much of anything.

Even the Scholastic Assessment Test, created by the College Entrance Examination Board to predict who will succeed in college, doesn't make very accurate predictions. The typical correlation between the SAT and freshman grades is about .45, about the same as for high school grades, except in highly selective colleges, where most students arrive with good grades, which reduces the power of grades to predict differential success.

The amount of variance in one variable accounted for by another variable is given by the square of the correlation. If we square .45, we get .2025. Rendering this proportion as a percentage, we find that the SAT accounts for only 20% of freshman grade point variability. Eighty percent of what determines who makes dean's list and who gets tagged with academic probation comes from sources not measured by the test.
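The arithmetic above can be checked with a few lines of Python. This is only a sketch of the squaring step; the function name `variance_explained` is made up for illustration, and .45 is the typical SAT-to-freshman-grades correlation cited in the text.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance in one variable accounted for by another,
    given their correlation r (the square of the correlation)."""
    return r ** 2


r_sat = 0.45  # typical SAT / freshman-grade correlation cited in the text
explained = variance_explained(r_sat)
unexplained = 1 - explained

print(f"explained:   {explained:.4f} ({explained:.0%})")
print(f"unexplained: {unexplained:.4f} ({unexplained:.0%})")
```

The same function shows how slowly explained variance grows: a correlation would have to reach about .71 before a test accounted for even half of the variability in grades.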

The belief held by many, including Nicholas Lemann, author of The Big Test: The Secret History of the American Meritocracy, that the SAT is the single determinant of who gets into college is a triumph of ETS marketing over reality. Lemann ought to know better, and college admissions officers do. While researching the matter for a Washington Post review of Lemann's book, I checked the admissions figures at Brown. I found that the university could have filled two freshman classes by admitting only people scoring between 750 and 800 on the SAT verbal but, in fact, admitted only a third of the applicants with these magnificent scores. It also admitted some students with SATs as low as 400.

College admissions officers know that the SAT is highly fallible. When Robert Schaeffer of FairTest talks with a group of admissions people, he always asks for a show of hands: "Who would continue to use the SAT if the university had to pay for it?" He says he has yet to see a single arm in the air. Unfortunately, the admissions deans' knowledge is tempered by the fact that they need to keep SAT averages high to show potential money-giving alumni that Old Ivy Wall U is still doing a great job.

When we turn to the workplace, the relationship between test scores and earnings all but disappears. Much of the literature investigating how much test scores contribute to productivity is reviewed by Henry Levin of Columbia University in a paper, "High Stakes Testing and Economic Productivity." The paper can be accessed at www.law.harvard.edu/groups/civilrights/conferences/testing98/drafts/levin.html.

In studies over the years, a one standard deviation increase in math test scores has been associated with a 3-4% difference in wages. For reading scores the relationship is essentially zero. In the 1980s, the math figure increased to 7% for males and 14% for females, but reading scores still failed to correlate with wages.

It is true that the National Adult Literacy Survey found that a difference of one standard deviation in reading scores was associated with an 18% difference in income, but this result is suspect and still seems small in any case. The result is questionable because literacy and income were assessed simultaneously. It would be far preferable to assess literacy at Time 1 and use the scores to predict wages at some later point, Time 2. When the two variables are assessed at the same time, reading scores might well be as much a consequence of the job as a cause of high wages. Another study found that people who scored high on reading tests also had many more workplace opportunities to practice and hone their literacy skills. Levin concludes that the 18% figure is an overstatement.

Actually, predicting earnings is iffy even when we have a lot of other variables to work with. If we incorporate family background variables, and work experience, as well as mathematics test scores, we can still explain only 20% of the differences in the earnings of different people.

The current emphasis on testing, testing, testing probably makes the above seem a little surprising. But it is not when one steps back and contrasts tests with workplace requirements. Robert Sternberg of Yale has contended that tests measure only a portion of knowledge and analytical skills that might be needed on the job and nothing at all about creativity or "common sense." No doubt that is why supervisors' ratings of job performance show the same low correlation between job performance and test scores as earnings show.

Job success doesn't involve taking tests. People who get their wages increased please the supervisors and the boss, work well with groups, treat the customers and other workers politely, show up reliably and on time, display a sense of humor, and, in some jobs, draw on a host of other qualities such as critical thinking, perseverance, motivation, enthusiasm, and "emotional intelligence."

Currently, some education reformers claim that test scores bear on job performance and that higher scores in students today will translate into better workers tomorrow. The evidence doesn't support this contention. Indeed, if employers come to believe that test scores are important and start selecting hires based on tests while ignoring the many other factors that go into jobs, they could damage their businesses' efficiency and equity as well.

As Levin notes, "The link [between jobs and test scores] is so small that when test scores are used for employment selection, there is a high probability that many workers will be rejected for jobs that they could perform and many will be selected for jobs that they cannot perform." Since blacks and Hispanics score lower on tests than whites and Asians, they will be disproportionately affected. They will be turned down for jobs they could handle capably.

When we look at units larger than the individual, test scores don't seem to mean much either. On tests, South Carolina and Alabama are consistently among the lowest scoring states. Yet BMW decided to build cars in South Carolina and Mercedes opted for Alabama. Neither firm is considered to make inferior products. If test scores mattered, no doubt these corporations would have headed to the Plains States, Connecticut, Maine--or Japan. Japan's kids consistently score among the highest in the world on international comparisons of mathematics and science. Those scores don't seem to be able to do much for Japan's economy, though, which, as this is written in February 2001, is sliding back into recession.

The failure of test scores to show up as important indicators has been observed before. In 1974, Leo Munday and Jean Davis of the American College Testing Program also conducted research on the relationship between test scores and later achievement. They couldn't find one. Here's what they had to say about it:

One of the undesirable by-products of testing practice has been the emphasis on academic talent with its accompanying indifference to other kinds of talent. Tests have fostered a narrow conception of ability and restricted the diversity of talent which might be brought to the attention of young people considering various professions. It is small wonder that some people have mistakenly interpreted test scores as measures of personal worth and have mistakenly assumed that academic talent, as evidenced in school, is related in a major way to later adult accomplishment. ("Varieties of Accomplishment: Perspectives on the Meaning of Academic Talent," American College Testing Program Technical Report #62, 1974).

If President Bush really wants to help education, he'll propose reforms that require less testing, not more.

NEXT: "What is at stake is not today's economy, but tomorrow's." Thus spake two prominent members of what I refer to as the Education Scare Industry, IBM CEO Louis V. Gerstner, Jr., and Secretary of Health and Human Services Tommy G. Thompson. A look at the mathematics demands of a specific industry.
---------------------

HOW TESTING LOCKS DOORS

As noted in the preceding section, the low correlation between test scores and job performance can lead to blacks and Hispanics being kept out of jobs they might readily handle. Fortunately, it appears that employers have not examined applicants' test scores (except, of course, on any screening test the employers themselves used). As noted, some reformers are pushing for more consideration of tests. Albert Shanker, the late president of the American Federation of Teachers, often lamented that employers didn't take into account any information from high school other than whether the student had graduated. At the most recent education "summit," Gerstner tried to get the businessmen in attendance to promise they would examine transcripts for grades and courses of study.

Where test scores do play a significant role already, of course, is in selection. Recommendations for special education and gifted and talented programs usually include test information (and, sometimes, only test information). And, as also noted in the earlier segment, college admissions decisions involve tests to a greater or lesser extent.

Daniel Koretz, formerly of RAND and now with the National Board on Educational Testing and Public Policy at Boston College, constructed simulations to show concretely how minority applicants would be affected by differences between minority and white test scores, given different cut scores for admission and different proportions of black and white students applying.

Koretz presents three scenarios:

  1. The cut score for admission is set at the overall mean of black and white applicants, and blacks and whites are assumed to apply in equal numbers.
  2. The number of applicants remains equal in the two groups, but the cut score is now set at the 84th percentile--one standard deviation above the mean.
  3. The cut score is set at the 84th percentile, but the number of black applicants is only 15% of the total, roughly the proportion of blacks in the total school population.

Koretz notes that recent reviews of the black-white score gap on the SAT, NAEP, and achievement tests found a range from a low of .66 standard deviations in NAEP reading to a high of 1.18 in some areas of secondary school achievement tests.

For his simulations, Koretz sets the gap to .80 standard deviations. In scenario 1, 55% of white applicants are accepted, compared to 20% of black applicants.

In scenario 2, although black applicants constitute 50% of all applicants, the effect of raising the cut score to +1.00 standard deviations above the overall mean is to exclude all but one percent of those applicants from acceptance. Seventeen percent of the white applicants are admitted. This leaves an admitted group consisting of about 92% white and 8% black.

In scenario 3, where the cut-score conditions are the same but blacks constitute only 15% of the applicants, the admitted group is 99% white.
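The mechanics of the three scenarios can be sketched in Python. This is a simplified illustration, not Koretz's own procedure: it assumes both groups' scores are normally distributed with the same within-group standard deviation of 1, takes "one standard deviation above the mean" literally as the pooled mean plus one pooled standard deviation, and uses a hypothetical function name, `admit_rates`. Because the percentages depend on those assumptions, they should not be expected to reproduce the figures reported above; the point is the direction of the effect as the cut score rises and the black share of applicants falls.

```python
from math import sqrt
from statistics import NormalDist


def admit_rates(gap, black_share, cut_sd):
    """Return (white admit rate, black admit rate, black share of admits).

    gap         -- white-minus-black mean difference, in within-group SDs
    black_share -- fraction of applicants who are black
    cut_sd      -- cut score, in pooled SDs above the pooled applicant mean
    Assumes normal scores and a within-group SD of 1 (illustrative only).
    """
    white = NormalDist(mu=0.0, sigma=1.0)
    black = NormalDist(mu=-gap, sigma=1.0)

    # Mean and SD of the combined applicant pool (a two-group mixture).
    pooled_mean = black_share * black.mean + (1 - black_share) * white.mean
    pooled_sd = sqrt(1.0 + black_share * (1 - black_share) * gap ** 2)
    cut = pooled_mean + cut_sd * pooled_sd

    white_rate = 1 - white.cdf(cut)
    black_rate = 1 - black.cdf(cut)

    black_admits = black_share * black_rate
    white_admits = (1 - black_share) * white_rate
    return white_rate, black_rate, black_admits / (black_admits + white_admits)


scenarios = {
    "1: equal applicants, cut at mean": (0.50, 0.0),
    "2: equal applicants, cut at +1 SD": (0.50, 1.0),
    "3: 15% black applicants, cut at +1 SD": (0.15, 1.0),
}

for label, (share, cut_sd) in scenarios.items():
    w, b, admitted = admit_rates(gap=0.80, black_share=share, cut_sd=cut_sd)
    print(f"{label}: white {w:.1%}, black {b:.1%}, "
          f"black share of admits {admitted:.1%}")
```

Under these assumptions the black share of the admitted group shrinks at each step, even though the score gap itself never changes.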

Koretz concludes, "these results illustrate the difficulty inherent in reconciling academic selectivity with increased equity of access to post secondary education for non-Asian minority groups, particularly at selective colleges and universities." Koretz acknowledges that test scores are not the sole criterion for college admission and that this ameliorates the problem. "Nonetheless," he says, "unless test scores are given very little weight or are offset by other factors on which minority students have an advantage relative to whites, the average test-score disparity will generally have a severe impact on admission to selective colleges."

Koretz' paper, "The Impact of Score Differences on the Admission of Minority Students: An Illustration", can be obtained from the National Board on Educational Testing and Public Policy, School of Education, Boston College, Chestnut Hill, MA 02467 and accessed at www.nbetpp.bc.edu.

Posted 5/12/2001


This report originally appeared in the April 2001 Research Report in Kappan, Phi Delta Kappa.

© 2001 Gerald Bracey