Education Disinformation Detection and Reporting Agency
Who is failing whom?
EDDRA reports on testing in Virginia and Massachusetts
A techno-friend (virtual acquaintance? How does one describe a person known only through media?) in Washington declares that some reformers there will not be satisfied until their tests are hard enough that all kids fail. Some educators seem remarkably accepting of this attitude.
For instance, in the March 10, 1999 edition of Education Week, Thomas C. Boysen and Thomas Sobol wrote "The public must be prepared to live temporarily with distressingly low test scores—of the sort that occurred recently in Virginia and Massachusetts." Distressingly low? Apparently so. Of course, the operative word here is "apparently."
Boysen is the former commissioner of education in Kentucky and Sobol the former commissioner of education in New York. If the sloppiness indicated by their uncritical acceptance of weird statistics here is typical of their approach to other matters, education reform is indeed in trouble.
The statistics that Boysen and Sobol uncritically accept are a 98% school failure rate in Virginia and a 59% teacher candidate failure rate in Massachusetts. Even if 59% hadn’t caused their eyebrows to arch, you might have thought 98% would. You might have thought they’d have checked out the credibility of these statistics. They didn’t.
Here is how to view the Virginia data from national and international perspectives:
1. Of the 41 nations participating in the TIMSS middle school assessment, only six outscored the state of Iowa in math, only one in science (these data from Linking the National Assessment of Educational Progress and the Third International Mathematics and Science Study, NCES report No. 98-500, July 1998).
Thus the sequence goes like this: Iowa beats up on most of the world, and West Springfield and Robinson, two Fairfax County, Virginia, schools, clobber Iowa. If they took the TIMSS tests, at most one nation would outscore them, and it is likely that none would.
Both schools failed the Virginia test.
Aside from me and the Fairfax County administrators and faculties, no one seems to have questioned the reasonableness of failing schools that outscore the world. The Virginia Board of Education seems very pleased with itself and the outcomes (perhaps they, too, were hoping for 100% failure). Editorials in Northern Virginia have indicated that the editors are pleased that the results "have lit some fires" under school administrators and teachers. In the central part of the state one principal has already been suspended for alleged cheating. More will surely follow (in Texas, the indictment of Austin's deputy superintendent for tampering with the tests has led at least one legislator to propose that such tampering be changed from a misdemeanor to a felony).
Meanwhile, up in Horace Mann land, summa cum laude graduates from selective colleges were barely passing the Massachusetts Teacher Test (MTT). Large proportions at some selective colleges were failing. Students complained they had no idea what would be on the test.
To date, despite numerous requests, neither the Massachusetts Department of Education nor the tests' developer, National Evaluation Systems (NES), has supplied any reliability or validity data.
In early April 1999, the 55 member institutions of the Association of Independent Colleges and Universities of Massachusetts (AICUM) recommended that the MTT be abandoned in favor of ETS' Praxis test, at least until the reliability and validity of the MTT had been established.
Dominic Slowey, a spokesman for NES, was quoted by the Associated Press as saying, "There is no question that the test is valid and reliable. It was put through a rigorous process of review before it was ever given."
Mr. Slowey did not offer an explanation of how NES could determine the test's validity before any of those who had taken it had entered classrooms. (Yes, it is possible to establish content validity, but the validity at issue here is predictive validity: do those who score well become better teachers? The content validity of the test is very much in question as well.)
Nor did Mr. Slowey explain how the reliability of a test can be determined before it is given. (Mr. Slowey has not returned phone calls, and the interrogation I received from some intermediary suggests that his defenses are stronger than the fortifications in Belgrade.)
It is possible that the extraordinarily different slants on the AICUM report themselves reflect the failure of our schools, especially schools of journalism. The major point of the story was that AICUM recommended rejection because the test had not been proved reliable and valid. The Associated Press and most newspapers got this, although they varied greatly on what other aspects of the report they emphasized. The Boston Globe, though, seemed to be operating in some parallel universe. Its opening sentence contended that AICUM had "conceded that its schools are largely to blame for the stunning failure rates on the Massachusetts teacher certification tests." Globe reporter Kate Zernike did not explain how anyone could take responsibility for low scores on a test of unknown reliability.
When the Department of Education and NES refused to supply reliability coefficients, researchers at Boston College decided to determine them empirically from data that was available. A test developer shoots for a reliability coefficient of .90 or better and settles for anything above .80. The BC researchers found the reliabilities of the reading and writing tests to be .29 and .37, respectively.
Giving NES and the Department of Education a break, the researchers removed a few outliers who had scored extremely high on one administration and extremely low on another. Even so, they were unable to get the reliabilities above .48 for reading and .49 for writing.
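For readers who have not met a reliability coefficient before, the kind of figure at issue here is simple to compute: a test-retest reliability coefficient is just the Pearson correlation between the scores the same candidates earn on two administrations of the test. A minimal sketch in Python, using hypothetical scores (not the actual MTT data):

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores for ten candidates on two administrations
# of the same test (illustrative only).
first = [62, 70, 55, 80, 66, 74, 59, 85, 68, 72]
second = [75, 58, 71, 64, 82, 60, 77, 69, 90, 63]

print(pearson_r(first, second))
```

A coefficient of 1.0 would mean the two administrations rank every candidate identically; values near zero, like those the BC researchers found, mean a candidate's score on one sitting tells you little about the next.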
The report, by Walt Haney, Clarke Fowler and Anne Wheelock can be obtained from them at the Center for the Study of Testing, Evaluation and Educational Policy, Campion Hall, Boston College, Chestnut Hill, MA 02467. It can also be viewed and downloaded from Volume 7, Number 4 of the Educational Policy Analysis Archives, http://epaa.asu.edu/epaa/v7n4/ (note: there is no "www" in the address). Volume 7, Number 5 of the same journal contains a more "temperate" critique of the test by Howard Wainer of Educational Testing Service. Wainer also makes some important points about the BC report.
Those seeking relief from all the punitive, dispiriting standards talk can find some in a collection of articles in the April issue of Phi Delta Kappan where John Goodlad and others bring us up from reform to renewal. John offers the hopeful perspective that this sorta stuff really is cyclical and this too shall pass.
And, just so you know in this very first EDDRA commentary that it will not be all work and no play, I attach an account of one of the more delightful evenings I have spent in recent years, a story of dining on the Seine that appeared in the April 11 edition of the Washington Post.
© 1999 Gerald Bracey
Last updated April 13, 1999