The 11th Bracey Report on the Condition of Public Education

By Gerald W. Bracey

posted with permission from the October 2001 issue of Kappan

 


A LOT of people think I defend schools reflexively. Not so. But a little more than a decade ago, I found a lot of data that proved that the people who make up what I have come to call the Education Scare Industry were wrong, and I said so. When I have thought the schools have been wrong, I have said that, too.

 

I begin this report with three of the most despicable, totalitarian acts by school authorities known to me in the 34 postdoctoral years I’ve been in education. These are the attacks by Gwinnett County, Georgia, on Susan Ohanian; by the Massachusetts Department of Education on Alfie Kohn; and by the Chicago Public Schools (CPS) on George Schmidt. I presume Ohanian and Kohn are known to Kappan readers from their bylines in this journal. Schmidt was a teacher in Chicago. All three of their stories are dirty, but informative, tales about deeds done in the name of high-stakes testing. The beginnings of the Schmidt and Ohanian stories were recounted in the 10th Bracey Report. They continue here.

 

In addition to being a longtime teacher of English and journalism, Schmidt also publishes a muckraking (not a pejorative term in my lexicon) monthly newspaper called Substance. One day, a plain brown envelope delivered to the Substance offices was found to contain copies of the CASE (Chicago Academic Standards Examinations). Schmidt thought the test items were awful and, rather than write an editorial to that effect, published them in his paper. CPS suspended him without pay and sued for $1.4 million, which it claimed would be necessary to write new tests. CPS subsequently fired him.

 

When I saw the tests, I tried to imagine what would have happened had I produced them back in the days when I was director of testing for the Virginia Department of Education. Someone would have leaked the tests to the Richmond Times Dispatch. The Dispatch would have published the worst questions, along with a scathing editorial mocking the state department’s incompetence. The department would have summarily sacked me and deservedly so.  

This is what should have taken place in Chicago. The Chicago Tribune should have picked the tests up from Schmidt, published the worst items, and written a scathing editorial. Then CPS should have fired Carole Perlman, director of testing. Instead, the Tribune backed the tests and demanded that Schmidt be fired. Perlman testified against Schmidt at his hearing. These tests were more than just a set of “trivial pursuit” items, although most of the items were that, too. The tests contained items that had no right answer, items that had multiple right answers, and items to which the official right answer was wrong. They also contained items for which an earlier item cued the answer for a later one. In short, these tests were garbage.

 

At a hearing on the issue (at which I testified in support of Schmidt), Perlman had the effrontery to defend the tests and even cajoled Tom Kerins, former Illinois director of testing, to testify on behalf of the tests and to confirm the cost estimate to replace them. Shame on you, Tom.

 

The $1.4 million, by the way, works out to about $12,000 an item. Chicago schoolteachers, not professional item writers, wrote the questions. At $12,000 per question, every four items cost the equivalent of a Chicago teacher’s annual salary. How could they possibly cost so much? Well, it is true that CRESST (Center for Research on Evaluation, Student Standards, and Testing) liberated $500,000 from CPS, but it claimed to have provided only a little technical assistance in teaching teachers how to write items. How CRESST could charge so much money for so little work could also spark an investigation. The CPS suit is ongoing, as is Schmidt’s counter-suit involving First Amendment arguments.
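The per-item arithmetic above can be checked with a short sketch. The two dollar figures come from the text; the implied item count is derived here and is not stated in the report, and the variable names are illustrative only.

```python
# Figures quoted in the text.
replacement_cost = 1_400_000  # dollars CPS claimed it would need to write new tests
cost_per_item = 12_000        # approximate dollars per item, as stated

# Implied number of items (derived here, not given in the report).
implied_items = replacement_cost / cost_per_item

# Cost of four items, which the text equates to a teacher's annual salary.
four_item_cost = 4 * cost_per_item

print(f"Implied item count: about {implied_items:.0f}")
print(f"Cost of four items: ${four_item_cost:,}")
```

Under these figures, the suit implies a test bank of a little under 120 items, and four items come to $48,000.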

 



Some educators in Western Massachusetts invited Alfie Kohn to be the keynote speaker at a conference. When the Massachusetts Department of Education (MDE) heard that Kohn would be the keynoter, it told the organizers that, if Kohn spoke, the money for the conference would be withdrawn. The organizers caved, even though the money to pay Kohn was not from MDE funds. Officially, the reason for denying Kohn the right to speak was that his topic was beyond the theme of the conference. Kohn was invited to speak on standards and assessment, and the organizers titled the speech “The Case Against Standardized Testing.” The purpose of the conference was for charter schools and other public schools to share information about common issues. But, as Chester Finn and his colleagues have observed, “Charter school discussions are saturated with talk about accountability.”

 

And talk about accountability usually includes talk about testing. Other sessions at the conference covered testing, and many had nothing to do with charter schools. Kohn was paid not to speak. He says that the MDE’s action was not surprising: “It’s a small step from saying, ‘Pass this test or you don’t graduate,’ to saying, ‘Renege on this speaker or you don’t get funded.’” The ACLU is progressing toward a suit.

 

The Ohanian saga, which she discussed briefly in her January 2001 Kappan article, remains murky with regard to who is behind it and what they hope to accomplish, other than to frighten her and make her spend money on lawyers in two states. I spoke to Alvin Wilbanks, the school superintendent in Gwinnett County, who advised me that there was an “ongoing investigation” but steadfastly refused to tell me who was conducting it. Jim Keinard, the Gwinnett School Police officer in charge of the investigation, has not returned phone calls or replied to e-mails. Ohanian has recently been ordered to supply fingerprints and a writing sample. Georgia law does not permit officers to ask for a writing sample.

More than most high-stakes tests, the one in Gwinnett County had roiled many waters. Given the enormous amount of testing already present in the district, many Gwinnett citizens simply saw no need for it. Some saw malice and maybe even malfeasance in a memo from Assistant Superintendent Gale Hulme. Hulme claimed it was “human error” that caused some RFPs for the Gateway exam not to be sent to most bidders on time. It is not clear that only CTB/McGraw Hill received the RFP on time, but only CTB bid. Harcourt Educational Measurement declined, in a memo from then Vice President Phillip Young, dated three weeks after the deadline. Finally, some Gwinnett teachers were upset that the passing scores on some tests were set very close to 25% correct. This, of course, is the chance level, uncorrected for guessing, and strongly suggested that the whole enterprise was a political game.

Based on information provided by Gwinnett School Police, Vermont detective Timothy Bombardier in an affidavit accused Ohanian of attending a meeting of the “Alfie Kohn Group.” Bombardier wrote that “The Alfie Kohn Group trains people in how to disrupt and prevent the implementation of high-stakes testing. On 31 March 2001, the Alfie Kohn Group met at Columbia University in New York, and Lisa Amspaugh was in attendance. [Amspaugh is a former resident of Gwinnett County and a critic of the test.] An attendee at the meeting advised investigators that a session was held specifically to plan strategies to disrupt the Gateway test.”

This would be hilarious if it did not come from an agent of the law. The conference was organized by Columbia University faculty members and FairTest. I spoke at it, sharing a session with noted radical Ted Chittenden of the Educational Testing Service. One day was indeed devoted to developing strategies to counter the negative effects of high-stakes testing, but it dealt with topics like how to get the message to the media, to politicians, etc. No one ever mentioned the Gateway test, and no one said anything about disrupting any test administration or committing any acts of civil disobedience. Neither Ohanian nor Kohn attended the meeting.

 

The tale began with someone who pilfered the county’s high-stakes test. Among the events that followed was the spectacle of all schools having to count, in front of a policeman, their copies of the test.



 

Testing, Testing, and More Testing

A couple of decades ago, I formulated Bracey’s Paradox: test scores mean something only when you don’t pay any attention to them. Lately, a lot of people have been paying a lot of attention to them. If 2000 was the year that testing went crazy, 2001 was the year it went stark raving mad. I have already recounted three of the most outrageous incidents. Others merely reflect the tyranny of testing. What say we take a moment to consider a few of the personal qualities that standardized tests do not measure: creativity, critical thinking, resilience, motivation, persistence, humor, reliability, enthusiasm, civic-mindedness, self-awareness, self-discipline, empathy, leadership, and compassion.

 

Events in New York and Virginia reflected testing’s ascent to dominance. In New York, 37 small alternative schools had built their curricula around portfolios as a means of assessment. They wanted to use these in lieu of the state tests for graduation. No can do, said New York Education Commissioner Richard Mills. Alternative school students have to take the tests just like everyone else.

 

In Virginia, people pressured the state board of education to permit alternatives to the board’s own tests, then the sole determinant of eligibility for high school graduation. Okay, said the board — and added more tests: the SAT, the Advanced Placement tests, and the International Baccalaureate. Grades and teacher recommendations were deemed too subjective. We should note that, for all their subjectivity and alleged variation in meaning and rigor from place to place, high school grades still predict first-year college grades at most universities better than the SAT.

 

“NewsHour with Jim Lehrer” also acknowledged testing’s prominence with a long segment. (A transcript of this “NewsHour” segment can be found at www.pbs.org/newshour/bb/education/jan_june01/testing_215.html.) Monty Neill of FairTest and Alfie Kohn squared off against Bill Evers of the Hoover Institution (an education advisor to President Bush) and Lisa Graham Keegan, then the state superintendent for Arizona. Evers and Keegan mumbled platitudes and generalities, most of which had no basis in data. Said Evers, “What we want to do with these tests is know where these children are and if we do it year by year, we can see progress, we can see gains, we can see the growth, we can see problems with teachers as well as students.”

Can we now? Thomas Kane of the Hoover Institution and Douglas Staiger of Dartmouth College concluded that between 50% and 80% of the “improvement” in annual test scores for a school was temporary and caused by fluctuations that were not related to an increase in achievement. 3

David Grissmer of the RAND Corporation put the implications of these findings this way: “The question is, are we picking out lucky schools or good schools, and unlucky schools or bad schools? The answer is, we’re picking out lucky and unlucky schools.”4

 

Kane and Staiger made a telling statement: “Most of these [school accountability] systems have been set up with very little recognition of the strengths and weaknesses of the measures that they’re based on.”5

 

The reason they have been set up that way, of course, is that the people who set them up, the Keegans and Everses of the world, have an agenda. It is all about ideology, power, and control and not at all about children, learning, and education. Neill and Kohn pointed out a number of other weaknesses of tests, as well as some of the negative outcomes they produce. They didn’t mention that, in Virginia Beach, the board of education called a special session to decide if it needed to mandate recess for the district’s elementary schools, because many of the schools had eliminated recess in favor of additional test preparation. The problem extends well beyond Virginia Beach.6

 

Washington Post reporter Liz Seymour found that Virginia’s tests not only flunk a lot of children but also have created a new class of dropouts: teachers. Some have taken early retirement, some have fled to private schools, and some have requested transfers to grades that are not tested. 7

 

Seymour interviewed teachers only in some of the highest-scoring districts in Virginia, those who would have the least to fear from the tests. The high-stakes fourth-grade test in New York is having a similar effect. Because tenured teachers can choose their assignments, fourth-grade has become the province of the least-experienced teachers.8

 

Seymour quoted Virginia state board president Kirk Schroder, who claimed, “People miss the big picture here. The reality is that accountability is changing the culture of public education, and in some respects that has created some very positive achievement in some places where student achievement did not exist.” Schroder offered no examples. Surely, he did not have in mind the performance of students in algebra I in Richmond schools. On the third administration of the algebra I test, Richmond’s high schools had these passing rates: 19.8%, 10.3%, 9.0%, 5.8%, 4.6%, and 2.6%. Only three small, selective, affluent schools did better.

The techniques for setting passing scores reveal the purely political nature of these programs. Virginia employed the widely used Modified Angoff procedure. The process generates a recommended cut score from each of the 20-odd judges who participate. Usually, a cut score in the middle of the full range of recommended scores is taken as the official passing score. For 19 of 21 tests, the Virginia board selected the highest recommended cut score. For the two others, it set the passing score higher than any of the judges had recommended. But at least the Virginia judges had some training and used a generally accepted procedure. In California, a panel of 100 people was given a dictionary definition of “competence” and told not to worry about setting a high passing score because eventually students would get there.9

 

The judges recommended a cut score of 70%. State Superintendent Delaine Eastin overruled the judges and set the passing score at 60% for one test and 55% for the other. Still, a majority of the students failed, and the media scratched their heads over how so many students could flunk such an “easy” test, a test that, after all, required students to get barely more than half of the items right. (Recall that norm-referenced tests are composed mostly of questions that about half of the students get wrong; the percent correct on a test says nothing about its difficulty.)
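To make the Virginia cut-score choice described earlier concrete, here is a minimal sketch contrasting a middle-of-the-range passing score with the highest recommendation. The judges' numbers are invented for illustration and are not Virginia's data, and reading "the middle" as the median is an assumption.

```python
from statistics import median

# Hypothetical recommended cut scores (percent correct) from 20 judges.
# These values are made up for illustration; they are not actual data.
judge_recommendations = [52, 55, 55, 58, 60, 60, 61, 62, 63, 64,
                         65, 65, 66, 68, 70, 70, 72, 74, 75, 78]


def middle_cut_score(recommendations):
    """Cut score taken from the middle of the recommendations (the median)."""
    return median(recommendations)


def highest_cut_score(recommendations):
    """Cut score set at the single highest recommendation."""
    return max(recommendations)


middle = middle_cut_score(judge_recommendations)
highest = highest_cut_score(judge_recommendations)

print(f"Middle of the judges' recommendations: {middle}")
print(f"Highest recommendation: {highest}")
```

With these invented numbers, choosing the highest recommendation raises the passing score from 64.5 to 78, a bar that 19 of the 20 hypothetical judges considered too high.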

 

The Alliance for Childhood, a loose coalition of psychiatrists, pediatricians, and educators, attempted, with little success, to bring some sanity to the situation with a position paper on high-stakes testing. The section headings summarize the paper’s story: “The Technology of Testing Is Flawed”; “Test Scores Have Meaning Only in the Context of the Whole Child”; “Evidence Is Growing of Harm to Children’s Health”; “More High-Stakes Testing Means More Dropouts, Fewer Good Teachers”; and “Standardization Is the Enemy of Effective Public Schools.”10

 

Into the existing nuttiness over testing, Bush injected an unworkable and self-contradictory plan for chaos. In the name of giving states more freedom and flexibility, the President proposed to force them to test all students every year in reading and math in grades 3 through 8. Schools would be required to make “adequate yearly progress,” a concept that caused everyone’s eyes to roll back in their heads, even those who hadn’t seen the article by Kane and Staiger on the instability of annual gains. The initial House version would have labeled most schools as “failing schools.” When Shadow Secretary of Education Sandy Kress rewrote the “adequate yearly progress” formula, he called his own work “Rube Goldbergesque.”11

 

Chester Finn said the legislators had rendered the notion of adequate yearly progress so “complexified” that it defied explanation to parents and teachers.12

 

In addition to “adequate yearly progress,” the Bush plan calls for all students to reach the “proficient” level on state assessments, said assessments to be confirmed by the National Assessment of Educational Progress (NAEP) or by a nationally normed test. As FairTest noted in its analysis of the many problems in the Bush plan, this would require rates of progress never before seen in education (www.fairtest.org/nattest/bushtest.html).

 

In most states, fewer than one-third of fourth-graders currently attain the NAEP proficient level, and performance on state assessments often differs widely from NAEP. In Texas, for example, 89% of youngsters are proficient on the state reading assessment, but just 29% are proficient on the NAEP. Only 12% of black students scored proficient or better on the 2000 NAEP reading assessment. Sixty-three percent scored below basic. If the NAEP were administered in the third grade, similar results would probably be found. Currently, the House plan comes in 10-year and 12-year versions. Suppose the 10-year version becomes law. An average of 6.3% of American third-graders must move from below basic to proficient each year for 10 consecutive years.
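The 6.3% figure follows from spreading the 63% of students below basic evenly over the plan's 10 years. A minimal sketch of that arithmetic, with the 12-year version added for comparison (the function name is a hypothetical helper, and an even yearly pace is an assumption):

```python
def required_annual_shift(below_basic_pct, years):
    """Percentage points that must move out of 'below basic' each year,
    assuming the shift is spread evenly across the period."""
    return below_basic_pct / years


below_basic_pct = 63.0  # percent scoring below basic, as quoted in the text

ten_year = required_annual_shift(below_basic_pct, 10)
twelve_year = required_annual_shift(below_basic_pct, 12)

print(f"10-year version: {ten_year:.2f} percentage points per year")
print(f"12-year version: {twelve_year:.2f} percentage points per year")
```

Even the longer 12-year version would still require a little over five percentage points a year, every year.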

 

New York Times education writer Jodi Wilgoren interviewed education leaders in all 50 states and found them complaining about the Bush proposal because it ignores an entire decade of work to develop standards and tests. 13

 

But what has this decade of work gotten us? Falling test scores. Aside from the SAT, only the Iowa Tests of Basic Skills and the Iowa Tests of Educational Development provide evidence of long-term trends. By Iowa law, each new form of the test must be equated to the old form. Scores rose from 1955 to about 1965, fell for about a decade, and then rose to mostly record highs by the mid to late 1980s. After the 2000 renorming, though, the scores fell. No one seems to understand why. I would place my own bet primarily on changing demographics. The 2000 census paints a very different picture of America from the one painted by the 1990 census, much less the earlier ones.

Now for a quick summary of the best of the rest of this year’s news about tests.

 

Richard Atkinson, president of the University of California System, set tongues wagging by proposing that the university do away with the SAT as a college admissions requirement. He proposed temporarily using the College Board Achievement Tests until something better and more appropriate could be developed. The media made it a big deal when Mount Holyoke banished the SAT, but Joanne Creighton, Mount Holyoke’s president, said that the test never counted for more than 10% in the admissions decision anyway.

 

Many stories covered cheating scandals. Many others documented the major errors made by companies that develop and score tests and explored the injurious impact of these errors on students. Resistance to the tests also grew. Parents in several states boycotted state-mandated tests. The Business Roundtable felt the resistance sufficiently to issue a monograph on how to counter the testing “backlash.”14

 

Eugene Paslov, CEO of Harcourt Educational Measurement, garnered a fair amount of ink by saying that tests such as the ones his company produces should not be used as graduation requirements. He said his company could not tell school districts how to use the tests, but “we do have a responsibility to tell policy makers how we feel.”15

 



New NAEP Data

The most disturbing thing about the 2000 NAEP reading and math assessments was the way media and state officials covered and interpreted them. The reading data, which did not show any change, received little in the way of headlines. Nevertheless, in a Wall Street Journal op-ed piece, former Delaware Gov. Pete du Pont called the results “disastrous.”16

 

Recall again that American students finished second in an international comparison of reading achievement. Theoretically, we could be suffering an international literacy crisis, but no one has claimed so. Few papers carried the math results on the front page. Many of the nation’s leading papers, including the Washington Post, New York Times, Chicago Tribune, Chicago Sun-Times, and USA Today, buried the story deep in Section A. This is disturbing because the results were generally positive. The 12th-grade scores dropped three points from the 1996 level, leaving them well ahead of the scores in 1990. Both fourth- and eighth-graders showed improvements. If their scores had dropped, a couple of journalists admitted to me, the story would have garnered page-one placement. Most papers treated the results as a state, not a national, story. The national results appeared mostly in national papers and in papers in states that did not participate in the assessment.

In some states, the newspapers and state officials bragged: Connecticut, Indiana, Iowa, Maryland, Massachusetts, Minnesota, North Carolina, Texas, Vermont, and Virginia. In other states, they lamented the low performance: Arkansas, California, Nebraska, Oklahoma, Utah, and Wyoming. Still others did a little of both, pointing out gains but mentioning below-average performance: Alabama, Idaho, Illinois, Kentucky, Louisiana, Michigan, and Ohio. The headline in the Biloxi Sun Herald was the most sorrowful: “Mississippi Improves Scores, but Finishes Last on Test.” This kind of coverage is disturbing because the utility of the NAEP depends on its invisibility. (See Bracey’s Paradox, above.) As soon as you start paying attention to a test, you introduce all kinds of corrupting influences that invalidate the scores. State officials attributed gains to their state’s reform efforts. Although the NAEP likes to bill itself as “the nation’s report card,” it is increasingly becoming the states’ report card to be used for bragging or to goad educators to greater effort and achievement. Thus it will not be usable to “confirm” Bush’s testing program or any other program.

 



No Child Left Behind

The Bush education plan as presented to Congress in the document “No Child Left Behind” begins with three falsehoods: “Today nearly 70% of inner city fourth graders are unable to read at a basic level on national reading tests. Our high school seniors trail students in Cyprus and South Africa on international math tests. And nearly a third of our college freshmen find they must take a remedial course before they are able to even begin regular college level courses.”

There are no published data to support the 70% contention. (In an August 1 speech to the Urban League, Bush amended the figure to “almost two-thirds.” In an earlier speech, First Lady Laura Bush had used the better figure of 60%.) The 2000 NAEP results in reading show that 47% of students in central cities score below basic. Sixty percent of students eligible for free and reduced-price lunches score below basic.

 

As for high school seniors trailing Cyprus and South Africa, these are the two countries that the U.S. outscored, not trailed. Of course, to consider these data at all, one has to accept the results of the Final Year Study of TIMSS (Third International Mathematics and Science Study), which, as I hope I have made clear before, one should never do.17

 

The flaws in the data render them virtually uninterpretable. When I parsed the results and found groups most comparable to the students in other countries, American high school seniors remained about average, which is where they were as eighth-graders. The statement on college remediation makes it seem that college freshmen are showing up at Harvard lacking basic skills. Maybe. But I doubt that sound national figures exist because “remedial” means different things in different states and on different campuses. In Virginia, for instance, remedial courses are not offered at the flagship institutions. You won’t find them at William and Mary, the University of Virginia, Virginia Tech, or most of the other four-year universities. They are offered by three urban universities, by two historically black universities, and, especially, by the community colleges. If students did not take algebra II in high school, then decide that they want to go to a four-year college and so take algebra II at a community college, does that make the course remedial? Isn’t providing such opportunities a core function of community colleges? The President’s statement also overlooks the inconvenient fact that about two-thirds of high school graduates go on for further education. Shouldn’t we be applauding this?

 

Paul Gigot — soon to be editorial page editor of the Wall Street Journal — said that the signal quality of the legislation was that “Teddy Kennedy is happy, and Checker Finn is not.” 18

 

Certainly, most conservatives did not care for the Bush plan. The Heritage Foundation slammed it. So did the Family Research Council, Focus on the Family, Phyllis Schlafly’s Eagle Forum, and Paul Weyrich’s Free Congress Foundation. Analysts concluded that Bush had given the conservatives so much of their agenda in his first 100 days that he could afford to anger them now on a few issues.

 

“Missing in action” in all of the contentiousness has been Roderick Paige, the new secretary of education. The Houston Chronicle noted that “according to his official schedule, the secretary spends the bulk of his time meeting with foreign dignitaries, going to dinners and receptions, or traveling around the country.”19 The New Republic observed, “In any administration, the blatant marginalization of the only African American domestic Cabinet secretary would be noteworthy. In an Administration that loudly trumpets its commitment to Cabinet government and racial diversity it’s stunning. . . . From the beginning the White House seems to have expected him to be the education plan’s public face, and nothing more. . . . Ah, the soft bigotry of low expectations.”20

 

Paige has denied rumors that he is unhappy with Bush and is planning to resign.21 He has now declared that he is “at the table” and will seek a higher profile, but Jack Jennings of the Center on Education Policy still has him filed under “I,” for “irrelevant.”22

 

As this is written, Congress is in recess. Everyone is predicting cantankerous debates to resolve the differences between the House and Senate versions of the Elementary and Secondary Education Act (ESEA) when legislators return this fall. The June 20 issue of Education Week carried a side-by-side comparison of the competing versions.

 



New International Data

Early 2001 brought the release of the TIMSS-R (R for Repeat) and TIMSS Benchmarking Studies. The media greeted these studies with a collective yawn in the first instance and with silence in the second. Actually, the TIMSS-R report contained what I call “microcosmic data”: a small set of statistics that reveal the condition of education writ large.

 

The U.S. Department of Education disaggregated the TIMSS-R data by ethnicity. I wondered what the results would have looked like if the entire U.S. sample had consisted of students of only one ethnicity. In the TIMSS sampling system, Asians and Native Americans constitute too small a group to generate a reliable estimate. The scores from blacks, whites, and Hispanics and those from the 38 participating nations (adding an ethnic group makes a total of 39) generate these results:

                        Score              Rank (of 39)
                    Math    Science      Math    Science
Whites              525       547         13        6
Blacks              444       438         32       32
Hispanics           457       462         30       29
Int’l average       487       488

These results look drearily familiar. Unfortunately, TIMSS has no direct measure of poverty, only such indicators as the number of books in the home. The data above are stark enough, but if we could show data by ethnicity and poverty level, we’d see even more dramatic evidence of savage inequalities.

 

Of somewhat more interest than TIMSS-R was what I will call TIMSS-B, the TIMSS Benchmarking studies. Taking all 38 nations together, TIMSS-B calculated what proportion of students attained certain “benchmark” levels: 90th percentile, 75th percentile, 50th percentile, and 25th percentile. In addition to the 38 nations, 13 states and 14 school districts or consortia of districts participated. The 38 nations generated an international mathematics average of 487. U.S. students scored 502.

 

All 13 states (Connecticut, Idaho, Illinois, Indiana, Maryland, Massachusetts, Michigan, Missouri, North Carolina, Oregon, Pennsylvania, South Carolina, and Texas) scored higher than the international average, and all but four of them scored at or above the U.S. average. Idaho, Maryland, Missouri, and North Carolina scored lower. Michigan, Texas, and Indiana topped the list. (Yes, Texas, but more about that in a moment.) Note that none of the states that scored highest in the first TIMSS (Iowa, Nebraska, Maine, Minnesota, Montana, North Dakota, and Wisconsin) participated in TIMSS-B. Here are the results for math:

 

Percentage of Students Attaining

Selected Math Benchmarks

                              90th           75th        50th        25th

International            10                25            50            75

Texas                       13                32            66            90

Connecticut            11                31            67            91

Illinois                     10                29            65            92

Massachusetts          10                31            68            92

Michigan                  10                33            70            92

Oregon                     10                32            69            91

South Carolina         10                30            60            88

Indiana                       9                28            65            88

Pennsylvania             9                28            65            91

Maryland                   8                27            57            87

North Carolina           7                25            57            88

Idaho                         5                24            61            88

Missouri                     4                20            58            89

United States              9                28            61            88

Singapore                 46                75            93            99

South Africa               0                  1              5            14

 

TIMSS-B did not offer competition as tough as TIMSS. Some industrialized nations that took part in TIMSS did not participate in TIMSS-B, and a few more developing countries did. In the original TIMSS, none of the seven highest-scoring states named above placed more than 6% of students at the 90th percentile in math.

Still, there were 37 countries, plus Taipei, in TIMSS-B, including such high flyers as Singapore, Japan, Korea, and Hong Kong. Seven states had percentages of students at the international 90th percentile that were as high as or higher than these 37 nations. All but two states had at least 25% of their students scoring at the 75th percentile, and all 13 states had higher percentages scoring at or above the 50th percentile or at or above the 25th percentile than these 37 countries.

The United States had almost as high a proportion of students at the international 90th percentile, 9%, as the top-scoring states, and only 14 of the 37 nations had as high or higher proportions at this level. The U.S. had a higher percentage than average on the three other benchmarks. The contrast between the highest-scoring nation, Singapore, and the lowest, South Africa, shows the great gulf between the First and Third Worlds.

A Nation at Risk tightly yoked the test performance of students to the economic health of the nation. However, on 10 July 2001 Singapore declared its economy officially in recession.23

 

Meanwhile, observers worry that Japan will experience a second decade of recession. These facts should end any further assertions that high scores produce a competitive economy. By the way, education alone doesn’t produce jobs. If it did, India wouldn’t have tens of thousands of unemployed software engineers waiting for visas to the U.S.

The results for the American districts and consortia reveal contrasts almost as stark as those between the highest and lowest-scoring nations. In math, only the five Asian nations finished ahead of Naperville, Illinois, and the First in the World Consortium, a group of 19 suburban Chicago districts. Only seven countries bested Montgomery County, Maryland, which has a lot more poverty than people realize and which also has more than 100 foreign languages to cope with. And only eight nations outscored the Michigan Invitational Group.

At the bottom, only five countries scored lower than the Miami-Dade school district. Only eight trailed Rochester, New York, and Chicago surpassed only 10. (In the original TIMSS, only three of 41 countries scored lower than Mississippi; only one scored lower than Washington, D.C.)

As with the first TIMSS, American students fared better in science than in math. The international average was again 488, but the U.S. average in science was 515. All states scored higher than the international average, and four scored below the U.S. average. Here are the benchmark results for science:

 

Percentage of Students Attaining Selected Science Benchmarks

                  90th    75th    50th    25th
International       10      25      50      75
Michigan            22      47      75      91
Oregon              19      43      73      91
Indiana             18      41      72      92
Connecticut         17      39      69      90
Massachusetts       17      40      71      92
Pennsylvania        15      38      70      91
Texas               15      35      61      83
Illinois            14      36      66      88
Missouri            14      36      67      89
Idaho               13      37      70      91
South Carolina      13      34      60      85
Maryland            12      31      59      84
North Carolina      11      30      60      85
United States       15      34      52      85
Singapore           32      56      80      94
South Africa         0       2       6      13
Naperville          33      64      90      98

 

The results from TIMSS, TIMSS R, and TIMSS B clearly indicate the need for something that people like me, David Berliner, Bruce Biddle, Harold Hodgkinson, and Michael Casserly of the Council of the Great City Schools have been calling for for years: a “Marshall Plan” for the inner cities and poor rural areas. Reforms predicated on the dire state of the typical American public school or on the “crisis” in public education are wholly misguided.

My declarations about the inadequacy of the TIMSS Final Year Study stand. Still, those data yielded some interesting information. The College Board compared the scores of American students taking Advanced Placement (AP) exams in calculus and physics with the TIMSS scores of the various countries.

In calculus, AP students outscored all 16 countries, averaging 573 points, compared to 557 for France, the highest nation (this difference was not statistically significant, but the other 15 were). Students who took the AP Calculus AB test (a test of first-year calculus) and received a score of 3 or better (considered passing) scored 586 on the TIMSS Advanced Math. Those who scored lower didn’t fare much worse: 565. Students who took the AP Calculus BC test (a test of second-year calculus) and scored 3 or better aced the TIMSS test at 633. Roughly two-thirds of the students taking each test scored 3 or better.

In physics, AP students finished fourth, behind Norway, Sweden, and Russia. But recall that the Scandinavian students had studied physics for three years. Russia tested only 2% of the student population and only those in Russian-speaking schools. Those who achieved a 3 or better on the AP Physics test scored 586 on the TIMSS physics test, five points ahead of top-ranked Norway. Students with AP scores in physics of less than 3 scored substantially lower: 511. This ranks them ninth among the 17 countries in the TIMSS Final Year Study of physics.

The study also revealed a different aspect of the ethnic achievement gap: virtually no blacks or Hispanics took either AP test. The calculus group contained just 1% black and 3% Hispanic students. Seventy-two percent were whites, and 21% were Asians/Pacific Islanders. The physics group was made up of 1% blacks, 4% Hispanics, 66% whites, and 26% Asians/Pacific Islanders. In both groups, 4% of the students checked “other.”




 

Vouchers, R.I.P.

In his 1962 book, Capitalism and Freedom, Milton Friedman developed the modern concept of school vouchers, influencing, among others, Ronald Reagan, who made them part of his education agenda. Friedman and his wife Rose lead a foundation dedicated to the propagation of vouchers. In the FAQ (Frequently Asked Questions) section of the foundation’s website, one question is “Are vouchers popular?” The site says unequivocally yes and then provides a lot of survey data to try to bolster the claim (the survey data are more equivocal than Friedman would have you believe). Interested readers should visit www.friedmanfoundation.org and click on Frequently Asked Questions.

 

Survey data about vouchers, however, have always proved wrong when the issue becomes meaningful, as in a vote. This is how it happened in 2000. That year’s election saw voucher proposals in California and Michigan go down in flames by large margins. Silicon Valley entrepreneurs sponsored the California proposal and funded it generously. In Michigan, voucher advocates outspent opponents 2 to 1. I asked Friedman how he interpreted this debacle, and he said that the “defeats are highly relevant to the question of political tactics.” But he also said that he retained his faith in the efficacy of vouchers.

Along with generous funding, voucher proponents garnered support from conservative pundits. George Will declared that facts about voucher successes had “pummeled” opponents; William Safire concluded that vouchers would wipe out the black/white achievement gap.24

 

It didn’t help. Both pundits, interestingly publishing on the same day, drew mostly on the work of Harvard’s Paul Peterson, who has allowed his voucher theology to cloud his vision. Early on, Peterson characterized voucher advocates as “a small band of Jedi attackers” who were engaged in a fight with the unified might of “Death Star forces.” 25

 

Usually, researchers write up a research report, have some friends read and review it, then pass it on to a journal, where three or four anonymous reviewers will pass judgment on its merits. Peterson gave his study of vouchers in Milwaukee to the Associated Press. The resultant story just happened to appear on the same day that Peterson and his frequent publishing companion, Jay Greene, now of the Manhattan Institute, published an op-ed piece on the same subject in the Wall Street Journal, which characterized John Witte’s original evaluation of the Milwaukee voucher program as “bad science.” This just happened to be the same day that Republican Presidential candidate Bob Dole proposed vouchers to the Republican National Convention. As the Church Lady might say, “How convenient.”

 

Deception by the Numbers, a booklet produced by People for the American Way, describes many of the inadequacies of Peterson’s work. For instance, using data from his studies in New York City, Dayton, and Washington, D.C., Peterson has claimed that vouchers work for African American children, but not for other ethnicities.26

 

This is a most curious finding that Peterson has never attempted to explain. In fact, black children in New York City showed gains only in the sixth grade, not in grades 3, 4, or 5. However, grade 6 gains were so large that, when Peterson averaged the four grades, the average was significant. Peterson’s description of the results led David Myers of Mathematica Policy Research, Inc., a co-investigator with Peterson in the New York study, to call the claim “premature.”

“Right now you come away saying, ‘No there’s no impact,’ ” said Myers. 27

 

A later press release, available on the Mathematica website (www.mathinc.com/3rdLevel/school.htm), makes this clear: “The report shows no overall differences in test scores between 3rd through 6th graders who were offered vouchers and those who were not. However, there were large and statistically significant impacts for African American 6th graders who were offered vouchers.” Indeed, the results from Dayton are not significant even for African Americans.28

Peterson has not explained this anomaly, either.

Meanwhile, back on the referendum front, the California proposal from Silicon Valley entrepreneur Timothy Draper would have provided a $4,000 voucher for all children, including the 600,000 students already enrolled in private schools. A wide spectrum of groups opposed the proposal. The plan for subsidizing those wealthy families whose children already attended private schools offended some. Some worried about draining money from the public schools. Some private schools said that they would not accept voucher students who scored below grade level, and others expressed no interest in expansion. And even if private schools were of a mind to grow, one estimate has contended that the nation’s existing private schools could absorb only 4% of public school children.29

 

Michigan put a more complex proposal before the people. In addition to providing vouchers worth $3,200 to students in “failing districts,” it established a teacher testing program and set a minimum funding level for schools. Supporters claimed that the legislation would affect more than 30 districts, but the Michigan Department of Education put the figure at seven. 30

 

Postmortems attributed the defeat to the complexity of the proposal, to the fear of taking money away from public schools, and to the fact that people could not easily read the proposal’s position on the political spectrum by looking at supporters and opponents. Gov. John Engler, a popular Republican known for education reform, opposed it. So did former Gov. James Blanchard, a Democrat.31

 

In the wake of the defeats, voucher advocates, such as Peterson, Jeanne Allen, and Trent Lott, decided that the word “voucher” should be dropped from the lexicon of school choice. “I think maybe the word is part of the problem,” Lott said. “Maybe the word should be ‘scholarship.’ ”32

 

At his confirmation hearings, Rod Paige told the committee that “the word ‘vouchers’ has taken on a negative tone.” 33

 

Given these events and sentiments, President Bush’s voucher proposal was quickly removed from both the House and the Senate education bills. As proposed by Bush, the plan would have transferred wealth from taxpayers of all denominations to the Catholic Church. Journalist and voucher advocate Matthew Miller argued that, in cities, a voucher would have to be worth at least $6,000 to interest people.34

 

Bush’s vouchers were, in Miller’s word, “puny,” worth just $1,500. Only the heavily subsidized Catholic schools, which have been hemorrhaging students (12.6% of all students in 1960, 4.7% in 2000), could have afforded to accept them. 35

 

Meanwhile, the Bush Administration asked the U.S. Supreme Court to hear the Cleveland voucher case. A federal appeals court, noting that 96% of the students using vouchers attended Catholic schools, had declared the program unconstitutional.

 




Charter Schools

Accountability burbled to the top of the pot of charter school issues this year. In his 1996 book on charter schools, Joe Nathan wrote, “Hundreds of charter schools have been created around this nation by educators who are willing to put their jobs on the line, to say, ‘If we can’t improve students’ achievement, close down our school.’ That is accountability — clear, specific, and real.” 36

 

And nonexistent. If this all-or-none test were applied to charters, precious few would still stand. Charter operators have often resisted producing financial or achievement data, even when this information falls under a state’s freedom of information law. An RPP International report for the U.S. Department of Education found that just 37.3% of charter schools sent a progress report to the chartering agency. Some 60.9% did send a report to the charter board, but only 41.2% sent one to the students’ parents, and only 25.3% delivered one to the community.37

 

A review of accountability in 10 active charter states found little activity and few trends toward tightening accountability requirements. 38

 

But people are talking about accountability. Chester Finn and his colleagues wrote in 1996 that they had “yet to see a single state with a thoughtful and well-formed plan for evaluating its charter school program.”39

 

Finn and his colleagues returned in late 2000 to observe, “Charter school discussions are saturated with talk about accountability. Some view it as the third rail of the charter movement, some as the holy grail.”40

 

They proposed something they call “accountability by transparency,” whereby “the school routinely and systematically discloses complete, accurate, and timely information about its program, performance, and organization.” Their system, though, requires so much information in the form of various test scores, progress toward goals, student standards, curriculum, instructional methods, demographic characteristics, and more that it would seem to eviscerate the original concept of a charter school. In any case, it will not be adopted. No state has a true formal accountability program. Several, among them Colorado, Florida, Massachusetts, New York, and Ohio, [...]

In addition, Michigan charter sponsors, mostly public universities, encourage charter operators to team up with EMOs. The EMOs, presumably, have the skills and experience that the individual charter operators lack. Charter authorizers also favor EMOs because they are known quantities. This reduces the risk that some idiosyncratic, visionary charter operators will mismanage or steal money, develop a curriculum bereft of intellectual content, or otherwise mess up. Finally, working through an EMO gives the charter operators access to startup capital. The lack of capital poses the biggest problem for charter founders.

Interestingly, a little-noticed provision in California’s Proposition 39 might have solved or greatly ameliorated this problem. The proposition generated much debate because it lowered the size of the majority of voters needed to approve school bonds. But the proposition also directed school districts to provide charters with facilities that are “reasonably equivalent” to those provided by the public schools. All districts must comply by 2003.

One might anticipate in Ohio an EMO takeover similar to Michigan’s. The most recent report from Ohio’s Legislative Office for Educational Oversight (LOEO) had this to say about charters, called “community schools” in Ohio, and EMOs:

LOEO found that community schools benefit significantly from the assistance of management companies in areas such as financial management, curriculum development, teacher in-services, and general support and guidance. Directors [of charter schools] remarked to LOEO that the management company is the first place they turn when they have questions. Community schools not operated by a management company must be responsible for all aspects of running the school, ranging from curriculum design to staff hiring and evaluations to planning budgets. The director of one community school without a management company commented that “schools operated by a management company have the assistance I was looking for this year.”42

 

Turning to EMOs might be practical, but it defeats some major purposes of the charter school movement: to stimulate innovation in curriculum and instruction and to give the people in the school building and the parents in the community ownership and control over what goes on in the school.

In fact, accountability for charters has proved more complicated than such early advocates as Joe Nathan, Albert Shanker, Ted Kolderie, and Ray Budde envisioned. Most state-level evaluations have concluded that no single instrument serves appropriately for all schools. It is clear in these evaluations, though, that the charter authorizers, boards, and the school operators haven’t thought much or clearly about what would serve as appropriate instruments. Bruno Manno has outlined some of the difficulties:

Today, it’s hard to know how well charter schools are actually doing. . . . There are three predominant reasons for this situation. First, the charter strategy is so new that it’s difficult to measure results. There’s just not much data out there. Second, today’s charter accountability systems remain underdeveloped, often clumsy and ill-fitting, and are themselves beset by dilemmas. A final reason for the dearth of charter school accountability information lies with authorizers and operators. Truth be told, they are often content to leave accountability agreements nebulous and undefined. Leaving accountability agreements indeterminate is fraught with danger because over the long term this approach is more likely to lead to a charter school being subjected to the rule and compliance-based accountability practices that characterize conventional schools.43

 

Rutgers University researcher Katrina Bulkley finds four factors that make it difficult to revoke or not renew a charter:

1. Educational performance is not simple to define or measure, nor is how good is “good enough” in educational quality. In this context, authorizers sometimes turn to “proxies” to assess school quality.

2. Other aspects of a school’s program, often more difficult to measure than test scores, are also important for families and authorizers.

3. Teachers, parents, and students become very invested in particular schools, and destroying a community is more difficult than serving a diffuse public interest (like the one that would be served by closing a low-achieving school).

4. Charter schools have become a highly politicized issue on both sides, and some authorizers are concerned about their decisions (to close schools) reflecting poorly on charter schools as a reform idea.44

 

These four challenges form what Bulkley calls “the accountability bind.” Proxies for achievement include parent and student satisfaction, accreditation by some national accrediting agency, and, in the case of EMOs, a possible “halo effect” — if authorizers view one school managed by the EMO as successful, they are likely to see the EMO’s other schools that way as well.

 

As regards the fourth challenge, those who authorize charter schools are often favorably disposed to the charter concept. This gives them an additional reason to let the charter continue. Bulkley observes further that, “while authorizers have difficulty determining what is and is not a successful charter school, they have even more difficulty deciding that a charter school is unsuccessful enough to justify as high a sanction as closure.”45

 

Not many charters have suffered the indignity of a revoked or non-renewed charter. Nationally, just 4% of all charters have closed. Texas has the highest rate at 8%. And some situations there have received devastating publicity. No doubt that publicity served as one reason why the Texas House of Representatives wanted to declare a moratorium on new charters. In a compromise, the legislature, over Gov. Rick Perry’s objections, capped the number of allowable charters at 211 (192 operated in 2001-02). Perry allowed the bill to become law without signing it.

When charters do close, it is not always clear why. Authorizers seldom list academic reasons as the principal cause. Usually money problems dominate, but this might be misleading. Eric Premack of the Charter School Development Center at California State University, Sacramento, outlines the problems very well in an email message to me:

 

Pinpointing the primary cause of a revocation is a lot more difficult than one might think. Difficulties of schools academically, legally, financially, and at the school governance level tend to be very closely related. For example, if a school is unable to offer the academic program at the level promised to parents, it only takes a small number of parents dis-enrolling their children to send the school into a financial tailspin. This financial pressure, in turn, tends to lead to infighting at the governance and administrative level. When districts revoke, they usually focus on the financial and legal issues because they tend to be much easier to document. District staff prefer to bring clear and unambiguous reasons to their boards. As you [referring to me] know from your years of research in this area, measuring academic growth and progress is a dismal and slippery “science.” It’s a lot easier for districts to document that a charter school’s budget is out of balance than it is to document declines in test scores or poor implementation of some promised academic program.