THE 11TH BRACEY REPORT ON The Condition of Public Education
By Gerald W. Bracey
Posted with permission from the October 2001 issue of Kappan

A LOT of people think I defend schools reflexively. Not so. But a
little more than a decade ago, I found a lot of data that proved that the people who
make up what I have come to call the Education Scare Industry were wrong, and I
said so. When I have thought the schools have been wrong, I have said that, too.

In addition to being
a longtime teacher of English and journalism, Schmidt also publishes a muckraking
(not a pejorative term in my lexicon) monthly newspaper called Substance. One day, a plain
brown envelope delivered to the Substance offices was found to
contain copies of the CASE (Chicago Academic Standards Examinations). Schmidt
thought the test items were awful and, rather than write an editorial to that
effect, published them in his paper. CPS suspended him without pay and sued for
$1.4 million, which it claimed would be necessary to write new tests. CPS
subsequently fired him. When I saw the tests,
I tried to imagine what would have happened had I produced them back in the
days when I was director of testing for the Virginia Department of Education.
Someone would have leaked the tests to the Richmond Times-Dispatch. The Dispatch would have published the
worst questions, along with a scathing editorial mocking the state department’s
incompetence. The department would have summarily sacked me and deservedly so. This is what should
have taken place in Chicago. The Chicago Tribune should have picked
the tests up from Schmidt, published the worst items, and written a scathing
editorial. Then CPS should have fired Carole Perlman, director of testing. Instead,
the Tribune backed the tests and demanded that Schmidt be fired. Perlman
testified against Schmidt at his hearing. These tests were more than just a set
of “trivial pursuit” items, although most of the items were that, too. The
tests contained items that had no right answer, items that had multiple right
answers, and items to which the official right answer was wrong. They also
contained items for which an earlier item cued the answer for a later one. In
short, these tests were garbage. At a hearing on the
issue (at which I testified in support of Schmidt), Perlman had the effrontery
to defend the tests and even cajoled Tom Kerins, former Illinois director of
testing, to testify on behalf of the tests and to confirm the cost estimate to
replace them. Shame on you, Tom. The $1.4 million, by
the way, works out to about $12,000 an item. Chicago schoolteachers, not
professional item writers, wrote the questions. At $12,000 per question, every
four items cost the equivalent of a Chicago teacher’s annual salary. How could
they possibly cost so much? Well, it is true that CRESST (Center for Research on
Evaluation, Student Standards, and Testing) liberated $500,000 from CPS, but it
claimed to have provided only a little technical assistance in teaching
teachers how to write items. How CRESST could charge so much money for so
little work could also spark an investigation. The CPS suit is ongoing, as is
Schmidt’s counter-suit involving First Amendment arguments.
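
The cost arithmetic above can be checked with a short sketch. The only inputs are the two figures quoted in the text ($1.4 million and about $12,000 an item). The item count is not stated in the article, so the value computed here is merely what those two figures imply, and the variable names are illustrative.

```python
# Figures quoted in the text.
SUIT_AMOUNT = 1_400_000   # dollars CPS claimed it would need to write new tests
COST_PER_ITEM = 12_000    # approximate dollars per item, as stated

# Implied number of items (an inference from the two figures, not a reported count).
implied_items = SUIT_AMOUNT / COST_PER_ITEM

# "Every four items cost the equivalent of a Chicago teacher's annual salary."
four_item_cost = 4 * COST_PER_ITEM

print(f"Implied number of items: about {implied_items:.0f}")
print(f"Cost of four items: ${four_item_cost:,}")
```

On these figures the suit covers a little under 120 items, and four items come to $48,000, which is the salary comparison the text draws.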
Some educators in
Western Massachusetts invited Alfie Kohn to be the keynote speaker at a
conference. When the Massachusetts Department of Education (MDE) heard that
Kohn would be the keynoter, it told the organizers that, if Kohn spoke, the money for the
conference would be withdrawn. The organizers caved, even though the money to
pay Kohn was not from MDE funds. Officially, the reason for denying Kohn the
right to speak was that his topic was beyond the theme of the conference. Kohn
was invited to speak on standards and assessment, and the organizers titled the speech “The Case
Against Standardized Testing.” The purpose of the conference was for charter
schools and other public schools to share information about common issues. But,
as Chester Finn and his colleagues have observed, “Charter school discussions are
saturated with talk about accountability.” And talk about
accountability usually includes talk about testing. Other sessions at the
conference covered testing, and many had nothing to do with charter schools. Kohn
was paid not to speak. He says that
the MDE’s action was not surprising: “It’s a small step from saying, ‘Pass this
test or you don’t graduate,’ to saying, ‘Renege on this speaker or you don’t
get funded.’” The ACLU is progressing toward a suit.

The Ohanian saga, which
she discussed briefly in her January 2001 Kappan article, remains murky with regard to who is
behind it and what they hope to accomplish — other than to frighten her and make
her spend money on lawyers in two states. I spoke to Alvin Wilbanks, the school
superintendent in Gwinnett County, who advised me that there was an “ongoing
investigation” but steadfastly refused to tell me who was conducting it. Jim
Keinard, the Gwinnett School Police
officer in charge of the investigation, has not returned phone calls or replied
to e-mails. Ohanian has recently been ordered to supply fingerprints and a
writing sample. Georgia law does not permit officers to ask for a writing
sample. More than most high-stakes tests, the one in Gwinnett County had roiled
many waters. Given the enormous amount of testing already present in the
district, many Gwinnett citizens simply saw no need for it. Some saw malice and
maybe even malfeasance in a memo from Assistant Superintendent Gale Hulme.
Hulme claimed it was “human error” that caused some RFPs for the Gateway exam
not to be sent to most bidders on time. It is not clear that only CTB/McGraw-Hill
received the RFP on time, but only CTB bid. Harcourt Educational
Measurement declined, in a memo from then-Vice President Phillip Young, dated
three weeks after the deadline. Finally, some Gwinnett teachers were upset that
the passing scores on some tests were set very close to 25% correct. This, of
course, is the chance level, uncorrected for guessing, and strongly suggested
that the whole enterprise was a political game. Based on information
provided by Gwinnett School Police, Vermont detective Timothy Bombardier in an
affidavit accused Ohanian of attending a meeting of the “Alfie Kohn Group.”
Bombardier wrote that “The Alfie Kohn Group trains people in how to disrupt and
prevent the implementation of high-stakes testing. On 31 March 2001, the Alfie
Kohn Group met at Columbia University in New York, and Lisa Amspaugh was in
attendance. [Amspaugh is a former resident of
Gwinnett County and a critic of the test.] An attendee at the meeting advised
investigators that a session was held specifically to plan strategies to
disrupt the Gateway test.” This would be hilarious
if it did not come from an agent of the law. The conference was organized by
Columbia University faculty members and FairTest. I spoke at it, sharing a
session with noted radical Ted Chittenden of the Educational Testing Service.
One day was indeed devoted to developing strategies to counter the negative
effects of high-stakes testing, but it dealt with topics like how to get the
message to the media, to politicians, etc. No one ever mentioned the Gateway
test, and no one said anything about disrupting any test administration or
committing any acts of civil disobedience. Neither Ohanian nor Kohn attended
the meeting. The tale began with
someone who pilfered the county’s high-stakes test. Among the events that
followed was the spectacle of all schools having to count, in front of a
policeman, their copies of the test.

A couple of decades
ago, I formulated Bracey’s Paradox: test scores mean something only when you
don’t pay any attention to them. Lately, a lot of people have been paying a lot
of attention to them. If 2000 was the year that testing went crazy, 2001 was
the year it went stark raving mad. I have already recounted three of the most
outrageous incidents. Others merely reflect the tyranny of testing. What say we
take a moment to consider a few of the personal qualities that standardized tests
do not measure: creativity, critical thinking, resilience, motivation, persistence, humor, reliability, enthusiasm, civic-mindedness, self-awareness, self-discipline, empathy, leadership, and compassion. Events in New York and
Virginia reflected testing’s ascent to dominance. In New York, 37 small
alternative schools had built their curricula around portfolios as a means of
assessment. They wanted to use these in lieu of the state tests for graduation.
No can do, said New York Education Commissioner Richard Mills. Alternative
school students have to take the tests just like everyone else. In Virginia, people
pressured the state board of education to permit alternatives to the board’s
own tests, then the sole determinant of eligibility for high school graduation.
Okay, said the board — and added more tests: the SAT, the Advanced Placement
tests, and the International Baccalaureate. Grades and teacher recommendations were
deemed too subjective. We should note that, for all their subjectivity and
alleged variation in meaning and rigor from place to place, high school grades still
predict first-year college grades at most universities better than the SAT. “NewsHour with Jim
Lehrer” also acknowledged testing’s prominence with a long segment. (A
transcript of this “NewsHour” segment can be found at www.pbs.org/newshour/bb/ Can we now? Thomas Kane
of the Hoover Institution and Douglas Staiger of Dartmouth College concluded
that between 50% and 80% of the “improvement” in annual test scores for a
school was temporary and caused by fluctuations that were not related to an
increase in achievement.3 David Grissmer of the
RAND Corporation put the implications of these findings this way: “The question
is, are we picking out lucky schools or good schools, and unlucky schools or bad
schools? The answer is, we’re picking out lucky and unlucky schools.”4 Kane and Staiger made a
telling statement: “Most of these [school accountability] systems have been set
up with very little recognition of the strengths and weaknesses of the measures
that they’re based on.”5 The reason they have
been set up that way, of course, is that the people who set them up, the
Keegans and Everses of the world, have an agenda. It is all about ideology,
power, and control and not at all about children, learning, and education. Neill
and Kohn pointed out a number of other weaknesses of tests, as well as some of
the negative outcomes they produce. They didn’t mention that, in Virginia Beach,
the board of education called a special session to decide if it needed to mandate
recess for the district’s elementary schools, because many of the schools had
eliminated recess in favor of additional test preparation. The problem extends well beyond Virginia Beach. 6 Washington Post reporter Liz Seymour found
that Virginia’s tests not only flunk a lot of children but also have created a
new class of dropouts: teachers. Some have taken early retirement, some have
fled to private schools, and some have requested transfers to grades that are
not tested. 7 Seymour interviewed
teachers only in some of the highest-scoring districts in Virginia, those who
would have the least to fear from the tests. The high-stakes fourth-grade test in
New York is having a similar effect. Because tenured teachers can choose their assignments,
fourth grade has become the province of the least-experienced teachers.8 Seymour quoted Virginia
state board president Kirk Schroder, who claimed, “People miss the big picture
here. The reality is that accountability is changing the culture of public
education, and in some respects that has created some very positive achievement
in some places where student achievement did not exist.” Schroder offered
no examples. Surely, he did not have in mind the performance of students in
algebra I in Richmond schools. On the third administration of the algebra I
test, Richmond’s high schools had these passing rates: 19.8%, 10.3%, 9.0%,
5.8%, 4.6%, and 2.6%. Only three small, selective, affluent schools did better. The
techniques for setting passing scores reveal the purely political nature of these
programs. Virginia employed the widely used Modified Angoff procedure. The process
generates a recommended cut score from each of the 20-odd judges who participate.
Usually, a cut score in the middle of the full range of
recommended scores is taken as the official passing score. For 19 of 21 tests,
the Virginia board selected the highest recommended cut score. For the two others, it
set the passing score higher than any of the judges had recommended. But at least
the Virginia judges had some training and used a generally accepted procedure.
In California, a panel of 100 people was given a dictionary definition of
“competence” and told not to worry about setting a high passing score because eventually
students would get there.9
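
To make the Virginia cut-score point above concrete, here is a minimal sketch of the difference between taking a middle recommendation and taking the highest one. The judges' recommendations below are invented for illustration only. The article reports no individual recommendations, and using the median as "the middle of the range" is an assumption about what a typical panel would do.

```python
from statistics import median

# HYPOTHETICAL recommended cut scores (percent correct) from 21 judges.
# These numbers are made up for illustration; they are not Virginia data.
judge_cut_scores = [48, 50, 52, 53, 55, 55, 56, 57, 58, 58, 59,
                    60, 60, 61, 62, 63, 64, 65, 66, 68, 72]

# A "middle of the range" choice, here assumed to be the median recommendation.
middle_cut = median(judge_cut_scores)

# The choice the text attributes to the Virginia board for 19 of 21 tests:
# the single highest recommendation.
highest_cut = max(judge_cut_scores)

# How many judges recommended something lower than the score actually adopted.
judges_overridden = sum(1 for s in judge_cut_scores if s < highest_cut)

print(f"Median recommendation: {middle_cut}")
print(f"Highest recommendation: {highest_cut}")
print(f"Judges who recommended a lower score: {judges_overridden} of {len(judge_cut_scores)}")
```

With any spread of recommendations, adopting the maximum means the passing score reflects one judge's view rather than the panel's, which is the point the text makes about the process being political.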
The judges recommended
a cut score of 70%. State Superintendent Delaine Eastin overruled the judges and
set the passing score at 60% for one test and 55% for the other. Still, a
majority of the students failed, and the media scratched their heads over how
so many students could flunk such an “easy” test, a test that, after all, required
students to get barely more than half of the items right. (Recall that
norm-referenced tests are composed mostly of questions that about half of the
students get wrong; the percent correct on a test says nothing about its
difficulty.) The Alliance for
Childhood, a loose coalition of psychiatrists, pediatricians, and educators,
attempted, with little success, to bring some sanity to the situation with a
position paper on high-stakes testing. The section headings summarize the
paper’s story: “The Technology of Testing Is Flawed”; “Test Scores Have Meaning
Only in the Context of the Whole Child”; “Evidence Is Growing of Harm to Children’s
Health”; “More High-Stakes Testing Means More Dropouts, Fewer Good
Teachers”; and “Standardization Is the Enemy of Effective Public Schools.”10

Into the existing
nuttiness over testing, Bush injected an unworkable and self-contradictory plan
for chaos. In the name of giving states more freedom and flexibility, the
President proposed to force them to test all students every year in reading and
math in grades 3 through 8. Schools would be required to make “adequate yearly
progress,” a concept that caused everyone’s eyes to roll back in their heads,
even those who hadn’t seen the article by Kane and Staiger on the
instability of annual gains. The initial House version would have labeled most
schools as “failing schools.” When Shadow Secretary of Education Sandy Kress rewrote the
“adequate yearly progress” formula, he called his own work “Rube
Goldbergesque.”11 Chester Finn said the
legislators had rendered the notion of adequate yearly progress so
“complexified” that it defied explanation to parents and teachers.12 In addition to
“adequate yearly progress,” the Bush plan calls for all students to reach the
“proficient” level on state assessments, said assessments to be confirmed by the National Assessment of
Educational Progress (NAEP) or by a nationally normed test. As FairTest noted
in its analysis of the many problems in the Bush plan, this would require rates
of progress never before seen in education (www.fairtest.org/nattest/bushtest.html). In most states, fewer
than one-third of fourth-graders currently attain the NAEP proficient level,
and performance on state assessments often differs widely from NAEP. In Texas,
for example, 89% of youngsters are proficient on the state reading assessment, but
just 29% are proficient on the NAEP. Only 12% of black students scored proficient
or better on the 2000 NAEP reading assessment. Sixty-three percent scored below
basic. If the NAEP were administered in the third grade,
similar results would probably be found. Currently, the House plan comes in 10-year
and 12-year versions. Suppose the 10-year version becomes law. An average of
6.3% of American third-graders must move from below basic to proficient each
year for 10 consecutive years. New York Times education writer Jodi Wilgoren
interviewed education leaders in all 50 states and found them complaining about
the Bush proposal because it ignores an entire decade of work to develop
standards and tests. 13 But what has this
decade of work gotten us? Falling test scores. Aside from the SAT, only the
Iowa Tests of Basic Skills and the Iowa Tests of Educational Development provide
evidence of long-term trends. By Iowa law, each new form of the test must be
equated to the old form. Scores rose from 1955 to about 1965, fell for about a
decade, and then rose to mostly record highs by the mid to late 1980s. After
the 2000 renorming, though, the scores fell. No one seems to understand why. I
would place my own bet primarily on changing demographics. The 2000 census
paints a very different picture of America from the one painted by the 1990 census
— much less the earlier ones. Now for a quick summary of the best of the rest
of this year’s news about tests. Richard Atkinson,
president of the University of California System, set tongues wagging by
proposing that the university do away with the SAT as a college admissions requirement. He proposed temporarily using the College Board Achievement
Tests until something better and more appropriate could be developed. The media
made it a big deal when Mount Holyoke banished the SAT, but Joanne Creighton,
Mount Holyoke’s president, said that the test never counted for more than
10% in the admissions decision anyway. Many stories covered
cheating scandals. Many others documented the major errors made by companies
that develop and score tests and explored the injurious impact of these errors
on students. Resistance to the tests also grew. Parents in several states
boycotted state-mandated tests. The Business Roundtable felt the resistance sufficiently
to issue a monograph on how to counter the testing “backlash.”14 Eugene Paslov, CEO of
Harcourt Educational Measurement, garnered a fair amount of ink by saying that
tests such as the ones his company produces should not be used as graduation
requirements. He said his company could not tell school districts how to use
the tests, but “we do have a responsibility to tell policy makers how we feel.”15

New NAEP Data

The most disturbing
thing about the 2000 NAEP reading and math assessments was the way media and
state officials covered and interpreted them. The reading data, which did not
show any change, received little in the way of headlines. Nevertheless, in a Wall Street Journal op-ed piece, former
Delaware Gov. Pete du Pont called the
results “disastrous.”16 Recall again that American students finished second in an international comparison of reading achievement. Theoretically, we could be suffering an international literacy crisis, but no one has claimed so.

Few papers carried the math results on the front page. Many of the nation’s leading papers, including the Washington Post, New York Times, Chicago Tribune, Chicago Sun-Times, and USA Today, buried the story deep in Section A. This is disturbing because the results were generally positive. The 12th-grade scores dropped three points from the 1996 level, leaving them well ahead of the scores in 1990. Both fourth- and eighth-graders showed improvements. If their scores had dropped, a couple of journalists admitted to me, the story would have garnered page-one placement.

Most papers treated the results as a state, not a national, story. The national results appeared mostly in national papers and in papers in states that did not participate in the assessment.
In some states, the newspapers
and state officials bragged: Connecticut, Indiana, Iowa, Maryland, Massachusetts,
Minnesota, North Carolina, Texas, Vermont, and Virginia. In other
states, they lamented the low performance: Arkansas, California, Nebraska,
Oklahoma, Utah, and Wyoming. Still others did a little of both, pointing out
gains but mentioning below-average
performance: Alabama, Idaho, Illinois, Kentucky, Louisiana, Michigan, and
Ohio. The headline in the Biloxi Sun Herald was the most sorrowful: “Mississippi Improves
Scores, but Finishes Last on Test.” This kind of coverage is disturbing because
the utility of the NAEP depends on its invisibility. (See Bracey’s Paradox,
above.) As soon as you start paying attention to a test, you introduce all kinds
of corrupting influences that invalidate the scores. State officials attributed
gains to their state’s reform efforts. Although the
NAEP likes to bill itself as “the nation’s report card,” it is increasingly
becoming the states’ report card, to be used for bragging or to goad educators to greater
effort and achievement. Thus it will not be usable to “confirm” Bush’s testing
program or any other program.

No Child Left Behind

The Bush education plan
as presented to Congress in the document “No Child Left Behind” begins with
three falsehoods: “Today nearly 70% of inner city fourth graders are unable to
read at a basic level on national reading tests. Our high school seniors trail
students in Cyprus and South Africa on international math tests. And nearly a
third of our college freshmen find they must take a remedial course before they
are able to even begin regular college level courses.” There are no published
data to support the 70% contention. (In an August 1 speech to the Urban League,
Bush amended the figure to “almost two-thirds.” In an earlier speech, First
Lady Laura Bush had used the better figure of 60%.) The 2000 NAEP results in
reading show that 47% of students in central cities score below basic. Sixty
percent of students eligible for free and reduced-price lunches score below basic. As for high school
seniors trailing Cyprus and South Africa, these are the two countries that the
U.S. outscored, not trailed. Of course, to consider these data at all, one
has to accept the results of the Final Year Study of TIMSS (Third International
Mathematics and Science Study), which, as I hope I have made clear before, one should
never do.17 The flaws in the data
render them virtually uninterpretable. When I parsed the results and found
groups most comparable to the students in other countries, American high school
seniors remained about average, which is where they were as eighth-graders. The
statement on college remediation makes it seem that college freshmen are showing
up at Harvard lacking basic skills. Maybe. But I doubt that sound national
figures exist because “remedial” means different things in different
states and on different campuses. In Virginia, for instance, remedial courses
are not offered at the flagship institutions. You won’t find them at William
and Mary, the University of Virginia, Virginia Tech, or most of the other four-year
universities. They are offered by three urban universities, by two
historically black universities, and, especially, by the community colleges. If
students did not take algebra II in high school, then decide that they want to
go to a four-year college and so take algebra II at a community college, does
that make the course remedial? Isn’t providing such opportunities a core function of
community colleges? The President’s statement also overlooks the inconvenient
fact that about two-thirds of high school graduates go on for further education.
Shouldn’t we be applauding this? Paul Gigot, soon to be
editorial page editor of the Wall Street Journal, said that the signal quality of the
legislation was that “Teddy Kennedy is happy, and Checker Finn is not.”18 Certainly, most
conservatives did not care for the Bush plan. The Heritage Foundation slammed
it. So did the Family Research Council, Focus on the Family, Phyllis
Schlafly’s Eagle Forum, and Paul Weyrich’s Free Congress Foundation. Analysts
concluded that Bush had given the conservatives so much of their agenda in his
first 100 days that he could afford to anger them now on a few issues.
“Missing in action” in
all of the contentiousness has been Roderick Paige, the new secretary of
education. The Houston Chronicle noted
that “according to his official schedule, the secretary spends the bulk of his
time meeting with foreign dignitaries, going to dinners and receptions, or
traveling around the country.”19 The New Republic observed, “In any administration, the blatant marginalization
of the only African American domestic Cabinet secretary would be noteworthy. In an Administration that loudly
trumpets its commitment to Cabinet government and racial diversity it’s
stunning. . . . From the beginning the
White House seems to have expected him to be the education plan’s public face
—and nothing more...Ah, the soft bigotry of low expectations.”20 Paige has denied rumors
that he is unhappy with Bush and is planning to resign.21 He has now declared
that he is “at the table” and will seek a higher profile, but Jack Jennings of
the Center on Education Policy still has him filed under “I” — for “irrelevant.”
22 As this is written,
Congress is in recess. Everyone is predicting cantankerous debates to resolve
the differences between the House and Senate versions of the Elementary and
Secondary Education Act (ESEA) when legislators return this fall. The June 20
issue of Education Week carried
a side-by-side comparison of the competing versions.

New International Data

Early 2001 brought the
release of the TIMSS-R (R for Repeat) and TIMSS Benchmarking Studies. The media
greeted these studies with a collective yawn in the first instance and with
silence in the second. Actually, the TIMSS-R report contained what I call
“microcosmic data,” a small set of statistics that reveal the condition of education
writ large. The U.S. Department of
Education disaggregated the TIMSS-R data by ethnicity. I wondered what the
results would have looked like if the entire U.S. sample had consisted of
students of only one ethnicity. In the TIMSS sampling system, Asians and Native
Americans constitute too small a group to generate a reliable estimate. The
scores from blacks, whites, and Hispanics and those from the 38 participating nations (adding an ethnic group makes a total of 39) generate
these results:
These results look
drearily familiar. Unfortunately, TIMSS has no direct measure of poverty, only
such indicators as the number of books in the home. The data above are stark
enough, but if we could show data by ethnicity and poverty level, we’d see even
more dramatic evidence of savage inequalities. Of somewhat more
interest than TIMSS-R was what I will call TIMSS-B, the TIMSS Benchmarking
studies. Taking all 38 nations together, TIMSS-B calculated what proportion of
students attained certain “benchmark” levels: 90th percentile, 75th percentile,
50th percentile, and 25th percentile. In addition to the 38 nations,
13 states and 14 school districts or consortia of districts participated. The
38 nations generated an international mathematics average of 487. U.S. students
scored 502. All 13 states
(Connecticut, Idaho, Illinois, Indiana,
Maryland, Massachusetts, Michigan, Missouri, North Carolina, Oregon,
Pennsylvania, South Carolina, and Texas) scored higher than the international
average, and all but four of them scored at or above the U.S. average. Idaho,
Maryland, Missouri, and North Carolina scored lower. Michigan, Texas, and Indiana
topped the list. (Yes, Texas, but more about that in a moment.) Note that none
of the states that scored highest in the first TIMSS (Iowa, Nebraska, Maine,
Minnesota, Montana, North Dakota, and Wisconsin) participated in TIMSS-B.
Here are the results for math:

Percentage of Students Attaining Selected Math Benchmarks

                  90th  75th  50th  25th
International      10    25    50    75
Texas              13    32    66    90
Connecticut        11    31    67    91
Illinois           10    29    65    92
Massachusetts      10    31    68    92
Michigan           10    33    70    92
Oregon             10    32    69    91
South Carolina     10    30    60    88
Indiana             9    28    65    88
Pennsylvania        9    28    65    91
Maryland            8    27    57    87
North Carolina      7    25    57    88
Idaho               5    24    61    88
Missouri            4    20    58    89
United States       9    28    61    88
Singapore          46    75    93    99
South Africa        0     1     5    14

TIMSS-B did not offer
competition as tough as TIMSS. Some industrialized nations that took part in
TIMSS did not participate in TIMSS-B, and a few more developing countries did.
In the original TIMSS, none of the seven highest-scoring states named above
placed more than 6% of students at the 90th percentile in math. Still, there were 37 countries, plus
Taipei, in TIMSS-B, including such
high flyers as Singapore, Japan, Korea, and Hong Kong. Seven states had
percentages of students at the international 90th percentile that were as high
as or higher than these 37 nations. All but two states had at least 25% of
their students scoring at the 75th percentile, and all 13 states had higher percentages
scoring at or above the 50th percentile or at or above the 25th percentile than
these 37 countries. The United States had
almost as high a proportion of students at the international 90th percentile,
9%, as the top-scoring states, and only 14 of the 37 nations had as high or higher
proportions at this level. The U.S. had a higher percentage than average on the
three other benchmarks. The contrast between the highest-scoring nation,
Singapore, and the lowest, South Africa, shows the great gulf between the First
and Third Worlds.
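
The state-level claims in the preceding paragraphs can be checked directly against the math benchmark table. The sketch below transcribes the 13 state rows and the international row from that table, then counts how many states meet each international benchmark figure. Treating "as high as or higher than these 37 nations" as a comparison with the international row (10, 25, 50, 75) is my reading of the text, not something the article spells out.

```python
# Rows transcribed from the math benchmark table: (90th, 75th, 50th, 25th).
INTERNATIONAL = (10, 25, 50, 75)
STATE_MATH = {
    "Texas": (13, 32, 66, 90),
    "Connecticut": (11, 31, 67, 91),
    "Illinois": (10, 29, 65, 92),
    "Massachusetts": (10, 31, 68, 92),
    "Michigan": (10, 33, 70, 92),
    "Oregon": (10, 32, 69, 91),
    "South Carolina": (10, 30, 60, 88),
    "Indiana": (9, 28, 65, 88),
    "Pennsylvania": (9, 28, 65, 91),
    "Maryland": (8, 27, 57, 87),
    "North Carolina": (7, 25, 57, 88),
    "Idaho": (5, 24, 61, 88),
    "Missouri": (4, 20, 58, 89),
}


def states_at_or_above(column):
    """Return the states whose percentage meets the international figure for a column."""
    return sorted(
        name for name, row in STATE_MATH.items()
        if row[column] >= INTERNATIONAL[column]
    )


at_90th = states_at_or_above(0)
below_75th = sorted(set(STATE_MATH) - set(states_at_or_above(1)))
all_above_50th = all(row[2] > INTERNATIONAL[2] for row in STATE_MATH.values())
all_above_25th = all(row[3] > INTERNATIONAL[3] for row in STATE_MATH.values())

print(f"States at or above the 90th-percentile benchmark: {len(at_90th)}")
print(f"States below 25% at the 75th percentile: {below_75th}")
print(f"All states above on the 50th and 25th: {all_above_50th and all_above_25th}")
```

The counts agree with the text: seven states match or exceed the international figure at the 90th percentile, only Idaho and Missouri fall short at the 75th, and all 13 states exceed it at the 50th and 25th.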
A Nation at Risk tightly yoked the test performance
of students to the economic health of the nation. However, on 10 July 2001
Singapore declared its economy officially in recession.23 Meanwhile, observers
worry that Japan will experience a second decade of recession. These facts should end any
further assertions that high scores produce a competitive economy. By the way,
education alone doesn’t produce jobs. If it did, India wouldn’t have tens of
thousands of unemployed software engineers waiting for visas to the U.S. The results for the
American districts and consortia reveal contrasts almost as stark as those
between the highest- and lowest-scoring nations. In math, only the five Asian nations
finished ahead of Naperville, Illinois, and the First in the World Consortium,
a group of 19 suburban Chicago districts. Only seven countries bested
Montgomery County, Maryland, which has a lot more poverty than people realize
and which also has more than 100 foreign languages to cope with. And only eight
nations outscored the Michigan Invitational Group. At the bottom, only
five countries scored lower than the Miami-Dade school district. Only eight
trailed Rochester, New York, and Chicago surpassed only 10. (In the original TIMSS,
only three of 41 countries scored lower than Mississippi; only one scored lower
than Washington, D.C.) As with the first
TIMSS, American students fared better in science than in math. The
international average was again 488, but the U.S. average in science was 515. All
states scored higher than the international average, and four scored below the
U.S. average. Here are the benchmark results for science:

Percentage of Students Attaining Selected Science Benchmarks

                  90th  75th  50th  25th
International      10    25    50    75
Michigan           22    47    75    91
Oregon             19    43    73    91
Indiana            18    41    72    92
Connecticut        17    39    69    90
Massachusetts      17    40    71    92
Pennsylvania       15    38    70    91
Texas              15    35    61    83
Illinois           14    36    66    88
Missouri           14    36    67    89
Idaho              13    37    70    91
South Carolina     13    34    60    85
Maryland           12    31    59    84
North Carolina     11    30    60    85
United States      15    34    52    85
Singapore          32    56    80    94
South Africa        0     2     6    13
Naperville         33    64    90    98

The results from TIMSS, TIMSS-R, and TIMSS-B clearly indicate the need for something that people
like me, David Berliner, Bruce Biddle, Harold Hodgkinson, and Michael Casserly
of the Council of the Great City Schools have been calling for for years: a
“Marshall Plan” for the inner cities and poor rural areas.
Reforms predicated on the dire state of the typical American public school or
on the “crisis” in public education are wholly misguided. My declarations about
the inadequacy of the TIMSS Final Year Study stand. Still, those data yielded
some interesting information. The College Board compared the scores of American
students taking Advanced Placement (AP) exams in calculus and physics with the
TIMSS scores of the various countries. In calculus, AP
students outscored all 16 countries, averaging 573 points, compared to 557 for
France, the highest nation (this difference was not statistically significant, but the
other 15 were). Students who took the AP Calculus AB test (a test of
first-year calculus) and received a score of 3 or better (considered passing)
scored 586 on the TIMSS Advanced Math. Those who scored lower didn’t fare much
worse: 565. Students who took the AP Calculus BC test (a test of
second-year calculus) and scored 3 or better aced the TIMSS test at 633.
Roughly two-thirds of the students taking each test scored 3 or better. In
physics, AP students finished fourth, behind Norway, Sweden, and Russia. But recall
that the Scandinavian students had studied physics for three years. Russia tested
only 2% of the student population and only those in Russian-speaking schools. Those
who achieved a 3 or better on the AP Physics test scored 586 on the TIMSS physics
test, five points ahead of top-ranked Norway. Students with AP scores in
physics of less than 3 scored substantially lower: 511. This ranks
them ninth among the 17 countries in the TIMSS Final Year Study of physics. The
study also revealed a different aspect of the ethnic achievement gap: virtually
no blacks or Hispanics took either AP test. The calculus group contained just
1% black and 3% Hispanic students. Seventy-two percent were whites, and 21%
were Asians/Pacific Islanders. The physics group was made up of 1%
blacks, 4% Hispanics, 66% whites, and 26% Asians/Pacific Islanders. In both
groups, 4% of the students checked “other.”
In his 1962 book, Capitalism and Freedom, Milton Friedman
developed the modern concept of school vouchers, influencing, among others,
Ronald Reagan, who made them part of his education agenda. Friedman and his
wife Rose lead a foundation dedicated to the propagation of vouchers. In the
FAQ (Frequently Asked Questions) section of the foundation’s website, one question is “Are vouchers
popular?” The site says unequivocally yes and then provides a lot of survey data
to try to bolster the claim (the survey data are more equivocal than Friedman would
have you believe). Interested readers should visit www.friedmanfoundation.org and click on
Frequently Asked Questions. Survey data about
vouchers, however, have always proved wrong when the issue becomes meaningful —
as in a vote. This is how it happened in 2000. The 2000 election saw voucher
proposals in California and Michigan go down in flames by large margins in both
states. Silicon Valley entrepreneurs sponsored the California proposal and
funded it generously. In Michigan, voucher advocates outspent opponents 2 to 1.
I asked Friedman how he interpreted this debacle, and he said that the “defeats
are highly relevant to the question of political tactics.” But he also said that
he retained his faith in the efficacy of vouchers. Along with generous
funding, voucher proponents garnered support from conservative pundits. George
Will declared that facts about voucher successes had “pummeled” opponents;
William Safire concluded that vouchers would wipe out the black/white
achievement gap. 24 It didn’t help. Both
pundits, interestingly publishing on the same day, drew mostly on the work of
Harvard’s Paul Peterson, who has allowed his voucher theology to cloud his
vision. Early on, Peterson characterized voucher advocates as “a small band of
Jedi attackers” who were engaged in a fight with the unified might of “Death
Star forces.” 25 Usually, researchers
write up a research report, have some friends read and review it, then pass it
on to a journal, where three or four anonymous reviewers will pass judgment on
its merits. Peterson gave his study of vouchers in Milwaukee to the Associated
Press. The resultant story just happened to appear on the same day that
Peterson and his frequent publishing companion, Jay Greene, now of the Manhattan
Institute, published an op-ed piece on the same subject in the Wall Street Journal, which characterized
John Witte’s original evaluation of the Milwaukee voucher program as “bad
science.” This just happened to be the same day that
Republican Presidential candidate Bob Dole proposed vouchers to the Republican
National Convention. As the Church Lady might say, “How convenient.” Deception by the Numbers, a booklet produced by
People for the American Way, describes many of the inadequacies of Peterson’s
work. For instance, using data from his studies in New York City, Dayton, and Washington,
D.C., Peterson has claimed that vouchers work for African American children,
but not for other ethnicities. 26 This is a most curious
finding that Peterson has never attempted to explain. In fact, black children
in New York City showed gains only in the sixth grade, not in grades 3, 4, or
5. However, grade 6 gains were so large that, when Peterson averaged the four grades,
the average was significant. Peterson’s description of the results led David Myers
of Mathematica Policy Research, Inc., a co-investigator with Peterson in the New
York study, to call the claim “premature.” “Right now you come
away saying, ‘No there’s no impact,’ ” said Myers. 27
A later press release,
available on the Mathematica website (www.mathinc.com/3rdLevel/school.htm),
makes this clear: “The report shows no overall differences in test scores
between 3rd through 6th graders who were offered vouchers and those who were
not. However, there were large and statistically significant impacts for African
American 6th graders who were offered vouchers.” Indeed, the results from Dayton are not significant even for
African Americans. 28 Peterson has not
explained this anomaly, either. Meanwhile, back on the referendum front, the
California proposal from Silicon Valley entrepreneur Timothy Draper would have provided
a $4,000 voucher for all children, including the 600,000 students already
enrolled in private schools. A wide spectrum of groups opposed the proposal.
The plan for subsidizing those wealthy families whose children already attended
private schools offended some. Some worried about draining money from the
public schools. Some private schools said that they would not accept voucher
students who scored below grade level, and others expressed no interest in
expansion. And even if private schools were of a mind to grow, one estimate has
contended that the nation’s existing private schools could absorb only 4% of
public school children. 29 Michigan put a more
complex proposal before the people. In addition to providing vouchers worth
$3,200 to students in “failing districts,” it established a teacher testing
program and set a minimum funding level for schools. Supporters claimed that the
legislation would affect more than 30 districts, but the Michigan Department of
Education put the figure at seven. 30 Post mortems attributed
the defeat to the complexity of the proposal, to the fear of taking money away
from public schools, and to the fact that people could not easily read the
proposal’s position on the political spectrum by looking at supporters and
opponents. Gov. John Engler, a popular Republican education reformer, opposed it. So
did former Gov. James Blanchard, a Democrat. 31 In the wake of the
defeats, voucher advocates, such as Peterson, Jeanne Allen, and Trent Lott,
decided that the word “voucher” should be dropped from the lexicon of school
choice. “I think maybe the word is part of the problem,” Lott said. “Maybe the
word should be ‘scholarship.’ ” 32 At his confirmation
hearings, Rod Paige told the committee that “the word ‘vouchers’ has taken on a
negative tone.” 33 Given these events and
sentiments, President Bush’s voucher proposal was quickly removed from both the
House and the Senate education bills. As proposed by Bush,
the plan would have transferred wealth from taxpayers of all denominations to
the Catholic Church. Journalist and voucher advocate Matthew Miller argued
that, in cities, a voucher would have to be worth at least $6,000 to interest
people. 34 Bush’s
vouchers were, in Miller’s word, “puny,” worth just $1,500. Only the heavily
subsidized Catholic schools, which have been hemorrhaging students (12.6% of
all students in 1960, 4.7% in 2000), could have afforded to accept them. 35 Meanwhile,
the Bush Administration asked the U.S. Supreme Court to hear the Cleveland
voucher case. A federal appeals court, noting that 96% of the students using vouchers
attended Catholic schools, had declared the program unconstitutional.
Charter Schools

Accountability
burbled to the top of the pot of charter school issues this year. In his 1996
book on charter schools, Joe Nathan wrote, “Hundreds of charter schools have
been created around this nation by educators who are willing to put their jobs on
the line, to say, ‘If we can’t improve students’ achievement, close down our
school.’ That is accountability — clear, specific, and real.” 36 And
nonexistent. If this all-or-none test were applied to charters, precious few would
still stand. Charter operators have often resisted producing financial or achievement
data, even when this information falls under a state’s freedom of information
law. An RPP International report for the U.S. Department of Education found
that just 37.3% of charter schools sent a progress report to the chartering
agency. Some 60.9% did send a report to the charter board, but only 41.2% sent
one to the students’ parents, and only 25.3% delivered one to the community. 37 A review of
accountability in 10 active charter states found little activity and few trends
toward tightening accountability requirements. 38 But people
are talking about accountability. Chester Finn and his colleagues wrote in
1996 that they had “yet to see a single state with a thoughtful and well-formed
plan for evaluating its charter school program.” 39 Finn and
his colleagues returned in late 2000 to observe, “Charter school discussions are
saturated with talk about accountability. Some view it as the third rail of the
charter movement, some as the holy grail.” 40 They proposed
something they call “accountability by transparency,” whereby “the school
routinely and systematically discloses complete, accurate, and timely
information about its program, performance, and organization.” Their system,
though, requires so much information in the form of various test scores,
progress toward goals, student standards, curriculum, instructional methods,
demographic characteristics, and more that it would seem to eviscerate the
original concept of a charter school. In any case, it will not be adopted. No state
has a true formal accountability program. Several, among them Colorado,
Florida, Massachusetts, New York, and Ohio, In addition, Michigan charter sponsors, mostly
public universities, encourage charter operators to team up with EMOs. The EMOs,
presumably, have the skills and experience that the individual charter
operators lack. Charter authorizers also favor EMOs because they are known
quantities. This reduces the risk that some idiosyncratic, visionary charter operators
will mismanage or steal money, develop a curriculum bereft of intellectual
content, or otherwise mess up. Finally, working through an EMO gives the
charter operators access to startup capital. The lack of capital poses the biggest problem for
charter founders. Interestingly, a little-noticed provision in California’s
Proposition 39 might have solved or greatly ameliorated this problem. The
proposition generated much debate because it lowered the size of the majority of
voters needed to approve school bonds. But the proposition also directed school
districts to provide charters with facilities that are “reasonably equivalent”
to those provided by the public schools. All districts must comply by 2003. One
might anticipate an EMO takeover similar to Michigan’s in Ohio. The most recent
report from Ohio’s Legislative Office for Educational Oversight (LOEO) had this
to say about charters, called “community schools” in Ohio, and EMOs: LOEO found
that community schools benefit significantly from the assistance of management
companies in areas such as financial management, curriculum development, teacher
in-services, and general support and guidance. Directors [of charter schools]
remarked to LOEO that the management company is the first place they turn when
they have questions. Community schools not operated by a management company
must be responsible for all aspects of running the school, ranging from
curriculum design to staff hiring and evaluations to planning budgets. The
director of one community school without a management company commented that
“schools operated by a management company have the assistance I was looking for
this year.” 42 Turning to EMOs might
be practical, but it defeats some major purposes of the charter school
movement: to stimulate innovation in curriculum and instruction and to give the
people in the school building and the parents in the community ownership and
control over what goes on in the school. In fact, accountability
for charters has proved more complicated than such early advocates as Joe
Nathan, Albert Shanker, Ted Kolderie, and Ray Budde envisioned. Most state-level
evaluations have concluded that no single instrument serves appropriately for
all schools. It is clear in these evaluations, though, that the charter authorizers,
boards, and the school operators haven’t thought much or clearly about what would serve as appropriate
instruments. Bruno Manno has outlined some of the difficulties: Today,
it’s hard to know how well charter schools are actually doing. . . . There are
three predominant reasons for
this situation. First, the charter strategy is so new that it’s difficult to
measure results. There’s just not much data out there. Second, today’s charter accountability systems remain underdeveloped,
often clumsy and ill-fitting, and are themselves beset by dilemmas. A final
reason for the dearth of charter school accountability information lies with
authorizers and operators. Truth be told, they are often content to leave accountability
agreements nebulous and undefined.
Leaving accountability agreements indeterminate is fraught with danger because
over the long term this approach is more likely to lead to a charter school
being subjected to the rule and compliance-based accountability practices that
characterize conventional schools. 43
Rutgers University researcher Katrina Bulkley finds four factors that make it difficult to revoke or not renew a charter:

1. Educational performance is not simple to define or measure, nor is how good is “good enough” in educational quality. In this context, authorizers sometimes turn to “proxies” to assess school quality.

2. Other aspects of a school’s program, often more
difficult to measure than test scores, are also important for families and
authorizers.

3. Teachers, parents, and students become very
invested in particular schools, and destroying a community is more difficult
than serving a diffuse public interest (like the one that would be served by
closing a low-achieving school).

4. Charter schools have become a highly politicized issue
on both sides, and some authorizers are concerned about their decisions (to
close schools) reflecting poorly on charter schools as a reform idea. 44

These four challenges
form what Bulkley calls “the accountability bind.” Proxies for achievement
include parent and student satisfaction, accreditation by some national
accrediting agency, and, in the case of EMOs, a possible “halo effect” — if
authorizers view one school managed by the EMO as successful, they are likely to
see the EMO’s other schools that way as well. As regards the fourth
challenge, those who authorize charter schools are often favorably disposed to
the charter concept. This gives them an additional reason to let the charter continue.
Bulkley observes further that, “while authorizers have difficulty determining
what is and is not a successful charter school, they have even more difficulty
deciding that a charter school is unsuccessful enough to justify as high a sanction
as closure.”45
Not many charters have
suffered the indignity of a revoked or non-renewed charter. Nationally, just 4%
of all charters have closed. Texas has the highest rate at 8%. And some
situations there have received devastating publicity. No doubt that publicity served
as one reason why the Texas House of Representatives wanted to declare a
moratorium on new charters. In a compromise, the legislature, over Gov. Rick Perry’s
objections, capped the number of allowable charters at 211 (192 operated in
2001-02). Perry allowed the bill to become law without signing it. When charters do
close, it is not always clear why. Authorizers seldom list academic reasons as
the principal cause. Usually money problems dominate, but this might be
misleading. Eric Premack of the Charter School Development Center at California
State University, Sacramento, outlines the problems very well in an email
message to me: Pinpointing the primary cause of a revocation is a lot more difficult than one might think. Difficulties of schools academically, legally, financially, and at the school governance level tend to be very closely related. For example, if a school is unable to offer the academic program at the level promised to parents, it only takes a small number of parents dis-enrolling their children to send the school into a financial tailspin. This financial pressure, in turn, tends to lead to infighting at the governance and administrative level. When districts revoke, they usually focus on the financial and legal issues because they tend to be much easier to document. District staff prefer to bring clear and unambiguous reasons to their boards. As you [referring to me] know from your years of research in this area, measuring a academic growth and progress is a dismal and slippery “science.” It’s a lot easier for districts to document that a charter school’s budget is out of balance than it is to document declines in test scores or poor implementation of some promised academic program. | ||||||||||||||||||||||||||||