Mean scores in a mean world

Baines, L. A., & Goolsby, R. (2013). Mean scores in a mean world. In J. Bowen & P. Thomas (Eds.), De-testing and de-grading schools (pp. 51–62). New York: Peter Lang.

It was not so long ago that a visit from the state department of education was
good news. In years past, the function of a state department of education was
primarily “advisory, statistical, and exhortatory” (Johnston, 1999). Indeed, a hallmark
of most successful schools is the help distributed through a powerful support
system, such as a board or a Ministry of Education.

During his tenure as superintendent of Massachusetts schools, for example,
Horace Mann often showed up unannounced in schools to observe teaching-in-action,
to chat with children, and to ask what he could do to help. Mann preached
the gospel of public schools as the antidote to problems of inequality, injustice,
and the loss of civility (Baines, 2006). About using common schools to educate
Americans of all classes and races, Mann (1867) said,

If we are derelict in our duty in this matter, our children in their turn will suffer.
If we permit the vulture’s eggs to be hatched, it will then be too late to take care
of the lambs. (p. 41)

The State as Vulture
Today, personnel from state departments of education are about as welcome in
public schools as vultures. A wake of vultures seldom attacks healthy animals but
preys upon the wounded or sick. So, when student achievement levels wane, the
state sees its role not as helper, but as disciplinarian—to punish a school for allowing
its students to post achievement scores below the mean. If a school is contacted
by the state, the news inevitably is bad—at best, a public humiliation and, at worst,
a tumult of teacher and administrator firings in a takeover. Firing people, while
enjoyable for select politicians, is a tactic that helps neither student nor teacher.

Little wonder that teacher morale is at a 20-year low, and that one in three
teachers admits that they are “very or fairly likely” to leave the classroom (Markow,
Pieters, & Harris Interactive, 2012, p. 5). In many states, teachers are told
not only what to teach but also how to teach by policymakers who have never
set foot in their classrooms. Then, when test scores are reported as lower than
expected, it is the teachers who get blamed, not the policies or policymakers.

Recently, we visited an elementary classroom in a low-income neighborhood
that enrolled 30 students, a number well above the “suggested maximum enrollment”
for young children of that age. The teacher was an energetic, endlessly
patient, loving, smart, first-year teacher who expertly marched through lessons in
reading and science. When writing a response to a prompt, one student broke her
pencil and commenced crying. Over the next 60 minutes, she would cry again
five more times—she did not understand a word, her eraser did not work properly,
she needed to go to the restroom, someone nearby finished the assignment
before she did, she discovered an ink stain on her hand.

Another student, who looked at least a year older than his classmates, kept
trying to pick a fight with nearby peers, while a fidgeting autistic boy sat talking
to himself and moving his head from side to side. Standing at the front of the
classroom, and continually interrupting the teacher, was a new student from Guatemala
who knew only a few words of English and tried to communicate by flailing
his arms. A second new student sat silently at a desk in the back of class with
his head on his desk. This student had just been placed with a new foster family
after his father had been convicted on drug charges and sent to jail in a different
city. These were only five of the young children in her class; there were 25 others.

Rather than help this first-year teacher in grappling with an incredibly challenging,
large group of diverse children, the state has done absolutely nothing:
provided no help in foreign language or special education, provided no relief from
an overcrowded classroom, provided no mentor. Instead, the state has informed
the teacher that her continued employment is dependent upon the test scores of
these 30 students.

As countless education dollars (billions?) flow toward expensive, intricate evaluation
systems, teacher pay is being slashed, class sizes are expanding, teacher tenure
is sliding into oblivion, and real professional development is nonexistent. All funds
in education today, without exception, must be somehow linked to test score ascension.
Teachers who spend time caring for the emotional and social lives of children
find themselves in the strange position of subverting the directives of the state.

Lean and Mean across the Nation
Among the lowest scorers in the recently released teacher ratings for New
York City was a group of accomplished, seasoned teachers from Public School
146 in Brooklyn (Winerip, 2012). Because 97% of their students had scored
proficient in math the previous year, and only 89% scored proficient in math in
the current year, these teachers were rated among the worst in the city. Yet, if only
three children in their classes had scored one point higher on the exam, the teachers
would have been rated average to better-than-average. Among this group of
low-performing teachers were a Fulbright scholar, a former professor at Columbia,
and a theater owner recognized by the Guggenheim Museum for developing an
effective drama program for children.

In Florida, a state that rates each school and district by mean test scores, seesaw
ratings have become so routine that the ratings have lost all credibility. A high
school that receives an A one year might receive a C or D the next year, though
the teaching and administrative staff may have remained completely intact from
year to year (Florida Department of Education, 2012). One of the problems with
rankings is that schools often go into testing blindly, not knowing the criteria
by which they are going to be ranked. Although teachers are responsible for the
delivery of instruction, testing and assessment are in the hands of policymakers.

By and large, the richest schools receive the highest scores and the poorest
schools receive the lowest scores. As Tschinkel (2003) has noted in a series of
articles decrying the inequity of the mathematical formula used to rate Florida’s
schools, “many schools serving a less affluent population have been graded lower
than they deserve and many more affluent schools higher.” About the ratings,
Florida Teacher’s Union president Randy Ford commented, “It’s not that standardized
test results don’t tell us anything. They’re very accurate measures of the
size of the houses near a given school and the income levels of the people who live
in those houses” (Postal, 2012). According to Rothwell (2012), “the average low-income
student attends a school that scores at the 42nd percentile on state exams,
while the average middle/high-income student attends a school that scores at the
61st percentile on state exams” (p. 1).

In Ohio, 80% of the highest-poverty schools received a grade of either D or
F in 2010–2011, while fewer than 1% of the richest schools received a D or an F
(DiCarlo, 2012). None (as in zero) of Ohio’s high-poverty schools were considered
A-quality, while only 4% were considered B-quality. Meanwhile, 95% of the
richest schools were rated either A- or B-quality.

As teachers know, some efforts and expenditures do not directly translate into
higher test scores. Having a nurse onsite at a school, for example, used to be considered
essential to ensure that children’s health was adequately monitored, but
the presence of a nurse, by itself, does not increase test scores. Thus, in most states,
having a nurse on staff has become an extravagance that schools can no longer afford.
In the state where I live, on average, there is only one nurse for every 3,110
students (Toppo, 2009).

Similarly, the quality of the school library and the work of the school librarian
were once deemed integral to the basic functioning of a school. In the current era
of reductionism, credentialed librarians have been fired and school library collections
have withered. If a school actually has a librarian, he or she may have no
training in research or library science. In California, for example, 76% of school
libraries have no credentialed librarian, not even a part-time one. The ratio of
students to librarians is 5,124 to 1 (California Department of Education, 2008)
and the average copyright date of a nonfiction book in a California school library
is 1972.

This ruthless, corporate-style focus on the bottom line has become a staple
of a U.S. Department of Education that enthusiastically supports “state efforts to
improve the quality of their assessment systems” (U.S. Department of Education,
2010, p. 11). With regard to budget cuts, the federal government admonishes
that “it is important to do more with fewer resources” and warns that, in the
future, funding will go only to schools “that are designed to significantly increase
efficiency in the use of resources to improve student outcomes” (p. 41). In other
words, don’t expect the crises in school nurses or credentialed librarians to end
anytime soon. All those ramshackle portable buildings that blight the grounds
of half of the public schools in America are going to have to endure for another
couple of decades as well (U.S. Department of Education, 2011).

Recently, state legislators have cast their eyes on academics to streamline costs.
Florida, South Carolina, Georgia, and Mississippi have taken the unusual step of
forcing 15-year-olds to select a major for high school. According to the former
governor of Florida, Jeb Bush, this edict allows students to “major in academic
subjects such as foreign languages or history, or specific job areas such as auto
mechanics. Regardless of a student’s path, majors . . . help them understand the
relevance between course work and their future” (Bush, 2006).

Bush’s push to slot children into life paths at a young age has been in vogue
outside the United States for years. In many European countries, it is common
for students to be assigned to either the academic track or the trade track by age
15, based on their performance on a standardized test. For thousands of years, the
Chinese have relied on a single, standardized test to help sort children into jobs
suitable to their aptitudes and their station in life. Throughout much of the world,
the richest students go to the best universities upon graduation from high school,
while the poorest students enter the work force right away. Perhaps it is disingenuous
for the United States to assert that it operates any differently.

The expense, uncertainty, and complexity of genuine human development
make it an unpopular issue for many policymakers. In fact, for reformers of Bush’s
ilk, who perceive the function of a school to be preparation for future work (or a
military career), human development is beside the point. Schools are just another
business, susceptible to market supply and demand fluctuations, as encumbered
by profits and debts as any business.

The Mean Kid
Consider the following three sets of scores.

Student A    Student B    Student C
100          70           100
100          80           80
100          70           70
0            80           50

All three sets of scores have a mean of 75, yet Student A looks as if he or she
is capable of scoring the highest of the three whenever he or she wants to. But,
what happened with that last score? Student B is fairly consistent, but does he
or she really try? Student C started out strong, but recent difficulties are a cause
for concern. Three different students with three distinctive stories, three diverse
performances, and three different responses to learning (or, perhaps, testing). Yet
all three students receive the same score with the expectation that all three should
learn the same material.
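
The limits of the mean in the three-student example can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: it uses the scores from the table above, and the extra measures (standard deviation, range, and change from first score to last) are chosen here purely to show what the mean leaves out. They are not measures used by any state rating system.

```python
from statistics import mean, pstdev

# Scores from the three-student table above, in the order given.
scores = {
    "Student A": [100, 100, 100, 0],
    "Student B": [70, 80, 70, 80],
    "Student C": [100, 80, 70, 50],
}


def summarize(values):
    """Return the mean alongside simple measures of spread and direction."""
    return {
        "mean": mean(values),
        "spread": pstdev(values),           # population standard deviation
        "range": max(values) - min(values),
        "change": values[-1] - values[0],   # last score minus first score
    }


summaries = {name: summarize(vals) for name, vals in scores.items()}

for name, s in summaries.items():
    print(f"{name}: mean={s['mean']}, spread={s['spread']:.1f}, "
          f"range={s['range']}, change={s['change']}")
```

All three students come out with a mean of 75, but the spread, the range, and the direction of change differ sharply from one student to the next.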

Despite its crudity as a measure, the mean can be a malleable construct in
the right hands. When we were teachers in Texas, for example, many schools used
to routinely plan field trips for students in special education on testing day because
the lower scores of special education students would crater the school mean.
So, honest superintendents who did not plan a special field trip for their special
education students and represented their scores accurately received lower mean
scores. Their honesty, while laudable, placed at risk their jobs, the jobs of teachers
and staff, and the school’s reputation.

Similarly, recent research has revealed how KIPP (Knowledge Is Power Program)
schools preselect out large numbers of students by requiring parents applying
to the school to sign a pledge of involvement with their children and the
school (Miron, Urschel, & Saxton, 2010). Furthermore, KIPP schools have the
ability to kick out nonperforming or recalcitrant students, whereas public schools
have no such option. As a result, in direct comparisons of schools with similar
student populations, KIPP scores are almost always going to be higher because
they have cut out their lowest-achieving students. Obviously, these students have
to go to school somewhere. Inevitably, they show up at the nearest public school,
thereby raising the mean scores at KIPP schools and lowering the mean scores at
the public school.
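
The selection effect described above is a matter of simple arithmetic. In the sketch below, every score is hypothetical and invented for illustration; the school labels, the numbers, and the `transfer_lowest` helper do not come from Miron, Urschel, and Saxton (2010) or from any real school.

```python
from statistics import mean

# Hypothetical test scores, invented purely for illustration.
selective_school = [55, 60, 72, 80, 85, 90]
public_school = [65, 70, 75, 80, 85]


def transfer_lowest(sender, receiver, count):
    """Move the `count` lowest scores from sender to receiver; return new lists."""
    ordered = sorted(sender)
    moved, kept = ordered[:count], ordered[count:]
    return kept, receiver + moved


before = (mean(selective_school), mean(public_school))
kept, enlarged = transfer_lowest(selective_school, public_school, 2)
after = (mean(kept), mean(enlarged))

print(f"Selective school mean: {before[0]:.1f} -> {after[0]:.1f}")
print(f"Public school mean:    {before[1]:.1f} -> {after[1]:.1f}")
```

No student's score changes in this sketch, yet the mean at the first school rises and the mean at the second falls, simply because the two lowest scorers changed buildings.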

Even when individual student performance is pried away from the group
mean, what does a mean score reveal? If you were scouting the Boston Celtics
during the 1964–1965 season in professional basketball, you would learn that the
nine players on the Celtics with the highest mean scoring per 36 minutes of
game time were as follows (Basketball Reference, 2012):

1. Ron Bonham, 26.7 points
2. Sam Jones, 25.8 points
3. John Havlicek, 22.8 points
4. Tom Heinsohn, 19.2 points
5. Willie Naulls, 18.4 points
6. Larry Siegfried, 16.4 points
7. Mel Counts, 16.2 points
8. Tom Sanders, 13.8 points
9. John Thompson, 11.8 points

The only problem with this “achievement list” is that it excludes Bill Russell,
who averaged only 11.4 points for every 36 minutes of game time during
the 1964–1965 season. Was Bill Russell a lesser player because his mean score
was less than that of his teammates? For those who do not know or care about
professional basketball, Russell was a five-time winner of the NBA Most Valuable
Player Award, a 12-time All-Star, former captain of the U.S. Olympic Basketball
Team, and considered by most sportswriters to be one of the greatest players in
the history of professional basketball. He could score, but his forte was defense,
rebounding, and a relentless work ethic.

Bill Russell’s mean score did not provide a true indication of his contribution
or talent. Similarly, students are being assessed on skills that are easily measurable:
the standardized test equivalent of average points per minute. Students
possess myriad talents and skills not addressed by narrow measures of content area
knowledge. The best teachers try to identify talent and give students time to
work in areas of talent, as well as in areas that might need improvement. When
test scores are valorized, a student’s talent, individuality, and interests tend to be viewed
as obstructive and oppositional to the goals of the state.

A Mean World
A common refrain of educational reformers in recent years has been the unacceptable
scores of American students on the recent PISA (Program for International
Student Assessment, 2010) reading test. About the 2009 PISA results, Secretary
of Education Duncan (2010) explained, “Today’s PISA results show that America
needs to urgently accelerate student learning to remain competitive in the global
economy of the 21st century. More parents, teachers, and leaders need to recognize
the reality that other high-achieving nations are both out-educating us and
out-competing us.”

On the test, the average American 15-year-old (PISA only tests 15-year-olds)
scored 500, 39 points behind top-scoring South Korea. Table 4.1 indicates the
score of the average American in comparison with 15-year-olds in other countries.

Rank   Country          Score
1.     South Korea      539
2.     Finland          536
3.     Canada           524
4.     New Zealand      521
5.     Japan            520
12.    United States    500

Table 4.1. Top Five Scores on PISA Reading Assessment by Country

Using a mean score obscures a rather startling fact: many Americans are
among the highest-achieving students on the planet. Students who have wealthy
parents, for example, scored an average of 551 on the exam, which would make
them the highest-scoring group in the world. Just below rich Americans are Asian
Americans and white Americans, who scored an average of 541 and 525, respectively,
good enough for second and fifth best in the world (see Table 4.2).

Rank   Country or group                                        Score
1.     Rich Americans (5,000,000 students of all races)        551
2.     Asian Americans (all schools, 4,000,000 students)       541
3.     South Korea (7,500,000 students)                        539
4.     Finland (850,000 students)                              536
5.     White Americans (all schools, 28,000,000 students)      525
6.     Canada (5,500,000 students)                             524

Table 4.2. Top Scores on PISA Reading Assessment by Country, with Wealth
and Ethnicity Added for the United States

One can infer from these data that the quality of education received by 70%
of Americans (the category being rich or white or Asian) appears satisfactory:
good enough to place them among the highest-achieving students in the world.
It is also useful to note that South Korea and Finland are relatively small, largely
monocultural societies. Finland has fewer than a million students; South Korea
has 7.5 million. In comparison, there are more than 50 million students in American
public schools.

A related factor not considered in PISA or other international tests is the
number of immigrant children included in testing. In South Korea, the rate of
immigration is 0. The immigration rate in Finland is 0.05%, or about one person in
2,000. The few immigrants who come to Finland usually hail from Sweden, and
both Swedish and Finnish are official languages of the country.

In contrast, the rate of immigration in the United States is 400 times the rate
of immigration in Finland. In the United States, up to 21% of school-age children
(ages 5–17) speak a language other than English at home, and one in three schools
in the United States has an “official” immigrant population of more than 25%
(U.S. Department of Education, 2012). Unlike in some countries, in the United
States, children of immigrants are free to enroll in public schools, even if they are
unable to speak a word of English. American assessment systems require all students,
irrespective of first language or the number of years spent in U.S. schools,
to take the standardized test.

Yet another factor neglected in the calculation of mean scores is the 13%
of American students classified as having special needs, meaning that they have
been identified as having intellectual, emotional, or physical limitations. Special
education is an American invention. Children with special needs in most other
countries stay at home or attend a school designated for the disabled. In America,
students in special education attend school with everyone else and are expected to
take the same tests as everyone else.

Considering the additional challenges wrought by influxes of non-English-speaking
students and the complications inherent in serving huge numbers of
special education students on Individualized Education Plans (IEPs), Americans’
world-leading scores are all the more impressive.

The Bad News
The bad news is actually not news at all but has been an ugly fact about education
in the United States for a hundred years: Minority students who live in
impoverished neighborhoods do poorly on standardized exams. On average, students
who attend schools in America’s poorest neighborhoods scored 3 points below
children from Chile on the PISA Reading Assessment (see Table 4.3). American
children who attend poor schools score an average of 446, 105 points behind
American children who attend wealthy schools (and who lead the world at 551).
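
A mean is a weighted blend of its subgroups, which is how an overall score of 500 can sit on top of very different groups. The sketch below works backward from the figures reported in this chapter. It rests on a loud assumption that is not in the data: that students in the poorest schools make up 30% of test takers. The function name `implied_group_mean` is likewise just an illustrative label.

```python
def implied_group_mean(overall_mean, share, group_mean):
    """Solve overall = share * group + (1 - share) * rest for the rest's mean."""
    return (overall_mean - share * group_mean) / (1 - share)


overall = 500          # U.S. mean reading score reported in this chapter
poorest_mean = 446     # mean reported for students in the poorest schools
poorest_share = 0.30   # ASSUMPTION: the poorest schools enroll 30% of test takers

rest_mean = implied_group_mean(overall, poorest_share, poorest_mean)
print(f"Implied mean for the remaining 70%: {rest_mean:.0f}")
```

Under that assumption, the remaining 70% of students would need to average roughly 523 for the national mean to land at 500.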

Students living in high-poverty neighborhoods in America who are high-achieving
usually receive transportation to wealthier schools or have been the
lucky recipients of dramatically increased funding. Indeed, Perry and McConney
(2010) found the effects of socioeconomic status (SES) to be powerful and
resilient across income groups. That is, “All students—regardless of their personal/
family SES—benefit strongly and relatively equally from schooling contexts
in which the SES of the school group is high” (Perry & McConney, 2010, pp.
1157–1158). Conversely, students performed markedly less well, irrespective of
background, in low SES schools.

Country                                              Score
Austria                                              470
Turkey                                               464
Chile                                                449
United States (children from the poorest schools)    446
Mexico                                               425

Table 4.3. Bottom Five Scores on PISA Reading Assessment by Country, with
Students in Poverty Added for the United States

Everyone’s favorite example of the successful public school in a poor area
seems to be Harlem Children’s Zone, but the school is neither poor nor open to
all. Harlem Children’s Zone has an $84 million budget and assets of more than
$200 million. The school receives $12,443 in public money and $3,482 in private
money per pupil per year, but these costs do not include field trips, “a 4 p.m.-to-6
p.m. after-school program, rewards for student performance, a chef who prepares
healthful meals, central administration and most building costs, and some of the
expense of the students’ free health and dental care” (Otterman, 2010).

While American schools are educating the middle class and the wealthy quite
effectively, they have been woefully ineffective at educating poor African American
and Hispanic children. In response to the unique needs of these minority
students in poor schools, the federal government has pursued a policy of nonnegotiable
standardized testing accompanied by a funding model based on the
percentage of students meeting minimal competencies. This regimen of high-stakes,
low-challenge testing, initiated with Goals 2000 and continued with No
Child Left Behind, has become a defining feature of Race to the Top, the Obama
administration’s signature education program.

Test Scores, Wealth, and Power
Standardized tests assess low-level knowledge in a specific subject area and ignore
everything else. Once the standardized test becomes both curriculum and goal, its
omnipresence subverts learning and makes the development of talent beside the
point. But, even if the state decides that standardized test scores are what matters
most, data indicate that 70% of American children are among the highest-scoring
students in the world, despite the public schools’ open doors to Limited English
Proficient speakers and students with special needs. Everyone knows where the
lowest-scoring children live and where they go to school. In America, the bottom
30% come from the poorest 30% of the population and are those most likely to
drop out, be unemployed, and go to jail (Christensen, 2011). Yet, educational
policy seems purposefully designed to reward the rich and to punish the poor.

Implicit in the rationale for inundating even our youngest children with tests
is the contention that higher test scores will somehow make the United States
more competitive globally. However, no evidence supports such a belief. The
wealthiest countries in the world are listed in Table 4.4.

Rank  Country         Purchasing Power (per person), 2010   Average PISA Score in Reading
1     Qatar           90,149                                372
2     Luxembourg      79,411                                472
3     Norway          52,964                                503
4     United States   47,702                                500
5     Switzerland     43,903                                501
6     Netherlands     40,601                                508
7     Australia       39,841                                515
8     Austria         39,561                                470
9     Canada          39,037                                524
10    Ireland         39,009                                496

Table 4.4. Wealthiest Countries in the World (Global Finance Magazine, 2012)
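
As a rough check on how wealth and reading scores move together in Table 4.4, a Pearson correlation can be computed by hand. This is a sketch only: ten countries chosen for their wealth are far too small and too selective a sample to settle anything, and the `pearson` helper below is an illustrative implementation rather than a method used by any of the cited sources.

```python
from math import sqrt

# (country, purchasing power per person in 2010, average PISA reading score),
# copied from Table 4.4.
table_4_4 = [
    ("Qatar", 90149, 372),
    ("Luxembourg", 79411, 472),
    ("Norway", 52964, 503),
    ("United States", 47702, 500),
    ("Switzerland", 43903, 501),
    ("Netherlands", 40601, 508),
    ("Australia", 39841, 515),
    ("Austria", 39561, 470),
    ("Canada", 39037, 524),
    ("Ireland", 39009, 496),
]


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


wealth = [row[1] for row in table_4_4]
reading = [row[2] for row in table_4_4]
r = pearson(wealth, reading)
print(f"Pearson r for Table 4.4: {r:.2f}")
```

For these ten countries the coefficient comes out negative, pulled strongly by Qatar, which pairs the highest purchasing power with the lowest reading score in the table.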

Few patterns of positive correlation exist between countries whose students
do well on standardized exams and countries that are economically prosperous.
History holds many examples that demonstrate that school learning has little impact
on economic prosperity. The Roman Republic/Roman Empire, for example,
with its massive population of slaves, high illiteracy rates, and inexorable military
machine, ruled much of the planet for many years without the benefit of a formal,
public system of education. The leaders of Rome focused on the accrual of power,
without regard to human suffering or cultural degradation.

Similarly, in the United States today, power and the pursuit of wealth drive
school reform. The goal is not human development, but the processing of children
at the cheapest possible cost.

References

Baines, L. A. (2006). Does Horace Mann still matter? Educational Horizons, 84(4), 268–273.
Basketball Reference. (2012). 1964–65 Boston Celtics Roster and Statistics. Retrieved from
http://www.basketball-reference.com/teams/BOS/1965.html
Bush, J. (2006, November 13). Should high school students be required to declare majors? Upfront
Magazine. Retrieved from
http://teacher.scholastic.com/scholasticnews/indepth/upfront/debate/index.asp?article=d1113
California Department of Education. (2008). Statistics about California school libraries. Retrieved
from http://www.cde.ca.gov/ci/cr/lb/schoollibrstats08.asp
Christensen, L. (2011). The classroom to prison pipeline. Rethinking Schools, 26(2). Retrieved from
http://www.rethinkingschools.org/archive/26_02/26_02_christensen.shtml
DiCarlo, M. (2012, March 28). Ohio’s new school rating system: Different results, same flawed
methods. Shanker Blog [Web log post]. Retrieved from http://shankerblog.org/?p=5511
Duncan, A. (2010). Education Secretary Arne Duncan issues statement on the results of the Program
for International Student Assessment. Retrieved from
http://www.ed.gov/news/pressreleases/education-secretary-arne-duncan-issues-statement-results-program-international-s
Florida Department of Education. (2012). Florida school district rankings. Retrieved from
https://app2.fldoe.org/Ranking/Districts/
Global Finance Magazine. (2012). The richest countries in the world. Retrieved from
http://www.gfmag.com/tools/global-database/economic-data/10501-the-richest-countries-in-the-world.html#axzz1stmRYW1Y
Johnston, R. (1999, June 23). State agencies take hands-on role in reform. Education Week, 18(41),
1. Retrieved from http://www.edweek.org/ew/articles/1999/06/23/41power.h18.html
Mann, H. (1867). The life and works of Horace Mann, Volume II. Boston: Horace B. Fuller.
Markow, D., Pieters, A., & Harris Interactive. (2012). Survey of American teachers: Teachers, parents
and the economy. New York: MetLife. Retrieved from
http://www.metlife.com/about/corporate-profile/citizenship/metlife-foundation/metlife-survey-of-the-american-teacher.html?WT.mc_id=vu1101
Miron, G., Urschel, J., & Saxton, N. (2010). What makes KIPP work? A study of student characteristics,
attrition, and school finance. New York: National Center for the Study of Privatization in
Education. Retrieved from http://www.ncspe.org/publications_files/OP195_3.pdf
Otterman, S. (2010, October 12). Lauded Harlem schools have their own problems. New York
Times. Retrieved from http://www.nytimes.com/2010/10/13/education/13harlem.html
Perry, L., & McConney, A. (2010). Does the SES of the school matter? An examination of socioeconomic
status and student achievement using PISA 2003. Teachers College Record, 112(4),
1137–1162.
Postal, L. (2012, January 30). Educators criticize latest Florida school rankings. Orlando Sentinel.
Retrieved from
http://articles.orlandosentinel.com/2012-01-30/news/os-florida-schoolrankings-20120130_1_half-on-fcat-scores-fcat-scores-and-half-school-rankings
Program for International Student Assessment. (2010). PISA 2009 results: What students know
and can do: Student performance in reading, mathematics and science. Paris, FR: OECD.
Retrieved from http://dx.doi.org/10.1787/9789264091450-en
Rothwell, J. (2012). Housing costs, zoning, and access to high-scoring schools. Washington,
DC: Brookings Institution. Retrieved from
http://www.brookings.edu/~/media/Files/rc/papers/2012/0419_school_inequality_rothwell/0419_school_inequality_rothwell.pdf
Toppo, G. (2009, August 10). School nurses in short supply. USA Today. Retrieved from
http://www.usatoday.com/news/health/2009-08-10-school-nurses_N.htm
Tschinkel, W. (2003). New and improved A+ grades: Camouflaged bias. School performance articles.
Retrieved from http://www.bio.fsu.edu/~tschink/school_performance/
U.S. Department of Education. (2010). A blueprint for reform. Washington, DC: Government
Printing Office.
U.S. Department of Education. (2011). Percentage of public schools with permanent and portable
(temporary) buildings and with environmental factors that interfere with instruction in classrooms,
by selected school characteristics, type of factor, and extent of interference, 2005. Digest
of Education Statistics, 2010. Washington, DC: Government Printing Office. Retrieved from
http://nces.ed.gov/programs/digest/d10/tables/dt10_106.asp
U.S. Department of Education. (2012). Table A-6-2. Number and percentage of children ages
5–17 who spoke a language other than English at home and who spoke English with difficulty,
by age and selected characteristics: 2009. The Condition of Education, 2011. Retrieved from
http://nces.ed.gov/programs/coe/tables/table-lsm-2.asp
Winerip, M. (2012, March 4). Hard-working teachers, sabotaged when student test scores slip.
New York Times. Retrieved from
http://www.nytimes.com/2012/03/05/nyregion/in-brooklynhard-working-teachers-sabotaged-when-student-test-scores-slip.html