What Should Be Done About Standardized Tests? A Freakonomics Quorum

December 20, 2007

What should be done about the quality and quantity of standardized testing in U.S. schools? We touched on the subject in Freakonomics, but only insofar as the introduction of high-stakes testing altered the incentives at play — including the incentives for some teachers, who were found to cheat in order to cover up the poor performance of their students (which, obviously, also indicates the poor performance of the teachers).

Personally, I used to love taking standardized tests. To me, they represented the big ballgame that you spent all season preparing for, practicing for; they were easily my strongest incentive for paying attention during the school year. I realize, however, that this may not be a common view. Tests have increasingly come to be seen as a ritualized burden that encourages rote learning at the expense of good thinking.

So what should be done? We gathered a group of testing aficionados (W. James Popham, Robert Zemsky, Thomas Toch, Monty Neill, and Gaston Caperton) and put to them the following questions:

Should there be less standardized testing in the current school system, or more? Should all schools, including colleges, institute exit exams?

Here are their responses. Many thanks to all of them for their participation. I have to admit, I never saw the parallel between tests and French fries before, but now that I’ve seen it, I won’t soon forget it.

W. James Popham, author of The Truth About Testing: An Educator’s Call to Action and America’s Failing Schools:

Standardized tests have much in common with French fries. Both of them differ in composition as well as quality. French fries are available in numerous incarnations, including straight, curly, skins-on, skins-off, and, in recent years, with sweet potatoes. Regarding quality, of course, the taste of French fries can range substantially – from sublime to soggy. It’s really the same with standardized tests.

Certain standardized tests (called achievement tests) are intended to show us what skills or knowledge students have mastered. Other standardized tests (called aptitude tests) are designed to predict how well test-takers will perform in future settings, such as when they get to college. Some standardized tests are designed to differentiate among test-takers so we can say that Kevin scored at the 82nd percentile, while Melanie’s performance puts her at the 96th percentile. Some standardized tests are supposed to let us know how well a particular group of students, such as those in a given school, have been taught. But, just as is true with French fries, standardized tests can vary dramatically in their quality. Some standardized tests perform their measurement mission marvelously; others do a dismal job of it.

Thus, if we’re asked whether there should be more or fewer standardized tests in our school system, the only defensible answer is, “It depends.” It depends on whether the right kinds of tests are being used and whether those tests are good ones. Given the kinds and caliber of the standardized tests currently being used in our schools, I come down on the “less” side of the argument. But that’s chiefly because the wrong sorts of standardized tests are frequently being used. Take the No Child Left Behind Act, for instance, a federal accountability law requiring scads of standardized tests to be used in evaluating schools. Do you know that almost all of the standardized tests now being employed to judge school quality are unable to distinguish between well taught and badly taught students?

We surely don’t need more of those sorts of misleading tests. But we definitely do need more standardized tests that are sufficiently sensitive to instructional quality, so we can accurately tell which schools are truly successful and which ones aren’t. Standardized tests can be written that accurately measure a school’s instructional effectiveness, yet also stimulate teachers to do a better job of teaching.

Turning to the exit-exam question, all schools – kindergarten through college – should employ exit exams allowing us to determine what students have actually learned. We owe it to our students to make sure that they’ve been properly taught. But when I hear, as I recently have, of a proposal for colleges to start using end-of-course tests as exit exams, I become altogether apprehensive. I was a college professor for more than 30 years, and I assure you that most professors know no more about making exit exams than they do about making French fries.

Robert Zemsky, professor and chair of the Learning Alliance at the University of Pennsylvania, and former member of the Spellings Commission:

Discussing testing is roughly akin to planning a visit to the dentist – it’s all about remembered pain. No one really likes to be tested. And yet high-stakes testing – already a key element in the reform of primary and secondary education – has become a standard feature of the “let’s reform higher education” industry.

Testing raises a host of problematic questions. Who is being tested: the student, or the teacher? What is being tested: what the student knows, or what the student has learned? Should the tests focus on specific knowledge – like the ability to read a complex text or solve a standard physics problem – or should the test focus on more general attributes, like creative thinking and problem solving? Can a test in which the test-taker – that is, the student – does not have a direct stake in the outcome actually command the test-taker to do his or her very best?

Then there are the questions of what to do with the results. I have actually sat through an extended discussion of how we could use regression analysis to parse out the contribution different teachers made to a group of students’ performance on a set of standardized tests. The answer was that yes, it was possible, and the results could in fact be used to award merit pay increases. But nobody left the room feeling very comfortable that there would be any gain in what we knew about what made for good teaching.

What we know – and what makes those of us in higher education particularly leery of generalized tests designed to capture how well an institution teaches attributes like creative thinking and problem solving – is that the best predictor of how well a group of college students will do on such a test is how well they did on the SAT or the ACT. Those instruments may not be perfect, or even good at identifying scholastic aptitudes, but boy are they good at telling us who the best test-takers are.

Thomas Toch, co-director of Education Sector, a Washington, D.C., think tank:

There’s a lot of standardized testing in public education. Elementary, middle, and high school students are taking some 56 million reading, math, and science tests this year just to comply with the demands of the No Child Left Behind Act, and many states and school systems layer a lot of other standardized tests on top of that.

This testing is valuable. Without it, parents, taxpayers, and policymakers would have a tough time knowing how well schools were performing. That was the case prior to the advent of the “standards movement” in public education in the 1990s, when states began setting standards, testing students, and publicizing the results. Students could fall through the cracks, and many did, but educators didn’t have strong incentives to help them because without tests that measured students’ performance against clear standards, there was no way of holding teachers and principals accountable for their students’ success.

NCLB took the standards movement to its logical next step, requiring standards and testing systems in every state, and creating consequences for schools that failed to make adequate progress with specific groups of students that public schools hadn’t educated very well in the past – the poor, students of color, English language learners, and the disabled. That has been the law’s most valuable contribution.

But we need much better tests. For a variety of reasons, including the need to produce vast numbers of tests quickly and cheaply, the majority of today’s state-level standardized tests are multiple-choice measures of mostly low-level skills, such as the recalling of facts in a reading passage. They largely sidestep higher-level skills, such as having students compare and contrast two reading passages, and the open-ended questions that are best suited to measuring such skills. Roughly half of the nation’s students are taking tests under NCLB that are completely free of open-ended questions.

This presents a problem, because when tests are high-stakes events, as they are under NCLB (teachers and principals can eventually lose their jobs if their students flunk NCLB tests for several consecutive years), educators have a strong incentive to “teach to the test.” In this case, that means teaching low-level skills at the expense of the more demanding material that everyone says students need to master in today’s complicated world.

Exit exams, which students must pass to graduate, make sense. “Social promotion,” or advancing unprepared students, has been commonplace in schools and colleges for a long time.

But such tests pose tough questions. Two-thirds of the nation’s public high school students currently must pass exit exams in reading and math in order to graduate. But the majority of the tests measure ninth- or tenth-grade-level basic skills; passing them doesn’t mean students are ready for the workplace, much less prepared for college. Yet many state lawmakers have been wary of setting the bar higher for fear of large numbers of students failing.

But is it fair to give students what amounts to a counterfeit passport to college or work? And do such tests spur high school teachers and principals to aim high with their students? To both questions, the answer is, “No.” In most states today, high school exit tests serve the same role as the standardized tests mandated by NCLB: they try to jack up the floor of student achievement in the nation’s schools. The best high school exit tests would be end-of-course exams akin to the “comprehensive” exams that many colleges and universities require students to pass in their majors before graduation – tests, that is, that would raise the ceiling of student achievement.

Monty Neill, executive director of FairTest:

The No Child Left Behind law has had one clear accomplishment: it has given a black eye to education policies based on the overuse of standardized testing.

NCLB’s testing mandates have flooded American classrooms with millions of additional tests. At the same time, the rate of learning improvement has actually slowed, according to the National Assessment of Educational Progress (NAEP).

A mounting pile of surveys and reports documents the negative consequences of testing overuse and abuse, as well as growing public opposition to the test-and-punish approach. For more evidence, just listen to the roars of approval when any of the presidential candidates criticizes the law. No wonder more than 140 national education, civil rights, religious, disability, parenting, and civic groups have called for its comprehensive overhaul.

Having long tracked the misuse and abuse of such tests, FairTest predicted a range of negative consequences from NCLB. Most have now been documented by independent researchers. The problems are compounded by high school graduation tests, and by pressure to score high on college admissions exams.

High-stakes testing has narrowed and dumbed down curricula; eliminated time spent on untested subjects like social studies, art, and even recess; turned classrooms into little more than test preparation centers; reduced high school graduation rates; and driven good teachers from the profession. Those are all reasons why FairTest and other experts advocate a sharp reduction in public school standardized testing and a halt to exit exams.

One-size-fits-all testing schemes make even less sense for colleges and universities. How could one exam ever accurately assess the learning of students majoring in subjects as diverse as art history, biomedical engineering, and political science?

As such, the politicians blindly mandating such exams are the ones outside the mainstream, not assessment reformers like us. Indeed, the testing industry’s own standards state that no single exam should be used as the sole or primary criterion to make high-stakes educational decisions such as promotion, retention, graduation, college admission, or scholarship awards.

There are better ways to assess student learning. Classroom-based information, such as grades, provides richer evidence of performance. High school grade point average is a better predictor of college success than either the SAT or the ACT.

The nation does need better assessments and more training for educators to get the most out of them. FairTest has long promoted high-quality, classroom-based assessments that can be used to improve student learning and teaching. We also support the more than 760 colleges that do not require admissions test scores for many or all of their applicants.

High-quality assessment is an educational necessity. But high-stakes standardized tests harm educational quality and promote inequity.

Gaston Caperton, president of The College Board:

The quantity of testing is less important than the quality of testing. This is where the SAT excels. In an era of rampant grade inflation, the SAT offers students the most level playing field available to demonstrate their knowledge of core material. The SAT, in combination with the grade point average, provides students, parents and admissions counselors with the best predictor of academic success in college.

As for the question about exit exams, I think they largely exist already in the form of final exams in various subject areas, both in high school and in college. The SAT is unique in that it provides a focused look back at a student’s accomplishments in high school while offering a glimpse into that student’s potential in a college environment.
