03 January 2012

The Myths of Standardized Tests by Phillip Harris et al.

There is a lot wrong with public education these days.  Today’s students seem less educated compared to students of yesteryear, yet strangely feel more academic pressure than their forebears ever did.  And no one is satisfied with the current system.  Parents are frustrated, teachers are frustrated, students are frustrated, administrators are upset, and educational leaders are disappointed.  What is the cause of this frustration?

Harris sets out to identify the sources of the current frustration, and he sets his sights on the increased use of standardized tests, which is a direct result of No Child Left Behind.  The desire for quantifiability stems from the need to meet the standards enacted by this legislation.  As in all political processes, there is a general pretense of objectivity in the pursuit of fairness, for the operating assumption is that it is fair to give people objective, measurable goals in advance.  Of course, this requires that goal attainment actually be measured, hence the testing.

Harris presents three broad themes in this book, and makes a compelling case for each of them.  First, Harris explains the limits of the test.  Since emphasis is placed on objectivity, standardized tests simply cannot measure those elements that are subjectively defined and quantified.  Furthermore, standardized tests do not even begin to test for all relevant standards, let alone test each standard with enough questions to form an adequate sample.

In one example, he points out that there are dozens of main standards, with several sub-points each, leading to over one hundred testable items.  But to get a proper view of how well each student has mastered the standards, each student must be given multiple questions per sub-point.  Thus, a sufficient test would be approximately 400-500 questions in length.  And that’s per subject.

Most standardized tests only ask around 100 questions total, and cover multiple subjects to boot.  Therefore, there are simply not enough data to make a reasonable assessment of student capability, for the sample size of knowledge tested is simply too small.
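
The arithmetic behind this argument can be sketched in a few lines of Python.  The specific numbers are assumptions chosen to match the rough figures above (about 100 testable items, four to five questions per item, a 100-question test), and the three-subject split is purely illustrative; none of these values are taken from the book.

```python
def questions_needed(testable_items: int, questions_per_item: int) -> int:
    """Questions one subject would need if every item gets several questions."""
    return testable_items * questions_per_item


def coverage_per_item(total_questions: int, subjects: int, testable_items: int) -> float:
    """Average questions per testable item when one test is split across subjects."""
    return total_questions / subjects / testable_items


# Assumed figures: "over one hundred" testable items, rounded down to 100.
items = 100
low = questions_needed(items, 4)   # assumed lower bound: 4 questions per item
high = questions_needed(items, 5)  # assumed upper bound: 5 questions per item

# Assumed: a 100-question test covering 3 subjects (illustrative split).
per_item = coverage_per_item(100, 3, items)

print(f"Questions needed per subject: {low}-{high}")
print(f"Questions actually asked per item: {per_item:.2f}")
```

Under these assumptions, the test asks well under one question per testable item, where the argument above calls for four or five.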

Second, Harris points out the distortive effects of standardized tests.  The most obvious effect is cheating.  The rationale for some occasions of cheating (most notably corrections made by teachers) is that the test doesn’t adequately reflect the abilities of the student in question.  This is likely true, given that the sample size is simply too small.

Another distortive effect is teaching to the test.  This is an especially ironic result in light of how broad the tests are supposed to be, for teaching to the test requires a tighter focus on test items and standards.  As such, education becomes narrower and more stunting.  The current emphasis on test performance also leads schools to cut down the time spent on non-test subjects, like (generally) history, science, arts, and even recess and lunch.

A final distortive effect of the test is that it doesn’t account for test-taking abilities.  Since tests usually have a time constraint, there are certain strategies that enable test-takers to perform well on a test even if they aren’t familiar with much of the content being tested.  Answering easy questions first, making educated guesses, and the like are tried-and-true test-taking strategies.  The problem is that there is no way to separate this skill from one’s knowledge base, and so those who are good at taking tests will appear to be more knowledgeable than they are, relatively speaking.  Incidentally, this also speaks to the limited value of tests, for a test score is not only indicative of knowledge, but also of how well one can display that knowledge in a specific format under specific constraints.  As such, tests are not a measure of knowledge, but a measure of a specific form and display of knowledge.  What a test captures is only a subset of what a student knows.

Third, Harris points out that tests aren’t actually that useful, relative to their stated use, which is predicting future academic and career success.  In fact, GPA and extracurricular, non-school volunteering are both vastly superior predictors of future success, most likely because future success requires not only knowledge, but also a work ethic.  Also, grades are more well-rounded metrics than test scores, and can account for subjective valuations, which incidentally proves that human judgment is superior to objective metrics.

This book does have some shortcomings.  The most notable one is that it operates from a statist assumption, in that the authors assume that the state should be in charge of education.  Consequently, the authors argue for reform, not realizing that the problem is the system, not its operation.

Also, the authors ignore the subjectivity of value.  This is especially ironic given that they discuss a poll that indicates what US citizens desire from the public education system.  The breadth and variation of the results should speak to the subjectivity of value, but this point is overlooked.  This is an egregious error, for once one realizes that value is subjective, it becomes abundantly clear that the market is in a considerably better position to solve the education problem.

Additionally, the book fails to address the fluidity of working knowledge in a meaningful way.  This point needs to be hammered home, for it is not only possible, but incredibly likely that one’s test scores would be different were the test taken multiple times.  Someone taking the third-grade ISTEP test, for example, would get a different score at the age of nine than, say, at the age of thirty.  This is because knowledge is fluid, and changes over time.  Thus, a test score is only indicative of one’s working knowledge under specific conditions at a specific point in time.  It is thus ludicrous to believe that one test, at a specific point in time, will somehow have any long-term meaning or relevance with regard to one’s working knowledge.

In all, The Myths of Standardized Tests is a compelling, interesting read.  The authors take a much-needed critical look at the role of standardized testing in education.  It is weighed in the balance and found wanting.  Even with the book’s shortcomings, there is certainly much to think about.  As such, it is a recommended read.

Cf. also "The Pretense of Knowledge and Educational Testing."

