At least one high-stakes reading comprehension exam for high-school students that I know of gives line references pointing to the source of the correct answer to each question. It is possible to get a perfect score without ever having read the passage--just by using the cues. What, then, is being assessed? Certainly not the ability to read a passage, digest the information, and carry it forward in another context. At best, these test results assess the ability to succeed at a sort of treasure hunt where prizes have been seeded and slightly oblique hints provided.
Is that useful?
Of course not. And it privileges those who have gone through some sort of test prep. Those who have simply learned to read well, and who follow the instructions to read the passage first, are at a disadvantage--especially since the test is timed.
But who cares? Who is assessing the assessment?
We've gotten so far removed from any assessment of assessment that debacles like New York State's pineapple-and-hare question only come to light when students point out their idiocy. Testing giants like Pearson are developing questions that simply follow a formula, not questions that serve any real purpose of evaluating student progress in learning.
Garbage In, Garbage Out?
It certainly seems that way. The tests provide the garbage (the questions), the students stir it around a bit, and numbers are produced from their activity--more garbage.
The other day, Diane Ravitch posted a blog entry with the plaintive title "Why Do We Treat the Tests As Scientific Instruments?" Good question, and one that she has been asking for some time. As have I. As have many others.
A resounding silence.
Or simply more claims that we have to have "data." Only then can we effectively evaluate our schools, our teachers, and our students.
But what if that "data" is, in reality, garbage?
Why do we (and state legislatures and the U.S. Department of Education and the media) treat these tests and the scores they produce as accurate measures of what students know and can do? The reader [who had asked a question sparking the post], who clearly is a teacher, reminds us that the tests can't do what everyone assumes they can do. They are subject to statistical error, measurement error, and random error. They are a yardstick that ranges from 30″ to 42″, sometimes more, sometimes less. Yet we treat them as infallible scientific instruments. They are not.

Not only are they not "infallible scientific instruments," but their value as creators of any useful information is doubtful, at best.
Writing tests have to focus on the page and not on communication, but communication is the heart and soul of writing. Why this focus? Because "communication" is almost impossible to assess numerically, while formulaic usages complying with a standardized grading rubric can be--if we ignore the fact that there is a subjective element even to assigning numbers for parts of the rubric, something we paper over through what is called "norming," making sure every grader gives a particular test approximately the same score. Students are assessed on a kind of writing that meets established rules, but it is not a kind of writing that students will engage in anywhere beyond writing classrooms preparing them for standardized tests.
In some respects, what students are taught to do isn't even writing, but putting together pieces of a jigsaw puzzle. Little of it has anything to do with effective communication.
It's long past time that we start assessing the assessments, but are we going to do it?
There is too much invested in high-stakes testing (the entire, hugely profitable "reform" movement in education is built on it) for anyone but the few on the fringes to call out that this emperor has no clothes.