School Testing: A Complicated Topic

High school teachers and the school board in Jefferson County got into a disagreement over when and how to administer end-of-semester exams and how much weight to give them. The teachers issued a vote of no confidence in the board. Here are the details, along with some discussion of the purpose of testing kids as much as schools do.

Note: This article was published in the Spirit of Jefferson on February 8, 2023.

It may seem self-evident that testing students on their academic progress or lack of progress is baked into the very notion of schooling. Can you even be a student without taking tests? Can you be a teacher without giving tests?

It’s hard to think of an instance where these questions aren’t answered “no.”

Last fall, teachers in the two Jefferson County high schools were directed to prepare an end-of-semester test for their students. Under this arrangement the test was originally to count for 14 percent of each student’s final grade, but the teachers negotiated that down to 10 percent. Then, four days before the test was to be administered, the county Board of Education decided in a 3-2 vote to change the weight of the test to 3 percent.

And so began a serious dust-up.

The teachers, who had been diligent in putting the tests together, objected, saying that the lower percentage effectively wiped out any significance for the tests, because the results would not, could not, in any significant way change the grades that most students had already earned up to that point.

Lowering the significance of the tests, teachers argued, also eliminated many students’ motivation to prepare for the tests.

In other words, teachers argued that lowering the impact of the semester tests would render those tests a waste of teachers’ and students’ time.
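The arithmetic behind the teachers' objection can be sketched in a few lines of Python. This is only an illustration under a stated assumption: that the final grade is a simple weighted average of a coursework average and the exam score. The article does not say how the county actually computes grades, and the function name and the sample scores (a coursework average of 85, exam scores of 60 and 100) are invented for the example.

```python
def final_grade(coursework_avg, exam_score, exam_weight):
    """Blend a coursework average and an exam score (both on a 0-100 scale).

    Assumption: a plain weighted average. The county's real formula is not
    given in the article.
    """
    return (1 - exam_weight) * coursework_avg + exam_weight * exam_score


# Hypothetical student with an 85 average going into the exam.
coursework_avg = 85

for weight in (0.14, 0.10, 0.03):
    poor = final_grade(coursework_avg, 60, weight)    # a weak exam
    great = final_grade(coursework_avg, 100, weight)  # a perfect exam
    print(f"weight {weight:.0%}: final grade ranges from "
          f"{poor:.2f} to {great:.2f} (spread {great - poor:.2f})")
```

Under that assumption, the gap between a weak exam and a perfect one shrinks from 4 points at a 10 percent weight to about 1.2 points at 3 percent, which is the sense in which the exam can hardly move a grade a student has already earned.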

Testing is tied to notions of accountability, and it’s part of the system of rewards (and punishments) that schools put in place to gauge and acknowledge student achievement. Besides getting a high grade on a test, rewards can include teacher praise, honor rolls, extra privileges from parents, prizes and peer recognition.

Ideally, every teacher should welcome a testing result where every student got an A. It would be a kind of reward for excellent and caring teaching. They could rig such a result by putting questions on the test that all students would very likely answer correctly. But you and I know from our own probably long experience with tests that that doesn’t typically, if ever, happen.

Tests are used to “discriminate” between test takers, meaning they are designed with questions of varying difficulty so that the test can assign higher scores to some students and lower scores to others, and so reflect the actual spread of student achievement.

This raises the question of test fairness. Do the test questions represent what teachers have expected their students to learn? It’s the famous question kids always want to ask: “Will that be on the test?” If it’s an achievement test, focused on what students have actually been taught, then the answer should be “Yes. All questions were covered in class.”

In addition, test fairness is determined by the test format (e.g., multiple-choice, short answer, essay). Is the format one that students have been trained to understand? Does it require skills, for example, in reading proficiency, that are not being assessed? It does happen, particularly in multiple-choice reading tests, that the questions the students are asked to answer require better reading comprehension skills than the answer choices do. A student might actually know the right answer but not understand what the question is asking.

Academic testing strives to be fair, meaning that testing should accurately reflect the student’s grasp of the subject area, something that is highly correlated with the effort they put into getting that grasp. The reasoning goes that anyone can get an A, if they apply themselves seriously and diligently to studying beforehand and if instruction clearly relates to what is on the test. However, remember that tests are usually designed so that won’t happen.

But let’s say for argument’s sake that students have this opportunity to know beforehand what to be diligent about in class and know how to take tests. And that is actually the case, because instruction is tied to state mandated educational standards.

Every state publishes its own set of detailed standards to guide classroom instruction. Fair testing means, then, that the tests match up with those standards. In our state, these standards are called the West Virginia College- and Career-Readiness Standards. Teachers and textbooks are typically very explicit about telling students what those standards are.

Of course, it’s the case that some kids have to try harder than others, since not all kids come to the starting gate equally prepared to be serious and diligent. Poverty, poor nutrition, language proficiency and level of parental support can all affect a kid’s desire and willingness to learn and, consequently, affect his or her test performance. Disabilities matter too, which is why the standardized tests administered annually by the states all make accommodations for students with intellectual and physical disabilities. You can’t expect a blind student to take and pass a test that requires them to read with their eyes, for instance. That wouldn’t be fair.

This notion of fairness in testing is a property that can be measured. In the field of test measurement, the fairness of a test is tied to two qualities: validity and reliability, both of which can be determined by statistical analysis of the test questions and the capabilities of the intended test takers. The statistics in question aren’t for the faint of heart, by the way. The need for doing these statistics increases with the impact the test results have for the test takers. Some tests, like the surprise pop quiz a teacher gives the class, have relatively low stakes. Others, like the “leaving exam” that in some schools you would be expected to pass in order to graduate, have much higher stakes. You would be right to expect that a high-stakes test should be able to demonstrate higher validity and reliability than the pop quiz.
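To give a flavor of what such statistics look like, here is a minimal sketch of one widely used reliability measure, Cronbach's alpha, which asks how consistently the questions on a test rise and fall together across students. The article does not name the specific statistics a test developer would run, so the choice of this measure is an assumption made for illustration, and the score table below is invented.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: one row per student, one column per question.
    Assumes at least two questions and that total scores are not all equal
    (otherwise the total variance is zero and alpha is undefined).
    """
    k = len(scores[0])  # number of questions
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented data: 5 students, 4 questions, 1 = correct, 0 = incorrect.
sample_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(sample_scores), 2))
```

Values closer to 1 indicate that the questions behave more consistently with one another. Validity, the other quality mentioned above, cannot be reduced to a single calculation like this one.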

Getting back to the dispute between the teachers and the school board, whose side do you take? Or better stated, on what basis do you decide, if you don’t want to just toss a coin?

Up to this point I’ve been supposing that testing is done to get information on student achievement. But, in fact, testing, or what we should more appropriately call assessment, is done also to get information on how well teachers have taught and how well schools have prepared and supported their teachers and student body. The argument is that how well students do on a test also has implications for how well the teachers and the school have performed.

So what would be a workable guideline to use in deciding whether a single test should be weighted either 10 percent or 3 percent toward a final course grade and whether a test should cover instruction given during the whole semester or just the last part?

It seems to me that all parties to this decision should acknowledge who the main recipients of the information the tests deliver are. One way to think about this is to consider the amount of preparation and feedback the students are given to understand how they will personally benefit from the test. Will they have the opportunity to study for the test? Do they understand the weight the test has on their grade?

Will they see their test results, question by question, and be given an explanation of why they might have answered incorrectly? Will they see how their performance stacks up against others in their class?

Will they have an opportunity to contest results, maybe arguing that one or another question was not covered in class or wasn’t aligned with instructional standards?

The higher the stakes of a single test, the more it’s necessary to demonstrate its value to students.

Likewise for the teachers: will individual teachers be able to compare their students’ test results with those from other classrooms or compare one group of students with another? Will they be able to get information from the test useful for improving their own teaching? And will the school’s principal get information about how to shape curriculum or stage training sessions for teachers?

As a former test development specialist for high stakes proficiency tests, I feel personally that kids get tested too much, and that too little time and effort is given to explaining to students, parents and teachers what useful information comes out of the test—information that could potentially influence future teaching and learning. The position of the three objecting board members seems to argue, in my mind, not so much for a lower weight for the tests, but for not having given the tests at all.

Too much testing? (Source: AP/Jose Moreno)

As to whether 10 percent of the final grade is too high or too low in this particular case, I lean toward the teachers’ position. Their reasoning that tests weighted at only 3 percent will give students no useful information is hard to argue against. Yet the effort they put into making the tests cannot have been wasted, since it did focus their attention on their own instruction. They had to reflect on whether each question they included fairly represented what they had taught during the semester.

I would hope that the teachers and the Board of Education can come together and discuss their way to some agreement on the weight the tests should have. Whatever that percent is, though, everyone concerned needs to be honest with students and parents.

As an aside, I wish that the Jefferson County Commissioners who lit into the three Board of Education members and voted 4 to 1 to send a letter of no-confidence to the state Board of Education had kept their opinions to themselves. They have neither the expertise nor the authority to intrude on school matters. Injecting politics into the controversy doesn’t do anything to heal the wounds or lead to a sounder assessment policy.
