Students, teachers, and test makers each
have responsibilities that contribute to the meaning of a multiple-choice test
score. This post extracts the responsibilities from the four charts in the
prior post, Meaningful Multiple-Choice Test Scores, which compare short answer,
right-count traditional multiple-choice, and knowledge and judgment scoring
(KJS) of both.
Testing looks simple: learn, test, and evaluate. But each step hides choices: short
answer, multiple-choice, or both with student judgment; lower levels of thinking,
higher levels of thinking, or both as needed; student ability below, on, or above
grade level. Standardized test makers must worry about many more variables in a
nearly impossible situation. By the time these variables have been sanitized from
their standardized tests, all that remains is a ranking on the test that is of
little if any instructional value (unless student judgment is added to the
scoring).
Chart 1/4 compares a short answer test and a right-count
traditional multiple-choice test. The teacher carries the most responsibility for
the test score when working with pupils at lower levels of thinking (60%). A
high quality student functioning at higher levels of thinking can take the
responsibility to report what is known or can be done in one pass, and then just
mark the remainder, for the same score (60%). The teacher's score is based on
a subjective interpretation of the student's work. The student's score is
based on matching a subjective interpretation of the test questions to the
student's test preparation. [The judgment needed to do this is not recorded in
traditional multiple-choice scores.]
Chart 2/4 compares what students are told about
multiple-choice tests and what actually takes place. Students are told the
starting score is zero. One point is added for each right mark. Wrong or blank answers
add nothing. There is no penalty. Mark an answer to every question. As a classroom
test, this makes sense if the results are returned in a functional formative
assessment environment. Teachers have the responsibility to sum several scores
when ranking students for grades.
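To make the told rule concrete, here is a minimal sketch of right-count scoring in Python; the ten-question key and marks are made-up illustrations, not data from the charts.

```python
# Right-count traditional scoring: start at zero, add one point per right
# mark; wrong or blank answers add nothing. Key and marks are hypothetical.

def right_count_score(marks, key):
    return sum(1 for mark, answer in zip(marks, key) if mark == answer)

key   = list("ABCDABCDAB")   # ten-question answer key
marks = list("ABCDABCDCC")   # a student who marks every question
print(right_count_score(marks, key) / len(key))  # 0.8, reported as 80%
```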
As a standardized test, the single score is very unfair.
Test makers place great emphasis on the right-mark after-test score and on the
precision of their data reduction tools (for individual questions and for groups
of students). They also have a responsibility to point out that the students on
either side of you have unknowable, different starting scores from chance, to
say nothing of your own luck on test day. The forced-choice test actually
functions as a lottery. Lower scoring students are well aware of this and adjust
their sense of responsibility accordingly (in the absence of a judgment or
quality score to guide them).
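A small simulation makes the lottery visible. The split below (40 questions known, 60 guessed on four-option questions) is my own assumption for illustration, not a figure from the post.

```python
# Five students with identical knowledge (40 of 100 questions known) guess
# the remaining 60 four-option questions. Right-count scoring turns equal
# preparation into unequal scores; the spread is pure chance.
import random

def forced_choice_score(known=40, guessed=60, options=4):
    lucky = sum(1 for _ in range(guessed) if random.randrange(options) == 0)
    return known + lucky

random.seed(1)
print([forced_choice_score() for _ in range(5)])  # five scores, one knowledge level
```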
Chart 3/4 compares student performance by quality. Only a
student with a well-developed sense of responsibility, or a comparable innate
ability, can be expected to function as a high quality, high scoring student
(100% quality, reported as a 60% score). A less self-motivated or less able
student can make two passes, at 100% and 80% quality, to also yield 60%. The
typical student facing a multiple-choice test will make one pass, marking every
question as it comes, to earn a quantity, a quality, and a test score of 60%: a
rank of 60%. No one knows which right mark is a right answer.
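The chart's arithmetic can be sketched by separating the two numbers; I am assuming quality = right marks / marks made and test score = right marks / total questions, on a hypothetical 100-question test.

```python
# Quantity vs quality on a 100-question test (counts are illustrative).
def score_and_quality(right, marked, total=100):
    return right / total, right / marked  # (test score, quality)

print(score_and_quality(60, 60))    # high quality student: 60% score, 100% quality
print(score_and_quality(60, 100))   # one-pass typical student: 60% score, 60% quality
```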
Teachers and test makers have a responsibility to assess and
report individual student quality on multiple-choice tests, just as is done on
short-answer, essay, project, research, and performance tests. Such notes of
encouragement and direction provide the same “feel good” effect found in a
knowledge and judgment quality score accompanied by a list of what was known or
could be done (the right-marked questions).
Chart 4/4 shows knowledge and judgment scoring (KJS) with a
five-option question made from a regular four-option question plus omit. Omit
replaces “just marking”. A short answer question scored with KJS earns one
point for judgment and +/-1 point for a right or wrong answer. An essay question
expecting four bits of information (a short sentence, a relationship, a sketch,
or a chart) earns 4 points for judgment and +/-4 points for an acceptable or
unacceptable report. (All fluff, filler, and snow are ignored. Students quickly
learn not to waste time on these unless the test is scored at the lowest level
of thinking by a “positive” scorer.)
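Read literally, the point rule above can be sketched as follows, assuming the same one-point-judgment, plus-or-minus-one rule applies to each five-option question; the 100-question test is an illustrative assumption.

```python
# KJS per-question points: one judgment point, plus or minus one point for
# a right or wrong mark; omitting keeps the judgment point untouched.
def kjs_points(mark):
    if mark == "omit":
        return 1        # judgment point kept
    if mark == "right":
        return 1 + 1    # judgment point plus the right-answer point
    return 1 - 1        # judgment point minus the wrong-answer point

marks = ["omit"] * 100                    # a student who omits everything
print(sum(map(kjs_points, marks)) / 200)  # 0.5: the common starting score
```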
Each student starts with the same multiple-choice score:
50%. Each student stops when that student has customized the test to his or her
own preparation. This produces an accurate, honest, and fair test score. The
quality score provides judgment guidance for students at all levels. It is the
best method I know of when operating with paper and pencil. Power Up Plus is a
free example. Amplifire refines judgment into confidence by computer, and now on
the Internet. It is just easier to teach a high quality student who knows what
he/she knows.
Most teachers I have met question the score of 60% from KJS.
How can a student get a score of 60% while marking only 10% of the questions
right? Easy. Sum 50% for perfect judgment, 10% for right answers, and no wrong
marks. Or sum 10% right, 10% right and 10% wrong, and omit 20%. If the student
in the example had chosen to mark 10% right (a few well-mastered facts) and then
just marked the rest (no idea how to answer), the resulting score would fall
below 40% (about 25% wrong). With no judgment, the two methods of scoring
(smart and dumb) produce identical test scores. KJS is not a give-away. It is a
simple, easy way to update currently used multiple-choice questions to produce
an accurate, honest, and fair test score. KJS records what right-count
traditional multiple-choice misses (judgment) and what the CCSS movement tries
to promote.
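The no-judgment case can be checked directly. The counts below (10 questions known, 90 guessed on four-option questions, on a hypothetical 100-question test) are my own illustration of the student described above.

```python
# Compare right-count and KJS for a student with no judgment: 10 known
# right marks plus 90 guesses (about 1 in 4 lucky on four options).
def right_count(right, total):
    return right / total

def kjs(right, wrong, omitted):
    # right = judgment + 1 = 2 points; wrong = judgment - 1 = 0; omit = 1
    total = right + wrong + omitted
    return (2 * right + 0 * wrong + omitted) / (2 * total)

lucky = 90 // 4                       # ~25% of the guesses land right
right, wrong = 10 + lucky, 90 - lucky
print(right_count(right, 100))        # ~0.32, below 40%
print(kjs(right, wrong, omitted=0))   # identical: guessing earns no judgment credit
```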