Wednesday, February 11, 2015

Learning Assessment Responsibilities

Students, teachers, and test makers each have responsibilities that contribute to the meaning of a multiple-choice test score. This post extracts those responsibilities from the four charts in the prior post, Meaningful Multiple-Choice Test Scores, which compare short answer, right-count traditional multiple-choice, and knowledge and judgment scoring (KJS) of both.

Testing looks simple: learn, test, and evaluate. Short answer, multiple-choice, or both with student judgment. Lower levels of thinking, higher levels of thinking, or both as needed. Student ability below, on, or above grade level. Standardized test makers must worry about many more variables, in a nearly impossible situation. By the time these have been sanitized from their standardized tests, all that remains is a ranking that is of little, if any, instructional value (unless student judgment is added to the scoring).

Chart 1/4 compares a short answer and a right-count traditional multiple-choice test. The teacher has the most responsibility for the test score when working with pupils at lower levels of thinking (60%). A high-quality student functioning at higher levels of thinking could take the responsibility to report what is known or can be done in one pass, and then just mark the remainder, for the same score (60%). The teacher’s score is based on the teacher’s subjective interpretation of the student’s work. The student’s score is based on matching a subjective interpretation of the test questions with test preparation. [The judgment needed to do this is not recorded in traditional multiple-choice scores.]

Chart 2/4 compares what students are told about multiple-choice tests and what actually takes place. Students are told the starting score is zero. One point is added for each right mark. Wrong or blank answers add nothing. There is no penalty. Mark an answer to every question. As a classroom test, this makes sense if the results are returned in a functional formative assessment environment. Teachers have the responsibility to sum several scores when ranking students for grades.
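
For readers who like to see the rule as code, here is a minimal sketch of right-count scoring exactly as students are told it works. The function name and answer-key layout are my illustration, not anything from the charts.

    def right_count(marks, key):
        # Start at zero; add one point for each right mark.
        # Wrong or blank (None) marks add nothing -- no penalty.
        return sum(mark == answer for mark, answer in zip(marks, key))

    key   = ["B", "D", "A", "C", "B"]
    marks = ["B", "D", "C", None, "B"]  # one wrong, one blank
    print(right_count(marks, key))      # 3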

As a standardized test, the single score is very unfair. Test makers place great emphasis on the right-mark after-test score and on the precision of their data-reduction tools (for individual questions and for groups of students). They have a responsibility to point out that the students on either side of you have unknowable, different starting scores from chance, let alone your luck on test day. The forced-choice test actually functions as a lottery. Lower-scoring students are well aware of this and adjust their sense of responsibility accordingly (in the absence of a judgment or quality score to guide them).
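
A quick simulation (my sketch, not the test makers’ method) shows how far apart those chance starting scores can be. Assume 100 four-option questions and pure guessing:

    import random

    random.seed(42)  # fixed seed so the sketch is repeatable
    # 1,000 students guessing blindly on 100 four-option questions
    chance = [sum(random.random() < 0.25 for _ in range(100))
              for _ in range(1000)]
    print(min(chance), sum(chance) / 1000, max(chance))
    # roughly 12, 25, 40 -- two pure guessers can start a letter grade apart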

Chart 3/4 compares student performance by quality. Only a student with a well-developed sense of responsibility, or a comparable innate ability, can be expected to function as a high-quality, high-scoring student (100% quality, but reported as 60%). A less self-motivated or less able student can make two passes, at 100% and 80% accuracy, to also yield 60%. The typical student, facing a multiple-choice test, will make one pass, marking every question as it comes, to earn a quantity, quality, and test score of 60%: a rank of 60%. No one knows which right mark is a right answer.
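
One way to read the two-pass example (my arithmetic; the chart shows only the totals): a first pass at 100% accuracy on the questions the student is sure of, and a second pass of educated guesses at 80% accuracy, can land on the same 60 right marks as the typical one-pass student.

    pass1_marked, pass1_accuracy = 40, 1.00  # sure answers
    pass2_marked, pass2_accuracy = 25, 0.80  # educated guesses
    right = pass1_marked * pass1_accuracy + pass2_marked * pass2_accuracy
    print(right / 100)  # 0.6 -- the scanner reports 60% either way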

Teachers and test makers have a responsibility to assess and report individual student quality on multiple-choice tests, just as is done on short-answer, essay, project, research, and performance tests. These notes of encouragement and direction provide the same “feel good” effect found in a knowledge and judgment scored quality score when accompanied by a list of what was known or could be done (the right-marked questions).

Chart 4/4 shows knowledge and judgment scoring (KJS) with a five-option question made from a regular four-option question plus omit. Omit replaces “just marking”. A short answer question scored with KJS earns one point for judgment and +/-1 point for a right or wrong answer. An essay question expecting four bits of information (short sentence, relationship, sketch, or chart) earns 4 points for judgment and +/-4 points for an acceptable or unacceptable report. (All fluff, filler, and snow are ignored. Students quickly learn not to waste time on these unless the test is scored at the lowest level of thinking by a “positive” scorer.)
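
The point rules above translate directly into code. A minimal sketch, assuming per-item marks of "right", "wrong", or "omit" (the names and function layout are mine):

    def kjs_item(mark):
        # 1 point for judgment, +/-1 point for knowledge:
        # omit = 1, right = 1 + 1 = 2, wrong = 1 - 1 = 0
        if mark == "omit":
            return 1
        return 2 if mark == "right" else 0

    def kjs_essay(omitted, acceptable, bits=4):
        # Essay variant: `bits` points for judgment, +/- `bits` for the report.
        if omitted:
            return bits
        return 2 * bits if acceptable else 0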

Each student starts with the same multiple-choice score: 50%. Each student stops after customizing the test to his or her own preparation. This produces an accurate, honest, and fair test score. The quality score provides judgment guidance for students at all levels. This is the best method I know of when operating with paper and pencil. Power Up Plus is a free example. Amplifire refines judgment into confidence using a computer, and now over the Internet. It is just easier to teach a high-quality student who knows what he/she knows.
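
Scoring a whole test with the item rule sketched above shows why every student starts at 50%: a blank answer sheet keeps the one judgment point on every question, which is half of the maximum. A sketch under that assumption:

    def kjs_percent(right, wrong, total):
        # Each omitted item keeps its 1 judgment point; maximum = 2 * total.
        omits = total - right - wrong
        return 100 * (2 * right + omits) / (2 * total)

    print(kjs_percent(0, 0, 100))    # 50.0 -- the common starting score
    print(kjs_percent(60, 40, 100))  # 60.0 -- everything marked, 60 right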


Most teachers I have met question the score of 60% from KJS. How can a student get a score of 60% while marking only 10% of the questions right? Easy: sum 50% for perfect judgment, 10% for right answers, and NO wrong marks. Or sum 10% right, plus 10% right and 10% wrong, and omit 20%. If the student in the example chose to mark 10% right (a few well-mastered facts) and then just marked the rest (with no idea how to answer), the resulting score falls below 40% (only about 25% of the guesses land right). With no judgment, the two methods of scoring (smart and dumb) produce identical test scores. KJS is not a give-away. It is a simple, easy way to update currently used multiple-choice questions to produce an accurate, honest, and fair test score. KJS records what right-count traditional multiple-choice misses (judgment), and what the CCSS movement tries to promote.
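
The “smart and dumb produce identical scores” claim is easy to check: when every question is marked, no judgment points survive, and the KJS percent collapses to the right count. My sketch, again assuming 100 four-option questions and 10 well-mastered facts:

    import random

    random.seed(1)
    N, known = 100, 10
    # "Dumb" strategy: mark the 10 known items right, just mark the other 90.
    lucky = sum(random.random() < 0.25 for _ in range(N - known))
    right = known + lucky
    wrong = N - right  # everything marked, so nothing is omitted
    print(100 * right / N)               # right-count score
    print(100 * (2 * right) / (2 * N))   # KJS percent with zero omits: the same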