Multiple-Choice Reborn: Test Scoring Math Model

The mathematical model (Table 25) in the previous post relates all the parts of a traditional item analysis including the observed score distribution, test reproducibility, and the precision of a score. Factors that influence test scores can be detected and measured by the variation between and within selected columns and rows.

The model is only aware of variation within and between mark patterns (deviations from the mean). The variance, also called the mean sum of squares (MSS), is the sum of squared deviations from the mean divided by the number summed. It is the property of the data that relates the mark patterns to the normal distribution. This permits generating useful descriptive and predictive insights.

The deviation of each mark from the mean is obtained by subtracting the mean from the value of the mark (Table 25a). Each deviation is then squared, and the squared value is elevated to the upper floor of the model (Step 1, Table 25b). [Un-squared deviations from the mean would add up to zero.]
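The arithmetic in this step can be sketched in a few lines of Python. The five scores below are hypothetical values chosen for easy arithmetic; they are not taken from Table 25, and the variable names are my own.

```python
# Hypothetical scores (number right) for five students; not from Table 25.
scores = [20, 17, 23, 14, 21]

mean = sum(scores) / len(scores)            # 95 / 5 = 19.0
deviations = [s - mean for s in scores]     # [1.0, -2.0, 4.0, -5.0, 2.0]
squared = [d ** 2 for d in deviations]      # [1.0, 4.0, 16.0, 25.0, 4.0]

# Un-squared deviations cancel out; squared deviations do not.
sum_of_deviations = sum(deviations)         # 0.0
sum_of_squares = sum(squared)               # 50.0

# Variance as the mean sum of squares (MSS): divide by the number summed.
mss = sum_of_squares / len(scores)          # 10.0

print(sum_of_deviations, sum_of_squares, mss)
```

Squaring is what keeps positive and negative deviations from cancelling, which is why only the squared values are useful on the upper floor.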

[IF YOU ARE ONLY USING MULTIPLE-CHOICE TO RANK STUDENTS, YOU MAY WANT TO SKIP THE FOLLOWING DISCUSSION ON THE MEANING OF TEST SCORES WHEN USED TO GUIDE INSTRUCTION AND STUDENT DEVELOPMENT.]

The model’s operation gains meaning by relating the score and item mark distributions to a normal distribution. It compares observed data to what is expected from chance alone or, as I like to call it, the know-nothing mean.

The expected know-nothing mean based on 0-wrong and 1-right with 4-option items (popular on standardized tests) is centered on 25%, 6 right out of 24 questions (Chart 62). This is from luck on test day alone (students only need to mark each item; they do not need to read the test) on a traditional multiple-choice test (TMC). The mean moves to 50% if student ability and item difficulty have equal value. It moves to 80% if students are functioning near the mastery level as seen in the Nursing124 data. The math model will adjust to fit these data.

The know-nothing mean, with Knowledge and Judgment Scoring (KJS) and the partial credit Rasch model (PCRM), is at 50% for a high quality student or 25% for a low quality student (same as TMC). Scoring is 0-wrong, 1-have yet to learn, and 2-right. A high quality student accurately, honestly, and fairly reports what is trusted to be useful in further instruction and learning. There are few, if any, wrong marks. A low quality student performs the same on both methods of scoring by marking an answer on all items. Students adjust the test to fit their preparation.
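A minimal sketch of the two scoring rules, assuming a 24-item test and the point values given above (0-wrong, 1-have yet to learn, 2-right for KJS; 0-wrong, 1-right for TMC). The function names are hypothetical and are not taken from any scoring software.

```python
def kjs_percent(right, wrong, omitted):
    """Percent score with 2 points per right mark, 1 per omit, 0 per wrong mark."""
    items = right + wrong + omitted
    return 100 * (2 * right + 1 * omitted) / (2 * items)


def tmc_percent(right, wrong, omitted=0):
    """Percent score with 1 point per right mark and 0 for anything else."""
    items = right + wrong + omitted
    return 100 * right / items


# A totally unprepared high quality student omits every item.
all_omitted_kjs = kjs_percent(right=0, wrong=0, omitted=24)     # 50.0

# A low quality student marks every item and gets the chance
# expectation of 6 right out of 24 on 4-option items.
all_guessed_kjs = kjs_percent(right=6, wrong=18, omitted=0)     # 25.0
all_guessed_tmc = tmc_percent(right=6, wrong=18)                # 25.0

print(all_omitted_kjs, all_guessed_kjs, all_guessed_tmc)
```

Marking every item gives the same 25% under either rule, while omitting every item is what places the KJS know-nothing mean at 50%.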

The know-nothing mean for Knowledge Factor (KF) is above 75% (near the mastery level in the Nursing124 data, violet). KF weights knowledge and judgment as 1:3, rather than 1:1 (KJS) or 1:0 (TMC). High-risk examinees do not guess. Test takers are given the same opportunity as teachers and test makers to produce accurate, honest, and fair test scores.

The distribution of scores about the know-nothing mean is the same for TMC (green, Chart 63) and KJS (red, Chart 63). An unprepared student can expect, on average, a score of 25% on a TMC test with 4-option items. Some 2/3 of the time the score will fall within +/- 1 standard deviation of 25%. As a rule of thumb, the standard deviation (SD) on a classroom test tends to be about 10%. The best an unprepared student can hope for is a score over 35% (25 + 10) about 1/6 of the time ((1 - 2/3)/2).
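The rule of thumb can be checked against a simple chance model. This sketch assumes pure guessing on 24 independent 4-option items (a binomial distribution), which is my assumption for illustration and not a calculation from the charts.

```python
import math

n_items = 24      # test length used in this post
p_guess = 0.25    # chance of a right mark on a 4-option item

# Mean and SD of the number right under pure guessing.
mean_right = n_items * p_guess                             # 6.0
sd_right = math.sqrt(n_items * p_guess * (1 - p_guess))    # sqrt(4.5)

mean_pct = 100 * mean_right / n_items                      # 25.0
sd_pct = 100 * sd_right / n_items                          # about 8.8

# Normal-curve rule of thumb from the text: the share above +1 SD.
upper_tail = (1 - 2 / 3) / 2                               # 1/6


def binom_pmf(k, n, p):
    """Probability of exactly k right marks out of n by guessing."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Smallest number right that gives a score over 35% (9 of 24).
cutoff = (35 * n_items) // 100 + 1
p_above_35 = sum(binom_pmf(k, n_items, p_guess)
                 for k in range(cutoff, n_items + 1))      # about 0.12

print(mean_pct, round(sd_pct, 1), round(upper_tail, 3), round(p_above_35, 3))
```

Under these assumptions the chance-only SD comes out a little under the 10% rule of thumb, and the exact chance of beating 35% is somewhat below 1/6, so the rule of thumb is a slightly generous estimate for the unprepared student.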

The know-nothing mean (50%) for KJS and the PCRM is very different from TMC (25%) for low quality students. The observed operational mean at the mastery level (above 80%, violet) is nearly the same for high quality students electing either method of scoring. High quality students have the option of selecting items they can trust they can answer correctly. There are few to no wrong marks. [Totally unprepared high quality students could elect to not mark any item for a score of 50%.]

The mark patterns on the lower floor of the mathematical model have different meanings based on the scoring method. TMC delivers a score that only ranks the student’s performance on the test. KJS and the PCRM deliver an assessment of what a student knows or can do that can be trusted as the basis for further learning and instruction. Quantity (number right) and quality (portion marked that are right) are not linked. Any score below 50% indicates the student has not developed a sense of judgment needed to learn and report at higher levels of thinking.

The score and item mark patterns are fed into the upper floor of the mathematical model as the squared deviation from the mean (d^2). [A positive deviation of 3 and a negative deviation of 3 both yield a squared deviation of 9.] The next step is to make sense of (to visualize, to relate) the distributions of the variance (MSS) from columns and rows.
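One plausible reading of "variance from columns and rows" can be sketched as follows. The mark matrix is hypothetical (4 students by 5 items, 1-right and 0-wrong), and the helper name `mss` is my own; this is an illustration, not the contents of Table 25.

```python
# Hypothetical marks: one row per student, one column per item.
marks = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1],
]


def mss(values):
    """Mean sum of squares: average squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Row totals are student scores; column totals are item right counts.
student_scores = [sum(row) for row in marks]        # [4, 3, 2, 4]
item_totals = [sum(col) for col in zip(*marks)]     # [3, 3, 3, 1, 3]

score_mss = mss(student_scores)    # variation between students
item_mss = mss(item_totals)        # variation between items

print(student_scores, item_totals, score_mss, item_mss)
```

The same squared-deviation step is applied in both directions; only the set of totals being compared changes.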

- - - - - - - - - - - - - - - - - - - - -

Free software to help you and your students experience and understand how to break out of traditional multiple-choice (TMC) and into Knowledge and Judgment Scoring (KJS) (tricycle to bicycle):

Wednesday, February 19, 2014

Test Scoring Math Model - Input
