The mathematical
model (Table 25) in the previous post relates all the parts of a
traditional item analysis, including the observed score distribution, test
reproducibility, and the precision of a score. Factors that influence test
scores can be detected and measured by the variation between and within
selected columns and rows.
The model is only
aware of variation within and between mark patterns (deviations from the
mean). The variance (the sum of squared deviations from the mean divided by the
number of values summed, also called the mean sum of squares or MSS) is the property of the data
that relates the mark patterns to the normal distribution. This permits
generating useful descriptive and predictive insights.
The deviation of each mark from the mean is obtained by
subtracting the mean from the value of the mark (Table 25a). The squared deviation value is then elevated
to the upper floor of the model (Step 1, Table 25b). [Un-squared deviations
from the mean would add up to zero.]
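To make Step 1 concrete, the short Python sketch below uses a made-up column of right/wrong marks (not the Nursing124 data) to compute the deviations, confirm that they sum to zero, and average the squared deviations into the MSS.

# A hypothetical column of marks for one item (1 = right, 0 = wrong).
marks = [1, 1, 0, 1, 0, 0, 1, 1]

mean = sum(marks) / len(marks)             # mean mark for the column
deviations = [m - mean for m in marks]     # subtract the mean from each mark (Table 25a)
squared = [d ** 2 for d in deviations]     # squared deviations sent to the upper floor (Step 1)

print(sum(deviations))                 # 0.0 -- unsquared deviations cancel out
print(sum(squared) / len(squared))     # MSS (variance): the mean of the squared deviations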
[IF YOU ARE ONLY USING MULTIPLE-CHOICE TO RANK STUDENTS, YOU
MAY WANT TO SKIP THE FOLLOWING DISCUSSION ON THE MEANING OF TEST SCORES WHEN
USED TO GUIDE INSTRUCTION AND STUDENT DEVELOPMENT.]
The model’s operation gains meaning by relating the score
and item mark distributions to a normal distribution. It compares observed data
to what is expected from chance alone or, as I like to call it, the know-nothing mean.

The know-nothing mean, with Knowledge and Judgment Scoring
(KJS) and the partial credit Rasch model (PCRM), is at 50% for a high quality
student or 25% for a low quality student (the same as with traditional multiple-choice, TMC). Scoring is 0-wrong,
1-have yet to learn, and 2-right.
A high quality student accurately, honestly, and fairly reports what can be
trusted to be useful in further instruction and learning. There are few, if any,
wrong marks. A low quality student earns the same score under both methods of scoring
by marking an answer on every item. Students
adjust the test to fit their preparation.
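As a sketch of how the 0-1-2 rule plays out (my own toy example, not the scoring software referenced at the end of this post), the following Python scores one hypothetical answer sheet under TMC and under KJS, with None standing for an item the student chose not to mark:

# Hypothetical answer key and one student's marks; None = the student did not mark the item.
key       = ['A', 'C', 'B', 'D', 'A', 'B', 'C', 'D']
responses = ['A', 'C', None, 'D', None, None, 'C', 'B']

# TMC: 1 point for a right mark, 0 otherwise; the maximum is the number of items.
tmc = sum(1 for k, r in zip(key, responses) if r == k)

# KJS: 2 = right, 1 = "have yet to learn" (no mark), 0 = wrong; the maximum is 2 per item.
kjs = sum(2 if r == k else (1 if r is None else 0) for k, r in zip(key, responses))

print(f"TMC: {tmc / len(key):.0%}   KJS: {kjs / (2 * len(key)):.0%}")
# An all-omit sheet scores 0% under TMC but 50% under KJS -- the know-nothing mean.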
The know-nothing mean for Knowledge Factor (KF) is above 75% (near the mastery level in the
Nursing124 data, violet). KF weights knowledge and judgment as 1:3, rather than
1:1 (KJS) or 1:0 (TMC). High-risk examinees do not guess. Test takers are given
the same opportunity as teachers and test makers to produce accurate, honest,
and fair test scores.
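The three know-nothing means follow from the knowledge:judgment weightings. A minimal sketch of that arithmetic (my reading of the ratios quoted above, not Knowledge Factor's published scoring rules):

# (knowledge weight, judgment weight) per item under each method.
weightings = {"TMC (1:0)": (1, 0), "KJS/PCRM (1:1)": (1, 1), "KF (1:3)": (1, 3)}

for name, (knowledge, judgment) in weightings.items():
    # Declining to mark an item reports good judgment only; a right mark earns both weights.
    omit_score = judgment / (knowledge + judgment)
    print(f"{name}: omit-all score = {omit_score:.0%}")

# TMC (1:0):      0%  (the 25% know-nothing mean comes from guessing on four-option items)
# KJS/PCRM (1:1): 50%
# KF (1:3):       75%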

The know-nothing mean (50%) for KJS and the PCRM is very
different from that for TMC (25%) for low quality students. The observed operational mean at the mastery level
(above 80%, violet) is nearly the same for high quality students electing
either method of scoring. High quality students have the option of marking only the
items they trust they can answer correctly. There are few to no wrong
marks. [Totally unprepared high quality students could elect to not mark any item,
for a score of 50%.]
The mark patterns
on the lower floor of the mathematical model have different meanings based on
the scoring method. TMC delivers a score that only ranks the student’s
performance on the test. KJS and the PCRM deliver an assessment of what a
student knows or can do that can be trusted as the basis for further learning
and instruction. Quantity (the number right) and quality (the proportion of marked items
that are right) are not linked. Any score below 50% indicates the student has not
developed the sense of judgment needed to learn and report at higher levels of
thinking.
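For example (made-up numbers), a student who marks 20 of 40 items and gets 18 of the marked items right shows a quantity of 45% but a quality of 90%:

# Hypothetical KJS answer sheet: 40 items, 20 marked, 18 of the marked items right.
items, marked, right = 40, 20, 18

quantity = right / items     # number right over all items on the test  -> 45%
quality  = right / marked    # number right over the items marked       -> 90%

print(f"quantity = {quantity:.0%}, quality = {quality:.0%}")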
The score and item mark patterns are fed into the upper
floor of the mathematical model as squared
deviations from the mean (d^2). [A positive deviation of 3 and a negative
deviation of 3 both yield a squared deviation of 9.] The next step is to make
sense of (to visualize, to relate) the distributions of the variance (MSS) from
columns and rows.
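A sketch of that next step, using a made-up student-by-item table (rows = students, columns = items, 1 = right, 0 = wrong), computes the MSS within each row and within each column:

# Hypothetical mark table: rows = students, columns = items (1 = right, 0 = wrong).
table = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]

def mss(values):
    # Mean sum of squares: the average squared deviation from the mean.
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

row_mss = [mss(row) for row in table]          # variation within each student's mark pattern
col_mss = [mss(col) for col in zip(*table)]    # variation within each item's mark pattern

print("row MSS:   ", [round(v, 3) for v in row_mss])
print("column MSS:", [round(v, 3) for v in col_mss])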
- - - - - - - - - - - - - - - - - - - - -
Free software to help you and your students experience and understand how to break out of traditional multiple-choice (TMC) and into Knowledge and Judgment Scoring (KJS) (tricycle to bicycle):