The mathematical model (Table 25) in the previous post relates all the parts of a
traditional item analysis, including the observed score distribution, test
reproducibility, and the precision of a score. Factors that influence test
scores can be detected and measured by the variation between and within
selected columns and rows.
The model is only aware of variation within and between mark patterns (deviations from the
mean). The variance is the sum of squared deviations from the mean divided by the
number of values summed; it is also called the mean sum of squares (MSS). It is the property of the data
that relates the mark patterns to the normal distribution. This permits
generating useful descriptive and predictive insights.
The deviation of each mark from the mean is obtained by
subtracting the mean from the value of the mark (Table 25a). The squared deviation value is then elevated
to the upper floor of the model (Step 1, Table 25b). [Un-squared deviations
from the mean would add up to zero.]
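The arithmetic of this step can be sketched in a few lines of Python. This is only an illustration: the marks below are hypothetical, not the Table 25 data, and the variable names are my own.

```python
from statistics import fmean

# Hypothetical right (1) / wrong (0) marks, for illustration only.
marks = [1, 0, 1, 1, 0, 1, 1, 1]

mean = fmean(marks)

# Deviation of each mark from the mean (mark minus mean).
deviations = [m - mean for m in marks]

# Squared deviations: the values that move to the upper floor of the model.
squared = [d * d for d in deviations]

sum_of_deviations = sum(deviations)  # un-squared deviations cancel to zero
sum_of_squares = sum(squared)        # SS
mss = sum_of_squares / len(marks)    # variance: SS divided by the number summed

print(mean, sum_of_deviations, sum_of_squares, mss)
```

Squaring is what keeps the positive and negative deviations from cancelling each other out.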
[IF YOU ARE ONLY USING MULTIPLE-CHOICE TO RANK STUDENTS, YOU
MAY WANT TO SKIP THE FOLLOWING DISCUSSION ON THE MEANING OF TEST SCORES WHEN
USED TO GUIDE INSTRUCTION AND STUDENT DEVELOPMENT.]
The model’s operation gains meaning by relating the score
and item mark distributions to a normal distribution. It compares observed data
to what is expected from chance alone or, as I like to call it, the know-nothing mean.
The expected know-nothing mean based on 0-wrong and 1-right
with 4-option items (popular on standardized tests) is centered on 25%, or 6 right
out of 24 questions (Chart 62). On a traditional multiple-choice test (TMC) this comes from luck on test day alone: students
only need to mark each item; they do not need to read the test. The mean moves to 50% if student
ability and item difficulty have equal value. It moves to 80% if students are functioning
near the mastery level as seen in the Nursing124 data. The math model will adjust to fit these data.
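The 25% figure is just the chance of a blind mark landing on the one right option. A minimal sketch, assuming every option is equally likely to be marked (the function name is hypothetical):

```python
def know_nothing_mean(n_items: int, n_options: int) -> tuple[float, float]:
    """Expected number right and percent score from blind marking alone.

    Assumes one right answer per item and that each option is equally
    likely to be marked (no reading of the test required).
    """
    p_right = 1 / n_options
    expected_right = n_items * p_right
    return expected_right, 100 * p_right


expected_right, percent = know_nothing_mean(n_items=24, n_options=4)
print(expected_right, percent)
```

With 24 four-option items this gives the 6 right out of 24 quoted above.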
The know-nothing mean, with Knowledge and Judgment Scoring
(KJS) and the partial credit Rasch model (PCRM), is at 50% for a high quality
student or 25% for a low quality student (same as TMC). Scoring is 0-wrong,
1-have yet to learn, and 2-right.
A high quality student accurately, honestly, and fairly reports what is
trusted to be useful in further instruction and learning. There are few, if any,
wrong marks. A low quality student performs the same on both methods of scoring
by marking an answer on all items. Students
adjust the test to fit their preparation.
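The two know-nothing means for KJS follow directly from the 0, 1, 2 point values. The sketch below uses a hypothetical helper, kjs_percent, and assumes a 24-item test with 4-option items as in the TMC example:

```python
def kjs_percent(right: int, wrong: int, omitted: int) -> float:
    """Percent score with 0-wrong, 1-have yet to learn (omit), 2-right."""
    points = 2 * right + 1 * omitted + 0 * wrong
    max_points = 2 * (right + wrong + omitted)
    return 100 * points / max_points


# High quality, totally unprepared: marks nothing on a 24-item test.
all_omitted = kjs_percent(right=0, wrong=0, omitted=24)

# Low quality, totally unprepared: marks everything; chance alone is
# expected to yield 6 right and 18 wrong with 4-option items.
all_guessed = kjs_percent(right=6, wrong=18, omitted=0)

print(all_omitted, all_guessed)
```

The student who marks every item lands on the same 25% as under TMC; the student who reports "have yet to learn" on every item starts at 50%.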
The know-nothing mean for Knowledge Factor (KF) is above 75% (near the mastery level in the
Nursing124 data, violet). KF weights knowledge and judgment as 1:3, rather than
1:1 (KJS) or 1:0 (TMC). High-risk examinees do not guess. Test takers are given
the same opportunity as teachers and test makers to produce accurate, honest,
and fair test scores.
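One way to read the knowledge-to-judgment ratios is as the share of an item's value a student keeps by leaving it unmarked. That reading is my assumption for illustration; it is not taken from the KF scoring rules themselves. Under it, the score for a test with no marks at all would be:

```python
from fractions import Fraction


def unmarked_test_percent(knowledge_weight: int, judgment_weight: int) -> Fraction:
    """Percent score for a test with no marks, ASSUMING an unmarked item
    earns the judgment share of the item's total value."""
    total = knowledge_weight + judgment_weight
    return Fraction(100 * judgment_weight, total)


tmc = unmarked_test_percent(1, 0)  # 1:0, judgment carries no weight
kjs = unmarked_test_percent(1, 1)  # 1:1
kf = unmarked_test_percent(1, 3)   # 1:3

print(tmc, kjs, kf)
```

Under this assumption the 1:1 ratio reproduces the 50% know-nothing mean for KJS, and the 1:3 ratio gives 75%, in line with the figure quoted for KF.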
The distribution of
scores about the know-nothing mean is the same for TMC (green, Chart 63) and
KJS (red, Chart 63). An unprepared student can expect, on average, a score of
25% on a TMC test with 4-option items. Some 2/3 of the time the score will fall
within +/- 1 standard deviation of 25%. As a rule of thumb, the standard
deviation (SD) on a classroom test tends to be about 10%. The best an
unprepared student can hope for is a score over 35% (25 + 10) about 1/6 of the
time ((1 - 2/3)/2).
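The rule-of-thumb figures can be checked against the binomial distribution for blind marking. This sketch assumes 24 independent 4-option items (the test length used for the know-nothing mean above); the names are hypothetical.

```python
from math import comb, sqrt

n_items, p_right = 24, 0.25

# Mean and standard deviation of the number right from chance alone.
mean_right = n_items * p_right
sd_right = sqrt(n_items * p_right * (1 - p_right))
sd_percent = 100 * sd_right / n_items


def prob_at_least(k: int) -> float:
    """Exact binomial probability of k or more right marks by chance."""
    return sum(
        comb(n_items, x) * p_right**x * (1 - p_right) ** (n_items - x)
        for x in range(k, n_items + 1)
    )


# On 24 items a score over 35% means at least 9 right (8 right is 33.3%).
p_over_35 = prob_at_least(9)

print(mean_right, round(sd_percent, 1), round(p_over_35, 3))
```

For this case the binomial SD comes out a little under the 10% rule of thumb, and the exact chance of beating 35% is in the same neighborhood as 1/6, though somewhat below it. The rule of thumb is a convenient approximation, not an exact result.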
The know-nothing mean (50%) for KJS and the PCRM is very
different from TMC (25%) for low quality students. The observed operational mean at the mastery level
(above 80%, violet) is nearly the same for high quality students electing
either method of scoring. High quality students have the option of selecting
items they trust they can answer correctly. There are few to no wrong
marks. [Totally unprepared high quality students could elect to not mark any item
for a score of 50%.]
The mark patterns
on the lower floor of the mathematical model have different meanings based on
the scoring method. TMC delivers a score that only ranks the student’s
performance on the test. KJS and the PCRM deliver an assessment of what a
student knows or can do that can be trusted as the basis for further learning
and instruction. Quantity (number right) and quality (portion marked that are
right) are not linked. Any score below 50% indicates the student has not
developed a sense of judgment needed to learn and report at higher levels of
thinking.
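A small sketch shows why quantity and quality are not linked. The two students below are hypothetical, and so is the helper name; quantity is taken as number right out of all items, quality as number right out of items marked.

```python
def quantity_and_quality(right: int, wrong: int, omitted: int):
    """Return (quantity %, quality %) for one student's mark pattern.

    quantity: number right as a percent of all items on the test.
    quality:  number right as a percent of the items actually marked
              (None if the student marked nothing).
    """
    n_items = right + wrong + omitted
    marked = right + wrong
    quantity = 100 * right / n_items
    quality = 100 * right / marked if marked else None
    return quantity, quality


# Two hypothetical students with the same number right on a 24-item test.
selective = quantity_and_quality(right=12, wrong=0, omitted=12)
marks_all = quantity_and_quality(right=12, wrong=12, omitted=0)

print(selective, marks_all)
```

Both students have the same quantity, but only the first one's marks can all be trusted.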
The score and item mark patterns are fed into the upper
floor of the mathematical model as the squared
deviation from the mean (d^2). [A positive deviation of 3 and a negative
deviation of 3 both yield a squared deviation of 9.] The next step is to make
sense of (to visualize, to relate) the distributions of the variance (MSS) from
columns and rows.
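As a minimal sketch of that next step, the MSS can be computed once for the row totals (student scores) and once for the column totals (item difficulties). The mark matrix here is hypothetical and far smaller than the one in Table 25, and this is only the first cut at the variances, not the full model.

```python
from statistics import fmean

# Hypothetical 4-student by 5-item mark matrix (1 = right, 0 = wrong).
marks = [
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0],
]

student_scores = [sum(row) for row in marks]         # row totals
item_difficulty = [sum(col) for col in zip(*marks)]  # column totals


def mss(values):
    """Mean sum of squares: squared deviations from the mean, averaged."""
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


score_mss = mss(student_scores)
item_mss = mss(item_difficulty)

print(student_scores, score_mss)
print(item_difficulty, item_mss)
```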
- - - - - - - - - - - - - - - - - - - - -
Free software to help you and your students experience and understand how to break
out of traditional multiple-choice (TMC) and into Knowledge and Judgment
Scoring (KJS) (tricycle to bicycle):