The count of right marks on a test is the raw material fed
into statistical calculations. Not all right marks have the same value or
meaning, though traditional multiple-choice (TMC) ignores this fact (see prior posts). The
following model, operating both in a perfect world and with real data,
will do the same.
Able and inspired students and teachers see mastery as their
goal. In this perfect world example, all of these students receive the same
test score (85%). There is no variation in the scores.
Unable and uninspired students and teachers see passing as
their goal. In this perfect world, all of these students receive the same test
score (65%). There is no variation in the scores.
There is no need to invent statistics to describe test
results if all students receive the same test score.
In a perfect world with 10 students passing the test and 10 mastering
the lesson, a new statistic appears: the mean, or average, of the 20 scores, which is
75%. Each test score is 10 points away from (above or below) the class average
score of 75%.
The passing score and the mastery score pass directly
through the normal curve at the points where the curve changes direction from flexing
down from the mean to flexing up. Each of these two points lies one standard
deviation (SD) from the mean. The curve
can be described as 75% ± 10%. This standard measure is expected to contain
about 2/3 (68%) of the scores on a real student test.
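The arithmetic behind 75% ± 10% can be checked with a short Python sketch. It assumes the population form of the SD (dividing by the number of scores), which the post does not specify; the sample form (dividing by one less) would give a slightly larger value.

```python
import statistics

# Perfect-world model from the text: 10 passing scores and 10 mastery scores.
scores = [65] * 10 + [85] * 10

mean = statistics.mean(scores)

# Assumption: population SD (divide by N). The post does not say which form it uses.
sd = statistics.pstdev(scores)

print(f"mean = {mean}%, SD = {sd}")
print(f"one SD below the mean = {mean - sd}%, one SD above = {mean + sd}%")
```

Under that assumption, the two points one SD from the mean land on the passing and mastery scores themselves.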
Even though no student earned a score of 75%, this value
represents the average score for the entire test. This model score
distribution from 20 students in no way looks like the normal curve, the
distribution expected when scores include random error.
Random error injects variation into test results. Let’s say
one lucky student scored 90% right (an increase of 5 points) instead of 85%. Keeping
the example balanced requires one unlucky student to score 60% right (a
decrease of 5 points) instead of 65%. This stretches out the distribution (Chart
5).
But stretching increases the variation in the distribution.
The increase in variation can be balanced by two students scoring 70% (an
increase of 5 points) instead of 65% and another two students scoring 80% (a decrease
of 5 points) instead of 85%.
[It takes moving two scores closer to the mean to balance
one score moved further from the mean because the variation is expressed in squared
values. Distances from the mean change linearly, such as 1, 2, 3, 4, 5, but
the squared deviations from the mean grow as 1, 4,
9, 16, 25.
Squaring is used so that all values are positive, but it results
in a distorted distribution. The distance between 2 and 4 is a difference of 2.
The corresponding squared deviations, 4 and 16, differ by 12.]
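The balancing act described above can be followed in numbers. The sketch below rebuilds the three score lists from the description in the text (the chart data themselves are not reproduced here, so the lists are a reconstruction) and again assumes the population form of the SD.

```python
import statistics

# Reconstructed from the text; the names are illustrative, not from the charts.
perfect = [65] * 10 + [85] * 10                      # no random error
stretched = [60] + [65] * 9 + [85] * 9 + [90]        # one lucky, one unlucky student
balanced = [60] + [65] * 7 + [70] * 2 + [80] * 2 + [85] * 7 + [90]  # plus four scores moved inward

def sum_sq_dev(scores):
    """Sum of squared deviations from the mean (the 'variation')."""
    m = statistics.mean(scores)
    return sum((s - m) ** 2 for s in scores)

for name, scores in [("perfect", perfect), ("stretched", stretched), ("balanced", balanced)]:
    print(name, statistics.mean(scores), sum_sq_dev(scores),
          round(statistics.pstdev(scores), 2))
```

With these lists the mean stays at 75 throughout. Moving one score out from 10 to 15 points away adds 125 to the sum of squares, while moving one score in from 10 to 5 points away removes only 75, which is why it takes about two inward moves to offset one outward move. The offset is close rather than exact, so the SD of the balanced list comes out near, not precisely at, 10.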
Doubling the amount of error (Chart 6) brings the score
distribution closer to the normal distribution of error (the normal curve).
Again the standard deviation remains 10. The distribution now looks more like
traditional multiple-choice classroom test results. A bi-modal distribution was
very common in my remedial biology class. The score distribution can be made to
look even more like the normal curve by tweaking additional clusters of scores.
The normal curve does not describe the actual observed score
distribution. The normal curve always views a distribution through the lens of
three points: the mean, plus 1 SD and minus 1 SD.
A small SD means the distribution is narrow. A large SD means
the distribution is more spread out.
The SD is never concerned with the location of your
individual test score. Plus and minus 1 SD on the score scale is the region
where about 2/3 of the test scores are expected to fall. There is no way to
specifically predict where your score will actually fall, only the region in
which it will fall. To find your test score, you must take the test.
The Nursing124 test data (Table 2) will now be used to apply
the above concepts. In Chart 7, 15 of the 22 scores fall
within one SD of the mean. That is about 2/3, or 68%, which matches the expected
value of 68%.
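As a quick check of that comparison, the observed share (15 of 22) can be set beside the share of a normal curve lying within one SD of its mean. This sketch uses Python's standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Observed: 15 of the 22 Nursing124 scores fall within one SD of the mean.
observed = 15 / 22

# Expected: area under a normal curve between -1 SD and +1 SD.
# A standard normal curve (mean 0, SD 1) is enough, since the share
# within one SD is the same for every mean and SD.
standard = NormalDist(mu=0, sigma=1)
expected = standard.cdf(1) - standard.cdf(-1)

print(f"observed share: {observed:.3f}")
print(f"expected share: {expected:.3f}")
```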
[I learned from Chart 8 that a calculated normal curve for
discriminating items ignores the extreme values of 20% and 40%, as well as 0%
and 100%. However, these extreme values are the main contributors when
calculating the SD in the next post. The actual distribution has been reduced
to a numerical abstraction. I used the Excel function NORM.DIST, which
refers only to the mean and the SD.]
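The point that a calculated normal curve sees only the mean and the SD can be shown with a small sketch. Python's statistics.NormalDist(mean, sd).pdf(x) plays the role of Excel's NORM.DIST(x, mean, sd, FALSE) here. The two score lists are illustrative assumptions, not the Nursing124 data: the first is the perfect-world model from earlier in this post, and the second is a hypothetical single-peaked list built to share the same mean and population SD.

```python
import statistics
from statistics import NormalDist

two_peaks = [65] * 10 + [85] * 10
# Hypothetical list, constructed only so that mean = 75 and population SD = 10.
one_peak = [55] * 2 + [65] * 2 + [75] * 12 + [85] * 2 + [95] * 2

def fitted_curve(scores):
    """Normal curve built from nothing but the mean and the population SD."""
    return NormalDist(mu=statistics.mean(scores), sigma=statistics.pstdev(scores))

curve_a = fitted_curve(two_peaks)
curve_b = fitted_curve(one_peak)

# Height of each curve at a few points on the score scale.
for x in range(45, 106, 10):
    print(x, round(curve_a.pdf(x), 4), round(curve_b.pdf(x), 4))
```

The two lists look nothing alike as histograms, yet they produce the same calculated curve, because nothing except the mean and the SD enters the calculation.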
The total score normal curve is composed of the three (Mastery, 8 items; Unfinished, 8 items; and Discriminating, 5 items) sub-test normal curves (Chart 9). Every average score or mean can generate a normal curve. Visually the normal curve transmits more information in one view (subject to distortion by extreme values) about the raw score data than the average or the SD.
The uniqueness of each mark, student score, and item
difficulty has now been reviewed. Unless some strongly biasing factor is
involved, traditional multiple-choice (TMC) ignores most of these factors.
Provision is made in Break
Out (Sheet 2) and PUP 5.22 (Table
2) to edit and rescore the test when an item is found to be too flawed to
use or a spirited class discussion earns a point for everyone on the item.
Otherwise, the only thing that counts using TMC is right and wrong: 1 and 0.
[PCM
values are 0, 1, and 2 for wrong, judgment, and right counts. KJS values are 0, 0.5, and 1 for
wrong, judgment, and right counts. Both scoring methods maintain the same value
ratio for wrong, judgment, and right counts. Both promote student development of
high quality judgment.
TMC uses 0, 0.25, and 1 for four-option items, but this fact is hidden by forcing students to mark all items or accept a 0 for a blank. This promotes guessing. Knowledge Factor uses 0, 0.75, and 1, which inverts the value for TMC judgment. This demands high quality judgment in high-risk occupations and in serious preparation for standardized tests.]
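The four sets of values can be compared side by side with a short sketch. The weights are the ones listed above; the function name and the student's mark counts (4 wrong, 6 judgment, 10 right on a 20-item test) are hypothetical and chosen only for illustration.

```python
# (wrong, judgment, right) values as listed in the text.
SCHEMES = {
    "PCM": (0, 1, 2),
    "KJS": (0, 0.5, 1),
    "TMC": (0, 0.25, 1),
    "Knowledge Factor": (0, 0.75, 1),
}

def percent_score(wrong, judgment, right, scheme):
    """Points earned as a percent of the maximum possible under a scheme."""
    w, j, r = SCHEMES[scheme]
    items = wrong + judgment + right
    earned = wrong * w + judgment * j + right * r
    return 100 * earned / (items * r)

# Hypothetical student: 4 wrong, 6 judgment, 10 right marks.
for name in SCHEMES:
    print(f"{name}: {percent_score(4, 6, 10, name):.1f}%")
```

PCM and KJS give this student the same percent score because their values differ only by a factor of two. The other two schemes shift the result down or up according to how much a judgment mark is worth.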
The normal curve can only be accurately drawn from large
score distributions. It can be calculated for tests of any size, based on the test
mean and SD.
- - - - - - - - - - - - - - - - - - - - -
Free software to help you and your students experience and
understand the change from TMC to KJS (tricycle to bicycle):