The visual education statistics engine (VESE) is now capable
of producing a statistical signature for a course using traditional
multiple-choice (TMC) and Knowledge and
Judgment Scoring (KJS).
I selected two scenarios, each covering three consecutive tests. All items are
set for maximum discrimination (right and wrong marks are not mixed). All
student score distributions are normal.
Both courses start with an average score of 50% and end with an average score
of 70%. A standard deviation of 10% is considered normal and convenient for
setting grades.
The first scenario is a class that starts with students of
relatively equal abilities (Chart 36). As the course progresses the score
distribution widens. This is the
natural consequence of the better students doing better and the poorer students
lagging behind, a typical result when using TMC, which primarily ranks
students. [A good example of how evolution actually works: the self-empowered
survive.]
The second scenario is a class that starts with students
spread out widely (Chart 37). As the course progresses the score distribution narrows. This is the natural
consequence of good student development, one of the results of switching from
TMC to KJS, where students are empowered to report what they actually know and
trust as the basis for further instruction and learning.
The statistical signatures I found are shown in Charts
38 and 39. In the traditional class the test reliability (KR20), the average
item discrimination (PBR), the standard deviation (SD) and the standard error
of measurement (SEM) all increased in value. The controlling factor was the
spread of student scores.
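For readers who want to compute these four statistics themselves, here is a minimal Python sketch. It is not the VESE, only an illustration under stated assumptions: marks are stored one row per student and one column per item (1 = right, 0 = wrong), the SD is the population standard deviation of raw scores, PBR is the plain Pearson correlation between an item and the total score (the item itself is included in the total), and SEM is taken as SD x sqrt(1 - KR20). The name `signature_stats` and the sample marks are hypothetical.

```python
import math
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation; returns None when either list has no spread."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def signature_stats(marks):
    """Return KR20, average PBR, SD and SEM for a 0/1 mark table."""
    n_items = len(marks[0])
    totals = [sum(row) for row in marks]      # raw score per student
    var_total = pvariance(totals)             # population variance of scores
    if n_items < 2 or var_total == 0:
        raise ValueError("need at least two items and some spread in scores")
    sd = math.sqrt(var_total)

    # Item difficulty p (fraction right) and the sum of p * (1 - p).
    p = [mean([row[j] for row in marks]) for j in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)

    kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
    sem = sd * math.sqrt(max(0.0, 1 - kr20))

    # Items that everyone got right (or wrong) have no PBR and are skipped.
    pbrs = [pearson([row[j] for row in marks], totals) for j in range(n_items)]
    pbrs = [r for r in pbrs if r is not None]
    avg_pbr = mean(pbrs) if pbrs else None

    return {"KR20": kr20, "PBR": avg_pbr, "SD": sd, "SEM": sem}


# Hypothetical table: 5 students x 4 items, right and wrong marks not mixed.
marks = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(signature_stats(marks))
```

The SD and SEM here are in raw marks; divide by the number of items and multiply by 100 to express them as percentages.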
The SD captures the spread of student scores. In these two
scenarios the SD was set to increase or decrease with the
average student score, as required by the score distributions in Charts 36 and
37. [The two signatures are not perfect continuations due to rounding
errors and my inability to fit the 40 x 40 = 1600 marks under smooth normal
curves.]
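The fitting problem can be sketched in a few lines of Python. This is an illustration only, not the method used to build the charts. It assumes the 40 x 40 table means 40 students by 40 items, converts the 50% average and 10% SD into raw marks (20 and 4), and asks how many students a smooth normal curve would place at each whole-number score. The function name `expected_counts` is hypothetical.

```python
from statistics import NormalDist


def expected_counts(n_students, n_items, mean_pct, sd_pct):
    """Expected number of students at each raw score under a normal curve."""
    mu = n_items * mean_pct / 100       # average score in raw marks
    sigma = n_items * sd_pct / 100      # SD in raw marks
    curve = NormalDist(mu, sigma)
    # Each whole-number score s collects the area from s - 0.5 to s + 0.5.
    return [
        n_students * (curve.cdf(s + 0.5) - curve.cdf(s - 0.5))
        for s in range(n_items + 1)
    ]


counts = expected_counts(40, 40, 50, 10)
for score, c in enumerate(counts):
    if c >= 0.5:
        print(f"{score:2d} marks: {c:5.2f} students (rounded: {round(c)})")
```

The expected counts come out fractional, so whole students can only approximate the curve after rounding.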
Individual item discrimination (PBR) is not the controlling factor
as it has been set to the maximum for each item. [A visualization of individual item PBR and average item PBR is needed here. Lower individual PBR values result from
mixing right and wrong marks in an item mark pattern. Wider score distributions (larger SDs) make possible longer item mark patterns. An item mark pattern is visualized in the next post.]
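A small Python sketch can show the effect of mixing marks on an item's PBR. The ten student scores and both mark patterns below are hypothetical, and PBR is computed as the plain Pearson correlation between the item's marks and the students' scores. Both patterns have the same difficulty (five right marks out of ten); only the mixing differs.

```python
import math
from statistics import mean


def point_biserial(item_marks, scores):
    """Pearson correlation between 0/1 item marks and student scores."""
    mx, my = mean(item_marks), mean(scores)
    sxx = sum((a - mx) ** 2 for a in item_marks)
    syy = sum((b - my) ** 2 for b in scores)
    if sxx == 0 or syy == 0:
        raise ValueError("PBR is undefined without spread in marks and scores")
    sxy = sum((a - mx) * (b - my) for a, b in zip(item_marks, scores))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores (percent), students listed from highest to lowest.
scores = [80, 75, 70, 65, 60, 55, 50, 45, 40, 35]

# Unmixed: every right mark sits above every wrong mark.
unmixed = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Mixed: the same number of right marks, scattered through the ranking.
mixed = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

pbr_unmixed = point_biserial(unmixed, scores)
pbr_mixed = point_biserial(mixed, scores)
print(f"unmixed PBR: {pbr_unmixed:.2f}")
print(f"mixed PBR:   {pbr_mixed:.2f}")
```

With these numbers the unmixed pattern yields the higher PBR of the two.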
These statistical results are interesting. A traditional class
ends with a test showing increasing test reliability but a decreasing ability
to separate student performance, as the SEM grows. A class that ends with most
students empowered (to question, to find answers, and to verify) shows lower
test reliability but an increasing ability to separate student performance, as
the SEM shrinks. This makes sense.
These two scenarios also shed light on teacher
effectiveness. Both classes reached the traditional goal of mastery for schools designed for failure. The first did so, I would
imagine, under traditional instruction aimed at the center of
the class. The second would require either special attention to
lower-performing students or empowering most students to become self-correcting,
high-achieving learners, the goal of the Common Core State Standards (CCSS) movement.
- - - - - - - - - - - - - - - - - - - - -
Free software to help you and your students
experience and understand how to break out of traditional multiple-choice (TMC)
and into Knowledge and Judgment Scoring (KJS) (tricycle to bicycle):