Wednesday, March 11, 2015

Modernizing Standardized Test Scores

#13
A single standardized right-count score (RCS) has little meaning beyond a ranking. A knowledge and judgment score (KJS) from the same set of questions not only tells us how much the student knows or can do but also how well the student can judge when to use that knowledge and skill. A student with an RCS must be told what he/she knows or can do. A student with a KJS tells the teacher or test maker what he/she knows. An RCS becomes a token in a federally sponsored political game. A KJS is a base onto which students build further learning and teachers build further instruction.
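As a concrete illustration, here is a minimal sketch (not the scoring engine used for the tables below) of how the same answer sheet earns the two scores. The 1 / 0.5 / 0 weights mirror the simulation later in this post; actual KJS implementations, such as Power Up Plus, may weight judgment differently.

```python
# Minimal sketch: one answer sheet scored two ways.
# Weights follow the simulation in this post: right = 1,
# omitted (good judgment, no guess) = 0.5, wrong = 0.

def rcs(marks):
    """Right-count score: only right marks count; an omit scores the same as a wrong mark."""
    return sum(1 for m in marks if m == "right") / len(marks)

def kjs(marks):
    """Knowledge and judgment score: credit for accurately reporting what is known."""
    weights = {"right": 1.0, "omit": 0.5, "wrong": 0.0}
    return sum(weights[m] for m in marks) / len(marks)

# A student who knows 12 of 20 answers, omits 6, and misses 2:
sheet = ["right"] * 12 + ["omit"] * 6 + ["wrong"] * 2
print(f"RCS = {rcs(sheet):.0%}, KJS = {kjs(sheet):.0%}")
```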

Table 40. RCS
Table 41. KJS
The previous two posts dealt with student ability during the test. This one looks at the score after the test. I made four runs of the Visual Education Statistics Engine: Table 40. RCS; Table 41. KJS (simulated); and, after maximizing item discrimination, Table 42. RCSmax and Table 43. KJSmax.

Table 42. RCSmax
Table 43. KJSmax
Test reliability and the standard error of measurement (SEM), along with some related statistics, are gathered into Table 44. The reliability and SEM values are plotted on Chart 81 below.

Table 44
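For readers who want to reproduce the two statistics gathered in Table 44, here is a minimal sketch of the standard formulas, not the Visual Education Statistics Engine itself: Cronbach's alpha (which reduces to KR-20 when the marks are 0/1) and SEM = SD x sqrt(1 - reliability). The small mark matrix is made up for illustration, not the data behind Table 44.

```python
import statistics as st

def cronbach_alpha(matrix):
    """Cronbach's alpha (KR-20 for 0/1 marks). matrix: rows = students, columns = items."""
    k = len(matrix[0])                                    # number of items
    item_vars = [st.pvariance(col) for col in zip(*matrix)]
    total_var = st.pvariance([sum(row) for row in matrix])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def sem(matrix):
    """Standard error of measurement: SD of total scores times sqrt(1 - reliability)."""
    sd = st.pstdev([sum(row) for row in matrix])
    return sd * (1 - cronbach_alpha(matrix)) ** 0.5

# Made-up 4-student x 5-item mark matrix (1 = right, 0 = wrong):
marks = [[1, 1, 1, 1, 0],
         [1, 1, 1, 0, 0],
         [1, 1, 0, 0, 0],
         [1, 0, 0, 0, 0]]
print(f"alpha = {cronbach_alpha(marks):.2f}, SEM = {sem(marks):.2f} points")
```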
Students, on average, can reduce their wrong marks by about one half when they first switch to knowledge and judgment scoring. The most obvious effect of changing 24 of the 48 zeros to a value of 0.5, to simulate Knowledge and Judgment Scoring (KJS), was to reduce test reliability (0.36, red). Scoring both quantity and quality also increased the average test score from 64% to 73%.
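As a rough sketch of how such a simulation can be set up (the 24-of-48 figure comes from the actual run; the matrix and function names below are made up for illustration), every other wrong mark is changed from 0 to 0.5 to stand in for a question a KJS student would have left unmarked, and the average score is then recomputed. Reliability and SEM can be recomputed the same way with the functions in the previous sketch.

```python
def simulate_kjs(matrix):
    """Change every other wrong mark (0) to 0.5 to stand in for an omitted item under KJS."""
    out, zeros_seen = [], 0
    for row in matrix:
        new_row = []
        for mark in row:
            if mark == 0:
                new_row.append(0.5 if zeros_seen % 2 == 0 else 0)
                zeros_seen += 1
            else:
                new_row.append(mark)
        out.append(new_row)
    return out

def mean_percent(matrix):
    """Average test score as a percent of the maximum possible."""
    k = len(matrix[0])
    return 100 * sum(sum(row) for row in matrix) / (len(matrix) * k)

# Made-up 4-student x 5-item right-count matrix (1 = right, 0 = wrong):
marks = [[1, 1, 1, 1, 0],
         [1, 1, 1, 0, 0],
         [1, 1, 0, 0, 0],
         [1, 0, 0, 0, 0]]
kjs_marks = simulate_kjs(marks)
print(f"RCS mean = {mean_percent(marks):.1f}%, simulated KJS mean = {mean_percent(kjs_marks):.1f}%")
```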

Psychometricians do not like the reduction in test reliability. Standardized paper tests were marketed as "the higher the reliability, the better the test." With computers, CAT, and online testing, marketing has now moved to "the lower the standard error of measurement (SEM), the better the test" (green). The simulated KJS shows a better SEM (10%) than the RCS (12%). By switching the current emphasis from test reliability to precision (SEM), KJS now shows a slight advantage over RCS to test makers.
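The two trends are compatible because the SEM depends on both the score spread and the reliability: SEM = SD x sqrt(1 - reliability). The sketch below uses made-up standard deviations and an assumed RCS reliability (only the two SEMs and the 0.36 KJS reliability come from this post) to show how a less reliable score can still be the more precise one when its spread shrinks, as it does when half of the zeros become 0.5.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Illustrative numbers only: the 0.36 KJS reliability is from Chart 81;
# the RCS reliability and both standard deviations are assumed here.
print(f"RCS: SEM = {sem(17.0, 0.50):.1f}%")   # wider score spread, higher reliability
print(f"KJS: SEM = {sem(12.5, 0.36):.1f}%")   # narrower spread, lower reliability, smaller SEM
```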

Chart 80
Chart 80 shows the general relationships between a right-count score and a KJS. It is Chart 4/4 from the previous post, tipped on its side, with the 60% passing performance replaced by the average scores of 64% (RCS) and 73% (KJS). Again, KJS is not a giveaway. There is an increase in the score if the student elects to use his/her judgment. There is also an increase in our ability to know what a student actually knows, because the student is given the opportunity to report what is known, not just to mark an answer to every question (even before looking at the test).

Chart 81
Chart 81 expands Chart 80 using the statistics in Table 44. Statistically, there is little difference between a right-count score and a KJS. What is different is what is known about the student; the full meaning of the score. Right-count scoring delivers a score on a test carefully crafted to produce a desired on-average score distribution and cut score. THE TEST IS DESIGNED TO PRODUCE THE DESIRED SCORE DISTRIBUTION. KJS adds to this the ability to assess what students actually know and can do that is of value to them. The knowledge and judgment score assesses the complete student (quantity and quality).

Knowledge and Judgment Scoring requires appropriate implementation to have its maximum effect on student development. In my experience, the switch from RCS must be voluntary to promote student development. It must come with a change in the level of thinking, and in the related study habits, in which the student assumes responsibility for learning and reporting. At that point students feel comfortable changing scoring methods. They like the quality score. It reassures them that they really can learn and understand.

KJS no longer has a totally negative effect on current psychometrician attempts to sharpen their data-reduction tools. But there are still the effects of tradition and project size. The NCLB movement failed, in part, because low-performing schools mimicked the standardized tests rather than tending to teaching and learning. Their attempt to succeed was counterproductive. Doing more of the same does not produce different results. These schools could also be expected to mimic standardized tests offering KJS.

The current CCSS movement is based on the need for one test for all, in an attempt to get valid comparisons between students, teachers, schools, and states. The effect has been gigantic contracts that only a few companies have the capacity to bid on, and little competition to modernize their test scoring.

KJS is then a supplement to RCS. It can be offered on standardized tests. As such, it updates the multiple-choice test to its maximum potential, IMHO. KJS can be implemented in the classroom, by testing companies, and by entrepreneurs who see the mismatch between instruction and assessment.


Knowledge Factor has already done this with their patented learning/assessment system, Amplifire. It can prepare students online for current standardized tests. Power Up Plus is free for paper classroom tests. (Please see the two preceding posts for more details related to student ability during the test.)