Wednesday, September 5, 2012

Your Choice of Multiple-Choice Testing

Doing PARCC, SBAC, waiver or no waiver? Your choice of multiple-choice makes a big difference in what you get for your money. Today you have a choice. You are not bound to the traditional, right count scored (RCS) version.

  • Traditional RCS multiple-choice works very well at the mastery level (90% cut score). It is an easy way to score classroom tests (75% average score and 60% cut score). Below a score of 60% it is gambling: a meaningless ranking in which the quantity and quality scores are identical.
  • Independent quantity and quality scored multiple-choice provides the same freedom for students to report what they trust they know and can do as when using short answer, essay, projects, and reports when scored at all levels of thinking. The quality score can range from that found on a RCS test up to 100%, independently from the quantity score: the number of right marks (both the examinee and the examiner know what the student knows and how well knowledge and judgment are used). No forced guessing or gambling is required, just fair and honest reporting.
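The difference between the two bullets above can be put in numbers. What follows is only a minimal sketch, not the scoring code of any product: it assumes the quantity score is right marks divided by questions on the test, and the quality score is right marks divided by the questions the student chose to mark. The names `Scores` and `score_marks` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Scores:
    quantity: float  # assumed: right marks / questions on the test
    quality: float   # assumed: right marks / questions the student marked


def score_marks(marks: Sequence[Optional[str]], key: Sequence[str]) -> Scores:
    """Score one answer sheet; None in `marks` means the question was omitted."""
    if not key or len(marks) != len(key):
        raise ValueError("marks and key must be non-empty and the same length")
    marked = sum(m is not None for m in marks)
    right = sum(m is not None and m == k for m, k in zip(marks, key))
    quantity = right / len(key)
    # A student who marks nothing has reported nothing; treat quality as 0 here.
    quality = right / marked if marked else 0.0
    return Scores(quantity, quality)


key = list("ABCDABCDAB")

# Marks every question (as RCS requires): six right, four wrong.
guesser = score_marks(list("ABCDABDCBA"), key)

# Marks only the six questions trusted, all right; omits the other four.
reporter = score_marks(list("ABCDAB") + [None] * 4, key)

print(guesser)
print(reporter)
```

Under these assumptions both students have the same quantity score (0.6), but the student who marked everything has a quality score equal to the RCS score (0.6), while the student who reported only what was trusted has a quality score of 1.0.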

You have several ways to implement the student empowering features of independent quantity and quality scoring.

  • Both right count scoring and Knowledge and Judgment Scoring (KJS) are featured in Power Up Plus (PUP). This is a classroom-friendly implementation. It allows students to select either right count scoring (at the lowest levels of thinking) or KJS (at all levels of thinking). Research has shown that after two experiences with KJS, over 90% of students switch to KJS. It takes a couple of experiences for them to see and believe that they do better by taking responsibility for what they know and can do than by just marking a test and hoping for good luck. They then change their study habits from memorizing random bits of nonsense (in the hope of a match on an RCS test) to making sense of each assignment, so they can correctly answer questions they have not seen before.
  • Winsteps, the software many states have used on NCLB testing, contains a Partial Credit Model (PCM) analysis feature. Using item response theory (IRT), it produces the same scoring as classical test theory (CTT) in PUP. It calculates the unexpectedness of each student mark. PUP now colors the student and test performance charts using the unexpectedness values from Winsteps.
  • Amplifire by Knowledge Factor contains the most powerful implementation of independent quantity and quality scoring. Instead of mixing quantity and quality, half and half, for a test score (as is done in PUP and Winsteps), Confidence Based Learning (the forerunner incorporated into Amplifire) mixed three parts quality with one part quantity (knowledge). This is justified in high-risk occupations and in rigorous academic training (mastery). Amplifire, a patented instructional system, includes fast response coaching delivered in such a timely manner that the seemingly impossible standard set by the high quality requirement can be met in a reasonable amount of time.
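The two mixes described above (half and half, and three parts quality to one part quantity) can be compared with a short sketch. It assumes the mix is a simple weighted average of the two scores; the actual formulas used in PUP, Winsteps, and Amplifire are not given here, and `blended_score` is a hypothetical name.

```python
def blended_score(quantity: float, quality: float, quality_weight: float = 0.5) -> float:
    """Weighted average of quality and quantity scores (each between 0 and 1).

    quality_weight=0.5 models the half-and-half mix;
    quality_weight=0.75 models three parts quality to one part quantity.
    Both are assumptions made for illustration only.
    """
    if not 0.0 <= quality_weight <= 1.0:
        raise ValueError("quality_weight must be between 0 and 1")
    return quality_weight * quality + (1.0 - quality_weight) * quantity


# One student: 60% of the test right, 90% of marked questions right.
quantity, quality = 0.60, 0.90

half_and_half = blended_score(quantity, quality, quality_weight=0.5)
three_to_one = blended_score(quantity, quality, quality_weight=0.75)

print(round(half_and_half, 3), round(three_to_one, 3))
```

For this student the half-and-half mix gives about 0.75 and the three-to-one mix about 0.825. The heavier quality weight rewards good judgment more strongly and penalizes careless marking more strongly.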

Today there is no reason to continue using traditional RCS multiple-choice tests when the average score falls below 75% and the cut score is below 60%. Below these points, the tests tell us nothing useful about student performance (other than a questionable rank on the test). From the same test, same preparation, same scanning, we can also get what each student actually knows and how well that knowledge or skill is used.

We get an insight into the level of thinking being used (teaching and learning) in the classroom. We get a student view of the test as well as teacher and test maker views. Misconceptions are distinguished from difficult questions. We get an insight into the development of each student: what levels of thinking are friendly and useful, which students are taking charge of learning and reporting (highly teachable, meaning makers), and which students are still waiting for the teacher to teach, to test, and to tell them how many right marks they got gambling on a traditional RCS test.

All of the above test benefits are also available at the state department of education level. One of the difficulties of holding students, teachers, schools, and state departments of education accountable (see prior post) has been the use of a ranking system that has had little to do with what students actually knew or could do at the cut score. The cut score was often selected for political reasons. A valid 90% passing at a mastery level was as unsettling as 24% passing because the scoring was too low, or 90% passing because the cut score was set too low. The additional information from independent quantity and quality scored testing reduces this problem.

The only change in standardized testing required is to offer Knowledge and Judgment Scoring, or PCM scoring, and RCS on the same test. My experience has been that this eliminates the politics of implementing a different scoring method. It is also crucial to allow students the freedom to make the choice that fits their development. This freedom is consistent with taking responsibility for selecting questions to mark when opting for Knowledge and Judgment Scoring or PCM scoring.
