Multiple-Choice Reborn: Student Quality

We need both quantity and quality, but if a choice must be made, quality generally wins, except in current academic testing. The traditional right-count-scored, forced-choice version of multiple-choice assessment ties quantity and quality together in one meaningless ranking. It extracts the least information from the answer sheets. Because of this, there have been many movements (fads) to improve assessment, from alternative assessment, authentic assessment, portfolios, projects, and reports to actual oral and visual presentations. In the end, traditional multiple-guess has always won out for some very good reasons: it is cheap, fast, and easy to do, and its results are highly reproducible.

Multiple-choice assessment does not have to be this way. Just change the instructions a bit and you have assessment at all levels of thinking, with results that are cheap, fast, easy to obtain, highly reproducible, and meaningful. Allowing students to accurately report (on multiple-choice tests) what they trust will be of use in further learning and instruction is not something new in 2012. Geoff Masters from Melbourne, Australia, developed the partial credit Rasch model (PCM), which is included in Winsteps, prior to 1982.

While teaching at Northwest Missouri State University, USA, I developed Knowledge and Judgment Scoring (KJS) in 1981, along with several thousand remedial biology students, to obtain an individualized written report from each student that accurately assessed what each student really knew (from lecture, laboratory, and assignments) and on which further meaningful learning could be built. As one faculty member working with pre-med students put it, “We know what they know and how well they know”. This method of scoring was crucial in providing the information needed to guide each student along the path from passive pupil to active, self-correcting scholar. It made it possible to serve a class of 120 remedial biology students more effectively and with less effort than a class of 24 students with “blue book” exams.

James Bruno made extensive studies in assessment at the University of California in Los Angeles. In 2005, Knowledge Factor patented an educational system (Confidence Based Learning – CBL, now Amplifier) based on his work with great success in the professional development and competency assessment area. Knowledge Factor sets the bar for quality at 75% or higher. KJS sets it at 50%. Traditional multiple-choice sets it at zero (passive scoring – when scoring the finished answer sheet) and at 25% for four-option questions (active scoring – when taking the test).

Both PCM and KJS produce the same test scores. Both also provide estimates of student quality. This elusive property is often discussed as something found only in “alternative assessments”, meaning alternatives to traditional multiple-choice (at least by the majority of uninformed and un-relearning educational reformers). Quality by alternative assessment is very subjective. Quality by PCM and KJS is not. Quality by PCM and KJS is also highly reproducible.

PCM and KJS produced comparable quality indicators on a remedial general biology test for four students who each had a 70% test score. The Student Normal (+) Output values in the table have been corrected by adding 25% to each value to match the Item Normal (+) Output value mean (see the full details in the 3 October 2012 post on the Rasch Model Audit blog, Rasch Model Student Ability and CTT Quality).

PCM and KJS Quality Indicators (four students, each with a 70% test score)
Method	Student 26	Student 37	Student 40	Student 44
KJS	81%	88%	88%	95%
PCM	68%	76%	76%	88%

These quality indicators cannot be expected to have exactly the same values, as they include different components. KJS divides the number of right answers by the total number of marks a student makes to estimate quality (% Right). The number of right marks is an indicator of quality. The KJS student test score is a combination of quantity and quality (PUP uses a 1:1 ratio that every student can understand). If a student elects KJS but ends up marking most of the questions, the KJS assessment automatically turns into a traditional right-mark-scored test with no penalties (except for the traditional 3 out of 4 wrong when forced to guess).
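The KJS arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not PUP's actual code: the function names are hypothetical, and the point values (2 for a right mark, 1 for an omit, 0 for a wrong mark) are an assumption based on the mark counts of 0, 1, and 2 and the 1:1 ratio mentioned in this post.

```python
def kjs_quality(right: int, wrong: int) -> float:
    """Quality (% Right): right marks divided by all marks the student made."""
    marked = right + wrong
    if marked == 0:
        raise ValueError("no questions were marked")
    return right / marked


def kjs_score(right: int, wrong: int, omitted: int) -> float:
    """Test score, ASSUMING 2 points per right mark, 1 per omit, 0 per wrong."""
    total = right + wrong + omitted
    if total == 0:
        raise ValueError("empty test")
    return (2 * right + omitted) / (2 * total)


# Hypothetical 50-question test: 30 right, 5 wrong, 15 omitted.
print(kjs_quality(30, 5))    # 30 right out of 35 marks made
print(kjs_score(30, 5, 15))  # (60 + 15) / 100 = 0.75

# A student who marks every question: the score is just the fraction right.
print(kjs_score(35, 15, 0))  # 70 / 100 = 0.7
```

Under these assumed point values, a student who omits nothing gets exactly the traditional right-mark score, which is the behavior described in the paragraph above.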

Knowledge Factor (KF) uses 3/4 for judgment and 1/4 for knowledge when working with high-risk occupations (it also uses three-option questions instead of four or five options). This sets the value of quality (judgment) at three times that of quantity (knowledge), which makes sense in such occupations. The examinee either knows or does not know (and is then coached and trained to seek help). No guessing is allowed when only mastery is the goal. Allowing one airliner to take off directly in the path of one landing is not a good thing.
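To see what a 3:1 weighting does compared with a 1:1 weighting, here is a rough Python sketch. Knowledge Factor's actual scoring formula is not given in this post, so everything here is an assumption made for illustration: the function name is hypothetical, "knowledge" is taken as the fraction of questions marked right, and "judgment" as the fraction of questions not marked wrong.

```python
def weighted_score(right: int, wrong: int, omitted: int,
                   judgment_weight: float = 0.75) -> float:
    """Blend knowledge and judgment with the given judgment weight.

    ASSUMPTIONS (not Knowledge Factor's published method):
      knowledge = right / total      (how much was marked right)
      judgment  = 1 - wrong / total  (how well wrong marks were avoided)
    """
    total = right + wrong + omitted
    if total == 0:
        raise ValueError("empty test")
    if not 0.0 <= judgment_weight <= 1.0:
        raise ValueError("judgment_weight must be between 0 and 1")
    knowledge = right / total
    judgment = 1 - wrong / total
    return (1 - judgment_weight) * knowledge + judgment_weight * judgment


# Hypothetical 50-question test: 30 right, 5 wrong, 15 omitted.
print(weighted_score(30, 5, 15, judgment_weight=0.75))  # 3/4 judgment, 1/4 knowledge
print(weighted_score(30, 5, 15, judgment_weight=0.50))  # 1:1 ratio
```

With the heavier judgment weight, each wrong mark costs three times what each right mark earns, so guessing is strongly discouraged under this reading.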

KJS and KF only see mark counts of 0, 1, and 2. Winsteps combines student ability and item difficulty into one PCM expected score. The perfect Rasch model, implemented by Winsteps, sees combined student ability and item difficulty as probabilities from zero to 1. A question ranks higher if marked right by more able students. A student ranks higher when marking more difficult questions than when marking easier questions. The end result is two comparable, but not exactly the same, estimates of quality from the two methods of analysis.
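The way a Rasch model turns ability and difficulty into a probability can be sketched as follows. This uses the simpler right/wrong (dichotomous) form of the Rasch model rather than the partial credit model, and it does not reproduce how Winsteps estimates anything; the function names and the sample ability and difficulty values (in logits) are made up for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a right mark under the dichotomous Rasch model.

    Both ability and difficulty are on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def expected_score(ability: float, difficulties: list[float]) -> float:
    """Expected number of right marks: the sum of the item probabilities."""
    return sum(rasch_probability(ability, d) for d in difficulties)


# When ability equals difficulty, the chance of a right mark is one half.
print(rasch_probability(0.0, 0.0))  # 0.5

# Hypothetical three-item test, from easy (-1.0) to hard (+1.0).
items = [-1.0, 0.0, 1.0]
print(expected_score(0.5, items))
print(expected_score(-0.5, items))
```

The more able student always has the higher expected score, and a harder item always has the lower probability of being marked right, which is how the two rankings described above arise.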

Knowledge Factor optimizes assessment and instruction for mastery in high-risk occupations. Winsteps, with PCM, is optimized for psychometricians and test makers. Both can be used in the classroom where mastery and the development of high-quality students are important, not just a topic of conversation (in contrast to just passing). This is in sharp contrast to the traditional failing classroom where instruction and learning are conducted at lower levels of thinking in preparation for NCLB standardized tests.

Knowledge and Judgment Scoring, as presented in Power Up Plus (PUP), is an adaptation of holding students sufficiently accountable that they develop the skills of the self-motivated, self-correcting scholar (question, answer, and verify). Facts change. The skills needed to learn and relearn only develop more with use in a non-threatening environment. PUP provides students with the opportunity to voluntarily choose to report what they trust when they are ready to do so (switching from lower levels to all levels of thinking). It does this by scoring both methods: traditional guess testing and KJS. Over 90% of the students I worked with made the change after the second exposure (after two times on their new risky bicycles, where they learned to balance [to be the judge of what they knew], they readily gave up their tricycles). This was a new and empowering experience for many students: “I can do this!”

I have promoted KJS for over 20 years. It provides much of the information now lost using traditional right-mark-scored (RMS) tests. It provides the guidance needed to move students from passive pupils to self-educating high achievers (including the current fad generally expressed as 21st century skills; these skills have always been important for master achievers). But in a highly threatening environment created by federal government bullying, multiple-choice has been given a very bad name. The willingness to take the risk of relearning that there are two very different multiple-choice assessment methods has been almost squelched.

Until KJS is offered on standardized tests, it still makes a great training ground for such tests, as it makes very clear to each student, during the test (an effective formative-assessment moment, when the student is willing and needs to know), what that student has yet to learn (and what each teacher may need to “reteach” to students willing to learn).

When a student understands, he can answer questions he has never seen before. Students who made the switch in my classes also found they were doing better in their other classes. Learning and reporting for your own empowerment is a lot more fun than learning for a classroom or standardized test conducted at the lowest levels of thinking (gambling for a passing score).

Both quantity and quality matter in alternative assessments, including PCM and KJS. They are assessed more easily and less expensively when multiple-choice is done right: PCM and KJS. Multiple-choice done right also promotes student development toward being a better learner: a weaver of relationships rather than a cataloger of isolated bits. Multiple-choice done right even guarantees mastery with KF.

Wednesday, August 15, 2012
