Wednesday, April 28, 2010

My Lucky Score

Students and teachers are as interested in what the next test score will be as in the latest test score. Will it be at or above an expected score? What can be expected from luck? [YouTube]

The portion of the time each student will be lucky can be obtained from charts in the previous blog. These charts show the number of lucky scores obtained when the answer sheets were marked without looking at the test. 

The number of lucky scores becomes the expected frequency of lucky scores for each student. The bar graph becomes an uncluttered line graph.

On 4-option questions, a student can expect to receive a lucky test score of 15 out of 60, about 1/8th of the time (0.12), by just marking the answer sheet without looking at the test.

Half of the time, the lucky test score is expected to be 15 or less, and half of the time 15 or more. Students can increase their luck by deleting one or more answer options. The average lucky score becomes 20 when one option is deleted on each question. 
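The 0.12 figure can be checked with the binomial formula. This is a minimal sketch, assuming every mark is an independent guess with a 1-in-4 chance of being right; the function name `lucky_score_probability` is made up for this illustration.

```python
from math import comb

def lucky_score_probability(score, questions=60, options=4):
    """Chance of exactly `score` right marks from pure guessing (binomial model)."""
    p = 1 / options
    return comb(questions, score) * p**score * (1 - p)**(questions - score)

# Chance of a lucky score of exactly 15 out of 60 on 4-option questions
p15 = lucky_score_probability(15)

# Chance of 15 or less, and of 15 or more (both include the score of 15 itself)
at_or_below = sum(lucky_score_probability(k) for k in range(0, 16))
at_or_above = sum(lucky_score_probability(k) for k in range(15, 61))

print(round(p15, 2), round(at_or_below, 2), round(at_or_above, 2))
```

Both "15 or less" and "15 or more" count the score of 15 itself, so the two shares add up to a little more than 1.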

Students can turn luck on and off by the decisions they make and the chances they take. The Arkansas Algebra I (AAI) test contains sixty 4-option multiple-choice questions. How students take the test determines how difficult it will be. If students think of options not on the test, they make the test more difficult: a 4-option question becomes a 5-option question or more. They are going in the wrong direction.

Rather than picking a right answer, delete wrong answers and then guess. At the other extreme, if students can discard all but two options, on average, they can expect a lucky score of 30 out of the 60 questions, or 50%. [The higher order thinking skills needed to do this are promoted in the classroom by Knowledge and Judgment Scoring (KJS) and Confidence Based Learning (CBL). Students do not need to know “the right answers” to beat standardized tests. They need a practiced self-judgment.]
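The arithmetic behind these averages is just the number of questions divided by the number of options left to guess from. A small sketch, assuming the same number of options remains on every question (the function name is hypothetical):

```python
QUESTIONS = 60  # multiple-choice questions on the AAI, per the text

def average_lucky_score(options_left, questions=QUESTIONS):
    """Average number of right marks when guessing among `options_left` options."""
    return questions / options_left

# 5 options: the student has imagined an extra option; 2 options: two were discarded
for options_left in (5, 4, 3, 2):
    print(options_left, "options:", average_lucky_score(options_left), "out of", QUESTIONS)
```

Each discarded option raises the average, and each imagined extra option lowers it.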

The expected average score is a stable value between 15 and 20. Where each student’s (my) lucky score will fall around that average is not. There is no way to predict each student’s lucky score. That is what makes luck enticing. We can, however, predict very well the average lucky score and the range in which lucky scores will occur. Students can always pass the test with proper preparation.

The inability to predict individual student lucky scores is of little consequence with Confidence Based Learning (CBL), or the ACT and SAT, as chance has little effect at the mastery level of learning and performing. Chance does have a devastating effect when students with similar abilities are selected to pass or fail a test with raw scores below 50%. Using an average score protects teachers and schools. It has taken forced disaggregation of NCLB test scores to prevent the low performance of groups smaller than about 30 students from being masked by the high performance of other students.

Fair means chance will distribute scores in a “bell shaped curve” or under the “normal curve of error,” provided there are enough questions on the test. (The AAI, with 60 questions, has enough.) The curve has the name “normal” because this is what happens when you know nothing on the test, or mark the test without looking at the test booklet. It could be called the “know nothing curve”.
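How closely 60 guessed questions follow the bell curve can be checked by laying the exact chance of each lucky score next to the normal curve with the same average and spread. A rough sketch, assuming independent guesses on 4-option questions; the helper names are invented for this example.

```python
from math import comb, exp, pi, sqrt

n, p = 60, 0.25                 # questions and chance of a right guess
mean = n * p                    # average lucky score
sd = sqrt(n * p * (1 - p))      # spread (standard deviation) of lucky scores

def binomial(k):
    """Exact chance of k right marks from guessing."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal(k):
    """Height of the normal curve with the same mean and spread at score k."""
    return exp(-((k - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

# Largest difference between the two curves over all possible scores
largest_gap = max(abs(binomial(k) - normal(k)) for k in range(n + 1))

print("average:", mean, "spread:", round(sd, 2))
print("most lucky scores fall between", round(mean - 2 * sd, 1), "and", round(mean + 2 * sd, 1))
print("largest gap between the curves:", round(largest_gap, 4))
```

The average and the range are what can be predicted; where one student lands inside that range is not.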

On a multiple-choice test scored only by counting right marks, Right Mark Scoring (RMS), there are no qualification runs to put the best or the worst at the head of the pack. Instead, chance assigns each student a secret handicap, luck, on test day. The student with the least ability in your class may draw 20 points and the next student may only draw 10. This is fair with RMS rules as both students have an equal opportunity to draw. [YouTube]

Some people believe that tests, especially high-stakes tests, should not be games of chance. They let examinees report what they know, based on their own judgment. Both knowledge and judgment are scored, just as on projects, essays, job assignments, and reports.

Knowledge and Judgment Scored (KJS) tests and Confidence Based Learning (CBL) tests give you a quantity, quality and test score. This form of testing and learning, in the classroom, promotes the student development needed for your students to be winners on any test based on high quality work.

Next, the three games played on a multiple-choice playing field, from traditional RMS (guess testing) to obtaining accurate, honest and fair scores.

Monday, April 26, 2010

Multiple-Choice Lucky Scores

The news headlines could have been, “Cheat or Chance” or “Trick or Teach,” this past year. The cut score for passing a multiple-choice test, scored by only counting right marks, continued to fall. The traditional multiple-choice test scoring method was being pushed over a credibility limit.

Aug 11: “City students are passing standardized tests just by guessing”
Aug 17: “Guessing My Way to Promotion”
Sep 14: “Botched Most Answers on New York State Math Test? You Still Pass”
Sep 16: “Is any test reliable? CRCT? SAT? NAEP? ACT? Pick one”
Oct 31: “Duncan: States ‘set bar too low’”
Jan 11: “As School Exit Tests Prove Tough, States Ease Standards”

The 100-point 2009 Arkansas Algebra I (AAI) end-of-course test, mentioned in the last article, is a good example to examine to see how standardized testing actually works:

  1. Items for new AAI versions are trial-tested, in a current operational test, rather than field-tested on a selected sub-sample at a different time.
  2. A statewide Uniform Grading Scale is monitored for inflation by comparing the pass rate in school with the pass rate on the AAI.
  3. Arkansas has had a nearly perfect yearly increase in the AAI test score for the past nine years (see page 24 of 28).

The multiple-choice portion of the test is played on the traditional field of varying quality. At the high end, everyone knows what the examinee knows or can do, including the examinee. The scoring in Confidence Based Learning (CBL) plays in this region, as do the SAT and ACT when used to pick top quality winners.

Traditional Right Mark Scoring (RMS), used on the AAI, is played at the other, lower end of the field. The examinee guesses and waits for the test score, and even then no one knows what the student knows or can do, including the examinee.

Knowledge and Judgment Scoring (KJS) permits students to individualize their test to match their preparation. They can opt for RMS or for KJS. They can opt for the teacher to tell them what they have right, or for reporting what they know and trust is right. They can opt for lower or higher-order thinking.

Chance plays almost no part in CBL. Chance is the main determiner of lucky scores. [YouTube] This holds for any test using RMS, including the SAT, ACT, and end-of-course tests.

The effects of unaltered pure chance can be seen on tests such as the AAI when:

  1. The answer sheets are marked randomly without looking at the test booklet.
  2. The answer sheets have no erasures.
  3. No marking pattern is used such as wallpapering. Wallpapering reduces test anxiety by students agreeing, before the test, how they will mark forced-choice guesses (when they have finished reporting what they know and trust, but must not omit answers or leave blanks).
  4. Student judgment is absent or is given no value (RMS).

There are several ways to score the effects of chance on multiple-choice tests:

  1. Randomly mark 100 AAI answer sheets for the 60 multiple-choice questions.
  2. Use a quincunx board.
  3. Use the Excel function: BINOMDIST.

The quincunx board allows you to see chance in action: the force behind what is called creativity in Arts, Letters, and Politics, and error in Science, Math and Engineering. The quincunx board works well for normal classroom tests with about 25 students (balls) and 8 questions (9 bins). (Number each student. Run slowly. Have each student follow his/her ball as it falls into a bin. Repeat and compare results for an added effect.)
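For anyone without a board at hand, the classroom run can be imitated in a few lines. This is only a sketch, assuming each bounce off a peg is an even left-or-right toss (which matches guessing on 2-option questions); `run_quincunx` and its `seed` parameter are names chosen for this example.

```python
import random

def run_quincunx(balls=25, rows=8, seed=None):
    """Drop `balls` through `rows` of pegs; return how many land in each of rows + 1 bins."""
    rng = random.Random(seed)
    bins = [0] * (rows + 1)
    for _ in range(balls):
        # At each peg the ball goes left (0) or right (1); the bin is the number of rights.
        bin_index = sum(rng.randint(0, 1) for _ in range(rows))
        bins[bin_index] += 1
    return bins

# Two runs with different seeds, to compare results as suggested above
first_run = run_quincunx(seed=1)
second_run = run_quincunx(seed=2)
print(first_run)
print(second_run)
```

Changing the seed plays the part of running the board again: the middle bins tend to fill first, but the exact counts differ from run to run.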

The Excel function BINOMDIST can be set for almost any number of students and questions. A set of 100 answer sheets produces a surprisingly regular, bell-shaped distribution even though the right answer is expected by chance only 1/4th of the time.
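Readers without Excel can get the same numbers from a short script. The sketch below is a stand-in that mirrors the argument order of BINOMDIST(number_s, trials, probability_s, cumulative); it assumes independent guesses, and the 100 students and 0.25 chance are taken from the text.

```python
from math import comb

def binomdist(number_s, trials, probability_s, cumulative=False):
    """Python stand-in for Excel's BINOMDIST(number_s, trials, probability_s, cumulative)."""
    def pmf(k):
        return comb(trials, k) * probability_s**k * (1 - probability_s) ** (trials - k)
    if cumulative:
        return sum(pmf(k) for k in range(number_s + 1))
    return pmf(number_s)

STUDENTS = 100
# Expected number of students (rounded) receiving each lucky score on sixty 4-option questions
expected_counts = {score: round(STUDENTS * binomdist(score, 60, 0.25)) for score in range(61)}

# Text bar chart of the scores that at least one student in 100 is expected to get
for score, count in expected_counts.items():
    if count > 0:
        print(f"{score:2d} right: {count:2d} students " + "#" * count)
```

Because the counts are rounded to whole students, they may not add up to exactly 100.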

The graph of 4-option questions shows that no student can expect to pass the AAI by guessing. Classroom passing is set equal to 24 raw score points out of 100 points in Arkansas. The maximum lucky score on the sixty 4-option questions was 23, and that only happened for about 1 out of 100 students. The required passing cut score of 37 points for graduation in Arkansas is far beyond the reach of lucky scores. [YouTube]

But students can alter these results by exercising higher-order thinking skills. If students can, on average, discard one option on each question, they are then working with a 3-option question test. The classroom test equivalent of 24 raw score points can be passed with lucky scores. Some 17 (6 + 4 + 3 + 2 + 1 + 1) out of 100 students passed by guessing from the remaining three options. Students who do this are often referred to as “test wise.”
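The jump from "almost nobody" to "some 17 out of 100" can be reproduced by adding up the chances of every lucky score at or above 24. A minimal sketch, assuming independent guesses and the same number of remaining options on all 60 questions; `chance_of_at_least` is a hypothetical helper.

```python
from math import comb

def chance_of_at_least(cut_score, options, questions=60):
    """Chance of reaching `cut_score` or more right marks by guessing among `options`."""
    p = 1 / options
    return sum(
        comb(questions, k) * p**k * (1 - p) ** (questions - k)
        for k in range(cut_score, questions + 1)
    )

# 4 options: pure guessing; 3 options: one option discarded on each question
for options in (4, 3):
    share = chance_of_at_least(24, options)
    print(f"{options} options: about {100 * share:.1f} out of 100 students reach 24 by guessing")
```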

Students, teachers, test makers, and administrators can manipulate the effects of chance, for their benefit, in other ways.