<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6676724996771468267</id><updated>2011-12-21T06:00:07.772-08:00</updated><category term='test'/><category term='qualms'/><category term='scoring clicker data'/><category term='scoring'/><category term='grading'/><category term='NCLB'/><category term='grading clicker data'/><category term='bubbling'/><category term='cut score'/><category term='multiple choice'/><category term='percent passing'/><category term='TAKS'/><category term='clicker'/><category term='data'/><category term='multiple-choice'/><category term='bubbles'/><title type='text'>Multiple-Choice Reborn</title><subtitle type='html'>This blog explores the advantages and disadvantages of classical right mark scoring (RMS), Knowledge and Judgment Scoring (KJS) and Confidence Based Learning (CBL) when used to set grades (cut points) and to promote student and employee development.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><generator 
version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>18</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-1774992643629938740</id><published>2011-12-21T06:00:00.000-08:00</published><updated>2011-12-21T06:00:07.821-08:00</updated><title type='text'>True Score Diviner</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;br /&gt;&lt;div class="MsoNormal"&gt;The 
previous few posts have listed weaknesses in traditional multiple-choice right mark scoring (RMS). Apart from producing a rank, which is of increasingly questionable value as test scores decrease, RMS results are seriously flawed for use in current formative assessments. In RMS, quality and quantity are still linked. They are not linked in projects, reports, and essay tests. Even on a failing project, there can still be a note, “Great use of color”; “Great idea, another bit of editing and a great paper”.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Knowledge and Judgment Scoring (KJS) does the same thing with multiple-choice tests: “You got a quality score of 90% on the questions you selected to mark. Now make the same preparation on more of the assignment and you will have a passing score. We know you can do it!”&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;RMS test scores are always suspect and often meaningless. The True Score Diviner can help you find your true score or, if your score is your true score, the range of test scores you may have gotten with the same preparation.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-3RKeoXkYELw/TtFZr8l2ARI/AAAAAAAAARQ/VhlqwA2KguY/s1600/Diviner.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="320" src="http://1.bp.blogspot.com/-3RKeoXkYELw/TtFZr8l2ARI/AAAAAAAAARQ/VhlqwA2KguY/s320/Diviner.jpg" width="237" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;At 100%, your test score and true score are one and the same. With a test score of 25% on a test of 4-option questions, your true score could range from 25 - 25, or zero, to 25 + 25, or 50%. 
Half of the time RMS cheats you and half of the time it teases or lies to you. You have a lucky day or an unlucky day. There is no way to know which or how much from a single test. Statistical procedures say very little about single events strongly related to luck. They can help if you took about five versions of the test and calculated an average test score. You do not do that.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Knowledge and Judgment Scoring (KJS) solves this problem by letting you report what you know and trust. You are, in effect, scoring your own test based on your own preparation. Each student gets a customized test. Guessing is not required.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Now both student and teacher know what every high quality student knows and can do that can be trusted as the basis for further instruction, learning, and application regardless of the test score. Quantity and quality each generate separate scores.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;We need to promote Knowledge and Judgment Scoring (KJS). &lt;a href="http://www.nine-patch.com/"&gt;Power Up Plus (PUP)&lt;/a&gt; does this by offering students both RMS and KJS. They make the switch when they have matured enough in a supportive classroom that places equal emphasis on knowing and on the skills required by the successful independent achiever. I know. I don’t know. I know how to know.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;RMS today makes as much sense as selling gasoline at $3 per gallon from a pump that averages one gallon for each $3. It may deliver less than ½ gallon to over two gallons for each $3. But it does deliver an average of $3 per gallon if you sum all the customers for the day. That is a range of less than $1.50 to over $6.00 per gallon. 
Such a situation in academic measurement still goes, for the most part, unquestioned.&lt;/div&gt;&lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-1774992643629938740?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/1774992643629938740/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/12/true-score-diviner.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/1774992643629938740'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/1774992643629938740'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/12/true-score-diviner.html' title='True Score Diviner'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-3RKeoXkYELw/TtFZr8l2ARI/AAAAAAAAARQ/VhlqwA2KguY/s72-c/Diviner.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-6950305435490706649</id><published>2011-12-14T06:00:00.000-08:00</published><updated>2011-12-14T06:00:09.481-08:00</updated><title type='text'>Is Student Debriefing Hacking?</title><content type='html'>
&lt;!--StartFragment--&gt;  &lt;br /&gt;&lt;div class="MsoNormal"&gt;“How did the test go?”&lt;span style="mso-spacerun: yes;"&gt;&amp;nbsp; &lt;/span&gt;“Fine.” This common exchange is heard after every standardized test. It does not disclose the content of the test, the questions on the test, or the score. It is more wish than fact. It reveals nothing that is meaningful to student and teacher; a frequent end result of NCLB standardized testing.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The current trend in revising the Elementary and Secondary Education Act (ESEA) is to add tests within the course to the final test. This is promoted as formative testing. Unfortunately formative testing requires timely feedback. Computers can provide non-judgmental timely feedback. 
This gave rise to the &lt;a href="http://www.edu-soft.org/"&gt;Educational Software Cooperative, Inc.&lt;/a&gt;, a non-profit. Learning at higher levels of thinking (question/answers/verify) provides effective self-motivating feedback. A standardized test that only returns a test score several weeks later has little if any formative testing content. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The new within-course tests are actually an expansion of predatory testing. Predatory testing crowds out instructional/learning time. It unfortunately encourages lengthy test preparation at lower levels of thinking by the very schools that most need higher levels of thinking instructional/learning time. It encourages a short-term fix rather than a long-term solution (rote over understanding).&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The classroom teacher has several options:&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo3; tab-stops: list .5in;"&gt;Devote little, if any, time to test preparation. Conduct the classroom in such a manner that the standardized test is, as knowledgeable students put it, “No big deal.” &lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo3; tab-stops: list .5in;"&gt;Prepare students to take the test at higher levels of thinking by using &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring (KJS)&lt;/a&gt; on projects and classroom essay and multiple-choice tests.&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo3; tab-stops: list .5in;"&gt;Continue lengthy test preparation at lower levels of thinking (which in my opinion should be outlawed and recognized as a trait of incompetent school administration). 
&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;One way of making ESEA standardized tests function as formative assessments is to debrief students shortly after the test. High scoring classes can do this very informally for the first teacher option above.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Less successful classes, at higher levels of thinking, can collect the topics students find puzzling. High quality students have good judgment in determining what they know and what they have yet to learn.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;At lower levels of thinking, students and teachers are most interested in the right answer for each question: A or B or C or D. Debriefing at this level, in my opinion, is as meaningless as reading off the answers to an in-class test.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Each of the above levels penetrates closer to the actual question stem and answer options. The concept of “fair use”, when applied to standardized test questions, requires that whatever is done, it must not reduce the market value of the test. It must not be for profit. It must only be of benefit to the participating students. The actual test questions must not be discussed. They must remain secret. Debriefing is then restricted to a one-time affair. Debriefing is of decreasing value to students performing at higher levels of thinking down to lower levels of thinking. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Student debriefing is hacking:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo2; tab-stops: list .5in;"&gt;It is      a violation of copyright. 
(Fair use of copyrighted material does not      include disclosing or direct copying of a standardized test question. A      standardized test question is used to make comparative assessments [the      common items must be protected]. By its very nature, it must be kept      secret or its market value is affected. What portion can be copied or referenced      is open to interpretation*.) &lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo2; tab-stops: list .5in;"&gt;It      promotes the sale of test question answers. (Informal and higher levels of      thinking debriefing do not require the exact question stem nor the      question answers. Any attempt to recall exact question stems and answers is      of limited use as good standardized tests scramble the answer options,      edit the question stems, and replace a portion of the questions between      each test. Computer adaptive tests [CAT] do much of this during each      student application – no two students even get the same test.)&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Student debriefing is not hacking:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l2 level1 lfo1; tab-stops: list .5in;"&gt;It      makes a formative assessment out of predatory testing.&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l2 level1 lfo1; tab-stops: list .5in;"&gt;Debriefing      with a test company provided summary lesson plan, listing topics with      model test questions, would not be hacking.&lt;span style="mso-spacerun: yes;"&gt;&amp;nbsp; &lt;/span&gt;For a test of 30 questions covering 6 topics, the 6      topics could be listed with a model question for each topic.&lt;span style="mso-spacerun: yes;"&gt;&amp;nbsp; &lt;/span&gt;The model questions could be ones      released from past tests. 
In-class scoring of this summary test would provide immediate feedback for students and teachers. This formative assessment lesson plan would increase the test’s market value. &lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;*At one extreme, the Georgia Professional Standards Commission bans any mention of, reference to, or discussion of test questions. Students take the test and close the booklets. The closed booklets are collected and returned. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;At the other extreme, parents of students who have learning problems can view the test booklets. This is justified as “fair use” as it provides parents some idea of what the student should have been able to do.&lt;span style="mso-spacerun: yes;"&gt;&amp;nbsp; &lt;/span&gt;It is of help in educating the student. It is not for profit. This one-time use applies only to the parent/school/student relationship. 
It is therefore not a breach of security.&lt;/div&gt;&lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-6950305435490706649?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/6950305435490706649/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/12/is-student-debriefing-hacking.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/6950305435490706649'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/6950305435490706649'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/12/is-student-debriefing-hacking.html' title='Is Student Debriefing Hacking?'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-4523766603423039119</id><published>2011-12-07T06:00:00.000-08:00</published><updated>2011-12-07T06:00:04.348-08:00</updated><title type='text'>Is Wallpapering Hacking?</title><content type='html'>Hacking, in the beginning, was an honorable tradition of learning how to control and use a computer, for something useful, without having access to machine and language manuals. It was playing (question/answers/verify; just as is done in putting a puzzle together). It was pioneering. It was empowering. 
It was fun.&amp;nbsp;Over time, “hacking” became all of the above, but with malicious intent. A few bad apples tarnished the image of the bright and the bold.&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Dumb wallpapering (marking the same option, “C” for example, whenever you do not know) does not improve test results or student scores. Smart wallpapering, creating a unique answer pattern PRIOR to seeing the test, yields improved KJS results. It can rather uniformly alter student scores.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-uYNbbCIOo5M/TtFTPusL65I/AAAAAAAAARI/viRKMpIxSlc/s1600/Scores.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="193" src="http://2.bp.blogspot.com/-uYNbbCIOo5M/TtFTPusL65I/AAAAAAAAARI/viRKMpIxSlc/s320/Scores.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;When the wallpaper contains a right answer, everyone who uses the wallpaper mark gets a right answer. This holds for low quality and high quality students. This is fair. The class, the team, wins or loses together. This is the same level at which standardized Dumb Testing data make sense for ranking classes and schools.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpapering reduces test stress by reducing the time and effort wasted on trying to find the “best answer” to a question you cannot read or understand, let alone have an answer in mind for. 
&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpapering is hacking:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo2; tab-stops: list .5in;"&gt;It restricts a wrong mark to one option per question. (The mathematical model for Dumb Testing assumes that a student randomly marks wrong answers. This is not true. The model also sets the starting test value at zero. This is not true. On a test of 4-option questions, the starting value is 25%, on average.)&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo2; tab-stops: list .5in;"&gt;Students are acting in collusion. (It makes no difference if individual students decide before the test or during the test which option to mark when forced to mark. Wallpapering requires the selection to be made BEFORE seeing the test.)&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpapering is not hacking:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo1; tab-stops: list .5in;"&gt;It only formalizes the advice students have been given for decades: “Mark ‘C’ if you cannot select a ‘best’ answer”.&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo1; tab-stops: list .5in;"&gt;It does not change Dumb Testing standardized test scores.&amp;nbsp;&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-4523766603423039119?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://richard-hart.blogspot.com/feeds/4523766603423039119/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/12/is-wallpapering-hacking.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/4523766603423039119'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/4523766603423039119'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/12/is-wallpapering-hacking.html' title='Is Wallpapering Hacking?'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-uYNbbCIOo5M/TtFTPusL65I/AAAAAAAAARI/viRKMpIxSlc/s72-c/Scores.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-8181481635658021753</id><published>2011-11-30T06:00:00.000-08:00</published><updated>2011-11-30T06:00:04.198-08:00</updated><title type='text'>Smart Wallpaper Testing</title><content type='html'>The idea for wallpaper came from a simple fact. Students need protection from predatory testing. Know or not, they must mark an answer to each question. Birds fly in flocks and fish swim in schools. They do the same thing at the same time to avoid predators. 
Wallpaper lets students mark &lt;a href="http://youtu.be/UdX5dXGjFAI"&gt;the same option&lt;/a&gt; when they cannot use the test to report what they know.&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Two wallpaper patterns can be used to extract higher levels of thinking (Smart Testing) information. Dumb wallpaper is based on one of the answer options. Smart wallpaper can be based on the most frequent wrong mark for each question, for example.&amp;nbsp; Dumb wallpaper pays no attention to student performance. Smart wallpaper is based on expected student performance.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-tJsU7xLG9Us/TtFGT2m-ZuI/AAAAAAAAAQw/OZPIKVnfsYk/s1600/3bST.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="155" src="http://4.bp.blogspot.com/-tJsU7xLG9Us/TtFGT2m-ZuI/AAAAAAAAAQw/OZPIKVnfsYk/s200/3bST.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;Wallpaper extracts higher levels of thinking (Smart Testing) information using &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; (KJS). The assumption is that students omit or use the wallpaper pattern when not using the question to report what is known and trusted. 
This can be seen in the progression from KJS without wallpaper, Table 3bST,&amp;nbsp;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-bbcCQ7bKjIo/TtFGz2j_AzI/AAAAAAAAAQ4/IvKecdj2K_c/s1600/3bSD.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="155" src="http://2.bp.blogspot.com/-bbcCQ7bKjIo/TtFGz2j_AzI/AAAAAAAAAQ4/IvKecdj2K_c/s200/3bSD.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;KJS with Dumb wallpaper, Table 3bSD, &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-bxaC1ugcWSk/TtFHRVh8cLI/AAAAAAAAARA/tzoC8CXXlNI/s1600/3bSS.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="155" src="http://3.bp.blogspot.com/-bxaC1ugcWSk/TtFHRVh8cLI/AAAAAAAAARA/tzoC8CXXlNI/s200/3bSS.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;and KJS with Smart wallpaper, Table 3bSS.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The student counseling mark matrix analysis (the test taker view of the test) changes from nonsense, to a better performance with Dumb wallpaper, to a typical Knowledge and Judgment Scoring (KJS) printout with Smart wallpaper.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Test scores increase as the simulated quality increases. The distributions (Standard Deviations) of scores and item difficulty decrease. Test reliability declines!&amp;nbsp; Oops!&amp;nbsp; “Houston, we have a problem!” Test companies optimize (brag about) their test reliability based on poor quality data. KJS optimizes student judgment to produce accurate, honest, and fair data. 
&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-w0m0p21UWr0/TtE2CmemnSI/AAAAAAAAAQo/WYUt60LyfhE/s1600/TPPa.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="156" src="http://1.bp.blogspot.com/-w0m0p21UWr0/TtE2CmemnSI/AAAAAAAAAQo/WYUt60LyfhE/s320/TPPa.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This table clearly captures this conflict in numbers. High test reliability is needed to obtain similar consecutive average test scores. It follows that the lower the quality of student scores and the lower the average test score, the more chance determines the average test score. It is also known that the normal curve is highly reproducible by chance alone. High test reliability can become an artifact of test design rather than student performance.&lt;br /&gt;&lt;br /&gt;To the fact that the starting score on a multiple-choice test is 1/(number of options) rather than zero, we can now add a second form of self-deception (psychometricians refer to these as simplifications). These simplifications made some sense when everything was done with paper and pencil. Today there is no need to keep quality and quantity locked together on a multiple-choice test, especially now that one method (KJS) can measure what students actually know and trust rather than just rank students (as RMS does).&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The misconceptions in Table 3bST are artifacts created by forcing students to mark when they have no answer of their own. They were not given the option to omit (to mark an accurate, honest and fair answer sheet). Table 3bSS, using Smart wallpaper, shows all four groups of questions (expected, discriminating, guessing, and misconception – EDGM). 
Higher quality students earn higher test scores that are more accurate, honest and fair.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The scores in Table 3bSS are only obtainable if students omitted instead of marking the most frequent wrong mark for each question. This simulation fails to capture what students would actually do if given the opportunity to mark only when marking reports something they know and trust (can confirm).&amp;nbsp; Given that opportunity, some quality scores would be higher and some lower. Also, there is no way to know in advance which wrong mark will be the most frequently marked for each question. Wallpaper must be created BEFORE the test, not after the test.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;This simulation again demonstrates there is no way of equating RMS and KJS results from one set of data. To know what students actually know, they must be given the opportunity to report what they know that is meaningful and useful as the basis for further learning, instruction, and use on the job. Traditional RMS only does this when test scores are near 90%. 
&lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; (Smart Testing) yields a valid quality score (%RT) for every test score, a valid test score for every high quality (%RT) student performance.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-8181481635658021753?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/8181481635658021753/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/11/smart-wallpaper-testing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/8181481635658021753'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/8181481635658021753'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/11/smart-wallpaper-testing.html' title='Smart Wallpaper Testing'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-tJsU7xLG9Us/TtFGT2m-ZuI/AAAAAAAAAQw/OZPIKVnfsYk/s72-c/3bST.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-3609678527954456669</id><published>2011-11-26T08:09:00.000-08:00</published><updated>2011-11-26T08:28:17.852-08:00</updated><title type='text'>Wallpaper Modified Testing</title><content type='html'>&lt;div 
class="MsoNormal"&gt;The minimum requirement for traditional multiple-choice tests is to mark one option on each question, right mark scoring (RMS). The student is not given the option to omit. The test score indicates luck, guessing, and what the student may know. The score only ranks the student. After experiencing &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; (KJS) my students called traditional testing Dumb Testing. Dumb Testing is easy and fast. Reading all the test questions is optional.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Smart Testing (KJS) requires that each question stem is read and visualized (a web of relevant relationships) before looking at the answer options. If the student’s answer matches one of the answer options, that option is probably the right answer for the question. The student has brought something to the test that can be reported using this question.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Knowledge and judgment can be given &lt;a href="http://www.nine-patch.com/qstart.htm"&gt;equal value&lt;/a&gt;.&amp;nbsp;The test score is a combination of the knowledge and judgment scores (the quantity and quality scores). Forced guessing is not required. The result is an accurate, honest and fair test score.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Changing from Dumb Testing to Smart Testing requires some experience. This is much like changing from a tricycle to a bicycle. It is scary the first few times. After that it is fun. Over 90% of students voluntarily switch from Dumb Testing to Smart Testing after two experiences. 
&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Until Smart Testing is offered on NCLB standardized tests, there is a way to modify Dumb Testing to obtain Smart Testing information. It comes from wallpapering the answer sheet. It requires a third key (WP KEY&amp;nbsp;&amp;nbsp; ) for the wallpaper.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The trick is to assign one answer option on each question as the “omit” option BEFORE seeing the test. Students mark only if they can trust the answer to be correct. Instead of, “mark a best answer on each question”, now students only, “mark answers you can use to report what you trust you know or can do”. Near the &lt;a href="http://youtu.be/UdX5dXGjFAI"&gt;end of the test&lt;/a&gt;, they fill in the remaining marks following the wallpaper design.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The simplest design is the age-old advice: “If you do not know an answer, just mark C”. Any letter option can be selected for the class PRIOR to seeing the test.&amp;nbsp; &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The next most frequent design students have used is the “Christmas tree”: ABCDABCD . . .&amp;nbsp; and AABABCABCDAAB . . .. Random designs can be used if the pattern is posted for all the students to use at the end of the test.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpapering does not change Dumb Testing (RMS) test scores. Changing a wrong mark to a wallpapered omit is still “wrong” with traditional right mark scoring (RMS). 
&amp;nbsp;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-l-NcRxKhK-0/TtEHXiOeEHI/AAAAAAAAAQQ/15_NgGL2ezI/s1600/3aDT.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="163" src="http://3.bp.blogspot.com/-l-NcRxKhK-0/TtEHXiOeEHI/AAAAAAAAAQQ/15_NgGL2ezI/s200/3aDT.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;Right Mark Scoring&amp;nbsp;&lt;a href="http://richard-hart.blogspot.com/2011/08/scoring-clicker-data.html"&gt;clicker data&lt;/a&gt; with no wallpaper. &amp;nbsp;&amp;nbsp;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-1-U-W8L2cN0/TtEKA12rvII/AAAAAAAAAQY/QpZgM2WeHS8/s1600/3aDD.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="163" src="http://1.bp.blogspot.com/-1-U-W8L2cN0/TtEKA12rvII/AAAAAAAAAQY/QpZgM2WeHS8/s200/3aDD.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;Right Mark Scoring clicker data with Dumb wallpaper (based on any single answer option). &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-w0Jk1uHqEHU/TtELkaZlSNI/AAAAAAAAAQg/EJzbtDXykF0/s1600/3aDS.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="163" src="http://1.bp.blogspot.com/-w0Jk1uHqEHU/TtELkaZlSNI/AAAAAAAAAQg/EJzbtDXykF0/s200/3aDS.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;Right Mark Scoring clicker data with Smart wallpaper (based on student judgment). 
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Commercial testing companies can still score the tests to produce traditional Dumb Testing student and school rankings.&lt;br /&gt;&lt;br /&gt;Wallpapering does change Smart Testing (KJS) test scores.&amp;nbsp;&lt;a href="http://www.nine-patch.com/"&gt;Power Up Plus (PUP)&lt;/a&gt; then extracts quantity and quality Smart Testing values (including test maker and test taker views). (See next post.)&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-3609678527954456669?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/3609678527954456669/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/11/wallpaper-modified-testing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/3609678527954456669'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/3609678527954456669'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/11/wallpaper-modified-testing.html' title='Wallpaper Modified Testing'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/-l-NcRxKhK-0/TtEHXiOeEHI/AAAAAAAAAQQ/15_NgGL2ezI/s72-c/3aDT.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-2536210747442048156</id><published>2011-08-24T06:00:00.000-07:00</published><updated>2011-08-31T10:11:43.579-07:00</updated><title type='text'>Standardized Testing - Structure, Function, and Operation</title><content type='html'>The &lt;b&gt;structure&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;, &lt;/span&gt;&lt;b&gt;function&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; and &lt;/span&gt;&lt;b&gt;operation&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; of standardized testing must all be considered when evaluating the usefulness of test results. Standardized test results are not always what they are claimed to be. When mixed with politics, they usually have even less value, as will be discussed near the end of this post.&lt;/span&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Standardized testing involves test &lt;b&gt;score distributions&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; (statistical tea leaves). Their two most easily recognized characteristics are the &lt;/span&gt;&lt;b&gt;average&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; score, or mean, and the &lt;/span&gt;&lt;b&gt;spread&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; of the distribution, or &lt;a href="http://raschmodelaudit.blogspot.com/2010/12/standard-units.html"&gt;standard deviation&lt;/a&gt; (SD). &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Two methods of obtaining score distributions are now in use. The &lt;b&gt;traditional method&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;, counting right marks on a multiple-choice test, is the same as used on most classroom tests. 
The &lt;/span&gt;&lt;b&gt;&lt;a href="http://raschmodelaudit.blogspot.com/2010/12/perfect-rasch-model.html"&gt;Rasch model&lt;/a&gt; method&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;, used by many state education departments, converts test results to estimated &lt;/span&gt;&lt;b&gt;&lt;a href="http://raschmodelaudit.blogspot.com/2011/01/rasch-estimated-measures.html"&gt;measures&lt;/a&gt;&lt;/b&gt; of student ability and of item difficulty.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The value of multiple-choice test results depends upon how the test is administered. Both methods allow for two modes: &lt;b&gt;forced choice&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;, mark every answer, and &lt;/span&gt;&lt;b&gt;student choice&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; of items that can be used to report what the student trusts that is useable as the basis for further learning and instruction.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The following table relates the above four combinations to two software programs, the fixed, reproducible, &lt;b&gt;structures&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; that produce score distributions. 
&lt;a href="http://www.nine-patch.com/"&gt;Power Up Plus (PUP)&lt;/a&gt; and &lt;a href="http://winsteps.com/winsteps.htm"&gt;Winsteps&lt;/a&gt; produce score distributions for classroom use and for standardized testing.&amp;nbsp; &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-rgLKJjdXu8M/TkHI_wwHm_I/AAAAAAAAAQA/_LnAisO_DgE/s1600/Table1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="171" src="http://2.bp.blogspot.com/-rgLKJjdXu8M/TkHI_wwHm_I/AAAAAAAAAQA/_LnAisO_DgE/s400/Table1.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div style="text-align: -webkit-auto;"&gt;Three of the four modes produce traditional right count quantitative score distributions: Quantity Scores. KJS adds a quality score that is &lt;a href="http://raschmodelaudit.blogspot.com/2011/08/pup-quality-and-winsteps-measures.html"&gt;comparable&lt;/a&gt; to the full credit mode measure distribution.&lt;/div&gt;&lt;/div&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;&lt;br clear="ALL" style="page-break-before: always;" /&gt; &lt;/span&gt;  &lt;br /&gt;&lt;div class="MsoNormal"&gt;The distribution of scores from a traditional multiple-choice test can be a good indicator of &lt;b&gt;classroom performance&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; (teacher and student). As a standardized test, only counting right marks places as much, if not more, emphasis on &lt;/span&gt;&lt;b&gt;test performance&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; as on &lt;/span&gt;&lt;b&gt;student performance&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;. Items are carefully selected to produce a predicted score distribution. This score distribution is expected to match some subjectively set standard (cut score) such as grade level or job readiness. 
But how the test is administered changes the value and meaning of key &lt;/span&gt;&lt;b&gt;functions&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;. Forced choice and student choice produce two &lt;a href="http://richard-hart.blogspot.com/2011/08/scoring-clicker-data.html"&gt;different views&lt;/a&gt; from the same students.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-6_Pmwv1bvOc/TkHJpRPRciI/AAAAAAAAAQE/Gb-TknhJryA/s1600/Table2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="115" src="http://2.bp.blogspot.com/-6_Pmwv1bvOc/TkHJpRPRciI/AAAAAAAAAQE/Gb-TknhJryA/s400/Table2.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;For many historical reasons, including tradition and short-term accountability, NCLB has used the forced choice mode that only assesses and promotes the lowest levels of thinking. It is fast, cheap, and ineffective. Testing, and unfortunately as a result teaching, limited to the lowest levels of thinking becomes more counterproductive the longer students are exposed to it. This may be an underlying factor in the poor showing made by high school students, in general, in relation to the lower grades (the spread between the levels of thinking required and those that seniors possess may contribute to the current emphasis on senior attitude).&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;When students are allowed to report what they trust as a basis for further learning and instruction, a wealth of information becomes available for student counseling to direct student development. 
&lt;a href="http://www.nine-patch.com/"&gt;PUP&lt;/a&gt; allows students to switch from forced choice to reporting what they know when they are comfortable doing so.&amp;nbsp; &lt;a href="http://www.knowledgefactor.com/"&gt;Knowledge Factor&lt;/a&gt; is a patented instructional/assessment system that guarantees mastery learners. Development to use all levels of thinking is critical to success in school and in the workplace. &lt;/div&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;&lt;br clear="ALL" style="page-break-before: always;" /&gt; &lt;/span&gt;  &lt;br /&gt;&lt;div class="MsoNormal"&gt;Many ways of &lt;b&gt;operating&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; standardized testing have been used in assessing students for NCLB. Multiple-choice was derided at first and then returned as the primary method. Almost everything that is not assessed by actual performance can be usefully measured with multiple-choice (A, B, C, D and omit). Traditional multiple-choice was crippled by dropping the option to omit (don’t know) early on. Just counting right marks was easier and gave a useable ranking for grading. How the rank relates to what a student knows or can do is still an open debate. Knowledge and Judgment Scoring settles this matter with a quality score.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;A &lt;b&gt;test maker&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; (teacher or standardized item author) has all of the above structure and function options to consider when creating an operational test. 
The value of the final test results depends upon how the options are mixed and handled (a simple ranking or an assessment of what is known and can be done along with the judgment to use it well).&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b&gt;Test banking&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; can be very &lt;/span&gt;&lt;b&gt;simple&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;. It can be a list of 25 questions that is edited each semester. The test is then scored by any one of the above four modes. The choice depends upon the use of the results. RMS ranks students and permits comparing your success from year to year. KJS and the partial credit Rasch model explore which students are still lingerers, followers and self-directed learners. The quality score can point out what each student knows or can do as the basis for further learning and instruction regardless of the test score.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b&gt;Test banking&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; can be very &lt;/span&gt;&lt;b&gt;complicated&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;, time-consuming, and expensive.&amp;nbsp; Winsteps appears to be about the least complicated, the least time-consuming and the least expensive approach for standardized testing. It has been used by many states.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;A &lt;b&gt;test bank&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; is created from items that have been &lt;/span&gt;&lt;b&gt;calibrated&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; by Winsteps. A high scoring sample will produce items with low difficulty. A low scoring sample will produce items with high difficulty. 
&lt;/span&gt;&lt;b&gt;Equating&lt;/b&gt;&lt;span style="font-weight: normal;"&gt;, with the use of a set of common items, can bring these together if the two samples are believed to be from the same population. Winsteps does not do this on its own. When and how to equate requires an operational decision. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-N20rL-6t4iY/TkHKlbj35jI/AAAAAAAAAQI/PbU7dWzfZBQ/s1600/Table3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="182" src="http://2.bp.blogspot.com/-N20rL-6t4iY/TkHKlbj35jI/AAAAAAAAAQI/PbU7dWzfZBQ/s400/Table3.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;However the operations are carried out, human intervention is needed to start them and thereafter at about every other step. Standardized testing is still a mix of art, science and politics.&amp;nbsp;&amp;nbsp; &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;A &lt;b&gt;benchmark test&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; is selected from the test bank. A range of item difficulties is selected to match the population to be assessed. A small common item set is included. The mean and standard deviation of the predicted distribution are calculated.&amp;nbsp; Time and money permitting, the benchmark test is administered one or more times. Now a known mean and standard deviation are in hand for the distribution. 
This ends research.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;An &lt;b&gt;application test&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; is administered to the full population: every Algebra I student in the state, for example. This operational test also contains a set of common items used in creating the benchmark test. Winsteps scores the application test.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;b&gt;Resolution&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; of the test results is not the same as &lt;/span&gt;&lt;b&gt;equating&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; items for a test bank. Winsteps can be used here in the same manner as in test banking, but the environment is now very different. A pre-application public declaration of cut scores is no longer recommended due to &lt;a href="http://www.ccsso.org/Documents/2011/Addressing%20Two%20Commonly%20Unrecognized.pdf"&gt;newly found&lt;/a&gt; (Feb 2011) sources of score instability. If the operational test has not performed as expected, the needed adjustment can favor the desired performance for the average score, the cut score, the scaled score, the percent passing, or the percent improvement. Public exposure of average scores has been &lt;a href="http://www.cep-dc.org/publications/index.cfm?selectedYear=2011" title="Scroll down to Open Letter to the Member States of PARCC and SBAC"&gt;requested&lt;/a&gt; by the Center on Education Policy (CEP), Open letter to the member states of PARCC and SBAC, May 3, 2011. Everyone can then know the starting point for whatever resolution adjustments are made. 
This would help reestablish public trust and increase the value of test results.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Test banking data can be liberally culled to obtain the best fit of data to the Rasch model because of the unique properties of the model. That same liberal attitude is, in my opinion, not justified when manipulating the operational test results.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The final step for Winsteps is the &lt;b&gt;conversion of measures&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; to expected raw scores. The conversion is a matter of changing log units to normal units when the test results are not manipulated. No human judgment is required. A normal bell curve distribution is again created.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;This brings this series of posts related to the high jinks exposed in several state education departments to an end. Over the past few years several states have displayed marked deficiencies in their short-term competition for federal money and adequate yearly progress (AYP) including Texas and Illinois (part of the motivation for this year long &lt;a href="http://richard-hart.blogspot.com/2010/11/rasch-model-irt-demystified.html"&gt;investigation&lt;/a&gt; into Rasch model IRT test analysis). During this last year &lt;span style="font-family: ArialMT;"&gt;&lt;a href="http://www.nytimes.com/2010/10/11/education/11scores.html"&gt;&lt;span style="color: #0f00bc; font-family: Arial;"&gt;New York&lt;/span&gt;&lt;/a&gt;&lt;/span&gt; presented the worst example I know of. In my opinion the recent cheating scandals in Georgia will have done less damage to students, teachers and schools than the manipulation of New York state test results by state officials. 
&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;a href="http://2.bp.blogspot.com/-q9tUiYoCZPA/TkHLDH-R9HI/AAAAAAAAAQM/czd6CCctoOE/s1600/EOC.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="218" src="http://2.bp.blogspot.com/-q9tUiYoCZPA/TkHLDH-R9HI/AAAAAAAAAQM/czd6CCctoOE/s320/EOC.jpg" width="320" /&gt;&lt;/a&gt;&lt;a href="http://arkansased.org/testing/test_scores.html"&gt;Arkansas&lt;/a&gt;, on the other hand, has posted &lt;b&gt;almost perfect examples&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; for AYP on NCLB tests for over a ten-year period: 2001-2011 End-of-course Comparison. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;(The percent combined proficient and advanced is a derived value. Average test scores, and related cut scores, are based directly upon student marks on the test.)&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;This demonstrates exceptional skill in managing test performance. Such a performance has therefore invited suspicions of the test becoming more standardized on test performance (the test score) than on student performance (what students know and can do). Were that to be true, it would make Arkansas a good case of successful well-intentioned self-deception, created by instruction (curriculum), learning (level of thinking) and assessment (test items) being optimized for NCLB test results. These doubts are probably not valid given the &lt;b&gt;awards&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; won and &lt;/span&gt;&lt;b&gt;leadership&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; demonstrated by &lt;a href="http://arkansased.org/testing/performance_report.html"&gt;Arkansas&lt;/a&gt;. 
Comparison with &lt;a href="http://www.cep-dc.org/page.cfm?FloatingPageID=14"&gt;NAEP&lt;/a&gt; also shows that two different views of the same students can vary a great deal. Both views may be validated with sufficient student performance information to clarify what each test is testing. &lt;a href="http://arkansased.org/about/pdf/releases/grade_inflation_release_011210.pdf"&gt;Arkansas&lt;/a&gt; has also equated classroom and state test scores as part of their management of &lt;/span&gt;&lt;b&gt;grade inflation&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; (again, two views of the same students).&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Replacing the national academic lottery conducted with right count scored tests with tests that actually assess what students know and can do, as the basis for further learning and instruction, is one way of clarifying this situation (&lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; and the &lt;a href="http://winsteps.com/winsteps.htm"&gt;partial credit Rasch model&lt;/a&gt;, for example). The same tests now used for ranking can also be used when upgrading classroom testing (to assess both quantity and quality) to better prepare students for whatever forms of questions are used on the new NCLB tests. There is a great increase in useful information for students and teachers to direct classroom assignments and activities at all levels of thinking. Or replace the classroom with a complete instruction/assessment package like &lt;a href="http://www.knowledgefactor.com/"&gt;Amplifire&lt;/a&gt;.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The spread of certified &lt;a href="http://thejournal.com/articles/2011/07/26/toward-a-competency-based-learning-system.aspx"&gt;competency-based learning&lt;/a&gt; may help bring about the needed change in assessment methods. 
A test must measure what it claims it is measuring. The test results must not be subject to a variety of secretive factors that only delay the inevitable full disclosure. “You can fool part of the people (including yourself) part of the time, but not all of the people all of the time.” The software packages are honest. It is how they are used that is open to question.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-2536210747442048156?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/2536210747442048156/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/08/standardized-testing-structure-function.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/2536210747442048156'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/2536210747442048156'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/08/standardized-testing-structure-function.html' title='Standardized Testing - Structure, Function, and Operation'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-rgLKJjdXu8M/TkHI_wwHm_I/AAAAAAAAAQA/_LnAisO_DgE/s72-c/Table1.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-7050020961495797610</id><published>2011-08-10T06:00:00.000-07:00</published><updated>2011-08-10T06:00:15.164-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data'/><category scheme='http://www.blogger.com/atom/ns#' term='grading'/><category scheme='http://www.blogger.com/atom/ns#' term='clicker'/><category scheme='http://www.blogger.com/atom/ns#' term='grading clicker data'/><title type='text'>Grading Clicker Data</title><content type='html'>&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;The &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;&lt;a href="http://richard-hart.blogspot.com/2011/08/scoring-clicker-data.html"&gt;clicker data&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt; provided by GMW11 can be assigned grades in many ways. 
A traditional multiple-choice curve used by GMW11 produced 3 A, 1 B, 24 C, 24 D, and 69 F grades with an average score of 34%.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-9sVXnDoo7Bg/TjmRxZWRT0I/AAAAAAAAAPg/vOxqj1k067s/s1600/Grades4.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/-9sVXnDoo7Bg/TjmRxZWRT0I/AAAAAAAAAPg/vOxqj1k067s/s400/Grades4.jpg" width="271" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;A typical &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;&lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt; (KJS) distribution, with letter grades set every ten percentage points, would be 1 B, 3 C, 13 D, and 104 F grades. A KJS curve comparable to a right mark scoring (RMS) curve yields 4 A, 3 B, 15 C, 31 D, and 68 F grades with an average score of 49%. Nearly the same number of students, 69 and 68, earn F grades under each curve.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;Comparable curving produced similar grade distributions. However, what is being assessed and rewarded is very different. An RMS curve is based on a student’s luck on test day (both in marking and in the selection of questions presented on the test). 
A KJS curve is based on each student’s self-assessment; it combines knowledge and judgment in selecting the questions used to report what is actually known. Top students earn the same grades by both methods, as do most poor students. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;High-quality, self-assessing students earn a reward for reporting what they can trust as the basis for further learning and instruction. The steeper the incline connecting RMS and KJS scores on the chart, the higher the quality. High-quality students are teachable. KJS identifies them. RMS does not.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;Scoring the clicker data by both methods and curving the scores in the same manner clearly exposes the difference in student performance under the two scoring methods. The task of the RMS student is to mark the best guess of a right answer for each question. Understanding, problem solving, and reading ability are secondary and even, at times, unnecessary. All of these are crucial for a KJS student, who must determine whether a question can be used to report something that is understood, or that has enough relationships with other information or skills that a verifiable right answer can be marked. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;In this day, all multiple-choice tests should offer both methods of scoring. Students can easily switch from lower to higher levels of thinking; from little responsibility to near full responsibility for learning. Successful implementation requires letting students make the switch. Forcing students into KJS is about as unproductive a thing to do as forcing them to mark an answer to every question on a test they cannot understand or at times even read. Power Up Plus scores both methods, as does &lt;a href="http://winsteps.com/winsteps.htm"&gt;Winsteps&lt;/a&gt; (full credit and partial credit Rasch IRT models). No additional preparation time or effort is needed beyond that required for creating any multiple-choice test.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;To the student: Your highest score/grade is obtained by being honest in reporting what you know, understand, and can trust at any level of preparation.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;To the teacher: You know what each student can do and understand as the basis for further learning and instruction.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 
0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;To the administrator: You know the levels of thinking, for each student, and in classroom instruction, as passive pupils prepare to be independent learners (self-assessing, self-correcting scholars).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial; font-size: 12pt;"&gt;&lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; promotes student development when used on essay tests, multiple-choice tests, and I would suggest the same for clicker data.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-7050020961495797610?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/7050020961495797610/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/08/grading-clicker-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/7050020961495797610'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/7050020961495797610'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/08/grading-clicker-data.html' title='Grading Clicker Data'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-9sVXnDoo7Bg/TjmRxZWRT0I/AAAAAAAAAPg/vOxqj1k067s/s72-c/Grades4.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-5262779360251314242</id><published>2011-08-03T08:46:00.000-07:00</published><updated>2011-08-03T10:38:19.004-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data'/><category scheme='http://www.blogger.com/atom/ns#' term='clicker'/><category scheme='http://www.blogger.com/atom/ns#' term='scoring'/><category scheme='http://www.blogger.com/atom/ns#' term='scoring clicker data'/><title type='text'>Scoring Clicker Data</title><content type='html'>&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;a href="http://1.bp.blogspot.com/-sOHFH5V_pA4/Tjll8xAoh1I/AAAAAAAAAPI/Qr4uvL2bVT8/s1600/Chart1.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="116" src="http://1.bp.blogspot.com/-sOHFH5V_pA4/Tjll8xAoh1I/AAAAAAAAAPI/Qr4uvL2bVT8/s200/Chart1.jpg" width="200" /&gt;&lt;/a&gt;I was recently presented with some clicker data to examine (GMW11). It had been scored by traditional right count scoring. There were a number of scores below 20%. There were even four students with a score of zero. 
That is well below the average guessing score of 20% expected with five options on each question.&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-g-BokNgEDJM/Tjlm68EoPsI/AAAAAAAAAPM/0zVOH1IhzLk/s1600/Chart2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="118" src="http://3.bp.blogspot.com/-g-BokNgEDJM/Tjlm68EoPsI/AAAAAAAAAPM/0zVOH1IhzLk/s200/Chart2.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;A different score distribution was produced by scoring the data for both knowledge and judgment. This distribution looks very much like what one would expect from students on their first introduction to &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt;. Students earn the same scores by both methods of scoring when they fail to exercise their own judgment (mark an answer to every question). The top three students therefore obtained the same score with both methods of scoring.&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;Here is an opportunity to compare Right Mark Scoring (RMS) and Knowledge and Judgment Scoring (KJS) when used on &lt;b&gt;any multiple-choice test&lt;/b&gt;&lt;/span&gt;&lt;span style="font-family: Arial;"&gt;. There is one catch: both methods of scoring are being used on one and the same set of answer sheets. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;Normally students would elect which method they felt comfortable using (and if time permits, on the first or second test, they may fill out two answer sheets, one for each method of scoring). The same test data can support a number of different stories. This story will assume that the test was presented with a choice of RMS and KJS, and further, that this was the first such test for the class. Most would be expected to select what they are most familiar with: RMS.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-bsSHXmP1FwU/Tjln-g1NWYI/AAAAAAAAAPQ/oHbTWew3bmk/s1600/Chart3.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="193" src="http://1.bp.blogspot.com/-bsSHXmP1FwU/Tjln-g1NWYI/AAAAAAAAAPQ/oHbTWew3bmk/s200/Chart3.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;When quantity and quality are scatterplotted from RMS data, the result is a straight line. Only one dimension is being measured: a count of right marks. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;a href="http://3.bp.blogspot.com/-hB9QJnKagPE/TjlobIJLgWI/AAAAAAAAAPU/bR3RUiJlEnI/s1600/Chart4.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="193" src="http://3.bp.blogspot.com/-hB9QJnKagPE/TjlobIJLgWI/AAAAAAAAAPU/bR3RUiJlEnI/s200/Chart4.jpg" width="200" /&gt;&lt;/a&gt;&lt;span style="font-family: Arial;"&gt;&amp;nbsp;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;KJS data are two-dimensional. A range of quality scores can yield the same test score. The test score of 46% was earned by students with a range of quality scores from zero to 44%.&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;Higher quality students are found above 50%. Lower quality students are found below 50%. Higher quality students get higher scores by marking more right answers. Only one student marked a perfect 100% quality score (no wrong marks). &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;Lower quality students get lower scores by marking more wrong answers. Four marked a zero quality score (every one of their two to 10 marks on the 23 question test was wrong). 
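&lt;br /&gt;&lt;br /&gt;The arithmetic behind these scores can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Power Up Plus: the function name score_answer_sheet and the three sample answer sheets are hypothetical, the quality score is taken to be right marks as a share of all marks, and the KJS score gives equal value to quantity and quality (one point for each right mark, plus one judgment point for each right mark or omit, and nothing for a wrong mark).
&lt;pre&gt;
```python
def score_answer_sheet(right, wrong, total_items):
    """Return (rms_percent, kjs_percent, quality_percent) for one answer sheet.

    right and wrong are counts of right and wrong marks; items left
    unmarked are treated as omits.
    """
    marked = right + wrong
    omitted = total_items - marked

    # RMS (right mark scoring): only the count of right marks matters.
    rms = 100.0 * right / total_items

    # KJS: 1 point per right mark, plus 1 judgment point per right mark
    # or omit; a wrong mark earns nothing. The maximum is 2 points per item.
    kjs = 100.0 * (right + (right + omitted)) / (2 * total_items)

    # Quality: share of marks that were right. Returning None when nothing
    # was marked is an assumption made for this sketch.
    quality = 100.0 * right / marked if marked else None

    return rms, kjs, quality


# Hypothetical sheets on a 23-question test (illustrative counts, not GMW11 data).
print(score_answer_sheet(0, 2, 23))   # two marks, both wrong
print(score_answer_sheet(7, 9, 23))   # sixteen marks, seven right
print(score_answer_sheet(15, 8, 23))  # every question marked
```
&lt;/pre&gt;
Under these assumptions the first two sheets both come out at a KJS score of about 46%, with quality scores of zero and about 44%, and the fully marked third sheet earns the same score by both methods. 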
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;Quantity and quality have been given equal value. The active test score then starts at 50%: 1 point for right, 1 point for good judgment (omit or right), and zero for wrong (poor judgment). [Back in the 1970s, when this work first began, the active test score started with zero. It was called net yield scoring; right minus wrong. The discovery of the quality score produced the second dimension that assesses student performance rather than defaulting to luck on test day.]&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-iQxps3fvac4/TjlsgDwizCI/AAAAAAAAAPY/jzl80jHbaJE/s1600/Chart5.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="193" src="http://3.bp.blogspot.com/-iQxps3fvac4/TjlsgDwizCI/AAAAAAAAAPY/jzl80jHbaJE/s200/Chart5.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;The end result of training students to accurately report what they trust they know or can do is shown in the Fall88 scatterplot. After an initial test (such as the clicker data) where most students elect RMS, they change study habits, and voluntarily switch to KJS. Here most of the class show a quality score about one letter grade higher than their test score. There is a bit of a disconnect at the pass/fail line of 60% (70% C, 80% B, and 90% A). Experienced students feel more comfortable reporting what they know than guessing at answers on all items on the test. 
They are on the path to being independent learners (self-correcting scholars).&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;This is in contrast to traditional right mark scoring where any score can be one letter grade higher with good luck to one letter grade lower with bad luck than a student’s actual ability. A grade of B one day may be a D on another day. And no one, including the student, knows what the student actually knows and can do as the basis for further learning and instruction.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;"&gt;&lt;span style="font-family: Arial;"&gt;Grading has an important effect on which scoring method students select: RMS (“I mark, you score”) or KJS (self-assessment, “I tell you”). RMS students tend to cram and to match. KJS students bring a rich web of relationships (from learning by questioning, answering, and verifying) that they can apply to questions they have not seen before. There is an operational difference between remembering and understanding that can be measured &lt;a href="http://www.nine-patch.com/"&gt;(RMS vs. 
KJS)&lt;/a&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-5262779360251314242?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/5262779360251314242/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2011/08/scoring-clicker-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/5262779360251314242'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/5262779360251314242'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2011/08/scoring-clicker-data.html' title='Scoring Clicker Data'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-sOHFH5V_pA4/Tjll8xAoh1I/AAAAAAAAAPI/Qr4uvL2bVT8/s72-c/Chart1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-8305800935366283809</id><published>2010-11-01T08:44:00.000-07:00</published><updated>2010-11-02T14:39:22.009-07:00</updated><title type='text'>Rasch Model IRT Demystified</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;Questionable NCLB 
test cut scores have put &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&lt;a href="http://www.nytimes.com/2010/01/12/education/12exit.html"&gt;Arkansas&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;, &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&lt;a href="http://www.chron.com/disp/story.mpl/metropolitan/7041445.html"&gt;Texas&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;, &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&lt;a href="http://gothamschools.org/2010/09/23/half-of-all-summer-school-students-have-to-repeat-a-grade/"&gt;New York&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;, and now &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&lt;a href="http://articles.chicagotribune.com/2010-10-18/news/ct-met-isat-answers-20101018_1_math-tests-new-isat-wrong-answers"&gt;Illinois&lt;/a&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&amp;nbsp;in the news this year. How NCLB test cut scores are set is of concern. The traditional method of just counting right marks (known as classical test theory or CTT) is not used.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande';"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;div class="MsoNormal"&gt;Instead, Rasch model item response theory (IRT), which estimates student ability and question difficulty, is used. It is an acceptable way to calibrate questions for computer adaptive testing (CAT), where you answer only enough questions to determine pass or fail. 
This leaves in question how psychometricians, education officials, and politicians &lt;a href="http://www.scribd.com/doc/34538511/A-New-Proficiency-Public-Version-07"&gt;use&lt;/a&gt;&amp;nbsp;the Rasch model on NCLB tests.&lt;o:p&gt;&lt;/o:p&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;How tests are scored should not be a mystery known only to those who benefit directly from “higher test scores” that may have no other meaning or use.&amp;nbsp; A detailed examination can also determine the Rasch model’s ability to make useful sense of classroom test results for instructional and student counseling purposes.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;This blog will now pause a bit to relate the printouts from the &lt;a href="http://www.winsteps.com/winsteps.htm"&gt;Winsteps&lt;/a&gt;&amp;nbsp;Rasch model IRT (student ability and item difficulty) with the &lt;a href="http://www.nine-patch.com/"&gt;Power Up Plus&lt;/a&gt;&amp;nbsp;(right mark scoring or RMS) printouts in a new blog: &lt;a href="http://raschmodelaudit.blogspot.com/2010/11/winsteps-person-most-unexpected.html"&gt;Rasch Model Audit&lt;/a&gt;.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;a href="http://www.nine-patch.com/"&gt;Power Up Plus&lt;/a&gt;&amp;nbsp;(FreePUP) prints out two student counseling reports:&amp;nbsp;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_D9sVIEz0Aho/TM7XYAnllHI/AAAAAAAAAIA/li_kW6S8RVE/s1600/PUP+Table+3.JPG" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="248" src="http://4.bp.blogspot.com/_D9sVIEz0Aho/TM7XYAnllHI/AAAAAAAAAIA/li_kW6S8RVE/s320/PUP+Table+3.JPG" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Table 
3. Student Counseling Mark Matrix with Scores and Item Difficulty contains the same student marks that &lt;a href="http://www.winsteps.com/ministep.htm"&gt;Ministep&lt;/a&gt;&amp;nbsp;(the free version of [&lt;a href="http://www.winsteps.com/winsteps.htm"&gt;Winsteps&lt;/a&gt;]) starts with when doing a Rasch model IRT test score analysis. The most able students with the least difficult items are in the upper left. The least able students with the most difficult items are in the lower right. The relationships between student, item, mark, and test are presented in a highly usable fashion for both students and teachers for student counseling.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/TM7XxLLBqMI/AAAAAAAAAII/Oe7IGE7wXdM/s1600/PUP+Table+3a.JPG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="259" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/TM7XxLLBqMI/AAAAAAAAAII/Oe7IGE7wXdM/s320/PUP+Table+3a.JPG" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Table 3a. Student Counseling Mark Matrix with Mastery/Easy, Unfinished and Discriminating (MUD) Analysis re-tables the data to assist in the improvement of instruction and testing. Winsteps Rasch Model IRT quantifies each of the marks on these two tables. This is a most interesting and powerful addition to RMS. PUP Tables 3&amp;nbsp; and 3a will be used as working papers in this audit of the Rasch model.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;On return, this blog will continue with the application of &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; (KJS) and the Rasch model to promote student development (using all levels of thinking). 
We need accurate, honest and fair test results presented in an easy to understand and to use manner. &lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;&amp;nbsp;does this, as well as promotes student development. We also need to detect sooner when school and state officials are releasing meaningless test results (it took three years in &lt;a href="http://www.scribd.com/doc/34538511/A-New-Proficiency-Public-Version-07"&gt;New York&lt;/a&gt;). Both needs require some of the same insights.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;a href="http://raschmodelaudit.blogspot.com/2010/11/winsteps-person-most-unexpected.html"&gt;Next:&amp;nbsp;&lt;/a&gt;Rasch Model Audit&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-8305800935366283809?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/8305800935366283809/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/11/rasch-model-irt-demystified.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/8305800935366283809'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/8305800935366283809'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/11/rasch-model-irt-demystified.html' title='Rasch Model IRT Demystified'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' 
height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_D9sVIEz0Aho/TM7XYAnllHI/AAAAAAAAAIA/li_kW6S8RVE/s72-c/PUP+Table+3.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-1381548578784226567</id><published>2010-07-09T13:28:00.000-07:00</published><updated>2010-07-09T13:33:22.088-07:00</updated><title type='text'>TAKS Qualms - Part 2</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 12px;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The &lt;a href="http://richard-hart.blogspot.com/2010/06/understanding-and-trusting-nclb-test.html"&gt;last&lt;/a&gt; post reported that the passing rates on the TAKS Social Studies Grade 8 tests for 2003 and 2004 had been changed. That change was far greater than the deviation between 2009 and 2010 that Ericka Mellon raised as a concern in the &lt;a href="http://www.chron.com/disp/story.mpl/metropolitan/7041445.html"&gt;Houston Chronicle&lt;/a&gt;.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Further study revealed that the passing rates in all four subjects&amp;nbsp;(English Language Arts,&amp;nbsp;Mathematics,&amp;nbsp;Science, and&amp;nbsp;Social Studies)&amp;nbsp;were changed on the Grade 10 tests for 2003 and 2004. 
Texas allowed their Rasch One Parameter IRT (ROPIRT) to roam the open range for two years.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/TDd-1gVDf8I/AAAAAAAAAHg/ttncDY71iYo/s1600/Math10C.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="247" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/TDd-1gVDf8I/AAAAAAAAAHg/ttncDY71iYo/s320/Math10C.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;In mathematics and science it returned impressive passing rates the first year and lower on the second year. Without the changes, it would have taken seven years for the passing rates to exceed the initial 2003 benchmark values. That just did not look right.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;By 2006 Texas had fenced in their ROPIRT. It thereafter produced values that looked right to education officials. 
By changing the passing rates for 2003 and 2004, the resulting curves looked very right: slow and continued progress toward an impossible goal of a passing rate of 100% by 2014.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_D9sVIEz0Aho/TDd-6cFdPoI/AAAAAAAAAHo/GDh4GnzuOR4/s1600/Science10C.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="247" src="http://4.bp.blogspot.com/_D9sVIEz0Aho/TDd-6cFdPoI/AAAAAAAAAHo/GDh4GnzuOR4/s320/Science10C.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The largest change was in&amp;nbsp;science:&amp;nbsp;69% passing in 2003 was changed to 42%, a change of 27 percentage points or a lowering of the original figure by 39%.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;IMHO what we are seeing here is the result of learning to use a new statistical tool that many would like to believe is a “standard statistical process”. It generates the numbers the states believed were wanted by the federal government. Secretary Duncan now considers the results as, “lying to our students”. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Unlike Knowledge and Judgment Scoring (KJS) that assesses at all levels of thinking and Confidence Based Learning (CBL) that assesses at the mastery level, the Texas ROPIRT has been fed right count scoring (RCS) data at the lowest level of thinking. The emphasis in assessment has changed, from earning high scores, to justifying the lowest cut point score. 
With KJS and CBL, the emphasis is on producing self-correcting high quality achievers who would in general find these tests, “a waste of time”.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The one striking observation in all four charts is that the average percent test score, in general, shows a gradual increase from year to year. If these tests were of comparable difficulty (with unchanging cut scores, they must be of comparable difficulty for 2005-2009), student performance on these tests was increasing prior to 2010. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/TDd-sTX3rSI/AAAAAAAAAHY/D8swlBSA8rY/s1600/ELA10C.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;/a&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/TDd-_lCtYLI/AAAAAAAAAHw/35lXpSjaJ8M/s1600/Social10C.jpg" imageanchor="1" style="clear: right; display: inline !important; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="154" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/TDd-_lCtYLI/AAAAAAAAAHw/35lXpSjaJ8M/s200/Social10C.jpg" width="200" /&gt;&lt;/a&gt;&lt;img border="0" height="152" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/TDd-sTX3rSI/AAAAAAAAAHY/D8swlBSA8rY/s200/ELA10C.jpg" width="200" /&gt;&lt;span class="Apple-style-span" style="-webkit-text-decorations-in-effect: none; color: black;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;As is, the&amp;nbsp;English Language Arts&amp;nbsp;and&amp;nbsp;Social Studies&amp;nbsp;tests are performing at the mastery level, above an average test score of 80% and passing rates at 90% and above. These tests now function as check lists of what experts in these fields consider necessary. 
Quibbling over a few points on test scores must now give way to serious concern about the ability of these tests to detect those students who will succeed in future schooling and on the job. Passing the test must be meaningful in the real world as well as in the edu-politic-money games currently being played.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;We do not yet know how the ROPIRT does its work, but we can observe its behavior. Texas has seen three periods of different behavior: 2003-2004, where the wild passing rates were later tamed to look right; 2005-2009, where the average test score and the cut scores changed in unison; and 2010, where &lt;b&gt;all four&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; tests showed increased passing rates even though the 2010 average test scores were higher than for 2009 on one test, the same on one test, and lower on two tests. &lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;And the rate of change in passing was more than twice that of former years where the results looked right. The change was all in the same direction: up.&lt;br /&gt;&lt;br /&gt;1. &amp;nbsp;The difference in behavior of the Texas ROPIRT model in 2010 was due to:&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;a. &amp;nbsp;political influence.&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;b. &amp;nbsp;Texas losing control again.&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;c. &amp;nbsp;student performance.&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;d. &amp;nbsp;all of the above.&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;e. &amp;nbsp;none of the above.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Please be able to support your answer from your own experience or with information from trusted sources.
(Good judgment is to omit if you cannot trust&amp;nbsp;your mark to be right.)&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-1381548578784226567?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/1381548578784226567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/07/taks-qualms-part-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/1381548578784226567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/1381548578784226567'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/07/taks-qualms-part-2.html' title='TAKS Qualms - Part 2'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_D9sVIEz0Aho/TDd-1gVDf8I/AAAAAAAAAHg/ttncDY71iYo/s72-c/Math10C.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-3972023447314707625</id><published>2010-06-25T14:06:00.000-07:00</published><updated>2010-06-26T12:17:16.890-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='percent passing'/><category scheme='http://www.blogger.com/atom/ns#' term='NCLB'/><category scheme='http://www.blogger.com/atom/ns#' 
term='cut score'/><category scheme='http://www.blogger.com/atom/ns#' term='TAKS'/><category scheme='http://www.blogger.com/atom/ns#' term='test'/><category scheme='http://www.blogger.com/atom/ns#' term='qualms'/><category scheme='http://www.blogger.com/atom/ns#' term='multiple choice'/><title type='text'>Understanding and Trusting NCLB Test Standards - TAKS</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 12px;"&gt;   &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;div class="MsoNormal"&gt;After eight years there is still a problem with people not understanding and trusting NCLB standardized testing according to the Texas TAKS social studies grade 8 article, “&lt;a href="http://www.chron.com/disp/story.mpl/metropolitan/7041445.html"&gt;Qualms arise over TAKS standards&lt;/a&gt;”, in the Houston Chronicle by Ericka Mellon, 7 June 2010.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;‘State Rep. 
Scott Hochberg, vice chairman of the House Public Education Committee said in the Houston Chronicle, “You can get more than halfway to passing just by guessing”.’&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;A distribution of expected lucky scores from the test with 48 questions and 4-option answers shows this to be correct, on average.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUOgXBJHXI/AAAAAAAAAG4/dB2f9W1B7kY/s1600/ExpectedLuckyScores.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUOgXBJHXI/AAAAAAAAAG4/dB2f9W1B7kY/s320/ExpectedLuckyScores.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;‘TEA Deputy Associate Commissioner Gloria Zyskowski said agency officials set the bar high enough so “students can’t pass the test by chance alone.”’&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;One very lucky student out of 100 needs to add 2 right marks to pass. One very unlucky student out of 100 needs to add 17 right marks to pass. Students cannot pass the test by luck alone. The unfairness of students starting the test with lucky scores ranging from 5 to 19 is not very important on this test, as 95 percent passed the test with an average score over 80%.
[&lt;a href="http://www.youtube.com/watch?v=t-Xen7yq13E"&gt;YouTube&lt;/a&gt;]&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;‘Sarah Winkler, the president of the Texas Association of School Boards, was shocked to find out Monday that the TEA doesn’t set the passing bar – called the cut score – until after students take the TAKS.’ &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;This practice takes the TEA out of the game. They no longer have to make a bet on what the cut score should be (and respond to all of the ramifications if they are wrong). They can bring all their expertise to bear on setting the most appropriate cut score. An operational test is not governed by research rules and hypothesis testing of average scores. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Operational testing is concerned about each student when the results determine whether a student passes or fails a grade. In this case, and especially when low cut scores are used, it would be nice if students could also get out of the game of right count scoring (guess testing). The TEA can do this by using &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt;, which lets all students start the test at the same score and gives equal value to what they know and to the judgment needed to make use of what they know. It assesses all levels of thinking, an innovation ready for the next revision of NCLB. The social studies test yielded an average score over 80%. The TEA could also use &lt;a href="http://www.knowledgefactor.com/"&gt;Confidence Based Learning&lt;/a&gt; scoring that functions at the mastery level.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;‘“We didn’t do anything differently than previous years,” said TEA spokeswoman Debbie Ratcliffe.
“It wouldn’t be fair to kids if this test wasn’t at the same difficulty level from year to year.”’&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUSUdi57hI/AAAAAAAAAHA/iiPkUdKRqcs/s1600/CharacteristicCurves.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUSUdi57hI/AAAAAAAAAHA/iiPkUdKRqcs/s320/CharacteristicCurves.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The test characteristic curves used by psychometricians bear this out for the eight years. The curves for six of the eight years fall directly on top of one another with a cut score of 25. This is an outstanding piece of work. The year 2005 shows a slight deviation (cut score of 24) and 2010 a much greater deviation in difficulty (cut score of 21). The minute breaks in the scale scores at 2100 and 2400 are the standards for met and commended performance levels. (PLEASE NOTE that these curves descend to zero on tests that are designed to generate a lowest lucky score of 12 out of 48 questions, on average. This is no problem for true believers.)&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;‘TEA officials say the questions, for the most part, were harder this year, so they followed standard statistical process and lowered the number of items students needed to get correct.’ But were the questions harder or the students less prepared?&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The TEA is faithfully following the operating rules that come with their Rasch one-parameter IRT model analyzer (ROPIRT). For the thoroughly indoctrinated true believer, a ROPIRT works like a charm in a space with arbitrary dimensions.
There is a mysterious interaction between the average score of a set of anchor questions embedded in each test, the average right count test score, the cut score, and the percent passing, on each test, and with the preceding test, within the ROPIRT. Only the last two or three are generally posted on the Internet. For the rest of us, we must judge its output by the results it returns to the real world.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The eight-year run of the social studies grade 8 test shows some informing behavior. Years 2003 and 2004 were originally assigned cut scores of 19 and 22. That yielded passing rates of 93% and 88%. Later, all years were assigned a cut score of 25 except for 2005 (24) and 2010 (21). Now to weave a story with these facts.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/TCUX5ezO_4I/AAAAAAAAAHI/aXR9ZKwJWJk/s1600/SocialStudies8.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/TCUX5ezO_4I/AAAAAAAAAHI/aXR9ZKwJWJk/s320/SocialStudies8.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Starting in 2003 with a cut score set at 25, 77% passed the test with an average test score of 65.5%. The average test score increased by 5.2% in 2004 to 70.8%. This was not enough to trigger a change in the cut score. The passing rate increased to 81%.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The average test score remained stationary in 2005. This triggered a 4% change in the cut score by one count from 25 to 24. 
The ROPIRT decided that the test was more difficult this year so the passing rate should be &lt;b&gt;adjusted&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; &lt;/span&gt;&lt;b&gt;up&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; from 81 to 85%.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The average test score increased by 4.2% in 2006 to 75%. This triggered a 4% change in the cut score by one count from 24 back to 25. The ROPIRT decided that the test was too easy this year so the passing rate should be &lt;b&gt;adjusted&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; &lt;/span&gt;&lt;b&gt;down&lt;/b&gt;&lt;span style="font-weight: normal;"&gt; from 85 to 83%.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The average test score increased by lesser amounts in 2007, 2008, and 2009 (3.1, 2.1, and 2.1%). These did not trigger an adjustment in the cut score.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;In 2010, the average test score decreased by only 2.2% to 80.2%, the same average score as in 2008. The ROPIRT decided the test was way too difficult by changing the cut score by 4 counts from 25 to 21. This was a 16% adjustment in cut score for a 2.2% change in the average test score.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The amount of the adjustment is not consistent with the previous adjustments. The resulting passing rate for 2010 of 95% is not consistent with the passing rate for 2008 (90%) with the same average test score. The ROPIRT, which to my knowledge only looks back one test, is drifting away from previous decisions it has made.
[&lt;a href="http://www.youtube.com/watch?v=t-Xen7yq13E"&gt;YouTube&lt;/a&gt;]&amp;nbsp;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The Texas data show four interesting things:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;If students do too well on a test, it is declared too easy and the cut score is raised to lower the pass rate even though they may have actually performed better.&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;If students do too poorly on a test, it is declared too difficult and the cut score is lowered to raise the pass rate even though they may have actually performed poorly.&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;If the above goes on long enough, the whole process drifts away from the original benchmark values and requires recalibration.&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;A benchmark cut score can be revised based on the results of following years. This is consistent with the ROPIRT operating instructions to remove imperfect data until you get the right answer.&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Calibrating questions with a ROPIRT for use in time-saving computer assisted testing (CAT) is valid. Using it to equate tests over a period of seven years is another matter.
By design (an act of faith), a ROPIRT cannot err as it lives in a perfect world of true scores (these error-free true scores, the raw scores found on the raw score to scale score conversion tables, are generally considered to be on the same scale as the right count test scores even though each student’s right count test score is influenced by a number of factors including item discrimination and lucky scores). Error occurs when imperfect data are fed into a ROPIRT without being manually detected and removed. The blame game then ends with operator inexperience. Since Texas is using a Rasch Partial-Credit Model in a ROPIRT mode, it could use &lt;a href="http://www.nine-patch.com/"&gt;Knowledge and Judgment Scoring&lt;/a&gt; to reduce the error from traditional right count scoring.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUaHoegw9I/AAAAAAAAAHQ/6aG1eBIQWvE/s1600/ChangeEffects.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUaHoegw9I/AAAAAAAAAHQ/6aG1eBIQWvE/s320/ChangeEffects.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;For someone outside a State Department of Education to assess the operation of their ROPIRT, the investigator would need a minimum set of information for each year of the test: the mean of the anchor set of questions embedded in each test that is the primary determiner of the change in the cut score, the mean of the right count scored student tests, the cut score, and the percent passing. I have yet to find a state that posts or will provide all four of these values. Texas posts the last three. Arkansas posts the last two.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Are the test results politically influenced?
From the data in hand, I don’t know enough to know. High scores (now high pass rates that are sensitive to low scores) are needed to meet federal standards. For several states, the shape of the gently, ever more slowly rising curve for the passing rate appears to be more carefully choreographed than a direct result of student performance. I think a better question is: Is this from political influence or the result of a smoothing effect created when using (and learning to use) a ROPIRT? The revised passing rates for 2003 and 2004 on the social studies grade 8 test give us a mixed clue.&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-3972023447314707625?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/3972023447314707625/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/06/understanding-and-trusting-nclb-test.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/3972023447314707625'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/3972023447314707625'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/06/understanding-and-trusting-nclb-test.html' title='Understanding and Trusting NCLB Test Standards - TAKS'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail
xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_D9sVIEz0Aho/TCUOgXBJHXI/AAAAAAAAAG4/dB2f9W1B7kY/s72-c/ExpectedLuckyScores.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-2233823474587283829</id><published>2010-05-19T07:55:00.000-07:00</published><updated>2010-05-19T07:55:00.199-07:00</updated><title type='text'>Wallpapering Traditional Multiple-Choice Tests</title><content type='html'>&amp;nbsp;Wallpapering is preparing, in advance of the test, a mark pattern to be used when students do not have answers they can verify and trust. Students have &lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;three options&lt;/b&gt;&lt;/span&gt; after marking all the questions that can be used to report what is known or can be done:&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;&lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;Turning in the answer sheet&lt;/b&gt;&lt;/span&gt;      yields an accurate, honest, but unfair score unless omit or judgment is      given a value equal to, or higher than, knowledge; as is done with      Knowledge and Judgment Scoring (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) and Confidence Based Learning (&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;).&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal" style="text-indent: 3.0pt;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="2" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;&lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;Randomly marking&lt;/b&gt;&lt;/span&gt; the      remaining questions gives judgment a value of zero. 
The score is less      accurate, honest, and fair the lower it gets until it only reflects answer      sheet marking ability. The test is a high anxiety academic casino game at      the lowest levels (orders) of thinking.&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="3" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo1; tab-stops: list .5in;"&gt;&lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;Wallpapering&lt;/b&gt;&lt;/span&gt; is a defensive      measure. It reduces test anxiety. It increases fairness and test security.      It shares the same good luck.&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Being prepared &lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;reduces&lt;/b&gt;&lt;/span&gt; &lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;test anxiety&lt;/b&gt;&lt;/span&gt;. This includes how to make a forced-choice mark when you do not have a trusted answer. The age-old advice is to pick one option, such as C. Wallpapering adds one more step: Everyone in the class makes the same mark (with KJS and CBL everyone just omits).&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;A fair test&lt;/b&gt;&lt;/span&gt; requires a fair starting score (which exists with KJS and CBL).&lt;/div&gt;&lt;div class="MsoNormal"&gt;The active starting score on traditional multiple-choice tests is about 33%, on average. That is a range of independent starting scores of about two letter grades. Wallpapering reduces this range. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpapering produces a &lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;security code&lt;/b&gt;&lt;/span&gt;. The wallpaper marking-pattern can be made as elaborate as needed. 
Over half of the marks on an answer sheet can come from wallpapering when test scores drop below 50%. A set of answer sheets marked right and wallpapered, and with no erasures, indicates no tampering. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;NCLB raw scores below 40% are now listed as Proficient in several states. The distribution of scores from marginal students with equal abilities follows the normal curve of error. The distribution widens as the test scores descend. It is gambling. Some pass. Some fail. This is not fair.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;a href="http://www.youtube.com/watch?v=UdX5dXGjFAI"&gt;Wallpapering&lt;/a&gt; reduces this unfairness. All students in the group (class) mark the same answer when they cannot trust making a right mark. They do the same thing at the same time rather than individually trust to luck. This does not change their individual test scores, on average.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpaper is an answer sheet created BEFORE seeing the test. Individual variation is markedly reduced. The simplest example is for all in the group to agree to mark the same letter when in doubt. More variable patterns can be created using mnemonics for easy memory. Short patterns can repeat every few questions. The Christmas tree repeats every 4 questions (A, B, C, D) on a 4-option test. Longer patterns can use poetry and music.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Doing the same thing at the same time has evolved in birds as a means of &lt;a href="http://greenconsciousness.org/weblog/2008/10/how-does-flock-of-birds-wheel-and-swoop.html"&gt;protecting&lt;/a&gt;&amp;nbsp;individual members of the flock from predators. 
The tight formation &lt;a href="http://www.youtube.com/watch?v=b8eZJnbDHIg"&gt;protects&lt;/a&gt; individual members and decreases the energy needed to fly. The same &lt;a href="http://photography.nationalgeographic.com/photography/photos/schools-fish/sea-lion-chase-photography.html"&gt;protection&lt;/a&gt; and energy savings apply in schools of fish.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Wallpapering has this effect for marginal students taking tests using RMS. It reduces the random &lt;a href="http://www.youtube.com/watch?v=NT3KvtF_vVk"&gt;lucky score variation&lt;/a&gt;&amp;nbsp;in individual test scores.&amp;nbsp;&lt;a href="http://www.youtube.com/watch?v=UdX5dXGjFAI"&gt;Wallpapering&lt;/a&gt;&amp;nbsp;allows students to do the same thing at the same time with equal ease when marking a trusted right answer or marking the equivalent of omit using &lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt; or &lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;. &lt;span class="Apple-style-span" style="font-family: ArialMT;"&gt;A few minutes of planning equal a few millennia of evolution in protecting marginal students from the vagaries of NCLB testing.&lt;/span&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-family: ArialMT;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Multiple Choice Bubble Sheet Template:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;a href="http://teacherspayteachers.com/"&gt;http://teacherspayteachers.com&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Teacher-Author: E.
Fisher&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Price: FREE&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-2233823474587283829?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/2233823474587283829/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/05/wallpapering-traditional-multiple.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/2233823474587283829'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/2233823474587283829'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/05/wallpapering-traditional-multiple.html' title='Wallpapering Traditional Multiple-Choice Tests'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-4681599655355877079</id><published>2010-05-12T08:22:00.000-07:00</published><updated>2010-05-12T08:22:00.224-07:00</updated><title type='text'>My Score Quality</title><content type='html'>&lt;div align="center" class="MsoNormal" style="text-align: left;"&gt;Examiners can tell students, parents, and employers how a score relates to other examinees on a test. 
But how does it relate to everything else?&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;What does my score mean other than that I passed the Arkansas Algebra I (AAI) end-of-course test? Am I ready for Algebra II? Have I mastered the general lifetime skills supported by learning Algebra? Did I take a lower-order thinking appreciation course or a higher-order thinking skills course? Did I just pass a graduation requirement and get a grade? Are the &lt;a href="http://richard-hart.blogspot.com/2010/04/multiple-choice-lucky-scores.html"&gt;newspapers&lt;/a&gt; right that the course is not tough enough, that the passing cut score is too low?&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Arkansas is one of five states to have a Statewide &lt;a href="http://www.ecs.org/html/Document.asp?chouseid=6412"&gt;Uniform Grading Scale&lt;/a&gt; for classroom tests. This is one way of indicating quality. The final determiner is how students perform on their next unit, next semester or next job assignment.&lt;br /&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Quality varies between states. The letter grade of “C” ranges from 70% to 77%. A classroom “D” is 60% in Arkansas and Florida and 70% in South Carolina and Tennessee. The quality of a test score is dependent upon a number of factors including &lt;a href="http://www.oursc.k12.ar.us/default_images/science/EOC_Biology_CutScores_Spring2009.pdf"&gt;scale scores&lt;/a&gt;.
The AAI raw score equivalent to a&amp;nbsp;&lt;a href="http://arkedu.state.ar.us/commemos/static/fy0809/attachments/Algebra_1_EOC_Pass_Cut_Score_and_PLD.pdf"&gt;classroom&lt;/a&gt;&amp;nbsp;pass = 24%.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BAwm44q3I/AAAAAAAAAGA/NCr7E4rjxd4/s1600/Table1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="118" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BAwm44q3I/AAAAAAAAAGA/NCr7E4rjxd4/s400/Table1.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/S-BrnseyHgI/AAAAAAAAAGw/cuQmnCHC13Y/s1600/Slide2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="150" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/S-BrnseyHgI/AAAAAAAAAGw/cuQmnCHC13Y/s200/Slide2.jpg" width="200" /&gt;&lt;/a&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BrihsI3NI/AAAAAAAAAGo/18QSYa1a4Ds/s1600/Slide1.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="150" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BrihsI3NI/AAAAAAAAAGo/18QSYa1a4Ds/s200/Slide1.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;If the AAI test were all multiple-choice, every score falls in the shadow of the lucky scores. The score of 25 is nonsense. The cut scores of 21, 24 and 37 could be obtained by just marking the answer sheet without looking at the test. 
All cut scores would be shady “no quality” scores.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="color: #666666; font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 12px; white-space: pre;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BCZWbdIKI/AAAAAAAAAGY/Lxuc8ztHUkE/s1600/Slide3.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="150" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BCZWbdIKI/AAAAAAAAAGY/Lxuc8ztHUkE/s200/Slide3.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;a href="http://www.youtube.com/watch?v=WfnrBcjMst8"&gt;Replacing&lt;/a&gt; 40 of the multiple-choice questions with five open-response questions toughens up the test. The lucky scores on the AAI 4-option question test now cast a shadow over just half of the playing field, from the 15% to the 60% line. A score of 15 can be expected from lucky scores, down from 25, on average. Both 24 and 37 fall about half shaded. They have a quality score of less than 50%. Any score below 50% is a low quality score. 
Right mark scoring (RMS) holds students accountable for their luck on test day as much as, or more than, for what they know or can do.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/S-BCdW7qcXI/AAAAAAAAAGg/coTi53vDwSo/s1600/Slide4.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="150" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/S-BCdW7qcXI/AAAAAAAAAGg/coTi53vDwSo/s200/Slide4.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Psychometricians were not on the side of the students when they included the five open-response questions. However, these questions are, in general, non-functional.&amp;nbsp; The test designed for 100 points actually functions as a test based on 60 points. The functional passing scores are 40% (24) and about 60% (37) out of 60 even though the designed passing scores are 24 and 37 out of 100. Few multiple-choice tests using RMS function as designed. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&amp;nbsp; &lt;/div&gt;&lt;div class="MsoNormal"&gt;The AAI is designed for students to mark their best guess at the “best answer” on each question. Individual student test scores below 50% only have meaning after being averaged into a class or school score ranking.&amp;nbsp;(RMS remains the least expensive way to obtain school rankings.)&amp;nbsp;This&amp;nbsp;research technique fails to apply to individual students. A test score of 37%, on a crippled multiple-choice test (no omit), is also a quality score of 37%. The test is not designed for students to report what they trust they know and can use as the basis for further learning and instruction.
That requires the option missing on tests using RMS: omit (I have yet to learn this).&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;RMS and knowledge and judgment scoring (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) can be combined on the same test as a means of gently nudging students out of the habit of guessing, to reporting what they actually know. The test scores and student counseling matrixes guide students on the path from passive pupil to self-correcting high achiever. There is an additional dimension of information available that is not obtainable with RMS even when using the same test questions.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;(Wallpaper has a third use with RMS. Along with reducing test anxiety, and the variation in lucky score starting positions, it allows KJS to extract ¾ of the quality information lost with RMS. A wallpaper key is added to the answer key and weight key.) &lt;o:p&gt;&lt;/o:p&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The learning cycle shortens as passive pupils become self-correcting high quality achievers. Boring classes become exciting adventures. A multiple-choice test that randomly passes and fails low performing students of equal abilities with RMS becomes a seek-and-find task to report what is meaningful and useful for each student with KJS and Confidence Based Learning (&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;). &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;When students elect to report what they know and trust with KJS or CBL, they receive a quantity score, a quality score and a test score. High quality students obtain individual confirmation that they do know what they know and that they are skilled at using this knowledge regardless of the quantity of right marks. 
&lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;Success&lt;/b&gt;&lt;/span&gt; is doing more of what each student is good at doing. This is in contrast to RMS, where doing more of what low scoring students are doing (guessing right answers) is a continuation of &lt;span style="font-family: 'Arial Bold';"&gt;&lt;b&gt;failure &lt;/b&gt;&lt;/span&gt;(a practice in continually failing schools).&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;a href="http://www.youtube.com/watch?v=WfnrBcjMst8"&gt;Assessment&lt;/a&gt;&amp;nbsp;should produce high quality scores and promote the development of high quality students. CBL differentiates questions into informed, uninformed, misinformed and good judgment to omit, to question, and not make a serious error.&amp;nbsp; KJS sorts questions into expected, difficult, misconception and good judgment to not make a wrong mark and thus report what has yet to be learned. Quality is independent of quantity.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Secretary of Education Arne Duncan’s &lt;a href="http://ed.gov/news/pressreleases/2009/10/10292009.html"&gt;opinion&lt;/a&gt;: “At a time when we should be raising standards to compete in the global economy, more states are lowering the bar than raising it.
We're lying to our children when we tell them they're proficient but they're not achieving at a level that will prepare them for success once they graduate.”&lt;o:p&gt;&lt;/o:p&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-4681599655355877079?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/4681599655355877079/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/05/my-score-quality.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/4681599655355877079'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/4681599655355877079'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/05/my-score-quality.html' title='My Score Quality'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_D9sVIEz0Aho/S-BAwm44q3I/AAAAAAAAAGA/NCr7E4rjxd4/s72-c/Table1.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-3555753320894713286</id><published>2010-05-06T06:03:00.000-07:00</published><updated>2010-05-06T06:03:40.729-07:00</updated><title type='text'>Three Multiple-Choice Games</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/S9sC_JURI-I/AAAAAAAAAFw/fLn_L2ESL8o/s1600/Slide6.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="150" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/S9sC_JURI-I/AAAAAAAAAFw/fLn_L2ESL8o/s200/Slide6.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div align="center" class="MsoNormal" style="text-align: left;"&gt;&lt;div style="text-align: auto;"&gt;&lt;div style="text-align: auto;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 12px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;Three multiple-choice games can be played on the same field. Each has its own rules for scoring and grading. [&lt;a href="http://www.youtube.com/watch?v=Xu2jnQJi-Js"&gt;YouTube&lt;/a&gt;]&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The 2009 Arkansas Algebra I (AAI) end-of-course test has the game field designed with 100 points, the same number as yards on a football field. The field slopes from a swamp down at the left end, where the guessers play, up to dry land, where the 100% goal posts stand.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The number of answer options for each multiple-choice question controls the difficulty of play related to luck. The more options per question, the more skilled the players must be to win and the fewer lucky winners.
Anyone can play when right mark scoring (RMS) is used: students, employees, and animals (the target of the original complete multiple-choice test that included omit).&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/S9sCJ2ZBJkI/AAAAAAAAAFQ/E1--iNbmFHU/s1600/Slide1.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="150" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/S9sCJ2ZBJkI/AAAAAAAAAFQ/E1--iNbmFHU/s200/Slide1.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The Arkansas &lt;a href="http://secc.sedl.org/orc/rr/secc_rr_00094.pdf"&gt;Uniform Grading Scale&lt;/a&gt; rules set the letter grades of D to A at 60 to 90 for traditional right mark scoring (RMS) on classroom tests. The static starting score is set to zero. The hidden active starting score is 25, on average.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9sCyyqoEwI/AAAAAAAAAFY/VrYQ0PZra7A/s1600/Slide3.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="150" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9sCyyqoEwI/AAAAAAAAAFY/VrYQ0PZra7A/s200/Slide3.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;The Arkansas Algebra I (AAI) end-of-course test replaces 40 multiple-choice questions with five 8-point open-response questions. The hidden active starting score is reduced from 25 to 15, on average. The test is now ten points, or one letter grade, more difficult.
A student cannot pass the test by guessing.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_D9sVIEz0Aho/S9sC2ep3v9I/AAAAAAAAAFg/9IBgjNTTvKg/s1600/Slide4.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="150" src="http://2.bp.blogspot.com/_D9sVIEz0Aho/S9sC2ep3v9I/AAAAAAAAAFg/9IBgjNTTvKg/s200/Slide4.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;The active starting score, the lucky score, is hidden at the left end of the playing field in the foggy swamp where the guessers play among the lucky-score trees. The traditional classroom game starts here with lower order thinking skills. Students are encouraged to guess from 5, 4, 3, or 2 options. Only right marks count, as blank and omit have no value with RMS.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Confidence Based Learning (&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;) only plays on dry ground near the goal posts. It uses 3-option questions. It starts play at the 75% (25-yard) line for good judgment, far away from the swamp of shady scores. Mastery players receive points for both their knowledge and their skill in using that knowledge (their judgment). They attempt to reach the 100% goal posts.
They make few, if any, wrong marks.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/S9sC66xi-nI/AAAAAAAAAFo/chw11E63Fv0/s1600/Slide5.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/S9sC66xi-nI/AAAAAAAAAFo/chw11E63Fv0/s320/Slide5.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Knowledge and Judgment Scoring (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) starts play at the 50% (50-yard) line for good judgment. Students functioning at lower levels of thinking can mark every question (which may put them back in the swamp with RMS). Students and employees functioning at higher (all) levels of thinking use the test to report what they trust. Their goal is to make the highest number of right marks with the fewest number, if any, of wrong marks. [&lt;a href="http://www.youtube.com/watch?v=Xu2jnQJi-Js"&gt;YouTube&lt;/a&gt;]&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9sGv6aMwQI/AAAAAAAAAF4/_8cbuQR7gY8/s1600/Chart6.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9sGv6aMwQI/AAAAAAAAAF4/_8cbuQR7gY8/s320/Chart6.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;A universal score board summarizes the rules for the three methods of scoring. Scoring is compared in passive, static, mode after the test is finished; and in active, dynamic, mode during the test. Scoring for KJS and CBL is usually expressed in the active, dynamic, mode as the scoring starts with the value given to perfect judgment, 50% or 75% (no wrong marks have been made at the start of the test).
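&lt;br /&gt;&lt;br /&gt;The RMS and KJS starting scores can be checked with a short sketch. It assumes one reading of the KJS rules described in this post: each question carries one point for knowledge (a right mark) and one point for judgment (kept by a right mark or an omit, lost by a wrong mark). The function names rms_percent and kjs_percent are hypothetical and are not taken from any KJS software; CBL is left out because its point values are not given here.&lt;pre&gt;
```python
def rms_percent(right, total):
    """Right mark scoring: only right marks count."""
    return 100.0 * right / total


def kjs_percent(right, omitted, total):
    """Assumed KJS rule: 2 points per right mark, 1 per omit, 0 per wrong mark."""
    return 100.0 * (2 * right + omitted) / (2 * total)


total = 60  # the sixty multiple-choice questions on the AAI

# No marks made yet: every question is still an omit, so KJS starts at 50%.
print(kjs_percent(0, total, total))

# Marking every question (no omits) gives the same score under both methods.
print(rms_percent(15, total), kjs_percent(15, 0, total))

# Reporting only what is trusted: 30 right, 25 omitted, 5 wrong.
print(rms_percent(30, total), kjs_percent(30, 25, total))
```
&lt;/pre&gt;Under this assumption, the student with 30 right marks, 25 omits and 5 wrong marks scores 50% with RMS and about 70.8% with KJS, because the omits keep the judgment points that guesses would have put at risk.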
&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Scoring for RMS is usually expressed in the passive, static, mode after the test paper has been turned in. This allows resetting the starting score (and the value of judgment) to zero. This has deceptive consequences. Students like the apparent “no risk” feature. They also like the help from lucky marks. What they do not realize is that every wrong mark reduces their lucky score.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Changing from RMS to KJS or CBL is about the same as changing from a tricycle to a bicycle. It is changing from external control and correction to internal control and self-correction; from linear, low order, thinking to include high order, cyclical, thinking.&amp;nbsp; It takes practice; about three experiences.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;It is scary to do something new. Who ever heard of getting one point for a right mark and one point for the good judgment to not make a wrong mark (omit)? It is done on every essay test where students report what they know and trust, and omit what they have yet to learn.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Students quickly like KJS as it saves them time not having to come up with “the best answer” to a question they cannot read or understand. They like to see the quality score confirm what they trust; what they really know and can build on. &lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;They like the freedom to customize the multiple-choice test to match their preparation (a 90% quality score), as they do on most other assessments. This is effective formative assessment as students learn to question, to answer, and to confirm as they are learning in preparation for assessment. 
They are in charge as they develop from passive pupil to self-motivated high achiever.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Teachers benefit too. &lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt; and &lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt; differentiate misconceptions, where students think they know the answer but do not, from just guessing on difficult questions. Students are sorted by their level of thinking (teachable level) as well as by what they know. Each student presents a quantity, a quality, and a test score. You have accurate, honest and fair numbers to support your classroom observations.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Since the three methods of scoring are based on different skills, the Universal Cut Point Raw Score &lt;a href="http://richard-hart.blogspot.com/2010/01/classroom-and-standardized-test-grades.html"&gt;Grade Equalizer&lt;/a&gt; or other methods can be used to assign grades (2009 Arkansas End-of-Course Raw to Scale&amp;nbsp;&lt;a href="http://arkansased.org/testing/pdf/benchmark_rawtoscale_061109.pdf"&gt;Score Conversion Table&lt;/a&gt;&amp;nbsp;and &lt;a href="http://arkedu.state.ar.us/commemos/static/fy0809/attachments/Algebra_1_EOC_Pass_Cut_Score_and_PLD.pdf"&gt;State Law&lt;/a&gt;).&lt;/div&gt;&lt;div class="MsoNormal" style="margin-left: .25in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;All three methods produce the same raw score when examinees fail to exercise good judgment and mark all questions in hope of getting a lucky passing score.
An accurate and honest performance produces the highest score, on average.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-3555753320894713286?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/3555753320894713286/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/05/three-multiple-choice-games.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/3555753320894713286'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/3555753320894713286'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/05/three-multiple-choice-games.html' title='Three Multiple-Choice Games'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_D9sVIEz0Aho/S9sC_JURI-I/AAAAAAAAAFw/fLn_L2ESL8o/s72-c/Slide6.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-1312254937823863799</id><published>2010-04-28T17:17:00.000-07:00</published><updated>2010-04-29T06:49:28.091-07:00</updated><title type='text'>My Lucky Score</title><content type='html'>&lt;div class="MsoNormal" style="text-align: auto;"&gt;&lt;span class="Apple-style-span" style="font-size: 
15px;"&gt;Students and teachers are as interested in what the next test score will be as in the latest test score. Will it be at or above an expected score? What can be expected from luck? [&lt;a href="http://www.youtube.com/watch?v=NT3KvtF_vVk"&gt;YouTube&lt;/a&gt;]&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h6qsnyBBI/AAAAAAAAAE4/dzSKnr4ykSo/s1600/Table2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="139" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h6qsnyBBI/AAAAAAAAAE4/dzSKnr4ykSo/s200/Table2.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The portion of the time each student will be lucky can be obtained from charts in the previous blog. 
These charts show the number of lucky scores obtained when the answer sheets were marked without looking at the test.&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/S9h7VTVIaYI/AAAAAAAAAE8/F9dVhtVE0eQ/s1600/Chart2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="143" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/S9h7VTVIaYI/AAAAAAAAAE8/F9dVhtVE0eQ/s200/Chart2.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The number of lucky scores becomes the expected frequency of lucky scores for each student. The bar graph becomes an uncluttered line graph. 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;On 4-option questions, a student can expect to receive a lucky test score of 15 out of 60, about 1/8&lt;sup&gt;th&lt;/sup&gt; of the time (0.12), by just marking the answer sheet without looking at the test. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h8wyV03AI/AAAAAAAAAFA/S97zhshz0xQ/s1600/Chart3.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="141" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h8wyV03AI/AAAAAAAAAFA/S97zhshz0xQ/s200/Chart3.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Half of the time, the lucky test score is expected to be 15 or less, and half of the time 15 or more. Students can increase their luck by deleting one or more answer options. 
The average lucky score becomes 20 when one option is deleted on each question.&amp;nbsp;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Students can turn luck on and off by the decisions they make and the chances they take. The Arkansas Algebra I (AAI) test contains sixty 4-option multiple-choice questions. How students take the test determines how difficult it will be. If students think of options not on the test, they make the test more difficult: a 4-option question becomes a question with five or more options. They are going in the wrong direction.&amp;nbsp; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h9ZgVFVaI/AAAAAAAAAFE/huw7uBWEiio/s1600/Chart4.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="221" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h9ZgVFVaI/AAAAAAAAAFE/huw7uBWEiio/s320/Chart4.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Rather than picking a right answer, delete wrong answers and then guess. At the other extreme, if students can discard all but two options, on average, they can expect a lucky score of 30 out of the 60 questions, or 50%. [The higher order thinking skills needed to do this are promoted in the classroom by Knowledge and Judgment Scoring (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) and Confidence Based Learning (&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;).
Students do not need to know “the right answers” to beat standardized tests. They need a practiced self-judgment.]&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The expected average score is a stable value between 15 and 20. Where each student’s (my) lucky score will fall relative to that average is not. There is no way to predict each student’s lucky score. That is what makes luck enticing. We can predict the average lucky score and the range in which the lucky score will occur very well. Students can always pass the test with proper preparation.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The inability to predict individual student lucky scores is of little consequence with Confidence Based Learning (CBL), or the ACT and SAT, as chance has little effect at the mastery level of learning and performing. It has a devastating effect when students with similar abilities are selected to pass or fail a test with raw scores below 50%. Using an average score protects teachers and schools. It has taken forced disaggregation of NCLB test scores to prevent low performance by groups smaller than about 30 students from being masked by the high performance of other students. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Fair means chance will distribute scores in a “bell shaped curve” or under the “normal curve of error.” (If there are enough questions on the test. The AAI, with 60 questions, has enough.) The curve has the name “normal” because this is what happens when you know nothing on the test, or mark the test without looking at the test booklet.
It could be called the “know nothing curve”.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;On a multiple-choice test scored only by counting right marks, Right Mark Scoring (RMS), there are no qualification runs to put the best or the worst at the head of the pack. Instead, chance assigns each student a secret handicap, luck, on test day. The student with the least ability in your class may draw 20 points and the next student may only draw 10. This is fair with RMS rules as both students have an equal opportunity to draw. [&lt;a href="http://www.youtube.com/watch?v=NT3KvtF_vVk"&gt;YouTube&lt;/a&gt;]&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Some people believe that tests, especially high-stakes tests, should not be games of chance. They let examinees report what they know, based on their own judgment. Both knowledge and judgment are scored, just as on projects, essays, job assignments, and reports.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Knowledge and Judgment Scored (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) tests and Confidence Based Learning (&lt;a href="http://knowledgefactor.com/"&gt;CBL&lt;/a&gt;) tests give you a quantity, quality and test score.
This form of testing and learning, in the classroom, promotes the student development needed for your students to be winners on any test based on high quality work.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Next, the three games played on a multiple-choice playing field, from traditional RMS (guess testing) to obtaining accurate, honest and fair scores.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-1312254937823863799?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/1312254937823863799/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/04/my-lucky-score.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/1312254937823863799'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/1312254937823863799'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/04/my-lucky-score.html' title='My Lucky Score'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_D9sVIEz0Aho/S9h6qsnyBBI/AAAAAAAAAE4/dzSKnr4ykSo/s72-c/Table2.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-7364246638953944730</id><published>2010-04-26T15:02:00.000-07:00</published><updated>2010-04-28T11:02:02.397-07:00</updated><title type='text'>Multiple-Choice Lucky Scores</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;div align="center" class="MsoNormal" style="text-align: auto;"&gt;&lt;div style="text-align: auto;"&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;The news headlines could have been, “Cheat or Chance” or “Trick or Teach,” this past year. The cut score for passing a multiple-choice test, scored by only counting right marks, continued to fall. 
The traditional multiple-choice test scoring method was being pushed over a credibility limit.&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="text-align: left;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Aug 11: “City students are passing standardized tests just by &lt;a href="http://www.nydailynews.com/ny_local/2009/08/12/2009-08-12_standardized_tests_being_passed_just_by_guessing.html"&gt;guessing&lt;/a&gt;”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Aug 14: “&lt;a href="http://nypost.com/p/news/opinion/opedcolumnists/toughen_the_tests_EF91F0Q4g70eT5y9CCXV1K"&gt;TOUGHEN&lt;/a&gt; THE TESTS”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Aug 17: “&lt;a href="http://gothamschools.org/2009/08/17/guessing-my-way-to-promotion"&gt;Guessing&lt;/a&gt; My Way to Promotion”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Sep 14: “&lt;a href="http://www.nytimes.com/2009/09/14/education/14scores.html?_r-1"&gt;Botched&lt;/a&gt; Most Answers on New York State Math Test? You Still Pass”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Sep 16: “Is any test &lt;a href="http://blogs.ajc.com/get-schooled-blog/2009/09/16/is-any-test-reliable-crct-naep-act-pick-one/#"&gt;reliable&lt;/a&gt;? CRCT? SAT? NAEP? ACT? 
Pick one”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Oct 31: “&lt;a href="http://www.upi.com/Top_News/US/2009/10/31/Duncan-States-set-bar-too-low/UPI-18621256963096/"&gt;Duncan&lt;/a&gt;: States ‘set bar too low’”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Jan 11: “As School &lt;a href="http://www.nytimes.com/2010/01/12/education/12exit.html?pagewanted=1&amp;amp;hp"&gt;Exit Tests&lt;/a&gt; Prove Tough, States Ease Standards”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;The 100-point 2009 Arkansas Algebra I (AAI) end-of-course test, mentioned in the last article, is a good example of how standardized testing actually works:&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo1; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;Items for new AAI versions are trial-tested, in a current operational test, rather than field-tested on a selected sub-sample at a different time.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l1 level1 lfo1; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;A statewide &lt;a href="http://secc.sedl.org/orc/rr/secc_rr_00094.pdf"&gt;Uniform Grading Scale&lt;/a&gt; is monitored for &lt;a href="http://www.arktimes.com/blogs/arkansasblog/graderelease.doc"&gt;inflation&lt;/a&gt; by comparing the pass rate in school with the pass rate on the AAI.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" 
style="mso-list: l1 level1 lfo1; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;Arkansas has had a nearly perfect yearly increase in the AAI test score for the past nine years (&lt;a href="http://arkansased.org/communications/pdf/assessment_071309.pdf"&gt;see page 24 of 28&lt;/a&gt;).&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The multiple-choice portion of the test is played on the traditional field of varying quality. At the high end, everyone knows what the examinee knows or can do, including the examinee. The scoring in Confidence Based Learning (&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;) plays in this region, as do the SAT and ACT when used to pick top-quality winners.&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Traditional Right Mark Scoring (RMS), used on the AAI, is played at the other, lower, end of the field. The examinee guesses and waits for the test score, and even then no one knows what the student knows or can do, including the examinee.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;Knowledge and Judgment Scoring (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) permits students to individualize their test to match their preparation. They can opt for RMS or for KJS. They can opt for the teacher to tell them what they have right, or for reporting what they know and trust is right. 
They can opt for lower or higher-order thinking.&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="color: #535353; font-family: ArialMT;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span class="Apple-style-span" style="color: #535353; font-family: ArialMT; font-size: 12px;"&gt;&lt;a href="http://www.youtube.com/watch?v=k6-6Pj0DkVU"&gt;&lt;/a&gt;&lt;span class="Apple-style-span" style="color: black; font-family: 'Lucida Grande'; font-size: 15px;"&gt;Chance plays almost no part in CBL. Chance is the main determiner of lucky scores.&amp;nbsp;&lt;a href="http://www.youtube.com/watch?v=k6-6Pj0DkVU"&gt;[YouTube]&lt;/a&gt;&amp;nbsp;&amp;nbsp;This holds for any test using RMS, including the SAT, ACT, and end-of-course tests.&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The effects of unaltered pure chance can be seen on tests such as the AAI when:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l2 level1 lfo2; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;The answer sheets are marked randomly without      looking at the test booklet.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l2 level1 lfo2; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;The answer sheets have no erasures.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l2 level1 lfo2; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;No marking pattern is used such as wallpapering.      
Wallpapering reduces test anxiety by having students agree, before the test, on how they will mark forced-choice guesses (when they have finished reporting what they know and trust, but must not leave blanks).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l2 level1 lfo2; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;Student judgment is absent or is given no value (RMS).&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;There are several ways to see the effects of chance on multiple-choice tests:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol start="1" style="margin-top: 0in;" type="1"&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo3; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;Randomly mark 100 AAI answer sheets for the 60 multiple-choice questions.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo3; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;Use a &lt;a href="http://www.merriam-webster.com/cgi-bin/audio.pl?quincu01.wav=quincunx" title="Pronunciation"&gt;quincunx&lt;/a&gt; board.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;li class="MsoNormal" style="mso-list: l0 level1 lfo3; tab-stops: list .5in;"&gt;&lt;span style="font-size: 11pt;"&gt;Use the Excel function: BINOMDIST.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The &lt;a href="http://www.jcu.edu/math/isep/Quincunx/Quincunx.html" title="Online Activity"&gt;quincunx&lt;/a&gt; board allows you to see chance in action: the force behind what is called creativity in Arts, Letters, and Politics, and error in Science, Math and Engineering. The quincunx board works well for normal classroom tests with about 25 students (balls) and 8 questions (9 bins). 
(Number each student. Run slowly. Have each student follow his/her ball as it falls into a bin. Repeat and compare results for an added effect.) &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The Excel function BINOMDIST can be set for almost any number of students and questions. A set of 100 answer sheets produces a surprisingly uniform distribution even though the right answer is expected by chance but 1/4&lt;sup&gt;th&lt;/sup&gt; of the time.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_D9sVIEz0Aho/S9YIfsRBICI/AAAAAAAAAEk/7i0okZTeJxo/s1600/Table2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="223" src="http://4.bp.blogspot.com/_D9sVIEz0Aho/S9YIfsRBICI/AAAAAAAAAEk/7i0okZTeJxo/s320/Table2.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;The graph of 4-option questions shows that no student can expect to pass the AAI by guessing. 
Classroom passing is set equal to 24 raw score points out of 100 points in Arkansas. The maximum lucky score on the sixty 4-option questions was 23, and that happened for only about 1 out of 100 students. The required passing cut score of 37 points for graduation in Arkansas is far beyond the reach of lucky scores.&amp;nbsp;&lt;a href="http://www.youtube.com/watch?v=k6-6Pj0DkVU"&gt;[YouTube]&lt;/a&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9YK1NOoQJI/AAAAAAAAAEs/C4rLK-vlSyc/s1600/Table3.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S9YK1NOoQJI/AAAAAAAAAEs/C4rLK-vlSyc/s320/Table3.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span style="font-size: 11pt;"&gt;But students can alter these results by exercising 
higher-order thinking skills. If students can, on average, discard one option on each question, they are then working with a 3-option question test. The classroom test equivalent of 24 raw score points can be passed with lucky scores. Some 17 (6 + 4 + 3 + 2 + 1 + 1) out of 100 students passed by guessing from the remaining three options. Students who do this are often referred to as “test wise.”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span class="Apple-style-span" style="color: #535353; font-family: ArialMT; font-size: 12px;"&gt;&lt;span class="Apple-style-span" style="color: black; font-family: 'Lucida Grande'; font-size: 15px;"&gt;Students, teachers, test makers, and administrators can manipulate the effects of chance, for their benefit, in other ways.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;span class="Apple-style-span" style="font-size: 15px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-7364246638953944730?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/7364246638953944730/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/04/multiple-choice-lucky-scores.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/7364246638953944730'/><link 
rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/7364246638953944730'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/04/multiple-choice-lucky-scores.html' title='Multiple-Choice Lucky Scores'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_D9sVIEz0Aho/S9YIfsRBICI/AAAAAAAAAEk/7i0okZTeJxo/s72-c/Table2.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-9218382103316395258</id><published>2010-01-20T12:03:00.000-08:00</published><updated>2010-01-20T12:08:36.515-08:00</updated><title type='text'>Classroom and Standardized Test Grades</title><content type='html'>&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Does a grade, or cut point, tell us what happened or just the appearance, that a politician or an administrator wants to give, of what happened?&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Even a simple question, “Why did I get the same grade on my math test as another student got on a government test when our test scores differed by more than ten percentage points?” has &lt;a href="http://www.youtube.com/watch?v=VfAiWUwwnwI"&gt;no simple answer&lt;/a&gt;.&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;A Universal Cut Point Raw Score Grade &lt;a href="http://www.multiplechoicescoring.org/blog2/graphics.htm"&gt;Equalizer&lt;/a&gt; helps put things into perspective:&lt;br /&gt;&lt;br /&gt;&lt;div 
class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/S1dcBLqrjuI/AAAAAAAAAEU/XZI_D5_m1oc/s1600-h/Equalizer.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="201" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/S1dcBLqrjuI/AAAAAAAAAEU/XZI_D5_m1oc/s400/Equalizer.jpg" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Most teachers who just count right marks use a 10-point range scale, as it is easy to remember the cut points of 90, 80, 70, and 60%. Every student can earn any letter grade (all can be A’s if all have mastered the assignment).&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Other teachers use the average test score to select a scale for assigning grades. Average test scores ranging from 70% up to 92.5% produce a range of grades for a raw score of 88%. It is an A on a 12-point range scale, a B on a 6-point scale, a C on a 4-point scale, and a D on a 3-point scale.&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;There are many ways to assign grades. In general, a test score below 80% means the student is not keeping up with the course and will not be prepared for the next course, whatever grade is assigned.&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Right mark scoring (RMS) grades are easily manipulated by the selection of questions, question difficulty, and cut points in the classroom and on standardized tests. Lowering the cut point to 40% (a range scale of 15 points and a quality score of 40%) ensures that a portion of students will pass by luck alone. There is no way to know what the student actually trusted as a basis for further learning and instruction. 
&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;Knowledge and Judgment Scoring (&lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt;) and Confidence Based Learning (&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt;) value judgment (quality) independently from knowledge. The student is in charge of reporting what he can trust and what he has yet to master. KJS and CBL reward students for taking the responsibility to learn beyond the concrete level. They are rewarded for learning, anywhere and anytime, not just in class.&amp;nbsp; They ask questions, get help, and put in the time needed to master the assignment. It feels good to have mastered a clearly stated and understandable assignment.&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;KJS and CBL grades are not easily manipulated since there is a score for what is known and the degree to which it can be trusted. The grades reflect what self-motivated achievers are doing rather than how lucky passive pupils were on test day.&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;In my opinion, one of the main reasons schools show a marked increase in RMS standardized test scores one year and no further increase in the following years is that passive pupils can only be pushed so far in traditional classrooms. Student development to produce self-motivated achievers, functioning at all levels of thinking, is needed to go further. These are the graduates who are successful in what they do next in school and beyond.&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;There are many ways for schools to promote mastery, and not just the appearance of mastery. &lt;a href="http://www.nine-patch.com/"&gt;KJS&lt;/a&gt; is a bridge to mastery. 
&lt;a href="http://www.knowledgefactor.com/"&gt;CBL&lt;/a&gt; guarantees mastery.&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-9218382103316395258?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/9218382103316395258/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2010/01/classroom-and-standardized-test-grades.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/9218382103316395258'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/9218382103316395258'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2010/01/classroom-and-standardized-test-grades.html' title='Classroom and Standardized Test Grades'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_D9sVIEz0Aho/S1dcBLqrjuI/AAAAAAAAAEU/XZI_D5_m1oc/s72-c/Equalizer.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6676724996771468267.post-5202320077805400621</id><published>2009-10-20T16:19:00.000-07:00</published><updated>2009-10-26T15:39:22.779-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multiple-choice'/><category scheme='http://www.blogger.com/atom/ns#' term='bubbles'/><category 
scheme='http://www.blogger.com/atom/ns#' term='bubbling'/><title type='text'>Multiple-Choice Bubbling</title><content type='html'>Students first meet multiple-choice by receiving a #2 pencil and a sheet printed with a variety of boxes, circles or ovals to be darkened (unless they can mark directly on the test). The NCLB game is to have one properly darkened mark on one answer option for each question in the allotted time, at lower levels of thinking (LLOT) characteristic of NCLB testing. Secondly, for marginal students, the mark should hopefully be a right answer.  &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuHxmNdnKHI/AAAAAAAAAC0/7qnbvAALBEk/s1600-h/Screen+shot+2009-10-23+at+9.08.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuHxmNdnKHI/AAAAAAAAAC0/7qnbvAALBEk/s200/Screen+shot+2009-10-23+at+9.08.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuHyV_gQs2I/AAAAAAAAAC8/mBeCpN2ACKA/s1600-h/229627-4-2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuHyV_gQs2I/AAAAAAAAAC8/mBeCpN2ACKA/s320/229627-4-2.jpg" /&gt;&lt;/a&gt;Large bubble sizes were used to improve the accuracy of older answer sheet readers. 
&lt;a href="http://www.teachervision.fen.com/tv/printables/0134356500_m1mufm46.pdf"&gt;[teachervision]&lt;/a&gt; More time was needed to fully darken the outlined space [&lt;a href="http://www.youtube.com/watch?v=mlCpJ2efC30"&gt;YouTube&lt;/a&gt;].&amp;nbsp;“… because of such onerous requirements, test takers may become preoccupied with filling the bubbles correctly and less able to focus on the substantive tasks at hand.” &lt;a href="http://www.patentstorm.us/patents/7068861/description.html"&gt;[patentstorm]&lt;/a&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuJFtjT09pI/AAAAAAAAAEM/f_zmVz7UDsY/s1600-h/220610Aa.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuJFtjT09pI/AAAAAAAAAEM/f_zmVz7UDsY/s320/220610Aa.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/SuJFaqYynxI/AAAAAAAAAEE/Zx8ABTe8tCk/s1600-h/220610Aa_2.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/SuJFaqYynxI/AAAAAAAAAEE/Zx8ABTe8tCk/s320/220610Aa_2.jpg" /&gt;&lt;/a&gt;Many solutions to this problem now exist: [Vertical ovals, &lt;a href="https://store.scantron.com/"&gt;Scantron&lt;/a&gt; Form 229627] [Small circles and with intermediate student spacing, &lt;a href="https://store.scantron.com/"&gt;Scantron&lt;/a&gt; Form 220610] (Both sets of circles have the same space to darken in each circle.)&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;a href="http://1.bp.blogspot.com/_D9sVIEz0Aho/SuIGONWgebI/AAAAAAAAAD0/Oc_ffbXbKo8/s1600-h/Apperson25090.jpg" imageanchor="1" style="clear: left; display: inline !important; float: left; margin-bottom: 1em; margin-right: 
1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_D9sVIEz0Aho/SuIGONWgebI/AAAAAAAAAD0/Oc_ffbXbKo8/s320/Apperson25090.jpg" /&gt;&lt;/a&gt;&lt;a href="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuIGhzPbfwI/AAAAAAAAAD8/n56wmEu3HW0/s1600/Apperson23020.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_D9sVIEz0Aho/SuIGhzPbfwI/AAAAAAAAAD8/n56wmEu3HW0/s320/Apperson23020.jpg" /&gt;&lt;/a&gt;The most novel solution is a large letter replacing the large bubble with a smaller letter inside. The reduced bubble now only covers the target area the optical mark reader actually sees. [Normal, &lt;a href="https://ssl1.appersonsecure.com/pdfs/common/25090.PDF"&gt;Apperson&lt;/a&gt; Form 25090; Large, &lt;a href="https://ssl1.appersonsecure.com/pdfs/common/23020.PDF"&gt;Apperson&lt;/a&gt; Form 23020] &amp;nbsp;The original relationships between bubble and letter have been inverted for young students, older adults, and employees.&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;The loss of meaningful information available from these tests, by only counting right marks, has also been solved by an inversion: inverting who is responsible for determining if a mark is right or wrong from teacher or examiner to student or examinee. 
[&lt;a href="http://www.nine-patch.com/"&gt;Nine-Patch Multiple-Choice&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="http://www.knowledgefactor.com/"&gt;Knowledge Factor&lt;/a&gt;]&amp;nbsp;A multiple-choice test now takes on the same characteristics as other assessments involving judgment, by requiring students to function at all levels of thinking, while adding rapid scoring and detailed student or examinee counseling reports.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6676724996771468267-5202320077805400621?l=richard-hart.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://richard-hart.blogspot.com/feeds/5202320077805400621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://richard-hart.blogspot.com/2009/10/multiple-choice-bubbling.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/5202320077805400621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6676724996771468267/posts/default/5202320077805400621'/><link rel='alternate' type='text/html' href='http://richard-hart.blogspot.com/2009/10/multiple-choice-bubbling.html' title='Multiple-Choice Bubbling'/><author><name>Richard Hart</name><uri>http://www.blogger.com/profile/04962997526156185761</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='http://2.bp.blogspot.com/_D9sVIEz0Aho/SPYhfcx0BCI/AAAAAAAAAA0/BNYxI4ytqCY/S220/Page0002a.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_D9sVIEz0Aho/SuHxmNdnKHI/AAAAAAAAAC0/7qnbvAALBEk/s72-c/Screen+shot+2009-10-23+at+9.08.jpg' height='72' width='72'/><thr:total>0</thr:total></entry></feed>
