THE GREAT ESSAY/MULTIPLE CHOICE DEBATE: DIFFERENT STROKES FOR DIFFERENT FOLKS

By Marilla Svinicki and Bill Koch
University of Texas at Austin

(Reprinted from _INNOVATION ABSTRACTS_, February 3, 1984)

There has been a long-standing debate on academic testing--which is better, essay testing or multiple choice testing? Anyone who has been in education as long as we have knows that whenever a question is phrased that way, the answer is "neither," followed by a long string of qualifiers. So it should come as no surprise that this is an appropriate answer to our question as well and that our purpose here is to present the qualifiers.

The first and most important qualifier in determining which question type is preferable is the instructional objective being tested. To coin a phrase, "you can't weigh a duck with a yardstick." The measure you choose must match the purpose of the measurement, if not physically, at least conceptually. So if your purpose is to measure the student's grasp of basic facts, the type of test items you choose should require just that and nothing more. If, on the other hand, you want to measure a student's ability to communicate facts concisely, you need to choose a totally different task for the test.

None of this is all that startling, but it does require a commitment on the part of the instructor to define exactly what the course objectives are. Then, for the tests to have content validity, there must be a close correspondence among the course objectives, the instructional activities, and the test items used to evaluate the learning that took place.
And those course objectives must be stated not just in vague terms like "an understanding of the principle of one man, one vote," but more specifically, like "recognizing examples of situations where the one man, one vote principle should be applied" or "predicting how the principle of one man, one vote might affect election outcomes under a variety of conditions" or "proposing solutions to problems of election reform which reflect the one man, one vote principle." These three alternative interpretations of "an understanding of the one man, one vote principle" are all valid, yet they require different types of testing. So the first task of any test designer _must_ be to match the instructional objectives to the types of items on the test, laying out a blueprint which charts the course content against the level of understanding desired.

-Theoretical Differences-

The second set of qualifiers for choosing item types falls under the heading of differences in theoretical properties between the essay item and the multiple choice item. The first of these qualifiers involves the level of cognitive processing and memory required by the two item types. In essay items, the student must use the cues present in the question to search through his memory and _recall_ all the relevant information, use that information to structure a correct response, and write it down. In multiple choice items, the student needs to _recognize_ a correct choice by matching the alternatives to a correct answer that already exists in memory, by generating a correct answer and comparing the alternatives to it, or by evaluating each alternative as a potential answer and trying to reason from the question to that answer. The advantage of the multiple choice over the essay, from the student's perspective, is that there is an immediate check on memory. If none of the alternatives match his preferred response, he knows that his choice is faulty (except where "none of the above" is an option).
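As a rough illustration of the blueprint idea, the sketch below charts content areas against levels of understanding and tallies the items planned at each level. The level names loosely follow the three example objectives (recognizing, predicting, proposing). The content areas, the item counts, and the rule of thumb pairing "propose" objectives with essay items are all assumptions made for illustration, not recommendations.

```python
# Sketch of a test blueprint: content areas charted against levels of understanding.
# Every content area, level name, and item count here is an illustrative assumption.
from collections import defaultdict

# Levels taken loosely from the three example objectives (recognize, predict, propose).
LEVELS = ("recognize", "predict", "propose")

# (content area, level) -> number of items planned for that cell (hypothetical figures).
BLUEPRINT = {
    ("one man, one vote", "recognize"): 4,
    ("one man, one vote", "predict"): 2,
    ("one man, one vote", "propose"): 1,
    ("election reform", "recognize"): 3,
    ("election reform", "propose"): 1,
}


def suggested_item_type(level):
    """Assumed rule of thumb: open-ended 'propose' objectives get essay items."""
    return "essay" if level == "propose" else "multiple choice"


def totals_by_level(plan):
    """Sum the planned items in each level column of the blueprint."""
    totals = defaultdict(int)
    for (_area, level), count in plan.items():
        totals[level] += count
    return dict(totals)


def format_blueprint(plan):
    """Return the blueprint as plain-text rows, one per content area."""
    areas = sorted({area for area, _level in plan})
    rows = []
    for area in areas:
        cells = [f"{level}={plan.get((area, level), 0)}" for level in LEVELS]
        rows.append(f"{area}: " + ", ".join(cells))
    return rows


if __name__ == "__main__":
    for row in format_blueprint(BLUEPRINT):
        print(row)
    for level, total in totals_by_level(BLUEPRINT).items():
        print(f"{level}: {total} items, suggested type: {suggested_item_type(level)}")
```

Laying the plan out this way makes gaps visible before any items are written: an empty cell means an objective is not being tested at that level.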

Along similar lines is another, more empirically based characteristic to consider: the testing time spent on _thinking versus production_. Even if complex multiple choice items do indeed require the same initial thought process as essay items, there is a distinct difference in the amount of time needed for answer production. We often forget that a certain amount of time is needed simply to write the answer out for an essay, over and above the time needed to compose it. In multiple choice items, the time can be devoted almost exclusively to thinking about the answer and the alternatives. Therefore, an instructor should consider where the student's time would be spent most effectively.

A third characteristic of these two types is their differential sensitivity to learning. Because the amount of information present in the item is greater in a multiple choice item than in an essay item, there is more information which can trigger the memory of the correct response. Therefore, multiple choice items are more sensitive to smaller amounts of specific learning (as evidenced by their susceptibility to cramming). On the other hand, essay items allow the student to throw in everything he knows about a subject; and while these items may be more sensitive to breadth of learning, they are more vulnerable to padding or skirting the question.

A fourth typical characteristic, though not one which is inherent in the item format itself, is the difference in cognitive complexity which is usually demanded by each type of item. In general, essay items tend to be used to measure more complex learning, such as analysis or synthesis, while multiple choice items are used more for basic comprehension and application. It is possible to use multiple choice items to probe for more complex understanding, but it is difficult and time-consuming to write this type of test item.

A fifth characteristic which distinguishes these two types of items is that essay items require more than content knowledge. It has been frequently demonstrated that even the most sophisticated scorer is not immune to the effects of communication skills in evaluating essay answers. The same can be a problem from the student's point of view as well. The confounding of communication skills with content knowledge can be a bane or a boon, depending upon the student's facility with each. The instructor must remember that the confounding exists both at the time of testing and at the time of scoring.

A last theoretical difference between these two types of items may be a psychological one. Research evidence indicates that students view the content, and the appropriate way to prepare for an exam, differently depending upon the type of exam being given. Essay exams encourage concentration on general concepts, while multiple choice exams encourage specificity and concentration on facts. This may be an artifact of the tendency of instructors to use essay items for more general concepts while using multiple choice items for specifics; however, until that trend is reversed, students will continue to approach the two types of exams that way.

-Technical Differences-

From a measurement point of view, these two types of items are quite distinct. The first difference between the two types lies in their respective abilities to allow for adequate sampling of the content to be tested. Since multiple choice items require less time to answer than do essay items, more items can be included in an exam; therefore, more areas can be tested. The wider the sample range, the more valid and reliable a single test can be.

A second consideration is test reliability. Multiple choice tests, in general, are more reliable than essay tests. The greater reliability is primarily a function of the required grading procedures.
Essay items are subject to variability due to errors or influences on grading as well as achievement level differences among the test takers, while the variability of multiple choice items comes mostly from differences among test takers. It is possible to train essay graders to be more consistent and reliable, but such training occurs infrequently.

Finally, multiple choice items lend themselves to statistical analysis for evaluation and improvement purposes. Measurement and evaluation centers can assess the effectiveness of test items or provide instructors with information for carrying out their own analyses. Although comparative analyses can be done on essay items, the procedures must be done by hand and are far less reliable.

-Practical Differences-

There are two other practical differences between these two item types. Perhaps the most significant consideration in choosing a test type is the time commitment involved in producing, administering, and grading the exam. Multiple choice tests are very time-consuming to construct, but they take little time to grade. Also, the time during testing is devoted almost exclusively to thought processes rather than the production of an answer. Essay tests are less complicated to construct, but they are time-consuming to grade. During testing, a significant proportion of student time is devoted to production of the answer.

A second consideration is the potential for cheating and guessing. Multiple choice items require that the instructor be particularly alert to both behaviors; they are easily accomplished and difficult to spot. Essay items are much less likely to foster either type of behavior, although students may try to "pad" their way through an essay item without knowing the subject.

-Making the Choice-

It is important to remember that item types should match test objectives.
Using a variety of item types serves students well; those who are stronger in one response type than another will have equal opportunities to perform well. Finally, it is important to consider using a variety of evaluation vehicles for assessing student learning. Putting all of your instructional eggs in one basket is like trying to document teaching effectiveness on the basis of student evaluations alone.

[From the Center for Teaching Effectiveness _Newsletter_, Vol. 5, No. 2, November 1983]

For further information contact

Marilla Svinicki
Center for Teaching Effectiveness
2202 Main Building
University of Texas at Austin
Austin, TX 78712
512/471-1488