TESTING MEMO 3: ESSAY TESTS

by Lawrence H. Cross
Virginia Polytechnic Institute and State University

In contrast to "multiple-guess" tests, which are often viewed as little more than games of chance, essay tests are considered by many to represent the no-nonsense approach to testing. Whereas multiple-choice tests may serve to measure recognition of factual information, only essay tests can, they say, be depended upon to assess the higher levels of learning. Moreover, the popularity of essay tests at all levels of education attests, in part, to the tremendous faith educators have in them.

At a more operational level, anyone who has had to grade a set of essay tests has no doubt experienced some anxiety over the correctness of the many judgements required, regardless of the method used to determine scores and assign grades. That errors of judgement are likely to enter this process is well documented in the research literature.

The process of grading essay tests is particularly difficult if one attempts to award scores on the basis of what has been learned about a subject without being influenced by the quality of writing. But because the ability to write is such an important skill, it is frequently argued that it is reasonable and even desirable to consider the quality of writing when grading essay tests. Yet few would advocate adjusting essay test grades in, say, history or geography as a function of a student's quantitative ability, even though quantitative skills can be just as important as verbal skills. Ironically, however, it seems quite reasonable to bias the grades from such tests in favor of those with higher verbal ability. We are not suggesting that writing skills should be ignored by all except teachers of English; indeed, each instructor should exercise professional judgement as to the importance of writing in each course and plan written assignments accordingly.
What we do suggest is that, in most courses, when one wishes to test achievement in a particular area, the test scores should not be biased by the examinees' ability to write.

For these and other reasons, few specialists in educational measurement would recommend the use of essay tests over other more objective testing methods except when learning to write well extemporaneously or under pressure is a legitimate course goal. Such a goal would be appropriate in some English, journalism and business communication courses but certainly not in a very broad range of high school or college-level courses.

Some instructors claim to reduce or eliminate the problem of bias due to variation in verbal ability by attempting to evaluate only the content of essay test answers without regard to adequacy of the writing itself. Typically, answers are checked for inclusion of specific information or arguments, and points are awarded accordingly. We do not recommend this approach because (1) indirectly, it may encourage careless writing, (2) research has shown that quality of writing is likely to influence scoring in spite of intentions to the contrary, and (3) an objectively scoreable test could easily be substituted in most cases. In fact, we strongly recommend objective or problem-solving tests for most academic courses. At the same time, we recommend that written work be completed by students under conditions conducive to good writing and that the results be subjected to scrupulous and detailed criticism.

Nevertheless, despite these arguments and widespread and persistent criticism of essay tests by measurement specialists since the turn of the century, many teachers still favor the essay over objective testing methods. Therefore, though recommending against the use of essay tests under most circumstances, we will also adopt a positive approach by offering a few well-considered recommendations that may foster more effective use of essay tests.

1. Have the examinees sign their names on every page of the examination. This recommendation is designed to relieve those nagging doubts which arise when you are trying to decide on the merits of an original answer. Moreover, it will tend to enhance the reliability of the resulting scores by introducing a systematic source of variance associated with what you already knew about the examinee before the question was written. It also works against grade inflation by ensuring that your worst students get the grades they deserve.

2. Grade all answers for a given examinee before going on to the next paper. This practice prevents all the needless paper shuffling required if you were to grade all answers to a single question before grading the next question. A liability of this method is that you may be influenced or distracted by an examinee's answer to a previous question, but this shouldn't matter if you use a holistic approach wherein the grade assigned is based on an overall opinion of the examinee's ability. Finally, by the time you get to the next person's paper, you will have had time to forget the previous person's answer to the first question, thus assuring greater independence of scores assigned across examinees. Also, this approach tends to prevent boredom.

3. Be vague about what examinees should do when they don't know an answer. It is hard to justify this recommendation until you consider the possibilities for an alternative. If you hold honest admission of ignorance in high esteem, you may direct students to refrain from rephrasing the question in their own words and embellishing upon this with inappropriate but related verbiage. However, considering human nature, it seems likely that at least a few students may not adopt your philosophy or your advice and try to fake it in the hope of duping you into giving them partial credit. (Even two points out of ten points is better than a score of zero for a blank page.)
Recognizing that this approach may unfairly penalize the timid or obedient examinee, you may instead instruct students to "bluff" and answer even if totally ignorant. Though this strategy will tend to minimize the influence of personality on scores, it will probably make your task of grading more difficult and will certainly increase the amount you have to read. For these reasons it is probably best to be vague.

4. Unstructured questions are to be preferred over questions focusing on a specific problem. The ability to demonstrate creativity and original thought can be demonstrated most effectively by having the students respond to broad topics. (Don't worry about what they were expected to learn.) Thus a question which asks the examinees to describe the solar system would provide greater opportunity for scholarly writing than would a narrowly focused question which might ask the examinees to explain how Bode's law helped lead to the discovery of asteroids. Moreover, broad questions allow you to test an entire unit of instruction with only one or two questions, thus effecting a substantial saving of time and also forestalling the often-heard criticism that the test did not cover what the students studied.

5. Offer a choice of questions on which to write. With each student answering a different set of questions, the tedium of grading is reduced considerably. Don't worry about the fact that scores arising from different selections of questions aren't comparable. In fact, defending your basis for grading is actually made easier, because the students have less opportunity to compare how you graded a given question from one student to another. In the same vein, students who get low grades are more hard pressed to blame the available list of test questions. You can always say, "Well, I tried to give a selection that would be helpful to everyone."

6. Impose rigid time constraints and require the students to write in ink.
There's nothing like a little pressure to separate the men from the boys, and requiring ink gives more opportunities to take off points for messiness when you can think of no other reason for an answer you didn't like. If students offer objections to your essay tests, simply tell them that essay tests provide the best way to learn to write well and that they're lucky you are willing to go to the trouble when a multiple-choice test would be so much easier for you.

7. Invite students to see you individually if they would like to discuss an answer to any question. This practice will help you avoid those embarrassing moments that arise in class when a student challenges you to explain why you awarded his answer two points fewer than Sally's answer to the same question. It is always better to explain the finer distinction in scholarship you were looking for in an answer in the privacy of your office than in front of the whole class.

8. IGNORE RECOMMENDATIONS ONE THROUGH SEVEN.

For more information, contact Bob Frary at

Robert B. Frary, Director of Measurement and Research Services
Office of Measurement and Research Services
2096 Derring Hall
Virginia Polytechnic Institute and State University
Blacksburg, VA 24060
703/231-5413 (voice)
frary@vtvm1.cc.vt.edu