
A peer-reviewed electronic journal. ISSN 1531-7714

Permission is granted to distribute this article for nonprofit, educational purposes if it is copied in its entirety and the journal is credited. Please notify the editor if an article is to be used in a newsletter.

Practical Assessment, Research & Evaluation, 3(5). Retrieved August 18, 2006 from http://edresearch.org/pare/getvn.asp?v=3&n=5
## Resampling: A Marriage of Computers and Statistics

Lawrence M. Rudner, Clearinghouse on Assessment and Evaluation
Suppose your superintendent asked you to determine whether voucher students are doing better than non-voucher students in your district's elementary schools. You might perform a simple t test or an analysis of variance to find your answer. You would report mean differences and probability levels. And if your superintendent is like most non-statisticians, he or she would accept the magic of statistics without questioning the validity of the assumptions made to use the t test.

Thanks to advances in computer technology, educational researchers are beginning to use simpler statistical methods. These techniques let us empirically address a wider range of questions with smaller data sets and with fewer, less restrictive assumptions. Using such techniques, we can focus on reasoning and on understanding the data, not on complicated formulas and tables. The techniques promise to make statistics a useful, easily learned tool for educational policy makers and researchers.

This article introduces computationally intensive statistics, collectively called resampling techniques. After defining these statistics, we'll use one technique to answer our opening question. We'll then present the arguments for and against resampling.
Resampling is simply a process for estimating probabilities by conducting vast numbers of numerical experiments. Today, resampling is done with the aid of high speed computers. In Science News, Peterson (1991) compares resampling techniques to the trial-and-error way gamblers once used to figure odds in card or dice games. Before the invention of probability theory, gamblers would deal out many hands of a card game to count the number of times a particular hand occurred. Thus, by experimentation, gamblers could figure the odds of getting a certain hand in their game.

Probability theory freed researchers from the drudgery of repeated experiments. With a few assumptions, researchers could address a wide range of topics. While the advances in statistics paved the way for elegant analysis, the costs came high:

- We could analyze only certain types of statistics, such as the mean and standard deviation.
- We had to make certain assumptions, like the normality assumption, about the underlying distribution.
- And researchers needed specialized training to apply, understand, and appreciate statistics.
But resampling techniques overcome all these limitations today:

- We can analyze virtually any statistic.
- We don't have to make any assumptions about the distribution of the data.
- And the techniques are easy to understand.
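The gamblers' deal-and-count approach described above can be sketched in a few lines of Python. This is only an illustration: the choice of event (a five-card hand in which at least two cards share a rank), the number of deals, and the names `has_repeated_rank` and `estimate_odds` are assumptions made for the example, not part of the original article.

```python
import random
from collections import Counter

RANKS = "23456789TJQKA"
SUITS = "CDHS"
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # 52 cards

def has_repeated_rank(hand):
    """Return True if at least two cards in the hand share a rank."""
    rank_counts = Counter(card[0] for card in hand)
    return max(rank_counts.values()) >= 2

def estimate_odds(event, n_deals=20_000, hand_size=5, seed=0):
    """Deal many random hands and return the fraction in which `event` occurs."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    hits = sum(event(rng.sample(DECK, hand_size)) for _ in range(n_deals))
    return hits / n_deals

estimate = estimate_odds(has_repeated_rank)
print(f"Estimated chance of a repeated rank: {estimate:.3f}")
```

For this particular event, probability theory gives a value of about 0.493, so the dealt-out estimate can be checked against the formula. The appeal of the experimental route is that it works the same way for events where no convenient formula is at hand.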
All resampling techniques rely on the computer to generate data sets from the original data. The techniques differ, however, in how they generate the data sets. Four techniques are important:

- the bootstrap, invented by Bradley Efron;
- the jackknife, invented by Maurice Quenouille and later developed by John W. Tukey;
- cross-validation, developed by Seymour Geisser, Mervyn Stone, and Grace G. Wahba; and
- balanced repeated replication, developed by Philip J. McCarthy.
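To make "how they generate the data sets" concrete, here is a minimal sketch of two of the four techniques in their usual textbook form: the bootstrap draws new data sets of the original size with replacement, while the jackknife leaves out one observation at a time. The scores are made up for illustration, and the function names are assumptions of this example. Cross-validation and balanced repeated replication are not shown.

```python
import random
import statistics

def bootstrap_samples(data, n_resamples, seed=0):
    """Draw data sets of the original size, sampling WITH replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [[rng.choice(data) for _ in range(n)] for _ in range(n_resamples)]

def jackknife_samples(data):
    """Build one data set per observation, each leaving that observation out."""
    return [data[:i] + data[i + 1:] for i in range(len(data))]

scores = [72, 85, 90, 64, 78, 88]  # hypothetical test scores

# Any statistic can be recomputed on each generated data set; here, the median.
boot_medians = [statistics.median(s) for s in bootstrap_samples(scores, 1000)]
jack_medians = [statistics.median(s) for s in jackknife_samples(scores)]

print("Spread of bootstrap medians:", statistics.stdev(boot_medians))
print("Jackknife medians:", jack_medians)
```

The median is used here because it is the kind of statistic that is awkward to handle with classical formulas, yet it costs nothing extra to recompute on each generated data set.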
Back to the question comparing the grades of voucher and non-voucher students: Using the bootstrap technique, we can empirically construct the distribution of mean grade differences for students in these two groups. If the observed difference is unusual, then we would reject the null hypothesis that grades are unrelated to voucher status. For simplicity, let's assume that the district has 13 voucher students and 39 non-voucher students, and the mean difference is 10 standard score units. To empirically construct the distribution, we'd follow these steps:

1. Create a data base with all the student grades.
2. Randomly sort the data base.
3. Compute the mean for the first 13 students.
4. Compute the mean for the other 39 students.
5. Record the test statistic--the absolute value of the mean difference.
6. Then repeat steps 2 through 5 many times.
That way, we'd get the distribution of mean differences when students are assigned to the two groups at random. The probability of observing a mean difference of 10 or more when everything is random is the proportion of experimental test statistics in step 5 that are 10 or greater.

Noreen (1989) noted several striking aspects of this approach:

- Researchers make no assumptions about the distribution of grades (for example, no normality assumption).
- The data need not be a random sample from some population.
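The shuffling steps and the proportion calculation described above can be sketched as follows. The student scores are randomly generated stand-ins, since the article gives no raw data, and the function name `shuffle_test` and the trial count are assumptions of this example. The article presents the procedure under the bootstrap label; because each trial reassigns the existing scores without replacement, the same procedure is also commonly called a permutation or randomization test.

```python
import random

def mean(values):
    return sum(values) / len(values)

def shuffle_test(group_a, group_b, n_trials=10_000, seed=0):
    """Estimate how often random regrouping yields a mean gap at least as
    large as the observed one. Returns (observed_gap, proportion)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)    # step 1: one data base of all grades
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_trials):                 # step 6: repeat many times
        rng.shuffle(pooled)                   # step 2: randomly sort
        mean_first = mean(pooled[:n_a])       # step 3: mean of the first group
        mean_rest = mean(pooled[n_a:])        # step 4: mean of the others
        gap = abs(mean_first - mean_rest)     # step 5: the test statistic
        if gap >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_trials

# Hypothetical standard scores: 13 voucher and 39 non-voucher students.
data_rng = random.Random(42)
voucher = [data_rng.gauss(510, 15) for _ in range(13)]
non_voucher = [data_rng.gauss(500, 15) for _ in range(39)]

observed_gap, p_value = shuffle_test(voucher, non_voucher)
print(f"Observed gap: {observed_gap:.1f}, estimated probability: {p_value:.3f}")
```

A small estimated probability means that random regrouping rarely produces a gap as large as the observed one, which is the evidence against the null hypothesis that grades are unrelated to voucher status.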
Diaconis and Efron (1983) argue that the resampling method frees researchers from two limitations of conventional statistics: "the assumption that the data conform to a bell-shaped curve and the need to focus on statistical measures whose theoretical properties can be analyzed mathematically." Instead, Peterson says, this method "addresses a key problem in statistics: how to infer the 'truth' from a sample of data that may be incomplete or drawn from an ill-defined population." The resampling method forces researchers to clarify the problem: With no formulas to fall back on, you have to explicitly define the question you want to answer. According to Simon and Bruce (1991), the method prevents researchers from "simply grabbing the formula for some test without understanding why they chose that test." As Peterson explains, instead of asking which formula to use, you "begin tackling such questions as what makes certain results statistically 'significant.'"
Because resampling techniques like the bootstrap are so easy to use and understand, Simon and Bruce advocate teaching these techniques to students first--that way, students learn how to translate their "scientific" question into a "statistical" question. By learning how to think clearly about their problem, students won't "select their methods blindly." They cite a study where one group of students learned resampling techniques, and the other learned conventional methods. The students taught the resampling techniques did much better at solving statistical problems than the students taught conventional methods. Further, the students who learned the resampling techniques enjoyed statistics, and their attitudes toward math improved during the course. However, the attitudes of the students who learned conventional techniques got worse during the course.
Critics question the resampling method itself. They argue, as Stephen E. Fienberg says, that "you're trying to get something for nothing. You use the same numbers over and over again until you get an answer that you can't get any other way. In order to do that, you have to assume something, and you may live to regret that hidden assumption later on" (Peterson, 1991, p. 57). Other critics question the accuracy of the estimates that resampling yields--if, for example, the researcher doesn't run enough experimental trials. In some situations, resampling may be less accurate than conventional parametric methods.
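The concern about too few experimental trials can be made concrete. When a probability is estimated as the proportion of independent random trials in which something happens, the estimate carries the usual binomial standard error, sqrt(p(1 - p) / n). The sketch below assumes independent trials and uses an illustrative estimated probability of 0.05; the function name is an assumption of this example.

```python
import math

def monte_carlo_standard_error(p_hat, n_trials):
    """Standard error of a proportion estimated from n independent trials."""
    return math.sqrt(p_hat * (1 - p_hat) / n_trials)

# How precisely is an estimated probability of 0.05 pinned down?
for n_trials in (100, 1_000, 10_000, 100_000):
    se = monte_carlo_standard_error(0.05, n_trials)
    print(f"{n_trials:>7} trials: 0.05 give or take about {2 * se:.4f}")
```

Because the error shrinks only with the square root of the number of trials, cutting it in half takes roughly four times as many trials. This covers only the simulation noise, not the separate question of whether resampling suits the data in the first place.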
The classic introduction to this field:
Diaconis, P., and B. Efron. (1983). Computer-intensive methods in statistics.
Noreen, Eric. (1989).
Peterson, I. (July 27, 1991). Pick a sample.
Simon, J. L. (1990).
Simon, J. L., and P. Bruce. (1991). Resampling: A tool for everyday statistical work.

Descriptors: Computer Oriented Programs; Computer Uses in Education; *Educational Research; Elementary Secondary Education; *Estimation (Mathematics); Nonparametric Statistics; *Probability; *Research Methodology; Sampling; Statistical Distributions; *Statistics; Tech
