The Concept of Statistical Significance Testing

Clearinghouse on Assessment and Evaluation


A frequently asked question of:

The Clearinghouse on Assessment and Evaluation [ERIC/AE]
1129 Shriver Laboratory
University of Maryland - College Park
College Park  MD  20742
Toll Free:	1.800.464.3742
E-mail:		askae@ericae.net
WWW:		http://ericae.net

Example queries:

As a postsecondary student who is studying inferential statistics, I need documentation to help me to understand the controversy that surrounds statistical significance testing.

What is the proper role of tests of statistical significance in social science research?

What are complementary methods to statistical significance testing for determining the replicability of results in research experiments?

INTRODUCTION

Introduction to the Logic and Process of Significance Testing:

1. Set up a null hypothesis and an alternative about the population or populations.

2. Set up an alpha level. An alpha level is the probability level you view as low enough to constitute evidence that there is a contradiction between the data and the assumption that the null hypothesis is true in the population (alpha is often set at .05 in the behavioral sciences).

3. Gather data from a sample.

4. Compute the value of a test statistic based on the sample data.

5. Compute the probability of obtaining a test statistic at least as extreme as the value in Step 4, under the assumption that the null hypothesis is true (this probability is usually given in a table or as part of a print-out).

6. If the probability in Step 5 is less than the alpha selected in Step 2, then conclude that there is an inconsistency between the null hypothesis and the data. You can then reject the null hypothesis in favor of the alternative hypothesis and state that the results are statistically significant.

If the probability is greater than or equal to the alpha level, then conclude that the sample data are consistent with the null hypothesis. You must then fail to reject the null hypothesis and state that the results are not statistically significant.
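
The steps above can be sketched in code. What follows is only a minimal illustration, assuming a two-sided one-sample z-test with a known population standard deviation. The scores, the null mean of 100, the sigma of 15, and the function name z_test_one_sample are all made up for this example and do not come from the original FAQ.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_one_sample(sample, null_mean, sigma, alpha=0.05):
    """Two-sided one-sample z-test (hypothetical helper).

    Step 1: null hypothesis is population mean == null_mean;
            alternative is population mean != null_mean.
    Step 2: alpha is the chosen significance level.
    sigma is the population standard deviation, assumed known.
    """
    n = len(sample)
    # Step 4: compute the test statistic from the sample data.
    z = (mean(sample) - null_mean) / (sigma / sqrt(n))
    # Step 5: probability of a statistic at least this extreme if the null is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 6: compare that probability with alpha.
    reject = p_value < alpha
    return z, p_value, reject


# Step 3: gather data from a sample (illustrative values only).
scores = [108, 112, 97, 105, 110, 103, 115, 109, 101, 106]

z, p, reject = z_test_one_sample(scores, null_mean=100, sigma=15)
verdict = "statistically significant" if reject else "not statistically significant"
print(f"z = {z:.3f}, p = {p:.3f}, result is {verdict}")
```

With these illustrative numbers the probability is larger than .05, so the sketch fails to reject the null hypothesis. In practice the population standard deviation is rarely known, and a t-test would usually be the more appropriate choice.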

Note that statistical significance is not the same as practical significance. For example, the null hypothesis is often something like

population_mean_1 = population_mean_2

Rejecting this null hypothesis indicates only that the sample data imply some difference between the population means; that difference may still be small and unimportant.
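
The gap between statistical and practical significance can be shown with a small calculation. The sketch below assumes a two-sample z-test on summary statistics with equal group sizes and a common, known standard deviation. The means, the sigma, the group size, and the function name two_sample_z are hypothetical values chosen for illustration. The standardized mean difference (Cohen's d) is used here as one common effect-size measure; the FAQ itself does not prescribe one.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_1, mean_2, sigma, n_per_group):
    """Two-sided two-sample z-test on summary statistics (hypothetical helper).

    Assumes equal group sizes and a common, known population SD (sigma).
    Returns the z statistic, the p-value, and Cohen's d.
    """
    # Standard error of the difference between two independent means.
    se = sigma * sqrt(2 / n_per_group)
    z = (mean_1 - mean_2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Effect size: the mean difference expressed in standard deviation units.
    cohens_d = (mean_1 - mean_2) / sigma
    return z, p_value, cohens_d


# A half-point difference on a scale with SD 15, with very large samples.
z, p, d = two_sample_z(100.5, 100.0, sigma=15, n_per_group=20000)
print(f"z = {z:.2f}, p = {p:.5f}, Cohen's d = {d:.3f}")
```

With 20,000 cases per group, the probability falls well below .05, so the null hypothesis is rejected, yet the difference amounts to only about one thirtieth of a standard deviation. Large samples can make even trivial differences statistically significant.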


ERIC DIGESTS
[Topical Overviews in Full-Text]

The Concept of Statistical Significance Testing [1994] - Bruce Thompson

Inappropriate Statistical Practices in Counseling Research: Three Pointers for Readers of Research Literature. [1995] - Bruce Thompson

Pitfalls of Data Analysis [1996] - Clay Helberg

ERIC DOCUMENTS CITATIONS

Dynamic (i.e., Live) Searches of the ERIC Documents Database

The following search options employ the electronic Thesaurus of ERIC Descriptors as the search interface for the ERIC documents database. If you would like to ensure a current bibliography of ERIC documents for any of the given sub-topics, then we highly recommend that you pursue this option. Also, once you have selected a search option, you may edit the given strategy to introduce concepts into the search to accommodate your specific needs; or, you can build an entirely new search strategy in the ERIC Search Wizard. For a short, selective bibliography for an introduction to the use of statistical significance testing in educational research, please see the Selected ERIC Documents Citations below.

Dynamic Search of the ERIC Database for an Overview of the Controversy Pertaining to the Use & Misuse of Statistical Significance Testing

Dynamic Search of the ERIC Database for Methods for the Determination of Statistical Significance

Dynamic Search of the ERIC Database for Techniques to Complement Statistical Significance in the Determination of the Likelihood of Research Replication

Selected ERIC Documents Citations for an Overview of the Controversy Pertaining to the Use & Misuse of Statistical Significance Testing
Although less current and less extensive than the Dynamic Search options above, this selective bibliography presents ERIC document citations that were preselected for their thoroughness, authority, or uniqueness.
Instructions for ERIC Documents Access


MAJOR BIBLIOGRAPHIC RESOURCES
[Check your local library or bookstore for access]

Chow, S.L. (1996). Statistical significance: rationale, validity and utility. Thousand Oaks, CA: Sage.

Harlow, L.L., Mulaik, S.A., & Steiger, J.H. (Eds.). (1997). What if there were no significance tests? (Multivariate Applications Book Series). Mahwah, NJ: Lawrence Erlbaum Associates.

McLean, J.E. & Kaufman, A.S. (Eds.). (1998). Statistical significance testing [Special Issue]. Research in the Schools, 5(2). Birmingham, AL: Mid-South Educational Research Association.

Mohr, L.B. (1990). Understanding significance testing. (Quantitative Applications in the Social Sciences No. 07-073). Newbury Park: Sage.

New ways in statistical methodology: from significance tests to Bayesian inference. (1998). (European University Studies Series VI, Psychology). New York: P. Lang.


ORGANIZATIONS
American Educational Research Association - Division D: Measurement and Research Methodology [AERA-D]

American Statistical Association - Social Statistics Section


Publication Notes:
Created: August 1, 1999
Last Revised: August 25, 1999

