ERIC Documents Database Citations & Abstracts for The Concept of Statistical Significance Testing
Instructions for ERIC Documents Access
Search Strategy: Statistical Significance [ERIC Descriptor, with heavily weighted status] AND Research Methodology OR Educational Research OR Educational History OR Statistical Inference OR Statistical Analysis OR Hypothesis Testing OR Null Hypothesis [ERIC Descriptors/Identifiers]

ED419023 TM028329 Five Methodology Errors in Educational Research: The Pantheon of Statistical Significance and Other Faux Pas. Thompson, Bruce 1998 102p.; Paper presented at the Annual Meeting of the American Educational Research Association (San Diego, CA, April 13-17, 1998). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) After presenting a general linear model as a framework for discussion, this paper reviews five methodology errors that occur in educational research: (1) the use of stepwise methods; (2) the failure to consider in result interpretation the context specificity of analytic weights (e.g., regression beta weights, factor pattern coefficients, discriminant function coefficients, canonical function coefficients) that are part of all parametric quantitative analyses; (3) the failure to interpret both weights and structure coefficients as part of result interpretation; (4) the failure to recognize that reliability is a characteristic of scores, and not of tests; and (5) the incorrect interpretation of statistical significance and the related failure to report and interpret the effect sizes present in all quantitative analysis. In several cases small heuristic discriminant analysis data sets are presented to make the discussion of each of these five methodology errors more concrete and accessible. Four appendixes contain computer programs for some of the analyses. (Contains 19 tables, 1 figure, and 143 references.) (SLD) Descriptors: *Educational Research; *Effect Size; *Research Methodology; Scores; *Statistical Significance; Tables (Data); *Test Reliability Identifiers: Stepwise Regression; *Weighting (Statistical)

ED416214 TM028066 Why "Encouraging" Effect Size Reporting Isn't Working: The Etiology of Researcher Resistance to Changing Practices. Thompson, Bruce 1998 18p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (Houston, TX, January 1998). Document Type: PROJECT DESCRIPTION (141); CONFERENCE PAPER (150) Given decades of lucid, blunt admonitions that statistical significance tests are often misused, and that the tests are somewhat limited in utility, what is needed is less repeated bashing of statistical tests, and some honest reflection regarding the etiology of researchers' denial and psychological resistance (sometimes unconscious) to improved practice. Three etiologies are briefly explored here: (1) atavism; (2) "is/ought" logic fallacies; and (3) confusion/desperation. Understanding the etiology of psychological resistance may ultimately lead to improved interventions to assist in overcoming researcher resistance to reporting effect sizes and using non-nil nulls and other analytic improvements. (Contains 45 references.) (Author) Descriptors: Attitudes; Change; Denial (Psychology); *Educational Research; *Effect Size; *Etiology; *Research Methodology; *Researchers; *Statistical Significance

ED408302 TM026504 Use of Tests of Statistical Significance and Other Analytic Choices in a School Psychology Journal: Review of Practices and Suggested Alternatives. Snyder, Patricia A.; Thompson, Bruce 24 Jan 1997 25p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (Austin, TX, January 24, 1997). 
Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) The use of tests of statistical significance was explored, first by reviewing some criticisms of contemporary practice in the use of statistical tests as reflected in a series of articles in the "American Psychologist" and in the appointment of a "Task Force on Statistical Inference" by the American Psychological Association (APA) to consider recommendations leading to improved practice. Related practices were reviewed in seven volumes of the "School Psychology Quarterly," an APA journal. This review found that some contemporary authors continue to use and interpret statistical significance tests inappropriately. The 35 articles reviewed reported a total of 321 statistical tests for which sufficient information was provided for effect sizes to be computed, but authors of only 19 articles did report various magnitudes of effect indices. Suggestions for improved practice are explored, beginning with the need to interpret statistical significance tests correctly, using more accurate language, and the need to report and interpret magnitude of effect indices. Editorial policies must continue to evolve to require authors to meet these expectations. (Contains 50 references.) (SLD) Descriptors: Educational Psychology; *Educational Research; *Effect Size; Elementary Secondary Education; Research Methodology; Research Reports; *Scholarly Journals; *School Psychologists; Statistical Inference; *Statistical Significance; Test Interpretation; *Test Use Identifiers: American Psychological Association EJ565847 TM520973 Statistical Significance Testing Practices in "The Journal of Experimental Education." Thompson, Bruce; Snyder, Patricia A. Journal of Experimental Education, v66 n1 p75-83 Fall 1997 ISSN: 0022-0973 Document Type: JOURNAL ARTICLE (080); EVALUATIVE REPORT (142) The use of three aspects of recommended practice (language use, replicability analyses, and reporting effect sizes) was studied in quantitative reports in "The Journal of Experimental Education" (JXE) for the academic years 1994-95 and 1995-96. Examples of both errors and desirable practices in the use and reporting of statistical significance tests in JXE are noted. (SLD) Descriptors: *Effect Size; *Language Usage; *Research Methodology; Research Reports; Scholarly Journals; *Statistical Significance Identifiers: *Research Replication EJ564700 TM520936 Statistical Significance: Rationale, Validity and Utility book review. Simon, Marilyn K. Canadian Journal of Program Evaluation/La Revue canadienne d'evaluation de programme, v12 n2 p189-90 Aut 1997 ISSN: 0834-1516 Document Type: BOOK-PRODUCT REVIEW (072); JOURNAL ARTICLE (080) Review states that the book gives an examination of the null- hypothesis significance test procedure as an integral component of data analysis of quantitative research studies in the social sciences. It is designed for the nonmathematics student who will be doing empirical studies involving the testing of substantive hypotheses. (SLD) Descriptors: *Hypothesis Testing; *Research Methodology; Research Utilization; *Social Science Research; *Statistical Significance; *Validity Identifiers: *Null Hypothesis EJ559584 EC618689 Debunking the Myth of the "Highly Significant" Result: Effect Sizes in Gifted Education Research. Plucker, Jonathan A. 
Roeper Review, v20 n2 p122-26 Dec 1997 ISSN: 0278-3193 Document Type: JOURNAL ARTICLE (080); RESEARCH REPORT (143) Describes the utility of effect size reporting and reports on a study that reviewed articles in three quarterly gifted journals and 40 articles in journals not directly associated with gifted education, published over the last five years. Effect sizes were generally not included in research articles, with results consistent across journals. (Author/CR) Descriptors: Educational Research; *Effect Size; *Gifted; *Research Methodology; *Scholarly Journals; *Statistical Significance; Technical Writing EJ551464 UD520180 Rejoinder: Editorial Policies Regarding Statistical Significance Tests: Further Comments. Thompson, Bruce Educational Researcher, v26 n5 p29-32 Jun-Jul 1997 Document Type: JOURNAL ARTICLE (080); POSITION PAPER (120) Argues that describing results as "significant" rather than "statistically significant" is confusing to the very people most apt to misinterpret this telegraphic wording. The importance of reporting the effect size and the value of both internal and external replicability analyses are stressed. (SLD) Descriptors: *Editing; *Educational Research; *Effect Size; Scholarly Journals; *Statistical Significance; Test Use; *Writing for Publication Identifiers: *Research Replication EJ551463 UD520179 Reflections on Statistical and Substantive Significance, with a Slice of Replication. Robinson, Daniel H.; Levin, Joel R. Educational Researcher, v26 n5 p21-26 Jun-Jul 1997 Document Type: JOURNAL ARTICLE (080); EVALUATIVE REPORT (142) Proposes modifications to the recent suggestions by B. Thompson (1996) for an American Educational Research Association editorial policy on statistical significance testing. Points out that, although it is useful to include effect sizes, they can be misinterpreted, and argues, as does Thompson, for greater attention to replication in educational research. (SLD) Descriptors: *Editing; *Educational Research; *Effect Size; Research Methodology; Research Reports; Scholarly Journals; *Statistical Significance; *Test Use; Writing for Publication Identifiers: *Research Replication EJ541829 SE557633 A Note on p-Values. Evans, Gwyn Teaching Statistics, v19 n1 p22-23 Spr 1997 ISSN: 0141-982X Document Type: TEACHING GUIDE (052); JOURNAL ARTICLE (080) Demonstrates the advantages of a p-value as compared with a standard significance test procedure. Contains examples in the discussion of testing the mean of a normal distribution and testing a probability of proportion. (DDR) Descriptors: British National Curriculum; Educational Strategies; Foreign Countries; Higher Education; *Probability; *Ratios (Mathematics); *Statistical Significance; *Statistics Identifiers: Great Britain ED415265 TM027966 Has Testing for Statistical Significance Outlived Its Usefulness? McLean, James E.; Ernest, James M. 1997 21p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (26th, Memphis, TN, November 12-14, 1997). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) The research methodology literature in recent years has included a full frontal assault on statistical significance testing. An entire edition of "Experimental Education" explored this controversy. The purpose of this paper is to promote the position that while significance testing by itself may be flawed, it has not outlived its usefulness. However, it must be considered in combination with other criteria. 
Specifically, statistical significance is but one of three criteria that must be demonstrated to establish a position empirically. Statistical significance merely provides evidence that an event did not happen by chance. However, it provides no information about the meaningfulness (practical significance) of an event or if the event is replicable. Consequently, statistical significance testing must be accompanied by judgments of the event's practical significance and replicability. However, the likelihood of a chance occurrence of an event must not be ignored. It is acknowledged that the importance of significance testing is reduced as sample size increases. In large sample experiments, particularly those involving multiple variables, the role of significance testing is diminished because even small differences are often statistically significant. In small sample studies where assumptions such as random sampling are practical, significance testing can be quite useful. It is important to remember that statistical significance is but one criterion useful to inferential researchers. In addition to statistical significance, practical significance, and replicability, researchers must also consider Type II errors and sample size. Furthermore, researchers should not ignore other techniques such as confidence intervals. While all of these statistical concepts are related, they provide different types of information that assist researchers in making decisions. (Contains 30 references.) (Author/SLD) Descriptors: Criteria; Decision Making; *Research Methodology; *Sample Size; *Statistical Significance; *Test Use Identifiers: *Research Replication ED413342 TM027613 If Statistical Significance Tests Are Broken/Misused, What Practices Should Supplement or Replace Them? Thompson, Bruce 1997 32p.; Paper presented at the Annual Meeting of the American Psychological Association (105th, Chicago, IL, August 1997). Document Type: POSITION PAPER (120); CONFERENCE PAPER (150) Given some consensus that statistical significance tests are broken, misused, or at least have somewhat limited utility, the focus of discussion within the field ought to move beyond additional bashing of statistical significance tests, and toward more constructive suggestions for improved practice. Five suggestions for improved practice are recommended: (1) required reporting of effect sizes; (2) reporting of effect sizes in an interpretable manner; (3) explicating the values that bear on results; (4) providing evidence of result replicability; and (5) reporting confidence intervals. Although the five recommendations can be followed even if statistical significance tests are reported, social science will proceed most rapidly when research becomes the search for replicable effects noteworthy in magnitude in the context of both the inquiry and personal or social values. (Contains 1 table and 74 references.) (Author/SLD) Descriptors: *Effect Size; *Research Methodology; *Statistical Significance; *Test Use Identifiers: *Confidence Intervals (Statistics); *Research Replication ED408336 TM026589 Statistical Significance Testing in "Educational and Psychological Measurement" and Other Journals. Daniel, Larry G. Mar 1997 33p.; Paper presented at the Annual Meeting of the National Council on Measurement in Education (Chicago, IL, March 25-27, 1997). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Statistical significance tests (SSTs) have been the object of much controversy among social scientists. 
Proponents have hailed SSTs as an objective means for minimizing the likelihood that chance factors have contributed to research results. Critics have both questioned the logic underlying SSTs and bemoaned the widespread misapplication and misinterpretation of the results of these tests. This paper offers a framework for remedying some of the common problems associated with SSTs via modification of journal editorial policies. The controversy surrounding SSTs is reviewed, with attention given to both historical and more contemporary criticisms of bad practices associated with misuse of SSTs. Examples from the editorial policies of "Educational and Psychological Measurement" and several other journals that have established guidelines for reporting results of SSTs are discussed, and suggestions are provided regarding additional ways that educational journals may address the problem. These guidelines focus on selecting qualified editors and reviewers, defining policies about use of SSTs that are in line with those of the American Psychological Association, and stressing effect size reporting. An appendix presents a manuscript review form. (Contains 61 references.) (Author/SLD) Descriptors: Editing; *Educational Assessment; Policy; Research Problems; *Scholarly Journals; *Social Science Research; *Statistical Significance; *Test Use Identifiers: *Educational and Psychological Measurement ED408303 TM026505 Use of Statistical Significance Tests and Reliability Analyses in Published Counseling Research. Thompson, Bruce; Snyder, Patricia A. 25 Mar 1997 24p.; Paper presented at the Annual Meeting of the American Educational Research Association (Chicago, IL, March 1997). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) The mission of the "Journal of Counseling and Development" (JCD) includes the attempt to serve as a "scholarly record of the counseling profession" and as part of the "conscience of the profession." This responsibility requires the willingness to engage in self-study. This study investigated two aspects of research practice in 25 quantitative studies reported in 1996 JCD issues, the use and interpretation of statistical significance tests, and the meaning of and ways of evaluating the score reliabilities of measures used in substantive research inquiry. Too many researchers have persisted in equating result improbability with result value, and too many have persisted in believing that statistical significance evaluates result replicability. In addition, too many researchers have persisted in believing that result improbability equals the magnitude of study effects. Authors must consistently begin to report and interpret effect sizes to aid the interpretations they make and those made by their readers. With respect to score reliability evaluation, more authors need to recognize that reliability inures to specific sets of scores and not to the test itself. Thirteen of the JCD articles involved reports of score reliability in previous studies and eight reported reliability coefficients for both previous scores and those in hand. These findings suggest some potential for improved practice in the quantitative research reported in JCD and improved editorial policies to support these changes. (Contains 39 references.) 
(SLD) Descriptors: *Counseling; Educational Research; *Effect Size; Evaluation Methods; Reliability; Research Methodology; *Research Reports; *Scholarly Journals; Scores; *Statistical Significance; *Test Use Identifiers: *Journal of Counseling and Development; Research Replication

ED407423 TM026445 Ways To Explore the Replicability of Multivariate Results (Since Statistical Significance Testing Does Not). Kier, Frederick J. 23 Jan 1997 17p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (Austin, TX, January 23-25, 1997). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) It is a false, but common, belief that statistical significance testing evaluates result replicability. In truth, statistical significance testing reveals nothing about result replicability. Since science is based on replication of results, methods that assess replicability are important. This is particularly true when multivariate methods, which capitalize on sampling error, are used. This paper explores three methods that can give an idea of the replicability of results in multivariate analysis without having to repeat the study. The first method is cross validation, a replication technique in which the entire sample is first run through the planned analysis and then the sample is randomly split into two unequal parts so that separate analyses are done on each part. The jackknife is a second method of replicability that relies on partitioning out the impact or effect of a particular subset of the data on an estimate derived from the total sample. The bootstrap, a third method of studying replicability, involves copying the data set into an infinitely large "mega" data set. Many different samples are then drawn from the file and results are computed separately for each sample and then averaged. The main drawback of all these internal replicability procedures is that their results are all based on the data from the one sample being analyzed. However, internal replication techniques are better than not addressing the issue at all. (Contains 18 references.) (SLD) Descriptors: Evaluation Methods; *Multivariate Analysis; *Sampling; *Statistical Significance Identifiers: Bootstrap Methods; Cross Validation; Jackknifing Technique; *Research Replication
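To make the three internal replicability checks described in the Kier abstract above more concrete, the short sketch below applies split-half cross validation, the jackknife, and the bootstrap to a Pearson correlation. It is an illustrative sketch only: the data are simulated and the choice of statistic is assumed, not taken from the cited paper (Python, standard library only).

    # A minimal sketch of the three internal replicability checks described in the
    # Kier (1997) abstract: cross validation, the jackknife, and the bootstrap,
    # applied here to a Pearson correlation on invented data.
    import random
    import statistics

    def pearson_r(xs, ys):
        mx, my = statistics.mean(xs), statistics.mean(ys)
        sx, sy = statistics.stdev(xs), statistics.stdev(ys)
        n = len(xs)
        return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

    random.seed(1)
    x = [random.gauss(0, 1) for _ in range(60)]
    y = [0.5 * xi + random.gauss(0, 1) for xi in x]   # true correlation about .45

    # 1. Cross validation: split the sample and see whether the result holds in both parts.
    idx = list(range(len(x)))
    random.shuffle(idx)
    half = len(idx) // 2
    a, b = idx[:half], idx[half:]
    r_a = pearson_r([x[i] for i in a], [y[i] for i in a])
    r_b = pearson_r([x[i] for i in b], [y[i] for i in b])

    # 2. Jackknife: drop one case at a time and recompute the statistic.
    jack = [pearson_r(x[:i] + x[i+1:], y[:i] + y[i+1:]) for i in range(len(x))]

    # 3. Bootstrap: resample cases with replacement many times and recompute.
    boot = []
    for _ in range(1000):
        sample = [random.randrange(len(x)) for _ in range(len(x))]
        boot.append(pearson_r([x[i] for i in sample], [y[i] for i in sample]))

    print("full-sample r:", round(pearson_r(x, y), 3))
    print("split halves :", round(r_a, 3), round(r_b, 3))
    print("jackknife SD :", round(statistics.stdev(jack), 3))
    print("bootstrap SD :", round(statistics.stdev(boot), 3))

As the abstract notes, all three checks reuse the single sample in hand, so they indicate stability across sample variations rather than guaranteeing external replication.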
EJ535148 TM519805 What to Do with the Upward Bias in R Squared: A Comment on Huberty. Snijders, Tom A. B. Journal of Educational and Behavioral Statistics, v21 n3 p283-98 Fall 1996 These articles comment on a recent article by Carl J. Huberty (1994), "A Note on Interpreting an R-squared Value," Journal of Educational and Behavioral Statistics, v19, p351-56. ISSN: 1076-9986 Document Type: BOOK-PRODUCT REVIEW (072); EVALUATIVE REPORT (142); JOURNAL ARTICLE (080) Two commentaries describe some shortcomings of a recent discussion of the significance testing of R-squared by C. J. Huberty and upward bias in the statistic. Both propose some modifications. A response by Huberty acknowledges the importance of the exchange of ideas in the field of data analysis. (SLD) Descriptors: *Bias; *Correlation; *Effect Size; *Regression (Statistics); *Statistical Significance

EJ533527 TM519729 Practical Significance: A Concept Whose Time Has Come. Kirk, Roger E. Educational and Psychological Measurement, v56 n5 p746-59 Oct 1996 Article based on the presidential address delivered to the Southwestern Psychological Association meeting (Houston, TX, April 5, 1996). ISSN: 0013-1644 Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150); JOURNAL ARTICLE (080) Practical significance is concerned with whether a research result is useful in the real world. The use of procedures to supplement the null hypothesis significance test in four journals of the American Psychological Association is examined, and an approach to assessing practical significance is presented. (SLD) Descriptors: *Educational Research; *Hypothesis Testing; *Research Utilization; Sampling; *Scholarly Journals; *Statistical Significance Identifiers: American Psychological Association; *Null Hypothesis; *Practical Significance

EJ525478 UD519259 AERA Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms. Thompson, Bruce Educational Researcher, v25 n2 p26-30 Mar 1996 ISSN: 0013-189X Document Type: EVALUATIVE REPORT (142); JOURNAL ARTICLE (080) Reviews practices regarding tests of statistical significance and policies of the American Educational Research Association (AERA). Decades of misuse of statistical significance testing are described, and revised editorial policies to improve practice are highlighted. Correct interpretation of statistical tests, interpretation of effect sizes, and exploration of research replicability are essential. (SLD) Descriptors: *Editing; Educational Research; *Effect Size; *Statistical Significance; Test Interpretation; *Test Use Identifiers: American Educational Research Association; *Editorial Policy; *Research Replication

EJ520936 TM519322 The Impact of Data-Analysis Methods on Cumulative Research Knowledge: Statistical Significance Testing, Confidence Intervals, and Meta-Analysis. Schmidt, Frank; Hunter, John E. Evaluation and the Health Professions, v18 n4 p408-27 Dec 1995 Special issue titled "The Meta-Analytic Revolution in Health Research: Part II." ISSN: 0163-2787 Document Type: EVALUATIVE REPORT (142); JOURNAL ARTICLE (080) It is argued that point estimates of effect sizes and confidence intervals around these point estimates are more appropriate statistics for individual studies than reliance on statistical significance testing and that meta-analysis is appropriate for analysis of data from multiple studies. (SLD) Descriptors: *Effect Size; Estimation (Mathematics); *Knowledge Level; *Meta Analysis; *Research Methodology; *Statistical Significance; Test Use Identifiers: *Confidence Intervals (Statistics)

ED393939 TM024976 Understanding the Sampling Distribution and Its Use in Testing Statistical Significance. Breunig, Nancy A. 9 Nov 1995 25p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (Biloxi, MS, November 1995). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A fundamental concept in inferential statistics is the sampling distribution. This paper explains the sampling distribution and the Central Limit Theorem and their role in statistical significance testing. Included in the discussion is a demonstration of how computer applications can be used to teach students about the sampling distribution. 
The paper concludes with an example of hypothesis testing and an explanation of how the standard deviation of the sampling distribution is either calculated based on statistical assumptions or is empirically estimated using logics such as the "bootstrap." These concepts are illustrated through the use of hand generated and computer examples. An appendix displays five computer screens designed to teach these topics. (Contains 1 table, 4 figures, and 20 references.) (Author/SLD) Descriptors: *Computer Uses in Education; *Educational Research; *Hypothesis Testing; *Sampling; Statistical Distributions; Statistical Inference; *Statistical Significance; Test Results Identifiers: Bootstrap Methods; *Central Limit Theorem

ED392819 TM024458 Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms. Thompson, Bruce 8 Nov 1995 24p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (Biloxi, MS, November 1995). Document Type: POSITION PAPER (120); CONFERENCE PAPER (150) Editorial practices revolving around tests of statistical significance are explored. The logic of statistical significance testing is presented in an accessible manner--many people who use statistical tests might not place such a premium on them if they knew what the tests really do, and what they do not do. The etiology of decades of misuse of statistical tests is explored, highlighting the bad implicit logic of persons who misuse statistical tests. Finally, three revised editorial policies that would improve conventional practice are discussed. The first is the use of better language, with insistence on universal use of the phrase "statistical significance" to emphasize that the common meaning of "significant" has nothing to do with results being important. A second improvement would be emphasizing effect size interpretation, and a third would be using and reporting strategies that evaluate the replicability of results. Internal replicability analyses such as cross validation, the jackknife, or the bootstrap would help determine whether results are stable across sample variations. (Contains 51 references.) (Author/SLD) Descriptors: *Editing; *Educational Assessment; *Effect Size; Quality Control; *Research Methodology; *Statistical Significance; *Test Use Identifiers: Bootstrap Methods; Cross Validation; Jackknifing Technique; *Research Replication

ED382639 TM023069 Effect Size as an Alternative to Statistical Significance Testing. McClain, Andrew L. Apr 1995 18p.; Paper presented at the Annual Meeting of the American Educational Research Association (San Francisco, CA, April 18-22, 1995). Document Type: REVIEW LITERATURE (070); CONFERENCE PAPER (150) The present paper discusses criticisms of statistical significance testing from both historical and contemporary perspectives. Statistical significance testing is greatly influenced by sample size and often results in meaningless information being over-reported. Variance-accounted-for effect sizes are presented as an alternative to statistical significance testing. A review of the "Journal of Clinical Psychology" (1993) reveals a continued reliance on statistical significance testing on the part of researchers. Finally, scatterplots and correlation coefficients are presented to illustrate the lack of linear relationship between sample size and effect size. Two figures are included. (Contains 24 references.) (Author) Descriptors: Correlation; *Effect Size; Research Methodology; *Sample Size; *Statistical Significance; *Testing Identifiers: Scattergrams; *Variance (Statistical)
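The point made in the McClain abstract above (and echoed by McLean and Ernest, Moore, and others in this list) can be illustrated with a short computation: holding a small standardized mean difference fixed, the p value falls below any threshold as the sample grows, while the variance-accounted-for effect size barely moves. The sketch below is an assumed illustration, not an analysis from any cited paper (Python, standard library; the normal approximation to the t distribution is used for the p value).

    # A minimal sketch: with a fixed, small effect, statistical significance is
    # driven almost entirely by sample size, while the effect size stays constant.
    import math

    d = 0.20                                    # fixed standardized mean difference (a "small" effect)
    for n in (20, 50, 100, 500, 2000):          # cases per group (assumed values)
        t = d * math.sqrt(n / 2)                # two-sample t for equal groups of size n
        df = 2 * n - 2
        p = math.erfc(abs(t) / math.sqrt(2))    # two-sided p, normal approximation to t
        r2 = t * t / (t * t + df)               # variance accounted for
        print(f"n per group = {n:5d}  t = {t:5.2f}  p ~ {p:.4f}  variance accounted for = {r2:.3f}")

Running the sketch shows the p value dropping from about .53 at n = 20 per group to well below .001 at n = 2000, while the variance accounted for stays near one percent throughout.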
EJ481563 EC608481 Interpretation of Statistical Significance Testing: A Matter of Perspective. McClure, John; Suen, Hoi K. Topics in Early Childhood Special Education, v14 n1 p88-100 Spr 1994 Theme Issue: Methodological Issues and Advances. ISSN: 0271-1214 Document Type: JOURNAL ARTICLE (080); POSITION PAPER (120) Target Audience: Researchers This article compares three models that have been the foundation for approaches to the analysis of statistical significance in early childhood research--the Fisherian and the Neyman-Pearson models (both considered "classical" approaches), and the Bayesian model. The article concludes that all three models have a place in the analysis of research results. (JDD) Descriptors: *Bayesian Statistics; Early Childhood Education; Educational Research; *Hypothesis Testing; Models; *Research Methodology; Statistical Analysis; *Statistical Significance

ED367678 TM021117 Historical Origins of Contemporary Statistical Testing Practices: How in the World Did Significance Testing Assume Its Current Place in Contemporary Analytic Practice? Weigle, David C. Jan 1994 18p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (San Antonio, TX, January 27, 1994). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) The purposes of the present paper are to address the historical development of statistical significance testing and to briefly examine contemporary practices regarding such testing in the light of these historical origins. Precursors leading to the advent of statistical significance testing are examined as are more recent controversies surrounding the issue. As the etiology of current practice is explored, it will become more apparent whether current practices evolved from deliberative judgment or merely developed from happenstance that has become reified in routine. Examination of the history of analysis suggests that the development of statistical significance testing has indeed involved a degree of deliberative judgment. It may be that the time for significance testing came and went, but there is no doubt that significance testing served as an important catalyst for the growth of science in the 20th century. (Contains 39 references.) (Author/SLD) Descriptors: *Data Analysis; Educational History; Etiology; *Research Methodology; *Scientific Research; *Statistical Significance; *Testing

EJ475203 TM517631 Statistical Significance Testing from Three Perspectives and Interpreting Statistical Significance and Nonsignificance and the Role of Statistics in Research. Levin, Joel R.; And Others Journal of Experimental Education, v61 n4 p378-93 Sum 1993 Theme issue title: "Statistical Significance Testing in Contemporary Practice: Some Proposed Alternatives with Comments from Journal Editors." ISSN: 0022-0973 Document Type: COLLECTION (020); POSITION PAPER (120); JOURNAL ARTICLE (080) Journal editors respond to criticisms of reliance on statistical significance in research reporting. Joel R. Levin ("Journal of Educational Psychology") defends its use, whereas William D. Schafer ("Measurement and Evaluation in Counseling and Development") emphasizes the distinction between statistically significant and important. William Asher ("Journal of Experimental Education") comments on preceding discussions. 
(SLD) Descriptors: Editing; Editors; Educational Assessment; *Educational Research; Elementary Secondary Education; Higher Education; Hypothesis Testing; *Research Methodology; Research Reports; Scholarly Journals; *Statistical Significance; Statistics EJ475198 TM517626 What Statistical Significance Testing Is, and What It Is Not. Shaver, James P. Journal of Experimental Education, v61 n4 p293-316 Sum 1993 Theme issue title: "Statistical Significance Testing in Contemporary Practice: Some Proposed Alternatives with Comments from Journal Editors." ISSN: 0022-0973 Document Type: EVALUATIVE REPORT (142); JOURNAL ARTICLE (080) Reviews the role of statistical significance testing, and argues that dominance of such testing is dysfunctional because significance tests do not provide the information that many researchers assume they do. Possible reasons for the persistence of statistical significance testing are discussed briefly, and ways to moderate negative effects are suggested. (SLD) Descriptors: Educational Practices; *Educational Research; Elementary Secondary Education; Higher Education; Research Design; Research Methodology; *Research Problems; Scholarly Journals; *Statistical Significance EJ475197 TM517625 The Case against Statistical Significance Testing, Revisited. Carver, Ronald P. Journal of Experimental Education, v61 n4 p287-92 Sum 1993 Theme issue title: "Statistical Significance Testing in Contemporary Practice: Some Proposed Alternatives with Comments from Journal Editors." ISSN: 0022-0973 Document Type: EVALUATIVE REPORT (142); JOURNAL ARTICLE (080) Four things are recommended to minimize the influence or importance of statistical significance testing. Researchers must not neglect to add "statistical" to significant and could interpret results before giving p-values. Effect sizes should be reported with measures of sampling error, and replication can be built into the design. (SLD) Descriptors: Educational Researchers; *Effect Size; Error of Measurement; *Research Methodology; Research Problems; Sampling; *Statistical Significance Identifiers: *P Values; *Research Replication ED364608 TM020880 Meaningfulness, Statistical Significance, Effect Size, and Power Analysis: A General Discussion with Implications for MANOVA. Huston, Holly L. Nov 1993 29p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (22nd, New Orleans, LA, November 9- 12, 1993). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) This paper begins with a general discussion of statistical significance, effect size, and power analysis; and concludes by extending the discussion to the multivariate case (MANOVA). Historically, traditional statistical significance testing has guided researchers' thinking about the meaningfulness of their data. The use of significance testing alone in making these decisions has proved problematic. It is likely that less reliance on statistical significance testing, and an increased use of power analysis and effect size estimates in combination could contribute to an overall improvement in the quality of new research produced. The more informed researchers are about the benefits and limitations of statistical significance, effect size, and power analysis, the more likely it is that they will be able to make more sophisticated and useful interpretations about the meaningfulness of research results. One table illustrates the discussion. 
(Contains 37 references.)(SLD) Descriptors: *Effect Size; *Estimation (Mathematics); *Multivariate Analysis; *Research Methodology; Research Reports; *Statistical Significance Identifiers: *Meaningfulness; *Power (Statistics) ED364593 TM020837 What Is the Probability of Rejecting the Null Hypothesis?: Statistical Power in Research. Galarza-Hernandez, Aitza Nov 1993 30p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (22nd, New Orleans, LA, November 9- 12, 1993). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Power refers to the probability that a statistical test will yield statistically significant results. In spite of the close relationship between power and statistical significance, there is a consistent overemphasis in the literature on statistical significance. This paper discusses statistical significance and its limitations and also includes a discussion of statistical power in the behavioral sciences. Finally, some recommendations to increase power are provided, focusing on the necessity of paying more attention to power issues. Changing editorial policies and practices so that editors ask authors to estimate the power of their tests is a useful way to improve the situation. Planning research to consider power is another way to ensure that the question of the probability of rejecting the null hypothesis is answered correctly. Four tables and two figures illustrate the discussion. (Contains 28 references.) (SLD) Descriptors: *Behavioral Science Research; Editors; *Estimation (Mathematics); Hypothesis Testing; Literature Reviews; *Probability; Research Design; *Research Methodology; Scholarly Journals; *Statistical Significance Identifiers: *Null Hypothesis; *Power (Statistics) EJ458887 CG542330 Simultaneous Inference: Objections and Recommendations. Schafer, William D. Measurement and Evaluation in Counseling and Development, v25 n4 p146-48 Jan 1993 ISSN: 0748-1756 Document Type: JOURNAL ARTICLE (080); POSITION PAPER (120) Considers objections to comparisonwise position, which holds that, when conducting simultaneous significance procedures, per-test Type I error rate should be controlled and that it is unnecessary to introduce adjustments designed to control familywise rate. Objections collected by Saville in an attempt to refute them are discussed along with Saville's conclusions. Recommendations are introduced for reporting significance tests in journals. (NB) Descriptors: *Statistical Inference; *Statistical Significance; Statistics Identifiers: *Simultaneous Inference ED347169 TM018523 Statistical Significance Testing: Alternatives and Considerations. Wilkinson, Rebecca L. Jan 1992 28p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (Houston, TX, January 31-February 2, 1992). Document Type: POSITION PAPER (120); CONFERENCE PAPER (150) Problems inherent in relying solely on statistical significance testing as a means of data interpretation are reviewed. The biggest problem with statistical significance testing is that researchers have used the results of this testing to ascribe importance or meaning to their studies where such meaning often does not exist. Often researchers mistake statistically significant results for important effects. Statistical procedures are too often used as substitutes to thought, rather than as aids to researcher thinking. Alternatives to statistical significance testing that are explored are effect size, statistical power, and confidence intervals. 
Other considerations for further data analysis that are explored are: (1) measurement reliability; (2) data exploration; and (3) the replicability of research results. It is suggested that statistical significance testing be used only as a guide in interpreting one's results. Two tables present illustrative information, and there is a 22-item list of references. (SLD) Descriptors: *Data Interpretation; Effect Size; *Reliability; Researchers; Research Methodology; *Research Problems; *Statistical Significance Identifiers: Confidence Intervals (Statistics); Power (Statistics); Research Replication ED344905 TM018225 What Statistical Significance Testing Is, and What It Is Not. Shaver, James P. Apr 1992 43p.; Paper presented at the Annual Meeting of the American Educational Research Association (San Francisco, CA, April 20-24, 1992). Document Type: CONFERENCE PAPER (150) A test of statistical significance is a procedure for determining how likely a result is assuming a null hypothesis to be true with randomization and a sample of size n (the given size in the study). Randomization, which refers to random sampling and random assignment, is important because it ensures the independence of observations, but it does not guarantee independence beyond the initial sample selection. A test of statistical significance provides a statement of probability of occurrence in the long run, with repeated random sampling under the null hypothesis, but provides no basis for a conclusion about the probability that a particular result is attributable to chance. A test of statistical significance also does not indicate the probability that the null hypothesis is true or false and does not indicate whether a treatment being studied had an effect. Statistical significance indicates neither the magnitude nor the importance of a result, and is no indication of the probability that a result would be obtained on study replication. Although tests of statistical significance yield little valid information for questions of interest in most educational research, use and misuse of such tests remain common for a variety of reasons. Researchers should be encouraged to minimize statistical significance tests and to state expectations for quantitative results as critical effect sizes. There is a 58-item list of references. (SLD) Descriptors: Educational Research; Evaluation Problems; Hypothesis Testing; Probability; Psychological Studies; *Research Design; Research Problems; *Sample Size; *Statistical Significance; Test Validity Identifiers: *Null Hypothesis; *Randomization (Statistics); Research Replication ED333036 TM016545 The Place of Significance Testing in Contemporary Social Science. Moore, Mary Ann 3 Apr 1991 23p.; Paper presented at the Annual Meeting of the American Educational Research Association (Chicago, IL, April 3-7, 1991). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) This paper examines the problems caused by relying solely on statistical significance tests to interpret results in contemporary social science. The place of significance testing in educational research has often been debated. Among the problems in reporting statistical significance are questions of definition and terminology. Problems are also found in the use, as well as the reporting, of significance testing. One of the most important problems is the effect of sample size on significance. An example with a fixed effect size of 25% and samples containing 22, 23, and 24 people illustrates these effects. 
The issues of validity and reliability in significance testing with measurement studies are considered. Although these problems are widely recognized, publishers show a clear bias in favor of reports that claim statistical significance. Researchers need to recognize the limitations of significance testing. Effect size statistics aid in the interpretation of results and provide a guide to the relative importance of the study. Two tables illustrate the effects of sample size. A 22-item list of references is included. (SLD) Descriptors: *Data Interpretation; Educational Research; *Effect Size; Research Methodology; *Research Problems; *Sample Size; *Social Science Research; *Statistical Significance ED325524 TM015782 Alternatives to Statistical Significance Testing. Palomares, Ronald S. 8 Nov 1990 20p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (19th, New Orleans, LA, November 14- 16, 1990). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Researchers increasingly recognize that significance tests are limited in their ability to inform scientific practice. Common errors in interpreting significance tests and three strategies for augmenting the interpretation of significance test results are illustrated. The first strategy for augmenting the interpretation of significance tests involves evaluating significance test results in a sample size context. A second strategy involves interpretation of effect size estimates; several estimates and corrections are discussed. A third strategy emphasizes interpretation based on estimated likelihood that results will replicate. The bootstrap method of B. Efron and others and cross-validation strategies are illustrated. A 28-item list of references and four data tables are included. (Author/SLD) Descriptors: *Effect Size; Estimation (Mathematics); *Evaluation Methods; *Research Design; Research Problems; *Sample Size; *Statistical Significance Identifiers: Bootstrap Methods; Cross Validation ED320965 TM015274 Looking beyond Statistical Significance: Result Importance and Result Generalizability. Welge-Crow, Patricia A.; And Others 25 May 1990 23p.; Paper presented at the Annual Meeting of the American Psychological Society (Dallas, TX, June 9, 1990). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Three strategies for augmenting the interpretation of significance test results are illustrated. Determining the most suitable indices to use in evaluating empirical results is a matter of considerable debate among researchers. Researchers increasingly recognize that significance tests are very limited in their potential to inform the interpretation of scientific results. The first strategy involves evaluating significance test results in a sample size context. The researcher is encouraged to determine at what smaller sample size a statistically significant fixed effect size would no longer be significant, or conversely, at what larger sample size a non- significant result would become statistically significant. The second strategy would involve interpreting effect size as an index of result importance. The third strategy emphasizes interpretation based on the estimated likelihood that results will replicate. These applications are illustrated via small heuristic data sets to make the discussion more concrete. A 37-item list of references, seven data tables, and an appendix illustrating relevant computer commands are provided. 
(TJH) Descriptors: Educational Research; *Effect Size; Estimation (Mathematics); *Generalizability Theory; Heuristics; Mathematical Models; Maximum Likelihood Statistics; *Research Methodology; *Sample Size; *Statistical Significance; *Test Interpretation; Test Results Identifiers: Empirical Research; Research Replication EJ404813 CG537044 Multiple Criteria for Evaluating the Magnitude of Experimental Effects. Haase, Richard F.; And Others Journal of Counseling Psychology, v36 n4 p511-16 Oct 1989 Document Type: JOURNAL ARTICLE (080); POSITION PAPER (120) Contends that tests of statistical significance and measures of magnitude in counseling psychology research do not provide same information. Argues interpreting magnitude of experimental effects must be two-stage decision process with the second stage of process being conditioned on results of a test of statistical significance and entailing evaluation of absolute magnitude of effect. (Author/ABL) Descriptors: *Research Methodology; *Research Needs; *Statistical Significance; *Test Interpretation Identifiers: *Counseling Psychology ED314450 TM014265 Comments on Better Uses of and Alternatives to Significance Testing. Davidson, Betty M.; Giroir, Mary M. 9 Nov 1989 18p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (Little Rock, AR, November 8-10, 1989). Document Type: REVIEW LITERATURE (070); EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Controversy over the proper place of significance testing within scientific methodology has continued for some time. The suggestion that effect sizes are more important than whether results are significant is presented. Effect size can be defined as an estimate of how much of the dependent variable is accounted for by the independent variables. Interpretations of statistical significance can be seriously incorrect when the researcher underinterprets an outcome with a large effect size that is nonsignificant or overinterprets an outcome that involves a small effect size but which is statistically significant. These problems can be avoided if the researcher includes effect size in result interpretation. It has been stated that statistical significance was never intended to take the place of replication in research. Researchers must begin drawing conclusions based on effect sizes and not statistical significance alone; and the replicability and reliability of results must be recognized, analyzed, and interpreted. Two tables illustrate effect sizes. (SLD) Descriptors: *Effect Size; *Reliability; Research Design; Researchers; *Scientific Methodology; Statistical Analysis; *Statistical Significance Identifiers: *Significance Testing ED314449 TM014264 Ways of Estimating the Probability That Results Will Replicate. Giroir, Mary M.; Davidson, Betty M. 9 Nov 1989 17p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (Little Rock, AR, November 8-10, 1989). Document Type: EVALUATIVE REPORT (142); CONFERENCE PAPER (150) Replication is important to viable scientific inquiry; results that will not replicate or generalize are of very limited value. Statistical significance enables the researcher to reject or not reject the null hypothesis according to the sample results obtained, but statistical significance does not indicate the probability that results will be replicated. Three techniques for evaluating the sampling specificity of results are described: (1) the jackknife technique of J. W. 
Tukey (1969); (2) the bootstrap technique of Efron, described by P. Diaconis and E. Bradley (1983); and (3) cross- validation methods described by B. Thompson (1989). A small data set developed by B. Thompson in 1979 is used to demonstrate the cross- validation procedure in detail. These three procedures allow the researcher to examine the replicability and generalizability of results and should be used frequently. Two tables present the study results, and an appendix gives examples of commands for the Statistical Analysis System computer package used for the cross- validation example. (SLD) Descriptors: *Estimation (Mathematics); *Generalizability Theory; Hypothesis Testing; Probability; Research Design; Sample Size; *Sampling; Scientific Methodology; *Statistical Significance Identifiers: Bootstrap Hypothesis; *Cross Validation; Jackknifing Technique; *Research Replication; Research Results ED303514 TM012775 Statistical Significance Testing: From Routine to Ritual. Keaster, Richard D. Nov 1988 15p.; Paper presented at the Annual Meeting of the Mid-South Educational Research Association (Louisville, KY, November 9-11, 1988). Document Type: CONFERENCE PAPER (150); EVALUATIVE REPORT (142); REVIEW LITERATURE (070) Target Audience: Researchers An explanation of the misuse of statistical significance testing and the true meaning of "significance" is offered. Literature about the criticism of current practices of researchers and publications is reviewed in the context of tests of significance. The problem under consideration occurs when researchers attempt to do more than just establish that a relationship has been observed. More often than not, too many researchers assume that the difference, and even the size of the difference, proves or at least confirms the research hypothesis. Statistical significance is not a measure of "substantive' significance or what might be called scientific importance. Significance testing was designed to yield yes/no decisions. It is suggested that authors or research projects should not try to interpret the magnitudes of their significance findings. Significance testing must be returned to its proper place in the scientific process. (SLD) Descriptors: Educational Assessment; Research Design; Research Methodology; *Research Problems; Statistical Analysis; *Statistical Significance; Statistics EJ352091 CG531911 How Significant Is a Significant Difference? Problems With the Measurement of Magnitude of Effect. Murray, Leigh W.; Dosser, David A., Jr. Journal of Counseling Psychology, v34 n1 p68-72 Jan 1987 Document Type: JOURNAL ARTICLE (080); GENERAL REPORT (140) The use of measures of magnitude of effect has been advocated as a way to go beyond statistical tests of significance and to identify effects of a practical size. They have been used in meta-analysis to combine results of different studies. Describes problems associated with measures of magnitude of effect (particularly study size) and implications for researchers. (Author/KS) Descriptors: *Effect Size; *Meta Analysis; Research Design; Research Methodology; *Sample Size; *Statistical Analysis; *Statistical Inference; *Statistical Significance ED285902 TM870488 The Use of Invariance and Bootstrap Procedures as a Method to Establish the Reliability of Research Results. Sandler, Andrew B. 30 Jan 1987 19p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (Dallas, TX, January 29-31, 1987). 
Document Type: CONFERENCE PAPER (150); RESEARCH REPORT (143) Target Audience: Researchers Statistical significance is misused in educational and psychological research when it is applied as a method to establish the reliability of research results. Other techniques have been developed which can be correctly utilized to establish the generalizability of findings. Methods that do provide such estimates are known as invariance or cross-validation procedures and the bootstrap method. Invariance procedures split the total sample into two subgroups and apply techniques to analyze each subgroup and compare results, often by using parameters obtained from one subgroup to evaluate the other subgroup. A simulated data set is presented and analyzed by invariance procedures for: (1) canonical correlation; (2) regression and discriminant analysis; (3) analysis of variance and covariance; and (4) bivariate correlation. Whereas invariance procedures split a sample into two parts, the bootstrap method creates multiple copies of the data set. The number of copies could exceed millions with current computer capability. The copies are shuffled and artificial samples of 20 cases each, called bootstrap samples, are randomly selected. The value of the Pearson product- moment correlation (or other statistics) is then calculated for each bootstrap sample to assess the generalizability of the results. (LPG) Descriptors: Analysis of Covariance; Analysis of Variance; Correlation; Discriminant Analysis; *Generalizability Theory; *Mathematical Models; Regression (Statistics); *Reliability; Research Design; Research Problems; *Sample Size; Sampling; Simulation; Statistical Inference; *Statistical Significance; Statistical Studies; Validity Identifiers: *Bootstrap Hypothesis; *Cross Validation; Invariance Principle ED281852 TM870223 A Primer on MANOVA Omnibus and Post Hoc Tests. Heausler, Nancy L. 30 Jan 1987 21p.; Paper presented at the Annual Meeting of the Southwest Educational Research Association (Dallas, TX, January 30, 1987). Document Type: CONFERENCE PAPER (150); RESEARCH REPORT (143) Target Audience: Researchers Each of the four classic multivariate analysis of variance (MANOVA) tests of statistical significance may lead a researcher to different decisions as to whether a null hypothesis should be rejected: (1) Wilks' lambda; (2) Lawley-Hotelling trace criterion; (3) Roy's greatest characteristic root criterion; and (4) Pillai's trace criterion. These four omnibus test statistics are discussed and their optimal uses illustrated using hypothetical data sets. Discriminant analysis as a post hoc method to MANOVA is illustrated in detail. Once a significant MANOVA has been found, the next step is to interpret the non-chance association between dependent and independent variables. (Author/GDC) Descriptors: Analysis of Variance; Discriminant Analysis; Factor Analysis; *Hypothesis Testing; *Multivariate Analysis; *Statistical Significance; Statistical Studies Identifiers: Omnibus Test; Post Hoc Methods EJ327959 EA519388 Chance and Nonsense: A Conversation about Interpreting Tests of Statistical Significance, Part 2. Shaver, James P. Phi Delta Kappan, v67 n2 p138-41 Oct 1985 For Part 1, see EJ 326 611 (September 1985 "Phi Delta Kappan"). 
Document Type: JOURNAL ARTICLE (080); RESEARCH REPORT (143); POSITION PAPER (120) Target Audience: Researchers; Practitioners The second half of a dialogue between two fictional teachers examines the significance of statistical significance in research and considers the factors affecting the extent to which research results provide important or useful information. (PGD) Descriptors: Educational Research; *Research Methodology; Research Problems; Sampling; Statistical Analysis; *Statistical Significance EJ326611 EA519370 Chance and Nonsense: A Conversation about Interpreting Tests of Statistical Significance, Part 1. Shaver, James P. Phi Delta Kappan, v67 n1 p57-60 Sep 1985 For Part 2, see EJ 327 959 (October 1985 "Phi Delta Kappan"). Document Type: JOURNAL ARTICLE (080); RESEARCH REPORT (143); POSITION PAPER (120) Target Audience: Researchers; Practitioners A dialog between two fictional teachers provides some basic examples of how research that uses approved methodology may provide results that are significant statistically but not significant practically. (PGD) Descriptors: Educational Research; Research Methodology; Research Problems; *Sampling; Statistical Analysis; *Statistical Significance EJ326117 UD511911 Mind Your p's and Alphas. Stallings, William M. Educational Researcher, v14 n9 p19-20 Nov 1985 Document Type: JOURNAL ARTICLE (080); POSITION PAPER (120); GENERAL REPORT (140) In the educational research literature, alpha and p are often conflated. Paradoxically, alpha retains a prominent place in textbook discussions, but it is often supplanted by p in the results sections of journal articles. Because alpha and p have unique uses, researchers should continue to employ both conventions in summarizing the outcomes of tests of significance. (KH) Descriptors: *Educational Research; *Research Methodology; Statistical Analysis; *Statistical Significance Identifiers: *Alpha Coefficient; *p Coefficient ED253566 TM850106 Mind Your p's and Alphas. Stallings, William M. 1985 11p.; Paper presented at the Annual Meeting of the American Educational Research Association (69th, Chicago, IL, March 31-April 4, 1985). Document Type: CONFERENCE PAPER (150); POSITION PAPER (120); REVIEW LITERATURE (070) Target Audience: Researchers In the educational research literature alpha, the a priori level of significance, and p, the a posteriori probability of obtaining a test statistic of at least a certain value when the null hypothesis is true, are often confused. Explanations for this confusion are offered. Paradoxically, alpha retains a prominent place in textbook discussions of such topics as statistical hypothesis testing, multivariate analysis, power, and multiple comparisons while it seems to have been supplanted by p in current journal articles. The unique contributions of both alpha and p are discussed and a plea is made for using both conventions in reporting empirical studies. (Author) Descriptors: Educational Research; *Hypothesis Testing; Multivariate Analysis; *Probability; *Research Methodology; Research Problems; *Statistical Significance; Statistical Studies Identifiers: *Alpha Coefficient EJ307832 TM510187 Policy Implications of Using Significance Tests in Evaluation Research. Schneider, Anne L.; Darcy, Robert E. Evaluation Review, v8 n4 p573-82 Aug 1984 Document Type: JOURNAL ARTICLE (080); RESEARCH REPORT (143) The normative implications of applying significance tests in evaluation research are examined. 
The authors conclude that evaluators often make normative decisions, based on the traditional .05 significance level in studies with small samples. Additional reporting of the magnitude of impact, the significance level, and the power of the test is recommended. (Author/EGS) Descriptors: *Evaluation Methods; *Hypothesis Testing; *Research Methodology; Research Problems; Sample Size; *Statistical Significance Identifiers: Data Interpretation; *Evaluation Problems; Evaluation Research

ED249266 TM840619 Power Differences among Tests of Combined Significance. Becker, Betsy Jane Apr 1984 21p.; Paper presented at the Annual Meeting of the American Educational Research Association (68th, New Orleans, LA, April 23-27, 1984). Document Type: CONFERENCE PAPER (150); RESEARCH REPORT (143) Target Audience: Researchers Power is an indicator of the ability of a statistical analysis to detect a phenomenon that does in fact exist. The issue of power is crucial for social science research because sample size, effects, and relationships studied tend to be small and the power of a study relates directly to the size of the effect of interest and the sample size. Quantitative synthesis methods can provide ways to overcome the problem of low power by combining the results of many studies. In the study at hand, large-sample (approximate) normal distribution theory for the non-null density of the individual p value is used to obtain power functions for significance value summaries. Three p-value summary methods are examined: Tippett's counting method, Fisher's inverse chi-square summary, and the logit method. Results for pairs of studies and for a set of five studies are reported. They indicate that the choice of a "most-powerful" summary will depend on the number of studies to be summarized, the sizes of the effects in the populations studied, and the sizes of the samples chosen from those populations. (BW) Descriptors: Effect Size; Hypothesis Testing; *Meta Analysis; Research Methodology; Sample Size; *Statistical Analysis; *Statistical Significance Identifiers: *Power (Statistics)
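Two of the p-value summary methods named in the Becker abstract above are easy to state concretely. The sketch below combines five hypothetical p values with Fisher's inverse chi-square summary and Tippett's counting (minimum-p) method; the values are invented for illustration, the logit method is omitted for brevity, and nothing here reproduces the cited study's analysis (Python, standard library only).

    # A minimal sketch of two p-value summary methods discussed in Becker (1984).
    import math

    def fisher_combined(pvalues):
        # Fisher: -2 * sum(ln p) is chi-square with 2k degrees of freedom under the null.
        stat = -2.0 * sum(math.log(p) for p in pvalues)
        k = len(pvalues)
        # Exact chi-square survival function for even degrees of freedom (df = 2k).
        return math.exp(-stat / 2) * sum((stat / 2) ** j / math.factorial(j) for j in range(k))

    def tippett_combined(pvalues):
        # Tippett: base the decision on the smallest p; combined p = 1 - (1 - min p)^k.
        k = len(pvalues)
        return 1.0 - (1.0 - min(pvalues)) ** k

    studies = [0.11, 0.06, 0.20, 0.09, 0.04]   # five hypothetical studies
    print("Fisher :", round(fisher_combined(studies), 4))
    print("Tippett:", round(tippett_combined(studies), 4))

In this assumed example only one study reaches the .05 level on its own, yet Fisher's summary is well below .01, illustrating the gain in power from combining studies that the abstract describes.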