Clearinghouse on Assessment and Evaluation

ERIC Documents Database Citations & Abstracts for The Concept of Statistical Significance Testing


Search Strategy:
Statistical Significance [ERIC Descriptor, with heavily weighted status]
AND
Research Methodology OR Educational Research OR Educational History OR Statistical Inference OR Statistical Analysis OR Hypothesis Testing OR Null Hypothesis [ERIC Descriptors/Identifiers]


  ED419023  TM028329
  Five Methodology Errors in Educational Research: The Pantheon of
Statistical Significance and Other Faux Pas.
  Thompson, Bruce
  1998
  102p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (San Diego, CA, April 13-17, 1998).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  After presenting a general linear model as a framework for 
discussion, this paper reviews five methodology errors that occur in 
educational research: (1) the use of stepwise methods; (2) the 
failure to consider in result interpretation the context specificity 
of analytic weights (e.g., regression beta weights, factor pattern 
coefficients, discriminant function coefficients, canonical function 
coefficients) that are part of all parametric quantitative analyses; 
(3) the failure to interpret both weights and structure coefficients 
as part of result interpretation; (4) the failure to recognize that 
reliability is a characteristic of scores, and not of tests; and (5) 
the incorrect interpretation of statistical significance and the 
related failure to report and interpret the effect sizes present in 
all quantitative analysis.  In several cases small heuristic 
discriminant analysis data sets are presented to make the discussion 
of each of these five methodology errors more concrete and accessible.  
Four appendixes contain computer programs for some of the analyses.  
(Contains 19 tables, 1 figure, and 143 references.) (SLD)
  Descriptors: *Educational Research; *Effect Size; *Research 
Methodology; Scores; *Statistical Significance; Tables (Data); *Test 
Reliability
  Identifiers: Stepwise Regression; *Weighting (Statistical)


  ED416214  TM028066
  Why "Encouraging" Effect Size Reporting Isn't Working: The Etiology 
of Researcher Resistance to Changing Practices.
  Thompson, Bruce
  1998
  18p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (Houston, TX, January 1998).
  Document Type: PROJECT DESCRIPTION (141);  CONFERENCE PAPER (150)
  Given decades of lucid, blunt admonitions that statistical 
significance tests are often misused, and that the tests are somewhat 
limited in utility, what is needed is less repeated bashing of 
statistical tests, and some honest reflection regarding the etiology 
of researchers' denial and psychological resistance (sometimes 
unconscious) to improved practice.  Three etiologies are briefly 
explored here: (1) atavism; (2) "is/ought" logic fallacies; and (3) 
confusion/desperation.  Understanding the etiology of psychological 
resistance may ultimately lead to improved interventions to assist in 
overcoming researcher resistance to reporting effect sizes and using 
non-nil nulls and other analytic improvements.  (Contains 45 
references.) (Author)
  Descriptors: Attitudes; Change; Denial (Psychology); *Educational 
Research; *Effect Size; *Etiology; *Research Methodology; 
*Researchers; *Statistical Significance


  ED408302  TM026504
  Use of Tests of Statistical Significance and Other Analytic Choices 
in a School Psychology Journal: Review of Practices and Suggested 
Alternatives.
  Snyder, Patricia A.; Thompson, Bruce
  24 Jan 1997
  25p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (Austin, TX, January 24, 1997).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  The use of tests of statistical significance was explored, first by 
reviewing some criticisms of contemporary practice in the use of 
statistical tests as reflected in a series of articles in the 
"American Psychologist" and in the appointment of a "Task Force on 
Statistical Inference" by the American Psychological Association 
(APA) to consider recommendations leading to improved practice.  
Related practices were reviewed in seven volumes of the "School 
Psychology Quarterly," an APA journal.  This review found that some 
contemporary authors continue to use and interpret statistical 
significance tests inappropriately.  The 35 articles reviewed 
reported a total of 321 statistical tests for which sufficient 
information was provided for effect sizes to be computed, but the 
authors of only 19 articles reported various magnitude-of-effect indices.  
Suggestions for improved practice are explored, beginning with the 
need to interpret statistical significance tests correctly, using 
more accurate language, and the need to report and interpret 
magnitude of effect indices.  Editorial policies must continue to 
evolve to require authors to meet these expectations.  (Contains 50 
references.) (SLD)
  Descriptors: Educational Psychology; *Educational Research; *Effect 
Size; Elementary Secondary Education; Research Methodology; Research 
Reports; *Scholarly Journals; *School Psychologists; Statistical 
Inference; *Statistical Significance; Test Interpretation; *Test Use
  Identifiers: American Psychological Association


  EJ565847  TM520973
  Statistical Significance Testing Practices in "The Journal of 
Experimental Education."
  Thompson, Bruce; Snyder, Patricia A.
  Journal of Experimental Education, v66 n1 p75-83 Fall 
  1997
  ISSN: 0022-0973
  Document Type: JOURNAL ARTICLE (080);  EVALUATIVE REPORT (142)
  The use of three aspects of recommended practice (language use, 
replicability analyses, and reporting effect sizes) was studied in 
quantitative reports in "The Journal of Experimental Education" (JXE) 
for the academic years 1994-95 and 1995-96.  Examples of both errors 
and desirable practices in the use and reporting of statistical 
significance tests in JXE are noted.  (SLD)
  Descriptors: *Effect Size; *Language Usage; *Research Methodology; 
Research Reports; Scholarly Journals; *Statistical Significance
  Identifiers: *Research Replication


  EJ564700  TM520936
  "Statistical Significance: Rationale, Validity and Utility" [book 
review].
  Simon, Marilyn K.
  Canadian Journal of Program Evaluation/La Revue canadienne 
d'evaluation de programme, v12 n2 p189-90 Aut   1997
  ISSN: 0834-1516
  Document Type: BOOK-PRODUCT REVIEW (072);  JOURNAL ARTICLE (080)
  Review states that the book gives an examination of the null-
hypothesis significance test procedure as an integral component of 
data analysis of quantitative research studies in the social sciences.  
It is designed for the nonmathematics student who will be doing 
empirical studies involving the testing of substantive hypotheses. (SLD)
  Descriptors: *Hypothesis Testing; *Research Methodology; Research 
Utilization; *Social Science Research; *Statistical Significance; 
*Validity
  Identifiers: *Null Hypothesis


  EJ559584  EC618689
  Debunking the Myth of the "Highly Significant" Result: Effect Sizes
in Gifted Education Research.
  Plucker, Jonathan A.
  Roeper Review, v20 n2 p122-26 Dec   1997
  ISSN: 0278-3193
  Document Type: JOURNAL ARTICLE (080);  RESEARCH REPORT (143)
  Describes the utility of effect size reporting and reports on a 
study that reviewed articles in three quarterly gifted journals and 
40 articles in journals not directly associated with gifted 
education, published over the last five years.  Effect sizes were 
generally not included in research articles, with results consistent 
across journals.  (Author/CR)
  Descriptors: Educational Research; *Effect Size; *Gifted; *Research 
Methodology; *Scholarly Journals; *Statistical Significance; 
Technical Writing


  EJ551464  UD520180
  Rejoinder: Editorial Policies Regarding Statistical Significance 
Tests: Further Comments.
  Thompson, Bruce
  Educational Researcher, v26 n5 p29-32 Jun-Jul   1997
  Document Type: JOURNAL ARTICLE (080);  POSITION PAPER (120)
  Argues that describing results as "significant" rather than 
"statistically significant" is confusing to the very people most apt 
to misinterpret this telegraphic wording.  The importance of 
reporting the effect size and the value of both internal and external 
replicability analyses are stressed.  (SLD)
  Descriptors: *Editing; *Educational Research; *Effect Size; 
Scholarly Journals; *Statistical Significance; Test Use; *Writing for 
Publication
  Identifiers: *Research Replication


  EJ551463  UD520179
  Reflections on Statistical and Substantive Significance, with a 
Slice of Replication.
  Robinson, Daniel H.; Levin, Joel R.
  Educational Researcher, v26 n5 p21-26 Jun-Jul   1997
  Document Type: JOURNAL ARTICLE (080);  EVALUATIVE REPORT (142)
  Proposes modifications to the recent suggestions by B. Thompson 
(1996) for an American Educational Research Association editorial 
policy on statistical significance testing.  Points out that, 
although it is useful to include effect sizes, they can be 
misinterpreted, and argues, as does Thompson, for greater attention 
to replication in educational research.  (SLD)
  Descriptors: *Editing; *Educational Research; *Effect Size; 
Research Methodology; Research Reports; Scholarly Journals; 
*Statistical Significance; *Test Use; Writing for Publication
  Identifiers: *Research Replication


  EJ541829  SE557633
  A Note on p-Values.
  Evans, Gwyn
  Teaching Statistics, v19 n1 p22-23 Spr   1997
  ISSN: 0141-982X
  Document Type: TEACHING GUIDE (052);  JOURNAL ARTICLE (080)
  Demonstrates the advantages of a p-value as compared with a 
standard significance test procedure.  Contains examples in the 
discussion of testing the mean of a normal distribution and testing a 
probability or proportion.  (DDR)
  Descriptors: British National Curriculum; Educational Strategies; 
Foreign Countries; Higher Education; *Probability; *Ratios 
(Mathematics); *Statistical Significance; *Statistics
  Identifiers: Great Britain


  ED415265  TM027966
  Has Testing for Statistical Significance Outlived Its Usefulness?
  McLean, James E.; Ernest, James M.
  1997
  21p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (26th, Memphis, TN, November 12-14, 
1997).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  The research methodology literature in recent years has included a 
full frontal assault on statistical significance testing.  An entire 
issue of the "Journal of Experimental Education" explored this 
controversy.  The 
purpose of this paper is to promote the position that while 
significance testing by itself may be flawed, it has not outlived its 
usefulness.  However, it must be considered in combination with other 
criteria.  Specifically, statistical significance is but one of three 
criteria that must be demonstrated to establish a position 
empirically.  Statistical significance merely provides evidence that 
an event did not happen by chance.  However, it provides no 
information about the meaningfulness (practical significance) of an 
event or if the event is replicable.  Consequently, statistical 
significance testing must be accompanied by judgments of the event's 
practical significance and replicability.  However, the likelihood of 
a chance occurrence of an event must not be ignored.  It is 
acknowledged that the importance of significance testing is reduced 
as sample size increases.  In large sample experiments, particularly 
those involving multiple variables, the role of significance testing 
is diminished because even small differences are often statistically 
significant.  In small sample studies where assumptions such as 
random sampling are practical, significance testing can be quite 
useful.  It is important to remember that statistical significance is 
but one criterion useful to inferential researchers.  In addition to 
statistical significance, practical significance, and replicability, 
researchers must also consider Type II errors and sample size.  
Furthermore, researchers should not ignore other techniques such as 
confidence intervals.  While all of these statistical concepts are 
related, they provide different types of information that assist 
researchers in making decisions.  (Contains 30 references.) 
(Author/SLD)
  Descriptors: Criteria; Decision Making; *Research Methodology; 
*Sample Size; *Statistical Significance; *Test Use
  Identifiers: *Research Replication


  ED413342  TM027613
  If Statistical Significance Tests Are Broken/Misused, What 
Practices Should Supplement or Replace Them?
  Thompson, Bruce
  1997
  32p.; Paper presented at the Annual Meeting of the American 
Psychological Association (105th, Chicago, IL, August 1997).
  Document Type: POSITION PAPER (120);  CONFERENCE PAPER (150)
  Given some consensus that statistical significance tests are 
broken, misused, or at least have somewhat limited utility, the focus 
of discussion within the field ought to move beyond additional 
bashing of statistical significance tests, and toward more 
constructive suggestions for improved practice.  Five suggestions for 
improved practice are recommended: (1) required reporting of effect 
sizes; (2) reporting of effect sizes in an interpretable manner; (3) 
explicating the values that bear on results; (4) providing evidence 
of result replicability; and (5) reporting confidence intervals.  
Although the five recommendations can be followed even if statistical 
significance tests are reported, social science will proceed most 
rapidly when research becomes the search for replicable effects 
noteworthy in magnitude in the context of both the inquiry and 
personal or social values.  (Contains 1 table and 74 references.) 
(Author/SLD)
  Descriptors: *Effect Size; *Research Methodology; *Statistical 
Significance; *Test Use
  Identifiers: *Confidence Intervals (Statistics); *Research 
Replication


  ED408336  TM026589
  Statistical Significance Testing in "Educational and Psychological 
Measurement" and Other Journals.
  Daniel, Larry G.
  Mar 1997
  33p.; Paper presented at the Annual Meeting of the National Council 
on Measurement in Education (Chicago, IL, March 25-27, 1997).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  Statistical significance tests (SSTs) have been the object of much 
controversy among social scientists.  Proponents have hailed SSTs as 
an objective means for minimizing the likelihood that chance factors 
have contributed to research results.  Critics have both questioned 
the logic underlying SSTs and bemoaned the widespread misapplication 
and misinterpretation of the results of these tests.  This paper 
offers a framework for remedying some of the common problems 
associated with SSTs via modification of journal editorial policies.  
The controversy surrounding SSTs is reviewed, with attention given to 
both historical and more contemporary criticisms of bad practices 
associated with misuse of SSTs.  Examples from the editorial policies 
of "Educational and Psychological Measurement" and several other 
journals that have established guidelines for reporting results of 
SSTs are discussed, and suggestions are provided regarding additional 
ways that educational journals may address the problem.  These 
guidelines focus on selecting qualified editors and reviewers, 
defining policies about use of SSTs that are in line with those of 
the American Psychological Association, and stressing effect size 
reporting.  An appendix presents a manuscript review form.  (Contains 
61 references.) (Author/SLD)
  Descriptors: Editing; *Educational Assessment; Policy; Research 
Problems; *Scholarly Journals; *Social Science Research; *Statistical 
Significance; *Test Use
  Identifiers: *Educational and Psychological Measurement


  ED408303  TM026505
  Use of Statistical Significance Tests and Reliability Analyses in 
Published Counseling Research.
  Thompson, Bruce; Snyder, Patricia A.
  25 Mar 1997
  24p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (Chicago, IL, March 1997).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  The mission of the "Journal of Counseling and Development" (JCD) 
includes the attempt to serve as a "scholarly record of the 
counseling profession" and as part of the "conscience of the 
profession." This responsibility requires the willingness to engage 
in self-study.  This study investigated two aspects of research 
practice in 25 quantitative studies reported in 1996 JCD issues: the 
use and interpretation of statistical significance tests, and the 
meaning of and ways of evaluating the score reliabilities of measures 
used in substantive research inquiry.  Too many researchers have 
persisted in equating result improbability with result value, and too 
many have persisted in believing that statistical significance 
evaluates result replicability.  In addition, too many researchers 
have persisted in believing that result improbability equals the 
magnitude of study effects.  Authors must consistently begin to 
report and interpret effect sizes to aid the interpretations they 
make and those made by their readers.  With respect to score 
reliability evaluation, more authors need to recognize that 
reliability inures to specific sets of scores and not to the test 
itself.  Thirteen of the JCD articles involved reports of score 
reliability in previous studies and eight reported reliability 
coefficients for both previous scores and those in hand.  These 
findings suggest some potential for improved practice in the 
quantitative research reported in JCD and improved editorial policies 
to support these changes.  (Contains 39 references.) (SLD)
  Descriptors: *Counseling; Educational Research; *Effect Size; 
Evaluation Methods; Reliability; Research Methodology; *Research 
Reports; *Scholarly Journals; Scores; *Statistical Significance; 
*Test Use
  Identifiers: *Journal of Counseling and Development; Research 
Replication


  ED407423  TM026445
  Ways To Explore the Replicability of Multivariate Results (Since 
Statistical Significance Testing Does Not).
  Kier, Frederick J.
  23 Jan 1997
  17p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (Austin, TX, January 23-25, 1997).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  It is a false, but common, belief that statistical significance 
testing evaluates result replicability.  In truth, statistical 
significance testing reveals nothing about results replicability.  
Since science is based on replication of results, methods that assess 
replicability are important.  This is particularly true when 
multivariate methods, which capitalize on sampling error, are used.  
This paper explores three methods that can give an idea of the 
replicability of results in multivariate analysis without having to 
repeat the study.  The first method is cross validation, a 
replication technique in which the entire sample is first run through 
the planned analysis and then the sample is randomly split into two 
unequal parts so that separate analyses are done on each part.  The 
jackknife is a second method of replicability that relies on 
partitioning out the impact or effect of a particular subset of the 
data on an estimate derived from the total sample.  The bootstrap, a 
third method of studying replicability, involves copying the data set 
into an infinitely large "mega" data set.  Many different samples are 
then drawn from the file and results are computed separately for each 
sample and then averaged.  The main drawback of all these internal 
replicability procedures is that their results are all based on the 
data from the one sample being analyzed.  However, internal 
replication techniques are better than not addressing the issue at 
all.  (Contains 18 references.) (SLD)
  Descriptors: Evaluation Methods; *Multivariate Analysis; *Sampling; 
*Statistical Significance
  Identifiers: Bootstrap Methods; Cross Validation; Jackknifing 
Technique; *Research Replication
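
  Note: The resampling logic summarized in the abstract above can be 
sketched in a few lines.  This is only an illustrative sketch, not the 
procedure used in the paper: it applies the bootstrap (resampling with 
replacement) and the jackknife (leaving out one observation at a time) 
to a simple mean rather than to a multivariate analysis.  The function 
names and the scores are hypothetical, invented for illustration.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Resample the data with replacement; return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    return [
        statistics.fmean(rng.choices(data, k=n))
        for _ in range(n_resamples)
    ]


def jackknife_means(data):
    """Return the mean recomputed with each observation left out in turn."""
    total = sum(data)
    n = len(data)
    return [(total - x) / (n - 1) for x in data]


# Hypothetical scores, invented purely for illustration.
scores = [12.0, 15.0, 9.0, 14.0, 11.0, 13.0, 10.0, 16.0]

boot = bootstrap_means(scores)
jack = jackknife_means(scores)

print("sample mean:", statistics.fmean(scores))
print("bootstrap SD of the mean:", statistics.stdev(boot))
print("jackknife range of the mean:", min(jack), max(jack))
```

  A small spread across resamples suggests the estimate is stable 
within this sample.  As the abstract notes, it says nothing about 
samples that were never drawn.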


  EJ535148  TM519805
  What to Do with the Upward Bias in R Squared: A Comment on Huberty.
  Snijders, Tom A. B.
  Journal of Educational and Behavioral Statistics, v21 n3 p283-98 
Fall   1996
  These articles comment on a recent article by Carl J. Huberty 
(1994), "A Note on Interpreting an R-squared Value," Journal of 
Educational and Behavioral Statistics, v19, p351-56.
  ISSN: 1076-9986
  Document Type: BOOK-PRODUCT REVIEW (072);  EVALUATIVE REPORT (142); 
 JOURNAL ARTICLE (080)
  Two commentaries describe some shortcomings of a recent discussion 
of the significance testing of R-squared by C. J. Huberty and upward 
bias in the statistic.  Both propose some modifications.  A response 
by Huberty acknowledges the importance of the exchange of ideas in 
the field of data analysis.  (SLD)
  Descriptors: *Bias; *Correlation; *Effect Size; *Regression 
(Statistics); *Statistical Significance


  EJ533527  TM519729
  Practical Significance: A Concept Whose Time Has Come.
  Kirk, Roger E.
  Educational and Psychological Measurement, v56 n5 p746-59 Oct 
  1996
  Article based on the presidential address delivered to the 
Southwestern Psychological Association meeting (Houston, TX, April 5, 
1996).
  ISSN: 0013-1644
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150);  
JOURNAL ARTICLE (080)
  Practical significance is concerned with whether a research result 
is useful in the real world.  The use of procedures to supplement the 
null hypothesis significance test in four journals of the American 
Psychological Association is examined, and an approach to assessing 
practical significance is presented.  (SLD)
  Descriptors: *Educational Research; *Hypothesis Testing; *Research 
Utilization; Sampling; *Scholarly Journals; *Statistical Significance
  Identifiers: American Psychological Association; *Null Hypothesis; 
*Practical Significance


  EJ525478  UD519259
  AERA Editorial Policies Regarding Statistical Significance Testing: 
Three Suggested Reforms.
  Thompson, Bruce
  Educational Researcher, v25 n2 p26-30 Mar   1996
  ISSN: 0013-189X
  Document Type: EVALUATIVE REPORT (142);  JOURNAL ARTICLE (080)
  Reviews practices regarding tests of statistical significance and 
policies of the American Educational Research Association (AERA).  
Decades of misuse of statistical significance testing are described, 
and revised editorial policies to improve practice are highlighted.  
Correct interpretation of statistical tests, interpretation of effect 
sizes, and exploration of research replicability are essential. (SLD)
  Descriptors: *Editing; Educational Research; *Effect Size; 
*Statistical Significance; Test Interpretation; *Test Use
  Identifiers: American Educational Research Association; *Editorial 
Policy; *Research Replication


  EJ520936  TM519322
  The Impact of Data-Analysis Methods on Cumulative Research 
Knowledge: Statistical Significance Testing, Confidence Intervals, 
and Meta-Analysis.
  Schmidt, Frank; Hunter, John E.
  Evaluation and the Health Professions, v18 n4 p408-27 Dec 
  1995
  Special issue titled "The Meta-Analytic Revolution in Health 
Research: Part II."
  ISSN: 0163-2787
  Document Type: EVALUATIVE REPORT (142);  JOURNAL ARTICLE (080)
  It is argued that point estimates of effect sizes and confidence 
intervals around these point estimates are more appropriate 
statistics for individual studies than reliance on statistical 
significance testing and that meta-analysis is appropriate for 
analysis of data from multiple studies.  (SLD)
  Descriptors: *Effect Size; Estimation (Mathematics); *Knowledge 
Level; *Meta Analysis; *Research Methodology; *Statistical 
Significance; Test Use
  Identifiers: *Confidence Intervals (Statistics)


  ED393939  TM024976
  Understanding the Sampling Distribution and Its Use in Testing 
Statistical Significance.
  Breunig, Nancy A.
  9 Nov 1995
  25p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (Biloxi, MS, November 1995).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  Despite the increasing criticism of statistical significance 
testing by researchers, reflected notably in the 1994 edition of the 
American Psychological Association's style manual, statistical 
significance test results are still popular in journal articles.  For 
this reason, it remains important to understand the logic of 
inferential statistics.  A fundamental concept in inferential 
statistics is the sampling distribution.  This paper explains the 
sampling distribution and the Central Limit Theorem and their role in 
statistical significance testing.  Included in the discussion is a 
demonstration of how computer applications can be used to teach 
students about the sampling distribution.  The paper concludes with 
an example of hypothesis testing and an explanation of how the 
standard deviation of the sampling distribution is either calculated 
based on statistical assumptions or is empirically estimated using 
logics such as the "bootstrap." These concepts are illustrated 
through the use of hand generated and computer examples.  An appendix 
displays five computer screens designed to teach these topics.  
(Contains 1 table, 4 figures, and 20 references.) (Author/SLD)
  Descriptors: *Computer Uses in Education; *Educational Research; 
*Hypothesis Testing; *Sampling; Statistical Distributions; 
Statistical Inference; *Statistical Significance; Test Results
  Identifiers: Bootstrap Methods; *Central Limit Theorem


  ED392819  TM024458
  Editorial Policies Regarding Statistical Significance Testing: 
Three Suggested Reforms.
  Thompson, Bruce
  8 Nov 1995
  24p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (Biloxi, MS, November 1995).
  Document Type: POSITION PAPER (120);  CONFERENCE PAPER (150)
  Editorial practices revolving around tests of statistical 
significance are explored.  The logic of statistical significance 
testing is presented in an accessible manner--many people who use 
statistical tests might not place such a premium on them if they knew 
what the tests really do, and what they do not do.  The etiology of 
decades of misuse of statistical tests is explored, highlighting the 
bad implicit logic of persons who misuse statistical tests.  Finally, 
three revised editorial policies that would improve conventional 
practice are discussed.  The first is the use of better language, 
with insistence on universal use of the phrase "statistical 
significance" to emphasize that the common meaning of "significant" 
has nothing to do with results being important.  A second improvement 
would be emphasizing effect size interpretation, and a third would be 
using and reporting strategies that evaluate the replicability of 
results.  Internal replicability analyses such as cross validation, 
the jackknife, or the bootstrap would help determine whether results 
are stable across sample variations.  (Contains 51 references.) 
(Author/SLD)
  Descriptors: *Editing; *Educational Assessment; *Effect Size; 
Quality Control; *Research Methodology; *Statistical Significance; 
*Test Use
  Identifiers: Bootstrap Methods; Cross Validation; Jackknifing 
Technique; *Research Replication


  ED382639  TM023069
  Effect Size as an Alternative to Statistical Significance Testing.
  McClain, Andrew L.
  Apr 1995
  18p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (San Francisco, CA, April 18-22, 
1995).
  Document Type: REVIEW LITERATURE (070);  CONFERENCE PAPER (150)
  The present paper discusses criticisms of statistical significance 
testing from both historical and contemporary perspectives.  
Statistical significance testing is greatly influenced by sample size 
and often results in meaningless information being over-reported.  
Variance-accounted-for effect sizes are presented as an alternative 
to statistical significance testing.  A review of the "Journal of 
Clinical Psychology" (1993) reveals a continued reliance on 
statistical significance testing on the part of researchers.  
Finally, scatterplots and correlation coefficients are presented to 
illustrate the lack of linear relationship between sample size and 
effect size.  Two figures are included.  (Contains 24 references.) 
(Author)
  Descriptors: Correlation; *Effect Size; Research Methodology; 
*Sample Size; *Statistical Significance; *Testing
  Identifiers: Scattergrams; *Variance (Statistical)
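
  Note: The dependence of statistical significance on sample size 
mentioned in the abstract above can be illustrated with a short 
calculation.  This sketch is not taken from the paper.  It assumes a 
two-group comparison with equal group sizes and a pooled standard 
deviation, in which case the independent-samples t statistic equals 
Cohen's d times the square root of n/2, where n is the size of each 
group.  The function name and the chosen values are hypothetical.

```python
import math


def t_from_d(d, n_per_group):
    """t statistic for two equal-sized groups, given the standardized
    mean difference d (Cohen's d computed with the pooled SD)."""
    return d * math.sqrt(n_per_group / 2)


# The same small effect (d = 0.2) evaluated at increasing sample sizes.
for n in (10, 100, 1000):
    print(f"n per group = {n:4d}   t = {t_from_d(0.2, n):.2f}")
```

  The effect size is held constant throughout; only the sample size 
changes, yet the test statistic grows without bound as n increases.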


  EJ481563  EC608481
  Interpretation of Statistical Significance Testing: A Matter of 
Perspective.
  McClure, John; Suen, Hoi K.
  Topics in Early Childhood Special Education, v14 n1 p88-100 Spr 
  1994
  Theme Issue: Methodological Issues and Advances.
  ISSN: 0271-1214
  Document Type: JOURNAL ARTICLE (080);  POSITION PAPER (120)
  Target Audience: Researchers
  This article compares three models that have been the foundation 
for approaches to the analysis of statistical significance in early 
childhood research--the Fisherian and the Neyman-Pearson models (both 
considered "classical" approaches), and the Bayesian model.  The 
article concludes that all three models have a place in the analysis 
of research results.  (JDD)
  Descriptors: *Bayesian Statistics; Early Childhood Education; 
Educational Research; *Hypothesis Testing; Models; *Research 
Methodology; Statistical Analysis; *Statistical Significance


  ED367678  TM021117
  Historical Origins of Contemporary Statistical Testing Practices: 
How in the World Did Significance Testing Assume Its Current Place in 
Contemporary Analytic Practice?
  Weigle, David C.
  Jan 1994
  18p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (San Antonio, TX, January 27, 1994).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  The purposes of the present paper are to address the historical 
development of statistical significance testing and to briefly 
examine contemporary practices regarding such testing in the light of 
these historical origins.  Precursors leading to the advent of 
statistical significance testing are examined as are more recent 
controversies surrounding the issue.  As the etiology of current 
practice is explored, it will become more apparent whether current 
practices evolved from deliberative judgment or merely developed from 
happenstance that has become reified in routine.  Examination of the 
history of analysis suggests that the development of statistical 
significance testing has indeed involved a degree of deliberative 
judgment.  It may be that the time for significance testing came and 
went, but there is no doubt that significance testing served as an 
important catalyst for the growth of science in the 20th century.  
(Contains 39 references.) (Author/SLD)
  Descriptors: *Data Analysis; Educational History; Etiology; 
*Research Methodology; *Scientific Research; *Statistical 
Significance; *Testing


  EJ475203  TM517631
  Statistical Significance Testing from Three Perspectives and 
Interpreting Statistical Significance and Nonsignificance and the 
Role of Statistics in Research.
  Levin, Joel R.; And Others
  Journal of Experimental Education, v61 n4 p378-93 Sum 
  1993
  Theme issue title: "Statistical Significance Testing in 
Contemporary Practice: Some Proposed Alternatives with Comments from 
Journal Editors."
  ISSN: 0022-0973
  Document Type: COLLECTION (020);  POSITION PAPER (120);  JOURNAL 
ARTICLE (080)
  Journal editors respond to criticisms of reliance on statistical 
significance in research reporting.  Joel R. Levin ("Journal of 
Educational Psychology") defends its use, whereas William D. Schafer 
("Measurement and Evaluation in Counseling and Development") 
emphasizes the distinction between statistically significant and 
important.  William Asher ("Journal of Experimental Education") 
comments on preceding discussions.  (SLD)
  Descriptors: Editing; Editors; Educational Assessment; *Educational 
Research; Elementary Secondary Education; Higher Education; 
Hypothesis Testing; *Research Methodology; Research Reports; 
Scholarly Journals; *Statistical Significance; Statistics


  
  EJ475198  TM517626
  What Statistical Significance Testing Is, and What It Is Not.
  Shaver, James P.
  Journal of Experimental Education, v61 n4 p293-316 Sum 
  1993
  Theme issue title: "Statistical Significance Testing in 
Contemporary Practice: Some Proposed Alternatives with Comments from 
Journal Editors."
  ISSN: 0022-0973
  Document Type: EVALUATIVE REPORT (142);  JOURNAL ARTICLE (080)
  Reviews the role of statistical significance testing, and argues 
that dominance of such testing is dysfunctional because significance 
tests do not provide the information that many researchers assume 
they do.  Possible reasons for the persistence of statistical 
significance testing are discussed briefly, and ways to moderate 
negative effects are suggested.  (SLD)
  Descriptors: Educational Practices; *Educational Research; 
Elementary Secondary Education; Higher Education; Research Design; 
Research Methodology; *Research Problems; Scholarly Journals; 
*Statistical Significance


  EJ475197  TM517625
  The Case against Statistical Significance Testing, Revisited.
  Carver, Ronald P.
  Journal of Experimental Education, v61 n4 p287-92 Sum 
  1993
  Theme issue title: "Statistical Significance Testing in 
Contemporary Practice: Some Proposed Alternatives with Comments from 
Journal Editors."
  ISSN: 0022-0973
  Document Type: EVALUATIVE REPORT (142);  JOURNAL ARTICLE (080)
  Four things are recommended to minimize the influence or importance 
of statistical significance testing.  Researchers must not neglect to 
add "statistical" to significant and could interpret results before 
giving p-values.  Effect sizes should be reported with measures of 
sampling error, and replication can be built into the design.  (SLD)
  Descriptors: Educational Researchers; *Effect Size; Error of 
Measurement; *Research Methodology; Research Problems; Sampling; 
*Statistical Significance
  Identifiers: *P Values; *Research Replication


  ED364608  TM020880
  Meaningfulness, Statistical Significance, Effect Size, and Power 
Analysis: A General Discussion with Implications for MANOVA.
  Huston, Holly L.
  Nov 1993
  29p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (22nd, New Orleans, LA, November 9-
12, 1993).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  This paper begins with a general discussion of statistical 
significance, effect size, and power analysis; and concludes by 
extending the discussion to the multivariate case (MANOVA).  
Historically, traditional statistical significance testing has guided 
researchers' thinking about the meaningfulness of their data.  The 
use of significance testing alone in making these decisions has 
proved problematic.  It is likely that less reliance on statistical 
significance testing, and an increased use of power analysis and 
effect size estimates in combination could contribute to an overall 
improvement in the quality of new research produced.  The more 
informed researchers are about the benefits and limitations of 
statistical significance, effect size, and power analysis, the more 
likely it is that they will be able to make more sophisticated and 
useful interpretations about the meaningfulness of research results.  
One table illustrates the discussion.  (Contains 37 references.) (SLD)
  Descriptors: *Effect Size; *Estimation (Mathematics); *Multivariate 
Analysis; *Research Methodology; Research Reports; *Statistical 
Significance
  Identifiers: *Meaningfulness; *Power (Statistics)

  
  ED364593  TM020837
  What Is the Probability of Rejecting the Null Hypothesis?: 
Statistical Power in Research.
  Galarza-Hernandez, Aitza
  Nov 1993
  30p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (22nd, New Orleans, LA, November 9-
12, 1993).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  Power refers to the probability that a statistical test will yield 
statistically significant results.  In spite of the close 
relationship between power and statistical significance, there is a 
consistent overemphasis in the literature on statistical significance.  
This paper discusses statistical significance and its limitations and 
also includes a discussion of statistical power in the behavioral 
sciences.  Finally, some recommendations to increase power are 
provided, focusing on the necessity of paying more attention to power 
issues.  Changing editorial policies and practices so that editors 
ask authors to estimate the power of their tests is a useful way to 
improve the situation.  Planning research to consider power is 
another way to ensure that the question of the probability of 
rejecting the null hypothesis is answered correctly.  Four tables and 
two figures illustrate the discussion.  (Contains 28 references.) 
(SLD)
  Descriptors: *Behavioral Science Research; Editors; *Estimation 
(Mathematics); Hypothesis Testing; Literature Reviews; *Probability; 
Research Design; *Research Methodology; Scholarly Journals; 
*Statistical Significance
  Identifiers: *Null Hypothesis; *Power (Statistics)
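
  Illustration (not part of the ERIC record): the abstract above 
defines power as the probability that a test will yield a 
statistically significant result.  A minimal sketch of that idea 
follows, assuming a two-sided one-sample z test with known variance; 
the function name, the standardized effect size of 0.5, and the 
sample sizes are hypothetical choices made for this example, not 
values from the paper.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z test.

    effect_size: standardized mean difference (assumed, not from the paper)
    n:           sample size
    alpha:       a priori significance level
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = effect_size * n ** 0.5              # mean of the statistic under the alternative
    # Probability that the statistic lands in either rejection region.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


# Power rises with sample size when the effect size is held fixed.
for n in (10, 20, 50, 100):
    print(n, round(z_test_power(0.5, n), 3))
```

  When the effect size is zero the function returns alpha itself, 
which is one way to see that power and the significance level are 
two readings of the same rejection probability.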

  
  EJ458887  CG542330
  Simultaneous Inference: Objections and Recommendations.
  Schafer, William D.
  Measurement and Evaluation in Counseling and Development, v25 n4 
p146-48 Jan   1993
  ISSN: 0748-1756
  Document Type: JOURNAL ARTICLE (080);  POSITION PAPER (120)
  Considers objections to comparisonwise position, which holds that, 
when conducting simultaneous significance procedures, per-test Type I 
error rate should be controlled and that it is unnecessary to 
introduce adjustments designed to control familywise rate.  
Objections collected by Saville in an attempt to refute them are 
discussed along with Saville's conclusions.  Recommendations are 
introduced for reporting significance tests in journals.  (NB)
  Descriptors: *Statistical Inference; *Statistical Significance; 
Statistics
  Identifiers: *Simultaneous Inference
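
  Illustration (not part of the ERIC record): the distinction 
between per-test (comparisonwise) and familywise Type I error rates 
can be made concrete with a little arithmetic.  The sketch below 
assumes the tests are independent, which real families of tests 
often are not; the function names are hypothetical, and the 
Bonferroni division is shown only as one common familywise 
adjustment, not as a procedure the article recommends.

```python
def familywise_error_rate(alpha_per_test, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha_per_test) ** n_tests


def bonferroni_alpha(alpha_family, n_tests):
    """Per-test alpha that keeps the familywise rate at or below alpha_family."""
    return alpha_family / n_tests


# With the per-test rate fixed at .05, the familywise rate grows with the
# number of tests (independence assumed).
for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 4))
```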


  ED347169  TM018523
  Statistical Significance Testing: Alternatives and Considerations.
  Wilkinson, Rebecca L.
  Jan 1992
  28p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (Houston, TX, January 31-February 2, 
1992).
  Document Type: POSITION PAPER (120);  CONFERENCE PAPER (150)
  Problems inherent in relying solely on statistical significance 
testing as a means of data interpretation are reviewed.  The biggest 
problem with statistical significance testing is that researchers 
have used the results of this testing to ascribe importance or 
meaning to their studies where such meaning often does not exist.  
Often researchers mistake statistically significant results for 
important effects.  Statistical procedures are too often used as 
substitutes for thought, rather than as aids to researcher thinking.  
Alternatives to statistical significance testing that are explored 
are effect size, statistical power, and confidence intervals.  Other 
considerations for further data analysis that are explored are: (1) 
measurement reliability; (2) data exploration; and (3) the 
replicability of research results.  It is suggested that statistical 
significance testing be used only as a guide in interpreting one's 
results.  Two tables present illustrative information, and there is a 
22-item list of references.  (SLD)
  Descriptors: *Data Interpretation; Effect Size; *Reliability; 
Researchers; Research Methodology; *Research Problems; *Statistical 
Significance
  Identifiers: Confidence Intervals (Statistics); Power (Statistics); 
Research Replication


  ED344905  TM018225
  What Statistical Significance Testing Is, and What It Is Not.
  Shaver, James P.
  Apr 1992
  43p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (San Francisco, CA, April 20-24, 
1992).
  Document Type: CONFERENCE PAPER (150)
  A test of statistical significance is a procedure for determining 
how likely a result is assuming a null hypothesis to be true with 
randomization and a sample of size n (the given size in the study).  
Randomization, which refers to random sampling and random assignment, 
is important because it ensures the independence of observations, but 
it does not guarantee independence beyond the initial sample 
selection.  A test of statistical significance provides a statement 
of probability of occurrence in the long run, with repeated random 
sampling under the null hypothesis, but provides no basis for a 
conclusion about the probability that a particular result is 
attributable to chance.  A test of statistical significance also does 
not indicate the probability that the null hypothesis is true or 
false and does not indicate whether a treatment being studied had an 
effect.  Statistical significance indicates neither the magnitude nor 
the importance of a result, and is no indication of the probability 
that a result would be obtained on study replication.  Although tests 
of statistical significance yield little valid information for 
questions of interest in most educational research, use and misuse of 
such tests remain common for a variety of reasons.  Researchers 
should be encouraged to minimize statistical significance tests and 
to state expectations for quantitative results as critical effect 
sizes.  There is a 58-item list of references.  (SLD)
  Descriptors: Educational Research; Evaluation Problems; Hypothesis 
Testing; Probability; Psychological Studies; *Research Design; 
Research Problems; *Sample Size; *Statistical Significance; Test 
Validity
  Identifiers: *Null Hypothesis; *Randomization (Statistics); 
Research Replication

  
  ED333036  TM016545
  The Place of Significance Testing in Contemporary Social Science.
  Moore, Mary Ann
  3 Apr 1991
  23p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (Chicago, IL, April 3-7, 1991).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  This paper examines the problems caused by relying solely on 
statistical significance tests to interpret results in contemporary 
social science.  The place of significance testing in educational 
research has often been debated.  Among the problems in reporting 
statistical significance are questions of definition and terminology.  
Problems are also found in the use, as well as the reporting, of 
significance testing.  One of the most important problems is the 
effect of sample size on significance.  An example with a fixed 
effect size of 25% and samples containing 22, 23, and 24 people 
illustrates these effects.  The issues of validity and reliability in 
significance testing with measurement studies are considered.  
Although these problems are widely recognized, publishers show a 
clear bias in favor of reports that claim statistical significance.  
Researchers need to recognize the limitations of significance testing.  
Effect size statistics aid in the interpretation of results and 
provide a guide to the relative importance of the study.  Two tables 
illustrate the effects of sample size.  A 22-item list of references 
is included.  (SLD)
  Descriptors: *Data Interpretation; Educational Research; *Effect 
Size; Research Methodology; *Research Problems; *Sample Size; *Social 
Science Research; *Statistical Significance


  ED325524  TM015782
  Alternatives to Statistical Significance Testing.
  Palomares, Ronald S.
  8 Nov 1990
  20p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (19th, New Orleans, LA, November 14-
16, 1990).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  Researchers increasingly recognize that significance tests are 
limited in their ability to inform scientific practice.  Common 
errors in interpreting significance tests and three strategies for 
augmenting the interpretation of significance test results are 
illustrated.  The first strategy for augmenting the interpretation of 
significance tests involves evaluating significance test results in a 
sample size context.  A second strategy involves interpretation of 
effect size estimates; several estimates and corrections are 
discussed.  A third strategy emphasizes interpretation based on 
estimated likelihood that results will replicate.  The bootstrap 
method of B. Efron and others and cross-validation strategies are 
illustrated.  A 28-item list of references and four data tables are 
included.  (Author/SLD)
  Descriptors: *Effect Size; Estimation (Mathematics); *Evaluation 
Methods; *Research Design; Research Problems; *Sample Size; 
*Statistical Significance
  Identifiers: Bootstrap Methods; Cross Validation

  
  ED320965  TM015274
  Looking beyond Statistical Significance: Result Importance and 
Result Generalizability.
  Welge-Crow, Patricia A.; And Others
  25 May 1990
  23p.; Paper presented at the Annual Meeting of the American 
Psychological Society (Dallas, TX, June 9, 1990).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  Three strategies for augmenting the interpretation of significance 
test results are illustrated.  Determining the most suitable indices 
to use in evaluating empirical results is a matter of considerable 
debate among researchers.  Researchers increasingly recognize that 
significance tests are very limited in their potential to inform the 
interpretation of scientific results.  The first strategy involves 
evaluating significance test results in a sample size context.  The 
researcher is encouraged to determine at what smaller sample size a 
statistically significant fixed effect size would no longer be 
significant, or conversely, at what larger sample size a non-
significant result would become statistically significant.  The 
second strategy would involve interpreting effect size as an index of 
result importance.  The third strategy emphasizes interpretation 
based on the estimated likelihood that results will replicate.  These 
applications are illustrated via small heuristic data sets to make 
the discussion more concrete.  A 37-item list of references, seven 
data tables, and an appendix illustrating relevant computer commands 
are provided.  (TJH)
  Descriptors: Educational Research; *Effect Size; Estimation 
(Mathematics); *Generalizability Theory; Heuristics; Mathematical 
Models; Maximum Likelihood Statistics; *Research Methodology; *Sample 
Size; *Statistical Significance; *Test Interpretation; Test Results
  Identifiers: Empirical Research; Research Replication
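
  Illustration (not part of the ERIC record): the first strategy 
above asks at what sample size a fixed effect size would cross the 
significance threshold.  A rough sketch for a Pearson correlation 
follows.  It relies on the Fisher z approximation rather than the 
exact t test, so results near the threshold are approximate; the 
function names and the example value r = .30 are assumptions made 
for this example.

```python
import math
from statistics import NormalDist


def approx_p_for_r(r, n):
    """Two-sided p-value for a Pearson r using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def smallest_significant_n(r, alpha=0.05, n_max=100_000):
    """Smallest n at which a fixed r would give p < alpha, or None if not found."""
    for n in range(4, n_max + 1):
        if approx_p_for_r(r, n) < alpha:
            return n
    return None


# The same r is "nonsignificant" in a small sample and "significant" in a
# larger one, although the effect size has not changed.
r = 0.30
print(approx_p_for_r(r, 20), approx_p_for_r(r, 100))
print(smallest_significant_n(r))
```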

  
  EJ404813  CG537044
  Multiple Criteria for Evaluating the Magnitude of Experimental 
Effects.
  Haase, Richard F.; And Others
  Journal of Counseling Psychology, v36 n4 p511-16 Oct 
  1989
  Document Type: JOURNAL ARTICLE (080);  POSITION PAPER (120)
  Contends that tests of statistical significance and measures of 
magnitude in counseling psychology research do not provide same 
information.  Argues interpreting magnitude of experimental effects 
must be two-stage decision process with the second stage of process 
being conditioned on results of a test of statistical significance 
and entailing evaluation of absolute magnitude of effect.  
(Author/ABL)
  Descriptors: *Research Methodology; *Research Needs; *Statistical 
Significance; *Test Interpretation
  Identifiers: *Counseling Psychology


  ED314450  TM014265
  Comments on Better Uses of and Alternatives to Significance 
Testing.
  Davidson, Betty M.; Giroir, Mary M.
  9 Nov 1989
  18p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (Little Rock, AR, November 8-10, 
1989).
  Document Type: REVIEW LITERATURE (070);  EVALUATIVE REPORT (142);  
CONFERENCE PAPER (150)
  Controversy over the proper place of significance testing within 
scientific methodology has continued for some time.  The suggestion 
that effect sizes are more important than whether results are 
significant is presented.  Effect size can be defined as an estimate 
of how much of the dependent variable is accounted for by the 
independent variables.  Interpretations of statistical significance 
can be seriously incorrect when the researcher underinterprets an 
outcome with a large effect size that is nonsignificant or 
overinterprets an outcome that involves a small effect size but which 
is statistically significant.  These problems can be avoided if the 
researcher includes effect size in result interpretation.  It has 
been stated that statistical significance was never intended to take 
the place of replication in research.  Researchers must begin drawing 
conclusions based on effect sizes and not statistical significance 
alone; and the replicability and reliability of results must be 
recognized, analyzed, and interpreted.  Two tables illustrate effect 
sizes.  (SLD)
  Descriptors: *Effect Size; *Reliability; Research Design; 
Researchers; *Scientific Methodology; Statistical Analysis; 
*Statistical Significance
  Identifiers: *Significance Testing
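
  Illustration (not part of the ERIC record): the abstract defines 
effect size as an estimate of how much of the dependent variable is 
accounted for by the independent variables.  One common index of 
that kind is eta squared, the between-groups sum of squares divided 
by the total sum of squares.  The sketch below assumes a one-way 
design; the function name and the three small groups of scores are 
invented for the example.

```python
def eta_squared(groups):
    """Proportion of total outcome variance accounted for by group membership."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Made-up heuristic data: three groups of three scores each.
scores_by_group = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
print(eta_squared(scores_by_group))
```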

  
  ED314449  TM014264
  Ways of Estimating the Probability That Results Will Replicate.
  Giroir, Mary M.; Davidson, Betty M.
  9 Nov 1989
  17p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (Little Rock, AR, November 8-10, 
1989).
  Document Type: EVALUATIVE REPORT (142);  CONFERENCE PAPER (150)
  Replication is important to viable scientific inquiry; results that 
will not replicate or generalize are of very limited value.  
Statistical significance enables the researcher to reject or not 
reject the null hypothesis according to the sample results obtained, 
but statistical significance does not indicate the probability that 
results will be replicated.  Three techniques for evaluating the 
sampling specificity of results are described: (1) the jackknife 
technique of J. W. Tukey (1969); (2) the bootstrap technique of 
Efron, described by P. Diaconis and B. Efron (1983); and (3) cross-
validation methods described by B. Thompson (1989).  A small data set 
developed by B. Thompson in 1979 is used to demonstrate the cross-
validation procedure in detail.  These three procedures allow the 
researcher to examine the replicability and generalizability of 
results and should be used frequently.  Two tables present the study 
results, and an appendix gives examples of commands for the 
Statistical Analysis System computer package used for the cross-
validation example.  (SLD)
  Descriptors: *Estimation (Mathematics); *Generalizability Theory; 
Hypothesis Testing; Probability; Research Design; Sample Size; 
*Sampling; Scientific Methodology; *Statistical Significance
  Identifiers: Bootstrap Hypothesis; *Cross Validation; Jackknifing 
Technique; *Research Replication; Research Results


  ED303514  TM012775
  Statistical Significance Testing: From Routine to Ritual.
  Keaster, Richard D.
  Nov 1988
  15p.; Paper presented at the Annual Meeting of the Mid-South 
Educational Research Association (Louisville, KY, November 9-11, 
1988).
  Document Type: CONFERENCE PAPER (150);  EVALUATIVE REPORT (142);  
REVIEW LITERATURE (070)
  Target Audience: Researchers
  An explanation of the misuse of statistical significance testing 
and the true meaning of "significance" is offered.  Literature about 
the criticism of current practices of researchers and publications is 
reviewed in the context of tests of significance.  The problem under 
consideration occurs when researchers attempt to do more than just 
establish that a relationship has been observed.  More often than 
not, too many researchers assume that the difference, and even the 
size of the difference, proves or at least confirms the research 
hypothesis.  Statistical significance is not a measure of 
"substantive" significance or what might be called scientific 
importance.  Significance testing was designed to yield yes/no 
decisions.  It is suggested that authors of research projects should 
not try to interpret the magnitudes of their significance findings.  
Significance testing must be returned to its proper place in the 
scientific process.  (SLD)
  Descriptors: Educational Assessment; Research Design; Research 
Methodology; *Research Problems; Statistical Analysis; *Statistical 
Significance; Statistics


  EJ352091  CG531911
  How Significant Is a Significant Difference? Problems With the 
Measurement of Magnitude of Effect.
  Murray, Leigh W.; Dosser, David A., Jr.
  Journal of Counseling Psychology, v34 n1 p68-72 Jan 
  1987
  Document Type: JOURNAL ARTICLE (080);  GENERAL REPORT (140)
  The use of measures of magnitude of effect has been advocated as a 
way to go beyond statistical tests of significance and to identify 
effects of a practical size.  They have been used in meta-analysis to 
combine results of different studies.  Describes problems associated 
with measures of magnitude of effect (particularly study size) and 
implications for researchers.  (Author/KS)
  Descriptors: *Effect Size; *Meta Analysis; Research Design; 
Research Methodology; *Sample Size; *Statistical Analysis; 
*Statistical Inference; *Statistical Significance


  ED285902  TM870488
  The Use of Invariance and Bootstrap Procedures as a Method to 
Establish the Reliability of Research Results.
  Sandler, Andrew B.
  30 Jan 1987
  19p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (Dallas, TX, January 29-31, 1987).
  Document Type: CONFERENCE PAPER (150);  RESEARCH REPORT (143)
  Target Audience: Researchers
  Statistical significance is misused in educational and 
psychological research when it is applied as a method to establish 
the reliability of research results.  Other techniques have been 
developed which can be correctly utilized to establish the 
generalizability of findings.  Methods that do provide such estimates 
are known as invariance or cross-validation procedures and the 
bootstrap method.  Invariance procedures split the total sample into 
two subgroups and apply techniques to analyze each subgroup and 
compare results, often by using parameters obtained from one subgroup 
to evaluate the other subgroup.  A simulated data set is presented 
and analyzed by invariance procedures for: (1) canonical correlation; 
(2) regression and discriminant analysis; (3) analysis of variance 
and covariance; and (4) bivariate correlation.  Whereas invariance 
procedures split a sample into two parts, the bootstrap method 
creates multiple copies of the data set.  The number of copies could 
exceed millions with current computer capability.  The copies are 
shuffled and artificial samples of 20 cases each, called bootstrap 
samples, are randomly selected.  The value of the Pearson product-
moment correlation (or other statistics) is then calculated for each 
bootstrap sample to assess the generalizability of the results. (LPG)
  Descriptors: Analysis of Covariance; Analysis of Variance; 
Correlation; Discriminant Analysis; *Generalizability Theory; 
*Mathematical Models; Regression (Statistics); *Reliability; Research 
Design; Research Problems; *Sample Size; Sampling; Simulation; 
Statistical Inference; *Statistical Significance; Statistical Studies; 
Validity
  Identifiers: *Bootstrap Hypothesis; *Cross Validation; Invariance 
Principle
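
  Illustration (not part of the ERIC record): the abstract describes 
the bootstrap as copying the data set many times, shuffling the 
copies, and drawing artificial samples of 20 cases.  The sketch 
below uses the equivalent and more common formulation of sampling 
cases with replacement from the original sample.  The 20 paired 
scores, the function names, the seed, and the percentile interval 
are all assumptions made for this example, not material from the 
paper.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_r(xs, ys, n_resamples=2000, seed=0):
    """Resample cases with replacement; return the sorted bootstrap r values."""
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    rs = []
    while len(rs) < n_resamples:
        sample = rng.choices(pairs, k=len(pairs))
        bx, by = zip(*sample)
        # Skip degenerate resamples in which either variable has zero variance.
        if len(set(bx)) > 1 and len(set(by)) > 1:
            rs.append(pearson_r(bx, by))
    return sorted(rs)


def percentile_interval(sorted_values, level=0.95):
    """Simple percentile interval taken from sorted bootstrap values."""
    lo_idx = int((1 - level) / 2 * len(sorted_values))
    hi_idx = int((1 + level) / 2 * len(sorted_values)) - 1
    return sorted_values[lo_idx], sorted_values[hi_idx]


# Invented data: 20 cases, matching the sample size mentioned in the abstract.
xs = list(range(1, 21))
ys = [3, 1, 4, 6, 2, 7, 9, 5, 8, 12, 10, 11, 15, 9, 14, 13, 18, 16, 20, 17]
rs = bootstrap_r(xs, ys)
print(pearson_r(xs, ys), percentile_interval(rs))
```

  A wide interval suggests that the sample correlation depends 
heavily on which cases happened to be drawn, which is the kind of 
sampling specificity a p-value does not reveal.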


  ED281852  TM870223
  A Primer on MANOVA Omnibus and Post Hoc Tests.
  Heausler, Nancy L.
  30 Jan 1987
  21p.; Paper presented at the Annual Meeting of the Southwest 
Educational Research Association (Dallas, TX, January 30, 1987).
  Document Type: CONFERENCE PAPER (150);  RESEARCH REPORT (143)
  Target Audience: Researchers
  Each of the four classic multivariate analysis of variance (MANOVA) 
tests of statistical significance may lead a researcher to different 
decisions as to whether a null hypothesis should be rejected: (1) 
Wilks' lambda; (2) Lawley-Hotelling trace criterion; (3) Roy's 
greatest characteristic root criterion; and (4) Pillai's trace 
criterion.  These four omnibus test statistics are discussed and 
their optimal uses illustrated using hypothetical data sets.  
Discriminant analysis as a post hoc method to MANOVA is illustrated 
in detail.  Once a significant MANOVA has been found, the next step 
is to interpret the non-chance association between dependent and 
independent variables.  (Author/GDC)
  Descriptors: Analysis of Variance; Discriminant Analysis; Factor 
Analysis; *Hypothesis Testing; *Multivariate Analysis; *Statistical 
Significance; Statistical Studies
  Identifiers: Omnibus Test; Post Hoc Methods


  
  EJ327959  EA519388
  Chance and Nonsense: A Conversation about Interpreting Tests of 
Statistical Significance, Part 2.
  Shaver, James P.
  Phi Delta Kappan, v67 n2 p138-41 Oct   1985
  For Part 1, see EJ 326 611 (September 1985 "Phi Delta Kappan").
  Document Type: JOURNAL ARTICLE (080);  RESEARCH REPORT (143);  
POSITION PAPER (120)
  Target Audience: Researchers; Practitioners
  The second half of a dialogue between two fictional teachers 
examines the significance of statistical significance in research and 
considers the factors affecting the extent to which research results 
provide important or useful information.  (PGD)
  Descriptors: Educational Research; *Research Methodology; Research 
Problems; Sampling; Statistical Analysis; *Statistical Significance

  
  EJ326611  EA519370
  Chance and Nonsense: A Conversation about Interpreting Tests of 
Statistical Significance, Part 1.
  Shaver, James P.
  Phi Delta Kappan, v67 n1 p57-60 Sep   1985
  For Part 2, see EJ 327 959 (October 1985 "Phi Delta Kappan").
  Document Type: JOURNAL ARTICLE (080);  RESEARCH REPORT (143);  
POSITION PAPER (120)
  Target Audience: Researchers; Practitioners
  A dialog between two fictional teachers provides some basic 
examples of how research that uses approved methodology may provide 
results that are significant statistically but not significant 
practically.  (PGD)
  Descriptors: Educational Research; Research Methodology; Research 
Problems; *Sampling; Statistical Analysis; *Statistical Significance

  
  EJ326117  UD511911
  Mind Your p's and Alphas.
  Stallings, William M.
  Educational Researcher, v14 n9 p19-20 Nov   1985
  Document Type: JOURNAL ARTICLE (080);  POSITION PAPER (120);  
GENERAL REPORT (140)
  In the educational research literature, alpha and p are often 
conflated.  Paradoxically, alpha retains a prominent place in 
textbook discussions, but it is often supplanted by p in the results 
sections of journal articles.  Because alpha and p have unique uses, 
researchers should continue to employ both conventions in summarizing 
the outcomes of tests of significance.  (KH)
  Descriptors: *Educational Research; *Research Methodology; 
Statistical Analysis; *Statistical Significance
  Identifiers: *Alpha Coefficient; *p Coefficient

  
  ED253566  TM850106
  Mind Your p's and Alphas.
  Stallings, William M.
  1985
  11p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (69th, Chicago, IL, March 31-April 
4, 1985).
  Document Type: CONFERENCE PAPER (150);  POSITION PAPER (120);  
REVIEW LITERATURE (070)
  Target Audience: Researchers
  In the educational research literature alpha, the a priori level of 
significance, and p, the a posteriori probability of obtaining a test 
statistic of at least a certain value when the null hypothesis is 
true, are often confused.  Explanations for this confusion are 
offered.  Paradoxically, alpha retains a prominent place in textbook 
discussions of such topics as statistical hypothesis testing, 
multivariate analysis, power, and multiple comparisons while it seems 
to have been supplanted by p in current journal articles.  The unique 
contributions of both alpha and p are discussed and a plea is made 
for using both conventions in reporting empirical studies.  (Author)
  Descriptors: Educational Research; *Hypothesis Testing; 
Multivariate Analysis; *Probability; *Research Methodology; Research 
Problems; *Statistical Significance; Statistical Studies
  Identifiers: *Alpha Coefficient

  
  EJ307832  TM510187
  Policy Implications of Using Significance Tests in Evaluation 
Research.
  Schneider, Anne L.; Darcy, Robert E.
  Evaluation Review, v8 n4 p573-82 Aug   1984
  Document Type: JOURNAL ARTICLE (080);  RESEARCH REPORT (143)
  The normative implications of applying significance tests in 
evaluation research are examined.  The authors conclude that 
evaluators often make normative decisions, based on the traditional 
.05 significance level in studies with small samples.  Additional 
reporting of the magnitude of impact, the significance level, and the 
power of the test is recommended.  (Author/EGS)
  Descriptors: *Evaluation Methods; *Hypothesis Testing; *Research 
Methodology; Research Problems; Sample Size; *Statistical 
Significance
  Identifiers: Data Interpretation; *Evaluation Problems; Evaluation 
Research

  
  ED249266  TM840619
  Power Differences among Tests of Combined Significance.
  Becker, Betsy Jane
  Apr 1984
  21p.; Paper presented at the Annual Meeting of the American 
Educational Research Association (68th, New Orleans, LA, April 23-27, 
1984).
  Document Type: CONFERENCE PAPER (150);  RESEARCH REPORT (143)
  Target Audience: Researchers
  Power is an indicator of the ability of a statistical analysis to 
detect a phenomenon that does in fact exist.  The issue of power is 
crucial for social science research because sample size, effects, and 
relationships studied tend to be small and the power of a study 
relates directly to the size of the effect of interest and the sample 
size.  Quantitative synthesis methods can provide ways to overcome 
the problem of low power by combining the results of many studies.  
In the study at hand, large-sample (approximate) normal distribution 
theory for the non-null density of the individual p value is used to 
obtain power functions for significance value summaries.  Three p-
value summary methods are examined:  Tippett's counting method, 
Fisher's inverse chi-square summary, and the logit method.  Results 
for pairs of studies and for a set of five studies are reported.  
They indicate that the choice of a "most-powerful" summary will 
depend on the number of studies to be summarized, the sizes of the 
effects in the populations studied, and the sizes of the samples 
chosen from those populations.  (BW)
  Descriptors: Effect Size; Hypothesis Testing; *Meta Analysis; 
Research Methodology; Sample Size; *Statistical Analysis; 
*Statistical Significance
  Identifiers: *Power (Statistics)
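
  Illustration (not part of the ERIC record): two of the p-value 
summaries named above can be sketched in a few lines.  Fisher's 
inverse chi-square summary is shown in its standard form, and 
Tippett's method is shown as the minimum-p rule it is usually 
stated as; the logit method is omitted.  Both functions assume 
independent studies and p-values greater than zero; the function 
names and the five example p-values are invented for this example.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the joint null."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # the statistic divided by 2
    # Closed-form chi-square survival function, valid for even df (here 2k).
    tail_terms = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail_terms


def tippett_combined_p(p_values):
    """Tippett's method: combined p based on the smallest of k independent p-values."""
    k = len(p_values)
    return 1 - (1 - min(p_values)) ** k


# Five hypothetical studies, none individually significant at .05.
study_p_values = [0.08, 0.11, 0.06, 0.20, 0.09]
print(fisher_combined_p(study_p_values), tippett_combined_p(study_p_values))
```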

Return to FAQ on The Concept of Statistical Significance Testing

Return to the Index of FAQs



©1999-2012 Clearinghouse on Assessment and Evaluation. All rights reserved. Your privacy is guaranteed at ericae.net.
