
A peer-reviewed electronic journal.

Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation.
Permission is granted to distribute this article for nonprofit, educational purposes if it is copied in its entirety and the journal is credited.
PARE has the right to authorize third party reproduction of this article in print, electronic and database forms.
Volume 16, Number 14, October 2011 ISSN 1531-7714

Reporting and Discussing Effect Size: Still the Road Less Traveled?
James H. McMillan
Jennifer Foley
Virginia Commonwealth University

This study shows the extent to which effect size is reported and discussed in four major journals. A
series of judgments about different aspects of effect size was conducted for 417 articles from four
journals. Results suggest that while the reporting of simple effect size indices is more prevalent,
substantive discussion of the meaning of effect size is lacking.

Despite the admonitions of many statisticians and educational researchers (e.g., Fidler and Cumming, 2008; Thompson, 2008), and newer AERA standards requiring effect sizes to be reported and interpreted for every essential statistical result (AERA, 2006), universal reporting and discussion of effect size in quantitative studies remains elusive. Using specific effect size statistics, or even the concept of magnitude of findings as different from statistical significance, is clearly not yet integral to conducting and reporting educational research. This appears to be true even though effect size has been addressed in most statistical methods textbooks for three decades (Huberty, 2001), well-known and respected methodologists have written about effect size in leading journals and in books (e.g., Cohen, 1994; Ellis, 2010; Glass, 1976; Grissom & Kim, 2005; Kirk, 1996; Rosenthal, 1991; Rosenthal & Rubin, 1982; Wilkinson & APA Task Force on Statistical Inference, 1999; Thompson, 2007; and Vacha-Haase & Thompson, 2004), and over 23 journals have adopted editorial policies that require effect size reporting (Vacha-Haase & Thompson, 2004).

This article will further educate researchers on the importance of substantive reporting of effect size. The purpose of this study is to partially replicate an earlier investigation of effect size reporting in four well-known educational research journals. The study includes a determination of the specific nature of effect size indices reported, as well as the extent to which researchers discuss magnitude of effect in analyzing their findings, and compares these findings to earlier research to get some idea of the impact of increasing attention to effect size in the last decade.

Review of Related Literature

Eleven previous reviews of different educational and psychological journals published from 1990 through 2002 indicate considerable variability between journals and little increase over time in the percentages of articles that included effect size indices (Vacha-Haase, Nilsson, Reetz, Lance, & Thompson, 2000). The typical methodology employed in these studies was to review every article in all journal issues of a designated volume. Only two of the studies reviewed four or more journals (Keselman, Huberty, Lix, Olejnik, Cribbie, Donahue, Kowalchuk, Lowman, Petoskey, Keselman, & Levin, 1998; and Kirk, 1996), and a number of important journals for educational research were either omitted or reviewed for only one or two years. Furthermore, many of the studies were simple counts of the frequency of effect size indices used (Henson & Smith, 2000). A more meaningful review, included with the present study, indicates, as suggested by Vacha-Haase & Thompson (2004), whether there was any discussion or interpretation of effect size. This is important since it is necessary to both

interpret and evaluate effect size (Kline, 2004), not to simply include the statistical measure of effect size without discussion.

Four more recent studies, one in psychology, two in different areas of education, and one in both education and psychology, also show the extent and nature of use of effect size. Sun, Pan, and Wang (2010) reviewed effect size reporting from 1,243 articles published in 14 journals, from 2005-2007 (Table 1).

Table 1: Summary of Effect Size Reporting in 14 Educational and Psychological Journals, 2005-2007¹

                          Reported²        Interpreted³
Category       Total n     n      %         n      %
Journal type
AERA⁴             69       50     73        31     62
APA⁵             863      349     40       179     51
Independent⁶     311      211     68       136     64
Total          1,243      610     49       346     57
Year
2005             438      198     45       116     59
2006             422      207     49       115     56
2007             383      205     53       115     56

¹Sun, Pan, & Wang (2010).
²Included Cohen's d, f, and h; r, R2; eta2 and partial eta2; odds ratio.
³Percentage based on number reported.
⁴American Educational Research Journal, Educational Evaluation and Policy Analysis.
⁵Journal of Educational Psychology, Journal of Experimental Psychology: Applied, Journal of Experimental Psychology: Human Perception and Performance, Journal of Experimental Psychology: Learning, Memory, and Cognition, School Psychology Quarterly, Dreaming.
⁶Early Childhood Research Quarterly, Journal of Experimental Education, Journal of Learning Disabilities, Learning Disabilities Research & Practice, Journal of Special Education, Infant and Child Development.

They found that up to 49% of the articles reported effect size, including about 70% of articles in two AERA journals (Educational Evaluation and Policy Analysis and American Educational Research Journal), though only slightly more than half of these articles analyzed and/or interpreted the effect size results. They also found that the most frequently used effect size measure was some variation of strength of relationship, often R2. The authors conclude that "the overall rate of reporting effect size is still far from satisfactory" (p. 998).

Alhija and Levy (2009) conducted a content analysis of 183 statistical analyses reported in 99 randomly selected education articles from both educational and psychological journals (10 total journals). Examining two years of articles, 2003-2004, they found that most studies investigating strength of relations reported effect size (86%, including r, R2, and chi-square), while a much smaller percentage of studies examining differences reported effect size (57%, including Cohen's d, eta2, and partial eta2). They also found only a small difference between journals requiring effect size compared to journals not requiring effect size, with slightly more for journals with a policy of requiring effect size (Table 2). However, they included r as an effect size indicator. Though the actual number of analyses that utilized r is not reported separately, its inclusion would inflate the number of effect sizes. Also, consistent with Sun et al., there was little interpretation or discussion of effect size in these studies.

Table 2: Summary of Effect Size Reporting for Quantitative Analyses in 99 Randomly Selected Articles from 10 Educational Journals, 2003-2004¹,²,³

                                        Reported
Category                     Total n     n     %
Journal type
Requiring effect size          100       74    74
Not requiring effect size       83       56    67
Measure type
Mean differences⁴               99       56    57
Strength of relations⁵          74       64    86

¹Alhija & Levy (2009).
²Included Cohen's d and f; r, R2, and variance explained; eta2 and partial eta2; chi-square.
³Contemporary Educational Psychology, Journal of Learning Disabilities, Exceptional Children, Early Childhood Research Quarterly, Journal of Experimental Education, Journal of Educational Psychology, Learning Disabilities Research & Practice, Journal of Special Education, Infant and Child Development, and Educational Research.
⁴Includes Cohen's d, eta2, and partial eta2.
⁵Includes r, R2, and chi-square.

Zientek, Capraro, and Capraro (2008) analyzed 217 statistical tests from a review of 174 quantitative articles concerning evidence of best practices in teacher education published prior to 2005. Effect sizes were found in 39% of the cited articles. However, they included Pearson r and Spearman rho as effect size indices. Not surprisingly, they found that 94% of studies reporting correlations or regressions included effect size, and only 7% of studies using difference effect size statistics, such as Cohen's d. They also found little discussion of effect sizes when reported, mostly about results of zero-order correlations and regression analyses. Similar results were found by Mathews, Gentry, McCoach, Worrell, Matthews, and Dixon (2008), in a study of 101 quantitative journal articles investigating gifted education, using articles from five journals across ten years (1996-2006). They found that the percentage of articles reporting effect size increased from 26% during 1996-2000 to 46% during 2001-2005. Approximately 65% of the indicators were for correlational findings, which included reporting zero-order correlation coefficients. As with other previous reviews, this suggests reports of high percentages of articles using effect size were spurious.

Previous investigations of effect size reporting suggest that, despite recommendations in the 1994, 2001, and 2010 editions of the APA Publication Manual (e.g., "For the reader to appreciate the magnitude or importance of a study's findings it is almost always necessary to include some measure of effect size in the Results section," APA Publication Manual, 2010, p. 34) and of AERA (AERA, 2006) that essentially admonish researchers to use effect size information, many researchers are still not including effect size indicators. The McMillan, Lawson, Lewis, and Snyder (2002) study found that, of the 508 articles that were either quantitative, mixed-method, or simulation, 148 (29%) at least mentioned or calculated an effect size. Only 82 of these 508 articles (16%) included both a calculation of effect size and at least limited discussion of magnitude or practical significance. Only 30 of 508 articles (6%) included both a calculated effect size and what was judged to be extensive discussion (typically several sentences or more of interpretation of magnitude or practical significance). In particular, Thompson (2008) argues that greater discussion of effect size results is needed. Of interest in the present study is to investigate the progress of our profession in following repeated recommendations by experts, journals, and professional associations for both reporting and discussing the nature of effect size.

Research Questions

The purpose of this study is to partially replicate the McMillan et al. study of effect size reporting in four well-known educational research journals. The investigation includes a determination of the specific nature of effect size indices reported as well as the extent to which researchers discuss magnitude of effect in analyzing their findings. The study then compares these findings to earlier research to get some idea of the impact of increasing attention to effect size in the last decade. More specific research questions included:

1. What percentage of quantitative or mixed-method empirical studies reported and discussed effect size?

2. What specific effect size estimates are used?

3. What is the extent of interpretation and discussion of the meaning of the effect size estimates?

4. How do recently published studies' reporting and use of effect size compare with other investigations of effect size?

5. What is the implication of the current nature of effect size reporting for graduate training and professional development of educational researchers?

Data Source

Quantitative and mixed-method empirical studies were reviewed in four journals (Journal of Educational Psychology, Journal of Experimental Education, Journal of Educational Research, and Contemporary Educational Psychology), from 2008-2010. The selection of journals was made to provide a

representative sample that would present a reasonable indication of the extent to which educational researchers used effect size indices. It included journals that are well known, widely distributed, and publish a relatively large number of quantitative studies. These same four journals were reviewed previously by McMillan et al.

Method

A systematic content analysis of each article was conducted, following closely the earlier reviews by Sun et al. and Alhija and Levy. A Data Extraction Form (Appendix B) was pilot tested and reviewed by experts¹ and revised to enable objective recording of rater judgments (see Appendix A). For each quantitative article there was a determination of whether any effect size indicators were reported, and the level of discussion of magnitude based on these findings. For articles that included an effect size index, the specific type of index was determined under two general categories - difference and relationship - with note of the specific reported statistic (e.g., Cohen's d, r2, eta2). Discussion and interpretation was rated according to suggestions in the literature, such as comparing effect sizes to those from previous research, relating to sample size, considerations of the context of the study, and extensiveness. Examples of discussion and interpretation, when present, were recorded.

Interrater reliability was established on a random selection of 61 journal articles (15% random sample). For each article there were 17 judgments, resulting in a total of 1,037 judgments. Interrater agreement was found on 936 categories (90%). For the single item that used a scale rather than dichotomous category, agreement within 1 point on the scale was 87%.

Results

Table 3 presents frequencies and percentages of articles in each of the four journals by dominant methodology. A total of 417 articles were reviewed. As expected, there was a high percentage of quantitative and mixed-method studies reported (92%). This included 98% of the articles in the Journal of Educational Psychology. Only the Journal of Educational Research included a number of qualitative articles during the three years (12%). It is important to note that the Journal of Educational Psychology had considerably more articles than the other three journals, which results in a large portion of the total (47%).

Table 3: Number and Percent of Journal Article Methodologies by Journal and Year

                    Quantitative   Mixed-Method   Qualitative      Other
Category   Tot n      n     %        n     %        n     %       n     %
Journal
1             86      75    87       4     5        1     1       6     7
2            198     193    98       1     1        2     1       2     1
3             83      52    63      11    13       10    12      10    12
4             50      45    90       2     4        2     4       1     2
Year
2008         148     126    85      11     7        8     5       3     2
2009         151     136    90       3     2        4     3       8     5
2010         118     103    87       4     3        3     3       8     7
Total        417     365    88      18     4       15     4      19     5

Journals: 1. Contemporary Educational Psychology, 2. Journal of Educational Psychology, 3. Journal of Educational Research, 4. Journal of Experimental Education

Table 4 shows the frequencies and percentages of quantitative and mixed-method studies that reported some type of effect size indicator. Frequently, more than one index was reported in the same article. For all studies, 74% used a specific indicator of effect size. The percentages ranged from a high of 82% for the Journal of Educational Research to a low of 64% for the Journal of Experimental Education. There was no clear trend across the three years. Three effect size indicators dominated those reported (Cohen's d, eta2 and partial eta2, and proportion of variance accounted for [R2]). There was very little use of Hedges's g, odds ratio, Cohen's f, or omega2. Authors publishing in Journal of Educational Psychology used proportion of variance and Cohen's d more than the other three journals.

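The interrater agreement reported in the Method section (936 of 1,037 paired judgments, 90%) is simple percent agreement, which can be sketched as follows; the rater judgments below are invented purely for illustration:

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of paired judgments on which two raters agree."""
    if len(rater_a) != len(rater_b):
        raise ValueError("raters must judge the same set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical dichotomous judgments (1 = effect size reported, 0 = not)
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(percent_agreement(rater_1, rater_2))  # prints: 0.8
```

Note that percent agreement does not correct for chance agreement; a statistic such as Cohen's kappa would, but the figures reported here are plain agreement rates.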
Table 4: Number and Percent of Effect Size Indicators Reported in Quantitative and Mixed-Method Articles by Journal and Year

                                         Eta2;                 R2; propor-
           Reported   Cohen's d  Hedges's g  partial eta2  Odds ratio  tion of variance  Omega2
Category    n    %     n    %     n    %      n    %        n    %      n    %            n    %
Journal
1          42   70    17   23     0    0     21   28        1    1     18   24            1    1
2         143   76    60   32     1    1     37   20        6    3     81   44            1    1
3          50   82    16   26     1    1     19   31        7   12     23   38            1    2
4          30   64    10   21     0    0     13   28        3    6     14   30            0    0
Year
2008       94   72    37   29     0    0     33   26        4    3     48   37            0    0
2009      104   77    36   27     1    1     38   28        5    4     49   37            1    1
2010       67   74    30   29     1    1     19   18        8    8     39   37            2    1
Total     265   74   103   28     2    1     90   25       17    5    136   37            3    1

Journals: 1. Contemporary Educational Psychology, 2. Journal of Educational Psychology, 3. Journal of Educational Research, 4. Journal of Experimental Education

Our interpretation of how the authors discussed effect size indicators that were reported is summarized in Table 5. When Cohen's d was reported it was typically followed by an indication that Cohen's guidelines (Cohen, 1988) for interpretation were used (.2 = "small," .5 = "medium," and .8 = "large"). The percentage of articles using this convention, as a percentage of articles reporting Cohen's d, was very high (94%). Few articles mentioned the What Works Clearinghouse guidelines for interpretation of the magnitude of effect expressed as a function of standard deviation, like Cohen's d. There was comparatively little discussion of the meaning of the effect size results, whether as compared to effect sizes in previous research, the context of the study, or as related to sample size. This was true for each of the journals, without clear patterns of differences across years.

Table 5: Number and Percent of Nature of Interpretation and Discussion of Effect Size Indicators Reported in Quantitative and Mixed-Method Studies by Journal and Year

            Cohen's d    WWC         Compared to previous   Considers   Explicitly related
            guidelines   guidelines  research effect size   context     to sample size
Category     n    %       n    %      n    %                 n    %      n    %
Journal
1           20   26       0    0      7    9                 9   12      5    7
2           56   30       0    0     33   18                29   15     15    8
3           13   21       2    3      8   13                 5    8      4    7
4            8   17       0    0      4    9                 9   19      4    9
Year
2008        30   23       0    0     15   12                17   13      8    6
2009        36   27       1    1     22   16                17   13     13   10
2010        31   29       1    1     15   14                18   17      7    7
Total       97   26       2    1     52   14                52   14     28    8

Journals: 1. Contemporary Educational Psychology, 2. Journal of Educational Psychology, 3. Journal of Educational Research, 4. Journal of Experimental Education

Table 6 presents the results for the single item that shows the judgments of the raters concerning the extensiveness of the interpretation and/or

discussion of effect size. Appendix A contains excerpts from articles to show differences between different levels of interpretation. Approximately half of the articles did not include any discussion or interpretation of effect size results. That is, the authors of these studies typically showed Cohen's d, eta2, or R2 as part of reporting statistical significance, without any interpretation. This was the case for all four journals. Nearly 20% of the articles included very extensive or extensive interpretation and discussion, amounting to three or more sentences.

Table 6: Extensiveness of Interpretation and Discussion of Effect Size Indicators Reported in Quantitative and Mixed-Method Studies by Journal and Year¹

            Very Extensive   Extensive     Some        Little       None
Category     n    %           n    %       n    %      n    %       n    %
Journal
1            5    7           6    8       8   11     14   19      41   55
2           24   13          18   10      36   19     18   10      92   49
3            7   12           4    7      11   18     16   26      23   38
4            4    9           3    6       5   11      9   19      26   55
Total       40   11          31    8      60   16     57   15     182   49

¹Very extensive = more than 5 sentences; Extensive = 3 to 5 sentences; Some = 2 or 3 sentences; Little = one sentence

The final data from the review, summarized in Table 7, show the percentages of quantitative and mixed-method studies reporting effect size indicators in the same four journals from 2008-2010 compared to articles published from 1997-2000. The percentage changes show substantial increases in reporting effect size indices over this approximate ten-year period. All four journals showed increases, with an overall increase of 44%. This suggests that many more researchers are now reporting effect size, at least in these four journals, weighted heavily by the Journal of Educational Psychology.

Table 7: Percentage of Quantitative and Mixed-Method Studies Reporting and Interpreting Effect Size by Year

             Reported                           Interpreted²
Category     1997-2000¹  2008-2010  Change     1997-2000¹  2008-2010  Change
             %           %          %          %           %          %
Journal
1            35          70         +35        39          45         +6
2            23          76         +43        23          51         +28
3            35          82         +47        34          62         +28
4            44          64         +20        44          45         +1
Total        30          74         +44        30          51         +21

¹McMillan, Lawson, Lewis, & Snyder (2002)
²Little, Some, Extensive, or Very Extensive

Discussion

The first important issue with this study, within the context of other systematic reviews of articles for effect size, is the determination of what does and does not count as an effect size indicator. In our study, unlike most others, we chose not to include the reporting of r and chi-square as effect size statistics. This may explain why our findings showed a much lower percentage of interpretation as compared to the reviews by Sun et al. (2010) and Alhija and Levy (2009). It seems that there is a need for further clarification of what constitutes effect size and why it is important.

In many of the studies we reviewed, Cohen's d and eta2 were simply listed along with indicators of statistical significance, without further reference to them, as if d and eta2 are meaningful simply by being reported. This interpretation is consistent with Zientek et al. (2008) and Kirk (1996), who also found reporting with little if any discussion. This suggests that it is difficult to know if the researchers actually understood effect size or merely included an effect size statistic.

It seems to us that zero-order correlations, by themselves, should not be interpreted as an effect size indicator, despite Cohen's inclusion of r in his list of effect size statistics. The proportion of variance accounted for, with the coefficient of determination or coefficient of multiple determination, is a much better indicator of effect

size. Of course you can mentally square the correlation easily enough, but without including this squared statistic the meaning of proportion of variance accounted for is unclear. R2, on the other hand, which is commonly reported in studies using regression, does indicate proportion of variance and is a reasonable measure of effect size. With R2 there is an understanding of how much variability is and is not accounted for, which helps in interpreting the extent to which a relationship is a function of measured variables.

Considering that we did not include r as a measure of effect size, our finding that 64%-82% of the quantitative and mixed-method articles at least included a measure of effect size is encouraging. Our percentages reflecting use were somewhat higher than what was reported by Alhija and Levy (2009) and Sun et al. (2010). This may be because of differences in methodology, journals used for the review, and time frame, which reflected findings in earlier years. A direct comparison is not possible. Our findings clearly indicated an increase in use of effect size from 1997 to 2010, though this also may be partially a function of different raters and differences in specific rating forms. Only one rater was the same for both analyses.

The extent of interpretation also appears to have increased from 1997 to 2010 for the four journals in this study, albeit to a lesser extent than reporting. In the earlier study the rating of interpretation was based on a four-point scale, ranging from "none" to 3, with 3 indicating extensive interpretation. The current study used a five-point scale and attempted to capture discussion with a more concrete indicator. While the number of sentences is not the same as the more qualitative judgment used in the earlier study, the 19 percent rated "very extensive" or "extensive" in the current study is about the same as the 6% rated extensive (3) or 23% rated as a 2 or 3 in the earlier study of articles from the same journals. This consistently low percentage shows little progress toward interpretation of effect size. Our current data show much more interpretation with Cohen's d and R2 than other indicators, with lower percentages explicitly discussing effect size in relation to the size of the sample, context, or effect size from previous studies. This is not surprising, given the ubiquitous though often thoughtless use of Cohen's guidelines, and the use of R2 for many years with interpretation of regression analyses.

Sun et al. reported a higher percentage of articles that interpreted effect size. In their study there are no examples of what they considered interpreted, which, along with the inclusion of different journals and raters, could explain why their percentage was so much higher than what we found. When results of the current study considered interpretation only for quantitative and mixed-method studies reporting effect size, 46% were rated as having at least some discussion, still lower than the Sun et al. findings.

It was interesting to find only six articles that used guidelines from the WWC. This is important since the WWC has made an extensive effort in suggesting that an effect size that shows only a small effect, as per Cohen, is probably meaningful and important for practice. That is, a difference of only a quarter or fifth of a standard deviation in educational studies is considered important in many contexts. This highlights the current limited interpretation and use of effect size. As many have argued, the reason we have effect size is to show the magnitude and importance of the results, explained in such a way that it is clear what the results mean with respect to original units or units that are relevant to understanding the practical impact. Effect size transforms abstract statistical significance testing to concrete measures of relationship or difference. This is what the WWC is trying to show in their evaluation of reported effect sizes. Researchers publishing in the four journals reviewed for this study may not be aware of the WWC guidelines, and/or may not understand the importance of discussing what effect size results mean. Judging from the low percentage even interpreting effect size results, there appears to be a need for researchers to emphasize, analyze, and report the meaning of results in units and in ways that give others an indication of the magnitude and importance of the findings.
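The point above about squaring the correlation can be made concrete with a short sketch; the r values are arbitrary examples:

```python
def variance_explained(r):
    """Coefficient of determination: share of variance accounted for by r."""
    return r ** 2

# Even a sizable correlation leaves most of the variance unexplained.
for r in (0.1, 0.3, 0.5, 0.7):
    print(f"r = {r}: {variance_explained(r):.2f} explained, "
          f"{1 - variance_explained(r):.2f} not accounted for")
```

For example, r = .50 corresponds to only 25% of variance accounted for, which is arguably the more interpretable figure for readers weighing practical importance.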

In summary, our findings suggest that when it comes to effect size the number of researchers reporting and interpreting such statistics is improving. While the number of studies using effect size has increased, analysis of the nature of the effect size and meaningfulness of the results is lacking. Many researchers are using effect size without sufficient discussion of what it actually suggests in the context of their study and for practice. Furthermore, there was reliance on Cohen's d guidelines for interpreting effect size results, rather than the guidelines recommended and used by the What Works Clearinghouse, which were designed with the educational setting in mind. This suggests a need to clarify for researchers which guidelines make most sense within the context of their research, and what considerations should be analyzed to discuss and interpret effect size results, consistent with recommendations from Thompson (2008). While difference indices are appropriate for many studies, more attention may need to be given to relationship indicators.

The trends reported here, albeit with only four journals, have demonstrated typical usage of effect size in educational research and suggest that there is need for researchers and statisticians to make more effective use of effect size in reporting results and discussion of practical significance. Those teaching research have a specific responsibility to educate students about the nature and use of effect size so that meaningfulness and practical importance are clearly understood.

References

Alhija, F. N.-A., & Levy, A. (2009). Effect size reporting practices in published articles. Educational and Psychological Measurement, 69(2), 245-265.

American Educational Research Association. (2006). Standards for reporting on empirical social science research in AERA publications. Educational Researcher, 35(6), 33-40.

American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.

Byrnes, J. P., & Wasik, B. A. (2009). Factors predictive of mathematics achievement in kindergarten, first and third grades: An opportunity-propensity analysis. Contemporary Educational Psychology, 34(2), 167-183.

Cohen, J. (1994). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

Daniels, L. M., Haynes, T. L., Stupinsky, R. H., Perry, R. P., Newall, N. E., & Pekrun, R. (2008). Individual differences in achievement goals: A longitudinal study of cognitive, emotional, and achievement outcomes. Contemporary Educational Psychology, 33(4), 584-608.

Ellis, P. D. (2010). The essential guide to effect sizes: Statistical power, meta-analysis, and the interpretation of research results. Cambridge, UK: Cambridge University Press.

Fidler, F., & Cumming, G. (2008). The new stats: Attitudes for the twenty-first century. In J. W. Osborne (Ed.), Best practices in quantitative methods. Thousand Oaks, CA: Sage Publishing.

Gehlbach, H., Brown, S. W., Ioannou, A., Boyer, M. A., Hudson, N., Niv-Solomon, A., Maneggia, D., & Janik, L. (2008). Increasing interest in social studies: Social perspective taking and self-efficacy in stimulating simulations. Contemporary Educational Psychology, 33(4), 894-914.

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5, 3-8.

Grissom, R. J., & Kim, J. J. (2005). Effect sizes for research: A broad practical approach. Mahwah, NJ: Erlbaum.

Henson, R. K., & Smith, A. D. (2000). State of the art in statistical significance and effect size reporting: A review of the APA Task Force report and current trends. Journal of Research and Development in Education, 33, 285-296.

Huberty, C. J. (2001). A history of effect size indices. Paper presented at the Annual Meeting of the American Educational Research Association, Seattle.

Keselman, H. J., Huberty, C. J., Lix, L. M., Olejnik, S., Cribbie, R. A., Donahue, B., Kowalchuk, R. K., Lowman, L. L., Petoskey, M. D., Keselman, J. C., & Levin, J. R. (1998). Statistical practices of educational researchers: An analysis of their ANOVA, MANOVA, and ANCOVA analyses. Review of Educational Research, 68(3), 350-386.

Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56, 746-759.

Klassen, R. M., & Chiu, M. M. (2010). Effects on teachers' self-efficacy and job satisfaction: Teacher gender, years of experience, and job stress. Journal of Educational Psychology, 102(3), 741-756.

Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.

Koth, C. W., Bradshaw, C. P., & Leaf, P. J. (2008). A multilevel study of predictors of student perceptions of school climate: The effect of classroom-level factors. Journal of Educational Psychology, 101(1), 96-104.

Marsh, H., Martin, A., & Cheng, J. H. S. (2008). A multilevel perspective on gender in classroom motivation and climate: Potential benefits of male teachers for boys? Journal of Educational Psychology, 100(1), 78-95.

McMillan, J. H., Lawson, S., Lewis, K., & Snyder, A. (2002, April). Reporting effect size: The road less traveled. Paper presented at the 2002 Annual Meeting of the American Educational Research Association, New Orleans.

Muis, K. R., & Franco, G. M. (2009). Epistemic beliefs: Setting the standards for self-regulated learning. Contemporary Educational Psychology, 34(4), 306-318.

Nelson, R. M., & DeBacker, T. K. (2008). Achievement motivation in adolescents: The role of peer climate and best friends. Journal of Experimental Education, 76(2), 170-189.

Pekrun, R., Goetz, T., Daniels, L. M., Stupnisky, R. H., & Perry, R. P. (2010). Boredom in achievement settings: Exploring control-value antecedents and performance outcomes of a neglected emotion. Journal of Educational Psychology, 102(3), 531-549.

Powell, D. R., Diamond, K. E., Burchinal, M. R., & Koehler, M. (2010). Effects of an early literacy professional development intervention on Head Start teachers and children. Journal of Educational Psychology, 102(2), 299-312.

Rosenthal, R. (1991). Effect sizes: Pearson's correlation, its display via the BESD, and alternative indices. American Psychologist, 46, 1086-1087.

Rosenthal, R., & Rubin, D. B. (1982). A simple, general purpose display of magnitude of experimental effect. Journal of Educational Psychology, 74, 166-169.

Sun, S., Pan, W., & Wang, L. L. (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102(4), 989-1004.

Thompson, B. (2007). Effect sizes, confidence intervals, and confidence intervals for effect sizes. Psychology in the Schools, 44, 423-432.

Thompson, B. (2008). Computing and interpreting effect sizes, confidence intervals, and confidence intervals for effect sizes. In J. Osborne (Ed.), Best practices in quantitative methods (pp. 246-262). Newbury Park, CA: Sage.

Vacha-Haase, T., Nilsson, J. E., Reetz, D. R., Lance, T. S., & Thompson, B. (2000). Reporting practices and APA editorial policies regarding statistical significance and effect size. Theory & Psychology, 10(3), 413-425.

Vacha-Haase, T., & Thompson, B. (2004). How to estimate and interpret various effect sizes. Journal of Counseling Psychology, 51, 473-481.

Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.

Zientek, L. R., Capraro, M. M., & Capraro, R. M. (2008). Reporting practices in quantitative teacher education research: One look at the evidence cited in the AERA panel report. Educational Researcher, 37(4), 208-216.

Appendices

Appendix A: Excerpts From Articles Showing Different Levels of Interpretation


Limited

Black boys had 55% greater odds of receiving a teacher-reported ODR compared with White boys. (Bradshaw, Mitchell, O'Brennan, & Leaf, 2010, p. 512)

Note how the effect sizes for ethnicity were considerably smaller than those for SES. (Byrnes & Wasik, 2009, p. 177)

Given that effect sizes ranged from moderate to large when changes occur, these changes were noteworthy. (Muis & Franco, 2009, p. 269)

The overall variances accounted for in the outcome variables were modest, yet small effects might have larger accumulation effects over time. (Nelson & DeBacker, 2008)

Workload stress accounted for 31% of the variance in teachers' overall teaching stress. (Klassen & Chiu, 2010, p. 747)

Some

Finally, although some of our effect sizes are small according to conventional standards, they indicate systematic involvement of goals across three domains of academic outcomes. Given that we controlled for strong background variables including age and high school average, all additional variance explained in the dependent variables can be viewed as meaningful. (Daniels, Haynes, Stupnisky, Perry, Newall, & Pekrun, 2008, p. 604)

However, effect sizes in these differences were small (ds = -.28 and .27 for control and elaboration) ... as in Study 2, effect sizes were small (ds = .21, .23, and .29). (Pekrun, Goetz, Daniels, Stupnisky, & Perry, 2010, p. 538)

Extensive

Specifically, children in intervention classrooms had significant gains in letter knowledge (d = .29), concepts about print (d = .22), writing (d = .17), and blending (d = .18). Effect size results from prior studies of similar interventions and target populations provide a benchmark for interpreting the current study's effect sizes for significant intervention efforts. (Powell, Diamond, Burchinal, & Koehler, 2010, pp. 307, 309)

Only 2 of 23 effects were statistically significant, and each of these was small. Girls had higher self-efficacies (averages across the three subjects), but the difference was not large (approximately .09 standard deviations, as all the variables were standardized to facilitate interpretations). (Marsh, Martin, & Cheng, 2008, p. 85)

Very Extensive

The proportion of variance suggests there is greater variability in students' willingness to learn within schools and that this aspect of school climate may be more indicative of individuals' own motivation than is overall aggregated perception. In contrast, the amount of school-level variance for order and discipline was much higher (27%), suggesting that perceptions of school safety may be more relevant to school characteristics than achievement motivation; however, the individual level still accounted for the majority of the variance. Last, 8% to 9% of the variance across the two climate outcomes was attributable to clustering at the classroom level. This partitioning of variance is relatively consistent with previous research by Vieno et al. (2005), who found that 84% of the variation in climate was accounted for at the individual level, whereas 11% was accounted for at the class level, and just 4% at the school level. (Koth, Bradshaw, & Leaf, 2008, p. 101)
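The multilevel excerpts above report the share of variance attributable to each level of clustering. As a minimal sketch of that computation (the variance components and level names below are invented for illustration, not taken from any of the cited studies), each level's proportion is simply its estimated variance component divided by the total variance:

```python
def variance_proportions(components):
    """Express each level's variance component as a proportion of total variance."""
    total = sum(components.values())
    return {level: value / total for level, value in components.items()}

# Hypothetical variance components from a three-level model
components = {"individual": 8.4, "classroom": 1.1, "school": 0.5}
for level, p in variance_proportions(components).items():
    print(f"{level}: {p:.0%}")  # individual: 84%, classroom: 11%, school: 5%
```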

Appendix B: Data Extraction Form

Effect Size Reporting and Interpretation

Coder: ______________________ Date: _______________________

Journal: ____
1- Contemporary Educational Psychology
2- Journal of Educational Psychology
3- Journal of Educational Research
4- Journal of Experimental Education

Last Name of First Author: __________________________

Year: _____________ Volume: ______________ Number: _________________

Dominant Nature of Study: ________


1- Quantitative
2- Mixed Method
3- Qualitative
4- Other, e.g., review of literature

For Quantitative and Mixed Method Research Design: ______________


1-Experimental
2-Single Subject
3-Causal Comparative/Ex post facto
4-Correlational/Comparative
5- Other, e.g., instrument development
Effect Size Index(es) Reported:

Specific index reported Yes _____ No ______


Cohen's d Yes _____ No ______
Hedges's g Yes _____ No ______
Eta² Yes _____ No ______
Odds Ratio Yes _____ No ______
Proportion of variance explained (R² or r²) Yes _____ No ______
Omega² Yes _____ No ______
Other: _____________________________________

Effect size confidence intervals reported? Yes ______ No ________

Interpretation and Discussion:


Cohen's d Guidelines Yes _____ No ______
WWC Guidelines Yes _____ No ______
Comparison with effect size of previous research? Yes _____ No ______
Includes consideration of context (what outcome is measured)? Yes _____ No ______
Explicitly related to sample size? Yes_____ No ______
Practical Assessment, Research & Evaluation, Vol 16, No 14 Page 12
McMillan & Foley, Reporting and Discussing Effect Size

Extensiveness of Interpretation and Discussion: ________


1. Very extensive: more than five sentences
2. Extensive: three to five sentences
3. Some: two to three sentences
4. Little: one sentence
5. None

Nature of Interpretation and Discussion:

Other Comments:
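For readers unfamiliar with the indices listed in the extraction form, two of the most common can be sketched in a few lines. The code below is illustrative only (the function names and scores are invented for this sketch, not part of the study): Cohen's d is the mean difference divided by the pooled standard deviation, and eta squared is the between-groups share of total variance.

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled_sd

def eta_squared(*groups):
    """Proportion of total variance explained by group membership (SS_between / SS_total)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    return ss_between / ss_total

# Hypothetical scores for two small groups
treatment = [12, 15, 14, 16, 13]
control = [10, 12, 11, 13, 11]
print(round(cohens_d(treatment, control), 2))   # 1.89
print(round(eta_squared(treatment, control), 2))  # 0.53
```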

Citation:
McMillan, James H. & Jennifer Foley (2011). Reporting and Discussing Effect Size: Still the Road Less
Traveled? Practical Assessment, Research & Evaluation, 16(14). Available online:
http://pareonline.net/getvn.asp?v=16&n=14

Acknowledgement

Reviewers of a draft of the Data Extraction Form were Gene Glass, Bruce Thompson, and Bob Slavin. Their
help is appreciated.

Authors:

James H. McMillan Jennifer Foley


School of Education School of Education
Virginia Commonwealth University Virginia Commonwealth University
Box 842020 Box 842020
Richmond, Virginia 23284-2020 Richmond, Virginia 23284-2020
Jmcmillan [at] vcu.edu
