Violent Video Game Effects Remain a Societal Concern: Reply to Hilgard, Engelhardt, and Rouder (2017)

Sven Kepes, Brad J. Bushman, and Craig A. Anderson
This document is copyrighted by the American Psychological Association or one of its allied publishers.
A large meta-analysis by Anderson et al. (2010) found that violent video games increased aggressive thoughts, angry feelings, physiological arousal, and aggressive behavior and decreased empathic feelings and helping behavior. Hilgard, Engelhardt, and Rouder (2017) reanalyzed the data of Anderson et al. (2010) using newer publication bias methods (i.e., precision-effect test, precision-effect estimate with standard error, p-uniform, p-curve). Based on their reanalysis, Hilgard, Engelhardt, and Rouder concluded that experimental studies examining the effect of violent video games on aggressive affect and aggressive behavior may be contaminated by publication bias, and that these effects are very small when corrected for publication bias. However, the newer methods Hilgard, Engelhardt, and Rouder used may not be the most appropriate. Because publication bias is a potential problem in any scientific domain, we used a comprehensive sensitivity analysis battery to examine the influence of publication bias and outliers on the experimental effects reported by Anderson et al. We used best meta-analytic practices and the triangulation approach to locate the likely position of the true mean effect size estimates. Using this methodological approach, we found that the combined adverse effects of outliers and publication bias were less severe than what Hilgard, Engelhardt, and Rouder found for publication bias alone. Moreover, the obtained mean effects using recommended methods and practices were not very small in size. The results of the methods used by Hilgard, Engelhardt, and Rouder tended not to converge well with the results of the methods we used, indicating potentially poor performance. We therefore conclude that violent video game effects should remain a societal concern.
Anderson et al. (2010) published a large meta-analysis of 381 effects from violent video game studies involving more than 130,000 participants. They found that violent video games increased aggressive thoughts, angry feelings, physiological arousal, and aggressive behavior, and decreased empathic feelings and helping behavior. Hilgard, Engelhardt, and Rouder (2017) reanalyzed the data of Anderson et al. on experimental effects of violent-game exposure on aggressive affect, aggressive behavior, aggressive cognitions, and physiological arousal, as well as correlations between violent game play and aggressive affect, behavior, and cognitions in cross-sectional studies. Hilgard et al. (2017) examined a total of 13 meta-analytic distributions (see their Table 3). For the most part, there is agreement between the mean estimates of Hilgard, Engelhardt, and Rouder and Anderson et al., although Hilgard, Engelhardt, and Rouder concluded that the estimates of Anderson et al. of the experimental effects of violent video games on aggressive behavior and aggressive affect should be adjusted downward. Their conclusions are based on several relatively new publication bias methods, including the precision-effect test (PET), the precision-effect estimate with standard error (PEESE), p-uniform, and p-curve.

In this response, we follow a two-pronged approach. First, we provide a brief critique of the methods Hilgard et al. (2017) used. Second, given the shortcomings highlighted in our critique and taking a strong inference approach (Platt, 1964), we reanalyze the experimental data with additional recommended statistical techniques to determine with greater confidence whether Anderson et al.'s (2010) conclusions need to be altered.

Author note: Sven Kepes, Department of Management, School of Business, Virginia Commonwealth University; Brad J. Bushman, School of Communication and Department of Psychology, Ohio State University; Craig A. Anderson, Department of Psychology, Iowa State University. Correspondence concerning this article should be addressed to Brad J. Bushman, School of Communication, Ohio State University, 3016 Derby Hall, 154 North Oval Mall, Columbus, OH 43210. E-mail: bushman.20@osu.edu
Hilgard et al.'s (2017) Methodological and Statistical Approach

Hilgard et al. (2017) suggest that trim and fill, the publication bias assessment method Anderson et al. (2010) used, is "best viewed as a sensitivity analysis rather than a serious estimate of the unbiased [meta-analytic] effect size" (p. 760). In turn, they imply that their publication bias assessment methods are not sensitivity analyses and should be viewed as more serious because they provide an accurate for-bias-adjusted mean estimate. Such an implication is misleading because all methods that assess the robustness of a naïve meta-analytic mean estimate should be viewed as sensitivity analyses (Kepes, McDaniel, Brannick, & Banks, 2013); a naïve mean is one estimated without any adjustment for potential biases (Copas & Shi, 2000).

Sensitivity analyses examine the degree to which the results of a naïve meta-analysis remain stable when conditions of the data or the analysis change (Greenhouse & Iyengar, 2009). We know of no valid method that can provide a for-bias-adjusted mean estimate of the true underlying population effect size. Instead, sensitivity analyses tend to estimate the degree to which a naïve meta-analytic mean may be adversely affected by publication and/or other biases. Furthermore, it is important to note that all methods become less stable with small distributions. In fact, most publication bias assessment methods, including funnel plot- and regression-based methods, should not be applied to meta-analytic distributions with fewer than 10 samples (Kepes, Banks, McDaniel, & Whetzel, 2012; Sterne et al., 2011).

In addition, Hilgard et al. (2017) focused on one type of sensitivity analysis: publication bias. Yet as Hilgard et al. (2017) noted, heterogeneity can adversely affect the results of publication bias analyses (as well as the results of a naïve meta-analysis). Because outliers can be a major source of between-study heterogeneity, they should be considered when examining the potential effects of publication bias (Kepes & McDaniel, 2015). Like publication bias (Kepes et al., 2012; Rothstein, Sutton, & Borenstein, 2005), the effects of outliers tend to lead to upwardly biased mean estimates to the extent that they are on one side of the distribution (Viechtbauer & Cheung, 2010). Furthermore, because between-study heterogeneity due to outliers can be mistakenly attributed to publication bias, a comprehensive assessment of the influence of publication bias should also include a thorough assessment of outliers or otherwise influential data points (Kepes & McDaniel, 2015). In other words, to obtain precise and robust estimates regarding the potential presence of publication bias, one should account for outliers when conducting publication bias analyses.

Unfortunately, Hilgard et al. (2017) used only leave-one-out (i.e., one-sample-removed) analyses to identify outliers. In this type of sensitivity analysis, the influence of each individual sample on the naïve mean is assessed. This approach poses two problems. First, no consideration is given to the possibility that more than one outlier has adverse effects on the naïve meta-analytic mean estimates. Second, it is unclear what criteria Hilgard, Engelhardt, and Rouder used when determining whether a particular sample should be left out or excluded from subsequent analyses.

Taken together, although Hilgard et al. (2017) presented their reanalysis of Anderson et al.'s (2010) meta-analytic data set as the most up-to-date and comprehensive reanalysis possible, it is not without its own shortcomings. Albeit more sophisticated than Anderson et al.'s original analysis, their assertion is not necessarily correct. We believe the most sophisticated analysis uses best meta-analytic practices (e.g., Kepes & McDaniel, 2015; Kepes et al., 2013; Viechtbauer & Cheung, 2010) and the triangulation approach (Jick, 1979) to locate the likely position of the true mean effect size estimate using a comprehensive sensitivity analysis battery (Kepes et al., 2012). We use this more comprehensive approach to determine whether the results reported by Hilgard et al. (2017) or by Anderson et al. (2010) are more accurate. However, before we proceed to reanalyzing the data, we briefly review the publication bias methods used by Hilgard, Engelhardt, and Rouder.

PET-PEESE

The PET-PEESE (Stanley & Doucouliagos, 2014) approach to publication bias is a combination of two weighted regression models. As Hilgard et al. (2017) stated, PET "extrapolates from the available data to estimate what the effect would be in a hypothetical study with perfect precision" (p. 760). PEESE works in a similar manner, except that precision is modeled as a quadratic function instead of a linear function. Both PET and PEESE may incorporate multiple moderator variables, although Hilgard, Engelhardt, and Rouder did not use them in that way. Furthermore, both PET and PEESE are modified versions of Egger's test of the intercept and, as such, some of the shortcomings associated with the Egger test (Moreno et al., 2009; Stanley & Doucouliagos, 2014; Sterne & Egger, 2005) may also apply to PET and/or PEESE.

PET is known to underestimate the size of nonzero effects (Stanley & Doucouliagos, 2007), and PEESE can yield inaccurate results the closer the true mean effect size is to zero (Stanley & Doucouliagos, 2012), which is why Stanley and Doucouliagos (2014) outlined conditional decision rules to determine which of the two models should be used to assess the potential presence of publication bias (see also Kepes & McDaniel, 2015; van Elk et al., 2015). In a reanalysis of data regarding the predictive validity of conscientiousness, Kepes and McDaniel (2015) found that their PET-PEESE results converged relatively well with the results of a battery of other publication bias assessment methods, indicating that the method tended to perform quite well with real data. More recently, Stanley and Doucouliagos (2017) conducted a simulation and concluded that PET-PEESE properly accounts for heterogeneity and performs quite well, although another simulation study found that variants related to PET and PEESE did not perform well (Moreno et al., 2009). Therefore, there is somewhat contradictory evidence regarding the performance of PET-PEESE.

P-uniform

The p-uniform method is essentially a selection model (McShane, Böckenholt, & Hansen, 2016) that uses only significant studies to estimate the true effect using a fixed-effects model. The developers explicitly stated that it is not applicable in the presence of between-study heterogeneity (van Assen, van Aert, & Wicherts, 2015). In support of this view, p-uniform exhibited very low convergence rates with other publication bias assessment methods when using real data (Kepes & McDaniel, 2015), probably because of its sensitivity to heterogeneity. More recently, a comprehensive
simulation study highlighted p-uniform's poor performance in "realistic" settings, which have been defined as settings with "flexible publication rules and heterogeneous effect sizes," as opposed to "restrictive" settings, which involve "rigid publication rules and homogeneous effect sizes" (McShane et al., 2016, p. 731). More traditional selection models that use the complete data when estimating the adjusted mean effect (e.g., Hedges & Vevea, 2005) should be used instead because they tend to perform better (McShane et al., 2016).

P-Curve

Like p-uniform, the p-curve method uses only significant studies to estimate an overall mean effect. Therefore, as with p-uniform, for the p-curve method to work, the nonsignificant studies have to be estimating the same overall mean effect as the significant studies, and typically that is not the case when there is between-study heterogeneity (as there is in virtually all real data in the social sciences). Indeed, when the developers of the p-curve method tested it against a gold standard of replications of 13 effects across 36 laboratories, they focused on the effects that proved homogeneous across the laboratories, for exactly this reason (Simonsohn, Nelson, & Simmons, 2014). Not surprisingly, as with p-uniform, McShane et al.'s (2016) simulation study found that p-curve did not perform well in realistic settings and concluded that traditional selection models (e.g., Hedges & Vevea, 2005) are more appropriate for assessing the potential presence of publication bias in meta-analytic studies.

Summary

Although Hilgard et al. (2017) used more recently developed publication bias methods than Anderson et al. (2010) did, past research has shown that several of their methods tend to perform poorly when applied to real data. It is therefore questionable whether the methods Hilgard, Engelhardt, and Rouder used to assess publication bias perform better than the trim-and-fill method used by Anderson et al. (2010). Thus, Hilgard, Engelhardt, and Rouder's obtained results and conclusions could be erroneous, as could Anderson et al.'s results, especially because neither set of authors used a comprehensive approach to account for outlier-induced between-study heterogeneity, which can adversely affect naïve meta-analytic estimates and publication bias results (Kepes & McDaniel, 2015; Viechtbauer & Cheung, 2010).

Our Methodological and Statistical Approach

We implemented a comprehensive battery of sensitivity analyses using the R programming language and the metafor (Viechtbauer, 2015) and meta (Schwarzer, 2015) packages. Following best-practice recommendations (Kepes et al., 2012; Kepes & McDaniel, 2015; Rothstein et al., 2005; Viechtbauer & Cheung, 2010), we used trim-and-fill (Duval, 2005), cumulative meta-analysis (Kepes et al., 2012), selection models (Vevea & Woods, 2005), the one-sample-removed analysis (Borenstein, Hedges, Higgins, & Rothstein, 2009), and a battery of multivariate influence diagnostics (Viechtbauer, 2015; Viechtbauer & Cheung, 2010). Given that Hilgard et al. (2017) based their conclusions to a large extent on the results from their PET and PEESE analyses, we included them as well (Stanley & Doucouliagos, 2014). Furthermore, there is value in assessing the level of convergence between PET-PEESE and other, more established methods (e.g., trim-and-fill, selection models), especially because of the newness of the method. However, following the recommendations by Stanley and Doucouliagos (2014), we use the conditional PET-PEESE model and report only the appropriate estimate of the respective mean effect.

With regard to trim and fill, we use the recommended fixed-effects (FE) model with the L0 estimator (Kepes et al., 2012). To address some of the legitimate criticisms of the trim-and-fill method, we also use the random-effects (RE) model with the same estimator to assess the robustness of the results from the FE model (Moreno et al., 2009). In addition to the general cumulative meta-analysis by precision, which typically gets plotted in a forest plot (see Kepes et al., 2012), we also present the cumulative meta-analytic mean of the five most precise effect sizes (i.e., the effect sizes from the five largest primary studies; for a similar approach, see Stanley, Jarrell, & Doucouliagos, 2010). This method helps shed some light on the issue of low statistical power that often plagues social science studies. For the selection models, we use a priori models (e.g., Hedges & Vevea, 2005) with recommended p value cut points to model moderate and severe instances of publication bias (Vevea & Woods, 2005).

Our comprehensive approach involved five steps. First, we performed a naïve meta-analysis for each relevant subsample of studies on violent video games. Second, we applied our comprehensive battery of publication bias analyses. Third, we assessed the potential presence of outliers using a battery of multidimensional, multivariate influence diagnostics (Viechtbauer, 2015; Viechtbauer & Cheung, 2010). Fourth, we deleted any identified outlier(s) from the meta-analytic distribution and reran all analyses. Hence, all meta-analytic and publication bias analyses were applied to data with and without identified outliers. Fifth, we conducted all analyses with and without the two studies identified by Hilgard et al. (2017, p. 763) as being problematic (i.e., Graybill, Kirsch, & Esselman, 1985; Panee & Ballard, 2002).¹ This comprehensive approach allows us to present the possible range of mean effect size estimates instead of relying on a single value, which is aligned with the advantages of the triangulation approach and customer-centric science (Aguinis et al., 2010; Jick, 1979; Kepes et al., 2012). In fact, our comprehensive approach is required or recommended in some areas in the medical and social sciences (American Psychological Association, 2008; Higgins & Green, 2011; Kepes et al., 2013).

Results

The results of our analyses are displayed in Table 1 (the bottom panel displays the results with identified outliers removed). The first three columns report what distribution was analyzed as well as

¹ We note that these two studies with the four samples were deleted across study type (e.g., experimental studies, cross-sectional studies, longitudinal studies) and outcome (e.g., aggressive affect, aggressive cognition, aggressive behavior, physiological arousal). Thus, the removal of the two studies did not affect the number of correlations in all meta-analytic distributions equally. In fact, some meta-analytic distributions were completely unaffected by their removal (e.g., aggressive cognition best experiments).
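To make the conditional PET-PEESE logic described above concrete, the decision rule can be sketched in a few lines of code. This sketch is illustrative only: the reported analyses were conducted in R with the metafor and meta packages, and the example below, including its function names, toy data, and the normal approximation to the intercept test, is a simplification rather than the exact estimator behind Table 1.

```python
import math

def wls_line(x, y, w):
    """Weighted least-squares fit of y = b0 + b1*x.
    Returns the intercept b0 and its standard error."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (sw * sxy - sx * sy) / det
    rss = sum(wi * (yi - b0 - b1 * xi) ** 2
              for wi, xi, yi in zip(w, x, y))
    s2 = rss / (len(y) - 2)                # residual variance
    return b0, math.sqrt(s2 * sxx / det)   # intercept and its SE

def conditional_pet_peese(effects, ses, crit=1.645):
    """Conditional PET-PEESE (after Stanley & Doucouliagos, 2014):
    report the PET intercept unless it is significantly greater than
    zero (one-tailed), in which case report the PEESE intercept."""
    w = [1.0 / s ** 2 for s in ses]        # inverse-variance weights
    b0_pet, se0 = wls_line(ses, effects, w)             # PET: linear in SE
    if b0_pet / se0 <= crit:               # normal approx. to the t test
        return b0_pet, "PET"
    b0_peese, _ = wls_line([s ** 2 for s in ses], effects, w)  # PEESE: SE^2
    return b0_peese, "PEESE"
```

When the observed effects track their standard errors (a bias-like small-study pattern), PET drives the adjusted estimate toward zero; when the intercept test indicates a genuinely nonzero mean, the quadratic PEESE model is reported instead.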
Table 1
Meta-Analytic and Publication Bias Results for the Anderson et al. (2010) Data Set
Original distributions
Aggressive affect
All experiments 37 3,015 .23 .16, .29 .05, .47 111.22 67.63 .16 .20, .24; .23 L 9 .14 .07, .22 0 .23 .16, .29 .08 .19 .13 .34
All experiments (w/o 2 s) 36 2,979 .21 .15, .28 .05, .45 102.30 65.79 .16 .19, .22; .21 L 8 .14 .07, .22 L 7 .15 .08, .22 .08 .18 .13 .33
Best experiments 21 1,454 .33 .25, .41 .09, .54 49.15 59.31 .15 .28, .34; .34 L 6 .25 .15, .34 0 .33 .25, .41 .22 .31 .29 .55
Best experiments (w/o 2 s) 20 1,418 .32 .24, .39 .09, .52 43.82 56.64 .14 .27, .33; .32 L 6 .24 .15, .34 0 .32 .24, .39 .22 .30 .28 .55
Aggressive cognition
All experiments 48 4,289.5 .21 .16, .25 .04, .37 90.00 47.78 .10 .19, .21; .21 0 .21 .16, .25 R 6 .23 .19, .28 .21 .18 .13 .25
All experiments (w/o 2 s) 47 4,173.5 .19 .16, .23 .07, .31 66.31 30.63 .07 .19, .20; .19 0 .19 .16, .23 0 .19 .16, .23 .21 .18 .15 .22
Best experiments 24 2,887 .22 .18, .27 .11, .33 35.11 34.49 .07 .21, .23; .22 L 5 .20 .15, .25 L 5 .20 .15, .25 .23 .21 .20 .19
Best experiments (w/o 2 s) Same results as above
Aggressive behavior
All experiments 45 3,464 .19 .14, .24 .02, .36 79.08 44.36 .10 .18, .20; .19 L 8 .15 .10, .21 L 8 .15 .10, .21 .14 .17 .13 .23
All experiments (w/o 2 s) 44 3,428 .18 .14, .21 .08, .27 52.94 18.78 .06 .17, .18; .18 L 7 .16 .11, .20 L 7 .16 .11, .20 .14 .16 .14 .17
Best experiments 27 2,513 .21 .17, .25 .18, .24 19.41 .0 .0 .20, .23; .21 L 10 .18 .15, .22 L 10 .18 .15, .22 .16 .20 .19 .07
Best experiments (w/o 2 s) Same results as above
Physiological arousal
All experiments 29 1,906 .15 .09, .21 .03, .31 45.48 38.44 .10 .13, .16; .15 L 1 .14 .08, .20 0 .15 .09, .21 .09 .12 .07 .11
All experiments (w/o 2 s) 28 1,870 .15 .09, .21 .02, .31 43.59 38.06 .10 .13, .16; .15 L 3 .13 .06, .20 0 .15 .09, .21 .09 .12 .08 .09
Best experiments 15 969 .20 .10, .29 .05, .42 30.43 53.99 .14 .17, .22; .20 0 .20 .10, .29 0 .20 .10, .29 .19 .16 n/a .27
Best experiments (w/o 2 s) 14 933 .21 .11, .31 .02, .43 27.62 52.93 .14 .18, .24; .21 L 5 .10 .01, .21 0 .21 .11, .31 .19 .18 .11 .23
Distributions without identified outliers
Aggressive affect
All experiments 36 2,985 .20 .14, .25 .0, .38 75.53 53.66 .12 .19, .21; .20 L 8 .14 .08, .20 L 7 .15 .09, .21 .08 .17 .14 .01
All experiments (w/o 2 s) 35 2,949 .19 .13, .24 .0, .36 66.24 48.67 .11 .18, .20; .19 L 7 .14 .09, .20 L 6 .15 .10, .21 .08 .16 .13 .01
Best experiments 20 1,424 .28 .23, .33 .21, .34 20.25 6.15 .03 .27, .29; .28 L 6 .24 .18, .30 L 6 .24 .18, .30 .22 .27 .26 .0
Best experiments (w/o 2 s) 19 1,388 .27 .21, .31 .22, .31 14.35 .0 .0 .26, .28; .27 L 5 .24 .18, .29 L 5 .24 .18, .29 .22 .26 .25 .0
Aggressive cognition
All experiments 46 3,966.5 .19 .15, .22 .08, .29 58.45 23.01 .06 .18, .19; .19 0 .19 .15, .22 0 .19 .15, .22 .18 .17 .15 .20
All experiments (w/o 2 s) 46 3,966.5 .19 .15, .22 .08, .29 58.45 23.01 .06 .18, .19; .19 0 .19 .15, .22 0 .19 .15, .22 .18 .17 .15 .20
Best experiments No outlier(s) identified (see the original distribution for the results)
Best experiments (w/o 2 s) No outlier(s) identified (see the original distribution for the results)
Aggressive behavior
All experiments 43 3,074 .18 .14, .22 .08, .28 51.26 18.07 .06 .18, .19; .18 L 6 .16 .12, .20 L 6 .16 .12, .20 .17 .17 .15 .19
All experiments (w/o 2 s) Same results as above
Best experiments 26 2,159 .23 .19, .27 .19, .26 14.91 .0 .0 .22, .23; .23 L 7 .20 .17, .24 L 7 .20 .17, .24 .18 .22 .21 .18
Best experiments (w/o 2 s) Same results as above
Physiological arousal
All experiments 28 1,872 .13 .08, .18 .02, .24 33.90 20.35 .06 .12, .14; .13 L 2 .12 .06, .18 L 1 .13 .07, .18 .09 .10 .06 .08
All experiments (w/o 2 s) 27 1,836 .13 .08, .19 .02, .24 32.17 19.18 .06 .12, .14; .13 L 2 .13 .07, .18 L 1 .13 .08, .19 .09 .11 .07 .06
Best experiments No outlier(s) identified (see the original distribution for the results)
Best experiments (w/o 2 s) No outlier(s) identified (see the original distribution for the results)
Note. w/o 2 s = without the two studies excluded by Hilgard et al. (2017); k = number of correlation coefficients in the analyzed distribution; N = meta-analytic sample size; ro = random-effects weighted mean observed correlation; 90% PI = 90% prediction interval; Q = weighted sum of squared deviations from the mean; I² = ratio of true heterogeneity to total variation; τ = between-sample standard deviation; osr = one sample removed, including the minimum and maximum effect size and the median weighted mean observed correlation; trim and fill = trim-and-fill analysis; FPS = funnel plot side (i.e., side of the funnel plot in which samples were imputed; L = left; R = right); ik = number of trim-and-fill samples imputed; t&fFE ro = fixed-effects trim-and-fill-adjusted observed mean; t&fFE 95% CI = fixed-effects trim-and-fill-adjusted 95% confidence interval; t&fRE ro = random-effects trim-and-fill-adjusted observed mean; t&fRE 95% CI = random-effects trim-and-fill-adjusted 95% confidence interval; CMA = cumulative meta-analysis; pr5 ro = meta-analytic mean estimate of the five most precise effects; smm ro = one-tailed moderate selection model's adjusted observed mean; sms ro = one-tailed severe selection model's adjusted observed mean; PET-PEESE = precision-effect test and precision-effect estimate with standard error; PET-PEESE ro = PET-PEESE adjusted observed mean; n/a = not applicable (because sms ro presented nonsensical results because of high variance estimates).
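Several quantities in Table 1 (ro, Q, I², and τ) are standard random-effects meta-analytic statistics. As a reader aid, their computation can be sketched with one common estimator, DerSimonian-Laird; this Python sketch is ours and is only illustrative, since metafor supports several between-study variance estimators and the reported values need not come from this one:

```python
import math

def dersimonian_laird(effects, variances):
    """Naive random-effects (RE) meta-analysis via DerSimonian-Laird.
    Returns the RE mean, Cochran's Q, I^2 (in %), and tau (the
    between-sample standard deviation)."""
    k = len(effects)
    w = [1.0 / v for v in variances]              # fixed-effects weights
    fe_mean = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fe_mean) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)            # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # RE weights
    re_mean = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    return re_mean, q, i2, math.sqrt(tau2)
```

With homogeneous effects, Q falls near its degrees of freedom and τ shrinks to zero; outlying effects inflate Q, I², and τ, which is exactly why the bottom panel of the table reruns every analysis after outlier removal.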
its number of samples (k) and individual observations (N). Columns 4–10 display the naïve meta-analytic results, including the RE meta-analytic mean (the naïve mean; ro), the 95% confidence interval, the 90% prediction interval (PI), Cochran's Q statistic, I², tau (τ), and the one-sample-removed analysis (minimum, maximum, and median mean estimates). Columns 11–18 show the results from the trim-and-fill analyses for the recommended FE as well as the RE model, respectively. For each model, the table includes the side of the funnel plot on which the imputed samples are located (FPS), the number of imputed samples (ik), the trim-and-fill-adjusted mean effect size (t&fFE ro or t&fRE ro), and the respective 95% confidence interval. Column 19 contains the cumulative mean for the five most precise samples (pr5 ro). Columns 20 and 21 illustrate the results from the moderate (smm ro) and severe (sms ro) selection models.

exceptions, particularly for PET-PEESE (e.g., aggressive affect "all experiments" and aggressive affect "best experiments").

Discussion

Recent research indicates that publication bias and outliers can distort meta-analytic results and associated conclusions (e.g., Banks, Kepes, & McDaniel, 2015; Kepes, Banks, & Oh, 2014; Kepes & McDaniel, 2015; Viechtbauer & Cheung, 2010). Hilgard et al. (2017) concluded that some of the Anderson et al. results overestimated the impact of violent video game playing on aggressive tendencies. Below, we will address some of the main conclusions of Hilgard, Engelhardt, and Rouder.
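The two distorting mechanisms at issue, selective publication of significant results and a stray extreme effect, can be illustrated with a toy Monte Carlo sketch. This sketch is ours and purely conceptual; the true effect of .10, the 20% publication rate for nonsignificant results, and the outlier value are arbitrary illustrative assumptions, not parameters from any of the analyzed data sets:

```python
import math
import random

def naive_mean_of_published(true_effect=0.10, k=200, n=50,
                            censor=True, seed=1):
    """Simulate k published study effects (Fisher-z scale) and return
    their naive (unweighted) mean. With censor=True, nonsignificant
    results reach publication only 20% of the time; a single extreme
    outlier is appended either way."""
    random.seed(seed)
    se = 1.0 / math.sqrt(n - 3)        # SE of Fisher's z for n participants
    published = []
    while len(published) < k:
        z = random.gauss(true_effect, se)
        significant = abs(z / se) > 1.96
        if not censor or significant or random.random() < 0.20:
            published.append(z)
    published.append(1.0)              # one upwardly extreme outlier
    return sum(published) / len(published)
```

Under these assumptions, the naive mean of the censored literature lands well above the true value, and the outlier pushes it further; this joint inflation is the pattern that trim-and-fill, selection models, and influence diagnostics are designed to detect and correct.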
Other Issues

Hilgard et al. (2017) recommended the exclusion of two studies. Although their exclusion may be justifiable on conceptual or methodological grounds, we did not find support for the notion that the four samples in these two studies had a real meaningful effect on the obtained meta-analytic results, regardless of whether or not we took the potential effects of publication bias and outliers into consideration. Furthermore, more than one outlier was identified in several meta-analytic distributions. The leave-one-out method used by Hilgard, Engelhardt, and Rouder is not capable of handling such situations. Relatedly, our results indicated that outliers, in addition to publication bias, did have a noticeable effect on the originally reported mean estimates (Anderson et al., 2010).

Especially after outlier removal, the results of the various publication bias assessment methods converged, increasing our confidence in the obtained results and associated conclusions.

We do not dispute that publication bias is a serious problem in general or that it may have affected some of the estimates in the Anderson et al. (2010) meta-analysis. In fact, we found that outliers, in addition to publication bias, affected some estimates reported by Anderson et al. We also echo prior calls for comprehensive reanalyses of previously published meta-analytic reviews (e.g., Kepes et al., 2012). However, such reanalyses should follow best-practice recommendations and, therefore, be primarily conducted with appropriate and endorsed methods instead of relying on relatively new and potentially unproven methods, especially
Although Anderson et al. considered numerous moderators (e.g., participant gender; participant age; Eastern vs. Western country; type of design: experimental, cross-sectional, or longitudinal; type of outcome: aggressive cognition, aggressive affect, physiological arousal, aggressive behavior, empathy, helping; game characteristics such as human vs. nonhuman targets, first- vs. third-person perspectives), these moderators did not fully account for the between-study heterogeneity observed in the effects. Thus, future research should examine other possible moderator variables, such as publication year (to see whether the effects have changed over time), amount of blood and gore in the game, whether the violence is justified or unjustified, whether players use a gun-shaped controller or a standard controller, whether the video game is played cooperatively or competitively, and whether the video game is played alone or with other players, to name a few. There were not enough studies to test these latter potential moderators in 2010, but there may be now.

Conclusion

In conclusion, the trustworthiness of our cumulative knowledge regarding the effects of violent video games is of clear concern to society, which is why we applaud Hilgard et al.'s (2017) attempt to assess the trustworthiness of this literature. However, our conclusions about violent video game effects differ from those of Hilgard, Engelhardt, and Rouder. Contrary to the conclusions of Hilgard, Engelhardt, and Rouder, ours are based on results from a comprehensive battery of sensitivity analyses and are thus likely to be more robust to potential adverse effects.

There was convergence in our results across various different methods when we triangulated the true underlying mean effect for the relations between violent video games and aggression. Contrary to what Hilgard et al. (2017) suggested, that effect was not very small in size. As stated in our title, although the magnitudes of the mean effects were reduced by publication bias and outliers, violent video game effects remain a societal concern.

References

Aguinis, H., Werner, S., Abbott, J. L., Angert, C., Park, J. H., & Kohlhausen, D. (2010). Customer-centric science: Reporting significant research results with rigor, relevance, and practical impact in mind. Organizational Research Methods, 13, 515–539. http://dx.doi.org/10.1177/1094428109333339

American Psychological Association. (2008). Reporting standards for research in psychology: Why do we need them? What might they be? American Psychologist, 63, 839–851. http://dx.doi.org/10.1037/0003-066X.63.9.839

Anderson, C. A., Shibuya, A., Ihori, N., Swing, E. L., Bushman, B. J., Sakamoto, A., . . . Saleem, M. (2010). Violent video game effects on aggression, empathy, and prosocial behavior in eastern and western countries: A meta-analytic review. Psychological Bulletin, 136, 151–173. http://dx.doi.org/10.1037/a0018251

Banks, G. C., Kepes, S., & McDaniel, M. A. (2015). Publication bias: Understanding the myths concerning threats to the advancement of

Copas, J., & Shi, J. Q. (2000). Meta-analysis, funnel plots and sensitivity analysis. Biostatistics, 1, 247–262. http://dx.doi.org/10.1093/biostatistics/1.3.247

De Angelis, C., Drazen, J. M., Frizelle, F. A. P., Haug, C., Hoey, J., Horton, R., . . . the International Committee of Medical Journal Editors. (2004). Clinical trial registration: A statement from the International Committee of Medical Journal Editors. New England Journal of Medicine, 351, 1250–1251. http://dx.doi.org/10.1056/NEJMe048225

Duval, S. J. (2005). The trim and fill method. In H. R. Rothstein, A. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and adjustments (pp. 127–144). West Sussex, UK: Wiley.

Graybill, D., Kirsch, J. R., & Esselman, E. D. (1985). Effects of playing violent versus nonviolent video games on the aggressive ideation of aggressive and nonaggressive children. Child Study Journal, 15, 199–205.

Greenhouse, J. B., & Iyengar, S. (2009). Sensitivity analysis and diagnostics. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 417–433). New York, NY: Russell Sage Foundation.

Hedges, L. V., & Vevea, J. L. (2005). Selection method approaches. In H. R. Rothstein, A. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and adjustments (pp. 145–174). West Sussex, UK: Wiley.

Higgins, J. P., & Green, S. (Eds.). (2011). Cochrane handbook for systematic reviews of interventions: Version 5.1.0 [updated September 2011]. The Cochrane Collaboration. Available at www.cochrane-handbook.org

Hilgard, J., Engelhardt, C. R., & Rouder, J. N. (2017). Overstated evidence for short-term effects of violent games on affect and behavior: A reanalysis of Anderson et al. (2010). Psychological Bulletin, 143, 757–774. http://dx.doi.org/10.1037/bul0000074

Jick, T. D. (1979). Mixing qualitative and quantitative methods: Triangulation in action. Administrative Science Quarterly, 24, 602–611. http://dx.doi.org/10.2307/2392366

Kepes, S., Banks, G. C., McDaniel, M. A., & Whetzel, D. L. (2012). Publication bias in the organizational sciences. Organizational Research Methods, 15, 624–662. http://dx.doi.org/10.1177/1094428112452760

Kepes, S., Banks, G. C., & Oh, I.-S. (2014). Avoiding bias in publication bias research: The value of "null" findings. Journal of Business and Psychology, 29, 183–203. http://dx.doi.org/10.1007/s10869-012-9279-0

Kepes, S., Bennett, A. A., & McDaniel, M. A. (2014). Evidence-based management and the trustworthiness of our cumulative scientific knowledge: Implications for teaching, research, and practice. Academy of Management Learning & Education, 13, 446–466. http://dx.doi.org/10.5465/amle.2013.0193

Kepes, S., & McDaniel, M. A. (2013). How trustworthy is the scientific literature in industrial and organizational psychology? Industrial and Organizational Psychology: Perspectives on Science and Practice, 6, 252–268. http://dx.doi.org/10.1111/iops.12045

Kepes, S., & McDaniel, M. A. (2015). The validity of conscientiousness is overestimated in the prediction of job performance. PLoS ONE, 10, e0141468. http://dx.doi.org/10.1371/journal.pone.0141468

Kepes, S., McDaniel, M. A., Brannick, M. T., & Banks, G. C. (2013). Meta-analytic reviews in the organizational sciences: Two meta-analytic schools on the way to MARS (the Meta-analytic Reporting Standards). Journal of Business and Psychology, 28, 123–143. http://dx.doi.org/10
science. In C. E. Lance & R. J. Vandenberg (Eds.), More statistical and .1007/s10869-013-9300-2
methodological myths and urban legends (pp. 36 64). New York, NY: Maxwell, S. E. (2004). The persistence of underpowered studies in psy-
Routledge. chological research: Causes, consequences, and remedies. Psychological
Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2009). Methods, 9, 147163. http://dx.doi.org/10.1037/1082-989X.9.2.147
Introduction to meta-analysis. West Sussex, UK: Wiley. http://dx.doi McShane, B. B., Bckenholt, U., & Hansen, K. T. (2016). Adjusting for
.org/10.1002/9780470743386 publication bias in meta-analysis: An evaluation of selection methods
782 KEPES, BUSHMAN, AND ANDERSON
and some cautionary notes. Perspectives on Psychological Science, 11, Stanley, T. D., & Doucouliagos, H. (2017). Neither fixed nor random:
730 749. http://dx.doi.org/10.1177/1745691616662243 Weighted least squares meta-regression. Research Synthesis Methods, 8,
Moreno, S. G., Sutton, A. J., Ades, A. E., Stanley, T. D., Abrams, K. R., 19 42. http://dx.doi.org/10.1002/jrsm.1211
Peters, J. L., & Cooper, N. J. (2009). Assessment of regression-based Stanley, T. D., Jarrell, S. B., & Doucouliagos, H. (2010). Could it be better
methods to adjust for publication bias through a comprehensive simu- to discard 90% of the data? A statistical paradox. American Statistician,
lation study. BMC Medical Research Methodology, 9, 2. http://dx.doi 64, 70 77. http://dx.doi.org/10.1198/tast.2009.08205
.org/10.1186/1471-2288-9-2 Sterne, J. A., & Egger, M. (2005). Regression methods to detect publica-
OBoyle, E. H., Jr., Banks, G. C., & Gonzalez-Mul, E. (2017). The tion bias and other bias in meta-analysis. In H. R. Rothstein, A. J. Sutton,
chrysalis effect: How ugly initial results metamorphosize into beautiful & M. Borenstein (Eds.), Publication bias in meta analysis: Prevention,
articles. Journal of Management, 43, 376 399. http://dx.doi.org/10 assessment, and adjustments (pp. 99 110). West Sussex, UK: Wiley.
.1177/0149206314527133 http://dx.doi.org/10.1002/0470870168.ch6
Panee, C. D., & Ballard, M. E. (2002). High versus low aggressive priming Sterne, J. A. C., Sutton, A. J., Ioannidis, J. P. A., Terrin, N., Jones, D. R.,
during video-game training: Effects on violent action during game play, Lau, J., . . . Higgins, J. P. T. (2011). Recommendations for examining
hostility, heart rate, and blood pressure (Vol. 32, pp. 2458 2474). and interpreting funnel plot asymmetry in meta-analyses of randomised
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
United Kingdom: Blackwell Publishing. controlled trials. British Medical Journal, 343, d4002. http://dx.doi.org/
Platt, J. R. (1964). Strong inference: Certain systematic methods of scien- 10.1136/bmj.d4002
This document is copyrighted by the American Psychological Association or one of its allied publishers.
tific thinking may produce much more rapid progress than others. van Assen, M. A. L. M., van Aert, R. C. M., & Wicherts, J. M. (2015).
Science, 146, 347353. http://dx.doi.org/10.1126/science.146.3642.347 Meta-analysis using effect size distributions of only statistically signif-
Richard, F. D., Bond, C. F., Jr., & Stokes-Zoota, J. J. (2003). One hundred icant studies. Psychological Methods, 20, 293309. http://dx.doi.org/10
years of social psychology quantitatively described. Review of General .1037/met0000025
Psychology, 7, 331363. http://dx.doi.org/10.1037/1089-2680.7.4.331 van Elk, M., Matzke, D., Gronau, Q. F., Guan, M., Vandekerckhove, J., &
Rothstein, H. R., Sutton, A. J., & Borenstein, M. (2005). Publication bias Wagenmakers, E.-J. (2015). Meta-analyses are no substitute for regis-
in meta-analysis: Prevention, assessment, and adjustments. West Sus- tered replications: A skeptical perspective on religious priming. Fron-
sex, UK: Wiley. http://dx.doi.org/10.1002/0470870168 tiers in Psychology, 6, 1365. http://dx.doi.org/10.3389/fpsyg.2015
Schwarzer, G. (2015). Meta-analysis package for R: Package meta. R .01365
package (version 4.3-2) [Computer software]. Retrieved from http:// Vevea, J. L., & Woods, C. M. (2005). Publication bias in research syn-
portal.uni-freiburg.de/imbi/lehre/lehrbuecher/meta-analysis-with-r thesis: Sensitivity analysis using a priori weight functions. Psychologi-
Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve and effect cal Methods, 10, 428 443. http://dx.doi.org/10.1037/1082-989X.10.4
size: Correcting for publication bias using only significant results. Per- .428
spectives on Psychological Science, 9, 666 681. http://dx.doi.org/10 Viechtbauer, W. (2015). Meta-analysis package for R: Package metafor.
.1177/1745691614553988 R package (version 1.9-5) [Computer software]. Retrieved from http://
Stanley, T. D., & Doucouliagos, H. (2007). Identifying and correcting www.metafor-project.org/doku.php
publication selection bias in the efficiency-wage literature: Heckman Viechtbauer, W., & Cheung, M. W. L. (2010). Outlier and influence
meta-regression. Economics Series, 11. Retrieved from https://ideas diagnostics for meta-analysis. Research Synthesis Methods, 1, 112125.
.repec.org/p/dkn/econwp/eco_2007_11.html http://dx.doi.org/10.1002/jrsm.11
Stanley, T. D., & Doucouliagos, H. (2012). Meta-regression analysis in
economics and business. New York, NY: Routledge.
Stanley, T. D., & Doucouliagos, H. (2014). Meta-regression approxima- Received October 3, 2016
tions to reduce publication selection bias. Research Synthesis Methods, Revision received May 2, 2017
5, 60 78. http://dx.doi.org/10.1002/jrsm.1095 Accepted May 4, 2017