MTS 525-0
Special Topics Research Seminar
Section 20: Generalizing about Message Effects
Spring 2020
SYLLABUS: TOPIC 5
TOPIC 5: Interpreting effect size magnitude and variability
5.1 Effect size magnitudes
5.1.1 Abstract characterizations of effect size magnitude
5.1.2 Observed average effect sizes
5.1.3 The null as a range: Equivalence testing and second-generation p-values
5.2 Effect size variability
5.2.1 Heterogeneity indices (I², Q, Birge’s R, etc.)
5.2.2 Prediction intervals
5.3 The “replication crisis” revisited
5.1 Effect size magnitudes
5.1.1 Abstract characterizations of effect size magnitude
Funder, D. C., & Ozer, D. J. (2019). Evaluating effect size in psychological research: Sense and nonsense. Advances in Methods and Practices in Psychological Science, 2, 156-168. doi:10.1177/2515245919847202
For further reading:
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Abelson, R. P. (1985). A variance explanation paradox: When a little is a lot. Psychological Bulletin, 97, 129-133. doi:10.1037/0033-2909.97.1.129
Prentice, D. A., & Miller, D. T. (1992). When small effects are impressive. Psychological Bulletin, 112, 160-164. doi:10.1037/0033-2909.112.1.160
Pogrow, S. (2019). How effect size (practical significance) misleads clinical practice: The case for switching to practical benefit to assess applied research findings. The American Statistician, 73(S1), 223-234. doi:10.1080/00031305.2018.1549101
Correll, J., Mellinger, C., McClelland, G. H., & Judd, C. M. (2020). Avoid Cohen’s ‘small’, ‘medium’, and ‘large’ for power analysis. Trends in Cognitive Sciences, 24(3), 200-207. https://doi.org/10.1016/j.tics.2019.12.009
5.1.2 Observed average effect sizes
Rains, S. A., Levine, T. R., & Weber, R. (2018). Sixty years of quantitative communication research summarized: Lessons from 149 meta-analyses. Annals of the International Communication Association, 42, 105-124. doi:10.1080/23808985.2018.1446350
Schäfer, T., & Schwarz, M. A. (2019). The meaningfulness of effect sizes in psychological research: Differences between sub-disciplines and the impact of potential biases. Frontiers in Psychology, 10, article 813. doi:10.3389/fpsyg.2019.00813
For further reading:
Haase, R. F., Waechter, D. M., & Solomon, G. S. (1982). How significant is a significant difference? Average effect size of research in counseling psychology. Journal of Counseling Psychology, 29, 58-65.
Cooper, H., & Findley, M. (1982). Expected effect sizes: Estimates for statistical power analysis in social psychology. Personality and Social Psychology Bulletin, 8, 168-173. doi:10.1177/014616728281026
Hemphill, J. F. (2003). Interpreting the magnitudes of correlation coefficients. American Psychologist, 58(1), 78-79. doi:10.1037/0003-066X.58.1.78
Richard, F. D., Bond, C. F., Jr., & Stokes-Zoota, J. J. (2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7(4), 331-363. doi:10.1037/1089-2680.7.4.331
Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172-177. doi:10.1111/j.1750-8606.2008.00061.x
Ferguson, C. J. (2009). Is psychological research really as good as medical research? Effect size comparisons between psychology and medicine. Review of General Psychology, 13, 130-136. doi:10.1037/a0015103
Chen, H., Cohen, P., & Chen, S. (2010). How big is a big odds ratio? Interpreting the magnitudes of odds ratios in epidemiological studies. Communications in Statistics: Simulation and Computation, 39(4), 860-864. doi:10.1080/03610911003650383
Bosco, F. A., Aguinis, H., Singh, K., Field, J. G., & Pierce, C. A. (2015). Correlational effect size benchmarks. The Journal of Applied Psychology, 100(2), 431–449. doi:10.1037/a0038047
Leucht, S., Helfer, B., Gartlehner, G., & Davis, J. M. (2015). How effective are common medications: A perspective based on meta-analyses of major drugs. BMC Medicine, 13, 253. doi:10.1186/s12916-015-0494-1
Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences researchers. Personality and Individual Differences, 102, 74–78. doi:10.1016/j.paid.2016.06.069
Paterson, T. A., Harms, P. D., Steel, P., & Credé, M. (2016). An assessment of the magnitude of effect sizes: Evidence from 30 years of meta-analysis in management. Journal of Leadership & Organizational Studies, 23(1), 66-81. doi:10.1177/1548051815614321
Lovakov, A., & Agadullina, E. (2017). Empirically derived guidelines for interpreting effect size in social psychology. PsyArXiv manuscript. psyarxiv.com/2epc4. doi:10.17605/OSF.IO/2EPC4
Brydges, C. R. (2019). Effect size guidelines, sample size calculations, and statistical power in gerontology. Innovation in Aging, 3(4), igz036. doi:10.1093/geroni/igz036
5.1.3 The null as a range: Equivalence testing and second-generation p-values
Weber, R., & Popova, L. (2012). Testing equivalence in communication research: Theory and application. Communication Methods and Measures, 6, 190-213. doi:10.1080/19312458.2012.703834
Blume, J. D., Greevy, R. A., Welty, V. F., Smith, J. R., & Dupont, W. D. (2019). An introduction to second-generation p-values. The American Statistician, 73(S1), 157-167. doi:10.1080/00031305.2018.1537893
For further reading:
Wellek, S. (2010). Testing statistical hypotheses of equivalence and noninferiority (2nd ed.). Boca Raton, FL: Chapman & Hall/CRC.
Goertzen, J. R., & Cribbie, R. A. (2010). Detecting a lack of association: An equivalence testing approach. British Journal of Mathematical and Statistical Psychology, 63, 527–537. doi:10.1348/000711009X475853
Rainey, C. (2014). Arguing for a negligible effect. American Journal of Political Science, 58, 1083-1091. doi:10.1111/ajps.12102
Lash, T. L., & Kaufman, J. S. (2015). Seeking persuasively null results. Epidemiology, 26, 449-450. doi:10.1097/EDE.0000000000000318
Lakens, D. (2017). Equivalence tests: A practical primer for t-tests, correlations, and meta-analyses. Social Psychological and Personality Science, 8, 355-362. doi:10.1177/1948550617697177
Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing for psychological research: A tutorial. Advances in Methods and Practices in Psychological Science, 1, 259-269. doi:10.1177/2515245918770963
5.2 Effect size variability
5.2.1 Heterogeneity indices (I², Q, Birge’s R, etc.)
Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21, 1539-1558. doi:10.1002/sim.1186
For further reading:
Birge, R. T. (1932). The calculation of errors by the method of least squares. Physical Review, 40 (2nd ser.), 207-227.
Hall, J. A., & Rosenthal, R. (1991). Testing for moderator variables in meta-analysis: Issues and methods. Communication Monographs, 58, 437-448. doi:10.1080/03637759109376240
Sánchez-Meca, J., & Marín-Martínez, F. (1997). Homogeneity tests in meta-analysis: A
Engels, E. A., Schmid, C. H., Terrin, N., Olkin, I., & Lau, J. (2000). Heterogeneity and statistical significance in meta-analysis: An empirical study of 125 meta-analyses. Statistics in Medicine, 19, 1707-1728.
Higgins, J., Thompson, S., Deeks, J., & Altman, D. (2002). Statistical heterogeneity in systematic reviews of clinical trials: A critical appraisal of guidelines and practice. Journal of Health Services Research and Policy, 7, 51-61. doi:10.1258/1355819021927674
Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. BMJ, 327, 557-560. doi:10.1136/bmj.327.7414.557
Huedo-Medina, T. B., Sánchez-Meca, J., Marín-Martínez, F., & Botella, J. (2006). Assessing heterogeneity in meta-analysis: Q statistic or I² index? Psychological Methods, 11, 193-206. doi:10.1037/1082-989X.11.2.193
Rücker, G., Schwarzer, G., Carpenter, J. R., & Schumacher, M. (2008). Undue reliance on I² in assessing heterogeneity may mislead. BMC Medical Research Methodology, 8, 79. doi:10.1186/1471-2288-8-79
Ioannidis, J. P. A. (2008). Interpretation of tests of heterogeneity and bias in meta-analysis. Journal of Evaluation in Clinical Practice, 14, 951-957. doi:10.1111/j.1365-2753.2008.00986.x
Pereira, T. A., Patsopoulos, N. A., Salanti, G., & Ioannidis, J. P. A. (2010). Critical interpretation of Cochran's Q test depends on power and prior assumptions about heterogeneity. Research Synthesis Methods, 1, 149–161. doi:10.1002/jrsm.13
Card, N. A. (2012). Section 8.4: Evaluating heterogeneity among effect sizes. In Applied meta-analysis for social science research (pp. 184-191). New York: Guilford.
Langan, D., Higgins, J. P. T., & Simmonds, M. (2015). An empirical comparison of heterogeneity variance estimators in 12 894 meta-analyses. Research Synthesis Methods, 6, 195–205. doi:10.1002/jrsm.1140
Wiernik, B. M., Kostal, J. W., Wilmot, M. P., Dilchert, S., & Ones, D. S. (2017). Empirical benchmarks for interpreting effect size variability in meta-analysis. Industrial and Organizational Psychology, 10(3), 472–479. https://doi.org/10.1017/iop.2017.44
5.2.2 Prediction intervals
Borenstein, M., Higgins, J. P. T., Hedges, L. V., & Rothstein, H. R. (2017). Basics of meta-analysis: I² is not an absolute measure of heterogeneity. Research Synthesis Methods, 8, 5-18. doi:10.1002/jrsm.1230
IntHout, J., Ioannidis, J. P. A., Rovers, M. M., & Goeman, J. J. (2016). Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open, 6, e010247. doi:10.1136/bmjopen-2015-010247
For further reading:
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Chapter 17: Prediction intervals. In Introduction to meta-analysis (pp. 127-133). Chichester, West Sussex, UK: Wiley.
Spence, J. R., & Stanley, D. J. (2016). Prediction interval: What to expect when you’re expecting … a replication. PLoS ONE, 11, e0162874. doi:10.1371/journal.pone.0162874
Partlett, C., & Riley, R. D. (2017). Random effects meta‐analysis: Coverage performance of 95% confidence and prediction intervals following REML estimation. Statistics in Medicine, 36, 301-317. doi:10.1002/sim.7140
Borenstein, M. (2018). Chapter 9.3: Prediction intervals. In Common mistakes in meta-analysis and how to avoid them (pp. 85-93). Englewood, NJ: Biostat.
Nagashima, K., Noma, H., & Furukawa, T. A. (2019). Prediction intervals for random-effects meta-analysis: A confidence distribution approach. Statistical Methods in Medical Research, 28, 1689–1702. doi:10.1177/0962280218773520
5.3 The “replication crisis” revisited
Patil, P., Peng, R. D., & Leek, J. T. (2016). What should researchers expect when they replicate studies? A statistical view of replicability in psychological science. Perspectives on Psychological Science, 11, 539-544. doi:10.1177/1745691616646366
De Boeck, P., & Jeon, M. (2018). Perceived crisis and reforms: Issues, explanations, and remedies. Psychological Bulletin, 144, 757-777. doi:10.1037/bul0000154
For further reading:
Hedges, L. V. (1987). How hard is hard science, how soft is soft science? The empirical cumulativeness of research. American Psychologist, 42(5), 443–455. https://doi.org/10.1037/0003-066X.42.5.443
O’Keefe, D. J. (1999). Variability of persuasive message effects: Meta-analytic evidence and implications. Document Design, 1, 87-97. doi:10.1075/dd.1.2.02oke
Kaptein, M., & Eckles, D. (2012). Heterogeneity in the effects of online persuasion. Journal of Interactive Marketing, 26, 176-188. doi:10.1016/j.intmar.2012.02.002
Bahník, Š., & Vranka, M. A. (2017). If it’s difficult to pronounce, it might not be risky: The effect of fluency on judgment of risk does not generalize to new stimuli. Psychological Science, 28(4), 427–436. https://doi.org/10.1177/0956797616685770
Amrhein, V., Trafimow, D., & Greenland, S. (2019). Inferential statistics as descriptive statistics: There is no replication crisis if we don’t expect replication. The American Statistician, 73(S1), 262-270. doi:10.1080/00031305.2018.1543137
Kenny, D. A., & Judd, C. M. (2019). The unappreciated heterogeneity of effect sizes: Implications for power, precision, planning of research, and replication. Psychological Methods, 24(5), 578-589. doi:10.1037/met0000209
Vivalt, E. (in press). How much can we generalize from impact evaluations? Journal of the European Economic Association. Available at: http://evavivalt.com/wp-content/uploads/How-Much-Can-We-Generalize.pdf