Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211.
Allen, M., Poggiali, D., Whitaker, K., Marshall, T. R., & Kievit, R. A. (2019). Raincloud plots: A multi-platform tool for robust data visualization [version 1; peer review: 2 approved]. Wellcome Open Research, 4.
American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). Washington, DC: American Psychological Association.
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173.
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278.
Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious mixed models. arXiv:1506.04967 [Stat].
Bayes, T. (1763). An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53, 370–418.
Berger, J. O., & Wolpert, R. L. (1988). The likelihood principle. Hayward, CA: Institute of Mathematical Statistics.
Box, G. E. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. The Annals of Mathematical Statistics, 290–302.
Box, G. E. P., & Draper, N. R. (1987). Empirical model-building and response surfaces. New York, NY: John Wiley & Sons.
Carragher, D. J., Thomas, N. A., Gwinn, O. S., & Nicholls, M. E. (2019). Limited evidence of hierarchical encoding in the cheerleader effect. Scientific Reports, 9, 1–13.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Lawrence Erlbaum Associates.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25, 7–29.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge University Press.
Fisman, R., Iyengar, S. S., Kamenica, E., & Simonson, I. (2006). Gender differences in mate selection: Evidence from a speed dating experiment. The Quarterly Journal of Economics, 121, 673–697.
Galton, F. (1907). Vox populi. Nature, 75, 450–451.
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no "fishing expedition" or "p-hacking" and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University.
Gilder, T. S. E., & Heerey, E. A. (2018). The role of experimenter belief in social priming. Psychological Science, 29, 403–417.
Greenhouse, S. W., & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24, 95–112.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12, 55–67.
Howell, D. C. (2012). Statistical methods for psychology. Cengage Learning.
Huynh, H., & Feldt, L. S. (1976). Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs. Journal of Educational Statistics, 1, 69–82.
James, E. L., Bonsall, M. B., Hoppitt, L., Tunbridge, E. M., Geddes, J. R., Milton, A. L., & Holmes, E. A. (2015). Computer game play reduces intrusive memories of experimental trauma via reconsolidation-update mechanisms. Psychological Science.
Jeffreys, H. (1939). The theory of probability. Oxford: Oxford University Press.
Judd, C. M., McClelland, G. H., & Ryan, C. S. (2011). Data analysis: A model comparison approach. Routledge.
Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103, 54–69.
Kenny, D. A., & Judd, C. M. (1986). Consequences of violating the independence assumption in analysis of variance. Psychological Bulletin, 99, 422–431.
Kenward, M. G., & Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 53, 983–997.
Klein, R. A., Ratliff, K. A., Vianello, M., Adams Jr, R. B., Bahník, Š., Bernstein, M. J., … others. (2014). Investigating variation in replicability: A "many labs" replication project. Social Psychology, 142–152.
Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.
Levene, H. (1960). Robust tests for equality of variances. In I. Olkin (Ed.), Contributions to probability and statistics. Essays in honor of Harold Hotelling (pp. 279–292). Palo Alto, CA: Stanford University Press.
Liang, F., Paulo, R., Molina, G., Clyde, M. A., & Berger, J. O. (2008). Mixtures of g priors for Bayesian variable selection. Journal of the American Statistical Association, 103, 410–423.
Lix, L. M., Keselman, J. C., & Keselman, H. J. (1996). Consequences of assumption violations revisited: A quantitative review of alternatives to the one-way analysis of variance F test. Review of Educational Research, 66, 579–619.
Luke, S. G. (2017). Evaluating significance in linear mixed-effects models in R. Behavior Research Methods, 49, 1494–1502.
Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The importance of the normality assumption in large public health data sets. Annual Review of Public Health, 23, 151–169.
MacKinnon, D. P., Fairchild, A. J., & Fritz, M. S. (2007). Mediation analysis. Annual Review of Psychology, 58, 593–614.
MacKinnon, D. P., Krull, J. L., & Lockwood, C. M. (2000). Equivalence of the mediation, confounding and suppression effect. Prevention Science, 1, 173–181.
Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315.
Mauchly, J. W. (1940). Significance test for sphericity of a normal n-variate distribution. The Annals of Mathematical Statistics, 11, 204–209.
Maxwell, S. E., Delaney, H. D., & Kelley, K. (2017). Designing experiments and analyzing data: A model comparison perspective. Routledge.
Morey, R. D., Hoekstra, R., Rouder, J. N., Lee, M. D., & Wagenmakers, E.-J. (2016). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin & Review, 23, 103–123.
Morey, R. D., & Rouder, J. N. (2018). BayesFactor: Computation of Bayes factors for common designs. Retrieved from
Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5, 241.
Pinheiro, J., & Bates, D. (2006). Mixed-effects models in S and S-PLUS. Springer Science & Business Media.
Popper, K. (1959). The logic of scientific discovery. Routledge.
Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40, 879–891.
Rosenthal, R., & Gaito, J. (1963). The interpretation of levels of significance by psychological researchers. The Journal of Psychology, 55, 33–38.
Rouder, J. N., & Morey, R. D. (2012). Default Bayes factors for model selection in regression. Multivariate Behavioral Research, 47, 877–903.
Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012). Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology, 56, 356–374.
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16, 225–237.
Sakia, R. M. (1992). The Box-Cox transformation technique: A review. Journal of the Royal Statistical Society: Series D (The Statistician), 41, 169–178.
Satterthwaite, F. E. (1941). Synthesis of variance. Psychometrika, 6, 309–316.
Schaffner, B. F., Macwilliams, M., & Nteta, T. (2018). Understanding white polarization in the 2016 vote for president: The sobering role of racism and sexism. Political Science Quarterly, 133, 9–34.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366.
Simonsohn, U. (2014). Posterior-hacking: Selective reporting invalidates Bayesian results also. Retrieved from
Singmann, H., & Kellen, D. (2019). An introduction to linear mixed modeling in experimental psychology. In New methods in cognitive psychology (pp. 4–31). Psychology Press.
Student. (1908). The probable error of a mean. Biometrika, 6, 1–25.
Thompson, B. (1992). Two and one-half decades of leadership in measurement and evaluation. Journal of Counseling & Development, 70, 434–438.
Tomarken, A. J., & Serlin, R. C. (1986). Comparison of ANOVA alternatives under variance heterogeneity and specific noncentrality structures. Psychological Bulletin, 99, 90.
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14, 779–804.
Walker, D., & Vul, E. (2014). Hierarchical encoding makes individuals in a group seem more attractive. Psychological Science, 25, 230–235.
Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. The Annals of Mathematical Statistics, 9, 60–62.
Zabell, S. L. (2008). On Student’s 1908 article "The probable error of a mean". Journal of the American Statistical Association, 103, 1–7.
Zaval, L., Markowitz, E. M., & Weber, E. U. (2015). How will I be remembered? Conserving the environment for the sake of one’s legacy. Psychological Science, 26, 231–236.