α-correction for simultaneous statistical inference:
Familywise error rate vs. per-family error rate
A meta-analysis of more than 30,000 published articles indicated that fewer than 1% applied
α-corrections for multiple comparisons even though the median number of hypothesis tests per
article was ≈ 9 (Conover, 1973; Derrick & White, 2017; Pratt, 1959). A crucial yet
underappreciated distinction is that between 1) the familywise (or experimentwise) error rate
(FWER) and 2) the per-family error rate (PFER). The FWER is the probability of making at least
one Type I error in a family of hypotheses. The PFER, on the other hand, is the number of
α-errors expected to occur in a family of hypotheses (in other words, the sum of the
probabilities of α-errors for all the hypotheses in the family). The per-comparison error rate
(PCER) is the probability of an α-error on a single comparison in the absence of any correction
for multiple comparisons (Benjamini & Hochberg, 1995). Moreover, the false discovery rate (FDR)
quantifies the expected proportion of "discoveries" (rejected null hypotheses) that are false
(incorrect rejections).
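The difference between these quantities can be made concrete with a small numerical sketch. It rests on two simplifying assumptions that are not part of the definitions above: the m tests are independent, and every null hypothesis is true. The function names are hypothetical and chosen for illustration only.

```python
# Illustrative sketch: assumes m independent tests and that all nulls are true.

def pcer(alpha: float) -> float:
    """Per-comparison error rate: the uncorrected alpha of a single test."""
    return alpha


def fwer(alpha: float, m: int) -> float:
    """Probability of at least one alpha-error among m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def pfer(alpha: float, m: int) -> float:
    """Expected number of alpha-errors: the sum of the m per-test alphas."""
    return m * alpha


alpha, m = 0.05, 9  # m = 9 mirrors the median number of tests per article
print(f"PCER = {pcer(alpha):.3f}")     # PCER = 0.050
print(f"FWER = {fwer(alpha, m):.3f}")  # FWER = 0.370
print(f"PFER = {pfer(alpha, m):.3f}")  # PFER = 0.450
```

Under these assumptions the PFER is never smaller than the FWER, because an experiment with several α-errors counts once toward the FWER but several times toward the PFER.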
The majority of investigations focus on the FWER while the PFER is largely ignored even
though it is evidently at least equally important, if not more so (Barnette & Mclean, 2005;
Kemp, 1975). The experimentwise (EW) error rate does not take into account the possibility of
multiple α-errors in the same experiment. Per-experiment (PE) α-control techniques control α
for all comparisons (a priori and post hoc) in a given experiment. In other words, they
consider all possible α-errors that can occur in a given experiment. It has been persuasively
argued that per-experiment α-control is most relevant for pairwise hypothesis decision-making
(Barnette & Mclean, 2005), even though most textbooks (and researchers) focus on the
experimentwise error rate. The two approaches differ substantially in the way they adjust α
for multiple hypothesis
tests. It has been pointed out that the almost exclusive focus on experimentwise error rates is
not justifiable (Barnette & Mclean, 2005). From a pragmatic point of view, per-experiment
error correction is much more closely aligned with prevailing research practices. That is, in
most experiments it is not just the largest difference between conditions that is of empirical
interest; most of the time, all pairwise comparisons are computed. The EW error rate treats
each experiment as one test even though multiple comparisons might have been conducted. A
systematic Monte Carlo based comparison of four different adjustment methods showed that, if
experimentwise α-control is desired, Tukey's HSD is the most accurate procedure when used as
an unprotected test (i.e., without requiring a significant omnibus test first). If the focus
is on per-experiment α-control, the Dunn-Bonferroni procedure (again unprotected) is the most
accurate α-adjustment (Barnette & Mclean, 2005).
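The two error rates can be estimated side by side in a small simulation. The following is a minimal sketch, not a reproduction of the Barnette and Mclean (2005) design: it assumes that all null hypotheses are true and that the m p-values within an experiment are independent and uniformly distributed, whereas real pairwise comparisons share group means and are correlated. It does not implement Tukey's HSD. The function name `simulate_error_rates` and its parameters are hypothetical.

```python
import random


def simulate_error_rates(m, alpha_per_test, n_experiments=20000, seed=1):
    """Estimate the EW and PE error rates when all m null hypotheses are true.

    Simplifying assumption: the m p-values of an experiment are independent
    draws from a uniform distribution on [0, 1].
    """
    rng = random.Random(seed)
    experiments_with_error = 0  # experiments with at least one alpha-error
    total_errors = 0            # alpha-errors summed over all experiments
    for _ in range(n_experiments):
        errors = sum(rng.random() < alpha_per_test for _ in range(m))
        total_errors += errors
        experiments_with_error += errors > 0
    ew = experiments_with_error / n_experiments  # experimentwise error rate
    pe = total_errors / n_experiments            # per-experiment error rate
    return ew, pe


m, alpha = 6, 0.05  # 6 = number of pairwise comparisons among 4 groups
ew_raw, pe_raw = simulate_error_rates(m, alpha)
ew_bonf, pe_bonf = simulate_error_rates(m, alpha / m)  # Dunn-Bonferroni: alpha / m
print(f"uncorrected:     EW = {ew_raw:.3f}, PE = {pe_raw:.3f}")
print(f"Dunn-Bonferroni: EW = {ew_bonf:.3f}, PE = {pe_bonf:.3f}")
```

Under these assumptions, the uncorrected estimates should lie near 1 - 0.95^6 ≈ 0.26 (EW) and 6 × 0.05 = 0.30 (PE), while testing each comparison at α/m should hold the PE rate near the nominal 0.05.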
References
Barnette, J. J., & Mclean, J. E. (2005). Type I error of four pairwise mean comparison
procedures conducted as protected and unprotected tests. Journal of Modern Applied
Statistical Methods, 4(2), 446–459. https://doi.org/10.22237/jmasm/1130803740
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and
powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B
(Methodological), 57(1), 289–300.
https://doi.org/10.2307/2346101
Conover, W. J. (1973). On methods of handling ties in the Wilcoxon signed-rank test. Journal
of the American Statistical Association, 68(344), 985–988.
https://doi.org/10.1080/01621459.1973.10481460
Derrick, B., & White, P. (2017). Comparing two samples from an individual Likert question.
International Journal of Mathematics and Statistics, 974–7117. Retrieved from
http://eprints.uwe.ac.uk/30814
http://www.ceser.in/ceserp/index.php/ijms
Kemp, K. E. (1975). Multiple comparisons: comparisonwise versus experimentwise Type I
error rates and their relationship to power. Journal of Dairy Science, 58(9), 1374–1378.
https://doi.org/10.3168/jds.S0022-0302(75)84722-9
Pratt, J. W. (1959). Remarks on zeros and ties in the Wilcoxon signed rank procedures.
Journal of the American Statistical Association, 54(287), 655–667.
https://doi.org/10.1080/01621459.1959.10501526