Multiple comparisons problem


In statistics, the multiple comparisons, multiplicity, or multiple testing problem occurs when one considers a set of statistical inferences simultaneously, or when one infers a subset of parameters selected on the basis of the observed values. In certain fields it is known as the look-elsewhere effect.

The more inferences are made, the more likely erroneous inferences are to occur. Several statistical techniques have been developed to prevent this from happening, allowing significance levels for single and multiple comparisons to be directly compared. These techniques generally require a stricter significance threshold for individual comparisons, to compensate for the number of inferences being made.
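As a minimal sketch of one simple such adjustment, the Bonferroni correction, each of m comparisons is tested at the stricter level alpha/m so that the overall (familywise) error rate stays at or below alpha. The p-values below are hypothetical and only illustrate the mechanics:

    # A minimal sketch of the Bonferroni correction: to keep the familywise
    # error rate at alpha across m comparisons, each individual comparison
    # is tested at the stricter per-comparison level alpha / m.

    alpha = 0.05            # desired familywise significance level
    m = 4                   # number of comparisons (hypothetical)
    threshold = alpha / m   # per-comparison threshold, here 0.0125

    p_values = [0.001, 0.020, 0.004, 0.300]   # hypothetical p-values
    rejected = [p <= threshold for p in p_values]
    print(rejected)                           # [True, False, True, False]

Procedures such as the Holm–Bonferroni method mentioned below refine this idea to gain power while still controlling the familywise error rate.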

Interest in the problem of multiple comparisons began in the 1950s with the work of Tukey and Scheffé. Other methods, such as the closed testing procedure (Marcus et al., 1976) and the Holm–Bonferroni method (1979), emerged later. In 1995, work on the false discovery rate began. In 1996, the first conference on multiple comparisons took place in Israel; it was followed by conferences around the world, usually held about every two years.

Multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has a potential to produce a "discovery." Failure to compensate for multiple comparisons can have important real-world consequences, as illustrated by the following examples:

Suppose the treatment is a new way of teaching writing to students, and the control is the standard way of teaching writing. Students in the two groups can be compared in terms of grammar, spelling, organization, content, and so on. As more attributes are compared, it becomes increasingly likely that the treatment and control groups will appear to differ on at least one attribute due to random sampling error alone.

Suppose we consider the efficacy of a drug in terms of the reduction of any one of a number of disease symptoms. As more symptoms are considered, it becomes increasingly likely that the drug will appear to be an improvement over existing drugs in terms of at least one symptom.

In both examples, as the number of comparisons increases, it becomes more likely that the groups being compared will appear to differ in terms of at least one attribute. Our confidence that a result will generalize to independent data should generally be weaker if it is observed as part of an analysis that involves multiple comparisons, rather than an analysis that involves only a single comparison.

For example, if one test is performed at the 5% level and the corresponding null hypothesis is true, there is only a 5% chance of incorrectly rejecting the null hypothesis. However, if 100 tests are conducted and all corresponding null hypotheses are true, the expected number of incorrect rejections (also known as false positives or Type I errors) is 5. If the tests are statistically independent of each other, the probability of at least one incorrect rejection is approximately 99.4%.
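The arithmetic behind these figures can be reproduced directly. The sketch below assumes 100 mutually independent tests, each performed at the 5% level, with every null hypothesis true:

    # Reproducing the numbers above: m independent tests at level alpha,
    # with every null hypothesis true.

    alpha = 0.05   # per-test significance level
    m = 100        # number of tests

    expected_false_positives = m * alpha       # expected number of Type I errors
    prob_at_least_one = 1 - (1 - alpha) ** m   # probability of at least one false positive

    print(expected_false_positives)            # 5.0
    print(round(prob_at_least_one, 3))         # 0.994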

