What is FDR in statistics?
Last updated: April 1, 2026
Key Facts
- FDR was developed by Yoav Benjamini and Yosef Hochberg in 1995 as an improvement over stricter multiple comparison corrections
- It controls the proportion of false discoveries among rejected hypotheses, not the probability of any single false positive
- The Benjamini-Hochberg (BH) procedure is the most widely used method for controlling FDR
- FDR is more statistically powerful than Bonferroni correction, making it valuable in high-dimensional testing scenarios like genomics
- FDR has become essential in genomics, neuroscience, psychology, and any field conducting thousands of simultaneous statistical tests
Understanding False Discovery Rate
The False Discovery Rate (FDR) is a fundamental concept in statistics that helps researchers manage the problem of multiple comparisons. When conducting many statistical tests simultaneously, the probability of finding false positives (Type I errors) increases substantially. For example, if you conduct 1,000 independent tests at a significance level of 0.05, you would expect approximately 50 false positives by chance alone. FDR provides a principled way to control this inflation while maintaining reasonable statistical power.
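The inflation described above is easy to see in a quick simulation. As a minimal sketch (using NumPy with an arbitrary seed), draw 1,000 p-values under the null hypothesis, where p-values are uniform on [0, 1], and count how many fall below 0.05:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000
# Under the null hypothesis, p-values are uniform on [0, 1]
pvals = rng.uniform(0, 1, size=m)

# Count "significant" results at the 0.05 level; all are false positives here
false_positives = int((pvals < 0.05).sum())
# Expected count is m * 0.05 = 50; the simulated count will be close
```

Every rejection in this simulation is a false positive, because every null hypothesis is true by construction.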
The Multiple Comparisons Problem
In modern research, scientists often test thousands of hypotheses simultaneously. In genomics, researchers might test whether each of thousands of genes is associated with a disease. In neuroimaging, researchers test associations across millions of brain voxels. Traditional methods like the Bonferroni correction, which divides the significance level by the number of tests, are overly conservative in these settings. The Bonferroni approach controls the family-wise error rate but becomes too stringent for large-scale testing, missing true discoveries.
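The Bonferroni adjustment itself is a one-line computation, which makes its stringency at scale concrete. For the 1,000-test scenario above:

```python
alpha = 0.05   # desired family-wise error rate
m = 1000       # number of simultaneous tests

# Bonferroni: each individual test must clear alpha / m
bonferroni_threshold = alpha / m   # 0.00005
```

A p-value must fall below 0.00005 to be declared significant, which is why true but moderate effects are routinely missed under this correction.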
How FDR Works
FDR controls the expected proportion of false discoveries among all rejected hypotheses. If FDR is set to 0.05, it means that among all tests you call significant, approximately 5% are expected to be false positives. This is fundamentally different from traditional significance levels, which control the probability of a single false positive. The Benjamini-Hochberg procedure implements FDR control with a step-up rule: sort the m p-values in ascending order, find the largest rank k such that the k-th smallest p-value satisfies p(k) ≤ (k/m)·q, where q is the desired FDR level, and reject the k smallest p-values.
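The Benjamini-Hochberg step-up procedure can be sketched in a few lines of NumPy (a minimal illustration, not a production implementation; libraries such as SciPy and statsmodels ship tested versions):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q
    using the Benjamini-Hochberg step-up procedure."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)                      # indices of p-values, smallest first
    ranked = p[order]
    thresholds = (np.arange(1, m + 1) / m) * q # BH critical values (i/m) * q
    below = ranked <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])       # largest rank meeting its threshold
        reject[order[: k + 1]] = True          # reject all p-values up to rank k
    return reject
```

For example, with p-values [0.01, 0.02, 0.03, 0.5, 0.8] and q = 0.05, the critical values are 0.01, 0.02, 0.03, 0.04, 0.05; the third-smallest p-value (0.03) is the last to meet its threshold, so the three smallest p-values are rejected.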
FDR vs. Traditional Methods
Compared to stricter corrections:
- Bonferroni Correction: Highly conservative, maintains family-wise error rate but often too stringent for modern applications
- Family-Wise Error Rate (FWER): Controls probability of any false positive anywhere, sacrificing statistical power
- False Discovery Rate: More powerful than FWER methods while still controlling the expected proportion of false positives among rejections, ideal for exploratory research
Applications and Advantages
FDR has become the standard in genomics and gene expression studies, where thousands of genes are tested simultaneously. It's equally valuable in neuroimaging analysis, microarray experiments, and psychological research involving multiple tests. The primary advantage is maintaining statistical power while controlling false discoveries, enabling researchers to make meaningful discoveries in high-dimensional data without being overwhelmed by false positives.
Related Questions
What is the difference between FDR and p-value?
A p-value represents the probability of observing results as extreme as or more extreme than those observed under the null hypothesis for a single test. FDR, conversely, controls the expected proportion of false discoveries among multiple rejected hypotheses, making it applicable when conducting many tests simultaneously.
What does an FDR of 0.05 mean?
An FDR of 0.05 means that among all tests you declare significant, approximately 5% are expected to be false positives. This is different from a p-value of 0.05, which addresses a single test, not multiple comparisons.
Why is FDR important in genomics?
In genomics, researchers test thousands of genes simultaneously. FDR control allows researchers to manage false positive rates efficiently while maintaining enough statistical power to detect true genetic associations, which would be impossible with stricter corrections like Bonferroni.