Published in: European Journal of Epidemiology 2/2013

01-02-2013 | METHODS

Distinguishing true from false positives in genomic studies: p values

Authors: Linda Broer, Christina M. Lill, Maaike Schuur, Najaf Amin, Johannes T. Roehr, Lars Bertram, John P. A. Ioannidis, Cornelia M. van Duijn



Abstract

Distinguishing true from false positive findings is a major challenge in human genetic epidemiology. Several strategies have been devised to facilitate this, including the positive predictive value (PPV) and a set of epidemiological criteria known as the “Venice” criteria. The PPV measures the probability of a true association given a statistically significant finding, while the Venice criteria grade credibility based on the amount of evidence, consistency of replication and protection from bias. The vast majority of journals use significance thresholds to identify true positive findings. We studied the effect of p value thresholds on the PPV and used the PPV and Venice criteria to define usable thresholds of statistical significance. Theoretical and empirical analyses of data published on AlzGene show that at a nominal p value threshold of 0.05 most “positive” findings will turn out to be false if the prior probability of association is below 0.10, even if the statistical power of the study is higher than 0.80. However, in underpowered studies (power of 0.25) with a low prior probability of 1 × 10⁻³, a p value of 1 × 10⁻⁵ yields a high PPV (>96%). We show that a p value threshold of 1 × 10⁻⁵ gives very strong evidence of association in almost all studies. However, when the prior probability of association is very high (0.50), a p value threshold of 0.05 may be sufficient, while for studies with a very low prior probability of association (1 × 10⁻⁴; genome-wide association studies, for instance) 1 × 10⁻⁷ may serve as a useful threshold to declare significance.
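The relationship between prior probability, power, significance threshold and PPV can be illustrated with a short calculation. The sketch below assumes the common textbook form PPV = (power × prior) / (power × prior + α × (1 − prior)), with no adjustment for bias; the abstract does not state the exact formula used in the paper, so the function name `ppv` and this formula are illustrative assumptions, not the authors' method.

```python
def ppv(power: float, prior: float, alpha: float) -> float:
    """Positive predictive value of a significant finding.

    Assumed formula (not taken from the paper):
        PPV = power * prior / (power * prior + alpha * (1 - prior))
    power: probability of detecting a true association
    prior: prior probability that the association is true
    alpha: p value threshold used to declare significance
    """
    true_positives = power * prior          # true associations that reach significance
    false_positives = alpha * (1 - prior)   # null associations that reach significance
    return true_positives / (true_positives + false_positives)


# Scenarios mentioned in the abstract, evaluated under the assumed formula.
underpowered = ppv(power=0.25, prior=1e-3, alpha=1e-5)   # about 0.96
high_prior = ppv(power=0.80, prior=0.50, alpha=0.05)     # about 0.94
low_prior_nominal = ppv(power=0.80, prior=0.05, alpha=0.05)  # below 0.5

print(f"power 0.25, prior 1e-3, alpha 1e-5: PPV = {underpowered:.3f}")
print(f"power 0.80, prior 0.50, alpha 0.05: PPV = {high_prior:.3f}")
print(f"power 0.80, prior 0.05, alpha 0.05: PPV = {low_prior_nominal:.3f}")
```

Under this simplified formula, the underpowered scenario reproduces the PPV above 96% quoted in the abstract, while a well-powered study at a threshold of 0.05 falls below a PPV of 0.5 once the prior drops to 0.05.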
Metadata
Publisher
Springer Netherlands
Published in
European Journal of Epidemiology / Issue 2/2013
Print ISSN: 0393-2990
Electronic ISSN: 1573-7284
DOI
https://doi.org/10.1007/s10654-012-9755-x
