Published in: European Journal of Epidemiology 5/2010

01-05-2010 | COMMENTARY

Don’t split your data

Authors: Henrik Källberg, Lars Alfredsson, Maria Feychting, Anders Ahlbom


Abstract

False positive findings are a common problem in whole-genome association studies. In this commentary we show that nothing is gained by randomly splitting a data sample into two equal-sized subsets, where the first subset is used for exploratory purposes and the second is used to confirm the findings from the first. We compare the random splitting procedure with analysis of the full data sample from a Bayesian perspective, taking into account the prior probability of a false positive finding.
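The comparison described in the abstract can be illustrated with a small numerical sketch. This is not the authors' calculation: it assumes a one-sided z-test with a standardized effect size, requires both halves of the split sample to be significant, and gives the full-sample test the same overall type I error rate. The posterior probability that a significant finding is false is computed from the significance level, the power, and the prior probability of a true association. The function names and all input values (`effect`, `n`, `alpha_stage`, `prior`) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def power_one_sided(effect, n, alpha):
    """Power of a one-sided z-test for a standardized effect with n observations."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - effect * n ** 0.5)


def false_positive_probability(alpha, power, prior):
    """P(no true association | significant result), by Bayes' theorem.

    prior is the assumed prior probability that the association is real.
    """
    false_pos = alpha * (1 - prior)
    true_pos = power * prior
    return false_pos / (false_pos + true_pos)


def compare(effect=0.1, n=1000, alpha_stage=0.01, prior=0.001):
    """Compare a split-sample design with a full-sample design (illustrative inputs)."""
    # Split design: each half is tested at alpha_stage and both must be
    # significant, so the overall type I error rate is alpha_stage squared
    # (the two halves are independent).
    alpha_overall = alpha_stage ** 2
    power_split = power_one_sided(effect, n // 2, alpha_stage) ** 2

    # Full design: a single test on all n observations at the same overall alpha.
    power_full = power_one_sided(effect, n, alpha_overall)

    return {
        "power_split": power_split,
        "power_full": power_full,
        "fpp_split": false_positive_probability(alpha_overall, power_split, prior),
        "fpp_full": false_positive_probability(alpha_overall, power_full, prior),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the full-sample test has higher power at the same overall significance level, so its probability of a false positive finding is lower than that of the split design. Other test choices or stage-specific thresholds would need their own calculation.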
Metadata
Title
Don’t split your data
Authors
Henrik Källberg
Lars Alfredsson
Maria Feychting
Anders Ahlbom
Publication date
01-05-2010
Publisher
Springer Netherlands
Published in
European Journal of Epidemiology / Issue 5/2010
Print ISSN: 0393-2990
Electronic ISSN: 1573-7284
DOI
https://doi.org/10.1007/s10654-010-9447-3
