01-10-2010 | Letter to the Editor
Do split your epidemiological data
Authors:
Fredrik A. Dahl, Jūratė Šaltytė Benth
Published in: European Journal of Epidemiology | Issue 10/2010
Excerpt
This letter is a response to the commentary of Kallberg et al. [1], in which they argue against data splitting as a way of protecting against false positive discoveries in scientific studies. When reading it, we were rather surprised that it failed to refer to article [2], which was published relatively recently in this journal and discusses the same issue. Kallberg et al. analyze a two-stage testing procedure that defines a finding as valid if it is statistically significant at the 5% level in each part of the data. They correctly argue that this trivially gives a significance level of (0.05)² = 0.0025, and that there exist more powerful tests with the same significance level. On closer reading, Kallberg et al. appear to discuss the analysis of genomic data, a field in which this two-stage hypothesis testing procedure is indeed sometimes used. But why, then, is this published in a journal on epidemiology? And how could they avoid any reference to [2], which does discuss splitting of epidemiological data? …
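The arithmetic behind the two-stage procedure can be illustrated with a minimal sketch. It is not taken from [1] or [2]. It assumes a one-sided z-test with known unit variance, a standardized effect `delta`, and a sample of size `n` split into two independent halves of equal size. The function names `split_power` and `pooled_power` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def split_power(delta, n, alpha_stage=0.05):
    """Rejection probability when BOTH halves must be significant.

    Assumption: each half holds n/2 observations, so its z-statistic
    is normal with mean delta * sqrt(n/2) and unit variance.
    """
    crit = STD_NORMAL.inv_cdf(1 - alpha_stage)
    shift = delta * (n / 2) ** 0.5
    per_half = 1 - STD_NORMAL.cdf(crit - shift)
    # The halves are assumed independent, so the probabilities multiply.
    return per_half ** 2


def pooled_power(delta, n, alpha=0.0025):
    """Rejection probability of one z-test on all n observations."""
    crit = STD_NORMAL.inv_cdf(1 - alpha)
    shift = delta * n ** 0.5
    return 1 - STD_NORMAL.cdf(crit - shift)


if __name__ == "__main__":
    # With delta = 0 both procedures reject with probability 0.0025.
    print(split_power(0.0, 100), pooled_power(0.0, 100))
    # With a hypothetical effect of 0.3 the pooled test rejects more often.
    print(split_power(0.3, 100), pooled_power(0.3, 100))
```

Under these assumptions, setting `delta` to zero recovers the 0.0025 significance level for both procedures, while for a positive effect the single pooled test at level 0.0025 has the higher power, which is the sense in which more powerful tests with the same significance level exist.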