Published in: BMC Medical Research Methodology 1/2011

Open Access 01-12-2011 | Research article

Stratified sampling design and loss to follow-up in survival models: evaluation of efficiency and bias

Authors: Cibele C César, Marilia S Carvalho

Abstract

Background

Longitudinal studies often employ complex sample designs to optimize sample size, over-representing population groups of interest. The effect of sample design on parameter estimates is quite often ignored, particularly when fitting survival models. Another major problem in long-term cohort studies is the potential bias due to loss to follow-up.

Methods

In this paper we simulated a dataset with approximately 50,000 individuals as the target population and 15,000 participants to be followed up for 40 years, both based on real cohort studies of cardiovascular diseases. We analyzed two sampling strategies - simple random sampling (our gold standard) and sampling stratified by professional group, with non-proportional allocation - and two loss to follow-up scenarios - non-informative censoring and losses related to the professional group.
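
As a rough illustration of what non-proportional allocation implies for the sample weights, the sketch below draws a stratified sample and attaches to each unit the design weight N_h / n_h (stratum population size over stratum sample size). The stratum labels, stratum sizes and the equal allocation of 5,000 per stratum are hypothetical; only the totals of 50,000 and 15,000 come from the text above, and this is not the authors' simulation code.

```python
import random


def stratified_sample(population, allocation, seed=0):
    """Draw a stratified sample without replacement and attach design weights.

    population: dict mapping stratum label -> list of unit ids
    allocation: dict mapping stratum label -> number of units to draw
    Returns a list of (unit_id, stratum, weight) with weight = N_h / n_h.
    """
    rng = random.Random(seed)
    sample = []
    for stratum, units in population.items():
        n_h = allocation[stratum]
        weight = len(units) / n_h  # inverse of the sampling fraction
        for unit in rng.sample(units, n_h):
            sample.append((unit, stratum, weight))
    return sample


# Hypothetical professional groups; sizes sum to 50,000 (assumed split).
sizes = {"group_a": 30000, "group_b": 14000, "group_c": 6000}
population = {}
next_id = 0
for label, size in sizes.items():
    population[label] = list(range(next_id, next_id + size))
    next_id += size

# Non-proportional allocation: equal sample size per stratum (assumed),
# so the smallest group is heavily over-represented.
allocation = {label: 5000 for label in sizes}
sample = stratified_sample(population, allocation)
```

Under this allocation the weights differ sharply between strata (6.0, 2.8 and 1.2 here), and they sum to the size of the target population.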

Results

Two modeling approaches were evaluated: weighted and non-weighted fits. Our results indicate that, under the correctly specified model, ignoring the sample weights does not affect the estimates. However, both the model that ignored the interaction between sample strata and the variable of interest and the crude estimates were highly biased.
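
A much simpler analogue of this contrast can be seen with a plain proportion instead of a survival model. The sketch below compares an unweighted and a weighted event proportion in a sample with non-proportional allocation; within each stratum the two agree, but the pooled (crude) figures do not. All stratum sizes and event counts are made up for illustration and are not taken from the study.

```python
def weighted_proportion(records):
    """records: iterable of (event, weight); returns the weighted event proportion."""
    records = list(records)
    total_weight = sum(w for _, w in records)
    return sum(e * w for e, w in records) / total_weight


# Hypothetical strata: (population size N_h, sample size n_h, events in sample)
strata = [(30000, 5000, 500), (14000, 5000, 1000), (6000, 5000, 2000)]

records = []
for N_h, n_h, events in strata:
    w = N_h / n_h  # design weight for every unit sampled from this stratum
    records += [(1, w)] * events + [(0, w)] * (n_h - events)

# Crude estimate that ignores the design: every sampled unit counts equally.
unweighted = sum(e for e, _ in records) / len(records)

# Design-weighted estimate: each unit stands for N_h / n_h population members.
weighted = weighted_proportion(records)
```

With these made-up numbers the unweighted proportion is about 0.233, whereas the weighted proportion is 0.164, the value implied for the whole population by the stratum-specific rates. The over-sampled high-risk stratum pulls the crude figure upward.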

Conclusions

In epidemiological studies, model misspecification should always be considered, because sources of variability that are related to the individuals and not captured by the covariates are always present. Allowance must therefore be made for unknown confounders and for interactions with the main variable of interest. We strongly recommend always correcting for sample weights.
Metadata
Title
Stratified sampling design and loss to follow-up in survival models: evaluation of efficiency and bias
Authors
Cibele C César
Marilia S Carvalho
Publication date
01-12-2011
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2011
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-11-99