Published in: Population Health Metrics 1/2012

Open Access 01-12-2012 | Research

The influence of measurement error on calibration, discrimination, and overall estimation of a risk prediction model

Authors: Laura C Rosella, Paul Corey, Therese A Stukel, Cam Mustard, Jan Hux, Doug G Manuel

Abstract

Background

Self-reported height and weight are commonly collected at the population level; however, they can be subject to measurement error. The impact of this error on predicted risk, discrimination, and calibration of a model that uses body mass index (BMI) to predict risk of diabetes incidence is not known. The objective of this study is to use simulation to quantify and describe the effect of random and systematic error in self-reported height and weight on the performance of a model for predicting diabetes.

Methods

Two general categories of error were examined, random (nondirectional) and systematic (directional), for their effect on an algorithm relating BMI (kg/m²) to the probability of developing diabetes. The cohort used to develop the risk algorithm was derived from 23,403 Ontario residents who responded to the 1996/1997 National Population Health Survey, linked to a population-based diabetes registry. The data and algorithm were then simulated to estimate the impact of these errors on predicted risk, calibration (Hosmer-Lemeshow goodness-of-fit χ²), and discrimination (C-statistic). Simulations were run 500 times, with sample sizes of 9,177 for males and 10,618 for females.
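
The kind of simulation described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the height and weight distributions, the logistic risk curve and its coefficients, the error magnitudes, the sample size, and all function names are hypothetical choices made for the example, and a single run stands in for the 500 repetitions.

```python
import numpy as np


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    sorter = np.argsort(x, kind="mergesort")
    inverse = np.empty(len(x), dtype=int)
    inverse[sorter] = np.arange(len(x))
    x_sorted = x[sorter]
    is_new = np.r_[True, x_sorted[1:] != x_sorted[:-1]]
    dense = is_new.cumsum()[inverse]
    bounds = np.r_[np.nonzero(is_new)[0], len(x)]
    return 0.5 * (bounds[dense] + bounds[dense - 1] + 1)


def c_statistic(risk, outcome):
    """Probability that a random case outranks a random non-case (ties count half)."""
    n_cases = int(outcome.sum())
    n_noncases = len(outcome) - n_cases
    u = average_ranks(risk)[outcome == 1].sum() - n_cases * (n_cases + 1) / 2
    return u / (n_cases * n_noncases)


def hosmer_lemeshow(risk, outcome, groups=10):
    """Hosmer-Lemeshow chi-square over equal-sized groups of predicted risk."""
    chi2 = 0.0
    for idx in np.array_split(np.argsort(risk), groups):
        expected = risk[idx].sum()
        observed = outcome[idx].sum()
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / len(idx)))
    return chi2


def predicted_risk(weight_kg, height_m, intercept=-7.0, slope=0.15):
    """Hypothetical logistic risk curve in BMI; coefficients are illustrative only."""
    bmi = weight_kg / height_m ** 2
    return 1.0 / (1.0 + np.exp(-(intercept + slope * bmi)))


def simulate(n=20_000, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed "true" anthropometry; not taken from the survey data.
    height = rng.normal(1.70, 0.08, n)                        # metres
    weight = np.clip(rng.normal(78.0, 13.0, n), 40.0, None)   # kilograms
    # Outcomes are generated from the error-free risk.
    outcome = (rng.random(n) < predicted_risk(weight, height)).astype(int)

    reported = {
        "no error": (weight, height),
        # Random error: mean-zero noise on both measurements (assumed SDs).
        "random": (weight + rng.normal(0, 10.0, n), height + rng.normal(0, 0.05, n)),
        # Systematic error: weight under-reported, height over-reported (assumed shifts).
        "systematic": (weight - 3.0, height + 0.01),
    }
    results = {}
    for label, (w, h) in reported.items():
        risk = predicted_risk(w, h)
        results[label] = {
            "mean_risk": risk.mean(),
            "c": c_statistic(risk, outcome),
            "hl": hosmer_lemeshow(risk, outcome),
        }
    return results


results = simulate()
for label, r in results.items():
    print(f"{label:>10}: mean risk {r['mean_risk']:.4f}  C {r['c']:.3f}  HL chi2 {r['hl']:.1f}")
```

Each scenario scores the same simulated outcomes with risks computed from differently reported height and weight, so differences in mean predicted risk, C-statistic, and Hosmer-Lemeshow χ² across rows reflect the injected error alone. Repeating `simulate` over many seeds would give a rough analogue of the repeated simulations described above.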

Results

Simulated data successfully reproduced the discrimination and calibration obtained from the population data. Increasing levels of random error in height and weight reduced both the calibration and the discrimination of the model. Random error biased predicted risk upward, whereas systematic error biased predicted risk in the direction of the error and reduced calibration but did not affect discrimination.
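
One way to see why mean-zero error could push predicted risk upward is Jensen's inequality. The sketch below is an illustration and not the authors' derivation. It assumes additive errors ε_W and ε_H that have mean zero and are independent of each other and of true weight W and height H, a positive reported height, and, purely for illustration, a logistic risk curve p(b) with slope β > 0 (the abstract does not give the algorithm's functional form).

```latex
\begin{align*}
\mathrm{BMI}^{*} &= \frac{W + \varepsilon_W}{(H + \varepsilon_H)^{2}}, \qquad
\mathbb{E}[\varepsilon_W] = \mathbb{E}[\varepsilon_H] = 0, \quad
\operatorname{Var}(\varepsilon_H) = \sigma_H^{2} \\
\mathbb{E}\!\left[\mathrm{BMI}^{*} \mid W, H\right]
  &= W \, \mathbb{E}\!\left[(H + \varepsilon_H)^{-2}\right]
  \;\ge\; \frac{W}{H^{2}}
  && \text{($x \mapsto x^{-2}$ is convex for $x > 0$)} \\
\mathbb{E}\!\left[\mathrm{BMI}^{*} \mid W, H\right]
  &\approx \frac{W}{H^{2}} \left(1 + \frac{3\sigma_H^{2}}{H^{2}}\right)
  && \text{(second-order Taylor expansion)} \\
p(b) &= \frac{1}{1 + e^{-(\alpha + \beta b)}}, \qquad
p''(b) = \beta^{2}\, p(b)\,\bigl(1 - p(b)\bigr)\,\bigl(1 - 2p(b)\bigr) > 0
  \ \text{ when } p(b) < \tfrac{1}{2} \\
\mathbb{E}\!\left[p(\mathrm{BMI}^{*}) \mid W, H\right]
  &\ge p\!\left(\mathbb{E}[\mathrm{BMI}^{*} \mid W, H]\right)
  \ge p\!\left(\frac{W}{H^{2}}\right)
  && \text{(if $p$ stays below $\tfrac{1}{2}$ over the range of $\mathrm{BMI}^{*}$)}
\end{align*}
```

Under these assumptions, error in height inflates the expected BMI and the convexity of the risk curve at low risk inflates the expected prediction further, which is consistent with the upward bias reported for random error.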

Conclusion

This study demonstrates that random and systematic errors in self-reported health data have the potential to influence the performance of risk algorithms. Further research that quantifies the amount and direction of error can improve model performance by allowing for adjustments in exposure measurements.
Metadata
Title
The influence of measurement error on calibration, discrimination, and overall estimation of a risk prediction model
Authors
Laura C Rosella
Paul Corey
Therese A Stukel
Cam Mustard
Jan Hux
Doug G Manuel
Publication date
01-12-2012
Publisher
BioMed Central
Published in
Population Health Metrics / Issue 1/2012
Electronic ISSN: 1478-7954
DOI
https://doi.org/10.1186/1478-7954-10-20
