Published in: European Radiology 3/2010

01-03-2010 | Computer Applications

Bias, underestimation of risk, and loss of statistical power in patient-level analyses of lesion detection

Authors: Nancy A. Obuchowski, Peter J. Mazzone, Abraham H. Dachman

Abstract

Purpose

Sensitivity and the false positive rate are usually defined with the patient as the unit of observation, i.e., the diagnostic test detects or does not detect disease in a patient. For tests designed to find and diagnose lesions, e.g., lung nodules, the usual definitions of sensitivity and specificity may be misleading. In this paper we describe and compare five measures of accuracy of lesion detection.

Methods

The five levels of evaluation considered were patient level without localization, patient level with localization, region of interest (ROI) level without localization, ROI level with localization, and lesion level.

Results

We found that estimators of sensitivity that do not require the reader to correctly locate the lesion overstate sensitivity. Patient-level estimators of sensitivity can be misleading when there is more than one lesion per patient, and they reduce study power. Patient-level estimators of the false positive rate can conceal important differences between techniques. Referring clinicians rely on a test’s reported accuracy both to choose the appropriate test and to plan management for their patients. If reported sensitivity is overstated, the clinician could choose the test for disease screening and have false confidence that a negative test represents the true absence of lesions. Similarly, the lower false positive rate associated with patient-level estimators can mislead clinicians about the diagnostic value of the test and consequently give them false confidence that a positive finding is real.
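A small worked example may help show why the level of evaluation matters for sensitivity. The sketch below is illustrative only: the five patients, the lung-lobe location labels, the `Patient` class, and the three estimator functions are all hypothetical, and the matching rule (a reader mark counts as a hit only if its label equals a true lesion's label) is an assumption, not the definition used in the paper. ROI-level estimators are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    true_lesions: set  # lesion locations per the reference standard (hypothetical labels)
    marks: set         # locations flagged by the reader (hypothetical labels)


def patient_sens_no_localization(patients):
    """Diseased patient counts as detected if the reader marked anything at all."""
    diseased = [p for p in patients if p.true_lesions]
    hits = sum(1 for p in diseased if p.marks)
    return hits / len(diseased)


def patient_sens_with_localization(patients):
    """Diseased patient counts as detected only if a mark matches a true lesion."""
    diseased = [p for p in patients if p.true_lesions]
    hits = sum(1 for p in diseased if p.marks & p.true_lesions)
    return hits / len(diseased)


def lesion_sens(patients):
    """Fraction of all true lesions that were correctly located."""
    total = sum(len(p.true_lesions) for p in patients)
    found = sum(len(p.marks & p.true_lesions) for p in patients)
    return found / total


# Toy data (assumed for illustration, not taken from the study).
patients = [
    Patient({"RUL", "LLL"}, {"RUL"}),         # one of two lesions found
    Patient({"RML"}, {"LUL"}),                # marked the wrong location
    Patient({"LLL", "RLL", "RUL"}, {"LLL"}),  # one of three lesions found
    Patient({"RUL"}, set()),                  # lesion missed entirely
    Patient(set(), {"RML"}),                  # no disease, one false mark
]

print(f"patient level, no localization:   {patient_sens_no_localization(patients):.2f}")    # 0.75
print(f"patient level, with localization: {patient_sens_with_localization(patients):.2f}")  # 0.50
print(f"lesion level:                     {lesion_sens(patients):.2f}")                     # 0.29
```

In this toy data set the same reader marks give a sensitivity of 0.75, 0.50, or 0.29 depending on the level of evaluation. The second patient is credited as a detection without localization even though the actual lesion was missed, and the patients with multiple lesions count as fully detected at the patient level after only one lesion is found.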

Conclusion

We present clear recommendations for studies assessing and comparing the accuracy of tests tasked with the detection and interpretation of lesions...
Metadata
Title
Bias, underestimation of risk, and loss of statistical power in patient-level analyses of lesion detection
Authors
Nancy A. Obuchowski
Peter J. Mazzone
Abraham H. Dachman
Publication date
01-03-2010
Publisher
Springer-Verlag
Published in
European Radiology / Issue 3/2010
Print ISSN: 0938-7994
Electronic ISSN: 1432-1084
DOI
https://doi.org/10.1007/s00330-009-1590-4
