Published in: BMC Medical Informatics and Decision Making 1/2020

Open Access 01-12-2020 | Research article

A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms

Authors: André M. Carrington, Paul W. Fieguth, Hammad Qazi, Andreas Holzinger, Helen H. Chen, Franz Mayr, Douglas G. Manuel


Abstract

Background

In classification and diagnostic testing, the receiver-operator characteristic (ROC) plot and the area under the ROC curve (AUC) describe how an adjustable threshold causes changes in two types of error: false positives and false negatives. When they are used with imbalanced data, however, only part of the ROC curve and AUC is informative. Hence, alternatives to the AUC have been proposed, such as the partial AUC and the area under the precision-recall curve. However, these alternatives cannot be interpreted as fully as the AUC, in part because they ignore some information about actual negatives.
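To make the threshold sweep concrete, the following minimal Python sketch builds an empirical ROC curve for a small made-up data set and checks that the trapezoidal AUC matches the c statistic (the fraction of positive-negative pairs ranked correctly, with ties counted as one half). The labels, scores, and function names are illustrative assumptions and do not come from the paper.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from (0, 0) to (1, 1), one per distinct threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # An instance is predicted positive when its score is at least t.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        pts.append((fp / n_neg, tp / n_pos))
    return pts


def auc_trapezoid(pts):
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))


def c_statistic(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 actual positives and 4 actual negatives.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

pts = roc_points(labels, scores)
auc = auc_trapezoid(pts)
c = c_statistic(labels, scores)
print(auc, c)
```

Each step along the curve trades one error type for the other: lowering the threshold removes false negatives (TPR rises) at the cost of more false positives (FPR rises).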

Methods

We derive and propose a new concordant partial AUC and a new partial c statistic for ROC data, as foundational measures and methods to help understand and explain parts of the ROC plot and AUC. Our partial measures are continuous and discrete versions of the same measure, are derived from the AUC and c statistic respectively, are validated as equal to each other, and are validated as equal in summation to the whole measures where expected. Our partial measures are tested for validity on a classic ROC example from Fawcett, a variation thereof, and two real-life benchmark data sets in breast cancer: the Wisconsin and Ljubljana data sets. An interpretation of an example is then provided.
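The "equal in summation" property can be illustrated with a short Python sketch. The abstract does not state the formula, so the sketch assumes that the concordant partial AUC of a curve segment is half the vertical partial area (TPR integrated over the segment's FPR range) plus half the horizontal partial area (TNR integrated over the segment's TPR range). The ROC points are made up, and delimiting parts at ROC vertices is a simplification chosen for this sketch rather than the authors' procedure.

```python
def concordant_partial_auc(pts, i, j):
    """Assumed definition: 0.5 * vertical area + 0.5 * horizontal area
    for the part of the ROC curve between vertices i and j (inclusive)."""
    seg = pts[i:j + 1]
    pairs = list(zip(seg, seg[1:]))
    # Vertical area: TPR integrated over FPR (area under the curve segment).
    vertical = sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in pairs)
    # Horizontal area: TNR = 1 - FPR integrated over TPR (area to the right).
    horizontal = sum((y2 - y1) * ((1 - x1) + (1 - x2)) / 2
                     for (x1, y1), (x2, y2) in pairs)
    return 0.5 * vertical + 0.5 * horizontal


# Hypothetical empirical ROC curve as (FPR, TPR) vertices.
pts = [(0.0, 0.0), (0.0, 0.25), (0.0, 0.5), (0.25, 0.5), (0.25, 0.75),
       (0.5, 0.75), (0.75, 0.75), (0.75, 1.0), (1.0, 1.0)]

# Split the curve into three consecutive parts that share boundary vertices.
parts = [concordant_partial_auc(pts, 0, 3),
         concordant_partial_auc(pts, 3, 6),
         concordant_partial_auc(pts, 6, 8)]
whole = concordant_partial_auc(pts, 0, len(pts) - 1)
print(parts, sum(parts), whole)
```

Under this assumed definition the three parts sum to the whole-curve value, which in turn equals the ordinary trapezoidal AUC of the curve, because the vertical and horizontal areas of the full curve are both equal to the AUC.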

Results

Results show the expected equalities between our new partial measures and the existing whole measures. The example interpretation illustrates the need for our newly derived partial measures.

Conclusions

The concordant partial area under the ROC curve was proposed and, unlike previous partial measure alternatives, it maintains the characteristics of the AUC. The first partial c statistic for ROC plots was also proposed as an unbiased interpretation for part of an ROC curve. The expected equalities among and between our newly derived partial measures and their existing full measure counterparts are confirmed. These measures may be used with any data set, but this paper focuses on imbalanced data with low prevalence.

Future work

Future work with our proposed measures may demonstrate their value for imbalanced data with high prevalence, compare them to other measures not based on areas, and combine them with other ROC measures and techniques.
Footnotes
1. Since TNR = 1 − FPR, measures in terms of average TNR are easily translated to measures in average FPR and vice versa.
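As a worked illustration of this translation, suppose the average is taken over a TPR range $[y_1, y_2]$ (the symbols $y_1$, $y_2$ and the overbar notation are assumptions introduced here, not notation from the abstract). Averaging both sides of TNR = 1 − FPR over that range gives:

```latex
\[
\overline{\mathrm{TNR}}
  = \frac{1}{y_2 - y_1} \int_{y_1}^{y_2} \mathrm{TNR}(y)\,dy
  = \frac{1}{y_2 - y_1} \int_{y_1}^{y_2} \bigl(1 - \mathrm{FPR}(y)\bigr)\,dy
  = 1 - \overline{\mathrm{FPR}}
\]
```

For example, an average TNR of 0.9 over the range corresponds to an average FPR of 0.1 over the same range.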
 
Metadata
Title
A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms
Authors
André M. Carrington
Paul W. Fieguth
Hammad Qazi
Andreas Holzinger
Helen H. Chen
Franz Mayr
Douglas G. Manuel
Publication date
01-12-2020
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2020
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-019-1014-6
