Published in: Journal of Translational Medicine 1/2019

Open Access 01-12-2019 | Chronic Kidney Disease | Research

Comparison and development of machine learning tools in the prediction of chronic kidney disease progression

Authors: Jing Xiao, Ruifeng Ding, Xiulin Xu, Haochen Guan, Xinhui Feng, Tao Sun, Sibo Zhu, Zhibin Ye

Abstract

Background

Urinary protein quantification is critical for assessing the severity of chronic kidney disease (CKD). However, CKD severity is currently determined by measuring 24-h urinary protein, which is inconvenient during follow-up.

Objective

To quickly predict the severity of CKD using more easily available demographic and blood biochemical features during follow-up, we developed and compared several predictive models using statistical, machine learning and neural network approaches.

Methods

The clinical and blood biochemical results from 551 patients with proteinuria were collected. Thirteen blood-derived tests and 5 demographic features were used as non-urinary clinical variables to predict the 24-h urinary protein outcome. Nine predictive models were established and compared: logistic regression, Elastic Net, lasso regression, ridge regression, support vector machine, random forest, XGBoost, neural network and k-nearest neighbor. The area under the receiver operating characteristic curve (AUC), sensitivity (recall), specificity, accuracy, log-loss and precision of each model were evaluated. The effect sizes of each variable were analysed and ranked.
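
As a rough illustration of the six evaluation metrics listed above, the following Python sketch computes them from true labels and predicted probabilities for a binary outcome. The function name, the 0.5 decision threshold and the toy data are assumptions made for illustration only. None of them is taken from the study, and this is not the authors' code.

```python
import math


def evaluate_binary_classifier(y_true, y_prob, threshold=0.5):
    """Hypothetical helper: compute AUC, sensitivity, specificity,
    accuracy, precision and log-loss for a binary classifier.

    y_true: list of 0/1 labels (1 = positive class).
    y_prob: list of predicted probabilities for the positive class.
    threshold: assumed cut-off for turning probabilities into labels.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    # Log-loss: mean negative log-likelihood, with probabilities clipped
    # away from 0 and 1 so that log() stays finite.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    # AUC as the fraction of (positive, negative) pairs in which the
    # positive case receives the higher score; ties count as half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos for b in neg
    )

    return {
        "auc": ratio(wins, len(pos) * len(neg)),
        "sensitivity": ratio(tp, tp + fn),   # recall, true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "accuracy": ratio(tp + tn, len(y_true)),
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "log_loss": log_loss,
    }


# Toy data (invented): four positive and four negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
metrics = evaluate_binary_classifier(y_true, y_prob)
# Here sensitivity, specificity, precision and accuracy are all 0.75,
# and the AUC is 13/16 = 0.8125.
print(metrics)
```

Sensitivity, specificity, accuracy and precision depend on the chosen threshold, whereas AUC and log-loss are computed from the predicted probabilities directly, which is one reason to report them side by side when comparing models.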

Results

The linear models (Elastic Net, lasso regression, ridge regression and logistic regression) showed the highest overall predictive power, with an average AUC above 0.87 and a precision above 0.8. Logistic regression ranked first, reaching an AUC of 0.873, with a sensitivity of 0.83 and a specificity of 0.82. The model with the highest sensitivity was Elastic Net (0.85), while XGBoost showed the highest specificity (0.83). In the effect size analyses, ALB, Scr, TG, LDL and eGFR had important impacts on the predictive performance of the models, while other predictors such as CRP, HDL and SNA were less important.

Conclusions

Blood-derived tests could be applied as non-urinary predictors during outpatient follow-up. Features in routine blood tests, including ALB, Scr, TG, LDL and eGFR levels, showed predictive ability for CKD severity. The developed online tool can facilitate the prediction of proteinuria progression during follow-up in clinical practice.
Metadata
Title
Comparison and development of machine learning tools in the prediction of chronic kidney disease progression
Authors
Jing Xiao
Ruifeng Ding
Xiulin Xu
Haochen Guan
Xinhui Feng
Tao Sun
Sibo Zhu
Zhibin Ye
Publication date
01-12-2019
Publisher
BioMed Central
Published in
Journal of Translational Medicine / Issue 1/2019
Electronic ISSN: 1479-5876
DOI
https://doi.org/10.1186/s12967-019-1860-0
