Published in: BMC Medical Informatics and Decision Making 1/2023

Open Access 01-12-2023 | Chronic Kidney Disease | Research

Risk factor mining and prediction of urine protein progression in chronic kidney disease: a machine learning-based study

Authors: Yufei Lu, Yichun Ning, Yang Li, Bowen Zhu, Jian Zhang, Yan Yang, Weize Chen, Zhixin Yan, Annan Chen, Bo Shen, Yi Fang, Dong Wang, Nana Song, Xiaoqiang Ding

Abstract

Background

Chronic kidney disease (CKD) is a global public health concern. To provide timely intervention for non-hospitalized high-risk patients and to allocate limited clinical resources rationally, it is important to mine the key risk factors when designing a CKD prediction model.

Methods

This study included data from 1,358 patients with CKD pathologically confirmed at Zhongshan Hospital between December 2017 and September 2020. A machine learning-based framework for CKD prediction and interpretation was proposed. From 100 candidate variables, 17 were selected for model construction using recursive feature elimination with logistic regression. Several machine learning classifiers, namely extreme gradient boosting, Gaussian naive Bayes, a neural network, ridge regression, and logistic regression (LR), were trained, and an ensemble model was developed to predict 24-hour urine protein. A global interpretation was used to determine the detailed relationship between the risk of CKD progression and these predictors, and a local interpretation was used for patient-specific analysis.
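The screening and ensemble steps described above can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic (generated with `make_classification` to match only the stated sample size and variable counts), the binary label is an assumed stand-in for the urine protein outcome, `GradientBoostingClassifier` is swapped in for extreme gradient boosting to avoid an extra dependency, and soft voting over three of the five classifiers is an assumption about how the ensemble might be combined.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort: 1,358 rows, 100 candidate variables.
# None of these values come from the study.
X, y = make_classification(
    n_samples=1358, n_features=100, n_informative=17, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Recursive feature elimination with logistic regression: repeatedly fit LR
# and drop the weakest features until 17 of the 100 remain.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=17, step=5)

# Soft-voting ensemble: average the predicted probabilities of the members.
# Gradient boosting here is a stand-in for XGBoost (assumption).
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("gnb", GaussianNB()),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)

# Scaling, screening, and the ensemble are fitted on the training split only.
model = make_pipeline(StandardScaler(), selector, ensemble)
model.fit(X_train, y_train)

n_selected = int(model.named_steps["rfe"].support_.sum())
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"selected features: {n_selected}, held-out AUC: {auc:.3f}")
```

Wrapping the scaler and selector in one pipeline keeps feature screening inside the training split, which avoids leaking test information into the selected feature set.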

Results

LR achieved the best performance among the single machine learning models, with an area under the curve (AUC) of 0.850. The ensemble model, constructed using the voting integration method, further improved the AUC to 0.856. The major predictors of moderate-to-severe severity were lower levels of 25-OH-vitamin, albumin, and transferrin in males, and higher levels of cystatin C.
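To make the distinction between a global and a local interpretation concrete, here is a minimal sketch for a logistic regression model. It is an illustration under stated assumptions, not the interpretation method used in the study: the data are synthetic, the three marker names are hypothetical, and the contribution formula (coefficient times deviation from the feature mean) is the standard additive decomposition of a linear model's log-odds. One synthetic marker is built so that lower values raise risk and another so that higher values raise risk, mirroring the directions reported above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Synthetic, standardized markers (hypothetical names, not study data).
names = ["protective_marker", "risk_marker", "noise_marker"]
X = rng.normal(size=(n, 3))

# Synthetic ground truth: lower protective_marker and higher risk_marker
# both increase the log-odds of the outcome; noise_marker has no effect.
logit = -1.5 * X[:, 0] + 1.0 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

clf = LogisticRegression(max_iter=1000).fit(X, y)
coef = clf.coef_[0]

# Per-patient contribution of each feature to the log-odds, measured
# relative to a patient with average feature values.
contrib = coef * (X - X.mean(axis=0))

# Global interpretation: average magnitude of each feature's contribution.
global_importance = np.abs(contrib).mean(axis=0)

# Local interpretation: contributions for one patient, which add up
# (together with the baseline) to that patient's predicted log-odds.
patient = 0
base = clf.intercept_[0] + coef @ X.mean(axis=0)
reconstructed = base + contrib[patient].sum()

for name, g, c in zip(names, global_importance, contrib[patient]):
    print(f"{name}: global={g:.3f}, patient {patient} contribution={c:+.3f}")
```

The sign of each coefficient gives the direction of the association, while the per-patient contributions show which markers push one individual's predicted risk up or down.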

Conclusions

Compared with single clinical indicators of kidney function (eGFR, Scr), the machine learning model proposed in this study improved the prediction accuracy of CKD progression by 17.6% and 24.6%, respectively, and improved the AUC by 0.250 and 0.236, respectively. Our framework can achieve good predictive interpretability and provide effective clinical decision support.
Metadata
Title
Risk factor mining and prediction of urine protein progression in chronic kidney disease: a machine learning-based study
Authors
Yufei Lu
Yichun Ning
Yang Li
Bowen Zhu
Jian Zhang
Yan Yang
Weize Chen
Zhixin Yan
Annan Chen
Bo Shen
Yi Fang
Dong Wang
Nana Song
Xiaoqiang Ding
Publication date
01-12-2023
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2023
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-023-02269-2
