BMC Medical Research Methodology

Open Access 01-12-2024 | Tuberculosis | Research

Interpretable machine learning in predicting drug-induced liver injury among tuberculosis patients: model development and validation study

Authors: Yue Xiao, Yanfei Chen, Ruijian Huang, Feng Jiang, Jifang Zhou, Tianchi Yang

Published in: BMC Medical Research Methodology | Issue 1/2024

Abstract

Background

The objective of this research was to create and validate an interpretable prediction model for drug-induced liver injury (DILI) during tuberculosis (TB) treatment.

Methods

A dataset of TB patients from Ningbo City was used to develop models with the eXtreme Gradient Boosting (XGBoost), random forest (RF), and least absolute shrinkage and selection operator (LASSO) logistic regression algorithms. Model performance was evaluated with several metrics, including the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPR), alongside decision curve analysis. The Shapley Additive exPlanations (SHAP) method was used to interpret the variable contributions of the best-performing model.
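As a rough illustration of what a Shapley-based attribution measures, the sketch below computes exact Shapley values for a tiny hand-written risk function. It is not the authors' pipeline: the function `toy_risk`, its coefficients, the example patient, and the baseline values are all hypothetical, and the brute-force enumeration used here is a different algorithm from the tree-specific approximations typically applied to XGBoost models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score with one interaction term (illustration only)."""
    age, alt, prior_liver_disease = z
    return (0.002 * age + 0.004 * alt + 0.3 * prior_liver_disease
            + 0.002 * alt * prior_liver_disease)


x = [60, 80, 1]         # hypothetical patient: age, ALT, prior liver disease
baseline = [47, 30, 0]  # hypothetical reference patient
phi = shapley_values(toy_risk, x, baseline)

for name, contribution in zip(["age", "ALT", "prior liver disease"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the additivity property that makes per-variable attributions straightforward to read.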

Results

A total of 7,071 TB patients were identified from the regional healthcare dataset. The study cohort had a median age of 47 years; 68.0% were male, and 16.3% developed DILI. Part of the high-dimensional propensity score (HDPS) method was used to identify relevant variables, yielding a total of 424 variables. From these, 37 variables were selected with LASSO for inclusion in a logistic model. The dataset was then split into training and validation sets in a 7:3 ratio. In the validation dataset, the XGBoost model performed better overall than the RF and LASSO logistic models, with an AUROC of 0.89, an AUPR of 0.75, an F1 score of 0.57, and a Brier score of 0.07. Both the SHAP analysis and the XGBoost model highlighted the contribution of baseline liver-related conditions such as DILI, drug-induced hepatitis (DIH), and fatty liver disease (FLD). Age, alanine transaminase (ALT), and total bilirubin (Tbil) were also linked to DILI status.
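For readers less familiar with the four validation metrics reported here, the following sketch shows one common way to compute each from true labels and predicted probabilities. The labels and scores are made-up toy values, the 0.5 threshold for F1 is an assumption (the abstract does not state the threshold used), and the AUPR is approximated by average precision, which may differ from the estimator the authors used.

```python
def auroc(y_true, y_score):
    """Probability that a random positive scores above a random negative.

    Tied scores count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve.

    Assumes no tied scores; averages the precision at each true positive.
    """
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    true_positives = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            total += true_positives / rank
    return total / sum(y_true)


def f1_at_threshold(y_true, y_score, threshold=0.5):
    """F1 score after converting probabilities to labels at a threshold."""
    pred = [int(s >= threshold) for s in y_score]
    tp = sum(p == 1 and y == 1 for p, y in zip(pred, y_true))
    fp = sum(p == 1 and y == 0 for p, y in zip(pred, y_true))
    fn = sum(p == 0 and y == 1 for p, y in zip(pred, y_true))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true, y_score):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


# Toy data (hypothetical): 1 = developed DILI, 0 = did not
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.1, 0.6, 0.4, 0.7, 0.3, 0.2, 0.05]

print("AUROC:", auroc(y_true, y_score))
print("AUPR (average precision):", average_precision(y_true, y_score))
print("F1 at 0.5:", f1_at_threshold(y_true, y_score))
print("Brier score:", brier_score(y_true, y_score))
```

AUROC and AUPR summarise ranking quality across all thresholds, F1 depends on the chosen threshold, and the Brier score reflects calibration as well as discrimination (lower is better). With an outcome prevalence of 16.3%, AUPR and F1 are generally more sensitive to performance on the minority class than AUROC is.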

Conclusion

In this study, XGBoost demonstrated better predictive performance than RF and LASSO logistic regression. Moreover, the SHAP method enhances the clinical understanding and potential application of the model. Further research requires external validation and more detailed feature integration.

Title: Interpretable machine learning in predicting drug-induced liver injury among tuberculosis patients: model development and validation study
Authors: Yue Xiao, Yanfei Chen, Ruijian Huang, Feng Jiang, Jifang Zhou, Tianchi Yang
Publication date: 01-12-2024
Publisher: BioMed Central
Keywords: Tuberculosis, Fatty Liver
Published in: BMC Medical Research Methodology / Issue 1/2024
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-024-02214-5
