Published in: European Journal of Medical Research 1/2023

Open Access 01-12-2023 | Breast Cancer | Research

Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy

Authors: Yudi Jin, Ailin Lan, Yuran Dai, Linshan Jiang, Shengchun Liu

Abstract

Background

Breast cancer (BC) is the most common malignant tumor worldwide. Timely detection of tumor progression after treatment could improve patients' survival outcomes. This study aimed to develop machine learning models to predict events in BC patients after treatment. An event was defined as (1) the first local, regional, or distant tumor relapse; (2) a diagnosis of a secondary malignant tumor; or (3) death from any cause.

Methods

Patients whose response to neoadjuvant chemotherapy (NAC) was stable disease (SD) or progressive disease (PD) were selected. Clinicopathological features and survival data were recorded at 1 year and at 5 years. Patients were randomly divided into a training set and a test set at a ratio of 8:2. A random forest (RF) model and a logistic regression model were built in both the 1-year cohort and the 5-year cohort, and the performance of the two models was compared. The models were validated using data from the Surveillance, Epidemiology, and End Results (SEER) database.
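The random 8:2 split described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function name `split_train_test`, the fixed seed, and the placeholder patient IDs are all assumptions, and the abstract does not say whether the split was stratified by outcome. The cohort size of 165 is taken from the 5-year cohort reported in the Results (132 + 33).

```python
import random


def split_train_test(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle the IDs and cut them into a training list and a test list.

    Hypothetical helper: a plain (non-stratified) shuffle with a fixed seed
    so that the split is reproducible.
    """
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 165 patients of the 5-year cohort.
cohort = [f"patient_{i:03d}" for i in range(165)]
train_ids, test_ids = split_train_test(cohort)
print(len(train_ids), len(test_ids))  # 132 33
```

With 165 patients, an 8:2 split yields 132 training and 33 test patients, which matches the 5-year cohort sizes in the Results.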

Results

A total of 315 patients were included. In the 1-year cohort, 197 patients were assigned to the training set and 87 to the test set. The specificity, sensitivity, and area under the receiver operating characteristic curve (AUC) were 0.800, 0.833, and 0.810 for the RF model and 0.520, 0.833, and 0.653 for the logistic regression model. In the 5-year cohort, 132 patients were assigned to the training set and 33 to the test set. The specificity, sensitivity, and AUC were 0.882, 0.750, and 0.829 for the RF model and 0.882, 0.688, and 0.752 for the logistic regression model. In the external validation set, the specificity, sensitivity, and AUC were 0.765, 0.812, and 0.779 for the RF model and 0.833, 0.376, and 0.619 for the logistic regression model.
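For readers unfamiliar with the three metrics reported above, the sketch below shows how each is computed. It is illustrative only. The abstract does not report confusion matrices, so the counts used here (12 true positives, 4 false negatives, 15 true negatives, 2 false positives) are hypothetical, chosen because they total 33 and round to the sensitivity and specificity reported for the RF model in the 5-year test set. The labels and scores passed to `auc` are toy values, not study data.

```python
def sensitivity(tp, fn):
    """Fraction of patients with an event that the model flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of event-free patients that the model clears."""
    return tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts, not reported in the abstract.
print(round(sensitivity(tp=12, fn=4), 3))  # 0.75
print(round(specificity(tn=15, fp=2), 3))  # 0.882

# Toy labels and predicted probabilities, for illustration only.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Sensitivity and specificity depend on the probability threshold used to call an event, whereas the AUC summarizes ranking quality across all thresholds. This is why two models can share a sensitivity yet differ in AUC, as in the 1-year cohort.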

Conclusion

The RF model performed well in predicting events among BC patients with SD or PD after NAC. It may benefit these patients by helping to detect tumor recurrence.
Metadata
Title
Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy
Authors
Yudi Jin
Ailin Lan
Yuran Dai
Linshan Jiang
Shengchun Liu
Publication date
01-12-2023
Publisher
BioMed Central
Published in
European Journal of Medical Research / Issue 1/2023
Electronic ISSN: 2047-783X
DOI
https://doi.org/10.1186/s40001-023-01361-7
