Published in: Radiation Oncology 1/2018

Open Access 01-12-2018 | Research

Effect of machine learning methods on predicting NSCLC overall survival time based on Radiomics analysis

Authors: Wenzheng Sun, Mingyan Jiang, Jun Dang, Panchun Chang, Fang-Fang Yin

Abstract

Background

To investigate the effect of machine learning methods on predicting the overall survival (OS) of non-small cell lung cancer (NSCLC) patients based on radiomic feature analysis.

Methods

A total of 339 radiomic features were extracted from the segmented tumor volumes of pretreatment computed tomography (CT) images. These features quantify tumor phenotypic characteristics on the medical images in terms of tumor shape and size, intensity statistics, and texture. The performance of 5 feature selection methods and 8 machine learning methods was investigated for OS prediction. Predictive performance was evaluated with the concordance index between the predicted and true OS of the non-small cell lung cancer patients. Survival curves were estimated with the Kaplan-Meier method and compared with log-rank tests.
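
As an illustration of the two evaluation tools named in the Methods, the following is a minimal sketch of Harrell's concordance index and the Kaplan-Meier estimate for right-censored data. It is not the authors' implementation: the function names, the toy data, and the choice to skip pairs with tied survival times are assumptions made for this example only.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly.

    times       -- observed follow-up times
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: true order unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time."""
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort (months of follow-up), not data from the study.
toy_times = [5, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.3, 0.4, 0.5, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
km_curve = kaplan_meier(toy_times, toy_events)
```

A concordance index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the value reported in the Results should be read.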

Results

The gradient boosting linear models based on Cox’s partial likelihood method, combined with the concordance index feature selection method, obtained the best performance (concordance index: 0.68, 95% confidence interval: 0.62 to 0.74).

Conclusions

The preliminary results demonstrated that certain machine learning and radiomics analysis methods could accurately predict the OS of non-small cell lung cancer patients.
Metadata
Title
Effect of machine learning methods on predicting NSCLC overall survival time based on Radiomics analysis
Authors
Wenzheng Sun
Mingyan Jiang
Jun Dang
Panchun Chang
Fang-Fang Yin
Publication date
01-12-2018
Publisher
BioMed Central
Published in
Radiation Oncology / Issue 1/2018
Electronic ISSN: 1748-717X
DOI
https://doi.org/10.1186/s13014-018-1140-9