Published in: Cancer Imaging 1/2023

Open Access 01-12-2023 | Research article

Performance comparison between multi-center histopathology datasets of a weakly-supervised deep learning model for pancreatic ductal adenocarcinoma detection

Authors: Francisco Carrillo-Perez, Francisco M. Ortuno, Alejandro Börjesson, Ignacio Rojas, Luis Javier Herrera

Abstract

Background

Patients with pancreatic ductal adenocarcinoma (PDAC) have a very poor prognosis because the disease is difficult to detect early and produces few early symptoms. Digital pathology is routinely used by pathologists to diagnose the disease. However, visually inspecting the tissue is time-consuming, which slows down the diagnostic procedure. With recent advances in artificial intelligence, specifically in deep learning models, and the growing availability of public histology data, clinical decision support systems are being created. However, the generalization capabilities of these systems are not always tested, nor is the integration of publicly available datasets for PDAC detection.

Methods

In this work, we explored the performance of two weakly-supervised deep learning models using the two most widely available datasets with pancreatic ductal adenocarcinoma histology images: The Cancer Genome Atlas Project (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). To obtain sufficient training data, the TCGA dataset was integrated with the Genotype-Tissue Expression (GTEx) project dataset, which contains healthy pancreatic samples.
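
As a rough illustration of what weak supervision means here, the sketch below assumes that each slide is split into tiles, that a model scores every tile, and that only a slide-level label is available for training and evaluation. The abstract does not state how tile scores are aggregated, so the mean pooling, the 0.5 threshold, and the name `slide_prediction` are assumptions made for illustration only.

```python
from statistics import mean


def slide_prediction(tile_probs, threshold=0.5):
    """Aggregate per-tile tumor probabilities into one slide-level call.

    Assumption: simple mean pooling. The models in the article may use
    a different aggregation.
    """
    if not tile_probs:
        raise ValueError("a slide needs at least one tile")
    slide_prob = mean(tile_probs)
    # 1 = tumor, 0 = healthy (hypothetical label encoding)
    return slide_prob, int(slide_prob >= threshold)


# Toy data: tile scores plus the only label that is known, the slide label.
slides = {
    "slide_A": ([0.9, 0.8, 0.7], 1),
    "slide_B": ([0.1, 0.2, 0.3], 0),
}

for name, (tile_probs, slide_label) in slides.items():
    prob, pred = slide_prediction(tile_probs)
    print(name, round(prob, 2), "predicted:", pred, "label:", slide_label)
```

No tile carries its own annotation in this setup; the slide label is the only supervision signal.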

Results

We showed that the model trained on CPTAC generalizes better than the one trained on the integrated dataset, obtaining an inter-dataset accuracy of 90.62% ± 2.32 and an outer-dataset accuracy of 92.17% when evaluated on TCGA + GTEx. Furthermore, we tested the performance on another dataset formed by tissue micro-arrays, obtaining an accuracy of 98.59%. We showed that the features learned on an integrated dataset do not differentiate between the classes but between the datasets, which suggests that a stronger normalization might be needed when creating clinical decision support systems with datasets obtained from different sources. To mitigate this effect, we proposed training on the three available datasets, which improved on the detection performance and generalization capabilities of the model trained only on TCGA + GTEx and achieved a performance similar to that of the model trained only on CPTAC.
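
The two kinds of figures reported above can be computed along the following lines: an accuracy summarized as mean ± spread over several held-out splits of the training dataset, and a single accuracy on a dataset the model never saw. This is a minimal sketch. The abstract does not say how the spread was computed, so the use of the sample standard deviation is an assumption, and the fold values are made-up toy numbers, not results from the article.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize_folds(fold_accuracies):
    """Return (mean, sample standard deviation) over per-fold accuracies."""
    spread = stdev(fold_accuracies) if len(fold_accuracies) > 1 else 0.0
    return mean(fold_accuracies), spread


# Toy numbers only: three hypothetical folds on the training dataset.
fold_mean, fold_std = summarize_folds([0.90, 0.92, 0.88])
print(f"within-dataset accuracy: {100 * fold_mean:.2f}% ± {100 * fold_std:.2f}")

# Toy external evaluation: labels and predictions on an unseen dataset.
external = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(f"external accuracy: {100 * external:.2f}%")
```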

Conclusions

Integrating datasets in which both classes are present can mitigate the batch effect that arises when datasets are combined, improving classification performance and enabling accurate PDAC detection across different datasets.
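
One simple way to check for the situation described above before training is to verify that every source dataset contributes samples of both classes. When a source contributes only one class, the class label and the source are confounded, and a model can learn to separate the sources instead of the classes. The sketch below is an illustration under that assumption. The source names and the helper names `classes_per_source` and `single_class_sources` are hypothetical and do not come from the article.

```python
from collections import defaultdict


def classes_per_source(samples):
    """Map each source dataset to the set of class labels it contains."""
    seen = defaultdict(set)
    for source, label in samples:
        seen[source].add(label)
    return dict(seen)


def single_class_sources(samples):
    """List the sources that contribute only one class (confounding risk)."""
    per_source = classes_per_source(samples)
    return sorted(s for s, labels in per_source.items() if len(labels) < 2)


# Toy manifest of (source, label) pairs; names are placeholders.
toy_samples = [
    ("source_A", "tumor"), ("source_A", "tumor"),
    ("source_B", "healthy"), ("source_B", "healthy"),
    ("source_C", "tumor"), ("source_C", "healthy"),
]

print(classes_per_source(toy_samples))
print("single-class sources:", single_class_sources(toy_samples))
```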
Metadata
Publication date
01-12-2023
Publisher
BioMed Central
Published in
Cancer Imaging / Issue 1/2023
Electronic ISSN: 1470-7330
DOI
https://doi.org/10.1186/s40644-023-00586-3
