Published in: Journal of Digital Imaging 6/2016

01-12-2016

Cloud-Based NoSQL Open Database of Pulmonary Nodules for Computer-Aided Lung Cancer Diagnosis and Reproducible Research

Authors: José Raniery Ferreira Junior, Marcelo Costa Oliveira, Paulo Mazzoncini de Azevedo-Marques

Published in: Journal of Imaging Informatics in Medicine | Issue 6/2016

Abstract

Lung cancer is the leading cause of cancer-related deaths in the world, and its main manifestation is pulmonary nodules. Detection and classification of pulmonary nodules are challenging tasks that must be done by qualified specialists, but image interpretation errors make those tasks difficult. To aid radiologists in these tasks, it is important to integrate computer-based tools with the lesion detection, pathology diagnosis, and image interpretation processes. However, computer-aided diagnosis research lacks sufficient shared medical reference data for the development, testing, and evaluation of computational diagnostic methods. To mitigate this problem, this paper presents a public, nonrelational, document-oriented, cloud-based database of pulmonary nodules characterized by 3D texture attributes, identified by experienced radiologists, and classified according to nine subjective characteristics by the same specialists. Our goal is to improve research on computer-aided lung cancer diagnosis and on pulmonary nodule detection and classification by deploying this database in a cloud Database as a Service framework. Pulmonary nodule data were provided by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), image descriptors were obtained by volumetric texture analysis, and the database schema was developed using a document-oriented Not only Structured Query Language (NoSQL) approach. The database currently holds 379 exams, 838 nodules, and 8237 images (4029 CT scans and 4208 manually segmented nodules), and it is hosted in a MongoDB instance on a cloud infrastructure.
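As an illustration of what a document-oriented layout for such data might look like, the sketch below models one exam as a nested document with embedded nodules, ratings, texture attributes, and image references. It uses plain Python dictionaries rather than a live MongoDB connection. All field names, identifiers, file names, and values are hypothetical and are not taken from the published schema; only the nine characteristic names follow the LIDC-IDRI annotation vocabulary.

```python
# Sketch of a document-oriented exam record (hypothetical field names).
# The nine subjective characteristics follow the LIDC-IDRI annotation terms.
CHARACTERISTICS = (
    "subtlety", "internalStructure", "calcification", "sphericity",
    "margin", "lobulation", "spiculation", "texture", "malignancy",
)


def make_nodule(nodule_id, ratings, texture3d, images):
    """Build one embedded nodule sub-document, checking all nine ratings exist."""
    missing = [c for c in CHARACTERISTICS if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    return {
        "noduleId": nodule_id,
        "ratings": ratings,
        "texture3d": texture3d,  # 3D texture attributes (made-up names and values)
        "images": images,        # references to CT slices and segmented nodules
    }


# One exam document with two embedded nodules (all values are made up).
exam = {
    "examId": "exam-0001",
    "nodules": [
        make_nodule(
            "nodule-1",
            dict.fromkeys(CHARACTERISTICS, 2),
            {"energy": 0.12, "entropy": 4.8},
            [
                {"kind": "ct", "file": "n1_slice1.png"},
                {"kind": "ct", "file": "n1_slice2.png"},
                {"kind": "segmented", "file": "n1_seg1.png"},
                {"kind": "segmented", "file": "n1_seg2.png"},
            ],
        ),
        make_nodule(
            "nodule-2",
            {**dict.fromkeys(CHARACTERISTICS, 3), "malignancy": 5},
            {"energy": 0.07, "entropy": 5.3},
            [
                {"kind": "ct", "file": "n2_slice1.png"},
                {"kind": "segmented", "file": "n2_seg1.png"},
            ],
        ),
    ],
}


def find_nodules(exams, min_malignancy):
    """Return (examId, noduleId) pairs rated at or above min_malignancy."""
    return [
        (e["examId"], n["noduleId"])
        for e in exams
        for n in e["nodules"]
        if n["ratings"]["malignancy"] >= min_malignancy
    ]


def count_images_by_kind(exams):
    """Count image references per kind across all exams and nodules."""
    counts = {}
    for e in exams:
        for n in e["nodules"]:
            for img in n["images"]:
                counts[img["kind"]] = counts.get(img["kind"], 0) + 1
    return counts


print(find_nodules([exam], min_malignancy=4))
print(count_images_by_kind([exam]))
```

Under the same hypothetical layout, a MongoDB filter such as `{"nodules.ratings.malignancy": {"$gte": 4}}` would match exam documents that contain at least one nodule with that rating. Embedding nodules inside the exam keeps everything needed for one case in a single document, at the cost of larger documents; whether the published database embeds or references nodules is not stated here.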
Footnotes
1. Available at www.morpheusdata.com/ [Online; accessed on June 14, 2016]
2. Available at www.speedtest.net [Online; accessed on June 14, 2016]
Metadata
Title
Cloud-Based NoSQL Open Database of Pulmonary Nodules for Computer-Aided Lung Cancer Diagnosis and Reproducible Research
Authors
José Raniery Ferreira Junior
Marcelo Costa Oliveira
Paulo Mazzoncini de Azevedo-Marques
Publication date
01-12-2016
Publisher
Springer International Publishing
Published in
Journal of Imaging Informatics in Medicine / Issue 6/2016
Print ISSN: 2948-2925
Electronic ISSN: 2948-2933
DOI
https://doi.org/10.1007/s10278-016-9894-9
