Journal of Imaging Informatics in Medicine

Publication date: 01-08-2018

MABAL: a Novel Deep-Learning Architecture for Machine-Assisted Bone Age Labeling

Authors: Simukayi Mutasa, Peter D. Chang, Carrie Ruzal-Shapiro, Rama Ayyala

Published in: Journal of Imaging Informatics in Medicine | Issue 4/2018

Abstract

Bone age assessment (BAA) is a commonly performed diagnostic study in pediatric radiology to assess skeletal maturity. The most commonly used method for BAA is the Greulich and Pyle atlas (Pediatr Radiol 46.9:1269–1274, 2016; Arch Dis Child 81.2:172–173, 1999). BAA can be a tedious and time-consuming process for the radiologist. As such, several computer-assisted detection/diagnosis (CAD) methods have been proposed for automation of BAA. Classical CAD tools have traditionally relied on hard-coded algorithmic features for BAA, which suffer from a variety of drawbacks. Recently, convolutional neural networks (CNNs) have shown promise in a variety of medical imaging applications. At least two applications of deep learning to the evaluation of bone age have been published (Med Image Anal 36:41–51, 2017; JDI 1–5, 2017). However, current implementations are limited by a combination of architecture design and relatively small datasets. The purpose of this study is to demonstrate the benefits of a customized neural network algorithm carefully calibrated to the evaluation of bone age, using a relatively large institutional dataset. In doing so, this study aims to show that advanced architectures can be successfully trained from scratch in the medical imaging domain and can generate results that outperform any previously proposed algorithm.

The training data consisted of 10,289 images of different skeletal age examinations, 8,909 from the hospital Picture Archiving and Communication System at our institution and 1,383 from the public Digital Hand Atlas Database. The data were separated into four cohorts, one each for male and female children above the age of 8, and one each for male and female children below the age of 10. The testing set consisted of 20 radiographs for each 1-year age cohort from 0–1 years to 14–15+ years, half male and half female. The testing set included left-hand radiographs done for bone age assessment, trauma evaluation without significant findings, and skeletal surveys. A customized neural network with 14 hidden layers was designed for this study. The network included several state-of-the-art techniques, including residual-style connections, inception layers, and spatial transformer layers. Data augmentation was applied to the network inputs to reduce overfitting. A linear regression output was used. Mean square error was used as the network loss function, and mean absolute error (MAE) was used as the primary performance metric.

MAEs on the validation and test sets for young females were 0.654 and 0.561 respectively. For older females, validation and test MAEs were 0.662 and 0.497 respectively. For young males, validation and test MAEs were 0.649 and 0.585 respectively. Finally, for older males, validation and test MAEs were 0.581 and 0.501 respectively. The female cohorts were trained for 900 epochs each and the male cohorts were trained for 600 epochs. Eightfold cross-validation was used for hyperparameter tuning. Test error was obtained after training on the full data set with the selected hyperparameters. Using our proposed customized neural network architecture on our large available data, we achieved aggregate validation and test set MAEs of 0.637 and 0.536 respectively. To date, this is the best published performance for deep-learning-based bone age assessment.

Our results support our initial hypothesis that customized, purpose-built neural networks provide improved performance over networks derived from pre-trained imaging data sets. We build on that initial work by showing that the addition of state-of-the-art techniques such as residual connections and inception architecture further improves prediction accuracy. This is important because the current assumption for use of residual and/or inception architectures is that a large pre-trained network is required for successful implementation, given the relatively small datasets in medical imaging. Instead, we show that a small, customized architecture incorporating advanced CNN strategies can indeed be trained from scratch, yielding significant improvements in algorithm accuracy. It should be noted that for all four cohorts, test error was lower than validation error. One reason for this is that the ground truth for our test set was obtained by averaging the reads of two pediatric radiologists, whereas only a single read was used for our training data. This suggests that despite relatively noisy training data, the algorithm could successfully model the variation between observers and generate estimates that are close to the expected ground truth.
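As a rough illustration of the loss and metric named in the abstract, the sketch below computes mean square error and mean absolute error on a handful of made-up values, and then averages the four reported cohort MAEs. The abstract does not say how the aggregate figures were computed; equal weighting of the four cohorts is an assumption here, although it lands within rounding of the reported 0.637 and 0.536. The function names and the toy reference/predicted values are hypothetical and are not taken from the study.

```python
from statistics import mean

# Cohort-level MAEs as reported in the abstract: (validation, test).
COHORT_MAE = {
    "young females": (0.654, 0.561),
    "older females": (0.662, 0.497),
    "young males": (0.649, 0.585),
    "older males": (0.581, 0.501),
}


def mean_squared_error(y_true, y_pred):
    """Loss named in the abstract: mean of squared differences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mean_absolute_error(y_true, y_pred):
    """Primary metric named in the abstract: mean of absolute differences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def unweighted_aggregate(cohorts):
    """Average per-cohort validation and test MAEs with equal weight.

    Equal weighting is an assumption; the source does not state the method.
    """
    val = mean(v for v, _ in cohorts.values())
    test = mean(t for _, t in cohorts.values())
    return val, test


# Hypothetical reference and predicted bone ages, for illustration only.
reference = [2.0, 7.5, 11.0, 14.0]
predicted = [2.5, 7.0, 11.5, 13.0]

mse = mean_squared_error(reference, predicted)
mae = mean_absolute_error(reference, predicted)
agg_val, agg_test = unweighted_aggregate(COHORT_MAE)

print(f"toy MSE = {mse:.4f}, toy MAE = {mae:.4f}")
print(f"aggregate validation MAE = {agg_val:.4f}, test MAE = {agg_test:.4f}")
```

Squaring the errors in the loss penalizes large misses more heavily during training, while MAE reports the typical size of an error in the same units as the labels, which is presumably why the two are used for different purposes.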

1. Breen MA et al.: Bone age assessment practices in infants and older children among Society for Pediatric Radiology members. Pediatr Radiol 46(9):1269–1274, 2016

2. Bull RK et al.: Bone age assessment: a large scale comparison of the Greulich and Pyle, and Tanner and Whitehouse (TW2) methods. Arch Dis Child 81(2):172–173, 1999

3. Thodberg HH, Sävendahl L: Validation and reference values of automated bone age determination for four ethnicities. Acad Radiol 17(11):1425–1432, 2010

4. Ontell FK et al.: Bone age in children of diverse ethnicity. AJR Am J Roentgenol 167(6):1395–1398, 1996

5. Berst MJ et al.: Effect of knowledge of chronologic age on the variability of pediatric bone age determined using the Greulich and Pyle standards. Am J Roentgenol 176(2):507–510, 2001

6. Spampinato C et al.: Deep learning for automated skeletal bone age assessment in X-ray images. Med Image Anal 36:41–51, 2017

7. Thodberg HH et al.: The BoneXpert method for automated determination of skeletal maturity. IEEE Trans Med Imaging 28(1):52–66, 2009

8. Lee H et al.: Fully Automated Deep Learning System for Bone Age Assessment. J Digit Imaging 1–15, 2017

9. Shen W, Zhou M, Yang F, Yang C, Tian J: Multi-scale Convolutional Neural Networks for Lung Nodule Classification. Inf Process Med Imaging 24:588–599, 2015

10. Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, Mougiakakou S: Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network. IEEE Trans Med Imaging [Internet]. 2016 May [cited 2017 Jul 2];35(5):1207–16. Available from: http://ieeexplore.ieee.org/document/7422082/

11. Kooi T, Litjens G, van Ginneken B, Gubern-Mérida A, Sánchez CI, Mann R, et al: Large scale deep learning for computer aided detection of mammographic lesions. Med Image Anal [Internet]. Elsevier; 2017 Jan 1 [cited 2017 Sep 18]; 35:303–12. Available from: http://www.sciencedirect.com/science/article/pii/S1361841516301244

12. Dou Q, Chen H, Yu L, Zhao L, Qin J, Wang D, et al: Automatic Detection of Cerebral Microbleeds From MR Images via 3D Convolutional Neural Networks. IEEE Trans Med Imaging [Internet]. 2016 May [cited 2017 Jul 2];35(5):1182–95. Available from: http://ieeexplore.ieee.org/document/7403984/

13. Pereira S, Pinto A, Alves V, Silva CA: Brain Tumor Segmentation Using Convolutional Neural Networks in MRI Images. IEEE Trans Med Imaging [Internet]. 2016 May [cited 2017 Jul 2];35(5):1240–51. Available from: http://ieeexplore.ieee.org/document/7426413/

14. Chang PD: Fully Convolutional Deep Residual Neural Networks for Brain Tumor Segmentation. In: Crimi A, Menze B, Maier O, Reyes M, Winzeck S, Handels H, editors. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: Second International Workshop, BrainLes 2016, with the Challenges on BRATS, ISLES and mTOP 2016, Held in Conjunction with MICCAI 2016, Athens, Greece, October 17, 2016, Revised [Internet]. Cham: Springer International Publishing; 2016. p. 108–18. Available from: https://doi.org/10.1007/978-3-319-55524-9_11

15. Wang J, Fang Z, Lang N, Yuan H, Su MY, Baldi P: A multi-resolution approach for spinal metastasis detection using deep Siamese neural networks. Comput Biol Med 84:137–146, 2017

16. Cao F et al.: Digital hand atlas and web-based bone age assessment: system design and implementation. Comput Med Imaging Graph 24(5):297–307, 2000

17. LeCun Y et al.: Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324, 1998

18. He K et al.: Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016

19. Szegedy C et al.: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. AAAI. 2017

20. Szegedy C et al.: Going deeper with convolutions. Proceedings of the IEEE conference on computer vision and pattern recognition. 2015

21. Jaderberg M, Simonyan K, Zisserman A: Spatial transformer networks. Adv Neural Inf Process Syst 2015

22. Kingma D, Ba J: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 2014

23. Nesterov Y: Gradient methods for minimizing composite objective function. 2007

24. Dozat T: Incorporating Nesterov momentum into Adam. 2016

25. Glorot X, Bengio Y: Understanding the difficulty of training deep feedforward neural networks. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. 2010

26. Srivastava N et al.: Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958, 2014

27. Ioffe S, Szegedy C: Batch normalization: Accelerating deep network training by reducing internal covariate shift. International Conference on Machine Learning. 2015

28. Maclaurin D, Duvenaud D, Adams R: Gradient-based hyperparameter optimization through reversible learning. International Conference on Machine Learning. 2015

29. LeCun YA et al.: Efficient backprop. Neural networks: Tricks of the trade. Berlin Heidelberg: Springer, 2012, pp. 9–48

30. Bengio Y: Practical recommendations for gradient-based training of deep architectures. Neural networks: Tricks of the trade. Berlin Heidelberg: Springer, 2012, pp. 437–478

31. Sun C et al.: Revisiting unreasonable effectiveness of data in deep learning era. arXiv preprint arXiv:1707.02968 2017

Title: MABAL: a Novel Deep-Learning Architecture for Machine-Assisted Bone Age Labeling
Authors: Simukayi Mutasa, Peter D. Chang, Carrie Ruzal-Shapiro, Rama Ayyala
Publication date: 01-08-2018
Publisher: Springer International Publishing
Published in: Journal of Imaging Informatics in Medicine / Issue 4/2018
Print ISSN: 2948-2925
Electronic ISSN: 2948-2933
DOI: https://doi.org/10.1007/s10278-018-0053-3
