01-12-2020 | Prostate Cancer | Imaging Informatics and Artificial Intelligence

Using decision curve analysis to benchmark performance of a magnetic resonance imaging–based deep learning model for prostate cancer risk assessment

Authors: Dominik Deniffel, Nabila Abraham, Khashayar Namdar, Xin Dong, Emmanuel Salinas, Laurent Milot, Farzad Khalvati, Masoom A. Haider

Published in: European Radiology | Issue 12/2020

Abstract

Objectives

To benchmark the performance of a calibrated 3D convolutional neural network (CNN) applied to multiparametric MRI (mpMRI) for risk assessment of clinically significant prostate cancer (csPCa) using decision curve analysis (DCA).

Methods

We retrospectively analyzed 499 patients who had positive mpMRI (PI-RADSv2 ≥ 3) and MRI-targeted biopsy. The training cohort comprised 449 men, including a calibration set of 50 men. Three types of biopsy decision strategy were compared: biopsy guided by risk estimates from the CNN (original and calibrated), biopsy only in men with PI-RADSv2 ≥ 4, and biopsy in men with PI-RADSv2 ≥ 4 plus men with PI-RADSv2 3 and PSA density (PSAd) ≥ 0.15 ng/ml/ml. Discrimination, calibration and clinical usefulness in the unseen test cohort (n = 50) were assessed using the C-statistic, calibration plots and DCA, respectively.
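The decision strategies described above can be read as simple rules. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function and parameter names are hypothetical, and only the PI-RADSv2 cutoffs and the PSAd cutoff of 0.15 ng/ml/ml come from the Methods.

```python
# Illustrative sketch only; names are hypothetical, cutoffs are from the Methods.
PSAD_CUTOFF = 0.15  # PSA density cutoff in ng/ml/ml


def biopsy_by_cnn(cnn_risk, risk_threshold):
    """Biopsy if the CNN risk estimate reaches the chosen risk threshold."""
    return cnn_risk >= risk_threshold


def biopsy_by_pirads(pirads):
    """Biopsy only in men with PI-RADSv2 >= 4."""
    return pirads >= 4


def biopsy_by_pirads_and_psad(pirads, psad):
    """Biopsy in PI-RADSv2 >= 4, or in PI-RADSv2 3 with PSAd >= cutoff."""
    return pirads >= 4 or (pirads == 3 and psad >= PSAD_CUTOFF)


# Hypothetical patient: PI-RADSv2 3, PSAd 0.20 ng/ml/ml, CNN risk 5%.
decisions = {
    "cnn_at_10_percent": biopsy_by_cnn(0.05, 0.10),
    "pirads_only": biopsy_by_pirads(3),
    "pirads_plus_psad": biopsy_by_pirads_and_psad(3, 0.20),
}
```

In this made-up case only the combined PI-RADSv2 and PSAd rule would recommend biopsy, which shows how the strategies can disagree for the same patient.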

Results

The calibrated CNN achieved moderate calibration (Hosmer-Lemeshow calibration test, p = 0.41) and good discrimination (C = 0.85). DCA revealed consistently higher net benefit and net reduction in biopsies for the calibrated CNN compared with the original CNN, PI-RADSv2 ≥ 4 and the combined strategy of PI-RADSv2 and PSAd. Original CNN predictions were severely miscalibrated (p < 0.0001), resulting in net harm compared with a ‘biopsy all patients’ strategy. At risk thresholds ≥ 10%, the calibrated CNN and the combined strategy reduced the number of biopsies by an estimated 201 and 55 men, respectively, per 1000 men at risk, without missing csPCa, whereas the original CNN and PI-RADSv2 ≥ 4 could not achieve a net reduction in biopsies.
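For readers unfamiliar with DCA, the quantities reported here can be sketched with the standard definitions: net benefit at risk threshold p_t is TP/n − (FP/n) × p_t/(1 − p_t), and the net reduction in biopsies is the net-benefit gain over ‘biopsy all’ divided by the same odds weight. The sketch below assumes those textbook formulas; the function names and the toy data are hypothetical and are not taken from the study.

```python
# Illustrative sketch of decision curve quantities; toy data, hypothetical names.

def net_benefit(risks, outcomes, threshold):
    """Net benefit of biopsying everyone whose predicted risk >= threshold."""
    n = len(outcomes)
    weight = threshold / (1 - threshold)  # harm-to-benefit odds at the threshold
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * weight


def net_benefit_biopsy_all(outcomes, threshold):
    """Net benefit of the reference strategy that biopsies every patient."""
    prevalence = sum(outcomes) / len(outcomes)
    weight = threshold / (1 - threshold)
    return prevalence - (1 - prevalence) * weight


def net_biopsies_avoided_per_1000(risks, outcomes, threshold):
    """Net reduction in biopsies per 1000 men, without missing any cancers."""
    weight = threshold / (1 - threshold)
    gain = net_benefit(risks, outcomes, threshold) - net_benefit_biopsy_all(outcomes, threshold)
    return gain / weight * 1000


# Toy cohort of 10 men (1 = csPCa on biopsy); values are invented for illustration.
toy_risks = [0.05, 0.05, 0.30, 0.60, 0.80, 0.90, 0.02, 0.40, 0.07, 0.95]
toy_outcomes = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1]
avoided = net_biopsies_avoided_per_1000(toy_risks, toy_outcomes, threshold=0.10)
```

A negative value from `net_biopsies_avoided_per_1000` would correspond to the situation described for the original CNN, where the strategy performs worse than biopsying everyone.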

Conclusions

DCA revealed that our calibrated 3D-CNN resulted in fewer unnecessary biopsies compared with using PI-RADSv2 alone or in combination with PSAd. CNN calibration is important in achieving clinical utility.
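To make the notion of calibration concrete, the sketch below builds the table behind a basic calibration plot: patients are grouped by predicted risk, and the mean predicted risk in each group is compared with the observed fraction of csPCa. It is an assumption-laden illustration, not the authors' procedure; the equal-width binning, the function name and the example values are all hypothetical.

```python
# Illustrative sketch of a binned calibration table; hypothetical names and data.

def calibration_table(risks, outcomes, n_bins=5):
    """Return (mean predicted risk, observed event fraction, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for r, y in zip(risks, outcomes):
        index = min(int(r * n_bins), n_bins - 1)  # a risk of 1.0 goes in the last bin
        bins[index].append((r, y))
    table = []
    for members in bins:
        if not members:
            continue  # skip bins with no patients
        mean_predicted = sum(r for r, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        table.append((mean_predicted, observed, len(members)))
    return table


# Invented example: predictions that match outcomes well in both bins.
example = calibration_table([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], n_bins=2)
```

For a well-calibrated model the first two entries of each row are close to each other; large gaps between them indicate the kind of miscalibration that can erase clinical utility at a given risk threshold.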

Key Points

• A 3D deep learning model applied to multiparametric MRI may help to prevent unnecessary prostate biopsies in patients eligible for MRI-targeted biopsy.

• Owing to miscalibration, original risk estimates by the deep learning model require prior calibration to enable clinical utility.

• Decision curve analysis confirmed a net benefit of using our calibrated deep learning model for biopsy decisions compared with alternative strategies, including PI-RADSv2 alone and in combination with prostate-specific antigen density.

Title: Using decision curve analysis to benchmark performance of a magnetic resonance imaging–based deep learning model for prostate cancer risk assessment
Authors: Dominik Deniffel, Nabila Abraham, Khashayar Namdar, Xin Dong, Emmanuel Salinas, Laurent Milot, Farzad Khalvati, Masoom A. Haider
Publication date: 01-12-2020
Publisher: Springer Berlin Heidelberg
Keywords: Prostate Cancer, Magnetic Resonance Imaging, Artificial Intelligence
Published in: European Radiology / Issue 12/2020
Print ISSN: 0938-7994
Electronic ISSN: 1432-1084
DOI: https://doi.org/10.1007/s00330-020-07030-1
