Objectives
To estimate the uncertainty of a deep learning (DL)-based prostate segmentation algorithm through conformal prediction (CP) and to assess its effect on the calculation of the prostate volume (PV) in patients at risk of prostate cancer (PC).
Methods
Three hundred seventy-seven multi-center 3-Tesla axial T2-weighted exams from biopsied males (66.64 \(\pm\) 7.47 years) at risk of PC were retrospectively included in the study. A PV assessment based on the PI-RADS 2.1 ellipsoid formula (\(\mathrm{PV}_{\mathrm{ref}}\)) was available for the included patients. Prostate segmentations were obtained from a DL model and used to calculate the PV (\(\mathrm{PV}_{\mathrm{DL}}\)). CP was applied at a confidence level of 85% to flag unreliable pixel segmentations of the DL model. Subsequently, the PV was recalculated with the uncertain pixel segmentations disregarded (\(\mathrm{PV}_{\mathrm{CP}}\)). The agreement of \(\mathrm{PV}_{\mathrm{DL}}\) and of \(\mathrm{PV}_{\mathrm{CP}}\) with the reference standard \(\mathrm{PV}_{\mathrm{ref}}\) was assessed using the intraclass correlation coefficient (ICC) and Bland–Altman plots. The relative volume difference (RVD) was used to evaluate the PV calculation accuracy, and the Wilcoxon signed-rank test was used to assess statistical differences. A p-value < 0.05 was considered statistically significant.
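The steps above can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not specify the CP variant, so the sketch assumes split conformal prediction on binary pixel probabilities with the nonconformity score 1 − p(true class), and it treats a pixel as reliable prostate only when its prediction set is exactly {prostate}. All function names, the voxel-volume argument, and the RVD sign convention (predicted minus reference, in percent) are assumptions made for illustration. The ellipsoid helper uses the standard \(\pi/6 \approx 0.52\) coefficient.

```python
import math
import numpy as np  # np.quantile(..., method=...) needs NumPy >= 1.22


def ellipsoid_volume_ml(ap_mm, transverse_mm, longitudinal_mm):
    """Ellipsoid volume (pi/6 * product of the three diameters), in mL."""
    return math.pi / 6.0 * ap_mm * transverse_mm * longitudinal_mm / 1000.0


def conformal_threshold(cal_probs, cal_labels, alpha=0.15):
    """Split-conformal score threshold from held-out calibration pixels.

    cal_probs: predicted prostate probability per calibration pixel.
    cal_labels: ground-truth label per pixel (1 = prostate, 0 = background).
    alpha: 1 - confidence level (0.15 corresponds to 85% confidence).
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    # Nonconformity score (assumed): 1 - probability given to the true class.
    p_true = np.where(cal_labels == 1, cal_probs, 1.0 - cal_probs)
    scores = 1.0 - p_true
    n = scores.size
    # Finite-sample corrected quantile level, capped at 1.
    level = min(1.0, math.ceil((n + 1) * (1.0 - alpha)) / n)
    return float(np.quantile(scores, level, method="higher"))


def certain_foreground(probs, qhat):
    """Pixels whose prediction set contains prostate and not background."""
    probs = np.asarray(probs, dtype=float)
    prostate_in_set = (1.0 - probs) <= qhat
    background_in_set = probs <= qhat
    return prostate_in_set & ~background_in_set


def volume_ml(mask, voxel_mm3):
    """Volume of a boolean mask in mL, given the volume of one voxel in mm^3."""
    return float(np.asarray(mask).sum()) * voxel_mm3 / 1000.0


def rvd_percent(pv_pred, pv_ref):
    """Relative volume difference in percent (assumed sign: pred - ref)."""
    return 100.0 * (pv_pred - pv_ref) / pv_ref
```

Under these assumptions, \(\mathrm{PV}_{\mathrm{DL}}\) would correspond to `volume_ml(probs > 0.5, voxel_mm3)` and \(\mathrm{PV}_{\mathrm{CP}}\) to `volume_ml(certain_foreground(probs, qhat), voxel_mm3)`, each compared with the reference volume through `rvd_percent`.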
Results
Conformal prediction significantly reduced the RVD compared with the DL algorithm alone (RVD = −2.81 \(\pm\) 8.85 and RVD = −8.01 \(\pm\) 11.50, respectively). \(\mathrm{PV}_{\mathrm{CP}}\) showed significantly better agreement with the reference standard \(\mathrm{PV}_{\mathrm{ref}}\) than \(\mathrm{PV}_{\mathrm{DL}}\) (mean difference (95% limits of agreement): \(\mathrm{PV}_{\mathrm{CP}}\), 1.27 mL (−13.64; 16.17 mL); \(\mathrm{PV}_{\mathrm{DL}}\), 6.07 mL (−14.29; 26.42 mL)), with an excellent ICC for \(\mathrm{PV}_{\mathrm{CP}}\) of 0.97 (95% CI: 0.97 to 0.98).
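For readers less familiar with Bland–Altman statistics, the mean difference and 95% limits of agreement reported above follow the usual mean ± 1.96 SD construction. The sketch below is illustrative only: the function name is hypothetical, and the direction of the difference (reference minus automatic volume) is an assumption, since the abstract does not state it.

```python
import numpy as np


def bland_altman_limits(pv_ref, pv_auto):
    """Return (mean difference, lower limit, upper limit) in the input units.

    Differences are taken as reference minus automatic volume (assumed),
    and the limits of agreement are mean +/- 1.96 sample standard deviations.
    """
    diff = np.asarray(pv_ref, dtype=float) - np.asarray(pv_auto, dtype=float)
    mean_diff = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd
```

A narrower interval centered closer to zero, as reported for \(\mathrm{PV}_{\mathrm{CP}}\), indicates both smaller bias and smaller spread of the per-patient differences.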
Conclusion
Uncertainty quantification through CP increases the accuracy and reliability of DL-based PV assessment in patients at risk of PC.
Critical relevance statement
Conformal prediction can flag uncertain pixel predictions of DL-based prostate MRI segmentation at a desired confidence level, increasing the reliability and safety of prostate volume assessment in patients at risk of prostate cancer.
Key Points
Conformal prediction can flag uncertain pixel predictions of prostate segmentations at a user-defined confidence level.
Deep learning with conformal prediction shows high accuracy in prostate volumetric assessment.
Agreement between automatic and ellipsoid-derived volume was significantly higher with conformal prediction.