Open Access 09-06-2024 | Artificial Intelligence | Review

Frequency and characteristics of errors by artificial intelligence (AI) in reading screening mammography: a systematic review

Authors: Aileen Zeng, Nehmat Houssami, Naomi Noguchi, Brooke Nickel, M. Luke Marinovich

Published in: Breast Cancer Research and Treatment | Issue 1/2024

Abstract

Purpose

Artificial intelligence (AI) for reading breast screening mammograms could replace some human reading and improve screening effectiveness. This systematic review aims to identify and quantify the types of AI errors to better understand the consequences of implementing this technology.

Methods

Electronic databases were searched for external validation studies of the accuracy of AI algorithms in real-world screening mammograms. Descriptive synthesis was performed on error types and frequency. False negative proportions (FNP) and false positive proportions (FPP) were pooled within AI positivity thresholds using random-effects meta-analysis.
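
As an illustration of the pooling step, the following minimal Python sketch pools proportions across studies with a random-effects model. It assumes a logit transformation, the DerSimonian-Laird estimator of between-study variance, and a 0.5 continuity correction; the abstract does not state which estimator or software was used, and the function name and any input counts are hypothetical.

```python
import math


def pool_proportions_random_effects(events, totals, z=1.96):
    """Pool study proportions on the logit scale (DerSimonian-Laird sketch).

    events[i] / totals[i] is the error proportion in study i.
    Returns (pooled, ci_low, ci_high, tau2) on the proportion scale.
    Assumption: this estimator is illustrative, not the review's stated method.
    """
    # Logit-transformed proportions and their approximate variances.
    # A 0.5 continuity correction is applied to every study so that
    # zero counts do not produce log(0) or division by zero.
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, (n - e) + 0.5
        y.append(math.log(a / b))
        v.append(1.0 / a + 1.0 / b)

    k = len(y)
    w = [1.0 / vi for vi in v]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2
```

Calling the function once per positivity threshold, with the false positive (or false negative) counts of the studies that report that threshold, mirrors the "pooled within thresholds" design described above.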

Results

Seven retrospective studies (447,676 examinations; published 2019–2022) met inclusion criteria. Five studies reported AI error as false negatives or false positives. Pooled FPP decreased incrementally with increasing positivity threshold (71.83% [95% CI 69.67, 73.90] at Transpara 3 to 10.77% [95% CI 8.34, 13.79] at Transpara 9). Pooled FNP increased incrementally from 0.02% [95% CI 0.01, 0.03] (Transpara 3) to 0.12% [95% CI 0.06, 0.26] (Transpara 9), consistent with a trade-off with FPP. Heterogeneity within thresholds reflected algorithm version and completeness of the reference standard. Other forms of AI error were reported rarely (location error and technical error in one study each).
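
The trade-off between FPP and FNP can be sketched with a few lines of Python. The example rests on two assumptions that the abstract does not confirm: an examination is AI-positive when its score is at or above the threshold, and both proportions use all examinations as the denominator. The function name and any counts passed to it are hypothetical.

```python
def error_proportions_by_threshold(cancer_counts, non_cancer_counts):
    """Return {threshold: (fpp, fnp)} for every observed score.

    cancer_counts / non_cancer_counts map an AI score to the number of
    examinations with that score, split by reference-standard status.
    Assumptions (not from the source): score >= threshold is AI-positive,
    and both proportions are taken over all examinations.
    """
    scores = sorted(set(cancer_counts) | set(non_cancer_counts))
    total = sum(cancer_counts.values()) + sum(non_cancer_counts.values())

    results = {}
    for t in scores:
        # False positives: non-cancer examinations flagged at this threshold.
        fp = sum(n for s, n in non_cancer_counts.items() if s >= t)
        # False negatives: cancers scored below this threshold.
        fn = sum(n for s, n in cancer_counts.items() if s < t)
        results[t] = (fp / total, fn / total)
    return results
```

Raising the threshold can only move non-cancer examinations out of the positive group and cancers into the negative group, so FPP is non-increasing and FNP is non-decreasing in the threshold, which matches the direction of the pooled estimates reported above.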

Conclusion

AI errors are largely interpreted in the framework of test accuracy. FP and FN errors show expected variability not only by positivity threshold, but also by algorithm version and study quality. Reporting of other forms of AI errors is sparse, despite their potential implications for adoption of the technology. Considering broader types of AI error would add nuance to reporting that can inform inferences about AI’s utility.
Metadata
Title
Frequency and characteristics of errors by artificial intelligence (AI) in reading screening mammography: a systematic review
Authors
Aileen Zeng
Nehmat Houssami
Naomi Noguchi
Brooke Nickel
M. Luke Marinovich
Publication date
09-06-2024
Publisher
Springer US
Published in
Breast Cancer Research and Treatment / Issue 1/2024
Print ISSN: 0167-6806
Electronic ISSN: 1573-7217
DOI
https://doi.org/10.1007/s10549-024-07353-3
