To automatically label chest radiograph and chest CT reports for pulmonary infection as described in the report text, to calculate the number needed to image (NNI), and to investigate whether these labels correlate with regional epidemiological infection data.
Materials and methods
Reports of all chest imaging examinations performed in the emergency room between 01/2012 and 06/2022 were included (64,046 radiographs; 27,705 CTs). Using a regular expression-based text search algorithm, each report was labeled positive or negative according to whether it described pulmonary infection.
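The labeling step can be illustrated with a minimal sketch. The search terms, the negation cues, and the function name `label_report` are hypothetical and chosen for illustration only; the expressions actually used in the study are not given in this text, and real reports would need a far larger, language-specific vocabulary.

```python
import re

# Hypothetical finding terms; not the study's actual expressions.
FINDING = re.compile(
    r"\b(pneumonia|infiltrates?|consolidations?|pulmonary infection)\b",
    re.IGNORECASE,
)
# Hypothetical negation cues that cancel a finding later in the same sentence.
NEGATION = re.compile(r"\b(no|not|without|negative for)\b", re.IGNORECASE)


def label_report(text: str) -> bool:
    """Return True if a sentence mentions a finding with no negation cue before it."""
    # Crude sentence split on periods, semicolons, and line breaks.
    for sentence in re.split(r"[.;\n]+", text):
        match = FINDING.search(sentence)
        # Only the first finding term in each sentence is inspected.
        if match and not NEGATION.search(sentence[: match.start()]):
            return True
    return False
```

A rule of this kind is transparent and fast, but it only checks the first finding term per sentence and treats any earlier negation cue as cancelling it, so it should be validated against manually labeled reports before use.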
Data for regional weekly influenza-like illness (ILI) consultations (10/2013–3/2022) and for COVID-19 cases and hospitalizations (2/2020–6/2022) were matched with report labels based on calendar date. The positive rate for pulmonary infection detection, the NNI, and the correlation with influenza/COVID-19 data were calculated.
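The two derived metrics can be sketched as follows. This assumes that the NNI is the reciprocal of the positive rate, which agrees with the reported ranges (for example, 1/0.168 ≈ 6.0 and 1/0.108 ≈ 9.3), and that the correlation is a Pearson coefficient on weekly counts; the text reports r values but does not name the coefficient. The function names are illustrative.

```python
from math import sqrt


def number_needed_to_image(n_positive: int, n_total: int) -> float:
    """Examinations performed per positive report (assumed: 1 / positive rate)."""
    if n_positive <= 0:
        raise ValueError("NNI is undefined without positive reports")
    return n_total / n_positive


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equally long series, e.g. weekly counts."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(ss_x * ss_y)
```

In use, the weekly count of positive reports and the regional weekly surveillance figure would first be aligned on the same calendar weeks, then passed to `pearson_r` as two lists of equal length.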
Between 1/2012 and 2/2020, the yearly positive rate for pulmonary infection on chest radiographs ranged from 10.8 to 16.8% (NNI 6.0–9.3). Mean monthly detection counts showed a clear and significant seasonal change (102.3 winter; 61.5 summer; p < .001) and correlated moderately with regional ILI consultations (weekly data r = 0.45; p < .001).
For 2020–2021, monthly pulmonary infection counts detected by chest CT increased to 64–234 (23.0–26.7% per year positive rate, NNI 3.7–4.3) compared with 14–94 (22.4–26.7% positive rate, NNI 3.7–4.4) for 2012–2019. Regional COVID-19 epidemic waves correlated moderately with the positive pulmonary infection CT curve for 2020–2022 (weekly new cases: r = 0.53; hospitalizations: r = 0.65; p < .001).
Text mining of radiology reports allows automatic extraction of diagnoses. It provides a metric to calculate the number needed to image and to track the trend of diagnoses in real time, i.e., seasonality and epidemic course of pulmonary infections.
Digital labels derived from radiology reports represent previously neglected data. They may assist in automated disease tracking and in the assessment of physicians’ clinical reasoning for ordering radiology examinations, and may serve as actionable data for hospital workflow optimization.
• Radiology reports, commonly not machine readable, can be automatically labeled with the contained diagnoses using a regular expression-based text search algorithm.
• Chest radiograph reports positive for pulmonary infection moderately correlated with regional influenza-like illness consultations (weekly data; r = 0.45; p < .001) and chest CT reports with the course of the regional COVID-19 pandemic (new cases: r = 0.53; hospitalizations: r = 0.65; p < .001).
• Rendering radiology reports into data labels provides a metric for automated disease tracking and for the assessment of ordering physicians’ clinical reasoning, and can serve as actionable data for workflow optimization.