Volume of screening mammography and performance in the Quebec population-based Breast Cancer Screening Program

Isabelle Théberge; Nicole Hébert-Croteau; André Langlois; Diane Major; Jacques Brisson

doi:10.1503/cmaj.1040485

Abstract

Background: In the Quebec Breast Cancer Screening Program (Programme québécois de dépistage du cancer du sein [PQDCS]), radiologists' and facilities' volumes of screening mammography vary considerably. We examined the relation of screening-mammography volume to rates of breast cancer detection and false-positive readings in the PQDCS.

Methods: The study population included 307 314 asymptomatic women aged 50–69 years screened during 1998–2000. Breast cancer detection rates were analyzed by comparing all women with screening-detected breast cancer (n = 1709) and a 10% random sample of those without (n = 30 560). False-positive rates were analyzed by comparing the 3159 women with false-positive readings and the 27 401 others in the 10% random sample. Characteristics of participants, radiologists and facilities were obtained from the PQDCS information system. Data were analyzed by means of logistic regression.

Results: The rate of breast cancer detection appeared to be unrelated to the radiologist's screening-mammography volume but increased with the facility's screening-mammography volume. The breast cancer detection rate ratio for facilities performing 4000 or more screenings per year, compared with those performing fewer than 2000, was 1.28 (95% confidence interval [CI] 1.07–1.52). In contrast, the frequency of false-positive readings was unrelated to the facility's screening volume but was inversely related to the radiologist's screening volume: the rate ratio for readers of 1500 or more screenings per year compared with those reading fewer than 250 was 0.53 (95% CI 0.35–0.79).

Interpretation: Radiologists' and facilities' caseloads showed independent and complementary associations with performance of screening mammography in the PQDCS. Radiologists who worked in larger facilities and read more screening mammograms had higher breast cancer detection rates while maintaining lower false-positive rates.

Caseload of health care providers and organizations has been linked with performance [1–4]. Providers with larger volumes of patients or procedures have often been shown to have better outcomes [2,5]. However, a recent comprehensive review of the literature underlined the methodologic limitations of published studies, especially poor adjustment for case mix and failure to account for characteristics of providers and organizations simultaneously [6].

The population-based Quebec Breast Cancer Screening Program (Programme québécois de dépistage du cancer du sein [PQDCS]), launched in 1998, follows the North American standard of a minimum annual reading volume of 480 mammographic examinations (diagnostic and screening combined) for each collaborating radiologist [7]. In addition, facilities in urban areas have to perform at least 4000 diagnostic or screening examinations each year to be eligible for the program [8]. However, radiologists' and facilities' volumes of screening examinations vary widely.

The objective of our study was to assess whether differences in screening volume were associated with rates of breast cancer detection and of false-positive readings. We examined the separate and combined effects of radiologists' and facilities' screening-mammography volumes.

Methods

Between May 1998 and December 2000, 321 985 women had a first screening in the PQDCS. (In this report, we use “screening” to refer to “mammography screening examination”; each screening includes 4 radiographic views; 1 screening corresponds to 1 woman screened.) Of these women, 14 671 were excluded from the study: 13 966 because of breast symptoms (e.g., lump, nipple inversion, nipple discharge), 522 because of prior mastectomy and 183 because of absence of identifying information about the radiologist who read the mammograms. Among the remaining 307 314 asymptomatic women, the mammograms were read as abnormal for 33 938: 1709 (5.6 per 1000) who received a diagnosis of screening-detected in situ or invasive breast cancer within 1 year after the initial abnormal screening result [9] and 32 229 (10.5%) who did not receive such a diagnosis and therefore had a false-positive initial reading.
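The counts in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are ours, and the false-positive percentage is assumed to be computed over all eligible women, which reproduces the reported 10.5%.

```python
# Counts as reported in the Methods paragraph above.
first_screens = 321_985
excluded = {
    "breast symptoms": 13_966,
    "prior mastectomy": 522,
    "radiologist not identified": 183,
}

# Eligible asymptomatic women after exclusions.
eligible = first_screens - sum(excluded.values())

screen_detected_cancers = 1_709
false_positives = 32_229
abnormal_readings = screen_detected_cancers + false_positives

# Assumption: both rates use all eligible women as the denominator.
detection_rate_per_1000 = 1000 * screen_detected_cancers / eligible
false_positive_pct = 100 * false_positives / eligible

print(eligible)                            # 307314
print(abnormal_readings)                   # 33938
print(round(detection_rate_per_1000, 1))   # 5.6
print(round(false_positive_pct, 1))        # 10.5
```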

All women included in this analysis signed a consent form agreeing to participate in the PQDCS, which includes transmission of their data to a central database for analysis. More than 90% of those screened sign the form [10]. The Commission d'accès à l'information du Québec authorized the use of this database by the Ministère de la Santé et des Services sociaux du Québec (MSSS). In turn, the MSSS mandated analysis of the data by the PQDCS evaluation team.

Information on risk factors for breast cancer is routinely collected with a standardized questionnaire and captured in the province-wide PQDCS information system, which contains the following variables on participants: age, parity, age at first childbirth, menopausal status, age at menopause, history of breast cancer among first-degree relatives, use of hormone replacement therapy, height, weight, mammographic rating of breast density, previous mammography, history of breast reduction or implant, and previous breast aspiration or biopsy.

We took into account several characteristics of the radiologists: sex, year of certification in radiology, number of practices and type (hospital or private clinic), annual volume of screenings, abnormal-recall rate and abnormal-recall rate of colleagues. Sex and year of certification were derived from the Quebec medical directory for the year 1999/2000 [11]. Other variables were extracted from the PQDCS information system. For a given radiologist, type of practice and abnormal-recall rate of colleagues were those of the centre where he or she performed the most examinations each year.

We estimated a radiologist's screening volume from the total number of screenings interpreted by the radiologist and the time between the first and last readings, as registered in the PQDCS information system for that individual during the study period. The average monthly volume was then projected over 12 months. For example, a radiologist who interpreted 2000 screenings between Sept. 1, 1998, and Dec. 31, 2000 (28 months), had an estimated annual volume of 857. We estimated the annual screening volume of each facility similarly, using data in the PQDCS information system.
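The projection described above amounts to a simple rescaling. The following sketch reproduces the worked example; the function names are hypothetical, and counting activity in whole calendar months (inclusive of the first and last month) is our assumption about how the 28 months in the example were obtained.

```python
def months_inclusive(first, last):
    """Number of calendar months from first to last, both given as (year, month).

    Assumption: the first and last months both count as full months.
    """
    return (last[0] - first[0]) * 12 + (last[1] - first[1]) + 1


def annualized_volume(total_screenings, months_active):
    """Project the average monthly volume over 12 months."""
    if months_active <= 0:
        raise ValueError("months_active must be positive")
    return total_screenings / months_active * 12


# Worked example from the text: 2000 screenings, Sept. 1998 to Dec. 2000.
months = months_inclusive((1998, 9), (2000, 12))
volume = annualized_volume(2000, months)

print(months)         # 28
print(round(volume))  # 857
```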

We analyzed the data for the 307 314 eligible women with a case–control approach. First, we analyzed breast cancer detection rates by comparing the 1709 women with screening-detected breast cancer and a 10% simple random sample of those without such cancer (n = 30 560), as no greater precision was offered by use of a larger sample. Second, we analyzed false-positive rates by comparing the 3159 women with false-positive readings and the 27 401 others in the random sample. In these comparisons, the odds ratios can be interpreted as approximations of the breast cancer detection rate ratios and the false-positive rate ratios [12,13].
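To illustrate why an odds ratio from such a design approximates the rate ratio, the sketch below compares the two in a hypothetical cohort. All counts and function names are invented for illustration and are not PQDCS data; "exposed" simply stands for one volume category. Expected control counts are used instead of an actual random draw so that the result is deterministic.

```python
def cohort_rate_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Rate ratio computed in the full cohort."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def sampled_odds_ratio(cases_exp, n_exp, cases_unexp, n_unexp, fraction):
    """Odds ratio using all cases and a random fraction of the non-cases.

    Expected control counts are used, so the sampling fraction cancels out.
    """
    controls_exp = fraction * (n_exp - cases_exp)
    controls_unexp = fraction * (n_unexp - cases_unexp)
    return (cases_exp * controls_unexp) / (cases_unexp * controls_exp)


# Hypothetical cohort: 100 000 women per group, 640 vs. 500 cancers detected.
rr = cohort_rate_ratio(640, 100_000, 500, 100_000)
or_10pct = sampled_odds_ratio(640, 100_000, 500, 100_000, 0.10)
or_full = sampled_odds_ratio(640, 100_000, 500, 100_000, 1.0)

print(round(rr, 3), round(or_10pct, 3), round(or_full, 3))  # 1.28 1.282 1.282
```

In this sketch the odds ratio does not depend on the sampling fraction, and it stays close to the rate ratio because the outcome is uncommon. The approximation is expected to be looser for more frequent outcomes.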

We used logistic regression for the multivariate analyses. We evaluated the association of radiologists' screening volume with breast cancer detection and false-positive rates in models that included characteristics of women and radiologists as covariates. Trends in outcomes according to screening volume were tested using the median value within each category of this variable. We used the same strategy to test the association of facilities' screening volumes with breast cancer detection and false-positive rates. Finally, we assessed the independent contribution of the radiologist and the screening facility by fitting models that included both variables simultaneously, after ruling out collinearity. In accordance with the multilevel structure of the data [14], we adjusted the confidence intervals (CIs) for the intrareader and intraclinic correlation. We used a 5% level of statistical significance throughout, and all tests were 2-sided.
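The median-based trend scoring can be sketched as follows. Only the lowest (fewer than 250) and highest (1500 or more) radiologist categories are stated in the text; the intermediate cut-points, the list of volumes and the function names below are hypothetical. In a trend test of this kind, the resulting score would presumably enter the logistic model as a single continuous term in place of the category indicators.

```python
from bisect import bisect_right
from statistics import median

# 250 and 1500 come from the text; 500 and 1000 are hypothetical cut-points.
CUT_POINTS = [250, 500, 1000, 1500]


def category(volume):
    """Index of the volume category: 0 is < 250, 4 is >= 1500."""
    return bisect_right(CUT_POINTS, volume)


def median_scores(volumes):
    """Median annual volume observed within each category."""
    by_category = {}
    for v in volumes:
        by_category.setdefault(category(v), []).append(v)
    return {cat: median(vals) for cat, vals in sorted(by_category.items())}


# Hypothetical annual volumes for 11 radiologists.
volumes = [16, 120, 240, 300, 450, 581, 857, 990, 1200, 1600, 2878]
scores = median_scores(volumes)

# Trend covariate: each radiologist receives the median of his or her category.
trend_covariate = [scores[category(v)] for v in volumes]
print(scores)
```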

Results

The distribution of most breast cancer risk factors was similar among the 1709 women with screening-detected breast cancer (1297 invasive and 349 in situ cancers; unknown status in the other 63 cases) and the 30 560 women without this disease (Table 1). For instance, the mean age at screening was 59 and 58 years, the mean body mass index 27 and 26 kg/m² and the mean age at first childbirth 25 and 24 years, respectively. The proportions of women with a family history of breast cancer, with a previous breast aspiration or biopsy and with dense breasts were higher among the women with than among those without breast cancer.

Table 1.

Likewise, the 3159 women with false-positive mammograms were similar to the 27 401 women with normal mammograms with respect to most breast cancer risk factors (Table 1) except that, compared with the women who had normal readings, fewer of the women with false-positive readings reported previous mammography, more reported a previous breast aspiration or biopsy, and more had dense breasts.

The screening mammograms were interpreted by 275 radiologists working in 68 accredited facilities. Most of the radiologists (73%) were male, and 35% had received their radiology certification recently (between 1990 and 2000). About two-thirds (65%) were working primarily in private clinics. About half (53%) screened at a single facility; 6% screened at 4 or more facilities. The mean annual screening volume was 581 (range 16–2878) per radiologist and 2279 (range 307–7386) per facility.

Breast cancer detection rate ratios and 95% CIs, according to screening volume of either radiologists or facilities, are shown in Table 2. The radiologists' caseload did not seem to influence their ability to detect invasive or in situ breast cancer. The same was true for small invasive cancers (1 cm or less in diameter) (data not shown). By contrast, cancer detection was associated with the number of screenings performed in the facility. For all types of cancer combined, the adjusted detection rate ratio comparing the facilities performing at least 4000 screenings per year with those performing fewer than 2000 was 1.28 (95% CI 1.07–1.52). The trend according to facility size was highly statistically significant (p = 0.004).

Table 2.

The false-positive rate ratio decreased significantly (p value for trend = 0.001) with increasing screening volume of radiologists (Table 3). The false-positive rate ratio comparing radiologists reading 1500 or more screenings per year with those reading fewer than 250 was 0.53 (95% CI 0.35–0.79). By contrast, the screening volume of facilities was not associated with false-positive readings: the false-positive rate ratio was 1.20 (95% CI 0.94–1.51) for those performing 4000 or more screenings per year compared with those performing fewer than 2000.
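As a reading aid, the two intervals above can be related to their point estimates. The sketch assumes Wald-type intervals that are symmetric on the log scale, which the text does not state explicitly; the function names are ours. Under that assumption, the point estimate lies near the geometric mean of the limits, and an approximate standard error and 2-sided p value for each pairwise comparison (not the trend test) can be recovered.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a 2-sided 95% interval


def implied_log_se(lower, upper):
    """Standard error on the log scale implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def geometric_midpoint(lower, upper):
    """Point estimate implied by an interval symmetric on the log scale."""
    return math.sqrt(lower * upper)


def two_sided_p(estimate, lower, upper):
    """Approximate 2-sided p value for the null hypothesis of a ratio of 1."""
    z = math.log(estimate) / implied_log_se(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2))


# Radiologist volume: 0.53 (95% CI 0.35-0.79), as reported above.
radiologist_mid = geometric_midpoint(0.35, 0.79)
radiologist_p = two_sided_p(0.53, 0.35, 0.79)

# Facility volume: 1.20 (95% CI 0.94-1.51), as reported above.
facility_mid = geometric_midpoint(0.94, 1.51)
facility_p = two_sided_p(1.20, 0.94, 1.51)

print(round(radiologist_mid, 2), radiologist_p < 0.05)
print(round(facility_mid, 2), facility_p < 0.05)
```

Small differences between the geometric midpoint and the reported estimate are expected because the published limits are rounded to 2 decimal places.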

Table 3.

Table 4 shows the combined association of radiologists' and facilities' caseloads with performance. Radiologists who worked in facilities performing a greater number of screenings per year had higher detection rates than those who worked in facilities performing fewer, and this was true for all radiologists working in high-volume facilities, irrespective of their individual screening volume. In contrast, the false-positive rates decreased with increasing radiologist caseload, and this trend was clearer among those who worked in larger facilities.

Table 4.

Interpretation

Our results support the notion that radiologists' and facilities' screening volumes are independently associated with performance and that their effects may be complementary. In our analysis, radiologists who read larger numbers of screening mammograms and worked in facilities performing larger numbers of screenings tended to have higher breast cancer detection rates while maintaining lower false-positive rates than radiologists who performed fewer readings and worked in facilities performing fewer screenings.

A few studies [15–19] have examined the association of radiologists' interpretation volume with their performance. Four of them [15–18] were conducted as experiments; since performance in a test environment appears to be unrelated to performance in actual practice [20,21], the relevance of the results of these studies to our purposes is questionable. Only Kan and colleagues [19] evaluated the relation of radiologists' reading volume with performance in an operating population-based breast cancer screening program, like ours. They showed a trend, although not significant, toward improved cancer detection with greater reading volume. They also showed a U-shaped relation between abnormal-recall rate (a proxy for false-positive rate) and radiologists' screening-mammography volume. In that study, conducted in British Columbia, the smallest and largest categories of annual screening volume were < 2000 and 4000–5199; in our analysis the corresponding volumes were < 250 and 1500–2878. In addition, Kan and colleagues could adjust only for age and history of previous screening, and they did not study the relation of facility screening volume to performance.

Our data suggest an association between a centre's screening volume and its cancer detection rate but not its false-positive rate. Blanks and associates [22] observed a significant increase in invasive breast cancer detection rates and a decrease in abnormal-recall rates with increasing facility screening volume. Yankaskas and coworkers [23] also observed lower recall rates with increasing facility screening volume but saw no change in sensitivity. Neither group took into account radiologists' reading volume.

Our study had some limitations. First, there were few radiologists or facilities with large numbers of screenings, and this limited our ability to compare our results with those in settings with higher radiologist or facility screening volumes. Second, our study covered only the initial period of the PQDCS (May 1998 to December 2000), which might not be representative of current functioning. Third, only screening-mammography volumes of radiologists and facilities were available to us, whereas accreditation criteria are based on both screening and diagnostic experience. Finally, other important determinants of performance, such as double reading, participation in teaching or research, daily quality-control procedures within facilities and specific training of the radiologists in the interpretation of screening mammograms, could not be taken into account.

In conclusion, radiologists' and facilities' screening volumes appear to have independent and complementary influences on performance as measured by rates of breast cancer detection and false-positive readings. Although the average crude breast cancer detection rates met the minimum Quebec standard of 5 cancers per 1000 [7] in all categories of volume shown for radiologists and facilities, the overall performance of screening mammography seems to be maximized when screenings are performed in larger centres and when, in these centres, mammograms are read by radiologists who interpret a large volume of films.

See related article page 210

Footnotes

This article has been peer reviewed.

Contributors: All of the authors participated in the planning and design of the study, the data interpretation and critical revision of the manuscript; all approved the version to be published. Isabelle Théberge was responsible for statistical analysis.

Acknowledgements: We thank Drs. Michel-Pierre Dufresne, Sylvie Groleau, Michel Petitclerc, Robert Pronovost and Guy Roy for their helpful comments. We also thank members of the working group for the evaluation of the Quebec Breast Cancer Screening Program for their valuable suggestions.

Competing interests: None declared.

Correspondence to: Dr. Jacques Brisson, Direction des systèmes de soins et services, Institut national de santé publique du Québec, 945, ave. Wolfe, 5e étage, Sainte-Foy QC G1V 5B3; fax 418 682-7949; jacques.brisson@uresp.ulaval.ca

References

1. Hillner BE, Smith TJ, Desch CE. Hospital and physician volume or specialization and outcomes in cancer treatment: importance in quality of cancer care. J Clin Oncol 2000;18:2327-40.
2. Birkmeyer JD, Siewers AE, Finlayson EV, Stukel TA, Lucas FL, Batista I, et al. Hospital volume and surgical mortality in the United States. N Engl J Med 2002;346:1128-37.
3. Hébert-Croteau N, Brisson J, Pineault R. Review of organizational factors related to care offered to women with breast cancer. Epidemiol Rev 2000;22:228-38.
4. Hannan EL. The relation between volume and outcome in health care. N Engl J Med 1999;340:1677-9.
5. Birkmeyer JD, Stukel TA, Siewers AE, Goodney PP, Wennberg DE, Lucas FL. Surgeon volume and operative mortality in the United States. N Engl J Med 2003;349:2117-27.
6. Halm EA, Lee C, Chassin MR. Is volume related to outcome in health care? A systematic review and methodologic critique of the literature. Ann Intern Med 2002;137:511-20.
7. Association canadienne des radiologistes. Programme d'agrément en mammographie. Ville Saint-Laurent (QC): l'Association; 2000. p. 5.
8. Ministère de la santé et des services sociaux. Programme québécois de dépistage du cancer du sein. Cadre de référence. Québec: Gouvernement du Québec; 1996. p. 14.
9. Théberge I, Major D, Langlois A, Brisson J. Validation de stratégies pour obtenir le taux de détection du cancer, la valeur prédictive positive, la proportion des cancers in situ, la proportion des cancers infiltrants de petite taille et la proportion des cancers infiltrants sans envahissement ganglionnaire dans le cadre des données fournies par le Programme québécois de dépistage du cancer du sein (PQDCS). Sainte-Foy (QC): Institut national de santé publique du Québec; 2003. p. 17.
10. Ministère de la santé et des services sociaux. Programme québécois de dépistage du cancer du sein. Rapport d'activités des années 1998 et 1999. Québec: Gouvernement du Québec; 2001. p. 15.
11. Collège des médecins du Québec. Annuaire médical 1999-2000. Montréal: le Collège; 1998.
12. Cornfield J. A method of estimating comparative rates from clinical data; applications to cancer of the lung, breast, and cervix. J Natl Cancer Inst 1951;11:1269-75.
13. Miettinen O. Estimability and estimation in case-referent studies. Am J Epidemiol 1976;103:226-35.
14. Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health 2000;21:171-92.
15. Beam CA, Conant EF, Sickles EA. Association of volume and volume-independent factors with accuracy in screening mammogram interpretation. J Natl Cancer Inst 2003;95:282-90.
16. Elmore JG, Wells CK, Howard DH. Does diagnostic accuracy in mammography depend on radiologists' experience? J Womens Health 1998;7:443-9.
17. Ciatto S, Ambrogetti D, Catarzi S, Morrone D, Rosselli DT. Proficiency test for screening mammography: results for 117 volunteer Italian radiologists. J Med Screen 1999;6:149-51.
18. Esserman L, Cowley H, Eberle C, Kirkpatrick A, Chang S, Berbaum K, et al. Improving the accuracy of mammography: volume and outcome relationships. J Natl Cancer Inst 2002;94:369-75.
19. Kan L, Olivotto IA, Warren Burhenne LJ, Sickles EA, Coldman AJ. Standardized abnormal interpretation and cancer detection ratios to assess reading volume and reader performance in a breast screening program. Radiology 2000;215:563-7.
20. Rutter CM, Taplin S. Assessing mammographers' accuracy. A comparison of clinical and test performance. J Clin Epidemiol 2000;53:443-50.
21. Elmore JG, Miglioretti DL, Carney PA. Does practice make perfect when interpreting mammography? Part II. J Natl Cancer Inst 2003;95:250-2.
22. Blanks RG, Bennett RL, Wallis MG, Moss SM. Does individual programme size affect screening performance? Results from the United Kingdom NHS breast screening programme. J Med Screen 2002;9:11-4.
23. Yankaskas BC, Cleveland RJ, Schell MJ, Kozar R. Association of recall rates with sensitivity and positive predictive values of screening mammography. Am J Roentgenol 2001;177:543-9.