Open Access
01-12-2015 | Research article
Inter-observer agreement according to three methods of evaluating mammographic density and parenchymal pattern in a case control study: impact on relative risk of breast cancer
Authors:
Rikke Rass Winkel, My von Euler-Chelpin, Mads Nielsen, Pengfei Diao, Michael Bachmann Nielsen, Wei Yao Uldall, Ilse Vejborg
Published in: BMC Cancer, Issue 1/2015
Abstract
Background
Mammographic breast density and parenchymal patterns are well-established risk factors for breast cancer. We aimed to report inter-observer agreement on three different subjective ways of assessing mammographic density and parenchymal pattern, and secondarily to examine what potential impact reproducibility has on relative risk estimates of breast cancer.
Methods
This retrospective case–control study included 122 cases and 262 age- and time-matched controls (765 breasts) drawn from a 2007 screening cohort of 14,736 women with negative screening mammograms from Bispebjerg Hospital, Copenhagen. Digitised, randomized film-based mammograms were classified independently by two readers according to two radiological visual classifications (BI-RADS and Tabár) and a computerized interactive threshold technique measuring area-based percent mammographic density (denoted PMD). Kappa statistics, the Intraclass Correlation Coefficient (ICC) (equivalent to weighted kappa), Pearson’s linear correlation coefficient and limits-of-agreement analysis were used to evaluate inter-observer agreement. High/low-risk agreement was also determined by defining the following categories as high-risk: BI-RADS’s D3 and D4, Tabár’s PIV and PV, and the upper two quartiles (within density range) of PMD. The relative risk of breast cancer was estimated using logistic regression to calculate odds ratios (ORs) adjusted for age, which were compared between the two readers.
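For readers unfamiliar with the agreement statistics named above, the following is a minimal sketch of two of them: unweighted Cohen's kappa for categorical readings and limits of agreement for a continuous measure such as PMD. It is not the authors' analysis code; the function names `cohen_kappa` and `limits_of_agreement` and the example readings are hypothetical, and the abstract does not state which kappa variant or software the study used.

```python
from collections import Counter
from statistics import mean, stdev


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same, non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: share of cases where both readers chose the same category.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of the readers' marginal proportions, summed over categories.
    freq_a, freq_b = Counter(reader_a), Counter(reader_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (observed - expected) / (1.0 - expected)


def limits_of_agreement(measure_a, measure_b):
    """Mean difference (bias) and 95% limits of agreement for paired continuous readings."""
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias - spread, bias, bias + spread


# Hypothetical BI-RADS readings (not data from the study).
reader_1 = ["D1", "D2", "D2", "D3", "D4", "D3"]
reader_2 = ["D1", "D2", "D3", "D3", "D4", "D2"]
print(cohen_kappa(reader_1, reader_2))

# Hypothetical PMD readings in percent (not data from the study).
print(limits_of_agreement([12.0, 25.5, 40.0, 8.0], [10.5, 27.0, 38.0, 9.0]))
```

Unweighted kappa treats every disagreement as equally severe. On an ordinal scale such as BI-RADS, a weighted kappa or the ICC also credits near misses, which fits the higher ICC value reported in the Results.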
Results
Substantial inter-observer agreement was seen for BI-RADS and Tabár (κ=0.68 and 0.64, respectively), and agreement was almost perfect when ICC was calculated for the ordinal BI-RADS scale (ICC=0.88) and the continuous PMD measure (ICC=0.93). The two readers assigned 5% (PMD), 10% (Tabár) and 13% (BI-RADS) of the women to different high/low-risk categories. The impact of inter-reader variability on the relative risk of breast cancer estimated by the two readers differed on a multiple-category scale, but not on a high/low-risk scale. Tabár’s pattern IV demonstrated the highest ORs of all density patterns investigated.
Conclusions
Our study shows that the Tabár classification has inter-observer reproducibility comparable to that of well-tested density methods, and confirms the association between Tabár’s PIV and breast cancer. Despite comparably high inter-observer agreement for all three methods, the impact on ORs for breast cancer seems to differ according to the density scale used. Automated computerized techniques are needed to fully overcome the impact of subjectivity.