Abstract
Background A set of nomograms based on the Dubbo Osteoporosis Epidemiology Study predicts the five- and ten-year absolute risk of fracture using age, bone mineral density and history of falls and low-trauma fracture. We assessed the discrimination and calibration of these nomograms among participants in the Canadian Multicentre Osteoporosis Study.
Methods We included participants aged 55–95 years for whom bone mineral density measurement data and at least one year of follow-up data were available. Self-reported incident fractures were identified by yearly postal questionnaire or interview (years 3, 5 and 10). We included low-trauma fractures before year 10, except those of the skull, face, hands, ankles and feet. We used a Cox proportional hazards model.
Results Among 4152 women, there were 583 fractures, with a mean follow-up time of 8.6 years. Among 1606 men, there were 116 fractures, with a mean follow-up time of 8.3 years. Increasing age, lower bone mineral density, prior fracture and prior falls were associated with increased risk of fracture. For low-trauma fractures, the concordance between predicted risk and fracture events (Harrell C) was 0.69 among women and 0.70 among men. For hip fractures, the concordance was 0.80 among women and 0.85 among men. The observed fracture risk was similar to the predicted risk in all quintiles of risk except the highest quintile of women, where it was lower. The net reclassification index (19.2%, 95% confidence interval [CI] 6.3% to 32.2%) favours the Dubbo nomogram over the current Canadian guidelines for men.
Interpretation The published nomograms provide good fracture-risk discrimination in a representative sample of the Canadian population.
Current recommendations for the treatment of osteoporosis are in transition. The T-score-based definition of osteoporosis and osteopenia by the expert committee of the World Health Organization on bone mineral density has been used in many guidelines to set intervention thresholds for treatment. However, studies have consistently reported that the highest number of fractures in a given population occurs in those with osteopenic or normal bone mineral density.1,2 In fact, the National Osteoporosis Foundation has singled out people with osteopenic bone mineral density as a population in which assessment for fracture risk is merited.3
Nevertheless, appropriate prevention and treatment strategies for such people are uncertain.4 Recent developments include the assessment of absolute fracture risk based on bone mineral density and other risk factors. Current Canadian methodology determines categorical risk based on age, sex, T-score, fracture history and glucocorticoid use.5 These criteria were derived from Swedish data, but have been assessed and validated in a cohort of Manitoba women.6 Newer nomograms based on the Australian cohort of the Dubbo Osteoporosis Epidemiology Study7 are now available for the calculation of low-trauma hip fracture8 and any fracture.9 These nomograms provide continuous estimates for five- and ten-year absolute fracture risk in both men and women (available at http://fractureriskcalculator.com). The use of factors in addition to bone mineral density may provide a better assessment of fracture risk for people who are near the T-score thresholds and facilitate decisions regarding therapeutic intervention.
A key step in the development of any prediction model is the assessment of its validity.10 The aim of our study was to assess the performance of the Australian-derived nomogram among community-dwelling Canadians aged 55–95 years. The first part of this assessment was a comparison of the nomogram model with a model that used the same variables but was fitted to data from a Canadian population: participants in the Canadian Multicentre Osteoporosis Study (www.camos.org). The second part involved computing the calibration and discrimination of the nomogram in this Canadian cohort. The final part was a comparison of the nomogram-based risk assessments with the existing Canadian risk classification system.
Methods
Participants
The study sample included all participants in the Canadian Multicentre Osteoporosis Study cohort who were 55–95 years old at baseline, who had undergone measurement of bone mineral density and had at least one year of follow-up data. Of 6539 women and 2884 men in the entire baseline cohort, 4940 women and 1883 men met the age criteria; 788 women and 277 men were excluded based on missing data, leaving a study sample of 4152 women and 1606 men. Details of how the original cohort was randomly sampled from nine Canadian communities have been previously published11 and are available at www.camos.org. Written informed consent was obtained from all participants and ethics approval was granted through McGill University and the ethics review boards for each participating centre.
Data collection
At baseline, all participants were given an interviewer-administered questionnaire, which determined their demographic characteristics, medical history and risk factors for fractures. Follow-up visits were scheduled in year 3 for participants aged 40 to 60 years only and in years 5 and 10 for all participants. In all other years, a self-administered, fracture-related questionnaire was mailed. We included incident fragility fractures that occurred between baseline and the tenth annual (2005–2006) follow-up.
Nomogram risk factors
The Dubbo nomograms were derived using model selection, and the final model included age, bone mineral density T-score of the femoral neck, number of prior fractures (after age 50) and number of falls in the previous year.8,9 For this study, bone mineral density T-scores were based on published reference standards for Canadians.12 All Lunar measurements were converted to equivalent Hologic values (g/cm²) using standard reference formulas.13 Detailed densitometer quality control is described elsewhere.14 We used number of falls in the preceding one-month (as opposed to one-year) period because this was the period examined in the CaMos questionnaire administered at baseline. The Dubbo nomogram for estimating five- and ten-year risk of hip fracture and any fracture in women and men is shown in Appendix 1 (available at www.cmaj.ca/cgi/content/full/cmaj.100458/DC1).
Fracture assessment
Self-reported incident fractures were identified by annual follow-up and confirmed by structured interview (telephone or in-person). Information that was gathered included fracture site, circumstance, treatment, and x-ray or medical report (if available). As in the Dubbo Osteoporosis Epidemiology Study, we included only low-trauma fractures (i.e., without trauma or caused by a fall from standing height or less) and excluded fractures of the skull, face, hands, ankles and feet.
Risk category comparison
Risk categories based on the predicted 10-year osteoporotic fracture risk in the Dubbo low-trauma fracture nomogram were as follows: low risk = 0%–10%, moderate = 10%–20% and high ≥ 20%. Categories based on the T-score thresholds of the World Health Organization were low risk = T-score > −1 with no prior fracture, moderate = −2.5 < T-score ≤ −1 with no prior fracture, and high = T-score ≤ −2.5 or prior osteoporotic fracture. Ten-year fracture risk categories in the Canadian guidelines were as follows: low risk = 0%–10%, moderate = 10%–20% and high > 20%,5 and were derived from age, minimum T-score (lumbar spine, total hip, femoral neck, trochanter),15 glucocorticoid use and history of fracture after age 40. We used current systemic glucocorticoid use together with at least three months of total use as a proxy for three months in the past year.
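The Dubbo and World Health Organization category rules described above can be written as two small functions. This is a minimal sketch: the function names are hypothetical, and because the published ranges share the 10% boundary, the sketch assumes that a predicted risk of exactly 10% counts as moderate. The Canadian guideline categories are not reproduced because they depend on lookup tables that are not given in the text.

```python
def dubbo_category(risk_10yr_pct: float) -> str:
    """Categorize a predicted 10-year fracture risk expressed in percent."""
    if risk_10yr_pct >= 20:
        return "high"
    if risk_10yr_pct >= 10:  # assumption: exactly 10% is treated as moderate
        return "moderate"
    return "low"


def who_category(t_score: float, prior_fracture: bool) -> str:
    """Categorize by WHO T-score thresholds and prior osteoporotic fracture."""
    if prior_fracture or t_score <= -2.5:
        return "high"
    if t_score <= -1:  # -2.5 < T-score <= -1 with no prior fracture
        return "moderate"
    return "low"


if __name__ == "__main__":
    print(dubbo_category(16.2), who_category(-1.8, False))
```

The two schemes can disagree for the same person, for example when a normal T-score is combined with a prior fracture, which is why cross-tabulation of categories is informative.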
Statistical methods
We used Cox proportional hazards regression to develop a comparison model based on study participants, with either hip fracture or low-trauma fracture as the outcome. Model entry was the time of study enrolment, and model exit was the first of the following: 10-year follow-up, fracture, loss to follow-up or death. Model diagnostic tests included assessment of linearity of continuous variables, proportional hazards and overall model fit.
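The entry and exit rule for the survival model can be illustrated with a short sketch. The `Participant` record and its field names are hypothetical, times are measured in years from enrolment, and the sketch assumes that a fracture tied with another exit reason is still counted as an event.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_FOLLOW_UP_YEARS = 10.0  # administrative end of follow-up


@dataclass
class Participant:
    # Years from enrolment to each occurrence; None if it never happened.
    fracture_time: Optional[float] = None
    death_time: Optional[float] = None
    lost_time: Optional[float] = None


def exit_time_and_event(p: Participant) -> Tuple[float, bool]:
    """Return (follow-up time, fracture indicator) for one participant."""
    candidates = [MAX_FOLLOW_UP_YEARS]
    for t in (p.fracture_time, p.death_time, p.lost_time):
        if t is not None:
            candidates.append(t)
    exit_time = min(candidates)
    # The fracture is an event only if it is what ended follow-up
    # and it happened before the 10-year limit.
    event = (
        p.fracture_time is not None
        and p.fracture_time == exit_time
        and exit_time < MAX_FOLLOW_UP_YEARS
    )
    return exit_time, event
```

Participants who die, are lost or reach year 10 without a fracture contribute their follow-up time as censored observations.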
Validation of all models included both discriminative ability and calibration. Discriminative ability was assessed using the Harrell C, a statistic for survival analysis that is analogous to the C-statistic or the area under the curve as used for diagnostic tests.16 Calibration was assessed by dividing the cohort into quintiles according to risk predicted by the nomogram and comparing the observed risk and predicted risk. We used Kaplan–Meier methods to compute observed risk. Cross-tabulation of categories, the Aickin kappa17 and the net reclassification index18 were used for all comparisons. The net reclassification index is a measure indicating the likelihood that people were correctly reclassified (i.e., that events were reclassified as higher risk and non-events reclassified as lower risk) versus incorrectly reclassified (i.e., that events were reclassified as lower risk and non-events reclassified as higher risk). In simulated data, adding a biomarker with an odds ratio of 2 to an existing classification resulted in a net reclassification index of 5%.19
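To make the concordance statistic concrete, the following is a minimal sketch of Harrell's C for right-censored data. It is an illustration rather than the software used in the study; the function name and the toy data are hypothetical, and pairs with tied follow-up times are simply ignored.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair is usable when the subject with the shorter follow-up time
    had an observed event. The pair is concordant when that subject
    also has the higher predicted risk; tied risks count as one half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier subject of a usable pair must have an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


if __name__ == "__main__":
    times = [2, 4, 6, 8]           # years of follow-up (toy data)
    events = [1, 1, 0, 1]          # 1 = fracture observed, 0 = censored
    risk = [0.9, 0.6, 0.7, 0.1]    # predicted risk
    print(harrell_c(times, events, risk))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking of who fractures first.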
Results
The baseline characteristics of participants are shown in Table 1. Among 4152 women, there were 583 fractures with a mean follow-up time of 8.6 years. Among 1606 men, there were 116 fractures with a mean follow-up time of 8.3 years. Distribution of first fractures by skeletal site was as follows: forearm or wrist 174, ribs 133, upper arm or shoulder 100, hip 97, spine 89, leg 52, pelvis 27, multiple sites 27. We obtained documented confirmation of 78% of these fractures. The overall 10-year observed risk for low-trauma fracture was 16.2% (95% CI 15.0 to 17.5) among women and 8.7% (95% CI 7.26 to 10.4) among men. These values were slightly lower than the Dubbo nomogram mean predicted risk, which was 18.3% among women and 11.8% among men. The 10-year observed risk for hip fracture among women was 2.8% (95% CI 2.3 to 3.4), which was lower than the Dubbo nomogram mean predicted risk of 5.6%. The 10-year observed risk for hip fracture among men was 2.4% (95% CI 1.7 to 3.5), which was similar to the mean predicted risk of 2.6%.
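Observed risks of this kind come from Kaplan–Meier methods. A minimal sketch of the estimator, returning cumulative risk (one minus survival) at a chosen horizon, is shown here; the function name and toy data are hypothetical and no confidence interval is computed.

```python
def kaplan_meier_risk(times, events, horizon):
    """Kaplan-Meier estimate of cumulative event risk by `horizon`.

    times  : follow-up time for each subject
    events : 1 if the subject had the event at that time, 0 if censored
    """
    # Distinct event times up to the horizon, in increasing order.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    survival = 1.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - failures / at_risk
    return 1.0 - survival


if __name__ == "__main__":
    # Toy data: events at years 1 and 3, censoring at years 2, 4 and 5.
    print(kaplan_meier_risk([1, 2, 3, 4, 5], [1, 0, 1, 0, 0], horizon=10))
```

Unlike a simple proportion, the estimate accounts for participants who leave follow-up before year 10.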
A comparison of the model coefficients based on the Canadian Multicentre Osteoporosis Study cohort and the Dubbo Osteoporosis Epidemiology Study cohort is shown in Table 2. The association between prior fracture and future hip fracture or low-trauma fracture was notably stronger in the Dubbo cohort. Overall individual risk profiles derived from the Canadian and Australian models were similar despite the modest difference in some model parameters. Pearson correlations between log hazards were r = 0.96 (women) and r = 0.82 (men) for low-trauma fracture and r = 0.93 (women) and r = 0.99 (men) for hip fracture.
For low-trauma fractures, the concordance (expressed as Harrell’s C) between predicted risk as assessed by the Dubbo nomogram and fracture outcomes in the study sample was C = 0.69 among women and C = 0.70 among men. For hip fractures, the concordance was C = 0.80 among women and C = 0.85 among men. Similar concordance was found when comparing risk derived from the study sample model and fracture outcomes. Calibration plots are shown in Figure 1. The observed low-trauma fracture risk was lower than the predicted risk in the highest quintile for both men and women. The observed hip fracture risk was also lower than the predicted risk in the highest quintile for women, but not for men. The observed number of hip fractures among women and the observed number of low-trauma fractures among men were both slightly lower than the predicted values across quintiles as well.
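The quintile-based calibration comparison can be sketched as follows. This is a simplified illustration: the function name is hypothetical, at least five subjects are assumed, and the observed risk in each fifth is taken as a plain proportion of events, whereas the study computed observed risk with Kaplan–Meier methods.

```python
def calibration_by_quintile(predicted, observed_event):
    """Compare mean predicted risk with observed event proportion by fifth.

    Returns a list of (quintile number, mean predicted risk, observed
    proportion) tuples, ordered from lowest to highest predicted risk.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    rows = []
    for q in range(5):
        idx = order[q * n // 5:(q + 1) * n // 5]
        mean_pred = sum(predicted[i] for i in idx) / len(idx)
        obs = sum(1 for i in idx if observed_event[i]) / len(idx)
        rows.append((q + 1, mean_pred, obs))
    return rows


if __name__ == "__main__":
    predicted = [i / 10 for i in range(10)]        # toy predicted risks
    events = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]        # toy outcomes
    for quintile, mean_pred, obs in calibration_by_quintile(predicted, events):
        print(quintile, round(mean_pred, 2), obs)
```

A well-calibrated model gives observed values close to the mean predicted risk in every fifth; overprediction in the top fifth shows up as an observed value below the predicted one.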
A comparison of the risk classification based on the Dubbo low-trauma fracture nomogram with that based on established T-score thresholds of the World Health Organization is shown in Appendix 2 (available at www.cmaj.ca/cgi/content/full/cmaj.100458/DC1). Overall, 58.8% (κ = 0.38) of men and 77.4% (κ = 0.66) of women were in the same fracture risk category, with stronger consistency among women. The nomogram reclassified more men as lower risk than as higher risk (26.7% v. 14.4%) but more women as higher risk than as lower risk (15.4% v. 7.2%). The net reclassification index favouring the Dubbo nomogram was equal to 6.7% (95% CI −6.0% to 19.4%) among men and 1.5% (95% CI −2.6% to 5.6%) among women, but neither was significant.
We also compared the risk classification based on the Dubbo low-trauma fracture nomogram with that appearing in Canadian guidelines.5 The underlying thresholds based on age and T-score are similar among women (Table 3), but radically different among men. Under both schemes, 68.3% of men (κ = 0.47) were in the same fracture risk category, compared to 79.1% (κ = 0.69) of women (Appendix 3, available at www.cmaj.ca/cgi/content/full/cmaj.100458/DC1). More men were reclassified as higher risk than as lower risk (30.2% v. 1.5%), which resulted in a drastic shift in risk prevalence and significant improvement in risk classification with a net reclassification index equal to 19.2% (95% CI 6.3% to 32.2%). The reclassification for women resulted in a slight decline in risk classification with a net reclassification index equal to −5.5% (95% CI −9.5% to −1.5%).
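The categorical net reclassification index reported here can be illustrated with a short sketch. The function name and toy data are hypothetical, and the sketch treats each participant as a simple event or non-event, without any adjustment for censoring.

```python
LEVEL = {"low": 0, "moderate": 1, "high": 2}


def net_reclassification_index(old_cat, new_cat, events):
    """Categorical NRI of a new classification relative to an old one.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    """
    up_e = down_e = n_e = 0
    up_ne = down_ne = n_ne = 0
    for old, new, ev in zip(old_cat, new_cat, events):
        move = LEVEL[new] - LEVEL[old]
        if ev:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_ne += 1
            up_ne += move > 0
            down_ne += move < 0
    return (up_e - down_e) / n_e + (down_ne - up_ne) / n_ne


if __name__ == "__main__":
    old = ["low", "moderate", "moderate", "high",
           "low", "low", "moderate", "moderate", "high", "high"]
    new = ["moderate", "high", "moderate", "moderate",
           "low", "moderate", "low", "low", "high", "moderate"]
    events = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    print(net_reclassification_index(old, new, events))
```

A positive value means that, on balance, the new scheme moved people with fractures upward and people without fractures downward; a negative value means the opposite.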
The prevalence of high fracture risk based on the three risk classification schemes is shown in Figure 2. The strongest age gradient in high-risk prevalence was seen with the Dubbo nomogram, which identified roughly 90% of men and women older than 80 years as high risk. For women, the overall prevalence of high risk was similar using the Dubbo nomogram or Canadian guidelines, but for men it was highest using the nomogram. We found that 40% of men with osteoporotic fractures were identified as high risk using the nomogram, which was notably higher than the percentage identified using other risk assessments.
Interpretation
Our study provides independent external validation of published nomograms based on the Dubbo Osteoporosis Epidemiology Study cohort of Australia for predicting absolute fracture risk in a Canadian population. The ability of the nomograms to discriminate between those who will and those who will not have hip fracture was excellent, whereas the discrimination for low-trauma fracture was more modest. The calibration of the Dubbo nomogram was very good. The main discrepancy between observed and predicted risks was in the highest quintiles. Possible explanations include treatment effects, competing mortality, the differing periods over which falls were assessed, or model shrinkage (i.e., the tendency of statistical models to overestimate the difference between low and high risk when using independent data).10 The discrepancy between predicted and observed fracture risk would not affect clinical decision-making because people in the highest quintile would be considered for treatment even with the lower observed risk.
The classification performance of categories based on the low-trauma fracture nomogram among women was similar to current Canadian guidelines,5 whereas for men there was substantially better performance. This improvement in classification was a result of better identification of men at high risk, given that the current model identifies very few of these people. Inclusion of fracture sites that better reflect osteoporosis among men may explain the substantial shift in thresholds. For women, the existing guidelines have previously been validated in a cohort of women in the province of Manitoba.6 Our analysis also shows that the continuous (rather than categorical) nomogram-based risk prediction is applicable to Canadian women and hence provides a refinement of the existing criteria.
The FRAX model is another tool for fracture-risk prediction that was constructed using a different methodology and is derived from a meta-analysis of nine studies, including the Dubbo Osteoporosis Epidemiology Study and the Canadian Multicentre Osteoporosis Study.20 The discriminative ability of the Dubbo nomogram is slightly higher than that found in the external validation of the FRAX model in most of the test cohorts;20 furthermore, only one of the test cohorts included men. Discrimination of the Dubbo nomogram is nearly the same as that of the Canadian FRAX model in a Manitoba cohort.21 The validation of the model for men is reassuring, given that an independent assessment of the FRAX model in a small sample showed poor discrimination in men.22 For women, results from the Study of Osteoporotic Fractures showed little benefit of the FRAX model compared with a model based on age, bone mineral density and previous fracture.23 The Dubbo nomogram appears robust to change in study population, as indicated by similarities between internal and external validation.8,9 The discriminative ability of the Dubbo nomogram is similar to the widely used Framingham risk score for cardiovascular events.24
There may be underlying differences in risk between cohorts beyond that of the measured risk factors, either directly attributable to fracture propensity, or indirectly as the result of competing risks. We note that the distribution of bone mineral density is roughly similar in Canada12 and Australia.25,26 However, we also note that the geographic variation in fracture does not always reflect underlying variation in bone mineral density.27 A recent comparison of surveillance rates of hip fracture among Canada, the United States and Germany has shown differences in hip fracture incidence that are both age- and sex-dependent.28
Limitations
Risk assessment is used to identify those who are above (or below) a certain level of risk for fracture. The label “high risk” should be a stimulus to physician-patient discussion of management options, but does not necessarily translate into pharmacologic interventions. We included participants who were receiving osteoporosis therapy. Gaps in calibration occurred in the direction predicted by therapy use among those at risk. The limited number of fractures among men results in uncertainty in the model parameters and in the tests of discrimination and calibration. We included all clinical fractures by self-report, which is reliable for hip and wrist fracture, but may result in some misclassification.29 Model calibration is not static, and recalibration may be necessary due to secular changes.30 Nonresponse bias may be present, but as previously shown, is minor except among those over 80 years of age.31 Finally, the results may not be generalizable to those in institutional care.
Conclusion
The Dubbo fracture risk nomogram was validated in an independent, population-based Canadian cohort of community-dwelling men and women and was shown to provide good discrimination for risk of future fracture and hip fracture. These simple nomograms for absolute fracture risk have potential to inform clinical decisions, notably those related to the large numbers of men and women with osteopenia who are at moderate risk.
Acknowledgement
The authors thank all those participants in CaMos whose careful responses and attendance made this analysis possible.
Footnotes
-
See related commentary by Bolland
-
Competing interests: Nguyen Nguyen is supported by a grant from the Australian Medical Bioinformatics Resource (AMBeR). Christopher Kovacs has served as a consultant or received grants from Amgen, Eli Lilly, GlaxoSmithKline, Merck, Novartis, Procter and Gamble, Sanofi-Aventis, Servier and Novo Nordisk. Jacqueline Center has been supported by or given educational talks for Eli Lilly, Merck Sharp and Dohme, and Sanofi-Aventis. Suzanne Morin has served as a consultant or received payment for lectures given to Warner-Chilcott, Amgen, Eli Lilly, Novartis and Merck. Robert Josse serves as a consultant for Amgen, Bayer, Eli Lilly, GlaxoSmithKline, Merck, Novartis, Procter and Gamble, Sanofi-Aventis, Servier and Wyeth-Ayerst. Jonathan Adachi served as a consultant or worked on clinical trials for Amgen, AstraZeneca, Eli Lilly, GlaxoSmithKline, Merck Frosst, Novartis, Procter and Gamble, Roche, Sanofi-Aventis, Servier, Pfizer and Wyeth-Ayerst. David Hanley serves as a consultant or has received grants from Amgen, Eli Lilly, Merck, Novartis, Procter and Gamble, Sanofi-Aventis, Servier, Wyeth-Ayerst and Nycomed. John Eisman served as a consultant or worked on clinical trials for Amgen, Eli Lilly, Merck, Novartis, Sanofi-Aventis and Servier. No competing interests declared by Lisa Langsetmo, Tuan Nguyen or Jerilynn Prior.
-
This article has been peer reviewed.
-
Contributors: Lisa Langsetmo, Tuan Nguyen, Nguyen Nguyen, Christopher Kovacs, Jerilynn Prior, David Hanley and John Eisman contributed to the design of the study. Lisa Langsetmo, Tuan Nguyen and John Eisman worked on the statistical analysis. All of the authors contributed to the interpretation of the data and the drafting and critical revision of the manuscript, and all of them approved the final version submitted for publication. Members of the Canadian Multicentre Osteoporosis Research Group were involved in the initial study design, recruitment of participants, data collection, quality control, review of projects, retention of cohort and other projects.
-
Funding: The Canadian Multicentre Osteoporosis Study was funded by the Canadian Institutes of Health Research (CIHR), Merck Frosst Canada, Eli Lilly Canada, Novartis, the Alliance for Better Bone Health (Sanofi-Aventis and Procter and Gamble Pharmaceuticals Canada), Dairy Farmers of Canada and the Arthritis Society. These funding sources had no role in the conception of this analysis, in the statistical methods used or in the interpretation of the data.
-
CaMos Research Group: David Goltzman (co-principal investigator, McGill University, Montréal, Que.), Nancy Kreiger (co-principal investigator, Toronto, Ont.), Alan Tenenhouse (principal investigator emeritus, Toronto, Ont.). CaMos Coordinating Centre, McGill University, Montréal, Que.: Suzette Poliquin (national coordinator), Suzanne Godmaire (research assistant), Claudie Berger (study statistician). Memorial University, St. John’s, NL: Carol Joyce (director), Christopher Kovacs (co-director), Emma Sheppard (coordinator). Dalhousie University, Halifax, NS: Susan Kirkland, Stephanie Kaiser (co-directors), Barbara Stanfield (coordinator). Laval University, Québec City, Que.: Jacques P. Brown (director), Louis Bessette (co-director), Marc Gendreau (coordinator). Queen’s University, Kingston, Ont.: Tassos Anastassiades (director), Tanveer Towheed (co-director), Barbara Matthews (coordinator). University of Toronto, Toronto, Ont.: Bob Josse (director), Sophie A. Jamal (co-director), Tim Murray (past director), Barbara Gardner-Bray (coordinator). McMaster University, Hamilton, Ont.: Jonathan D. Adachi (director), Alexandra Papaioannou (co-director), Laura Pickard (coordinator). University of Saskatchewan, Saskatoon, Sask.: Wojciech P. Olszynski (director), K. Shawn Davison (co-director), Jola Thingvold (coordinator). University of Calgary, Calgary, Alta.: David A. Hanley (director), Jane Allan (coordinator). University of British Columbia, Vancouver, BC: Jerilynn C. Prior (director), Millan Patel (co-director), Yvette Vigna (coordinator); Brian C. Lentle (radiologist).