Published in: BMC Medical Research Methodology 1/2012

Open Access 01-12-2012 | Research article

Evaluation of the Propensity score methods for estimating marginal odds ratios in case of small sample size

Authors: Romain Pirracchio, Matthieu Resche-Rigon, Sylvie Chevret

Abstract

Background

Propensity score (PS) methods are increasingly used, even when sample sizes are small or treatments are seldom used. However, the relative performance of the two most commonly recommended PS methods, namely PS-matching and inverse probability of treatment weighting (IPTW), has not been studied in the context of small sample sizes.

Methods

We conducted a series of Monte Carlo simulations to evaluate the influence of sample size, prevalence of treatment exposure, and strength of the association between the variables and the outcome and/or the treatment exposure, on the performance of these two methods.
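A simulation of this general kind can be sketched as follows. This is a minimal illustration, not the authors' simulation design: the single confounder `x`, the coefficients 0.8 and -0.5, the sample size, the number of replicates, and all function names are assumptions chosen for the example. The data are generated with no treatment effect, so the true marginal log odds ratio is 0. The crude estimate is confounded by `x`, and the IPTW estimate should be close to 0.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def log_odds(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton-Raphson; X must contain an intercept column.
    The tiny ridge term only stabilises the linear solve."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def simulate_once(n, rng):
    """One replicate under the null: x drives both treatment and outcome."""
    x = rng.normal(size=n)                        # hypothetical confounder
    t = rng.binomial(1, expit(0.8 * x))           # treatment depends on x
    y = rng.binomial(1, expit(-0.5 + 0.8 * x))    # outcome depends on x only

    # Crude (unadjusted) log odds ratio
    crude = log_odds(y[t == 1].mean()) - log_odds(y[t == 0].mean())

    # Estimated propensity score and inverse probability of treatment weights
    X = np.column_stack([np.ones(n), x])
    ps = expit(X @ fit_logistic(X, t))
    w = t / ps + (1 - t) / (1 - ps)

    # Weighted outcome proportions in each arm give the marginal odds ratio
    p1 = np.sum(w * t * y) / np.sum(w * t)
    p0 = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    iptw = log_odds(p1) - log_odds(p0)
    return crude, iptw

def run_simulation(n=500, n_rep=200, seed=0):
    """Mean crude and IPTW log odds ratios over n_rep replicates."""
    rng = np.random.default_rng(seed)
    est = np.array([simulate_once(n, rng) for _ in range(n_rep)])
    return est.mean(axis=0)

if __name__ == "__main__":
    crude, iptw = run_simulation()
    print(f"mean crude log OR: {crude:.3f}, mean IPTW log OR: {iptw:.3f}")
```

Reducing `n` towards the small sizes studied here would require guarding against replicates in which an arm has no events or the PS model separates, which this sketch does not handle.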

Results

Decreasing the sample size from 1,000 to 40 subjects did not substantially alter the Type I error rate and led to relative biases below 10%. The IPTW method performed better than PS-matching down to 60 subjects. When N was set at 40, the PS-matching estimators were either similarly biased or even less biased than the IPTW estimators. Including variables unrelated to the exposure but related to the outcome in the PS model decreased the bias and the variance compared with models omitting such variables. Excluding the true confounder from the PS model resulted, whatever the method used, in a significantly biased estimate of the treatment effect. These results were illustrated in a real dataset.
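For readers unfamiliar with PS-matching, one common variant can be sketched as below. The abstract does not state which matching algorithm was used, so everything here is an assumption for illustration: greedy 1:1 nearest-neighbour matching without replacement, distance measured on the logit of the PS, and a caliper of 0.2 standard deviations of that logit. The function name `greedy_match` is hypothetical.

```python
import numpy as np

def greedy_match(ps, t, caliper=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    ps      : estimated propensity scores, strictly between 0 and 1
    t       : treatment indicator (1 = treated, 0 = control)
    caliper : maximum allowed distance, in standard deviations of logit(ps)
    Returns a list of (treated_index, control_index) pairs.
    """
    ps = np.asarray(ps, dtype=float)
    t = np.asarray(t, dtype=int)
    logit = np.log(ps / (1.0 - ps))
    max_dist = caliper * np.std(logit)

    treated = np.flatnonzero(t == 1)
    available = list(np.flatnonzero(t == 0))   # controls not yet matched
    pairs = []
    for i in treated:
        if not available:
            break
        dists = np.abs(logit[available] - logit[i])
        j = int(np.argmin(dists))              # closest remaining control
        if dists[j] <= max_dist:               # keep the pair only within the caliper
            pairs.append((int(i), int(available.pop(j))))
    return pairs

if __name__ == "__main__":
    ps = [0.30, 0.31, 0.70, 0.72, 0.50, 0.10]
    t = [1, 0, 1, 0, 0, 0]
    print(greedy_match(ps, t))
```

Treated subjects with no control inside the caliper are left unmatched. With very small samples this can discard a sizeable share of the treated group, which is one reason matching and weighting can behave differently as N shrinks.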

Conclusion

Even with small study samples or a low prevalence of treatment, PS-matching and IPTW can yield correct estimates of the treatment effect, provided that the true confounders and the variables related only to the outcome are included in the PS model.
Metadata
Title
Evaluation of the Propensity score methods for estimating marginal odds ratios in case of small sample size
Authors
Romain Pirracchio
Matthieu Resche-Rigon
Sylvie Chevret
Publication date
01-12-2012
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2012
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-12-70
