Published in: BMC Medical Research Methodology 1/2019

Open Access 01-12-2019 | Research article

Assessing causal treatment effect estimation when using large observational datasets

Authors: E. R. John, K. R. Abrams, C. E. Brightling, N. A. Sheehan

Abstract

Background

Interest in developing and evaluating methods for analysing observational data has grown in recent years. This has been driven by the increased availability of large data resources such as Electronic Health Record (EHR) data, alongside known limitations and changing characteristics of randomised controlled trials (RCTs). A wide range of methods is available for analysing observational data. However, various assumptions, sometimes strict and often unverifiable, must be made for the resulting effect estimates to have a causal interpretation. In this paper we compare some common approaches to estimating treatment effects from observational data in order to highlight the importance of considering, and justifying, the relevant assumptions prior to conducting an observational analysis.

Methods

A simulation study was conducted based upon a small cohort of patients with chronic obstructive pulmonary disease. Two-stage least squares instrumental variables, propensity score, and linear regression models were compared under a range of different scenarios including different strengths of instrumental variable and unmeasured confounding. The effects of violating the assumptions of the instrumental variables analysis were also assessed. Sample sizes of up to 200,000 patients were considered.
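The kind of comparison described above can be illustrated with a minimal sketch. It is not the authors' simulation: the data-generating model, the continuous treatment, the parameter values and the function names (`simulate_cohort`, `ols_slope`, `two_stage_least_squares`) are all assumptions chosen for illustration, and propensity score methods are omitted for brevity.

```python
import numpy as np

def simulate_cohort(n, iv_strength=0.5, confounding=1.0, true_effect=2.0, seed=0):
    """Draw a toy cohort; every parameter value here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument: influences treatment only
    u = rng.normal(size=n)  # unmeasured confounder: influences treatment and outcome
    x = iv_strength * z + confounding * u + rng.normal(size=n)  # treatment
    y = true_effect * x + confounding * u + rng.normal(size=n)  # outcome
    return z, x, y

def ols_slope(x, y):
    """Naive linear regression of y on x (with intercept); returns the slope."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

def two_stage_least_squares(z, x, y):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted treatment."""
    stage1 = np.column_stack([np.ones_like(z), z])
    gamma, *_ = np.linalg.lstsq(stage1, x, rcond=None)
    x_hat = stage1 @ gamma
    return ols_slope(x_hat, y)

if __name__ == "__main__":
    # Compare a small cohort with the largest sample size mentioned in the text.
    for n in (1_000, 200_000):
        z, x, y = simulate_cohort(n)
        print(n, round(ols_slope(x, y), 3), round(two_stage_least_squares(z, x, y), 3))
```

Under these assumed settings the naive slope stays biased away from the true effect however large the sample, while the two-stage estimate settles near it as the sample size grows.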

Results

Two-stage least squares instrumental variable methods can yield unbiased treatment effect estimates in the presence of unmeasured confounding provided the sample size is sufficiently large. Adjusting for measured covariates in the analysis reduces the variability in the two-stage least squares estimates. In the simulation study, propensity score methods produced very similar results to linear regression for all scenarios. A weak instrument or strong unmeasured confounding led to an increase in uncertainty in the two-stage least squares instrumental variable effect estimates. A violation of the instrumental variable assumptions led to bias in the two-stage least squares effect estimates. Indeed, these were sometimes even more biased than those from a naïve linear regression model.
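The last finding can be sketched with a toy example in which the instrument also affects the outcome directly, which is one way the instrumental variable assumptions can fail (a violation of the exclusion restriction). The model, the parameter values and the names `simulate_invalid_instrument`, `wald_iv_estimate` and `naive_slope` are hypothetical and are not taken from the paper. With a single instrument and no covariates, the ratio estimator used here coincides with two-stage least squares.

```python
import numpy as np

def simulate_invalid_instrument(n, iv_strength=0.2, direct_effect=0.2,
                                confounding=0.5, true_effect=2.0, seed=0):
    """Toy data in which the instrument z also acts on the outcome directly."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # (invalid) instrument
    u = rng.normal(size=n)  # unmeasured confounder
    x = iv_strength * z + confounding * u + rng.normal(size=n)
    # The direct_effect * z term is the assumed violation of the IV assumptions.
    y = true_effect * x + direct_effect * z + confounding * u + rng.normal(size=n)
    return z, x, y

def wald_iv_estimate(z, x, y):
    """Ratio of covariances; equals 2SLS for one instrument and no covariates."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

def naive_slope(x, y):
    """Slope from a simple linear regression of y on x."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]

if __name__ == "__main__":
    z, x, y = simulate_invalid_instrument(200_000)
    print("IV bias:   ", round(wald_iv_estimate(z, x, y) - 2.0, 3))
    print("naive bias:", round(naive_slope(x, y) - 2.0, 3))
```

In this setup the large-sample bias of the instrumental variable estimate is roughly `direct_effect / iv_strength`, so a weak instrument magnifies even a small direct effect, and the result can be further from the truth than the naive regression.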

Conclusions

Instrumental variable methods can perform better than naïve regression and propensity scores. However, the assumptions need to be carefully considered and justified prior to conducting an analysis; otherwise, performance may be worse than if the problem of unmeasured confounding had been ignored altogether.

Metadata

Title: Assessing causal treatment effect estimation when using large observational datasets
Authors: E. R. John, K. R. Abrams, C. E. Brightling, N. A. Sheehan
Publication date: 01-12-2019
Publisher: BioMed Central
Published in: BMC Medical Research Methodology, Issue 1/2019
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-019-0858-x
