Published in: BMC Medical Research Methodology 1/2022

Open Access 01-12-2022 | Research

Bayesian Mendelian randomization with study heterogeneity and data partitioning for large studies

Authors: Linyi Zou, Hui Guo, Carlo Berzuini

Abstract

Background

Mendelian randomization (MR) is a useful approach to causal inference from observational studies when randomised controlled trials are not feasible. However, heterogeneity between the two association studies that MR requires is often overlooked. In addition, recently developed Bayesian MR methods can be computationally challenging, and sometimes prohibitive, when applied to large studies.

Methods

We addressed study heterogeneity by proposing a random-effect Bayesian MR model with multiple exposures and outcomes. For large studies, we adopted a subset posterior aggregation method to overcome the computational cost of Markov chain Monte Carlo: we divided the data into subsets and combined the estimated causal effects obtained from the subsets. We evaluated the performance of our method in a number of simulations in which exposure data were partly missing.
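
The divide-and-combine idea can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' model: a single normal mean with known standard deviation and a flat prior stands in for the causal effect, analytic draws stand in for MCMC output, and the recentring step is one simple way to aggregate subset posteriors. The function names, the choice of raising the subset likelihood to the power of the number of subsets, and all numeric values are hypothetical.

```python
import random
import statistics


def subset_posterior_draws(y, n_subsets, sigma, n_draws, rng):
    """Draw from a toy subset posterior for a normal mean.

    Assumes a flat prior and known sigma. The subset likelihood is raised
    to the power n_subsets, so each subset posterior has roughly the same
    spread as the full-data posterior (an assumption of this sketch).
    """
    mean = statistics.fmean(y)
    sd = sigma / (n_subsets * len(y)) ** 0.5
    return [rng.gauss(mean, sd) for _ in range(n_draws)]


def divide_and_combine(y, n_subsets, sigma=1.0, n_draws=2000, seed=0):
    """Split the data, sample each subset posterior, then pool the draws."""
    rng = random.Random(seed)
    shuffled = list(y)
    rng.shuffle(shuffled)
    # Divide: interleaved slices give subsets of (nearly) equal size.
    subsets = [shuffled[k::n_subsets] for k in range(n_subsets)]
    draws = [
        subset_posterior_draws(s, n_subsets, sigma, n_draws, rng)
        for s in subsets
    ]
    # Combine: shift each subset's draws so they share a common centre,
    # the average of the subset posterior means, then pool them.
    subset_means = [statistics.fmean(d) for d in draws]
    grand_mean = statistics.fmean(subset_means)
    return [x - m + grand_mean for d, m in zip(draws, subset_means) for x in d]


# Hypothetical data: 10,000 observations with true mean 0.3.
data_rng = random.Random(1)
data = [data_rng.gauss(0.3, 1.0) for _ in range(10_000)]
combined = divide_and_combine(data, n_subsets=5)
print("full-data mean:     ", statistics.fmean(data))
print("combined posterior: ", statistics.fmean(combined))
```

In this toy setting each subset can be processed independently, which is where the computational saving would come from; in a real analysis each call to `subset_posterior_draws` would be replaced by an MCMC run on that subset.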

Results

Random-effect Bayesian MR outperformed conventional inverse-variance weighted estimation, whether the true causal effects were zero or non-zero. Partitioning the data of large studies had little impact on the variability of the estimated causal effects, but it notably affected the bias of the estimates when instruments were weak and the rate of missing data was high. For the cases simulated in our study, the results indicate that the “divide (data) and combine (estimated subset causal effects)” approach can improve computational efficiency at an acceptable cost in bias of the causal effect estimates, as long as the subsets are reasonably large.
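
For readers unfamiliar with the comparator, the following Python sketch shows the standard fixed-effect inverse-variance weighted (IVW) estimator computed from per-variant summary statistics. The function name and the example numbers are hypothetical and are not taken from the study; the sketch assumes uncorrelated variants and ignores uncertainty in the variant-exposure associations.

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW causal estimate from summary statistics.

    beta_x: variant-exposure association estimates
    beta_y: variant-outcome association estimates
    se_y:   standard errors of the variant-outcome estimates
    Returns (estimate, standard_error).
    """
    weights = [1.0 / s ** 2 for s in se_y]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_x))
    return numerator / denominator, denominator ** -0.5


# Hypothetical summary statistics for three variants, constructed so that
# every variant-outcome estimate is exactly half the exposure estimate.
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.05, 0.10, 0.075]
se_y = [0.02, 0.03, 0.025]

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
print(estimate, std_error)
```

The estimator is a weighted regression of the outcome associations on the exposure associations through the origin, with weights equal to the inverse variances of the outcome associations.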

Conclusions

We extended our Bayesian MR method to account explicitly for study heterogeneity. We also adopted a subset posterior aggregation method to ease the computational burden, which is especially important for large studies. Despite the simplicity of the model used in the simulations, we hope the present work points towards MR studies that allow modelling flexibility, especially with regard to the integration of heterogeneous studies and computational practicality.
Metadata
Title
Bayesian Mendelian randomization with study heterogeneity and data partitioning for large studies
Authors
Linyi Zou
Hui Guo
Carlo Berzuini
Publication date
01-12-2022
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2022
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-022-01619-4
