Published in: BMC Medical Research Methodology 1/2014

Open Access 01-12-2014 | Research article

A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design

Authors: Christopher Meaney, Rahim Moineddin

Abstract

Background

In biomedical research, response variables with bounded support on the open unit interval (0,1) are common. Traditionally, researchers have estimated covariate effects on such responses using linear regression. Alternative modelling strategies include beta regression, variable-dispersion beta regression, and fractional logit regression. This study uses a Monte Carlo simulation design to compare the statistical properties of the linear regression model with those of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models.

Methods

In the Monte Carlo experiment we assume a simple two-sample design. We assume observations are realizations of independent draws from their respective probability models. The simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. After simulating the experimental data, we estimate the average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of the Monte Carlo error associated with these quantities are provided.
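As a rough illustration of this kind of experiment, the sketch below simulates a two-sample design with beta-distributed responses and summarizes only the linear-regression estimator, which for a single binary group indicator is the difference in sample means. It is a minimal sketch, not the authors' code: the function names, the mean/precision values, the number of replications and the use of a normal critical value (instead of the t quantile an ordinary least squares fit would use) are all assumptions made for illustration. Only the sample size N0 = N1 = 25 comes from the abstract.

```python
import math
import random
import statistics


def draw_beta(mu, phi, n, rng):
    """Draw n values from Beta(mu*phi, (1-mu)*phi): mean mu, precision phi (assumed parameterization)."""
    return [rng.betavariate(mu * phi, (1.0 - mu) * phi) for _ in range(n)]


def simulate(mu0, mu1, phi0, phi1, n=25, reps=2000, seed=1):
    """Monte Carlo summary of the difference-in-means estimator.

    Returns the bias, the empirical variance, the rejection rate of a
    two-sided 5% test of "no difference", and Monte Carlo standard errors
    for the bias and the rejection rate.
    """
    rng = random.Random(seed)
    estimates = []
    rejections = 0
    for _ in range(reps):
        y0 = draw_beta(mu0, phi0, n, rng)
        y1 = draw_beta(mu1, phi1, n, rng)
        diff = statistics.fmean(y1) - statistics.fmean(y0)
        # With equal group sizes this equals the pooled-variance OLS standard error.
        se = math.sqrt((statistics.variance(y0) + statistics.variance(y1)) / n)
        estimates.append(diff)
        # Normal critical value: a simplification of the t-based OLS test.
        if abs(diff / se) > 1.959964:
            rejections += 1
    emp_var = statistics.variance(estimates)
    reject_rate = rejections / reps
    return {
        "bias": statistics.fmean(estimates) - (mu1 - mu0),
        "variance": emp_var,
        "reject_rate": reject_rate,
        "mcse_bias": math.sqrt(emp_var / reps),
        "mcse_reject": math.sqrt(reject_rate * (1.0 - reject_rate) / reps),
    }


if __name__ == "__main__":
    # Equal means: the rejection rate estimates the type-1 error.
    print(simulate(mu0=0.5, mu1=0.5, phi0=10.0, phi1=10.0))
    # Unequal means: the rejection rate estimates power.
    print(simulate(mu0=0.5, mu1=0.6, phi0=10.0, phi1=10.0))
```

Setting `mu0 == mu1` turns the rejection rate into a type-1 error estimate, while unequal means give a power estimate. Passing different `phi0` and `phi1` values mimics the unequal-dispersion scenario. Comparing the other three models would require fitting them in each replication, which this sketch does not attempt.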

Results

If the response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25), linear regression has superior type-1 error rates compared to the other models. Small-sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly higher power than linear regression models. Similar results were observed when the response data were generated from a discrete multinomial distribution with support on (0,1).
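To see why the dispersion parameter matters here, it may help to write the beta model in a mean-precision form. The abstract does not state which parameterization was used, so the form below is an assumption: it is the one commonly used in the beta regression literature, with mean \mu and precision \phi, and the logit and log links in the second line are likewise assumed for illustration.

```latex
y \sim \mathrm{Beta}\bigl(\mu\phi,\ (1-\mu)\phi\bigr), \qquad
\mathrm{E}[y] = \mu, \qquad
\mathrm{Var}[y] = \frac{\mu(1-\mu)}{1+\phi}

\operatorname{logit}(\mu_i) = \beta_0 + \beta_1 x_i, \qquad
\log(\phi_i) = \gamma_0 + \gamma_1 x_i, \qquad x_i \in \{0, 1\}
```

Under this form, a simple beta regression corresponds to fixing \gamma_1 = 0, so both samples share one precision, whereas the variable-dispersion model lets \phi differ between the two groups.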

Conclusions

The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure that their dispersion sub-model is properly specified; otherwise, inferential errors could arise.
Metadata
Title
A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design
Authors
Christopher Meaney
Rahim Moineddin
Publication date
01-12-2014
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2014
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-14-14
