Published in: BMC Medical Research Methodology 1/2019

Open Access 01-12-2019 | Research article

Sample size calculations for model validation in linear regression analysis

Authors: Show-Li Jan, Gwowen Shieh

Abstract

Background

Linear regression analysis is a widely used statistical technique in practical applications. For planning and appraising validation studies of simple linear regression, an approximate sample size formula has been proposed for the joint test of intercept and slope coefficients.

Methods

The purpose of this article is to reveal the potential drawback of the existing approximation and to provide an alternative and exact solution of power and sample size calculations for model validation in linear regression analysis.

Results

A fetal weight example is included to illustrate the underlying discrepancy between the exact and approximate methods. Moreover, extensive numerical assessments were conducted to examine the relative performance of the two distinct procedures.

Conclusions

The results show that the exact approach has a distinct advantage over the current approximate method, offering greater accuracy and robustness.
Metadata
Title
Sample size calculations for model validation in linear regression analysis
Authors
Show-Li Jan
Gwowen Shieh
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2019
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-019-0697-9
