Abstract
Event history models, also known as hazard models, are commonly used in analyses of fertility. One drawback of event history models is that the conditional probabilities (hazards) estimated by event history models do not readily translate into summary measures, particularly for models of repeatable events, like childbirth. In this paper, we describe how to translate the results of discrete-time event history models of all births into well-known summary fertility measures: simulated age- and parity-specific fertility rates, parity progression ratios, and the total fertility rate. The method incorporates all birth intervals, but permits the hazard functions to vary across parities. It can also simulate values for groups defined by both fixed and time-varying covariates, such as marital or employment life histories. We demonstrate the method using an example from the National Survey of Family Growth and provide an accompanying data file and Stata program.
Similar content being viewed by others
Notes
We only analyze singleton births. This is a limitation of our method, but only 140 or 1.5 % of births are twins, triplets, etc.
We treat educational attainment as time-varying. Although most people complete their education by age 25, women become at risk of childbearing much earlier (age 15), and throughout these years (15–25) educational attainment changes.
Although our sample includes women age 15–44, the life table extends to age 45 for the l x column only to record the number remaining childless on their 45th birthday.
For example, among women aged 25–34, 55 % of non-Hispanic white women were married compared with 27 % of black women.
We recommend using the cluster or svy options in Stata to account for dependence among observations (Cleaves et al. 2008, pp. 191–195), or analogous options in other statistic packages. Alternatively, one could estimate such models with individual random or fixed effects.
χ 2 = −2LLadditive – (−2LLpartially-interactive) and df = df additive − df partially-interactive
χ 2 = −2LLpartially-interactive − (−2LLfully-interactive) and df = df partially-interactive − df fully-interactive. The −2LL and degrees of freedom of the fully-interactive model are obtained by summing the −2LL and degrees of freedom across all of the separate birth interval models (Singer and Willett 2003, pp. 560–561).
The normalized weight is the NSFG sampling weight divided by a constant such that the sum of the weights equals the sample size.
If control variables that are used in the creation of the sample weight are entered into the regression model, unweighted models are likely to produce similar coefficients with lower variance. In our analyses, the estimates were nearly identical between weighted and unweighted models, and the standard errors were about 25 % larger when based on weighted models. However, our analyses further showed that when the models do not contain control variables, the results differed between the weighted and unweighted models. This could create problems when there is a need to estimate models without controls, such as in a series of nested models, or when appropriate controls are unavailable. For this reason, we opted to present results based on weighted models.
To explain, about 1/12 of women age x would have had their j − 1th birth sometime during the first month of the year and would, therefore, have been exposed to 11.5 additional months of risk of having another birth while they are still aged x. Another 1/12 would give birth during the second month of the year and, therefore, would be exposed to 10.5 months of risk; another 1/12 would be exposed to 9.5 months of risk, and so on. On average, the women aged x would have been exposed to 6 months of risk of having another birth while they are still aged x.
References
Allison, P. D. (1989). Event history analysis: Regression for longitudinal event data. Newbury Park, CA: Sage.
Allison, P. D. (1995). Survival analysis using the SAS system: A practical guide. Cary, NC: SAS Institute.
Brand, J. E., & Davis, D. (2011). The impact of college education on fertility: Evidence for heterogeneous effects. Demography, 48, 863–887.
Cai, L., Hayward, M. D., Saito, Y., Lubitz, J., Hagedorn, A., & Crimmins, E. (2010). Estimation of multi-state life table functions and their variability from complex survey data using the SPACE Program. Demographic Research, 22(6), 129–158.
Carter, M. (2000). Fertility of Mexican immigrant women in the U.S.: A closer look. Social Science Quarterly, 81(4), 1073–1086.
Centers for Disease Control and Prevention (CDC). (2012). About the National Survey of Family Growth. Atlanta, GA: National Survey of Family Growth. Retrieved May 2012 from http://www.cdc.gov/nchs/nsfg/about_nsfg.htm. Accessed 15 Oct 2012.
Cleaves, M., Gutierrez, R., Gould, W., & Marchenko, Y. (2008). An introduction to survival analysis using Stata. College Station, TX: Stata Press.
DeMaris, A. (1992). Logit modeling: Practical applications (Vol. 86). Newbury Park, CA: Sage.
Goldstein, A., White, M., & Goldstein, S. (1997). Migration, fertility, and state policy in Hubei Province, China. Demography, 34(4), 481–491.
Guzzo, K. B., & Hayford, S. (2011). Fertility following an unintended first birth. Demography, 48, 1493–1516.
Lee, M. A., & Rendall, M. S. (2001). Self-employment disadvantage in the working lives of blacks and females. Population Research and Policy Review, 20, 291–320.
Lepkowski, J. M., Mosher, W. D., Davis, K. E., Groves, R. M., & Van Hoewyk, J. (2010). The 2006–2010 National Survey of Family Growth: Sample design and analysis of a continuous survey. National Center for Health Statistics Vital Health Statistics 2(150).
Lynch, S. M., & Brown, J. S. (2010). Obtaining multistate life table distributions from cross-sectional data: A Bayesian extension of Sullivan’s method. Demography, 47(4), 1053–1077.
Martin, J. A., Hamilton, B. E., Ventura, S. J., Menacker, F. & Park, M. M. (2002). Births: Final data for 2000. National vital statistics reports, Vol. 50, No. 5, February 12, 2002, Tables 8 and 9.
Martin, J. A., Hamilton, B. E., Ventura, S. J., Osterman, M. J. K., Wilson, E. C. & Mathews, T. J. (2012). Births: Final data for 2010. National vital statistics reports, Vol. 61, No. 1, August 28, 2012, Tables 7 and 8.
National Center for Health Statistics. (1994). Vital statistics of the United States, 1990, VOI 1, natality. Washington: Public Health Service, Tables 1–10 and 1–13.
Poi, B. P. (2004). From the help desk: Some bootstrapping techniques. The Stata Journal, 4(3), 312–328.
Rendall, M. S., Weden, M. M., Fernandes, M. & Vaynman I. (2012). Hispanic and black children’s paths to higher adolescent obesity prevalence. Pediatric Obesity, 7, 423–435.
Retherford, R., Ogwa, N., Matsukura, R., & Eini-Zinab, H. (2010). Multivariate analysis of parity progression-based measures of the total fertility rate and its components. Demography, 47(1), 97–124.
Schellekens, J. (2009). Family allowances and fertility: Socioeconomic differences. Demography, 46(3), 451–468.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York: Oxford Press.
Teachman, J., & Hayward, M. (1993). Interpreting hazard rate models. Sociological Methods and Research, 21, 340–371.
White, M. J., Muhidin, S., Andrzejewski, C., Tagoe, E., Knight, R., & Reed, H. (2008). Urbanization and fertility: An event-history analysis of coastal Ghana. Demography, 45(4), 803–816.
Acknowledgments
This research was supported in part by an infrastructure grant from the National Institutes of Health (R24 HD041025). The authors further thank Melissa Hardy and David Johnson for helpful comments on earlier drafts, and Steven Maczuga for programming support.
Author information
Authors and Affiliations
Corresponding author
Appendix: Sensitivity to Age Coding
Appendix: Sensitivity to Age Coding
We used 5-year age categories in our analyses because of limitations in sample size, particularly at higher parities. How would the results differ if we had coded age with single-year categories? To answer this question, we re-estimated the fully-interactive model; only this time, we used single-year age dummy variables. The model coefficients are shown in Table 6. The results illustrate the analytic challenges of using single- rather than 5-year age groups. First, we were forced to code ages 40–44 with a single 5-year category because there were so few births for these ages. Additionally, age groups 15–19 had to be combined in 2nd and 3rd + parity models, again because there were no events for some parity-single-age combinations.
We next assessed how the simulated TFRs and PPRs differed by age coding. Table 7 compares the results based on 5- and single-year age categories for black women and women married 25+ (with each group assessed at mean levels of model covariates). As shown, the results are nearly identical for black women and for women married 25+. They were also nearly identical for white women and other marital status groups (results not shown).
Finally, it is clear that the single-year coding provides greater detail for simulated ASFRs. This is shown in Fig. 4, which graphs the age-specific fertility rates for all parities among black women based on single- and 5-year age categories. When age is coded in single years, important details within the five-year age groups can be seen. For example, single-year fertility rates are very low for 15-year-olds but steadily increase across the 15–19 age category. However, the single-year rates can also be erratic and may not reflect actual age patterns. An example is the zig-zag pattern from ages 24 to 30.
Rights and permissions
About this article
Cite this article
Van Hook, J., Altman, C.E. Using Discrete-Time Event History Fertility Models to Simulate Total Fertility Rates and Other Fertility Measures. Popul Res Policy Rev 32, 585–610 (2013). https://doi.org/10.1007/s11113-013-9276-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11113-013-9276-7