Published in: BMC Medical Research Methodology

Open Access 01-12-2015 | Research article

Bayesian estimation of a cancer population by capture-recapture with individual capture heterogeneity and small sample

Authors: Laurent Bailly, Jean Pierre Daurès, Brigitte Dunais, Christian Pradier

Published in: BMC Medical Research Methodology | Issue 1/2015

Abstract

Background

Cancer incidence and prevalence estimates are necessary to inform health policy, to predict public health impact and to identify etiological factors. Registers have been used to estimate the number of cancer cases. To be reliable and useful, cancer registry data should be complete. Capture-recapture is a method for estimating the number of cases missed, originally developed in ecology to estimate the size of animal populations. Capture-recapture methods in cancer epidemiology involve modelling the overlap between lists of individuals using log-linear models. These models rely on the assumptions of independence between sources and equal catchability of individuals, which are unlikely to be satisfied in a cancer population, as severe cases are more likely to be captured than simple cases.
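The simplest case of this idea, with only two lists, can be sketched as follows. This is an illustration rather than the method used in the study: the function names and the counts are hypothetical, and the two-list estimators shown rely on the independence and equal-catchability assumptions that the paragraph above calls into question.

```python
def lincoln_petersen(n1, n2, m):
    """Classical two-list estimate of the total population: n1 * n2 / m.

    n1, n2: number of cases on each list; m: cases found on both (m > 0).
    """
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected variant, defined even when m = 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts for illustration only (not taken from the study).
registry, meetings, both = 400, 300, 240

n_hat = chapman(registry, meetings, both)
observed = registry + meetings - both   # distinct cases seen on either list
missed = n_hat - observed               # estimated cases seen on neither list

print(f"estimated total: {n_hat:.1f}, observed: {observed}, missed: {missed:.1f}")
```

The larger the overlap between the two lists relative to their sizes, the smaller the estimated number of missed cases.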

Methods

To estimate the cancer population and the completeness of the cancer registry, we applied M_th models, which rely on parameters that influence capture, such as time of capture (t) and individual heterogeneity (h), and compared the results with those obtained with classical log-linear models and the sample coverage approach. Three sources collected breast and colorectal cancer cases (the histopathological cancer registry, hospital Multidisciplinary Team Meetings, and cancer screening programmes). Individual heterogeneity is suspected in the cancer population owing to age, gender, screening history or presence of metastases. It is rarely analysed on its own, as classical log-linear models usually pool it with between-“list” dependence. We applied Bayesian Model Averaging, which can be applied to small samples without asymptotic assumptions, contrary to the maximum likelihood estimation procedure.
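A deliberately simplified sketch of Bayesian model averaging for a closed population is given below. It is not the authors' implementation: it averages only over M_0 (one capture probability shared by all lists) and M_t (one capture probability per list), leaves out the heterogeneity component (h) entirely, and assumes uniform Beta(1, 1) priors on capture probabilities, a prior on N proportional to 1/N, equal prior weight on both models, and hypothetical counts. Under these assumptions the capture probabilities integrate out in closed form, so the posterior can be computed exactly on a grid of N values without asymptotic approximations.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(N, n_distinct, per_list, model):
    """Log-likelihood of N with capture probabilities integrated out
    under Beta(1, 1) priors, up to a constant shared by both models."""
    log_perm = math.lgamma(N + 1) - math.lgamma(N - n_distinct + 1)
    if model == "M0":
        total = sum(per_list)            # total number of captures
        t = len(per_list)                # number of lists
        return log_perm + log_beta(total + 1, t * N - total + 1)
    # Model Mt: a separate capture probability for each list.
    return log_perm + sum(log_beta(nj + 1, N - nj + 1) for nj in per_list)


def bma_population(n_distinct, per_list, n_max):
    """Posterior model probabilities and model-averaged posterior mean of N."""
    grid = range(n_distinct, n_max + 1)
    # Assumed priors: equal weight on each model, prior on N proportional to 1/N.
    logw = {
        m: [log_marginal(N, n_distinct, per_list, m) - math.log(N) for N in grid]
        for m in ("M0", "Mt")
    }
    shift = max(max(values) for values in logw.values())   # avoid underflow
    w = {m: [math.exp(x - shift) for x in values] for m, values in logw.items()}
    total = sum(sum(values) for values in w.values())
    model_prob = {m: sum(values) / total for m, values in w.items()}
    post_mean = sum(N * (w["M0"][i] + w["Mt"][i]) for i, N in enumerate(grid)) / total
    return model_prob, post_mean


# Hypothetical data: 500 distinct cases; list sizes 400, 300 and 150.
model_prob, post_mean = bma_population(500, [400, 300, 150], n_max=2000)
print(model_prob, round(post_mean, 1))
```

The model-averaged posterior mean weights each model's estimate of N by its posterior probability, so no single model has to be selected beforehand.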

Results

Cancer population estimates were based on the results of the M_h model, with an averaged estimate of 803 cases of breast cancer and 521 cases of colorectal cancer. With log-linear models, the estimates were 791 cases of breast cancer and 527 cases of colorectal cancer according to the retained models (729 and 481 histological cases, respectively).
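The abstract does not state completeness ratios, but they can be derived from the figures above as recorded cases divided by the estimated total. The short sketch below does this arithmetic, assuming that the 729 and 481 histological cases are the counts recorded by the histopathological registry; the variable names are illustrative.

```python
# Counts quoted in the Results paragraph.
registry_cases = {"breast": 729, "colorectal": 481}
estimates = {
    "M_h (averaged)": {"breast": 803, "colorectal": 521},
    "log-linear": {"breast": 791, "colorectal": 527},
}


def completeness(recorded, estimated_total):
    """Share of the estimated population that the registry recorded."""
    return recorded / estimated_total


table = {
    (model, site): completeness(registry_cases[site], total)
    for model, by_site in estimates.items()
    for site, total in by_site.items()
}

for (model, site), value in table.items():
    print(f"{model:15s} {site:11s} {100 * value:.1f}%")
```

Under that assumption, both approaches put registry completeness between roughly 91% and 92% for either cancer site.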

Conclusions

We applied M_th models and Bayesian population estimation to a small sample of a cancer population. An advantage of M_th models applied to cancer datasets is the ability to explore individual factors associated with capture heterogeneity, as the assumption of equal capture probability is unlikely to hold. M_th models and Bayesian population estimation are well suited to capture-recapture in a heterogeneous cancer population.

Title: Bayesian estimation of a cancer population by capture-recapture with individual capture heterogeneity and small sample
Authors: Laurent Bailly, Jean Pierre Daurès, Brigitte Dunais, Christian Pradier
Publication date: 01-12-2015
Publisher: BioMed Central
Published in: BMC Medical Research Methodology / Issue 1/2015
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-015-0029-7
