Abstract
This article presents a model to evaluate the accuracy of diagnostic tests. Data from three tests for the detection of EF-positive Streptococcus suis serotype 2 strains in sows were analyzed. The data were collected in a field study in the absence of a gold standard; that is, the true disease status (noninfected or infected) of the tested animals was unknown. Two tests were based on a polymerase chain reaction (PCR): one was applied to a tonsil swab (taken from the live animal), and the other was applied to the whole tonsil (collected at slaughter). The third test was based on a bacterial examination (BE) of the whole tonsil. To reduce experimental cost, BE was performed on only a subset of the animals in the sample. The model allows for dependence between tests, conditional upon the unknown true disease status of the animals. Accuracy was expressed in terms of the sensitivity and specificity of the tests. A Bayesian analysis was performed that incorporated prior information about the accuracy of the tests. The model parameters have a simple interpretation, and specification of priors is straightforward. Posterior inference was carried out with Markov chain Monte Carlo (MCMC) methods, employing the Gibbs sampler as implemented in the WinBUGS program. Also discussed are different parameterizations to allow for selection and missing values, the use of different priors, practical problems in the analysis, and some interesting issues in a joint analysis of the binary (positive or negative) results of PCR and BE together with two additional continuous enzyme-linked immunosorbent assays (ELISAs).
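To make the idea of conditional dependence concrete, the following is a minimal sketch for two binary tests only. It uses a common covariance parameterization, which is an assumption here and not necessarily the parameterization used in the article. The function name `cell_probabilities` and all numeric values are hypothetical and chosen purely for illustration.

```python
from itertools import product


def cell_probabilities(prev, se, sp, cov_pos=0.0, cov_neg=0.0):
    """Joint probabilities of the results (t1, t2) of two binary tests.

    prev    : prevalence of infection
    se, sp  : pairs of sensitivities and specificities, one entry per test
    cov_pos : covariance between the tests among infected animals
    cov_neg : covariance between the tests among noninfected animals

    Assumed (illustrative) parameterization: within each disease class the
    concordant cells gain the covariance and the discordant cells lose it,
    so the marginal sensitivities and specificities are unchanged.
    """
    probs = {}
    for t1, t2 in product((1, 0), repeat=2):
        sign = 1.0 if t1 == t2 else -1.0
        # Probability of (t1, t2) given the animal is infected.
        p_inf = ((se[0] if t1 else 1 - se[0])
                 * (se[1] if t2 else 1 - se[1])
                 + sign * cov_pos)
        # Probability of (t1, t2) given the animal is noninfected.
        p_non = (((1 - sp[0]) if t1 else sp[0])
                 * ((1 - sp[1]) if t2 else sp[1])
                 + sign * cov_neg)
        if p_inf < 0 or p_non < 0:
            raise ValueError("covariance outside its admissible range")
        # Disease status is unobserved, so mix over the two classes.
        probs[(t1, t2)] = prev * p_inf + (1 - prev) * p_non
    return probs


# Hypothetical values, not taken from the study.
probs = cell_probabilities(prev=0.3, se=(0.8, 0.9), sp=(0.95, 0.98),
                           cov_pos=0.02, cov_neg=0.005)
for cell, p in probs.items():
    print(cell, round(p, 4))
```

Setting both covariances to zero recovers the conditional independence model. In a Bayesian analysis of the kind described above, these cell probabilities would feed a multinomial likelihood for the observed cross-classified test results, with priors placed on the prevalence, sensitivities, specificities, and covariances.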
About this article
Cite this article
Engel, B., Swildens, B., Stegeman, A. et al. Estimation of sensitivity and specificity of three conditionally dependent diagnostic tests in the absence of a gold standard. JABES 11, 360–380 (2006). https://doi.org/10.1198/108571106X153534
DOI: https://doi.org/10.1198/108571106X153534