Abstract
We develop a Bayesian methodology for nonparametric estimation of receiver operating characteristic (ROC) curves used to evaluate the accuracy of a diagnostic procedure. We consider the situation where there is no perfect reference test, that is, no “gold standard”. The method is based on a multinomial model for the joint distribution of test-positive and test-negative observations. The Bayesian approach guarantees the natural monotonicity property of the resulting ROC curve estimate. MCMC methods are used to compute posterior estimates of the sensitivities and specificities, which provide the basis for inference about the accuracy of the diagnostic procedure. Because there is no gold standard, identifiability requires that the data come from at least two populations with different prevalences. No assumption is needed about the shape of the distributions of test values for the diseased and nondiseased in these populations. We discuss an application to the analysis of ELISA scores in the diagnostic testing of paratuberculosis (Johne’s Disease) in several herds of dairy cows, and we compare the results with those obtained from some previously proposed methods.
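To illustrate why working with category probabilities yields a monotone ROC curve, the following is a minimal sketch, not the authors' implementation. It assumes the test scores have been grouped into ordered categories (higher category means stronger evidence of disease) and that the category probabilities for the diseased and nondiseased groups are already available, for example as a single posterior draw from an MCMC run. The function names `upper_tails`, `roc_points`, and `auc`, and the example probabilities, are hypothetical.

```python
from itertools import accumulate


def upper_tails(probs):
    """Return [P(category >= k) for k = 0..K], given K category probabilities."""
    tails = list(accumulate(reversed(probs)))  # running sums from the top category down
    tails.reverse()                            # tails[k] = P(category >= k)
    return tails + [0.0]                       # cutoff above the top category: nothing is positive


def roc_points(p_diseased, p_nondiseased):
    """Return (FPR, sensitivity) pairs, one per cutoff, ordered by increasing FPR.

    A subject is called test-positive when its category is >= the cutoff.
    Both coordinates are upper-tail sums of nonnegative probabilities, so the
    points are nondecreasing in both coordinates by construction.
    """
    if len(p_diseased) != len(p_nondiseased):
        raise ValueError("both groups must use the same categories")
    sensitivity = upper_tails(p_diseased)
    false_positive_rate = upper_tails(p_nondiseased)  # equals 1 - specificity
    return list(zip(false_positive_rate, sensitivity))[::-1]


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical category probabilities (illustrative values only, not from the paper).
p_d = [0.125, 0.125, 0.25, 0.5]   # diseased: mass concentrated in high categories
p_n = [0.5, 0.25, 0.125, 0.125]   # nondiseased: mass concentrated in low categories
points = roc_points(p_d, p_n)
area = auc(points)
```

In a Bayesian analysis, the same computation could be repeated for each posterior draw of the category probabilities, so that every sampled ROC curve is monotone and pointwise summaries can be formed from the collection of curves.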
Cite this article
Wang, C., Turnbull, B.W., Gröhn, Y.T. et al. Nonparametric estimation of ROC curves based on Bayesian models when the true disease state is unknown. JABES 12, 128–146 (2007). https://doi.org/10.1198/108571107X178095