Nonparametric estimation of ROC curves based on Bayesian models when the true disease state is unknown

Journal of Agricultural, Biological, and Environmental Statistics

Abstract

We develop a Bayesian methodology for nonparametric estimation of ROC curves used to evaluate the accuracy of a diagnostic procedure. We consider the situation where there is no perfect reference test, that is, no “gold standard”. The method is based on a multinomial model for the joint distribution of test-positive and test-negative observations. We use a Bayesian approach that ensures the natural monotonicity property of the resulting ROC curve estimate. MCMC methods are used to compute the posterior estimates of the sensitivities and specificities that provide the basis for inference concerning the accuracy of the diagnostic procedure. Because there is no gold standard, identifiability requires that the data come from at least two populations with different prevalences. No assumption is needed concerning the shape of the distributions of test values of the diseased and nondiseased in these populations. We discuss an application to an analysis of ELISA scores in the diagnostic testing of paratuberculosis (Johne’s disease) for several herds of dairy cows and compare the results to those obtained from some previously proposed methods.
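As a rough illustration of why a multinomial formulation yields a monotone ROC curve, the following Python sketch assumes the test scores are grouped into K ordered bins with separate multinomial cell probabilities for the diseased and nondiseased groups. The binning, the function names `roc_points` and `population_cell_probs`, and the example probabilities are all hypothetical; this is not the authors' model or their MCMC procedure, only a sketch of the relationship between cell probabilities, sensitivities, and specificities.

```python
import numpy as np

def roc_points(p_diseased, p_nondiseased):
    """ROC points implied by multinomial cell probabilities over K ordered bins.

    Cutoff k calls a subject positive when the score falls in bin k or higher,
    so sensitivity and false positive rate are upper-tail sums. Because the
    cell probabilities are nonnegative, the points are monotone by construction.
    """
    p_d = np.asarray(p_diseased, dtype=float)
    p_n = np.asarray(p_nondiseased, dtype=float)
    # Upper-tail cumulative sums, one value per cutoff.
    sens = np.cumsum(p_d[::-1])[::-1]
    fpr = np.cumsum(p_n[::-1])[::-1]
    # Add the (0, 0) endpoint: a cutoff above the highest bin.
    sens = np.append(sens, 0.0)
    fpr = np.append(fpr, 0.0)
    # Return points ordered from (0, 0) to (1, 1).
    return fpr[::-1], sens[::-1]

def population_cell_probs(prevalence, p_diseased, p_nondiseased):
    """Observable cell probabilities in one population: a two-component
    mixture weighted by that population's (unknown) prevalence."""
    p_d = np.asarray(p_diseased, dtype=float)
    p_n = np.asarray(p_nondiseased, dtype=float)
    return prevalence * p_d + (1.0 - prevalence) * p_n

# Hypothetical cell probabilities for three ordered score bins.
p_d = [0.1, 0.2, 0.7]
p_n = [0.6, 0.3, 0.1]
fpr, sens = roc_points(p_d, p_n)
mix_low = population_cell_probs(0.1, p_d, p_n)   # low-prevalence population
mix_high = population_cell_probs(0.3, p_d, p_n)  # higher-prevalence population
```

Only the mixtures (`mix_low`, `mix_high` here) are observable without a gold standard. Two populations with different prevalences give two distinct mixtures of the same pair of component distributions, which is the intuition behind the identifiability requirement stated in the abstract.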

Author information

Corresponding author

Correspondence to Chong Wang.

About this article

Cite this article

Wang, C., Turnbull, B.W., Gröhn, Y.T. et al. Nonparametric estimation of ROC curves based on Bayesian models when the true disease state is unknown. JABES 12, 128–146 (2007). https://doi.org/10.1198/108571107X178095

  • DOI: https://doi.org/10.1198/108571107X178095
