
Density Estimation in Representation Space to Predict Model Uncertainty

Conference paper in: Engineering Dependable and Secure Machine Learning Systems (EDSMLS 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1272)

Abstract

Deep learning models frequently make incorrect predictions with high confidence when presented with test examples that are not well represented in their training dataset. We propose a novel and straightforward approach to estimate prediction uncertainty in a pre-trained neural network model. Our method estimates the training data density in representation space for a novel input. A neural network model then uses this information to determine whether we expect the pre-trained model to make a correct prediction. This uncertainty model is trained by predicting in-distribution errors, but can detect out-of-distribution data without having seen any such example. We test our method on a state-of-the-art image classification model in both the in-distribution uncertainty estimation and the out-of-distribution detection settings.
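
To make this concrete, the sketch below illustrates one way such a pipeline could be wired up. It is illustrative only: the names encoder, train_feats and uncertainty_net, the network width, and the use of Euclidean nearest-neighbour distances as the density features are assumptions, not details taken from the paper.

```python
# Minimal sketch (illustrative names): distances from a new input's representation
# to its k nearest training representations are fed to a small network that
# predicts whether the frozen, pre-trained classifier will be correct on that input.
import torch
import torch.nn as nn

k = 10  # number of nearest neighbours fed to the classifier (see Appendix)

# uncertainty model: maps k nearest-neighbour distances to P(prediction is correct)
uncertainty_net = nn.Sequential(
    nn.Linear(k, 128), nn.ReLU(),
    nn.Linear(128, 1), nn.Sigmoid(),
)

def knn_features(z, train_feats, k=k):
    """Distances from representation z (1 x d) to its k nearest neighbours among
    the stored training representations train_feats (n x d)."""
    d = torch.cdist(z, train_feats)          # (1, n) pairwise Euclidean distances
    return d.topk(k, largest=False).values   # (1, k) smallest distances

# Usage with a frozen feature extractor `encoder` (assumed given):
#   z = encoder(x)                                   # representation of a new input x
#   p_correct = uncertainty_net(knn_features(z, train_feats))
```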


References

  1. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D.: Concrete problems in AI safety. arXiv:1606.06565 [cs], June 2016

  2. Athalye, A., Engstrom, L., Ilyas, A., Kwok, K.: Synthesizing robust adversarial examples. arXiv:1707.07397 [cs], July 2017

  3. DeVries, T., Taylor, G.W.: Learning confidence for out-of-distribution detection in neural networks. arXiv:1802.04865 [cs, stat], February 2018

  4. Feinman, R., Curtin, R.R., Shintre, S., Gardner, A.B.: Detecting adversarial samples from artifacts. arXiv:1703.00410 [cs, stat], March 2017

  5. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. arXiv:1506.02142 [cs, stat], June 2015

  6. Garnelo, M., et al.: Conditional neural processes. arXiv:1807.01613 [cs, stat], July 2018

  7. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. arXiv:1706.04599 [cs], June 2017

  8. Hafner, D., Tran, D., Lillicrap, T., Irpan, A., Davidson, J.: Reliable uncertainty estimates in deep neural networks using noise contrastive priors. arXiv preprint arXiv:1807.09289 (2018)

  9. Hechtlinger, Y., Póczos, B., Wasserman, L.: Cautious deep learning. arXiv:1805.09460 [cs, stat], May 2018

  10. Hendrycks, D., Dietterich, T.: Benchmarking neural network robustness to common corruptions and perturbations. arXiv:1903.12261 [cs, stat], March 2019

  11. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv:1610.02136 [cs], October 2016

  12. Huang, S., Papernot, N., Goodfellow, I., Duan, Y., Abbeel, P.: Adversarial attacks on neural network policies (2017)

  13. Jiang, H., Kim, B., Guan, M., Gupta, M.: To trust or not to trust a classifier. arXiv:1805.11783 [cs, stat], May 2018

  14. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

  15. Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: Advances in Neural Information Processing Systems, pp. 6402–6413 (2017)

  16. Lee, K., Lee, H., Lee, K., Shin, J.: Training confidence-calibrated classifiers for detecting out-of-distribution samples. arXiv:1711.09325 [cs, stat], November 2017

  17. Lee, K., Lee, K., Lee, H., Shin, J.: A simple unified framework for detecting out-of-distribution samples and adversarial attacks. arXiv:1807.03888 [cs, stat], July 2018

  18. Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. arXiv:1706.02690 [cs, stat], June 2017

  19. Malinin, A., Gales, M.: Predictive uncertainty estimation via prior networks. In: Advances in Neural Information Processing Systems, pp. 7047–7058 (2018)

  20. Nalisnick, E., Matsukawa, A., Teh, Y.W., Gorur, D., Lakshminarayanan, B.: Do deep generative models know what they don’t know? arXiv:1810.09136 [cs, stat], October 2018

  21. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011 (2011)

  22. Nguyen, A., Yosinski, J., Clune, J.: Deep neural networks are easily fooled: high confidence predictions for unrecognizable images, December 2014

  23. Oberdiek, P., Rottmann, M., Gottschalk, H.: Classification uncertainty of deep neural networks based on gradient information. arXiv:1805.08440 [cs, stat], May 2018

  24. Osband, I., Aslanides, J., Cassirer, A.: Randomized prior functions for deep reinforcement learning. arXiv:1806.03335 [cs, stat], June 2018

  25. Papernot, N., McDaniel, P.: Deep k-nearest neighbors: towards confident, interpretable and robust deep learning, March 2018

  26. Recht, B., Roelofs, R., Schmidt, L., Shankar, V.: Do ImageNet classifiers generalize to ImageNet? arXiv preprint arXiv:1902.10811 (2019)

  27. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y

  28. Sensoy, M., Kaplan, L., Kandemir, M.: Evidential deep learning to quantify classification uncertainty. In: Advances in Neural Information Processing Systems, pp. 3179–3189 (2018)

  29. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. arXiv:1703.05175 [cs, stat], March 2017

  30. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning, February 2016

  31. Uesato, J., O’Donoghue, B., Oord, A.V.D., Kohli, P.: Adversarial risk and the dangers of evaluating against weak attacks. arXiv:1802.05666 [cs, stat], February 2018

  32. Zaheer, M., Kottur, S., Ravanbakhsh, S., Poczos, B., Salakhutdinov, R.R., Smola, A.J.: Deep sets. arXiv:1703.06114 [cs, stat], March 2017

Acknowledgements

The authors thank Elco Bakker for insightful feedback and comments on the paper.

Author information

Corresponding author

Correspondence to Miguel Miranda.

A Appendix

We tuned the hyperparameters L, the number of layers, and k, the number of nearest neighbors fed to the classifier. All models were trained using the Adam optimizer with a learning rate of \(10^{-3}\) annealed to \(10^{-4}\) after 40,000 steps. We trained each model for a single epoch before validating. We considered the following values for the hyperparameters: \(L\in \{1, 2, 3\}\) (with \(L=1\) corresponding to a linear model) and \(k\in \{10, 50, 100, 200\}\).
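
As a rough illustration of this setup, the sketch below builds an \(L\)-layer model over the \(k\) nearest-neighbour distances and configures the optimizer and annealing schedule described above. The hidden width, loss, and choice of scheduler are assumptions, not details taken from the paper.

```python
# Sketch of the training setup described above (assumed details are noted):
# an L-layer MLP over the k nearest-neighbour distances, trained with Adam at
# 1e-3 and annealed to 1e-4 after 40,000 steps.
import torch
import torch.nn as nn

def make_uncertainty_model(L, k, hidden=128):
    """L=1 is a linear model; larger L adds hidden layers (the hidden width is an
    assumption, not stated in the paper)."""
    if L == 1:
        layers = [nn.Linear(k, 1)]
    else:
        layers = [nn.Linear(k, hidden), nn.ReLU()]
        for _ in range(L - 2):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        layers += [nn.Linear(hidden, 1)]
    return nn.Sequential(*layers)

model = make_uncertainty_model(L=2, k=200)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# anneal the learning rate from 1e-3 to 1e-4 after 40,000 steps
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[40_000], gamma=0.1)

# inside the (single-epoch) training loop, per step:
#   loss = nn.functional.binary_cross_entropy_with_logits(model(features), is_error)
#   loss.backward(); opt.step(); opt.zero_grad(); sched.step()
```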

We chose the hyperparameters used in the paper, \(L=2\) and \(k\in \{10, 200\}\), based on the AUROC [11] on the in-distribution mistake prediction task using the ILSVRC2012 validation set (Fig. 4).

Fig. 4. AUROC metric reported for models with \(L\in \{1, 2, 3\}\) and \(k\in \{10, 50, 100, 200\}\).
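
The selection criterion used in this comparison can be sketched as follows; the helper and array names are hypothetical, not from the paper.

```python
# Sketch of the selection criterion: AUROC of the uncertainty scores for
# predicting in-distribution mistakes on the held-out validation set.
from sklearn.metrics import roc_auc_score

def mistake_prediction_auroc(val_scores, val_is_error):
    """val_scores: predicted probability of a mistake (e.g. 1 - P(correct) from
    the uncertainty model); val_is_error: 1 where the frozen classifier was
    wrong on an ILSVRC2012 validation example, 0 otherwise."""
    return roc_auc_score(val_is_error, val_scores)

# Pick (L, k) with the highest validation AUROC, e.g.:
#   best = max(candidates, key=lambda cfg: mistake_prediction_auroc(*evaluate(cfg)))
```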

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ramalho, T., Miranda, M. (2020). Density Estimation in Representation Space to Predict Model Uncertainty. In: Shehory, O., Farchi, E., Barash, G. (eds) Engineering Dependable and Secure Machine Learning Systems. EDSMLS 2020. Communications in Computer and Information Science, vol 1272. Springer, Cham. https://doi.org/10.1007/978-3-030-62144-5_7

  • DOI: https://doi.org/10.1007/978-3-030-62144-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62143-8

  • Online ISBN: 978-3-030-62144-5
