Explaining and Interpreting LSTMs

Abstract

While neural networks have acted as a strong unifying force in the design of modern AI systems, the neural network architectures themselves remain highly heterogeneous due to the variety of tasks to be solved. In this chapter, we explore how to adapt the Layer-wise Relevance Propagation (LRP) technique used for explaining the predictions of feed-forward networks to the LSTM architecture used for sequential data modeling and forecasting. The special accumulators and gated interactions present in the LSTM require both a new propagation scheme and an extension of the underlying theoretical framework to deliver faithful explanations.
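Before the chapter proper, a minimal NumPy sketch of the two propagation rules that such an LSTM scheme combines may help fix ideas: an epsilon-stabilized LRP rule for linear layers, and a "signal-take-all" rule for gated products in which the multiplicative gate receives no relevance and the signal it modulates receives all of it. This is our own illustration under simplified assumptions (function names, shapes, and the default eps are ours), not the authors' reference implementation.

```python
import numpy as np

def lrp_linear(x, w, b, r_out, eps=0.001):
    """Epsilon-LRP through a linear layer z = w @ x + b.

    The relevance r_out of each output neuron is redistributed onto the
    inputs x in proportion to the contributions w[i, j] * x[j]; the bias
    and the eps stabilizer absorb the remainder.
    """
    z = w @ x + b                                   # pre-activations, shape (out,)
    denom = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilized denominator
    ratios = (w * x[np.newaxis, :]) / denom[:, np.newaxis]
    return ratios.T @ r_out                         # input relevances, shape (in,)

def lrp_gated_product(gate, source, r_out):
    """Multiplicative interaction y = gate * source.

    Under the signal-take-all convention, the gate is treated as a switch
    and receives zero relevance; the source it modulates receives all of it.
    """
    return np.zeros_like(gate), r_out.copy()

# Toy usage: relevance flowing back through one gated product.
rng = np.random.default_rng(0)
gate, source = rng.uniform(0, 1, size=4), rng.normal(size=4)
r_gate, r_source = lrp_gated_product(gate, source, r_out=np.ones(4))
assert np.allclose(r_gate + r_source, 1.0)          # locally conserved
```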

L. Arras and J. Arjona-Medina contributed equally to this work.


Notes

  1. The global conservation is exact up to the relevance absorbed by the stabilizing term and by the biases; see Sect. 11.3.1 for details.
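
    In standard LRP notation (our paraphrase, not a quotation of Sect. 11.3.1), the stabilized rule and the absorbed share read

    \[ R_j = \sum_i \frac{w_{ij} x_j}{z_i + \epsilon\,\mathrm{sign}(z_i)} \, R_i, \qquad z_i = \sum_{j'} w_{ij'} x_{j'} + b_i, \]

    so from each \(R_i\) the fraction \((b_i + \epsilon\,\mathrm{sign}(z_i))/(z_i + \epsilon\,\mathrm{sign}(z_i))\) is absorbed by the bias and the stabilizer rather than passed on to the inputs.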

  2. https://github.com/jamie-murdoch/ContextualDecomposition.

  3. https://github.com/jiweil/Visualizing-and-Understanding-Neural-Models-in-NLP.

  4. Except for the LRP-prop variant, where we take \(\epsilon =0.2\). We tried the following values: [0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 1.0], and took the lowest value that achieved numerical stability.

  5. Ancona et al. [1] also performed a comparative study of explanations on LSTMs; however, in order to redistribute the relevance through product layers, the authors use standard gradient backpropagation. This redistribution scheme violates one of the key underlying properties of LRP, namely local relevance conservation, hence their results for LRP are not conclusive.
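
    Schematically (this illustrates the general issue, not the exact computation in [1]): for a product neuron \(p = g \cdot s\) with incoming relevance \(R_p\), weighting each factor by its own contribution \(x\,\partial p/\partial x\) yields

    \[ R_g = g\,\frac{\partial p}{\partial g}\,\frac{R_p}{p} = R_p, \qquad R_s = s\,\frac{\partial p}{\partial s}\,\frac{R_p}{p} = R_p, \]

    so \(R_g + R_s = 2R_p\), whereas a locally conserving rule must redistribute exactly \(R_p\); the signal-take-all convention (\(R_g = 0\), \(R_s = R_p\)) satisfies this by construction.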

  6. We use an arbitrary minimum magnitude of 0.5 only to simplify training (since sampling very small numbers would encourage the model weights to grow rapidly).

  7. The same phenomenon can occur on the addition problem when using only positive numbers as input. In the specific toy tasks we considered, on the other hand, the cell input (\(z_t\)) is required to process the numbers to add/subtract, and the cell state (\(c_t\)) accumulates the result of the arithmetic operation.
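
    For reference, the standard LSTM cell update with forget gate reads

    \[ c_t = f_t \odot c_{t-1} + i_t \odot z_t, \qquad h_t = o_t \odot \tanh(c_t), \]

    where \(i_t, f_t, o_t\) are the input, forget and output gates: on these arithmetic tasks the term \(i_t \odot z_t\) injects the processed current number while \(f_t \odot c_{t-1}\) carries the running result forward.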

References

  1. Ancona, M., Ceolini, E., Öztireli, C., Gross, M.: Towards better understanding of gradient-based attribution methods for deep neural networks. In: International Conference on Learning Representations (ICLR) (2018)


  2. Arjona-Medina, J.A., Gillhofer, M., Widrich, M., Unterthiner, T., Brandstetter, J., Hochreiter, S.: RUDDER: return decomposition for delayed rewards. arXiv:1806.07857 (2018)

  3. Arras, L., Horn, F., Montavon, G., Müller, K.R., Samek, W.: “What is relevant in a text document?”: An interpretable machine learning approach. PLoS ONE 12(8), e0181142 (2017)


  4. Arras, L., Montavon, G., Müller, K.R., Samek, W.: Explaining recurrent neural network predictions in sentiment analysis. In: Proceedings of the EMNLP 2017 Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), pp. 159–168 (2017)


  5. Arras, L., Osman, A., Müller, K.R., Samek, W.: Evaluating recurrent neural network explanations. In: Proceedings of the ACL 2019 Workshop on BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pp. 113–126. Association for Computational Linguistics (2019)


  6. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10(7), e0130140 (2015)


  7. Bakker, B.: Reinforcement learning with long short-term memory. In: Advances in Neural Information Processing Systems 14 (NIPS), pp. 1475–1482 (2002)


  8. Bakker, B.: Reinforcement learning by backpropagation through an LSTM model/critic. In: IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 127–134 (2007)


  9. Becker, S., Ackermann, M., Lapuschkin, S., Müller, K.R., Samek, W.: Interpreting and explaining deep neural networks for classification of audio signals. arXiv:1807.03418 (2018)

  10. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Networks 5(2), 157–166 (1994)


  11. Chen, J., Song, L., Wainwright, M., Jordan, M.: Learning to explain: an information-theoretic perspective on model interpretation. In: Proceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, pp. 883–892 (2018)


  12. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734. Association for Computational Linguistics (2014)


  13. Denil, M., Demiraj, A., de Freitas, N.: Extraction of salient sentences from labelled documents. arXiv:1412.6815 (2015)

  14. Ding, Y., Liu, Y., Luan, H., Sun, M.: Visualizing and understanding neural machine translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1150–1159. Association for Computational Linguistics (2017)


  15. Donahue, J., et al.: Long-term recurrent convolutional networks for visual recognition and description. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 677–691 (2017)


  16. EU-GDPR: Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). Official J. Eur. Union L 119(59), 1–88 (2016)


  17. Geiger, J.T., Zhang, Z., Weninger, F., Schuller, B., Rigoll, G.: Robust speech recognition using long short-term memory recurrent neural networks for hybrid acoustic modelling. In: Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 631–635 (2014)


  18. Gers, F.A., Schmidhuber, J.: Recurrent nets that time and count. In: Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), vol. 3, pp. 189–194 (2000)


  19. Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. In: Proceedings of the International Conference on Artificial Neural Networks (ICANN), vol. 2, pp. 850–855 (1999)


  20. Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)


  21. Gevrey, M., Dimopoulos, I., Lek, S.: Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecol. Model. 160(3), 249–264 (2003)


  22. Gonzalez-Dominguez, J., Lopez-Moreno, I., Sak, H., Gonzalez-Rodriguez, J., Moreno, P.J.: Automatic language identification using long short-term memory recurrent neural networks. In: Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 2155–2159 (2014)


  23. Graves, A.: Generating sequences with recurrent neural networks. arXiv:1308.0850 (2014)

  24. Graves, A., Liwicki, M., Fernandez, S., Bertolami, R., Bunke, H., Schmidhuber, J.: A novel connectionist system for unconstrained handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31(5), 855–868 (2009)


  25. Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18(5–6), 602–610 (2005)


  26. Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: a search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28(10), 2222–2232 (2017)


  27. Hausknecht, M., Stone, P.: Deep recurrent Q-learning for partially observable MDPs. In: AAAI Fall Symposium Series - Sequential Decision Making for Intelligent Agents, pp. 29–37 (2015)


  28. Heess, N., Wayne, G., Tassa, Y., Lillicrap, T., Riedmiller, M., Silver, D.: Learning and transfer of modulated locomotor controllers. arXiv:1610.05182 (2016)

  29. Hochreiter, S.: Implementierung und Anwendung eines ‘neuronalen’ Echtzeit-Lernalgorithmus für reaktive Umgebungen. Practical work, Institut für Informatik, Technische Universität München (1990)


  30. Hochreiter, S.: Untersuchungen zu dynamischen neuronalen Netzen. Master’s thesis. Institut für Informatik, Technische Universität München (1991)


  31. Hochreiter, S.: Recurrent neural net learning and vanishing gradient. In: Freksa, C. (ed.) Proceedings in Artificial Intelligence - Fuzzy-Neuro-Systeme 1997 Workshop, pp. 130–137. Infix (1997)


  32. Hochreiter, S.: The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertainty Fuzziness Knowl. Based Syst. 6(2), 107–116 (1998)


  33. Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J.: Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: Kolen, J.F., Kremer, S.C. (eds.) A Field Guide to Dynamical Recurrent Networks, pp. 237–244. IEEE Press, New York (2001)


  34. Hochreiter, S., Heusel, M., Obermayer, K.: Fast model-based protein homology detection without alignment. Bioinformatics 23(14), 1728–1736 (2007)


  35. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Technical report, FKI-207-95, Fakultät für Informatik, Technische Universität München (1995)


  36. Hochreiter, S., Schmidhuber, J.: LSTM can solve hard long time lag problems. In: Advances in Neural Information Processing Systems 9 (NIPS), pp. 473–479 (1996)


  37. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)


  38. Hochreiter, S., Younger, A.S., Conwell, P.R.: Learning to learn using gradient descent. In: Proceedings of the International Conference on Artificial Neural Networks (ICANN), pp. 87–94 (2001)


  39. Horst, F., Lapuschkin, S., Samek, W., Müller, K.R., Schöllhorn, W.I.: Explaining the unique nature of individual gait patterns with deep learning. Sci. Rep. 9, 2391 (2019)


  40. Kauffmann, J., Esders, M., Montavon, G., Samek, W., Müller, K.R.: From clustering to cluster explanations via neural networks. arXiv:1906.07633 (2019)

  41. Landecker, W., Thomure, M.D., Bettencourt, L.M.A., Mitchell, M., Kenyon, G.T., Brumby, S.P.: Interpreting individual classifications of hierarchical networks. In: IEEE Symposium on Computational Intelligence and Data Mining (CIDM), pp. 32–38 (2013)


  42. Lapuschkin, S., Binder, A., Montavon, G., Müller, K.R., Samek, W.: The LRP toolbox for artificial neural networks. J. Mach. Learn. Res. 17(114), 1–5 (2016)


  43. Lapuschkin, S., Binder, A., Müller, K.R., Samek, W.: Understanding and comparing deep neural networks for age and gender classification. In: IEEE International Conference on Computer Vision Workshops, pp. 1629–1638 (2017)


  44. Lapuschkin, S., Wäldchen, S., Binder, A., Montavon, G., Samek, W., Müller, K.R.: Unmasking clever hans predictors and assessing what machines really learn. Nat. Commun. 10, 1096 (2019)


  45. Li, J., Chen, X., Hovy, E., Jurafsky, D.: Visualizing and understanding neural models in NLP. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 681–691. Association for Computational Linguistics (2016)


  46. Li, J., Monroe, W., Jurafsky, D.: Understanding neural networks through representation erasure. arXiv:1612.08220 (2017)

  47. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems 30 (NIPS), pp. 4765–4774 (2017)


  48. Luoma, J., Ruutu, S., King, A.W., Tikkanen, H.: Time delays, competitive interdependence, and firm performance. Strateg. Manag. J. 38(3), 506–525 (2017)


  49. Marchi, E., Ferroni, G., Eyben, F., Gabrielli, L., Squartini, S., Schuller, B.: Multi-resolution linear prediction based features for audio onset detection with bidirectional LSTM neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2164–2168 (2014)


  50. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: Proceedings of the 33rd International Conference on Machine Learning (ICML), vol. 48, pp. 1928–1937 (2016)


  51. Montavon, G., Binder, A., Lapuschkin, S., Samek, W., Müller, K.-R.: Layer-wise relevance propagation: an overview. In: Samek, W. et al. (eds.) Explainable AI, LNCS 11700, pp. 193–209. Springer, Heidelberg (2019)


  52. Montavon, G., Lapuschkin, S., Binder, A., Samek, W., Müller, K.R.: Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recogn. 65, 211–222 (2017)


  53. Montavon, G., Samek, W., Müller, K.R.: Methods for interpreting and understanding deep neural networks. Digit. Signal Proc. 73, 1–15 (2018)


  54. Morcos, A.S., Barrett, D.G., Rabinowitz, N.C., Botvinick, M.: On the importance of single directions for generalization. In: International Conference on Learning Representations (ICLR) (2018)


  55. Munro, P.: A dual back-propagation scheme for scalar reward learning. In: Proceedings of the Ninth Annual Conference of the Cognitive Science Society, pp. 165–176 (1987)


  56. Murdoch, W.J., Liu, P.J., Yu, B.: Beyond word importance: contextual decomposition to extract interactions from LSTMs. In: International Conference on Learning Representations (ICLR) (2018)


  57. Poerner, N., Schütze, H., Roth, B.: Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 340–350. Association for Computational Linguistics (2018)


  58. Rahmandad, H., Repenning, N., Sterman, J.: Effects of feedback delay on learning. Syst. Dyn. Rev. 25(4), 309–338 (2009)


  59. Rieger, L., Chormai, P., Montavon, G., Hansen, L.K., Müller, K.-R.: Structuring neural networks for more explainable predictions. In: Escalante, H.J., et al. (eds.) Explainable and Interpretable Models in Computer Vision and Machine Learning. TSSCML, pp. 115–131. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98131-4_5


  60. Robinson, A.J.: Dynamic error propagation networks. Ph.D. thesis, Trinity Hall and Cambridge University Engineering Department (1989)


  61. Robinson, T., Fallside, F.: Dynamic reinforcement driven error propagation networks with application to game playing. In: Proceedings of the 11th Conference of the Cognitive Science Society, Ann Arbor, pp. 836–843 (1989)


  62. Sahni, H.: Reinforcement learning never worked, and ‘deep’ only helped a bit. himanshusahni.github.io/2018/02/23/reinforcement-learning-never-worked.html (2018)

  63. Sak, H., Senior, A., Beaufays, F.: Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH), Singapore, pp. 338–342 (2014)


  64. Samek, W., Binder, A., Montavon, G., Lapuschkin, S., Müller, K.R.: Evaluating the visualization of what a deep neural network has learned. IEEE Trans. Neural Netw. Learn. Syst. 28(11), 2660–2673 (2017)


  65. Schmidhuber, J.: Making the world differentiable: on using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical report, FKI-126-90 (revised), Institut für Informatik, Technische Universität München (1990). Experiments by Sepp Hochreiter


  66. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)


  67. Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: Proceedings of the 34th International Conference on Machine Learning (ICML), vol. 70, pp. 3145–3153 (2017)


  68. Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. In: International Conference on Learning Representations (ICLR) (2014)


  69. Socher, R., et al.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642. Association for Computational Linguistics (2013)


  70. Srivastava, N., Mansimov, E., Salakhudinov, R.: Unsupervised learning of video representations using LSTMs. In: Proceedings of the 32nd International Conference on Machine Learning (ICML), vol. 37, pp. 843–852 (2015)


  71. Sturm, I., Lapuschkin, S., Samek, W., Müller, K.R.: Interpretable deep neural networks for single-trial EEG classification. J. Neurosci. Methods 274, 141–145 (2016)


  72. Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: Proceedings of the 34th International Conference on Machine Learning (ICML), vol. 70, pp. 3319–3328 (2017)


  73. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems 27 (NIPS), pp. 3104–3112 (2014)


  74. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. MIT Press, Cambridge (2017). Draft from November 2017


  75. Thuillier, E., Gamper, H., Tashev, I.J.: Spatial audio feature discovery with convolutional neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6797–6801 (2018)


  76. Venugopalan, S., Xu, H., Donahue, J., Rohrbach, M., Mooney, R., Saenko, K.: Translating videos to natural language using deep recurrent neural networks. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 1494–1504. Association for Computational Linguistics (2015)


  77. Yang, Y., Tresp, V., Wunderle, M., Fasching, P.A.: Explaining therapy predictions with layer-wise relevance propagation in neural networks. In: IEEE International Conference on Healthcare Informatics (ICHI), pp. 152–162 (2018)


  78. Zaremba, W., Sutskever, I., Vinyals, O.: Recurrent neural network regularization. arXiv:1409.2329 (2015)

  79. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53


  80. Zhang, J., Lin, Z., Brandt, J., Shen, X., Sclaroff, S.: Top-down neural attention by excitation backprop. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 543–559. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_33



Acknowledgements

This work was supported by the German Ministry for Education and Research as Berlin Big Data Centre (01IS14013A), Berlin Center for Machine Learning (01IS18037I) and TraMeExCo (01IS18056A). Partial funding by DFG is acknowledged (EXC 2046/1, project-ID: 390685689). This work was also supported by the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2017-0-00451, No. 2017-0-01779).

Author information

Correspondence to Wojciech Samek.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Arras, L. et al. (2019). Explaining and Interpreting LSTMs. In: Samek, W., Montavon, G., Vedaldi, A., Hansen, L., Müller, K.R. (eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Lecture Notes in Computer Science, vol. 11700. Springer, Cham. https://doi.org/10.1007/978-3-030-28954-6_11

  • DOI: https://doi.org/10.1007/978-3-030-28954-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28953-9

  • Online ISBN: 978-3-030-28954-6

  • eBook Packages: Computer Science, Computer Science (R0)
