Improving Interpretability via Explicit Word Interaction Graph Layer
DOI:
https://doi.org/10.1609/aaai.v37i11.26586
Keywords:
SNLP: Interpretability & Analysis of NLP Models, ML: Graph-based Machine Learning
Abstract
Recent NLP literature has seen growing interest in improving model interpretability. Along this direction, we propose a trainable neural network layer that learns a global interaction graph between words and then selects more informative words using the learned word interactions. Our layer, which we call WIGRAPH, can plug into any neural network-based NLP text classifier right after its word embedding layer. Across multiple SOTA NLP models and various NLP datasets, we demonstrate that adding the WIGRAPH layer substantially improves NLP models' interpretability and enhances models' prediction performance at the same time.
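The abstract gives only a high-level description of the layer, so the following is a minimal illustrative sketch in plain Python of the two steps it names: mixing word embeddings over a vocabulary-level interaction graph, then keeping the most informative words. It is not the authors' implementation. The function name `wigraph_like_layer`, the parameters `edge_logits`, `word_logits`, and `keep`, the sigmoid edge weights, the row normalization, and the top-k selection rule are all assumptions made for illustration. In a real model the logits would be trained jointly with the classifier rather than fixed by hand.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def wigraph_like_layer(token_ids, embeddings, edge_logits, word_logits, keep=2):
    """Hypothetical forward pass (an assumption, not the paper's method).

    token_ids   : word indices of one sentence
    embeddings  : vocabulary-sized list of embedding vectors
    edge_logits : vocabulary x vocabulary table of interaction scores (the "global graph")
    word_logits : one importance score per vocabulary word
    keep        : number of words to retain
    """
    if not token_ids:
        return [], []
    n = len(token_ids)
    dim = len(embeddings[token_ids[0]])

    # Step 1: read this sentence's edge weights out of the global table.
    # Each word keeps a self-loop of weight 1.
    weights = [
        [1.0 if i == j else sigmoid(edge_logits[token_ids[i]][token_ids[j]])
         for j in range(n)]
        for i in range(n)
    ]

    # Step 2: each word's new vector is a weighted average over the sentence.
    mixed = []
    for i in range(n):
        total = sum(weights[i])
        mixed.append([
            sum(weights[i][j] * embeddings[token_ids[j]][k] for j in range(n)) / total
            for k in range(dim)
        ])

    # Step 3: score words and zero out all but the `keep` highest-scoring ones.
    scores = [sigmoid(word_logits[t]) for t in token_ids]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    output = [
        [v if i in kept else 0.0 for v in mixed[i]]
        for i in range(n)
    ]
    return output, kept


# Toy data: 4-word vocabulary, 2-dimensional embeddings, all values made up.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
edge_logits = [[0.0] * 4 for _ in range(4)]
word_logits = [2.0, -1.0, 0.5, -3.0]

output, kept = wigraph_like_layer([0, 1, 2], embeddings, edge_logits, word_logits, keep=2)
print(kept)    # positions of the retained words
print(output)  # masked, graph-mixed embeddings passed on to the classifier
```

In this sketch the positions in `kept` double as a simple explanation of which words the downstream classifier was allowed to see, which is the sense in which such a layer could support interpretability.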
Published
2023-06-26
How to Cite
Sekhon, A., Chen, H., Shrivastava, A., Wang, Z., Ji, Y., & Qi, Y. (2023). Improving Interpretability via Explicit Word Interaction Graph Layer. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13528-13537. https://doi.org/10.1609/aaai.v37i11.26586
Issue
Section
AAAI Technical Track on Speech & Natural Language Processing