Abstract
Previous work has shown that jointly modeling two Natural Language Processing (NLP) tasks is effective for improving the performance of both, and many task-specific joint models have been proposed. This paper proposes a Hierarchical Long Short-Term Memory (HLSTM) model, together with several variants, for modeling two tasks jointly. The models are flexible enough to handle different types of task combinations and avoid task-specific feature engineering. Besides exploiting the correlation between the two tasks, our models also take their hierarchical relation into consideration, which has not been discussed in previous work. Experimental results show that our models outperform strong baselines on three different types of task combinations. While both the correlation and the hierarchical relation between the two tasks help improve performance on both, the models especially boost the performance of the task at the top of the hierarchy.
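Since the full model description is behind the paywall, the following is only a minimal sketch of the architecture the abstract suggests: two stacked LSTM layers, where the lower layer tags the lower-level task and the upper layer consumes the lower layer's hidden states to tag the higher-level task, so the top task can use lower-level information. It is written in PyTorch rather than the authors' Theano setup, and all names, wiring details, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class HierarchicalLSTM(nn.Module):
    """Sketch of a two-level hierarchical LSTM for joint sequence labeling.

    The lower LSTM tags the lower-level task (e.g. word segmentation or
    slot filling); the upper LSTM stacks on the lower LSTM's hidden states
    and tags the higher-level task (e.g. POS tagging or intent detection).
    """

    def __init__(self, vocab_size, emb_dim, hidden_dim, n_low_tags, n_high_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Lower LSTM reads word embeddings and feeds the lower task's tagger.
        self.lstm_low = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out_low = nn.Linear(hidden_dim, n_low_tags)
        # Upper LSTM consumes the lower layer's hidden states, so the
        # higher-level task sits on top of the hierarchy.
        self.lstm_high = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out_high = nn.Linear(hidden_dim, n_high_tags)

    def forward(self, tokens):
        h_low, _ = self.lstm_low(self.embed(tokens))
        h_high, _ = self.lstm_high(h_low)
        # One tag score vector per token, for each of the two tasks.
        return self.out_low(h_low), self.out_high(h_high)

# Usage: a batch of 2 sentences of 20 token ids (all sizes hypothetical).
model = HierarchicalLSTM(vocab_size=10000, emb_dim=64, hidden_dim=128,
                         n_low_tags=5, n_high_tags=12)
low_scores, high_scores = model(torch.randint(0, 10000, (2, 20)))
```

Training would sum the two tasks' tagging losses; this is only one plausible reading of "hierarchical", with the upper task stacked directly on the lower task's representations.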
Notes
- 2. Sentences with ‘request’ intent are not included, since those sentences never contain slot values.
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (No. 61273365), the discipline building plan in 111 base (No. B08004), the Engineering Research Center of Information Networks of MOE, and the Co-construction Program with the Beijing Municipal Commission of Education.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Zhou, Q., Wen, L., Wang, X., Ma, L., Wang, Y. (2016). A Hierarchical LSTM Model for Joint Tasks. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_27
DOI: https://doi.org/10.1007/978-3-319-47674-2_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2