
Abstract

Previous work has shown that jointly modeling two Natural Language Processing (NLP) tasks is effective for improving the performance of both, and many task-specific joint models have been proposed. This paper proposes a Hierarchical Long Short-Term Memory (HLSTM) model, together with several variants, for jointly modeling two tasks. The models are flexible enough to handle different types of task combinations and avoid task-specific feature engineering. Beyond capturing the correlation between the two tasks, our models also take the hierarchical relations between them into consideration, which previous work has not discussed. Experimental results show that our models outperform strong baselines on three different types of task combinations. While both correlation information and hierarchical relations between the two tasks help improve performance on both, the models especially boost the performance of the task at the top of the hierarchy.
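The core idea, a hierarchy in which a token-level task feeds a sentence-level task on top of a shared recurrent encoder, can be illustrated with a toy sketch. This is not the authors' HLSTM: it replaces the LSTM gates with a single tanh recurrence, and all function names, weights, and the slot/intent task pairing are illustrative assumptions.

```python
import math

def recurrent_states(xs, w_x=0.5, w_h=0.8):
    # Simplified recurrent unit standing in for an LSTM:
    # h_t = tanh(w_x * x_t + w_h * h_{t-1})
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def lower_task(states, threshold=0.0):
    # Token-level task (e.g. slot filling): one label per time step,
    # read directly off the shared recurrent states.
    return [1 if h > threshold else 0 for h in states]

def upper_task(states, lower_labels, w_state=1.0, w_label=0.5):
    # Sentence-level task (e.g. intent detection) sits at the top of
    # the hierarchy: it sees the final recurrent state *and* the lower
    # task's predictions, so the two tasks are modeled jointly rather
    # than in isolation.
    score = w_state * states[-1] + w_label * sum(lower_labels) / len(lower_labels)
    return 1 if score > 0.5 else 0

def joint_predict(xs):
    states = recurrent_states(xs)
    slots = lower_task(states)
    intent = upper_task(states, slots)
    return slots, intent
```

The hierarchical coupling is the `lower_labels` argument to `upper_task`: the top task conditions on the bottom task's output, which is the relation the abstract argues helps the top task most.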


Notes

  1. http://camdial.org/~mh521/dstc/.

  2. Sentences with 'request' intent are excluded, since those sentences never contain slot values.

  3. http://taku910.github.io/crfpp/.

  4. http://nlp.fudan.edu.cn/nlpcc2015.


Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (No. 61273365), the discipline building plan in 111 base (No. B08004), the Engineering Research Center of Information Networks of MOE, and the Co-construction Program with the Beijing Municipal Commission of Education.

Author information


Corresponding author

Correspondence to Qianrong Zhou.


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Zhou, Q., Wen, L., Wang, X., Ma, L., Wang, Y. (2016). A Hierarchical LSTM Model for Joint Tasks. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds.) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL/NLP-NABD 2016. Lecture Notes in Computer Science, vol. 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_27


  • DOI: https://doi.org/10.1007/978-3-319-47674-2_27


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47673-5

  • Online ISBN: 978-3-319-47674-2

  • eBook Packages: Computer Science (R0)
