
Flexible Recurrent Neural Networks

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Abstract

We introduce two methods that enable recurrent neural networks (RNNs) to trade off accuracy for computational cost during the analysis of a sequence. This makes it possible to adapt RNNs in real time to changing computational constraints, for example when running on shared hardware alongside other processes or in mobile edge computing nodes. The first approach makes minimal changes to the model and therefore avoids loading new parameters from slow memory. In the second approach, different models can replace one another within a sequence analysis; this approach works on a wider range of data sets. We evaluate both approaches on permuted MNIST, the adding task and a human activity recognition task. We demonstrate that changing the computational cost of an RNN with our approaches leads to sensible results: the resulting accuracy and computational cost are typically a weighted average of the corresponding metrics of the models used, and the weight of each model increases with the number of time steps for which it is used.
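The abstract does not specify how the hidden state is carried across models when one model replaces another mid-sequence, so the following is only a minimal sketch of the general idea, assuming PyTorch: a small and a large GRU cell process different parts of the same sequence, with the small cell's hidden state zero-padded when the switch happens. The switch step, layer sizes, and padding scheme are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch (not the paper's method): swap between a cheap and an
# expensive GRU cell within one sequence to trade accuracy for compute.
import torch
import torch.nn as nn

input_size, small_hidden, large_hidden = 8, 16, 64
small_cell = nn.GRUCell(input_size, small_hidden)   # low-cost model
large_cell = nn.GRUCell(input_size, large_hidden)   # high-cost model

seq = torch.randn(100, 1, input_size)   # (time, batch, features)
h = torch.zeros(1, small_hidden)        # start with the cheap model
switch_step = 50                        # assumed point where more compute is available

for t, x in enumerate(seq):
    if t == switch_step:
        # Assumption for illustration: carry the state over by zero-padding
        # it to the larger model's hidden width.
        h = torch.cat([h, torch.zeros(1, large_hidden - small_hidden)], dim=1)
    cell = large_cell if t >= switch_step else small_cell
    h = cell(x, h)

print(h.shape)  # torch.Size([1, 64])
```

The point of the sketch is the control flow only: the per-step cost tracks whichever cell is currently in use, so the average cost over the sequence is a weighted mixture of the two models' costs, consistent with the weighted-average behaviour reported in the abstract.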



Author information

Correspondence to François Schnitzler.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Lambert, A., Le Bolzer, F., Schnitzler, F. (2021). Flexible Recurrent Neural Networks. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12457. Springer, Cham. https://doi.org/10.1007/978-3-030-67658-2_40

  • DOI: https://doi.org/10.1007/978-3-030-67658-2_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67657-5

  • Online ISBN: 978-3-030-67658-2

  • eBook Packages: Computer Science, Computer Science (R0)
