DOI: 10.1145/3292500.3330723 · KDD Conference Proceedings

Research Article · Open Access

Gmail Smart Compose: Real-Time Assisted Writing

Published: 25 July 2019

ABSTRACT

In this paper, we present Smart Compose, a novel system for generating interactive, real-time suggestions in Gmail that assists users in writing emails by reducing repetitive typing. In the design and deployment of such a large-scale and complex system, we faced several challenges, including model selection, performance evaluation, serving, and other practical issues. At the core of Smart Compose is a large-scale neural language model. We leveraged state-of-the-art machine learning techniques for language model training, which enabled high-quality suggestion prediction, and constructed novel serving infrastructure for high-throughput, real-time inference. Experimental results show the effectiveness of our proposed system design and deployment approach. This system is currently served in Gmail.
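To make the core idea concrete, the following is a minimal, illustrative sketch (in Python) of how a language model can drive an autocomplete-style suggestion: greedily extend the user's prefix one token at a time and only surface the suggestion while the model's cumulative confidence stays above a threshold. This is not the production Smart Compose model or serving stack; the toy bigram table, function names, and threshold are assumptions made purely for illustration.

    # Illustrative sketch only: greedy decoding from a toy language model with
    # a confidence cutoff, so a completion is offered only when the model is
    # sufficiently sure. All names and probabilities here are hypothetical.

    # Toy conditional probabilities P(next_word | previous_word).
    BIGRAM_PROBS = {
        "looking": {"forward": 0.9, "at": 0.1},
        "forward": {"to": 0.95, "into": 0.05},
        "to": {"meeting": 0.6, "seeing": 0.4},
        "meeting": {"you": 0.8, "them": 0.2},
    }

    def suggest_completion(prefix_words, max_words=4, min_confidence=0.5):
        """Greedily extend the prefix; stop when confidence drops too low."""
        suggestion = []
        confidence = 1.0
        last_word = prefix_words[-1]
        for _ in range(max_words):
            candidates = BIGRAM_PROBS.get(last_word)
            if not candidates:
                break
            next_word, prob = max(candidates.items(), key=lambda kv: kv[1])
            confidence *= prob
            if confidence < min_confidence:
                break  # suggestion no longer confident enough to show
            suggestion.append(next_word)
            last_word = next_word
        return suggestion

    if __name__ == "__main__":
        # Typing "looking" yields the suggestion "forward to meeting".
        print(" ".join(suggest_completion(["looking"])))

In a real deployment the toy bigram table would be replaced by a large neural language model conditioned on the email context, and the confidence threshold would be tuned so that suggestions are shown only when they are likely to be accepted.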


Supplemental Material

p2287-chen.mp4 (MP4, 891.6 MB)

