Synthetic Data Generation and Shuffled Multi-Round Training Based Offline Handwritten Mathematical Expression Recognition

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Offline handwritten mathematical expression recognition is a challenging optical character recognition (OCR) task because handwritten symbols are ambiguous and expressions have complicated two-dimensional structures. Recent work in this area usually builds ever deeper neural networks, trained end to end, to improve performance. However, the more complex the network, the more computing resources and time it requires. To improve performance without increasing the computing requirements, this paper concentrates on the training data and the training strategy. We propose a data augmentation method that generates synthetic samples with new LaTeX notations using only the official CROHME training data. Moreover, we propose a novel training strategy called Shuffled Multi-Round Training (SMRT) to regularize the model. With the generated data and SMRT, attention-based encoder-decoder models achieve state-of-the-art expression accuracy for offline handwritten mathematical expression recognition: 59.74% on CROHME 2014 and 61.57% on CROHME 2016.
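
The abstract reports expression accuracy without defining it. As a minimal sketch, assuming the usual CROHME-style definition (an expression counts as correct only if its predicted LaTeX token sequence matches the ground truth exactly), the metric could be computed as follows. The function name, the whitespace tokenization, and the sample strings are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def expression_accuracy(
    predictions: Sequence[Sequence[str]],
    references: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of expressions predicted exactly right.

    Assumption: an expression is correct only when every token of the
    predicted LaTeX sequence matches the reference, in order.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("at least one expression is required")
    exact = sum(
        1 for pred, ref in zip(predictions, references) if list(pred) == list(ref)
    )
    return exact / len(references)


# Hypothetical toy data: the second prediction drops the braces around
# the exponent, so it does not count as an exact match.
preds = [r"\frac { a } { b }".split(), "x ^ 2".split(), "a + b".split()]
refs = [r"\frac { a } { b }".split(), "x ^ { 2 }".split(), "a + b".split()]
accuracy = expression_accuracy(preds, refs)
```

Under this assumed definition a single wrong token makes the whole expression incorrect, so the metric is much stricter than symbol-level accuracy.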

Author information

Corresponding author

Correspondence to Xin-Ming Zhang.

About this article

Cite this article

Dong, LF., Liu, HC. & Zhang, XM. Synthetic Data Generation and Shuffled Multi-Round Training Based Offline Handwritten Mathematical Expression Recognition. J. Comput. Sci. Technol. 37, 1427–1443 (2022). https://doi.org/10.1007/s11390-021-0722-4

  • DOI: https://doi.org/10.1007/s11390-021-0722-4
