Synthetic Data Generation and Shuffled Multi-Round Training Based Offline Handwritten Mathematical Expression Recognition

Dong, Lan-Fang; Liu, Han-Chao; Zhang, Xin-Ming

doi:10.1007/s11390-021-0722-4

Regular Paper
Published: 30 November 2022

Journal of Computer Science and Technology, Volume 37, pages 1427–1443 (2022)

Abstract

Offline handwritten mathematical expression recognition is a challenging optical character recognition (OCR) task because handwritten symbols are ambiguous and expressions have complicated two-dimensional structures. Recent work in this area usually builds ever deeper neural networks, trained end to end, to improve performance. However, the more complex the network, the more computing resources and time it requires. To improve performance without increasing computing requirements, this paper concentrates on the training data and the training strategy. We propose a data augmentation method that generates synthetic samples with new LaTeX notations using only the official CROHME training data. Moreover, we propose a novel training strategy called Shuffled Multi-Round Training (SMRT) to regularize the model. With the generated data and the SMRT strategy, attention-based encoder-decoder models achieve state-of-the-art expression accuracy for offline handwritten mathematical expression recognition: 59.74% on CROHME 2014 and 61.57% on CROHME 2016.
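
Expression accuracy is commonly computed as the percentage of test expressions whose predicted LaTeX token sequence exactly matches the ground truth. The following is a minimal Python sketch under that assumption. The function name `expression_accuracy` and the toy token sequences are illustrative and do not come from the paper, and the official CROHME evaluation tooling may normalize labels differently.

```python
from typing import Sequence


def expression_accuracy(
    predictions: Sequence[Sequence[str]],
    references: Sequence[Sequence[str]],
) -> float:
    """Return the percentage of expressions recognized exactly.

    Each expression is a sequence of LaTeX tokens. A prediction counts as
    correct only if every token matches the reference in order
    (assumption: no label normalization is applied).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    exact = sum(
        1 for pred, ref in zip(predictions, references) if list(pred) == list(ref)
    )
    return 100.0 * exact / len(references)


# Toy data (hypothetical): two of the four predictions match exactly.
refs = [
    ["x", "^", "{", "2", "}"],
    ["\\frac", "{", "a", "}", "{", "b", "}"],
    ["a", "+", "b"],
    ["\\sqrt", "{", "x", "}"],
]
preds = [
    ["x", "^", "{", "2", "}"],
    ["\\frac", "{", "a", "}", "{", "6", "}"],  # 'b' misread as '6'
    ["a", "+", "b"],
    ["\\sqrt", "{", "x"],                      # missing closing brace
]
print(expression_accuracy(preds, refs))  # 50.0
```

Because a single wrong token makes the whole expression count as an error, this metric is much stricter than symbol-level accuracy.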

Author information

Authors and Affiliations

School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230022, China
Lan-Fang Dong, Han-Chao Liu & Xin-Ming Zhang

Corresponding author

Correspondence to Xin-Ming Zhang.

About this article

Cite this article

Dong, LF., Liu, HC. & Zhang, XM. Synthetic Data Generation and Shuffled Multi-Round Training Based Offline Handwritten Mathematical Expression Recognition. J. Comput. Sci. Technol. 37, 1427–1443 (2022). https://doi.org/10.1007/s11390-021-0722-4

Received: 19 June 2020
Accepted: 16 September 2021
Published: 30 November 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11390-021-0722-4
