Recognizing Handwritten Chinese Texts with Insertion and Swapping Using a Structural Attention Network

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12824)

Abstract

In handwritten documents, text lines can be distorted beyond a sequential structure by in-writing edits such as the insertion and swapping of text. Existing text line recognition methods assume regular character sequences and cannot handle this kind of irregularity. In this paper, we treat irregular text recognition as a two-dimensional (2D) problem and propose a structural attention network (SAN) for recognizing texts with insertion and swapping. In particular, we present a novel structural representation that helps SAN learn these irregular structures. Guided by this representation, SAN can correctly recognize texts with insertion and swapping. To validate the effectiveness of our method, we use the public SCUT-EPT dataset, which contains some samples of text with insertion and swapping. Because text images with insertion and swapping are scarce, we also generate a specialized dataset consisting only of these irregular texts. Experiments show that SAN is promising for recognizing inserted and swapped texts and achieves state-of-the-art performance on the SCUT-EPT dataset.

Acknowledgements

This work has been supported by the National Key Research and Development Program Grant 2020AAA0109702 and the National Natural Science Foundation of China (NSFC) grants 61733007 and 61721004.

Author information

Corresponding author

Correspondence to Cheng-Lin Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Yan, S., Wu, JW., Yin, F., Liu, CL. (2021). Recognizing Handwritten Chinese Texts with Insertion and Swapping Using a Structural Attention Network. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_37

  • DOI: https://doi.org/10.1007/978-3-030-86337-1_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86336-4

  • Online ISBN: 978-3-030-86337-1

  • eBook Packages: Computer Science, Computer Science (R0)
