Skip to main content

Open-Set Text Recognition: Case-Studies

  • Chapter
  • First Online:
Open-Set Text Recognition

Part of the book series: SpringerBriefs in Computer Science ((BRIEFSCOMPUTER))

  • 87 Accesses

Abstract

This chapter surveys two specific open-set text recognition methods, OSOCR and OpenCCD, which support the full feature set of the open-set text recognition tasks, specifically novel character spotting and novel character recognition. The two approaches serve as examples to make case studies of the framework, and may also serve as baseline solutions for the interested readers with the code and models publicly available. For each method, we introduce its motivation, overall design, implementation of individual modules, and training techniques including regularization terms and practical label sampling tricks. This chapter also includes a summarization of the performances of the two models, together with some additional results from the released code.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 49.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    Methods with a recurrent decoder [4, 5] need an extra change on the “hidden state” to remove the dependency on history predictions.

  2. 2.

    The Noto fonts for most languages are available at https://noto-Liuet.al.bsite-2.storage.googleapis.com/pkgs/Noto-unhinted.zip.

  3. 3.

    The end-of-speech label “[s]” and the unknown label “\([-]\)”.

References

  1. Liu, C., Yang, C., Qin, H., Zhu, X., Liu, C., Yin, X.: Towards open-set text recognition via label-to-prototype learning. Pattern Recognit. 134, 109109 (2023)

    Article  Google Scholar 

  2. Baek, J., Kim, G., Lee, J., Park, S., Han, D., Yun, S., Oh, S.J., Lee, H.: What is wrong with scene text recognition model comparisons? dataset and model analysis. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), Oct 27–Nov 2, 2019. IEEE (2019), pp. 4714–4722

    Google Scholar 

  3. Cao, Z., Lu, J., Cui, S., Zhang, C.: Zero-shot handwritten Chinese character recognition with hierarchical decomposition embedding. Pattern Recognit. 107, 107488 (2020)

    Article  Google Scholar 

  4. Li, H., Wang, P., Shen, C., Zhang, G.: Show, attend and read: a simple and strong baseline for irregular text recognition. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, Jan 27–Feb 1, 2019. AAAI Press (2019), pp. 8610–8617

    Google Scholar 

  5. Sheng, F., Chen, Z., Xu, B.: NRTR: a no-recurrence sequence-to-sequence model for scene text recognition. In: 2019 International Conference on Document Analysis and Recognition, ICDAR 2019, Sydney, Australia, Sept 20–25, 2019. IEEE (2019), pp. 781–786

    Google Scholar 

  6. Wang, T., Zhu, Y., Jin, L., Luo, C., Chen, X., Wu, Y., Wang, Q., Cai, M.: Decoupled attention network for text recognition. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, Feb 7–12, 2020. AAAI Press (2020), pp. 12 216–12 224

    Google Scholar 

  7. Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., Wei, Y.: Deformable convolutional networks. In: IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, Oct 22–29, 2017. IEEE Computer Society (2017), pp. 764–773

    Google Scholar 

  8. Luo, C., Jin, L., Sun, Z.: MORAN: a multi-object rectified attention network for scene text recognition. Pattern Recognit. 90, 109–118 (2019)

    Article  Google Scholar 

  9. Yang, M., Guan, Y., Liao, M., He, X., Bian, K., Bai, S., Yao, C., Bai, X.: Symmetry-constrained rectification network for scene text recognition. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), Oct 27–Nov 2, 2019. IEEE (2019), pp. 9146–9155

    Google Scholar 

  10. Qi, H., Brown, M., Lowe, D.G.: Low-shot learning with imprinted weights. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018. IEEE Computer Society (2018), pp. 5822–5830

    Google Scholar 

  11. Fei, G., Liu, B.: Breaking the closed world assumption in text classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics (2016), pp. 506–514

    Google Scholar 

  12. Zhang, H., Xu, H., Lin, T.: Deep open intent classification with adaptive decision boundary. In: Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, Feb 2–9, 2021. AAAI Press (2021), pp. 14 374–14 382

    Google Scholar 

  13. Shi, B., Yang, M., Wang, X., Lyu, P., Yao, C., Bai, X.: ASTER: an attentional scene text recognizer with flexible rectification. IEEE Trans. Pattern Anal. Mach. Intell. 41(9), 2035–2048 (2019)

    Article  Google Scholar 

  14. Liu, C., Yang, C., Yin, X.: Open-set text recognition via character-context decoupling. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18–24, 2022. IEEE (2022), pp. 4513–4522

    Google Scholar 

  15. Liao, M., Zhang, J., Wan, Z., Xie, F., Liang, J., Lyu, P., Yao, C., Bai, X.: Scene text recognition from two-dimensional perspective. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, Jan 27–Feb 1, 2019. AAAI Press (2019), pp. 8714–8721

    Google Scholar 

  16. Wan, Z., He, M., Chen, H., Bai, X., Yao, C.: Textscanner: reading characters in order for robust scene text recognition. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, Feb 7–12, 2020. AAAI Press (2020), pp. 12 120–12 127

    Google Scholar 

  17. Cheng, Z., Xu, Y., Bai, F., Niu, Y., Pu, S., Zhou, S.: AON: towards arbitrarily-oriented text recognition. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018. IEEE Computer Society (2018), pp. 5571–5579

    Google Scholar 

  18. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2017)

    Article  Google Scholar 

  19. Hou, J., Zhu, X., Liu, C., Sheng, K., Wu, L., Wang, H., Yin, X.: HAM: hidden anchor mechanism for scene text detection. IEEE Trans. Image Process. 29, 7904–7916 (2020)

    Article  Google Scholar 

  20. Wellman, M.P., Henrion, M.: Explaining ’explaining away. IEEE Trans. Pattern Anal. Mach. Intell. 15(3), 287–292 (1993)

    Article  Google Scholar 

  21. Fang, S., Xie, H., Wang, Y., Mao, Z., Zhang, Y.: Read like humans: autonomous, bidirectional and iterative language modeling for scene text recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19–25, 2021. Computer Vision Foundation/IEEE (2021), pp. 7098–7107

    Google Scholar 

  22. Yu, D., Li, X., Zhang, C., Liu, T., Han, J., Liu, J., Ding, E.: Towards accurate scene text recognition with semantic reasoning networks. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020. IEEE (2020), pp. 12 110–12 119

    Google Scholar 

  23. Chang, W., You, T., Seo, S., Kwak, S., Han, B.: Domain-specific batch normalization for unsupervised domain adaptation. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019. Computer Vision Foundation/IEEE (2019), pp. 7354–7362

    Google Scholar 

  24. Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, Conference Track Proceedings (2014)

    Google Scholar 

  25. Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, Dec 7–12, 2015, Montreal, Quebec, Canada (2015), pp. 91–99

    Google Scholar 

  26. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, 2017. IEEE Computer Society (2017), pp. 6517–6525

    Google Scholar 

  27. Zhang, X., Yang, H., Young, E.F.Y.: Attentional transfer is all you need: technology-aware layout pattern generation. In: 58th ACM, IEEE Design Automation Conference, DAC, San Francisco, CA, USA, Dec 5–9, 2021. IEEE 2021, pp. 169–174 (2021)

    Google Scholar 

  28. Liu, C., Yin, F., Wang, D., Wang, Q.: CASIA online and offline Chinese handwriting databases. In: 2011 International Conference on Document Analysis and Recognition, ICDAR 2011, Beijing, China, Sept 18–21, 2011. IEEE Computer Society (2011), pp. 37–41

    Google Scholar 

  29. Shi, B., Bai, X., Yao, C.: Script identification in the wild via discriminative convolutional neural network. Pattern Recognit. 52, 448–458 (2016)

    Article  Google Scholar 

  30. Jaderberg, M., Simonyan, K., Vedaldi, A., Zisserman, A.: Synthetic data and artificial neural networks for natural scene text recognition. In: NIPS Deep Learning Workshop. Neural Information Processing Systems (2014)

    Google Scholar 

  31. Gupta, A., Vedaldi, A., Zisserman, A.: Synthetic data for text localisation in natural images. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society (2016), pp. 2315–2324

    Google Scholar 

  32. Zhan, F., Lu, S.: ESIR: end-to-end scene text recognition via iterative image rectification. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019. Computer Vision Foundation/IEEE (2019), pp. 2059–2068

    Google Scholar 

  33. Litman, R., Anschel, O., Tsiper, S., Litman, R., Mazor, S., Manmatha, R.: SCATTER: selective context attentional scene text recognizer. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020. IEEE (2020), pp. 11 959–11 969

    Google Scholar 

  34. Qiao, Z., Zhou, Y., Yang, D., Zhou, Y., Wang, W.: SEED: semantics enhanced encoder-decoder framework for scene text recognition. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020. IEEE (2020), pp. 13 525–13 534

    Google Scholar 

  35. Borisyuk, F., Gordo, A., Sivakumar, V.: Rosetta: large scale system for text detection and recognition in images. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery Data Mining, KDD 2018, London, UK, Aug 19–23, 2018. ACM (2018), pp. 71–79

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xu-Cheng Yin .

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Yin, XC., Yang, C., Liu, C. (2024). Open-Set Text Recognition: Case-Studies. In: Open-Set Text Recognition. SpringerBriefs in Computer Science. Springer, Singapore. https://doi.org/10.1007/978-981-97-0361-6_7

Download citation

  • DOI: https://doi.org/10.1007/978-981-97-0361-6_7

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0360-9

  • Online ISBN: 978-981-97-0361-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics