
HelixNet: Dual Helix Cooperative Decoders for Scene Text Removal

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14431))


Abstract

Scene text removal aims to erase text from scene images and fill the resulting gaps with plausible, realistic content. The task contains two potential sub-tasks: text perception and text removal. However, most existing methods either ignore this decomposition or simply split the task into two consecutive stages, without considering how the two sub-tasks can promote each other. With suitable transformations, better segmentation results can better guide text removal, and vice versa. The two sub-tasks can therefore promote each other and co-evolve, forming an intertwined, spiraling process similar to the double-helix structure of deoxyribonucleic acid (DNA) molecules. In this paper, we propose HelixNet, a novel network that incorporates Dual Helix Cooperative Decoders for scene text removal. It is an end-to-end, one-stage model with one shared encoder and two interacting decoders, one for text segmentation and one for text removal. Through information interaction between the two branches, the model fuses complementary information from each sub-task, so that removal and segmentation inform each other. We evaluate the proposed method extensively on publicly available, commonly used real and synthetic datasets. The experimental results demonstrate the promoting effect of the specially designed decoders and show that HelixNet can achieve state-of-the-art performance.
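The data flow described in the abstract (one shared encoder, two decoders that exchange information) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the interaction modules, so the class name `DualDecoderSketch`, the per-pixel linear layers, the concatenation-based exchange, the layer sizes, and the random weights are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """Per-pixel linear map: x is (H, W, C_in), w is (C_in, C_out)."""
    return x @ w

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DualDecoderSketch:
    """Hypothetical toy model: one shared encoder, two decoders that swap features."""

    def __init__(self, in_ch=3, feat=8, stages=2):
        self.w_enc = rng.normal(0, 0.1, (in_ch, feat))
        # Each decoder stage sees its own features concatenated with the other branch's.
        self.w_seg = [rng.normal(0, 0.1, (2 * feat, feat)) for _ in range(stages)]
        self.w_rem = [rng.normal(0, 0.1, (2 * feat, feat)) for _ in range(stages)]
        self.w_seg_out = rng.normal(0, 0.1, (feat, 1))
        self.w_rem_out = rng.normal(0, 0.1, (feat, in_ch))

    def forward(self, image):
        shared = relu(conv1x1(image, self.w_enc))  # shared encoder features
        seg, rem = shared, shared                  # both branches start from them
        for w_s, w_r in zip(self.w_seg, self.w_rem):
            # Interaction step (assumed form): each branch is updated from both branches.
            seg_next = relu(conv1x1(np.concatenate([seg, rem], axis=-1), w_s))
            rem_next = relu(conv1x1(np.concatenate([rem, seg], axis=-1), w_r))
            seg, rem = seg_next, rem_next
        mask = sigmoid(conv1x1(seg, self.w_seg_out))    # per-pixel text probability
        erased = sigmoid(conv1x1(rem, self.w_rem_out))  # text-free image estimate
        return mask, erased

model = DualDecoderSketch()
image = rng.random((16, 16, 3))
mask, erased = model.forward(image)
print(mask.shape, erased.shape)
```

The point of the sketch is only the wiring: both branches read the same encoder output, and at every stage each branch consumes the other's current features, so neither sub-task is a fixed preprocessing step for the other.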



Acknowledgement

This work is supported by the Open Project Program of the National Laboratory of Pattern Recognition (NLPR) (No. 202200049).

Author information

Corresponding author

Correspondence to Anna Zhu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, K., Lyu, G., Zhu, A. (2024). HelixNet: Dual Helix Cooperative Decoders for Scene Text Removal. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14431. Springer, Singapore. https://doi.org/10.1007/978-981-99-8540-1_3

  • DOI: https://doi.org/10.1007/978-981-99-8540-1_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8539-5

  • Online ISBN: 978-981-99-8540-1

  • eBook Packages: Computer Science, Computer Science (R0)
