Automatic Reading Order Detection of Comic Panels

Zhang, Yunlong; Hotta, Seiji

doi:10.1007/978-3-031-37742-6_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13644)

Included in the following conference series:

International Conference on Pattern Recognition

Abstract

Tasks such as object detection in comic content are attracting growing public attention. Much existing work focuses on character detection, text recognition, and related tasks, but only a few studies address reading order detection of panels. In this paper, we review several existing sorting methods and propose a novel method that builds on them. Experimental results show that the proposed method outperforms the baseline methods. It handles pages with basic layouts easily and can sometimes handle pages with complex layouts.

Author information

Authors and Affiliations

Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology, 2-24-16, Nakacho, Koganei, Tokyo, 184-8588, Japan
Yunlong Zhang & Seiji Hotta

Corresponding author

Correspondence to Seiji Hotta.

Editor information

Editors and Affiliations

York University, Toronto, ON, Canada
Jean-Jacques Rousseau
Ontario Tech University, Oshawa, ON, Canada
Bill Kapralos

About this paper

Cite this paper

Zhang, Y., Hotta, S. (2023). Automatic Reading Order Detection of Comic Panels. In: Rousseau, JJ., Kapralos, B. (eds) Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges. ICPR 2022. Lecture Notes in Computer Science, vol 13644. Springer, Cham. https://doi.org/10.1007/978-3-031-37742-6_6

DOI: https://doi.org/10.1007/978-3-031-37742-6_6
Published: 02 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37741-9
Online ISBN: 978-3-031-37742-6
eBook Packages: Computer Science, Computer Science (R0)

Societies and partnerships

The International Association for Pattern Recognition