Knowledge-driven understanding of images in comic books

Rigaud, Christophe; Guérin, Clément; Karatzas, Dimosthenis; Burie, Jean-Christophe; Ogier, Jean-Marc

doi:10.1007/s10032-015-0243-1

Original Paper
Published: 09 April 2015

International Journal on Document Analysis and Recognition (IJDAR), volume 18, pages 199–221 (2015)

Christophe Rigaud¹,²,
Clément Guérin¹,
Dimosthenis Karatzas²,
Jean-Christophe Burie¹ &
Jean-Marc Ogier¹

Abstract

Document analysis is an active field of research that ultimately aims at a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to predefined domain knowledge. In this study, we propose a knowledge-driven system that exploits the interaction between bottom-up and top-down information to progressively understand the content of a document. We model the knowledge of both the comic book domain and the image processing domain in order to analyse information consistency. In addition, several image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.

Notes

In the rest of the paper, “character” is used in the sense of actor or protagonist of a comic’s story, not as a piece of text.
https://github.com/crigaud/publication/tree/master/2015/IJDAR/.
http://digitalcomicmuseum.com.
http://ebdtheque.univ-lr.fr/database/.
http://ebdtheque.univ-lr.fr/database/?overview=1.
http://ebdtheque.univ-lr.fr/references/.
https://code.google.com/p/tesseract-ocr/downloads/list.

Acknowledgments

The authors would like to thank Karell Bertet and Arnaud Revel for their help with the high-level processing. This work was supported by a European Doctorate scholarship of the University of La Rochelle, European Regional Development Fund, the region Poitou-Charentes (France), the General Council of Charente Maritime (France), the municipality of La Rochelle (France) and the Spanish research projects TIN2011-24631, RYC-2009-05031. We are grateful to all authors and publishers of comics and manga from the eBDtheque dataset for having allowed us to show (Figs. 1–24), use and share their works.

Conflict of interest

The authors declare that they have no conflict of interest. This article does not contain any studies with human or animal subjects.

Author information

Authors and Affiliations

Laboratoire L3i, Université de La Rochelle, 17042, La Rochelle Cedex 1, France
Christophe Rigaud, Clément Guérin, Jean-Christophe Burie & Jean-Marc Ogier
Computer Vision Center, Universitat Autònoma de Barcelona, 08193, Bellaterra (Barcelona), Spain
Christophe Rigaud & Dimosthenis Karatzas

Corresponding author

Correspondence to Christophe Rigaud.

About this article

Cite this article

Rigaud, C., Guérin, C., Karatzas, D. et al. Knowledge-driven understanding of images in comic books. IJDAR 18, 199–221 (2015). https://doi.org/10.1007/s10032-015-0243-1

Received: 08 July 2014
Revised: 01 March 2015
Accepted: 24 March 2015
Published: 09 April 2015
Issue Date: September 2015
DOI: https://doi.org/10.1007/s10032-015-0243-1
