Abstract
Document layout analysis (DLA) is a preprocessing step in document understanding systems. It is responsible for detecting and annotating the physical structure of documents. DLA has several important applications, such as document retrieval, content categorization, and text recognition. Its objective is to ease the subsequent analysis and recognition phases by identifying the homogeneous blocks of a document and determining their relationships. The DLA pipeline consists of several phases that can vary among DLA methods, depending on the document layouts and the final analysis objectives. In this regard, a universal DLA algorithm that fits all types of document layouts or satisfies all analysis objectives has not yet been developed. In this survey paper, we present a critical study of different document layout analysis techniques. The study highlights the motivations for pursuing DLA and comprehensively discusses the different phases of DLA algorithms based on a general framework formed as an outcome of reviewing the research in the field. The DLA framework consists of preprocessing, layout analysis strategies, post-processing, and performance evaluation phases. Overall, the article delivers an essential baseline for pursuing further research in document layout analysis.
- Mudit Agrawal and David Doermann. 2009. Voronoi++: A dynamic page segmentation approach based on Voronoi and Docstrum features. In The International Conference on Document Analysis and Recognition. IEEE, 1011--1015.Google ScholarDigital Library
- Mudit Agrawal and David Doermann. 2010. Context-aware and content-based dynamic Voronoi page segmentation. In The 8th IAPR International Workshop on Document Analysis Systems. ACM Press, New York, 73--80.Google ScholarDigital Library
- Prakash K. Aithal, G. Rajesh, Dinesh U. Acharya, and P. C. Siddalingaswamy. 2013. A fast and novel skew estimation approach using radon transform. International Journal of Computer Information Systems and Industrial Management Applications 5 (2013), 337--344.Google Scholar
- Alireza Alaei, Umapada Pal, and P. Nagabhushan. 2011. A new scheme for unconstrained handwritten text-line segmentation. Pattern Recognition 44, 4 (2011), 917--928.Google ScholarDigital Library
- Michele Alberti, Mathias Seuret, Vinaychandran Pondenkandath, Rolf Ingold, and Marcus Liwicki. 2017. Historical document image segmentation with LDA-initialized deep neural networks. In The 4th International Workshop on Historical Document Imaging and Processing. ACM, 95--100.Google ScholarDigital Library
- Adnan Amin and Sue Wu. 2005. A robust system for thresholding and skew detection in mixed text/graphics documents. International Journal of Image and Graphics 5, 2 (Apr. 2005), 247--265.Google ScholarCross Ref
- Khalid M. Amin, Mohamed Abd Elfattah, Aboul Ella Hassanien, and Gerald Schaefer. 2014. A binarization algorithm for historical Arabic manuscript images using a neutrosophic approach. In The 9th International Conference on Computer Engineering 8 Systems. IEEE, 266--270.Google ScholarCross Ref
- A. Antonacopoulos, C. Clausner, C. Papadopoulos, and S. Pletschacher. 2011. Historical document layout analysis competition. In International Conference on Document Analysis and Recognition. IEEE, 1516--1520.Google Scholar
- A. Antonacopoulos, B. Gatos, and D. Karatzas. 2003. ICDAR 2003 page segmentation competition. In The 7th International Conference on Document Analysis and Recognition. 688--692.Google Scholar
- A. Antonacopoulos and R. T. Ritchings. 1995. Representation and classification of complex-shaped printed regions using white tiles. In The 3rd International Conference on Document Analysis and Recognition, Vol. 2. IEEE Comput. Soc. Press, 1132--1135.Google Scholar
- A. Antonacopoulos, S. Pletschacher, D. Bridson, and C. Papadopoulos. 2009. ICDAR2009 page segmentation competition. In The 10th International Conference on Document Analysis and Recognition. 1370--1374.Google Scholar
- Apostolos Antonacopoulos and David Bridson. 2007. Performance analysis framework for layout analysis methods. In The 9th International Conference on Document Analysis and Recognition (ICDAR), Vol. 2. IEEE, 1258--1262.Google ScholarCross Ref
- Apostolos Antonacopoulos, David Bridson, Christos Papadopoulos, and Stefan Pletschacher. 2009. A realistic dataset for performance evaluation of document layout analysis. In The 10th International Conference on Document Analysis and Recognition. IEEE, 296--300. DOI:https://doi.org/10.1109/ICDAR.2009.271Google ScholarDigital Library
- Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2013. ICDAR2013 competition on historical newspaper layout analysis (HNLA’13). In The 12th International Conference on Document Analysis and Recognition. IEEE, 1454--1458.Google Scholar
- Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2015. ICDAR2015 competition on recognition of documents with complex layouts. In The 13th International Conference on Document Analysis and Recognition. IEEE, 1151--1155.Google Scholar
- Manivannan Arivazhagan, Harish Srinivasan, and Sargur Srihari. 2007. A statistical approach to line segmentation in handwritten documents. In Document Recognition and Retrieval XIV, Xiaofan Lin and Berrin A. Yanikoglu (Eds.). International Society for Optics and Photonics, 65000T.Google Scholar
- Nikolaos Arvanitopoulos and Sabine Susstrunk. 2014. Seam carving for text line extraction on color and grayscale historical manuscripts. In The 14th International Conference on Frontiers in Handwriting Recognition. IEEE, 726--731.Google ScholarCross Ref
- Abedelkadir Asi, Rafi Cohen, Klara Kedem, and Jihad El-Sana. 2015. Simplifying the reading of historical manuscripts. In The 13th International Conference on Document Analysis and Recognition. IEEE, 826--830. DOI:https://doi.org/10.1109/ICDAR.2015.7333877Google ScholarDigital Library
- Abedelkadir Asi, Rafi Cohen, Klara Kedem, Jihad El-Sana, and Itshak Dinstein. 2014. A coarse-to-fine approach for layout analysis of ancient manuscripts. In The 14th International Conference on Frontiers in Handwriting Recognition. 140--145.Google ScholarCross Ref
- Abedelkadir Asi, Raid Saabni, and Jihad El-Sana. 2011. Text line segmentation for gray scale historical document images. In The Workshop on Historical Document Imaging and Processing. ACM Press, New York, 120.Google ScholarDigital Library
- Bruno Tenório Ávila and Rafael Dueire Lins. 2005. A fast orientation and skew detection algorithm for monochromatic document images. In The ACM Symposium on Document Engineering. ACM Press, New York, 118.Google Scholar
- Micheal Baechler, Marcus Liwicki, and Rolf Ingold. 2013. Text line extraction using DMLP classifiers for historical manuscripts. In The 12th International Conference on Document Analysis and Recognition. IEEE, 1029--1033. DOI:https://doi.org/10.1109/ICDAR.2013.206Google ScholarDigital Library
- A. Bagdanov and J. Kanai. 1997. Projection profile based skew estimation algorithm for JBIG compressed images. In The 4th International Conference on Document Analysis and Recognition, Vol. 1. IEEE Comput. Soc., 401--405. DOI:https://doi.org/10.1109/ICDAR.1997.619878Google Scholar
- Itay Bar-Yosef, Nate Hagbi, Klara Kedem, and Itshak Dinstein. 2009. Line segmentation for degraded handwritten historical documents. In The 10th International Conference on Document Analysis and Recognition. IEEE, 1161--1165. http://ieeexplore.ieee.org/document/5277595/.Google ScholarDigital Library
- P. Barlas, S. Adam, C. Chatelain, and T. Paquet. 2014. A typed and handwritten text block segmentation system for heterogeneous and complex documents. In The 11th IAPR International Workshop on Document Analysis Systems. IEEE, 46--50.Google Scholar
- J. Bernsen. 1986. Dynamic thresholding of gray level images. In The International Conference on Pattern Recognition. 1251--1255.Google Scholar
- Fadi Biadsy, Jihad El-Sana, and Nizar Habash. 2006. Online Arabic handwriting recognition using hidden Markov models. In The 10th International Workshop on Frontiers in Handwriting Recognition. Suvisoft.Google Scholar
- Thomas M. Breuel. 2003. High performance document layout analysis. In Symposium on Document Image Understanding Technology 3 (2003), 209--218.Google Scholar
- D. Bridson and A. Antonacopoulos. 2008. A geometric approach for accurate and efficient performance evaluation of layout analysis methods. In The 19th International Conference on Pattern Recognition. IEEE, 1--4.Google Scholar
- Syed Saqib Bukhari, Mayce Ibrahim Ali Al Azawi, Faisal Shafait, and Thomas M. Breuel. 2010. Document image segmentation using discriminative learning over connected components. In The 8th IAPR International Workshop on Document Analysis Systems. ACM Press, New York, 183--190.Google Scholar
- Syed Saqib Bukhari, T. M. Breuel, Abedelkadir Asi, and Jihad El-Sana. 2012. Layout analysis for arabic historical document images using machine learning. In The International Conference on Frontiers in Handwriting Recognition. IEEE, 639--644.Google ScholarDigital Library
- Syed Saqib Bukhari, Faisal Shafait, and Thomas M. Breuel. 2009. Script-independent handwritten textlines segmentation using active contours. In The 10th International Conference on Document Analysis and Recognition. IEEE, 446--450. http://ieeexplore.ieee.org/document/5277636/.Google Scholar
- Syed Saqib Bukhari, Faisal Shafait, and Thomas M. Breuel. 2011. Improved document image segmentation algorithm using multiresolution morphology. In International Society for Optics and Photonics, Gady Agam and Christian Viard-Gaudin (Eds.). International Society for Optics and Photonics, 78740D.Google Scholar
- Marius Bulacu, Rutger Van Koert, Lambert Schomaker, and Tijn van der Zant. 2007. Layout analysis of handwritten historical documents for searching the archive of the cabinet of the Dutch Queen. In The 9th International Conference on Document Analysis and Recognition. IEEE, 351--361.Google Scholar
- Mark J. Burge and Gladys Monagan. 1995. Using the Voronoi tessellation for grouping words and multipart symbols in documents. In The SPIE International Symposium on Optics, Imaging and Instrumentation, Robert A. Melter, Angela Y. Wu, Fred L. Bookstein, and William D. K. Green (Eds.). International Society for Optics and Photonics, 116--124.Google Scholar
- C. Clausner A. Antonacopoulos C. Papadopoulos, S. Pletschacher. 2013. The IMPACT dataset of historical document images. In The 2nd International Workshop on Historical Document Imaging and Processing. 123--130.Google ScholarDigital Library
- Yang Cao, Shuhua Wang, and Heng Li. 2003. Skew detection and correction in document images based on straight-line fitting. Pattern Recognition Letters 24, 12 (2003), 1871--1879.Google ScholarDigital Library
- Samuele Capobianco, Leonardo Scommegna, and Simone Marinai. 2018. Historical handwritten document segmentation by using a weighted loss. In IAPR Workshop on Artificial Neural Networks in Pattern Recognition. Springer, 395--406.Google ScholarCross Ref
- R. Cattoni, T. Coianiz, S. Messelodi, and Cm Modena. 1998. Geometric layout analysis techniques for document image understanding: A review. ITC-First Technical Report (1998), 1--68.Google Scholar
- F. Cesarini, M. Gori, S. Marinai, and G. Soda. 1999. Structured document segmentation and representation by the modified X-Y tree. In The 5th International Conference on Document Analysis and Recognition. IEEE, 563--566.Google Scholar
- Nabendu Chaki, Soharab Hossain Shaikh, and Khalid Saeed. 2014. A comprehensive survey on image binarization techniques. In Exploring Image Binarization Techniques. Springer India, 5--15.Google Scholar
- Kai Chen, Cheng-Lin Liu, Mathias Seuret, Marcus Liwicki, Jean Hennebert, and Rolf Ingold. 2016. Page segmentation for historical document images based on superpixel classification with unsupervised feature learning. In The 12th IAPR Workshop on Document Analysis Systems (DAS). IEEE, 299--304.Google ScholarCross Ref
- Kai Chen, Mathias Seuret, Jean Hennebert, and Rolf Ingold. 2017. Convolutional neural networks for page segmentation of historical document images. In The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 965--970.Google ScholarCross Ref
- Kai Chen, Mathias Seuret, Marcus Liwicki, Jean Hennebert, and Rolf Ingold. 2015. Page segmentation of historical document images with convolutional autoencoders. In The 13th International Conference on Document Analysis and Recognition (ICDAR). IEEE, 1011--1015.Google ScholarDigital Library
- Kai Chen, Hao Wei, Jean Hennebert, Rolf Ingold, and Marcus Liwicki. 2014. Page segmentation for historical handwritten document images using color and texture features. In The 14th International Conference on Frontiers in Handwriting Recognition. 488--493.Google ScholarCross Ref
- Yiping Chen and Liansheng Wang. 2017. Broken and degraded document images binarization. Neurocomputing 237 (2017), 272--280.Google ScholarDigital Library
- Atul K. Chhabra and Ihsin T. Phillips. 1997. The second international graphics recognition contest-raster to vector conversion: A report. In International Workshop on Graphics Recognition. Springer, 390--410.Google Scholar
- Rafi Cohen, Abedelkadir Asi, Klara Kedem, Jihad El-Sana, and Itshak Dinstein. 2013. Robust text and drawing segmentation algorithm for historical documents. In The 2nd International Workshop on Historical Document Imaging and Processing. 110--117.Google ScholarDigital Library
- Rafi Cohen, Itshak Dinstein, Jihad El-Sana, and Klara Kedem. 2014. Using scale-space anisotropic smoothing for text line extraction in historical documents. In International Conference Image Analysis and Recognition. Springer International Publishing, 349--358.Google ScholarCross Ref
- Laboratoire National de metrologie et d’Essais (LNE). 2013. MAURDOR campaign. http://www.maurdor-campaign.org/index.php?id=83&L==1.Google Scholar
- Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition. 248--255.Google ScholarCross Ref
- Markus Diem, Florian Kleber, and Robert Sablatnig. 2011. Text classification and document layout analysis of paper fragments. In The International Conference on Document Analysis and Recognition. IEEE, 854--858. DOI:https://doi.org/10.1109/ICDAR.2011.175Google ScholarDigital Library
- Markus Diem, Florian Kleber, and Robert Sablatnig. 2012. Skew estimation of sparsely inscribed document fragments. In The 10th IAPR International Workshop on Document Analysis Systems. IEEE, 292--296.Google ScholarDigital Library
- David Doermann Elena Zotkina, Himanshu Suri. 2013. GEDI: Groundtruthing Environment for Document Images. https://lampsrv02.umiacs.umd.edu/projdb/project.php?id=53.Google Scholar
- Boris Epshtein. 2011. Determining document skew using inter-line spaces. In The International Conference on Document Analysis and Recognition. IEEE, 27--31. DOI:https://doi.org/10.1109/ICDAR.2011.15Google ScholarDigital Library
- Sébastien Eskenazi, Petra Gomez-Krämer, and Jean-Marc Ogier. 2015. The Delaunay document layout descriptor. In ACM Symposium on Document Engineering. ACM Press, New York, 167--175.Google ScholarDigital Library
- Sébastien Eskenazi, Petra Gomez-Krämer, and Jean-Marc Ogier. 2017. A comprehensive survey of mostly textual document segmentation algorithms since 2008. Pattern Recognition 64 (2017), 1--14.Google ScholarDigital Library
- Jonathan Fabrizio. 2014. A precise skew estimation algorithm for document images using KNN clustering and Fourier transform. In The International Conference on Image Processing. IEEE, 2585--2588.Google ScholarCross Ref
- Andreas Fischer, Micheal Baechler, Angelika Garz, Marcus Liwicki, and Rolf Ingold. 2014. A combined system for text line extraction and handwriting recognition in historical documents. In The 11th IAPR International Workshop on Document Analysis Systems. 71--75.Google ScholarDigital Library
- Andreas Fischer, Volkmar Frinken, Alicia Fornés, and Horst Bunke. 2011. Transcription alignment of latin manuscripts using hidden Markov models. In The Workshop on Historical Document Imaging and Processing. ACM, 29--36.Google ScholarDigital Library
- Gaofeng Meng, Chunhong Pan, Nanning Zheng, and Chen Sun. 2010. Skew estimation of document images using bagging. IEEE Transactions on Image Processing 19, 7 (jul 2010), 1837--1846.Google Scholar
- Angelika Garz, Markus Diem, and Robert Sablatnig. 2010. Detecting text areas and decorative elements in ancient manuscripts. In The 12th International Conference on Frontiers in Handwriting Recognition. IEEE, 176--181. DOI:https://doi.org/10.1109/ICFHR.2010.35Google ScholarDigital Library
- Angelika Garz, Andreas Fischer, Robert Sablatnig, and Horst Bunke. 2012. Binarization-free text line segmentation for historical documents based on interest point clustering. In The 10th IAPR International Workshop on Document Analysis Systems. IEEE, 95--99.Google ScholarDigital Library
- Angelika Garz and Robert Sablatnig. 2010. Multi-scale texture-based text recognition in ancient manuscripts. In The 16th International Conference on Virtual Systems and Multimedia. IEEE, 336--339. DOI:https://doi.org/10.1109/VSMM.2010.5665938Google ScholarCross Ref
- Angelika Garz, Robert Sablatnig, and Markus Diem. 2011. Layout analysis for historical manuscripts using SIFT features. In The International Conference on Document Analysis and Recognition. 508--512.Google ScholarDigital Library
- B. Gatos, N. Papamarkos, and C. Chamzas. 1997. Skew detection and text line position determination in digitized documents. Pattern Recognition 30, 9 (1997), 1505--1519.Google ScholarCross Ref
- B. Gatos, N. Stamatopoulos, and G. Louloudis. 2011. ICDAR2009 handwriting segmentation contest. International Journal on Document Analysis and Recognition (IJDAR) 14, 1 (2011), 25--33.Google ScholarDigital Library
- Basilios Gatos, Pratikakis Ioannis, and Stavros J. Perantonis. 2004. An adaptive binarization technique for low quality historical documents. In Document Analysis Systems VI. Springer, Springer Berlin, 102--113.Google Scholar
- Basilis Gatos, Nikolaos Stamatopoulos, and Georgios Louloudis. 2010. ICFHR2010 handwriting segmentation contest. In The 12th International Conference on Frontiers in Handwriting Recognition. IEEE, 737--742.Google Scholar
- Tobias Grüning, Gundram Leifert, Tobias Strauß, and Roger Labahn. 2018. A two-stage method for text line detection in historical documents. arXiv preprint arXiv:1802.03345 (2018).Google Scholar
- Karim Hadjar and Rolf Ingold. 2004. Physical layout analysis of complex structured arabic documents using artificial neural nets. In Lecture Notes in Computer Science. Springer Berlin, 170--178.Google Scholar
- Sheng He and Lambert Schomaker. 2019. DeepOtsu: Document enhancement and binarization using iterative deep learning. Pattern Recognition 91 (2019), 379--390.Google ScholarDigital Library
- S. C. Hinds, J. L. Fisher, and D. P. D’Amato. 1990. A document skew detection method using run-length encoding and the hough transform. In The 10th International Conference on Pattern Recognition, Vol. I. IEEE Comput. Soc. Press, 464--468.Google Scholar
- Jaekyu Ha, R. M. Haralick, and I. T. Phillips. 1995. Document page decomposition by the bounding-box project. In The 3rd International Conference on Document Analysis and Recognition. IEEE Comput. Soc. Press, 1119--1122.Google Scholar
- Anil K. Jain and Yu Zhong. 1996. Page segmentation using texture analysis. Pattern Recognition 29, 5 (May 1996), 743--770.Google ScholarDigital Library
- N. Journet, V. Eglin, J. Y. Ramel, and R. Mullot. 2005. Text/graphic labelling of ancient printed documents. In The 8th International Conference on Document Analysis and Recognition. IEEE, 1010--1014 Vol. 2. DOI:https://doi.org/10.1109/ICDAR.2005.235Google Scholar
- Nicholas Journet, Jean-Yves Ramel, Rémy Mullot, and Véronique Eglin. 2008. Document image characterization using a multiresolution analysis of the texture: Application to old documents. International Journal of Document Analysis and Recognition (IJDAR) 11, 1 (Jun 2008), 9--18.Google ScholarDigital Library
- Hao Wei Marcus Liwicki Rolf Ingold Kai Chen, Mathias Seuret. 2015. Document, image, and video analysis DLA tool. http://diuf.unifr.ch/main/hisdoc/divadia.Google Scholar
- Rangachar Kasturi, Lawrence O’Gorman, and Venu Govindaraju. 2002. Document image analysis: A primer. Sadhana 27, 1 (2002), 3--22.Google ScholarCross Ref
- N. Khorissi, A. Namane, A. Mellit, F. Abdati, Z. A. Bensalama, and A. Guessoum. 2007. Application of the wavelet and the Hough transform for detecting the skew angle in arabic printed documents. In The 9th International Symposium on Signal Processing and Its Applications. IEEE, 1--4. http://ieeexplore.ieee.org/document/4555586/.Google Scholar
- Koichi Kise, Akinori Sato, and Motoi Iwata. 1998. Segmentation of page images using the area Voronoi diagram. Computer Vision and Image Understanding 70, 3 (1998), 370--382.Google ScholarDigital Library
- Koichi Kise. 2014. Page segmentation techniques in document analysis. In Handbook of Document Image Processing and Recognition. Springer London, London, 135--175.Google Scholar
- K. Kise, A. Sato, and K. Matsumoto. 1997. Document image segmentation as selection of Voronoi edges. In The Workshop on Document Image Analysis. IEEE Comput. Soc, 32--39. DOI:https://doi.org/10.1109/DIA.1997.627089Google Scholar
- Florian Kleber, Robert Sablatnig, Melanie Gau, and Heinz Miklas. 2008. Ancient document analysis based on text line extraction. In The 19th International Conference on Pattern Recognition. IEEE, 1--4. DOI:https://doi.org/10.1109/ICPR.2008.4761530Google ScholarCross Ref
- M. Krishnamoorthy, G. Nagy, S. Seth, and M. Viswanathan. 1993. Syntactic segmentation and labeling of digitized pages from technical journals. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 7 (1993), 737--747.Google ScholarDigital Library
- Victor Lavrenko, Toni M. Rath, and Raghavan Manmatha. 2004. Holistic word recognition for handwritten historical documents. In The 1st International Workshop on Document Image Analysis for Libraries. IEEE, 278--287.Google ScholarCross Ref
- Daniel S. Le, George R. Thoma, and Harry Wechsler. 1994. Automated page orientation and skew angle detection for binary document images. Pattern Recognition 27, 10 (1994), 1325--1344.Google ScholarCross Ref
- Shutao Li, Qinghua Shen, and Jun Sun. 2007. Skew detection using wavelet decomposition and projection profile analysis. Pattern Recognition Letters 28, 5 (2007), 555--562.Google ScholarDigital Library
- L. Likforman-Sulem, A. Hanimyan, and C. Faure. 1995. A Hough based algorithm for extracting text lines in handwritten documents. In The 3rd International Conference on Document Analysis and Recognition. IEEE Comput. Soc. Press, 774--777.Google Scholar
- Laurence Likforman-Sulem, Abderrazak Zahour, and Bruno Taconet. 2007. Text line segmentation of historical documents: A survey. International Journal of Document Analysis and Recognition (IJDAR) 9, 2--4 (Sept. 2007), 123--138. http://link.springer.com/10.1007/s10032-006-0023-zGoogle ScholarCross Ref
- N. Liolios, N. Fakotakis, and G. Kokkinakis. 2001. Improved document skew detection based on text line connected-component clustering. In The International Conference on Image Processing (Cat. No.01CH37205), Vol. 1. IEEE, 1098--1101.Google Scholar
- G. Louloudis, B. Gatos, and C. Halatsis. 2007. Text line detection in unconstrained handwritten documents using a block-based Hough transform approach. In The 9th International Conference on Document Analysis and Recognition. IEEE, 599--603.Google Scholar
- G. Louloudis, B. Gatos, I. Pratikakis, and C. Halatsis. 2009. Text line and word segmentation of handwritten documents. Pattern Recognition 42, 12 (2009), 3169--3183.Google ScholarDigital Library
- Scott Lowther, Vinod Chandran, and Subramanian Sridharan. 2002. An accurate method for skew determination in document images. In Digital Image Computing Techniques and Applications, Vol. 1. 25--29.Google Scholar
- Yue Lu and Chew Lim Tan. 2003. A nearest-neighbor chain based approach to skew estimation in document images. Pattern Recognition Letters 24, 14 (2003), 2315--2323.Google ScholarDigital Library
- Yue Lu, Zhe Wang, and Chew Lim Tan. 2004. Word grouping in document images based on Voronoi tessellation. In International Workshop on Document Analysis Systems. Springer Berlin, 147--157.Google ScholarCross Ref
- Simon M. Lucas. 2005. ICDAR 2005 text locating competition results. In The 8th International Conference on Document Analysis and Recognition. IEEE, 80--84.Google Scholar
- Song Mao, Azriel Rosenfeld, and Tapas Kanungo. 2003. Document structure analysis algorithms: A literature survey. SPIE 5010, Document Recognition and Retrieval X 5010, 1 (2003), 197.Google Scholar
- Simone Marinai, Marco Gori, and Giovanni Soda. 2005. Artificial neural networks for document analysis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1 (2005), 23--35.Google ScholarDigital Library
- Gale L. Martin. 1993. Centered-object integrated segmentation and recognition of overlapping handprinted characters. Neural Computation 5, 3 (1993), 419--429.Google ScholarDigital Library
- Maroua Mehri, Petra Gomez-Krämer, Pierre Héroux, Alain Boucher, and Rémy Mullot. 2013. Texture feature evaluation for segmentation of historical document images. In The 2nd International Workshop on Historical Document Imaging and Processing. ACM Press, New York, 102.Google ScholarDigital Library
- Maroua Mehri, Pierre Héroux, Petra Gomez-Krämer, and Rémy Mullot. 2017. Texture feature benchmarking and evaluation for historical document image analysis. International Journal on Document Analysis and Recognition (IJDAR) 20, 1 (2017), 1--35.Google ScholarDigital Library
- Maroua Mehri, Nibal Nayef, Pierre Héroux, Petra Gomez-Krämer, and Rémy Mullot. 2015. Learning texture features for enhancement and segmentation of historical document images. In The 3rd International Workshop on Historical Document Imaging and Processing. ACM Press, New York, 47--54.Google ScholarDigital Library
- G. Nagy. 2000. Twenty years of document image analysis in PAMI. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 1 (2000), 38--62.Google ScholarDigital Library
- George Nagy and Sharad Seth. 1984. Hierarchical representation of optically scanned documents. In The International Conference on Pattern Recognition. IEEE, 347--349.Google Scholar
- Y. Nakano, Y. Shima, H. Fujisawa, J. Higashino, and M. Fujinawa. 1990. An algorithm for the skew normalization of document image. In The 10th International Conference on Pattern Recognition, Vol. 2. IEEE Comput. Soc. Press, 8--13.Google Scholar
- N. Nandini, K. Srikanta Murthy, and G. Hemantha Kumar. 2008. Estimation of skew angle in binary document images using hough transform. World Academy of Science, Engineering and Technology 18 (2008), 44--49.Google Scholar
- Wayne; Niblack. 1986. An Introduction to Digital Image Processing. Prentice-Hall, Englewood Cliffs NJ. 115--116 pages.Google Scholar
- Nikos Nikolaou, Michael Makridis, Basilis Gatos, Nikolaos Stamatopoulos, and Nikos Papamarkos. 2010. Segmentation of historical machine-printed documents using adaptive run length smoothing and skeleton segmentation paths. Image and Vision Computing 28, 4 (Apr. 2010), 590--604.Google ScholarDigital Library
- Konstantinos Ntirogiannis, Basilis Gatos, and Ioannis Pratikakis. 2014. ICFHR2014 competition on handwritten document image binarization (H-DIBCO 2014). In The 14th International Conference on Frontiers in Handwriting Recognition. IEEE, 809--813.Google Scholar
- L. O’Gorman. 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 11 (1993), 1162--1173.Google ScholarDigital Library
- Oleg Okun, Matti Pietikäinen, O. Okun, and M. Pietikäinen. 1999. A survey of texture-based methods for document layout analysis. In Workshop on Texture Analysis in Machine Vision. 137--148.Google ScholarCross Ref
- Sofia Ares Oliveira, Benoit Seguin, and Frederic Kaplan. 2018. dhSegment: A generic deep-learning approach for document segmentation. In The 16th International Conference on Frontiers in Handwriting Recognition. IEEE, 7--12.Google ScholarCross Ref
- Nobuyuki Otsu. 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics 9, 1 (1979), 62--66.Google ScholarCross Ref
- U. Pal and B. B. Chaudhuri. 1996. An improved document skew angle estimation technique. Pattern Recognition Letters 17, 8 (1996), 899--904.Google ScholarDigital Library
- G. S. Peake and T. N. Tan. 1997. A general algorithm for document skew angle estimation. In The International Conference on Image Processing. IEEE Comput. Soc., 230--233.Google Scholar
- Ihsin T. Phillips and Atul K. Chhabra. 1999. Empirical performance evaluation of graphics recognition systems. IEEE Transactions on Pattern Analysis and Machine Intelligence 21, 9 (1999), 849--870.Google ScholarDigital Library
- Ihsin T. Phillips, Jisheng Liang, Atul K. Chhabra, and Robert Haralick. 1997. A performance evaluation protocol for graphics recognition systems. In International Workshop on Graphics Recognition. Springer, 372--389.Google Scholar
- Stefan Pletschacher and Apostolos Antonacopoulos. 2010. The PAGE (page analysis and ground-truth elements) format framework. In The 20th International Conference on Pattern Recognition. IEEE, 257--260.Google ScholarDigital Library
- Wolfgang Postl. 1986. Detection of linear oblique structures and skew scan in digitized documents. In The 8th International Conference on Pattern Recognition. 687--689.Google Scholar
- Ioannis Pratikakis, Konstantinos Zagoris, George Barlas, and Basilis Gatos. 2016. ICFHR2016 handwritten document image binarization contest (H-DIBCO 2016). In The 15th International Conference on Frontiers in Handwriting Recognition (ICFHR). IEEE, 619--623.Google Scholar
- Ioannis Pratikakis, Konstantinos Zagoris, George Barlas, and Basilis Gatos. 2017. ICDAR2017 competition on document image binarization (DIBCO 2017). In The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Vol. 1. IEEE, 1395--1403.Google Scholar
- Lorenzo Quirós. 2018. Multi-task handwritten document layout analysis. arXiv preprint arXiv:1806.08852 (2018).Google Scholar
- Lorenzo Quirós, Llu´s Serrano, Vicente Bosch, Alejandro H. Toselli, Rosa Congost, Enric Saguer, and Enrique Vidal. 2018. HTR Dataset ICFHR 2018. https://zenodo.org/record/1322666#.XHOanOgzaUk.Google Scholar
- Irina Rabaev, Ofer Biller, Jihad El-Sana, Klara Kedem, and Itshak Dinstein. 2013. Text line detection in corrupted and damaged historical manuscripts. In The 12th International Conference on Document Analysis and Recognition. IEEE, 812--816.Google ScholarDigital Library
- J. Y. Ramel, S. Leriche, M. L. Demonet, and S. Busson. 2007. User-driven page layout analysis of historical printed books. International Journal of Document Analysis and Recognition (IJDAR) 9, 2--4 (Apr. 2007), 243--261.Google ScholarCross Ref
- Marte A. Ramírez-Ortegón, Lilia L. Ramírez-Ramírez, Ines Ben Messaoud, Volker Märgner, Erik Cuevas, and Raúl Rojas. 2014. A model for the gray-intensity distribution of historical handwritten documents and its application for binarization. International Journal on Document Analysis and Recognition 17, 2 (2014), 139--160.Google ScholarDigital Library
- Tony M. Rath and Rudrapatna Manmatha. 2007. Word spotting for historical documents. International Journal on Document Analysis and Recognition 9, 2 (2007), 139--152.Google ScholarDigital Library
- Ahsen Raza, Imran Siddiqi, Ali Abidi, and Fahim Arif. 2012. An unconstrained benchmark urdu handwritten sentence database with automatic line segmentation. In International Conference on Frontiers in Handwriting Recognition. IEEE, 491--496.Google ScholarDigital Library
- Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, 234--241.
- Raid Saabni, Abedelkadir Asi, and Jihad El-Sana. 2014. Text line extraction for historical document images. Pattern Recognition Letters 35, 1 (2014), 23--33.
- Raid Saabni and Jihad El-Sana. 2011. Language-independent text lines extraction using seam carving. In The International Conference on Document Analysis and Recognition. IEEE, 563--568.
- Rana S. M. Saad, Randa I. Elanwar, N. S. Abdel Kader, Samia Mashali, and Margrit Betke. 2016. BCE-Arabic-v1 dataset: Towards interpreting Arabic document images for people with visual impairments. In The 9th ACM International Conference on PErvasive Technologies Related to Assistive Environments - PETRA. ACM Press, New York, New York, USA, 1--8.
- T. Saitoh, M. Tachikawa, and T. Yamaai. 1993. Document image segmentation and text area ordering. In The 2nd International Conference on Document Analysis and Recognition. IEEE Comput. Soc. Press, 323--329.
- P. Saragiotis and N. Papamarkos. 2008. Local skew correction in documents. International Journal of Pattern Recognition and Artificial Intelligence 22, 4 (2008), 691--710.
- M. Sarfraz, S. A. Mahmoud, and Z. Rasheed. 2007. On skew estimation and correction of text. In Computer Graphics, Imaging and Visualisation. IEEE, 308--313.
- Eric Saund, Jing Lin, and Prateek Sarkar. 2009. PixLabeler: User interface for pixel-level labeling of elements in document images. In The 10th International Conference on Document Analysis and Recognition. IEEE, 646--650.
- J. Sauvola and M. Pietikäinen. 1995. Skew angle detection using texture direction analysis. In The 9th Scandinavian Conference on Image Analysis. 1099--1106.
- J. Sauvola and M. Pietikäinen. 2000. Adaptive document image binarization. Pattern Recognition 33, 2 (2000), 225--236.
- Seong-Whan Lee and Dae-Seok Ryu. 2001. Parameter-free geometric document layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 11 (2001), 1240--1256.
- Mathias Seuret, Michele Alberti, Marcus Liwicki, and Rolf Ingold. 2017. PCA-initialized deep neural networks applied to document image analysis. In The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 877--882.
- F. Shafait and T. M. Breuel. 2011. The effect of border noise on the performance of projection-based page segmentation methods. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 4 (2011), 846--851.
- F. Shafait, D. Keysers, and T. M. Breuel. 2006. Pixel-accurate representation and evaluation of page segmentation in document images. In The 18th International Conference on Pattern Recognition. IEEE, 872--875. DOI:https://doi.org/10.1109/ICPR.2006.934
- Faisal Shafait, Joost van Beusekom, Daniel Keysers, and Thomas M. Breuel. 2008. Background variability modeling for statistical layout analysis. In The 19th International Conference on Pattern Recognition. IEEE, 1--4.
- Mahnaz Shafii and Maher Sid-Ahmed. 2015. Skew detection and correction based on an axes-parallel bounding box. International Journal on Document Analysis and Recognition (IJDAR) 18, 1 (2015), 59--71.
- Asif Shahab. 2013. UW3 and UNLV Datasets. http://www.iapr-tc11.org/mediawiki/index.php/Table_Ground_Truth_for_the_UW3_and_UNLV_datasets.
- Zhixin Shi and Venu Govindaraju. 2004. Line separation for complex document images using fuzzy runlength. In The 1st International Workshop on Document Image Analysis for Libraries. 306--312.
- Zhixin Shi, Srirangaraj Setlur, and Venu Govindaraju. 2009. A steerable directional local profile technique for extraction of handwritten Arabic text lines. In The 10th International Conference on Document Analysis and Recognition. IEEE, 176--180.
- Frank Y. Shih and Shy-Shyan Chen. 1996. Adaptive document block segmentation and classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 26, 5 (1996), 797--802.
- P. Shivakumara, G. Hemantha Kumar, D. S. Guru, and P. Nagabhushan. 2005. A novel technique for estimation of skew in binary text document images based on linear regression analysis. Sadhana 30, 1 (2005), 69--85.
- Fotini Simistira, Manuel Bouillon, Mathias Seuret, Marcel Wursch, Michele Alberti, Rolf Ingold, and Marcus Liwicki. 2017. ICDAR2017 competition on layout analysis for challenging medieval manuscripts. In The 14th IAPR International Conference on Document Analysis and Recognition. IEEE, 1361--1370.
- A. Simon, J.-C. Pret, and A. P. Johnson. 1997. A fast algorithm for bottom-up document layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 3 (1997), 273--277.
- Brij Mohan Singh, Rahul Sharma, Debashis Ghosh, and Ankush Mittal. 2014. Adaptive binarization of severely degraded and non-uniformly illuminated documents. International Journal on Document Analysis and Recognition (IJDAR) 17, 4 (2014), 393--412.
- Chandan Singh, Nitin Bhatia, and Amandeep Kaur. 2008. Hough transform based fast skew detection and accurate skew correction methods. Pattern Recognition 41, 12 (2008), 3528--3546.
- Bolan Su, Shijian Lu, and Chew Lim Tan. 2010. Binarization of historical document images using the local maximum and minimum. In The 8th IAPR International Workshop on Document Analysis Systems. ACM Press, New York, 159--166.
- Wassim Swaileh, Kamel Ait Mohand, and Thierry Paquet. 2015. Multi-script iterative steerable directional filtering for handwritten text line extraction. In The 13th International Conference on Document Analysis and Recognition. IEEE, 1241--1245.
- D. Sylwester and S. Seth. 1995. A trainable, single-pass algorithm for column segmentation. In The 3rd International Conference on Document Analysis and Recognition, Vol. 2. IEEE Comput. Soc. Press, 615--618.
- Thomas Breuel and Faisal Shafait. 2010. AutoMLP: Simple, effective, fully automated learning rate and size adjustment. In The Learning Workshop, Utah.
- Tuan Anh Tran, In-Seop Na, and Soo-Hyung Kim. 2015. Hybrid page segmentation using multilevel homogeneity structure. In The 9th International Conference on Ubiquitous Information Management and Communication. ACM Press, New York, 1--6.
- Tuan Anh Tran, In Seop Na, and Soo Hyung Kim. 2016. Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology. International Journal on Document Analysis and Recognition (IJDAR) 19, 3 (Sep. 2016), 191--209.
- Nikos Vasilopoulos and Ergina Kavallieratou. 2017. Complex layout analysis based on contour classification and morphological operations. Engineering Applications of Artificial Intelligence 65 (2017), 220--229.
- Friedrich M. Wahl, Kwan Y. Wong, and Richard G. Casey. 1982. Block segmentation and text extraction in mixed text/image documents. Computer Graphics and Image Processing 20, 4 (Dec. 1982), 375--390.
- Hao Wei, Micheal Baechler, Fouad Slimane, and Rolf Ingold. 2013. Evaluation of SVM, MLP and GMM classifiers for layout analysis of historical documents. In The 12th International Conference on Document Analysis and Recognition. IEEE, 1220--1224.
- Hao Wei, Kai Chen, Rolf Ingold, and Marcus Liwicki. 2014. Hybrid feature selection for historical document layout analysis. In The 14th International Conference on Frontiers in Handwriting Recognition. 87--92.
- Hao Wei, Kai Chen, Anguelos Nicolaou, Marcus Liwicki, and Rolf Ingold. 2014. Investigation of feature selection for historical document layout analysis. In The 4th International Conference on Image Processing Theory, Tools and Applications. 1--6.
- Florian Westphal, Niklas Lavesson, and Håkan Grahn. 2018. Document image binarization using recurrent neural networks. In The 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 263--268.
- Christoph Wick and Frank Puppe. 2018. Fully convolutional neural networks for page segmentation of historical document images. In The 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 287--292.
- Chung-Chih Wu, Chien-Hsing Chou, and Fu Chang. 2008. A machine-learning approach for analyzing document layout structures with two reading orders. Pattern Recognition 41, 10 (2008), 3200--3213.
- Yi Xiao and Hong Yan. 2003. Text region extraction in a document image based on the Delaunay tessellation. Pattern Recognition 36, 3 (2003), 799--809.
- Yi Xiao and Hong Yan. 2004. Location of title and author regions in document images based on the Delaunay triangulation. Image and Vision Computing 22, 4 (2004), 319--329.
- H. Yan. 1993. Skew correction of document images using interline cross-correlation. Graphical Models and Image Processing 55, 6 (1993), 538--543.
- Younki Min, Sung-Bae Cho, and Yillbyung Lee. 1996. A data reduction method for efficient document skew estimation based on Hough transformation. In The 13th International Conference on Pattern Recognition, Vol. 3. IEEE, 732--736.
- Bin Yu and Anil K. Jain. 1996. A robust and fast skew detection algorithm for generic documents. Pattern Recognition 29, 10 (Oct. 1996), 1599--1629.
- Yue Lu and C. L. Tan. 2005. Constructing area Voronoi diagram in document images. In The 8th International Conference on Document Analysis and Recognition, Vol. 1. IEEE, 342--346.
- A. Zahour, B. Taconet, P. Mercy, and S. Ramdane. 2001. Arabic hand-written text-line extraction. In The 6th International Conference on Document Analysis and Recognition. IEEE Comput. Soc., 281--285.
- Yefeng Zheng and David Doermann. 2010. LAMP Dataset of Layer Separation. https://lampsrv02.umiacs.umd.edu/projdb/project.php?id=61.