Abstract
Writer identification is an active research problem due to its applications in forensic and historical document analysis. It is challenging to identify a writer from the shapes of handwritten characters produced in a practiced writing style. Variations in writing shapes, styles, orientations, and character sizes, together with complex structures, inconsistency, and the cursive nature of the text, make the task harder. To solve this problem, we need to explore a structural representation and the spatial information of the handwritten characters. For this, a novel graph-based approach is proposed to spatially map the handwritten text, adapt to its structure and size, and explore the relationships that exist within it. First, image processing steps are executed: binarization, baseline correction, separation of the writing region, and thinning of the strokes to a width of a single pixel. This work then presents a novel algorithm for detecting key points (KPs) in a handwritten skeleton image and extracting their two-dimensional pixel coordinates. The handwriting samples are transformed into a graph-based representation, with KPs as the nodes and the line segments connecting adjacent KPs as the edges. Features are extracted from the graph-based representations of the handwritten text. For classification, ensemble learning approaches are employed. Four benchmark datasets and one custom-collected dataset are used for the experiments. The proposed solution achieves identification accuracies of 98.26%, 98.84%, 99.67%, 98.51%, and 97.73% on the CERUG-EN, CVL, Firemaker, IAM, and custom datasets, respectively.
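The skeleton-to-graph step described above can be illustrated with a minimal sketch. It is not the authors' key point algorithm, which the abstract does not specify. It assumes a cleanly thinned 0/1 skeleton image and swaps in the crossing-number heuristic commonly applied to thinned images: a pixel counts as a KP when it is a stroke endpoint or a junction. All function names and the chosen features are hypothetical, and classification is left out.

```python
from math import hypot

# 8-neighbour offsets in clockwise order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def skeleton_pixels(image):
    """Set of (row, col) foreground pixels of a 0/1 skeleton image."""
    return {(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v}


def neighbours(p, pixels):
    """Foreground 8-neighbours of pixel p, in RING order."""
    cells = [(p[0] + dr, p[1] + dc) for dr, dc in RING]
    return [q for q in cells if q in pixels]


def crossing_number(p, pixels):
    """Background-to-foreground transitions around p's 8-neighbour ring."""
    ring = [(p[0] + dr, p[1] + dc) in pixels for dr, dc in RING]
    return sum(1 for a, b in zip(ring, ring[1:] + ring[:1]) if not a and b)


def detect_key_points(pixels):
    """Assumed KP rule: endpoints (crossing number 1) and junctions (>= 3)."""
    counts = {p: crossing_number(p, pixels) for p in pixels}
    return {p for p, n in counts.items() if n == 1 or n >= 3}


def adjacent(a, b):
    """True when a and b are distinct 8-connected pixels."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) == 1


def build_graph(pixels, key_points):
    """Link KPs that are joined by an unbroken run of ordinary skeleton pixels."""
    edges = set()
    for start in key_points:
        for first in neighbours(start, pixels):
            path, prev, cur = {start, first}, start, first
            while cur not in key_points:
                options = [q for q in neighbours(cur, pixels) if q not in path]
                hits = [q for q in options if q in key_points]
                # Prefer a KP; otherwise avoid pixels still touching the previous one.
                options = hits or [q for q in options if not adjacent(q, prev)] or options
                if not options:
                    break  # dead end without reaching a KP
                prev, cur = cur, options[0]
                path.add(cur)
            if cur in key_points and cur != start:
                edges.add(frozenset((start, cur)))
    return edges


def graph_features(key_points, edges):
    """Illustrative features only; the paper's actual feature set is not given here."""
    lengths = [hypot(a[0] - b[0], a[1] - b[1]) for a, b in map(tuple, edges)]
    degree = {k: 0 for k in key_points}
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return {
        "nodes": len(key_points),
        "edges": len(edges),
        "mean_edge_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "max_degree": max(degree.values(), default=0),
    }
```

On a 7×7 plus-shaped skeleton, this sketch finds five KPs (the centre and four stroke ends) and four edges. Closed loops without endpoints or junctions produce no nodes under this rule, which is one reason a purpose-built KP detector would be needed for real handwriting.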
Acknowledgements
The authors are indebted to the editor and anonymous reviewers for their helpful comments and suggestions. The authors would like to thank GIK Institute for providing research facilities.
About this article
Cite this article
Rahman, A.U., Halim, Z. A graph-based solution for writer identification from handwritten text. Knowl Inf Syst 64, 1501–1523 (2022). https://doi.org/10.1007/s10115-022-01676-7
DOI: https://doi.org/10.1007/s10115-022-01676-7