Efficient Analysis of Vertical Projection Histogram to Segment Arabic Handwritten Characters

Abstract: This paper addresses the segmentation of words into characters, an essential task in the development of character recognition systems, since poorly segmented characters cannot be recognized correctly. The segmentation of offline handwritten Arabic text is particularly challenging because of its cursive nature and the variety of writing styles. In this article, we propose a new approach to segmenting handwritten Arabic characters based on an efficient analysis of the vertical projection histogram. Our approach was tested on a set of handwritten Arabic words from the IFN/ENIT database, and promising results were obtained.


Introduction
Writing recognition, a vast field of pattern recognition, is still a subject of intense research and experimentation. The problem is not yet fully solved, although high recognition rates can be obtained in applications where the vocabulary is limited or where only one or a few fonts are used. In addition, handwriting recognition is more complex than printed text recognition because of the extreme variability of handwriting: variability of shapes, of the spacing between words and characters, and of the text lines. Latin script has received the greatest attention from researchers [Abhijit and Deeksha (2015); Tanzila, Amjad and Mohamed (2014)]. However, despite the number of people who speak Arabic, little research has been done on this language [Yasser (2013); Naz, Umar, Shirazi et al. (2016)], mainly because of the difficulty of segmenting words into letters. Segmentation is an important step in the recognition process; it is simple in the case of printed Latin text, but very difficult in the case of cursive writing such as Arabic. The complex morphology of Arabic writing and its cursive nature make it more difficult to segment words into characters. Several studies recognize the entire word without segmentation (global approach) [Lawgali (2015)], and others assume that the characters are already segmented in order to avoid the segmentation step [Lorigo and Govindaraju (2006); Khorsheed (2002)]. This step remains a challenge for researchers and needs to be improved [Lawgali, Bouridane, Angelova et al. (2011)]. The development of a new segmentation algorithm is therefore one of our objectives, in order to make Arabic handwriting recognition more effective. The proposed algorithm performs a thorough analysis of the vertical projection histogram to extract the correct segmentation points.
This paper is organized as follows. The characteristics of handwritten Arabic characters are described in Section 2. In Section 3, we present some recent work on this subject. Section 4 describes the proposed approach in detail, while Section 5 discusses the results and their analysis. Section 6 summarizes the results of this work and draws conclusions.

Characteristics of Arabic script
Arabic is written in a consonantal script with a 28-letter alphabet. The shape of each letter depends on its position in the word: the same character can have up to four different shapes (isolated, initial, medial and final), which increases the number of patterns, as illustrated in Tab. 1. Fifteen of the 28 letters have one or more dots; these dots can be above or below the body of the character, but never both. Arabic is written from right to left in a cursive manner, in both printed and handwritten text; the characters of the same string are connected horizontally and sometimes vertically, as shown in Fig. 1.

Figure 1: Vertical ligature
Some characters cannot be connected to the character on their left, so they occur only in the isolated or final position. Words that contain such characters are composed of one or more parts, generally called PAWs (Pieces of Arabic Word) or sub-words, as shown in Fig. 2.

Figure 2: Example of words composed of 1, 2, 3 and 4 PAWs
Vertical overlaps can occur when descenders that extend horizontally below the baseline intersect the next sub-word, as shown in Fig. 3.

Figure 3: Example of overlap
These characteristics make the segmentation of Arabic words into characters, and the recognition of those characters, more difficult.

Related works
Several methods and algorithms have been developed to segment handwritten text into characters. In Gouda et al. [Gouda and Rashwan (2004)], the vertical projection and the baseline are used to segment a word into characters. Amin [Amin (1991)] selects the weak points of each sub-word from the analysis of the vertical histogram; the zero derivatives of the contour curvature are then used to detect the convex dominant points. In Syiam et al. [Syiam, Nazmy, Fahmy et al. (2006)], the k-means classification method was applied to the vertical histogram for word segmentation; according to the authors, this makes the histogram more effective for handwriting recognition. The idea of the research presented in Yusra [Yusra (2013)] is to trace the contours of the sub-words; the segmentation points are then the points where the contour passes from a horizontal line to a vertical or curved line. Lawgali et al. [Lawgali, Bouridane, Angelova et al. (2011)] first compute the horizontal projection of the sub-word to determine the baseline. The vertical projection of the sub-word is then analyzed to examine its distance from the baseline, and segmentation points that are far from the baseline are ignored. Zaidi et al. [Zaidi, Khansa, Noorzaily et al. (2009)] proposed a character segmentation algorithm based on the normalization of the histogram gradient sign and the sliding window technique for the segmentation of handwritten Jawi characters. The author of [Mohamed (2016)] developed a segmentation algorithm that uses several cues, such as the spaces between words and sub-words, pen thickness, character width and text height.
These works are interesting, but they generally report a segmentation error greater than 15%. Once the error of the classification phase is added, such systems become inadequate for recognizing handwritten Arabic text. It is therefore worthwhile to explore other techniques, or to combine several methods, to further improve the segmentation process.

Proposed approach
To segment a word into its characters, our approach proceeds in several steps. First, we generate the vertical projection histogram. Then, the word is divided into sub-words, and the segmentation points are extracted. Subsequently, several operations are performed on the segmentation points to improve their position or to delete some of them. In the following, we present a detailed description of each step of the segmentation process.

Vertical projection histogram
The vertical projection represents the number of black pixels in each column of the image. It is defined by the vector V of size N, whose j-th component is

$$ V_j = \sum_{i} P(i, j) \qquad (1) $$

where P(i, j) is a pixel of the binary image of the script and is either 0 or 1, i and j are the row and column indexes, and the sum runs over all rows i.
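As an illustration, the vertical projection can be computed in a few lines. The sketch below is not the authors' implementation; it assumes that the binary image is stored as a list of rows in which 1 denotes an ink pixel, and the function name `vertical_projection` and the sample image are hypothetical.

```python
def vertical_projection(image):
    """Return V, where V[j] is the number of ink pixels in column j.

    `image` is assumed to be a list of rows; each pixel is 1 (ink) or 0.
    """
    if not image:
        return []
    n_cols = len(image[0])
    # Sum over all rows i for each column j.
    return [sum(row[j] for row in image) for j in range(n_cols)]


# Hypothetical 4x5 binary image (1 = ink).
image = [
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
print(vertical_projection(image))  # [1, 3, 0, 1, 4]
```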

Divide the word into its parts (sub-words)
Each word can be composed of one or more parts. We use the vertical projection vector V to determine the different parts of a word. The key idea is that a zero value, or a run of consecutive zero values, in the projection vector (Vj=0) indicates that the word can be divided into two sub-words at this position, as shown in Fig. 4.
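The division into sub-words can be sketched as a search for the non-zero runs of the projection vector. This is a minimal sketch under the assumption that the projection vector is given as a list of integers; the function name `split_subwords` and the sample vector are hypothetical.

```python
def split_subwords(v):
    """Return (start, end) column ranges, end exclusive, of the non-zero runs in v."""
    parts = []
    start = None
    for j, value in enumerate(v):
        if value > 0 and start is None:
            start = j                  # a sub-word begins
        elif value == 0 and start is not None:
            parts.append((start, j))   # a zero column closes the sub-word
            start = None
    if start is not None:
        parts.append((start, len(v)))  # sub-word touching the image border
    return parts


# Hypothetical projection vector containing three sub-words.
v = [0, 2, 3, 1, 0, 0, 4, 2, 0, 1]
print(split_subwords(v))  # [(1, 4), (6, 8), (9, 10)]
```

Since Arabic is written from right to left, the sub-words would be read in reverse order of these column ranges.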

Segmentation points extraction algorithm
We propose a new algorithm that uses the vertical projection vector (Eq. (1)) to extract the segmentation points. The algorithm has three parameters: Block size (Bs), Step size (Sp) and Threshold (T).

Here, following Eq. (1), V_j = \sum_{i} P(i, j) is the number of black pixels in column j.
The value of Sp must be less than or equal to Bs; if the two parameters are equal, consecutive blocks do not overlap, as shown in Fig. 5. The algorithm works as follows: a block of Bs consecutive values of the projection vector is moved along the word in steps of Sp, and the sum of the values in each block is compared with the sum of the previous block. Whenever the sum increases, no segmentation point is generated. If the sum decreases significantly with respect to the previous block (by more than the threshold T), a segmentation point is generated. Fig. 6 presents an example of the application of our algorithm; the red vertical lines mark the positions of the segmentation points.
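The block comparison described above can be sketched as follows. Several details are not specified in the text and are assumptions of this sketch: the blocks are scanned in order of increasing column index, incomplete blocks at the end of the vector are skipped, and each segmentation point is placed at the centre of the block in which the decrease is observed. The function name `extract_segmentation_points` is hypothetical.

```python
def extract_segmentation_points(v, bs, sp, t):
    """Return candidate segmentation columns from the projection vector v.

    bs: block size, sp: step size (0 < sp <= bs), t: threshold on the
    decrease of the block sum between consecutive blocks.
    """
    if not 0 < sp <= bs:
        raise ValueError("step size must satisfy 0 < sp <= bs")
    points = []
    prev_sum = None
    for start in range(0, len(v) - bs + 1, sp):
        block_sum = sum(v[start:start + bs])
        # A decrease larger than t yields a point; an increase never does.
        if prev_sum is not None and prev_sum - block_sum > t:
            points.append(start + bs // 2)  # assumption: centre of the block
        prev_sum = block_sum
    return points


# Hypothetical projection vector: a stroke, a thin ligature, another stroke.
v = [5, 5, 5, 5, 1, 1, 1, 1, 6, 6, 6, 6]
print(extract_segmentation_points(v, bs=4, sp=4, t=10))  # [6]
print(extract_segmentation_points(v, bs=4, sp=2, t=5))   # [4, 6]
```

The second call shows the effect of overlapping blocks (Sp < Bs) combined with a lower threshold: the same decrease is detected twice, which is one way over-segmentation can arise.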

Improvement of segmentation points position
This operation performs a local search around each segmentation point to find a better position for it. The final segmentation point is the point with the smallest value in the vertical projection vector in the neighbourhood of the initial point; if several points share this value, the one nearest to the initial point is chosen, as shown in Fig. 7.
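A possible form of this local search is sketched below. The size of the neighbourhood is not given in the text, so the `radius` parameter is an assumption, as is the function name `refine_point`; when two minima are equally near, this sketch keeps the one with the smaller column index.

```python
def refine_point(v, point, radius):
    """Move `point` to the smallest value of v within `radius` columns.

    Ties on the value are broken by distance to the initial point.
    """
    lo = max(0, point - radius)
    hi = min(len(v), point + radius + 1)
    return min(range(lo, hi), key=lambda j: (v[j], abs(j - point)))


# Hypothetical projection vector; the initial point is column 3.
v = [4, 3, 1, 5, 2, 1, 6]
print(refine_point(v, point=3, radius=2))  # 2
```

In the example, columns 2 and 5 both hold the minimum value 1 inside the window, and column 2 is selected because it is closer to the initial point.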

Baseline Detection
We use the horizontal projection to determine the baseline. Segmentation points that are far from the baseline are then deleted, as shown in Fig. 8.
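The text does not state how the baseline is read from the horizontal projection or what counts as "far". The sketch below therefore rests on two assumptions: the baseline is taken as the row with the largest horizontal projection, and a point is kept when the ink in its column lies within `max_dist` rows of the baseline (columns without ink are kept). The names `detect_baseline`, `filter_by_baseline` and `max_dist` are hypothetical.

```python
def detect_baseline(image):
    """Return the index of the row with the most ink pixels (assumed baseline)."""
    h = [sum(row) for row in image]  # horizontal projection
    return max(range(len(h)), key=lambda i: h[i])


def filter_by_baseline(image, points, baseline, max_dist):
    """Keep the points whose column has ink within max_dist rows of the baseline."""
    kept = []
    for j in points:
        ink_rows = [i for i, row in enumerate(image) if row[j] == 1]
        if not ink_rows or min(abs(i - baseline) for i in ink_rows) <= max_dist:
            kept.append(j)
    return kept


# Hypothetical image: the baseline is row 3; column 3 only has ink at row 0.
image = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
baseline = detect_baseline(image)
print(baseline)                                           # 3
print(filter_by_baseline(image, [1, 3], baseline, 1))     # [1]
```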

Number of transitions
For the column of each segmentation point, we scan the pixels and count the number of times the pixel value changes from 0 to 1 or from 1 to 0. This count is the number of transitions. A segmentation point is ignored if its number of transitions is greater than two, as shown in Fig. 9.
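The transition test can be illustrated with the following sketch, which assumes the same list-of-rows image representation with pixel values 0 and 1. The function names `count_transitions` and `filter_by_transitions` are hypothetical.

```python
def count_transitions(image, j):
    """Count the 0->1 and 1->0 changes when scanning column j from top to bottom."""
    column = [row[j] for row in image]
    return sum(1 for a, b in zip(column, column[1:]) if a != b)


def filter_by_transitions(image, points, max_transitions=2):
    """Drop the points whose column has more than max_transitions transitions."""
    return [j for j in points if count_transitions(image, j) <= max_transitions]


# Hypothetical image: column 0 crosses one stroke, column 1 crosses two.
image = [
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print([count_transitions(image, j) for j in range(3)])  # [2, 4, 0]
print(filter_by_transitions(image, [0, 1, 2]))          # [0, 2]
```

A column that crosses a single stroke produces exactly two transitions, so the limit of two keeps points on simple ligatures and rejects columns that cut through several strokes.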

Experimental results
To evaluate our approach, we used the IFN/ENIT database, which was developed in 2002 by the IFN (Institute of Communication Technologies, Germany) and the ENIT (National Engineering School of Tunisia). It is a database of Tunisian city names collected from 411 writers, each of whom wrote 60 names with their corresponding postal code. The database contains 26459 city names from a lexicon of 946 cities, 115585 pseudo-words and 212211 characters. A full annotation of the city name images is made automatically, preceded by a manual check.
The proposed algorithm was tested on a set of words from the IFN/ENIT database. The words were carefully chosen to cover all forms of Arabic characters. The results show that 89.5% of the segmentation points were extracted correctly, which compares favourably with the state of the art. Our segmentation algorithm has three parameters whose values affect the results; below, we present a set of experiments carried out to select appropriate values for these parameters.
In the first experiment, we set the threshold T to 50 and varied the other two parameters; the results are presented in Fig. 10. The best result, 89.5%, is obtained with Bs=20 and Sp=15. Segmentation errors are due to overlapping characters or to over- or under-segmentation. When Bs and Sp are small, over-segmentation errors increase; when they are large, under-segmentation errors increase, as shown in Fig. 11. In the second series of experiments, we fixed the block size and the step size (Bs=20, Sp=15) and varied the threshold T; the results are presented in Fig. 12. They show that the best value of the threshold is 50. Raising the threshold above 50 often causes under-segmentation, while lowering it below 50 often causes over-segmentation, as shown in Fig. 13.

Conclusion and future work
Recognizing characters after segmentation is challenging, since segmentation is the most serious problem in the development of recognition systems for cursive Arabic writing. Segmenting a word into its characters is more difficult than segmenting text into lines and sub-words. In this work, a new segmentation algorithm based on the vertical projection histogram is proposed. Our segmentation approach gives encouraging results, with an accuracy of 89.5%. The major problem that makes this task difficult is the overlap between characters. Therefore, building on the promising findings presented in this paper, future work will improve the segmentation algorithm to address the complex problem of overlapping characters.