Handwritten Korean letter recognition using a deep convolutional neural network on the Android platform

Currently, the popularity of Korean culture attracts many people to learn everything about Korea, particularly its language. To acquire the Korean language, every learner needs to be able to understand the Korean non-Latin characters. A digital approach is needed to make the Korean learning process easier. This study uses a Deep Convolutional Neural Network (DCNN). The DCNN performs recognition on an image based on a previously trained model, such as the Inception-v3 model. Subsequently, a retraining process using the transfer learning technique is carried out on the trained model's values in order to develop a new model with better performance and without specific systematic errors. The testing accuracy achieved in this research is 86.9%.


Introduction
The popularity of Korean culture, or the Korean wave (hallyu), has become a phenomenon that draws public attention nowadays, especially in Indonesia. Many young Indonesians are motivated to learn the Korean language, and to do so they must understand the Korean non-Latin characters (Hangul), which consist of vowels and consonants [1]. Bahasa Indonesia uses Latin characters, so Indonesian learners face a set of difficulties in learning Hangul.
The method used in this research is the Deep Convolutional Neural Network. This method has shown outstanding performance in image recognition, because a DCNN has a strong ability to extract high-level features. The paper is organized as follows: Section 2 reviews several previous studies related to Hangul recognition and Deep Convolutional Neural Networks. Section 3 describes the method used to recognize handwritten Korean letters. Section 4 explains the results of this research. Section 5 presents the summary and suggestions for future research.

Related work
There have been many studies on Korean character (Hangul) recognition, conducted with various objectives and methods. Kang used Hierarchical Stochastic Modelling for Hangul character recognition [2]. Sari proposed Template Matching and K-Nearest Neighbors methods to recognize Hangul, with accuracy rates of 85% for Template Matching and 72% for K-Nearest Neighbors [3]. Our previous work on character recognition is reported in [11]. Hangul handwriting recognition was also done by Nasution, who used a Backpropagation Neural Network with accuracy rates of 98.2% for memorization and 85.3% for generalization. The output of that system was very simple because it was still built with the GNU GCC MinGW Compiler [4].
Prabowo conducted research on utilizing the Gesture API to practice writing Hangul on an Android smartphone. However, the output of that system was not processed for any specific purpose [5].
In [6], a Deep Convolutional Neural Network was used to recognize handwritten Hangul, with accuracy rates of 95.96% on the SERI95a dataset and 92.92% on the PE92 dataset. The output of that research was a desktop application.
Other similar research was done by Bui and Chang, who used a Deep Convolutional Neural Network to recognize a non-MNIST dataset: a handwritten digit dataset with more noise that is harder to recognize. Their research achieved a best accuracy rate of 98% [7].

Methodology
The method used in this research consisted of several steps: pre-training a neural network to obtain the Inception-v3 model, retraining using the transfer learning technique, and word recognition. The word recognition process consisted of the following steps: image acquisition, preprocessing, and recognition using a Deep Convolutional Neural Network. Figure 1 describes the general architecture of the proposed method.

Pre-trained neural network
In the Pre-Trained Neural Network phase, a network was trained on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset, producing a deep learning model called the Inception-v3 model. This model has a strong capability in image recognition.
The DCNN method was used in the training process because of its outstanding performance in image recognition. A DCNN's ability to recognize images is supported by its capacity to extract high-level features; in addition, its convolutional and max-pooling layers can recognize various objects. A DCNN can carry out every phase of image recognition: since it receives the raw image as input, it does not need a separate feature extraction phase the way a conventional classifier does [6]. Figure 2 shows a DCNN architecture consisting of five convolutional layers and three fully-connected layers. The output of the last fully-connected layer is fed to a 1000-way softmax, which produces a distribution over 1000 class labels. The training process of each layer is shown in Table 1. The input image is filtered by the first convolutional layer with 96 kernels of size 11x11x3 and a stride of 4 pixels. The output of the first convolutional layer is filtered by the second convolutional layer with 256 kernels of size 5x5x48. The third, fourth, and fifth convolutional layers are connected to each other without intervening pooling or normalization layers. The output of the second convolutional layer is connected to the third convolutional layer, which has 384 kernels of size 3x3x256. The fourth convolutional layer has 384 kernels of size 3x3x192, and the fifth has 256 kernels of size 3x3x192.
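The convolution and pooling operations at the heart of a DCNN can be sketched in plain NumPy (a toy single-channel illustration of one convolutional layer followed by max pooling, not the actual multi-channel stack described above):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid 2-D convolution (no padding) of a single-channel image."""
    kh, kw = kernel.shape
    h = (image.shape[0] - kh) // stride + 1
    w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling, which downsamples the feature map."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = feature_map[i * size:(i + 1) * size,
                                    j * size:(j + 1) * size].max()
    return out

# A toy 8x8 "image" filtered by a 3x3 vertical-edge kernel with stride 1,
# then downsampled by 2x2 max pooling, as in one convolutional stage.
image = np.arange(64, dtype=float).reshape(8, 8)
kernel = np.array([[1., 0., -1.], [1., 0., -1.], [1., 0., -1.]])
features = max_pool(conv2d(image, kernel))
```

Stacking several such stages, each learning its own kernels, is what lets a DCNN build up the high-level features mentioned above.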

Retraining process (transfer learning)
The Inception-v3 model could not recognize characters and words written in Hangul; therefore, a retraining process using the transfer learning technique with a dataset of handwritten Hangul characters was needed. In this phase, the bottleneck value of each image was calculated. "Bottleneck" is an informal term for the layer just before the final classification layer; its output values for each image are the features used for classification.
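The bottleneck idea can be sketched as follows (a minimal NumPy illustration in which a random projection stands in for the frozen Inception-v3 layers; the vector length of 2048 matches Inception-v3's bottleneck, everything else is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen Inception-v3 layers: everything up
# to the bottleneck is fixed, so each image's 2048-value bottleneck vector
# can be computed once and cached for the whole retraining run.
FROZEN_WEIGHTS = rng.standard_normal((784, 2048))

def bottleneck(image_batch):
    """Map flattened images to their fixed classification features."""
    return np.maximum(image_batch @ FROZEN_WEIGHTS, 0.0)  # frozen layer + ReLU

images = rng.standard_normal((5, 784))          # five toy "Hangul" images
cache = {i: bottleneck(images[i:i + 1])[0] for i in range(len(images))}
```

Because the frozen layers never change during retraining, each image's bottleneck vector is computed once and reused at every training step; only the new final layer is trained on these cached values, which is what makes transfer learning fast.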

Output (new model)
The output of the retraining process is a new model that is able to recognize handwritten Korean words written in Hangul characters.

Word recognition
Word recognition has several steps as follows.
• Input: The input image is a word written on the screen of an Android smartphone.
• Preprocessing: There are two steps in the preprocessing phase, as follows.
• Thresholding: Thresholding is the conversion of the input image into a binary image [9]. The thresholding formula is described in formula 1.
• Resizing: The input image was resized to 224 x 224.
• Recognition: In this phase, the result of the image preprocessing was recognized using the DCNN method.
• Output: The output of this research was the recognized word, with translation and pronunciation features.
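The two preprocessing steps can be sketched in NumPy as follows (a toy illustration; a production app would typically use an image library with proper interpolation for the resize):

```python
import numpy as np

def threshold(image, t=128):
    """Binarize a grayscale image: pixels above t become 1, others 0."""
    return (image > t).astype(np.uint8)

def resize_nearest(image, size=224):
    """Nearest-neighbour resize to size x size (a minimal stand-in for a
    proper interpolation routine)."""
    rows = (np.arange(size) * image.shape[0] / size).astype(int)
    cols = (np.arange(size) * image.shape[1] / size).astype(int)
    return image[np.ix_(rows, cols)]

# Toy 4x4 grayscale "canvas image": binarize, then scale up to 224x224
# so it matches the input size the recognition network expects.
canvas = np.array([[0, 200, 50, 255],
                   [30, 180, 90, 10],
                   [255, 255, 0, 0],
                   [120, 140, 60, 250]])
binary = threshold(canvas)
model_input = resize_nearest(binary)
```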

Experimental results
The results obtained from the training and application testing are discussed in this section.

Training results
The dataset in this research consisted of 11 Korean words with 1157 images, of which 1100 were used as training data and the rest as testing data. The dataset is shown in Table 2. The retraining process was done using TensorFlow.
TensorFlow is an open-source machine learning library made by Google that supports several programming languages [10]. The bottleneck value of each image was calculated first. The training was conducted with 4000 training steps. During the training process, 10 images were chosen randomly from the training set at each step, and their bottleneck values were fed into the final layer to obtain prediction values. There were 7 main parameters in this training: Training Steps, Learning Rate, Testing Percentage, Validation Percentage, Eval Step Interval, Train Batch Size, and Validation Batch Size. Table 3 shows the main parameters of this training.
Each training step reported the training accuracy, validation accuracy, and cross entropy. Cross entropy is a loss function that shows how well the learning process is running: the lower its value, the better, and if the cross entropy keeps decreasing, the learning process is improving. If the training accuracy is high but the validation accuracy is low, the network is overfitting, i.e. memorizing features of the training images that are not useful for recognizing new images. The final accuracy is displayed after the training process. A summary of the training process is shown in Table 4. The cross entropy graph of the 4000 training steps is shown in Figure 5; the value kept decreasing, which means the system performed increasingly well in recognizing Hangul.
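The per-step mechanics described above (a random batch of 10 cached bottleneck values, a softmax final layer, and a logged cross entropy) can be sketched as follows. This is a self-contained NumPy miniature with synthetic 32-dimensional features and an illustrative learning rate, not the actual TensorFlow retraining script:

```python
import numpy as np

rng = np.random.default_rng(42)

def cross_entropy(probs, labels):
    """Average negative log-probability of the true class (the loss
    reported at each training step)."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

# Miniature stand-in for the retraining loop: 1100 cached bottleneck
# vectors (toy 32-dimensional ones, not real Inception-v3 features), a
# random batch of 10 per step, and a new softmax final layer.
NUM_CLASSES, DIM, STEPS, BATCH = 11, 32, 500, 10
features = rng.standard_normal((1100, DIM))
true_W = rng.standard_normal((DIM, NUM_CLASSES))
labels = (features @ true_W).argmax(axis=1)   # learnable synthetic labels

W = np.zeros((DIM, NUM_CLASSES))
history = []
for step in range(STEPS):
    batch = rng.choice(len(features), size=BATCH, replace=False)
    logits = features[batch] @ W
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    history.append(cross_entropy(probs, labels[batch]))
    grad = probs.copy()
    grad[np.arange(BATCH), labels[batch]] -= 1.0      # d(loss)/d(logits)
    W -= 0.1 * features[batch].T @ grad / BATCH       # illustrative lr
```

As in Figure 5, the logged cross entropy starts at ln(11) for 11 classes (uniform predictions) and falls as the final layer learns.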

Testing results
We tested the Korean letter handwriting recognition with 57 images. In this application, an example of how to write the word is shown above the canvas. To start the recognition, the image is first inputted by writing a word on the screen of the Android smartphone, and then the detect button is pressed. The pronunciation and translation of the word are then displayed; if the play button is pressed, the pronunciation audio is played. The recognition process is shown in Figure 6. We conducted the application testing with 14 people, who wrote different words different numbers of times. The results of the system testing are shown in Table 5. Overall, detection achieved an average accuracy of 86.9%. Several problems affected the accuracy for individual images. The first was the stroke shape of the words, which had to conform to the training images; abrupt strokes also reduced the accuracy.

Conclusion and future work
Based on the testing results of Korean letter handwriting recognition using a Deep Convolutional Neural Network on the Android platform, an average accuracy of 86.9% was achieved. The accuracy for an input image was affected by the stroke shape of the Hangul. A training accuracy of 100% was acquired for each word that had 100 training images; the number of trained images per word affected the training accuracy. In the future, the number of word classes can be increased so that the system can recognize words of more than two syllables as well as cursive letters.