A Comparative Study of Javanese Script Classification with GoogleNet, DenseNet, ResNet, VGG16 and VGG19

Purpose: Javanese script is a cultural heritage of Indonesia, originating from the island of Java, that needs to be preserved. Therefore, in this study, Javanese script letters are classified and identified using the CNN method. The purpose of this research is to build a model that can properly classify Javanese script, which can help make the recognition of Javanese script letters easier. Methods: In this study, Javanese script classification is performed using transfer learning with Convolutional Neural Network architectures, namely GoogleNet, DenseNet, ResNet, VGG16 and VGG19. Transfer learning is used to improve on a sequential CNN model: because it reuses a previously trained model, processing can be better and closer to optimal. Result: With the transfer learning method, the GoogleNet model obtains a test accuracy of 88.75%, the DenseNet model 92%, the ResNet model 82.75%, the VGG16 model 99.25% and the VGG19 model 99.50%. Novelty: Previous studies have rarely discussed Javanese script classification using CNN transfer learning, or which method is best suited to this task. This study identifies an effective method for carrying out Javanese script classification accurately.


INTRODUCTION
Javanese is one of the oldest heritage languages in Indonesia [1]. Javanese script is an ancient script used on the island of Java [2]. Because of this, Javanese script is a very valuable cultural heritage [3]. It is therefore important to preserve Javanese script, and one way to do so is to make Javanese a local content subject from elementary to high school [4]. Javanese language lessons include lessons on Javanese script, but because of the complexity and diversity of the script, students still do not fully master reading and writing it [5]. This makes students quickly become bored when learning Javanese script [6]. Technological assistance is therefore needed so that students can easily understand and learn Javanese characters [7], and education personnel play an important coordinating role in providing fun and effective learning for students [8].
Deep learning is a development of machine learning [9]. Sarker [10] argues that deep learning is currently one of the most popular topics, especially in the fields of Artificial Intelligence, Machine Learning and data analysis. This is because deep learning is very good at processing big data [11], which is now widely used. Deep learning is also widely used because increasingly fast CPUs and GPUs speed up both the training and testing of models [12]. In its implementation, deep learning uses the concept of the artificial neural network [13], which is inspired by the workings of the biological neural system of the human brain [14], [15]. Deep learning is therefore widely used in image processing, sound processing [16], industrial automation, automatic control systems, and other areas [15]. It is used in these various fields because it can automatically learn features from data [17], without the data having to be processed first.
Therefore, compared with conventional machine learning methods, deep learning methods can provide good accuracy and speed [18], [19]. One method that can be used for deep learning is the Convolutional Neural Network (CNN). This method can perform object recognition and detection in a digital image [20]. Yadav et al. [21] note that CNN has been widely used and has produced good accuracy in image classification since 2012. Because of this, CNN is known as one of the best solutions for complex problems such as image processing and IoT [22], and it can find patterns in data well [23]. Building a CNN model requires layers. The layers usually used are the input layer, convolution layer, pooling layer, fully connected layer and output layer [24]. The input layer reads the data [25]. The convolution layer extracts features [26]. The pooling layer improves the classification process by reducing the dimensions of the feature maps produced by the previous layer, so that subsequent processing is faster [27]. The fully connected layer classifies an image into the existing classes [28]. The output layer presents the classification results of the previous layer [29].
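The roles of the convolution and pooling layers can be illustrated with a minimal pure-Python sketch. The 5x5 image, the 2x2 kernel and the function names below are made up for illustration and are not taken from the models used in this study; a real CNN learns its kernels during training.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN convolution layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b] for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 grayscale patch with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)  # 4x4 feature map, non-zero only at the edge
pooled = max_pool2d(features)     # 2x2 map: same edge information, fewer values
print(features)
print(pooled)
```

The convolution produces a response only where the edge is, and pooling halves each spatial dimension while keeping that response, which is the dimensionality reduction described above.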
Research on the identification and classification of Javanese script letters is an interesting topic. Several studies have classified Javanese letters, but their accuracy still leaves room for improvement. In the research by Rasyidi et al. [1], [30], Javanese script letter classification obtained an accuracy, precision and recall of 97.7%. This is not a bad value, but it can still be improved, and the present research is carried out in the expectation of increasing the test accuracy. Das et al. [31] discussed the classification of brain tumors using the CNN method in 2019. The purpose of that research was to build a model that can accurately classify brain tumors, and it obtained a brain tumor prediction accuracy of 94.39%.
Chaganti et al. [32] discuss the classification of Dalmatian, Pizza, Dollar Ball, Soccer Ball and Sunflower images using CNN. The purpose of that research was to compare a machine learning method, SVM, with a deep learning method, CNN, for image classification. Classification with SVM obtained a test accuracy of 82%, while classification with CNN obtained an accuracy of 93.57%. Jasim et al. [33] discuss the classification and detection of leaf diseases using deep learning. The purpose of that research was to build a model that can accurately predict and classify leaf diseases, and the deep learning model obtained a prediction test accuracy of 98%. Badza et al. [34] discuss the classification of brain tumors from MRI images using CNN. The purpose of that research was to build a model that can accurately classify brain tumors from MRI images. The model obtained a test accuracy of 97.15%, with an average precision of 97.15%, an average recall of 97.82% and an average f1-score of 97.47%.
Here, Javanese script letters are classified using the CNN method, specifically CNN transfer learning, in which a previously trained model is reused to perform a new classification task [29]. The CNN transfer learning architectures used in this study are GoogleNet, ResNet, DenseNet, VGG16 and VGG19 [35]. These models are used because they can capture hierarchical features in images. Transfer learning is also used so that model training can be faster and the classification more accurate. These five architectures are chosen because they share similarities and perform well in image classification, as explained in the Methods section. The purpose of this research is to build a model that can classify Javanese letters accurately, which can help in identifying Javanese letters.

METHODS
In this research, we use a publicly available Javanese script dataset obtained from the kaggle.com website. The dataset contains 20 Javanese script letter classes: ba, ca, da, dha, ga, ha, ja, ka, la, ma, na, nga, nya, pa, ra, sa, ta, tha, wa, ya. After the dataset is obtained, it is divided into 70% training data, 10% validation data and 20% testing data using the train_test_split function from sklearn, so that the data can be used for building the deep learning models.
In this research, the Javanese script classification process uses the CNN method. The advantage of CNN is that it can perform feature extraction automatically [36], which makes computation easier. The CNN method used in this research is CNN transfer learning. Transfer learning is a method in which a model that has already been trained is reused for a new task [37]. Rahman et al. [38] suggest that transfer learning is used so that training does not require a large dataset, because the model has previously been trained, and so that training time can be reduced. In this article, transfer learning is implemented using GoogleNet, DenseNet, ResNet, VGG16 and VGG19, all of which were trained on the same dataset, the ImageNet dataset [39]. GoogleNet is a model with a rich feature representation, trained on millions of images of everyday objects contained in the ImageNet dataset [40]. This model has therefore already learned a great deal, enabling it to recognize images of various levels of complexity, scales and resolutions. DenseNet is a model built with very dense layer connections, in which all layers have direct connections to each other [41]. ResNet is a model that avoids vanishing gradients by using residual blocks [42]. In a residual block, the input is carried along a skip connection, or shortcut connection, and added to the block's output, so the gradient propagated back is not lost. This also helps keep the ResNet model from overfitting, because the model can retain the features in the input. VGG16 is a model trained on the ImageNet dataset, consisting of 1.2 million images in 1000 classes [30], and it consists of 13 convolutional layers and 3 fully connected layers for classification [43].
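The effect of the ResNet skip connection on the gradient can be written out as a short derivation. The symbols here are introduced only for illustration: x is the block input, F is the transformation computed by the block's stacked layers, y is the block output, L is the loss and I is the identity.

```latex
y = F(x) + x
\qquad\Longrightarrow\qquad
\frac{\partial L}{\partial x}
  = \frac{\partial L}{\partial y}\left(\frac{\partial F}{\partial x} + I\right)
  = \frac{\partial L}{\partial y}\,\frac{\partial F}{\partial x} + \frac{\partial L}{\partial y}
```

Even when the term involving F becomes very small, the second term passes the gradient from the output back to the input unchanged, which is why the returned gradient does not vanish through the block.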
The Javanese script classification process using CNN transfer learning is divided into 3 processes, namely training, validation, and testing. The training process makes the model able to recognize patterns in the data. The validation process checks the model during training so that the model does not overfit. The testing process evaluates the model's ability to make predictions on new data other than the training and validation data. Figure 2 shows the flow of the process used to classify Javanese script letters with CNN transfer learning. Each stage is explained below.

1.
First, read the images that will be used for training and testing the model.

2.
After the image data is read, it is split into 70% training data, 10% validation data and 20% testing data. These subsets are used for training, validating and testing the model, respectively.

3.
After the data is split, initialize the pre-trained model used for transfer learning. The pre-trained models used are GoogleNet, DenseNet, ResNet, VGG16 and VGG19. The top layer of the original model is not used, so the option to include the top layer is set to false for each pre-trained model.

4.
After the pre-trained model is initialized, new layers are built to serve as the top layer of the pre-trained CNN model. The top layer used can be seen in Figure 3.

5.
After the model is tested, its performance is calculated using the confusion matrix to obtain the accuracy, precision, recall and f1-score of the test results. These values show how well the model performs the classification.
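The performance calculation in the last step can be sketched as follows. This is a minimal pure-Python illustration, not the code used in this study; the function name and the 3-class confusion matrix are made up, whereas the actual experiments use 20 classes.

```python
def per_class_metrics(confusion):
    """Compute accuracy and per-class precision, recall, f1 and support.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    report = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(confusion[c]) - tp                       # true c, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report.append({"precision": precision, "recall": recall, "f1": f1, "support": tp + fn})
    return correct / total, report


# Made-up 3-class confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [20, 0, 0],
    [1, 18, 1],
    [0, 0, 20],
]
accuracy, report = per_class_metrics(confusion)
macro_precision = sum(m["precision"] for m in report) / len(report)
macro_recall = sum(m["recall"] for m in report) / len(report)
print(f"accuracy={accuracy:.4f} precision={macro_precision:.4f} recall={macro_recall:.4f}")
```

The support of each class is the number of true samples of that class, so the supports sum to the size of the test set.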

RESULTS AND DISCUSSIONS
Here, the Javanese script classification is implemented in the Python programming language, with Jupyter Notebook as the tool used to write and run the programs. After the program to read and split the data is written, the pre-trained model used for transfer learning and the top layer used with each pre-trained model are initialized. Finally, the model is trained, with validation carried out at the same time. The results of training and validation are given in the form of accuracy and loss graphs in Table 1. Table 1 shows the training and validation results of the models in classifying Javanese script letters. These results were obtained using 10 epochs for each model with the layers built previously.
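The idea of keeping a pre-trained base fixed and training only a new top layer can be shown with a toy pure-Python sketch. It is only an analogy under stated assumptions: `frozen_features` is a hypothetical stand-in for a pre-trained convolutional base, the head is a simple perceptron rather than the top layer in Figure 3, and the data is made up.

```python
def frozen_features(x):
    """Stand-in for a pre-trained base: a fixed feature map that is never updated."""
    # Hypothetical features; a real base would be a network such as VGG16 without its top layer.
    return [x[0] + x[1], x[0] - x[1], 1.0]


def train_head(samples, labels, epochs=10, lr=0.1):
    """Train only a new linear top layer (perceptron rule) on the frozen features."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_features(x)
            score = sum(w * v for w, v in zip(weights, f))
            predicted = 1 if score > 0 else 0
            # Only the head weights change; the feature extractor stays fixed.
            weights = [w + lr * (y - predicted) * v for w, v in zip(weights, f)]
    return weights


def predict(weights, x):
    """Classify one sample with the frozen base plus the trained head."""
    return 1 if sum(w * v for w, v in zip(weights, frozen_features(x))) > 0 else 0


# Made-up, linearly separable toy data: class 1 when x0 + x1 is large.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 0.9), (0.8, 1.0), (0.9, 0.7)]
labels = [0, 0, 0, 1, 1, 1]

head = train_head(samples, labels)
predictions = [predict(head, x) for x in samples]
print(predictions)
```

Because only the small head is trained, few epochs and little data are needed, which mirrors the motivation for transfer learning given in this paper.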
In Table 1, the best training and validation values are obtained with the pre-trained VGG16 model, with a training accuracy of 100% and a validation accuracy of 99.37%. This shows that VGG16 can properly classify the training data. The best loss value is also obtained when validating with the VGG16 model, which shows that VGG16 performs well in accurately classifying the validation data. In addition, the loss graphs show no significant gap between training and validation. The models built in this study therefore do not overfit and can perform the classification well. After training and validation, the model is tested to see how the trained and validated model performs on new data other than the training and validation data.
Table 2 shows the accuracy obtained after testing the pre-trained models that were previously trained and validated. As can be seen in Table 2, the best test accuracy is obtained with the pre-trained VGG19 model, at 99.50%. Because testing uses new data that was not given to the model during training or validation, this shows that the VGG19 pre-trained model with the previously built top layer can classify Javanese script letters very accurately.
After the model is tested, its performance is calculated using the confusion matrix. The values used to measure test performance are the precision, recall, and f1-score described previously. These values are given in Table 3. The best average precision is obtained with the VGG19 model, at 100%. This shows that the VGG19 model can accurately predict all existing classes. The best average recall and f1-score are obtained with the VGG16 and VGG19 models, at 99%. This shows that the VGG16 and VGG19 models perform well in predicting all classes and have a good balance between precision and recall. The Javanese script classification in [44], which used a combination of CNN and SVM, obtained a test accuracy of 98.35%. This indicates that the models built in this study can perform the classification very well, especially the models that use transfer learning with VGG16 and VGG19. The support value for all models is 400, because the support is the total amount of data used to test the model.
Javanese script letter classification is an interesting research topic, and several studies have been conducted to classify Javanese characters. Based on Table 4, the accuracy obtained in this study is an improvement over previous research. It is concluded that the pre-trained models built and implemented in this study can be more effective and accurate in classifying Javanese script letters.
CNN with pooling layer customization: CNN with average pooling layer 5x5 is 93%, then 3x3 is 91%; with max pooling layer 5x5 is 92%, then 3x3 is 90%
Diqi [46], CNN: accuracy is 99.69%
Our proposed pre-trained CNN models: GoogleNet is 88.75%, DenseNet is 92%, ResNet is 82.75%, VGG16 is 99.25%, VGG19 is 99.50%

CONCLUSION
After conducting research and testing to classify Javanese script letters using pre-trained CNN models, the best test accuracy, 99.50%, is obtained with the pre-trained VGG19 model. The VGG19 model also obtains the best average precision, recall, and f1-score, namely 100%, 99% and 99%. It is concluded that the pre-trained VGG19 model is accurate in classifying Javanese script letters. Future research is expected to apply image enhancement to further prevent overfitting, and to enhance the CNN either by using other pre-trained models such as SqueezeNet, MobileNet and EfficientNet, or by increasing the parameters of the CNN model. Future research is also expected to improve the accuracy of the GoogleNet, DenseNet, and ResNet transfer learning models by adding data augmentation or by optimizing parameters with Particle Swarm Optimization, a Genetic Algorithm, or other methods.
Figure 1 shows the Javanese script letters used in this study. The data is divided into training and testing data so that the trained model can be checked for overfitting: because the testing data is separate from the training data, it shows whether the model can classify new data. Each letter shown in Figure 1 represents one class, so the figure gives a representative visualization of the 20 classes used in this study. A total of 2000 images are used, with 100 images per class. The 2000 images are divided into 80% training data and 20% testing data, giving 1600 training images and 400 testing images. The training data is then divided again into 70% training data and 10% validation data, both as fractions of the full dataset. The final distribution is 1400 training images, 200 validation images and 400 testing images.
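The two-stage split described above can be checked with a short sketch. The study uses the train_test_split function from sklearn; the standard-library version below is a stand-in written only to make the arithmetic explicit, and the helper name, the seeds and the per-class (stratified) splitting are assumptions rather than details given in this paper.

```python
import random


def stratified_split(labels, held_out_fraction, seed=0):
    """Return (kept, held_out) index lists, holding out the given fraction of each class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    kept, held_out = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * held_out_fraction)
        held_out.extend(indices[:cut])
        kept.extend(indices[cut:])
    return kept, held_out


# 20 classes with 100 images each, as in the dataset description (labels are placeholders).
labels = [c for c in range(20) for _ in range(100)]

# Stage 1: 80% training+validation, 20% testing.
trainval_idx, test_idx = stratified_split(labels, 0.20)

# Stage 2: 10% of the full dataset is 200 of the remaining 1600 images, i.e. 12.5%.
trainval_labels = [labels[i] for i in trainval_idx]
train_pos, val_pos = stratified_split(trainval_labels, 200 / 1600, seed=1)
train_idx = [trainval_idx[p] for p in train_pos]
val_idx = [trainval_idx[p] for p in val_pos]

print(len(train_idx), len(val_idx), len(test_idx))
```

Note that the second split must hold out 12.5% of the 1600 remaining images, not 10%, in order to end up with 10% of the full dataset as validation data.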

Table 4. Comparison of results with previous research