Classification Of Multiple Emotions In Indonesian Text Using The K-Nearest Neighbor Method

Emotions are expressions manifested by individuals in response to what they see or experience. In this study, emotions were examined through individuals' tweets regarding the 2024 election issues in Indonesia. The collected tweets were labeled with emotions using the emotion wheel, which consists of six categories: joy, love, surprise, anger, fear, and sadness. After the labeling process, the next step involved weighting using the TF-IDF (Term Frequency-Inverse Document Frequency) and Bag-of-Words (BoW) techniques. The model was then evaluated using the K-Nearest Neighbor (KNN) algorithm with three data-splitting ratios: 80:20, 70:30, and 60:40. Accuracy was first calculated for the six-label model; the six labels were then merged into positive and negative categories, and the modeling was repeated using the same process. The results of this study revealed that TF-IDF outperformed BoW. The highest accuracy was achieved with the 80:20 data-splitting ratio, attaining 58% for the six-label classification and 79% for the two-label classification.


Introduction
Emotions are psychological states that involve feelings and mental states that can arise in response to certain stimuli or experiences (Pace-Schott et al., 2019). Emotions involve complex subjective experiences and can influence a person's behaviour, perception, and physical responses. Emotions can vary and include feelings such as joy, sadness, anger, fear, love, disgust, and many more (Gu et al., 2019). Emotions can be expressed through comments on an article or by expressing opinions on social media regarding the discussed subject (Graciyal & Viswam, 2021).
Currently, research related to emotions has been discussed extensively by previous researchers. One example is the analysis of emotions using sentiment analysis (Chenna et al., 2021). Sentiment analysis is a natural language processing method used to determine and analyze the sentiment or emotional attitude in text or data. Its main goal is to identify whether a text contains positive, negative, or neutral sentiment (Iglesias & Moreno, 2020). In this study, however, sentiment analysis is based on emotion labels, as done by previous researchers using labels such as anger, anticipation, disgust, fear, joy, love, optimistic, pessimistic, sad, surprise, and trust (Alturayeif & Luqman, 2021). Other studies have discussed emotions using labels such as neutral, worry, happiness, sadness, love, surprise, fun, relief, empty, enthusiasm, boredom, and anger (Kiran Kumar & Kumar, 2021).
Most previous studies used only a single data split and compared several algorithms. For example, Saifullah et al. (2021) used KNN, Bernoulli, Decision Tree, SVM, Random Forest, and XG-Boost, while Kang et al. (2012) used SVM and Naïve Bayes. In this study, only one algorithm is used, and three data splits are compared, namely 80:20, 70:30, and 60:40. To obtain good accuracy, this study applies preprocessing steps such as data cleaning, case folding, tokenizing, filtering, stemming, and transformation (Kusumawati et al., 2022).
Additionally, it employs word weighting using TF-IDF. Term Frequency-Inverse Document Frequency (TF-IDF) is an algorithmic method used to calculate the weight of each word. The method is known for its efficiency, simplicity, and accurate results. It calculates the Term Frequency (TF) and Inverse Document Frequency (IDF) values for each token (word) in each document within the corpus; simply put, TF-IDF measures how important a word is to a document relative to the whole corpus (Putra et al., 2022). Besides TF-IDF, this research also utilizes the Bag of Words (BoW) approach. BoW is one of the simplest methods for converting text data into vectors that computers can process; essentially, it only counts the frequency of word occurrences across the documents (Juluru et al., 2021). In the BoW approach, documents are treated as "bags" of words, i.e., the sets of words present in each document. In this representation, information about word order and sentence structure is lost, and only word frequencies are considered (Kowsari et al., 2019).
After word weighting, the next step involves testing with a model consisting of six labels: joy, love, surprise, anger, fear, and sadness. Additionally, this research will also conduct testing with two labels: positive and negative. These two labels are a combination of the six labels mentioned earlier, where joy and love fall into the positive label, while surprise, anger, fear, and sadness fall into the negative label.
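The merging of the six emotion labels into two sentiment classes described above can be expressed as a simple mapping (a minimal sketch; the function and variable names are illustrative):

```python
# Collapse the six emotion labels into two sentiment classes,
# following the grouping described in the text.
POSITIVE = {"joy", "love"}
NEGATIVE = {"surprise", "anger", "fear", "sadness"}

def merge_label(emotion: str) -> str:
    if emotion in POSITIVE:
        return "positive"
    if emotion in NEGATIVE:
        return "negative"
    raise ValueError(f"unknown emotion label: {emotion}")

labels = ["joy", "fear", "love", "anger"]
merged = [merge_label(e) for e in labels]
print(merged)  # -> ['positive', 'negative', 'positive', 'negative']
```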

Literature Review
Research related to emotions has been conducted extensively in previous studies; Table 1 provides an overview of studies that have investigated emotions. Fernandes et al. (2020) performed sentiment analysis on emotions by categorizing them into 5 labels; their model achieved label-specific accuracies such as 86% for 'happy' and 81% for 'sad.' Other studies by Ramdani, Santosh, and Sajib examined emotions using 2 labels, namely 'positive' and 'negative,' with accuracies ranging from 79% to 93%. However, Sailunaz's research, which used more than 2 labels, namely 'guilt,' 'joy,' 'shame,' 'fear,' 'sadness,' and 'disgust,' obtained a relatively low accuracy of 43%.
The research utilized the K-Nearest Neighbor (KNN) algorithm, chosen for its high accuracy in previous studies. For instance, Chenna et al. (2021) achieved 87% accuracy, while Suprayogi obtained 85%. Based on these two studies, the current research uses the KNN algorithm to analyze emotions with 6 labels. Figure 1 illustrates the flowchart of the methodology used in this research. The dataset consists of tweets extracted from Twitter, specifically on the topic of pilpres2024 (the 2024 Indonesian presidential election), comprising a total of 1,649 instances. These data were labeled based on emotions derived from an emotion wheel, a visual model that places various human emotions in a circular diagram divided into sectors or categories. The emotion wheel aids in identifying and describing the nuances and variations of the emotions people experience. After labeling, the next step involved preprocessing the data.

Research Methods
a. Data Cleaning: Data cleaning is a procedure to ensure the correctness, consistency, and usability of the data in a dataset. It involves detecting errors or corruption in the data and then fixing or removing the affected records if necessary (Angloher et al., 2023).
b. Case Folding: Case folding is the process of converting all text to the same case, either lowercase or uppercase (Fauzi, 2019).
c. Tokenizing: Tokenizing is the process of splitting the text into separate word sequences, separated by spaces or other characters (Friedman, 2023).
d. Filtering: Filtering, also known as stop-word removal, is a preprocessing step that eliminates conjunctions, connectors, and other common words, so that only important words are retained (Madhavan et al., 2021).
e. Stemming: Stemming reduces the number of distinct word indexes by returning a word with suffixes or prefixes to its base form, grouping words that share the same base and similar meaning but differ in affixes. The NLTK library provides modules for stemming, including Porter, Lancaster, WordNet, and Snowball (Rifai & Winarko, 2019).

f. Transformation:
Transformation is a crucial step in data preprocessing for machine learning. It converts the original data into a form more suitable for modeling, with the goals of improving data quality, reducing bias, enhancing understanding, or improving model performance (Awan et al., 2018). The next step involves word weighting using TF-IDF on the cleaned data obtained from the preprocessing stage; the Bag of Words (BoW) technique is also used for word weighting. Word weighting, or feature extraction, is employed to enhance accuracy. This research compares the two feature-extraction methods for both the 6-label and the 2-label classification scenarios using the KNN algorithm.
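The preprocessing steps described above can be sketched as follows. This is a minimal illustration: the stop-word list is a tiny invented sample, the cleaning rules are simplified, and stemming, which for Indonesian text would typically rely on a dedicated library such as Sastrawi, is omitted.

```python
import re

# Tiny illustrative Indonesian stop-word set; a real pipeline uses a full list
STOPWORDS = {"yang", "dan", "di", "ke", "ini"}

def preprocess(tweet: str) -> list[str]:
    # Data cleaning: drop URLs, mentions, digits, and punctuation
    tweet = re.sub(r"https?://\S+|@\w+|[^a-zA-Z\s]", " ", tweet)
    # Case folding: normalize everything to lowercase
    tweet = tweet.lower()
    # Tokenizing: split the text on whitespace
    tokens = tweet.split()
    # Filtering: remove stop words, keeping only informative tokens
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("Pemilu 2024 di Indonesia! http://t.co/x @user"))
# -> ['pemilu', 'indonesia']
```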
Subsequently, modeling is conducted using 3 different data-splitting ratios with the KNN algorithm. KNN is chosen because it is simple to understand and easy to implement (Uddin et al., 2022). Its basic concept is to assign a data point the class that appears most frequently among its K nearest neighbors, which makes it relatively easy to implement without many tuning parameters (Pamuji, 2021). Furthermore, KNN is a non-parametric algorithm, meaning it makes no specific assumptions about the data distribution (Wang et al., 2020), which allows it to perform well on data without clear patterns or distributions. KNN also tolerates noise well: data contaminated by noise or outliers will not significantly affect the classification, since the algorithm relies on the majority vote of the nearest neighbors.
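The modeling step with the three splitting ratios can be sketched with scikit-learn. The texts and labels below are invented toy data (the study's actual dataset is 1,649 labeled tweets), and the choice of K = 3 is an illustrative assumption, as the paper does not report its K value.

```python
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Invented stand-in data with the study's six emotion labels
texts = ["senang sekali", "sangat takut", "aku marah", "penuh cinta",
         "sedih rasanya", "kaget banget", "bahagia hari ini", "takut kalah"]
labels = ["joy", "fear", "anger", "love", "sadness", "surprise", "joy", "fear"]

X = TfidfVectorizer().fit_transform(texts)

# The three splitting ratios compared in the study: 80:20, 70:30, 60:40
for test_size in (0.2, 0.3, 0.4):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=test_size, random_state=42)
    knn = KNeighborsClassifier(n_neighbors=3)  # K=3 is an assumed value
    knn.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, knn.predict(X_te))
    print(f"split {round((1 - test_size) * 100)}:{round(test_size * 100)}"
          f" -> accuracy {acc:.2f}")
```

Swapping `TfidfVectorizer` for `CountVectorizer` reproduces the BoW variant of the experiment with no other changes.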

Results and Discussions
KNN 6 Label Using TF-IDF
Here are the results obtained from the conducted modeling. Figure 2 presents the accuracy results using data-splitting ratios of 60:40, 70:30, and 80:20. From Figure 2, it can be observed that the highest accuracy is obtained with the 80:20 data splitting, namely 58%. A previous study also used KNN with TF-IDF to analyze 5 emotion labels, namely anger, happiness, sadness, love, and fear; its accuracy of 51% was still below the result obtained in this study (Nugroho et al., 2022). Another study also conducted sentiment analysis on emotions but calculated the accuracy for each label separately, resulting in accuracies above 90% (Kaur & Bhardwaj, 2019). Therefore, it can be concluded that per-label accuracy is higher than overall accuracy when using TF-IDF as the feature extraction method.

KNN 6 Label Using BoW
The next test was conducted on the 6 labels using Bag of Words (BoW). Figure 3 presents the accuracy results obtained using KNN with BoW for the 6 labels. The results show a decrease in accuracy: the highest accuracy in this test is observed with the 80:20 data splitting, at 54%, a 4% decrease compared to the test using KNN with TF-IDF. Figure 4 shows that the accuracy of the KNN algorithm with TF-IDF feature extraction is better than with BoW.

KNN 2 Label Using TF-IDF
Figure 5 presents the accuracy results of TF-IDF with KNN using 2 labels. From Figure 5, it can be observed that the 80:20 data splitting still achieves the highest accuracy, reaching 79%. Moreover, using 2 labels substantially improves the accuracy compared to using 6 labels. In other studies, however, the accuracy obtained was only 50.3% (Alzami et al., 2020) and 56% (Junadhi et al., 2022). Another study obtained an accuracy of 69.68% in analyzing Arabic tweets (Aloqaily et al., 2020), and Ritha et al. (2023) obtained 62% using a dataset of 1,600 instances, similar in size to the dataset used in this study.

KNN 2 Label Using BoW
The evaluation conducted on the model using KNN with BoW feature extraction and 2 labels is shown in Figure 6. From Figure 6, it can be observed that there is a decrease in accuracy of approximately 2% for the 80:20 data splitting. The highest accuracy achieved is 77%, which is still higher than the 70:30 and 60:40 splittings. Other studies that also utilized KNN and BoW obtained accuracies of 52% (Mujahid et al., 2021) and 64% (Alzami et al., 2020), respectively.

Comparison of Model Evaluation Using 2 Labels
In Figure 7, it can be seen that TF-IDF accuracy with an 80:20 data splitting still outperforms the others, as in the experiment with 6 labels. In previous studies, several researchers have improved the KNN algorithm. For instance, Alzami et al. (2020) achieved 70% accuracy with KNN by using a hybrid approach combining TF-IDF and W2V. Another study improved KNN by employing a BoW Ensemble Feature, which increased the KNN accuracy up to 96% (Irfan et al., 2018). Apart from enhancing KNN through feature extraction, some researchers also used hybrid approaches with other algorithms. Rani and Singh Gill (2020) used an ensemble of KNN, SVMRadial, and C5.0. Another study combined KNN with a Decision Tree in a hybrid approach and obtained 80% accuracy (Khattak et al., 2021). Furthermore, other studies have extended KNN to improve accuracy in multiclass scenarios. Pandian and Balasubramani (2020) employed the FA-KNN hybrid approach to enhance multiclass classification, resulting in 91% accuracy. Menaouer et al. (2022) compared several KNN extensions for a 7-label classification task, namely KNN Stacking, KNN Boosting, and KNN Bagging, and found that KNN Bagging achieved the highest accuracy at 88.6%. From these studies, it can be concluded that improving the algorithm, whether through hybrid approaches or additional ensemble techniques, significantly influences the accuracy results.

Conclusion
Based on the discussion above, the researcher concluded that the KNN algorithm with an 80:20 data-splitting ratio yields the best accuracy. This holds for both feature extraction methods used, TF-IDF and BoW, and for both the 6-label and the 2-label scenarios; when compared directly, TF-IDF outperforms BoW. In comparison with other studies, the results obtained in this research are better than those of previous studies that relied solely on the KNN algorithm. KNN's accuracy can be further enhanced through hybrid approaches with other algorithms and ensemble techniques such as boosting, SVM, bagging, and others. Therefore, for future research, it is recommended to explore hybrid methods with KNN to further improve the accuracy obtained in this study.