SENTIMENT ANALYSIS AND CLASSIFICATION OF ARAB JORDANIAN FACEBOOK COMMENTS FOR JORDANIAN TELECOM COMPANIES USING LEXICON-BASED APPROACH AND MACHINE LEARNING

in polarity classification, the researchers introduced them to our formulated dataset. The results of the classification were 97.8, 96.8 and 95.6% for Support Vector Machine (SVM), K-Nearest Neighbour (K-NN) and Naïve Bayes (NB) classifiers, respectively.


INTRODUCTION
Natural Language Processing (NLP) is the field of computer science and artificial intelligence that mainly studies human-computer language interaction [1]. Sentiment Analysis (SA), also known as opinion mining, is a field of NLP that investigates and analyzes people's opinions, sentiments, evaluations, attitudes and emotions from written language. It is one of the most active research areas in NLP and is also widely studied in data mining, web mining and text mining [2] [24].
Finding out what other people think has always been an important part of information-gathering behaviour. With the growing availability and popularity of opinion-rich resources, such as online reviews and personal blogs, new opportunities and challenges arise, as people can now actively use information technologies to seek out and understand the opinions of others. Polarity classification can be applied to individual reviews to evaluate the quality of a certain product [22] [25]. The sudden eruption of activity in the area of opinion mining and SA, which deals with the computational treatment of opinion, sentiment and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object [3].
To determine whether a sentence, text or comment expresses a positive or negative sentiment, three main approaches are commonly used: the lexicon-based approach, the machine learning approach and a hybrid approach. Figure 1 illustrates these approaches [4] [29]. In this work, we implemented the lexicon-based approach. The reason for this choice is that both machine learning and hybrid approaches demand a labeled dataset for supervised learning. Also, Jordanians, like other Arabs, use their dialects and modern Arabized words, letters, symbols, paronomasias and insinuations to express their opinions.
Jordanian telecommunication companies (Zain, Orange and Umniah) interconnect through video, voice and data (mainly internet browsing and social media). The cost of communications provided by those companies is low compared to neighbouring countries and the level of services provided is also very good; nevertheless, Jordanians do express their opinions, feelings and sentiments about those companies regarding cost, coverage, offers, internet speed, etc. These opinions may indicate whether a customer will stay with a company or leave it for another, or move from one offer to another. Most of those opinions, feelings and sentiments are expressed in the different Jordanian Arabic dialects, with little use of Modern Standard Arabic (MSA).

Figure 1. Main approaches of SA [4].
Customer churn analysis is a very common task in data analysis. It involves trying to predict whether customers will quit or continue their contracts. It is crucial for telecommunication companies to review and analyze their customers' feedback to enhance their services and avoid losing contracts. NLP offers methods to automatically analyze sentiments and predict whether they are positive or negative, as an early indicator of the quality of the provided services.
In this work, we propose an approach to predict customer satisfaction with the services provided by the telecommunication companies. The approach collects posts and comments from Facebook pages related to Jordanian telecommunication companies in order to find out the customer attitude toward these companies. After collecting and pre-processing the data, sentiment analysis is performed using the Lexicon-Based Approach (LBA). Owing to the amount of data handled, the work involves automatic translation of English sentiment lexicons to create Arabic sentiment lexicons.
The paper is organized as follows. In Section 2, we review some of the previous research related to the field of SA. Then, in Section 3, we introduce the lexicon-based approach for creating the dataset. In Section 4, we apply supervised learning algorithms to the formulated dataset. In addition, we describe the supervised learning model in both KNIME and ORANGE software and show the experimental results and the evaluation of the proposed method. Finally, we address the conclusions and discuss future work in Sections 5 and 6.
Rehab M. Duwairi and Islam Qarqaz [5] carried out an experiment using RapidMiner, an open-source machine learning software, to perform SA on Arabic text. The dataset was collected from tweets and Facebook comments that address issues in education, sports and politics. In this study, the main issue was determining the polarity (positive, negative or neutral) of the given text. The authors applied two approaches: the machine learning approach and the lexicon-based approach. Three supervised classifiers (SVM, Naïve Bayes and K-NN) were applied to an in-house collected dataset of 2591 tweets/comments from social media to analyze the sentiment of Arabic reviews. Unfortunately, the dataset was not large enough to draw strong conclusions.
Rehab M. Duwairi [6] used classification for SA. After extracting Arabic tweets, the author applied Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to a large dataset of almost 22500 Arabic tweets. The experiments compared the results obtained without the dialect lexicon to those obtained after converting dialectal words into MSA words. The results show the great impact of the dialect lexicon on the F-measure of the positive and negative classes as well as on the Macro-Precision, Macro-Recall and F-Measures. The results were limited by the storage limitations of the RapidMiner software used.
Ahmad A. Al Sallab et al. [7] concentrated on a deep learning framework to analyze the sentiment of Arabic text, with features based on the developed Arabic sentiment lexicon together with standard lexicon features. One supervised classifier (SVM) and four unsupervised classifiers (DNN, DBN, DAE and RAE) were applied to a dataset of 3795 entries. Results show that RAE produces the best accuracy.
Haifa K. Aldayel and Aqil M. Azmi [8] proposed a hybrid approach combining semantic orientation and SVM classifiers. The data passed through pre-processing operations to be ready for a lexicon-based classifier; the output data then became training data for the machine learning classifiers. The proposed approach used 1103 tweets. The experimental results show that the hybrid approach achieves a better F-measure and accuracy.
Hala Mulki et al. [9] proposed two classification models to analyze the sentiment of 3355 Arabic tweets written in MSA and Arabic dialects: a supervised learning-based model and an unsupervised learning-based (lexicon-based) model. The conducted experiments showed better F-score and Recall values using the supervised learning-based model. On the other hand, the unsupervised learning-based (lexicon-based) model achieved better results when stemming was not involved in the lookup process.
Nora Al-Twairesh et al. [10] built a corpus of over 2.2 million Arabic tweets. The authors presented the sequence of operations used in collecting and constructing the dataset: cleaning and pre-processing the collected data included filtering, normalization and tokenization. Later, with the help of annotators, the dataset was labeled as positive, negative, mixed, neutral or indeterminate. Then, the data was classified using the SVM classifier and provided as a benchmark for future work on SA of Arabic tweets.
Hassan Najadat et al. [11] applied four supervised classifiers to a dataset of 4227 post texts from Facebook pages to determine the efficiency of the three main telecommunication companies in Jordan (Orange, Zain and Umniah), based on the SA of customers who use social media, especially Facebook. The results were promising. However, the accuracy without sampling was better than that with sampling.
Leena Lulu and Ashraf Elnagar [12] proposed neural network models from different deep learning classifiers for the automatic classification of Arabic dialectal text. The proposed approach used the manually annotated Arabic online commentary (AOC) dataset, which consists of 110K labeled sentences. This approach yielded an accuracy of 90.3%.
Assia Soumeur et al. [13] and [19] focused on opinions, sentiments and emotions based on posts from various Facebook pages written in the Algerian dialect. The authors applied two types of neural network models, MLP and CNN, in addition to Naïve Bayes to classify comments as negative, positive or neutral. After the pre-processing steps, both models achieved good accuracy, with slightly better accuracy for the CNN model. This suggests that deep learning models generally obtain higher accuracies.
Jalal Omer Atoum and Mais Nouman [14] focused on SA of social media users' tweets written in the Jordanian dialect. After a sequence of pre-processing steps, the dataset was labeled as positive, negative or neutral. The study applied two supervised classifiers, Naïve Bayes and SVM, to the tweets. The conducted experiments examined different factors. The results show higher accuracy values using the SVM classifier. Results also show that using stems and root trigrams on balanced data enhances the accuracy.
In summary, Table 1 provides a comprehensive and comparative overview of the studied literature for the research from [5]-[14]. In addition to the literature summarized in Table 1, Saif M. Mohammad et al. [16] applied two different approaches to automatically generate several large sentiment lexicons. The first method used distant supervision techniques on Arabic tweets and the second translated English sentiment lexicons into Arabic using a freely available statistical machine translation system. The authors provide a comparative analysis of the new and old sentiment lexicons in the downstream application of sentence-level SA.
Rehab Duwairi and Mahmoud El-Orfali [18] approached SA in Arabic text from three perspectives. First, investigating several alternatives for text representation; in particular, the effects of stemming, feature correlation and n-gram models for Arabic text on SA. Second, investigating the behaviour of three classifiers, namely SVM, Naïve Bayes and K-nearest neighbour, with SA. Third, analyzing the effects of the characteristics of the dataset on SA.

METHODOLOGY
Recently, many researchers have devoted efforts to studying social media platforms [30]-[31]. The interest in studying social media is due to the rapid growth of its contents as well as its impact on people's behaviour [15]. A major part of these studies focused on SA and opinion mining.

Collecting Sentiment Lexicons
In the lexicon-based approach, considerable effort has focused on English sentiment lexicons [16], while little has been devoted to Arabic sentiment lexicons. Moreover, most of these efforts focused on solving specific problem statements. Arabic sentence flow is challenging for many reasons; for example, Arabic sentences make heavy use of negations, modals, intensifiers and diminishers. In addition, the Arabic language is very rich in prepositions, conjunctions, connected pronouns, object pronouns, demonstrative pronouns, relative pronouns, pronouns, paronomasia, insinuation and many other features that require a lot of work when computerizing Arabic language understanding. Table 2 shows Arabic language controls.
Negators change lexicon entries from negative to positive and the other way around. For example, the word (سعيد, happy) is a 100% positive sentiment word, but if it is preceded by any suitable negation, its polarity is reversed.
The same is valid for negative sentiment words preceded by any suitable negation; for example, (ضار) is a 100% negative sentiment word, but when negated as in (غير ضار، ليس ضاراً، لا يعتبر ضاراً), it becomes positive (though not necessarily; it may be neutral). The Arabic negations applied include (لم، لا، ليس، ما، مو، مش، دون، لن، غير). With diminishers, a word might be negative by itself, but in the sentence it gives a positive connotation, as in (كريم في الخيرات، بخيل في الإساءة الى الناس). Other examples are (عصي، طيع، شرس، مُر، حلو، لين). The Arabic language is rich in paronomasia, that is, words and sentences that carry several meanings, like (رقيق), which can mean (slave), (thin) or (gentle). Insinuations are close to paronomasias: the speaker may say positive words but mean negative ones, as in (اتقوا الله).
The approaches used to create Arabic sentiment lexicons can be broadly divided into three categories [32]. The first and most used approach is strongly based on the automatic translation of English sentiment lexicons and resources, either for all the entries in the lexicon or only for some parts. The second approach relies on choosing seed sentiment words and then identifying the words that occur alongside the seed words, using either statistical measures or simply using conjugates. The third approach involves human effort in manually extracting sentiment words, either from Arabic dictionaries or from datasets collected for Arabic SA, and subsequently labeling these words with their polarities (positive, negative, neutral) [17].
In this work, we created Arabic sentiment lexicons through the automatic translation of English sentiment lexicons and the manual extraction of sentiment words.Next, we describe each of these resources.

Manually Prepared Lexicon
Our work concentrates on the comments of Facebook users related to social, public issues. The researchers had to add more words and phrases to the available lexicons because of the dialect phrases and words used by commenters, as well as their use of negators, intensifiers, paronomasias and insinuations.
The Arabic Sentiment Lexicon comprises 333 negative phrases, 369 positive phrases, 4956 negative words and 2145 positive words, in addition to a large set of generated negations applied to the negative and positive words (39648 negative negations and 17160 positive negations). These negations were made by concatenating the applicable negations with the sentiment words and phrases. Figure 2 shows samples from the generated lexicons.
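The concatenation step described above can be illustrated with a short Python sketch. It is not the authors' Excel procedure: the negator list, the sample words and the function name `build_negated_lexicon` are assumptions chosen for illustration. As a side observation, the reported counts equal eight times the word counts (4956 × 8 = 39648 and 2145 × 8 = 17160), which would be consistent with eight negators per word, although the paper does not state this number.

```python
from itertools import product

# Hypothetical sample of negators; the paper gives examples of negations
# but does not state exactly which ones were concatenated.
NEGATORS = ["غير", "ليس", "لا", "مش"]


def build_negated_lexicon(words, negators=NEGATORS):
    """Return every 'negator word' phrase, duplicates removed, order kept."""
    phrases = (f"{neg} {word}" for neg, word in product(negators, words))
    return list(dict.fromkeys(phrases))


# Illustrative entries only, not taken from the published lexicon.
negative_words = ["ضار", "سيء"]
negated_negative = build_negated_lexicon(negative_words)

# 4 negators x 2 words = 8 generated phrases.
print(len(negated_negative))
```

Each generated phrase would then be stored in its own lexicon (negated negative or negated positive) so that it can be tagged before the bare word is.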

English Translated Sentiment and Emotion Lexicon
Many English lexicons have been translated into Arabic, but they are rarely free of mistranslations or of differing synonyms [33]-[34]. In this work, we used some lexicons from Saif M. Mohammad's collection [16], in which automatic translation of English sentiment lexicons into Arabic was used. The study in [16] reveals that about 88% of the automatically translated entries are valid for Arabic. Of the invalid entries, around 10% were the result of gross mistranslation, 40% occur due to translation into a related word and about 50% occur due to differences in how the word is used in Arabic. The translations were often the word representing the predominant sense of the word in the English source [16].
All those lexicons are very powerful in terms of Arabic sentiment words and would be helpful if researchers were mainly analyzing texts written correctly according to the Arabic linguistic structure. Unfortunately, most comments were written quickly, with misspellings and without careful prior thought. The Facebook post sought the opinions of users regarding services provided by Jordanian telecommunication companies (Zain, Orange and Umniah), and the subjects of this study who commented on the post seemed to write in a hurry and without concentration. Most of those subjects used their own words to describe the service or to talk about their own experience with the company, using sentences not always free of improvised words, slang vocabulary and inclusion. Saif M. Mohammad and his team [16] provided very large files that need a lot of work and validation before they can be applied. However, we used what seemed to fit and apply to the conducted experiments.

Formulating Labeled Dataset
Related Facebook comments were collected from the Facebook pages of the Jordanian telecom companies, along with some reviews from other related comments. Figure 3 shows some sample comments from Facebook regarding the services provided by the major telecommunication companies in Jordan.
Figure 3. Sample from the collected comments.
After loading and labelling the collected comments, the researchers applied some text pre-processing to them. Pre-processing is a vital step for obtaining sentiment-bearing text. The main tools used in this step are Microsoft Excel and KNIME. Microsoft Excel was used in preparing and proofing the lexicon words, concatenating negations with lexicon words, removing modals, prepositions, pronouns and intensifiers, as well as other text operations, like refinement and trimming of demonstrative pronouns.
Meanwhile, KNIME was used in the analysis and labeling. The main pre-processing steps performed are the following:
- Removing punctuation, including apostrophe, colon, comma, dash, ellipsis, exclamation point, hyphen, parentheses, period, question mark, quotation mark and semicolon.
- Replacing some letters and words with other letters or words that are known to the system.
- Tagging with the following priority: negative phrases, positive phrases, negated negative, negated positive, negative and positive. Note that after the Dictionary Tagger in KNIME has been applied to a phrase or word, its tag will not be changed. The researchers focused mostly on negative phrases and words, since this solution follows the lexicon-based approach to perform sentiment analysis. A recent and important work can be referred to in [2].
- Filtering the tagged words, so that only sentiment words are included in the counting of sentiments.
- Creating a bag of words that separates each group of sentiment words individually.
- Counting the frequency of each group.
- Calculating the result using Equation 1: if the value obtained from Equation 1 is greater than or equal to the mean (result), then the comment is POSITIVE; otherwise, the comment is NEGATIVE.
Figure 4 represents the previously mentioned steps for pre-processing, analyzing and labelling the collected comments, then formulating a labeled dataset.
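The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the KNIME workflow itself: the toy English lexicon entries, the polarity assigned to negated entries and, in particular, the `score` function (positive hits minus negative hits, standing in for Equation 1) are all assumptions made for the example.

```python
import string
from statistics import mean

# Tagging priority from the text: phrases first, then negated entries, then
# single words. All entries are toy English placeholders (assumption).
LEXICONS = [
    ("NEG", ["waste of money"]),   # negative phrases
    ("POS", ["well done"]),        # positive phrases
    ("POS", ["not bad"]),          # negated negative, treated as positive (assumption)
    ("NEG", ["not good"]),         # negated positive, treated as negative (assumption)
    ("NEG", ["bad", "slow"]),      # negative words
    ("POS", ["good", "fast"]),     # positive words
]

# Map Latin and common Arabic punctuation to spaces.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation + "،؛؟"})


def count_sentiments(comment):
    """Tag lexicon entries in priority order; a tagged span is never re-tagged."""
    text = " " + " ".join(comment.translate(PUNCT_TABLE).lower().split()) + " "
    counts = {"POS": 0, "NEG": 0}
    for tag, entries in LEXICONS:
        for entry in entries:
            needle = f" {entry} "
            while needle in text:
                # Replace the match with a barrier so later entries cannot reuse it.
                text = text.replace(needle, " | ", 1)
                counts[tag] += 1
    return counts


def score(counts):
    # Placeholder for Equation 1 (assumption): positive minus negative hits.
    return counts["POS"] - counts["NEG"]


def label_comments(comments):
    """Label each comment POSITIVE if its score is >= the mean score."""
    scores = [score(count_sentiments(c)) for c in comments]
    threshold = mean(scores)
    return ["POSITIVE" if s >= threshold else "NEGATIVE" for s in scores]


comments = ["Well done, fast internet!", "Not good: slow and bad.", "Waste of money"]
print(label_comments(comments))
```

Processing the lexicons in priority order and replacing each match with a barrier token mirrors the rule that a phrase or word, once tagged, is not tagged again by a lower-priority lexicon.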
In this research, we applied the KNIME Analytics Platform for analysis and labeling based on the dictionary tagger. KNIME is software built for fast, easy and intuitive access to advanced data science, helping organizations drive innovation. The KNIME Analytics Platform is one of the leading open solutions for data-driven innovation, designed for discovering the potential hidden in data, mining for fresh insights or predicting new features. Figure 5 shows the KNIME analyzing and labeling model.
The KNIME nodes, extensions and components of the model in Figure 5, along with their purposes, are as follows:
- The "Excel Reader (XLS)" node reads text data from Excel files. Here, we manually collected the dataset from three Facebook pages and inserted it into an Excel sheet, in addition to the other lexicon files.
- The "Strings to Document" node converts the specified strings into documents. The input is a data table containing string cells and the output is a table containing the strings of the input table as well as the created documents in an additional column.
- The "Column Filter" node filters certain columns from the input table and allows the remaining columns to pass through to the output table.
- The "Punctuation Erasure" node removes all punctuation characters from the terms contained in the input documents. The input is the table that contains the documents before pre-processing and the output is the table that contains the documents after pre-processing.
- The "String Replacer" node replaces and removes prepositions and the other punctuation previously listed. It replaces values in string cells if they match a certain wildcard pattern. Input is arbitrary data and the output column contains the SA (NEG, POS) of each data column (comment).
The SA results are calculated using Equation 1.
The labeling process in this manner was unsupervised. It depended on the collected data, either from Arabic lexicons or from English translated lexicons after refinement and filtering.

SUPERVISED MACHINE LEARNING
The workflow described in Section 3 resulted in labeled data (unsupervised). To test the accuracy of the proposed approach, we used the labeled dataset to formulate a Machine Learning model using ORANGE software to implement the classifier. Experiments were conducted on the 1300 comments to produce a supervised predictive model (PMML, Predictive Model Markup Language) using the PMML predictor. A text pre-processing step is necessary to remove all unnecessary and misleading words. Then, we ran the Machine Learning experiments on the pre-processed text. Figure 6 shows the workflow of the ML model implemented using ORANGE software.
After that, we tested the collected dataset using the SVM, NB and K-NN algorithms, since they are among the most commonly used algorithms in this context.
In summary, Figure 7 illustrates our proposed work, from unsupervised labeling using a lexicon-based approach through supervised learning verification of the labeled comments using the ML model.
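To make the supervised step concrete, the following is a small, self-contained sketch of one of the three classifier families (Naïve Bayes) trained on lexicon-labeled comments. It is not the ORANGE workflow used in this work: the `NaiveBayes` class, the whitespace tokenization, the add-one smoothing and the toy English training comments are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)      # class prior
            for word in text.split():
                if word in self.vocab:                # skip words unseen in training
                    logp += math.log((counts[word] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Toy comments standing in for the lexicon-labeled dataset (assumption).
train_texts = ["fast internet good offer", "good coverage",
               "slow internet bad service", "bad offer"]
train_labels = ["POS", "POS", "NEG", "NEG"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("good fast coverage"))
print(model.predict("slow bad coverage"))
```

Once trained on the labeled comments, such a model classifies new comments from their words alone, without consulting the lexicons, which is the purpose of the supervised stage.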

RESULTS
This paper has addressed SA in Arabic comments. First, we implemented the lexicon-based approach for identifying the polarity of the provided text. The lexicons were of two types: pure Arabic lexicons and refined, filtered English translated lexicons. The data samples are from local Jordanian people commenting on a public issue related to the services provided by the main telecommunication companies in Jordan. Second, we applied frequently used ML algorithms to the resulting labeled dataset for classification of comments in the absence of lexicons. The workflow in Figure 7 shows the whole process, starting with importing the data, continuing with labelling it and ending with classification and the SA results. The procedure involves applying a user-defined lexicon based on the common Facebook posts and comments used by Jordanians, which resulted in 60% positive comments and 40% negative ones. The total accuracy of lexicon-based labeling was calculated by comparing the achieved results with comments labeled manually by experts. The general accuracy of lexicon-based labeling was 98%. Higher accuracy values can be accomplished by adding more words and phrases to our lexicons.
Figure 8 shows a sample of the results. The SA column (F) is automatically produced using the KNIME nodes (Tag Filter node in Figure 5). Researchers may see the partial results comment by comment with their sentiments using the (Document view node). In Figure 8, and by referring to Figure 5, column A is the comment and columns B and C are the sentiments resulting from the (TF node). This data is grouped (Group By) and the columns are pivoted (Pivoting) to produce column D (total sentiments). The (Math formula node) calculates the result (E) and finally the (Rule Engine node) adds a new column (F), which is our targeted LABELING (sentiment of the comment: POS or NEG). Figure 9 shows a sample of unlabeled comments provided to the lexicon-based model, whereas the labeling results (SA) are shown in Figure 10. About 1300 comments were collected from Facebook pages and provided to the lexicon model for labeling. The accuracy achieved was 98%, based on the judgement of experts and on comparisons with other labeled comments. Unfortunately, this model is still restricted by the availability of the words or phrases in the lexicons. It is considered unsupervised learning that depends on a mathematical counting formula.
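The comparison with expert labels described above amounts to an agreement ratio between two label lists. The sketch below shows one way to compute it; the function names and the five toy labels are hypothetical and do not reproduce the data or results of this work.

```python
from collections import Counter


def accuracy(predicted, reference):
    """Fraction of items whose predicted label equals the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def confusion_counts(predicted, reference):
    """Count (reference, predicted) label pairs, e.g. ('NEG', 'POS')."""
    return Counter(zip(reference, predicted))


# Toy labels for illustration only (assumption).
lexicon_labels = ["POS", "NEG", "POS", "POS", "NEG"]
expert_labels = ["POS", "NEG", "NEG", "POS", "NEG"]

print(accuracy(lexicon_labels, expert_labels))
print(confusion_counts(lexicon_labels, expert_labels))
```

Inspecting the off-diagonal pairs, such as comments the experts marked NEG but the lexicon marked POS, is a simple way to find words or phrases that are still missing from the lexicons.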
On the other hand, researchers can use the resulting labeled dataset to build an ML model that will efficiently classify any newly posted comments, which is considered a supervised learning model. To perform the classification, we applied three classifiers, namely SVM, NB and K-NN, using the ORANGE software tool. The model for the three well-known classifiers and its details are shown in Figure 6. The accuracy results were very promising. Table 3 shows the accuracy results of those classifiers. Table 3 makes clear that all classifiers provided good results, but the SVM classifier was superior, since it is powerful when dealing with binary classification problems.

CONCLUSIONS
This paper has focused on SA of Facebook Arabic comments for Jordanian telecom companies. The output of our work was an Arabic Sentiment Lexicon, which comprises 333 negative phrases and 369 positive phrases. Besides, the researchers have collected 4956 negative words and 2145 positive words, in addition to a large set of generated negations applied to the negative and positive words. Most of the phrases and words came from the Jordanian dialect and MSA, in addition to the applicable sentiment words from the English sentiment lexicons translated into Arabic.
The researchers implemented the lexicon-based approach for identifying the polarity of each of the provided Facebook comments. Data samples are from local Jordanian people commenting on a public issue related to the services provided by the main telecommunication companies in Jordan (Zain, Orange and Umniah). The results of the evaluation of the Arabic sentiment lexicon were promising. When applying the user-defined lexicon based on the common Facebook posts and comments used by Jordanians, it scored 60% positive and 40% negative. The general accuracy of the lexicon was 98%. The lexicon was used to label a set of unlabeled Facebook comments to formulate a labeled dataset. The researchers then applied supervised Machine Learning (ML) algorithms that are commonly used in polarity classification to the formulated dataset. The results of the classification were 97.8, 96.8 and 95.6% for the Support Vector Machine (SVM), K-Nearest Neighbour (K-NN) and Naïve Bayes (NB) classifiers, respectively. It is worth noting that without applying Arabic grammar rules and Arabic sentence structure, any lexicon would fail in such a task because of issues related to the Arabic language.

FUTURE WORK
The formulated lexicons can be improved by adding new sentiment-related phrases and words, which will improve the accuracy and quantity of labeling. The paper highlights the need for a dedicated website for uploading lexicons and datasets collected by researchers in the field of NLP, which may be helpful in this context. Moreover, there are other fields in NLP that rely on the lexicon approach, so this work can be exploited in other tasks. To overcome some of the challenges of Arabic sentiment analysis, we are considering the use of resources such as SenticNet [23].
ORANGE is an open-source data visualization, machine learning and data mining toolkit. It features a visual programming front-end for explorative data analysis and interactive data visualization. It is a component-based visual programming software package for data analysis. ORANGE's components are called widgets and they range from simple data visualization, subset selection and pre-processing to empirical evaluation of learning algorithms and predictive modeling. In ORANGE, visual programming is implemented through an interface in which workflows are created by linking predefined or user-designed widgets, while advanced users can use ORANGE as a Python library for data manipulation and widget alteration.

Table 1. A comprehensive and comparative overview of the studied literature.