An Enhanced Framework for Sentiment Analysis of Students' Surveys: Arab Open University Business Program Courses Case Study

We present an enhanced framework for sentiment analysis that universities can use to improve student retention, teaching, and facilities. In addition, the proposed framework can serve as an important source for further analysis and improved decision-making. To the best of our knowledge, this is the first work that targets student comments within surveys. We believe that students’ comments are a good source for capturing overall student sentiment. Our framework achieves an accuracy of 0.8 when using 4-grams.


Introduction
Student surveys are a crucial tool for improving university teaching and facilities. Many universities, including the Arab Open University (AOU), conduct a mandatory survey every semester through their Learning Management System (LMS) to collect students' opinions.
The survey is designed to have two kinds of questions:
• Interval Scale Question (ISQ)
• Open-ended Question/Comment
With ordinary analytical tools, it is simple to summarize students' opinions from ISQ-type questions. Students' comments, however, are not quantitative data, so analyzing them is difficult and requires time and expert effort. Sentiment Analysis (SA) is the process of computationally or statistically identifying and categorizing the opinions expressed in a piece of text, especially to determine whether the writer's attitude towards a subject, topic, or category is positive or negative.
Using SA, we propose an enhanced framework for AOU that analyzes students' comments. Our objective is to analyze students' comments using SA techniques in order to improve the teaching and learning facilities of AOU, which in turn improves student retention.
The remainder of the paper is organized as follows. We first discuss existing sentiment analysis methods and techniques. We then present our framework for sentiment analysis of students' surveys, together with our experiments. Finally, we conclude and discuss future work.

Existing Sentiment Analysis Methods and Techniques
Sentiment analysis processes differ from system to system based on 1) the types of classes to predict (positive or negative, subjective or objective) and 2) the level of classification (sentence, phrase, or document level) [1][2][3]. In addition, sentiment analysis differs in terms of the language being processed. The authors of [4][5][6] proposed a system for subjectivity and sentiment analysis (SSA) of Arabic social media genres. The system deals with the rich Arabic language, which poses significantly more complexities than English.
Recently, social networks such as Facebook and Twitter have become popular, and they have emerged as a challenging domain in which people's natural language expressions are reported through short but meaningful text messages. Many studies have proposed techniques for sentiment analysis of social networks [7][8][9][10][11][12]. The main objectives of social network sentiment analysis are to better understand consumers' feelings towards a brand, to signal shifts in sentiment towards a brand, and to provide a better understanding of how a product or brand is perceived compared to the competition.
Although there are many different classifications for sentiment analysis, they are all based on the same concept. Figure 1 shows the overall framework. A training set is used to train the classifier. After the classifier is built, a test set is used to check its accuracy. The process of predicting the sentiment is shown in Figure 2, where a sentence is taken from a review and the classifier labels it as expressing either positive or negative sentiment.
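The train-then-test procedure described above can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used in this work: the helper names `split_train_test` and `accuracy`, the 20% test fraction, and the fixed random seed are all assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

# (comment text, label) with 1 = positive and 0 = negative sentiment
Example = Tuple[str, int]


def split_train_test(
    examples: Sequence[Example], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Example], List[Example]]:
    """Shuffle the labeled examples and hold out a fraction as the test set.

    The 20% default and the seed are illustrative assumptions.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(classifier: Callable[[str], int], test_set: Sequence[Example]) -> float:
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(1 for text, label in test_set if classifier(text) == label)
    return correct / len(test_set)
```

Any trained model that maps a comment to 0 or 1 can be passed as `classifier`; the accuracy is then the share of held-out comments it labels correctly.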

Proposed Framework
Most universities survey their students in order to develop goals and strategies, evaluate programs, and create a positive public image [13,14]. These surveys usually include an open-ended question in which students can write their comments. Unfortunately, students' comments are often not analyzed properly because they are written in natural language.
Our proposed method uses the previously accumulated datasets (students' responses) to build the classifier shown in Figure 1. Figure 3 shows the dataset, which represents students' responses to the survey. We first preprocess the data by classifying each comment according to the value of the "Overall student views on course" field: if the value is greater than 3, which is the midpoint of the scale for that field, the comment is considered to express positive sentiment (Classification=1). If the value is less than 3, the comment is considered to express negative sentiment (Classification=0). Figure 4 shows the data after the preprocessing phase. Our study is applied to the business program at the AOU Kuwait branch. Figure 5 shows the courses of the business program and their contributions to the dataset.
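The labeling rule above can be written as a short Python sketch. The function names `label_comment` and `preprocess` are hypothetical, and the text does not say how a value of exactly 3 is treated; dropping such rows here is an assumption of this example, not a description of the original preprocessing.

```python
from typing import List, Optional, Tuple

SCALE_MIDPOINT = 3  # threshold stated in the text


def label_comment(overall_view: int) -> Optional[int]:
    """Map the "Overall student views on course" value to a sentiment label."""
    if overall_view > SCALE_MIDPOINT:
        return 1  # positive sentiment (Classification=1)
    if overall_view < SCALE_MIDPOINT:
        return 0  # negative sentiment (Classification=0)
    return None  # value == 3: handling not specified in the text


def preprocess(rows: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Attach a label to each (comment, overall_view) row.

    Rows with an unspecified label are dropped (an assumption of this sketch).
    """
    labeled = []
    for comment, overall_view in rows:
        label = label_comment(overall_view)
        if label is not None:
            labeled.append((comment, label))
    return labeled
```

For example, a comment submitted with an overall view of 5 receives label 1, and one submitted with an overall view of 2 receives label 0.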
We used the GraphLab library [15] to train the classifier and calculate its accuracy. Table 1 shows the number of consecutive words (N-grams) versus accuracy [16]. As N increases, both the processing time and the accuracy increase; however, beyond a certain point the accuracy saturates. We chose 4-grams as the best setup for the classifier, as it achieves the best accuracy with the smallest N.
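To make the notion of N-grams concrete, the following sketch counts runs of N consecutive words in a comment using only the Python standard library. It does not reproduce the GraphLab feature extraction; the function name `word_ngrams`, the lowercasing, and the whitespace tokenization are simplifying assumptions.

```python
from collections import Counter


def word_ngrams(text: str, n: int) -> Counter:
    """Count every run of n consecutive words in the text.

    Tokenization is a plain lowercase whitespace split (an assumption).
    A text with fewer than n words yields an empty Counter.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.lower().split()
    return Counter(
        " ".join(tokens[i : i + n]) for i in range(len(tokens) - n + 1)
    )
```

A five-word comment contains two 4-grams, so larger N produces fewer but more specific features per comment, which is consistent with accuracy eventually saturating as N grows.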

Conclusion and Future Work
We presented an enhanced framework for sentiment analysis that can be utilized by universities. We applied our framework to the AOU Kuwait branch, specifically to the business program. We studied different settings for N-grams and found that 4-grams is the best setting in terms of performance and accuracy.
In this framework, we did not study other programs whose surveys contain Arabic comments [16]. It would be extremely interesting to extend the framework to include Arabic comments. This requires considerably more effort, as Arabic is written from right to left, has no capitalization, and its letters change shape according to their position.