The Transition From Intelligent to Affective Tutoring System: A Review and Open Issues

The growing use of computerized learning, accompanied by the rapid growth of information technology, has attracted a surge of interest in the research community. Consequently, several technologies have been developed to maintain and promote computerized learning. In this study, we provide an in-depth analysis of two prominent computerized learning systems, i.e., the Intelligent Tutoring System (ITS) and the Affective Tutoring System (ATS). An ITS is a training software system that uses intelligent technologies to provide personalized learning content to students based on their learning needs, with the aim of enhancing the individualized learning experience. Recently, researchers have demonstrated that the affective or emotional states of a student have an impact on his/her overall learning performance, which has introduced a new trend in ITS development termed ATS, an extension of ITS research. Although there have been several studies on these tutoring systems, none of them has comprehensively analyzed both systems, particularly the transition from ITS to ATS. Therefore, this study examines these two tutoring systems more inclusively with regard to their architectures, models, techniques, and approaches, taking into consideration the related research conducted between 2014 and 2019. A crucial finding of the study is that ATS can be a promising tutoring system for the next-generation learning environment when proper emotion recognition channels are combined with computational intelligence approaches. Finally, this study concludes with research challenges and possible future directions and trends.


I. INTRODUCTION
The day-to-day use of computers and the Internet has created endless opportunities for the online education community. It enables students to learn a subject through the massive supply of online learning resources from anywhere and at any time. For example, an online learner studying animal science can watch videos or even take virtual tours of a zoo without leaving their seat. Students seem less motivated by conventional methods of learning in which a teacher directs them to learn through memorization and recitation techniques, such as in classrooms. They are more drawn towards smart devices, i.e., computers, smartphones, and tablets. These devices are entering the education field at an increasing pace as new ways of learning, and they show promising improvements in achievement [1]. According to [2], [3], computerized learning is also synonymously referred to as online education, e-learning, computer-assisted learning, online systems, and computerized tutoring systems; in general, the term refers to any tutoring system that delivers teaching in a structured environment where learning is at least a peripheral element of the experience. However, in this study, we focus on two prominent ones, i.e., Intelligent and Affective tutoring systems.
The associate editor coordinating the review of this manuscript and approving it for publication was Youqing Wang.
An Intelligent Tutoring System (ITS) is a computerized learning system that attempts to mimic human tutors and provide personalized instruction to students (one-to-one tutoring) by capturing and analyzing students' characteristics, offering substantial learning gains [4]. To deliver learning in a personalized way, various learning styles have been used in ITSs. Providing a learning experience according to the preferred learning styles has shown promising outcomes and effectiveness [5]. Besides, due to their usability and versatility, ITSs have been studied in various domains, e.g., robotic-assisted surgery [6], zooarchaeology [7], and many more. However, researchers have begun to recognize that incorporating affective states into ITSs can lead to a significant improvement in learner performance as well as the learning experience [8], [9]. Analogously, recent studies such as [10], [11] found that open learner models (OLMs), which are developed mainly upon ITSs, mostly ignore emotion: they support cognition the most, and metacognition and motivation somewhat less. (An OLM is a means of visualizing the current knowledge or skill level, with the aim of helping learners learn more effectively by tracking, reflecting on, and pacing the learning process.) Hence, it is necessary to properly investigate affective intervention in tutoring environments to enhance learning outcomes and motivate learners [12], [13].
Research carried out in education and psychology shows that emotion and learning have a hidden mutual relationship, which can ultimately enhance learning performance [14]. Emotions play a significant role in human behavior, both individual and social, in any kind of human activity, including learning online [15]. Recently, researchers have acknowledged the role of emotion in online learning in improving learning outcomes and enhancing students' experience [16]-[18]. The significance of incorporating emotional states into the learning process has necessitated the development of ATSs, which extend ITS research with the ability to adapt effectively to the learner's adverse emotions so as to spark the learner's motivation to learn [19].
The connection between ITS and ATS is easily confused because both provide teaching responsively. Therefore, in this study, we survey the state-of-the-art studies of these two tutoring systems published between 2014 and 2019, intending to describe their architectures and how various models, approaches, and strategies can be used for designing and implementing these systems, including their advantages and disadvantages. A comprehensive taxonomy of the approaches used to devise these two systems is presented. Furthermore, we show how the transition from ITSs to ATSs came about and provide a critical explanation of why ATSs can be the new generation of computerized learning systems. Additionally, various research works are analyzed from the research and development perspectives, and the current research issues and challenges are presented along with possible solutions. The aim of this study is to assist novice researchers by giving them a better understanding of the concept of Intelligent and Affective tutoring systems, motivating them to undertake research on these systems, and helping them choose appropriate and efficient techniques to implement them. Besides, it also provides experts with a broader perspective for further exploration to mitigate the current research challenges.
There have been few review studies on these systems, e.g., [20] describes a trend of using artificial intelligence (AI) approaches in ITSs based on works published between 2008 and 2013 to show how these approaches enhance the performance and productivity of ITSs. However, the authors' focus is limited to investigating computational AI approaches for the student and tutor modules of ITSs. The other relevant approaches to devising an ITS are ignored in their study.
In another study, [21] describes a self-assessment manikin used to provide ground truth data for emotional states, but it lacks a detailed discussion of the recent technologies applied in ATSs. Our study follows a similar direction, but with a broader scope. To the best of our knowledge, this is the first study to present an in-depth analysis of these two promising computerized learning systems in terms of their architectures, models, techniques, and approaches, together with the challenges and future directions. For this task, 487 articles were collected from various Internet sources and digital libraries. The keywords of the considered articles were restricted to those relevant to Intelligent and Affective tutoring systems. Additionally, other relevant articles were carefully studied to strengthen the discussion of the approaches for designing these systems. Finally, 148 articles were chosen for the comprehensive analysis in this study. Fig. 1 depicts the methodology of our survey.
As mentioned, two types of articles were considered for review in this study. Type 1 articles concern Intelligent tutoring systems, focusing on their architectures, models, techniques, and approaches; Type 2 articles concern the same aspects of Affective tutoring systems. The search and selection procedure follows four steps:
Step 1 - Database Selection: It is necessary to choose renowned scientific libraries to ensure the authenticity and validity of scientific articles. Hence, digital libraries such as ACM Digital Library, IEEE Xplore, SpringerLink, ScienceDirect, and Web of Science (WoS) were scrutinized intensively for the articles reviewed in this study. Additionally, other relevant articles were searched for using Google services, for example, Google Scholar.
Step 2 - The Primary Search Process: First, the relevant keywords were identified, and the search process began from the selected databases. The selected keywords for Type 1 articles are ''intelligent tutoring system'', ''educational game and intelligent tutoring system'', ''tutoring system in education'', ''e-learning'', ''artificial intelligence in education'', and ''computational intelligence approaches in e-learning''. For Type 2, the chosen keywords are ''affective tutoring system'', ''e-learning and affective computing'', ''affective computing in education'', ''affective e-learning'', ''emotion in education/learning'', and ''emotion recognition''. It should be noted that only the above-mentioned keywords were included in this study. The entire search process was performed from January 2019 to November 2019.
Step 3 - Filtering Procedure: In this stage, articles were clustered by subject based on the results of the keyword search. They were ultimately clustered into two groups: Intelligent and Affective tutoring systems. Subsequently, each cluster was sub-clustered into four categories: architectures, models, techniques, and approaches. The primary reason for the clustering process is to investigate the two tutoring systems thoroughly.
Step 4 - Final Selection: A total of 487 articles were selected from reputed journals and conferences, together with major web references. The following criteria were mainly considered for the selection: 1) publication time (2014-2019), 2) the articles should come from either high-quality journals or WoS-indexed proceedings, and 3) prominent web references. However, a number of articles were removed from Type 1 and Type 2 after carefully studying the abstracts, either because they duplicated other articles included in the review or because they were outside the scope of this study. After studying other relevant articles within the scope, a total of 148 articles were finalized as the ultimate list of references for the current study.
The remainder of the article is structured as follows: Section II illustrates the survey on computerized learning systems with respect to ITS and ATS in consecutive subsections. Section III discusses the current research challenges and opportunities. Section IV exhibits a case study, and Section V discusses the future development trend. Section VI discusses the limitation of the study, and finally, the article is concluded in Section VII.

II. COMPUTERIZED LEARNING SYSTEM
Computerized learning refers primarily to the use of computers as a fundamental component of the educational environment. It makes use of the interactive elements of computer applications to enhance the overall educational and training experience. From the literature, among different computerized systems, we chose Intelligent and Affective tutoring systems because of their popularity and robustness in disseminating learning. In Fig. 2, a comprehensive taxonomy of the state-of-the-art approaches for devising these systems is provided, and the approaches are briefly described in the following subsections.

A. INTELLIGENT TUTORING SYSTEM
An intelligent tutoring system (ITS) delivers one-to-one teaching with the aim of personalizing instruction to help students improve their learning process [22]. A research study by [23] demonstrated that in 98% of cases, one-to-one instruction proved more effective, i.e., students learned two standard deviations better than with conventional teaching strategies. Hooshyar et al. [24] defined an ITS as ''a new generation of learning system which offers 'one-to-one' individualized instruction by stimulating activities of a human teacher, similar to one teacher to one student''. The main applications of an ITS involve personalized guidance, adaptation of the learning materials, and analysis of the learners' learning styles, applying several techniques to support a smooth teaching process. Multidisciplinary fields have benefited from using ITSs for learning purposes, e.g., mathematics [25], electronics [26], natural languages [27], and so on. Often, like a human trainer, a dialogue-based ITS performs continuous monitoring, makes appraisals, offers critiques, and so forth. Problem-solving support technology provides learners with intelligent assistance and implements problem-solving solutions. It can draw on the wisdom of teaching experts and teach students in accordance with their aptitudes. Besides conducting a thorough investigation of a student's present knowledge, the system further uses its embedded models to provide the learner with a better user experience [6]. According to Long and Aleven [28], the following features are necessary for an ITS to be effective:
• The teaching technique must provide step-by-step guidance for complex problem-solving tasks through visible demonstration with a problem-solving interface, hints, and feedback that enhance the user's thinking capacity.
• Based on the learner's current knowledge state, numerous techniques may need to be utilized to ensure individualized teaching in an ITS.
Different authors propose various architectures of ITS; however, according to [21], [29], [30], the traditional architecture of an ITS, illustrated in Fig. 3, comprises four modules:
• Domain module: The domain module provides the knowledge on a specific topic that must be taught or learned. It is responsible for generating and storing knowledge on a given subject.
• Student module: The student module consolidates all the fundamental data of the students concerning their learning progress, behavior, and mental attributes. It is additionally responsible for processing and storing the accumulated data about the students.
• Tutoring module: The tutoring module, otherwise termed the pedagogical module, determines the teaching and tutorial strategies. Besides carrying out the teaching procedure, this module is responsible for storing pedagogical knowledge.
• Interface module: This module provides productive interaction between the system and the students through numerous input/output devices.
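The division of responsibilities among the four modules can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed systems: the class names mirror the module names above, but the question-to-answer domain store, the success-rate statistic, and the "hint" threshold of 0.5 are assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DomainModule:
    """Stores the knowledge to be taught (assumed here: a question -> answer map)."""
    items: dict[str, str]

    def check(self, question: str, answer: str) -> bool:
        return self.items[question] == answer


@dataclass
class StudentModule:
    """Stores data about the student's learning progress (a correctness history)."""
    history: list[bool] = field(default_factory=list)

    def record(self, correct: bool) -> None:
        self.history.append(correct)

    def success_rate(self) -> float:
        return sum(self.history) / len(self.history) if self.history else 0.0


class TutoringModule:
    """Chooses a tutorial strategy from the student's recorded progress."""

    def next_action(self, student: StudentModule) -> str:
        # Hypothetical rule: give a hint when fewer than half the answers were right.
        return "hint" if student.success_rate() < 0.5 else "next_problem"


class InterfaceModule:
    """Mediates between the learner's input and the other three modules."""

    def __init__(self, domain: DomainModule, student: StudentModule,
                 tutor: TutoringModule) -> None:
        self.domain, self.student, self.tutor = domain, student, tutor

    def submit(self, question: str, answer: str) -> str:
        correct = self.domain.check(question, answer)
        self.student.record(correct)
        return self.tutor.next_action(self.student)


its = InterfaceModule(DomainModule({"2+2": "4"}), StudentModule(), TutoringModule())
print(its.submit("2+2", "5"))  # a wrong first answer triggers the hint rule
```

In this sketch the interface module never decides anything itself; it only routes the answer to the domain module for checking, to the student module for recording, and to the tutoring module for the next step.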
Researchers have proposed various approaches to devise ITSs, such as extending the traditional architecture of ITS [31] or using various computational intelligence approaches, e.g., the Bayesian network [29]. Furthermore, various teaching strategies, feedback types, and learning styles have also been utilized. From our survey (2014-2019), we identified the following state-of-the-art approaches for devising ITSs, according to the taxonomy (Fig. 2). These approaches are briefly discussed below.

1) COMPUTATIONAL INTELLIGENCE APPROACHES
In an ITS, the student module and the tutor module are two core, interdependent components. Various computational intelligence (CI) approaches have been adopted in ITSs to optimize the tutoring and learning activities. These approaches are used to provide tutorials and to support learning in a manner that is more natural, flexible, and robust, and to improve the overall performance of ITSs. From our survey, we identified the following state-of-the-art CI approaches, which are briefly discussed below. Furthermore, Table 1 describes the advantages and disadvantages of these approaches.

a: BAYESIAN NETWORK
In the e-learning situation, uncertainties are always involved, which may prevent systems from reasoning properly. A Bayesian network (BN) is a mathematical framework that uses probability theory to deal with such systems under uncertainty [32]. A directed acyclic graph (DAG), along with the relevant conditional probability distributions (CPDs), forms the basic skeleton of a BN architecture. In an ITS, for example, manually designing the DAG according to each specific concept within the problem domain makes it possible to track a student's knowledge rather than simply modeling the structure of the problem domain [33]. A BN is also well suited to modeling causal relationships, e.g., by combining prior knowledge with posterior data, a realistic estimate of the knowledge acquired by a student can be obtained [34]. BN reasoning rests on a rigorous body of mathematical theory, and the reasoning results are also highly reliable.

b: FUZZY LOGIC
[...] process by utilizing the relative graded membership function. Fuzzy logic is another popular technique that is used to reason with uncertainty and inaccurate information in the educational environment [36]. More precisely, in an educational tutoring environment, the tutee's knowledge is a moving target because the tutee's acceptance of new concepts changes while being taught, and the tutoring process also changes according to the knowledge the tutee has acquired. Thus, learning is not a straightforward process, and this uncertain, complicated process can be represented using a fuzzy model [37]. Fuzzy logic can increase the learning effectiveness of an ITS by offering adaptation and automatically modeling the learning or forgetting process of a tutee. For example, [38] adopts a rule-based fuzzy logic system to model the transitions of the tutees' cognitive state (assimilating, forgetting, and learning) by combining network concepts and fuzzy logic.
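To make the idea of graded membership concrete, the following Python sketch fuzzifies a tutee's test score into three overlapping knowledge levels. It is only an illustration under stated assumptions: the labels "low", "medium", and "high", the triangular shape, and the breakpoints are hypothetical and are not taken from [36]-[38].

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed fuzzy sets over a normalized score in [0, 1]; the feet at -0.5 and 1.5
# lie outside the score range so that "low" peaks at 0 and "high" peaks at 1.
KNOWLEDGE_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(score: float) -> dict[str, float]:
    """Return the degree to which a score belongs to each knowledge level."""
    return {name: triangular(score, *abc) for name, abc in KNOWLEDGE_SETS.items()}


memberships = fuzzify(0.7)
print(memberships)  # partly "medium" and partly "high", not "low"
```

A score of 0.7 belongs to two sets at once, which is how a fuzzy model can describe a tutee who is between knowledge levels instead of forcing a single crisp label.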

c: DATA MINING
Data mining can be defined as the process of finding information in an enormous homogeneous dataset using advanced algorithms. Because of the type and essence of the activity, data mining is often considered identical to knowledge discovery. Nowadays, educational data mining has emerged as a new discipline responsible for establishing a framework for exploring data from a gigantic amount of educational sources in order to help students better understand their topics and, at the same time, to set up a suitable environment for them. Various authors have adopted data mining techniques, e.g., Hooshyar et al. [39] used a novel data mining algorithm to automatically assess students' performance through procrastination behaviors by using their assignment submission data. A clustering algorithm (i.e., spectral clustering) is used to label the students into three candidates (non-procrastinator, procrastinator, procrastination). After that, a classification method classifies the students according to the criteria. Among numerous classifiers, neural networks outperformed the others with an accuracy of 96%. Also, Riofrio-Luzcando et al. [40] developed a model based on educational data mining by clustering the history of student logs; this model can predict the actions of new students with the aim of improving the tutoring feedback within an ITS.

d: ONTOLOGY
An ontology can be explained as the association of data, entities, and concepts within a domain of interest. It encompasses the properties and categories of data and concepts in a particular domain. The semantic web emerged when the application of ontologies merged with the web [41]. The study of ontology was first introduced by philosophers and has been widely used in information science in recent years. In the context of e-learning, there is no single model for the structured content or for the learner profile, which makes ontologies all the more essential [42]. Recently, ontologies and the semantic web have been widely used in ITSs, e.g., to represent natural language questions in a meaningful form and perform problem-solving without human involvement [43], to help teachers acquire both linguistic knowledge and practical skills using an ontology of lexical concepts [44], to design a concrete game-based ITS [45], and so forth.

e: COLLABORATIVE FILTERING
Collaborative filtering (CF) is a recommendation framework that analyzes a user's behavioral data to anticipate his/her interests and suggests new items accordingly. To generate better and more precise recommendations, CF relies on the primary data collected from the users and does not depend on additional information. CF approaches are more efficient when there is an identifiable trajectory in the user's behavior [46]. CF-based approaches fall into two main categories: (i) model-based approaches and (ii) memory-based approaches [47]. In the CF research domain, memory-based CF is the pioneer and is widely used for its simplicity and performance [47], [48]. Merging different algorithms with CF shows potential. For example, Bayesian Knowledge Tracing (BKT) is a popular user modeling method, but it rarely considers the situation of a user's first encounter with a knowledge component. To solve this issue, [49] combines CF with BKT and develops a novel student performance scheme.

f: SWARM INTELLIGENCE
Swarm intelligence (SI), or meta-heuristic-based algorithms, is a nature-inspired area of artificial intelligence that mainly focuses on the intelligent behavior of natural components. SI systems are based on the behavioral models of simple swarm agents that associate with one another within their environment [50]. These decentralized systems are typically self-organized and are often inspired by natural, especially biological, processes. Ant colony optimization (ACO), genetic algorithm (GA), whale optimization algorithm (WOA), particle swarm optimization (PSO), and so forth are some significant instances of swarm intelligence. These optimization algorithms have also been used to improve the performance of other computational intelligence algorithms, e.g., Wang and Liu [51] used particle swarm optimization to solve the Bayesian network's structure learning problem. Also, [52] used three optimization algorithms, namely ACO, GA, and PSO, with a feed-forward neural network to design an improved learning concept for tutors in an ITS.
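To show how a swarm of simple agents can search collectively, the following is a minimal sketch of particle swarm optimization in plain Python. It minimizes a simple continuous test function and is not the structure-learning method of [51] or the neural network training of [52]; the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the bounds, and the swarm size are commonly used values assumed here for illustration.

```python
import random


def pso(objective, dim, n_particles=30, iters=100, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, the particle's own memory, and the swarm's.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, best_val = pso(sphere, dim=2)
print(best, best_val)
```

No particle knows the shape of the objective; each only remembers its own best position and the swarm's best, which is the decentralized, self-organized behavior described above. Using PSO to tune another CI model amounts to replacing `sphere` with that model's error function.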

g: MACHINE LEARNING
Machine learning (ML) is a recently developed branch of artificial intelligence (AI) that supports building intelligent frameworks for handling problems in real-life scenarios. ML and AI are often mentioned in the same breath [53]. According to the definition by Popenici and Kerr [54], ML ''is a subfield of artificial intelligence that includes software able to recognise patterns, make predictions, and apply newly discovered patterns to situations that were not included or covered by their initial design'' (p. 2). According to the nature of the learning, machine learning can be categorized into three major classes: supervised learning, unsupervised learning, and reinforcement learning. Based on the operations performed, machine learning algorithms can also be divided into classification, clustering, and regression. Among many other applications, machine learning technologies have recently been used in the education sector too, e.g., developing a smart tutoring system [55], creating automated essay scoring in educational settings [56], implementing a better learning strategy by predicting the user's learning preference in an ITS [57], and so forth.

h: DEEP LEARNING
[...] Neural Network (RNN), and so forth are some frequently utilized deep learning algorithms. The concept of deep learning originated mainly from the study of artificial neural networks (ANNs) [58]. A further breakthrough happened in deep learning when Hinton proposed the deep belief network (DBN) [59]. Compared with conventional ML algorithms, DL algorithms have achieved far better results in many applications, e.g., computer vision, speech recognition, and natural language processing [60], [61]. Recently, researchers have been applying DL algorithms in educational settings too, e.g., [62] used a CNN to assess student writing in order to create a better picture of student knowledge, [63] used an ANN to construct a learner model in an adaptive learning system architecture for tracking student performance, [64] used an RNN to create deep knowledge tracing to predict students' performance in learning, and so on.
In Fig. 4, the number of related articles on the CI approaches per publication year is plotted using the WoS database. The WoS database was chosen because it acts as a gateway to all peer-reviewed Science Citation Indexed (SCI) and Social Science Citation Indexed (SSCI) articles. The aim of Fig. 4 is to show the categories of CI approaches used by researchers in various domains and to observe the trend. From the graph, it can be observed that, in almost every year, most of the research has been done in the area of neural networks. This could be due to the increasing demand for understanding complex patterns and for learning and modeling nonlinear relationships: in real-life scenarios, the relationship between input and output is complex, and neural networks are not restricted by the inputs provided to them and can also learn from examples. With the rapid development of hardware resources and computation techniques, researchers are now organizing grand challenges and crowdsourcing to harvest big data, both structured and unstructured, and leverage it in many applications. NN algorithms are mostly supervised learning algorithms, i.e., they need labeled data, and such massive datasets combined with the computational power of NNs show satisfactory performance. From this discussion, it can be concluded that NNs may be the most promising CI approach and merit further exploration.

2) TEACHING STRATEGIES
To make tutoring environments more interactive and to foster creative learning, it is necessary to employ appropriate teaching strategies [65], [66]. Suitable teaching strategies help learners achieve their academic goals. From the surveyed literature, the most frequently implemented teaching strategies appear to be the following.
Scaffolding: This teaching technique helps a learner get adequate support from a tutor in a specific domain to improve the learning rate [67]. Wood, Bruner, and Ross [68] introduced this term to describe the help given by an expert to a non-expert. Later, McKenzie [69] described that successful scaffolding must encompass a clear direction along with a precise purpose and expectation. Therefore, the scaffolding technique should be responsible for reinforcing on-task activity, providing better student direction, and increasing efficiency. It should also be capable of reducing uncertainty, disappointment, or surprise, besides ensuring palpable momentum. Various researchers have used the scaffolding teaching strategy, e.g., [70] used scaffolding to clarify problem-based learning, and [71] used it to scale academic writing.
Socratic Questioning: Socratic questioning (SQ) is another teaching approach, which conducts its teaching activities through a ''question and answer'' process. This teaching approach is structured as a progression of inquiries that are sequentially posed to a learner to answer. The basic questioning process is decomposed into several questions at different levels [72]. The technique starts with a question posed by the tutor, to which the student needs to respond. Based on the response the student provides, the tutor formulates a new question for the student, and this procedure continues until it arrives at the ultimate objective. For example, [73] used SQ to design and deploy a large-scale dialog-based ITS, and [74] analyzed a Socratic dialogue-based ITS to examine the relationship between tutor and tutee in tutoring.
Game-based learning: Game-based learning, or educational gaming, is a recent technology and a branch of serious games that aims to promote learning outcomes. Applying educational games in the learning process has the potential to improve the learner's learning experience [24], [75]. Online games used in education can bring the maximum outcomes for learners if they are designed for multiple players to play simultaneously [76], [77]. However, improving the efficacy of online games is still considered a significant challenge [78].

3) FORMATIVE FEEDBACK
Due to the variance in personal characteristics such as learning preference, prior knowledge, and learning progress, it is often necessary to provide adaptive support within the learning environment. Researchers have asserted in their studies that providing formative assessment during the learning process improves learning effectiveness and motivation more than a single summative judgment at the end [31], [79], [80]. Various characteristics of adaptive feedback, according to [81], are shown in Table 2.

4) LEARNING STYLE
A learning style is the way one learns best. It is based on individual characteristic preferences. Learners learn in various ways; hence, providing personalization according to their preferred learning styles would improve learning [82], [83]. According to the surveyed literature, the most preferred learning style models are described below.
Fleming's VARK model: This model is especially applicable in a self-paced e-learning course. Many researchers have used this model to facilitate learning [84]-[86]. The model has four learning styles (visual, aural, read/write, kinaesthetic) that describe how learners receive information, and one or more of these styles tends to be predominant. The instrument is composed of 16 explicit questions with their corresponding answers. Each answer corresponds to a particular style among the four distinctive learning styles. According to [87], the learning style is determined by the style that receives the maximum number of responses.
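The scoring rule described above can be sketched in a few lines of Python. This is an illustrative simplification, not the official VARK scoring procedure: it assumes exactly one chosen answer per question, encodes each answer by the single letter of the style it corresponds to, and uses a made-up response sheet.

```python
from collections import Counter

STYLES = ("V", "A", "R", "K")  # visual, aural, read/write, kinaesthetic


def vark_profile(answers: list[str]) -> tuple[Counter, list[str]]:
    """Count responses per style and return the predominant style(s)."""
    counts = Counter({style: 0 for style in STYLES})
    counts.update(a for a in answers if a in STYLES)  # ignore unknown codes
    top = max(counts.values())
    # More than one style can be predominant when the top counts tie.
    predominant = [s for s in STYLES if counts[s] == top and top > 0]
    return counts, predominant


# Hypothetical sheet for the 16 questions: one style letter per answer.
answers = ["V"] * 6 + ["K"] * 6 + ["A"] * 3 + ["R"]
counts, predominant = vark_profile(answers)
print(dict(counts), predominant)
```

Returning a list rather than a single letter reflects the statement above that one or more styles may be predominant for a given learner.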
Felder-Silverman model: This learning style model (FS) by [88] is popular among engineering students. It uses four distinct dimensions to describe how a learner receives and comprehends information: the perception dimension, the input dimension, the processing dimension, and the understanding dimension. Each student is placed along each dimension axis, according to their inclination towards a specific learning style, by the Index of Learning Styles (ILS) instrument of [89]. Low scores indicate a weak preference for a particular style, called a neutral preference, and such neutral learners are placed at the centre of the axis. According to [83], [90], this model shows promising results in facilitating the learning process.
MBTI model: The Myers-Briggs Type Indicator (MBTI) [91] is a straightforward model for facilitating learning and a more efficient process than many other personality theories. The MBTI questionnaire decomposes personality traits into four discrete spectrums: Extroverted (E)/Introverted (I), Sensing (S)/Intuitive (N), Thinking (T)/Feeling (F), and Judging (J)/Perceiving (P). Besides personality appraisal, it can be used to evaluate problem-solving styles, since the MBTI provides a solid foundation for correlating personality traits with problem-solving styles. In this respect, the problem-solving style can be characterized essentially by preferences along dimensions such as introversion, extroversion, or feeling. It exposes a learner's inclination to act individually or in a group when handling forthcoming issues. For example, unlike feeling-type learners, thinking-type learners make decisions objectively based on facts and logic rather than subjective values.

5) SUMMARY OF PAST RESEARCHERS' APPROACHES TO DEVISE ITS
Following the above discussion, we summarize below the approaches that different authors used to devise ITSs. Note that researchers used specific approaches in their studies and did not always report every detail; for example, [78] used feedback but did not mention which kind of adaptive feedback was used.
• Karaci [92] introduced a hybrid technique for establishing an intelligent tutoring system to teach punctuation in Turkish. This hybrid tutoring system combines fuzzy logic with a constraint-based student model (CBM). The CBM records the mistakes a learner makes during the learning process in order to provide the learner with immediate feedback or the necessary hints. The MYCIN certainty factor, along with fuzzy logic, determines the level of learning based on the time a learner takes to answer a particular question. The results show a potential improvement in individualized learning, achieved by eliminating a student's additional attempts, providing more hints, and removing unnecessary feedback using fuzzy logic.
• The student model is the basis for the personalization of an ITS. An important goal of the student model is course sequencing, which plays a major role because it determines the learning path of the student. Hence, the authors of [93] take ontology and Wikipedia information into consideration to determine the sequence of learning concepts. A data mining (specifically, text mining) algorithm is proposed that works on the structure and content of Wikipedia to determine the order of contents. To evaluate accuracy, the result was compared against that of domain experts. The Pearson test yielded a correlation of 0.664 between the algorithm and the experts, with a confidence level higher than 99%, which shows the feasibility of this approach.
• Pelánek and Jarušek [94] proposed that problem solving in an ITS should consider not only the correctness of answers but also timing information. Time-based problem solving encourages the student to provide immediate feedback to the system. Accordingly, the student model is built on an assumed linear relationship between problem-solving skill and the logarithm of time. The proposed student model draws on both item response theory and collaborative filtering. The model deliberately excludes a game-like environment, because students who compete with each other tend to solve problems by guessing rather than by following a strategic approach. Evaluation using student-stratified subsample cross-validation shows that the proposed model is appropriate for guiding adaptive behavior and providing feedback.
• Based on an assessment of a student's learning rate, Kaoropthai et al. [67] proposed a scaffolding strategy that supports students according to their skills. The scaffolding is expected to help learners reach a certain level with adequate skills before moving to the next level. An advanced data mining technique based on a two-step clustering (TSC) method was used within an ITS to establish an intelligent diagnostic framework (IDF) for teaching in an English class. An optimized diagnostic test was conducted on the results of 10 learners with a pass criterion of 75% (≥ 3 out of 4). The results demonstrated that some 56% of lead users scored equal or higher, and some 68% scored equal to or higher than in the pre-test.
• Rastegarmoghadam and Ziarati [5] developed improved student modeling within an ITS using a swarm intelligence approach, ant colony optimization (ACO). The experiment was carried out with a population of 50 ants over 2000 iterations; the course content was personalized following the VARK learning style, while the learning exercises were structured according to the MBTI model. The search algorithm was optimized through dynamic adjustment of the parameters so as to ensure that all potential paths are explored. The proposed model was structured to locate the optimum learning path and to improve learning proficiency by adapting to learners' traits and inclinations.
• A primary responsibility of an ITS is to decide what learners must be taught, by organizing assessments that measure the learner's cognitive level. To assist in organizing the learner knowledge assessment process, [29] proposed an evaluation module together with its workflow, including a detailed illustration of its integration into an ITS. The inference engine was developed based on a Bayesian network. Four tests were carried out: conceptual inference, testing question inference, time comparison, and determination of student knowledge. These assessments established that the proposed system enables students to respond 2.7 times faster than the conventional exam system, and that the concepts attributed to the students, whether known or unknown, have a 75.6% probability of being correct.
• To optimize the information retrieval process and to help students learn computer programming online, [78] developed an Online Game-based Bayesian Intelligent Tutoring System (OGITS). The interface comprises multi-player competitive educational board games, Tic-tac-toe and Snake & Ladder, with proper guidelines and feedback to help students understand programming logic and concepts. A Bayesian network is used to decide exactly which unknown concepts students should be referred to. Quantitative and qualitative information was analyzed in several stages, for example, encoding information, presenting themes, and building up and gathering information concerning themes. It was found that 75.4% of students agreed that OGITS supported their learning, and 74.1% and 75.3% acknowledged that OGITS improved their programming and online information skills, respectively.
• In [28], a comparison was carried out between a commercial equation-solving game named Dragonbox and a research-based ITS named Lynnette. Lynnette is a state-of-the-art ITS developed using the Cognitive Tutor Authoring Tools (CTAT). The authors' objective was to evaluate learning outcomes by comparing the two systems. Experimental results showed that Dragonbox helps students learn in a very enjoyable environment that enhances their problem-solving capacity, while Lynnette helps students perform considerably better in the post-test. The authors therefore concluded that a hybrid model combining the two would make learning both more enjoyable and more effective.
• An assessor module was developed by Zhang et al. [95] for an ITS to grade Chinese short answers automatically. This short-answer evaluation model combines both general and domain-specific information about the subject. A long short-term memory (LSTM) recurrent neural network is used alongside machine learning classifiers, e.g., naïve Bayes, logistic regression, decision tree, and support vector machine, to capture the sequence of the information. The viability of this model over existing automatic evaluation systems was confirmed through an experiment conducted with more than 7 comprehensive questions containing more than 16,000 short-answer patterns. The model likewise demonstrates the benefit of combining general and domain-specific information in the evaluation framework.
• Paquette and Baker [96] compared learning analytics, machine learning, and a hybrid approach that combines the two, through a case study on modeling ''gaming the system'' behavior. The comparison was conducted across three dimensions: accuracy of the model on the original data, interpretability, and generalizability of the model to new data and contexts. Results in terms of Kappa and the area under the curve (AUC) showed that developing the hybrid approach requires more resources even though it performs better across the three dimensions. The machine learning model, in contrast, demands fewer resources to develop, yet it is hardly interpretable.
• In [97], a Dynamic Bayesian Network (DBN) is proposed over Bayesian knowledge tracing (BKT) to accurately represent and predict student knowledge. A DBN is considered suitable for representing student knowledge because it can represent multiple skills concurrently in one model. The evaluation data originate from the logs of various intelligent learning systems such as Andes2, Calcularis, Dybuster, and Cognitive Tutor. These data sets cover different learning domains, such as arithmetic, biology, and physics, and a wide range of ages and classes from elementary school to university students. The experimental results in terms of root mean squared error (RMSE) and the receiver operating characteristic (ROC) curve show a significant improvement in prediction accuracy for modeling skill topologies over traditional student models.
• In conventional e-learning environments, the visualization and interpretation of a student's behavior is considered a challenge. The authors of [98] used students' logs to establish the architecture of a visualization model for a group of students in 2D/3D virtual environments, with the aim of improving procedural training. Their goal was to design visualizations that strengthen the tutoring tactics of an ITS by analyzing patterns through data mining approaches. The non-parametric Mann-Whitney U test was applied to students' final grades in biotechnology, bioinformatics, and biochemistry tests, and the results showed an improvement in the tutoring strategy through a reduction in the number of errors committed by students.
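Several of the student models summarized above maintain a probabilistic estimate of what the learner knows. As a minimal sketch, the following shows the standard Bayesian knowledge tracing (BKT) update for a single skill, i.e., the baseline that [97] compares against, not the DBN model of [97] itself. The parameter values (initial knowledge, learn, slip, and guess probabilities) are illustrative assumptions.

```python
def bkt_update(p_known, correct, p_learn=0.2, p_slip=0.1, p_guess=0.2):
    """One BKT step: condition on the observed answer, then apply learning."""
    if correct:
        evidence = p_known * (1 - p_slip)
        total = evidence + (1 - p_known) * p_guess
    else:
        evidence = p_known * p_slip
        total = evidence + (1 - p_known) * (1 - p_guess)
    posterior = evidence / total
    # The learner may acquire the skill after the practice opportunity.
    return posterior + (1 - posterior) * p_learn


def trace(observations, p_init=0.3, **params):
    """Return the knowledge estimate after each observed answer."""
    p_known, history = p_init, []
    for correct in observations:
        p_known = bkt_update(p_known, correct, **params)
        history.append(p_known)
    return history


# Hypothetical answer sequence: wrong, correct, correct, correct.
estimates = trace([False, True, True, True])
print([round(p, 3) for p in estimates])
```

A tutoring module could, for example, stop presenting exercises for a skill once the estimate exceeds a chosen mastery threshold; that threshold is a design choice not specified in the reviewed studies.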
The next section describes the transition from Intelligent Tutoring Systems to Affective Tutoring Systems, where the latter focus primarily on the use of affect and emotion.

B. AFFECTIVE TUTORING SYSTEM
From the above discussion, it is clear that the main objective of an ITS is to provide a learner with an automatic and cost-effective one-to-one tutoring environment.
Although, like a personal tutor, an ITS continuously interacts with the tutee and assesses the student's progress to enhance its effectiveness, researchers have pointed out that the main criticism of ITSs is that they are devoid of emotional awareness and empathy, which limits tutoring effectiveness [9], [102], [103]. Moreover, according to [100], this form of affective support is still not integrated into most computer-based systems; thus, the motivation and learning interests of tutees are adversely affected.
On this account, the hypothesis is that analyzing affect and emotion can augment the learning achievement of students. In the late 1990s, Rosalind Picard coined the term affective computing (AC) in her book [101], which gave direction to the evolution of the field. Picard described that cognitive and social processes, e.g., human intelligence, education, invention, and decision making, are strongly influenced by emotions. As a branch of Artificial Intelligence, AC designs systems and tools that can identify, process, and interpret human emotions.
The main objective of AC is to recognize the user's emotion from the emotional expression conveyed in any form (face, text, speech, etc.) and to interpret emotional states in order to make an optimal inference [102], [103]. Affective computing is showing promising results in human-robot interaction, which includes healthcare, entertainment, manufacturing [104], virtual reality [105], product recommendation [106], education [15], and many more. Note that affect and emotion may not be the same in subjective terms, but in computational linguistics both have the same application [107]. Emotions are considered individual experiences and responses that depend on the conditions in which they appear in human societies [108]. Activities such as cognition and thinking, decision making, resilience, and ways of communicating are greatly regulated by human behavior and reactions [109]. In this context, various studies have been conducted to gain an in-depth understanding of the interactions between emotions and learning. Psychologists [110] represented the impact of emotions on learning and success through a four-dimensional structure:
i. Positive activating emotions (e.g., enjoyment, fun, hope, and happiness) create strong learning motivation in students.
ii. Positive deactivating emotions (e.g., relaxation and relief) cause learners to relax temporarily during the learning process.
iii. Negative activating emotions (e.g., shame and anxiety) drive learners to devote themselves to solving problems in order to avoid failure.
iv. Negative deactivating emotions (e.g., hopelessness and boredom) make learners unwilling to overcome challenges encountered during learning.
Based on this discussion, emotion is not only a driving factor that promotes learning but also a primary factor that can hinder the learning process. Hence, it is crucial to have reliable methods of emotion recognition in academic contexts [111]. Recognizing emotions and affective states is useful for analyzing users' reactions, eliciting behavioral intentions, and creating reasonable responses. This makes it possible to develop an affect-aware system and improve its user interface in potential applications [112]. In conjunction with this, it is also necessary to study human emotions extensively in the context of e-learning [113]; thus, a trend has started to apply an emotional lens in emerging academic research and to position emotions as central to learning [114]-[116]. This has further given rise to the development of ATSs, a type of ITS with the ability to control learners' adverse emotions [103]. Analogous to the way human tutors detect and respond to the affect of their tutees to sustain their engagement, ATSs adapt to the tutee's affect autonomously to bring about enhanced learning outcomes. Most affective tutoring systems incorporate affect sensing, tutoring strategies, and learning progress tracking into a single environment. An ATS, for instance, can sense that students are frustrated and offer hints to resolve the impasse.
From the above discussion, it can be inferred that ATSs can be a potential learning environment for the next generation. The following sections, in accordance with Fig. 2, briefly discuss the various approaches (2014-2019) used to formulate ATSs:

1) ARCHITECTURE OF AN AFFECTIVE TUTORING SYSTEM
To the best of our knowledge, no generic architecture of an ATS can be found in the literature. Based on the descriptions in [117], [118], we describe an ATS architecture (see Fig. 5) that comprises an affect perception module, a student subsystem module, and a tutoring subsystem module.

a: AFFECT PERCEPTION MODULE
This module acquires and processes affect from raw data supplied by various emotion recognition channels, such as a web camera for facial expressions, keyboard and mouse for keystrokes, written text for textual emotion recognition, and a vocal sensor for speech emotion recognition.

b: STUDENT SUBSYSTEM MODULE
This subsystem consists of an affect inference module, which infers the affect of the student, and a student action module, which regulates the inferred affect to formulate an appropriate tutorial strategy.

c: TUTORING SUBSYSTEM MODULE
The inferred affect from the affect inference module and the historical interactions from the student action module are passed to the tutoring subsystem module. According to the learning status database and the learning assessment, the teacher offers appropriate assistance to control the tutee's learning status.
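The data flow between the three modules can be sketched as follows. This is a simplified illustration under stated assumptions, not the architecture of [117] or [118]: the class names, the per-emotion score dictionaries, the additive fusion of channels, and the strategy rules are all hypothetical. Only the "offer a hint when the student is frustrated" behavior is taken from the discussion of ATSs above.

```python
class AffectPerception:
    """Affect perception module: keeps the channels that delivered cues."""

    def perceive(self, raw_channels):
        # raw_channels maps a channel name to per-emotion scores (assumed format).
        return {ch: scores for ch, scores in raw_channels.items() if scores}


class StudentSubsystem:
    """Student subsystem: affect inference plus a log of student actions."""

    def __init__(self):
        self.history = []

    def record_action(self, action):
        self.history.append(action)

    def infer_affect(self, cues):
        totals = {}
        for scores in cues.values():
            for emotion, score in scores.items():
                totals[emotion] = totals.get(emotion, 0.0) + score
        # Assumed fusion rule: the emotion with the highest summed score wins.
        return max(totals, key=totals.get) if totals else "neutral"


class TutoringSubsystem:
    """Tutoring subsystem: maps affect and interaction history to a strategy."""

    def select_strategy(self, affect, history):
        recent_errors = sum(1 for a in history[-3:] if a == "wrong_answer")
        if affect == "frustration" and recent_errors >= 2:
            return "offer_hint"
        if affect == "boredom":
            return "raise_difficulty"  # hypothetical rule, for illustration only
        return "continue"


def tutoring_step(raw_channels, action, perception, student, tutor):
    """Run one pass: perceive cues, infer affect, and choose a strategy."""
    student.record_action(action)
    affect = student.infer_affect(perception.perceive(raw_channels))
    return affect, tutor.select_strategy(affect, student.history)
```

Keeping the three modules behind small interfaces mirrors the loosely coupled design reported for some ATSs, so that an emotion recognition channel can be replaced without touching the tutoring rules.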

2) EMOTIONAL MODELS
Emotional models organize human emotions in terms of scores, ranks, or dimensions. Current emotional models characterize different sorts of emotions according to duration, appraisal elicitation, behavioral affect, intensity, and quickness of change [119]. Researchers have used various emotional models; e.g., [117] used Ekman's model of six basic emotions in their study. Table 3 summarizes the emotional models.

3) EMOTION RECOGNITION IN ATS
Affect can be measured through different emotion recognition channels. Emotion recognition is a fully or semi-automated method of analyzing human emotion based on individual skills and interpretation. Emotions can be conveyed directly or indirectly through facial expressions, voice, writing, or signs, all of which can be used to detect emotion, as shown in Fig. 6. To develop an ATS, researchers have used either a single channel or a combination of channels to identify learners' affect. Since it is not possible to cover all emotion recognition channels, this study describes the ones most commonly found in the literature. For other emotion recognition channels, one can refer to [127]-[129].

a: FACIAL EXPRESSION RECOGNITION
Facial expression recognition is one of the most challenging and sought-after tasks in social communication. Different facial expressions convey different non-verbal messages, and facial expressions are directly linked to human emotions and intentions, which has made facial expression recognition (FER) a prominent research topic. According to Huang and Wang [130], FER consists of three stages.
• Pre-processing: Pre-processing is carried out to ensure the precision of the feature extraction process, because the performance of the FER system depends largely on it. To improve the expression frame, techniques such as contrast adjustment or image scaling are applied to the images.
• Feature extraction: In FER, feature extraction follows pre-processing and deals with the discovery and representation of the decisive features of an image. This is a very significant stage of the FER system because it extracts the implicit information from the graphical content of an image, which is then used as the input for classification. Feature extraction approaches can be grouped into different classes: (i) global and regional feature-based, (ii) path feature-based, (iii) edge-based, (iv) texture feature-based, (v) geometric feature-based, and (vi) appearance feature-based methods.
• Classification: Classification, the final stage of the FER system, categorizes facial expressions into emotional states such as smile, sadness, surprise, anger, fear, disgust, and neutral. Commonly used classification algorithms include Support Vector Machine (SVM), Logistic Regression (LR), Naïve Bayes (NB), Decision Trees (DT), Hidden Markov Models (HMM), and Random Forest (RF). Deep learning algorithms such as ANN, CNN, and RNN are likewise used to perform the classification task quickly and precisely.
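The three FER stages can be illustrated with a deliberately small, dependency-free sketch. It is not the method of any reviewed study: contrast stretching stands in for pre-processing, mean intensities over a coarse grid stand in for regional features, and a nearest-centroid rule is swapped in for the SVM-style classifiers named above. The 4x4 "images" and their labels are synthetic placeholders.

```python
def stretch_contrast(img):
    """Pre-processing: rescale pixel intensities to the range [0, 1]."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1
    return [[(p - lo) / span for p in row] for row in img]


def region_means(img, grid=2):
    """Feature extraction: mean intensity of each cell in a grid x grid layout."""
    h, w = len(img), len(img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * h // grid, (gy + 1) * h // grid)
            cols = range(gx * w // grid, (gx + 1) * w // grid)
            cell = [img[y][x] for y in rows for x in cols]
            feats.append(sum(cell) / len(cell))
    return feats


def fit_centroids(samples):
    """Training: average the feature vectors of each label."""
    sums, counts = {}, {}
    for feats, label in samples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def classify(feats, centroids):
    """Classification: pick the label whose centroid is closest."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def recognise(img, centroids):
    return classify(region_means(stretch_contrast(img)), centroids)


# Synthetic placeholders: a bright lower half is labelled "smile",
# a bright upper half "surprise".
smile = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
surprise = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]
centroids = fit_centroids([(region_means(stretch_contrast(img)), lab)
                           for img, lab in ((smile, "smile"), (surprise, "surprise"))])
noisy = [[20, 30, 20, 30], [25, 20, 30, 20],
         [180, 190, 185, 170], [175, 180, 190, 185]]
print(recognise(noisy, centroids))
```

In practice, each stage would be replaced by a stronger component (e.g., face alignment, texture or geometric descriptors, and one of the classifiers listed above), but the order of the stages remains the same.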

b: TEXTUAL OR SEMANTIC EMOTION RECOGNITION
With the rise of social networks, text has become the most common form of communication. Textual emotion detection and sentiment analysis are often treated as synonyms, but they have different applications. According to Yadollahi et al. [130], three methods are generally used for textual emotion recognition.
• Keyword-based: Keyword-based emotion detection is the most intuitive and straightforward approach. It finds patterns similar to emotion keywords and matches them accordingly. The primary task is to find the words that express emotion in a sentence. To do this, a part-of-speech tagger tags the words of a sentence, and the Noun, Adjective, Verb, and Adverb (NAVA) words are extracted. The extracted words are then matched against a list of words, called a keyword dictionary, to represent emotion according to a specific emotion model. Many online resources, such as WordNet, are available for building the dictionary.
• Machine learning-based: Textual emotion can also be detected using machine learning methods. Both supervised and unsupervised approaches can be employed; a supervised approach requires an annotated emotion dataset for training and testing, and SVM, NB, and DT are some commonly used algorithms. An unsupervised approach discovers internal patterns to classify emotions without requiring labeled data. Classification starts with a set of specific seed words for each emotion, which are then cross-referenced with the sentences, and each sentence is assigned to the matching emotion.
• Hybrid method: To maximize accuracy in emotion detection, researchers have begun to use hybrid approaches comprising two or more of the previously described methods. This method shows improved outcomes compared to the other two methods.
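The keyword-based method can be sketched in a few lines. This is a simplified illustration: the lexicon `EMOTION_LEXICON` is a tiny hypothetical dictionary, and the part-of-speech tagging step that extracts NAVA words is omitted, so every token is looked up directly. Negation ("not happy") is not handled.

```python
import re
from collections import Counter

# Hypothetical keyword dictionary mapping words to Ekman-style emotion labels.
EMOTION_LEXICON = {
    "happy": "happiness", "glad": "happiness",
    "sad": "sadness", "unhappy": "sadness",
    "angry": "anger", "annoyed": "anger",
    "afraid": "fear", "worried": "fear",
    "disgusting": "disgust",
    "surprised": "surprise",
}


def detect_emotion(sentence):
    """Return the emotion whose keywords occur most often, or 'neutral'."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    hits = Counter(EMOTION_LEXICON[t] for t in tokens if t in EMOTION_LEXICON)
    return hits.most_common(1)[0][0] if hits else "neutral"


print(detect_emotion("I am so annoyed and angry about this exercise"))
print(detect_emotion("The lecture starts at nine"))
```

A hybrid method in the sense described above could use such keyword hits as additional features for a supervised classifier instead of as the final decision.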

c: SPEECH EMOTION RECOGNITION
Speech is another significant channel for emotion recognition and, in some cases, becomes essential for portraying emotion regardless of the use of facial expression and text. Like facial expression recognition, it also consists of three stages [119].
• Database: Databases for speech emotion recognition can be populated either through simulation or in a natural way. Simulated databases consist of speech in different emotions voiced by skilled actors. The voice is recorded in a noise-proof environment to ensure its originality.
• Features: Feature selection is considered the most important part of an emotion recognition system, as the efficiency of the emotion categorization task depends on the selected features. Features can vary in type, as affirmed in the literature. Although individual feature types such as spectral or prosodic features are very popular, prosodic and spectral features are sometimes combined to train the model.
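To make the distinction between prosodic and spectral features concrete, the following sketch computes one simple example of each for a single audio frame and concatenates them into a combined feature vector. The choice of RMS energy (an energy-related prosodic cue) and the spectral centroid (a spectral cue) is an assumption made for illustration; the reviewed studies may rely on other features. The test signal is a synthetic 440 Hz tone, and the naive discrete Fourier transform is only suitable for short frames.

```python
import cmath
import math


def rms_energy(frame):
    """Prosodic-style feature: root-mean-square energy of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency (naive DFT)."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total else 0.0


def extract_features(frame, sample_rate):
    """Combine the prosodic and spectral features into one vector."""
    return [rms_energy(frame), spectral_centroid(frame, sample_rate)]


sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(400)]
features = extract_features(frame, sample_rate)
print(features)
```

Feature vectors of this kind, computed per frame or per utterance, would then be passed to a classifier.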

d: PHYSIOLOGICAL EMOTION RECOGNITION
In the literature, physiological signals also play an important role in emotion recognition; emotions can be inferred from different sources such as biometric data [131], [132] and EEG signal measurements [133], [134]. As with the channels discussed above, these approaches follow the same steps, such as data collection, pre-processing, and computational intelligence methods, to infer emotions.
Table 4 discusses the advantages and disadvantages of the emotion recognition approaches described above. In addition, Fig. 7 charts, based on the WoS database, how often these approaches were used in the literature from 2014 to 2019. The chart shows that facial expression recognition was used most, followed by speech, text, and physiological recognition.

4) SUMMARY OF PAST RESEARCHERS' APPROACHES TO DEVISE ATS
Following the above discussion, we summarize below the different approaches used by researchers to devise ATSs.
• An emotional design tutoring system (EDTS) was developed by [117] to examine whether the proposed system encourages users to interact at a satisfactory rate in order to promote learning. Their main objective is to analyze students' learning progress by compiling the students' learning emotions. Negative emotions may arise, for example, when the subjects are complicated or the course materials are not clear. Facial and textual emotion recognition are both taken into consideration, with an SVM (support vector machine) classifier and the SeCeVa (Semantic Clues Emotion Voting Algorithm) classifier used for facial and semantic emotion classification, respectively. Ekman's model was chosen as the emotional model. The EDTS tracks variations in students' emotions and relays them to the instructor quickly, and the instructor offers appropriate assistance to improve the learning process. In the experimental evaluation, the learning data of two groups (an emotional design group and a control group) were analyzed on a pre-test and post-test using descriptive statistics, and the results showed that the EDTS offers an efficient procedure for both students and teachers to ensure the maximum learning effect as well as satisfaction.
• Fwa [118] hypothesized that the effectiveness of a tutoring system could be increased by incorporating empathy into it. Accordingly, the author detailed the structure, usage, and assessment of an ATS in the field of computer programming; the architecture is loosely coupled and can be used for other domains as well. Facial expression cues are used for affect sensing. A Mann-Whitney test between two groups (affective and non-affective) showed that the time taken to finish each exercise and the number of tasks attempted were much greater for the students in the affective group, who used this ATS. However, some students showed discomfort with the monitoring of their actions via sensors such as a web camera.
• Lin et al. [135] describe the design and development of an affective instruction system that raises students' learning interest by adjusting the assessment to provide adequate feedback during Japanese language teaching. Facial and semantic emotion recognition techniques are used for emotion analysis with Ekman's model of six basic emotions. To assess usability, two groups of students (an experimental and a control group) were recruited, and a System Usability Scale (SUS) test was conducted. The statistical results show a possible improvement in learning motivation and outcome.
• To form a humanized interactive mechanism, [102] developed an ATS for teaching accounting that lets learners optimize the learning process by conveying emotional expressions to the system while the system provides adequate instruction. The proposed system learns students' emotions from both textual and facial expressions to reinforce the learning process. The authors argued that both positive and negative emotions promote the learning process, but in different ways. To measure learning effectiveness, a pre-test and a post-test were conducted; the post-test (mean score M = 78.78) showed a substantial increase of 19.42 with an effect size of 0.93, which indicates that learning with the ATS is considerably more effective than conventional teaching methods. However, this system did not overcome the spatial and temporal limitations, i.e., it cannot be applied to other educational domains.
• Spoken dialogue frameworks very often use the partially observable Markov decision process (POMDP) model, which has repeatedly proved more suitable than other models found in the literature. Along these lines, [136] proposed modeling an ATS on a new factored POMDP model to enhance learning effectiveness. The proposed system has two response components, an emotion response and a goal response, which respond to the student's emotions and goals, respectively. Five simulation analyses using the point-based value iteration (PBVI) algorithm were designed to assess the impact of key parameters on system performance, and the outcomes showed that the proposed model is reasonable and feasible for an ATS.
• Bahreini et al. [137] present a voice emotion recognition component of FILTWAM (Framework for Improving Learning through Webcams and Microphones), a real-time emotion recognition software artifact that uses a microphone to support communication skills training in e-learning settings such as an ATS. Having tested the facial expression recognition part in a previous study, the authors focus here on voice emotion recognition. Ekman's six basic emotions are used together with Sequential Minimal Optimization (SMO) for classification. The study demonstrates agreement between the raters and the software, with a Kappa value of 0.743, and the overall accuracy of the software artifact is 67%. However, this system is only designed for English speakers.
• An affective tutoring framework for environment management (ATEN) was proposed by Kaklauskas et al. [138] to manage stress and to increase the productivity of a student. This system offers an automatic function to prepare customized learning materials for each student based on a specific topic. The system detects emotional expression from both textual and biometric data. According to the experimental results, the proposed system demonstrates its learning effectiveness by providing adequate learning materials according to students' preferences.
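Agreement between human raters and recognition software, as reported through the Kappa value above, is commonly quantified with Cohen's kappa, which corrects the observed agreement for agreement expected by chance. The sketch below implements the standard two-rater formula; the label sequences are made-up examples and do not reproduce the data of any reviewed study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the marginal label proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations by a human rater and by the software.
rater = ["happy", "happy", "sad", "sad", "angry",
         "angry", "happy", "sad", "angry", "happy"]
software = ["happy", "happy", "sad", "angry", "angry",
            "angry", "happy", "sad", "sad", "happy"]
print(round(cohens_kappa(rater, software), 3))
```

In this made-up example the raw agreement is 0.8, while kappa is lower because part of that agreement would be expected by chance.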

III. RESEARCH CHALLENGES
The above discussion suggests that further research could be conducted on emotion regulation, computational intelligence approaches (particularly neural networks), and the transfer of computerized tutoring to lightweight mobile-based tutoring. Furthermore, designing these systems is costly and requires evaluating every module step by step. Indeed, these challenges constitute the main practical implication of the present study. In addition, we have identified the following open issues and challenges that future research needs to address.

A. IDENTIFY EMOTIONS FOR DISABLED PEOPLE
From the above discussion (Section 2.2), it can be understood that emotion or affect plays a vital role in human cognition. To design an ATS, it is necessary to incorporate an affect perception module for inferring emotions accurately, which relies on body movements, language, hearing, vision, and more. However, it is quite challenging to identify the emotions of disabled students because of their physical impairments. A possible solution may lie in designing these systems so that they accommodate those impairments adequately. For instance, biomedical sensors can be used to design a user-friendly interface that transmits the information appropriately for physically disabled people. This issue is indeed a significant challenge and calls for careful investigation.

B. PRIVACY CONCERN
The architecture of an ATS implies that the affect perception module is vital, since it incorporates empathy into the system to enhance its tutoring effectiveness. However, monitoring students' actions via sensors or any other means carries the possibility of disclosing records, which may lead to privacy issues, and the monitoring sometimes creates discomfort among students too [118]. One suggestion is to let students opt in to or out of the monitoring, but opting out also means that the empathetic tutoring response is disabled. Resolving this conflict between privacy and usability can be a potential area for future research. Using end-to-end encryption, securing passwords, or protecting the information with Internet security measures to evade tracking may help address the issue. Also, it has to be legally justifiable for an academic institution to make student data available to a third party, including employers, and proper consent must be obtained.

C. COMBINING SEVERAL APPROACHES
From the discussion of past researchers' work on devising ITSs, the authors of [28] proposed that combining an educational game with an ITS can be a promising way to make learning more interesting, engaging, and effective. Note, however, that adding game elements to the tutoring system sometimes has a negative impact as well; e.g., [94] mentioned that, because of the gaming environment, students competing with others sometimes do not follow a strategic way of solving problems and simply complete the answers by guessing. At the same time, the literature reviewed above made it clear that identifying students' adverse emotions or affect and providing the necessary feedback according to the impasse can significantly increase students' learning performance. Thus, formulating an ITS that combines these approaches can be a potential future research endeavor.

D. COMPUTATIONAL INTELLIGENCE APPROACHES
From the discussion of ITSs, we saw that various computational intelligence approaches (Section 2.2.1) have been used to model the student/tutor module. Among them, Bayesian networks, fuzzy logic, swarm intelligence, and neural networks seem promising. However, balancing computational cost, robustness, performance, and other related parameters against existing methods still needs more research effort. For example, among the studies discussed, [97] used a Dynamic Bayesian Network (DBN) instead of a Bayesian Network (BN), and the DBN showed potential improvement in student modeling. Also, neural networks (NNs) are quite useful for understanding complex patterns. However, they need more resources than traditional machine learning algorithms. Moreover, increasing the number of hidden nodes in NNs can cause these algorithms to get trapped in local optima [139]. In this case, [140] used PSO to avoid this problem. These SI-based optimization algorithms have been used widely, including within neural networks and other machine learning algorithms, to obtain optimized output; e.g., [141] used GA with CNN to improve the precision and lower the computational cost of the CNN. Also, [142] investigated four computational intelligence algorithms, i.e., ANN, GA, ACO, and PSO, to improve the precision of automatic learning style identification for an ITS. Besides, extended versions of the previously discussed CI algorithms, e.g., fuzzy cognitive maps, fuzzy decision trees, and unsupervised NN algorithms, can also be investigated further to improve the performance of these approaches.
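To make the role of PSO more concrete, the following is a minimal sketch of a global-best particle swarm optimizer minimizing a generic objective. It is not the method of [140]: the inertia weight, acceleration coefficients, search bounds, and the toy sphere objective are illustrative assumptions. In a tutoring context, the objective could be, for instance, a network's validation loss as a function of its weights. Note that PSO reduces the risk of stalling in a poor local optimum but does not guarantee a global one.

```python
import random


def pso(objective, dim, n_particles=20, iters=100, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO (assumed hyperparameters); returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position so far
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best so far

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with a single minimum at the origin."""
    return sum(v * v for v in x)
```

Calling `pso(sphere, dim=3)` returns a position close to the origin; replacing `sphere` with a model's loss function is the step that would connect this sketch to NN training.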

E. RESEARCH ON AFFECTIVE TUTORING SYSTEM
The statistics (Fig. 8) taken from WoS clearly show that, compared to ITSs, ATSs are rarely implemented. This may be because ATSs encompass cross-disciplinary knowledge spanning the domains of education, psychology, and computer science. Also, from the discussion of ATS formulation approaches, it can be understood that CI approaches are used far less often in ATSs than in ITSs. Devising an ATS with a proper emotion recognition channel along with CI approaches can open a new era of tutoring systems to investigate. Furthermore, from the psychological perspective, 'learning by cognition' is more productive than 'learning by example'. Hence, exploring ATSs can be a potential area for future research.

F. BUDGET ISSUES
To devise a tutoring system (either ITS or ATS), a variety of technologies are used, which involves a huge cost because every module of a tutoring system needs to be tested and evaluated before completion. Additionally, there are some hidden costs, such as the salaries of the designers, consultation with course experts, and so on. Further, as noted in point E, ATSs have rarely been implemented because they span education, psychology, and computer science. In the discussion of ATS formulation approaches above, the authors of [135], [137] mentioned that their systems could not be integrated into other educational domains. For this research challenge, a feasible solution could be to design a tutoring system using appropriate, diverse technologies and techniques. It should be domain-independent so that it can be deployed in other educational tutoring domains as well, improving usability and versatility.

G. SELECTION OF CLOUD SERVICE
To design these tutoring systems, resource allocation is a demanding requirement: resources must be maintained and made available to a particular user. Scheduling, i.e., the process of utilizing the available resources efficiently, is also necessary. Cloud computing can be a potential solution for these issues. The primary purpose of cloud computing is to allocate on-demand resources precisely and to reduce the overhead of maintaining computing resources.
Since cloud computing offers different services, it has to be determined which service proves more useful to educational institutions in terms of productivity and budget. It is a challenge to set up cloud infrastructures and services within e-learning systems. The challenges may include developing appropriate and easy-to-integrate lecture and lab materials, meeting special software and hardware requirements, setting up and managing a large number of student accounts, and addressing security and privacy issues. Due to its presently limited scope, little research has been carried out to explore the use of cloud computing for designing these tutoring systems. Thus, adapting cloud technology for e-learning will be a potential future endeavor to explore.

H. AVAILABILITY OF COMPUTERIZED LEARNING ON MOBILE PLATFORMS
Information from StatCounter Global Stats shows that the number of mobile users is expanding rapidly, as shown in Fig. 9. In real-life scenarios, the number of mobile users is unquestionably larger than that of desktop users. Mobile platforms like Symbian, Android, and iOS are rapidly gaining popularity because of their size and portability. Nevertheless, ensuring easy access from mobile platforms to computerized learning systems (whether an ITS or an ATS) is still considered a challenge. Also, according to Fakhfakh [143], mobile cloud computing has now become a hot topic among researchers. This issue is, in fact, a major challenge to mitigate, and so a rigorous analysis is required.

I. ETHICAL ISSUES
With the advent of big data and educational data mining, researchers and developers now see endless opportunities to introduce different technologies to process and generate this large volume of information to support learning. However, before embracing big data in the educational domain, several issues need to be addressed. Obtaining participants' consent in a standard way is a challenge in big data research because most of the data already exist in institutional databases. The major ethical dilemma associated with big data research is maintaining integrity while using publicly accessible data. It is possible that the users who generated the data are not willing to consent to its use, or that these individuals are no longer accessible to researchers. Additionally, ownership of and access to the data should be scrutinized as well. For example, it is not ethical for a student to access the same data as a faculty member, and educators from one course should not have access to the analytics of other courses. Sharing data without proper guidelines might also raise intellectual property issues. These issues of trust need to be addressed appropriately when sharing research data across the globe.

IV. A CASE STUDY
Previously, we discussed diverse technologies with prospects for designing an Intelligent or Affective tutoring system. This section illustrates a practical design workflow for a specific learning subject and group of tutees. Specifically, our study focuses on Tega [145], a one-to-one tutoring system for children's second-language skills. We analyze the implementation of each module of the current Tega system and, in the context of the approaches discussed earlier, propose several cases for improving it. We aim to help anyone design an improved ITS or ATS based on the Tega system by analyzing these cases.

A. DESIGN ANALYSIS OF TEGA SYSTEM
The tutoring aim of Tega is to help children learn new words in a second language. The system includes a tablet to support virtual scene creation and interaction, as well as a physical robot to communicate physically, as seen in Fig. 10. The perception-planning-action structure is used to decompose the system, and the implementation of each module is analyzed below.

1) PERCEPTION
Two types of sensors (visual and acoustic) are installed on the overall framework of the Tega system to capture the child's video stream. Additionally, a tactile sensor was placed over the tablet screen to capture the interaction between the child and the virtual game environment. Further, a real-time facial expression detection and analysis algorithm was implemented to extract the child's facial expressions, such as smile, brow-forehead, brow-raise, and lip depression, and to estimate the child's valence and involvement.

2) PLANNING OF TEACHING CONTENTS AND STRATEGIES
The Tega system adopted reinforcement learning (RL) to learn each child's personalized affective policy. Input data from the perception module, such as the child's valence and engagement status as well as task actions within the virtual environment, are fed into the RL. A weighted sum of the child's affective and task performance is estimated, and the critic is designed based on this. The RL then facilitates online training through the virtual game and physical robot and selects suitable verbal and nonverbal acts. The student's characteristics, meanwhile, are implicitly modeled in the RL.
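The weighted-sum reward described above can be illustrated with a minimal sketch. This is not Tega's implementation: the weight values, the discretized state and action names, the averaging of valence and engagement, and the tabular Q-learning update are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set; the source does not list the robot's actions in this form.
ACTIONS = ["encourage", "hint", "celebrate", "stay_neutral"]


def reward(valence, engagement, task_score, w_affect=0.5, w_task=0.5):
    """Weighted sum of affective and task performance (weights are assumed)."""
    affect = 0.5 * (valence + engagement)  # simple average of the two affect signals
    return w_affect * affect + w_task * task_score


class QLearner:
    """Tabular Q-learning with epsilon-greedy action selection."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # maps (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)  # explore
        return max(ACTIONS, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, r, next_state):
        # One-step temporal-difference update toward r + gamma * max_a Q(next_state, a).
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In such a loop, the perception module would supply `valence`, `engagement`, and `task_score` after each robot action, and `update` would be called once per interaction step. Changing `w_affect` and `w_task` shifts the policy between prioritizing the child's affect and prioritizing task success.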

3) ACTION THROUGH MULTI-MODAL COMMUNICATIONS
The tablet implements scene construction, synthesizing a virtual traveling game that allows the child to communicate and practice with a virtual animated character. The physical robot may perform different non-verbal gestures, such as midsection tilt (left/right and forward/back) and full-body movement (higher/lower and left/right), to attract and direct the child's attention, along with natural-language verbal utterances.

B. POSSIBLE IMPROVEMENTS
Case 1: The Tega system uses a single visual sensor to capture and analyze the children's facial expressions. In addition, the system can use speech recognition technology for multi-modal perception. Since the target audience is children, it may be difficult to identify emotions from their text. Thus, it is advisable to omit text-based emotion recognition for this task. Furthermore, the target audience may include disabled or autistic children. Analyzing the facial expressions and speech of these children will be a critical issue. As previously discussed (Section 3, point A), suitable biomedical sensors may be leveraged to analyze these children's valence and engagement. On top of that, security issues such as identification and authentication, integrity of information, confidentiality, and protection against web-based attacks such as SQL injection and unauthorized programs (e.g., trojan horses and viruses) should be scrutinized as well to make the system more robust.
Case 2: The current implementation of Reinforcement Learning (RL) in the Tega system uses case-by-case training and a fixed weight. Several methods can be used to improve the RL process in the system. First, from the above discussion of CI approaches, NNs seem quite popular for understanding complex patterns with satisfactory computational performance. The authors of [147] described several progressive RL methods for NN architectural tasks, along with challenges for further consideration. Merging the power of NNs with RL in the educational domain (i.e., devising an ITS or ATS) will be a novel approach to explore. Furthermore, SI-based optimization algorithms can also be investigated to improve the performance of NNs; for example, [99] demonstrated that GA can successfully evolve the weights of NNs and performs well on most RL problems. It also reduced the training time for deep RL problems.

V. FUTURE TREND
It has been observed that, over the last decade, there has been an ongoing trend of research interest in Intelligent Tutoring Systems. In recent years, researchers have investigated the importance of integrating learners' affective states into these systems, which led to the transition from intelligent tutoring systems to affective tutoring systems. Thus, the significance of affective computing in the educational context is unquestionable. It is expected that, to detect and monitor learners' emotions, most e-learning applications and platforms will incorporate an embedded system to recognize the user's affective states. Due to the growing number of mobile phone users and the rapid development of mobile learning technology, it can be anticipated that mobile learning technology will eventually become the dominant system in educational contexts. This requires the development of more sophisticated emotion-based models, which can be integrated into devices such as tablets and mobile phones along with cloud computing to maintain and allocate the required resources at low cost. Therefore, future educational environments might differ from the current situation and be more revolutionary than ever. In this context, industry practitioners should take further steps to collaborate with researchers under the right policy by making appropriate plans and providing essential resources in order to foster future research and development in affective expression and recognition.
Our study found an ongoing trend among researchers of proposing and developing emotional systems, models, and techniques to recognize and express emotion in the educational domain. In designing and developing affective recognition and expression systems, researchers and industry practitioners are urged to further explore the opportunity of incorporating the latest devices and techniques, such as cameras, speech prosody analysis, intelligent sensors, and intonation recognition, to identify the affective states of learners more effectively. Other types of research may include examining the impact of color features on emotions (such as lightness, hue, or chroma, and color bias in various learning states considering emotional differences), investigating the relationships among motivation, learning style, cognition, and emotion, and investigating the effects of emotion on users' intention, achievement, and performance. The first step could be to design an affective method or system for emotion recognition and integrate it into the educational context; however, the effectiveness, performance, and usability of such systems should be examined carefully in order to design systems and methods in a more effective and appropriate manner.
FIGURE 10. Tega system implementation [146].
In affective computing studies, there are five vital types of affective measurement channels: facial expression, textual, physiological, vocal, and multimodal [15]. Among them, the most widely used channels are textual and facial expression as a multimodal channel. In recent times, some researchers have claimed that multimodal approaches may mitigate the limitations of individual channels in identifying emotion and achieve better accuracy than the leading unimodal channels. However, there is still a need to address methodological, theoretical, and measurement challenges in using these types of channels to measure learners' emotions empirically. Examples include finding a proper way to achieve better results by integrating different channels, handling the various types of data received from these channels, resolving disagreements on the emotional state among different approaches at a specific time, and obtaining the recognition results. Therefore, in-depth analysis is required to design comprehensive multimodal affective recognition systems and approaches that are able to decode human emotional states while taking various teaching and learning conditions into consideration.
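One simple way to integrate channels and resolve disagreements among them is decision-level (late) fusion, in which each channel outputs a probability distribution over emotions and the distributions are combined by a reliability-weighted average. The sketch below is only an illustration of that idea: the channel names, emotion labels, probabilities, and reliability weights are hypothetical and are not taken from any of the reviewed systems.

```python
def fuse(channel_probs, weights=None):
    """Reliability-weighted average of per-channel emotion distributions.

    channel_probs: {channel: {emotion: probability}}; a channel that produced
                   no output (None or empty) is skipped.
    weights:       {channel: reliability}; channels not listed default to 1.0.
    Returns (most_likely_emotion, fused_distribution).
    """
    weights = weights or {}
    totals, weight_sum = {}, 0.0
    for channel, probs in channel_probs.items():
        if not probs:
            continue  # e.g. the learner typed nothing, so the text channel is silent
        w = weights.get(channel, 1.0)
        weight_sum += w
        for emotion, p in probs.items():
            totals[emotion] = totals.get(emotion, 0.0) + w * p
    if weight_sum == 0:
        raise ValueError("no channel produced a prediction")
    fused = {e: v / weight_sum for e, v in totals.items()}
    return max(fused, key=fused.get), fused


# Hypothetical readings at one time step: face and voice disagree, text is missing.
readings = {
    "face": {"frustrated": 0.6, "engaged": 0.4},
    "voice": {"frustrated": 0.2, "engaged": 0.8},
    "text": None,
}
label, fused = fuse(readings, weights={"face": 2.0, "voice": 1.0})
```

Late fusion keeps each recognizer independent, so a missing or noisy channel degrades the result gradually instead of breaking it; feature-level fusion is an alternative that this sketch does not cover.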
In an Affective Tutoring system, it is imperative to recognize both negative and positive emotional states during learning episodes to improve learning productivity as well as to reach the desired academic goals. Therefore, investigating these states in various learning settings (such as game-oriented, mobile-based, online, and face-to-face learning) could be another direction for future studies. Furthermore, future studies should also consider emotional states across academic contexts (such as age, gender, group, physical impairment, and subject domain) to expedite and improve the process of affective computing. This will minimize resource requirements in educational contexts, which have received little attention. It can be achieved by uniquely categorizing academic-related emotions depending on group, age, gender, and subject domain.

VI. SURVEY LIMITATION
A literature review focusing on a broad range of aspects of these two tutoring systems is a tough task because it requires extensive background knowledge. Therefore, it is necessary to mention the limitations of this study. These two systems encompass broad multidisciplinary areas such as computational intelligence approaches, their implementation methodologies, optimization techniques, emotion theory models, various channels of emotion recognition and their acquisition techniques, and so on. These sub-areas themselves require critical investigation. Although this study focuses on showing the transition, the critical discussion is limited to work published between 2014 and 2019. We also provided a case study to help readers understand the approaches comprehensively. Even though we reviewed 148 articles, we only had access to subscribed journals, thus possibly omitting relevant articles from unsubscribed journals. Non-English publications are also omitted. Readers are advised that it is impossible to review all the myriad aspects of ITSs and ATSs in a single article. A possible solution is to conduct a systematic literature review (SLR) on each part of the systems, which would reduce the partiality of such reviews. We suggest that further SLRs and meta-analyses be done to provide a detailed review of each category of approaches to these systems presented in this study.

VII. CONCLUSION
Students are more motivated to learn using smart devices than through conventional instructional delivery systems. Nowadays, hardly any user can be found who does not use computers and the Internet in daily life. Students prefer computerized learning, which has grown remarkably as an educational approach, just as technology has evolved over the years. Providing personalized learning content based on students' learning needs and preferences is essential for the effectiveness and efficiency of any learning approach. Therefore, numerous studies have been conducted on these issues from different perspectives. This study investigates two of the most popular tutoring systems, Intelligent and Affective tutoring systems. A taxonomy of these learning systems has been provided, which can be applied to some extent to formulate an appropriate tutoring system. This study analyzes their architectures and the various models, approaches, and techniques used for designing and implementing each of the modules, along with the advantages and disadvantages of these systems. Some of the crucial findings reveal that neural networks are the most prominent computational intelligence approach for optimizing tutoring and learning activities and need to be explored further. Among the emotion measurement channels, facial expression has been the most widely used, followed by speech, text, and physiological signals. The game-based learning approach is the most effective way to promote learning outcomes and deserves attention. Furthermore, this study investigates and highlights the importance of integrating affective states in learning, which has led many tutoring systems to include students' emotional signals in their learning environments. The benefits of the transition from ITS to ATS as a next-generation learning approach show that ATS has surpassed ITS in detecting the affective changes of the learner and adapting accordingly to improve the overall learning outcome.
This study will help novice researchers to identify pertinent approaches to develop effective tutoring systems and motivate experts to explore innovative approaches to undertake the research challenges.
SITI SORAYA BINTI ABDUL RAHMAN received the bachelor's degree in information technology from the University of Glamorgan, U.K., in 1998, the master's degree in computer science from the University of Malaya, in 2003, and the Ph.D. degree in cognitive science from the University of Sussex, U.K., in 2012. She is currently a Senior Lecturer with the Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, and a Program Leader for ALEPS (Adaptive Learning Environment for Problem Solving) under the Innovative Technology Research Cluster. Previously, she was the Head of the Augmented Human Learning Program, under the Humanities Research Cluster, University of Malaya. Her research interests include cognitive science (cognition and programming, physics problem solving), artificial intelligence in education, E-learning, expert systems, knowledge representation, and reasoning.
MOHAMMAD MUSTANEER RAHMAN received the B.Sc. degree in computer science and engineering from Khulna University, Bangladesh, and the master's degree in computer science from the University of Malaya, Malaysia. He is currently pursuing the Ph.D. (Higher Degree by Research) degree in information technology with the School of Engineering and ICT, University of Tasmania, Hobart, Australia. He has eight years of industry experience in research and development. He has authored or coauthored several publications in international journals and conferences. His research interests include spatial crowdsourcing, E-Learning, affective tutoring systems, recommender systems, adaptive learning, educational technology, human-computer interaction, image processing, decision support systems, machine learning, big data, distributed computing, and cloud computing.