Academic Decision Support System for Choosing Information Systems Sub Majors Programs using Decision Tree Algorithm

Background: Educational data mining is an emerging trend, especially in today's Big Data era. Numerous methods and techniques have already been implemented to gain a better understanding of the educational process and to extract knowledge from related data, but their implementation in decision support system (DSS) applications is still limited, especially for helping students choose university sub majors. Objective: To design an academic decision support system (DSS) by adopting the Theory of Reasoned Action (TRA) concept and using data mining as a factor analytic approach to extract rules for its knowledge model. Methods: We implemented the factor analysis method and the C4.5 decision tree method to produce rules on the courses that influence each sub major and on job interest, as the basic rules of the DSS. Results: The proposed academic decision support system is able to give sub major recommendations in accordance with student interest and competence, with 79.03% precision and 61.11% recall. Moreover, the system also has a dashboard feature that shows statistics on the students in each sub major.


I. INTRODUCTION
Information System is a complex academic field of study that encompasses the concepts, principles, and processes of information technology and business management according to an organization's strategy, planning, and practices. Therefore, the academic content of an information systems degree program encompasses information technology, management of information systems, information systems development, and organizational business processes [1]. Institut Teknologi Harapan Bangsa (ITHB) is an educational institution that offers vast global career prospects in the Information and Communication Technology (ICT) area. One of its most favored bachelor degree programs is Information System. This department aims to produce professionals who are competent in implementing information systems to solve an organization's problems, translating business requirements into technical information technology solutions, and enhancing business process effectiveness and efficiency, and thus an organization's competitiveness. Because the field of study is broad and complex, the program is divided into three sub majors, which are listed in Table 1.
The purpose of sub majors in the Information System (IS) bachelor program is to encourage students to explore their academic specialty and work interest through a series of elective courses. Each Information System student is entitled to choose at least one of the sub majors. Since the students who take this program come from wide-ranging academic backgrounds, some students might experience different difficulties in different courses, and as a result this could affect their judgement in choosing the most suitable sub major. Students tend to choose the sub major that they think they are well informed about, rather than one based on their actual academic performance. In fact, according to a survey that has been conducted, almost 40% of students change their sub major more than once. According to the same survey, 65% of students choose their sub major based only on their interest, and they do not consider the relationship between the sub major and their professional career goal. This could cause students to fail to finish their study on time. The problem is that there is still no research on which courses, and which final grades in them, are considered acceptable and could influence the success of the education process in each IS sub major.

TABLE 1 LIST OF IS SUB MAJORS
A Infrastructure Management: This sub major aims to manage the essential operational components of the organization's information system, such as policies, processes, technologies, data, and human aspects of the information system.
B Business Intelligence: This sub major aims to give the necessary skills and knowledge to analyze complex data and information in accordance with the organization's key performance indicators, in order to improve the efficiency and quality of decision making in the organization.
C E-Business: In this sub major, IS students learn the concepts and methods of how to optimize business processes in the digital era.

On the other hand, ITHB, as one of the fast-growing universities in Indonesia, has already awarded bachelor's degrees to more than 1600 graduates. Years of data on graduates of the information system bachelor program have been collected and used, whenever necessary, for management purposes only. These raw data contain a wealth of hidden knowledge. It is necessary to mine knowledge based on extensive analysis of these raw data, especially knowledge that could help faculty and students make better decisions in the sub major selection process. Educational data mining can be used to understand the factors, and the relationships between their attributes, in the sub major selection process by classifying and predicting student performance. This is especially so in a complex and unique bachelor program such as Information System (IS), which combines skills in all information technology phases with business processes in organizations. The problem is that there is still no knowledge model that could be used as guidance for students and faculty to advise on the most suitable sub major based on the student's interest and competencies. Therefore, this research focuses on building an academic decision support system whose knowledge model is based on the most suitable data mining technique for the collected data. This knowledge model would be used to design the performance-based academic decision support system for choosing a sub major for information system students.
The proposed system would be able to help the Information System Department by giving recommendations and advice on the most suitable sub major for information system students based on their competency, skill, interest, and career goal.
A decision support system (DSS) is a technology and application that helps decision makers compile useful information from related raw data to identify and solve problems. The main components of a DSS are suitable data, the model, the knowledge acquired from modeling the data, and the user interface. One of the main factors in designing a DSS is constructing and formulating a suitable model based on the problems to be solved. In this research, we build a model to help IS students and academic advisors choose the most suitable IS sub major based on each student's competencies and job interest. The model will be constructed by gaining knowledge from previous IS graduates. Research on academic decision support systems has already been done to solve various problems in university administration. Deniz and Ibrahim [2] developed a performance-based academic DSS by employing a data mining technique to extract useful information from raw data in student databases and built a dashboard system that could visualize academic data.
The use of data mining techniques to generate information and knowledge from large repositories of data in the education field is known as Educational Data Mining (EDM). EDM is the application of Data Mining (DM) techniques to educational data, and so its objective is to analyze these types of data in order to resolve educational research issues [3]. In fact, the performance or success of students in examinations, as well as their overall personality development, could be exponentially accelerated by thoroughly utilizing data mining techniques to evaluate their admission, academic performance, and finally their placement in an organization [4]. The EDM methodology is not yet transparent, and it is not clear which data mining approaches and algorithms are preferable in general learning activities in educational settings. Data mining techniques have proved useful for deriving rules to classify students and for detecting the sources of any incongruous values received from student activities [5]. Minaei-Bidgoli et al. [6] used a data mining classification approach to build a model that could predict students' final grades based on features extracted from a web-based education system. Osmanbegovic and Suljic [7] compared various data mining techniques to develop a model that can draw conclusions about students' academic success. Kularbpettong and Tongsiri [8] used a data mining classification approach to build a model for students choosing an emphasized track when majoring in computer science. Livieries et al. [9] used a 2-level data mining classification technique to predict student performance in the final Algebra and Geometry course examinations. The set of attributes used in their study concerned the students' performance, such as grades, test scores, final examination grades, and semester grades [9]. Mansur and Yusof [10] used a data mining method to capture student learning behavior and used K-Means clustering to classify it. In this big data era, numerous approaches have already been implemented to analyze educational data, but the validity of these implemented approaches needs to be checked regularly, which can slow down the adoption of educational data mining or learning analytics in everyday life [11].

Fiarni, Sipayung, & Tumundo. Journal of Information Systems Engineering and Business Intelligence, 2019, 5 (1), 57-66
This research will also adopt the Theory of Reasoned Action (TRA) as its conceptual framework. According to TRA, students' intention to work in a particular position is rooted in their attitude toward the chosen major and a variety of important characteristics, such as interest, aptitude, salary, personal and social image, and the difficulty or workload of the chosen bachelor major [12]. TRA has already been used in several studies to examine choice of major. Zhang [13] used TRA to identify the variables on which students' choice of an Information System major depends. Downey et al. [14] used TRA to understand the factors that influence students to choose an information system major.

II. METHODS
In this research, we utilized graduates' academic data and their work experience to gain knowledge about the factors that could influence IS (Information System) students in choosing the most suitable IS sub major. We used both data mining and a rule-based approach as the model of the proposed DSS. For the data mining approach, we apply supervised learning to classify each IS sub major based on the graduate student profile report. The report is used to extract the rules, whose objectives are to find the relationship between students' grades in core courses and each IS sub major, and to classify the factors and their attributes linking each sub major to the graduates' field of work interest. The research steps are shown as a diagram in Fig. 1, which consists of two main parts: the modeling process and the computing of the decision process. The goal of the first part of this research is to build the model, and the goal of the second part is to implement the model in the proposed system. In this section we discuss the research method for building the model. To build the model, we must first prepare the data and then model the data by building hypotheses for the factor analysis. The idea is to build the model for the proposed system by classifying the data from graduated Information System students. In the last phase we build the model based on the knowledge gained from the factor analysis of the previous phase.
In the classification learning process, we build the knowledge model for the proposed academic DSS using the factor analysis method. Factor analysis is a set of techniques used to find the underlying constructs that influence the responses of several measured variables [15]. Generally, factor analysis addresses the analysis of the correlation structure among several variables through the definition of a set of common hidden elements called factors. Factor analysis enables the researcher first to identify the independent structure factors and then to determine a justifiable limit for each variable. Thus, the factors and related descriptions for each variable are determined, and then the initial applications of factor analysis, i.e., abstraction and data volume reduction, are obtained. In data abstraction, the hidden factors are revealed and interpreted, and subsequently the data are described through a smaller number of variables. Data volume reduction is accomplished by calculating a score for each hidden factor and substituting it for the initial variables [16]. Several studies have already combined data mining and factor analysis. Rostamy et al. [17] utilized data mining and factor analysis to identify activity-based cost drivers in an Iranian bank.
Fig. 2 illustrates that each IS sub major is influenced by courses and their final grades, and that these relations are unique to each sub major. On the other hand, each IS sub major is also correlated with fields of work and job interest, and the sub majors may have one or more of these factors in common. In order to build an academic decision support system that can advise which sub major each student could take based on their performance, we first need to find which courses influence each sub major and at what grade.

A. Dataset Collection and Preprocessing
The training data set consists of the academic transcripts and job positions of IS graduates from the years 2009 to 2010. After eliminating incomplete data, the sample comprises 67 IS graduates. From their academic transcripts, we used 91 courses, with 68 core courses and 23 electives. In total, 4851 final course grades are used. The next step in building the predictive model is data preprocessing. The objective of this step is to prepare the collected sample dataset and analyze its characteristics before using it as training data for the data mining process. This step consists of data selection, data cleaning, and data normalization. In the data selection process, three parameters from the IS graduate dataset are used: the sub major group, the final grade of each course, and the current job field. All selected parameters are saved in a CSV file. In the normalization process, the numerical final grade attribute is transformed and classified into three categorical groups. Table 2 shows the list of attributes used for predicting the IS sub major.
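The normalization step can be sketched as a simple lookup. The following is a minimal illustration, assuming letter grades as input and the three groups listed in Table 2; the record layout and the key names `sub_major`, `job_field`, and `grades` are hypothetical and are not taken from the authors' CSV file.

```python
# Grade groups as listed in Table 2 of the paper.
GRADE_GROUPS = {
    "A": "1st", "A-": "1st",               # excellent
    "B+": "2nd", "B": "2nd", "B-": "2nd",  # very good
    "C": "3rd", "C-": "3rd",               # average
}


def normalize_grade(letter):
    """Map a letter grade to its categorical group.

    Returns None for grades that Table 2 does not list.
    """
    return GRADE_GROUPS.get(letter.strip().upper())


def normalize_record(record):
    """Normalize every course grade in one graduate record.

    `record` is a dict with hypothetical keys: 'sub_major', 'job_field',
    and 'grades' (a dict mapping course name to letter grade).
    """
    return {
        "sub_major": record["sub_major"],
        "job_field": record["job_field"],
        "grades": {
            course: normalize_grade(grade)
            for course, grade in record["grades"].items()
        },
    }
```

How the original system handles grades outside the three groups is not stated in the text, so this sketch simply returns `None` for them.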

B. Mapping Process and Factor Analysis
The goal of this process is to classify the academic attitude and aptitude characteristics of each IS sub major in order to build the predictive model for the proposed DSS. As shown in Fig. 1, in order to generate rules for the proposed DSS, the training dataset that has undergone preprocessing is labelled and analyzed. Then we used the same dataset to test the accuracy of the model. If the accuracy of the model is acceptable, the model is used in the proposed academic DSS. In this research, we adopt the C4.5 decision tree classifier to construct the models. The C4.5 technique can produce both a decision tree and a rule set. We chose C4.5 because it accounts for unavailable values, continuous attribute value ranges, pruning of decision trees, and rule derivation [16]. One of the most significant advantages of decision trees is that knowledge can be extracted and represented in the form of classification (if-then) rules. Each rule represents a unique path from the root to a leaf [18]. To build the model, this research used the WEKA open source toolkit to train and test the dataset. WEKA, which stands for Waikato Environment for Knowledge Analysis, is a package of practical machine learning tools developed at the University of Waikato, New Zealand. WEKA is able to solve real-world data mining problems. The software is written in Java as an object-oriented class hierarchy and can run on almost all platforms. WEKA is easy to use and can be applied at several different levels. It contains tools for data preprocessing, classification, regression, clustering, association rules, and visualization. The data format used in WEKA is ARFF. An ARFF file is an ASCII text file that contains a list of instances sharing a set of attributes [19].
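C4.5 chooses the attribute to split on by its gain ratio, that is, the information gain divided by the split information. The sketch below illustrates that criterion for categorical attributes only. It is not the WEKA implementation used in this research; the function names and the row layout (a list of dicts) are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(values).values()
    )


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on `attribute` with respect to `target`.

    `rows` is a list of dicts, e.g. one dict per graduate holding a
    normalized course grade and the sub major label.
    """
    total = len(rows)
    labels = [row[target] for row in rows]

    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])

    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(
        len(part) / total * entropy(part) for part in partitions.values()
    )
    information_gain = entropy(labels) - remainder

    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain / split_info
```

In a full C4.5 run this score would be computed for every candidate course attribute at each node, and the attribute with the highest gain ratio would become the split; handling of continuous values, missing values, and pruning is omitted here.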
There are three main steps in the process of building the model: a) Classifying the field of work interest of the Information System bachelor program. The objective of this step is to group IS graduates' professions with their IS sub majors. The data processing is done by dividing each graduate's work according to the sub major the graduate took during their studies and then matching it against the work specifications obtained from the 2013 Information System Job Index [20]. This step results in the ideal work groups for Information Systems graduates. The grouping of the work data is shown in Table 3. To get the right model, we built seven hypotheses for the factor analysis, describing the relationship between each IS sub major, the nine courses resulting from the previous process, and their final grades. Then we tested the precision and recall of each hypothesis, as the score calculation used to reduce the variables in the factor analysis technique (see formulas 1 and 2).
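Formulas 1 and 2 are not reproduced in this text. Assuming they are the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN) for a given sub major, the score of one hypothesis could be computed as in the following sketch. The function name and the list-based inputs are illustrative choices, not the authors' implementation.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for one class label.

    `actual` and `predicted` are equal-length lists of sub major labels;
    `positive` is the sub major being scored (for example "A").
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if p == positive and a == positive)
    false_pos = sum(1 for a, p in pairs if p == positive and a != positive)
    false_neg = sum(1 for a, p in pairs if p != positive and a == positive)

    # Report 0.0 when a ratio is undefined (nothing predicted or nothing actual).
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

Returning 0.0 for an undefined ratio is a convention chosen for this sketch; the source does not say how such cases were treated.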

C. Knowledge Modelling
The goal of this process is to build a knowledge model that integrates the academic interest and aptitude characteristics of each IS sub major with the job field, using the factor analysis from the previous process. In this step we generate rules for the proposed academic decision support system by combining the result of the pruned competency tree from Table 3 with the mapping of the IS fields of work. The rules formed are sub major selection rules based on student interests and competencies. Interest is represented by the job desired by the student, which is matched to a work group derived from graduate data, as shown in Table 3. Meanwhile, the rules for competencies are represented by the base grade values of student courses, obtained from data mining of the course mapping data, student transcript data, and graduate data. Based on the rules formed, competency is the most influential factor in the selection of sub majors. Competence was obtained from the pruned tree, which yielded the seven most influential courses in the selection of sub majors. The number of existing courses is included in the rule configuration in the system. The system is dynamic, which means that additional rules can be formed. Fig. 2 shows the decision tree diagram for the rule-based classifier model of the proposed academic DSS. These extracted rules are used as the knowledge model to classify existing records in the IS student dataset using "IF and THEN" conditions, to support academic decisions regarding IS sub major selection.
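Turning a pruned tree into "IF and THEN" rules amounts to listing every path from the root to a leaf. The sketch below shows this on a small hand-made tree. The tree shape, the course names placed in it, and the nested-dict representation are purely hypothetical and do not reproduce the tree in Fig. 2.

```python
def extract_rules(tree, conditions=()):
    """Return one (conditions, label) pair per root-to-leaf path.

    A leaf is a plain string (the sub major label); an internal node is a
    dict {"attribute": name, "branches": {value: subtree}}.
    """
    if isinstance(tree, str):
        return [(list(conditions), tree)]
    rules = []
    for value, subtree in tree["branches"].items():
        step = (tree["attribute"], value)
        rules.extend(extract_rules(subtree, conditions + (step,)))
    return rules


def format_rule(conditions, label):
    """Render one rule as an IF ... THEN ... string."""
    if not conditions:
        return f"IF TRUE THEN sub_major = {label}"
    clause = " AND ".join(f"{attr} = {value}" for attr, value in conditions)
    return f"IF {clause} THEN sub_major = {label}"


# Hypothetical toy tree, for illustration only.
toy_tree = {
    "attribute": "Business Mathematics",
    "branches": {
        "1st": "B",
        "2nd": {
            "attribute": "Operating System",
            "branches": {"1st": "A", "2nd": "C"},
        },
    },
}

for conds, label in extract_rules(toy_tree):
    print(format_rule(conds, label))
```

Keeping the rules as plain data in this way is one possible route to the dynamic behavior described above, since new rules could be appended without changing the code that evaluates them.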

III. RESULTS
This section describes the process of developing the proposed academic decision support system. In this development phase, we analyze the functional requirements, design the user interface, and implement the computing part of the proposed system, in the order illustrated in Fig. 1. The system authenticates users, because each user has their own level of interaction with the proposed system; determines the competency of students for each sub major based on the grades entered, in accordance with the proposed predictive model; displays job recommendations based on the sub major proposed by the system; and shows the sub major recommendations that match the student's interests and competence. The dashboard shows the number of students who chose each sub major, both recommended and actual. This research used the Unified Modeling Language (UML) to explain the flow of the proposed system. We used a use case diagram to describe the functions of the system and its interaction with users, as shown in Fig. 3. As illustrated in Fig. 3, the users of the proposed system are administrators, students, and faculty. They have access to six, three, and two features of the proposed system, respectively. The next step of system development is the interface design of the proposed system. Interface design is the process of defining how the system will interact with the user in a pleasant way. The input data are then used by the system to recommend an IS sub major based on the proposed model. As a result, the system gives the student a recommendation for their IS sub major. These recommendations are shown in Fig. 4.
Fig. 5 shows the dashboard of recommended and actual sub majors for a specific time range. The information shows the population of each sub major in pie chart form. Faculty can also get detailed information by clicking the respective area of the pie chart and can print the overall result for other academic purposes. The proposed system was tested using the data of IS graduates who are already working. As shown in Table 5, we used the 6th and 7th hypotheses as the influence factors for the sub majors of the information system bachelor program because they give the highest recall for sub majors A and B. From the same table we learn that, with this dataset, we cannot obtain factors that influence IS sub major C. In system testing, the recommendations given by the system achieved a precision of 79.03% and a recall of 61.11%. These results indicate that the proposed system can make fairly accurate recommendations.

IV. DISCUSSION
In this research, an exhaustive factor analysis was carried out using a supervised data mining technique to find the most related courses and their grades for each IS sub major. The authors presented seven hypotheses as the basis of the factor analysis. We then applied C4.5 to each of these hypotheses on the graduate data. From the factor analysis we found that, for the Business Intelligence sub major, the most influential courses are Business Mathematics with a grade of A- and above, and Discrete Mathematics with a grade between C and B. These results are reasonable, as working in the business intelligence area requires a person with good analytical as well as mathematical skills. For the Infrastructure Management sub major, the most related courses are Operating System with a grade between B- and A, System Application and Product in Data Processing (SAP) with a grade of A or A-, and Accounting with a minimum grade of C. Finally, for the E-Business sub major, the most related courses are Business Process Analysis with a grade of A- and above, and Enterprise System with a grade between C and B. These factors are then combined with the job experience of the graduates and used as the model of the proposed academic system. The rule-based model implemented in the proposed academic decision support system proved to be valid, with a precision of 79.03% and a recall of 61.11%. When further analysis was done on the system's ability for each IS sub major, the results showed that the system achieved an accuracy of 70% for the Business Intelligence (B) sub major, 66.67% for the Enterprise System (A) sub major, and only 46.67% for the E-Business (C) sub major.
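The grade thresholds reported above can be written down as a small rule table. The following is a minimal sketch under several assumptions that the text does not confirm: grade ranges are treated as inclusive, all conditions listed for a sub major must hold at the same time, and the grade order follows Table 2. The names `RULES`, `in_range`, and `matching_sub_majors` are illustrative, and the job interest part of the model is left out.

```python
# Grade order taken from Table 2, best grade first.
GRADE_ORDER = ["A", "A-", "B+", "B", "B-", "C", "C-"]
RANK = {grade: index for index, grade in enumerate(GRADE_ORDER)}

# (course, lowest accepted grade, highest accepted grade), as reported in
# the discussion. Combining them with AND is an assumption of this sketch.
RULES = {
    "Business Intelligence": [
        ("Business Mathematics", "A-", "A"),
        ("Discrete Mathematics", "C", "B"),
    ],
    "Infrastructure Management": [
        ("Operating System", "B-", "A"),
        ("SAP", "A-", "A"),
        ("Accounting", "C", "A"),
    ],
    "E-Business": [
        ("Business Process Analysis", "A-", "A"),
        ("Enterprise System", "C", "B"),
    ],
}


def in_range(grade, lowest, highest):
    """True if `grade` lies between `lowest` and `highest`, inclusive."""
    if grade not in RANK:
        return False  # missing or unlisted grade never satisfies a rule
    return RANK[highest] <= RANK[grade] <= RANK[lowest]


def matching_sub_majors(grades):
    """Return the sub majors whose course conditions are all satisfied.

    `grades` maps course name to letter grade for one student.
    """
    return [
        sub_major
        for sub_major, conditions in RULES.items()
        if all(in_range(grades.get(course), low, high)
               for course, low, high in conditions)
    ]
```

For example, a student with an A in Business Mathematics and a B- in Discrete Mathematics would match the Business Intelligence conditions under these assumptions.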
The proposed academic decision support system will be implemented in the Information Systems department of ITHB. This system will help fourth-semester information system students and faculty get personalized recommendations of an IS sub major based on the student's competency and interest. This knowledge would support the success of the student's education process from the 5th semester onward.
More work could be carried out to improve the performance of the proposed system. Additional datasets and other data mining classification techniques could be considered to increase its precision and recall. Additional factors could also be used to enhance the knowledge model and capture more of its complexity. And since the academic decision support system has a rule-setting feature, a new and improved model could be applied to the system dynamically.

V. CONCLUSIONS
In this proposed system, we produce a rule-based model by extracting existing IS alumni data on academic performance and professional job history to support IS sub major selection. The research has found that student aptitude and attitude have a great impact on choosing an IS sub major. The classification of courses and IS graduates' final grades into each sub major was generated by a pruning process with the C4.5 algorithm. The research results show that the C4.5 algorithm is suitable for classifying core and elective courses for infrastructure management, business intelligence, and e-business as IS sub majors at ITHB. The pruning process with the C4.5 algorithm resulted in seven specific courses and the threshold values of their final grades, each of which is classified into one of the IS sub majors. The aptitude factors for selecting an IS sub major were obtained from mapping graduates' job positions to each IS sub major. These results show that the C4.5 algorithm is suitable for the factor analysis used to build the knowledge model of an educational decision support system. The DSS application uses data mining classification techniques with the C4.5 decision tree to obtain a recommendation pattern for selecting sub majors in accordance with the interests and competencies of students, obtained from data mining on transcript data, course mapping data (elective and core), course prerequisites, and data from IS graduates. This research is another stepping stone in enriching research in the academic data mining field. Our objective and expectation are that this work could be used as a reference research model in other educational data mining areas. Also, the DSS application that has been developed could be implemented to support the academic sub major selection process. We hope this application could strengthen the service system in educational institutions by offering customized assistance according to students' predicted performance. In future work, more datasets and different approaches will be collected, compared, and analyzed in order to obtain a better knowledge model for the academic decision support system.

Fig. 1. Research overview diagram

TABLE 2 LIST OF DATA ATTRIBUTES
Parameters: Value
Job field: Computer and information research science, computer and information system manager, network and computer system administrator, computer network architects, computer network support specialist, computer support specialist, information security analyst, web developers, software and system developers, computer programmers, software and application developers, database administration, computer system analyst, and Others.
IS Sub major: A = IT infrastructure; B = Business Intelligence; C = E-Business
Final Grade for each course: 1st (excellent) = A and A-; 2nd (very good) = B+, B, and B-; 3rd (average) = C and C-

Fig. 2. Factor Analysis for Knowledge Modelling of the Proposed System

Fig. 3. Use Case Diagram of Proposed System

TABLE 3 CLASSIFYING IS GRADUATED STUDENT JOB FIELD

TABLE 4
c) Classifying IS sub majors based on the actual dataset and the mapping of courses and their prerequisites.

TABLE 5 THE RESULT OF HYPOTHESIS TESTING FOR PREDICTIVE SUB MAJOR OF INFORMATION SYSTEM BACHELOR PROGRAM