Predicting Factors That Influence Students’ Learning Outcomes Using Learning Analytics in Online Learning Environment

The application of online learning has increased significantly in recent years. One of the keys to success in online learning is student interaction. An active learning strategy engages students to interact with the course or to become involved in the learning process. The objective of this research was to predict which student activities would improve students’ learning outcomes. All the activities studied are related to non-human interaction, and one of them is concept mapping. All student activities in online learning were stored in the LMS, and the resulting data were treated as learning analytics. A linear regression method was used to analyze the data. This research found that working on exercises using concept mapping had a significant effect on improving students’ learning outcomes.
Keywords: Concept mapping, learning analytics, learning outcomes, non-human interaction, student success in online learning


Introduction
Education is changing, and technology plays a large part in reshaping the current educational landscape. In this big data era, the use of online learning has increased significantly; it became more popular during the Covid-19 outbreak and is expected to substitute for the traditional (face-to-face) classroom. Therefore, educators should provide quality learning experiences for students.
To promote student success in online learning, several important aspects need to be considered when organizing it. These include the use of application software, which offers many advantages such as ease of administering, managing, documenting, monitoring, delivering content for, and evaluating online learning [1]; the utilization of educational data mining to improve the quality of education [2]; and the application of learner-centered learning strategies [3]. A learning management system (LMS) is a learning platform used to administer an educational program in an online learning environment that adopts the learning activities of the traditional classroom. An LMS has common components consisting of synchronous and asynchronous communication tools, management features, and assessment utilities [4]. These features make it easier for the teacher to structure the course. Thus, an LMS is a learning environment that is able to support both teachers and students in conducting self-directed learning [5].
An LMS records activity data in its activity logs. It collects data on student interactions, such as when, for how long, and how often students access the facilities provided, for example content, quizzes, and forums. Each online student generates data, and these data differ from one student to another. The log data generated from an LMS are referred to as Learning Analytics (LA), whose function is to extract data in large volumes [6].
LA is used to optimize the learning process by solving problems that exist within it. The use of data to determine the right strategy and policy is known as a data-driven approach. Many problems can be solved by utilizing a data-driven approach, as stated by Jagadish et al. [7]. The use of LA can create a more personal, adaptive, and interactive learning environment in order to increase the effectiveness of teaching and learning and to improve the performance of students and teachers [8].
In response to the large volume of educational data, LA is expected to become an important tool in helping teachers gain a greater understanding of student needs and performance [9]. LA provides meaningful information to teachers by combining and analyzing students' historical learning data, which allows them to reflect on and intervene in learning in order to increase students' absorption and participation [10].
However, the use of an LMS and LA alone is not enough without an instructional strategy in online learning. An instructional strategy is aimed at increasing activity in learning. Student activeness in online learning can be seen from student interaction, which plays an important role in an online learning system [11]. Student interaction is also produced by active learning strategies. An active learning strategy can promote students' engagement so that they interact with the course or become involved in the learning process [12].
This study uses an LMS and LA as the infrastructure for online learning. In addition, the learning activity employed is the Concept Mapping (CM) strategy, which is one of the eight ways to promote generative learning activities [13].
In addition to CM, reading content, the number of logins, the length of logins, and submitting assignments are the online learning activities on which this research focuses. According to Hirumi [14], these activities are categorized as non-human interaction. All the activities were recorded in the LMS, and the resulting data were treated as learning analytics. The purpose of this study was to predict which of these activities would affect the learning outcome. To that end, an experiment was carried out with students of the Educational Technology Department, State University of Malang, as the test subjects. The test results were analyzed using multiple linear regression (MLR).

Literature Review

Active learning strategies in online learning
Interaction in online learning: Online learning activities combine several types of student interaction during the learning process. Many studies have discussed student interaction because of its important role in online learning. Hirumi [14] summarized the types of interaction in online learning into three levels, namely student self-interaction, which relates to cognitive processes (level 1); student interaction with human and non-human resources in the learning environment (level 2); and student interaction with the pedagogy or e-learning strategy (level 3). Furthermore, Chou, Peng, and Chang [15] define interaction in online learning as student-self, student-student, student-teacher, student-content, and student-interface interaction. The more common types of interaction in online learning are stated by Gradel and Edson [16] and consist of student-content, student-teacher, and student-student interaction. Student-tool interaction is also classified as a non-human student interaction [14], [17].
Concept mapping: Students should not only passively receive information or knowledge; they must also play an active role in constructing their understanding, so the learning process is also defined as a generative activity [18]. Generative learning focuses more on finding relationships to build new knowledge than on merely storing information in short-term (working) memory or long-term memory. Thus, in generative learning, understanding is the result of building relationships between one concept and another using prior knowledge, learning experiences, and new information [13].
According to Fiorella and Mayer [13], there are eight ways to promote generative learning, including learning by mapping. The concept map is one example of learning by mapping, in which students convert their understanding into a spatial arrangement of words and make connections between these words. Joseph Novak and his team developed the concept mapping method in the early 1970s. Ausubel's theory of meaningful learning underpins concept mapping. A concept map is a way of representing or organizing knowledge. Concept maps reveal the way we think and the way we see relationships between pieces of knowledge [19]. Novak [20] implies that CM is a method for enhancing conceptual understanding.

Learning analytics in Learning Management System (LMS)
Currently, popular LMSs provide essential tools that allow interactive activities in a course, such as forums, messages, online assignment forms, and virtual classrooms. These tools also assist teachers in tracking and monitoring the student learning process, for example through reports on the status of submitted assignments, access frequency statistics, and activity logs on the system.
Online learning in higher education is growing dramatically around the globe. The asynchronous and synchronous interaction and communication facilities of an online learning environment can substitute for the traditional classroom approach, which is what makes online learning a part of higher education. The implementation of Learning Analytics can enhance the quality of the online learning process and its outcomes because it provides a better understanding of students' performance during the learning process through their learning track records [21].
LA is a sophisticated tool used to enhance learning and education. It emerged from several fields of science and earlier research, such as educational data mining, web analytics, business intelligence, and academic analytics [22]. LA provides important information to teachers by combining and analyzing students' historical learning data, which allows them to reflect on and intervene in learning in order to increase students' absorption of material and participation [23].
Data sources in LA can take the form of demographic data, online activities, assessments, and learning achievement data. All data can be visualized in a variety of ways and are presented to teachers and students. Clow [24] states that interventions and predictions also vary between teachers and students. Each group takes action based on the LA presented; for instance, students may compare their learning progress and achievement with those of their peers, and teachers may contact students identified as requiring additional assistance.

Research Design

Participants
The participants in this research were 53 students enrolled in the Web Programming course. They were third-year students in the Educational Technology Department, Faculty of Education, State University of Malang, Indonesia.

Learning procedure
The research was conducted over 6 weeks, as follows: 1) in Week 1, tutorials were held for all participants in face-to-face mode (traditional classroom) to build an understanding of concept mapping; 2) in Week 2, a pretest was conducted to determine the students' initial abilities; 3) in Weeks 3-6, the online learning process took place, ending with a posttest to measure learning outcomes.
The learning content used in this research was the introduction to the Web Programming course, which relates to conceptual knowledge. The LMS was used to present the online course and was equipped with a generative activity feature, namely concept mapping (CM). The CM feature was built using the jsmind JavaScript library [25]. Research participants registered in the LMS by creating a user identity (user ID) for each student. Students' CM results were stored in the LMS, so students could re-access or re-create them, and the teacher could assess the CM results of each student.

Data collection
Learning analytics data were taken from two tables in the LMS database. The first was the login table, which records information about student login activities and the time students spent using the LMS. The second was the session table, which records every student activity during access to the LMS, together with the user ID and activity ID.
After activity data from the teacher and administrator were deleted, the data in both tables were filtered to obtain the students' activities. The resulting student activity data consisted of the number of logins, the login duration, reading content, working on exercises using CM, and submitting assignments.
The post-test consisted of 25 questions that were validated by experts in the field of information technology and educational technology.

Data analysis
Statistical analysis in this study was performed using SPSS version 26. Multiple linear regression (MLR) was used to find relationships between learning outcomes and several online learning activities categorized as non-human interactions. In this study, the independent variables were the amounts of each type of interaction listed in Table 2, namely the number of logins, the duration of the logins, the duration of reading the content, the number of interactions in building the CM (working on exercises using CM), and the number of assignment submissions, whereas the dependent variable was the learning outcome, represented by the final grade (posttest) achieved by each student. The MLR method was used to explain the variance of the dependent variable through a linear combination of the independent variables. Beforehand, a correlation test was performed to determine the correlation coefficient between each pair of variables. This made it possible to create predictive models for the dependent variable based on data from the independent variables.
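As an illustration of what an MLR fit computes, the following is a minimal sketch of ordinary least squares solved through the normal equations. It is not the SPSS procedure used in the study: the function names are hypothetical, only two predictors are used for brevity, and the data are synthetic and noise-free (generated from known coefficients) so that the recovered values can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares; returns [intercept, b1, ..., bk]."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coefs):
    """Share of the variance of y explained by the fitted model."""
    pred = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic rows: (number of logins, number of CM interactions). Not study data.
rows = [(3, 5), (5, 2), (4, 8), (6, 4), (2, 7), (7, 10)]
# Synthetic post-test scores generated from known coefficients, without noise.
posttest = [40 + 0.5 * logins + 2.0 * cm for logins, cm in rows]

coefs = fit_mlr(rows, posttest)
r2 = r_squared(rows, posttest, coefs)
print(coefs, r2)
```

Because the synthetic scores contain no noise, the fit recovers the generating coefficients and an R square of 1; real activity data would of course give a far weaker fit.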

Students' activities in online learning
Online learning activities are applied mostly to individual learning, namely reading content, working on exercises (for example, using another tool or feature), and submitting assignments. These three activities are standard activities carried out by a student in online learning. In this research, working on exercises by constructing concept maps with a dedicated feature was chosen as a student to non-human interaction.
This research used all the data records from Table 1 and Table 2 to capture the learning activities. For each table, a different query was written in SQL, which is commonly used to filter data in the MySQL relational database. Table 1 shows the login_count column, which represents the number of times a student authenticated to the LMS, and the session_duration column, where a session is considered one cycle of user activity that starts when a user connects to the service. These variables and activities describe the learning behaviour of the online students.
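The per-table queries described above could look roughly like the following sketch. It uses Python's built-in sqlite3 module in place of MySQL so that it runs without a database server. Apart from login_count and session_duration, every name here (the role, duration_min, and activity columns, the activity labels, and all rows) is hypothetical, since the actual LMS schema is not given.

```python
import sqlite3

# Hypothetical schema and rows; the real LMS tables are not described in detail.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE login (user_id TEXT, role TEXT, duration_min REAL);
CREATE TABLE session (user_id TEXT, role TEXT, activity TEXT);
INSERT INTO login VALUES
  ('s01', 'student', 30), ('s01', 'student', 45),
  ('s02', 'student', 20), ('t01', 'teacher', 60);
INSERT INTO session VALUES
  ('s01', 'student', 'read_content'), ('s01', 'student', 'concept_map'),
  ('s01', 'student', 'concept_map'), ('s02', 'student', 'submit_assignment'),
  ('t01', 'teacher', 'read_content');
""")

# Query 1: number of logins and total session duration per student,
# leaving out teacher and administrator rows.
login_stats = conn.execute("""
    SELECT user_id, COUNT(*) AS login_count, SUM(duration_min) AS session_duration
    FROM login
    WHERE role = 'student'
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

# Query 2: number of recorded interactions per student and activity type.
activity_counts = conn.execute("""
    SELECT user_id, activity, COUNT(*) AS n
    FROM session
    WHERE role = 'student'
    GROUP BY user_id, activity
    ORDER BY user_id, activity
""").fetchall()

print(login_stats)
print(activity_counts)
```

Filtering on a role column is only one way to exclude teacher and administrator activity; deleting those rows beforehand, as the data collection step describes, would give the same per-student counts.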

Predicting students' learning outcome
The correlation between each pair of variables was calculated using the Spearman correlation. This non-parametric test was chosen because some of the data were neither homogeneous nor normally distributed.
Table 3. Spearman correlation. *. Correlation is significant at the 0.05 level (2-tailed). **. Correlation is significant at the 0.01 level (2-tailed).
Table 3 shows that the number of logins was closely related to all activities on the LMS at the 99% confidence level; however, the number of logins was not related to the learning outcomes. Likewise, the Spearman correlation test showed that the time students spent accessing the LMS was closely related to all variables tested except the learning outcomes variable.
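For readers unfamiliar with the Spearman coefficient, the sketch below shows one common way to compute it: rank both variables (tied values share the mean of their ranks) and take the Pearson correlation of the ranks. The function names and the five data points are hypothetical and are not taken from the study; significance testing is omitted.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five students (not study data).
logins = [2, 5, 3, 9, 7]
reading_minutes = [10, 40, 15, 300, 60]
posttest = [70, 65, 80, 60, 75]

rho_reading = spearman(logins, reading_minutes)
rho_posttest = spearman(logins, posttest)
print(rho_reading, rho_posttest)
```

In this toy data, reading time rises monotonically with the number of logins, so the coefficient is at its maximum even though the relationship is far from linear; that rank-based behaviour is why the test suits data that are not normally distributed.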
To interpret the MLR results, several analyses were conducted, including the F-test, the t-test, measurement of the coefficient of determination, and the multicollinearity test. The results of these tests are explained below. After the F value was obtained, the coefficient of determination was measured (see Table 5). Table 5 shows an R square value of 0.241 (24.1%) and an adjusted R square of 0.161 (16.1%). Because more than two independent variables were used, the adjusted R square was selected as the coefficient of determination. This coefficient shows what percentage of the variation in the dependent variable is explained by the independent variables used in the model.
Next, the multicollinearity test was intended to determine whether the independent variables in the regression model were strongly correlated with one another, by looking at the tolerance (T) and variance inflation factor (VIF) values. Table 6 shows that the T value of every independent variable was > 0.1 and the VIF value of every independent variable was < 10. It can therefore be concluded that there was no multicollinearity in this regression model.
Table 7 summarizes the results of the MLR analysis. If the Sig. value of an independent variable was < 0.05, it could be concluded that the independent variable had a partial effect on the dependent variable. In addition, if the t value of an independent variable was greater than the t-table value (2.01174), it could be concluded that the independent variable partially (individually) affected the dependent variable. Table 7 shows that only the variable of working on exercises using concept mapping had an effect on learning outcomes (post-test).
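The decision rules in this section can be written down compactly. The sketch below uses the standard adjusted R square formula and the thresholds quoted in the text. The function names are hypothetical, n = 53 and k = 5 are taken from the participant count and the five independent variables, and the t value of 2.3 in the usage lines is an invented placeholder, since the individual t values are not reproduced here.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R square for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def has_multicollinearity(tolerance, vif):
    """Rule of thumb from the text: tolerance <= 0.1 or VIF >= 10."""
    return tolerance <= 0.1 or vif >= 10


def is_significant(t_value, p_value, t_critical=2.01174, alpha=0.05):
    """Partial effect if |t| exceeds the t-table value and Sig. < alpha.

    Taking the absolute value is an assumption for a two-tailed test.
    """
    return abs(t_value) > t_critical and p_value < alpha


# R square, n, and k as reported in the text.
adj = adjusted_r2(0.241, n=53, k=5)
print(round(adj, 3))

# Tolerance is the reciprocal of VIF, so VIF = 2 corresponds to T = 0.5.
print(has_multicollinearity(tolerance=0.5, vif=2.0))

# Sig. = 0.024 is the reported value for CM; t = 2.3 is a hypothetical placeholder.
print(is_significant(t_value=2.3, p_value=0.024))
```

With the rounded R square of 0.241, the formula gives approximately 0.160, which agrees with the reported 0.161 up to rounding. The t-table value of 2.01174 is also consistent with a two-tailed test at the 0.05 level with 53 - 5 - 1 = 47 degrees of freedom.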

Discussion
LA is likely to be used by anyone who is involved in, and has an interest in, implementing the learning process. Ifenthaler and Widanapathirana [26] divided the stakeholders who might be involved and interested in LA into several levels, namely mega-level, macro-level, meso-level, and micro-level. Greller and Drachsler [9] emphasize that LA can be used by different stakeholders, such as students, teachers, intelligent tutoring systems, educational institutions, researchers, and instructional designers.
Both students and teachers might be concerned with how the analysis of LA can improve learning quality, for example how students' grades can be improved or how teachers can be helped to adjust a learning strategy and select suitable learning materials based on students' needs and personalities. LA as an analytical tool is also utilized by other stakeholders, in this case educational institutions, to support policy making by identifying students' failures or learning needs in online learning. According to Campbell, DeBlois, and Diana [27], the learning analytics process is an iterative process consisting of five steps, namely capture, report, predict, act, and refine. This paper focuses on the third step, namely predict. The analyzed data consisted of information about students' learning process, such as reading content, working on exercises using CM, submitting assignments, the number of logins, and the amount of time spent in online learning, as well as students' learning outcomes. The content presented in this experiment is related to conceptual knowledge, so it requires deeper learning, such as an application used in learning [28].
From Table 7, the findings of this research confirmed that doing exercises using CM is the activity that influences the learning outcome. This finding is supported by Patrick's work [29], which concludes that concept mapping can affect student achievement. Other research findings also agree that the use of concept mapping in teaching enables students to achieve higher scores than students who are taught by conventional methods [30], [31]. The same holds when Computer-Based Concept Mapping (CBCM) is applied in a digital learning environment: CBCM can provide meaningful learning for students [32]. This suggests that CM can help students understand concepts through the process of constructing knowledge structures.
Moreover, the experiment conducted by Hwang, Yang, and Wang [33] on game-based learning utilizing CM found that CM can significantly improve student learning achievement and reduce cognitive load. Similarly, a recent experiment found that CM used as a formative assessment can improve student engagement and learning outcomes in online learning when compared with conventional assessments [34]. The results of this study also showed that the quantity of reading on the LMS was not able to predict students' learning outcomes. The study of Huang, Chern, and Lin [35] supports this finding. Huang et al. [35] state that reading activities in online learning can improve learning outcomes but are unable to predict them, because students have difficulty understanding the topics in an article that require a complex level of understanding. Huang et al. [35] imply that a special strategy is needed to help students find ideas in an article, and their study found that CM can help students learn.
Concept mapping is one of the generative learning activities, which view learning as an act of construction. According to [36], people understand something by integrating new experiences with their existing knowledge structures, and students generate perceptions and meanings that are consistent with their prior knowledge. In addition, the concept map can be used as a tool for developing students' reflective thinking abilities, namely to integrate small pieces of knowledge into a complete and elaborate knowledge structure [37].
Meanwhile, the number of logins was found to correlate with the duration of interaction with the LMS in accessing learning content, as well as with the other independent variables, which were significantly correlated with one another. Research conducted by Asterhan and Hever [38] also indicates that reading content has a positive effect on learning outcomes. Nevertheless, the amount of interaction in building CM had a more significant correlation with learning outcomes. These findings are consistent with the study of You [39], which identifies several interactions in the learning process that can predict learning outcomes. You [39] states that the quantity of interaction with an LMS does not necessarily improve learning outcomes, but the quality of learning behavior can predict learning outcomes.
In this study, only a few learning activities in the non-human interaction category [14] were included; as a result, the adjusted R square value obtained in the experiment (see section 4) tends to be low. This is because many factors that influence the improvement of learning outcomes in online learning settings [28] were not measured in this research, such as learning motivation [40], student-teacher interaction [41], and interactions between students [42]. In addition, according to Ismail et al. [43], there are four factors that influence student academic performance, namely the use of technology, the interaction process, the characteristics of the student, and the characteristics of the class.
In addition, many studies have proposed solutions to make interactive activities effectively support students' learning process. Evans and Sabry [44] implemented three interactive activities, namely pace control, self-assessment, and interactive simulation; in their research, the time spent using the system was a factor affecting student results. Their study showed that students achieved better results and needed less learning time when they interacted more with the system. Similarly, according to the findings of Damianov et al. [45], there is a positive relationship between the time spent online and students' scores, especially for students in the above-average group. In contrast, Eom et al. [46] showed that there was no relationship between other forms of interaction and students' learning outcomes. Earlier research found that online interactive activities in a blended learning course have an impact on student learning outcomes.

Conclusion
As the number of online learning users increases, so does the need for an LMS. An LMS has the advantage of managing online learning by adopting the practices of traditional classrooms. Apart from managing learning, the LMS records all activities of teachers and students in its activity logs. The data generated from the LMS are known as LA. This LA feature is now very important, because data on students' online activities can be interpreted and used to solve problems in learning and to assess the effectiveness of online learning.
This study used LA for specific online learning activities, namely reading content, working on exercises using CM, submitting assignments, the number of logins, and login duration (time spent learning online). The purpose of this study was to predict which of these online learning activities would affect the learning outcome. The data analysis found that the number of logins was closely related to all activities on the LMS at the 99% confidence level. However, the number of logins was not related to learning outcomes. Similarly, the Spearman correlation test showed that the time students spent accessing the LMS was closely related to all variables tested except the learning outcomes variable.
Based on the results of the t-test in the MLR analysis, only working on exercises using CM affected learning outcomes, with a Sig. value of 0.024, while the other activities had Sig. values above 0.05. These results indicate that, compared with the other activities tested, the use of CM in exercises is effective in helping students learn content related to conceptual knowledge.