Time-Dependent Performance Prediction System for Early Insight in Learning Trends



I. Introduction
Knowing students' learning trends is relevant to diagnosing learning performance and detecting early the situations where teachers' intervention would be most effective. Prediction systems are one of the best tools for this purpose. Predicting performance is the basis for student diagnostics, learning trend projection and early detection.
Most performance prediction systems output numerical grades or performance class memberships. Research tends to focus on prediction accuracy. Accuracy is relevant because it helps improve diagnostics, but it should not be confused with the main goal: improving learning. To help teachers improve student performance, many other aspects can be considered: more accessible prediction data, better graphical representations, methods for detecting learning trends and the most suitable moments for intervention, etc. Most of these improvements rely on the ability to consider the evolution of learning data over time. This is particularly relevant due to the cumulative nature of learning, and so it is one of the main characteristics considered in this work.
This work is an empirical study in the search for practical systems to help teachers in their guidance duties. It relies on teachers receiving in-depth information on student learning trends during the semester. This information is produced by an automatic system which yields predictions of expected student performance. The main contribution of this work is a custom-designed, practical prediction system. The main innovations of the proposed system are its time-dependent nature and the use of probabilistic predictions. The proposed system delivers week-by-week probabilistic performance predictions and analytical time-dependent graphs that help teachers gain insight into students' learning trends. The proposed system is tested during a complete semester in the subject Mathematics I at the University of Alicante. The data gathered is used as initial evidence to empirically test the system, and results are shown and discussed. The usefulness, convenience and advantages of the time-dependent nature of learning data are also tested and discussed. As an additional consequence derived from these tests, some initial methods for selecting the best moments for teacher intervention are proposed and discussed.
Performance predictions are shown as point graphs over time, along with calculated trends. This information is summarized and organized to help teachers explore and analyse student learning performance efficiently. Some case examples are presented and analysed using these graphs, showing their potential to help teachers understand beyond raw data. Teachers can use this information to diagnose students, understand learning trends, detect intervention situations early and act accordingly to help students improve their learning results. This research considers only learning trend diagnosis and detection of the most suitable moments for teacher intervention. Intervention strategies and their results are out of scope.
This paper is structured in seven sections. Section II analyses some relevant background works. First, several reviews which describe the most appropriate techniques in prediction are presented. Then, some related works on early detection and on providing insightful graphical representations are explained. Lastly, a discussion drawing conclusions from this review is presented. As a result, research questions are proposed in section III. A custom automated learning system, in which the proposed prediction system is included, is presented in section IV. Section V explains how data from the system is used to perform student diagnosis and to select the best intervention moment. Section VI analyses some paradigmatic student case examples, showing how prediction graphs and calculated trends help in understanding student learning trends. Finally, section VII covers conclusions and further work.

A. Prediction Techniques
Several prediction systems focused on student academic performance have been developed in recent years. Hellas et al. [1] present an extensive survey on prediction techniques, predicted factors and prediction methods. The authors find that the most commonly predicted values are course grades and individual exam grades. Most studies used statistical correlations and regression, followed by machine learning techniques such as Decision Trees, Naive Bayes classifiers and clustering. Hämäläinen and Vinni [2] carry out a comprehensive study of classification methods in the discipline of Educational Data Mining. They organize predictive classifiers in education into four groups depending on the aim of the prediction: academic success, course outcomes, success in the next task, and meta-cognitive skills, habits and motivation. They conclude that the main concerns are the choice of a discriminative or probabilistic classifier, the estimation of the real accuracy, the tradeoff between overfitting and underfitting, and the impact of data preprocessing. According to this work [2], the most used classification techniques are Decision Trees, Bayesian Networks, Neural Networks and Support Vector Machines, in this order.
Kotsiantis [3] also reviews different techniques in Learning Analytics for educational purposes (classification and regression algorithms, association rules, sequential pattern analysis, clustering and web mining). Kotsiantis indicates that the use of Machine Learning techniques is an emerging field that aims to develop educational methods for exploring data and finding meaningful patterns. He also notes that professionals tend to build a model only once, not considering data evolution over time, and that the general trend focuses on predicting students' final grades (i.e. learning performance).
Prediction accuracy is the main concern of most works. [4] predict academic success of students as low, medium and high risk. They use two data mining techniques: Decision Trees and Neural Networks. After analytically comparing various techniques, [5] achieved high precision results in student performance, using Decision Trees and ranking students as fail, pass, good or very good. [6] also use Decision Trees to predict student dropout, achieving precision comparable to more sophisticated techniques. However, it is important to consider that they perform flat classifications, not accounting for probabilistic class memberships. Hamound et al. [7] compare tree classifiers that try to predict students' success from questionnaires regarding their social activity, health, relationships and academic performance. They find that the J48 algorithm gives better performance results than Random Tree and RepTree. In another work [8], the authors compare four Machine Learning techniques to predict student performance. Their research compares the quality of predictions based on two features: average precision and percentage of accurate predictions. They conclude that the simplest linear regression model is enough to predict average academic performance for groups of students, whereas individual performance is best predicted using Support Vector Machines (SVM). Importantly, these authors [8] train their predictive models once with static past data: they do not take into account data evolution over time.

B. Early Detection
Other works stress the importance of how system outputs are shown. They consider it highly relevant for teachers' understanding and for their improved ability to help in the learning process. [9] attempt to predict student dropout or failure as early as possible. They use two pairs of descriptive-predictive techniques to achieve 80% accuracy: 1) Correlations / Linear Regression and 2) Association Rules / Bayes Model. They conclude that these techniques can help teachers understand and interpret course progress on two levels: 1) the whole group of students or 2) individually. [10] cluster students in three ways: 1) in nine classes by ranges of marks, 2) classified as high, medium or low performance, and 3) classified as pass or fail. They find that the accuracy of their predictions improves when they use Genetic Algorithms. [11] also design two partitions: 1) per mark as fail, pass, good or excellent, and 2) classified as underperforming, medium or high. They use student interaction data from Moodle and final marks. They combine different Data Mining techniques (Statistical Classification, Decision Trees, Neural Networks and Induction of Rules). They conclude that a classifier must be not only accurate, but also understandable by trainers to be useful as a guide in learning.
Early detection of learning performance issues is one of the most relevant goals in this field. The main intention is to help teachers guide students towards academic success. Freund et al. [12] present a prototype of a performance prediction system, combining classification techniques based on Decision Trees, which achieves an accuracy close to 98%. It consists of a set of decision rules that automatically detect at-risk students and trigger alerts based on the most significant variables. Alerts materialize as emails sent to both student and teacher. In [13], the authors propose using students' online activity data in a web-based Learning Management System. The system provides an early indicator of predicted academic performance and the results of a test assessing student motivation for the online course. They also try to help at-risk students by providing information on students who successfully finished the course and links to assess their willingness to attend virtual classes. Similarly, in [14], the authors propose Feed-Forward Neural Networks to predict the final marks of students in an e-Learning course. They use predictions to classify students into two performance groups. Their results show that accurate predictions are viable at early stages (in their case, in the third week). However, the proposed system failed to predict certain specific students. This is to be expected with early predictions, but it also opens up discussion on the convenience of intervention based on predictions at early stages. They conclude that their proposal can help teachers assist students in a more personalized manner.
[15] present a final marks prediction system. They argue that most previous works perform predictions after their corresponding courses have finished, which neglects the possibility of making early predictions and detecting at-risk students while lessons are ongoing. They gather activity data in a Learning Management System during three different periods: weeks four, eight and thirteen. They use three classification techniques based on Decision Trees, obtaining an overall accuracy of 95% at week four. This is one of the few works that consider data evolution over time, but it does so with quite coarse granularity (only three large course periods). However, their results are quite relevant, and an improvement of their work with more focus on enabling teacher diagnosis through appropriate data presentation (maybe using carefully designed graphs) would have breakthrough potential. Following a similar path, Akçapınar et al. [16] develop an early prediction system using students' eBook reading data to detect at-risk students. Their system uses 13 prediction algorithms with data from different weeks of the course. They obtained the best performance using Random Forests and Naive Bayes, and analysed different details on raw data versus elaborated features.

C. Graphical Representation
Graphically representing system outcomes has the potential to help teachers better understand student learning trends. It may also help them diagnose students early, and detect at-risk scenarios and relevant time-frames for intervention. However, not many works focus on the importance of graphical representations with respect to predictions. Most of them simply present predictions as raw values.
Some works use the graphing tools that come embedded in their learning platforms. [17] present Learning Analytics Enriched Rubric (LAe-R), a new cloud-based assessment tool integrated into Moodle. They use GISMO, a visualization tool for Moodle that gathers and processes log data to produce graphical representations that teachers can use to assess students' performance. They conclude by highlighting the importance of data visualization for teachers and propose future work on this matter. [18] also graphically show prediction accuracy through Receiver Operating Characteristic (ROC) curves. [19] analyse accuracy in the early identification of students who are at risk of not completing Massive Open Online Courses (MOOCs). They compare four weekly prediction models in terms of Area Under Curve (AUC), and graphically visualize student learning trends. [20] aim to enhance Reactive Blended Learning with a control system including prediction features. The authors remark that their work is not focused on obtaining a complete student model, but on improving student diagnosis to help teachers act on low performance risks. This drives them to show results in learning evolution graphs during their courses. They also compare traditional methods with their approach over two consecutive years, with interesting results.

D. Discussion About Background
The analysis undertaken in this work yields the following conclusions about performance prediction systems:
• Most of the works focus on prediction, especially on accuracy. Many different algorithms, methods, data types and data sources are used. Many works also perform algorithm comparison, almost always using accuracy as the measure. There seems to be no consensus on which algorithms, methods or data sources are better. However, some algorithms seem to give good results in general, including Decision Trees, Random Forests and Support Vector Machines. Although more research in this area is clearly justified, there seems to be too much emphasis on accuracy, sometimes forgetting that students and learning should be the major goal.
• System predictions are mostly plain classifications, with very few works modeling uncertainty and/or probability, and even fewer considering evolution over time of data and predictions. None of the works analysed considered everything at once. Much more research on producing progressive and probabilistic predictions, and analysing predicted learning trends is expected to follow.
• There is no common way of representing predictions. A great majority of works output predictions as raw numbers or similar. Some give several values, probabilities or classes. Only a few works give importance to visual representation and its key role in teacher understanding and student diagnosis. More work on this matter is encouraged, as powerful representations may constitute the basis for actual improvements in the teaching-learning process.

III. Research Questions
After reviewing the works in section II, questions arise about the static, point-in-time nature of performance predictions in most of them:
1. Are there benefits in exploiting the time-dependent nature of learning data by yielding frequent predictions of students' performance over time?
2. Could frequent time-dependent predictions help teachers deduce students' performance trends?
3. Could this give early insights into student learning trends?
4. Could these deduced trends help teachers identify the most relevant time-spots in the learning process?
5. Could this information be used to detect the best moments for teacher intervention?
It seems plausible that consecutive, frequent predictions over time could yield additional information. As an example, let us compare a single point-in-time performance prediction to a picture. Depending on the circumstances, it might be deduced from one picture that a person is running. A set of consecutive and frequent pictures would probably make it evident, also yielding information on distance, velocity, running technique... The idea behind this work is equivalent: from a graph of consecutive, frequent predictions, additional information could be deduced about student learning trends. Specifically, this includes performance trends, better estimations of future performance and the most relevant time-spots in the learning process.
This work will address these five research questions from an empirical point of view. The performance prediction system presented in the next section will be tested within a semester and results will be analysed. The data gathered, along with student case analyses, will be presented as initial evidence related to these research questions. The authors' aim is to present this initial evidence to show that the proposed system is promising in this field and to encourage subsequent studies to gather more evidence.

IV. Automated Learning and Performance Prediction System
The main contribution of this work is an improved insight into learning trends provided by frequent, consecutive performance predictions. This additional information has the potential to help teachers diagnose student issues early and schedule interventions. To achieve this result, the first step is gathering student data to make predictions. In this work, we use a custom learning system both for student assessment and for data gathering. This section describes this custom web-based system, which was initially developed to automate the processing of student learning activities. The system was also designed with data gathering in mind, to help in understanding students' learning progress.
The system supports Mathematics I, a first-year subject in the Computer Science Engineering and Multimedia Engineering degrees at the University of Alicante. Mathematics I introduces students to Computational Logic and Logic Programming through the Prolog programming language. The automated learning system consists of four main components:
• PLMan, a Pacman-like, custom-developed videogame which is the students' central activity.
• A custom web-based automated learning system which manages student homework, assessment and progress, and lets teachers supervise the process.
• A performance prediction system based on Support Vector Machines (SVM) which classifies students every week according to their expected performance.
• A representation module which graphically shows predictions, current status and future trends about students.
The next subsections describe each of these system components in greater detail.

A. Learning Activity: PLMan, the Game
PLMan is a cross-platform, text-mode, Pacman-like videogame implemented in the Prolog programming language. It was created to support the learning of Prolog programming, Computational Logic and Reasoning. As it is part of the context of this work, this section briefly introduces the game. PLMan is described in depth in [21].
Students program the Artificial Intelligence (AI) of Mr. PLMan, a Pacman-like character, in Prolog. The goal is to make Mr. PLMan eat all the dots in a given maze (see Fig. 1). The game works like a simulator: an unlimited number of different mazes can be created for PLMan. Each new maze becomes a different exercise for which students develop their AIs. Teacher-designed mazes are classified in increasing complexity to encourage students to learn more about Prolog and be creative when programming their AIs. These mazes are organized into four main stages and up to five levels of difficulty per stage.
Each AI program created by the students and aimed at solving a given PLMan maze is called a solution. The students send their solutions to a web-based system that evaluates them, based on the percentage of dots their solutions manage to eat when simulated. Each maze is worth different marks depending on its stage and difficulty. Students get cumulative marks for each solution sent, modulated by the percentage achieved. For instance, if maze A1 is worth 1 point, and student S1 sends a solution achieving 70%, student S1 will add 0.7 points to his/her cumulative marks. A solution achieving 75% or more unlocks the next maze for the student. Students have no limit on the number of times they can resend solutions, nor are they penalized for doing so. The system always considers the best solution sent, not the last. The only limits are stage deadlines and a ten-minute delay between sent solutions.
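The marking rules above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the names `MazeProgress`, `submit` and `cumulative_marks` are hypothetical, and stage deadlines and the ten-minute delay between submissions are deliberately left out.

```python
from dataclasses import dataclass

# From the text: a solution achieving 75% or more unlocks the next maze.
UNLOCK_THRESHOLD = 0.75


@dataclass
class MazeProgress:
    """Hypothetical record of one student's submissions for a single maze."""
    worth: float                # marks the maze is worth (depends on stage and difficulty)
    best_fraction: float = 0.0  # best fraction of dots eaten so far, in [0, 1]

    def submit(self, fraction_eaten: float) -> None:
        # Resending is never penalized: only the best solution counts, not the last.
        self.best_fraction = max(self.best_fraction, fraction_eaten)

    @property
    def marks(self) -> float:
        # Marks are the maze's worth modulated by the percentage achieved.
        return self.worth * self.best_fraction

    @property
    def unlocks_next(self) -> bool:
        return self.best_fraction >= UNLOCK_THRESHOLD


def cumulative_marks(mazes):
    """Sum the marks obtained over all mazes attempted by a student."""
    return sum(m.marks for m in mazes)


# Example from the text: maze A1 is worth 1 point and student S1 achieves 70%.
a1 = MazeProgress(worth=1.0)
a1.submit(0.70)
print(cumulative_marks([a1]), a1.unlocks_next)
```

In this sketch a later, worse submission leaves the marks unchanged, while a later submission of 80% would raise them and unlock the next maze.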
Formative assessment has been considered as the basis for designing this learning process: mistakes and partial progress are encouraged rather than penalized, to let students learn from their mistakes and evolve. Also, freeing students from the fear of failure makes them more willing to participate, increasing motivation. Students also follow their own path by selecting the difficulty levels that make them feel most comfortable. The greater the difficulty, the more marks a maze is worth. They may also stop whenever they want. For instance, a student in the 3rd stage with 65 out of 100 marks accumulated may stop solving mazes, and those will be his/her final marks (65%), or continue solving mazes to achieve better marks.

B. Web Site
As with PLMan, the web site is only briefly introduced in this section as context for this work. Complete details on the web system can be found in [22]. The general behaviour of the web site is similar to that of many learning systems (like Moodle, for instance) but specifically adapted to the needs of the subject. For the purposes of this paper, there is no need to go deeper into the details of this general part. The main contribution comes from the progressive prediction system and the representation module, which are described in the next subsections.
The web system is private and can only be accessed by students and teachers of Mathematics I. The public area lets anyone download the PLMan game and some utilities. Once students sign in to the private area they see their current profile, along with their progress and status (Fig. 2, left). Their status includes their accumulated marks, their assigned mazes and all details for each maze: completion percentage, acquired marks, total marks, download button, send solution button and results section. The results section (not shown in the figure) contains all the information about solution assessment: global results of execution, details on marks calculation, comparison rankings and execution logs that let students repeat exact executions that have been performed on the server.
For teachers there is an administration panel (Fig. 2, right). This panel lets them supervise the evolution of their students and groups of students. Teachers can explore all details of any given student: mazes assigned, solutions sent, results of the solutions, actions performed in the system, marks acquired, code from sent solutions, etc. They can also manage the basic parts of the course like group creation, student sign up, assignment and deadlines, system marks reviewing, etc.

C. Progressive Prediction System
The progressive prediction system is briefly described in this section, with emphasis on its progressive nature. This description aims to give a general understanding of what the system does without including mathematical and computational details. Complete details can be found in [22].
The general purpose of the system is to predict students' final performance. For this purpose, the system collects all data from students' participation and the solutions sent to PLMan mazes. Every week of the semester, the system uses up-to-date information and generates performance predictions for every student. Each output prediction comprises three real numbers per student. These numbers predict the probabilities that the student will end the semester in each of the three student classes defined in Table I. For example, an output from the system such as prediction = (studentA, 0.40, 0.35, 0.25) would mean that studentA has a predicted probability of 40% of ending the semester in the High performance class, which means his/her marks would be in the range ]80.5%, 100%]. Similarly, studentA has a 35% predicted probability of ending up in the Medium performance class (marks in ]57.5%, 80.5%]) and a 25% predicted probability of ending up in the Low performance class (marks in [0%, 57.5%]).
The system is based on Support Vector Machines (SVM) [23] as its Machine Learning model and low-level prediction technique. It is designed to predict student performance every week, using all past cumulative information. For instance, when predicting expected performance in week five, the first five weeks of input data are used, and not only data from week five. In this sense, predictions are cumulative and progressive. As a complete semester has eleven working weeks, eleven consecutive predictions are performed.
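As an illustration of the class ranges and of the prediction tuple described above, the following Python sketch maps a final mark to its performance class and unpacks one prediction. The function and type names are hypothetical and are not taken from the system itself; only the class boundaries and the example values come from the text.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """One weekly output: a student identifier plus three class probabilities."""
    student: str
    p_high: float
    p_medium: float
    p_low: float


def performance_class(final_mark_pct: float) -> str:
    """Map a final mark (in percent) to its class using the ranges in the text:
    High ]80.5, 100], Medium ]57.5, 80.5], Low [0, 57.5]."""
    if not 0.0 <= final_mark_pct <= 100.0:
        raise ValueError("final mark must be between 0 and 100")
    if final_mark_pct > 80.5:
        return "high"
    if final_mark_pct > 57.5:
        return "medium"
    return "low"


# The example output from the text.
prediction = Prediction("studentA", 0.40, 0.35, 0.25)
print(f"{prediction.student}: high={prediction.p_high:.0%}, "
      f"medium={prediction.p_medium:.0%}, low={prediction.p_low:.0%}")
```

Because the intervals are open on the left, a mark of exactly 80.5% falls in the Medium class and a mark of exactly 57.5% falls in the Low class.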
Although it would be much preferable to obtain predictions in the form of real-valued final marks, that approach is not viable for this study from the computational point of view. To obtain that kind of prediction within an acceptable error range, the system would require input data from a number of students in the order of hundreds of thousands to millions. As the results section shows, this work started with data from three hundred and thirty-six students. Although this is a normal-sized sample for two iterations of a first-year semester, it is three to four orders of magnitude less than required for real-valued final marks as output prediction. Therefore, to obtain predictions within an acceptable error range, the system was designed as a classifier with the three classes presented in Table I. This design decision is intended to give the system a high probability of generalization in the computational learning phases and to minimize overfitting and underfitting problems. This reasoning is similar to that of previous works found in the literature, as many of them have samples of similar sizes.
The input information used for prediction comes directly from the interaction between the students and the system. This includes difficulties selected, mazes assigned, number of tries to solve a maze, time taken to develop solutions, etc. It also includes number of accesses to the website, time between accesses, downloads of mazes, time spent on different views of maze and solution information, etc. Data is collected, organized, normalized and finally input into one of eleven SVMs in the prediction system. Each SVM is specialized in predicting a final performance class for a specific week. For this purpose, the SVM for week n uses the data corresponding to the first n weeks. A detailed description of the exact information used and the input features constructed is given in [22].
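The cumulative, one-model-per-week arrangement described above can be sketched as follows. This is only an illustration under stated assumptions: each week's data is assumed to be already reduced to a numeric feature vector, normalization is omitted, and the `models` list holds hypothetical callables standing in for the eleven trained SVMs (the real features and models are specified in [22]).

```python
N_WEEKS = 11  # from the text: eleven working weeks, hence eleven specialized models


def cumulative_inputs(weekly_features, week):
    """Concatenate the feature vectors of weeks 1..week (inclusive).

    `weekly_features[i]` is ASSUMED to be the feature vector of week i + 1.
    """
    if not 1 <= week <= N_WEEKS:
        raise ValueError("week must be between 1 and 11")
    if len(weekly_features) < week:
        raise ValueError("not enough weekly data collected yet")
    flat = []
    for week_vector in weekly_features[:week]:
        flat.extend(week_vector)
    return flat


def predict_for_week(models, weekly_features, week):
    """Route the cumulative inputs to the model specialized in this week.

    `models` is a hypothetical list of eleven callables; models[week - 1]
    plays the role of the SVM trained for predictions made in that week.
    """
    return models[week - 1](cumulative_inputs(weekly_features, week))


# Toy usage with placeholder models that just report how many inputs they got.
toy_models = [lambda inputs: len(inputs) for _ in range(N_WEEKS)]
toy_data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three weeks, two features each
print(predict_for_week(toy_models, toy_data, week=2))
```

The point of the sketch is that the model for week five receives the data of the first five weeks, not only that of week five, so the input size grows with the week number.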
As discussed previously, each SVM outputs three prediction probabilities, one for each final performance class. The student is finally considered to belong to the class with the greatest probability. However, all probabilities are taken into account and given to teachers. This gives much more information than the single class the student is considered to belong to. As a simple example, there are cases in which one class has a probability of 0.38 and the next one has 0.375. That means both are almost equally probable, and this information is important to take into account when diagnosing a student. This information is given mainly in the form of graphs, but also numerically if requested. Fig. 3 exemplifies what can be shown with predicted performance probabilities over time. This graph could not have been drawn if students were merely classified into high, medium and low predicted performance classes, and it shows valuable information on student trends over time.
Fig. 3. Example graph with predicted performance probabilities over 10 weeks from a random student. Probabilities are: green) high performance, blue) medium performance, red) low performance.
In Fig. 3, all predicted performance probabilities come from the same student. A quick glance shows that the student had similar probabilities of ending up in the high, medium or low performance class up to week five. In week six there is a large change, and the student is predicted to end up in the high performance range with ~0.6 probability. Predicted probabilities maintain this trend up to week ten and, finally, the student obtained 91% marks, which falls within the high performance range, confirming that predictions were accurate in this case.
As shown, the design of the system takes into account the time-dependent nature of data and predictions. Predictions are made frequently (every week), using cumulative data from previous weeks, and with probabilistic output. With all this information, progression graphs are created (see section VI). All these steps follow from the first research question. Fig. 3 suggests that this additional information coming from probabilities, time-dependent predictions and cumulative data can be valuable for gaining more insight into student learning trends.
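The class assignment rule described in this subsection, together with the near-tie case (0.38 versus 0.375), can be sketched in Python as follows. The function name `diagnose` and the `tie_margin` threshold are hypothetical: the text does not define how close two probabilities must be to count as almost equally probable, so the value 0.05 is only an assumption for illustration.

```python
CLASSES = ("high", "medium", "low")


def diagnose(probabilities, tie_margin=0.05):
    """Return (assigned class, runner-up class, near-tie flag).

    `probabilities` holds the three predicted probabilities in the order
    high, medium, low. `tie_margin` is an ASSUMED threshold, not a value
    taken from the system.
    """
    ranked = sorted(zip(CLASSES, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    (top_class, top_p), (second_class, second_p) = ranked[0], ranked[1]
    # The student is assigned to the class with the greatest probability,
    # but a small gap to the runner-up is reported rather than hidden.
    near_tie = (top_p - second_p) < tie_margin
    return top_class, second_class, near_tie


# Near-tie case from the text; the third probability is made up for the example.
print(diagnose((0.38, 0.375, 0.245)))
# A clear-cut case, similar to week six of the student in Fig. 3.
print(diagnose((0.6, 0.25, 0.15)))
```

Reporting the runner-up and the near-tie flag keeps the information that a flat classification would discard.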

D. Representation Module
As many authors have previously identified, it is highly relevant to find an appropriate graphical way to show system outputs to teachers. The proposed system, built on previous work [24], has a representation module able to show raw data to teachers as well as several graphs designed with evolution over time in mind.
The main visual outputs of the system are a set of point graphs and a control panel (see Fig. 5). Every student has three point-graphs with by-weekly prediction probabilities for the high, medium and low performance groups, including all available information up to the current week. For instance, the leftmost point-graph (green points) shows all the system's probability predictions for the student ending up in the high performance class at the end of the semester. Similarly, the central and rightmost point-graphs show predictions for the medium and low performance classes. Each point represents a single prediction made by the system and corresponds to a probability (y-axis) estimated at the week the prediction is made (x-axis). From all these predictions, a trend line is calculated by ordinary linear regression and drawn dashed. This trend line visually shows whether predicted probabilities are increasing or decreasing, and how fast. In the example of Fig. 5, the student being analysed is increasing (green) his/her probability of ending the semester with high performance marks (]80.5%-100%]) and decreasing the probabilities (grey/red) of ending up with medium or low performance marks.
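The trend line described above can be sketched with an ordinary least-squares fit over (week, probability) points. This is a minimal illustration using only the Python standard library. The function name and the sample values are hypothetical, and the original system may compute the regression differently.

```python
def trend_line(weeks, probs):
    """Least-squares line through (week, probability) points.

    Returns (slope, intercept). A positive slope means the predicted
    probability is increasing over time.
    """
    n = len(weeks)
    if n < 2 or n != len(probs):
        raise ValueError("need at least two (week, probability) pairs")
    mean_x = sum(weeks) / n
    mean_y = sum(probs) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    if sxx == 0:
        raise ValueError("weeks must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, probs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical high-performance probabilities for weeks 1 to 4.
slope, intercept = trend_line([1, 2, 3, 4], [0.2, 0.3, 0.4, 0.5])
print(round(slope, 3), round(intercept, 3))
```

The sign of the slope gives the direction of the trend and its magnitude gives the speed, which are the two quantities the dashed line conveys visually.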
The control panel on the right side of the graphs (Fig. 5) summarizes predictions for the current week (the seventh week in Fig. 5). There are three predictions in the form value-arrow-color. The value-color pair is the most important part. The value represents a probability (0-1) and the color identifies a performance group (green-high, grey-medium, red-low). So, 0.9-green means 0.9 probability of ending up in the high performance group, whereas 0.9-red means 0.9 probability of ending up in the low performance group. The arrow indicates whether the probability is increasing or decreasing. So an increasing green probability is a good sign (greater probability of high performance) whereas an increasing red probability is a bad sign (greater probability of low performance). The greater the increase/decrease velocity, the greater the angle of the arrow. Angles are discretized to nine possible values to simplify visual interpretation and comparison.
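One possible way to discretize a trend slope into nine arrow angles is sketched next. The text only states that nine values are used, so the saturation slope (`max_slope`), the steepest angle (`max_angle`) and the even spacing are assumptions chosen for illustration.

```python
def arrow_angle(slope, max_slope=0.1, max_angle=80.0, levels=9):
    """Map a trend slope to one of `levels` evenly spaced arrow angles.

    Assumed mapping: slopes are clamped to [-max_slope, max_slope] and
    scaled to an integer step in -4..4 (for nine levels), where 0 is a
    flat arrow. Returns the angle in degrees, positive pointing up.
    """
    half = (levels - 1) // 2
    clamped = max(-max_slope, min(max_slope, slope))
    step = round(clamped / max_slope * half)
    return step * (max_angle / half)


for s in (-0.2, -0.05, 0.0, 0.03, 0.2):
    print(s, arrow_angle(s))
```

Clamping keeps extreme slopes from producing arrows outside the nine allowed positions, so two students can be compared by arrow inclination alone.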
The representation module lets teachers study the evolution of individual students and groups. For individual students, several rows with point-graphs and control panels like those in Fig. 5 can be shown at once. This lets teachers select, analyse and study the evolution of any student with respect to the system's performance predictions. Section VI presents three selected case studies from three model students to show how these analyses are performed. Section V shows the most important group information that teachers use to select which students to analyse individually.

V. Diagnosing Students and Finding Time-spots for Intervention
The main research questions focus on diagnosing students, inferring and understanding learning trends, and detecting relevant time-frames and spots for teacher intervention. Section IV has introduced the proposed performance prediction system and representation module, which directly address student diagnosis. This section details the system processes and tools that help teachers perform diagnosis, and proposes ways to find the most important time-spots for intervention.

A. Student Diagnosis
Predictions, trends and student information are forms of generated and aggregated information that help teachers efficiently understand the general status of students. In the absence of this information, they would have to manually analyse all ground-work produced by students. Analysing every piece of student ground-work by-weekly quickly becomes impractical. It is important to scale this information up, as teachers typically supervise tens to hundreds of students at once. This is the main purpose of student diagnosis tools and generated information.
The concrete process to produce student diagnosis information follows these steps:
1. The system estimates by-weekly probabilities for each student to end up as high, medium or low performance.
2. Predictions are accumulated into point-graphs that show student progression over time (Fig. 5).
3. Performance trend lines are estimated applying linear regression to probabilities (Fig. 5).
4. Latest predictions and trend estimations are summarized as three arrow-value pairs (Fig. 5).
The system has been designed considering teacher time as a scarce resource. Therefore, it should be assigned with higher priority to the students who can benefit most from it. In the absence of a proper estimation, this work assumes that lower-performing students can benefit most, as they have a greater margin for improvement. A more accurate estimation would work as triage, removing uninterested students and leaving only those at risk but willing to improve. However, data produced by the system cannot directly identify these cases. Therefore, the present system leaves this task to teachers.
To help teachers focus on students that can benefit most, the system performs a visual classification, beginning with the student summaries shown in Fig. 4. Summaries reduce student information to their highest-probability value-arrow pair, and sort students from worst to best success probability.
Fig. 4. A teacher is navigating student statuses. Each number+arrow represents one student by his/her highest probability. When pointed at with the mouse, a popup shows the present control panel for the pointed student with all probabilities. Student ID has been anonymized.
Fig. 4 shows a screenshot of the classification for a group of students, while a teacher is navigating their status. The first row of students in the figure contains those with the worst predictions: they show high probabilities (0.80 to 0.92) of ending up as low performance (red). Arrows help to know whether these students are increasing or decreasing this low performance probability. A decrease in this probability would mean an improvement, as it would be less probable for them to end up as low performance students.
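The summary-and-sort step can be sketched as follows. The text does not specify how "worst to best success probability" is computed, so the ordering score used here, P(high) minus P(low), is purely an assumption, as are the function names and the sample data.

```python
def summarize(student_id, probs):
    """Reduce one student's probabilities to the highest-probability class.

    The 'score' field is a hypothetical ordering key, P(high) - P(low),
    and is not taken from the original system.
    """
    top = max(probs, key=probs.get)
    return {
        "id": student_id,
        "class": top,
        "prob": probs[top],
        "score": probs["high"] - probs["low"],
    }


def sort_worst_first(predictions):
    """Return student summaries ordered from worst to best assumed score."""
    summaries = (summarize(sid, p) for sid, p in predictions.items())
    return sorted(summaries, key=lambda s: s["score"])


# Hypothetical current-week predictions for three anonymized students.
current_week = {
    "s1": {"high": 0.85, "medium": 0.15, "low": 0.00},
    "s2": {"high": 0.05, "medium": 0.10, "low": 0.85},
    "s3": {"high": 0.30, "medium": 0.45, "low": 0.25},
}
for summary in sort_worst_first(current_week):
    print(summary["id"], summary["class"], summary["prob"])
```

With this ordering, students most likely to end up as low performance appear first, which matches the first row described for Fig. 4.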
Once teachers detect candidate students, they can click on them and get detailed information for diagnosis. For this task, the system provides the weekly point graphs and trend information described in section IV (representation). Section VI analyses in depth the information provided by point graphs and trends for some typical students. The system also provides access to complete student activity logs including all student accesses, completed tasks, assigned mazes, solutions sent, code from solutions, etc. This is the lowest-level information and can amount to a very large volume of data for a single student. Teachers have always relied on this information to diagnose students, and it is always required for proper diagnosis. The proposed system processes this information automatically, along with the described predictions and graphs. This process helps teachers navigate information faster, diagnose more easily and be more efficient in helping students, but it never substitutes ground-level information.

B. Best Moment for Intervention
Student diagnosis is highly dependent on the subject and on task time-frames. Patterns for a single-project-based subject with only one final submission are different from those for a subject requiring by-weekly task submissions. The subject considered here asks students to submit solutions to many mazes one by one, but with no specified time-frame for individual mazes. Instead, mazes are grouped into stages with two intermediate deadlines for stages one and two, and a final subject deadline for the rest at the end of the semester. Intermediate deadlines were placed in weeks five-to-six and eight-to-nine.
Time-frames are highly important because they condition student workload. Students tend to accumulate work near deadlines. Although the system was designed with incentives to prevent this behaviour [21], it was only slightly mitigated. This greatly influences predictions and their importance. For instance, some students may not work at all during the initial weeks and perform very well later. Discriminating these students early from those not willing to work could be very difficult. Moreover, students with difficulties may work from the start and have confusing results and predictions, which could hinder teacher diagnosis at first.
As with a viral infection, symptoms may not be clear until an initial time-frame has passed. Understanding these time-frames and detecting spots where diagnosis could be most accurate is relevant for teacher intervention. Intervention could be most effective when performed on time: interventions that come too early may target students who do not require them, and interventions that come too late may be ineffective due to lack of remaining time.
To find the best moments for intervention, Fig. 6 shows all performance predictions for fifty test students. These test students have been selected randomly from the three hundred and thirty-six that form our complete sample. Fifty is approximately 15% of the sample, a common proportion to hold out for testing Machine Learning algorithms. For this study, this means that our Machine Learning SVM models have been trained with two hundred and eighty-six students and these fifty have been left out for out-of-sample tests. This is a common practice to estimate how well trained Machine Learning algorithms perform with new, previously unseen data. As these fifty students come from the main sample at random, it is appropriate to assume that both represent the same distribution. We use only test students because they represent the actual accuracy of the prediction system. Predictions are shown using three different symbols for performance groups: × low, · medium, / high. These symbols have been selected to help visually identify predictions in Fig. 6. Weeks one to ten are semester weeks, whereas week eleven shows the final result of students. Students are identified by an anonymous number and visually grouped by their final marks to simplify analysis. Although performance predictions vary over time, there exists a week for every student from which predictions stabilize. This week is highlighted with a background colour: red for low, grey for medium, green for high performance.
Fig. 5. Visual representation module for a random student at semester week 7 (example). Three point-graphs show by-weekly predicted probabilities for high/medium/low performance (current week, 7, highlighted with a vertical rectangle). Each point represents a probability prediction. Regression trend lines (dashed) are calculated from individual probabilities to show evolution over time. On the right side, the control panel summarizes student status on the current week (7): probabilities {0.85, 0.15, 0.00} for {high/medium/low} performance {green/grey/red} and arrows indicating whether each probability tends to increase or decrease. Inclination of the arrow represents increase/decrease velocity.
Fig. 6. Weekly performance predictions for all test students (× low, · medium, / high). Highlighted cells indicate predictions becoming stable, revealing earliest moments for accurate student classification.
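The notion of a stabilization week used in Fig. 6 can be made concrete with a short sketch. It assumes that a prediction is "stable" from the earliest week after which the predicted class no longer changes up to the last available week. This definition, the function name and the sample sequence are assumptions, as the text does not give the exact rule.

```python
def stabilization_week(weekly_classes):
    """Return the 1-based week from which the predicted class never changes.

    weekly_classes: list of predicted class labels, one per week, in
    chronological order.
    """
    if not weekly_classes:
        raise ValueError("at least one weekly prediction is required")
    last = weekly_classes[-1]
    week = len(weekly_classes)
    # Walk backwards while the previous week agrees with the last one.
    while week > 1 and weekly_classes[week - 2] == last:
        week -= 1
    return week


# Hypothetical ten-week prediction sequence for one student.
sequence = ["medium", "low", "medium", "high", "medium",
            "high", "high", "high", "high", "high"]
print(stabilization_week(sequence))  # prints 6
```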
From the analysis of Fig. 6, some visual rules can be inferred:
• Best performance students tend to stabilise their prediction during weeks five-to-six, coinciding with the first deadline. Most students classified as medium or high in both weeks five and six end up as high performing. Moreover, only student 3 ends up as low performance with this classification. This simple visual rule is a strong candidate for identifying likely high performers.
• Most students with two consecutive low performance predictions at weeks five and six end up in the medium or low performance groups. Only students 31 and 9 end up as high performance, with borderline results, as they are the first in Fig. 6 with this classification (students are ordered by final marks).
• It seems quite difficult to identify students that will end up as low performance. In weeks seven to ten they seem to increase their efforts in an attempt to save their final result. That is clearly shown in Fig. 6 by an increase in medium and high classifications. This also seems to happen with students that end up as medium performance. It might be due to a lack of information to get better predictions or, most probably, to an actual impossibility of predicting which borderline students will be able to save their course with a final effort.
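The first two visual rules above can be written as a small decision function. This is only a sketch of the rules as stated in the text. The function name and the returned labels are hypothetical, and combinations not covered by the rules are reported as undetermined.

```python
def week_5_6_rule(week5, week6):
    """Apply the visual rules inferred from weeks five and six.

    week5, week6: predicted classes ("low", "medium" or "high").
    Returns a hypothetical label describing the likely final group.
    """
    upper = {"medium", "high"}
    if week5 in upper and week6 in upper:
        return "likely high"
    if week5 == "low" and week6 == "low":
        return "likely medium or low"
    return "undetermined"


print(week_5_6_rule("medium", "high"))
print(week_5_6_rule("low", "low"))
print(week_5_6_rule("low", "high"))
```

The exceptions reported in the text (student 3, and students 31 and 9) show that such rules are heuristics for selecting students to review, not definitive classifications.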
From this analysis, it seems that weeks five-to-six represent a good moment for discriminating between high performance students and medium-to-low performers. This could also be a good moment for teacher intervention, as symptoms seem to be highly descriptive in many cases. Teachers could use those weeks to analyse the described cases in depth and look for student problems they can address to give students an effective push upwards. These conclusions are also supported by prediction accuracy, as shown in Fig. 7. Concretely, of weeks five and six, week six has the better accuracy results. Both weeks show 70% accuracy for low performance predictions, whereas week six shows 84% accuracy for high performance. Reasonably, medium performance is the most difficult to predict. However, this problem is mitigated by the high accuracy of predictions for the high performance group and the visual aid of Fig. 6.
Accuracy results from Fig. 7 are obtained only from test students. They are the proportion of correctly classified students, comparing their highest-probability classification with their actual final class. The quality of these accuracy results is open to discussion. They are probably subject to high variance, as N=50 is a small sample. Moreover, they could have been improved by also considering the second most probable class given by the SVM classifiers. For misclassified students, this second option is usually correct and tends to be within a narrow probability margin of the first option (typically < 0.05). However, further work on this topic has been left out, as it was not the focus of this research.
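The accuracy measure described above, and the variant that also accepts the second most probable class, can be sketched as follows. The function name, the `top_k` parameter and the sample data are illustrative assumptions and do not reproduce the values in Fig. 7.

```python
def accuracy(predictions, actual, top_k=1):
    """Proportion of students whose actual class is among the top_k
    most probable predicted classes.

    predictions: dict student_id -> dict class -> probability.
    actual: dict student_id -> final class.
    """
    if not predictions:
        raise ValueError("no predictions given")
    hits = 0
    for sid, probs in predictions.items():
        ranked = sorted(probs, key=probs.get, reverse=True)
        if actual[sid] in ranked[:top_k]:
            hits += 1
    return hits / len(predictions)


# Hypothetical predictions for two test students.
preds = {
    1: {"high": 0.50, "medium": 0.45, "low": 0.05},
    2: {"high": 0.20, "medium": 0.38, "low": 0.42},
}
final = {1: "high", 2: "medium"}
print(accuracy(preds, final), accuracy(preds, final, top_k=2))
```

In the second student, the actual class is the runner-up by a margin of 0.04, which is the kind of case where considering the second option would raise the measured accuracy.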

VI. Case Examples Discussion
Results presented in this paper have been obtained by the system in past courses of the subject Mathematics I. These results include a total of 400 first-year students, 336 of whom actively participated in the practical lessons and used the system. 286 students were used to train the SVM classifiers and 50 were reserved for validation tests. All results presented in this paper refer to validation tests, as they represent out-of-training-sample probabilities that can better estimate actual application results. The original sample of 336 students was composed of students with ages A ranging from 18 to 21, A~N(18.8, 1.33), of whom 56 were female (16.7%) and 280 were male (83.3%).
To exemplify the inner workings of the system and to discuss its utility, three paradigmatic student cases have been selected for more detailed analysis. For each selected student, probability point graphs and summaries for weeks three, five and seven are shown and discussed. Results in these weeks give an idea of how progression can help in student diagnosis. As discussed previously, the system gives teachers access to all this information in addition to students' ground-work (tasks, solutions to mazes, etc.).
The three examples have been grouped into two subsections: stable students and an unsteady student. The stable students are two students who end up in the high and low performance groups respectively. Both exemplify the behaviour commonly observed in these groups: they either work hard to get the best marks or are not interested in the subject and make only minimal attempts. On the other hand, the unsteady student represents most of the students. Although this example ends up as medium performance, many other students behave similarly and end up as low performance, and some of them as high performance. It clearly exemplifies why it is so difficult to accurately predict their behaviour and, consequently, their expected performance.

A. Stable Students
Fig. 8 shows graphs for a high performance student who finds the right path very early and follows it up to the finish. This is student ID 12 from Fig. 6. The student achieves the final classification label at the 3rd week of the course, which is quite remarkable and similar to other high performing students.
The student shows a clear trend to high performance right from the 3rd week, after just one week of lower performance (the 2nd one). Trend predictions from the 3rd week clearly show that probabilities are not accidental, but aligned with what is probably a great student: high performance increasing, both medium and low performance probabilities decreasing. The 5th week confirms the prediction, but with one worrying detail. Although proportions are comparable to the 3rd week, the 5th week has introduced a slight increase in low performance probability, at the expense of high performance. There is still nothing to worry about, but this detail might signal overconfidence in a student who could be only partially exploiting his/her capabilities. It could be a hint for the teacher to simply ask the student about his/her progress and then provide some extra motivation for hard work.
However, the 7th week clears all doubts about the progress. The student has reached a 0% probability of failure. Performance has a great chance of ending up high, and could be medium with quite a small probability. These results tell the teacher not to worry about this student, who is clearly well focused.
In contrast with these results, Fig. 9 shows predictions for a low performing student (ID 43 from Fig. 6) who has shown almost no working effort since the beginning.
This student shows a clear trend to failure right from the 1st week, with 21% probability of high performance versus 40% of low performance. These probabilities are maintained and worsened by the 3rd week, which shows a 50% low performance probability along with a strong tendency to increase. Although very early in the course, it would be interesting for the teacher to consider whether the student has problems and can be helped. However, values seem to point to a lack of interest. The 5th week shows an attempt at changing direction. Clearly, the first four weeks are very poor for the trend of the student, whereas precisely the 5th shows a change. This change identifies the student making a last-minute effort before the deadline. Judging by the probabilities, this work has not been enough to get great marks at the first deadline. However, this situation is appropriate for the teacher to check student logs to see what has been achieved with respect to the deadline. This information could be valuable to help guide the student, in case there is some interest.
As in the previous example, the 7th week gives clear evidence of the kind of student and where the trends are heading. The small increase in the 5th week seems an illusion. Most probably, the student felt incapable of recovering lost time, partially failed at the first deadline and abandoned work. It remains unclear whether some teacher intervention could have helped the student, either at the 3rd or the 5th week. However, prediction graphs and trends clearly help predict student progress beyond individual predictions, and give some interesting hints that teachers could use to diagnose and help some of these students. In any case, intervention strategies and their results lie outside the scope of this work.

B. An Unsteady Student
As stated before, the next case, shown in Fig. 10, represents one of the most common types of student. Most students do not follow a clear pattern; instead, their numbers change and evolve in complicated ways. These behaviours explain the difficulties the SVM has in predicting them, making the medium performance class the most difficult to predict accurately.
In the 1st week the student seems to start well, getting a 52% probability of high performance. But this start rapidly decays and the medium-to-low classes gain much momentum. The summary for the 3rd week clearly shows 21% for high performance with an arrow that indicates a fast downward tendency. Due to the promising start of the 1st week, tendencies are sharper and the student seems to head directly to low-to-medium performance.
However, the 5th week shows more balanced probabilities with flatter tendencies. The student seems to be climbing again and recovering. Numbers for this week are not conclusive, identifying a student who is difficult to classify. This kind of student probably represents the group that could benefit most from teacher intervention. The evolution shows interest in the subject, as the student is clearly working to pass, but it is unclear what exactly is happening. The student may lack proper scheduling, may need help with some concepts or problems, may have temporary problems, etc. It is interesting for the teacher to learn more about the student in order to try and help.
After the first deadline and getting into the 7th week, the student has improved all probabilities overall, gaining much more momentum towards high performance. However, the difference between the 6th and 7th weeks indicates that these numbers are largely based on a last-minute effort for the first deadline. After the first deadline, the student is again losing momentum, probably due to some relaxation after achieving an adequate result. As previously stated, the student ends up in medium performance, which is consistent with a typical student. The analysis of the graphs is very interesting because, even with the difficulties in classifying the student, there are many valuable clues. This suggests again that exploiting the time-related nature of predictions has great potential for providing insights into learning trends beyond the mere values of predictions.

VII. Conclusions and Further Work
This research started by creating a custom automated learning system to support Mathematics I, a first-year subject that introduces students to Computational Logic and Prolog Programming. This system included a performance prediction system based on Support Vector Machines. The complete system consisted of four main components:
• A computer game called PLMan, whose many different mazes are learning activities that students have to solve by programming in the Prolog language.
• A custom web-based automated learning system for teachers and students to interact based on PLMan mazes.
• A performance prediction system based on Support Vector Machines.
• A representation module for graphically showing performance predictions and student learning trends.
The performance prediction system and representation module have been designed to exploit the time-dependent nature of student data. The system produces probabilistic, consecutive, by-weekly predictions which are added into progression graphs. With these predictions and graphs, learning trends are calculated using linear regression. All this information is shown and summarized in visual ways designed to help teachers diagnose students.
Moreover, by filling a table with by-weekly individual class predictions, some ways of understanding learning time-frames and selecting better moments for teacher intervention have been presented. Rules for selecting appropriate intervention moments are deduced by identifying the most accurate by-weekly predictions for each class, the prediction patterns in the table, and the moments when different student classifications stabilize.
The presented evidence produces some tentative initial answers to the research questions. First, it suggests that exploiting the time-dependent nature of student data is viable and desirable. It also suggests that frequent, probabilistic and cumulative predictions have potential for giving early insight into student learning trends. The example cases analysed have shown how student status, progression and trends can be inferred from the presented graphs, providing more information than the mere performance probabilities. Moreover, the evidence presented also indicates that there are methods to identify the most relevant time-spots for teacher intervention. The presented method is simple yet effective in the analysed cases. However, much research is required on this topic to develop proper, more elaborate and adaptive methods to select appropriate intervention moments.
Although this is just a first step in this direction, results are promising as an aid for teachers to be more efficient and effective in diagnosing and helping students. However, intervention strategies and their results have not been covered and are left for future research.
There are many debatable points in the presented research that represent valuable features for future development and improvement.
There is a question about the accuracy of predictions. Other works seem to obtain much greater accuracy results. Although this has not been a problem for this research, it remains to be analysed whether accuracy could actually be improved. Along the same lines, getting more data could help in creating classifications with greater granularity. Performance could also be broken into separate features, and individual progression could be predicted and tracked for each of them. Better and distinct graphs can be designed to give teachers different views that could help them understand student learning trends faster and more deeply. Finally, analysing teacher interventions and their results in learning trends would be a further step.

VIII. Limitations
This is an initial empirical study that has gathered evidence to support the capabilities of the presented performance prediction system. Although the gathered evidence shows that student performance trends can be inferred from by-weekly performance predictions, it is important to acknowledge that it has been obtained with a small sample of students (N=336), all from the same university and all in their first year. Bias and limited sample size are unavoidable in this study, and so more studies and more data are required to empirically assess the validity of the proposed system as a way to predict student performance, trends and best moments for teacher intervention.