Using Learning Analytics to Understand K–12 Learner Behavior in Online Video-Based Learning

This research investigated the potential of learning analytics (LA) as a tool for identifying and evaluating K–12 student behaviors associated with active learning when using video learning objects within an online learning environment (OLE). The study focused on the application of LA for evaluating K–12 student engagement in video-based learning—an area of inquiry highlighted in literature as important but significantly under-researched. Results determined that the LA method could identify active-learning behaviors and that LA can play a valuable role in providing information on learner activity in autonomous K–12 OLEs. However, LA did not provide a complete picture of learner behavior and viewing strategies, highlighting the importance of a multi-method approach to research on K–12 online learner behaviors. It is anticipated the accessible approach outlined in this study will provide educators with a viable means of using LA techniques to better understand how learners interact with course content and learning objects, greatly assisting the design of online learning programs.


Introduction
Pre-COVID, online learning was already a growing trend within education, with 90% of universities in the U.S. offering some form of online education by 2014 (Bowers and Kumar, 2015). This trend has been accelerated by the advent of COVID-19, with UNESCO (2020) stating that due to the pandemic, one in five students worldwide were unable to attend face-to-face classes. While the COVID-19 situation is now somewhat resolved, a likely lasting impact will be an overall acceleration in the move to online learning (Brown et al., 2022; García-Morales et al., 2021). Some authors now argue that online learning is rapidly emerging as the predominant format for students to access higher education, and, as such, it is crucial that the substantial amount of generated data is effectively used by educators to enhance students' learning experiences (e.g., Maloney et al., 2022). In comparison to higher education, K–12 education has been identified as a relatively recent context for the adoption of online learning (Mayer, 2017), and although research into K–12 OLEs is growing, it still has a relatively narrow research base (Martin et al., 2021). Although both tertiary and K–12 institutions are increasing their adoption of online learning, it has been suggested that little is known about learner behavior within these environments (Winne, 2018).
The move to online learning, the increased adoption of digital tools, and subsequent advances in data quantity and quality have created a relatively new field of research within the learning sciences (Baker et al., 2016). This new field has been termed "learning analytics" (LA), and its aim is to use learner data to develop a greater understanding of learner behavior, particularly in online environments (Verbert et al., 2012). A commonly cited definition of LA comes from the 1st International Conference of Learning Analytics, which defined it as "(T)he measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs" (Siemens and Long, 2011, p. 34). LA has been promoted as a necessary and effective tool for understanding this new teaching and learning paradigm (Pardo, 2014). However, while LA holds undoubted promise for advancing the field of education, early results have been mixed, and there are increasing calls for more learner-centered and teacher-accessible approaches (Kitto et al., 2017).
Responding to the paucity of existing research into online learner behavior in K–12 education, we conducted a study into the effectiveness of LA to identify online learning behaviors. Data were gathered within an OLE that featured courses for year 11 and 12 students in Physics, Chemistry, and Biology, developed by Macquarie University in Australia. The study was designed to specifically examine some of the affordances and limitations of LA as identified in the literature (Ferguson et al., 2019; Maloney et al., 2022; Ochoa, 2022). It applied an innovative LA method to identify learner behaviors and explore evidence of active learning in the viewing of video objects. LA data were supplemented by a questionnaire that further investigated the students' behaviors, and the motivations for these, as identified in the LA data. The methods used commonly available data provided by a video-hosting service and relatively straightforward mathematical formulae to identify patterns of student engagement with video learning objects, as defined by Chi and Wylie's (2014) ICAP (Interactive, Constructive, Active, Passive) framework. By adopting the ICAP framework to interpret the click-stream data, the study aligned the data analysis method with established learning theory, an approach advocated by other researchers as supporting a more effective pedagogy-first design (e.g., Macfadyen et al., 2020). This approach, and the study's accessible LA method, acknowledges the importance of learning design theory, and of the technical and operational capabilities of education practitioners, to the success of such innovations (Ferguson et al., 2019; Macfadyen et al., 2020; Rosé et al., 2019).
Data were collected and analyzed in response to these research questions:
1. To what extent do students participate in active learning behaviors when engaging with videos in the OLE?
2. To what extent is learning analytics an effective tool for identifying patterns of student behavior associated with active learning when engaging with videos in the OLE?
A Review of Literature

Learning Analytics
LA has been touted as an effective method to identify student engagement and success, as well as the quality of learning within OLEs, in an efficient and cost-effective manner. This area of research has developed in response to the opportunities and challenges afforded by the vast increase in educational data produced by these new learning environments (Behrens and DiCerbo, 2014). While there has been significant development in the decade since its inception, LA is still described by some as being in a proof-of-concept phase, with limited research supporting its predictive power and little credible evidence of large-scale benefits to learners (e.g., Ferguson et al., 2019; Viberg et al., 2018; Zilvinskis et al., 2017). Despite its potential, Maloney et al. (2022) state that few studies have fully explored the learning data derived from digital environments such as learning management systems (LMS). They suggest that the limited use of such data for informing teaching and learning practices, including corresponding research that aids educators in designing more informed and targeted resources, hinders the optimization of learning and the environments in which it takes place (Maloney et al., 2022). Moreover, as LA uses more complex modelling techniques such as those generated by machine learning, it becomes difficult for researchers to understand how models generated through this process work, and whether they would apply to other datasets (Rosé et al., 2019).
Recent discussion of some limitations of LA research can be found in the developing field of multimodal learning analytics (MMLA). Described as a subfield of LA, MMLA serves an essential purpose in addressing educational contexts where capturing information beyond computer screen activities is valuable (Ouhaichi et al., 2023). MMLA encompasses the collection and integration of data from multiple sources, enabling a more comprehensive understanding of the various dimensions of learning and learning processes (Giannakos et al., 2022; Ochoa, 2022; Ouhaichi et al., 2023). This expansion is achieved by harnessing advancements in machine learning (ML) and cost-effective sensor technologies that act as a "virtual observer and analyst" of non-digitized learning activities (Giannakos et al., 2022). This new method acknowledges the risk with LA of oversimplifying, or even misunderstanding, the learning process if the focus is placed solely on a single type of trace data recorded in the logs of digital tools (Ochoa, 2022). This limitation results from the lack of available contextual information, which has been identified by the educational research community as one of the main criticisms of LA (e.g., Ochoa, 2022). The bias in LA towards learning contexts heavily reliant on digital tools can lead to a phenomenon known as the Streetlight Effect (Ochoa, 2022). This bias manifests as relying on a particular learning trace, such as accessing materials on the LMS, to infer a learning behavior such as engagement, simply because those data are readily available and without considering whether there is a strong theoretical or empirical basis for treating access as a strong predictor of engagement (Ochoa, 2022). MMLA researchers argue that the analysis of multimodal data allows for a more comprehensive analysis of learning contexts and provides a more holistic understanding of student engagement (Giannakos et al., 2022; Ochoa, 2022). However, proponents of the subfield have also identified potential limits to MMLA's advancement, such as the technical complexities associated with implementing multimodal analytic systems and the combination of expertise (e.g., learning scientists, data scientists, and computer scientists) required for MMLA studies (Giannakos et al., 2022; Ochoa, 2022).

Learning Analytics and Video-based Learning
Early studies investigating student behaviors associated with the viewing of video objects include those undertaken by Kim et al. (2014) and McGowan et al. (2016). Kim et al. (2014) used data harvested from 862 video-viewing sessions from a MOOC (Massive Open Online Course) to investigate student engagement. The McGowan et al. (2016) study involved a smaller cohort (80 students); however, it also applied a questionnaire to provide further insight into student viewing behaviors. Both studies analyzed student in-video engagement, including rewinding, skipping ahead, and dropping out (exiting a video before completion), which revealed what the authors described as "peaks" and "drop-offs" in the data visualization. The studies interpreted viewing behaviors such as rewinding, skipping ahead, and dropping out as evidence of disengagement, and argued that with more engaging videos students may stay longer, potentially enhancing learning outcomes. Both studies found that students watched more of a video in their first viewing session, and that in subsequent sessions there was more dropping out and "rewatching" (a section of the video being watched multiple times), which they interpreted as disengagement (Kim et al., 2014; McGowan et al., 2016). Kim et al.'s study concluded there was a relationship between longer videos and higher drop-out rates, which they argued may be due to students' short attention spans and/or feelings of boredom, leading to their recommendation of a "6-minute rule" for video length (Kim et al., 2014). However, countering this, Lodge et al. (2017) argued that the focus on high-level taxonomies (as well as the underdeveloped nature of the research field) has led to a "proliferation of heuristics" (p. 2) in video object design that remain largely untested, pointing to the "6-minute rule" as one example of this. Furthermore, of the studies on video-based learning using randomized or semi-randomized conditions, few have yielded conclusive findings (Lodge et al., 2017).
A more recent study completed by Zhang et al. (2022) Kim et al.'s (2014) earlier studies, engagement was also defined by an accumulated count of watching and rewatching. Zhang et al. (2022) similarly identified a negative correlation between video length and the level of learner engagement (although not specifically adhering to the "6-minute rule"), as defined by the percentage of the videos students watched. However, Lagerstrom et al. (2015) investigated many of the same behaviors as Kim et al. and McGowan et al. and reached a different conclusion: that their data did not support a "6-minute rule" for video length to maintain student engagement. They found that although dropout rates (the rate at which students leave the viewing session) may be higher when individual sessions are considered, students often returned to a video, and that when multiple viewing sessions were combined, the average percentage of a video watched by a student could be close to 90% (Lagerstrom et al., 2015). They argued these results disputed Kim et al.'s earlier "6-minute rule" for optimal video length.
Further studies have used clickstream data to explore the relationship between learner interaction with video objects and academic results (e.g., Chen et al., 2016; Stohr et al., 2019). Chen et al.'s study analyzed clickstream data associated with actions such as playing, pausing, and seeking, presenting this information to instructors through a tool called "PeakVisor." An assumption underpinning this tool was that an area with a high occurrence of pausing or backward seeking represented a difficult or confusing segment of the video, although this was not confirmed through participant checking. A similar result was also found by Stohr et al. (2019), although their study did not investigate "in-video" engagement beyond an initial action such as playing, pausing, seeking, or stopping. The majority of these studies, as well as more recent ones (e.g., Maloney et al., 2022; Zhang et al., 2022), have focused on singular analysis of clickstream data and, as such, risk suffering from Ochoa's (2022) "streetlight effect." A further limitation of many LA studies into video-based learning is that the main measure identified for quantifying engagement has been watch time or the median of normalized engagement time, that is, the percentage of watch time relative to the total video duration (Maloney et al., 2022). However, some authors argue this does not provide a direct measure of viewer engagement (e.g., Chavan and Mitra, 2022; Chen and Thomas, 2020). For example, Chen and Thomas claim it is possible for video viewers to start playing a video while simultaneously being engaged in a secondary task. Chavan and Mitra (2022) further note that considering only the number of views or watching patterns does not provide insights into the specific motivations behind these actions, which could vary based on factors including perceived importance, confusion, or engagement.
Responding to this, and to provide additional insights into viewer engagement, Chen and Thomas (2020) simulated an OLE within a laboratory setting, where participants viewed lecture videos containing different levels of "within-video" motion. Participants were then required to rate the engagement levels of the videos and complete recall and knowledge transfer tasks. The study found agreement amongst students that "hand drawn" videos were more engaging, which the authors state was consistent with earlier studies on video-based learning (e.g., Guo et al., 2014). However, the study did not find a significant correlation between high levels of perceived engagement and better recall performance, only a small positive effect for the "low prior knowledge" cohort. In their study, Chavan and Mitra (2022) designed a dashboard that allowed students to voluntarily report, in real time, their cognitive-affective states during video lectures. The collected data were then presented back to instructors via their analytics dashboard (Tcherly). However, as the study focused on the usability of the prototype dashboard for instructors, it provided limited analysis regarding the types of student engagement in video-based learning.

The ICAP Framework
The review of literature to this point has identified few studies providing any analysis of the types of student engagement with video objects beyond simply "view-counts," and none that has adopted a pedagogical framework to help better understand that engagement. However, a study by Dodson et al. (2018) did apply a framework in an attempt to define the type of engagement as captured via click-stream data. The analysis framework through which students' viewing behaviors were identified and defined in Dodson et al.'s study was the ICAP framework for active learning (Chi and Wylie, 2014). The framework divides and ranks active learning by (sub)modes of engagement labelled "Interactive," "Constructive," "Active," and "Passive" engagement. These terms form the acronym ICAP and are expressed in a hierarchy of I>C>A>P. Chi and Wylie (2014) argue that this hierarchy of engagement corresponds with associated levels of learning, with "Passive" being the lowest and "Interactive" the highest. They refer to this as the ICAP hypothesis (Chi and Wylie, 2014).
The ICAP framework assumes, with support from experimental studies and a meta-analysis of existing studies, that observable behaviors reflect a learner's underlying cognitive engagement (Chi et al., 2018). Chi and Wylie (2014) specifically identify "pausing," "playing," "fast-forward," and "rewind" as examples of active engagement within video-based learning. Dodson et al. (2018) extended these signifiers by adding browsing, searching, changing playback speed, and rewatching, while passive engagement was revised to include watching a video linearly, without interaction. Their study also introduced a specially designed video player (ViDeX) that allowed additional behaviors to be executed, such as video-highlighting and note-taking. Dodson et al. (2018) argue that when provided with the right tools, learners will engage in active learning behaviors as defined by the ICAP framework. Their approach is consistent with recommendations that LA methods should have a solid grounding in learning theory (Ferguson et al., 2019; Macfadyen et al., 2020). As Ferguson et al. (2019) commented, "Validating analytics would involve clearly linking behaviours and measurable outcomes with pedagogy and with learning benefits and employing an appropriate and robust scientific method" (p. 52).
However, a significant limitation of Dodson et al.'s (2018) study was that trace data were logged from a very small sample comprising only 28 students. They highlighted the need for further studies with larger cohorts before any substantive conclusions might be advanced. Identifying it as a potentially valuable framework for embedding LA research, we adopted the same modified ICAP framework as used in Dodson et al.'s (2018) work. Our study also applied a similar LA method (with an expanded participant base) but included a questionnaire to better understand students' behaviors as they align with the ICAP framework, including their underlying motivations. To reduce confusion, when a specific mode within the ICAP framework is referred to, it is capitalized (e.g., Interactive, Constructive, Active). However, all of these modes fall under the umbrella term of active learning. The modified ICAP framework with identified modes of engagement, associated behaviors, and aligned motivations is presented in Table 1.
In summary, a number of limitations have been identified regarding LA in general and LA as applied to video-based learning. First, there is a dearth of studies completed in K–12 contexts, as well as in video-based learning more generally. Second, LA has been critiqued for often taking a "black box" approach to its methodology (Rosé et al., 2019), as well as for a disconnect between its analysis and robust pedagogical frameworks (Ferguson et al., 2019; Macfadyen et al., 2020). Third, recent studies have highlighted a potential "streetlight" effect in LA and have recommended incorporating multimodal data into its analysis method (Giannakos et al., 2022; Ochoa, 2022). However, MMLA approaches further exacerbate the technological hurdle and specialized knowledge requirement that currently discourage many educators from using LA methods in their practice.
Table 1
Interactive: The highest mode of engagement. Like Constructive, it is generative, but with the additional requirement that the generative output is collaboratively created. Associated behavior: collaborating with a peer or teacher to take notes or otherwise expand on the content of the video.

Research ethics
Research ethics clearance was obtained from Macquarie University before any data were collected (application number: #5201834454739).

The learning environment
The online learning environment (OLE) (Figure 1) in which data for the study were harvested was a learning program called HSC Study Lab, developed by Macquarie University to help improve learning outcomes for students in years 11 and 12 of high school in physics, chemistry, and biology. HSC Study Lab is a custom OLE developed by the university. All content within the OLE was delivered via pre-recorded video presentations, accompanied by simulated experiments, games, and animations, with assessment comprising traditional recall-style quizzes with automated feedback. Course content was developed by experienced teachers in the Australian New South Wales (NSW) high school system and built by learning designers and educational technologists working at Macquarie University. The lesson content aligned with the NSW Higher School Certificate (HSC) curriculum and was designed to support students as they prepared for their end-of-year exams in years 11 and 12. Students were enrolled for 12-month periods and could access the learning material at any time over that period. HSC Study Lab exists in a digital ecosystem through which learner behavior in the form of trace data can be observed, recorded, and analyzed. Because it is designed around an "anywhere, anytime" learning model, students are completely independent within the environment. As such, it is challenging for course designers to evaluate student learning and interaction with the program content.

Figure 1
The Lesson Page Interface Showing an Animated Video, a Tab to an Assessment Quiz as well as Additional Resources.

Data Methods and Analysis
A video from the OLE was randomly selected from the year 12 biology program, and aggregate second-by-second user interaction data were analyzed. The video was an animated lecture on the innate and adaptive immune system. It was 9 minutes 44 seconds long and the total number of plays at the time of analysis was 870. A decision was made to use a video from the year 12 biology program as it comprised the largest cohort (of the three programs) and, as such, constituted the largest possible sample size.

LA Data Capture
The first stage of LA research is the capture of data (Pardo, 2014). In this study, the main type of data captured was student actions while viewing the video objects in the OLE. The OLE used an external hosting service for streaming videos, and this service allowed the capture and visualization of data associated with watching the video (Figure 2). In Figure 2, the timestamp at the bottom of the bar indicates points of time throughout the video, while the figure at the end of the bar records the overall percentage (not necessarily sequential) of the video watched. Finally, the colour of the bar indicates which sections were watched, and how often. The hue of the coloured bands within the bar indicates whether a section was rewatched, with the colour changing in intensity (darker green and then yellows and reds) depending on the number of times that section of the video was watched. The relationship between the colours within the bar and the number of times a section was watched is illustrated in Figure 3. The study was limited by the data sets available through the video hosting service; therefore, the analysis was restricted to identifying peaks in viewership caused by students rewatching or skipping sections of the video. Additional behaviors and/or reasons for behaviors could not be identified through click data alone. The addition of a questionnaire was essential for providing more accurate insights into the reasons for students' viewing behaviors.

Understanding LA Data
It was possible to identify different modes of active learning in the trace data. For example, rewatching sections of a video, along with pausing or skipping, are behaviors consistent with Active engagement. Conversely, watching a video without otherwise acting on it corresponds with Passive engagement (Chi and Wylie, 2014). Each video also had a visualization of the aggregate data associated with all viewers and viewing sessions (Figure 4). However, it should be noted that the total number of views is not the same as the total number of viewers, as viewers may rewatch sections of videos multiple times. Peaks in the graph are caused by students rewatching sections of the video, while dips are caused by students dropping out or skipping ahead.
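The distinction between views and viewers can be illustrated with a minimal sketch. It assumes each viewing session is available as a list of (start, end) playback segments in whole seconds; this session format, the function name `aggregate_views`, and the sample data are hypothetical and are not the hosting service's export format.

```python
def aggregate_views(sessions, duration):
    """Count how many times each second of a video was played.

    sessions: list of sessions, each a list of (start, end) playback
              segments in whole seconds, end exclusive (assumed format).
    duration: video length in seconds.
    """
    counts = [0] * duration
    for segments in sessions:
        for start, end in segments:
            # Clamp each segment to the bounds of the video.
            for t in range(max(start, 0), min(end, duration)):
                counts[t] += 1
    return counts


# Three illustrative sessions on a 10-second video (made-up data).
sessions = [
    [(0, 10)],           # Passive: one linear viewing
    [(0, 6), (3, 10)],   # Active: skips back and rewatches seconds 3-5
    [(0, 4)],            # drops out after 4 seconds
]
views = aggregate_views(sessions, duration=10)
print(views)
```

In this toy example the count at second 3 exceeds the number of sessions, because one viewer played that second twice. Rewatching therefore raises the aggregate curve, while dropping out lowers it.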

Definition of a Peak
The hosting service provided an aggregated display of student engagement with the video, which was revealed as a series of peaks mapped against the timestamp for the video. However, there was a general decrease in viewership across the length of the video caused by user dropout, which tended to mask the significance of the peaks. Therefore, a working definition of a peak that took into consideration this overall trend was needed. When the data were transposed to an Excel worksheet and converted to seconds, it was necessary to apply a formula that would account for the general decrease in viewership caused by the dropout rate, as well as reduce the interference generated by hundreds of in-video click interactions. Such an approach is an example of an ad-hoc analysis technique, which has been used successfully in other studies (e.g., Pardo, 2014).
$N$ represents the number of students enrolled in the class, and $P(t)$ is the percentage of total viewership $V$ at time $t$. Note that $V(t)$ can be larger than $N$ as students can rewatch sections of the video, and each time a student returns to a time instance $t_i$, $V(t_i)$ increases by 1. The viewership as a percentage over time is calculated by this formula:
$$P(t) = \frac{V(t)}{N} \times 100$$
A time interval earlier in the video was selected to act as a comparison point ($\Delta$) against which changes in viewership could be identified. This was done to account for the general decrease in viewership over time. The comparison point ($\Delta$) was set at 20 seconds to identify specific points of interest. The formula for expressing this is
$$D(t) = P(t) - P(t - \Delta)$$
As there were almost continual changes in viewing percentages, a measure for meaningful change was required. A trigger (represented as $\tau$) was therefore created that would call a peak only when an increase in viewership was above a given percentage. The trigger for calling a peak was a 5% increase in viewership, which meant that if there was a 5% increase in viewership over any 20-second timeframe, a peak was called. Under these conditions, a peak is defined as
$$D(t) = P(t) - P(t - \Delta) \geq \tau, \qquad \Delta = 20\ \text{s}, \quad \tau = 5\%$$
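The peak rule described above can be sketched in a few lines of Python. This is an illustrative reading rather than the study's spreadsheet: the function names, the choice to treat the trigger as "at least 5 percentage points," and the synthetic view counts are all assumptions.

```python
def percent_viewership(views, enrolled):
    """Convert per-second view counts into percentage viewership."""
    return [100.0 * v / enrolled for v in views]


def peak_seconds(views, enrolled, lag=20, trigger=5.0):
    """Return the seconds at which viewership has risen by at least
    `trigger` percentage points relative to `lag` seconds earlier."""
    pct = percent_viewership(views, enrolled)
    return [
        t
        for t in range(lag, len(pct))
        if pct[t] - pct[t - lag] >= trigger
    ]


# Synthetic data: steady dropout of one view per second, plus a burst
# of 40 extra views (rewatching) between seconds 40 and 44.
enrolled = 200
views = [200 - t + (40 if 40 <= t < 45 else 0) for t in range(60)]
flagged = peak_seconds(views, enrolled)
print(flagged)
```

Comparing each second with the value 20 seconds earlier, rather than with the start of the video, is what lets the rule ignore the steady decline from dropout: in the synthetic data the decline alone yields a negative difference everywhere, and only the rewatching burst crosses the trigger.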

Questionnaire
To enhance interpretive validity, a web-based questionnaire was developed and sent to the year 12 students enrolled in the biology program. The questions and results can be found in Appendices A and B. Year 12 students were selected as they were likely to have had more experience in the program overall and possibly greater familiarity with the format and style of the videos. The questionnaire was emailed to students and 106 responses were received. While the total number of students enrolled in the biology program at the time of the study was 8,142, enrolment was purchased in 12-month subscriptions and the subject itself could be taken at any time by a student over that period; thus, it is difficult to know how many students were actively participating in the OLE at the time the questionnaire was sent out. However, despite the relatively small number of respondents, given the extent of agreement between respondents, we were able to calculate high confidence intervals for the results (see Appendices A and B).
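The study does not state which interval method was used. As one possible illustration, a Wilson score interval for a response proportion can be computed as follows; the method choice, the function name, and the example count of 95 agreeing respondents are assumptions, and only the sample size of 106 comes from the text.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical item: 95 of the 106 respondents report a behavior.
low, high = wilson_interval(95, 106)
print(f"{95 / 106:.3f} ({low:.3f}, {high:.3f})")
```

When most respondents give the same answer, the interval is comparatively narrow even at this sample size, which is the sense in which strong agreement can offset a small number of respondents.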
The purpose of the questionnaire was, in the first instance, to triangulate the findings of the LA method as well as to evaluate inferences that learner intentions behind identified behaviors conformed with active learning (as defined by the ICAP framework). This was needed because for learner activity to represent active learning, there must be a corresponding intent on the part of the learner (Bonwell and Eison, 1991; Chi and Wylie, 2014; Scardamalia and Bereiter, 2006). An additional purpose for the questionnaire was to identify non-program-based engagement with the video-based lessons, such as note-taking or discussing the videos with classmates and teachers. The analysis of trace data allowed the researchers to identify patterns of behavior that could be categorized as passive or active (including all submodes, as defined by the framework), while the questionnaire augmented these findings by asking participants to report on those behaviors. For example, as it is possible within the trace data to identify rewatching of sections of the video, a question specifically asked participants to confirm that they participated in that behavior. If the participants responded in the affirmative, then there is support that the patterns of behavior identified in the trace data are an accurate reflection of learner behaviors. Furthermore, it is not enough that the behaviors conform to active learning; the intention or motivation behind the behaviors also needed to align with the observed (and reported) behavior. Therefore, additional questions were designed to elicit responses that provided more information about the motivations behind the behavior. The questionnaire comprised 12 questions, which were categorized as relating to either the "environment," "observable behavior," or "motivation/intention." The first three questions related to environment and were used to establish the context for learning (online and as individuals), while the responses to the following questions were categorized under "observable behavior" or "motivation/intention" and were further coded against the framework and mapped to the specific submodes of engagement within active learning.
Coding decisions were based on alignment of student responses with ICAP submodes and then independently blind-checked for accuracy by the coauthor. For example, item 10 asked participants whether, while watching a video, they skip back to rewatch parts of it. This behavior was identified by the LA method and, according to the ICAP framework, is indicative of Active engagement. Participants were then asked for the reasons why they rewatched the video. A univariate analysis was completed with students reporting as participating in the behavior (or not), along with a general frequency. This relationship between observed behavior and viewer motivation is illustrated in Table 2.

Results for all questionnaire items are presented in Appendices A and B. Appendix A summarizes questionnaire results for items categorized under "environment" and "observable behavior" and aligned with the submodes of the ICAP framework, while Appendix B does the same for "motivation/intention." In both Appendices, columns 1 and 2 record the primary category and item, column 3 the students' responses (options and short answer), and columns 4 and 5 the response count and alignment with the ICAP framework. By using the questionnaire, it was possible to more accurately determine student viewing behaviors within the OLE, including whether their underlying (and invisible to the LA method) motivations also aligned with active learning as defined by the framework.

Aggregate Data
Aggregate data were also harvested from the video, which had been viewed 870 times. This provided a relatively large sample size, which increased the validity of conclusions about patterns of engagement. Once data were entered into a spreadsheet, the graph shown in Figure 5 was generated. It was evident that there was a large initial viewership, a relatively even (and steep) drop-off until around the two-hundred-second mark, and then a series of peaks and troughs until the five-hundred-second mark, before another steep drop-off, ending with below 50% viewership. These peaks reflected collective engagement, generally produced by individual students rewatching the video. Although there was a decline in overall viewership caused by the dropout rate, this was countered by students rewatching specific sections of the video multiple times.
The formula was then applied, which allowed for a comparison of "peakiness" (height and width of peak) between data points.The data was then re-graphed and a visualization created (Figure 6).Along with the visualization, the formula revealed a total of six peaks at 152,222,263,332,382, and 449 seconds, with an average increase in height over the twentysecond timeframe of 9% and an average width equal to 15.83 seconds of video.
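The peak-detection step can be sketched in code. The following is a minimal illustration rather than the study's actual spreadsheet formula, which is not reproduced in this section. It assumes that peakiness at each second is the relative change in viewership compared with a point `lag` seconds earlier (20 seconds in this study), and that a peak is a run of consecutive seconds with a positive score. The function names `peakiness` and `find_peaks` and the sample counts are hypothetical.

```python
from typing import List, Tuple


def peakiness(views: List[int], lag: int = 20) -> List[float]:
    """Relative change in viewership at each second versus `lag` seconds earlier.

    views[t] is the number of views recorded for second t of the video.
    Seconds without a comparison point (t < lag) are scored 0.0.
    This relative-change form is an assumption, not the paper's stated formula.
    """
    scores = [0.0] * len(views)
    for t in range(lag, len(views)):
        baseline = views[t - lag]
        if baseline > 0:
            scores[t] = (views[t] - baseline) / baseline
    return scores


def find_peaks(scores: List[float]) -> List[Tuple[int, int, float]]:
    """Group consecutive positive scores into peaks.

    Returns one (start_second, width_in_seconds, max_score) tuple per peak.
    """
    peaks = []
    start = None
    # A trailing 0.0 sentinel closes a peak that runs to the end of the data.
    for t, score in enumerate(scores + [0.0]):
        if score > 0 and start is None:
            start = t
        elif score <= 0 and start is not None:
            peaks.append((start, t - start, max(scores[start:t])))
            start = None
    return peaks


if __name__ == "__main__":
    # Hypothetical per-second view counts with one bump caused by rewatching.
    sample_views = [10, 10, 10, 9, 8, 12, 12, 8, 7, 6]
    for start, width, height in find_peaks(peakiness(sample_views, lag=2)):
        print(f"peak at {start}s, width {width}s, height {height:.0%}")
```

Under these assumptions, the average peak height and width reported above would be the means of the third and second tuple elements, respectively.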

Figure 6
Graph Illustrating "Peakiness" of Data Over Time (axis label: peakiness over time, in seconds)

In the aggregate analysis, the peaks in the visualization (Figure 6) indicate multiple students rewatched specific sections of the video, which is evidence of Active engagement. As illustrated in Figure 6, the largest peak came at the 222-second mark and was an increase of 28% over the given timeframe. Further peaks occurred at 52, 263, 332, 382, and 449 seconds. This suggests that there was content within those sections of the video that students felt particularly engaged with. Whether that was due to interest, confusion, or difficulty of the subject matter could not be identified by the LA method alone. What was clear, however, was that there was non-random student engagement with the video in the form of rewatching specific sections. This analysis indicated patterns of behavior that aligned with Active engagement.

Individual Viewing Data
The video-hosting site provided visualizations of individual viewing sessions, as illustrated in Figures 7 and 8. When analyzing individual viewing sessions and mapping the data against the ICAP framework, it was possible to identify different viewer behavior patterns. For example, Figure 7 reveals patterns of behavior aligned with Passive engagement, while Figure 8 reveals patterns of behavior aligned with Active engagement.
By combining these data visualizations with individual IP addresses, it was possible to conduct a secondary analysis of some individual viewing sessions. By using the unique IP address to link separate (individual) viewing sessions and then analyzing the viewing behaviors in totality, it was possible to identify that students were returning to a video and completing it over multiple viewing sessions. For example, in the first pairing of viewing sessions (Figure 9a), a student started viewing the video, rewatched sections early on, and then rewatched sections from approximately three minutes to six minutes multiple times, before leaving the video around the eight-minute mark. In the second session, twenty days later, the student returned to the video and skipped over the first three minutes, the same three minutes they had shown limited engagement with in the first session. There was then little to no rewatching, and this time the video was completed. It could be reasonably concluded that the student found the content between the three- and six-minute marks of most interest or relevance, and then revisited it twenty days later for a refresher, jumping directly to the section they found most relevant. The second pairing (Figure 9b) also indicates a student returned to and completed the video over multiple sessions. This student started the video and watched until around the six-minute mark before dropping out. In this session, they appeared very active as they rewatched multiple sections and even skipped over some sections. They then re-entered the video the next day, skipped ahead until they reached approximately the point where the previous session had ended, and watched the video until completion, again rewatching a large section and smaller sections multiple times.
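The secondary analysis of linked sessions can also be expressed as a short sketch. It is illustrative only: the study worked from the video host's visualizations, and the export format here is assumed. Each session is represented as a viewer key (an IP address in this study), a start date, and a list of played segments in seconds. The `Session` class, the function names, the 600-second video length, and the sample IP address (taken from a range reserved for documentation) are all hypothetical. Note also that an IP address is an imperfect identifier, since several learners may share one.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Session:
    viewer_key: str                  # e.g., an IP address (assumed identifier)
    started: str                     # ISO date string, used only for ordering
    segments: List[Tuple[int, int]]  # (start_sec, end_sec) played; end exclusive


def second_counts(session: Session, video_len: int) -> List[int]:
    """Count how many times each second of the video was played in a session."""
    counts = [0] * video_len
    for start, end in session.segments:
        for sec in range(max(start, 0), min(end, video_len)):
            counts[sec] += 1
    return counts


def summarize(session: Session, video_len: int) -> Dict[str, int]:
    """Summarize rewatching and skipping within one session."""
    counts = second_counts(session, video_len)
    watched = [sec for sec, c in enumerate(counts) if c > 0]
    furthest = max(watched) + 1 if watched else 0
    return {
        "rewatched_seconds": sum(1 for c in counts if c > 1),
        "skipped_seconds": sum(1 for c in counts[:furthest] if c == 0),
        "furthest_second": furthest,
    }


def link_sessions(sessions: List[Session]) -> Dict[str, List[Session]]:
    """Group sessions by viewer key, ordered by start date."""
    linked: Dict[str, List[Session]] = defaultdict(list)
    for session in sorted(sessions, key=lambda s: s.started):
        linked[session.viewer_key].append(session)
    return dict(linked)


def completed_across_sessions(sessions: List[Session], video_len: int) -> bool:
    """True if the final second of the video was played in any linked session."""
    return any(second_counts(s, video_len)[video_len - 1] > 0 for s in sessions)


if __name__ == "__main__":
    video_len = 600  # hypothetical video length in seconds
    first = Session("203.0.113.5", "2023-03-01", [(0, 200), (100, 200), (180, 480)])
    second = Session("203.0.113.5", "2023-03-21", [(180, 600)])
    for key, group in link_sessions([second, first]).items():
        print(key, [summarize(s, video_len) for s in group])
        print("completed:", completed_across_sessions(group, video_len))
```

Rewatched seconds correspond to the Active pattern described above, while skipped seconds in a later session followed by completion correspond to a learner returning to finish a video.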

Discussion
This section discusses the results in relation to the research questions. This is followed by a general discussion of the findings with reference to other research on ICAP and active learning in OLEs.
1. To what extent do students participate in active learning behaviors when engaging with videos in the OLE?
Results from this study provide general support for the arguments summarized earlier that students may participate in active learning behaviors when interacting with video objects in OLEs. Aggregate LA data clearly indicated many students rewatched specific sections of the video, in some cases multiple times, which is evidence of Active engagement (Figure 6). Furthermore, the applied formula revealed a series of peaks within the data. From the clustering and size of these peaks, it could be defensibly concluded that there was content within those sections that students particularly engaged with. More significantly, the analysis revealed patterns of behavior that aligned with Active engagement as defined by the ICAP framework, with questionnaire results supporting the tentative conclusions derived from the LA method. Students reported that they did participate in the behaviors identified, and further, their actions were non-random, deliberate, and consistent with the definition of active learning. For example, the most cited reason for leaving a video was that the student had found what they needed, which is an example of learner intention/motivation that aligns with Active engagement (Table 1). Moreover, the mean of results from the questionnaire revealed 96% of respondents always or sometimes participated in active-learning behaviors, including taking notes, discussing with a peer, and/or rewatching sections of video. Questionnaire item 10 addressed the behavior of rewatching, which was also identified by the LA method. The results confirmed initial interpretations from the LA method, with students responding that they always (30.4%) or sometimes (66.7%) participate in rewatching behavior.

2. To what extent is learning analytics an effective tool for identifying patterns of student behavior associated with active learning when engaging with videos in the OLE?
When considering the second question, it was important to evaluate which alternative modes of active learning revealed by the questionnaire were not identifiable by the LA method. For example, in item 7 the students were asked, "do you take notes while watching the video?" with 96% of respondents answering either "yes" or "sometimes." While this behavior aligns with a Constructive mode of active learning, it could not be determined using the LA method alone. This was due to the behavior occurring within other learning tools (e.g., a notebook or computer) that sat outside of the OLE and therefore did not create trace data in the video logs. Other questionnaire items indicated that all but one of the students pause to take notes while watching the video, while in item 9, where participants recorded whether they discussed the content of the videos with others, 59.8% indicated that they do. Neither behavior, although both are considered higher-order engagement in the ICAP framework, was identifiable by LA in the video log data.
In other instances, the questionnaire revealed alignment between LA-identified behavior and its underlying motivation, as illustrated by responses to items 11 and 11B. For example, 72.8% responded that they rewatched sections of a video because it was either confusing (57.6%) or interesting (15.2%), responses that align strongly with motivations indicating Active engagement. Of those who answered "other," responses to the follow-up item "Please detail" revealed further Active motivations, as well as some Constructive motivations. In fact, 100% of respondents indicated an Active motivation for the behavior, including, for example, that they would take notes, which has been aligned with Constructive learner intentions such as translating and linking concepts (Chi and Wylie, 2014). No participants reported "video error" or other technical reasons that were unrelated to active-learning motivations as a reason for rewatching a section of the video.

The ICAP Framework
The literature indicates that studies that did not use the ICAP framework for identifying active learning often interpreted behaviors quite differently from those that did. For example, both McGowan et al. (2016) and Kim et al. (2014) concluded that skipping ahead indicates disengagement, while the ICAP framework suggests that the behavior is indicative of active learning. Studies that only used an LA method were limited in that they could only infer student intentions behind the observed behaviors (e.g., Kim et al., 2014; Lagerstrom et al., 2015; Zhang et al., 2022). By adopting the ICAP framework, the study revealed limitations with LA as a method: it was effective at identifying lower-order (within the ICAP framework) forms of engagement (Active) but unable to identify Constructive and Interactive engagement, which were both revealed by the questionnaire. This finding supports the conclusions of other authors regarding the limitations of LA as the sole method for identifying engagement (Chavan and Mitra, 2022; Chen and Thomas, 2020; Giannakos et al., 2022; Ochoa, 2022).
Dodson et al.'s (2018) study also investigated behaviors such as skipping ahead, and by supplementing LA data with a questionnaire, they were able to identify the student motivation behind the behavior. For example, students reported that they would often look specifically for slides within the video, and then use a note-taking tool to record the information they needed (Dodson et al., 2018). The current study also found significant agreement between the responses to the questionnaire, the behaviors, and their underlying motivations. By supplementing the LA method with a questionnaire, this study has further developed understanding of student intentions when interacting with video objects and found that there is substantial alignment between trace data revealed using the LA method and the attributes of active learning defined by the ICAP framework. This highlights the importance of LA research adopting a solid theoretical referent to build more accurate understandings of the purposes and motivations behind patterns of learner engagement, as revealed by LA data (Macfadyen et al., 2020; Ferguson et al., 2019).
In conceptualizing these outcomes, the ICAP framework provided a valuable lens through which to evaluate data collected by the LA method. Earlier research (e.g., Giannakos et al., 2015) identified improved learning outcomes associated with active learning behaviors like rewatching, so it is encouraging that 97% of students responded that they engaged in such behavior at least some of the time. When analyzing the motivations behind the behaviors, most responses indicated students did this to improve clarity or understanding. This finding might suggest the subject content is not being clearly explained and/or is beyond the level of the student, knowledge that could be used to inform improvements in the design or presentation of the video content. Interestingly, within the literature, analysis of dropping-out behavior or exiting a video is contentious, with some researchers interpreting the cause as low engagement on the part of the student (e.g., Kim et al., 2014; McGowan et al., 2016; Zhang et al., 2022). Kim et al.'s (2014) and Zhang et al.'s (2022) studies further found that there was a relationship between video length and dropout rates, and, according to Kim et al. (2014), that students "might feel bored due to (a) shorter attention span or experience more interruption" (p. 3). This finding led Kim et al. to recommend limiting the length of videos to six minutes. However, this recommendation was not backed up by other data that could verify LA-derived interpretations, such as that which could be gathered via participant checking. The present study achieved this by using a questionnaire to specifically investigate these assumptions. Indeed, the questionnaire suggested alternative motivations for such behaviors.
Likewise, the secondary analysis of individual viewing sessions by their individual IP addresses (Figures 9a and 9b) revealed students frequently watched a video across multiple sessions. The questionnaire supported this conclusion, which also aligns with Lagerstrom et al.'s (2015) work: 74% of respondents reported that they often returned to a video after exiting before completion. This reveals an interesting area for potential future research.

Limitations and Further Studies
As this is a new area of study, there was little guidance within the research as to what could be considered a meaningful or significant peak in terms of viewer engagement. This was ultimately decided by the width of the resulting peaks, but further analysis against multiple videos is required to add validity to this method. For example, a point 10 seconds earlier in the video was initially selected as the comparison point, which produced more peaks, but the average width (the duration of viewing for each peak) was only 6.4 seconds. Increasing this timeframe to 20 seconds resulted in fewer peaks, but the average duration (or width) of each peak increased to 15.83 seconds. This study selected the longer timeframe of 20 seconds, but further comparison across multiple videos is required to establish a more universally applicable baseline for significant events.
Although two data methods were used in this study, adding validity to its findings, it is acknowledged that the size and scope of the study were limited. This provides an opportunity to apply its methods in new contexts and/or to larger datasets. Furthermore, the questionnaire was specifically designed to better understand and validate the behaviors and motivations as captured by the applied LA method, as well as test the assumptions of the ICAP framework. However, we acknowledge that although free-text responses were permitted, these responses were limited to focusing principally on these behaviors and their interpretive framework. While doing this was consistent with the study's design, it is acknowledged that it could limit the range and depth of possible responses or hinder identification of other possibly relevant information. Future studies applying similar LA methods and interpretive frameworks could be strengthened by conducting in-depth interviews and/or focus groups matching engagement data to individuals, which could yield a wider range of possible responses and potentially identify new areas for inquiry.

Conclusion
This study focused on an under-researched and emerging area of inquiry (McGowan et al., 2016; Maloney et al., 2022; Viberg et al., 2018), seeking to build more accurate knowledge about the type and quality of student engagement with video objects in OLEs. This was achieved by adopting the ICAP framework for determining active learning and using a questionnaire that was able to identify student motivations behind the in-video clicks. This supported interrogation of previously inconclusive interpretations of student behaviors when interacting with video learning objects, with findings tending to support the earlier studies of Dodson et al. (2018) and Lagerstrom et al. (2015). The results also question earlier assumptions that video-based lessons often place students in the role of passive learners (Giannakos et al., 2015). Furthermore, the study achieved this by using readily available services and techniques to make this type of data analysis more accessible to educators. The data collection method was essentially an "out-of-the-box" service offered by the video host (not dissimilar to YouTube analytics), and our method of analysis was a relatively straightforward mathematical formula.
Our study, therefore, offers an accessible strategy for educators who may not have the specialized expertise required for more complicated tools and techniques, which has been identified as a limiting factor on LA and, more recently, on MMLA research (Ferguson et al., 2019; Giannakos et al., 2022; Ochoa, 2022). Secondly, these results provide general support for LA as an effective method for identifying patterns of behavior associated with active learning when using video objects. Supporting this, the questionnaire verified many of the interpretations made from LA data, with most students confirming that they did participate in the behaviors identified by the LA method, and that they did so for reasons consistent with active learning.
As a tool for identifying active learning with video objects, results from this study suggest that LA has an important role to play and is greatly strengthened when mapped to a well-researched pedagogical model like the ICAP framework. Finally, while it was clear that the LA method could identify rewatching and that this behavior was possibly associated with Active engagement, without the questionnaire several additional active learning behaviors would not have been substantiated. This highlights a potential limitation with LA, also identified by those involved in the emerging subfield of MMLA, whereby the focus on digitized trace data alone can lead to oversimplification of findings or misunderstandings (Giannakos et al., 2022; Ochoa, 2022; Ouhaichi et al., 2023).

Taking notes, too fast
The first time to get a preliminary understanding of the overarching concept, then a second time to make sure that I have a deep understanding of everything that was said
The videos often have text, and they aren't on the screen for very long so we watch it once to process the vid, then a second time to write down notes while pausing.
There are times when the information is given out too quickly, so I need to re-watch particular parts to understand them better.

Appendix
There was something I needed to take notes on. They just spoke too fast for me to get all the info down, so I need to listen to it again to ensure I don't miss anything important.
To ensure students they get a broad understanding of the topic
To relearn content I forgot
To take notes and solidify my understanding
Because I needed to relook over the information
Sometimes it's confusing
The detail was skimmed over/ not written down
To double check information
To ensure I have written the correct information for my notes
Watching it more than once is helpful
You always miss something when it has been told

Figure 2 Figure 3
Figure 2 Visualization of Data on the Viewing Session of Individual Users reveals patterns of behavior aligned with Passive engagement, while Figure 8 reveals patterns of behavior aligned with Active engagement.

Figure 7

Figure 8
Visualization Illustrating 85% of Video Watched With Colored Bands Indicating a Pattern of Rewatching

Figure 9 Two

Zhang et al. explored the patterns of attention allocation (accumulation, circulation, and dissipation of collective attention) related to features associated with MOOC video lectures and engagement with videos. Consistent with
built on Kim et al.'s 2014 work.

Table 2
Category and Coding of Response Against ICAP Framework

Appendix A
Responses to Environment Questions Used to Establish Learner Context

May have been a concept I didn't understand fully the first time, or just for revision purposes.
Note taking/better information retention
So I can write what they said
Taking notes or if the content is interesting or confusing