The impact of disadvantage on higher education engagement during different delivery modes: a pre- versus peri-pandemic comparison of learning analytics data

Abstract
The pandemic forced many education providers to rapidly pivot their models of education towards increased online provision, raising concerns that this might accentuate the effects of digital poverty on education. Digital footprints created by learning analytics systems contain a wealth of information about student engagement. Combining these data with student demographics can provide significant insights into the behaviours of different groups. Here we present a comparison of data from students from disadvantaged versus non-disadvantaged backgrounds on four different engagement measures. Our results showed some indications of effects of disadvantage on student engagement in a UK university, but with differential effects for asynchronously versus synchronously delivered digital material. Pre-pandemic, students from disadvantaged backgrounds attended more live teaching, watched more pre-recorded lectures, and checked out more library books than students from non-disadvantaged backgrounds. Peri-pandemic, when teaching was almost entirely online, these differences either disappeared (attendance and library book checkouts) or reversed, such that disadvantaged students viewed significantly fewer pre-recorded lectures. These findings have important implications for future research on student engagement and for institutions wishing to provide equitable opportunities to their students, both peri- and post-pandemic.


Introduction
In 2020, the response of many governments to the COVID-19 pandemic was to 'lock down' their country, preventing their populations from leaving home except for various essential activities. In the UK, as in much of the world, this meant that higher education institutions quickly had to pivot their learning and teaching activities to online, or mainly online, provision, with all large group lectures provided virtually. In some cases, and for some periods, no in-person teaching was allowed at all, although universities remained officially 'open', with students able to use facilities such as the library and study areas. The reality of 'digital poverty' (exclusion from aspects of daily life through not having appropriate devices, software or internet connectivity) predated the pandemic, with effects far broader than the domain of education. However, the pandemic arguably intensified and more fully exposed such effects, causing concern that it would deepen inequalities, especially in educational settings (Higson, Moores, and Summers 2020; Kizilcec, Makridis, and Sadowski 2021). Thus far, the impact of the pandemic on student engagement has not received much attention (Senior et al. 2021), although evidence is now starting to emerge (e.g. Bashir et al. 2021; Xu and Wilson 2021; Zhang, Taub, and Chen 2021).

Learning analytics and prediction of student success
A plethora of research suggests a significant correlation between attendance and attainment at university (for a review see Credé, Roch, and Kieszczynka 2010), although the causal nature of this relationship is debated, with some researchers contending that poor attainment can cause low attendance as well as vice versa (e.g. Jones 1984; Kahu 2013). Nevertheless, learning analytics (LA) systems are increasingly being used to collect and report on student engagement data more broadly, using a variety of measures in addition to attendance, including library use and interaction with virtual learning environments (VLEs). A large amount of research has found correlations between VLE activity and academic performance (e.g. Macfadyen and Dawson 2010; Mogus, Djurdjevic, and Suvak 2012; You 2016; Chen and Cui 2020; Waheed et al. 2020; Summers, Higson, and Moores 2021). It has been reported that such activity can account for between 8% and 36% of the variance in end-of-year mark in online courses (Morris, Finnegan, and Wu 2005; Ramos and Yudko 2008; Macfadyen and Dawson 2010; Agudo-Peregrina et al. 2014) and up to 23% of the variance in end-of-year mark for in-person courses. A number of studies have also revealed relationships between library use and attainment, although the correlations are generally quite low (Allison 2015; Renaud et al. 2015; Robertshaw and Asher 2019). Whilst the data feeds for LA systems are typically tailored to the particular institution in question, the digital footprint created by these systems contains potentially valuable information about learners, learning, courses and the university itself; it also provides the potential to observe some of the effects of the pivot to online learning on student engagement.

Success of students from disadvantaged backgrounds
Historically, many universities have focussed efforts on equality of access of students from diverse backgrounds into higher education, rather than on equality of success and progression to employment after enrolment. Data from England show that, on average, students from more disadvantaged backgrounds (including disabled students, students from an ethnic minority background and students from lower Index of Multiple Deprivation quintiles) have a lower likelihood of continuing their studies after their first year, lower attainment in their degrees, and a lower chance of progression to highly skilled employment or higher-level study (Office for Students 2021). It is therefore increasingly recognised that 'getting on' as well as 'getting in' matters (Higson 2018).
There are numerous ethical issues surrounding the use of LA, including the potential for labelling bias (Sclater 2016). The ethos of many LA systems is therefore to allow students to view a record of their own behaviour and to trigger interventions based on this behaviour (rather than on any pre-existing data on prior attainment or demographics; for counterexamples see Arnold and Pistilli 2012; Jayaprakash et al. 2014). Foster and Siddle (2020) demonstrated that LA can potentially be used to reduce disparities in attainment between different populations without using students' demographic data. Similarly, Summers, Higson, and Moores (2021) concluded that targeting interventions arising from LA systems based on behaviour, rather than demographics, should be a successful strategy. Nevertheless, this digital footprint can be combined with demographic data outside of the LA systems to allow us to understand more about student behaviour, challenges and patterns at a cohort level (Arnold and Pistilli 2012; Jayaprakash et al. 2014). Indeed, Williamson and Kizilcec (2021) argued that LA research has thus far mostly neglected diversity, equality and inclusion issues, and that LA dashboards provide a potential opportunity to improve equity outcomes at scale, but that more research is needed first (but see Hlosta et al. 2021).

The present study
Here, we investigate three years of LA data from first-year undergraduates at a research-active, medium-sized UK university with a highly diverse student population, comparing students from higher versus lower Index of Multiple Deprivation (IMD) quintiles in pre- and peri-pandemic times. We analysed results from four of our six possible LA feeds: (i) generic VLE course access, (ii) watching of asynchronously delivered material, (iii) watching of synchronously delivered material ('attendance'), and (iv) borrowing of library books. This allowed a comparison of digital and physical provision, and was important to elucidate the possible impact of the pandemic on the ability of students to engage with their studies. It should be noted that we do not necessarily consider these data feeds to be the best possible data that could be collected to answer our research question. Neither was the configuration of the LA system optimal for an online learning environment. Instead, these were the feeds that constituted the LA system at the time and therefore those available to us to analyse; ideally, data on library e-book and e-journal use would additionally have been available. The optimal data feeds for LA systems are the subject of some debate and will be unique to individual institutions and teaching methods (Agudo-Peregrina et al. 2014). Two additional feeds were available to us (logins to the VLE and access to the online quiz system) but were not analysed here. Previous work found that logins were highly correlated with access to course materials and that access to online quizzes was highly course dependent, and we therefore excluded them from analyses.

Sample data/participants
Data from three cohorts of undergraduate students at Aston University were used for analysis. Aston is a medium-sized, research-active UK university with an ethnically diverse population relative to other UK institutions. Approximately 53% of the students in the sample read a STEM subject (science, technology, engineering or mathematics) and the remaining 47% were in the business school.
Undergraduate records were obtained for first-year full-time home students who began their studies in the 2018/19, 2019/20 or 2020/21 academic years. Students who did not complete their first two years (2018/19 and 2019/20 cohorts) or were not listed as current as of June 2021 were removed from the sample. For the remaining students we attempted to match their home postcode to a UK-wide adjusted IMD quintile. IMD quintiles are not normally comparable between the countries of the UK, but Abel, Barclay, and Payne (2016) derived an adjustment such that indices from three of the constituent countries of the UK can be compared with those of the fourth. This adjustment has been updated for the most recent 2020 indices by Parsons (2021) and was used here; indices for Scotland, Wales and Northern Ireland were adjusted to be comparable to those from England. After removing students whose IMD quintile could not be identified due to unrecognised postcodes, we were left with a sample of 6486 students from the three cohorts (see Table 1). The UK Office for Students considers students from IMD quintiles 1 and 2 as meeting widening participation criteria (most disadvantaged), whilst students from quintiles 3-5 are not considered disadvantaged (Office for Students 2018). We have divided our students into two categories (Q12 and Q345) that align with this distinction.
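The two-category split described above can be illustrated with a minimal Python sketch. The function name `imd_group` and the example quintiles are hypothetical and are not taken from the study's own code; the sketch assumes only that each student already has a UK-adjusted quintile from 1 (most deprived) to 5.

```python
from collections import Counter

def imd_group(quintile: int) -> str:
    """Map an adjusted IMD quintile (1 = most deprived, 5 = least deprived)
    to the two categories used in the paper: Q12 or Q345."""
    if quintile not in (1, 2, 3, 4, 5):
        # Students without a recognised quintile were removed from the sample.
        raise ValueError(f"unrecognised IMD quintile: {quintile!r}")
    return "Q12" if quintile <= 2 else "Q345"

# Hypothetical quintiles for six students, purely for illustration.
example_quintiles = [1, 2, 2, 3, 4, 5]
group_counts = Counter(imd_group(q) for q in example_quintiles)
print(dict(group_counts))
```

Treating unrecognised values as errors, rather than silently assigning them to a group, mirrors the exclusion of students whose postcode could not be matched.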

Measures
All undergraduate modules at Aston are managed through the university VLE, where university announcements, timetables, online live lectures and course materials can be accessed. Since 2018, attendance at lectures and seminars has been electronically recorded by students swiping their identity card, though neither attendance nor the act of recording attendance is compulsory for home students. Additionally, all lectures are recorded and available through the VLE via a lecture capture system (LCS). Aston's learning analytics system, provided by Solutionpath, aggregates the log data from the VLE, attendance recording system and lecture recordings on a daily basis. Four data feeds comprise the digital footprint and, between them, represent proxies for access to online learning and in-person learning: (i) VLE course access: number of times the student accessed course materials, (ii) Attendance: total number of in-person classes and live online classes that the student attended, (iii) LCS: number of times the student viewed recorded lectures, and (iv) Library: number of printed materials checked out of the library by the student. Note that during 2020/21 the vast majority of teaching was conducted online, with some exceptions in STEM subjects. To facilitate comparisons across academic years, total attendance was reported, combining in-person attendance and viewing of online live teaching.

Analyses
For each student, the daily data for the four feeds were aggregated on a weekly basis for the 21 teaching weeks of the 2018/19, 2019/20 and 2020/21 academic years. For the 2018/19 and 2019/20 academic years, live teaching was conducted entirely on campus, whereas for 2020/21 teaching was conducted almost entirely online. These weekly data were then averaged over the whole academic year for each student.
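The aggregation described above might be sketched as follows for one student and one feed. The function name, the `(week, count)` input layout and the example values are hypothetical. The sketch also assumes that teaching weeks with no recorded activity count as zero, which the text does not state explicitly.

```python
from collections import defaultdict

TEACHING_WEEKS = 21  # number of teaching weeks per academic year, as stated in the text

def mean_weekly_count(daily_events, teaching_weeks=TEACHING_WEEKS):
    """Aggregate daily counts into weekly totals, then average over the year.

    daily_events: iterable of (teaching_week, count) pairs for one student and
    one feed, with teaching_week numbered from 1. Days outside the teaching
    weeks are ignored; weeks with no events contribute zero (an assumption).
    """
    weekly_totals = defaultdict(int)
    for week, count in daily_events:
        if 1 <= week <= teaching_weeks:
            weekly_totals[week] += count
    return sum(weekly_totals.values()) / teaching_weeks

# Hypothetical LCS views: three in week 1, four in week 2, five outside term.
example = [(1, 2), (1, 1), (2, 4), (22, 5)]
print(mean_weekly_count(example))  # 7 events over 21 weeks
```

Dividing by the fixed number of teaching weeks, rather than by the number of active weeks, keeps the measure comparable between students with different activity patterns.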
All statistical analyses were computed using R 4.1.0 (R Core Team 2021). Linear mixed models were computed using lmer from the package lme4 (Bates et al. 2015). The significance of the effects of the main factors was evaluated following the approach of Luke (2017) using the package lmerTest (Kuznetsova, Brockhoff, and Christensen 2017), which implements the Satterthwaite approximation to estimate the denominator degrees of freedom of the F statistic. Estimated marginal means from the models were computed using emmeans (Lenth 2021).
For each of the four data feeds, a linear mixed model was computed that evaluated the interaction between teaching mode (predominantly online versus entirely in-person) and IMD (Q12 versus Q345). Course was added as a random effect to account for possible course-dependent levels of activity in each data feed, especially given that some STEM courses required in-person attendance for laboratory classes even when almost all other teaching was delivered online.
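One plausible way to write such a model, assuming that the random effect for course is a random intercept (the text does not specify its structure), is given below. The symbols are introduced here for illustration only: y_ij is the mean weekly count on a given feed for student i on course j, Mode and IMD are binary indicators, u_j is the course-level intercept and ε_ij the residual.

```latex
y_{ij} = \beta_0
       + \beta_1\,\mathrm{Mode}_{ij}
       + \beta_2\,\mathrm{IMD}_{ij}
       + \beta_3\,(\mathrm{Mode}_{ij} \times \mathrm{IMD}_{ij})
       + u_j + \varepsilon_{ij},
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this assumption the coefficient β₃ carries the interaction of interest, and the model would correspond to an lme4 formula of the form `y ~ mode * imd + (1 | course)`, where the variable names are again hypothetical.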

Results
Overall, in comparison with in-person teaching, there was an increase in asynchronous interactions (LCS views and VLE course materials) for online teaching and a small decrease in synchronous interactions (attendance); library book checkouts dipped to near zero. Pre-pandemic, when teaching was entirely in-person, students from lower IMD quintiles tended to attend more lectures than their counterparts from higher IMD quintiles. Interaction plots (teaching mode × IMD) for the four data feeds are shown in Figure 1 and summary data are in Table 2. These differences in behaviour between students from different IMD quintiles, and the interactions between IMD and teaching mode, were explored further in a series of linear mixed models. The results of these models (see Table 3) revealed that teaching mode was a significant factor for all four data feeds, that IMD was a significant factor only for library book checkouts (with disadvantaged students borrowing more books than their more advantaged counterparts), and that there were significant interactions between IMD and teaching mode for attendance, LCS views and library book checkouts. Post-hoc pairwise comparisons on the estimated marginal means of the models (Table 4) revealed that, during in-person teaching, students from more disadvantaged backgrounds attended significantly more 'live' classes (0.12 extra classes/week) than students from less disadvantaged backgrounds, but this difference was eliminated during online teaching. Furthermore, during in-person teaching, students from more disadvantaged backgrounds viewed significantly more recorded lectures (0.25 extra recorded lectures/week) than those from less disadvantaged backgrounds. This situation reversed entirely during online teaching, with students from the most disadvantaged backgrounds watching significantly fewer recorded lectures (0.29 fewer views/week) than those from less disadvantaged backgrounds.
Finally, students from IMD Q12 backgrounds checked out a significantly greater number of books during in-person teaching (0.20 extra books per week) than those students from less disadvantaged backgrounds. Whilst individually these effects are small, given the size of the 2020/21 cohort (1604 IMD Q12 students) their overall effect is potentially large. Using the difference data from Table 4, IMD Q12 students from the 2020/21 cohort during in-person teaching would have been expected to check out ~6700 more books (1604 students × 0.198 extra books/week × 21 teaching weeks), watch ~8400 more pre-recorded lectures, and attend ~4400 more live classes than those students from Q345.
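The cohort-level scaling used for the library figure can be reproduced with a short Python sketch. The function name is hypothetical. Only the book calculation is shown because it is the only one whose unrounded weekly difference (0.198) is quoted in the text; the lecture and attendance totals draw on Table 4 values that are not reproduced here.

```python
TEACHING_WEEKS = 21      # teaching weeks per academic year, from the text
N_Q12_2020_21 = 1604     # IMD Q12 students in the 2020/21 cohort, from the text

def cohort_total(n_students, extra_per_week, teaching_weeks=TEACHING_WEEKS):
    """Scale a per-student weekly difference up to a whole cohort and year."""
    return n_students * extra_per_week * teaching_weeks

# 0.198 extra books per student per week is the difference quoted in the text.
extra_books = cohort_total(N_Q12_2020_21, 0.198)
print(round(extra_books))  # 6669, i.e. roughly 6700 books
```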

Discussion
Overall, on three of the four measures, the results showed a differential change in engagement of our disadvantaged students versus our non-disadvantaged students to the relative detriment of disadvantaged students. However, the measures which showed this change were not all digital measures, instead including the measure of library book borrowing. It should also be noted that, when considering online teaching only, the engagement levels of the disadvantaged students only differed significantly from their counterparts for the viewing of recorded lectures; whilst pre-pandemic the disadvantaged students viewed significantly more lectures than their counterparts, peri-pandemic, they viewed significantly fewer.
That the number of course access events was similar between the two groups, and increased for both, suggests that availability of access to the VLE per se was not a particular issue for most students. We are unable to determine the reason for the differences between synchronous and asynchronous delivery methods from our data. Of our measures of engagement, interactive synchronous provision would, in principle, require the most internet data and the best internet connection, with adequate upload and download speeds being required to participate fully. In addition, synchronous provision offers the least flexibility of access, requiring a digital device and internet connection at a precise time (problematic if, for example, devices are being shared in the household or if students need to be on campus to access the internet). Yet it was only for pre-recorded lectures that a significant difference was found between groups peri-pandemic, with disadvantaged students watching these recordings less often. It may be that students from both groups valued and enjoyed synchronous provision more, at least peri-pandemic, and were more motivated to attend. Whilst asynchronous material has the advantage of being available 'any time', this may lead to complacency or procrastination (see, e.g., Baker et al. 2019, who showed how scheduling when students should watch material affected early attainment). Alternatively, the reduction in engagement with synchronous provision in both groups may have reflected fewer of these types of interactions being available, whilst, in contrast, pre-recorded material may have been 'over-provided'.
Overall, these results suggest that effects of disadvantage on student engagement were potentially wider reaching, yet also more nuanced, than a simplistic or all-encompassing view of 'digital poverty' would imply. The concept of digital poverty risks downplaying differential effects of different methods of digital delivery as well as other important aspects of the educational experience. Hodges et al. (2020, p. 5) argue that emergency remote teaching and online learning are very different, that true online learning requires an 'ecosystem of learner supports' as is present for in-person learning, and that 'Face-to-face education isn't successful because lecturing is good'. We therefore reiterate the need for a 'multi-pronged' approach to supporting students both peri- and post-pandemic, considering academic, experiential and pastoral issues (see also Higson, Moores, and Summers 2020).
As outlined in the introduction, there has been a relative dearth of research surrounding equality issues in LA (although see Hlosta et al. 2021). Foster and Siddle (2020) concluded that student demographics were unnecessary for the successful implementation of LA systems: they found that low-engagement alerts were 43% more likely to be sent to disadvantaged students, supporting their argument that targeting could be based on behaviour alone. Summers, Higson, and Moores (2021) reported that socio-economic status explained small, but statistically significant, amounts of variance in attainment, indicating that those whose parents had never worked or were long-term unemployed tended to have the poorest attainment in comparison with those from other socio-economic backgrounds. The present findings therefore offer some initial insight into potential equality issues both pre- and peri-pandemic, and as universities prepare themselves for a 'post-covid future'.
Studies have suggested that relationships between library use and attainment are generally quite weak (Allison 2015; Renaud et al. 2015). It seems reasonable to speculate that this is because many more affluent students purchase key textbooks instead of borrowing library copies, although the increased use of e-books may also be a factor; unfortunately, data on this were not available. The impact on attainment of not feeling able to either purchase or borrow physical textbooks has yet to be determined. In contrast, as outlined in the introduction, a large amount of research has found correlations between VLE activity and attainment, and between attendance and attainment. For this study, we did not have access to levels of attainment, but future research could investigate the impact of the changes in engagement on subsequent attainment. Future research should also consider the potential differential effects of engagement on attainment for different groups; for example, the act of borrowing library resources may be more important for some groups than for others.
There are several limitations with this research. First, the LA system in terms of digital engagement only counts individual login events, excluding other factors which may influence students' experience, such as what type of device they are using (e.g. mobile phone, tablet or PC), or whether the internet connection allowed them to watch, listen or contribute fully. Anecdotal evidence from academic staff suggests that many students are 'participating' in some interactive sessions on mobile phones with their cameras switched off. Second, we have not been able to track the use of other physical resources or space, such as use of the library without checking out books or use of other study areas. It seems reasonable to assume, for example, that the working environment of disadvantaged students is more likely to be less than optimal and that use of e-books and journals may have increased during the pandemic.
Third, we only have measures of absolute counts of engagement, without information on the proportion of possible activities engaged with; this may be over-estimating the proportion of lectures watched, especially because many lecturers were encouraged to provide their pre-recorded lectures in multiple smaller 15 to 20 min chunks, rather than a 50 min continuous recording. Thus, it would not be reasonable to assume that the total amount of lecture time experienced is directly related to a count of pre-recorded material engaged with. This issue, however, would be equal for both groups so does not impact on any conclusions relating to differences and interactions between groups, although it does affect the interpretation of increases or decreases in engagement on these measures overall.
Fourth, in terms of our 'attendance' measure, we may not be comparing like with like. Although in both modes of delivery our attendance measure is a measure of engagement with 'synchronous' learning, it is likely that pre-pandemic many of these sessions were lectures with relatively limited amounts of interaction (replaced peri-pandemic with recordings), whilst peri-pandemic synchronous sessions were more likely to have been designed to elicit a greater level of interactivity. This may be important because attendance at interactive sessions arguably requires a greater level of commitment, sense of belonging and confidence, which may differ amongst different groups (Oldfield et al. 2018); it is possible that the observed interaction here may reflect change of format, rather than an effect of digital poverty per se. In addition, these sessions may also have been held with different cohort sizes, which is known to influence attendance (Friedman, Rodriguez, and McComb 2001).
Finally, we should note that significant efforts and resources were employed to try to ensure that digital poverty did not impact on this cohort's student experience, e.g. via the purchase and deployment of laptops; therefore, some of the worst effects of digital poverty may have been mitigated.
Detecting the effects of disadvantage on student engagement, despite the university's many efforts to mitigate them, would not be possible without the large amount of data available from learning analytics systems. This research illustrates that disadvantage affects student engagement, that the effects of the pandemic on student engagement are likely to go beyond the digital realm, and that effects of disadvantage on engagement may be more easily observed for some specific types of education delivery than others. Given that students from disadvantaged backgrounds are three times more likely to live at home than their more advantaged peers (Donnelly and Gamsu 2018), the shift to online learning may have disproportionately affected such students who, when on campus, make use of non-contact time for further study at the library. Universities should seek to mitigate the broader effects of the pandemic on their students.