Using cluster analysis to explore the engagement with a flipped classroom of native and non-native English-speaking management students

Flipped classrooms are becoming increasingly popular, particularly when teaching non-native speaking students. Existing research has largely focused on examining academic performance and students’ perceptions of the learning process. This exploratory study uses log-file data to identify hidden patterns of student online behaviour in a flipped classroom environment for a cohort of students, with special attention to their command of the home language of the institution. Using cluster analysis, categories were identified regarding when and how often online flipped lessons were accessed: (i) before class, (ii) after class and before a weekly exam, and (iii) after the weekly exam but before an assignment. Gender and the average number of minutes the lessons were accessed before each time period were also considered. Findings indicated that there was sustained access to flipped materials throughout the semester for all students. In addition to accessing online lessons prior to class, students also accessed online lessons prior to weekly exams and project submission deadlines, indicating the value of such material for revision. Interestingly, two clusters of non-native English-speakers were identified where one group accessed the material more often than native students, and the second group accessed the material less frequently than the native English-speaking students.


Introduction
Participation by international student cohorts brings about many economic and cultural gains to higher education institutions. However, this potential does not come without its challenges to effective engagement and participation in the learning process. All international students attending higher education may face a culture shock due to different behaviours and expectations in the host country, but this is especially the case for those whose native language is different from that of the host country. Language proficiency below a certain threshold level was found to be a predictor of poorer academic outcomes implying international students face a systematic disadvantage (Trenkic & Warmington, 2019).
One educational alternative to traditional teaching delivery that has gained recent prominence is the flipped classroom (FC). It has been proposed not only as a way of facing the challenges associated with international students and with globalisation more generally (Desai, Jabeen, Abdul, & Rao, 2018) but also, as argued by Minocha, Reynolds, and Hristov (2017), as a means of developing in managers the skills for solving complex problems that help increase the relevance of business schools. This method has become commonplace in business schools (Beenan & Arbaugh, 2019) and is seen as a new way of designing teaching and learning (Lopes & Soares, 2018) that involves online material being made available prior to class (Albert & Beatty, 2014). This frees class time to allow students more opportunity for class discussions and questions (Graham, McLean, Read, Suchet-Pearson, & Viner, 2017), which can improve students' conceptual understanding (Burke & Fedorek, 2017). The affordances of the FC approach are important when students are undertaking courses in what is not their native language. The prior availability of material has been found to be of benefit to non-native English-speaking students in several studies (Asef-Vaziri, 2015; McCarthy, 2016; Zainuddin & Attaran, 2016; Zue & Bergom, 2010). Non-native students especially benefit from being able to watch recorded lectures more than once (Asef-Vaziri, 2015), as they can watch the videos at their own pace. Similarly, McCarthy (2016) found that some international students had difficulty comprehending material immediately and thus preferred the FC affordance of re-listening. Also, those who are not confident in the home language can prepare in advance for class discussions, resulting in increased confidence (Zainuddin & Attaran, 2016).
https://doi.org/10.1016/j.ijme.2020.100381 Received 16 September 2019; Received in revised form 29 December 2019; Accepted 25 February 2020
FCs have been examined through their outputs, often measured in terms of performance and grades, and through the learning process. For example, FCs are associated with higher degrees of peer interaction (McCarthy, 2016), improved perceptions of the learning environment (Baepler, Walker, & Driessen, 2014), and improved understanding and engagement (Burke & Fedorek, 2017; Gross, Marinari, Hoffman, DeSimone, & Burke, 2015), as well as with managing cross-cultural understanding (Desai et al., 2018). While FCs have been found to increase exam performance in large economics (Balaban, Gilleskie, & Tran, 2016), introductory management (Fadol, Aldamen, & Saadullah, 2018) and mathematics (Lopes & Soares, 2018) courses, they have also been found not to result in significantly different satisfaction levels from traditional lectures for undergraduate business students (Garnjost & Lawter, 2019). The FC approach poses a series of challenges and is not always popular with students (Chen, Yang, & Hsiao, 2016; McCarthy, 2016). FCs require students to allocate time to watch pre-class material, with a risk that this may not occur (McCarthy, 2016). Indeed, the continuous availability of online videos may induce student procrastination (He, Holton, Farkas, & Warschauer, 2016). While some studies have found increased student effort in economics (Balaban et al., 2016) and organisational behaviour (Beenan & Arbaugh, 2019), the self-discipline needed to complete additional pre-class work can lead to decreased student satisfaction (Missildine, Fountain, Summers, & Gosselin, 2013). Also, while FCs present the challenge of dealing with students who did not complete or did not understand the material (Telford & Senior, 2017), research by Fadol et al. (2018) found that FCs resulted in decreased absenteeism, while admitting that students' perception of a learning mode may affect how well they adapt to its requirements. On a similar note, Beenan and Arbaugh (2019) found FCs more likely to be subsequently chosen by autonomously motivated business students.
As FCs rely on students assimilating material prior to class (Albert & Beatty, 2014), and as it has been found that students who watched flipped video materials missed fewer face-to-face classes (Fadol et al., 2018), it is important to examine not only if, but when, flipped material was accessed. Therefore, this study presents an investigation of a FC approach which considers when students access the available online material in relation to the timing of classes and subsequent assessments, while considering the benefit of that material for non-native speaking student cohorts. In exploring this, attention is also given to students' gender, as research has found it to be a variable likely to affect students' engagement with online support materials (Alhasani, Mohd, & Masood, 2017; Tsay, Kofinas, & Luo, 2018).

VLE log-file data and cluster analysis
The adoption and acceptance of online learning has led to a dramatic increase in available learner data. Virtual Learning Environments (VLEs) generate abundant log-file data as students leave 'learning traces' (Gasevic, Dawson, & Siemens, 2015) that provide insights into student activities and learning processes. Learning analytics provides a new lens through which to understand education (Clow, 2013) and 'unveil' hidden information (Greller & Drachsler, 2012). While the FC, with its reliance on a central online component, is an ideal candidate for learning analytics, there have been few studies that examine students' actions rather than their perceptions. Overall, in a review of the research on FCs, O'Flaherty and Philips (2015) concluded that there was a 'paucity of conclusive evidence' regarding this approach. While FCs are popular, claims regarding their effectiveness are inconsistent, and evidence of their results in terms of student engagement largely relies on 'proxy' measures, mostly consisting of students' self-reports and performance outcomes. A limitation of previous FC studies that focus on student performance and student satisfaction is that they typically rely on self-reporting, which may be inaccurate (Gilboy, Heinerichs, & Pazzaglia, 2015; Jovanovic, Pardo, & Dawson, 2017). Therefore, more studies based on log-file data are needed in order to add a further level of research validity to the understanding of students' behaviour in relation to the FC approach. It has been argued that log-file data may be more genuine and 'authentic' than survey data, which are prone to bias arising from students' interpretations and recall (Jo, Kim, & Yoon, 2015). Instead, learning analytics can reflect 'real and uninterrupted user behaviour' (Greller & Drachsler, 2012). Therefore, rather than relying on student perceptions, this study examines VLE data on how and when students accessed flipped material.
J.N. Walsh and A. Rísquez The International Journal of Management Education 18 (2020) 100381
While some studies have extracted log-file data from VLEs, the data were often analysed using traditional statistical methods such as regression (Calvert, 2014; Jo et al., 2015; Lopes & Soares, 2018), correlation, and t-tests and ANOVA (Desai et al., 2018; Firat, 2016; Martin & Whitmer, 2016) rather than analytics algorithms. By contrast, cluster analysis is an exploratory tool that seeks to reveal previously unknown or unclear, but naturally occurring, homogeneous groups (Brown, White, & Power, 2016; Del Valle & Duffy, 2009). In cluster analysis, cases are grouped by their similarity based on a distance index measure, with shorter distances indicating more similar cases (Kreber, Castleden, Erfani, & Wright, 2005). This index can be a composite score comprised of multiple standardized variables (Brown et al., 2016; Del Valle & Duffy, 2009). This means that, without human intervention or self-reported data, it is possible to uncover the 'underlying data structure' (Mirriahi, Liaqat, Dawson, & Gasevic, 2016). Therefore, cluster analysis can potentially be used to identify and classify previously unknown groups with different approaches to online learning, based on how similar members are using a composite score of several variables (Del Valle & Duffy, 2009). Another advantage is that, unlike predictive analysis, which requires representation from large samples, cluster analysis can provide insightful results from relatively small samples (Mirriahi et al., 2016). Some studies have demonstrated the exploratory potential of cluster analysis of log-file data in other contexts such as peer tutoring (De Smet, Van Keer, & Valcke, 2008) and tool usage in VLEs (Lust, Vandewaetere, Ceulemans, Elen, & Clarebout, 2011). However, despite this exploratory potential, cluster analysis has remained largely underused in the educational context.
Moreover, the rare previous applications of cluster analysis to the investigation of FCs have relied on students' perceptions of learning (McNally et al., 2017) or, while analysing VLE data, have focused on different questions from those of this study (Lust et al., 2011; Yen & Lee, 2011). In this paper, we avail of the affordances offered by cluster analysis to analyse VLE log-file data. This enabled us to identify and examine differences in the use of flipped classroom material by native and non-native English-speaking students without relying on traditional statistical methods or students' perceptions of learning. The present study resembles the work of Jovanovic et al. (2017) and Gasevic, Dawson, Rogers, and Gasevic (2016), but both these studies focused on activity that took place only in the time prior to the flipped class, while this study is novel in applying cluster analysis to log-file data to identify patterns in how students access online resources over time while engaging with a FC approach. While important similarities with the study by Lust et al. (2011) also exist, this study addresses the limitation identified by these authors in working with homogeneous groups and explores instead the behaviour of a heterogeneous group. This paper seeks to address the lack of research using learning analytics in the FC context, using VLE data and the cluster analysis algorithm to examine when and how often native and non-native English-speaking students access online material as they participate in a FC.
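The idea of grouping cases by a distance computed over standardized variables can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes a plain Euclidean distance over z-scores, and the access counts, variable names and helper functions are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def standardise(columns):
    """Convert each variable (one list per variable) to z-scores."""
    result = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        result.append([(x - mu) / sigma for x in col])
    return result


def euclidean(a, b):
    """Euclidean distance between two cases (tuples of z-scores)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical access counts for four students (NOT data from the study).
before_class = [8, 9, 3, 4]
before_exam = [6, 5, 19, 18]

z_scores = standardise([before_class, before_exam])
cases = list(zip(*z_scores))  # one tuple of z-scores per student

# Students 0 and 1 behave alike; student 2 behaves very differently.
d_similar = euclidean(cases[0], cases[1])
d_different = euclidean(cases[0], cases[2])
print(d_similar < d_different)
```

Standardizing first prevents a variable with a wide range (here the exam-period counts) from dominating the distance, which is why a composite of standardized variables is used.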

Context of the study
As argued by Gasevic et al. (2015), it is important, when using analytics for blended learning, to consider the ways in which technology was implemented in the relevant context. This paper reports on a FC approach used for a knowledge management course, which included a section on business analytics, offered by the business school of an Irish university. Over the years, an increasing number of non-native English-speaking students had taken the course. In order to cater to this increasing diversity, the course was converted from traditional lectures to a FC approach. Online video-based material was developed which would enable students to replay content and which also included e-tivities (Salmon, 2013) to allow students to assess their understanding of the online material provided. The videos were screencasts of lectures that showed content slides and were narrated by the lecturer. Some slides were blank to allow on-screen hand-written calculations to be added. The business analytics section covered five analytics techniques, each with session learning outcomes, embedded videos, and e-tivities. To engage in a FC, students were expected to access online material prior to each of the five classes. In week one of the course, the course content and assessments were outlined to the students, and the rationale for using a FC, how it would be implemented, and the expectations of students were explained. Learning outcomes and online content were made available to students one week before each class. Pre-class e-tivities involved asking students, having watched the relevant videos, to calculate certain figures, answer specific questions, and suggest courses of action a manager might take. The objective was to encourage students to actively watch videos and use the content as a basis for subsequent class discussions and activities.
Having freed up time from content delivery, each weekly 2-hour class centred on developing higher order skills such as applying techniques to new and different contexts through class exercises and discussions. Thus, the course design encouraged students to engage with online material so as to be able to work on in-class problems, as the online material would not be reiterated in class. Another motivation to attend class was that some of the material covered was similar in format and standard to assessment material. The detail of each topic was assessed the week after class in an invigilated 1-hour assessment, and the application and integration of topics covered formed a major part of a group project due at the end of the semester. The classes involved getting students to use the techniques covered in the videos to solve problems in different contexts, which were tested in weekly competency exams. Discussions in class also focused on how techniques could be applied more widely, requiring higher level skills, and these were then related to the group project.
As detailed above, this exploratory study used log-file data to identify hidden patterns of student online behaviour in this flipped classroom environment in relation to their command of the English language. The following hypotheses guided the investigation:
H1. Students in a flipped classroom environment will access flipped materials through the semester in a sustained manner.
H2. Non-native English-speakers will access the flipped classroom material to a larger extent than native speaking students will access this material.
H3. Students will access flipped classroom material differently according to their gender and language profile.

Material and methods
The twelve-week knowledge management course was taken by 38 post-graduate business students: 24 were native Irish, 2 were international students whose first language was English, and 12 were students whose native language was other than English. This latter group was composed of Asian (10), North African (1) and European (1) students. There were 18 female and 20 male students. While the number of students may seem small, similar studies of flipped classrooms have involved similar numbers (Critz & Wright, 2013; Mason, Shuman, & Cook, 2013; Strayer, 2012; Wilson, 2014; Yen & Lee, 2011). The log-file data recorded a total of 6,059 events involving students. For each event, the data provided included: the event date and time, the identity of the person, a summary code describing the event, the path of the resource or page accessed, the name of the resource or page accessed, and an identifier code for each log-in session. Events were filtered to include only access to resources by students, which provided a total of 1,237 rows of data. Variables were derived by pre-processing log-file data to represent students' online interactions and measure participants' viewing patterns (Jo et al., 2015; Mirriahi et al., 2016). Each lesson page in the VLE related to material on a single topic. Additional fields were added to each lesson providing the time of the associated class, the time of the assessment and the deadline for the group project. For each event in which a lesson was accessed by a student, it was possible to categorise the event as taking place (1) before class, (2) after class but before the weekly exam or (3) after the weekly exam but before the group project deadline. Values were inspected, and four outliers were identified.
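The categorisation of access events into the three time periods, and the derivation of how long before a milestone a lesson was first accessed, can be sketched as follows. This is an illustrative sketch only, not the pre-processing actually applied to the VLE export: the function names, the timestamps and the treatment of an access made exactly at a milestone (counted as falling after it) are all assumptions.

```python
from datetime import datetime, timedelta


def categorise_access(accessed, class_time, exam_time, project_deadline):
    """Assign a lesson access to one of the three periods described in the text."""
    if accessed < class_time:
        return "before class"
    if accessed < exam_time:
        return "after class, before weekly exam"
    if accessed < project_deadline:
        return "after weekly exam, before project deadline"
    return "after project deadline"  # outside the periods analysed


def lead_time(first_access, milestone):
    """How long before a milestone the lesson was first accessed."""
    return milestone - first_access


# Hypothetical milestone times for one lesson (NOT taken from the study).
class_time = datetime(2019, 2, 4, 14, 0)
exam_time = datetime(2019, 2, 11, 14, 0)
project_deadline = datetime(2019, 4, 26, 17, 0)

# Hypothetical access events by one student for that lesson.
events = [
    datetime(2019, 2, 3, 20, 30),
    datetime(2019, 2, 10, 21, 0),
    datetime(2019, 4, 20, 10, 0),
]

periods = [
    categorise_access(e, class_time, exam_time, project_deadline) for e in events
]
first_before_class = lead_time(events[0], class_time)
print(periods, first_before_class)
```

Counting the labels per student gives access-frequency variables, while the lead times give the timing variables.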
The outliers were found to occur when a student clicked on the incorrect lesson page and their next recorded action was to click on the correct lesson for that week. These initial but mistaken access times were ignored, and the subsequent access time was used instead. Three additional variables were also calculated, specifying how long before class, weekly exams, and the project deadline lessons were first accessed. In addition, variables relating to students' gender and whether or not they were a native English speaker were included. Data were analysed using SPSS Statistics 24. Two-step cluster analysis was chosen as it has been used to reveal natural but not readily identifiable groupings and because it allows the importance of each input variable to be identified (Brown et al., 2016). This technique has been used to cluster 45 educational programmes (Verburgh, Schouteden, & Elen, 2013), 33 countries (Zmuk, 2015) and 31 teachers (Kreber et al., 2005), making it also appropriate for the 38 students in this study. The log-likelihood method was used to calculate the distance between variables for cluster allocation when using both continuous and categorical data (Brown et al., 2016). Ward's (1963) method was used to minimise within-group variability, and the silhouette value was used to establish the number of clusters. Various combinations of the available variables were tested. To be considered 'good', the output model required a silhouette score of 0.5 or greater. Models achieving this were examined to determine the number of clusters that were automatically generated and the proportion of students in each. Though possessing a silhouette score of 0.5 or more, some models clustered almost all participants into one cluster. Two models were identified which were classified as good and had students spread over several clusters.
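To make the 0.5 silhouette threshold concrete, the following minimal sketch computes a mean silhouette coefficient for a given cluster assignment. It is a generic illustration and not the SPSS two-step procedure used in the study: it assumes Euclidean distance on continuous variables only (the study used a log-likelihood distance with mixed data), and the points and labels are hypothetical.

```python
from math import dist  # available from Python 3.8


def mean_silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    clusters = set(labels)
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == own and j != i]
        if not same:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in clusters if other != own
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical two-variable data forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
poor_labels = [0, 1, 0, 1, 0, 1]  # ignores the visible grouping

good_score = mean_silhouette(points, good_labels)
poor_score = mean_silhouette(points, poor_labels)
print(good_score >= 0.5, poor_score >= 0.5)
```

A score near 1 indicates compact, well-separated clusters. As noted above, a high score alone does not guarantee a useful model, since a solution placing almost every case in one cluster can also score well.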

Results and discussion
Each model resulted in five clusters being identified. In addition to their SPSS identifiers each cluster was also named based on its key characteristic(s). The first model (Table 1) clustered students using the binary variables gender and language as well as the total number of times lessons were accessed before class, assessment, and project respectively.
Cluster A1, the 'All Rounders', comprised female native English-speakers who accessed lessons 6.75 times on average before the project deadline, the second highest of any group (after A4). They had the best overall performance for the module. Cluster A2, the 'Exam Focused', were female non-native English-speakers who accessed lessons after the lecture but before labs the highest number of times (19.17 on average), while other groups typically accessed them 5-8 times. They were also joint highest (8.50), with cluster A4, for viewing lessons before class, though one other group (A5) was only marginally lower, at over 8.00. The 'Least Engaged' cluster, A3, were non-native English-speakers, two-thirds of whom were female, making it the only cluster to have both genders represented. Their number of lesson views before class (3.83) was the lowest of any group and the only one below 5.00, indicating that they did not, on average, access all 5 lessons before class. They were also the lowest (5.00) on viewing prior to the weekly exams. As well as being least engaged, they were the cluster with the worst performance. What we term the 'Prepare and Reuse' cluster (A4) was composed of male native English-speakers. Members had the joint highest number of views before class (8.50) and 5.67 views prior to exams. They had the highest number of views (10.00) in advance of the project deadline, the next highest group having 6.75 views. The 'Class-Focused' cluster (A5) had 12 male native English-speaking students. On average they accessed lessons before lectures 8.08 times (for 5 topics). They accessed lessons 5.33 times on average prior to weekly exams, making them similar to the 'Prepare and Reuse' cluster A4 (5.67), the other male, native English-speaking group. What differentiates the 'Class-Focused' from the 'Prepare and Reuse' cluster is the number of times that lessons were accessed prior to the group project (2.92 times in A5 versus 10.00 times in A4).
While the numbers were not sufficient for statistical tests, at first sight the average performance for each group raises some interesting issues: for example, the 'class focused group' who accessed more in advance of class performed better than the 'exam focused cluster'. The second model (Table 2) clustered students using gender, language, and the average number of days, hours and minutes, as appropriate, that lessons were accessed before each time period (before class, assessment, and project) for each cluster.
In this model, the 'last-minute preparers' (B1) were native English-speaking males who accessed lessons closest to class and second closest to exams. The 'early class accessors' (B2), composed entirely of male native English-speakers, accessed lessons earliest before class (over 4 times earlier than the largest cluster, the 'last-minute preparers'). However, these 'early class accessors' were the group who then accessed lesson material nearest to the weekly exams. This strategy would appear to be successful, as they were the group with the highest performance in the module. Cluster B3, formed by female native English-speakers, did not stand out on any of the three measures and was termed 'English-speaking moderates'. Cluster B5 was made up of non-native English-speaking females who accessed lessons relatively early before exams (almost twice as early as clusters B1 and B3). They accessed lessons in advance of weekly exams later than three of the four other clusters, while accessing lessons closest to the group project deadline of any cluster. As they did not stand out on any variable, they were categorised as 'non-English speaking moderates'. Comparing the performance of the two moderate clusters, they were neither the best nor the worst performers, but the English-speaking group did perform better. Cluster B4, comprising female non-native English-speakers, accessed online material the second earliest of any cluster before class, and was the earliest in accessing material prior to weekly exams. This cluster was also the group that accessed material earliest in advance of the project deadline. Based on these three behaviours, its members were termed 'overall early accessors'; they were the group with the lowest average percentage for the course.

Engagement with the FC
Results show that each of the five lesson pages in which flipped material was embedded was accessed by different clusters of students in distinct ways, as was the case in Lust et al. (2011). Students in all but one of the clusters accessed each of the lesson pages at least once before class. Indeed, three clusters accessed the five lesson pages over eight times on average. There was only one cluster (A3) whose six members accessed lessons prior to class only 3.83 times on average. This indicates that the expectation that material would be accessed by students prior to class was met by most students, so, in the main, the risk that students would not watch material in advance of class (McCarthy, 2016) did not materialise. The FC approach also had an important role to play in the preparation for continuous assessment, as lessons were also accessed after class and prior to weekly tests. This is especially the case for the non-native English-speaking cluster (A2), which accessed lessons prior to class the joint-highest number of times and before the weekly exam the highest number of times. Of the remaining four clusters, those who had accessed lessons before class the least (A1, A3) increased the number of times that they accessed material before the weekly assessments, while those accessing before class the most (A4, A5) accessed the material less before weekly exams.
The study also focused on the need for increased effort (Beenan & Arbaugh, 2019) and the concomitant risk of procrastination identified by He et al. (2016), by analysing whether students delayed accessing material until very near the class time. This study found that, on average, most (26) students (B1, B3, B5) first accessed online material the day before class, with some (B2, B4) beginning 3 and 2 days in advance respectively. This suggests students did not delay interacting with the material until the last minute but had time to consider the content in advance of the class. Based on the time assessments took place, most students accessed lessons the night before (B1, B3) for revision. Cluster B4 accessed material on average over two days in advance of both classes and assessments. The material was also accessed well in advance of the project deadline. So, while this study did not examine how engagement affected the quantity of workload, it did identify that the use of online flipped material was reasonably paced through the semester. It should be taken into account that the group was composed of postgraduate management students in a fee-paying programme, who were therefore likely to be highly motivated. Whatever the explanation, we observed that while the application of a FC approach did not imply overall procrastination in the class, it had a role to play in helping those who did not engage as much during the semester. These students appeared to compensate: those who had accessed lessons before class the least increased the number of times that they accessed material before the weekly assessments, while those accessing before class the most accessed the material less before assignments. Overall, the results allow us to confidently accept the hypothesis (H1) that students in a flipped classroom environment will access flipped materials through the semester in a sustained manner.

A diverse picture arising
Interestingly, of the 12 non-native English-speaking students, half were in cluster A3 who accessed online resources the least number of times, while the other half, cluster A2, accessed lessons prior to class the joint-highest number of times. The lack of engagement of cluster A3 is striking and remains unexplained, and suggests the interplay of other factors, like student motivation or perception of the FC approach, that could be playing a role in their engagement. However, our expectation that online video material and associated activities should engage non-native English-speakers especially (Asef-Vaziri, 2015;McCarthy, 2016) was fully met for cluster A2. It was also interesting to note that while these students engaged actively before each class and weekly assessment, engagement eventually decreased before the final project deadline. It could be that the FC approach had succeeded in compensating for their initial language disadvantage, or that this cluster was composed of highly motivated students who had achieved the learning outcomes by the time the project approached. Therefore, it is fair to affirm that the hypothesis (H2) that non-native English-speakers would access the flipped classroom material to a larger extent than native speaking students would, can only be partially accepted.
Finally, it is interesting to note that clusters that were entirely female (A1, A2) or predominantly female (A3) accessed lessons before weekly exams the most, while the opposite was the case for the two all-male clusters (A4 and A5). Therefore, it seemed that females in this cohort were more engaged with the FC for continuous assessment. This coincides with previous findings that female students tend to participate significantly more in online learning activities than their male counterparts (Alhasani et al., 2017; Tsay et al., 2018), and that females are particularly sensitive to course design in the context of a FC (Chen et al., 2016). Examining when lessons were first accessed, two all-male English-speaking clusters (B1, B2), two female non-English-speaking clusters (B4, B5), and one female English-speaking cluster (B3) were identified. One female non-English-speaking cluster (B4) accessed material much earlier in all three time periods, over twice as early prior to class and in advance of the project deadline as cluster B5, which was demographically similar. By contrast, the pattern of behaviour displayed by the other female non-English-speaking cluster (B5) resembled that of the native English-speaking clusters, meaning that perhaps these students did not perceive the need for the additional support that the FC approach had intended to provide. Thus overall, these mixed results confirmed the hypothesis (H3) that students would access flipped classroom material differently according to their gender and language profile.

Limitations and further study
The quantitative data analysis used in this study has identified differences in how clusters of students access material but not the rationale behind those behaviours. Overall, results demonstrate the capacity of cluster analysis to identify differences among students with similar demographics, while acknowledging that its explanatory capacity is somewhat limited. The underlying diversity that may be at play was not fully captured and calls for triangulation with qualitative results that investigate further the motivations, attitudes, and experiences that students brought into the process. For example, as the application and integration of topics covered formed a major part of the group project due at the end of the semester, it would have been interesting to triangulate these findings with the engagement of these students in class discussions, which focused on the practical application of the skills showcased in the FC format, and which in turn related to the group project. This dual analysis of online and face-to-face behaviour is rare, and more research is called for to address it (Lust et al., 2011). Nonetheless, using cluster analysis has allowed us to identify groups according to variables (native language and gender) that would have been averaged out using statistical techniques. Thus, the application of cluster analysis has been useful in developing a more nuanced understanding of variation within demographic subgroups, but as an exploratory technique, it raises multiple research questions. The use of cluster analysis is therefore suggested not as an alternative, but rather as a complementary exploratory technique. Future studies proposing to examine groups identified using cluster analysis need to consider several practical issues.
Given the time needed to access and preprocess log-file data and then run a cluster analysis, it may be that, as in this study, it is practically difficult to obtain other measures for triangulation with postgraduate students while they are still on campus. In order to capture qualitative data, further studies may utilize alternative cohorts in earlier years or focus on interventions that happen earlier in the academic year. Interviews, surveys or focus groups could be conducted prior to analysis, with student responses later grouped based on cluster membership, but this is not without the risk of uneven representation and of bias due to observation. Another challenge when collecting qualitative data is likely to derive from the representation of cluster characteristics and size. For example, if using focus groups, as the size of each cluster will vary, some focus groups will be larger than others if conducted proportionately. Alternatively, if focus groups have the same number of participants, some will represent a higher proportion of their cluster than others. If possible, it would be beneficial if clustering were carried out before final performance metrics were made available, so that focus group representation could be based on cluster membership without students feeling they were assigned to a focus group on the basis of their grade.
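The trade-off between proportionate and equal-sized focus groups can be illustrated with a short sketch. The cluster sizes below are hypothetical (they are chosen only to sum to the cohort size of 38 reported for this study), and the function names and the largest-remainder rounding rule are assumptions made for illustration, not part of the study's method.

```python
def proportionate_allocation(cluster_sizes, total_participants):
    """Share focus-group places in proportion to cluster size.

    Uses largest-remainder rounding so the places sum exactly to
    total_participants (an illustrative choice, not the study's method).
    """
    population = sum(cluster_sizes.values())
    quotas = {c: n * total_participants / population
              for c, n in cluster_sizes.items()}
    places = {c: int(q) for c, q in quotas.items()}
    leftover = total_participants - sum(places.values())
    # Hand the remaining places to the clusters with the largest fractions.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - places[c],
                          reverse=True)
    for c in by_remainder[:leftover]:
        places[c] += 1
    return places


def equal_allocation(cluster_sizes, per_group):
    """Give every cluster the same number of places, capped at its size."""
    return {c: min(per_group, n) for c, n in cluster_sizes.items()}


def coverage(cluster_sizes, places):
    """Fraction of each cluster that would take part in a focus group."""
    return {c: places[c] / cluster_sizes[c] for c in cluster_sizes}


# Hypothetical cluster sizes summing to 38 students.
sizes = {"A1": 12, "A2": 10, "A3": 8, "A4": 5, "A5": 3}

proportionate = proportionate_allocation(sizes, total_participants=15)
equal = equal_allocation(sizes, per_group=3)

for name, places in (("proportionate", proportionate), ("equal", equal)):
    print(name, places, coverage(sizes, places))
```

Under these assumed sizes, the proportionate scheme yields groups of unequal size, whereas the equal scheme samples the whole of the smallest cluster but only a quarter of the largest, which is the imbalance described above.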
A second limitation has to do with the development of proxy variables as indicators of student engagement, following precedents in the literature (Jo et al., 2015). The study was limited by the log-file data that was automatically collected by the VLE. As flipped classes can use multiple online resources, it is important that researchers understand how interactions with such resources are recorded by the VLE. If possible, instructors should interact with a copy of the online course to determine whether the log-file data meets their needs, and should define how the data will be preprocessed in advance of the course, to provide time for any structural changes to be made. While an important advantage of using VLE data is that it helps overcome the self-reporting bias that is common when exploring engagement with the FC approach (Gilboy et al., 2015), using log-file data is not devoid of its own limitations and does not necessarily imply that a degree of learning by students took place. In addition, a qualitative dimension is called for in further investigations to provide further insight into the findings, together with other proxy measures of academic achievement that can enrich the complexity of the cluster analysis, such as in Lust et al. (2011). This study included the average percentage score for each cluster, with some indication that performance was relevant to cluster usage patterns. However, the numbers were insufficient to draw statistically significant conclusions. Future studies could either use cluster analysis with larger student cohorts, or consider the applicability of other analytics techniques. For example, decision tree analysis would help to investigate the relative importance of existing proxy variables when the performance of each student is predefined based on a performance classification such as pass/fail or discrete letter grades.
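As a minimal sketch of the idea behind such a decision tree analysis, the example below ranks proxy variables by the largest reduction in Gini impurity that a single threshold split on each variable achieves against a pass/fail label. This is only the first split of a tree (a "decision stump"), not a full decision tree, and it was not used in this study. The variable names and all data values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split_gain(values, labels):
    """Best impurity reduction from one threshold split on `values`.

    Returns (gain, threshold); threshold is None if no split helps.
    """
    parent = gini(labels)
    n = len(labels)
    best_gain, best_threshold = 0.0, None
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_gain, best_threshold = gain, threshold
    return best_gain, best_threshold


def rank_variables(records, label_key):
    """Rank every non-label variable by its best single-split gain."""
    labels = [r[label_key] for r in records]
    variables = [k for k in records[0] if k != label_key]
    gains = {v: best_split_gain([r[v] for r in records], labels)
             for v in variables}
    return sorted(gains.items(), key=lambda kv: kv[1][0], reverse=True)


# Hypothetical per-student proxy variables and pass (1) / fail (0) labels.
students = [
    {"accesses_before_class": 2, "accesses_before_exam": 9, "passed": 1},
    {"accesses_before_class": 7, "accesses_before_exam": 8, "passed": 1},
    {"accesses_before_class": 3, "accesses_before_exam": 7, "passed": 1},
    {"accesses_before_class": 6, "accesses_before_exam": 2, "passed": 0},
    {"accesses_before_class": 1, "accesses_before_exam": 3, "passed": 0},
    {"accesses_before_class": 5, "accesses_before_exam": 1, "passed": 0},
]

ranking = rank_variables(students, label_key="passed")
for variable, (gain, threshold) in ranking:
    print(variable, round(gain, 3), threshold)
```

In this invented data set the number of accesses before the weekly exam separates passing from failing students more cleanly than the number of accesses before class, so it is ranked first. With real cohorts, a full tree grown recursively and validated on held-out data would be needed before drawing any such conclusion.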
As one of the shortcomings of the cluster analysis technique is that the selection of classification measures is critical for the results (De Smet et al., 2008), future studies should be guided by previous research in choosing those variables most likely to contribute to a meaningful cluster solution.
Another potential limitation, and at the same time an affordance of cluster analysis, is related to the small sample size for this study (38). Other papers have used cluster analysis for similar (Verburgh et al., 2013) and fewer items (Kreber et al., 2005; Zmuk, 2015), with some studies examining flipped courses with even lower numbers of students (Critz & Wright, 2013; Mason et al., 2013; Strayer, 2012; Wilson, 2014; Yen & Lee, 2011). Nonetheless, while the use of log-file data in this study adds an element of internal validity, results should not be generalised. Many uncontrolled variables are likely to be at play, most probably related to the culture of origin of international students. Our sample of non-native English-speaking students comprised a mix of national backgrounds but was of Asian origin in its majority. Asian students are likely to be used to a more content-transmission style of teaching (Howitt & Pegrum, 2015), and often display a more passive learning style (Gram, Jaeger, Liu, Qing, & Wu, 2013), resulting in an over-reliance on the teacher figure as a source of authority (Holvikivi, 2007). Further studies that exploit the potential of learning analytics are called for in order to understand how FC approaches can effectively support these students in the acculturation process in Western educational contexts. Future studies that include large numbers of international students from different cultures, such as continental Europe, Asia and India, would allow an examination of intercultural differences. Also, the inclusion of standardised English language test scores would help identify students' varying language abilities. As this study examined the section of a course requiring some computation by postgraduates, future studies should include consideration of students' primary degree, relating this to the material covered in the course examined.
Also, while students in this study were all recent graduates, the inclusion of an age variable would be useful in courses with a significant proportion of mature students.

Conclusion
While seeking to explore the effectiveness of cluster analysis as a tool to examine VLE log-file data in the context of a FC, this investigation also contributes to the understanding of FCs using learning analytics. It offers a novel examination of students' patterns of use of online resources, emerging from a cluster analysis of log-file data, while considering the diversity of the student cohort. This paper differs from much prior research by focusing on what students did, rather than on their perceptions (Beenan & Arbaugh, 2019; Garnjost & Lawter, 2019) or exam performance (Balaban et al., 2016; Lopes & Soares, 2018). The study also differs from recent research using cluster analysis on FC VLE data, as it does not just focus on what students did prior to class (Jovanović, Gašević, Dawson, Pardo, & Mirriahi, 2017) but also considers how student activity is related to continuous assessment. In focusing on student actions rather than perceptions, this study responds to the pressing need identified by Song, Jong, Chang, and Chen (2017) to examine how FCs are implemented. The application of cluster analysis to VLE records was novel, and helped to untangle some of the hidden diversity present, which would not have been as easy to identify with more commonly used statistical analysis techniques.
Placing our focus especially on non-native English-speaking students, it was found that this group was split in half between those who accessed online resources the fewest and the greatest number of times, both before classes and in preparation for continuous assessment. This revealed an underlying diversity that deserves further study, including the role of cultural background, previous educational experiences in a Western higher educational setting, and language proficiency, amongst others. We were reminded of the importance of considering diversity both within international populations and within entire student populations. Results demonstrated that the FC approach succeeded in engaging most students in the class and constituted inclusive support that can provide the flexibility that current students demand. In doing so, it is likely that we can cater to the diversity in the whole student group rather than imposing a deficiency model based on identifying at-risk students. In order to use learning analytics to improve the quality of learning for all students, we need to understand what successful patterns look like when reflected in data, and subsequently adjust the course design while taking into account that not all students engage with learning resources to the same extent (Bos, Brand-Gruwel, & Acm, 2016). Illustrating this point, we observed that the online resources were accessed by different clusters of students in distinct ways and had a potential role in helping those who had deferred their engagement with the course, for whatever reason. We found evidence that students engaged with the FC design throughout the semester in ways that are likely to encourage independent learning, and that continuous assessment was a major driver of their engagement.
This highlighted the importance of constructive alignment (Biggs, 2014) of FC educational approaches, and placed greater emphasis on the way FCs are structured and assessed at a course level to enhance the re-usability of educational resources. Ultimately, the FC is only one element of active blended learning (Armellini, 2018), and is only relevant to the extent to which it can engage students actively in a variety of ways in and outside the classroom.
J.N. Walsh and A. Rísquez The International Journal of Management Education 18 (2020) 100381