Social learning analytics in computer-supported collaborative learning environments: A systematic review of empirical studies
Computers and Education Open

Social learning analytics (SLA) is a promising approach for identifying students' social learning processes in computer-supported collaborative learning (CSCL) environments. To identify the main characteristics of SLA, as well as gaps and future opportunities for this emerging approach, we systematically identified and analyzed 36 SLA-related studies conducted between 2011 and 2020. We focus on SLA implementation and methodological characteristics, educational focus, and the studies' theoretical perspectives. The results show the predominance of SLA in formal and fully online settings, with social network analysis (SNA) as a dominant analytical technique. Most SLA studies aimed to understand students' learning processes and applied the social constructivist perspective as a lens to interpret students' learning behaviors. However, (i) few studies involve teachers in developing SLA tools, and studies rarely share SLA visualizations with teachers to support teaching decisions; (ii) some SLA studies are atheoretical; and (iii) the number of SLA studies integrating more than one analytical approach remains limited. Moreover, (iv) few studies leveraged innovative network approaches (e.g., epistemic network analysis, multimodal network analysis), and (v) studies rarely focused on temporal patterns of students' interactions to assess how students' social and knowledge networks evolve over time. Based on the findings and the gaps identified, we present methodological, theoretical and practical recommendations for conducting research and creating tools that can advance the field of SLA.


Introduction
Following the extensive use of digital technology in education, a growing field of learning analytics (LA) has emerged since 2011. The term is used to describe studies aimed at exploring students' behavior based on large datasets gathered from digital learning environments (Drachsler & Kalz, 2016). The field of LA aims to explore how the data generated from students' learning activities can yield an evidence base to inform student support and effective design for learning. For example, in a recent review of 2730 studies on LA, Adeniji (2019) found a tremendous growth in articles using LA approaches to analyze the complexity of learning processes. LA studies have increasingly made use of methodologies that go beyond educational data mining and automated discovery, introducing approaches such as social network analysis (SNA), discourse analysis, natural language processing, and multimodal LA [32]. In this regard, as a broad interdisciplinary community, LA research is focused on a range of epistemologies, ontological approaches, and methods (Author B, 2020). For example, results related to students' online profiles could be classified on several levels: the descriptive level (what happened), the diagnostic level (why it happened), the predictive level (what might happen), and the prescriptive level (what should be done) [13,64]. Importantly, as the field of LA continues to evolve, it is transitioning from a field largely focused on generating predictive models for the purpose of student retention to more sophisticated analyses of students' learning processes and, in particular, group and social-based practices [32].
Accordingly, some LA researchers have drawn on socio-cultural [37] and other pedagogical approaches due to the recognition that knowledge and skills can be developed through social interactions and collaboration between two or more people [2], and should therefore be addressed specifically in practice, research and theory. In this regard, a distinctive subset of LA referred to as social learning analytics (SLA) [7], which highlights the social perspective of learning, has attracted increased attention from LA researchers [7,26]. The impetus behind SLA is the recognition that social interactions are a major source of knowledge construction, yet current LA research has taken them for granted. Consequently, SLA deserves serious consideration as an approach that helps teachers, students, and other educational stakeholders make sense of the complex educational data generated during social activities.
The goal of this paper is to systematize and summarize the empirical and theoretical findings regarding SLA, with a focus on SLA implementation and methodological characteristics/considerations, the primary learning and teaching-related problems addressed by SLA, and the theoretical perspectives of the identified studies. In particular, we are interested in exploring the current progress and trends in the emerging approach of SLA. Hence, the objectives of this paper are twofold: (1) to identify the main characteristics of SLA; and (2) to identify gaps and future opportunities for conducting research and creating tools that can help advance the field of SLA.
We argue that a review of SLA is needed (i) to understand and conceptualize the existing body of SLA studies; (ii) to provide evidence about the implementation of SLA across a wide range of settings, techniques, and data sources; (iii) to offer a synthesis of the theories and conceptual frameworks that have informed SLA studies; and (iv) to develop a set of pointers for conducting rigorous SLA research. Thus, this study can provide a springboard for other researchers and practitioners interested in exploring SLA's potential to identify students' behaviors and learning patterns within computer-supported collaborative environments.

Overview of social learning analytics
To clarify the concept of SLA, we use the definition suggested by Buckingham Shum and Ferguson [7]. They defined SLA as the collection and measurement of students' produced digital artefacts and online interactions in formal and informal settings in order to analyze their activities, social behaviors, and knowledge creation in a social learning setting [7]. In contrast with LA approaches such as predictive analytics, which often emphasize individual learning processes [59], SLA attempts to account for the socio-cultural contexts in which learning takes place [9]. Sociocultural theory holds that learning is interconnected within a broader ecology, that all cognitive functions originate in social interactions, and that learning is the process by which learners are integrated into a knowledge community [30]. In this line, SLA, as an extension of LA, concentrates on the study of group processes and the collaborative construction of knowledge [11] from activities performed in social learning environments or participatory cultures (e.g., the production of digital artifacts and online interactions) [14]. The intention is to make these visible to learners, learning groups, and teachers, along with recommendations that spark and support learning [7].
In the original definition of SLA, Buckingham Shum and Ferguson [7] identified five categories of SLA under the umbrella of "inherent social analytics" and "socialized analytics." The inherent SLA categories include: (i) social learning network analytics (SLNA), which employ networked approaches to study student interactions when they are socially engaged; and (ii) social learning discourse analytics (SLDA), focused on analyzing textually based constructed knowledge [26] through large amounts of text generated during online interactions. The socialized SLA categories include: (iii) social learning content analytics, which uses automated methods to examine, index, and filter learner generated content (e.g. documents, images, logos); (iv) social learning context analytics, which involves analytic tools that expose, make use of, or seek to understand learning contexts; and (v) social learning disposition analytics, which combines learning dispositions data with data extracted from computer assisted, formative assessments (e.g., [3]).
The objective of this review is to examine studies employing inherent SLA (SLNA and SLDA), since these are primarily concerned with social interaction at the learning group level [7]. Thus, we use the abbreviation 'SLA' to refer to both SLNA and SLDA throughout the paper. These two forms of SLA are further described below.

Social learning network analytics (SLNA)
SLNA is a subset of SLA that emphasizes the study of individual and group interactions between learners, teachers, communities and resources within social settings using networked learning approaches such as social network analysis (SNA) [27] and epistemic network analysis (ENA) (Shaffer & Ruis, 2017). The principles of networked learning approaches such as SNA derive from graph theory, which looks at patterns of relations between nodes in a graph. The nodes in a social network graph (sociogram) are the actors, who can be individuals (egocentric) or collective units such as teams or organizations (whole unit) [27]. In learning and education settings, the actors may be students connected to each other within a class or collaborative learning activity, teachers and students in a class, or students and resources. By combining principles of networked learning approaches and computer-supported collaborative learning (CSCL), methods of learning analytics can be employed to provide information about group interactions in social settings at multiple levels of abstraction and about how these could be used to support teaching and learning processes.
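To make the graph-theoretic framing concrete, the following is a minimal sketch, assuming a hypothetical log of forum replies in which each record names who replied to whom. The student labels, the `replies` list, and the helper functions are illustrative assumptions and are not taken from any reviewed study or tool. The sketch treats students as nodes and replies as directed ties, then computes each actor's in-degree and out-degree and the overall network density.

```python
from collections import defaultdict

# Hypothetical reply log: (author_of_reply, author_being_replied_to).
replies = [
    ("ana", "ben"),
    ("ben", "ana"),
    ("cy", "ana"),
    ("dee", "ana"),
    ("ana", "cy"),
]


def build_edges(pairs):
    """Collapse repeated replies into a set of directed ties (no self-loops)."""
    return {(src, dst) for src, dst in pairs if src != dst}


def degrees(edges):
    """Return (in_degree, out_degree) dictionaries keyed by actor."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return dict(in_deg), dict(out_deg)


def density(edges, actors):
    """Share of all possible directed ties that are actually present."""
    n = len(actors)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0


edges = build_edges(replies)
actors = {actor for edge in edges for actor in edge}
in_deg, out_deg = degrees(edges)
network_density = density(edges, actors)
```

In this toy network, an actor who receives many replies has a high in-degree, while an actor absent from `in_deg` has posted without receiving any reply, which is the kind of pattern a sociogram would make visible to a teacher.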

Social learning discourse analytics (SLDA)
SLDA is a subset of SLA, which involves the analysis of large amounts of text generated during the online interactions [7]. SLDA focuses on analytics to support high-quality discourse for learning contexts through the analysis of discourse data [38]. A central premise of the socio-cultural perspective is that language plays a significant role in understanding the learning process. This claim has been supported by previous research which reported that educational success is related to the quality of learners' educational dialog [21], which can be measured through discourse analysis. This implies that SLDA can be used to analyze large amounts of educational text, and potentially provide insights into the quality of students' text and speech posted in online collaborative environments. This approach supplements the insights generated by SLNA approaches, which examine connections without necessarily examining what the actors are paying attention to.

Social learning analytics in computer-supported collaborative learning environments
Computer-Supported Collaborative Learning (CSCL) is the field concerned with how computers might support learning in groups (co-located and distributed). It is also about understanding the actions and activities mediated by the computer in collaborative learning [40]. The research questions addressed in CSCL include how individuals learn with domain-specific tools, how small groups interact and develop shared meanings over time, and how online learning in communities (e.g., MOOCs) creates new conditions for teaching and learning at scale. In this rapidly evolving field, Ludvigsen et al. [39] argue that CSCL is characterized by a more or less stable base of two epistemological stances, individualism and relationism. Individualism in CSCL means that researchers use a cognitive perspective on group learning (e.g., shared cognition, predefined analytic categories, individualized knowledge), whereas relationism in CSCL is aligned with a sociocultural perspective (emergent collaboration, mediation, learning as a process).
Learning analytics has a role in both perspectives as technology support. For example, Wise et al. [63] make a distinction between using learning analytics as a research tool in CSCL (analytics of collaborative learning, ACL) and using analytics as a mediational tool in collaborative learning (collaborative learning analytics, CLA). This dichotomy is not identical to the previous one, but it shows a trend of development from ACL to CLA by integrating aspects of Ludvigsen et al.'s [39] two stances. With ACL, the core challenge is to map digital traces to learning constructs. CLA takes this one step further and seeks to bridge "from clicks to constructs," starting from specific CSCL technologies identifying "clicks" (e.g., discussion forums, knowledge building environments, eye-tracking) and followed by monitoring and reporting conceptual aims and understanding ("constructs") judged important in CSCL (e.g., uptake of ideas, promising ideas in knowledge building, and joint attention), respectively.

Supporting teaching through social learning analytics
SLA has emerged as a potential approach to provide insights and inform teaching decisions using hidden information in large amounts of educational data extracted from computer-supported collaborative learning (CSCL) environments (e.g., learning management systems [LMS] and wikis) [4,9,28]. This is particularly important, as current challenges in higher education require active student participation to encourage 21st-century skills such as critical thinking, collaboration, and self-regulation [48]. A consistent theme throughout most of the literature taking a student-centered approach is the importance of shifting the focus of the teaching and learning process away from the teacher, and instead empowering students to take a more active part in the construction of knowledge [12], through student-centered pedagogical approaches such as online discussions rather than having them as passive receivers of information [5]. For example, Hernandez-Garcia et al. (2015) showed that SNA could highlight the visible and "invisible" interactions occurring in online environments, thus helping teachers to improve the teaching and learning process based on the information about the actors and their activity in the online learning environment.
Meanwhile, a common challenge highlighted in the literature is that teachers find it difficult to monitor and support students' learning through approaches like online discussions, due to a large number of students and the complexity of online learning environments (Martinez et al., 2020). In this regard, SLA could be instrumental in providing insights to teachers about students' learning behaviors, which they can leverage to support students as active learners within CSCL environments [14]. For example, Kaliisa et al. (2019) used SLA (i.e., social learning network and discourse analytics) to analyze and visualize 34 students' online learning processes in a semester-long undergraduate course, using data generated from four weekly online discussions. Their findings revealed that SLA could be used to analyze students' cognitive and social learning processes in online learning environments, which teachers can leverage to make learning design decisions.
However, using SLA to support teaching and learning is not without challenges. For example, because SLA relies mainly on the study of interactions in online environments, it is challenging to implement in blended learning environments where digital interactions are limited. In addition, the need to obtain students' informed consent to use their data from online social learning environments makes the use of SLA approaches such as SLNA problematic in very large communities (e.g., social media, MOOCs), since the inclusion of all subjects is important to leverage the power of network statistics.

Related reviews and identified gaps
There is an increasing body of systematic reviews that have reviewed the literature on LA from different perspectives, including open learner models [6], LA dashboards (LADs) [43], trends [32], multimodal LA [15,55], drivers, developments, and challenges [19]. For instance, in a review of 102 studies, Bodily et al. [6] reviewed open learner models and LADs, outlining the key themes (i.e., intelligent tutoring, self-regulating learning) and forms of data (i.e., assessment data) in the extant literature. In the same vein, Matcha et al. [43] reviewed 29 studies on LADs, examining whether they found support for self-regulated learning [65].
Viberg et al. [59] reviewed 252 studies on LA in higher education and reported little evidence of improvement in students' learning outcomes because of LA. Adeniji (2019) carried out a bibliometric study of LA based on 2730 papers, with the aim of examining the intellectual structure of the LA domain. The review concluded that LA had captured the attention of the global community but recommended that future research should examine the impact of social networks on students' learning. More recently, Ifenthaler and Yau (2020) reviewed 46 empirical LA articles to explore LA's utility in facilitating study success in higher education. They concluded that different forms of data (e.g., background, behavior data, assessment data, and self-reported data) are all necessary in supporting student success.
While these systematic reviews provide important insights into the broader research on LA, to the best of our knowledge, no studies have included a specific focus on the social perspective in LA reviews or attempted to piece together different studies that employ SLA. The closest studies to ours include Vieira et al.'s [57] systematic review of 52 visual LA (i.e., LA facilitated by visual interfaces/interactions) studies. The study revealed that limited work has been done to bring visual LA tools into classroom settings and that few studies employ sophisticated visualizations (e.g., interactive scatterplots). However, the study was limited in scope, emphasizing visual interfaces produced by LA systems. Jan et al. [29] analyzed studies using social network analysis (SNA) for investigating learning communities. However, this study focused only on studies using SNA across different disciplines without necessarily taking an LA perspective. While SNA is one of the tools used in SLA, it is important to note that SLA goes beyond visualizing social networks by emphasizing the analysis of students' online social interactions and artifacts to understand, explain and improve their learning [26]. Moreover, although the existing literature reviews offer valuable contributions and overviews of various research issues concerning LA, these reviews concentrate on the broader aspects of LA adoption, with no specific attention to SLA. We attempt to bridge the aforementioned gaps with the current review by examining the implementation and methodological characteristics of SLA.

Research questions
Three research questions guide this work: (1) What are the characteristics of SLA studies, particularly the methodologies (e.g., approaches, types of data, sample, tools, analysis techniques) and implementation tools (e.g., scale, settings) used from 2011 to 2020?
Research question 1 is grounded on findings from previous LA studies (e.g., [60]), which sought to provide a clear picture about students' learning by using relevant tools to collect meaningful data from relevant contexts. For example, Rogers et al. [49] argued that the use of inaccurate proxies and aggregate data for tracking and measuring academic performance is a key challenge that could affect teachers' adoption of LA. In the same way, Williamson [60] stated that "educational researchers will need to develop conceptual and methodological tools to investigate the social lives of educational data by performing genealogical investigations of their tangled social, technical, political, economic and scientific threads" (p. 205). In this review, we intend to scrutinize the approaches, data sources, and techniques used in SLA studies and assess the extent to which they align with the meaningful understanding of students' online social learning processes.
(2) What questions about learning and teaching have been addressed using SLA?
Gašević et al. [23] emphasize that LA is meant to support learning, and all LA efforts should be guided towards the support of teaching and learning practices. It is therefore important to analyze the kinds of questions being addressed by the different studies employing SLA and whether the focus of these studies supports learning and teaching within the different contexts.
(3) How do existing studies integrate learning theories into SLA strategies?
Research question 3 addresses gaps in the current literature on LA, which has highlighted the lack of connection between LA and theory [37,62]. Learning theories play an important role in transforming results obtained from LA into insights about learning. While LA can help to identify student behavior patterns and add new understanding to the field of educational research, it alone does not provide explanations for underlying mechanisms [62]. Buckingham Shum and Ferguson [7] claimed that SLA is strongly grounded in learning theory and focuses attention on elements of learning that are relevant for learning in a participatory culture. Nonetheless, there remains a significant absence of theory in the LA research literature [22]. Thus, we aim to identify and classify the theories, models, and pedagogical assumptions that drive SLA studies.

Methodology
The methodology employed in this review is an adaptation of the three phases of a systematic review described by Kitchenham [36] (i.e., planning, conducting, and reporting the review). We chose Kitchenham's guidelines because they are high-level yet clear and easy to use, and they support a fair evaluation of a topic.

Planning the review
We started by identifying the need for a systematic review, as suggested in Kitchenham's guidelines. We tried to identify previous systematic reviews that addressed either our research questions or similar questions. However, as discussed above, none of the reviews focused on SLA. Thus, following Kitchenham's guidelines, we developed a review protocol to guide the execution of the systematic review. This process involved defining the search strategy, selecting criteria, developing quality assessment criteria, extracting data, and formulating a data analysis plan.
Search strategy and selection criteria: To search for relevant studies, we selected the following databases because they contain relevant literature for the field of LA: ACM Digital Library, Scopus, Web of Science, and Google Scholar. We also reviewed the proceedings of the International Learning Analytics and Knowledge (LAK) Conference (https://www.solaresearch.org/events/lak/) to identify relevant studies, as this is a key venue for LA research (Adeniji, 2019). Lastly, we scanned reference lists from relevant primary studies and review articles. Given that SLA is a relatively new approach with limited research, this review considered all potentially relevant types of papers (e.g., journal articles, book chapters and conference papers) to provide a comprehensive picture of the current research efforts on SLA implementation. To search this diverse body of literature, we used the following combinations of keywords, which cover the main themes of the review: "social learning analytics AND higher education," "social learning analytics AND learning," "social learning analytics AND teaching," "learning analytics AND online learning environments," and "learning analytics AND social network analysis." In order to identify relevant studies, a set of inclusion and exclusion criteria was defined (Table 1).

Conducting the review
The target population of this review was a set of studies that reports on SLA between 2011, when the field of LA was defined, and May 2020, when the search process was completed. We searched for relevant papers based on the search strings and inclusion/exclusion criteria defined in the previous section. The first search process resulted in 1540 potential studies, which were then screened to determine the relevance of each paper for the systematic review. We excluded a number of studies, such as those using SNA but not within the discipline of LA. A thorough analysis of the papers' titles and abstracts returned 131 papers. Two researchers screened these using textual analysis based on the quality criteria (see Table A1) adapted by Mangaroska and Giannakos [42] in their systematic review of LA for learning design. These two researchers checked the extracted papers to ensure consistency and disagreements were discussed until consensus was reached. Following this process, 36 studies were selected and included in the final analysis. A summary of the systematic execution process is illustrated in Fig. 1.
Data coding and categorization: This phase involved the determination of an overall classification system for managing the data extracted in the different phases to ensure methodological rigor [10]. Following the screening process, two researchers ordered, coded, and categorized the selected papers using Google Sheets, which allowed easy collaboration and continuous update of the database throughout the review process. The reviewed studies were coded according to six dimensions in response to the research questions: study focus; target audience (e.g., teachers, students); SLA approach (e.g., SLNA, SLDA); theoretical framework (e.g., socio-constructivist); methodology (e.g., analysis approach, types of data, sample size, tools); and implementation details (e.g., scale, study settings). Social moderation (discussion between researchers) was used to settle any differences in the coding process. Finally, a narrative analysis of the quantitative and qualitative data was undertaken to provide a summarized overview of the themes identified from the studies. See Table A2 for a summary of all details extracted from each study.
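The six coding dimensions can be pictured as one record per study. The sketch below is only an illustration of such a coding sheet, assuming hypothetical field names and three made-up placeholder records; it does not reproduce the authors' Google Sheets layout or any entry from Table A2.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedStudy:
    """One row of a hypothetical coding sheet, one field per coding dimension."""
    study_focus: str
    target_audience: tuple  # a study may target more than one group
    sla_approach: str
    theoretical_framework: str
    methodology: str
    implementation: str


# Placeholder records for illustration only (not real reviewed studies).
coded = [
    CodedStudy("participation patterns", ("teachers",), "SLNA",
               "socio-constructivist", "SNA of forum logs",
               "course level, fully online"),
    CodedStudy("discourse quality", ("students", "researchers"), "SLDA",
               "not reported", "automated content analysis",
               "course level, blended"),
    CodedStudy("knowledge building", ("teachers", "students"), "SLNA+SLDA",
               "socio-constructivist", "SNA and discourse analysis",
               "course level, fully online"),
]

# Tally studies per SLA approach (each study counted once).
approach_counts = Counter(study.sla_approach for study in coded)

# Tally audiences (a study counts once for every group it targets).
audience_counts = Counter(
    audience for study in coded for audience in study.target_audience
)
```

Because a study contributes to every audience it targets, audience tallies produced this way can sum to more than the number of studies, while approach tallies sum exactly to it.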

Findings
The results section is divided into two subsections. The first subsection provides a brief description of the included papers to provide a context for understanding the analyzed SLA studies. The second subsection considers the results from the analyzed papers with reference to the three research questions stated in the background section.

Table 1 Inclusion/exclusion criteria.

Inclusion | Exclusion
The study applies inherent SLA approaches (e.g., SLNA and/or SLDA), as suggested by Buckingham Shum and Ferguson [7]. | The study does not focus on the inherent SLA approaches.
The study is contextualized in an online social learning environment (e.g., LMS, social media platforms). | The study is not contextualized in an online social learning environment (e.g., LMS, social media platforms).
The study was published between 2011, when the field of LA was defined, and May 2020, when the search was completed. | The study was published before 2011 or after May 2020.
The study was published in English. | The study was not published in English.

Quality criteria (fragment; see Table A1):
Was the research design appropriate to the aims of the study?
5 Does the study clearly determine the research methods? (i.e. subjects, instruments, data collection, data analysis)
6 Was the data analysis sufficiently rigorous?
8 Is the study of value for research or practice?

Descriptive information of included studies
The 36 studies included in our analysis consisted of 19 journal articles, 16 conference papers, and one book chapter. Fourteen studies (e.g., Khousa & Masud, 2015) targeted students, 13 were aimed at teachers (e.g., Vuorikari & Scimeca, 2012), and 13 addressed issues of relevance to researchers (e.g., Yen et al., 2019). Some papers targeted more than one group (e.g., Dascalu et al., 2016). One key finding here is the limited attention towards teachers, despite the documented evidence of the potential benefits of using SLA to support learning design decisions.

SLA approaches
The open coding led us to identify four clusters of SLA approaches applied by the different studies. First, the majority of studies (n = 12) were nested within SLNA, which employ network approaches to study individual (egocentric) and group learning processes.   The third cluster of studies (n = 7) combined SLNA with SLDA. For example, in a study of students' online interactions in an undergraduate course, Authors A, C, et al. (2019) employed SNA and discourse analysis to analyze and visualize students' online learning processes and the discussion features of students' discussion posts. Lastly, seven studies referred to SLA in general, with no reference to a specific form of SLA. Most of the studies in this cluster were theoretical or methodological in nature and aimed to introduce innovative SLA approaches and tools. For example, De Laat and Prinsen (2014) described SLA as instrumental in formative assessment practices, while Dascalu et al. (2016) explored the potential and challenges associated with SLA. Overall, beyond a few noteworthy exceptions identified in this review, the majority of SLA studies employed SLNA while the number of studies combining different SLA approaches was relatively low.

SLA tools
The review found a range of tools used in SLA, which were categorized into four forms. First, the review identified six SNA tools that were used in 12 SLA studies. The tools included Gephi (e.g., [51]); igraph (e. The third category was LMS built-in add-ons, consisting of tools used for SLA but embedded within an LMS. This category consisted of four tools. These included the visual discussion forum (e.g., Wise et al., 2013) and Forum Graph [25], a plug-in tool for Moodle that creates and displays the social graph of a single forum selected by the user. Chen et al. [9] developed CanvasNet, which turns discussion data from the Canvas learning management system into student-facing visualizations. The tool also shows snapshots of trending terms in student posts and contrasts a student's personal lexicon with the lexicon of the group for probable conceptual expansion. Another tool was GraphFES (Hernández-García et al., 2016), a web service and application for the extraction of forum-related activity in Moodle and the generation of data-rich participation, lurking, and message thread networks, which can then be analyzed using Gephi. The fourth category consisted of four general-purpose analysis tools that were used in SLA studies (e.g., Díaz-Lázaro et al., 2017) but are also used in domains other than SLA, namely SPSS, R, ORA, and NVivo. Eleven studies did not report any tools. The diversity of tools available for SLA analysis could point towards the flexibility of approaches but also towards a lack of sufficient maturity in determining common approaches for SLA.
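The idea of contrasting a student's personal lexicon with the group's lexicon can be illustrated with a small sketch. This is not CanvasNet's implementation: the tokenizer, the stop-word list, the function names, and the example posts are all hypothetical assumptions chosen to keep the example self-contained. The sketch counts word frequencies for the group and for one student, then lists the group's most frequent terms that the student has not yet used.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; a real tool would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "we", "i"}


def lexicon(posts):
    """Count lowercase word tokens across posts, skipping stop words."""
    counts = Counter()
    for post in posts:
        for token in re.findall(r"[a-z']+", post.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts


def unexplored_terms(student_posts, group_posts, top_n=3):
    """Group terms, most frequent first, that the student has not used yet."""
    own = lexicon(student_posts)
    group = lexicon(group_posts)
    missing = [(term, n) for term, n in group.most_common() if term not in own]
    return missing[:top_n]


# Hypothetical discussion posts.
group_posts = [
    "Scaffolding supports collaboration in the forum",
    "Collaboration needs scaffolding and feedback",
    "Feedback shapes collaboration",
]
student_posts = ["I think feedback is useful"]

suggestions = unexplored_terms(student_posts, group_posts)
```

Here the student has already used "feedback", so it is excluded, and the most frequent group terms the student has not used are surfaced as candidates for conceptual expansion.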
The analysis also revealed that some studies combined more than one analytical approach. To complement SNA findings, six studies combined SNA and automated content analysis. For example, Oliveira et al. (2016) used SNA to present a system for the integration of LMS and social media, presenting educational insights for teachers regarding the way online communities develop knowledge. Additionally, six articles combined SNA and inferential statistics (e.g., [4]), one study used SNA and manual content analysis [9], and another used inferential statistics and manual content analysis (Author B, 2017). Lastly, one study [24] used SNA and epistemic network analysis to analyze students' online learning processes, which highlighted different facets of the phenomenon of learning and knowledge development.
Data sources: The main source of information for SLA was online discussion forums (n = 17). This was followed by social media platforms (n = 7), such as Facebook, Twitter, and WhatsApp. Other sources included weblogs (n = 4); online videos (n = 3); assessment data such as grades (n = 2); surveys (n = 2); simulated artificial data (n = 1); and online documents (n = 1). Four articles were theoretical (e.g., [7]) and relied on secondary data. To a certain extent, the diversity of data sources mirrors the diverse sources that could enable the capture of insights into social learning dynamics across learning settings.
Sample size: Coding revealed that the majority of SLA studies applied large sample sizes, with six studies having a sample size between 1000 and 160,000 participants, 11 studies with a sample size between 100 and 1000, and 10 studies with a sample size between 10 and 100. One study had a sample size of fewer than 10 participants, three did not specify sample size, and five were coded as not applicable (i.e., theoretical papers).
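As a quick consistency check on the figures above, the sample-size categories can be summed to confirm that they account for all 36 reviewed studies. The counts come from the text; the dictionary labels are paraphrased and the variable names are illustrative.

```python
# Counts per sample-size category as reported in the text.
sample_size_bins = {
    "1000 to 160,000": 6,
    "100 to 1000": 11,
    "10 to 100": 10,
    "fewer than 10": 1,
    "not specified": 3,
    "not applicable (theoretical)": 5,
}

# Total number of studies covered by the categories.
total = sum(sample_size_bins.values())

# Percentage share of each category, rounded to one decimal place.
shares = {
    label: round(100 * count / total, 1)
    for label, count in sample_size_bins.items()
}
```

The categories sum to 36, matching the number of included studies, so each study falls into exactly one sample-size category.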
Learning context and settings: We analyzed studies to establish the settings in which SLA studies have been undertaken. The findings showed that SLA has traditionally been performed in formal learning settings (n = 25), specifically universities. Four studies were conducted in informal learning settings, such as workplace learning environments [20], social media platforms (e.g., Facebook), and online community forums and professional learning networks (Cambridge & Perez-Lopez, 2012). Lastly, three studies (e.g., Vuorikari & Scimeca, 2012) were conducted in non-formal learning settings (e.g. online conferences). At the same time, the coding revealed that the majority of SLA studies have been conducted in fully online settings (n = 22), such as MOOCs (e.g., [16]), where there is scope for increased integration of social learning activities given the large number of students in such courses. Only nine studies were situated in blended learning environments (e.g., Adraoui et al., 2017; [50,51]).
The findings on the settings in which SLA is undertaken are further explained by the observed relationship between the research setting and sample size. For example, Fig. 3 shows that, on the one hand, most studies with larger sample sizes (e.g., between 100 and 1000 and above 1000) were conducted in formal and fully online learning settings, such as MOOCs (see, for example, the intersection between studies with more than 1000 participants and MOOCs as the learning setting in Fig. 3). On the other hand, studies with a small sample size (e.g., between 10 and 100 and below 10) were mainly conducted in high schools or universities, and specifically in blended learning environments. This finding implies that fully online environments could be more convenient for collecting SLA data than physical learning environments.
The scale of implementation: The majority of the studies were carried out at the course level (n = 23). Three studies were implemented at the scale of online communities such as Facebook groups (e.g., Oliveira & Figueira, 2016). Only one study was implemented at a program level (Cordova et al., 2018), and another at an international level. For example, Vuorikari and Scimeca (2012) used SLA to study teachers' large-scale professional networks and collaboration throughout Europe. No SLA study was implemented at an institutional level. The predominance of studies conducted at a course level could be justified by the emerging and exploratory state of SLA research and LA as a field in general.

Questions about learning and teaching addressed by SLA research (RQ2)
We analyzed the kinds of questions addressed by the different SLA studies. The primary focus of the majority of SLA studies was understanding students' learning processes (n = 19). These included studies centered on identifying relevant actors in social learning environments, such as the most or least active students (Hernández-García et al., 2016; Kaliisa et al., 2019), the relation between SNA centrality measures and students' learning behaviors (Hernández-García et al., 2015), how students respond to the messages of others (Wise et al., 2013), and students' learning styles [4].
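To make the idea of identifying the most and least active students concrete, the following minimal Python sketch computes a normalized degree centrality from a list of forum replies. The reply data, the student names, and the function name `degree_centrality` are hypothetical and are not taken from any of the reviewed studies, which typically relied on dedicated tools such as Gephi rather than hand-written code.

```python
from collections import defaultdict


def degree_centrality(replies):
    """Return normalized degree centrality for an undirected reply network.

    `replies` is an iterable of (author, replied_to) pairs. A student's
    degree is the number of distinct peers they exchanged messages with,
    divided by the maximum possible number of peers (n - 1).
    """
    neighbors = defaultdict(set)
    for author, target in replies:
        if author == target:
            continue  # ignore self-replies
        neighbors[author].add(target)
        neighbors[target].add(author)

    n = len(neighbors)
    if n < 2:
        return {student: 0.0 for student in neighbors}
    return {student: len(peers) / (n - 1) for student, peers in neighbors.items()}


# Hypothetical forum data: (author of the reply, author being replied to).
replies = [
    ("ana", "ben"),
    ("ben", "ana"),
    ("cai", "ana"),
    ("dee", "ana"),
    ("ben", "cai"),
]

scores = degree_centrality(replies)
most_active = max(scores, key=scores.get)
least_active = min(scores, key=scores.get)
```

In this toy network, `ana` is connected to every other student and `dee` to only one. Degree is only one of several centrality measures, and the choice to treat replies as undirected ties is a simplifying assumption of this sketch.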
The review also identified an increasing number of scholars who have studied the detection of cognitive presence in discussion forum transcripts (e.g., Farrow et al., 2019), highlighting the visible and invisible interactions occurring in online environments [26] and demonstrating the association between students' academic performance and social centrality [16,50]. Other studies have focused on general educational phenomena such as conducting an assessment (De Laat & Prinsen, 2014), understanding online problem-based learning [51], detecting exploratory dialog [20], predicting the online knowledge building community's (OKBC) response to newcomer inquiries (Nistor et al., 2018), and tracking the development of learners' professional competences through social networks [35].
The coding also revealed six studies aimed at contributing to teacher efficiency and supporting informed teaching decisions. These  [24,45]. For example, Gasevic et al. [24] proposed the social epistemic network signature (SENS) approach, which combines SNA with epistemic network analysis to analyze SLA activities. These findings suggest that even though SLA may have many possible uses, recent research using this approach has focused on the identification of relevant learning agents and the connection between SNA parameters and students' learning behaviors. Fewer SLA studies have leveraged SLA to support teachers' learning design decisions.

Theoretical perspectives in SLA research (RQ3)
The last objective of this review was to identify how much SLA studies engaged with educational theories. The analysis revealed 22 studies that referred to learning theories or concepts. Surprisingly, 14 studies lacked reference to explicit learning theories. Among the studies that had a theoretical foundation, social constructivism was the most employed theory, accounting for 10 studies (e.g., [51]). These studies examined the interactions in online learning environments and related the interactions to the theory of social constructivism. Kaliisa et al. (2019) employed social constructivism to make sense of students' online interactions in connection to the intended learning design. Six studies employed socio-cultural theory, which places more emphasis on the mediating role of cultural tools, including language (abstract tools) and artefacts (concrete tools) as facilitators of learning [61]. One of the illustrative examples is Dahlberg [11], who provided an account of technology-mediated interaction from the socio-cultural perspective. Shaffer and Ruis (2017) employed epistemic frame theory which models the ways of thinking, acting, and being in the world of some community of practice, while Schreurs et al. [53] grounded their study in networked learning theory which investigates how people develop and maintain a 'web' of social relations to support their learning [33].
Besides learning theories, four studies utilized learning concepts and models, which were in some cases used alongside the main theoretical orientations. For example, Farrow et al. (2019) and Rolim et al. [50] used the community of inquiry framework as a theoretical lens to code and analyze students' online discussions. Chua et al. (2017) used the conversational learning framework to study online conversations among social learners in MOOC environments. Aguilar et al. [4] used Felder and Silverman's model. In sum, SLA seems to be more oriented towards social constructivist approaches to teaching and learning, which is unsurprising given SLA's strong connection to social learning.

Discussion and implications for future SLA research
In the following section, we discuss the findings presented in Section 4 through the lens of existing literature. We highlight several implications for methodology (e.g., need for reconfiguration of existing tools, integration of heterogeneous data sources, advanced computational linguistic analysis techniques, and temporality) and implementation (e.g., exploring diverse learning settings; moving from course to program and institutional applications of SLA; connecting SLA to learning design). Lastly, we discuss theoretical implications (e.g., the integration of learning theory) for the future advancement of SLA research.

Methodological implications
Need for reconfiguration of existing SLA tools: Although the review found a range of tools used by SLA researchers, there were few SLA tools that researchers and teachers can use to simultaneously analyze interactions and the actual content produced by students within computer-supported collaborative learning environments (e.g., LMSs). The findings revealed that most SLA researchers rely on general SNA applications or computational linguistic tools, which in most cases work outside the actual learning environments and require laborious efforts to perform the analysis. Moreover, most of the identified tools are designed to provide insights into one particular aspect of learning, such as social learning based on digital traces of connections, which limits a comprehensive understanding of the learning process. However, as noted in previous research, if SLA is to appeal to practitioners (e.g., teachers), there is a need for tools that extract interaction data automatically and provide real-time, readable, and informative visualizations so that teachers become more aware of the productive aspects of social connectivity [14]. This calls for a closer look at the existing SLA tools, especially the flexible (generic) tools, and for reconfiguring them in a way that serves the needs of practitioners such as teachers (e.g., simple tools with automated and timely visualizations).
Two good examples along this line are CanvasNet [9] and GraphFES [25], which are SLA tools developed to extract interaction data from Canvas and Moodle message boards, respectively. Nonetheless, the latter requires the exportation of interaction data to third-party SNA tools, which might not be practical for teachers. We recommend that future SLA research suggest standalone, integrated tools that can provide both teachers and students with timely insights about social learning activities. A possible future work would be the development of appropriate SLA tools that can support the automatic extraction of students' interactions and discussion messages from social learning environments and meaningful visualizations that consistently communicate useful information about the learning context to teachers [45]. This would support informed teaching and learning design decisions during the run of the course, rather than relying on evidence from summative assessments (e.g., course grades) that usually come at the end of the teaching period.
Integration of heterogeneous data sources in SLA studies: Regarding the sources of data used in SLA studies, the analysis showed that most of the data were collected from online discussion forums, but with increasing use of social media and trace data collected through different technologies, such as LMSs and other online learning platforms. However, even though trace data such as web logins could provide a good proxy of students' online learning practices, such sources could be inaccurate since they lack the social element, which is central to SLA. In addition, although seven studies used more than one data source, 22 studies used only one data source. Only one study (Dascalu et al., 2016) used interview data to explore how students and teachers make sense of learning networks and other visualizations generated from their online interactions. This result suggests that SLA researchers often analyze students' contributions and interaction data isolated from other information that might be relevant to the interpretation of the outcomes of a given activity. This is despite the fact that the potential of LA to support learning decisions is improved when multiple levels of LA are considered (Author A, C et al., 2020). Moving forward, given that SLA is still in its infancy, methodological diversity can help extend knowledge and facilitate implementation by leveraging multiple levels of data (e.g., discussion forums and interviews), thus enabling a clear interpretation of the results of SLA analysis.
Integration of advanced analytical techniques: The review found that SLA researchers have mainly relied on SNA techniques to aid in their understanding of teaching and learning interactions. However, as noted by Dado and Bodemer (2017) in their review of SNA in CSCL, network approaches are limited to descriptive reporting of learners' interactions, thus failing to capture higher-order learning constructs. Thus, SLA studies should move beyond SNA towards more knowledge-based network approaches such as epistemic network analysis, which visually and statistically analyzes the structure of connections among coded data (Shaffer & Ruis, 2017). In other words, there is a need for more efforts to combine the different strands of SLA (i.e., SLNA, SLDA) into a holistic view of social learning [9]. As Suthers and Rosen [54] argue, "the network structure is not enough: to explain the origin of social life we must understand the nature of the communication or interaction that takes place" (p. 17). For instance, Gasevic et al. [24] provided a promising example through the social epistemic network signature (SENS) approach, which combines SNA and epistemic network analysis to gain a comprehensive view of students' learning in collaborative environments. Dascalu et al. [13] also claimed that for SLA to be truly advanced, a multiple-level virtual profile of the students within the social learning platform must be analyzed (e.g., the learners' activities, the context, the content, mood, and interactions). This argument is corroborated by Kent and Rechavi [34] and Schreurs et al. [53] who have suggested that SLA should address different interaction types separately by providing models and visualizations capable of showing not only the usual SNA metrics but also the types of social ties forged between actors and topic-specific subnetworks.
In this regard, we suggest that future SLA studies apply advanced and multimodal network analysis approaches [46], including understanding the properties of networks in learning settings and deriving insights about learning built on network analysis. The combination of different elements within SLA would be more laborious to perform and might require sophisticated tools for manual and automated content analysis (Kovanović et al., 2016). Nonetheless, studies of this type could strengthen the granularity of insights, construct validity, and theoretical soundness, facilitating understanding of students' social learning processes.
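As a rough illustration of what a knowledge-based network captures beyond who talks to whom, the sketch below counts how often pairs of coded concepts co-occur within the same post. It covers only this co-occurrence step and should not be read as an implementation of epistemic network analysis or of the SENS approach, both of which involve further modeling. The codes, the posts, and the function name `code_cooccurrence` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def code_cooccurrence(coded_posts):
    """Count how often each pair of codes appears together in one post.

    `coded_posts` is an iterable of collections of code labels, one
    collection per post. Pairs are stored in alphabetical order so that
    (a, b) and (b, a) are counted as the same connection.
    """
    counts = Counter()
    for codes in coded_posts:
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Hypothetical content-analysis output: the codes assigned to each post.
coded_posts = [
    {"evidence", "claim"},
    {"claim", "question"},
    {"evidence", "claim", "question"},
    {"question"},
]

edges = code_cooccurrence(coded_posts)
```

The resulting counts can be read as weighted edges among concepts. Computing them per student or per group, alongside a social network built from the same posts, is one simple way to look at social ties and content connections together.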
Integration of temporal dimensions in SLA studies: The findings show that the key focus of SLA studies is to explore and understand students' learning processes through identifying relevant actors in social learning environments and the relationships between SNA centrality measures and student outcomes. However, we identified a significant research gap in SLA studies concerning the study of temporal patterns of students' interactions, which is an important element in understanding students' learning processes [52]. The only exception found was Dahlberg [11], who visually presented the mobility of learners across space and time. The author argued that capturing temporal dynamics could help teachers identify critical moments during the learning process, which can be used as evidence to better support students' learning. Thus, within SLA it is important to consider temporal dynamics to investigate how collaboratively constructed knowledge and network processes evolve over time [31], thereby providing an informed evidence base for student support and effective design for learning. In practice, this could require tools that allow one to identify, measure and visualize students' temporal information (e.g. work in progress) while accomplishing different activities.
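One simple way to bring time into a network analysis is to split interactions into time windows and compute a network measure for each window. The minimal Python sketch below does this for network density per week. The interaction records, the weekly window, and the function name `weekly_density` are assumptions made for illustration; they do not reproduce the method of any reviewed study.

```python
from collections import defaultdict


def weekly_density(interactions):
    """Return the density of the interaction network for each week.

    `interactions` is an iterable of (week, student_a, student_b) tuples.
    Density is the number of distinct undirected ties observed in a week
    divided by the number of possible ties among the students who were
    active in that week (a simplifying assumption: inactive students are
    left out of the denominator).
    """
    edges = defaultdict(set)
    nodes = defaultdict(set)
    for week, a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        edges[week].add(frozenset((a, b)))
        nodes[week].update((a, b))

    density = {}
    for week in sorted(edges):
        n = len(nodes[week])
        possible_ties = n * (n - 1) / 2
        density[week] = len(edges[week]) / possible_ties
    return density


# Hypothetical interaction log: (course week, student, student).
interactions = [
    (1, "ana", "ben"),
    (1, "ben", "cai"),
    (2, "ana", "ben"),
    (2, "ben", "cai"),
    (2, "ana", "cai"),
    (2, "ben", "ana"),
]

density_by_week = weekly_density(interactions)
```

In this toy example the network becomes fully connected in the second week. Plotting such a series over a course could help locate the critical moments mentioned above, although the choice of window length and measure would need to be justified for each study.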

Implementation implications
Exploring diverse learning settings: Regarding the settings and contexts of implementation, most SLA studies have been undertaken in formal (e.g., university) and fully online learning settings (e.g., MOOCs), with only a few exceptions in blended learning contexts (e.g., [1]). Moreover, a deeper analysis of sample sizes and study settings revealed that the majority of the studies with a sample size larger than 100 were conducted in formal and fully online learning environments and used data sources such as weblogs and online discussion forum messages. This finding is unsurprising, as the ease of data collection and the large numbers of participants associated with settings such as MOOCs motivate researchers to concentrate on them, rather than on blended and strictly controlled learning environments. Nonetheless, it is important for researchers to explore the use of SLA in blended learning settings, including face-to-face settings, which offer a rich landscape of learning and are the default in most educational institutions. We recommend that SLA researchers leverage technological advancements (e.g., multimodal technologies), which can capture a multitude of social learning constructs (e.g., level of attention, gaze, heartbeat, and body temperature) within blended and face-to-face environments. However, this requires sensory equipment to supplement the ordinary human-computer interface.
Moving towards the program and institutional application: The majority of studies were carried out at the course level, with no SLA study implemented at an institutional level. This finding is consistent with Tsai et al. [56] who found that the adoption of LA is mostly found to be small in scale and isolated at the instructor level. The predominance of studies conducted at a course level could be explained by the exploratory phase of SLA research and of LA as a field in general. However, to demonstrate the impact of SLA and realize LA's aim of optimizing teaching and learning, it is important to move from individual courses and small-scale experimental studies to an institutional scale [18].
Connecting SLA to learning design: SLA studies are mainly oriented towards understanding students' learning processes, with a limited focus on using SLA to support teachers' learning design decisions. This is despite the documented evidence of the potential benefits of using SLA to support learning design [2]. The study of students' interactions and the content produced is crucial for teachers to improve learning design, as these act as a proxy for students' learning [2]. As noted by Van Leeuwen et al. [58], one possible explanation for the low uptake of SLA in teacher practices is the scarcity of relevant tools that could translate SLA outputs (e.g. social interactions) into timely, usable insights to support course redesign on the fly. Thus, we recommend that future SLA research focus more on supporting learning design using reconfigurable tools that can capture insights originating from course designs and knowledge co-construction occurring within online collaborative learning environments.

Theoretical implications
The results of our systematic review demonstrate that SLA studies have been informed by a variety of theoretical backgrounds, including social constructivism, socio-cultural theory, epistemic frame theory, and networked learning theory. The dominance of social constructivism in SLA studies is not surprising since, as highlighted in the background section, social constructivist approaches give importance to the contextual nature of learning and the social construction of knowledge [8]. In this regard, social interaction is a critical component of SLA, as learning does not occur only within an individual learner but begins with collaborative interaction and the social construction of knowledge between participants within an environment (e.g., interactions and exchange of ideas) (Author C, 2010).
Nonetheless, even though authors frequently used theoretical perspectives such as social constructivism, the way such perspectives were conceptualized raises some questions. For example, some researchers used theories to guide their studies, but they did not explicitly explain how their findings connect to these theoretical perspectives. Moreover, 14 studies were atheoretical, meaning that they were not aligned to any theory. The absence of theoretical alignment in some SLA studies echoes a known concern about LA, namely its limited ability to provide adequate explanations for student performance and derive the underlying insights about learning [62]. Therefore, as the data does not speak for itself, we suggest that future SLA studies should consider learning theory to support the interpretation of observed online interactions and artefacts [22]. One promising approach that researchers could leverage is the consideration of learning design while interpreting SLA results so that relevant data and indicators of students' learning are selected against an absolute value set by the learning objectives.

Study limitations
The selection criteria we employed only captured relevant papers that used the keyword "social learning analytics." We may have missed relevant papers that did not explicitly use this term, so our findings should be treated as preliminary and interpreted with caution. The study also considered only studies employing the more specific "inherent forms of SLA" (e.g., social learning network analytics and social learning discourse analytics) that are defined as inherently social. This implies that studies employing other forms of SLA such as "context analytics" were not included since our primary focus was on studies concerned with social interaction, which is the key defining element of SLA. In this regard, we encourage future researchers to conduct a comprehensive review covering both the inherent and socialized SLA. Nonetheless, this study provides the first systematic review of its kind of research on SLA. We hope that the findings reported here can act as a new foundation for SLA research, and that researchers will use our work as a framework and lens through which to conduct more rigorous SLA studies.

Conclusion
In this paper, we provide a summary of the current state of the inherent SLA studies. As already noted, SLA is becoming recognized as an important trend in CSCL, especially given the increasing use of social and collaborative learning platforms across different learning settings. In this regard, SLA is used as both a research and mediational tool in collaborative learning analytics [63]. However, for the potential of SLA to be achieved, a variety of methodological and conceptual issues must be addressed, including developing appropriate automated SLA tools, integrating advanced network analysis techniques (e.g., epistemic network analysis), exploring diverse learning settings, integrating temporality in SLA analysis, connecting SLA to learning design, utilizing different data sources, and considering theoretical perspectives.
Nonetheless, this study should be seen as just the "tip of the iceberg": SLA is a relatively new extension of LA and is in its initial stages of development. As such, we are just beginning to become aware of its possibilities and scope of application in learning environments. Researchers can benefit from the outcomes of this systematic literature review, particularly the results that highlight the most frequently used data sources, learning environments, tools, analytical techniques, and questions being answered with SLA. We have also identified important questions that SLA researchers and technology developers should intentionally address to advance work on the use of SLA in CSCL environments.

Declarations of Competing Interest
None.