Mapping the open education landscape: citation network analysis of historical open and distance education research

The term open education has recently been used to refer to topics such as Open Educational Resources (OERs) and Massive Open Online Courses (MOOCs). Historically its roots lie in civil approaches to education and open universities, but this research is rarely referenced or acknowledged in current interpretations. In this article the antecedents of the modern open educational movement are examined, as the basis for connecting the various strands of research. Using a citation analysis method, the key references are extracted and their relationships mapped. This work reveals eight distinct sub-topics within the broad open education area, with relatively little overlap. The implications of this are discussed and methods of improving inter-topic research are proposed.


Introduction
The purpose of this paper is to enrich current scholarship by exploring and identifying key historic papers, authors and themes in open education research. The work builds on a systematic approach that identified a corpus of historical open education articles from the 1970s which are almost entirely uncited in the literature today (Rolfe, 2016). It is intended that this study will provide an accessible starting point for researchers to deepen their understanding and further explore and incorporate earlier open and distance education research into their current work.
Open education is an evolving term that covers a range of philosophies and practices aimed at widening access to education for those wishing to learn, with the current focus predominantly on practices based around reuse and sharing. This current focus can be traced back to the Open Educational Resources (OER) movement, and the use of open licences, such as Creative Commons licences.
Current interpretations of open education are often shaped by the OER movement with an emphasis on the '5Rs of reuse' (Reuse, Revise, Remix, Redistribute and Retain; Wiley, 2014). For instance, Wiley (2013, 2017) defines open pedagogy as the 'set of teaching and learning practices only possible in the context of the affordances of open educational resources as enabled by the 5Rs' and talks of OER enabled pedagogies. The profile of open education has been further raised in recent years by the popularity of Massive Open Online Courses (MOOCs). Although the current manifestation of open education has its roots in previous interpretations and developments, much of the current literature in what can broadly be defined as open education fails to acknowledge or cite this earlier work. Weller (2016) analysed publications from an OER research repository (the OER Knowledge Cloud), and derived the following categories: Project Case Study; Technical; OER as subject; Research with impact data; Policy; Practitioner; OER in developing nations; MOOCs; Pedagogy; Open practice.
There is a strong tendency to be self-referential across all of these categories, with little reference to open education prior to the OER movement. A preliminary systematic search (Rolfe, 2016) for "open education" across a number of databases retrieved over two hundred articles and revealed an initial peak in the period 1970-74, with articles concentrating largely on open pedagogy in UK infant schools, and also deriving from the founding of the Open University. The next significant peak in publications is found in 2010-15 as MOOCs, open textbooks and OER gain traction (Figure 1).
There is little connection between these two peaks of open education publications however. For instance, Katz (1972) and Resnick (1972) were two of the most frequently cited papers (41 and 21 respectively) that deal with broadly applicable open education issues, but are rarely cited beyond the 1980s.
As the work above highlights, research and definitions of open education continue to evolve and branch into new areas of focus. However, many of its themes bear certain similarities to earlier research starting from the late 1960s and developing through to the '80s and beyond. For example, the popularity of MOOCs was hailed as a revolution in higher education, democratizing learning for millions (Koller, 2012), with 2012 being declared the 'Year of the MOOC' (Pappano, 2012). However, completion rates were very low (Jordan, 2014), the demographics of learners favoured those with an existing high level of education (Kolowich, 2013), and they were expensive to produce (Hollands & Tirthali, 2014). By 2013, even MOOC pioneer Sebastian Thrun declared that they were 'a lousy product' (Chafkin, 2013). Much of the early MOOC literature ignored existing literature on distance education and e-learning, declaring them 'the first generation of online learning' (Godin, 2016). The literature on supporting students at a distance (e.g. Tait, 2004), e-learning costs (e.g. Bates, 1995; Weller, 2004), or student retention (e.g. Tinto, 1975) may well have provided useful contributions to this development, but was largely ignored. Similarly, much of the current provision in distance education can learn from the development of tools and production techniques in MOOCs.
It is the authors' contention that providing connections between these bodies of research in open education is mutually beneficial for researchers and practitioners. The studies into practice since the 1970s have produced an extensive body of theory in open and distance education, which can add valuable insights for current researchers and practitioners. In addition, researchers and graduate students will be able to enrich their studies by tracing ideas, connections, discontinuities and patterns gleaned from the analysis of earlier studies. Further, current discourses about the meaning of openness in education may well benefit from an understanding of historical patterns of open and distance education research, in particular the challenges faced.

Methods
Social network analysis (SNA) approaches were used to build a network of the literature cited in the field. SNA is not a single approach but rather a toolkit of different metrics and analyses which can be used in a range of contexts where social relations can be conceived of as links between individual nodes (Borgatti, Mehra, Brass & Labianca, 2009; Kadushin, 2012; Wasserman & Faust, 1994). By viewing social relations as a network, novel insights can be gained in terms of the structure of communities and importance of key connections (Borgatti et al., 2009). By thinking in these terms, the literature cited in an academic publication can be conceived of as a network where each reference is a node, linked to another node (the publication it is cited in) through a tie which represents the social practice of a citation. This approach has been widely used to visualise the structure of scientific knowledge and map academic disciplines (Börner, Chen & Boyack, 2003; Small, 1999). When applied to a variety of subject areas, this approach has yielded insights into the sub-domains within a field and areas of overlap between them. Dawson, Gašević, Siemens and Joksimovic (2014) used this approach to examine the network of literature cited by papers at the Learning Analytics and Knowledge annual conferences from 2011 to 2013, with a view "to identify the emergence of trends and disciplinary hierarchies that are influencing the development of the field to date" (Dawson et al., 2014, p. 231).
As such, citation network analysis serves the goals of the present study to an extent, as a way of identifying sub-domains within literature related to openness and education. However, a key distinction between existing studies and the present study is the exploratory and historical nature of the research. Whereas citation networks typically start with a well bounded and defined set of literature (Dawson et al., 2014, for example), the term openness is not clearly defined and draws upon multiple subject areas, which makes selecting a well-defined set of literature to include a challenge (this problem also reflects the aims of the study itself). We also set out to trace the links between contemporary and historical perspectives on openness, which likewise calls for an exploratory approach to uncover the citation links to earlier works.
To this end, an iterative approach was used to generate the sample of papers selected for inclusion in the citation network. An initial sample of 20 documents (listed in Table 1) was selected on the basis of literature database searches for items which referred specifically to the history or definition of openness (("open education", "open learning", openness) AND (history, definition)). The references were then extracted from each of these documents (forward citations were not included). The literature and references were checked for consistency and duplicate items in a two-column spreadsheet (references in a first column of 'source' items and the articles in which they are cited in a second 'target' column). The data were then exported as CSV files and imported into Gephi for network analysis (Bastian, Heymann & Jacomy, 2009). The steps involved in the process are illustrated in Figure 2 using some of the references from one of the initial sample of 'seed' papers. The papers which were cited by at least two of the original sample items were then added to the sample to include their references in the next iteration. Although this process could be repeated indefinitely, four iterations were carried out, as it was felt that meaningful clusters had emerged at this point. It is worth reiterating that the nature of the network is exploratory rather than exhaustive. At this point, the network included 5,217 references from a total of 172 publications. Note that it was not possible to include references for some multi-cited items, either because they had no references or because they were not accessible online (books or chapters).
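The iterative sampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual workflow (which used a spreadsheet and Gephi): the `REFERENCE_LISTS` data, the function names `build_network` and `to_gephi_csv`, and the choice to apply the two-citation threshold against the whole current sample are all hypothetical. Items without an available reference list are never added to the sample, mirroring the books and chapters whose references could not be included.

```python
import csv
import io
from collections import defaultdict

# Hypothetical toy data: each publication mapped to the references it cites.
REFERENCE_LISTS = {
    "seed_A": ["ref_1", "ref_2", "ref_3"],
    "seed_B": ["ref_2", "ref_3", "ref_4"],
    "ref_2": ["ref_5", "ref_6"],
    "ref_3": ["ref_5"],
    # ref_5 is cited twice but has no accessible reference list (e.g. a book).
}


def build_network(seeds, reference_lists, iterations=4, threshold=2):
    """Grow the sample iteratively and return (sample, edges)."""
    sample = set(seeds)
    edges = []
    for round_number in range(iterations):
        # One edge per citation: (source = reference, target = citing item).
        edges = sorted({(ref, paper)
                        for paper in sample
                        for ref in reference_lists.get(paper, [])})
        cited_by = defaultdict(set)
        for ref, paper in edges:
            cited_by[ref].add(paper)
        # Assumption: an item joins the sample once at least `threshold`
        # sampled items cite it and its own reference list is available.
        candidates = {ref for ref, papers in cited_by.items()
                      if len(papers) >= threshold
                      and ref in reference_lists
                      and ref not in sample}
        if not candidates or round_number == iterations - 1:
            break
        sample |= candidates
    return sample, edges


def to_gephi_csv(edges):
    """Serialise the edge list as CSV with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


if __name__ == "__main__":
    sample, edges = build_network(["seed_A", "seed_B"], REFERENCE_LISTS)
    print(sorted(sample))
    print(to_gephi_csv(edges))
```

In the toy data, both seed papers cite `ref_2` and `ref_3`, so these join the sample in the second round and contribute their own references; `ref_5` is then cited twice but stays outside the sample because it has no reference list to extract.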

Results
The full final citation network is shown in Figure 3. Articles which were included in the sample and their references used to build the network are shown as magenta nodes. Those which were cited more than twice but whose references were not included are shown in blue. There were several reasons why this would be the case, including articles not having references, references not being accessible online, or having achieved >2 citations in the fourth iteration (i.e. those which would have been included in a fifth iteration of the network). Nodes which were only cited once are shown in grey.
The network visualisation in Figure 3 uses the Force Atlas 2 algorithm (Jacomy, Venturini, Heymann & Bastian, 2014). The algorithm is based on two simple principles: "Nodes repulse each other like charged particles, while edges attract their nodes, like springs" (Jacomy et al., 2014). As a result, clusters of papers have emerged based on the extent to which they share the same references, which raises questions of both what the clusters represent, and which key publications act as links between different clusters. In order to characterise the network more clearly, the same layout is maintained but items for which references were not included are removed. Highly cited items (>4 citations) for which references were not included are kept, as these include notable publications which did not have references or whose references were inaccessible. The resulting network is shown in Figure 4, with nodes colour-coded to show categories applied by the researcher in order to distinguish the nature of different communities. Items which did not immediately lend themselves to a particular category are shown in grey.
These categories are partly a subjective interpretation of the clustering. Each of them is now considered in turn, and the type of subjects they address.
The Open Education in schools (or Open Classrooms) movement is the earliest cluster present in the network, receiving greatest focus in the early 1970s. The term originated in the UK in the wake of the Plowden report (1967).
E-learning and online education rose to prominence in the 1990s and early 2000s, bridging the gap between distance education and OER. This period saw a mainstreaming of many of the issues relating to open education, as e-learning became an area of interest for traditional universities and not just open education providers. Over this period, e-learning (and related terms, such as technology enhanced learning) became increasingly synonymous with the Internet and web-based technologies, while largely not losing sight of the importance of pedagogy and adapting teaching practices rather than relying on new technology alone.
Open access publishing entered the network as a concept towards the end of the 1990s, with a focus on metrics and how OA compares to traditional scholarly publishing during the 2000s. In contrast to the other themes so far, this cluster is not primarily concerned with education in terms of teaching, but rather focused on the research activities and outputs of higher education. As such, it is not widely linked to the other themes in the network, but has been an important contributor towards open practices in terms of digital scholarship.
The Open Educational Resources (OER) theme is a tight-knit community at the heart of the network. The OER theme emerges around the year 2000, initially focusing upon learning objects, open source education, and OpenCourseWare. The theme is central to the citation network, both drawing upon existing work in e-learning and distance education, and influencing subsequent themes of MOOCs and open practices. While the discourse around OER emphasises opening up quality educational resources on a global scale, later in the theme a recognition emerges that access is not enough and needs to be combined with open educational practices.
Social media emerged as a theme in the network from the mid-2000s. While the majority of papers included in the network are written from a more general Internet Studies or Communication perspective rather than focused on education or academia, the position of the theme suggests that this body of work has been influential in thinking about open practices and scholarly activities online. Use of online social networking tools is particularly prominent, but the theme also includes ideas related to 'Web 2.0' and social media more broadly, such as blogging. In very recent years, this theme has been less well represented as the focus has shifted towards use of tools as part of Open practices.
Massive open online courses (MOOCs) represent one of the most recent themes within the network. Although 'open' is ostensibly foregrounded, being part of the acronym itself, the relationship with the discourse surrounding openness in education is less clear. The group of papers on the theme of MOOCs has some shared connections to the OER and e-learning clusters, but is distinct.
The theme of Open practices is one of the most recent and ongoing areas for research in the field. Its location within the network shows how it sits at the intersection of social media, open access publishing, and OER. It includes articles focused upon digital scholarly practices, and open educational practices, spanning both the research and teaching remits of higher education.
In addition to identifying research themes through characterising the clustering within the network, viewing the connections in this way also gives insight into their relative proximity. Open practices have emerged as the connection between three of the major communities: OER, Open Access publishing, and social media. MOOCs appear to be most closely related to OER, whilst the two oldest communities (Open education in schools, and Distance education and open learning) are only weakly linked to the main body of the network, and only to each other through more recent work. The temporal development of the network can be seen more clearly through Figure 5.
Open Praxis, vol. 10 issue 2, April-June 2018, pp. 109-126. Martin Weller et al.
In addition to the two communities (Open education in schools, and Distance education and open learning) highlighted as some of the oldest papers in the network in Figure 5, there are also a handful of older, highly-cited papers at the heart of the network. These nodes are also not easily classified within a particular community (Figure 3). The most highly cited nodes (>7 citations) within the network are listed in Table 2, and their positions within the network are labelled in Figure 6. For items in Table 2 which were highly cited but did not clearly sit exclusively within one particular community, the 'category' field is left blank.
In addition to considering the number of citations as a way of identifying key papers within the network, betweenness centrality is a network metric which can be used to identify papers based on their position within the network structure. Betweenness centrality is calculated based on the number of shortest paths that pass through a node, where a shortest path is the shortest way to navigate through the network between any two given nodes. The 20 publications with the highest betweenness centrality are listed in Table 3, and their network positions shown in Figure 7. Note that some of the 'category' fields in Table 3 are left intentionally blank, as these items did not fall clearly into one of the emergent communities or another in the network, i.e. they correspond to some of the nodes which are colour-coded as grey in Figure 4.
There has been a temporal aspect to much of this development which is represented in Figure 5. Distance education morphed into e-learning literature during much of the 1980s and 1990s. The initiation of the OER movement since 2002 has also coincided with open access as a field of interest. The rise of web 2.0 and social media in the late 2000s led to research relating to academic use of these tools.
Social media, OER and open access can be seen as precursors to MOOCs and open practice respectively. Open education in schools has seen different periods of interest, but remained largely distinct from the others. Each of these practices might make reference to its precursor movement, but rarely beyond that.
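To make the betweenness centrality metric used for Table 3 more concrete, the following sketch computes it for a small toy network using Brandes' algorithm, a standard method for this calculation. It is an illustration only: the node names are hypothetical, and treating citation ties as undirected is an assumption, since the settings used in the study are not stated here.

```python
from collections import deque


def to_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an unweighted, undirected graph."""
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search counting shortest paths from `source`.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        path_count[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    path_count[neighbour] += path_count[node]
                    predecessors[neighbour].append(node)
        # Accumulate each node's share of shortest paths, farthest first.
        dependency = {node: 0.0 for node in adjacency}
        while order:
            node = order.pop()
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    # Each unordered pair was counted from both endpoints, so halve.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical network: two tightly knit clusters joined by one bridging paper.
TOY_EDGES = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a3", "bridge"), ("bridge", "b1"),
]

if __name__ == "__main__":
    scores = betweenness_centrality(to_adjacency(TOY_EDGES))
    for node, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(node, score)
```

In this toy network every shortest path between the two clusters runs through `bridge`, so it receives the highest score even though it has only two ties. This is the sense in which a paper with modest citation counts can still occupy a key position between communities.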
However, the linking between the sub-topics in the network should not be viewed simply as newer developments, such as MOOCs, acknowledging and learning from prior developments, but also as established areas benefiting from new insights. For example, Tait (in press) analyses the future of open, distance education universities and highlights a lack of innovation as a potential threat to their long-term sustainability. Similarly, Paul (2016) argues that open universities have been resistant to adopting many of the digital methods in delivery, allowing other providers to 'steal their clothes' in Daniel's (2017, p. 2) phrase. The research in topics such as MOOCs, social media and OER is closely related to open university practice and so provides a route for innovation that falls within the remit of such universities. Strengthening the relationship between these research areas might then be seen as a first step in addressing this innovation lag.
Of the eight areas identified, there seems to be a relationship between how tightly clustered the references are and the clarity of definition. For example, clear definitions exist for open access (e.g. Suber, 2004) and OER (e.g. UNESCO, 2002). E-learning is comparatively less well defined, covering any aspect of ICT in education, online learning, learning management systems, and so on. The references here are thus less well connected. Similarly, open educational practice (OEP) is an emerging field which does not have a clear definition; as Havemann (2016) states, 'the value of OEP as a concept is in its more wide-ranging remit'. Thus, what is included in this classification is more disparate than for others. It can also be seen, however, as a connecting thread between all the other fields. OEP addresses the manner in which each of these other areas is implemented and educators adapt their practice.
These and other patterns in the diagram give evidence of a lack of solid connections between what intuitively would appear to be strongly related areas. They also highlight the importance of publications that act as nodes between these 'islands', forming possible bridges between the different communities. Open education does not constitute a discipline, in the manner of a hard science for example, so there is no agreed canon of research that all researchers will be familiar with. It is also an area that practitioners tend to move into from other fields, often because of an interest in applying aspects of openness to their foundational discipline. This can be seen as an advantage, in that different perspectives are brought into the domain, and it evolves rapidly. However, it also results in an absence of shared knowledge, with the consequence that existing knowledge is often 'rediscovered' or not built upon. In order to partly address this issue, the authors have created a Beginner's Guide with a summary of key articles in each of the eight areas identified (Jordan & Weller, 2017).
There are limitations to the research which should be acknowledged. The first of these is that there is a backward perspective as the citation network builds on past papers, so there may be a lag between significant papers and their recognition via this method. The method therefore provides a means of establishing a historical perspective but does not reflect the current state of the field and leading edges of research. Further, it is not possible to get a sense of the history of highly cited items which do not have references themselves to the same extent; in a network they tend to be dead ends rather than nodes. Perhaps most significant here are biases inherent in the social practice of citation and academia more generally, such as gender (Savonick & Davidson, 2016) and northern hemisphere bias, which this work could serve to reinforce. One method of addressing this would be to reseed the initial citation network with explicitly sourced references to prioritise a particular perspective, for example publications from the global south. Also, the inaccessibility of references within print publications privileges electronic journal articles. Finally, in this approach certain types of paper tend to be more highly referenced, as noted by Dawson et al. (2014): "The analyses also indicate that the commonly cited papers are of a more conceptual nature than empirical research reflecting the need for authors to define the learning analytics space" (p. 231). The results of the method can also be influenced by the initial seeding articles. This can be seen as a benefit, however, as different versions of the network can be created to serve different purposes.
However, accepting these limitations, the method and findings of this research represent an initial attempt to provide a conceptual mapping of the broad field of open education. The findings provide some evidence that sub-topics within this area operate largely in isolation, with little cross referencing. Given the shared principles outlined previously, as well as commonality in many of the motivations, problems and techniques, this can be seen as detrimental to the development of the field as a whole. It is hoped that this work will provide some means of addressing these silos of practice.