Performing Knowledge Requirements Analysis for Public Organisations in a Virtual Learning Environment: A Social Network Analysis Approach

Fontenele, M. P., Sampaio, R. B., da Silva, A. I. B., Fernandes, J. H. C. and Sun, L. (2014) Performing knowledge requirements analysis for public organisations in a Virtual Learning Environment: a social network analysis approach. Journal of Information Technology & Software Engineering, 4 (2). 134. ISSN 2165-7866 doi: https://doi.org/10.4172/2165-7866.1000134 Available at http://centaur.reading.ac.uk/39276/


Introduction
A postgraduate course derived from a partnership between a university and the Brazilian Federal Public Administration (FPA) employs a virtual learning environment (VLE). In this setting it becomes challenging to recognise the knowledge demands of the FPA agencies to which the students belong. To this end, knowledge requirements analysis plays a pivotal role in setting a clear goal for feedback and refinement of the topics taught in the course, and for the delivery of content with cooperation in knowledge management (KM) between public agencies through the VLE. At the end of the course, a dataset was obtained from the VLE in order to perform such an analysis.
There are various approaches to knowledge requirements analysis, one of which is Social Network Analysis (SNA). SNA comprises an extensive set of methods that can be used to evaluate the structure of social groups and their perceptions in relation to the social environment [1,2]. During this evaluation, new phenomena can be studied and new hypotheses can be tested (e.g. a relationship between topics and the interests of participants, and a relationship between topics and the interests of participants' organisations). One assumption of the analysis is that the demand for knowledge of a particular public agency may be assessed by the competence of its public servants in addressing specific issues. It is often the case that the studies focus on contextualised problems in the public agency. A topic in the study may serve multiple purposes for different public agencies. Therefore, a study of this kind requires knowledge sharing and management between different organisations. In order to realise this goal, this paper introduces prospects for improvement in KM in public agencies, and highlights the need to strengthen ties between teaching, research and development in public agencies. The remainder of the paper is organised as follows: first, the research background and related work are presented. Then, a method for the practical application of knowledge analytics, combining SNA with other fields of knowledge, is detailed. Results comprising data collection and network analysis of the relationships between organisations and the topics covered in the course are presented. These results lead to a discussion of the adopted method and of the implications and limitations of our work. Finally, a conclusion is drawn along with indications for future work.

Research Background and Related Work
A new learning paradigm
VLE refers to the entire category of technology enhanced learning systems offering administrative and didactical supportive functionalities [3]. In fact, Mueller and Strohmeier [3] describe a VLE as a comprehensive main category in the domain of technology enhanced learning. There is a variety of such systems, such as Blackboard [4], ProProfs [5], eCollege [6] and Moodle [7]. Mueller and Strohmeier [3] also discuss the effectiveness of VLE in relation to their design characteristics. Hence, this approach supports an evaluation of VLE among other research.
Extracting knowledge from VLE has already been discussed. The authors of [8] emphasise the importance and possibilities of data mining in learning management systems, performing a case study tutorial with Moodle. The authors of [9] have used this approach in order to build historical reference models of students who dropped out of, and students who completed, the course they researched. This data was generated by the interaction of students with e-learning environments, also using Moodle. This kind of activity is a type of knowledge discovery in databases (KDD). According to [10], KDD refers to the overall process of discovering useful knowledge from data. The authors of [10] emphasise that data mining refers to just one particular step in this process, which requires appropriate prior knowledge and proper interpretation of the results. Therefore, KDD still lacks management of generated knowledge. A variety of commercial and free data mining tools is presented in [11,12].
dimensional space and the relationships between them are represented by lines connecting these points. Lines can be directed or not, depending on the nature of the social relationship. Mathematically, a sociogram is a graph. Sociograms visually represent the structure of social networks and allow the understanding of their structural properties.
Actors and their actions must be viewed as interdependent rather than as independent units [1]. Furthermore, the relationships should be seen as channels for the flow of resources. This interpretation of networks opens new ways of studying the requirements for information and information flow.
It is useful to develop SNA of computer-mediated communications, recognising how such communications can affect and interact with social relations and social organisation [25]. Therefore, applying SNA to such rich repositories of data as VLE may provide relevant information. Organisations might take advantage of SNA results to determine collaborative channels, information fusion through such channels and key participants or groups in the analysed network [26].
Previous research applied SNA for evaluation of knowledge creation [27] and sharing [28]. SNA points out that the construction of knowledge between members of the scientific community context also comes from social networks [29]. SNA concepts, adapted to the collaborative distance-learning context, can help in measuring the cohesion of small groups [30]. Our paper also analyses cohesion, but focuses on cohesion derived from affiliation networks. SNA can be used on data extracted from VLE, such as sociograms in which vertices are participants of forums and the links are their information exchanges or other available connections between them [31].
Most networks in SNA are one-mode networks, where the actors all come from one set, for example, people who participate on a board [1]. However, there are networks in which actors belong to several possible sets of entities, as in two-mode or higher-mode networks. In a one-mode network, each actor may or may not have a relationship with any other actor in the network, including itself. Two-mode networks consist of two different sets of actors, or a set of actors and a set of events (or activities), and of the relationships between the actors (or events) of each set. An affiliation network consists of at least two sets of vertices such that affiliations connect vertices from different sets only [2,32,33]. In such networks, actors of the same set do not connect directly with each other, but they may be indirectly connected through an actor in a different set. Social homogeneity is predicted not only among "actors" who are directly connected (in what is called the "structural cohesion model"), but also among those who may be totally disconnected in terms of direct interactions, as in "structural equivalence models" [34,35]. Following this approach, the authors of [33] propose that cohesion aspects in animal behaviour could be mapped using affiliation networks. The authors of [2] explain the nature of cohesive subgroups such as m-slices. An m-slice is a maximal subnetwork containing the lines with a multiplicity equal to or greater than m and the vertices connected by these lines. In an m-slice, each vertex is connected to at least one other vertex of the same slice by lines with multiplicity m or greater. This allows extraction of the nodes bearing the strongest relationships, although some vertices of an m-slice subnetwork may not be connected to each other. It is an important concept for affiliation networks because one-mode networks derived from multiple-mode networks tend to have denser connections [2].
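The m-slice definition above can be illustrated with a minimal sketch. It assumes a one-mode network stored as a dictionary that maps a pair of vertices to the multiplicity of the line between them; the function name `m_slice` and the toy network are hypothetical and are not taken from Pajek or from the data of this study.

```python
def m_slice(lines, m):
    """Keep the lines with multiplicity >= m and the vertices they connect.

    `lines` maps a (vertex, vertex) pair to the multiplicity of that line.
    """
    kept = {pair: value for pair, value in lines.items() if value >= m}
    vertices = {vertex for pair in kept for vertex in pair}
    return kept, vertices


# Hypothetical toy network, for illustration only.
toy_lines = {("A", "B"): 5, ("B", "C"): 2, ("C", "D"): 4, ("D", "E"): 1}

kept_lines, kept_vertices = m_slice(toy_lines, 4)
print(sorted(kept_lines))     # [('A', 'B'), ('C', 'D')]
print(sorted(kept_vertices))  # ['A', 'B', 'C', 'D']
```

In this toy case the 4-slice keeps the pairs A-B and C-D but drops the weaker line B-C, so the slice contains vertices that are not connected to each other, as noted in the text.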
For more practical details on SNA, the authors recommend [2], based on the use of Pajek software [36]. Table 1 presents a list of applications and, for that matter, does not have many published papers. This also turned out to be a motivation for the authors to present an application of SNA.

Knowledge management
The use of KDD technology can greatly enhance governmental practices by sharing information between many diverse agencies in an effective way [13]. Several authors claim that KM is linked to the management of people and that the use of information technologies and management practices is relevant to create an appropriate environment in which to share information and knowledge [14-16]. Models emphasise the importance of interpersonal ties for knowledge creation [17].
KM systems alone are not enough. They have to integrate with other core systems in order to develop and maintain sustainable competitive advantage, especially in the local government domain. The authors of [19] propose a multi-layered semantic repository solution to support e-Government and warn that e-Government systems should deal with continual change. Therefore, the KM process itself should be evaluated periodically, and better change management should be employed [19]. In addition, we propose that KM could be implemented not just inside, but potentially between organisations. The authors of [20] describe, from an organisational viewpoint, some common applications for SNA, such as analysis of partnerships between companies, evaluation of strategy implementation, network integration and development of communities of practice.
Implementation of KM programmes includes the adoption of techniques capable of encouraging the employees of a given organisation to maintain constant contact with each other, such as communities of practice [21]. Communities of practice are communities formed by two or more individuals for conversation and information sharing, aiming to develop new ideas and processes in a certain domain. Participation is voluntary, and the higher the interest of the participants, the greater the number of conditions to develop within the community. Such communities attract individuals who are willing to share their expertise, and what moves these communities is the interest of their participants in strengthening individual skills [21].
Communities of practice can also benefit FPA. Strategic partnerships involving government and university can merge distinct knowledge pools and communities of practice into a richer knowledge environment [22]. Strong partnerships between government agencies and interdisciplinary teams at universities can provide access to required expertise. This paper's context relies on this assumption.

SNA concepts
SNA originally gained its popularity in the social and behavioural sciences, involving understanding the linkages among social entities and the implications of these linkages [23]. A social network consists of one or more finite sets of actors and the relationships defined between them. Actors in a social network can be either individuals or collective social units such as public service agencies. The concept of actor is flexible, allowing different levels of aggregation, which allows its adaptation to different research problems [1].
The fundamental difference between SNA and other methods is that the emphasis is not on the attributes (features) individually present in the actors, but on the structure of the connections between them. The observation unit is composed of the actors and their ties. According to [1] and [24], Moreno, in the 1930s, created a representation technique known as a sociogram. A sociogram is a graphic representation of a social network in which the actors are represented as points in a two-

Methodology
The analysis context
The course involved more than 180 participants from 40 agencies and 17 different federation units. They worked, or had a prospect of working, in information security management in their organisations.
The intense use of a VLE in mediating the interaction between students has generated a rich record of interaction, partially analysed here. Several features of Moodle were employed, especially structuring in modules, the use of discussion forums, online tasks and quizzes.
The selection of data and SNA methods is restricted to illustrating one possible application. For the purposes of our proposal, we henceforth call "knowledge analytics" the overall process of creating and managing knowledge by combining different methodologies.

Data analysis
In order to define the scope of the analysis in this study, a dataset with some anonymised information was obtained from Moodle using an SQL query. The dataset consists of a worksheet with an extract of the VLE database covering the students who completed the course, with the following attributes: student identification, virtual classroom to which the student belonged (approximately 180 students were divided into 6 virtual classrooms), total number of actions performed in the VLE over the whole two years, agency in which the student worked, main subject researched in the student's monograph (as identified by course management), and the federation unit in which the student resided, among other attributes. The manipulation of the data presented was done mainly using Pajek. Other social network files used for Pajek software were made available in network (.net), partition (.clu) and vector (.vec) formats.
Based on the provided dataset, an undirected affiliation sociogram was generated with the following sets of actors: students, agencies and main subjects researched. Figure 1 illustrates a three-mode network and the analysis model used in this paper, in which each student (students #1, #2, #3, #4, #5, #6 and #7) establishes a "working" relationship with his or her own organisation (agencies #12, #13, #14 and #15). Each student has also researched a specific topic concerning information and communication security management in his or her approved monograph (subjects #8, #9, #10 and #11). Students of Organisation #15, if any, did not finish the course. No student developed Subject #11 (Figure 1).
Indirect relations between each student and their organisation, and between each student and their research theme, produce affiliation relationships between organisations and themes. Such relationships are mediated by students, as shown in Figure 2.
Using SNA methods, a two-mode network, such as the one presented in Figure 2, can be converted into two one-mode networks, generating an "organisation proximity" network (mediated by researched subjects, as shown in Figure 3) and a "subject proximity" network (mediated by organisations, as shown in Figure 4). The network in Figure 3 indicates cooperation possibilities between agencies, which could require KM, while the sociogram presented in Figure 4 indicates possibilities of conceptual proximity between subjects. The latter provides a graphic representation of relationships between different fields of knowledge. The potential of the presented model will be discussed later.
We here use the concept of "proximity network" because, although the networks presented in Figures 3 and 4 may seem to be cohesive subgroups, they actually derive from a multi-mode network, so vertices are just indirectly connected. This concept differs from "proximity-based network", which is commonly used for wireless or locale-based networks.
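The conversion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Pajek procedure used in the study: the records are hypothetical, the function names `fold_students` and `project` are ours, loops are left out, and the weighting rule (sum over shared mediators of the product of student counts) is our reading of the MP and EB example given later in the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (student, agency, subject) records; not the study's dataset.
records = [
    ("s1", "A12", "T8"),
    ("s2", "A12", "T8"),
    ("s3", "A13", "T8"),
    ("s4", "A13", "T9"),
    ("s5", "A14", "T9"),
]


def fold_students(rows):
    """Three-mode to two-mode: count students per (agency, subject) pair."""
    return Counter((agency, subject) for _, agency, subject in rows)


def project(two_mode, keep="agency"):
    """Two-mode to one-mode: connect kept vertices that share a mediator.

    Assumed rule: line value = sum over shared mediators of the product
    of the student counts on each side. Loops are not generated.
    """
    keep_idx, via_idx = (0, 1) if keep == "agency" else (1, 0)
    by_mediator = defaultdict(dict)
    for pair, count in two_mode.items():
        by_mediator[pair[via_idx]][pair[keep_idx]] = count
    one_mode = Counter()
    for members in by_mediator.values():
        for a, b in combinations(sorted(members), 2):
            one_mode[(a, b)] += members[a] * members[b]
    return one_mode


two_mode = fold_students(records)
organisation_proximity = project(two_mode, keep="agency")
subject_proximity = project(two_mode, keep="subject")
print(dict(organisation_proximity))  # {('A12', 'A13'): 2, ('A13', 'A14'): 1}
print(dict(subject_proximity))       # {('T8', 'T9'): 1}
```

Keeping the student counts as line values in the two-mode step is what lets the one-mode networks carry multiplicities, which the m-slice analysis relies on.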

Data collection
One hundred and twenty-four students completed the course, twenty-nine organisations employed graduating students and there were twenty-nine central themes researched in approved monographs. The affiliation network presented in Figure 5 contains fifty-eight vertices, in which the actors are the organisations (students' employers) and the events are the subjects (themes addressed in students' monographs, mostly organisational case studies). For illustrative purposes, Figure 5 also shows that some of the organisations are affiliated to others. For example, "MD/EB" means that EB is part of MD. However, both EB (i.e. "MD/EB") and MD provided their own students. Therefore, the analysis could be conducted at different levels of organisational granularity.
There are multiple lines repeated between organisations and subjects. In order to simplify visualisation, the sociogram represents this multiplicity only through line thickness. Labels showing the thickness values (the number of students in each relationship) were omitted for visualisation purposes. Tables 2 and 3 present a frequency analysis of organisations' and subjects' degrees.
As Table 3 shows, the most studied subjects were Information and Communications Security Management (ICSM) and Security Procedures and Processes (SP&P), both with twelve monographs. Computer Network Management, Security Risk Evaluation and Management (SRE&M), and Secure Software Development were each studied in eight monographs. The sociogram in Figure 5 allows a quick analysis of the network situation and its relationships. It shows how organisations and subjects relate to each other. Some organisations are more central or more representative in relation to the network as a whole, either by the number of their students or by the number of relationships with distinct subjects. In a similar way, it is observable how specific subjects can gather organisations in which students developed a specific research topic. Some subjects were selected by students in only one organisation, such as "Ethics" and "Training". In order to avoid translating and displaying the full names of the FPA agencies, we here use their abbreviations.
In terms of subjects, even though the same number of students studied ICSM and SP&P, the former was investigated by nine different organisations and therefore has a bigger aggregation potential than the latter, which was studied in just five distinct organisations. The aggregation potential of an organisation or of a subject can be defined as the degree of the vertex that represents it, after eliminating loops and the multiplicity of network lines. In Figure 3, for example, if the loop on Agency #12 and the subject multiplicities between #12 and #13 are removed, the aggregation potential of Organisation #13 equals two, while the aggregation potential of Organisations #12 and #14 is one. The aggregation potential of the three subjects in Figure 4 is also two.
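A minimal sketch of this definition follows. The list of lines only mirrors the description of Figure 3 given in the text (a loop on #12, multiple lines between #12 and #13, and a line between #13 and #14); the exact multiplicity between #12 and #13 is an assumption, and the function name `aggregation_potential` is ours.

```python
from collections import defaultdict


def aggregation_potential(lines):
    """Degree of each vertex after dropping loops and repeated lines."""
    neighbours = defaultdict(set)
    for u, v in lines:
        # Touch both endpoints so that vertices with only loops still appear.
        neighbours[u]
        neighbours[v]
        if u != v:  # ignore loops
            neighbours[u].add(v)  # sets absorb repeated lines
            neighbours[v].add(u)
    return {vertex: len(others) for vertex, others in neighbours.items()}


# Lines as described for Figure 3; the doubled 12-13 line is illustrative.
figure_3_lines = [(12, 12), (12, 13), (12, 13), (13, 14)]

potential = aggregation_potential(figure_3_lines)
print(potential)  # {12: 1, 13: 2, 14: 1}
```

The result agrees with the values stated in the text: two for Organisation #13 and one for Organisations #12 and #14.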

Proximity networks
Proximity networks between agencies and between topics can be generated (as illustrated by the one-mode networks in Figures 6 and 7, presented in a Fruchterman-Reingold energy layout) starting from the affiliation network between organisations and themes (the two-mode network in Figure 5). If we consider the multiplicity of relationships, one can analyse these networks by extracting the proximity of their m-slices. This can be done using Pajek software. Pajek and Gephi can generate networks in which line thickness is proportional to the multiplicity values that bring organisations and themes together (Figures 6 and 7).

Proximity networks between organisations
In order to represent only the most significant connections between organisations and between subjects, one can display only the m-slices above a certain value. In this case, the m-slice established a cut-off point below which links between actors were discarded. Thus, one can have a more accurate perception of the degree of cohesion between organisations and between subjects by analysing the frequency of the slices found. Table 4 presents the detected m-slices between organisations and the number of members in each slice.
If lines with multiplicity less than or equal to three are removed from the proximity network, the less significant links between the twenty-nine agencies are excluded, leaving the proximity of subjects among fifteen organisations. The proximity of interest between organisations is stronger when a greater number of individuals from each agency study the same topics. In the specific case of MP (Ministry of Planning, Budget and Management) and EB (Brazilian Army), the relationship has a value of nineteen because three students who work in MP and six students from EB studied the subject SP&P (eighteen points), while just one student from each agency wrote about Security Risk Management (one point). The proximity analysis between organisations may denote the potential to generate synergy between agencies with the same goals.
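As a worked check of the value quoted above, and assuming (this is our reading of the example, not a rule stated explicitly in the text) that the value of a line between two agencies is the sum, over the subjects they share, of the product of the numbers of students each agency contributes, the MP and EB value can be written as follows. The symbol n_{a,s} is notation introduced here for the number of students of agency a who researched subject s.

```latex
w(\mathrm{MP},\mathrm{EB})
  = \sum_{s} n_{\mathrm{MP},s}\, n_{\mathrm{EB},s}
  = \underbrace{3 \times 6}_{\text{SP\&P}}
  + \underbrace{1 \times 1}_{\text{Security Risk Management}}
  = 18 + 1 = 19
```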

Proximity networks between subjects
A similar analysis to that of the previous section can be made in order to identify proximity between researched subjects in
eight students studying SRE&M. Although there are fewer agencies connected by SP&P, there are more students studying it. In this case, we have a mean value of 2.4 "interfaces" per agency discussing SP&P and just one "interface" per agency discussing SRE&M. This means that, on average, ties are stronger between organisations in SP&P than in SRE&M, as was suggested previously.
Another way to identify proximity on a given subject is to count the maximum number of communication channels between the students who research it. This number can be represented by the combinations of such students in pairs, because what we want to know is how one agency can relate to another (or even to itself) on a given subject. Mathematically, this is n! / (k!(n - k)!), where n is the number of students researching the subject and k equals two. Taking this approach, SP&P has sixty-six possible combinations of students, while SRE&M has only twenty-eight.
Standard deviation may also be applied in order to recognise the agencies that are more connected. Regarding SP&P, EB is more than one standard deviation above the mean, and this is easily recognisable in a sociogram because of the thickness of lines or the labels containing the multiplicity values of lines.
EB and MP are the most meaningful agencies studying SP&P, as together they contribute 75% of the students researching this subject. It is worth noting that, while EB produced twice as many monographs on the subject as MP, SP&P is more important to MP (three out of four students researched it) than to EB (six out of thirty-eight students researched it). Therefore, it is important to consider both relative and absolute values during analysis. When we consider links between two agencies, we have to multiply their "interfaces" (students) in order to obtain the possible combinations of links. In SP&P there are eighteen links between EB and MP, while there is only one link for SRE&M. These assumptions arose just by using SNA techniques. The same reasoning is applied when analysing relations between subjects researched by a given agency.
In terms of proximity networks, even in the absence of direct interactions, social homogeneity can arise among people or groups in structural equivalence models, such as an affiliation network, which leads to the conclusion that direct, and even short indirect, relations are critical components for success in predicting homogeneity [34]. Therefore, in the presented network, the more indirect relationships actors of the same set (organisations or research topics) have, the more homogeneity they tend to have. Thus, we believe that a joint effort towards KM between agencies that share the same concerns and bear a higher degree of connections can bring several benefits (e.g. saving time, money and work effort, and avoiding redundancy). Improvement in a research network can be achieved through linking experts in a specific area of study [13,29]. Hence, having
monographs. Such an analysis might help identify boundaries and bonds between complex and diffuse subjects. Table 5 contains details of a proximity network between subjects, such as the frequency distribution of topics pertaining to different m-slices.
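The two figures quoted above can be reproduced with a short check. The student counts (twelve for SP&P, eight for SRE&M) are those reported in the text; the variable names are ours.

```python
from math import comb

# Students per subject, as reported in the text.
students = {"SP&P": 12, "SRE&M": 8}

# Number of unordered pairs of students: C(n, 2) = n! / (2! (n - 2)!)
channels = {subject: comb(n, 2) for subject, n in students.items()}
print(channels)  # {'SP&P': 66, 'SRE&M': 28}
```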
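A small sketch of this check for SP&P follows, under explicit assumptions. The counts for EB (six) and MP (three) come from the text; the remaining three agencies are placeholders with one student each, a split inferred from the totals reported in the paper (twelve students in five agencies) rather than read from its tables. The population standard deviation is used here, which is also our choice.

```python
from statistics import mean, pstdev

# Students per agency researching SP&P. EB and MP are taken from the text;
# agency_3 to agency_5 are placeholders with inferred counts.
sp_and_p = {"EB": 6, "MP": 3, "agency_3": 1, "agency_4": 1, "agency_5": 1}

counts = list(sp_and_p.values())
average = mean(counts)      # 2.4 "interfaces" per agency
deviation = pstdev(counts)  # population standard deviation

# Agencies more than one standard deviation above the mean.
above = [agency for agency, n in sp_and_p.items() if n > average + deviation]
print(above)  # ['EB']
```

Under these assumptions only EB exceeds the threshold, while MP, with three students, stays below it.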

Discussion
In this paper, we sought to propose a method of identifying organisations that might combine their research efforts more efficiently on the basis of their common areas of interest. The previous sections presented such a proposal using an SNA approach in order to provide metrics to support a quantitative analysis. Results could be sorted in order to identify the most relevant partnerships among organisations. This section will present other methods in an attempt to validate this paper's methodology.
In terms of the multiplicity value of each relationship, there is great proximity between subjects such as ICSM and SP&P, which leads to the hypothesis that these issues should be worked on jointly, since they arise as simultaneous demands of various organisations.
It can also be noted that ICSM, Physical Security Controls, Computer Network Management and SP&P are subjects that gather more people in their development.
Another possibility for further analysis is to identify whether subjects that have few connections should connect to others because there is no interest in assembling communities of practice to address themes of little interest.

Conclusion
This article describes a method in knowledge analytics, based on the statement that the study of the interaction of institutions, mediated by the research themes of their employees, can identify and map the knowledge-creation process and the need to manage it.
This paper sought to demonstrate, through an SNA application using affiliation networks, that it is possible to identify proximity for knowledge sharing between organisations. The concept of proximity network is formulated from affiliation networks. Cut-off points and thresholds could be established by using m-slices in order to discard the less meaningful proximity links, thus allowing a clear picture of which organisations have denser connections. In the studied case, two-mode networks and the one-mode networks arising from them suggest knowledge demands in FPA organisations. To do so, one of the SNA methods was applied to data acquired from Moodle, taking into account the subjects of interest to students on a postgraduate course and their respective agencies. Other hypotheses emerged from this study that could also be tested using SNA.
This article suggests that the demand for knowledge in an organisation can be assessed by the investment of its employees' time in studying specific topics and themes. This article also presented some possibilities and opportunities arising out of relationships between different organisations according to the topics studied during the course, and proposed that those relationships might even support them in developing joint solutions. It was also observed that it is possible, and advisable, to conduct mapping of topics of interest in organisations through the research themes of their employees. Some subjects may be more critical to some organisations than to others; therefore other parameters, such as criticality, may be added in future work, resulting in a more qualitative analysis.
In Figure 6, one can see that EB, DATAPREV and MP distinguish themselves by the number of themes in common (represented by the thickness of lines) and thus should seek greater proximity. The authors assume that such agencies could combine their efforts based on common and more relevant topics of research. Thus, a hypothesis for future work is the validity of creating communities of practice among agencies that have a higher number of similar knowledge demands. Such communities should group together study topics and organisations in order to minimise research efforts and to enhance the exchange of solutions and synergy among agencies.