Introduction

In the confluence of Artificial Intelligence and learning sciences, numerous advances have been made over the past five decades (Balacheff et al., 2009; Woolf, 2015; Kay & Aleven, 2016; Chassignol et al., 2018), both for individual learning with technology (e.g., Koedinger et al., 1997; Kulik & Fletcher, 2016; Koedinger & Aleven, 2016) and for collaborative learning with technology (e.g., Kumar et al., 2007; Hmelo-Silver et al., 2013; Adamson et al., 2014; Graesser et al., 2018). Notable examples of AI in Education (AIED) in the area of workplace learning and professional learning (Footnote 1) exist (e.g., Gott et al., 1986; Lajoie & Lesgold, 1989; McCall et al., 1990; Lesgold et al., 1991; Fischer et al., 1993; Collins et al., 1997; Frasson & Aïmeur, 1998; Lindstaedt et al., 2010; Schwendimann et al., 2015; Westerfield et al., 2015; Fessl et al., 2017; González-Eras & Aguilar, 2019). However, most advances in AIED in the past two decades have been made in formal learning environments, with the bulk of work focusing on formal K-12 or higher education (Roll & Wylie, 2016). An exception is the surge of attention on Massive Open Online Courses (MOOCs) roughly a decade ago, where the focus has been on adult and often recreational learning, typically using content exported from formal learning settings (e.g., from existing university courses). Professional learning, especially for continuing certification in fields like teaching, has been studied (Milligan & Littlejohn, 2014; Milligan & Littlejohn, 2017), and online degree programs and continuing certification programs have become more mainstream over the past decade. In these online learning contexts, the ingredients for a high impact of AI-enabled instruction, such as intelligent tutoring systems and dynamic support for collaborative learning, are all available and beginning to be adopted (Rosé & Ferschke, 2016).
AI-enabled scaffolding and just-in-time and situated learning in the workplace, on the other hand, are a new frontier for future impact of AIED.

In this paper we raise the questions i) which data-related ethics issues are perceived by professionals as they engage with learning technology situated in their workplace environment, and ii) how might ongoing AIED research address data-related ethics issues. As data are the foundation of modern AI, addressing data-related ethics issues will be central to making AIED work for informal and situated professional learning: Firstly, in order to develop machine learning models, suitable training data about informal and professional learning, learning contexts, and learning activities needs to be collected; and some data must even be continuously collected at runtime for AI-enabled technology. Secondly, data about learning in context is needed in order to assess and improve learning interventions in a data-driven manner.

Below, we first provide background on professional learning and research on professional learning within AIED (Section 2). Then we describe case study research as the selected methodological foundation for this paper (Section 3). In Section 4 we describe five cases and observations about ethics issues within them in order to build an empirical foundation for our discussion. In Section 5, we develop these observations into three themes that constitute data-related ethics challenges. In Section 6, we look forward and discuss how the identified themes of data-related ethics issues synergize with current directions in AI in Education and related fields that provide resources for addressing the issues raised. Section 7 concludes the paper.

Background

The work of this paper begins with a realization that as technological and scientific advances accumulate, lifelong learning has become accepted as a necessity for professionals in order to re-skill from disappearing jobs, and to continually work to stay current within a chosen professional domain (Littlejohn & Pammer-Schindler, forthcoming). In parallel, technologies play a substantial role not only in necessitating lifelong professional learning, but also in enabling such learning (ibid).

Professional Learning

Professional learning means learning that relates to a substantial degree to work, such that it encompasses all learning “needed for successful performance in an occupation” (Hager, 2011, p.17). Professional learning mostly comes with the implicit connotation of relating to work within an organization; and mostly, work is understood to refer to paid work. This is definitely the case in the context of the field studies described in this paper. Professional learning often falls under the header of workplace learning; and where it is enacted as formal learning, it is often also termed “continuing education” or “training”. In general, professional learning covers the full breadth of the spectrum of formal learning and informal learning. By formal learning we mean learning in contexts explicitly designed for learning, typically with designated teachers, and in which typically there is some form of curriculum and some form of certification (cp. Hager, 2011). By informal learning, in contrast, we mean learning in contexts that aren’t primarily designed and structured for learning, where typically no a priori teachers, curricula, or certification exist (e.g., Eraut, 2004; Hager, 2011). Such learning can be more or less conscious, and more or less planned (cp. Eraut, 2004’s categorization of implicit, reactive and deliberative learning). We highlight here that while informal learning can be incidental and not planned, it can also be planned and systematic (ibid). It is understood that important parts of professional learning are informal learning (e.g., Eraut, 2004). Further, substantial challenges in professional learning relate to contextualizing knowledge and acquired competencies with respect to ongoing work experience in a concrete social setting of practice (e.g., employer organization and business context) in which concrete projects are carried out (Eraut, 2009; Hager, 2011).
The field studies described in this paper constitute examples of informal professional learning.

While professional learning has historically been an important part of successful professional careers (cp. Hager, 2011), there is a growing awareness of a renewed urgency for more research regarding professional learning, and particularly for understanding and designing novel technologies for professional learning (cp. Littlejohn & Pammer-Schindler, forthcoming). The underlying reasons are globalization and an increasing pace of progress in many domains, such as computer science, medicine, agriculture, or production. These make modern workplaces unpredictable, and highly dynamic in terms of the business environment and knowledge required to get the job done. Consequently, lifelong continued learning is important for professionals both to re-skill from disappearing jobs, and to continually work to stay current within their professional domain.

AIED Research in Professional Learning

AIED as a field has always worked on professional learning, albeit sparsely. As a very rough quantification of this sparsity, one can count the number of search results returned for each of the keywords “K-12, higher education, workplace learning, professional learning, continuing education” (each keyword extended with “technology”) in the library of the International Journal of Artificial Intelligence in Education or in Google Scholar; see Table 1. While these numbers are merely indicative, they do show an imbalance in how research effort has been invested across different learning contexts, where the number of publications serves as a proxy for invested research effort.

Table 1 Number of search results for K-12, higher education, or workplace learning and technology (search in full text; date of search: March 2, 2020). k = thousand, M = million

Acknowledging this sparsity, we note that in professional learning analytics studies, teachers as well as other professionals have been researched as learners (e.g., Renner et al., 2019; Ruiz-Calleja et al., 2017). We note that in these studies the reported analytics are by and large algorithmically simpler than in many state-of-the-art learning analytics or educational data mining papers, probably due to the smaller size of data sets.

Further, AIED and related research has been interested in modelling professional competencies using methods from artificial intelligence as the basis for intelligent tutoring systems (e.g., Gott et al., 1986; Kay & Kummerfield, 2011; Ley & Kump, 2013) or as the basis for job recommendations (e.g., González-Eras & Aguilar, 2019). There has also been interest in identifying joint learning goals amongst professional learners as the basis for peer support (Littlejohn et al., 2009), in recommending learning goals based on user modelling (Ley et al., 2010), in supporting contextualized reflective learning through learning prompts in knowledge work (e.g., Fessl et al., 2017; Fischer et al., 1993; McCall et al., 1990), and finally in in-situ learning support in knowledge work (e.g., Lindstaedt et al., 2010) or industrial work (e.g., Frasson & Aïmeur, 1998; Westerfield et al., 2015).

Overall, this research involves challenges related to the collection of data about professionals in rich social contexts, and in creating suitable models that could form the basis for intelligent tutoring systems. A typical solution is to focus technology development on narrow and well-delimited domains, as e.g., in Fischer et al. (1993), McCall et al. (1990), or Westerfield et al. (2015). Where systems are built with substantially fewer constraints, as in Lindstaedt et al. (2010), challenges exist with respect to manual domain modelling, the collection of suitable amounts of data, and also evaluations in the field (in terms of issues of comparability).

In such research, data plays a crucial role: First, it forms a basis for learning and reflection when the collected data is modeled and then served back to students or instructors through visualizations (learning analytics). Second, it forms a basis for developing adaptation mechanisms through statistical methods or methods from artificial intelligence to make learning technologies adaptive, i.e., for online adaptation of technology at use time without humans in the loop. Consequently, considering data-related ethics issues is of key importance when designing AIED systems.

Methodology

In this paper we raise the questions i) which data-related ethics issues are perceived by professionals as they engage with learning technology situated in their workplace; and ii) how might ongoing AIED research be able to address data-related ethics issues.

We consider five cases. All cases are from settings where professionals encountered – in different depths – data-driven and adaptive technology for informal and professional learning, and different issues or non-issues related to data and ethics came up. In four of these cases, a field study was carried out. In one case the field study didn’t take place due to privacy concerns; note that there was still a technology encounter, albeit a very brief one.

Therefore, the present paper can be understood as a multiple-case study (cp. Yin, 1994). The analysis constitutes a secondary analysis. By this we mean that the field studies that were conducted were designed for a different purpose than to investigate data-related ethics issues in professional learning. Consequently, the planned data collection and analysis from these field studies also focused on different issues than ours. In particular, the field studies aimed to substantiate assumptions about how data-driven and adaptive technologies might support reflective learning in the workplace.

The case study analysis in this paper is mostly descriptive (in Section 4) in the sense of focusing on narrating what happened in the cases. Subsequently, observations of data-related ethics issues in those cases are interpreted in relationship to existing literature (partly in Section 4, and further developed in connection with identified themes in Section 5) as potential explanations for why these issues arose. We label these interpretations as themes that we have developed, highlighting through this wording that the themes are not inherent in the data, but are an outcome of active conceptual work and discussion among researchers (cp. a similar argumentation for thematic analysis in Braun & Clarke, 2006). Finally, we connect the developed themes to ongoing research within AIED that is promising for addressing data-related ethics issues we have raised (Section 6).

Due to the nature of secondary data analysis and the fact that cases were not systematically selected from a pool of cases to be analyzed, but rather those cases that were available were analyzed (convenience sampling), generalizability is limited. In particular, all discussed cases are cases of informal professional learning, where socio-technical interventions introduced (or attempted to introduce) data-driven and adaptive technology for reflection in connection with reflective practice around such technology.

We argue that secondary analyses of field studies are valuable to the field. Field studies in which new interventions are experimentally tested in workplace learning are relatively rare due to the required investment of resources. Secondary analyses capitalize on an investment already made. A single secondary analysis such as ours, especially of a convenience sample, by itself does not provide the rigorous foundation for a strong conclusion: The offered interpretations cannot claim to constitute a model or theory of ethics issues in AI-enabled educational technology. However, this discussion does constitute a baseline for systematically investigating perceptions of data-related ethics issues to lay a foundation for future field studies, and an impulse for systematically investigating the identified AIED research directions with respect to their advantages for data-related ethics. Further, some sense of generalizability can be achieved across multiple case studies, if case study selection is sufficiently systematic (cp. Yin, 1994). This means that if more case studies are published, more resources exist for secondary analysis, and better subsequent studies can be designed on this foundation.

Multiple Case Studies

This section describes the five cases that provide the empirical foundation of our work: Four field studies that investigated the usage of automatically and manually collected data as a basis for reflective learning and adaptive reflection prompts (cases 1–4; main results published in Fessl et al., 2012; Fessl et al., 2017; Pammer et al., 2015; Rivera-Pelayo et al., 2017). Note that the adaptive prompts are rule-based rather than machine-learning based (even though the rules themselves are based on simple descriptive statistics of collected data). The fifth case is a planned but not implemented field study with the same goal (referred to as case X). The first author of this paper was part of the respective research teams planning and conducting the case study research.

Reflection in the sense of a critical evaluation of past experiences with the goal to learn (cp. Schön, 1983; Boud et al., 1985; Pammer et al., 2017) is a key mechanism in informal professional learning. It becomes possible as a learner’s expertise in a field grows (Kirschner et al., 2006). In parallel, reflection also becomes necessary, as direct instruction becomes less available in the complex and dynamic environments in which experts work. In these, few people can be found who can directly instruct correct behavior, and standard codified knowledge on how to act correctly isn’t available (cp. Burnes et al., 2003; Knipfer et al., 2013; Pammer et al., 2017; Thalmann et al., 2020).

The investigated roles of computing technology to support such learning were firstly, to capture, analyze and represent relevant data to users; and secondly, to adaptively guide reflection. This research therefore firstly relates and contributes to learning analytics research. However, data is captured about work experiences in order to support learning; rather than data being captured about learning experiences and processes. Secondly, this research contributes to research on adaptive and contextualized reflection guidance (Fischer et al., 1993; McCall et al., 1990; Kocielnik et al., 2018); and more broadly to research on context-aware prompting (e.g., Ho & Intille, 2005; Pejovic & Musolesi, 2014).

In the description of the cases below, the cases are organized into two research streams:

  • Research stream 1: Automatic activity log data for reflection on time management (Cases 1, 2 – main publication: Pammer et al., 2015; Case X - unpublished). Described in Section 4.1.

  • Research stream 2: Self-tracked mood data for reflecting about salient work issues (Case 3 – main publication: Fessl et al., 2012; Case 4 – main publication: Rivera-Pelayo et al., 2017). Described in Section 4.2.

In both research streams, the research teams were aware of potential data-related ethics issues from the beginning; however, perceptions of privacy and confidentiality played out in different ways in the two research streams.

When interpreting these research streams from the perspective of ethics issues, we identified three themes:

  • Theme 1: Relevant data being about learners, others, and other things

  • Theme 2: Manual tracking as a conduit for increased user control

  • Theme 3: Learning as a non-shared goal in workplace settings

These themes are referenced in this section when describing the cases, and further developed in Section 5.

Privacy Concerns Related to Activity Log Data for Reflecting on Time Management

This research stream investigated the usefulness of activity log data as an objective basis for reflecting on and improving time management. In this stream, privacy and confidentiality issues were consistently a critical issue.

This stream highlights the fact that relevant data in informal professional learning is potentially about learners, others that learners interact with (colleagues, customers, clients), or other things that may be confidential (Theme 1 – see Section 5.1); the overall sensitivity of fine-grained and automatically logged activity data (Theme 2 – see Section 5.2), and the fact that learning isn’t necessarily a shared goal in workplaces (Theme 3 – see Section 5.3).

Background and Design Rationale

Managing one’s time is one of the key challenges for knowledge workers (see e.g., Mark et al., 2008; Wu & Tremaine, 2004). Time management (TM) involves activities like assessing, planning and monitoring time use with the goal to organize time use in a productive and healthy manner (e.g., Claessens et al., 2007). Fragmentation of worktime is a relevant dimension for reviewing personal time management. It has been shown that knowledge workers’ time is typically severely fragmented by interruptions (Mark et al., 2005). Such interruptions severely impact productivity (Czerwinsky et al., 2004; Mark et al., 2008) and lead to stress (Mark et al., 2008). Commercial activity logging tools like ManicTime (Footnote 2), RescueTime (Footnote 3) or SLife (Footnote 4) already claim to support time management. Scientifically, however, the usefulness of activity logging tools for time management (in the sense of learning about and improving it) is under-explored.

Research Prototype

Against this background, KnowSelf has been developed as an activity log application for Windows (Pammer et al., 2012) (Fig. 1). KnowSelf collects time-stamped PC activity data, linked to web or file resources, identifies idle times, and visualizes the fragmentation of worktime similar to the Windows disk defragmenter (Footnote 5). KnowSelf supports manual time labeling to sort automatically logged data into categories that are meaningful to the user (e.g., project names). Labeling is useful to record non-digital activities and to provide higher-level names to activities that span multiple resources. The prototype has note-taking functionalities, and provides visualizations of time use around resources, applications and tasks that support reflection-on-action. KnowSelf also has proactive prompts that support reflection-on-action, which are triggered at pre-determined times; and proactive prompts that support reflection-in-action, which are triggered by user behavior such as unusually long idleness or high frequency of switching application windows (Pammer et al., 2015; Fessl et al., 2017).
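The behavior-triggered prompts described above can be illustrated with a minimal rule-based sketch. It is not the KnowSelf implementation: the event structure, the function names, and all thresholds (15 minutes of idleness, more than 20 application switches within 10 minutes) are assumptions chosen for illustration, and idleness is approximated here as the time since the last logged window-focus event.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class ActivityEvent:
    """One logged window-focus event (hypothetical structure)."""
    timestamp: datetime   # when the application window gained focus
    application: str      # name of the application in focus


def idle_gaps(events: List[ActivityEvent],
              idle_threshold: timedelta) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) pairs between consecutive events that are
    further apart than idle_threshold. Assumes events are sorted by time."""
    gaps = []
    for prev, curr in zip(events, events[1:]):
        if curr.timestamp - prev.timestamp > idle_threshold:
            gaps.append((prev.timestamp, curr.timestamp))
    return gaps


def switch_count(events: List[ActivityEvent],
                 window_end: datetime,
                 window: timedelta) -> int:
    """Count changes of the focused application within the time window
    that ends at window_end."""
    recent = [e for e in events
              if window_end - window <= e.timestamp <= window_end]
    return sum(1 for a, b in zip(recent, recent[1:])
               if a.application != b.application)


def should_prompt(events: List[ActivityEvent],
                  now: datetime,
                  idle_threshold: timedelta = timedelta(minutes=15),
                  window: timedelta = timedelta(minutes=10),
                  max_switches: int = 20) -> Optional[str]:
    """Return the reason for a reflection-in-action prompt, or None.
    All thresholds are illustrative assumptions."""
    if not events:
        return None
    if now - events[-1].timestamp > idle_threshold:
        return "idle"
    if switch_count(events, now, window) > max_switches:
        return "switching"
    return None
```

In a deployed tool, fixed thresholds like these would more plausibly be derived from simple descriptive statistics of each user's own logged data, so that "unusually long" is judged relative to that user's typical behavior.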

Fig. 1 Activity Logging Tool - Research Prototype: The default tab gives an overview of the following: 1) a representation of fragmentation of worktime; 2) a sortable list of timespans and digital resources in focus; 3) a visualization of the overall time spent on a day per application; 4) an overview over multiple days with time fragmentation per application; and 5) a representation of the extent to which a selected resource has been used over time (the rationale being that some documents are relevant only for a short period of time, while others are used frequently but briefly). Other tabs give an analysis per application, per project, and the possibility to take notes

Cases 1–2, X: Two Field Studies, One Not Implemented Due to Privacy Concerns

KnowSelf and regular review of time use as reflective practice around it were positively evaluated in two field studies, both of which were set in a German medium-sized company with reasonably senior IT and strategy consultants (Cases 1 and 2; cp. Pammer et al., 2015; Fessl et al., 2017). A further field study in a non-German consulting team with a comparable level of seniority was planned but couldn’t be implemented: In the team that had earlier been identified as the target user group, the organisational climate at the time of research was such that there was concern that data might in some way reach management and be used as impetus to fire people (Footnote 6). This is referred to subsequently as Case X.

Emerging Data-Related Ethics Issues

In order to deal with potential privacy issues from the beginning, a threat analysis for activity logging was carried out, in the sense of who would potentially be interested in the data, and capable of accessing it under which conditions (Pammer et al., 2014). Subsequently, a distributed architecture that is capable of respecting users’ privacy with respect to their data was developed, both conceptually and technically. The architecture considers multiple data collection devices and a central server for data storage and analysis, in order to save space on users’ digital devices, allow combined analysis and visualization of time use on multiple devices, and allow centralized data analysis. Two different security configurations were conceptualized and implemented in a sensing framework, differentiated based on whether local access to data is required or desirable. One of those was associated with a private key only at server side, and one with a private key also at the client side (ibid).

However, when activity logging on multiple devices together with centralized storage and analysis was discussed with target users and user representatives prior to the field study in Case 1, this idea was strongly rejected by target users. The main reasons were related to sensitivity of data. Firstly, of course, the data are sensitive with respect to individual users, and users were reluctant to allow logging on multiple devices and central data storage, and thereby to facilitate integration of data. In order to continue to have potential access to data logged throughout the studies that involved activity logging, we implemented a purely local activity logger for Windows that was able to export data in an anonymized manner as a CSV file. The anonymized export hashed filenames, manual labels, and notes, and left only timestamps and application names intact. Even with this anonymization, a majority of study participants in Cases 1 and 2 in the end decided against handing over this data to researchers.
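The anonymized export described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the exporter used in the studies: the record fields (`resource`, `label`, `note`), the use of salted SHA-256, and the truncation of the digest are all hypothetical choices.

```python
import csv
import hashlib
import io

# Fields treated as sensitive and therefore hashed (hypothetical names).
SENSITIVE_FIELDS = ("resource", "label", "note")


def pseudonymize(value: str, salt: str) -> str:
    """Replace a sensitive string with a truncated salted SHA-256 digest.
    Empty values stay empty so that missing labels remain recognizable."""
    if not value:
        return ""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def export_anonymized(records, salt: str) -> str:
    """Return CSV text in which timestamps and application names are kept
    intact while filenames, manual labels, and notes are hashed."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["timestamp", "application", *SENSITIVE_FIELDS])
    writer.writeheader()
    for rec in records:
        row = {"timestamp": rec["timestamp"],
               "application": rec["application"]}
        for field in SENSITIVE_FIELDS:
            row[field] = pseudonymize(rec.get(field, ""), salt)
        writer.writerow(row)
    return out.getvalue()


# Usage with made-up records:
records = [
    {"timestamp": "2020-03-02T09:00:00", "application": "winword.exe",
     "resource": "C:/clients/acme/offer.docx", "label": "ACME project",
     "note": "call client back"},
    {"timestamp": "2020-03-02T09:07:30", "application": "chrome.exe",
     "resource": "https://example.org/", "label": "", "note": ""},
]
csv_text = export_anonymized(records, salt="per-user-secret")
```

Because identical inputs map to identical digests, the export still reveals when the same resource recurs, and the timestamps and application names preserve the full temporal pattern of work.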

While this was unfortunate from a methodological point of view, it is still understandable given the highly sensitive nature of data that lies not only in the cleartext but also in the pattern of activities throughout a longer period of time (Footnote 7). A later study (Gorm & Shklovski, 2016) succinctly analyzes one of the mechanisms that may have been at play in our study, namely that some privacy concerns only appear over extended and concrete usage of interventions. In our case, participants had agreed beforehand to sending us anonymized data, but noticed only after data collection that this still didn’t sufficiently answer their reservations. This relates inversely to Theme 2 – Manual tracking as a conduit for user control (Section 5.2 below), in particular in the sense that reviewing fine-grained activity log data for potentially sensitive information is easily overwhelming, to the extent that it is clearly easier to decide not to share data at all.

This sensitivity of data with respect to users was also the reason why we were unable to implement a field study using activity logging for time management in a non-German consulting team (which we refer to as Case X). In this team, as mentioned earlier, organizational climate at the time of research was such that there was concern that data might in some way reach management and be used as a reason to fire people (Footnote 8). This highlights the relevance of considering the potential of data collected for reflection purposes to serve as workplace surveillance in the sense of “monitoring and recording aspects of an individual or group’s behavior […] for the purposes of judging these as appropriate or inappropriate; as productive or unproductive; as desirable or undesirable” (Introna, 2003; p.210). This example shows how learning is not an a priori shared goal of involved stakeholders: In case of a difficult organizational climate, trust that collected data will be used to support learning rather than to support performance measurement may be limited (Theme 3 – Learning as a non-shared goal, see Section 5.3 below).

Study participants (in Case 1) were also concerned that logged data wasn’t only sensitive with respect to themselves, but would contain confidential data about clients, sometimes already in file or folder names (Footnote 9). Within the field study, this concern existed despite local-only logging and anonymized export (hashed filenames).

We encountered a similar concern in the context of a different research stream in informal communications: For physicians, reflection and discussion of ongoing cases is part of their job. Beyond this, documentation of particular cases is interesting as part of their informal as well as formal continued professional learning: For informal learning, keeping written personal documentation of interesting cases beyond just memory can be helpful, especially if discussion with peers is temporally or spatially distributed. For formal learning (certification), treatment of specific cases may need to be documented. Every case of a physician, however, is obviously about a patient – a human subject of the data collection; and contains highly sensitive information. Of course, physicians can deal with this by keeping case data in an anonymized manner, and by getting formally necessary confirmation by a supervisor in-situ, thereby avoiding the necessity of traceability of case data for certification reasons. On the downside, as automatic anonymize-and-export functionality from medical information systems is not standard functionality, the full effort of re-creating an anonymous version of the case data is left to the physician (in this case, the learner). Further, systematic follow-up on patients based on case data is not possible for individual physicians.

We see the fact that relevant data is potentially sensitive with respect to others than the learners, such as clients or customers; and confidential with respect to social entities beyond the learners, such as the employer organization, as a salient consideration in informal professional learning (Theme 1 – Relevant data is about the learner, others, and other things, see Section 5.1 below for further discussion).

Balancing Privacy and Usefulness in the Case of Self-Tracked Mood Data

The second research stream investigated the usefulness of self-tracked mood data in collaborative working settings in order to trigger reflection, increase awareness of one’s own emotions, and to facilitate communication within teams. In this stream, data collection related concerns weren’t in the forefront.

This stream overall shows that when everything fits together, data collection isn’t an issue. Again, in this stream, collected data are about the professionals as learners, others and other things (Theme 1 – see Section 5.1). However, data aren’t necessarily private with respect to customers, or confidential; which can be attributed to manual tracking, allowing both coarse-granular tracking, and giving full control of the content of further (verbal) elaborations (Theme 2 – Section 5.2). This stream relates inversely to Theme 3 (see Section 5.3) by exemplifying how individual learning can be closely aligned with performance-oriented and organizational goals.

Background and Design Rationale

In collaborative settings, awareness of others’ emotions has been shown to enable users to respond accordingly and subsequently to achieve better results in collaborative work (García et al., 1999; Dullemond et al., 2013). This complements the knowledge in computer-supported cooperative work that awareness of significant information about others is beneficial in collaborative work settings (Gutwin & Greenberg, 2002). In reflective learning, past and present emotions can point to salient aspects for reflection and can trigger and impact it (cp. Boud et al., 1985 for the role of emotions in learning, and Pammer et al., 2017 for triggers). Inversely, reflection can increase awareness of one’s own emotions (e.g., Morris et al., 2010).

Research Prototype

Against this background, the mood self-tracking research prototype MoodMap App was developed (see Fig. 2). While we labelled the self-tracking application “MoodMap App”, we didn’t strictly stipulate that users track mood in the sense of longer-term, diffuse affective states, as opposed to emotions that arose as affective reactions to specific events (cp. Frijda, 1994): The goal was for users to capture their current affective state as seemed relevant in their context of use. The interface follows Russell’s two-dimensional model of affect (Russell, 1980), which describes affect along the two dimensions of valence (feeling good – feeling bad) and arousal (high energy – low energy). Visualizations following Russell’s (1980) model of affect have been investigated and validated in previous research (Morris et al., 2010; Ståhl et al., 2009). Our visual representation of mood is similar to these visualizations. Mood in the MoodMap App is captured by clicking on the bi-dimensional mood map (Fig. 2 - a) based on Itten’s colour system (Itten, 1971). Personal notes (free text) can be attached to mood entries and context information can be added outside of mood entries (e.g., a task has been finished). Moods, notes, and context are aggregated and visualized in different views on an individual as well as collaborative level. At the team level, the average mood of each team is calculated with the last mood of each user captured during the present day.
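The team-level aggregation described above (average over each user's last mood entry of the present day) can be sketched as follows. This is an assumption-laden illustration rather than the MoodMap App code: the entry structure and the numeric scale of -1 to +1 for valence and arousal are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class MoodEntry:
    """One self-tracked mood entry (hypothetical structure)."""
    user: str
    timestamp: datetime
    valence: float   # assumed scale: -1 (feeling bad) to +1 (feeling good)
    arousal: float   # assumed scale: -1 (low energy) to +1 (high energy)


def team_average_mood(entries: Iterable[MoodEntry],
                      today: date) -> Optional[Tuple[float, float]]:
    """Average valence and arousal over each user's last entry of `today`.
    Returns None if nobody on the team has tracked a mood today."""
    latest: Dict[str, MoodEntry] = {}
    for entry in entries:
        if entry.timestamp.date() != today:
            continue  # only entries from the present day count
        current = latest.get(entry.user)
        if current is None or entry.timestamp > current.timestamp:
            latest[entry.user] = entry
    if not latest:
        return None
    n = len(latest)
    mean_valence = sum(e.valence for e in latest.values()) / n
    mean_arousal = sum(e.arousal for e in latest.values()) / n
    return mean_valence, mean_arousal
```

Using only the last entry per user means that each team member contributes equally to the average, regardless of how often they track their mood during the day.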

Fig. 2 MoodMap App research prototype - figure as shown in Rivera-Pelayo et al. (2017): (a) Mood can be entered (=captured) by clicking on a coloured, bi-dimensional mood representation. The entered mood is also translated into a smiley. (b) In the Compare Me View, one’s own average valence (feeling good/bad) and arousal (high energy/low energy) are compared to average valence and arousal of the group. (c) In the Collaborative View, the current moods of all others in the group are shown. Different versions of the MoodMap App show this anonymously or with names. (d) A mood report summarizes captured mood over a period of time. Different versions of the report were experimented with. The report shown in the above figure shows for instance the development of valence and arousal over time (left bottom corner), gives contextual information about the meeting, lists full-text notes given in addition to mood entries, and separates the mood map into four quadrants, showing how many moods were stated in which quadrant (right top corner)

Cases 3–4: Two Field Studies

The MoodMap App and the social practice around it were tested in two field studies. One study (Case 3) was set in virtual team meetings (Fessl et al., 2012) in a distributed Europe-centered team, where the MoodMap App was used to support side-channel communication in virtual team meetings, with ambiguous results. A subsequent field study (Case 4) was set in four business-2-business call-center teams (Rivera-Pelayo et al., 2017) in the same international organization, again with a European focus. Here, the MoodMap App was again used to support side-channel communication and intra-team awareness (including the team managers), with positive impact shown for the two of the four teams who accepted the intervention (ibid).

The MoodMap App was improved and adapted in user-centered design iterations prior to the longer-term field studies in both settings because the study settings were deemed to be sufficiently different. Consequently, the MoodMap App versions tested in the two field studies differ in terms of maturity and configuration, such as whether adding contextual information to mood entries was mandatory, and whether and which pre-defined contextual descriptions were available. Most relevant to the discussion in the present paper, the MoodMap App in the virtual team meeting setting anonymized user names in the collaborative view, while the MoodMap App in the call center setting didn’t. This is discussed further in the following subsection.

Emerging Data-Related Ethics Issues

Initially, in preparation for potential privacy issues, all collaborative views in the MoodMap App had been designed to be anonymous. The field study carried out in the virtual team meetings (Case 3, Fessl et al., 2012) used this anonymous version. However, this impacted the usefulness of this view, as knowing the author of a mood entry is necessary in order to be able to react ad personam.

In the call center setting (Case 4, Rivera-Pelayo et al., 2017), based on preliminary design activities with a subset of target users, it was decided to show the author of each mood entry. In this field study and resulting publication (ibid), issues of privacy were explicitly investigated as a factor related to technology acceptance and subsequently benefit. Analysis of logged mood data showed a balanced distribution of positive and negative mood entries. Explicit questions about how comfortable users were with sharing the self-tracked data elicited an overall positive response. Nonetheless, even here, three out of 29 study participants reported not being comfortable with data sharing. In parallel, post-hoc study results showed that users were particularly interested in the mood entries of others. We interpret these results to mean that the data sharing aspect contributed to the value of the MoodMap App, and thereby supported its uptake. On the other hand, verbal elaborations of mood data were few, and very coarse.

Overall, we interpret the positive attitude towards sharing data in both cases as stemming from multiple characteristics of the investigated socio-technical intervention. In particular, for Case 4, we have interpreted these in Rivera-Pelayo et al. (2017) as follows: Firstly, the goals of the intervention, including data collection, were aligned both with individual goals like getting support in the case of difficult customer cases, and with shared goals in terms of team performance. This relates to Theme 3 of this paper on learning as a (non-)shared goal in workplaces: Continued learning of business-2-business call-takers isn’t a goal for any specific calling customer because the positive effect will not benefit them, but rather future customers. However, in Case 4, the positive effects were experienced immediately in the call-taking teams. Performance of call handlers is definitely a shared goal of individual professionals, their immediate management, and the wider organization. Secondly, in Case 4, there were no concerns that data would impact customers’ confidentiality. This can be understood as relating i) to the coarse granularity of the main data (mood) and to mood statements being only about the learners, not about others and other things; and ii) to the fact that elaboration was verbal and optional, i.e., under the full control of users. This relates to the two ways in which control over data is easier in manual tracking settings than in automatic tracking settings: Firstly, as data is entered manually it can be curated; and secondly, verbal statements can easily be reviewed and assessed with respect to their criticality in terms of privacy or confidentiality (Theme 2 – see Section 5.2).

Discussion

This discussion is structured along the three themes of data-related ethics in informal professional learning that we have developed based on the above cases, and which we develop further in relationship to existing literature.

Beyond the relationship between the themes and existing scientific literature, the themes also relate in particular to the European “General Data Protection Regulation” (GDPR; see Footnote 10). In the context of this paper, we refer to the GDPR in the sense of a practical and coherent baseline of how to ethically deal with data about humans.

Relevant Data is about the Learner, Others, and Other Things

The first theme is that in informal and situated professional learning, relevant data for learning is not only about the learner (Cases 1–4). Instead, data is also about others with whom the learner (the professional!) interacts at work – colleagues, customers or clients. Data may also represent confidential or proprietary information.

Where data is about others than the learner, these others become the data subjects (i.e., the humans about whom the data says something; the term “data subject” is used as it is in the GDPR). It is by now widely agreed, as evidenced also by the GDPR, that data subjects should have particular rights with respect to this data. Under the GDPR for instance, data subjects have the right to access data including the right to get data in a portable format, to demand rectification and deletion of data, and to decide upon what data can be used for (control). These rights exist for data as long as the data subject is identifiable using the data. Ethically, and in some contexts also legally, data collection for informal professional learning may therefore require that many more people than the learner know about the data collection and its purpose, and these people may need to consent to such data collection.

This data-related ethics issue is specific to informal and situated professional learning, in which the ongoing work experience is the source of problems that motivate learning, and the field of application for newly gained knowledge. The reason is that ongoing work experience involves other actors besides the learner (colleagues, customers, clients) as well as information, and this information may be confidential and proprietary. When professionals as learners then reflect on their work experiences, data and other representations of these people and entities become important learning “content” (cp. Müller et al., 2017; Pammer et al., 2017).

Consequently, however, learners and their respective employer organizations are not necessarily owners of data that is relevant for learning, nor do they necessarily have the right to use existing data for learning. Even learners’ own observations and knowledge about the objects of their reflection may be sensitive and confidential, such as a physician’s knowledge about patients, or an engineer’s knowledge about confidential elements of an IPR protected product. Of course, relevant data is also about the learners themselves, and this is then a concern that is shared with any other learning scenario.

The main benefit of collecting and using data for learning, on the other hand, lies primarily with the learners. It lies only by extension with their employer organization who benefits from reflective practice of their employees; or with colleagues, customers and clients who may benefit in the future from improved practice.

As a result, individual professionals may feel that they do not have the power to decide what can, and what cannot, happen with the data; and indeed, that reaching a decision on this may be too complicated, and getting consent too improbable to be worth the bother (Case 1). Individual professionals may decide not to use data-driven technology for learning on an individual basis, but rather rely on technologies assessed and approved by their employer organization. This, in turn, will be more acceptable to professionals if there is sufficient agreement on learning as a shared goal (cp. Theme 3 – Section 5.3 below). A strategy that is, we suspect, the current standard in many organizations is not to re-purpose data collected for business purposes for learning at all.

Despite such data-related ethics issues, AIED technology needs data to develop and train computational (machine learning) models that suitably represent learners, learning domains, and learning contexts. In order to support in-situ professional learning then, training data for algorithms (both for offline and online machine learning) needs to be collected within workplace environments. This data will be about others than the learners, and will be about potentially confidential entities. For this reason, we consider this theme to constitute an important challenge for ethical AIED implementation.

Manual Tracking as a Conduit for User Control

The second theme is an interpretation of the different perceptions of privacy issues that we have observed in the five cases. In those cases with manual tracking (Cases 3 and 4) no substantial privacy concerns appeared, whereas privacy concerns were a consistent issue in the cases related to automatic activity tracking (Cases 1, 2, X). We interpret manual tracking being a conduit for user control as an underlying reason for the distinction. In particular, manual tracking makes it easy for users to control data for privacy and confidentiality issues by embedding curation and review into the data collection process. Further, manually tracked data as basis for reflection and adaptation may have additional benefits in terms of learning.

As a key starting point, we observe that sensitivity and confidentiality were significant issues in designing the activity-log based research prototype for time management. In this setting, study participants didn’t control which data were logged or not. Privacy was not such a substantial issue in the two cases based on collaborative mood self-tracking. Here, all data were entered manually and hence under immediate user control, and further were explicitly intended for usage in the reflective learning environment.

In detail, the field studies differ in many factors (business sector, educational level of target users, individual vs. collaborative tracking, activity log vs. mood log plus elaborations, etc.). This means that it is debatable as to whether manual tracking as a conduit for user control is the key distinguishing difference between the problematic and non-problematic cases. However, the fundamental statement that manual tracking is a conduit for user control is less debatable; and below we go on to discuss other work that has similarly found user control to be important in self-tracking.

From a privacy and confidentiality perspective, manual tracking allows users to control data in the sense that they can immediately curate which private or confidential data not to collect. Especially, users have control over the level of granularity of collected data; and manual tracking favors coarse-granular and sparse data (cp. Nafus & Sherman, 2014), and natural language statements over large amounts of numeric data as delivered by sensors or automatic logging mechanisms. This in turn makes collected data easier to review, actually not only with respect to privacy and confidentiality, but with respect to any perspective that users may choose to take.

This argument, of users being able to curate what is tracked, is typically made against manual tracking, in the sense of users being able to introduce a bias in terms of unwanted subjectivity.

However, also from the viewpoint of learning, manual tracking brings with it an advantage, namely that manually tracked data already constitutes mini-reflections directly within the activity (cp. Fessl et al., 2017; Nafus & Sherman, 2014). Overall, Nafus & Sherman (ibid) describe the “quantified selfers” (people who follow the practice of self-tracking) they study as exerting control in their self-tracking through choosing what to track, switching between different tools, and in particular also how to track and interpret tracked data. Of course, the collected data is fragmented, heterogeneous, and subjective. However, the typical aversion of research against subjectively collected data might bias researchers against user control in data collection more than is necessary. Specifically in learning contexts, such subjectivity could also be understood as an expression of self-regulated and empowered user behavior; and furthermore behavior that allows for individuality and for tool appropriation.

One salient characteristic of professional learning that might help make manual tracking reasonable and useful is that adult learners with a reasonable educational background and competence in self-regulated learning are expected. Both educational background and competence at self-regulation are necessary prerequisites such that manual (and reflective) tracking may lead to learning.

Learning as a (Non-)Shared Goal

As a final consideration, we observe that in Case X, fundamental distrust in the interest and commitment of the organizational environment in supporting learning (and in the power of the researchers to protect collected data against this organizational environment) played a key part in cancelling the field study. In Case 4 on the other hand (collaborative mood tracking in a business-2-business call center), discussions of issues related to current customer cases between colleagues and team managers followed up on tracking within the MoodMap App. These also supported performance, which is definitely a shared goal between individual professionals, team management, and the wider organization. Furthermore, the team culture was obviously open enough for issues to be addressed in such conversations.

The fact that this theme is strongly visible through Case X, where we were NOT able to carry out a field study, expresses a general bias in field studies. In particular where participants voluntarily use novel technology within their operative environments, it is more likely to be the case that those who participate are engaged, motivated, and overall have a functional environment and hence the resources to do “something extra”.

We can relate this theme to the more general observation that professional learning is situated in environments that are not primarily designed for learning, and learning is not an a priori shared goal of stakeholders in informal learning settings such as the workplace. Instead, the major shared goal within an organization is to produce or provide a service. Time spent on learning (e.g., reflection) is potentially an unproductive time with respect to short-term organizational performance (for instance as observed in Pammer-Schindler et al., 2018). Reflective learning also requires openly considering errors, as well as alternatives that may not be the most popular, in line with current best practice, or strategy etc. This is different in educational, formal learning contexts, where learning and personal growth of learners are in principle shared goals of involved stakeholders. This is definitely a characteristic that distinguishes informal learning settings such as the workplace from formal learning settings.

For data collection and sharing, benefits of data collection and usage as bases for reflection or for use as part of the development of AI systems may therefore need to be carefully examined in relation to shared and unshared goals and motivations of involved stakeholders. Where benefit and trust aren’t clear and given, one can be reasonably concerned about achieving positive effects on learning with this data (cp. Bulger, 2016).

In line with learning not necessarily being a shared goal in workplaces, and definitely not being the primary goal of workplace organizations, much data that is already being collected is collected for purposes of performance monitoring, optimization, quality control, etc. There are two implications: At the level of data relevance, the purpose for which data has been collected influences what data is collected and how, which may actually render the available data less useful for educational research. From the viewpoint of ethics, the consideration that data collected in a workplace setting is often not primarily collected for supporting or studying learning relates to Nissenbaum’s (1998) discussion of sharing data across contexts. Nissenbaum (ibid) has stipulated that sharing data outside the context in which it was collected might violate the principle of contextual integrity, which in turn constitutes an ethical (or legal, depending on the legal framework) concern. The GDPR restricts such re-purposing of data: in general, it requires that data subjects be informed about the new purpose and, unless that purpose is compatible with the original one or another legal basis applies, that they consent.

Outlook: Three Ways Forward in Terms of Methods and Technologies

The three themes elaborated in this paper – relevant data being about others, manual tracking as being a conduit for user control, and learning as a non-shared goal in the workplace – are challenges for AIED research to overcome in order to address informal workplace learning and professional learning. They constitute barriers for collecting data that might serve as the basis for developing AI-based models that could serve as the core of AIED technology; and constitute barriers for collecting data during the runtime of AIED technologies. Addressing these challenges is therefore necessary for AIED to move into informal workplace and professional learning as areas for future – potential – impact of AIED.

As a central contribution of this paper, we seek not only to elucidate problems, but also point towards possible solutions. Specifically, below we point towards possible solutions that are already visible in ongoing research within AIED: Which ongoing research directions promise to help address the above identified challenges?

The below discussed directions are certainly not the only ways in which the above challenges could be addressed; nor are they interesting only for professional learning or only to address ethics issues. However, we believe these three directions are useful and supportive of the aim to mitigate data-related ethics issues within AIED research, and hope that they will be investigated as such in future AIED research.

Natural Language Statements as Data Units

In relationship to the above themes around relevant data being about learners, others, and other things; as well as around advantages of manual self-tracking, we see that AIED research is already working on understanding natural language artefacts as documenting learning activities, experiences, and learning outcomes. As examples of such ongoing work within AIED, we see discourse analytics (Clarke et al., 2018; Rosé, 2017), reflection analytics (e.g., Cui et al., 2019; Ullmann, 2019), or any form of learning assessments based on natural language texts (e.g., Rosé et al., 2017) – be that essays, conversations, or briefer statements as e.g., in the context of analyzing peer-2-peer interactions in MOOCs (e.g., Rosé & Ferschke, 2016). These research streams aim to analyze natural language statements as the basis for measuring learning processes or outcomes, and as the basis for system adaptivity such as in Adamson et al. (2014) in the context of conversational agents, or in McNamara et al. (2004) in the context of intelligent tutoring systems for improving reading and comprehension.

Natural language utterances as the data units to be analyzed have two advantages with respect to data-related ethics issues: Firstly, humans are simply very good at expressing themselves in natural language. Consequently, when “data collection” refers to collecting verbal statements made by learners or others, it is easy for the speakers to curate what is being said so as to respect others’ privacy and the confidentiality of information. Secondly, humans are very good at understanding and assessing natural language. Consequently, it is easy for humans to review collected data with respect to its suitability for re-purposing within a learning context in cases where data collection was initially for another purpose. Such review might be necessary from both a content-wise and an ethical perspective, if existing data is re-purposed. In addition, self-tracking as a part of reflective practice has an added benefit, as tracking then already constitutes conscious acts of mini-reflection.

A substantial technical challenge in using natural language artefacts as a basis for understanding learning activities, experiences, and outcomes, is succeeding with natural language processing itself. Natural Language Processing is a huge and currently highly active research field on its own. A methodological challenge in the direction of reflective self-tracking is to design a meaningful reflective practice around self-tracking in light of the capabilities of this technology.

This research direction speaks in particular to Themes 1 (Relevant data is about the learner, others, and other things) and 2 (Manual tracking as a conduit for user control over data), as natural language statements as data units make it easy to review data for privacy and confidentiality issues.

Ethics-Aware Socio-Technical Design

By socio-technical design approaches we understand approaches that operationalize the realization that both technology and practice inter-relate and influence each other when technology is used in practice (cp. Dennerlein et al., 2020; Ropohl, 1999; Scacchi, 2004). This perspective has been argued and taken up in all manners of learning contexts (e.g., Buckingham Shum et al., 2019; Dillenbourg et al., 2011; Holstein et al., 2019; Holstein et al., 2020; Littlejohn & Pammer-Schindler, forthcoming). Following this understanding, design and in particular the resolution of problematic issues can happen at three levels: Developing and adapting technology, individual people (in the sense of education or reflection), and organizational practice and culture. For all above described field studies Cases 1–4, both technology and reflective practice surrounding it were developed (iteratively, prior to the above field studies).

Two particularly important means to embed this understanding of socio-technical design into the development process of technology are: Firstly, it is important to involve or consider different stakeholders in technology design, in particular in order to consider the different interests and goals of stakeholders, and the different and dynamic trust relationships involved. This is the basis for identifying which data should be collected, and for which purposes. Fears and concerns in relation to data usage, requirements on who to share data with and under which conditions (e.g., to make analytics accessible only to the teacher, not the school, unless reasonable anonymity could be guaranteed) would then become issues to be raised and discussed already at design time, but hopefully minimized during the use of such interventions. Secondly, it is important to test technology in the field, as the socio-technical system – which is now understood to be the relevant unit of analysis – can be fully observed within complex social settings.

Experiments, as a traditionally respected method for providing scientific value and validity, play a role in between these phases. Thus, experiments have a place in ethical systems design in order to ensure that technology does what it has been designed to do. Complementing this, socio-technical design approaches ensure that these goals are suitably chosen with respect to the overall socio-technical system within which technology is embedded; and allow for integration of ethics as a particular focus.

Overall, socio-technical design in principle isn’t new, but the emphasis on considering data-related ethical issues in relationship to powerful AI is new, as AI capabilities, the amount of data, and hopes to create value out of data are all increasing. Hence, systematically considering ethics in socio-technical systems design is new. Consequently, methods from socio-technical systems design need to be inter-linked with research methods like algorithmic and experimental research methods that are more firmly accepted in AIED. Challenges in this direction are therefore both methodological and technological (such as modelling and implementing trust relationships and conditional consent for data sharing within systems). On the methodological side, some work is already ongoing for the purpose of extending existing methods specifically with respect to ethics issues. For instance, Dennerlein et al. (2020) have proposed a framework for introducing reflection on ethical issues within the design process of educational technology (albeit not specifically for AIED) that centers on identifying different roles in educational technology design, and related different responsibilities with respect to technological components.
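To make the technological side of this challenge more concrete, conditional consent for data sharing could, for example, be represented as explicit rules that a system checks before showing data. The following is a minimal sketch under strong assumptions, not a proposal from the cited work: the `ConsentRule` type, the role and purpose names, and the use of a minimum group size as a stand-in for "reasonable anonymity" are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ConsentRule:
    """Hypothetical consent given by one data subject."""
    audience: str             # role allowed to see the data, e.g. "teacher"
    purposes: FrozenSet[str]  # purposes the data subject has agreed to
    min_group_size: int = 1   # > 1: only share when aggregated over this many people


def may_share(rules: Iterable[ConsentRule], audience: str, purpose: str,
              group_size: int = 1) -> bool:
    """Return True if at least one consent rule covers this sharing request."""
    return any(
        rule.audience == audience
        and purpose in rule.purposes
        and group_size >= rule.min_group_size
        for rule in rules
    )


# Example: analytics are visible to the teacher individually, but to the
# school only when aggregated over at least ten people (assumed threshold).
rules = [
    ConsentRule("teacher", frozenset({"reflection"})),
    ConsentRule("school", frozenset({"reflection"}), min_group_size=10),
]
print(may_share(rules, "teacher", "reflection"))                # True
print(may_share(rules, "school", "reflection", group_size=3))   # False
print(may_share(rules, "school", "reflection", group_size=12))  # True
```

Such a rule set only captures static conditions; the dynamic trust relationships discussed above would require rules that stakeholders can inspect and revise over time.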

This research direction addresses in particular Theme 1 (Relevant data is about the learner, others, and other things) as a motivation for exploring socio-technical design as a solution; and Theme 3 (Learning as a non-shared goal of stakeholders in workplace environments) by giving space for explicitly discussing and agreeing on learning (and limits to it) within workplace environments.

Scenario-Based Data Collection in Instrumented Labs

A further research direction is emerging around extending the space between classical experiments (little ecological validity, high control) and field studies (high ecological validity, low control) by creating ecologically valid and rich scenarios that are further studied in laboratory settings. The idea is to create simulated workplace laboratories for running controlled studies as a preparation for AIED development and deployment studies in the workplace.

In these simulated workplace laboratories, scenarios are the starting point for study participants to act. Study participants’ actions are tracked in detail based on the available sensors. The rationale is to substantially reuse valuable data collected from real workplaces; to remove ethically questionable references from data for scenarios, and to carry out substantial experimentation before inserting novel technology into real workplaces in field studies. As a prerequisite for such scenario-based simulation to produce data of some ecological validity and at the same time to address data-related ethics issues, scenarios need to be based on detailed data from field studies (including manually and automatically tracked data); with rigorous post-processing and curation of data to remove private and confidential elements. Data collection in such scenario-based simulations in instrumented labs is then efficient, ethically safer than data collection in field studies, and still provides some ecological validity.

Similar approaches have already proven useful for furthering research on support for military operations (e.g., Warner et al., 2003). For learning analytics as a specific technology, Holstein et al. (2020) have used historic data as part of user enactments of learning scenarios in order to elicit user feedback related to technology (in this case based on data analytics) that users aren’t familiar with.

We have ourselves taken steps in this direction: We have developed an infrastructure referred to as the “Smart Office Space”, which is set up to operate like a software development workspace. Overall, this set-up has already been put together into a coherent demonstration of how these technologies can be used in a lab setting to monitor and support workplace relevant activities, especially with respect to face-to-face collaborative work (Wang et al., 2020). The room has been instrumented with a variety of sensors including four Lorex 4K cameras with microphones, an Intel RealSense depth-sensing camera, a Kinect camera with a microphone array, and an AWS DeepLens camera. Key components include the Microsoft Platform for Situated Intelligence (PSI) (Bohus et al., 2017) for coordination across datastreams, CMU Sphinx (Lamere et al., 2003) and the Azure Speech Recognizer for speech recognition, the USC Institute for Creative Technologies Virtual Human Toolkit (VHT) to present an embodied conversational agent (Hartholt et al., 2013), OpenFace for face recognition (Amos et al., 2016), OpenPose for sensing body movement and positioning (Cao et al., 2017), and the Bazaar architecture for sensing collaboration-relevant events and triggering support for collaboration in response (Adamson et al., 2014).

Data collection in a simulation makes sense even if such data cannot (yet) be collected in a real workplace: Firstly, to have more data during design time, following the understanding that during technology development, more fine-granular data than at runtime might be helpful. Secondly, as technology advances, it is possible that ten years from now much more data will be collected in real workplaces than now. By equipping office-like labs with sensors that aren’t state of the art in real workplaces, we can look ahead and prepare for such futures.

As a basis for making full use of such instrumented simulation environments as methodological tools for bringing AIED into informal and situated professional learning, it will be necessary to develop good scenarios as a starting point for professional learning simulations. Only then can large-scale and controlled experiments be conducted that combine ecological validity with high control over the situation without intruding on professionals’ privacy, and only then can data be collected as a basis for machine learning models that is to some extent ecologically valid. It will be a major challenge to develop sufficiently generalizable scenarios for informal and professional learning, such that sensing data from scenario-based simulations generalizes sufficiently well to real-world scenarios. However, if this is achieved, scenario-based data collection in instrumented labs could reduce (probably not eliminate) the need for AIED research to collect automatically sensed, fine-granular and privacy-intruding data in real workplaces.

This research direction addresses in particular Theme 1 (Relevant data is about the learner, others, and other things) as a motivation for exploring ways to move the data collection necessary to develop new AI-based systems out of the field and into the laboratory, while expanding the ecological validity of laboratory experiments.

Conclusion

In this paper, we develop what we see as salient characteristics of informal and situated professional learning with respect to ethics issues in data collection. Firstly, in informal professional learning, data that is relevant for reflection and as a basis for system adaptivity can represent not only the learner, but also others (colleagues, customers, clients), and other entities, which may be confidential. Secondly, manual tracking is a conduit to increasing user control over data. Thirdly, learning is not necessarily a shared goal in situated professional learning settings. Development of these three themes is the first part of this paper’s contribution.

However, AIED research relies on data, which traditionally has been fine-grained automatic log data, often captured within dedicated learning environments. Further, AIED aims as a community at the development of technologies that respect and address ethics issues. Consequently, the three above themes – data is about others, manual tracking is a conduit for user control, learning as a non-shared goal – constitute challenges that need to be navigated in order not to become barriers for AIED uptake in professional learning.

However, research directions already exist within the community that we believe could be used to address these challenges. As a second part of this paper’s contribution, we therefore formulate a vision for the way AIED research might address these challenges. Note that we do not argue that these research directions already fully address data-related ethics challenges, but we see them as holding the promise to do so. Firstly, we see that AIED research already actively investigates understanding natural language artefacts as representations of learning activities, experiences, and outcomes. As such, human-readable artefacts become important data within AIED systems. Supporting both the creation and the review of such data with automated curation would make it easier for the humans in the loop to respect privacy and confidentiality. This poses technical challenges, especially in natural language processing, and a socio-technical challenge in designing reflective workplace learning practices of which self-tracking is one part. Secondly, ethical issues can be considered in a holistic manner in socio-technical design approaches to developing and evaluating (AI-enabled) educational technology. This poses methodological challenges in that specific support will need to be developed to consider ethics in connection with these approaches, and to communicate potential ethics issues to stakeholders. Thirdly, the value of data, once it has been collected in informal professional learning settings, should be maximized. This could be accomplished by developing workplace learning scenarios that can then be enacted in laboratory environments that faithfully simulate a workplace. Scenario-based experiments in such instrumented laboratory settings then constitute a hybrid method that collects fine-grained automatic log data without raising data-related ethics issues, while still preserving some ecological validity. Such scenario-based simulation still needs to be fully developed as a methodology; in particular, adequate scenarios for professional learning will need to be developed.