The possibilities and limits of XAI in education: a socio-technical perspective

ABSTRACT Explicable AI in education (XAIED) has been proposed as a way to improve trust and ethical practice in algorithmic education. Based on a critical review of the literature, this paper argues that XAI should be understood as part of a wider socio-technical turn in AI. The socio-technical perspective indicates that explicability is a relative term. Consequently, XAIED mediation strategies need to be developed and implemented across education stakeholder communities using language that is not just ‘explicable’ from an expert or technical standpoint, but explainable and interpretable to a range of stakeholders including learners. The discussion considers the impact of XAIED on several educational stakeholder types in light of the transparency of algorithms and the approach taken to explanation. Problematising the propositions of XAIED shows that XAI is not a full solution to the issues raised by AI, but a beginning and necessary precondition for meaningful discourse about possible futures.


Introduction
AI was predicted to disrupt human society and productivity as an aspect of the '4th Industrial Revolution' (Schwab 2016; Timms 2016) and the effects of this are already being observed in education. The pace of AI uptake is increasing, and 2023 sees an explosion of interest in language-based tools like ChatGPT (OpenAI 2023) while AI tools for large scale learning are also being developed (Kiecza 2022). According to the AI in Education Market Research Report (Market Research Future 2020), the global market reached $1.1 billion in 2019 and is predicted to generate $25.7 billion in 2030. Statista (2020) estimates the AI market as a whole will be worth $126 billion by 2025. Contemporary applications of AIED include adaptive learning systems, tailored assessments, automated feedback and tutoring tools, and learning analytics dashboards (Khosravi et al. 2022).
The Covid-19 crisis catalysed uptake of learning management systems, incentivizing higher education institutions to move towards online learning and automation, though AI tools evidently did not prove to be especially useful during the pandemic (Heaven 2021). One high profile use of algorithms in education during this time in the UK saw automated grading of the General Certificate of Education (GCE) Advanced (A) Level exams when in-person exams could not take place (Ehsan et al. 2021). Huge outcry among educators, learners and institutions over the perceived unfairness of grades allocated resulted in a government u-turn and the resignation of the chief executive of the UK exams regulator Ofqual. The UK Prime Minister blamed a 'mutant algorithm' for the debacle (Everett 2021). AI proliferation is thus often presented as progress despite falling short of targeted or imputed standards (Baur 2020; Chatfield 2020).
There is a growing awareness of the profound ethical implications of AIED. AI is seen as a potential route to boosting job markets, lifelong learning and democratic participation but is also open to heinous misuse (European Parliament 2022; European Commission 2018). Algorithmic bias has been the focus of many critiques of AI (e.g., Baker and Hawn 2021; Birhane et al. 2022; Noble 2018; Samuel 2021; Wachter forthcoming; Zuboff 2019). Bulathwela et al. (2021, 6) found in their review that 'AI will impact education greatly. However, virtually no research has been undertaken, no guidelines have been agreed, no policies have been developed, and no regulations have been enacted to address the use of AI in education'.
There is consequently much debate on how to manage the risks that are potentially introduced to democracy and accountability in teaching and learning systems. The emerging consensus is that there needs to be adequate transparency and explicability for the use of algorithms (Floridi, Cowls, and Beltrametti 2018; Gunning et al. 2019; Kiourti et al. 2019; Panigutti, Perotti, and Pedreschi 2020). Explicability is intended to make it easier to reconstruct actions taken by AI programs and to show who might be responsible for consequences. The three distinctive features of XAI are 'algorithmic transparency; explainer generalizability; and explanation granularity' (Antoniadi et al. 2021). However, there are few detailed descriptions of what this will look like or aspire to be in educational contexts (e.g., Khosravi et al. 2022). The goal of this paper is to understand the nature of XAIED, to determine what might make it effective, and to identify any ethical or practical limits to such transparency in teaching and learning processes.

Materials and methods
The claims of this paper are based on a thematic literature search at the intersection of several disciplines relating to XAIED. A purposive, emergent snowballing approach (Wohlin 2014; Lecy and Beatty 2012) was used to compile resources, supplemented by keyword searches on Google Scholar (n.d.) and question queries submitted to the Elicit (n.d.) database. 58 items published in 2020 or later were selected for review. Additional relevant references were drawn from these and added to the dataset. The total number of resources consulted was 102. The method of presentation below is summative, thematic, synthetic, reflective and analytical. No statistical claim is made about the choice of literature, which was guided by inquiry. The review took place between October 2021 and October 2022.

Artificial intelligence in education (AIED)
Thousands of institutions are already using AI technologies to shape and plan the delivery of education (Zawacki-Richter et al. 2019; Luckin et al. 2016; Dignum 2021). AIED is often presented as a pragmatic tool which simply delivers existing tasks more efficiently, and therefore has benefits for both learners and educators. The conviction that AI presents a route to improving many services associated with teaching and learning is a clear driver of activity and reflects the optimistic view that innovation in techniques like machine learning and deep learning will lead to tangible benefits in practice. As an extension of the move towards digitalisation in higher education institutions (Orr, Weller, and Farrow 2018) the use of AI has become a focus for innovation and competitive edge (Khosravi et al. 2022). Applications of algorithmic intelligence are anticipated in areas such as profiling learners; intelligent tutoring systems; assessment; evaluation; adaptive systems and personalised learning (Luckin et al. 2016). Natural language processing can be used to connect learners and educators with relevant information in a more timely way. Personalisation (Fiok et al. 2022) can draw on data external to and generated by the learner to suggest interventions.
Automated models are being built to analyse the social and emotional moods of learners; provide feedback; create authentic learning simulations and offer personal support through AI tutoring, writing assistance, and chatbots (Sharples and Pérez y Pérez 2022). Educators can be supported by delegating administrative tasks to machines, freeing time for more creative activity. Algorithmic data mining has been shown to produce an increase in student enrolment of more than 20% and thus a significant uplift in revenue (Aulck, Nambi, and West 2019). Thus, the strategic value of AI in education is only partly determined by a focus on learning and teaching.
Many of the anticipated uses of AIED rely on the assumption that mass data collection and analysis will take place. This can include data about learner progress through a virtual learning environment and which pedagogical approaches have been most effective for different learner profiles, but it can also include tracking biometric data, taking voice samples, and using eye-tracking software (Luckin et al. 2016, 34). Already there is considerable reliance on the use of controversial tracking technologies in proctoring and assessment (Coghlan, Miller, and Paterson 2021). Institutional planning is increasingly data-driven and based on harvesting increasing amounts of information from virtual learning environments and combining these with other data sets as an expanded neural network. Beetham et al. (2022, 18) describe the key aspects of surveillance in higher education as 'the rendering of student and educator activities as behaviours that can be "datafied"; inequalities of power that exist between data owners/companies and the people whose data is being collected, analysed, managed and shared; the insertion and intensification of data-based and data-generating digital platforms into the core activities of universities, and the normalisation of vendor-university relationships'. There is no way to separate the use of analytics from surveillance. Even so, the scale and penetration of machine learning data collection can be unsettling: a recent study found that 146 of 164 EdTech products recommended, mandated or procured by governments during the Covid-19 pandemic harvested the data of millions of children (Human Rights Watch 2022).
As AIED becomes increasingly mainstream, attention is shifting from the technical to the sociotechnical perspective. The majority of legacy AIED literature is based in quantitative computer science and there is little expertise in AI in the humanities (Zawacki-Richter et al. 2019; Dignum 2021), leading to calls for greater interdisciplinarity in AI (Gilpin et al. 2018; Dignum 2021). More generally, differences in contexts of application complicate attempts to assess the impact of AI as a whole. Xuesong et al. (2021) suggest a threefold categorisation of the challenges facing AIED: firstly, those arising from the attempt to apply AI techniques from one context of application into another; secondly, the disruptive effects on the traditional roles and activities of learner and teacher; and thirdly, the wider social impacts that can emerge when things go wrong (such as the inappropriate exposure or use of data).

The 'Black box' problem
Pasquale (2020, 225) has described how advanced socio-technical systems can appear 'humanly inexplicable' or even 'magical'. The key structural feature of the 'black box' model of computation is the non-transparency of the processes and workings that convert input to output. Tjoa and Guan (2021) find that 'the black box nature of [deep learning] is still unresolved, and many machine decisions are still poorly understood'. Machine learning has made little progress with representing higher order thoughts, higher levels of abstraction, being creative with language, or 'common sense' (Russell and Norvig 2021). Dramatic progress has been made in recent years with respect to functional or "weak" applications using natural language processing, many of which are branded in the unrestrained language of AI marketing. Guidotti et al. (2018) propose a universal typology for understanding issues around 'black box' computation: the model explanation problem; the outcome explanation problem; the inspection problem; and the transparent box design problem. These vary based on the specific explanation problem addressed, the type of explanator adopted, the black box model opened, and the type of data used as input by the black box model. Markus, Kors, and Rijnbeek (2021) similarly propose three types of explanations: model-based explanations (where a simplified model is presented to explain the workings of the AI model), attribution-based explanations (which explain the task model in terms of input features), and example-based explanations (which involve looking at specific instances or cases to explain how a model works or doesn't work).
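To make two of the explanation types described by Markus, Kors, and Rijnbeek (2021) more concrete, the following is a minimal illustrative sketch rather than a method drawn from the literature reviewed. It assumes an invented 'at risk' classifier (`black_box`), an invented one-line surrogate rule, and a handful of made-up learner cases; all names, weights and thresholds are hypothetical.

```python
# Hypothetical opaque model: flags a learner as "at risk" (True/False)
# from weekly study hours and a quiz score in [0, 1]. The weights are
# invented and stand in for any model whose internals are not shown.
def black_box(hours, quiz):
    return (0.3 * hours + 5.0 * quiz) < 3.5

# Invented cases: (hours, quiz)
CASES = [(1, 0.4), (4, 0.7), (6, 0.9), (3, 0.5), (2, 0.3), (5, 0.45)]

# Model-based explanation: a simplified rule offered in place of the model.
def surrogate(hours, quiz):
    return quiz < 0.5

def fidelity(cases):
    """Share of cases where the simple rule agrees with the black box."""
    agree = sum(black_box(*c) == surrogate(*c) for c in cases)
    return agree / len(cases)

# Example-based explanation: the most similar other case that received
# the same outcome. Quiz differences are scaled by 10 (an arbitrary
# choice) so that both features contribute to the distance.
def nearest_example(query, cases):
    same = [c for c in cases
            if c != query and black_box(*c) == black_box(*query)]
    return min(same, key=lambda c: (c[0] - query[0]) ** 2
               + (10 * (c[1] - query[1])) ** 2)

if __name__ == "__main__":
    print("surrogate fidelity:", fidelity(CASES))
    print("case similar to (1, 0.4):", nearest_example((1, 0.4), CASES))
```

On these invented cases the simple rule agrees with the opaque model in four of six instances, which illustrates the trade-off at stake: the easier explanation is only partly faithful to what the model actually does.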
Páez (2019) supports the idea that interpretative models present the best route to understanding, but the purely functional approach doesn't really explain the actual XAI part at all: 'The task ahead for XAI is thus to fulfil the double desiderata of finding the right fit between the interpretative and the black box model, and to design interpretative models and devices that are easily understood by the intended users.' (Table 1)

The explicability turn
XAI addresses four traditional moral principles (beneficence; non-maleficence; autonomy; and justice) through two key questions: how does [the algorithm] work? and who is responsible for the way it works? Through greater accountability and legibility, Floridi, Cowls, and Beltrametti (2018) anticipate more open ethical deliberations supported by training more engineers in ethical and legal perspectives, new qualification programmes in the ethics of AI, greater public awareness of AI, and promotion of computer science. From this perspective, XAI is a retort to the 'black box' problem which responds with transparency to foster trust (Hanif, Zhang, and Wood 2021).
Notably, not all agree that XAI is a solution. Robbins (2019) argues that many uses for AI are low risk and don't require explication; in some cases XAI could prevent the advantages of AI being realised. According to this view "a principle of explicability for AI makes the use of AI redundant" because it is not the algorithm (process) or designer/decision maker but the underlying principle that determines ethical value (ibid.). Jiang, Kahai, and Yang (2022) further argue that XAI can overwhelm and introduce epistemic uncertainty. There remains considerable debate and ambiguity around terms like explicability, explainability, interpretability, comprehensibility, intelligibility, transparency, and understandability. Some (e.g., Páez 2019) consequently argue that explicability remains a vague and under-theorised term with no definitive meaning. Nonetheless, XAI remains the most common response to criticisms of algorithmic bias, unwanted impacts, and lack of scrutiny. Adadi and Berrada (2018) propose that '[e]xplainability provides insights to a targeted audience to fulfil a need, whereas interpretability is the degree to which the provided insights can make sense for the targeted audience's domain knowledge.' XAI cannot result in a single form of explanation since different stakeholders require different kinds of explanations which are commensurate with their own baseline understanding and ability to interpret. In the case of education this means providing XAI at the most generalisable level (Antoniadi et al. 2021) although this might look different for learners, educators and developers, for instance. The key consideration for XAI is the question 'what makes for a good explanation?' (Mueller et al. 2019) but good explanations are relative. A simple distinction here could differentiate the domain of technical expertise from the knowledge of the layperson.
Markus, Kors, and Rijnbeek (2021) suggest that the quintessential XAI distinction is between those accounts which emphasise intelligibility to a human and those which faithfully reconstruct and represent the tasks performed by an algorithm (Figure 2). This typology distinguishes interpretability, which is human readable, from fidelity, which is the accurate, technical description of what happens in the 'black box'. The technical explanation of an algorithm might include things like exploratory or statistical analysis; evaluation of machine learning models; periodic iterations of concepts and validation of results; user testing; and producing documentation for datasets and models. For stakeholders lacking expert knowledge such transparency presumably has limited value without simplified explanations or a trusted broker who can interpret on their behalf. As Khosravi et al. (2022) note, this is particularly apt in the case of educational administrators, institutional leaders and legal officers who have responsibility for governance. Bloch-Wehba (2020) thus argues for greater transparency in the use of automated systems of governance. Tutt (2020) similarly contends that algorithms should be directly regulated by new governmental agencies which work in partnership with industry to develop common standards of acceptable practice. XAI supports the uptake and operation of machine learning in education since non-transparency negatively affects trust (Hanif, Zhang, and Wood 2021).

Socio-technical perspectives on XAIED
Birhane et al. (2022) argue that although AI ethics is a rapidly growing field it cannot keep pace with the rapid development and rollout of AI systems into all parts of society, and as a result most work in this area is shallow. They describe AI ethics as characterised by agnosticism about existing forms of oppression and insufficiently focused on the structures and institutions that perpetuate inequality. Hickok (2021) also calls for greater diversity amid a need to progress from high-level abstractions and concepts in favour of applied ethics which establish accountability. Chatfield (2020) similarly points out that we can't think about the ethics of AI distinctly from the ethics of our society.
Attempting to fully understand the socio-technical scale of AI implementations is challenging. Crawford and Joler (2018) have described the interconnected nature of such systems through primary production and processing of raw materials; manufacturing; logistics; assembly; data preparation; programming; AI training; infrastructure, platformisation, user interfaces; and devices. Each stage involves various forms of human labour (much of which is ethically questionable though 'invisible' to the end user). To focus on the AI-user dichotomy is to overlook many socio-technical and context-dependent aspects (Vera Liao, Gruen, and Miller 2020). Antoniadi et al. (2021) reviewed 121 papers, finding that explainability is an important part of building trust in AI systems but that introducing XAI features can add significantly to the cost of systems. They found that there is a significant amount of work to be done in studying applications of XAI in ethically important contexts (such as medicine). Notably, the bigger the dataset (and AI requires ever bigger datasets), the less connection there is to the individual. We are increasingly affected by algorithms which we have not intentionally engaged with: shadow profiling is common on social networks and among advertisers, and interlinked systems sharing data means that isolating systems is difficult. Viljoen (2021, 37) notes that such 'horizontal' data relations within our technological infrastructures are designed to facilitate and monetise data flows rather than regulate responsibilities or prevent injustices. There is also a need for human data curation to support machine learning which can lead to exploitation of the most marginalised who are most at risk when algorithmic systems fail (Hao and Hernández 2022; Birhane et al. 2022; Ricaurte 2022; Carman and Rosman 2020).
Accordingly, Ehsan et al. (2021) propose the concept of social transparency for XAI. This approach adjusts the algorithmic centrality of AI decision-making towards 'a socio-technically informed perspective that incorporates the socio-organizational context'. AI systems can be understood as human-AI assemblages which are already socio-technically embedded. Hence, a socially situated XAI needs to prioritise the complexity of human-AI assemblage over technical solutionism. Selbst et al. (2018) identify five 'traps' for AI systems that fail to adequately recognise the socio-technical context for AI decision-making (Table 2).
Socio-technical approaches inherently acknowledge the range of stakeholder perspectives. For instance, Prinsloo, Slade, and Khalil (2022) propose a cautious, non-binary, granular approach to human-algorithmic decision-making across areas like admissions, student support, pedagogy and assessment based on specific conditions and contexts. Hu et al. (2021) propose an XAI toolkit (XAITK) which comprises an open-source collection of XAI tools and resources which can be applied across multiple domains and systems. This approach emphasises transparency and greater sharing of data across disciplines and domains of application. Similarly, the XAI-ED Framework (Khosravi et al. 2022) consists of critical questions about stakeholders, XAI benefits and user experience, approach to AI explanation, and pitfalls/risks. Here the recommendation is to distinguish the global and local forms of explanation that use proxies which are less complex to understand: global forms explain the entire AI model while local forms relate to individual predictions, and each is associated with particular mathematical models. The XAI-ED model confers a flexible lens on issues of XAI and suggests pragmatic routes to aligning different stakeholder groups with appropriate proxies and communication strategies (see Figure 3).
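The distinction between local and global explanation can be illustrated with a small sketch. This is not the XAI-ED framework's own method or any particular library's API: it assumes an invented scoring function (`black_box`), three hypothetical learner features, and a simple perturbation proxy in which each feature is replaced by the cohort average to see how much the prediction changes.

```python
from statistics import mean

FEATURES = ("hours", "quiz", "forum_posts")

# Hypothetical opaque model returning a score in [0, 1]. The weights and
# the threshold rule are invented purely for illustration.
def black_box(hours, quiz, forum_posts):
    score = 0.05 * hours + 0.6 * quiz + 0.01 * forum_posts
    if hours < 2:  # a non-linear rule a stakeholder might not expect
        score -= 0.2
    return max(0.0, min(1.0, score))

# Invented cohort of learners.
LEARNERS = [
    {"hours": 1, "quiz": 0.4, "forum_posts": 0},
    {"hours": 4, "quiz": 0.7, "forum_posts": 5},
    {"hours": 6, "quiz": 0.9, "forum_posts": 2},
    {"hours": 3, "quiz": 0.5, "forum_posts": 10},
]

def baseline(learners):
    """Cohort average for each feature, used as the reference point."""
    return {f: mean(l[f] for l in learners) for f in FEATURES}

def local_explanation(learner, base):
    """Local form: for ONE prediction, how far each feature moves the
    score away from what it would be at the cohort average."""
    full = black_box(**learner)
    return {f: full - black_box(**dict(learner, **{f: base[f]}))
            for f in FEATURES}

def global_explanation(learners):
    """Global form: average size of each feature's effect across the
    whole cohort, summarising the model rather than one decision."""
    base = baseline(learners)
    per_learner = [local_explanation(l, base) for l in learners]
    return {f: mean(abs(e[f]) for e in per_learner) for f in FEATURES}

if __name__ == "__main__":
    print("local :", local_explanation(LEARNERS[0], baseline(LEARNERS)))
    print("global:", global_explanation(LEARNERS))
```

A learner might be shown only the local output for their own record, while an auditor or administrator might be more interested in the global summary; the same underlying model supports both proxies.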
Systems of feedback and evaluation are needed for understanding the impact of AI. Morley et al. (2021) argue that existing approaches to closing the gap between ethical theory and the practical design of AI systems are ineffective, meaning that regular re-evaluation of AI systems is necessary. Markus, Kors, and Rijnbeek (2021) recommend that trust in XAI be built through reporting data quality (so that issues around bias and low quality data can be explored); performing extensive, external evaluation (to interrogate and optimise models); and through regulation. One potential solution is to develop regularised approaches to assessing the impact via a combination of experts and public scrutiny (Moss et al. 2021). Crucially, these audits could typically take place before implementation and use public transparency to ensure further accountability.

Discussion: the possibilities and limits of XAIED
XAI should help educators to understand the algorithms that will influence their practice as AIED becomes more common. Similarly, learners stand to benefit from XAI when it helps them to comprehend how decisions are made that affect their learning with AIED. Other stakeholders involved in educational processes (managers, administrators, technicians, librarians, designers, etc.) are also potentially empowered. An explainable account of the same AIED system might look quite different from these alternative perspectives.
Educators and learners are likely to use different tools and services within AIED. Learners might use adaptive learning management systems, augmented interfaces, and receive support from chatbots or intelligent tutoring systems (ITS). Educators might make use of automated assessment, plagiarism checkers and administrative tools, as well as reviewing dashboards of predictive analytics. Institutions can use an overview of the data to monitor, manage and plan activity. We can consider each of these XAIED perspectives by way of Antoniadi et al. (2021) as 'algorithmic transparency; explainer generalizability; and explanation granularity'. As pedagogical experts, educators have an interest in a high level of AIED explicability and need to have a good awareness of the role of AIED in the design of learning. Finer granularity of explanation might be needed where AIED plays a more central role, but educators should be able to explain AIED processes. By contrast, the learner might only require a simple-to-understand model for the role of AI in learning systems, but this may limit algorithmic transparency. Providing a detailed account of how the algorithm influences the learning process might also influence how a learner behaves. This could be a distraction from the authentic learning process, or could even prompt attempts to manipulate algorithms. Tong et al. (2021) found that while AI feedback can be of high quality it can be perceived negatively by learners.
Having human educators deliver AIED feedback may be beneficial to learning but also potentially limits explicability. Many traditional pedagogies rely on a degree of authority and are rarely fully transparent. XAIED threatens to disrupt traditional pedagogical structures by laying bare aspects of the learning process, especially at scale. From a learner's point of view there can be a benefit to 'forgetting' past performance and not being judged by previous performance (Luckin et al. 2016). The demands that AIED systems will make of future learners remain underexplored. If AI systems require learner data to be effective, will learners be permitted to withhold their data? Making sense of recommendations made to learners will require some understanding of how such computations work and a degree of critical reflectiveness. Failure to ensure that learners have these skills risks another form of the digital divide. Similarly, little attention has been paid to the demands that AIED enhanced systems will make of learners and how they will acquire the required skills in areas like communication, self-assessment, reflection, remote work and self-management.
There is every indication that, as a result of bias, AI algorithms do more to exacerbate structural inequality than to act as a corrective. Furthermore, personalised learning threatens to exacerbate inequality in educational experience. The solution proposed by Bulathwela et al. (2021, 7) is that we embrace diversity and dialogue to 'collectively design a global education revolution that will help us solve educational inequity' by addressing the political and social context which engenders unequal access to quality education. AIED can contribute to this, but not through the typical forms of AI solutionism where every machine learning issue is 'solved' through more machine learning (Chatfield 2020). Maintaining explicability as a principle of organisation encourages participation and balancing dialogue around human-AI assemblages, but XAI alone cannot engage with underlying socioeconomic conditions.
Pedagogies are rarely fully transparent and so there is a need to retain the possibility of nontransparency to different stakeholders. However, this does not preclude the possibility of making those systems and algorithms transparent for auditing purposes or external examination. Making exemptions subject to scrutiny from an expert regulator would constitute a limited form of transparency that could protect stakeholders. A key goal of such audits would be to minimise the differences between XAI descriptions for various stakeholders in the presentation of socio-technical AI systems. Controlled sharing could allow audit information to be shared selectively with the public while commercially sensitive details remain opaque (Morten 2022). Pasquale (2020, 19) argues that "as soon as algorithms [have] effects in the world, they must be regulated and their programmers subject to ethical and legal responsibility for the harms they cause". However, the inscrutability and complexity of machine learning has impeded attempts to regulate it, and AI lacks an agreed professional code or ethical framework (Crawford 2021, 214-224). Legislative moves are underway. Regulatory force in cases of AIED could include the destruction of algorithmic data, models and algorithms themselves (Kaye 2022). The United Nations has called for a moratorium on the sale and use of AI on the basis of risk to human rights (United Nations 2021). Expressing concerns about the application of AI tools to areas like law enforcement, national security, criminal justice and border management, they call for cross-sectoral regulation and a drastic increase in transparency to ameliorate the 'black box' problem of AI informed decision-making where algorithmic recommendations are made but it is not possible to reconstruct or explicate the process through which recommendations were generated.
In the most recent recommendations made by the UN High Commissioner to member states there is a call to ban any applications that cannot be run in full compliance with human rights legislation (ibid., 15). The USA has similarly proposed a bill of rights for AI systems (White House 2022) which foregrounds explanation, stating that people should know that 'an automated system is being used and understand how and why it contributes to outcomes that impact you'. The bill recommends plain language reporting which is technically valid and meaningful, and which should be shared publicly where possible.
The forthcoming AI Act (European Commission 2021) proposes a regulatory framework for the exploitation of AI technologies which aims to be consistent with existing rights and values. According to the AI Act, key to building trust in AI systems is to introduce higher degrees of oversight, monitoring and transparency which are greater in higher-risk scenarios (such as those involving vulnerable groups, biometric data, social scoring, or manipulative generated content like deep fake images). The key regulatory challenge going forward is finding non-reductive ways to make socio-technical AI processes not just transparent, but understandable.

Conclusion
XAI is often portrayed as a route to ameliorating fears about the mechanisation of society. Being able to explain what is happening to those affected requires careful messaging. In educational contexts, it should always be possible to provide accounts of AIED which are interpretable to the layperson alongside more technical accounts which can be made available to specialist auditors or external examiners. Furthermore, appropriate governance measures can be put in place so that it is always possible to identify a human being who takes responsibility for what an algorithm has done or recommended (cf. Floridi, Cowls, and Beltrametti 2018).
It is likely that educational institutions will not in fact be the gatekeepers of AI technologies as they begin to proliferate on consumer devices. Educators are already starting to integrate language processors like ChatGPT in their teaching as students increasingly use them to overcome the parameters of traditional assessments like essays. It is essential that educators engage with the impact of generative AI on existing delivery and assessment systems. It is possible that we will see the introduction of new roles that support this (such as the brokering and auditing roles described above) by drawing on the distinctively human aspects of sentience and moral agency (Véliz 2021; Weizenbaum 1976).
Greater transparency and explicability indicate a route to critical reflection upon the application of algorithms in education and AI in social life more generally. This critical review of literature has shown that a socio-technical perspective for XAIED is essential. For educators and learners to participate in AIED they need to be able to understand and meaningfully consent to AI interventions, and trust must be built as transparently as possible. The risks and impacts of AIED are in the process of becoming: XAIED is necessary for AIED, not least because the only alternatives are opaque AI or no AI. For promoting trust, ameliorating risk and the exchange of stakeholder perspectives, XAIED could even be considered a kind of default position for educational institutions. However, it is also necessary to acknowledge that radical transparency is potentially disruptive to traditional pedagogical approaches, and AIED introduces risks (such as algorithmic manipulation; bias; modifying rather than measuring behaviour; and disincentivizing learning). For learners to participate in AIED they need to be able to understand and meaningfully consent to the processes and effects of algorithmic intervention. It is hard to see how this can happen unless those who support learners also understand what is happening and all the ethical implications. Even if one could render all algorithms transparent and fully explicable, the socio-technical ecosystems of production, assembly, programming, training, using and maintaining AI systems are so diffuse as to be obscured in their entirety from any one individual view. Problematising the proposition of XAIED from a socio-technical view shows that XAI is not a full solution to the issues raised by AI, but both a beginning of and a necessary precondition for meaningful discourse about our possible futures.