Explainable Artificial Intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions

These open challenges encapsulate the complexities and nuances of XAI and offer a road map for future research. For each problem, we provide promising research directions in the hope of harnessing the collective intelligence of interested stakeholders.


Introduction
The field of Explainable AI (XAI) has undergone significant growth and development over the past few years. It has evolved from being a niche research topic within the larger field of Artificial Intelligence (AI) [1,2,3] to becoming a highly active field of research, with a large number of theoretical contributions, empirical studies, and reviews being proposed every year [4,5,6,7,8,9,10,11,12,13,14,15,16,17]. Furthermore, XAI has evolved into an exceedingly multidisciplinary, interdisciplinary, and transdisciplinary field. Among others, XAI is now a research topic in a broad range of disciplines outside of computer science, such as engineering, chemistry, biology, education, psychology, neuroscience, and philosophy [17,18,19,20]. The growth of XAI can be attributed to the increasing success of AI. In recent years, Machine Learning (ML), specifically Deep Learning (DL), has been successfully used in many real-world applications due to its ability to learn and automatically extract patterns from complex and non-linear data. ML and DL techniques have been used for classification, forecasting, prediction, recommendation, and data generation. The success of these techniques and their application in critical areas such as finance [21] and healthcare [22] has made it necessary to understand these models' underlying mechanisms and their often opaque outputs. XAI has emerged as a response to this demand, as it seeks to develop methods for explaining the behaviour and outputs of AI systems. In other words, AI's need for transparency and interpretability has made XAI an area of study with practical and ethical value in various fields [4,18]. Despite much progress in XAI, many open problems require further exploration. For instance, XAI alone is not enough for trustworthiness, and there is a lack of consensus concerning the priorities and directions needed to advance this research field. Often, these open problems are viewed through siloed perspectives [23,4]. A broader, multidisciplinary approach that draws on the expertise of researchers across different fields could bring about advances towards XAI 2.0. Our work addresses this gap by bringing together a wide range of experts from diverse fields to collaborate on identifying and tackling open problems in XAI. The focus is on synchronizing the research agendas of scholars working in the field, identifying directions that could catalyze XAI in real-world applications.
The goal is to form a proposal open to discussion, a sincere attempt at fostering a debate around XAI and the research that should be addressed in the future. By doing so, we hope to offer new insights and perspectives on, for instance, developing and improving methods and applying existing methods in novel domains. Through this collaborative effort, we seek to advance the field of XAI and contribute to its continued growth and success. In particular, we seek to propose a manifesto that comprises several propositions governing scientific research in the field of Explainable AI (XAI). To achieve this goal, this article has come about through a very specific synthesis. To get different perspectives on XAI, various experts from different disciplines, including philosophy, psychology, HCI, and, of course, computer science, were brought together. This significant effort has resulted in a total of 27 problems with their challenges, which we have divided into nine categories of two to four problems each.
Overall, the structure of this paper is as follows. Through the illustration of some of the many possible use-cases of AI, Section 2 presents a variety of advances of XAI techniques and methods, along with their application in real-world settings. This is meant to highlight the benefits of XAI to people, businesses, institutions, and society. Subsequently, the article's core follows in Section 3 by describing 27 problems in XAI, the challenges in solving them, and our suggestions for possible solutions. Finally, Section 4 summarizes our manifesto concisely, offering a roadmap for future research.

Advances and Applications of XAI Research
In this section, we showcase that research in XAI is alive and useful. In particular, Section 2.1 focuses on synthesizing the main breakthroughs in XAI, demonstrating its enormous potential. Similarly, Section 2.2 demonstrates the large and increasing number of applications of XAI methods, techniques, and tools and their utility in real-world scenarios.

XAI Trends, Advances, and Breakthroughs
The prime goal of explanations is to make a model understandable or comprehensible to its stakeholders [24,17,25,26]. To this end, several methods have been introduced in the last few years to explain the decisions of complex AI systems in many application domains [4,17,27,9,10]. Synthesizing explanations for AI systems has been shown to have the potential to solve several technical and societal problems. Explanations can facilitate the understanding of how learning from data has occurred, for instance, via feature attribution methods. Furthermore, explanations can reveal information about how a model can be exploited to improve its performance. They can also support and improve human confidence in the output of a given model. Explanations may reveal the existence of hidden biases in the training data, learned during model training, that negatively impact a model's generalisation when predicting unseen data [28]. Explanations are also demanded in data stream settings, where they can be used to characterize what a model observes over time. This can serve as a knowledge base to detect non-stationarities in the task being solved and, thus, concept deviations [29]. Similarly, wrongly annotated data instances in large-scale databases can be identified by computing a measure of disagreement between the explanations issued for a model. Application opportunities such as these may also arise in vertical federated learning, where aggregation policies can be adjusted by examining commonalities among local models during update rounds [30]. Explanations can also drive pruning and model compression strategies, linking irrelevant concepts to specific neurons that can hence be removed from a neural network [31].
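The disagreement-based identification of wrongly annotated instances mentioned above can be made concrete with a simple sketch: given attribution vectors produced by several explainers for the same instance, instances whose explanations disagree strongly are flagged for inspection. The following minimal Python sketch (the data and the use of cosine distance are illustrative assumptions, not taken from the cited works) measures disagreement as the mean pairwise cosine distance:

```python
import numpy as np

def pairwise_disagreement(attributions):
    """Mean pairwise cosine distance among attribution vectors
    produced by different explainers for one instance."""
    A = np.asarray(attributions, dtype=float)
    # Normalise each attribution vector to unit length.
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    A = A / np.where(norms == 0, 1.0, norms)
    sims = A @ A.T                    # cosine similarities
    iu = np.triu_indices(len(A), k=1) # upper triangle, no diagonal
    return float(np.mean(1.0 - sims[iu]))

# Toy example: three explainers agree on one instance, conflict on another.
agree = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
conflict = [[1.0, 0.0, 0.0], [-1.0, 0.1, 0.0], [0.0, 1.0, 0.0]]
assert pairwise_disagreement(agree) < pairwise_disagreement(conflict)
```

Instances with high disagreement scores could then be prioritised when auditing annotations.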

Attribution Methods
A lot of work exists on explaining the decisions of a classifier with attribution methods [32]. For instance, model-agnostic attribution methods such as Local Interpretable Model-Agnostic Explanations (LIME) [33], Shapley Additive Explanations (SHAP) [34], and many others can contribute to the interpretation of DL models by computing the importance of input features [35,36]. Furthermore, saliency maps built by attribution methods such as network gradients, Deconvolutional Neural Networks (DeConvNet), Layer-Wise Relevance Propagation (LRP), Pattern Attribution, and Randomized Input Sampling for Explanation (RISE) can identify relevant inputs for the decisions of classification or regression tasks. In the image or text domain, explanations using attributions are intuitive and often perceived as easy to understand by the human receiver. For instance, one immediately understands that a classifier might not work correctly if it classifies horse images not by looking at the horse itself but by focusing on a copyright watermark, which is often present in images of this category. Such misbehaving classifiers have been termed 'Clever Hans' predictors [37] or 'Short-Cuts' [38]. However, identifying such misbehaviour or understanding the meaning of an attribution-based explanation can be significantly more difficult in other domains [39]. For instance, an attribution map computed on a multivariate time series signal or a complex biological sequence can be significantly more difficult to understand for the human receiver; that is, the 'interpretation gap' is much larger than in the horse example. Moreover, even in the image domain, attribution maps only indicate where the relevant information is located, but it is still up to the human to assign meaning to this information. For example, when an attribution map highlights the teeth of a 20-year-old person as an indicator for the prediction of the class 'young adult', it does not convey whether the white colour of the teeth is the important cue for the prediction or the fact that the person smiles [40].
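To illustrate how perturbation-based attribution methods of the LIME family operate, the following sketch fits a weighted linear surrogate to a toy black-box model evaluated on randomly masked inputs; the coefficients of the surrogate serve as feature attributions. The sampling scheme, proximity weights, and toy model are simplifying assumptions, not the exact LIME algorithm:

```python
import numpy as np

def lime_style_attribution(f, x, baseline, n_samples=500, seed=0):
    """Minimal LIME-style attribution: fit a weighted linear surrogate
    to the model's behaviour on randomly masked versions of x."""
    rng = np.random.default_rng(seed)
    d = len(x)
    masks = rng.integers(0, 2, size=(n_samples, d))   # 1 = keep feature
    perturbed = np.where(masks == 1, x, baseline)
    y = np.array([f(p) for p in perturbed])
    # Weight samples by proximity to the original (fraction of kept features).
    w = masks.mean(axis=1)
    X = np.hstack([masks, np.ones((n_samples, 1))])   # add intercept column
    Xw = X * w[:, None]
    coef, *_ = np.linalg.lstsq(Xw.T @ X, Xw.T @ y, rcond=None)
    return coef[:d]                                   # per-feature importance

# Toy black box that only uses the first feature.
f = lambda v: 3.0 * v[0]
x = np.array([2.0, 5.0, -1.0])
phi = lime_style_attribution(f, x, baseline=np.zeros(3))
```

For this toy model, the surrogate assigns essentially all importance to the first feature and none to the others, mirroring the model's true behaviour.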

Interpretable Models
Explainability in contexts like finance often has a special flavour. In this domain, information is mainly presented as tabular or temporal data. Here, traditional ML techniques are often adopted, especially techniques based on Decision Trees (DTs) [41]. The benefit of these techniques is, among others, that they supposedly lead to inherently interpretable models. Some scholars argue that using a black-box model, usually derived by applying DL methods, only marginally improves the performance of classical AI methods [42] (cf., [43]). Accordingly, models that are interpretable by design, such as DTs [44], are preferred for many applications [45]. For this reason, another recent development within XAI is that of rule-based approaches and rule extraction methods, building on their long history within AI. For example, using symbolic rules to derive knowledge is still popular today [46]. Although these methods can improve the overall performance of XAI systems by synthesizing effective explanations, they are still largely ignored when prioritizing interpretability. One reason might be that the coverage and specificity of the generated trees or rules are low. In methods based on rule extraction, an opaque 'black box' model is typically trained first and then used to construct a transparent 'white box' model, such as a rule-based model or a DT. However, limiting the complexity of a DT while achieving high accuracy via rule extraction is an open problem [47].
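The rule-extraction pipeline described above (train an opaque model, then distil a transparent surrogate from its predictions) can be sketched minimally as follows. The 'black box', the data, and the one-split decision stump are illustrative assumptions, standing in for a real model and a full decision tree learner:

```python
import numpy as np

def fit_stump(X, y):
    """Fit a one-split decision stump (feature, threshold, leaf labels)
    that best mimics the labels y -- a minimal 'white box' surrogate."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            l_lab = int(left.mean() >= 0.5)   # majority label on each side
            r_lab = int(right.mean() >= 0.5)
            acc = ((left == l_lab).sum() + (right == r_lab).sum()) / len(y)
            if best is None or acc > best[0]:
                best = (acc, j, t, l_lab, r_lab)
    return best

# 'Black box': an opaque model we can only query for predictions.
black_box = lambda X: (X[:, 0] > 0.5).astype(int)

rng = np.random.default_rng(0)
X = rng.random((200, 3))
y_bb = black_box(X)                    # labels obtained from the black box
acc, feat, thr, l_lab, r_lab = fit_stump(X, y_bb)
rule = f"IF x{feat} <= {thr:.2f} THEN {l_lab} ELSE {r_lab}"
```

The extracted rule reproduces the black box's decision boundary here; in realistic settings the surrogate's fidelity must be measured, which is exactly where the complexity/accuracy trade-off mentioned above bites.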

New Kinds of Approaches
Recent approaches have shown potential for resolving problems of older approaches, even if more research must be performed to confirm this [48,49]. One such approach has integrated attention-based explanations into a neural architecture to achieve an efficient computation of tabular data and to increase its interpretability [50]. Results are encouraging, but explanations remain highly subject to the inner variability of attention when transformer architectures are used. In that respect, the attention mechanisms could be heavily exploited with a variety of established techniques, including attention flow and rollout [51], LRP adaptation [52], or attention memory [53]. Such techniques are promising for enhancing explanations of complex models, but the properties of the resulting explanations need to be further investigated, especially concerning stability, robustness, and fidelity [54,39,55]. Connected to the use of rules as a means for enabling the explainability of AI systems, another new trend within XAI is the use of argumentation [56,57,58]. In particular, computational argumentation can be useful to explain all the steps towards a rational decision, as well as enabling reasoning under uncertainty to find solutions with conflictual pieces of information [59,60,61]. In this context, rules are seen as arguments, and their interaction is seen as a conflict that can be resolved with argumentation semantics [62]. Typically, computational argumentation implements non-monotonic reasoning, a type of reasoning where conclusions can be retracted in the light of new reasons [63,64,65]. This formalism is appealing within XAI because it mirrors one way in which human reasoning works [58].
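The attention rollout technique [51] mentioned above can be sketched in a few lines: head-averaged attention matrices are mixed with the identity to account for residual connections, row-normalised, and multiplied across layers. The toy matrices below are random placeholders standing in for real transformer attentions:

```python
import numpy as np

def attention_rollout(attentions):
    """Attention rollout: propagate attention through layers, treating
    residual connections as 0.5*I mixed with the head-averaged attention."""
    rollout = np.eye(attentions[0].shape[-1])
    for A in attentions:                 # A: (tokens, tokens), rows sum to 1
        A = 0.5 * A + 0.5 * np.eye(A.shape[-1])  # account for the residual
        A = A / A.sum(axis=-1, keepdims=True)    # re-normalise rows
        rollout = A @ rollout
    return rollout

# Toy example: 2 layers, 3 tokens, head-averaged attention matrices.
rng = np.random.default_rng(0)
atts = [rng.random((3, 3)) for _ in range(2)]
atts = [A / A.sum(axis=-1, keepdims=True) for A in atts]  # row-stochastic
R = attention_rollout(atts)
```

Each row of the result is a distribution over input tokens, interpretable as how much each token contributes to the corresponding output position after all layers.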

Applications of XAI Methods
XAI methods have been widely applied in several fields, including finance, education, environmental science and agriculture, and medicine and health care. This section describes some of the many applications of XAI methods. The goal is to provide stakeholders with illustrations and case studies.

Medicine, Health-Care, and Bioinformatics
The inferences produced by AI-based systems, such as Clinical Decision Support Systems, are often used by doctors and clinicians to inform decision-making, communicate diagnoses to patients, and choose treatment options. However, it is essential to adequately trust an AI-supported medical decision, as, for example, a wrong diagnosis can have a significant impact on patients. In this regard, understanding AI-supported decisions can help to calibrate trust and reliance. For this reason, many XAI methods such as LIME, SHAP, and Anchors have been applied in Electronic Medical Records, COVID-19 identification, chronic kidney disease, and fungal or bloodstream infections [66]. In these high-stakes scenarios, there is evidence that AI-based systems can have superior diagnostic capabilities to those of human experts [67]. Thus, the explainability of these systems is not only a technological issue but boils down to medical, legal, ethical, and societal questions that need careful consideration [68].

Finance
In finance, institutions such as banks and investment firms leverage AI to automate their processes, reduce costs, improve service security, and, generally, gain a competitive advantage. AI algorithms are used at scale to predict credit risk, detect fraud, and diagnose investment portfolios for optimisation purposes. In these contexts, applying AI often requires transparency and explainability for legal reasons. This requirement is particularly significant in the customer banking sector, where banks must comply with strict regulations such as the USA Equal Credit Opportunity Act (ECOA) or the USA Fair Housing Act (FHA) to expose adverse action codes and provide clear explanations for their decisions. Similar guidelines and enforcement exist in Europe under the European Union's General Data Protection Regulation (GDPR). For example, if a customer's loan application is denied, the bank must be able to provide a clear and understandable reason for this. It becomes increasingly difficult for banks, when adopting AI algorithms, to provide explanations which are stable and trustworthy [69,70]. In other words, it becomes increasingly hard to justify the inferences of AI models, both with simpler transparent models [71,72,73] and even more so with complex models [50,48,49]. This lack of transparency can put banks at risk of regulatory penalties and erode customer trust. In investment banking, the demand for XAI is driven by the need to ensure the robustness and stability of AI systems [74], which can be subjected to extreme market conditions and unexpected events. If an AI system makes inferences that are difficult to validate, it could lead to disastrous outcomes.

Environmental Science and Agriculture
Another area of application of AI that has benefited from adopting XAI methods is the intelligent analysis, modelling, and management of agricultural and forest ecosystems, an important task for securing our planet for future generations. For example, forest carbon stock is a critical metric for climate research and management, as forests play a vital role in sequestering atmospheric carbon dioxide. In this context, drones can be deployed for data collection, and ML techniques can be used for estimating forest carbon storage [75]. Forest inventory also plays a crucial role in forest engineering, as it provides critical information on forest characteristics, such as tree species, size, and density, which can inform forest management decisions [76,77]. In these life-critical environments, sensor-based technology is employed to collect data, which is often high-dimensional and heterogeneous, and AI-based models are then trained on it. However, the data is often poor in quality, thus leading to models that lack robustness. Furthermore, even if such models are robust, there are still challenges in terms of tracing and understanding their inferences, and in ascertaining the causal factors that underlie them. Even the smallest perturbations in the input data can dramatically affect a model's output, leading to completely different inferences and thus undermining the trustworthiness of such models [78,79]. Additionally, in these naturalistic environments, a challenge for forest engineering is the development of methods for uncertainty quantification and propagation. In fact, AI methods for developing forest inventory models are subject to various sources of uncertainty, including measurement error, spatial variability, and model misspecification. It is, therefore, extremely important to analyse the robustness of AI methods, for instance through explainability, and to enhance it for the produced models and their inferences [80,81].

Education
AI in Education (AIED) focuses on developing AI-powered educational technologies to aid students, instructors, and educational institutions [82,83,84] in their teaching and learning activities. On the one hand, for students, AIED has focused on developing models [85] and adaptive systems that can identify learners' strengths and weaknesses across a variety of topics, leading to customized instructions and resources that align with their learning needs [86]. These are, for example, focused on improving their meta-cognitive processes of self-monitoring, reflection, and planning [87]. On the other hand, for instructors, AIED tools can act as smart teaching assistants [88], help them orchestrate the classroom [89], grade assessments [90], and answer student queries [91], minimizing student dropout [92]. The most recent example of an application of AIED is the use of Large Language Models (LLMs) capable of generating new textual content based on human input prompts. These can be used for writing essays, producing software code, or generating educational content such as multiple-choice questions or worked examples with step-by-step solutions. A growing concern is that students, instructors, and educators lose control of AI-based technologies as they fail to determine how these work, why they produce certain outputs, and what impact they may have. In particular, AIED tools such as educational recommender systems are increasingly used to automate and personalize learning activities [93]. These tools pose various concerns about their use in high-stakes decisions, including fairness, accountability, transparency, and ethics [94,95,96]. The impact of these technologies on students' agency and self-regulated learning is a growing concern, as the lack of transparency and feedback can make it difficult for instructors and learners to calibrate their trust in AI-based inferential systems and understand their current state of learning and the benefits derived from engaging with a particular educational resource [97].

Challenges and Research Directions
Despite the many advances, breakthroughs, and potential applications of XAI methods, more research is clearly required to address open problems in the field. For example, it is still unclear how XAI methods should be evaluated, how different terms should be used in the debate, or how, exactly, XAI is related to trustworthiness. Many surveys tackling some of these aspects of XAI exist and keep appearing in conference proceedings and journals [20]. However, they are rather scattered, often specific to an application domain or focused on specific methods. Against this backdrop, this section aims at extracting and synthesizing the diverse challenges in XAI that motivate the formation of a manifesto. Overall, we identified 27 problems, which we have grouped into nine high-level categories (our manifesto), as depicted in Figure 1. These problems and the related challenges are often interconnected; thus, they may, in principle, belong to multiple categories.

Creating Explanations for New Types of AI
The ever-evolving landscape of AI introduces novel types of models, such as generative models or concept-based learning algorithms, each with its unique set of properties. Against this backdrop, this category of challenges describes the intricacies of creating explanations for these new types of AI. Generative AI models, such as those employed for diffusion denoising [98,99] or the family of GPT models for large-scale language generation [100], are disrupting many sectors. These models deliver exceptional performance due to their immense scale. With billions, and in some cases nearly trillions, of parameters, however, their sheer size poses a significant challenge to existing XAI methods [14]. In particular, these methods grapple with the high-dimensional nature of such models, both in terms of computational complexity and in extracting learned concepts. For instance, one obstacle related to the latter point lies in the polysemantic nature of the neurons in generative models, which is thought to arise from a superposition of multiple independent features [101]. XAI methods have so far been mostly limited to classification and regression problems. Accordingly, completely new approaches have to be developed for generative models. In particular, self-supervised or neural generative models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) are becoming more popular. For instance, examining the latent spaces they learn and synthesizing explanations for them is very challenging. Another challenge, particularly for LLMs, concerns scaling laws. Neural scaling laws are functional relationships between variables associated with a neural network, such as the number of layers in its architecture and its achieved accuracy after training. Such a relation between two variables x and y is of the general form y = a · x^α, where a and α are the constants of the scaling law. Such laws govern the aggregate capabilities of LLMs, yet a precise understanding of the individual task-level implications of these laws remains elusive, as they appear to manifest unpredictably. It is an open issue whether scaling laws can be used to infer the quality of the artefacts or concepts learned by LLMs. Even if this were possible, such laws might reveal plateaus of XAI scaling relative to general ability scaling or indicate a gap between the two.
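A scaling law of the form y = a · x^α becomes a straight line in log-log space, so its constants can be recovered by linear regression. The following sketch illustrates this with synthetic data standing in for real scaling measurements:

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = a * x**alpha by least squares in log-log space:
    log y = log a + alpha * log x."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return np.exp(intercept), slope   # a, alpha

# Synthetic 'scaling' data: loss follows y = 2 * x**(-0.5).
x = np.array([1e6, 1e7, 1e8, 1e9])   # e.g. parameter counts
y = 2.0 * x ** -0.5                  # e.g. a loss metric
a, alpha = fit_power_law(x, y)
```

On real measurements, deviations of individual tasks from such a fitted line are precisely the unpredictable task-level effects discussed above.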
Solution Ideas
Mechanistic interpretability [102,103] is a promising approach to gain deeper insights into the functioning and scaling laws of generative models, such as for grokking mechanics [104] and the ability to solve problems recursively [105]. In particular, mechanistic interpretability has shown promising results at small model scales and for toy problems. Researchers from institutions and companies like MIT, OpenAI, DeepMind, and Anthropic pursue mechanistic interpretability as an approach that attempts to reverse-engineer the learned representations and algorithms of trained models using causality-based methods. Piecewise linear activation functions have been used to partition the activation space into polytope-shaped monosemantic regions [106], and sparse autoencoders have been successfully used for the mono-semanticity of Deep Neural Network (DNN) models [101]. There are also challenges in mechanistic interpretability, such as disentangling multiple algorithm implementations and finding unknown algorithms [107]. Furthermore, there are preliminary results from vision models indicating that scaling does not help the mechanistic interpretability of models [108], calling for designing models for mechanistic interpretability. A potential complement to mechanistic interpretability may be information geometry [109], which can help analyze the high-dimensional spaces involved in the processing of LLMs. Furthermore, constraints may have to be imposed on the training and functioning of LLMs to ensure safety as well as explainability [110,108]. Such constraints could be directly part of the automated optimization (learning) process or indirectly used through a human-in-the-loop approach. A promising direction is [111], which introduces a training procedure that encourages modularity and interpretability by discouraging non-local connections between neurons through local L1 regularization with swaps of neuron locations. Finally, it remains to be seen if these methods can be scaled to relevant models and problem sizes and complexities.
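The sparsity-inducing effect of the L1 regularization mentioned above can be illustrated with the proximal operator of the L1 norm (soft-thresholding), which shrinks activations toward zero and sets small ones exactly to zero; this is the basic mechanism by which L1 penalties yield sparse, more monosemantic codes. The code values below are illustrative:

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of lam * ||z||_1: shrinks activations and sets
    small ones exactly to zero, yielding sparse codes."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

codes = np.array([0.9, -0.05, 0.02, -1.3, 0.0])
sparse = soft_threshold(codes, lam=0.1)
```

Only the two large activations survive; the rest are exactly zero, which is what makes individual hidden units easier to assign a single concept to.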

Creating Explanations for Concept-Based Learning Algorithms
Concept-based learning algorithms are another class of new forms of AI for which good XAI methods do not yet exist. Several such algorithms have been proposed over the years to directly learn features that describe 'prototypical concepts' or 'prototypes' present in each input to the model, including ProtoPNet [112], ProtoTree [113], ProtoPShare [114], Concept Bottleneck Models [115], Concept Activation Vectors [116], Concept Embedding Models [117], and Concept Atlases [118]. Neuro-symbolic learning, namely the symbiosis between connectionist and concept-based symbolic learning, has recently also gained momentum [119,120,121]. Hybridizing knowledge graphs (KGs) with learning algorithms also falls within the landscape of approaches used to map knowledge encoded in the parameters of a model to a priori known concepts and the interrelationships among them [15]. Unfortunately, the use cases proposed to showcase how these approaches explain their decisions are limited, very narrow, and assume a priori knowledge about the concepts that can be discriminative for the task at hand. This assumption may imprint a large inductive bias on their explanation-producing process, which may not properly generalise when explaining new inputs that are distributionally novel with respect to the training data. Furthermore, the continuous proposal of new datasets for concept learning, including Clevr/Clevrer [122,123], Kandinsky Patterns [124], and Closure [125], underscores the need for eliciting local explanations that can be formulated in terms of concepts and their spatial distribution.
Solution Ideas
One interesting avenue of research explores the potential for genetically evolvable connections between identifiable concepts in input data using object detection models and evolutionary programming solvers [16]. This hybridization could offer the advantage of employing symbolic classifiers that are interpretable, algorithmically transparent, and well-suited for handling datasets that encapsulate discriminative, concept-wise compositional information. Additionally, there is a growing demand to expand hybrid approaches that unite Knowledge Graphs with concept-based learning methods. This expansion aims to enable the discovery of relevant concepts, attributes, and relationships that extend beyond the confines of specific use cases or domains, as discussed by Lecue et al. [54].

Improving Current XAI Methods
A spectrum of challenges arises when considering current XAI methods. Many of these have long-known disadvantages that need to be overcome, as described below.

Augmenting and Improving Attribution Methods
One major branch of XAI methods relies on pixel attribution with heatmaps or saliency masks [126], one of the most prominent classes of XAI methods used for computer vision tasks. Such methods are often based on perturbations [127,128] or gradients [129,130]. Despite the great success of these methods in, for instance, detecting biases and flaws in the learned prediction strategies (the so-called 'Clever Hans Effect' [37], see Section 2.1), attribution-based explanation methods also have limitations. For instance, saliency masks on the level of pixels are often unsuited for laypersons [39]. A major technical limitation of attribution methods is their sensitivity to 1) internal hyper-parameter tuning and customisation (such as baselines), 2) the manual settings of interpretable interfaces, and 3) the assumptions made about the model being explained. For example, the results of model-agnostic attribution methods, including LIME and SHAP, can change based on the range of input perturbation. Similarly, many gradient-based methods require setting a proper sampling interval. Finally, relevance propagation methods such as LRP have to use different rules depending on the layer of the DNN. Additionally, some methods have issues with computational efficiency, requiring many passes to calculate attributions.
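The sensitivity to baselines can be demonstrated with a minimal occlusion-style attribution on a toy model: replacing a feature with two different baseline values yields two different attribution maps for the same input. The model and baselines here are illustrative assumptions:

```python
import numpy as np

def occlusion_attribution(f, x, baseline):
    """Importance of each feature: drop in the model output when that
    feature is replaced by the corresponding baseline value."""
    out = f(x)
    attr = np.empty_like(x, dtype=float)
    for i in range(len(x)):
        x_occ = x.copy()
        x_occ[i] = baseline[i]
        attr[i] = out - f(x_occ)
    return attr

f = lambda v: v[0] * v[1]          # toy model with a feature interaction
x = np.array([2.0, 3.0])
attr_zero = occlusion_attribution(f, x, np.zeros(2))  # zero baseline
attr_one = occlusion_attribution(f, x, np.ones(2))    # unit baseline
```

The two baselines produce genuinely different importance scores, even for this two-feature model, which is exactly the hyper-parameter sensitivity described above.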

Solution Ideas
We can tackle the inherent issues in attribution methods by combining them with other approaches to XAI, yielding a portfolio approach that hedges against the weaknesses of each individual approach. Mechanistic interpretability is an orthogonal approach with different characteristics that could play along well with these approaches. Methods in the portfolio could negotiate, as in a market, to tune themselves and converge on a majority view or, even better, on a list of hypotheses with their plausibilities based on the votes of each portfolio participant.
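One simple way to realise such a portfolio vote is rank aggregation: each method's attribution map is converted to feature ranks before averaging, so that no single method's output scale dominates the consensus. The sketch below uses hypothetical scores for three methods:

```python
import numpy as np

def aggregate_by_rank(attribution_maps):
    """Combine attribution maps from several XAI methods by averaging
    per-feature importance ranks (higher mean rank = more important)."""
    A = np.abs(np.asarray(attribution_maps, dtype=float))
    ranks = A.argsort(axis=1).argsort(axis=1)   # 0 = least important
    return ranks.mean(axis=0)

# Three hypothetical methods scoring four features on very different scales.
maps = [
    [0.9, 0.1, 0.5, 0.0],      # e.g. surrogate-based scores
    [90.0, 20.0, 40.0, 1.0],   # e.g. gradient magnitudes
    [0.8, 0.3, 0.6, 0.1],      # e.g. Shapley-style values
]
consensus = aggregate_by_rank(maps)
```

Here all three methods agree on the feature ordering despite wildly different scales, and the consensus ranks reflect that agreement; disagreement among methods would instead show up as intermediate mean ranks.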

Removing Artifacts in Synthesis-Based Explanations
Generating explanations through synthesis is a promising direction to advance the field of XAI. While a user is unlikely to directly understand the layer activations of specific classes, it may be different for examples of those classes. The synthesis of such examples, however, is often noisy. For instance, a synthesized image may contain artefacts. It is unclear whether this noise is due to the synthesis process itself or is, de facto, part of a concept learned by the model. For example, while a GAN architecture can synthesize an image representing the pattern that activates a neuron most strongly, this image might have various artefacts that make it appear somewhat distorted. This might happen due to shortcomings of the GAN model; alternatively, it might mean that these artefacts must actually be present to activate the neuron strongly. Two existing methods for synthesis in the literature are a decoder for layer activations [131] and a GAN for single neurons [132]. Unfortunately, the mere synthesis of inputs is insufficient for understanding concepts. There are few works on using generative models for explanations, including the work of [131] and the chapter on concept vectors in [133]. Most methods explaining concepts rely on a given dataset of human-defined concepts [133], which, however, might not be available for a specific domain and must be collected at high cost. Furthermore, even if a dataset is available, there is a considerable risk that the user-defined concepts are incomplete or inaccurate, leading to poor or biased explanations.

Solution Ideas
To minimize artefacts, state-of-the-art models and recent popular techniques in DL [134], especially diffusion models, could be leveraged [98]. However, even state-of-the-art generative models do not ensure the absence of artefacts. Thus, to verify whether there are any distortions due to the synthesis, one idea is to compute a reconstruction of the original input serving as a reference [135]. This reference stems from a separate model with the same architecture as the decoder synthesizing inputs from the model to explain. Subsequently, a user can compare the original input, the synthesized image from layer activations of the model to explain (that is, what the classifier 'sees'), and the reference, allowing them to identify distortions due to the synthesis process. If the original image and the reference are fairly similar, then distortions might be considered minor. However, a classifier might not rely on certain concepts associated with the input. Therefore, while the comparison with a reference might be considered a valid approach, it is still tedious for the lay user and non-trivial to apply beyond autoencoders.
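The comparison procedure described above can be sketched as a simple per-pixel check: where the reference reconstruction deviates strongly from the original input, the deviation is attributable to the synthesis process rather than to the model being explained. The arrays and the tolerance below are illustrative assumptions:

```python
import numpy as np

def synthesis_artifact_mask(original, reference, tol=0.1):
    """Flag positions where the reference reconstruction deviates from
    the original input: such deviations stem from the synthesis process,
    not from the model being explained."""
    return np.abs(original - reference) > tol

original = np.array([0.2, 0.8, 0.5, 0.9])
reference = np.array([0.21, 0.79, 0.1, 0.88])  # one value badly reconstructed
mask = synthesis_artifact_mask(original, reference)
```

A user would then discount the flagged regions when interpreting the synthesized explanation, since they cannot be attributed to the classifier.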

Creating Robust Explanations
The fragility of post-hoc XAI methods to small perturbations at the model's input and the known inconsistency in synthesized explanations for a given input [136] highlight the challenge of creating robust explanations. Robustness is frequently advocated as a requirement for calibrating human trust and building acceptance of a model being audited. Several works have advocated the idea of exploiting explanations beyond just explaining decisions [137,138], for instance, to also improve models. However, the susceptibility of explanations to the XAI technique under consideration detracts from the explanation's robustness, jeopardizing the reliable application of explanations to improve a model. Methodologies for delivering robust explanations under different circumstances are investigated in several recent works [74,139,140]. A satisfying solution, however, does not yet exist. The difficulty lies especially in the fact that for a robust explanation, the model itself must be robust.

Solution Ideas
As a first step towards robust explanations, evaluations on standard benchmarks should be done to identify common biases of an XAI method and to define ways to mitigate them. Furthermore, robust explanations could be created by aggregating explanations. For example, a proposal exists to blend uncertainty quantification and XAI [141]. Other research has placed emphasis on the robustness of the AI model itself, for instance, in the form of explanations that inform about model inversion or extraction attacks [142,143]. In a similar vein, the recently proposed "reveal to revise" framework enables practitioners to iteratively identify, mitigate, and (re-)evaluate spurious model behaviour with a minimal amount of human interaction [144].
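The aggregation idea can be illustrated with a minimal sketch in the spirit of SmoothGrad-style noise averaging; the `explain` callback, noise level, and sample count are placeholders for whichever attribution method and hyperparameters a practitioner would actually use:

```python
import random

# Minimal sketch of one aggregation strategy: average the attributions
# an XAI method produces over many slightly perturbed copies of the
# input. Individual explanations may be fragile, but their aggregate
# tends to be more stable. `explain`, `sigma`, and `n` are assumptions.

def aggregate_explanations(explain, x, n=50, sigma=0.1, seed=0):
    """Return the mean attribution of `explain` over `n` noisy copies
    of the flat input `x` (Gaussian noise with std `sigma`)."""
    rng = random.Random(seed)
    totals = [0.0] * len(x)
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        for i, a in enumerate(explain(noisy)):
            totals[i] += a
    return [t / n for t in totals]
```

For a linear attribution method the aggregate converges to the noise-free attribution as `n` grows, which is the stability property the text asks for.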

Evaluating XAI Methods and Explanations
Evaluation is an important aspect of the development and deployment of XAI systems. Evaluating XAI methods, however, is a complex task, and no gold standard exists on what makes for a good explanation [39].

Facilitating Human Evaluation of Explanations
One problem concerning the evaluation of XAI methods is that they often lack user studies. Current evaluation approaches typically only analyze certain properties of the XAI methods themselves without accounting for the interaction with the final user [145,146,147,148,55]. For instance, a survey of user studies has shown that only 36 out of 127 research works employing counterfactual explainers adopted a human evaluation approach, and only 7% of them tested alternative approaches [149]. Individual differences in understanding, prior knowledge, and the cognitive load required to comprehend explanations add further challenges to evaluating XAI methods. In general, it is difficult to compare different forms and types of explanations to determine which of them are the most effective. Additionally, users are typically 'passive recipients' of explanations, and the actual usage or exploitation of such explanations is barely tested. For certain properties, there are no approaches at all that test for them [55]. While some studies evaluated the impact of synthesized explanations of AI systems on humans when compared to the scenario where no explanations were provided [150,151,152], there is clearly a need for more (and more systematic) work on the topic.
Solution Ideas
Establishing a solid foundation for XAI must be grounded in empirical research involving users. Achieving this demands a collaborative, interdisciplinary approach, uniting ML experts with researchers from fields like HCI, psychology, and the social sciences. Valuable insights can be gleaned from the collective body of knowledge in these domains, leveraging their expertise in conducting user studies [23,153]. To streamline the evaluation process, it is imperative to establish standardized frameworks encompassing every stage of it, from formulating hypotheses to collecting data, analysing it, and utilising online questionnaires. With this robust methodology in place, the research community can then embark on the crucial task of developing heuristics, principles, and patterns that enable the design of effective XAI systems for real-world applications. This comprehensive approach ensures that XAI not only benefits from theoretical foundations but is also shaped by empirical user-centric research, ultimately enhancing its practical utility.

Creating an Evaluation Framework for XAI Methods
There are several works that address the evaluation of XAI methods. Hoffman et al. [154], for instance, integrate extensive literature and various psychometric assessments to introduce key concepts for measuring the quality of an XAI system. Similarly, Vilone and Longo [12] aggregate evaluation approaches for XAI methods from several scientific studies via a hierarchical system. Furthermore, Van der Lee et al. [155] define a list of steps and best practices for conducting evaluations in the context of generated text. An analysis of these works reveals that evaluating the goodness and effectiveness of explanations is a prerequisite for calibrating trust in AI. However, there is currently a lack of standardized methods and metrics for evaluating XAI systems. In other words, despite the broad interest in the design of XAI methods [156,145,8,10,157], it is still unclear how to compare the results of different evaluations and establish a common understanding of how to evaluate explanations. What is missing is a set of evaluation metrics for explainability that are generally applicable across studies, contexts, and settings.
Solution Ideas
There are already some good approaches in the literature to solve this problem. In a recent survey on the evaluation of XAI, for instance, the authors identify several conceptual properties that should be considered to assess the quality of an explanation, and they propose quantitative methods to evaluate an explanation [158]. Furthermore, the recently developed XAI evaluation framework Quantus [159] implements over 30 evaluation metrics from six categories. Frameworks such as Quantus allow for the evaluation and comparison of explanations in a standardized and reproducible manner. Furthermore, publicly available XAI evaluation datasets with ground-truth information, such as CLEVR-XAI [160], allow for objective evaluations. In the future, artefacts like these need to be extended to more application areas, especially outside of computer vision.
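As a toy illustration of the kind of quantitative metric such frameworks implement, consider feature deletion (often called pixel flipping in the vision setting): remove features in decreasing order of claimed relevance and track how quickly the model's score degrades. The `model` callable and the baseline removal value are assumptions of this sketch, not the API of any specific framework:

```python
# Toy deletion metric: a faithful relevance map should make the score
# curve drop quickly when the most relevant features are removed first.

def deletion_curve(model, x, relevance, baseline=0.0):
    """Return model scores as features of the flat input `x` are
    successively replaced by `baseline`, in decreasing order of
    attributed relevance."""
    order = sorted(range(len(x)), key=lambda i: relevance[i], reverse=True)
    current = list(x)
    scores = [model(current)]       # score on the unmodified input
    for i in order:
        current[i] = baseline       # 'delete' the next most relevant feature
        scores.append(model(current))
    return scores
```

Comparing the areas under such curves for different XAI methods on the same model gives a standardized, reproducible faithfulness comparison of the kind the text calls for.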

Overcoming Limitations of Studies with Humans
Evaluating XAI methods with humans has limitations. Often, the number of participants that can be recruited for a study is too small to represent the general population. Thus, a study's results may be prone to bias and errors and may not generalise well [161]. Overall, the evaluation of XAI methods in studies with humans is prone to issues such as poor reproducibility and inappropriate statistical analyses, resulting in no solid evidence for their usefulness [150,151,152,162,163,164,165].
Solution Ideas
A potential solution involves augmenting human studies with synthetic data and virtual participants. By creating synthetic datasets that span a wide range of demographic characteristics, behaviours, and preferences, researchers can address the issue of limited sample representativeness. These synthetic datasets can simulate diverse user profiles and scenarios, enabling more robust and extensive evaluations of XAI methods. Additionally, virtual participants, based on AI-driven agents or personas, can be incorporated into studies to provide a broader range of user interactions and perspectives. To enhance the reproducibility and rigour of XAI evaluations, standardized methodologies and statistical analyses must be employed. Researchers should adopt transparent reporting practices and adhere to well-defined evaluation protocols, ensuring that the evidence generated from these studies is solid and dependable. Another approach would be to create sample explanations or schemes for explanations against which generated explanations are checked. While the samples would still need to be tested in studies first, this could alleviate the overall need for studies with humans.

Clarifying the Use of Concepts in XAI
Because XAI is a multidisciplinary research area, another category of challenges is the disparate and unclear use of terms.

Elucidating the Main Concepts
In research on XAI, there is a conceptual ambiguity regarding various terms, such as explainability, interpretability, transparency, understanding, explicability, perspicuity, and intelligibility. This represents a challenge in XAI, as the lack of clear and consistent definitions of terms can hinder progress in developing effective and useful XAI systems. Some researchers have attempted to define an explainable artificial system as one that produces details or reasons to make its functioning clear or easy to understand [166]. Other researchers use terms like explainability and interpretability synonymously [153,167], while others draw major distinctions between them [42]. These differences pose problems for applied research and interdisciplinary collaboration. Discussions about clarifying terms in the field of XAI tend to take two distinct approaches. On the one hand, some contend that attempts to define the terms in question are futile, impossible, counterproductive, or unnecessary, that previous definitions of explainability have failed, and that, in general, the whole endeavour of finding definitions is doomed to failure (for example, [168,169,170]). On the other hand, some attempt to provide explicit definitions, intending to differentiate between the various terms employed (for example, [171,172,148,173,174]).

Solution Ideas
As the lack of a clear and consistent definition of terms related to explainability can hinder progress in developing effective and useful XAI systems, the communication challenges should be addressed holistically rather than perpetuated by ambiguity in the use of terms. Against this background, it seems desirable to join the latter of the above camps and strive for a uniform use of different terms. A minimal solution of this kind would be for authors to always clarify, in their articles, what they mean by certain concepts. A more desirable solution, however, would be to define the various terms once and for all. In this line of thought, meaningful definitions can only be found if already existing ways of usage are considered. Creating completely new usages of the various terms is more likely to contribute to conceptual confusion than to resolve it. The first step in coining a generally applicable definition of the terms is, therefore, to identify current usages of them and to create an overview and comparison of them. For instance, some work identifies relevant notions [148], but limited work exists in comparing them. As a next step, the merit of the various proposed definitions must be determined. For this purpose, quality criteria should be established (see, for instance, [171]), which can be consulted to evaluate each proposed definition.

Clarifying the Relationship Between XAI and Trustworthiness
A similar conceptual challenge exists concerning trustworthiness. Properties like safety, fairness, and accountability are often mentioned for meeting regulatory actions focusing on the trustworthiness of AI. For instance, the Ethics Guidelines for Trustworthy AI, issued by the EU High-Level Expert Group on AI, listed seven requirements for AI-based models and systems to be seen as trustworthy [175]: human agency and oversight; technical robustness and safety; privacy awareness and data governance; transparency and explainability; diversity, non-discrimination and fairness; societal and environmental well-being; and accountability. While XAI has the potential to help with most of these [176], it is taken to help with one of them primarily: transparency and explainability. However, even this relationship is unclear, as many sources contain contradicting statements. In these sources, it is possible to observe various claims about the relationship between trustworthiness and XAI: trustworthiness is seen as a main goal of XAI [166], but XAI is also claimed to be a part of trustworthiness [177]. XAI is purported to change the belief in the trustworthiness of a system [178], while it should also support the trustworthy integration of systems [179]. These are just a few examples, and in other articles, it is possible to find completely different relationships [176,180,148]. One reason for this divergence is that there is no uniform way of using terms like trustworthiness (and other terms in XAI, see 3.4.1). For this reason, as long as it is not clarified what each term describes and what property it expresses, it will not be possible to specify the relationship between XAI and trustworthiness.

Solution Ideas
The relationship between XAI and trustworthiness is widely discussed. We must distinguish between trustworthiness as a property of an AI system, trustworthy AI as a technology enabler for accomplishing responsible and safe AI, and the technical requirements for an AI system to be trustworthy. XAI is identified as one of the seven trustworthy AI requirements [175,181]. On the other hand, XAI must contribute towards achieving trustworthiness. Currently, it is necessary to connect XAI with the fundamental properties of AI trustworthiness for AI risk management and the AI lifecycle, measuring their presence and impact. In this regard, we must highlight the report recently published by the UC Berkeley Center for Long-Term Cybersecurity (CLTC)1. This report aims to help organizations develop and deploy more trustworthy AI technologies, including 150 properties related to one of the seven "characteristics of trustworthiness" defined in the NIST AI RMF2: valid and reliable; safe; secure and resilient; accountable and transparent; explainable and interpretable; privacy-enhanced; and fair with harmful biases managed. Another important aspect is AI governance [182] and the need for governance measures linked to the importance of managing AI risks, another scenario where XAI becomes of utmost importance. These are new fundamental scenarios posing essential challenges for the design, development, and safe deployment of responsible AI systems [181]. In the current debate, XAI is identified as a vital technology to decrease uncertainty and worry about AI systems in society.

Finding a Useful Account of Understanding
Another challenge in bringing about conceptual clarity is finding a useful account of understanding. An obstacle to providing such an account in XAI is the lack of conceptual clarity about what understanding itself is. In philosophy, there are at least three different approaches to this problem. The more traditional view asserts that understanding logically depends on explanation: only true explanations can provide understanding [183,184]. The other end of the spectrum is occupied by philosophers who allow other paths to understanding, even ones that offer distorted or false accounts of their targets [185,186]. Finally, intermediate views exist, allowing that some, but not all, of the pieces of information used to provide understanding can be false [187,188,189]. There is no consensus regarding which of these views is more adequate in the context of AI explanations. For example, while [190] sides with the traditional view, [24] adopts a more pragmatic stance. Another obstacle arises from the fact that the understanding provided by XAI methods that account for singular predictions need not be the same type of understanding provided by proxy or surrogate models that offer a global account of the target AI model. There might be different underlying cognitive processes and abilities involved in each case. Prima facie, the explanation of singular predictions provides a type of understanding that epistemologists call 'understanding-why' [191], while proxy or surrogate models provide 'objectual explanations' [192] of their targets. The relation between the two types of understanding requires clarification, not only from a philosophical perspective but also from the point of view of psychology. In addition, there is a third type of understanding that depends entirely on the functional correlations between inputs and outputs [193]. Functional understanding might be sufficient for most users in many cases of human-computer interaction.
Solution Ideas
Solving the problem of a useful account of understanding in XAI potentially requires a two-pronged approach. On the one hand, conceptual clarity is required. Several recent papers [24,194,195,196,190,197,198,199,26,25] have focused on the relation between explanation and understanding in AI. The conceptual map of this specific problem is now quite clear. Still, future developments will have to respond to new psychological evidence about human-computer interaction and to the development of new XAI methods. On the other hand, empirical work on understanding is essential. For a long time, XAI researchers have tried to ensure that the methods they develop are comprehensible to their peers, a phenomenon referred to as "the inmates running the asylum" [23, p. 36]. The proposed and endorsed alternative is to incorporate results from psychology and philosophy into XAI [17,200,153,201]. Existing theories of how people formulate questions and how they select and evaluate answers should inform the discussion [153].

Supporting the Multi-Dimensionality of Explainability
Another class of challenges for XAI is that explanations are multi-dimensional. In other words, explainability is a concept which has multiple facets and spans a variety of disciplines.

Creating Multi-Faceted Explanations
For regulatory purposes, explanations should depend on and incorporate information about requirements for trustworthy AI systems. In some cases, there is no reason to spend significant resources and effort explaining a decision made by an AI model if such a model is not accurate, not lawful, or not fair. In this line of thought, there have recently been calls stating that different dimensions of trustworthiness (for example, safety, fairness, accountability) should not be shown separately or individually to the audience of a given model or AI-based artefact. For this reason, explanations should be offered to humans not only by explaining the functioning (that means, traditional transparency and explainability) but also by justifying the reliability of the inferences of an AI system (for example, concerning technical robustness, safety, lawfulness, and fairness). If these properties are not considered, explanations will fail to calibrate users' trust correctly. This issue is particularly acute in situations of concept drift or uncertainty.

Longo L. et al. Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions
Solution Ideas
One approach to such multi-faceted explanations could involve developing trustworthiness metrics that encapsulate dimensions like safety, fairness, and accountability. XAI can then be tailored to the trustworthiness level of the AI system, ensuring that less trustworthy models provide extensive justifications for their decisions while highly trustworthy systems may offer simple explanations. Trustworthiness thresholds can be established, triggering detailed explanations when the system falls below a predefined trustworthiness level. Furthermore, dynamic explanations that adapt to context, such as concept drift or uncertainty, can ensure that users' trust remains calibrated. A user-centric approach, allowing customization of explanation depth, would empower users to align the system's explanations with their specific needs. Transparency in the trustworthiness assessment process may enhance user confidence, and continuous monitoring and reporting offer the capability to adapt explanations as trustworthiness metrics change. This comprehensive strategy aims to ensure that trustworthiness considerations are an integral part of the XAI process, leading to multi-faceted explanations. A complementary way to tackle the multi-dimensionality of explanation concerns its operationalisation, which should be performed as is done for other psychological constructs such as 'intelligence' or 'cognitive load' [165]. A solution is to propose a novel, inclusive definition of explainability that is modellable and that can serve as a foundation for the next generation of empirical research in the field. Modellability here means that the definition should contain high-level classes of notions and concepts that can be individually modelled, operationalized, and investigated empirically. The main rationale behind this solution is practical, as the aim is to provide scholars with an operational characterization of explainability that can be parsed into sub-components that, in turn, can be individually modelled. This should motivate the use of quantitative methods for greater reproducibility, replicability, and falsifiability.
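The threshold-triggering idea above can be sketched as follows; the specific thresholds, the min-aggregation over dimensions, and the depth labels are illustrative assumptions, not a proposal from the literature:

```python
# Hypothetical sketch: choose an explanation depth from aggregated
# trustworthiness metrics, triggering detailed justifications when the
# system falls below a predefined trustworthiness level. Thresholds
# and labels are invented for illustration.

def explanation_depth(metrics, threshold=0.8):
    """`metrics` maps dimensions such as safety, fairness, and
    accountability to scores in [0, 1]. The weakest dimension
    dominates, so one poor property cannot be averaged away."""
    overall = min(metrics.values())
    if overall >= threshold:
        return "simple"                  # highly trustworthy: brief explanation
    if overall >= 0.5:
        return "detailed"                # justify the weak dimensions
    return "detailed_with_warnings"      # below acceptable trustworthiness
```

Using the minimum rather than the mean is one defensible design choice here: it operationalises the text's point that a model that is unfair or unsafe should never receive only a simple explanation, however good its other scores.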

Enabling Interdisciplinary Work in XAI
XAI is an interdisciplinary research field [17,18]. For example, through the collaboration of philosophers and computer scientists, XAI is envisioned to ensure the ethical use of AI [18]. However, it is often difficult for researchers of different disciplines to engage in joint research in XAI [4]. There are several reasons for that. First, the rapid increase of publications in XAI makes it difficult for researchers to keep up even with research in their own discipline, such that they often cannot spare the time to engage with research from other disciplines (which also has an overwhelming number of publications) [4]. Furthermore, the different disciplines involved in XAI may have their own established usage of certain terms [172]. This can lead to confusion and difficulty adapting to different usage in XAI. Eventually, for terms for which there is no common usage, different disciplines may establish their own meanings, further leading to confusion.
Solution Ideas
To counteract the information overload caused by a rapid increase in publications, a centralized knowledge-sharing platform for XAI could be established. This platform would curate and categorize relevant research from various disciplines, making it more manageable for scholars to access and engage with research from other disciplines. A crucial aspect of this collaborative platform would involve the development of standardized terminology and glossaries that unify the usage of key terms across disciplines. This would reduce confusion arising from varying interpretations of terminology, ensuring that researchers can communicate effectively and harmoniously. These terms should be updated periodically to accommodate evolving interdisciplinary insights. Moreover, fostering regular cross-disciplinary dialogues and forums can promote mutual understanding among researchers from different backgrounds. Dedicated workshops, conferences, and seminars for interdisciplinary work in XAI could facilitate knowledge exchange and encourage the development of shared research goals and methodologies. Additionally, funding agencies and institutions should incentivize and prioritize interdisciplinary research by offering grants, awards, and recognition for collaborative projects. This would motivate researchers to actively engage in cross-disciplinary efforts in XAI.

Supporting the Human-Centeredness of Explanations
One class of challenges in XAI lies in providing explanations that are specifically adapted to the humans receiving them.

Creating Human-Understandable Explanations
In his seminal paper about explanations in AI and the social sciences, Miller points out that explanations should be social, contrastive, and selective to be understandable to humans [153]. Confalonieri et al. discuss further properties for explanations, including the integration of symbolic knowledge and statistical approaches to explainability [3,202]. Unfortunately, many current XAI methods do not have these properties. In particular, many XAI methods provide explanations that do not extrapolate beyond the domain of their input data. A clear example of this phenomenon is the multitude of gradient-based attribution methods [203,129,130], all yielding explanations in the form of visual heatmaps quantifying the relative importance of every pixel of the input image to the prediction issued by the model. Many contributions assume that such heatmaps are enough for explainability simply because a 'narrative' can be built to relate pixels to concepts that emerge from intuition. There are, however, several problems with this assumption. First, the intuitions in question often come from experts [23], and the presentation of explanations in the form of pixel attributions may not be comprehensible to laypersons [39]. Second, in more complex scenarios, crafting a narrative can become challenging, especially when discriminating between classes relies on intricate distributions of concepts within an image [204,205], or on other semantically defined relations among the entities to which these concepts belong. Third, these narratives are sometimes elaborate guesses at best. Assume a saliency map, serving as an explanation, coarsely highlights the face of a person to classify it as a human. It is unclear whether the underlying classifier used features such as the shape of the face, the skin colour of the face, or characteristics of the face such as mouth and lips, or a combination thereof, to make its inference.
Solution Ideas
Audiences without a technical background are often concerned with concepts, not with data. For instance, in a classifier discriminating between 'dogs' and 'cats', it is significantly more informative for many people to state that 'the shape of the whiskers' is a discriminative concept in the images rather than the relevance of isolated pixels as dictated by a gradient-based attribution technique. In this line of thought, concept-based XAI methods explain individual predictions not as pixel-wise attributions but in terms of semantically meaningful concepts (for example, 'eye', 'red stripe', 'tyre') represented by hidden-layer elements of the neural network. Often, concept-based explanations can be enriched by reference samples from the training dataset. Combining local XAI methods (that means, explaining individual predictions) with global XAI methods (that means, explaining the whole model) might lead to semantically richer and more human-understandable explanations. This 'glocal' approach was taken in concept relevance propagation, an upgrade to LRP, to simultaneously identify concepts learned by the model (global) and match them to each individual input (local) [118]. Enriching explanations with explicit knowledge can enact scenarios in which formal and common-sense reasoning can be used to create explanations that are closer to the way in which humans think. In this line of thought, computational argumentation techniques could be exploited to generate explanations that mimic the way humans reason under uncertainty [206,60,207,208,209,57,58,59]. Another possible solution to create human-understandable explanations is to map explanations to a more interpretable domain. For instance, one approach to providing more interpretable explanations on time series data has been recently explored in [210]. In this context, the explanation is first computed in the time domain, which is the domain in which the model operates. Then, it is mapped through an invertible layer so that explanations can be computed in different spaces. Future research should investigate meaningful invertible mappings, for example, by using autoencoders [211], for this and other domains.
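As a minimal, self-contained instance of mapping an explanation through an invertible transform, the sketch below uses the discrete Fourier transform in place of a learned invertible layer, and gradient-times-input attributions for a linear model in place of a full XAI method; both substitutions are assumptions of this sketch, not the method of [210]. By Plancherel's theorem, the per-frequency attributions conserve the model's prediction:

```python
import cmath

# Time-domain attributions for a linear model f(x) = w . x are w_i * x_i.
# Rewriting the same inner product in the DFT basis gives per-frequency
# attributions: f(x) = (1/n) * Re(sum_k conj(W_k) * X_k).

def dft(seq):
    """Discrete Fourier transform (O(n^2); fine for illustration)."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def frequency_domain_attribution(w, x):
    """Map the attribution of the linear model with weights `w` on
    input `x` into the frequency domain; the attributions sum to the
    prediction w . x (relevance conservation)."""
    n = len(x)
    W, X = dft(w), dft(x)
    return [(Wk.conjugate() * Xk).real / n for Wk, Xk in zip(W, X)]
```

For periodic signals, stating that 'the model relies on the low-frequency trend' is often far more interpretable than a list of per-timestep relevances, which is the motivation for such domain mappings.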

Facilitating Explainability With Concept-Based Explanations
Humans and AI systems make decisions differently. In particular, AI systems, especially those based on DL, often rely on features that are hard for humans to grasp. Humans, on the other hand, use concepts that are coarse-grained representations of reality [212,213]. This difference is often not taken into account when it comes to creating explanations. For example, prominent explainability methods such as LIME or SHAP rely on feature attributions that might reveal little about how an AI model works [35,36]. Concept-based XAI methods go beyond attribution and aim to express human-understandable concepts as part of the explanation, which must first be synthesized from the model to be explained. One benefit of concept-based explanations is that they can aid the insertion of expert knowledge into the learning process of a model, allowing users to impose explicit domain-driven constraints defined as concepts, attributes, and predicates (for example, in so-called Logic Tensor Networks [214]). However, explanations based on human-understandable concepts are still in early development. In particular, concept-based explanations have mostly been elaborated only for classification or regression models, leaving aside other problems and models for which concept-based explanations could be useful. This could be the case for reinforcement learning, in which explanations should inform about how the agent's interaction with concepts existing in the environment produces a series of actions that fulfil the formulated task [215]. Furthermore, limited work investigates XAI methods that aim at synthesizing human-understandable concepts in concrete applications. While some concepts are universal, such as 'every car has tyres and tyres are round', others are more subjective, differ among stakeholders and cultures, and depend on domain knowledge, that is, knowledge related to the training data [216]. Accordingly, a method is needed that is generalisable and applicable across diverse areas and contexts, as one might be interested in using concepts in a personalized way to explain.

Solution Ideas
Creating concept-based XAI requires a multi-faceted approach that takes into account a broad range of sub-problems. It begins with finding reliable ways to extract and identify relevant concepts from data or AI models. For this first step, employing techniques from natural language processing, semantic analysis, and domain-specific knowledge can assist in systematically pinpointing concepts. This systematic identification lays the foundation for offering insights rooted in real-world, comprehensible terms. Next, the concepts must be personalized so that they are tailored to the individual consuming them. Allowing users to define their own concepts would be one way to ensure personalization. Interdisciplinary collaboration and continuous feedback loops could refine these concepts, making them more meaningful and interpretable. A supplementary avenue could be to organize concepts within a hierarchical structure. Such a structure could be useful for delivering explanations that can be tailored to different levels of granularity, allowing users (or the XAI methods) to select explanations that match specific needs. Technical challenges include identifying and minimizing the inaccuracies of synthesized concept-based explanations, which could be tackled by introducing quality metrics for concept-based explanations. Finally, the application of XAI methods based on concept synthesis in different domains and applications is another sub-problem. This might be solved by personalization, as described above.
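The hierarchical organization of concepts mentioned above can be sketched with a simple nested mapping; the concept names and the tree layout are invented purely for illustration:

```python
# Invented example of a concept hierarchy: explanations can be phrased
# at a coarse level ('vehicle') or a fine one ('tyre'). Names and
# structure are illustrative assumptions, not from any dataset.

CONCEPT_TREE = {"vehicle": {"car": {"tyre": {}, "windshield": {}},
                            "bicycle": {}}}

def concepts_at_depth(tree, depth):
    """Collect concept names at a given level of the hierarchy so an
    explanation can be tailored to the desired granularity."""
    if depth == 0:
        return sorted(tree)
    names = []
    for child in tree.values():
        names.extend(concepts_at_depth(child, depth - 1))
    return sorted(names)
```

An XAI method could then pick the depth per user: a layperson receives 'the model detected a vehicle', while a domain expert drills down to 'the model's evidence concentrated on the tyre region'.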

Addressing Explanations Divorced From Reality
The complexity of information flows in increasingly complex AI systems can result in what we call a 'reality drift'. As AI systems are becoming smarter, their decision-making is becoming more intricate. AI systems might start using concepts that are impossible to convey to humans [217,218]. This means that the concepts humans use to understand the world might no longer suffice to describe reality in a meaningful and useful way [219]. Consequently, the workings of such systems would become necessarily incomprehensible to us, and the utility of explanations, which are increasingly divorced from reality, may be questionable. To bridge this gap, one might initially think that new concepts are needed that both humans and machines can use. However, there are differences in how humans and machines store and process information, making the success of this approach uncertain. In general, explanations provided by AI systems may seem plausible to humans but could be detached from actual reality. This raises important questions about the usefulness of explainability in ensuring AI safety, especially when dealing with highly complex AI systems that are hard to decipher [220,110].

Solution Ideas
To address the gap between explanations and reality, one potential solution involves engaging society and implementing regulations that ensure that someone can be held accountable for the performance of AI systems, especially in critical situations3. To achieve this, it is crucial to ensure that explanations are falsifiable (see 3.8.2). Selecting explanation forms based on their falsifiability enables market and legal control over the types of AI systems used. Future systems should also tackle the uncertainty in modelling explanations by incorporating information from ontologies. There are three research directions to consider from here. The first direction explores the proof of the (non-)existence of specific concept properties, such as gap size, robustness, simplicity, and estimability, to mention a few. The second direction focuses on developing adaptive ontology-generation methods to track evolving reality. These methods create adaptable and robust ontologies with computational properties that respect the limitations of human understanding. Basically, this approach would enhance the relevance of explainability in the context of reality drift. The third direction is sociological and deals with updating ontologies within society after adaptations. In addition, when seeking adversarial robustness, it is preferable to establish protectorates at the highest possible level of abstraction in the ontology-generation process for computational efficiency [47]. This comprehensive approach aims to improve the alignment of AI explainability with the dynamic nature of real-world scenarios.

Uncovering Causality for Actionable Explanations
Causality is arguably among the most desired properties when constructing a model from data. In this regard, uncovering causal connections learned by a model via explanations is a fundamental hope associated with XAI [27,10,221]. However, off-the-shelf post-hoc XAI methods fail to disentangle the correlations represented in the learned model from the causation between observed variables and predictions, making it questionable whether the received explanations are suitable for guiding people's actions [222]. Explanations that are purely based on correlations can hinder decision-making when a model's outputs contain important information for action, for instance, the probability of failure of a production facility in industrial forecasting. Actionable and action-guiding explanations derived from causal models are needed in the real world, especially in scenarios where decisions may affect people. To address this issue, counterfactual generation methods for ML models have garnered attention [147]. Contrary to most XAI approaches, counterfactuals attempt to answer why a black-box model leads to a certain prediction by helping users understand what would need to change in the input to achieve a desired output [223]. In this answer, several desired properties should be met, namely, proximity, plausibility, sparsity, diversity, and feasibility [147]. However, most works only regard a subset of these when producing counterfactuals, ignoring challenging issues. These include providing plausibility guarantees for highly complex data or generating diverse samples from largely parametric generative models prone to falling into single modalities. Furthermore, there are few causal approaches in XAI, since finding causal relationships from observational data is extremely difficult [224].

Solution Ideas
To tackle the need for actionable explanations, technological advancements in AI, such as large generative models, can open new opportunities for counterfactual explanations. One assumption is that such advancements can endow the produced counterfactuals with some of the desired properties for explanations, such as proximity, plausibility, sparsity, diversity, and feasibility. This has been approached recently in [225], where counterfactuals are produced by means of an optimization problem formulated over conditional GANs comprising three different objectives: one related to plausibility, another to sparsity, and a third related to feasibility. With initial explorations of diffusion-based counterfactuals being reported in recent research [226,227], questions such as how to sample-efficiently diversify the adversarial outputs produced by these models will become interesting. Another direction worth exploring is how to construct causal graphs relating each input of the model to its output, particularly for high-dimensional data. Most expert knowledge is represented in terms of entities and semantic relationships that inherently encode cause-effect links, as in knowledge bases. The goal in this context is to automatically construct causal graphs for models that do not necessarily operate on concepts or entities but rather on raw data. A potential solution is interfacing learning algorithms with symbolic knowledge about how the world behaves, so that explanations for models grounded on such established causal links are endowed with the sought actionability.
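To make the optimization view of counterfactual generation concrete, the following minimal sketch (using only NumPy) performs gradient-based counterfactual search on a toy logistic model, trading off a prediction-matching term against an L1 penalty that encourages proximity and sparsity. The model, its weights, and the loss coefficients are illustrative assumptions for exposition, not the GAN-based method of [225].

```python
import numpy as np

# Toy differentiable "black box": p(y=1|x) = sigmoid(w.x + b).
# In practice this would be any model whose output we can differentiate or query.
w = np.array([1.5, -2.0, 0.5])
b = -0.2

def predict_proba(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def counterfactual(x0, target=0.9, lam=0.1, lr=0.1, steps=500):
    """Gradient-descent search for a counterfactual x near x0 whose prediction
    approaches `target`. The L1 term (weight `lam`) encourages proximity and
    sparsity, two of the desiderata discussed in the text."""
    x = x0.copy()
    for _ in range(steps):
        p = predict_proba(x)
        # gradient of (p - target)^2 for the logistic model
        grad_pred = 2 * (p - target) * p * (1 - p) * w
        # subgradient of lam * ||x - x0||_1
        grad_l1 = lam * np.sign(x - x0)
        x -= lr * (grad_pred + grad_l1)
    return x

x0 = np.array([0.0, 1.0, 0.0])   # factual instance, predicted "negative"
x_cf = counterfactual(x0)
print(predict_proba(x0), predict_proba(x_cf))
```

The L1 penalty keeps features with little influence on the prediction pinned to their factual values, so the returned counterfactual changes only the few inputs that matter, which is exactly the sparsity property discussed above.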

Adjusting XAI Methods and Explanations
Another class of challenges in XAI is related to adjusting explanations. With the diverse range of applications of AI systems, XAI methods have to produce explanations that fit diverse stakeholders, domains, and goals. However, there is not yet enough research addressing these concerns.

Adjusting Explanations to Different Stakeholders
An explanation can be required by many different kinds of stakeholders during the development, evaluation, and use of an AI system [17]. Each stakeholder brings their own attitudes, preferences, aptitudes, abilities, and previous experiences that influence the kind of explanation they require. Designing and tailoring explanations that are appropriate for each of these stakeholder types, both in terms of content and in terms of format and presentation, is an ongoing challenge. For example, in a business context, the same objective facts must be explained and tailored to the stakeholders' respective interests and objectives. A business person is usually interested mostly in the bottom-line impact of an AI system, a technical person is interested in the process and the validity of the implementation, and a financial person is interested in the cash flow. Adding to that mix, the different educational backgrounds and language used necessarily call for very different explanations for each of the three actors.
Solution Ideas
Future work should investigate new ways to enrich explanations semantically by combining different types of XAI methods and utilizing additional information sources (for example, training data, ontologies, and other modalities). Ideas from personalizing DL models [228] and, more specifically, creating personalized explanations [229] can be helpful. Explanations could also be made interactive. Humans should be able to refine explanations through interaction, as recently advocated in the reinforcement learning community through reinforcement learning from human feedback [230,231].

Adjusting Explanations to Different Domains
The domain and context in which explanations are consumed are critical. For example, explanations for using a self-driving car must differ greatly from those in a clinical decision support system. Each domain brings different assumptions, environments, expectations, and stakes. In self-driving cars, the details about the passengers are not as important, but adherence to regulation is paramount. In contrast, in a clinical situation, the patient details are crucial, but regulation does not (directly) prescribe decisions. Making each explanation universally applicable, precise, and compact means omitting many details that pertain to a domain, thereby sacrificing the explanation's effectiveness. Instead, we take the domain as indispensable and build on it. This makes meaningful explanations dependent on the domain on whose peculiarities and context they are built. In this line of thought, research is starting to emerge that distinguishes between high-stakes and low-stakes domains [232,233,17]. However, the influence of the domain in which an AI system is used has not been fully explored.
Solution Ideas
Domain-specific explanation models should be developed to cater to the unique requirements of various application areas. These models should incorporate relevant knowledge, terminology, and context-specific reasoning to provide meaningful explanations. Furthermore, research efforts should prioritize the development of guidelines and standards for context-aware explanations. These guidelines would provide a structured approach for AI developers to assess the context of use and determine the most suitable explanation strategy.

Adjusting Explanations to Different Goals
Another fundamental challenge is to adjust explanations to what they should achieve when being presented to a stakeholder. For instance, data scientists might want to develop an accurate data-driven model; a regulator might want to assess the fairness of an AI-assisted loan offer; or a loan applicant might want to know the reason behind a rejection [218,234,17]. An underlying assumption is that XAI seeks to achieve these desiderata by improving the mental model that a stakeholder has of an underlying AI system [19,235,236,17]. However, the understanding required for each desideratum might differ, requiring tailored explanations.
Solution Ideas
Adjusting explanations to different goals might not be possible without factoring in the stakeholder that has these goals. Accordingly, one approach is to employ a stakeholder-centric explanation strategy, recognizing that different stakeholders have distinct goals and information needs. For data scientists aiming to improve model accuracy, explanations can focus on technical model details, feature importance, and model performance metrics. Regulators seeking to assess fairness may require explanations related to fairness metrics, compliance with regulations, and potential bias sources. Meanwhile, end-users, such as loan applicants, often require clear, user-friendly explanations that provide transparency regarding AI-driven decisions, allowing them to understand the reasons behind outcomes. This stakeholder-specific tailoring ensures that the goals pursued with explainability are met effectively.

Mitigating the Negative Impact of XAI
Although XAI has noble goals, it might also have negative impacts that need to be avoided or mitigated.

Mitigating Failed Support by XAI
In some domains, especially the medical domain [237], ineffective support by XAI can sometimes be harmful. This has been associated with the so-called 'white-box paradox' [238,239], which urges us not to take the value of the support delivered by XAI systems for granted. There are two possible cases: failed and misleading explanations. The first case might occur when the advice from an AI system is correct, but the associated explanation fails to inform the decision maker positively. This can happen because the explanation is inappropriate or wrong, appearing faulty, irrelevant, or unclear to users [238]. In this situation, users might not accept the correct advice because of inadequate explanations. The second case is perhaps even worse and paradoxical; it occurs when the inference or advice of an AI system is wrong, but the synthesized explanations have sufficient persuasive force to convince users that such advice is correct. In this situation, users are misled and thus potentially prone to mistakes [240].
Solution Ideas
A first option would be to detect failure situations and label them appropriately and reliably. Then, one possible course of action would be not to provide users with any XAI support if this is deemed detrimental or irrelevant in a given setting, for instance, in radiological settings (see [237,239]). Another approach to mitigate failed support by XAI is to challenge the oracular conception of AI support. This conception assumes that AI outputs are judged based on moral categories like right and wrong, with AI-generated explanations serving as aids to help humans determine whether to trust the outputs. However, AI systems were originally conceived as generative and persuasive technologies [241], not oracular ones. This oracular nature can be characterized as an alethic nature, which assumes that machines can, and should, always state the truth [242]. Relaxing the expectation of truthfulness is feasible, especially when dealing with probabilistic outputs or uncertainty estimates from AI systems. To this end, we could introduce a third type of explanation, namely a perorative explanation, alongside the two traditional types of explanations provided by XAI systems: motivational and justificative explanations. In legal terms, peroration refers to the conclusion of a speech or argument, where a speaker summarizes their main points and seeks to persuade an audience of their position. By providing a set of possible explanations for different AI-based inferences, including opposing and contradictory ones, XAI systems enhance accountability among human decision-makers. This approach can be likened to a judicial process, where opposing parties present evidence and arguments before an impartial judge makes the final decision, offering a more balanced perspective than the oracular approach [243,242,244].

Devising Criteria for the Falsifiability of Explanations
Explanations are often requested to clarify issues such as accountability [217,245,201,246]. However, explanations might be wrong. In such a case, parties that did not contribute to a mistake could be held accountable. Unfortunately, there is a lack of clarity regarding when an explanation is incorrect and under what conditions it becomes falsifiable. Falsifiability is a critical element in introducing a commitment to the explanations provided by AI systems and in understanding the potential consequences that follow. Without clear criteria for falsifiability, benchmarks for the correctness of explanations cannot be established, and it becomes challenging to hold AI practitioners accountable for the accuracy of their explanations. In some cases, practitioners may rely too heavily on intuition rather than rigorous methods regarding interpretability. Therefore, what constitutes a ground truth for explainability in benchmarks, and how such ground truths are produced, remain open questions. As a more ambitious goal, we may ask about the discriminability between very different plausible explanations and their ordering with respect to quality and acceptability.
Solution Ideas
Establishing criteria for falsifiability in XAI could draw inspiration from the philosophy of science and related research fields. One potential solution lies in adopting the Popperian notion of falsifiability, a cornerstone of empirical science, as a guiding principle [247]. Within this framework, XAI could systematically integrate hypothesis testing and experimentation to subject explanations to rigorous empirical examination. In this line of thought, some researchers have advocated for a framework that promotes falsifiable research in the field of explainability, emphasizing the need for precision and rigour in evaluating and validating explanations [248]. Additionally, insights from epistemology and cognitive science can inform the development of standardized protocols for evaluating the correctness of explanations, drawing parallels with how empirical claims in the sciences are subjected to rigorous scrutiny. Furthermore, interdisciplinary collaboration between computer scientists, philosophers of science, ethicists, and cognitive psychologists can facilitate the development of a comprehensive framework that incorporates not only empirical falsifiability but also ethical considerations and cognitive principles. By anchoring XAI practices in well-established principles from the philosophy of science and related disciplines, we can pave the way for more robust, accountable, and scientifically grounded explanations within AI systems.

Securing Explanations from Being Abused by Malicious Human Agents
Explainability is an important aspect of human coordination with machines [249,220,236,250]. This is especially true in the near term, where AI systems may not be competent enough for autonomous adversarial behaviour. XAI involves understanding how AI systems arrive at their inferences, decisions, or recommendations and providing a clear explanation of the logic and reasoning behind these outcomes. The effectiveness and adequacy of explainability as a tool for AI safety may be limited in certain scenarios [251]. For instance, AI systems in the hands of malicious human actors can pose significant challenges to explainability [220] through manipulation and adversarial attacks. As an example, employers may systematically discriminate against job applicants by using socially misaligned ML models while serving borderline-plausible explanations to avoid detection.

Solution Ideas
The need to discriminate between different explanations ties directly to the falsifiability of explanations (see 3.8.2). Furthermore, concept-based explanations could help combat adversarial attacks, especially a recent form of such attacks that aims to trick both humans and classifiers [131]. For example, a malicious sample might be detectable by comparing a concept-based explanation of an adversarial sample with that of a non-adversarial sample. However, this is challenging because explanations can also be manipulated and used to trick or deceive [252]. Similarly, another application context is forensic analysis, which aims to understand the concepts learned by a classifier [253]. Concept-based explanations could also be helpful for reflective learning from data [254], which means classifiers can be improved by processing explanations during training.
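The comparison of concept-based explanations suggested above can be sketched in a few lines. In this hypothetical example, each sample is summarized by a concept-activation profile (a stand-in for what a method such as TCAV would produce), and a sample is flagged when its profile diverges from the reference profile of its predicted class; the profiles, the score, and the thresholds are all synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concept-activation profiles: each row scores how strongly a
# sample expresses a set of human-understandable concepts (e.g. "stripes",
# "fur", "wheels"). Here they are synthetic rather than extracted from a model.
clean_profiles = rng.normal(loc=[0.9, 0.8, 0.1], scale=0.05, size=(100, 3))
reference = clean_profiles.mean(axis=0)  # typical explanation for the class

def concept_anomaly_score(profile, reference):
    """Cosine distance between a sample's concept-based explanation and the
    reference explanation for its predicted class; large values suggest the
    prediction rests on concepts alien to that class."""
    cos = profile @ reference / (np.linalg.norm(profile) * np.linalg.norm(reference))
    return 1.0 - cos

clean = np.array([0.88, 0.82, 0.12])     # explanation consistent with the class
suspicious = np.array([0.1, 0.2, 0.95])  # same prediction, alien concepts

print(concept_anomaly_score(clean, reference))       # small
print(concept_anomaly_score(suspicious, reference))  # large
```

As the text cautions, an attacker who can also manipulate the explanation pipeline could defeat such a check, so this is a detection heuristic rather than a guarantee.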

Securing Explanations from Being Abused by Malicious Superintelligent Agents
Explainability is an important aspect of AI safety. Many of the challenges highlighted in the literature [255,110] and here already showcase fundamental limitations on the human ability to understand the behaviour of current AI systems. However, assuming no constraints on their design or physical limitations, future AI systems may become so competent that understanding them becomes fundamentally impossible. Exacerbating this issue, using formal verification to guarantee benign behaviour may not be viable due to unverifiability [256]. In independent domains where AI agents are non-adversarial, these issues are not of much concern. At worst, we are in a situation where cooperative AI agents work for us and tell us fairytales that make us content. However, when it comes to adversarial scenarios, the question is whether our assimilated explanatory concepts are adversarially robust, that is, guarded by computational protectorates that leave few exposed loopholes. As the complexity and capabilities of AI agents increase, these agents may discover ways to deliberately fool people by exploiting the tension between the explanatory concepts that emerge from human capabilities, perception, and action and those that complex agents can utilize. If such were the case, and humans relied only on explainability for safety, malignant gain by AI agents could be unbounded [110].
Solution Ideas
Explainability can be an effective tool for ensuring the safety of AI systems, even in the long term, assuming that the problem of alignment between the technical capabilities of XAI methods and their application and utility for humans is solved (see 3.6.3). The effectiveness and adequacy of explainability as a tool for AI safety may be limited in certain scenarios [251]. For this reason, explainability should be only one part of every safety toolkit, as it has strengths and limitations that need to be complemented in a portfolio of approaches. Work on building such a portfolio is welcome, as there is a growing need. This line of work is more long-term and can be solved only partially by constructive approaches. At the same time, the other part would be restraint in building powerful superintelligent systems without strong reasons to believe they are aligned with our values.

Improving the Societal Impact of XAI
Research on explainable AI and the derived methods, models and techniques used to create real-world applications can impact society.

Facilitating Originality Attribution of AI-Generated Data and Plagiarism Detection
A special challenge for the explainability of novel generative models, which we think warrants separate mention, exists with respect to originality attribution and plagiarism detection for AI-generated data. Concerning the problem of originality attribution, pieces of art produced by generative models have recently been taken to exhibit a level of creativity similar to that of humans. In particular, contests won with AI-generated art have stirred controversy concerning the intellectual property of the output of a model learned from third-party data [257,258]. Likewise, plagiarism detection is becoming central for LLMs that excel across different domains, for example, ChatGPT [259]. The massive usage of these models to produce apparently original textual content has disrupted the idea of plagiarism, as such content has been shown to easily evade mainstream tools for plagiarism detection. Thus, whether the information that biases the generative process, such as the prompt in a language-to-image stable diffusion model, is sufficiently original for intellectual property and author-attribution claims remains an open question.

Solution Ideas
Regarding originality attribution, one solution is to reformulate the concept of authorship in these models, both from the technical point of view and from the legal and regulatory perspectives. Explainability should play a part in future regulation, as explanations could reveal which instances or parts of the modelled data distribution are relevant for a given synthesized output of the model. Solutions should be devoted to understanding whether generalisation implies any form of plagiarism or whether it is a new form of inspiration, interfacing creative thoughts with original synthesized content. On the topic of plagiarism detection, efforts have been made recently to determine whether the content produced by AI models is artificially generated, proposing the inclusion of tailored tokens, for example, watermarking [260], in the produced content [261]. Explainability techniques will be relevant in determining which learning instances were most influential in producing a given outcome.
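As a rough illustration of the token-based watermarking idea referenced above, the toy sketch below seeds a pseudorandom 'green list' of vocabulary tokens on the previous token and scores a text by the fraction of tokens that land in their context's green list; the vocabulary size, hash-based seeding, and toy generator are all illustrative assumptions, not the scheme of [260].

```python
import hashlib
import random

def green_list(prev_token, vocab_size=1000, fraction=0.5):
    """Pseudorandom 'green' subset of the vocabulary, deterministically
    seeded by the previous token (a toy watermarking key)."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    prng = random.Random(seed)
    return set(prng.sample(range(vocab_size), int(vocab_size * fraction)))

def green_fraction(tokens):
    """Fraction of tokens falling in their context's green list; watermarked
    text should score well above the ~0.5 expected by chance."""
    hits = sum(t in green_list(p) for p, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

# A toy "watermarked" generator that always samples from the green list,
# versus unwatermarked uniformly random tokens.
gen = random.Random(42)
watermarked = [0]
for _ in range(200):
    watermarked.append(gen.choice(sorted(green_list(watermarked[-1]))))
unmarked = [gen.randrange(1000) for _ in range(200)]

print(green_fraction(watermarked))  # close to 1.0
print(green_fraction(unmarked))     # close to 0.5
```

In a real deployment the generator would only bias, not force, green-list tokens, and detection would use a statistical test on the green fraction rather than a fixed threshold.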

Facilitating the Right to Be Forgotten
Large-scale generative models require vast amounts of data, often amounting to several hundreds of terabytes, to fit their trainable parameters. Such a huge data substrate may clash with a fundamental right in data governance: the right to be ignored or forgotten by data-driven models. While interest in the machine unlearning paradigm [262] has been on the rise [263], it is unclear how to efficiently ensure that data owned by a certain user is unlearned by a given large-scale generative model, so that no instance resembling that of a user claiming their right to be forgotten will be produced when the model is queried.

Solution Ideas
The right to be forgotten could be supported by similarity-based explanations and by incrementally retraining the model to avoid sampling the parts of the subspace close to the forbidden data. XAI can also play a pivotal role by explaining model decisions and revealing which data points influenced those decisions. This transparency can empower users to identify the data instances that relate to them, enabling them to exercise their right to be forgotten. Additionally, XAI can aid in auditing and verifying that the unlearning process is carried out effectively, reassuring users that their privacy rights are upheld. In general, humans should be granted the chance to verify that a generative model does not learn from them.
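A similarity-based audit of unlearning, as suggested above, can be sketched as follows: samples drawn from the model after unlearning are checked for proximity to the data the user asked to forget. The shared embedding space, the synthetic data, and the distance threshold are illustrative assumptions; a large minimum distance is weak evidence of unlearning, not a guarantee.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical audit inputs: the user's data points the model was asked to
# forget, and samples drawn from the generative model after unlearning,
# both represented in some shared embedding space (an assumption here).
user_data = rng.normal(loc=5.0, scale=0.1, size=(20, 8))
model_samples = rng.normal(loc=0.0, scale=1.0, size=(500, 8))

def min_distance_to(forbidden, samples):
    """Smallest Euclidean distance from any generated sample to the
    forbidden data points."""
    d = np.linalg.norm(samples[:, None, :] - forbidden[None, :, :], axis=-1)
    return d.min()

threshold = 2.0  # domain-specific choice; an assumption, not a principled bound
print(min_distance_to(user_data, model_samples) > threshold)
```

Such a check could complement, rather than replace, influence-based explanations that trace which training instances still shape the model's outputs.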

Addressing the Power Imbalance Between Individuals and Companies
A significant issue in XAI is that efforts to guarantee more transparency of AI systems are often not enough to mitigate, or even address, the problem of unfair AI systems that exacerbate the societal power imbalance between individuals and the companies using AI systems [264,265,218,266]. In other words, explaining the logic of an algorithm might be essential to empowering individuals to understand how to react to unreasonable AI-driven systems, especially when those systems take automated decisions that can legally or similarly significantly affect individuals. Still, explainability is often hard to achieve in practice and limited in scope. The capability to understand 'why' a certain automated system followed a path from some inputs to some outputs may not be enough to empower individuals if that path was logically correct but legally or ethically disputable. Explanations are not enough if they are not accompanied by accountable systems of contestability [267] and by justificatory statements that could prove why the 'path' from inputs to outputs is not only logically correct but also non-discriminatory, non-manipulative, non-illegal, and non-unfair [268,269]. Therefore, acting only at the level of individual 'reactions' to the outputs of automated decision-making, including understandability, contestability, and justifiability, fails to completely address the main societal and ethical problems behind unfair and untrustworthy AI. The XAI community should shift its focus to tackle the power imbalance between AI developers or controllers and those affected by AI [270,271]. The power imbalance is a structural problem, and the way AI increases such an imbalance cannot be faced with more explainability alone. There is a broader problem of under-representation, hidden discrimination, and lack of accountability [272]. Current XAI methods respond to this issue, but they can address only a small part of the problem [273].
Solution Ideas
To address the power imbalance between individuals and companies in the realm of XAI, a new approach to designing future AI systems via XAI methods can include participative design, where impacted stakeholders are invited into the decision-making process [272,274]. There are different modalities of the participative approach to AI design, but an essential consideration is the participative impact assessment [275]. Vulnerable impacted stakeholders should be included, through an open and circular approach, in the key value-sensitive decisions in AI design [276]. Following the example of environmental impact assessment or workers' participation in business decision-making [277], there are different ways in which digital users, individually, in groups, or through representatives, can participate in data-processing decision-making or in the design of data-driven technologies.

A Novel Manifesto
We conclude this article by presenting a manifesto for XAI. This manifesto aims to define and succinctly describe the open challenges scholars in the field face. It includes propositions governing independent scientific research. The manifesto is a mechanism for shaping our shared visions about science in the field of XAI, and it is the outcome of the engagement of diverse expertise and different experiences by its authors. Its nine high-level challenges are summarized in Figure 1.
We believe working together as a community will lead to more productive and up-to-date work, increase reliability, and enhance falsifiability. The spirit of close collaboration, even among scholars with different scientific backgrounds and disciplinary focuses, along with respect and the willingness to build on each other's work, will certainly inspire more scholars to join us in advancing eXplainable Artificial Intelligence as a field. This manifesto is a genuine attempt at this, and an exciting opportunity for shaping the future of AI-based systems for the benefit of human society.

Figure 1: A manifesto for eXplainable Artificial Intelligence (XAI): High-level challenges

1. Creating Explanations for New Types of AI: To create explanations for generative models (for example, LLMs) and for concept-based learning algorithms.
2. Improving (and Augmenting) Current XAI Methods: To augment and improve attribution methods, remove artefacts in synthesis-based explanations, and create robust explanations.
3. Evaluating XAI Methods and Explanations: To facilitate the human evaluation of explanations, create an evaluation framework for XAI methods, and overcome limitations of studies with humans.
4. Clarifying the Use of Concepts in XAI: To clarify the main concepts in XAI and its relationship to trustworthiness, and to find a useful account of understanding.
5. Supporting the Multi-Dimensionality of Explainability: To create multi-faceted explanations and enable interdisciplinary work in XAI.
6. Supporting the Human-Centeredness of Explanations: To create human-understandable explanations, facilitate explainability with concept-based explanations, address explanations divorced from reality, and uncover causality for actionable explanations.
7. Adjusting XAI Methods and Explanations: To adjust explanations to different stakeholders, domains, and goals.
8. Mitigating the Negative Impact of XAI: To mitigate failed support by XAI, devise criteria for the falsifiability of explanations, and secure explanations from being abused by malicious human or superintelligent agents.
9. Improving the Societal Impact of XAI: To facilitate the originality attribution of AI-generated data and plagiarism detection, support the right to be forgotten, and address the power imbalance between individuals and companies.