Assessing societal effects: Lessons from evaluation approaches in transdisciplinary research fields

Achieving societal effects is crucial for transdisciplinary research. In this article, we present key characteristics of impact evaluation of transdisciplinary research. We compare different approaches in sustainability, public health, and development research to advance joint learning and define common challenges.

GAIA 32/1 (2023): 178-185 | FOCUS: CREATING SPACES AND CULTIVATING MINDSETS FOR TD | RESEARCH

In addition, we use "impact" in established terms such as impact evaluation or research impact.
The fields of sustainability, public health, and R4D are closely related in terms of their research approaches, which provides an opportunity for mutual learning by comparing experiences of designing impact evaluations to measure the societal effects of TDR (OECD 2020). The author team, whose members have experience with TDR in these fields, conducted exploratory research on impact evaluation approaches across fields. As a kickoff, we convened an international interdisciplinary workshop at the 2021 International Transdisciplinarity Conference (ITD) and invited contributions from transdisciplinary researchers representing these fields (Kny et al. 2021). Insights were expanded through a narrative literature review of key publications based on the authors' knowledge, without a claim to completeness (Greenhalgh et al. 2018), and through our own experiences as TDR evaluators. This paper provides an overview of the key characteristics of different approaches to evaluating the effects of TDR, discusses the commonalities and challenges in applying such approaches, and suggests how these challenges can be overcome.

Transdisciplinary research and impact evaluation in the three fields
Public health began employing participatory approaches to evaluate community projects in the 1980s. This fostered collaboration and acknowledgement that public health operates in a complex problem domain, which led to calls for TDR approaches to address health-related social injustice (Abrams 2006). Several tools and frameworks have since been developed to map complex impact pathways, including realist evaluation (Pawson and Tilley 1997) and the community-based participatory research model (Wallerstein and Duran 2010).
In R4D, early assessments of research impact in the 1970s were based on technology-centric research, focusing on innovation and adoption processes as well as their projected effects to improve livelihoods and food security (Pingali 2001). These quantitative assessments of technological innovations (research outputs) are primarily randomized control trials (Stevenson and Vlek 2018). R4D programmes that involve multiple disciplines and stakeholders pose challenges for applying randomized control trials in their evaluation, because interventions occur in multiple emergent change processes, which are not always replicable (Belcher and Hughes 2020, Adler et al. 2018). Consequently, qualitative and theory-based approaches to evaluation that trace the contributions of research activities, outputs, outcomes, and impacts have gained traction in evaluations of R4D (Belcher et al. 2020, Douthwaite et al. 2017). Similar concerns have been raised in public health, and both research fields now use more qualitative as well as theory-based approaches to evaluation that combine different methods depending on the context in order to meet a variety of evaluation needs (Belcher and Hughes 2020, Jagosh 2019).
Impact evaluation of sustainability research lags significantly behind the other two fields (Plummer et al. 2022). Transdisciplinary forms were only initiated in the 1990s, thereby highlighting the need for close collaboration between science and practice to contribute to solving real-world problems. It is assumed that a well-designed TDR process involving relevant scientific disciplines as well as experience and real-world knowledge from nonscientific actors leads to results with a high potential for societal effects (Lux et al. 2019, Hansson and Polk 2018). Lang et al. (2012) emphasise tracking scientific and societal effects as a key design principle for transdisciplinary sustainability research, and mention challenges such as timing, attribution, and measurability of effects. Further, several conceptual, methodological, and empirical contributions have been made recently in designing impact reflection and assessment approaches that resemble approaches in R4D and public health in building on qualitative and theory-based evaluation (e. g., Williams 2019, Munaretto et al. 2022).
Our review of the history of impact evaluation across the three fields showed that the approaches have begun to converge over time. In the remainder of the paper, we therefore focus on theory-based evaluation, realist evaluation, and community-based participatory research as relevant approaches applied to assess research impact across the three fields. We provide more detail on the similarities and differences in applying these methodological approaches in order to enable transdisciplinary and evaluation practitioners to consider how they can be used. In addition, we include examples from particular approaches and fields to illustrate relevant differences. Our analysis is limited to evaluation approaches that focus on TDR.

Methodological approaches for impact evaluation of transdisciplinary research
We find that the different approaches applied to evaluate the effects of TDR have adopted similar principles for the appropriate design of TDR evaluation (box 1).

BOX 1: Three approaches to assess transdisciplinary research impact
Theory-based evaluation (TBE) aims to test the research team's hypotheses regarding how their project/programme will contribute to tangible social, economic, and environmental benefits by assessing its contributions to influence various actor groups.
Community-based participatory research is an approach that can be used in both TBE and RE. It aims to engage those whose lives are affected by the project/programme by involving them in the research at all stages in order to increase the likelihood that effects will occur for systems and communities.

Conceptual framing
In all three fields, approaches applied to evaluate TDR draw on theory of change (ToC) thinking. A ToC maps the assumed relationships between activities and short-, medium-, and long-term effects of an intervention, thereby making explicit the assumptions regarding why and how change occurs (Weiss 1995, Claus et al. 2023, in this issue). As the basis for the development of ToCs, most approaches differentiate effects which can be achieved during the research process from those that occur at its end and beyond, thereby modelling explicit steps and assumptions for how research contributes to changes in the broader problem context (Schäfer et al. 2021, Luederitz et al. 2017, Wiek et al. 2014). ToC is the main analytical framework in theory-based evaluation approaches (Nagy and Schäfer 2022, Munaretto et al. 2022, Schneider et al. 2019, Williams 2019, Van Drooge and Spaapen 2017). Moreover, theory-based evaluation is an approach that aims to test the hypotheses of the research team regarding how a project/programme will contribute to tangible social, economic, and environmental benefits by assessing the effects on different actor groups (Belcher et al. 2020, Walter et al. 2007).
Programme/project ToCs can be constructed by combining sources of knowledge, including practical experience and academic research. Ideally, one or more impact pathways are modelled in collaboration with relevant actors at the outset, which determines the information that needs to be collected to test the anticipated links between activities and effects to support ongoing reflection and learning (Oberlack et al. 2019). With regard to the three fields, we found that ToCs can be refined by including scientific and non-scientific actors, but there is variability in application, participation, and how theory is used to produce impact pathways. In R4D and sustainability research, we find examples of ToCs being developed by the research team and close collaborators to surface assumptions and hypotheses regarding research contributions to change; however, they are rarely grounded in existing social-change theories. In sustainability research, different theoretical lenses may be added to conceptualise the specific analytical framework (e. g., transition theory) (Williams 2019).
Although public health also utilises the theory-based evaluation approach and ToC as an analytical framework, over the past ten years, the field has increasingly applied a variation termed realist evaluation (Pawson and Tilley 1997). Realist evaluation proposes that responses are context-dependent and documents how groups may respond favourably in one set of circumstances but may not respond at all in another. Therefore, the approach can be useful for decision-makers by enabling them to consider whether the particular context in which they are going to implement a programme or policy considers the mechanisms that may generate effects. Realist evaluation generates an initial theory proposing the elements that are required to generate favourable responses; however, it differs from theory-based evaluation in that it continues to test and refine the theory during data collection and analysis. The relationships between context, mechanisms, and effects are documented to identify patterns that are used to predict what intervention will work for which actor group and in what circumstances in order to enable informed decisions regarding the implementation of the activity across different contexts (Pawson and Tilley 1997). Further, realist evaluation also explicitly engages with various theories to model effects: candidate theories are presented to research teams and a theory that resonates is either selected or constructed from elements of two or more theories (a hybrid theory). The mechanisms that lead to activities producing effects in specific contexts are described in detail. The focus of interest in realist evaluation differs from ToC, because it produces a theory explaining how, when, and why effects are generated in different groups, situations, and contexts rather than aiming to describe how mechanisms work with regard to a single project/programme in a certain context.
In addition, the compared approaches utilise different typologies to describe the effects of TDR activities based on growing distance in time and space (e. g., within/close to/beyond project context) (Schäfer et al. 2021, Wiek et al. 2014, Walter et al. 2007) or relevant actor groups and institutions (Beckett et al. 2018, Belcher et al. 2020). Hence, these evaluation approaches recognise that in complex systems, the relative influence of research activities declines as interactions with other actors and processes increase (Belcher and Halliwell 2021). Thus, these authors and others distinguish what is within the sphere of control, influence, and interest of a TDR activity. The concept of impact pathways, which qualifies the relationships between activities, outputs, and effects to illustrate the main mechanisms through which TDR contributes to change, is also employed in all the approaches. In contrast to older concepts of linear pathways of knowledge transfer, these approaches acknowledge impact pathways to be nonlinear; that is, they include interactions, interdependencies, and feedbacks within and between various levels and context conditions.
It is possible in both theory-based evaluation and realist evaluation to utilise an approach referred to as community-based participatory research. Community-based participatory research is a paradigm that can be applied across any research design and posits that participation of the individuals who will be affected by the research fundamentally affects the research in several ways.
Their contributions can produce more complete information regarding possible research effects. Involving communities also increases the likelihood that findings will be relevant to specific contexts and more likely to be used by policymakers to improve, for example, population health and the conditions affecting it. In addition, communities are involved at all stages of a project, which produces effects for the individuals, groups, and organisations who are part of the research team, thereby increasing the likelihood that there will be effects on communities and systems throughout the duration of the project (ICPHR 2020).

Data collection and analysis
Impact evaluation approaches in the three fields apply a mix of qualitative and quantitative methods to tackle the variety and complexity of contexts and societal effects. Qualitative methods are particularly valued because they explain the mechanisms of how and why effects are achieved, which addresses a key shortcoming of quantitative methods. To be able to learn from cases in a certain context, it is necessary to describe contexts qualitatively, compare them, and draw conclusions for adapting the TDR design to other contexts (Nagy et al. 2020, Munaretto et al. 2022).

TABLE 1 (qualitative indicators, continued from the entries on inclusivity and transparency):
recursiveness: level of satisfaction of relevant stakeholders with the adaptation of the TDR process due to changes of context conditions
during the TDR process: level of satisfaction of the participants with the quality of the training, knowledge exchange, and dissemination events; level of satisfaction with the quality of the cooperation in networks which result from the TDR process
at the end of the TDR process: level of satisfaction with the range of relevant stakeholders and the comprehensibility and appropriateness of knowledge dissemination materials
level of satisfaction of the target groups with the compatibility of the modified practices (e. g., energy saving) with daily routines
level of satisfaction of the target groups with the appropriateness of health prevention measures and their ability to use them
level of satisfaction of the addressed communities with the standard of living (e. g., income and health) compared to baseline
level of perceived improvement of image of an industry, municipality, or organisation due to reduction of CO2 emissions
level of satisfaction with the quality of the service provided
[...] Belcher et al. 2020, Schäfer et al. 2021, Williams 2019). Note: Baseline here refers to the state prior to transdisciplinary research (TDR).
In all approaches, the (re)actions of actors (i.e., individuals or organisations) are collected, primarily via qualitative/semi-structured interviews, documents, surveys, workshops, and observations, to describe what happens (or not) during or in response to the TDR process. In realist evaluation, this is often done using both academic researchers and community-based researchers with experiential knowledge of the problem and better access to target groups. Data can be collected during TDR activities to understand how the process supports the realisation of anticipated effects and, additionally, allows for adaptive management (Verwoerd et al. 2020, Munaretto et al. 2022). Alternatively, data can be collected when the project is concluded. Regardless of the approach, the resources available and the funder's objectives determine which methods are applied for data collection and validation of results and effects.
The purpose of a TDR project/programme is decisive for which data should and can be collected to demonstrate the effects of TDR (Munaretto et al. 2022). TDR often aims to contribute to ambitious effects, for example, a decline in CO2 emissions, a higher number of patients engaging in preventive health production, or higher household incomes in target communities. To know and observe whether targeted changes occur, qualitative and quantitative indicators can provide a basis for monitoring progress within a specific context. Apart from indicators for the quality of the TDR process and its outputs, such indicators can include those which trace the changes in individual and organisational practices, or the change in state and conditions prompted by the changed practices. Table 1 provides examples of possible indicators that could be adapted to measure the intended effects of a particular TDR project/programme.
Data are analysed in accordance with different aims in theory-based evaluation and realist evaluation. In theory-based evaluation approaches, the analysis of research effects typically focuses on how TDR activities contribute to the intended effects. It considers which design and implementation characteristics of the TDR process support the emergence of effects and whether higher-level effects are likely to manifest in the future. The analysis tests the anticipated linkages between activities, outputs, and effects in the original model. In contrast, realist evaluation analysis focuses on what produces or supports the effects given different sets of conditions, for example, for whom, in what circumstances, to what extent, in what contexts, and how. The analysis aims to identify patterns that are then compared with the original logic model, thereby producing configurations of context, mechanisms, and outcomes to refine the impact theory.
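As a minimal illustration of how a quantitative indicator might be monitored against a baseline, the following Python sketch computes absolute and relative change. The class name, the indicator names, and all figures are invented placeholders for illustration; they are not data or tooling from any of the evaluations discussed here.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    """One quantitative indicator observed before and after a TDR activity."""
    name: str
    baseline: float  # value prior to the TDR activity (hypothetical)
    observed: float  # value at the time of evaluation (hypothetical)

    def absolute_change(self) -> float:
        """Difference between the observed value and the baseline."""
        return self.observed - self.baseline

    def relative_change(self) -> float:
        """Change as a fraction of the baseline; undefined for a zero baseline."""
        if self.baseline == 0:
            raise ValueError(f"baseline of '{self.name}' is zero")
        return self.absolute_change() / self.baseline


# Invented example values, for illustration only.
indicators = [
    Indicator("CO2 emissions in target region (t/year)", 1000.0, 900.0),
    Indicator("average household income", 200.0, 250.0),
]

for ind in indicators:
    print(f"{ind.name}: {ind.absolute_change():+.1f} ({ind.relative_change():+.1%})")
```

A change computed in this way only describes what happened alongside the TDR activity. It does not by itself attribute the change to the research, which is why such figures are usually read together with qualitative accounts.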
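The bookkeeping behind such context-mechanism-outcome configurations can be sketched in a few lines of Python. This is only an assumed, simplified illustration: the coded labels below are invented, and an actual realist analysis is an interpretive process that goes well beyond counting.

```python
from collections import Counter, defaultdict

# Hypothetical coded observations: (context, mechanism, outcome).
observations = [
    ("rural", "peer support", "uptake"),
    ("rural", "peer support", "uptake"),
    ("rural", "information only", "no change"),
    ("urban", "peer support", "no change"),
    ("urban", "information only", "uptake"),
]


def cmo_patterns(records):
    """Tally the outcomes observed for every (context, mechanism) pair."""
    patterns = defaultdict(Counter)
    for context, mechanism, outcome in records:
        patterns[(context, mechanism)][outcome] += 1
    return patterns


patterns = cmo_patterns(observations)

# List each configuration with the outcomes recorded for it.
for (context, mechanism), outcomes in sorted(patterns.items()):
    print(context, "|", mechanism, "->", dict(outcomes))
```

Tallies of this kind can then be compared with the initial programme theory to see in which contexts an assumed mechanism appears to hold.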

Challenges in evaluating the effects of transdisciplinary research
Effects are assessed with similar motivations across all approaches: to produce knowledge that supports the development of efficient activities to tackle a specific complex problem. However, the combined findings from our workshop, the literature review, and evaluation experiences revealed that TDR teams across fields encounter considerable challenges in demonstrating whether, when, and how research has contributed to change. We found three common challenges: 1. evidencing causal claims, 2. including diverse perspectives, and 3. continuous monitoring and evaluation to support adaptations in research design and implementation. We suggest a few possible solutions derived from the compared fields and approaches to tackle each challenge.

Evidencing causal claims
It is difficult, if not impossible, to say what would have happened without a TDR activity. The unique, emergent, and non-discrete nature of TDR poses challenges for making statistical inferences and generalisations regarding its effects (Belcher and Hughes 2020). Consequently, testing a counterfactual in TDR evaluations mostly relies on the perceptions of the actors in the system, and these perceptions are usually understood through interviews or observation of meetings (Walter et al. 2007). Those external to the project will often attribute the change to more than one source of knowledge, which demonstrates that the direct effects of TDR become difficult to trace among other competing factors (Belcher et al. 2017).
To overcome the inherent limitations of relying on individual perceptions, data can be triangulated by data source and by using different investigators and methods to collect data. In theory-based evaluation approaches, for example, the intended effects (and corresponding proxy indicators) determine the data sources that can be used to measure them. Those who are expected to do something differently as a result of a TDR process are interviewed, and additional documents (e. g., government reports) can be sourced to refute or validate claims; however, it is important to note instances of conflicting information (Williams 2019, Belcher et al. 2020). In realist evaluation, different investigators can interview participants under the assumption that respondents may share different information with different people. For example, certain realist evaluation processes engage community-based interviewers in addition to academic researchers (Jagosh 2019). Similar methods, for example interviews and stories, can be employed to identify data convergence in more and less structured accounts of the same phenomenon. In both approaches, workshops can be convened to make sense of the interpretation of findings, soliciting feedback and validation from academic and community researchers, policymakers, and other actors. As stated earlier, whether it makes sense to collect quantitative data or use official data (e.g., use of preventive health programmes, number of households provided with energy, etc.) to complement interview data depends on the specific purpose of the TDR.

Including multiple perspectives on effects
The impact assessment of TDR should aim for the inclusion and representativeness of different perspectives (Reed et al. 2021). However, perspectives can conflict and diverge among diverse groups, as consensus cannot be assumed. For example, evaluation approaches in transdisciplinary sustainability research often do not explicitly deal with questions of power distribution (as a basis to identify who should have a voice) (Fritz and Meinherz 2020). Including a variety of perspectives reduces the risk that only "representatives" are consulted regarding effects and those affected are not given voice. To achieve this, realist evaluation approaches, which can deliberately seek differences, can be employed to manage multiple perspectives. Similarly, community-based participatory research aims to identify all those with a stake, accommodating multiple perspectives via discussion and negotiation. By focusing on underlying assumptions and mechanisms, theory-based evaluation and realist evaluation approaches seek to uncover the unexpected positive and negative effects of the TDR process. The ToC enables context-responsive evaluation to create space for the involvement of target groups to express how a TDR process has influenced their lives. Informants are posed the question directly, which helps to identify elements in the process that could have been improved.

Sustaining continuous monitoring and evaluation
Participants in our interdisciplinary workshop agreed that numerous scientists are trained in conceptualising linear research designs (i.e., formulating a research question, outlining methods for investigation, and describing the output produced from the research). Funders often expect that research plans will be followed without deviation. Since TDR deals with complex real-world problems, some of the compared approaches argue that it is key to consider changes in contextual conditions throughout the research process (e. g., windows of opportunity opening due to a changed actor constellation or introduction of favourable/hindering regulation). Repeated reflection regarding the intended effects and the possibilities to achieve them, built into the evaluation process, can enable adaptation of the research design according to the challenges at hand. However, embedding a cycle of reflection, learning, and adaptation into a project is challenging if researchers and other involved actors are unaware of the benefits (Pineo et al. 2021). Further, it has been widely acknowledged that a large number of researchers lack training in involving lay people in the academic research process (Staniszewska et al. 2018).
There are several solutions to these challenges that can be found in TDR evaluation practices. Public health projects that employ community-based participatory research aim to include academics and members of the wider evaluation team in regular reflection and adaptation of TDR processes as a key component. However, this is not necessarily done comprehensively. Ideally, systematic action learning cycles would be written into the research plan from the outset, where researchers and nonscientific actors receive training to engage in continuous monitoring and reflexivity to assess societal effects. In theory-based evaluation, the use of ToC is intended to be iterative in order to ensure that the research strategy is aligned to the anticipated effects on an ongoing basis (Oberlack et al. 2019). From our own experiences and those of the workshop participants, this is rarely the case in practice, which hinders effective adaptive research management. In sustainability research, there are initial attempts to include formative evaluation in research processes to enable periodical adaptation of the research design (Nagy and Schäfer 2022, Schäpke and Beecroft 2022). These encompassing approaches place high demands on involved actors with regard to time resources, skills, and motivation. Thus, financial compensation might be necessary for certain actor groups. Thus far, institutional barriers, such as the lack of incentives, mechanisms, and limited skills in formative evaluation, often appear to prevent the establishment of continuous monitoring and evaluation.

Conclusion
Our comparison shows that impact evaluation approaches employed in the fields of public health, development, and sustainability research are rooted in rather individual discourses, but recent trends toward TDR promote convergence. Increasingly, impact evaluation of TDR that is iterative and complexity-aware and that aims at formative learning and adaptive research management is being developed, refined, and applied. Such flexible approaches enable impact assessments to meet the dual roles of supporting researchers in improving the design and implementation of TDR for effects and creating accountability for funders and society at large. Research that operates in complex problem contexts engages multiple actors, processes, and pathways to impact and thus requires evaluation of their effects and interplay to prove its value empirically and promote learning. In our view, TDR funders across fields need to acknowledge and support this by creating space for reflexive processes of impact evaluation in funding budgets and requirements from the outset (for example, ex-ante co-design of impact evaluation as a standard) to expand the knowledge base on impact orientation and societal effects. Without empirical accounts from evaluation, research contributions to societal change remain implicit ambitions and hopes that may conflict among TDR stakeholders.
Key challenges remain that need to be addressed when assessing the effects of TDR. Evaluators could address the causal claim challenge by using triangulation methods and context-aware cross-case analysis to validate perceived effects. Research funders could support such an evaluation design as well as acknowledge that judgements of effects are, to a certain extent, subjective. Even if monitoring based on quantitative data is possible in certain cases, qualitative data can supplement it with explanations to offer additional value and strengthen rigour. Therefore, methods to integrate various perspectives should be included in impact evaluation processes. In addition, training needs to be provided that acknowledges that a large number of researchers do not have a background in mixed-methods design or experience in including diverse groups of participants in the research team. Lastly, systematic approaches to ongoing reflection, monitoring, and evaluation in TDR activities need to be applied and funded to construct a better base to understand and facilitate effects.
In this paper, we open the discussion on commonalities, differences, and challenges of conceptualising and operationalising impact evaluation among transdisciplinary researchers with a background in the three fields discussed here. In future, involving transdisciplinary researchers from other fields (e. g., educational and social work research) with evaluation expertise would enrich this endeavour.

Martina Schäfer
PhDs in environmental technology and sociology (Technische Universität Berlin, DE). Since 2010 scientific director of the Center for Technology and Society, Technische Universität Berlin. Research interests: sustainable consumption, sustainable regional development, methods for inter- and transdisciplinary research.

Janet Harris
Studies in community-based public health and participatory health research. PhD in public health and national knowledge mobilisation. Fellow working with the University of Sheffield, UK. Research interests: social movements in health, effects of community involvement in promoting health and wellbeing.
BOX 1 (continued): Realist evaluation (RE) aims to theorise what works, for whom, in what circumstances, and to what extent. Analysis focuses on aspects of context that may provide alternative explanations of effects to aid policymakers in making decisions about wider implementation.
TABLE 1: Examples of possible indicators
indications of the quality and quantity of knowledge as well as goods and services provided by the research (depending on the research question and the type of research activity)
case-specific indications (depending on the research question and the type of research activity) of: 1. changes in organisational and individual practices; 2. changes in state or conditions (e. g., socioeconomic status or environmental conditions)

QUANTITATIVE INDICATORS
inclusivity: number of relevant stakeholders consulted (e. g., considering age and gender)
transparency: number of events and other strategies used to describe the process of conducting the evaluation
recursiveness: available time and funding resources for iterative evaluation and adaptation of the TDR process
during the TDR process: number and characteristics (e. g., gender and age) of people participating in training provided based on research findings; extent of knowledge exchange during the project (e. g., via networks and learning events); number of new strategic contacts resulting from the TDR process; number of dissemination events at key points in the TDR process
at the end of the TDR process: number/reach of policy papers, publications in practice-oriented media, guidelines, and toolkits
level or reported change in management (e.g., energy saving) practices employed (as compared to baseline scenario prior to intervention)
number and characteristics (e. g., gender and age) of people who employ health prevention measures
change in average income of affected communities compared to baseline
change in level of CO2 emissions in target region from an industry, municipality, or organisation compared to baseline
change in the percentage of people affected by a certain disease who use health services in the target region compared to baseline

QUALITATIVE INDICATORS
inclusivity: level of satisfaction of relevant stakeholders regarding their inclusion
transparency: completeness of reporting on the process of developing indicators and criteria for selecting indicators, etc.