Supporting validity: steps to contextualise applications for technology and assessment, for learning

In this paper, the authors offer perspectives on uses of technology and assessment that support learning. The perspectives are viewed through validity (from the field of assessment) as a framework, and four aspects of an interconnected technology, learning and assessment space that represent theory-informed, authentic practice are discussed. The four are: 1) integrated coherence for learning, assessment and technology; 2) responsibilities for equity, diversity, inclusion and wellbeing; 3) sustainability; and 4) balancing resources in global contexts. The authors propose steps and considerations for medical and health professions educators who need to contextualise applications of technology, learning and assessment for positive impact on learners, faculty, institutions and patient care.


Introduction
For educators in the medical and health professions, the Ottawa consensus statements represent an expert view of evidence-informed best practice (Issenberg, 2011). Two recent statements explored issues of assessment, technology and the impact on learners and learning: Performance Assessment (Boursicot et al., 2021) and Technology in Assessment (Fuller et al., 2022). Key messages from these consensus statements are briefly summarised in Table 1 below. A wide body of work in the fields of assessment and technology in assessment, including these two consensus statements, has contributed to a deeper understanding of learners and learning, and of both assessment for learning and assessment of learning. Educators and supervisors of students and trainees (subsequently referred to in this paper as 'learners') in the medical and health professions can draw upon decades of academic work to understand learning from both theoretical and authentic, practice-based viewpoints. Work by the National Academies of Sciences, Engineering and Medicine (2018) has been instrumental in synthesising knowledge from the fields of education and the learning sciences on how individuals learn, building on theories of learning (Piaget, 1950; von Glasersfeld, 1995; Wertsch, 1985). From the fields of measurement and evaluation, assessors of learners are able to follow testing and assessment standards developed through work on validity and test validation, readily accessible through influential publications such as the Standards for Educational and Psychological Testing (American Educational Research Association et al., 2014) and the work of Kane (Clauser & Bunch, 2022; Kane, 2013). Concurrently, a substantial body of work supports the interface of these two areas through the design and application of assessment for learning (Black et al., 2007; Boud, 2007; Wiliam, 2018).
The challenge for educators is synthesis and application from these diverse fields: that is, how learning and assessment interface with technology use in a specific medical or health professional education programme, within a particular healthcare setting, and embedded in a particular country/jurisdiction. Further challenges arise from rapid and continual developments in emerging and adaptive technologies (Shankar, 2022) and the current debates surrounding the opportunities and challenges of artificial intelligence (Cotton et al., 2023).
In this paper, we draw upon validity (from the field of assessment) as a framework to discuss four aspects of an interconnected technology, learning and assessment space that are critical components of validity 'evidence' (Cook et al., 2015). The four are: 1) coherence; 2) equity, diversity, inclusion and wellbeing; 3) sustainability; and 4) global resources. We propose steps and considerations for all health professions educators who need to contextualise applications of technology, learning and assessment for positive impact.

1) Steps towards integrated coherence for learning, assessment and technology
A key aspect of validity is being attentive to coherence between learning and assessment. Extending the original work on 'constructive alignment' (Biggs & Tang, 2011), technology is now integrated into curricula, learning experiences, and assessment tasks. The challenge is adopting an approach that achieves coherence in this integration. Validity underpins the test validation and technical processes we need to adopt to ensure that the intended use of our assessment, the sources of validity evidence we generate, and the inferences we draw from assessment data can be justified (Kane, 2013). In this way, validity provides a way of thinking about the purpose of learning and assessment activities, and underpins the evaluation of workplace-based assessments (WBAs): use narrative feedback, design marking schemes to reflect clinical performance, structure the system to encourage desirable learning behaviours, interpret data holistically, and be aware of the 'failure to fail' phenomenon.
When applying technology, an assessment 'lifecycle' approach should target five key foci: (1) advancing the authenticity of assessment, (2) engaging learners with assessment, (3) enhancing design and scheduling, (4) optimising assessment delivery and the recording of learner achievement, and (5) tracking learner progress and faculty activity, thereby enabling continuous assessment.

Performance assessment should:
- drive desirable learning behaviours;
- account for psychometric implications;
- include narrative feedback for learning.

Technology should support longitudinal learning and be:
- an enabler of assessment, with active engagement of learners and faculty;
- planned with active involvement of learners.
Key questions to ensure integrated coherence across learning, assessment and technology include: a) How do we check curriculum and learning alignment, using assessment data? b) How well do we provide actionable feedback to learners (i.e. feedback from assessors that enables learners to take action to improve learning)? c) How do we evaluate whether there is a desirable impact on learners and learning, as well as achievement?
Examples of using assessment data in this way include:
• assessment data indicates that a cohort of learners performed well across examination items blueprinted to a learning outcome related to the topic of 'prescribing', while the same cohort performs poorly on 'management of the rapidly deteriorating patient'; this could suggest a change to learning experiences.
• data related to learner confidence about their performance and knowledge (Menon et al., 2017) on assessment items is generated. The areas where learner confidence and expected performance misalign for groups of learners could suggest a change to learning experiences.
• numerical and qualitative data are aggregated to inform the decisions of competence committees (Carney et al., 2023). Cohort performance data on the professionalism domain could suggest a change to learning experiences in the clinical setting.
• aggregated, larger-scale data from workplace-based assessment are analysed for the presence of actionable feedback to learners. Sparse actionable feedback suggests a change to the briefing of clinical supervisors about how to give feedback to learners.
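The first example above, aggregating cohort performance by blueprinted learning outcome to flag areas suggesting a change to learning experiences, can be sketched in a few lines. This is a minimal illustration only: the data structure, item names, and the flagging threshold are hypothetical assumptions, not taken from the paper or any particular assessment platform.

```python
# Illustrative sketch: flag blueprinted learning outcomes where cohort
# performance is low, suggesting a change to learning experiences.
# All data, names, and the 0.6 threshold are hypothetical.
from statistics import mean

# Each key pairs a learning outcome with an (invented) exam item id;
# values are per-learner proportion-correct scores for the cohort.
item_scores = {
    ("prescribing", "item_01"): [0.85, 0.90, 0.80],
    ("prescribing", "item_02"): [0.88, 0.92, 0.86],
    ("deteriorating_patient", "item_03"): [0.42, 0.50, 0.38],
}

def flag_outcomes(scores, threshold=0.6):
    """Aggregate cohort scores per blueprinted outcome and return the
    outcomes whose mean falls below the (hypothetical) threshold."""
    by_outcome = {}
    for (outcome, _item), cohort in scores.items():
        by_outcome.setdefault(outcome, []).extend(cohort)
    return {o: mean(v) for o, v in by_outcome.items() if mean(v) < threshold}

flagged = flag_outcomes(item_scores)
# 'deteriorating_patient' is flagged for review; 'prescribing' is not.
```

In practice the same aggregation pattern extends to the other examples: grouping confidence ratings against expected performance, or counting workplace-based assessment comments that contain an action for the learner.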
Technology provides powerful affordances to deliver interventions to support learning, including:
• At the level of the learning/assessment environment, e.g. using augmented reality to support just-in-time learning and assessment during placement rotations or a move to a new workplace, drawing on the theories of transition and critically intensive learning periods (Kilminster et al., 2011).
• Through the support of individual cohorts or particular subgroups who may be in need of extra support, where technology can deliver targeted resources (Foster & Siddle, 2020).
• At the level of the individual, where technology-mediated behaviour change (e.g., incentivisation, nudging) can support learning, assessment and the actioning of feedback, including the use of remote mentoring and coaching (Damgaard & Nielsen, 2018).
However, these affordances must look beyond the provision of technology software and hardware and ensure support at the level of individual learners and faculty. Both groups need instruction in the use of technology to learn, assess, improve and progress, with a greater focus on the quantified self (Malecka & Boud, 2021). An increased focus on co-creation and shared design enables greater engagement, confidence in the learning/assessment technology, and authentic design aligned to workplace tasks (Cumbo & Selwyn, 2022; Divami, 2021).
Learning, assessment and technology are increasingly intertwined (and interdependent) and building integrated coherence between these three areas is a critical goal.

2) Steps towards shared responsibilities for equity, diversity, inclusion and wellbeing
A key aspect of validity is attention to the responsibilities for ensuring equity, diversity, inclusion and wellbeing when designing, implementing and evaluating assessments and their impact on learning. Much work has been situated around debates on 'decolonisation' and what this means in higher education contexts (Bhambra et al., 2018); within the realm of assessment, the focus has been on attainment gaps between students from different backgrounds and whether these are likely to be overcome within different education systems (Holmwood, 2018). When the ultimate aim of all health professions education is to strive to improve patient care, competing over practice in this area feels at odds with the ethos of what we wish to achieve; the pandemic showed us that, despite years of working in competition, we can work in a more agile way across borders, both nationally and internationally (Fuller et al., 2020). One of the challenges for educators is that many of us are only at the start of the journey in terms of acknowledging these responsibilities and figuring out how we can ensure that we view our curriculum, teaching-learning and assessment activities through the lenses of equity, diversity, inclusion and wellbeing.
In both the Performance Assessment and Technology Enhanced Assessment consensus statements (Boursicot et al., 2021; Fuller et al., 2022), there were themes around the responsibilities that we, as educators, have to share best practice globally, and these responsibilities further prompt key questions: a) How might we think about reducing inequity in assessment practice? b) How do we approach embracing diversity of assessors involved in assessments? c) How can we ensure appropriate inclusion of voices for their viewpoints and perspectives: students, patients, and other stakeholder groups?
For embracing diversity, the performance assessment consensus statement highlighted the evidence supporting the value of rater (assessor) variance as meaningful (Boursicot et al., 2021). Rather than attempting to avoid or control differences between assessors, these differences should be embraced and actively designed into the process; taking a longitudinal, holistic view is important in relation to inclusivity (ibid). To enact this, we need diversity within our assessor groups, and a way to understand this diversity in places that we do not always see: the diversity of examiners across an OSCE, for example, can be influenced, but we may not know who the assessors of learners are in the clinical practice setting. Alongside this, we need to ensure that any data interpretation is holistic, and that we are not disadvantaging certain groups through our assessments, either in their design, in our interpretation of results, or in the use of technology to perform the same function (Broussard et al., 2018).
The work on the assessment lifecycle (Fuller et al., 2022) reminds us that we need to think about not just protecting from discrimination, but actively including voices from a range of backgrounds and experiences. We so often see equity, diversity and inclusivity mentioned without acknowledgement of what happens when we get those things right: that is, we create wellbeing for our staff, students and patients. When individuals feel they are represented, when they have a voice, and when we can assure ourselves there is as much fairness and equity as is possible in a diverse world, it follows that much of the anxiety around learning and assessment, in terms of what feels "unfair", is addressed, with improved wellbeing for both learners and educators.
3) Steps towards achieving a sustainable assessment culture

A key aspect of validity links to the purposes of assessments. We regularly focus on the learning purposes of assessment; that is, assessment should help learners improve. Some assessments are designed with a different purpose: to make judgements of clinical competence or other attainment. The limitations of single assessment tasks in meeting multiple purposes have been noted by Norcini et al. (2018), who highlight a need to carefully design the system of assessment, and that this system must be more than a collection of assessments. The purpose of making a high-stakes judgement of performance to assure competence of health professionals, and thus fulfil our collective duties to patient safety, the public, the profession, and learners, is one distinct purpose. This purpose must, in a system of assessment, coexist with other key purposes such as supporting and evidencing longitudinal learner development.
Providing high-quality feedback to learners to inform their learning, and to sustain that learning across time, represents an important shift. We need to attend not only to feedback from single assessments, but also to feedback within the system of assessment, and from a longitudinal perspective. Whilst many assessment taxonomies only describe hierarchies of cognitive systems (e.g., comprehension, analysis, synthesis), work by Marzano and Kendall highlights the importance of both metacognitive and self-systems within assessment (Marzano & Kendall, 2007). Namely, it requires us to consider not just the assessment task, but the learner's reaction to it (and the impact on engagement, motivation and goal setting). This has important consequences for the design, delivery, timing and consequences of assessment, particularly if assessment is scheduled during busy teaching periods/clinical placement activities.
Sustainable assessment (Boud & Soler, 2016) has emerged from research and practices that focus on assessment for learning. By reviewing many of the unintended consequences of (high-stakes) assessment, the concept of sustainable assessment provides a framework for more holistic practice, allowing the purposeful design of assessment encounters that continue to positively shape approaches to learning and development beyond any given assessment. By reintegrating assessment of competency with active learner development theories (e.g., growth mindset) and the environment in which learning (and remediation) take place, it provides a powerful transformative approach. Sustainable assessment emphasises:
• the importance of feedback framed as action by the learner;
• the minimal value (and often negative impact) for learning of information that compares learners with others; and
• the importance of self-monitoring assessment data and engaging in goal setting (within or outside the system).

4) Steps towards balancing resources in global contexts
A key aspect of validity is the context in which learners complete assessments, and this includes the resources available within the education system, reflecting a community and societal resourcing context. Health professions education in a range of countries shares a common goal of educating safe and effective clinical practitioners. However, the learning experiences, assessment design and other components of education reflect the technological resources available. Key questions to which educators in different countries may respond differently include: 1. How do we prioritise resources, particularly technological resources, for assessment and learning?
2. How do we develop best practices which account for our specific needs and our technological context?
3. How do we contribute to global literature to provide a balanced and realistic account of approaches and practices?
Different countries and jurisdictions reflect distinct histories, needs and responses to their societal contexts. Increasingly, we recognise the impact of colonisation on First Nations peoples, intergenerational disadvantage, and the effects of poverty. Technology for learning and assessment that is expensive; requires significant training and/or personnel; is misaligned with learners' technology backgrounds or experience; or does not reflect technologies in local educational or clinical settings, is unlikely to be feasible. Academic work is largely drawn from technologically well-resourced areas, and this knowledge may not always be helpful for other jurisdictions. Thus, the knowledge base in the current literature does not represent all the global stakeholders in the field of medical and health professional education. The first step towards supporting the validity of our assessments from a global resources perspective is to generate multiple avenues for learning together and to create a knowledge base relevant to contexts with different access to resources. This collective effort also needs to focus beyond resources and technology and examine the importance of culture, e.g., through the lens of improvement and what we can share from frugal innovation (Hossain, 2017).

Conclusion
The steps outlined above are suggested ways to consider learning, assessment and technology. Each of the four steps that we addressed: the importance of integration that is coherent; sharing responsibilities for equity, diversity, inclusion and wellbeing; sustainability; and balancing resources in global contexts, represents progress in our ability to support the validity of our assessment practice and enhance learning.
We argue that all involved in learning and assessment in health professional education need to ensure strong alignment between theoretical understandings from the fields of the learning sciences and assessment, in order to evaluate how to use, implement, and adapt understandings from the field of technology. This is critical when designing learning experiences and assessing learning and/or achievement.
Questions we need to address moving forward include: How do we leverage technology to analyse alignment from the learner/cohort perspectives? How do we look for evidence of success with the practical strategies we adopt for supporting equity, diversity, inclusivity and wellbeing? How do we investigate the consequences of the design of our assessments, using technology? How do we work together with global partners to design resource prioritisation fit for context?
Progress towards understanding our responses to these questions supports validation for contextualised applications of technology for assessment, and for learning. To promote this, we call for scholarship, and scholarly output, to 'advance conversations' (Boud & Soler, 2016), particularly where fast-evolving applications of technology connect with learning and assessment. A focus on scholarship (as well as research) will promote active collaboration across different settings and strengthen diversity, as well as enable rapid dissemination of knowledge to inform the complexity of contemporary education of all health professionals.
In this article, Kemp and colleagues challenge us to think carefully about the interconnected nature of technology, learning and assessment. They do this by focusing our attention on key areas for consideration as we are drawn into these voluminous medical education spaces. Drawing on evidence-informed best practice from two Ottawa consensus statements, they unpack four key aspects of the assessment validity framework (coherence; equity, diversity, inclusion and wellbeing; sustainability; global resources).
While we agree with Harden (2023) that the article takes the reader time to distil and consider its key messages, this is partially due to the mystique that surrounds the concept of validity and assessment. While Table 1 summarises the key messages from the two consensus statements, the article provides a rich array of further examples to explain each validity component, suggesting areas for contemplation that can be applied to any assessment format, not just performance-based assessment. For each aspect, the authors identify three questions that extend the principles and synthesised evidence outlined in these consensus statements. Readers will gain the most value from this paper when they examine their specific context in light of these key questions.
In considering the interdependent nature of technology, learning and assessment, a common theme that runs across all four aspects of the validity framework is the value of developing self-efficacy and learner-centred approaches to achieve positive impact. When coherence is realised, how might we use the data to shift the focus from micro (individual or course) to macro (cohort or programme) to achieve an equitable learning experience for all? The authors' call to action to share scholarship in this space offers both a sustainable and scalable option for transformative learning for health professions educators across contexts.
As noted by Thomas and Ellaway (2021), making decisions in health professions education is context-bound, requiring careful consideration of local factors such as resourcing, the culture of the organisation and the values of those within it. So how can the emerging evidence from consensus statements be translated into local policy and practice initiatives? This contribution hones our thinking, illuminates important considerations and provides us with a way forward.
Is the topic of the opinion article discussed accurately in the context of the current literature?
Yes. An area that could be clarified for the reader is the purpose and structure of Table 1. This table summarises the key messages from the consensus statements, but it is unclear whether the rows share a common theme across the two consensus statements listed, or whether each column represents an independent list of key messages. If the former is the intent, perhaps a subheading for each row would improve clarity and connect the columns. The table is referred to only once, at the beginning of the paper; should additional references be made throughout the opinion piece as a reminder of the relationship between the validity components, as they are discussed, and these key messages?
Are all factual statements correct and adequately supported by citations?
Yes. The additional references provide a good stepping-off point for the reader to explore further background and educational principles. The authors have demonstrated the important relationship between performance assessment and the use of technology, and shown how the integration of technology has dramatically influenced performance assessment by enhancing its precision, efficiency and scope. They suggest that the successful integration of technology, however, requires thoughtful consideration.
This paper is not an easy read, but it is well worth the time required to take on board its key messages, which are clearly argued. It represents a valuable contribution to the health professions education literature.
Reviewer Expertise: Medical education
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Are arguments sufficiently supported by evidence from the published literature?
Yes. It was pleasing to connect the four key aspects of the validity framework with the broader educational literature, beyond a focus on medical education. The importance of interdisciplinarity in educational research is also highlighted by Wainwright et al. (2020).
Are the conclusions drawn balanced and justified on the basis of the presented arguments?
Yes.
References
1. Thomas A, Ellaway RH: Rethinking implementation science for health professions education: A manifesto for change. Perspect Med Educ. 2021; 10(6): 362-368.
2. Wainwright E, Aldridge D, Biesta G, Filippakou O: Why educational research should remain mindful of its position: Questions of boundaries, identity and scale. British Educational Research Journal. 2020; 46(1): 1-5.

Is the topic of the opinion article discussed accurately in the context of the current literature? Yes
Are all factual statements correct and adequately supported by citations? Yes
Are arguments sufficiently supported by evidence from the published literature? Yes
Are the conclusions drawn balanced and justified on the basis of the presented arguments? Yes
Finally, health professional educators can share what "works" and, with technology partners, develop ways that this technology, learning and assessment "space" can be used in many contexts. This is a valuable and well-referenced opinion article that all should consider.
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Medical education
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report 22 August 2023
https://doi.org/10.21956/mep.21089.r34322
© 2023 Harden R. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Ronald M. Harden, An International Association for Medical Education, Dundee, Scotland, UK
I recommend this insight into assessment, which provides a valuable perspective on the subject. It looks at the technology-assessment relationship. Frans Johansson (2006), in The Medici Effect, argued that when you step into an intersection of fields, disciplines or cultures, you can combine existing concepts into a large number of extraordinary new ideas. Kemp et al. have performed a valuable service to medical education by bringing together the recommendations from the Ottawa Consensus Statement on Technology and the Consensus Statement on Assessment.

References
1. Johansson F: The Medici Effect: Breakthrough Insights at the Intersection of Ideas, Concepts, and Cultures. Harvard Business Review Press. 2006.
Is the topic of the opinion article discussed accurately in the context of the current literature? Yes
Are all factual statements correct and adequately supported by citations? Yes
Are arguments sufficiently supported by evidence from the published literature? Yes
Are the conclusions drawn balanced and justified on the basis of the presented arguments? Yes
Competing Interests: No competing interests were disclosed.

Table 1. Key messages from Consensus Statements (columns: Performance Assessment Consensus Statement; Technology in Assessment Consensus Statement). Trends in the past 10 years relate to systems of assessment, validity standards for assessment, rater cognition, and feedback.

There are technology affordances for each of these questions (Fuller et al., 2022). For reducing inequity in assessment practice, we can use technologies to bring multi-jurisdiction groups together to learn from and partner with educators across different settings. A longitudinal approach to supporting each other will create opportunities to share learning and is a social responsibility (Fuller et al., 2022). Within the realm of equality related to social justice, we all have an increasing social responsibility to use less, and that includes technological resources. Many of us could do what we need with what we already have, so thinking creatively with technology, rather than constantly upgrading, will help both equality and sustainability in the longer term. New technology involves hidden costs, including faculty and student training. Doing better with what we already have can therefore be more efficient, but can also help us share best practice and potentially reduce inequity if we share this practice more widely.