Challenges with measures used for assessing research impact in higher education institutions

Internationally, there has been a push to prioritise research impact beyond its scholarly contribution. Traditionally, research impact assessments have focused on academic impact and quantitative measures, at the expense of researchers whose research impact cannot be quantified. Bibliometric indicators and other quantitative measures are still the most widely used method for evaluating research impact because these measures are easy to use and provide a quick solution for evaluators. However, metric indicators fail to capture important dimensions of high-quality research. Hence, in this study, we explored challenges with metric indicators. We adopted a case study of the University of Cape Town and collected data through document analysis, a questionnaire survey of academics and researchers, and semi-structured interviews with a sample of academic and research staff. The findings highlight common challenges with quantitative measures, such as bias and discipline coverage, and the ability of measures to drive researchers’ behaviour in another direction. We propose the adoption of responsible research metrics and assessment in South African higher education institutions for more inclusive and equitable research impact assessments.


Introduction and background
Universities are increasingly called on to "maximise public benefits arising from publicly funded research"¹ and thus focus has turned towards methods for assessing and incentivising public benefits of research. The state of research impact beyond scholarly contribution is shaping how research is supported financially, undertaken and eventually assessed.² Research impact is a convoluted, multifaceted and rapidly growing field of inquiry, and, by highlighting how research funding and time are being used, impact assessment can inform strategy and decision-making by both funding bodies and research institutions.³ Research impact refers to the benefits that result from research. Academic or scientific impact is the intellectual contribution to one's field of study, while societal impact is the impact of research on various levels and areas of society (social, cultural, environmental, and more). Societal impact is seen as the impact beyond academia or the intended audience.
Recent literature has shown that a "dynamic and inclusive research system is profoundly important for both science and society"², and it can advance the fundamental knowledge and understanding necessary to address the increasingly urgent global challenges. But higher education institutions (HEIs) are under pressure due to increasing expectations from funders, the government, and the publishing industry. The expectations from the key actors in research reinforce tensions between researchers, which results in many researchers competing for limited resources.² Because of these individual pressures to show productivity using metrics in the 'publish or perish' environment, researchers in this context are more inclined to compete for various academic opportunities than to collaborate. In the context of this study, the University of Cape Town (UCT)'s research impact (academic and societal impact) assessments are used for ad-hominem promotion and academic excellence awards, while researcher rating by the South African National Research Foundation (NRF) and academic appointment focus more on academic impact. This is because, in the South African context, research impact assessment is still predominantly focused on bibliometrics and government subsidy (for publications in journals on the Department of Higher Education and Training's list of accredited journals and in the six Department-approved international journal lists: Directory of Open Access Journals, International Bibliography of the Social Sciences, Norwegian, SciELO SA, Scopus and Web of Science), which pushes researchers to publish more and quickly, creating perverse and unintended consequences, as noted by De Rijcke et al.² As the national project becomes the university project, the university has to ensure its success by imposing practices, expectations and standards by which scholars are judged, which are fashioned around countable items such as peer-reviewed publications.³
These metrics are crude and are often routinely used even though they fail to capture important additional dimensions of high-quality research, such as those found in mentoring, data sharing, engaging in public discourse, nurturing the next generation of scholars, and identifying and giving opportunities to under-represented groups.² While assessment can be broad and extend to departments and institutions, in this paper we focus on the assessment of individual researchers and their research.

Research problem
Globally, there is a growing recognition that metrics are narrow and simple in nature and thus are limited in how they capture the quality and diversity of research.² However, in the search for accountability and research excellence, easily available research metrics from scientific citation indexes such as Web of Science, Scopus and Google Scholar have been used as they provide a quick, easy solution to evaluate research.⁴ Bibliometrics have traditionally provided a useful complement to the peer review process, yet these metrics are used inappropriately and without any consideration for context.⁵ Similarly, concerns have been raised about the validity and reliability of bibliometric measurement in assessing the benefits emanating from research.⁶ Metrics provide data and evidence to support decision-making, yet some aspects of academia and scholarship cannot be quantified using simple metrics as they fail to capture the richness and plurality of research.⁷ Reward systems in HEIs like UCT to some extent rely on proxy measures of quality (such as citations and journal impact factors) to assess researchers in academic performance and promotion reviews and excellence awards; these proxy measures are also utilised in NRF rating applications. Therefore, research impact assessment is part of formal processes used for academic advancement at UCT and in other HEIs in South Africa. The use of proxy measures can demoralise researchers and deter them from working on other activities (such as teaching, mentorship and work that has societal impact) that are also important to the mission of most research institutions.⁸
While the use of metrics may vary from discipline to discipline, most disciplines utilise bibliometrics to ascertain quantity (publication count) and 'quality' of research outputs, especially in the natural sciences. These proxy measures fall short of recognising and rewarding the many aspects on which a healthy scholarly ecosystem depends⁹ and are not robust for new forms of digital scholarship processes, nor are they meaningful for specific audiences such as the general public.¹⁰ Hence, a growing number of research leaders believe that the current system of higher education incentives and rewards is misaligned with the needs of society.¹¹ Therefore, the problem explored in this paper is the challenges with methods used in HEIs for research impact assessment, with the aim of suggesting principles that can inform holistic methods for assessing research impact.

Literature
Traditionally, the total number of publications has been used to derive the productivity of researchers and their institution; however, the total number of publications does not provide an indication of the quality and significance of a research publication, nor does it indicate the impact of the research or the researcher.¹² Bibliometric indicators are increasingly applied by governments and funders, mainly because of their large-scale applicability, lower costs and time requirements, and their perceived objectivity.¹³ However, recent developments in the area of research impact assessment have shown that traditional methods of assessing the impacts of research are driving scholarship in an opposite direction; hence the support globally for research assessment reform and adoption of responsible metrics.

Research impact assessment
Academic research impact is traditionally measured using, among others, the number of publications and citation counts, the h-index, journal impact factor and article-level metrics. Traditional measures need to be supplemented with other metrics and non-citation metrics that represent social or academic engagement of scholarly processes by scholarly and non-scholarly audiences.¹⁰ Citations reflect the usage of a scholarly product; however, citations take time to reflect, which may affect research assessment. Thelwall and others¹⁴ claim that citations need time to accrue, and they are not the best indicators of important recent work as users may cite the work for different reasons. Haustein¹⁵ adds that the audience for scientific research is not confined to those who cite, as many readers are not producers of research and thus evaluating a journal based on its citations does not give the full picture. The journal impact factor, developed in the 1960s at the Institute for Scientific Information (later part of Thomson Reuters, and now Clarivate Analytics), was for many years regarded as the best tool to determine the prestige and quality of a journal.¹⁶ The journal impact factor was originally created as a tool to assist librarians in determining which journals to purchase, and not as a measure of the scientific quality of research articles.¹³,¹⁷ Such metrics have evoked mixed emotions from the research community, which has resulted in various declarations such as the 2012 San Francisco Declaration on Research Assessment (DORA), the Metric Tide and the Leiden Manifesto for research metrics.¹⁸ DORA recommends that journal metrics should be avoided when trying to judge individual papers or individuals for hiring, promotion and funding decisions. Institutions and funders should judge the content of individual papers and take into account other research outputs, as well as a researcher's influence on policy and practice.¹⁷
Bibliometrics have been criticised for the homogenisation of the sciences, a lack of true objectivity and bias.¹⁹ Concerns have been raised in the scientific community about the validity and reliability of bibliometric measurement, coupled with an increased desire from funders (public and private) to show a return on money invested in research in terms of societal impacts.²⁰ Steele et al.⁶ explain that policymakers are often unaware of the problems with the use of the data, such as inherent bias with language and country, the differences in citation patterns between disciplines, lack of coverage of certain disciplines and bias in journal indexing, all of which under-represent some areas of the world in their coverage. Hence, scholarly output from Africa remains under the radar, making it largely inaccessible and unavailable for comprehensive and strategic studies of research performance because 'local' publications are often not captured by international bibliographical databases. Similarly, Raftery and others²¹ noted the disciplinary bias in indicators used, which tends to privilege 'hard' research over humanities and social sciences research. Nevertheless, bibliometric analysis is still the most widely used method for evaluating research impact. Therefore, Wilsdon and others⁷ assert that leaders in HEIs ought to develop a clear statement of principles regarding their approach to research assessment, including the role of quantitative indicators.
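To make the citation-based measures discussed in this section concrete, the following is a minimal Python sketch rather than a reproduction of any database's method. It computes an h-index and a two-year journal impact factor from invented citation counts; the function names and all numbers are hypothetical and are used only to illustrate why a journal-level average may say little about an individual article.

```python
from statistics import median


def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def two_year_impact_factor(citations_this_year, citable_items):
    """Citations received this year to items published in the previous
    two years, divided by the number of citable items in those years."""
    if citable_items == 0:
        raise ValueError("no citable items in the two-year window")
    return citations_this_year / citable_items


# Hypothetical citation counts for one researcher's papers.
researcher_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(researcher_citations))  # 3

# Hypothetical journal: 40 citable items, one of which attracted
# 150 of the 200 citations counted this year.
article_citations = [150] + [2] * 25 + [0] * 14
impact_factor = two_year_impact_factor(sum(article_citations),
                                       len(article_citations))
print(impact_factor)               # 5.0 (journal-level mean)
print(median(article_citations))   # the typical article is cited far less
```

In this invented example a single highly cited article lifts the journal-level mean well above the median article, which is consistent with DORA's recommendation, noted above, that journal metrics should not be used to judge individual papers.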

Research assessment reform
Since 2010, reform movements advocating for the use of 'responsible metrics' in research assessment and 'responsible research assessment' have emerged; these movements have been more focused on ensuring that bibliometrics is used appropriately rather than calling for these metrics to be abandoned.²²,²³ These movements came about as a result of limitations and biases with quantitative indicators. Quantitative indicators provide a good source of evidence for tracking research outputs, but alone they are not enough. More recently, calls for reforming assessment practices have been extended to emphasise "values promoted by parallel reform agendas including movements for open science, research integrity, and diversity, equity and inclusion"²²,²³. These movements overall have had two primary foci: raising awareness of the challenges around bibliometrics and the development of good and responsible practices globally.
Research reform has received significant attention in research evaluation and assessment in the past 10 years, but these debates have been more on a global scale and mostly in the Global North. In recent years, research assessment reform conversations have been gathering momentum in the Global South, especially in Latin America, Asia and, more recently, South Africa. Global actors like UNESCO and the Global Young Academy, and regional actors like the Latin American Forum for Research Assessment (FOLEC), have championed research assessment reform, even though much momentum has come from Europe.²³ Countries in Europe, such as the Netherlands, Norway and Finland, have national policies to endorse research assessment reform practices, and the UK has also been leading in this area with a greater focus on understanding "what a healthy, thriving research system looks like and how an assessment model can best form its foundation"²³.
In contrast, the responsible metrics movement has had seemingly less impact in the USA²², although Canada is making reasonable progress. The situation is no different for Africa, including South Africa. This lack of progress is also evidenced by the number of HEIs in Africa that are DORA signatories: by 31 May 2023, not a single African research-intensive HEI had signed DORA, a 10-year-old declaration. However, there is a sizeable number of African institutes, associations, publishers, and individuals that have signed the declaration. Cozzens²⁴ argues that it is not unusual for countries like the USA and South Africa to have gone in this direction, that is, of not having a concerted national conversation or efforts on the role of metrics in evaluation. Unlike other countries, such as those in Europe and the UK, these countries do not have a government-led national assessment exercise. Mitchell²⁵ adds that, in countries like South Africa where there is no national assessment or reform, efforts tend to fail because of lack of support in terms of funding and legislation from the national government. Therefore, a change in research assessment will require a significant level of resources from universities and funders to adopt research assessment reforms, making it a challenge if the government is not working with these actors. Hatch and Curry⁸ note that changing how institutions, governments and funders evaluate research is difficult, but it is not impossible.

Methodology
In this paper, we report on an aspect of a study that was conducted in 2020/2021 among academics and researchers at the University of Cape Town (UCT), South Africa. We used a pragmatist paradigm and a mixed-methods approach to explore the challenges in research impact assessment. We undertook a questionnaire survey using SurveyMonkey in the first quantitative phase, followed by semi-structured interviews, via Zoom and Microsoft Teams, in the second qualitative phase, which allowed for greater insight into the challenges in research impact assessment practices at UCT. In the first phase, the survey was completed by 119 UCT academics and researchers, and 30 academics and researchers were interviewed in the follow-up phase across the eight faculty structures, namely: Commerce; Engineering and the Built Environment; Health Sciences; Humanities; Law; Science; the Graduate School of Business; and the Centre for Higher Education Development. 'Researchers at UCT' refers to individuals whose job involves a higher research component as opposed to academics who have relatively high teaching and research components in their role. Hence, researchers in this context includes postdoctoral fellows. To triangulate data collected via questionnaires and interviews, we also analysed documents related to research impact assessment: UCT faculties' ad-hominem promotion guidelines, NRF evaluation and rating guidelines, NRF funding guidelines, and Wellcome Trust funding guidelines. The study received ethical clearance from UCT's Humanities Faculty (Ref. no.: UCTLIS202004-02). Among the critical questions we interrogated in the study were:
1. How are metrics used in research impact assessment?
2. What are the common challenges experienced with metric indicators used in research impact assessment?
3. What underlying principles should inform the indicators used in research impact assessment?

Findings and discussion
In this section, we present and discuss findings from academics and researchers at UCT on what they perceive as common challenges with metric indicators and what they consider as underlying principles that should inform indicators used in research impact assessment to lessen the challenges experienced with metric indicators. Similarly, the challenges and underlying principles were also explored in the document analysis process.

Use of metrics in research impact assessment
We asked academics and researchers about the use of metrics and other indicators in their disciplinary spaces at UCT in order to contextualise the challenges with metrics. Academics and researchers use metrics for different career milestones; metrics are used mostly for research funding applications (29.2%) and ad-hominem promotion applications (26.2%).
Other uses included NRF rating, performance review, and job and fellowship applications. Some respondents commented that they had not used metrics as they either had not yet published or had only recently published. This finding was also reflected in the reviewed documents. UCT faculties' ad-hominem promotion, academic excellence and merit awards guidelines as well as the NRF evaluation and rating (UCT) template tend to require metrics (publication counts, h-index, journal impact factor, etc.).

Challenges with metrics
Table 1 presents common challenges with current metrics for assessing research impact as shared by academics and researchers. Academics and researchers agree on common metric challenges (bias and discipline coverage, 73.1%; behavioural impact, 72.3%; and interpretation, 65.5%), with the mean score for these challenges leaning towards 'agree' (3 on the Likert scale). Table 1 also shows a relatively high internal consistency for challenges encountered with current metrics for assessing research impact, as shown by Cronbach's alpha coefficient of 0.850.
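Cronbach's alpha summarises how consistently a set of Likert items measures the same underlying construct. The sketch below applies the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of respondents' total scores), to invented responses. The data, the function name and the three-item layout are assumptions for illustration only; they do not reproduce the survey data or the analysis software used in this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    `responses` holds one row per respondent, each row a list of k
    item scores. Sample variances are used throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five respondents to three challenge items.
sample = [
    [3, 3, 2],
    [4, 3, 3],
    [2, 2, 2],
    [3, 4, 3],
    [1, 2, 1],
]
print(round(cronbach_alpha(sample), 3))
```

Values closer to 1 indicate that respondents who score one item highly tend to score the others highly too, which is the sense in which the reported coefficient of 0.850 indicates relatively high internal consistency.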
When it comes to research assessment, 'bias and discipline coverage' is a common challenge internationally, as has been noted by others²,⁶. Bias and discipline coverage is a serious challenge for the Global South as the metric indicators can be biased against certain countries, particularly those in the Global South. The Global South tends to be excluded from the knowledge production ecosystem and therefore scholarship from the Global South is partly invisible and inaccessible. Moreover, language bias has also been noted in knowledge creation as the English language, and more specifically Western English, is favoured over other languages, which has led to calls for the adoption of 'world Englishes' (a concept that embraces the diversity that exists worldwide in the English language) as a more inclusive approach to the use of English in scholarship. Similarly, discipline bias is a key challenge in bibliometrics as these indicators tend to cover applied sciences better than social sciences and humanities, leading to social science and humanities researchers calling for equitable discipline coverage. Biases in metric indicators tend to drive researchers' behaviour in a particular direction and make researchers focus more on 'what counts' rather than what is important, driving scholarship away from its intended purpose, which is to address community and societal needs, and to advance fundamental knowledge. This is mainly because the bias in these indicators, while a separate challenge, is interlinked with behavioural impact, which is why these challenges emerged as the top challenges in this study. Hatch and Curry⁸ highlight that the use of surrogate measures also preserves biases against scholars who still feel the force of historical and geographical exclusion from the research community.
The document analysis in relation to common challenges with databases used to retrieve metrics showed that Scopus, Web of Science and Google Scholar are the most used bibliographic databases. Scopus and Web of Science databases tend to limit researchers to what is available in the respective database, neither of which indexes the majority of local and regional journals. Another challenge in the use of databases that are developed in the Global North is their language and geographic biases against the Global South. Many regional publications are not indexed in these databases. Google Scholar, however, indexes more local publications, but data quality may be a challenge as there are no quality criteria for inclusion; nevertheless, it provides more breadth to complement the Scopus and Web of Science databases. Further, these databases use 'Western standards' (a generally accepted standard originating from the Global North which is assumed as the world standard) to measure the local and global impact of research, and the databases do not recognise local context and differences between the Global South and the rest of the world in terms of research impact.
We found misinterpretation (65.7%) of metric indicators to be a challenge, especially journal impact factor. Another study also found this to be the case despite many declarations like DORA warning against the use of the journal impact factor for assessing journals and individual researchers for research impact assessment. In a study²⁶ involving US and Canadian universities, it was found that the journal impact factor was associated with quality (63% of institutions' reappointment, promotion and tenure documents) and the impact or importance of faculty research or publications (40% of institutions' reappointment, promotion and tenure documents). Consequently, researchers considered it necessary to have publications in journals with high journal impact factors to succeed and be promoted²⁷, which speaks to the behavioural influence of bibliometrics. Moreover, in some countries, institutions are financially rewarded for publishing in journals with high journal impact factors, demonstrating an extreme but important example of how this metric may be distorting academic incentives and behaviour.²⁶ The challenges related to metric indicators were also explored via semi-structured interviews and one academic/researcher commented: "These metrics tend to be very biased and push academics to behave like a corporation with a big divide between established researchers and ECRs [early career researchers]". A majority of the academics and researchers interviewed reacted to challenges around bibliometric indicators such as systemic bias against individuals in or from the Global South, and biases against younger researchers or those who have not been in research for long. Academics and researchers also noted the biases which are embedded in current assessment systems which tend to privilege certain groupings. Related to this, one academic/researcher commented:

Metrics and evaluation systems privilege researchers that have no responsibility outside of themselves and their institution… it privileges researchers and not people (who are also researchers) trying to change unjust systems or think about alternative systems.
An earlier study⁸, whose findings agree with this notion, pointed out that current incentives often discourage researchers and academics from engaging in 'other' work such as mentorship and social responsiveness, as these kinds of work do not lend themselves to the incentive structures. A common critique from the interviewed academics and researchers was that quantitative indicators tend to fuel the 'publish or perish' principle in that researchers tend to aim for quantity, which can compromise the quality of research. The International Alliance of Research Universities advocates that metrics alone are not sufficient for assessing the impact of isolated research and therefore should be used in conjunction with other indicators.¹⁵ An interviewed researcher in the study shared:

The one cardinal rule that I was told, quite unequivocally, when I came to UCT is that the university, institution and government don't care about what we do practically as researchers. They care only about publications.
This point supports the challenge noted by researchers on behavioural impact. Similarly, another academic/researcher commented:

Research assessment practice privileges dominant views, not paradigm-shifting thinking; polemic or controversial pieces e.g., Nattrass 2020 article published in South African Journal of Science will be hugely cited but is awful scholarship.
Another researcher also shared their views on the issue of the interpretation of metric indicators:

Metrics may be misinterpreted as an absolute measure of value, without taking contextual factors into account. The problem with this kind of approach is that it drives undesirable behaviour from researchers, publishers and bureaucrats which should not be underestimated.
This view was also argued by Agate and others, when they observed that bibliometrics and altmetrics quantify the impact, thus resulting in a flattening and alienating effect because the assigning of scores as proxies of quality does not effectively account for nuances of context, depth of engagement and integrity of the process.⁵ Hence DORA developed a checklist for a balanced, broad, and responsible research assessment which suggests that evaluators need to be clear about the limitations and context of metrics used and to complement metrics with qualitative indicators, as well as be aware of unintended biases that arise from scientific and cultural stereotypes such as gender, ethnicity, seniority, affiliation and discipline.²⁸

Underlying principles that need to inform indicators
We also asked the surveyed academics and researchers about the underlying principles that they regarded as important for assessing research impact. Table 2 indicates that 43% of the participants regarded 'responsible research practices' (24%) and 'open science' (19%) to be important principles for metric indicator use. The underlying principles were cross-tabulated with the faculty of the academics and researchers to see if there were any differences across the eight faculty structures at UCT. Some differences were evident because of disciplinary differences: those from the Science Faculty rated 'transparent reporting' second to 'responsible research practices', while those from the Faculty of Health Sciences rated 'open science' most highly, and for those from the Humanities Faculty 'diversity of types of research is valued' was most important. In the discipline of health sciences, 'open science' is promoted and encouraged more than in the other disciplines at UCT. Humanities disciplines tend to have diverse research outputs such as creative works which are as valuable as articles, which explains this principle being rated highly. Similarly, transparent reporting is important for reproducibility of science and is more valued for disciplines like science compared to others. The underlying principles, specifically the top three, imply that researchers are aware of the challenges with metrics and potential solutions to these challenges, such as the adoption of responsible research practices in assessment, which can lessen the issue of misinterpretation of metric indicators and related biases. Responsible research practices advocate for an open, inclusive, and impactful research culture that recognises the plural characteristics of high-quality research.
As stated earlier, Humanities Faculty participants regarded recognition of diverse research outputs to be an important principle, while respondents in the Science Faculty rated 'transparent reporting' high, which speaks to the disciplinary differences and preferences of these faculties. This result came as no surprise as humanities disciplines have the most diverse outputs, including creative works, which traditionally were not appropriately recognised as valuable scholarly outputs comparable to traditional outputs. This was noted in the challenges but also relates to the recognition of diverse types of outputs and recognition of all contributions to research and scholarly activity. A participant in this study acknowledged: "Metrics are crude for research assessment and very output-focused while disciplines engage in public discourse which is regarded as a significant contribution in other disciplines." This notion was also discussed by de Rijcke and others², who stated that metrics which are routinely used fail to capture important additional aspects of high-quality research. Similarly, transparent reporting of methods, an open science practice, is well established in the applied sciences.
A few academics and researchers commented on 'other' underlying principles that are important for assessing research impact (but not captured by the question); these included: contribution to equity and transformation; reproducibility of research through supporting and recognising replication studies; and software and data as research outputs. These principles relate to that identified earlier on open science, as reproducibility is one of the practices in open science but also relates to recognition of all contributions to research. One of the key underlying principles identified in the document analysis was open science and open access. Both institutions and funders in the South African context have created policies to guide and mandate open science for researchers. But there is a misalignment, as open science principles do not feature in research assessment practices, as evident in the documents reviewed, even though there are policies. A similar trend was reflected in an Australian study by Diong et al.²⁹, who assessed funding scheme instructions against nine criteria to determine to what extent they incentivised responsible research and reporting practices (such as open data, conducting quality research, discouraging use of publication metrics, etc.). This Australian study found that the funders incentivised some of the responsible research practices, but there was no mention of others and applicants addressed only those that were required (four out of the nine) rather than what was encouraged.²⁹ The authors argued, quite correctly, that simply encouraging or recommending responsible research practices seems unlikely to substantially change researcher behaviour.²⁹ This observation is similar to what was observed in our study on the role of metrics and behavioural impact, and the same is true in terms of rewards and incentives. Researchers are inclined to do what is mandated, required, recognised and rewarded more than what is encouraged and advised. The underlying principles identified by this study have the potential to offer guidance towards a solution to address the challenges presented in this study.

Conclusion and recommendations
While challenges related to measures used for assessing research impact are not unfamiliar, they have not been explored in a decolonising South African higher education context. Assessing research impact is a multidimensional and complex phenomenon and solutions to this wicked problem are in progress. We have reported on the challenges with quantitative indicators used to evidence the impact of research using UCT as a case study. Bias and discipline coverage were the most prominent challenges noted by researchers and academics in this study. These are challenges that have been noted globally in the literature on research assessment, and many leaders are calling for the adoption of responsible research practices and research assessment reform as current systems of evaluation are neither equitable nor inclusive. While this is a global challenge, researchers from Africa and the broader Global South tend to be particularly affected by the biases embedded in metric indicators.
Similarly, the problematic but undisputed focus on journals from the Global North and applied sciences further reinforces these biases on researchers who already feel excluded from the knowledge production system. The use of metric indicators tends to exert pressure on researchers to change their behaviour and research agendas to conform to these norms at the expense of locally relevant scholarship. We recommend the adoption of responsible research practices that complement quantitative metrics with qualitative metrics. Moreover, there needs to be a concerted national conversation on research assessment reform in South Africa among higher education leaders, funders and policymakers. Key actors in research need to lead in the adoption of responsible research practices and responsible metrics. At an institutional level, there is a need for alignment of policy and practices around research assessment, especially with open science practices and institutional values.

Higher education leaders, policymakers and funders need to review the current indicators used and how fit they are for purpose, and to what extent they help researchers and research from the Global South contribute towards addressing local challenges. Embracing responsible research practices and open science in theory, policy and practice may move HEIs like UCT in South Africa, and other related [...] to addressing some of the challenges with quantitative indicators generally and how they are applied in HEIs like UCT.

Table 1: Challenges with current metrics for assessing research impact (n = 119); Cronbach's alpha coefficient 0.850

Table 2: Underlying principles that academics and researchers regard as being important for assessing research impact, in relation to faculty: Centre for Higher Education Development (CHED), Commerce, Engineering and the Built Environment (EBE), Graduate School of Business (GSB), Health Sciences, Humanities, Law and Science