A qualitative and multicriteria assessment of scientists: a perspective based on a case study of INRAE, France

Psychosociology theories indicate that individual evaluation is integral to the recognition of professional activities. Following Christophe Dejours’ contributions, this recognition is influenced by two complementary judgments: the “utility” judgment from those in the hierarchy and the “beauty” judgment from peers. The aim of this paper is to elucidate how the individual assessment of scientists is conducted at INRAE. This process follows a qualitative and multicriteria-based approach by peers, providing both appreciations and advice to the evaluated scientists (the “beauty” judgment). Furthermore, we expound on how INRAE regularly adapts this process to the evolving landscape of research practices, such as interdisciplinary collaboration or open science, assuring that assessments align with


Introduction: why an assessment of scientists?
Scientists undergo various types of evaluations throughout their careers, including during recruitment, promotions, calls for projects and at key points along their professional trajectory. Assessment is central in science, and this perspective article (in PCI Organization Studies, perspective articles seek to disseminate knowledge or inform policy-makers and practitioners) addresses several aspects of evaluating scientific activities, going beyond primary scientific knowledge production and moving towards more qualitative aspects.
In France, the assessment of civil-servant scientists with permanent positions is mandatory and governed by decree 83-1260 (30 December 1983; https://www.legifrance.gouv.fr/loda/id/JORFTE), which sets out the statutory regulations applicable across French research organizations. This obligation provides an opportunity to devise a method of peer assessment that is beneficial to both the individual scientist and the research organization. This article focuses on the routine assessments conducted throughout a scientist's career and does not address promotions.
In France, as in other countries, there are two primary types of institutions for research and higher education: research institutes, focused mainly on research, and universities, which have the dual goals of education and research. French universities recruit scientists based on both their scientific expertise and their teaching specializations. However, French universities typically do not evaluate scientists regularly, except for promotion purposes. In French research institutes such as INRAE (French national research institute for agriculture, food and environment, https://www.inrae.fr/), scientists are recruited based on their scientific expertise, research skills and projects. At INRAE, assessments are conducted every two years, independent of promotions, and most other French research institutes follow a similar practice.
In this context, this perspective paper first examines the methods through which assessments can be conducted, with a particular emphasis on the indispensable role of peers. Peers play a crucial role in facilitating a nuanced and tailored qualitative assessment of scientists. We then compare quantitative and qualitative assessment approaches and explore the evolving criteria essential for qualitative evaluations, applied particularly to expertise, partnership for innovation, interdisciplinarity and open science. Lastly, we underscore the significance of adopting a multicriteria approach to ensure equitable assessments of scientists within an applied research institute like INRAE.

How can we assess "work"?
The word "work" encompasses multiple meanings, such as "labour" and "artwork", as defined in the Cambridge dictionary (see also Sennett, 2008). Evaluation must consider this dual aspect of work, taking into account both the execution of tasks and the creative process. Christophe Dejours's contributions are important for sociology: he created a new discipline, that of "work psychodynamics". One of the contributions of work psychodynamics concerns the individual and collective defence strategies deployed to combat suffering. This theory also highlights the importance of intelligence at work at the individual level (working) and at the collective level (the production of work rules or deontic activity), as well as the recognition of work and the worker (through judgments of beauty and utility) (Dejours & Deranty, 2010). According to Christophe Dejours' (2003) observations and conclusions, "work" itself, in both its labour and artistic dimensions, cannot be directly assessed. Instead, only the outcomes of work are observable and accessible for evaluation (Dejours & Deranty, 2010; Dejours et al., 2018). Assessing these results commonly involves quantitative approaches such as measurement and counting. However, the efforts, strategies, tricks and ingenuity invested in achieving predetermined objectives are often invisible to evaluators. Dejours argues that the true nature of work must be evaluated through qualitative criteria, such as those involving "telling", "relating", "narrating", "explaining", and "clarifying", verbs that convey a "story" or narrative. Consequently, peers who share the same profession as the evaluated persons, understand its intricacies, face similar challenges and innovate to overcome obstacles are best positioned to identify and assess the essence of "work" according to Dejours's definition.
Peer assessment of scientists' work is crucial and indispensable for different reasons. Firstly, recognition is vital for fostering well-being in the workplace, and this applies to all workers, including scientists. As emphasized by Christophe Dejours, recognition entails two fundamental judgments: the "utility" judgment and the "beauty" judgment, both of which are complementary and essential. The "utility" judgment, given by the hierarchy (e.g., the managers), signifies that the activities and contributions of the worker are valuable to the organization, thereby imbuing their work with significance. Conversely, the "beauty" judgment concerns the compliance of the work with the rules of the profession. This judgment thus confers belonging to the collective. Its statement always includes an appreciation of what makes the specificity, the uniqueness, the originality, and the style of the work. This judgment, therefore, confers recognition of the identity of the person evaluated and does not equate them to any other (Alderson, 2024). Thus, the "beauty" judgment offered by peers, who understand the intricacies of the job because they perform similar tasks, is equally essential. Peers possess unique insights into the challenges faced by the assessed individual and can discern hidden aspects of their work that go beyond standard requirements to achieve objectives. Therefore, peer assessment ensures a holistic understanding of the scientist's contributions and provides valuable recognition, taking into account intrinsic values.
These conclusions, proposed by Christophe Dejours, are general and not specific to the assessment of scientists' work (Laaser & Karlsson, 2021). However, they provide a framework within which INRAE proposes an assessment procedure that strives for the most accurate qualitative assessment of scientists possible: an evaluation conducted collegially by peers, with the aim of providing advice rather than sanctions or rewards, and considering a wide range of missions and activities corresponding to personal and professional trajectories. To our knowledge, INRAE is unique among French research institutes and universities in referencing these conclusions.
The aim of individual assessment is to evaluate the work, not only its outcomes. In practice, the peers at INRAE refer to the report written by the evaluated person, trying to i) comprehend the individual's position within the organization (team, lab, institute), ii) focus on the activities and achievements, that is, the substance of what is documented rather than the quantity of publications, and iii) analyse the person's insight into their work. These factors are discussed among committee members (see below), and subsequently, a message conveying their "beauty" judgment is communicated to the evaluated scientist.

Quantitative versus qualitative assessment
Most assessment methods implemented by organizations, including research institutions and funders, have traditionally focused, and still focus, on quantitative criteria, which are effective for evaluating the outcomes of work rather than the work itself. Quantifying achievements, such as publications for scientists, has been and remains a common practice, but it is now increasingly recognized as imperfect and even unfair (DORA, 2023; Hicks et al., 2015; Moher et al., 2020). It is challenging to account for the fact that certain achievements, such as publications, proceed from scientific work and research that may vary in difficulty depending on the discipline (e.g. large experimental design versus single tested parameters, or multipartner project versus individual project) and the hypotheses being pursued. One approach to quantifying outputs has been the use of the journal impact factor, which is based on the notion that publishing in a high-ranked journal indicates superiority compared to publishing in a lower-ranked journal. In recent years, there has been a shift away from relying solely on quantitative parameters such as the number of publications, impact factor, H-index, and journal quartile ranking. In fact, several organizations, including DORA (the San Francisco Declaration on Research Assessment, https://sfdora.org/read/), have even prohibited their use. This move stems from the recognition that simple ranking offers a narrow perspective of the work being evaluated. Quantitative parameters only measure the outcomes of work, i.e. the quantitative objectives. They fail to capture how scientists conceptualize their ideas, address the goals of their study, and navigate challenges or novel situations. None of these aspects, such as a person's ability to adapt to professional circumstances, search for solutions, or contend with unpredictable results, are discerned by quantitative parameters. These capacities represent a broader view of "real work," as described by Christophe Dejours, involving the ability to confront the resistance of reality. Sometimes, this "real work" may remain hidden because it diverges from conventional approaches (and may even break the rules) or because the situation compels the worker to take risks and defy the odds. In the realm of research, this hidden work may encompass all the unsuccessful attempts or negative results, that is, experiments that did not fail outright but did not confirm the initial hypothesis. If these "failures" are not acknowledged, the efforts and time invested in exploring various avenues to generate new knowledge are obscured. From an assessment perspective, it is crucial to provide scientists with the opportunity to communicate and elucidate these elements, which encapsulate the myriad approaches and strategies employed to achieve scientific objectives (which may eventually be translated into publication, for instance).
The Coalition for Advancing Research Assessment (CoARA, https://coara.eu/) campaigns for an "assessment of research, researchers and research organizations that recognises the diverse outputs, practices and activities that maximise the quality and impact of research. This requires basing assessment primarily on qualitative judgement, for which peer review is central, supported by responsible use of quantitative indicators". This is a highly significant position statement signed by hundreds of European universities and institutes, including INRAE, which reject and question the paradigm of quantitative assessment of research. In our view, the push for a paradigm shift originated from bottom-up initiatives such as DORA and the Leiden Manifesto (http://www.leidenmanifesto.org/), followed by top-down decisions at international, European and national levels, and was further reinforced by other bottom-up initiatives like Peer Community In (PCI; https://peercommunityin.org/). The overarching goal is to transition towards qualitative assessment methods.
The current paradigm of quantitative research assessment emerged in the 1980s, influenced by the theory of new public management, which emphasizes the use of indicators to define research excellence. Consequently, bibliometric methods made it convenient for non-scientists, such as policymakers and administrative managers, to rank and compare individuals and organizations using a standardized set of indicators. Gingras (2016) clearly discusses the application of bibliometrics in research, highlighting both negative aspects, particularly when applied to individual assessment, and positive aspects, such as tracking the temporal dynamics of science. Hence, while indicators can prove valuable in specific contexts, they continue to play a central role in assessment procedures.
Indicators are very valuable tools for ranking. A recent open discussion held by DORA focused on the potential negative repercussions of rankings in evaluation practices, particularly in connection with a capitalist approach to research activities. At a DORA webinar, Krystian Szadkowski linked the functioning of science to the capitalist economy, where liberalism provides a framework for competing for research funding and influences the ranking and prestige of institutions and researchers, illustrating the interconnectedness of these aspects (Dogan, 2023). Indeed, there are numerous instances worldwide where governments prioritize "profitable" research, emphasizing "value for money" in research assessment, as evidenced by Martin (2011) for the United Kingdom, where research assessment is linked to excellence, and by Gingras & Khelfaoui (2021) for French medical research (where medical research assessment has been transformed into an administrative management tool for the budget allocated to research). However, evaluation could serve as a catalyst to unravel this complexity. Ranking, being ubiquitous and intertwined with capitalist economies, is a cornerstone of quantitative assessment methods (Gingras, 2016). This poses challenges to public research, prompting several initiatives to advocate for more thoughtful methods in the use of rankings. While we will not delve into the topic of international rankings of universities and research institutions here (as our focus is on individual scientist assessment), we can highlight the "More Than Our Rank" initiative (MTOR, https://inorms.net/more-than-our-rank/) within the International Network of Research Management Societies (INORMS, https://inorms.net/). MTOR aims to equip institutions with tools to evaluate all their activities independently of rankings.

Peers for a qualitative assessment of scientists' work
The challenge facing research organizations, their managers and scientists today is to adopt an assessment method that is more suitable than strictly quantitative methods. The goal is to capture the "hidden" activities of researchers, which include not only their successes but also their difficulties, failures and strategies for overcoming obstacles, elements that truly represent their work beyond just the outcomes. Peers play a central role in this process, as they bear the responsibility of recognizing and evaluating these aspects.
Following the decree of 1983, INRAE established a conventional process of scientific assessment. Initially, the evaluation consisted mainly of a report and a list of productions analysed by committees (see below), which provided feedback to the scientists. While this framework is still in use today, the content and structure of the report and productions have changed, and several additional elements have been developed (see below). One significant reorientation occurred when France established a national division for the assessment of research organizations in 2006, known as AERES, which later became HCERES (https://www.hceres.fr/). INRAE then began developing adapted criteria for research institutes based on finalized objectives. This marked the first instance of employing a multicriteria approach to assess scientists, which continues to be utilized today (see below). This shift was crucial for tracking the evolution of scientists' various activities at INRAE. Nonetheless, challenges persist, including the integration of assessment for open science practices, interdisciplinary research, scientific integrity, and qualitative assessment, goals central to this perspective.
Nowadays, at INRAE, individual assessments of scientists are performed by groups of peers called "Specialized Scientific Commissions" (SSCs), organized by disciplines or groups of disciplines. The 13 SSCs cover all types of disciplines present at INRAE (Table 1), and each INRAE scientist selects the SSC that corresponds best to her or his activities. Peers (either internal or external to INRAE) are not remunerated for their work. INRAE considers that participating in peer evaluation within the SSC falls under the "expertise" criterion (see below). Reimbursement of expenses costs on average approximately 8000 euros per SSC per year, based on 3 to 4 days of meetings per SSC with no interviews (see below, conclusions), for the assessment of approximately 700 scientists per year.
The SSCs produce their assessments independently of the INRAE hierarchy. They provide advice that is collectively discussed. Specific referees, chosen from among the members of the SSC, are appointed for each evaluated scientist but remain unknown to her or him. This confidentiality is important: the judgment of "beauty" is thus given by a community of peers rather than by a single peer, which enhances the value and the significance of the assessment. The outcome of this process is a personal and dedicated assessment, provided through a written message sent every two or three years by peers of the discipline to the evaluated INRAE scientists.

An advice-based assessment
As mentioned earlier, a scientist's assessment is based on the "beauty" judgment by peers. INRAE's aim is not to punish or reward, but to provide advice in a humane manner and with good intentions. The advice typically strikes a balance between acknowledging positive aspects evaluated by the peers and offering opinions on the scientist's choices (such as methods), the dynamics and relevance of research, or potential future directions (for example, in terms of collaboration). This general advice pertains to the trajectory of the evaluated scientists and may vary between junior and senior scientists. For example, the message might include the following elements and sentences: a contextual statement like "You work on the effect of…", or "You are involved in projects aiming at…"; followed by a series of sentences congratulating activities (specific results, management of important projects, involvement in education if applicable, or other specific activities like open science); and finally a series of sentences offering advice or discussing elements concerning the near future, project orientation, or trajectory of the future career. When the situation indicates elements of degradation (see below), this can be delicately mentioned in the message. For example: "The committee has identified a critical issue concerning your publications since you have not published since…", or "Regarding the degraded relationship you mention within your group, we encourage you to …", and "In this context, the committee will inform your hierarchy to assist you in resolving this issue". A considerable amount of time is taken to craft these messages in order to ensure clarity and limit potential misinterpretation by the assessed scientist. One particular consideration is the coherence of their work with the strategy of INRAE, even though this specific point is, theoretically, more a judgment of "utility". Since the commissions work independently and autonomously in their assessment of "beauty", there is no direct correlation with the judgment of "utility". The latter is determined by laboratory directors or senior managers, and the evaluated researcher must include the opinion of their laboratory director in their file following a personal interview. This opinion and interview constitute a judgment of "utility". In other words, throughout the assessment process, the researcher receives the two necessary opinions for recognition of their work (from peers and from the hierarchy), but these opinions are provided independently. However, there are indirect connections, as i) the hierarchy receives the messages delivered by the commissions to the researchers, and ii) the commissions may contact the hierarchy in case of difficulties encountered by the researchers (see below).
Upon reviewing the entire file written by the evaluated scientist, which scientists are recommended to present in a narrative mode, peers can identify various types of difficulties the evaluated individual might have encountered. These difficulties may be evident either because the evaluated person describes a challenging situation in the document or because SSC members perceive a lack of dynamism or motivation. In such cases, the SSC alerts the hierarchy, which can then take appropriate steps (e.g. contacting the scientist or the head of the lab) with the assistance of professional human resources personnel at INRAE to understand and support the individual and the laboratory in resolving the situation as much as possible. In Table 2, we provide an example of the types of such situations detected in 2020 (involving 60 concerned scientists) and 2021 (involving 74 concerned scientists) out of a total of 1454 assessed scientists. The majority of these situations (78%) are easily resolved within a few months: a simple discussion between the scientist and their hierarchy is often sufficient to clarify the issue and find a resolution. In other cases, both parties agree that a change is necessary, and various solutions are explored, such as a change of team or laboratory. Resolving these cases can take a year or even longer. In very rare instances, the situation is severely degraded (for any reason): this may prompt a comprehensive analysis coordinated among multiple human resources professionals. The results in Table 2 clearly highlight that "junior scientists" are more frequently identified with difficulties (157 occurrences versus 29). Several hypotheses can be proposed to explain this phenomenon. Firstly, it is possible that both INRAE and peers pay closer attention to junior scientists, who are still establishing themselves as permanent researchers and thus may require additional support and guidance. Secondly, junior scientists are no longer in the post-doctoral phase and are expected to manage various aspects of research, such as proposing projects, seeking funding, and supervising students. This transition to a more comprehensive role in research may pose challenges for some junior scientists. Thirdly, the process of adapting to a new environment, new research topics, and the demands of a permanent scientific position can disrupt the ability of junior scientists to fully engage in their research activities during the initial years. Additionally, settling into a permanent scientific role may coincide with personal life changes, further affecting their capacity to fully immerse themselves in their work. Lastly, junior scientists at INRAE undergo assessment three times in five years, which is more frequent than senior scientists, who are assessed only twice. This increased frequency of assessment may lead to a higher likelihood of identifying the difficulties or challenges that junior scientists encounter. Overall, these factors contribute to the higher frequency of identified difficulties among junior scientists compared to their senior counterparts.
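As a reading aid, the figures quoted in the paragraph above can be put in proportion with a few lines of arithmetic. The sketch below uses only the numbers stated in the text (Table 2 itself is not reproduced here); the variable names are illustrative. It assumes that "occurrences" count detected difficulties rather than individuals, which would explain why the two occurrence counts do not sum to the number of concerned scientists. Because the sizes of the junior and senior populations are not given, no per-capita rate is derived.

```python
# Figures quoted in the text; variable names are illustrative only.
flagged_2020 = 60          # concerned scientists detected in 2020
flagged_2021 = 74          # concerned scientists detected in 2021
assessed_total = 1454      # scientists assessed over both years
junior_occurrences = 157   # difficulties identified among junior scientists
senior_occurrences = 29    # difficulties identified among senior scientists

# Share of assessed scientists for whom a difficulty was detected.
flagged_total = flagged_2020 + flagged_2021
flagged_share = flagged_total / assessed_total

# Share of all identified occurrences that concern junior scientists.
# Assumption: occurrences count difficulties, not individuals.
junior_share = junior_occurrences / (junior_occurrences + senior_occurrences)

# Assessments per five-year period: three for juniors, two for seniors.
junior_assessments, senior_assessments = 3, 2
frequency_ratio = junior_assessments / senior_assessments

print(f"Flagged: {flagged_total} of {assessed_total} ({flagged_share:.1%})")
print(f"Junior share of occurrences: {junior_share:.1%}")
print(f"Juniors are assessed {frequency_ratio:.1f} times as often as seniors")
```

Under these assumptions, fewer than one assessed scientist in ten is concerned. The higher assessment frequency of junior scientists is one of several factors, and without the population sizes its weight relative to the other hypotheses cannot be estimated from these figures alone.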
During the discussions within committees, different views on specific cases may arise. While there are no specific procedures in place to manage these discussions, the role of the committee president becomes crucial. Their personality and their ability to foster an environment of attentive listening and benevolence are instrumental in managing any conflicts that may arise and in reaching consensus decisions. If an unresolved conflict does occur between members of a committee (which, to our knowledge, has happened only once in the last five years), the Evaluation Department steps in to act as mediator between the assessors involved.
In conclusion, the role of peers is crucial in providing scientists with recognition through a "beauty" judgment and in proactively identifying and addressing potential difficulties to prevent situations from escalating.

Evolution of qualitative assessment criteria
Qualitative assessment by SSC members is grounded in a framework and guidelines provided to scientists, aimed at guiding their writing towards a "storytelling mode" centred on i) facts and achievements, and ii) a reflexive analysis of the activity, including successes, failures, and difficulties (Direction de l'Evaluation, 2023).
Assessment based solely on (inappropriate) quantitative metrics appears to be the simplest route to an automatic or administrative evaluation by non-experts, individuals who lack the ability to analyse the true quality of work and its outcomes. There are several examples of misapplication of quantitative indicators in the research assessment of individuals. One example pertains to impact factors and the number of citations, which often correlate poorly with the fundamental criteria expected for good research: quality research, statistical robustness, the value of declared data, and replicability (Dougherty & Horne, 2022). The transition from quantitative metrics to a qualitative approach on a broader scale may necessitate the definition and implementation of "qualitative indicators". Wouters et al. (2019) proposed a framework comprising approximately 150 indicators and assessed their relevance for various types of evaluations, including infrastructures, research and funding organizations, individual researcher activities, career advancement, and recruitment. While this approach aids in defining qualitative indicators, there is a risk of succumbing to the temptation to assess qualitatively with an excessively lengthy list of indicators, potentially leading back to the paradox of using a quantitative approach for a qualitative assessment.
One of the risks of qualitative evaluation is the significant loss of benchmarks, resulting in subjective judgment. Therefore, it is necessary to strike a balance between enabling qualitative assessment, which allows for customization of files, and maintaining fairness in judgment among scientists. Subjectivity refers to how individuals' perceptions and interests may influence assessment outcomes. Quantitative assessment likely aims to minimize subjectivity. Subjectivity is recognized as being present in qualitative evaluation; it is not necessarily a risk if the evaluators are aware of it and integrate it into their way of analyzing the files (Bumbuc, 2016). It is important to note that much of the literature on subjectivity in work evaluation pertains to performances, indicating that we are still far from achieving a comprehensive assessment that transcends mere "performances" (Tran & Järvinen, 2022).
It is crucial for everyone involved in qualitative assessment to be aware of when subjectivity comes into play and to address these potential biases (e.g., over-interpretation of some situations, unfair judgments, lack of criteria and thus heterogeneity in file assessment) when evaluating research quality. Employing methods such as cross-checking analyses and seeking multiple perspectives on a situation, commonly referred to as triangulation (Fichten & Dreier, 2003), can help mitigate the adverse effect of subjectivity. Therefore, i) the use of indicators is essential to minimize the risk of subjectivity during qualitative assessment, and ii) employing a multicriteria approach, as done at INRAE, can further reduce this risk by balancing each criterion, ultimately defining the profile of each scientist based on the distribution of their various activities.
For an organization like INRAE, qualitative assessment must consider potential new directions in research practices. Recently, INRAE has chosen to enhance its activities and visibility in various areas, which are elaborated below: expertise, partnerships, interdisciplinarity, and open science.

Expertise and support for public policies
The goal of expertise is to provide scientific and technical knowledge, tools and methods to stakeholders responsible for public policies, including ministries, agencies, local authorities, European and international institutions, and universities. These resources aid in informing, designing, implementing, and evaluating public policies. At INRAE, expertise activities manifest in various forms, such as collective scientific expertise, foresight studies, research for and on public policies, training, participating in working groups and public bodies, and the establishment and management of observatories or databases. These aspects are outlined in guidelines provided to scientists, who are asked to describe these activities in their assessment files. Peers evaluate how these expertise activities align with the three other types of activities, assess their coherence with the scientist's personal trajectory, examine their outputs and determine if they have been effectively disseminated to the appropriate audience.

Research in partnership with a view to contributing to diverse forms of innovation
Innovation stems from multiple partnerships involving research or training institutions, technical centers, agricultural or agro-industrial institutes, competitiveness clusters, public and private economic entities and civil society organizations. The objective is to facilitate the co-construction of the value creation process among all project stakeholders. INRAE advocates for the concept of diverse innovations, meaning that research may innovate to address economic, political, environmental, societal or health-related issues. Collaborations entail producing outputs with others that are enriched and different from what could have been achieved independently. Therefore, it is crucial in terms of assessment that researchers explicitly articulate their partnership approach in terms of co-design, co-construction and co-realization, with long-term programs punctuated by more focused [...]. These partnerships should address and respond to questions concerning original and beneficial research.

Practices of interdisciplinarity
By definition, a partnership aims to foster innovation and achieve more collectively by leveraging differences in ideas, skills, expertise, and resources. Collaboration involves working with individuals who may come from diverse scientific backgrounds. The success of an interdisciplinary partnership hinges on the ability to facilitate dialogue among individuals from various disciplines. Whether the partnership is academic, private, public and/or involves citizen participation, at either national or international levels, interdisciplinarity must be actively fostered in the processes of co-construction and co-realization. Considering interdisciplinarity in the assessment process is crucial in order to recognize the inherent costs associated with this interdisciplinary effort and to focus, within this context, on the quality of research inquiry and the relevance of this approach. However, when it comes to assessment, practising interdisciplinarity can sometimes be perceived as a disadvantage, as it may blur professional identities and introduce complexities associated with belonging to multiple social groups (Negro & Leung, 2013). A recent study delved deeper into this issue and revealed that interdisciplinarity is often penalized due to the reinforcement of social boundaries (Fini et al., 2023). At INRAE, since each SSC is centered on a specific discipline, there might be difficulties in assessing scientists who work at the interface of multiple disciplines. Peers within a particular SSC may not be fully equipped to deliver a comprehensive and tailored "beauty" judgment. As a solution, INRAE allows scientists to be assessed by two different SSCs, covering the disciplines relevant to their research (for example, mathematics and ecology). In this way, the evaluated scientist receives two complementary "beauty" judgments, one from each discipline. While this approach mitigates the issue of social boundaries mentioned by Fini et al. (2023), it remains a proxy for assessing interdisciplinary work, since each SSC evaluates only one discipline, potentially overlooking the true capacity to work at the disciplinary interface.

Practices in open science
Open science is a comprehensive approach aimed at enhancing the reproducibility, transparency and robustness of research (Susi et al., 2022). Over the past few years, there has been a significant international and European effort to promote open science practices, particularly concerning publications (open access), data, code and computer programs, and citizen science. Many international scientific institutions and universities have endorsed various manifestos, such as the DORA declaration and the Leiden Manifesto (Hicks et al., 2015). Additionally, several countries, as well as the European Community, have developed roadmaps to encourage scientists to adopt these new practices. The overarching principle behind this engagement is that the scientific content of an article holds greater importance than publication metrics and that all data should adhere to the FAIR principles and be well described through metadata.
In February 2022, during the Open Science European Conference (OSEC) held in Paris, France, a significant number of European universities, research organizations (including INRAE) and funders signed the "Paris Call on Research Assessment" (https://osec2022.eu/paris-call/). The objective of this initiative is to reinforce the shared European vision regarding the recognition of quality and the diverse impacts of research that adheres to the highest standards of ethics and integrity. It aims to value the diversity of research activities and recognize not only research outputs but also the proper conduct of research. Research organizations, including INRAE, now have a clearly defined framework for evaluating open science practices. In that context, we conducted a benchmark analysis to examine how different countries and organizations incorporate open science practices. This benchmark used a corpus of twenty documents produced by various international organizations, states or universities (such as Bristol, UCL, Utrecht) spanning from 2015 to 2022 (Table 3).
Our benchmarking primarily focuses on universities, national roadmaps, and clusters of organizations such as the League of European Research Universities and the European Commission, which have broader objectives than INRAE. We encountered challenges in augmenting this corpus with data from other scientific organizations whose objectives are similar to those of INRAE.
As a starting point, we examined how the four main activities used to structure the assessment of INRAE scientists (i.e., production of knowledge, expertise, training, and management) align with criteria considered by other international organizations. First, we observed that the four activities encompass all types of activities identified in assessment procedures within an open science context. In Figure 1, we illustrate the correspondence between INRAE categories and those of other organizations. Despite differences in vocabulary (for instance, the term "Education" is more common in universities than "Training"), a strong alignment is evident. However, there are terms not explicitly mentioned in INRAE activities, such as "Soft Skills" or "Leadership". We interpret these terms as referring to skills that permeate all types of activities and are challenging to assess due to their focus on behaviour and abilities. The term soft skills (or interpersonal skills) encompasses relational intelligence, communication abilities, character, and interpersonal aptitudes, which in turn can underpin leadership qualities. [...] term research initiatives (Joly & Matt, 2017; Joly et al., 2019). INRAE maintains that impact cannot be attributed solely to one individual (the scientist), as the impact is systemic and involves multiple actions and stakeholders.
Based on the different axes outlined in Figure 1, there is no significant discrepancy in the mode of research assessment between INRAE and other international organizations. In fact, INRAE is one of the organizations that have established specific procedures and criteria for assessing open science practices. In practice, we recommend that the individual being evaluated mention their potential involvement in open science by i) listing new products specific to bibliodiversity (e.g. preprints, data repositories), ii) describing actions toward achieving the FAIR data principles, iii) outlining their strategy in open science (e.g. preference for diamond or gold journals), and, if applicable, iv) explaining their personal involvement or actions in open science initiatives such as participation in Peer Community In, engaging in open peer reviews, processes for sharing data, or involvement in citizen science projects.
Open science practices raise questions regarding scientific integrity due to the expansion of target audiences, widespread dissemination of research results and increased production. Consequently, there is a risk of misuse of this information, such as its propagation through social networks or preprints being misconstrued as validated science. This necessitates heightened vigilance regarding ethical and deontological considerations (Shaw, 2003), as well as a focus on transparency and traceability of research processes. Moreover, greater attention must be paid to research data, their management, and, when appropriate, their sharing. Regarding the assessment of scientific integrity, INRAE provides scientists with the opportunity to articulate in their report how they uphold their scientific integrity in general and, more specifically, in relation to open science practices. This allows individuals to reflect on their adherence to ethical principles and demonstrate their commitment to maintaining the integrity of their research endeavors.

A multicriteria assessment
There are several reasons for considering different criteria during the assessment of scientists. Firstly, INRAE is a research institute that integrates basic and applied approaches to achieve final objectives for society. This encompasses various disciplines and expertise, each requiring specific criteria, ranging from agronomists working with farmers to molecular biologists in the laboratory. Secondly, scientists' missions extend beyond knowledge production to encompass expertise, education, and management. Thirdly, scientists' missions evolve throughout their careers, with senior scientists often becoming increasingly involved in management roles.
In the mid-2000s, INRA and Irstea (the two founding organizations of INRAE) established and participated in the inter-institutional working group on the evaluation of finalized research, known as EREFIN (from the French "Evaluation de la REcherche FINalisée", i.e., "Assessment of Finalized Research"). The objective was to promote a comprehensive evaluation that goes beyond simply assessing the production and dissemination of new knowledge to include other missions such as expertise, training, and contribution to scientific culture. This initiative resulted in the development of tools that provide a framework for various activities, a catalog of potential outputs and descriptors, and assessment criteria (EREFIN, 2011). Today, these tools remain widely used in various organizations and evaluations throughout France.
At INRAE, building upon the framework provided by EREFIN, we encourage scientists to report their engagement in various anticipated activities through four primary categories (Figure 2):
- Production of knowledge.
- Expertise and knowledge mobilization.
- Training through research, initial and continuing training.
- Animation or direction of institutional groups, major instruments, resources, programs or networks.
This list represents various potential areas of action, subject to the "beauty" judgment of the peers within the SSC; there is no expectation for any individual scientist to engage in all of these activities. Consequently, peers will not criticize a scientist for not participating in all four activities. Instead, peers are tasked with evaluating how scientists manage their diverse activities and whether this aligns with their objectives and career stage.
At INRAE, assessment also involves examining scientists' trajectories, recognizing that junior and senior scientists may allocate their time differently among the four main types of activity. Depending on factors such as age, experience, career path, and scientific field, researchers may engage with these four types of activity to varying degrees (Figure 3).

Impact of evaluation for scientists and INRAE
Individual assessments of scientists have two main outputs: one for INRAE, in terms of the identification of potential new scientific breakthroughs, and one for the scientist, in terms of professional trajectory and development. In short, these outputs concern the "evaluation of evaluation", or the impact of assessment. For the Institute, the evaluation reports compiled by the researchers represent an unparalleled and highly valuable database from which both quantitative and qualitative analyses can be conducted. These analyses serve to assess the current state (strengths and weaknesses) of research expression at INRAE and to identify opportunities or risks for the future. INRAE is currently focused on i) determining the questions to which evaluation data could provide answers, and ii) developing the tools to extract and analyze the information. For the scientists, ASIRPA's ex-post tools (Joly and Matt, 2017) could be adapted to analyze the impact of specific projects on researchers' careers, helping them gain perspective on their career trajectory and impact, and thus clarifying their professional future (scientific directions, mission orientation, etc.).

Towards assessment by interview
INRAE currently has a roadmap that includes several items, such as integrating elements of scientific integrity into scientists' reports. However, it remains challenging to define criteria or indicators for this integration (Moher et al., 2020). One option could be to incorporate an interview between the assessed scientist and the SSC; this approach might allow for a deeper exploration of how the scientist embodies scientific integrity across four crucial aspects: reliability, honesty, respect and accountability. Additionally, interviews could encompass other aspects related to soft skills (e.g., interpersonal skills, listening abilities, co-construction capabilities, healthy debate skills, and mentoring), which are easier to appreciate through face-to-face discussion. Interviews would also provide an opportunity to engage in scientific discussions, further aiding scientists in their professional trajectories. Although scientists are encouraged to mention in their report any difficulties or failures they may have encountered, an interview may be better suited for explaining these delicate situations. There are numerous advantages to considering interviews in the individual assessment process. However, two obstacles currently exist: firstly, the number of scientists at INRAE (approximately 2,500) makes organizing such interviews challenging, and secondly, interviews inherently breach the confidentiality rule of assessment. Nevertheless, one potential compromise could involve interviewing only a subset of assessed scientists, selected on the basis of specific career periods, thereby accepting that confidentiality may be breached in these cases.

Environmental impact
The evolution of evaluation procedures and criteria is necessary as it aligns with scientists' evolving missions and work methods. The professional environment influences how scientists carry out their jobs. A simple example is the impact of the SARS-CoV-2 epidemic on on-site work and telecommuting. However, on a deeper level, broader environmental and societal contexts (not directly related to scientific activities) may prompt new ways of considering scientists' roles. At INRAE, some scientists are mindful of the impact of their activities on climate change and of their carbon footprint. Consequently, they may adjust their experimental plans to be more energy-efficient and reduce international air travel. These changes in practices could affect the scale of experiments and/or international collaborations. It is too early to determine whether these potential changes will become long-term practices, but they illustrate how assessment procedures might evolve to account for the environmental impact of research activities.

Figure 1 - Comparison of categories used for assessment of research activities between INRAE (left) and other organizations (right).

Figure 2 - Description of the four main types of activities at INRAE that scientists under assessment can mention and develop in their report.

Figure 3 - Example of the distribution of activities of junior and senior INRAE scientists. Based on 4989 responses from scientists over the period 2015-2021.

Table 1 - List of the 13 INRAE Specialized Scientific Commissions (SSCs), in alphabetical order. Each of the 13 INRAE SSCs is a group of 20-24 scientists nominated or elected for four years, half of whom do not belong to INRAE, and is headed by a president who is external to INRAE. Within a committee, different categories are represented: gender balance and age categories are respected, aiming to balance "junior" and "senior" assessors to capitalise on different views. Each SSC has followed precise guidelines issued by INRAE for several years (Direction de l'Evaluation, 2023). When peers meet to discuss the different dossiers, a representative of the Evaluation Department is present to guarantee the process, ensuring proper operation according to SSC rules, particularly to prevent any misuse of quantitative criteria and to respect the 25 criteria of discrimination prohibited by French law (article 225-1 of the penal code, 2022; https://www.legifrance.gouv.fr/codes/article_lc/LEGIARTI000045391831).

Table 2 - Types of difficulties identified by the peers, for 2020 and 2021. This corresponds to 105 different assessments (scientists). For each scientist, the difficulty may fall under several types*.

Table 3 - Corpus used to compare and contrast scientist assessment at INRAE and other organizations. YUFE: Young Universities of the Future Europe. LERU: League of European Research Universities. KNAW, NWO and VSNU: association of universities in the Netherlands, which had already signed the DORA declaration. DORA: San Francisco Declaration on Research Assessment. EUA: European University Association. TJNK: Finnish Committee for Public Information. TSV: Federation of Finnish Learned Societies. FOLEC: Latin American Forum on Research Assessment. ZonMw: The Netherlands Organization for Health Research and Development.