Keywords
Bias, burden, conservatism, innovation, creativity, peer review, grant funding
This article is included in the Research on Research, Policy & Culture gateway.
This article is included in the Peer Review and the Pandemic collection.
Peer review is a core part of the academic research process, with the majority of academic research funding allocated through peer review. In a recent review (Guthrie et al., 2018), some of the potential limitations and outcomes of the use of peer review to allocate research funding were explored, with a key finding of that work being that there is a need for further experimentation and evaluation in relation to peer review and the grant award process more widely. However, measuring the performance of funding allocation processes can be challenging, and there is a need to better share learning and approaches. This paper aims to address this gap by providing a review of existing practice in the measurement of research funding processes in relation to three of the main challenges identified by Guthrie et al. (2018): bias, burden and conservatism.
The intention of this work is to provide a review of ideas and approaches that funders can use to better analyse their own funding processes and to help facilitate a more open and analytical review of funding systems. Through our interviews with funders, we also explored current practice internationally in attempting to reduce burden and bias and to facilitate innovation and creativity in research.
We undertook a Rapid Evidence Assessment (REA) that built on previous work—such as that by Guthrie et al. (2018) —encompassing methods for evaluating programs, the challenges faced in evaluation, issues associated with research evaluation and the importance of responsible metrics. We focused specifically on metrics and measurement approaches that address bias, burden and conservatism. We restricted our search to literature in English from the 10 years between 2008 and 2018 to ensure we focused on the latest developments in the field and current best practice. We covered academic literature in Scopus as well as grey literature, e.g. policy reports and studies, government documents and news articles.
We identified relevant literature through three routes:
1. Academic literature search: Scopus search using the search terms in Table 1 for publications from 2008 onwards. To identify literature that focused on bias, burden and conservatism, we operationalised these search strings as follows: [Group 1 AND Group 2 AND (Group 3 OR Group 4 OR Group 5 OR Group 6 OR Group 7)].
2. Grey literature search: search on the websites of the funding bodies considered in this study (Table 2)
3. Snowballing: Snowballing refers to the continuous, recursive process of gathering and searching for references from within the bibliographies of the shortlisted articles. We performed snowballing from the reference lists of publications identified following screening.
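The Boolean operationalisation in route 1 can be sketched programmatically. The sketch below is illustrative only: the group contents are placeholder assumptions, and the actual search terms are those listed in Table 1.

```python
# Placeholder term groups -- the real terms are those in Table 1 of the
# paper; the contents below are assumptions for illustration only.
groups = {
    1: ["peer review", "grant review"],        # process terms
    2: ["research funding", "grant funding"],  # funding terms
    3: ["bias"],
    4: ["burden"],
    5: ["conservatism"],
    6: ["innovation"],
    7: ["creativity"],
}

def or_clause(terms):
    """Join the terms of one group with OR, quoting each phrase."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_query(groups):
    """Combine groups as Group 1 AND Group 2 AND (Group 3 OR ... OR Group 7)."""
    g1, g2 = or_clause(groups[1]), or_clause(groups[2])
    rest = " OR ".join(or_clause(groups[i]) for i in range(3, 8))
    return f"{g1} AND {g2} AND ({rest})"

query = build_query(groups)
```

Such a builder keeps the [Group 1 AND Group 2 AND (Group 3 OR Group 4 OR Group 5 OR Group 6 OR Group 7)] structure fixed while letting the term groups be amended and the search re-run.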
Using the search strings described above, the Scopus database yielded 1,741 results. We performed an initial test of our strategy by checking that specific key papers we were already aware of appeared in the results, for example Guthrie et al. (2018). Once satisfied that the search strategy was performing effectively, we implemented a filtering process to determine the inclusion or exclusion of articles based on their relevance to the primary objectives of this task, as set out in Figure 1.
Data extraction was performed using a data extraction template—a pre-determined coding framework based on the study aims (i.e. bias, burden, and conservatism). The headers of this template against which data was extracted for each article (where available) were:
Researcher extracting data (initials)
Source (author, date)
Title
Year of publication
URL
Type of document (journal article, review, grey report, book, policy, working paper, etc)
Objectives (aims of the work)
Area of peer review (journals, grants, other)
Evaluation framework or model (to evaluate funding program)
Evidence on and measures of burden (on researchers, institutions, funding bodies)
Evidence on and measures of bias (gender, career stage, research type, institution)
Evidence on and measures of innovation
Datasets and collection (any datasets used for evaluation purposes or information on data collection)
Metrics and indicators (any specified metrics used for evaluation)
Challenges (any identified challenges for evaluations)
Strengths and weaknesses (of the study)
Quality of study design and conduct (if appropriate assign red, amber, or green)
Strength and generalisability of the findings (assign red, amber, or green)
Three researchers performed the full extraction of 100 articles in parallel. During this process, each researcher was instructed to add key references to a ‘snowballing database’. The snowballing database was populated with 15 articles, which were passed through the filtering processes described above, yielding an additional eight papers that were fully extracted. We also considered additional articles using a combination of targeted web searches and suggestions from our key informant interviews. These methods yielded an additional 18 articles that were included in our REA.
We conducted key informant interviews with one representative from each research funding organisation in Table 2 in order to understand how evaluation methods are employed in practice and to explore evaluation approaches that may not be documented in the literature. We identified respondents with relevant expertise at key biomedical and health research funders internationally and contacted them by email to request their participation. We focused on developed research systems that may be comparable with the Australian system, primarily in Europe and North America. We also interviewed researchers working on the analysis of peer review and grant funding approaches and their challenges. Initially, 12 individuals were contacted: six agreed to participate, five did not respond to our request, and one declined to participate. Two of those contacted identified colleagues to participate in their place; these colleagues were contacted by email and in both cases accepted.
Interviews were conducted by telephone and lasted for approximately one hour. Interviews were recorded and field notes taken. One interview was conducted per participant. Interviews were conducted following a semi-structured protocol (see Table 3) to enable consistent evidence collection while providing the opportunity to explore emerging issues. As the interviews were designed to be semi-structured, we encouraged the interviewees to explore areas they thought were important that may not have been directly covered in our interview protocol. To protect the anonymity of the interviewees, the analysis that we report does not make any specific reference to individuals; we use the interview identifiers INT01, INT02, etc. to make references to specific interviews in our analysis.
The analysis took a framework analytic approach, aiming to capture information on processes and metrics used in practice across organisations, in line with the aims of this study: to identify how bias, burden and innovation in funding processes can be measured. Data from each interview was coded into an Excel template by each individual conducting interviews (GM, DRR), with one row per interview (and hence organisation). The column headers were as follows:
Organisation name
Aims/structure of funding programme
Application process
Review process
Burden
Bias
Innovation
Evaluation method
Metrics used for evaluation
Challenges
Strengths
Weaknesses
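The coding framework above is effectively a one-row-per-interview table. As a minimal sketch, such a template could be generated as follows; the example writes CSV rather than Excel so that it is self-contained, and the `write_template` helper is hypothetical rather than part of the study's tooling.

```python
import csv

# Column headers of the interview coding template described above.
COLUMNS = [
    "Organisation name", "Aims/structure of funding programme",
    "Application process", "Review process", "Burden", "Bias",
    "Innovation", "Evaluation method", "Metrics used for evaluation",
    "Challenges", "Strengths", "Weaknesses",
]

def write_template(path, rows):
    """Write one row per interview; rows are dicts keyed by COLUMNS."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```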
Analysis was primarily focused on capturing information on practice at each of these organisations to provide a picture of the methods currently being used by research funding organisations to measure and to alleviate burden, bias and conservatism in peer review-based funding processes. However, we also reviewed evidence on challenges, strengths and weaknesses to identify any information to inform our wider analysis and discussion.
This study was recommended for exemption by the RAND Human Subjects Protection Committee. Participant consent was obtained orally at the start of each interview. The precise detail of consent sought is set out in the interview protocol (Table 3).
Robustness of procedure and efficiency of funding distribution are the two pillars supporting the legitimacy of peer review (Gurwitz et al., 2014), which has been described as a cornerstone of the scientific method (Tomkins et al., 2017) and the backbone of modern science (Tamblyn et al., 2018). A robust peer review process must be fair and objective in the distribution of grants. However, an increasing number of studies suggest that systematic bias occurs in a range of forms across grant peer review processes.
While there are an increasing number of studies examining bias in grant peer review, there is still deemed to be a lack of comparable, quantitative studies in the area (Bromham et al., 2016; Gurwitz et al., 2014; Guthrie et al., 2018). The main body of work has been performed by analysing historic data made available by funding agencies to academic researchers, though funding bodies themselves have undertaken work in the area (DFG, 2016; Ranga et al., 2012). A challenge in identifying and evaluating sources of bias is the lack of generalisability of findings, as funding programs have highly variable structures and funding bodies collect and make available different datasets. Table 4 lists the measurement approaches employed in the literature, indicates which areas of potential bias were explored and provides key references. A table listing each item identified during the literature review is available as Underlying data (Guthrie, 2019).
Measurement approach | Area of potential bias investigated |
---|---|
Statistical evaluation of funding data | • Gender (Kaatz et al., 2016; Mutz et al., 2012; Tamblyn et al., 2018; Van Der Lee et al., 2015) • Field of Research (Tamblyn et al., 2018) • Ethnicity (Ginther et al., 2011) • Institution size (Murray et al., 2016) • Reviewer expertise (Gallo et al., 2016; Li, 2017) • Reviewer social environment (Marsh et al., 2008) |
Bibliometrics | • Gender (Tamblyn et al., 2018; Wennerås & Wold, 1997) • Career stage (Bornmann & Daniel, 2005; Gaster & Gaster, 2012) |
Text mining and analysis | • Gender (Kaatz et al., 2015; Malikireddy et al., 2017) |
Longitudinal | • Gender (Levitt, 2010) |
Experimental randomisation | • Ethnicity (Forscher et al., 2018) |
New metrics | • Field of research (Bromham et al., 2016) |
Funding agency | Strategies to reduce bias | Approaches to evaluating bias |
---|---|---|
European Research Council | 1. Three different grant programs open to applicants based on years post-PhD • Starting grants (€2 million for 5 years): 2–7 years post-PhD • Consolidator grants (€2.75 million for 5 years): 7–15 years post-PhD • Advanced grant (€3.5 million for 5 years): over 15 years post-PhD 2. Years post-PhD required for the starting and consolidator grants can be broadened due to career disruptions 3. Scientific Council made up of prominent researchers who brief reviewers on bias 4. Training video provided to reviewers | 1. Impact studies of funding on researchers’ careers 2. Working Group on Widening European Participation: proposes measures to encourage high-calibre scientists from regions with a lower participation rate to apply successfully, and analyses data and processes to ensure that the ERC peer review is unbiased 3. Working Group on Gender Balance. The group has commissioned two studies: gender aspects in career structures and career paths, and ERC proposal submission, peer review and gender mainstreaming |
German Research Foundation | 1. Since 2008, member organisations of the DFG implement the Research-Oriented Standards on Gender Equality, aimed at enhancing gender equality at German Universities 2. DFG created the DFG toolbox, a database listing around 300 equality measures 3. Applications can be submitted in English or German, with a preference for English. Applications in English widen the circle of potential reviewers and make it easier to avoid bias 4. DFG supports researchers at different stages in their career through direct promotion of individuals and through prizes for scientists at various stages of their academic career 5. DFG offers contact persons to advise on program selection and the application process | 1. Number of applications by gender, age and ethnicity 2. Success rate by gender, age and ethnicity proportional to applications received 3. Number of women in panels 4. Number of female reviewers 5. Progression of careers 6. Institutional bias |
Medical Research Council | 1. For interdisciplinary research, a cross-council funding agreement is in place. One research council leads on review, ensuring an appropriate mix of reviewers are approached taking account of advice received from other councils on potential peer reviewers and the need for advice from reviewers with relevant expertise in working across disciplinary boundaries | 1. Model funding rates considering the age, background, sex and subject field of recipients 2. Funding rates of interdisciplinary proposals 3. Funding rates of smaller research fields 4. Research funding and career progression of recipients for 7–8 years. |
Canadian Institutes of Health Research | 1. Software and algorithms used to assign reviewers 2. Unconscious bias training 3. Indigenous Persons Committee | 1. Equality, diversity and inclusion assessments |
National Institute for Health Research – Research for Patient Benefit | 1. Different funding tiers depending on scope of the study: • Tier 1: randomised controlled trials • Tier 2: feasibility studies • Tier 3: innovative research | 1. Collect data on gender and send it externally for analysis 2. Diversity of applications |
Australian Research Council | 1. Research Opportunity Performance Experience (ROPE) statement 2. Assessor and selection committee training (including unconscious bias) 3. Report on grant outcomes by gender 4. Australian and New Zealand Standard Research Classification (ANZSRC) codes allows granular linking of research proposals to assessors with the appropriate expertise 5. Where applications are considered by discipline-specific panels, the available funds are relative to the number of applications made falling under the panel 6. Reporting of successful grants and ongoing access to all research projects supported by the ARC 7. Minimum number of assessors per application, and identifying discrepancy in scores | 1. Addressed through overall approaches to evaluation, such as seeking regular feedback from sector, survey of reviewers, targeted evaluations and international benchmarking |
AQuAS1 | 1. Fund projects based on qualitative perception of the evaluator in terms of the overall portfolio of publications and scientific outputs, analysed within context | 1. Questionnaires sent to Centres and institutes asking for specific information (i.e. gender leadership and public engagement) |
ZonMw | 1. Three different grant programs open to applicants based on years post-PhD • Veni (€250,000 for three years): up to three years post-PhD or women with career disruption • Vidi (€800,000 for three years): up to eight years post-PhD • Vici (€1.5 million for five years): above eight years post-PhD 2. Review committee is put together considering the male:female ratio 3. Each application reviewed by five panel members 4. Internal program (Open Science) where they address issues with bias 5. Participate in ‘funders forum’ and discuss guiding principles of research, including bias 6. Policy in place to favour female applicants when the gender ratio is highly biased towards men and there are two equally good applicants | No specific information was found for ZonMw on evaluating bias |
Applicant characteristics. Applicant characteristics collected during grant application processes may include gender, age, race, ethnicity and nationality. While reviewers do not usually see information about all of these characteristics, for instance race and ethnicity (Erickson, 2011), they may be apparent from an applicant’s name, affiliation or the reviewer’s prior knowledge.
Gender bias has been the most-studied applicant characteristic, having gained significant visibility through an early study which showed that females needed to be 2.5-fold more productive to achieve the same scores as males in the Swedish Medical Research Council’s peer review process (Wennerås & Wold, 1997).
Following this initial study, gender bias has been explored in several different countries. In the Netherlands, researchers funded by The Netherlands Organisation for Scientific Research (NWO) examined 2,823 applications between 2010 and 2012 from early career scientists, analysed gender as a statistical predictor of funding rate, and examined the success rate throughout the process (application, pre-selection, interview, award) (Van Der Lee et al., 2015). The authors found that there was a gender disparity, with males receiving higher scores in ‘quality of researcher’ evaluations but not ‘quality of proposal’ evaluations, particularly in disciplines with equal gender distribution among applicants.
Another study in the US looked at bias in the Research Project (R01) grants from the National Institutes of Health (NIH), and found this grant program exhibited gender bias in Type 2 renewal applications (Kaatz et al., 2016). The authors analysed 739 critiques of both funded and unfunded applications, using text analysis and regression models. The study found that reviewers gave worse scores to female applicants even though their critiques contained more standout adjectives. A second piece of work from the same authors employed more state-of-the-art text mining algorithms to discover linguistic patterns in the critiques (Malikireddy et al., 2017). The algorithms showed that male investigators were described in terms of leadership and personal achievement while females were described in terms of their working environments and ‘expertise’, potentially suggesting an implicit bias whereby reviewers more easily view males as scientific leaders, which is a criterion of several grant funding programs.
In a longitudinal study, researchers followed the careers of an elite cohort of PhDs who started postdoctoral fellowships between 1992 and 1994 (Levitt, 2010). The study found that 16 years after the fellowships, although 9 per cent of males had stopped working in a scientific field, compared with 28 per cent of females, there was no significant difference in the fractions obtaining associate or full professorships. However, females whose mentors had an h-index in the top quartile were almost three times more likely to receive grant funding – males’ success had no such correlation with their mentors’ publication record.
In a Canadian Institutes of Health Research (CIHR) funded study, researchers evaluated all grant applications submitted to CIHR in the years 2012–2014 (Tamblyn et al., 2018). Descriptive statistics were used to summarise grant applications, along with applicant and reviewer characteristics. The dataset was then interrogated with a range of statistical approaches (2-tailed F-test, Wald χ2 test), which showed that higher scores were associated with having previously obtained funding and the applicant’s h-index and lower scores with applicants who were female or working in the applied sciences.
Some funding agencies do not detect gender bias in their grant programs. For example, the Austrian Science Fund performed an analysis of 8,496 research proposals from the years 1999–2009 using a multilevel regression model, and found no statistically significant association between application outcome and gender (Mutz et al., 2012). Meta-analyses of gender bias have reported on both sides of the debate, with claims that applicant gender has little (Ceci & Williams, 2011) or substantial (Bornmann et al., 2007) effect on receiving grants.
Exploration in relation to racial bias has also been performed, though there is a smaller body of work than on gender bias. In 2011 researchers funded by the NIH showed that black applicants were ten percentage points less likely to obtain R01 funding than their white peers, after extensively controlling for external factors (educational background, country of origin, training, previous research awards, publication record and employer characteristics) (Ginther et al., 2011). A funding gap between white/mixed-race applicants and minority applicants has been a persistent feature of NIH grant funding between 1985 and 2013 (Check Hayden, 2015). According to a preprint article from mid-2018, racial bias in the NIH system may have diminished (Forscher et al., 2018). The researchers report on an experiment where 48 NIH R01 proposals were modified to contain white male, white female, black male and black female names before being sent for review by 412 scientists. The authors found no evidence—at the level of ‘pragmatic importance’—of white male names receiving better evaluations than any other group; however, they note there may be bias present at other stages of the granting process.
Career stage. Career stage is another potential source of bias in the peer review process. An ageing population of researchers may crowd out early-career researchers from funding, preventing them from establishing their careers (Blau & Weinberg, 2017). In the US, the average career age of new NIH grantees, defined as years elapsed since award of the highest doctoral degree, increased dramatically between 1965 and 2005, rising from 7.25 years to 12.8 years (Azoulay et al., 2013). The cause of the shift is uncertain. Proposed theories include an increased burden of knowledge due to an expanding scientific frontier; the use of post-doctoral positions as a ‘holding tank’ for cheap, skilled labour; and the move to awarding grants as prizes for substantial preliminary results rather than to fund new research (Azoulay et al., 2013).
Measuring bias regarding career stage is problematic due to the challenges associated with defining career stage. Career age—the years elapsed since award of the highest doctoral degree—is one commonly used description (Azoulay et al., 2013). While this approach is suitable for identifying strong trends, such as the near doubling of career age of new NIH grantees discussed above, it does not take into account factors such as teaching commitments, changing research topics, clinical work or career breaks (Spaan, 2010).
There are other approaches to defining career stage, for example focusing on necessary competences rather than time elapsed. The European Framework for Research Careers has four stages—first stage researcher, recognised researcher, established researcher and leading researcher—and provides a classification system that is independent of career path or sector (EC-DGRI, 2011).
Research field. There may be biases between research fields and also against research that falls between, or combines, those fields. While interdisciplinary research is often considered fertile ground for innovation, there is a belief among researchers that interdisciplinary proposals are less likely to receive funding (Bromham et al., 2016). Defining and identifying interdisciplinary research is a challenge that has hindered the evaluation of this potentially damaging belief. A recent study sought to address this challenge by developing a biodiversity metric, the interdisciplinary distance (IDD) metric, to capture the relative representation of different research fields and the distance between them (Bromham et al., 2016). Using data from 18,476 proposals submitted to the Australian Research Council’s Discovery Program over a five-year period, the authors found that the greater the degree of interdisciplinarity, the lower the probability of an application being funded.
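One standard way to combine the relative representation of fields with the distances between them, which is what the IDD metric does, is a biodiversity-style measure such as Rao's quadratic entropy. The sketch below illustrates that family of measures; it is not necessarily the exact formula of Bromham et al. (2016), and the field names and distances are invented.

```python
# Rao's quadratic entropy: a biodiversity-style measure combining the
# proportional representation of categories (here, research fields) with
# pairwise distances between them. A generic sketch, not necessarily the
# exact IDD formula of Bromham et al. (2016).
def rao_quadratic_entropy(proportions, distance):
    """proportions: dict field -> share of the proposal;
    distance: dict (field_a, field_b) -> distance in [0, 1]."""
    fields = list(proportions)
    q = 0.0
    for a in fields:
        for b in fields:
            d = distance.get((a, b), distance.get((b, a), 0.0))
            q += proportions[a] * proportions[b] * d
    return q

# A proposal split evenly across two distant fields scores higher than
# one concentrated in a single field (field names are invented).
d = {("ecology", "economics"): 0.9}
print(rao_quadratic_entropy({"ecology": 0.5, "economics": 0.5}, d))  # 0.45
print(rao_quadratic_entropy({"ecology": 1.0}, d))  # 0.0
```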
Institution. There is also some evidence that characteristics of the institution may be a source of bias in the grant application process. For example, a 2016 study of Canada’s Natural Sciences and Engineering Research Council (NSERC) Discovery Grant program found that funding success and quantity were consistently lower for applicants from small institutions, and that this finding persisted across all levels of applicant experience as well as three different scoring criteria (Murray et al., 2016). The authors analysed 13,526 proposal review scores, using logistical regression to determine patterns of funding success and developing a forecasting model that was parameterised using the dataset. The authors note that some differences between institutions may be due to differences in merit and differences in research environments; they recommend that more needs to be done to ensure funds are distributed appropriately and without bias.
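The kind of model used in that study can be illustrated with a minimal logistic specification. The coefficients below are invented assumptions chosen only to show the mechanics; they are not estimates from Murray et al. (2016).

```python
import math

# Invented coefficients for illustration only; not values from the study.
INTERCEPT = -0.4
BETA_SMALL_INSTITUTION = -0.5  # assumed penalty for small-institution applicants
BETA_EXPERIENCE = 0.03         # assumed effect per year of applicant experience

def p_funded(small_institution, years_experience):
    """Predicted probability of funding success under this sketched model."""
    z = (INTERCEPT
         + BETA_SMALL_INSTITUTION * int(small_institution)
         + BETA_EXPERIENCE * years_experience)
    return 1 / (1 + math.exp(-z))
```

Fitting such a model to real review scores, as the authors did with 13,526 of them, estimates the coefficients from data; the institution-size finding corresponds to a reliably negative coefficient on the small-institution term across experience levels and scoring criteria.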
Reviewers. Reviewers may have overt or implicit biases that can affect their scoring of grant proposals, some of which are noted above. The level of expertise that reviewers have relating to an application can affect their evaluations, with studies finding both advantageous and disadvantageous effects. Li examined this issue by constructing and analysing a dataset of almost 100,000 applications evaluated in over 2,000 meetings (Li, 2017). The study found an applicant was 2.2 per cent more likely to receive funding, the equivalent of one-quarter of the standard deviation, if evaluated by an intellectually closer reviewer as measured by the number of permanent reviewers who had cited the applicant’s work in the five years prior to the meeting. Conversely, another study found that reviewers with more expertise in an applicant’s field, as measured by a self-assessment of their level of expertise relating to an application, were harsher in their evaluations (Gallo et al., 2016).
The characteristics of reviewers have also been shown to affect their evaluations. Jayasinghe, Marsh and Bond have published several studies based on collaboration with the Australian Research Council (Marsh et al., 2008) exploring the peer review of grant applications. One finding was that the nationality of peer reviewers affected the ratings they gave, with Australian reviewers scoring similarly to European reviewers, but more harshly than reviewers from other countries, notably North America. The authors were unable to determine whether the cause was Australian reviewers’ awareness that they were likely competing with applicants for funding, or simply a difference between nationalities.
Even in the absence of bias, reviewers may not always agree on the quality of a proposal. The concept of inter-rater reliability—the degree of agreement among raters—is central to peer-review, yet has not been thoroughly examined in this context (Clarke et al., 2016). Three studies over the last half century have shown quite consistent levels of disagreement between reviewers, ranging from 24–35 per cent disagreement (Cole et al., 1981; Fogelholm et al., 2012; Hodgson, 1997). A more recent randomised trial study considered 60 applications to NHMRC’s Early Career Fellowship program, which were duplicated by NHMRC secretariat and reviewed by two grant panels (Clarke et al., 2016). The study found inter-rater reliability to be 83 per cent, which is comparable to the previous studies. The authors suggest that the slight reduction in disagreement may be due to the nature of early career applications or differences in the scoring and assessment criteria.
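The disagreement rates quoted above rest on percent agreement, the simplest inter-rater reliability measure. A minimal sketch, using invented fund/not-fund decisions from two panels scoring the same duplicated applications:

```python
# Illustrative sketch: percent agreement between two review panels scoring
# the same duplicated applications (decisions invented for illustration).
def percent_agreement(panel_a, panel_b):
    """Share of applications on which both panels reached the same
    fund / not-fund decision, as a percentage."""
    assert len(panel_a) == len(panel_b)
    agree = sum(a == b for a, b in zip(panel_a, panel_b))
    return 100 * agree / len(panel_a)

# 1 = recommend funding, 0 = do not fund (hypothetical decisions)
panel_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
panel_b = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
print(percent_agreement(panel_a, panel_b))  # 80.0
```

Percent agreement ignores agreement expected by chance; a chance-corrected statistic such as Cohen's kappa is a common refinement.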
As the research community has gained an increasing awareness of bias, steps have been taken to develop fairer processes and programs. Table 5 below provides a summary of approaches used by a range of international funders both to reduce bias in their funding processes, and to evaluate bias across their funding streams.
Many funders now have targeted grant streams to support applicants who were found to be disadvantaged by biases in peer review or program structure, such as early career researchers. For example, the NIH K02 Independent Research Scientist Development Award, the Medical Research Council (MRC) New Investigator Research Grant, and the European Research Council (ERC) Starting Grants, are a small selection of funding aimed at early career researchers.
There is some emerging evidence that training can reduce bias and increase the inter-rater reliability of reviewers. The CIHR introduced a reviewer training program following the discovery that its new grant system focusing on applicants’ track records was disadvantaging women, while a program focusing on the research proposal was not. In the grant cycle following the introduction of a training module on unconscious biases, female and male scientists had equal success rates (Guglielmi, 2018). Additionally, an online training video was found to increase the inter-rater reliability for both novice and experienced NIH reviewers, with correlation scores rising from 0.61 to 0.89 following training (Sattler et al., 2015).
Blinding the identity of applicants from reviewers has been studied as a mechanism for increasing the fairness of peer review systems. In the context of journal peer review, the journal Behavioural Ecology found that its introduction of double-blind review increased the representation of female authors by 33 per cent, to reach a level that reflects the composition of the life sciences academic workforce (Budden et al., 2008). The US National Science Foundation (NSF) has trialled a blinded application process called ‘The Big Pitch’, which involves applicants submitting an anonymised two-page research proposal alongside a full conventional proposal (Bhattacharjee, 2012). The NSF reported that there was only ‘a weak correlation’ between the success outcomes of the full and the brief, anonymous applications.
The number of grant applications for research submitted for review is increasing in the majority of countries and disciplines (De Vrieze, 2017). However, funding for research is being reduced, leading to a decrease in the success rate of applications. The grant application process is time-consuming and costly, with the burden falling on those applying for the funding (Bolli, 2014; Gurwitz et al., 2014; Guthrie et al., 2018; Kulage et al., 2015), those reviewing the applications submitted (Bolli, 2014; Kitt et al., 2009; Schroter et al., 2010; Snell, 2015), the funding agency (Liaw et al., 2017; Schroter et al., 2010; Snell, 2015) and the research institutions (Kulage et al., 2015; Ledford, 2014; Specht, 2013).
Measuring burden in the funding process. The burden of the grant application process has been measured for applicants, reviewers, funders and research institutions using a variety of methods. A list of the different approaches used to evaluate burden of the application process can be found in Table 6.
Reference | Objective | Evaluation method | Methodology | Outcomes |
---|---|---|---|---|
Herbert et al., 2014 | Examine the impact of applying for funding on personal workloads, stress and family relationships | Qualitative study of researchers preparing grant proposals | Researchers were asked questions regarding their current academic level, location of primary institution, role, grants currently held, and proposals submitted in the latest round. They were then asked to rate their agreement (from ‘strongly agree’ to ‘strongly disagree’) with statements such as ‘I give top priority to writing my proposals over my other work commitments’ | • Preparing grant proposals took priority over other work for 97 per cent of researchers • Preparing grant proposals took priority over personal commitments for 87 per cent of researchers • The workload of grant writing season was seen as stressful by 93 per cent of researchers • Funding deadlines restricted holidays for 88 per cent of researchers |
Kulage et al., 2015 | Determine the time and costs to a school of nursing to prepare a National Institutes of Health grant application | Prospectively recorded time spent preparing a grant proposal by principal investigators and research administrators in one school of nursing, and calculated the costs | 3 PIs, 1 PhD student and 3 research administrators who were planning applications agreed to track the time they spent on a daily basis on all activities related to preparing their grant applications. Time tracking forms were tailored to the different activities related to grant preparation for the different groups. Activities were divided into 4 categories: (a) preparatory work, (b) collaborative work, (c) grant preparation and writing, and (d) quality assurance. Cost calculation used the most recent data on average nationwide salaries, fringe rates and F&A cost recovery rates for personnel, academic medical centres and education and health service private industries | • The total time spent by research administrators was considerably less (33.9–56.4 h) than the total time spent by PIs (69.8–162.3 h) • PI costs for grant preparation were greater ($2,726–$11,098) than for associated research administrators combined ($1,813–$3,052) • The largest amount of time spent in grant preparation was in the writing/revising/formatting category |
Herbert et al., 2013 | Estimate the time spent by researchers preparing grant proposals, and examine whether spending more time increases the chances of success | Online survey by invitation | Researchers were asked if they were the lead researcher on the proposal, how much time (in days) they spent on the proposal, whether the proposal was new or a resubmission, and their salary in order to estimate the cost of proposal preparation. Researchers who had submitted more than one proposal were asked to rank their proposals in the order they perceived them to be deserving of funding. Researchers were also asked about their previous experience with the grant peer-review system as an expert panel member or external peer reviewer. The number of days spent preparing proposals was estimated based on the data collected. The authors also used a logistic regression model to estimate the prevalence ratio of success according to the researchers’ experience and time spent on the proposal. The authors then examined potential non-linear associations between time and success, as well as comparing the researchers’ ranking of their proposals with their outcome through peer review | This study found an estimated 550 working years of researchers’ time was spent preparing the 3,727 proposals submitted for NHMRC funding in 2012, accounting for an estimated AU$66 million per year The authors also found that more time spent on the proposal did not increase the chance of a successful outcome A slight yet not statistically significant increase in success rate was associated with experience with the peer-review system |
Schroter et al., 2010 | Describe the current status of grant review for biomedical projects and programs, and explore the interest in developing uniform requirements for grant review | Survey | Biomedical research funding organisations were selected from North America, Europe, Australia, and New Zealand, to include both private and public funders. The study used a questionnaire developed by the authors based on discussions with funding agencies about current practice and problems with peer review. Participants were asked to respond with ‘never’, ‘occasionally’, ‘frequently’, or ‘very frequently’ to a series of statements. Statements focusing on problems experienced by organisations with regard to peer review included reviewers declining to review; difficulty finding new reviewers for their database system; and difficulty retaining good reviewers (among others). Statements regarding the decision of reviewers to participate in the review process included: opportunity to learn something new; wanting to keep up to date on research advances in specific areas; and relevance of the topic to your own work or interests. Statements regarding barriers to reviewers’ participation in the review process included: insufficient interest in the focus of the application; having to review too many grants for the funding organisation; and lack of formal recognition of reviewer contributions | This study found funders are experiencing an increase in the number of applications and difficulty in finding external reviewers Some organisations highlighted the need for streamlining and efficient administrative systems, as well as support for unified requirements for applications The study also showed a sense of professional duty and fairness were the main motivators for reviewers to participate in the review process |
Barnett et al., 2015 | Evaluate the new streamlined application system of AusHSI | Observational study of proposals for four health services research funding rounds | Applicants were asked to estimate the number of days they spent preparing their proposal. The time from submission to notification of funding decision was recorded for each round. Applicants were invited to respond to their written feedback using email. Summary statistics comprised: applications received, eligible applications, resubmissions, shortlisted, interviewed, funded after interview, mean and median days spent by applicants on proposal, time from submission to notification, allocated funding and median budget | The streamlined application system led to an increase in success rates and shorter time from application submission to outcome notification. |
Surveys have frequently been used to assess the burden of grant preparation (Herbert et al., 2013; Herbert et al., 2014; Kulage et al., 2015; Wadman, 2010) and grant reviewing (Schroter et al., 2010; Wadman, 2010). These surveys ask about the average time spent on the application and review process, as well as the distribution of time across the various activities of the grant application process.
Another approach has been to use an observational study of proposals for health services research funding to measure time spent preparing proposals, the use of simplified scoring of grant proposals (reject, revise or invite to interview), and progression from submission to outcome (Barnett et al., 2015).
Others have estimated the cost of grant application by tracking the time spent preparing a proposal and combining the data with nationwide salaries, fringe rates, and facilities and administrative (F&A) cost recovery rates for personnel (Kulage et al., 2015).
The minimum number of reviewers required has also been assessed as a way to reduce the burden on reviewers (Liaw et al., 2017; Snell, 2015). One study focused on the NHMRC and looked at agreement on funding decisions between different numbers of panellists and different lengths of applications (Liaw et al., 2017). A second study evaluated the review process from a CIHR post-doctoral fellowship competition, bootstrapping replicates of scores from different reviewers to determine the minimum number of reviewers required to obtain reliable and consistent scores (Snell, 2015).
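The bootstrap logic behind the CIHR study can be sketched as follows. This is an illustrative reimplementation, not the study's actual code: the scores, panel sizes and the choice of the bootstrap spread of the mean as the stability criterion are all assumptions made for demonstration.

```python
import random
import statistics

def bootstrap_min_reviewers(scores, candidate_sizes, n_boot=2000, seed=0):
    """For each candidate panel size k, resample k reviewer scores with
    replacement n_boot times and report the spread (standard deviation)
    of the resampled mean score. The panel size at which this spread
    levels off suggests a minimum number of reviewers needed for a
    stable overall score."""
    rng = random.Random(seed)
    spread = {}
    for k in candidate_sizes:
        means = [
            statistics.mean(rng.choices(scores, k=k))  # resample k reviewers
            for _ in range(n_boot)
        ]
        spread[k] = statistics.stdev(means)
    return spread

# Hypothetical example: one application scored by seven reviewers on a 1-5 scale
scores = [3.5, 4.0, 2.5, 4.5, 3.0, 4.0, 3.5]
spread = bootstrap_min_reviewers(scores, candidate_sizes=[2, 3, 4, 5, 6])
```

In this sketch the spread of the resampled mean shrinks roughly as one over the square root of the panel size, so the practical question is where the marginal gain in stability no longer justifies the extra reviewer burden.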
In recent years, different strategies have been developed to try to reduce the burden of grant applications. Table 7 provides a summary of approaches used by a range of international funders, both to reduce burden in their funding processes, and to measure the level of burden. These measures could allow researchers to focus on their research, save reviewers time, and potentially reduce the cost of grant review by reducing the labour required to review grant applications (Bolli, 2014).
Funding agency | Strategies to reduce burden | Approaches to evaluating burden |
---|---|---|
European Research Council | 1. Introduction of resubmission rules 2. Two-step evaluation: • Concrete proposal: two-page CV and five-page synopsis (30 per cent success rate) • Full proposal: 15 pages 3. Evaluators receive payment for their work 4. Panellists cannot serve more than four times and serve in alternate years | 1. Not officially measured, only anecdotal |
German Research Foundation | 1. Electronic Proposal Processing System for Applicants, Reviewers and Committee Members 2. Nine different individual grant schemes | 1. Periodic RCU and UK SBS satisfaction surveys to applicants, reviewers, panel members and research administration officers 2. Success rates |
Medical Research Council | 1. End of grant reporting typically through Research Fish. Exceptionally, the MRC may require a separate final report on the conduct and outcome of a project. | 1. Statistical reports about the burden of reviewers, looking at changes in the pool of reviewers and number of grants/reviews written annually per person |
Canadian Institute for Health Research | 1. Standard CV for all Canadian funding agencies, which links to PubMed | 1. Survey of applicants 2. Survey of reviewers 3. Data collected on administrative time |
National Institute for Health Research – Research for Patient Benefit | 1. Multiple calls per year (three) 2. Two-stage application process 3. Feedback to applicants within six weeks of application submission | 1. Survey of reviewers 2. Survey of applicants 3. Quality of applications in two-stage process 4. Success rate 5. Application numbers |
Australian Research Council | 1. Recipients can hold a limited number of grants 2. Moving to CV updates via ORCID or self-populated forms 3. Provide instructions to Universities on quality threshold to avoid below- standard application submission 4. Selection meetings supported by online meeting management 5. Recognition of the work of assessors through an annual report on assessor performance 6. Use of video-conferencing for some selection meetings (reduces travel load for meetings) 7. Auto-population of details from the ARC’s Research Management System to applications and final reports | 1. Conducted project to engage with applicants about use of ARC Research Management System (RMS) 2. Survey of reviewers about which sections of the applications are used for assessment |
AQuAS | 1. Reduce ex-ante evaluation of research and promote ex-post evaluation | No specific information was found for AQuAS on evaluating burden |
ZonMw | 1. Pre-screening of applications based on CV 2. No external reviewers | 1. Success rate 2. Post-funding evaluation: mid- and end-point review. Includes many questions on data management and knowledge utilisation |
Application limits. In 2009, the NIH incorporated a clause into their grant policy limiting applicants to two submissions of a research proposal (Rockey, 2012). Although this policy was not well received by the research community (Benezra, 2013), analysis by the NIH revealed a higher proportion of first-time applications were being awarded, along with a reduction in the average time from submission to award (Rockey, 2012). Restricting resubmission has also been adopted by the European Research Council (ERC) (European Research Council, 2017).
Two-stage application process. The ERC has combined the application limit with a two-stage application process in which project proposals are awarded a score (A, B or C) during the first stage. If the project is awarded an A, the proposal proceeds to the next assessment stage. If the proposal receives a B, the applicant must wait one year before reapplying, and if it is graded C, the applicant must wait two years before reapplying to any ERC-funded program. This approach has led to a decrease in the number of applications received for evaluation by the ERC (INT04). The NIHR has also adopted a two-stage application process, which has led to a decrease in the number of applications sent for peer review and a shorter time between application submission and outcome notification (INT01).
Multiple calls per year. The NIHR also issues multiple calls for proposals throughout the year, which reduces the burden not only on reviewers, by decreasing the number of applications to review per round (INT01), but also on applicants, who can submit on an ongoing basis rather than working to a single annual deadline (Herbert et al., 2014).
Grant application length. In 2012, the NIH reduced the length of most grant applications from 25 pages to 12 pages, with the aim of reducing the administrative burden on both applicants and reviewers, and of focusing on the concept and impact of the proposed research rather than the details of the methodological approach (Wadman, 2010). However, a study by Barnett et al. found that shortening the length of an application slightly increased the time spent by applicants preparing the proposal (Barnett et al., 2015).
Funding period. Extending the funding period to five years (Bolli, 2014) has also been suggested to reduce the burden of grant application.
What constitutes ‘innovative research’ is difficult to define. Approaches to identifying and defining innovation are varied and include defining innovation based on expert opinion (Marks, 2011), as research that does not require extensive preliminary results (Spier, 2002) (INT04), and (in the field of clinical research) as efforts that lead to improvements in patient care and progress (Benda & Engels, 2011).
Although there is a belief that quality research should be innovative and lead to new understanding of science (Benda & Engels, 2011), the current process of reviewing grant applications, mainly peer review, has been defined as ‘anti-innovation’, providing strong incentives for incremental research and discouraging research into new unexplored approaches (Azoulay et al., 2013; Fang & Casadevall, 2016; Guthrie et al., 2018).
Innovation and productivity are driven by diversity (Magua et al., 2017); therefore, advancing women or other underrepresented groups to institutional leadership (Magua et al., 2017) as well as promoting interdisciplinary research (Bromham et al., 2016) should have a positive impact on promoting innovative research.
Another feature of innovative research is its uncertain and potentially controversial nature. While many funding agencies aim to support innovative research, the body of work on peer review suggests it is an inherently conservative process (Langfeldt & Kyvik, 2010).
Since defining and identifying innovative, creative research is challenging, it can be difficult to measure the level of innovation and creativity within a research portfolio, or the extent to which a research funding program fosters innovation. However, there are examples in the literature of potential approaches to measuring innovation, as set out in Table 8.
Reference | Definition of innovation | Measurement of innovation |
---|---|---|
Kaplan, 2007 | NIH definition: research that challenges and seeks to shift current research or clinical practice paradigms through new theoretical concepts, approaches or methodologies. Defined as high-risk, high-reward. | Disagreement between reviewers using metrics such as variance or negative kurtosis |
Manso, 2011 | Discovery, through experimentation and learning, of actions that are superior to previously known actions | Bandit problem embedded into a principal-agent framework. Used to evaluate how to structure incentives in a program seeking to motivate innovation. |
Azoulay et al., 2011 | Comparison between the HHMI Investigator Grant scheme, which provides incentives that encourage innovation, and the NIH R01 scheme | Three ways: Changes in research agenda of principal investigators after the grant has been awarded; novelty of the keywords tagging their publications, relative to the overall published research and to the scientists themselves; broadening of the impact of research inferred by the variety of journals that cite the work |
Boudreau et al., 2016 | Novelty of the research | New Medical Subject Headings (MeSH) term combinations in relation to the existing literature – demonstrating new connections across fields |
Liaw et al., 2017 | American Heart Association (AHA) definition: research that may introduce a new paradigm, challenge current paradigms, add new perspectives to existing problems or exhibit uniquely creative qualities | • Some organisations include innovation as one of the assessment criteria, accounting for a percentage of the overall score of the proposal. In AHA, innovative research gets scored on the following questions: Does the project challenge existing paradigms and present an innovative hypothesis or address a critical barrier to progress in the field? • Does the project develop or employ novel concepts, approaches, methodologies, tools or technologies for this area? |
By definition, new ideas are unlikely to be met with consensus. It has therefore been suggested that innovation could be measured through lack of agreement between reviewers, using controversy as a surrogate for innovation, with metrics including the variance of reviewer scores or negative kurtosis, i.e. the degree to which scores fall in the tails of the grading distribution (Kaplan, 2007).
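A minimal sketch of the disagreement metrics Kaplan proposes is shown below. The grade lists are invented for illustration, and the exact kurtosis formula (excess kurtosis of the population) is an assumption; Kaplan does not prescribe an implementation.

```python
import statistics

def disagreement_metrics(grades):
    """Variance and excess kurtosis of a set of reviewer grades.
    High variance, or strongly negative excess kurtosis (a flat or
    bimodal spread with mass in the tails of the grading distribution),
    flags a controversial proposal - the proposed surrogate for
    innovation."""
    n = len(grades)
    mean = statistics.fmean(grades)
    var = statistics.pvariance(grades)
    # Excess kurtosis: fourth standardised moment minus 3
    kurt = sum((g - mean) ** 4 for g in grades) / (n * var ** 2) - 3
    return var, kurt

consensus = [3, 3, 3, 3, 4, 3, 3]  # reviewers broadly agree
split = [1, 1, 1, 5, 5, 5, 1]      # reviewers polarised
```

On these toy data the polarised panel shows both higher variance and negative excess kurtosis, while the consensus panel shows low variance and positive kurtosis, matching the intuition that controversy concentrates scores in the tails.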
Productivity is another potential approach to measuring innovation. One study assessed the careers of researchers funded by two distinct mechanisms, investigator-initiated R01 grants from the NIH and the investigator program from the Howard Hughes Medical Institute (HHMI), with the aim of determining whether HHMI-style incentives result in a higher rate of production of valuable ideas (Azoulay et al., 2011). The authors estimated the effect of the program by comparing the outputs of HHMI-funded scientists with those of NIH-funded scientists within the same area of research who had received prestigious early-career awards. Using a combination of propensity-score weighting and difference-in-differences estimation, the authors found that HHMI investigators produced more high-impact journal articles than the NIH-funded researchers, and that their research agendas changed more over time.
Another study examined how the novelty of a proposal and a reviewer’s expertise relate to the outcome of the review (Boudreau et al., 2016). The authors designed and executed a grant proposal process for research and randomised how proposals and reviewers were assigned, generating 2,130 evaluator-proposal pairs. They found that evaluators gave lower scores to research proposals that were highly novel, measured as new combinations of MeSH terms in a proposal relative to MeSH terms in the existing scientific literature, and to proposals in their own area of expertise (Boudreau et al., 2016). However, another study, focusing on public health proposals, found that reviewers favour their own fields (Gerhardus et al., 2016).
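The MeSH-based novelty measure can be sketched as the fraction of a proposal's pairwise term combinations that have not previously co-occurred in the literature. The terms and the toy corpus below are illustrative, not real MeSH data, and the scoring function is an assumed simplification of the measure Boudreau et al. describe.

```python
from itertools import combinations

def novelty_score(proposal_terms, literature_pairs):
    """Fraction of a proposal's pairwise MeSH-term combinations that do
    not yet appear in the existing literature. A score near 1 means the
    proposal connects terms that have rarely or never been combined."""
    pairs = {frozenset(p) for p in combinations(sorted(set(proposal_terms)), 2)}
    if not pairs:
        return 0.0
    new = pairs - literature_pairs
    return len(new) / len(pairs)

# Toy corpus of term pairs that already co-occur in published work
literature = {
    frozenset({"Neoplasms", "Immunotherapy"}),
    frozenset({"Neoplasms", "Machine Learning"}),
}
incremental = novelty_score(["Neoplasms", "Immunotherapy"], literature)   # 0.0
novel = novelty_score(["Immunotherapy", "Machine Learning"], literature)  # 1.0
```

A portfolio-level version of this score would let a funder track whether its awards skew towards recombinations already present in the literature, which is the conservatism pattern the study documents.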
A few organisations conduct peer review with distinctive practices that place a higher value on innovation (Liaw et al., 2017). The criteria for assessing innovation are determined by the individual organisations.
In recent years, different strategies have been developed to improve the assessment of innovation in grant review. Table 9 provides a summary of approaches used by a range of international funders, both to increase innovation and creativity and to evaluate the level of innovation and creativity across their funding streams.
Funding agency | Strategies to address innovation and creativity | Approaches to evaluating innovation and creativity |
---|---|---|
European Research Council | 1. Panellists encourage high-risk, high-gain projects 2. Synergy grants to encourage multidisciplinary research 3. Proof of Concept grants | 1. External program evaluation 2. Working Group on Innovation and Relations with Industry |
German Research Foundation | 1. There are six coordinated programs aimed at research groups and priority programs to promote cooperation through national and international collaboration 2. Award criteria on quality and added value of cooperation, as well as program-specific criteria for Coordinated Procedures 3. DFG supports the exchange between science and possible areas of application by promoting knowledge transfer | No specific information was found for DFG on evaluating innovation and creativity |
Medical Research Council | 1. Transformative translational research agenda aimed at accelerating innovation 2. MRC Industry Collaboration Agreement: supports collaborative research between academic and industry researchers 3. Intellectual property rights do not belong to MRC but rather the researchers and universities doing the work | 1. Databases recording research activity at an individual, national and global level that feed into evaluation reports |
Canadian Institute for Health Research | 1. Initiatives to encourage innovation, such as the pan-Canadian SPOR Network in Primary and Integrated Health Care Innovations, and the eHealth Innovations initiative 2. Over 50 per cent of funding aimed at investigator-driven research | 1. Performance measurement framework for the Health Research Roadmap strategy and larger performance measurement strategy for CIHR based on the Canadian Academy of Health Sciences research outcomes framework |
National Institute for Health Research – Research for Patient Benefit | 1. Specific funding for innovative research | 1. Do not follow applicant trajectory but rather trajectory of research ideas |
Australian Research Council | 1. Innovation supported through grant process, in selection criteria and instructions to assessors 2. Move to continuous application for one scheme to allow application submission at a time suitable for applicants 3. Multidisciplinary, interdisciplinary and cross-disciplinary proposals assessed outside discipline norms | 1. Addressed through overall approaches to evaluation such as seeking regular feedback from sector, survey of reviewers, targeted evaluations and international benchmarking |
AQuAS | 1. Indicators for success in specific contexts are developed in collaboration with the researchers 2. Observatory for Innovation in Healthcare Management in Catalonia: Centre developed with the aim of acting as a framework for promoting participation by healthcare professionals and health centres in identifying and evaluating innovative experiences in the field of healthcare management. | 1. Participatory sessions with grantees to determine which indicators to use ex-post to evaluate their innovation. The aim of this is to raise awareness, motivate and make grantees feel like this is an achievable goal. |
ZonMw | 1. Off-road program (€100,000 for 1.5 years): high-risk, high-reward. The application consists of a 300-word description of why the research they propose is novel or different. | No specific information was found for ZonMw on evaluating innovation and creativity |
To ensure innovative research is being funded, some agencies, including the NIH, adopt an ‘out of order funding’ approach (Lindner & Nakamura, 2015). In this approach, a number of applications for innovative research are chosen for funding despite receiving lower scores than other funded research based purely on the peer review process. At the NIH, this strategy has led to approximately 15 per cent of applications being selected ‘out of order’.
The NIH has also made additional changes to the peer review process in order to increase the emphasis on innovation and decrease the focus on methodological detail (Lindner et al., 2016). These changes included reducing the length of the methodological description (from 25 to 12 pages), with guidance to focus away from routine methodological details towards describing how their application is innovative. Including innovation as a criterion for grant assessment could incentivise researchers to include innovative ideas and new approaches into their proposals (Guthrie et al., 2018).
Many funding agencies have also adopted the strategy of having a separate scheme to fund innovative research, allocating smaller funds with a shorter time frame to these specific streams. The NIH has developed the New Innovator Award, committing $80 million to the award, and two others that specifically encourage innovation, the Pioneer and Transformative R01 Awards (Alberts, 2009). ZonMw has designed an ‘off-road’ program aimed at high-risk, high-reward projects, providing €100,000 for 1.5 years (INT02). NIHR has also designed different funding tiers to promote funding for innovative projects, providing £150,000 for 18 months (INT01). However, this strategy could be complemented by longer funding periods to encourage a culture of innovation among young researchers, who remain reluctant to risk pursuing ambitious ideas given the need for preliminary results to obtain funding for most research (Alberts, 2009).
Our review of international practice regarding the characterisation and measurement of bias, burden, and conservatism versus innovation and creativity in the grant funding process demonstrated that funders’ efforts to systematically measure these characteristics have so far been limited. However, in each area there were examples of existing practice to draw upon, as summarised in Table 10.
It is also worth noting the challenges in defining each of these elements, partly reflecting the diversity within each of these areas. In terms of bias, we note biases can emerge in terms of a range of areas, with five main areas highlighted in the literature: applicant characteristics (e.g. gender, ethnicity); career stage; research field; institution; and reviewer characteristics. Burden can be characterised in terms of where the burden is experienced: by applicants, reviewers, the funding agency and by institutions. Efforts to address burden and ways of measuring their effectiveness may differ across these groups. Finally, a key challenge in measuring innovation is providing a definition of innovative or creative research that can be operationalised. Often funders do this based on expert judgement, but this is challenging to use for portfolio assessment and analysis.
Finally, a key limitation of the work is that since this is a review of the existing literature and practice, we are constrained by what has so far been reported, which in some areas is fairly limited. In particular, the majority of the literature focuses on the application and peer review process, which only forms a part of the overall funding scheme that starts from the initial establishment of the structure of the funding scheme through to the monitoring and evaluation of ongoing and completed funding awards. We set out in Table 11 a wider conceptualisation of some of the ways in which challenges could theoretically emerge in relation to funding schemes at different stages throughout this process. This is intended to illustrate the potential breadth of scope for this work beyond the literature: as such it is neither exhaustive nor driven by existing evidence of those challenges or opportunities emerging in practice. Rather it acts as an aid to thinking through the full process of the development and implementation of funding schemes. We suggest that further research and evaluation efforts are needed to more fully conceptualise and measure effectively the concepts of bias, burden and innovation in research across the full scope of the research funding process.
Figshare: Articles on bias burden and conservatism in grant peer review. https://doi.org/10.6084/m9.figshare.8184113.v1 (Guthrie, 2019).
This project contains a list of all publications, with URLs, identified during the literature review.
Consent sought and confidentiality assured in the interview process means that the interview transcripts from this study cannot be made publicly available (see interview protocol Table 3). This is due to the high risk of identifying individuals from the small sample of interviews conducted.
This work was funded by the National Health and Medical Research Council (NHMRC) of Australia.
The funders had no role in the study design, data collection and analysis.