Inequalities in the distribution of National Institutes of Health research project grant funding

Previous reports have described worsening inequalities of National Institutes of Health (NIH) funding. We analyzed Research Project Grant data through the end of Fiscal Year 2020, confirming worsening inequalities beginning at the time of the NIH budget doubling (1998–2003), while finding that trends in recent years have reversed for both investigators and institutions, but only to a modest degree. We also find that career-stage trends have stabilized, with equivalent proportions of early-, mid-, and late-career investigators funded from 2017 to 2020. The fraction of women among funded PIs continues to increase, but they are still not at parity. Analyses of funding inequalities show that inequalities for investigators, and to a lesser degree for institutions, have consistently been greater within groups (i.e. within groups by career stage, gender, race, and degree) than between groups.


Introduction
Over the past few years, there has been increasing interest (Peifer, 2017) in how National Institutes of Health (NIH) funding support is distributed, with some voicing concern that support may be excessively concentrated among men and the most well-funded late-career investigators. In a report (National Institutes of Health, 2019) issued by an NIH Working Group to the Advisory Committee to the Director (ACD), it was noted that "In biomedical science, power stems from who has access to awards. The Working Group heard repeatedly that the concentration of funding in a relatively small number of investigators (who are overwhelmingly white, cisgender, straight men) incentivizes universities to protect researchers bringing in high levels of grant funding".
Recently published literature has raised concerns regarding how NIH distributes funding support. One report (Katz and Matter, 2020), which focused on all 'R' grants, found increasing inequality of funding support over 30 years. A research letter (Oliveira et al., 2019) found lower levels of support for grants in which women were identified as Principal Investigators. Other reports have documented disproportionate aging of the research workforce (Blau and Weinberg, 2017) and stresses particular to mid-career investigators (Charette et al., 2016); these reports are concerning given evidence that there is no correlation between research stage and scientific impact (Sinatra et al., 2016).
In 2017, the NIH considered imposing a cap (Lauer, 2017a) on individual-investigator research support through use of a 'Grant Support Index or GSI' (Lauer, 2017b) which classified grants according to mechanism (e.g. R01, P01, U54) rather than according to dollars. The GSI set a value of 7 for R01 grants, with lower values for 'smaller' mechanisms like R03 or R21 and greater values for mechanisms like P01 or U54. The proposed cap was set at 21, meaning that on average no investigator could be designated as PI on more than the equivalent of three R01 grants. The proposed cap was highly controversial (Kaiser, 2017) and was dropped in favor of a different approach (Lauer et al., 2017) that targeted funds directly toward early career investigators.
Here, we present updated data on distribution of NIH support for principal investigators ('PIs', keeping in mind that NIH issues awards to institutions [Lauer, 2018], not to individual scientists) with particular attention to career stage, gender, race, and degree. We focus on research project grants ('RPGs') as these comprise close to 80% of all NIH extramural research funding; we can also assess patterns that are independent of already well-known disparities for small business and non-RPG research grants.

Results
Distribution of funding to RPG PIs over time
Figure 1 shows different measures of funding distribution to RPG PIs between fiscal years 1985 and 2020. These measures reflect different approaches that economists use to assess income inequality; here we use RPG funding as the analogue of income. We use three different measures:
• Proportion of funds going to the top 1%, or centile, as well as to the top 10%, or decile (Saez and Zucman, 2020), in contrast to the proportion to the bottom 50% (Panel A) or considered alone (Panel C).
• Standard deviation of the log of funding (Hoffmann et al., 2020), a measure that accounts for the well-documented skewness in funding and that is particularly sensitive to low and intermediate levels of funding (Panel B).
• The Theil T index (Conceição and Ferreira, 2000), a measure that is more sensitive to higher levels of funding (Panel D). Unlike other measures of inequality, the Theil Index is not intuitive. However, it can be applied to grouped data, allowing us to parse inequality into within-group and between-group components; for example, we can see whether there is a greater degree of inequality between men and women as opposed to within cohorts of men and women.
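The three measures listed above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analyses: the function names and the `example` values are hypothetical, and the choice of the population (rather than sample) standard deviation is an assumption.

```python
import math

def top_share(funding, fraction):
    """Share of total funding held by the top `fraction` of recipients."""
    ordered = sorted(funding, reverse=True)
    k = max(1, round(len(ordered) * fraction))  # number of recipients in the top group
    return sum(ordered[:k]) / sum(ordered)

def sd_log(funding):
    """Standard deviation of the natural log of funding (population form, an assumption)."""
    logs = [math.log(y) for y in funding]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))

def theil_t(funding):
    """Theil T index: average of (y / mean) * ln(y / mean) over all recipients."""
    mean = sum(funding) / len(funding)
    return sum((y / mean) * math.log(y / mean) for y in funding) / len(funding)

# Hypothetical funding amounts in arbitrary units; these are not NIH data.
example = [1, 1, 1, 1, 2, 2, 2, 4, 6, 20]

print("Top decile share:", top_share(example, 0.10))
print("Bottom half share:", 1 - top_share(example, 0.50))
print("SD of log funding:", sd_log(example))
print("Theil T index:", theil_t(example))
```

All three functions return 0 (or, for the top share, the group's population fraction) when every recipient receives the same amount, which is a convenient sanity check.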
All three measures indicate greater inequalities in funding from the early 1990s through 2006, corresponding to the NIH doubling and its aftermath; a plateau from 2006 to 2013; a rapid rise after 2013 (the year of sequestration) to 2017; and a decline approaching 2013 levels from 2018 to 2020. The inequalities are more striking among the most highly funded investigators (Panels C and D), where increases are noted with the NIH doubling (1998 to 2003) and in the first few years after the 2013 budget sequestration. The top 1% of investigators received 8% of RPG funds in 1998; in recent years, they received close to 10% of funds. While this may not seem like much, a difference of 2 percentage points of RPG funds means that a small group of ~300 investigators received in 2020 approximately $420 million (inflation-adjusted) more than they would have received by 1998 standards. Given that the average RPG costs about $500,000, this difference is the equivalent of roughly 800 grants. Inequalities among investigators receiving low to intermediate levels of funding followed a somewhat different trajectory, decreasing during the NIH doubling while increasing after 2013.
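The arithmetic behind the "equivalent of 800 grants" statement can be checked with a short calculation. The sketch below uses only the rounded figures quoted in the text; the implied total RPG budget is an inference from those figures, not a reported value, and the variable names are illustrative.

```python
# Figures quoted in the text (inflation-adjusted dollars, rounded).
extra_to_top_centile = 420e6   # approximate additional funding to the top 1%
share_difference = 0.02        # roughly 10% of RPG funds now versus 8% in 1998
average_rpg_cost = 500_000     # approximate average cost of one RPG

# Total RPG funding implied by the two figures above (an inference, not a reported value).
implied_total_rpg = extra_to_top_centile / share_difference

# Number of average-sized grants that the difference could support.
equivalent_grants = extra_to_top_centile / average_rpg_cost

print(f"Implied total RPG funding: ${implied_total_rpg / 1e9:.0f} billion")
print(f"Equivalent number of average grants: {equivalent_grants:.0f}")
```

With these inputs the grant equivalent comes to 840, which the text rounds to 800.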
Characteristics of the most highly funded RPG principal investigators
Table 1 shows characteristics of 34,936 principal investigators funded in fiscal year 2020 according to whether or not they were among the top funded centile. We defined proxies for career stage according to age, with values of 'early' (age < 46), 'middle' (age 46 to 58), and 'late' (age > 58). Compared to the bottom 99%, the top 1% of investigators were in later career stages and more likely to be white, non-Hispanic, and to hold an MD degree (either alone or with a PhD). The difference in funding levels is striking, with top 1% investigators receiving a median of $4.8 million compared to $0.4 million for all others; they were also much more likely to be supported on multiple RPG grants. Table 2 shows corresponding characteristics of 19,221 principal investigators funded in fiscal year 1995, before the beginning of the NIH doubling. In contrast to 2020, career stage and race differences were less marked, but gender differences were more so. During both eras (before the doubling and in most recent times), top centile investigators were much more likely to hold an MD degree. Consistent with prior literature (Blau and Weinberg, 2017), the age range of all NIH-funded investigators is skewing older over time. Another noteworthy difference between FY2020 and FY1995 is that much greater proportions of investigators were supported on multiple (3, 4, or 5 or more) grants in FY2020 than in FY1995.
Middle-career investigators have comprised a lower proportion of the workforce since the mid-2000s. Over the past 4-5 years, the proportions of PIs at different career stages have stabilized. The proportion of late-career investigators is no longer rising while that of mid-career investigators is no longer falling. This stabilization has occurred at the same time as NIH implementation of its Next Generation Researchers Initiative (Lauer et al., 2017).
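The age-based career-stage proxies defined above ('early' for age < 46, 'middle' for age 46 to 58, 'late' for age > 58) amount to a simple classification rule. A minimal sketch follows; the function name `career_stage` is hypothetical, and the rule assumes age is given in whole years.

```python
def career_stage(age):
    """Map a PI's age in years to the career-stage proxy described in the text."""
    if age < 46:
        return "early"
    if age <= 58:
        return "middle"
    return "late"

# Hypothetical ages chosen to show the boundaries between categories.
for age in (45, 46, 58, 59):
    print(age, career_stage(age))
```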
The fraction of women among funded PIs continues to increase, but they are still not at parity. The proportion of MD-only degree holders has fallen, while the proportion of MD-PhD degree holders has increased. Figure 3 uses box plots to show the FY2020 distribution of funding to RPG PIs according to career stage, gender, race, and degree. Late-career investigators, men, whites, and those holding MD degrees are better funded. Nonetheless, there appears to be greater variability within groups than between groups. Table 3 shows FY2020 characteristics according to career stage. Late-career investigators were more likely to be white males, to hold MD degrees, and to be designated as PI on a larger number of grants. Table 4 shows FY2020 investigator characteristics according to gender. Women were younger, more likely to hold a PhD degree, and less likely to be principal investigators of 2 or more RPG grants. Table 5 shows corresponding race data. Black or African-American investigators were younger, more likely to be women, and more likely to hold MD degrees. They were also much more likely to serve as PI on only one RPG grant.

Inequalities between and within groups
The Theil T index enables us to formally assess between-group and within-group contributions to inequality. Figure 4 shows that for all groupings, within-group differences contribute more to inequality than between-group differences. The small between-group differences are shown in Figure 5. Better-funded groups have positive 'Theil elements' because they on average receive higher levels of funding. Nonetheless, the absolute values of these elements, as compared to the total Theil index, are small.
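The between-group and within-group split described above can be illustrated with a short Python sketch. It is not the code used for the analyses. The between-group 'Theil elements' follow the definition given in Materials and methods; the within-group term uses the standard decomposition (each group's own Theil T weighted by that group's share of total funding), which is an assumption because the text does not spell it out. Function names and the example data are hypothetical.

```python
import math

def theil_t(values):
    """Theil T index for a list of positive funding amounts."""
    mean = sum(values) / len(values)
    return sum((y / mean) * math.log(y / mean) for y in values) / len(values)

def theil_decomposition(groups):
    """Split the Theil T index into between-group elements and a within-group part.

    `groups` maps a group label to a list of positive funding amounts.
    """
    all_values = [y for values in groups.values() for y in values]
    total_n = len(all_values)
    overall_mean = sum(all_values) / total_n

    elements = {}  # between-group Theil element for each group
    within = 0.0   # funding-share-weighted sum of each group's own Theil T
    for label, values in groups.items():
        pop_share = len(values) / total_n
        relative_mean = (sum(values) / len(values)) / overall_mean
        elements[label] = pop_share * relative_mean * math.log(relative_mean)
        within += pop_share * relative_mean * theil_t(values)

    between = sum(elements.values())
    return elements, between, within

# Hypothetical funding amounts for two groups; these are not NIH data.
groups = {"group_a": [1, 3], "group_b": [2, 10]}
elements, between, within = theil_decomposition(groups)
print("Theil elements:", elements)
print("Between-group component:", between)
print("Within-group component:", within)
```

In this example the group with the lower average has a negative element and the group with the higher average has a positive one, and the between-group and within-group components add up to the overall Theil T index of the pooled data.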

Organizational inequalities
In additional analyses, we look at RPG funding inequalities among organizations. Figure 6 shows data analogous to those in Figure 1. Because the absolute number of organizations is much smaller than the number of PIs (e.g. in 2020 there were 1097 unique organizations receiving RPG funding), we focus on the top decile (10%) rather than the top centile. The top 10% of organizations have been receiving approximately 70% of RPG funding, while the bottom half have received well under 5%. As with PIs, inequalities increased after the doubling, but patterns in more recent years have differed. Inequalities decreased in the late 2000s (perhaps coincident with the 2008 financial crash), but have increased slightly in more recent years. Figure 7 shows the distribution of RPG funding in Fiscal Year 2020 according to organization type. Because the distributions are highly skewed (even more so than with PIs), we show log-transformed values (Panel A). There are marked differences between groups: medical schools are receiving higher levels of funding than other institutions. We confirm this by calculating Theil indices, which show that organizational inequalities stem from both between-group and within-group variability (Panel B). The Theil elements plot (Panel C), consistent with Panel A, shows that medical schools, and to a lesser extent hospitals, are groups that receive higher levels of funding. Figure 8 shows corresponding data according to organization region. Funding inequalities were greater within regions than between regions. Figure 9 shows similarly that for domestic institutions within-state inequalities contribute more to overall inequality than between-state inequalities.

Perspective: Income inequality in the United States and Europe - Population Data
To put these NIH-specific data into perspective, we present high-level income inequality data for general populations of the United States and the European Union. We show data from the World Inequality Database (Saez, 2021), which was developed by Emmanuel Saez and colleagues. Figure 10 shows the percent of annual income going to the top centile (Panel A) and the bottom half (Panel B) of the populations of the United States and Europe from 1980 to 2020. We focus on income, instead of wealth, since income for most people comes from remuneration for work and therefore would be analogous to RPG funding awarded in anticipation of scientific work. At all times, income inequality has been greater in the United States. Changes in inequality have also been greater in the United States. From 1995 to 2019, the proportion of income going to the top centile of the United States population has increased from 14.3% to 18.7%, a relative increase of 31%. During the same time, the proportion of RPG funding going to the top centile of RPG PIs has increased from 8.3% to 10.8% (Figure 1, Panel C), a relative increase of 30%. Although the US population and NIH-funded PIs have experienced different events (e.g., the 2000 recession and the 2008 financial crash for the US population; the NIH doubling, the 2006 payline crash, the 2013 sequestration, and the recent string of budget increases for NIH-funded PIs), the overall relative changes in inequality at the top are remarkably similar.
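The relative increases quoted above follow directly from the start and end shares. A brief sketch of the calculation, using only the percentages given in the text (the helper name `relative_increase` is hypothetical):

```python
def relative_increase(start, end):
    """Relative change from `start` to `end`, as a fraction of `start`."""
    return (end - start) / start

# Top-centile shares quoted in the text, in percent.
us_income = relative_increase(14.3, 18.7)  # US income share, 1995 to 2019
rpg_funding = relative_increase(8.3, 10.8)  # RPG funding share over the same period

print(f"US income, top centile: {us_income:.1%}")
print(f"RPG funding, top centile: {rpg_funding:.1%}")
```

Rounded to whole percentages, these give the 31% and 30% figures reported in the text.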

Discussion
Inequalities in funding of RPG PIs have increased since the NIH doubling, with further increases since sequestration in 2013 (Figure 1). Over the past few years, a time of substantial and sustained budget increases for NIH and of focus on early career investigators, there has been a decrease in the degree of inequality, but not quite back to the level of 2013. The RPG funding inequalities primarily reflect changes 'at the top,' meaning among the most highly funded investigators (Figure 1, Panels C and D). The top 1%'s share of RPG funding has increased from 8% before the doubling to nearly 10% now (Figure 1, Panel C); this difference translates into ~$400 million, or the equivalent of 800 RPG awards. Since sequestration, the top 1% has received an increased share of funding, while the bottom 50% has received less. During the NIH doubling, both the top centile and the lower half saw increases in the proportion of funding they received (Figure 1, Panels A and B). The composition of the RPG PI workforce has evolved over time, with greater proportions of investigators who are late career, women, and Asian, and lesser proportions of MD-only degree holders (Figure 2). Despite steady increases in the proportion of women investigators, they are still well below parity (Figure 2, Panel B). Among the groups studied, more funding goes to late-career investigators, as well as to men, whites, and holders of MD degrees. Nonetheless, there is greater inequality within groups than between groups (Figures 3-5). One might argue that it may be reasonable for researchers to receive more funding at later career stages, as they may have larger networks and are more experienced at posing research questions. Thus, some inequality may be considered 'acceptable.' However, there is no funding parity by gender or race among researchers in the workforce; these are unacceptable inequalities. Over the past few years NIH has launched high-
For all groups, within-group differences contribute more to inequality than between-group differences.

Materials and methods
From the NIH IMPAC II database, we obtained PI-specific data on inflation-adjusted total-cost funding of Research Project Grants (RPGs), defined as those grants with activity codes of DP1, DP2, DP3, DP4, DP5, P01, PN1, PM1, R00, R01, R03, R15, R21, R22, R23, R29, R33, R34, R35, R36, R37, R61,
We measured inequality by three approaches: proportion of funds going to the top 1%, or centile (Saez and Zucman, 2020); standard deviation of the log of funding (Hoffmann et al., 2020), a measure that accounts for the well-documented skewness in funding and that is particularly sensitive to low and intermediate levels of funding; and the Theil T index (Conceição and Ferreira, 2000), a measure that is more sensitive to higher levels of funding and that can be exploited to explore contributions of different groups to overall inequality.
For individual-level data (say, of individual PIs), the Theil Index (T) of funding inequality is mathematically represented as:

$$T = \frac{1}{n} \sum_{p=1}^{n} \frac{y_p}{\bar{y}} \ln\left(\frac{y_p}{\bar{y}}\right) \qquad (1)$$

where $n$ is the number of individual PIs, $y_p$ is the funding of PI $p$, and $\bar{y}$ is the population mean funding. The final logarithmic fraction takes on a value greater than 0 if the individual investigator $p$'s funding is greater than the population mean $\bar{y}$ and less than 0 if the individual investigator's funding is less than the population mean. We can think of the three terms as: $\frac{1}{n}$ as the investigator's proportion of the population; $\frac{y_p}{\bar{y}}$ as the magnitude of deviance compared to the overall population; and $\ln\frac{y_p}{\bar{y}}$ as the direction of deviance.
For grouped data (e.g. data grouped by career stage or gender or other characteristics), we can present the Theil Index T as a weighted average of inequality within each group plus inequality between those groups. That is:

$$T = T'_g + T^w_g \qquad (2)$$

where $T'_g$ is the between-group component and $T^w_g$ is the within-group component. The between-group component of the Theil Index ($T'_g$) is mathematically represented in a form similar to the overall Theil Index (Equation 1), namely:

$$T'_g = \sum_{i=1}^{m} \left( \frac{p_i}{P} \, \frac{\bar{y}_i}{\mu} \ln\frac{\bar{y}_i}{\mu} \right) \qquad (3)$$

where $i$ indexes the $m$ groups (e.g. early, middle, and late career investigators), $p_i$ is the population of group $i$, $P$ is the total population, $\bar{y}_i$ is the average funding of group $i$, and $\mu$ is the average funding across the entire population (the same quantity as $\bar{y}$ in Equation 1). The expression within the parentheses is called the 'Theil element,' which is positive (or negative) if the group's average funding is above (or below) the population average and zero if the averages are equal. The Theil elements represent the contribution of each group to total inequality between the groups. Unlike other measures of inequality (e.g. proportion of funding going to the top centile or standard deviation of log funding), the Theil Index is not intuitive. However, it can be applied to grouped data, allowing us to parse inequality into within-group and between-group components.
Figure 9. RPG funding distribution and inequalities according to organization state within the United States. The panel shows a Theil index components plot, showing that within-state inequalities contribute more to overall inequality than between-state inequality.
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.