Influence of the first-mover advantage on the gender disparities in physics citations

Kong, Hyunsik; Martin-Gutierrez, Samuel; Karimi, Fariba

doi:10.1038/s42005-022-00997-x


Article
Open access
Published: 13 October 2022


Communications Physics volume 5, Article number: 243 (2022)


Abstract

Mounting evidence suggests that science and engineering fields suffer from gender biases. In this paper, we study the physics community, a discipline where women are still under-represented and gender disparities persist. To reveal such inequalities, we perform a paper matching analysis using a robust statistical similarity metric. Our analyses indicate that women’s papers tend to have lower visibility in the global citation network, a phenomenon significantly influenced by the temporal aspects of scientific production. Within pairs of similar papers, the authors that publish first tend to obtain more citations. From the group perspective, men have cumulative historical advantages due to women joining the field later and at a slower rate. Altogether, these results indicate that the first-mover advantage plays a crucial role in the emergence of gender disparities in citations of women-authored papers in the physics community.


Introduction

Mounting evidence suggests gender bias in publications and citations of scholars in STEM^1,2. Such biases can result in situations where women (or other under-represented minorities) may feel invisible and ignored in men-dominated environments. The feeling of not being part of the community can result in a higher dropout rate among women, a phenomenon known as leaky pipeline³. Leakage in the academic pipeline consequently affects the academic community for generations to come due to a lack of diversity, inclusion, innovation, and role models. Thus, it is of utmost societal importance to accurately identify those biases and devise bottom-up approaches to tackle them.

Gender inequality in academia manifests itself in the production of science and performance outcomes. Unequal division of childcare, parental leave policies, career breaks, limited access to role models and resources can create situations in which women and other minorities show less productivity and performance compared with their men peers. Frequently, these inequalities are exacerbated through formal and informal social relationships, which in turn affect the citation network structure and reinforce existing inequalities.

Academic productivity is often associated with number of publications throughout a researcher’s career. Previous studies have found that women publish fewer peer-reviewed articles than men^4,5, while a more recent study found that the disparity in the productivity of men and women disappears if we compare the productivity with regard to the scholar’s career length^6,7. Women display higher publication rates later in their academic careers, but take up fewer leadership roles^5,8. Mueller et al.⁹ suggest that publication productivity may be a factor that hinders women from advancing within surgery, while Reed and colleagues point out that mid-career assessment of productivity may not be an appropriate measure of leadership skills⁵.

Beyond disparities in publication and productivity, analyzing citation patterns can help to identify whether gender differences exist in the way scholars award and recognize each others’ works. In other words, while productivity is associated with individual or collaborative efforts, citation is an indication of how these efforts are perceived by the community¹⁰. In this sense, one can argue that while the former operates among a small number of collaborators, the latter is related to the social processes that govern the community of scholars at large.

Previous studies have shown that patterns of citation can be different for men and women. This could be explained by intentional decision, quality difference, or paradigmatic research topics^11,12,13. It has been argued that in the most productive countries, articles with women in key author positions receive fewer citations than those with men in the same positions^14,15. Moreover, some research concluded that the differences in citation rates between men and women increase with the number of authors per article¹⁶. This indicates that women are not only relatively less represented as high-impact key authors, but also that they attract significantly fewer citations for those key positions compared to men. One plausible assumption is that the lack of women in leadership positions causes this accentuated women under-representation (structural reasons) since the distribution of key authorships follows, by convention, a hierarchical order. In a recent paper, Dworkin and colleagues² present a case study of citation patterns in top neuroscience journals, finding that papers for which first and last authors are men are over-represented in reference lists, and that the discrepancy is most prominent in the citation behaviors of men and is getting worse over time.

A major methodological obstacle is that simply comparing number of publications and citations of men and women is misleading. Men and women have different rates of entry in the scientific community for historical reasons, and when combined with other non-academic responsibilities, they may not show a similar behavior at the aggregate level. Indeed, recent findings show that when differences in career length are controlled for, men and women scientists have similar rates of publication and citation on average⁷. However, beyond these insights on a population level, do men and women really receive different recognition for a similar work published around the same time? To truly examine the gender differences in citation, one should compare pairs of papers that cover the same topics in a comparable way. Relying on analyzing only the average performance may hide variations that exist in data, and drive the community to inaccurate conclusions or inappropriate policies.

In this paper, we focus on analyzing publication and citation patterns in the physics community as one of the core STEM areas where women are exceedingly under-represented, often facing belittling remarks and harassment^17,18. Our analyses reveal significant gender disparities in productivity, dropout rate, self-citation, and overall visibility of women in the citation network. More importantly, we examine gender differences not only at the population level, but also at the microscale by comparing pairs of statistically validated similar papers. We find that temporal biases play a central role in gender inequalities, as men benefit from an asymmetrical first-mover advantage and, due to historical biases, there is a disproportionate number of men senior researchers.

Results

We start by describing the dataset we have analyzed and briefly explaining the methodology we have used to build the citation network and the pairs of similar papers. Then, we proceed to study gender disparities, first at the aggregate level and then by comparing pairs of similar papers.

Data description

We study an American Physical Society (APS) dataset from 1893 to 2009, which contains articles’ metadata, the authors’ basic information, and the citations within the papers. The metadata consists of authors’ full names and a unique digital object identifier (DOI) of the publication in a string format. For those names that are repeated in the dataset, we used name disambiguation methods proposed by Sinatra et al.¹⁹ to detect unique authors and correctly match authors to publications (see Supplementary Fig. 1). To infer gender from names, we implemented a gender-detection procedure that combines author names with an image-based gender inference technique applied to search results from Google Images²⁰. This combined method results in high accuracy in the gender identification of scholars from different nationalities (see Supplementary Methods). The final dataset consists of 541,448 scholarly articles published over the course of 116 years, categorized into 11 journals. For 375,736 of those papers, we were able to identify the gender of at least one participating author. We have identified 120,776 gendered names: 17,763 women and 103,013 men. The evolution of the number of authors per year is shown in Fig. 1a.

**Fig. 1: Rate of growth of women participation, average publications by career age, dropout rate and annual ratio of men/women self-citations.**

Here, the notion of “gender” refers neither to the sex of the authors nor to the gender that the author self-identifies as. By the word “woman”, we mean an author whose name has a high probability of being given to a person assigned female at birth, or who is identified as a woman on the basis of facial characteristics. Given this limitation, we can safely argue that these methodologies are in accordance with social constructs and what people perceive as gender in society.

Constructing citation networks and assessing similar pairs

We build the citation networks by considering each paper as a node and making a link from paper i to paper j if i includes a citation to j. We measure the similarity between two papers using the bibliographic coupling strength^21,22; that is, the number of publications that both papers cite. Two papers that cover similar topics in a comparable way are assumed to include a similar set of outgoing citations. However, within subfields there is usually a handful of classic publications that are cited in most works, so their inclusion in two different papers may not indicate actual similarity, but a citation convention. To avoid such shortcomings of naive bibliographic coupling, and guarantee the significance of the overlapping set of citations, we apply a statistical test based on the hypergeometric distribution. This test controls for the incoming citations of the commonly cited papers and checks whether the size of the common set of citations is so large that it cannot be explained by randomness. The problem of identifying similar papers to assess gender disparities has also been approached recently using machine learning techniques²³.
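The coupling measure and the significance check described above can be sketched as follows. This is a minimal illustration, not the procedure from Methods: the function names are hypothetical, and the test here is a plain hypergeometric tail that treats every one of `n_total` candidate papers as equally likely to be cited, so it does not reproduce the control for the incoming citations of the commonly cited papers.

```python
from math import comb

def coupling_strength(refs_a, refs_b):
    """Bibliographic coupling strength: number of references shared by two papers."""
    return len(set(refs_a) & set(refs_b))

def hypergeom_pvalue(overlap, n_a, n_b, n_total):
    """P(X >= overlap) when paper B draws n_b references uniformly at random,
    without replacement, from n_total candidates of which n_a are cited by A.
    Assumption: uniform sampling, with no correction for in-degree."""
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(overlap, upper + 1))
    return tail / comb(n_total, n_b)

# Toy example: two papers with four references each, three of them shared.
refs_a = {"p1", "p2", "p3", "p4"}
refs_b = {"p2", "p3", "p4", "p9"}
k = coupling_strength(refs_a, refs_b)
p = hypergeom_pvalue(k, len(refs_a), len(refs_b), n_total=1000)
```

A small `p` indicates that the shared references are unlikely to arise by chance under this simplified null model; a pair would be kept as a statistical twin only if `p` falls below a chosen threshold.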

To explore gender disparities, we select pairs of similar papers respectively written by men and women primary authors. Then, we compare the future incoming citations to each of the pair. This comparison allows us to detect potential inequalities in the citation patterns. We have summarized this methodology in the diagram of Fig. 2 and provided all the technical details in Methods.

Aggregate gender disparity trends

To characterize the gender disparities at the aggregate level, we first analyze the aspects of scientific production that depend primarily on individual choices and ability: in particular, productivity, dropout rate, and self-citations. Then, we discuss authorship order, which depends on the internal organization of research groups. Finally, we study the behavior of the scientific community as a whole by comparing the citations received by men and women.

Productivity

We define productivity as the number of publications that scholars produce during their career. In physics, we observe that women have a lower average number of publications compared to men across all their career ages (Fig. 1b). While in the first two years of author’s career the publication gap is closing, we observe a sudden increase in the gap from the second to the eleventh year. After this point, the publication gap starts decreasing again. These fluctuations in publication productivity can be associated with, among other things, the disproportionate family responsibilities that women have to take on compared with men²⁴. For the aggregate results, see the productivity distributions by gender in Supplementary Fig. 4.

Although a researcher’s productivity can be considered to be determined mainly by individual skills, the collaborative nature of scientific work makes it dependent on external factors such as other team members or departmental organization. Likewise, these factors, together with other aspects like social perception or family responsibilities, affect women’s motivation to keep working in academia, potentially leading to the leaky pipeline phenomenon. To quantify this phenomenon, in the next section we explore the differences in dropout rates between men and women.

Dropout rate

We compute dropout as a lack of publication activity for at least five years to distinguish the authors who are active in publishing from those who have dropped out. We investigate the ratio of dropout scholars at each career age compared to the number of active scholars by gender. Figure 1c shows that women authors have a higher dropout ratio throughout their whole career. The largest gaps appear in the early career years, with a 2.28% difference between men and women in the first year and a 2.26% difference in the sixth year. The dropout rates of authors who leave academia after their first year (career age 0) are not shown in Fig. 1c. This career age presents the highest dropout rates, with 39.94% for men authors and 47.55% for women authors.
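One possible reading of this dropout definition is sketched below. It assumes that only a final silence of at least five years before the end of the dataset counts as dropout and that career age is measured from the first publication year; the function name and both assumptions are illustrative and are not taken from the paper.

```python
def dropout_career_age(pub_years, dataset_end=2009, gap=5):
    """Return the career age at which an author drops out, or None if active.

    Assumption: an author has dropped out when the last publication is at
    least `gap` years before `dataset_end`; mid-career gaps are ignored.
    """
    first, last = min(pub_years), max(pub_years)
    if dataset_end - last >= gap:
        return last - first  # career age of the final active year
    return None
```

Under these assumptions, an author whose only paper appeared in 2000 drops out at career age 0, while an author who published in 2008 is still counted as active.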

Self-citation

Self-citation refers to cases where authors cite their own previous works. Self-citations increase the total citation count and the visibility of scholars^25,26,27, potentially enhancing academic promotion and attention. We have measured the relative number of self-citations by all men and women authors with the following metric (r) to study the difference in self-citation ratios between the two genders over time²⁵:

$$r=\frac{\dfrac{\%\ \text{men's self-citations}}{\%\ \text{men's citations}}}{\dfrac{\%\ \text{women's self-citations}}{\%\ \text{women's citations}}} \tag{1}$$

Figure 1d shows the temporal evolution of the ratio r. This result shows that women tend to cite themselves less than men and that this trend is consistent over the years (See Supplementary Table 2 for more details). Consequently, women’s visibility in the citation network is partly penalized by the higher ratio of men citing their own previous works.
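As a small worked illustration of the ratio r, the sketch below computes it from raw yearly counts. It assumes that each percentage in Eq. (1) is a share of the yearly total, in which case the totals cancel and r reduces to the ratio of the two self-citation rates; the function name and the example counts are hypothetical.

```python
def self_citation_ratio(men_self, men_total, women_self, women_total):
    """Ratio r of men's to women's self-citation rates for one year.

    Assumption: the percentages in Eq. (1) are shares of yearly totals,
    so r = (men_self / men_total) / (women_self / women_total).
    """
    return (men_self / men_total) / (women_self / women_total)

# Invented counts: men self-cite in 30 of 100 citations, women in 10 of 50.
r = self_citation_ratio(30, 100, 10, 50)
```

Values of r above 1 mean that men cite themselves proportionally more often than women in that year.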

Another fundamental factor that affects an author’s visibility is the position in which her name appears in the list of authors. This position depends on how the whole research group is organized and, crucially, in most cases it depends on the perceived level of contribution of each collaborator.

Authorship order analysis

In the majority of the scientific fields, including physics, the authorship order indicates relative contribution and seniority by putting emphasis on the first, the last, and the second positions^28,29. In order to compare the positions of authors, we first discard those papers for which authorship order is alphabetical. For this purpose, we perform a string comparison of the last names of the contributing authors and consider them to be in alphabetical order if the paper has at least four authors and all of them follow this order. Around 3.54% of the papers can be considered as alphabetically ordered; in Supplementary Table 3 we detail their fraction by PACS subfield (Physics and Astronomy Classification Scheme). After discarding those papers from the analysis, we study the authorship order in each publication and compare the proportion of women and men in each position of the author list (first, second, middle and last). We perform this comparison using a two-proportion z-test (see Methods). If there is only one author in a paper, we consider her the first author. Middle authors are those between second and last in papers with more than three authors.
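The alphabetical-order filter and the position labels can be sketched as below. The comparison is a plain case-insensitive string comparison, and the treatment of two-author papers (first and last) is an assumption, since the text only specifies the single-author case and the definition of middle authors; both function names are hypothetical.

```python
def is_alphabetical(last_names, min_authors=4):
    """True if the paper has at least `min_authors` authors and their last
    names appear in non-decreasing alphabetical order."""
    if len(last_names) < min_authors:
        return False
    keys = [name.casefold() for name in last_names]
    return all(a <= b for a, b in zip(keys, keys[1:]))

def position_labels(n_authors):
    """Label each author slot as first, second, middle, or last.
    Assumption: in a two-author paper the second author is labeled 'last'."""
    if n_authors == 1:
        return ["first"]
    if n_authors == 2:
        return ["first", "last"]
    middle = ["middle"] * (n_authors - 3)
    return ["first", "second"] + middle + ["last"]
```

Papers flagged by `is_alphabetical` would be discarded before counting how many women and men occupy each position.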

The results show that there are more women than expected by chance in the first, second and middle author positions, and they are heavily under-represented as last authors (see Supplementary Table 4). The last author in physics papers is usually the most senior member of the team, so this trend can be explained by the later and slower rate of arrival of women, combined with their higher dropout rate throughout their career. This is in line with previous findings that women feature only rarely as the last authors in leading journals³⁰.

While the authorship order reflects how a researcher’s coworkers perceive her contribution, the collective perception of the scientific community regarding the importance of a paper is manifested in the citations of papers. In the following sections we will thoroughly compare the relative popularity of publications led by women and men.

Citation centrality analysis

The flow of citations determines the visibility and recognition of papers both locally and globally. To measure the local influence of papers we use the in-degree metric, and to measure the global influence, we use the PageRank centrality. Our aim is to verify if the visibility of papers written by women is proportionate to what we expect from their overall population size. To do that, we focus on the ranking of the nodes according to their respective centrality.

Understanding ranking centrality is important for three reasons. First, the authors of papers in top ranks gain more visibility for themselves and those central papers influence future citation patterns^31,32,33. Second, the visibility of papers in top ranks is being exacerbated by algorithmic tools such as Google Scholar. Third, since citation networks follow a heavy-tailed distribution, those in top ranks stabilize their ranking position and give few opportunities for other papers to catch up³⁴. Because of these network effects, it is important to study how minorities are represented in top network centrality ranks.

We assigned a gender to each paper by labeling it based on its first author. Then, we analyzed the papers in the top h% of the in-degree and PageRank centrality rankings. Figure 3a suggests that papers written by women have significantly lower in-degree and PageRank centrality than expected from their overall proportion. Women-led publications are substantially under-represented in the top 20%, 30%, and 40% of the rankings, and the deviation between the observed and the expected proportions increases further in the highest rank positions. While in-degree and PageRank follow a similar trend, as expected, the proportion of women with high PageRank centrality is even lower than the proportion with high in-degree centrality. This suggests not only that papers written by women receive less attention but also that they are disadvantaged in terms of their position within the entire citation network. Statistical tests confirm these findings (see Supplementary Table 5).
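The top-h% comparison can be illustrated with a short sketch. It takes any centrality score per paper (in-degree or PageRank, computed elsewhere) and returns the share of women-led papers among the top h%; the function name, the tie handling (stable sort order), and the toy data are assumptions made for illustration.

```python
from math import ceil

def share_in_top(scores, genders, h, group="woman"):
    """Fraction of `group`-led papers among the top h% of papers by score."""
    ranked = sorted(zip(scores, genders), key=lambda x: x[0], reverse=True)
    k = max(1, ceil(len(ranked) * h / 100))  # number of papers in the top h%
    top = ranked[:k]
    return sum(g == group for _, g in top) / k

# Toy data: 10 papers, 4 of them led by women (overall share 0.4).
scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
genders = ["man", "man", "woman", "man", "man",
           "woman", "man", "woman", "man", "woman"]
top20 = share_in_top(scores, genders, h=20)
top50 = share_in_top(scores, genders, h=50)
```

Comparing each value with the overall share of women-led papers shows whether the group is under-represented at that depth of the ranking.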

**Fig. 3: Women author proportions in degree and PageRank centrality, evolution of centrality difference by year and relationship between time of publication and citation.**

So far, the global gender analysis points towards a notable disparity in productivity and citation of men and women. This could be partly due to historical reasons, to the cumulative advantage that early arrival confers to men, as well as to the high dropout rate of women⁷. The slower rate of arrival of women (see Fig. 1a) may also play a relevant role. Together, these factors affect women’s global visibility. The question that arises from these global results is, are scholars intentionally ignoring (and therefore, under-citing) research works led by women? To explore this possibility, in the following section we study pairs of papers written by men and women that are statistically validated twins, and measure the citations that each paper receives.

Pair-wise citation analysis

We identified statistically validated pairs of similar papers (one with a man as first author and the other with a woman) using the methodology described in Methods and summarized in Supplementary Fig. 2. Then, we computed the difference in the number of citations each member of the pair receives. The overall expectation is that similar pairs of papers should have a similar number of incoming citations on average. The first sign of gender bias that we have found is that, within similar pairs of man–woman papers, men get more citations in 45% of the pairs, women in 39%, and in 16% they receive the same number of citations. We performed binomial tests against the null hypothesis that men and women should be equally likely to get citations within each similar pair and obtained a strong rejection (p-value ≈ 0).

To quantify men’s advantage, we computed the average citation difference between the man-led and the woman-led paper of each pair. Then we normalized it using the standard deviation of men’s and women’s citations to obtain Cohen’s d, a measure of effect size for the difference of means. We evaluated the significance of these differences using z-tests (see Methods). As shown in Table 1, men’s average citation count is significantly higher than women’s both in aggregate and when we consider each PACS subfield separately to control for potential differences in the citation biases per subfield. We obtained similar results by controlling for journal instead of subfield (see Supplementary Note 1 and Supplementary Table 10). We performed analogous analyses for last authors, finding consistent results for most subfields and journals (see Table 2 and Supplementary Table 12). The only noteworthy difference appears in PACS 80 (Interdisciplinary Physics & Related Studies), where women get more citations on average as first authors.
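The effect-size computation can be sketched as below. The exact pooling of the two standard deviations is given in Methods, which is not reproduced here, so this sketch assumes the common choice of the root mean of the two sample variances; the function name and the citation counts are invented.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(men_cites, women_cites):
    """Cohen's d for paired citation counts (one entry per similar pair).

    Assumption: pooled SD = sqrt((var_men + var_women) / 2).
    """
    diff = mean(m - w for m, w in zip(men_cites, women_cites))
    pooled_sd = sqrt((variance(men_cites) + variance(women_cites)) / 2)
    return diff / pooled_sd

# Toy pairs: element i of each list belongs to the same similar pair.
men = [12, 8, 10, 6]
women = [10, 7, 9, 6]
d = cohens_d(men, women)
```

A positive d means that the man-led paper of a pair receives more citations on average.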

Table 1 Differences in received citations among similar pairs of publications labeled by their first-author gender.


Table 2 Differences in received citations among similar pairs of publications labeled by their last-author gender.


It is known that the publication time of a paper influences its citation count, and previous studies^1,35 have used different strategies to control for it. To check whether the temporal difference between two papers is responsible for the citation disparity for women (an older paper has had more time to accumulate citations), we add a maximum 3-year difference restriction between two similar papers and redo the citation difference analyses. Tables 1 and 2 show that when the time constraint is applied, the citation difference between two similar publications decreases significantly (see Supplementary Tables 11 and 13 for the journal-wise analyses). The effect is stronger for first than for last authors. The subfield Interdisciplinary Physics & Related Studies (PACS 80) behaves anomalously, as women have the citation advantage as first authors while men have it as last authors. In contrast to the rest of the subfields, this advantage increases after applying the time constraint.

However, citations have a very heterogeneous distribution, with a tiny fraction of papers gathering a huge number of citations, so these discrepancies may be caused by a few papers written by women with many citations. To mitigate the influence of such outliers, we have performed analogous tests for the difference of medians. In particular, we have used the Wilcoxon test to quantify the significance of the difference and the rank biserial correlation (rc)³⁶ to estimate its effect size. The rc metric takes values between −1, when women have more citations in every pair, and +1, when men do. The results, presented in Supplementary Tables 14 and 15, show that the apparent advantage of women in PACS 80 (and in PACS 00, General Physics) after applying the time constraint was mostly driven by outliers, as rc is positive in all cases, although, consistent with the previous analyses, it is smaller when the time constraint is applied.
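For readers unfamiliar with the rc metric, the sketch below implements the standard matched-pairs rank biserial correlation. It assumes that pairs with equal citation counts are dropped and that tied absolute differences share their average rank; the function name is hypothetical, and at least one pair must have different counts.

```python
def rank_biserial(men_cites, women_cites):
    """Matched-pairs rank biserial correlation in [-1, +1].

    +1: the man-led paper has more citations in every pair;
    -1: the woman-led paper has more citations in every pair.
    """
    diffs = [m - w for m, w in zip(men_cites, women_cites) if m != w]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (w_plus - w_minus) / (w_plus + w_minus)
```

Because the statistic depends only on ranks, a single paper with an extreme citation count cannot dominate it the way it can dominate a difference of means.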

Throughout these analyses, we have seen that the gender disparity within similar man–woman pairs is small (small effect sizes), but significant (p-values close to 0). However, we should be cautious when interpreting those p-values. The statistical tests rely on the assumption of independent samples, but in our methodology one paper can be part of several statistical twins, so those pairs would not be independent. The independence violation results in narrower standard errors and, in turn, lower p-values. Nevertheless, the consistency of the gender asymmetries should not be underestimated.

The temporal dimension is fundamental when comparing citation counts, as the first-mover advantage plays a crucial role in scientific success³⁷. Within similar man–woman pairs, the man’s paper is published first in 47.7% of the pairs, the woman’s paper in 41.3%, and approximately at the same time (the same year) in 11.0% of the pairs. These results point to a clear first-mover advantage by men.

First-mover advantage within similar pairs of papers

Given the above results, we now seek to confirm whether the time of publication is a main driver for the citation disparity and whether the first-mover advantage in publication affects men-led papers and women-led papers similarly. We define Δ_t = Y_m − Y_f as the year difference between the publication dates of man–woman pairs of similar papers and Δ_C = c_m − c_f as their citation difference. We plotted the year difference Δ_t against the citation difference Δ_C in Fig. 3b. We likewise elaborated ten analogous plots after categorizing the data into subfields by PACS number (shown in Supplementary Fig. 5) to control for variations between subfields. Note that for this analysis we impose no time restriction between the publication times of the two papers of each pair.

To verify that the disparity in citations is caused by the first-mover advantage, we first need to test whether a first-mover advantage in fact exists. If that is the case, when a man publishes first (Δ_t < 0) he should get more citations (Δ_C > 0) on average, but when a woman publishes first (Δ_t > 0) she is the one who should get more citations (Δ_C < 0) on average; that is, in Fig. 3b, quadrants Q2 and Q4 should be more populated than expected if we treated Δ_t and Δ_C as independent random variables. Equivalently, we should observe a negative correlation between Δ_t and Δ_C.

To test this hypothesis, we compared the empirical joint probability distribution of Δ_t and Δ_C (P_emp(Δ_t, Δ_C)) with the one that we would obtain if they were independent variables (P_null(Δ_t, Δ_C) = p(Δ_t)p(Δ_C)) by computing the probability anomaly as:

$$P_{\mathrm{diff}}(\Delta_t,\Delta_C)=\frac{P_{\mathrm{emp}}(\Delta_t,\Delta_C)-P_{\mathrm{null}}(\Delta_t,\Delta_C)}{P_{\mathrm{null}}(\Delta_t,\Delta_C)} \tag{2}$$

The resulting values of P_diff(Δ_t, Δ_C) are shown in Fig. 3c and, as can be observed, they support the hypothesis of the first-mover advantage, since Q2 and Q4 present positive anomalies while Q1 and Q3 present negative ones. It is worth emphasizing that a positive (resp. negative) anomaly indicates higher (resp. lower) density of points with respect to a situation of no correlation between Δ_t and Δ_C. To quantify this trend we computed the Pearson and Spearman correlations between Δ_t and Δ_C, obtaining −0.13 and −0.34, respectively.
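The probability anomaly can be computed from a list of (Δ_t, Δ_C) observations as sketched below. The sketch works on discrete cells and returns only the cells that are actually observed; how the paper bins the two variables for Fig. 3c is not specified in this section, so the toy example simply reduces each variable to its sign, and the function name is hypothetical.

```python
from collections import Counter

def probability_anomaly(pairs):
    """Map each observed (dt, dc) cell to (P_emp - P_null) / P_null,
    where P_null is the product of the two marginal probabilities."""
    n = len(pairs)
    joint = Counter(pairs)
    marg_t = Counter(dt for dt, _ in pairs)
    marg_c = Counter(dc for _, dc in pairs)
    anomaly = {}
    for (dt, dc), count in joint.items():
        p_emp = count / n
        p_null = (marg_t[dt] / n) * (marg_c[dc] / n)
        anomaly[(dt, dc)] = (p_emp - p_null) / p_null
    return anomaly

# Toy data in sign form: (-1, +1) is Q2 (man first, man ahead),
# (+1, -1) is Q4 (woman first, woman ahead).
signs = [(-1, 1)] * 4 + [(1, -1)] * 4 + [(-1, -1)] + [(1, 1)]
anom = probability_anomaly(signs)
```

In this toy example Q2 and Q4 come out with positive anomalies and Q1 and Q3 with negative ones, which is the pattern a first-mover advantage would produce.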

Once the existence of the first-mover advantage has been confirmed, we need to test whether there exists an asymmetry in the relative advantage that men and women obtain when they publish first. If there is no asymmetry, the average number of citations that a woman obtains by publishing a certain number of years ahead of a man should be comparable to the number of citations that a man obtains in the equivalent situation.

To verify this, we compared the citation differences of Q2 with Q4 (pairs where the earlier paper received more citations) and Q1 with Q3 (pairs where the earlier paper received fewer citations) for each temporal difference; in other words, we compared the average absolute value $|{\Delta }_{C}|$ of points from Q2 with the average $|{\Delta }_{C}|$ of points from Q4 for each $|{\Delta }_{t}|=1,2,\ldots$ separately (analogously for Q1 and Q3). To perform these comparisons, we used z-tests for difference of means for each year difference (see Methods). The results of the tests for the whole dataset, shown in Table 3, indicate that men have an asymmetric advantage, gaining comparatively more citations when they publish first. We obtain similar results for each subfield (see Supplementary Table 16). The exceptions are General Physics (PACS 00) and Interdisciplinary Physics & Related Studies (PACS 80), where women get an asymmetric advantage.
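The Q2 versus Q4 comparison can be sketched as follows. The exact form of the z-test is given in Methods, which is not reproduced here, so the sketch assumes an unpooled two-sample z statistic; the function name is hypothetical, and lags with fewer than two points in either quadrant are skipped.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

def asymmetry_by_lag(pairs):
    """pairs: (dt, dc) with dt = Y_m - Y_f and dc = c_m - c_f.
    Returns {lag: (mean |dc| in Q2, mean |dc| in Q4, z statistic)}."""
    q2, q4 = defaultdict(list), defaultdict(list)
    for dt, dc in pairs:
        if dt < 0 and dc > 0:    # Q2: man published first and leads
            q2[-dt].append(dc)
        elif dt > 0 and dc < 0:  # Q4: woman published first and leads
            q4[dt].append(-dc)
    result = {}
    for lag in sorted(set(q2) & set(q4)):
        a, b = q2[lag], q4[lag]
        if len(a) < 2 or len(b) < 2:
            continue  # variance is undefined for a single point
        se = sqrt(variance(a) / len(a) + variance(b) / len(b))
        if se == 0:
            continue  # identical values on both sides, z is undefined
        result[lag] = (mean(a), mean(b), (mean(a) - mean(b)) / se)
    return result

# Toy data: at a one-year lag, men gain 4 and 6 citations, women 2 and 4.
toy = [(-1, 4), (-1, 6), (1, -2), (1, -4), (-2, 3), (1, 5)]
res = asymmetry_by_lag(toy)
```

A positive z at a given lag means that men gain more citations than women do when publishing first by that many years; the Q1 versus Q3 comparison follows the same pattern with the sign conditions swapped.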

Table 3 Statistical tests of gender asymmetry in the first-mover advantage.


Researcher seniority as a temporal advantage

While we have verified that the first-mover advantage plays a relevant role in the citation disparities between genders at the microscopic level, the differences between similar pairs, even if significant, are fairly small. Therefore, the temporal advantage gained by individual papers published earlier than their statistical twins may not be enough to explain the visibility differences manifested in the centrality rankings shown in Fig. 3a. As mentioned above, there are group-level temporal disparities that should also be taken into account: women’s delayed arrival, their slower rate of arrival, and their higher dropout rate, captured in Fig. 1.

These factors can have dramatic effects on the distribution of seniority of researchers (see Fig. 4a), which is another potential source of inequality. As a researcher progresses through her career, she not only gathers citations, but also recognition, which in turn attracts more citations. As we observe in Fig. 4b, the proportion of male to female authors increases with career age, indicating a strong gender bias in the seniority distribution. This bias in the proportion of senior researchers is transferred to the ranking of centrality of papers (see Fig. 4c), which shows, on the one hand, that the higher ranks are occupied on average by older researchers, and on the other hand, that the average age of women authors is consistently lower throughout all ranks.

**Fig. 4: Seniority distribution of researchers by gender.**

This thorough analysis indicates that temporal advantages are critical factors in the emergence of gender inequalities. From the individual’s perspective, researchers that publish a result earlier gain the first-mover advantage. Men publish earlier more frequently and obtain an asymmetrical advantage when they do so. At the population level, historical disadvantages driven by the late arrival and higher dropout rate of women cause a deficit of female senior researchers, which may explain women’s low visibility in the citation network.

Historical trend in citation

Finally, we hypothesize that the physics community might have been less receptive to the contribution of women in the past compared to the present. To test this hypothesis, we measure the temporal evolution of the citation differences (Δ_C) between man–woman pairs by year and limit the publication time difference between the two papers to a trailing window of 3 years. Then, we compute the mean and standard error of Δ_C for all the pairs within each window. For comparison, we perform an analogous computation for random samples of similar man-man pairs. In each time window, we matched the number of sampled man-man pairs with the number of similar man–woman pairs. We repeated the man-man computation 100 times independently and computed the average Δ_C and the standard error, which we use as a baseline.

Figure 3d shows the citation differences for man–woman pairs of similar papers over the years compared with the baseline given by man-man pairs of papers. The earlier man–woman pairs seem to present a higher disparity favoring men than later pairs, whereas the Δ_C values for man-man pairs throughout the years are, as expected, consistently located around zero. After all, the similar man-man pairs were chosen randomly and there is no reason for one paper of the pair to have a higher or lower citation count than the other. The early fluctuations in Fig. 3d are due to sample size (see Supplementary Fig. 6), and the negative peak of 2002 is caused by an extremely influential paper led by a woman that laid many of the theoretical foundations of the subfield of Network Science³⁸. To measure the decreasing trend in the man–woman pairs, we ran a Mann–Whitney U-Test comparing the Δ_C of man–woman pairs published before and after 1995, obtaining a p-value = 1.78 × 10⁻⁵⁸. Hence, as hypothesized, the man–woman pairs published before 1995 show a significant disparity favouring men when compared to those published after 1995. We obtained qualitatively similar results when we performed the computation considering only the citations received up to 5 years after publication for each paper.
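The windowed computation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tuple layout `(year_m, year_w, delta_c)`, the function name `windowed_delta_c`, and the reading of the trailing window as "both papers published within the last `window` years up to and including the window's end year" are all assumptions.

```python
import math
import statistics


def windowed_delta_c(pairs, window=3):
    """Mean and standard error of Delta_C per trailing window.

    pairs: list of (year_m, year_w, delta_c) tuples (hypothetical layout),
           where delta_c is the centrality difference of a similar pair.
    Returns {end_year: (mean, standard_error, n_pairs)}.
    """
    years = sorted({y for ym, yw, _ in pairs for y in (ym, yw)})
    result = {}
    for end in years:
        start = end - window + 1
        # Keep only pairs whose two papers both fall inside the window.
        values = [dc for ym, yw, dc in pairs
                  if start <= ym <= end and start <= yw <= end]
        if len(values) < 2:
            continue  # the standard error needs at least two pairs
        mean = statistics.fmean(values)
        sem = statistics.stdev(values) / math.sqrt(len(values))
        result[end] = (mean, sem, len(values))
    return result
```

The same function could be applied to repeated random samples of man-man pairs to build the baseline; that resampling step is left out here.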

Discussion

The primary objective of this research was to identify gender disparities in physics focusing on five topics of interest: productivity, author order analysis, self-citation analysis, and the comparison of citations for pairs of similar papers. Our study contributes to the current body of literature by comprehensively analyzing the citation patterns of men and women in physics. We assembled information about all papers published by the American Physical Society from 1894 to 2009. Using a technique that combines name and image recognition, we inferred the gender of the primary authors of papers and, to study potential gender biases, we looked for statistically significant differences in the citation patterns of papers written by men and women primary authors.

Despite all the efforts to avoid any biases in our analysis, some caveats should be considered. We have combined name and image inference to identify the gender of the scholars. Even with this careful examination, we cannot infer the gender of authors who have only initials as their first names. Another caveat is related to ethnicity, as we cannot identify the majority of Asian names originating from Korea and China²⁰ (see Supplementary Table 1). However, we can safely argue that this lack of gender identification likely affects both genders similarly. Another sensitive step of our data processing pipeline is name disambiguation, used to identify all the papers published by a given author. Although we have used various criteria to disambiguate names, there still might be errors in identifying unique authors and these errors may affect minorities, which have lower numbers of instances in the data. There are other factors that can affect citation and may not be determined by assessing similar papers. For example, papers that are novel and ground-breaking or interdisciplinary in their nature may contain citations from outside physics that make them less similar to other established papers, and those are likely not being adequately assessed in our analysis. In this case, we acknowledge that the focus of our analysis is on those scholars who work predominantly on mainstream physics.

The academic community tends to evaluate scientists based on the behavior of the majority, which in physics is predominantly the behavior of white, Western men. This evaluation, at its core, is problematic and can cause discrimination against other groups that are historically, socially, or politically discriminated against. In such cases, more attention and care should be given to women and other minorities who are more likely to suffer from such historical disadvantages. Once the system moves towards a more diverse representation, its core values will no longer be determined by only one type of majority.

The structure of the citation network can influence the future citations and recognition that papers receive. Through reading papers, scholars often follow cited papers to read and cite previous works. If papers written by women are under-represented in influential positions of the citation network, this will affect their future visibility even if they are cited adequately compared to their statistical twins. This phenomenon, also known as success-breeds-success³⁹, in addition to cumulative advantages and the first-mover advantage³⁷, can be consequential for the success and recognition of scholars, their visibility³³, future success, and the scientific community’s perception of their work⁴⁰.

Science, at its core, is a collaborative process. Through collaboration and research visits, scientists meet, ideas spread, and the foundations are laid for future collaborations. Mobility hugely impacts the centrality of scholars in their collaboration networks⁴¹. There are implicit factors that can indirectly affect the participation of women in scientific collaborations. For example, geographical distance is more likely to affect women due to their family responsibilities, restrictions on travel during pregnancy, and breastfeeding, to name a few reasons. Women might not be welcomed in certain social events that are predominantly preferred by men or for those with no family responsibilities. Lack of chemistry or shyness in interacting with another gender might also make women less likely to be invited for research visits and collaborations. We note that women are not the only group who suffer from geographical restrictions, as other forms of discrimination or simply high traveling costs can affect the collaboration of scholars from Muslim and developing countries.

Diversity has a crucial role in shaping and spreading new ideas. For example, one can safely argue that many recent publications that aim to understand the inequality and biases in academia and other social domains are directly related to the boost in participation of women and minorities. However, it is also known that despite their contributions to innovative research, minorities do not reap the benefits of their innovation when compared with majorities⁴². In future work, intersectional inequalities should be studied at large scale by considering the intersection of multiple disadvantaged categories such as gender, ethnicity, and race.

Conclusion

In sum, we found that despite the rise of women’s participation in physics in recent years, the rate of entry of new women into the field is still much slower than for men. Women tend to be less productive than men in their mid-career, and they tend to have a higher dropout rate over their academic careers. Moreover, in agreement with previous works, we found that men tend to cite their own previous works with more frequency than women, penalizing the visibility of women and their potential for academic promotion. This disparity in visibility is also manifested in the under-representation of women at the top ranks of both degree and PageRank centrality of the citation network, which implies a disadvantage on both a local scale (lower number of citations) and a global scale (peripheral location within the network).

When assessing pairs of similar papers, we found that the first-mover advantage drives the citation disparity significantly. These results combined suggest that the overall disparity in the citation network is a result of cumulative advantages and the first-mover effect that men have in physics. This cumulative advantage could create implicit biases that should be tackled by appropriate policies that foster the participation of women and other minorities.

Methods

Assessing similar pairs of papers

The main objective of this paper is to compare pairs of similar papers in an unbiased fashion. The similarity analysis is based on the concept of bibliographic coupling strength N_ij of pairs of articles (i, j), which is defined as the number of common articles cited by both i and j^21,22. To overcome the shortcomings of the most commonly used normalized versions of N_ij (the Jaccard index and fractional counting, described in Supplementary Methods), we identify couples of similar papers by looking both at the outgoing references of the pair and the incoming citations of the articles they cite. In particular, we perform a statistical test using the hypergeometric distribution as a null model and detect pairs of papers whose set of common outgoing citations has a very low probability of having been generated by chance^43,44. In Supplementary Fig. 2 we present a diagram of this methodology, which is explained below in detail.

First, the citation network is built for each physics subfield (the first two digits of PACS), and then each paper in the citation network is further labeled by the gender of its primary author. After establishing the citation network, two sets ${S}_{A}^{k}$ and ${S}_{B}^{k}$ are defined: ${S}_{B}^{k}$ includes all articles that are cited k times, and ${S}_{A}^{k}$ includes all articles that cite any element in ${S}_{B}^{k}$. Notice that each publication may belong to one set, to the other or to both.
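As a rough illustration of this step, the sets can be derived from a mapping of each paper to its reference list. The input layout (`references`, a dict from paper identifier to the set of papers it cites within one subfield) and the function name `build_sets` are assumptions made for this sketch.

```python
from collections import defaultdict


def build_sets(references):
    """Build S_B^k and S_A^k for every in-degree k.

    references: dict mapping a paper id to the set of paper ids it cites
                (hypothetical input layout).
    Returns (s_b, s_a), two dicts keyed by k.
    """
    # In-degree of every cited paper within the network.
    in_degree = defaultdict(int)
    for cited_set in references.values():
        for cited in cited_set:
            in_degree[cited] += 1

    # S_B^k: all papers that are cited exactly k times.
    s_b = defaultdict(set)
    for paper, k in in_degree.items():
        s_b[k].add(paper)

    # S_A^k: all papers that cite at least one element of S_B^k.
    s_a = {k: {p for p, refs in references.items() if refs & cited}
           for k, cited in s_b.items()}
    return dict(s_b), s_a
```

A paper can appear in both kinds of set, for example when it is cited once and itself cites a paper that is cited twice.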

Then, we build all possible pairs $i,j\in {S}_{A}^{k}$. In order to quantify the similarity between i and j, we compute the probability of i and j both referencing a certain number of publications using the hypergeometric distribution:

$$H(X| {N}_{B}^{k},{d}_{i},{d}_{j})=\frac{\left(\begin{array}{c}{d}_{i}\\ X\end{array}\right)\left(\begin{array}{c}{N}_{B}^{k}-{d}_{i}\\ {d}_{j}-X\end{array}\right)}{\left(\begin{array}{c}{N}_{B}^{k}\\ {d}_{j}\end{array}\right)}$$

(3)

where ${N}_{B}^{k}=| {S}_{B}^{k}|$ and d_i, d_j are the number of elements in ${S}_{B}^{k}$ that publications i and j respectively cite. Supplementary Fig. 2 shows a diagram that illustrates the meaning of these variables. Notice that if d_i and d_j are interchanged, the value of H remains the same. Finally, X is the number of overlapping citations. The term $\left({{N}_{B}^{k}}\atop{{d}_{j}}\right)$ corresponds to all the possible ways of choosing d_j publications from the set ${S}_{B}^{k}$; $\left({{d}_{i}}\atop{X}\right)$ denotes the number of ways one can choose exactly X publications from the d_i papers that i cites and $\left({{N}_{B}^{k}-{d}_{i}}\atop{{d}_{j}-X}\right)$ is the number of ways the d_j − X papers cited by j and not by i can be chosen from the ${N}_{B}^{k}-{d}_{i}$ elements of ${S}_{B}^{k}$ that i does not cite. Intuitively, this hypergeometric distribution can be understood as an urn model with ${N}_{B}^{k}$ balls, such that d_i of them are good balls and the rest are bad balls. H is then the probability of obtaining exactly X good balls when retrieving d_j balls from this urn.
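Equation (3) can be evaluated directly with exact binomial coefficients. The sketch below uses only the Python standard library; the function name `hypergeom_pmf` and its argument names are chosen for illustration.

```python
from math import comb


def hypergeom_pmf(x, n_b, d_i, d_j):
    """H(X | N_B^k, d_i, d_j) from Eq. (3).

    x:   number of overlapping citations X
    n_b: size of S_B^k
    d_i: number of elements of S_B^k cited by paper i
    d_j: number of elements of S_B^k cited by paper j
    """
    if x < 0 or x > d_j:
        return 0.0  # impossible overlap
    # math.comb(n, k) returns 0 when k > n, which covers x > d_i.
    return comb(d_i, x) * comb(n_b - d_i, d_j - x) / comb(n_b, d_j)
```

With an urn of 10 papers, 3 of them cited by i and 4 drawn by j, the probability of exactly one overlap is 3 · 35 / 210 = 0.5, and swapping d_i and d_j leaves that value unchanged.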

Now, if i and j have actually cited ${N}_{ij}^{k}$ common papers of in-degree k, the cumulative probability of $X < {N}_{ij}^{k}$ provides a measure of how probable it is that the size of their set of overlapping citations can be explained by randomness:

$${p}_{ij}(k)=\mathop{\sum }\limits_{X=0}^{{N}_{ij}^{k}-1}H\left(X| {N}_{B}^{k},{d}_{i},{d}_{j}\right)$$

(4)

The higher p_ij(k) is, the less probable it is that the size of ${N}_{ij}^{k}$ is due to chance. Therefore, we devise a measure of similarity as follows:

$${q}_{ij}(k)=1-{p}_{ij}(k)$$

(5)

Notice that q_ij(k) is the probability of a particular bibliographic coupling strength of randomly selected papers i and j towards articles in ${S}_{B}^{k}$ being greater than or equal to ${N}_{ij}^{k}$. This computation is repeated for all k and the different values of q_ij(k) are stored. The similarity of the couple (i, j) is measured by the minimum q_ij(k) over all possible values of k:

$${{q}_{ij}\left(k\right)}_{\min }=\mathop{\min }\limits_{k}\{{q}_{ij}(k)\}$$

(6)

Publications i and j are considered similar if ${{q}_{ij}(k)}_{\min } < {p}^{* }$, where p* is a threshold value. We have chosen a threshold of p* = 0.001, which provides a good balance between similarity sensitivity and sample size. In Supplementary Methods and Supplementary Fig. 3, we detail the criteria for adopting this threshold.
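Equations (4) to (6) and the threshold test can be sketched as below. The input layout `per_k` (a dict from in-degree k to the tuple `(n_ij, n_b, d_i, d_j)`) and the function names are assumptions for illustration. One deliberate deviation is labeled in the code: q is computed as the upper-tail sum rather than as 1 − p, which is mathematically the same quantity but avoids losing precision when q is very small.

```python
from math import comb


def hypergeom_pmf(x, n_b, d_i, d_j):
    """H(X | N_B^k, d_i, d_j) from Eq. (3)."""
    if x < 0 or x > d_j:
        return 0.0
    return comb(d_i, x) * comb(n_b - d_i, d_j - x) / comb(n_b, d_j)


def q_value(n_ij, n_b, d_i, d_j):
    """q_ij(k) = 1 - p_ij(k) = P(X >= N_ij^k).

    Summing the upper tail directly (instead of 1 - lower tail) is a
    numerical choice made for this sketch.
    """
    upper = min(d_i, d_j)
    return sum(hypergeom_pmf(x, n_b, d_i, d_j) for x in range(n_ij, upper + 1))


def is_similar(per_k, p_star=0.001):
    """Return (q_min, similar) for one pair of papers.

    per_k: dict k -> (n_ij, n_b, d_i, d_j), hypothetical layout holding
           the quantities of Eq. (3) for each in-degree k.
    """
    q_min = min(q_value(*args) for args in per_k.values())
    return q_min, q_min < p_star
```

For example, two papers that each cite 2 of 1000 once-cited references and share both of them get q = 1/499500 for k = 1, which passes p* = 0.001 regardless of the values at other k.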

We take the maximum similarity (minimum q_ij(k)) across k values because similarity can be manifested in the reference lists in very different ways. For example, two similar papers of a niche area could share just one or two references that almost no other publication cites. In the other extreme, two similar papers of an interdisciplinary or generalist field could share many widely cited references, so the probability that they were included only due to their popularity is very low. Both of these situations would lead to high similarity values. One would present a high similarity (low q_ij(k)) only for low k, while the other would do so only for high k. Since the q_ij(k) are p-values, the statistical significance of each of them should be tested independently. By only testing the minimum we are not disregarding the remaining k, as there may be other k for which q_ij(k) is low enough to pass the p* threshold. Instead, following Ciotti et al.⁴⁴, we are simplifying the analysis, as for two papers to be considered similar, it is enough for one q_ij(k) to pass the p* test.

To verify the accuracy of our approach, we manually inspected several pairs of papers with validated similarity measurements. For this test, we set a low threshold value, p* = 10⁻⁶, and applied a constraint of maximum publication year difference of 3 years. We validated the similarity between the two papers through the inspection of keywords, titles, and citation activities. For instance, papers ⁴⁵ and ⁴⁶, with ${{q}_{ij}(k)}_{\min }=1.0056\times {10}^{-8}$, present some connection between their main ideas and share a common author. Additionally, a large proportion of their citation activities align. Another similar pair is formed by articles ⁴⁷ and ⁴⁸, with ${{q}_{ij}(k)}_{\min }=4.0735\times {10}^{-12}$, which show extremely similar citation activities and deal with similar topics. As a final example, ⁴⁹ and ⁵⁰, with ${{q}_{ij}(k)}_{\min }=2.5139\times {10}^{-7}$, share topic, citation activities, and a collaborating author. It is worth emphasizing that, due to the highly restrictive p*, some of these statistically validated pairs of similar papers share a common author, which is a strong verification of our algorithm.

In a nutshell, the hypergeometric probability testing compares how significant the overlapping outflow of citations is for two papers compared to what we expect from the in-degree and out-degree of the citation network. Using this technique, we are able to compare papers that are inherently similar in their subject field by not only comparing their overlapping references, but also accounting for variations in the citations received by each reference. Since we control both for the outgoing citations of the pair and the incoming citations of the commonly cited papers, the comparison is robust and unbiased.

Authorship order two-proportion z-test

We denote the total men’s and women’s population as N_m and N_w, and total number of men and women first authors as n_m and n_w, respectively. We further define ${p}_{m}=\frac{{n}_{m}}{{N}_{m}},{p}_{w}=\frac{{n}_{w}}{{N}_{w}},p=\frac{{n}_{m}+{n}_{w}}{{N}_{m}+{N}_{w}}$ and the two-proportion z-test is performed as below:

$$z=\frac{{p}_{m}-{p}_{w}}{\sqrt{p(1-p)\left(\frac{1}{{N}_{m}}+\frac{1}{{N}_{w}}\right)}}$$

(7)
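Equation (7) translates into a few lines of Python. The function names below are illustrative, and the two-sided p-value helper is an addition based on the standard normal distribution rather than something stated in the text.

```python
import math


def two_proportion_z(n_m, N_m, n_w, N_w):
    """z statistic of Eq. (7).

    n_m, n_w: number of men / women first authors
    N_m, N_w: total men's / women's population
    """
    p_m = n_m / N_m
    p_w = n_w / N_w
    p = (n_m + n_w) / (N_m + N_w)  # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / N_m + 1 / N_w))
    return (p_m - p_w) / se


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))
```

With 60 first authors out of 100 men and 40 out of 100 women, the pooled proportion is 0.5 and z = 0.2 / √0.005 ≈ 2.83.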

Calculating differences in received citations

Let N_mw denote the cardinality of the set of all pairs (m, w) where m and w denote publications by a primary man and woman author that share at least one reference and let M(p*) be the subset of all similar pairs validated under p*. c_m and c_w indicate number of citations received by m and w, and the average citation difference c_d can be computed by

$${c}_{d}({p}^{* })=\frac{1}{| M({p}^{* })| }\mathop{\sum }\limits_{x=1}^{| M({p}^{* })| }{\left({c}_{m}-{c}_{w}\right)}_{x}$$

(8)

where x denotes the index of pairs (m, w) ∈ M(p*). Since we are interested in comparing pairs of papers, we normalize this average difference to obtain Cohen’s d_avg, a widely used measure of effect size for difference of means⁵¹ (we actually use the unbiased version of Cohen’s d, Hedges’ g, but we will keep the d notation to emphasize its interpretation as average difference):

$$d({p}^{* })=\frac{{c}_{d}({p}^{* })}{\sqrt{\frac{{\sigma }_{{c}_{w}}^{2}+{\sigma }_{{c}_{m}}^{2}}{2}}}$$

(9)
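A minimal sketch of Eqs. (8) and (9) follows. It assumes the similar pairs are available as a list of `(c_m, c_w)` citation counts and uses the sample variance; both are assumptions, as is the function name. The small-sample correction that turns Cohen's d into Hedges' g is not applied here, because its exact form is not spelled out in this section.

```python
import math
import statistics


def effect_size(pairs):
    """Average citation difference c_d (Eq. 8) and normalized d (Eq. 9).

    pairs: list of (c_m, c_w) citation counts for the similar pairs in
           M(p*) (hypothetical layout). Needs at least two pairs.
    """
    c_m = [m for m, _ in pairs]
    c_w = [w for _, w in pairs]
    c_d = statistics.fmean([m - w for m, w in pairs])
    # Denominator of Eq. (9): root of the mean of the two variances.
    pooled = math.sqrt((statistics.variance(c_m) + statistics.variance(c_w)) / 2)
    return c_d, c_d / pooled
```

For the pairs (10, 6), (8, 8), (12, 4), both variances equal 4, so c_d = 4 and d = 4 / 2 = 2.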

To assess the significance of this difference we perform a difference of means z-test with H₀ : c_m = c_w, with the z-statistic defined as

$$z=\frac{{\bar{c}}_{m}-{\bar{c}}_{w}}{\sqrt{\frac{{\sigma }_{{c}_{m}}^{2}}{| M({p}^{* })| }+\frac{{\sigma }_{{c}_{w}}^{2}}{| M({p}^{* })| }}}$$

(10)

Hence, a positive z-score indicates that the data displays higher degree centrality for authors who are men than expected.
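The z-statistic of Eq. (10) can be computed from the same kind of input. As before, the `(c_m, c_w)` list layout, the use of the sample variance, and the function name are assumptions of this sketch.

```python
import math
import statistics


def difference_of_means_z(pairs):
    """z statistic of Eq. (10) for H0: c_m = c_w.

    pairs: list of (c_m, c_w) values for the similar pairs in M(p*)
           (hypothetical layout). Both groups have size |M(p*)|.
    """
    n = len(pairs)
    c_m = [m for m, _ in pairs]
    c_w = [w for _, w in pairs]
    numerator = statistics.fmean(c_m) - statistics.fmean(c_w)
    denominator = math.sqrt(statistics.variance(c_m) / n
                            + statistics.variance(c_w) / n)
    return numerator / denominator
```

For the pairs (10, 6), (8, 8), (12, 4) the means differ by 4 and the denominator is √(8/3), giving z = √6 ≈ 2.45.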

Computing temporal citation differences

We compared the citation differences of Q2 with Q4 (pairs where the earlier paper received more citations) and Q1 with Q3 (pairs where the earlier paper received fewer citations) for each temporal difference; in other words, we compared the average absolute value $|{\Delta }_{C}|$ of points from Q2 with the average $|{\Delta }_{C}|$ of points from Q4 for each $|{\Delta }_{t}|=1,2,\ldots$ separately (analogously for Q1 and Q3). To perform these comparisons, we used z-tests for difference of means for each year difference:

$$z=\frac{\overline{| {\Delta }_{C}^{{Q}_{i}}| }-\overline{| {\Delta }_{C}^{{Q}_{j}}| }}{\sqrt{\frac{{\sigma }_{{Q}_{i}}^{2}}{N({Q}_{i})}+\frac{{\sigma }_{{Q}_{j}}^{2}}{N({Q}_{j})}}}$$

(11)

In this test, we evaluate the mean ($\overline{| {\Delta }_{C}^{{Q}_{i}}| }$) and the standard deviation (${\sigma }_{{Q}_{i}}$) of $|{\Delta }_{C}|$ for two subsets of quadrants Q_i and Q_j. N(Q_i) is the number of data points in quadrant i (number of similar pairs). We run the z-test for (i, j) = (1, 3) and (i, j) = (2, 4).
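The per-year quadrant comparison of Eq. (11) can be sketched as below. The sign convention is an assumption: each similar pair is a point (Δ_t, Δ_C) with quadrants numbered in the usual way (Q1: both positive, Q2: Δ_t < 0 and Δ_C > 0, Q3: both negative, Q4: Δ_t > 0 and Δ_C < 0), which matches the grouping of Q2 and Q4 as "earlier paper received more citations". Function names and the input layout are also illustrative.

```python
import math
import statistics
from collections import defaultdict


def quadrant(delta_t, delta_c):
    """Quadrant of a point in the (Delta_t, Delta_C) plane (assumed convention)."""
    if delta_t > 0 and delta_c > 0:
        return 1
    if delta_t < 0 and delta_c > 0:
        return 2
    if delta_t < 0 and delta_c < 0:
        return 3
    if delta_t > 0 and delta_c < 0:
        return 4
    return None  # points on an axis belong to no quadrant


def quadrant_z(points, i, j):
    """z statistic of Eq. (11) for quadrants i and j, per |Delta_t|.

    points: list of (delta_t, delta_c) tuples, one per similar pair.
    Returns {|delta_t|: z}.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for dt, dc in points:
        q = quadrant(dt, dc)
        if q in (i, j):
            groups[abs(dt)][q].append(abs(dc))
    result = {}
    for gap, by_q in sorted(groups.items()):
        a, b = by_q[i], by_q[j]
        if len(a) < 2 or len(b) < 2:
            continue  # variance undefined for fewer than two points
        se = math.sqrt(statistics.variance(a) / len(a)
                       + statistics.variance(b) / len(b))
        if se == 0:
            continue
        result[gap] = (statistics.fmean(a) - statistics.fmean(b)) / se
    return result
```

Calling `quadrant_z(points, 2, 4)` and `quadrant_z(points, 1, 3)` reproduces the two comparisons named in the text.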

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. The data are stored in CSV format.

Code availability

Python code to generate similar pairs and other relevant measurements is available at https://github.com/CSHVienna/firstmover.

References

1. Caplar, N., Tacchella, S. & Birrer, S. Quantitative evaluation of gender bias in astronomical publications from citation counts. Nat. Astron. 1, 1–5 (2017).
2. Dworkin, J. D. et al. The extent and drivers of gender imbalance in neuroscience reference lists. Nat. Neurosci. 23, 918–926 (2020).
3. Alper, J. & Gibbons, A. The pipeline is leaking women all the way along. Science 260, 409–412 (1993).
4. Kaufman, R. R. & Chevan, J. The gender gap in peer-reviewed publications by physical therapy faculty members: a productivity puzzle. Phys. Ther. 91, 122–131 (2011).
5. Reed, D. A., Enders, F., Lindor, R., McClees, M. & Lindor, K. D. Gender differences in academic productivity and leadership appointments of physicians throughout academic careers. Acad. Med. 86, 43–47 (2011).
6. Jadidi, M., Karimi, F., Lietz, H. & Wagner, C. Gender disparities in science? dropout, productivity, collaborations and success of male and female computer scientists. Adv. Complex Syst. 21, 1750011 (2018).
7. Huang, J., Gates, A. J., Sinatra, R. & Barabási, A.-L. Historical comparison of gender inequality in scientific careers across countries and disciplines. Proc. Natl Acad. Sci. USA 117, 4609–4616 (2020).
8. Maske, K. L., Durden, G. C. & Gaynor, P. E. Determinants of scholarly productivity among male and female economists. Econ Inq. 41, 555–564 (2003).
9. Mueller, C., Wright, R. & Girod, S. The publication gender gap in us academic surgery. BMC Surg. 17, 16 (2017).
10. Barabási, A.-L. The Formula: The Five Laws Behind Why People Succeed (Pan Macmillan, 2018).
11. Aksnes, D. W., Rorstad, K., Piro, F. & Sivertsen, G. Are female researchers less cited? a large-scale study of norwegian scientists. J. Am. Soc. Inf. Sci. Technol. 62, 628–636 (2011).
12. Lindsey, D. Using citation counts as a measure of quality in science measuring what’s measurable rather than what’s valid. Scientometrics 15, 189–203 (1989).
13. Davenport, E. & Snyder, H. Who cites women? whom do women cite?: an exploration of gender and scholarly citation in sociology. J. Doc. 51, 404–410 (1995).
14. Lee, C. J., Sugimoto, C. R., Zhang, G. & Cronin, B. Bias in peer review. J. Am. Soc. Inform. Sci. Technol. 64, 2–17 (2013).
15. Larivière, V., Ni, C., Gingras, Y., Cronin, B. & Sugimoto, C. R. Bibliometrics: Global gender disparities in science. Nature 504, 211–213 (2013).
16. Bendels, M. H., Müller, R., Brueggmann, D. & Groneberg, D. A. Gender disparities in high-quality research revealed by Nature Index Journals. PLoS ONE 13, e0189136 (2018).
17. Barthelemy, R. S., McCormick, M. & Henderson, C. Gender discrimination in physics and astronomy: graduate student experiences of sexism and gender microaggressions. Phys. Rev. Phys. Educ. Res. 12, 020119 (2016).
18. Aycock, L. M. et al. Sexual harassment reported by undergraduate female physicists. Phys. Rev. Phys. Educ. Res. 15, 010121 (2019).
19. Sinatra, R., Wang, D., Deville, P., Song, C. & Barabási, A.-L. Quantifying the evolution of individual scientific impact. Science 354, aaf5239 (2016).
20. Karimi, F., Wagner, C., Lemmerich, F., Jadidi, M. & Strohmaier, M. Inferring gender from names on the web: A comparative evaluation of gender detection methods. In Proceedings of the 25th International Conference Companion on World Wide Web, 53–54 (International World Wide Web Conferences Steering Committee, 2016).
21. Kessler, M. M. Bibliographic coupling between scientific papers. Am. Doc. 14, 10–25 (1963).
22. Egghe, L. & Rousseau, R. Co-citation, bibliographic coupling and a characterization of lattice citation networks. Scientometrics 55, 349–361 (2002).
23. Koffi, M. Innovative ideas and gender inequality. Working Paper Series 35, https://www.econstor.eu/handle/10419/234474 (Canadian Labour Economics Forum (CLEF), Waterloo, 2021).
24. Morgan, A. C. et al. The unequal impact of parenthood in academia. Sci. Adv. 7, eabd1996 (2021).
25. King, M. M., Bergstrom, C. T., Correll, S. J., Jacquet, J. & West, J. D. Men set their own cites high: gender and self-citation across fields and over time. Socius 3, 2378023117738903 (2017).
26. Fowler, J. & Aksnes, D. Does self-citation pay? Scientometrics 72, 427–437 (2007).
27. Maliniak, D., Powers, R. & Walter, B. F. The gender citation gap in international relations. Int. Organ. 67, 889–922 (2013).
28. Baerlocher, M. O., Newton, M., Gautam, T., Tomlinson, G. & Detsky, A. S. The meaning of author order in medical research. J. Investig. Med. 55, 174–180 (2007).
29. Sauermann, H. & Haeussler, C. Authorship and contribution disclosures. Sci. Adv. 3, e1700404 (2017).
30. Shen, Y. A., Shoda, Y. & Fine, I. Too few women authors on research papers in leading journals. Nature 555, 165–166 (2018).
31. Bloch, F., Jackson, M. O. & Tebaldi, P. Centrality Measures in Networks. Available at SSRN 2749124: https://ssrn.com/abstract=2749124 (2017).
32. Ding, Y., Yan, E., Frazho, A. & Caverlee, J. Pagerank for ranking authors in co-citation networks. J. Am. Soc. Inform. Sci. Technol. 60, 2229–2243 (2009).
33. Karimi, F., Génois, M., Wagner, C., Singer, P. & Strohmaier, M. Homophily influences ranking of minorities in social networks. Sci. Rep. 8, 1–12 (2018).
34. Ghoshal, G. & Barabási, A.-L. Ranking stability and super-stable nodes in complex networks. Nat. Commun. 2, 1–7 (2011).
35. Teich, E. G. et al. Citation inequity and gendered citation practices in contemporary physics. Preprint at https://arxiv.org/abs/2112.09047 (2021).
36. Kerby, D. S. The simple difference formula: an approach to teaching nonparametric correlation. Compr. Psychol. 3, 11.IT.3.1 (2014).
37. Newman, M. E. The first-mover advantage in scientific publication. Europhys. Lett. 86, 68001 (2009).
38. Albert, R. & Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002).
39. Van de Rijt, A., Kang, S. M., Restivo, M. & Patil, A. Field experiments of success-breeds-success dynamics. Proc. Natl Acad. Sci. USA 111, 6934–6939 (2014).
40. Lee, E. et al. Homophily and minority-group size explain perception biases in social networks. Nat. Human Behav. 3, 1078–1087 (2019).
41. Momeni, F., Karimi, F., Mayr, P., Peters, I. & Dietze, S. The many facets of academic mobility and its impact on scholars’ career. J. Informetr. 16, 101280 (2022).
42. Hofstra, B. et al. The diversity–innovation paradox in science. Proc. Natl Acad. Sci. USA 117, 9284–9291 (2020).
43. Tumminello, M., Micciche, S., Lillo, F., Piilo, J. & Mantegna, R. N. Statistically validated networks in bipartite complex systems. PLoS ONE 6, e17994 (2011).
44. Ciotti, V., Bonaventura, M., Nicosia, V., Panzarasa, P. & Latora, V. Homophily and missing links in citation networks. EPJ Data Sci. 5, 7 (2016).
45. Wu, C., Qian, W.-L. & Su, R.-K. Improved density-dependent quark mass model with quark-σ meson and quark-ω meson couplings. Phys. Rev. C 77, 015203 (2008).
46. Yin, S. & Su, R.-K. Consistent thermodynamic treatment for a quark-mass density-dependent model. Phys. Rev. C 77, 055204 (2008).
47. Lee, D., Pegg, D. & Hanstorp, D. Fast ion-beam photoelectron spectroscopy of Ca^-: cross sections and asymmetry parameters. Phys. Rev. A 58, 2121 (1998).
48. Yuan, J. Core-valence electron correlation effects in photodetachment of Ca^- ions. Phys. Rev. A 61, 012704 (1999).
49. Huo, W. M., Lima, M. A., Gibson, T. L. & McKoy, V. Correlation effects in elastic e-N₂ scattering. Phys. Rev. A 36, 1642 (1987).
50. Morrison, M. A., Saha, B. C. & Gibson, T. L. Electron-N₂ scattering calculations with a parameter-free model polarization potential. Phys. Rev. A 36, 3682 (1987).
51. Lakens, D. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4, 863 (2013).

Acknowledgements

This work has been supported by the Austrian research agency (FFG) under project No. 873927 ESSENCSE. We would also like to thank Eun Lee, M. R. Ferreira, J. Bachmann, G. Amichay, and S. Sajjadi for their comments and suggestions, which helped to greatly improve the manuscript, and J. Reddish for her thorough proofreading of the paper.

Author information

These authors contributed equally: Hyunsik Kong, Samuel Martin-Gutierrez.

Authors and Affiliations

Network Inequality Group, Complexity Science Hub Vienna, Josefstaedter Strasse 39, 1080, Vienna, Austria
Hyunsik Kong, Samuel Martin-Gutierrez & Fariba Karimi


Contributions

H.K. and F.K. conceived the study, H.K. and S.M.G. performed the computations. All authors analyzed the results and wrote the manuscript. H.K. and S.M.G. contributed equally to the paper.

Corresponding author

Correspondence to Fariba Karimi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Physics thanks Junming Huang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Karimi_PR File

Supplemental material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Kong, H., Martin-Gutierrez, S. & Karimi, F. Influence of the first-mover advantage on the gender disparities in physics citations. Commun Phys 5, 243 (2022). https://doi.org/10.1038/s42005-022-00997-x


Received: 06 October 2021
Accepted: 16 August 2022
Published: 13 October 2022
DOI: https://doi.org/10.1038/s42005-022-00997-x

