Effects of Inferred Gender on Patterns of Co-Authorship in Ecology and Evolutionary Biology Publications

Senior positions in academia such as tenured faculty and editorial positions often exhibit large gender imbalances across a broad range of research disciplines. The forces driving these imbalances have been the subject of extensive speculation and a more modest body of research. Given the central role publications play in determining individual outcomes and progress in academic settings, unequal patterns of authorship across gender could be a potent driver of observed gender imbalance in academia. Here, we investigate patterns of co-authorship across four journals in ecology and evolutionary biology at four time-points spanning four decades. Co-authorship patterns are of interest because collaborations are important in scientific research, affecting individual researcher productivity, and increasingly, funding opportunities. Based on inferred gender from set criteria, we found significant differences between male and female researchers in their tendency to publish with female co-authors. Specifically, compared to women, male researchers in the last author position were more likely to co-author papers with other males. While we did find that the proportion of female co-authors has increased modestly over the last thirty years, this is strongly correlated with an increase in the average number of authors per paper over time. Additionally, the proportion of female co-authors on papers remains well below the proportion of PhDs awarded to females in biology.


Introduction
Numerous imbalances exist between male and female researchers across academic disciplines. Men occupy a higher proportion of tenure-track positions in STEM disciplines (Xu 2008) and have greater total authorships on research publications than women on average (Larivière et al. 2013, West et al. 2013, Bendels et al. 2018). In the evaluation and dissemination of research, men are more frequently journal editors and reviewers (Fox et al. 2016, Helmer et al. 2017), textbook contributors and authors (Damschen et al. 2005), invited speakers at conferences (Gurevitch 1988, Schroeder et al. 2013, Farr et al. 2017, Klein et al. 2017), and research award winners (Lincoln et al. 2012, Ma et al. 2019). Men are invited to submit articles (often prominently featured opinion or "ideas" papers) at two times the rate of female authors (Holman et al. 2018). Men may also be rewarded more highly for collaborative efforts with female colleagues than those female colleagues are. In shared first author publications in biological journals, women are disproportionately listed as the second author (Broderick and Casadevall 2019), and in economics, co-authored papers count for less in tenure decisions for female faculty compared with male faculty (Sarsons 2017). These imbalances exist despite the fact that across all scientific research disciplines, the proportion of female PhD graduates has approached 50% for nearly two decades (Fig. 1; Appendix S1: Fig. S1).
A number of explanations have been proposed to account for these imbalances, which can broadly be classified as either gender differences or gender biases. While little support has been found for gender differences in innate aptitude (e.g., Hyde and Mertz 2009), men and women can differ in a number of attributes that may contribute to their academic advancement, such as the perception and self-promotion of personal success (Rudman 2010, Reuben et al. 2012) and rates of self-citation (King et al. 2017). Additionally, gender biases may be present in the evaluation, hiring, and advancement of researchers across career stages (e.g., Moss-Racusin et al. 2012, Sheltzer and Smith 2014, Dutt et al. 2016, but see Williams and Ceci 2015), the evaluation of published work (Bradshaw and Courchamp 2018, Royal Society of Chemistry 2019), and in the disbursement and value of research funding (Acton et al. 2019, Witterman et al. 2019). Growing recognition of the importance of understanding the forces generating gender disparities in academia has led to recent surveys examining gender representation in authorship, since publications are at the heart of a scientific career. Results from these studies consistently find that women represent a lower proportion of authors overall (Larivière et al. 2013, West et al. 2013, Bendels et al. 2018), and that in many fields, the pattern of change in the proportion of female authors indicates that it will be decades before gender parity is reached (Holman et al. 2018). These results are troubling because publications, specifically their quantity and quality, play a key role in hiring decisions, grant funding levels, awards, editorships, tenure decisions, and many other factors, all of which can affect whether a junior researcher remains in science and/or the careers of more senior researchers (e.g., Van Dijk et al. 2014).
Thus, factors that lead to disparities in publication numbers will have downstream consequences in gender equity across all aspects of academia.
In ecology and evolutionary biology (EEB), typically the first and last authors are assumed to be the greatest contributors to the publication (Tscharntke et al. 2014, Duffy 2017). Although we focused on our own field, EEB, this pattern is common in a number of STEM fields (Bendels et al. 2018).
These key authors are most likely to act as the "gate-keepers" by determining the other authors that will be included on a publication as co-authors. Whether the gender of these gate-keepers affects the gender ratio of other authors on a paper is unknown, but perhaps vital to understanding gender imbalances in academia. For example, a recent survey found that STEM postdoctoral fellows entering the academic job market had an average of 13 publications, six of them first-authored (Fernandes et al. 2019). Therefore, publications on which a researcher is a collaborator, but not the research lead, may represent more than half of that early career researcher's publications. Given that Holman et al.'s (2018) review of scientific papers published by 36 million authors found that female researchers are typically underrepresented in the last author position, but are overrepresented as first authors, understanding how the (inferred) gender of either of these key author positions (first or last) affects the gender ratio of other authors may provide insights into observed gender differences in publication metrics and biases at large.
Here, we investigate patterns in co-authorship across four of the top EEB journals, The American Naturalist, Ecology, Ecology Letters, and Evolution. We chose Ecology and Evolution because they represent the society journals for the largest ecological and evolutionary biology societies active in the United States, the Ecological Society of America, and the Society for the Study of Evolution, respectively. We chose The American Naturalist and Ecology Letters because they consistently rank in the top 20 journals for ecology and evolutionary biology (Clarivate Analytics 2018). We collected authorship data from every publication in each journal at four time-points spanning four decades. Specifically, we ask: (1) What fraction of authors on papers in these journals are inferred to be female (based on criteria explained in the methods), and how has that changed through time? (2) Is the gender of the first or last author associated with the number of female co-authors? and (3) Do the effects of first or last author gender on co-authorship patterns change through time, and if they do, what factors are related to this?

Methods
We queried Clarivate Analytics Web of Science for all publications in The American Naturalist, Ecology, Ecology Letters, and Evolution in 1987, 1997, 2007, and 2017. Ecology Letters only began publishing in 1998 (which we designate as 1997 in the dataset), yielding publications from the other three journals for 1987, and all four journals for 1997, 2007, and 2017 (2760 total publications). For each publication, we recorded the total number of authors and the gender of each author. For the purposes of this study, we have had to infer the gender of every author in our sample based on the following criteria; however, we acknowledge that by using a binary gender assignment without the input of the authors themselves we may have misidentified some researchers. The perspectives of authors who do not identify with a binary form of gender are therefore missing from this study; the effects of non-binary gender on publishing and collaboration should be explicitly incorporated in future research.
Nonetheless, if gender affects patterns of collaboration and authorship, it is likely to do so based on relatively simplistic gender identifications by other researchers.
To identify gender, we first tried to locate the researcher online and use images and pronouns to assign gender (male or female). For researchers without any informative online presence, we performed a Google image search of the given name and assigned gender based on the gender-name associations that typify the use of a given name (Social Security Administration, available online). We were unable to assign a gender to every author on some publications (less than 5% of the total number of publications), and these publications were subsequently excluded from analysis. These excluded publications were uniformly distributed across years and journals, and thus are unlikely to introduce any bias in our results. For comparative purposes, we also used public records from the USA National Science Foundation (NSF) to calculate the proportion of female PhD graduates in the Biological Sciences in the same years as our publication data (available online). While these values may not correspond exactly to the global pool of female EEB researchers, barring large discrepancies across countries and between EEB and biology generally, the proportion of female biology graduates in the USA should reflect general trends in the number of female researchers potentially publishing their work in these journals.
We used a generalized linear model with a Poisson error distribution to predict the number of female co-authors (function glm from base R; R Core Team 2018). In the model, GF and GL refer to the gender of the first and last author, respectively, Y refers to the publication year, and T (total_authors) refers to the total number of authors on a publication. The inclusion of the total_authors term accounts for the positive collinearity between author number and the proportion of female co-authors while evaluating all other terms. The GF × GL term addresses whether the effect of the gender of the first author depends on the gender of the last author. The GF (or GL) × Y term addresses whether the effect of the gender of the first or last author changes over time. The GF (or GL) × T term addresses whether the effect of the gender of the first or last author depends on the number of authors on a study. The T × Y term addresses whether the effect of total author number on the number of female co-authors has changed over time. We used Wald χ² tests with type III sum of squares to assess the significance of each term in the model (Anova function from the car package; Fox and Weisberg 2011). We used the function aout.pois from the package alphaOutlier (Rehage and Kuhnt 2016) to identify outlier observations based on the Poisson distribution. This method identified papers with greater than 10 authors as outliers, which we removed prior to our generalized linear model analysis. Papers with such high author numbers also represent a very small fraction of the total number of papers published (just 3% of all papers examined). We performed our analyses both on the full dataset and on the dataset with outliers removed, but present the results from the reduced dataset here. Results from the reduced and full dataset can be viewed in the Supporting Information (Appendix S1: Tables S1, S2).
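The model equation is not reproduced in the text above. One reading consistent with the terms described is sketched below. It assumes the log link (the default for the Poisson family in glm) and assumes that the four main effects accompany the six two-way interactions; the symbol N_F for the number of female co-authors and the coefficient labels β are notation introduced here for illustration, not taken from the original analysis.

```latex
\begin{aligned}
N_F &\sim \mathrm{Poisson}(\mu), \\
\log \mu &= \beta_0 + \beta_1\,GF + \beta_2\,GL + \beta_3\,Y + \beta_4\,T \\
         &\quad + \beta_5\,(GF \times GL) + \beta_6\,(GF \times Y) + \beta_7\,(GL \times Y) \\
         &\quad + \beta_8\,(GF \times T) + \beta_9\,(GL \times T) + \beta_{10}\,(T \times Y)
\end{aligned}
```

Under this reading, each Wald χ² test described above evaluates whether the corresponding coefficient differs from zero with all other terms in the model.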
When modeling the proportion of female co-authors as a function of time, it was apparent that the total number of authors on the average paper increased with time. To further dissect the effect of total number of authors on the number of female co-authors, we performed quantile regression using the rq function from the quantreg package (Koenker 2018). Quantile regression fits regression curves to different quantiles of the conditional distribution of the response variable, and in our case, reveals discrepancies in the effect of the total number of authors on the number of female co-authors across quantiles. We calculated quantile regression coefficients estimating the effect of total author number on the number of female co-authors across specified quantiles (τ = 0.25, 0.50, 0.75). Differences in the quantile regression coefficients indicate variation in the relationship between the total number of authors and the number of female co-authors across different parts of the distribution of the data.
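To illustrate what a quantile regression coefficient measures, the following minimal Python sketch fits a slope for each τ by minimizing the check (pinball) loss. It is not the rq call used in the analysis: rq also fits an intercept and solves the problem by linear programming, whereas this sketch assumes a line through the origin and uses a simple grid search. The data values and the function names pinball_loss and fit_quantile_slope are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r when r >= 0, (tau - 1) * r when r < 0."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def fit_quantile_slope(xs, ys, tau, grid):
    """Return the slope in `grid` that minimizes the summed check loss of y - b * x."""
    def total_loss(b):
        return sum(pinball_loss(y - b * x, tau) for x, y in zip(xs, ys))
    return min(grid, key=total_loss)


# Hypothetical toy data: one entry per paper (not from the study).
total_authors = [2, 10, 5, 5, 4, 6, 8]
female_coauthors = [0, 2, 2, 2, 2, 3, 6]

# Candidate slopes from 0.00 to 1.00 in steps of 0.01.
grid = [i / 100 for i in range(101)]

slopes = {
    tau: fit_quantile_slope(total_authors, female_coauthors, tau, grid)
    for tau in (0.25, 0.50, 0.75)
}
print(slopes)
```

On this toy dataset the fitted slope rises with τ, which is the kind of non-uniformity across quantiles that the analysis looks for; a slope near 0.5 would correspond to about one female co-author for every two authors.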
Analyses were performed in R, version 3.5.2. The R script is available on FigShare (Frances et al. 2020).

Results and Discussion
We analyzed gender and authorship patterns across 30 years in four high profile EEB journals and discuss our three main results below. Specifically, we found that: (1) total female authorship has increased through time; however, this varied based on the author's role (first, last, co-author etc.); (2) the gender of the last, but not the first, author is predictive of the proportion of female co-authors; and finally, (3) the genders of the first and last author are not predictive of the gender ratio of co-authors across time. While our results suggest that women are increasingly being represented in publications in EEB, the proportion of female authors remains well below the proportion of female PhD graduates in biology, including EEB in the United States; a problem given that advancement and retention within the academic job market is strongly dependent on publication numbers (Van Dijk et al. 2014).
Our dataset provides a description of patterns of female authorship across time in four EEB journals (Appendix S1: Fig. S2). The percentage of total female authorships (i.e., the sum of the number of authors on all publications) increased from 17% in 1987 to 30% in 2017 (Fig. 1). The percentage of female co-authors also increased over time, with 20% in 1987 compared to 31% of co-authors being female in 2017 (Fig. 3). Although there has been an increase in female authorship in EEB journals in the last 30 years, females are still underrepresented on publications relative to the estimated percentage of female EEB scientists. Data from the U.S. National Science Foundation show that since the early 2000s, women have made up over 50% of the PhDs awarded in the biological sciences (Fig. 1), and that EEB has a similar, or higher, proportion of PhDs awarded to women compared with other fields in biology (data available online). These results indicate that the underrepresentation of female researchers in publications in EEB has remained essentially static over the last few decades (Fig. 1). We also found that the proportion of women as first authors increased the most (22% increase), while the proportion of women as last authors increased the least (6% increase) across a 30-year period (Appendix S1: Fig. S2). Given that the last author position is frequently the principal investigator (PI) or group leader of the project (Tscharntke et al. 2014), this suggests women are increasingly acting as the primary researcher on papers published in these journals (first author position); however, the number of female PIs/group leaders (last author position) is not increasing at the same pace. In other words, the EEB field has done a better job of recruiting women into graduate work than of retaining them as senior academics (i.e., the "Leaky Pipeline" issue; Goulden et al. 2011).
Papers with a female last author had a significantly higher percentage of female co-authors compared to papers with male last authors (31% versus 25% of co-authors are female, P = 0.03; Appendix S1: Table S1). For papers with a female first author, the proportion was slightly but non-significantly higher than for papers with a male first author (Fig. 2; Appendix S1: Table S1; P = 0.32). Importantly, gender homophily was only evident among male researchers, whereas female first and last authors were more likely to publish with members of the other gender (males). Several recent studies found similar patterns of gender homophily in other ecology journals, computational biology journals (Bonham and Stefan 2017), and across a wide survey of publications indexed on PubMed in the last decade (Holman and Morandin 2019). We found no interaction between gender of the first and last author on the number of female co-authors (P = 0.95; Appendix S1: Table S1). This suggests that rather than distinct clustering of female and male researchers, female authors in the gate-keeper position (last author) appear to facilitate the inclusion of other female authors.
Our data do not allow us to identify the mechanisms driving our observed patterns of female co-authorship, but we discuss some potential contributing factors. Differences in laboratory composition between male and female PIs could explain some part of these differences. Male PIs tend to train and employ fewer women in life sciences, including grad students and post-docs (Sheltzer and Smith 2014). However, given that men outnumber women in tenured or tenure-track positions in most STEM fields (e.g., Martin 2011), many of the women completing PhDs are being trained in the labs of male PIs (Martinez et al. 2007). An alternative explanation is that the gender structure of research networks is different between male and female PIs. One study (Massen et al. 2017) found that male researchers are more likely to respond positively to requests for data, even with no promise of authorship, when the request was made by a male graduate student or post-doc. The authors of the study conclude that these results suggest the presence of "male-exclusive" research networks (Massen et al. 2017).
Another mechanism that future research should explore is whether criteria for co-authorship are uniform across researcher gender. There is a growing awareness that the contributions of women and underrepresented minorities to science may have often been hidden and undervalued (e.g., "Hidden Figures," Shetterly 2016). It is notoriously challenging to determine whether an absence is the result of an individual or group of people being truly absent, or simply not being recorded/accredited (e.g., Niamir et al. 2019). Additionally, unequal criteria on whether contributions from researchers should result in inclusion as a co-author could lead to differences in the proportion of male and female co-authors if these criteria are applied differently across genders (Dung et al. 2019). Other factors could also play a role in the patterns we found, and experimental studies that can identify mechanisms driving these patterns will provide important insights.
Fig. 2. The proportion of female co-authors as a function of the gender of either the first or last author on the paper. Orange and purple circles represent women and men, respectively. Papers with female last authors have a higher proportion of female co-authors than papers with male last authors. There was no significant interaction between first and last author gender on the number of female co-authors (P = 0.95). Note that female co-authors are presented as a proportion of the total authors on a paper here; however, we used the raw value (number of female co-authors) in the analysis while controlling for the total number of authors on a paper.
We also looked at how patterns of gender homophily in publishing have changed over time in EEB. The two-way interactions between gender of the first or last author and year were not significant (Female co-authors ~ Gender first × Year, P = 0.49; ~ Gender last × Year, P = 0.79). These results suggest that changes in the publishing practices of lead authors cannot explain the modest increase in female co-authors in EEB over time (Fig. 3a). Instead, we found a very strong relationship between the total number of authors and year (Fig. 3b; Appendix S1: Table S3). Additionally, our quantile regression analysis performed on the full dataset (including papers with greater than 10 authors) revealed that the relationship between total authors and female co-authors was non-uniform across quantiles of the observed distribution of female co-authors (Fig. 3c). Classical regression of the full dataset estimated the slope between the total number of authors and the number of female co-authors as 0.35, yet fails to recognize that this estimate decreases at lower quantiles, and increases at upper quantiles, of the observed distribution of female co-authors (Fig. 3c). In fact, this value approaches 0.50 (the expected frequency if the number of female co-authors equals their proportion in the research workforce) when upper quantiles of the data distribution are included. Collectively, these results indicate that large, multi-authored papers (which have only recently become common in the field, e.g., Fig. 3b) tend to have closer to the 50% female representation as co-authors (Fig. 3c), and may be contributing to the rise in the number of female co-authors over time.
A number of factors could be shaping this pattern, including that there may be a greater probability of female researchers being part of these research networks as they get larger. Alternatively, as research networks expand, they may become less insular and more diverse as a result. Unfortunately, gender data on large collaborative research projects are lacking, presumably because they were relatively uncommon in the past. However, large multi-author collaborations are becoming more common, and some studies have shown that publications arising from these groups are cited more frequently than traditional research articles (Wuchty et al. 2007). Regardless of the mechanism, more equal gender ratios on these larger collaborative projects could be a positive step towards improving the gender imbalance in science.

Limitations of our study
Our hope is that in being very clear regarding the limits to our study here, we can encourage all readers to recognize what we can and cannot conclude based on this research, and to pursue their own approaches to addressing this pattern. To begin, we want to be clear that our data do not provide mechanisms driving the patterns we observe. However, regardless of the interpretation of these patterns, our results reveal real issues of gender imbalance in publishing that are likely to impact the career progress of these individuals. Another limitation is that our sampling is limited to only four journals; however, studies that examined more journals still found similar patterns of underrepresentation amongst females as co-authors (e.g., Fox et al. 2018).
We also want to point out that we are looking at a single, inferred characteristic about a given researcher, gender. In some cases, our assessments of gender will not reflect the gender identity of the researcher. For the purposes of this work, assigning a single binary gender provides a useful, albeit crude, measure because it is more likely to broadly reflect the effects of perceived gender on inclusion as a co-author on a paper. Additionally, our assignment of gender is likely to reflect a broader societal perception of gender for a given researcher, and this perception is likely to have a strong role in how researchers are brought into research networks. There are many more characteristics likely to affect the development of research networks. For example, researcher characteristics such as ethnicity, sexual orientation, disability status, and other factors may have effects equal to or stronger than gender on the inclusion of researchers into research networks. These characteristics may also interact with gender, thus compounding the negative effects on inclusion within research networks (e.g., Clancy et al. 2017). However, while it is not ideal to lump researchers into a single box based on gender, we note that the simple gender categories we used seem to have predictive power about the composition of authors on publications.
Fig. 3. Significance comes from generalized linear models analyzing either (a) the number of female co-authors or (b) the total number of authors per publication over time. In our model, the increase in female co-authorship over time is strongly driven by the increase in total authors over time. (c) Quantile regression of the number of female co-authors and the total number of authors. While the overall relationship between the number of female co-authors and the total author number is strongly significant (P < 0.001), quantile regression revealed non-uniformity in this relationship across the distribution of our data. Upper quantiles (τ = 0.75) of the data yield a slope (β) close to 0.5, what one would expect if females represent 50% of the co-authors (represented as the red dashed line). Using lower quantiles (e.g., τ = 0.50 and τ = 0.25) yields reduced slopes, indicating that the number of female co-authors disproportionately decreases as the number of total authors decreases.

Going forward
We suggest several approaches that may improve the observed gender disparity on publications in EEB fields. Where we can, we include evidence for why these suggestions are likely to work, but as a field, we have more evidence that bias exists and disadvantages women (and other underrepresented groups) than we have evidence of implemented strategies to change this which have been found to be effective. In general, departments and universities can provide guidance on the development of institutional culture that values diversity among its researchers and provides funding for initiatives that facilitate a healthy research group culture, for example, funding seminar series that bring in outside researchers, and providing funds that faculty can apply for to bring potential collaborators to campus. Departments and universities can also facilitate effective mixed-gender mentoring and collaborative interactions by providing funding that ensures that fieldwork or other research trips are taken in an appropriate way which minimizes any concerns that can arise with mixed-gender research groups.
Another suggestion is that departments and universities can incentivize mentoring and publishing with junior researchers. This can include assigning additional value, when evaluating researcher dissemination records, to those publications that include students or post-docs as collaborators, not just as first authors who were conducting the research. Additionally, departments or universities can create mentoring awards based on numerous criteria, including the extent to which faculty involve students and post-docs in research collaborations, and even explicit consideration of how co-authorship is determined. However, the majority of these changes will have to come from individual leaders of research groups. Authors in key positions, particularly PIs in the last author position, can have large effects on who is included as a co-author. Therefore, people in leadership positions, regardless of gender, should aim to make fair and transparent decisions (Grogan 2019). This is an obligation based on fairness, but may also represent a matter of self-interest as evidence suggests that mixed-gendered papers are cited more frequently than those with a single-gender author list (Campbell et al. 2013).
The criteria required for co-authorship vary greatly across disciplines (Patience et al. 2019), but can also vary among projects and labs within a discipline (e.g., Logan et al. 2017). This discrepancy provides an opportunity for rules of co-authorship to be applied unequally between male and female contributors on a research project. Using standard criteria for authorship, making these criteria transparent from the beginning of a research project, and providing junior researchers with opportunities to earn co-authorship based on these criteria, may help equalize the valuation of all contributing researchers.
Given the central role that publications play in a research career, particularly an academic one (Van Dijk et al. 2014), any factors that decrease the representation of women on publications are likely to have large impacts on women's careers, and more broadly, on their representation in the field. Rather than suggesting overly prescriptive strategies, we encourage PIs, academic institutions, journals, and granting agencies to foster a culture in which authorship transparency is the norm and diversity is valued. We also encourage continued research on this topic that explores whether the advice given here has the desired effect on improving gender inequities in publishing in EEB.