Revising the global biogeography of annual and perennial plants

There are two main life cycles in plants: annual and perennial 1,2. These life cycles are associated with different traits that determine ecosystem function 3,4. Although life cycles are textbook examples of plant adaptation to different environments, we lack comprehensive knowledge regarding their global distributional patterns. Here we assembled an extensive database of plant life cycle assignments of 235,000 plant species coupled with millions of georeferenced datapoints to map the worldwide biogeography of these plant species. We found that annual plants are half as common as initially thought 5–8, accounting for only 6% of plant species. Our analyses indicate that annuals are favoured in hot and dry regions. However, a more accurate model shows that the prevalence of annual species is driven by temperature and precipitation in the driest quarter (rather than yearly means), explaining, for example, why some Mediterranean systems have more annuals than desert systems. Furthermore, this pattern remains consistent among different families, indicating convergent evolution. Finally, we demonstrate that increasing climate variability and anthropogenic disturbance increase annual favourability. Considering future climate change, we predict an increase in annual prevalence for 69% of the world’s ecoregions by 2060. Overall, our analyses raise concerns for ecosystem services provided by perennial plants, as ongoing changes are leading to a higher proportion of annual plants globally.


Introduction
At the coarsest scale, terrestrial plants can be categorized into two main types of life cycles, annual and perennial 1,2. Although crude, this categorization represents the most fundamental characteristic of plant species and illustrates the inherent trade-offs between reproduction, survival, and seedling success 1,9. Annual plants reproduce once and complete their life cycle within one growing season, while perennial plants live for many years and, in most cases, reproduce multiple times. The evolutionary trade-offs reflected in these strategies manifest in numerous functional attributes, such as leaf 10 and root traits 11, invasiveness 12,13, genome characteristics 14, and community stability 15, and therefore have many consequences for ecosystem functioning and services 3,4. For example, by allocating more resources belowground, perennials reduce erosion, store organic carbon, and have higher nutrient- and water-use efficiencies 4,16,17,18.
The differences between annual and perennial plants are noticeably reflected in agricultural settings. Despite being a minor part of global biomass 19 , annual species are the primary food source of humankind, probably because they allocate more resources to seed output, thereby enhancing agricultural productivity. During the Anthropocene, the global cover of annuals dramatically increased because natural systems, often dominated by perennials, were converted into annual cropland 20, 21 . Annual plants cover ~70% of the croplands and provide ~80% of worldwide food consumption 22 . Moreover, the proportion of annuals increases in many systems because woody perennials have a higher extinction rate 23 , while invasive plant species tend to be annuals 12 .
The annual life cycle has repeatedly evolved in at least 120 different families, suggesting that it provides a fitness advantage under certain conditions 24 . According to life-history theory, the optimal life cycle is determined by the ratio of seedlings (or seeds) survival to adult survival 25,26 . The reproductive mode of perennials requires multiple growing seasons 1 compared to annuals which require only one growing season. Therefore, any external condition that decreases the ability of plants to survive between growing seasons necessarily reduces the reproductive fitness of perennial species 25,26 . However, because annual species could survive such conditions as seeds rather than adults, their reproductive fitness may not be impacted 1 . Thus, any condition that skews the survivorship ratio in favor of seeds should increase the favorability of annuals.
Consequently, annuals should be favored when adult mortality is high and seed persistence and seedling survival are relatively high.
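The survivorship argument above can be illustrated with a classic textbook comparison of per-year growth rates. This is a minimal sketch under stated assumptions, not necessarily the exact model of refs 25,26: an annual producing `seeds_annual` seeds grows at rate C × B_a, and a perennial producing `seeds_perennial` seeds grows at rate C × B_p + P, where C is seed-to-adult survival and P is adult survival between seasons. All function names and numeric values are hypothetical.

```python
def annual_growth_rate(seeds_annual, juvenile_survival):
    # Annuals persist only through seeds that survive to reproduce.
    return juvenile_survival * seeds_annual


def perennial_growth_rate(seeds_perennial, juvenile_survival, adult_survival):
    # Perennials gain surviving offspring plus the surviving adult itself.
    return juvenile_survival * seeds_perennial + adult_survival


def annual_favoured(seeds_annual, seeds_perennial, juvenile_survival, adult_survival):
    # Equivalent to: seeds_annual - seeds_perennial > adult_survival / juvenile_survival
    return annual_growth_rate(seeds_annual, juvenile_survival) > perennial_growth_rate(
        seeds_perennial, juvenile_survival, adult_survival
    )


# Hypothetical values: same seed output and juvenile survival, different adult survival.
benign = annual_favoured(12, 10, juvenile_survival=0.1, adult_survival=0.8)
harsh = annual_favoured(12, 10, juvenile_survival=0.1, adult_survival=0.1)
```

In this toy comparison, lowering adult survival alone is enough to tip the balance toward the annual strategy, which is the qualitative prediction tested in the remainder of the paper.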
Numerous studies have discussed plant life cycles as primary examples of adaptation to different climatic conditions and provide estimates for their prevalence in various regions 5,6,7,8 .
However, the data provided in many of these studies, which penetrated many current ecological textbooks 7,8 , are problematic in several aspects. First, the current estimate for the global proportion of annual species (13%) is based on a century-old sample of merely 400 species 2 , representing 0.1% of accepted plant species 27 . Second, current biome level estimates are based on a single location and extrapolated to represent the entire biome. For example, the desert biome is assumed to contain 42% annual species 6,7 but is based only on data from Death Valley in California 2 . Third, estimates are inconsistent and difficult to compare due to ambiguous biome definitions. For example, an alternative estimate for the desert biome suggests that 73% of plant species are annuals 5,8 . Lastly, each biome incorporates a wide range of conditions, e.g., the mean temperature in the desert biome ranges from 30° to -10°C, corresponding to hot and cold deserts.
Thus, this definition aggregates regions that differ markedly in their environmental conditions, likely affecting the prevalence of the different life cycles.
As central as life cycles are to plant ecology and evolutionary research, it is remarkable that we still have no precise estimate for the worldwide prevalence of life cycles and their environmental drivers. Yet, such an assessment is essential in times of climate and land-use changes 20, 28 , which are expected to dramatically alter patterns of plant biogeography with many consequences for ecosystem processes and services 29, 30, 31, 32 . Here, we present a comprehensive assemblage of plant life cycle data encompassing over 235,000 plant species. We cross this database with millions of georeferenced data points to produce the first worldwide map of plant life-cycle distribution. This extensive plant growth-form database contains life cycle data for 67% of all vascular plant species and georeferenced data for 51%. These data allow us to evaluate the underlying drivers of plant life cycle strategies that were never tested on a global scale. We tested three key hypotheses, predicting that annuals are favored under: (1) increasing temperature and decreasing precipitation 24, 33, 34, 35 , (2) high year-to-year variability in climatic conditions 35, 36, 37 , and (3) increasing human footprint (anthropogenic disturbance 36,38,39,40 ). All these hypotheses are based on the life-history theory that predicts annual species to be favored with increasing adult mortality (relative to seedling mortality) 25,26 . In other words, the relative abundance of annuals will be higher in regions with hot-dry climates, high interannual variability, and disturbance because they decrease adult survival. Finally, with a more accurate understanding of the global drivers, we provide an initial assessment regarding the impact of future conditions on plant life cycle distribution.

Results and Discussion
This compilation of life cycles revealed dramatic differences in the relative prevalence of annual and perennial plant species compared to existing estimates. Annual species comprise 6% of all species and 15% of herbaceous species (i.e., omitting all woody species, which are all perennial). Moreover, only 5.5% of ecoregions exhibit an annual herbs proportion of 50% or more (Fig. 1).
Below, we focus on the proportion of annual species among herbaceous species (rather than among all species), which provides a better 'resolution' in regions with a high proportion of woody species. Nonetheless, similar results were obtained when we analyzed the proportion of annuals among all species (Supplement Note 2 & Extended Data Fig. 1).
The variation in the annual-herb frequencies across biomes supports the first hypothesis that annuals are favored with increasing temperature and lower precipitation (Fig. 2). Still, the differences among biomes (following Whittaker's 41 approach) were not substantial (Fig 2A, Table 1). The proportion of annual herbs ranges from 13% to 25%, suggesting that the role of climate is underestimated at this coarse spatial scale. The large variability within each biome is revealed when examining the proportion of annuals at the ecoregion resolution (Fig. 2B). For example, not all desert-biome ecoregions have a high proportion of annual herb species, with cooler deserts exhibiting much lower proportions than hot ones. The same trend is repeated among other biomes, as ecoregions with lower precipitation and higher temperatures (i.e., located in the lower-left coordinate of their biome in Fig. 2B) possess a greater proportion of annuals. This pattern was corroborated using a linear regression model that fitted the proportion of annuals as a function of mean yearly temperature and total yearly precipitation. These two climatic variables accounted for nearly half of the variance of the worldwide distribution of plant life cycle strategies (P < 10⁻¹⁵, D.F. = 679, R² = 0.48). As mean yearly temperature increases and total precipitation decreases, the proportion of annuals increases (Fig. 2C). These results are robust to spatial autocorrelation, with only negligible differences in the parameter estimates and correspondingly low p-values (Supplement Note 3), and to alternative statistical methods such as Poisson regression (Supplement Note 4, Extended Data Fig 2).
Although yearly temperature and precipitation provide a good description of annual herb proportions across the globe, they do not account for temporal variation in climate throughout the year. We, therefore, fitted a suite of two-variable regression models. Each model consisted of one quarterly temperature variable and one quarterly precipitation variable. The best-fit model (hereafter the quarterly model) incorporated the mean temperature of the warmest quarter and the log-transformed precipitation of the warmest quarter and accounted for 55% of the observed variance (P < 10⁻¹⁵, D.F. = 679). According to this model, annual herbs proportion increases with increasing temperature and decreasing precipitation of the warmest quarter (Fig. 3). Furthermore, this model had a substantially better fit than the model based on the mean yearly temperature and total yearly precipitation, outperforming it in terms of explained variance (0.55 vs. 0.48) and information theory criteria (ΔAICc = 92.4). These results provide a more nuanced understanding of the first hypothesis, demonstrating that hot and dry conditions impact the prevalence of annuals, particularly in the driest season.
The quarterly model can distinguish between ecoregions with similar yearly climate patterns yet a different proportion of annuals. For example, the Eastern Mediterranean (e.g., Tel Aviv) and Chihuahuan desert (southwestern United States and northern Mexico) ecoregions have identical mean yearly temperatures (17.6° C) and relatively similar amounts of yearly precipitation (527mm and 330mm), yet maintain different annual herb proportions (51% and 36%). Given their similar mean yearly climate, the yearly temperature and precipitation model predicts similar annual herb proportions for these two ecoregions (32% and 34%, respectively).
However, the Tel Aviv ecoregion receives substantially less precipitation (6mm) than the Chihuahuan desert (157mm) during the hottest quarter. As such, the quarterly model differentiates the two ecoregions, producing substantially better predictions (annual herbs proportion of 52% in Tel Aviv and 29% in the Chihuahuan desert). Consequently, the coinciding of high-temperature and low-precipitation periods increases the favorability of annuals more than simply yearly means.
We conducted two analyses to account for potential biases of the revealed trends due to phylogenetic dependence. First, using the quarterly model, we conducted a separate analysis for the four most annual-rich families (Asteraceae, Brassicaceae, Fabaceae, and Poaceae).
Qualitatively similar relationships between climate and annual proportion were found in all families (Fig 4), providing evidence for convergent evolution of annual life cycles in hot and dry conditions. Next, we tested the life cycle and climate relationship using phylogenetic Generalized Least Squares (pGLS). We found that the median temperature of the warmest quarter for annuals is 3°C higher, and the median precipitation of the warmest quarter is 35% lower (Supplement Note 5). These results support the hypothesis that climate conditions during the driest period play a significant role in driving the prevalence of annuals.
We tested the second hypothesis that increased year-to-year climatic variability favors annuals prevalence by focusing on interannual variability in total precipitation (in terms of the coefficient of variation) and mean temperature (in terms of standard deviation). Using a bivariate regression, we found that increasing precipitation variability is associated with a higher proportion of annual species (P < 10⁻¹⁵, D.F. = 679, R² = 0.24). Likewise, we found that increasing temperature variability also increases the favorability of annuals, though its effect is much weaker (P = 0.0003, D.F. = 679, R² = 0.02). Furthermore, incorporating precipitation and temperature inter-annual variability into the quarterly model improved model fit (from R² = 0.55 to R² = 0.61) and overall performance (ΔAICc = 51).
We examined our third hypothesis that increased human footprint (anthropogenic

Finally, we built a back-of-the-envelope projection of the expected prevalence of annuals in 2060 based on predicted changes in mean temperature and precipitation at the warmest quarter 42 (Extended Data Fig 3). Under the simplifying assumptions that the prevalence of annuals in the future will follow the same climatic patterns without adaptation or time-lag, our model suggests that ~69% of ecoregions will experience an increase in the proportion of annuals.

Conclusions
This study provides an extensive update to the worldwide biogeography of plant life cycles and demonstrates major differences from previous estimates. At the global level, our analyses indicate that annual species are half as common as previously thought 5,6,7,8. Similarly, our biome-level estimates differ from earlier estimates, in some cases by as much as 3-5 fold (Table 1 and Extended Data Table 1). Additionally, these revised estimates display a more limited difference between the biomes with the highest and lowest annual proportions, reducing the difference from 60% to a more restricted 12%.
Overall, our analyses provide general support for our three hypotheses regarding the conditions under which annual proportions will increase. First, we find that the proportion of annuals increases under hotter and drier conditions, and this result is robust to spatial autocorrelation and phylogenetic relatedness. However, yearly means provide an insufficient explanation for some observed patterns. After exploring alternative climate patterns, we determined that a long dry summer is a principal factor governing the occurrence of annual-rich regions, demonstrating that the temporal distribution of hot and dry periods is more important than having an arid climate per se.
Second, our results suggest that annuals are more prevalent under increasing climate unpredictability. As interannual precipitation and interannual temperature variability increase, the proportion of annuals also increases. However, the correlation between temperature variability and annual proportion is weaker than that for precipitation variability, indicating that irregular precipitation patterns have a larger impact.
Third, our findings demonstrate that as human-mediated disturbance increases, the favorability of annual plants also increases. Furthermore, we found that a substantial portion of the effect of human disturbance is independent of climatic patterns. Although there is extensive evidence that human disturbance enhances the abundance of annuals in local communities 38, our study is the first, to our knowledge, to provide evidence that human disturbance favors annuals on a biogeographical scale.
Finally, our future projection model predicts that by 2060, we will experience an increase in the prevalence of annuals. However, we caution that our back-of-the-envelope prediction is based on the simplistic assumption that biogeographic patterns instantly track climate changes (i.e., it does not account for time lags in species response to a changing climate). Still, our prediction is also conservative in the sense that it does not account for the predicted increase in year-to-year climatic variability 42 as well as human footprint. With the human population

Table 2. Note, the biome nomenclature used for the previous estimates differs from ours and so the location of the original study was used to determine the corresponding biome. Additional information can be found in Extended Data Table 3. (Columns: Revised Estimate, Previous Estimate.)

Matching life cycle data with species observations
Species observation data were based on occurrence data from the Global Biodiversity Information Facility 54 (GBIF). All observation data points within the Plantae kingdom (~355 million) were downloaded (September 14th, 2021) and processed locally. We filtered unreliable data points following the recommendations provided in the vignette of the R package CoordinateCleaner v2.0-18 55, using the following steps:
1) Data points without coordinates were excluded.
2) The R package CoordinateCleaner v2.0-18 55 was used to discard data points with wrong locations and problematic temporal metadata (see Supplement Note 8 for a full description of this process).
3) Data points were removed if the recorded 'coordinate uncertainty' was greater than 100km.
4) Data points whose 'Basis of records' was literature or living specimen were discarded (these generally refer to the location of museum or herbaria collections).
5) Data points whose record date was during or before 1945 were excluded as it has been suggested these may be less reliable.
6) Data points that were not labeled as species were excluded.
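As an illustration only, filters 1 and 3-6 above could be expressed as a single predicate. The sketch below is in Python rather than the R workflow used in this study, omits step 2 (CoordinateCleaner), and assumes Darwin Core style field names and values as they typically appear in GBIF downloads; how records with a missing uncertainty or year were handled is not stated in the text, so keeping them is an assumption of this sketch.

```python
MAX_UNCERTAINTY_M = 100_000  # step 3: 100 km, expressed in metres
EXCLUDED_BASIS = {"LITERATURE", "LIVING_SPECIMEN"}  # step 4 (assumed GBIF codes)
LAST_EXCLUDED_YEAR = 1945  # step 5: records from 1945 or earlier are dropped


def keep_record(rec):
    """Return True if an occurrence record passes filters 1 and 3-6."""
    # Step 1: both coordinates must be present.
    if rec.get("decimalLatitude") is None or rec.get("decimalLongitude") is None:
        return False
    # Step 3: drop records with a stated uncertainty above 100 km.
    # Assumption: records with no stated uncertainty are kept.
    uncertainty = rec.get("coordinateUncertaintyInMeters")
    if uncertainty is not None and uncertainty > MAX_UNCERTAINTY_M:
        return False
    # Step 4: drop literature and living-specimen records.
    if rec.get("basisOfRecord") in EXCLUDED_BASIS:
        return False
    # Step 5: drop old records. Assumption: undated records are kept.
    year = rec.get("year")
    if year is not None and year <= LAST_EXCLUDED_YEAR:
        return False
    # Step 6: keep only records identified to species rank.
    return rec.get("taxonRank") == "SPECIES"


# Hypothetical records: one valid record and five variants that each fail one filter.
good = {
    "decimalLatitude": 32.1,
    "decimalLongitude": 34.8,
    "coordinateUncertaintyInMeters": 500,
    "basisOfRecord": "HUMAN_OBSERVATION",
    "year": 2015,
    "taxonRank": "SPECIES",
}
records = [
    good,
    {**good, "decimalLatitude": None},
    {**good, "coordinateUncertaintyInMeters": 250_000},
    {**good, "basisOfRecord": "LIVING_SPECIMEN"},
    {**good, "year": 1945},
    {**good, "taxonRank": "GENUS"},
]
cleaned = [rec for rec in records if keep_record(rec)]
```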
Once cleaned, all remaining unique names were resolved using the WFO package 53 , and the same criteria as in the life form database were applied. Once the names were resolved, the species in the cleaned GBIF database and the assembled life form database were matched. Of the 235,979 species in our assembled lifeform database, 182,848 species were found within the cleaned GBIF data.
To mitigate sampling bias and inexact coordinates, species observation data were mapped into larger geographical regions defined by specific environmental and ecological conditions 14 .
To this end, each georeferenced data point was assigned to one of 827 ecoregions as defined by the World Wildlife Fund 56 (WWF). This process was accomplished using the R packages raster v3.4-13 57 and rgdal v1.5-27 58 . Following the procedures used by 14 , species were only considered "present" in a geographic region if there were five or more observations to ensure the species had a sufficient established population. Similarly, to ensure all regions contained sufficient data for analysis, each region was only considered if ten or more species were present. This procedure produced sufficient data for 723 ecoregions when examining annual species among all species and 682 ecoregions for annual species among only herbaceous species.
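The two thresholds described above (five or more observations for a species to count as present, ten or more present species for an ecoregion to be analyzed) can be sketched as follows. This is a simplified Python illustration with hypothetical names and toy data, not the code used in this study; it assumes that each observation has already been assigned to an ecoregion.

```python
from collections import Counter, defaultdict

MIN_OBSERVATIONS = 5  # observations needed for a species to count as present
MIN_SPECIES = 10      # present species needed for a region to be analyzed


def annual_proportions(observations, life_cycle):
    """Proportion of annual species per region.

    observations: iterable of (region, species) pairs, one per data point.
    life_cycle: dict mapping species name to "annual" or "perennial".
    """
    counts = Counter(observations)
    present = defaultdict(set)
    for (region, species), n_obs in counts.items():
        # Species without a life cycle assignment are ignored.
        if n_obs >= MIN_OBSERVATIONS and species in life_cycle:
            present[region].add(species)

    proportions = {}
    for region, species_set in present.items():
        if len(species_set) >= MIN_SPECIES:
            n_annual = sum(1 for s in species_set if life_cycle[s] == "annual")
            proportions[region] = n_annual / len(species_set)
    return proportions


# Toy data: region "A" has 10 present species (2 annual) plus one species with
# only 4 observations; region "B" has only 3 present species and is dropped.
life_cycle = {f"sp{i}": ("annual" if i < 2 else "perennial") for i in range(11)}
observations = [("A", f"sp{i}") for i in range(10) for _ in range(5)]
observations += [("A", "sp10")] * 4
observations += [("B", f"sp{i}") for i in range(3) for _ in range(5)]
props = annual_proportions(observations, life_cycle)
```

Restricting `life_cycle` to herbaceous species before calling the function would give the proportion of annuals among herbs rather than among all species.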
We additionally analyzed the data based on a grid system (using 100km × 100km cells) and found similar results to our main ecoregion-based analyses. Further details of these analyses are provided in Supplement Note 9 and Extended Data Fig 4.

Predictors of annual proportion
We downloaded bioclimate features from the WorldClim Global Climate Data 59, which were developed from climate data during 1970-2000, at ten arc-minutes resolution. All 19 BIOCLIM variables representing each region's major temperature and precipitation characteristics were extracted using the R package raster v3.4-13 57.
As a measure of climate unpredictability, we measured interannual precipitation variation (IPV) and interannual temperature variation (ITV). The IPV metric was obtained by extracting the coefficient of variation from ecoregion precipitation. The ITV metric was obtained by extracting the standard deviation from the ecoregion temperature using the R package raster v3.4-13 57 . We aggregated all available monthly precipitation/temperature data layers from the  57 .
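For clarity, the two variability metrics can be written out explicitly. The Python sketch below assumes that the monthly layers have already been aggregated into one value per year for a given ecoregion, and it uses the sample standard deviation; whether the sample or population form was used is not stated here, so that choice is an assumption. Function names and values are hypothetical.

```python
from statistics import mean, stdev


def interannual_precip_variation(yearly_totals):
    """IPV: coefficient of variation (SD / mean) of yearly precipitation totals."""
    return stdev(yearly_totals) / mean(yearly_totals)


def interannual_temp_variation(yearly_means):
    """ITV: standard deviation of yearly mean temperatures."""
    return stdev(yearly_means)


# Hypothetical ecoregion series (mm per year and degrees C).
ipv = interannual_precip_variation([100, 200, 300])
itv = interannual_temp_variation([10, 12, 14])
```

The coefficient of variation is scale-free, which makes precipitation variability comparable between wet and dry ecoregions, whereas temperature variability is kept in degrees.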

Biome Estimates
To obtain biome estimates of annual and annual herb frequencies, all ecoregions with sufficient data were individually plotted in the space of total yearly precipitation and mean yearly temperature, with the outline of Whittaker's biomes (adapted from 41) overlaid (see Fig 2B for reference). We determined each ecoregion's biome based on its location within this space.
For those ecoregions whose biome designation was difficult to assess, their points were enlarged until one biome had a plurality of the circle's area. For those ecoregions outside Whittaker's biome space, their biome designation was determined by the closest biome. Once the biome designation of all ecoregions was determined, the species presence data for all ecoregions within a given biome were aggregated. The same process used to determine the presence and absence of species in an ecoregion was used to determine the presence and absence of species in the biome.
Of note, a biome could have more species than the combined ecoregions within said biome because some species may have five or more observations within the biome, but not within any of the individual ecoregions.
Whittaker's defined biomes were chosen to simplify comparisons to previous estimates (the terminology used in textbooks best matched those of Whittaker's definitions) and for its simplicity of using only temperature and precipitation.

Comparing Previous Biome Estimates
The classification approach for biomes used in previous estimates of annual proportions was not explicitly defined, making a direct comparison with our set of biomes difficult.
However, we traced the origins of each estimate and determined the original study's locations.
These locations were then matched with the WWF ecoregions, and the corresponding biome was determined as discussed above. This procedure allowed a direct comparison between previous estimates and our revised estimates. Additionally, previous studies did not explicitly provide estimates for the proportion of annuals among herbaceous species. Therefore, for comparison purposes, previous annual herbs proportion estimates were calculated based on the biome-level life form classification estimates from each study. See Extended Data Tables 1-4 for original study location matchings and annual herbs proportions calculations.

Statistical analyses

Temperature and Precipitation
To assess support for our first hypothesis, we linearly regressed annual and annual herb frequency against mean yearly temperature, total yearly precipitation, and their interaction.
Subsequently, we compared models based on two climatic variables, using one quarterly temperature bioclimatic variable and one quarterly precipitation variable, to identify a potentially better model. We used four temperature bioclimatic features (BIO8, BIO9, BIO10, BIO11) and four precipitation features (BIO16, BIO17, BIO18, BIO19). Month-specific bioclimatic features were omitted because they are highly correlated with quarter-specific features. Preliminary analysis suggested that log transformations of precipitation bioclimatic features often increased explanatory power, and therefore they were also included in the exhaustive search.
Altogether, this grouping scheme produced 32 different 2-feature linear regression models (four temperature and eight precipitation features) with an additional two linear regression models using mean yearly temperature and total yearly precipitation and the log transformation of total yearly precipitation. Model comparison was achieved using AIC values obtained from the R package MuMIn v1.43.17 61. The best model was identified (hereafter the quarterly model) and then further applied to the four most annual-rich families (Asteraceae, Brassicaceae, Fabaceae, and Poaceae).
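The exhaustive search over temperature and precipitation features can be sketched as below. This is a self-contained Python illustration, not the MuMIn-based R analysis used here: it fits each two-feature model by ordinary least squares and ranks models with the Gaussian AIC up to an additive constant, n ln(RSS/n) + 2k, which is sufficient for comparing models fitted to the same data. The toy example uses two temperature and two precipitation features (eight models once log-transformed precipitation is added); with four of each, the same loop yields the 32 models described above. All data values are synthetic.

```python
import math
from itertools import product


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_aic(y, x1, x2):
    """AIC (up to a constant) of the model y ~ intercept + x1 + x2."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    n, k = len(y), 4  # three coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k


def best_model(y, temp_features, precip_features):
    """Score every temperature x (raw or log10) precipitation pair; lowest AIC wins."""
    candidates = dict(precip_features)
    for name, values in precip_features.items():
        candidates["log_" + name] = [math.log10(v) for v in values]
    scores = {
        (t, p): ols_aic(y, temp_features[t], candidates[p])
        for t, p in product(temp_features, candidates)
    }
    return min(scores, key=scores.get), scores


# Synthetic ecoregions: the response is built from BIO10 and log10(BIO18).
temps = {"BIO10": [10, 15, 20, 25, 30, 12, 18, 28], "BIO11": [-5, 3, 1, 12, 8, 0, 10, 5]}
precips = {"BIO17": [30, 20, 200, 90, 15, 400, 60, 80],
           "BIO18": [10, 100, 1000, 10, 100, 1000, 50, 500]}
noise = [0.001, -0.001] * 4
y = [0.1 + 0.02 * t - 0.1 * math.log10(p) + e
     for t, p, e in zip(temps["BIO10"], precips["BIO18"], noise)]
best, scores = best_model(y, temps, precips)
```

Because every candidate model has the same number of parameters, ranking by AIC here is equivalent to ranking by residual sum of squares; the criterion matters more when models of different size, such as those adding IPV and ITV, are compared.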

Climate Uncertainty
To assess support for our second hypothesis, we investigated the role of IPV and ITV (proxies of climate uncertainty) on annual and annual herb frequencies. We began by testing each variable individually using linear regression and then tested whether their inclusion increased the fit of the quarterly model. Finally, we assessed the increased fit when both IPV and ITV were included in the quarterly model.

Anthropogenic Disturbance
To assess support for our third hypothesis, we measured the impact of the human footprint on annual and annual herb frequencies. We linearly regressed annual and annual herb frequencies against human footprint and then tested the change in model fit after including human footprint in the quarterly model with IPV and ITV.

Phylogenetic Biases
We applied a phylogenetic generalized least squares regression (pGLS) analysis to account for phylogenetic dependence in the observed patterns. To this end, we devised a continuous response variable for each species by taking the median of the mean temperature of the warmest quarter and precipitation of the warmest quarter of all their GBIF observations. The explanatory variable was a numeric conversion of each species' life cycle; 1 for annual and 0 for perennial.
The species were matched with those in the GBMB seed plant mega-phylogeny constructed in 62 . The same WFO name resolution process was used on the species in the phylogeny to ensure the same naming scheme. Once matched, we selected the matching herbaceous species resulting in 20,819 species.
The results of the pGLS analysis were compared to the same model without the phylogenetic component (i.e. standard linear regression) to assess the change in coefficient estimates and determine the overall impact of phylogenetic relatedness on our results (see Supplement Note 5).

GBIF Biases
To examine the biases in our dataset with regard to GBIF observational data, we linearly regressed annual and annual herb proportions against the log10 transformed total number of GBIF observations in an ecoregion. Similarly, we conducted a linear regression to assess the relationship between annual and annual herb proportion and the total number of present (5+ observations) GBIF species. Finally, we examined the species in GBIF, but missing from our dataset (see Supplement Note 10 and Extended Data Fig 5).

Spatial Autocorrelation
Following the methods described in 63, spatial eigenvectors for our data were obtained using the R package adespatial v0.3.20 64. We selected the first set of eigenvectors (using those with both positive and negative eigenvalues) that accounted for at least 80% of the variance (39 eigenvectors) and incorporated them into the Yearly-Climate and Quarterly-Climate models.
These results were compared to the same models without the eigenvectors included (see Supplement Note 3).

Alternative Regression Models
To ensure that our results are robust to various regression methods, we applied two alternative regression methods. First, we applied a logit transformation to the proportion of annual herbs in each ecoregion followed by linear regression. Second, we used a generalized linear model (Poisson distribution with an offset to represent proportion data) (see Supplement Note 4 and Extended Data Fig 2).
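The first alternative, a logit transformation of the response, can be written out as a short sketch. The Python function below assumes the constant 0.01 described in Supplement Note 4 is added before transforming; the function name is hypothetical, and the explicit error for proportions of 0.99 or more is an addition of this sketch, since the shifted value would otherwise fall outside the domain of the logit.

```python
import math


def logit_annual_proportion(proportion, shift=0.01):
    """Logit of (proportion + shift); the shift avoids log(0) for regions with no annuals."""
    shifted = proportion + shift
    if not 0.0 < shifted < 1.0:
        # Sketch-level guard: a proportion of 0.99 or more cannot be transformed.
        raise ValueError("shifted proportion must lie strictly between 0 and 1")
    return math.log(shifted / (1.0 - shifted))


# Hypothetical ecoregions with 0%, 15% and 49% annual herbs.
transformed = [logit_annual_proportion(p) for p in (0.0, 0.15, 0.49)]
```

The transformed values are unbounded, so a linear regression fitted to them cannot predict proportions below 0 or above 1 after back-transformation.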

Future Projection
To obtain future projections of the proportion of annual herbs in each ecoregion, future climate estimates for the year 2060 were downloaded from the WorldClim Global Climate Data 59 using the 2041-2060 period, the UKESM1-0-LL 65 model, and the ssp585 scenario, at ten arc-minutes resolution. The median values for each bioclimatic variable were extracted for each ecoregion using the R package raster v3.4-13 57. We then applied the coefficients of the Quarterly Model (a linear regression on the two most influential climatic parameters found in our study, the mean temperature and precipitation of the warmest quarter) to the predicted median values in each ecoregion in 2060, producing estimates for the proportion of annual herbs in each ecoregion with sufficient data.
Year-to-year climate variability and human footprint were not incorporated due to data unavailability at the required resolution and scale. The projected annual herbs proportion in each ecoregion was compared to its current estimate to determine the predicted change in proportion.
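The projection step amounts to applying fixed regression coefficients to future climate values and comparing the result with the current estimate. The Python sketch below illustrates this with made-up coefficients and ecoregions; none of the numbers are taken from our fitted model. Clamping predictions to the interval 0 to 1 is an assumption of the sketch, and precipitation must be positive for the log10 transformation.

```python
import math

# Hypothetical coefficients of a quarterly-style model (not the fitted values).
COEF = {"intercept": 0.2, "temp": 0.01, "log_precip": -0.1}


def predict_annual_herb_proportion(temp_warmest_q, precip_warmest_q, coef=COEF):
    """Linear prediction from warmest-quarter temperature and log10 precipitation."""
    value = (coef["intercept"]
             + coef["temp"] * temp_warmest_q
             + coef["log_precip"] * math.log10(precip_warmest_q))
    # Assumption of this sketch: keep predictions within the valid range.
    return min(1.0, max(0.0, value))


# Hypothetical ecoregions: current estimate and projected 2060 climate
# (warmest-quarter temperature in degrees C, precipitation in mm).
ecoregions = {
    "X": {"current_estimate": 0.35, "future_climate": (28.0, 8.0)},
    "Y": {"current_estimate": 0.12, "future_climate": (16.0, 400.0)},
}

changes = {
    name: predict_annual_herb_proportion(*data["future_climate"]) - data["current_estimate"]
    for name, data in ecoregions.items()
}
share_increasing = sum(1 for delta in changes.values() if delta > 0) / len(changes)
```

The share of ecoregions with a positive change is the quantity summarized in the main text; in this toy case one of the two ecoregions becomes hotter and drier and gains annuals, while the other becomes wetter and loses them.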

Author Contributions
Conceptualization

Competing Interests
Authors declare no competing interests.

Additional Information
Supplementary Information is available for this paper.

Peer review information
Reprints and permissions information is available at www.nature.com/reprints.

Supplement Note 1: Annuals among Herbaceous Species - Results
Here, we present the full results of the analyses presented in the main text for annuals among herbaceous species.

Supplement Note 2: Annuals among All Species - Results
Here, we investigated the proportion of annuals among all species rather than among herbaceous plants (as we did in the main text) (Extended Data Fig 1). Similar to the results presented in the main text, we found that ecoregions with lower precipitation and hotter temperatures (i.e., located in the lower-left coordinate of their biome in Extended Data Fig 1D) possess higher proportions of annuals.
This pattern was corroborated using a linear regression model fitting the annual proportion as a function of mean yearly temperature and total yearly precipitation (P < 1.0 × 10⁻¹⁵, D.F. = 719, R² = 0.34) (Extended Data Fig 1E). As in the analysis presented in the main text, using bioclimatic features that account for temporal variation in climate throughout the year produced a better-fitting model. The regression model that incorporated the mean temperature of the warmest quarter and the log-transformed precipitation of the warmest quarter (Extended Data Fig 1F & G) accounted for 47% of the observed variance (P < 1.0 × 10⁻¹⁵, D.F. = 719) and outperformed the model based on yearly means in terms of information criteria (ΔAICc = 160).
Qualitatively similar relationships between quarterly climate and annual proportions were found in all four of the most annual-rich families (Asteraceae, Brassicaceae, Fabaceae, and Poaceae). The explained variation of annual herb proportions ranged from 49% in Brassicaceae to 18% in Fabaceae (in all models P < 10⁻¹⁵).

Supplement Note 3: Spatial Autocorrelation - Results
Models accounting for spatial autocorrelation incorporated 39 spatial eigenvectors (80.05% of variance explained). The results below are for the proportion of annuals among herbaceous species.
The Yearly Model refers to the model with BioClim1, mean yearly temperature and BioClim12, total yearly precipitation. The Quarterly Model refers to the model with BioClim10, mean temperature of the warmest quarter and BioClim18, precipitation of the warmest quarter. Note that BioClim18 is log10 transformed.
The parameter estimates of the yearly and quarterly models show little difference between those models with and without the set of 39 spatial eigenvectors. Similarly, low p-values are associated with each parameter estimate regardless of including the eigenvectors, suggesting negligible differences. These results demonstrate that the yearly and quarterly models are robust to spatial autocorrelation.

[Table: parameter estimates of the Yearly Model and Quarterly Model, with and without spatial eigenvectors]

Supplement Note 4: Alternative Regression Models - Results
We applied two alternative regression models to the proportion of annual herbs in ecoregions. First, we applied a logit transformation to the proportion of annual herbs (0.01 was added to all annual herb proportions to avoid zero values) followed by a linear regression (Extended Data Fig 2A & B).
The Yearly Model refers to the model with BioClim1 (mean yearly temperature) and BioClim12 (total yearly precipitation). The Quarterly Model refers to the model with BioClim10 (mean temperature of the warmest quarter) and BioClim18 (precipitation of the warmest quarter). Note that BioClim18 is log10-transformed.
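The logit transformation with a 0.01 offset described in this note can be sketched as follows. The function names are hypothetical, and the handling of shifted values at or above 1 is an assumption of this sketch; the text does not state how such values, if any, were treated.

```python
import math

def logit_with_offset(proportion, offset=0.01):
    """Logit of (proportion + offset); the offset avoids log(0) at proportion == 0."""
    shifted = proportion + offset
    if not 0.0 < shifted < 1.0:
        # Assumption: values that leave the open interval (0, 1) are rejected.
        raise ValueError("shifted proportion must lie strictly between 0 and 1")
    return math.log(shifted / (1.0 - shifted))

def inverse_logit_with_offset(value, offset=0.01):
    """Map a fitted logit value back to the original proportion scale."""
    return 1.0 / (1.0 + math.exp(-value)) - offset

zero_case = logit_with_offset(0.0)  # finite, because 0.01 was added first
```

The transformed values can then be used as the response in the same linear regressions as the untransformed proportions, and fitted values can be mapped back with the inverse function.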

Supplement Note 5: Phylogenetic Relatedness (pGLS) - Results
The results below used each species' median bioclimatic values as the basis for the continuous response variable. The annual life cycle was incorporated as a dummy variable (0 = perennial, 1 = annual). BioClim10 is the mean temperature of the warmest quarter and BioClim18 is the precipitation of the warmest quarter. Note that BioClim18 is log10-transformed.

Supplement Note 9: Gridded System of Polygons - Methods and Results
In addition to the ecoregion-based analyses presented in the main text, we repeated the set of analyses using a grid-based system. The grid system was based on a global tessellation of 100 km by 100 km square cells created using the R package sf 66 . First, using a world map provided by the R package rnaturalearth 67 , we determined the percent land coverage of each cell and excluded those cells with less than 50% land coverage, resulting in 14,595 cells. Next, we mapped the cleaned GBIF observation data points (cleaned as detailed in the main text for the ecoregion-based system) into each of these cells using the R packages raster v3.4-13 57 and rgdal v1.5-27 58 . As with the ecoregion system, we followed the procedures used by 14 , whereby species were only considered "present" in a cell if there were five or more observations. Similarly, to ensure all cells contained sufficient data for analysis, each cell was only considered if 10 or more species were present. This procedure produced sufficient data for 5,934 cells (~40.7% of cells) when examining annual species among all species and 5,824 cells (~40.0% of cells) for annual species among only herbaceous species.
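The presence and data-sufficiency thresholds described above can be sketched as follows. This is an illustrative Python version, not the authors' R pipeline: the function name and input layout are assumptions, and it starts from records that have already been assigned to cells. Dropping species without a life cycle assignment is also an assumption of this sketch.

```python
from collections import Counter, defaultdict

MIN_OBSERVATIONS = 5  # a species is "present" in a cell with five or more records
MIN_SPECIES = 10      # a cell is analysed only if ten or more species are present

def annual_proportion_by_cell(observations, is_annual):
    """observations: iterable of (cell_id, species) pairs, one per record.
    is_annual: dict mapping species -> True (annual) or False (perennial).
    Returns {cell_id: proportion of present species that are annual}.
    """
    counts = Counter(observations)  # number of records per (cell, species)
    present = defaultdict(set)
    for (cell_id, species), n_records in counts.items():
        # Assumption: species lacking a life cycle assignment are skipped.
        if n_records >= MIN_OBSERVATIONS and species in is_annual:
            present[cell_id].add(species)
    result = {}
    for cell_id, species_set in present.items():
        if len(species_set) >= MIN_SPECIES:
            n_annual = sum(is_annual[s] for s in species_set)
            result[cell_id] = n_annual / len(species_set)
    return result

# Toy example (hypothetical data): cell "A" passes both thresholds,
# cell "B" has only nine species with five or more records and is dropped.
species = [f"sp{i}" for i in range(10)]
is_annual = {s: i < 3 for i, s in enumerate(species)}
obs = [("A", s) for s in species for _ in range(5)]
obs += [("B", s) for s in species[:9] for _ in range(5)]
obs += [("B", species[9])] * 4
result = annual_proportion_by_cell(obs, is_annual)
```

Restricting `is_annual` to herbaceous species would give the corresponding proportion of annuals among herbs.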
We find that cells dominated by annuals among herbaceous species (hereafter just called annuals within this supplemental note) are rare, with only approximately 11.7% exhibiting an annual proportion of 50% or more (Extended Data Fig 4A & B).
First, we projected the proportion of annuals in each grid cell onto Whittaker's biome definitions, represented as a two-dimensional coordinate system of mean yearly precipitation and temperature. The patterns found using the grid-based system were qualitatively similar to those of the ecoregion-based system but noisier, probably because ecoregions were designed to minimize variability in environmental conditions within each unit and because each grid cell contains fewer observations than an ecoregion. As with ecoregions, cells with lower precipitation and higher temperatures (i.e., located in the lower-left corner of their biome in Extended Data Fig 4C) possess greater percentages of annuals.
This pattern was corroborated using a linear regression model that fitted the proportion of annuals as a function of mean yearly temperature and total yearly precipitation. Although this gridded system model does not account for as much variation (P < 10⁻¹⁵, D.F. = 5821, R² = 0.39) as the ecoregion system model (R² = 0.48), it nevertheless shows the same trend (Extended Data Fig 4D compared to Fig 2C). Again, bioclimatic features that account for within-year variation in climate produced a model with a better fit. The regression model that incorporated the mean temperature of the warmest quarter and the log-transformed precipitation of the warmest quarter (Extended Data Fig 4E & F) accounted for 42% of the observed variance (P < 10⁻¹⁵, D.F. = 5821) and also outperformed the model based on yearly means in terms of information criteria (ΔAICc = 227).
We also added climate unpredictability and human footprint to the models to assess their effect on annual prevalence. We found that increasing climate unpredictability is associated with a higher proportion of annual species (P < 10⁻¹⁵, D.F. = 5821, R² = 0.19). Adding this feature to the model that included quarterly temperature and precipitation further improved the model's fit (P < 10⁻¹⁵, D.F. = 5817, change in R² from 0.42 to 0.44, ΔAICc = 245). Lastly, human disturbance was positively correlated with the proportion of annuals (P < 10⁻¹⁵, D.F. = 5821, R² = 0.16). Adding this variable to the model with climate unpredictability and quarterly temperature and precipitation further improved

Supplement Note 10: GBIF Biases - Results
We assessed the biases in our dataset with regard to GBIF observational data by first examining the spatial distribution of species missing from our dataset, and then analysing the impact of the number of GBIF species and observations on annual and annual herb proportions.
When examining the proportion of species that were found in GBIF (hereafter "GBIF species") but missing from our dataset in each mapped ecoregion, we found no identifiable regions with a disproportionately high share of missing species (Extended Data Fig 5A). Furthermore, when projecting ecoregions to their respective Whittaker biomes, we found that no biome was missing substantially more species than the others, either before or after the GBIF filtering procedure (Extended Data Fig 5B). Finally, we found only very weak correlations between the proportion of GBIF species missing from our dataset and the proportion of annuals among all species (P = 1.317 × 10⁻⁷, D.F. = 721, R² = 0.038) or the proportion of annuals among herbaceous species (P = 1.182 × 10⁻⁶, D.F. = 680, R² = 0.034).
Second, our analyses indicated that there is no correlation between the total number of present (5+ observations) GBIF species and the proportion of annuals among all species (P = 0.71, D.F. = 721, R² < 0.001) or the proportion of annuals among herbaceous species (P = 0.45, D.F. = 680, R² < 0.001) (Extended Data Fig 5D & F).
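The correlation checks above can be sketched in a few lines. The functions below are illustrative (the names are hypothetical, and the source does not state which test was used): they compute the squared Pearson correlation, which equals the R² of a simple linear regression, and the corresponding t statistic for a given number of degrees of freedom.

```python
import math

def pearson_r_squared(x, y):
    """Squared Pearson correlation; equals R^2 of a simple regression of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("constant input has undefined correlation")
    return (sxy * sxy) / (sxx * syy)

def t_statistic(r_squared, df):
    """Absolute t statistic of the slope for a simple regression with df = n - 2."""
    return math.sqrt(r_squared * df / (1.0 - r_squared))

# Reported values for missing GBIF species vs. annuals among all species.
t_missing = t_statistic(0.038, 721)
```

For R² = 0.038 and D.F. = 721 the t statistic is about 5.3, which shows how a very weak correlation can still yield a very small P value when the number of ecoregions is large.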

Supplement Note 11: Biases in Missing Phylogenetic Data - Results
We explored the biases in our data with regard to the phylogenetic distribution of species missing from our dataset. To this end, we compared the number of species per family in our dataset with the number in each non-bryophyte family in the World Flora Online (WFO), which contains 449 families (Extended Data Fig 5H).
• Out of these 449 families:
  o For 288 families, less than 25% of the data is missing.
  o For 393 families, less than 50% of the data is missing.
• The mean percentage of missed species in our dataset is 22.2% per family and the median percentage is 16.2%.

The blue lines depict the best-fit line. Note that the correlations in E and F are fitted to the log transformation of the quarterly precipitation.

Table 1 | A comparison of previous estimates, obtained from 4 , for the proportion of annuals among all species and among herbs to our revised estimates. Greyed cells have no initial biome estimate. Alternative previous estimates from 3 are available in Table 1. Note, the biome nomenclature used for the previous estimates differs from ours and so the location of the original study was used to determine our corresponding biome. Additional information can be found in Extended Data Table 4.

[Table 1 column headings: Revised Estimate, Previous Estimate]