Comparative analysis of diversity and environmental niches of soil bacterial, archaeal, fungal and protist communities reveal niche divergences along environmental gradients in the Alps and

for on Quantifying of is to and to In we identified important topoclimatic, spatial and biotic drivers of the alpha and beta diversity of bacterial, archaeal, and protist communities. Then, we calculated the niche breadth and position of each along the important environmental gradients to determine how these vary within and among the taxonomic groups. We found that edaphic properties were the most important drivers of both, community di- versity and composition, for all microbial groups. Protists and bacteria presented the largest niche breadths on average, followed by archaea, with fungi displaying the smallest. Niche breadth generally decreased towards environmental extremes, especially along edaphic gradients, suggesting increased specialization of microbial taxa in highly selective environments. Overall, we showed that microorganisms have well defined niches, as do macro-organisms, likely driving part of the observed spatial patterns of community variations. Assessing niche variation more widely in microbial ecology should open new perspectives, especially to tackle global change effects on microbes.


Introduction
A large number of studies have assessed patterns of community dissimilarities through analyses of spatial and environmental variations in alpha and beta diversity. Communities of macroorganisms have been shown to be structured by three main drivers: abiotic suitability (i.e. the species' environmental niches), dispersal and biotic interactions (Lortie et al., 2004;D'Amen et al., 2017). An increasing number of similar studies are now being conducted on microorganisms, yet many of them have focused so far on diversity patterns and much less on niche quantifications and variations across taxa (Tedersoo et al., 2014;Shi et al., 2016;Delgado-Baquerizo et al., 2018;Ning et al., 2019;Oliverio et al., 2020).
The environmental niche concept is fundamental in ecology and evolutionary biology (Hutchinson, 1957;Holt, 2009). It describes a taxon's distribution and performance in an environmental space and can be used to predict species distribution in a geographic space . It has been used to investigate biotic interactions (Wiens, 2011), community dynamics (Cabral and Kreft, 2012), geographic range limits (Kearney and Porter, 2009;Pagel and Schurr, 2012), conservation planning (Costa et al., 2010;Schwartz, 2012), biological invasions (Guisan et al., 2014) and species response to climate change (Thuiller et al., 2005;Botkin et al., 2007;Wiens et al., 2009). The realized environmental niche considers the range of tolerance of taxa (measured by variations in occurrences, abundance, or fitness) along one or more environmental gradients while accounting for biotic interactions and dispersal limitations, defined as a multivariate environmental hypervolume (Hutchinson, 1957). The environmental niche is thus also a key filter of the assemblage of communities, by determiningat least in part -whether an assemblage of species (or operational taxonomic units) is adapted to thrive -now, in the past or in the future -at a given site .
Niche properties such as niche position (average position along an environmental gradient) and breadth (amplitude of tolerance along an environmental gradient) are also strong indicators of a taxon's vulnerability to environmental change (Thuiller et al., 2005;Cianfrani et al., 2018). Niche position is often used to measure the marginality of a taxon's habitat distribution within a study area, with non-marginal taxa occupying typical conditions (i.e., dominant conditions in a landscape) and marginal taxa confined to atypical or extreme conditions (Dolédec et al., 2000;Soininen and Heino, 2007;Muller, 2019). Niche breadth can be used to infer the species tolerance to changing conditions, so that specialist species with narrow niches are expected to experience a higher risk of extinction during times of stress (La Sorte and Jetz, 2010). On the other hand, generalist species with broad environmental tolerance are expected to maintain viable populations more easily even in unfavorable environmental conditions (Thuiller et al., 2005;Binzer et al., 2011;Slatyer et al., 2013). The combination of narrow niche breadth and marginal niche position especially increases vulnerability.
While niche theory is widely used in "macroorganisms" ecology (e.g. Gaston and Blackburn (2008)), it is still in its infancy in microbial ecology, and we know relatively little about the size and position of microbial niches along environmental gradients. Yet, microorganisms represent a large proportion of the global soil biodiversity and play essential roles in biogeochemical cycling, including carbon sequestration and soil fertility (Falkowski et al., 2008;Yuan et al., 2012;Bharti et al., 2017;Crowther et al., 2019) and therefore, are essential for ecosystem functioning. Hence, understanding and quantifying the niche of microbial taxa and communities is a pressing need to help understand and forecast the influence of environmental change on microorganisms (Mod et al., 2021).
We deliberately focus on the microbial niche as the overarching aim of this study is to characterise and compare the environmental niche of four microbial groups -bacteria, archaea, fungi and protists -in the same mountain region with wide environmental gradients. However, to achieve this, we first needed to identify potential key drivers of microorganism communities (and thus the main niche axes). We evaluated the correlation of topoclimatic, edaphic, spatial and biotic variables with the alpha and beta diversity of bacteria, archaea, fungi and protists. This first step also makes this work comparable to previous microbial ecology studies. Such comparisons are essential to understand how the soil biosphere differentially responds to ecosystem properties across landscapes, yet the number of studies comparing microbial groups within the same environment is indeed still limited (Dassen et al., 2017;George et al., 2019). As these four microbial groups have different ecological, physiological and phenotypic properties, we expected different variables to drive their niches and impact their diversity. Specifically, we hypothesized that bacterial and archaeal alpha and beta diversity would primarily be correlated to edaphic properties, especially soil pH and carbon content (Shi et al., 2016;Delgado-Baquerizo et al., 2018;Jiao et al., 2019), while climatic variables, especially temperature and precipitations, would be strongly associated with fungal and protist communities (Tedersoo et al., 2014;Oliverio et al., 2020;Seppey et al., 2020).
Next, we calculated the niche breadth and position of all individual microbial taxa (i.e. zOTUs) in each of the four groups and assessed whether the niche breadth differed, on average, among the groups. While abiotic conditions have a key role in defining the niche breadth and position of species, other factors, such as biotic interactions and dispersal abilities influence the geographical range of taxa, and therefore, potentially, niche breadth (Gaston, 2000). For microorganisms, studies have shown that smaller body size (<20 μm) might enhance dispersal capabilities via wind transport (Wilkinson et al., 2012), potentially increasing the geographical range of taxa and the realized niche breadth observed (Rocha et al., 2018), although highly dispersed organisms may not successfully colonize the ecosystem they disperse to.
Finally, we calculated the niche breadth and position along the key environmental gradients identified as important for alpha and beta diversity to evaluate whether and how these two niche properties would be related. We hypothesized that niche breadth would be related to niche position so that niche breadth would decrease towards the extremes of environmental gradients because of specializations required for survival (Pandit et al., 2009;Büchi and Vuilleumier, 2014;Okie et al., 2015).

Sample collection
The data were collected from non-forested sites covering ca. 700 km 2 of a mountain area located in the western Swiss Alps spanning a 2695 m elevation range (425-3120 m) [ Fig. 1]. Climatic and topographic conditions are heterogeneous, with annual mean temperatures ranging from − 5 • C to +8 • C and mean precipitation ranging between 1200 mm and 2600 mm. The sampling sites were chosen following an equal randomstratified sampling design (Hirzel and Guisan, 2002), with elevation and slope as stratifying factors (Yashiro et al., 2016). Detailed descriptions of the study area and the sampling protocol are provided in Yashiro et al. (2016) and Buri (2019). In brief, at each site, soil samples (top 5 cm) were collected at the four corners and the centre of each quadrat using flame-sterilized shovels and Whirl-Pak bags (Nasco, Fort Atkinson, WI, USA) during the summer 2013. Samples were pooled by site and manually homogenised. A subset of the pooled soil was immediately flash frozen on site in liquid nitrogen and maintained at − 80 • C until ready for soil properties measurements, while the remainder was maintained on ice until processing in the lab. Within two days of collection in the field, part of the refrigerated soil was sieved and aliquoted, and then stored at − 80 • C until DNA extraction.

DNA extraction and amplicon sequencing
Soil DNA was extracted in triplicate for each sample using the PowerSoil DNA isolation kit (Qiagen, Carlsbad, CA, USA) following the manufacturer's instructions. Triplicate DNA extracts were pooled by sample for downstream analyses (Yashiro et al., 2016). For bacterial and archaeal groups, the V5 region of the 16S rRNA gene was amplified for each DNA sample using the universal primers 784F-880R (Lazarevic et al., 2009;Yashiro et al., 2016;Mod et al., 2021). The protocol is described in detail in Yashiro et al. (2016). For fungi, the internal transcribed spacer (ITS) region was amplified from each sample using the primers ITS1F and ITS2 (White et al., 1990;Schmidt et al., 2013). The protocol is detailed in Pinto-Figueroa et al. (2019). Finally, for protists, the V4 region of the 18S rRNA gene was amplified using universal eukaryotic primers TAReuk454FWD1 and TAReukREV3 (Stoeck et al., 2010). The protocol is described in Seppey et al. (2020). The bacterial, archaeal and fungal libraries were sequenced on an Illumina HiSeq 2500 platform at the Genomic Technologies Facility of the University of Lausanne, while protist libraries were sequenced on an Illumina MiSeq platform at the University of Geneva (Molecular Systematics & Environmental Genomics Laboratory).

Bioinformatic processing
In this study, we used three independent sequencing libraries (16S, ITS and 18S) each covering a different number of sampling sites across the study area. A total of 265 independent sampling sites were covered by the 16S library, 232 sites by the ITS library and 179 sites by the 18S library. Each full dataset (with the original number of sampling sites) was processed independently using a standardized custom-made pipeline to avoid processing biases and allow for the best possible comparison among the groups, as described in Mod et al. (2021). In short, we demultiplexed the sequenced reads (of 16S and 18S), removed the barcodes, trimmed and merged the sequences. Amplicons were dereplicated to obtain zero-radius operational taxonomic units (zOTUs) at 100% identity (Edgar, 2018). For each independent dataset, zOTUs with less than 100 reads across all samples were discarded to remove chimeras, sequencing errors and spurious zOTUs. For fungi, the same process was applied, however, since the length of the ITS insert is longer than the sequence of the reads, both ends were concatenated with a NNNN separation to permit subsequent alignments against a database of fungal ITS. These sequences were then dereplicated as described above and zOTUs with less than 100 occurrences across all samples were discarded.
Taxonomic assignments for prokaryotes and protists were performed against the SILVA v132 database (Pruesse et al., 2007;Quast et al., 2012;Yilmaz et al., 2014) while fungal taxonomy was performed using the RDP (ribosomal database project) naive Bayesian classifier v2.12 (Wang et al., 2007) with the UNITE (fungal ITS trainset 07/04/2014) option.  Malard et al. Soil Biology and Biochemistry 169 (2022) 108674 For protists, we removed all 18S sequences affiliated to Fungi, Metazoa and Embryophyta. Finally, for this study, we selected only the sites where all three marker genes were sequenced and the full set of environmental variables was available, resulting in 136 sampling sites included in all subsequent analyses.

Edaphic, topoclimatic, spatial and biotic variables
All preparations related to the predictors, calculations of niche properties and statistical analyses were performed in the R environment (Team, 2013), primarily using the phyloseq (McMurdie and Holmes, 2013) and vegan packages (Dixon, 2003), and visualised using ggplot2 (Wickham and Wickham, 2007). All the variables used in this study are detailed in Table S1.
A total of 44 local soil properties, including but not limited to electrical conductivity (EC), pH, bulk soil water content (SoilWaterC), total phosphorus (total P), total organic carbon content (TOC), nitrogen (N), C/N ratio, major elements (such as SiO 2 , MnO, MgO and CaO), mineralogical composition (such as phyllosilicates, quartz and calcite) and five classes of soil texture were measured in the laboratory. The list of variables and associated methods are shown in Table S1 and supplementary method material (also in Yashiro et al. (2016) and Buri (2019)).
Topographic variables included the slope angle and topographic position (convexity-concavity) and were calculated from the digital elevation models at 25 m resolution as in Dubuis et al. (2013). The 19 climatic variables are based on the MeteoSuisse database and named following the WorldClim variables (Fick and Hijmans, 2017). We also derived variables such as growing degree days, moisture index and solar radiation based on Dubuis et al. (2013) with a resolution of 25 m. The snow cover duration index was calculated based on satellite images as described in Panchard et al. (in preparation).
To create spatial predictors, the geographic coordinates of the sampling sites were transformed to geodetic cartesian (x,y) coordinates using the SoDA package in R (Chambers, 2008) and the Euclidean distances among the sites was calculated using vegan. These were used for distance-decay curves and Mantel tests as they provide a tangible measure of spatial distance. Distance-based Moran's Eigenvector Maps (MEM) were constructed with the (x,y) coordinates using the adespatial R package (Dray et al., 2017) to summarize spatial structures present in the study area in a few proxy variables. This method creates a series of variables that correspond to spatial structures at all spatial scales contained within a given sampling design (Borcard et al., 2018). These vectors were used as predictors in random forests and variation partitioning analyses. In this instance, 27 MEM positive eigenvectors were constructed, representing a sequence of broad to fine scale variation over the extent of the study site (Borcard et al., 2018).
To derive the biotic predictors of both the alpha and beta diversities, we used six biotic variables representing the richness (ACE) and diversity (Shannon index) of each of the three groups other than the one under investigation. We also included plant richness, using data sampled at the same sites and described in Dubuis et al. (2013).
Altogether, a total of 109 non-standardized variables consisting of 44 soil, 29 climatic, 2 topographic, 27 spatial and 7 biotic variables (listed in Table S1) were available for the models to determine which are the most important predictors of alpha and beta diversity of each of the four microbial groups.

Alpha diversity analyses
Alpha diversity was measured using the ACE richness estimator and the Shannon index, both calculated using phyloseq with default settings. Differences in richness between groups were assessed with ANOVA and compared using Tukey's Honest Significant Difference (HSD) tests. The major edaphic, topoclimatic, spatial and biotic factors associated with the alpha diversity of each microbial group was evaluated with random forest regression models (Breiman, 2001). The random forest (RF) models were computed using the rfPermute function with 5000 permutations and 2000 trees in the rfPermute package (Archer and Archer, 2020). First, to identify the most important predictors, multivariate RF models including all the variables were computed. Then, Spearman correlation coefficients among all variables were calculated using the corrplot package in R (Wei et al., 2017). The variable retained in case of a correlation coefficient ≥ |0.70| (Dormann et al., 2013) was the one with the highest %IncMSE; thus, only uncorrelated variables were included in the subsequent random forest models. The final models were obtained by backward selection until the percentage of variation explained by the models stopped increasing and the models could not be improved further by dropping variables. The maximum number of variables within each final model was 12. Based on the final models, the importance and significance of each predictor in explaining the richness (ACE) and diversity (Shannon index) of each microbial community was assessed. Additionally, inter-group alpha diversity relationships were evaluated using regression models.

Beta diversity analyses
To assess beta diversity of microbial communities, each zOTU table was transformed to relative abundance (Weiss et al., 2017) using the ratio of read counts and Bray-Curtis distances were calculated using the phyloseq and vegan R packages. To determine whether the difference in community composition among locations was spatially correlated, we computed distance-decay curves (Morlon et al., 2008) using the (x,y) coordinates with the betapart package (Baselga and Orme, 2012). Then, the relationship of edaphic, topoclimatic, spatial and biotic factors with microbial beta diversity of each community were identified by two methods. First, using vegan, we computed Mantel tests between the explanatory distance matrices (topoclimatic, edaphic and spatial) and the Bray-Curtis dissimilarity matrix of each microbial group.
Second, we used distance-based redundancy analysis (dbRDA), combined with ANOVA and variation partitioning. For this method, forward and backward selection of variables was conducted to obtain the best model, starting from intercept-only and all-variables models, respectively, with the ordiR2step function. Once the best model was identified, Spearman correlation coefficients among the remaining variables were calculated. When pairs of variables had a correlation coefficient ≥ |0.70|, only the first variable selected by the model was conserved. The remaining uncorrelated variables were included in a new dbRDA model, and the variance inflation factor (VIF) was computed to further confirm that multicollinearity was low or absent (VIF <10) and the varpart function was used to assess the proportions of variation explained by each variable. The statistical significance of each environmental variable was tested using ANOVA by margin with 999 permutations and only significant variables (p < 0.05) were plotted on the dbRDA ordination plot.
Finally, we evaluated the relative importance of stochastic processes using the modified stochasticity ratio (MST) (Ning et al., 2019;Guo et al., 2021). This metric estimates ecological stochasticity according to a null-model-based framework. The MST index value indicates the dominance of stochastic processes (>50%) or deterministic processes (<50%) on community assembly. The mean Bray-Curtis dissimilarity-based MST index was calculated for each microbial group using the tNST function in the NST R package (Ning et al., 2019). This function calculates the MST value for each pairwise comparison of samples and produces the mean MST value across all pairwise comparisons.

Environmental niche
The realized niche breadth was calculated as the standard deviation of each abiotic variable, weighted by the relative abundance of the zOTUs at each sampling site. The niche position was calculated as the mean of the variable, weighted by the relative abundance of the zOTUs at each sampling site. For all niche analyses, we removed all zOTUs not present in at least three sampling sites.
To characterise the environmental niches of zOTUs, we defined a PCA space based on the soil and topoclimatic variables measured across the 136 sampling sites using the rda function in vegan. The first ten axes of the PCA jointly explained 81.2% of the total environmental variation in the study sites and were selected for further analysis. We calculated the niche breadth of each zOTU along each of the 10 PCA axis using the PCA scores and the ecospat.nichePOSNB function in the ecospat package (Di Cola et al., 2017). The niche breadths of each zOTUs were averaged and weighted by the inertia of the PCA axes to obtain the mean niche breadth of each zOTU within this environmental space using the ecospat.nicheNBmean function. Differences in mean niche breadth among microbial groups were tested using ANOVA with 999 permutations and effect size was evaluated using partial eta squared. An eta squared value > 0.14 indicates a large effect size.
We also calculated the niche breadth and position of each zOTU along each environmental gradient of interest identified as influencing alpha and beta diversity using the ecospat.nichePOSNB function. Quadratic regressions were performed and plotted using the geo-m_smooth function and the rasterised density was plotted using the geom_bin_2d function in ggplot2. The niche functions ecospat.niche-POSNB and ecospat.nicheNBmean used in this study are available on Github (https://github.com/ecospat/ecospat) and are available in the ecospat v3.3 R package (Di Cola et al., 2017).

Elevational gradient
As this study was conducted in the Alps, elevation was an important gradient to consider. However, it was not used as a variable in any of the previously mentioned analysis since it is not a direct causal driver for taxa, and it correlates with the gradients of many other climatic and edaphic variables. Nevertheless, we independently evaluated the relationship between elevation and alpha/beta diversity by repeating the above analyses and calculated changes in niche characteristics along the elevation gradient.

Predictors of alpha diversity
For this study, we selected only the sites where all three marker genes were sequenced and the full set of environmental variables was available, resulting in 136 sampling sites with 59482 bacterial, 472 archaeal, 82375 fungal and 3419 protist zOTUs.
Bacterial richness (ACE) and diversity (Shannon index) were significantly higher than fungal, protist and archaeal richness and diversity [ Table 1]. Interestingly, while mean fungal richness was significantly higher than mean protist richness, the diversity of both groups was within the same range.
The random forest models revealed strong relationships between edaphic, biotic and topoclimatic variables and the richness and the diversity of each of the four microbial groups investigated [ Fig. 2]. In contrast, only six spatial vectors (MEM 1, 3, 7, 10, 14, 21) significantly explained alpha diversity, primarily for fungi and protists [ Fig. 2].
Overall, the models best explained bacterial alpha diversity, with over 67% of the variance in both richness and diversity explained. The variance was also well explained for archaeal and fungal richness (>66%) but less for diversity (<45%). However, the random forest models explained a low proportion of the variation in protist richness (19%) and diversity (27%) and thus, results should be carefully considered as the low variance explained by the model can lower the accuracy of the variable selection and importance by random forests.
The diversity and richness of each microbial group were related to a unique set of variables representing mainly edaphic and biotic properties. Specifically, soil pH (and highly correlated calcium oxide content (CaO) (Spearman = 0.89)) were the most important edaphic variables driving richness and diversity of all microbial groups, except protist richness [ Fig. 2]. Overall, bell-shaped curves were observed, with decrease in alpha-diversity towards both ends of the pH gradient [ Fig. S1]. Soil water content (SoilWaterC) was also a key variable, explaining archaeal, fungal and protist richness and/or diversity [ Fig. 2]. Other variables such as silica (SiO 2 ) were also important edaphic variables differentially correlated to the alpha diversity of various groups [Fig. 2]. The increase in silica was correlated with the increase in fungal richness as well as with the increases in archaeal, fungal and protists diversities [ Fig. S2].
Out of topoclimatic variables, predictors related to winter conditions (snow cover duration (SCD) and number of freezing degree days (FDD)) were repeatedly identified as important in shaping alpha-diversity of all groups [ Fig. 2]. In all the cases, alpha diversity was higher at the warmer limit than at the cold extreme of the gradient [Figs. S3-4], a trend also reflected along the elevation gradient [ Fig. S5].
Among the biotic variables, plant richness was identified as a key factor [ Fig. 2]. This was especially clear with fungal alpha diversity, where an increase in plant richness was related to an increase in fungal richness and diversity (R 2 > 0.38, p < 0.001) [Fig. S6]. Other biotic variables explained some of the variance, differentially for each group [Fig. 2]. These alpha diversity relationships among the groups using regression models are detailed in Figs. S7 and S8 and highlight the strongest positive correlation between bacterial and archaeal communities for both richness (ACE) and diversity (Shannon).

Predictors of beta diversity
The Mantel tests determined that the beta diversity of all microbial communities was more correlated to edaphic variables than to topoclimatic variables [ Table 2]. However, topoclimatic properties better explained the beta diversity of fungal and protist communities than that of bacterial and archaeal communities. Geography (through the Euclidean distances of sampling sites) was also a significant factor correlated with fungal and protists communities [ Table 2] and this relationship was further supported by distance-decay curves and power models [ Fig. S9]. However, the low spatial correlation observed may reflect the limitations of using spatial data with Mantel tests (Legendre et al., 2015) rather than a genuine lack of spatial relationship. While the increase in community dissimilarity with increasing spatial distance was weak at the scale of the sampling area, these curves also highlighted the spatial turnover of these communities. Fungi presented the highest turnover with very dissimilar communities overall, followed by archaea, protists and bacteria [ Fig. S9], as supported by the distance-decay curves (Ranjard et al., 2013).
For beta diversity, the distance-based RDA identified pH, soil water content and electrical conductivity (EC) as key variables across all groups [ Fig. 3]. Total phosphorus (TotalP) was also important for all groups except protists. Topoclimatic variables presented a lower degree of correlation, with snow cover duration (SCD) related to bacterial and archaeal community composition, and temperature (gdd, bio5) related to fungal and protist communities. Spatial factors were confirmed to mostly have a limited relationship with microbial communities, as previously calculated by the Mantel tests. They were significant only for  Fig. 3]. We observed strong correlations in inter-group betadiversity especially between bacterial-archaeal, bacterial-fungal, bacterial-protist, fungal-archaeal and fungal-protists communities, also supported by the mantel tests results [ Table 2]. Finally, all communities had MST values < 50% [ Fig. 3], confirming the dominance of deterministic processes (also called niche-based mechanisms (Zhou and Ning, 2017)), even for fungal and protist communities that had a high variance remaining unexplained in the variation partitioning [ Fig. 3].

Fig. 2.
Mean predictor importance (% of increase in mean square error) of variables on microbial alpha-diversity (Shannon) and richness (ACE) predicted by the best random forest models. The accuracy was computed for each tree and averaged over the forest (2000 trees). The variance explained (var. exp.) is indicated for each model. Significance levels of each variable included in the model are as follows: *P < 0.05 and **P < 0.01 and ***P < 0.001.

The environmental niche
The environmental space was defined using the first 10 PCA axes, explaining 81% of the variance in total [Fig. S10, Table S2]. For niche breadth analyses, we removed all zOTUs not present in at least three sampling sites, leaving 58888 bacteria, 438 archaea, 73325 fungi and 3404 protists.
Differences in mean niche breadth were significant (ANOVA, p < 0.001, eta 2 = 0.14) with protists and bacteria presenting the largest average niche breadth, followed by archaea and fungi [ Fig. 4]. Fungi, bacteria and protists had longer tails of zOTUs with very large niche breadths. Considering outliers zOTUs, 2156 fungal zOTUs (2.95%) were considered generalists, of which 41% were classified as Ascomycota and 24% as Basidiomycota. Only 39 specialists zOTUs (0.05%) were identified, also primarily composed of Ascomycota (48%) and Basidiomycota (18%). Similarly, 516 bacterial zOTUs (0.88%) were considered generalists and 166 were considered specialists (0.28%). In both cases, the majority of these zOTUs were classified as Proteobacteria (28% of generalists, 31% of specialists) as well as Acidobacteria, Actinobacteria, Bacteroidetes and Planctomycetes. The main difference between generalists and specialists were the number of zOTUs identified as Patescibacteria and Chloroflexi. The former represented 14.5% of generalists but only 7.8% of specialists, while the latter represented only 5% of generalists but 18.1% of specialists. We identified 66 protist zOTUs (1.94%) as generalists and only 4 as specialists (0.12%), all primarily from the SAR clade. Finally, no specialists and only 4 generalists (0.91%) were identified for the archaea, including 3 Thaumarchaeota [data available on Figshare].

The environmental niche along gradients
Assessing the relationships between niche breadth and position of each zOTU along selected environmental gradients (previously identified as important factors for alpha and beta diversity of each microbial group [Figs. 2 and 3]) revealed that niche breadths were, in general, smaller toward the environmental extremes of the gradients. This was especially clear along edaphic gradients such as pH, soil water content (SoilWaterC) and total phosphorus content (TotalP) [Fig. 5], for which the sampling area covered a large range of the possible variation in the gradients. The strongest relationships were observed for soil water content, where zOTUs representing organisms living in drier soil exhibited small niche breadth, which rapidly increased in range with increasing water content. Similarly, zOTUs living in environments with low phosphorus content exhibited smaller niche breadths. Interestingly, while niche breadth decreased for bacterial, archaeal and fungal zOTUs towards the extreme pH values, this was not observed for protists, which exhibited increasing niche breadths towards higher pH values. Soil electrical conductivity was also identified as a key variable for the microbial communities [ Fig. S11], with increasing niche breadth associated to increasing conductivity. Similar patterns of decreasing niche breadths toward environmental extremes were less clear for climatic gradients [ Fig. S11]. A decrease in niche breadth with increasing snow cover duration was observed for bacterial, archaeal and fungal zOTUs.
In contrast, the niche breadth of protists increased with snow cover duration. No clear trends were recorded for the other climatic variables identified as important for the alpha and beta diversity.

Environmental predictors of microbial diversity
Using tools traditionally applied to macro-organisms, we showed their applicability to detect spatial and environmental biogeographic trends also when studying microorganisms both at the level of individual taxonomic units (niches) and communities (alpha and beta diversity). Assessing alpha and beta diversity of four microbial groups within the same study area allowed us to identify and compare the key factors associated with each community. In all cases, deterministic (nichebased) processes drove community assembly (Ning et al., 2019) and as hypothesized, the diversity of each microbial group was governed by a different set of variables, but contrary to our hypothesis, alpha and beta diversity of all groups were mainly associated with edaphic properties. Indeed, while soil properties were expected to weight more on the bacterial and archaeal assemblies (Shi et al., 2016;Delgado-Baquerizo et al., 2018;Malard et al., 2021;Starke et al., 2021), we had hypothesized that topoclimatic variables would be more important for fungal and protist communities (Bates et al., 2013;Tedersoo et al., 2014;Mod et al., 2020;Oliverio et al., 2020). However, fungi and protists had a high proportion of unexplained variance but the lowest MST values, suggesting that, especially for these groups, some key variables structuring these communities were missing from the models, such as soil microclimate (Lembrechts et al., 2019). Biotic variables were also key to the diversity of all groups, suggesting strong inter-kingdom interactions. Additional important biotic interactions with plants (Yashiro et al., 2016), or with other soil organisms such as arthropods, rotifers or nematodes (Coleman and Wall, 2014;Singer et al., 2020a,b;van den Hoogen et al., 2020) could also drive the diversity of these microbial groups. Finally, the MST index indicated that stochastic events such as ecological drift or diversification account for some of the unexplained variance (Zhou and Ning, 2017). Overall, these results highlight the importance of intra-and inter-kingdom comparisons.

Microbial niche breadth
We hypothesized that fungi and protists would have smaller niche breadths, on average, than bacteria and archaea due to higher spatial turnover and lower ecological tolerance (Pikuta et al., 2007;Taylor and Sinsabaugh, 2015;Belilla et al., 2019;Fournier et al., 2020;Kivlin and Hawkes, 2020). While fungi indeed presented the lowest average niche breadths recorded and the highest spatial turnover, the environmental niche breadths of protists (and bacteria) was, on average, larger than that of the other microbial groups investigated. In soils, many bacteria, archaea and fungi form obligate associations with other taxa (consortia), a fact that is illustrated by the limited number of taxa that can be cultured axenically, as compared with the total diversity (Stewart, 2012;Pande and Kost, 2017). This means that the presence of such taxon depends on the occurrence of associated organisms nearby thus potentially reducing niche breadth to that of the whole consortium ( Bar--Massada, 2015) or at least part of it. In contrast, most soil protists are heterotrophic (Singer et al., 2020a,b;Mazel et al., 2021) and depend on the presence of prey or host (Geisen et al., 2015;Seppey et al., 2017). Therefore, while intra-kingdom co-occurrences may not have much of an influence on their niche breadth, inter-kingdom co-occurrences may be determinant. The similar niche breadths recorded for protists and bacteria supports the theory that they can interact in belowground ecosystems. Strong correlations between soil protists and bacterial communities have been previously described (Oliverio et al., 2020) and were also observed in this study with the correlations of the alpha and beta diversity, but the nature of these relationships, beyond predation, are still to be determined. We had also hypothesized that dispersal limitation (via body size and dispersal capabilities) might decrease the realized niche breadth, especially for fungi and protists. However, we found no evidence of this as the protists harboured the largest niche breadth, with bacteria. As both, bacteria and archaea have similar cell size ranges, on average 1-3 orders of magnitude lower than the size of protists (Tesson et al., 2015;Stepanauskas et al., 2017), it should allow more efficient passive wind dispersal (Wilkinson et al., 2012). Indeed, spatial factors did not account Fig. 3. The edaphic, topoclimatic, spatial and biotic factors best explaining the beta diversity of bacterial, archaeal, fungal and protists communities with only the significant variables of the final db-RDA model plotted using principal coordinate analysis (PCoA). The significance of variables was tested using ANOVA. The variation partitioning analysis was computed using the significant variables identified within each category. Significance levels are as follows: *P < 0.05 and **P < 0.01 and ***P < 0.001. Residuals indicate the remaining unexplained variance.
for any of the variation in alpha and beta diversity of these two groups, suggesting that dispersal limitation only weakly structured these communities within the studied region. Therefore, as the low dispersal limitation did not appear to influence the size of their niche breadth, it might rather depend on their survival based on environmental conditions and biotic interactions, such as co-occurrence, rather than dispersal capabilities.

Niche specialization
Under the assumption that the detection of a DNA sequence indicates active presence, zOTUs with very large niche breadths indicate the presence of some highly generalist taxa, possibly able to survive in many environmental conditions and likely with strong dispersal and colonization capabilities. These may include, for instance, spore-forming fungal and bacterial taxa, thus increasing dispersal range and the available environmental space (Aguilar-Trigueros et al., 2019). For example, the high proportion of generalist Ascomycota identified was in accordance with Egidi et al. (2019) who showed the dominance of a few Ascomycota taxa in soil fungal communities worldwide. While we identified more generalist taxa; bacteria, protists, and fungi also presented a few zOTUs with very small niche breadths, likely reflecting highly specialized taxa. For example, Ktedonobacteria (Chloroflexi) were identified in higher proportions as specialists, possibly reflecting their habitat preference for oligotrophic and extreme environments (Barberán et al., 2012;Lynch et al., 2012;Tebo et al., 2015), including high elevation alpine soils (Adamczyk et al., 2019). Theory suggests that  generalist taxa are primarily influenced by dispersal processes leading to larger realized niche breadths, while specialists are primarily influenced by environmental filtering and biotic interactions, and accordingly present smaller realized niche breadths (Pandit et al., 2009;Büchi and Vuilleumier, 2014). However, recent studies on microbial communities have actually shown that biogeographic patterns among generalist microbial assemblages might be better explained by environmental conditions than for specialists, suggesting that environmental processes significantly impact all taxa and not only specialists (Székely and Langenheder, 2014;Lindh et al., 2016;Hu et al., 2019;Luo et al., 2019;Malard et al., 2019). Future studies could also explore how the environmental niche varies within a single taxonomic group or between distinct functional groups (e.g. for protists, between consumers, parasites and phototrophs) as their distributions was shown to differ along environmental gradients (Mazel et al., 2021).

Niche breadth and position along the gradients
As hypothesized, the niche breadth decreased towards environmental extremes within our study area, suggesting more specialized taxa towards environmental edges (Büchi and Vuilleumier, 2014). This trend was conserved across microbial groups and especially observed when most of the range of the gradient was represented in the study area suggesting that this trend is not due to sampling omitting part of the niches of taxa with marginal niche position (sensu niche truncation). For instance, taxa with niche position close to either end of the pH gradient generally had a narrow niche, suggesting the increase in specialist taxa in acidic and alkaline soils (a pattern also recorded for soil bacterial communities using different analytical methods (Malard et al., 2019)). Electrical conductivity, which reflects the salt concentration in the soil, also showed decreasing niche breadth with decreasing conductivity. As salt concentration is an important variable for microbial communities (Logares et al., 2009), a decrease in niche breadth in hypersaline soils was also to be expected. However, the sampling sites only covered a small section of the salinity gradient given that no sites were recorded over 500 μS cm − 1 . As a general reference, electrical conductivity over 500 μS cm − 1 represent relatively saline systems, while values can exceed 4000 μS cm − 1 in certain soils (Hardie and Doyle, 2012). Similarly, organisms living under growth-limiting low available water or nutrient (phosphorus) contents (Schmidt and Lipson, 2004;Darcy et al., 2018) need adaptations to survive in these conditions (Torsvik and Øvreås, 2008;Malard and Pearce, 2018;Singer et al., 2018), hence the lower niche breadths observed for taxa with niche position close to the lower ends of soil water content and nutrient gradients.
Interestingly, out of climatic variables, the decreasing trend of niche breadths towards the ends of the gradients was only observed for snow cover duration. This is likely because the study area best covered this climatic variable, with snow cover duration ranging from 1 day up to 235 days (out of 365 annually). Such trend is also observed for ectomycorrhizal fungi and plants (Björk and Molau, 2007;Yao et al., 2013). Furthermore, the recurrent detection of snow cover duration as an important climatic variable for alpha and beta diversity highlights the importance of considering the ecosystems being investigated (Edwards et al., 2007). Snow cover duration is rarely considered in most studies on alpine microorganisms (exceptions include Zinger et al. (2009) andXia et al. (2014)), despite its ecologically important role in these ecosystems (Edwards et al., 2007).
For the gradients only partially covered, niche breadth trends may not fully reflect the reality in all parts of the gradients due to niche truncation. For these gradients, broader extents or even global scale analyses are needed to avoid limiting the range of environmental conditions to a limited study region (Regos et al., 2019).

Limitations and perspectives
Originally, the concept of the environmental niche was proposed for well-defined species (Hutchinson, 1957), as most often used in biogeography of macro-organisms . Here, the niche calculations were based on a single phylogenetic marker associated to a single niche. Yet, we should keep in mind that microorganisms with the same phylogenetic marker can have a number of accessory genes encoded in their mobile genome, extending the functional potential, which might be beneficial under certain environmental conditions (Juhas et al., 2009). Furthermore, we implicitly assume that the detection of DNA sequence means the active presence of the organisms, which could lead to niche bias, especially towards the identification of generalists with large niche breadths. We should also note that we assume that the niche hypervolume is a symmetrical, smooth and continuous shape. However, as this shape might be asymmetrical, rugged, or contain holes (Blonder et al., 2014), the calculation of the niche position and breadth across multiple dimensions could overestimate the niche breadth or result in an assumed centroid position that is shifted or even outside the actual niche. In these cases, an overestimation of the number of generalist taxa identified is possible. Last, by pooling five subsamples into a single soil sample per site, we might have overlooked some specific soil microenvironmental conditions. As very similar sampling protocols are commonly used in many studies of soil environmental microbial communities, it would be interesting in future studies to explore the variation of soil microenvironmental conditions within a few square meters and assess how this could affect the estimates of niches and their positions along gradients.
Quantifying and comparing the niche properties of microorganisms has a wide range of potential uses, especially to model spatial distribution and evaluate the potential influence of climate and environmental change on taxa and communities (Thuiller et al., 2005;Muller, 2019;Finn et al., 2020;Mod et al., 2021). Specialized taxa with limited niche breadth may be at greater risk of extinction as they benefit from homogeneous environments (in space and/or time). Devictor et al. (2008) already highlighted some overwhelming evidence of specialist declines among macroscopic organisms in fragmented landscapes. However, whether similar trends await microbial taxa remains undetermined, especially as speciation and diversification occur at much faster rates than in macro-organisms, potentially allowing faster acclimation and adaptation to changing environmental conditions (Cavicchioli et al., 2019). The theory of functional redundancy in microbial communities (Rousk et al., 2009;Louca et al., 2018) might also mitigate the impact of widespread extinctions following environmental change. However, emerging evidence suggests that functional redundancy may not actually be as widespread as previously thought (Allison and Martiny, 2008;Delgado-Baquerizo et al., 2016;Galand et al., 2018). Therefore, the extinction of specialist taxa harbouring rare and/or key functions for an ecosystem (Mariadassou et al., 2015) could disrupt ecosystem functioning (Jousset et al., 2017), especially in fast changing ecosystems such as alpine regions.

Ethics approval
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The niche functions ecospat.nichePOSNB and ecospat.nicheNBmean used in this study are available on Github (https://github.com/ecospa t/ecospat) and are available in the ecospat v3.3 R package (Di Cola et al., 2017). All zOTU tables and associated metadata have been deposited on Figshare (https://figshare.com/account/home#/proje cts/97421). Bacteria/Archaea (16S) and Fungal (ITS) sequences are available on European Nucleotide Archive via the project number PRJNA810480. Protist sequences (18S) are available on European Nucleotide Archive via the project number PRJEB30010.

Funding
This study was supported by the Swiss national science foundation (SNSF, grant PDFMP3_135129 to AG and HNH and grant 315230_184908 to AG). EL was supported by an "Atracción de talentos" grant from the Community of Madrid, project 2017-T1/AMB-5210 and by the project MYXOTROPIC VI (PGC2018-094660-B-I00) awarded by the Spanish Government.

Authors' contributions
AG and HNH designed the initial project leading to the sampling of the data used here. AG, EY and HNH planned the initial sampling, and EY led the laboratory work for conditioning the field samples and sequencing for bacteria, archaea and fungi. EL and EADM designed and conducted the laboratory work and sequencing for protists. NG and EY conducted the bioinformatic processing while HM compiled and prepared the datasets, in coordination with AG. AG conceived the present project based on the data, with help from HM, NG and EY. LAM and AG designed the present comparative study. LAM conducted the statistical analyses, with help from OB for the niche analyses. LAM wrote the manuscript. All authors revised and approved the final version.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.