A Multiscale Variation Partitioning Procedure for Assessing the Influence of Dispersal Limitation on Species Rarity and Distribution Aggregation in the 50-Ha Tree Plots of Barro Colorado Island, Panama

Spatial autocorrelation is among the most widely discussed ecological processes in the current ecological literature. The present study attempts to quantify the effect of dispersal limitation on community structure under local environmental conditions using a multiscale approach. Moreover, I assess the relationships among the variation explained by space, the rarity of species, and the distributional aggregation of species in the community. The results show that spatial autocorrelation has an increasing influence on community composition as the spatial resolution is increased for the 50-ha tree plots of Barro Colorado Island (BCI), Panama Canal. Also, as the spatial resolution increases, the rarity of species, measured as an intrinsic characteristic of each species' distribution, tends to decrease monotonically, while the aggregation of species tends to increase, although not monotonically. Overall, such multiscale analyses may be of value for analyzing, dynamically, the relative contributions of space and environment in shaping community structure and species distribution.


Introduction
Partitioning variation to identify the relative importance of different ecological processes is ubiquitous in the current ecological literature. In community ecology especially, variation partitioning has been widely applied to reveal the influences of dispersal limitation and environmental filtering.
One common assumption in analyzing ecological communities is that the scale of the data is fixed. This may cause problems, because different species may respond differently to spatial resolution owing to differing dispersal abilities or niche processes operating at different scales. Thus, it may be misleading to apply a fixed spatial resolution when evaluating the influences of environment and space on species distribution or community structure. As stated by Karst et al. [1], "the balance of stochastic and deterministic factors influencing plant distributions depends on spatial resolution". Therefore, ignoring the change of spatial resolution inherent in the raw data risks misunderstanding the effects of the various ecological processes.
To address this fixed-scale assumption explicitly, the present study quantifies the effects of dispersal limitation on the compositional variation of the community using a multiscale procedure. Since the multiscale method is highly flexible and compatible with traditional fixed-scale methods, this procedure may allow more information to be extracted regarding the effects of dispersal limitation and environmental filtering.
Rare species have unique roles contributing to community diversity [2]. They are also important to identify conservation priorities [3], evaluate extinction risks and serve as flagship species in the local communities [4].
However, it is still unclear how rarity influences the variation contributed by spatial autocorrelation. A common hypothesis is that rare species typically have limited distributional ranges and thus should be determined by space or limited habitats. Thus, if there are many rare species in the community, I would predict that the proportion of total variation (indicated by the adjusted $R^2$) explained by space should be remarkably large.
Aggregation of species distributions is ubiquitous in ecological data [5][6][7][8]. Many studies have attempted to estimate the degree of aggregation and evaluate the consequences of aggregation on the species-area relationship [9,10] and abundance estimation [11].
To date, no study has dealt with the impact of species aggregation on the explained variation contributed by space. One hypothesis linking species aggregation to the spatial fraction in variation partitioning is that increasing spatial aggregation leads to a higher percentage of explained variation attributed to space, because aggregated species should be spatially limited in their distributional ranges. (Increasing spatial aggregation may instead lead to a higher percentage of explained variation attributed to environment when the environment or habitats are spatially structured; in that case, the roles of space and environment cannot be distinguished because environment interplays with space.) Typically, when the spatial resolutions are increasing, a species' distribution should appear more aggregated, because distribution points in adjacent cells are gradually merged. Thus, I predict that the explained variation contributed by space should increase when spatial resolutions are increasing for the community.
In the present study, I have three research objectives for better understanding the contributions of space and environment to structuring ecological communities. I examine (1) the shifting pattern of explained variation attributed to spatial autocorrelation caused by the rescaling of the data; (2) the relationship between explained variation and the rarity of species in the community; and (3) the relationship between explained variation and the aggregation of species in the community.

Multiscale procedure
The multiscale technique is not new. It simply modifies the areal size of the sampling quadrates (merging or splitting local quadrates to form new quadrates with new areal sizes and spatial scales) and their spatial coordinates (recalculating the geographic coordinate centroids of the newly formed quadrates) according to different spatial scales. The new quadrates generated in each step of the rescaling procedure are then analyzed using Redundancy Analysis (RDA) and/or Canonical Correspondence Analysis (CCA) to evaluate the explained variation contributed by space. The calculation of explained variation contributed by space follows previous studies using the variation partitioning technique [12][13][14][15]. In simple terms, the method implements a multivariate regression analysis on the species community matrix (quadrate-species matrix) using the geographic coordinate centroids of the quadrates as the explanatory variables.
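The multivariate regression described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the analyses, and it makes several assumptions not stated in the text: the community matrix is used untransformed, the explained variation is taken as the ratio of fitted to total sum of squares over all species (the quantity RDA reports as constrained over total variance), and the adjustment is the usual $1 - (1 - R^2)(n - 1)/(n - q - 1)$ for n quadrates and q explanatory variables. The function names are hypothetical.

```python
def _center_columns(rows):
    """Subtract each column mean from a list-of-rows matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def _solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    q = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(q)]
    for c in range(q):
        pivot = max(range(c, q), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("explanatory variables are collinear")
        m[c], m[pivot] = m[pivot], m[c]
        d = m[c][c]
        m[c] = [v / d for v in m[c]]
        for r in range(q):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[q:] for row in m]


def spatial_adjusted_r2(community, predictors):
    """Adjusted R^2 of community (quadrates x species) on predictors
    (quadrates x spatial variables), pooled over all species."""
    n, q, p = len(predictors), len(predictors[0]), len(community[0])
    if n - q - 1 <= 0:
        raise ValueError("too few quadrates for this many predictors")
    y = _center_columns(community)
    x = _center_columns(predictors)
    xtx = [[sum(x[i][u] * x[i][v] for i in range(n)) for v in range(q)]
           for u in range(q)]
    xty = [[sum(x[i][u] * y[i][j] for i in range(n)) for j in range(p)]
           for u in range(q)]
    coef = _solve(xtx, xty)  # q x p matrix of regression coefficients
    ss_total = sum(v * v for row in y for v in row)
    ss_fitted = sum(
        sum(x[i][u] * coef[u][j] for u in range(q)) ** 2
        for i in range(n) for j in range(p)
    )
    r2 = ss_fitted / ss_total
    return 1.0 - (1.0 - r2) * (n - 1) / (n - q - 1)
```

The guard on `n - q - 1` matters for a multiscale design: with only a handful of quadrates at the coarsest scale, the adjusted $R^2$ is defined only for a small number of spatial variables.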
In the present study, the rescaling procedure runs from high to low spatial resolution. Specifically, the original data set is considered to have the highest spatial resolution, in which the whole study region is divided into regular quadrates, and the areal size of the smallest spatial unit is calculated (Scale 1 subplot in Figure 1). Regular quadrates are selected so that every studied quadrate covers the distributional points of at least one species.
Then, I merge every two adjacent quadrates from the original grid cell system to form new sampling quadrates with a lower spatial resolution (Scales 2, 4 and 8 subplots in Figure 1), in which the abundance of each species is summed and the environmental data are averaged. After each rescaling step, the smallest spatial unit of the data is twice as large as in the preceding step. The procedure continues until the smallest unit of the distribution data covers 1/5 of the total area (Figure 1).
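The merging step can be illustrated with a short sketch. It is written in Python for illustration only and rests on assumptions of my own: quadrates are stored in an order in which consecutive rows (0 and 1, 2 and 3, and so on) are spatially adjacent, all quadrates at a given scale have equal area so that the new centroid is the midpoint of the two old centroids, and environmental averaging is omitted. The function names and the `min_quadrates` stopping rule are hypothetical.

```python
def merge_adjacent_pairs(abundance, coords):
    """Merge quadrates pairwise: rows (0, 1), (2, 3), ...

    abundance: quadrates x species counts (list of lists).
    coords:    one (x, y) centroid per quadrate.
    Assumes consecutive rows are spatially adjacent and of equal area.
    """
    if len(abundance) != len(coords) or len(abundance) % 2:
        raise ValueError("need an even number of quadrates with matching coords")
    new_abundance, new_coords = [], []
    for i in range(0, len(abundance), 2):
        first, second = abundance[i], abundance[i + 1]
        # Abundances of each species are summed over the two quadrates.
        new_abundance.append([p + q for p, q in zip(first, second)])
        (x1, y1), (x2, y2) = coords[i], coords[i + 1]
        # Centroid of the merged quadrate (midpoint of the two centroids).
        new_coords.append(((x1 + x2) / 2, (y1 + y2) / 2))
    return new_abundance, new_coords


def rescale_series(abundance, coords, min_quadrates=5):
    """Return [(scale, abundance, coords), ...] from finest to coarsest."""
    scale = 1
    levels = [(scale, abundance, coords)]
    while len(abundance) % 2 == 0 and len(abundance) // 2 >= min_quadrates:
        abundance, coords = merge_adjacent_pairs(abundance, coords)
        scale *= 2
        levels.append((scale, abundance, coords))
    return levels
```

Starting from 40 quadrates, this yields 40, 20, 10 and 5 quadrates at scales 1, 2, 4 and 8, so the coarsest unit covers 1/5 of the total area.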

Extraction of spatial variables
I consider two methods to quantify and extract the spatial information contained in the data so that it can be used as explanatory variables in the RDA (or CCA) analyses. The first method is principal coordinates of neighbor matrices (PCNM) [16], while the second is a cubic regression model, written as [13,17]

$$f(X, Y) = b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2 + b_6 X^3 + b_7 X^2 Y + b_8 X Y^2 + b_9 Y^3,$$

where X and Y denote the latitude and longitude, respectively, of the geographic coordinate centroid of a newly formed quadrate, and $b_0, \dots, b_9$ are regression coefficients.
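A cubic polynomial in two coordinates has nine monomial terms besides the intercept, and these terms serve as the spatial explanatory variables. The sketch below builds them in Python for illustration. Centering the coordinates before expansion is an assumption of this sketch (it is a common way to reduce collinearity among the terms), and the function names are hypothetical.

```python
def cubic_trend_terms(x, y):
    """Nine monomials of a cubic polynomial in x and y (no intercept)."""
    return [x, y, x * x, x * y, y * y,
            x ** 3, x * x * y, x * y * y, y ** 3]


def center(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def trend_surface_matrix(coords):
    """Quadrates x 9 matrix of cubic terms from (x, y) centroids.

    Coordinates are centered first (an assumption of this sketch).
    """
    xs = center([c[0] for c in coords])
    ys = center([c[1] for c in coords])
    return [cubic_trend_terms(x, y) for x, y in zip(xs, ys)]
```

Because the matrix has nine columns, it can only be used as a full set of predictors when the number of quadrates comfortably exceeds nine.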

Assessment of the rarity of species
When the spatial resolutions vary, I am able to quantify the role of species rarity in the resultant explained variations contributed by space and environment, respectively. To quantify the rarity of species in the community, I consider the percentages of species that occur (based on presence information) in fewer than 10%, 30% and 50% of the total quadrates (R10, R30 and R50, respectively).
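The rarity indices can be computed directly from the quadrate-species matrix. The following is a minimal Python sketch for illustration; the function name is hypothetical, and a species absent from every quadrate would be counted as rare, which is an assumption of this sketch.

```python
def rarity_index(abundance, threshold_pct):
    """Percentage of species present in fewer than threshold_pct percent
    of the quadrates. abundance is a quadrates x species matrix."""
    n_quadrates = len(abundance)
    n_species = len(abundance[0])
    rare = 0
    for j in range(n_species):
        # Presence only: a quadrate counts if the species has any individual.
        occupied = sum(1 for row in abundance if row[j] > 0)
        # Integer comparison avoids floating-point edge cases at the cutoff.
        if occupied * 100 < threshold_pct * n_quadrates:
            rare += 1
    return 100.0 * rare / n_species
```

R10, R30 and R50 then correspond to `rarity_index(abundance, 10)`, `rarity_index(abundance, 30)` and `rarity_index(abundance, 50)`.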

Assessment of the aggregation of species
To measure the aggregation of species across the distributional ranges, instead of using the negative binomial distribution [7,8,18,19], I consider the finite version of negative binomial distribution, which has been recommended recently for its merits of handling range-narrow species [5].
The finite negative binomial probability of species distribution is taken from the previous work [5]. Estimation of the aggregation parameter k follows the previous work:

$$k = \frac{(1 - a)\,\bar{n}^2 - a\,s^2}{s^2 - (1 - a)\,\bar{n}},$$

where m is the number of quadrates, a is the ratio between the smallest spatial unit size and the whole range, $\{n_i\}$ is the abundance vector in a set of sampled quadrates, and $\bar{n}$ and $s^2$ are the mean and variance of $\{n_i\}$ over the m quadrates.

As shown in Table 1, there is a strong negative relationship between community rarity and explained variation for space. This result suggests that when spatial resolutions are increasing, the rarity of species has a decreasing trend, while space explains a larger percentage of the variation in community composition.
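For orientation, the finite negative binomial distribution and a moment estimate of k can be sketched in Python. The sketch assumes the commonly written form of the distribution, which is a beta-binomial with shape parameters k and k(1 - a)/a, so that a cell covering a fraction a of the range holds aN individuals on average when the species has N individuals in total. The symbol N, the use of the sample variance, and the function names are assumptions of this sketch rather than details taken from the text.

```python
from math import exp, lgamma
from statistics import mean, variance


def fnbd_pmf(n, N, a, k):
    """P(n individuals in a cell covering fraction a), given N individuals
    in total and aggregation parameter k (assumed beta-binomial form)."""
    b = k / a - k  # second shape parameter, k(1 - a)/a
    log_choose = lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
    log_p = (log_choose
             + lgamma(n + k) + lgamma(N - n + b) + lgamma(k / a)
             - lgamma(k) - lgamma(b) - lgamma(N + k / a))
    return exp(log_p)


def estimate_k(counts, a):
    """Moment estimate of k from per-quadrate counts of one species.

    Returns None when no positive k exists, e.g. when the counts are not
    overdispersed or when all individuals fall in a single quadrate.
    """
    n_bar = mean(counts)
    s2 = variance(counts)  # sample variance, denominator m - 1 (assumption)
    denom = s2 - (1 - a) * n_bar
    if denom <= 0:
        return None
    k = ((1 - a) * n_bar ** 2 - a * s2) / denom
    return k if k > 0 else None
```

Smaller k means stronger aggregation. A species confined to one quadrate yields no positive estimate under this moment approach, which is one way an aggregation parameter can fail to be estimable.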

Relationship between explained variation accounting for space and the aggregation patterns of species in the community
Estimation of the aggregation parameter for rare species still encounters fitting difficulties here (no convergence), in contrast to previous work [5]. I also fitted the aggregation parameter of the negative binomial distribution, and the results were essentially identical. Thus, only the result for the finite negative binomial distribution is presented (Table 1). In general, the aggregation of species tends to increase as the smallest grid cell size is increased.
The relationship between the explained variation from RDA and the aggregation parameter k is presented in Table 1. The results show that as the aggregation of species in the community increases, space explains a larger percentage of the variation in community composition.
Interestingly, the results highlight that the effective species number (the number of species for which a reasonable estimate k > 0 can be obtained) does not change under the multiscaling process. The number is fixed at 150, as shown in Table 1. Thus, over the rescaling procedure, the aggregation patterns of some species seem invariant. Typically, these are the species whose aggregation parameter k is difficult to estimate. When the distribution of each species is checked, most species without a reasonable estimate of k are those with singular distributions in the BCI quadrates.

Aggregation and rarity patterns of species under varying spatial resolutions
Contrary to the prediction mentioned above, increasing spatial resolutions does not increase the rarity of species. Instead, many species become more common when spatial resolutions are increased. A likely reason is the reduced number of quadrates, which lowers the measured rarity of species.
Reducing the rarity of species in a community can better reveal spatial or environmental gradients [25]. Thus, it is not surprising in my study that the explained variation for space increases when community rarity is reduced.

Relationships between explained variations accounted for space and environment and percentages of rarity of species in the community (R10, R30, R50)
Community rarity thus might be an indicator for assessing the role of spatial autocorrelation. As shown in Table 1, increasing spatial resolutions decreases the community rarity indices (R10, R30, R50), while the explained variation attributed to spatial autocorrelation increases remarkably. There is a tight negative association between the rarity of species in the community and the explained variation for space. Thus, it is rational to use the rarity of the community to predict the relationship between explained variation for space and the compositional variation of the community [26].
I take the average of the aggregation parameter k for all the species to represent the community aggregation status when analyzing the relationship between aggregation and rarity of the species community.

Example: Distribution of 225 tree species in 40 Barro Colorado Island (BCI) sampling plots
The distribution of tree abundances across the BCI sampling plots is derived from a previous study [20]. Originally there are 100 sampling sites, but I only consider sampling plots with a standard square design (as presented in Figure 1). Therefore, the resulting matrix has a dimension of 225 × 40, with 225 tree species and 40 sampling sites. Spatial variables used for the analysis include the geographic coordinates (latitude and longitude) of the sampling plots. Environmental variables, including precipitation, elevation, plot age and geology, are essentially identical across the 40 sampling plots, indicating an environmentally homogeneous landscape [21][22][23]. Thus, these 40 plots allow the contribution of space to the community structure of trees to be tested while ignoring environmental variability.
All the analyses are done using R software [24].

Explained variation accounting for space as the function of varying spatial resolutions
Since the results from RDA and CCA are essentially the same, and the explained variations for space obtained with the PCNM method, the cubic regression model and the original geographic coordinates are highly similar, only the results for RDA with raw coordinates as input are discussed hereafter.
As shown in Table 1, increasing the smallest spatial unit for the BCI data leads to an increase in the explained variation attributed to space. Thus, when the spatial units are larger, the spatial distributional range of species is typically larger.

Relationship between explained variation accounted for space and percentages of rarity of species in the community (R10, R30, R50)
Increasing the smallest spatial unit for the BCI data leads to decreasing rarity of species in the community, as indicated by the R10, R30 and R50 indices (Table 1).

Relationships between explained variation accounted for space and the aggregation patterns of species in the community
Interestingly, increasing aggregation of species distributions is found to increase the variation explained by space. This observation is congruent with my primary prediction. The aggregation status of species may reflect the spatial clustering patterns of the species in the area. Thus, more aggregated species should have a higher clustering probability, leading to a stronger effect of spatial autocorrelation. In this way, the variation explained by space should increase accordingly.

Limitations of the present study
First, the robustness of the analysis may be questionable because of the limited data size. In the present study, only 5 quadrates are available as the response variable at the largest scale, while only 10 quadrates are available at the second largest scale. These small sample sizes limit the reliability of the corresponding findings.
Second, rarity and aggregation may not change independently of each other when spatial resolution changes, in which case the results cannot be interpreted reliably. This issue is plausible, as indicated by the results presented in Table 1: the correlation between these two quantities is strong across spatial resolutions. However, no prior knowledge assumes a relationship between rarity and aggregation under changing spatial resolution. Therefore, their non-independent change under varying spatial resolution can be one of the causes but not the only cause. In the present study, it is not possible to remove the non-independent change of the two quantities.
Finally, the empirical study system (BCI tree plots) used in the present study may be too specific to address the topic at a general level. Because it is situated in a single local area, the results may not generalize to field plots in other regions. This limitation could be addressed by exploring and comparing empirical data from other permanent forest plots worldwide.