Optimizing sampling effort and information content of biodiversity surveys: a case study of alpine grassland

Aims: Current rates of biodiversity loss do not allow for inefficient monitoring. Optimized monitoring maximizes the ratio between information and sampling effort (i.e., time and costs). Sampling effort increases with the number and size of sampling units. We hypothesize that an optimal size and number of sampling units can be determined that provide maximal information with minimal effort. We apply an approach that identifies the optimal size and number of sampling quadrats. The approach can be adapted to any study system. Here we focus on alpine grassland, a diverse but threatened ecosystem.
Location: Gran Paradiso National Park, Italy.
Methods: We sampled nine 20 m × 20 m plots. Each plot consisted of 100 subplots of 2 m × 2 m. Species richness and Shannon diversity were quantified for different sizes and quantities of subplots. We simulated larger subplot sizes by unifying adjacent 2 m × 2 m subplots. Shannon's information entropy was used to quantify information content among richness and diversity values resulting from different subplot sizes and quantities. The optimal size and number of subplots is the lowest size and number of subplots returning maximal information. This optimal subplot size and number was determined by Mood's median test and segmented linear regression, respectively.
Results: The information content among richness values increased with subplot size, irrespective of the number of subplots. Therefore, the largest subplot size available is the optimal size for information about richness. Information content among diversity values increased with subplot size if 18 or fewer subplots were considered, and decreased if at least 27 subplots were sampled. The subplot quantity consequently determined whether the smallest or largest subplot size available is the optimal size, and whether the optimal size can be generalized across richness and diversity. Given a size of 2 m × 2 m, we estimated an optimal quantity of 54. Given a size of 4 m × 4 m, we estimated an optimal number of 36. The optimal number of plots can be generalized across both indices because it barely differed between the indices given a fixed subplot size.
Conclusions: The information content among richness and diversity values depends on the sampling scale. Shannon's information entropy can be used to identify the optimal number and size of plots that return the most information with the least sampling effort. Our approach can be adapted to other study systems to create an efficient in-situ sampling design, which improves biodiversity monitoring and conservation under rapid environmental change.

changes of alpine plant communities (Steinbauer et al., 2018), efficient surveys and monitoring are urgently needed to explicitly inform climate-smart conservation management and policy (Rands et al., 2010). Efficient vegetation sampling gathers the most ecological information with the least sampling effort, i.e., in a short time and at low cost (Stenzel et al., 2017).
Information content of ecological data is strongly dependent on temporal and spatial scales (Chave, 2013; Levin, 1992; Peterson and Parker, 1998; Rosenzweig, 1995; Storch et al., 2008; Wiens, 1989). Patterns of species diversity vary with the spatial scale of observation, with the species-area relationship being the most fundamental example (Arrhenius, 1921). Biotic drivers of species diversity generally tend to be more important at smaller scales, whereas abiotic drivers predominate at larger scales (Götzenberger et al., 2012). In vegetation science, a single, well-founded and effective sampling design is still missing. The disagreement on an ideal sampling design can be traced back to the fundamental question of the minimal area representing plant communities (Hopkins, 1957). In particular, the quantity, size, shape and spatial configuration of sampling units (i.e., plots) control species diversity estimates (Bacaro et al., 2015; Chiarucci et al., 2001; Dengler, 2009; Güler et al., 2016; Keeley and Fotheringham, 2005; Kenkel et al., 1989; Stohlgren, 2007). A non-directional plot shape such as a quadrat is expected to return the most phytosociological richness of homogeneous stands with weak ecological gradients (Bacaro et al., 2015), but recommended sizes still vary by a factor of 10⁵. Often rules of thumb are used, such as the indication that plot size should be roughly proportional to vegetation height (Chytrý and Otýpková, 2003). In view of the difficulties of finding a consistent sampling design, some authors suggest an operational approach, with the sampling scale decided on the basis of clear and repeatable criteria rather than vegetation characteristics (Chiarucci, 2007; Palmer and White, 1994).
Here we aim to identify an optimal size and number of plots that return the most information about species diversity with the least sampling effort. We do not analyze the relationship between sampling design and species diversity, but between sampling design and the information content among species diversity estimates. Diversity is thereby quantified for square plots of different size and quantity. We define the optimal size and number of plots as the smallest size and lowest quantity at which a maximum of information among species diversity values can be obtained with a minimum of sampling effort (Fig. 1). We hypothesize that, with increasing plot size and quantity, information content first increases and then levels off, following the causation of the species-area relationship (W. R. Turner and Tjørve, 2005): an increasing sampling area, expressed by increasing plot size or quantity, means that a higher relative proportion of diversity is recorded, which results in increasing redundancy among diversity values. Information content subsequently levels off. Sampling effort is basically determined by the number and size of plots. The more numerous and the larger the plots are, the higher is the sampling effort in terms of time and costs. As a case study, we sampled alpine grassland communities. We used Shannon's information entropy as a measure of information content captured in diversity metrics. Two fundamental metrics of biodiversity were applied, which express different types of information: species richness and Shannon diversity (i.e., including species abundances). To our knowledge, information entropy has not been used like this before, but see Bogaert et al. (2005) for an entropy-based analysis of landscape fragmentation or Turner et al. (1989), who describe a rapid loss of information for rare and dispersed land cover types with increasing sampling size. We applied a methodological approach that can be easily adapted to any study system.
This makes our investigation of general interest for ecologists and conservationists.

Study area and sampling design
The study area is located in the Gran Paradiso National Park in northwestern Italy (Fig. 2a). This alpine environment is characterized by low human impact due to the long history of protection. The sampling covers three vegetation subtypes of alpine grassland that were identified with the support of the CORINE Land Cover map from 2012 (available at https://land.copernicus.eu/pan-european/corine-landcover) and expert knowledge: 'pure' natural grassland, sparsely vegetated 'rocky' grassland (on rocks, scree or gravel) and 'wet' grassland (wetlands). Each vegetation subtype was sampled in three valleys (Bardoney, Colle del Nivolet, Levionaz) between 2200 and 2700 meters a.s.l. (Fig. 2b), which resulted in one plot per vegetation subtype and valley. Consequently, nine plots were sampled in total.
We applied quadrats because we did not observe strong ecological gradients at any plot location (Bacaro et al., 2015). For that reason, quadrats mitigate the confounding effect of environmental heterogeneity on species diversity (Dengler, 2008). The plots were established on flat terrain. Each of the nine plots had an extent of 20 m × 20 m (400 m²) and was subdivided into 100 subplots measuring 2 m × 2 m (Fig. 2c). The percentage cover (abundance) of each plant species (including mosses and lichens) was estimated for each subplot. Cover estimates reflect the mean of two independent estimates by two people to reduce observer bias (Klimeš, 2003). The vegetation survey was conducted at the peak of the yearly vegetation development during August 2015. Species were identified using 'Flora Helvetica' (Lauber and Wagner, 1998), 'Flora Vegetativa' (Eggenberg and Möhl, 2009), 'Flora Alpina' (Aeschimann et al., 2004) and 'Guida alla flora della Valle d'Aosta' (Bovio et al., 2008).

Species diversity indices
The first fundamental measure of diversity that we applied is species richness R. The second index is the Shannon diversity index (Shannon, 1948), which incorporates species richness and abundance. The non-exponential version with the natural logarithm that we utilized is given by formula 1:
$$H = -\sum_{i=1}^{R} p_i \ln p_i \qquad (1)$$
The number of species is given by R and the relative abundance of the ith species by $p_i$. The Shannon diversity H quantifies the uncertainty of selecting any species from the plot by chance. The Shannon diversity is maximal when each species within a plot is equally abundant.
Fig. 1. Theoretical background to identify the optimal size and number of sampling plots. The plot size and quantity determine the sampling effort because the size and number of plots mainly determine the time and financial resources needed for sampling. The optimal plot size or quantity retrieves a maximum of information content with minimal sampling effort.
Here the percentage cover of each species was used as a measure of the relative abundance because the number of individuals cannot be recorded for clonal plants without destruction. Plants with a cover of less than 1% were set to 0.5% cover to simplify the statistical analyses. Species-specific mean cover was used to calculate the Shannon diversity H. We used the diversity-function within R package "vegan" (Oksanen et al., 2018) to calculate the Shannon diversity H.
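As a minimal sketch of the index calculation described above, the following Python snippet computes the Shannon diversity from percentage cover. It is not the diversity-function of "vegan" that was used in this study. The function names, the species labels and the order of operations (first applying the 0.5% rule per subplot, then averaging cover across the subplots of a union) are assumptions made for illustration.

```python
import math


def adjust_cover(cover_percent):
    """Drop absent species and set covers below 1% to 0.5% (rule from the text)."""
    return {sp: (0.5 if c < 1 else c) for sp, c in cover_percent.items() if c > 0}


def mean_cover(subplots):
    """Species-wise mean cover over the subplots of a union (absent species count as 0)."""
    species = set().union(*subplots)
    return {sp: sum(s.get(sp, 0.0) for s in subplots) / len(subplots) for sp in species}


def shannon_diversity(cover_percent):
    """H = -sum(p_i * ln(p_i)), with p_i taken as cover_i / total cover (assumption)."""
    total = sum(cover_percent.values())
    return -sum((c / total) * math.log(c / total)
                for c in cover_percent.values() if c > 0)


# Hypothetical subplot records: species label -> estimated cover in percent.
subplot_a = adjust_cover({"species_1": 40.0, "species_2": 0.3, "species_3": 0.0})
subplot_b = adjust_cover({"species_1": 20.0, "species_4": 10.0})

richness = len(mean_cover([subplot_a, subplot_b]))            # species richness R of the union
diversity = shannon_diversity(mean_cover([subplot_a, subplot_b]))
```

For four equally abundant species the sketch returns ln 4 ≈ 1.386, which is the maximum attainable for R = 4.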

Shannon's information entropy of species diversity indices
Information theory, of which Shannon's information entropy is an integral part, is widely applied in the scientific fields of mathematics, statistics and system dynamics. The seminal work of Shannon (1948) has gained broad application in these fields and is also widely used in ecology as a metric of species diversity (Shannon diversity, see Section 2.2). Shannon's information entropy is a central term in information theory. It quantifies the amount of information given by a number of entities (Shannon, 1948). Information entropy increases with decreasing redundancy among entities. Ecologists and conservationists prefer to apply the size and number of sampling plots that provide the most information about species diversity. If information entropy saturates with plot size and quantity, the smallest size and lowest number of plots that still provide the most information would be preferred (Fig. 1) because sampling effort in terms of time and costs increases with the number and size of plots.
Shannon's information entropy H was originally formulated with the binary logarithm (base 2) instead of the natural logarithm used in formula 1, as given by formula 2:
$$H = -\sum_{i=1}^{R} p_i \log_{2} p_i \qquad (2)$$
with $p_i$ being the frequency of occurrence of entity i of R unique entities. Shannon's information entropy was derived from the idea of quantifying the information content given by letters (i.e., entities) within a text message. Shannon species diversity incorporates species abundances instead of letter abundances. Here we used the different values of a diversity index (species richness or Shannon diversity) as entities i. The value of entropy (i.e., information content) depends on the number of unique entities (e.g. letters, species or unique values of a diversity index) and their frequencies of occurrence. Entropy is non-negative and reaches its maximum when all entities (e.g. unique index values) occur equally often.
The more decimal digits the values of a diversity index carry, the fewer identical index values are found, so entropy increases, which would bias our analyses. Since the measurement accuracy of species cover was limited to 2 decimal digits (e.g. 25%), we considered 2 decimal digits to be a reasonable measurement accuracy throughout the entire entropy analysis. Furthermore, the absolute values of information entropy cannot be directly compared between different diversity indices because of the different scaling of the indices. For a valid comparison, which is not the intention of this study, metrics must be standardized before computing information entropy. The entropy was calculated using the entropy-function in R-package "entropy" (Hausser and Strimmer, 2014).
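The effect of rounding on the entropy of index values can be illustrated with a short Python sketch. It is an illustrative stand-in, not the entropy-function of the R package used here. The function name, the example values and the choice to expose the logarithm base as a parameter are assumptions; changing the base only rescales the result.

```python
import math
from collections import Counter


def information_entropy(values, digits=2, base=2.0):
    """Entropy of the frequency distribution of index values rounded to `digits` decimals."""
    counts = Counter(round(v, digits) for v in values)   # unique index values = entities
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


# Hypothetical Shannon diversity values of four subplots.
index_values = [1.234, 1.231, 2.0, 2.004]

coarse = information_entropy(index_values, digits=2)   # two classes of equal frequency
fine = information_entropy(index_values, digits=3)     # four distinct values
```

With 2 decimal digits the four values collapse into two equally frequent classes (1 bit), whereas 3 decimal digits keep all four values distinct (2 bits), which shows how the chosen precision alone can raise the entropy.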
We calculated the information entropy H of a given diversity index and subplot size on the basis of m × 9 randomly selected subplots, i.e., m from each of the nine 10 × 10-plots. By varying m, we simulated different numbers of sampled subplots. Thereby, we only selected the subplot-unions within a 10 × 10-plot that do not share any 1 × 1-subplot to guarantee independent values for the entropy calculation. Consequently, max(m) equals 100 for subplot size 1 × 1; max(m) = 25 for 2 × 2, max(m) = 9 for 3 × 3, max(m) = 4 for 4 × 4 and 5 × 5, and max(m) = 1 for 6 × 6, 7 × 7, 8 × 8, 9 × 9 and 10 × 10.
Fig. 2. Geographical location of the study area. a) Gran Paradiso National Park is located in the European Alps, northwestern Italy. b) Nine sampling plots were established, one per alpine grassland subtype in each of three valleys (Colle del Nivolet, Levionaz and Bardoney). c) The sampling plot was designed as a 20 m × 20 m square (size 10 × 10), subdivided into 100 subplots of 2 m × 2 m (size 1 × 1). Different plot sizes from 1 × 1 to 10 × 10 were simulated by unifying adjacent 2 m × 2 m-subplots.
Furthermore, given n subplot-unions within a 10 × 10-plot, there are $\left(\prod_{k=0}^{m-1}(n-k)\right)^{9}$ possibilities to combine m subplots from each of the nine 10 × 10-plots. We consequently repeated this random subplot selection procedure 10,000 times to represent an appropriate proportion of the number of possible combinations. However, repetitions of the random selection procedure were not necessary for subplot size 1 × 1 and m = 100, for 2 × 2 and m = 25, and for 5 × 5 and m = 4, because these configurations already incorporated all independent subplot-unions available within a 10 × 10-plot in one single selection run. We finally computed 10,000 entropy values for each diversity index (species richness and Shannon diversity), for each subplot size (from 1 × 1 to 10 × 10) and for varying m: from m = 1 to m = 24 as well as for m = 30, m = 36, m = 42, m = 48, m = 60, m = 72, m = 84, m = 96 and m = 99; we did not calculate entropy values of subplot size 1 × 1 for all subplot quantities m due to long computation time. Each of the 10,000 entropy values was thus calculated on the basis of m × 9 values (entities) of a diversity metric (species richness or Shannon diversity).
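The selection constraints described above can be sketched in Python as follows. The snippet reproduces the max(m) values and the number of possible combinations. It also draws non-overlapping subplot-unions with a simple greedy procedure with restarts. This drawing procedure and all function names are assumptions for illustration, not the procedure of the R code in the appendix; for tight packings (e.g. 5 × 5 with m = 4) the greedy draw may need many restarts.

```python
import math
import random

GRID = 10  # a plot consists of 10 x 10 basic subplots (size 1 x 1)


def max_m(k, grid=GRID):
    """Largest number of k x k unions that fit into the grid without sharing a subplot."""
    return (grid // k) ** 2


def n_combinations(n, m, plots=9):
    """(prod_{k=0}^{m-1} (n - k)) ** plots: ordered draws of m out of n unions per plot."""
    return math.perm(n, m) ** plots


def draw_unions(k, m, rng, grid=GRID, max_tries=1000):
    """Randomly pick the origins of m non-overlapping k x k unions (greedy with restarts)."""
    if m > max_m(k, grid):
        raise ValueError("m exceeds the number of non-overlapping unions")
    origins = [(r, c) for r in range(grid - k + 1) for c in range(grid - k + 1)]
    for _ in range(max_tries):
        rng.shuffle(origins)
        chosen, used = [], set()
        for r, c in origins:
            cells = {(r + i, c + j) for i in range(k) for j in range(k)}
            if used.isdisjoint(cells):       # keep only unions sharing no 1 x 1 subplot
                chosen.append((r, c))
                used |= cells
                if len(chosen) == m:
                    return chosen
    raise RuntimeError("no valid selection found within max_tries")


sizes = {k: max_m(k) for k in range(1, 11)}
selection = draw_unions(k=2, m=5, rng=random.Random(0))  # one draw for a single plot
```

Repeating such a draw for each of the nine plots, and the whole procedure 10,000 times, would yield the m × 9 index values per repetition on which each entropy value is based.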
In addition, the effect of the spatial dispersion of sampling units on sampling outcomes is often neglected (but see Chiarucci et al., 2009; Dengler and Oldeland, 2010). The larger the area of sampling units becomes or the larger the distance between (i.e., extent among) sampling units is, the more species will be detected due to the distance-decay of similarity between species communities (Steinbauer et al., 2012). We accounted for the effects of the species-area relationship (Dengler, 2008) and the species-extent relationship (Güler et al., 2016) on the sampling results by randomly selecting a given number of subplots m from each of nine plots that cover a constant area and extent. As mentioned above, we repeated this probabilistic sampling procedure 10,000 times to take the large variety of available subplot combinations into consideration.

Statistical analyses
To identify the optimal subplot size for a given number of subplots, we compared the 10,000 entropy values between subplot sizes via Mood's median test (pairwiseMedianTest-function in R package "rcompanion"; Mangiafico, 2016). The optimal number of subplots for a given subplot size was quantified using breakpoint analyses via piecewise regression. The segmented-function inside R-package "segmented" (Muggeo, 2003) was used to apply piecewise regression to the 5th, 50th and 95th percentiles of the entropy distributions. The segmented linear regression fits two separate yet contiguous linear regression lines to the sampling points before and after an estimated breakpoint, which is based on the maximum likelihood of model parameterization. The breakpoint analyses of the 5th, 50th and 95th percentiles provided a confidence interval for the median breakpoint. As mentioned above, we did not calculate entropy values of subplot size 1 × 1 for all subplot quantities m due to long computation time. However, breakpoint estimation is sensitive to the number of points involved. To include the entire range of m from 1 to 99, we applied the breakpoint analysis for subplot size 1 × 1 to predicted entropy values from a local polynomial regression model (loess-function in R-package "stats"; R Development Core Team, 2016). The local regression model precisely fitted a regression line to the points. Each subplot quantity m from 1 to 99 could thus be related to an accurately predicted entropy value. These predicted entropy values were then used to detect the breakpoint along the relationship between the predicted entropy values and the subplot quantities m. The R-code is given in the appendix. The dataset is stored at the Dynamic Ecological Information Management System - Site and Dataset Registry (DEIMS-SDR; Wohner et al., 2019) under the UUID b549ff14-f40f-4749-8e2f-f16f6e523753 (see https://deims.org/dataset/b549ff14-f40f-4749-8e2f-f16f6e523753).
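The idea of fitting two contiguous regression lines around a breakpoint can be sketched in Python. The snippet below swaps in a brute-force grid search over candidate breakpoints with ordinary least squares. It is not the iterative estimation of the "segmented" package and provides no confidence interval. The function names, the candidate grid and the synthetic data are assumptions for illustration.

```python
def solve3(a, b):
    """Solve a 3 x 3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit_breakpoint(xs, ys, candidates):
    """Fit y = b0 + b1*x + b2*max(0, x - psi) for each candidate psi; keep the best fit."""
    best = None
    for psi in candidates:
        if not min(xs) < psi < max(xs):      # breakpoint must lie inside the data range
            continue
        rows = [(1.0, x, max(0.0, x - psi)) for x in xs]
        ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
        coef = solve3(ata, aty)              # least-squares solution of the normal equations
        sse = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2 for r, y in zip(rows, ys))
        if best is None or sse < best[0]:
            best = (sse, psi, coef)
    if best is None:
        raise ValueError("no candidate breakpoint inside the data range")
    return best[1], best[2]


# Synthetic example: entropy rises linearly up to m = 6 and then levels off.
ms = list(range(1, 13))
entropy_values = [0.5 * m if m <= 6 else 3.0 for m in ms]
breakpoint_m, coefficients = fit_breakpoint(ms, entropy_values, candidates=range(2, 12))
```

In this synthetic case the slope after the breakpoint, b1 + b2, is zero, which corresponds to the levelling-off of information content that defines the optimal number of subplots.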

Species richness and Shannon diversity
Species assemblages within the plots were generally representative for alpine grasslands, but specific dominance and abundance patterns were observed in the three valleys and vegetation subtypes. At Bardoney, pure grasslands were dominated by Nardus stricta, Trifolium alpinum and Carex curvula, whereas the wetlands were dominated by Nardus stricta, Carex bicolor and Salix herbacea. At Colle del Nivolet, Oxytropis helvetica was the dominating species in the rocky subtype along with Silene acaulis and Festuca alpina, whereas the most abundant species in the pure grassland were Anthoxanthum alpinum, Carex curvula and Geum montanum. In the wetlands, Carex nigra, Eriophorum scheuchzeri and Eleocharis quinqueflora were occurring the most. The rocky plot in Levionaz was dominated by Salix breviserrata, Plantago alpina and various grasses. Plantago alpina was abundant in the pure grassland along with Festuca melanopsis and Hieracium pilosella agg. The wetlands were dominated by Carex flacca and five moss species.
Species richness and Shannon diversity of the 1 × 1-subplots considerably varied within and between the nine 10 × 10-plots, representing diversity of alpine grassland (Fig. 3). Among all nine plots, 247 plant species were recorded. Herbaceous plants were most prominent, comprising 180 species. Up to 50 species of plants were recorded per 10 × 10-plot. A maximum of 33 species was recorded inside a single 1 × 1-plot of pure grassland in the Levionaz valley (Fig. 3a); a minimum of three species was identified inside a single 1 × 1-plot in the wetland of the Colle del Nivolet valley. The Shannon diversity did not necessarily increase with species richness (Fig. 3b); species can be unequally abundant, compensating the positive effect of species richness on Shannon diversity.

Information entropy and subplot size
The information entropy of species richness R generally increased with increasing subplot size irrespective of the number of subplots considered (Fig. 4a). Given that nine subplots are considered in total (m = 1, Fig. 4a), the information entropy between subplot sizes 4 × 4 and 7 × 7 was similar; increasing subplot size did not necessarily increase the information entropy within this range of subplot sizes. For all other m, the entropy significantly increased with growing subplot size.
The information entropy of Shannon diversity H increased with subplot size (Fig. 4b), but only for m ≤ 2. For m = 1, the entropy again formed a plateau along intermediate subplot sizes. For m = 2, the entropy varied only marginally between subplot sizes 1 × 1 and 3 × 3. For m ≥ 3, however, the relationship between entropy and subplot size changed from positive to negative; the information entropy then decreased with increasing subplot size. For m ≥ 4, the information entropy was significantly different between all subplot sizes.
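The significance statements above rest on pairwise comparisons of entropy distributions with Mood's median test. A minimal two-sample version can be sketched in Python as follows. It is not the pairwiseMedianTest-function of "rcompanion". The function name, the tie handling (values equal to the grand median are counted as not above it), the absence of a continuity correction and of any p-value adjustment for multiple comparisons, and the example values are assumptions.

```python
import math
from statistics import median


def moods_median_test(a, b):
    """Two-sample Mood's median test; returns (chi-square statistic, p-value) with 1 df."""
    grand = median(list(a) + list(b))
    # 2 x 2 table: per sample, counts above vs. not above the grand median.
    table = [[sum(v > grand for v in g), sum(v <= grand for v in g)] for g in (a, b)]
    n = len(a) + len(b)
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    if 0 in col or 0 in row:
        return 0.0, 1.0                       # degenerate table: no evidence of a difference
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival function for 1 df
    return stat, p_value


# Hypothetical entropy values for two subplot sizes.
entropy_small = [1.0, 2.0, 3.0, 4.0, 5.0]
entropy_large = [6.0, 7.0, 8.0, 9.0, 10.0]
statistic, p = moods_median_test(entropy_small, entropy_large)
```

Two samples with identical values yield a statistic of 0 and a p-value of 1, whereas the fully separated samples in the example give a statistic of 10 and a p-value below 0.01.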

Discussion
As hypothesized, information content levels off with an increasing number of subplots for both diversity indices and subplot sizes (1 × 1 and 2 × 2). Accordingly, the shape of the relationship between the information entropy and the plot quantity might be universal across plot sizes and diversity indices. In our study on alpine grassland, 54 (6 from each of the nine 10 × 10-plots) was estimated to be the optimal number of 1 × 1-plots that cover the most information about species richness and diversity values by the minimal sampling effort. Regarding 2 × 2-plots, 36 (4 from each of the nine 10 × 10-plots) was the optimal number of plots. Interestingly, the optimal plot quantity did not differ between the species richness and diversity indices. The optimal number of plots can consequently be generalized across both indices given a constant plot size of 1 × 1 or 2 × 2.
Fig. 3. Species diversity within and between the nine 10 × 10-plots. a) Species richness R and b) Shannon diversity H of individual 1 × 1-subplots considerably varied among the three vegetation subtypes (pure, wet and rocky) and valleys (Bardoney, Colle del Nivolet and Levionaz). The horizontal black line within the grey box represents the median. The grey box ranges from the 1st to the 3rd quartile. The upper whisker delimits the 3rd quartile plus 1.5 times the interquartile distance (3rd quartile minus 1st quartile). The lower whisker marks the 1st quartile minus 1.5 times the interquartile distance.
In contrast to our hypothesis, the information entropy did not show such saturating behavior with increasing plot size when the number of plots was kept constant. The information content of the richness estimates clearly increased with increasing plot size. The optimal plot size in terms of richness information is, therefore, the largest plot size that was considered (9 × 9). However, the information content among richness values did not considerably change between the intermediate plot sizes from 4 × 4 to 7 × 7. In other words, information content differed significantly among the very small and among the very large plot sizes. Consequently, the smaller plot sizes do not necessarily provide more information about species richness. This is all the more relevant as mistakes in species sampling have a stronger impact at small plot sizes with lower species diversity (Klimeš et al., 2001). Moreover, the amount of information covered by the diversity estimates increased with increasing plot size for up to 18 plots (i.e., 2 subplots were taken from each of the nine 10 × 10-plots), but decreased if at least 27 plots (i.e., 3 subplots were taken from each of the nine 10 × 10-plots) were considered. Hence, the number of plots determines whether the smallest or largest available plot size is the optimal size for information about diversity.
A trade-off between the optimal plot size and quantity has been detected regarding information obtainable about diversity. Turner et al. (M. G. Turner et al., 1989) showed that information content on the diversity of land cover types grows with an increasing spatial resolution of sampling units. Since that study partly confirms our findings, the general shape of the information-plot scale relationship might be consistent across study objects (e.g. species or land cover types). Our results, however, indicate that the relationship between information entropy and plot size given constant plot quantity is not universal across plot quantities and diversity indices. The optimal plot size for any given number of plots cannot be generalized across both diversity indices. The optimal plot size seems to depend on the number of plots considered and the diversity index applied.
Differences in the scaling of information content with plot size and quantity are driven by different factors. These include the spatial configuration of sampling units (Bacaro et al., 2015; Güler et al., 2016), dispersal mechanisms (Dengler, 2008), species density effects (Condit et al., 1996) and small-scale heterogeneity of environmental conditions (Dengler, 2008). Even at the local scale, species diversity increases with increasing distance between sampling units because habitats and environmental conditions are expected to become more similar with decreasing distance (Chiarucci et al., 2009; Dengler, 2008, 2009; Kunin, 1997; Stohlgren, 2007). Species richness also increases with decreasing dispersal limitations (Hubbell, 2001). Therefore, it is not guaranteed that our findings are simply applicable to other systems of similar diversity levels because resource availability (Olszewski, 2004; Ugland et al., 2003; Wilson and Gitay, 1995) and population dynamics (Pannell, 2012) may idiosyncratically control the spatial distribution of species abundances at small scales. In addition, the regional species pool size may differ, which causes differences in the proportion of the pool that can be detected by local sampling units (Chao and Jost, 2012).
We additionally highlight that the measurement of species abundance (here cover) adds substantial information about species diversity as opposed to measuring species richness only (Gosselin, 2006). The shape of the relationships between information content and plot size differed between both diversity metrics given any constant number of plots (Fig. 4). The reason is that the species richness index weighs all species equally. Species richness responds equally to each additional species occurring, even if species have very low cover (Stohlgren, 2007). Abundance-based measures are less sensitive to rare species whose relative coverage is marginal. Recording species richness only may be less laborious, but Shannon diversity offers additional information about species diversity by incorporating species abundances. We therefore recommend recording species abundances, especially when it comes to monitoring community composition. Because time and funds are limited for conservation management, surveys and monitoring programs should be conducted that maximize the probability of recording the most species diversity with the least sampling effort (Abella and Covington, 2004). Comprehensive conservation action should always be informed by a variety of diversity metrics because different metrics represent different conservation values that are given by areas of conservation concern (Hoffmann et al., 2018a).
Fig. 4. Information entropy versus plot size given a constant number of plots. In a) Shannon's information entropy of species richness R was separately calculated for different quantities of subplots m (number inside grey boxes) that were randomly selected from each of the nine 10 × 10-plots. This random selection procedure was repeated 10,000 times, so that 10,000 entropy values were calculated per subplot size for a given constant number of subplots (see Methods section for details). In b) Shannon's information entropy of the Shannon diversity H was calculated. Boxplots as in Fig. 3. The letters illustrate significant differences (p < 0.05) between entropy distributions using Mood's median test. "All sig." indicates that all entropy distributions are significantly different from each other. For the subplot size 1 × 1 and m = 100 and for 5 × 5 and m = 4, repetitions of the random selection procedure were not reasonable because these configurations already incorporated all independent subplot-unions available within a 10 × 10-plot in one single selection run. They were excluded from Mood's median test.
Our sampling design is restricted to a particular spatial configuration and shape of sampling units. Since the spatial configuration and shape of plots control the species diversity that is sampled (Bacaro et al., 2015; Güler et al., 2016), information entropy of diversity estimates may be affected by the plot shape and spatial arrangement. Moreover, assuming the nine 10 × 10-plots (i.e., an area of 3600 m²) represent the regional diversity of alpine plant communities well, this study provides first estimates of the optimal plot size and number to sample alpine grassland at this regional extent. Nevertheless, it is desirable to enlarge the study area, extent and plot scale (towards larger plots and smaller subplots) in order to confirm our results for alpine grassland in general. The optimal sampling design ultimately depends on the study objectives (Bacaro et al., 2015; Baffetta et al., 2007; Yoccoz et al., 2001). While we focused on the information about local diversity (i.e., alpha diversity sensu Whittaker, 1972) of alpine grassland at the regional extent, a general assessment of the information-scale relationship should consider different scales (from local to global), biotic units, information types (e.g. differentiation diversity sensu Jurasinski et al., 2009) and study objects (e.g. plant functional traits), as done by Whittaker et al. (2001) for the diversity-scale relationship. More data points will allow for a more accurate assessment of the optimal plot size and quantity, even by other methods such as change point analysis (Killick and Eckley, 2014). However, because species diversity increases monotonically with increasing sampling area, Hopkins (1957) already concluded that a minimal area representing maximal diversity is, from an objective point of view, unlikely to exist for any vegetation type. It remains an open question whether this is true for information content.
[…] The information entropy of Shannon diversity H given the subplot size 1 × 1. c) The information entropy of species richness R given the subplot size 2 × 2. d) The information entropy of Shannon diversity H given the subplot size 2 × 2. The curves show the local polynomial regression fits. The solid vertical lines indicate the estimated breakpoints while the stippled vertical lines span the 95%-confidence interval of those breakpoints. The 5th percentile is shown in blue, the median in black and the 95th percentile in red. For the subplot size 1 × 1 and m = 100, and for 2 × 2 and m = 25, repetitions of the random selection procedure were not reasonable because these configurations already incorporated all independent subplot-unions available within a 10 × 10-plot in one single selection run.

Conclusion
Understanding the scale-dependence of the information content of diversity metrics is crucial for efficient research, monitoring and conservation programs, especially for alpine ecosystems vulnerable to rapid environmental changes. An optimal sampling design should always be considered for reasons of temporal and financial efficiency. Apart from that, an optimal in-situ sampling design may also improve biodiversity assessment via Earth observation techniques (Hoffmann et al., 2018b): small-scale in-situ information can be projected to larger extents on the basis of remote sensing data with relatively low effort (Stenzel et al., 2017).
The information content among species diversity estimates is scale-dependent, as we demonstrated for alpine grassland at regional extent. Our approach can be adapted to other study systems. Yet, for some diversity indices and plot quantities, a clear saturation of information content with increasing plot size might not emerge. In that case, the smallest or largest plot size is the optimal one. The generality of our results is restricted to a single vegetation type, a particular sampling design, two diversity metrics, and a limited study area and extent. Hence, research on the scale-dependence of information entropy still offers great potential.