Mapping Plant Functional Groups in Subalpine Grassland of the Greater Caucasus

Plant functional groups—in our case grass, herbs, and legumes—and their spatial distribution can provide information on key ecosystem functions such as species richness, nitrogen fixation, and erosion control. Knowledge about the spatial distribution of plant functional groups provides valuable information for grassland management. This study described and mapped the distribution of grass, herb, and legume coverage of the subalpine grassland in the high-mountain Kazbegi region, Greater Caucasus, Georgia. To test the applicability of new sensors, we compared the predictive power of simulated hyperspectral canopy reflectance, simulated multispectral reflectance, simulated vegetation indices, and topographic variables for modeling plant functional groups. The tested grassland showed characteristic differences in species richness; in grass, herb, and legume coverage; and in connected structural properties such as yield. Grass (Hordeum brevisubulatum) was dominant in biomass-rich hay meadows. Herb-rich grassland featured the highest species richness and evenness, whereas legume-rich grassland was accompanied by a high coverage of open soil and showed dominance of a single species, Astragalus captiosus. The best model fits were achieved with a combination of reflectance, vegetation indices, and topographic variables as predictors. Random forest models for grass, herb, and legume coverage explained 36%, 25%, and 37% of the respective variance, and their root mean square errors varied between 12–15%. Hyperspectral and multispectral reflectance as predictors resulted in similar models. Because multispectral data are more easily available and often have a higher spatial resolution, we suggest using multispectral parameters enhanced by vegetation indices and topographic parameters for modeling grass, herb, and legume coverage. 
However, overall model fits were merely moderate, and further testing, including stronger gradients and the addition of shortwave infrared wavelengths, is needed.


Introduction
Worldwide, high-mountain grasslands are species-rich habitats that include numerous endemic species (Körner 2004) but are commonly highly affected by natural and land use-triggered erosion, land degradation, and land use changes (eg Tasser and Tappeiner 2002; Lehnert et al 2014; Wiesmair et al 2016). The high species richness of subalpine to alpine grasslands results from, and is affected by, long-term agricultural use. During the last decades, central European mountain grassland communities have been altered by the introduction of modern farming practices in grassland management on the one hand, and by the abandonment of agricultural use on the other (Tasser and Tappeiner 2002). Traditional high-mountain land use systems with low input of system-specific organic fertilizers had greatly contributed to a distinct floristic pattern. This changed when mineral fertilizers and more effective agricultural techniques were introduced, making more intensive management regimes applicable to large grassland sites while modifying the traditional mowing and grazing regimes, as well as homogenizing floristic patterns (Homburger and Hofer 2012). The introduction of mineral nitrogen and phosphorus fertilizers caused the greatest change in the floristic composition of grassland and resulted in an increased abundance of ubiquitous species (Bühler and Roth 2011).
In contrast, the subalpine grassland in our study area, the Kazbegi region, Greater Caucasus, Georgia, has been traditionally managed without any mineral fertilizer application (Tephnadze et al 2014). Therefore, near-natural, species-rich, and quite distinct grassland types with a strong relation to topography and land use type dominate the subalpine and alpine landscape of the Kazbegi region (Pyšek and Šrutek 1989; Nakhutsrishvili 1999; Tephnadze et al 2014). Species-rich grassland, characterized by a dense and vertically structured vegetation layer and a diverse and deep root system, contributes to the functioning of the high-mountain ecosystem, especially to erosion control (Pohl et al 2009; Körner 2003).

Mountain Research and Development (MRD): An international, peer-reviewed open access journal published by the International Mountain Society (IMS), www.mrd-journal.org

The legume content of grassland stands is closely linked to various ecosystem functions. By their ability to fix nitrogen, legumes influence the nitrogen pool within the soil system, which further affects the root system as well as biomass production and vegetation cover (Spehn et al 2002), all important factors for erosion mitigation (Tasser and Tappeiner 2002; Lehnert et al 2014; Wiesmair et al 2017). Therefore, detailed spatial knowledge about the distribution of plant functional groups (PFGs), that is, groups of plants (grasses, herbs, legumes) that share similar traits and perform similar ecosystem functions, provides valuable information for grassland management (Blondel 2003).
Previous studies indicate the feasibility of using remotely sensed imagery and topographic information to model grass, herb, and legume coverage (Zha et al 2003; Biewer, Erasmi, et al 2009; Himstedt et al 2009; Psomas et al 2011). However, studies using grass, herb, and legume coverage as a target variable have so far been limited to controlled systems, achieving best results with a homogeneous yield. In contrast, our study was based on seminatural mountain grassland with varying yields and cover.
We characterized grassland composition and structure of the researched grassland types and subsequently modeled and mapped the PFGs' spatial distribution. We further tested whether hyperspectral reflectance (HR), in our case from field spectrometric data, enhanced the model's quality. We therefore aimed to (1) model and map grass, herb, and legume coverage and (2) test whether simulated HR improved the model quality.

Study area
Steep slopes and a harsh continental climate characterize the high-mountain range of the Central Greater Caucasus, Georgia, and especially the environmental conditions of the isolated Kazbegi region (Figure 1). The Tergi River runs north; the main village, Stepantsminda (1700 masl), stretches along its banks. West of the river, Mount Kazbeg (5033 masl) rises as the highest summit in the region (Ketskhoveli et al 1975). The climate of the valley is relatively continental, with long, cool summers and winters with low snow cover. The mean annual temperature is 4.7°C, leading to a vegetation period of 5 to 6 months. The mean annual precipitation at 1850 masl amounts to 806 mm (Nakhutsrishvili 1999; Lichtenegger et al 2006). The bedrock of the study area comprises Jurassic sediments (clay schists), Quaternary volcanic rocks (andesite and dacite), and Quaternary pyroclastic deposits and fluvial sediments. Younger, Pleistocene glacial sediments as well as Holocene peats can also be found (Akhalkatsi et al 2006). The main soil types on the higher part of slopes are shallow Leptosols, used mainly as pastures, whereas on the lower slopes and accumulation areas, depending on the bedrock, moderately deep Cambisols can be found; these are located mainly close to the villages, where they are used as meadows or potato fields (Tephnadze et al 2014).
The landscape is characterized by large, low-productivity, pastured grassland alternating with small remnants of birch forests (Betula litwinowii) and shrubberies. The grassland of north-facing slopes exhibits a relatively high biomass but is often characterized by unpalatable plant species such as Veratrum lobelianum or Festuca varia (Nakhutsrishvili 2012). On alluvial fans and close to the villages, young hay meadows occur on former organically fertilized arable fields characterized by Hordeum brevisubulatum (Tephnadze et al 2014). Older, nonfertilized, and more species-rich hay meadows grow on steeper slopes and farther away from the villages. We found no indication of the application of mineral fertilizers in our study region. For a detailed description of the study area, see Nakhutsrishvili (1999), Magiera et al (2013), and Tephnadze et al (2014).

Vegetation data
In summer 2014, the grassland vegetation within walking distance of 6 selected villages in the Kazbegi Valley (Stepantsminda, Gergeti, Pansheti, Sioni, Phkelsche, and Goristhikhe) was sampled in a stratified random design including low-, medium-, and high-productivity sites (strata). Exact locations of the plots, however, were chosen randomly. In order to avoid edge effects, we sampled only large homogeneous grassland patches at a minimum distance of 50 m from each other.
The vegetation composition of 90 plots, each covering 25 m², was assessed using the modified Braun-Blanquet scale and including all vascular plant species. The nomenclature follows The Plant List 1.1 (2013). Furthermore, we recorded the total vegetation cover as well as the cover of open soil and bare rocks. The cover percentage and height of the upper and lower herb layers were assessed separately. In order to estimate the cover fractions of the functional plant groups (grass [Poaceae, Juncaceae, Cyperaceae], legume [Fabaceae], and herb [all other species]), the Braun-Blanquet scale was transformed to cover percentages (r = 0.6%, + = 1.2%, 1 = 2.5%, 2m = 5%, 2a = 10%, 2b = 20%, 3 = 40%, 4 = 80%, 5 = 160%). We summarized the coverage of all species belonging to each functional group and used this as 100% coverage for comparison (van der Maarel 2007). We further identified the most dominant (mean coverage of 5%) and frequent (present in at least 30% of the vegetation relevés) species within the 3 previously defined grassland vegetation types: H. brevisubulatum meadow, Gentianella caucasea grassland, and Astragalus captiosus grassland. In this paper we use the term grassland where the land-use type is ambiguous (haymaking and pasturing), whereas the term meadow is used where haymaking occurs.
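The transformation described above can be illustrated with a minimal Python sketch. It assumes one reading of the normalization step, namely that the transformed covers of all species in a plot are summed and each group's share is expressed relative to that sum (so the 3 groups add up to 100%). The function names and the example plot are hypothetical and are not taken from the study.

```python
# Cover percentages per Braun-Blanquet class, as listed in the text.
BB_TO_PERCENT = {
    "r": 0.6, "+": 1.2, "1": 2.5, "2m": 5.0, "2a": 10.0,
    "2b": 20.0, "3": 40.0, "4": 80.0, "5": 160.0,
}

# Family-to-group assignment as given in the text; everything else is "herb".
GRASS_FAMILIES = {"Poaceae", "Juncaceae", "Cyperaceae"}
LEGUME_FAMILIES = {"Fabaceae"}


def functional_group(family):
    """Return 'grass', 'legume', or 'herb' for a plant family name."""
    if family in GRASS_FAMILIES:
        return "grass"
    if family in LEGUME_FAMILIES:
        return "legume"
    return "herb"


def group_fractions(releve):
    """Relative cover (%) per functional group for one plot.

    releve: list of (family, braun_blanquet_code) tuples, one per species.
    Assumption: shares are scaled so that the three groups sum to 100%.
    """
    totals = {"grass": 0.0, "herb": 0.0, "legume": 0.0}
    for family, code in releve:
        totals[functional_group(family)] += BB_TO_PERCENT[code]
    overall = sum(totals.values())
    if overall == 0:
        raise ValueError("plot contains no recorded cover")
    return {group: 100.0 * cover / overall for group, cover in totals.items()}


# Hypothetical plot with three species.
example_plot = [("Poaceae", "3"), ("Fabaceae", "2b"), ("Asteraceae", "2b")]
fractions = group_fractions(example_plot)
```

In this made-up plot, the transformed covers are 40, 20, and 20, so grass accounts for half of the summed cover and legumes and herbs for a quarter each.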
To depict the main floristic gradients, we performed a nonmetric multidimensional scaling (NMDS; Kruskal 1964) ordination. Ordination is a commonly used tool to reduce the n-dimensional vegetation dataset to lower dimensional floristic gradients. NMDS was chosen as an ordination method because it is a robust, distance-based method that accurately displays the ordinally scaled vegetation data. We calculated an NMDS ordination with 3 dimensions using the monoMDS function of the R package vegan 2.4-1 (Oksanen 2011). An NMDS was calculated for the plant species composition of the plots based on Bray-Curtis distances as a distance measure (Bray and Curtis 1957). The NMDS axes were rotated by principal component rotation, so that the new axis 1 pointed in the direction of the largest variance (Clarke 1993).
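As an illustration of the distance measure underlying the ordination, the following Python sketch computes Bray-Curtis dissimilarities between plots from species cover values. It is not the vegan implementation used in the study; the function names and the example data are hypothetical.

```python
def bray_curtis(plot_a, plot_b):
    """Bray-Curtis dissimilarity between two plots.

    plot_a, plot_b: cover values of the same species, in the same order.
    Returns 0 for identical plots and 1 for plots sharing no species.
    """
    if len(plot_a) != len(plot_b):
        raise ValueError("both plots must list the same species")
    numerator = sum(abs(a - b) for a, b in zip(plot_a, plot_b))
    denominator = sum(a + b for a, b in zip(plot_a, plot_b))
    if denominator == 0:
        raise ValueError("both plots are empty")
    return numerator / denominator


def distance_matrix(plots):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(plots)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = bray_curtis(plots[i], plots[j])
    return matrix


# Hypothetical cover values (%) of three species in three plots.
plots = [[40.0, 20.0, 0.0], [20.0, 20.0, 20.0], [0.0, 0.0, 10.0]]
distances = distance_matrix(plots)
```

A matrix of this kind is the input that an NMDS routine then arranges in a low-dimensional space so that the rank order of the dissimilarities is preserved as well as possible.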
Moreover, we tested structural vegetation parameters for significant differences between the 3 grassland vegetation types, using a Kruskal-Wallis analysis of variance and a Nemenyi test for multiple comparisons of rank sums implemented in the R package PMCMR 4.1.

Preprocessing of hyperspectral field spectrometric data, satellite imagery, and topographic data
We tested hyperspectral field spectrometric data and multispectral satellite imagery, including vegetation indices (VIs) and topographic data, for modeling grass, herb, and legume coverage. Compared to the coarse spectral resolution of multispectral data, commonly including 3 to 10 discrete bands, the high spectral resolution of hyperspectral data allows for a higher flexibility in the selection of spectral features. VIs are either ratios or linear combinations of sensor bands that aim to enhance the vegetation signal and allow conclusions on the status and condition of vegetation (Jackson and Huete 1991).
In mid-July 2014, at the time of the highest biomass, we acquired hyperspectral field spectrometric canopy reflectance using a handheld field spectrometer (ASD HH2), covering a range of 325-1075 nm (750 wavelengths, with a 1-nm resolution) of the solar electromagnetic spectrum. The measurements were taken from the same 5 × 5-m plots as the vegetation relevés. To cover the entire plot, we took 4 measurements per plot with 5 repetitions (each measurement with an internal averaging of 50 spectra), totaling up to 20 spectra per plot, collected with a 25° conical field of view. The measurements were collected close to the solar noon, on days with clear sky and low wind speed. Atmospheric changes were accounted for by measuring relative to a white standard panel (Spectralon, Labsphere Inc., North Sutton, NH), with a recalibration at least every 5 minutes. During preprocessing, the 20 spectra sampled per plot were averaged and filtered. A Savitzky-Golay filter with a quartic polynomial and a filter length of 51 nm was used to smooth the spectra (Savitzky and Golay 1964).
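The smoothing step can be sketched in Python as follows. This is a generic Savitzky-Golay smoother, not the software used in the study. It assumes that the 51-nm filter length corresponds to 51 samples at 1-nm spacing, and, as a simplification, it leaves the first and last 25 values of each spectrum unsmoothed. All function names are hypothetical.

```python
from fractions import Fraction


def savgol_weights(window, order):
    """Savitzky-Golay weights for the central point of an odd window.

    The weights come from a least-squares polynomial fit of the given order;
    the normal equations are solved exactly with rational arithmetic.
    """
    if window % 2 == 0 or window <= order:
        raise ValueError("window must be odd and larger than the polynomial order")
    half = window // 2
    offsets = range(-half, half + 1)
    n = order + 1
    # Normal equations (A^T A) z = e0 for the design matrix A[i][j] = offset_i ** j.
    ata = [[Fraction(sum(x ** (r + c) for x in offsets)) for c in range(n)]
           for r in range(n)]
    rhs = [Fraction(1)] + [Fraction(0)] * order
    # Gauss-Jordan elimination; A^T A is positive definite, so pivots are nonzero.
    for col in range(n):
        for row in range(n):
            if row != col and ata[row][col] != 0:
                factor = ata[row][col] / ata[col][col]
                ata[row] = [a - factor * b for a, b in zip(ata[row], ata[col])]
                rhs[row] -= factor * rhs[col]
    z = [rhs[i] / ata[i][i] for i in range(n)]
    return [float(sum(z[j] * x ** j for j in range(n))) for x in offsets]


def mean_spectrum(spectra):
    """Average several spectra (lists of equal length) wavelength by wavelength."""
    return [sum(values) / len(values) for values in zip(*spectra)]


def savgol_smooth(values, window=51, order=4):
    """Smooth a spectrum; the outer window // 2 values are copied unchanged."""
    weights = savgol_weights(window, order)
    half = window // 2
    smoothed = list(values)
    for i in range(half, len(values) - half):
        smoothed[i] = sum(w * values[i - half + k] for k, w in enumerate(weights))
    return smoothed
```

A plot spectrum would then be obtained as `savgol_smooth(mean_spectrum(plot_spectra))`, where `plot_spectra` stands for the (up to) 20 spectra recorded on one plot. Because the filter fits a quartic polynomial locally, any signal that is itself a polynomial of degree 4 or lower passes through unchanged.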
Filtered field spectrometric reflectance measurements were used to test the applicability of hyperspectral sensors compared to multispectral sensors for modeling PFGs by simulating the bands of the sensors AISA Eagle (hyperspectral, 400-970 nm) and RapidEye (multispectral, 440-840 nm). PFGs have already successfully been predicted by AISA dual data as pollination types and as vegetation types by moderate-resolution imaging spectroradiometer data (Sun et al 2008). Moreover, Lehnert et al (2013) have used hyperspectral data to discriminate grass from nongrass, but studies using multispectral data to model plant functional types are rather scarce. Both sensors were chosen because they cover a similar spectral range and offer high spatial resolution. To calculate the spectral signal of the AISA Eagle sensor, we cut the wavelengths of the AISA sensor (118 wavelengths) out of the hyperspectral field spectrometric data, whereas the function simulatoR and the spectral response curve were used to simulate RapidEye reflectance.
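The principle of simulating broad sensor bands from a field spectrum can be sketched as a response-weighted average. The Python example below is not the simulatoR function used in the study. As a loudly stated assumption, it replaces the real RapidEye spectral response curves with rectangular (boxcar) responses spanning the nominal band limits listed for the sensor; all function names and the example spectrum are hypothetical.

```python
# Nominal RapidEye band limits in nm (Weichelt et al 2011, as cited in the text).
RAPIDEYE_BANDS = {
    "blue": (440, 510),
    "green": (520, 590),
    "red": (630, 685),
    "red_edge": (690, 730),
    "nir": (760, 850),
}


def boxcar(lower, upper):
    """Rectangular response: 1 inside the band, 0 outside (an assumption)."""
    return lambda wavelength: 1.0 if lower <= wavelength <= upper else 0.0


def simulate_band(wavelengths, reflectance, response):
    """Response-weighted mean reflectance of one band."""
    weighted_sum = 0.0
    weight_total = 0.0
    for wavelength, value in zip(wavelengths, reflectance):
        weight = response(wavelength)
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0:
        raise ValueError("spectrum does not overlap the band")
    return weighted_sum / weight_total


def simulate_rapideye(wavelengths, reflectance):
    """Simulated reflectance for all 5 bands of one field spectrum."""
    return {
        name: simulate_band(wavelengths, reflectance, boxcar(lower, upper))
        for name, (lower, upper) in RAPIDEYE_BANDS.items()
    }


# Hypothetical step-shaped spectrum: low reflectance below 700 nm, high above.
wavelengths = list(range(325, 1076))
reflectance = [0.05 if wl < 700 else 0.45 for wl in wavelengths]
bands = simulate_rapideye(wavelengths, reflectance)
```

With measured response curves, `boxcar` would be replaced by a function that interpolates the tabulated sensor response at each wavelength; the weighted average itself stays the same.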
Multispectral, space-borne imagery was acquired on 21 June 2014 by the RapidEye sensor. This sensor provides information on canopy reflectance in 5 bands (blue 440-510 nm, green 520-590 nm, red 630-685 nm, red edge 690-730 nm, and near infrared [NIR] 760-850 nm; Weichelt et al 2011). The imagery was orthorectified (product level 3-A) and converted to top-of-atmosphere reflectance. Differences in illumination due to the topography were corrected with a cosine topographic correction (Teillet et al 1982). Besides the 5 original bands, we included a set of previously published VIs in the analysis (Magiera et al 2017; Herrmann et al 2010). All indices were calculated by using the R raster package Version 2.5-8.
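The cosine topographic correction mentioned above can be sketched per pixel as follows. The example uses the commonly published form of the correction, in which reflectance is scaled by the ratio of the cosine of the solar zenith angle to the cosine of the local illumination angle. The function names, the handling of self-shadowed pixels, and the example angles are assumptions made for illustration.

```python
import math


def cos_incidence(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the angle between the sun and the normal of a tilted surface."""
    slope = math.radians(slope_deg)
    zenith = math.radians(sun_zenith_deg)
    azimuth_difference = math.radians(sun_azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(slope)
            + math.sin(zenith) * math.sin(slope) * math.cos(azimuth_difference))


def cosine_correction(reflectance, slope_deg, aspect_deg,
                      sun_zenith_deg, sun_azimuth_deg):
    """Scale reflectance to the value expected on horizontal terrain.

    Returns None for pixels that receive no direct light (assumed handling).
    """
    cos_i = cos_incidence(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg)
    if cos_i <= 0:
        return None
    return reflectance * math.cos(math.radians(sun_zenith_deg)) / cos_i


# Hypothetical pixel on a 30-degree slope facing away from the sun.
shaded_slope = cosine_correction(0.3, slope_deg=30, aspect_deg=330,
                                 sun_zenith_deg=30, sun_azimuth_deg=150)
```

Slopes facing away from the sun are brightened and slopes facing the sun are darkened. The correction is known to overcorrect weakly illuminated pixels because it ignores diffuse light.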
We included topographic data from a digital elevation model (DEM) with a 20 × 20-m resolution; we then calculated derivatives from that DEM, eastness, northness (Zar 1998), and slope (Horn 1981), as well as plan curvature, mean curvature, profile curvature, solar radiation, compound topographic index (Gessler et al 1995), heat load index (McCune and Keon 2002), and surface relief ratio (Pike and Wilson 1971) with the ArcMap 10.2.1 tool box and the Geomorphometry and Gradient Metrics Toolbox version 1.0. The topographic data were selected to reflect those environmental conditions induced by the terrain that are known to impact vegetation characteristics (Moeslund et al 2013). We extracted the VIs and topographic variables for the positions of each vegetation relevé.
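Three of the derivatives named above can be sketched in a few lines of Python. The example assumes the usual definitions: eastness and northness as the sine and cosine of the aspect angle, and slope from a 3 × 3 elevation window with the weighted finite differences commonly attributed to Horn (1981). It is not the ArcMap toolbox implementation, and the function names and the example window are hypothetical.

```python
import math


def eastness(aspect_deg):
    """Sine of aspect: 1 for east-facing, -1 for west-facing slopes."""
    return math.sin(math.radians(aspect_deg))


def northness(aspect_deg):
    """Cosine of aspect: 1 for north-facing, -1 for south-facing slopes."""
    return math.cos(math.radians(aspect_deg))


def horn_slope_deg(window, cell_size):
    """Slope in degrees at the center of a 3 x 3 elevation window.

    window: rows ordered north to south, values ordered west to east.
    cell_size: grid spacing in the same unit as the elevations.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical window on a 20-m grid: terrain rises 20 m per cell toward the east.
example_window = [[0, 20, 40], [0, 20, 40], [0, 20, 40]]
example_slope = horn_slope_deg(example_window, cell_size=20)
```

Splitting aspect into eastness and northness avoids the discontinuity of a circular variable, where 359° and 1° are numerically far apart although both face almost due north.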

Modeling the vegetation structure
We tested the predictive power of hyperspectral canopy reflectance against multispectral reflectance (MR) for modeling PFGs. Moreover, we enhanced the multispectral model by using VIs and topographic variables.
As a modeling technique, we chose random forest regression (Breiman 2001), an ensemble method belonging to bagged machine learning as implemented in the R package randomForest version 4.6-12 (Liaw and Wiener 2002). The random forest regression algorithm requires no assumption about data distributions; therefore, transformations are not necessarily needed (Breiman 2001; Liaw and Wiener 2002). The algorithm can capture nonlinear data structures that are often inherent in vegetation data. Moreover, it is robust towards outliers and can handle noise introduced by many predictor variables. The error rate of a random forest is assessed via out-of-bag estimation. The importance of a variable is assessed as the percentage increase in the mean square error (MSE) that results when the values of that variable are permuted in the out-of-bag data (Liaw and Wiener 2002). Models were validated by a 100-fold bootstrapping procedure using the full dataset, as the sample size was relatively small. Adjusted R² as well as the root MSE were calculated for the relationship between predicted versus observed data. All 3 resulting maps were stacked and plotted in the red-green-blue (rgb) color code with r = legume coverage, g = grass coverage, and b = herb coverage.
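The two fit statistics named above can be written out as a short Python sketch. It assumes that R² is the squared Pearson correlation of a linear relationship between observed and predicted values and that the adjustment uses a single predictor; the study does not spell out these details. The function names and the example values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("observed and predicted values must both vary")
    return covariance ** 2 / (var_o * var_p)


def adjusted_r_squared(observed, predicted, n_predictors=1):
    """R squared penalized for sample size and number of predictors."""
    n = len(observed)
    if n - n_predictors - 1 <= 0:
        raise ValueError("too few observations for the adjustment")
    r2 = r_squared(observed, predicted)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical observed and predicted cover values (%).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
error = rmse(observed, predicted)
fit = adjusted_r_squared(observed, predicted)
```

In a bootstrap validation, these statistics would be computed for each resample and then summarized, for example by their mean.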

Grassland
The 90 vegetation relevés contained 177 plant species belonging to 35 families (26 graminoid species, 125 herbaceous species, 22 fabaceous species, and 4 sedge species). The NMDS ordination accurately depicts the 2 main floristic gradients, with a stress level of 0.14 (Figure 2). Most of the original variation in the data (61%) is explained by the first NMDS axis; the second and third axes represent 26% and 0.5%, respectively. The color scheme represents the distribution of grass, herb, and legume in the rgb color space; that is, greenish points represent high grass coverage. Grassland vegetation is characterized by broad transitions, which explains the high species richness but makes it difficult to delineate grassland vegetation types.
The high grass coverage of the H. brevisubulatum meadow was accompanied by significantly higher total cover and yield (Table 1). Dominated by H. brevisubulatum and in the drier and stonier parts by Agrostis vinealis, Trifolium repens, Ranunculus caucasicus, and Ranunculus ampelophyllus, these meadows occurred on deep soils. The high herb coverage detected in the G. caucasea grassland was caused by species richness, mainly of herb species, and by coverage more evenly distributed among single species; many species exhibited an average coverage below 15%. In contrast, dominance was established mainly by grassland matrix species shared with the A. captiosus grassland, such as T. repens, Trifolium ambiguum, Bromus variegatus, and Poa alpina. A high number of legume species, high legume coverage (eg Medicago glutinosa as a dominant species), and an overall low vegetation cover characterized the A. captiosus grassland.

Modeling PFGs
In the models used for mapping the selected PFGs, the most important predictor variables for grass coverage were the red-edge/NIR ratio, the atmospherically resistant vegetation index (ARVI), and the wide dynamic range vegetation index (WDRVI). For herb coverage, the most important predictors were elevation, the red-edge band, and the profile curvature; for legume coverage, the main predictors were elevation, the NIR band, and the modified soil-adjusted vegetation index (MSAVI).
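For orientation, the indices named above can be computed from band reflectances as in the following Python sketch. It uses commonly published formulations (ARVI with a weighting of 1, WDRVI with a NIR weighting coefficient of 0.1, and the closed-form variant of MSAVI). Whether the study used exactly these variants and parameter values is an assumption; the function names and the example reflectances are hypothetical.

```python
import math


def red_edge_nir_ratio(red_edge, nir):
    """Simple band ratio of red-edge to near-infrared reflectance."""
    return red_edge / nir


def arvi(nir, red, blue, gamma=1.0):
    """Atmospherically resistant vegetation index (gamma = 1 is an assumption)."""
    red_blue = red - gamma * (blue - red)
    return (nir - red_blue) / (nir + red_blue)


def wdrvi(nir, red, alpha=0.1):
    """Wide dynamic range vegetation index (alpha = 0.1 is an assumption)."""
    return (alpha * nir - red) / (alpha * nir + red)


def msavi(nir, red):
    """Modified soil-adjusted vegetation index, closed-form variant."""
    term = 2 * nir + 1
    return (term - math.sqrt(term ** 2 - 8 * (nir - red))) / 2


# Hypothetical reflectances of a dense, green canopy.
nir, red, blue, red_edge = 0.5, 0.1, 0.05, 0.3
indices = {
    "red_edge_nir": red_edge_nir_ratio(red_edge, nir),
    "arvi": arvi(nir, red, blue),
    "wdrvi": wdrvi(nir, red),
    "msavi": msavi(nir, red),
}
```

All four indices contrast the strong NIR reflectance of green vegetation with its low reflectance in the visible or red-edge range; they differ mainly in how they dampen atmospheric effects, soil background, or saturation at high biomass.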
The map shows that large areas of grassland were characterized by a high herb coverage, whereas grass-dominated patches were only small and established in close proximity to the settlements (Figure 3B, C, D). In contrast, patches dominated by legumes (mainly A. captiosus) covered larger areas mainly in the floodplains or on steep, south-exposed slopes characterized by open soil and bare rock (Figure 3C).
We further tested whether simulated hyperspectral field spectrometric reflectance, which matches the spectral characteristics of the AISA sensor, enhanced the model quality compared to simulated MR (RapidEye) or a mix of simulated MR, simulated VIs, and topographic variables as predictor variables.
The best-fitting models resulted from a mixed set of simulated MR, VIs, and topographic variables, whereas simulated HR performed equally well to the simulated MR (Table 2).
For the prediction of grass coverage, the blue and green bands played a key role when only MR was used as predictor (Figure 4). Eastness, elevation, and profile curvature were the most successful predictors in MR, VIs, and topographic parameters (TPs). Considering simulated AISA reflectance, wavelengths of 722-727 nm (red edge) resulted in the strongest increase in MSE. Wavelengths in the green part of the electromagnetic spectrum (511 nm) were strong predictors as well. Herb coverage was mostly predicted by the MR blue, green, and red bands, whereas elevation, eastness, profile curvature, and the red-edge/red ratio contributed the most to the MR, VI, and TP models. The variable importance for predicting legume coverage shows a high predictive power of the blue band, the green band, and the NIR band in MR. Strong predictors in MR, VI, and TP were elevation, BWDRVI, and the NIR/green ratio. In HR, strong predictors were found in the blue (405 nm) region of the spectrum and the red edge (731 nm).

Composition of grassland swards and management implications
The tested high-mountain grassland exhibited a vegetation structure common for unfertilized high-mountain grassland (Rudmann-Maurer et al 2008). Within all 3 tested vegetation types, herbs and legumes achieve high coverages compared to central European, nonintensively used grassland that is typically composed of 45% grass, 10% legume, and 45% herb coverage (Voigtlaender et al 1987). Grass coverage was significantly higher in H. brevisubulatum meadows in our dataset but was still within the range of unfertilized farmland, which has been almost lost in central Europe because of intensive farming practices but is in part conserved in high-mountain systems. Moreover, many species typical of central European grassland with a deep-growing root system, such as Rumex obtusifolius, Festuca pratensis, and Geranium sylvaticum, were frequent in our data.
FIGURE caption fragment (ordination plot; beginning of caption lost): The length of the arrow represents the relationship between ordination and gradient, with a significance level of 0.01. Point size is fitted to grassland biomass (maximum biomass = 13.4 t ha⁻¹, minimum biomass = 0.25 t ha⁻¹).
Higher herb coverage characterized the G. caucasea grassland, the most species-rich of all 3 grassland types. This was mostly caused by alternating, low-intensity land use practices: although its biomass is moderate, G. caucasea grassland is mown whenever winter fodder is scarce; it is also pastured in spring with a low cattle density. This diverse, low-intensity land use without any mineral fertilizer has contributed to the species diversity of this grassland type. An abandonment of this management practice would lead to a considerable loss of high-mountain plant diversity (Maurer et al 2006).
Low species diversity, as analyzed for the legume-dominated A. captiosus grassland, and the typically scant vegetation coverage indicate areas potentially prone to erosion, mostly on southeast-exposed slopes (Wiesmair et al 2016). Where erosion had started, only a few species were able to establish themselves and maintain a vegetation cover because of nutrient-poor soil conditions and drought. This highlights the importance of single species, especially the dominant A. captiosus with its immense (90 cm) root length and ability to provide nitrogen to co-occurring species, for mitigating erosion processes (Spehn et al 2002; Caprez et al 2011).
Biewer, Erasmi, et al (2009) found that in sown swards, varying sward age and the closely connected biomass distorted the relationship between VIs and grass and legume content. VIs and topographic variables enhanced the model quality of the multispectral dataset, with elevation and profile curvature being the most important topographic variables because near-natural vegetation mainly follows topographic gradients (Moeslund et al 2013). Moreover, important canopy characteristics, such as vegetation cover and the distribution of grass, herb, and legume species, relate to the topographic gradient, adding to the characteristic reflectance pattern (Pfitzner et al 2006).
We compared modeling results for predicting PFGs with HR and simulated multispectral data to test the potential of hyperspectral imagery for modeling PFGs. Using the resampled field spectrometric data minimized the effects of illumination differences in multidate comparisons (Nilson and Peterson 1994). Moreover, both datasets originated from the same spectra, unlike in a comparison of real satellite imagery where images are often recorded weeks apart. Therefore, we avoided inaccuracies introduced by rapid phenological development in high-mountain regions (Körner 2003). However, the spatial scale as a crucial sensor characteristic was not taken into account: the field of view of the spectrometer covers areas below 1 m², depending on the average height above ground, whereas the rescalable, airborne AISA Eagle imagery has varying pixel sizes (<5 × 5 m) and the spaceborne RapidEye sensor delivers imagery with a pixel size of 5 × 5 m. The number of species in a pixel increases with pixel size; we counteracted this problem by averaging spectra on 5 × 5-m plots. Using actual imagery might result in different model qualities (Meyer et al 2017).
FIGURE 4 Variable importance calculated as percentage increment MSE using (A) simulated MR (RapidEye); (B) simulated MR (RapidEye), simulated VIs, and TPs; and (C) simulated HR (AISA) as predictors for grass, herb, and legume cover.
The high spectral resolution of the hyperspectral data offered only a small advantage when modeling grass, herb, or legume coverage. Hyperspectral reflectance outperformed the MR by only 2-3%, whereas the combination of MR, VIs, and TPs explained another 10% of variance compared to the simple model with MR. However, hyperspectral imagery with a high spatial resolution is costly. The ability of MR to model floristic composition as well as aboveground biomass and vegetation cover (Meyer et al 2017) is generally high. Both data types may be enhanced by including topographic variables, especially in a high-mountain study area. Moreover, spectral information from the shortwave infrared range, which is sensitive to the water and dry-matter content of the leaves, may add valuable information to the models. Such information is, however, available for only a few sensors with high spatial resolution (eg the commercial Worldview 3). Free multispectral imagery as delivered by Sentinel 2 could improve the model quality because it offers 3 red-edge, 2 NIR, and 2 shortwave infrared bands. However, the main pitfall of Sentinel 2 data is the rather coarse spatial resolution (20 × 20-m pixel size) compared to RapidEye (5 × 5-m pixel size). Further testing while using a broader spectral range, as provided by Sentinel 2, is therefore needed.

Conclusion
The high-mountain grassland of the Kazbegi region displays a unique species composition with a high coverage of herbs and legumes, resulting in a typical structure and vegetation cover. Mapping grass, herb, and legume coverage revealed the spatial limitation of grass-rich swards for haymaking and the domination of legumes on large areas with low vegetation coverage, which should be grazed with a low cattle density. Producing grass, herb, and legume cover maps could therefore aid in developing case-sensitive grassland management recommendations for a sustainable, economically and ecologically viable land use concept in these species-rich grasslands. To enhance model fits, further testing, including even stronger vegetation gradients and the addition of shortwave infrared wavelengths, is needed.

ACKNOWLEDGMENTS
We thank the Volkswagen Foundation for generous funding of the interdisciplinary project "AMIES II-Scenario development for sustainable land use in the Greater Caucasus, Georgia," of which this study was a part, as well as the Rapid Eye Science Archive (Project-ID 724) for supplying the satellite imagery. We also thank the German Academic Exchange Service (DAAD) for partially funding the fieldwork conducted by Anja Magiera. We further thank our Georgian and German project partners and colleagues for their generous help, especially Tim Theissen for the utilization of the land use/land cover map and Katja Beisheim and Timothy Bostick for proofreading the manuscript.