The challenge of accurately documenting bee species richness in agroecosystems: bee diversity in eastern apple orchards

Bees are important pollinators of agricultural crops, and bee diversity has been shown to be closely associated with pollination, a valuable ecosystem service. Higher functional diversity and species richness of bees have been shown to lead to higher crop yield. Bees simultaneously represent a mega-diverse taxon that is extremely challenging to sample thoroughly and an important group to understand because of pollination services. We sampled bees visiting apple blossoms in 28 orchards over 6 years. We used species rarefaction analyses to test for the completeness of sampling and the relationship between species richness and sampling effort, orchard size, and percent agriculture in the surrounding landscape. We performed more than 190 h of sampling, collecting 11,219 specimens representing 104 species. Despite the sampling intensity, we captured <75% of expected species richness at more than half of the sites. For most of these, the variation in bee community composition between years was greater than among sites. Species richness was influenced by percent agriculture, orchard size, and sampling effort, but we found no factors explaining the difference between observed and expected species richness. Competition between honeybees and wild bees did not appear to be a factor, as we found no correlation between honeybee and wild bee abundance. Our study shows that the pollinator fauna of agroecosystems can be diverse and challenging to thoroughly sample. We demonstrate that there is high temporal variation in community composition and that sites vary widely in the sampling effort required to fully describe their diversity. In order to maximize pollination services provided by wild bee species, we must first accurately estimate species richness. For researchers interested in providing this estimate, we recommend multiyear studies and rarefaction analyses to quantify the gap between observed and expected species richness.


Introduction
Biodiversity encompasses both species richness and the number of species roles in the community (functional diversity), and is a critical component of ecosystem function and the provision of ecosystem services (Balvanera et al. 2006;Cardinale et al. 2006). The relationship between biodiversity and ecosystem services may be driven by facilitation among different species, complementarity in ecosystem function between species in a diverse community, or the fact that the probability of communities including better service providers increases as we add more species (Loreau and Hector 2001;Cardinale et al. 2002). In particular, there has been a strong focus on the relationship between diversity and ecosystem services provided by pollinators in agroecosystems (Klein et al. 2003;Hoehn et al. 2008;Bl€ uthgen and Klein 2011;Carvalheiro et al. 2011;Garibaldi et al. 2011;Albrect et al. 2012;Rogers et al. 2014). Recent studies have shown strong support for this relationship, emphasizing the positive correlation between crop yield and functional bee diversity in pollinator-dependent crops (Hoehn et al. 2008;Albrect et al. 2012;Rogers et al. 2014;Martins et al. 2015).
Unfortunately, biodiversity is very difficult to document accurately, especially for diverse arthropod taxa, because species can be difficult to detect. Additionally, ecological communities, including plant-pollinator communities, tend to comprise few abundant and many rare species (Kunin and Gaston 1993;Olesen and Jordano 2002;Russo et al. 2011). This could be especially important with regard to functional diversity, particularly if the rare species comprise a distinct functional component of the community (Petchey and Gaston 2006). It is critical to invest resources into sampling thoroughly over a long time period in order to accurately document the diversity of an ecosystem and subsequently the functional contribution of diversity. However, there is no consensus on the length of time that is sufficient to describe species richness in taxa that exhibit temporal variation in community composition. In species-rich systems, such as the tropics, biodiversity is only possible to quantify accurately after extensive effort (Longino et al. 2002). Bees can be particularly challenging taxa to document effectively because they are diverse, cryptic, and exhibit substantial year-to-year variation (Wilson et al. 2008;Grundel et al. 2011).
In this study, we address the question of what constitutes sufficient sampling to elucidate the relationship between diversity and function in New York apple orchards. New York is the second largest producer of apples in the United States, and apple is an economically important crop for the region (USDA NASS 2013). In addition, apples are selfincompatible and require insect vectors for pollination; thus, a diversity of pollinator taxa, predominantly bees, is critical to crop yield (Free 1964;Garratt et al. 2014). Although peak apple bloom lasts just 2 weeks, apple represents a high quality resource that attracts a diversity of bee species ( Fig. 1; Gardner and Ascher 2006;Park et al. 2010;Watson et al. 2011;Sheffield et al. 2013;Mallinger and Gratton 2014). Maintaining the diversity of wild bees may be important to ensure adequate pollination services (Aizen and Harder 2009), but, to date, we do not have a rigorous, exhaustive survey of orchard bee diversity. For this reason, over a 6-year period, we conducted more than 760 unique sampling events in New York apple orchards during peak apple bloom.
To address the question of how much sampling is required to accurately document bee diversity within apple orchards during peak bloom, we used sample-based rarefaction plots. Using this method, we evaluate the degree of sampling required to fully document the diver-sity at individual orchards and across the study region and discuss reasons for variation in the amount of necessary sampling. Lessons learned from this study have implications for other studies of bee diversity and its relationship with pollination services in agroecosystems.

Materials and Methods
We used aerial net collections to survey bee diversity in 28 apple orchards across the Finger Lakes region of western New York during peak bloom, which typically takes place over a period of 2 weeks in early to mid-May in this region (Fig. 2). We collected bees on or near apple blossoms, as well as bees traveling between apple trees, on warm (>15°C), sunny days with low wind (sampling details are provided in Park et al. 2015). We used two sampling methods throughout the study. First, we conducted general collections, sampling only wild bees visiting apple trees with the goal of documenting the diversity of wild bees that may be pollinators of apple. The general collections lasted for a minimum of 15 min, but were sometimes longer. Second, we conducted standardized collections over a 15-min time period in which we collected all bees (both wild bees and honey bees) visiting apple trees along a 100-m transect. These standardized collections were meant to provide a comparable measure of abundance and diversity between different orchards. We first analyzed these two methods separately and then pooled them for the purpose of bee diversity assessment. We analyzed 22 of the sites individually and also the pooled samples from all 28 sites. We did not individually analyze the other six sites because we did not have a high enough sample size (i.e., there were fewer than 10 samples at these sites) to evaluate diversity independently in those orchards. Nonetheless, they contributed to our estimate of overall bee diversity in the region. The orchards varied in size from approximately 1-350 acres, and the number of samples per orchard ranged from 14 to 82 (Table 1).
We identified all collected bee specimens to species using published revisions (Mitchell 1960(Mitchell , 1962Ribble 1967Ribble , 1968LaBerge 1969LaBerge , 1971LaBerge , 1973LaBerge , 1977LaBerge , 1980LaBerge , 1987LaBerge , 1989La-Berge and Bouseman 1970;LaBerge and Ribble 1972;Bouseman and LaBerge 1979;McGinley 1986;Gibbs 2010Gibbs , 2011Rehan and Sheffield 2011;Gibbs et al. 2013) and comparison to expertly identified material deposited in the Cornell University Insect Collection (http://cuic.entomology.cornell.edu/). Voucher specimens were deposited in the Cornell University Insect Collection. We used Biota software for database management and exported the data into EstimateS for diversity analyses (version 9.1.0, Colwell 2013). Because our surveys were samples, and we did not collect any bees that were flying when we were not sampling (e.g., below 15°C or on rainy, cloudy, and windy days), we used the nonparametric Chao 1 estimator for species richness to develop rarefaction curves (Chao 1984). The number of expected species is modeled as a function of the number of samples and can tell us what proportion of the expected diversity was captured by our sampling at each locality ( Table 1). The Chao 1 log-linear confidence intervals are asymmetric and the lower bound cannot be lower than the observed number of species (Chao 1987). We compared the observed species richness to the mean Chao 1 estimator (between the lower and upper bounds). Using this estimate, we calculated a proportion of realized richness by dividing the observed number of species by the expected number of species given by the Chao 1 mean. For this reason, the true proportion of species richness observed could be over or underestimated, but the model gives us a reasonable expectation. Localities where the rare- faction curves have reached an asymptote have been sampled sufficiently, whereas localities with sloping rarefaction curves indicate that further sampling is required to characterize the diversity of the bee fauna. Rarefaction methods allow for the standardization and comparison of diversity datasets (Gotelli and Colwell 2001). We calculated the percent agriculture within a 1 km radius of the orchard center with ArcGIS and a Crop-Scape data layer (ESRI 2011, Han et al. 2014. The major landscape types we defined as agriculture include corn, soybeans, wheat, and tree fruit (see Table S1 in supporting information for a comprehensive list). To determine whether the size of the orchard, the percent agriculture within a 1 km radius of the orchard, and the number of sampling events correlated with the species richness of a given orchard, we used linear mixed effect models. We also used linear mixed effect models to test whether size and percent agriculture were predictors of the difference between the number of observed and expected species. Because many of the orchards rent honeybee hives during apple bloom (see Table 1), we also tested for a correlation between honeybee abundance and wild bee abundance in the orchards. In addition, we tested the variation between the community composition of samples within the orchards, between years, and across sites using a Bray-Curtis dissimilarity index.

Results
Over the 6-year sampling period, we conducted a total of 760 sampling events and collected a total of 11,219 bee specimens, representing 104 species (Table S2, supporting information). The species included representatives of five bee families: Andrenidae, Apidae, Colletidae, Halictidae, and Megachilidae. Although the highest diversity of species was recorded in the family Halictidae (41 species), the greatest abundance of bees was in the family Andrenidae (comprising 48% of the specimens) ( Table 2). The single most abundant species was the honeybee (Apis mellifera), at approximately 35% of the total abundance. Sixteen of the orchards brought in rental honeybee hives during apple bloom. However, there was no correlation between honeybee abundance and wild bee abundance (P > 0.05, R 2 = 0.06).
Orchards varied in bee diversity, with a range from 15 to 54 observed bee species and an average of 36.5 AE 2.3 species (Fig. 3). On average, we performed significantly more standardized than general collections in the orchards (t 21 = 5.46, P < 0.01), and we collected significantly more specimens on average during standardized collections (t 21 = 4.08, P < 0.01, Table 1). Combined species richness from standardized and general collections was higher than species richness from standardized collections or general collections alone (t 21 = 10.28, P < 0.01 for combined vs. general, t 21 = 11.70, P < 0.01 for combined vs. standardized). On average, standardized collections resulted in higher observed species richness for orchards than general collections (t 21 = 2.24, P < 0.05). Overall, when we pooled all sites together, the difference in species richness was small (91 species for standardized collections vs. 89 for general collections, Table 1).
When we controlled for the number of times we performed collections at each orchard, the size of the orchard was a significant predictor of observed species richness; larger orchards tended to have fewer species (P = 0.003). When we controlled for both number of collections and the size of the orchard, the percent agriculture in the surrounding landscape was also a significant predictor of observed species richness; more agriculture in the surrounding landscape had a negative effect on species richness (P = 0.01). Neither orchard size nor percent agriculture correlated with the difference between expected and observed species richness or proportion of realized species richness in the pooled methods, or either method individually (P > 0.05 for all tests). Nor was there a relationship between number of collections and proportion of expected richness collected in either type of collection. However, there was a correlation between the number of collections and the species richness of the orchards for all collections pooled (P < 0.05, R 2 = 0.47), for standardized collections only (P < 0.05, R 2 = 0.51), and for general collections only (P < 0.05, R 2 = 0.52). This correlation was no longer significant when we excluded orchards where <75% of expected richness was captured (P > 0.05, Fig. 4). In other words, in fully sampled orchards, there was no relationship between the number of samples and the number of species, as one would expect in a rarefaction curve that had reached an asymptote.
For all collections pooled, in 13 of 22 sites individually analyzed, we captured <75% of the expected richness. In 8 of these 13 sites, estimated species richness did not asymptote in their species accumulation curves, including our most heavily sampled site, which had been sampled 82 times (Fig. 3Q). The lack of an asymptote of the curve means that each new sample is still leading to a higher estimate of species richness, whereas one might expect at a fully described site that new sampling events would not yield new species. At a regional level, with all 28 sites pooled, we captured almost 90% of the expected bee diversity, suggesting our regional sampling has been effective. In fact, the region as a whole was better sampled than any individual site, suggesting that the variation between sites was lower than the variation between samples. In other words, the bee fauna is relatively stable across the geographic region sampled. The highest ratio of observed to expected species richness (and hence the best sampling) for both methods together occurred in two orchards both at 87% (Fig. 3D and G). On the other hand, we sampled one orchard (Fig. 3R) 25 times across the 6 years and only recovered 40% of the expected species richness. We captured 75% or more of the expected species richness in 7 sites using just standardized collections and 8 sites using just the general collections (Fig. S1). The proportion of expected species richness captured varied between 38 and 94% for standardized collections and 22 and 96% for general collections. Interestingly, the most poorly sampled orchard depended on the method used (although the most poorly sampled was orchard R for both general collections and all collections pooled); the same is true of the best sampled orchard for each method. Indeed, the worst sampled orchard of the standardized collections was the best sampled of the general collections (Table 1L, Fig. 3L). In all but 6 of the 22 sites (B, D, J, K, L, and O), variation in community composition of bee species between years was higher than the variation between sites across the region, as measured by the Bray-Curtis dissimilarity index. At four of the six sites where the temporal variation was lower than the regional variation, the observed species richness was <75% of the expected species richness.

Discussion
Our study entailed more than 190 h of sampling across 28 orchards distributed over an area of approximately 357,000 hectares during a 6-year period (2008)(2009)(2010)(2011)(2012)(2013). For the region as a whole, we captured nearly 90% of the Chao 1 expected bee diversity, but within individual orchards, our sampling success varied. Despite the thoroughness of our sampling, the rarefaction curves for many of our sites did not achieve an asymptote (even in some heavily sampled sites; Fig. 3), and in 13 of the 22 sites analyzed individually, we captured <75% of the expected species richness (Table 1).
Our results demonstrate that, despite substantial sampling effort, it can be very challenging to thoroughly document pollinator diversity in agroecosystems. For some questions, a rough estimate of bee diversity may be sufficient, but for studies quantifying a relationship between biodiversity and ecosystem function, or for those attempting to elucidate community level structures, an accurate estimate of species richness could be essential (Gotelli and Colwell 2001;Balvanera et al. 2006). Poorly sampled sites may lead to a profound misunderstanding of community function and services. In particular, we cannot understand the relationship between the functional diversity of pollinators and the ecosystem service of pollination in agroecosystems without an accurate estimate of species richness (Balvanera et al. 2006). In addition, many recent studies have focused on applying network theory as a tool for understanding the function of pollinator communities There is a significant correlation between richness and samples for all collections combined (P < 0.05, R 2 = 0.47), standardized collections (P < 0.05, R 2 = 0.51), and general collections (P < 0.05, R 2 = 0.52), but this correlation is not significant for orchards where we captured 75% or more of the expected species richness (dark circles) (P > 0.05). (e.g., Bl€ uthgen and Klein 2011;Russo et al. 2013;Winfree et al. 2014); however, such studies would be especially susceptible to insufficient sampling effort because network structure is heavily dependent on network size (number of species) (Dormann et al. 2009). It is therefore essential to use species richness estimators to determine how well a given site or ecosystem has been sampled and then to account for the potential gap between observed and expected species richness. We still do not know why the amount of sampling required in individual sites varies; however, it is possible that the nature of the study system contributes. Bee populations can be highly variable and dependent on climatic conditions. For example, the abundance of univoltine bees is highly dependent on the previous year's population size and winter survival. The flowering season of apples is short, just 2-3 weeks a year. This means that the sampling window is short each year and there could be a high potential for phenological mismatch between individual bee species and the apple trees in any given year (but see Bartomeus et al. 2013). In addition, the timing of the sampling period (i.e., apple bloom) is in early May, which is early spring in central New York. Thus, the weather is poor on many days and further restricts the number of days for sampling as well as the activity of the bees, which may explain the pronounced year-to-year variation we observed. This means that for a highly variable fauna, we also have highly variable sampling.
We were able to show that there is a relationship between sampling and species richness for undersampled sites. For orchards in which we have realized <75% of the expected species richness, there is a correlation between species richness and the number of collections, but this relationship is not significant for orchards for which we have realized more than 75% of the expected species richness. This correlation is intuitive because, for undersampled sites, the rarefaction curve still has a positive slope. When it plateaus (generally above 75% of expected species richness), the correlation between sampling and species richness disappears. Neither orchard size nor percent agriculture in the surrounding landscape bears a relationship to the proportion of expected species richness in our apple orchards, so the mechanism driving the need for some orchards to be more heavily sampled is still unclear. The results of this study do agree with the heterogeneity required in sampling intensity in other systems, suggesting that sampling intensity should be considered in experimental design when the study taxa are diverse or dynamic (e.g., Shapiro et al. 2014). Temporal variation may drive some of the differences in the sampling required to fully describe the bee diversity in each orchard, although four of the six sites where the year-to-year variation was lower than the regional variation were un-dersampled. Variation in our ability to capture the expected species richness may also be due to more variable weather patterns at some sites, changes in the local landscape over the course of our study, or other differences in orchard management practices.
To the best of our knowledge, this is the most thorough existing study of bee diversity in apple orchards and perhaps one of the most rigorous of bee diversity in any agroecosystem. Despite the often broad geographic context of these studies, they tend to be restricted to a single year (but for notable exceptions with 3 year studies see Chacoff and Aizen 2006;Greenleaf and Kremen 2006;Tuell et al. 2009), while our results suggest that, for most sites, between-year variation is greater than between-site variation, at least for apple orchards. Our study also demonstrates the highest recorded bee diversity for apple orchards (another study shows higher richness [114 spp.] when multiple habitat types are included [Sheffield et al. 2013]). Reported values vary between 29 and 92 species. At the same time, because we collected samples of applevisiting bee diversity, rather than entire bee populations, we will have necessarily missed some bees. The rarefaction analyses we employ help to quantify the difference between the diversity that we would expect and the diversity that we captured.
While many studies employ pan trapping (Sheffield et al. 2013;Mallinger and Gratton 2014), or aerial netting and pan trapping (Watson et al. 2011), our sampling was restricted to aerial netting. We targeted bee species that were actively foraging within apple orchards, which allowed us to focus on potential pollinators. Pan trapping may find a still greater bee species richness, as it could attract bees moving through the orchards but not actively foraging on apple (Shapiro et al. 2014). In other words, our study may represent a subset of the total background bee community. However, the species which do not visit apple blossoms are unlikely to be effective apple pollinators. In published studies of bee diversity in agroecosystems (reviewed in Kennedy et al. 2013), only the blueberries (Tuell et al. 2009) have a comparable bee diversity to the apple orchards in this study. Tuell et al. (2009) also used pan traps to capture specimens, which could include a background community of bees that are not visiting the blueberry flowers or actively pollinating the crop. Whether this disparity is due to true differences in bee diversity or differential sampling effort among studies is unknown.

Conclusions
It is possible that accurate estimates of bee species richness could be necessary to quantify the relationship between diversity and function in an ecosystem. Rare species can be very important for ecosystem function (Lyons and Sch- wartz 2001) and can change the structure of the community (V azquez et al. 2007). Thus, these estimates may be especially important when detailed community interactions are taken into consideration, such as in plant-pollinator networks. Greater sampling may change our perception of ecosystem function, including network structure. We have shown that the level of sampling required to get an accurate representation of bee diversity in an agroecosystem is often greater than would be expected and that year-to-year variation in these diverse taxa could result in deceptively low numbers of observed bee diversity. We recommend standardized, multiyear sampling protocols as well as samplebased rarefaction methods to assess sampling adequacy for future studies of pollinator diversity and pollination services in agroecosystems.

Supporting Information
Additional Supporting Information may be found in the online version of this article: Figure S1. Rarefaction curves for the 22 orchards individually analyzed (A-V), excluding orchard C, where we only conducted standardized transects. Table S1. Land cover types included in classification of agricultural areas. Table S2. List of bee species collected in the orchards, their abundance, and the number of orchards they were collected in.