Influences of horizontal and vertical aspects of land cover and their interactions with regional factors on patterns of avian species-richness

We examined how both horizontal and vertical aspects of land-cover diversity influence patterns of avian species-richness across North America. Using count data from Breeding Bird Survey routes within the conterminous USA and land-cover data from the National Land Cover Data-set, we analyzed relationships between species-richness estimates, vegetative strata, landscape diversity and elevation and geographic position using both linear-regression models and a classification and regression tree. We found that latitude, the diversity of land-cover classes present, and the proportion of the landscape containing cover-classes representing 3 vegetative strata had the strongest influence on species richness. This illustrates that, while broad-scale biodiversity trends are strongly influenced by dominant regional factors, they are also sensitive to the structure of the intermediate-level landscape. Thus, factors at multiple scales must be considered when modeling spatial patterns of biodiversity such as avian species-richness. Subjects: Animal Ecology; Conservation Environment Studies; Biodiversity & Conservation; Geographic Information Systems; Biogeography

ABOUT THE AUTHORS Dr Brian M. Napoletano's research interests are primarily in the complex interactions between human systems and biophysical patterns and processes, particularly in the context of land change and its underlying politico-economic drivers. Bryan Pijanowski has a PhD in Zoology from Michigan State University. His work focusses on the relationship between land use/cover, biodiversity of animals, hydrology and food security at landscape to continental scales. He employs spatial technologies such as GIS, remote sensing and spatial statistics to understand these relationships. His work has taken him all across North America, Central America, South America, Africa and Asia. Dr Dunning studies the impact of habitat change across large landscapes on native wildlife species. He has used field surveys, simulation modeling and restoration projects (as a form of large-scale experiment) to record the impact of anthropogenic change of birds of forest, grasslands and wetlands.

PUBLIC INTEREST STATEMENT
Explaining the complex patterns of biodiversity at different spatial scales remains a significant challenge for scientists, while the growing numbers of species eliminated or adversely affected by human activities makes the task increasingly urgent. We compared levels of bird diversity estimated from the National Breeding Bird Survey in 1992 to land-cover and geographic factors to determine how useful landscape-level information is to inferring habitat-level attributes, and how important these factors are when compared to regional-level factors across the contiguous USA. Although the regional-level factors explained much of the variations in the diversity of bird species across the USA, the more local-level factors of landscape diversity and habitat complexity were also important, meaning that factors at all levels need to be integrated to satisfactorily explain largescale patterns of species diversity. Thus, efforts to model biodiversity patterns for conservation or to predict the effects of environmental changes must consider factors at multiple scales.

Introduction
The spatial variation in biological communities is a major research topic in applied geography, landscape ecology, biogeography, and community ecology (Costanza, Moody, & Peet, 2011;Elton, 2001;de Roos & Sabelis, 1995;Wilson, 2000), and spatial variation is believed to be as important as temporal variation to the regulation of ecosystem processes (Ma, Zuckerberg, Porter, & Zhang, 2012;Turner & Cardille, 2007). Knowledge of the spatial variability of biodiversity is also fundamental to the identification of priority areas for conservation interventions, such as the designation and delineation of "protected areas" (Carroll, Noss, Paquet, & Schumaker, 2003;del Carmen Sabatini, Verdiell, Rodríguez Iglesias, & Vidal, 2007;Reid, 1998;Soulé, 1991), the identification of "gaps" in existing conservation arrangements (McKendry & Machlis, 1993), and in shifting the focus in conservation from the preservation of individual species and habitats to that of maintaining functional ecosystems (Frost, Campbell, Medina, & Usongo, 2006;Naveh, 1994;Wilson, 2002). Over the last 10,000 years, humans have emerged as a dominant force driving the evolution of terrestrial landscapes, with the rate of transformation increasing dramatically in the last half-century (Foley et al., 2005;Turner, Gardner, & O'Neill, 2001). This widespread and complex transformation of terrestrial landscapes-particularly rapid in the cases of agricultural expansion and urbanization (Pijanowski & Robinson, 2011;Seto, Güneralp, & Hutyra, 2012;Tilman et al., 2001)-is believed to be one of the most significant factors contributing to contemporary, accelerated rates of biodiversity loss (Sala et al., 2000;UNCBD, 2010), but biodiversity forecasts focused on the anticipated effects of climate change outnumber those focused projected land-change scenarios (Titeux et al., 2016). Given the dominant role humans now play in shaping the planet's terrestrial landscapes and the immediate and profound implications of land change for biodiversity, a more thorough understanding of how local landscape features influence regional biodiversity patterns should be an important research priority (Sala et al., 2000;USNRC, 2001).

Avian species-richness as a measure of biodiversity
Avian species-richness, typically considered an α-level variable in the traditional biodiversity hierarchy (MacArthur, 2004;Whittaker, 1972;Whittaker, Willis, & Field, 2001), is commonly used in studies of biodiversity patterns over macro-scales (Gaston & Williams, 1993) because it tracks well with other dominant biodiversity factors at these scales (Gaston & Blackburn, 1995). However, it does not capture sub-species variations-which may be declining at a rate three orders of magnitude greater than that of species extinctions (Rands et al., 2010)-and is neutral in regards to species composition and attributes (e.g. evenness, rarity, endemism) that may be important in other studies (e.g. trophic structures, sink-source delineation, predator-prey models) and to conservation and management (Storch, Marquet, & Brown, 2007;Wilson, 2000). The advantage to this neutrality, however, is that species-richness estimates tend to be more robust and readily corrected for differences in detection probabilities, and therefore more accurately reflect one aspect of biodiversity at larger scales and in sampling situations-such as the North American Breeding Bird Survey (BBS)-where the requisite corrections for detection and sampling bias (i.e. each species, each route, each observer) would pose significant logistic difficulties (Gaston & Spicer, 1998;Ibáñez et al., 2006). Thus, species richness is best considered as a widely used, basic indicator of biodiversity that is, however, necessarily incomplete (Lévêque & Mounolou, 2003).
More generally, the dominant drivers of species richness tend to be highly scale-dependent (Böhning-Gaese, 1997). Studies of species richness at or near the habitat level tend to identify factors related to habitat complexity, resource quality, and community interactions as dominant drivers (Goetz, Sun, Zolkos, Hansen, & Dubayah, 2014;Harris, Milligan, & Fewless, 1983;Karr & Roth, 1971), although which variables best reflect these factors tends to vary with the type of habitat considered, and different groups of species tend to respond differently to each variable (e.g. Coppedge, Engle, Masters, & Gregory, 2001;Donovan & Flather, 2002;Flather & Sauer, 1996). Explanations of regional patterns of species richness tend to polarize into positions maintaining that either such patterns can ultimately be explained by one (i.e. climate) or no more than a few variables, or that the ensemble of determinant variables depends on the region considered and must be established empirically (Currie, 2007).

Hypothesized effects of horizontal landscape diversity and vertical stratification on avian species-richness
Rather than mutually exclusive categories, these two positions can be reconciled by positing that both habitat-level and regional-level explanations provide important insights into broad-scale biodiversity patterns, and cross-scale integration of such variables could be valuable to modeling and regional analysis (Brown & Sax, 2004;Storch et al., 2007). Therefore, we chose to focus on landscape diversity (as indicated by a Shannon index of the number of different land-cover classes within 5 km of each BBS route) and vertical stratification (as indicated by the proportion of land-cover classes reflecting differing numbers of vegetative strata) as landscape-or meso-level variables, but to use them to infer general information about habitat-level diversity, and to use geographic coordinates and topography (mean elevation and standard deviation) as surrogates to regional-level variables to determine whether and how landscape variables influence regional species-richness trends alongside stronger, regional-level climatic and topographic influences. We hypothesized that, insofar as landscape diversity and vertical stratification reflect increased habitat diversity within the landscape, higher values of either variable will generally correspond to higher levels of species richness. We also stipulated, however, that the exact form of this relationship may be non-linear, as landscape-diversity values beyond a given threshold may begin to reflect habitat fragmentation rather than richness, while a similar threshold for any one strata class may conversely indicate a lower degree of overall landscape diversity. MacArthur (1964) demonstrates that vertical habitat features are more important where overall landcover diversity is low, as the presence of multiple patches of different land-cover types introduces a corresponding number of different vertical profiles into the landscape. Accordingly, we hypothesized that a model including both horizontal land-cover diversity and vertical stratification (i.e. vegetation class) would indicate that land-cover diversity is more important than strata class where the value of the former is high, and that strata class would become more important where land-cover diversity is low.
Attempts to characterize the relative diversity of a landscape commonly employ metrics derived from information theory such as diversity or richness indices, but frequently with a focus on horizontal heterogeneity that does not reflect potentially important heterogeneity in the vertical plane (Farina, 1998;Turner et al., 2001). At finer scales, a series of recent studies explicitly include vertical habitat structure through the use of LiDAR (e.g. Davies & Asner, 2014;Pekin & Pijanowski, 2012;Vierling, Vierling, Gould, Martinuzzi, & Clawges, 2008), but the 0.5-3 m resolution of LiDAR data (Storch et al., 2007) reflects a much finer scale representation than land-cover data derived from 30 m resolution imagery, and LiDAR coverage is sparse compared to land-cover data (Goetz et al., 2014). This study is original in that it uses land-cover variables to indirectly infer habitat-level factors and geographic position and topography to indirectly represent regional factors, and then compares the influence of both sets of factors on species-richness estimates sampled across the conterminous USA.

Data
Data were synthesized from three complementary national data-sets: (1) avian species-richness estimates from BBS routes sampled across the conterminous USA, (2) elevation data from the United States National Aeronautics and Space Administration (NASA) Shuttle Radar Topographic Mission's (SRTM) 90-meter Digital Elevation Model (DEM) and (3) land-cover characteristics from the 1992 National Land-Cover Data-set (NLCD) [The data-sets used in the statistical analyses are available in the supplementary material].

BBS sample data
The BBS is a macro-scale, long-term inventory of avian species trends, and is one of the longest-running species surveys at the subcontinental scale. Started in 1966 by researchers at the USGS's Patuxent Wildlife Research Center, it is used to monitor species trends throughout the conterminous USA and includes 3,000 survey routes that are sampled annually during the spring/summer breeding season. Survey routes are pseudo-randomly situated within one-degree blocks of latitude and longitude along roadways, and each survey route is 40 km (24.5 miles) long. The observer stops every 0.8 km (0.5 miles) along the route for three minutes and records the number and species of every bird seen or heard within 0.4 km.
Beyond the above-mentioned pseudo-random situation, BBS routes are not distributed evenly or randomly throughout the USA, and tend to over-represent deciduous forests in the north-eastern region and under represent higher elevations in drier regions (Lawler & O'Connor, 2004). Although a higher number of survey routes in a particular region may increase the likelihood of a given rare species being detected along any one of the survey routes, this should not influence measures of species richness made along each route, as each route is sampled-and species-richness estimates are calculated-independently. The fact that survey routes are situated along roads may have a significant effect on species-richness estimates, as proximity to road has been shown to influence the species richness of vegetation (Marcantonio, Rocchini, Geri, Bacaro, & Amici, 2013), habitat composition and change (Harris & Haskell, 2007), and composition of avian communities (Wellicome, Kardynal, Franken, & Gillies, 2014). The risk of these factors introducing systematic biases in this study, however, was minimized insofar as their effects should be randomly distributed across all survey routes.
Observer skill presents another potential bias in the survey: although volunteers undergo training prior to participating in the survey, accumulated life experience introduces a potentially significant detection bias that is difficult to account for Gu and Swihart (2004), Link and Sauer (1998) and Urban and Swihart (2009). To some extent, such bias is unavoidable in any data-set based on long-term volunteer surveys, and BBS organizers account for observer experience when examining population trends over time (Sauer, Peterjohn, & Link, 1994).
We attempted to further mitigate the possibility of this bias in two ways: (1) by estimating speciesrichness values from only one survey-year, we minimized within-observer effects (Kendall, Peterjohn, & Sauer, 1996), and (2) using species richness rather than population factors, species misidentification-insofar as it did not change the number of species detected-posed less of a threat to the validity of the results.
Survey results from 1966 to present are publicly available from the USGS. To optimize computational resources, we used data aggregated into 10-stop summaries of the 2,087 routes that were sampled in 1992 and situated within the portion of the conterminous US covered by the NLCD to estimate route-level species-richness in this study.

SRTM DEM
BBS meta-data specifies latitude and longitude coordinates for each survey route, but does not mention elevation or topography. To include these factors, we integrated data from the NASA SRTM's global, 90 m DEMs (Jarvis, Reuter, Nelson, & Guevara, 2008). We were primarily interested in the overall altitude of each route, which we estimated by calculating the mean elevation value for all cells within a 5 km buffer (see Section 3.2) around each survey route. We also calculated the standard deviation within each route-buffer, which we included in the statistical model as an indicator of the sample area's topographic heterogeneity (Rahbek et al., 2007).
The use of latitude, longitude, and mean elevation as moderate-scale surrogates to more specific climatic and other regional variables (solar radiation, precipitation, temperature, distance to surface water, etc. HilleRisLambers, Harsch, Ettinger, Ford, and Theobald (2013) was based on previous studies that use these factors to account for climatic variability (e.g. Russell, Domke, Woodall, & D'Amato, 2015;Rweyongeza, Barnhardt, & Hansen, 2011;Shore, Safranyik, & Lemieux, 2000). This use of geographic variables as indirect indicators of climatic factors allowed us to capture the effects of the effects of climatic variability, including the longitudinal moisture/climatic gradient across the conterminous USA (Gaston & Spicer, 1998;Whittaker, 1975), while avoiding the dilemma of either inflating the potential error of the statistical models by including all necessary climatic variables to which different avian species are sensitive (in sensu Jetz, Wilcove, & Dobson, 2007) or excluding biologically important variables from the model.

1992 NLCD
The 1992 NLCD, also developed by the USGS, is one of the earliest and most widely used land-cover data-sets for the USA (USEPA, 2014a). We selected the 1992 NLCD over more recent data-sets because of this widespread use, and because it has been shown to provide an effective baseline against which subsequent land-change can be identified, although its accuracy is somewhat less than that of a neural-network based classification (Boyd, Foody, & Ripple, 2002;Kolios & Stylios, 2013).
The NLCD's land-cover types are classified from Landsat Thematic Mapper imagery into 21 classes analogous to Anderson, Hardy, Roach, and Witmer (1976) levels I and II (as modified by Vierling et al. (2008)) at 30 m spatial resolution. Whereas minimum accuracy at level I is estimated to be 70%, level II accuracy is lower (minimum < 40%) and more widely variable (Stehman, Wickham, Smith, & Yang, 2003;Wickham, Stehman, Smith, & Yang, 2004).
Despite the higher error rate (Thogmartin, Gallant, Knutson, Fox, & Suárez, 2004), we used level II classes to estimate horizontal diversity with the Shannon index (Shannon, 1948) on the grounds that we were more interested in the number of different classes than their designations, but reverted to level I classes to assign vertical-stratification categories.
At the landscape scale, vertical habitat-structure can be generalized following MacArthur and Wilson's (1967) and Whittaker (1975) characterization of vegetative stratification in different vegetation types. In this context, forests represent the most structurally complex landscapes with at least three different strata available to most bird species: (1) the upper canopy, (2) shrubs and small trees and (3) herbs and other vegetation at or near ground-level. Woodlands and similar brush habitats contain only the latter two strata, and grasslands only the last stratum. Like land-cover Table 1. Classification of land-cover classes based on landscape vegetative strata Notes: "Strata 0" refers to land-cover classes with indeterminate strata levels, and the other three classes refer to the number of vegetative strata associated with their respective land-cover classes. classification, such stratigraphic characterization may not reflect how birds actually perceive the landscape, but is a useful analytical tool to classify and compare landscapes in terms of the horizontal and vertical diversity of habitat types (MacArthur & MacArthur, 1961). Using the NLCD land-cover classification criteria (USEPA, 2014b;Vogelmann, Sohl, Campbell, & Shaw, 1998), we re-classified land-cover classes into one of four strata classes. Classes 1-3 refer to the number of strata associated with the land-cover classes in those categories, and class 0 contains land-cover classes with little or no vegetation present and those with an indeterminate degree of stratification (see Table 1).

Analytical methods
To test the hypothesized relationship between avian species-richness and landscape factors using diversity and vegetative strata as indirect indicators of habitat-level diversity, we focused on two principal questions in our statistical analysis: (1) Do increasing proportions of multi-strata vegetation correspond to higher levels of species richness in a manner analogous to the effects of increasing land-cover diversity? and (2) Does the importance of vegetation strata decrease as overall landscape diversity increases?
To determine the importance of these landscape-level variables and the habitat-level factors they indirectly reflect at the regional scale, we asked whether their influence could be detected across the study region and remain statistically significant even when modeled against regional factors.
The outer boundary of the study region in this analysis was determined by the US national borders, but the region was otherwise open and unbounded, and species richness was treated as an emergent property of landscape diversity, vegetative strata, latitude, longitude, and mean elevation (sensu Brown, Ernest, Parody, & Haskell, 2001). The translation of the BBS, geographic, and land-cover data into a meaningful analysis of species richness took place in three general phases: (1) speciesrichness estimation, (2) derivation of spatial attributes, and (3) analysis of relationships between species richness and spatial attributes.

Species-richness estimation
The total number of species detected along each survey route was adjusted with a first-order jackknife estimator to adjust for heterogeneous detection probabilities [9]. Of the eight models reviewed by Boulinier, Nichols, Sauer, Hines, and Pollock (1998), M h -in which detection probability varies with individual species-is robust to deviations from model assumptions, performs well if the average detection probability is large enough, and is selected the most frequently as the best estimator of species richness in their study of 317 survey route-years. This is consistent with other analyses indicating that the first-order jackknife yields optimal results with BBS data (Cam, Nichols, Sauer, Hines, & Flather, 2000a, 2000b).
The first-order jackknife successively resamples each survey route based on the incidences of species occurring at only one stop, as indicated by Equation (1). The variable Ŝ represents the estimated species richness, S o represents the number of species observed, α 1 is the number of species observed at only one stop, and N is the number of sampling occasions. To reduce the computational load of the analysis, the USGS provides the species accounts of all 50 stops along each survey route aggregated into 10-stop summaries (i.e. the species counts of ten consecutive stops are combined into a single species account). This effectively reduces the number of sample points from 50 to 5, meaning N = 5 instead of N = 50. Because the jackknife uses instances where a given species is only detected at one sample point to calculate the detection-probability coefficient, the term N−1 N is adjusted down from 0.98 to 0.8 in Equation (1): The implication of this adjustment is that the species-richness estimates derived from the 10-stop summaries was slightly lower than those derived from the full 50-stop accounts, but the effect was common to all 2,087 routes. Because this study examined richness levels relative to one-another across all routes, rather than absolute richness levels for any particular route, estimates based on the 10-stop data do not bias the statistical analyses. (1) Subspecies were resolved to the 10th supplement (Chesser et al., 2010) to the 7th American Ornithologist's Union check-list of bird species (AOU, 1998), and unidentified species were filtered from the data-set. We opted to include shorebirds, nocturnal species, and other species that are not always well-recorded in the BBS because we expected that the large number of routes included in the analysis would offset any effects that such rare sightings may have on the alpha term of the jackknife estimate. We used the "vegan" library (Oksanen, Kindt, Legendre, O'Hara, & Stevens, 2009) for the R Statistical Language (R Development Core Team, 2008) to estimate species richness independently for each route sampled in 1992 (Figure 1).

Derivation of geographic and land-cover variables
Latitude and longitude values were acquired from meta-data associated with the vector map of BBS route paths used to overlay routes onto the GTOPO30 and NLCD grids. To obtain elevation values, we projected a 5 km buffer onto the GTOPO30 DEM around each BBS route and calculated the mean elevation value within said buffer. The 5 km buffer, which was the maximum buffer size the computer system could process at the time of the analysis, was selected as a more inclusive buffer after it was determined that the results did not differ significantly to 1 km and 500 m buffers (unpublished data).
BBS routes are not straight transects, so the number of cells within each 5 km buffer is not constant. Because the NLCD does not cover Canada or Mexico, adjacency to national borders and coastlines also affects the total number of cells within each buffer. To account for these variable buffer sizes, we used the proportion of cells in each strata class rather than cell counts as the determinate variables.

Analysis of relationships between species richness and spatial attributes
Two types of regression analyses were employed to address the research questions: (1) a series of formal multiple-regression models to estimate regression coefficients and compare their explanatory power, and (2) a classification and regression tree (CART, hereafter regression tree) to identify key thresholds when explanatory variables are arranged hierarchically.
Land-cover diversity and proportions of each of the four strata classes were all calculated from the same set of land-cover values, and therefore highly collinear. In the latter case, the collinearity was so extreme that any attempt to generate a model containing all four variables simultaneously produced a singularity. Because both the predictive power of the individual variables and the signs of the regression coefficients-the two factors most strongly affected by multicollinearity (Hair,

Figure 1. Map of 1992 North American breeding bird survey routes color-coded by speciesrichness estimates, with values divided into five groups at natural breaks.
Notes: Longitude and latitude coordinates identified in the map correspond to splitting points in the regression tree (see Figure 3; longitude = −97.44, latitude = 35.98). Land and water base layers were generated using free spatial data from Natural Earth. Source: http://www. naturalearth.com/.
Tatham, Anderson, & Black, 1998)-were precisely what we were interested in determining, we separated the collinear variables into different models and generated Akaike Information Criteria (AIC; Akaike, 1974) scores to compare the different models' explanatory power. To more explicitly determine whether geographic and topographic variables, serving as surrogates to regional factors, eclipsed land-cover factors at our study scale, we generated a model containing only geographic variables (latitude, longitude, and elevation) to serve as a "null" model against which the AIC scores of the models containing land-cover variables could be compared. Because horizontal land-cover diversity and strata proportions were calculated (albeit differently) from the same data, we first examined these variables separately in their own models (i.e. Equation (2), where the species-richness estimate of route i is a function of latitude, longitude, elevation, and either land-cover diversity or the proportion of strata-class 0, 1, 2, or 3 (land) with an error term).
We then generated pairwise models for each of the strata classes that contained the same geographic variables with both land-cover diversity and the proportion of that strata class as explanatory variables (Equation (3), where the species-richness estimate of route i is a function of latitude, longitude, elevation, land-cover diversity (lcd) and the proportion of strata-class j (strata) with an error term). This use of separate models allowed us to determine whether collinearity between landcover diversity and strata class affected the sign of the regression coefficients.
In terms of our first research question, we anticipated that lower AIC scores for models containing land-cover data (diversity, strata class, or both) would indicate that landscape-level variables are indeed important explanatory factors at our study scale. Based on the expectation that higher levels of both horizontal land-cover diversity and vertical vegetative stratification indicate a larger diversity of available habitats, we anticipated that the estimated coefficients for land-cover diversity and the proportion of strata-class 3 would be positive, whereas those for strata class 1 would be negative. The signs of the coefficients for classes 0 and 2 were more difficult to anticipate, as class 0 represents indeterminate degrees of vegetative stratification and the sign of the coefficient for class 2 would depend largely on whether its increase was at the cost of class 1 (positive) or class 3 (negative).
To determine whether changes in land-cover diversity affect the relative importance of vertical stratification and under what conditions this occurs, we used a regression-tree analysis to situate both geographic and landscape-scale variables hierarchically. Because regression-tree analysis is robust to interactions between independent variables (De'ath & Fabricius, 2000;Zuur et al., 2007), we were able to include all variables in a single model despite the aforementioned problem of multicollinearity. Optimally homogeneous splitting-points in the data were identified by binary recursive partitioning of the independent variables, and the number of terminal nodes (leaves) was determined by a cross-validation algorithm that successively "prunes" the tree according to a range of cp (complexity parameter) values until the mean of the cross-validations is less than the mean values of the cross-validation errors plus the standard deviation of the cross-validations on convergence. We used the "tree" library for R (Ripley, 2009) to generate this model.
If the explanatory power of vertical stratification decreases with increasing land-cover diversity, then we anticipated that when strata class appeared in the regression tree it would be situated below land-cover diversity and on the lower side of the splitting point. (2)

Results
The results here are divided into two groups: (1) those from preliminary data assessment and (2) those produced by the formal hypothesis tests.

Preliminary data-assessment
Species-richness estimates ranged from 9.2 to 111.8 with a mean of 64.0 ± 17.4 SD and a Gaussian distribution across the study region. To ensure the validity of our statistical models, we used the Lattice tool-set for R (Sarkar, 2008) to examine the distributions of species-richness estimates against the eight explanatory variables and the extent to which these distributions exhibited pairwise linear relationships (Hair et al., 1998;Maindonald & Braum, 2003).
Our initial postulation that detection of relationships between proportions of strata-class 2 and species richness would be more problematic than with other explanatory variables was reinforced by the widespread absence of land-cover types in strata-class 2 from the route buffers. Whereas classes 0 and 1 were present in all 2,087 route-buffers (albeit with minimum proportions of 0.001%, but maximums of 95.1 and 99.9%, respectively) and class 3 was only absent from 22 route buffers, class 2 was entirely absent from nearly 50% (i.e. 1,031) buffers. This disparity in the representation of class 2 may be an important factor in the strong negative relationship to species richness indicated by the formal analyses (see below) and the preliminary data visualization (see Figure 2).
As discussed above, strata class 0 contains land-cover classes without vegetation and those with otherwise indeterminate stratification. Although a negative relationship to species richness may reasonably be postulated for at least the barren cover-types, the clustering of the majority of species-richness estimates at low values of strata proportion (indicated in Figure 2) made it difficult to determine whether a statistically significant correlation-coefficient reflected a biologically significant relationship or simply that the proportion of cells in strata-class 0 was greater than 50% in less than 15 of the 2,087 route-buffers. The general distribution of species-richness estimates with respect to land-cover diversity, and proportions of cells in strata-classes 1 and 3, confirmed our initial stipulation that these relationships may be non-linear. The locally weighted regression curves (LOWESS; Cleveland, 1981) in Figure 2 depict these curvilinear relationships more clearly, suggesting that they may be modeled more precisely as curvilinear regressions rather than by straight lines. However, the position of the start and end points of the LOWESS curves at different points along the y-axis and the overall shapes of the curves indicated that a linear model should capture the basic trends in the data sufficiently enough to indicate the overall directions of the relationships between variables. As we were primarily concerned with the slopes of the coefficients rather than the closeness of fit between model curves and data points (Zuur et al., 2007), we used linear models for formal hypothesis-testing.

Multiple-regression models
Of the 10 multiple-regression models examined, the model containing both land-cover diversity and proportion of strata-class 3 yielded an AIC value substantially lower than the others (16,621, vs. the next lowest value of 16,667, Table 2), and was the only model in which the confidence intervals for both land-cover diversity and proportion of strata class were both consistently positive and statistically significant.
The confidence intervals of the coefficients for latitude, longitude, standard deviation of elevation, land-cover diversity, and proportion of strata-class 3 were all positive, with the slopes for both mean Table 2. Confidence intervals (95% on α=0.05) for coefficient estimates of listed variables, significance levels of the same, and the last three digits (all numbers began with 16) of modelwise AIC values for each of the multiple-regression models examined (10 models total) Notes: "Null model" refers to the model containing only geographic variables, "Geographic variables with single landscape feature" refers to the five models each containing the geographic variables and only 1 of the landscape features examined, and "Combined landscape features" refers to the four models containing land-cover diversity and proportion of one of the four strata classes together with the geographic variables. The values of the coefficients for latitude, longitude, mean elevation, and standard deviation of elevation varied with each model, with the coefficient for mean elevation not significantly different to 0 in the model with land-cover diversity and strata 3. *Indicates that the estimated coefficient failed to meet the significance threshold required to reject the null hypothesis that its slope is not significantly different to 0. and standard deviation of elevation statistically significant but close to zero. Where statistically significant, on the other hand, the slopes of the coefficients for strata-classes 0-2 were all negative (although the upper limit of the interval of strata-class 2 is positive where it is modeled without landcover diversity; Table 2). Only in the model containing the proportion of cells in strata-class 0 with geographic variables and that containing both land-cover diversity and the proportion of cells in strata-class 1 was the null hypothesis for the slope of the coefficient for strata class not rejected, and this only after adjusting the significance test for multiple comparisons (using the Bonferroni adjustment, as discussed above).

Regression-tree analysis
The CART analysis yielded a tree with 6 splitting points using longitude, latitude, land-cover diversity, and proportion of strata-class 3, and with 7 terminal nodes (leaves; Table 3). The division on longitude <−97.4 was responsible for the greatest amount of deviation reduction, but was seconded by landscape factors on both sides of the division. The node with the highest mean species-richness identified routes situated east of longitude -97.4, with proportion of forested land >22%, and north of latitude 38.95, and that with the lowest mean identified routes west of longitude −97.4, horizontal land-cover diversity <0.96, and proportion of forested land <0.8% (Figure 3).

Discussion
The positive coefficient estimates for land-cover diversity in all models was consistent with the expectation of our initial hypothesis that more structurally diverse landscapes hold more species. Similarly, the positive coefficient estimates for the proportion of strata-class 3 was consistent with the expectation-and previous studies-indicating that landscapes containing larger proportions of forested area also tend to host more species (Rojas, Pino, Basnou, & Vivanco, 2013). Negative coefficients for proportion of strata-class 1, on the other hand, suggested that species-richness levels decrease as the landscape is increasingly dominated by simpler vertical vegetation structure such as that associated with grasslands and agriculture. That the slope of the coefficient for strata-class 1 was not estimated to be significantly different to strata-class 0 when land-cover diversity was included in the model is consistent with the extremely shallow slope indicated by the scatter plots, and suggests that the negative effects of this strata type on species richness is largely counteracted by the overall diversity of the landscape. Table 3. Tabular summary of regression-tree output depicting each node with the value of the splitting point, the number of data points on its right-hand (n), and their deviance and mean Notes: The abbreviation "lat" refers to latitude, "lon" to longitude, lcd to land-cover (horizontal) diversity, and psx to proportion of (vertical) strata class x in the buffer. Node 1 refers to the entire dataset. The importance of longitude in the regression tree was consistent with the east-west climatic moisture gradient of the conterminous USA (Whittaker, 1975). Insofar as they indicated higher species-richness levels on the eastern side of the USA, the results of the regression-tree analysis differed to predictions based on Cook's (1969) range-based isopleth map of species richness, but were consistent with O' Connor et al.'s (1996) analysis that compared similarly derived species-richness estimates to land-cover features estimated from alternate sources and from specific meteorological variables.
Land-cover diversity was only included in the regression tree where longitude <−97.4, but was situated above the node based on strata-class 3 in this portion of the hierarchy. This suggests that land-cover diversity was more important than strata-class here, and that the proportion of land covered by multi-strata vegetation was only important where diversity <0.96, which is consistent with the expectation that strata class is more important when land cover is more homogeneous.
Where latitude appeared in the regression tree, higher levels of species richness were found north of the dividing line, which was opposite to the hemispheric tropical-arctic species-diversity gradient (Fischer, 2004) but consistent with sub-global tendencies indicated in Cook's (1969) map. The extent to which these results converge logically with those of previous studies reinforces the validity of the variables selected. Divergences, conversely, suggest further investigation of broader-or finer scale factors to refine our understanding of these regional biodiversity trends.

Conclusions
The findings in this study indicating that landscape factors remain statistically significant even alongside regional factors across the conterminous USA are consistent with other regional studies (e.g. Culbert et al., 2013;Pekin & Pijanowski, 2012;Rahbek et al., 2007) demonstrating that models that purport to explain spatial patterns of species-richness using only one or a few regional factors (Currie (2007) "strong constraints family" of explanation in biogeography) are necessarily incomplete. Although they were derived from different land-cover datasets and survey years, the findings in this study regarding the positive contributions of land-cover diversity and forests (i.e. strata class 3) to species richness in particularly are consistent with those reported in a similar study by O'Connor et al. (1996), and the factors included in the regression tree here were also identified in their analysis. Notes: The left side of the split indicates that the conditional is true, and the right side that it is false. The vertical length of branches indicates the amount of deviance reduction. Variable abbreviations as follows: lon = longitude, lcd = land-cover diversity, ps3 = proportion of cells in strata-class 3, and lat = latitude.
We used the 1992 NLCD in place of more recent land-cover data-sets with the hopes that the analytical approach and statistical models used in this study could serve as a basis for subsequent studies that incorporate temporal trajectories of land-cover and species-richness trends and their interactions (Whittaker, 1972), as well as compounding factors such as climate change (Stephens et al., 2016;Storch et al., 2007) and the spread of West Nile virus to North America (Cruz-Pacheco, Esteva, & Vargas, 2012). Because the BBS is repeated annually, one of the most straightforward ways to build on the results reported here would be to repeat the analyses we performed using data from other years for which national land-cover data are available. Repeating the regression-tree model with data from other years could be particularly interesting, as it would indicate whether the dominant drivers of species richness remain constant or shift over time.
To be useful to biodiversity conservation, the information regarding species richness provided in this study would need to be complemented by studies of specific species assemblages, and factors such as the endemism and rarity of individual species would need to be considered (Hanberry, 2014;Lévêque & Mounolou, 2003;Rands et al., 2010). The approach described here, however, could be used to relate these additional response variables to the explanatory factors, and the species-richness data itself could help to provide useful preliminary indications of zones or locations with relatively higher levels of overall species-diversity (Gaston & Spicer, 1998).
The use of land-cover classes to infer vertical stratification is an important and novel contribution of this study, as it suggests a straightforward research question for further examination-i.e. do more vertically complex landscapes reflect more structurally complex habitats?-while the consistency of the results with our initial hypotheses suggests that these factors may be at least partially extrapolated from land-cover data. As LiDAR (Goetz et al., 2014;Tattoni, Rizzolli, & Pedrini, 2012) and other forms of higher resolution data-including drone-based aerial imagery (Paneque-Gálvez, McCall, Napoletano, Wich, & Koh, 2014)-become available over increasingly larger extents, the multi-scale relationships between habitat and vegetation structure, and landscape diversity, and geo-climatic can be examined more directly. In forecast models calibrated to include future land change (Titeux et al., 2016), legacy data-sets, and other instances where newer high-resolution data remain unavailable (Priego-Santander, Campos, Bocco, & Ramírez-Sánchez, 2013) our findings indicate that land-cover data can be utilized to partially infer the influence of habitat-level attributes on regional biodiversity trends.

Supplementary material
Supplementary material for this article can be accessed here http://dx.doi.org/10.1080/23311843.2017.1296604.