Landscape complexity eﬀects on crop productivity: an assessment from space

Agricultural producers have many incentives to clear small natural areas from their ﬁelds, as this can expand their cultivated land base. However, natural areas can play a role in delivering ecosystem services that improve crop productivity (e


# 1. Introduction
The Canadian prairies are now one of the world's most endangered ecosystems, as at least 70% of native grasslands have been lost due to development and conversion to agriculture (AEP, 1997).In addition to grasslands, about 70% of wetlands have been removed or altered since European settlement (DUC, 2006).There is a long history of installing ditches that drain water from wetlands in order to cultivate land, with drainage activities accounting for 84% of losses in wetlands (NAWMP, 2020).The loss of wetlands and grasslands comes with the loss of ecosystem services.Wetlands have been associated with services that include providing habitat for wildlife, maintaining soil or water quality, regulating water resources, and storing carbon (Zedler & Kercher, 2005;Mitsch et al., 2013;Conant et al., 2017;Bengtsson et al., 2019;Xu et al., 2020;Zhao et al., 2020;Vickruck et al., 2021).
Wetlands and other non-crop spaces-field margins, fencerows, shelterbelts, and tree patches-are common throughout the region of central Alberta.Producers have many incentives to clear small natural habitats from their fields, as this can expand their cultivated land base and reduce time taken to steer equipment.However, producers are also concerned with sustainable crop production as well as public trust in their enterprise and are interested in approaches that avoid additional land conversion.Ecosystem services could encourage producers to retain seminatural areas, as crop production can benefit from non-crop areas (Blaauw & Isaacs, 2014;Tschumi et al., 2016;Venturini et al., 2017;Rundlöf et al., 2018).Non-crop spaces can provide shelter and food for beneficial arthropods (Garibaldi et al., 2014;Venturini et al., 2017;Vickruck et al., 2019Vickruck et al., & 2021)), even in intensively farmed areas (Morandin et al., 2014).In many cases, those arthropods (e.g., bees, wasps, flies, spiders, beetles) provide ecosystem services to agriculture, such as pollination or natural pest control, which may help to improve yields, decrease inputs, and increase overall profitability (Albrecht et al., 2020).Equally, non-crop areas may provide abiotic ecosystem services that are associated with yields (e.g., shelterbelts may reduce soil erosion by wind on the nearby field, improving soil fertility; Rempel et al., 2017).Crop borders often suffer lower yield because of compaction from farm equipment, poor emergence, or shading, but may cause an increase in yield at distance further from the border if they also contribute ecosystem services (see Figure 1; an intermediate increase in yield; Mitchell et al., 2014;Van Vooren et al., 2017).In additional to average yield, semi-natural areas may also contribute to yield stability, as additional pest control or pollination can reduce yield variation both between-and within-fields (Redhead et al, 2020, Hünicken et al, 2021).This is important to growers as it increases harvest predictability.Thus, the local landscape "neighbourhood" surrounding a field may benefit both yield outcomes for growers, with potential to both increase profitability and support habitat for local biodiversity.
While landscape structure can influence the richness and abundance of beneficial species (Klejin et al., 2019;Zamorano et al., 2020), its effect on crop yield-the main concern of producers-is not well studied.This relationship has been typically studied using small-scale field experiments (Tschumi et al., 2016;Rundlöf et al., 2018) or regional analyses (Galpern et al., 2020;Nelson & Burchfield, 2021).Regional analyses (e.g., using county-level yield data) cannot include sufficient field-level detail to support predictions relevant to individual producers, while field studies are labour-intensive and are often limited to a few fields and/or years, limiting their generality.However, adoption of precision agriculture and remote sensing technologies has the potential to change this.Precision agriculture has been practiced commercially since the 1990s (Mulla, 2013) and is now deployed widely across the North American agricultural sector.In Canada, 84% of producers are currently using combine yield monitoring capability which allows them to obtain much information about their fields, such as grain yield and moisture content (Steele, 2017;Mitchell et al., 2020).Furthermore, it is possible to reliably predict crop yield based on its relationship with remote sensing imagery where field-level precision agricultural data is not available (Hunt et al., 2019;Nguyen et al., 2021).Therefore, maps of crop yield, either directly based on data from precision yield data or predicted from remotely sensed data, can be potential alternatives to plot-based field sampling, and present a new method of assessing the influence of landscape complexity on crop productivity.
Our objective in this study is to assess the effects of landscape complexity on canola productivity using remote sensing products.It is important to note that "landscape complexity" is usually defined by three components: (1) landscape composition, (2) landscape configuration, and (3) landscape connectivity (Wang et al., 2019).However, within this study, we only consider the first two components: the composition of land covers surrounding a canola field (i.e., amounts of non-crop and crop land covers at the field edge) and the configuration of these covers near the field (i.e., complexity of the field shape).
We hypothesized that fields with a more complex shape and an intermediate mixture of crop and non-crop land covers at its margins would have a higher field-level mean yield and lower fieldlevel variance in yield, representing the trade off between the ecosystem service boost provided by non-crop habitats and the edge effect that reduces yields (Figure 1).While existing studies often focus on effects of non-crop marginal habitats, we assessed potential effects of both crop and noncrop marginal land covers as those habitats occur simultaneously on the landscape and are expected to have different effects on the crop.Here, we limited our focus to two broad categories of marginal land covers (i.e., crop versus non-crop) due to high misclassification rates at field edges.
To better understand the overall effect of landscape on a canola field, we also examined the effect of landscape complexity on the subfield-level yield.We hypothesized that non-crop spaces at the field edge may create a boost to yield at intermediate distances.This hypothesis combines two simultaneous processes that may be operating at the field edge: a well-known yield reduction edge effect and a potential benefit from non-crop marginal habitats (Figure 1).Crop borders often suffer lower yield because of compaction from farm equipment, poor emergence, or shading, but can provide ecosystem services, resulting in an intermediate increase in yield (Mitchell et al., 2014;Van Vooren et al., 2017) (Figure 1).

# 2. Materials and Methods
We used the following processes to examine the relationship between canola yield and landscape complexity.We first mapped canola fields in the study region using Sentinel-2 time series images and used a functional linear model to predict yield for other canola fields in the study region.Then we used predictions from this model to examine how yield metrics (mean and variance) varied with landscape complexity and distance from field edges.
To test the first hypothesis, we first described landscape complexity of each canola field (i.e., field shape complexity and composition of marginal land covers) by counting numbers of noncrop/crop pixels surrounding the field within rings from the edge and normalizing those counts by the field area.For each ring distance, field-level yield was modelled as a function of the interaction between crop and non-crop pixel counts.To test the second hypothesis, we calculated distance to the nearest edge for every pixel on the field and modelled the effects of non-crop spaces on canola yield using the relationship between yield and distance.

## 2.1. Study Area & Data
This study covered a 100 × 100 km area centered around 8 canola fields (~ 36,000 pixels at 10 m resolution) in the County of Vermilion River (Alberta, Canada) where a 2019 precision yield dataset was available to build a yield mapping model (Figure 2).Precision yield data was recorded in segments by combine on-board yield monitors in 1-second intervals and was characterized by a starting position of the combine, width of the header bar (m), direction of travel (0-360° N), distance travelled (m), and the canola yield (dry mass in tonnes/ha).We used these attributes to construct harvested segments and rasterized those polygons to create yield maps that aligned with Sentinel-2 pixels at 10 m resolution (Nguyen et al., 2021, Figure 3c).
Sentinel-2 is a European wide-swath, high-resolution, multi-spectral imaging mission designed with twin satellites to give a high revisit frequency (5 days at the Equator).Each satellite carries a Multi Spectral Instrument (MSI) payload that samples 13 spectral bands: four bands at 10 m (including Red, Green, Blue and NIR), six bands at 20 m, and three bands at 60 m spatial resolution.In this study, 2019 Sentinel-2 MSI L1C scenes (top-of-atmosphere reflectance product) were downloaded from the Copernicus Open Access Hub.We then used the SNAP v7.0.0 software (ESA Sentinel Application Platform) to generate the bottom-of-atmosphere reflectance product (L2A).The generation of L2A also returned a Scene Classification (SCL) map which was used to filter "bad" pixels (cloud/cloud shadow, snow, and ice).The remaining good observations in each band were stacked to create a time series dataset at each pixel.Here we used Sentinel-2 time series of the Apr-01-2019 to Oct-31-2019 period to map canola field boundaries and to build a functional regression model for mapping canola yield at 10 m resolution.

## 2.2. Mapping Land Covers and Canola Field Boundaries
We generated a land cover map of the study area (7 classes: water, wetland, grass/shrub, forest, barren/urban, canola, and other crops) at 10 m spatial resolution using statistical features generated from Sentinel-2 time series and a Random Forest classifier (Nguyen & Henebry, 2019).
The sample data pool (for training and testing) was manually created based on the Annual Crop Inventory (ACI, at 30 m resolution; AAFC, 2019) due to its high accuracy level for crop categories.
The classification was repeated 100 times, and then aggregated to create a final map by selecting the most popular cover at each pixel.Each time, training and testing datasets were randomly drawn from the sample data pool.The average overall accuracy is 90%, and average producer's and user's accuracy are both greater than 95% for canola.After classification, we only kept fields between 20 hectares and 120 hectares.All retained fields were then visually inspected and edited, using the World Imagery Basemap available in ArcGIS software-a very high-resolution image updated typically within 3-5 years of present-and the Sentinel-2 natural composite images, to make sure that canola field boundaries were detected accurately.In total, 757 canola fields within the study area were used for further analysis (Figure 2).At each field, we computed distance from any canola pixel to the field's nearest edge pixel (as Euclidian distance from centroid to centroid).

## 2.3. Mapping Precision Canola Yield
From spectral bands of Sentinel-2 time series images, two spectral indices were computed: normalized difference vegetation index (NDVI; Huete et al., 1997), an indicator of vegetation greenness, and normalized difference water index (NDWI; Gao, 1996), an indicator of leaf water content.We modeled the rasterized canola yield as a function of NDVI and NDWI time series (presented in Nguyen et al., 2021) in R using the "fda.usc"package (Febbraro & Oviedo de la Fuente, 2012).This functional regression model (Equation 1) can predict canola yield to within 12-16% accuracy of actual yield and is able to capture within-field variation (Figure 3, c versus d).We then used the model to map precision canola yield for all studied fields.
A functional linear regression (FLR) models crop yield, y, as: where X is the value of predictor variables at time t (NDVI and NDWI, in our case), while β is the instantaneous effect (slope) of that variable on y.One way of estimating β is to present the functional covariates (X) and their corresponding slope parameters (β) and as a finite sum of M pre-defined basis elements:   () = ∑    , () =  ′

Effects of landscape complexity on field-level mean and variance of canola yield
We modelled the effect of landscape complexity at different neighbourhoods on yield mean and variance using 2-dimensional additive models.The land cover classification from section 2.3 was collapsed into two classes: crop (canola and other crops) and non-crop (water, wetland, tree, barren/developed, grass/shrub).We then measured the landscape surrounding each canola field by counting the number of crop (  ) and non-crop (  ) pixels within 10 m rings ranging from 10 to 1000 m (1 -100 pixels) from the field boundary (Figure 4).To account for different field sizes, counts within each ring were normalized by the field's area:  ̅  =   / and  ̅  =   /.These pixel counts ( ̅  and  ̅  ) allowed us to assess field shape complexity and composition at the same time.A field with a complex shape will have higher overall pixel counts due to its larger perimeter, while landscape composition is represented by the proportion of crop and non-crop pixels (Figure 4).For each ring, we modeled field-level mean and variance as a function of the interaction between crop and non-crop edges using a Generalized Additive Model-GAM (provided by the "mgcv" package in R; Wood, 2017) and a tensor product () (Equations 4 & 5).In total, 10 models were fitted for different ring sizes.Models were fit for the first 10 ring sizes, but we report, for brevity, only models with a p-value ≤ 0.1 in the Results section.
GAM is a semi-parametric extension of GLM.In GAM, linear terms are replaced by nonparametric and regularized smoothed function of the predictors.Thus, GAM can be used to reveal highly non-linear, non-monotonic relationships between the response variable and the predictors without overfitting.Like a GLM, GAM uses a link function to establish a relationship between the mean of the response variable and smoothed function of the predictors.The structure of the GAM can be written as: ).We used a tensor product to present all possible interactions between crop and non-crop covers surrounding a field, fitting GAM model formulae as follows:

Effects of field edge on subfield-level mean and variance of canola yield
Here we assessed the impact of field edge (i.e., non-crop spaces at field boundaries) on subfield-level canola productivity using two different approaches.First, we used a simple empirical approach that is based only on descriptive statistics of yield in various distance-fromthe-edge bins ("bin-yield" analysis).This approach is suitable for a large-scale analysis of the field edge impacts (regional to national scale).Secondly, we modeled the non-linear relationship between pixel-level yield and proximity to the field boundary using additive models.
### (a) Empirical "bin-yield" analysis At each field, we grouped canola pixels into 10 m distance bins according to their distance to the nearest edge and computed the mean and variance of canola yield for each bin.Using this binned dataset, impacts of field edge on subfield-level canola productivity was then presented by mean and standard deviation values of "mean bin-yields" and "variance bin-yields" across all distance to the nearest edge bins.

### (b) Non-linear modeling analysis
Here a Gaussian location-scale GAM was used to model mean and variance of yield simultaneously.We modeled mean and variance of yield as functions of distance to the nearest edge and included a two-dimensional spatial smooth (Equations 6 & 7, family gaulss in "mgcv").
Spatial smoothers were used to account for the spatial autocorrelation in yield within the crop field.

## 3.1 Effects of landscape complexity on field-level mean and variance of canola yield
The effects of neighboring crop and non-crop land covers on mean yield consistently presented a V-shaped pattern among all significant ring distances (10 -30 m) (Figure 6).However, those effects were small as shown by explained deviance of only 2% to 3%.The field-level mean yields tended to be lower if the field was surrounded mostly by either crop or non-crop neighbors as indicated by the change of color from orange to red along the two axes.In those situations, higher landscape complexity-either more crop or more non-crop neighbors-showed a stronger negative effect (i.e., lower mean yield).On the other hand, positive effects of landscape complexity on field-level mean were observed at the middle and right corner of the plot, indicating that canola fields were more productive where there was a mixture of non-crop and crop neighbors in the landscape.In that situation, higher landscape complexity had a stronger positive effect on canola yield as indicated by higher values at the right corner of the plot.
directions of increasing landscape complexity (i.e., more pixel counts per unit area of a field Like the effects on mean yield, landscape complexity had a small but significant effect (percent of deviance explained is only 2% to 4%) on field-level variance of yield across local scales (10 -80 m) (Figure 7).Towards the bottom of the plot (i.e., field pixels have fewer noncrop neighbours), effects of landscape complexity on the variance of yield were small and negative, indicating that within-field variation of canola yield is less if the field is generally surrounded by crop land covers.Towards the left of the plot (i.e., field pixels have fewer crop neighbours), effects of landscape complexity followed a hump-shaped pattern, or intermediate optimum, with lower effects where there is either a low or high proportion of non-crop edges.

## 3.2 Effects of field edge on subfield-level mean and variance of canola yield
Both methods showed evidence of higher canola yield at an intermediate distance into the field where yield-reducing "edge effects" are no longer dominant.The "edge effects" are visually apparent on plots of the mean yield (i.e., a low mean at the field edge, followed by a rapid increase from 0 to 30 m; Figures 8a & 9a).While the bin-yield approach showed a subtle peak at 100 m (Figure 8a), modeled mean yield peaked at 30 m and gradually decreased toward the field center (Figure 9a).The field edge impacts on yield variance differed between the two methods.The "edge effects" are also clearly present in the yield variance, with much higher variance at the field boundary and a rapid decrease from 0 to 30 m toward the field center (Figures 8b & 9b).However, while the bin-yield approach showed a gradual decrease of yield variance into the field (Figure 8b), the model predicted variance gradually increased from 30 meter toward the field center, indicating a potential stabilizing effect of the field edge on canola productivity apparent at around 30 m into the crop.

Effects of the field edges and landscape complexity on canola productivity:
Here we examined potential effects of the field edge and landscape complexity on mean and variance of canola yield at both field and subfield-levels.Several studies have suggested a positive effect of landscape complexity on crop productivity.In a study about crop yields in the same temperate grassland region at a much coarser, county-level scale, Galpern et al. (2020) analyzed the relationship between yields of multiple crops and landscape complexity-measured as the amount of non-crop covers found nearby or within the field.We extended this analysis by examining the potential effect of both neighboring crop and non-crop covers on the field-level mean canola yield.Our findings generally matched those of Galpern et al. (2020), in that there is a weakly positive effect of field non-crop marginal habitats on mean canola yield.However, we found evidence that canola fields surrounded mostly by non-crop covers may have slightly lower yield, possibly due to the overwhelming yield-reducing "edge effect" in those fields.Fields surrounded mostly by crop covers also have lower yields, possibly due to a lack of ecosystem services supported by the presence of non-crop covers, such as pollination and pest control.
Overall, we found a positive relationship between landscape complexity and field-level mean yield.
While a positive relationship between landscape complexity and crop productivity is measurable at the regional scale (can boost corn and wheat yields up to 20% as reported in Nelson & Burchfield, 2021), its economic importance to crop producers remains unclear.Thus, et al. (2020) suggested the potential benefits of landscape complexity be explored at a finer scale to determine how different types of field edges contribute to yield and to estimate the limits of any effect.That valuable information may help producers to manage or redesign their fields.To support this objective, we assessed the potential impacts of field edge to subfield-level yield.We found evidence of a boost in yield between 30 and 100 m from the field edge towards its center.
There is also a plausible yield stabilizing effect at the same range.Although both potential boosting and stabilizing effects are quite small, these two effects together may offer enough benefit for producers to add small patches of different non-crop land covers within or nearby their fields or, equally provide incentive to retain the current configuration of non-crop covers within or near their fields.

Limitations and future directions:
Our study relies on an accurate land cover map to identify precisely both field boundaries and their neighboring land covers.Here we generated a land cover map of the study area from Sentinel-2 imagery using the ACI layer as training and testing dataset.The overall accuracy of our land cover map is quite high (about 90%), especially for canola with both producer's and user's accuracy of above 95%.However, there remain potential issues with that cover map.Although locations and overall shapes of canola fields were often detected correctly, precise field boundaries and neighboring covers are much less accurate because misclassifications are more likely to occur at edges between different cover classes due to the mixed pixel problem.In addition, we mapped land cover at 10 m resolution which is larger than many edge features, such as small roads, shelterbelts, and wetlands, meaning that those features may not be presented correctly in the map.
To reduce classification errors, we manually inspected every individual field to make sure that its boundary and neighboring land covers were properly mapped.This manual inspection, however, cannot be done easily over a large area.Higher resolution imagery (< 5 m resolution) should be investigated to provide more accurate land cover maps for future studies.
A solution to reduce the likelihood of misclassification at the field edge that we adopted is to merge land cover types to broader categories.Here we only considered two types of edges: crop versus non-crop covers.Although this solution helps to improve accuracy of land cover maps (e.g., Galpern et al 2020), it also prevented us from analyzing the effects of different edge types.It is possible that we would expect different effects associated with roads, shelterbelts, hedgerows, wetlands, and other non-crop covers found in agricultural landscape, as the different vegetation, soil and moisture characteristics of these features may influence the amount and type of ecosystem service provided.Future studies using land cover maps with higher thematic resolution are necessary to explore the effects of different edge types.
Our analysis also relies on precision canola yield maps derived from Sentinel-2 imagery and another precision yield dataset.Our yield model performed reasonably well with prediction accuracy within 12-16% accuracy of reference yield and was able to capture within-field variation.
However, our model was built using training data from only 8 canola fields-a very small number given the much larger study area (100 × 100 km).This training dataset might not fully capture canola growth dynamics and its corresponding spectral response.Future studies should try to use a large training dataset to build a more accurate yield model which, from a data acquisition perspective, is feasible given that precision agriculture widely used in Canada and many other parts of the world.In the yield model, we also did not use any ancillary data which are common inputs of crop yield mapping, such as soil moisture, climatic conditions, crop variety, or agricultural practices, in any of our models.Those variables are available across large spatial extents as remote sensing products and could be considered in future studies to improve the predictive accuracy of yield models.
This study focused on a single crop (canola) for only one year (2019) over a relatively small study area (given that this crop is grown across a continuous footprint ~500,000 km 2 in area; estimated from AAFC, 2019).Thus, although our findings are promising, they may not hold true in other crops, years, or sub-regions of the Canadian Prairies.To confirm the validity of our findings, more studies conducted in regions with contrasting environmental conditions are needed.
In addition, to make those findings more meaningful for crop producers, future research needs to translate a plausible positive effect of the field edge to economic value, such as profitability.

# 5. Conclusion
This study is, to our knowledge, the first to utilize remote sensing imagery and a precision agricultural dataset to assess impacts of field edges on crop productivity.Research on this topic using the conventional, controlled experiment has been rather limited and has occurred chiefly in a few small-scale studies, likely due to the high cost of field campaigns.The remote sensing approach we demonstrate provides many more opportunities to assess the potential impacts of field edges on crops.The method can be implemented at low cost across a large area where precision yield is available, capturing a variety of landscape conditions and for multiple crop-years using readily available satellite images and precision agricultural datasets.
Our results suggested that neighboring non-crop spaces are not only beneficial to canola yield but may also help to stabilize crop productivity.Although the boosting and stabilizing effects of the field edge may be subtle, retaining non-crop spaces near the field could still be a beneficial option for producers, especially given the cost of removing non-crop spaces and current efforts and incentives for the conservation of natural habitats in the region.While the idea of adding noncrop features, such as wildflower strips, or hedgerows, to help increase crop productivity is receiving more attention, our findings about the effects of the field edge on subfield-level canola productivity suggest that producers already benefit from these features, and our work contributes to discussions about the optimal design of fields.

# Acknowledgements
This research has been made possible by Alberta Canola Producers Commission, Manitoba Canola Growers Association, and Eyes High Postdoctoral Research Program at University of Calgary.We thank producers who provided us with the precision canola yield dataset and valuable insight into the underlying subfield-level patterns of yield.We elect not to name them to maintain confidentiality.We also thank Laurel Thompson at Lakeland College in Vermillion, Alberta, Canada.
#  demonstrating the accuracy of predictions of yield from space for fields where precision yield data were available.To show correlation between observed and predicted yield, density scatter plots were used in panels a & b (i.e., the x-axis and y-axis were divided into 100 bins, and "count" shows number of data point in a particular bin).Dashed-red line in panel a is 1-1 line.Finally, to illustrate the ability of this approach to predict variability of yield at the sub-field level using only remote sensing data, the observed versus predicted yield for a sample field is shown in panels c & d.    demonstrating the accuracy of predictions of yield from space for fields where precision yield data were available.To show correlation between observed and predicted yield, density scatter plots were used in panels a & b (i.e., the x-axis and y-axis were divided into 100 bins, and "count" shows number of data point in a particular bin).Dashed-red line in panel a is 1-1 line.
Finally, to illustrate the ability of this approach to predict variability of yield at the sub-field level using only remote sensing data, the observed versus predicted yield for a sample field is shown in panels c & d.

𝑘;
() = ∑  ,   ()  =  (where k = {1, …, M} and i = {1, …, N} where N is the number of observation.Note that one observation of NDVI/NDWI time series consists of several measurements at different times t).Replacing β and X of equation 1 by their new forms results in equation 2-a typical multiple linear regression.

FIGURE 1 .
FIGURE 1. Simultaneous effects at the crop field edge: (red line) potential yield representing the

FIGURE 2 .
FIGURE 2. Study area: selected canola fields (in yellow) on top of the 2019 Sentinel-2 RGB image

FIGURE 5 .
FIGURE 5. Effects of distance, s(Distance), on mean (a) and variance (b) yield for a sample canola

FIGURE 2 .
FIGURE 2. Study area: selected canola fields (in yellow) on top of the 2019 Sentinel-2 RGB image

FIGURE 3 .
FIGURE 3. Outputs of functional regression model to map precision canola yield (a and b),

FIGURE 5 .
FIGURE 5. Effects of distance, s(Distance), on mean (a) and variance (b) yield for a sample

FIGURE 6 .
FIGURE 6.Effects of neighboring crop and non-crop covers on field-level mean yield at four

FIGURE 7 .
FIGURE 7. Effects of neighboring crop and non-crop covers on field-level variance in yield at

FIGURE 8 .
FIGURE 8. Mean (a) and variance (b) bin-yield of various distance bins for all canola fields