Does agri-environmental management enhance biodiversity and multiple ecosystem services?: A farm-scale experiment

Abstract Agri-environmental management has been promoted as an approach to enhance delivery of multiple ecosystem services. Most agri-environment agreements include several actions that the farmer agrees to put in place. But most studies have only considered how individual agri-environmental actions affect particular ecosystem services. Thus, there is little understanding of how the range of agri-environmental actions available to a farmer might be deployed on any individual farm to enhance multiple services. To address this knowledge gap, we carried out an experimental study in which we deployed a set of agri-environmental actions on a commercial farm in southern England. Agri-environmental actions comprised wildflower margins and fallow areas in arable fields, creating and enhancing grassland with wildflowers, and digging ponds. Alongside biodiversity responses, we measured effects on a number of ecosystem services: pollination, pest control, crop and forage yield, water quality, climate regulation and cultural services. Wildflower margins enhanced invertebrates, pest control and crop yield, and aesthetic appeal. A greater number of pollinators was linked to enhanced oilseed rape yield. But these margins and the fallows did not prevent run-off of nutrients and sediment into waterways, and showed limited carbon sequestration or reduction of greenhouse gas emissions. Newly-dug ponds captured large amounts of sediment and provided aesthetic appeal. Grasslands had higher soil carbon content and microbial biomass, lower N₂O emissions, and net sequestration of carbon compared to arable land. Enhancement of grassland plant diversity increased forage quality and aesthetic appeal. Visitors and residents valued a range of agri-environmental features and biodiversity across the farm. Our findings suggest one cannot necessarily expect any particular agri-environmental action will enhance all of a hoped-for set of ecosystem services in any particular setting.
A bet-hedging strategy would be for farmers to apply a suite of options to deliver a range of ecosystem service benefits, rather than assuming that one or two options will work as catch-all solutions.


Introduction
Environmental degradation has had major impacts on the delivery of ecosystem services, the benefits people derive from ecosystems (Diaz et al., 2019). While farmed landscapes provide key provisioning ecosystem services, especially in the form of food, intensive farming systems have also contributed to the erosion of a wide range of other ecosystem services (Power, 2010; Firbank et al., 2013; Emmerson et al., 2016). Unsustainable farming methods have been implicated in the loss of animals providing pollination and pest control for crops, soil erosion, degradation of air and water quality, increased flood risk as well as excessive water use, increased emission of greenhouse gases, and the undermining of cultural services such as recreation and aesthetic appeal (De Deyn et al., 2011; Emmerson et al., 2016; Potts et al., 2016; King et al., 2017; Redhead et al., 2018). Achieving sustainable farming is central to attempts to halt and reverse environmental degradation; for example, Sustainable Development Goal 2 is to "End hunger, achieve food security and promote sustainable agriculture".
There is discussion in the scientific literature about the potential for changed on-farm practices, especially agri-environmental management, to enhance delivery of multiple services and, ultimately, achieve sustainable farming (Rey Benayas and Bullock, 2012; Sutter et al., 2018). Agri-environmental management in Europe involves, in general, agri-environmental schemes by which governments make payments to farmers to encourage them to limit their environmentally-damaging activities, and/or put in place management actions that enhance the farmed environment. While the initial purpose of these schemes was to protect biodiversity, the emphasis has shifted to enhancing ecosystem services (Batáry et al., 2015). The consequent multiplicity of aims, combined with the wide range of farming systems, has led to the development of a large number of agri-environmental actions that farmers in any particular country might implement. As a result, most agri-environment agreements between a farmer and the government include several actions that the farmer agrees to put in place (Hejnowicz et al., 2016; Cullen et al., 2018).
Several studies have shown how specific agri-environmental actions can affect individual ecosystem services, such as: wildflower margins increasing crop yield (Albrecht et al., 2020), riparian buffer strips improving water quality (Cole et al., 2020), or grassland restoration enhancing carbon storage (De Deyn et al., 2011). Furthermore, certain specific actions might enhance multiple services; for example, reviews of multiple studies have concluded that non-cropped field margins can provide natural pest regulation, pollination, carbon sequestration, nutrient cycling, nutrient capture and reduced erosion (Van Vooren et al., 2017; Mkenda et al., 2019). It is notable, however, that: 1) most individual empirical studies consider only one ecosystem service and one agri-environmental activity; and 2) certain ecosystem services, especially cultural services, are much less studied with respect to agri-environmental actions.
Furthermore, while there has been speculation about how the range of agri-environmental actions available to a farmer might be deployed on any individual farm to enhance multiple services (Bradbury et al., 2010; Wratten et al., 2012), there has been little empirical research to inform such decision-making. Thus, how might the range of agri-environmental actions that a farmer implements affect a range of ecosystem services? To address this knowledge gap, we designed a farm-scale experiment to assess how the deployment of a set of agri-environmental actions on a farm affected the delivery of key ecosystem services. We hypothesized effects of specific agri-environmental actions on specific services, as described in Table 1, based on the literature. A crucial aspect of the experiment was to measure directly in situ the multiple processes that contribute to ecosystem service delivery rather than use proxies, which can be misleading (Stephens et al., 2015), or models, which can be inaccurate (Willcock et al., 2019). A drawback of our approach that focussed on one (large) farm is that the transferability of results from the one farm to others might be questioned. But the benefit is that we studied a number of agri-environmental actions and several ecosystem services all in the same setting, allowing direct comparability. A single farm study helps in understanding both the potential of agri-environmental schemes to deliver their goals and variability in outcomes between different services that can be expected. Furthermore, the agri-environmental actions we implemented were chosen through co-design and cooperation with the farmer, and so reflect what farmers might actually implement in practice.

Farm and experimental design
The experiment was implemented on a large mixed farm estate which covers about 1300 ha in the county of Buckinghamshire in south-central England. It was chosen to be representative of lowland farming in southern and eastern England (Defra, 2021), being on clay loam soils with pH 6-7, and low-lying with a relatively flat topography and some impeded drainage. The farm had small areas of sheep and cattle pasture, but the dominant land use was arable with a rotation of winter wheat with winter oilseed rape, spring barley and spring beans.
The experiment had a randomised block structure, with four blocks of comparable area (ca. 200 ha). Each block was divided into treated and control areas (Fig. 1). The control area retained some standard agri-environment (AE) options in line with the farmer's existing AE agreement: low fertiliser on grassland, grass buffer strips, and wild bird seed mix in field margins. These reflect minimum interventions carried out by many English farmers. Our enhancement treatment used the extra AE managements listed in Table 1, which were placed across the fields where we considered they would achieve most benefit and to fit with the farming operations (Fig. 1). The AE managements were derived from the English Environmental Stewardship scheme, which ran 2005-2019, comprising basic options under Entry Level Stewardship, ELS (Natural England, 2013a), and more onerous options designed to achieve greater environmental benefits under Higher Level Stewardship, HLS (Natural England, 2013b). The ELS had 91 options alone, so we selected management options (Table 1) according to two criteria. 1) They were deemed likely according to the literature to have an effect on one or more ecosystem services. We considered ecosystem services which might be enhanced by suitable farm management (Firbank et al., 2013). 2) They were appropriate for the farm; we discussed with the farm manager which options he was able to implement, as the managements were funded under the farm's Environmental Stewardship agreement. There was little pasture and we arranged the blocks to always have pasture in the treated areas. AE managements were set up in the autumn of 2011 to early 2012, with the exception of the ponds, which were dug during autumn 2012. The field margins, sown fallows, enhanced pastures and arable reversion areas were set up and managed as standard for each AE option, using bespoke seed mixes (see SM1). We tried different seed addition approaches for the enhanced pastures, but here we contrast plots sown with the most successful method, seed sowing following disturbance with disc harrow, with unsown control plots. With the exception of an existing pond in Block 4, the ponds were dug for the experiment. The ponds formed rough ellipses and varied in shape, but were all 25-35 m in circumference. We positioned the ponds in field corners in sequence with sown fallow and field margin management options (1 and 2 in Table 1) to test the differing abilities of these managements to remove pollutants from run-off water (see below for further description).

Table 1. Agri-environment management options selected for the experiment and the ecosystem services they were expected to affect, along with the specific ecosystem processes contributing to the delivery of those services (with supporting references).
[Table body only partly recoverable. Columns: AE management; Ecosystem services affected (processes). Recoverable entries: 1) Over-winter stubble followed by over-sown fallow; Climate regulation (greenhouse gases, soil community and stocks); Forage production and quality; Cultural services (enhance aesthetic enjoyment).]
The margins and reversion managements were managed by cutting, while the enhanced pasture managements were grazed by cattle or sheep. Most management options established reasonably well; the condition of each is discussed where relevant alongside specific ecosystem services. The managements were allowed to establish through 2012 before measurements of ecosystem services began and these ran through to late 2014.

Ecosystem service and biodiversity measures
While a number of management options were expected to affect each of the ecosystem services of interest, for each particular service we focussed on specific management options most likely to affect the service (Table 1). This was done to use limited resources in the most efficient way. Because each ecosystem service was linked to a subset of AE options rather than the whole enhanced area, we measured each ecosystem service at the most appropriate spatial scale and selected appropriate controls for each ecosystem service. As our aim was to make process-based measures of ecosystem services, we employed multiple measures which combined to describe each service. These measures are detailed below for each ecosystem service and are summarised in Table 2.
Biodiversity was measured as a response in itself, with a focus on birds, bees and butterflies (Table 2), which are commonly used as biodiversity indicators in farmland (Zingg et al., 2018). Birds were surveyed using standard approaches (Hinsley et al., 2010). A point count station was located at the rough centres of both the treated and control areas in each block. Point counts over 8 mins were taken during two winter and three breeding season visits from December 2013 to June 2014. Bird species and numbers were recorded at two distances from the observer (0-50 m, and from 50 m to the edge of the treatment area), and finally birds were flushed within 50 m of the point location.

Fig. 1. ... Table 1 are represented by colour codes. Certain options were placed where they would achieve most benefit: wildflower margins, sown fallow and pond along a watercourse to benefit water quality; wildflower margins in arable fields to enhance pollination and pest control; forb addition into grasslands to benefit productivity.
Bees and butterflies were surveyed both at the treatment scale and within key management options. Following standard protocols (Westphal et al., 2008), three UV pan traps (white, yellow and blue) were placed in central locations in a grass margin, for three 48 h periods during June, July and August 2013. We also did surveys on all the flower-enhanced habitats (wildflower margins, arable reversion). Using standard protocols (McCracken et al., 2015), on three occasions from late June to early August 2013, a 100 m transect was laid out along each habitat, bees and butterflies were surveyed, and the number of flower units was counted in five 1 m² quadrats placed along each transect.
Pollination. In 2013, we selected across the three blocks 3 × 2 paired fields which were growing oilseed rape, the main insect pollination-dependent crop grown on the farm. One of each pair had grass-only margins, and the other wildflower margins. In each field we quantified oilseed rape seed set along a transect at 10, 20 and 50 m from the margin. We compared oilseed rape seed yield at each distance in control 2 × 2 m areas (pollinators had full access) to that in adjoining 2 × 2 m areas covered by pollinator exclusion cages (1.8 m tall with 0.6 mm netting). Cages were erected in early March 2013 and in July we harvested a 1 × 1 m area within the cages and the controls, and samples were dried and threshed. Yield (Y) of oilseed rape (tonnes ha⁻¹) attributable to insect pollination was defined as: $Y_{\mathrm{poll}} = Y_{\mathrm{control}} - Y_{\mathrm{cage}}$. We also surveyed pollinator visitation rates on four occasions during peak oilseed rape flowering in May by observations over 5 mins of bees within the control areas at each distance.
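The pollination-attributable yield defined above can be illustrated with a minimal sketch. It assumes each harvested 1 × 1 m sample is recorded as grams of dry seed; the function names and the example masses are hypothetical and are not taken from the study. The unit conversion relies on 1 g m⁻² being equal to 0.01 tonnes ha⁻¹.

```python
def grams_per_area_to_tonnes_per_ha(mass_g: float, area_m2: float = 1.0) -> float:
    """Convert a dry seed mass harvested from a known area to tonnes per hectare."""
    # 1 ha = 10,000 m^2 and 1 tonne = 1,000,000 g, so 1 g/m^2 = 0.01 t/ha.
    return (mass_g / area_m2) * 0.01


def pollination_yield(control_g: float, cage_g: float, area_m2: float = 1.0) -> float:
    """Y_poll = Y_control - Y_cage, expressed in tonnes per hectare."""
    y_control = grams_per_area_to_tonnes_per_ha(control_g, area_m2)
    y_cage = grams_per_area_to_tonnes_per_ha(cage_g, area_m2)
    return y_control - y_cage


# Hypothetical masses (g of dry seed from a 1 m^2 harvest area), for illustration only.
example = pollination_yield(control_g=400.0, cage_g=350.0)
print(f"Y_poll = {example:.2f} t/ha")
```

The result is negative whenever the caged area out-yields the open control, so the sign of the value carries information and should not be discarded.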
Pest control was assessed during 2014 in pairs of winter wheat fields in each of the four blocks (4 × 2 fields), again contrasting fields with grass-only or flower-rich margins. We quantified: (i) the overall contribution of invertebrates to pest control; and (ii) if predator guilds that hunt on the soil surface vs those within the crop canopy combine to enhance control. In May 2014 we used clip cages to establish colonies of the aphid Rhopalosiphum padi along transects into the crop from the margin edge, on three wheat plants at 10 m and again at 50 m. The colonies at each distance were allocated to one of three treatments: 1) exposed to all predators; 2) exposed to canopy predators only, by surrounding the base of the wheat plant with a plastic tube; 3) protected from all predators, by covering the wheat plant with a net bag. We counted adult aphids in each colony at ca. 5 day intervals over 41 days and calculated the number of days each colony survived. We also made counts over 5 mins at each transect location of canopy predators on five wheat tillers and of soil surface predators in 1 × 5 quadrats.
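One way to derive colony survival time from the repeated aphid counts is sketched below. The text does not state the exact rule used, so this sketch assumes that a colony is treated as dead at the first census with no adult aphids, and that a colony still alive at the last census is flagged as censored. The function name and the example data are hypothetical.

```python
from typing import Sequence, Tuple


def colony_survival_days(census: Sequence[Tuple[int, int]]) -> Tuple[int, bool]:
    """Return (days survived, censored) from (day, adult_count) census records.

    Assumption: death is recorded at the first census day with zero adults.
    If no census records zero adults, the last census day is returned and
    the colony is flagged as censored (still alive when monitoring ended).
    """
    if not census:
        raise ValueError("at least one census record is required")
    ordered = sorted(census)  # order records by day
    for day, adults in ordered:
        if adults == 0:
            return day, False
    return ordered[-1][0], True


# Hypothetical counts at roughly 5-day intervals, for illustration only.
exposed = [(0, 6), (5, 2), (10, 0), (15, 0)]
protected = [(0, 6), (5, 9), (10, 12), (41, 15)]
print(colony_survival_days(exposed))
print(colony_survival_days(protected))
```

Keeping the censoring flag separate from the day count makes it explicit which survival times are lower bounds rather than observed deaths.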
Forage yield was assessed in the control vs forb addition grasslands (disc harrowed). We found no significant treatment effects on forb or legume cover (see SM2), but there was variation in vegetation cover in these grasslands, which we used to assess if forb and legume cover affected forage yield. In May 2014 we placed three 0.3 m³ grazing exclusion cages in each of the two managements in three blocks (excluding Block 2 as this showed poor establishment of the sown seed). We surveyed the vegetation in each cage and clipped it to ground level. Six weeks later we clipped the regrown vegetation and removed it, and this was repeated after another six weeks. These samples were dried, weighed and analysed for nutritional quality.
Water quality. We quantified the effectiveness of three types of field margin (acting as buffer strips along water courses) and in-line ponds for trapping pollutants and sediment. We selected small headwater ditches running along the edges of large agricultural fields. On the ditch-field boundary, 40 m lengths of three buffer strip types were established in order of hypothesised effectiveness for reducing nutrient and sediment loss and these ended where the ditch emptied into one of the ponds. The potentially least effective, a 6 m wide grass-only buffer, was established next to the pond. Along from this was a 6 m wildflower buffer, and furthest upstream we placed a 30 m wide sown fallow buffer. We expected to find an increase in pollutant concentrations along the ditch to the pond as the effectiveness of the buffers decreased. We sampled the water in each ditch at the downstream extent of each buffer strip type at monthly intervals when water was flowing between October 2012 and June 2014. In the ponds, we sampled inflow and outflow water approximately monthly. Sediment accumulation was monitored by sinking three 0.25 m diameter circular plastic trays in each Block 1-3 pond in late October 2013. We excluded the Block 4 pond because of disturbance from nearby building work. We gathered the trapped sediment from these trays at approximately monthly intervals until mid-June 2014. Water samples were analysed using standard methods (Bowes et al., 2018) for concentrations of sediment, total phosphorus (TP), total dissolved phosphorus (TDP), soluble reactive phosphorus (SRP), nitrate (NO₃), ammonium (NH₄), total dissolved nitrogen (TDN) and dissolved organic carbon (DOC), as well as major anions (fluorides F, chlorides Cl, bromides Br, and sulphates SO₄). Sediment samples from the pond traps were analysed for dry mass and total phosphorus concentrations.
Climate regulation. We measured soil organic matter pools, soil microbial diversity and greenhouse gas (GHG) emissions in management options in Blocks 1, 3 and 4, which contrasted types of vegetation cover: control and forb-enhanced grassland, arable crops (oilseed rape in Blocks 1 and 4, wheat in Block 3), wildflower margins, and sown fallow, with one option sampled in each block. Block 2 was excluded because of the poor plant establishment in the grassland, and to make best use of resources. GHG were measured approximately monthly between May and October 2013 in all selected managements except the wildflower margins (again, for cost efficiency). Gases were sampled in static chambers every 10 m along 50 m transects in each option. Concentrations of CH₄ and N₂O were analysed by gas chromatography following Ward et al. (2007). Using a portable infra-red gas analyser (IRGA) (EGM-4, PP Systems), measurements of CO₂ exchange were made using dark and light chamber lids for ecosystem respiration and net CO₂ flux respectively, with the difference between the two representing photosynthetic rates. We also collected soil cores (5 cm width, 15 cm depth) in July 2013 and August 2014. The latter were analysed using standard methods (Emmett et al., 2008) for total C and N content, pH, loss on ignition, bulk density and electrical conductivity. Carbon density was calculated from loss on ignition and bulk density. We extracted microbial phospholipid fatty acid (PLFA) biomarkers from the 2013 cores to examine bacterial and fungal communities, using standard methods (Frostegård et al., 1991; Zelles and Bai, 1993).
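The relationship between the dark-lid and light-lid CO₂ measurements can be sketched as follows. This is an illustration under an assumed sign convention (positive flux means net emission to the atmosphere, negative means net drawdown); the function name, the units and the example values are hypothetical and are not taken from the study.

```python
def photosynthetic_rate(dark_flux: float, light_flux: float) -> float:
    """Gross photosynthetic CO2 uptake from paired chamber measurements.

    dark_flux:  CO2 flux under the dark lid (ecosystem respiration only).
    light_flux: CO2 flux under the light lid (net CO2 flux: respiration
                minus photosynthetic uptake).
    Assumed sign convention: positive = emission, negative = drawdown.
    """
    return dark_flux - light_flux


# Hypothetical paired measurements in arbitrary but matching flux units.
respiration = 5.0   # dark lid
net_flux = -2.0     # light lid: the plot is a net sink
print(photosynthetic_rate(respiration, net_flux))  # uptake exceeds respiration
```

Under this convention a negative light-lid flux corresponds to photosynthetic uptake larger than respiration, and a positive one to a plot that is a net source.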
Cultural services. We carried out surveys of how members of the public perceived and appreciated the different AE management options. Cultural services take many forms (King et al., 2017), and we focussed on aesthetic preferences. The first study used a questionnaire targeted at visitors to the farm during Open Farm Sunday (a national event, encouraging the public to visit farms) in June 2013. The questionnaire was given to visitors to complete following a tractor ride through the estate. The questionnaire was designed to be short, simple and visual, and asked the visitor to reflect on how they enjoyed these options on a ranked scale, using photographs of each option. They were also asked which animal groups they most enjoyed seeing on the farm, and about their attitudes towards different field margin management options (questionnaire in SM3). We obtained 87 responses from people aged from 12 to 61+ years, living in cities, towns and villages, and of whom 60% identified as female. The second study used a questionnaire based on the first questionnaire (questionnaire in SM4; this had a greater range of questions, but here we analyse only those similar to those in the first questionnaire), and was targeted at residents who lived near the farm by leaving copies in local businesses. We obtained 31 responses from people aged from 18 to 61+ years, of whom 46% identified as female.

Statistical analysis
Analyses were done in SAS 9.3. We used generalised linear mixed effects models (GLMM; using Proc GLIMMIX) for most analyses. In general, we used Poisson errors for count data and normal errors for other data, unless stated otherwise. We did full model checks to ensure that model assumptions were met. Where treatments were hierarchical, we used appropriate random terms to specify the models.
For biodiversity measures, we did GLMMs for the count data, with block and survey date as random effects, to assess whether treated areas had more birds or, given that AE options tend to supply seed for birds (McCracken et al., 2015), more granivorous birds than control areas. Individual species were too uncommon to analyse separately. To assess other causes of variation in bird numbers the GLMMs included the cover (m²) of common land use types within 50 m of the observation point (SM6).
For pollination, we constructed GLMMs with block as a random effect, to assess responses of both yield increase due to pollination ($Y_{\mathrm{poll}}$) and pollinator numbers (summing all bees and hoverflies) to margin type and distance from the margin. Including the interaction between these fixed effects led to large increases (> 4) in the calculated AIC, so we excluded interactions in our final models.
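The AIC-based decision to drop the interaction can be illustrated with a small sketch. It uses the standard definition AIC = 2k − 2 ln L and treats an increase of more than 4 as the cut-off mentioned above. The log-likelihoods and parameter counts are hypothetical; in the study the AIC values came from the fitted SAS models, not from this code.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def drop_interaction(aic_main: float, aic_interaction: float, threshold: float = 4.0) -> bool:
    """True when adding the interaction raises AIC by more than the threshold."""
    return (aic_interaction - aic_main) > threshold


# Hypothetical fits: main-effects model vs model with a margin x distance interaction.
aic_main = aic(log_likelihood=-100.0, n_params=3)
aic_inter = aic(log_likelihood=-99.8, n_params=6)
print(aic_main, aic_inter, drop_interaction(aic_main, aic_inter))
```

In this made-up case the extra parameters improve the likelihood only slightly, so the penalty term dominates and the simpler model is retained.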
For pest control, we analysed if the counts of each predator guild were affected by margin type and distance, with block as a random effect in GLMMs. There were a number of zeros in the datasets, but tests suggested no zero inflation. We also analysed how margin type, distance and exclosure treatment affected the number of days that aphid colonies survived. To do this we constructed a GLMM with all pairwise interactions between fixed effects, and block as a random effect.
For forage yield, we used GLMMs with repeated measures, accounting for the nested design, and using the beta distribution with a logit link for the proportional data to analyse the forage yield and quality variables in relation to treatment (control vs disc harrow).
For water quality, we analysed differences in nutrients, sediment and major anions among the ditch sampling points and between pond inflows and outflows using GLMMs with repeated measures for the sampling date, and with block as a random effect. Lognormal errors were used because of high variability of the data.
For climate regulation, we used GLMMs to analyse the soil data from 2013 and 2014, accounting for the nested design, and using the beta distribution with a logit link for the proportional data and the normal or lognormal distribution for other data. Gas fluxes from the static chambers and IRGA from May to October 2013 were analysed with GLMMs accounting for the nested design, and also the repeated measures over several months. Normal or lognormal distributions were used as appropriate.
For cultural services, where the survey asked respondents to choose between different answers, we analysed the data using Chi-squared tests. Other questions required respondents to give an enjoyment score to different categories, and we analysed these data using generalised linear models for ordinal data (using Proc GENMOD).
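As an illustration of the Chi-squared tests on choice questions, the sketch below computes a goodness-of-fit statistic against the null hypothesis that all answer options are chosen equally often. The text does not specify the exact form of each test, so this null hypothesis is an assumption, and the response counts are hypothetical.

```python
from typing import Sequence, Tuple


def chi_squared_equal_preference(counts: Sequence[int]) -> Tuple[float, int]:
    """Return (chi-squared statistic, degrees of freedom) for observed counts
    tested against equal expected frequencies across all options."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        raise ValueError("need at least two options and one response")
    expected = total / k
    statistic = sum((observed - expected) ** 2 / expected for observed in counts)
    return statistic, k - 1


# Hypothetical numbers of respondents choosing each of four options.
stat, df = chi_squared_equal_preference([30, 20, 25, 25])
print(f"chi-squared = {stat:.2f}, df = {df}")
```

A p-value would then be read from the Chi-squared distribution with the returned degrees of freedom, for example with `scipy.stats.chi2.sf(stat, df)` if SciPy is available.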
The pan trap sampling at the treatment scale trapped a range of Diptera, Coleoptera, Hymenoptera and Lepidoptera. Numbers summed across survey dates showed no differences between treatment vs control areas in abundance of butterflies (treatment vs control, mean ± standard error: 21 ± 10.2 vs 15.8 ± 1.9), abundance of bumblebees (19.5 ± 4.3 vs 23.8 ± 9.9) or species number of bumblebees (8.5 ± 0.65 vs 7.8 ± 1.18). Sampling within the enhanced treatment areas showed that the quality of the margin affected the number of bee and butterfly species (species listed in SM6). Poisson regression, in which we controlled for weather (temperature and cloud cover) during the survey, showed that flower number strongly influenced abundance of bees (r² = 0.443, p < 0.0001), and had a significant, although small, effect on butterfly numbers (r² = 0.042, p = 0.021) (further details in SM6).
Exclusion of predators had large positive effects on aphid colony survival times (Table 3, Fig. 3). When aphid colonies were exposed to both predator guilds they survived on average fewer than 10 days, while the exclusion of all predators allowed colonies to survive on average 40 days. This pest control declined with distance from the margin (distance and distance × exclosure effects). Interaction terms also indicated effects of margin type on pest control, with enhanced control near wildflower margins. As a result, aphid colonies 10 m from wildflower margins survived about 2 days in the presence of both predator guilds. But where colonies were 10 m from grass-only margins, or more than 50 m from either margin type, they survived between 6 and 11 days (Fig. 3).

Fig. 2. Pollination services. a) Oilseed rape yield difference between pollinator exclosures and allowing full pollinator access, as affected by distance from the field margin, with the fitted regression. The counterintuitive negative values may be because cages affected the microclimate and/or excluded pests too. b) Pollinator densities as affected by distance from the field margin. c) Oilseed rape yield difference as affected by pollinator density, with the fitted regression line.

Table 3. Generalised linear mixed model analysis of the effects of predator exclosure, field margin type and distance from the field margin on the survival of aphid colonies (days to death) placed on wheat plants.

Water quality
Pollutant concentrations were highly variable over time, with strong differences among sampling dates (SM8). But there were no differences among ditch sampling points or between pond inflows and outflows (Fig. 5a, b; SM8). By contrast to these discontinuous water samples, the trays sunk into the ponds provided integrated measures of sediment accumulation from October 2013 to June 2014. Across the three ponds, each tray accumulated an average of 89.7 g of sediment and 1.62 g of phosphorus over each sampling interval. Using the area of each pond, we estimated that over the 8 month period the ponds trapped: in Block 1 (32 m²) a total of 211.4 kg of sediment and 231.3 g of P; in Block 2 (31 m²) a total of 342.9 kg of sediment and 604.6 g of P; and in Block 3 (53 m²) a total of 1035.3 kg of sediment and 1861.2 g of P.
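The step from sediment caught in the pond trays to a whole-pond total can be sketched as a simple area scaling. This assumes sediment settles uniformly over the pond bed, which the text does not state, and uses the 0.25 m tray diameter given in the methods. The function names and the example tray mass are hypothetical, so the output is not expected to reproduce the reported totals.

```python
import math


def tray_area_m2(diameter_m: float = 0.25) -> float:
    """Area of a circular sediment tray."""
    return math.pi * (diameter_m / 2) ** 2


def pond_total_kg(tray_mass_g: float, pond_area_m2: float, diameter_m: float = 0.25) -> float:
    """Scale the mean mass caught per tray over the whole period to the pond area.

    Assumption: deposition per unit area is the same across the pond bed.
    """
    mass_per_m2_g = tray_mass_g / tray_area_m2(diameter_m)
    return mass_per_m2_g * pond_area_m2 / 1000.0  # g -> kg


# Hypothetical mean tray total (g) summed over all sampling intervals, for a 32 m^2 pond.
print(f"{pond_total_kg(tray_mass_g=300.0, pond_area_m2=32.0):.1f} kg")
```

Because a tray covers only about 0.05 m², small differences in tray catch translate into large differences in the pond-level estimate, so replicate trays per pond matter.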

Climate regulation
The soils from the two grassland types (control and forb-enhanced) had higher % C and N content than soils under arable crops (Fig. 6a, b). Wildflower margins or sown fallow, which both had been sown onto arable land, also had lower C and N content than the grasslands, but there was a trend for higher values than under continuing arable (Fig. 6a, b). Bulk density was lower for the grassland treatments than for the others (Fig. 6c), as a result of which carbon density did not differ among treatments ($F_{4,8} = 0.83$, p = 0.541). There were also no treatment effects on C/N ratios ($F_{4,8} = 1.17$, p = 0.392) or LOI ($F_{4,8} = 2.14$, p = 0.167).
Total microbial, bacterial and fungal biomass were also higher in the two grassland treatments than in the arable and sown fallow treatments (Fig. 6d, e, f). Interestingly, the wildflower margins had higher values for all these measures than the arable and sown fallow treatments, with values more similar to those for the grassland soils. The bacterial to fungal ratios showed little difference among the treatments ($F_{4,7} = 2.99$, p = 0.098).
In the static chambers, N₂O emissions were higher in the crop and sown fallow than in the grasslands (Fig. 7b). Methane emissions were extremely low at all times and across all treatments (mean = 0.947 ± 2.397 ng/m²/hr). Ecosystem respiration rates measured with the IRGA were lower in the crop, higher in the grasslands, and highest in the sown fallow treatment (Fig. 7c). Photosynthesis rates showed similar patterns to respiration rates, except that they were relatively low in the sown fallow treatment (Fig. 7d). The balance of photosynthesis and respiration, expressed as net ecosystem exchange, was negative for the two grassland treatments, indicating net drawdown of CO₂, but was positive for the arable and sown fallow treatments, indicating net emissions.

Cultural services
In the initial survey, during Open Farm Sunday, respondents showed a strong liking for all AE management options seen on the farm and all were equally preferred ($\chi^2 = 3.19$, df = 3, p = 0.364; Fig. 8a). These visitors also liked seeing all types of animal on the farm, and this included both wild species and livestock ($\chi^2 = 8.12$, df = 3, p = 0.044; Fig. 8b). Interestingly, overall there was a greater preference for livestock than for butterflies (CONTRAST in Proc GENMOD; $\chi^2 = 6.79$, p = 0.009) or bees ($\chi^2 = 3.71$, p = 0.054). Respondents were also asked about their attitudes towards arable fields with no margin vs those with a flower margin (Fig. 8c). Respondents showed a significant preference for flower margins: they preferred their appearance ($\chi^2 = 61.25$, p < 0.001), would prefer their food to be grown in such conditions ($\chi^2 = 6.08$, p = 0.014), and would prefer this option to be seen on farms ($\chi^2 = 40.01$, p < 0.001). However, the respondents also generally expected that they would not see these options on farms ($\chi^2 = 7.18$, p = 0.007). Furthermore, when asked under what conditions they would prefer food to be grown, respondents chose no margin more frequently than when asked about preferences for the look of fields ($\chi^2 = 9.58$, p = 0.002).
While residents also expressed a liking for most AE management options, the responses were not as positive as for Open Farm Sunday respondents, and residents were more discriminatory ($\chi^2 = 13.18$, df = 3, p = 0.045; Fig. 9a), showing stronger preferences for field margins ($\chi^2 = 4.17$, p = 0.041) and woods ($\chi^2 = 6.81$, p = 0.009) over hedgerows. Similarly, while strong liking was shown for birds (of prey and of farmland) and butterflies, respondents were much more equivocal about livestock, bees and small mammals ($\chi^2 = 12.24$, df = 5, p = 0.032; Fig. 9b). When asked about margins on arable fields, contrasting no margins, grass margins or flower margins, respondents strongly preferred the look of flower margins ($\chi^2 = 27.79$, df = 2, p < 0.001; Fig. 9c), but generally expected not to see flower margins on farms ($\chi^2 = 7.72$, df = 2, p = 0.021). Respondents would prefer to have ($\chi^2 = 12.93$, df = 2, p = 0.002), and prefer their food to be grown on ($\chi^2 = 7.79$, df = 2, p = 0.020), farms with flower margins. But in contrast to the visitors, residents did not perceive a conflict between flower margins and growing food ($\chi^2 = 0.80$, df = 2, p = 0.670). In response to a question, the majority (25 of 31) of residents stated they were not aware of the AE scheme at the farm. But of the six that did know about it, four felt positive about the changes.

Discussion
We found experimental evidence for many positive effects of agri-environmental actions on the processes linked to multiple ecosystem services, which we summarise in Table 4. The combination of agri-environmental actions enhanced particular processes contributing to cultural, regulating and provisioning services, comprising aesthetic appeal, carbon sequestration and reducing greenhouse gas emissions, sediment and nutrient capture, pest control, crop pollination, and forage and crop yield. However, we also found that several processes and ecosystem services were clearly not enhanced by certain agri-environmental actions (Table 4). A particular take-home message is that, while wildflower margins may promote beneficial invertebrates and consequently increase crop yield and provide cultural services, we found they did not decrease greenhouse gas emissions or improve water quality within the timeframe of this study.

Biodiversity and its impacts on yield were enhanced by AE options at a local scale
As many others have done (Carvell et al., 2007; McCracken et al., 2015), we found enhanced biodiversity at the small scale within wildflower margins. The lack of general treatment-wide effects on birds, bees and butterflies reflects other studies which have found equivocal effects of agri-environmental management on biodiversity at large scales (Baker et al., 2012; Carvell et al., 2015; Angell et al., 2019). This likely reflects the large foraging range of many of the species we studied and indicates the need to have large proportions of farm areas under agri-environmental management before benefits can be seen (Zingg et al., 2019). The potential for local enhancement to increase biodiversity is illustrated, however, by the effects of wildflower margins on pollinators, and our finding that margin, tree and hedgerow cover had positive relationships with some measures of bird abundance (SM6).

Fig. 6. Soil properties in control ('Grassland') and forb-enhanced ('Grassland+') grassland, arable crops, sown fallow and wildflower margins treatments. We plot the model-estimated least square means, and letters show the significant (p < 0.05) differences between these means.

Higher invertebrate numbers did appear to promote pollination and yield for oilseed rape and pest control on wheat. However, while we found direct effects of floristically enhancing margins on wheat yield, there was no direct effect on oilseed rape yield. Noting that these specific data were also used in a broader analysis by Woodcock et al. (2016), which reached the same conclusions, this finding accords with increasing evidence of pollinator and predator 'spill-over' from field margins into crops (Albrecht et al., 2020). Moreover, the synthesis by Albrecht et al. (2020) also found a general pattern of rapid declines in these benefits with distance from the margin into the field. Enhancing biodiversity also appeared to benefit forage yield, through increasing plant diversity by sowing forbs into grassland. The sowing of new species had limited and variable success, which reflects the difficulty in establishing forbs in permanent pasture without major soil disturbance (Woodcock et al., 2014). But where legumes were enhanced locally, we found some benefits for forage quality related to nitrogen content and digestibility, although no effects on amount of forage. While there is a well-known relationship between plant diversity and productivity (Hector et al., 1999), it is less certain how this translates into effects of species addition on yield in agriculturally productive grasslands. The few existing studies show some positive, albeit variable, effects on forage yield in terms of amount or quality (Hofmann and Isselstein, 2005; Bullock et al., 2007; Jerrentrup et al., 2020).

Ponds improved water quality, but buffer strips did not
Vegetated buffer strips situated between the farmland and watercourses have been much promoted as a way of capturing nutrients and sediment running off farmland and into watercourses (Stutter et al., 2012). Syntheses and reviews of field studies have found this approach is generally effective (Van Vooren et al., 2017; Valkama et al., 2019; Cole et al., 2020). However, studies vary as to the benefits of making strips wider or with a more diverse vegetation (Cole et al., 2020). A meta-analysis by Valkama et al. (2019) found no overall effects of buffer width or species number on nitrogen retention, but another meta-analysis by Van Vooren et al. (2017) found positive effects of margin width on interception of nitrogen, phosphorus and sediment. Our results suggest increasing the plant diversity of 6 m buffer strips or increasing their width up to 30 m had no effect on capture of phosphorus, nitrogen, sediment or a range of anions. We had no controls without a vegetated strip on this farm, as English basic farm payments encourage all farmers to create simple buffer strips along watercourses. To investigate this issue we applied the Farmscoper model (Gooday et al., 2014) to assess the improvement in field-scale water quality from establishing agri-environmental measures. Simulation of the implemented treatments in the context of this farm suggested only very small reductions in mass losses (2.0-2.1% for NO₃, 2.0-3.7% for TP and 2.8-4.9% for sediment) would be achieved compared to no buffer strips under typical weather conditions. Thus, our results suggest little benefit from vegetated strips as buffers for watercourses. These findings may have been somewhat specific to this type of farm, as it had a rather flat topography, a heavy clay soil that was prone to develop fissures which may have intercepted run-off, and most fields had sub-surface field drains which would bypass the field margins to some extent.
The last point is not unusual however, as a large proportion of arable land is drained in this way in the UK (66%), especially heavy clay soils, and also in many European countries (Brown and van Beinum, 2009). Dorioz et al. (2006) argue that buffer strip effectiveness is a complex product of slope, soil type, vegetation type and rainfall patterns, so it is not simple to state where they might be effective.
Fig. 7. Gas fluxes in control ('Grassland') and forb-enhanced ('Grassland+') grassland, arable crops, sown fallow and wildflower margins treatments as measured in static chambers (a) and a portable infra-red gas analyser (b, c, d). We plot the model-estimated least square means, and letters show the significant (p < 0.05) differences between these means. Note that the estimation of least square means, given the hierarchical design, leads to the Net Ecosystem Exchange (NEE) means not precisely following the values for photosynthetic and respiration rates.

By contrast, the ponds dug at the end of the field ditches were very successful at trapping sediment and associated phosphorus. The amounts of sediment trapped over eight months in our ponds (211-1035 kg), which ranged in size from 31 to 52 m², are similar to the amounts trapped in a number of similarly-dug ponds (20-200 m²) across England, which trapped 20-4000 kg yr⁻¹ (Ockenden et al., 2012). More complex drainage systems involving ponds are also effective at removing nutrients from water coming off agricultural land (Carstensen et al., 2020). While we did not investigate other ecosystem service benefits from these new ponds, mature ponds can also support biodiversity, such as pollinators, in agricultural landscapes (Walton et al., 2020). However, the build-up of nutrients and sediment in such ponds can undermine biodiversity benefits (Walton et al., 2020), as well as pose a risk of re-suspension and outwash of pollutants (Carstensen et al., 2020).
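The sediment comparison above sets an eight-month total against annual rates from the literature. As a rough illustration only, the sketch below rescales the eight-month range to a yearly figure so both are in the same units. It assumes sediment accumulates at a constant rate through the year, which is an assumption of this sketch and not something tested in the study (run-off is likely seasonal); the function name `annualise` is hypothetical.

```python
def annualise(mass_kg: float, months_observed: float) -> float:
    """Scale a sediment mass observed over some months to a 12-month rate.

    Assumes a constant monthly accumulation rate (an illustrative
    simplification, not a result of the study).
    """
    return mass_kg * 12.0 / months_observed


# Range trapped by the study ponds over eight months (kg), from the text.
study_low_kg, study_high_kg = 211.0, 1035.0

# Annual range reported for similarly-dug ponds in England (kg per year),
# as cited in the text from Ockenden et al. (2012).
reference_low, reference_high = 20.0, 4000.0

annual_low = annualise(study_low_kg, 8)
annual_high = annualise(study_high_kg, 8)

# Check whether the rescaled study range sits inside the reference range.
within_reference = reference_low <= annual_low and annual_high <= reference_high

print(f"Annualised range: {annual_low:.1f}-{annual_high:.1f} kg per year")
print(f"Within reference range: {within_reference}")
```

Under this simplifying assumption, the rescaled range still falls within the range reported for other English ponds, which is consistent with the comparison made in the text.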

Variation in effects of AE options on climate regulation
Contrasting existing grasslands with arable land confirmed the consensus that the former have higher soil carbon and nitrogen content and have net drawdown of carbon by comparison to the net loss from arable systems (Janssens et al., 2005; Leifeld and Kogel-Knabner, 2005; Dawson and Smith, 2007). These patterns are likely primarily due to the loss of carbon and nitrogen arising from tillage and the transient vegetation cover. The lower emissions of N₂O in the grasslands are explained by the fact that they had little fertiliser, by contrast to the use of nitrogenous fertilisers on the arable land, which promotes emissions of this greenhouse gas (Roelandt et al., 2005). The higher microbial biomass in the grassland soils is an expected response to their higher organic matter content (De Deyn et al., 2011). However, we found no effects on soil properties or greenhouse gas emissions of adding forbs to grasslands, probably because this agri-environmental management was not very successful in attaining a large change in forb cover. Other longer-term and more successful enhancement of species diversity in grasslands has shown this can enhance soil carbon sequestration and decrease greenhouse gas emissions (De Deyn et al., 2011; Yang et al., 2019).
Similarly, our wildflower margins and fallow areas did not clearly benefit soil properties or greenhouse gas emissions compared to the arable fields. Sown margins have been found in general to enhance carbon stocks (Van Vooren et al., 2017). We speculate that this discrepancy was because our margins had been in place for rather a short time, being three years old by the end of our measurements, and changes in soil properties such as the carbon content often occur slowly (Richter et al., 2007). But we did find some trends towards enhanced soil carbon, nitrogen and microbial biomass in the sown margins, which hints that their soils were starting to improve. The tall vegetation, and thus greater biomass, in the fallow areas explains the greater photosynthesis and respiration than in the crops. The fact that the balance of these processes led to greater CO₂ emissions in the fallow than the crop may be because the fallow vegetation was quite open, with a relatively large amount of bare ground.

Cultural services were enhanced by AE options
Cultural service analyses in agricultural landscapes tend to contrast farmed with non-farmed land uses (Junge et al., 2015; King et al., 2017; Ridding et al., 2018). There is little work on aesthetic appreciation of specific agri-environmental elements; in general, the cultural service benefits from such actions are usually assumed rather than demonstrated. An exception is a study in Illinois which showed farmers, academics, and residents all preferred simple vegetated 'buffers' on farmland compared to no buffers (Sullivan et al., 2004). We found that both residents and visitors showed a clear liking for flower margins and expressed a preference for seeing more on farms. Interestingly, visitors (but not residents) also appeared to perceive a conflict with food production, probably considering that margins use land which might otherwise be used for arable crops. Habitats associated with agri-environmental management, such as ponds, wildflower meadows and hedges, were perceived generally positively by visitors, but residents were more negative about hedges and meadows. This may be because residents see these habitats as common aspects of the farmed landscape and so attach no special significance to them, which contrasts with the relatively new flower margins. Wildlife that is specifically targeted by agri-environmental management (birds, bees and butterflies) was generally liked by visitors. It is interesting that farm livestock were more appreciated by visitors, and this may reflect the particular audience attracted to Open Farm Sundays. Residents did like birds and, to some extent, butterflies, but there was no strong liking for bees, which is interesting given the assumptions in the literature that bees are particularly important for cultural services (Sumner et al., 2018). Overall, our results are generally positive, showing perceived cultural benefits from habitats and wildlife associated with agri-environmental management.
These findings do raise issues concerning a lack of appreciation of certain species groups, perceived conflicts between agri-environmental management and food production, and a lack of awareness of agri-environmental management even when it is being implemented locally. Evidence is accumulating that cultural services are affected by people's activities and experience, and so agri-environmental management may deliver such services better if combined with outreach activities.

A bet-hedging approach to deploying AE options
Overall, this study has illustrated the benefit of considering multiple ecosystem services as impacted by the several agri-environmental actions that might be implemented on a farm. A particularly significant finding was that, in this specific setting, we did not find support for all expected relationships between certain agri-environmental actions and specific ecosystem services. These contrasts with the literature may be due to the systemic publication bias in ecology towards studies that report positive outcomes (Fraser et al., 2018). The focus of individual publications on one management action and one ecosystem service may exacerbate this bias. This makes a study such as ours important, in that we can report some 'negative' outcomes, as recommended by Wood (2020). Furthermore, it is increasingly clear that one needs to understand the context-specificity of any action designed to enhance ecosystem services or biodiversity, as a certain management activity will not have the same outcomes everywhere (Spake et al., 2019). This is illustrated by our finding of no nutrient and sediment trapping by the vegetated field margins. So, an understanding of the reasons that some expected relationships between agri-environmental actions and ecosystem services were not manifest helps inform this context-specificity. One general issue was that the managements were in place for a relatively short period of time, 2-3 years, before we took measurements. But this reflects the short-term nature of much agri-environmental management, in terms of the length of the agreements that farmers sign (often five years in England), so the relative brevity of the experiment was appropriate in this case. The solution is not straightforward, however, and management for multiple ecosystem services and biodiversity will often involve trade-offs (Bullock et al., 2011).
For example, while wildflower margins tend to show a rapid decline over time in floral resources for pollinators (Smith et al., 2010) and in sediment-trapping ability (Cole et al., 2020), our margins had clearly not been in place long enough for carbon sequestration to become noticeable.
This study focussed on a single, albeit large, farming estate. An ideal, but very expensive, study would have replicated our approach across multiple farms to determine the context-specificity of the links between each agri-environmental option used and the ecosystem service outcomes. But an important finding of this study is that one cannot necessarily expect that a specific agri-environmental action will enhance all of a hoped-for set of ecosystem services in any particular setting, even if some studies have shown positive outcomes elsewhere. Indeed, the relationships between specific agri-environmental actions and specific services will likely differ among farms. Our study suggests that a general, bet-hedging approach could be to deploy multiple agri-environmental options on a farm. This would mean it is not necessary for each option to deliver for all services as long as there is a suite of options, each delivering well for one or two services. As a consequence, farmers need to work out a portfolio of options that will deliver ecosystem service benefits in their particular circumstances, rather than assuming that one or two options will work as catch-all solutions.

Table 4. A summary of the effects of each agri-environment option on each ecosystem service tested (see Table 1). The overall ecosystem services are underlined (and aligned in the two columns for clarity) and the individual contributing processes are listed in separate columns according to whether we found an effect of the agri-environment option or not.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.