The use of ecological models to assess the effects of a plant protection product on ecosystem services provided by an orchard

The objective of this case study was to explore the feasibility of using ecological models for applying an ecosystem services-based approach to environmental risk assessment using currently available data and methodologies. For this we used a 5 step approach: 1) selection of environmental scenario, 2) ecosystem service selection, 3) development of logic chains, 4) selection and application of ecological models and 5) detailed ecosystem service assessment. The study system is a European apple orchard managed according to integrated pest management principles. An organophosphate insecticide was used as the case study chemical. Four ecosystem services are included in this case study: soil quality regulation, pest control, pollination and recreation. Logic chains were
Journal Pre-proof


Introduction
Currently the environmental risk assessment (ERA) of pesticides in most jurisdictions is primarily based on the results of single species tests and, at the higher tier, on multispecies tests including microcosms, mesocosms and field studies (e.g. EFSA PPR, 2013). Single species tests have limited ecological realism and might underestimate the real ecological risks (Zhao et al., 2019); therefore, assessment factors are used to extrapolate from one species to all species and from effect thresholds (e.g. LC50) to no effect thresholds. The ecological realism of experiments using microcosms and mesocosms is greater, but the spatio-temporal consistency of effects thresholds remains variable (e.g. Sumon et al., 2018). Another limitation of this risk assessment framework is that the protection goal is rather vague, e.g. maintaining a healthy environment and conserving biodiversity, and that the spatio-temporal dimensions of the protection goal remain unclear (i.e. should we protect everything, everywhere, always?) (Brown et al., 2017; Maltby et al., 2017). As both the measurement endpoints and the protection goals used in the current risk assessment have their limitations, there is a great need for a new or revised risk assessment framework which allows a better informed assessment of and linkage between these factors.
In order to overcome some of these challenges, the use of the ecosystem services (ES) concept in the ERA of chemicals has been proposed. Besides adding a spatio-temporal dimension into the ERA, using the ES concept also results in assessments which have more relevance to risk managers as it can indicate which services should be protected, when and where, i.e. it allows the development of more specific protection goals.
Therefore, EFSA has produced guidance for the development of specific protection goals for use in environmental risk assessment (EFSA Scientific Committee, 2016). Specific protection goals for taxa important in delivering specific ES (i.e. service providing units, SPU) are defined in terms of five dimensions: ecological entity to protect, attribute to protect, magnitude of relevant effects, spatial scale of effect and temporal scale of effect (EFSA PPR Panel, 2010).
The objective of this study is to explore the feasibility of applying ecological models to an ES-based environmental risk assessment using currently available data and methodologies. It addresses the use of ecological models to extrapolate from standard laboratory tests to ES assessing the entities and attributes proposed in the EFSA specific protection goal guidance and scientific opinions (Table SI1). The output from the case study will be used to develop a methodology to 1) use the ES concept in a real case study, 2) assess the knowledge, data and modelling gaps preventing a practical, full-scale implementation of the concept and 3) evaluate the added value of an ES-based approach to regulatory decision making. This is a 'proof of concept' study and is not intended to be a detailed environmental risk assessment using best agricultural practices or risk mitigation measures. As with all risk assessments, there are extrapolations between measurement endpoints and assessment endpoints. The uncertainties associated with these extrapolations are acknowledged. As this paper is focussed on a plant protection product, we use the EFSA guidance documents for the risk assessment to provide a regulatory framework (EFSA Scientific Committee, 2016). An overview of the implementation and added value of assessing chemical risk within an ecosystem services framework, as obtained from case studies, is provided by Maltby et al. (2021). Maltby et al. (2021) includes the experiences from this case study and from another one assessing the risks of metal released into rivers within the Water Framework Directive using the ecosystem services concept.

Stepwise approach used to assess effects on relevant ES using ecological models
Step 2 The selection of ES uses the Common International Classification of Ecosystem Services (CICES v5.1; Haines-Young and Potschin, 2018) and is based on the potential for the study system to provide the ES, the potential for the study system to be exposed to the plant protection product and the potential for the plant protection product to affect the ES. This approach is consistent with EFSA guidance (EFSA Scientific Committee, 2016) and has been applied to chemicals other than plant protection products (Maltby et al., 2016).
Step 3 Development of logic chains to link the results of standard toxicity tests to ecosystem service delivery by identifying important species, which are SPU.
Step 4 Selection of ecological models to assess the effects of the plant protection product on the SPU in space and time so that the magnitude and time scale of the effects can be assessed.
Step 5 Detailed ES assessment extrapolating the effects observed on the SPU to the ES delivery using the logic chains.
The proof of concept study evaluates a subset of ES at risk. The assessment adopts a tiered approach starting with standard toxicity data (as per EFSA guidance documents). As the assessment is refined, non-standard species and measurement endpoints are combined with ecological modelling approaches to reduce the gap between the measurement endpoint and the assessment endpoint (i.e. ES). No new data or models are generated as the objective of the case study is to explore the feasibility of applying an ES-based approach to ERA using currently available data and methodologies. Selection of ES for detailed assessment is pragmatic and based on the availability of data and ecological models. It is acknowledged that a regulatory ERA will need to consider all ES potentially at risk.
Apple trees are a permanent crop and it is assumed that the trees are mature and that the orchard is a permanent landscape structure. Flowering occurs in May and apple harvesting is in late October-early November. The evaluation of ES delivery is focused on the orchard (approximately 30 hectares) plus a small boundary area (~10 m), which includes off-crop areas such as hedges but excludes aquatic systems. It is acknowledged that, whilst the beneficiaries of fruit production (i.e. cider producers and consumers) could be local, national or international, in most cases beneficiaries and users of the orchard will be members of the local community (Natural England, 2012). An assessment of the feasibility of having an orchard without plant protection product and implications for ES delivery are out of scope for this proof of concept study.
All models need to be framed within an environmental scenario. In this case study the evaluation of ES delivery focused on an orchard as described above. Please note that the scenario used is very simple as, for instance, tillage and the age of the trees, which will influence the ES provided and the effects of the plant protection product upon them, are not included. The interaction between an orchard and the wider landscape is strongly influenced by the distribution, size and abundances of patch types represented within the landscape (i.e. landscape configuration).
Landscape configuration is spatially explicit because it refers not only to the variety and abundance of patch types, but also to their placement or location in the landscape. Small scale heterogeneity (i.e. within orchard) will be simulated in models for species with low dispersal (earthworm, springtail), whereas landscape-scale heterogeneity will be simulated for species with high dispersal (honeybees, ladybird, butterfly). Consideration of the effect of changes in landscape configuration is out of scope for this proof of concept study.
The study chemical is an organophosphate insecticide, which is applied to cider apple orchards twice per year as pre- and post-blossom treatments in accordance with the product label requirements. Applications are made using low-drift nozzles to minimise spray drift. The study chemical is applied on two occasions, before flowering (480 g a.s./ha on April 15) and 6 weeks later after flowering (960 g a.s./ha on May 30). In the following models, this application profile is equivalent to an exposure multiplication factor (EMF) of one. As an example, an EMF of 0.1 would represent an application regime of 48 g a.s./ha on April 15 and 96 g a.s./ha on May 30. All models were run for multiple years of simulation. All models except the honeybee model were run for 5 years without exposure, 10 years with exposure and 5 years of recovery without exposure. The honeybee model was run for 10 years with exposure in order to represent a worst case scenario of managed honeybee colonies that are placed in the landscape to provide apple pollination and produce honey.
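The relationship between the EMF and the application regime can be written down in a few lines. The following is a minimal sketch, not the code used in the study: the function names, the `year` argument and the data layout are illustrative assumptions, while the rates, dates and the 5/10/5-year simulation design are taken from the text above.

```python
from datetime import date

# Application profile from the text, corresponding to EMF = 1:
# (month, day, rate in g a.s./ha)
BASE_APPLICATIONS = [(4, 15, 480.0), (5, 30, 960.0)]


def scaled_applications(emf, year=2021):
    """Return (date, rate) pairs for a given exposure multiplication factor.

    `year` is a hypothetical placeholder; the study does not name a calendar year.
    """
    if emf < 0:
        raise ValueError("EMF must be non-negative")
    return [(date(year, month, day), rate * emf)
            for month, day, rate in BASE_APPLICATIONS]


def exposure_years(n_pre=5, n_exposed=10, n_post=5):
    """Flag each simulated year as exposed (True) or unexposed (False).

    Defaults follow the design used for all models except the honeybee model.
    """
    return [False] * n_pre + [True] * n_exposed + [False] * n_post


if __name__ == "__main__":
    for when, rate in scaled_applications(0.1):
        print(when.isoformat(), round(rate, 1), "g a.s./ha")
    print(sum(exposure_years()), "exposed years out of", len(exposure_years()))
```

Calling `scaled_applications(0.1)` reproduces the example in the text (48 and 96 g a.s./ha), and an EMF of zero corresponds to the unexposed control.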

Ecosystem service selection
Apple orchards, as any crop, have the potential to provide a range of ES and to influence and be influenced by ES provided by the wider landscape. The potential ES provided by an orchard are dependent on how the orchard is managed (e.g. integrated pest management and presence of ground vegetation within the orchard). A list of the ES potentially provided by cider orchards and nearby off-crop areas are presented in Table 2 observational recreation (3.1.1.2) and have, therefore, been included in the present study. The outcome of a risk assessment to a group of organisms is concluded as "acceptable" or "unacceptable", as per EFSA Scientific Committee (2016), and directly relates to the comparison of the outcome of the risk assessment metrics to a trigger value, or the outcome of the modelling as compared to control or reference situations (see Section 3.). Note that in a decision-making process the level of risk that is expected is considered together with other aspects entering the regulatory process, such as the benefit to the crop being treated and the possibility to mitigate the risk through risk mitigation measures, to finally conclude on the acceptability or unacceptability of a risk. This constitutes the next step of the regulatory process, but is outside the scope of this paper.

Logic-chains for selected ES
For all the evaluated ES, logic chains were developed sensu Hayes et al. (2018) (Fig. 1). Hayes et al. (2018) developed evidence-based logic chains to assess the effects of trace metal contamination in soil on a suite of ES. We used their evidence-based logic chains to hypothesise causal chains of effects between the selected ES and their SPU. When no evidence-based logic chain was available, we hypothesised one based on existing knowledge. The logic chains start with an exposure to the plant protection product resulting in a decrease of a sensitive SPU. This direct effect (e.g. decreased earthworm survival and reproduction resulting in a decrease of earthworm abundance) cascades to a subsequent effect (e.g. decreased processing of organic matter and soil aggregates stability) into an altered final process (e.g. decreased soil nutrient cycling and increased soil erosion) and affected ES (e.g. decrease in soil quality/fertility). The ecological models used in the case study address the spatio-temporal magnitude of the direct effects on SPU (e.g. earthworms), and in some cases consider effects up to ES delivery (e.g. pollination).
Ecosystem services are provided by multiple species and there is some redundancy in function within SPUs.
The evaluation presented in Table SI2 is, however, based on a small number of standard test species and individual-level effects. The approach could be extended by using toxicity information for a wider range of species and by considering effects on populations rather than individuals. Generating species sensitivity distributions for SPU would enable an evaluation of the fraction of species within an SPU that are likely to be affected at a given exposure concentration. For example, the risk to natural pest control of crop plants may be different for above ground and below ground pest controllers due to differences in exposure but also the pests they control. In our example the risk to natural pest control (i.e. in-field) is acceptable for in-soil natural enemies, but the risk to above ground pest control is not (Table SI2). The assessment of above-ground natural pest control is based on the toxicity of the study compound to three species (lacewing, parasitic wasp, rove beetle), but toxicity data are available for 16 different natural enemy species. These data were used to generate a species sensitivity distribution (SSD) for natural enemies (Fig. 2). Based on this SSD, the fraction of natural enemy species affected at the in-field rate of 480 g a.s./ha is 70% (90% CI = 51-84%).
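To illustrate how a fraction of species affected can be read off an SSD, the sketch below fits a log-normal distribution to a set of toxicity endpoints and evaluates it at the in-field rate. This is a sketch under stated assumptions: the paper does not report which distribution or fitting method was used, so the log-normal form is an assumption, and the endpoint values in `example_lr50` are hypothetical placeholders, not the 16 values behind Fig. 2.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal_ssd(toxicity_values):
    """Fit a normal distribution to log10-transformed toxicity endpoints."""
    logs = [math.log10(v) for v in toxicity_values]
    return NormalDist(mu=mean(logs), sigma=stdev(logs))


def fraction_affected(ssd, exposure):
    """Fraction of species whose endpoint lies at or below the exposure level."""
    return ssd.cdf(math.log10(exposure))


def hazardous_rate(ssd, p=0.05):
    """Exposure level at which a fraction p of species is affected (HCp)."""
    return 10 ** ssd.inv_cdf(p)


if __name__ == "__main__":
    # Hypothetical endpoints in g a.s./ha, for illustration only.
    example_lr50 = [12.0, 35.0, 60.0, 110.0, 150.0,
                    240.0, 400.0, 900.0, 1500.0, 3200.0]
    ssd = fit_lognormal_ssd(example_lr50)
    print("Fraction affected at 480 g a.s./ha:", fraction_affected(ssd, 480.0))
    print("HC5 (g a.s./ha):", hazardous_rate(ssd, 0.05))
```

A confidence interval such as the one quoted above would additionally require a resampling or parametric uncertainty step, which is omitted here.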
Except for individual honeybee foragers, the ecological entities to be protected for the SPU are colony, (meta)population, functional group or community. In contrast, laboratory toxicity data are generally available for individual-level responses only, while the assessment of risk to ES based on individual-level effects may overestimate risk (Forbes and Calow 2013). Population models can be used to extrapolate individual-level effects to population-level responses and thereby provide a more appropriate assessment of risk against specific protection goals.
Population models are available for service providers for each of the species highlighted as being at risk from the case study chemical and are explored further in the next sections (Section 2.5.).

Ecological models
Population models were used to evaluate the potential population-level effects of the study compound on service providers relevant to the four ES for which unacceptable risk was identified (Table SI2): soil quality, natural pest control, pollination and observational recreation (aesthetic value). Soil quality regulation is dependent on nutrient cycling, soil formation/retention and soil remediation. Whereas the risk posed by the study compound to soil remediation was acceptable, the risks to nutrient cycling and soil formation/retention facilitated by earthworms and in-soil arthropods were unacceptable according to EFSA Scientific Committee (2016) (see also Section 2.3.).
Focal service providers modelled for the regulation of soil quality were therefore earthworms and springtails. The risk to natural pest control by non-target arthropods was unacceptable in-field and off-field. Ladybirds are an important natural enemy of pests in apple orchards and were selected as the focal service provider to be modelled.
The risk to pollination was also unacceptable in-field and off-field and honeybees were selected as the focal service provider to be modelled. Agricultural landscapes provide many cultural services and are often used for recreational activities, including observing nature and appreciating its aesthetic value. All above-ground terrestrial species could potentially contribute to this aspect of recreation, but some species are more charismatic or attractive than others, for example butterflies. There was an unacceptable risk to the contribution of non-target arthropods to cultural services including aesthetic value and butterflies were therefore selected as the focal service provider to be modelled.

Soil quality regulation: springtail and earthworm models
An individual based model (IBM) for the collembolan Folsomia candida (Meli et al., 2013; Meli et al., 2014) was adapted for use in this proof of concept study. Log-logistic dose-response relationships for effects on survival and on reproduction were implemented and other adaptations enabled the use of two applications per year, the input of a half-life or DT50 value, the use of an EMF and differentiation of initial concentrations producing heterogeneous exposure in space. Ecological population model parameter values were obtained from Meli et al. (2013) and the parameterisation of the toxic effects of the study chemical was derived from a laboratory study with Folsomia candida in artificial soil (OECD 232 and ISO 11267 compliant). Simulations assumed heterogeneous concentration and exposure dynamics in parts of an orchard with a size of 11 x 11 m. The modelled scenarios consisted of three 1-m wide tree rows, three 2-m wide inner tramlines and one 2-m wide outer tramline. The outer tramline was not sprayed. Model simulations ran for a time period of 20 years: 5 years without exposure, then 10 years with 2 pesticide treatments per year, then 5 years without pesticide exposure. Pesticide concentrations were based on values from experimental field trials. Pesticide loss was calculated assuming first order decline and a DT50 of 34.6 days. A vertical differentiation of exposure was not implemented in the model. Simulations were performed for one control ecological scenario (settings of the ecological parameters and landscape characteristics). Soil concentration input was modulated by multiplication with EMF. Simulations were performed with 100 replicates for the control without exposure, to characterise the control population dynamics in soil, and with 10 replicates for each of an increasing series of EMF values between 0.001 and 1.
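The two exposure-effect building blocks named above, first order decline and a log-logistic dose-response, can be sketched as follows. This is an illustration only and not the adapted IBM: the DT50 of 34.6 days comes from the text, whereas the function names, the initial concentrations, the 45-day gap between applications, and the `ec50` and `slope` values are hypothetical placeholders.

```python
import math

DT50_DAYS = 34.6  # soil half-life stated in the text


def concentration(c0, days, dt50=DT50_DAYS):
    """First order decline: C(t) = C0 * exp(-k * t), with k = ln(2) / DT50."""
    k = math.log(2) / dt50
    return c0 * math.exp(-k * days)


def log_logistic_effect(conc, ec50, slope):
    """Fractional effect (0 to 1) from a two-parameter log-logistic curve."""
    if conc <= 0:
        return 0.0
    return 1.0 / (1.0 + (ec50 / conc) ** slope)


def soil_concentration_series(c0_first, c0_second, gap_days, n_days,
                              emf=1.0, dt50=DT50_DAYS):
    """Daily soil concentration for two applications whose residues add up.

    Day 0 is the day of the first application; the EMF scales both inputs.
    """
    series = []
    for t in range(n_days):
        conc = concentration(c0_first, t, dt50)
        if t >= gap_days:
            conc += concentration(c0_second, t - gap_days, dt50)
        series.append(emf * conc)
    return series


if __name__ == "__main__":
    # Hypothetical initial concentrations (mg/kg soil) and toxicity parameters.
    series = soil_concentration_series(1.0, 2.0, gap_days=45, n_days=120, emf=0.5)
    peak = max(series)
    print("Peak concentration:", peak)
    print("Effect at peak:", log_logistic_effect(peak, ec50=0.8, slope=2.0))
```

Separate curves of this form would be needed for survival and for reproduction, each with its own parameters from the laboratory study.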
A standard control year was defined by taking the average of control simulations, providing one average standard population density for each day of the 20 years simulation time. Control scenarios were used to calculate the normal operating range. The normal operating range after 5 years ranged between -24.4% and 21.1%. Consequently, a 10% deviation of a treatment from the control (the start of small effect, Table 1) would not be significant, since it would be well inside the normal operating range.
The IBM for predicting how agricultural management practices (pesticide applications and tillage) affect soil functioning through earthworm populations developed by Johnston et al. (2015) was adapted slightly. A log-logistic dose-response relationship for effects on reproduction was implemented and other adaptations enabled the use of two applications per year, the input of a DT50 value, the use of an EMF and differentiation of initial concentrations producing heterogeneous exposure in space. Ecological population model parameter values were derived from Johnston et al. (2014a). The parameterisation of the toxic effects of the chemical used data from a laboratory study on the toxicity of the study compound to the compost worm Eisenia andrei (De Silva et al., 2009), which conforms to OECD Test Guideline 222 (2004). Simulations for the earthworm model assumed a soil compartment of 3 m width and 1 m depth, where earthworms can move vertically. Homogeneous concentrations and exposure dynamics were assumed in the top 5 cm soil layer and the concentration was assumed to be 0 below 5 cm, so no pesticide leaching was considered.
Model simulations were performed over a time period of 20 years as described for the springtails. Simulations were performed in 100 replicates for the control without exposure to characterise the control population dynamics in soil, and with 10 replicates, for EMF values between 0.001 and 100. The normal operating range after 5 years ranged between -15.5% and 18.3%. Consequently, a 10% deviation of a treatment from the control would not be significant, since it would not be outside the normal operating range.
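The comparison of a treatment effect with the normal operating range can be illustrated as below. The paper does not state how the range was derived from the control replicates, so expressing it as the percent deviation of the lowest and highest replicate from the control mean is an assumption made purely for illustration; the two ranges used in the example are the values reported in the text.

```python
from statistics import mean

# Normal operating ranges after 5 years, as reported in the text (percent).
SPRINGTAIL_NOR = (-24.4, 21.1)
EARTHWORM_NOR = (-15.5, 18.3)


def normal_operating_range(control_densities):
    """Percent deviation of the lowest and highest control replicate from the mean.

    Assumed definition; the study may have used a different summary.
    """
    avg = mean(control_densities)
    low = 100.0 * (min(control_densities) - avg) / avg
    high = 100.0 * (max(control_densities) - avg) / avg
    return low, high


def outside_range(deviation_pct, nor):
    """True if a treatment deviation (percent of control) falls outside the range."""
    low, high = nor
    return deviation_pct < low or deviation_pct > high


if __name__ == "__main__":
    for name, nor in (("springtail", SPRINGTAIL_NOR), ("earthworm", EARTHWORM_NOR)):
        print(name, "10% reduction outside range:", outside_range(-10.0, nor))
```

With both reported ranges, a 10% reduction stays inside the range, which is why such a deviation would not be distinguishable from control variability.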

Natural pest control: ladybird model
A landscape-scale model for the dynamics of ladybirds (Coccinella septempunctata) and their prey (aphids) was used (Bianchi and Van der Werf, 2003; Bianchi et al., 2007). Ladybirds disperse (up to 100 m) in an agricultural landscape containing crop and non-crop plants and feed on aphids. The deterministic model was adapted to implement multiple year simulations, where each year consisted of 140 days (start day represents Julian day 130 up to day 270) and therefore excluded the hibernation period. Adults leave the hibernation habitats in the first weeks of the simulation (beginning of May) and enter the fields and other habitats. At the end of the simulation adults move again into hibernation habitats. The original model was also adapted to address pesticide exposure to ladybirds. The mortality was calculated using a dose-response relationship obtained from a glass-plate experiment in which ladybirds were exposed to fresh dry residues of the insecticide. The mortality associated with the amount applied in the orchard (480 or 960 g a.s./ha) was calculated for every day assuming a DT50 of 10 days. The mortality was only taken into account when the ladybirds were in the orchard habitat, so the first application on Julian day 106 only had a partial effect on the ladybirds entering the orchard habitat on Julian day 130. The configuration of the landscape for the scenarios was defined as a checkerboard of three types of habitat: orchard (grassland), cereal field and forest edge. Forest edge provides hibernation habitat. Ladybirds feed on aphids in all habitats but can reach much higher densities in cereal fields, compared to forest edge and, particularly, orchards. Each field (orchard or cereals) measured 600 by 600 m (36 ha) and between two fields there was a 40-m wide strip of hibernation habitat. The endpoints considered were average density (individuals m⁻²) in orchard habitats and in cereal fields.
Model simulations were run with an increasing series of EMF values between 0.001 and 1000. The control scenario (i.e. no exposure) had an EMF of zero. Control simulations were run over a 10-year time period. The model is deterministic, so no normal operating range could be defined.
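The "partial effect" of the first application can be made concrete by computing how much residue remains when ladybirds enter the orchard. The sketch below is a simplified illustration, not the adapted landscape model: it only tracks the residue with a DT50 of 10 days and does not apply the glass-plate dose-response, whose parameters are not given here. The day of the second application (Julian day 151) is an assumption borrowed from the honeybee scenario.

```python
DT50_DAYS = 10.0           # residue half-life stated in the text
SEASON_START_DAY = 130     # first simulated Julian day


def remaining_fraction(days_since_application, dt50=DT50_DAYS):
    """Fraction of the applied amount left after first order decline."""
    return 0.5 ** (days_since_application / dt50)


def residue_on_day(julian_day, applications, emf=1.0, dt50=DT50_DAYS):
    """Summed residue (g a.s./ha) from all applications made on or before a day."""
    total = 0.0
    for application_day, rate in applications:
        if julian_day >= application_day:
            total += rate * remaining_fraction(julian_day - application_day, dt50)
    return emf * total


if __name__ == "__main__":
    # (Julian day, rate in g a.s./ha); day 151 is an assumed value.
    applications = [(106, 480.0), (151, 960.0)]
    entry = residue_on_day(SEASON_START_DAY, applications)
    print("Residue when ladybirds enter the orchard:", round(entry, 1), "g a.s./ha")
```

After 24 days, or 2.4 half-lives, roughly 19% of the first application remains, so ladybirds entering on day 130 experience only a fraction of the initial residue.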

Pollination: honeybee model
The impact of spray application of the study compound in an apple orchard on colony performance of managed honeybees in a heterogeneous landscape was evaluated using the BEEHAVE model (Becher et al., 2014) (Fig. 3). Three beehives were placed in the landscape: one within the apple orchard (hive 3), one at the edge of the apple orchard (hive 1) and one outside of the apple orchard (hive 2). The simulated orchard was sprayed twice per year (before and after apple blossom) and simulations were run for ten consecutive years with a starting population of 2,000 bees per colony. Applications were provided as compound concentrations in nectar (ng/L) and pollen (ng/kg) at the day of application (Julian day 105 and 151). A factor of 147 between nectar and pollen concentration at the day of application was used, based on measured pollen and nectar concentrations one day after spray application. Model simulations were run for a hive concentration in food. The food requirement for worker bees and larvae was used to calculate the daily compound intake, and this was subsequently related to an LC50/LD50 for adult and larval bees (with the larvae being 4 times as sensitive as adults) to calculate in-hive bee and brood mortality. The concentration in nectar and pollen outside the hive and within the hive is subject to exponential degradation using a DT50 of 1 d. The number of simulations per colony and concentration was 100. The modelling endpoints of interest were the total foraging activity on the apple trees over ten years (= pollination flights) and colony survival over ten years. Usual beekeeper activity was allowed in the BEEHAVE model to represent managed honeybee colonies that are placed in the landscape to i) provide apple pollination and ii) produce honey.
Beekeeping activities included were: feeding of honeybees (pollen in spring and nectar in autumn), honey harvesting (occurring after apple blossom (Julian day 151) over a period of three weeks when more than 20 kg of honey are in the hive) and queen replacement if necessary.

Recreation (aesthetic value): butterfly model
The focal species was the meadow brown butterfly (Maniola jurtina) and the model used was based on an unpublished IBM developed for risk assessment of non-target arthropods at the edge-of-field scale (Baveco, personal communication). The model incorporates temperature-dependent development through egg, larval and pupal stages and density-dependence in the number of eggs deposited per m². The original model considered either a single crop field on which a pesticide was applied, or an off-crop strip exposed through spray drift. The typical size of an evaluation of the model was 30 m by 30 m for the field. For the current study the landscape was adapted in two aspects: the dimension of the field was increased as much as feasible (computationally) and an off-crop edge was added consisting of semi-natural habitat. The field is assumed to be an orchard where only the grassy understorey is relevant for the meadow brown butterfly, and the off-crop habitat is assumed to be grassland as well. Thus, with respect to the types of habitat it is composed of, the landscape can be treated as homogeneous. The model was also modified to provide multi-year simulations. Only the larval stage was assumed to be exposed to the pesticide via

each value, 10 replicate runs were performed. The model was run for 20 years: first 5 years without exposure, then 10 years with exposure, ending with 5 years without exposure. A drift reduction factor of 1 was assumed, implying that outside the orchard, no exposure occurs.
The endpoint was the overall density of butterflies in the landscape, i.e. the total number of adults present divided by the total landscape area in square metres.

Results and discussion
Relationships between maximum effect on population abundance and EMF, derived from population models described in Section 2.5., were used to assess risk to focal service providers (Fig. 1). Maximum effect was based on 10 years of exposure with two applications per year. An EMF equal to 1 represents an application of 480 g a.i./ha followed by an application of 960 g a.i./ha. The assessment of risk to ES described in Step 2 (EFSA approach, see Section 2.3.) was based on standard toxicity data and a single application of 480 g a.i./ha. To increase comparability between approaches, the following evaluation of model outputs used an EMF of 0.5 (i.e. applications of 240 g a.i./ha and 480 g a.i./ha). Consequences of reductions in population abundance for ES delivery were evaluated by: (i) comparison with specific protection goals and (ii) use of evidence-based logic chains. For all ES considered in this study, the entity specified in the specific protection goal was population/colony or higher levels of biological organisation (Table SI1). Ideally, the extrapolation from pesticide-induced effects on the key service providers to changes in ES delivery would be based on quantitative ecological production functions. However, there are almost no quantified logic chains available in the literature to extrapolate consequences for delivery of any ES from toxicity data at individual or population level (obtained from standardised tests and modelling), except in the case of strawberry yields and pollination by honey bees (Kleczkowski et al., 2017).

Soil quality regulation includes decomposition of organic and inorganic material and its incorporation into soil as well as the transformation of potentially harmful organic and inorganic substances. Organisms that play an important role in soil quality regulation include soil fauna (earthworms, ants, springtails), microorganisms, primary producers, and other detritivores. Decomposition and pedogenesis are major ecosystem processes that affect biogeochemical cycling, soil fertility, gas fluxes and primary production. An application of the EFSA approach using standard toxicity test data indicated that there is an unacceptable risk to soil quality regulation (nutrient cycling and soil formation/retention) via pesticide-induced effects on in-soil organisms (Table SI2 and Section 2.3.). The focal service providers explored in more detail were collembolans (springtails) and earthworms.
The relative reduction of population abundance was then calculated per EMF using the average population abundance for the specific EMF per day. Folsomia were exposed in large parts of the simulated area. Since springtails are mobile, the buffer zone without exposure (outer tramline) did not lead to a stable population for higher concentrations. There were no effects outside the normal operating range up to EMF 0.15 and EMF 0.5 was the highest EMF tested that allowed a sustainable population. For EMF > 0.75 full recovery was not observed and for EMF 6 the population went extinct after the first treatment (Fig. SI1). The specific protection goal for in-soil arthropods in field is a small effect on population or functional group abundance for months or a medium effect (< 65%) for weeks.
The maximum effect at EMF = 0.5 over 10 years was a 96% reduction in relative population abundance. Populations were able to persist in orchards at EMF = 0.5, but at a much lower relative abundance (i.e. 17 to 96% lower within a single year, Fig. SI1). The risk to the in-field specific protection goals for nutrient cycling and soil formation and retention, via effects on arthropods as service providers, is therefore unacceptable.
No adverse effects were detected on earthworms at EMF 1, and at EMF 10 the maximum level of effects had already been reached (Fig. SI2). This plateau is most likely caused by the mobility of the earthworms: exposure in the topsoil can only affect the fraction of the population that is present in the topsoil.
Under the current scenario, even high values of EMF will not drive the overall population to extinction. The specific protection goal for earthworms in field is a small effect (< 35%) on population or functional group abundance for months. The maximum effect at EMF = 0.5 over 10 years was a 12% reduction in relative population abundance, which is within the normal operating range (Fig. SI2). The risk to the in-field specific protection goals for nutrient cycling and soil formation and retention, via effects on earthworms as service providers, is therefore acceptable according to EFSA Scientific Committee (2016).
Evidence-based logic chains linking reductions in the abundance of earthworms and collembolans to reductions in soil fertility and ultimately to crop production and other ES, have been developed following the approach of Hayes et al. (2018) and making reference to the EFSA PPR Panel (2017) Scientific Opinion addressing the state of the science on risk assessment of plant protection products for in-soil organisms. A simplified logic chain indicating the role of population models is illustrated in Fig. 1. Soil quality is important as it influences crop yield. Johnston et al. (2015) provide a quantitative relationship between earthworm biomass and crop yield based on an analysis of published studies on arable crops. The resulting relationship between crop yield and earthworm biomass (Fig. 5 of Johnston et al., 2015) indicates that at a high earthworm biomass (e.g. 200 g/m²), considerable reduction in biomass may occur before crop yield is affected; however, at low to intermediate biomass (i.e. < 100 g/m²) there is a linear positive relationship between earthworm biomass and crop yield.
Comparing the results of the population modelling to the EFSA specific protection goals indicates that there is no unacceptable risk to the regulation of soil quality by earthworms, but there may be an unacceptable risk to ES delivery via effects on springtails (see Section 2.3.). Although the contribution of collembolans to soil quality processes is important, it is quantitatively not as important as that of earthworms (Filser et al., 2016). The main effect of collembolans on soil quality is the enhancement of decomposition and nutrient cycling by microorganisms, in particular fungi. Collembolan species may increase fungal biomass by over 50% (Filser 2002). However, the case study did not include vertical heterogeneity of exposure and can therefore be considered a conservative approach. Nonetheless, the question of what effect a fluctuating 18 to 96% reduction in the relative abundance of collembolans will have on soil quality needs to be answered.

Natural pest control
Natural pest and disease control is a regulating ES with beneficial arthropods (natural enemies such as ladybirds, ground beetles, true bugs, lacewings, spiders, parasitic wasps), vertebrate predators and fungal species as service providers (EFSA Scientific Committee, 2016). Since beneficial arthropods are closely related to insect pest species, insecticides used to control pests may also impair biological control by beneficial arthropods, although recovery might be quick in some cases (Markó et al., 2017). Pests in apple orchards include blossom weevil, caterpillars, aphids and spider mites. Passerine birds play an important role in caterpillar control in apple orchards (Mols and Visser, 2007; Peisley et al., 2016). Passerines also feed on blossom weevils, but parasitic wasps are the more important predators here. Ladybirds are important predators of aphids, bugs, moth eggs and some mites. Predatory mites are important predators of red spider mites.
An application of the EFSA approach using standard toxicity test data indicated that there is an unacceptable risk to natural pest control via pesticide-induced effects on non-target arthropods (NTA) (Table SI2, see also Section 2.3.). The specific protection goal for NTA natural enemies in-field is a medium effect (< 65%) on functional group abundance for a few weeks at most. The maximum effect at EMF = 0.5 was a 41% reduction in relative abundance of ladybirds inside the orchard (Fig. SI3). Ladybird populations were able to persist through 10 years of exposure at EMF = 0.5 and steadily increased in abundance after the pesticide applications stopped, but there was no evidence of recovery within the year of application (Fig. SI3). Therefore, although the magnitude of effect is in line with the specific protection goal, the duration of effect is not. Consequently, the risk to the ES of pest control via effects on NTA would be unacceptable. The entity to be specified in the specific protection goal is the functional group, and many different species of NTA are potential natural enemies. For example, a study of the predators of the rosy apple aphid in orchards identified 54 different species, of which the most abundant were ladybirds (Coccinellidae), hoverflies (Syrphidae) and earwigs (Forficulidae) (Dib et al., 2010). A species sensitivity distribution of natural enemies exposed to the study compound indicates that ladybirds are not a particularly sensitive group of natural enemies (Fig. 2).
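The comparison against a specific protection goal involves two separate criteria, magnitude and duration, and both must be met. A minimal sketch of that logic is given below. The class and function names are hypothetical, and the encoding of "a few weeks" as 28 days and of "no recovery within the year of application" as 365 days are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionGoal:
    """Specific protection goal: limits on effect magnitude and duration."""
    max_effect_pct: float
    max_duration_days: float


def assess(effect_pct, duration_days, goal):
    """Return (magnitude_ok, duration_ok, acceptable) for one modelled effect."""
    magnitude_ok = effect_pct < goal.max_effect_pct
    duration_ok = duration_days <= goal.max_duration_days
    return magnitude_ok, duration_ok, magnitude_ok and duration_ok


# Medium effect (< 65%) for a few weeks; 28 days is an assumed encoding.
nta_goal = ProtectionGoal(max_effect_pct=65.0, max_duration_days=28.0)

# Ladybirds at EMF = 0.5: 41% reduction with no recovery within the year
# of application, represented here as 365 days (assumption).
print(assess(41.0, 365.0, nta_goal))  # (True, False, False)
```

The ladybird case passes the magnitude criterion but fails the duration criterion, which is why the overall risk is classed as unacceptable.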
Evidence-based logic chains linking reductions in the population abundance of predators to reductions in pest populations and ultimately crop production, as well as other ES, have been developed and a simplified logic chain showing the role of population models is illustrated in Fig. 1. Exposure over 10 years to an EMF of 0.5 or greater leads to a steady decline of the ladybird population across the whole landscape (Fig. SI3). The overall population has a negative yearly growth rate, leading to an exponential decline over the years. Interestingly, although there is evidence that natural enemies, including ladybirds, can suppress pest populations (e.g. Votava and Bosland, 1996), this is generally not enough, at least in South-East France, to prevent damage to apple trees (Dib et al., 2010).

Moreover, experimental exclusion of NTA predators from orchards did not cause a significant difference in aphid abundance between predator exclusion and control treatments (Fréchette et al., 2008). This would suggest that the consequences of the reduction in NTA natural enemy populations for the ES of pest control are limited for orchards, although this is certainly not always the case (see Table SI 1 of Faber et al. (2021) for examples).

Pollination
Pollination provides a critical ES for the cultivation of crops. Pollination contributes to one third of food production and is essential for the production of three-quarters of our crops (IPBES, 2016). Although many apple varieties are self-fertile, other varieties require pollination, and cross-pollination increases the productivity of self-fertile varieties (Ramirez and Davenport, 2013). Pollination can be delivered by wild pollinators (e.g. wild bees) and managed pollinators (i.e. honeybees), and wild pollinators generally increase yield biomass, quality and value of fruits and seeds, even compared with honeybees.
Dose-response curves for honeybee colony survival and apple pollination over 10 years could be generated for hives located within or at the edge of the apple orchard (Fig. 3). Bees from hives located outside of the orchard foraged on dandelion in the direct proximity of the hive rather than on dandelion within the orchard.
Control colony survival over ten years was simulated to be 98%. The application rates used covered effect intensities from 0 to 98% for colony survival, but only up to an 87% reduction in apple pollination. This discrepancy in effect intensity is not driven by the upper limit of the application rate, but is an artefact of foraging flights into apple trees prior to compound effects (preliminary first year). The EFSA approach using standard toxicity test data indicated that there is an unacceptable risk to pollination via pesticide-induced effects on bees (EFSA Scientific Committee 2016; Table SI2). The specific protection goals for bees are rather complex: the entities are defined as foragers/colonies, or as populations for solitary bees. The attributes to be considered are forager behaviour, colony survival and reproduction, or population abundance for solitary bees (Table SI1). The acceptable level of effect is negligible (i.e. < 7% reduction in colony size after two brood cycles) to small (i.e. < 15% reduction in colony size after two brood cycles) for days in crop. The maximum effect on colony survival after ten years at EMF = 0.5 was < 7% for hives in the orchard (Fig. 4), indicating that the risk to the ES of pollination by honeybees was acceptable. A simplified logic chain showing the role of the BEESCOUT and BEEHAVE population models is partly illustrated in Fig. 1. The effects of pesticide application on apple pollination were simulated using adaptations of the BEEHAVE model (Fig. 4). The effective concentration for the pollination effect was at least 4 times higher than for colony survival, and the relationship between modelled change in colony survival and change in pollination (log-logistic regression, r² = 0.99) is illustrated in Fig. 5.
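To illustrate what a difference in effective concentration means for the two endpoints, the sketch below evaluates a generic log-logistic curve for each. The function name, the half-effect EMF values (2.0 and 8.0, chosen only so that they differ by a factor of 4) and the slope are hypothetical. Only the maximum effect levels (98% and 87%) are taken from the text, and the curve is not the fitted regression of Fig. 5.

```python
def log_logistic_effect(x, x50, slope, max_effect=1.0):
    """Log-logistic curve scaled to a maximum effect.

    x: exposure multiplication factor (EMF); x50: EMF giving half of
    max_effect; slope: steepness of the curve. Returns 0 for x <= 0.
    """
    if x <= 0:
        return 0.0
    return max_effect / (1.0 + (x50 / x) ** slope)


# Hypothetical parameters: pollination x50 is 4 times the survival x50.
SURVIVAL = {"x50": 2.0, "slope": 2.0, "max_effect": 0.98}
POLLINATION = {"x50": 8.0, "slope": 2.0, "max_effect": 0.87}

for emf in (0.5, 2.0, 8.0, 32.0):
    survival_effect = log_logistic_effect(emf, **SURVIVAL)
    pollination_effect = log_logistic_effect(emf, **POLLINATION)
    print(f"EMF {emf:>5}: survival {survival_effect:.2f}, "
          f"pollination {pollination_effect:.2f}")
```

With these assumed parameters, the effect on pollination stays below the effect on colony survival at every EMF, which is the qualitative pattern described above.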
There was no evidence from the comparison of effect levels with EFSA specific protection goals or from the prediction of pollination effects that the ES of pollination by managed honey bees was at risk.

Recreation
Recreation is a bundle of cultural ES and distinction is made in the CICES v5 classification between observing nature and engaging with nature (e.g. boating, swimming, hiking etc). The recreational value of observing nature is in part related to the aesthetic value of species i.e. observation of attractive and iconic species. Insecticides may have a direct impact on iconic invertebrate species (e.g. butterflies and bees) in the orchard, which may affect the experience of people using the orchard for recreation (Haines-Young and Potschin, 2018). Observing species with a high aesthetic value (e.g. butterflies) is the aspect of recreation used in this proof of concept study.
The risk to aesthetic value and other cultural services provided by non-target arthropods is unacceptable according to the EFSA approach using standard test species (EFSA Scientific Committee 2016; Table SI2). Single species toxicity test data for Lepidoptera also indicate that the risk to this aspect of recreation is unacceptable (Table SI2; see also Section 2.3.). The specific protection goal for aesthetic value suggested by EFSA is a small (< 35%) effect on the abundance of metapopulations in-field for months. The output from the meadow brown butterfly model indicates that the species is very sensitive to the study chemical, with a > 80% reduction in relative population abundance at EMF = 0.001 and above, although populations do recover once pesticide applications stop (Fig. SI4).
Therefore, based on the specific protection goal, the risk to the ES of recreation (aesthetic value) via observing butterflies is unacceptable. Pesticide applications occur when most meadow brown individuals are in their immobile (non-flying) larval stage, and thus vulnerable. Depending on temperature, however, a small fraction may already have pupated and, according to the model assumptions, escaped exposure. With high application rates the larval population in the orchard itself is wiped out. All larvae survive in the off-crop habitat, where exposure is zero. This prevents extinction of the population in the landscape and sets a limit to the maximum reduction in abundance (Fig. SI4).
A simplified logic chain linking reductions in the population abundance of butterflies to reductions in aesthetics is illustrated in Fig. 1. The butterfly model does not simulate movement or activity in general. The probability of observing a meadow brown individual can, however, be assumed to be proportional to the density of adults, and a decrease in service provision from pesticide exposure can be assumed to result proportionally from a decrease in adult density. Consequently, the risk to the cultural service of recreation via nature observation (butterfly watching) is unacceptable.
This is a very conservative risk assessment as the model assumes that all caterpillars receive the full application rate and there is limited spatial variation in the landscape and hence in exposure. In a landscape with more mixed and connected habitats, the buffering by off-crop habitats would be higher (as would be the recreational value).
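The two assumptions used in this section, zero exposure off-crop and an observation probability proportional to adult density, can be written out as a minimal sketch. The function names and all input values are hypothetical. The first function is a single-season simplification and not the meadow brown model itself.

```python
def landscape_reduction(in_field_fraction, in_field_mortality):
    """Relative reduction in landscape-wide larval abundance.

    Assumes exposure-related mortality off-crop is zero, so only the
    in-field share of larvae can be lost.
    """
    for value in (in_field_fraction, in_field_mortality):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return in_field_fraction * in_field_mortality


def relative_service_loss(adult_density_control, adult_density_treated):
    """Loss in observation probability, assumed proportional to adult density."""
    if adult_density_control <= 0:
        raise ValueError("control density must be positive")
    return 1.0 - adult_density_treated / adult_density_control


# Hypothetical example: 85% of larvae in-field, all of them killed.
print(landscape_reduction(0.85, 1.0))
# Hypothetical adult densities, control versus treated.
print(relative_service_loss(100.0, 20.0))
```

Under these assumptions the maximum landscape-wide reduction equals the in-field share of the population, so a larger off-crop share directly lowers the upper bound on the loss of the service.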

Recommendations and outlook
The ecosystem services approach enables the potential implications of pesticide effects on laboratory test species to be extrapolated to, and valued in terms of, the things that matter to people (i.e. benefits from nature). It also provides a mechanism for risk managers to weigh the potential impact of plant protection products on ES prioritized in agricultural landscapes against the benefits of plant protection product use for crop production.
The use of population models provides a means of extrapolating the individual-level effects measured in laboratory toxicity tests to potential effects on populations in the field. Population models may also be useful for exploring possible risk mitigation measures. However, the usefulness of population models will depend on the environmental scenario. If the environmental scenario is broadly similar to the toxicity test systems, the models will be of limited value because they will not add detail or information on the spatio-temporal scale of the risk assessment.
In the case of the models used in this proof of concept study, a primary consideration is landscape spatial configuration and heterogeneity and its consequences for the distribution and exposure of organisms. This proof of concept study was constrained to using existing models with minor modifications. The environmental scenario was therefore relatively simple and in some cases (i.e. butterfly model) there was no spatial variation in habitat type and limited spatial variation in exposure and, consequently, in effects. Some of the maximum effects shown in the EMF-effect relationships (Fig. 5) already start in the small effect area (springtail), which is a consequence of the background variability in the models. This could be tackled by taking the uncertainty and variability of the model into account, but it raises the question of how to define significant deviations. This question is of course not related to the ecosystem services concept, but to the background variability underlying the models, which may also reflect (real) environmental variability.
A major remaining challenge is the quantitative translation of changes in population abundance to changes in the delivery of ES (i.e. ecological production functions) (Faber et al., In press). Evidence-based logic chains can provide insight into qualitative relationships and some models may be able to provide a direct measure of the effect of chemical exposure on an ES (e.g. the BEEHAVE and the BEESCOUT (Becher et al., 2016) models we used for our bee modelling). However, in all other cases full range quantitative ecological production functions covering toxicity data for standard test species up to service provision by SPU are not available (Faber et al., In press). The natural world is also very complex, with multiple species potentially contributing to multiple ES. Nevertheless, a way forward may be to agree on a set of standard environmental scenarios and, for those 'typical' conditions, to use modelling approaches to link population dynamics through ecological functioning to service provision.