Combining agro-ecological functions in grass-clover mixtures

Grass-clover mixtures show many benefits for sustainable agriculture. In the Netherlands, organic farmers often work together in a so-called partner farm concept, with the aim to close nutrient cycles on a regional level. In this system, arable farms grow one-year grass-clover leys, as fodder for a livestock farm, in exchange for, e.g., manure. This practice could also be used in the transition of conventional farms towards a more circular, regenerative and nature inclusive agriculture. In the current experiment we assessed the effect of a range of grass (Lolium perenne: Lp, Lolium multiflorum: Lm) and clover (Trifolium pratense: Tp and Trifolium repens: Tr) monocultures and mixtures on both below- and aboveground parameters in light of benefits for livestock and arable farms, and biodiversity. The grass monocultures showed good weed suppression and high root density, and Lp in particular had a positive effect on soil structure. Clover, on the other hand, showed high herbage dry matter yield (particularly Tp) and nitrogen (N) yield, and Tr showed high digestibility. Moreover, clover had a positive effect on soil mineral N, and earthworm abundance tended to be higher in the clover monocultures. When (some of) the four species were combined in grass-clover mixtures, they combined the positive effects of the species and often even outperformed the (best) monocultures. We concluded that grass-clover mixtures increased agro-ecological functions.


Introduction
There is increasing pressure on the agricultural sector to increase sustainability and reduce its negative impact in terms of soil degradation, water quality, greenhouse gas emissions and decrease in biodiversity of flora and fauna [1,2]. Regenerative and nature inclusive farming has been suggested as an alternative to current farming systems [3]. This concept relies on natural processes to increase the biodiversity and the resilience of farming systems, while decreasing the need for inputs like pesticides and fertilizers [4]. One pillar of nature inclusive farming is the concept of functional agrobiodiversity [3]. For example, the inclusion of clover in grass mixtures offers many potential benefits [5]. Clover grown in grass-clover mixtures can fix up to 380 kg N ha⁻¹ year⁻¹ through symbiotic N₂ fixation [5], reducing the need for external fertiliser inputs. Also, white clover has been shown to decrease soil penetration resistance due to an increased presence of soil biota, while grasses increase the proportion of crumbly soil [6]. Moreover, clover and grass differ in their temporal development, which results in a better suppression of weeds [7]. Finally, grass-clover mixtures can benefit the biodiversity on farms. For example, a higher abundance of earthworms has been found in white clover swards [6], as earthworms are attracted to the increased quality of food [8]. Also, grass-clover mixtures can be beneficial for the populations of pollinating insects and invertebrates in general, which are in decline due to loss of habitat and food sources [9,10].
Part of nature inclusive farming is also the exchange between arable and livestock farms with the aim to close nutrient cycles on a regional level [3]. This transition to circular agriculture is now also official policy of the Dutch Ministry of Agriculture [Ministerie van Landbouw, Natuur en Voedselkwaliteit, 11]. In the Netherlands, organic arable and livestock farmers often work together in a so-called partner farm concept [12]. In this system, arable farms grow one-year grass-clover leys, as fodder for a livestock farm, in exchange for, e.g., manure. This has potential benefits for both farms: the arable farm widens its rotation, increases its soil fertility and receives manure. The livestock farm, on the other hand, receives fodder with a high protein content and digestibility [12]. This practice could also be used in the transition of conventional farms towards a more circular and nature inclusive agriculture.
Many studies on the benefits of grass-clover mixtures either focus on aboveground effects or belowground effects of grass-clover mixtures [6,7,13,14]. However, these separate studies do not allow for seeing the whole picture of benefits and constraints of these mixtures in an agricultural context. Therefore, the aim of this research was to investigate the effect of different grass-clover mixtures and monocultures in a one-year ley on both aboveground and belowground parameters in light of the potential benefits of the ley for livestock and arable farms, and biodiversity. To this end, we conducted a one-year trial in which we investigated the effect of fallow, grass (perennial ryegrass and Italian ryegrass) and clover (white and red clover) monocultures and different grass-clover mixtures on herbage yield and quality, weed pressure, roots, soil structure, soil mineral N and biodiversity.
We hypothesize that grass monocultures have a high weed suppression, high root density and high proportion of crumbly soil. Monocultures of clover are expected to have high N yield, N concentration and digestibility of herbage, increased soil mineral N and higher earthworm abundance.
Mixtures of these species will probably combine the potential benefits of the different monocultures for both livestock and arable farms, and biodiversity.

Site
The experiment was established in late summer of 2015 on an organic experimental farm of Wageningen Research near Lelystad, the Netherlands, at 52°54'N and 5°58'E on a clay soil (pH-KCl 7.5, 1.49 g total N kg⁻¹, 3.7 mg plant available P kg⁻¹, 68 mg total K kg⁻¹). The experiment was sown after the preceding potato crop was harvested on 10 September 2015. The seedbed was prepared with a cultivator trill, after which the seeds were sown on 11 September 2015 on plots of 3.15 × 12 m using an Oyord sowing machine.

Experimental design
The experiment consisted of 10 sward types laid out in a randomised block design with three replicates. Perennial ryegrass (Lolium perenne, cultivar Calibra), Italian ryegrass (Lolium multiflorum, cultivar Gaza), red clover (Trifolium pratense, cultivar Raunis) and white clover (Trifolium repens, cultivar Lena) were all sown in monoculture and in a number of two-species, three-species and four-species mixtures. In order to limit the number of treatments we selected a limited number of mixtures based on their potential relevance for practice (Table 1). For example, we did not include any two- and three-species mixtures including the combination of Lm and Tr, because Lm is known to suppress clover when it is present in substantial proportions [15]. Additionally, a fallow treatment was included as a control to measure the differences with the grasses and clovers. In the fallow, the weeds were controlled by cultivating the plots with a cultivator trill approximately once a month. The plots did not receive any fertilization during the experimental period, which is normal practice for grass-clover management in an organic arable crop rotation.

Aboveground measurements
In 2016, herbage dry matter yield was determined four times (May, June, August and October) by harvesting a 1.5 × 6 m strip from each plot with a Haldrup harvester. Fresh weight was recorded and a subsample was dried at 70 ℃ for 48 h to determine dry weight. The dried samples were ground at 1 mm and analysed with a near-infrared spectrometer (NIRSystems 6500, Foss, USA) for total N content and digestible organic matter content. The proportions of weeds and clover were determined on seven occasions from September 2015 to October 2016 as visual estimates by the same observer.
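The conversion from the fresh weight of a harvested strip to dry matter yield per hectare is simple arithmetic and could be sketched as follows. This is a minimal illustration, not the authors' procedure: the function and argument names are assumptions, the example weights are invented, and only the 1.5 × 6 m strip size comes from the text.

```python
def dry_matter_yield_t_per_ha(fresh_weight_kg, sub_fresh_g, sub_dry_g,
                              strip_width_m=1.5, strip_length_m=6.0):
    """Convert the fresh weight of one harvested strip to dry matter yield (t/ha)."""
    dm_fraction = sub_dry_g / sub_fresh_g      # dry matter content of the dried subsample
    dm_kg = fresh_weight_kg * dm_fraction      # dry matter harvested from the whole strip
    area_m2 = strip_width_m * strip_length_m   # 9 m2 for a 1.5 x 6 m strip
    return dm_kg / area_m2 * 10.0              # kg/m2 -> t/ha (10 000 m2 per ha, 1000 kg per t)


# Hypothetical numbers: 18 kg fresh herbage; a 500 g subsample dries to 100 g.
example_dmy = dry_matter_yield_t_per_ha(18.0, 500.0, 100.0)
print(round(example_dmy, 2))
```

Summing the four cuts per plot would then give the cumulative yield in t ha −1 year −1.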

Soil sampling and measurements
On 10 November 2016, soil penetration resistance was measured at 0-80 cm depth with a penetrometer (Eijkelkamp, Giesbeek, the Netherlands) with a cone base area of 1 cm² and an apex angle of 60°, with 10 measurements per plot. Visual assessments of soil structure and rooting were conducted by an expert, on 20 × 20 × 20/25 cm soil cubes from the 0-25 cm and 25-45 cm soil layer. Cubes were dug out with a spade and broken in both horizontal and vertical direction. Soil structure was assessed by estimating the proportion (%) of soil crumbs, sub-angular blocky elements and angular blocky elements in the cubes, following the method by Peerlkamp [16] and Shepherd [17]. Rooting was assessed by scoring visible root density (score 1-10; 1 for no roots and 10 for above average root density) [18]. The activity of soil biota was assessed by scoring the abundance of soil pores (score 1-10, with 1 meaning no visible pores and 10 for above average pore density) [18].
On 23 May 2017, 20 soil samples per plot were taken from 0-30 cm depth (diameter 1 cm). The mineral N content in the soil was measured by dissolving it in 0.01 M CaCl₂ (1:2 v/v) and analysing it with a DA spectrophotometer [19].

Earthworms
Earthworms were sampled on 12 December 2016 by digging out a soil block of 20 × 20 × 20 cm from each plot. Earthworms were hand-sorted, counted, weighed and fixed in 70% ethanol prior to identification. Numbers and biomass were expressed per m². If possible, the worms (adult and juvenile) were identified to species and classified into functional groups (epigeic, endogeic and anecic species) [20].
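Because each soil block covers 20 × 20 cm (0.04 m 2), every earthworm found in a block corresponds to 25 individuals per m 2. The short sketch below makes that scaling explicit. The function name is an assumption introduced for illustration and the counts are invented.

```python
def per_square_metre(value_per_block, block_side_m=0.20):
    """Scale a count or biomass measured in one square soil block to a per-m2 basis."""
    block_area_m2 = block_side_m ** 2      # 0.04 m2 for a 20 x 20 cm block
    return value_per_block / block_area_m2


# Hypothetical block: 10 earthworms weighing 3.2 g in total.
print(round(per_square_metre(10)))       # individuals per m2
print(round(per_square_metre(3.2), 1))   # g per m2
```

The large multiplier means that a difference of one or two worms per block shifts the per-m 2 figure considerably, which is worth keeping in mind when reading the abundance results.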

Statistical analysis and calculations
The data were first tested for normality with the Shapiro-Wilk test and for homogeneity of variances with Levene's test [21]. Data were log or logit transformed if they were non-normally distributed. Normally distributed data were analysed by analysis of variance (ANOVA) with sward type as main factor. For the penetration resistance data, plot was included as a nested factor, and the averages per 10 cm depth interval were analysed for differences between sward types. Tukey's Honest Significant Difference test was used as a post-hoc test. If data remained non-normally distributed after transformation, a Kruskal-Wallis test followed by Fisher's Least Significant Difference test [22] was done on the original data. This was the case for accumulated herbage dry weight, ground cover, proportion of weeds and weighted proportion of weeds, soil crumbliness, root score, soil biota score, herbage nitrogen content in June, August and October, digestible organic matter in June and digestible organic matter per ton dry matter yield.
The predicted mixture performance was calculated by taking the sum of the monoculture performance of the component species multiplied by the sown proportion of the respective species (Table 1). Overyielding was tested by comparing the predicted and actual mixture performance with Student's t-test. Transgressive overyielding was accepted if the ANOVA or Kruskal-Wallis test indicated that the actual mixture performance was significantly better than the best performing monoculture (see also [23]).
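The prediction and the two overyielding notions described above can be illustrated with a small sketch. It is an assumption-laden illustration rather than the authors' R code: the function names are invented, the 50:50 sown proportions are hypothetical (the actual proportions are in Table 1), and the code only compares means, whereas the study required a significant t-test, ANOVA or Kruskal-Wallis result. The monoculture values 6.6 and 12.5 t ha −1 year −1 and the mean mixture yield of 13.8 t ha −1 year −1 are taken from the results reported in this paper.

```python
def predicted_performance(sown, monoculture):
    """Weighted average of monoculture values; sown proportions are scaled to sum to 1."""
    total = sum(sown.values())
    return sum(share / total * monoculture[species] for species, share in sown.items())


def overyielding(actual, predicted):
    """Relative deviation of the actual mixture value from the prediction (0.2 = 20%)."""
    return (actual - predicted) / predicted


def exceeds_best_monoculture(actual, monoculture):
    """Mean-based check only; the study additionally required statistical significance."""
    return actual > max(monoculture.values())


monoculture_dmy = {"Lp": 6.6, "Tp": 12.5}   # t/ha/year, reported cumulative DMY
sown_lp_tp = {"Lp": 0.5, "Tp": 0.5}          # hypothetical proportions, see Table 1 for the real ones
predicted = predicted_performance(sown_lp_tp, monoculture_dmy)   # about 9.55 t/ha/year
print(round(predicted, 2), round(overyielding(13.8, predicted), 2))
print(exceeds_best_monoculture(13.8, monoculture_dmy))
```

Scaling the sown proportions to sum to one mirrors the scaling to 100% that the authors describe for mixtures sown at higher total seed densities.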
A correlation analysis was carried out between the sown proportions of the four species and all above- and belowground parameters. Many correlations with the sown proportions were highly significant, but the correlations were non-linear, indicating interaction effects between the different species (as also indicated by the many significant cases of overyielding and transgressive overyielding). Therefore, the data were also analysed using a regression-based approach following [24]. Here, the community level response (e.g., DMY) was modelled as a linear combination of i) identity effects of species as represented by their monoculture performance and ii) species net interactions (being positive, negative, or neutral) that are defined as the difference between the actual mixture performance and that expected from the relative contribution of the constituent monocultures, using the following model:

$$\mathrm{DMY} = \beta_1 P_{Lp} + \beta_2 P_{Lm} + \beta_3 P_{Tr} + \beta_4 P_{Tp} + \delta_1 P_{Lp:Lm} + \delta_2 P_{Lp:Tr} + \delta_3 P_{Lp:Tp} + \delta_4 P_{Lm:Tp} + \delta_5 P_{Tr:Tp} + \varepsilon \qquad (1)$$

where P represents the sown proportions of species (Table 1, Lp: Lolium perenne, Lm: Lolium multiflorum, Tp: Trifolium pratense, Tr: Trifolium repens).
The relative value of the different parameters was calculated for the different sward types. The difference between the minimum and maximum value was divided equally into five bins. Bins were ranked from worst to best and color-coded accordingly.
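To make the structure of model (1) concrete, the sketch below builds one row of predictors per sward type and estimates the identity (β) and interaction (δ) coefficients by ordinary least squares. Several things here are assumptions rather than facts from the paper: each interaction predictor is taken to be the product of the two sown proportions, as is common in diversity-interaction models; the equal sown proportions and the Lp:Tr:Tp composition of the three-species mixture are hypothetical stand-ins for Table 1; and the solver is a plain-Python substitute for the R routines the authors used. It returns point estimates only, without standard errors or p-values.

```python
SPECIES = ["Lp", "Lm", "Tr", "Tp"]
# Interaction pairs in the order of model (1); Lm:Tr does not appear in the model.
PAIRS = [("Lp", "Lm"), ("Lp", "Tr"), ("Lp", "Tp"), ("Lm", "Tp"), ("Tr", "Tp")]


def design_row(props):
    """Predictors for one plot: four sown proportions, then five pairwise products (assumed form)."""
    identity = [props.get(s, 0.0) for s in SPECIES]
    interactions = [props.get(a, 0.0) * props.get(b, 0.0) for a, b in PAIRS]
    return identity + interactions


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(compositions, responses):
    """Least-squares estimates of the beta and delta coefficients (no intercept)."""
    rows = [design_row(p) for p in compositions]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    names = [f"beta_{s}" for s in SPECIES] + [f"delta_{a}:{b}" for a, b in PAIRS]
    return dict(zip(names, solve_linear(xtx, xty)))


third = 1.0 / 3.0
EXAMPLE_COMPOSITIONS = [  # hypothetical equal-proportion swards, not the Table 1 values
    {"Lp": 1.0}, {"Lm": 1.0}, {"Tr": 1.0}, {"Tp": 1.0},
    {"Lp": 0.5, "Tp": 0.5}, {"Lp": 0.5, "Tr": 0.5}, {"Lm": 0.5, "Tp": 0.5},
    {"Lp": third, "Tr": third, "Tp": third},
    {"Lp": 0.25, "Lm": 0.25, "Tr": 0.25, "Tp": 0.25},
]
```

Calling `fit(EXAMPLE_COMPOSITIONS, responses)` with one response per sward type (or with replicated plots appended to both lists) returns a dictionary of coefficients. A positive δ then indicates that the pair performs better together than the weighted monocultures would suggest.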
All statistical analyses were performed with the statistical software R [25].
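The five-bin ranking of sward types described in the statistical methods can be sketched as below. This is one reading of that description, not the authors' R code: the function name, the `higher_is_better` switch (needed for parameters such as weed proportion, where a low value is good) and the handling of values that fall exactly on a bin edge are assumptions, and the example values are invented.

```python
def rank_in_bins(values, n_bins=5, higher_is_better=True):
    """Map each sward type to a bin from 1 (worst) to n_bins (best) using equal-width bins."""
    lowest, highest = min(values.values()), max(values.values())
    if highest == lowest:
        raise ValueError("all values are equal; bins are undefined")
    width = (highest - lowest) / n_bins
    ranks = {}
    for name, value in values.items():
        # 0-based bin index; the maximum is clamped into the top bin
        index = min(int((value - lowest) / width), n_bins - 1)
        ranks[name] = index + 1 if higher_is_better else n_bins - index
    return ranks


# Invented scores for four sward types
scores = {"A": 0.0, "B": 2.0, "C": 5.0, "D": 10.0}
print(rank_in_bins(scores))
print(rank_in_bins(scores, higher_is_better=False))
```

Equal-width bins keep the ranking tied to the size of the differences between sward types, unlike quantile bins, which would force the same number of swards into each class.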

Herbage dry matter yield (DMY)
There was a significant (p < 0.05) effect of sward type on DMY. The cumulative (across four harvests) herbage DMY of the monocultures ranged from 6.6 t ha⁻¹ year⁻¹ for Lp to 12.5 t ha⁻¹ year⁻¹ for Tp (Figure 1A). Herbage DMY for the mixtures was on average 13.8 t ha⁻¹ year⁻¹. In all cases the mixture herbage DMY was significantly (p < 0.05) higher than expected from the weighted average of the monocultures, and this overyielding was on average 33%. With the exception of the Lp:Tr mixture, the herbage DMY of all mixtures was equal to or significantly (p < 0.05) higher than the highest performing monoculture, Tp, indicating transgressive overyielding.
The herbage DMY of the first cut was significantly (p < 0.05) affected by sward type. The DMY of the monocultures ranged from 1.5 t ha⁻¹ for Tp to 7.1 t ha⁻¹ for Lm (Figure 1B, Table S1). Herbage DMY for the mixtures was lowest for Lp:Tr (4.0 t ha⁻¹) and highest for Lm:Tp (7.6 t ha⁻¹). Herbage DMY of all mixtures, except Lp:Tr, was higher than expected from the weighted average of the monocultures; the overyielding was on average 37%. Only the herbage DMY of Lm:Tp and the four-species mixture was equal to or higher than the highest performing monoculture (Lm), but there was no significant (p > 0.05) transgressive overyielding. The regression model showed a significant (p < 0.001) positive interaction effect on first cut herbage DMY between Lp and Tp, and particularly between Lm and Lp, and Lm and Tp (Table S5a).

Herbage N concentration and yield
The sward type had a significant (p < 0.05) effect on the herbage N concentration and the N yield. The N concentration of the monocultures ranged from 12.3 g N kg⁻¹ DM for Lm to 37.7 g N kg⁻¹ DM for Tr (Figure 1C, Table S1). For the mixtures, N concentration was on average 22.0 g N kg⁻¹ DM. The N concentration of Lp:Tp and Lp:Tr was significantly (p < 0.05) higher than expected from the weighted average of the monocultures, on average by 15%. The N concentration of Lm:Tp and the four-species mixture was significantly (p < 0.05) lower than predicted, on average by 16%. The regression model showed significant positive interaction effects between the sown proportions of Lp:Tr and Lp:Tp, whereas all grass-grass and legume-legume interactions were negative (Table S5a).
The N yield of the monocultures was on average 108 kg N ha⁻¹ for the grass species and 411 kg N ha⁻¹ for the leguminous species (Figure 1D). The N yield of the mixtures was lowest for Lm:Tp (238 kg N ha⁻¹) and highest for Lp:Tp (372 kg N ha⁻¹). The N yields of the three- and four-species mixtures were 328 kg N ha⁻¹ and 277 kg N ha⁻¹, respectively. The N yield of Lp:Tp, Lp:Tr and the three-species mixture was significantly (p < 0.05) higher than predicted, with an average overyielding of 26%. The N yield of Lm:Tp and the four-species mixture was not significantly (p > 0.05) higher than predicted. The regression model indicated significant (p < 0.001) positive interaction effects between Lp and the two legumes on N yield (Table S5a).

Digestible organic matter (DOM) and yield
The DOM of dry matter (DOMD, g kg⁻¹ DM) and the DOM yield were significantly (p < 0.05) affected by sward type. The DOMD was on average 700 g kg⁻¹ DM (Figure 1E, Table S1). DOMD of the three- and four-species mixtures was significantly (p < 0.05) lower than expected from the weighted average of the monocultures, with a mean difference of 36%. The regression model showed a tendency (p < 0.1) for a negative Lm:Tp and Tr:Tp interaction effect on DOMD (Table S5a).
The DOM yield of the monocultures ranged from 4.7 t ha⁻¹ for Lp to 8.6 t ha⁻¹ for Tp (Figure 1F). DOM yield of the mixtures was lowest for Lp:Tr (8.7 t ha⁻¹) and highest for Lp:Tp (10.1 t ha⁻¹). DOM yield of all mixtures was higher than predicted, and the average overyielding was 31%. The DOM yield of all mixtures was equal to or higher than the highest producing monoculture, Tp, but only Lp:Tp, the three-species mixture and the four-species mixture were significantly (p < 0.05) higher, indicating transgressive overyielding. The regression model showed a significant (p < 0.001) positive Lp:Tr, Lp:Tp and Lm:Tp interaction effect on DOM yield (Table S5a).
[Figure caption fragment] ... (Table 1). Means with the same letters are not significantly different (p > 0.05) (n = 3).

Proportion of clover and weeds
The proportion of weeds was highest at the end of 2015 (on average 18%) and for the fallow treatment (9%; Figure S2A, Table S2). The grass-clover mixtures generally had lower proportions of weeds than the monocultures, with Lm:Tp performing best (Figure 2A). The mean weighted proportion of weeds in 2016 ranged from 0.2% for Lm:Tp to 8.4% for the Tr monoculture (Table S2). The regression model showed a significant negative interaction effect (p < 0.05) of Lp:Tr and Lm:Tp on the mean weighted proportion of weeds in 2016 (Table S5a). There was a strong negative correlation (r = −0.79; p < 0.001) between the weighted proportion of weeds and spring DMY (Table S4b).
The visually determined proportion of clover changed from 2% for Lm:Tp and 16% for Lp:Tr in May 2016 to 78% for Lp:Tr and 88% for Lp:Tp in September 2016 (Figure S2B, Table S2). Lm:Tp and the four-species mixture had the lowest values during spring and early summer, which was on average 5%. Eventually, all mixtures had a clover proportion larger than 78% in autumn. On average, the weighted clover proportion in the mixtures (based on herbage DMY in the four harvests) ranged from 23% for Lm:Tp to 55% for Lp:Tr (Figure 2B). The regression model showed a significant (p < 0.05) negative interaction effect of Tr:Tp, Lm:Lp and Lm:Tp, whereas there was a strong positive (p < 0.001) interaction effect of Lp:Tr and Lp:Tp on the weighted clover proportion (Table S5a).
[Figure caption fragment] ... (Table 1). Means with the same letters are not significantly different (p > 0.05) (n = 3).
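The text states that the weighted clover proportion is based on herbage DMY in the four harvests but does not spell out the formula. One plausible reading, sketched here as an assumption, is a mean of the per-harvest clover proportions weighted by the DMY of each harvest, so that a clover-rich but low-yielding autumn cut counts for less than a heavy spring cut. The function name and all numbers are invented for illustration.

```python
def dmy_weighted_proportion(proportions, harvest_dmy):
    """Mean proportion across harvests, each harvest weighted by its dry matter yield."""
    if len(proportions) != len(harvest_dmy):
        raise ValueError("need one proportion per harvest")
    total_dmy = sum(harvest_dmy)
    return sum(p * y for p, y in zip(proportions, harvest_dmy)) / total_dmy


# Invented example: clover share rises over the season while yield per cut falls.
clover_share = [0.05, 0.20, 0.60, 0.85]   # May, June, August, October
cut_dmy = [7.0, 3.0, 2.5, 1.5]            # t/ha per cut
print(round(dmy_weighted_proportion(clover_share, cut_dmy), 3))
```

Under this reading, a mixture can end the season with a very high visual clover share and still have a modest weighted share, which matches the pattern reported for Lm:Tp.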

Soil structure
At 0-25 cm, the soil structure was significantly (p < 0.05) affected by sward type. The proportion of crumbs was lowest for the fallow (12%; Figure 3) and ranged from 13% for Lm to 28% for Tp for the monocultures. The crumbliness of the mixtures was on average 36%. Only the three-species mixture resulted in a significantly (p < 0.05) higher proportion of soil crumbs (40% increase) than expected from the weighted average of the monocultures. The regression model showed a significant (p < 0.05) positive interaction effect of Lp:Tr on the soil crumbs (Table S5b).
At 25-45 cm, soil crumbliness was low (<15%, data not shown) and no significant (p > 0.05) differences in soil structure were found between sward types. There was no significant effect of sward type on the proportion of angular and sub-angular blocky elements at either soil depth.
[Figure caption fragment] ... (Table 1). Means with the same letters are not significantly different (p > 0.05) (n = 3).

Penetration resistance
The penetration resistance ranged from 77 N cm⁻² on average in the topsoil (0-10 cm) to 371 N cm⁻² at 70-80 cm depth (data not shown). No significant (p > 0.05) differences between the different sward types were found for the penetration resistance at any of the depth intervals.
The sward type had a significant effect (p < 0.05) on the visually assessed root density (ranging from 1: no visible roots to 10: above average root density). The root density score was lowest at both depths for the fallow (2.7 at 0-25 cm and 1.3 at 25-45 cm; Figure 4A and 4B). At 0-25 cm, the root density score for the monocultures ranged from 4.0 for Tr to 7.0 for Lp. For the mixtures, the root density score was on average 7.3. All mixtures, except Lm:Tp, had a significantly (p < 0.05) higher root density score than expected from the weighted average of the monocultures (on average 19% higher), and were higher than or equal to the best performing monoculture (Lp). The regression model showed a significant (p < 0.05) positive interaction effect of Lp:Tr and Lm:Tp on root density at 0-25 cm (Table S5b). Root density at 0-25 cm showed a positive correlation with soil crumbliness (r = 0.76; p < 0.001) and a negative correlation with soil mineral N (r = −0.46; p < 0.05) (Table S4c).
At 25-45 cm, the root density score for the mixtures was on average 5.6. All mixtures, except the four-species mixture, had a significantly (p < 0.05) higher root density score than expected, on average by 14%. Again, mixtures were not significantly different from Lp, the monoculture with the highest root density score. The regression model showed a significant (p < 0.05) positive interaction effect of Lp:Tr and Lp:Tp on root density at 25-45 cm (Table S5b).

Soil macro-biota
Total earthworm biomass ranged from 42 g m⁻² for fallow to 103 g m⁻² for Lp:Tp, but there were no significant differences between sward types. Similarly, on average the total number of earthworms was lowest on the fallow (269 m⁻²) and highest in the clover monocultures and mixtures (with the exception of Lm:Tp), but differences were not statistically significant (Table S3).
The number of epigeic (top-soil dwelling) earthworms was positively correlated with the proportion of clover (r = 0.53; p < 0.01) and with the soil mineral N content (r = 0.58; p < 0.01) (Table S4b and c). The number of endogeic earthworms was not correlated with sown species proportion (Table S4a), but did show a significant positive correlation with the soil crumbliness (r = 0.39; p < 0.05) and the root density score (r = 0.41; p < 0.05) at both soil depth intervals (Table S4c).
There was a significant effect (p < 0.05) of sward type on the score for soil biota activity (score 1-10, 1 = no visible soil pores and 10 = above average number of soil pores). The score of the monocultures at 0-25 cm was lowest for Lm (4.7) and highest for Tp (7.7; Figure 5C and 5D). The score of the mixtures was on average 7.3. Lp:Tr and the four-species mixture had a significantly (p < 0.05) higher score than expected from the weighted average of the monocultures, which was on average 25% higher than predicted. The regression analysis showed a significant (p < 0.001) positive interaction effect of Lp:Tr (Table S5b). The soil macropore score at 0-25 cm was significantly correlated with the sown clover proportion (Tr + Tp, r = 0.58; p < 0.01 and Tp, r = 0.40; p < 0.05) and negatively correlated with the proportion of angular blocky elements (r = −0.61; p < 0.001) (Table S4a and c).
At 25-45 cm, the score was the lowest for the fallow (3.0), and was on average 6.9 for the mixtures. Only Lm:Tp had a significantly (p < 0.05) higher score than expected (15% higher). The regression analysis showed no significant interaction effects. At this depth there was no significant correlation with the sown proportions, but a significant positive correlation (r = 0.48; p < 0.05) with the proportion of crumbs at 25-45 cm (Table S4a and c).
[Figure caption fragment] ... (Table 1). Means with the same letters are not significantly different (p > 0.05) (n = 3).

Soil mineral N content
There was a significant effect (p < 0.05) of sward type on soil mineral N content. Soil mineral N was on average 72.5 kg N ha⁻¹ for the grass swards and 115.0 kg N ha⁻¹ for the clover swards (Figure 5). For the mixtures, the mineral N was on average 90.1 kg N ha⁻¹, and was not significantly (p > 0.05) different from the predicted values, which was confirmed by the lack of significant interactions in the regression model (Table S5). Soil N content showed a highly significant positive correlation (r = 0.77; p < 0.001) with sown clover content.
[Figure caption fragment] ... (Table 1). Means with the same letters are not significantly different (p > 0.05) (n = 3).

Discussion
This trial was designed to investigate the potential effects of different grass-clover mixtures and monocultures in an arable crop rotation on aboveground and belowground traits. Different traits are important for livestock and arable farms which work together as partners, or for biodiversity, as discussed below.

Benefits for livestock farms
For livestock farms, herbage yield and quality, in terms of crude protein and digestibility, are important.

Herbage DMY of mixtures higher than for monocultures
In the present study, the DM yield of the monocultures ranged from 6.6 t ha⁻¹ year⁻¹ for Lp to 12.5 t ha⁻¹ year⁻¹ for Tp, and all mixtures showed higher yields (maximum DM yield 14.7 t ha⁻¹ year⁻¹) than expected based on monoculture performance (Figure 1A). This is high compared to other experiments with grass-clover mixtures in the Netherlands [15,26]. On average, the mixture yield was 37% higher than the monoculture DMY. The combination of functional plant groups such as grass and clover in mixtures often shows a higher yield than expected based on monoculture performance, so-called overyielding, which has been shown in many studies on both natural and agricultural grasslands [7,27-29].
In a meta-analysis of forty-four biodiversity experiments that manipulated plant species richness, Cardinale, Wright, Cadotte et al. [30] found that species mixtures were more productive than the average monoculture in 79% of the experiments. However, in only 12% of all experiments did the mixtures perform better than the best performing monoculture, so-called transgressive overyielding, and it took on average 5 years to become evident. In an agronomic context, transgressive overyielding is clearly preferred, as the mixture has to compete against the highest performing monoculture [5]. Remarkably, in the current experiment, the average mixture yield was 10% higher than the best performing monoculture, Tp, and all mixtures, with the exception of Lp:Tr, showed transgressive overyielding. Finn, Kirwan, Connolly et al. [7] showed transgressive overyielding in 79% of the mixture plots, with an average yield advantage of 18% over the highest yielding monoculture. Both overyielding and transgressive overyielding are only possible through synergistic interactions such as niche differentiation and facilitation. Symbiotic N fixation in legumes is an important factor explaining the improved performance of grass-clover mixtures. In grasslands, N₂ fixed symbiotically by legumes can range from 100 to 380 kg N ha⁻¹ year⁻¹ [5]. In addition, in mixed grass-legume systems, between 10 and 75 kg of N ha⁻¹ year⁻¹ are transferred from legumes to grass, depending on legume and receiver plant species [31]. The lack of fertiliser N application in the current experiment stimulated the performance of the mixtures and clover monocultures relative to the grass monocultures. However, Nyfeler, Huguenin-Elie, Suter et al. [32] showed robust transgressive overyielding in four-species mixtures containing clover, over a wide range of N fertilisation levels. Likely, such independence of fertilisation level would also hold in our conditions.
In the current trial the seed densities of the mixtures were higher than for the monocultures, but for the calculation of the expected performance, the species proportions were scaled to 100% (see Table 1). Therefore, the transgressive overyielding found in this study might be confounded by the increased seed density in the mixtures. However, earlier research found that seed densities of 70% and 100% did not result in significant differences between sward types in grass-clover monocultures and mixtures [32,33]. Current sowing rates of monocultures were at the agronomic recommended level, and no increase in herbage yield would be expected from increasing those rates.

Clover in mixtures increases N concentration and N yield
Both the N yield and N concentration were highest for the clover monocultures (Figure 1C and 1D), as the fixation of atmospheric N₂ by Rhizobium bacteria increases internal N concentrations [5]. For the N yield, Lp:Tp, Lp:Tr and the three-species mixture were higher than expected from the weighted average of the monocultures, while for the N concentration Lp:Tp and Lp:Tr had a higher value than expected. Higher production than expected based on the monocultures in terms of N yield and concentration in grass-clover mixtures has been attributed to the stimulatory effects of grass on the symbiotic N₂ fixation activity of clover (which is closely related to N demand, which is higher for grass) [29]. Although lack of fertiliser N application in the current experiment may have stimulated the performance of the mixtures and clover monocultures relative to the grass monocultures, Nyfeler, Huguenin-Elie, Suter et al. [32] showed that four-species mixtures containing clover can have robust transgressive overyielding, over a wide range of N fertilisation levels.
Overyielding in these specific mixtures was related to the high year-round proportion of clover in both two-species mixtures and to a lesser extent in the three-species mixture (Figure 2B). Both N yield and N concentration showed a strong and highly significant (p < 0.001, r = 0.92 and 0.99, respectively) positive correlation with the average weighted clover proportion.
In contrast, mixtures including Lm showed a lower clover content and hence N content and yield, which may be related to the high DM yield of Lm in spring (Figure 1B), slowing down the development of clover. This was illustrated by a significant negative correlation of the sown Lm proportion with N concentration, N yield and average clover proportion, in line with results from De Wit, Rietberg and Van Eekeren [15] with grass-clover mixtures with Lm and Lp crosses.

The DOM yield is higher than predicted for the mixtures
Digestibility was highest for Tr and lowest for Lm (Figure 1E). This is in line with previous reports, where a higher DOMD has been found for white clover in comparison with grasses and red clover [34], because white clover has a lower proportion of structural cell-wall components [5]. The values of the mixtures were similar to predictions from the weighted average of the monocultures. As a result, the DOM yield showed a strong positive correlation with DMY (r = 0.96; p < 0.001). Thus, grass-clover mixtures have a higher DOM yield than monocultures, while the DOMD of the mixtures is in the same range as the monocultures.

Benefits for arable farms
For arable farms, weed suppression is an important aboveground trait, while soil aggregate stability, penetration resistance and soil mineral N for the next crop in the crop rotation, are important belowground traits. Soil aggregate stability and penetration resistance matter because they are indicators for water infiltration, suitability of the habitat for soil biota and the capability of root growth. Rooting is also relevant, as roots form a large part of the soil organic carbon stock [35] and are important for improving the soil physical structure [36].

Grasses and mixtures show highest weed suppression
The proportion of weeds was highest for the fallow, despite monthly mechanical weed control, followed by the two clover monocultures, whereas the grass monocultures and mixtures had the lowest proportion of weeds (Figure 2A). Of the grasses, Lp is known to show good weed suppression [37]. Grass-clover has been reported to show good weed suppression too [7], due to an increased functional trait distribution and better capture of resources of mixtures in comparison with monocultures, even under drought [37]. In the current experiment, there was a significant interaction effect of Lm with both clovers. The suppressive effect of Lm may be related to the high DMY and cover during autumn 2015 and spring 2016.

Mixtures increase root density and soil aggregate stability
At both depths, the rooting score was relatively high for Lp (Figure 4A and 4B), which is known for its extensive and fine root system [38,39]. In line with this, Lp had a high proportion of soil crumbs, which is positively related to soil aggregate stability (Figure 3). While Tp has an entirely different root architecture and had an average root density score, it still had a high proportion of crumbs (Figure 3). This is possibly related to the production of root exudates. Mucilages and root exudates can hold soil particles together, influencing soil structure [40]. The root score of the mixtures was also high at both depths (Figure 4A and 4B), and in most cases higher than predicted based on the monocultures, which is in line with previous research [41]. This may be related to the stronger build-up of soil-borne pathogens in monocultures, which would have a larger negative effect there than in mixtures [42]. Other explanations include the exudation of allelopathic compounds [43], whereby harmful compounds released by litter or roots negatively affect other plants of the same species, and/or heterospecific positive effects of root exudates [44]. The high score for root density of the mixtures explains the high score for the proportion of crumbs (Figure 3). In conclusion, mixtures induce a higher rooting density than expected, comparable to or better than that of Lp, the best monoculture, which in turn increases soil aggregate stability. Roots also have a positive effect on organic matter build-up in the soil.

Penetration resistance is not affected by sward type
The penetration resistance did not differ significantly (p < 0.05) between the sward types, which is in line with a study on the effects of different perennial crops on penetration resistance [45]. The reason might be that in the short term (e.g., one year) penetration resistance is more strongly affected by soil tillage. Moreover, the moisture content of the soil easily masks differences in penetration resistance [46]. Because the fallow treatment was cultivated regularly, no significant difference in penetration resistance could arise. In the long term, root architecture and density might influence penetration resistance; a difference can be expected especially between shallow-rooting and deep-rooting species, as root structure affects soil aggregate stability.

Clover stimulates soil N mineral content
Soil mineral N is important for the following crop in the rotation. It was highest for the clover monocultures (Figure 5) and showed a highly significant positive correlation with sown clover content. After a one-year grass-clover ley, the remains of the clover shoots and roots are mineralised, forming an important source of soil mineral N [47,48]. In contrast, Lp had the lowest soil mineral N content, even lower than the fallow, as it has a high N demand in comparison to clover [29] and an intensive root system. There was no overyielding effect in soil mineral N, indicating that any extra soil mineral N available as a result of clover was proportionally taken up by the grasses. Overall, clover in grasslands improved soil mineral N content for the following crop.
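To make the notion of overyielding concrete, the following minimal Python sketch compares an observed mixture value with the value expected from the monocultures. It assumes that the expected value is the sown-proportion-weighted mean of the monocultures; the function name and all numbers are hypothetical and do not come from the experiment.

```python
def relative_overyielding(observed, expected):
    """Relative deviation of the observed mixture value from the value
    expected on the basis of the monocultures (0 = no overyielding)."""
    return (observed - expected) / expected


# Hypothetical soil mineral N values (kg N ha-1), NOT measured data.
soil_n_mono = {"Lp": 20.0, "Tr": 60.0}
sown = {"Lp": 0.5, "Tr": 0.5}

# Expected value: sown-proportion-weighted mean of the monocultures (assumption).
expected_n = sum(soil_n_mono[sp] * p for sp, p in sown.items())
observed_n = 41.0  # hypothetical mixture measurement

print(expected_n, round(relative_overyielding(observed_n, expected_n), 3))
```

A value close to zero, as in this hypothetical case, corresponds to the situation described above, in which the mixture behaves as the proportional combination of its component species.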

Benefits for biodiversity
For below- and above-ground biodiversity, the availability of food (roots, litter, nectar, pollen and earthworms) and the accessibility of food (e.g., sward height in spring) are important.

Clover increases soil biota score and earthworm abundance in mixtures
In all treatments the total number of earthworms (Table S3) was higher than normally found on arable farms on marine clay (202 n m−2) [49]. The total number of earthworms tended to be higher in clover monocultures and mixtures (Table S3), which is in line with expectations on farms with good soil quality [6,50].
The number of epigeic (top-soil dwelling) earthworms was positively correlated with the clover content and the N content in the soil. These worms may have been attracted by the low C:N ratio of the clover roots and above-ground litter, and may have stimulated the incorporation of these plant parts into the soil, thereby increasing soil mineral N content. In contrast, the endogeic (dwelling within the soil) earthworms showed a positive correlation with root density score and soil crumbliness at both soil depth intervals. Even though this research does not allow a causal relationship to be established, earthworm activity has been related to better soil structure, which allows better root development and penetration, particularly at greater soil depths [51].
Results are partly in contrast with data from Van Eekeren, Van Liere, De Vries et al. [6], who reported more endogeic earthworms at higher N from roots and above-ground litter. Possibly epigeic worms have an advantage at the beginning of a grass-clover ley because they are the ones incorporating the above-ground litter in the soil before endogeic earthworms can use it.
The score for soil pores as an indicator for soil biota activity was significantly correlated to the sown clover content (r = 0.58; p < 0.01). This agrees with earlier findings, where the presence of clover increased soil biota [6], due to increased food quality [8], as the C:N ratio of clover is lower than that of grass [6]. In deeper soil layers, this effect decreases, since most soil biota live in the upper 20 cm of the soil and eat mostly parts of the shoot and new roots [52]. Hence, soil biota activity was positively affected by clover, especially in the upper soil layers. There was no significant correlation between the pore score and any of the earthworm parameters (Table S4c).

Clover and grass-clover may affect farmland birds and bees positively
Particularly in spring, there is evidence that the survival rate of meadow bird chicks is negatively affected by tall swards. For example, the foraging rate of northern lapwing (Vanellus vanellus) and common starling (Sturnus vulgaris) chicks declined as sward height increased, as short swards may facilitate surface prey detection, improve forager mobility and increase foraging time by altering vigilance patterns [53].
The sward types containing Lm showed a DMY in excess of 7 t DM ha−1 for the first cut in May (Figure 1B), so these swards would be sub-optimal for foraging farmland birds. Tp and Tr monocultures showed the lowest spring DMY (<2 t DM ha−1), whereas Lp and the remaining mixtures had intermediate spring DMY, indicating a more beneficial habitat for meadow birds.
In addition to improved foraging, grass-clover mixtures might be beneficial for breeding of farmland birds. Schlaich, Klaassen, Bouten et al. [54] showed that skylarks nested in 'Birdfields', which consisted of a combination of set-aside and Medicago sativa. As especially red clover resembles Medicago sativa in growth form, it is highly probable that skylarks and other farmland birds deem the tested grass-clover mixtures suitable for breeding.
The effects of sward type on bees and other insects were not measured, but it is known that flowering clover is beneficial for bees [9], including several red list species [55]. Therefore, bees and other pollinators and insects are likely to benefit from the clover monocultures and the grass-clover mixtures compared to the grass monocultures, which is positive both for insect and pollinator populations and as a food source for meadow birds [56].

Practical implications
This research showed that grass and clover differently affected above- and belowground parameters important for livestock and arable farms, and biodiversity, as summarised in Table 2. The grass monocultures showed good weed suppression, had a high root score, and especially Lp had a high score for soil structure. Clover, on the other hand, showed high herbage DMY (particularly Tp) and N yield, and Tr showed a high DOMD. Moreover, clover had a positive effect on the soil mineral N, showed low herbage DMY in spring, and the soil biota score and earthworm abundance tended to be higher in the clover monocultures. When (some of) the four species were combined in grass-clover mixtures, they combined the best below- and aboveground characteristics of the species and often even outperformed the (best) monocultures (Table 2). Therefore, grass-clover mixtures are a good option to combine agro-ecological functions for livestock and arable farms and biodiversity.
The present research was limited to one site and one year. In order to design grass-clover mixtures that are optimised to meet the specific requirements of livestock and arable farms, and biodiversity, more site-specific, multi-year research is required.