1 Introduction

Provisioning a growing population with enough food of good nutritional quality while reducing agriculture’s negative environmental impacts is one of the century’s biggest challenges (Foley et al. 2011). Weed management is recognized to be a key point for ecological intensification in agriculture (Petit et al. 2015): weeds can generate severe yield losses (Oerke 2006) and their management in arable crops currently relies on herbicides. The over-reliance on chemical weed control in over-simplified cropping systems (CS) is now questioned because of water pollution and herbicide resistance (Heap 2014; Mottes et al. 2014). On the other hand, weeds have been recognized as an important support to agro-ecosystem functioning that should be maintained provided that economic losses are not generated (Armengot et al. 2013; Barzman et al. 2015; Petit et al. 2015).

Alternative weed management tools can hardly match the effectiveness of synthetic herbicides on their own (Swanton et al. 2008). To reduce herbicide reliance and maintain crop productivity, integrated weed management (IWM) seeks to optimize the synergy between a diverse set of weed management tools coherently combined at the CS scale. Diversified crop sequences appear as one critical component of IWM across a diversity of situations (Anderson 2015). Crop rotation diversifies selection pressures because crops determine tillage type and timing, sowing date, timing and mode of action of herbicides, type of mechanical weeding, period of competition, harvest date, amount of crop residues, etc. (Barzman et al. 2015; Koocheki et al. 2009; Lechenet et al. 2014; Petit et al. 2015). Hence, each crop and its associated practices will act as a set of filters that can disrupt different phases of the weed species’ life cycle (Derksen et al. 2002). Besides this, crop yield can be increased when the time interval between the same crop is extended (e.g., a 20% increase in winter wheat yield; Derksen et al. 2002). Additionally, it has been shown that crop competitiveness can be optimized by increasing seeding rates, adapting row spacing, fertilization strategy, tillage, and competitive crop varieties (Fig. 1). These practices, when taken independently, have a limited effect on weed biomass suppression (i.e., 10–15%) but can provide outstanding results when combined into a multi-tactic approach (i.e., 70% weed biomass suppression) (Derksen et al. 2002). The importance of tillage is up for debate. No-till or reduced tillage CS usually show higher weed pressure and herbicide use, increasing the probability of herbicide resistance (Barzman et al. 2015). However, some authors (e.g., Anderson (2015)) argue that tillage lessens the impact of rotation design on weed density because weed seeds survive longer after burial. False/stale seedbed techniques, delayed sowing, herbicide dose reduction, and mechanical weeding are also some of the tools which can be added to fine-tune the CS (Barzman et al. 2015). Nevertheless, examples of how the synergy between long-term strategic planning and short-term cultural tactics affects productivity or how it could be combined into a fully functional CS are still scarce (Petit et al. 2015; Swanton et al. 2008).

Fig. 1 Cultural methods combining twin row spacing (8 cm–22 cm) to allow hoeing of winter wheat in the larger interrows and weed suppression by competition in the narrower interrows (© Pascal FARCY, 2017)

One of the main bottlenecks of IWM is the widespread belief that reduced herbicide use will lead to explosive weed dynamics and reduce crop production (Bastiaans et al. 2008). Farming practices determine productivity by fixing yield potential but also by driving weed dynamics, which can represent a constraint on potential productivity (Quinio et al. 2017). A recent study across 946 non-organic arable commercial farms showed no positive relationships between herbicide use and productivity for 71% of the farms (Lechenet et al. 2017). Such results could suggest that weeds did not represent a constraint on crop production because herbicide use could be compensated by alternative preventive and curative measures, as shown in a simulation study by Colbach and Cordeau (2018). Few long-term experiments provide a complete picture of how farming practices influenced weeds and final productivity (Davis et al. 2012). Studies tend to overlook either weeds (Lechenet et al. 2017), productivity (Chikowo et al. 2009), or the multi-annual complexity of CS and long-term weed dynamics (Quinio et al. 2017). Finally, concrete examples of combinations of coherent farming practices that reconcile low herbicide use and high productivity are scarce throughout the literature (see Adeux et al. (2017) in conventional maize monoculture) and there is ongoing debate about whether or not multiple pathways are possible to achieve this goal (Petit et al. 2015; Wezel et al. 2014).

The objectives of this study are to (i) assess long-term weed control in CS which were designed to reduce herbicide use and (ii) disentangle the effects of farming practices and weed competition on crop productivity. We hypothesize that (i) coherent combinations of farming practices allow a reduction of herbicide use and efficient long-term weed control, (ii) crop productivity is affected by certain characteristics of the CS (e.g., delayed sowing, crop yield potential) but that (iii) weeds do not represent a constraint on productivity. This study is the first to combine 17 years of intensive observations of weed densities before and after weeding, weed and crop biomass at crop flowering, and crop yields across five CS tested in large farm-scale conditions. These CS included a reference 3-year rotation with systematic ploughing and chemical weeding (S1) and four 6-year rotations with contrasting constraints (S2: transition from reduced tillage to no-till, chemical weeding; S3: chemical weeding; S4: typical IWM system; S5: mechanical weeding).

2 Material and methods

2.1 Experimental set up

The field experiment was conducted from harvest 2000 to harvest 2017 at the INRA experimental farm in Bretenière (47° 14′ 11.2″ N, 5° 05′ 56.1″ E), 15 km southeast of Dijon, France. The site is subject to an oceanic climate (but with a greater temperature range than the Atlantic coast), characterized by cold wet winters (average daily temperature of 4 °C and average monthly precipitation of 43 mm) and hot summers (average daily temperature of 18 °C and average monthly precipitation of 69 mm). The experiment was set up as a completely randomized block design. The set of decision rules characterizing each of the five CS was replicated on two blocks. The two blocks (A and D) were characterized by a clay content of 40 (A) to 50% (D) and a soil depth of 0.5 (D) to 0.9 m (A). To avoid complete overlap between crop:year and CS effects, the two plots (1.7 ha) of each CS did not start with the same entry point (i.e., crop).

S1 was the reference CS, typical of the Burgundy region, designed to maximize financial return. It was characterized by a triennial oilseed rape–winter wheat–winter barley rotation, systematic moldboard ploughing in summer-autumn, and herbicides as the sole curative weed management tool. Nitrogen fertilization aimed to meet the full needs of the crop.

All alternative CS (S2, S3, S4, and S5) were designed to mimic farmers aiming to reduce herbicide reliance through contrasting agronomical pathways and resulted in more complex 6-year rotations which included: three winter-sown crops (winter wheat (Triticum aestivum L.), winter barley (Hordeum vulgare L.), triticale (× Triticosecale Wittm. ex A. Camus) or faba bean (Vicia faba L.)), autumn-sown oilseed rape (Brassica napus L.), one spring crop (oat (Avena sativa L.), sugar beet (Beta vulgaris subsp. vulgaris L.), faba bean, lupin (Lupinus albus L.), spring barley or mustard (Brassica juncea (L.) Czern)) and one summer-sown crop (maize (Zea mays L.), sorghum (Sorghum bicolor (L.) Moench), soybean (Glycine max (L.) Merr.), or sunflower (Helianthus annuus L.)). Hence, winter wheat and oilseed rape, the two most common crops of the region, were present throughout the five CS. Sugar beet was only cropped in S4 (up to 2006 when the nearby sugar refinery plant closed). In S5, perennial forage crops such as alfalfa (Medicago sativa L.) were included in order to manage Canada thistle (Cirsium arvense (L.) Scop.) or bitter dock (Rumex obtusifolius L.). Similarly, in S3, companion crops (such as faba bean, lentil (Lens culinaris Medikus), vetch (Vicia sativa L.), flax (Linum usitatissimum L.)) were intercropped in oilseed rape to cover interrows before winter.

Alternative CS also differed in their tillage type and weed management strategies. S2 was a transition from reduced tillage (i.e., no inversion tillage, 2001–2010) to no-till conservation agriculture (2010–2017), designed to reduce labor requirement and time-consuming operations, whereas S3, S4, and S5 could implement moldboard ploughing. These four CS could also implement a wide array of preventive and cultural weed management tools such as false seedbed technique (up to 2010 for S2), delayed sowing of winter cereals, and higher seeding rates. Herbicides were used as the sole method of direct weed control in S2 and S3. This choice was made in coherence with the strategy of minimum soil disturbance in S2 and to simulate certain farmers’ wish to not invest in mechanical weeding tools in S3. In contrast, S5 resorted to mechanical weeding as the sole method of direct weed control. S4 aimed to be the typical IWM system, resorting preferentially to preventive measures and mechanical weeding. However, applications of specialized herbicides on target species remained possible when weather conditions were not suitable for mechanical weeding or to control weeds with a low sensitivity to mechanical weeding. More detailed information concerning crop sequence, pesticide application, soil tillage, sowing, and harvest dates can be found in Deytieux et al. (2012) and Chikowo et al. (2009).

2.2 Data collection

2.2.1 Herbicide reliance and diversity of farming practices

Herbicide reliance was quantified through the herbicide treatment frequency index (HTFI) as Eq. 1:

$$ \mathrm{HTFI}={\sum}_T\frac{{\left(\mathrm{Applied}\ \mathrm{dose}\right)}_T\times {\left(\mathrm{Treated}\ \mathrm{surface}\ \mathrm{area}\right)}_T}{{\left(\mathrm{Reference}\ \mathrm{dose}\right)}_T\times \mathrm{Plot}\ \mathrm{surface}} $$
(1)

where T is a given herbicide treatment, applied at a given dose on a given area (which may be smaller than the plot in the case of localized applications), and the reference dose is that of herbicide T on the crop concerned. A herbicide application at the reference dose on the whole plot surface yields a value of 1. In the case of glyphosate, different reference doses exist depending on the targeted weed species. Here, we considered as an HTFI of 1 an application of 3 L ha−1 of a commercial product containing 360 g L−1 of glyphosate as a unique active ingredient. Herbicides were partitioned according to their spectrum (Mamarot and Rodriguez 1997): broad spectrum (i.e., non-selective herbicides), anti-broadleaf, and anti-grasses (i.e., selective herbicides). Glyphosate was separated from the other broad-spectrum herbicides because it was the only herbicide used during the fallow period.
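As an illustration of Eq. 1, the following minimal Python sketch sums the dose and area ratios over the treatments applied to one plot. The class name `Treatment`, the function name `htfi`, and the second (localized) treatment in the example are hypothetical and only serve to show the arithmetic; the 3 L ha−1 glyphosate reference and the 1.7 ha plot size are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Treatment:
    """One herbicide application (hypothetical container, not from the source)."""
    applied_dose: float      # same unit as reference_dose, e.g. L/ha
    reference_dose: float    # reference dose of this herbicide on this crop
    treated_area_ha: float   # area actually sprayed, in ha


def htfi(treatments, plot_area_ha):
    """Herbicide treatment frequency index of one plot, following Eq. 1."""
    if plot_area_ha <= 0:
        raise ValueError("plot area must be positive")
    return sum(
        (t.applied_dose * t.treated_area_ha) / (t.reference_dose * plot_area_ha)
        for t in treatments
    )


PLOT_AREA_HA = 1.7  # plot size reported in the text

# Full-plot glyphosate application at the 3 L/ha reference dose: contributes 1.
glyphosate_full = Treatment(applied_dose=3.0, reference_dose=3.0, treated_area_ha=1.7)
# Hypothetical localized treatment: half dose on half of the plot, contributes 0.25.
localized_half = Treatment(applied_dose=0.5, reference_dose=1.0, treated_area_ha=0.85)

print(htfi([glyphosate_full], PLOT_AREA_HA))
print(htfi([glyphosate_full, localized_half], PLOT_AREA_HA))
```

Reducing either the dose or the treated area lowers the index proportionally, which is why localized or reduced-dose applications translate directly into a lower HTFI.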

CS were characterized according to five blocks of variables describing tillage, crop rotation, herbicide use, mechanical weeding, and fertilization. The number of false seedbed preparations was computed as the number of intervals separating two tillage operations by more than 2 weeks. Hardy crops refer to triticale and oats. Delayed sowing of winter cereals corresponds to the number of days separating sowing and the earliest sowing date of winter cereals in the dataset. Variability of tillage depth and nitrogen fertilization was computed as the standard deviation of cumulated annual tillage depths and nitrogen fertilization respectively, whereas diversity of mechanical weeding tools, sowing period, crop types and herbicide spectrum was computed via the Shannon diversity index on the sum of practices applied within a sub-block (example of sub-block for mechanical weeding: rotary hoe, hoe, and harrow) at the CS scale. Crop types were defined as an interaction between botanical families and sowing periods.
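The Shannon diversity index used for these descriptors can be sketched as follows. This is a minimal illustration that assumes the natural logarithm (the text does not state the base) and uses hypothetical operation counts for the three mechanical weeding tools named above.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over categories with count > 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


# Hypothetical numbers of operations per tool over a whole rotation.
even_use = {"rotary hoe": 4, "hoe": 4, "harrow": 4}
single_tool = {"rotary hoe": 0, "hoe": 0, "harrow": 12}

print(shannon_index(even_use.values()))     # maximal for three tools: ln(3)
print(shannon_index(single_tool.values()))  # no diversity
```

The index is highest when operations are spread evenly across tools and falls to zero when a single tool is used, which is the behavior expected from a descriptor of practice diversity.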

2.2.2 Weed and crop sampling, weeding efficacy

On average, weed abundance was assessed 2.3 times per plot each year. Weed abundance was assessed by counting the density of each weed species in 32 (2001–2013) or 8 (2014–2017) 0.36 m2 fixed quadrats. Samples were then partitioned according to their timing with respect to weeding operations: before (i.e., November to February for winter cereals, April for spring crops, May–June for summer crops) and after (i.e., around crop flowering). The use of pre-emergence herbicides did not allow us to characterize weed density before weeding in the case of oilseed rape in the reference system.

In order to compare weeding efficacy across CS, a subset analysis containing only combinations of plot × year for which weeds were observed before and after weeding was performed. This analysis allowed us to partition total weed density after weeding into (i) species that were present before weeding and were not entirely removed by it (their abundance may have been reduced) and (ii) species that germinated in between the two sampling sessions. Hence, the analysis focused on the difference of weed density before and after weeding of species that were initially present.
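The partition described above can be sketched for a single fixed quadrat as follows. This is a simplified illustration on raw counts, not the mixed model used for inference; the function names and the example densities are hypothetical.

```python
def partition_after_weeding(before, after):
    """Split after-weeding density into survivors and late-emerging species.

    before, after: dicts mapping species name -> density (plants per m2)
    observed in the same quadrat before and after weeding.
    Returns (density of species already present before weeding,
             density of species that emerged between the two samplings).
    """
    surviving = sum(d for sp, d in after.items() if before.get(sp, 0) > 0)
    late_emerging = sum(d for sp, d in after.items() if before.get(sp, 0) == 0)
    return surviving, late_emerging


def weeding_efficacy(before, after):
    """Relative reduction (%) in density of the species initially present."""
    initial = sum(before.values())
    if initial == 0:
        raise ValueError("no weeds were present before weeding")
    surviving, _ = partition_after_weeding(before, after)
    return 100 * (initial - surviving) / initial


# Hypothetical quadrat: two species present before weeding, one late emerger.
before = {"Alopecurus myosuroides": 10, "Galium aparine": 5}
after = {"Alopecurus myosuroides": 2, "Galium aparine": 1, "Chenopodium album": 4}

print(partition_after_weeding(before, after))
print(weeding_efficacy(before, after))
```

Excluding late-emerging species from the efficacy calculation avoids penalizing a weeding operation for seedlings that were not yet present when it was carried out.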

Aboveground crop and weed biomass were also sampled at crop flowering (i.e., after weeding) in the same quadrats as weed density. Weed biomass was collected in all 32 (2004–2013) or 8 (2014–2017) quadrats. Crop biomass was collected in a random subset of 4 of the 32 quadrats over the 2004–2013 period whereas all 8 quadrats were sampled over the 2014–2017 period. Samples were then dried for 48 h at 80 °C and weighed. Yield loss due to weeds (i.e., percentage of crop biomass reduction) was estimated through crop biomass estimation in weed-free conditions at flowering (see Section 2.3).

2.2.3 Yield assessment

Crop productivity was assessed each year as the yield (standardized at 0% moisture content) at the plot scale through weighing of grain wagons and assessment of grain moisture content. Crop productivity at the CS scale was computed by standardizing harvested products into energy based on the higher heating value of each crop (Lechenet et al. 2017). Energy values were then back-transformed into equivalent tons of winter wheat grain.
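The conversion into wheat-equivalent yield can be sketched as below. The higher heating values in the dictionary are placeholders chosen for readability, not the values used in the study (which follow Lechenet et al. 2017), and the function name is hypothetical.

```python
# PLACEHOLDER higher heating values (MJ per kg dry matter); illustrative only,
# not the coefficients used in the study.
HHV_MJ_PER_KG = {
    "winter wheat": 18.0,
    "oilseed rape": 27.0,
}


def wheat_equivalent_yield(yield_t_dm_ha, crop, hhv=HHV_MJ_PER_KG):
    """Convert a dry-matter yield (t/ha) into equivalent tons of wheat grain.

    The yield is first expressed as energy (proportional to yield * HHV of the
    crop) and then divided by the HHV of winter wheat grain.
    """
    return yield_t_dm_ha * hhv[crop] / hhv["winter wheat"]


# With the placeholder values, 3 t/ha of oilseed rape carries as much energy
# as 4.5 t/ha of winter wheat grain.
print(wheat_equivalent_yield(3.0, "oilseed rape"))
```

Expressing every harvest in the same energy-based unit is what makes yields of different crops additive at the CS scale.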

2.3 Statistical analysis

2.3.1 Mixed effect models

All regression analyses were carried out with the R software version 3.3.2 (R Development Core Team 2016). Generalized and linear mixed effect models were performed with the packages {lme4} and {nlme}, respectively, in order to account for the nested sampling design and the distribution of residuals. HTFI, weed biomass, percentage of crop biomass reduction due to weeds, and crop productivity were modeled with a Gaussian distribution whereas weed density was analyzed with a Poisson distribution. Weed biomass and crop productivity were ln-transformed to meet normality assumptions. A constant variance function was added to the model when heteroscedasticity across CS was detected.

Models were fitted at three different scales: CS, winter wheat and oilseed rape. All models contained block, CS, time (as continuous) and CS × time as fixed effects. The comparison of weed density before and after weeding required the addition of timing, timing × time, CS × timing, and CS × time × timing. Significance of fixed effects was tested through type III likelihood ratio tests using the package {monet}. Year and plot were always considered random effects. The interaction between year and plot was considered a random effect when multiple values (i.e., pseudo-replication) were available for a combination of the two. A unique quadrat identifier was added as a random effect for the analysis of weed density because the same quadrats were observed before and after weeding. Overdispersion in Poisson regression was accounted for by the addition of an observation level random effect.

Contrasts were adjusted using the package {emmeans}. Least square means (denoted \( {\overline{\mathrm{x}}}_{\mathrm{ls}} \)) or trends (denoted β) are presented to highlight the marginal effect of CS. Coefficient of determination (R2) is presented as a useful diagnostic tool. For generalized or linear mixed effect models, R2 is partitioned into marginal R2 (R2m), the variance explained by the fixed factors and conditional R2 (R2c), the variance explained by the entire model.
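The partition of R2 into marginal and conditional components can be illustrated as follows. This sketch assumes the usual variance-partitioning definition for Gaussian mixed models (variance of the fixed-effect predictions, random-effect variances, residual variance); the text does not state which formula or function was used, and the variance components in the example are hypothetical.

```python
def r2_marginal_conditional(var_fixed, var_random, var_residual):
    """Return (R2m, R2c) from variance components of a Gaussian mixed model.

    var_fixed:    variance of the predictions from the fixed effects alone
    var_random:   iterable of variances, one per random effect (e.g. year, plot)
    var_residual: residual variance
    """
    var_random_total = sum(var_random)
    total = var_fixed + var_random_total + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    r2_marginal = var_fixed / total
    r2_conditional = (var_fixed + var_random_total) / total
    return r2_marginal, r2_conditional


# Hypothetical variance components: fixed = 2, year = 1, plot = 1, residual = 4.
r2m, r2c = r2_marginal_conditional(2.0, [1.0, 1.0], 4.0)
print(r2m, r2c)
```

By construction R2c is never smaller than R2m, and the gap between the two indicates how much variation is absorbed by the random effects (here year and plot).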

Crop biomass reduction due to weeds was computed as given in Eq. 2. Crop biomass in the absence of weeds (crop biomass predicted in weed-free conditions) was estimated by building the most parsimonious model (backward selection according to the Akaike Information Criterion corrected for small sample sizes) relating crop biomass to weed biomass at flowering and predicting the value of the intercept. Only the final model’s intercept (predicted for a specific combination of plot and year) was used to compute the percentage reduction of crop biomass.

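$$ \%\ \mathrm{crop}\ \mathrm{biomass}\ \mathrm{reduction}=\frac{{\left(\mathrm{Crop}\ \mathrm{biomass}\ \mathrm{predicted}\right)}_{\mathrm{weed}\ \mathrm{free}}-{\left(\mathrm{Crop}\ \mathrm{biomass}\ \mathrm{observed}\right)}_{\mathrm{weedy}}}{{\left(\mathrm{Crop}\ \mathrm{biomass}\ \mathrm{predicted}\right)}_{\mathrm{weed}\ \mathrm{free}}} $$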
(2)
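The logic of Eq. 2 can be sketched with an ordinary least-squares line in place of the mixed model. This is a deliberate simplification: it ignores the random effects and the model selection step, so the intercept is a single value rather than one predicted per plot and year. Function names and data are hypothetical, and the ratio is multiplied by 100 to express it as a percentage.

```python
def ols_fit(x, y):
    """Intercept and slope of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def percent_crop_biomass_reduction(weed_biomass, crop_biomass):
    """Apply Eq. 2 to each observation, using the fitted intercept as the
    crop biomass predicted in weed-free conditions."""
    weed_free, _ = ols_fit(weed_biomass, crop_biomass)
    return [100 * (weed_free - observed) / weed_free for observed in crop_biomass]


# Hypothetical quadrat data (g per m2): crop biomass declines with weed biomass.
weed = [0.0, 10.0, 20.0, 30.0]
crop = [800.0, 760.0, 720.0, 680.0]

print(ols_fit(weed, crop))
print(percent_crop_biomass_reduction(weed, crop))
```

Using the intercept as the weed-free reference means that no weed-free control plots are needed, at the cost of relying on the fitted relationship between crop and weed biomass.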

2.3.2 Multivariate analysis

Principal component analysis was performed to highlight the farming practices that best discriminated the five CS. All farming practices were computed at the CS scale. Hence, the matrix consisted of 10 rows (one for each plot) and 28 columns (one for each descriptor of farming practices). Variables were centered and scaled. Supplementary variables describing the diversity of CS operations were simply projected on top of the ordination, i.e., they had no effect on the ordination. The ordination diagram was produced with the CANOCO software (Šmilauer and Lepš 2014).
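The ordination was produced with CANOCO; the sketch below only illustrates the underlying computation for the first axis. It centers and scales the variables, builds their correlation matrix, and extracts the leading axis by power iteration, a simple technique swapped in here for a full eigendecomposition. All function names and the toy matrix are hypothetical.

```python
import math


def standardize(matrix):
    """Center and scale each column (sample standard deviation).
    Constant columns must be removed beforehand (their deviation is zero)."""
    n = len(matrix)
    scaled_columns = []
    for column in zip(*matrix):
        mean = sum(column) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        scaled_columns.append([(v - mean) / sd for v in column])
    return [list(row) for row in zip(*scaled_columns)]


def correlation_matrix(matrix):
    """Correlation matrix of the columns of a rows-by-variables matrix."""
    z = standardize(matrix)
    n, p = len(z), len(z[0])
    return [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]


def leading_axis(corr, iterations=500):
    """First principal axis of a correlation matrix by power iteration.
    Returns (eigenvalue, loadings, percentage of total variation)."""
    p = len(corr)
    vec = [1.0 + 0.1 * j for j in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        new = [sum(corr[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[a] * sum(corr[a][b] * vec[b] for b in range(p))
                     for a in range(p))
    # The trace of a correlation matrix equals the number of variables p.
    return eigenvalue, vec, 100 * eigenvalue / p


# Toy matrix: 4 plots x 2 perfectly correlated descriptors.
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
print(leading_axis(correlation_matrix(toy)))
```

Because the variables are scaled, each contributes one unit of variance, so the share of variation carried by an axis is its eigenvalue divided by the number of active variables. Supplementary variables would be correlated with the axes afterwards without entering this computation.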

3 Results and discussion

3.1 IWM systems allow a drastic reduction of herbicide use in time

Analysis of HTFI at the CS scale showed a significant effect of CS, time, and CS × time interaction (Tab. 1). Over the years, mean HTFI in S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 1.87, SE = 0.12) and S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 1.71, SE = 0.15) was similar to regional references and was significantly greater than mean HTFI in S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 1.00, SE = 0.11) and S4, the typical IWM system (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.66, SE = 0.10). S5 showed the lowest HTFI of all CS (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.02, SE = 0.01). Higher herbicide use is often reported in reduced or no-till systems (Wezel et al. 2014). However, the fact that herbicide use in S2 did not exceed that of S1 can be considered an improvement given all the positive aspects associated with conservation tillage (Anderson 2015).

Table 1 ANOVA-like table resulting from type III likelihood ratio tests on nested linear mixed effect models showing the effect of the tested explanatory variables (time being year treated as continuous) on herbicide treatment frequency index (HTFI), productivity, total weed biomass and the % of crop biomass reduction due to weeds at three different scales: cropping system (CS), winter wheat (WW), and oilseed rape (OSR)

Moreover, herbicide use was significantly reduced in time in S1 (β = − 0.09, pβ≠0 = 0.0005) and S3 (β = − 0.05, pβ≠0 = 0.03) whereas S2 showed an increasing trend (β = 0.05, pβ≠0 = 0.08) (Fig. 2), due to a transition from reduced tillage to a no-till conservation agriculture system which precluded superficial tillage during the fallow period and required glyphosate applications. Indeed, S2 had the greatest mean annual glyphosate TFI (Fig. 2, \( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.74, SE = 0.11), which increased over time (Fig. 2, β = 0.08, pβ≠0 = 0.0002). The higher level of glyphosate use in S2 is in line with previous results (Derksen et al. 2002) showing a greater reliance of reduced or no-till systems on this active compound, mainly because of effective control of weeds and crop volunteers and cost-effectiveness for cover crop termination. In our experiment, the S2 CS did not aim at reducing glyphosate use in particular but herbicide use in general. Thus, considering the amount of current research done on non-chemical termination of cover crops (Davis 2010), future opportunities could allow a drastic reduction of herbicide use in these systems.

Fig. 2 Dynamics of herbicide treatment frequency index (HTFI, graphs on the left) and glyphosate TFI (graphs on the right) across the five cropping systems. One data point corresponds to a plot × year value of HTFI or glyphosate TFI. The trends (black line) are highlighted with a loess smoother and 95% confidence intervals (grey bands). Statistical inferences were made on a linear mixed effect model taking into account random effects. Blue arrows highlight stable dynamics over the 17 years, green and red arrows show a significant reduction or increase of HTFI respectively in time at p < 0.05

HTFI in S1 was mainly represented by broad-spectrum herbicides (70%), which highlights its conservative strategy to cover a wide spectrum of flora. This is contrary to IWM principles which do not seek to eradicate the weed community but rather to adapt herbicides to target particularly harmful species (Barzman et al. 2015). The higher percentage of anti-grass herbicides (24%) in S2 reflects its selection of grasses such as, in our case, Alopecurus myosuroides Huds., which often prevail in reduced tillage or no-till systems (Derksen et al. 2002). Conversely, grasses were not as predominant in S3 and S4, as reflected by the low percentage of anti-grass herbicides used in these systems (8%). In these systems, herbicide spraying mainly targeted particularly harmful dicots such as Galium aparine L., well adapted to tolerate mechanical weeding and enhanced by diversified grain-based cropping sequences.

Analysis at the winter wheat scale and at the oilseed rape scale showed similar results but brought to light the different strategic CS approaches in S3 (use of herbicide in winter wheat and oilseed rape) vs. S4 (use of herbicide mainly in winter wheat). Herbicide reliance may be reduced at the crop rotation scale by reducing herbicide use in certain crops while maintaining a high level of herbicide use in economically important crops. This may ease farmers’ fears of explosive weed dynamics in IWM (Bastiaans et al. 2008).

3.2 Low herbicide use is achieved through diversified practices

The first and second axes of the principal component analysis accounted for 42.3 and 22.5% of the total variation, respectively (Fig. 3). The first axis, discriminating S1 and S2 from S3, S4, and S5, was mostly associated with herbicide use and the percentage of autumn-sown crops (i.e., oilseed rape in our case). Chemical weed management in S2 resulted in a greater diversity of herbicide use spectrum partly due to the inclusion of glyphosate.

Fig. 3 Principal component analysis on the correlation matrix of farming practices. The position of the centroid of each cropping system is shown with black triangles. Indicators highlighting the diversity of operations within the main components of cropping systems (i.e. crop rotation, tillage, fertilization …) are shown with empty arrow tips and are supplementary variables, i.e. only projected on top of the ordination. Red colored variables refer to herbicide use, brown to tillage practices, green to the crop rotation, orange to fertilization and blue to mechanical weeding. HTFI: herbicide treatment frequency index, n: number of operations

The second axis, discriminating S2 and S3 from S1, S4, and S5, was mainly associated with moldboard ploughing frequency (highly correlated to cumulated tillage depth) and the percentage of summer crops. Moldboard ploughing was indeed systematic in S1 but implemented every 2 years in S3, S4, and S5, leading to a higher interannual variability of tillage depth in the latter alternative CS. However, the number of secondary tillage operations (including false seedbed) was greater in S3 (on average 3.9 per crop season), S4 (4.3) and S5 (3.6) than in S1 (2.8). False seedbeds, known to be an effective weed management tool (Rasmussen 2004), were antagonistic to the no-till strategy of minimum soil disturbance, which renders diversification of selection pressures more challenging in this type of system.

S1 was associated with (i) a high percentage of oilseed rape and winter crops (i.e., winter wheat and winter barley) as a result of its 3-year rotation of low functional diversity, (ii) higher nitrogen fertilization rates (average of 160 kg N year−1), and (iii) a high percentage of herbicides applied as mixtures (44%). By contrast, all alternative CS were associated with a higher percentage of spring, summer, and hardy crops (i.e., triticale, oat) as a result of their 6-year rotation, leading to a higher diversity of crop types and sowing periods. Diversifying sowing periods disrupts dynamics of autumn emerging weeds because crops determine tillage timing which in turn determines if weed species germination requirements will be fulfilled (Cordeau et al. 2017). The introduction of summer crops in our region (such as maize or soybean) led to a slight need for irrigation (150 mm plot−1 over the experiment). Introduction of legume crops and to a lesser extent crops with lower N requirements allowed a reduction of nitrogen fertilization in comparison with S1 (average of 120 kg N year−1 for S4 and 100 kg N year−1 for S2, S3, and S5) and hence a higher variability of nitrogen fertilization regime at the CS scale. Considering that the yield potential of winter wheat was reduced by delayed sowing (18 days later than S1 on average for all winter cereals of all alternative CS), nitrogen fertilization was reduced (by 33 to 40 units on average for all alternative CS in comparison with S1) and sowing density increased (+ 45 kg/ha on average for all alternative CS in comparison with S1). However, delayed sowing of winter cereals is not currently adopted by farmers in conservation agriculture (such as S2), who prefer early sowing in order to benefit from warmer soil conditions. The integration of perennial crops such as alfalfa in S5 required a higher level of potassium fertilizers (on average + 12 kg K2O year−1 ha−1).

Finally, S4 and S5 used a diversity of mechanical weeding tools (harrow, rotary hoe, and hoe) depending on crop, crop stage, and weed biology. Lower herbicide use in S4 and S5 was made possible by compensating with alternative measures (Colbach and Cordeau 2018) such as mechanical weeding (2.4 and 2.6 operations year−1 in S4 and S5 respectively) and false seedbed technique operations (Rasmussen 2004).

3.3 Contrasted pathways to efficient weed management exist

The weed context of the long-term experiment was representative of past cropping sequences which included winter wheat, winter barley, and to a lesser extent, summer crops such as soybean and sunflower. It was mainly represented by Alopecurus myosuroides Huds., Galium aparine L., Viola arvensis Murray, Veronica hederifolia L., and Veronica persica Poir. in autumn-winter, Polygonum aviculare L., Persicaria maculosa Gray, and Fallopia convolvulus (L.) Á. Löve in the spring and Chenopodium album L., Solanum nigrum L., and Amaranthus hybridus L. in summer-sown crops.

The analysis of total weed density at the CS scale showed a significant effect of CS, time, CS × time interaction, CS × timing interaction, and CS × timing × time interaction. On average, total weed density before and after weeding was low across all CS but greater in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{before}} \) = 17.6 plants m−2, SEbefore = 3.86, \( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{after}} \) = 12.1, SEafter = 2.6), S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{before}} \) = 22, SEbefore = 4.8, \( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{after}} \) = 15.2, SEafter = 3.2), S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{before}} \) = 18, SEbefore = 3.70, \( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{after}} \) = 19.6, SEafter = 3.9), and S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{before}} \) = 6.4, SEbefore = 1.3, \( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{after}} \) = 20.6, SEafter = 4.2) than in the reference system S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{before}} \) = 2.3, SEbefore = 0.5, \( {\overline{\mathrm{x}}}_{\mathrm{ls}-\mathrm{after}} \) = 1.8, SEafter = 0.4). These results are consistent with previous studies showing greater weed density under no-till or low herbicide input CS (Adeux et al. 2017; Bàrberi and Lo Cascio 2001; Moonen and Bàrberi 2004).

Before weeding, S4 was the only system with no increase in total weed density over time (Fig. 4). However, total weed density significantly increased in S1 (βlog scale = 0.12, pβ≠0 = 0.003), S2 (βlog scale = 0.11, pβ≠0 = 0.01), S3 (βlog scale = 0.10, pβ≠0 = 0.01), and S5 (βlog scale = 0.18, pβ≠0 < 0.001) reaching a final predicted density of 3, 44, 78, and 47 plants m−2, respectively, in 2017 (Fig. 4). After weeding, total weed density significantly increased in S2 (βlog scale = 0.11, pβ≠0 = 0.01), S3 (βlog scale = 0.09, pβ≠0 = 0.03), and S5 (βlog scale = 0.16, pβ≠0 < 0.001) reaching a final predicted density of 32, 30, and 83 plants m−2, respectively, in 2017 (Fig. 4). Although CS such as S1 are challenged by herbicide-resistant populations of Alopecurus myosuroides Huds. in our region (Chauvel et al. 2001), weed dynamics after weeding were stable in S1 in this experiment, possibly due to an efficient choice and rotation of herbicides (type and rate) and precise timing of applications which could be more difficult to organize under larger real-life farm conditions (Barzman et al. 2015). On the other hand, S4, which can be considered as the most diversified CS, also showed stable weed dynamics after weeding.
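Because weed density was modeled with a Poisson distribution, the slopes above are expressed on the log scale and translate into multiplicative changes per year. The short sketch below shows this conversion; the function names are hypothetical, and the conversion assumes the default log link of a Poisson model with time expressed in years.

```python
import math


def annual_percent_change(beta_log):
    """Percentage change in expected density per year for a log-scale slope."""
    return 100 * (math.exp(beta_log) - 1)


def fold_change(beta_log, years):
    """Multiplicative change in expected density after a number of years."""
    return math.exp(beta_log * years)


# Log-scale slopes before weeding, as reported in the text.
slopes_before = {"S1": 0.12, "S2": 0.11, "S3": 0.10, "S5": 0.18}

for system, beta in slopes_before.items():
    print(system, round(annual_percent_change(beta), 1), "% per year")
```

A slope of 0.18, for instance, corresponds to an increase of roughly 20% per year. The same slope leads to very different final densities depending on the starting level, which is why S1 still ends at a low predicted density despite a significant positive trend.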

Fig. 4 Fitted values of weed density dynamics before (left panel) and after weeding (center and right panels) from 2001 to 2017 for the five cropping systems. After weeding (total weed density) refers to all plants observed after weeding whereas after weeding (density of species having emerged before weeding) refers to individuals belonging to species that were observed before weeding which were still present after weeding. Predictions were based on a generalized linear mixed model taking into account random effects. The regression line shows an average plot value (i.e. prediction at the population level). Stars highlight regression slopes which are significantly different from zero at p < 0.05. Slopes sharing the same letter are not significantly different at p < 0.05

The analysis focusing on weed density after weeding of species having emerged before weeding (i.e., partitioning out weeds which germinated in between the two samplings) showed a significant effect of all the tested variables except block (right panel of Fig. 4 for dynamics after weeding, blue bars in Fig. 5 for static mean). Density after weeding of weeds having emerged before weeding was higher in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 3.9, SE = 1.3), S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 7.6, SE = 2.4), and S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 5.8, SE = 1.9) than in S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.5, SE = 0.1) or S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.9, SE = 0.3) (Fig. 5). Most weeds present after weeding were actually late germinating weeds, i.e., seedlings at the time of sampling, which, most likely, had very limited biomass, seed production, and competitive effect on the crop (O'Donovan et al. 1985). High densities of late emerging weeds can be interpreted as an effect of crop diversification (i.e., introduction of spring and summer crops in the crop sequence) (Anderson 2005).

Fig. 5

Observed mean weed density before and after weeding for each cropping system at three different scales: cropping system, winter wheat, and oilseed rape. Observations are paired (S1 does not have any observations before weeding in oilseed rape due to the use of pre-emergence herbicides): each quadrat was fixed during the crop season and observed before and after weeding, which allowed weed density after weeding to be partitioned into (i) weeds which were present in the sampling before weeding and still present after (blue bars) and (ii) weeds which germinated in between the two samplings (green bars). One data point is an average plot value (i.e., observed mean) and error bars represent the standard deviation of the observed mean. Statistical inferences were based on generalized linear mixed effect models taking into account random effects (too few data were available for the oilseed rape scale). Bars which do not share a letter are significantly different from each other at p < 0.05. Lowercase letters refer to a before weeding contrast whereas uppercase letters refer to an after weeding contrast (on the blue bars).

Moreover, the slopes of weed density after weeding (of species that emerged before weeding) over time were not significantly different across CS (right panel of Fig. 4). However, the relative percentage difference in density between before and after weeding (of species initially present, i.e., weeding efficacy, Fig. 5: difference between blue bars) was greater in S1 (84%) than in S4 (77%) and S3 (73%). S2 (78%) was not significantly different from these three systems and S5 (83%) was only significantly different from S3, providing evidence that non-chemical curative measures can achieve a weeding efficacy of the same magnitude as synthetic herbicides, as demonstrated earlier through simulations (Colbach and Cordeau 2018).
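The weeding efficacy described above, i.e., the relative percentage difference in density between the two samplings for species initially present, can be sketched as follows. This is a minimal illustration: the function name and the input values are hypothetical, and whether the original analysis computed the ratio per quadrat or from model-based means is not stated in this section.

```python
def weeding_efficacy(density_before: float, density_after: float) -> float:
    """Relative percentage reduction in density between the two samplings.

    density_before: density observed before weeding (plants per m2)
    density_after: density of the same species still present after weeding
    """
    if density_before <= 0:
        raise ValueError("density_before must be positive")
    return (density_before - density_after) / density_before * 100.0


# Hypothetical quadrat values (plants per m2), not data from the experiment.
example = weeding_efficacy(density_before=25.0, density_after=4.0)
print(f"Weeding efficacy: {example:.0f}%")
```

An efficacy of 100% would mean that no individual of the initially present species survived weeding, whereas 0% would mean that density was unchanged.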

3.4 Productivity was reduced in IWM systems

Productivity differed among CS at the CS scale as well as in winter wheat (Tab. 1). At the CS scale, productivity was significantly reduced by 22% in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 4.7 t dry matter (DM) equivalent wheat grain ha−1, SE = 0.3) and by 33% in S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 4.1, SE = 0.3) in comparison with S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 6.1, SE = 0.2). Although productivity was 17% lower in S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 5.1, SE = 0.4) and 11% lower in S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 5.4, SE = 0.4) than in S1, these three systems were not significantly different. Differences in productivity among CS resulted mainly from the choice of crops in the rotation (Lechenet et al. 2014). S2 and S3 presented a higher frequency (38 and 44%, respectively) of crops with intrinsically low productivity (e.g., crops grown for oil, not including oilseed rape, and for proteins, such as spring mustard, lentils, and winter faba bean) than S4 and S5 (20 and 30%, respectively). In contrast, highly productive crops (e.g., sugar beet and maize) were overrepresented in S4 in comparison with the other CS (12% in S4 vs. 0–3% in the other CS).
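The relative reductions quoted above can be approximately recovered from the reported least-squares means, as in the following minimal sketch. The function name is illustrative. Because the means are reported to one decimal, the recomputed percentages may differ from those in the text by about one percentage point.

```python
def relative_reduction(reference: float, value: float) -> float:
    """Percent reduction of `value` relative to `reference`."""
    return (reference - value) / reference * 100.0


# Least-squares means at the CS scale as reported in the text
# (t DM equivalent wheat grain per ha, rounded to one decimal).
ls_means = {"S1": 6.1, "S2": 4.7, "S3": 4.1, "S4": 5.1, "S5": 5.4}

for system in ("S2", "S3", "S4", "S5"):
    reduction = relative_reduction(ls_means["S1"], ls_means[system])
    print(f"{system}: {reduction:.1f}% lower than S1")
```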

In oilseed rape, no significant differences were observed across CS. In winter wheat, grain yields (t DM ha−1) were reduced by 11, 21, 15, and 10% in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 6.0, SE = 0.3), S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 5.3, SE = 0.3), S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 5.7, SE = 0.2), and S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 6.0, SE = 0.3), respectively, in comparison with S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 6.7, SE = 0.3). However, only S3 and S4 significantly differed from S1. A reduction of winter wheat yields was expected in the alternative CS considering that sowing was delayed, nitrogen fertilization was slightly reduced and the choice of varieties was not solely based on yield potential but also on resistance to pests and/or pathogens. We hypothesize that greater winter wheat yields in S2 and S5 might be due to better resource use efficiency (Anderson 2015) and better water retention in S2 (Zibilske and Bradford 2007) and a higher proportion of winter wheat preceded by legumes (e.g., alfalfa and faba bean) with likely important pre-crop effects in S5 (Cernay et al. 2018).

3.5 Weeds were not responsible for yield loss

Analysis of weed biomass at crop flowering showed a significant effect of CS at the three different scales of interest (Tab. 1). However, mean weed biomass was extremely low across all CS and crops. At the CS scale, mean weed biomass was significantly greater in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 1.9 g DM m−2, SE = 0.6), S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 2.0, SE = 0.6), S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 3.1, SE = 0.9), and S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 4.9, SE = 1.4) in comparison with S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.2, SE = 0.1). In winter wheat, mean weed biomass (g DM m−2) was significantly greater in S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 2.0, SE = 0.7), S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 3.3, SE = 1.0), and S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 4.0, SE = 1.4) than in S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.4, SE = 0.1) and not significantly different from the latter in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.9, SE = 0.3). Mean weed biomass in oilseed rape was significantly greater in S2 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 2.8, SE = 1.6), S3 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 9.8, SE = 4.6) and S4 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 4.8, SE = 2.8) than in S1 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 0.4, SE = 0.1), and not significantly different from the four latter in S5 (\( {\overline{\mathrm{x}}}_{\mathrm{ls}} \) = 4.1, SE = 3.4). It is important to note that the greater variability of weed biomass in oilseed rape in S3 was largely due to companion crops (e.g., faba bean and lentils) that were not winter-killed in certain years, highlighting one of the potential drawbacks of this technique (Lorin et al. 2015). Moreover, higher weed biomass has already been observed in low herbicide input CS (Hiltbrunner et al. 2008), but assessments often lack an estimation of actual yield losses.

After backward selection, the final predictive model of crop biomass contained the a priori covariate block, crop, weed biomass, crop density, crop × crop density, and crop × weed biomass interactions. The slopes between weed and crop biomass were negative for all crops, but not always significant, underlining that weeds can represent a major constraint on crop production (Oerke 2006) when weed biomass at crop flowering is high. In fact, weeds might have had a limited contribution to yield variations in this experiment considering the slopes were significant only for a subset of the crops: oilseed rape (βlog scale = − 2.5e−03, pβ≠0 = 0.0001), maize (βlog scale = − 3.5e−03, pβ≠0 < 0.0001), spring barley (βlog scale = − 7.8e−03, pβ≠0 < 0.0001), and sunflower (βlog scale = − 3.6e−03, pβ≠0 = 0.0006). The variability of slope values may indicate that weed biomass was higher in some crops because of inefficient weed management, and/or that some crops are more sensitive to weed biomass, which is consistent with the literature (van Heemst 1985). CS and its interaction with crop were dropped during model selection. This supports the conclusion that variations of crop biomass at crop flowering were limited across the CS and that weed:crop competition was not affected by CS.
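To give a sense of the magnitude of the slopes reported above, the following sketch translates a log-scale slope into a relative reduction of predicted crop biomass. It assumes that crop biomass was modeled on a natural-log scale with weed biomass (g DM m−2) entering linearly and untransformed, which this section does not confirm. The function name is illustrative, and the calculation is not the procedure used to derive the crop biomass reductions of Tab. 1.

```python
import math


def predicted_reduction_percent(beta_log: float, weed_biomass: float) -> float:
    """Percent reduction in predicted crop biomass relative to a weed-free state.

    Assumes log(crop biomass) = ... + beta_log * weed_biomass (natural log),
    so the ratio to the weed-free prediction is exp(beta_log * weed_biomass).
    """
    return (1.0 - math.exp(beta_log * weed_biomass)) * 100.0


# Slope reported in the text for oilseed rape (assumed natural-log scale).
beta_oilseed_rape = -2.5e-03

# Weed biomass values in g DM m-2; 9.8 is the highest mean reported for
# oilseed rape (S3), the other values are arbitrary illustrations.
for weed_biomass in (0.4, 9.8, 50.0):
    loss = predicted_reduction_percent(beta_oilseed_rape, weed_biomass)
    print(f"{weed_biomass:5.1f} g DM m-2 -> {loss:.1f}% predicted reduction")
```

Under these assumptions, the low weed biomass observed at crop flowering translates into predicted reductions of only a few percent, even for crops with a significant slope.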

Crop biomass reduction due to weeds was not significantly different across CS at the three different scales of interest (Tab. 1). At the CS scale, crop biomass reduction averaged − 2.4 (SE = 3.1), − 4.0 (SE = 3.1), 0.9 (SE = 2.9), 1.1 (SE = 2.8), and 7.1% (SE = 3.3) in S1, S2, S3, S4, and S5, respectively. Average crop biomass reduction due to weeds ranged between − 3 and 3% for all CS in winter wheat and between − 3 and 4% for all CS in oilseed rape. The predicted values of biomass reduction in winter wheat were well below the 7.7% of actual crop losses due to weeds reported by Oerke (2006), providing evidence for efficient weed management and a limited impact of weeds in this crop (Milberg and Hallgren 2004). Negative crop biomass reduction, i.e., an increase in crop productivity, reflected the large variability of crop biomass within a field at low levels of weed biomass, potentially due to missing covariates describing the soil’s physical and chemical properties. We do not consider this phenomenon as evidence for a beneficial effect of weeds on yield.

3.6 Global perspective on herbicide reduction while maintaining crop productivity

S4 was the only CS able to reach the desired multiperformance, i.e., a combination of low herbicide use (− 65% compared to the reference S1 CS), low weed densities before and after weeding (stable over time), high productivity (equal to S1), and insignificant yield losses due to weeds. We show for the first time that this was enabled by the complementarity and synergy between a well-balanced and diversified crop rotation integrating different crop types with different sowing periods, occasional ploughing, repeated false seedbed preparations, reduced nitrogen fertilization at the CS and crop scale, and a combination of mechanical and chemical weeding. Nevertheless, the CS approach does not allow us to identify precisely which tool or combination of tools allowed S4 to achieve this level of performance. Such approaches are challenged by the complexity of crop rotations and the diversity of tools implemented, which might explain why so few studies are published (Adeux et al. 2017; Chikowo et al. 2009; Davis et al. 2012). As stated by Davis et al. (2012), “small amounts of herbicides proved to be a powerful tool, participating to a diverse suite of tactics, with which to tune, rather than drive, agroecosystem performance.” Even though crop diversification undoubtedly contributed to long-term weed management, the concept cannot be generalized to all production situations. For example, Adeux et al. (2017) showed that conventional maize monoculture could be adapted in simpler ways (e.g., introduction of an earlier maturing variety, a cover crop, and mixed weeding) to reconcile weed management, herbicide use, crop productivity, and profitability.

The S2 CS, investigating a transition from reduced tillage to no-till, was challenged by high herbicide reliance, particularly due to the chemical destruction of the vegetation during the fallow period with glyphosate (Derksen et al. 2002; Wezel et al. 2014). When tillage is not conceivable, non-chemical termination of cover crops is feasible by rolling or mowing (Creamer and Dabney 2002) but these practices were not integrated into the initial design of S2 because certain weed species or crop volunteers may be insensitive. However, we show that conservation agriculture may be implemented without increasing herbicide use, contrary to what is mentioned in the literature (Wezel et al. 2014).

The no-mechanical weeding S3 and no-chemical weeding S5 CS were challenged by an increase in weed density before and after weeding. These results contradict those of Chikowo et al. (2009) obtained over the first 6 years of the same experiment, highlighting the necessity of long-term assessments. Even if this increase in weed pressure resulted in limited yield losses due to weeds because of efficient use of direct weed control methods targeting competitive species, it might render long-term weed management more difficult and less efficient. Besides this, management of perennial weed species such as Cirsium arvense (L.) Scop. relied on the introduction of forage crops in S5, which proved to be an efficient tool due to repeated sward cutting (Anderson 2015), consolidating the importance of mixed farming to diversify weed management tools (Lechenet et al. 2014). However, no market outlet is currently available due to the drastic decline of livestock farming in the region.

In our experiment, CS productivity was mainly affected by the crops chosen to diversify the crop rotation (also shown to decrease productivity in Lechenet et al. (2014)). Furthermore, productivity is only one aspect of sustainability which inter alia neglects profitability, an important aspect that might hinder the adoption of IWM systems (Bastiaans et al. 2008). Over the 2001–2012 period of this experiment, all the IWM CS showed a deviation of the semi-net margin (sale revenue minus production costs, including labor) ranging between − 50 and − 130 € ha−1 in comparison with S1 (295 € ha−1) (Lechenet et al. 2014). The choice of crops was mainly based on the expected positive impact on weed management, not considering the potential impact on profitability, contrary to what might drive farmers’ decision-making. Hence, additional room might be available to reconcile crop diversification with positive agronomic impacts and limited (or no) impact on profitability. On the other hand, CS such as S1 may generate negative externalities (e.g., water and air pollution by pesticides and fertilizers, soil erosion, loss of organic matter, substantial greenhouse gas emissions, simplification of agricultural landscapes, decline of agro-biodiversity) which are not taken into account in the semi-net margin (Davis et al. 2012).

4 Conclusion

Through an in-depth analysis of farming practices, weed dynamics, crop productivity, and weed:crop competition across five contrasting CS tested over 17 years, we show for the first time that highly diversified CS (in terms of rotation, associated practices such as fertilization regime, weeding tools, etc.) can drastically reduce herbicide use, provide effective long-term weed control, and limit yield losses due to weeds in grain-based CS of the Burgundy region. Productivity was mainly affected by the choice of crops. The positive externalities provided by IWM CS could be supported through accreditation or incentives while new and fairly priced markets are put into place. Moreover, further studies are needed to explore how IWM, which so far relies mainly on interventions at the field scale, can take advantage of a higher share of ecological weed management options (e.g., increased weed seed predation and/or decay) deployed across different spatial scales. Along those lines, we also encourage further analysis to identify which combinations of practices may select more diversified but harmless weed communities. Finally, this experiment, visited by hundreds of stakeholders over the years, has been a valuable support for helping different actors reflect on how farming practices or policies could be updated to achieve a more sustainable agriculture.