Effects of agricultural management practices on soil quality: A review of long-term experiments for Europe and China

In this paper we present effects of four paired agricultural management practices (organic matter (OM) addition versus no organic matter input, no-tillage (NT) versus conventional tillage, crop rotation versus monoculture, and organic agriculture versus conventional agriculture) on five key soil quality indicators, i.e., soil organic matter (SOM) content, pH, aggregate stability, earthworms (numbers) and crop yield. We have considered organic matter addition, no-tillage, crop rotation and organic agriculture as “promising practices”; no organic matter input, conventional tillage, monoculture and conventional farming were taken as the respective references or “standard practice” (baseline). Relative effects were analysed through indicator response ratio (RR) under each paired practice. For this we considered data of 30 long-term experiments collected from 13 case study sites in Europe and China as collated in the framework of the EU-China funded iSQAPER project. These were complemented with data from 42 long-term experiments across China and 402 observations of long-term trials published in the literature. Out of these, we only considered experiments covering at least five years. The results show that OM addition favourably affected all the indicators under consideration. The most favourable effect was reported on earthworm numbers, followed by yield, SOM content and soil aggregate stability. For pH, effects depended on soil type; OM input favourably affected the pH of acidic soils, whereas no clear trend was observed under NT. NT generally led to increased aggregate stability and greater SOM content in upper soil horizons. However, the magnitude of the relative effects varied, e.g. with soil texture. No-tillage practices enhanced earthworm populations, but not where herbicides or pesticides were applied to combat weeds and pests. Overall, in this review, yield slightly decreased under NT. 
Crop rotation had a positive effect on SOM content and yield; rotation with ley very positively influenced earthworm numbers. Overall, crop rotation had little impact on soil pH and aggregate stability, depending on the type of intercrop; rotation of arable crops only, by contrast, resulted in adverse effects. A clear positive trend was observed for earthworm abundance under organic agriculture. Further, organic agriculture generally resulted in increased aggregate stability and greater SOM content. Overall, no clear trend was found for pH; a decrease in yield was observed under organic agriculture in


Introduction
Soil is increasingly recognized as a non-renewable resource on a human life scale because, once degraded, its regeneration is an extremely slow process (Camarsa et al., 2014; Lal, 2015). Given the importance of soils for crop and livestock production as well as for providing wider ecosystem services for local and global societies, maintaining the soil in good condition is of vital importance. To manage the use of agricultural soils well, decision-makers need science-based, easy-to-apply and cost-effective tools to assess changes in soil quality and function.
The European Commission, the Government of China and the Government of Switzerland co-funded the research project "Interactive Soil Quality Assessment in Europe and China for Agricultural Productivity and Environmental Resilience" (iSQAPER), which aims to develop an interactive soil quality assessment tool (SQAPP) that accounts for the impact of agricultural land management practices on soil properties and functions. The ultimate aim is to provide agricultural land users with options for cost-effective agricultural management activities which enhance soil quality and crop productivity.
The concept of soil quality includes assessment of soil properties and processes as they relate to the ability of soil to function effectively as a component of a healthy ecosystem (Bünemann et al., 2018). Specific functions and subsequent values provided by ecosystems are variable and rely on numerous soil physical, chemical, and biological properties and processes, which can differ across spatial and temporal scales (Doran, 2002; Nannipieri et al., 2003; Van Diepeningen et al., 2006; Spiegel et al., 2015). As such, selection of a standard set of specific properties as indicators of soil quality can be complex and varies among agricultural systems and management purposes. According to Islam and Weil (2000), soil quality is best assessed by soil properties that are neither so stable as to be insensitive to management, nor so easily changed as to give little indication of long-term alterations.
Understanding interacting effects of agricultural management practices on soil quality indicators (SQI) is essential for the development of SQAPP. Such effects can be best analysed from data of agricultural long-term experiments (LTE), where soils are experimentally manipulated to identify the key drivers of soil change. These trials make it possible to study changes in soil properties over time under various types of treatment (e.g. plough/no-tillage) and their respective intensities (e.g. ploughing frequency).
The present study has been performed to analyse and summarise the data of a large range of LTEs. Our hypothesis was that sufficient data for promising soil quality indicators can be extracted in order to show trends over time as a basis for further, generic decision-making on recommended agricultural practices.

Selection of soil quality indicators and agricultural management practices
Based on an earlier review by Bünemann et al. (2018) in the iSQAPER project framework, and work by Spiegel et al. (2015), we initially chose six soil quality indicators. The main considerations in making this selection were:
• Changes in soil quality and fertility are gradual, and significant effects of land use and management generally cannot be measured until at least five years after their introduction; hence, long-term experiments (LTEs) are of critical importance. The focus is on "dynamic" rather than "static" indicators, as only the former can reflect changes within a reasonable time span.
• Most indicators are soil and site specific (e.g. soil organic matter content and pH), so it is essential that experiments were carried out under comparable soil and climate conditions (e.g. LTEs with split-plot design, or at least with neighbouring parcels).
• It is necessary to distinguish between short-term effects and long-term changes in soil quality indicators.
• Indicators can be related to potential changes in soil functions and soil threats.
• It is important not only to identify the most appropriate bio-physical indicators, but also to ensure that farmers and land managers can easily understand and relate to these indicators so that they may be used to support on-farm management decisions.
The selected soil quality indicators were: soil organic matter (SOM) content, pH, aggregate stability, water-holding capacity and (number of) earthworms. Yield, although not a soil property, is also considered here as it is a good measure of soil quality and a primary concern to farmers.
Five agricultural management practices were chosen as "promising": organic matter addition, no-tillage, crop rotation, irrigation and, at the system level, organic agriculture. For each LTE, we compared results with respect to the corresponding "standard practice" (reference): no organic matter input, conventional tillage, monoculture, non-irrigation, and conventional farming.

Data collection and literature review
LTEs are indispensable for assessing effects of agricultural management practices on changes in soil quality. We have collated data of 30 long-term experiments from the 13 iSQAPER project partners in Europe and China. Data collated for each LTE included: location, climate, land use, soil data, trial factors, management systems, assessments done, sample storage and analysis. The average duration of the LTEs under consideration was 19 years (range: 5-34 years). The earliest LTE began in 1964 and most of these LTEs are still ongoing. Details on the trials included are provided as Supplementary information in Table S1.
The above data were complemented with analytical data from 42 long-term agricultural experiments across China covering over 30 years of observations and various management practices (Xu et al., 2015a, 2015b). To augment our database, we performed an extensive literature review, including over 900 publications and reports, using the web-based search engines Google Scholar, ScienceDirect, ISI Web of Science, ResearchGate, and Scopus. Publications in Chinese were retrieved using the China Knowledge Resource Integrated (CNKI) database (http://eng.oversea.cnki.net/kns55/). Key search terms used included organic matter addition (crop residue, straw return, green manure, farmyard manure, compost, slurry), crop rotation, no-tillage, organic agriculture and organic farming, in combination with the chosen soil properties and yield.
The resulting publications were documented using an open source reference manager (Mendeley.com) and subsequently screened for their relevance for the present review. Key elements of the selected studies (402 observations) were entered into a Microsoft Excel database. The corresponding data and literature references are documented in Supplementary Table S2.

Data analysis and visualization
Effects of management practices on the selected soil quality indicators were assessed on the basis of both the iSQAPER LTE data (Supplementary Table S1), and the data extracted from the literature review including analytical results from the LTEs of China (Supplementary Table S2).
For the LTEs, we calculated response ratios (RR) for each indicator under a paired practice. For example, SOM content under NT (Treatment 2) was divided by SOM content under conventional tillage (Treatment 1, the reference). For some experiments, results were reported as soil organic carbon (SOC) rather than SOM (SOM = 1.724 × SOC, according to van Bemmelen (1890)); because this conversion is a constant factor, it cancels in the ratio, so the ratios are comparable.
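The response ratio described above can be illustrated with a minimal Python sketch. It is not the authors' code (their analyses were run as R scripts); the function names and the SOC values are hypothetical and chosen only to show that the van Bemmelen factor cancels in the ratio.

```python
# Minimal sketch of the response ratio (RR) for one paired practice.
# Function names and input values are hypothetical, for illustration only.

VAN_BEMMELEN_FACTOR = 1.724  # SOM = 1.724 * SOC, as stated in the text


def soc_to_som(soc):
    """Convert soil organic carbon to soil organic matter."""
    return VAN_BEMMELEN_FACTOR * soc


def response_ratio(promising, reference):
    """Indicator value under the promising practice divided by the
    value under the standard (reference) practice."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return promising / reference


# Hypothetical SOC contents (%) under no-tillage and conventional tillage.
soc_nt, soc_ct = 1.50, 1.25

rr_soc = response_ratio(soc_nt, soc_ct)
rr_som = response_ratio(soc_to_som(soc_nt), soc_to_som(soc_ct))

print(f"RR from SOC: {rr_soc:.3f}")
print(f"RR from SOM: {rr_som:.3f}")
```

Both ratios agree up to floating-point rounding, which is why experiments reporting SOC and experiments reporting SOM can be pooled at the level of response ratios.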
Measurements were made at variable intervals depending on the objective of each experiment. As indicated, only experiments with a duration of at least five years were included in this study. We first analysed the data using the following procedure, depending on the number of measurements available:
• ≥ 3 measurements (92% of the LTE observations): we calculated the average RR for the last three measurements (e.g. a total of 5 measurements over 14 years, period 2002-2015, with the last three measurements in 2008, 2012 and 2015).
• two measurements (8% of the LTE observations): we calculated the average RR for both measurements.
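The averaging rule in the two bullets above could be sketched as follows in Python. This is an illustrative reading, not the authors' implementation: the data layout, the function name `average_recent_rr`, and the choice to average per-measurement ratios (rather than taking the ratio of averaged values) are assumptions, and the numbers are invented.

```python
from statistics import mean


def average_recent_rr(measurements, n_last=3):
    """Average RR over the most recent measurements of one paired trial.

    `measurements` is a list of (year, promising_value, reference_value)
    tuples (a hypothetical layout). With three or more measurements the
    last `n_last` are used; with exactly two, both are used.
    """
    if len(measurements) < 2:
        # The text only describes cases with two or more measurements.
        raise ValueError("at least two measurements are required")
    ordered = sorted(measurements, key=lambda m: m[0])
    recent = ordered[-n_last:]  # slicing keeps both items if only two exist
    # Assumption: the ratio is computed per measurement, then averaged.
    return mean([promising / reference for _, promising, reference in recent])


# Hypothetical trial with five measurements between 2002 and 2015.
trial = [
    (2002, 2.0, 2.0),
    (2005, 2.1, 2.0),
    (2008, 2.2, 2.0),
    (2012, 2.4, 2.0),
    (2015, 2.6, 2.0),
]
print(round(average_recent_rr(trial), 3))

# Hypothetical trial with only two measurements: both are averaged.
short_trial = [(2005, 3.0, 2.0), (2015, 2.0, 2.0)]
print(average_recent_rr(short_trial))
```

Using only the most recent measurements gives more weight to the state of the soil after the practice has been in place for some time, in line with the focus on long-term effects.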
For data extracted from the literature review and the supplementary LTEs of China, we also calculated the RR for each soil quality indicator under a paired practice as indicated above, for example, aggregate stability under crop rotation divided by aggregate stability under monoculture for the given LTEs.
Due to a lack of data, the previously selected indicator of water-holding capacity was excluded, as was the paired practice of irrigation/non-irrigation.
In total, response ratios for 354 paired observations have been calculated (Table 1). Inherently, the number of observations was biased by relying on available data. For example, we found more data for changes in yield, SOM content and pH than for (number of) earthworms. This represents a known limitation of this type of descriptive study.
To limit the influence of possible data outliers, medians instead of means were employed to visualise the response ratios per treatment. 'Flower petal' diagrams were generated for each paired management practice. All analyses and visualisations were performed using R scripts (R Development Core Team, 2008).
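The reason for preferring medians can be seen in a small sketch. The authors worked in R; the Python below is only an illustration, with a hypothetical `summarise` helper and invented response ratios that include one extreme value.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical response ratios: (paired practice, indicator, RR).
observations = [
    ("no-tillage", "earthworms", 1.0),
    ("no-tillage", "earthworms", 1.5),
    ("no-tillage", "earthworms", 2.0),
    ("no-tillage", "earthworms", 12.0),  # a single extreme observation
    ("no-tillage", "earthworms", 1.5),
    ("no-tillage", "SOM", 1.1),
    ("no-tillage", "SOM", 1.3),
    ("no-tillage", "SOM", 1.2),
]


def summarise(obs):
    """Group RRs by (practice, indicator) and report n, median and mean."""
    grouped = defaultdict(list)
    for practice, indicator, rr in obs:
        grouped[(practice, indicator)].append(rr)
    return {
        key: {"n": len(values), "median": median(values), "mean": mean(values)}
        for key, values in grouped.items()
    }


summary = summarise(observations)
for (practice, indicator), stats in summary.items():
    print(f"{practice:<12} {indicator:<11} n={stats['n']} "
          f"median={stats['median']:.2f} mean={stats['mean']:.2f}")
```

In this invented example the single extreme earthworm observation pulls the mean far above the typical value, while the median stays close to the bulk of the data.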

Results and discussion
Overall, there are clear trends and relative changes in the five indicators as affected by the four paired practices (Table 1, Fig. 1A-D), but the spread is large (Fig. 2). A ratio of 1 or close to 1 in Fig. 1 indicates that there was no change or no difference between "promising" and "standard" practices (blue line); a ratio > 1 indicates a 'positive' change (increase) vis-à-vis the respective reference practice, while a ratio < 1 points to a 'negative' change (decrease). For most indicators, a median ratio > 1 is considered favourable from a soil quality perspective. However, pH results have to be interpreted more cautiously, depending on the pH range (i.e. acidic, neutral or basic) of the soil and on the crop under consideration. Also, while the differences are very pronounced for earthworms, their magnitude has to be interpreted with care because of the generally low number and high spread of observations.
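The reading of response ratios described above can be expressed as a short Python sketch. The text gives no numerical threshold for "close to 1", so the `tolerance` parameter and its default of 0.05 are assumptions made here for illustration, as is the function name `interpret_rr`.

```python
def interpret_rr(rr, indicator, tolerance=0.05):
    """Return (direction, favourable) for a median response ratio.

    `tolerance` defines what counts as "close to 1"; the value is an
    assumption, not taken from the paper. `favourable` is None when no
    judgement is made: for ratios close to 1, and for pH, whose
    interpretation depends on the soil pH range and the crop.
    """
    if rr <= 0:
        raise ValueError("response ratio must be positive")
    if abs(rr - 1.0) <= tolerance:
        direction = "no change"
    elif rr > 1.0:
        direction = "increase"
    else:
        direction = "decrease"

    if indicator == "pH" or direction == "no change":
        favourable = None
    else:
        favourable = direction == "increase"
    return direction, favourable


# Median RRs quoted in the text for no-tillage versus conventional tillage.
print(interpret_rr(1.21, "SOM"))    # SOM content, whole data set
print(interpret_rr(0.98, "yield"))  # overall yield
print(interpret_rr(0.81, "yield"))  # winter wheat yield
```

With the assumed tolerance, the overall yield ratio of 0.98 is classed as no change, which matches the description of yield as only slightly decreased under NT.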

Organic matter addition versus no organic matter input
OM addition favourably affected all five soil quality indicators under consideration, as shown in Fig. 1A. The most favourable effect was reported for earthworm numbers, followed by yield, SOM content and soil aggregate stability. For pH, effects depended on soil type; for example, OM input may favourably affect the pH of acidic soils. These results are similar to those reported in other reviews (Khaleel et al., 1981; Haynes and Naidu, 1998; Abiven et al., 2009).
Increases in SOM content depend on the amount and type of OM applied, and on the duration of application. Application of equivalent amounts of the tested organic materials, i.e. compost, farmyard manure and slurry, increased SOM contents by 37%, 23% and 21%, respectively, in the upper 10 cm, and values tended to increase with the duration of experiments (> 10 years compared to < 10 years) until a new equilibrium was reached (Spiegel et al., 2015). Based on analyses of 42 LTEs from China, Xu et al. (2015a, 2015b) concluded that straw application of 7.5-12 Mg ha⁻¹ y⁻¹ was needed to restore the SOM content to initial levels under current cultivation practices. From a practical perspective, however, it should be noted that such amounts of straw may not always be available for application on the land (e.g. because straw is used for cooking and brick making).

No-tillage versus conventional tillage
No-tillage (NT) comprises land cultivation with little or no soil surface disturbance, the only disturbance being during planting. Fig. 1B shows the impact of NT on the selected soil quality indicators compared to conventional tillage. NT generally led to increased aggregate stability and greater SOM content. With respect to the SOM content, the median RR for the whole data set (n = 19) is 1.21 (Fig. 2, no-tillage versus tillage). Median RR values for SOM are 1.02 for maize (n = 3), 1.20 for winter wheat (n = 6), 2.12 for barley (n = 64) and 1.48 for other crops (n = 11). NT practices enhanced earthworm populations, but not always where herbicides or pesticides were applied to combat weeds and pests. Overall, in this review, yield slightly decreased under NT, with a median RR of 0.98. The median RR is 0.81 for winter wheat (n = 38) and 0.85 for maize (n = 3). Overall, however, the sample populations were too small to adequately tease out such effects.
Similarly, other studies indicated that no-tillage led to improvements in soil quality in the upper soil layer by improving soil structure, enhancing soil biological activity and nutrient cycling, and reducing bulk density (Hamza and Anderson, 2005), thus improving soil water-holding capacity, water infiltration, water use efficiency (e.g. Islam and Weil, 2000; Pittelkow et al., 2015) and aggregate stability (Aziz et al., 2013). For yields, there were no clear trends, as these are ultimately determined by many interacting factors (Pittelkow et al., 2015; Zhao et al., 2017), such as crop type, the detailed consideration of which was beyond the scope of this review.
Tillage per se does not directly affect soil pH. Rather, effects of tillage on pH depend on the prevailing climatic conditions, parent material, soil type, and management factors such as the application of chemical fertilizers or lime. For example, wet compacted soils favour denitrification. Such soils may show a reduction in pH, making other nutrients less available to crops (Cookson et al., 2008; Lal, 1997; Rahman et al., 2008; Rasmussen, 1999).
A change or difference in tillage practices can result in changes in biological, chemical and physical properties of soil, affecting the soil function (Chan, 2001; Islam and Weil, 2000) and its capacity to provide ecosystem services (Funk et al., 2015; Palm et al., 2014). In this context, NT represents a relatively widely adopted soil management practice in Australia, South America, the US and Canada, but not in Europe.

Crop rotation versus monoculture
Crop rotation had an overall positive effect on earthworms (number), SOM content and yield (Fig. 1C), but it had little impact on soil pH and aggregate stability, depending on the type of crop. Limited impact of rotation on soil pH was also reported by Spiegel et al. (2015).
Favourable effects of crop rotation on SOM levels and yield were reported by various reviews (e.g., Bullock, 1992; West and Post, 2002; Jarecki and Lal, 2003), and neutral impacts on SOM content by Spiegel et al. (2015). A limited effect of rotation on aggregate stability was also reported in other studies (Arrigo et al., 1993; Castro Filho et al., 2002; Spiegel et al., 2015). Conversely, for 22 LTEs in Europe, Guzmán et al. (2015) observed a negative impact of crop rotation on aggregate stability, with a response ratio (rotation/mono-cropping) of 0.77, and no clear trend in earthworm numbers.

Organic versus conventional agriculture
... Wortman et al. (2012) and Ponisio et al. (2014). Alternatively, some studies reported no significant differences in yield under organic cultivation compared to conventional agriculture (e.g. Eyhorn et al., 2007), or even higher yields under organic management (Melero et al., 2006).
Although the 'organic yield gap' is widely reported, it is also recognised that judicious land management can help to decrease it. For example, Ponisio et al. (2014) reported that agricultural diversification practices (multi-cropping and crop rotations) substantially reduced the yield gap when the methods were applied in purely organic systems. Other studies have shown that organically managed cropping systems have lower long-term yield variability (Smolik et al., 1995; Lotter et al., 2003).
Nine local studies on the effect of organic farming on soil pH (Condron et al., 2000; Gosling and Shepherd, 2005; Marinari et al., 2006; Melero et al., 2006; Eyhorn et al., 2007; Heinze et al., 2010; Reganold et al., 2010; Ge et al., 2011; Domagala-Swiatkiewicz and Gastol, 2013) confirm how remarkably small soil pH differences are between organic and conventional systems (on similar soils). In six out of the nine cases, pH is slightly but not significantly lower in organic systems, with all observed differences being < 0.4 units. In the Swiss DOK experiment, soil pH was slightly higher in the organic systems (Mäder et al., 2002). Generally, soil pH depends on the soil type and its buffering capacity, and the type of organic fertilizer or soil amendment applied. It is therefore of paramount importance to specifically consider the local soil and management conditions.
There is a close relationship between organic matter content and aggregate stability (Loveland and Webb, 2003). Various studies confirmed that organic farming significantly improved aggregate stability compared to conventional systems (Jordahl and Karlen, 1993; Gerhardt, 1997; Siegrist et al., 1998; Mäder et al., 2002; Schjønning et al., 2002; Williams and Petticrew, 2009). Besides enhancing soil water retention, organic farming seems to improve water use efficiency; especially under drought conditions, this can lead to organic crops outyielding conventional crops by 70-90% (Lotter et al., 2003; Gomiero et al., 2011). Finally, higher organic input under organic farming systems leads to a more vibrant soil life, which in turn creates a more stable soil structure (Tsiafouli et al., 2014).

Conclusions and recommendations
Our study has confirmed that land management practices influence soil quality indicators in various ways. There are clear trends and relative changes in the five indicators as determined by the four paired practices. However, the magnitude of the trends and direction of the indicator changes vary with the different management practices.
Several management practices had negative effects on soil quality indicators. For example, yield levels were lower under organic farming as compared to conventional farming and, to a lesser extent, under no-tillage compared to conventional tillage. However, the yield reduction could be marginal if other principles of conservation agriculture, such as proper residue management and crop rotation, are applied.
Conversely, there are also positive aspects of organic farming, such as higher market prices and reduced environmental damage. Therefore, to evaluate whether it is judicious to convert from conventional to organic farming, socio-economic aspects will have to be considered in combination with soil quality impacts.
Under the framework chosen, earthworm numbers appear to be the most sensitive indicator for the four paired management practices and are positively affected by all the promising practices in comparison to the corresponding standard practices. SOM content responds positively to all the promising practices in comparison to the references. Aggregate stability and yield are less sensitive to the practices, and soil pH appears to be the least sensitive indicator.
The agricultural practices chosen (e.g. organic matter input) represent categories rather than specific treatments (e.g. addition of farmyard manure, compost, green manure, crop residue, or slurry). Although details on the various different treatments under those categories were documented in the literature review database (Table S2), a full-blown meta-analysis was beyond the intention and scope of research performed for the iSQAPER project and current manuscript.
LTEs are an invaluable source of information and form the basis for understanding the mechanisms and magnitude of soil change. Given the ever-increasing pressures on agricultural land, every possible effort should be made to maintain, enhance, and connect existing LTEs and, where possible, to invest in extending their network.
Contrary to our hypothesis, the potential for deducing meaningful trends for soil quality indicators from agricultural management practices was restricted by using currently available LTE data as the only source of information. The main reasons are the large study area with its huge range of pedo-climatic conditions, and the heterogeneous setup of LTEs, which makes comparison of data difficult or impossible. Efforts such as the systematic mapping of evidence relating to the impacts of agricultural management on SOC described by Haddaway et al. (2015) are promising and should be extended to collate data on other indicators relevant to soil quality.
Finally, it should be noted that farmers often know very well which specific soil parameters are the most relevant for their particular situation. Therefore, the view of land managers should be taken into account when evaluating various sets of indicators for soil quality (Lima et al., 2013; Palm et al., 2014), necessitating a transdisciplinary and participatory approach.