A dataset of meta-analyses on crop diversification at the global scale

Numerous meta-analyses have been conducted over the last three decades to assess the productive and environmental benefits of diversifying cropping systems. These meta-analyses assessed one or several diversification strategies (e.g., rotations, cover crops, agroforestry) according to various outcomes (e.g., productivity, profitability, biodiversity). To date, no dataset has provided a comprehensive synthesis of existing experimental data on crop diversification. We present here a dataset containing 2382 effect sizes published in 99 meta-analyses covering 3736 experimental studies worldwide (https://figshare.com/s/c15a93e96c95f89ddd89). We also provide an extensive appraisal of the quality of each meta-analysis and a quantification of the redundancy of primary studies between meta-analyses. Our database hence provides (i) a quantification of the impacts of a variety of diversification strategies on crop production, the environment and economic profitability at the global scale, and (ii) a quality and redundancy assessment that may be used as a reference for future studies.



Data
The dataset includes the values of effect sizes of 99 meta-analyses based on a total of 3736 unique primary studies. The dataset covers seven strategies of crop diversification (Table 1) and 114 countries over five continents (Fig. 1). More than 50 species are included in our dataset, but most of the data concern six species (Maize, Wheat, Barley, Soybean, Bean, and Cowpea; Fig. 2). Our database also reports a quality assessment of the selected meta-analyses based on an extended and updated version of the quality checklist of [1].
The data collected are grouped into six separate but inter-related tables. The table 'Effect_size' contains the effect sizes reported in the 99 selected meta-analyses. Two other tables pertain to the extraction and classification of meta-information on 'Effect_Size': 'Description_Meta-analyses' compiles the references and the publication information on each meta-analysis, and the 'Primary_Studies' table reports information on each of the 4972 primary studies (of which 3736 are unique) included in the 99 meta-analyses. The table 'Quality' reports a comprehensive quality assessment for each of the 99 meta-analyses. Finally, the table 'Definition_of_variable' includes the definitions of all the attributes (column headers) of the other five tables. The following sections present each table in more detail.

Table Effect_Size
The "Effect_Size" table is the central table to quantify and compare the impacts of the seven types of strategies of crop diversification on the environment (e.g., soil carbon, biodiversity), agricultural production (e.g., crop yield, incidence of plant diseases) and economic profitability. This table can serve as a basis to perform a quantitative meta-synthesis (i.e., the synthesis of several meta-analyses). This table can also be used to identify knowledge gaps, i.e., combinations of crop diversification strategies and outcomes with a low number of published meta-analyses.

Value of the data
- The database makes it possible to quantify and compare the impacts of various crop diversification strategies on the environment (e.g., soil carbon, biodiversity), agricultural production (e.g., crop yield, incidence of plant diseases) and economic profitability.
- The database can be used to identify knowledge gaps, i.e., combinations of crop diversification strategies and outcomes with a low number of published meta-analyses.
- The database includes an in-depth quality appraisal of 99 meta-analyses on crop diversification, which can help to weight evidence in future scientific evidence assessments.
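As an illustration of how knowledge gaps might be located, the following minimal sketch counts the distinct meta-analyses available for each combination of strategy and outcome. The rows, the identifiers and the threshold are hypothetical and do not reproduce the actual column names or contents of the 'Effect_Size' table.

```python
from collections import defaultdict

# Hypothetical rows: (meta-analysis ID, strategy, outcome category).
# These are illustrative values, not records from the dataset.
rows = [
    ("MA01", "Rotation", "Yield"),
    ("MA02", "Rotation", "Yield"),
    ("MA02", "Rotation", "Soil quality"),
    ("MA03", "Agroforestry", "Biodiversity"),
    ("MA03", "Agroforestry", "Biodiversity"),  # second effect size, same meta-analysis
]


def count_meta_analyses(rows):
    """Count distinct meta-analyses per (strategy, outcome) combination."""
    ids = defaultdict(set)
    for ma_id, strategy, outcome in rows:
        ids[(strategy, outcome)].add(ma_id)
    return {key: len(value) for key, value in ids.items()}


def knowledge_gaps(rows, strategies, outcomes, threshold=1):
    """Return combinations covered by at most `threshold` meta-analyses."""
    counts = count_meta_analyses(rows)
    return sorted(
        (s, o)
        for s in strategies
        for o in outcomes
        if counts.get((s, o), 0) <= threshold
    )


strategies = ["Agroforestry", "Rotation"]
outcomes = ["Biodiversity", "Soil quality", "Yield"]
gaps = knowledge_gaps(rows, strategies, outcomes)
```

Counting distinct meta-analyses rather than rows avoids treating several effect sizes from one meta-analysis as independent evidence.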
yield, soil quality and biodiversity respectively. The strategy 'Associated plant species' accounts for the highest number of effect size values (25% of all effect sizes), followed by 'Intercropping' (24%), 'Rotation' (18%), and 'Agroforestry' (13%). Each effect size is described by its type (e.g., ratio, log ratio, difference, standardized difference), its value, and, when available, its level of uncertainty (i.e., confidence intervals, standard errors, number of data, etc.).
For illustration, Fig. 3 presents the values of the most common type of effect size, i.e., ln(Y_T/Y_C), and their associated 95% confidence intervals. These values measure the impacts of three crop diversification strategies (i.e., rotation, agroforestry, associated plant species) on yield, soil quality (e.g., soil carbon content, soil organic matter content, etc.), and biodiversity (e.g., pollination, arthropod abundance, etc.).
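A log ratio can be easier to read once back-transformed to a percentage change relative to the control. The sketch below shows one way to do this for an effect size and its confidence bounds. The function names and the numerical values are hypothetical and are not taken from the dataset.

```python
import math


def log_ratio_to_percent(ln_rr):
    """Convert ln(Y_T/Y_C) to a percentage change relative to the control."""
    return 100.0 * (math.exp(ln_rr) - 1.0)


def summarize(ln_rr, ci_low, ci_high):
    """Back-transform an effect size and its confidence bounds.

    The interval is flagged as excluding zero when both bounds of the
    log ratio have the same sign.
    """
    return {
        "percent_change": log_ratio_to_percent(ln_rr),
        "percent_low": log_ratio_to_percent(ci_low),
        "percent_high": log_ratio_to_percent(ci_high),
        "excludes_zero": ci_low > 0.0 or ci_high < 0.0,
    }


# Hypothetical effect size: ln(Y_T/Y_C) = 0.10 with 95% CI [0.02, 0.18].
example = summarize(0.10, 0.02, 0.18)
```

Because the exponential is monotonic, the bounds of the interval can be back-transformed individually, although the resulting interval is no longer symmetric around the point estimate.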

Table Literature_Search
The 'Literature_Search' table describes the references of all articles screened and the source where each article was identified (names of the database or additional sources). The table also specifies whether each article screened satisfied the considered inclusion criteria, and whether each article was selected or not. See Fig. 4 for a summary of the selection process.

Table Description_Meta-Analyses
The 'Description_Meta-Analyses' table describes the characteristics of each of the meta-analyses included in the Effect_Size table (i.e., n = 99): full reference, type of publication, abstract, keywords, authors' affiliations, crop species, and crop diversification strategies. The 99 selected meta-analyses were published from 1994 to 2018 (Fig. 5). This table provides rapid access to the scope and the objective of each selected meta-analysis, identified by a unique index (attribute 'ID').

Table Primary_Studies
The "Primary_Studies" table can be used as a resource to identify relevant primary studies on crop diversification, to perform new meta-analyses or to update existing ones, or to analyze the redundancy of primary studies between meta-analyses. The table describes the main characteristics of primary studies (i.e., the experimental trials) published on crop diversification: the references, plant species (Fig. 4), and locations of experimental trials of all primary studies included in each meta-analysis. All primary studies included in the 99 selected meta-analyses were published from 1936 to 2018 (Fig. 5). Most of the trials reported in primary studies were conducted in Northern America (1286 primary studies out of 3636), Western and Eastern Europe (782 primary studies), and in Central and Southern America (326 primary studies). A large majority of the primary studies focus on Gramineae and Fabaceae crops (Fig. 4). Fig. 6 presents the number of common primary studies between each pair of meta-analyses.

Table 1
Definition of the seven strategies of crop diversification included in the database.

Strategy: Characteristics

Agroforestry: Agroforestry satisfies three conditions: i) at least two plant species interact biologically, ii) at least one of the plant species is a woody perennial, and iii) at least one of the plant species is managed for forage, annual or perennial crop production.

Associated plant species: Plant sown in addition to the main crop for agronomic or environmental purposes (e.g., to manage soil erosion, soil fertility, soil quality, weeds, pests, diseases, biodiversity or nitrate leaching). The associated plant could be harvested or not, permanent or not. This category, primarily defined by plant function, encompasses cover crops, trap crops, repellent crops, buffer and companion crops.

Cultivar mixture: The simultaneous cultivation in the same field of multiple cultivars of the same species. All cultivars are harvested.

Intercropping: The simultaneous cultivation in the same field of two or more crops (different species) for all or part of their growth cycle. All crop species are harvested.

Landscape heterogeneity: Landscape composition (perennial habitat diversity, semi-natural habitat cover) and configuration (mean patch size).

Other: Any other type of crop diversification.

Rotation: Recurrent succession of a set of selected crops grown on a particular agricultural land each season or each year according to a definite plan. Here, a system with temporal overlap of two or more crops is not considered a rotation.

Table Quality
The 'Quality' table can serve as a benchmark to improve the quality of systematic reviews and to support the development of appraisal tools for meta-analyses in a variety of research fields. The table describes a quality assessment of each of the 99 meta-analyses based on a set of 20 defined criteria grouped into three main categories (i.e., review and selection of the studies, data and statistical analysis, and identification of potential bias). Each criterion relies on the assessment of several quality items. A criterion is scored 1 when satisfied and 0 otherwise. Fig. 7 presents the percentage of meta-analyses satisfying each of the 20 quality criteria.
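The percentages of the kind shown in Fig. 7 follow directly from the binary scores. The sketch below illustrates the calculation on a hypothetical score matrix with three criteria and four meta-analyses. The identifiers and scores are invented for illustration and are not values from the 'Quality' table.

```python
# Hypothetical scores: one entry per meta-analysis, one 0/1 value per criterion.
scores = {
    "MA01": [1, 0, 1],
    "MA02": [1, 1, 0],
    "MA03": [1, 0, 0],
    "MA04": [0, 0, 1],
}


def percent_satisfying(scores):
    """Return, for each criterion, the percentage of meta-analyses scored 1."""
    n_meta = len(scores)
    n_criteria = len(next(iter(scores.values())))
    return [
        100.0 * sum(row[j] for row in scores.values()) / n_meta
        for j in range(n_criteria)
    ]


percentages = percent_satisfying(scores)
```

With the real table, the same function would be applied to 20 criteria and 99 meta-analyses.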

Table Definition_Of_Variable
The 'Definition_Of_Variable' table describes the meaning of all attributes (column headers) of the five other tables. Definitions of terms and types of attributes (numerical, text, date) are detailed in this table (see Table 3 for a summary).

Fig. 3. Compilation of effect sizes (ln(Y_T/Y_C), i.e., the log ratios of a measurement in a diversified treatment to its value in a less diversified control) for three diversification strategies: (a) rotation, (b) agroforestry, and (c) associated plant species. Each point corresponds to one effect size from one meta-analysis for one single category of outcome (note that several effect sizes may come from one single meta-analysis). The figure focuses on the following environmental and production outcomes: biodiversity (yellow), soil quality (grey) and productivity levels (blue). Vertical bars correspond to 95% confidence intervals. The number of data used to calculate each effect size is indicated at the bottom of each graph, when available. In some meta-analyses, effect sizes were computed for a fraction of the total data sample (e.g., per covariate), but only global effect sizes are presented here. Note, however, that all effect sizes are available in the table "Effect_size". Effect sizes that were reported as relative distances were converted to log ratios and included in the figure, whereas absolute differences and Hedges' d values were not.
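The conversion of relative effect sizes to log ratios mentioned in the caption of Fig. 3 can be sketched as follows, assuming that a relative effect size is defined as (Y_T - Y_C)/Y_C. This definition is an assumption made for illustration: the exact formula used for each meta-analysis is not stated here, and the function name is hypothetical.

```python
import math


def relative_to_log_ratio(rel_diff, in_percent=False):
    """Convert a relative difference (Y_T - Y_C)/Y_C to ln(Y_T/Y_C).

    Assumes Y_T/Y_C = 1 + rel_diff, so the log ratio is ln(1 + rel_diff).
    Set `in_percent=True` when the input is expressed in percent.
    """
    r = rel_diff / 100.0 if in_percent else rel_diff
    if r <= -1.0:
        raise ValueError("relative difference must be greater than -100%")
    return math.log1p(r)


# Hypothetical example: a 25% increase relative to the control.
ln_rr = relative_to_log_ratio(25.0, in_percent=True)
```

Under the same assumption, confidence bounds expressed as relative differences can be converted bound by bound, since the transformation is monotonic.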

Literature search
A systematic search of peer-reviewed journals and grey literature was carried out in May 2018 using six databases: Web of Science (http://apps.webofknowledge.com), CAB Abstracts (http://www.cabdirect.org), Greenfile (http://www.greeninfoonline.com), Environment Complete Database (https://www.ebsco.com), Agricola (http://agricola.nal.usda.gov), and Google Scholar (http://www.scholar.google.com). Our search equation was defined as follows: (meta-analysis OR meta analysis) AND (cropping system OR crop* OR agriculture) AND ((rotation OR Diversification OR intercrop* OR cover crop OR mixture) OR (organic AND (system OR agriculture)) OR (conservation AND (system OR agriculture)) OR no till* OR agroforestry OR agroecology). The search was applied to article titles, abstracts and keywords, with no restriction on the date or language of publication or on the geographical location of the studies. References cited in each selected meta-analysis and those listed in a narrative review [2] were also screened. After the removal of duplicates, this initial literature search identified 537 candidate meta-analyses of potential interest evaluating the effect of crop diversification on a series of outcomes (Fig. 2).

The most frequent species are indicated in the bar plots (restricted to one species when the total number of species is below 4, and to species included in more than 150 primary studies for Gramineae and Fabaceae). One primary study can report data for different species and/or families. Gramineae: Barley, Maize, Millet, Oat, Other, Rice, Rye, Ryegrass, Sorghum, Wheat; Fabaceae: Alfalfa, Bean, ChickPea, Clover, Cowpea, Faba Bean, Field Pea, Groundnut, Lentil, Lupin, Other, Garden Pea, Pigeon Pea, Soybean, Vetch; Malvaceae: Cocoa, Cotton; Solanaceae: Potato, Tomato; Brassicaceae: Cabbage, Mustard, Oilseed.

Meta-analyses selection
Each article title and abstract were screened for eligibility according to the following inclusion criteria: (i) article reporting a quantitative analysis of several primary experiments, (ii) article studying at least one crop diversification strategy, (iii) article including control plots adjoined to treatment plots (a less diversified cropping system should be tested as a control). Studies dealing with pure forestry or

Fig. 6. Percentage of common primary studies between meta-analyses (upper plot) and total number of primary studies used in each meta-analysis (bottom plot). Each point corresponds to a pair of meta-analyses. We calculated the percentages of common primary studies between the meta-analyses reported on the x-axis (ID of the meta-analyses) and the others (level of redundancy). We identified the name (ID) of all the meta-analyses with a redundancy level higher than 25%. The numbers at the bottom of the upper plot refer to the percentage of meta-analyses with at least one primary study in common with the meta-analyses reported on the x-axis. For the bottom plot, we distinguished unique primary studies (dark green) and primary studies used in at least two meta-analyses (light green).
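The redundancy shown in Fig. 6 can be illustrated with a minimal sketch. It assumes that the redundancy of a focal meta-analysis with respect to another is the share of the focal meta-analysis's primary studies that are also used in the other one. This denominator is an assumption, and the identifiers below are hypothetical.

```python
from itertools import permutations

# Hypothetical mapping: meta-analysis ID -> set of primary study identifiers.
studies = {
    "MA01": {"s1", "s2", "s3", "s4"},
    "MA02": {"s3", "s4", "s5"},
    "MA03": {"s6"},
}


def redundancy(focal, other):
    """Percentage of the focal set of primary studies also present in `other`."""
    if not focal:
        return 0.0
    return 100.0 * len(focal & other) / len(focal)


# Redundancy for every ordered pair (focal, other) of meta-analyses.
pairs = {
    (a, b): redundancy(studies[a], studies[b])
    for a, b in permutations(studies, 2)
}

# Pairs above the 25% level highlighted in Fig. 6.
high = sorted(pair for pair, pct in pairs.items() if pct > 25.0)

# Total number of study entries versus number of unique primary studies.
total_entries = sum(len(s) for s in studies.values())
unique_studies = len(set().union(*studies.values()))
```

The last two quantities mirror the distinction made in the dataset between the 4972 primary study entries and the 3736 unique primary studies.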

Extraction of data and quality assessment
All effect sizes related to crop diversification in each of the selected meta-analyses were extracted. Here, an effect size is defined as a quantitative measure of the effect of a crop diversification strategy compared to a reference cropping system (i.e., less diversified) on one or several response variables (e.g., crop yield, soil carbon content, biodiversity index, plant disease incidence). Let Y_T and Y_C be the values of one response variable in the diversified treatment and control, respectively. An effect size is a function of Y_T and Y_C. Depending on the considered meta-analysis, the effect size can either be the ratio of Y_T to Y_C (or a log ratio, odds ratio) or the difference between Y_T and Y_C (standardized or not). Each selected meta-analysis presents the estimated values of one or several effect sizes for one or several groups of primary studies corresponding to different regions, crop types, etc. When several estimated values were available, they were all extracted together with the characteristics of the groups of primary studies used for their estimation (e.g., name of the region, type of crop, etc.). When available, information characterizing the uncertainty of estimated effect sizes was also systematically extracted (e.g., sample size, confidence interval, p-value). Data were extracted from tables, text, supplementary information or directly from graphics using the WebPlotDigitizer software [10]. Information to assess the quality of each meta-analysis was also extracted using 20 criteria spanning the quality of the literature review, data extraction, data analyses, and interpretations.
Each meta-analysis was read carefully at least three times to identify relevant data. In case of ambiguity, a second reader was asked to re-analyze the article. Inconsistencies of judgment were discussed by the two readers. When the reported data or protocols were unclear, authors were directly contacted and asked to provide additional information. The units of all data and the origin of the extracted effect sizes (figure, table, text section) were also precisely described. The qualitative and quantitative contents of all class, numerical, index, binary and date attributes were checked by importing each table in turn into the R software [11], and by visualizing the data distribution for each attribute in turn. Outliers were systematically and manually checked against the original articles, as many times as needed, to detect possible mistakes and verify the accuracy of the data.
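The main families of effect sizes named above can be written as simple functions of Y_T and Y_C. The sketch below is illustrative only: the standardized difference uses a pooled standard deviation, which is one common choice and an assumption here, since individual meta-analyses may use other standardizations or small-sample corrections. Function and parameter names are hypothetical.

```python
import math


def ratio(y_t, y_c):
    """Ratio of the treatment value to the control value."""
    return y_t / y_c


def log_ratio(y_t, y_c):
    """Natural log of the ratio, ln(Y_T/Y_C)."""
    return math.log(y_t / y_c)


def difference(y_t, y_c):
    """Unstandardized difference, Y_T - Y_C."""
    return y_t - y_c


def standardized_difference(y_t, y_c, sd_t, sd_c, n_t, n_c):
    """Difference divided by the pooled standard deviation (assumed form)."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (y_t - y_c) / math.sqrt(pooled_var)


# Hypothetical yields (t/ha) in a diversified treatment and its control.
y_t, y_c = 6.0, 5.0
values = {
    "ratio": ratio(y_t, y_c),
    "log_ratio": log_ratio(y_t, y_c),
    "difference": difference(y_t, y_c),
    "standardized": standardized_difference(y_t, y_c, 1.0, 1.0, 10, 10),
}
```

Because these types are on different scales, effect sizes of different types are not directly comparable without conversion.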