Arthropods dataset from different genetically modified maize events and associated controls

Arthropods from four genetically modified (GM) maize hybrids (coleopteran resistant; coleopteran and lepidopteran resistant; lepidopteran resistant and herbicide tolerant; and coleopteran resistant and herbicide tolerant) and from non-GM varieties were sampled during a two-year field assessment. A total of 363 555 arthropod individuals were collected. This represents the most comprehensive arthropod dataset from GM maize and, together with the weed data, is suitable for determining functional groups of arthropods and interactions between species. Trophic groups identified among both phytophagous and predatory arthropods were previously considered non-target organisms in which possible detrimental effects of Bacillus thuringiensis (Bt) toxins may be detected directly (phytophagous species) or indirectly (predators). Given the high number of individuals and species and their dynamics through the maize growing season, the data can be used to show that interactions between species are highly correlated. The dataset can thus be considered a useful tool for assessing potential deleterious effects of Bt toxins on non-target organisms and for developing biosafety risk hypotheses for invertebrates exposed to GM maize plants.


Background & Summary
This work presents the data underlying the studies in published papers [1][2][3]. In the USA, genetically modified (GM) maize was grown on 38 M ha in 2016 (ref. 4). Throughout EU countries maize is one of the major crops and is cultivated annually on approximately 13 million hectares, representing 13% of the total cultivated area in the EU and about 8% of the maize production area worldwide [5]. The two most important insect pests of non-GM maize crops in Europe are the western corn rootworm (Diabrotica virgifera virgifera LeConte, Coleoptera: Chrysomelidae) and the European corn borer (Ostrinia nubilalis Hübner, Lepidoptera: Crambidae), the control of which represents the greatest challenge in European maize production [6]. Other insects, such as aphids (Aphidoidea) and thrips (Thysanoptera), also occur in all maize croplands and may occasionally contribute to crop losses [7][8][9][10]. Several arthropod predators preying on these and other arthropod herbivores are also found in maize croplands, the most important of which are ground beetles (Coleoptera: Carabidae), ladybird beetles (Coleoptera: Coccinellidae), rove beetles (Coleoptera: Staphylinidae), predatory thrips (Aeolothripidae), wolf spiders (Araneae: Lycosidae), hoverfly larvae (Diptera: Syrphidae) and minute pirate bugs (Orius spp.) [5]. While several previous studies have assessed the effects of GM maize toxins (especially Bt toxins) on non-target (particularly predatory) arthropods [3][11][12][13][14][15][16], few studies have focussed on the strength of species' trophic interactions and associated parameters in GM maize systems. This lack of knowledge represents a major issue in the analysis of arthropod communities associated with different GM plant species [17].
Arthropod community studies in GM crop systems must involve a comprehensive, multi-method field assessment in which all arthropods and weeds are collected by different methods from several GM and non-GM control crops, over several years and throughout the maize growing season. Such datasets allow the functional diversity of arthropod communities in GM and non-GM crops to be analysed at the same time, providing consistent species abundance and prey preference data [1][3][18]. However, datasets with hundreds of thousands of arthropod individuals from GM maize and its non-GM controls are rare, and no such dataset has been freely available until now. The analysis of such data may demonstrate that interactions between species are highly correlated, and the data can thus be considered a useful tool for assessing the deleterious effects of Bt and other toxins on a wide range of non-target organisms in GM croplands [1][3]. These data can also be a useful tool for developing biosafety risk hypotheses for invertebrates exposed to GM plants [18][19][20]. During a two-year intensive field assessment, detailed arthropod collections were made on different GM maize varieties (some of them novel and not previously field tested, such as the coleopteran and lepidopteran resistant, glyphosate tolerant maize) and on their non-GM controls (Table 1).

Site characterisation and sampling procedure
This section is an expanded version of the descriptions in our previous works [1][2][3]. The field sites were set up in 2007 in Central Europe, near Budapest, Hungary, on chernozem soil in a completely randomised block design comprising four different GM maize varieties and two non-GM controls. Peach and apricot orchards of about 200 ha dominated the area around the field sites, and no maize crops were present in the immediate area, which reduced the risk of cross-pollination by GM pollen as much as possible. Both GM and non-GM maize plots were established as 625 m² (25 m × 25 m) blocks spaced 3 m apart (Table 1, Figure 1). The GM maize varieties tested contained proteins conferring resistance against two maize insect pests of worldwide importance (western corn rootworm, D. virgifera virgifera, and European corn borer, O. nubilalis), or an enzyme conferring glyphosate tolerance, or a combination of these. Insect resistance was conferred by different Bt insecticidal crystal (Cry) proteins, in different forms, that specifically target one or both of the above-mentioned pests. Two of the GM maize varieties contained, in addition to insecticidal proteins, an enzyme (CP4 EPSPS) conferring tolerance to glyphosate herbicides (Table 1). The GM maize varieties without glyphosate tolerance and the two non-GM controls were seeded in four replicates each. All glyphosate-tolerant varieties were replicated eight times; four of these blocks received an extra glyphosate treatment in order to distinguish the effects of the GM varieties from the effects of glyphosate application on arthropods (Table 1, Figure 1). The extra glyphosate was applied to these blocks at a total of 1060 g/ha each year, at the four-leaf (V4) and eight-leaf (V8) stages of the plants (according to the normal application procedures used in maize production). To

Data adjustments
This section is a simplified version of the descriptions in our previous works [1][3] and describes the data adjustments made before the food web analyses. Arthropod food webs were built using between 24 972 and 32 237 individuals per GM maize type (sum of four replications per treatment). First, all arthropods collected were assigned to trophic groups, a widely accepted method in food web studies as it reduces methodological biases related to uneven resolution of taxa within and among species' trophic relations [21]. Trophic groups were defined as taxa that share the same set of predators and/or prey. Before food web construction, all possible predator-prey interactions of the species identified in the GM and non-GM plots were carefully searched for in the published scientific literature. Trophic relations between species (or the next highest level of resolution available, usually genus) were checked using a total of 62 scientific references (Supplementary Table 1). The scientific nomenclature and taxonomy of every resource and consumer were standardised with the Global Names Resolver, using the Global Biodiversity Information Facility dataset (http://resolver.globalnames.biodinfo.org/). Designation of trophic links followed the methods presented by Gagic et al. 2011 and Jordán et al. 2012 [21][22]. To increase the sensitivity of the method, the arthropod data were further adjusted according to the following considerations: (i) Because maize pollen can move long distances in several ways (wind, mechanical transport, etc.), thereby posing a risk that non-GM maize fields might produce kernels with GM toxins (production of Bt toxins is a dominant trait), food webs for each GM and non-GM maize variety were built using information on the trophic groups prior to the tasselling stage (VT) of maize. (ii) The presence and abundance of some or all of the arthropods and weeds investigated varied significantly during the vegetation period.
For example, western corn rootworm adults were absent in April, May and June, present in July and August, and absent again in September, October and November. Therefore, food webs for each GM and non-GM maize variety were constructed from arthropod and weed data collected when the abundance of the most frequent species was highest, but prior to pollen spreading (end of June, mid-July).
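The two adjustments above amount to a stage filter followed by an aggregation over replicates. The following is a minimal sketch of that idea only, not the procedure used in the published analyses; the field names (`stage`, `entry`, `block`, `group`, `count`) and the example rows are hypothetical and do not come from the deposited files.

```python
from collections import defaultdict

# Growth stages in field order, as listed in the Data Records section.
STAGE_ORDER = ["V8", "V12", "VT", "R3"]


def pre_tasselling(records, cutoff="VT"):
    """Keep only records sampled at stages strictly before `cutoff`."""
    limit = STAGE_ORDER.index(cutoff)
    return [r for r in records if STAGE_ORDER.index(r["stage"]) < limit]


def abundance_by_entry(records):
    """Sum individuals per (entry, trophic group) over blocks and traps."""
    totals = defaultdict(int)
    for r in records:
        totals[(r["entry"], r["group"])] += r["count"]
    return dict(totals)


# Made-up rows for illustration; real field names and values will differ.
records = [
    {"stage": "V8", "entry": 1, "block": 1, "group": "aphids", "count": 12},
    {"stage": "V12", "entry": 1, "block": 2, "group": "aphids", "count": 30},
    {"stage": "VT", "entry": 1, "block": 1, "group": "aphids", "count": 99},
    {"stage": "V12", "entry": 2, "block": 1, "group": "carabids", "count": 7},
]

kept = pre_tasselling(records)
totals = abundance_by_entry(kept)
```

Here the VT record is dropped before aggregation, so only the V8 and V12 counts contribute to the totals for each entry.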
After adjustments, the food web parameters, namely the number of nodes (species), the number of edges (trophic relations between species) and the K indices (keystone indices), were calculated for each entry following Jordán et al. 2012 [21].
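To make these three parameters concrete, the sketch below computes them for a toy food web. It assumes the commonly used formulation of the keystone index, in which K is the sum of a bottom-up and a top-down term and each term is accumulated recursively along feeding links; the software and exact settings used for the published analyses are not reproduced here. The function name, the species names and the links are illustrative assumptions, and the recursion requires an acyclic web (no cannibalism or feeding loops).

```python
from functools import lru_cache


def food_web_summary(links):
    """Return (number of nodes, number of edges, K index per species).

    `links` is an iterable of (consumer, resource) pairs.
    """
    links = set(links)
    nodes = {species for pair in links for species in pair}
    prey = {n: set() for n in nodes}        # resources eaten by n
    predators = {n: set() for n in nodes}   # consumers feeding on n
    for consumer, resource in links:
        prey[consumer].add(resource)
        predators[resource].add(consumer)

    @lru_cache(maxsize=None)
    def k_bottom_up(i):
        # Each consumer c of i contributes (1 + K_bu(c)) / (number of prey of c).
        return sum((1 + k_bottom_up(c)) / len(prey[c]) for c in predators[i])

    @lru_cache(maxsize=None)
    def k_top_down(i):
        # Each prey e of i contributes (1 + K_td(e)) / (number of predators of e).
        return sum((1 + k_top_down(e)) / len(predators[e]) for e in prey[i])

    k_index = {n: k_bottom_up(n) + k_top_down(n) for n in nodes}
    return len(nodes), len(links), k_index


# Toy web for illustration only; not taken from the dataset.
links = [
    ("spider", "aphid"),
    ("ladybird", "aphid"),
    ("aphid", "maize"),
    ("thrips", "maize"),
]
n_nodes, n_edges, k_index = food_web_summary(links)
```

In this toy web, maize receives the highest K because every other species depends on it directly or indirectly, which matches the interpretation that a higher K indicates greater topological importance.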

Data Records
The data obtained are available as Data Citation 1. The Sample 1_Pitfall Traps file contains arthropod data collected with pitfall traps; Sample 2_Yellow Traps contains data collected with yellow sticky traps; Sample 3_Plant Assessments contains data collected by plant surveys from genetically modified maize events expressing Cry34Ab1, Cry35Ab1, Cry1F and CP4 EPSPS proteins and from controls. Values represent numbers of individuals, except for the Sample 3_Plant assessment column Spider Mites damages %, where the percentage of damage per single plant is given. All arthropod data (Samples 1, 2 and 3) were recorded at four maize developmental stages: the eight-leaf stage (V8), the twelve-leaf stage (V12), tasselling (VT, vegetative stage) and the milk stage (R3, reproductive stage). The stage is given in the second column of each dataset. Entry (column 3) gives the treatment ID, that is, the GM or non-GM maize variety (e.g. Entry 1 means coleopteran resistant GM maize; see Table 1). Replicates (Blocks) (column 4) gives the replicate within each entry, coded in numbers (as in Table 1). Trap number (column 5) denotes the number of traps reported in each entry and its replicates. In the Sample 3_Plant Assessment dataset, the numbers of the plants on which arthropods were assessed are given instead of trap numbers. The Sample 4_Weed coverage dataset contains weed coverage (%) at the different maize growth stages. For the weed data, the first column, 'Sampling', denotes the sampling made in each year. Columns 2, 3 and 4 contain the same information as described for the previous datasets (2 is maize growth stage, 3 is Entries and 4 is Blocks). Column 5 (Sample (m2)) gives the number of visual samplings of weed coverage made in each block. Values for each weed species represent soil coverage, from 0 to 100%, in a 1 m² area. In the food web graph indices dataset, the number of nodes (species) and the number of edges (trophic relations between species) are presented for each entry.
'In degree' values represent the number of trophic links pointing to a species in the food web (the number of other species feeding on this species). 'Out degree' values represent the number of trophic links leaving a species (the number of other species eaten by this species). K indices are a quantitative ranking of species by their topological importance: the higher the K index, the more important the species is to the functioning of the food web.
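As a minimal sketch of how the 'in degree' and 'out degree' values relate to a list of trophic links, the example below counts both for a toy web, using the definitions given above (in degree: number of species feeding on a species; out degree: number of species it eats). The link list and the function name are illustrative assumptions and are not part of the deposited files.

```python
from collections import Counter


def degrees(links):
    """Map each species to (in_degree, out_degree).

    `links` is an iterable of (consumer, resource) pairs.
    """
    links = set(links)
    # A species gains one in-degree for every consumer feeding on it.
    in_degree = Counter(resource for _, resource in links)
    # A species gains one out-degree for every resource it eats.
    out_degree = Counter(consumer for consumer, _ in links)
    species = {s for pair in links for s in pair}
    return {s: (in_degree[s], out_degree[s]) for s in species}


# Toy web for illustration only; not taken from the dataset.
links = [
    ("spider", "aphid"),
    ("ladybird", "aphid"),
    ("aphid", "maize"),
    ("thrips", "maize"),
]
degree_table = degrees(links)
```

With these links, the aphid node has an in degree of 2 (spider and ladybird feed on it) and an out degree of 1 (it feeds on maize).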

Technical Validation
The results of food web analyses using the datasets reported here were published in peer-reviewed journals [1][2][3]. The arthropod and weed data presented here and in the online datasets have also been statistically analysed using standard statistical tools in the associated published papers. The datasets have been carefully checked for possible typing errors against the various published papers. Any discrepancies found were corrected before the data were placed on the web resource. Weeds were identified to species level. Arthropods were identified to the lowest taxonomic level possible, which was species level for most of the predators. All identified species (weeds and arthropods) were checked by the authors of the paper, by researchers from the Department of Plant Protection and the Department of Entomology of the Szent István University, Hungary, and by scientists from the Plant Protection Institute of the Hungarian Academy of Sciences. All datasets were cross-validated by carefully checking all data against the field sheets, and any inconsistencies that might have been introduced into the database by different team members were corrected. Additional consistency checks were carried out by comparing the species found in the present assessment with those reported in the most relevant published references and datasets. In addition to these checks, each taxon identified to species level was checked against its distribution map, in order to ensure that there were no mistakes in species distribution (i.e. to confirm that the species had indeed previously been recorded in Central European agricultural crops). Necessary corrections and changes were made in the database.

Usage Notes
The database contains information on the functional groups of arthropods from four different GM maize varieties and their isogenic controls. To our knowledge, this is the most comprehensive food web, weed and arthropod data assessment from GM maize to date. Aspects of the present data that need further attention in future analyses include:
(a) Weed data have not been analysed in detail in comparison with specific trophic groups or species in the different GM crops and their controls. As weed data were collected at different maize growth stages (V8, V12, VT and R3), detailed analyses of arthropod abundance at different growth stages can be made.
(b) New methods for analysing species' trophic relations and associated metrics can be applied.
(c) Comparisons with arthropod abundances from other GM maize crops can be computed.
(d) The importance of specific non-target groups and their relative abundance on different GM maize varieties has not been analysed.
(e) The data can be analysed using several other software applications (e.g., R, Matlab, SPSS).
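As a starting point for such analyses, the trap tables described in Data Records (identifier columns followed by one column per taxon) may be easier to handle in long format. The following is a minimal sketch under that assumption; the column names (`Sampling`, `Stage`, `Entry`, `Block`, `Trap`), the taxon columns and the values are hypothetical placeholders and should be replaced by the headers found in the deposited files.

```python
import csv
import io

# Hypothetical identifier columns; check the real file headers before use.
ID_COLUMNS = ["Sampling", "Stage", "Entry", "Block", "Trap"]


def wide_to_long(handle, id_columns=ID_COLUMNS):
    """Turn one row per trap into one row per (trap, taxon) count."""
    long_rows = []
    for row in csv.DictReader(handle):
        ids = {col: row[col] for col in id_columns}
        for taxon, value in row.items():
            # Skip identifier columns and empty cells.
            if taxon in id_columns or value in ("", None):
                continue
            long_rows.append({**ids, "Taxon": taxon, "Count": int(value)})
    return long_rows


# Made-up two-row table standing in for a trap file.
sample = io.StringIO(
    "Sampling,Stage,Entry,Block,Trap,Carabidae,Lycosidae\n"
    "1,V8,1,1,1,4,2\n"
    "1,V12,1,1,1,6,\n"
)
long_rows = wide_to_long(sample)
```

Each resulting row carries its growth stage, entry and block, so counts can be grouped by any of these factors in R, Matlab, SPSS or Python without further reshaping.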