An intensive multilocation temporal dataset of fungal and bacterial communities in the root and rhizosphere of Brassica napus

The plant microbiome has been recently recognized as a plant phenotype to help in the food security of the future population. However, global plant microbiome datasets are insufficient to be used effectively for breeding this new generation of crop plants. We surveyed the diversity and temporal composition of bacterial and fungal communities in the root and rhizosphere of Brassica napus, the world's second largest oilseed crop, weekly in eight diverse lines at one site and every three weeks in sixteen lines, at three sites in 2016 and 2017 in the Canadian Prairies. We sequenced the bacterial 16S ribosomal RNA gene generating a total of 127.7 million reads and the fungal internal transcribed spacer (ITS) region generating 113.4 million reads. 14,944 unique fungal amplicon sequence variants (ASV) were detected, with an average of 43 ASVs per root and 105 ASVs per rhizosphere sample. We detected 10,882 unique bacterial ASVs with an average of 249 ASVs per sample. Temporal, site-to-site, and line-driven variability were key determinants of microbial community structure. This dataset is a valuable resource to systematically extract information on the belowground microbiome of diverse B. napus lines in different environments, at different times in the growing season, in order to adapt effective varieties for sustainable crop production systems.

lion reads and the fungal internal transcribed spacer (ITS) region generating 113.4 million reads. 14,944 unique fungal amplicon sequence variants (ASV) were detected, with an average of 43 ASVs per root and 105 ASVs per rhizosphere sample. We detected 10,882 unique bacterial ASVs with an average of 249 ASVs per sample. Temporal, site-to-site, and linedriven variability were key determinants of microbial community structure. This dataset is a valuable resource to systematically extract information on the belowground microbiome of diverse B. napus lines in different environments, at different times in the growing season, in order to adapt effective varieties for sustainable crop production systems.
© 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license.
( http://creativecommons.org/licenses/by/4.0/ ) Table   Subject Agriculture, Crop production, Applied microbiology Specific subject area Diversity and temporal composition of bacterial and fungal communities in the root and rhizosphere of Brassica napus Type of data Figure  How

Value of the Data
• This dataset characterizes the bacterial and fungal microbiomes in the root endosphere and rhizosphere of B. napus in the Canadian prairies. • It can be used to systematically extract information on diversity and composition of the root and rhizosphere microbiome in diverse B. napus lines, in different environments, and at different times in the growing season. • The data presented in this article are useful in various areas including microbial ecology, soil science, plant science, and in breeding programs as an alternative plant phenotype, in order to adapt effective varieties for sustainable crop production.

Data Description
This dataset provides information on the diversity and temporal composition of the bacterial and fungal microbiomes in the root endosphere and rhizosphere of Brassica napus L. in different environments in the Canadian prairies. Previous reports revealed that root microbiomes of B. napus , the world's second largest oilseed crop, was consistently different from those of other crop plants, and tended to shift in different cropping rotation systems [1][2][3][4]. However, no evidence was provided on the controlling factors, including how plant genetics and environment may shape the diversity and composition of these microbiomes [ 5 , 6 ].
We surveyed the diversity and composition of the root and rhizosphere bacterial and fungal microbiomes of sixteen genetically diverse B. napus lines replicated three times. A total for four site years included sites at Llewelyn in 2016 and 2017 as well as Melfort and Scott in 2017 only. A temporally intensive survey was performed once per week for ten weeks in 2016 at Llewelyn to determine the degree of change in the bacterial and fungal microbiomes over the growing season. In 2017, we repeated this work at three time points with the same 16 lines grown in multiple locations (Llewelyn, Melfort and Scott). These locations had different soils and climatic factors that are representative of important canola producing regions in the Canadian prairies. In 2017, we also repeated the weekly sampling of a subset of eight lines at the Llewelyn field site ( Table 1 ).
In total, 14,944 unique fungal and 10,882 unique bacterial amplicon sequence variants (ASV) were detected in all lines across the four site years based on the internal transcribed spacer region. There was an average of 43 fungal ASVs per root and 105 ASVs per rhizosphere sample. The bacterial and fungal microbiomes were more diverse in the rhizosphere samples compared   to the root samples. In Llewelyn 2017, the total number of bacterial ASVs was the highest in rhizosphere samples whereas it was the lowest for the root samples ( Fig. 1 ). The fungal microbiome was more diverse at Scott compared to Llewellyn and Melfort. In Llewellyn, the site with two years of data, the root fungal microbiome was more diverse in 2017 than 2016, while in contrast in the rhizosphere soil the microbiome had a higher richness in 2016 than 2017 ( Fig. 2 ). Diversity of fungi varied in different B. napus lines and it shifted over growing seasons. An example of this is shown for lines NAM-5 and NAM-13 ( Fig. 3 ). The composition of bacterial and fungal communities in the root and rhizosphere was shaped by B. napus line, location, and the year the experiment was conducted, likely dominated by year-to-year differences in environmental and edaphic conditions ( Figs. 4 and 5 ). This dataset is a valuable resource that can be used to systematically extract information on the temporal dynamics of microbial communities in the root endosphere and rhizosphere soil of diverse B. napus lines in different soils and across years. The purpose of this report is to provide a publicly available microbiome dataset, its associated metadata and the context by which others explore how bacterial and fungal microbiomes relate to varieties better suited to sustainable crop production, and improve food security worldwide [7][8] .

Experimental design
Sixteen lines of B. napus L. were grown in Agriculture and Agri-Food Canada research farms in Saskatchewan in 2016 and 2017 ( Table 1 ). In 2016, the lines were grown in a site at Llewellyn (52.1718 °N, 106.5052 °W). In 2017, the same lines were again grown at Llewellyn, as well as at Scott (52.3574 °N, 108.8400 °W) and Melfort (52.8185 °N, 104.6027 °W) (Fig. S1). Each line was grown in randomized blocks and replicated three times. Each plot had six rows and was 6.1 m long and 1.8 m wide. At Llewellyn in 2016, the plots were seeded on May 27 and received 220.1 mm of precipitation throughout the growing season (May to August) and had a mean temper- In 2016, root and rhizosphere samples were collected weekly from June 14 to August 16 from the Llewellyn site for 10 consecutive weeks, starting 3 weeks after sowing. In addition, three duplicates were collected for randomly chosen samples at each sampling time throughout the growing season resulting in a total of 510 samples. In 2017, the same sixteen lines were sampled at all three sites 3, 6, and 9 weeks after sowing. In 2017, eight lines were also sampled weekly  for 10 weeks from the site at Llewellyn, starting 3 weeks after sowing. Three canola plants were collected and combined to form a single composite sample from each plot. A total of 2,160 root and rhizosphere samples were collected and DNA was extracted from all samples. Roots and rhizosphere samples were collected to a 10-cm depth, and rhizosphere soil was determined as soils adhering to the roots.

Sample collection and processing
Three plants including roots were sampled from each plot using a sterilized trowel. Roots with adhering soil and bulk soil were placed in a sampling bag and stored in a cooler where they were kept on ice. The samples were transported to the lab and were stored at 4 °C until processing. The following day, the aboveground material was cut from the roots, oven dried at 60 °C, and weighed. The loosely adhering soil (soil not attached to the roots) was shaken off and collected and stored at -80 °C for further analysis. Plant roots and the tightly adhering soil were weighed and transferred to a flask containing 100 ml of sterile 0.05M NaCl buffer and shaken at 180 rpm for 15 minutes. The roots then were rinsed with deionized water, weighed, and a subsample of the root material was stored at -80 °C for DNA extraction. The buffer and soil mixture was centrifuged at 50 0 0 rpm for 15 minutes at room temperature. The supernatant was stored at -20 °C for root exudate analysis. The pellet containing the rhizosphere soil was transferred to 1.5 ml tubes and stored at -80 °C for DNA extraction.

DNA extraction and amplification
DNA was extracted from 250 mg of rhizosphere soil using Qiagen PowerSoil extraction kit following manufacturer instructions. DNA was extracted from 50 mg root tissue using Qiagen PowerPlant extraction kit (Hilden, Germany) following manufacturer instructions. Extraction duplicates were included for quality control. After extraction, DNA quantity and quality were determined following the standard Qubit protocol (Thermo Fisher Scientific, Waltham Massachusetts).
In 2016, the PCR reaction mix (25 μl total) contained 2.5 μl DreamTaq Buffer (Thermo Fisher Scientific, Waltham Massachusetts), 2 μl MgCl 2 , 2.5 μl betaine, 2.5 μl dNTP mix (Invitrogen, Carlsbad, California), 1 μl of each primer, 0.25 μl DreamTaq (Thermo Fisher Scientific, Waltham Massachusetts), 13.25 μl nuclease free water, and 2 μl of the standardized template DNA. The PCR conditions were 95 °C for 5 minutes as an initial denaturization, followed by 95 °C for 30 seconds, 54 °C for 30 seconds, 72 °C for 30 seconds for 35 cycles, and a final elongation of 72 °C for 7 minutes. Negative controls and PCR duplicates were included. In 2017 the same reaction mix was used without MgCl 2 and betaine. The same PCR conditions were followed.
The Internal Transcribed Spacer (ITS) region was amplified using the ITS1F_KYO1 (CTHGGT-CATTTAGAGGAASTAA) / ITS2_KYO2 (TTYRCTRCGTTCTTCATC) primer set [10] and then sequenced using the Illumina MiSeq Platform. The PCR reaction mix (25 μL total) contained 12.5 μL Platinum Green Taq buffer (Invitrogen, Carlsbad, California), 1 μL of each of the forward and reverse primers (10 μM), 8.5 μL nuclease free water, and 2 μL of template DNA. The PCR conditions were 95 °C for 5 minutes as an initial denaturation, followed by 95 °C for 30 seconds, 51 °C for 45 seconds, 72 °C for 1 min for 35 cycles, and a final elongation of 72 °C for 7 minutes. PCR reactions with no DNA template were included for negative control. PCR reactions were replicated for a fraction of randomly selected DNA samples and used for PCR quality control.

Library preparation
For bacterial amplification, samples collected in 2016 were sent to Genome Quebec (Montreal, Quebec) after the initial PCR for barcoding, amplicon normalization, amplicon library QC, and sequencing on Illumina MiSeq platform (San Diego. California). Library preparation for 2017 samples was completed at the University of Saskatchewan. PCR product was purified to eliminate primers and impurities using 1:1 ratio of Nucleomag NGS clean-up and size select (D-mark Biosciences, Scarborough, Ontario). After purification, samples were indexed following the Illumina protocol, purified again to remove excess index primers, quantified and standardized to 4 nM, and pooled. Pooled libraries were then sequenced using the Illumina MiSeq platform. A total of 1,008 individual root samples and 1,008 rhizosphere soil samples, as well as 578 quality assurance/quality control samples were sequenced. Quality assurance/control samples included field soil and root duplicates, DNA extraction duplicates, library preparation duplicates and sequencing duplicates. On average, every third field experimental unit was duplicated in some fashion for quality assurance/control. The raw sequences were submitted to the sequence read archive (SRA) repository of the National Center for Biotechnology Information (BioProject PR-JNA575004, accessions: SAMN12874189 -SAMN12875128; SAMN12898547 -SAMN12899542; SAMN12907026 -SAMN12907756).
For fungal amplification, the PCR products were confirmed by visualization in an agarose gel (1.2 %), purified using 1:1 ratio of Nucleomag NGS clean-up and size selected to remove primers and impurities according to manufacturer's instructions (D-mark Biosciences, Scarborough, Ontario). After purification, samples were barcoded using Nextera XT indexes, purified again to remove the impurities and indexing primers, quantified using Qubit 4 (Thermo Fisher Scientific, Waltham Massachusetts), standardized at 4 ng/ μL, and pooled (384 samples). Selected PCR reactions were indexed separately and used for sequencing quality control.

Bacteria
A total of 127,698,986 reads were produced with an average of 23,940 reads per sample. Primers were removed using cutadapt v. 2.1 and then imported into QIIME2 v 2019.1 [11] . Sequences were then filtered based on quality using default parameters: maximum number of consecutive low-quality scores ( r ) = 3, minimum length that a sequence read can be following truncation and still be retained ( p ) = 0.75 total read length, maximum number of low PHRED scores that can be observed in direct succession before truncating a sequence read ( q ) = 3, maximum number of ambiguous (i.e., N) base calls ( n ) = 0, minimum sequence count ( c ) = 0.005% [12] and sorted into amplicon sequence variants using (ASVs) using a trim length of 200 in Deblur [13] . The resultant QIIME2 abundance and taxonomy artifacts were exported to BIOM format [14] for processing in R v. 3.5.3 [15] . The bacterial abundance data were imported into R using the biomformat package v. 0.4.0 19 and combined with the taxonomy and sample information using phyloseq v. 1.26.1 [16] . Chloroplasts and mitochondrial contaminants were removed, and abundances standardized to the Aliivibrio fischeri spike. Duplicate samples were removed, as were any samples or taxa with zero abundance sums. There were 10,882 unique ASVs with an average of 249 ASVs per sample. ASVs were classified using a 342F/806R-trained V3/V4 SILVA database [17] .

Fungi
In total, 113,378,162 forward and reverse raw sequence reads were produced with an average of 26,244 paired reads per sample. Sequences were processed using QIIME2 v2019.7 [11] . First primer sequences were removed. Sequences were quality filtered using the default parameters and arranged as amplicon sequence variants (ASVs) using a forward truncation length of 180 bp and reverse truncation length of 120 bp in DADA2. There were 14,944 unique ASVs with an average of -43 ASVs per root and 105 ASVs per rhizosphere sample. The average sequence length for both root and rhizosphere was 238 base pairs. ASVs were classified using UNITE database v 8.0 [18] . The QIIME2 abundance and taxonomy artifacts were combined and converted to BIOM format [14] for processing in R v. 3.5.3 [15] . The fungal abundance data were imported into R using the biomformat package v. 0.4.0 and combined with the taxonomy and sample information using phyloseq v. 1.26.1. [16] . Duplicate samples, and samples or taxa with zero abundance sums were removed.

Statistical analysis
Initial exploration of the composition of microbial community was assessed using Principal Component Analysis (PCA) and compositional differences quantified using a Permutational Analysis of Variance (PERMANOVA) in the vegan v. 2.5-6 package in R [19] . First, we removed any taxa with a mean read count ≤ 1 and/or samples with ≤ 1 non-zero value. Next, to address the compositional nature of the canola microbiome dataset Zeros in the dataset were replaced using a Bayes-Laplace approach using the zCompositions v. 1.2.0 R package and then transformed using a centered-log ratio (CLR) transformation using the CoDaSeq v. 0.99.3 package [20][21][22] . The CLR mitigates the issue that differences among microbial taxa are not linear by turning the read counts into a ratio abundance (abundance normalized by the geometric mean abundance of all taxa per sample). CLR allows us to retain the relationships among samples and also puts the data in linear (Aitchison) space where we can apply linear statistical techniques such as PCA. The geometric mean centres the abundance values such that the average relative abundance is zero and therefore above average abundances will be positive and below average negative. Alpha diversity was calculated using vegan v. 2.5-6 in R [19] .

Funding
This work is supported by a grant from the Plant Phenotyping and Imaging Research Centre (P2IRC) to BLH, EGL, KS, and SDS. P2IRC is a digital agriculture research center funded by the Canada First Research Excellence Fund (CFREF) from the Natural Sciences and Engineering Research Council (NSERC), managed by the Global Institute for Food Security (GIFS), and located at the University of Saskatchewan (U of S).

Declaration of Competing Interest
The authors declare no competing interests