Assessing soil bacterial community and dynamics by integrated high-throughput absolute abundance quantification

Microbial ecological studies have been remarkably promoted by the high-throughput sequencing approach with explosive information of taxonomy and relative abundance. However, relative abundance does not reflect the quantity of the microbial community and the inter-sample differences among taxa. In this study, we refined and applied an integrated high-throughput absolute abundance quantification (iHAAQ) method to better characterize soil quantitative bacterial community through combining the relative abundance (by high-throughput sequencing) and total bacterial quantities (by quantitative PCR). The proposed iHAAQ method was validated by an internal reference strain EDL933 and a laboratory strain WG5. Application of the iHAAQ method to a soil phenanthrene biodegradation study showed that for some bacterial taxa, the changes of relative and absolute abundances were coincident, while for others the changes were opposite. With the addition of a microbial activity inhibitor (NaN3), the absolute abundances of soil bacterial taxa, including several dominant genera of Bacillus, Flavobacterium, and Paenibacillus, decreased significantly, but their relative abundances increased after 28 days of incubation. We conclude that the iHAAQ method can offer more comprehensive information to reflect the dynamics of soil bacterial community with both relative and absolute abundances than the relative abundance from high-throughput sequencing alone.

After 28 days of incubation, ten grams (frozen-dry basis) of soil were sampled from 160 each flask of Treatments P, W, and S for DNA extraction. 161 All the soil samples collected from the flasks were immediately frozen and stored 162 in a freezer (-80 °C). Aliquot sample of 0.21 g (frozen-dry basis) per collected sample 163 was used for DNA extraction using the MoBio PowerSoil DNA Isolation kit (MoBio 164 Laboratories, Carlsbad, CA, USA). The extracted DNA was stored in the same freezer 165 (-80 °C) before qPCR analysis. 166

Quantitative PCR analyses 167
The qPCR analyses were performed to quantify the specific genes of the extracted 168 DNA with three replicates using a StepOnePlus TM Real-Time PCR System instrument 169 (Applied Biosystems, Foster City, CA, USA). The two pairs of primers, 341F (5'-170 CCTACGGGAGGCAGCAG -3') and 534R (5'-ATTACCGCGGCTGCTGG -3'), 171 520F (5'-AYTGGGYDTAAAGNG -3') and 802R (5'-172 TACNVGGGTATCTAATCC -3'), were chosen to detect the copy numbers of 16S 173 rRNA gene of total bacteria in V3 and V4 variable regions, respectively (Reddy et al. 174 2012;Guinane et al. 2013). To measure the abundance of strain EDL933, the fliC 175 gene was selected from its chromosome, which had only one copy in the genome 176 (Latif et al. 2014). The forward and reverse primers used were 5'- The standard curves were generated using 10-fold serial dilution of a plasmid 180 containing the targeted DNA PCR fragments of 16S rRNA and the fliC genes. The 25 181 µL qPCR reactions contained 12.5 µL 2 × SYBR Green qPCR SuperMix-UDG with 182 Rox (Invitrogen, Carlsbad, CA, USA), 0.5 µL of each 10-µM forward and reverse 183 primers, and 10.5 µL sterile and DNA-free water. The known amount and soil DNA 184 samples were added at 1.0 µL per reaction. The reaction was executed under the 185 following thermocycler conditions: 50 °C for 2 min × 1 cycle, 95 °C for 5 min × 1 186 cycle, then 95 °C for 15 s, and 60 °C for 30 s × 40 cycles; and finally followed by the 187 dissociation stage for the melting curve analysis. All the gene copy numbers were 188 calculated from the standard curves of 16S rRNA and the fliC genes by using the ∆C t 189 (cycle threshold) method. All qPCR reactions were conducted in triplicate. 190

High-throughput sequencing and data processing 191
According to the method reported by (Ma et al. 2016), each composite DNA sample 192 for sequencing was mixed by combining equal amounts of DNA extraction of the 193 three replicates. The bacterial communities of the Cont, E9 to E4, P, W, and S 194 treatments were amplified with the V4 region of primers (520F,and 196 then sequenced with Illumina Miseq platform following standard protocols. The 197 quality control of raw sequencing reads was performed using QIIME software 198 (version 1.7.0) (Caporaso et al. 2010). The sequences were removed if the average 199 quality in a moving 5 bps window lower than Q20, length less than 150 bps and 200 having the ambiguous barcode. The paired-end sequences were spliced by the same 201 index with overlap no less than 10 bps and no mismatch using FLASH (Magoc & 202 Salzberg 2011). The chimera sequences were de novo identified and removed using 203 UCHIME (Edgar et al. 2011). 204 After quality filtering, the sequences clustering was performed using UCLUST 205 with a 97% similarity and sorted out as operational taxonomic units (OTUs) (Edgar 206 2010). Then, the OTUs were classified using RDP-classifier to match to SILVA 207 database release 119 (Wang et al. 2007;Quast et al. 2013). The raw data obtained in 208 this research were deposited to NCBI SRA (Sequence Read Archive; 209 http://www.ncbi.nlm.nih.gov/sra/) under accession numbers SRP097773 and 210

SRP105351. 211
Absolute quantification of soil bacteria and its community by the iHAAQ 212 method 213 The absolute abundance of each phylum or genus in the soil bacterial community was 214 calculated by multiplying total bacterial quantities (the copy number of 16S rRNA 215 gene of total bacteria from qPCR) and the corresponding relative abundance from 216 high-throughput sequencing (Fig. 1). The calculated absolute abundances of 217 Escherichia-Shigella, Esch-V3 and Esch-V4, were based on its relative abundance 218 and the total bacterial quantities measured by V3 and V4 regions of 16S rRNA gene, 219 respectively. Since the 16S rRNA gene varied in copy numbers among the different 220 bacteria genomes (Latif et al. 2014;Sun et al. 2013), 7 copies of 16S rRNA gene and From 7 days to 28 days incubation, the residual PHE was declined rapidly from 22.01 284 to 8.44 µg (g dry wt soil) -1 in Treatment P and from 7.53 to 4.38 µg (g dry wt soil) -1 in 285 Treatment W. Compared with Treatments P and W, Treatment S contained the highest 286 residual PHE and had the slightest change during the 28-d incubation (47.79 -60.45 287 µg (g dry wt soil) -1 ). Following the procedures as shown in Fig. 1, the dynamics of 288 soil bacterial community in Treatments P, W, and S were assessed by the iHAAQ 289 method. The total bacterial quantities increased with the incubation time in 290 Treatments P and W, but decreased with the incubation time in Treatment S ( Fig. S4, 291

Supporting Information). 292
There were mainly 17 phyla in which the relative abundances were greater than 293 0.1% in each of the samples (Table S1,  Actinobacteria decreased by 5.58%, 1.93%, 1.59%, 1.28%, and 1.24%, respectively, 299 in Treatment P; the phyla of Proteobacteria, Firmicutes, and Verrucomicrobia 300 decreased by 9.24%, 3.41%, and 0.93%, respectively, in Treatments W; and the phyla 301 of Actinobacteria, Chloroflexi, Gemmatimonadetes, and Proteobacteria decreased by 302 6.57%, 4.24%, 2.77%, and 1.40%, respectively, in Treatment S. However, the phylum Firmicutes in Treatment S increased substantially from 8.53% to 25.87%, and 304 eventually it became the dominated bacteria in soil. 305 Nevertheless, relative abundance does not provide information about change of 306 total bacterial quantities in soil. Based on the absolute abundances calculated by our 307 proposed iHAAQ method, there were only 5 phyla showing decrease in both the 308 relative and absolute abundances in Treatment P, while the relative and absolute 309 abundances of other phyla showed opposite trend (Fig. 1, Fig. 5, Fig. 6, Fig. 7; Table  310 S1, Fig. S5, Supporting Information). As an example, the difference in relative and 311 absolute abundances was especially evident for phylum Proteobacteria in Treatment 312 P, in which the relative abundance decreased by 6.67%, while the absolute 313 abundances increased by 42.22% and 21.02% in V3 and V4 regions, respectively. 314 Such inconsistent result was also observed in Treatments W and S. In Treatment W, 315 the relative abundance of 4 and 6 phyla decreased, while the absolute abundances of 316 all phyla increased. In Treatment S, none of the absolute abundances of phylum 317 increased, but the relative abundance of phylum Firmicutes increased by 17.54%. 318 Discrepancy between relative and absolute abundances also occurred at the genus 319 level (Fig. 7, Table S1, Supporting Information). In Treatment P, 135 and 57 opposite 320 results in relative and absolute abundances were observed in V3 and V4 regions, 321 respectively. While in Treatment W, 77 and 160 opposite relative and absolute 322 abundances were found in V3 and V4 regions, respectively. Even more discrepancies 323 (258 and 245 genera in V3 and V4 regions, respectively) were observed in Treatment S. While it was a general trend that the relative abundances of genera decreased while When the iHAAQ method is applied to quantify soil bacterial community 388 dynamics, both similar and opposite trends of relative and absolute abundances were 389 observed, which agreed with early investigations ( absolute abundances from the iHAAQ method than by the relative abundance alone. 410 Furthermore, even if the relative abundance of samples had been obtained from 411 the high-throughput sequencing, the absolute abundance of each taxon in the samples 412 will be achieved with the supplementary qPCR analysis. It was recommended to use 413 the same variable region of 16S rRNA gene for qPCR and high-throughput 414 sequencing. Hence, the iHAAQ method is suitable to determine the absolute 415 abundance of bacterial community and can be applied to studying the quantitative 416 bacterial ecology in soil. 417 It is worthy to mention that the range of the valid absolute abundance of each 418 taxon is limited by the high-throughput sequencing technique. Based on 419 pyrosequencing data of a fungal aerosol population, the exact measure of absolute 420 abundance of the fungal species can be achieved at the sufficiently high relative 421 abundance (above 4.4 × 10 -4 ) (Dannemiller et al. 2014). In our study, the least 422 sequence number of indigenous bacteria in soil were 5 reads (i.e. the least relative 423 abundance was about 8.0 × 10 -5 ). The qPCR results showed that the total bacteria 424 quantities in our samples were from 10 8 to 10 10 copies (g dry wt soil) -1 . Therefore, the 425 indigenous bacteria in the soil samples can be detected by the iHAAQ method when 426 their absolute abundances were above 10 3 copies (g dry wt soil) -1 . With the increase 427 of high-throughput sequencing depth, more rare species with lower abundance can be detected (Agogue et al. 2011;Polka et al. 2015), which may help to better 429 characterize microbial communities. 430

Conclusion 431
Our results indicate that the proposed iHAAQ method allows to obtain the absolute 432 abundance of each taxon in soil bacteria communities by combining qPCR (total 433 bacterial quantity) and high-throughput sequencing (relative abundance of each taxon), 434 which could not be achieved separately by either method. The absolute abundance 435 reflects the dynamics of a bacterial community more accurately than the relative 436 abundance data. In our study, the iHAAQ method was successfully applied to 437 characterize the soil bacterial communities with both relative and absolute abundances, 438 which could provide more information than with the relative abundance alone by 439 high-throughput sequencing. The proposed iHAAQ method could help to study soil 440 quantitative bacterial ecology in future application.  Manuscript to be reviewed Detection and calculation of the added internal reference strain EDL933 (copies (g dry wt soil) -1 ) in soil by qPCR (fliC gene and 7 × fliC gene ) and iHAAQ method (Esch-V3 and Esch-V4).
Treatments E9 to E4 are corresponding to with 6 different concentrations of 10 9 to 10 4 CFU (g dry wt soil) -1 . Different letters indicate significant difference (p < 0.05) among 7 × fliC gene, Manuscript to be reviewed