Microbial diversity of a full‐scale UASB reactor applied to poultry slaughterhouse wastewater treatment: integration of 16S rRNA gene amplicon and shotgun metagenomic sequencing

Abstract The 16S rRNA gene amplicon and whole‐genome shotgun metagenomic (WGSM) sequencing approaches were used to investigate wide‐spectrum profiles of microbial composition and metabolic diversity from a full‐scale UASB reactor applied to poultry slaughterhouse wastewater treatment. The data were generated by using MiSeq 2 × 250 bp and HiSeq 2 × 150 bp Illumina sequencing platforms for 16S amplicon and WGSM sequencing, respectively. Each approach revealed a distinct microbial community profile, with Pseudomonas and Psychrobacter as predominant genus for the WGSM dataset and Clostridium and Methanosaeta for the 16S rRNA gene amplicon dataset. The virome characterization revealed the presence of two viral families with Bacteria and Archaea as host, Myoviridae, and Siphoviridae. A wide functional diversity was found with predominance of genes involved in the metabolism of acetone, butanol, and ethanol synthesis; and one‐carbon metabolism (e.g., methanogenesis). Genes related to the acetotrophic methanogenesis pathways were more abundant than methylotrophic and hydrogenotrophic, corroborating the taxonomic results that showed the prevalence of the acetotrophic genus Methanosaeta. Moreover, the dataset indicated a variety of metabolic genes involved in sulfur, nitrogen, iron, and phosphorus cycles, with many genera able to act in all cycles. BLAST analysis against Antibiotic Resistance Genes Database (ARDB) revealed that microbial community contained 43 different types of antibiotic resistance genes, some of them were associated with growth chicken promotion (e.g., bacitracin, tetracycline, and polymyxin).


| INTRODUCTION
The anaerobic digestion (AD) corresponds to a consolidated technology that is successfully applied in different wastes and influents for the conversion of complex organic compounds to CH 4 and CO 2 , nutrient recovery (phosphorus and sulfur), pollutant removal and/or energy (CH 4 and H 2 ), and fermentative compound (butyrate, butanol) production (Batstone & Virdis, 2014;Battistoni et al., 1997).
The large AD application is only possible due to the occurrence of complex interactions of microorganisms belonging to Bacteria and Archaea domains (Zinder, 1984). Previous studies have reported that microbial composition of anaerobic digestion sludge is highly diverse (Cabezas et al., 2015). However, information on the key players in anaerobic conditions and their interactions is not comprehensive. These results indicate a high genetic potential that has been poorly explored and open promising perspectives for biotechnological applications.
WGSM sequencing has been shown as an excellent approach to assess the genetic potential of a microbial community, unraveling the phylogenetic composition, metabolic capacity, and functional diversity (Streit & Schmitz, 2004) of community members. However, comprehensive information on the phylogenetic composition is highly dependent on the sequence number and in general the shotgun metagenomic sequencing is not deep enough. On the other hand, WGSM sequencing does not involve the biased amplification of 16S rRNA genes, and although more precise, the organism abundances are still dependent on the DNA extraction and sequencing protocols used (Kalyuzhnaya et al., 2008;Morgan, Darling, & Eisen, 2010).
Although the 16S rRNA gene amplicon sequencing has showed several artifacts due to PCR amplification and sequencing errors, this approach offers a deep sequencing for a large microbial community characterization allowing to detect rare species in complex communities (Quince et al., 2009). For this reason, the integration of 16S rRNA gene amplicon and WGSM sequencing approaches represent a robust strategy for the study of microbiomes. It is important to emphasize that the microbiological knowledge (phylogenetic composition, metabolic capacity, and functional diversity) in AD is fundamental to elect microbial indicators of optimal performance, to understand the microbial behavior in response to environmental disturbance and to optimize the process (Carballa, Regueiro, & Lema, 2015).
The term metagenomics was first used by Handelsman et al. (1998) in a soil microbiome study and the first report associating metagenomics with AD occurred only 10 years later in a deep microbial analysis of a German full-scale biogas plant treating farm waste (Schlüter et al., 2008). Thenceforward, several studies have examined the AD under a taxonomic and functional perspective, allowing the reconstruction of important metabolic pathways and genomes (Campanaro et al., 2016;Wang et al., 2013;Wirth et al., 2012;Wong et al., 2013) with focus on the understanding of biological interactions and optimization of bioprocesses. Despite the fact that the first report on the application of metagenomics in AD has been from real-scale reactors, most of the studies have focused on the evaluation of laboratory-scale reactors. Therefore, scarce investigation has been performed in full-scale reactors treating real wastewater and these reactors should not be neglected, especially those applied to wastewater treating of industrial activities of global impact like poultry production.
The AD technology, especially the up-flow anaerobic sludge blanket (UASB) reactor, has been applied as the main alternative for poultry slaughterhouse wastewater treatment. Apart from the engineering aspects related to the optimization of UASB reactors for treatment of this type of wastewater, several studies have investigated the microbiological potential of the microbiome in granular sludge from UASB reactors, and the results are promising. The high microbial diversity observed (Delforno et al., 2015) represents a high genetic diversity and, consequently, a broad metabolic potential for several bioprocesses. Therefore, a wide application of granular sludge for nitrogen (Moura, Damianovic, & Foresti, 2012) and sulfide removal (Camiloti et al., 2014), hydrogen production (Penteado et al., 2013), anionic surfactant degradation , polychlorinated biphenyl (PCB) degradation (Gomes et al., 2014), and veterinary antimicrobial removal (Oliveira, Santos-Neto, & Zaiat, 2015) has been observed.
In this context, the objective of this study was to carry out a robust description of the microbiome in a full-scale UASB reactor applied to poultry slaughterhouse wastewater treatment. For the in-depth microbiome survey, an integrated approach combining 16S rRNA gene amplicon and WGSM sequencing was used. In addition to the description of the community composition (Bacteria, Archaea, and Virus), the metabolic processes related to biogeochemical cycles (e.g., sulfur, iron, phosphorus, nitrogen) and antibiotic resistance genes were explored (microbial metabolic potential). In this sense, the robust information on the microbiome composition, functioning, and interactions may guide novel biotechnological applications of the granular sludge present in the UASB reactor under study.

| Wastewater treatment system of the poultry slaughterhouse
The wastewater treatment system of poultry slaughterhouse is lo- Average grease and fats were 308 and 469 mg L −1 for effluent and influent, respectively. The total solids (TSS) were 2,133 mg L −1 in the effluent and 1,593 in the influent, whereas, the concentration of N-NH 3 and P-PO 4 in the influent reached 48 mg L −1 and 38 mg L −1 , respectively.

| Sampling, DNA extraction, Sequencing, and Preprocessing data
Samples examined in this study were collected from UASB reactor in December 2013. Initially, granular sludge was collected in 20 l highdensity polyurethane bottles from UASB reactor. In the laboratory, 25 ml of the granular sludge was transferred to 50 ml Falcon tubes and stored at −20°C until analyzed. The surplus, present in the highdensity polyurethane bottles, was stored at 4°C to be used as an inoculum for other anaerobic bioreactor studies.
For the in-depth microbiome characterization, two strategies were adopted: (i) the sequencing of 16S rRNA gene amplicons for phylogenetic characterization and (ii) metagenomic sequencing for functional diversity characterization.
The sample for phylogenetic characterization was sequenced using the MiSeq 2 × 250 bp and DNA was extracted using a modified phenol-chloroform protocol described by Griffiths et al. (2000).
The 16S rRNA genes were amplified using the primer set S-D-Bact-0341-b-S-17 (5′-CCTACGGGNGGCWGCAG-3′) and S-D-Bact-0785a-A-21 (5′-GACTACHVGGGTATCTAATCC-3′), flanking the V3 and V4 hypervariable regions, as described by Klindworth et al. (2013). The differences between the platforms used for each sample include the desired fragment size, quantity of data generated, and cost of sequencing. For the phylogenetic diversity, fragment size is highly important to increase the reliability of the evolutionary relatedness among sequences used as OTUs, whereas for the evaluation of the functional diversity the amount of generated data is important to increase the coverage. In both cases, the DNA quality was assessed by an ND-2000 spectrophotometer (Nanodrop Inc., Wilmington, DE), using a ratio of 260/280 nm >1.8, and agarose gel electrophoresis.
All datasets in FastQ format were uploaded to MG-RAST server (http://metagenomics.nmpdr.org/) version 3.5 (Meyer et al., 2008) for preprocessing and annotation. The artificial replicate sequences were removed according to Gomez-Alvarez, Teal, and Schmidt (2009), whereas the low-quality sequences were removed using the modified Dynamic Trim (Cox, Peterson, & Biggs, 2010); the default parameters were adopted.
The sequence data are available at MG-RAST server access number 4633385.3(Amplicon_PS) and 4626733.3 (WGS_whole).

| Microbial community structure and composition (phylogenetic characterization)
The microbial community structure and composition of the UASB reactor sludge was evaluated using two approaches: (1) the 16S rRNA gene sequences derived from the amplicon sequencing were used against the M5RNA (Nonredundant multisource ribosomal RNA annotation) database (Meyer et al., 2008); (2) the 16S, 23S, ITS, and 18S rRNA sequences derived from the metagenome sequencing were extracted and analyzed against the M5RNA database (WGS_rRNA; Figure 1). The parameters adopted in both analyses were as follows: max. E-value cutoff = 1e −5 , min. % identity cutoff = 60% and min. alignment length cutoff = 15. Moreover, taxonomic assignment of metagenomic sequence (WGS_whole) was performed against the SEED database with default parameters (Overbeek et al., 2005). In both cases (using the M5RNA and SEED database), the best hit classification was used to visualize the results.

| Diversity analyses
Coverage and alpha diversity indices (dominance, Simpson, Shannon, and Chao-1) were calculated using the PAST software version 3.07 F I G U R E 1 Pipeline of methods and analyses. The sequence data are available at MG-RAST server under the access number 4633385.3 (Amplicon_PS) and 4626733.3 (WGS_whole) (Hammer, Harper, & Ryan, 2001). Thus, the .biom files of each sample generated by the MG-RAST (QIIME report; Caporaso et al., 2010) were the input files for the PAST software.

| Functional analysis
Functional classification of WGS_whole-sequence dataset was conducted by MG-RAST annotation pipeline using the SEED subsystems database. The data were compared using a maximum e-value of 1e −5 , a minimum identity of 60%, and a minimum alignment length of 15 amino acids for proteins.

| Metabolic mapping
Some specific metabolic mappings, such as carbon, nitrogen, phosphorus, sulfur, iron, protein, and aromatic compounds, were investigated in details. Sequences assigned to each of the metabolic mappings mentioned above were extracted from SEED dataset and affiliated to the taxonomic groups harboring the most related function (protein) to provide a picture of the microbial community likely involved in such specific metabolic pathways. The taxonomic assignment was performed using the SEED database with default parameters and the data were visualized using the best hit classification for each feature.
Moreover, putative antibiotic resistance genes (ARGs) were searched in the WGS_whole dataset against the Antibiotic Resistance Database (ARDB), which include nonredundant genes, using BLASTX at a cutoff E value ≤10 −5 . A read was identified as an ARG-like sequence according to its best BLASTx hit with amino acid identity of ≥90% and alignment length of ≥25 amino acids (Kristiansson et al., 2011;Yang et al., 2013).

| Sequencing statistics
In total, 52,826,969 and 293,825 sequences, with average lengths of 233 ± 78 bp and 454 ± 14 bp, were obtained for WGSM sequencing and 16S amplicon sequencing, respectively. After trimming, 66.4% and 34.0% of sequences were removed due to the low quality and the average lengths were reduced to 213 ± 91 bp and 336 ± 156 bp, for WGSM sequencing and 16S amplicon sequencing, respectively. The values of GC-content were around 53-54% for both samples.

| Taxonomic characterization of UASB reactor microbiome
The number of rRNA genes extracted from WGSM dataset was 211,795 sequences (1.0% of total sequences; Table 1
In the metabolism of aromatic compounds, 0.14% of reads were related to the anaerobic degradation of aromatic compounds with emphasis on anaerobic benzoate metabolism (99.0%; Figure 4a-b), specifically the function Acetyl-CoA acetyltransferase.
The cell wall and capsule (4.4%) and quorum sensing and biofilm formation (0.12%) categories, that are likely involved in the formation and maintenance of flocs and biofilms (e.g., granular sludge present in UASB reactors), were found. The main functions in these categories were gram-negative cell wall components (1.4%), capsular and extracellular polysaccharides (1.2%) and biofilm adhesion biosynthesis (0.06%).
Proteins assigned to iron, nitrogen, sulfur, phosphorus, and potassium metabolisms were found with relative abundance ranging from 0.5% to 2.3% (Tables S1 and S2). In the sulfur metabolism, sequences were assigned to organic and inorganic sulfur assimilation, alkanesulfonate assimilation, sulfur oxidation, and sulfate reduction-associated complexes; whereas in the nitrogen metabolism, sequences were mainly related to ammonia assimilation, nitrate and nitrite ammonification, and denitrification.
Taxonomic inferences of the genes involved in biogeochemical cycles (sulfur, nitrogen, iron, phosphorus, and potassium) indicated that 248 different bacterial genera showed enzymatic machinery for all biogeochemical cycles (Figure 5a and b). For the domain Archaea, most of the genera (17)

| Antibiotic Resistance Genes and their host organisms in the UASB reactor microbiome
In total, 43 different ARGs were found in the UASB reactor microbiome, representing 0.03% of total metagenomics sequences ( Figure 6 and  Table 3 and Table S4).

| DISCUSSION
Each approach (16S rRNA gene amplicon and WGSM sequencing) re-  T A B L E 3 Taxonomic classification of sequences related to antibiotics resistance genes (ARGs) from WGS_whole dataset using the SEED database through MG-RAST server. The study of viral community in anaerobic wastewater treatment is scarce. However, the viral community may play an important role in methanogenic digestion, since the Archaea (e.g., Methanosaeta) may be a favorable target to viral attack (Chien et al., 2013). Therefore, viral infection might explain the decrease in methane production, in upsets and process failures with no obvious explanation (Kroeker et al., 1979). Tamaki et al. (2012)  Amplicon_M5RNA dataset, respectively. The prevalence of acetoclastic pathway for methane production in AD has also been observed by Guo et al. (2015) and Yang et al. (2013).
The abundance of reads related to the metabolism of aromatic compounds and resistance to antibiotics and toxic compounds indicates the genetic potential of the microbial community to survive in environments with fluctuating levels of contaminants and stress, in addition to their potential for the application in bioremediation processes. Sanapareddy et al. (2009), evaluating industrial and medical wastewaters, observed 5.0% of aromatic compound metabolism compared to 1.8% observed in this study. These results suggest that the type of wastewater treatment employed may be a key factor in functional gene distribution.
The highest abundance observed was related to iron metabolism, probably due to the selection exerted by the iron present in the poultry blood. Differently, Guo et al. (2015) observed only 0.45% from full-scale anaerobic reactor treating the waste of activated sludge, suggesting that the presence of iron in poultry blood has probably favored this type of metabolism.
The sequences assigned to the nitrogen and sulfur metabolism suggest the potential of the biomass from UASB reactor treating poultry slaughterhouse to be used as inoculum for different processes, such as: fermentative production, aromatic compound degradation, and sulfur, iron, nitrogen, and phosphorus removal. Moreover, the presence of proteins assigned to the resistance of cobalt-zinc-cadmium and arsenic might be biotechnologically explored. Except for the nitrogen metabolism, the relative abundance of reads related to sulfur, phosphorus, and potassium metabolisms were slightly higher than that obtained by Guo et al. (2015) and Yang et al. (2013) when analyzing the metagenome from full-scale anaerobic digesters.
Taxonomic inference of genes involved in biogeochemical cycles (sulfur, nitrogen, iron, phosphorus, and potassium) suggested that many microbial genera have enzymatic machinery for all biogeochemical cycles. According to Zumstein, Moletta, and Godon (2000) many microbes play the same role in the same biogeochemical cycles, strategy known as functional redundancy that makes the anaerobic digestion process robust and plastic. This strategy is considered as an insurance to maintain ecosystem functions under changing environmental conditions (McMahon, Martin, & Hugenholtz, 2007). The finding of a high number of microbial genera harboring genes involved in all biogeochemical cycles in the UASB reactor microbiome reinforces the importance of functional redundancy within the anaerobic digestion. Although ARGs have been described in chicken feces , studies of such resistance genes in full anaerobic systems applied to the treatment of poultry slaughterhouse wastewater are scarce. In this study, 0.03% of total metagenomic sequences were related to antibiotic resistance genes. Li et al. (2015) observed higher numbers of ARG sequences in wastewater from livestock farms (0.5%-3.0%). However, ARGs frequency found in this study is similar to the one in drinking water (0.01%-0.05%), soil (0.02%), sediment (0.004-0.03%), and river water (0.02-0.03%; Li et al., 2015).
The frequencies of ARGs are likely associated with the antibiotics used as veterinary medicine, including growth promoters. Previous with Bacteria and Archaea members as hosts. A wide functional diversity was found in the UASB reactor microbiome, representing a high genetic potential to be further explored in biotechnology such as anaerobic degradation of aromatic compounds, fermentation, protein degradation, and one-carbon metabolism. Regarding the methanogenic metabolism, genes related to the acetotrophic pathways were more abundant, corroborating the taxonomic results that showed the prevalence of the acetotrophic genus Methanosaeta. Moreover, the data indicated the genetic potential of the inoculum for other bioprocesses, such as the removal of sulfur and nitrogen, with many genera able to play a role in all cycles. Different types of antibiotic resistances genes (ARGs) were found, which are likely associated with the antibiotics used as growth chicken promoters and suggesting their uncontrolled use. Combined data gathered in the present work unraveled a robust description of the microbiome (community structure and functional diversity) in a full-scale UASB reactor and provided fundamental information about the bioprocess in AD that can be further explored in biotechnology.