Impact of enzymatic digestion on bacterial community composition in CF airway samples

Background Previous studies have demonstrated the importance of DNA extraction methods for molecular detection of Staphylococcus, an important bacterial group in cystic fibrosis (CF). We sought to evaluate the effect of enzymatic digestion (EnzD) prior to DNA extraction on bacterial communities identified in sputum and oropharyngeal swab (OP) samples from patients with CF. Methods DNA from 81 samples (39 sputum and 42 OP) collected from 63 patients with CF was extracted in duplicate with and without EnzD. Bacterial communities were determined by rRNA gene sequencing, and measures of alpha and beta diversity were calculated. Principal Coordinate Analysis (PCoA) was used to assess differences at the community level and Wilcoxon Signed Rank tests were used to compare relative abundance (RA) of individual genera for paired samples with and without EnzD. Results Shannon Diversity Index (alpha-diversity) decreased in sputum and OP samples with the use of EnzD. Larger shifts in community composition were observed for OP samples (beta-diversity, measured by Morisita-Horn), whereas less change in communities was observed for sputum samples. The use of EnzD with OP swabs resulted in significant increase in RA for the genera Gemella (p < 0.01), Streptococcus (p < 0.01), and Rothia (p < 0.01). Staphylococcus (p < 0.01) was the only genus with a significant increase in RA from sputum, whereas the following genera decreased in RA with EnzD: Veillonella (p < 0.01), Granulicatella (p < 0.01), Prevotella (p < 0.01), and Gemella (p = 0.02). In OP samples, higher RA of Gram-positive taxa was associated with larger changes in microbial community composition. Discussion We show that the application of EnzD to CF airway samples, particularly OP swabs, results in differences in microbial communities detected by sequencing. Use of EnzD can result in large changes in bacterial community composition, and is particularly useful for detection of Staphylococcus in CF OP samples. The enhanced identification of Staphylococcus aureus is a strong indication to utilize EnzD in studies that use OP swabs to monitor CF airway communities.


INTRODUCTION
Cystic Fibrosis (CF) is an autosomal recessive disease, characterized by chronic airway infection, whose predominant pathogens include Staphylococcus aureus, Pseudomonas aeruginosa and other Gram-negative bacteria (Heijerman, 2005;LiPuma, 2010). OP swabs are commonly used when children are unable to expectorate (Zemanick et al., 2015). This is particularly true for pediatric, non-expectorating subjects with CF in whom oropharyngeal (OP) cultures are used as a surrogate for lower airway bacteria (Zemanick et al., 2015). In the pediatric studies, OP swabs have ranged from 30% to 68% of samples collected (Armstrong et al., 1996;Zemanick et al., 2010;Wolter et al., 2013;Hoppe et al., 2015) and therefore represent an important sample type in early CF.
Sequencing is becoming more widely used to evaluate bacterial communities in CF airway samples (Cummings et al., 2016). DNA preparation prior to sequencing, including cell lysis, can have a profound effect on microbial community composition especially in detecting certain organisms such as S. aureus (Zhao et al., 2012;Yuan et al., 2012;Willner et al., 2012;Lozupone et al., 2013;Pérez-Losada et al., 2016). S. aureus in particular has a rigid cell wall that can be hard to rupture, hindering the ability to efficiently extract DNA for molecular detection (Zhao et al., 2012).
Previous studies have indicated that use of enzymatic digestion (EnzD) may enhance the ability to detect Staphylococcus (Schindler & Schuhardt, 1964;Browder et al., 1965;Yuan et al., 2012;Zhao et al., 2012;Johnson et al., 2016). Specifically, Yuan et al. (2012) performed a comprehensive experiment evaluating multiple DNA extraction methods utilizing human associated bacterial species as well as a mock community. Zhao et al. (2012) found that EnzD increased the yield of Staphylococcus in sputum samples. Similarly, Johnson et al. (2016) determined EnzD improved the sensitivity of sequencing in OP swabs while no differences were observed in sputum samples. Increased sensitivity with EnzD was robust when compared to clinical culture results and qPCR. Furthermore, the work showed that the majority of Staphylococcus was S. aureus. It remains unclear, however, how the use of EnzD effects the remaining bacterial community in clinical CF samples. In this work, we build on these two previous CF studies to evaluate the effect of DNA extraction using enzymatic digestion (EnzD) on bacterial communities detected from oropharyngeal swab (OP) and sputum samples collected from patients with CF.

Patient demographics and samples
Sputum and OP samples were obtained from patients with CF as part of standard of care for monitoring bacterial infection during routine patient visits. Further explanation of selection of samples to be used for this study are included in a previous publication (Johnson et al., 2016). Excess specimen was stored frozen at −80 • C for molecular assessment of infection. Standard sputum processing protocol was performed for sample homogenization utilizing sputalysin and standard CF culture was performed following CF Foundation guidelines (Burns et al., 1998). The Colorado Multiple Institutional Review Board approved the study (COMIRB 07-0835). Written informed consent was obtained from patients or guardians. Written informed assent was obtained for children 10-17 years of age.

DNA extraction methods
DNA was extracted using the Qiagen EZ1 Advanced automated extraction platform using the Tissue kit and bacterial card per manufacturer's instructions. EnzD was performed on one replicate of each sample by mixing with lysostaphin (final concentration 0.18 mg/mL) and lysozyme (3.6 mg/mL) and incubated at 37 • C for 30 min. Samples were then digested with proteinase K (1.4 mg/mL) and incubated at 65 • C for ten minutes, then incubated at 95 • C for 10 min (Zhao et al., 2012;Johnson et al., 2016). Lysozyme and lysostaphin target degradation of the bacterial cell wall by targeting peptidoglycan (lysozyme) and pentaglycine bridges (lysostaphin) (Salazar & Asenjo, 2007). Lysostaphin is specific to the subset of staphylococci that contain pentaglycine bridges including the human pathogen S. aureus (Trayer & Buckley, 1970). Proteinase K is a broad specificity endopeptidase that is used to digest proteins (Gradisar et al., 2005).

16S rRNA gene amplicon library construction
Bacterial profiles were determined by broad-range amplification and sequence analysis of 16S rRNA genes following our previously described methods (Hara et al., 2012;Markle et al., 2013). Amplicons were generated using primers that target approximately 300 base pairs of the V1/V2 variable region of the 16S rRNA gene. Each DNA was amplified in triplicate along with a barcode specific negative PCR control. PCR reactions contained 1X HotMaster Mix (5Prime), 150 nM each PCR primer and template in a reaction volume of 25 µl. Cycling conditions were 94 • C denaturation for 120 s followed by 30 cycles of 95 • C 20 s, 52 • C 20 s and 65 • C 60 s. Amplification was confirmed using agarose gel electrophoresis. None of the negative PCR controls showed evidence of amplification. PCR products were normalized based on agarose gel densitometry, pooled, lyophilized, purified and concentrated using a DNA Clean and Concentrator Kit (Zymo, Irvine, CA, USA). Pooled amplicons were quantified using Qubit Fluorometer 2.0 (Invitrogen, Carlsbad, CA, USA). The pool was diluted to 4 nM and denatured with 0.2 N NaOH at room temperature. The denatured DNA was diluted to 15 pM and spiked with 10% of the Illumina PhiX control DNA prior to loading the sequencer. Illumina paired-end sequencing was performed on the MiSeq using a 500 cycle version 2 reagent kit.

Analysis of illumina paired-end reads
As previously described, paired-end sequences were sorted by sample via barcodes in the paired reads with a python script (Markle et al., 2013). Sorted paired end sequence data were deposited in the NCBI Short Read Archive under accession number SRP043334. The sorted paired reads were assembled using phrap (Ewing & Green, 1998;Ewing, 1998). Pairs that did not assemble were discarded. Assembled sequence ends were trimmed over a moving window of five nucleotides until average quality met or exceeded 20. Trimmed sequences with more than one ambiguity or shorter than 250 nt were discarded. Potential chimeras identified with Uchime (usearch6.0.203_i86linux32) (Edgar et al., 2011) using the Schloss Silva reference sequences (Schloss & Westcott, 2011) were removed from subsequent analyses. Assembled sequences were aligned and classified with SINA (1.3.0-r23838) (Pruesse, Peplies & Glöckner, 2012) using the 479,726 bacterial sequences in Silva 115NR (Quast et al., 2013) as reference configured to yield the Silva taxonomy. Sequences with identical taxonomic assignments were grouped into Operational taxonomic units (OTUs). This process generated 4,302,223 sequences for 162 samples (average sequence length: 316 nt; average sample size: 26,557 sequences/sample; minimum sample size: 7,527; maximum sample size: 63,105). The median Good's coverage score was ≥99.70% at the rarefaction point of 7,527. The software package Explicet (v2.10.5, http://www.explicet.org) (Robertson et al., 2013) was used to calculate rarefied values for diversity measurements.

Statistical analysis
Shannon-H alpha diversity, evenness, richness and Morisita Horn (MH) beta diversity were calculated in Explicet. MH beta diversity is a measure of similarity between two communities and ranges from 0 (no similarity) to 1 (identical communities). Principal Coordinate Analysis (PCoA) was used to assess differences at the community level utilizing 1-MH. 1-MH distances were used to show the dissimilarity between with EnzD and without EnzD in PCoA plots instead of MH, which are a measure of similarity. Relative abundance (RA) for each genus was calculated by dividing the genera-specific sequence counts by the total number of sequences obtained for each sample. Shannon-H diversity, evenness and richness were compared using a Wilcoxon Signed Rank test. Differences in phyla and genera between EnzD samples versus non-EnzD samples with a median RA of at least 1% were evaluated using Wilcoxon Signed rank test for paired samples. Benjamini-Hochberg corrections were used to account for multiple comparisons (Benjamini & Yekutieli, 2001). Spearman correlations were used to assess associations between MH for paired samples and RA of specific genera from the non-digested sample. Analyses were calculated using R version 3.2.4 Revised (2016-03-16 r70336).

Sample collection
We analyzed 81 airway samples (39 sputum and 42 OP, Fig. S1) collected from 63 patients with cystic fibrosis ranging in age from 1.5 to 24 years (9-24 sputum and 1.5-23 OP). Half of the subjects were female (51%), the majority of subjects were non-Hispanic white (89%) and 52% of subjects were homozygous F508. The median number of samples collected per subject was one and ranged between one and three samples. Seven samples (9%) were negative for standard CF pathogens by culture (Fig. S2).

Changes in individual organisms
Individual taxa were assessed at the phyla and genera level by sample type (OP and Sputum). For OP samples, we observed a significantly lower RA for the phyla Bacteroidetes, Fusobacteria and Proteobacteria when EnzD was used. Firmicutes and Actinobacteria both were detected in higher RA with the use of EnzD. For sputum, the RA for Bacteroidetes and Proteobacteria were lower with EnzD (Table 1) consistent with findings in OP samples (Fig. 1A). Genera with a statistically significantly higher RA following EnzD for OP swabs consisted of: Gemella, Streptococcus, Actinomyces, Johnsonella, and Rothia. Genera with a statistically significantly lower RA for OP swabs consisted of: Prevotella, Veillonella, Neisseria, Haemophilus, Leptotrichia, Campylobacter, Fusobacterium, and Granulicatella. For these genera, while the changes are statistically significant, the changes may not be clinically meaningful. The largest changes in RA (>5%) are seen in Streptococcus, Prevotella, and Veillonella (Table 2). Staphylococcus was the only genus with a significantly higher RA in sputum following EnzD. Genera with a statistically significantly lower RA in sputum consisted of: Veillonella, Granulicatella, Prevotella, and Gemella ( Table 2). The largest estimated change in RA for sputum samples is attributed to Staphylococcus (Table 2). Gemella was higher in OP samples with EnzD while it was lower in sputum samples with EnzD ( Fig. 1B).
Genera traditionally associated with CF include Pseudomonas, Staphylococcus, Haemophilus, Stenotrophomonas, and Achromobacter (Zemanick et al., 2015). All of the samples have low RA for Achromobacter and all of the OP samples had low RA for Pseudomonas and Stenotrophomonas. In the sputum samples, 11 had low amounts of all CF pathogens, 11 were dominated by Staphylococcus, all of which increased with EnzD, and three were dominated mostly by Haemophilus. Of the remaining 14 sputum samples, three samples were dominated by Stenotrophomonas, one increased, one decreased and one remained unchanged with EnzD. Eleven sputum samples had >5% RA Pseudomonas prior to EnzD, the RA increased after EnzD in six of these samples (median 18.7 range 1.1%-68.4%), in those where RA decreased, the median and range was −6.1%, −1.2% to 42.0% (Table 3).

Overall community changes
The Shannon-H alpha Diversity Index was significantly lower (p < 0.01) with EnzD in sputum samples (Fig. S3)   Shannon-H alpha Diversity Index was also lower likely due to higher RA of Streptococcus with EnzD, although this result did not reach statistical significance (Shannon-H, Wilcoxon Signed Rank, p = 0.06). Richness and evenness were also not significantly affected by EnzD in OP samples. Changes in alpha diversity due to use of EnzD do not necessarily correspond to changes in microbial community composition as measured using beta diversity (Fig. S4).

Beta diversity changes
There was less impact of EnzD on community composition in sputum compared to OP samples as measured by MH beta diversity values for each sample pair. The majority of the MH values (79.5%) for sputum were greater than 0.8, whereas only 38.1% of OP samples were above that level. For the OP swabs the lowest MH was 0.4 for sputum samples the lowest MH was 0.48. The distributions between the two sample types were statistically different (MH, Wilcoxon Rank Sum, p < 0.01, Fig. 2). We further investigated these changes using the MH distance matrix in Principal Coordinate Analysis (PCoA). OP samples changed more consistently in ordination space with EnzD (Fig. 3A). The consistent changes observed in the OP samples were largely due to the change in Streptococcus, Gemella, and Rothia (all Gram-positive) as well as Neisseria, Leptotrichia, Prevotella, and Veillonella (Gram-negative); with the samples undergoing EnzD primarily clustered around Streptococcus, Gemella, and Rothia (Fig. 3A). Findings here align with previously discussed individual level changes in genera where Gram-positive bacteria were found in significantly higher RA while Gram-negative bacteria was found in significantly lower RA with EnzD ( Table 2). As indicated in Fig. 3B, sputum samples showed less impact from EnzD, with a large group of samples having limited changes in MH. A subset of the sputum samples Table 3 Displays the RA for CF pathogens detected in each sample. Samples are highlighted when the difference in relative abundance was at least 1%, red for increase with EnzD and blue with a decrease. The grey rows indicate the median (min, max) for the difference in relative abundance for the highlighted samples.    demonstrated large changes in MH with EnzD, but with no consistent direction of change as was seen in the OP samples (Fig. 3B). Sputum appears to be more heterogeneous compared to OP samples. Sputum was primarily influenced by enhanced extraction of Staphylococcus, Veillonella, and Streptococcus (Fig. 3B). This result is consistent with Table 2 where Staphylococcus and Veillonella are shown to have a larger change than Streptococcus.

Associations between community composition and MH
We found no striking evidence of a single phylum or genus whose relative abundance was associated with MH ( Fig. S5). Combination of Gram-positive bacteria was calculated using the summation of (Actinobacteria + Firmicutes (-Veillonella)). For the OP samples with large changes in their communities with EnzD, the relative abundance of Gram-positive organisms was lower compared to those samples with limited changes in their communities (Fig. 4). For all the OP samples, the relative abundance of Gram-positive organisms was higher with EnzD. There was an association between change in relative abundance of Gram-positive organisms with EnzD and MH values for the pairs (Spearman's correlation, r = −0.60; p < 0.01; Fig. 4B).

DISCUSSION
In this work, we demonstrate that EnzD of airway samples changes the bacterial community detected by sequencing primarily due to increased detection of organisms with a Grampositive cell wall structure. From the biplots (Fig. 3A) we can see that for OP samples there is a clear shift in microbial composition between those without EnzD and those with EnzD. With EnzD the OP samples are highly clustered around Gemella, Rothia, and Streptococcus whereas those without EnzD are dispersed around Neisseria, Prevotella and Veillonella. These observations are consistent with cell wall structure; we would predict that given their rigid cell wall Gram-positive bacteria would drive changes in community composition for those samples with EnzD. Streptococcus, Gemella, and Rothia explain a large amount of the variability in the OP samples. Streptococcus and Staphylococcus are Gram-positive taxa explaining a large amount of the variability in the sputum samples. Among the Gram-negative taxa explaining a large amount of variability in OP and sputum samples are Neisseria, Veillonella, and Prevotella due to decreasing relative abundance with EnzD. Granulicatella, a member of the Firmicutes had decreased RA with EnzD, which was not consistent with other Gram positive taxa. Granulicatella is reported to have pleomorphic and variable Gram stain characteristics, which may explain this observation (Bottone et al., 1995;Christensen & Facklam, 2001). Difference in OP and sputum samples can be due to differences in the microbial communities which may be related to multiple factors including anatomic location, inter individual variability and disease severity. The samples used in this study were collected for clinical evaluation of airway infection, although separate evaluation for sample adequacy was not performed, these samples are reflective of what is being used to make clinical decisions. The patients that were not capable of expectorating sputum tend to be younger with presumably less lung disease (i.e., fewer pathogens). There are also differences between upper and lower airways (Zemanick et al., 2015) that could also impact the community composition observed. Willner et al. (2012) demonstrated that genera detection varied significantly between five commonly used extraction methods with Staphylococcus being detected with varying efficiency. Support for these findings showed the largest difference in microbial proportions between two DNA extraction methods (Norgen and Qiagen) was found in Staphylococcus (Pérez-Losada et al., 2016). Consistent with previous findings, EnzD revealed a higher RA of Staphylococcus in OP samples. Further, when evaluated in both OP swab and sputum samples from pediatric subjects with CF, variations in detection of Staphylococcus were observed (Johnson et al., 2016). Total RA for Staphylococcus in sputum samples was higher overall, but had less of a change before and after EnzD (Johnson et al., 2016). Zhao et al. (2012) looked at sputum and saw an increase in Staphylococcus, which we corroborate. We also found a higher RA of Streptococcus in OP samples. Johnson et al. (2016) found that for OP samples DNA concentration consistently increased with EnzD. This indicates that in addition to observing changes in RA, EnzD probably better represents the absolute amounts of genera present in OP samples.

Limitations
There are certain limitations to our study. First, only two airway sample types collected from subjects with CF were evaluated, thus limiting more general inferences regarding the effects of EnzD in other samples types. However, this study does include OP samples, which have not previously been evaluated and represents an important sample type in early CF. The patient's clinical status at the time of sample collection is unknown, this hampers our ability to determine whether these samples are reflective of both clinical stability or during a pulmonary exacerbation. However, because paired samples were used to assess the effect of EnzD, the difference for each pair is not confounded by clinical status. Because only a single replicate with and without EnzD was evaluated, we are unable to compare effects due to EnzD versus technical variability. However, the observation of the consistent differences for the majority of the sample pairs provides some evidence that the change is due to EnzD rather than technical variability. In our study we considered beta-diversity values of 0.8 or greater to be within the limits of change due to biological variability.
The utility of EnzD is difficult to predict due to the strong influence of the community composition in any given sample on the results. There is a compelling argument to utilize EnzD in studies that rely on OP swabs as previous studies have shown increased sensitivity for detection of Staphylococcus. While the argument is less compelling for sputum, EnzD should be applied consistently. Other sample types would require evaluation to determine how much impact is observed. However, increasing the RA of any group makes it harder to identify minor components in the community. EnzD could negatively impact the ability to identify low abundance Gram-negative organisms as seen through the decrease in Gram-negative genera shown in Table 2. This includes many important pathogens in CF (e.g., Pseudomonas) and should be considered carefully.

CONCLUSIONS
In summary, we found that the use of EnzD on airway samples prior to DNA extraction and sequencing alters microbiome community composition results, likely due to improved detection of gram-positive bacterial taxa (e.g., Streptococcus and Staphylococcus). The use of EnzD appears particularly important for analysis of OP swab samples. EnzD use with sputum samples may be less critical and has the potential to decrease sensitivity for low abundance taxa. Our findings highlight the need for a consistent approach to airway sample processing and analysis, and suggest that EnzD should be applied routinely in studies using OP swabs and studies requiring sensitive detection of Staphylococcus and Streptococcus.