Combinatorial Glycomic Analyses to Direct CAZyme Discovery for the Tailored Degradation of Canola Meal Non-Starch Dietary Polysaccharides

Canola meal (CM), the protein-rich by-product of canola oil extraction, has shown promise as an alternative feedstuff and protein supplement in poultry diets, yet its use has been limited due to the abundance of plant cell wall fibre, specifically non-starch polysaccharides (NSP) and lignin. The addition of exogenous enzymes to promote the digestion of CM NSP in chickens has potential to increase the metabolizable energy of CM. We isolated chicken cecal bacteria from a continuous-flow mini-bioreactor system and selected for those with the ability to metabolize CM NSP. Of 100 isolates identified, Bacteroides spp. and Enterococcus spp. were the most common species with these capabilities. To identify enzymes specifically for the digestion of CM NSP, we used a combination of glycomics techniques, including enzyme-linked immunosorbent assay characterization of the plant cell wall fractions, glycosidic linkage analysis (methylation-GC-MS analysis) of CM NSP and their fractions, bacterial growth profiles using minimal media supplemented with CM NSP, and the sequencing and de novo annotation of bacterial genomes of high-efficiency CM NSP utilizing bacteria. The SACCHARIS pipeline was used to select plant cell wall active enzymes for recombinant production and characterization. This approach represents a multidisciplinary innovation platform to bioprospect endogenous CAZymes from the intestinal microbiota of herbivorous and omnivorous animals which is adaptable to a variety of applications and dietary polysaccharides.


Polysaccharide Linkage Analyses
The cell wall of each CM sample was prepared as de-starched AIRs according to a previous report [24], with slight modifications: after extensive dialysis against deionized water, the de-starched sample remaining in the tubing was centrifuged, and the resulting precipitate was collected, lyophilized, and ball-milled.
Starting material (~0.3 mL mg⁻¹) was subjected to sequential fractionation with the following solutions: 50 mM EDTA (pH 6.5), 50 mM Na₂CO₃ containing 25 mM sodium borodeuteride (NaBD₄, 99 atom % D, Alfa Aesar), and 4 M KOH containing 25 mM NaBD₄, for 24 h each at room temperature with gentle magnetic stirring; this procedure was modified from the literature [33]. After each extraction, the soluble fraction was collected by differential centrifugation followed by neutralization. The residue was washed with deionized water three times with centrifugation in between, and all washes were pooled into the corresponding soluble fraction. The EDTA and Na₂CO₃ extracts were pooled. Final fractions were dialyzed extensively against deionized water and lyophilized.
Alternatively, before fractionation the cell wall was pre-treated with 0.25 M NaBD₄ solution (0.2 mL mg⁻¹ of starting material) at 4 °C for 24 h [34], followed by neutralization by slow dropwise addition of 10% (v/v) acetic acid over ice and centrifugation (3000× g, 30 min). The resulting supernatant was pooled with three water washes of the pellet (with centrifugation between washes). The washed pellet was then sequentially fractionated as described above, and the pooled solution was combined with the resulting EDTA and Na₂CO₃ fraction.
Linkage analyses of the two cell walls (cold-pressed and solvent-extracted) and their fractions were performed by carboxyl reduction of uronic acids to their corresponding 6,6-dideuterio neutral sugars [35] followed by GC-MS analysis of partially methylated alditol acetate derivatives according to the published protocol [36], with the following exceptions: the sample solution (1 µL) was injected in splitless mode into an Agilent 7890A-5977B GC-MS system (Agilent Technologies) fitted with a Supelco SP-2380 column (30 m × 0.25 mm × 0.20 µm, Sigma-Aldrich) with a constant helium flow of 0.8 mL min⁻¹, and the optimized oven temperature program started at 55 °C (hold 1 min), followed by increases at 30 °C min⁻¹ to 135 °C, 3 °C min⁻¹ to 180 °C (hold 20 min), 1.5 °C min⁻¹ to 200 °C, and 3 °C min⁻¹ to 255 °C (hold 3 min). Four separate experiments were conducted on each unfractionated cell wall, and three on each fraction.
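As a quick check on the oven program described above, the total GC run time can be worked out from the ramp rates and holds. The sketch below is illustrative only: the names `OVEN_PROGRAM` and `total_run_time` are hypothetical, and steps for which the text gives no hold time are assumed to have a hold of zero.

```python
# Oven program from the text: (ramp rate in °C/min, target in °C, hold in min).
# The first entry is the initial isothermal step, so it has no ramp rate.
# Assumption: steps without a stated hold have a hold time of 0 min.
OVEN_PROGRAM = [
    (None, 55.0, 1.0),
    (30.0, 135.0, 0.0),
    (3.0, 180.0, 20.0),
    (1.5, 200.0, 0.0),
    (3.0, 255.0, 3.0),
]


def total_run_time(program):
    """Return the total duration of an oven program in minutes."""
    total = 0.0
    current_temp = None
    for rate, target, hold in program:
        if rate is not None:
            # Time spent ramping from the previous temperature to the target.
            total += abs(target - current_temp) / rate
        total += hold
        current_temp = target
    return total


run_time_min = total_run_time(OVEN_PROGRAM)
print(f"Total run time: {run_time_min:.1f} min")
```

Under these assumptions the program takes a little over 73 min per injection, which is useful when scheduling the replicate runs.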

Cecal Digesta Collection and Establishment of Bacterial Communities in Continuous-Flow Mini-Bioreactors
Digesta were harvested within 30 min of death from the ceca of three healthy ca. six-month-old mature male broiler breeders obtained from a local producer (i.e., humanely culled by the producer according to industry standards and transported to the Lethbridge Research and Development Centre immediately after euthanasia). Ceca were removed and digesta were harvested in an anaerobic chamber containing an 85:5:10 N₂:H₂:CO₂ atmosphere (referred to as the nitrogen atmosphere hereafter). Harvested cecal digesta were thoroughly mixed, and a subsample of the digesta was used for the direct isolation of bacteria. Subsamples of digesta were also used to establish a broiler cecal bacterial community in continuous-flow mini-bioreactors.
For cecal communities established in bioreactors, digesta were suspended in reduced BRM [37] and added to individual bioreactor vessels at a final concentration of 25% w/v. The mini-bioreactor array consisted of a multi-vessel continuous-flow system [37] situated in a Thermo Forma 1025 anaerobic chamber (Thermo Fisher Scientific Inc., Waltham, MA, USA) with a nitrogen atmosphere. Two 24-channel peristaltic pumps (205U, Watson Marlow, Concord, ON, Canada) with low-flow capabilities provided nutrient exchange to individual vessels. Each vessel had an inflow and an outflow tube, and the peristaltic pumps were set for an intake rate of 2 RPM (3.76 mL/h) and an outtake rate of 4 RPM (7.52 mL/h). Cold-extracted CM (5.0%) or NSP extracted from CM (0.9%) was added to the BRM-digesta mixture in the vessels. A control treatment (CON) consisted of BRM with cecal digesta not supplemented with either NSP or CM. Flow to and from the bioreactor vessels commenced 24 h after the addition of the cecal digesta; inflow BRM was maintained in individual 1 L media bottles, and the medium was replaced daily. The volume of each vessel was maintained at 60 mL, with four replicate vessels per treatment. A micro stir bar (10 mm × 1.5 mm) was placed in each vessel, and vessels were placed on a multi-position stir plate (Variomag® POLY 15, Thermo Fisher Scientific Inc., Waltham, MA, USA) set at 200 rpm to ensure consistent mixing of the vessel contents. The ambient temperature was maintained at 37 °C with top-mounted heaters and was verified using digital thermometers positioned throughout the chamber. Moisture, oxygen, and sulfurous gases were minimized within the chamber using desiccant, palladium, and charcoal catalysts, which were replaced as needed (depending on moisture). Each vessel was fitted with a sampling port, and samples were removed via a rubber septum using a sterile needle fitted to a 5 mL syringe.
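The pump settings and vessel volume given above allow a rough estimate of medium turnover. The following sketch is not part of the original methods: it assumes that turnover is governed by the inflow rate (the faster outflow pump is taken to act only as an overflow drain that holds the volume at 60 mL), and the function names are hypothetical.

```python
# Values stated in the text.
VESSEL_VOLUME_ML = 60.0
INFLOW_ML_PER_H = 3.76   # pump set at 2 RPM
OUTFLOW_ML_PER_H = 7.52  # pump set at 4 RPM


def flow_per_rpm(flow_ml_per_h, rpm):
    """Pump calibration: mL/h delivered per RPM."""
    return flow_ml_per_h / rpm


def retention_time_h(volume_ml, inflow_ml_per_h):
    """Hydraulic retention time (h), assuming inflow limits turnover."""
    return volume_ml / inflow_ml_per_h


calibration_in = flow_per_rpm(INFLOW_ML_PER_H, 2)
calibration_out = flow_per_rpm(OUTFLOW_ML_PER_H, 4)
hrt_h = retention_time_h(VESSEL_VOLUME_ML, INFLOW_ML_PER_H)
turnovers_per_day = 24.0 / hrt_h

print(f"Calibration: {calibration_in:.2f} mL/h per RPM (in), "
      f"{calibration_out:.2f} mL/h per RPM (out)")
print(f"Retention time: {hrt_h:.1f} h; turnovers per day: {turnovers_per_day:.2f}")
```

Both pump settings correspond to the same calibration (1.88 mL/h per RPM), and under the stated assumption each vessel is replaced roughly once every 16 h.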
Subsamples were collected from individual bioreactor vessels at day 0, 1, 2, 3, 4, 6, 8, and 10 using a 22G 7.6 cm hypodermic needle (Air-Tite Products Co., Inc., Virginia Beach, VA, USA) fitted to a 5 mL syringe. To collect the sample, the sterile needle was inserted through the septum of the sampling port of the vessel.

Temporal Characterization of Bacterial Communities in Mini-Bioreactor Vessels
On all eight sample days, 1 mL of the contents from each vessel was placed into individual sterile 2 mL screw-capped tubes, the tubes were centrifuged (13,200× g for 10 min), the supernatants were transferred to new screw-capped tubes, and both the pellets and the supernatants were flash-frozen in liquid nitrogen. Samples were stored at −80 °C until processed. The frozen pellet was re-suspended in 200 µL of lysis buffer containing 10 mg mL⁻¹ lysozyme, and DNA was extracted using the DNeasy Blood and Tissue kit (Gram Positive protocol; Qiagen Inc., Hilden, Germany). DNA from the four replicate bioreactors per treatment was pooled by volume for Illumina sequencing. Extracted DNA was quantified using a Qubit 4 fluorometer (Thermo Fisher Scientific Inc., Waltham, MA, USA), and the 16S rRNA gene was PCR-amplified for metagenomics analyses using the protocol developed by Kozich et al. [38], which targets the V4 region of the 16S rRNA gene with a dual-indexing strategy. The PCR master mix included 12.5 µL of Paq5000 Hi Fidelity Taq Master Mix (Agilent Technologies, Mississauga, ON, Canada), 1 µL of 10 µM forward primer (V4-Read 1 Primer TATGGTAATTGTGTGCCAGCMGCCGCGGTAA), 1 µL of 10 µM reverse primer (V4-Read 2 Primer AGTCAGTCAGCCGGACTACHVGGGTWTCTAAT) (Integrated DNA Technologies, Coralville, IA, USA), 8.5 µL of nuclease-free water (Qiagen Inc., Hilden, Germany), and 2 µL of genomic DNA template (30-50 ng). Reactions were amplified on a thermocycler (Mastercycler Pro S; Eppendorf, Mississauga, ON, Canada) using the following conditions: 95 °C for 2 min; 25 cycles of 95 °C for 20 s, 55 °C for 15 s, and 72 °C for 2 min; and a final elongation cycle at 72 °C for 10 min. Amplicons were purified with AMPure XP beads (Beckman Coulter Diagnostics, Brea, CA, USA), and checked for quality and size with a Bioanalyzer 2100 (Agilent Technologies, Mississauga, ON, Canada). Amplicons were quantified using a Qubit 4.
Samples were normalized to 4 nM, pooled, denatured with NaOH, and further diluted with HT1 (Illumina, San Diego, CA, USA) to produce a 4 pM library for analysis using a MiSeq System (Illumina, San Diego, CA, USA). Twenty percent PhiX control DNA was added to the library as a sequencing control. The library was loaded using a MiSeq Reagent Kit v2 500-cycle, and run on an Illumina MiSeq platform (Illumina, San Diego, CA, USA). The Q30 score for the output data was 89%. QIIME2 [39] was used to process and classify bacterial reads. Raw reads were de-noised with DADA2 [40], and representative sequences and amplicon sequence variants (ASVs) were generated. Samples with a read depth of less than 22,000 were removed from the analyses. A phylogenetic tree of ASV sequences was generated, and the taxonomy of each ASV was identified by using a machine learning classifier pre-trained with the reference SILVA 132 database (silva-132-99-515-806-nb-classifier.qza). Evenness (i.e., Pielou's evenness index) and α-diversity (i.e., Shannon's index) were calculated. In addition, β-diversity was determined using unweighted UniFrac, weighted UniFrac, Bray-Curtis, and Jaccard distance/dissimilarity.
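The master mix composition listed above can be checked with a short calculation of the total reaction volume and the final primer concentration. This is an illustrative sketch: the dictionary keys and function name are hypothetical labels, and only the volumes and stock concentrations come from the text.

```python
# Per-reaction volumes (µL) as stated in the text.
MASTER_MIX_UL = {
    "Paq5000 master mix": 12.5,
    "forward primer (10 µM stock)": 1.0,
    "reverse primer (10 µM stock)": 1.0,
    "nuclease-free water": 8.5,
    "genomic DNA template": 2.0,
}


def final_concentration(stock_conc, added_volume, total_volume):
    """Dilution law C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * added_volume / total_volume


reaction_volume_ul = sum(MASTER_MIX_UL.values())
primer_final_um = final_concentration(10.0, 1.0, reaction_volume_ul)

print(f"Reaction volume: {reaction_volume_ul} µL")
print(f"Final concentration of each primer: {primer_final_um} µM")
```

The components add up to a 25 µL reaction in which each primer is present at 0.4 µM.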
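For readers unfamiliar with the two within-sample metrics named above, the sketch below computes Shannon's index and Pielou's evenness from a vector of ASV counts. It is a minimal illustration and not the QIIME2 implementation; the logarithm base (2 here) is an assumption because the text does not state it, and the example counts are hypothetical.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon's index H = -sum(p_i * log(p_i)) over observed ASVs."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def pielou_evenness(counts, base=2.0):
    """Pielou's evenness J = H / log(S), where S is the number of observed ASVs."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return 0.0  # evenness is not informative for fewer than two ASVs
    return shannon_index(counts, base) / math.log(observed, base)


# Hypothetical ASV counts for two samples (not data from the study).
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]

print(shannon_index(even_sample), pielou_evenness(even_sample))
print(shannon_index(skewed_sample), pielou_evenness(skewed_sample))
```

A perfectly even community reaches the maximum index for its richness (evenness of 1), whereas a community dominated by one ASV scores lower on both metrics.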

Identification of Bacterial Isolates
All selected CM NSP-degrading single-colony bacterial isolates were grown in liquid culture in Columbia medium (Difco) supplemented with 1 mg L⁻¹ hemin at 37 °C in an anaerobic chamber (Coy Lab Products, Grass Lake, MI, USA) under a nitrogen atmosphere. Bacterial cells were harvested by centrifugation at 8000× g for 10 min. Genomic DNA samples were extracted using a 96-well Bacteria Genomic DNA Miniprep Kit (BioBasic Inc., Markham, ON, Canada) and sent for 16S rDNA sequencing (Eurofins Genomics, Toronto, ON, Canada) using degenerate 27F and 1492R primers [42,43] to identify bacterial species.

Evaluation of Growth Proficiency of Isolates on CM NSP
Growth proficiency of Bacteroides isolates was evaluated at 37 °C in an anaerobic chamber (Coy Lab Products, Grass Lake, MI, USA) in an atmosphere consisting of 85% N₂, 10% CO₂, and 5% H₂. Canola meal NSP-utilizing numbered isolates (CMU#) corresponding to Bacteroides spp. (CMU13, CMU19, CMU33, CMU36, CMU103, CMU108, and CMU128) were cultured overnight in Columbia medium (supplemented with 1 mg L⁻¹ hemin). Isolates were then subcultured overnight at 37 °C into MM supplemented with D-glucose (0.5%) to adapt them to nutrient-restricted media. Growth cultures were inoculated from the overnight cultures, in triplicate, in MM supplemented with CM NSP (1%) or D-glucose (0.5%) as a control. The NSP were insoluble, which prevented the use of an anaerobic multiplexed robotic plate reader. Thus, growth profiles were assessed by OD600 nm using manual time point readings at 0, 5, and 24 h, with gentle manual agitation after each time point to promote NSP accessibility and bacterial growth.

CAZyme Phylogenetic Analyses and Target Selection
Prokka results were analyzed locally by dbCAN [47] to determine total CAZyme content (CAZome) for all isolated bacterial species. CAZyme families were chosen based on polysaccharide composition as determined by the glycomics analyses; predicted CAZyme genes from GH28, GH43, GH78, and CE6 families, active on rhamnogalacturonan-I (RG-I) backbone, arabinoxylan (AX), arabinan (AB), and homogalacturonan (HG), were selected from CMU13 and CMU19 isolates analyzed by SACCHARIS [48] to determine phylogenetic relatedness to biochemically characterized sequences. Sequences and accession numbers for characterized proteins were extracted from the CAZy database [49], and all protein sequences were aligned using MUSCLE [50]. ProtTest [51] was used for best-fit model selection using the MUSCLE sequence alignment, and FastTree [52] was used to generate the trees. Phylogenetic trees were annotated using iTOL [53]. Target enzymes were chosen based on distally related sequences, represented by novel clades within the phylogenetic trees. Domain boundaries were manually curated via predictions by dbCAN [54] and InterProScan [55].

Target CAZyme Gene Synthesis and Expression in E. coli
Selected gene targets (with GenBank-assigned locus tags in parentheses), N872 (IAG16_04270), N1089 (IAG16_05330), N2073 (IAG16_10110), N2589 (IAG16_12615), N3394 (IAG16_16585), K696 (IAG19_03410), K2605 (IAG19_12770), and K3550 (IAG19_17440), were codon optimized for expression in E. coli and synthesized as full-length constructs without their native signal peptides but including flanking NdeI and XhoI restriction sites at the 5′ and 3′ ends, respectively (BioBasic Inc.). Genes were synthesized directly into the pET24b vector for expression of C-terminally His₆-tagged protein. Each construct was transformed into E. coli BL21 (DE3) Star cells (Thermo Fisher). Cells were grown in LB Miller broth containing 50 µg mL⁻¹ kanamycin at 37 °C to an OD600 nm of 0.6, at which point protein expression was induced by the addition of IPTG to a final concentration of 1 mM. The cell culture was incubated at 18 °C for 16 h prior to being harvested by centrifugation at 6500× g for 20 min at 4 °C. Cell pellets were stored at −20 °C until needed.

Recombinant Gene Expression and Purification
The cell pellet from 1 L of bacterial culture was thawed and resuspended in 50 mL of lysis buffer (20 mM Tris pH 8.0, 500 mM NaCl, 0.1 mg mL⁻¹ lysozyme). Cells were lysed by sonication for 2 min using 1 s medium-intensity pulses at a power setting of 4.5 (Heat Systems Ultrasonics Model W-225 and probe). Cellular debris was removed by centrifugation at 17,500× g for 45 min at 4 °C, and the supernatant was passed through a 0.45 µm filter. The filtrate was loaded onto Ni-NTA resin and purified by immobilized metal affinity chromatography, whereby recombinant protein was eluted with an increasing gradient of 0-500 mM imidazole in 20 mM Tris pH 8.0 and 500 mM NaCl. Protein was concentrated by centrifugation with 10 kDa (K2605) or 30 kDa (N1089) cutoff Amicon Ultra centrifugal concentrators (Millipore Sigma). His₆-tagged protein was further purified using a HiLoad 16/60 Superdex 200 prep-grade size exclusion column (GE Healthcare, Chicago, IL, USA) in 20 mM Tris pH 7.5, 500 mM NaCl, 2% glycerol. Pure protein fractions were pooled and concentrated. Protein purification was monitored throughout by SDS-PAGE. All recombinant proteins were freshly prepared for downstream enzyme activity assays.

pNP-Sugar Colourimetric Activity Assay
Enzyme activities were assayed against several p-nitrophenol (pNP)-sugar conjugate substrates: pNP-α-D-Manp, pNP-α-L-Araf, pNP-α-L-Arap, pNP-β-D-Xylp, pNP-β-D-Galp, pNP-β-D-Glcp, and pNP-acetate. Reactions contained 1 µM enzyme in 50 mM Tris pH 8.0 and were initiated via the addition of a final concentration of 1 mM pNP-sugar conjugate. Assays were carried out in triplicate in 96-well microtitre plates. Product release was measured by monitoring at OD405 nm (BioTek Instruments) every 1 min for 30 min. Final product concentration was calculated using a calibration curve for the hydrolysis product pNP. The final concentration of DMSO in each reaction did not exceed 2%. Background hydrolysis rates were measured and subtracted from reaction rates.
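The conversion from absorbance readings to background-corrected reaction rates described above can be sketched as follows. All numbers in this example are hypothetical and chosen only for illustration; the function names are likewise assumptions, and a plain least-squares line stands in for whatever curve-fitting the plate reader software performed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def to_concentration(absorbance, slope, intercept):
    """Invert the calibration line A = slope * [pNP] + intercept."""
    return (absorbance - intercept) / slope


def release_rate(times_min, absorbances, slope, intercept):
    """Rate of pNP release (µM/min) from a linear fit of concentration vs. time."""
    concs = [to_concentration(a, slope, intercept) for a in absorbances]
    rate, _ = fit_line(times_min, concs)
    return rate


# Hypothetical pNP standards: concentration (µM) vs. OD405.
std_conc = [0.0, 25.0, 50.0, 100.0]
std_abs = [0.05, 0.30, 0.55, 1.05]
cal_slope, cal_intercept = fit_line(std_conc, std_abs)

# Hypothetical readings taken every 1 min, with and without enzyme.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
enzyme_abs = [0.05, 0.10, 0.15, 0.20, 0.25]
blank_abs = [0.050, 0.051, 0.052, 0.053, 0.054]

# Background hydrolysis is subtracted from the enzymatic rate.
net_rate = (release_rate(times, enzyme_abs, cal_slope, cal_intercept)
            - release_rate(times, blank_abs, cal_slope, cal_intercept))
print(f"Net pNP release rate: {net_rate:.2f} µM/min")
```

In practice the rate would be fitted only over the linear portion of the 30 min time course.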

Thin Layer Chromatography (TLC)
Purified CAZymes were screened for activity on CM AIR and extracted CM NSP. Additionally, commercial sources of polysaccharides and synthetic substrates consistent with substrates identified within CM were tested, including potato RG-I (Megazyme), citrus PGA (Megazyme), wheat arabinoxylan (Megazyme), and a galacturonic acid ladder of mono-, di-, and trisaccharide forms. Reactions contained 1 µM enzyme and 5 mg mL⁻¹ substrate in 20 mM Tris pH 8.0 and were incubated overnight at 37 °C with mild shaking to maximize accessibility of insoluble substrates (i.e., CM AIR and CM NSP). After incubation, the samples were heat treated at 100 °C for 10 min to denature the enzyme and terminate the reaction. Samples were then briefly centrifuged at 8000× g to pellet denatured protein and undigested insoluble substrate particles. Digested samples were spotted (total 6 µL; spotted 3 times with 2 µL each) onto TLC plates (TLC Silica Gel 60; EMD Millipore). The samples were dried between rounds of spotting. Appropriate monosaccharide standards were included as controls (6 µL of 1 mM concentration): D-galactose (D-Gal), L-arabinose (L-Ara), D-xylose (D-Xyl), D-rhamnose (D-Rha), D-glucose (D-Glc), and D-mannose (D-Man). The samples were resolved using a mobile phase of 2:1:1 butanol:acetic acid:H₂O, and dried prior to visualization with an orcinol solution (70:3, acetic acid:sulfuric acid with 1% orcinol) and heating at 100 °C for 3-5 min.

Whole Plant Cell Wall Characterization via Antibody-Based Glycome Profiling
While the monosaccharide composition of CM has been elucidated [8], defined polysaccharide content and linkages within CM NSP remained largely unknown [56]. Enzyme discovery for targeted biomass deconstruction is quite difficult without knowledge of the structure of CM NSP. However, determination of polysaccharide composition in cell wall rich samples such as CM is challenging because of the sheer complexity of structures present and the variety of molecular associations between them. Biochemical approaches enable monosaccharides and glycosidic linkages to be quantified, but these cannot always be assigned with confidence to particular polysaccharides because of the polymer deconstruction that is required for analysis. In contrast, although mAb based techniques such as ELISAs and MAPP are only semi-quantitative, they are based on the extraction of largely intact polysaccharides. Moreover, the glycan epitopes recognized by mAbs typically encompass several linked monosaccharides, and can be highly specific for a particular polysaccharide. Thus, ELISAs and MAPP provide useful information about polysaccharides per se, and changes to those polysaccharides due to enzyme processing.
CM plant cell wall fractions were extracted and analyzed for their glycome profiles using 155 plant cell wall glycan-directed mAbs [25,26] (Figure 1A). These results determined that CM contains all major groups of non-cellulosic polysaccharides typically found in higher plant cell walls, including xyloglucans, xylans, rhamnogalacturonans (RG), and galactans. Mannans were notably absent from the CM, and arabinogalactans were also not as abundant as seen in some other plant cell wall preparations (Figure 1A). Additionally, MAPP was used to screen a targeted panel of glycan-directed mAbs (Figure 1B), presenting results consistent with the broad-panel mAbs; CM AIR and NSP contain glycans consistent with those found in pectins (including RG-I and -II), xyloglucan, and xylan (Figure 1C).

Linkage Analyses of CM NSP
The linkage analyses of cell walls of cold-pressed and solvent-extracted canola meal were conducted for the first time. Sequential fractionation with extractants of increasing strength was applied to the cell walls with and without NaBD₄ pre-treatment. The reducing agent NaBH₄ acts on the reducing ends of polysaccharides and on the carboxyl groups resulting from alkaline cleavage of the esters of uronic acids (e.g., the methyl ester of galacturonic acids in pectins and the ester link of the glucuronic acids of the hemicellulose glucuronoxylan) to prevent oxidative degradation during alkaline extraction [58][59][60]. NaBD₄, a deuterated form of NaBH₄, was used here so that the resulting deuterium labelling could be tracked by GC-MS analysis [35]. As expected, a much higher yield of the EDTA and Na₂CO₃ extract (32.1% versus 16.4%) and a much lower yield of the 4 M KOH fraction (29.0% versus 48.4%) were found in the NaBD₄-pretreated cold-pressed sample than in the untreated one, while only a slight difference in the insoluble residue yield (38.9% versus 35.2%) was observed between the two (Supplemental Figure S2), indicating that NaBD₄ treatment greatly increased the extractability of polysaccharides from the cell wall and enabled the release of more polysaccharides by the relatively weaker extractants EDTA and Na₂CO₃. Similar trends were observed in the solvent-extracted CM fractions.
The cell walls and their fractions were subjected to carboxyl reduction-methylation-GC-MS analysis to determine the composition of linkages from uronic acids and neutral sugars [36]. Results showed that 4-Glcp and 4-GalAp were the two most abundant linkages in the whole cell wall (Figure 2, Supplemental Figure S3, Supplemental Tables S1 and S2), indicating a higher abundance of cellulose and pectins than hemicelluloses in the sample. The 4-GalAp from pectins and 4-Glcp from cellulose were dominant in the pooled EDTA + Na₂CO₃ fraction and the alkali-insoluble residue, respectively, and high amounts of linkages typical of hemicelluloses (e.g., 4-Xylp) were observed in the 4 M KOH fraction, in agreement with previous reports on the fractionation of higher plant cell walls [33,61].
Based on the linkage data (Supplemental Tables S1 and S2), the relative composition of polysaccharides was estimated by assigning glycosidic linkages to polysaccharides (Table 1) and summing the compositions of the assigned linkages for the corresponding polysaccharides following the published protocol [36], with the following modifications: t-Fucp was additionally assigned to arabinogalactan-II (AG-II) according to recent reports [62,63]; extensin was grouped with AG-II, based on the detection of extensin in the cell wall by the monoclonal antibody assay (Figure 1B) and on the consideration that the AG-II structure contains all the glycosidic linkage types of extensin glycoprotein [63]; and 3-Glcp was assigned to callose instead of the mixed-linkage (1,3;1,4)-β-D-glucans that dicotyledonous plants lack [36]. Results showed that the cell wall was rich in cellulose, pectins (homogalacturonan (HG), RG-I), arabinan (AB), and xyloglucan (XG) (Figure 3, Supplemental Figure S4). It is well accepted that linear 5-linked AB are most prevalent in the side chains of RG-I [64][65][66], although they can also be found as part of arabinogalactan glycoproteins (AGPs). Considerable amounts of AG-I and AG-II structures were also found in the cell wall, which could be present as side chain structures of RG-I [64][65][66] or attached to AGPs. Possible cross-linking between AB, AG-I, and AG-II in the cell wall was supported by their co-extraction and enrichment with HG and RG-I in the pectin-rich EDTA and Na₂CO₃ fraction; however, direct evidence is needed from future studies to confirm the existence of such structures in the canola meal sample. XG was the most abundant hemicellulose in the canola meal cell wall, followed by heteroxylan (HX) and heteromannan (HM) in decreasing abundance (Figure 3).
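The fraction yields quoted above for the cold-pressed sample can be checked for mass balance and compared directly. The sketch below only rearranges the percentages given in the text; the dictionary layout and variable names are illustrative choices.

```python
# Fraction yields (% of recovered material) for the cold-pressed cell wall,
# as quoted in the text.
YIELDS = {
    "NaBD4-pretreated": {
        "EDTA + Na2CO3": 32.1,
        "4 M KOH": 29.0,
        "insoluble residue": 38.9,
    },
    "untreated": {
        "EDTA + Na2CO3": 16.4,
        "4 M KOH": 48.4,
        "insoluble residue": 35.2,
    },
}

# Mass balance: the three fractions of each sample should sum to 100%.
totals = {sample: sum(fractions.values()) for sample, fractions in YIELDS.items()}

# Shift in each fraction caused by the pre-treatment (percentage points).
shifts = {
    fraction: YIELDS["NaBD4-pretreated"][fraction] - YIELDS["untreated"][fraction]
    for fraction in YIELDS["untreated"]
}

for sample, total in totals.items():
    print(f"{sample}: {total:.1f}% total")
for fraction, shift in shifts.items():
    print(f"{fraction}: {shift:+.1f} percentage points")
```

Both samples close to 100%, and most of the material gained by the EDTA and Na₂CO₃ extract after pre-treatment is lost from the 4 M KOH fraction, consistent with the interpretation given in the text.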
It was apparent that both the XG and HX were enriched in the strong alkali soluble fraction, which was in good agreement with previous reports [33,61], indicating both XG and HX were intensively cross-linked to the cell wall matrix and therefore strong extractants are needed to break the bonding and release them.
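The estimation step described above (assign each linkage to a polysaccharide, then sum the assigned linkages) can be sketched as follows. This is a simplified illustration, not the published protocol [36]: the mapping includes only assignments mentioned in the text, each linkage is assigned to a single polymer (the full protocol may split a linkage across polymers), and the mol% values are hypothetical.

```python
from collections import defaultdict

# Simplified linkage -> polysaccharide assignments drawn from the text.
# The complete assignment table (Table 1) is not reproduced here.
ASSIGNMENTS = {
    "4-Glcp": "cellulose",
    "4-GalAp": "HG",
    "4-Xylp": "HX",
    "3-Glcp": "callose",
    "t-Fucp": "AG-II",
    "5-Araf": "AB",
}


def polysaccharide_composition(linkage_mol_percent, assignments):
    """Sum linkage abundances per polysaccharide; unknown linkages go to 'unassigned'."""
    totals = defaultdict(float)
    for linkage, value in linkage_mol_percent.items():
        totals[assignments.get(linkage, "unassigned")] += value
    return dict(totals)


# Hypothetical linkage composition (mol%), not data from the study.
example_linkages = {
    "4-Glcp": 30.0,
    "4-GalAp": 25.0,
    "5-Araf": 12.0,
    "4-Xylp": 10.0,
    "3-Glcp": 2.0,
    "t-Fucp": 1.0,
    "other linkages": 20.0,
}

composition = polysaccharide_composition(example_linkages, ASSIGNMENTS)
for polymer, value in sorted(composition.items(), key=lambda kv: -kv[1]):
    print(f"{polymer}: {value:.1f} mol%")
```

Keeping an explicit "unassigned" bucket mirrors the unassigned linkages (UA) category reported alongside the polysaccharide classes.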

[Table 1: polysaccharides and their linkage assignments; table body not recovered.]
[Figure caption, opening text not recovered] ... Table S1) by assigning the linkages of uronic acids and neutral sugars to polysaccharides (Table 1; as in [36]). A NaBD₄ pre-treatment was applied to the cell wall before its sequential fractionation in order to increase the extractability of polysaccharides. Four separate experiments were conducted on the cell wall and three on each fraction. Stacked bar plots are shown, with the mean value from the separate experiments represented as a horizontal bar and bars depicting the standard deviation from the mean. The polysaccharides were assigned as in Table 1: arabinan (AB), type I arabinogalactan (AG-I), type II arabinogalactan (AG-II) and extensin (ET), heteroxylan (HX), xyloglucan (XG), callose (CA), cellulose (CE), rhamnogalacturonan I (RG-I), homogalacturonan (HG), heteromannan (HM), and unassigned linkages (UA).

Temporal Characterization of Broiler Cecal Bacterial Communities in Mini-Bioreactors
There was no difference in α-diversity (p ≥ 0.298) or evenness (p ≥ 0.418) of communities within the mini-bioreactor vessels among the CM, NSP, and control treatments over time. However, a reduction in α-diversity was observed within the first 24 h for all treatments; Shannon's Index values ranged from 6.7 to 6.9 at day 0, and from 4.2 to 4.6 at day 1. From day 1 onward, bacterial diversity was stable (p ≥ 0.083), ranging from 3.3 to 4.9 over the remaining nine days of the experiment. Despite the loss of diversity, bacteria in the phyla Firmicutes and Bacteroidetes were retained, although selection for some taxa occurred over time independent of treatment (Supplemental Figure S5). In this regard, bacteria in the families Bacteroidaceae and Fusobacteriaceae increased over time. The β-diversity (i.e., structure) of bacterial communities was not altered by NSP relative to the control treatment (p ≥ 0.441). In contrast, the structure of communities was altered for the CM treatment as determined by unweighted UniFrac (p = 0.045) and Jaccard (p = 0.055) (Figure 4). However, no or modest quantitative differences in β-diversity were observed among the CM and CON treatments as determined by weighted UniFrac (p = 0.237) and Bray-Curtis (p = 0.067). The observed impact on β-diversity for the CM treatment was attributed in part to selection for bacteria within the Veillonellaceae family (Supplemental Figure S5).
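The contrast drawn above between presence/absence metrics (Jaccard) and abundance-weighted metrics (Bray-Curtis) can be made concrete with a small calculation. The sketch below uses the textbook definitions rather than the QIIME2 implementation, and the count vectors are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - |shared taxa| / |taxa in either|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(present_u & present_v) / len(union)


# Hypothetical counts for four taxa in two samples (not data from the study).
sample_a = [10, 0, 5, 5]
sample_b = [5, 5, 0, 10]

print(bray_curtis(sample_a, sample_b))
print(jaccard_distance(sample_a, sample_b))
```

Because Jaccard ignores abundance, a treatment that adds or removes rare taxa can shift it without a comparable change in Bray-Curtis, which is one way the pattern reported for the CM treatment can arise.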

Enrichment and Isolation of CM NSP-Degrading Bacterial Species
Bacteria were isolated directly from cecal digesta or from the bioreactors (days 3 and 10); 332 isolates able to grow on an agar medium supplemented with MM and CM or CM NSP as a carbohydrate source were recovered. Of the 332 isolates, 325 were culturable in liquid rich media. We identified 100 intestinal bacterial isolates with the ability to degrade and metabolize CM NSP via growth on CM NSP-supplemented agar plates, and subsequent species identification was performed via 16S rDNA sequencing using degenerate 27F and 1492R primers [42,43] (Figure 5A). Enterococcus spp. and Bacteroides spp. were the predominant bacteria able to utilize CM NSP.

Growth Metrics of "High-Efficiency" CM Degraders
Based on the bioreactor community analyses, growth of the Bacteroides isolates in liquid culture was evaluated to identify "high-efficiency" utilizers (Figure 5B). Of the seven isolates, two B. theta isolates (CMU13 and CMU108) and two B. ovatus isolates (CMU19 and CMU33) were capable of moderate growth in CM NSP after 24 h (termed high-efficiency utilizers for our purposes), whereas three B. fragilis isolates (CMU36, CMU103, and CMU128) grew to low density.

Total CAZyme Content of Bacteroides CM-Degrading Isolates
Many members of Bacteroidetes are polysaccharide generalists that consume diverse cell wall polysaccharides. Therefore, the six top-performing B. theta and B. ovatus isolates were selected for genome sequencing and prospecting (Table 2). Functional annotation of assembled contigs and putative protein identification were done using Prokka [46].
Note (Table 2): Quality assessment statistics were determined without reference by QUAST [45]. All statistics are based on contigs of size ≥ 500 bp, unless otherwise noted (e.g., "# contigs (≥0 bp)" and "Total length (≥0 bp)" include all contigs).
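The distinction between statistics computed on all contigs and those restricted to contigs of at least 500 bp can be made concrete with a short sketch. The function below is a simplified illustration, not QUAST itself; the function name and dictionary keys are hypothetical, and N50 is included only as a familiar example of a length-thresholded assembly statistic.

```python
def assembly_stats(contig_lengths, min_len=500):
    """Summarize an assembly from its contig lengths (in bp).

    Counts and total length are reported both for all contigs (>= 0 bp)
    and for contigs of at least `min_len` bp; N50 uses the filtered set.
    """
    kept = sorted((n for n in contig_lengths if n >= min_len), reverse=True)
    total = sum(kept)

    # N50: length of the contig at which the running sum of contigs,
    # taken from longest to shortest, first reaches half the total length.
    n50 = 0
    running = 0
    for length in kept:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "contigs_all": len(contig_lengths),
        "total_length_all": sum(contig_lengths),
        "contigs": len(kept),
        "total_length": total,
        "largest_contig": kept[0] if kept else 0,
        "n50": n50,
    }
```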

Target Selection for CAZymes with Potential Enzymatic Activity on CM NSP
Based on polysaccharide linkage analyses, genes from CAZyme families of interest were selected from the genomes of high growers and analyzed using our in-house pipeline SACCHARIS [48]. Gene sequences from bacterial isolates CMU13 (B. theta) and CMU19 (B. ovatus) for GH28 (pectins), GH43 (arabinoxylans), GH78 (RG-I/II), and CE6 (xylans) enzymes were embedded into phylogenetic trees composed of characterized enzyme sequences (Figure 6). From these trees, enzymatic function can be predicted with higher accuracy. Overall, the majority of GH43 enzymes were found to be highly related both to each other, when comparing those from isolate CMU13 with those from isolate CMU19, and to other characterized enzymes; few isolate gene sequences fell in novel clades, with the two most distantly related groupings identified (Figure 6). GH28, GH78, and CE6 family members have individual outliers, signifying targets that may possess unique enzyme specificities. A total of eight CAZyme sequences from either CMU13 or CMU19 were chosen from the GH28 (N2073, K2605), GH43 (N1089, K696), GH78 (N872, N3394, K3550), and CE6 (N2589) families based on these criteria (Figure 6).
The selected targets were further analyzed by InterProScan [55] to identify any accessory modules of interest (Figure 6). Five of the eight enzymes selected were determined to be multimodular. Of note, the CMU13 GH28 CAZyme N2073 contains a putative carbohydrate lyase domain that was not identified by dbCAN, and CMU13 N1089 contains two tandem GH43 subfamily 2 domains, only one of which is divergent from other GH43 family members in the phylogenetic tree.

Characterization of Novel CM-Targeted CAZymes
Full-length constructs of the selected enzyme targets were cloned for recombinant expression in E. coli and evaluated by small-scale expression tests to determine expression conditions that yield large amounts of soluble protein. Of the eight enzyme targets selected, two full-length enzymes were found to produce soluble recombinant protein: CMU13 N1089, with two tandem GH43 domains, and CMU19 K2605, a GH28 CAZyme.
In order to screen CAZymes for function, colourimetric pNP assays were used. When pNP-sugar conjugate substrates are cleaved by enzymes with the required activity, the products can be detected and quantified spectrophotometrically. The GH43 family enzyme N1089 can function as both an arabinofuranosidase and xylosidase, consistent with other GH43 enzymatic activities (Figure 7A), whereas the GH28 enzyme K2605 demonstrated faint activity on pNP-galactose (data not shown), with no detectable activity on other pNP-sugar substrates.
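One common way to quantify the released pNP from absorbance readings is to interpolate against a standard curve of known pNP concentrations. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, the standards and readings in the usage comment are invented for illustration, and the quantification procedure actually used in this study is not specified here.

```python
def fit_standard_curve(concentrations, absorbances):
    """Least-squares line A = slope * c + intercept through pNP standards."""
    n = len(concentrations)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired standards")
    mean_c = sum(concentrations) / n
    mean_a = sum(absorbances) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def pnp_released(sample_abs, no_enzyme_abs, slope):
    """pNP released by the enzyme, in the concentration units of the standards.

    Subtracting the no-enzyme control removes background absorbance and
    spontaneous substrate hydrolysis; the curve intercept cancels out.
    """
    return max(0.0, (sample_abs - no_enzyme_abs) / slope)


# Hypothetical usage with invented readings (standards in micromolar):
# slope, _ = fit_standard_curve([0, 25, 50, 100], [0.05, 0.30, 0.55, 1.05])
# released = pnp_released(0.65, 0.05, slope)
```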
pNP-conjugate substrates provide an initial activity-based screen for enzymatic function; however, these small molecules do not reflect the complexity of native polysaccharide substrates. Thus, purified N1089 and K2605 were screened for activity on CM AIR and extracted CM NSP. Additionally, numerous commercial polysaccharides and synthetic substrates were tested to detect general activities. The GH43 enzyme N1089 was found to be active on CM AIR, as smaller products consistent with oligosaccharide production can be seen by TLC (Figure 7B). A similar cleavage product cannot be seen on CM NSP, likely due to the highly insoluble nature of the NSP substrate. No other oligo/polysaccharide-based substrates tested were shown to be cleaved by either N1089 or K2605, including RG-I, polygalacturonic acid (PGA), wheat arabinoxylan, 2³-α-L-arabinofuranosyl-xylotriose (A²XX), arabinobiose (Ara₂), and arabinotriose (Ara₃) (data not shown).
Figure 6. Prediction of enzyme activities and target selection by phylogenetic analyses. Genomes from isolates CMU13 (B. theta) (red) and CMU19 (B. ovatus) (blue) were annotated by Prokka [46]. Putative protein sequences were annotated by dbCAN [54] and used as query sequence inputs for SACCHARIS [48] and embedded into phylogenetic trees of characterized GH28, GH43, GH78, and CE6 enzymes. EC number and CAZy database annotated functions are colour-coded as indicated. Target enzymes were chosen based on distally related sequences, represented by novel clades within the phylogenetic trees. Domain boundaries are based on manual curation of predictions by dbCAN [54] and InterProScan [55], with all schematics to scale and colour-coded. The polypeptide length for the open reading frame of each target is shown. Schematics of potential plant cell wall substrates for selected target CAZymes, as identified by EC number, are shown.

Discussion
Previous research and development of commercial enzyme products has shown that the addition of exogenous enzymes to animal feed can improve the digestion of NSP [67-69]; however, these approaches did not optimize the enzymes for unique structural features within the cell walls of feed or to function within the intestine of animals. NSP content (cellulose and non-cellulosic) varies greatly between common feeds in poultry diets [70]. Cereal grains such as corn, wheat, and barley are rich in arabinoxylans and β-glucans, whereas the protein-rich dietary components, soybean and canola meals, are also rich in xyloglucans, galactans, and pectic polysaccharides, in addition to the xylans. Here, we have used an enzyme discovery platform to identify enzymes within the genomes of cecal bacteria isolated from chickens. Importantly, tailored enzyme discovery required the development of a comprehensive analytic pipeline to fully characterize the structure of the substrate and to select for relevant enzyme activities. Accordingly, we conducted the first in-depth glycomic analysis of CM and CM NSP cell walls through a holistic strategy involving glycome profiling and glycosidic linkage analyses.
Canola is the registered name for the rapeseeds of several cultivars of Brassica napus species, and CM is superior to its predecessor rapeseed meal (RSM) in nutritional values [1]. Previously, CMs from different varieties and lines were found to contain 8-10% sucrose, 2-3% oligosaccharides, 20-22% NSP, and 5-8% lignin and polyphenols. It was reported that CM was rich in cellulose, hemicelluloses, and pectins based on neutral sugar analysis using GC and total uronic acid colorimetric assay [56].
There have been very few reports on the structural characterization of cell wall polysaccharides of either RSM or the closest genetic relative to canola meal, Brassica campestris [71]. Oil-free de-hulled RSM contains arabinogalactan (1%), arabinan (2%), pectin (14.5%), cellulosic residue (7%), and small amounts of xylans [72]. The de-fatted seed meal of B. campestris is rich in AB, RG-I, AGPs, XGs, and xylans [73]. Both RSM and B. campestris seed meals have similar compositions with respect to pectins, AG, AB, glucuronoxylans, and XG with minor differences in some of the detailed structural features of AB and XG [71]. Linkage analysis was used in the previous studies, but uronic acids were not treated by carboxyl reduction before methylation analysis, resulting in the detection of only neutral sugar linkages. Notably, pectins were observed in several sequential extracts even in the final residue after alkaline extraction indicating a rigid matrix of the seed meals, suggesting that pre-treatments are needed to loosen the cell wall structure for increased digestibility by monogastric animals [71], further supported by an in vivo study [74].
We are not aware of any structural studies of the cell wall composition of CM beyond its monosaccharide composition and inferred polysaccharide content [56]. The first step of conventional monosaccharide composition analysis is acid hydrolysis or methanolysis of polysaccharides to their constituent monosaccharides, and it is widely accepted that this initial hydrolysis can yield an incomplete release of monosaccharides from some cell wall polysaccharides (e.g., the release of glucose from insoluble crystalline cellulose or galacturonic acids from pectins) and degradation of released monosaccharides [75,76]. Unlike direct hydrolysis of underivatized samples, per-methylated polysaccharides can be completely hydrolyzed by relatively mild acid such as 2 M TFA without signs of degradation [36,77]. In the current study, uronic acids in the CM cell walls were carboxyl-reduced to neutral sugars with deuterium labelling, followed by methylation-GC-MS analysis of the carboxyl-reduced samples, which enabled not only complete hydrolysis of the insoluble pectin-rich cell walls to monosaccharides, but also the determination of linkage patterns of the released monosaccharides. Based on the linkage composition data, estimation of polysaccharide composition was performed following a published protocol [36]. Glycosidic linkage analysis determined that CM NSP are rich in pectins (RG-I, AB, galactans, HG, and arabinogalactans), XG, and AX (Figure 3). In addition to linkage analysis, the CM cell walls were comprehensively characterized using two glycome profiling methods, one based on ELISAs using a large and diverse collection of mAbs, the other based on spotted arrays, using a smaller set of mAbs.
The ELISA-based glycome profiling showed the presence of non-cellulosic polysaccharides typical of higher plant cell wall preparations (xyloglucans, xylans, pectins, galactans and arabinogalactans), although mannan epitopes were largely absent, and arabinogalactan epitopes appear underrepresented compared with other plant cell walls (Figure 1A). The array-based glycome profiling also revealed the prominent presence of xyloglucans and arabinogalactans, with a lesser amount of a galactoglucomannan epitope (Figure 1B). However, the spotted array did not detect appreciable levels of HG, in contrast to the ELISA-based results, where HG epitopes were easily detected. Interestingly, destarching appears to modify CM cell walls, as evidenced by lower detection of xyloglucan (LM15) and AGP (LM2) epitopes, and slightly higher detection of the mannan epitope recognized by LM21 (Figure 1B). Amylase has been previously shown to extract NSP [78]. Both glycome profiling methods reveal the polymeric complexity of NSP prepared from CM, illustrating a highly diverse set of epitope structures within each of several polysaccharide classes. In addition, the glycome profiling revealed that xylans and xyloglucans, as well as some RG-I and AGP epitopes, are tightly bound to the cell wall matrix and require harsh conditions to release them from the wall. Homogalacturonans appear to be less tightly integrated into the CM walls, since the majority of these epitopes are released prior to the chlorite step in the wall extraction sequence.
The high-resolution methods used herein, glycome profiling and glycosidic linkage analysis, have now revealed the true complexity of NSP within CM (Figure 1), and the digestion of these complex dietary fibres is likely to require the addition of appropriate enzymatic activities. Some candidate types of enzymes are suggested by the data presented here. Bi-functional enzymes offer a more comprehensive breakdown strategy encoded within a single polypeptide, where the presence of multiple CAZyme domains of complementary specificities could work to more efficiently digest complex polysaccharides [79] (e.g., the two tandem GH43 domains of N1089). In addition, carbohydrate-binding modules were found as part of a number of the selected CAZyme targets, which may help increase the concentration and/or targeting of these enzymes to their substrates [80]. Additionally, polysaccharide linkages within the insoluble NSP may be inaccessible to the target enzymes and/or CM isolates under the assay and growth conditions used here. While bacterial CM isolates were able to utilize CM NSP as a sole carbohydrate source, these cultures reached relatively low density when compared to growth on glucose (Figure 5B) or to that of wild-type B. theta VPI-5482 on complex polysaccharides, such as pectins [81]. A comprehensive approach involving the use of an enzymatic cocktail with multiple complementary activities may be required to more effectively digest CM NSP for use as livestock feedstocks.
Novel enzymatic activities may also be required for efficient breakdown of CM, tailored to specific linkages found in NSP (Figure 6). With this in mind, two enzymes from CM-utilizing Bacteroides isolates, N1089 and K2605, were selected and screened for enzymatic activity. N1089 was discovered to be an arabinofuranosidase and xylosidase, both activities directly relevant to the degradation of arabinosyl- and xylosyl-containing polysaccharides or sidechains found in CM NSP (Figures 1-3). Additionally, we were able to demonstrate activity of N1089 on CM AIR. As the genetic construct assayed with the pNP-substrate screen is a full-length enzyme, it is possible that the dual arabinofuranosidase/xylosidase activity seen is the result of one domain of N1089 functioning as an arabinofuranosidase, while the other may possess xylosidase activity. Indeed, the C-terminal module ("b") of N1089 was found to cluster with α-L-arabinofuranosidases active on arabinan (Figure 6), suggesting that the N-terminal module ("a") may be responsible for cleaving xylan backbone linkages common to CM plant cell walls (Figures 1 and 3). K2605 was identified by dbCAN as a GH28 and clustered with numerous other endo-polygalacturonases (Figure 6). It is thus likely that the K2605 enzyme is a polygalacturonase, but may accommodate a pNP-galactose substrate in the -1 subsite. Unfortunately, the pNP-α-galacturonide substrate is not commercially available, which prevents screening K2605 against this synthetic substrate. Regardless, the galactosidase/galacturonase activity of K2605 may assist with dismantling the homogalacturonan found in CM NSP (Figures 1 and 3). The two enzymes presented here may represent valuable targets for further product development, perhaps as part of a combinatorial strategy to dismantle the complex multitude of polysaccharides present in CM.
Surprisingly, species not commonly associated with complex carbohydrate utilization were also isolated from CM-NSP medium, including Enterococcus faecalis, E. faecium, E. cecorum, and E. avium (Figure 5). In all likelihood, these microorganisms do not possess the enzymatic machinery required to utilize CM NSPs. Inspection of their genomes revealed that E. faecium and E. avium possess one GH43 and one GH78 enzyme, while the genomes of E. cecorum and E. faecalis do not [49]. The genome of E. faecium does encode two GH28 enzymes; however, whether these enzymes are secreted and active on CM NSPs is not known. As isolations were performed on rich media after initial minimal media selections, it is possible that these species were able to grow on contaminating monosaccharides in the media or on products generated by neighbouring colonies from Bacteroides sp. Importantly, two of the enzymes identified in this study were found to possess N-terminal signal peptides, suggesting that at least some CM NSP-relevant enzymes may be secreted for extracellular function (Figure 6).

Conclusions
The discovery of enzymes that improve the digestion of feed by livestock has been difficult to achieve in the past. The majority of commercially available enzyme additives have been repurposed for applications in biofuel production [82,83]; thus, they have not been optimized for dietary substrates or to function within the intestine of animals. To address both of these limitations, we have developed an enzyme discovery platform that combines glycomic characterization of complex substrates to inform enzyme selection, selective isolation of bacteria that colonize the cecum of the host species, in silico identification of enzymes from sequenced genomes, and biochemical characterization of recombinantly produced target enzymes. In this study, two promising enzymes, N1089 and K2605, were identified from chicken-associated Bacteroides spp. that are active on CM polysaccharides. This enzyme discovery platform may provide a new route for informed biocatalyst development that can be adapted to a variety of other plant sources, host species, and industrial applications.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-2607/8/12/1888/s1, Figure S1: Enzyme discovery platform for the tailored degradation of non-starch dietary polysaccharides, Figure S2: Fractionated yield of cold-pressed and solvent-extracted canola meal (CM) plant cell wall extractions for glycomic linkage analyses, Figure S3: Cell wall linkage composition of solvent-extracted canola meal, Figure S4: Estimated polysaccharide composition of solvent-extracted canola meal, Figure S5: Temporal relative abundance of primary bacterial families in CON, NSP, and CM treatment bio-reactor vessels, Table S1: Linkage composition of cold-pressed canola meal whole cell wall (AIR) and fractions, Table S2: Linkage composition of solvent-extracted canola meal whole cell wall (AIR) and fractions, Table S3: CAZome of bacterial isolates for the GH5 and GH43 subfamilies.