Clostridium manihotivorum sp. nov., a novel mesophilic anaerobic bacterium that produces cassava pulp-degrading enzymes

Background Cassava pulp is a promising starch-based biomass, which consists of residual starch granules entrapped in plant cell walls containing the non-starch polysaccharides cellulose and hemicellulose. Strain CT4T, a novel mesophilic anaerobic bacterium isolated from soil collected from a cassava pulp landfill, has a strong ability to degrade polysaccharides in cassava pulp. This study explored a rarely described species within the genus Clostridium that possesses a group of cassava pulp-degrading enzymes. Methods Strain CT4T was identified based on phylogenetic, genomic, phenotypic and chemotaxonomic analyses. The complete genome of strain CT4T was obtained by whole-genome sequencing on both the Illumina and Oxford Nanopore Technology (ONT) platforms, followed by assembly and annotation. Results Analysis of the 16S rRNA gene sequence indicated that strain CT4T is a species of the genus Clostridium. Analysis of the whole-genome average amino acid identity (AAI) of strain CT4T and the other 665 closely related species of the genus Clostridium showed that strain CT4T was clearly separated from the others. The genome consisted of a 6.3 Mb circular chromosome with 5,664 protein-coding sequences and contained a set of genes encoding amylolytic, hemicellulolytic, cellulolytic and pectinolytic enzymes. A comparative genomic analysis of strain CT4T with C. amylolyticum SW408T, a closely related species with available genomic information, showed that strain CT4T contained more genes encoding cassava pulp-degrading enzymes, which comprised a complex mixture of amylolytic, hemicellulolytic, cellulolytic and pectinolytic enzymes. This work demonstrates the potential of strain CT4T for the saccharification and utilization of cassava pulp. 
Based on phylogenetic, genomic, phenotypic and chemotaxonomic data, we propose a novel species for which the name Clostridium manihotivorum sp. nov. is suggested, with the type strain CT4T (= TBRC 11758T = NBRC 114534T).


INTRODUCTION
The bio-based economy is an emerging sector with notable potential for economic growth and promising business opportunities. It is generally defined as the sustainable exploitation and management of renewable natural resources for producing bio-based products. Recently, biorefineries that utilize lignocellulosic and other organic raw materials to generate a spectrum of bio-based products, such as biofuels, biochemicals and other high value-added products, have attracted attention (FitzPatrick et al., 2010). Biomass feedstocks are grouped into two categories, carbohydrate-rich and oleaginous (Melero, Iglesias & Garcia, 2012). Carbohydrate-rich feedstocks contain starch and non-starch polysaccharides (NSP). Industrial starch-rich by-products such as cassava pulp, wheat bran, rice bran, sago pith residues and brewery-spent grains are available in enormous quantities and vary in their starch and NSP (hemicellulose and cellulose) contents (Cripwell et al., 2015). These materials are potential feedstocks for bio-based production; however, they must first undergo a pretreatment process to enhance the production of biofuels, organic acids and other valuable biochemicals (Cripwell et al., 2015; Zhang et al., 2016). The starch granules in starch-rich by-products are tightly entrapped in the secondary cell wall structure by cellulose, hemicellulose and lignin; thus, the starch cannot be easily released for further conversion (Apiwatanapiwat et al., 2016). Moreover, the costs associated with the pretreatment process, such as energy, equipment and wastewater treatment costs, have slowed the adoption of the technology.
Thailand is a major cassava producer for the domestic and global markets. Cassava starch factories in Thailand generate approximately 1.5-2.0 million tons of waste cassava pulp annually (Norrapoke et al., 2018). Most of the cassava pulp ends up in landfills, resulting in environmental pollution. The pulp spoils rapidly in the humid, warm tropical environment and, under anaerobic conditions, generates methane, thus contributing to global warming. It also leaches into the soil, enters water sources and degrades the air quality near the cassava starch factories, which consequently affects human health. Utilizing rather than discarding cassava pulp will, therefore, reduce the negative impact on environmental and human health. On a dry weight basis, cassava pulp is mainly composed of starch (50-60%, w/w), with 15-27% (w/w) cellulose and hemicellulose, 7.0-7.3% (w/w) pectin and 3.4-4.6% (w/w) lignin (Djuma'ali et al., 2012; Vaithanomsat et al., 2013). In general, the saccharification of cassava pulp to fermentable sugars used in the production of high value-added products requires the action of enzymes belonging to species, and to date, the availability of low-cost, high-performance sequencing continues to expand the diversity of research and applications on a genome-scale. Advances in next-generation sequencing technologies, e.g., the Illumina (Bentley et al., 2008) and Oxford Nanopore Technology (ONT) platforms (Clarke et al., 2009), have been applied to sequencing the full-length genetic information of many organisms by generating short- and long-read sequence data that enable the accurate identification of species-level taxonomy and allow the de novo assembly of complete genomes. The combination of genomic and phenotypic information will allow a faster and more reliable classification of new isolates of microorganisms (Chun & Rainey, 2014).
In this study, we isolated a novel mesophilic anaerobic bacterium, Clostridium manihotivorum CT4T, from the soil of a cassava pulp landfill. The isolated strain efficiently degraded cassava pulp, a by-product of the cassava starch industry. The phenotypic and biochemical characteristics of the isolated strain are reported. To better understand the genetic basis of cassava pulp degradation by strain CT4T, its genome was completely sequenced using the Illumina and ONT platforms. The genome analysis of strain CT4T identified a set of genes encoding amylolytic, hemicellulolytic and cellulolytic enzymes critical to its ability to degrade cassava pulp, a capability that is rarely found in Clostridium species.

MATERIALS AND METHODS
Preparation of samples and basal medium
Samples of cassava pulp and soil beneath the waste heap were obtained from a starch factory landfill in Chonburi Province, Thailand. The pulp was ground with an ultra centrifugal mill ZM-100 and sieved through a 0.5 mm mesh screen (Retsch, Haan, Germany). The pulp was washed several times with distilled water to remove the remaining sugar and dirt, oven-dried at 50 °C to a constant weight and then stored in plastic bags at 4 °C for further experiments.

Screening and isolation of cassava pulp-degrading strains
The enrichment and isolation were performed under anaerobic conditions. Approximately 1 g of the soil sample was transferred into Hungate tubes containing 15 mL BM7 (pH 7.0) and 1% (w/v) cassava pulp. After inoculation, each test tube was flushed with N2 and incubated at 37 °C. The culture that showed the highest degradation of pulp, as visually indicated by the remaining cassava pulp (approximately ≤ 50% residue dry weight), was selected and serially diluted into agar-cassava pulp medium that had been melted and cooled to 55 °C beforehand. The cultures were then subjected to the roll-tube technique for isolating obligate anaerobes (Hungate, 1969), after which the solidified samples were incubated at 37 °C. Single colonies were isolated with sterile needles and inoculated into BM7 broth containing cassava pulp. Afterward, the cultures were incubated to study their ability to degrade the cassava pulp. Pure cultures were obtained following repeated sub-culturing (ten times) in BM7 containing cassava pulp.
The compositions of cassava pulp and of the residual cassava pulp digested by C. manihotivorum CT4T, C. polyendosporum PS-1T and C. amylolyticum SW408T were analyzed following the National Renewable Energy Laboratory (NREL) protocol (Sluiter et al., 2008).

16S rRNA gene sequencing
Genomic DNA for 16S rRNA gene sequencing was prepared by phenol-chloroform extraction. The 16S rRNA gene was amplified by PCR using the following primers: 8F (5′-AGAGTTTGATCCTGGCTCAG-3′) and 1492R (5′-GGTTACCTTGTTACGACTT-3′). The PCR conditions were as follows: 94 °C for 3 min; 35 cycles of 94 °C for 30 s, 55 °C for 40 s and 72 °C for 2 min; and a final extension of 5 min at 72 °C. The amplified fragment was ligated into the pGEM-T Easy vector (Promega, Madison, WI, USA), and the recombinant plasmid was sequenced using T7 and SP6 primers. A sequence similarity search was performed using the BLAST program (http://blast.ncbi.nlm.nih.gov/Blast.cgi). A phylogenetic tree was generated by the neighbor-joining method with 1,000 bootstrap replications using MEGA version 6.0 (Tamura et al., 2013).

Physiological and biochemical analysis
Gram staining of strain CT4T was conducted using the conventional methodology and confirmed using the KOH test (Powers, 1995). Endospores were stained by the Schaeffer-Fulton method (Schaeffer & Fulton, 1933). Cell morphology was observed with a scanning electron microscope (SEM; model SU800, Hitachi, Japan). Substrate utilization was tested by growing the strain in BM7 containing 0.5% (w/v) of one of the following substrates: D-glucose, D-galactose, D-arabinose, D-rhamnose, D-mannose, D-xylose, D-fructose, D-trehalose, D-raffinose, lactose, sucrose, maltose, cellobiose, mannitol, soluble starch from potato (ACS reagent), birchwood xylan, cellulose powder (Type 20) and Avicel® (PH-101); these chemicals were purchased from Sigma-Aldrich, Saint Louis, MO, USA. Raw cassava starch and cassava pulp were obtained from a starch factory landfill in Chonburi Province, Thailand. After 5 days of incubation, cell growth was assessed by determining the optical density at 600 nm. The fermentation products in the supernatant of the glucose-grown culture were determined by gas chromatography (GC-14A; Shimadzu, Japan) with a flame-ionization detector and a Carbopack B-DA/4% Carbowax 20M column. The column, injector and detector temperatures were 170, 230 and 230 °C, respectively. Other biochemical tests were conducted according to the methods of Holdeman, Cato & Moore (1977) and Summanen et al. (1993). The isomers of diaminopimelic acid (DAP) in the cell wall were determined as described by Komagata & Suzuki (1988). Cellular fatty acids were extracted, methylated and analyzed using the standard microbial identification system (MIDI) protocol (Sherlock microbial identification system, version 6.1), and the fatty acids were identified using the TSBA6 database of the microbial identification system (Sasser, 1990). The polar lipids were analyzed from freeze-dried cells by two-dimensional TLC, as described by Minnikin et al. (1984). 
Appropriate detection reagents were used to visualize the spots: phosphomolybdic acid reagent 5% (w/v) solution in ethanol (Sigma-Aldrich, Saint Louis, MO, USA) was used to detect total polar lipids; ninhydrin reagent (0.2% solution; Sigma-Aldrich) was used to detect amino lipids; Dittmer and Lester reagent (molybdenum blue reagent, 1.3%; Sigma-Aldrich) was used to detect phospholipids and Dragendorff's reagent (Sigma-Aldrich) was used to detect phosphatidylcholine.

Cultivation and enzyme production
Strain CT4T was anaerobically cultivated in 1,000 mL of BM7 containing 1% (w/v) cassava pulp for 3 days at 37 °C, pH 7.0 under static conditions in an anaerobic chamber (Bactron II, USA). The culture supernatant was collected by centrifugation at 10,000 × g for 10 min at 4 °C and subsequently concentrated using a hollow fiber cartridge with a 10-kDa-cutoff membrane (GE Healthcare, USA). The retentate (concentrated 50-fold) was then used as the crude enzyme.

Enzyme assays and protein determination
All enzyme assays were performed in 50 mM sodium phosphate buffer (pH 7.0) and incubated at 55 °C for 10 min. The enzymatic activities on 1% (w/v) cassava pulp, soluble starch, pullulan, birchwood xylan, cellulose powder (Type 20) and pectin from citrus peel were assayed by determining the amount of reducing sugars liberated, using the Somogyi-Nelson method (Nelson, 1944). One unit (U) of enzyme activity was defined as the amount of enzyme that released 1 µmol of reducing sugar in 1 min under the assay conditions. The protein concentration was determined by the Lowry method (Lowry et al., 1951) using bovine serum albumin as a standard.
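The unit definition above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: the function names and the example quantities (5 µmol of reducing sugar, 0.5 mg of protein) are hypothetical and chosen only to show how units (U) and specific activity (U/g protein) follow from the definition.

```python
def enzyme_units(reducing_sugar_umol: float, minutes: float) -> float:
    """Units (U): micromoles of reducing sugar released per minute."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return reducing_sugar_umol / minutes


def specific_activity(units: float, protein_g: float) -> float:
    """Specific activity expressed as U per gram of protein."""
    if protein_g <= 0:
        raise ValueError("protein mass must be positive")
    return units / protein_g


# Hypothetical numbers: 5 umol of reducing sugar released during the
# 10-min assay by a sample containing 0.5 mg (0.0005 g) of protein.
example_units = enzyme_units(5.0, 10.0)
example_specific = specific_activity(example_units, 0.0005)
print(f"{example_units:.2f} U, {example_specific:.1f} U/g protein")
```

Reporting activity per gram of protein, as in this sketch, makes crude enzyme preparations of different concentrations comparable.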

Library preparation and genome sequencing
The genomic DNA was extracted from the cultures using a blood and cell culture DNA midi kit (Qiagen, Germany) according to the manufacturer's instructions. Strain CT4T was sequenced using two sequencing platforms: Illumina (HiSeq 2500) and ONT (MinION). Paired-end DNA libraries for Illumina sequencing were prepared following the manufacturer's instructions (NEBNext® Ultra™ DNA library prep kit). The size-selected, adaptor-ligated DNA fragments were PCR-amplified using the following protocol: polymerase activation (98 °C for 30 s), followed by 10 cycles (denaturation at 98 °C for 10 s, annealing at 65 °C for 75 s and extension at 65 °C for 75 s) with a final 5 min extension at 65 °C. The DNA libraries were purified with magnetic beads, and their size distribution was checked using the Agilent Bioanalyzer DNA high sensitivity chip assay. The DNA fragments were sequenced on the Illumina HiSeq 2500 with the 2 × 150 bp paired-end protocol (Illumina, Inc., California, USA). The ONT library preparation and bioinformatics analysis were performed according to Jenjaroenpun et al. (2018). In brief, 600 ng of genomic DNA was used as input for a rapid sequencing kit (SQK-RAD002) to generate the DNA sequencing library. The library was then loaded onto a FLO-MIN106 flow cell on a MinION Mk1B (Oxford, UK), and DNA sequencing was performed for 48 h. Base-calling was performed locally using Albacore version 1.2.3 (ONT, USA).

Genome assembly and annotation
The high-quality ONT reads (average quality score of >7) were first assembled using a combination of Minimap2 (Li, 2018) and Miniasm (Li, 2016), resulting in a circular draft genome. The draft genome was polished using Racon (Vaser et al., 2017) and Nanopolish (Loman, Quick & Simpson, 2015), based on the consensus pileup of high-quality ONT reads, and additionally using Pilon (Walker et al., 2014), based on the short Illumina reads.
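The read-quality threshold mentioned above can be sketched as a simple filter. The text does not state how the average quality score was computed, so this sketch makes an assumption: the mean is taken over per-base error probabilities and converted back to the Phred scale, with Phred+33 encoded quality strings. The function names and the toy reads are hypothetical.

```python
import math
from typing import Iterable, Iterator, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality string)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean read quality, averaged over per-base error probabilities."""
    if not quality:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(reads: Iterable[Read], min_q: float = 7.0) -> Iterator[Read]:
    """Yield only reads whose mean quality is strictly above min_q."""
    for name, seq, qual in reads:
        if mean_phred(qual) > min_q:
            yield name, seq, qual


# Toy reads: '+' encodes Q10 and '#' encodes Q2 under Phred+33.
toy_reads = [("read_hi", "ACGT", "++++"), ("read_lo", "ACGT", "####")]
kept_names = [name for name, _, _ in filter_reads(toy_reads)]
print(kept_names)
```

Averaging error probabilities rather than raw Phred scores penalizes reads that contain stretches of very poor bases; a plain arithmetic mean of the scores would be a simpler alternative.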

Average amino acid identity analysis
The first analysis comprised pairwise comparisons of the AAIs (Konstantinidis & Tiedje, 2005) of the 665 genomes belonging to the class Clostridia (Cabal et al., 2018). For each pair of genomes, the AAI was calculated from the identities of all conserved reciprocal best matches; this calculation was not always symmetrical. In such cases, the average of the two AAI values was assigned to the pair of genomes. The AAI tree was built with BIONJ (Gascuel, 1997) from the AAI dissimilarity values (100% minus AAI).
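The symmetrization and distance conversion described above can be sketched as follows. This is an illustration only: the dictionary layout, function names and genome labels are hypothetical, and the tree-building step itself (BIONJ) is not reproduced here.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def symmetric_aai(directional: Dict[Pair, float]) -> Dict[Pair, float]:
    """Average AAI(a->b) and AAI(b->a) into one value per unordered pair."""
    symmetric: Dict[Pair, float] = {}
    for (a, b), value in directional.items():
        if a == b:
            continue  # self-comparisons carry no distance information
        # Fall back to the one-way value if the reverse comparison is absent.
        reverse = directional.get((b, a), value)
        symmetric[tuple(sorted((a, b)))] = (value + reverse) / 2
    return symmetric


def dissimilarity(symmetric: Dict[Pair, float]) -> Dict[Pair, float]:
    """Convert AAI percentages into distances (100% minus AAI)."""
    return {pair: 100.0 - aai for pair, aai in symmetric.items()}


# Hypothetical directional AAI values (percent) for two genomes.
toy_aai = {("genome_A", "genome_B"): 62.0, ("genome_B", "genome_A"): 64.0}
toy_distances = dissimilarity(symmetric_aai(toy_aai))
print(toy_distances)
```

A distance matrix of this form is the kind of input that distance-based tree methods such as BIONJ expect.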

Comparison of glycoside hydrolase-producing genes in strain CT4T with related species
Strain CT4T (GenBank accession number CP025746) was compared with the closely related C. amylolyticum SW408T (NZ_FQZO00000000), whose genomic information was available in the NCBI database, using OrthoMCL (Chen et al., 2006) to characterize their specific genetic features and identify overlaps among orthologous clusters. The protein sequences were grouped into gene families encoding amylolytic, hemicellulolytic and cellulolytic enzymes using the following criteria: E-value <1E-5 and sequence identity >50%. The genomic information of C. polyendosporum PS-1T was not available in the NCBI database; therefore, strain PS-1T was excluded from the genome comparison.
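The two grouping criteria stated above amount to a pair of strict thresholds on each sequence match. The sketch below illustrates only that filtering step and is not a reimplementation of OrthoMCL; the `Hit` record, the `passes` function and all identifiers in the toy data are hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent sequence identity
    evalue: float


def passes(hit: Hit, max_evalue: float = 1e-5, min_identity: float = 50.0) -> bool:
    """Keep a hit only if E-value < 1E-5 and identity > 50% (both strict)."""
    return hit.evalue < max_evalue and hit.identity > min_identity


# Hypothetical hits; none of these identifiers come from the study.
toy_hits: List[Hit] = [
    Hit("query_1", "subject_a", 72.4, 1e-40),  # passes both criteria
    Hit("query_2", "subject_b", 38.0, 1e-30),  # identity too low
    Hit("query_3", "subject_c", 65.0, 1e-3),   # E-value too high
]
kept_hits = [hit for hit in toy_hits if passes(hit)]
print([hit.query for hit in kept_hits])
```

Because both comparisons are strict, a hit with exactly 50% identity is excluded, matching the ">50%" wording.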

RESULTS
Isolation and identification of cassava pulp-degrading bacterium
In total, 15 individual colonies were isolated by the roll-tube technique and were separately subcultured 10 times in BM7 with cassava pulp as the carbon source. Visual inspection of the roll tubes revealed that isolate CT4T performed best in cassava pulp degradation. Moreover, approximately 60.8% (w/v) of the dry weight was removed when it was cultured in BM7 broth. Isolate CT4T almost completely utilized the starch contained in cassava pulp after 5 days of culturing: it removed 99.0% of the starch, while the cellulose and hemicellulose contents were reduced by 42.2% and 39.2%, respectively, compared with the control. In addition, this strain removed starch and non-starch polysaccharides in cassava pulp better than the related strains PS-1T and SW408T (Fig. S2). It was therefore selected for further analysis. Prior to genome sequencing, the 16S rRNA gene sequence of strain CT4T (accession number MH879026) was compared with the nucleotide sequences in NCBI. The analysis revealed that strain CT4T shared 95% sequence identity with Anaerobacter polyendosporus PS-1T, now reclassified as Clostridium polyendosporum comb. nov. (Duda et al., 1987; Stackebrandt et al., 1999), and 94% sequence identity with C. amylolyticum SW408T (Song & Dong, 2008), Clostridium putrefaciens DSM 1291T (Sturges & Drake, 1927) and Clostridium algidicarnis NCFB 2931T (Lawson et al., 1994). Phylogenetic analysis based on 16S rRNA gene sequences and the neighbor-joining method indicated that strain CT4T belongs to the genus Clostridium (Fig. 1). Therefore, isolate CT4T was classified as Clostridium sp. CT4T.

Physiological and biochemical characteristics of strain CT4T
The SEM image revealed that cells of strain CT4T were rod-shaped and surrounded by a polysaccharide capsule (Fig. 2). Strain CT4T was Gram-positive, single endospore-forming, non-motile and non-flagellate (Table 1). To determine the optimal growth conditions, strain CT4T was cultivated under different pH (pH 4.0-11.0) and temperature (25-50 °C) conditions. Strain CT4T grew over a wide range of temperatures (25-45 °C) and pH values (5.5-7.5) in BM7 medium containing 1% (w/v) cassava pulp. Optimum growth of strain CT4T was found at 37 °C and pH 7.0. Moreover, strain CT4T used a wide range Butanol was not observed during growth, whereas it was produced in the closest relative, C. polyendosporum PS-1T (Table 1). In contrast, C. putrefaciens, isolated from spoiled ham (Sturges & Drake, 1927), and C. algidicarnis, isolated from vacuum-packed refrigerated pork (Lawson et al., 1994), cannot hydrolyze starch despite their relatedness to strain CT4T (Table 1). Strain CT4T contained LL-diaminopimelic acid (LL-DAP) in its cell wall, whereas most members of the genus Clostridium contain meso-diaminopimelic acid. In this respect, strain CT4T differed from the other related strains except C. putrefaciens, which shares this feature with strain CT4T. The cellular fatty acid profile of strain CT4T is listed in Table S1. The major fatty acids detected in strain CT4T were C16:0 (37.4%), C14:0 (15.0%), anteiso-C15:0 (5.5%), summed feature 1 (C13:0 3-OH and/or C15:1 iso H; 4.5%), C19:0 cyclo ω8c (4.2%) and C17:0 2-OH (4.0%). In terms of its polar lipid profile, strain CT4T contained phosphatidylethanolamine (PE) and phosphatidylglycerol (PG) as the major polar lipids, while phosphatidylcholine (PC) was found as a minor polar lipid. Additionally, three unidentified phospholipids (PL1-PL3) and three unidentified amino lipids (AL1-AL3) were detected (Fig. S1). 
Based on the 16S rRNA gene sequence similarity, physiological attributes and biochemical properties, strain CT4T was considered to be a novel species of the genus Clostridium. Thus, strain CT4T was named Clostridium manihotivorum CT4T, reflecting its ability to degrade cassava pulp. The epithet "manihotivorum" means cassava-devouring. This bacterium was deposited as a type strain in the Thailand

Characterization of amylolytic, hemicellulolytic and cellulolytic enzymes of C. manihotivorum CT4T
In this study, C. manihotivorum CT4T was found to degrade cassava pulp and to produce cassava pulp-degrading enzymes, including amylolytic, hemicellulolytic and cellulolytic enzymes. To characterize the properties of the crude enzyme from strain CT4T, the isolate was cultivated in BM7 medium containing 1% (w/v) cassava pulp at pH 7.0 and 37 °C. The culture supernatant was then harvested at the early stationary phase (3 days) and concentrated by ultrafiltration. The crude enzyme gave the highest activity on cassava pulp (1,901.1 U/g protein), which was approximately 1.57-fold higher than that obtained on soluble starch (1,212.7 U/g protein). In addition, a pullulanase activity of 27.5 U/g protein was detected (Table 2). C. manihotivorum CT4T also produced xylanase (43.5 U/g protein), cellulase (32.0 U/g protein) and pectinase (42.4 U/g protein), as shown in Table 2; these enzymes are involved in the degradation of the xylan, cellulose and pectin, respectively, contained in the cell wall structure of cassava pulp.

The complete genome of C. manihotivorum CT4 T and comparative genomics
In this study, the complete genome of C. manihotivorum CT4T, deposited in GenBank under the accession number CP025746, is described. A complete, gapless and circular genome assembly was generated, with a total size of 6,364,326 bases and 40-fold coverage, as shown in Fig. 3 and Table 3. The origin of replication was determined based on GC skew analyses. The average G + C content was approximately 32 mol%, and no plasmid was detected. The DNA G + C content of strain CT4T (32 mol%) was within the range of 23-37% reported for the genus Clostridium (Lawson & Rainey, 2016). Genome annotation was performed using Prokka (Seemann, 2014) and Blast2GO (Conesa et al., 2005). The genome was predicted to have 5,664 protein-coding sequences (CDSs), 42 rRNA sequences, 95 tRNA sequences, 1 tmRNA sequence and 153 misc_RNA sequences. Furthermore, the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) version 4.11 was also employed to annotate the genome, which gave a slightly different result of 5,308 CDSs and 5,654 total genes (Table 3). Hereafter, the 5,664 CDSs were used for further analysis; the details are available in Data S1. A comparison of the genomes of C. manihotivorum CT4T and C. amylolyticum SW408T showed that the genome of strain CT4T is about 2.1 Mb larger than that of strain SW408T. Moreover, 5,664 CDSs were predicted in C. manihotivorum CT4T, whereas only 3,957 CDSs were reported in C. amylolyticum SW408T.
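The text states that the origin of replication was located by GC skew analyses without giving the procedure. One common convention, assumed here rather than taken from the study, is to compute the skew (G - C)/(G + C) in consecutive windows and place the origin where the cumulative skew reaches its minimum. The function names, window size and toy sequence below are hypothetical.

```python
from typing import List


def cumulative_gc_skew(seq: str, window: int = 1000) -> List[float]:
    """Running sum of (G - C) / (G + C) over non-overlapping windows."""
    seq = seq.upper()
    running = 0.0
    values: List[float] = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        running += (g - c) / (g + c) if (g + c) else 0.0
        values.append(running)
    return values


def predicted_origin(seq: str, window: int = 1000) -> int:
    """Position (end of window) at which the cumulative skew is minimal."""
    if not seq:
        raise ValueError("sequence must not be empty")
    values = cumulative_gc_skew(seq, window)
    index = min(range(len(values)), key=values.__getitem__)
    return min((index + 1) * window, len(seq))


# Toy sequence: C-rich first half, G-rich second half, so the cumulative
# skew falls until position 20 and rises afterwards.
toy_seq = "C" * 20 + "G" * 20
toy_origin = predicted_origin(toy_seq, window=10)
print(toy_origin)
```

On a real circular chromosome the window size trades resolution against noise, and the predicted position would normally be checked against other evidence such as the location of dnaA.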

Average amino acid identity and phylogenetic analysis
The whole-genome phylogeny of C. manihotivorum CT4T was compared with a unique set of 665 genomes of the class Clostridia (Cabal et al., 2018) using average amino acid identity (AAI) analysis. AAI has been shown to have better resolving power at the species level than 16S rRNA gene sequence-based comparison (Mahato et al., 2017). The phylogenetic tree derived from the AAI analysis of all 666 genomes revealed several main clusters (Fig. 4). Strain CT4T was clearly separated from the other Clostridia and formed a single branch. This result agrees with the physiological and biochemical characteristics described above, which differ from those of other related type strains.

Functional categories of strain CT4T
Approximately 75% (4,223 out of 5,664) of the protein-coding sequences in C. manihotivorum CT4T were classified into COG functional categories (Table 4)   transportation of the compounds (Tomazetto et al., 2016). These results suggest that C. manihotivorum CT4T contains genes encoding glycoside hydrolases related to the degradation of starch, hemicellulose, cellulose and pectin (Table 5).

DISCUSSION
Cassava pulp, which is generated in large amounts as industrial waste during cassava processing, is rich in starch and fiber (Norrapoke et al., 2018). Thus, it can be used as a renewable material to produce high value-added products (FitzPatrick et al., 2010).
Bacterial species of the genus Clostridium are mostly known as good degraders of lignocellulosic materials (Doi & Kosugi, 2004). However, not much is known regarding amylase-, hemicellulase- and cellulase-producing species that are capable of efficient cassava pulp degradation. Among the isolates obtained from soil samples collected from a cassava pulp landfill using the Hungate roll-tube technique, strain CT4T was the most effective in degrading cassava pulp. The roll-tube procedure has previously been used to isolate single colonies and pure cultures of bacteria, including Clostridium thermocellum S14 (Tachaapaikoon et al., 2012) and C. amylolyticum SW408T (Song & Dong, 2008). Subsequently, strain CT4T was identified using 16S rRNA gene sequencing analysis. According to this analysis, strain CT4T was phylogenetically related to members of the genus Clostridium (90-95% sequence similarity), with the highest degree of sequence similarity to C. polyendosporum PS-1T (95%), followed by C. amylolyticum SW408T (94%). These values are at a level suggested to support allocating the strain to a novel species of the genus Clostridium (Yarza et al., 2008). Moreover, AAI and phylogenetic analysis suggested that the newly isolated strain CT4T should be classified as a novel species of the genus Clostridium, named C. manihotivorum CT4T. Although C. polyendosporum PS-1T could degrade starch, its activity is not known (Duda et al., 1987). Remarkably, C. polyendosporum PS-1T and C. manihotivorum CT4T have different capacities for endogenous spore formation. While C. polyendosporum PS-1T can form several endospores in one cell (some cells may produce up to seven), cells of strain CT4T contained a single endospore (Table 1). Likewise, C. 
amylolyticum SW408T, a mesophilic anaerobic amylolytic bacterium (and a close relative of strain CT4T) isolated from an H2-producing up-flow anaerobic sludge blanket reactor, utilizes several kinds of mono- and di-saccharides and simultaneously hydrolyzes and ferments starch (Song & Dong, 2008). Nonetheless, there are no reports specifically on cassava pulp degradation in this genus. The degradation of cassava pulp by strain CT4T was therefore compared with that of C. polyendosporum PS-1T and C. amylolyticum SW408T, which are the most closely related species. The strains were inoculated into BM7 containing 1% (w/v) cassava pulp and incubated at 37 °C, pH 7.0 for 5 days. C. manihotivorum CT4T grew rapidly, while the other two strains showed only slight growth on cassava pulp. After cultivation, the residue weights for C. manihotivorum CT4T, C. polyendosporum PS-1T and C. amylolyticum SW408T had decreased by 60.8% (w/v), 0.6% (w/v) and 0.4% (w/v), respectively, relative to the initial dry weight of cassava pulp. The compositions of the cassava pulp after digestion by C. manihotivorum CT4T, C. polyendosporum PS-1T and C. amylolyticum SW408T were analyzed (Fig. S2). C. manihotivorum CT4T showed a high degradation ability for starch, with 99% starch removal. Moreover, strain CT4T efficiently degraded not only starch but also cellulose, hemicellulose and pectin. By contrast, C. polyendosporum PS-1T and C. amylolyticum SW408T degraded cassava pulp ineffectively: the starch, cellulose and hemicellulose contents of the residues decreased only slightly. These results indicate that C. manihotivorum CT4T has a greater cassava pulp degradation ability than the other closely related species. The significantly different degradation of cassava pulp by the crude enzyme from C. 
manihotivorum CT4T, compared with the other members of Clostridium, was possibly caused by several factors, such as: (1) synergistic interactions among amylolytic, hemicellulolytic and cellulolytic enzymes; (2) enzymes containing non-catalytic binding domains linked to the catalytic domains, known as carbohydrate-binding modules (CBMs); and (3) the weak binding of the enzymes to lignin. Various hemicellulolytic enzymes, including endo-β-1,4-xylanase, β-xylosidase and endo-β-1,4-mannanase, break down xylan and mannan, the main hemicelluloses in cassava pulp that cover the cellulose. Removal of xylan and mannan could help to increase the accessibility of cellulose to cellulolytic enzymes (such as endo-β-1,4-glucanase and β-glucosidase). Synergism between hemicellulolytic and cellulolytic enzymes leads to enhanced release of the entrapped starch granules from cassava pulp. Consequently, the starch granules become more available to amylolytic enzymes and are then effectively hydrolyzed to oligosaccharides and monosaccharides by endo-acting α-amylase, exo-acting α-glucosidase and the debranching enzyme pullulanase. Thus, hemicellulolytic and cellulolytic enzymes act cooperatively to decompose the hemicellulose-cellulose matrix, increasing the accessibility of the amylolytic enzymes to the exposed starch granules located within the cassava pulp (Bunterngsook et al., 2017; Poonsrisawat et al., 2017), while CBMs have been reported to assist hydrolysis of insoluble substances by bringing the catalytic domain into close proximity to its substrate (Hervé et al., 2010). Moreover, the cassava pulp-degrading enzymes of C. manihotivorum CT4T may remain active because they bind only weakly to lignin in cassava pulp. Teeravivattanakit et al. 
(2017) reported that because the bacterial multifunctional enzyme PcAxy43A from Paenibacillus curdlanolyticus B-6 was a weak lignin-binding enzyme, it was capable of converting the xylan contained in agricultural residues to xylose in one step without chemical pretreatment to remove lignin. Therefore, weak lignin binding is a potentially important factor in obtaining enzymes suitable for the hydrolysis of lignocellulosic materials (Berlin et al., 2006). Some Clostridium spp., such as C. amylolyticum SW408T (Song & Dong, 2008), C. thermosulfurigenes H12-1 (Saha, Shen & Zeikus, 1987) and C. butyricum T-7 (Tanaka et al., 1987), can hydrolyze soluble starch or raw starch by producing α-amylase and β-amylase. However, these three strains do not produce pullulanase, xylanase or cellulase and thus, unlike C. manihotivorum CT4T, lack the properties of cassava pulp-degrading enzymes.
To further explore whether C. manihotivorum CT4T could be used to degrade cassava pulp, we analyzed its whole genome for the presence of enzymes involved in cassava pulp degradation. We found that the genome contains various genes encoding amylolytic, hemicellulolytic and cellulolytic enzymes that possess different CBM domains. These CBM families help in substrate recognition and binding and thus increase the catalytic activity on insoluble substrates. For example, CBM20, CBM34 and CBM53 have been reported to act in the degradation of raw starch granules by enabling the enzyme to interact with the starch granules and to disrupt the surface of the starch structure (Machovič & Janeček, 2006; Lombard et al., 2014). Furthermore, hemicellulases and cellulases, including the exo-, endo-types and side-chain acting enzymes, are involved in the hydrolysis of the hemicellulose and cellulose contained in lignocellulosic materials (Linares-Pastén, Andersson & Karlsson, 2014). The genome annotation of C. manihotivorum CT4T revealed hemicellulase gene products featuring CBMs that can interact with insoluble substances and support the catalytic domains in hydrolyzing their substrates (Shallom & Shoham, 2003). For example, the α-L-arabinofuranosidase and endo-β-1,4-mannanase of C. manihotivorum CT4T were predicted to contain CBM4 (gene loci CT4_03484 and CT4_03690), CBM6 and CBM35 (CT4_04971 and CT4_05469), which have been reported to bind insoluble xylan (Munir et al., 2014). However, no CBM was found in the endo-1,4-β-xylanases (gene loci CT4_03195 and CT4_04894), the main enzymes of strain CT4T that attack the xylan backbone. One explanation for how these xylanases degrade xylan in cassava pulp is that the enzymes might have other substrate-binding regions, located at a certain distance from the active site and called secondary xylan-binding sites (SXS), which function similarly to a CBM (Jommuengbout et al., 2009). 
Based on the amino acid sequence alignment of the endo-1,4-β-xylanase (CT4_04894) with an endo-1,4-β-xylanase of glycoside hydrolase family 10 (Xyn10) from Penicillium simplicissimum, which is capable of binding to insoluble xylan via the SXS, the residues E60, N61, K64, H97, W101, N142, E143, Y187, Q218, H220, E250, W283 and W291 of the endo-1,4-β-xylanase from strain CT4T (CT4_04894) were found to be conserved with the SXS of Xyn10 from P. simplicissimum (Schmidt, Gübitz & Kratky, 1999). In addition, cellulolytic enzymes such as endo-β-1,4-glucanase and β-glucosidase in C. manihotivorum CT4T also contained CBM3 (gene locus CT4_03367), CBM6 (CT4_03385), and CBM3 and/or CBM46 (CT4_05071 and CT4_00352), which are known to bind crystalline and amorphous celluloses and support the catalytic domains in hydrolyzing them (Cho et al., 2008; Guillén, Sánchez & Rodríguez-Sanoja, 2010). The results strongly indicate that C. manihotivorum CT4T possesses a set of genes encoding a complete system of amylolytic, hemicellulolytic and cellulolytic enzymes, suggesting that C. manihotivorum CT4T is a good candidate for degrading cassava pulp.

CONCLUSIONS
In this work, we have highlighted the cassava pulp-degrading enzymes of the isolated strain CT4T. It is a new species of the genus Clostridium that possesses a specialized ability to degrade cassava pulp, a property that is rarely found in this genus. The AAI analysis showed that C. manihotivorum CT4T is clearly separated from the other Clostridium species. The complete genome sequence obtained with Illumina and Oxford Nanopore Technology revealed that C. manihotivorum CT4T possesses a set of genes encoding enzymes for the decomposition of cassava pulp, an industrial starch-rich by-product. In total, C. manihotivorum CT4T contained 91 genes encoding amylolytic, hemicellulolytic, cellulolytic and pectinolytic enzymes. Comparative analysis of the genome of C. manihotivorum CT4T with that of C. amylolyticum SW408T revealed that strain CT4T had a higher proportion and diversity of amylolytic, hemicellulolytic, cellulolytic and pectinolytic enzymes. The results suggest that C. manihotivorum CT4T is a promising microbe for the saccharification of cassava pulp into useful value-added products.