Exploiting Type I-B CRISPR Genome Editing System in Thermoanaerobacterium Aotearoense SCUT27 And Engineering The Strain For High-Level Ethanol Production


Background: Thermophilic microbes for producing biofuels and chemicals have attracted great attention because they tolerate high temperatures and utilize a wide range of substrates. Thermoanaerobacterium aotearoense SCUT27 can co-utilize the glucose and xylose in lignocellulosic biomass. Multigene manipulation has been a bottleneck because few selection markers are available. In this study, the endogenous Type I-B CRISPR/Cas system was developed for multiplex genome editing in SCUT27. Results: The protospacer-adjacent motif (PAM) was identified in silico, and orotidine-5'-phosphate decarboxylase (pyrF) and then lactate dehydrogenase (ldh) were chosen as editing targets to assess the toxicity of this immune system and its gene editing efficiency. Mutants were repeatedly obtained with an editing efficiency of 58.3-100%. Higher transformation efficiency was observed after optimization of several editing strategies. Furthermore, a new method based on the negative selection marker tdk was developed to screen for plasmid-cured mutants (recycling of the editing plasmid) for multiplex genome editing, and ldh and the arginine repressor gene (argR) were then knocked out successively. The mutant SCUT27/Δldh/ΔargR showed prominent advantages over SCUT27 for ethanol production, with an enhanced ability to metabolize xylose. When cultured on various lignocellulosic hydrolysates, the mutant performed well, with the ethanol titer and yield improved by 147.42–739.40% and 112.67–267.89%, respectively, compared with SCUT27, and with enhanced tolerance to inhibitors. Conclusion: Multi-gene editing with the native CRISPR/Cas system is a promising strategy to engineer SCUT27 for higher ethanol production from lignocellulosic hydrolysates.

However, several problems hindered metabolic engineering in SCUT27, such as the limited number of selective markers and the absence of efficient genetic engineering tools that would allow the strain to be engineered continuously and edited tracelessly without markers (avoiding interference from them).
Thus, there is an urgent demand to develop other technologies for strain improvement.
Clustered regularly interspaced short palindromic repeats (CRISPR) and the CRISPR-associated (Cas) system form a prokaryotic RNA-guided adaptive immune system in bacteria and archaea against invading foreign DNA such as phages and plasmids [6,7]. It has been reported that approximately 47% of sequenced bacteria and 87% of sequenced archaea harbor CRISPR/Cas loci, all of which share similar features: identical direct repeats (DR) separated by variable sequences called spacers, along with a suite of associated Cas genes. CRISPR systems can be classified into two classes and six types (33 subtypes) based on the Cas proteins and the architecture of the CRISPR/Cas loci [8]. The CRISPR/Cas9-based editing system is well understood and has been applied with high efficiency in a broad range of organisms [9-13]. However, its use in thermophiles has been quite rare until now. Although some exogenous thermotolerant Cas9 proteins have been found, their severe toxicity to cells led to low transformation and editing efficiency [9,10]. Besides, the Type II system with Cas9 has a more complex PAM sequence and a shorter single guide RNA (sgRNA), which may limit the choice of editing locations or cause off-target effects [10,14]. To make editing easier, many researchers have used endogenous CRISPR/Cas systems for genome editing. Pyne et al. used the native Type I-B CRISPR/Cas system in Clostridium pasteurianum as an efficient genome editing tool and successfully deleted the cpaAIR gene [15]. Zhang et al. exploited the CRISPR/Cas system in Clostridium tyrobutyricum for multiplex genome editing, deleting spo0A and pyrF in a single transformation, and also integrated the alcohol dehydrogenase gene to replace cat1 for n-butanol production [16]. Zhou et al. successfully deleted spo0A and aldh with a customized endogenous CRISPR/Cas system in C. butyricum, with an efficiency of up to 100% [17]. Liu et al. found an endogenous subtype II-A CRISPR/Cas system and used it to deplete native plasmids and to integrate an L-lactate dehydrogenase gene into the chromosome, which led to both enhanced cell growth and enhanced lactic acid production [18].
In this study, we report for the first time the native Type I-B CRISPR/Cas system in T. aotearoense strains, for which an in-depth analysis of the elements of this immune system was lacking. In silico analysis of the CRISPR arrays and an in vivo toxicity assay revealed that the motif 'TTA' at the 5' end of the predicted protospacers is a functional PAM for targeting by this endogenous system. Using the strong constitutive promoter Pkan for CRISPR mini-array expression, we deleted ldh with an editing efficiency of up to 100%. Several attempts were also made to improve the transformation efficiency of this process. In addition, to improve ethanol production, a multi-gene editing system was developed based on a novel method for plasmid recycling. With both ldh and argR deleted, the mutant performed better in ethanol production, with a yield of 0.35-0.40 g/g, especially when fed xylose. We then assessed the engineered strain SCUT27/Δldh/ΔargR for its ability to metabolize xylose-rich lignocellulosic hydrolysates.

Identification and activity test of the Type I-B CRISPR/Cas system in SCUT27
Recently, Type II CRISPR/Cas9 tools have been reported for thermophiles based on thermostable Cas9 proteins derived from Geobacillus stearothermophilus, Acidothermus cellulolyticus, and Geobacillus thermodenitrificans T12 [9,10]. For genome engineering in SCUT27, we also tried to construct the plasmids pLY2_ldh-HA12 and pLY2_ldh-HA12 (Table 1), containing the thermostable cas9 gene from G. stearothermophilus (under the control of the constitutive promoter PadhE and the riboswitch-controlled inducible promoter Pkan-RS pbuE, respectively), an sgRNA expression module targeting ldh, and a repair template. However, no transformants were obtained by electro-transformation despite several attempts.
This indicated that genome editing with the CRISPR/Cas9 system was difficult to apply in our strain, probably because of the high toxicity of the potent Cas9 RNP (the heterologous nuclease Cas9 alone was non-lethal, as pLY1 could be transformed into SCUT27) and the limited transformation efficiency with pIKM1. Even with the inducible promoter Pkan-RS pbuE, leaky expression of Cas9 together with the sgRNA appeared to cause cell death. Therefore, we turned to the endogenous CRISPR/Cas system for genome editing in SCUT27.
Table 1 Strain and plasmids used in this study.
CRISPRCasFinder, a web tool that identifies clustered regularly interspaced short palindromic repeats and the presence of Cas genes [19], was used to mine the native CRISPR/Cas system on the chromosome of SCUT27 (GCF_000512105.1 Thermoanaerobacterium aotearoense SCUT27 assembly genomic.fna). The results showed that SCUT27 possesses a native CRISPR/Cas system that was mainly assigned to Class 1 Type I-B. Seven major CRISPR loci were found in the SCUT27 genome sequence (Table S2), but intact Cas genes (v518_0414-0422) were identified only upstream and downstream of the largest CRISPR array, which contains 55 spacers (Fig. 2a; Table S2). The other CRISPR arrays, which are not coupled with Cas genes, were defined as orphan CRISPR arrays. This suggests that the longest array, built up by numerous spacer acquisition events and co-expressed with a complete functional set of Cas genes, might constitute an active CRISPR functional unit within the bacterial genome [20]. Therefore, this longest CRISPR array (position: 14334-18020 in NZ_AYSN01000014.1), comprising 55 spacers along with direct repeats (DR), was used for further research.
For the CRISPR/Cas system to recognize protospacers, a functional PAM flanking the 5' end of the protospacer is necessary. The primary approach to PAM identification was to identify the putative original sequences of the invading elements that ended up as CRISPR spacers during the 'adaptation' stage in SCUT27. The search program CRISPRTarget [21] was used to analyze the CRISPR spacers of the strain by aligning them against existing sequences in various databases (phage, plasmid and so on), as described by Pyne [15]. We analyzed in silico the spacer matches with scores greater than 20 points. However, no obvious PAM pattern emerged from the only 55 spacers in the largest CRISPR array of SCUT27. We therefore turned to the most closely related strain, T. thermosaccharolyticum M0795, which possesses a Cas operon (Thethe_02655-02664) similar to that of SCUT27 and up to 266 spacers in its largest CRISPR array (with the same DR motif as SCUT27: GTTTTTAGCCTACCTATAAGGAATTGAAAC), to identify the PAM sequence. The spacer-protospacer matching analysis in M0795 revealed the high-frequency PAM motif 'TTA' at the 5' end of the protospacers (Fig. 2b, 2c). Besides, the PAM sequence 'TNA' or 'TNA' for Thermoanaerobacter sp. was also predicted in the literature with various in silico tools [22-24]. In some closely related Clostridium species such as C. pasteurianum, C. difficile, C. tyrobutyricum and C. thermocellum, the PAM 'TNA' was also defined for their reported subtype I-B systems [9,15,16,25]. Based on the above analysis, 'TTA' was taken as the most confident PAM in SCUT27 and was used in the following experiments.
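The tallying step behind this kind of PAM prediction can be illustrated with a short sketch. It is not the CRISPRTarget implementation; the function name, the input format and the toy sequences are assumptions made for illustration only, and the toy hits are not data from this study.

```python
from collections import Counter


def tally_upstream_motifs(hits, motif_len=3):
    """Count the motif immediately 5' of each protospacer.

    `hits` is a list of (target_sequence, protospacer_start) pairs, where
    protospacer_start is the 0-based index of the first protospacer base on
    the strand that reads 5'->3' like the spacer (assumed input format).
    """
    counts = Counter()
    for target, start in hits:
        if start < motif_len:
            continue  # not enough flanking sequence to read a motif
        counts[target[start - motif_len:start].upper()] += 1
    return counts


# Hypothetical toy hits, invented for this example.
toy_hits = [
    ("GGTTAACGTACGTAGC", 5),
    ("CATTAGGATCCGATCA", 5),
    ("AATCATTTGACCGGTA", 5),
    ("AC", 1),  # skipped: flank shorter than the motif length
]
motif_counts = tally_upstream_motifs(toy_hits)
for motif, n in motif_counts.most_common():
    print(motif, n)
```

With real spacer-protospacer matches, the most frequent motif in such a tally is the candidate PAM that would then be tested in vivo.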
A toxicity assay was then performed to evaluate the activity of the native CRISPR system. A plasmid containing a CRISPR mini-array cassette was designed (Fig. S1b, c) so that the expressed sgRNA would form a ribonucleoprotein (RNP) at the target site and mediate a single-strand DNA break. The artificial mini-array was composed of the 30 bp DR from the largest CRISPR array of SCUT27 and a 37 bp spacer (corresponding to the target protospacer with a functional PAM) (Fig. S1a, b). Like most bacteria, SCUT27 does not encode proteins responsible for non-homologous end joining. Thus, if the CRISPR/Cas system was functional, the DNA cut made by the active RNP complex could not be repaired and no transformants would be observed. Here, the mini-array on the plasmid was designed to target the pyrF gene (V518_1373) in SCUT27 (Fig. S1). The 921 bp pyrF ORF contains a total of 44 potential PAMs (TTA) on the two DNA strands. Two spacers (the 37 nt downstream of 'TTA') were chosen to target the pyrF locus, and one non-target spacer (with 'GGC' as the PAM motif) was designed as a negative control (Fig. S1a). With the strong promoter Pkan driving sgRNA expression, we observed a 21-fold decrease in the electro-transformation efficiency of SCUT27 (from ~48.3 to 2.3 CFU/µg DNA), indicating that CRISPR-mediated interference was occurring, with around 95% cell killing (cutting efficiency) (Table 2, Fig. S1d). The promoters upstream of the Cas operon and the sgRNA were thus strong enough to mediate a single-strand cut. The few background colonies (escape mutants) that contained the killing plasmids in the experimental groups may be due to spontaneous mutations in the Cas operon (specifically Cas3), which would be expected to inactivate the encoded functional protein [9,26].

Endogenous Type I-B CRISPR-based deletion of ldh in SCUT27
After verifying the activity of the native CRISPR system, we constructed the gene editing plasmid pKQ1_ldh-HA12 (Fig. 3a) targeting the gene ldh (V518_0188) for knockout, because the resulting mutant could easily be identified by the absence of lactate among its products and showed an obvious advantage in growth and ethanol production according to our previous report [3]. A spacer targeting the chromosomal ldh locus was chosen by the same process as described above. In addition, a repair template for CRISPR/Cas-mediated homology-directed repair was introduced downstream of the CRISPR mini-array. It consisted of two homologous arms (HAs) of ~750 bp each and a KpnI digestion site replacing 805 bp of the ldh locus to inactivate the gene (Fig. S2a).
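The spacer selection used for pyrF and ldh (locate a 'TTA' PAM on either strand and take the 37 nt immediately downstream) can be sketched as follows. This is a minimal illustration, not the authors' script; the function names and the toy sequence are hypothetical, and the demo uses a shortened spacer length so that the example stays readable.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spacer_candidates(seq, pam="TTA", spacer_len=37):
    """List (strand, pam_start, spacer) for every PAM that is followed by a
    full-length protospacer, scanning both strands.

    pam_start is 0-based and given in the coordinates of the scanned strand,
    so "-" strand positions refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    candidates = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        pos = s.find(pam)
        while pos != -1:
            start = pos + len(pam)
            spacer = s[start:start + spacer_len]
            if len(spacer) == spacer_len:  # drop PAMs too close to the end
                candidates.append((strand, pos, spacer))
            pos = s.find(pam, pos + 1)  # step by 1 to catch overlapping PAMs
    return candidates


# Hypothetical toy sequence; spacer_len=5 only keeps the demo short.
toy = "GGTTAACGTACCTAAGG"
demo = find_spacer_candidates(toy, spacer_len=5)
print(demo)
```

Applied to a real ORF with the default `spacer_len=37`, the returned list would be the pool from which targeting spacers are picked; any further filtering (for example, checking the rest of the genome for similar sequences) is outside this sketch.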
Transformation of pKQ1_ldh-HA12 into SCUT27 resulted in 12-65 colonies on MTCK plates (~6.2 CFU/µg DNA). Twelve randomly picked colonies were screened for ldh deletion by colony PCR. As shown in Fig. 3b, three kinds of PCR products were observed: nine of the 12 colonies (75%) had a clean deletion genotype (Δldh, 1616 bp), two colonies had a wild-type genotype (2421 bp) and one colony showed a mixed genotype. For the wild-type colonies, we supposed that point mutations in key Cas proteins had rendered the editing system ineffective. For the mixed genotype, no clear explanation is available yet. Overall, with the native CRISPR system, the desired mutants could be reproducibly obtained with high editing rates of 58.3 to 100% within 2-4 days after transformation. The colony PCR product was sent for DNA sequencing (Fig. S2a), which showed that the target locus had been edited as designed. We also cultured the mutant and analyzed fermentation samples by HPLC; no lactate was produced, and the mutant had the same characteristics as SCUT27/Δldh::ermR from our lab (Table S5).

Promotion of endogenous CRISPR system with various promoters for sgRNA expression
Two metrics were considered to evaluate the usefulness of the genome editing tool: the total number of transformants (the transformation efficiency) and the fraction of correct transformants (the true-editing efficiency). The number of correct transformants is the product of these two metrics. Although the Type I-B system in SCUT27 was efficient at CRISPR-mediated killing and a high editing rate was observed repeatedly, few transformants were obtained after transformation of SCUT27 with editing plasmids. The low transformation efficiency is possibly due to the high toxicity of CRISPR-induced DNA cutting when the sgRNA is expressed under Pkan. To further explore the efficiency of CRISPR-mediated DNA cleavage and the transformation efficiency, we tested three other constitutive promoters (PadhE, Pclo1313_1194 and Pcat1) for sgRNA expression to determine which expression level was best for editing. The strength of each promoter was characterized by the activity of an enzyme expressed downstream of it in overexpression vectors (Fig. S3a).
The results showed that Pclo1313_1194, a strong promoter in C. thermocellum, did not work in SCUT27 (Fig. 4a). PadhE was weaker than Pkan, yet its transformation efficiency (~5.5 CFU/µg DNA) was very close to that of Pkan, and its editing efficiency was similar (~75.0%) (Fig. 4a, Table S3). Pcat1 was clearly stronger than Pkan. However, both the transformation and editing efficiencies with Pcat1 were lower than those with Pkan (Fig. 4a, Table S3). We speculated that stronger sgRNA expression caused stronger cleavage and higher toxicity to the cells, especially in the initial period after electro-transformation, possibly leading to an increased proportion of escape mutants with the wild-type phenotype.
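The two metrics defined at the start of this subsection combine multiplicatively. As a worked example, the sketch below uses only figures quoted in the text (9 of 12 colonies edited and ~6.2 CFU/µg DNA for Pkan; ~5.5 CFU/µg DNA and ~75% for PadhE). The function names are illustrative and not part of the study.

```python
def editing_fraction(edited_colonies, screened_colonies):
    """Fraction of screened colonies that carry the intended edit."""
    if screened_colonies <= 0:
        raise ValueError("at least one colony must be screened")
    return edited_colonies / screened_colonies


def correct_transformants_per_ug(cfu_per_ug, fraction_edited):
    """Correctly edited colonies per microgram of plasmid DNA,
    i.e. transformation efficiency times true-editing efficiency."""
    return cfu_per_ug * fraction_edited


pkan_fraction = editing_fraction(9, 12)  # colony PCR result for Pkan
pkan_yield = correct_transformants_per_ug(6.2, pkan_fraction)
padhe_yield = correct_transformants_per_ug(5.5, 0.75)

print(f"Pkan:  {pkan_yield:.2f} correctly edited CFU per ug DNA")
print(f"PadhE: {padhe_yield:.2f} correctly edited CFU per ug DNA")
```

On these approximate figures the two promoters give a similar number of correct transformants, which matches the qualitative statement that PadhE and Pkan behaved alike.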
Notably, the enzyme activity tests showed that the activity of ldh complemented from pIKM1 (Pldh-ldh) was nearly equal to that of the wild type. We therefore speculated that the plasmid pIKM1 is mostly present as a single copy in our strain, which is consistent with Walker's resequencing data in C. thermocellum, although the plasmid was originally reported to have a copy number of 10-1000 in C. thermocellum [9]. This may explain the low number of colonies on MTCK plates after electro-transformation. As shown in Fig. 5, when the sgRNA is expressed from a constitutive promoter, a cell dies from chromosome breakage if no homology-directed repair occurs, no matter how many plasmids have entered the cell. However, even when homology-directed repair occurs, a cell that received only one plasmid also dies because of plasmid breakage (loss of kanamycin resistance); only cells that received several plasmids survive (the SCUT27/Δldh we obtained belongs to this case).
Thus, we looked for an inducible promoter to control sgRNA expression and restrict cutting by the CRISPR system at the initial stage of transformation. As shown in Fig. 5, with an inducible promoter, a cell containing one plasmid would survive on MTCK plates without inducer because sgRNA expression is repressed. The inducer is then added to activate cleavage. Colonies with one plasmid in which homology-directed repair has occurred could be picked on non-selective plates, while the others would die because no homology-directed repair took place. In this case, mutants could in theory be obtained together with plasmid curing.
To test this hypothesis, we used a thermostable riboswitch element [27] and constructed the adenine-controlled inducible promoter Pkan-RS pbuE to control sgRNA expression and avoid cutting at the initial stage after transformation. Enzyme activity assays showed that the riboswitch-mediated inducible promoter was strictly controlled by the inducer adenine (Fig. 4a). No enzyme activity was observed when no extra adenine was added to the medium, while an enzyme activity of 0.04 U/mg was observed with 1 mM adenine in the medium. With sgRNA expression under Pkan-RS pbuE, the number of transformants on MTCK plates clearly increased (approximately 42.0 CFU/µg DNA), and colonies were picked and cultured in medium with 1 mM adenine for induction. However, the clones on MTCA plates were mostly false positives and showed a lower editing efficiency (Fig. 4a), even after five passages for enrichment. We speculated that the induced promoter driving the CRISPR mini-array might not be strong enough to cause the ssDNA break, leading to an increased number of wild-type strains or escape mutants. Moreover, the mutants obtained in this way still contained the plasmids, indicating that they initially had several plasmids and that cells with a single plasmid may be difficult to edit with a weak induced promoter. Therefore, a thermostable and strong inducible promoter for controlling sgRNA expression appears to be key to this genetic modification tool.

Other attempts to improve the efficiency of transformation and editing with the plasmids
The highly efficient genome editing system (cutting of ssDNA) left few live colonies on plates, which was likely due to low homologous recombination [9]. It has been noted for CRISPR/Cas genome editing in clostridial organisms that serial transfer (1:20 dilution) and sub-culturing in liquid medium can enrich the desired homologous recombinants and increase the probability of gene editing [16].
Because our previous protocol included no extra enrichment, we speculated that enrichment might also be required owing to insufficient homology-directed repair in SCUT27. Thus, after the electric shock, we cultivated the cell suspension and transferred it several times to increase the opportunity for homologous repair. After 5, 10 and 20 rounds of serial passaging in MTCK medium, the editing efficiency was 15.0-58.3%, lower than that without serial transfer, although a large number of true-edited mutants could be obtained. It seemed that the escape mutants increased more rapidly than the true-positive mutants during the serial transfer.
In this study, we also varied the amount of plasmid from 5 µg to 100 µg in 300 mL of electric-shock buffer, with no prominent improvement (data not shown). In addition, we explored the effect of HA length on the efficiency of transformation and editing with the endogenous CRISPR/Cas system in our strain. The length of the HAs in the plasmid was set to 250 bp, 500 bp, 750 bp and 1000 bp. As shown in Fig. 4b, the transformation efficiency improved as the HA length increased. The highest editing efficiency (~75% correctly edited) was observed when the HAs were 750 or 1000 bp long. We also added 5 or 20 µg of 1 kb HAs (purified PCR products), mixed with 20 µg of plasmid, to the buffer during electro-transformation in order to strengthen homologous recombination [28]. The results showed that adding the repair template (HAs) had a clearly positive effect on transformation efficiency, whereas the editing efficiency of the transformants was relatively low in return (Fig. 4b).

Plasmid curing and argR deletion in mutant Δldh for higher ethanol production
As described above, single-gene deletion was achieved with high efficiency using the endogenous CRISPR/Cas system, and we further explored this system for multiplex genome editing in SCUT27.
Considering the limited number of both selective markers and thermostable shuttle vectors available in SCUT27, plasmid curing was a must for continuous editing. The editing plasmid pKQ1_ldh-H12 in SCUT27/Δldh needed to be eliminated to obtain marker-free mutants. The mutants were therefore first transferred into medium without kanamycin for 5-20 generations so that the plasmid would be lost spontaneously. However, no plasmid-free colony was picked, indicating that the plasmid was difficult to eliminate, probably because of its stable replication region. Chemical and physical methods were then applied for plasmid curing. To weaken the cell wall or damage the cell membrane, curing agents such as 4 µg/mL isonicotinic acid hydrazide or 0.002% sodium dodecyl sulfate (SDS) (the maximum non-lethal concentration for SCUT27 that we tested) were added to cultures in early exponential phase [29,30].
However, almost all of the picked colonies retained the plasmid even after several generations. Physical treatments such as repeated freeze-thawing, sublethal temperature, electroporation, or combinations of several methods were also tried, but none worked efficiently.
Here we developed a novel method, based on thymidine kinase (tdk) as a negative selection marker [31], for efficiently screening plasmid-cured cells and recycling the editing plasmid for multi-gene editing. 5-Fluoro-2'-deoxyuridine (FUDR) is the agent for negative selection, as shown in Fig. S4a. The toxicity of FUDR to SCUT27 and SCUT27/Δtdk was tested. As shown in Fig. S4b, at 50 µg/mL FUDR, no wild-type colonies grew while many Δtdk colonies were observed, indicating that tdk could be used as a selectable marker. Besides, deletion of tdk had no influence on the growth and fermentation properties of SCUT27 (Fig. S4d, Table S4). Thus, SCUT27/Δtdk could be used as a starting strain for engineering, with no need to reintroduce tdk at the end. As shown in Fig. 6, SCUT27/Δtdk was first obtained by homologous recombination and verified by PCR (Fig. S4c), and the mutant Δtdk/Δldh was then obtained by electro-transformation with pKQ1_Δldh-H12::tdk. After successive transfers (three generations here) and spreading on MTCF plates, plasmid-free Δtdk/Δldh colonies were picked with a positive rate of around 25% (Fig. S5), a significant improvement over the previous blind screening. The plasmid-cured mutant was then ready for the next round of editing.
Here, the arginine repressor gene (argR), which was studied previously in our lab [4], was chosen as the target for the second round of editing with pKQ1_ΔargR-H12::tdk. The argR-inactivated mutant performed well in growth, ethanol yield and energy level, and its ability to utilize xylose and lignocellulosic hydrolysates was enhanced [4,5]. Thus, editing argR in SCUT27/Δldh was worth exploring for better performance. Following the same procedure as above, the mutant SCUT27/Δtdk/Δldh/ΔargR (termed Δldh/ΔargR below) was successfully obtained (Fig. S2b, S6), indicating that this approach to multiplex genome editing is feasible.

Analysis on glucose and xylose utilization ability of the mutants in serum bottles
In most lignocellulose hydrolysates, glucose and xylose are the two major fermentable sugars [32].
Here, three glucose/xylose ratios (1:0, 2:1, 0:1) were investigated by flask fermentation with the wild type, Δldh, ΔargR, and Δldh/ΔargR. As shown in Fig. 7 and ... Besides, as shown in Table S5, the SCUT27/Δldh obtained in this study has characteristics similar to those of the previous mutant SCUT27/Δldh::ErmR from our lab (2), indicating that editing with the CRISPR/Cas system is as reliable as the traditional approach of homologous recombination with a selective marker.

Fermentation kinetics of Δldh/ΔargR with various lignocellulosic hydrolysates
An ideal microorganism for lignocellulosic bioethanol production should not only utilize various sugars efficiently but also tolerate high temperatures (at least 40 °C) and inhibitors [33]. Saccharomyces cerevisiae, a well-known ethanol-fermenting organism, cannot grow well above 40 °C. Although some thermotolerant yeasts such as Kluyveromyces marxianus and Ogataea polymorpha have been found, their fermentation traits are far inferior to those of S. cerevisiae [34]. Besides, yeasts perform unsatisfactorily in ethanol fermentation with xylose as the sole carbon source, owing to cofactor imbalance in the cell [35,36] or to the inefficient XI pathway, which is thermodynamically limited by an unfavorable equilibrium between xylose and xylulose [37]. Thus, an ideal microorganism for ethanol production, especially from xylose, is urgently needed. Here, we focused on ethanol fermentation from xylose with SCUT27/Δtdk/Δldh/ΔargR, a candidate for ethanol fermentation of xylose-rich lignocellulosic hydrolysates.
Hydrolysates of rice straw (RSH), sorghum straw (SSH), peanut straw (PSH), wheat straw (WSH) and soybean straw (OSH) pretreated with dilute H2SO4 were chosen to assess ethanol production by Δldh/ΔargR. Xylose was the main sugar in the hydrolysates of most lignocellulosic biomass, together with some glucose and cellobiose. Each hydrolysate was diluted with water to an initial xylose concentration of about 15 g/L. As shown in Fig. 8, as expected, Δldh/ΔargR achieved improved ethanol production along with improved sugar utilization in all hydrolysates. The ethanol titer and yield of Δldh/ΔargR were improved by about 147.42-739.40% and 112.67-267.89%, respectively, compared with the wild type. The ethanol titers from RSH, SSH, PSH, WSH and OSH reached 9.84 g/L, 10.25 g/L, 9.27 g/L, 10.45 g/L and 9.70 g/L in serum bottles, respectively. The ethanol yields were 0.59 g/g, 0.53 g/g, 0.57 g/g, 0.61 g/g and 0.60 g/g, respectively, all of which were higher than the 0.35-0.40 g/g ethanol yield with pure sugars and also above the theoretical maximum ethanol yield of 0.51 g/g glucose or xylose. We speculated that trace amounts of other substrates such as cellobiose, arabinose and some protein might be present in the hydrolysates and were not included in the ethanol yield calculation [38,39]. Nearly all the xylose and glucose in the hydrolysates except RSH could be exhausted by SCUT27/Δldh/ΔargR, which might be attributed to the enhanced activity of xylose isomerase and xylulokinase as well as the higher cellular energy level for xylose transport [4]. Both the wild-type strain and Δldh/ΔargR tolerated lignocellulose-derived inhibitors in the hydrolysates, such as weak acids, furan derivatives and phenolic compounds [40], and the tolerance of Δldh/ΔargR was much stronger, especially in the rice and peanut straw hydrolysates (Fig. 8), possibly owing to the simultaneous up-regulation of the DnaK-DnaJ-GrpE system and the GroEL-GroES chaperonin [5], a potentially improved ability to eliminate ROS against the inhibitors [5], and an improved ATP level for synthesizing heat shock proteins, pumping out cytoplasmic protons and transforming inhibitors [41]. These results suggest that Δldh/ΔargR is a promising bioethanol producer with prominent advantages in dealing with the harsh environments of various hydrolysates.
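The theoretical maximum of 0.51 g/g cited above follows from fermentation stoichiometry, and a short calculation makes the comparison with the apparent hydrolysate yields explicit. The sketch assumes the textbook balances (one glucose to two ethanol, three xylose to five ethanol) and standard molar masses; the variable names are illustrative.

```python
# Standard molar masses in g/mol.
M_ETHANOL = 46.07
M_GLUCOSE = 180.16
M_XYLOSE = 150.13


def theoretical_yield(mol_ethanol, mol_sugar, sugar_molar_mass):
    """Maximum g ethanol per g sugar for a given stoichiometry."""
    return mol_ethanol * M_ETHANOL / (mol_sugar * sugar_molar_mass)


glucose_max = theoretical_yield(2, 1, M_GLUCOSE)  # C6H12O6 -> 2 C2H5OH + 2 CO2
xylose_max = theoretical_yield(5, 3, M_XYLOSE)    # 3 C5H10O5 -> 5 C2H5OH + 5 CO2

# Apparent yields reported in the text for the five hydrolysates (g/g).
reported = {"RSH": 0.59, "SSH": 0.53, "PSH": 0.57, "WSH": 0.61, "OSH": 0.60}
above_max = {name: y for name, y in reported.items() if y > xylose_max}

print(f"glucose: {glucose_max:.3f} g/g, xylose: {xylose_max:.3f} g/g")
print("above the theoretical maximum:", sorted(above_max))
```

Because every reported value exceeds about 0.511 g/g, the yields can only be apparent ones, which is consistent with the explanation that substrates other than the measured glucose and xylose contributed carbon.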

Discussion
Over the last decade, the CRISPR/Cas system, an adaptive immune system in bacteria and archaea, has been repurposed for genome editing and transcriptional regulation in various species [42]. However, most applications are based on the Type II CRISPR/Cas9 system, which is difficult to apply in thermophiles because of the requirement for thermostability and the toxicity of Cas9 (poor transformation efficiency) [43]. After the failed attempt with the Type II CRISPR/Cas9 system, we turned to the endogenous Type I CRISPR/Cas system, which has been harnessed in a few bacterial species for efficient genome editing and transcriptional regulation [9, 15-17, 25, 44]. Compared with the Type II CRISPR/Cas9 system, one advantage of genome editing with the native CRISPR/Cas system is that its longer spacer sequence (~37 nt vs. 20 nt for CRISPR/Cas9) can reduce potential off-target effects, as crRNA targeting is more specific [14], and thus avoid undesired mutations at other sites of the chromosome.
In this study, we successfully repurposed the Type I-B CRISPR/Cas system as an efficient genome editing tool in SCUT27. Because SCUT27 has few spacers, in silico analysis of the CRISPR array was also carried out in T. thermosaccharolyticum M0795, a species closely related to SCUT27, which revealed identical direct repeat sequences and a potential PAM that the two strains may share. This approach to finding the PAM was also successfully applied in Clostridium tyrobutyricum [45]. The subsequent toxicity assay demonstrated that the bioinformatically predicted PAM identified from M0795 was effective in SCUT27 with the activity of the endogenous CRISPR/Cas system. With the constitutive promoter Pkan driving transcription of the synthetic CRISPR mini-array on a plasmid without homology repair templates, the transformation efficiency with the target sgRNA (for pyrF) was ~21-fold lower than that with the non-target sgRNA (Table 2). For genome editing with the endogenous CRISPR/Cas system, ldh was successfully edited with the repair template as designed, and the desired mutants could be reproducibly obtained with a high editing efficiency of 58.3-100% (Fig. 3, Table S3).
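The ~21-fold figure and the ~95% killing reported for the toxicity assay are two views of the same pair of transformation efficiencies (about 48.3 CFU/µg DNA for the non-target control and 2.3 CFU/µg DNA for the pyrF-targeting plasmid). A minimal sketch of that arithmetic, with an illustrative function name:

```python
def interference_stats(control_cfu_per_ug, target_cfu_per_ug):
    """Return (fold_reduction, killing_fraction) for a CRISPR toxicity assay.

    fold_reduction   = control / target
    killing_fraction = 1 - target / control
    """
    if control_cfu_per_ug <= 0 or target_cfu_per_ug <= 0:
        raise ValueError("efficiencies must be positive")
    fold = control_cfu_per_ug / target_cfu_per_ug
    killing = 1 - target_cfu_per_ug / control_cfu_per_ug
    return fold, killing


fold, killing = interference_stats(48.3, 2.3)
print(f"{fold:.1f}-fold reduction, {killing:.1%} killing")
```

The remaining few percent correspond to the background colonies that the text attributes to escape mutants.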
Although a high editing efficiency was achieved with the CRISPR/Cas system in SCUT27, the overall transformation efficiency was low for practical use as a genome editing tool. Thus, several promoters, including a riboswitch-controlled inducible promoter, were chosen to replace Pkan in order to explore the editing effects at various levels of sgRNA expression. However, replacing the promoter brought no significant improvement (Fig. 4a), and the transformation efficiency decreased when the strong promoter Pcat1 was used (Table S3), possibly because of its intense toxicity to cells. In Clostridium tyrobutyricum, a lactose-inducible promoter was used for CRISPR array expression to limit the initial toxicity of the native system and achieve a high editing efficiency [16]. However, the adenine-induced riboswitch-controlled promoter Pkan-RS pbuE, used here as a thermostable inducible promoter, seemed not strong enough to transcribe sufficient sgRNA for the ssDNA break, giving a low editing efficiency but more transformants (Fig. 4a, Table S3). Based on the above results for the deletion of ldh, Pkan appears to be a good option given its high editing rate.
It is also worth pointing out that, from the enzyme activity tests used to characterize promoter strength, we speculated that the plasmid pIKM1 is mostly present as a single copy in our strain (Fig. 4a), as was also reported in Walker's resequencing data for C. thermocellum, rather than at the 10-1000 copies of the original report [9]. This may explain the low transformation efficiency in this study, as shown in Fig. 5. With a constitutive promoter expressing the sgRNA and homology-directed repair occurring, a cell with a single copy of the editing plasmid would still die as a result of plasmid breakage (loss of kanamycin resistance); only cells that received several plasmids would survive.
Thus, the riboswitch-based inducible promoter Pkan-RSpbuE was used to control the expression of sgRNA and restrict cutting by the CRISPR system at the initial stage of transformation (Fig. 5).
However, the picked clones were mostly false positives with a low editing efficiency (Fig. 4a), even after five passages for enrichment, indicating that this inducible promoter may be too weak to express enough of the CRISPR mini-array to cause ssDNA breaks, leading to many escape mutants. Therefore, a thermostable and strong inducible promoter for controlling sgRNA expression seems key to more convenient genetic modification.
The highly efficient genome editing system with few surviving colonies on plates was also likely due to a low rate of homologous recombination [9]. It has been noted in CRISPR/Cas genome editing of clostridial organisms that serial transfer and sub-culturing in liquid medium can enrich the desired homologous recombinants and increase the probability of gene editing [16]. In this way, the truly edited mutants were enriched, but with a low editing rate of 15.0-58.3% after 5-20 rounds of passaging. It seemed that the escape mutants increased more rapidly than the true-positive mutants during serial transfer. The effects of different lengths of homologous arms (HAs) and of adding a moderate amount of extra HAs during electro-transformation were also tested in order to overcome the potential limitation in homologous recombination. The results showed that using long homologous arms and adding HAs had a clearly positive effect on transformation efficiency (Fig. 4b). However, the editing rate was significantly lower when extra HAs were added. Therefore, depending on the purpose of the genome editing, one can trade off transformation efficiency against editing efficiency by adding HAs or using different promoters.
Multiplex genome editing is an important and necessary technology for obtaining engineered strains with better performance. In C. tyrobutyricum, an artificial CRISPR mini-array with two spacers on one plasmid was constructed for the deletion of two genes (spo0A and pyrF). However, the editing efficiency of the double-deletion vector was significantly lower than that of the single-deletion vector at the 8th generation of sub-culturing [16]. Furthermore, as the number of editing loci increases, the single plasmid (containing HAs) becomes larger, which increases the difficulty of vector construction and transformation. In addition, the mini-array cannot easily be obtained by overlap PCR when it contains more than one spacer, because the identical direct repeats in the CRISPR array interfere with PCR of the fragments for plasmid construction. As a result, a multi-gene editing CRISPR mini-array can only be obtained by costly DNA synthesis.
To edit multiple loci in the genome cheaply and efficiently, successive editing with one plasmid carrying one spacer would be a better choice. In our strain, with only a limited number of positive selection markers available, plasmid curing was unavoidable for successive editing. Several attempts were made to eliminate the plasmids from the mutants, but plasmid-free mutants were hard to screen out, which restricted the application of the CRISPR/Cas genome editing system. Here, a novel method for efficiently screening plasmid-free cells was developed, making multiplex genome editing possible in SCUT27. As shown in Fig. 6, the negative selection marker tdk, inserted into the editing plasmid, and its selection agent FUDR were used to recycle the editing plasmid and the resistance marker kanR. Starting from the strain SCUT27/Δtdk, SCUT27/Δtdk/Δldh was first acquired after the first round of editing. Then, with successive transfers, plasmid-free mutants appeared and were screened out on plates with 5-FUDR. The plasmid-cured mutants were further edited to delete argR with the native CRISPR/Cas system, yielding the mutant SCUT27/Δtdk/Δldh/ΔargR for testing of fermentation profiles. The plasmid framework, with different sgRNAs and HAs, can be delivered into the cells successively for multiple rounds of genome editing.
To date, this is the first successful successive gene editing in T. aotearoense strains with the Type I-B CRISPR/Cas system. The mutant SCUT27/Δtdk/Δldh/ΔargR showed better sugar utilization with improved ethanol production and yield, especially on xylose (Fig. 7).
With the five kinds of lignocellulosic hydrolysates, the ethanol production and yield of the mutant Δldh/ΔargR reached 9.27-10.45 g/L and 0.53-0.61 g/g sugar, improved by about 147.42-739.40% and 112.67-267.89%, respectively, compared with SCUT27 (Fig. 8). Δldh/ΔargR also showed a strong ability to tolerate lignocellulose-derived inhibitors in hydrolysates, such as weak acids, furan derivatives and phenolic compounds [40], especially in the rice and peanut straw hydrolysates (Fig. 8). The strong performance of the mutant was possibly due to the simultaneous up-regulation of both the DnaK-DnaJ-GrpE system and the GroEL-GroES chaperonin [5]. The mutant may also be able to eliminate ROS in the presence of the inhibitors and raise its ATP level, providing more energy to synthesize heat shock proteins, pump out cytoplasmic protons and transform inhibitors in response to the stress [5,41]. These results suggest that Δldh/ΔargR is a promising bioethanol producer with prominent advantages in dealing with the harsh environment of various hydrolysates.
The endogenous CRISPR/Cas system in SCUT27 could potentially be applied to more than the generation of deletion mutants. The technique could readily be used to introduce other types of mutations, such as fragment insertions and even point mutations. For point mutations, the homologous arms on the editing plasmid should be designed to change the functional PAM into a nonfunctional motif. Alternatively, substitutions could be introduced into the seed region, the first 8 nucleotides of the protospacer, which is crucial for CRISPR targeting [20,26]. Besides, CRISPR interference (CRISPRi), which represses the expression of target genes, has already been performed with Type I CRISPR/Cas systems [26,46] rather than only with dCas9 [47,48]. The subtype I-B CRISPR-Cas system of Haloferax volcanii lacking the cas3 and cas6 genes could be used to repress genes to various extents [46]. The Type I-E CRISPR/Cas system of Escherichia coli with cas3 deleted was also co-opted for programmable transcriptional repression, and yielded the strongest repression when targeting promoter regions [44]. This strategy offers a simple approach to converting many endogenous Type I systems into transcriptional regulators, expanding the available toolkit for CRISPR-mediated genetic control while creating new opportunities for genome-wide screens and pathway engineering.

Conclusion
The protocol developed in this study for using the endogenous Type I-B CRISPR/Cas system for multiplex genome editing extends the existing genetic toolbox of SCUT27 and may be adapted to other microorganisms that carry active endogenous CRISPR/Cas systems and are limited by the available selective markers and plasmids. The deletion of both ldh and argR by the native CRISPR/Cas system is a successful strategy to engineer SCUT27 for enhanced ethanol production from lignocellulosic hydrolysates.

Strains and culture condition
The strains used and engineered in this study are listed in Table 1. Escherichia coli DH5α, used for plasmid construction and propagation, was grown aerobically at 37 °C in LB broth or on solid LB agar plates supplemented with 50 μg/mL kanamycin or 100 μg/mL ampicillin as required for plasmid maintenance. SCUT27 and its mutants were cultivated anaerobically in optimized MTC medium (pH 7.0) [3,49] at 55 °C and 150 rpm in 100 mL serum bottles. For selection of transformants, solid MTC medium containing 50 μg/mL kanamycin (MTCK), 1 mM adenine (MTCA) or 50 μg/mL 5-fluoro-2'-deoxyuridine (MTCF) was used when required; plates were placed in an anaerobic chamber (Shellab, USA) for about 12 h of pre-deoxygenation before use. The lignocellulosic hydrolysates were prepared as in the previous study [4] and supplemented with 20 g/L CaCO3 to adjust the pH during fermentation. The hydrolysates were diluted to a xylose concentration of around 15 g/L.

Plasmid construction
The plasmids constructed in this study are listed in Table 1 and the oligonucleotides used are listed in Table S1. All plasmids were constructed using standard molecular biology techniques. The fragments were obtained by PCR using DNA polymerase KOD (Toyobo Co., Ltd For promoter screening, Pkan was cloned from the region upstream of the kanamycin resistance gene on pIKM1. Pcat1 was amplified from the cat1 (CTK_C06520) promoter region of C. tyrobutyricum ATCC 25755. PClo1313_1194 was amplified from the Clo1313_1194 promoter region of C. thermocellum [50]. The fragment of the adenine-responsive riboswitch RSpbuE was amplified from Bacillus subtilis [51], and the construction of Pkan-RSpbuE is shown in Fig. S3b. PadhE and the gene ldh were cloned from SCUT27. Every promoter was fused with the ORF of ldh by Splicing by Overlap Extension (SOE) PCR and then cloned into the MCS of pIKM1 by Gibson Assembly, generating pIKM1-P-ldh vectors for LDH expression in SCUT27/Δldh::ermR. The corresponding LDH enzyme activity was tested to characterize promoter strength. As a control, the 225 bp upstream of the ldh CDS was amplified as the potential promoter for ldh complementation.
For genome engineering by CRISPR/Cas9 in SCUT27, plasmids pLY1 and pLY2 were constructed containing the thermostable cas9 gene from G. stearothermophilus under the control of PadhE and Pkan-RSpbuE, respectively, to regulate the toxicity of Cas9 expression. In addition, an sgRNA expression module (the motif described in [10]) targeting ldh under the control of the constitutive promoter Pcat1 was inserted into both plasmids, together with a repair template for ldh deletion through homologous recombination. The resultant plasmids pLY1_ldh-HA12 and pLY2_ldh-HA12 were then transformed into SCUT27.
All editing plasmids with the native CRISPR mini-array were based on plasmid pKQ1, which was generated by inserting Pkan, the promoter of the kanamycin resistance gene, and the native putative terminator fragment Ter (298 bp of sequence located at the 3' end of the largest CRISPR array in SCUT27) into the multiple cloning site (MCS) of pIKM1. To obtain the killing or editing vectors carrying the mini-array, promoter Pkan and terminator Ter, each bearing the direct repeat (DR) sequence and part of the targeting spacer at one end, were amplified and joined by SOE PCR into a ~950 bp CRISPR expression cassette (Pkan-DR-Spacer-DR-Ter), which was then inserted by Gibson Assembly at the BamHI and EcoRI sites of the linearized vector according to the manufacturer's guidelines. The repair templates, each containing two homologous arms of 250-1000 bp flanking ldh (V518_0188) or argR (V518_1864) of the wild type, were amplified and introduced into the SalI and EcoRI restriction sites of pKQ1 to mediate recombination. For plasmid recycling, the tdk fragment with its putative promoter and terminator (amplified with primers tdk-in-F/R) was inserted upstream of Pkan on the editing plasmid.
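The layout of the CRISPR expression cassette described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_mini_array` is hypothetical, and the sequences are placeholder runs of N rather than real SCUT27 sequences. The 30-nt direct repeat, 37-nt spacer and 298-bp Ter lengths follow the text; the promoter length is an assumption chosen so that the cassette comes to about 950 bp.

```python
def build_mini_array(promoter: str, direct_repeat: str, spacer: str, terminator: str) -> str:
    """Concatenate the parts in the order Pkan-DR-Spacer-DR-Ter."""
    return promoter + direct_repeat + spacer + direct_repeat + terminator


# Placeholder sequences (runs of N), NOT the real SCUT27 sequences.
PKAN = "N" * 555          # assumed length, chosen to give a ~950 bp cassette
DIRECT_REPEAT = "N" * 30  # 30-nt direct repeat
SPACER = "N" * 37         # 37-nt targeting spacer
TER = "N" * 298           # 298-bp putative terminator

cassette = build_mini_array(PKAN, DIRECT_REPEAT, SPACER, TER)
print(len(cassette))  # total cassette length in bp
```

Because the same direct repeat flanks the spacer on both sides, the repeat appears twice in the assembled cassette.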

Transformations and mutant screening
All plasmids were ultimately isolated from E. coli DH5α, transformed into SCUT27 (OD600 ~ 0.8) and recovered in MTC overnight as previously described [3,30]. Clones that grew on MTCK plates were picked into 2 mL centrifuge tubes containing 1.0 mL of MTC medium with 50 μg/mL kanamycin and cultured at 55 °C for 24 h in an anaerobic chamber (Shellab, USA). They were then transferred into 20 mL serum bottles containing 10 mL of medium for 24 h at 55 °C and re-transferred into 100 mL serum bottles containing 50 mL of medium to enrich the mutants (adenine or FUDR was added when required). A series of sub-cultures (5% v/v inoculum) was carried out to enrich strains with the desired homologous recombination or plasmid curing before plating for selection. Finally, the mutants (deletion of ldh and argR, plasmid curing or overexpression) were identified by colony PCR, and by sequencing of the PCR amplicon when required, using primer pairs in which one primer annealed upstream and the other downstream of the repair template, as listed in Table S1.
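As a rough guide to how much growth each round of sub-culturing represents, the number of doublings per passage can be estimated from the inoculum fraction. The sketch below assumes that every culture regrows to the same final density before the next transfer, which was not measured here; the function names are hypothetical.

```python
import math


def generations_per_passage(inoculum_fraction: float = 0.05) -> float:
    """Doublings needed to regrow from the inoculum to the original density."""
    return math.log2(1.0 / inoculum_fraction)


def cumulative_generations(n_passages: int, inoculum_fraction: float = 0.05) -> float:
    """Total doublings over n passages (assumes identical regrowth each time)."""
    return n_passages * generations_per_passage(inoculum_fraction)


for passages in (5, 20):
    print(passages, round(cumulative_generations(passages), 1))
```

Under this assumption, a 5% (v/v) inoculum corresponds to log2(20), or about 4.3 generations per passage.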

Fermentation kinetics and analytical methods
Flask batch fermentations of SCUT27 and the mutants were performed at 55 °C and 150 rpm in 100 mL serum bottles containing 50 mL of MTC medium with various carbon sources, such as 25 g/L glucose, 25 g/L xylose or a 25 g/L glucose/xylose mixture (mass ratio = 2:1) to mimic lignocellulosic hydrolysates, as well as several kinds of lignocellulosic hydrolysates. During fermentation, samples were taken every 4 h until the end to determine the OD600 and the concentrations of residual sugar, ethanol, acetic acid and lactic acid in the broth. Cell density was monitored with a spectrophotometer (PERSEE T6, Beijing, China) at 600 nm. The concentrations of glucose and xylose and the production of ethanol and other metabolites were measured by HPLC (Waters, Milford, USA). All experiments in this study were performed independently at least three times, and data are expressed as mean and standard deviation (SD). Student's t test was used to compare statistical differences between groups of experimental data.
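The summary statistics above can be reproduced with the Python standard library. The sketch below assumes an unpaired two-sample Student's t test with pooled variance, since the exact variant is not specified in the text; the function name `students_t` is hypothetical and the triplicate values are illustrative placeholders, not measured data.

```python
import math
from statistics import mean, stdev, variance


def students_t(a, b):
    """Return (t statistic, degrees of freedom) for an unpaired pooled-variance t test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Placeholder triplicates for illustration only (not data from this study).
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]

print(mean(group_a), stdev(group_a))  # mean and sample SD of one group
print(students_t(group_a, group_b))   # t statistic and degrees of freedom
```

The t statistic is then compared with the critical value for the given degrees of freedom, or converted to a p-value with a statistics package.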

Enzyme assays
SCUT27 and the mutants overexpressing lactate dehydrogenase (LDH) were cultured anaerobically in modified MTC medium at 55 °C with 20 g/L glucose. When the OD600 reached ~0.8, 5 mL of cell culture was collected by centrifugation, and the cell pellet was washed twice with PBS (potassium phosphate buffer) and resuspended in 400 μL of PBS for disruption by ultrasonication on ice. The cell supernatant was obtained by centrifugation at 12000×g at 4 °C for 5 min, and the protein concentration was measured with the Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific, USA). The activity of LDH was measured in the direction of pyruvate reduction according to our previous study [52]. The lactate dehydrogenase reaction system had a volume of 210 µL, containing 100 µL of 5 mM NADH, 100 µL of 22.7 mM pyruvate and 10 µL of cell extract. The reaction was performed at 55 °C for 15 min, and the NADH consumed was measured at 340 nm. Specific activities are given as U/mg protein of cell extract, where one unit of enzyme activity corresponds to a decline of one unit in A340 per minute, as reported in the previous study [52].
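The unit definition above translates directly into a short calculation. The following is a minimal sketch, assuming that the protein in the assay is the protein concentration of the extract multiplied by the 10 µL added; the function names and the example readings are hypothetical and are not data from this study.

```python
def final_concentration_mm(stock_mm: float, added_ul: float, total_ul: float = 210.0) -> float:
    """Final concentration of a component after dilution into the reaction."""
    return stock_mm * added_ul / total_ul


def ldh_specific_activity(a340_start: float, a340_end: float, minutes: float,
                          protein_mg_per_ml: float, extract_ul: float = 10.0) -> float:
    """Specific activity in U/mg, with 1 U = decline of 1.0 in A340 per minute."""
    units = (a340_start - a340_end) / minutes           # A340 decline per minute
    protein_mg = protein_mg_per_ml * extract_ul / 1000  # mg protein in the assay
    return units / protein_mg


# Final NADH and pyruvate concentrations in the 210 µL reaction.
print(round(final_concentration_mm(5.0, 100.0), 2))   # NADH, mM
print(round(final_concentration_mm(22.7, 100.0), 2))  # pyruvate, mM

# Hypothetical readings for illustration only.
print(ldh_specific_activity(a340_start=1.0, a340_end=0.7, minutes=15.0,
                            protein_mg_per_ml=2.0))
```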

Single guide RNA design and CRISPR/Cas toxicity assays
After identifying the PAM (TTA) in the locus of the target gene, the spacer of the single guide RNA was chosen as the 37 nt sequence immediately downstream of the PAM, as shown in Table 2. This length equals the average length of the spacers in the native CRISPR arrays. For the toxicity assay of the native CRISPR system, a 37 bp sequence on the sense strand of the pyrF (V518_1373) locus with the motif 'GGC' instead of the predicted PAM 'TTA' was identified as a control and used as the non-target spacer. Pkan was chosen for mini-array expression because it is not present in the genome of SCUT27 and is therefore not a target for unwanted homologous recombination.
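The spacer selection rule described above (a PAM followed by the adjacent 37 nt) can be sketched as a simple sequence scan. This is a hedged illustration rather than the procedure actually used: the function name `find_spacers` is hypothetical, only the given strand is scanned, and the input sequence is a toy string, not part of the SCUT27 genome.

```python
def find_spacers(seq: str, pam: str = "TTA", spacer_len: int = 37):
    """Return (PAM position, spacer) for each PAM followed by a full-length spacer."""
    seq = seq.upper()
    hits = []
    start = seq.find(pam)
    while start != -1:
        spacer_start = start + len(pam)
        spacer = seq[spacer_start:spacer_start + spacer_len]
        if len(spacer) == spacer_len:  # skip PAMs too close to the sequence end
            hits.append((start, spacer))
        start = seq.find(pam, start + 1)  # allow overlapping PAM occurrences
    return hits


# Toy sequence for illustration only.
toy_locus = "CCTTA" + "ACGT" * 10
for position, spacer in find_spacers(toy_locus):
    print(position, spacer)
```

A non-target control can be obtained in the same way by passing a motif other than the predicted PAM, for example `pam="GGC"`.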
The toxicity assay was used to assess functional DNA cutting by the Type I-B CRISPR system in SCUT27, and was characterized by the bacterial survival rate (transformation efficiency). The wild-type strain was transformed as described above with plasmids carrying a target or non-target sgRNA directed at the pyrF coding region (with or without the predicted PAM). All

Figure 1
The prospect and CRISPR/Cas genome editing system of SCUT27. a With an optimal growth temperature over 50 °C and some advantages over mesophilic bacteria, SCUT27 is a promising candidate for producing biofuels from lignocellulosic biomass. b Schematic of the endogenous Type I-B CRISPR/Cas system in SCUT27, made functional by electro-transformation with editing vectors. Cleavage at the target chromosome region results from expression of the CRISPR mini-array under a constitutive or inducible promoter on the plasmid, which forms a ribonucleoprotein (RNP) complex with native Cas proteins. The editing plasmid carrying repair templates allows recombination between plasmid and chromosome before or during CRISPR interference.

Figure 2
Characterization of the Type I-B CRISPR/Cas system in SCUT27. a Structure of the only intact Type I-B CRISPR/Cas locus in the genome of SCUT27, which possesses a representative Type I-B Cas operon including cas6-cas8a-cas7-cas5-cas3 (effector), cas4-cas1-cas2 (adaptation) and the longest array containing 55 distinct spacers (diamonds) separated by 30-nt direct repeats (rectangles). b Identification of putative protospacers via in silico analysis in T. thermosaccharolyticum M0795. Spacer-protospacer mismatches are indicated in red. Eight nt of the 5′- and 3′-end adjacent sequences are provided. Putative PAMs are indicated in yellow. c PAM identification for the CRISPR system in M0795. The alignment of regions flanking the protospacers was used to create the sequence logo with WebLogo [53]. The 5' adjacent motif at positions -8 to -1 of the potential protospacers is indicated, and the 3-nucleotide 'PAM' is shown in the red box.

Figure 3
The construction and results of the editing plasmid pKQ1_ldh-HA12 for ldh deletion. a Schematic of pKQ1_ldh-HA12, comprising a synthetic CRISPR expression cassette and an ldh editing template. b Gel electrophoresis analysis of ldh deletion mutants (1,616 bp PCR bands) and SCUT27 wild type (2,421 bp). PCR products of SCUT27/Δldh (lanes 1, 2, 4, 6, 7, 8, 11, 12) and wild type were obtained with primers ldh-c-F and ldh-c-R, which anneal distal to the entire recombination region.

shows editing efficiency (blue points). "-" means without HAs addition. The data above are the means and standard deviations of three replicates.

Figure 5
Different processes and results of the transformation by editing plasmids with two kinds of promoters.
Expression of the sgRNA on the plasmids was controlled by the constitutive promoter Pkan (a) or the adenine-induced riboswitch-based promoter Pkan-RSpbuE (b). In the former case, self-cleavage occurred through immediate expression of the CRISPR mini-array after transformation. On MTCK plates, transformants without homology-directed repair, or with homology-directed repair but containing just one plasmid, died; only transformants with several plasmids and homology-directed repair could survive. In the latter case, transformants with one plasmid could survive on MTCK plates without inducer. After induction, transformants with homology-directed repair could survive on non-selective plates, while those without homology-directed repair died.

Figure 6
The general workflow of the multiplex genome editing system in SCUT27. The suicide vector pBlu_tdk-H12 was used for the recombination event that deletes tdk in the wild-type strain. The shuttle vector pKQ1_ΔX-H12::tdk (X in this study was ldh or argR) was used for CRISPR-mediated recombination events that delete gene X in SCUT27/Δtdk or in other engineered, plasmid-free strains for successive editing.

Figure 8
Fermentation profiles of SCUT27 and SCUT27/Δldh/ΔargR on dilute-acid-pretreated biomass hydrolysates in serum bottles.