CRISPR–Cas9-enabled genetic disruptions for understanding ethanol and ethyl acetate biosynthesis in Kluyveromyces marxianus

Background The thermotolerant yeast Kluyveromyces marxianus shows promise as an industrial host for the biochemical production of fuels and chemicals. Wild-type strains are known to ferment high titers of ethanol and can effectively convert a wide range of C5, C6, and C12 sugars into the volatile short-chain ester ethyl acetate. Strain engineering, however, has been limited due to a lack of advanced genome-editing tools and an incomplete understanding of ester and ethanol biosynthesis. Results Enabled by the design of hybrid RNA polymerase III promoters, this work adapts the CRISPR–Cas9 system from Streptococcus pyogenes for use in K. marxianus. The system was used to rapidly create functional disruptions to alcohol dehydrogenase (ADH) and alcohol-O-acetyltransferase (ATF) genes with putative function in ethyl acetate and ethanol biosynthesis. Screening of the KmATF disrupted strain revealed that Atf activity contributes to ethyl acetate biosynthesis, but the knockout reduced ethyl acetate titers by only ~15%. Overexpression experiments revealed that KmAdh7 can catalyze the oxidation of hemiacetal to ethyl acetate. Finally, analysis of the KmADH2 disrupted strain showed that the knockout almost completely eliminated ethanol production and resulted in the accumulation of acetaldehyde. Conclusions Newly designed RNA polymerase III promoters for sgRNA expression in K. marxianus enable a CRISPR–Cas9 genome-editing system for the thermotolerant yeast. This system was used to disrupt genes involved in ethyl acetate biosynthesis, specifically KmADH1–7 and KmATF. KmAdh2 was found to be critical for aerobic and anaerobic ethanol production. Aerobically produced ethanol supplies the biosynthesis of ethyl acetate catalyzed by KmAtf. KmAdh7 was found to exhibit activity toward the oxidation of hemiacetal, a possible alternative route for the synthesis of ethyl acetate. Electronic supplementary material The online version of this article (doi:10.1186/s13068-017-0854-5) contains supplementary material, which is available to authorized users.


Background
The yeast Kluyveromyces marxianus is a potentially valuable host for industrial biochemical synthesis. Wild-type strains are known to have fast growth kinetics, thermotolerance to ~50 °C, the ability to metabolize a range of monomeric and dimeric C 5 and C 6 sugars, and strong fermentation pathways for ethanol production [1][2][3]. A high capacity for protein expression and secretion is also advantageous and has been exploited in bioprocesses for inulase and galactosidase production [4,5]. Another industrially relevant phenotype is the native capacity of various strains to synthesize acetate esters. Ethyl acetate and other short-chain volatile esters are used as industrial solvents and as flavor and fragrance compounds; worldwide ethyl acetate demand is ~1.7 million tons per year [6,7]. This demand is typically met by converting ethanol and acetate into ethyl acetate by Fisher esterification in reactive distillation processes [6,8]. A recent bioprocessing pilot plant using K. marxianus produced ethyl acetate titers of 10.9 g L −1 from waste whey feeds with yields of 51.4%, thus providing a single-step alternative to the traditional process that relies on petroleum feedstocks (for acetate production) and multiple unit operations (ethanol fermentation, acetate production, and Fisher esterification) [8].
Metabolic synthesis of ethyl acetate and other shortand medium-chain esters can occur via a number of pathways [7]. In Saccharomyces cerevisiae, ester biosynthesis is primarily catalyzed by alcohol-O-acetyltransferase (Atf or AATase) activity that condenses acetyl-CoA with an alcohol to produce the corresponding ester [9,10]. Double knockout of ATF1 and ATF2 in S. cerevisiae eliminates the synthesis of the medium-chain ester isoamyl acetate and reduces ethyl acetate production by 50% [11]. These enzymes localize to the endoplasmic reticulum and lipid droplets; this has been shown to be critical for high activity [12,13]. Reverse esterase and alcohol dehydrogenase (Adh) activity may also contribute to acetate ester production. In Candida utilis, ethyl acetate is produced by Adh oxidation of a hemiacetal intermediate (the spontaneous product of ethanol and acetaldehyde) [14]. Adh1 from S. cerevisiae (ScAdh1) and an Adh from Neurospora crassa are also known to exhibit hemiacetal oxidation activity [15,16], a reaction that is thought to be involved in aldehyde detoxification under conditions of low NADH availability [7]. In K. marxianus, both Atf and reverse esterase activities have been identified, and Atf activity is thought to be primarily responsible for ethyl acetate biosynthesis [17,18]. However, the metabolism of K. marxianus is less well understood than that of S. cerevisiae, and promiscuous Adh activity toward hemiacetal oxidation and the roles of different Adh enzymes in ester and alcohol metabolism are not yet completely understood.
To enable gene disruption studies and identify the roles of KmADH1-7 and KmATF in ethyl acetate and ethanol biosynthesis, we adapted the type II CRISPR-Cas9 system from S. pyogenes for genome editing in K. marxianus. K. marxianus is known to have a high capacity for DNA repair by nonhomologous end joining (NHEJ), which can limit the rapid generation of targeted knockout libraries by traditional genome manipulation techniques [2,25]. In this work, we used a hybrid promoter strategy to express single-guide RNAs (sgR-NAs) that target Cas9 endonuclease to KmADH1-7 and KmATF for functional disruption in the CBS 6556 strain of K. marxianus [26,27]. Transcriptional and functional analysis of the disruption library revealed the critical role of KmADH2 in ethanol production, and that disruption results in acetaldehyde accumulation. In addition, analysis of the knockout strains coupled with overexpression studies revealed novel hemiacetal activity toward ethyl acetate synthesis from KmAdh7 and showed that KmAtf has activity toward the condensation of acetyl-CoA and ethanol.

Thermotolerance and ethyl acetate biosynthesis in K. marxianus CBS 6556
Various strains of K. marxianus have been shown to have fast growth kinetics at temperatures above 40 °C [2,28]. For example, the strain CBS 6556 exhibits high growth rates at 45 °C (0.60 ± 0.07 h −1 ; Fig. 1a). High growth rates were also observed at lower temperatures and were nearly twice that of the model yeast S. cerevisiae BY4742. We have previously identified CBS 6556 as a high producer of ethyl acetate and here demonstrate the growth-associated production of 3.72 ± 0.06 g L −1 from the wild type in a controlled aerated bioreactor (Fig. 1b was significantly limited (0.44 ± 0.02 nmol min −1 mg −1 ), and control assays with only ethanol, acetate, and acetaldehyde did not produce measurable amounts of ethyl acetate. Combined together, the assays suggest that ethyl acetate may be produced from one or both of the Atfand Adh-dependent pathways.

CRISPR-Cas9-mediated gene disruptions in K. marxianus
CRISPR-Cas9 genome-editing systems function by targeting the Cas9 endonuclease to specific loci in the genome through complexation with guide RNAs (sgR-NAs). sgRNA expression is often limiting to Cas9 function [27,29]. To address this potential problem, we designed a series of native and hybrid RNA polymerase III promoters for sgRNA expression, including SNR52, tRNA Gly , and fusions of SNR52-tRNA Gly , SCR1-tRNA Gly , and RPR1-tRNA Gly (Fig. 3a). We targeted xylitol dehydrogenase (XYL2) for promoter testing because successful disruption can be coupled to a phenotype that is easily screened, i.e., the loss of growth on xylitol supplemented agar media plates (Fig. 3c). XYL2 disruption efficiencies were determined by restreaking a minimum of 30 colonies that were subjected to mutation by CRISPR-Cas9 onto solid media plates with xylitol as the sole carbon source. Disruption efficiency was found to be promoterdependent (Fig. 3b). The highest efficiency, 66 ± 8%, was achieved with the RPR1-tRNA Gly promoter, while tRNA Gly achieved 52 ± 15% disruption efficiency. The SNR52, SNR52-tRNA Gly , and SCR1-tRNA Gly promoters were less successful, resulting in disruption efficiencies of 10 ± 6, 35 ± 7, and 18 ± 11%, respectively.
Gene disruption was also found to be sgRNA sequencedependent (Additional file 1: Table S1), and a scrambled sgRNA did not produce a loss of xylitol dehydrogenase function (Fig. 3b).
With a functional CRISPR-Cas9 system in hand, we created a library of mutant CBS 6556 strains with functional disruptions to KmADH1-7 and KmATF. In each case, sgRNA design was accomplished using a previously published sgRNA scoring algorithm to identify high-scoring sgRNAs upstream of each enzyme's putative active site [30]. The protocol for creating and screening genetic disruptions by CRISPR-Cas9 is schematically described in Additional file 1: Figure S1. The final steps of the protocol include quantifying the rate of gene disruption, which was determined by amplifying the gene of interest from the genome, and sequencing the PCR product to identify indels. The indel success rates, which ranged from 10 to 67%, are presented in Table 2. Importantly, the mutant strains used in subsequent experiments were cured of the CRISPR-Cas9 plasmid, and the gene of interest was sequenced to identify the specific mutations in each gene ( Table 2). In each case, the indel created a genetic frameshift mutation that results a premature stop codon, thus functionally disrupting the gene. Figure 1b, c as well as previous K. marxianus studies show that ethyl acetate production is growth-associated, as such KmATF transcript levels were examined at lag, log, and stationary phase [1,31]. Reverse transcription Cultures were grown on selective media for 2 days prior to screening on xylitol supplemented agar media. Data points represent the arithmetic mean of 30 colonies randomly selected from three different transformations. Error bars represent the standard deviation of the samples. c XYL2-disrupted strains were restreaked on xylitol-containing agar plates and screened for loss of growth. Gene disruption was confirmed by sequencing quantitative PCR (RT-qPCR) analysis showed that under aerobic conditions (the conditions required for high ethyl acetate production) KmATF is upregulated during stationary phase (Fig. 4a). Anaerobically, KmATF followed a growth-associated expression pattern (Additional file 1: Figure S2). CRISPR-Cas9 disruption of the gene produced a 15% reduction in ethyl acetate during aerobic growth and a 66% reduction under anaerobic conditions, suggesting that KmAtf has activity toward ethyl acetate biosynthesis (Fig. 4b). The effect of the gene disruption when cultured aerobically on synthetic minimal media was not statistically significant (Additional file 1: Figure S3). The alcohol-O-acetyltransferase activity of KmAtf was confirmed in lysate experiments with overexpression in S. cerevisiae (Fig. 4c); however, KmAtf (2.7 ± 0.1 nmol min −1 mg −1 ) was found to be less active than ScAtf1 (85.4 ± 3.1 nmol min −1 mg −1 ). Western blot analysis of the S. cerevisiae lysates used in the assay confirmed protein expression and suggested that some of the difference in activity was due to reduced enzyme expression (Fig. 4c).

The effect of KmADH1-7 disruptions on ethyl acetate and ethanol biosynthesis
To determine the effects of KmADH1-7 on ethanol and ethyl acetate biosynthesis, wild-type CBS 6556 and mutant strains were cultured, monitoring growth, gene expression, ethanol, ethyl acetate, and acetaldehyde production. The transcriptional patterns of the KmADHs in wild-type CBS 6556 are shown in Fig. 5. Transcriptional analysis was conducted aerobically at 5, 10, and 18 h, corresponding to lag, log, and stationary phases, respectively (Fig. 5a). KmADH1 and -2 were highly expressed after 5 h of aerobic growth and transcript levels remain consistent after 10 and 18 h. Transcript levels of KmADH3 and -4 at 5 h were significantly lower in comparison with KmADH1 and -2, but were upregulated during stationary phase. Analysis of KmADH5 revealed the lowest initial  Bars and error bars represent the arithmetic mean and standard deviation of triplicate biological samples. Statistical significance (P < 0.05) is indicated by asterisk and was determined using a T test expression level, with increased expression after 10 h of aerobic culture. KmADH6 expression remained consistent through each growth phase, and KmADH7 increased expression as cells entered stationary phase, both of which maintained initial levels lower than the highly expressed KmADH1 and -2. For the analysis of anaerobic expression, cultures were grown for 6, 14, and 24 h to lag, log, and stationary phases, respectively (Fig. 5b).
KmADH2 expression was the highest during all of the growth stages, indicating its importance in K. marxianus metabolism. Similar to KmADH2, KmADH1 was highly expressed throughout culture growth. KmADH3 and -4 showed increased expressions at stationary phase, KmADH6 and KmADH7 expressions increased upon reaching stationary phase, and KmADH5 expression was insensitive to growth phase. With respect to cell growth, KmADH1-7 and KmATF disruptions had no effect on cell growth under aerobic conditions (Additional file 1: Figure S4). Under anaerobic conditions, ethanol fermentation pathways are significantly upregulated and are necessary for cell redox balance. As expected, KmADH1-7 and KmATF disruptions affected growth (Additional file 1: Figure S5). More specifically, disruptions of KmADH2 and -4 resulted in growth rates of 0.25 ± 0.05 and 0.23 ± 0.08 h −1 compared with 0.36 ± 0.10 h −1 of CBS 6556 at 37 °C.

KmAdh activity toward ethyl acetate formation from hemiacetal
Because the KmADH1-7 disruption library did not reveal clear gene candidates for ethyl acetate production, each ADH gene was separately overexpressed in S. cerevisiae to facilitate lysate enzyme assays. ScAtf1 and -2 are known to be suppressed by oxygen; therefore, the S. cerevisiae lysates provided a cell lysate background that lacks the capacity to produce ethyl acetate [32]. Western blot analysis of C-terminal MYC-tag modified enzymes confirmed the overexpression of KmAdh1, -2, -5, -6 and -7 (Fig. 7). KmAdh3 and -4 were also expressed but at reduced levels. S. cerevisiae lysates overexpressing each to the enzymes were incubated with hemiacetal and NAD + cofactor, and the reactions were allowed to continue for 30 min. Notably, KmAdh7 produced ethyl acetate at a rate of 66.3 ± 2.9 nmol min −1 mg −1 of lysate protein. No other overexpressed enzymes produced measurable amounts of ethyl acetate. Thus far, the function of KmAdh7 has not been described, and the result presented here suggests hemiacetal activity toward the biosynthesis of ethyl acetate.

Discussion
The CBS 6556 strain of K. marxianus is thermotolerant and exhibits fast growth kinetics at 45 °C (Fig. 1a). It ferments high titers of ethanol under anaerobic conditions, [2] and aerobically produces significant amounts of the short-chain volatile ester ethyl acetate (Fig. 1c). CBS 6556 and other strains of K. marxianus also have the ability to metabolize various C 5 , C 6 , and C 12 carbon sources [1]. Collectively, these characteristics are useful for industrial-scale bioprocessing: high-temperature bioreactors can minimize the need for aseptic conditions, diverse carbon sources allow for the use of the lowest-cost sugars, and the high titer production of volatile compounds can facilitate low-cost product separation through distillation and stripping [33].
Despite these many advantages, K. marxianus strain development has been limited in comparison with the model yeast S. cerevisiae because genome-editing tools and stable heterologous expression systems developed for K. marxianus are limited [25,[34][35][36]. Enabled by hybrid RNA polymerase III promoters for sgRNA expression, we adopted the CRISPR-Cas9 system from S. pyogenes for K. marxianus genome editing (Fig. 3). This system was necessary to create a library of strains with disruptions to genes with suspected function in ethyl acetate and ethanol metabolism including KmADH1-7 and KmATF ( Table 1). The main observations stemming from our CRISPR-Cas9 experiments are that (1) the RPR1-tRNA Gly hybrid promoter achieved the highest knockout efficiencies (Fig. 3b), and (2) that gene disruption efficiency was found to be highly dependent on gene target and sgRNA sequence (Additional file 1: Table S2). We have previously observed a similar result for hybrid sgRNA promoter design in the yeast Yarrowia lipolytica, and the sequence dependency of sgRNAs has been described in both human and mouse systems [27,30].
The widespread adoption of type II CRISPR-Cas9 technologies for genome editing has made less genetically Fig. 6 Ethyl acetate, ethanol, and acetaldehyde production in K. marxianus CBS 6556 and ADH disruption strains. a-c Aerobic production of ethyl acetate, ethanol, and acetaldehyde production per cell density after 10 h of growth in YM media at 37 °C. d-f Anaerobic production of ethyl acetate, ethanol, and acetaldehyde production per cell density after 14 h of growth. Bars and error bars represent the arithmetic mean and standard deviation of triplicate biological samples tractable organisms more accessible [26,27]. These tools are particularly useful in organisms where NHEJ prevails over DNA repair by homologous recombination and when genomes are diploid [37,38]. For example, CRISPR-Cas9 genome editing has recently been demonstrated in the yeasts Y. lipolytica, S. pompe, P. pastoris, K. lactis, C. albicans, and S. cerevisiae [26,27,[37][38][39][40][41][42]. Many of these examples focus on tool development for genome and metabolic engineering, including standardized and multiplexed methods for heterologous gene integration [26,38,41,42]. The other examples use the genome-editing system, as we have done here, to study the effects of genetic disruptions on cell phenotypes and metabolism [37,39,40]. To our knowledge, these types of studies have not yet been accomplished in K. marxianus.
The ∆adh1-7 and ∆atf strains of K. marxianus CBS 6556 were created to investigate the effects of these genes on ethyl acetate and ethanol biosynthesis. Our first experiments toward this goal helped identify the targeted knockouts. Specifically, K. marxianus lysate assays showed ethyl acetate biosynthesis from both Atf-and Adh-catalyzed reactions (Fig. 2). Atf activity was observed when lysates were supplemented with ethanol and acetyl-CoA, while Adh activity was observed when supplementing with hemiacetal and NAD + cofactor. These results are in contrast to prior reports that describe Atf and reverse esterase activities as the only reactions responsible for ethyl acetate biosynthesis in K. marxianus DSM 5422 and other strains [17,18]. In the CBS 6556 strain, we found no reverse esterase activity, but did identify Atf and Adh activities as possible routes for ethyl acetate production. On the basis of these results, ∆adh1-7 and ∆atf strains of K. marxianus CBS 6556 were created to investigate the effects of these genes on ethyl acetate and ethanol biosynthesis.
Functional disruption of KmATF resulted in the statistically significant reduction of ethyl acetate biosynthesis under both aerobic and anaerobic conditions (Fig. 4b). It is important to note here that ethyl acetate production was ~100-fold higher when cells were grown aerobically. Ethyl acetate synthesis through KmAtf activity was further confirmed in S. cerevisiae lysate assays containing heterologously overexpressed enzyme (Fig. 4c). Previous studies on K. lactis Atf (KlAtf ), which is most closely related to KmAtf, revealed limited activity toward ethyl acetate in comparison with ScAtf1, a result that was also observed in our studies of KmAtf (Fig. 4c) [13,43,44]. Taken together, these results suggest that KmAtf activity contributes in part to ethyl acetate biosynthesis in K. marxianus, but that other alcohol-O-acetyltransferases and/or other metabolic pathways are also responsible.
It is well understood that ethanol production in yeast is accomplished by the Adh-catalyzed reduction of acetaldehyde. Various Adh enzymes have also been shown to catalyze ethyl acetate synthesis through the oxidation of hemiacetal (Fig. 2a) [7]. RT-qPCR analysis of the genes found to be most relevant to ethyl acetate production showed that KmADH2 was constitutively expressed in aerobic growth with glucose as a carbon source, while KmADH3 and KmADH7 expression increased as cells reached stationary phase. These results are in agreement with previously published analyses of the K. marxianus transcriptome [21]. The ∆adh2 strain showed reduced aerobic ethanol and ethyl acetate production, suggesting a large role in ethanol and consequently ethyl acetate production. Disruption of KmADH3 also had a significant effect on ethyl acetate and ethanol formation, but this is likely due to the role of KmADH3 in cellular cofactor balance [45]. Under both aerobic and anaerobic conditions, the disruption of KmADH2 resulted in a large accumulation of acetaldehyde, something that was not observed in the ∆adh3 strain or the other knockouts strains, suggesting that KmAdh2 is the dominant enzyme in ethanol production when glucose is used as a carbon source. With respect to KmADH7, the lysate assays conducted with overexpressed KmAdh1-7 in S. cerevisiae (Fig. 7) demonstrated that KmAdh7 is a source of hemiacetal activity in K. marxianus (Fig. 2). Bioinformatic analysis revealed that KmAdh7 has no significant similarity to KmAdh1-6. NADH cofactor usage is similar to KmAdh1-5, but otherwise KmAdh7 appears to be unique among K. marxianus Adh enzymes. KmAdh7 does, however, show sequence similarity to an Adh from the bacteria Cupriavidus necator, but not to other yeast Adh enzymes [19]. The C. necator NAD(H)(P)-dependent enzyme exhibits broad substrate specificity including activity toward the oxidation ethanol and 2,3-butanediol, and the reduction of acetaldehyde, acetoin, and diacetyl; however, hemiacetal oxidation has not been demonstrated [46,47]. Such activity has been reported for Adhs found in both C. utilis and N. crassa, as well as in S. cerevisiae [14][15][16]. The ∆adh7 strain of K. marxianus CBS 6556 studied here did not result in a reduction in ethyl acetate biosynthesis (Fig. 6a) and, therefore, the role of KmAdh7 in ester biosynthesis remains unclear.
Previous investigations of K. marxianus metabolism suggest that Atf activity plays a critical role in ethyl acetate biosynthesis [17,18]. The studies presented here use new genome-editing tools and biochemical assays to confirm the function of KmATF, acetate ester synthesis via the condensation of an alcohol with acetyl-CoA. CRISPR-Cas9 technology adopted for use in K. marxianus also allowed for the creation of a library of KmADH disruption strains, identifying KmADH2 as critical for the reduction of acetaldehyde to ethanol (a precursor to ethyl acetate). In addition, KmAdh7 was found to exhibit activity toward the oxidation of hemiacetal to ethyl acetate, but the absence of this activity in the ∆adh7 strain of CBS 6556 did not produce a measurable reduction in ethyl acetate biosynthesis. It was recently postulated that ester biosynthesis in K. marxianus may also occur through homologs to the medium-chain acyltransferases Eeb1 and Eht1 from S. cerevisiae, the isoamyl acetate-hydrolyzing esterase Iah1, the N-acetyltransferase Sli1 and/or the alcohol-O-acetyltransferase Eat1 [44,48]. In S. cerevisiae, Eeb1 and Eht1 have limited activity toward the acetylation of ethanol and are most active toward medium-chain acyl-CoAs and alcohols [49]. If similar activities are found in the K. marxianus homologs, then these enzymes are unlikely to contribute to ethyl acetate production. Overexpression of Iah1 in S. cerevisiae resulted in lower ester titers due to ester hydrolysis, suggesting that the K. marxianus homolog may not contribute to ethyl acetate biosynthesis [50]. A recently published study describes Eat1 as a putative alcohol-O-acetyltransferase capable of producing ethyl acetate [48]. The new genome-editing tools created in this work should enable the future study of the role of N-acetyltransferases and other enzymes in ethyl acetate biosynthesis.

Conclusion
In this work, we developed an efficient CRISPR-Cas9 system for genome editing in K. marxianus. This system was used to create a library of single-knockout strains to investigate ethyl acetate and ethanol biosynthetic pathways in K. marxianus CBS 6556. Analysis of the knockout strains revealed the importance of KmADH2 in ethanol production in glucose-fed aerobic and anaerobic cultures. With respect to ethyl acetate biosynthesis, KmADH2 is necessary to produce ethanol, a substrate for the Atf-catalyzed condensation reaction with acetyl-CoA. Because functional disruption of KmATF did not completely abolish ethyl acetate production, alternative biosynthetic routes are likely present in K. marxianus. One possible pathway is the oxidation of hemiacetal (the spontaneous product of ethanol and acetaldehyde) by Adh activity, an activity that we identify in KmAtf7.

Strains and culturing conditions
All strains were purchased from ATCC or DSMZ (Deutsche Sammlung von Microorganismen und Zellkulturen). All materials were purchased from Fisher Scientific unless noted otherwise. All yeast strains used in this study are listed in Additional file 1: Table S2.
Wild-type K. marxianus strain CBS 6556 as well as the S. cerevisiae BY4742 strain were grown in shake flasks containing 50 mL YM media (3 g L −1 yeast extract, 3 g L −1 malt extract, 5 g L −1 peptone; DB Difco ® , Becton-Dickinson) with 10 g L −1 glucose. Overnight cultures were inoculated with isolated single colonies freshly grown on agar YM plates. The overnight cultures were used to inoculate shake flask at an OD of 0.05, which were subsequently cultured at 30, 37, 40, 45, and 48 °C. For K. marxianus cell lysate studies, strain CBS 6556 was grown for 8 h in 1 mL YM media at 37 °C. Then, 500 µL was transferred into 50 mL YM media, and culture was grown for 16 h.
For Adh and Atf overexpression studies, S. cerevisiae strains (YAL1-9 and YS202, see Additional file 1: Table  S2) with either an empty vector or an Adh/Atf expression plasmid were grown at 30 °C for 8 h in 1 mL SD-U and then transferred to a shake flask with 50 mL SD-U and grown for an additional 16 h.

Bioreactor cultures
K. marxianus strain CBS 6556 cultures were grown in a 1-L stirred bioreactor in batch mode (Biosatat ® A, Satorius AG). The vessel was equipped with four baffles, two Rushton impellers, gassing tube; exhaust-gas cooler; ports for supplementation, inoculation, and sampling; and sensors for dissolved oxygen, temperature, and pH. Cultures were grown in synthetic-defined (SD) media (6.7 g L −1 yeast nitrogen base without amino acids, DB Difco ® ; and 0.79 g L −1 CSM, Sunrise Science Products) containing 20 g L −1 glucose and 1 mL of a 1:1000 dilution of antifoaming agent (Antifoam B Emulsion, Sigma-Aldrich) at 37 °C. Twenty-five-milliliter overnight cultures were used to inoculate the reactor at an initial cell density of 0.08 OD. Dissolved oxygen (DO) concentration was maintained at 60% saturation by constant aeration with 1000 ccm air and varying stir rates. Media pH of 5 was maintained by titration with 1 M sodium hydroxide. When necessary, liquid was taken through the sampling port. Gas sampling was accomplished by collecting 0.4 L of exhaust gas in a 1-L gas-sampling bag connected to the gas-sampling port (Supel ™ Inert Multi-Layer Foil Gas Sampling Bag, Sigma-Aldrich). OD was measured, and spent media and gas samples were analyzed by GC-FID. Media samples for glucose analysis were spun down to removed cells and the supernatant was stored at −20 °C prior to analysis.

Glucose analysis
The spent media of the bioreactor experiments was analyzed for residual glucose using the Glucose (GO) Assay Kit (GAGO-20; Sigma-Aldrich). The procedure was slightly modified. Briefly, 200 µL of sample was mixed with 400 µL Glucose Assay reagent. After reaction for 30 min at 37 °C the reaction was stopped by addition of 400 µL 12 N H 2 SO 4 . 250 µL of the solution mixed was then transferred into a 96-well plate, and absorbance was determined at 540 nm using a BioTek Synergy 2 UV-Vis plate reader.

Molecular cloning and plasmids construction
All cloning was accomplished using Phusion polymerase, restriction endonucleases, and Gibson assembly master mix purchased from New England BioLabs (NEB). DNA oligos were purchased from Integrated DNA Technologies (IDT). Chemically competent DH5α Escherichia coli was used for plasmid propagation. Following transformation, E. coli cells were grown in LB medium containing 100 mg L −1 ampicillin. All plasmids and primers used are listed in Additional file 1: Tables S3 and S4. The CRISPR-Cas9 plasmid was constructed using pJSK316-GPD (a kind gift from Dr. Dae-Hyuk Kweon, Sungkyunkwan University) that contains the backbone necessary for plasmid retention in K. marxianus [51]. This plasmid was digested with the restriction enzymes KpnI and SacII. The Tef1p-Cas9-Cyct cassette was amplified from p414 (Addgene #43802) using primers P1379/P1525 [52]. The structural guide RNAs containing a ScSNR52 promoter, an Ade2 target sequence, and the structural guide RNA (SNR52-sgRNA-Ade2; Additional file 1: Figure S7) were designed based on previously described sequences [52]. The SNR52-sgRNA-Ade2 fragment was amplified using primer P1626/P1530. The Cas9 and sgRNA fragments were inserted into the digested pJSK316-GPD by Gibson Assembly [53]. For increased editing efficiency, the Cas9 nuclease sequence was codon optimized using Optimizer (http://genomes.urv.es/OPTIMIZER/) and the codon usage table for K. marxianus [54]. The resulting sequence was then manually altered, as shown in Additional file 1: Figure S8, to allow for production of three similar-sized gBlocks (double stranded fragments from IDT DNA). The CRISPR plasmid (pIW333) was cut with SpeI and XhoI, and the 3 gBlocks were inserted by Gibson Assembly to create a new KmCRISPR plasmid (pIW360). Different RNA polymerase III promoters were designed to assess the expression of sgRNAs and CRISPR-Cas9 efficiency [27]. The xylitol dehydrogenase gene (XYL2) was used as reporter gene for Cas9induced gene disruption. The CRISPR-Cas9 plasmids with varying sgRNA promoters were constructed by inserting different RNA polymerase III promoters into pIW360. ScSNR52 was amplified from pIW360 using P1626/P1789. The K. marxianus RNA polymerase III promoters were amplified from isolated K. marxianus CBS 6556 genomic DNA. KmSNR52 was amplified with P1792/P1806, while P1792/P1793 were used to generate the SNR52 fragment for KmSNR52-tRNA Gly . KmSCR1 and KmRPR1 were amplified using primers P1795/P1796 and P1798/P1799, respectively. The tRNA Gly sequence was amplified with primers P1839/40 for insertion by itself or with P1790/P1807, P1794/ P1807, P1797/P1807, and P1800/P1807 for insertion with ScSNR52, KmSNR52, KmSCR1 and KmRPR1, respectively. The sgRNA fragments targeting XYL2 were amplified using primers P1773/P1530 and P1775/P1530. To exchange the sgRNA target sequence, the reverse primer of the promoter fragment and the forward primer of the sgRNA fragment were replaced by appropriate primers containing the target sequence.
For ADH and ATF overexpressions in S. cerevisiae, the eight genes were separately cloned into the pRS426 vector containing a PGK1 expression cassette (pIW21) [13]. The ADH and ATF genes of interest were identified in the annotated genome of K. marxianus DMKU3-1042 as KLMA_40102, KLMA_40220, KLMA_80306, KLMA_20158, KLMA_20005 KLMA_80339, and KLMA_40624 for KmADH1-7, respectively and KLMA_30203 for KmATF [19]. Blast searches confirmed the presences of each gene in the unannotated genome of CBS 6556. All proteins were tagged with a C-terminal Myc tag and cloned into pIW21 at the SacII and SpeI sites using Gibson Assembly. Coding sequences for ADH2-7 and ATF were made using the primers shown in Additional file 1: Table S4 and cloned into pIW695 that was cut with SacII and AvrII. The resulting plasmids pIW696-702 are listed in Additional file 1: Table S3.

Transformation of K. marxianus
Plasmid and linear DNA transformations were performed using a previously reported protocol with the following modifications [55]. In brief, 1.5 mL K. marxianus cells was grown to stationary phase and harvested by centrifugation at 5000 rpm for 1 min. After washing with 1 mL of sterile water, cells were suspended in 100 µg carrier DNA (salmon sperm DNA), and 0.2-1 µg of plasmid or linear DNA. 400 µL of transformation mix (40% polyethylene glycol 3350, 0.1 M lithium acetate, 10 mM Tris-HCl (pH 7.5), 1 mM EDTA, and 10 mM DTT) was then added, and the solution was incubated at room temperature for 15 min. Subsequently, the transformation mix was heat shocked at 47 °C for 15 min, and cells were plated.

Creation of a URA3 auxotrophic strain
To create a URA3 disruption strain, a truncated K. marxianus URA3 fragment (missing 160 bp of the coding region) with 500 bp homology upstream and downstream was transformed into K. marxianus strain CBS 6556. Transformed cells were recovered overnight and plated on 5-fluororotic acid (5-FOA) containing plates. Solid media contained 6.7 g L −1 yeast nitrogen base (Becton-Dickinson), 1.92 g L −1 yeast synthetic drop-out medium supplements without uracil (Sigma-Aldrich), 50 mg L −1 uracil (Sunrise Science Products), 1 g L −1 5-FOA (Sigma-Aldrich) and 20 g L −1 glucose. Colonies were selected based on colony PCR and selected colonies were sequenced (Additional file 1: Figure S9). To create the URA3 disruption homology donor, overlap extension PCR was performed as previously described [56]. Overlapping fragments of upstream and downstream regions of the sequence targeted for deletion are amplified using primers P1019/P1920 and P1021/P1022, respectively. Resulting fragments are purified and used for overlap PCR with P1019/P1022. The resulting fragment was purified and used for transformation. To confirm disruption, genomic DNA was screened using primers P1072/1073 where a knockout resulted in a 200 bp amplified fragment, with the wild-type gene producing a 360 bp fragment. The knockout design is shown in Additional file 1: Figure S10.

CRISPR-Cas9-mediated gene disruption
The K. marxianus CRISPR-Cas9 system developed in this work was adopted from systems developed for Y. lipolytica and S. cerevisiae [27,52]. Cas9 was codon optimized for K. marxianus and was expressed from a plasmid using the S. cerevisiae Tef1 promoter. For sgRNA expression, RNA polymerase III promoters in K. marxianus were identified by blasting the K. lactis genes of SNR52 (NC_006042.1), RPR1 (NC_006042.1), and SCR1 (NC_006042.1) against the draft genome of K. marxianus strain CBS 6556 (Accession Number: AKFM00000000) [57]. The search yielded K. marxianus SNR52, RPR1 and SCR1 that had 86, 75, and 82% identity to the respective K. lactis genes. The promoter regions were identified by searching for conserved A and B-box motifs as previously described [27,58]. Promoter regions were defined as ~100 bp upstream of the A box until the start of the coding region of the gene. For KmSCR1, the boxes were within the transcribed region, which is why the promoter was chosen as the start of the aligned sequence to about 30 bp downstream of the identified B-box. The glycine tRNA (tRNA Gly ) was identified by blasting the annotated tRNA-Gly (AGG) from K. marxianus strain DMKU3-1042 (RNA central; URS00003CECDB; [19,50]) against the genome of CBS 6556 as described above. sgRNA target sequences for xylose dehydrogenase (XYL2), ADH and ATF knockouts were identified using the sgRNA design tool hosted by the Broad Institute (http://www.broadinstitute.org/rnai/public/analysistools/sgrna-design) [30]. Target sequences were checked for secondary structures using the IDT OligoAnalyzer Tool 3.1 (https://www.idtdna.com/calc/analyzer) and uniqueness within the K. marxianus genome using BLAST. All sgRNA sequences are listed in Table 2 and Additional file 1: Table S1.
For ADH and ATF disruptions, transformed K. marxianus cells were cultured in 2 mL SD-U media or plated on SD-U plates and grown for 2 days at 30 °C. If cultured in liquid media, 50 µL of cell culture was transferred into new media after 1 day of culturing. After 2 days of growth colonies were screened by amplifying the CRISPR-Cas9-edited region in the genome by colony PCR and subsequent sequencing of the purified PCR fragments. Positively confirmed disruptions colonies were saved at −80 °C after the plasmid was removed.

Adh and Atf protein sequence analysis
Homology of the K. marxianus Adh1-7 and Atf proteins to other proteins was analyzed by Pairwise Sequence Alignment using the EMBOSS Needle software form EMBL-EBI. Analyzed sequences are shown in Additional file 1: Table S5.

Headspace gas chromatography
One-milliliter of culture supernatant or cell lysate reaction was used for headspace GC analysis in a 10 mL headspace vial containing 1 g of NaCl and 20 µL of 5 g L −1 1-pentanol as internal standard. Volatile metabolite concentration was measured using an Agilent 7890A system equipped with an Agilent DB-624UI column and an FID detector. For metabolite separation, the temperature was held at 40 °C for 2 min, then increased 20 °C min −1 to 70 °C and 50 °C min −1 to 220 °C and held for 2 min. For the bioreactor off gas analysis 1 mL of the off gas was injected from the gas sample bag by manual injection.

Reverse transcription quantitative PCR (RT-qPCR)
Total RNA was extracted using the YeaStar ™ RNA Kit (Zymo Research). RNA was DNAse treated (DNAse I, NewEngland Biolabs) for 30 min. and subsequently purified using the RNA Clean & Concentrator ™ -5 Kit (Zymo Research). RNA was used for the reverses transcription reaction (iScript ™ Reverse Transcription Supermix for RT-qPCR, Bio-Rad) and cDNA was used for SYBR Green qPCR (SsoAdvanced ™ Universal SYBR ® Green Supermix. Bio-Rad) using the Bio-Rad CFX Connect ™ . Primers used for the qPCR reaction are listed in Additional file 1: Table S4. For Figs. 4a and 5a, b (top panel) total copy number was calculated from a standard curve with GAPDH as an internal standard. Fold changes in Fig. 5a, b (bottom panel) were calculated under consideration of the reaction efficiency as previously described [59]. Log and stationary phase expression were compared with lag phase expression using transcript level normalized to GAPDH.

Total cell lysates
Cell lysis was performed as described earlier [13]. In short, cell were harvested, washed, and resuspended in equal volumes of wet cell pellets, 425-600 μm acidwashed glass beads (G8772, Sigma-Aldrich), and icecold lysis buffer (100 mM potassium phosphate buffer, 2 mM magnesium chloride, 5 mM DTT, and Pierce ™ Protease Inhibitor Tablets). The cells were disrupted at 4 °C by vortexing 10 times for 30 s with a 30 s cooling step between each vortexing. The beads were removed by centrifugation at 500g for 5 min at 4 °C, and the supernatant transferred to a pre-cooled 1.5 mL tube. The protein concentrations of whole cell lysates were determined by Pierce ™ 660 nm Protein Assay.

Enzyme activity assay
K. marxianus strain CBS 6556 and S. cerevisiae strains containing Adh and Atf expression plasmids are described in Additional file 1: Table S1. One-hundred micrograms of total cell lysate isolated from the various strains was used in each reaction. Lysates were incubated in 100 mM potassium phosphate (pH 7.4), 500 mM ethanol, and 0.5 mM acetyl-CoA to test for Atf activity and 100 mM potassium phosphate (pH 7.4), 100 mM acetaldehyde, 1 M ethanol, and 30 mM NAD + (Sigma-Aldrich) to test for hemiacetal activity of Adh. To test for esterase activity, cell lysates were incubated in potassium phosphate (pH 7.4), 500 mM ethanol, and 500 mM potassium acetate. The reaction mixtures were incubated for 30 min at 30 °C. The samples were analysis by headspace chromatography as described above. To allow for hemiacetal production, reaction mixtures with 10 M ethanol and 1 M acetaldehyde were mixed beforehand, and pH was adjusted to 10 using sodium hydroxide as previously described [15]. The chromatograms of the hemiacetal solution and vector control experiments showed no significant peak for ethyl acetate (Additional file 1: Figure  S11).

Western blot analysis
Western blot analysis was performed to confirm the expression of Adh and Atf proteins with C-terminal c-Myc tag. 2.5 OD of cells were lysed using 0.1 M NaOH. Samples were loaded onto a 10-well Any kD ™ Mini-PROTEAN ® TGX ™ Precast Protein Gel (BioRad) and run for 1 h at 150 V. Samples were then electrophoretically transferred overnight to a PVDF membrane at 25 V. Membranes were blocked with 5% nonfat milk in TBST buffer for 1 h at room temperature and incubated with anti-c-Myc mouse antibody (Sc-40, Santa Cruz Biotech) or anti-GAPDH (PA1-987) diluted to 1:2000 and 1:5000 in TBST buffer with 1% nonfat milk. Goat anti-mouse IgG-HRP (31,430) diluted to 1:10,000 was added as secondary antibody and incubated at room temperature for 30 min. After washing with TBST, HRP substrate (Bio-Rad) was used for signal detection. Blots were imaged using the BioRad ChemiDoc ™ MP System with the Image Lab software.

Statistical analysis
Data points represent arithmetic means of at least triplicate biological samples, and error bars represent the standard deviation. Comparisons between two samples were accomplished by an unpaired two-tailed T test with a significant difference at P < 0.05. Groups of samples were analyzed by one-way ANOVA with a Tukey post hoc test and considered significant at P < 0.05. Statistical analysis and plotting of data points were performed using the GraphPad Prism software.

Authors' contributions
AL and IW planned the experiments and analyzed the data. AL, RE, and AF performed the experiments and bioinformatics analysis. CS helped to plan and design the CRISPR-Cas9 experiments. AL and IW wrote the manuscript, with editing and input from all authors. All authors read and approved the final manuscript.