Improving production of Streptomyces griseus trypsin for its application in processing insulin precursor

Background: Trypsin is widely used in food and pharmaceutical manufacture. However, commercial trypsin is usually extracted from porcine pancreas, which carries the risks of infectious agents and immunogenicity. Microbial Streptomyces griseus trypsin (SGT) is therefore a preferred alternative, because it possesses efficient hydrolysis activity without these risks. However, the remarkable hydrolysis efficiency of SGT causes autolysis, and five autolysis sites (R21, R32, K122, R153, and R201) were identified from its autolysate. Results: The tbcf (K101A, R201V) mutant was obtained by a directed selection approach and showed improved activity in flask culture (60.85 ± 3.42 U·mL⁻¹, a 1.5-fold increase). In the molecular dynamics simulation, the K101A/R201V mutations shortened the distance between the catalytic residues D102 and H57 (6.5 Å vs 7.0 Å), which afforded an improved specific activity of 1527.96 ± 62.81 U·mg⁻¹. Furthermore, trypsin production was increased by 302.8% (689.47 ± 6.78 U·mL⁻¹) in a 3-L bioreactor by co-overexpressing the chaperones SSO2 and UBC1 in Pichia pastoris. Conclusions: SGT could be a suitable trypsin for insulin production. Through hydrolysate analysis and directed selection, the tbcf (K101A, R201V) mutant with 1.5-fold higher activity was obtained. Furthermore, trypsin production was improved 3-fold by overexpressing chaperone proteins in Pichia pastoris. Future studies should focus on the application of SGT in insulin and pharmaceutical manufacture.

from the porcine and bovine pancreas. However, due to the potential risk of contamination with infectious agents, the use of animal-derived trypsin is controlled in pharmaceutical and food manufacture.
Thus, microbial Streptomyces griseus trypsin (SGT) is a potential alternative, because it shares similarities with bovine trypsin (BT) in three-dimensional structure and catalytic properties. Moreover, SGT showed higher hydrolysis performance than BT in proteomics applications, because it produced a larger amount of matching tryptic peptides [11]. However, the high hydrolysis efficiency of SGT causes autolysis [12], owing to the 16 potential autolysis residues in its polypeptide sequence. Thus, high-yield production of trypsin might be hindered by autolysis of the active enzyme. The mutant tbcf (K101A), with R145I and R201V mutations, afforded higher stability against autolysis [13]. However, the autolysis residues of SGT have not been investigated in as much detail as those of BT [14,15].
When expression was induced from the pAOX1 promoter, human trypsinogen accumulated as inclusion bodies in the endoplasmic reticulum, which up-regulated the unfolded protein response (UPR) and then caused cascade degradation of the intracellular unfolded trypsinogen by the Bip protein [23]. In addition, the formation of disulfide bonds in trypsin was a burden for P. pastoris, because the resulting peroxide toxicity further induced the ER-associated degradation (ERAD) response [24]. In this study, our objective was to develop a strategy for the high-yield production of SGT in P. pastoris by directed selection and co-overexpression of chaperones.
Moreover, the SGT mutant could be applied to manufacture insulin from the insulin precursor.

Results And Discussion
Identification of the autolysis sites in tbcf (K101A)
SGT contains 16 potential autolysis residues (Arg, Lys). In a previous study, the stability and production of SGT were improved with the tbcf (K101A) mutant [13]. To further investigate the autolysis of tbcf (K101A), its hydrolysate was analyzed by MALDI-TOF-MS. Based on the autolysis fragments of tbcf (K101A), five autolysis residues were identified: R21, R32, K122, R153, and R201 (Fig. 1A). This result indicated that SGT slightly prefers to hydrolyze at R rather than K, because the hydrogen-bond interaction between R and the substrate-binding residue D189 is more stable than the water-bridged interaction between K and D189 [25]. Štosová et al. significantly reduced the autolysis of SGT by modifying R or K residues with the chemical reagents phenylglyoxal and formaldehyde, but the specific activity of SGT dropped to 12% of that of the parent enzyme [12], because R and K interact with other residues through hydrogen bonds, salt bridges and π-interactions.
These interactions play essential roles in SGT folding, three-dimensional structure, and catalytic activity. In the secondary structure of SGT, except for R201 (β-sheet), the other four autolysis residues (R21, R32, K122, and R153) were located in loop regions (Fig. 1B). Specifically, R21 interacted with Y131 through a hydrogen bond, as did R32 with T55 and Q128.
K122 formed salt bridges with D184 and E188, as did R153 with D60. These results indicated that the R21, R32, K122 and R153 residues help lock the three-dimensional conformation.
Eventually, the mutant tbcf (K101A, R201V) was found to show increased trypsin production and specific activity.
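As a rough illustration of how potential autolysis residues can be enumerated, the following Python sketch lists the Arg/Lys positions in a sequence and splits it after each one, since trypsin cleaves on the C-terminal side of Arg and Lys. The sequence `TOY_SEQUENCE` and both function names are hypothetical and introduced only for illustration. The toy string is not the SGT sequence, the positions are plain sequential indices rather than the residue numbering used in the text, and sequence-context effects on cleavage are ignored.

```python
def potential_cleavage_sites(sequence):
    """Return (position, residue) pairs for every Arg (R) or Lys (K), 1-based."""
    return [(i, aa) for i, aa in enumerate(sequence, start=1) if aa in "RK"]


def tryptic_fragments(sequence):
    """Split a sequence after each R or K (C-terminal cleavage)."""
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa in "RK":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in R or K.
        fragments.append(sequence[start:])
    return fragments


# Hypothetical toy sequence, NOT the real SGT sequence.
TOY_SEQUENCE = "VVGGTRAAQGEFPFMVRLSMGCK"

if __name__ == "__main__":
    print(potential_cleavage_sites(TOY_SEQUENCE))
    print(tryptic_fragments(TOY_SEQUENCE))
```

Comparing such a theoretical fragment list with the peptides observed by mass spectrometry is one way to narrow the candidate residues down to those actually cleaved.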

Enzyme kinetics and molecular modeling analysis of SGT mutant
MD simulation was applied to compare the three-dimensional structure of tbcf (K101A, R201V) with that of tbcf. The root-mean-square deviation (RMSD) of the protein backbone atoms indicates the stability of a protein [30]. After the 7 ns MD simulation, the backbone of the tbcf (K101A, R201V) mutant showed a lower RMSD (Fig. S2). This result indicated that the K101A/R201V mutations could improve the stability of the SGT backbone. Then, the internal interactions of the catalytic triad (H57, D102 and S195) were analyzed. Interestingly, the tbcf (K101A, R201V) mutant showed a shorter distance between H57 and D102 (6.5 Å vs 7.0 Å) in the catalytic center (Fig. 3A, B). Consistent with this, the tbcf (K101A, R201V) mutant, with a kcat/Km value of 1.53 × 10⁷ min⁻¹·mM⁻¹, afforded higher catalytic activity than the parent tbcf (K101A) (Supplementary Table 3). In particular, the increased specific activity might be attributed to the shortened distance between D102 and H57 in the catalytic triad, which could strengthen the hydrogen bond between the carboxyl oxygen of D102 and the δ-nitrogen of H57 [31]. This hydrogen bond stabilizes H57 in the catalytic transition state, which facilitates proton transfer from S195 to H57 [32]. Moreover, the Km values of tbcf (K101A, R201V) and tbcf were similar, (5.39 ± 0.36) × 10⁻² mM and (5.86 ± 0.16) × 10⁻² mM, respectively. This result indicated that the K101A/R201V mutations retained the conserved internal interactions at the substrate-binding domain.
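The relation between the kinetic constants quoted above can be made explicit with a short Python sketch. It assumes simple Michaelis-Menten kinetics; the function names are hypothetical. The turnover number it returns is only an arithmetic consequence of the two quoted values (kcat/Km and Km) and is not a figure taken from Supplementary Table 3.

```python
def implied_kcat(catalytic_efficiency, km):
    """k_cat = (k_cat / K_m) * K_m, in the time unit of the efficiency."""
    return catalytic_efficiency * km


def saturation_fraction(substrate_mM, km_mM):
    """Fraction of V_max reached at a given substrate concentration,
    v / V_max = [S] / (K_m + [S]), assuming Michaelis-Menten kinetics."""
    return substrate_mM / (km_mM + substrate_mM)


# Values quoted in the text.
EFFICIENCY_MUTANT = 1.53e7  # min^-1 mM^-1, tbcf (K101A, R201V)
KM_MUTANT = 5.39e-2         # mM, tbcf (K101A, R201V)
KM_TBCF = 5.86e-2           # mM, tbcf

if __name__ == "__main__":
    print(implied_kcat(EFFICIENCY_MUTANT, KM_MUTANT))  # min^-1
    # At equal substrate concentration, similar K_m values give
    # similar degrees of saturation for the two enzymes.
    print(saturation_fraction(0.1, KM_MUTANT), saturation_fraction(0.1, KM_TBCF))
```

With the quoted values the implied kcat is about 8.2 × 10⁵ min⁻¹. Because the two Km values are close, the difference in kcat/Km between the enzymes is mainly a difference in turnover rather than in substrate binding.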
High-yield production of SGT with co-overexpression of chaperones in P. pastoris
Protein expression is known to be regulated by the UPR [33] and ERAD [34] in P. pastoris. Expression of trypsinogen triggered the UPR and ERAD in P. pastoris, because of unfolded trypsinogen in the endoplasmic reticulum (ER) and the peroxide toxicity arising from disulfide bond formation [23,24,35]. Protein expression can be improved by upregulating endogenous proteins [36]. Therefore, twelve proteins involved in transcription regulation, disulfide bond formation and protein secretion were individually overexpressed. The ER-located chaperones perform diverse functions while polypeptides fold into the biologically active protein. These chaperones are involved in the oxidative reaction in protein folding (Ero1), disulfide bond formation (GLR1, PDI, and GSH2), and degradation of unfolded protein (UBC1) [34,35]. Interestingly, trypsin production was increased by 17.0% and 31.6% with overexpression of GSH2 and UBC1, respectively (Fig. 4). Moreover, the transport of polypeptides is known to be critical for secretory proteins. The Bip, SLY1 and SEC53 chaperones are responsible for the transport and recognition of nascent polypeptides in the ER and Golgi membrane [37], and SEC1 and SSO2 promote extracellular secretion of the folded protein [38,39]. Accordingly, overexpression of SEC1 and SSO2 increased the trypsin activity by 24.1% and 41.5%, respectively.
Then, SEC1, SSO2 and UBC1 were co-overexpressed, because each contributed an increase of more than 20% in trypsin activity. Finally, the strain GS115-tbcf (K101A, R201V)_SU, co-overexpressing SSO2 and UBC1, showed the highest trypsin production in flask culture, 109.25 ± 4.76 U·mL⁻¹ (an increase of 79.5%) (Fig. 4). This indicated that the bottleneck for high-yield production of SGT in P. pastoris might lie in secretory transport and degradation of unfolded trypsin.
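The percentage increases reported for the flask cultures can be cross-checked with a few lines of Python. This is a minimal sketch with hypothetical function names. It assumes that the 79.5% increase is relative to the flask-culture activity of the tbcf (K101A, R201V) strain without chaperone overexpression (60.85 U·mL⁻¹, as given in the abstract).

```python
def percent_increase(new_value, reference_value):
    """Relative increase of new_value over reference_value, in percent."""
    return (new_value - reference_value) / reference_value * 100.0


def reference_from_increase(new_value, increase_percent):
    """Back-calculate the reference value implied by a reported increase."""
    return new_value / (1.0 + increase_percent / 100.0)


if __name__ == "__main__":
    # Reported values (U/mL); the choice of reference strain is an assumption.
    co_overexpression = 109.25
    no_chaperones = 60.85
    print(round(percent_increase(co_overexpression, no_chaperones), 1))
    print(round(reference_from_increase(co_overexpression, 79.5), 2))
```

Under that assumption, the back-calculated reference activity (about 60.86 U·mL⁻¹) agrees with the reported flask value within its standard deviation.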

Application of high-yield trypsin to processing insulin precursor
For scale-up production of trypsin, the strain GS115-tbcf (K101A, R201V)_SU was cultured in a 3-L bioreactor according to the published method [13]. In the glycerol fed-batch phase, a high cell density (68.02 ± 1.5 g/L, DCW) was achieved with glycerol feeding and a high agitation speed (850 r·min⁻¹) (Fig. 5). The fermentation then entered the methanol-feeding phase once the glycerol was depleted, as indicated by an increase in DO (over 50%). After induction for 156 h, trypsin production reached 689.47 ± 6.78 U·mL⁻¹, an increase of 302.8% over the parent GS115-tbcf (K101A).
Mammalian trypsin is generally used to prepare insulin from its precursor, because of its canonical tryptic cleavage at lysine to remove the C-chain [4,40]. However, the traditional method for preparing trypsin is extraction from mammalian pancreas, which carries the risk of contamination with bioactive compounds, infectious viruses, and harmful proteases [41]. Although heterologous expression of mammalian trypsin could avoid these problems, it still suffers from immunogenicity issues, low expression levels, the requirement for zymogen activation, and autolysis [15,42,43]. Importantly, the SGT mutant tbcf (K101A, R201V) showed higher hydrolysis performance, with no immunogenicity, high production, autoactivation and stability against autolysis. The tbcf (K101A, R201V) was mixed with the insulin precursor rPI to afford the insulin precursor with an Asp30-deleted B-chain (PI-B D30) (Fig. 6A).
The tbcf (K101A, R201V) was compared with commercial porcine trypsin under identical conditions. After hydrolysis for 19 h, rPI was converted to PI-B D30, as demonstrated by the HPLC chromatograms. The elution time of rPI was 18.75 min (Fig. 6B). After cleavage by commercial porcine trypsin or tbcf (K101A, R201V), rPI was converted to PI-B D30, which eluted at 21.40 min (Fig. 6C, D). Thus, the engineered SGT mutant tbcf (K101A, R201V) showed potential for application in insulin manufacture, owing to a hydrolysis capacity equal to that of commercial porcine trypsin.

Conclusions
In this study, five autolysis residues were identified in SGT. The mutant tbcf (K101A, R201V), with improved hydrolysis performance and specific activity, was identified from a library of 35 mutants. Furthermore, trypsin production was increased by 302.8% (689.47 ± 6.78 U·mL⁻¹) by co-overexpressing the chaperone proteins SSO2 and UBC1 in P. pastoris. The engineered SGT tbcf (K101A, R201V) showed the same hydrolysis capacity as commercial porcine trypsin. Future studies should focus on the application of SGT in insulin and pharmaceutical manufacture [44].

Prediction of the point mutations
The mutations were predicted with three web servers: PoPMuSiC 2.1 [28], DUET [27] and NeEMO [29]. All predictions used 1SGT as the template. Based on the trypsin mutant tbcf (K101A), the servers were used to predict beneficial mutations at the five autolysis sites, with a cutoff of an increased ΔΔG value (PoPMuSiC, DUET) or a value over 1.4 kcal/mol (NeEMO). For the NeEMO prediction, the pH and temperature were set at 6.0 and 30 °C. In addition, the R32A, K122A, and R153A mutations were individually introduced into tbcf (K101A), because they increased activity in a previous study [13].

Plasmid and strain construction
The PCR products were generated from the pPIC9K-tbcf (K101A) plasmid with the corresponding primers (Supplementary Table 1). The ligated plasmids were transformed into E. coli strain JM109 and selected with 50 mg·L⁻¹ kanamycin. The constructed plasmids were then linearized with the restriction enzyme SalI, transformed into P. pastoris GS115 (His⁻) competent cells, and screened by the histidine prototrophic phenotype. Recombinant yeast with a high copy number of the trypsin cassette was then screened by streaking single colonies on YPD plates with a geneticin gradient (1, 2, 3 and 4 mg·mL⁻¹). q-PCR was applied to quantify the copy number of the trypsin cassette in recombinant yeast from the 4 mg·mL⁻¹ geneticin plate [22].
The endogenous proteins were expressed under the constitutive promoter pGAP. First, the backbone vector was amplified from the pGAPZB plasmid with the FpGAPZB and RpGAPZB primers. Then, the gene fragments were amplified from genomic DNA of P. pastoris GS115 (His⁻) with the respective primers, and hac1s (the intron-spliced hac1 gene) was synthesized from the coding sequence of hac1. Finally, the plasmids were constructed with a DNA ligation kit. In addition, the co-expression plasmids pGAP-SSO2/SEC1 and pGAP-SSO2/UBC1 were constructed. First, the backbone vector was generated from the pGAP-SSO2 plasmid with FpGAP-C and RpGAP-C. Then, the expression cassettes pGAPm-SEC1-tAOX1 and pGAPm-UBC1-tAOX1 were separately amplified with the primers FpGAPm and RAOX1 from the plasmids pGAPm-SEC1 and pGAPm-UBC1, whose AvrII restriction site was mutated by primers  Table 2.

Media and cultivation
The media included Luria-Bertani (LB) medium, Yeast Extract Peptone Dextrose (YPD) medium, Buffered Methanol-complex (BMMY) medium and Basal Salts (BSM) medium [13]. The yeast cells were pre-cultured in YPD medium for 24 h at 30 °C and 220 rpm. The cell pellets were then resuspended in BMMY medium and cultivated for 144 h under the same conditions, with 1% (v/v) methanol added every 24 h to induce the pAOX1 promoter. The scale-up cultivation was carried out in a 3-L bioreactor (INFORS, Switzerland) with 800 mL BSM medium. The cultivation process was divided into three phases. During the glycerol batch phase, the yeast cells were cultured at pH 5.5 and 30 °C, with the dissolved oxygen (DO) kept above 30% at a constant agitation speed. The glycerol fed-batch phase was started when the glycerol was depleted, as indicated by DO rising above 50%. The feeding solution (50% glycerol with 1.2% PTM1 solution) was added gradually to achieve high-density cultivation, with a higher DO level maintained by an increased agitation speed (800 rpm). In the methanol fed-batch phase, trypsin was expressed from the methanol-induced promoter pAOX1. To avoid repression of pAOX1 by glycerol, the inducer methanol was added 2 hours later, when DO was over 60%. Methanol was then fed gradually, according to the method developed by Wang et al. [45].

Expression of human insulin precursor
The codon-optimized coding sequence of recombinant human insulin precursor (rPI) was ligated into the pPIC9K plasmid at the EcoRI and BamHI restriction sites. The recombinant strain GS115-rPI was screened with 4 mg·mL⁻¹ geneticin for a high copy number of the expression cassette.
The scale-up cultivation and preparation of the insulin precursor followed the reported methods [4,46]. The insulin precursor sample was purified by ion-exchange chromatography and reversed-phase chromatography, then lyophilized after isoelectric precipitation.

Purification of trypsin
The culture supernatant was separated by centrifugation at 5000 g for 10 min for trypsin purification. The filtered (0.22 µm) sample was loaded onto a 1 mL benzamidine column, which was then equilibrated with buffer A (50 mM Tris-HCl, pH 7.4, 0.5 M NaCl). The trypsin was eluted with 60% buffer B (10 mM HCl, 20 mM NaAc, pH 2.0) [13]. The purified trypsin was further separated on a HiLoad 16/60 Superdex 200 pg column. Finally, the concentration of purified trypsin was determined with the modified Bradford protein assay kit.

MALDI-TOF-MS analysis
MALDI-TOF-MS was applied to identify the autolysis fragments. The concentrated hydrolysate of tbcf (K101A) [13] was loaded onto the MALDI-TOF-MS plate alongside a non-hydrolyzed control. The autoproteolytic peptides were then analyzed against the Swiss-Prot database [47].

Molecular dynamics (MD) analysis
The three-dimensional models of the trypsin mutants were simulated with NAMD 2.11 and the CHARMM27 force field [48,49]. For the simulation of the trypsin mutants tbcf and tbcf (K101A, R201V), the modeling template SGT (PDB: 1SGT) was downloaded from the RCSB Protein Data Bank [50]. The protein was placed in a cubic water box of 70 × 70 × 70 Å, with a water layer of more than 7 Å in each dimension and a 12 Å cut-off for non-bonded interactions. The default pH was set at 7.0, and Na⁺ was added for charge neutralization. After water equilibration (1 ns) and minimization (1000 steps), 60 ps of heating from 0 K to 300 K was performed before each primary simulation [51]. The temperature was then kept at 300 K for 150 ps of equilibration, followed by a 10 ns simulation for data sampling at constant temperature and pressure. Finally, the three-dimensional structures were analyzed and presented with the PyMOL Molecular Graphics System [52].
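The two quantities extracted from the trajectories, the backbone RMSD and an inter-residue distance, can be written out in a few lines of plain Python. This is a minimal sketch rather than the analysis pipeline used in this study. The function names and the coordinates are hypothetical, and the RMSD function assumes that the frame has already been superimposed on the reference structure (no fitting is performed).

```python
import math


def atom_distance(atom_a, atom_b):
    """Euclidean distance (in Å) between two atoms given as (x, y, z)."""
    return math.dist(atom_a, atom_b)


def rmsd(frame, reference):
    """Root-mean-square deviation between two equally ordered coordinate
    lists. Assumes the frame is already aligned to the reference."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(math.dist(p, q) ** 2 for p, q in zip(frame, reference))
    return math.sqrt(squared_sum / len(frame))


if __name__ == "__main__":
    # Hypothetical backbone coordinates (Å), for illustration only.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    frame = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (3.0, 0.0, -0.1)]
    print(rmsd(frame, reference))
    # Hypothetical positions of one atom each from two residues.
    print(atom_distance((0.0, 0.0, 0.0), (0.0, 0.0, 6.5)))
```

Evaluating `rmsd` for every saved frame against the starting structure gives an RMSD-versus-time curve, and `atom_distance` applied per frame gives the corresponding distance trace.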

Determination of trypsin activity and hydrolysis of insulin precursor
The culture supernatant or the purified enzyme was used to measure trypsin activity according to the published method [22]. The purified recombinant SGT and commercial porcine trypsin were used to hydrolyze the insulin precursor. Trypsin with 3000 U of BAEE activity was added to 10 mL of insulin precursor solution (60 mg·mL⁻¹ in 50 mM Tris-HCl, pH 8.0, with EDTA·2Na). The hydrolysis was performed at 25 °C for 19 h and then stopped by adjusting the pH to 3.0 with HCl. A hydrolysis solution without trypsin was set up in parallel as the control. The hydrolysis products were analyzed by HPLC with a modified version of the method developed by Richard et al. [53].
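The enzyme loading used in the hydrolysis can be expressed per milligram of substrate with a short calculation. The sketch below uses a hypothetical function name and reads the precursor concentration as 60 mg·mL⁻¹.

```python
def enzyme_units_per_mg_substrate(enzyme_units, volume_mL, substrate_mg_per_mL):
    """Enzyme loading in activity units per mg of substrate."""
    substrate_mg = volume_mL * substrate_mg_per_mL
    return enzyme_units / substrate_mg


if __name__ == "__main__":
    # 3000 U BAEE activity added to 10 mL of a 60 mg/mL precursor solution.
    print(enzyme_units_per_mg_substrate(3000, 10, 60))
```

Under that reading, the reaction contains 600 mg of insulin precursor and the loading is 5 U of BAEE activity per mg of precursor.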
All experiments were carried out in triplicate, and data are shown as means ± standard deviation.
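The "mean ± standard deviation" summary of triplicate measurements can be reproduced with Python's standard library. This is a minimal sketch: the function name and the replicate values are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed, since the text does not state which form was used.

```python
import statistics


def mean_and_sd(replicates):
    """Return (mean, sample standard deviation) of replicate measurements."""
    return statistics.mean(replicates), statistics.stdev(replicates)


if __name__ == "__main__":
    # Hypothetical triplicate activity readings (U/mL), for illustration only.
    readings = [58.0, 61.0, 64.0]
    mean, sd = mean_and_sd(readings)
    print(f"{mean:.2f} ± {sd:.2f} U/mL")
```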

Abbreviations
Figure 4 The trypsin amidase activity of yeast strains overexpressing endogenous proteins.