Enfuvirtide biosynthesis in thermostable chaperone-based fusion

Highlights
• Peptide biosynthesis in thermostable chaperone-based fusion
• Model peptide enfuvirtide sequence inserted between Ser 199 and Tyr 201 of modified GroEL chaperone
• Host proteins thermal denaturation as purification step
• Methionineless fusion partner and cyanogen bromide treatment to produce free peptide


Introduction
Peptides have great potential as biologically active substances: their secondary structure allows them to act on complex membrane receptors; natural peptide hormones and neurotransmitters serve as "blueprints" for new peptide pharmaceuticals; and peptide degradation products are incorporated into the human body's metabolism. In addition to a high clearance rate, they penetrate deeper into tissues and can be synthesized much more cheaply than conventional antibodies [1].
Modern chemical synthesis methods allow the production of polypeptides more than 100 amino acid residues long. For example, GenScript offers the synthesis of polypeptides up to 200 amino acid residues long as a routine service. However, the cost of the resulting drug usually limits its use. To date, the introduction of peptides longer than 40 amino acid residues into clinical practice remains limited [2].
Usually, large peptides are synthesized by a combination of solid-phase peptide synthesis (SPPS) and ligation methods. SPPS efficiently produces peptides up to 40 aa long, and native ligation joins such fragments into an even longer peptide chain. Both methods have their own weak spots.
Native ligation and its extensions use a wide variety of concepts and mechanisms, each with its own features. Usually, the process is slow and sensitive to the amino acids at the ends being joined [9]. HPLC purification of synthesized peptides is difficult and expensive. There are ways to make it cheaper; the most straightforward is to make the synthesis itself less prone to yielding closely related by-products. Biosynthesis is much more accurate than SPPS [10,11]: for a 40 aa biosynthetic peptide, the average content of single-substitution impurities is about 4% or less. By comparison, enfuvirtide, the best industrial example of SPPS, gives ~75% HPLC purity of the crude peptide [12]. The typical and most abundant peptide impurities are closely related to the product [13]: sequences with deletions or truncations, incompletely deprotected sequences, miscleavage products and side-reaction products.
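The "about 4% or less" estimate for a 40 aa peptide can be reproduced by compounding a per-residue error rate over the chain. The following is a minimal sketch, not the calculation used in this study: the misincorporation rate of 1e-3 per residue is an assumed, illustrative value for ribosomal translation, and the function names are hypothetical.

```python
def error_free_fraction(length: int, error_rate: float) -> float:
    """Fraction of chains with no misincorporated residue."""
    return (1.0 - error_rate) ** length


def single_substitution_fraction(length: int, error_rate: float) -> float:
    """Fraction of chains with exactly one wrong residue (binomial, k = 1)."""
    return length * error_rate * (1.0 - error_rate) ** (length - 1)


# ASSUMPTION: 1e-3 errors per residue is an illustrative value for
# ribosomal translation, not a number measured in this work.
ASSUMED_ERROR_RATE = 1e-3
PEPTIDE_LENGTH = 40  # residues, as in the example discussed in the text

if __name__ == "__main__":
    one_error = single_substitution_fraction(PEPTIDE_LENGTH, ASSUMED_ERROR_RATE)
    correct = error_free_fraction(PEPTIDE_LENGTH, ASSUMED_ERROR_RATE)
    print(f"error-free chains:              {correct:.1%}")
    print(f"single-substitution impurities: {one_error:.1%}")
```

With a lower assumed error rate the impurity fraction falls proportionally, which is why the figure is best read as an upper-range estimate.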
An alternative to SPPS is biosynthesis. Peptides are usually difficult to produce alone in their mature form in a bacterial expression system because of their activity toward the host cell or their degradation by cellular proteases. The most efficient strategies for peptide biosynthesis are expression as tandem repeats, in fusion with a well-expressed partner, or both. Ketosteroid isomerase deserves mention as the only commercially available fusion system that drives inclusion body formation, and because its advantages for the target peptide are similar to those of our fusion partner. Ketosteroid isomerase forms inclusion bodies, which facilitates purification and protects the cell from peptide toxicity and the peptide from proteolysis [14].
Earlier, we developed a fusion partner [15,16] for the biosynthesis of the toxic antibacterial peptide polyphemusin I in E. coli as part of a fusion protein. The target peptide was inserted into the sequence of the thermostable chaperone GroEL and co-expressed with the co-chaperone GroES. The insert was placed so that the peptide protruded into the substrate-binding cavity of the assembled GroELS particle. The main features of this fusion were expression of the fusion protein in a stable soluble form, protection of the peptide from the bacterial internal environment, protection of the bacteria from the toxicity of the synthesized peptide, and the possibility of purification from host proteins by heating the lysate.
It should be mentioned that the proposed fusion partner is intended only for the biosynthesis of peptides consisting of the 20 genetically encoded amino acids. Cyanogen bromide hydrolysis, if used, additionally restricts the use of methionine in the peptide sequence. The methionine link between the fusion partner and the target peptide can be modified to accommodate other cleavage protocols. An expanded genetic code and chemical or enzymatic modification of the synthesized peptide are probable workarounds for producing some NRPs and RiPPs, but they are out of the scope of the current work.
The main goal of this study is to show the applicability of the same fusion partner to a practically valuable peptide with different properties, enfuvirtide. In contrast to polyphemusin I, enfuvirtide is a large, hydrophobic, low-pI peptide. It is an active pharmaceutical ingredient used against HIV-1 and is chemically synthesized on a large scale. Despite its state-of-the-art synthesis, one dose cost more than $70 at the time of this study, and it must be administered twice a day.
Additionally, we refined the purification and hydrolysis protocol [16] aiming to make it easy and scalable for possible practical use.

Chemicals
LB tissue culture grade, buffer components and SDS-PAGE reagents "for biochemistry" grade, by Amresco, USA; Pierce Unstained Protein MW Marker by Thermo Fisher Scientific, USA; formic acid for biochemistry by AppliChem, USA and MS-grade solvents by Merck, Germany were used.
The plasmid ploop/ES and the synthetic gene were digested with BamHI and EcoRI (Thermo Scientific, USA) and ligated with T4 DNA ligase (Thermo Scientific, USA). All operations were performed in the provided buffers according to the enzymes' user manuals.

Fig. 4. Circular dichroism spectra of enfuvirtide Met→Hsl and theoretical curve, based on predicted secondary structure fractions. The plot was generated by the DichroWeb server.

Transformation and strain storage
5 mL of LB was inoculated with an overnight E. coli culture and incubated at 37 °C until OD600 reached 0.5±0.1. The culture was cooled to 4 °C and centrifuged at 3000 g for 4 min in a 5418R centrifuge (Eppendorf, Germany). Cells were washed twice with ice-cold water by resuspension and centrifugation, then suspended in 50 µL of ice-cold water. 0.5 µL of plasmid DNA (10-15 ng) was added, and the cells were pulsed with a MicroPulser Electroporator (Bio-Rad, USA) at 2.5 kV in a cooled 2 mm cuvette. 1 mL of LB was inoculated with the transformed cells and incubated for 1 h, then transferred to an LB agarose plate with ampicillin (100 μg/mL).

Cultivation, expression and lysis
LB medium containing ampicillin (100 μg/mL) was inoculated with stored cells and incubated at 37 °C overnight. 2.5 L of LB containing 100 μg/mL ampicillin was inoculated with 30 mL of the overnight culture and incubated until OD600 reached 0.5±0.1. IPTG was added from a 1 M aqueous stock to a final concentration of 0.4 mM, and incubation continued for 3 h. The cell culture was cooled to 4 °C, harvested by centrifugation at 3000 g for 15 min in a CR22N centrifuge (Hitachi, Japan), then washed with ice-cold 50 mM Tris buffer pH 7.5 containing 150 mM sodium chloride (TBS) by resuspension and centrifugation. Cells were stored at -20 °C.
Stored cells (about 3.6 g) were thawed at 4 °C, resuspended in 300 mL of TBS supplemented with 5 mM EDTA and 1 mM β-mercaptoethanol, and sonicated in an ice bath in a tall glass beaker with a Q500 sonicator (Qsonica, USA) at 100% amplitude, in 10 s pulses with 30 s intervals, to a total applied energy of 300 J/mL.
The supernatant was placed in a glass beaker and, with constant vigorous stirring, heated rapidly (15 min) from 4 °C to 65 °C, held at 65 °C for 5 min and cooled (20 min) to 4 °C. The heated sample was centrifuged again under the same conditions. The supernatant was collected and the pellet discarded.
β-mercaptoethanol and eluted with a 40-500 mM sodium chloride gradient over 10 CV. Fractions were collected and analyzed by SDS-PAGE. GroEL-enfuvirtide-rich fractions were pooled, and the buffer was exchanged to 10 mM ammonium acetate pH 7.5 on a HiPrep 26/10 Desalting column (Cytiva, USA) according to the column manual.
Gels were scanned with a Perfection 1600 photo scanner (Epson) at 600 dpi resolution, and the 8-bit grayscale TIFF images were processed with ImageLab 6.0.1 software (Bio-Rad, USA). A 0.5-2 µg BSA calibration curve with 95% CI was plotted, based on 2 independently weighed samples. GroEL-enfuvirtide samples were loaded alternately with BSA at the same concentration (w/v). The GroEL-enfuvirtide/BSA peak area ratio was taken as the best approximation of the GroEL-enfuvirtide/BSA primary component content ratio (n=8). Arithmetic means of protein purity were used to calculate the total GroEL-enfuvirtide mass in the sample. The calibration was normalized to the protein molecular weight ladder and used for approximate concentration estimation on other gels.
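The densitometric calibration described above amounts to fitting a straight line to BSA band areas and inverting it for the sample band. The sketch below illustrates that idea only; the band areas are hypothetical placeholder numbers, the function names are invented for the example, and the 95% confidence band used in the study is omitted for brevity.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    x_mean, y_mean = fmean(x), fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def mass_from_area(area, slope, intercept):
    """Invert the calibration line to estimate protein mass in a band (µg)."""
    return (area - intercept) / slope


# HYPOTHETICAL data: BSA load (µg) and integrated band area (arbitrary units).
bsa_mass_ug = [0.5, 1.0, 1.5, 2.0]
bsa_band_area = [1050.0, 2010.0, 2980.0, 4020.0]

slope, intercept = fit_line(bsa_mass_ug, bsa_band_area)

# HYPOTHETICAL area of the main GroEL-enfuvirtide band on the same gel.
fusion_mass_ug = mass_from_area(1800.0, slope, intercept)
print(f"estimated fusion protein in band: {fusion_mass_ug:.2f} µg")
```

Dividing the estimated mass of the main band by the total mass loaded in that lane gives the purity estimate that is then averaged over replicate lanes.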

Lyophilization and cyanogen bromide treatment
Pooled fractions in ammonium acetate buffer were frozen on the walls of a glass vacuum flask in liquid nitrogen and lyophilized in an Alpha 3-4 LSCbasic freeze dryer (Martin Christ, Germany) at 0.05 mbar pressure and -110 °C condenser temperature.
Stored protein was warmed to room temperature, and 2-3 mg was weighed with 0.01 mg precision. The protein sample was dissolved at 5 mg/mL in 70% (v/v) formic acid. 100 µL of a 5 M cyanogen bromide stock solution in acetonitrile was added to 1 mL of protein solution. An HPLC column thermostat CT 2.1 (Knauer, Germany) was used for temperature control.
The reaction was quenched after one hour by 10-fold dilution with water, followed by immediate flash-freezing in liquid nitrogen and lyophilization at 0.05 mbar pressure and -110 °C condenser temperature.

RP-HPLC purification and hydrolysis and HPLC step yield assessment
Lyophilized reaction products were dissolved in 60% mobile phase at a concentration of 10 mg of dry mass per mL. The mixture was centrifuged at 4 °C and 16,000 g for 15 min; the supernatant was collected and diluted with mobile phase A to 30% mobile phase concentration.

Circular dichroism
Collected fractions were dried overnight on a rotary vacuum concentrator RVC 2-25 CDplus (Martin Christ, Germany) at 30 °C and reconstituted at 2.5 mg/mL in 10 mM sodium phosphate pH 9.3 at 4 °C. The buffer concentration and pH of the solution were adjusted to 30 mM and 6.8 by addition of 1 M sodium phosphate pH 6.8. The final solution was stored on ice, and spectra were collected on a Chirascan circular dichroism spectrometer (Applied Photophysics, UK) using a 0.1 mm pathlength quartz cuvette. Three repeated scans between 280 and 180 nm were averaged. The accurate peptide concentration was determined as the mean of the absorbances at 215, 210 and 205 nm, each divided by its extinction coefficient (15, 21 and 32, respectively) and by the 0.01 cm pathlength. Single-spectrum analysis and fold recognition were performed using the DichroWeb tool [19]. The CDSSTR method [20] and dataset 4 [21] were used for calculation of secondary structure fractions.
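The concentration estimate from the three far-UV absorbances is a Beer-Lambert calculation averaged over wavelengths. The sketch below shows the arithmetic under one assumption that the text does not state explicitly: the extinction coefficients 15, 21 and 32 are taken to be in mL mg⁻¹ cm⁻¹, so the result is in mg/mL. The absorbance readings are hypothetical.

```python
# Extinction coefficients at 215, 210 and 205 nm, as given in the text.
# ASSUMPTION: units are mL / (mg * cm); the source does not state them.
EXTINCTION_BY_WAVELENGTH_NM = {215: 15.0, 210: 21.0, 205: 32.0}
PATHLENGTH_CM = 0.01  # 0.1 mm quartz cuvette


def peptide_concentration_mg_per_ml(absorbance_by_wavelength_nm,
                                    pathlength_cm=PATHLENGTH_CM):
    """Mean of the per-wavelength Beer-Lambert estimates c = A / (eps * l)."""
    estimates = [
        absorbance_by_wavelength_nm[wavelength] / (epsilon * pathlength_cm)
        for wavelength, epsilon in EXTINCTION_BY_WAVELENGTH_NM.items()
    ]
    return sum(estimates) / len(estimates)


# HYPOTHETICAL absorbance readings for illustration only.
readings = {215: 0.375, 210: 0.525, 205: 0.800}
print(f"{peptide_concentration_mg_per_ml(readings):.2f} mg/mL")
```

Averaging over three wavelengths damps the effect of noise or a baseline offset at any single wavelength.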

Statistical methods
The linear fit with 95% confidence interval calculation was performed using OriginPro 2017 (OriginLab Corporation, USA).
The SciPy Python package was used to calculate means, standard errors and confidence intervals; confidence intervals were based on Student's t distribution (scipy.stats.t.interval).
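A confidence interval of this kind can be computed in a few lines. The sketch below is an illustration, not the authors' script: the helper name and the replicate values are hypothetical, and only `scipy.stats.sem` and `scipy.stats.t.interval` are taken from SciPy.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, standard error, (low, high)) using Student's t distribution."""
    data = np.asarray(values, dtype=float)
    mean = data.mean()
    sem = stats.sem(data)  # sample standard error, ddof = 1
    low, high = stats.t.interval(confidence, df=data.size - 1, loc=mean, scale=sem)
    return mean, sem, (low, high)


# HYPOTHETICAL replicate measurements (e.g. purity fractions from n = 8 lanes).
replicates = [0.86, 0.88, 0.90, 0.85, 0.89, 0.87, 0.88, 0.86]
mean, sem, (low, high) = mean_confidence_interval(replicates)
print(f"mean = {mean:.3f}, SEM = {sem:.4f}, 95% CI = {low:.3f}-{high:.3f}")
```

The confidence level is passed positionally because the name of that first parameter differs between SciPy versions.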
Target GroEL-enfuvirtide protein was predominantly soluble after cell lysis and centrifugation. Clarified lysate was heated at 65 °C and most denatured host proteins were sedimented by centrifugation. Target protein loss was negligible (Fig. 1, lanes 3-5).
GroEL-enfuvirtide was readily adsorbed by DEAE resin and eluted at high concentration by sodium chloride gradient (Fig. 1, lanes 6 and f1-f5 and Appendix A, Fig. A1).
Alternative chromatography modes were assessed. SAX chromatography was tested under the same conditions. GigaCap Q (Tosoh, Japan) performed similarly with less backpressure but additionally adsorbed GroES (Appendix A, Figs. A2 and A3). Ceramic hydroxyapatite type I showed no separation from GroES, and Toyopearl Butyl had excessive retention with substantial sample loss even after addition of 20% isopropanol (Appendix A, Figs. A3-A6). Fractions from multiple runs were pooled after SDS-PAGE analysis based on their GroEL-enfuvirtide content. Pooled fractions were exchanged into volatile buffer and lyophilized. The lab-scale preparative batch produced 295 mg of 85.4-90% pure target protein from 2.5 liters of culture medium. Densitometry data indicate about 272 (CI 254-290) mg of target protein.

Table A1. MS2 fragmentation peaks of enfuvirtide.
Different cyanogen bromide hydrolysis protocols were tested (Appendix A, Fig. A9), varying the protein concentration, the application of ultrasound and shaking, and the presence or absence of acetonitrile in the reaction mixture. Unexpectedly, practically full hydrolysis occurred after 60 min of incubation.
The yield of cyanogen bromide hydrolysis and HPLC purification was 33.5-38.9%, so about 7.14-8.29 mg of 95% pure peptide (Fig. 2) can be produced from 2.5 L of culture.
A detailed table of MS2 fragmentation peaks is presented in the Appendix. The circular dichroism spectra of the reconstituted peptide were measured (Fig. 4) and secondary structure fractions were calculated (Table 2).

Discussion
Cloning of the synthetic gene showed that the previously developed ploop/ES plasmid, with a BamHI, HindIII and EcoRI polylinker introduced between the codons of amino acids 199 and 201 of modified thermophilic GroEL [16], is a convenient tool for recombinant peptide expression. The cultivation parameters for this fusion were generic, and there is considerable room for improving protein yield. According to some authors, high-density cultures can increase yield 9- to 85-fold [22].
It is known that one of the key parameters for successful and reproducible cell lysis by sonication is the energy applied per milliliter of cell suspension. Other crucial sonication parameters, such as cell quantity, buffer composition or temperature, are much more obvious. In our earlier practice, we recorded only the specific presets for one specific sonicator. For the sake of reproducibility, and taking about 500 J/mL as the starting point [23][24][25], a new protocol with a slightly lower applied energy, 300 J/mL, was tested successfully. Note that the energy efficiency of a sonicator varies with model, probe, liquid volume and viscosity. In this study, according to the device's self-measurement, it was about 20% of the declared power consumption.
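Specifying sonication by energy density makes a protocol transferable between instruments, because the required run time can be recomputed from the power actually delivered. The sketch below illustrates the arithmetic for the 300 mL suspension used here. The 500 W rated power and the reading of "10 s with 30 s intervals" as 10 s on and 30 s off are assumptions made for the example, and the function names are hypothetical.

```python
def total_energy_j(energy_density_j_per_ml, volume_ml):
    """Total energy to deliver to the suspension, in joules."""
    return energy_density_j_per_ml * volume_ml


def pulsed_run_time_s(energy_j, delivered_power_w, on_s, off_s):
    """Return (probe-on time, total elapsed time) in seconds for pulsed sonication."""
    on_time = energy_j / delivered_power_w
    duty_cycle = on_s / (on_s + off_s)
    return on_time, on_time / duty_cycle


ENERGY_DENSITY_J_PER_ML = 300.0  # from the protocol
SUSPENSION_VOLUME_ML = 300.0     # from the protocol
ASSUMED_RATED_POWER_W = 500.0    # ASSUMPTION: nominal rating of the sonicator
DELIVERED_FRACTION = 0.20        # about 20% of declared power, per self-measurement

energy = total_energy_j(ENERGY_DENSITY_J_PER_ML, SUSPENSION_VOLUME_ML)
on_time, elapsed = pulsed_run_time_s(
    energy,
    ASSUMED_RATED_POWER_W * DELIVERED_FRACTION,
    on_s=10.0,   # ASSUMPTION: 10 s pulse
    off_s=30.0,  # ASSUMPTION: 30 s pause
)
print(f"energy: {energy / 1000:.0f} kJ, probe-on: {on_time / 60:.0f} min, "
      f"elapsed: {elapsed / 60:.0f} min")
```

On a different sonicator only the delivered power changes, so the same energy target yields a new run time without re-optimizing the lysis.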
Purification of the target protein by heating the lysate is one of the key advantages of our fusion, which retained its thermostability after the introduction of enfuvirtide into the thermophilic GroEL sequence. Note that no protease inhibitors other than EDTA were added. Other protease inhibitors add cost to the process and can modify the target protein during heating [26]. Heating the lysate rapidly, almost immediately after lysis, passes quickly through the temperature range optimal for proteases and begins to denature them. The selectivity of heat purification is, in our opinion, comparable to that of Ni-affinity chromatography under denaturing conditions, and its cost efficiency is beyond comparison. Our peptide biosynthesis approach seems suitable for peptides with folding issues: it keeps the peptide in solution on the substrate-binding surface of the GroEL chaperone. It is therefore possible to avoid denaturants altogether by replacing BrCN cleavage with a cleavage technique that works under native conditions. However, in biophysical terms, heating to 65 °C exceeds the melting point of the fusion protein [27], so the reversibility of melting should be tested individually for each incorporated peptide.
In the previous study we used reverse-phase chromatography on C4 resin for fusion protein polishing. Reverse phase under those conditions has undeniable advantages: it is universal, has high resolution and yields protein ready for lyophilization. It also has many drawbacks: protein recovery issues are usually discussed by manufacturers of protein RP phases and rarely by scientists [28]; RP-ready equipment is expensive; it produces organic waste; and it has a moderate binding capacity. Other chromatography modes were tested: strong and weak anion exchange, hydrophobic interaction and hydroxyapatite. The last two performed worst: both had moderate binding capacities, hydroxyapatite gave no GroEL/GroES separation, and hydrophobic interaction resins had recovery issues. DEAE and Q resins have comparable dynamic binding capacities (40 and 47 mg BSA/mL resin, respectively, according to the manufacturer), but DEAE did not bind GroES, which was considered an advantage. Actual binding capacities for the fusion protein were not tested. The WAX purification protocol was also used for the polyphemusin I fusion and for the empty fusion partner, modified GroEL. Peak retention varies by about 1 CV on the gradient, but the method proved to be universal [27].
Chromatographic desalting was chosen for the target protein. It is a fast and easy procedure: the resin is cheap, and a homemade column can be packed and run under gravity flow. Alternatively, dialysis, which is routine in most biochemical labs, can be used for buffer exchange.
The dry weight of purified protein and densitometry data are consistent, which shows low content of non-protein contaminants and buffer residues.
There is a large variety of cyanogen bromide treatment protocols [29][30][31][32]. BrCN hydrolysis needs denaturing conditions to access methionine residues in the protein structure. Every denaturant has its own drawbacks. Formic acid both denatures and provides the low pH needed for the reaction, and a formic acid-based reaction mixture is fully volatile, but it modifies the protein with formyl residues. Urea and guanidine are nonvolatile and require the addition of hydrochloric acid; urea tends to modify the protein by carbonylation, and guanidine restricts the following purification step to RP-HPLC. Formic acid and a minimal reaction time were therefore chosen as the reaction conditions. A 5 M cyanogen bromide stock solution in dry acetonitrile helps to minimize exposure to BrCN fumes; in addition, this stock is stable, and acetonitrile accelerates cleavage [29].
In our opinion, the full and fast cyanogen bromide cleavage is a feature of soluble non-hydrophobic fusion partner [33].
A certain level of peptide modification under the chosen conditions is inevitable; however, it is under 5% and can be eliminated during subsequent purification [34].
Large pore analytical-grade C18 sorbent was used for peptide purification. The applicability of flash-RP, SPE-RP, or any other sorbent should be considered for each peptide and each purpose individually. RP HPLC usually uses formic acid (FA) or trifluoroacetic acid (TFA) as mobile phase additives. TFA usually provides better peak shape, but it is potentially toxic [35] and has long environmental lifetime [36]. Formic acid was chosen with full MS compatibility as a bonus.
Total yield for hydrolysis and HPLC step was 36% of theoretical, considering 7.2% share of the free peptide in protein mass and ignoring lyophilized protein purity. Moderate yield can be explained by partial hydrolysis (Fig. 1, lane 7, thin lines above main hydrolysis products) and peptide recovery issues from column and during solubilization.
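The yield figures can be cross-checked with a short calculation that chains the lyophilized protein mass, the peptide share of the fusion and the step yield. This is a minimal sketch using only numbers quoted in the text (295 mg of fusion protein, a 7.2% peptide share and a 33.5-38.9% step yield); as in the text, the purity of the lyophilized protein is ignored, and the function names are hypothetical.

```python
FUSION_PROTEIN_MG = 295.0           # lyophilized fusion protein from 2.5 L of culture
PEPTIDE_MASS_SHARE = 0.072          # share of free peptide in the fusion protein mass
STEP_YIELD_RANGE = (0.335, 0.389)   # hydrolysis + HPLC yield, fraction of theoretical


def theoretical_peptide_mg(fusion_mg, peptide_share):
    """Peptide mass released if cleavage and recovery were quantitative."""
    return fusion_mg * peptide_share


def recovered_peptide_mg(fusion_mg, peptide_share, step_yield):
    """Peptide mass expected after hydrolysis and HPLC at a given step yield."""
    return theoretical_peptide_mg(fusion_mg, peptide_share) * step_yield


theoretical = theoretical_peptide_mg(FUSION_PROTEIN_MG, PEPTIDE_MASS_SHARE)
low, high = (
    recovered_peptide_mg(FUSION_PROTEIN_MG, PEPTIDE_MASS_SHARE, y)
    for y in STEP_YIELD_RANGE
)
print(f"theoretical: {theoretical:.1f} mg, recovered: {low:.1f}-{high:.1f} mg")
```

The result, roughly 7.1 to 8.3 mg, agrees with the reported 7.14-8.29 mg within the rounding of the 7.2% peptide share.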
The cyanogen bromide treatment in 70% formic acid provides fully denaturing conditions, which may cause peptide folding issues. A soluble peptide can be expected to fold properly on its substrate [37], especially if it is a cysteine-free peptide. Enfuvirtide is an exceptionally well-studied peptide. The proper fold of enfuvirtide on its substrate is an alpha helix [38], but circular dichroism data indicate only 18-20% helical structure for enfuvirtide solutions without substrate [39,40]. We conducted a circular dichroism assessment of an enfuvirtide solution without substrate. Our data were collected in a much more concentrated solution, but the total of 18% helical structure is consistent with previously published data.
A low peptide/fusion protein ratio is a great metabolic burden for host cells. On the other hand, favorable and reproducible fusion protein properties combined with high expression rates, regardless of the target peptide, can offset that drawback. Trimming of the substrate-binding domain and peptide tandem repeats can be used to increase the peptide/fusion ratio and yield; however, each technique should be tested for each target peptide individually.
A well-characterized, cost-efficient and straightforward protocol was developed. It is flexible, and most of the methods are scalable up to industrial level. All the basic steps (target peptide sequence cloning, fermentation, host protein denaturation by lysate heating, cyanogen bromide cleavage of the fusion protein and reverse-phase peptide purification) are summarized in the scheme (Fig. 5). Intermediate purification of the fusion protein is optional.
Comparison of the biophysical properties of fusions of modified GroEL with different peptides [22] and a full assessment of the quality and biological activity of the synthesized enfuvirtide [29] are the priority directions for future study. In addition, a systematic practical comparison with popular fusion tags is required before our project can be called "the new fusion system" (Figs. A1-A9, Table A1).

Funding
The reported study was funded by RFBR, project number 18-29-08023.

Data availability statement
All stock solution protocols, any other mentioned protocols and data are available in detail on request from the corresponding author.

Author contributions
Vladimir Zenin: conceptualization, methodology, data curation, formal analysis, writing-original draft preparation. Maria Yurkova: conceptualization, methodology, writing-review and editing, supervision. Andrey Tsedilin: methodology, writing-original draft preparation, writing-review and editing. Alexey Fedorov: conceptualization, writing-review and editing, funding acquisition, supervision.

Declaration of Competing Interest
The authors of manuscript "Enfuvirtide biosynthesis in thermostable chaperone-based fusion" Vladimir Zenin, Maria Yurkova, Andrey Tsedilin and Alexey Fedorov declare no conflict of interest.