Incorporation of Aliphatic Proline Residues into Recombinantly Produced Insulin

Analogs of proline can be used to expand the chemical space about the residue while maintaining its uniquely restricted conformational space. Here, we demonstrate the incorporation of 4R-methylproline, 4S-methylproline, and 4-methyleneproline into recombinant insulin expressed in Escherichia coli. These modified proline residues, introduced at position B28, change the biophysical properties of insulin: Incorporation of 4-methyleneproline at B28 accelerates fibril formation, while 4-methylation speeds dissociation from the pharmaceutically formulated hexamer. This work expands the scope of proline analogs amenable to incorporation into recombinant proteins and demonstrates how noncanonical amino acid mutagenesis can be used to engineer the therapeutically relevant properties of protein drugs.

Table S3: Expression conditions and insulin yields.

Enzymes:
Restriction enzymes, kinases, and ligases were purchased from New England Biolabs. Trypsin was purchased from MilliporeSigma. Carboxypeptidase B was purchased from Worthington Biochemical. Glu-C peptidase was purchased from Promega.

Strains and plasmids:
The proline-auxotrophic E. coli strain CAG18515 was obtained from the Coli Genetic Stock Center (CGSC) at Yale University. Strain DH10B was used for standard cloning operations; electrocompetent CAG18515 cells were transformed with purified plasmid products.
The plasmid pQE80_H27R-PI_proS contains an IPTG-inducible proinsulin gene and the E. coli prolyl-tRNA synthetase gene controlled by its endogenous promoter. Proinsulin is translationally fused to an N-terminal leader peptide (H27R) that increases expression yields,¹ and a 10x-His tag to facilitate proinsulin enrichment after refolding. The gene for H27R-PI was ordered as a g-Block gene fragment from Integrated DNA Technologies (IDT) after codon optimization of the N-terminal leader peptide. A restriction enzyme cloning approach (XhoI and BamHI restriction enzyme cut sites) was used to replace the hexahistidine-tagged proinsulin gene in the plasmid pQE80PI-proS, which was described previously.² Correct installation of the gene of interest was verified by Sanger sequencing.
A blunt-end ligation approach was used to install the C443G and M157Q mutations. The proS-containing plasmid was amplified with primers AL01004_fwd & AL01004_rev (C443G), or AL01005_fwd & AL01005_rev2 (M157Q). The linear PCR product was phosphorylated (T4 PNK) and circularized (T4 DNA ligase). Correct installation of the point mutation was verified by Sanger sequencing.
The culture was grown at 37°C until it reached OD ~0.8, after which it was subjected to a medium shift: cells were pelleted via centrifugation (5,000 g, 5 min, 4°C) and washed twice with 10 mL ice-cold 0.9% NaCl. Washed cells were resuspended in 80 mL of 1.25x M9 -Pro, a 1.25x concentrated form of M9 that omits proline. The culture was split into 4 mL aliquots and incubated for 30 min at 37°C to deplete residual proline. A 1 mL solution containing 2.5 mM ncPro and 1.5 M NaCl was added (0.5 mM ncPro and 0.3 M NaCl working concentrations). After 30 min of incubation at 37°C to allow for ncPro uptake, proinsulin expression was induced by the addition of 1 mM IPTG. Cultures were incubated for 2.5 h at 37°C, after which cells were harvested via centrifugation and stored at -80°C until further processing.
Cell pellets were thawed and lysed with B-PER Complete (Thermo Fisher Scientific) for 1 h at room temperature with shaking, then centrifuged (20,000 g, 10 min) and the supernatant discarded. The pellet (containing insoluble proinsulin) was washed once with Triton wash buffer (2 M urea, 20 mM Tris, 1% Triton X-100, pH 8.0), and twice with ddH2O. The pellet was resuspended in solubilization buffer (8 M urea, 300 mM NaCl, 50 mM NaH2PO4, pH 8.0), and proinsulin was allowed to dissolve for 1 h at room temperature with shaking. Samples were centrifuged, and the supernatant was removed for analysis by SDS-PAGE and MALDI-TOF (described in the section entitled "MALDI-TOF MS" below).
When growth reached mid-exponential phase (OD600 ~0.8), the culture was subjected to a medium shift: cells were pelleted via centrifugation (5,000 g, 5 min, 4°C) and washed twice with 100 mL ice-cold 0.9% NaCl. Washed cells were resuspended in 1 L of 1.25x AMM -Pro, a 1.25x concentrated form of AMM that omits proline. Cells were incubated for 30 min at 37°C to deplete residual proline, after which 250 mL of a solution containing 2.5-5.0 mM ncPro (see Table S3) and 2.5 M NaCl was added (0.5-1.0 mM ncPro and 0.5 M NaCl working concentrations). After 30 min of incubation at 37°C to allow for ncPro uptake, proinsulin expression was induced by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG, 1 mM). Cultures were incubated overnight at 37°C, after which cells were harvested via centrifugation and stored at -80°C until further processing.
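The working concentrations quoted for the medium shift follow from simple dilution arithmetic (C1·V1 = C2·V2). The short Python sketch below reproduces them for both the small-scale (4 mL + 1 mL) and large-scale (1 L + 250 mL) protocols. The function name `working_concentration` is a hypothetical helper introduced for illustration, and the calculation assumes volumes are additive.

```python
def working_concentration(stock_conc, stock_volume, other_volume):
    """Concentration after mixing stock_volume of stock with other_volume of diluent.

    Assumes additive volumes; units of the result match stock_conc.
    """
    return stock_conc * stock_volume / (stock_volume + other_volume)


# Small scale: 1 mL of supplement added to a 4 mL culture aliquot.
small_ncpro_mM = working_concentration(2.5, 1.0, 4.0)
small_nacl_M = working_concentration(1.5, 1.0, 4.0)

# Large scale: 250 mL of supplement added to 1 L of culture.
large_ncpro_mM = [working_concentration(c, 0.25, 1.0) for c in (2.5, 5.0)]
large_nacl_M = working_concentration(2.5, 0.25, 1.0)

# The 1.25x medium is brought to 1x by the same addition (4 parts medium, 1 part supplement).
medium_strength = working_concentration(1.25, 4.0, 1.0)

print(small_ncpro_mM, small_nacl_M, large_ncpro_mM, large_nacl_M, medium_strength)
```

The same calculation shows why the medium is prepared at 1.25x: adding one part supplement to four parts medium returns it to 1x strength.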
Proline-containing proinsulin was expressed using strain CAG18515 harboring plasmid pQE80-H27R-PI_proS in 7.5 L (as 6 x 1.25 L cultures) of Terrific Broth (TB). IPTG (1 mM) was added at mid-log phase (OD600 ~0.8) to induce proinsulin expression. Cultures were incubated at 37°C for 3 h, after which cells were harvested via centrifugation and stored at -80°C until further processing.

Proinsulin refolding:
Cell pellets were warmed from -80°C to room temperature and resuspended in 5 mL IB buffer (50 mM Tris, 100 mM NaCl, 1 mM EDTA, pH 8.0) per gram cell pellet. Lysozyme (1 mg L⁻¹) and phenylmethylsulfonyl fluoride (PMSF, 1 mM) were added, and the slurry was placed on ice for 30 min before cells were lysed via sonication. The lysate was centrifuged (14,000 g, 30 min, 4°C) and the soluble fraction was discarded. The pellet was washed twice with IB buffer + 1% Triton X-100, once with IB buffer, and once with water; this final step required centrifugation for 45 min. The washed inclusion body pellet was resuspended in a minimal amount of water, and the mass of proinsulin in the inclusion body pellet was estimated by SDS-PAGE.
In preparation for proinsulin refolding, the inclusion body was resuspended in 3 M urea and 10 mM cysteine in water, such that the proinsulin concentration was 1 mg proinsulin per L total slurry. To dissolve proinsulin, the pH was adjusted to 12 and the sample was stirred for 1 h at room temperature. At this stage, ncPro incorporation was assessed by MALDI-TOF, as described in the section entitled "MALDI-TOF MS" below. The solubilized proinsulin solution was diluted ten-fold into refolding buffer (10 mM CAPS, pH 10.6) that had been pre-cooled to 4°C. The pH of the refolding solution was adjusted to 10.7 and the sample stored at 4°C; care was taken to ensure that the solution pH remained between 10.6 and 10.8 throughout the refolding process. Proinsulin refolding progress was monitored by reverse-phase HPLC, and usually reached completion within 50 h.
Proinsulin was enriched from the refolding solution after adjusting the pH to 8.0 and incubating the sample overnight with Ni-NTA resin and 10 mM imidazole. The resin was washed with wash buffer (25 mM imidazole in PBS, pH 8.0), and proinsulin was eluted with elution buffer (250 mM imidazole in PBS, pH 8.0). Fractions containing proinsulin were combined and extensively dialyzed against 10 mM sodium phosphate, pH 8.0.

Insulin maturation and purification:
Refolded proinsulin was warmed to 37°C and digested with trypsin (20 U mL⁻¹) and carboxypeptidase B (10 U mL⁻¹) at 37°C for 90 min to remove the N-terminal tag and C-chain. Digestion was halted by adjusting the pH to ~3 with 6 N HCl.
Insulins were purified immediately after proteolysis by reverse-phase HPLC on a C4 column (Phenomenex Jupiter, 5 µm particle size, 300 Å pore size, 250 x 10 mm) using 0.1% TFA in water (solvent A) and 0.1% TFA in acetonitrile (solvent B) as mobile phases. A gradient of 25-32% solvent B was applied over 65 min, and fractions containing insulin were collected. Samples for purity analysis were removed at this stage; the remaining portion of the fraction was lyophilized. Each insulin fraction was analyzed by analytical reverse-phase HPLC, MALDI-TOF MS (Figure 2e-h), and SDS-PAGE (Figure S2) to verify sample quality and ensure ≥95% purity for all downstream analyses. Lyophilized insulin powders were stored at -20°C until further use.

Circular dichroism spectroscopy:
Equilibrium measurements: The circular dichroism spectra of insulin samples (60 µM in 100 mM sodium phosphate, pH 8.0) were measured at 25°C in 1 mm quartz cuvettes on an Aviv Model 430 Circular Dichroism Spectrophotometer using a step size of 0.5 nm and averaging time of 1 s. A reference buffer spectrum was subtracted from each sample spectrum.
Ellipticity was monitored at 222 nm over 120 s (1 s kinetic interval, 0.5 s time constant, 10 nm bandwidth) at 25°C. A typical run led to a rapid drop in CD signal as mixing occurred (~5 s), then a gradual rise to an equilibrium ellipticity representative of an insulin monomer. Data preceding the timepoint with the greatest negative ellipticity (representing the mixing time) were omitted from further analysis. Runs were discarded if the maximum change in mean residue ellipticity from equilibrium did not exceed 750 deg cm² dmol⁻¹, which indicated poor mixing. The remaining data were fit to a mono-exponential function using SciPy (Python). The data presented here are from at least two separate insulin HPLC fractions, measured on two different days.
For quality control, an equilibrium spectrum for each protein was obtained after dilution as described above; all spectra approached that of the insulin monomer⁴ (Figure S4). The CD spectrum of human insulin under pre-dilution formulation conditions was obtained using a 0.1 mm quartz cuvette. In each case, a blank spectrum containing all buffers and ligands was subtracted from the sample spectrum.
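The kinetic analysis described above (trimming the mixing period, discarding poorly mixed runs, and fitting a mono-exponential with SciPy) can be sketched as follows. This is a minimal illustration on synthetic data, not the original analysis script: the function names, the use of the last ten points to estimate the equilibrium value, and the initial rate guess are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

MIN_SIGNAL_CHANGE = 750.0  # deg cm^2 dmol^-1, threshold quoted in the text


def mono_exponential(t, y_eq, amplitude, rate):
    """y(t) = y_eq + amplitude * exp(-rate * t)."""
    return y_eq + amplitude * np.exp(-rate * t)


def fit_dissociation(time_s, mre):
    """Trim the mixing period, apply the quality filter, and fit the remainder.

    Returns (y_eq, amplitude, rate), or None if the run fails the filter.
    """
    time_s = np.asarray(time_s, dtype=float)
    mre = np.asarray(mre, dtype=float)

    # Omit data preceding the point of greatest negative ellipticity.
    start = int(np.argmin(mre))
    t = time_s[start:] - time_s[start]
    y = mre[start:]

    # Assumption: the last 10 points approximate the equilibrium ellipticity.
    y_eq_guess = y[-10:].mean()
    if np.max(np.abs(y - y_eq_guess)) <= MIN_SIGNAL_CHANGE:
        return None  # treated as a poorly mixed run

    p0 = (y_eq_guess, y[0] - y_eq_guess, 0.05)  # assumed starting guess
    params, _ = curve_fit(mono_exponential, t, y, p0=p0)
    return tuple(params)


# Synthetic trace: 5 s mixing drop, then a rise toward equilibrium (k = 0.05 s^-1).
rng = np.random.default_rng(0)
time_s = np.arange(0.0, 121.0, 1.0)
trace = np.where(
    time_s < 5.0,
    -6000.0 - 600.0 * time_s,
    -7000.0 - 2000.0 * np.exp(-0.05 * (time_s - 5.0)),
)
trace = trace + rng.normal(0.0, 20.0, size=time_s.size)

fit = fit_dissociation(time_s, trace)
print(fit)

# A run with too small a signal change is rejected by the filter.
weak_trace = -7000.0 - 300.0 * np.exp(-0.05 * time_s)
rejected = fit_dissociation(time_s, weak_trace)
```

The fitted `rate` is the apparent first-order rate constant in s⁻¹; its reciprocal gives the time constant of the signal change.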

Fibrillation:
Insulin samples (60 µM in 100 mM sodium phosphate, pH 8.0) were centrifuged at 22,000 g for 1 h at 4°C, prior to the addition of 1 µM thioflavin T (ThT). Each insulin (200 µL) was added to a 96-well, black, clear-bottom plate (Greiner Bio-One) and sealed. Samples were shaken continuously at 960 rpm on a Varioskan multimode plate reader at 37°C, and fluorescence readings were recorded every 15 min (444 nm excitation, 485 nm emission). Fibrillation runs were performed on at least two separate HPLC fractions, each in triplicate or quadruplicate, on at least two different days. The growth phase of each fibrillation replicate was fit to a linear function, and fibrillation lag times were reported as the x-intercept of this fit. Fibril samples were stored at 4°C until analysis by TEM and mass spectrometry.
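The lag-time calculation (linear fit to the growth phase, then the x-intercept) can be illustrated with the sketch below. The text does not state how the growth phase was delimited, so the 20-80% window of the fluorescence range used here is an assumption, as are the function name and the synthetic sigmoidal trace.

```python
import numpy as np


def fibrillation_lag_time(time_h, fluorescence, low=0.2, high=0.8):
    """Return the x-intercept of a straight line fit to the growth phase.

    Assumption: the growth phase is taken as the points whose signal lies
    between `low` and `high` fractions of the total fluorescence range.
    """
    time_h = np.asarray(time_h, dtype=float)
    fluorescence = np.asarray(fluorescence, dtype=float)

    f_min, f_max = fluorescence.min(), fluorescence.max()
    fraction = (fluorescence - f_min) / (f_max - f_min)
    growth = (fraction >= low) & (fraction <= high)

    slope, intercept = np.polyfit(time_h[growth], fluorescence[growth], 1)
    return -intercept / slope  # time at which the fitted line crosses zero


# Synthetic ThT trace sampled every 15 min, with a midpoint at 10 h.
time_h = np.arange(0.0, 20.25, 0.25)
trace = 100.0 / (1.0 + np.exp(-(time_h - 10.0) / 0.5))

lag_h = fibrillation_lag_time(time_h, trace)
print(round(lag_h, 2))
```

For real traces with a nonzero baseline, subtracting the pre-transition baseline before fitting would make the intercept refer to the baseline rather than to zero fluorescence; whether that was done in the original analysis is not stated.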

Transmission electron microscopy:
Insulin fibrils were centrifuged (5,000 g, 1 min), then washed twice and resuspended in ddH2O. Fibrils were stained with 2% uranyl acetate on a 300-mesh formvar/carbon-coated copper grid (Electron Microscopy Sciences) and imaged on a Tecnai T12 LaB6 120 kV transmission electron microscope.

ANS fluorescence:
Insulins (1 µM) were mixed with 5 µM ANS in 100 mM phosphate buffer, pH 8.0. Fluorescence emission spectra were measured in 1 cm quartz cuvettes at ambient temperature using a PTI QuantaMaster fluorescence spectrofluorometer. A 350 nm excitation wavelength and scan rate of 2 nm s⁻¹ were used. Measurements for each insulin were performed in triplicate from three separate HPLC fractions. Spectra were smoothed before plotting and determining the emission maxima.

Analytical ultracentrifugation:
Insulins were dialyzed against 28.6 mM Tris buffer, pH 8.0, and formulated at 300 µM insulin, 12.5 mM m-cresol, and 125 µM ZnCl2. Ligand-free insulins were formulated from the same dialysis sample. The insulin samples were then diluted 75-fold into 25 mM Tris buffer, pH 8.0 (4 µM insulin, 167 µM m-cresol, 1.7 µM ZnCl2), conditions identical to those after dilution in the CD dissociation kinetics experiments. Diluted insulins were incubated at room temperature for at least one hour after dilution prior to analysis.
Velocity sedimentation experiments were performed at the Canadian Center for Hydrodynamics at the University of Lethbridge. Insulin samples at 300 µM were measured by interference optics, due to the high absorbance from the protein and m-cresol; they were measured in 3 mm titanium centerpieces from Nanolytics, fitted in a standard Beckman Coulter cell housing using a 3 mm spacer above and below the centerpiece. Diluted samples (4 µM insulin) were measured using absorbance optics at 225 nm in standard Beckman Coulter 1.2 cm epon-charcoal centerpieces. All samples were measured at 50,000 RPM and 20°C in standard Beckman Coulter cell housings fitted with a 1.2 cm epon-charcoal centerpiece and sapphire windows. All data were analyzed with UltraScan III version 4.0 release 6606.⁵ Velocity data were initially fitted with the two-dimensional spectrum analysis⁶ to determine meniscus position and time- and radially-invariant noise. Subsequent noise-corrected data were analyzed by the enhanced van Holde-Weischet analysis⁷ to generate diffusion-corrected integral sedimentation coefficient distributions.

Models of ins-4R-Me and ins-4S-Me hexamers:
Crystal structures of the T6 (PDB: 1MSO) and R6 (1EV6) insulin hexamers were downloaded from the Protein Data Bank and visualized with PyMOL. The hydrogen atoms at the Cγ position of ProB28 were replaced with methyl groups; no additional energy minimization was used.

Mass spectrometry characterization of dissolved insulin fibrils:
Samples containing insulin fibrils were centrifuged (5,000 g, 1 min, 4°C), washed twice with ddH2O, and dissolved in dimethylsulfoxide (DMSO). Dissolved fibrils were reduced (5 µM DTT, 55°C, 20 min), then diluted ten-fold into MS loading buffer (2% ACN, 0.2% formic acid, FA, in water). A volume of 8 µL of this sample was injected onto a Thermo EASY-Spray column (ES902, C18, 2 µm, 100 Å, 75 µm x 25 cm) equipped with an Acclaim PepMap trapping column (C18, 3 µm, 100 Å, 75 µm x 2 cm), and analyzed using a Thermo Orbitrap Eclipse Tribrid Mass Spectrometer coupled with a Thermo Easy nLC-1200. The resulting raw files were deconvoluted using MASH Explorer.⁸

Calculations of proline and proline analog conformation:
The equilibrium geometry conformations of the N-methyl, O-methyl ester protected versions of proline, 4-methyleneproline, and 3,4-dehydroproline in water were calculated using Spartan Student (Wavefunction) at the B3LYP/6-31+G** level of theory. Pseudorotation parameters were calculated from the dihedral angles about the pyrrolidine ring, as previously reported.⁹

% Digested photo-proline peptide was analyzed by LC-ESI-MS, due to diazirine photolysis during MALDI-TOF analysis. We quantified the [M+3H]3+ ion for the proline and ncPro-containing peptides. We also note the presence (27%; m/z = 519.3) of an ion corresponding to replacement of proline by 3,4-dehydroproline (dhp). The incorporation efficiency for photo-pro reported here is with respect to the proline and dhp ions: photo-pro incorporation efficiency = .
Changes in CD signal after dilution are not due to protein denaturation. At 60 µM, insulin is expected to exist as a dimer at pH 8, as a monomer in 20% ethanol, and in denatured form in 8 M guanidinium chloride. These spectra are overlaid with equilibrium spectra collected before and after dilution for kinetic CD measurements. Spectra below 210-215 nm were omitted for some samples due to high levels of buffer absorbance at these wavelengths.

Figure S1. SDS-PAGE analysis of proinsulin expression in media supplemented with noncanonical proline analogs. Proinsulin (12.7 kDa) was expressed after a medium shift to ncPro-containing medium. The inclusion body fraction was isolated, solubilized, and analyzed by SDS-PAGE.

Figure S5. Models of ins-4R-Me and ins-4S-Me hexamers. 4S-Me and 4R-Me were modeled in the structures of the R6 (a) and T6 (b) insulin hexamers (PDB ID: 1EV3 & 1MSO, respectively). Atoms near each methyl substituent are indicated; distance measurements are in Å.

Figure S6. Deconvoluted mass spectra of insulin and ins-4ene fibrils. Insulin fibrils were dissolved with DMSO, reduced, and analyzed by mass spectrometry. Shown are the peaks corresponding to the insulin B-chain (expected molecular weights for insulin and ins-4ene: 3427.68 and 3439.68 Da). The peak present at 3464.7 Da in both samples corresponds to sample contamination by human keratin. Compared to insulin, additional peaks corresponding to chemical modification of the ins-4ene B-chain were not observed.

Figure S7. Conformations of proline, 4-methyleneproline, and 3,4-dehydroproline. The equilibrium geometry conformations of protected versions of proline (a), 4ene (b), and 3,4-dehydroproline (c) were calculated (B3LYP/6-31+G**). The pseudorotation parameters amplitude (A) and phase angle (P)¹⁰ are indicated for each structure. Amplitude corresponds to the degree of puckering for each proline, and phase angle represents puckering geometry. The endo (P~198°) and exo (P~18°) ring puckers of proline are nearly isoenergetic and rapidly interconvert.⁹ More notable in this case is the puckering amplitude, which decreases with the addition of sp²-hybridized carbon atoms.
The g-Block gene fragment was purchased from IDT. The coding sequence is in UPPERCASE; XhoI and BamHI cut sites are underlined.

Table S1. Incorporation of non-canonical proline residues into recombinant proinsulin.
# Mass shift compared to the proline-containing peptide present in the spectrum
* n.d., not detected
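As an illustration of where such mass shifts come from, the sketch below computes the monoisotopic mass difference between proline and several analogs from their elemental formulas. The formulas are those of the free amino acids (the shift is the same within a peptide, since each residue loses one water). The dictionary and function names are hypothetical, and the values are theoretical shifts, not the measured entries of Table S1.

```python
import re

# Monoisotopic masses (Da) of the elements needed here.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}

# Elemental formulas of the free amino acids.
FORMULAS = {
    "proline": "C5H9NO2",
    "4-methylproline": "C6H11NO2",      # same formula for the 4R and 4S isomers
    "4-methyleneproline": "C6H9NO2",
    "3,4-dehydroproline": "C5H7NO2",
}


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C5H9NO2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mass_shift(analog):
    """Mass of the analog minus the mass of proline, in Da."""
    return monoisotopic_mass(FORMULAS[analog]) - monoisotopic_mass(FORMULAS["proline"])


shifts = {name: round(mass_shift(name), 3) for name in FORMULAS if name != "proline"}
print(shifts)
```

The shift for 4-methyleneproline (one added carbon) matches the difference between the expected B-chain masses quoted in Figure S6 (3427.68 and 3439.68 Da).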

Table S3. Expression conditions and insulin yields.
Yields determined by measuring absorbance (280 nm) after proinsulin refolding and Ni-NTA enrichment.