Chemo- and Regioselective Lysine Modification on Native Proteins

Site-selective chemical conjugation of synthetic molecules to proteins expands their functional and therapeutic capacity. Current protein modification methods, based on synthetic and biochemical technologies, can achieve site selectivity, but these techniques often require extensive sequence engineering or are restricted to the N- or C-terminus. Here we show the computer-assisted design of sulfonyl acrylate reagents for the modification of a single lysine residue on native protein sequences. This feature of the designed sulfonyl acrylates, together with the innate and subtle reactivity differences conferred by the unique local microenvironment surrounding each lysine, contributes to the observed regioselectivity of the reaction. Moreover, this site selectivity was predicted computationally, where the lysine with the lowest pKa was the kinetically favored residue at slightly basic pH. Chemoselectivity was also observed, as the reagent reacted preferentially at lysine even when other nucleophilic residues such as cysteine were present. The reaction is fast and proceeds using a single molar equivalent of the sulfonyl acrylate reagent under biocompatible conditions (37 °C, pH 8.0). This technology was demonstrated by the quantitative and irreversible modification of five different proteins, including the clinically used therapeutic antibody Trastuzumab, without prior sequence engineering. Importantly, their native secondary structure and functionality are retained after the modification. This regioselective lysine modification method allows for further bioconjugation through aza-Michael addition to the acrylate electrophile that is generated by spontaneous elimination of methanesulfinic acid upon lysine labeling. We showed that protein and antibody conjugates bearing a site-specifically installed fluorophore at lysine could be used for selective imaging of apoptotic cells and detection of Her2+ cells, respectively. This simple, robust method does not require genetic engineering and may be generally used for accessing diverse, well-defined protein conjugates for basic biology and therapeutic studies.


Computational methods and data

Quantum Mechanical calculations
Full geometry optimizations and transition structure (TS) searches were carried out with the Gaussian 09 package 1 using the M06-2X hybrid functional 2 and the 6-31+G(d,p) basis set. Bulk solvent effects in water were considered implicitly through the integral equation formalism polarizable continuum model (IEF-PCM). 3 The possibility of different conformations was taken into account for all structures.
Frequency analyses were carried out at the same level used in the geometry optimizations. Thermal and entropic corrections to energy were calculated from vibrational frequencies. The nature of the stationary points was determined in each case according to the appropriate number of negative eigenvalues of the Hessian matrix. The quasiharmonic approximation reported by Truhlar et al. was used to replace the harmonic oscillator approximation for the calculation of the vibrational contribution to enthalpy and entropy. 4 Scaled frequencies were not considered. Mass-weighted intrinsic reaction coordinate (IRC) calculations were carried out using the Gonzalez and Schlegel scheme 5,6 in order to ensure that the TSs indeed connected the appropriate reactants and products. Gibbs free energies (ΔG) were used for the discussion of the relative stabilities of the considered structures.
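The quasiharmonic treatment of low-frequency modes mentioned above can be illustrated with a minimal Python sketch. It assumes the commonly used form of the approximation in which vibrational frequencies below a cutoff are raised to that cutoff before the standard harmonic-oscillator entropy formula is applied. The 100 cm⁻¹ cutoff, the 298.15 K temperature and the function name are assumptions made for illustration and are not stated in this document. Only the entropy term is shown.

```python
import math

# Physical constants (SI units, except the speed of light in cm/s so that
# wavenumbers in cm^-1 can be used directly).
PLANCK = 6.62607015e-34      # J s
LIGHT_SPEED = 2.99792458e10  # cm/s
BOLTZMANN = 1.380649e-23     # J/K
GAS_CONSTANT = 8.314462618   # J/(mol K)


def vibrational_entropy(freqs_cm1, temperature=298.15, cutoff_cm1=100.0):
    """Vibrational entropy in J/(mol K) from wavenumbers in cm^-1.

    Frequencies below cutoff_cm1 are raised to cutoff_cm1 (assumed
    quasiharmonic treatment). Non-positive values, which is how imaginary
    TS modes are usually reported, are skipped.
    """
    total = 0.0
    for nu in freqs_cm1:
        if nu <= 0.0:
            continue
        nu_eff = max(nu, cutoff_cm1)
        x = PLANCK * LIGHT_SPEED * nu_eff / (BOLTZMANN * temperature)
        # Harmonic-oscillator entropy of a single mode.
        total += GAS_CONSTANT * (x / math.expm1(x) - math.log1p(-math.exp(-x)))
    return total
```

With these assumptions a 30 cm⁻¹ mode contributes the same entropy as a 100 cm⁻¹ mode, which prevents very soft modes from dominating the entropic correction to ΔG.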

Supporting Figure 2. Four different conformers (top) and minimum energy pathway (bottom) calculated with PCM(H2O)/M06-2X/6-31+G(d,p) for the whole reaction pathway (aza-Michael addition followed by sulfone elimination) between sulfonyl acrylate 1c and methylamine (model for lysine sidechain). Rotation around the C-C ester bond (conformers I and II) has virtually no effect on the activation barriers (ΔG‡), while rotation around the C-S sulfone bond (conformers III and IV) slightly increases ΔG‡. Cartesian coordinates have been inverted with respect to those shown in Figure 2 for clarity.
Supporting Figure 4. Structural differences between the transition states calculated with PCM(H2O)/M06-2X/6-31+G(d,p) for methylamine (model for lysine sidechain) aza-Michael addition to sulfonyl acrylate 1c (top) and sulfonyl acrylamide 1d (bottom). Two different carbonyl rotamers, s-trans (left) and s-cis (right), are shown. The much bulkier N,N-dimethylamide group in 1d severely distorts the α,β-unsaturated carbonyl group, forcing it away from planarity (CCCO dihedral angles in degrees shown as magenta arrows), thus reducing the electrophilic character of the Michael acceptor, as reflected by the higher calculated activation barriers (ΔG‡). Additionally, the NMe2 group blocks the nucleophilic attack of methylamine through steric hindrance, further increasing ΔG‡ in certain conformations. Cartesian coordinates have been inverted with respect to those shown in Figure 2 for clarity.

Cartesian coordinates of the lowest energy structures calculated with PCM(H2O)/M06-2X/6-31+G(d,p).
All the calculated structures can be obtained from the authors upon request.

Constant pH Molecular Dynamics (CpHMD) simulations
The pKa of titratable residues was calculated using the method implemented by McCammon 7 in the Amber 16 package supplemented with AmberTools 17 8 for the following target proteins: hen egg white lysozyme (PDB 1G7H), Synaptotagmin I C2A domain, C2Am (PDB 3F04; the S95C mutation was modelled using PyMOL), Annexin V (PDB 1AVH) and Trastuzumab/Herceptin® (PDB 1N8Z). This method works only in Generalized Born implicit solvent (igb = 2). 9,10 For the generation of the topology and input coordinate files, the specifically developed leaprc.constph file containing all the necessary variables for a CpHMD simulation was used in combination with the leap utility. The underlying force field was ff10 (equivalent to ff99SB for proteins) and the atomic radii (PBRadii) were set to mbondi2. 11 A random seed (ig = -1) was used to initialize velocities to avoid synchronization artifacts. 12,13 Constant-pressure periodic boundary conditions were not used (ntp = 0), and the particle-mesh Ewald method 14 for modelling long-range electrostatic effects was turned off (ntb = 0), with no cut-off for Lennard-Jones and electrostatic interactions. The SHAKE algorithm was used (ntc = 2, ntf = 2) with a relative geometrical tolerance for coordinate resetting of 1E-6 Å (tol = 0.000001), such that the angle between the hydrogen atoms is kept fixed. The time step was kept at 2 fs (dt = 0.002) during the 2 ns heating stage. Each system was then equilibrated for 2 ns with a 2 fs time step at a constant temperature of 300 K, using Langevin dynamics under the same conditions described above. At this point, constant pH in implicit solvent was turned on (icnstph = 1), and a change of protonation state, starting from physiological pH (solvph = 7.5), was attempted every 2 or 5 steps (ntcnstph = 2 or 5) depending on the number of titratable residues of the target protein (Supporting Table 2). The original paper suggested that each residue should attempt to swap states every ~100 steps at least.
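As an illustration of how the settings listed above fit together, the following Python sketch assembles them into an Amber-style &cntrl namelist and converts simulation lengths into step counts. The values of igb, ig, ntp, ntb, ntc, ntf, tol, dt, icnstph, solvph and ntcnstph are taken from the text. The entries ntt = 3 (Langevin dynamics), temp0 = 300.0, cut = 999.0 (standing in for "no cut-off") and nstlim are assumptions about how the described conditions would be expressed, and the function names are hypothetical. Settings that the text does not state, such as the Langevin collision frequency and output frequencies, are deliberately left out, so the result is a fragment rather than a complete input file.

```python
def n_steps(duration_ns, dt_ps=0.002):
    """Number of MD steps needed to cover duration_ns with a dt_ps time step."""
    return round(duration_ns * 1000.0 / dt_ps)


def cphmd_cntrl(duration_ns, solvph=7.5, ntcnstph=5):
    """Return an illustrative &cntrl namelist fragment for a CpHMD run."""
    settings = {
        # Values stated in the text.
        "igb": 2,            # Generalized Born implicit solvent
        "ig": -1,            # random seed
        "ntp": 0,            # no constant-pressure scaling
        "ntb": 0,            # no periodic boundaries
        "ntc": 2,            # SHAKE on bonds involving hydrogen
        "ntf": 2,
        "tol": 0.000001,     # SHAKE tolerance
        "dt": 0.002,         # 2 fs time step (in ps)
        "icnstph": 1,        # constant pH in implicit solvent
        "solvph": solvph,
        "ntcnstph": ntcnstph,
        # Assumed encodings of conditions described only in words.
        "ntt": 3,            # Langevin thermostat (assumption)
        "temp0": 300.0,      # target temperature in K
        "cut": 999.0,        # effectively no cut-off (assumption)
        "nstlim": n_steps(duration_ns),
    }
    body = ",\n".join(f"  {key} = {value}" for key, value in settings.items())
    return f" &cntrl\n{body},\n /\n"


if __name__ == "__main__":
    # 2 ns of equilibration at a 2 fs time step.
    print(n_steps(2))
    # Fragment for a 40 ns production run at pH 8 with ntcnstph = 5.
    print(cphmd_cntrl(40, solvph=8.0, ntcnstph=5))
```

Under these assumptions, the 2 ns heating and equilibration stages each correspond to 1,000,000 steps, and the 40 ns and 100 ns production runs to 20,000,000 and 50,000,000 steps.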
Production trajectories were then run under the same simulation conditions for an additional 40 or 100 ns, depending on the target protein, to facilitate proper conformational sampling, in a range spanning pH 5 to pH 14 (Supporting Table 2). The program cphstats was used to analyse the results obtained from the CpHMD simulations. From these data, the pKa values were computed using the Hill equation and its rearranged form. The calculated pKa values and Hill coefficients for each residue were derived by fitting the protonated fraction (1 - fd) at each considered pH using a non-linear least-squares fit.

Figure 6. 1H NMR of 1c.
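The pKa fitting step can be sketched in Python as follows. The sketch assumes the standard form of the Hill equation for the deprotonated fraction, fd = 1 / (1 + 10^(n(pKa - pH))), whose rearranged form log10(fd / (1 - fd)) = n(pH - pKa) is linear in pH. This form is an assumption, since the equations themselves are not reproduced in this text. For simplicity the example performs an ordinary linear least-squares fit of the rearranged form, not the non-linear fit of the protonated fraction described above, and all function names and numbers are hypothetical.

```python
import math


def hill_deprotonated_fraction(ph, pka, n=1.0):
    """Deprotonated fraction fd predicted by the (assumed) Hill equation."""
    return 1.0 / (1.0 + 10.0 ** (n * (pka - ph)))


def fit_hill_linearized(ph_values, fd_values, eps=1e-6):
    """Estimate (pKa, Hill coefficient n) from deprotonated fractions.

    Fits log10(fd / (1 - fd)) = n * pH - n * pKa by ordinary least squares.
    Points with fd within eps of 0 or 1 are dropped because the logarithm
    is undefined or numerically unstable there.
    """
    xs, ys = [], []
    for ph, fd in zip(ph_values, fd_values):
        if eps < fd < 1.0 - eps:
            xs.append(ph)
            ys.append(math.log10(fd / (1.0 - fd)))
    if len(xs) < 2:
        raise ValueError("need at least two usable titration points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope is the Hill coefficient
    intercept = mean_y - n * mean_x    # intercept equals -n * pKa
    return -intercept / n, n


if __name__ == "__main__":
    # Synthetic titration curve for a hypothetical residue.
    ph_grid = [5.0 + 0.5 * i for i in range(19)]  # pH 5 to 14
    fractions = [hill_deprotonated_fraction(ph, 9.5, 0.9) for ph in ph_grid]
    print(fit_hill_linearized(ph_grid, fractions))
```

The linearized fit weights points far from the pKa more heavily than a direct non-linear fit of the fractions would, so it is best read as a way to obtain starting values or a quick consistency check.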

Synthesis of Ac-GKAT-NH2
The peptide was synthesized by stepwise solid-phase peptide synthesis using the Fmoc strategy on Rink Amide MBHA resin (0.1 mmol). The Fmoc amino acids (10 molar equivalents) were automatically coupled on an Applied Biosystems 433A peptide synthesizer using HBTU. The acetylation step was carried out with Ac2O/pyridine. The peptide was then released from the resin, and all acid-sensitive side-chain protecting groups were simultaneously removed, using TFA 95%, TIS 2.5%, H2O 2.5%, followed by precipitation with diethyl ether. Finally, the peptide was purified by HPLC on a Waters Delta Prep

LC-MS method for analysis of protein conjugation
LC-MS was performed on a Xevo G2-S TOF mass spectrometer coupled to an according to the manufacturer's instructions. To obtain the ion series described, the major peak(s) of the chromatogram were selected for integration and further analysis.

Analysis of protein conjugation by LC-MS
A typical analysis of a conjugation reaction by LC-MS is described below. The total ion chromatogram, combined ion series and deconvoluted spectra are shown for the product of the reaction. Identical analyses were carried out for all the conjugation reactions performed in this work.
Supporting Figure 9. A typical analysis of a conjugation reaction by LC-MS is described for the reaction of rHSA protein with the acrylate derivative 1c. The total ion chromatogram, combined ion series and deconvoluted spectra are shown for the starting material and the product of the reaction of rHSA with 1 equiv. of 1c. Identical analyses were carried out for all the conjugation reactions performed in this work.
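The relationship between a combined ion series and the deconvoluted mass can be illustrated with a short Python sketch. It is a simplified adjacent-peak calculation for positive-mode electrospray ions of the form [M + zH]z+, not the deconvolution algorithm of the instrument software used in this work. The function names and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of the charge-carrying proton


def mz_for(neutral_mass, charge):
    """m/z expected for [M + zH]z+ given the neutral mass M and charge z."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_from_adjacent(mz_high, mz_low):
    """Charge of the peak at mz_high, assuming mz_low carries one more proton.

    From mz_high = M/z + p and mz_low = M/(z + 1) + p it follows that
    z = (mz_low - p) / (mz_high - mz_low).
    """
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def neutral_mass(mz, charge):
    """Neutral mass M recovered from one peak of the ion series."""
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    # Two neighbouring peaks simulated for a hypothetical 50 kDa protein.
    peak_a = mz_for(50000.0, 40)
    peak_b = mz_for(50000.0, 41)
    z = charge_from_adjacent(peak_a, peak_b)
    print(z, neutral_mass(peak_a, z))
```

Averaging the neutral mass over all assigned peaks of the series gives a more robust estimate than a single pair, and comparing the averaged masses before and after the reaction gives the mass shift introduced by the modification.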
The heated solution was loaded onto a NuPAGE Bis-Tris mini gel (10 × 10 cm) with a 4-12% polyacrylamide gradient, and the conjugation reaction was then analysed by electrophoresis (200 V). [...] with 0.5% of Ruby. The gel was mixed overnight at room temperature and read the following day. After washing the gel, Coomassie (0.5%) was added and the gel was read after 2 h of mixing at room temperature.

Recombinant human serum albumin (rHSA) was kindly provided by Albumedix Limited; C2Am was provided by Dr. André Neves and Prof. Kevin Brindle; 18 lysozyme was purchased from Sigma-Aldrich; and Annexin V was expressed and purified as previously described. 19 The Trastuzumab antibody used in this study was purchased from a commercial supplier (Carbosynth Limited).

Reaction of rHSA with 1a
The reaction was performed according to the general procedure. To an eppendorf with 9.

Supporting Table 3. Optimisation of reaction conditions between rHSA and 1c with respect to pH, buffer and time.

Reaction of rHSA with 1d
The

Values are the mean of duplicates. Differences in FcRn binding kinetics were observed between the albumins. For rHSA/rHSA-1c, FcRn binding is negatively impacted when 1c is installed at position 573: the kon for rHSA-1c is much slower, driving the kD value up to 29.1 µM, a 2-fold weaker affinity than that of rHSA. For rHSA-K573P/rHSA-K573P-1c, the kon when position 4 is modified is slightly slower (7.925) than that of non-modified rHSA-K573P (9.755), pushing the kD for human FcRn up slightly.

Reaction of rHSA with MS(PEG)4 Methyl-PEG-NHS-Ester
The

Reaction of Annexin V with 1c
The

Reaction of C2Am-1c with Ellman's reagent
A 40 µL aliquot of C2Am-1c (10 µM) was transferred to a 0.5 mL eppendorf tube. Using the same conditions but performing the reactions in reverse order (i.e., Ellman's mixed disulfide formation followed by reaction with 1c) gave identical results.

Supporting Figure 46. Combined ion series and deconvoluted mass spectrum of the reaction of C2Am-1c with Ellman's reagent (500 equiv.) after 4 h at 37 °C.
FITC-PEG3NH2 (100 equiv.) was added and the resulting mixture was vortexed for 10 seconds. After 1 h of additional mixing at room temperature, a 10 µL aliquot was analysed by polyacrylamide gel electrophoresis, and complete conversion to the expected product was observed.

Reaction of 1c with Trastuzumab
The crude reaction mixture was buffer exchanged with PBS three times to remove the excess NHS-(PEG)4-Biotin, giving a biotin-to-antibody ratio of around 1.6.

Bio-layer interferometry
Binding assays were performed on an Octet Red Instrument (fortéBIO). Ligand

Flow cytometry analysis of Trastuzumab-1c-FITC
The binding affinity of the antibody Trastuzumab-1c-FITC (compared to

Flow cytometry analysis of Trastuzumab-1c-crizotinib
The binding affinity of the antibody Trastuzumab-1c-crizotinib was determined