Halogenation of Peptides and Proteins Using Engineered Tryptophan Halogenase Enzymes

Halogenation of bioactive peptides via incorporation of non-natural amino acid derivatives during chemical synthesis is a common strategy to enhance functionality. Bacterial tryptophan halogenases efficiently catalyze regiospecific halogenation of the free amino acid tryptophan, both in vitro and in vivo. Expansion of their substrate scope to peptides and proteins would facilitate highly regulated post-synthesis/expression halogenation. Here, we demonstrate novel in vitro halogenation (chlorination and bromination) of peptides by select halogenase enzymes and identify the C-terminal (G/S)GW motif as a preferred substrate. In a first proof-of-principle experiment, we also demonstrate chemo-catalyzed derivatization of an enzymatically chlorinated peptide, albeit with low efficiency. We further rationally derive PyrH halogenase mutants showing improved halogenation of the (G/S)GW motif, both as a free peptide and when genetically fused to model proteins, with efficiencies up to 90%.

Halogenation reactions: For peptide halogenation, 2.5 mM substrate was treated with 10 µM enzyme and 50 mM NaCl or NaBr in the presence of the cofactors 10 µM FAD and 2 mM NADH, together with 30 µM flavin reductase RebF, 20 mM glucose and 5 units of glucose dehydrogenase. The reaction was carried out in 10 mM phosphate buffer (pH 7.2) overnight at room temperature with constant mixing. The enzymes were inactivated by heating at 95 °C for 10 min and removed by centrifugation at 13,500 rpm for 10 min. The supernatant was analyzed by LC-MS, and the halogenated and non-halogenated products were quantified from the HPLC peak area% of starting material and product (Figures S26-S33).
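The peak-area quantification described above can be sketched as follows. This is a minimal illustration, not the authors' processing script: the function name and the example peak areas are hypothetical, and the calculation assumes that substrate and halogenated product respond similarly at the detection wavelength.

```python
def percent_halogenation(area_substrate: float, area_product: float) -> float:
    """Return the product peak area as a percentage of substrate + product area.

    Assumes both species have comparable detector response, so that
    area% approximates molar conversion.
    """
    total = area_substrate + area_product
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * area_product / total


# Hypothetical integrated peak areas (arbitrary units), not data from the study.
conversion = percent_halogenation(area_substrate=1700.0, area_product=8300.0)
print(f"{conversion:.1f}% halogenated")
```

With these illustrative areas the product accounts for 83% of the combined peak area.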
For the kinetic study, various concentrations (0.5-20 mM) of GGW peptide were halogenated by the PyrH-WT and PyrH-Q160N enzymes following the above protocol. The reactions were carried out for 60 min, with samples collected at 0, 5, 15, 30 and 60 min. Each sample was stopped immediately by inactivating the enzymes at 95 °C for 10 min, and the precipitates were removed by centrifugation at 13,500 rpm for 10 min. The supernatant was analyzed by LC-MS. The amounts of halogenated product were quantified from the HPLC peaks using a standard curve made with various concentrations of chlorinated GGW peptide, and were used to calculate the reaction rate. The substrate concentrations and the initial reaction rates were used to construct a Michaelis-Menten plot (Figure S34), from which the kinetic parameters were calculated.
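The kinetic workflow above (initial rates from early time points, then KM and Vmax from rate versus substrate concentration) can be sketched as below. This is an illustration under stated assumptions, not the fitting procedure used in the study: it substitutes a Hanes-Woolf linearization (S/v = S/Vmax + KM/Vmax) for a direct fit of the Michaelis-Menten curve because it needs only the standard library, and non-linear regression is generally preferred for real data. All function names and the synthetic data are hypothetical; only the 10 µM enzyme concentration comes from the protocol.

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_rate(times_min, product_mM, n_points=3):
    """Slope of product vs. time over the first n_points samples (mM/min)."""
    slope, _ = least_squares_line(times_min[:n_points], product_mM[:n_points])
    return slope


def hanes_woolf_fit(substrate_mM, rates_mM_per_min):
    """Estimate (Km, Vmax) from the linear form S/v = S/Vmax + Km/Vmax."""
    ratios = [s / v for s, v in zip(substrate_mM, rates_mM_per_min)]
    slope, intercept = least_squares_line(substrate_mM, ratios)
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


def turnover_number(vmax_mM_per_min, enzyme_uM):
    """kcat in min^-1: Vmax converted to uM/min, divided by enzyme in uM."""
    return vmax_mM_per_min * 1000.0 / enzyme_uM


# Synthetic, noise-free rates generated from assumed parameters (not study data).
TRUE_KM, TRUE_VMAX = 2.0, 0.05
substrates = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]  # mM, spanning the range used above
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrates]

km, vmax = hanes_woolf_fit(substrates, rates)
kcat = turnover_number(vmax, enzyme_uM=10.0)
print(f"Km = {km:.2f} mM, Vmax = {vmax:.3f} mM/min, kcat = {kcat:.1f} per min")
```

Because the synthetic rates are noise-free, the fit recovers the assumed parameters; with experimental data the linearization weights points unevenly, which is one reason to prefer a non-linear fit.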
For protein chlorination, 86 µM protein was treated with 10 µM enzyme and 50 mM NaCl in the presence of the cofactors 10 µM FAD and 2 mM NADH, together with 30 µM flavin reductase RebF, 20 mM glucose and 5 units of glucose dehydrogenase (GDH). The reaction was carried out in 10 mM phosphate buffer (pH 7.2) for 3 h at room temperature with constant mixing. Precipitates, if any, were removed by centrifugation at 13,500 rpm for 10 min. To cleave the halogenated C-terminus, the supernatant was mixed with PreScission protease (10 units/mL final concentration) and incubated overnight at 4 °C with constant mixing. The reaction mixtures were heated to 95 °C for 10 min, and precipitates were removed by centrifugation at 13,500 rpm for 10 min. The supernatant was analyzed by LC-MS, and the halogenated and non-halogenated cleaved C-terminus was quantified from the respective peak area% (Figures S8-S25). To rule out any possible halogenation of the cleaved C-terminal peptide during protease treatment at 4 °C, a control experiment was performed using the thermostable Stoffel fragment with a C-terminal SGW tag (Table S1), in which the reaction mixture was heat treated (20 min at 65 °C) to inactivate the PyrH, RebF and GDH enzymes after 3 h of halogenation and prior to the PreScission protease treatment.
Detection of halogenation position: The halogenation position was determined in the chlorinated products of two peptides, G5W and G3SGW. The peptides were chlorinated at preparative scale by scaling up the above reaction, and the chlorinated products were purified by semi-preparative HPLC. The halogenation site was identified by 1H NMR.
Analytical methods: Chemicals and anhydrous solvents were obtained from Sigma Aldrich and were used without further purification. Spectroscopic grade solvents were purchased from Sigma Aldrich. NMR spectra were recorded on a Bruker Avance III 400 MHz spectrometer in MeOD-d4. Data are reported in the following order: chemical shifts are given (δ); multiplicities are indicated as s (singlet), d (doublet), t (triplet), q (quartet) and m (multiplet). High-resolution mass spectra (HRMS) were recorded on an Agilent ESI-TOF mass spectrometer at 3500 V emitter voltage. Exact m/z values are reported in Daltons.
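When assigning halogenated products by mass, it can help to compute the expected m/z values beforehand. The sketch below is illustrative only and is not taken from the study: it uses standard monoisotopic masses, treats halogenation as replacement of one hydrogen by the lightest halogen isotope (35Cl or 79Br), and covers only the four residues of the peptide panel. The function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in the peptide panel.
RESIDUE_MASS = {"G": 57.021464, "A": 71.037114, "S": 87.032028, "W": 186.079313}
WATER = 18.010565      # added once per linear peptide (free N- and C-termini)
PROTON = 1.007276      # charge carrier in positive-mode ESI
HYDROGEN = 1.007825    # atom displaced by the halogen
HALOGEN = {"Cl": 34.968853, "Br": 78.918338}  # lightest isotopes: 35Cl, 79Br


def peptide_mass(sequence: str) -> float:
    """Monoisotopic neutral mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


def mz(sequence: str, charge: int = 1) -> float:
    """m/z of the protonated, unmodified peptide."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


def halogenated_mz(sequence: str, halogen: str, charge: int = 1) -> float:
    """m/z after one H is replaced by the given halogen (mono-halogenation)."""
    neutral = peptide_mass(sequence) - HYDROGEN + HALOGEN[halogen]
    return (neutral + charge * PROTON) / charge


for peptide in ("GW", "GGW", "SGW"):
    print(
        f"{peptide}: [M+H]+ {mz(peptide):.4f}, "
        f"chlorinated {halogenated_mz(peptide, 'Cl'):.4f}, "
        f"brominated {halogenated_mz(peptide, 'Br'):.4f}"
    )
```

Mono-chlorination therefore shifts the monoisotopic peak by about +33.96 Da, and the accompanying 37Cl isotope peak roughly 2 Da higher gives a further check on the assignment.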
Analytical UPLC-MS was performed using an Agilent 1290 Infinity II LC system coupled to an Agilent 6120 Single Quadrupole MS. A 20 µL sample of crude mixture was injected onto an Agilent EclipsePlus C18 analytical column (1.8 µm packing, 2.1 mm × 50 mm). Gradient starting conditions of 10% (v/v) MeCN/H2O (plus 0.1% (v/v) HCOOH) were held for 1 min before development to 95% (v/v) MeCN/H2O over 3 min, prior to re-equilibration to starting conditions over 2 min. Flow rate and column temperature were kept constant at 0.4 mL min−1 and 25 °C, respectively. UV absorbance was detected at 254 nm and 280 nm throughout. HPLC retention times of peptides were assigned based on mass extraction.
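The analytical gradient described above can be written as a piecewise function of time, which is convenient when relating a retention time to the solvent composition at elution. This is a sketch under one explicit assumption: the text gives only the duration of re-equilibration, so the return to starting conditions is modelled here as a linear ramp. The function name is hypothetical.

```python
def percent_mecn(t_min: float) -> float:
    """Programmed % (v/v) MeCN at time t_min for the 6 min analytical method.

    0-1 min: hold at 10%; 1-4 min: linear ramp to 95%;
    4-6 min: return to 10% (assumed linear; the text does not specify the shape).
    """
    if t_min < 0 or t_min > 6:
        raise ValueError("time must lie within the 0-6 min method")
    if t_min <= 1:
        return 10.0
    if t_min <= 4:
        return 10.0 + (95.0 - 10.0) * (t_min - 1.0) / 3.0
    return 95.0 - (95.0 - 10.0) * (t_min - 4.0) / 2.0


for t in (0.0, 1.0, 2.5, 4.0, 6.0):
    print(f"t = {t:.1f} min: {percent_mecn(t):.1f}% MeCN")
```

Note that this is the programmed composition at the pump; the composition reaching the column lags by the system dwell volume, which is not given in the text.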
Purification of the peptides was performed on an Agilent 1260 Infinity II LC system. A 900 µL sample of crude mixture dissolved in H2O/MeCN was injected onto a Phenomenex Jupiter semi-preparative C12 HPLC column (4 µm packing, 250 × 10 mm).
then held for 3 min prior to re-equilibration to starting conditions over 3 min. Flow rate was kept constant at 5 mL min−1. UV absorbance was detected at 280 nm throughout.

Results
The TH enzymes PyrH, SttH, ThHal, PrnA and RebH were first assayed for chlorination of a panel of short (2-4 mer) peptides comprising only tryptophan (W) and glycine (G) residues. PyrH, SttH and ThHal chlorinated peptides with tryptophan at the C-terminus (Table 1). The enzymes PrnA and RebH, which halogenate tryptophan at the 7 position, did not produce any detectable chlorinated products. Interestingly, only mono-chlorinated products were observed, even for the peptides containing two tryptophan residues. Chlorination of the WW and GW dipeptides was observed, but not of WG, suggesting that only the C-terminal tryptophan residue was accessible to the substrate binding site.
Next, we expanded the panel of screened dipeptide substrates to include charged, nucleophilic and bulkier amino acid residues adjacent to the C-terminal tryptophan. PyrH exhibited very high activity on the SW, GW and AW dipeptides, showing 83%, 55% and 50% chlorination, respectively (Figure 1). SttH and ThHal also showed higher activity on these peptides than on the other dipeptides tested. Peptides with a bulkier or charged amino acid next to the C-terminal tryptophan showed reduced halogenation. All three enzymes were able to brominate the dipeptides, although with lower efficiencies. The most efficient enzyme, PyrH, brominated 25% of the GW and 17% of the SW peptide.
Tripeptides comprising the permissive SW, GW and AW motifs extended by either a bulky, nucleophilic, or charged amino acid at the N-terminus were next investigated. The halogenases were largely inefficient at chlorinating the tripeptides SSW, GSW, AAW or GAW, and almost inactive for their bromination (Figure 2). PyrH-catalyzed chlorination dropped from 83% for the SW dipeptide to 16% and 10% for the GSW and SSW tripeptides, respectively. Activity on the AAW (9%) and GAW (1%) tripeptides was also markedly lower than for the AW dipeptide (50%). In contrast, chlorination and bromination of the tripeptide GGW (50%) was comparable to that of the GW dipeptide (55%). The highest activity was seen for the SGW tripeptide, with 58% chlorination achieved by PyrH. Incorporation of charged amino acids reduced efficiency in all sequence contexts.
The effect of peptide length on PyrH-catalyzed chlorination was tested next by sequential N-terminal extension of the GGW and SGW tripeptides with glycine (Figure 3). All extended peptides were converted efficiently relative to the GW dipeptide and the GGW/SGW tripeptides. The longer G5W and G3SGW peptides were next enzymatically chlorinated and purified at preparative scale. 1H NMR spectra indicated tryptophan C5 as the sole halogenation site (Figures S1-S4, S6 and S7), in agreement with the regiospecificity of PyrH [25].
To further confirm halogenation, Suzuki coupling of 1,4-benzenediboronic acid bis(pinacol) ester to the PyrH-chlorinated G3SGW peptide was carried out. Phenyl-substituted G3SGW peptide formed by monocoupling and subsequent protodeboronation was detected by LC-MS (Figure S35). This indicates the possibility of derivatizing enzymatically chlorinated peptides, although further protocol optimization is required to achieve efficient dicoupling.
The PyrH substrate binding pocket differs notably from those of PrnA and RebH, resulting in tryptophan adopting a different bound conformation [26]. Differences include both an α-helical region in the T7H enzymes (F458-N464 in RebH and F447-N453 in PrnA) and a short loop insertion (T343-F438 in RebH and N444-S448 in PrnA) that potentially limits optimal peptide binding by active site occlusion. No similar structural elements are present in PyrH, likely rendering its active site more accessible to the larger non-cognate peptide substrates (Figure 4). We hypothesized that further opening-up of the PyrH substrate binding site could enhance halogenation efficiency. We therefore deleted four amino acids (A145-V148) within a proximal unstructured loop region (F144-Y166) to yield variant PyrH-dASQV. This enzyme showed 10 to 30% increased activity over the wildtype PyrH on peptide substrates (Figure 5). We also introduced a conservative mutation, Q160N, to both increase active site accessibility and further accommodate residues preceding the C-terminal tryptophan (Figure 4). This variant (PyrH-Q160N) showed ≥50% higher activity over PyrH for chlorination of the GGW and G5W peptides (Figure 5). The binding affinity (KM) of PyrH-Q160N was 54% improved over PyrH for the GGW peptide (Table 2). Both variants exhibited similar turnover numbers (kcat), indicating that the modest increase in activity of PyrH-Q160N arose from improved substrate binding.
Figure 4. Structures of PyrH, RebH and PrnA, highlighting the loop insertion (T343-F438 in RebH and N444-S448 in PrnA) that is absent in PyrH. The PyrH residues Q160 (mutated to N in the Q160N variant) and A145 (deleted in PyrH-dASQV) are highlighted. Note that S146, Q147 and V148, also deleted in this mutant, are not resolved in the crystal structure. Bottom panels: Modeling of the PyrH Q160N mutation with bound tryptophan highlights cavitation of the active site region enabling access to larger peptidic substrates. Images generated based on structures 2WEU [26] (PyrH), 2OA1 [30] (RebH) and 2ARD [29] (PrnA) using PyMOL.
We next recombinantly expressed and purified three model proteins, eGFP [36] (30 kDa), the Stoffel fragment [37] of Taq DNA polymerase (64 kDa) and SpyCatcher [38] (13 kDa), with genetically encoded C-terminal GGW or SGW tags preceded by a site-specific protease cleavage site to facilitate MS analysis of reaction products (Tables S1 and S2). C-terminal halogenation by PyrH and its two mutant variants was observed (Figure 6). Conversion approached 90% for the eGFP-GGW substrate using the PyrH mutants, with no significant perturbation of its spectral properties compared to unmodified eGFP (Figure S37). A control experiment utilizing the tagged thermostable Stoffel fragment and heat inactivation of the halogenase enzyme (65 °C, 20 min) prior to proteolysis and MS analysis confirmed protein halogenation (Figure S5). As before, PyrH-Q160N was generally the most active (Figure 4). Lower halogenation efficiencies were observed for the Stoffel and SpyCatcher proteins (~40% using PyrH-Q160N), which may arise from steric hindrance between the halogenase and the substrate protein. Further iteration of linker lengths between these proteins and the (G/S)GW tag is warranted to address this possibility.

Discussion
We have developed a novel enzymatic method for in vitro C-terminal halogenation of a range of peptides and proteins. Enzyme and substrate screening yielded peptides comprising the optimal (G/S)GW motif, which we term the HaloTryp Tag. In combination with the rationally designed PyrH-Q160N mutant, the HaloTryp Tag facilitated site-specific halogenation of several model proteins (40 to 90% conversion). We also demonstrated chemo-catalytic derivatization of an enzymatically halogenated peptide, although cross-coupling was not observed with the protocol employed. In this respect, derivatization of halogenated peptides/proteins should be assessed using recently described cross-coupling approaches that work at low temperatures in aqueous media [39][40][41][42]. Post-translational labelling using the HaloTryp Tag could also be used downstream of co-translational labelling methodologies [18,43] to expand chemical diversity and potentially introduce novel physico-chemical properties. Of pertinent interest is modulation of a therapeutic protein's cell permeability via halogen installation.

Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biom12121841/s1, Figure S1. 1H spectra of chlorinated GGGGGW; Figure S2. Chemical structure of chlorinated GGGGGW; Figure S3. 1H spectra of chlorinated GGGSGW; Figure S4. Chemical structure of chlorinated GGGSGW; Figure S5. Chlorination of Stoffel-SGW protein using PyrH-Q160N enzyme followed by C-terminus cleavage with and without prior heat inactivation of the halogenating enzymes; Figure S6. HRMS of chlorinated GGGGGW; Figure S7. HRMS of chlorinated GGGSGW; Figure S8. HPLC spectrum for chlorination of eGFP-GGW produced by enzymatic halogenation using WT PyrH. Retention times before and after chlorination are 7.210 min and 7.817 min, respectively; Figure S9. HPLC spectrum for chlorination of eGFP-GGW produced by enzymatic halogenation using PyrH-dASQV; Figure S10. HPLC spectrum for chlorination of eGFP-GGW produced by enzymatic halogenation using PyrH-Q160N. Retention times before and after chlorination are 6.005 min and 6.628 min, respectively; Figure S11. HPLC spectrum for chlorination of eGFP-SGW produced by enzymatic halogenation using WT PyrH; Figure S12. HPLC spectrum for chlorination of eGFP-SGW produced by enzymatic halogenation using PyrH-dASQV; Figure S13. HPLC spectrum for chlorination of eGFP-SGW produced by enzymatic halogenation using PyrH-Q160N; Figure S14. HPLC spectrum for chlorination of Stoffel-GGW produced by enzymatic halogenation using WT PyrH; Figure S15. HPLC spectrum for chlorination of Stoffel-GGW produced by enzymatic halogenation using PyrH-dASQV; Figure S16. HPLC spectrum for chlorination of Stoffel-GGW produced by enzymatic halogenation using PyrH-Q160N; Figure S17. HPLC spectrum for chlorination of Stoffel-SGW produced by enzymatic halogenation using WT PyrH; Figure S18. HPLC spectrum for chlorination of Stoffel-SGW produced by enzymatic halogenation using PyrH-dASQV; Figure S19. HPLC spectrum for chlorination of Stoffel-SGW produced by enzymatic halogenation using PyrH-Q160N; Figure S20. HPLC spectrum for chlorination of SpyCatcher-GGW produced by enzymatic halogenation using WT PyrH. Retention times before and after chlorination are 7.205 min and 7.859 min, respectively; Figure S21. HPLC spectrum for chlorination of SpyCatcher-GGW produced by enzymatic halogenation using PyrH-dASQV. Retention times before and after chlorination are 7.2222 min and 7.861 min, respectively; Figure S22. HPLC spectrum for chlorination of SpyCatcher-GGW produced by enzymatic halogenation using PyrH-Q160N. Retention times before and after chlorination are 6.936 min and 7.555 min, respectively; Figure S23. HPLC spectrum for chlorination of SpyCatcher-SGW produced by enzymatic halogenation using WT PyrH. Retention times before and after chlorination are 7.137 min and 7.791 min, respectively; Figure S24. HPLC spectrum for chlorination of SpyCatcher-SGW produced by enzymatic halogenation using PyrH-dASQV. Retention times before and after chlorination are 7.150 min and 7.794 min, respectively. No di- or tri-substituted products were observed by mass extraction in MS; Figure S25. HPLC spectrum for chlorination of SpyCatcher-SGW produced by enzymatic halogenation using PyrH-Q160N. Retention times before and after chlorination are 6.888 min and 7.510 min, respectively; Figure S26. HPLC spectrum for chlorination of GW produced by enzymatic halogenation using PyrH; Figure S27. HPLC spectrum for chlorination of G2W produced by enzymatic halogenation using PyrH; Figure S28. HPLC spectrum for chlorination of G3W produced by enzymatic halogenation using PyrH. Retention times before and after chlorination are 1.788 min and 2.211 min, respectively; Figure S29. HPLC spectrum for chlorination of G4W produced by enzymatic halogenation using PyrH; Figure S30. HPLC spectrum for chlorination of G5W produced by enzymatic halogenation using PyrH; Figure S31. HPLC spectrum for chlorination of SGW produced by enzymatic halogenation using PyrH; Figure S32. HPLC spectrum for chlorination of G2SGW produced by enzymatic halogenation using PyrH; Figure S33. HPLC spectrum for chlorination of G3SGW produced by enzymatic halogenation using PyrH; Figure S34. Michaelis-Menten plot for calculating kinetic parameters for chlorination of GGW using WT PyrH (solid circles) and PyrH-Q160N (solid squares). Values represent average ± SD (n = 2); Figure S35. Extracted MS (ESI+) spectrum of phenyl-substituted G3SGW; Figure S36. SDS-PAGE analysis of indicated purified proteins; Figure S37. A. Absorbance spectra of eGFP and eGFP halogenated with PyrH-Q160N enzyme (5 mg/mL) in 10 mM phosphate buffer (pH 7.2); Table S1. Amino acid sequence of proteins tested for halogenation showing the N-terminus His-tag in orange, the protein in green, PreScission protease cleavage sequence in blue, FLAG (solubility) tag in red and the HaloTryp Tag in black; Table S2. Primers used for inserting the LEVLFQGPDYKDDDDK-GGW/-SGW sequence at the C-terminus of the respective proteins.