Site-Selective Dynamics of Ligand-Free and Ligand-Bound Azidolysozyme

Azido-modified alanine residues (AlaN$_3$) are environment-sensitive, minimally invasive infrared probes for the site-specific investigation of protein structure and dynamics. Here, the ability of the label to report whether a ligand is bound to the active site of Lysozyme, and how the spectroscopy and dynamics change upon ligand binding, is investigated. The results demonstrate specific differences between the ligand-free and the ligand-bound, N$_3$-labelled protein for the center frequencies of the asymmetric azide stretch vibration, the long-time decay, and the static offset of the frequency fluctuation correlation function, all of which are experimental observables. Changes in dynamics can also be mapped onto changes in the local and through-space coupling between residues by virtue of dynamical cross-correlation maps. This makes the azide label a versatile and structurally sensitive probe to report on the dynamics of proteins in a variety of environments and for a range of different applications.

Proteins are essential for the function and survival of living organisms.$^{2,3}$ However, characterizing the structural and functional dynamics of proteins simultaneously under physiological conditions in the condensed phase, which is a prerequisite for understanding cellular processes at a molecular level, remains challenging.$^{1}$ Vibrational spectroscopy, in particular two-dimensional infrared (2D IR) spectroscopy, has been shown to be a powerful tool for studying the structural dynamics of various biological systems.$^{4}$ One particular challenge is to obtain structural and environmental information in a site-specific manner. To address this, significant effort has been focused on the development and application of various infrared (IR) reporters$^{5,6}$ that absorb in the frequency range of 1700-2800 cm$^{-1}$, where their signal can be discriminated from the strong protein background.$^{7,8}$ Such IR probes have provided valuable information about the structure and dynamics of complex systems. For example, nitrile probes have helped to clarify the role of electrostatic fields in enzymatic reactions$^{9,10}$ or to elucidate the mode of drug binding to proteins.$^{11,12}$ Isotope-edited carbonyl spectroscopy was used to characterize the mechanism of protein folding and amyloid formation$^{13,14}$ or the structure and function of membrane proteins.$^{15,16}$ Additional molecular groups such as thiocyanate,$^{17}$ cyanamide,$^{18}$ sulfhydryl vibrations of cysteines,$^{19}$ deuterated carbons,$^{20}$ carbonyl vibrations of metal-carbonyls,$^{21}$ cyanophenylalanine,$^{22}$ and azidohomoalanine (Aha)$^{23}$ have also been explored.
In the present work AlaN$_3$, an analogue of azidohomoalanine (Aha) that has been shown to sensitively report on local structural changes while still being minimally invasive,$^{24,25}$ is used as the probe. This modification can be incorporated into proteins at virtually any position via known expression techniques.$^{26}$ The asymmetric stretch of -N$_3$ absorbs at $\sim 2100$ cm$^{-1}$ with a reasonably high extinction coefficient of 300-400 M$^{-1}$ cm$^{-1}$, which makes it an ideal spectroscopic reporter.$^{23}$ Aha has been used for biomolecular recognition after incorporation into the peptide directly$^{23,24}$ or in the vicinity of the binding area of a PDZ2 domain,$^{27}$ to detect the water-specific response of azide vibrations when attached to small organic molecules,$^{28}$ or to probe frequency shifts and fluctuations owing to its sensitivity to the local electrostatic environment and dynamics.$^{25,29}$ Such studies confirm that AlaN$_3$ and/or Aha are environment-sensitive IR probes and suitable modifications for site-specific investigations of protein structure.
With its picosecond time resolution, IR spectroscopy provides direct information about the structural dynamics around a probe molecule.$^{4,30}$ Moreover, introducing IR probes with isolated vibrational frequencies overcomes the problem of spectral congestion that complicates discrimination and analysis of the desired vibrational bands. With that, the inter- and intramolecular coupling between degrees of freedom or the local structure or dynamics of biological systems can be specifically probed and characterized. Such an approach relies on the sensitivity of the probe to report on changes in the vibrational frequencies induced by alterations in the local electrostatic interactions in the vicinity of the probe.$^{22}$
IR spectroscopy is a potentially advantageous technique to characterize ligand binding to proteins.$^{31,32}$ Its success depends in part on the notion that when a ligand binds to a protein, the frequency of an infrared-active vibration shifts because the electric field in solution (often water) differs from that in the protein binding site. Such an approach often requires the ligand to be modified, e.g. through addition of a suitable label such as -CN as in benzonitrile. This has been successfully demonstrated for benzonitrile in the active site of WT and mutant lysozyme.
$^{32}$ Alternatively, the protein can be selectively modified by attaching a spectroscopic label at strategic positions so that the binding process and functional dynamics can be interrogated with the functional and unmodified ligand. This has the potential advantage that interactions between the ligand and the surrounding protein are unaltered. These interactions contribute the majority of the enthalpic part of the binding free energy and therefore directly affect the affinity of the ligand and its rate of unbinding.
In the present work, changes in the 1D and 2D IR signatures of the azido group attached to all alanine (Ala) residues of Lysozyme upon binding of cyano-benzene (PhCN) are determined. In addition, the changes of the environmental dynamics around all AlaN$_3$ are quantified for ligand-free vs. ligand-bound Lysozyme. Such differences are experimentally observable and yield valuable insight into the energetics and dynamics of protein-ligand binding.
The dynamics of WT Lysozyme without and with labeled alanine (AlaN$_3$) has recently been found to provide position-specific information about the spectroscopy and dynamics of the modification site.$^{25}$ The structure of the protein with the labelled Ala residues is shown in Figure 1 together with the binding site lined by residues Leu84, Val87, Leu91, Leu99, Met102, Val111, Ala112, Phe114, Ser117, Leu118, Leu121, Leu133, and Phe153. Following earlier work,$^{32}$ benzene was replaced by cyano-benzene (PhCN), maintaining carbon atom positions. The WT structure was used here a) to compare directly with earlier results$^{25}$ and b) because PhCN has a comparatively small binding free energy towards the WT protein ($\Delta G_{\rm bind} = -0.5$ kcal/mol), which suggests that the interaction between the ligand and the protein is weak.$^{32}$ For the L99A mutant protein $\Delta G_{\rm bind} = -3.9$ kcal/mol for PhCN,$^{32}$ which compares with an experimentally determined value of $\sim -3.5$ kcal/mol for iodobenzene from isothermal titration calorimetry.
$^{33}$ For the -N$_3$ label a full-dimensional, accurate potential energy surface (PES) calculated at the pair natural orbital based coupled cluster (PNO-LCCSD(T)-F12/aVTZ)$^{34,35}$ level and represented as a reproducing kernel Hilbert space (RKHS)$^{36,37}$ is available.$^{38}$ This energy function is suitable for spectroscopic investigations and was combined with the CHARMM force field$^{39}$ for the surrounding protein.$^{25}$ MD simulations for the WT and all modified AlaN$_3$ labels were carried out using an adapted version of the CHARMM program$^{40}$ with an interface to perform the simulations with the RKHS PES.$^{38}$ The protein is solvated in explicit TIP3P water$^{41}$ using a cubic box of size $(78)^3$ Å$^3$. First, all systems were minimized, followed by heating and equilibration. Next, 2 ns $NVT$ production simulations were carried out with and without the ligand present in the active site for all 16 protein variants with Ala replaced by AlaN$_3$. Bonds involving H-atoms were constrained using the SHAKE$^{42}$ algorithm and all nonbonded interactions were evaluated with shifted interactions using a cutoff of 14 Å and switched at 10 Å.
$^{43}$ Snapshots for analysis were recorded every 5 fs. Using instantaneous normal mode (INM) analysis,$^{25,38,44}$ the frequency trajectory $\omega(t)$ of the asymmetric stretch vibration of the -N$_3$ label was determined. Based on this, the 1D infrared spectra corresponding to the azide asymmetric stretch vibration for each of the 16 AlaN$_3$ residues were computed for the ligand-free and ligand-bound protein, see Figures 3 and S2. Direct comparison of the maximum position of the infrared lineshape shows that for three N$_3$-modified alanine residues (Ala41, Ala98, Ala130) the difference in the absorption frequency is insignificant. For positions Ala63, Ala73 and Ala160 the differences are 8, 3, and 2 cm$^{-1}$, respectively, whereas for the other residues the change is within 1 cm$^{-1}$. Such frequency changes can be measured with state-of-the-art experiments$^{22}$ and their magnitude is also consistent with previous simulations of the vibrational Stark effect for the -CN probe in PhCN, with red shifts of up to 3.5 cm$^{-1}$ in going from the WT to the L99A and L99G mutants of T4-Lysozyme.$^{32}$ Similarly, 1D and 2D infrared spectroscopy of -CO as the label for insulin monomer and dimer found$^{45}$ that the relative shifts of the spectroscopic response were correctly described whereas the absolute frequencies may differ by some 10 cm$^{-1}$. In a very recent work such an approach found a splitting of 13 cm$^{-1}$, compared with 25 cm$^{-1}$ from experiment, for the outer and central -CO labels in cationic trialanine in water.$^{46}$ Hence, MD simulations together with instantaneous normal modes are a meaningful approach to determine relative frequency shifts, whereas capturing absolute frequencies in such simulations requires slight reparametrization of the underlying force field, e.g. through morphing techniques.$^{47,48}$ The magnitude of the frequency shifts found from the present simulations is also comparable with the -3 cm$^{-1}$ found from experiments on the nitrile stretch in ligand IDD743 bound to WT vs.
V47N mutant hALR2$^{31}$ or a 6 cm$^{-1}$ blue shift of the -CO vibrational frequency due to the binding of 19-NT to the Asp40Asn mutant of the protein ketosteroid isomerase compared to the WT.$^{49}$ Thus, differences of $\sim 1$ cm$^{-1}$ for the frequency of the reporter in different chemical environments can be experimentally detected.$^{22}$
From the frequency trajectories the frequency fluctuation correlation function (FFCF) can be determined, which contains valuable information on relaxation time scales corresponding to the solvent dynamics around the solute. The FFCFs are fit to an empirical expression which allows analytical integration to obtain the lineshape function,$^{50}$ using an automated curve fitting tool from the SciPy library.$^{51}$ As was found for the RMSFs and 1D IR spectra, the FFCFs from the simulations with and without the ligand bound to the protein can be very similar or differ appreciably, see Figure 4. The slow decay time, $\tau_2$, of the -N$_3$ asymmetric stretch mode of the label is typically shorter for the ligand-bound protein compared to that without PhCN, see Figure 4B, although exceptions exist. For Ala97N$_3$, Ala112N$_3$, and Ala134N$_3$ the slow relaxation is accelerated upon ligand binding, with speed-ups ranging from 75 % to a factor of $\sim 2.5$, and for Ala146N$_3$ the slow time scale $\tau_2$ differs by a factor of $\sim 3$ between ligand-free ($\tau_2 = 5.13$ ps) and ligand-bound ($\tau_2 = 1.61$ ps) Lysozyme. For the other alanine residues the $\tau_2$ times of ligand-free and ligand-bound lysozyme are similar. As an exception, for Ala129N$_3$ the decay is slowed down by $\sim 50$ % for PhCN-bound lysozyme. All FFCFs without (Figure S3) and with (Figure S4) the ligand bound, together with the parameters of the empirical fit (Table S1), are given in the supporting information.
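The FFCF analysis described above can be sketched in a few lines. The empirical expression itself is not reproduced in this text, so the functional form below is an assumption chosen to match the parameters named in the caption of Table S1 (amplitudes $a_i$, decay times $\tau_i$, a parameter $\gamma$ and a static term $\Delta_0^2$): $\langle \delta\omega(0)\delta\omega(t)\rangle \approx a_1 \cos(\gamma t) e^{-t/\tau_1} + a_2 e^{-t/\tau_2} + \Delta_0^2$. Only two time scales are used for brevity, and the helper names `ffcf`, `ffcf_model` and `fit_ffcf` are hypothetical. The fit uses `scipy.optimize.curve_fit`; whether this is the exact SciPy routine used in the work is not stated.

```python
import numpy as np
from scipy.optimize import curve_fit


def ffcf(omega, max_lag):
    """Return <dw(0) dw(t)> for lags 0..max_lag-1 (in units of the sampling step)."""
    dw = np.asarray(omega, dtype=float)
    dw = dw - dw.mean()                      # fluctuation around the mean frequency
    n = len(dw)
    return np.array([np.mean(dw[: n - k] * dw[k:]) for k in range(max_lag)])


def ffcf_model(t, a1, gamma, tau1, a2, tau2, delta0_sq):
    """Assumed empirical form: damped cosine + one exponential + static offset."""
    return (a1 * np.cos(gamma * t) * np.exp(-t / tau1)
            + a2 * np.exp(-t / tau2)
            + delta0_sq)


def fit_ffcf(t, c, p0):
    """Least-squares fit of the assumed model; returns the fitted parameters."""
    popt, _ = curve_fit(ffcf_model, t, c, p0=p0, maxfev=20000)
    return popt


# Synthetic self-check: recover known parameters from a noise-free model curve.
t = np.arange(0.0, 10.0, 0.005)              # ps, 5 fs spacing as in the text
true_params = (1.0, 5.0, 0.1, 0.3, 2.0, 0.1)  # illustrative values only
fitted = fit_ffcf(t, ffcf_model(t, *true_params),
                  p0=(0.9, 4.8, 0.12, 0.25, 1.8, 0.08))
```

For real data, `ffcf` would be applied to the INM frequency trajectory and the result passed to `fit_ffcf`; the fitted $\tau_2$ and $\Delta_0^2$ then correspond to the quantities compared between ligand-free and ligand-bound protein.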
As a last feature of the FFCF, the static component $\Delta_0$ can differ appreciably between ligand-free and ligand-bound lysozyme. The static offset $\Delta_0$ is an experimental observable and characterizes the structural heterogeneity around the modification site. There are only four alanine residues for which the static offset is similar for ligand-bound and ligand-free lysozyme (Ala41, Ala49, Ala82, and Ala93). For all others the differences range from 15 % to a factor of $\sim 3$. As an example, for Ala73N$_3$ the difference in $\Delta_0^2$ between ligand-bound and ligand-free lysozyme is a factor of $\sim 2.5$ ($\Delta_0^2 = 0.18$ vs. 0.07 ps$^{-2}$, i.e. $\Delta_0 = 0.42$ vs. 0.26 ps$^{-1}$) and for Ala146N$_3$ they differ by a factor of $\sim 3.5$ ($\Delta_0^2 = 0.52$ vs. 0.15 ps$^{-2}$, i.e. $\Delta_0 = 0.72$ vs. 0.39 ps$^{-1}$). Thus, the environmental dynamics around the spectroscopic label can be sufficiently perturbed by binding of a ligand in the protein active site to be reported directly as an experimentally accessible quantity with typical errors$^{22}$ between 0.1 cm$^{-1}$ and 0.3 cm$^{-1}$ ($\sim 0.05$ ps$^{-1}$). Hence, the differences found from the simulations are well outside the expected error bars from experiment.
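Because the static offsets are quoted both in ps$^{-1}$ and in cm$^{-1}$, a small conversion helper makes the comparison with experimental error bars explicit. The sketch below assumes the usual relation between angular frequency and wavenumber, $\omega = 2\pi c \tilde{\nu}$; the function names are hypothetical.

```python
import math

C_CM_PER_PS = 2.99792458e-2  # speed of light in cm/ps


def ps_to_wavenumber(omega_ps):
    """Convert an angular frequency in ps^-1 to a wavenumber in cm^-1."""
    return omega_ps / (2.0 * math.pi * C_CM_PER_PS)


def wavenumber_to_ps(nu_cm):
    """Convert a wavenumber in cm^-1 to an angular frequency in ps^-1."""
    return nu_cm * 2.0 * math.pi * C_CM_PER_PS


# Static offset for ligand-bound Ala73N3 quoted in the text: Delta_0^2 = 0.18 ps^-2
delta0_ps = math.sqrt(0.18)                 # about 0.42 ps^-1
delta0_cm = ps_to_wavenumber(delta0_ps)     # same quantity in cm^-1
```

With this convention 1 ps$^{-1}$ corresponds to roughly 5.3 cm$^{-1}$, which is consistent with the paired values quoted in the text (for example 5 cm$^{-1}$ and 0.94 ps$^{-1}$).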
Nonvanishing static components of the FFCF were also reported from experiments. For trialanine (Ala)$_3$ a value of $\Delta_0 = 5$ cm$^{-1}$ was reported$^{52}$ compared with $\Delta_0 = 4.6$ cm$^{-1}$ from MD simulations (0.94 ps$^{-1}$ vs. 0.86 ps$^{-1}$) with multipolar force fields.$^{46}$ Similarly, CN$^-$ in water features a nonvanishing tilt angle by $\tau = 10$ ps$^{53}$ with $\Delta_0 \sim 0.1$ ps$^{-1}$ $\sim 0.5$ cm$^{-1}$.$^{54}$
To determine in which way the dynamics of residues is affected upon modification of the protein, dynamical cross-correlation maps$^{55,56}$ (DCCM) were calculated from the trajectories using the Bio3D package.$^{57}$ Dynamic cross-correlation matrices are based on the expression
$$C_{ij} = \frac{\langle \Delta \mathbf{r}_i \cdot \Delta \mathbf{r}_j \rangle}{\left( \langle \Delta \mathbf{r}_i^2 \rangle \langle \Delta \mathbf{r}_j^2 \rangle \right)^{1/2}}$$
where $\mathbf{r}_i$ and $\mathbf{r}_j$ are the spatial C$_\alpha$ atom positions of the respective $i$th and $j$th amino acids and $\Delta \mathbf{r}_i$ corresponds to the displacement of the $i$th C$_\alpha$ from its averaged position over the entire trajectory. DCCMs report on the correlated and anticorrelated motions within a protein, and difference maps provide a global view of the positionally resolved differences in the dynamics. In the following, only absolute values of $C_{ij}$, and differences between them, that are larger than 0.5 are reported. The DCCMs are symmetrical about the diagonal and, for clarity, positive correlations (for DCCM) or positive differences in $C_{ij}$ (for $\Delta$DCCM) are displayed in the lower right triangle whereas negative values or differences in $C_{ij}$ are displayed in the upper left triangle.
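To make the construction of the cross-correlation and difference maps concrete, the following NumPy sketch evaluates the standard normalized cross-correlation of C$_\alpha$ displacements and applies the 0.5 threshold mentioned in the text. It is not the Bio3D implementation; the function names, the array layout `(n_frames, n_atoms, 3)`, the assumption that frames are already superimposed on a common reference, and the sign convention of the difference (first map minus second map) are all illustrative choices.

```python
import numpy as np


def dccm(coords):
    """Cross-correlation matrix C_ij from aligned positions, shape (n_frames, n_atoms, 3)."""
    coords = np.asarray(coords, dtype=float)
    disp = coords - coords.mean(axis=0)                # displacement from the mean position
    # <dr_i . dr_j>: sum over Cartesian components, average over frames
    cov = np.einsum("tix,tjx->ij", disp, disp) / coords.shape[0]
    norm = np.sqrt(np.diag(cov))                       # sqrt(<dr_i^2>)
    return cov / np.outer(norm, norm)


def thresholded_difference(c_a, c_b, cutoff=0.5):
    """Difference map c_a - c_b, keeping only entries with |difference| > cutoff."""
    diff = np.asarray(c_a) - np.asarray(c_b)
    return np.where(np.abs(diff) > cutoff, diff, 0.0)


# Toy trajectory: two atoms moving exactly out of phase along x.
x = np.array([0.0, 1.0, 0.0, -1.0])
toy = np.zeros((4, 2, 3))
toy[:, 0, 0] = x
toy[:, 1, 0] = -x
c_toy = dccm(toy)                                      # off-diagonal element is -1
d_toy = thresholded_difference(c_toy, np.eye(2))
```

In practice `coords` would hold the C$_\alpha$ positions of all residues for the ligand-free or ligand-bound trajectory, and the two resulting matrices would be passed to `thresholded_difference`.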
The DCCMs for ligand-free and ligand-bound Lysozyme with Ala129N$_3$, and the difference between the two, are shown in Figure 5. These maps reveal ligand-induced differences in the correlated and anticorrelated motions with appreciable amplitudes, see features A to D in Figure 5C. For ligand-free Lysozyme there are pronounced couplings between residues [130, 147] and [20, 25]/[32, 37] for anticorrelated motions and between residues [68, 80] and [103, 112] for correlated motions. As demonstrated in Figure 5B, upon binding the ligand to Lysozyme the DCCM shows different coupled residues compared to the ligand-free protein. As an example, residues [35, 45] and [55, 68] are affected more for anticorrelated motions, while for correlated motions the coupling is between residues [5, 15] and [55, 65]. Note that these effects may not be visible in the $\Delta$DCCM map as the magnitude of the difference between the two systems may be below the threshold of 0.5 in $\Delta C_{ij}$. In the difference map (Figure 5C) feature A indicates the coupling between residues [135, 145] and [20, 25]/[30, 42] whereas feature B refers to coupled residues [65, 75] and [58, 65]. Furthermore, feature C demonstrates prominent variations between residues [84, 95] and [117, 125], while for feature D residues [129, 140] and [140, 147] are strongly correlated. These findings suggest that residues couple both locally (features B/D) and through space (features A/C).
It should also be pointed out that residues involved in features A to C are among those with higher RMSF, see Figure S1. Interestingly, the region around residue Ala146 with larger differences $\Delta C_{ij}$ displays correlated dynamics with spatially close residues around residue Asp20 (white licorice in Figure 1). On the other hand, the pronounced differences in the RMSF of Ala129N$_3$ (see Figure S1) for residues [42, 57] do not show up in the $\Delta$DCCM map because their $C_{ij}$ coefficients are below the threshold of 0.5.
In summary, the present work demonstrates that the 1D and 2D IR spectroscopy of azide bound to alanine residues in WT Lysozyme provides valuable site-specific and temporal information about ligand binding of PhCN to the active site of WT lysozyme. Of particular note is the increase in contrast between the ligand-free and the ligand-bound protein when the azido-label is present, as demonstrated for Ala134N$_3$. Furthermore, the static component $\Delta_0$ of the FFCF, which is an experimentally accessible observable, shows pronounced differences between the ligand-bound and ligand-free protein and can serve as a useful indicator for ligand binding. Changes in the maximum of the infrared absorbance are of the order of one to several cm$^{-1}$, which is still detectable with state-of-the-art experiments.$^{22}$ Given that the -N$_3$ label can be introduced at multiple positions along the polypeptide chain with specific spectroscopic signatures for each variant of the system, it may even be possible to use the present approach to refine existing structural models based on NMR measurements$^{58}$ or from more conventional co-crystallization and X-ray structure determination efforts.
E-mail: m.meuwly@unibas.ch
Table S1: Parameters obtained from fitting the FFCF to Eq. 1 for INM frequencies for all different AlaN$_3$ residues in lysozyme. Average frequency $\omega$ of the asymmetric stretch in cm$^{-1}$, the amplitudes $a_1$ to $a_3$ in ps$^{-2}$, the decay times $\tau_1$ to $\tau_3$ in ps, the parameter $\gamma$ in ps$^{-1}$ and the static term $\Delta_0^2$ in ps$^{-2}$.

Figure 2: RMSFs for the C$_\alpha$ atoms for ligand-free (blue) and ligand-bound (red) Lysozyme with -N$_3$ attached to the Ala73, Ala82, Ala112 and Ala160 residues. The label in each panel refers to the Ala residue number which carries the azide label, and the position of that residue is indicated with an asterisk above the RMSF trace.

Figure 4: Panel A: FFCFs with pronounced differences from correlating the INM frequencies for ligand-free and ligand-bound Ala73, Ala146, Ala42, and Ala98 in Lysozyme. The labels in each panel refer to the Ala residue which carries the azide label. Blue (ligand-free) and red (ligand-bound) traces are the fits to Eq. 1. The $y$-axis is logarithmic. Panels B and C compare $\tau_2$ and $\Delta_0^2$ for ligand-bound and ligand-free Lysozyme, respectively.

Figure 5: DCCM for ligand-free (panel A) and ligand-bound (panel B) Lysozyme, and $\Delta$DCCM between ligand-free and ligand-bound (panel C) for Ala129N$_3$-PhCN. Positive correlations are in the lower right triangle, negative correlations in the upper left triangle. Only correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure 6: Differences in DCCM maps (∆DCCM) between WT and WT-PhCN (panel A), WT and Ala134N 3 (panel B) and Ala134N 3 and Ala134N 3 -PhCN (panel C).Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only differences in correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure S1: Root mean squared fluctuations (RMSFs) for the C$_\alpha$ atoms for ligand-free (blue) and PhCN-bound (red) Lysozyme with -N$_3$ attached to each alanine residue. The label in each panel refers to the alanine residue number which carries the azide label, and the position of that residue is indicated with an asterisk above the RMSF trace.

Figure S2: 1D lineshapes from INM for all 16 AlaN 3 residues for ligand-free (panel A) and ligand-bound (panel B) Lysozyme.The positions of all frequency maxima are compared in Figure 3.

Figure S3: FFCFs from correlating the instantaneous harmonic frequencies for all 16 AlaN$_3$ in Lysozyme. The labels in each panel refer to the alanine residue which carries the azide label. Black traces are the raw data and red dashed lines the fits to Eq. 1. The $y$-axis is logarithmic.

Figure Difference dynamic cross correlation maps (∆DCCM) between Ala41N 3 and Ala41N 3 -PhCN.Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure Difference dynamic cross correlation maps (∆DCCM) between Ala42N 3 and Ala42N 3 -PhCN.Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure S7: Difference dynamic cross correlation maps (∆DCCM) between Ala49N 3 and Ala49N 3 -PhCN.Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure Difference dynamic cross correlation maps (∆DCCM) between Ala63N 3 and Ala63N 3 -PhCN.Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure S9: Difference dynamic cross correlation maps (∆DCCM) between Ala73N 3 and Ala73N 3 -PhCN.Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only correlation coefficients with an absolute value greater than 0.5 are displayed.

Figure S10: Difference dynamic cross correlation maps (∆DCCM) between Ala74N 3 and Ala74N 3 -PhCN.Positive correlations are in the lower right triangle, negative correlations in the upper left triangle.Only correlation coefficients with an absolute value greater than 0.5 are displayed.