Tracking the Amide I and αCOO− Terminal ν(C=O) Raman Bands in a Family of L-Glutamic Acid-Containing Peptide Fragments: A Raman and DFT Study

The E-hook of β-tubulin plays instrumental roles in cytoskeletal regulation and function. The last six C-terminal residues of the βII isotype, a peptide of amino acid sequence EGEDEA, extend from the microtubule surface and have eluded characterization with classic X-ray crystallographic techniques. The band position of the characteristic amide I vibration of small peptide fragments is heavily dependent on the length of the peptide chain, the extent of intramolecular hydrogen bonding, and the overall polarity of the fragment. The dependence of the E residue’s amide I ν(C=O) and the αCOO− terminal ν(C=O) bands on the neighboring side chain, the length of the peptide fragment, and the extent of intramolecular hydrogen bonding in the structure are investigated here via the EGEDEA peptide. The hexapeptide is broken down into fragments increasing in size from dipeptides to hexapeptides, including EG, ED, EA, EGE, EDE, DEA, EGED, EDEA, EGEDE, GEDEA, and, finally, EGEDEA, which are investigated with experimental Raman spectroscopy and density functional theory (DFT) computations to model the zwitterionic crystalline solids (in vacuo). The molecular geometries and Boltzmann sum of the simulated Raman spectra for a set of energetic minima corresponding to each peptide fragment are computed with full geometry optimizations and corresponding harmonic vibrational frequency computations at the B3LYP/6-311++G(2df,2pd) level of theory. In the absence of a crystal structure, geometry sampling is performed to approximate solid phase behavior. Natural bond orbital (NBO) analyses are performed on each energetic minimum to quantify the magnitude of the intramolecular hydrogen bonds. The extent of the intramolecular charge transfer is dependent on the overall polarity of the fragment considered, with larger and more polar fragments exhibiting the greatest extent of intramolecular charge transfer.
A steady blue shift arises when considering the amide I band position moving linearly from ED to EDE to EDEA to GEDEA and, finally, to EGEDEA. However, little variation is observed in the αCOO− ν(C=O) band position in this family of fragments.


Introduction
The amide I vibration has been found to be heavily dependent on the formation of intramolecular hydrogen bonds in proteins, and hydrogen bonds, in turn, are integral in the stabilization of various protein secondary structures [1-9]. As a result, vibrational spectroscopy combined with density functional theory (DFT) computations is commonly employed to investigate the molecular geometries, dynamics, and secondary structures of short peptide chains. Navarette and coworkers reported the Raman and IR vibrational spectra for L-glutamic acid [44], L-aspartic acid (D) [35], and the glutamic acid (EE) and aspartic acid (DD) dipeptides [35] in their seminal works and found that these molecules are zwitterionic in the crystalline state, confirmed with both X-ray and neutron diffraction techniques. The Raman spectra of the zwitterionic dipeptide, L-aspartyl-L-glutamic acid (DE), in the solid state and in solution have also been reported and compared to simulated spectra from DFT computations at the B3LYP/cc-pVDZ level of theory [41]. Kauser et al. concluded that the zwitterionic form of DE in the solid state is stabilized by strong inter- and intramolecular hydrogen bonds and, furthermore, characterized the amide I band position of the DE dipeptide at 1674 cm⁻¹ in solid state Raman spectra [41]. In one study, the vibrational band positions of GxG (x = D, aspartic acid; N, asparagine; or C, cysteine) and various other tripeptides were analyzed with Raman spectroscopy, circular dichroism, and DFT computations [11,21,45,46]. These fully protonated tripeptides were found to have an above-average propensity for conformations that are usually found in turn regions of peptide chains. Rybka and coworkers concluded that these tripeptides, containing the D, N, or C amino acids at the core, might facilitate the formation of hairpin-like regions in the unfolded state of proteins and could potentially support the initiation of protein folding processes [11].
Much less work has been carried out in the application of quantum chemical techniques to investigate larger peptide systems due to the computational cost required for such analyses. The conformational complexity of polypeptides has thus led to the development of various methods to provide accurate structures and frequencies of large systems while minimizing the computational cost. Bouř and Keiderling introduced the Cartesian transfer method, which applies a direct transfer of Cartesian molecular force fields (FF) and electric property tensors as opposed to the traditionally employed internal coordinates [44]. They included atomic polar and axial tensors in the transfer for computation of the vibrational frequencies. They investigated N-methylacetamide, a tripeptide, and a helical heptapeptide and found that their Cartesian transfer method performed well in the systems selected but has many limitations, most importantly that it cannot be used for aromatic systems with conjugated π-bond systems. Interestingly, they assembled the force fields for larger molecules from smaller fragments to decrease the computational cost even further [44]. Bouř and Keiderling applied their Cartesian transfer method to optimize five standard helical structures (α, 3₁₀-, 3₁-, and left-handed) at the B3LYP/SV(P) level of theory, while simulating the solvent effects with COSMO (conductor-like screening solvent model) [28]. They computed the vibrational frequencies at the BPW91/6-31G* level of theory. Use of the polarized continuum stabilizes the hydrogen bonds in each system under study, and their computed structural parameters agree well with previously reported X-ray structures for native-state proteins [28]. Bouř and Keiderling also developed a normal mode-based method for quantum chemical optimization of molecular geometries [45], which was found to provide smooth convergence for each of the systems under study. Although their method cannot be used if the exact valence coordinates are desired, it provides a useful complementary tool when employing traditional internal valence coordinate-based optimizations.
Reiher introduced his mode-tracking algorithm in 2007, which selectively calculates specific areas of vibrations, called localized normal modes, instead of computing all of the normal modes and frequencies at once [46,47]. In his model polypeptide, (Ala)₂₀, Reiher demonstrates that the localized modes represent the displacements of only a few atoms at a time and are obtained by first optimizing the structure with a DFT method and then performing vibrational frequency computations to find the sole transformation of the normal modes within one band of the spectrum whose accuracy was derived by a previously established localization criterion [46]. The localized modes method proved to be more accurate at computing the band positions of the normal modes than those previously described, including its use in computing the coupling constants that arise when the modes are delocalized throughout the structure. Their use of the highly structured (Ala)₂₀ peptide made arriving at an optimized geometry relatively simple, though, and they only considered a single conformation, without further investigation of the conformational space. Reiher and coworkers have applied this method to investigate the Raman optical activity (ROA) signatures of four structurally similar peptides with a common backbone conformation but varying sequences of amino acid configurations at the BP86/TZVP level of theory [48]. They found that the amino acid configuration plays a significant role in the ROA peaks in the amide I, II, and III regions. Additionally, Reiher applied the localized mode methodology to investigate secondary structure effects on the IR and Raman spectra of the (Ala)₂₀ polypeptide in the α-helical or 3₁₀-helical conformation [49].
Hydrogen bonding has been found to play an important role in protein stability both in the folded and unfolded states [2,4,5]. Several studies have investigated hydrogen bonding patterns in peptide chains and the effects of intramolecular hydrogen bonding and other noncovalent interactions on the position of the amide I band [2-6,8,9,15,32,50-58]. A study by Görbitz et al. investigated the hydrogen bonding interactions in the crystal structures of unprotected, zwitterionic dipeptides and found that hydrogen-bonding patterns arise in which the dipeptides orient themselves in head-to-tail chains involving the N-terminal amino and C-terminal carboxylate groups [51]. Kumar and coworkers recently investigated the folded structures of Z-Gly-Pro-OH dipeptides (Z = benzyloxycarbonyl) with gas phase spectroscopy and DFT computations and observed a weak intramolecular hydrogen bond in the experimental spectrum that corresponds to the backbone amide N-H and backbone carbonyl C=O hydrogen bond (termed a C5 hydrogen bond). Their natural bond orbital (NBO) analysis provides evidence of delocalization of the electrons in the p-type lone-pair orbital of the carbonyl oxygen atom to the σ* orbital of the N-H group [50]. The hydrogen-bonding patterns formed by these structures can also provide evidence for the secondary structure of the protein of interest.
L-glutamic acid (E) is a polar amino acid, and the reactivity of E's side chain results in the facilitation and regulation of many biochemical reactions. For example, the E-hook of β-tubulin is instrumental in cytoskeletal regulation and function. The last six C-terminal residues of the βII isotype, amino acid sequence EGEDEA, protrude from the microtubule surface and facilitate protein binding and molecular motor motility. Unlike investigations by Bouř and Reiher, which investigate ordered peptide chains falling into the classic α-helical and β-sheet secondary structure domains [28,44-46,48,49,85-87], EGEDEA is thought to be an intrinsically disordered protein, in part due to the inability to resolve a crystal structure. Knowing that many peptide conformers are possible, an approach using experimental Raman and simulated DFT characterization together facilitates the approximation of the solid phase behavior of the molecule in vacuo. A recent publication by our group characterizes the vibrational band positions of the E-hook hexapeptide, EGEDEA, and the peptide fragments used to build it computationally, EG, ED, EA, EGED, and EDEA, using experimental Raman spectroscopy combined with DFT computations [10]. Since there is limited conformational freedom due to the partial rigidity of the peptide backbone, small variations in the molecular geometry are not represented in the Raman spectrum. The remarkable similarity observed in the experimental Raman spectra as the size of the peptide fragment increases along with the observed dependence of the amide I vibration on the overall amino acid composition of the fragment warrants further investigation into the intramolecular interactions occurring in these fragments and how they affect the vibrational band positions.
In this study, the EGEDEA hexapeptide is broken down into 11 fragments (EG, ED, EA, EGE, EDE, DEA, EGED, EDEA, EGEDE, GEDEA, and, finally, EGEDEA) and analyzed with experimental Raman spectroscopy and DFT computations, including natural bond orbital (NBO) analyses, to track the amide I vibration and the extent of intramolecular charge transfer as the size of the fragment increases and with variation of the neighboring residue.

Experimental Methods
The β-TUBB2A E-hook in the form of a synthetic peptide of charged amino acid sequence EGEDEA⁴⁻ is acquired in crystalline form from Genscript at >95% purity. Additional peptide fragments EG¹⁻, ED²⁻, EA¹⁻, EGED³⁻, and EDEA³⁻ are also acquired in the same manner and purity. The samples are stored at −20 °C to ensure that they remain in their charged state for Raman analysis. A Horiba Scientific LabRAM HR Evolution Raman Spectroscopy system with CCD camera detection is used to analyze the crystalline solids of the peptide fragments. A 532 nm laser line is used as the excitation source, and either a 600 or 1800 grooves/mm grating is used for detection, affording a resolution of less than 1 cm⁻¹. Additionally, Raman studies at −100 °C utilize a temperature-controlled stage allowing the formation of crystals via a controlled flow of liquid nitrogen over the sample. The vibrational characterization of only the EGEDEA hexapeptide is performed with the temperature-controlled data acquired.
In lieu of molecular dynamics simulations to locate all of the possible local minima for each fragment, a Maxwell-Boltzmann statistical distribution of energetic minima for each is collected by first isolating a single energetic minimum and then varying the peptide bond dihedral angles to search for additional low-energy conformations that fall within an appropriate range. From this set of candidate structures, a Boltzmann sum of the vibrational frequencies and intramolecular hydrogen bonds is employed to describe the experimental band positions of the amide I and αCOO− terminal ν(C=O) vibrational modes. Maxwell-Boltzmann statistics give the distribution of microstates in a system (termed the macrostate) as a function of temperature, defining the probability that a molecule will exist in a certain "state" of given energy at a certain temperature. This method assumes that an array of particles, $N$, has a total energy and that the energies of the individual particles take on discrete values, $E_0, E_1, \ldots, E_i$. The number of particles with energy $E_0$ is $N_0$, with $E_1$ is $N_1$, etc. The word "macrostate" is now applied to describe the gross state that corresponds to a given set of numerical values, $N_1, N_2, \ldots, N_i$ [96]. The logarithm of the fraction of particles in a given microstate is proportional to the ratio of the energy of that state to the temperature of the system:

$$\ln\left(\frac{N_i}{N}\right) \propto -\frac{E_i}{kT} \qquad (1)$$

Maxwell-Boltzmann statistical thermodynamics assumes the molecules are independent from one another. Additionally, the states are considered to be in thermal equilibrium.

Knowing Boltzmann's factor of $e^{-E_i/kT}$, the above equation can be rearranged and represents the absolute probability for the occurrence of state $i$:

$$\frac{N_i}{N} = \frac{e^{-E_i/kT}}{\sum_j e^{-E_j/kT}} \qquad (2)$$

where $N_i$ is the expected number of particles in the single-particle microstate, $i$; $N$ is the total number of particles in the system; $E_i$ is the energy of the microstate, $i$; $T$ is the equilibrium temperature of the system; and $k$ is the Boltzmann constant at 1.38 × 10⁻²³ J/K [96,97].
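The Boltzmann probability described above, the Boltzmann factor of each state divided by the sum of the factors over all states considered, can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in this work: the function name `boltzmann_populations` and the example energies are hypothetical, and relative energies are assumed to be given in kcal/mol, so the molar form of the Boltzmann constant (the gas constant) is used.

```python
import math

# Gas constant in kcal mol^-1 K^-1 (molar form of the Boltzmann constant).
R_KCAL = 0.0019872041


def boltzmann_populations(rel_energies_kcal, temperature=298.0):
    """Return N_i/N for each minimum, given relative energies in kcal/mol."""
    kt = R_KCAL * temperature
    e_min = min(rel_energies_kcal)
    # Shifting by the lowest energy leaves the ratios unchanged
    # and avoids overflow in the exponentials.
    factors = [math.exp(-(e - e_min) / kt) for e in rel_energies_kcal]
    partition = sum(factors)
    return [f / partition for f in factors]


# Hypothetical relative energies (kcal/mol); NOT values from Table 1.
example_energies = [0.0, 0.5, 1.2, 3.0, 10.0, 50.0]
populations = boltzmann_populations(example_energies)

for energy, pop in zip(example_energies, populations):
    print(f"{energy:6.1f} kcal/mol -> N_i/N = {pop:.4f}")
```

At 298 K the product kT is roughly 0.6 kcal/mol, so minima lying several kcal/mol above the lowest one contribute very little to any weighted sum.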
Simulated Raman spectra are created by summing Lorentzian profiles for each calculated normal mode and weighting with the corresponding Raman activity. A scaling factor is employed in order to partially account for the anharmonicity of the computed harmonic vibrations. The overestimation of calculated vibrational frequencies is fairly uniform, allowing for a single scaling factor to be used for the methods and basis sets employed here. A scaling factor of 0.97 is used here for all levels of theory as has been previously reported to correct for the disagreement of the vibrational frequencies acquired using B3LYP/6-311+G(2df,2pd) when compared to experiment [98]. The Boltzmann sum of the simulated spectra, $B_T$, is created by weighting the simulated spectra, $W$, with the corresponding $N_i$ values of the microstate and then summing over the total available states.
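The construction just described (Lorentzian profiles weighted by Raman activity, harmonic frequencies scaled by 0.97, and a population-weighted sum over conformers) can be sketched as follows. This is a hedged illustration only: the function names, the 10 cm⁻¹ line width, the grid, and the two example conformers are assumptions made for the sketch and are not taken from this work.

```python
def lorentzian(x, center, fwhm):
    """Unit-height Lorentzian line shape centred at `center`."""
    hwhm = fwhm / 2.0
    return hwhm ** 2 / ((x - center) ** 2 + hwhm ** 2)


def simulate_spectrum(grid, harmonic_freqs, activities, scale=0.97, fwhm=10.0):
    """Sum activity-weighted Lorentzians at the scaled harmonic frequencies."""
    spectrum = [0.0] * len(grid)
    for freq, activity in zip(harmonic_freqs, activities):
        center = scale * freq
        for k, x in enumerate(grid):
            spectrum[k] += activity * lorentzian(x, center, fwhm)
    return spectrum


def boltzmann_sum(spectra, populations):
    """Weight each conformer spectrum by its population and sum pointwise."""
    n_points = len(spectra[0])
    return [
        sum(p * s[k] for p, s in zip(populations, spectra))
        for k in range(n_points)
    ]


# Raman shift grid from 1500 to 1800 cm^-1 in 0.5 cm^-1 steps.
grid = [1500.0 + 0.5 * k for k in range(601)]

# Hypothetical conformers: unscaled harmonic frequencies and Raman activities.
conformers = [
    {"freqs": [1700.0, 1660.0], "activities": [10.0, 6.0]},
    {"freqs": [1710.0, 1655.0], "activities": [8.0, 7.0]},
]
populations = [0.7, 0.3]  # hypothetical N_i/N values

spectra = [simulate_spectrum(grid, c["freqs"], c["activities"]) for c in conformers]
b_t = boltzmann_sum(spectra, populations)
peak_position = grid[b_t.index(max(b_t))]
print(f"Most intense band in the summed spectrum: {peak_position:.1f} cm^-1")
```

Because the weighting is applied to whole spectra rather than to individual frequencies, overlapping bands from different conformers merge into a single broadened feature, which is closer to what a measured spectrum of a conformer mixture would show.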
Sixty-nine geometries are found, with six total energetic minima found corresponding to the ED ( Figure Figure S11) peptide fragments. The lowest energy molecular geometries for each fragment of interest are presented in Figure 1, along with the computed Boltzmann-weighted magnitudes of the intramolecular hydrogen bonds, in millielectrons, e⁻. Arrival at a set of energetic minima for each fragment is achieved by manipulating the ϕ and ψ dihedral angles and, as mentioned before, performing the step-wise optimizations beginning first at the B3LYP/3-21G level of theory and building up to 6-31G, 6-31+G(d,p), and finally 6-311++G(2df,2pd). B3LYP/6-311++G(2df,2pd) has been shown in the literature to perform well in describing the experimental Raman spectra of small biomolecules and peptide fragments [40-42,99-103]. Regardless, additional methods are employed to compute the lowest energy molecular geometry of the EG dipeptide, shown in Figure S12, to investigate the accuracy of B3LYP/6-311++G(2df,2pd). Very little variation is observed when comparing the simulated spectra computed with the 6-311++G(2df,2pd) basis set and the B3LYP, M06-2X, PBEPBE, and MP2 methods. Thus, due to the good performance of the B3LYP/6-311++G(2df,2pd) level of theory when compared to experiments in peptide studies, only these results are presented and discussed herein. The Boltzmann distribution of the energetic minima for each peptide fragment is computed, employing Equation (2), giving the probability that a molecule will populate a certain state based on the distribution of energies in the set of states considered, as a function of temperature. Since the experimental Raman spectra are acquired under standard room temperature conditions, 298 K is employed in all cases in the Boltzmann equations for the theoretical data.
Table 1 presents the relative energies of the six lowest energetic minima for each fragment, along with the computed Boltzmann probabilities ($N_i$), the number of intramolecular hydrogen bonds formed (HBs), Boltzmann-weighted magnitudes of the intramolecular hydrogen bonds, the Boltzmann-weighted total charge transferred via intramolecular hydrogen bonding ($q_T$), and the computed and scaled harmonic vibrational frequencies of the amide I and αCOO− terminal ν(C=O) bands. The raw data for each fragment can be found in the supporting information (Tables S1-S11). The intramolecular hydrogen bonds are computed from natural bond orbital analyses by calculating the difference between the computed electron densities of the two participating atoms. The relative energies for each set of energetic minima range from 0 to 50 kcal mol⁻¹ to ensure the Boltzmann distribution is fully representative. All computations begin with the fragment in a charged state, including αNH₃⁺, αCOO−, and all side chains in their zwitterionic states at pH 7, as is the composition of the crystalline solids analyzed experimentally. Interestingly, the molecules consistently tend to protonate themselves in isolation. Upon visualizing several of these protonated bands in the experimental spectra, these molecular geometries (found in the Supporting Information) are included in the Boltzmann distribution of the simulated Raman vibrational spectra. In many cases, protonation is exhibited in the lowest energy conformations (EG, EA, DEA, EGED, EGEDE, GEDEA, and EGEDEA, Figure 1).

Table 1. Relative energies of the energetic minima for the EG, ED, EA, EGE, EDE, DEA, EGED, EDEA, EGEDE, GEDEA, and EGEDEA peptide fragments, computed at the B3LYP/6-311++G(2df,2pd) level of theory. The number of intramolecular hydrogen bonds (HBs); Boltzmann probability ($N_i$); Boltzmann-weighted magnitudes of the hydrogen bonds, in e⁻; Boltzmann-weighted total charge transferred into the system ($q_T$); and the amide I and αCOO− terminal ν(C=O) bands and shifts for each structure are also computed, in cm⁻¹.
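As a hedged illustration of how a Boltzmann-weighted quantity of this kind might be formed, the sketch below takes a population-weighted average of a per-conformer property, such as a scaled band position or a total charge transferred. This is one plausible reading of the weighting and is an assumption, not a statement of the exact procedure used here; the function name and all numbers are hypothetical and are not values from Table 1.

```python
def boltzmann_weighted(values, populations):
    """Population-weighted average of a per-conformer property."""
    if len(values) != len(populations):
        raise ValueError("one value per conformer is required")
    total = sum(populations)
    return sum(v * p for v, p in zip(values, populations)) / total


# Hypothetical data for four conformers of one fragment (NOT from Table 1).
populations = [0.60, 0.25, 0.10, 0.05]              # N_i/N
amide_i_scaled = [1656.0, 1660.0, 1648.0, 1652.0]   # scaled frequencies, cm^-1
charge_transfer = [0.62, 0.58, 0.65, 0.60]          # total charge transferred, e-

amide_i_weighted = boltzmann_weighted(amide_i_scaled, populations)
q_t_weighted = boltzmann_weighted(charge_transfer, populations)

print(f"Weighted amide I position: {amide_i_weighted:.1f} cm^-1")
print(f"Weighted charge transfer:  {q_t_weighted:.3f} e-")
```

Dividing by the sum of the populations keeps the average well defined even when only a truncated subset of minima is supplied.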

Summed Simulated Spectra to Experiment
Many previous investigations have searched for the global minimum of peptides and peptide fragments in order to describe acquired experimental data [104-112]. There is no need to search for the actual minimum to describe the experimental band positions because, in the case of these short peptides, the partial rigidity of the peptide bond introduces some extent of structural rigidity, preventing vibrations from deviating tremendously from structure to structure, as is represented in the comparison of the simulated spectra for the various conformations of each peptide fragment in vacuo (Figures S13-S23). A recent publication by our group characterizes the experimental Raman vibrational band positions of the crystalline solid form of the EGEDEA hexapeptide with DFT computations. This is achieved by breaking the hexapeptide down into components (EG, ED, EA, EGED, and EDEA), which are then used to build the EGEDEA hexapeptide computationally [10]. In the previous work, a single energetic minimum for each fragment was acquired for description of the experimental spectra of the crystalline solids. The agreement between experiment and theory is good, but to characterize the amide I region more accurately and to investigate the effects of intramolecular hydrogen bonding on the vibrational band positions, a more robust computational investigation is presented here. All experimental data referenced here can be found in [10] and the corresponding supplementary material.
To track the amide I and αCOO− terminal ν(C=O) band positions as the size of the fragment increases linearly, the tripeptides EGE, EDE, and DEA and the pentapeptides EGEDE and GEDEA are added to provide a complete picture, and, thus, a theoretical investigation of these fragments is included in this investigation. Figure 2 presents the comparison of the Boltzmann-summed simulated Raman spectra (in vacuo) to the experimental Raman spectra acquired for the EG, ED, EA, EGED, EDEA, and EGEDEA solid peptide fragments. No experimental data corresponding to the tripeptides or pentapeptides are acquired, as the stepwise pathway does not indicate that such would be beneficial.
In all cases, the agreement between experiment and theory improves, supporting the use of a Boltzmann distribution of the energetic minima to describe the vibrational frequencies of short peptide systems when other methods are unfeasible. The experimental Raman spectra are all remarkably similar, consistent with the strongly polar E residue dominating the signal in each case. The variations arise from the composition of the neighboring side chains and length of the peptide chain, which can be investigated by identifying specific vibrational bands that are shared by all peptides and give important information regarding protein secondary structure.

Tracking the Amide I and αCOO− Terminal ν(C=O) Bands
The amide I band in the analyzed peptides appears in the range of 1600-1700 cm⁻¹ and consists mainly of the backbone C=O stretching vibration, with minor contributions from the out-of-phase CN stretching vibration, the CCN deformation, and the NH in-plane bend [54]. Table 2 presents the Boltzmann sums of the total charge transferred via intramolecular hydrogen bonding ($q^{B}_{T}$) along with the Boltzmann sums of the amide I and αCOO− terminal ν(C=O) band positions for each peptide fragment of interest compared to the experimental value. An amide I ν(C=O) band is associated with each of the peptide bonds in a protein. Therefore, throughout this study, the amide I ν(C=O) band is in reference to the first peptide bond from the αNH₃⁺ terminal end, physically furthest from the αCOO− terminal. These two vibrations are chosen because the atoms involved consistently participate in intramolecular hydrogen bonding, and the band position of the amide I vibration can provide valuable information regarding the secondary structure of the peptide under study. The y-axis in Raman spectra is the intensity of the scattered light, related to the number of photons the detector records at each Raman shift. The units are arbitrary and are thus not presented. As shown in Table 2, there is good agreement between the Boltzmann-summed vibrations and those experimentally obtained, all of which have been scaled by 0.97 to partially correct for the anharmonicity of the computed harmonic vibrations.

Figure 3a,b correlate the experimental and Boltzmann-summed simulated band positions of the amide I and αCOO− ν(C=O) vibrations with the increasing number of residues in the peptide chain. Among the dipeptides, EG, the smallest and least polar fragment, possesses an overall 1− charge and three intramolecular hydrogen bonds and exhibits the amide I ν(C=O) band at 1656 cm⁻¹ (exp: 1671 cm⁻¹) and the αCOO− ν(C=O) band at 1644 cm⁻¹ (exp: 1648 cm⁻¹), roughly 10 cm⁻¹ apart. The experiment suggests these bands to be almost 20 cm⁻¹ apart, which is among the greatest deviations between experiment and theory found in this study, warranting transition dipole computations to correct for the coupling of these vibrational bands in quantum chemical approximations of the vibrational frequencies.
Moving from a neighboring glycine to a neighboring alanine residue gives the EA dipeptide, which is only slightly more polar than EG due to the larger electron-withdrawing effects of alanine's methyl side chain and still possesses a 1− charge and three intramolecular hydrogen bonds. This substitution causes the amide I ν(C=O) band to red shift slightly to 1651 cm⁻¹ (exp: 1651 cm⁻¹) and the αCOO− ν(C=O) band to red shift by only 2 cm⁻¹ to 1642 cm⁻¹. The experiment, however, exhibits a 20 cm⁻¹ red shift from the EG to EA dipeptide for both the amide I and αCOO− ν(C=O) bands. The ED dipeptide is the most polar and largest of the three dipeptides considered, containing three charged carboxylic acids along with the charged αNH₃⁺ terminal, an overall 2− charge, and, again, three intramolecular hydrogen bonds. The amide I band appears at 1631 cm⁻¹ (exp: 1632 cm⁻¹), and the αCOO− ν(C=O) band appears at 1609 cm⁻¹ (exp: 1608 cm⁻¹). Both are dramatically red shifted when compared to these bands in the EG and EA dipeptides. Due to the partial conjugation of the peptide backbone, electron density can be delocalized throughout, and fragments containing only charged residues exhibit no barriers through which the density must travel, exhibiting an overall stabilizing effect on the energies of participating vibrational bands. In all three dipeptides, the amide I band appears higher in energy than the αCOO− ν(C=O) band.

All three tripeptide fragments exhibit some degree of polarity, with the EGE tripeptide being the least polar due to the separation of the two polar E residues by the nonpolar G residue, and the EDE tripeptide being the most polar due to the presence of three charged side chains. The EGE tripeptide possesses an overall 2− charge and four intramolecular hydrogen bonds.
EGE's amide I band position appears at 1617 cm⁻¹, and the αCOO− ν(C=O) band appears at 1641 cm⁻¹, which is interesting because the amide I band position is now shifted to appear lower in energy than the αCOO− ν(C=O) band, opposite to that seen in the EG dipeptide. The DEA tripeptide is the second most polar of the three fragments, possessing an overall 2− charge and five intramolecular hydrogen bonds. In DEA's case, the amide I band is blue shifted from EGE to 1634 cm⁻¹, and the αCOO− ν(C=O) band is red shifted to 1618 cm⁻¹. Unlike EGE, the amide I band shifts back to higher in energy than the αCOO− ν(C=O) band. DEA's amide I vibration is blue shifted only 4 cm⁻¹ from the predicted position for the ED dipeptide, while this same vibration is red shifted by 16 cm⁻¹ from the EA dipeptide. This behavior provides evidence of the dependence of the amide I band on neighboring side-chain polarity. The most polar tripeptide under study, EDE, exhibits an overall 3− charge and four intramolecular hydrogen bonds, with the amide I band appearing very close to that predicted for the DEA tripeptide at 1634 cm⁻¹ and the αCOO− ν(C=O) band appearing at 1609 cm⁻¹. The amide I band is blue shifted by only 3 cm⁻¹ from the predicted position for the ED dipeptide. The αCOO− ν(C=O) band steadily red shifts with increasing side-chain polarity from EGE (1641 cm⁻¹) to DEA (1618 cm⁻¹) to EDE (1609 cm⁻¹).
In the case of the tetrapeptides, EGED and EDEA, EDEA is the more polar due to the presence of three neighboring polar residues. The less polar tetrapeptide, EGED, possesses an overall 3− charge and exhibits four intramolecular hydrogen bonds, with the amide I band appearing at 1676 cm⁻¹ (exp: 1686 cm⁻¹) and the αCOO− ν(C=O) band appearing at 1620 cm⁻¹ (exp: 1622 cm⁻¹). Interestingly, unlike the EGE tripeptide's amide I band position, EGED's amide I band is blue shifted from the EG dipeptide (1656 cm⁻¹), with no clear trend arising in the transition from EG to EGE to EGED. The αCOO− ν(C=O) band, however, appears to steadily red shift from EG (1644 cm⁻¹) to EGE (1641 cm⁻¹) and finally to EGED (1620 cm⁻¹). The largest red shift in this band is observed upon adding a neighboring polar residue where there was not one before (moving from EGE to EGED). The more polar EDEA tetrapeptide, which also possesses an overall 3− charge but displays seven intramolecular hydrogen bonds, exhibits the amide I band at 1640 cm⁻¹ (exp: 1641 cm⁻¹) and the αCOO− ν(C=O) band at 1605 cm⁻¹ (exp: 1602 cm⁻¹). In this case, a clear trend arises in the amide I band, with a steady blue shift moving from the ED dipeptide (1631 cm⁻¹) to the EDE tripeptide (1634 cm⁻¹) to the EDEA tetrapeptide (1640 cm⁻¹). In the αCOO− ν(C=O) case, very little change is observed as the length of the peptide chain increases, without varying the overall polarity of the system. The αCOO− ν(C=O) band consistently appears at 1609 cm⁻¹ in the ED and EDE peptide fragments, with a slight red shift observed moving up to the EDEA tetrapeptide (1605 cm⁻¹).
Two pentapeptides are considered: EGEDE and GEDEA. The EGEDE pentapeptide is considered the more polar of the two, as it possesses four polar amino acid residues. GEDEA poses a unique case because it is capped by two nonpolar residues, G and A, which could be interesting when considering the observed shifts thus far. GEDEA possesses an overall 3− charge and exhibits six intramolecular hydrogen bonds. GEDEA's amide I band appears at 1654 cm⁻¹, and the αCOO− ν(C=O) band appears at 1607 cm⁻¹. The αCOO− ν(C=O) band is blue shifted by only 2 cm⁻¹ from EDEA's αCOO− ν(C=O) band position, as expected since only a G residue is added from EDEA to GEDEA. GEDEA fits well into the trend observed in the vibration moving from ED to EDE to EDEA and now to GEDEA. The amide I band also blue shifts by 14 cm⁻¹, holding with the trend observed for the amide I band's steady blue shift moving from ED to EDE to EDEA and now to GEDEA. The more polar EGEDE pentapeptide possesses an overall 4− charge and six intramolecular hydrogen bonds. EGEDE's amide I band appears at 1650 cm⁻¹, while the αCOO− ν(C=O) band appears at 1614 cm⁻¹. The addition of a third neighboring side chain to EGED to make EGEDE causes the amide I vibration to dramatically red shift by 26 cm⁻¹ and the αCOO− ν(C=O) band to red shift by 6 cm⁻¹. EGEDE also holds with the amide I trend observed in the ED/EDE/EDEA/GEDEA case. A clear trend also arises when considering the αCOO− ν(C=O) band position moving from EG (1644 cm⁻¹) to EGE (1641 cm⁻¹) to EGED (1620 cm⁻¹) to EGEDE (1614 cm⁻¹).
The final fragment considered in this study, the hexapeptide, EGEDEA, possesses an overall 4− charge and seven intramolecular hydrogen bonds. The amide I band appears at 1658 cm⁻¹ (exp: 1658 cm⁻¹), and the αCOO− ν(C=O) band appears at 1610 cm⁻¹ (exp: 1611 cm⁻¹). The amide I band position is slightly blue shifted from both of the pentapeptide band positions, holding with the steady blue shift that arises moving from ED to EDE to EDEA to EGEDE/GEDEA and now to EGEDEA. The αCOO− ν(C=O) band position is slightly red shifted from the EGEDE pentapeptide, again holding with the trend observed in the αCOO− ν(C=O) band position moving from EG to EGE to EGED to EGEDE and now to EGEDEA. Again, very little variation is observed in the αCOO− ν(C=O) band position when moving from ED to EDE to EDEA to GEDEA and on to EGEDEA.
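The step-by-step shifts along the ED, EDE, EDEA, GEDEA, EGEDEA series can be tabulated directly from the Boltzmann-summed, scaled band positions quoted in this section. The short sketch below only does that arithmetic; the helper name `stepwise_shifts` is hypothetical, and a positive value is taken to mean a blue shift (higher wavenumber).

```python
# Boltzmann-summed, scaled band positions (cm^-1) as quoted in the text.
series = ["ED", "EDE", "EDEA", "GEDEA", "EGEDEA"]
amide_i = {"ED": 1631, "EDE": 1634, "EDEA": 1640, "GEDEA": 1654, "EGEDEA": 1658}
alpha_coo = {"ED": 1609, "EDE": 1609, "EDEA": 1605, "GEDEA": 1607, "EGEDEA": 1610}


def stepwise_shifts(positions, order):
    """Shift between consecutive fragments; positive means a blue shift."""
    return [positions[b] - positions[a] for a, b in zip(order, order[1:])]


amide_shifts = stepwise_shifts(amide_i, series)
coo_shifts = stepwise_shifts(alpha_coo, series)

amide_total = amide_i[series[-1]] - amide_i[series[0]]
coo_spread = max(alpha_coo.values()) - min(alpha_coo.values())

for (a, b), d_amide, d_coo in zip(zip(series, series[1:]), amide_shifts, coo_shifts):
    print(f"{a:>6} -> {b:<6} amide I: {d_amide:+d}  alphaCOO-: {d_coo:+d}")
print(f"Total amide I shift, ED to EGEDEA: {amide_total:+d} cm^-1")
print(f"Spread of alphaCOO- positions:     {coo_spread} cm^-1")
```

Every amide I step in this series is positive, giving a total blue shift of 27 cm⁻¹, whereas the αCOO− positions stay within a 5 cm⁻¹ window.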

Intramolecular Charge Transfer Disparately Influences Amide I and αCOO− Vibrations
As shown in Table 2 and Figure 4, the extent of intramolecular hydrogen bonding increases as the size of the peptide chain increases, while the extent of intramolecular hydrogen bonding within each size class depends on the amino acid composition. The magnitude of intramolecular charge transfer in these systems does not follow the same trends shown in the amide I and αCOO− ν(C=O) band positions, except in the dipeptide and tetrapeptide cases. All three dipeptides exhibit three intramolecular hydrogen bonds, with the most nonpolar dipeptide, EG, exhibiting the smallest extent of intramolecular charge transfer (q B T of 0.611 e−) and the most polar dipeptide, ED, exhibiting the greatest (q B T of 0.850 e−). This is consistent with the additional carboxylic acid found in the ED fragment but not in the EG and EA fragments. All three tripeptides exhibit a greater extent of intramolecular charge transfer than the dipeptides. The DEA tripeptide possesses the greatest magnitude of charge transfer (q B T of 1.220 e−) and five intramolecular hydrogen bonds. The EGE and EDE tripeptides exhibit 1.030 e− and 0.990 e−, respectively, transferred within the system, both displaying only four intramolecular hydrogen bonds. Considering the molecular geometries, all three carboxylic acids in the DEA tripeptide participate in intramolecular hydrogen bonds, with the D residue's carboxylic acid protonated by the nearby αNH3+ group. This leads to the formation of a strong hydrogen bond (0.400 e−) between the protonated O−H and the now neutral αNH2 group. In the EGE tripeptide, all three carboxylic acids again participate in intramolecular hydrogen bonds. However, no protonation occurs, and the magnitudes of the hydrogen bonds formed are not as great as those in the DEA tripeptide.
The EDE tripeptide is considered the most polar of the three, but, unlike the dipeptides, this fragment exhibits the smallest extent of intramolecular charge transfer, which is surprising considering the stabilization observed in the vibrational frequencies discussed above. This is possibly due to the fact that although there is an additional carboxylic acid present in this fragment, it does not participate in an intramolecular hydrogen bond, and, thus, there is no computed increase in the magnitude of intramolecular charge transfer.
Of the two tetrapeptides considered, the more polar, EDEA, exhibits the greater magnitude of intramolecular charge transfer (q B T of 1.444 e−). This behavior is unsurprising given that EDEA forms seven intramolecular hydrogen bonds, whereas the EGED tetrapeptide (q B T of 1.016 e−) forms only four. In EGED, the D residue's carboxylic acid is predicted to be protonated by the nearby αNH3+ group. Despite the protonation and formation of a strong hydrogen bond between the protonated O−H and the now neutral αNH2 group (0.39 e−), interruption of the polar EED residues by the nonpolar G residue dramatically reduces the ability of the side chains to form intramolecular hydrogen bonds with each other, which is also reflected in the shifts of the amide I and αCOO− ν(C=O) band positions described above. EGED exhibits even less charge transfer than the EGE tripeptide, explaining the low energy position of the amide I band when compared to the rest of the fragments (Figure 4). EDEA falls in line with a trend of steadily increasing magnitude of intramolecular charge transfer from ED/EA to DEA to EDEA.
The GEDEA pentapeptide also falls into this trend, with the formation of six intramolecular hydrogen bonds and a q B T of 1.530 e−. The carboxylic acid of the first E residue in GEDEA is predicted to be protonated by the nearby αNH3+ group, leading to the formation of a strong hydrogen bond between the protonated O−H and the now neutral αNH2 group (0.39 e−); however, in this case, there is no interruption of the remaining three residues, which all participate in at least one intramolecular hydrogen bond with the peptide backbone and, in the E2 residue's case, two stabilizing hydrogen bonds with the backbone (both at 0.26 e−). The EGEDE pentapeptide also displays six intramolecular hydrogen bonds, but the magnitude of intramolecular charge transfer is smaller, with a q B T of 1.240 e−. The EGEDE pentapeptide is also protonated, with the third E residue's carboxylic acid interacting with the αNH3+ group, again leading to the formation of a strong hydrogen bond between the protonated O−H and the now neutral αNH2 group (0.38 e−). The first E residue's carboxylic acid does not participate in intramolecular hydrogen bonding, leading to the overall lower magnitude of charge transfer observed in EGEDE compared to GEDEA. EGEDE falls in line well with a trend of steadily increasing magnitude of intramolecular hydrogen-bonded charge transfer moving from the EG and ED dipeptides to the EDE and EGE tripeptides to the EGED tetrapeptide and now to the EGEDE pentapeptide. In the hexapeptide case, EGEDEA, seven intramolecular hydrogen bonds are formed and a total of 1.650 e− are transferred in the system.
The hexapeptide again exhibits protonation, this time of the third E residue's carboxylic acid by the αNH3+ group, forming the strongest hydrogen bond computed in this study (0.401 e−) between the protonated O−H and the now neutral αNH2 group. The trends observed in the magnitude of intramolecular hydrogen-bonded charge transfer appear to depend upon the polarity of the amino acids in the peptide chain, the protonation state of the αNH3+ group, and the length of the peptide chain, with overall more polar and longer fragments exhibiting an observably greater extent of intramolecular charge transfer.
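The charge-transfer values quoted in this section can be collected and checked against the stated trends. The Python sketch below is an illustration under our own assumptions rather than part of the NBO analysis. The helper names are ours, and the "mean charge transfer per hydrogen bond" is a simple descriptor introduced here for comparison only, not a quantity reported in the study. EA is omitted because its charge-transfer value is not quoted in the text.

```python
# (fragment, intramolecular H-bond count, total NBO charge transfer in e−),
# transcribed from the values quoted in the text. EA is left out because its
# charge-transfer value is not quoted in this section.
NBO_SUMMARY = [
    ("EG", 3, 0.611),
    ("ED", 3, 0.850),
    ("EGE", 4, 1.030),
    ("EDE", 4, 0.990),
    ("DEA", 5, 1.220),
    ("EGED", 4, 1.016),
    ("EDEA", 7, 1.444),
    ("EGEDE", 6, 1.240),
    ("GEDEA", 6, 1.530),
    ("EGEDEA", 7, 1.650),
]

TOTAL_CT = {name: total for name, _, total in NBO_SUMMARY}


def is_strictly_increasing(fragments):
    """True if total charge transfer rises at every step along the series."""
    values = [TOTAL_CT[name] for name in fragments]
    return all(a < b for a, b in zip(values, values[1:]))


def mean_ct_per_hbond(n_hbonds, total_ct):
    """Average charge transfer per hydrogen bond (illustrative descriptor only)."""
    return total_ct / n_hbonds


ed_series = ["ED", "DEA", "EDEA", "GEDEA", "EGEDEA"]
eg_series = ["EG", "EGE", "EGED", "EGEDE", "EGEDEA"]

print("ED series strictly increasing:", is_strictly_increasing(ed_series))
print("EG series strictly increasing:", is_strictly_increasing(eg_series))

# List fragments from smallest to largest total charge transfer.
for name, n_hbonds, total_ct in sorted(NBO_SUMMARY, key=lambda row: row[2]):
    per_bond = mean_ct_per_hbond(n_hbonds, total_ct)
    print(f"{name:7s} {n_hbonds} H-bonds  total {total_ct:.3f} e-  mean {per_bond:.3f} e-/bond")
```

With these values, the EG series fails the strict test only because of the small dip at EGED (1.016 e− versus 1.030 e− for EGE), which the text already notes.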

Experimental and Computational Vibrational Analyses Predict EGEDEA Secondary Structure
The E-hook of β-tubulin protrudes from the microtubule surface and is thought to adopt a mostly disordered structure, as its crystal structure has not been resolved. Hence, there is no current expectation for the peptide to fall under typical secondary structure parameters. Figure 5a presents the molecular geometry for the lowest energy conformation of the EGEDEA hexapeptide computed here, structure A, with the ϕ and ψ bond angles labeled for each peptide bond in degrees (°). Figure 5b is a Ramachandran plot of the computed ϕ and ψ bond angles for each of the energetic minima found for the EGEDEA hexapeptide (Figure S11), along with the sums of the Boltzmann-weighted angles. The Ramachandran plots for the rest of the fragments considered in this study can be found in the Supporting Information, Figures S24-S33.
Considering only the Boltzmann sums of the ϕ and ψ bond angles and the Boltzmann-weighted position of the amide I ν(C=O) band predicted at 1658 cm−1 (experiment also at 1658 cm−1), the EGEDEA hexapeptide appears to take on a helical conformation in the crystalline state, with the mean frequency for α-helical structures reported at 1652 cm−1 for the amide I band [7]. The ϕ and ψ angles that appear in the restricted region (+120°, −120°) in proteins traditionally belong to G residues, as the lack of substitution on the Cα permits a greater extent of flexibility about the peptide bond when compared to other amino acid residues in the chain, as is also the case here. The rest of the angles fall in the region typically associated with α-helices; however, further analysis of the vibrational coupling constants in the amide I region is needed to reliably predict the EGEDEA secondary structure [7,53–55,114–116]. Although it is unlikely that the peptides form short helices without external stabilization, it is possible that a portion of the external α-helices would extend and specialize to perform the E-hook's duties. Additionally, interactions with the local environment and MAPs may support its helical structure.
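The Boltzmann weighting referred to above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the protocol used in the study. The conformer energies and band positions below are hypothetical placeholders, the temperature of 298.15 K is an assumption, and all function names are ours. The study sums whole simulated spectra, whereas this sketch averages single band positions for simplicity. For ϕ and ψ, a weighted circular mean is shown because a plain arithmetic average misbehaves for angles near ±180°; this is our choice and not necessarily how the plotted values were obtained.

```python
import math

R_KCAL = 0.0019872041  # molar gas constant in kcal/(mol*K)


def boltzmann_weights(rel_energies_kcal, temperature=298.15):
    """Normalized Boltzmann populations from relative energies in kcal/mol."""
    e_min = min(rel_energies_kcal)
    factors = [
        math.exp(-(energy - e_min) / (R_KCAL * temperature))
        for energy in rel_energies_kcal
    ]
    total = sum(factors)
    return [factor / total for factor in factors]


def weighted_average(values, weights):
    """Population-weighted average of a scalar property."""
    return sum(value * weight for value, weight in zip(values, weights))


def weighted_circular_mean_deg(angles_deg, weights):
    """Population-weighted circular mean of angles, returned in degrees."""
    sin_sum = sum(w * math.sin(math.radians(a)) for a, w in zip(angles_deg, weights))
    cos_sum = sum(w * math.cos(math.radians(a)) for a, w in zip(angles_deg, weights))
    return math.degrees(math.atan2(sin_sum, cos_sum))


# HYPOTHETICAL placeholder data for three conformers (not values from the study).
rel_energies = [0.0, 0.5, 1.5]        # kcal/mol above the lowest minimum
amide_i = [1660.0, 1655.0, 1650.0]    # cm^-1
phi = [-65.0, -70.0, -120.0]          # degrees

weights = boltzmann_weights(rel_energies)
band = weighted_average(amide_i, weights)
phi_mean = weighted_circular_mean_deg(phi, weights)

print("populations:", [round(w, 3) for w in weights])
print(f"weighted amide I position: {band:.1f} cm^-1")
print(f"weighted phi: {phi_mean:.1f} degrees")
```

Replacing the placeholder lists with the relative energies, band positions, and angles of the computed minima would give the corresponding weighted quantities for any fragment.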

Local Structure within E-Hooks May Play Pivotal Roles in Protein Recruitment and Retention
The full sequence of the βII isotype of tubulin is DATADEQGEFEEEEGEDEA, which is 19 residues long, contains 8 glutamates, and carries an overall 11− charge [117]. If the overall peptide is considered disordered and has not been resolved in crystallographic structures, what is the role of the seemingly local helical structure of the C-terminal end? The very end of the long, disordered E-hook possibly acts as a local "hook" when binding microtubule (MT)-associated proteins (MAPs) or molecular motors, facilitating the delicate balance of having enough affinity to be recruited to the MT surface but not binding so tightly that it cannot exhibit motility. Further, if the spectroscopic clues from this study are used to dissect the remaining E-hook structure, there are sections of high electronegativity that appear broken up by segments of uncharged residues with varying hydrogen bonding potential. Only one such interruption appears in the EGEDEA hexapeptide, by the G residue, which results in the first three residues arranging in an extended linear backbone conformation. The first turn begins at the junction of the second E residue and the D residue (shown in Figure 1), which is accompanied by a strong intramolecular hydrogen bond (0.273 e−) stabilizing the helical turn. The remaining residues in the EGEDEA hexapeptide appear to exhibit a greater extent of helical behavior than the first three. The amide I vibration is shown to shift from 1656 to 1617 cm−1 (Table 2) when moving from EG to EGE, which disrupts the delocalization of electron density between the two glutamic acid residues. The addition of two more polar residues, creating EGEDE, results in the amide I vibration shifting back to 1650 cm−1. The results here indicate that the amide I mode is highly influenced by the surrounding residues and the local hydrogen bonding environment, as would be expected for forming secondary structures.
Furthermore, this suggests that these interruptions do not foster the proper environment for the E-hook to form higher order architectures on its own. However, when the peptide binds MAPs or molecular motors such as kinesin or dynein, perhaps this flexibility allows the E-hook to form customized attachments to these different proteins. For instance, the presence of E-hooks on MTs differentially affects binding and processivity of different kinesins, ranging from small modulations in motility to complete inhibition [63,64,75]. Additionally, E-hooks are the diversity site of tubulin, where the core is mostly conserved, but these disordered domains undergo a variety of post-translational modifications (PTMs), forming what is often referred to as the "tubulin code" [117,118]. In particular, β-tubulin E-hooks undergo glutamylation, tyrosination, and phosphorylation [117,118]. It is plausible that these additions further prevent the formation of secondary structures under physiological conditions, as well as custom-tune E-hook fit to motor and MAP binding sites. However, the effects of PTMs on local E-hook structure are not well understood and will be the subject of future study.
There are limitations to the method selected, including the possibility that the global minimum has not been identified; thus, future work will include additional computational models to search for the many conformations of the EGEDEA hexapeptide. This will allow for comparison to our method to determine its accuracy. Furthermore, this investigation does not consider the involvement of protein/peptide dynamics, which is vital due to the E-hook's key function as a "hook" for microtubule-associated proteins (MAPs). An analysis of the interaction of the E-hook with the local environment in various solvent models is also a subject of future study, including implicit and explicit solvation of the hexapeptide in water.

Conclusions
In summary, a Boltzmann sum of the simulated Raman spectra compares well with experiment and is used to track the amide I and αCOO− ν(C=O) band positions for each member of a family of L-glutamic acid-containing peptide fragments (EG, ED, EA, EGE, EDE, DEA, EGED, EDEA, EGEDE, GEDEA, and EGEDEA). The computational protocol presented here gives results comparable to experiment, and, together, these tools provide deeper insights into the biophysical properties of these molecules of interest. The experimental Raman amide I band position for the EGEDEA hexapeptide appears at 1658 cm−1, in agreement with the computed Boltzmann sum of the corresponding vibration (also predicted at 1658 cm−1). Similarly, the experimental band position for the EDEA tetrapeptide appears at 1641 cm−1, compared to the Boltzmann-weighted, scaled DFT-computed band position at 1640 cm−1. This greatly improves upon the previous computed prediction of the amide I vibration for this fragment at 1705 cm−1.
A steady blue shift arises in the amide I band position when moving from ED to EDE to EDEA to EGEDE/GEDEA and finally to the EGEDEA hexapeptide. However, very little variation is observed in the αCOO− ν(C=O) band position when considering this group of fragments. Conversely, a steady red shift arises in the αCOO− ν(C=O) band position when moving from EG to EGE to EGED to EGEDE and finally to EGEDEA, while no clear trend arises in the amide I band positions for this group of fragments. The extent of intramolecular charge transfer is found to be heavily dependent on the overall polarity and size of the peptide fragment, with larger and more polar fragments exhibiting the greatest magnitude of intramolecular charge transfer. NBO computations reveal that the EGEDEA hexapeptide exhibits the greatest extent of intramolecular charge transfer (q B T of 1.650 e−) and forms seven intramolecular hydrogen bonds.
Although the E-hook of β-tubulin is currently thought to be an intrinsically disordered protein tail that protrudes from the microtubule surface, the computed ϕ and ψ bond angles of the EGEDEA hexapeptide suggest it takes on an α-helical secondary structure, confirmation of which will be the focus of future studies.
Supplementary Materials: The following are available online. Figures S1-S33 and Tables S1-S22.
Author Contributions: A.E.W. was involved in all aspects of the work, including performing experiments, calculations, data analysis, and preparation of the manuscript. A.E.W., N.I.H., D.N.R., and R.C.F. designed the research and computational work and performed data analysis. A.E.W., N.I.H., D.N.R., and R.C.F. prepared the manuscript. All authors have read and agreed to the published version of the manuscript.