First-principles calculations of Raman vibrational modes in the fingerprint region for connective tissue

Vibrational spectroscopy has been widely employed to unravel the physical-chemical properties of biological systems. Owing to its high sensitivity for monitoring real-time "in situ" changes, Raman spectroscopy has been successfully employed in, e.g., biomedicine, metabolomics, and biomedical engineering. The interpretation of Raman spectra in these cases rests on the vibrational assignment of the isolated constituent macromolecules. As a consequence, probing anharmonic interactions or the mutual interactions among specific moieties/side chains, to name but a few, is a challenge. We present a complete vibrational modes calculation for connective tissue in the fingerprint region ($800-1800$ cm$^{-1}$) using first-principles Density Functional Theory. Our results indicate that important spectral features correlated to molecular characteristics have been ignored in the usual tissue spectral band assignments. In particular, we found that the presence of confined water is the main factor responsible for the observed spectral complexity. Our calculations account for the inherent complexity of the spectral features in this region, and useful spectral markers for biological processes were unambiguously identified.

The rapid, noninvasive, and high-spatial-resolution capabilities of Raman spectroscopy have been employed to obtain biochemical and structural information on biological samples. Biomedicine [1], metabolomics [2], and biomedical engineering [3,4] are examples of fields where this tool has been successfully used to acquire high-quality data. In particular, several optical-biopsy studies have shown that molecular interaction features in cells and tissues that cannot be accessed by conventional histopathology can be probed by this technique [5]. Raman spectroscopy is of special interest because of its high sensitivity to biochemical and molecular variations in tissues [5].
To a first approximation, the spectrum of a biological tissue is a convolution of the spectra of isolated biological macromolecules (e.g., carbohydrates, proteins, lipids, deoxyribonucleic acid, ribonucleic acid). Hence, the assignment of tissue vibrational bands is usually based on the assignments of the isolated macromolecules. There are several literature compilations (see, e.g., refs. [6][7][8][9]) listing the vibrational bands of the constituents of biological tissues. These compilations are ultimately used to perform a qualitative interpretation of the spectra. However, a large amount of relevant information is lost when this qualitative vibrational assignment is used. Anharmonic interactions, which give rise to coupling among harmonic vibrational modes, are one example [10,11]. Moreover, mutual interactions among specific moieties can be analyzed only on a comparative basis. Computational simulation of small specific parts of the macromolecule over short time intervals is another method used to understand these interactions. These approaches usually obscure relevant physical-chemical data from the environment. At the scale of real biology they are in fact only a small part of the overall picture [12].
* herculano.martinho@ufabc.edu.br
Computer simulations could be a suitable tool for interpreting experimental data, with the aim of understanding the biochemical changes that translate into structural changes leading to macroscopic biological processes. Vibrational spectra of macromolecules and tissues are an important class of experimental data addressing this issue. Atomistic models based on quantum mechanical calculations offer better predictions of material properties. However, owing to its inherent complexity, atomistic modeling of biological systems is still in its early stages. In a previous work [13] we presented a computational model for skin (STmod). The model consisted of a collagen peptide cutout, including confined water, subjected to periodic boundary conditions. The model successfully explained important experimental structural and general biochemical trends of normal and inflammatory tissues.
In the present work, a detailed vibrational mode assignment of a connective tissue based on the STmod is presented. To the best of our knowledge, this is the first report in the literature of a complete vibrational assignment of a tissue. The vibrational calculations were performed on the $C_n$ ($n = 0-8$), $C_{1s}$, $D_0$, and $D_1$ unit cells of STmod. The numeric subscript indicates the number of water molecules inside the unit cell. The "s" subscript indicates the presence of external water solvating the $C_1$ model. Starting from a hydrated collagen peptide, each unit cell was obtained and the calculations were performed under periodic boundary conditions. More details on how these structures were obtained, as well as their previous characterization, can be found in ref. [13]. Figure 1 shows the unit cells of the $C_0$, $C_{1s}$, $C_2$, and $D_0$ structures.
FIG. 1. [...] Table I of ref. [13].
Density Functional Theory (DFT) [14,15] was used to obtain the equilibrium geometries and harmonic frequencies. The simulations were carried out with the CPMD program [16] using the BLYP functional [17] augmented with dispersion corrections for the proper description of van der Waals interactions [18,19]. Cutoff energies of up to 100 Ry were considered. The wave functions were optimized and the vibrational modes were then obtained from the Hessian matrix. Finally, the linear response for the polarization and the polar tensors of each atom in the system was calculated to evaluate the eigenvectors of each vibrational mode. The Raman-active modes were obtained from the atomic polar tensors of each atom in the system and the corresponding eigenvectors of the Hessian [20]. Harmonic frequencies were compared to experimental Raman data of normal (NM) oral mucosa tissue (see ref. [3] for experimental details). The spectra were simulated as a convolution of Gaussian lineshape peaks centered on the calculated frequencies using the Fityk program [21]. The linewidth was chosen to be 20 cm$^{-1}$. Figure 2a) presents the results for the fingerprint region. The experimental Raman spectrum of NM is also shown.
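The Gaussian broadening step described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the Fityk workflow used in this work: the function name `simulate_spectrum`, the mode list, and the intensities are hypothetical, and the 20 cm$^{-1}$ linewidth is assumed here to be a full width at half maximum (the text does not say whether it is a full or half width).

```python
import numpy as np

def simulate_spectrum(frequencies, intensities, grid, fwhm=20.0):
    """Sum of Gaussian peaks centred on each calculated frequency.

    Assumption: `fwhm` is the full width at half maximum in cm^-1.
    """
    frequencies = np.asarray(frequencies, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Convert the FWHM into the Gaussian standard deviation.
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    # Array of shape (n_grid, n_modes): one Gaussian per mode on the grid.
    delta = grid[:, None] - frequencies[None, :]
    peaks = intensities[None, :] * np.exp(-0.5 * (delta / sigma) ** 2)
    return peaks.sum(axis=1)

# Hypothetical mode list for illustration only (not values from Table I).
grid = np.arange(800.0, 1801.0, 1.0)   # fingerprint region, 1 cm^-1 step
freqs = [1003.0, 1450.0, 1660.0]       # mode frequencies in cm^-1
raman = [1.0, 0.6, 0.8]                # relative Raman intensities
spectrum = simulate_spectrum(freqs, raman, grid)
```

Each peak height equals the supplied intensity, so the relative band heights of the simulated curve follow the calculated Raman activities directly.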
FIG. 2. a) Experimental Raman spectra for normal oral mucosa tissue (NM) compared to the $C_n$ ($n = 0-8$), $C_{1s}$, $D_0$, and $D_1$ STmod models in the fingerprint region. The wavenumber-axis projections of the amine (b), methyl (c), water (d), and Amide II and III (e) partial eigenvector contributions are also shown. The top scheme represents the qualitative additional band assignment based on Table I.
From direct inspection, we found that the $C_2$ and $C_3$ models reproduced the largest set of experimental bands (thirteen). The worst model was $C_7$, which has only four peaks in agreement with the experimental results, followed by $C_0$, which is consistent with only five peaks. The $C_3$ model presented a set of negative high-frequency (940 cm$^{-1}$) modes, indicating some degree of mechanical instability. Thus, we concluded that $C_2$ is the suitable model to represent the connective tissue from the vibrational modes point of view.
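The comparison above was made by direct inspection. As a rough illustration of the same selection logic, the sketch below counts the experimental peaks that have a calculated mode nearby and flags models with negative frequencies. The function names, the 10 cm$^{-1}$ tolerance, and all peak positions are hypothetical and chosen only for the example.

```python
import numpy as np

def count_matched_peaks(exp_peaks, calc_freqs, tol=10.0):
    """Count experimental peaks with a real calculated mode within `tol` cm^-1."""
    calc = np.asarray(calc_freqs, dtype=float)
    calc = calc[calc > 0.0]  # negative (unstable) modes are not matched
    if calc.size == 0:
        return 0
    return sum(1 for p in exp_peaks if np.min(np.abs(calc - p)) <= tol)

def has_negative_modes(calc_freqs):
    """True if any calculated frequency is negative (sign of instability)."""
    return bool(np.any(np.asarray(calc_freqs, dtype=float) < 0.0))

# Hypothetical peak positions in cm^-1, for illustration only.
exp_peaks = [855.0, 1003.0, 1450.0, 1660.0]
model_a = [850.0, 1005.0, 1300.0, 1448.0]
model_b = [-940.0, 1000.0, 1700.0]

matched_a = count_matched_peaks(exp_peaks, model_a)
matched_b = count_matched_peaks(exp_peaks, model_b)
unstable_b = has_negative_modes(model_b)
```

A model would then be preferred when it matches many experimental peaks and shows no negative modes, which mirrors the reasoning that singles out $C_2$ over $C_3$.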
The connective tissue vibrational modes based on the $C_2$ model are shown in Table I, together with a comparison between the band assignments of the present work and those of the literature. Our results provide additional information. The most striking feature is the activation of methyl, methylene, and amine side-chain vibrations throughout the fingerprint spectral window. Figures 2b)-2e) show the percentage contribution to the total eigenvectors, projected on the wavenumber axis, for some of these vibrations. Amine vibrations (Fig. 2b) are present over almost the entire region, with a weight of around 10%. It is possible to observe that methyl group contributions (Fig. 2c) [...] and around 1450, 1550, and 1700 cm$^{-1}$. The side chains are key factors determining the properties and reactivity of molecules. Thus, one expects that molecular transformations under, e.g., pathological processes will produce overall changes in the fingerprint vibrational region. It is important to notice that our calculations indicate that the activation of side-chain vibrations occurs only in the presence of confined water. The anhydrous $C_0$ model did not display this characteristic. The water dimer itself presents spectral features around 900 and $1600-1750$ cm$^{-1}$ (Fig. 2d), which are not usually described. Amide II and III vibrations are also present over the whole spectrum, with a weight of around 5% (Fig. 2e).
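One common way to obtain percentage contributions of this kind is to sum the squared eigenvector components over the atoms of each chemical group. The sketch below assumes that definition, which is not necessarily the exact projection used for Fig. 2; the function name, the atom grouping, and the eigenvector are hypothetical.

```python
import numpy as np

def group_contributions(eigenvector, groups):
    """Percentage weight of each atom group in one normal-mode eigenvector.

    eigenvector: array of shape (n_atoms, 3) with Cartesian displacements.
    groups: dict mapping a group name to a list of atom indices.
    Assumption: the weight of an atom is its squared displacement.
    """
    vec = np.asarray(eigenvector, dtype=float)
    per_atom = np.sum(vec ** 2, axis=1)   # squared displacement per atom
    total = per_atom.sum()
    return {name: 100.0 * per_atom[list(idx)].sum() / total
            for name, idx in groups.items()}

# Hypothetical 4-atom mode and atom grouping, for illustration only.
mode = [[1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0],
        [1.0, 1.0, 0.0]]
groups = {"amine": [0], "methyl": [1, 2], "water": [3]}
weights = group_contributions(mode, groups)
```

Evaluating these weights for every mode and plotting them against the mode wavenumber gives a projection of the kind described for Figs. 2b)-2e).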
In fact, experimentally observed spectral changes in the fingerprint region of tissues have been qualitatively reported to correlate with water content (see, e.g., refs. [22][23][24][25]). Aging and diabetes [22,23], oral cancer [24], and cervical cancer [25], to name but a few, are examples of physiological situations where spectral complexity emerges beyond the isolated-molecule vibrational band assignment. Protein-water interactions are known to play a critical role in the function of several biological systems and macromolecules, including collagen in tissues [26]. Small changes in the structure and dynamical behavior of water molecules at the peptide-water interface can effectively change both the structure and the dynamics of the protein function [27]. Our model indicates that the main source of this complexity is the presence of confined water, which enables coupling between distant and isolated side chains. A large set of wagging, scissoring, twisting, and rocking vibrations of side chains usually assigned to the high-wavenumber region (2000 cm$^{-1}$) appeared in the fingerprint region alongside the usually assigned vibrations (see Table I). Symmetric and asymmetric stretchings also coexist, in a more complicated fashion that goes beyond the usual protein, collagen, proline, and hydroxyproline band assignments. Interestingly, the $880-940$ cm$^{-1}$ region appeared to retain information about the confined water content (Fig. 2c). In fact, the contributions to the eigenvectors in this region are about 30% from confined water and 70% from CH$_3$, CH, C-C, and C-C-H vibrations.
From the qualitative and quantitative information generated by our first-principles calculations, one can find useful spectral markers that could be correlated with biological processes of interest. We comment on two examples.
Confined Water. Since the $800-880$ cm$^{-1}$ region is dominated by C,H vibrations (Table I), whereas the $880-940$ cm$^{-1}$ region retains information about the confined water content, the difference between the integrated areas of these two regions is a suitable qualitative quantifier of the confined water content, a key parameter for describing important processes, as commented in the previous discussion.
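A minimal sketch of such a quantifier is given below, taking the two regions to be $800-880$ and $880-940$ cm$^{-1}$ as discussed above. The function names, the sign convention (water-sensitive region minus C,H region), and the absence of any baseline correction or normalization are assumptions made for the example only.

```python
import numpy as np

def band_area(wavenumber, intensity, lo, hi):
    """Trapezoidal area of the spectrum between `lo` and `hi` (cm^-1).

    Assumes `wavenumber` is sorted in ascending order.
    """
    x = np.asarray(wavenumber, dtype=float)
    y = np.asarray(intensity, dtype=float)
    mask = (x >= lo) & (x <= hi)
    xs, ys = x[mask], y[mask]
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))

def confined_water_index(wavenumber, intensity):
    """Area(880-940) minus area(800-880); the sign convention is an assumption."""
    water_region = band_area(wavenumber, intensity, 880.0, 940.0)
    ch_region = band_area(wavenumber, intensity, 800.0, 880.0)
    return water_region - ch_region

# Hypothetical spectrum: two Gaussian bands on a 1 cm^-1 grid.
x = np.arange(800.0, 1801.0, 1.0)
y = np.exp(-0.5 * ((x - 850.0) / 8.0) ** 2) \
    + 0.5 * np.exp(-0.5 * ((x - 920.0) / 8.0) ** 2)
index = confined_water_index(x, y)
```

In practice the spectra would need a common normalization before indices from different samples are compared.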
Methylation. The $800-880$ cm$^{-1}$ region could also be used to probe the protein methylation process. Protein methylation is the process through which methyl groups are added to proteins under the action of specific enzymes, the methyltransferases. It usually occurs on nitrogen atoms at N-termini and cannot be reversed, creating new amino acid residues [28]. The above-cited spectral region presents neither amine nor hydroxyl contributions, being exclusive to methyl and protein backbone vibrations. Thus, the ratio of the intensity of this region to that of the N-H stretching band appearing in the high-wavenumber region [3],

$$\mathrm{Protein}_{\mathrm{CH_3}} = \frac{I_{800-880}}{I_{\mathrm{N-H}}} \qquad (2)$$

gives a useful protein methyl quantifier. Usually the methyl band in the $\sim 2940$ cm$^{-1}$ high-wavenumber region is used to evaluate methylation. However, it includes contributions from both lipids and proteins [3]. Protein methylation modulates cellular and biological processes including transcription, RNA processing, protein interactions, and protein dynamics [29]. Methyl-binding protein domains and improved antibodies with broad specificity for methylated proteins are being used to characterize the so-called protein methylome. They also have the potential to be used in high-throughput assays for inhibitor screens and drug development [30].
In summary, our vibrational modes calculation for connective tissue in the fingerprint region indicated that important spectral features correlated to molecular characteristics have been ignored in the usual tissue spectral band assignments. Our results indicated that the presence of confined water is the main factor responsible for the observed spectral complexity and cannot be ignored. The inherent complexity of the spectral features in this region could be rationalized by our calculations, and useful spectral markers for biological processes could be identified.
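As an illustration of how the protein methyl quantifier of Eq. (2) might be evaluated from a measured spectrum, the sketch below takes both intensities as integrated band areas. That choice, the function names, and the N-H stretching window passed in the example are assumptions; the text does not specify the integration limits of the N-H band.

```python
import numpy as np

def integrate(wavenumber, intensity, window):
    """Trapezoidal area inside `window` = (lo, hi); ascending wavenumbers assumed."""
    x = np.asarray(wavenumber, dtype=float)
    y = np.asarray(intensity, dtype=float)
    mask = (x >= window[0]) & (x <= window[1])
    xs, ys = x[mask], y[mask]
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))

def protein_ch3_ratio(wavenumber, intensity, nh_window,
                      ch3_window=(800.0, 880.0)):
    """I(800-880) / I(N-H), with both intensities taken as band areas."""
    i_nh = integrate(wavenumber, intensity, nh_window)
    if i_nh <= 0.0:
        raise ValueError("N-H band area must be positive")
    return integrate(wavenumber, intensity, ch3_window) / i_nh

# Hypothetical flat spectrum and N-H window, for illustration only.
x = np.arange(800.0, 3501.0, 1.0)
y = np.ones_like(x)
ratio = protein_ch3_ratio(x, y, nh_window=(3200.0, 3400.0))
```

Because the quantity is a ratio of two bands from the same spectrum, it does not depend on the overall intensity scale of the measurement.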