Biomarkers of the Caseous Lymphadenitis in Sheep by NMR-Based Metabolomics

Caseous lymphadenitis (CLA) is a disease of animals such as sheep whose etiologic agent is Corynebacterium pseudotuberculosis. In sheep it causes major losses through reduced milk, meat and wool production and the death of infected animals, and it also makes the habitats of infected animals difficult to sterilize. Another obstacle to controlling the disease is its late diagnosis. Here we present the first results of NMR-based metabolomics applied to CLA, with the aim of detecting the alterations in the blood serum profile provoked by the pathogen in C. pseudotuberculosis seropositive sheep. We compared two types of serum samples: those taken from seropositive animals, which we called infected, and those taken from seronegative (healthy) animals. Significant changes in the metabolomic profile occurred in the spectral regions δ -0.20 to 2.20, 3.20 to 4.40 and 6.40 to 8.00, which correspond to hydrogen atoms of proteins, organic acids, alcohols, lipids and some amino acids. Applying chemometrics, a significant separation of the 59 serum samples into two groups, sick and healthy animals, was achieved. Additionally, key metabolites that were present only in the group of sick animals, and that can be considered exclusively bacteria-derived, were identified as possible biomarkers for CLA. These data might therefore contribute to the development of a non-invasive NMR-based diagnostic method, as well as bring new insights into the development of new immunoprophylaxis tools.

Metabolomics: Open Access, ISSN: 2153-0769. Citation: De Moraes Pontes JG, De Santana FB, Portela RW, Azevedo V, Poppi RJ, et al. (2017) Biomarkers of the Caseous Lymphadenitis in Sheep by NMR-Based Metabolomics. Metabolomics (Los Angel) 7: 190. doi:10.4172/2153-0769.1000190

CLA is characterized by lymphadenomegaly and increased vascular permeability, which facilitate other routes of infection in sheep. After the incubation period of the pathogen (1 to 3 months), encapsulated purulent abscesses with a characteristic color form. Direct contact with these abscesses is also one form of transmission of the disease [11]. Therefore, one option that might improve treatment is to detect CLA at the beginning of infection and intervene before the illness has spread. For this purpose, the detection of biomarkers, rather than of C. pseudotuberculosis itself through PCR (DNA, RNA) analysis, is being proposed. NMR spectroscopy is an interesting option, since it is a non-destructive and reproducible technique and a very suitable analytical platform for identifying metabolites and biomarkers in metabolomics studies of different kinds [24][25][26][27][28][29]. Studies on C. pseudotuberculosis-host interactions are available in the scientific literature [30][31][32][33]; however, information on the metabolomics of this disease is lacking.
In this context, the main aim of this research was to evaluate the metabolic changes in animals infected by C. pseudotuberculosis by combining NMR spectroscopy and chemometrics. This is the first attempt to apply NMR-based metabolomics to assess blood serum profile changes in CLA. We aimed to identify key metabolites and possible biomarkers of caseous lymphadenitis in sheep. We hope that the results may contribute to the development of methodologies for the early diagnosis of CLA while it is still in its asymptomatic phase, serving as a form of zoonosis control and CLA prevention that minimizes economic losses.

Samples
Sera from 59 sheep were used in this study. Thirty-three blood samples were taken from animals infected with Corynebacterium pseudotuberculosis, as confirmed by bacterial isolation from caseous lesions and further identification by biochemical [34] and molecular methods [35]. In parallel, sera from 26 healthy control animals were collected. The healthy sheep were selected after continuous clinical and immunological examinations, the latter being characterized by 89% sensitivity and 99% specificity [36].
Animals of both sexes, aged from eight months to five years, were selected. They were subjected to the same nutritional scheme and breeding procedures. Healthy sheep were bred apart and isolated from infected animals to prevent further infections. Blood was collected with a vacutainer system without anticoagulants, by puncture of the jugular vein using a 10 mL tube. The blood was centrifuged at 4,000 rpm for 10 min, and the resulting serum was stored at -80 °C for no longer than 15 days until NMR spectra acquisition.
Two-dimensional NMR spectroscopy: 2D NMR spectra were acquired at spectrometer frequencies of 600.17 MHz for F2 and 150.91 MHz for F1, with a free induction decay (FID) size of 2048 (F2) and 256 (F1) data points and 64 scans (ns=64). Edited HSQC spectra were acquired in the spectral range δ 11.73 to -2.29 (F2) and δ 25.9 to 11.0 (F1). For more details about the spectra and acquisition parameters, see the Supplementary Information. HSQC contour maps were processed using LB=1.00 Hz (F2) and LB=0.30 Hz (F1).
NMR spectra processing: After the NMR spectra were recorded, the data matrix was constructed, and we then tested various data preprocessing methods to remove sources of variation not related to our samples [39][40][41][42]. The tested preprocessing methods were autoscaling, Pareto scaling, standard normal variate (SNV) and normalization of each line between 0 and 1 [43].
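The four preprocessing options tested here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PLS_Toolbox routines used in the study: the function names are hypothetical, the data matrix is assumed to hold one spectrum per row and one chemical-shift variable per column, "normalization by the line" is assumed to mean row-wise min-max scaling, the sample standard deviation (n - 1) is assumed, and no row or column is constant.

```python
import math
from statistics import mean, stdev


def autoscale(X):
    """Column-wise: subtract the mean and divide by the standard deviation."""
    cols = list(zip(*X))
    mu = [mean(c) for c in cols]
    sd = [stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mu, sd)] for row in X]


def pareto_scale(X):
    """Column-wise: subtract the mean and divide by the square root of the standard deviation."""
    cols = list(zip(*X))
    mu = [mean(c) for c in cols]
    sd = [stdev(c) for c in cols]
    return [[(x - m) / math.sqrt(s) for x, m, s in zip(row, mu, sd)] for row in X]


def snv(X):
    """Row-wise standard normal variate: center and scale each spectrum on its own."""
    return [[(x - mean(row)) / stdev(row) for x in row] for row in X]


def minmax_rows(X):
    """Row-wise scaling of each spectrum to the range 0 to 1 (assumed meaning)."""
    return [[(x - min(row)) / (max(row) - min(row)) for x in row] for row in X]


# Toy matrix: 3 spectra (rows) x 3 variables (columns); values are illustrative only.
X = [[1.0, 10.0, 100.0],
     [2.0, 20.0, 400.0],
     [3.0, 30.0, 700.0]]
```

Autoscaling gives every variable unit variance, so weak peaks weigh as much as strong ones, whereas Pareto scaling only partly reduces the dominance of intense peaks.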
Chemometric analysis: PLS-DA, a variant of the Partial Least Squares (PLS) regression algorithm for discriminant analysis and a supervised classification method for multivariate data, was used to relate a matrix X (NMR data) to a matrix/vector Y representing the class of each sample [44,45].
In PLS-DA involving two classes, it is common to label them Class 1 and Class 0 [46]. Here, Class 0 corresponds to infected sheep blood sera, and Class 1 corresponds to healthy control samples.
In many cases, the Y value predicted by the model is close to, but not exactly, 0 or 1. It is therefore necessary to calculate a threshold value for separating the classes [47]. If the predicted value is lower than the threshold, the sample is considered to belong to Class 0; if it is higher than the threshold, the sample is classified as Class 1. The optimal threshold value can be calculated by Bayes' theorem or from ROC curves [43,48,49].
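The thresholding step can be illustrated with a short Python sketch. The criterion used to pick the cut-off from the ROC curve is not specified in the text; maximizing Youden's J (sensitivity + specificity - 1) is assumed here as one common choice. The function names and the example values are hypothetical, infected sera (Class 0) are treated as the positive class, and both classes are assumed to be present.

```python
def classify(y_pred, threshold):
    """Class 0 (infected) below the threshold, Class 1 (healthy) above it."""
    # A value exactly equal to the threshold is not covered by the text;
    # assigning it to Class 0 is an arbitrary choice made here.
    return [1 if y > threshold else 0 for y in y_pred]


def youden_threshold(y_true, y_pred):
    """Scan midpoints between sorted predictions and keep the one with the highest J."""
    values = sorted(set(y_pred))
    best_t, best_j = None, -2.0
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2
        pairs = list(zip(y_true, classify(y_pred, t)))
        tp = pairs.count((0, 0))  # infected predicted as infected
        fn = pairs.count((0, 1))  # infected predicted as healthy
        tn = pairs.count((1, 1))  # healthy predicted as healthy
        fp = pairs.count((1, 0))  # healthy predicted as infected
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t


# Illustrative predicted Y values, not data from the study.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0.1, 0.3, 0.45, 0.6, 0.8, 1.1]
threshold = youden_threshold(y_true, y_pred)
predicted_classes = classify(y_pred, threshold)
```

A Bayesian threshold, the other option mentioned, would instead model the distribution of predicted values in each class and place the cut-off where the two posterior probabilities are equal.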
PCA and PLS-DA classification models were developed using MATLAB R2016a (MathWorks Inc.) and PLS_Toolbox 8.1 (Eigenvector Research). Approximately two-thirds of the samples from healthy and infected sheep blood sera were randomly selected for the training set, and the remaining samples were used as the test set.
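A random two-thirds split of this kind might look as follows in Python. The sketch assumes that the selection was made separately within each class (a stratified split), which the text suggests but does not state outright. The function name, the seed and the rounding rule are hypothetical choices, so the resulting set sizes are not necessarily those of the study.

```python
import random


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        n_train = round(len(idx) * train_fraction)
        train.extend(idx[:n_train])
        test.extend(idx[n_train:])
    return sorted(train), sorted(test)


# 33 infected (Class 0) and 26 healthy (Class 1) sera, as in the study.
labels = [0] * 33 + [1] * 26
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class keeps the proportion of infected to healthy sera similar in the training and test sets, which matters with only 59 samples.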
For the PCA model, the optimal number of principal components (PC) was chosen by analyzing the cumulative variance in the X block. For the PLS-DA model, the optimal number of latent variables was selected based on the Root Mean Square Error of Cross-Validation (RMSECV), analyzed together with the lowest difference between RMSECV and the Root Mean Square Error of Calibration (RMSEC); the cumulative variance in the X and Y blocks was also analyzed. Following these criteria, models were built using each of the preprocessing methods already described. The best PCA model was constructed with 4 PCs, and the best PLS-DA model with 3 latent variables. In both the PCA and PLS-DA models, the best preprocessing for the NMR data was autoscaling, which ensured that information on metabolites whose peaks had low intensity was not lost [50].
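The selection of the number of latent variables can be made concrete with a small Python sketch. How the two criteria were combined is not stated, so the rule below is an assumption: among the models whose RMSECV lies within a tolerance of the lowest RMSECV, it picks the one with the smallest gap between RMSECV and RMSEC. The function names, the tolerance and the error values are hypothetical and are not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted class values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pick_latent_variables(rmsec, rmsecv, tolerance=0.02):
    """rmsec[i] and rmsecv[i] belong to the model with i + 1 latent variables."""
    best_cv = min(rmsecv)
    # Keep models whose cross-validation error is close to the best one ...
    candidates = [i for i, cv in enumerate(rmsecv) if cv <= best_cv + tolerance]
    # ... and prefer the one whose calibration error is closest to it.
    chosen = min(candidates, key=lambda i: rmsecv[i] - rmsec[i])
    return chosen + 1


# Made-up error curves for 1 to 4 latent variables (not values from the study).
rmsec = [0.40, 0.28, 0.20, 0.12]
rmsecv = [0.46, 0.31, 0.32, 0.38]
n_lv = pick_latent_variables(rmsec, rmsecv)
```

A calibration error that keeps falling while the cross-validation error rises, as in the last entries of the example, is the usual sign that additional latent variables are fitting noise.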

Evaluation of the model parameters:
The model was assessed using the following parameters: true positives (TP) and true negatives (TN), sensitivity (Eq. A.1), specificity (Eq. A.2), false positive rate (FPR, Eq. A.3) and false negative rate (FNR), accuracy (Eq. A.4), and efficiency (Eq. A.5). In this research, a TP is an infected sheep blood serum sample whose NMR data are classified in Class 0, and a false positive (FP) is a healthy control sample erroneously classified in Class 0 [51,52].
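These parameters can be computed from the four counts of a confusion matrix, as in the Python sketch below. Equations A.1 to A.5 are not reproduced in this text, so the standard confusion-matrix definitions are assumed, with infected sera (Class 0) as the positive class; efficiency follows the definition given with Table 1 (100% minus the sum of FPR and FNR). The function name and the counts are hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return the evaluation parameters as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)  # infected correctly placed in Class 0
    specificity = 100.0 * tn / (tn + fp)  # healthy correctly placed in Class 1
    fpr = 100.0 * fp / (fp + tn)          # healthy wrongly placed in Class 0
    fnr = 100.0 * fn / (fn + tp)          # infected wrongly placed in Class 1
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    efficiency = 100.0 - (fpr + fnr)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "FPR": fpr,
        "FNR": fnr,
        "accuracy": accuracy,
        "efficiency": efficiency,
    }


# Illustrative counts only, not the confusion matrix of the study.
metrics = classification_metrics(tp=9, tn=8, fp=2, fn=1)
```

Under these definitions FPR equals 100% minus specificity and FNR equals 100% minus sensitivity, so efficiency penalizes errors in both classes at once.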

Results and Discussion
Currently, Polymerase Chain Reaction (PCR) is the most reliable method for detecting Corynebacterium pseudotuberculosis in caseous lesions [53,54]. However, the PCR technique requires isolation and later identification of the bacteria, which increases exposure to the pathogen. It is also impossible to take caseous material from internal granulomatous lesions in organs such as the liver, lungs, and kidneys. Immunodiagnostic assays work well for identifying infection in animals presenting clinical signs of CLA, but no assay offers 100% specificity and sensitivity for diagnosing CLA in infected animals without clinical signs [34]. It is therefore very hard to diagnose CLA properly in subclinical animals that do not present any detectable lesions.
The use of NMR spectroscopy in the search for biomarkers has many advantages, such as high sensitivity and the need for only a small amount of blood from the animal [25,55]. Moreover, the analyst has no direct contact with the pathogen, which makes the procedure somewhat safer. ¹H NMR spectra indicate the presence of many species in sheep blood serum, for example at δ 1.00 to 4.00 (amino acids, lipids, organic acids, alcohols), δ 4.00 to 5.00 (carbohydrates), and δ 6.00 to 8.00 (aromatic compounds and others) [56]. Examples of ¹H NMR spectra recorded for the two types of sheep blood serum samples are presented in Figure 1a; spectra with suppression of the peaks of macromolecules such as peptides, proteins and lipids are shown in Figures 1b and 1c for the spectral regions δ 6.50 to 8.00 and δ 0.50 to 4.20, respectively.
The spectra presented in Figure 1 contain many overlapping broad peaks, as is typical for a biological sample such as blood serum. Visual inspection of the recorded NMR spectra alone made it difficult to discriminate healthy from infected sera. Only subtle profile differences can be observed in the integration values of some peaks (Table S1, Supplementary Information), and chemometrics offers a way to solve this problem. We therefore first used the multivariate method Principal Component Analysis (PCA) and then Partial Least Squares Discriminant Analysis (PLS-DA) to differentiate the samples.
Because of the vast amount of data and the fact that metabolic changes are sometimes subtle, we used PCA to preserve the most relevant data without compromising the quality of the research. PCA and PLS-DA enabled us to gain insight into the spectral regions that underwent the most significant changes [57,58] upon infection with C. pseudotuberculosis.
The score results of the PCA are shown in Figure 2, where a tendency toward a slight separation between healthy and infected sheep blood serum samples can be observed along PC 2; close to a PC 2 score of 0, however, the samples overlap heavily.
The Principal Component Analysis (PCA) was performed using all NMR data. The T2-edited ¹H-NMR data were not considered in the chemometric analysis, because the relaxation times of some low molecular weight (LMW) molecules may also be affected by the suppression of the peaks of the high molecular weight (HMW) molecules.
The Y values predicted by the PLS-DA model are shown in Figure 3. Only two samples, one from the healthy control group and one from the infected group, were misclassified; they were counted as an FP and an FN, respectively. However, some infected sheep blood sera have prediction values very close to the threshold line. In the validation set, the infected sheep blood sera were also close to the threshold line; still, only one of these samples was misclassified.
The evaluation parameters of the model are organized in the matrix presented in Table 1. The sensitivity and specificity values obtained with the PLS-DA model in the training and test sets were close, which indicates good agreement between the two sets; in other words, there is no overfitting, and the training set is representative. Another way to evaluate the model is through the accuracy and efficiency values, which indicate the quality of the model when assessing two classes simultaneously. These values for the presented PLS-DA model were higher than 90% in both sets (training and test), pointing to an excellent discrimination between healthy and infected sheep blood sera.
Figure 1: a) ¹H NMR spectra obtained from healthy control sheep (gray) and Corynebacterium pseudotuberculosis infected sheep (black) in the spectral region δ 0.00 to 9.00; b) ¹H NMR spectra with T2 filter for δ 6.50 to 8.00; c) ¹H NMR spectra with T2 filter for δ 0.50 to 4.20. Abbreviations: tryptophan (Trp), phenylalanine (Phe), tyrosine (Tyr), glycerylphosphocholine (GPCho), phosphocholine (PCho), glutamate (Glu), alanine (Ala), valine (Val), isoleucine (Ile), alpha-glucose (Glc) and leucine (Leu).
With the developed PLS-DA model, it was possible to identify the variables that are most important for establishing the relationship between matrix X (¹H NMR spectral profile) and Y (the classes of serum). These are highlighted by the variable importance in projection (VIP) scores, available in the online resource. The idea behind VIP is to accumulate the importance of each variable j as reflected by the weights from each latent variable [59].
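One widely used formulation of the VIP score is sketched below in Python. It is an assumption that this matches the computation performed by the software used in the study. For variable j, the squared normalized weight in each latent variable is multiplied by the Y variance explained by that latent variable, summed over latent variables, divided by the total explained Y variance, multiplied by the number of variables p, and square-rooted. The function name and the toy weights are hypothetical.

```python
import math


def vip_scores(weights, ssy):
    """VIP for each variable.

    weights: one weight vector per latent variable, each of length p.
    ssy: Y sum of squares explained by each latent variable.
    """
    p = len(weights[0])
    total = sum(ssy)
    norms_sq = [sum(v * v for v in w) for w in weights]
    scores = []
    for j in range(p):
        acc = sum(s * w[j] ** 2 / n for w, s, n in zip(weights, ssy, norms_sq))
        scores.append(math.sqrt(p * acc / total))
    return scores


# Toy model: 3 variables, 2 latent variables; numbers are illustrative only.
weights = [[3.0, 4.0, 0.0],
           [0.0, 0.0, 1.0]]
ssy = [8.0, 2.0]
vip = vip_scores(weights, ssy)
important = [j for j, v in enumerate(vip) if v > 1.0]
```

With this definition the squared VIP scores average to 1 across all variables, which is why a score above 1 is commonly read as above-average importance.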
Variables with VIP scores greater than 1 were the most important chemical shifts for discriminating the two classes of samples in PLS-DA; these variables were highlighted in the mean spectrum of all samples. The NMR spectral regions that contributed to the separation between healthy and infected sheep blood sera were δ -0.20 to 2.20, 3.20 to 4.40 and 6.40 to 8.00. To identify the relevant metabolites, we used the HSQC correlations and online databases such as the Biological Magnetic Resonance Data Bank (BMRB) [60] and the Platform for RIKEN Metabolomics (PRIMe) [61]. Twenty metabolites were identified using these databases, as presented in Table S4 (online resource). Examples of the HSQC contour maps obtained for the healthy and infected sheep blood sera, the ¹H NMR spectra with T2 filter and the HSQC correlations (δH-δC, Tables S2 and S3) used in the metabolomics database searches are also available in the online resource.

Table 1: Parameters evaluated for the PLS-DA model in the training set (%) and test set (%). Sensitivity is expressed as the probability that an infected sheep blood serum sample is classified in Class 0, and specificity as the probability that a healthy control is correctly classified in Class 1. Accuracy is the rate of correct classification, independent of the class of the sample, and the efficiency rate is defined as the difference between the total of results (100%) and the sum of FPR and FNR.
Based on the peak assignments (Table S5 in the online resource), which reflect differences in concentration or in chemical nature, the following metabolites can be cited as most responsible for the separation of the two groups: L-tryptophan, L-phenylalanine, L-tyrosine, D-(+)-mannose, D-(+)-galactitol, D-(+)-mannitol, L-ornithine, L-leucine, and L-arginine.
An increase in alcohols such as butanol and sorbitol was also observed in the infected sheep blood samples, which can be related to the fragmentation of mycolic acid, the latter being exclusively present in the Corynebacterium genus [62].
Furthermore, a decrease in the intensity of the peaks in the δ 3.00-4.00 range was observed in infected sheep blood sera (Figure 1c). Decreased levels of sugars such as D-(+)-glucose and D-(+)-mannose are expected to have contributed to this observation. This decrease can be correlated with the fermentation processes typical of the biochemical action of C. pseudotuberculosis [63]. However, the greater diversity of peaks in this spectral region (δ 3.00-4.00) seen in the infected sheep blood serum spectra may be due to the presence of other C. pseudotuberculosis sugars.
L-Tryptophan was identified only in serum samples from infected (sick) animals (Figure 1b). In general, tryptophan is synthesized in bacteria [64,65]. We believe that during the infection process this amino acid might be excreted into the blood. An increase in blood tryptophan levels is linked to some skin histopathologies, while quinolinate, formed through tryptophan catabolism, was previously reported in lymph nodes [66].
In healthy control samples, we have detected slightly higher amounts of the following amino acids: L-arginine, Ala-Ala, L-leucine, L-ornithine, L-phenylalanine, and L-tyrosine. We might assume that infected animals use these amino acids for the formation of new immune cells via processes of mitosis and expansion, which T and B lymphocytes undergo after antigen recognition and activation [67].
C. pseudotuberculosis infected animals usually develop a significant immune response, as can be observed through the formation of granulomatous lesions in lymphoid tissues. This bacterium is also known to induce strong chemotaxis of the many immune cells that construct the granuloma [68]. Therefore, the marked presence of lipids with long fatty acid chains can be explained as a consequence of the production of lipid intermediates originating from the degradation of arachidonic acid (phospholipids, PL), a well-known feature of inflammation [69], such as in the chemotaxis of immune cells.
Negative chemical shifts, which result from intermolecular interactions [70], very likely indicate variations in the levels of some serum proteins, such as immunoglobulins, upon infection. No NMR studies characterizing immunoglobulins in CLA are known. However, there are reports of improved resistance to C. pseudotuberculosis infection in mice [71]. Thus, CLA progression in sheep might be associated with high levels of haptoglobin and IgM [72].

Conclusion
A pioneering NMR-based metabolomics study was successfully applied to the identification of Caseous Lymphadenitis (CLA) biomarkers in sheep. Twenty metabolites were detected as altered in sheep blood serum samples when healthy and infected animals were compared. Among those, fourteen emerged as possible CLA key metabolites, 9 found only in the healthy sample group and 5 in the infected sheep blood sera. Their chemical shifts belong to the NMR spectral regions δ -0.20 to 2.20, 3.20 to 4.40 and 6.40 to 8.00, which showed very high loadings in the PCA and PLS-DA. We have also shown that metabolites exclusive to Corynebacterium pseudotuberculosis, such as alcohols, sugars and tryptophan, could be readily observed by NMR. Based on the reported data, we hope that future research might contribute to the development of a safer caseous lymphadenitis diagnosis and/or bring new solutions for the immunoprophylaxis of CLA.
The sheep under study were bred at the Animal Facilities of the Laboratory of Immunology and Molecular Biology (Labimuno-UFBA), located in the municipality of Salinas da Margarida, Bahia State, Brazil (12°51'54.95"S, 38°47'53.09"W), and had been purchased from local sheep breeders for scientific purposes. Animals were fed with grass, protein concentrate and water ad libitum, as recommended by the Animal Nutrition Laboratory of the Federal University of Bahia (UFBA). The animal blood serum samples were collected by veterinarians following the recommendations of the Brazilian Ministry of Agriculture (MAPA, 2010). After the samples had been obtained, the animals were closely observed for a two-day period to detect any lesions or behavioral problems due to the procedures.
We confirm that this manuscript is the result of a basic research project that was developed at the University and funded by public funding agencies with the aim of developing new knowledge. The generated results should, therefore, be shared with the scientific and technological community through Open University theses and manuscripts published in standard scientific journals. In this sense, we state that we fully adhere to all of the Metabolomics: Open Access policies regarding the sharing of data and materials. We confirm that this study did not involve endangered or protected species.

Supplementary Information
The contents of the SI include spectroscopic correlations, HSQC and edited HSQC spectra, T2-edited ¹H-NMR spectra, acquisition parameters, mean peak integrations of the ¹H-NMR spectra, Variable Importance in Projection (VIP) scores and details about the 20 metabolites identified (ID and assignments).