Introduction

Homocysteine is a nondietary amino acid, which is the byproduct of methionine metabolism. Hyperhomocysteinaemia is documented as a potential risk factor for a wide spectrum of diseases such as recurrent pregnancy loss (Govindaiah et al. 2009), birth defects (Naushad et al. 2014a), psychiatric disorders (Moustafa et al. 2014), coronary artery diseases (Lakshmi et al. 2013), deep vein thrombosis (Naushad et al. 2007; Ghaznavi et al. 2015), Parkinson’s disease (Kumudini et al. 2014; Kirbas et al. 2016), etc. Hyperhomocysteinaemia could occur due to genetic polymorphisms/mutations or cofactor deficiencies of the folate metabolic pathway.

Dietary folate occurs as folyl polyglutamate, which is converted to monoglutamates by glutamate carboxypeptidase II (GCPII) in the intestine. Folate reductase (FR) then reduces the monoglutamates to dihydrofolate (DHF) and tetrahydrofolate (THF). Serine hydroxymethyl transferase 1 (SHMT1), present in the cytosol, is a \(\hbox {B}_{6}\)-dependent enzyme that catalyses the reversible conversion of serine and THF to glycine and 5,10-methylene THF, and the irreversible conversion of 5,10-methylene THF to 5-formyl THF. Methylene tetrahydrofolate reductase (MTHFR) catalyses the flavin adenine dinucleotide (FAD)-dependent reduction of 5,10-methylene THF to 5-methyl THF. 5,10-Methylene THF also supplies the one-carbon unit for the conversion of deoxyuridine monophosphate (dUMP) to deoxythymidine monophosphate (dTMP) by thymidylate synthase (TYMS). 5-Methyltetrahydrofolate-homocysteine methyltransferase (MTR) catalyses the conversion of homocysteine to methionine with 5-methyl THF as the substrate and methylcobalamin as the cofactor. 5-Methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR) catalyses the reductive methylation of cobalamin. Methionine is then converted to S-adenosyl methionine (SAM), which donates its methyl group to DNA, histones, catecholamines, etc. After the transfer of the methyl group, SAM is converted to S-adenosyl homocysteine (SAH), which on hydrolysis gives homocysteine.

The pathophysiology of hyperhomocysteinaemia was reported to be mediated through the following mechanisms: (i) auto-oxidation of homocysteine increases free radical production (Zhang et al. 1998); (ii) the superoxide ion thus generated removes nitric oxide from the circulation as peroxynitrite (Antoniades et al. 2006); (iii) homocysteine was shown to induce damage to the endothelium (Pushpakumar et al. 2014); (iv) elevated homocysteine might be a surrogate marker of altered cellular methylation (Naushad et al. 2014); and (v) homocysteine can promote a hypercoagulable state by promoting procoagulants and inhibiting anticoagulants (Coppola et al. 2000) (figure 1).

Fig. 1

Homocysteine metabolic pathway. This scheme depicts the transfer of one-carbon moiety from one substrate to another in sequential reactions. Any perturbation in this pathway results in elevated homocysteine.

Several putatively functional polymorphisms in the folate metabolic pathway have been reported, such as GCPII C1561T, RFC1 G80A, SHMT1 C1420T, TYMS \(5^\prime \)-UTR 28 bp tandem repeat, MTHFR C677T, MTR A2756G and MTRR A66G (Binia et al. 2014). These polymorphisms were shown to influence homocysteine levels and also to contribute to impaired DNA synthesis and DNA methylation. Deficiencies of cofactors such as folic acid, \(\hbox {B}_{2}\), \(\hbox {B}_{6}\) and \(\hbox {B}_{12}\) were reported to increase homocysteine (Sukla et al. 2013). Apart from these inherent variables, other variables such as age, gender, hypertension, diabetes, smoking and consumption of tea, coffee and alcohol are also known to influence homocysteine (Landini et al. 2014).

Fig. 2

The impact of genetic polymorphisms in homocysteine metabolism. The univariate analysis of each genetic polymorphism for possible association with homocysteine is depicted for wild (0), heterozygous (1) and homozygous (2) genotypes.

In view of the multisystem involvement of elevated homocysteine, we aimed to develop a mathematical model of homocysteine metabolism and to use this model to understand how environmental and genetic variables modulate homocysteine levels. This will help to delineate whether lifestyle modification can control homocysteine levels.

Method

Recruitment of subjects

Population-based controls aged 20–75 years were recruited for this study. The inclusion criteria were: no history of thyroid abnormality, no history of narcotic drug use and no evidence of malignancy or inflammatory disease. Based on these criteria, we recruited 85 subjects at Nizam’s Institute of Medical Sciences, Hyderabad, India. The study protocol was approved by the Institutional Ethical Committee of Nizam’s Institute of Medical Sciences, Hyderabad, India. Informed consent was obtained from all subjects.

Sample collection

Whole blood samples were collected in EDTA after overnight fasting. Plasma was separated immediately and stored at \(-80{^{\circ }}\hbox {C}\) until further analysis. DNA was extracted from the buffy coat by the phenol–chloroform method following digestion with proteinase K.

Biochemical analysis

Plasma homocysteine and glutathione were measured by reverse-phase high-performance liquid chromatography following precolumn derivatization with 4-fluoro-7-sulfobenzofurazan ammonium salt as described elsewhere (Ubbink et al. 1991).

Genetic analysis

GCPII C1561T, RFC1 G80A, SHMT1 C1420T, TYMS \(5^\prime \)-UTR 28 bp tandem repeat, MTHFR C677T, MTR A2756G and MTRR A66G were detected using polymerase chain reaction (PCR) restriction fragment length polymorphism (RFLP) and PCR-amplified fragment length polymorphism (AFLP) approaches as described earlier (Mohammad et al. 2011).

Multiple linear regression (MLR)

The demographic and genetic factors, i.e., age, gender, GCPII C1561T, RFC1 G80A, SHMT1 C1420T, TYMS \(5^\prime \)-UTR 28 bp tandem repeat, MTHFR C677T, MTR A2756G and MTRR A66G, were used as input variables and total plasma homocysteine was used as the output variable to construct an MLR model.
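The fitting step described above can be sketched with ordinary least squares. This is a minimal illustration, not the software used in the study: the function name `fit_mlr`, the predictor order and the synthetic data are assumptions, and genotypes are assumed to be coded 0, 1 and 2 by the number of variant alleles.

```python
import numpy as np

# Assumed predictor order (hypothetical); genotypes coded 0/1/2 by variant allele count.
PREDICTORS = ["age", "gender", "GCPII", "RFC1", "SHMT1", "TYMS", "MTHFR", "MTR", "MTRR"]


def fit_mlr(X, y):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot


# Synthetic demonstration only: these are NOT the study data.
rng = np.random.default_rng(0)
n = 85
X_demo = np.column_stack([
    rng.integers(20, 76, n),      # age in years
    rng.integers(0, 2, n),        # gender (0/1 coding is an assumption)
    rng.integers(0, 3, (n, 7)),   # seven genotypes coded 0/1/2
])
beta_demo = rng.normal(0.0, 1.0, X_demo.shape[1])     # arbitrary illustrative effects
y_demo = 15.0 + X_demo @ beta_demo + rng.normal(0.0, 2.0, n)

coef, r2 = fit_mlr(X_demo, y_demo)
print(dict(zip(["intercept"] + PREDICTORS, np.round(coef, 2))))
print(f"R^2 = {r2:.2f}")
```

The returned R² is the fraction of variability in the output explained by the linear model, which is the quantity reported for the MLR equation in the Results.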

Development of neuro-fuzzy logic model

For the development of the neuro-fuzzy logic model, the input variables were age, gender, diet, smoking, alcohol, diabetes mellitus, hypertension, GCPII C1561T, RFC1 G80A, SHMT1 C1420T, TYMS \(5^\prime \)-UTR 28 bp tandem repeat, MTHFR C677T, MTR A2756G and MTRR A66G, while total plasma homocysteine was the output variable. The fuzzy inference system (FIS) was generated using subclustering. The model was trained on the data of 85 subjects. The FIS was optimized during training with the ‘hybrid’ method, an error tolerance of 0.0001 and 3000 epochs. Training was stopped when the mean absolute error was minimized. Model performance was assessed by cross-validation, using the data as the testing and the checking data. The final model yielded the rules and surface plots representing the interactions between the variables.
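To make the structure of such a model concrete, the following sketch evaluates a first-order Sugeno-type fuzzy inference system with Gaussian membership functions, the form commonly used in neuro-fuzzy (ANFIS-style) models, and computes a mean absolute error. It is an illustration under stated assumptions: the rule parameters are invented, only two inputs are used, and the hybrid training that would tune these parameters is not reproduced. The function names are hypothetical and do not correspond to the software used in the study.

```python
import numpy as np


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x for a fuzzy set with the given center and width."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_predict(x, centers, sigmas, consequents):
    """First-order Sugeno inference for a single input vector.

    centers, sigmas: arrays of shape (n_rules, n_inputs)
    consequents:     array of shape (n_rules, n_inputs + 1); last column is the bias
    """
    x = np.asarray(x, dtype=float)
    memberships = gaussian_mf(x, centers, sigmas)        # (n_rules, n_inputs)
    firing = memberships.prod(axis=1)                    # product as fuzzy AND
    weights = firing / firing.sum()                      # normalized firing strengths
    rule_outputs = consequents[:, :-1] @ x + consequents[:, -1]
    return float(weights @ rule_outputs)                 # weighted average of rule outputs


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))


# Hypothetical two-rule system over (age, MTHFR genotype); all values are invented.
centers = np.array([[30.0, 0.0], [60.0, 2.0]])
sigmas = np.array([[15.0, 1.0], [15.0, 1.0]])
consequents = np.array([[0.05, 1.0, 10.0], [0.10, 2.0, 12.0]])

print(sugeno_predict([45.0, 1.0], centers, sigmas, consequents))
```

Because each rule output is weighted by how strongly the inputs match that rule, the predicted effect of one variable can change with the value of another, which is how a model of this kind can represent bivariate interactions that a purely additive linear model cannot.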

In silico analysis

The crystal structure of Homo sapiens recombinant serine hydroxymethyl transferase (PDB ID: 1BJ4), with a single chain of 480 residues, was used as the template for this analysis. Using PyMOL software, the leucine residue at position 474 was mutated to phenylalanine to generate the mutant protein structure. The binding affinities of three ligands, i.e., tetrahydrofolate (THF), methylene THF and formyl THF, towards the wild versus mutant proteins were assessed using PyRx software. The chemical structures of the ligands were obtained from DrugBank. The atomic coordinates were generated using the PyRx tool. The PDB coordinates of the protein and the ligands were optimized with Drug Discovery Studio ver. 3.0. The binding energies were scored using the AutoDock Vina option.

Fig. 3

Influence of diet on the association of genetic and demographic variables with homocysteine. A comparative analysis of vegetarians and nonvegetarians (0, vegetarian diet; 1, nonvegetarian diet) showing changes in homocysteine with respect to age and the MTR, MTHFR and RFC1 genotypes, coded 0, 1 and 2 according to the number of variant alleles. Plasma homocysteine levels are expressed in \(\mu \hbox {mol}/\hbox {l}\).

Results

The MLR equation explained 64% of the variability in homocysteine levels. Homocysteine levels were higher in men than in women in all genotypes. The MTHFR C677T, MTR A2756G and MTRR A66G polymorphisms were shown to increase homocysteine (figure 2). The contribution of genetic and demographic factors to homocysteine is described by the following equation:

$$\begin{aligned}
\hbox {Homocysteine} ={}& 15.27 - (0.07 \times \hbox {Age}) + (4.93 \times \hbox {Gender}) \\
&- (4.51 \times \hbox {GCPII}) - (1.7 \times \hbox {RFC}) \\
&+ (0.58 \times \hbox {SHMT}) - (0.81 \times \hbox {TYMS}) \\
&+ (1.53 \times \hbox {MTHFR}) + (2.7 \times \hbox {MTR}) \\
&+ (2.19 \times \hbox {MTRR}).
\end{aligned}$$

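
As a worked illustration, the regression equation can be evaluated for a hypothetical subject. The coefficients are those reported above; the variable coding is an assumption not stated explicitly in the text: gender is taken as 1 for men and 0 for women (consistent with the higher levels reported in men), and each genotype is coded 0, 1 or 2 by the number of variant alleles, as in the figure captions. The function name is hypothetical.

```python
# Coefficients copied from the reported MLR equation.
INTERCEPT = 15.27
COEFFICIENTS = {
    "Age": -0.07,
    "Gender": 4.93,   # assumed coding: 1 = male, 0 = female
    "GCPII": -4.51,
    "RFC": -1.7,
    "SHMT": 0.58,
    "TYMS": -0.81,
    "MTHFR": 1.53,
    "MTR": 2.7,
    "MTRR": 2.19,
}


def predict_homocysteine(subject):
    """Evaluate the MLR equation; predictors missing from `subject` default to 0."""
    return INTERCEPT + sum(
        coefficient * subject.get(name, 0)
        for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical 40-year-old man with wild genotype at every locus.
baseline = predict_homocysteine({"Age": 40, "Gender": 1})
# Same subject, but homozygous for the MTHFR variant allele.
mthfr_tt = predict_homocysteine({"Age": 40, "Gender": 1, "MTHFR": 2})

print(f"{baseline:.2f}")   # prints 17.40
print(f"{mthfr_tt:.2f}")   # prints 20.46
```

Under these coding assumptions, each variant MTHFR allele adds 1.53 units to the predicted homocysteine level regardless of the other predictors, which is the additive behaviour expected of a linear model.
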
Fig. 4

Association of age and gender with homocysteine. This illustrates changes in homocysteine in both genders across different age groups.

The neuro-fuzzy logic model showed high accuracy, with a mean absolute error of 0.000065. This model was developed to study gene–gene, gene–environment and gene–nutrient interactions. In vegetarians, a positive association was observed between age and homocysteine level, whereas a nonvegetarian diet showed a protective role (figure 3). Homocysteine levels were higher in males than in females (figure 4).

Fig. 5

Counteracting interactions between MTHFR and TYMS in homocysteine metabolism. The MTHFR C677T polymorphism was shown to increase homocysteine levels in the presence of the TYMS \(5^\prime \)-UTR wild genotype. However, in the presence of the 2R allele (1 and 2), the MTHFR polymorphism does not elevate homocysteine.

Fig. 6

Interactions of THF with wild versus mutant proteins. The crystal structure of recombinant SHMT (PDB ID: 1BJ4) was used as a template and the leucine residue at position 474 was mutated to phenylalanine (L474F) using PyMOL. Docking was performed using PyRx, which showed similar binding affinities of wild and mutant proteins towards THF. However, H-bonding interactions were more frequent in the mutant protein than in the wild protein.

As shown in figure 5, MTHFR C677T was associated with increased homocysteine levels, while the presence of the TYMS \(5^\prime \)-UTR 28 bp tandem repeat negated this effect. The nonvegetarian diet was shown to attenuate the homocysteine elevation associated with MTR A2756G (figure 3). The MTHFR CT and TT genotypes were associated with elevated homocysteine in vegetarians, whereas the nonvegetarian diet conferred protection against this polymorphism (figure 3).

As shown in figure 3, the RFC1 GA and AA genotypes increased homocysteine levels in vegetarians. The nonvegetarian diet conferred protection against this elevation. Since SHMT1 C1420T had no appreciable impact on homocysteine levels, an in silico analysis was performed to elucidate its functional impact. The binding affinities of the three ligands, i.e., THF, methylene THF (MTHF) and formyl THF (FTHF), towards the wild versus mutant proteins were studied. Furthermore, H-bond interactions between the ligand and the protein were elucidated.

Fig. 7

Interactions of MTHF with wild versus mutant proteins. The crystal structure of recombinant SHMT (PDB Id: 1BJ4) was used as a template and the leucine residue at 474th position was mutated to phenylalanine (L474F) using PyMOL. Docking was performed using PyRx, which showed similar binding affinities of wild and mutant proteins towards MTHF. However, H-bonding interactions were more frequent in the wild protein than in the mutant.

The binding energies of the wild and mutant proteins were \(-8.2\) and \(-8.3\) kcal/mol, respectively, with THF as the ligand. With FTHF and MTHF as ligands, the energies of the wild and mutant proteins were similar, i.e., \(-7.9\) and \(-7.3\) kcal/mol, respectively. The interactions between the protein and the ligand were viewed using PyMOL software. Although the binding affinities of the wild and mutant proteins were similar, they differed in H-bonding interactions. With THF, a greater number of H-bond interactions were observed in the mutant than in the wild-type protein (figure 6), whereas with MTHF, a greater number of H-bond interactions were observed in the wild type than in the mutant (figure 7).
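To give a sense of scale for the THF result, a difference in binding energy can be converted to an approximate ratio of dissociation constants using the standard thermodynamic relation \(\Delta G = RT \ln K_{d}\). The sketch below is only an interpretive aid and rests on assumptions: a temperature of 298.15 K is assumed, docking scores are treated as if they were true binding free energies (which they only approximate), and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)


def kd_ratio(dg_a, dg_b, temperature_k=298.15):
    """Return Kd_a / Kd_b implied by binding energies dg_a and dg_b (kcal/mol).

    Uses dG = RT ln Kd, so Kd_a / Kd_b = exp((dg_a - dg_b) / RT).
    A ratio above 1 means protein b is predicted to bind more tightly than protein a.
    """
    return math.exp((dg_a - dg_b) / (R_KCAL_PER_MOL_K * temperature_k))


# THF docking scores reported above: wild -8.2, mutant -8.3 kcal/mol.
ratio = kd_ratio(-8.2, -8.3)
print(f"Kd(wild) / Kd(mutant) = {ratio:.2f}")
```

A difference of 0.1 kcal/mol corresponds to a ratio of roughly 1.2, a change small enough to be consistent with describing the two binding affinities as similar.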

Discussion

Hyperhomocysteinaemia is a well-documented risk factor for a variety of diseases affecting all age groups. Several researchers have previously demonstrated the association of hyperhomocysteinaemia with the molecular pathophysiology of these diseases. However, owing to complex gene–gene and gene–environment interactions, few studies have attempted to explain the elevation of homocysteine in terms of multilocus models. Mathematical models of homocysteine metabolism were proposed based on known reaction kinetics (Reed et al. 2004). An inverse association was reported between folate and homocysteine at very low concentrations of folate (Reed et al. 2006). The MLR approach is well accepted for assessing the contribution of independent variables to a dependent variable, and here it explained 64% of the variability in homocysteine. However, being a linear model, it is unlikely to account for complex interactions between the independent variables. We overcame this limitation by applying a neuro-fuzzy design, which is capable of exploring bivariate interactions. The observations of this model, showing a positive association of MTHFR C677T and inverse associations of TYMS \(5^\prime \)-UTR 28 bp tandem repeat and SHMT1 C1420T with homocysteine, corroborated the mathematical model of folate metabolism (Ulrich et al. 2008), which showed increased homocysteine levels in the presence of decreased specific activities of MTHFR and SHMT and increased specific activity of TYMS. The presence of the promoter polymorphism in TYMS increases the flux of folate towards remethylation of homocysteine. Hence, even in the presence of MTHFR C677T, homocysteine will not be elevated. A similar mechanism was shown to confer protection against coronary artery disease (Vijaya Lakshmi et al. 2011).

A majority of Indians have been reported to have \(\hbox {B}_{12}\) deficiency by virtue of a vegetarian diet and hence, their homocysteine levels have been found to be higher than those of their counterparts across the globe (Refsum et al. 2001). To substantiate this observation, we compared changes in homocysteine levels with respect to different genotypes in vegetarians and nonvegetarians. We observed lower homocysteine in nonvegetarians than in vegetarians. The uptake of cobalamin and its reactivation are facilitated by specific protein–protein interactions between the MTRR FMN domain and MTR (Wolthers and Scrutton 2009). MTRR was proposed to serve as a molecular chaperone for MTR and also to act as an aquacobalamin reductase (Yamada et al. 2006). This hypothesis is substantiated by a study in a Thai population, which showed a protective role of the nonvegetarian diet against MTHFR C677T-induced hyperhomocysteinaemia (Kajanachumpol et al. 2013).

The association of MTHFR C677T with hyperhomocysteinaemia was consistent with the existing literature. Recently, we reported that because of this association, the frequency of this polymorphism is lower in South Indians due to adaptive developmental plasticity (Naushad et al. 2014b).

The association of RFC1 G80A with homocysteine corroborated our earlier observation showing an inverse association of this polymorphism with folate (Naushad et al. 2011). The SHMT1 C1420T polymorphism showed no statistically significant association with plasma homocysteine. To support this observation, an in silico analysis was performed, which suggested similar binding energies of the wild and mutant proteins for THF, FTHF and methylene THF (MTHF). However, H-bond interactions were more numerous in the mutant protein than in the wild type for THF, while for MTHF they were fewer in the mutant protein than in the wild type, suggesting greater turnover of MTHF. This observation corroborated the studies on human and rabbit cytosolic SHMT, which showed no influence of the mutation on the stability of SHMT or on the rate of conversion of 5,10-methenyl THF to FTHF (Fu et al. 2005).

The major strength of the current study was the integration of MLR and neuro-fuzzy models to evaluate independent effects as well as bivariate interactions influencing homocysteine. Further, the narrow mean absolute error indicates high prediction accuracy of the model. The limitations of the current study were: (i) sample size; (ii) other lifestyle factors such as smoking, alcohol intake, consumption of tea and coffee, physical activity, etc., were not incorporated due to lack of quantitative information; and (iii) the estimations of folate and \(\hbox {B}_{12}\) were not performed due to technical constraints. Future studies on large ethnic groups and populations are warranted to increase the precision of these models.

To conclude, the MLR model of homocysteine explained 64% of the variability. The neuro-fuzzy model, with its narrow mean absolute error, explored the interactions of remethylating polymorphisms with diet and demonstrated the protective role of the nonvegetarian diet through lowering of homocysteine. Increased flux of folate due to the TYMS \(5^\prime \)-UTR 28 bp polymorphism, or one-carbon homeostasis due to SHMT1 C1420T, was shown to negate the homocysteine elevation mediated by the MTHFR, MTR and MTRR polymorphisms.