Cholesterol 7 alpha-hydroxylase (CYP7A1) gene polymorphisms are associated with increased LDL-cholesterol levels and the incidence of subclinical atherosclerosis

The cholesterol 7 alpha-hydroxylase (CYP7A1) enzyme plays an important role in the conversion of cholesterol to bile acids, contributing to the reduction of plasma cholesterol levels under normal conditions. Nonetheless, recent studies have shown that some genetic variants in the enhancer and promoter regions of the CYP7A1 gene reduce the expression of the CYP7A1 enzyme, increasing plasma lipid levels as well as the risk of developing coronary heart disease. The aim of this work was to explore whether the genetic variants rs2081687, rs9297994, rs10107182, rs10504255, rs1457043, rs8192870, and rs3808607 of the CYP7A1 gene are associated with subclinical atherosclerosis and plasma lipid levels. We included 416 patients with subclinical atherosclerosis (SA), defined as coronary artery calcium (CAC) greater than zero, and 1046 controls with CAC = 0. Under the inheritance models analyzed (co-dominant, dominant, recessive, over-dominant and additive), homozygosity for the minor allele of each of the 7 analyzed polymorphisms was associated with a higher incidence of SA (P < 0.05). In a sub-analysis including only the patients with SA, the same SNPs were associated with increased low-density lipoprotein cholesterol (LDL-C) levels. In addition, our findings showed that the TGCGCTG haplotype increases the risk of developing SA (P < 0.05). In conclusion, the rs2081687, rs9297994, rs10107182, rs10504255, rs1457043, rs8192870, and rs3808607 polymorphisms of CYP7A1 confer a risk of developing SA and elevated LDL-C levels. Our results suggest that CYP7A1 is involved in the incidence of SA through an increase in plasma lipid levels.

Therefore, considering the central role of CYP7A1 activity in cholesterol catabolism, in this study we hypothesize that the above-mentioned CYP7A1 gene polymorphisms are associated with elevated LDL-cholesterol plasma levels and a consequent higher risk of SA. The objective of this study was to look for a potential statistical association of the rs2081687 C/T, rs9297994 G/A, rs10107182 C/T, rs10504255 A/G, rs1457043 C/T, rs8192870 G/T, and rs3808607 G/T SNPs with the risk of developing SA, and with the plasma lipid profile, particularly LDL-cholesterol (LDL-C) and total cholesterol.

Characteristics of the study population
This cross-sectional study is nested in the GEA (Genetics of Atherosclerotic Disease) study, which investigates the associations of gene polymorphisms with atherosclerosis in Mexican individuals [19]. The GEA cohort was recruited from June 2008 to January 2013 at the Instituto Nacional de Cardiología. The 1462 Mexican mestizo volunteers included in the present study were enrolled in the GEA cohort after a medical examination and health questionnaire. The main inclusion criteria for these 1462 individuals were the absence of a personal or familial history of coronary heart disease or current or previous congestive heart failure. Exclusion criteria were liver, renal, thyroid and oncological diseases, determined by clinical chemistry and medical exploration [19]. The 1462 volunteers in this study were born in Mexico and considered Mexican Mestizo based on autochthonous and Caucasian and/or Black origin. Once included in the cohort, the 1462 volunteers in this study underwent a computed tomography scan to assess the CAC score [18]. A CAC score > 0 without clinical symptoms of coronary artery disease was established as a diagnosis of subclinical atherosclerosis (SA).

Clinical and laboratory measurements
Cholesterol and triglyceride levels were determined in plasma samples using commercially available kits (Randox Laboratories, Crumlin, UK). High-density lipoprotein-cholesterol (HDL-C) was determined after selective precipitation of apolipoprotein B-containing lipoproteins with phosphotungstic acid-Mg²⁺. The Friedewald formula was used to estimate the LDL-C plasma concentration [20] if triglyceride concentrations were < 400 mg/dL. Patients were considered to have diabetes when their fasting glucose levels were ≥ 126 mg/dL, they had a previous diagnosis of the disease, or they were using antidiabetic medications (https://www.msdmanuals.com/professional/endocrineand-metabolic-disorders/diabetes-mellitus-and-disorders-of-carbohydratemetabolism/diabetes-mellitus-dm#v29299021). Subjects were considered hypertensive when they had systolic blood pressure values ≥ 130 mmHg and/or diastolic blood pressure ≥ 90 mmHg, or were using anti-hypertensive drugs when blood samples were drawn for this study, according to the MSD manual (https://www.msdmanuals.com/professional/cardiovasculardisorders/hypertension/hypertension?query=hypertension (accessed on Dec 2, 2023)).
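The LDL-C estimation described above can be illustrated with a minimal sketch of the Friedewald formula in its usual mg/dL form (LDL-C = total cholesterol − HDL-C − triglycerides/5). The function name and the input values are hypothetical and are not taken from the study data.

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL-C (mg/dL) with the Friedewald formula.

    All inputs are in mg/dL. Returns None when triglycerides are
    >= 400 mg/dL, where the formula is not applied.
    """
    if triglycerides >= 400:
        return None
    # Triglycerides / 5 approximates VLDL-cholesterol in mg/dL.
    return total_chol - hdl - triglycerides / 5.0


# Hypothetical subject: TC 200, HDL-C 50, TG 150 (all mg/dL).
print(friedewald_ldl(200, 50, 150))
```

The `None` return mirrors the restriction to triglyceride concentrations below 400 mg/dL stated in the text.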

Genetic analysis
DNA samples were obtained from whole blood as previously described [21]. CYP7A1 gene polymorphisms (rs2081687 C/T, rs9297994 G/A, rs10107182 C/T, rs10504255 A/G, rs1457043 C/T, rs8192870 G/T, and rs3808607 G/T) were analyzed in patients with SA and control individuals using TaqMan assays on a QuantStudio 12K Flex Real-Time PCR system from Applied Biosystems, Foster City, USA. In addition, information regarding these polymorphisms, such as chromosome position, base change, and gene location, is shown in Supplementary Table 1.

Ethical statement
This work complies with the statements of the Declaration of Helsinki, and was approved by the local Ethics Committee under project number 23-1361. All participants enrolled in this study signed the corresponding informed consent.

Statistical analysis
Data distribution was determined by the Shapiro-Francia test. Variables in the SA and control groups were compared using Student's t-test and reported as mean ± SD when normally distributed, or using the non-parametric Mann-Whitney U test and reported as median and interquartile range [25th-75th] when not normally distributed. For categorical variables, Fisher's exact test or the Chi-squared test was performed. The association of CYP7A1 gene polymorphisms with SA was analyzed using additive, codominant, dominant, over-dominant (heterozygous), and recessive models of inheritance [22,23].
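As an illustration of how genotypes can be turned into regression predictors under these inheritance models, the following sketch uses one conventional coding based on the number of minor-allele copies. The function name and the exact coding are assumptions made for illustration; the study's own variable coding is not given in the text.

```python
def encode_genotype(genotype, minor):
    """Return a predictor code per inheritance model for one genotype.

    genotype: two-letter string such as "CT"; minor: the minor allele.
    This is a conventional coding, assumed here for illustration.
    """
    n = sum(1 for allele in genotype if allele == minor)  # 0, 1 or 2 copies
    return {
        "additive": n,                 # per-copy effect of the minor allele
        "dominant": int(n >= 1),       # carriers vs major-allele homozygotes
        "recessive": int(n == 2),      # minor homozygotes vs all others
        "over_dominant": int(n == 1),  # heterozygotes vs both homozygotes
        # Two indicator variables, major-allele homozygote as reference.
        "codominant": (int(n == 1), int(n == 2)),
    }


# rs2081687 C/T, taking T as the minor allele as in the text.
for g in ("CC", "CT", "TT"):
    print(g, encode_genotype(g, "T"))
```

Each code would then enter a logistic regression as the genotype term alongside the confounding variables.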
These analyses were performed using logistic regression. Logistic regression models included age, gender, blood pressure, and diabetes incidence as confounding variables. The P values (P) were corrected using the Bonferroni method, corresponding to the number of tests for each SNP based on the five inheritance models. The results of the analyses were presented as odds ratios (OR) with 95% confidence intervals. The power of the statistical analyses was set to 0.80 (OpenEpi available online, http://www.openepi.com/SampleSize/SSCC.htm). Statistical significance was set at P < 0.05.
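The two summary quantities mentioned here can be sketched as follows. Note the simplification: the odds ratio below is the unadjusted estimate from a 2×2 table with a Wald confidence interval, whereas the study reports ORs from logistic regression adjusted for confounders. Function names and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 table.

    a: cases with the genotype,    b: cases without it,
    c: controls with the genotype, d: controls without it.
    All four counts must be greater than zero.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


def bonferroni(p, n_tests=5):
    """Bonferroni-corrected P value, capped at 1 (five models per SNP)."""
    return min(1.0, p * n_tests)


# Hypothetical counts, not taken from the study tables.
or_, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(round(or_, 2), round(lower, 2), round(upper, 2))
print(bonferroni(0.004))
```

With five inheritance models per SNP, a nominal P value has to be below 0.01 to remain under 0.05 after this correction.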
Haplotypes and linkage disequilibrium (LD) were analyzed with Haploview version 4.1 (Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA). Haploview uses the international HapMap project database and the standard EM algorithm to estimate the phase of haplotypes, considering the combination of analyzed alleles in one or more genes that can segregate together due to the closeness among them. This analysis estimates the maximum-likelihood values to obtain D', the logarithm of the odds (LOD), and r² [24].
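For readers less familiar with the two LD measures, the sketch below computes D' and r² for a pair of biallelic loci from the standard definitions. It assumes that the haplotype frequency is already known, which skips the EM phasing step performed by Haploview; the function name and the frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B,
    p_a, p_b: frequencies of allele A and allele B (each in (0, 1)).
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies: allele A only ever occurs with allele B.
d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.2, p_b=0.3)
print(round(d_prime, 3), round(r2, 3))
```

In this example D' is approximately 1 while r² is only about 0.58, which shows how a pair of SNPs can have a high D' and still fall below an r² threshold such as 0.80 when their allele frequencies differ.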

Analysis of association between CYP7A1 genotypes with cardiovascular risk factors
To explore the potential contribution of CYP7A1 gene polymorphisms to triglycerides, total cholesterol, HDL-C, LDL-C, glucose, BMI, and systolic and diastolic blood pressures, individuals were grouped based on their genotype for each SNP. Comparisons among carriers of different genotypes were performed by ANOVA after logarithmic transformation of non-normally distributed variables. Variance homogeneity was evaluated by the Levene test and confirmed by the F test.
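The genotype comparison can be sketched as a one-way ANOVA on log-transformed values. The code below only computes the F statistic and its degrees of freedom; obtaining the P value requires the F distribution, which is left to a statistics package. Function names and values are hypothetical and are not study data.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def log_anova(groups):
    """Apply a natural-log transformation, then run the one-way ANOVA."""
    return one_way_anova_f([[math.log(x) for x in g] for g in groups])


# Hypothetical triglyceride values (mg/dL) for three genotype groups.
by_genotype = [[110, 150, 210], [120, 160, 240], [90, 130, 170]]
print(log_anova(by_genotype))
```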

Characteristics of the study subjects
On the basis of the CAC score [18], 416 individuals were classified as patients with subclinical atherosclerosis (SA), meaning that they had a CAC score > 0, and 1046 were assigned to the control group (CAC score = 0). Table 1 shows the biochemical and anthropometric characteristics of controls and SA patients. With the exception of body mass index, total cholesterol, triglycerides and smoking habits, the risk variables such as glucose, HDL-C, and LDL-C levels, as well as the incidence of hypertension and DM2, were higher in patients with SA than in control individuals.

Association of CYP7A1 polymorphisms with SA
The genotypic frequencies of CYP7A1 gene polymorphisms in SA patients and controls were in Hardy-Weinberg equilibrium (P > 0.05). The allelic and genotypic distributions of the seven polymorphisms considered in this study were different in SA patients compared to controls (P < 0.05) (Supplementary Table 2).
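A Hardy-Weinberg check of this kind can be sketched with the usual chi-squared goodness-of-fit statistic (1 degree of freedom for a biallelic SNP). The function name and the genotype counts are hypothetical; the test actually used by the authors' software is not specified in the text.

```python
def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-squared statistic for deviation from Hardy-Weinberg equilibrium.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major-allele frequency
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_major_hom, n_het, n_minor_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts for one SNP in the control group.
chi2 = hwe_chi_square(640, 350, 56)
# With 1 degree of freedom, a statistic below 3.841 corresponds to P > 0.05.
print(round(chi2, 3), chi2 < 3.841)
```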
The association of the polymorphisms with the incidence of SA, according to the inheritance models, is shown in Table 2. The analysis of the rs2081687 C/T SNP showed that carriers of one or two copies of the T allele had an increased risk of developing SA under four models (Table 2). Additionally, carriers of one or two copies of the G allele of the rs9297994 G/A SNP showed a higher risk of developing SA (Table 2). Similar findings were observed with the rs10107182 C/T SNP; this analysis showed that carriers of one or two copies of the C allele had an increased risk of developing SA.
Moreover, individuals with one or two copies of the rs10504255 G allele had the highest risk of developing SA under four of the inheritance models (Table 2). Carriers of one or two copies of the rs1457043 C allele also had the highest risk of developing SA. In addition, the rs8192870 G/T polymorphism analysis showed that carriers of one or two copies of the T allele also had an increased risk of developing SA. Finally, the analysis of the rs3808607 G/T SNP showed that individuals with one or two copies of the G allele had an increased risk of developing SA under the codominant, recessive, and additive models (Table 2).

Linkage disequilibrium analysis
As shown in Figure 1, the block formed by the 7 polymorphisms considered in this study had strong linkage disequilibrium (D' > 0.85), increasing the probability that these polymorphisms are inherited together. Nonetheless, the analysis of r² showed that the rs8192870 G/T, rs3808607 G/T and rs1457043 C/T polymorphisms recombine more than the rs10504255 A/G, rs9297994 G/A, rs2081687 C/T and rs10107182 C/T SNPs (r² < 0.80). This analysis revealed two haplotypes, "CATATGT" and "TGCGCTG", that had different frequencies in patients with SA compared to controls (Table 3). The "CATATGT" haplotype was associated with protection against the development of SA, while the "TGCGCTG" haplotype represented a risk of SA (OR = 1.56, P = 5 × 10⁻⁵).

Effect of the genotypes of CYP7A1 SNPs on plasma lipids levels
Recent data have shown that cholesterol 7α-hydroxylase is associated with high plasma lipid levels, familial hypercholesterolemia, and cardiovascular diseases [10-17,25]. In this context, we created subgroups based on the genotype of each of the studied SNPs to compare BMI, blood pressure, glucose, and plasma lipid concentrations. The results of these sub-analyses demonstrated that the homozygous carriers of the minor allele of any of the 7 polymorphisms analyzed in this study had increased LDL-C levels (P < 0.05) (Table 4). Also, the rs3808607 GG genotype showed lower triglyceride concentrations (P < 0.05) (Table 4).
Since diabetes is a major risk factor for coronary artery disease that may bias the results, we performed a sub-analysis excluding these patients to confirm the contribution of genotypes in non-diabetic subjects (Supplementary Tables 3 and 4). In this subgroup, the Hardy-Weinberg equilibrium was conserved, and glucose plasma levels and HDL-cholesterol were no longer different between SA and control subjects (Supplementary Table 3). Importantly, five out of the 7 studied polymorphisms remained associated with subclinical atherosclerosis under similar inheritance models compared to the whole group (Supplementary Table 4). Consistently, the minor allele of any of the seven polymorphic sites was associated with higher LDL-C plasma levels in the SA patients after excluding patients with diabetes from the analysis.

DISCUSSION
In this study, we evaluated whether 7 polymorphisms located in the promoter and enhancer regions of the CYP7A1 gene were associated with plasma lipid levels and the incidence of SA. This gene encodes cholesterol 7α-hydroxylase, a key enzyme in cholesterol catabolism, bile acid homeostasis, and the regulation of plasma lipid levels [9-17]. The association of these CYP7A1 polymorphisms with cardiovascular diseases has been mostly explained by increased LDL-C levels. In our study, we determined that the minor alleles conferred an increased risk of developing SA. In addition, the association of these polymorphisms with SA in other populations has not been reported; our work is one of the few studies that describes the statistical relationship between these polymorphisms and coronary artery disease, acute coronary syndrome, hypercholesterolemia or diabetes [10,12-17,25]. Focusing on the rs2081687 T allele, which was associated with the risk of SA, several studies have shown the association of this allele with the risk of CAD and acute coronary syndrome [16,25-28]. In addition, an experimental study showed that this same allele was associated with high LDL-C plasma concentrations [28]. Accordingly, we observed that patients with SA who were homozygous for the rs2081687 T allele had higher plasma LDL-C concentrations than heterozygous or non-carrier patients.
Concerning the rs8192870 polymorphism, recent studies revealed that the T allele was associated with an increased risk of diabetes and acute coronary syndrome [17,25]. In our study, patients homozygous for the rs8192870 T allele also had higher LDL-C plasma levels than heterozygotes or non-carriers of this allele. Regarding the rs10504255 A/G SNP, in our study, the GG genotype was statistically associated with a higher risk of SA, as well as with increased LDL-C levels. Similar results were recently reported by our group in patients with ACS [25]. Conversely, Wang et al. reported that the G and A alleles did not differ in their effects on the regulation of gene expression, as demonstrated by cloning each sequence into the pGL4.23 reporter vector [10]. In this context, we consider that the rs10504255 A/G SNP is physiologically relevant and merits being explored in other populations with different cardiovascular diseases.
The existing information about the relationship of the rs9297994 G/A and rs10107182 C/T polymorphisms with LDL-C plasma levels and cardiovascular diseases is scarce [10]. Genome-wide association studies revealed that both polymorphisms may be related to total cholesterol and LDL-C plasma levels, and to an increased incidence of cardiovascular diseases [29-32]. Moreover, Wang et al. reported that the rs9297994 G/A and rs10107182 C/T SNPs are in linkage disequilibrium (LD) with the rs3808607 polymorphism, which is involved in the regulation of CYP7A1 gene expression [10].
Assuming that low CYP7A1 mRNA expression results in higher levels of plasma cholesterol, our findings are congruent with these results: the rs9297994 GG, rs10107182 CC, and rs3808607 GG genotypes were associated with higher LDL-C levels in patients with SA. This association was not observed in patients with acute coronary syndrome [25], probably because of the use of anti-dyslipidemic and anti-hypertensive drugs, among others, in these subjects. Taken together, this evidence suggests that these polymorphisms could be combined with other polymorphisms in a genetic panel for evaluating the risk of developing symptomatic disease. This possibility remains to be explored in future studies.
Concerning the rs3808607 G/T SNP, previous studies have shown that the G allele is associated with higher plasma LDL-C levels [10,33], but its association with cardiovascular diseases or DM2 remains controversial [15,16,34]. In this context, our study showed that not only did the homozygous carriers of the rs3808607 G allele have higher levels of LDL-C, but the rs9297994 GG and rs10107182 CC genotypes were also associated with increased LDL-C levels. Consistently, these same genotypes, i.e., rs3808607 GG, rs9297994 GG and rs10107182 CC, were more frequent in patients with SA. This association may be due to the high LD that exists between these SNPs [10]. In line with these data, a recent case-control study demonstrated that rs9297994 and rs10107182 were associated with the risk of developing ACS [25].
In contrast with our results, a previous report did not find any association between the rs1457043 C/T polymorphism and the prevalence of coronary heart disease [10]. We observed that the C allele was associated with higher plasma LDL-C levels and with an increased risk of developing SA. We also determined that the "TGCGCTG" haplotype, formed by the seven studied polymorphisms, was associated with an increased risk of developing SA, and that there is a high probability that this block segregates together (D' > 0.85). As far as we know, no studies have reported a haplotype similar to the one described in our study. In this context, taken together, our results suggest a link between CYP7A1 polymorphisms and the incidence of SA through plasma LDL-C levels. Nonetheless, previous studies have proposed that the possible mechanism is the effect of CYP7A1 gene polymorphisms on CYP7A1 mRNA expression and plasma lipid concentrations [10,11,28-33].
When SA patients were grouped by genotype, we observed significantly higher levels of LDL-cholesterol in carriers of any of the seven studied alleles. Taken together, our results suggest that polymorphisms located in the enhancer/promoter region of the CYP7A1 gene increase the risk of atherosclerosis by favoring a slight but significant increase in plasma cholesterol levels. Since CYP7A1 polymorphisms have been associated with diabetes, and given that the SA and control groups were different in diabetes frequency [15,25], we performed a statistical sub-analysis that excluded patients with diabetes. Even with lower statistical power, the minor alleles of any of the seven polymorphic sites were consistently associated with higher LDL-C plasma levels in the SA patients. These results further support the idea that polymorphic sites within the promoter/enhancer region of CYP7A1 contribute to SA via LDL-C plasma levels, independently of the incidence of diabetes.
On the other hand, the contribution of the studied polymorphisms to dyslipidemia and cardiovascular diseases remains controversial, likely due to ethnicity. In this context, the frequency of the minor alleles of the seven studied polymorphisms was lower in our population (Mexican mestizos) than in Asian and Caucasian populations (Supplementary Table 5). Nonetheless, in the African population, the frequency of the rs2081687 T, rs1457043 C, and rs3808607 G alleles was higher than in ours, but the frequency of the rs9297994 G, rs10107182 C, and rs10504255 G alleles was much lower compared to other populations (Supplementary Table 5). Taken together, our results and the different distribution of CYP7A1 polymorphisms based on ethnicity suggest that the effect of these SNPs on SA and other cardiovascular diseases merits further exploration in a multicentric study involving patients of diverse origins.

Limitations of the study
We recognize that our study has some limitations that merit consideration: 1) As a consequence of the low frequency of the minor alleles in the studied population, there were few carriers of the genotypes and haplotypes associated with the risk of SA. 2) This study was not matched by sex and age; the proportion of men to women in the SA group was higher than in the control group. Even though these variables were used to adjust the statistical analysis, this issue should be considered when interpreting the study results. 3) This study did not demonstrate a cause-effect relationship between CYP7A1 polymorphisms and LDL-cholesterol plasma levels and SA risk; instead, it only showed a statistical association between variables. Therefore, future experimental studies, such as GWAS, exome sequencing studies, and exome chips, are needed to ensure the validity, reliability, and accuracy of these polymorphisms in clinical practice, as well as functional studies that demonstrate the effect of the polymorphisms on CYP7A1 mRNA expression.

CONCLUSION
In conclusion, our results have shown that the rs2081687 C/T, rs9297994 G/A, rs10107182 C/T, rs10504255 A/G, rs1457043 C/T, rs8192870 G/T, and rs3808607 G/T SNPs of the CYP7A1 gene conferred an increased risk of developing SA, either individually or as part of a haplotype (TGCGCTG). In addition, our study showed that carriers of the rs2081687 TT, rs9297994 GG, rs10107182 CC, rs10504255 GG, rs1457043 CC, rs8192870 TT, and rs3808607 GG genotypes had increased LDL-C levels in our population. Lastly, based on our results and the genetic distribution of these polymorphisms in our population, these SNPs deserve to be studied in populations of different ethnic origins to establish the precise role of these polymorphisms in susceptibility to developing SA and other cardiovascular diseases.

TABLES AND FIGURES WITH LEGENDS
nucleotide polymorphism; SA, Subclinical Atherosclerosis; OR, odds ratio; CI, confidence interval; pC, corrected P value. *The inheritance models were designed based on the minor allele. The co-dominant model compares the subgroup of homozygous individuals carrying the minor allele with homozygotes of the major allele. The dominant model compares the subgroup of homozygous individuals carrying the major allele with the subgroup formed by heterozygotes and minor allele homozygotes. The recessive model compares the subgroup formed by heterozygotes and major allele homozygotes vs homozygotes of the minor allele. The over-dominant model compares the subgroup formed by homozygotes of the major allele and homozygotes of the minor allele vs heterozygotes. The additive model compares the subgroup of major allele carriers with both heterozygotes and minor allele homozygotes. All models were analyzed by logistic regression including gender, age, hypertension, type 2 diabetes mellitus, and smoking habit as confounding variables.
and the allele combination of the haplotypes is according to the position on chromosome 8q11-12 (rs2081687 T/C-rs9297994 G/A-rs10107182 C/T-rs10504255 G/A-rs1457043 C/T-rs8192870 T/G-rs3808607 G/T), as depicted in Figure 1.

Table 1.
Anthropometric and clinical characteristics of the patients with subclinical atherosclerosis (SA) and control individuals. Data were collected at recruitment. Gender, hypertension, type 2 diabetes mellitus, and smoking are expressed as n (frequency), and P values were calculated using the chi-square test.

Table 2.
Association of the CYP7A1 polymorphisms with subclinical atherosclerosis (SA) according to the inheritance models.

Table 3.
Distribution of haplotypes between CYP7A1 gene polymorphisms in the study groups. Haplotypes were analyzed using Haploview software, version 4.1 (Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA). SA, Subclinical Atherosclerosis; Hf, Haplotype frequency; P = P value. *The polymorphisms order,