Association of the NCAN-TM6SF2-CILP2-PBX4-SUGP1-MAU2 SNPs and gene-gene and gene-environment interactions with serum lipid levels

This study investigated the association of the NCAN-TM6SF2-CILP2-PBX4-SUGP1-MAU2 SNPs and their gene-gene and gene-environment interactions with serum lipid levels in the population of Southwest China. Twelve SNPs (rs2238675, rs2228603, rs58542926, rs735273, rs16996148, rs968525, rs17216525, rs12610185, rs10401969, rs8102280, rs73001065 and rs150268548) were genotyped in 1248 hyperlipidemia patients and 1248 normal subjects. The allelic and genotypic frequencies of the detected SNPs differed significantly between the normal and hyperlipidemia groups (P < 0.05-0.001), and the 12 SNPs were associated with hyperlipidemia (P < 0.004-0.0001). Four haplotypes (NCAN C-C, CILP2 G-T, PBX4-SUGP1 G-C, and MAU2 C-A-G-T) and five gene-gene interaction haplotypes (rs2238675C-rs2228603C, rs16996148G-rs17216525T, rs12610185G-rs10401969C, rs73001065G-rs8102280A-rs150268548G-rs968525C and rs73001065C-rs8102280A-rs150268548G-rs96852) showed a protective effect, whereas four other haplotypes (TM6SF2 T-A, TM6SF2 C-A, MAU2 G-G-G-C and MAU2 C-G-A-T) and four gene-gene interaction haplotypes (rs58542926C-rs735273A, rs58542926T-rs735273A, rs73001065G-rs8102280G-rs150268548G-rs968525C, and rs73001065C-rs8102280G-rs150268548A-rs968525T) showed the opposite effect on hyperlipidemia (P < 0.05-0.0001). Significant three-locus models involving SNP-SNP, SNP-environment, and haplotype-haplotype interactions were detected (P < 0.05-0.0001). Individuals carrying some genotypes and haplotypes had a lower prevalence of hyperlipidemia, whereas carriers of other genotypes and haplotypes had a higher prevalence. Associations of the NCAN-TM6SF2-CILP2-PBX4-SUGP1-MAU2 SNPs and their gene-gene and gene-environment interactions with hyperlipidemia were thus observed in the population of Southwest China.

of atherosclerosis; it can lead to oxidative stress and chronic inflammation and induce damage to macromolecules, endothelial cell apoptosis, and proliferation and migration of vascular smooth muscle cells, all of which are involved in the formation of atheroma, leading to the development of atherosclerosis [1]. Although people strive to change their lifestyle and take medications such as statins and other lipid-lowering drugs, the incidence of CVD is still increasing [2]. Many individuals have difficulty reaching standard serum lipid levels even after taking medications, and some of them suffer from side effects [3]. Thus, it is essential to discover variants that can serve as new markers regulating serum lipid profiles, which may facilitate efforts to improve hyperlipidemia and thus reduce the probability of CVD. [4,5]. These loci are located in regions associated with morbidity due to coronary artery disease (CAD) [6][7][8]. Kozlitina et al. [9] demonstrated that hepatic triglyceride content (HTGC) was associated with TM6SF2 (rs58542926 c.449 C > T). Several in vitro experiments have found that knockdown of TM6SF2 decreases the synthesis of apolipoprotein (Apo) B and triglyceride (TG)-rich lipoproteins [9,10]. Moreover, TM6SF2 knockdown causes accumulation of cellular TG, reported as a significant increase in the number and size of lipid droplets at the subcellular level [11]. In contrast, overexpression of TM6SF2 reduces the number and size of lipid droplets [11]. Some previous studies linked the adjoining genes NCAN and PBX4 at the 19p13.11 locus with dyslipidemia, and these relationships may now be attributed to TM6SF2 [12][13][14]. Rašlová et al. [15] reported an association between a CILP polymorphism and the esterification rate of cholesterol in plasma high-density lipoprotein, and thus an effect on lipid metabolism.
Zhou et al. [8] found that the rs16996148 SNP in NCAN-CILP2 was significantly associated with reduced CAD risk in Chinese populations. Luptakova et al. [16] reported that the minor T allele of CILP2 may protect against elevated serum lipid and lipoprotein levels. It has also been documented that after the HepG2 and Huh7 cell lines were transfected with siRNAs for SUGP1, the transcript concentrations and protein levels of SUGP1 were reduced by 45-70% and 72-91%, respectively [17]. Moreover, overexpression of SUGP1 was correlated with greater elevation in total cholesterol (TC) and TG, both in vivo and in vitro [17]. In addition, overexpression of SUGP1 led to greater activity of the hepatic 3-hydroxy-3-methylglutaryl coenzyme A (HMG CoA) reductase (HMGCR) enzyme, but there was no change in the transcript level of hepatic HMGCR [17]. MAU2, which is located close to NCAN on chromosome 19, has been associated with serum TC, low-density lipoprotein cholesterol (LDL-C) and TG [18,19].
The causes of these variations have not been fully elucidated, but hyperlipidemia is considered a complex disease characterized by subtle interpatient variability, in which host genetic factors and environmental interactions generate disease phenotypes and determine disease progression. Although a series of studies have shown that environmental factors contribute to the presence of dyslipidemia [20][21][22], genetic factors are also known to play a vital role and can determine how an individual responds to challenges [6,12]. Our previous study of the BCL3-PVRL2-TOMM40 SNPs, located on chromosome 19 p11, found that the dominant model of the rs157580 and rs8100239 SNPs and some haplotypes and gene-gene interaction haplotypes were protective, whereas other haplotypes and gene-gene interaction haplotypes, together with the dominant model of the rs6859, rs3810143, rs519113 and rs10402271 SNPs, were associated with increased morbidity [23]. Although we have conducted substantial research and made extensive progress in identifying genetic modifiers, the relationship between hyperlipidemia and other gene polymorphisms has not been fully elucidated. In this study, we focus on the association of the NCAN, TM6SF2, CILP2, PBX4, SUGP1 and MAU2 single nucleotide variants, gene-environment interactions and gene-gene interactions with serum lipid levels. Patterns of the relationships among SNPs throughout the genome can be described in terms of linkage disequilibrium (LD) and haplotypes [24].

Demographic and biochemical characteristics
Table 1 describes the general characteristics of the 2,496 participants in the two groups. Systolic blood pressure, diastolic blood pressure, pulse pressure, TC, TG, high-density lipoprotein cholesterol (HDL-C) and LDL-C levels were significantly higher in the hyperlipidemia group than in the normal group (P < 0.05-P < 0.001 for all), whereas body weight, waist circumference, and blood glucose levels were significantly lower in the hyperlipidemia group than in the normal group (P < 0.001 for all). However, there was no significant difference in age, sex ratio, height, body mass index (BMI), smoking status, alcohol consumption, ApoA1 or ApoB levels, or the ApoA1/ApoB ratio between the two groups (P > 0.05 for all).
Table 1. Comparison of demographic, lifestyle characteristics and serum lipid levels between the normal and hyperlipidemia groups.
HDL-C: high-density lipoprotein cholesterol. LDL-C: low-density lipoprotein cholesterol. Apo: Apolipoprotein. 1 Mean ± SD determined by t-test. 2 Because the data were not normally distributed, the value of triglyceride was presented as median (interquartile range), and the difference between the two groups was determined by the Wilcoxon-Mann-Whitney test.
Figure 1 shows the locations, as well as the partial nucleotide sequences, of the NCAN, TM6SF2, CILP2, PBX4, SUGP1 and MAU2 SNPs, which are located on chromosome 19. The genotypes of the 12 SNPs were confirmed by direct sequencing. As shown in Table 2, the genotypic distributions of the 12 SNPs conformed to Hardy-Weinberg equilibrium (HWE) in both the hyperlipidemia and normal groups. The genotypic and allelic frequencies of the 12 SNPs in NCAN, TM6SF2, CILP2, PBX4, SUGP1 and MAU2 differed significantly between the hyperlipidemia and normal groups (Tables 2 and 3). The allelic frequencies of rs2238675C, rs2228603T, rs58542926T, rs735273G, rs16996148T, rs17216525T, rs12610185A, rs10401969T, rs73001065G, rs8102280G, rs150268548A, and rs968525T were significantly greater in hyperlipidemic individuals than in normal subjects (P < 0.05-P < 0.001 for all).
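As a rough illustration of the HWE check mentioned above, the following minimal sketch compares observed genotype counts at a biallelic SNP with the counts expected under HWE using a 1-df chi-square test. The function name `hwe_chi_square` and the example counts are hypothetical and are not taken from this study; the software and exact test used by the authors are not stated here.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium at a biallelic SNP.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts that match HWE expectations exactly.
chi2, p_value = hwe_chi_square(25, 50, 25)
```

A large P value (for example P > 0.05) would be read as no evidence of departure from HWE, which is the sense in which the genotype distributions are said to conform to HWE.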

Haplotype-based association with hyperlipidemia
As presented in Table 4, the most common haplotypes , and MAU2 C-A-G-T (G13) haplotypes were significantly different between the hyperlipidemia and normal groups (P < 0.05 for all). In addition, the G2, G6, G8, and G13 haplotypes showed a protective effect, whereas the G3, G5, G9 and G12 haplotypes showed an inverse effect (P < 0.05-0.001). The detected sites examined by multiple-locus LD were not fully statistically independent in the participants. Figure 3 presents the LD and haplotype blocks for the two groups combined. Figure 4 shows that carriers of the detected gene-gene interaction haplotypes had higher serum TC (rs58542926C-rs735273A and rs73001065C-rs8102280G-rs150268548A-rs968525T), LDL-C (rs73001065G-rs8102280G-rs150268548G-rs968525C and rs73001065C-rs8102280G-rs150268548A-rs968525T), and TG (rs58542926T-rs735273A) levels than the haplotype non-carriers.
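To make the notion of pairwise LD concrete, the sketch below computes the standard measures D, |D'| and r² for two biallelic loci from one haplotype frequency and the two allele frequencies. The function name `pairwise_ld` and the example frequencies are hypothetical; this is only an illustration of the definitions, not a reproduction of the LD analysis reported here.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Pairwise LD between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, abs_D_prime, r2). All inputs are hypothetical illustration values.
    """
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    abs_d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, abs_d_prime, r2

# Complete LD: allele A always occurs together with allele B.
d, abs_d_prime, r2 = pairwise_ld(0.5, 0.5, 0.5)
```

Values of |D'| and r² near 1 indicate that the two sites are strongly correlated and therefore not statistically independent, which is why haplotype-based analysis is used alongside single-SNP tests.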

Gene-gene (G × G) interaction-based association with hyperlipidemia
As shown in Table 5, the most common G × G interaction was C- between the normal and hyperlipidemia groups (P < 0.05 for all). Meanwhile, the G × G interactions of H1, H2, H6, H8 and H9 contributed to a protective effect, while the G × G interactions of H3, H5 and H7 showed an inverse effect. The H2, H6, H8 and H9 carriers had low TC levels, but the H5 and H7 carriers had high TC levels; the H1 carriers had low TG levels, but the H3 carriers had high TG levels; the H7 carriers had high LDL-C levels, and the H9 carriers had low LDL-C levels; and the H5 carriers had high ApoA1 levels in both the normal and hyperlipidemia groups (Figure 5; P < 0.006 for all).

G × G and gene-environment (G × E) interactions on hyperlipidemia
The entropy-based interaction dendrogram (Figure 6) and the proportional hazard model results (Figure 7) showed the strongest synergy for the SNP-SNP interaction between rs735273 and rs16996148 and for the haplotype-haplotype interaction between G10 and G6. In contrast, a redundancy effect was observed for the SNP-environment interaction (rs16996148 and diabetes), the haplotype-environment interaction (G6 and diabetes), the gene-gene interaction (H3 and H6) and the gene-environment interaction (H6 and diabetes). We also found that the rs735273 AA and rs16996148 GT/GG genotypes increased the risk of hyperlipidemia, whereas the rs735273 AG/GG, rs16996148 TT, rs735273 AG/GG and rs16996148 GT/TT genotypes decreased the risk of hyperlipidemia. The SNP-environment interaction between rs16996148 and diabetes indicated that the rs16996148 SNP decreased the risk of hyperlipidemia, whereas rs16996148 GT/TT with diabetes and rs16996148 TT with diabetes increased the risk of hyperlipidemia. The haplotype-haplotype interaction showed that G10 (MAU2 G-A-G-C) and G6 (CILP2 G-T) carriers had a reduced risk of hyperlipidemia compared with G10 or G6 carriers. With regard to the gene-gene interaction between H3 and H6 carriers, the latter showed a lower risk of hyperlipidemia, while the former showed an increased risk of hyperlipidemia. When the haplotype-environment interaction was considered, G6 (CILP2 G-T) carriers with diabetes had an increased risk of hyperlipidemia. A similar result was found for the gene-environment interaction between H6 (C-C-C-A-G-C-G-C-G-A-G-C) carriers and diabetes.

DISCUSSION
The major new findings in this study were as follows:  (4) We also found different interactions that augmented the risk of hyperlipidemia.
Hyperlipidemia is a main risk factor for CVD, which accounts for approximately 4 million deaths each year worldwide [25,26]. High levels of TC can contribute to the risk of CAD [27], ischemic cerebrovascular accident [28], aortic dissection and peripheral arterial disease [29]. TG levels have been shown to be strongly associated with non-alcoholic fatty liver disease (NAFLD) and metabolic syndrome [30]. NAFLD and metabolic syndrome have also been reported to be independent risk factors for subclinical atherosclerosis [31,32].
This study found that variants of NCAN, TM6SF2, CILP2, PBX4, SUGP1 and MAU2 were related to serum lipid concentrations. Moreover, there were significant differences in the genotypic and allelic frequencies of the 12 SNPs between the normal and hyperlipidemia groups. These outcomes suggest that genetic factors are associated with the prevalence of hyperlipidemia. When we analyzed the relationship between SNPs and hyperlipidemia, the rs735273 and rs16996148 SNPs were found to decrease the risk of hyperlipidemia. However, the SNP-environment interaction showed that subjects with the rs16996148 SNP and diabetes had an increased risk of hyperlipidemia. We found similar results for the haplotype-environment, G × G and G × E interactions. A plausible interpretation of these findings is that metabolic disorders arise from the combined influence of behavior, environmental factors and genetic factors [39,40]. Cereals make up more than 50% of the diet of southern Chinese populations [41], and they lack some important micronutrients, such as vitamins and dietary fiber. These populations prefer rice and refreshing sour, spicy and sweet food. Furthermore, they prefer food rich in saturated fatty acids, such as pork, beef and animal offal [42]. Long-chain dietary saturated fatty acids have detrimental effects on blood lipid metabolism, in particular raising plasma TC and TG levels [43,44].
Unhealthy lifestyles (e.g., unhealthy diet, smoking, excessive alcohol intake and lack of exercise) have been closely connected with abnormal serum lipid levels [45]. Compared with the normal group, there was a higher percentage of smoking and alcohol intake in the hyperlipidemia group. A large number of adults in Southwest China enjoy drinking. Most people who live in rural areas make wine themselves from corn, cereals and cassava. It has been documented that alcohol can elevate serum HDL-C levels and benefit CAD [46,47]. However, it has also been reported that the elevation of HDL-C levels is offset by increased smoking. Smoking can increase the serum concentrations of TC, TG and LDL-C and decrease serum levels of HDL-C [48,49]. This may explain the serum lipid levels observed in the two groups. Modifiable or nonmodifiable risk factors might influence genetic variants identified in GWAS of disease. Recently, a number of variants have been linked with lifestyle behaviors and health outcomes in GWAS. As the tobacco and alcohol examples discussed above illustrate, behavioral phenotypes can be predicted by a genetic variant, as has been shown in GWAS of disorders that informally interact with these activities. It is important to explain GWAS findings [50].
Dyslipidemia is universally recognized as the result of a combination of genetic and environmental factors [51,52]. China is a multiethnic country with 56 ethnic groups [53]. The Han are the largest ethnic group, and the other 55 ethnic groups are distributed across different areas of the country. The genotypic and allelic frequencies of many SNPs in some genes are inconsistent across racial/ethnic groups [54][55][56]. There may also be ethnic differences in lifestyle and environmental factors, as well as in genetic background. To the best of our knowledge, the TM6SF2 rs58542926 SNP increased the risk of NAFLD in the eastern Chinese Han population [57]. The rs16996148 SNP in NCAN-CILP2 or NCAN/CILP2/PBX4 was significantly associated with dyslipidemia in the Han populations of central and eastern China [8,58]. These studies suggest that genetic variants of those genes on chromosome 19p13 confer susceptibility to dyslipidemia in Chinese populations. However, the relationship between dyslipidemia and SUGP1 and MAU2 is not clear in Chinese populations, and evidence on the association of SNPs and gene-gene and gene-environment interactions with dyslipidemia is still limited. With the rapid development of biomedical technology, we are entering an era of precision medicine, which seeks to identify and classify individual patients so that optimal treatment decisions can be made. It is therefore essential to explore the effects of the NCAN-TM6SF2-CILP2-PBX4-SUGP1-MAU2 SNPs and gene-gene and gene-environment interactions on serum lipid levels in Southwest China and in other Chinese populations. These results may help to guide precise treatment of dyslipidemia and decrease the risk of CVD.
TM6SF2 rs58542926-rs735273, CILP2 rs16996148-rs17216525, PBX4-SUGP1 rs12610185-rs10401969, and MAU2 rs73001065-rs8102280-rs150268548-rs968525.
In conclusion, this study shows potential interactions among the NCAN, TM6SF2, CILP2, PBX4, SUGP1 and MAU2 SNPs, the environment and serum lipid levels in hyperlipidemia subjects. Our findings also showed that the interactions increased the risk of hyperlipidemia beyond that detected by single-locus tests. In addition, these factors exhibit distinct synergistic or redundancy effects on morbidity.

Clinical data
The clinical data were obtained by means of a universally standardized technique [38]. Standardized questionnaires were administered to acquire details of demographics, socioeconomic standing and lifestyle dynamics. The status of cigarette smoking was categorized into ≤ 20 cigarettes per day and > 20 cigarettes per day [59]. Alcohol intake was classified based on the grams of alcohol per day: ≤ 25 and > 25 [23]. Details regarding other factors, such as height, weight, waist circumference, blood pressure, and BMI (kg/m²), were also acquired.

Biochemical measurements
Venous blood samples were acquired following 12 h of fasting. TC, HDL-C, LDL-C and TG concentrations in serum were detected by means of Tcho-1, TG-LH (RANDOX Laboratories, UK), Cholestest N HDL, and Cholestest LDL (Daiichi Pure Chemicals Co., Ltd., Japan) kits, respectively. ApoA1 and ApoB concentrations in serum were determined by immunoassay (RANDOX Laboratories). Detection of all samples was completed with an autoanalyzer (Hitachi Ltd., Japan) [60].

Diagnostic criteria
We employed univariate analysis to test associations between genotypes, haplotypes, G × G interactions and lipid phenotypic variations. P < 0.004 was considered statistically significant for the association between any variant and lipid phenotypic variations (equivalent to P < 0.05 after adjusting for 12 independent tests by the Bonferroni correction). The associations of genotypes, alleles, haplotypes and G × G interactions with lipid phenotypic variants were evaluated using unconditional logistic regression. Other parameters were adjusted for in the data analysis. The best interaction model among genes, SNPs and environmental exposures was screened by means of generalized multifactor dimensionality reduction [64]. The cross-validation consistency score was used to identify the best model among all candidate interactions. The testing balanced accuracy measures the degree to which the interaction correctly predicts case-control status, with scores between 0.50 (indicating that the model predicts no better than chance) and 1.00 (indicating perfect prediction). Finally, to evaluate whether an identified model was significant, we used a sign test or a permutation test for prediction accuracy.
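Two of the quantities described above reduce to simple arithmetic, sketched here for illustration only. The Bonferroni threshold divides the nominal significance level by the number of tests (0.05 / 12 is about 0.0042, reported in the text as 0.004), and balanced accuracy is the mean of sensitivity and specificity. The function names and the confusion-matrix counts are hypothetical and are not outputs of the generalized multifactor dimensionality reduction software.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity for a case-control classifier.

    tp/fn: cases classified correctly/incorrectly
    tn/fp: controls classified correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0

# Threshold for 12 independent tests at a nominal level of 0.05.
threshold = bonferroni_threshold(0.05, 12)

# Hypothetical counts: a model that classifies at chance level.
chance_level = balanced_accuracy(tp=50, fn=50, tn=50, fp=50)
```

Because balanced accuracy averages the two class-wise rates, it stays at 0.50 for a chance-level model whatever the ratio of cases to controls.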

Availability of data and materials
The datasets generated during the present study are not publicly available, because detailed genetic information of each participant was included in these materials.