The association of cholesterol absorption gene Numb polymorphism with Coronary Artery Disease among Han Chinese and Uighur Chinese in Xinjiang, China

Hypercholesterolemia is a major risk factor for coronary artery disease (CAD). As Numb is an important regulator of intestinal cholesterol absorption and plasma cholesterol level, the aim of the present study was to assess the association between human Numb gene polymorphisms and CAD among Han and Uighur Chinese. We conducted two independent case-control studies in Han Chinese (384 CAD patients and 433 controls) and Uighur Chinese (506 CAD patients and 351 controls) subjects. All subjects were genotyped for four SNPs (rs12435797, rs2108552, rs1019075 and rs17781919), which were used as genetic markers for the human Numb gene. Genotyping was undertaken using the TaqMan SNP genotyping assay, and the subjects' ethnicity and gender were considered in the analysis. We found that rs2108552 was associated with CAD in the dominant model (CC vs CG + GG) for the total Han Chinese population (n = 200) and Han Chinese males (n = 115) (P = 0.004 and P = 0.001, respectively). The difference remained statistically significant after multivariate adjustment (total: OR = 1.687, P = 0.004; male: OR = 1.498, P = 0.006). Further, for the total (n = 817) and male (n = 490) Han Chinese, the frequency of the T-C-T-C haplotype was significantly higher in CAD patients than in controls (P = 0.004 and P = 0.002, respectively), and the frequency of the G-G-T-C haplotype was significantly lower in CAD patients than in controls (P = 0.013 and P = 0.007, respectively). In addition, for the total (n = 857) and male (n = 582) Uighur Chinese, rs12435797 was associated with CAD in the additive and recessive models (additive: P = 0.021 and P = 0.009; recessive: P = 0.048 and P = 0.034). However, the difference did not remain statistically significant after multivariate adjustment. The overall distribution of rs2108552, rs1019075 and rs17781919 genotypes and alleles, and the frequencies of the haplotypes established by the four SNPs, showed no significant difference between CAD patients and control subjects in the total, male and female Uighur Chinese. The results of this study indicate that the CC genotype of rs2108552 and the T-C-T-C haplotype of the Numb gene are possible genetic risk markers, and that the G allele and the G-G-T-C haplotype are possible protective genetic markers, for CAD in male Han Chinese.


Introduction
Cholesterol is an important structural component of cell membranes and a precursor of bile acids, vitamin D, and steroid hormones [1]. Accumulating studies have established that a high blood cholesterol concentration is closely related to an increased risk of coronary artery disease (CAD), which is becoming increasingly prevalent and is a leading cause of death in developed countries [2][3][4]. In addition to environmental factors, plasma cholesterol level is also influenced by genetic factors such as single nucleotide polymorphisms (SNPs) [5].
In the human body, cholesterol homeostasis is mainly maintained by de novo synthesis, intestinal absorption, and biliary and fecal excretion. Studies have shown that an elevated plasma cholesterol level can be reduced by inhibiting exogenous cholesterol absorption, thereby preventing the development of atherosclerotic cardiovascular disease [6][7][8][9].
Several genes, such as the one encoding the Niemann-Pick C1-Like 1 (NPC1L1) protein [10,11], are involved in the intestinal cholesterol pathway [12][13][14][15][16][17]. The role of NPC1L1 in cholesterol absorption is mainly attributed to clathrin-dependent endocytosis [18,19]. Pei Shan Li et al. [20] revealed an interaction between the clathrin adaptor Numb and NPC1L1 during the regulation of cholesterol absorption. They also found that Numb recognizes a particular endocytic motif (YVNXXF) in the C-terminus of NPC1L1, completes the internalization by recruiting clathrin/AP2, and transports cholesterol to the endocytic recycling compartment (ERC) along microfilaments.
Jian Wei et al. [29] confirmed a remarkable correlation between the Numb polymorphism G595D (rs17781919) and a low concentration of LDL-C in humans. They also proposed that rs17781919 influences Numb activity during NPC1L1 internalization and reduces cholesterol absorption. In addition, it is generally accepted that a low plasma concentration of LDL-C is an important factor in delaying the development of atherosclerotic cardiovascular disease.
No case-control studies have been conducted to assess the association between the Numb gene and CAD. Therefore, the aim of this study is to clarify the relationship between Numb gene polymorphisms and CAD in Han Chinese and Uighur Chinese.

Subjects
All patients with CAD and control subjects were recruited from The First Affiliated Hospital of Xinjiang Medical University from January 2007 to December 2013. Han Chinese and Uighur Chinese subjects were studied independently. The CAD group included 384 Han Chinese patients and 506 Uighur Chinese patients, and the control group included 433 Han Chinese and 351 Uighur Chinese. CAD was defined as the presence of at least one significant coronary artery stenosis of more than 50 % of the luminal diameter on coronary angiography. Control subjects also underwent coronary angiography and were confirmed to be free of coronary artery stenosis. Moreover, these subjects showed no clinical or electrocardiographic evidence of myocardial infarction (MI) or CAD, as described previously [30,31]. Some control subjects had cardiovascular risk factors such as essential hypertension (EH), diabetes mellitus (DM) or hyperlipidemia, but none had a history of MI or CAD. Information on EH, DM, hyperlipidemia and smoking was collected from all study subjects, and these factors were matched individually between the CAD and control cohorts. Hypertension was defined as the use of antihypertensive medication, or a mean of three measurements of systolic blood pressure (SBP) > 140 mmHg or of diastolic blood pressure (DBP) > 90 mmHg. Diabetes mellitus was diagnosed according to the criteria of the American Diabetes Association. Hyperlipidemia was defined as total plasma cholesterol > 6.22 mmol/L or plasma triglycerides > 2.26 mmol/L and/or the current use of lipid-lowering drugs with an established diagnosis of hyperlipidemia.
Smoking status was dichotomized as smokers (current and ex-smokers) or non-smokers [32]. CAD patients and control subjects were free of malignancy, connective tissue disease, impaired renal function, valvular disease and chronic inflammatory disease.

Anthropometric and biochemical variables measurement
Height and body weight were measured as described previously [33], and body mass index (BMI) was calculated by dividing the weight in kilograms by the square of the height in meters. The WHO Asia-Pacific criterion (BMI ≥ 25 kg/m²) was used to define obesity, as described previously [33]. Blood urea nitrogen (BUN), creatinine (Cr), uric acid, total cholesterol (TC), triglyceride (TG), low density lipoprotein-cholesterol (LDL-C), and high density lipoprotein-cholesterol (HDL-C) were measured using chemical analysis equipment (Dimension AR/AVL Clinical Chemistry System, Newark, NJ) in the Clinical Laboratory Department of The First Affiliated Hospital of Xinjiang Medical University [34,35]. The Friedewald formula was used to calculate very low density lipoprotein (VLDL) [36] as follows: VLDL = 1/5 of the plasma TG level (mmol/L). The formula requires TG < 4 mmol/L, and the reference range for VLDL in our study was 0.11-0.34.
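The two derived quantities described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the text, not the authors' software: the function names are hypothetical, the obesity cut-off (25 kg/m²) and the TG limit (4 mmol/L) are taken from the paragraph above, and the divisor of 5 is exposed as a parameter because it is simply the value quoted in the text.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi_value: float, cutoff: float = 25.0) -> bool:
    """Obesity flag using the BMI >= 25 kg/m^2 criterion quoted in the text."""
    return bmi_value >= cutoff


def vldl_from_tg(tg_mmol_l: float, divisor: float = 5.0, tg_limit: float = 4.0):
    """VLDL estimated as TG / divisor, as stated in the text.

    Returns None when TG is at or above the limit, because the text
    requires TG < 4 mmol/L for the formula to be applied.
    """
    if tg_mmol_l >= tg_limit:
        return None
    return tg_mmol_l / divisor


if __name__ == "__main__":
    value = bmi(70.0, 1.75)          # illustrative subject, not study data
    print(round(value, 2), is_obese(value))
    print(vldl_from_tg(1.5))
```

Returning None above the TG limit keeps an out-of-range estimate from silently entering later comparisons.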

Ethical approval of the study protocol
All participants have given their written informed consent and explicit permission for DNA analysis as well as for the collection of relevant clinical data. This study was approved by the Ethics Committee of The First Affiliated Hospital of Xinjiang Medical University (Urumqi, China) and was conducted by strictly following the requirements of the Declaration of Helsinki.

SNP selection
The human Numb gene is located on chromosome 14q24.3 and encodes a protein of 651 amino acids. It contains 13 exons separated by 12 introns. A total of 3781 SNPs of the human Numb gene are listed in the National Center for Biotechnology Information SNP database (http://www.ncbi.nlm.nih.gov/snp). For the current study, we screened the HapMap phase I & II database with Haploview 4.0 software for tag SNPs of the Numb gene and selected three SNPs (rs2108552, rs12435797, and rs1019075). Each of them met the criteria of a minor allele frequency (MAF) ≥ 0.1 and linkage disequilibrium patterns with r² ≥ 0.8 as a cut-off [30]. We also included rs17781919 of the Numb gene, which has been associated with LDL-C [29]. rs2108552, rs12435797, and rs1019075 are located in introns. rs17781919 is located in exon 13 and is a non-synonymous variant: a C-to-T nucleotide substitution that replaces glycine with aspartic acid at amino acid position 595. The four SNPs (rs12435797, rs2108552, rs1019075 and rs17781919) are listed in order of increasing distance from the 5′ end of the Numb gene (Fig. 1).

Genotyping
Blood samples were collected from all participants in ethylenediaminetetraacetic acid (EDTA) anticoagulant tubes, and genomic DNA was extracted from peripheral leukocytes by the standard phenol-chloroform method [37]. Samples were stored at −80 °C until use. DNA was diluted to a concentration of 50 ng/μL for real-time PCR. Genotyping was undertaken using the TaqMan® SNP Genotyping Assay (Applied Biosystems) [38]. The probes and primers used in the TaqMan® SNP Genotyping Assays (ABI) were selected according to the information on the ABI website (http://www.appliedbiosystems.com). Allele-specific fluorogenic probes were hybridised to the template in the first step of the 5′ nuclease assay. During the polymerase chain reaction (PCR), the 5′ nuclease activity of the Taq polymerase made allele discrimination possible: cleavage of a hybridised probe results in increased emission of its reporter dye. The probes include a 3′ minor groove binding group that hybridises to single-stranded targets with greater sequence specificity than conventional DNA probes. This reduces nonspecific probe hybridisation, which leads to low background fluorescence in the 5′ nuclease PCR assay (TaqMan; Applied Biosystems). Two unlabeled PCR primers and two allele-specific probes were required for each 5′ nuclease assay. Each probe was labeled at the 5′ end with one of two reporter dyes; in the present study, VIC and FAM were used. PCR amplification was performed using 1 μL of DNA, 3 μL of TaqMan Universal Master Mix, 1.95 μL of ddH2O, and 0.05 μL of TaqMan SNP Genotyping Assay Mix (40×), with final concentrations of 331.2 nM for the primers and 73.6 nM for the probes. Thermal cycling conditions were 95 °C for 10 min, followed by 45 cycles of 95 °C for 10 s and 60 °C for 1 min. Thermal cycling was undertaken on an Applied Biosystems 7900HT Standard Real-Time PCR System, and all 96-well plates were read using Sequence Detection Systems (SDS) automation controller software v2.3 (ABI).

Statistical analysis
All statistical analyses were performed using SPSS 16.0 software for Windows (SPSS Inc., Chicago, IL, USA). Continuous variables were expressed as mean ± standard deviation, and differences in continuous variables between CAD patients and control subjects were compared using an independent-samples t-test. Chi-square analysis was used to test deviations of the genotype distribution from Hardy-Weinberg equilibrium and to determine differences in allele or genotype frequencies between patients and controls. Finally, logistic regression analyses with effect ratios (odds ratio [OR] and 95 % confidence interval [CI]) were used to assess the contribution of the major risk factors.
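As an illustration of two of the calculations named above, the following Python sketch tests a genotype distribution against Hardy-Weinberg equilibrium with a one-degree-of-freedom chi-square test and computes an unadjusted odds ratio with a Woolf (log-based) 95 % confidence interval from a 2 × 2 table. It is a minimal sketch under stated assumptions, not a reproduction of the SPSS analysis: the function names are hypothetical, the counts in the usage lines are invented for illustration, and the adjusted ORs reported in this study come from multivariable logistic regression, which is not shown here.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 df) of genotype counts against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p                              # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted OR and Woolf 95 % CI for the 2 x 2 table [[a, b], [c, d]].

    a, b: cases with / without the genotype; c, d: controls with / without it.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the study.
    print(hwe_chi_square(25, 50, 25))
    print(odds_ratio_ci(20, 10, 10, 20))
```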
Based on the genotype data of the genetic variations, we performed haplotype-based case-control analyses using the expectation maximization algorithm and the software SNPAlyze version 3.2 (Dynacom, Yokohama, Japan). The SHEsis platform was used to verify the reliability of the SNPAlyze results [38]. In the haplotype-based case-control analysis, haplotypes with a frequency of < 0.03 were excluded. The frequency distribution of the haplotypes was calculated by performing a permutation test using the bootstrap method. P < 0.05 was considered statistically significant.
For the analysis to succeed, all variants should be located in one haplotype block, which is indicated by a large |D'| value (near 1) between each pair of SNPs. When the r² value for a pair of variants is large (near 1), one of the two variants is not needed. The LD analysis was performed using four SNP pairs. We used |D'| values of > 0.25 to assign SNP locations to one haplotype block. SNPs with an r² value > 0.5 were selected as tagged.
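The two LD measures used above can be computed from haplotype and allele frequencies as in the following Python sketch. It illustrates the standard definitions of D, |D'| and r² for a pair of biallelic SNPs and is not the implementation used by SNPAlyze, SHEsis or Haploview; the function names are hypothetical, and the 0.25 threshold in `same_block` is the value quoted in the text.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (|D'|, r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


def same_block(d_prime: float, threshold: float = 0.25) -> bool:
    """Block assignment rule quoted in the text: |D'| above the threshold."""
    return d_prime > threshold


if __name__ == "__main__":
    # Illustrative frequencies only; these are not estimates from the study.
    d_prime, r2 = ld_stats(p_ab=0.25, p_a=0.30, p_b=0.40)
    print(round(d_prime, 3), round(r2, 3), same_block(d_prime))
```

The example frequencies show why the two measures are thresholded separately: |D'| can be high while r² stays modest when the allele frequencies of the two SNPs differ.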
Three genetic models were considered in the Chi-square analysis: dominant (common allele homozygotes coded as 1, heterozygotes and recessive allele homozygotes coded as 2), recessive (recessive allele homozygotes coded as 1, common allele homozygotes and heterozygotes coded as 2) and additive (heterozygotes coded as 1, recessive allele homozygotes and common allele homozygotes coded as 2).
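The coding scheme described above can be written out as a small Python helper. This is a sketch of the three groupings exactly as defined in the text (including the text's definition of the additive model as heterozygotes versus both homozygote groups); the function name and the string representation of genotypes are assumptions made for illustration.

```python
def code_genotype(genotype: str, common: str, minor: str, model: str) -> int:
    """Return the group code (1 or 2) for a two-letter genotype under a model.

    dominant:  common allele homozygotes -> 1, all other genotypes -> 2
    recessive: recessive (minor) allele homozygotes -> 1, all others -> 2
    additive:  heterozygotes -> 1, both homozygote groups -> 2
    """
    if len(genotype) != 2 or set(genotype) - {common, minor}:
        raise ValueError(f"unexpected genotype: {genotype!r}")
    n_minor = genotype.count(minor)           # number of minor alleles: 0, 1 or 2
    if model == "dominant":
        return 1 if n_minor == 0 else 2
    if model == "recessive":
        return 1 if n_minor == 2 else 2
    if model == "additive":
        return 1 if n_minor == 1 else 2
    raise ValueError(f"unknown model: {model!r}")


if __name__ == "__main__":
    # rs2108552 dominant model from the text: CC vs CG + GG
    for g in ("CC", "CG", "GG"):
        print(g, code_genotype(g, common="C", minor="G", model="dominant"))
```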

Characteristics of included subjects
The two study cohorts included a total of 1674 subjects (1072 male and 602 female). There were 384 Han Chinese CAD patients (270 male and 114 female) and 506 Uighur Chinese CAD patients (369 male and 137 female). A total of 433 Han (220 male and 213 female) and 351 Uighur individuals (213 male and 138 female) were included as controls. All subjects were recruited from the outpatient and inpatient departments of The First Affiliated Hospital of Xinjiang Medical University from January 2007 to December 2013.
CAD when compared to control subjects (Table 1).
ii. In Uighur Chinese, BMI, plasma concentrations of Glu, BUN, TC, TG, LDL-C, VLDL and HDL-C, and the prevalence of EH, DM, hyperlipidemia and smoking were significantly higher in CAD patients than in controls (Table 2).

Results of outcome measures
(2)The distributions of genotypes and alleles of SNP1 (rs12435797), SNP2 (rs2108552), SNP3 (rs1019075) and SNP4 (rs17781919) of the Numb gene among Han Chinese and Uighur Chinese are shown in Tables 3 and 4. The genotype distributions of the four SNPs were in good agreement with the predicted Hardy-Weinberg equilibrium values (data not shown).
i. In the total sample of Han Chinese subjects and in Han Chinese males, the overall distribution of SNP2 (rs2108552) genotypes showed a significant difference between CAD and control participants (P = 0.013 and P = 0.003, respectively) (Table 3).
In addition, the dominant model and alleles of SNP2 in the total and male groups showed a significant difference between CAD and control subjects (P = 0.004 and P = 0.001; P = 0.007 and P = 0.003, respectively) (Table 3).
ii. For total Han Chinese, the overall distribution of SNP3 (rs1019075) genotypes, the recessive model, the additive model and alleles showed a significant difference between CAD patients and control subjects (P = 0.041, P = 0.016, P = 0.015, and P = 0.045, respectively) (Table 3). In addition, the recessive and additive models of SNP3 in males showed a significant difference between CAD patients and control subjects (P = 0.030 and P = 0.020, respectively). The distribution of the recessive model (TT vs CT + CC) of SNP3 was significantly higher in CAD patients (total: 67.3 %; male: 66.8 %) compared to control subjects (total: 59.0 %; male: 56.9 %). Further, the C allele frequency of SNP3 was significantly lower in CAD patients (Table 3). Among females, there was no significant difference between CAD patients and control subjects with respect to the overall distribution of SNP2 and SNP3 genotypes, the dominant and recessive models, or the alleles.
iii. For total, male and female Han Chinese, the overall distribution of SNP1 (rs12435797) and SNP4 (rs17781919) genotypes and alleles showed no significant difference between CAD patients and control subjects (Table 3).
iv. In the total sample of Uighur Chinese and in Uighur Chinese males, the recessive and additive models of SNP1 showed a significant difference between CAD and control subjects (recessive model: P = 0.048 and P = 0.034; additive model: P = 0.021 and P = 0.009, respectively). The distribution of the additive model (GT vs GG + TT) of SNP1 was significantly higher among CAD patients (total: 53.6 %; male: 52.2 %) compared to control subjects (total: 45.4 %; male: 42.5 %). In contrast, the overall distribution of SNP1 genotypes and alleles showed no significant difference between CAD patients and control subjects in the total, male and female groups. Further, the overall distribution of SNP2, SNP3 and SNP4 genotypes and allele frequencies showed no significant difference between CAD patients and control subjects in the total, male and female Uighur Chinese (Table 4).
(4)Multiple logistic regression analysis for CAD patients and control subjects from Han Chinese (rs2108552)
The multivariable logistic regression analysis combining genotypes was conducted with the following variables: BMI, Glu, TC, TG, LDL-C, DM, EH, hyperlipidemia and smoking. The results are shown in Table 5.

(5)Haplotypes and linkage disequilibrium
In the haplotype-based case-control analysis, haplotypes were established from the different combinations of the SNPs, independently for each population (Tables 6 and 7), and five haplotypes were established in all subjects.
i. In Han Chinese, the frequency of the T-C-T-C haplotype established by SNP1-SNP2-SNP3-SNP4 was significantly higher among CAD patients than among control subjects in the total and male groups (total: OR = 1.334, 95 % CI = 1.096-1.624, P = 0.004; male: OR = 1.482, 95 % CI = 1.148-1.912, P = 0.002). Moreover, the frequency of the G-G-T-C haplotype established by SNP1-SNP2-SNP3-SNP4 was significantly lower among CAD patients than among control subjects in these two groups (total: OR = 0.701, 95 % CI = 0.529-0.929, P = 0.013; male: OR = 0.616, 95 % CI = 0.432-0.877, P = 0.007) (Table 6). For females, there was no difference in the frequency of haplotypes between the CAD patients and control subjects. These haplotype results were consistent with the results for the CC genotype and G allele of SNP2 (rs2108552).
ii. In Uighur Chinese, the overall distribution of the haplotypes established by SNP1-SNP2-SNP3-SNP4 showed no significant difference between CAD patients and control subjects in the total, male and female groups (Table 7).
iii. The patterns of linkage disequilibrium of the Numb gene among Han and Uighur Chinese are shown in Figs. 2 and 3. All four SNPs are located in one haplotype block, as all of the |D'| values were above 0.25 and the r² values were below 0.5. We constructed the haplotypes using SNP1, SNP2, SNP3 and SNP4.
Note to the haplotype tables: CAD, coronary artery disease; SNP, single nucleotide polymorphism. The P value of each haplotype was calculated by Fisher's exact test. * P < 0.05. Haplotypes with frequencies > 0.03 were estimated using SHEsis software; 0 represents the major allele and 1 the minor allele, so "0100" refers to the major allele of SNP1, the minor allele of SNP2, the major allele of SNP3 and the major allele of SNP4, respectively.

Findings
In this case-control study, we genotyped four SNPs of the Numb gene among Han Chinese and Uighur Chinese and investigated the association between Numb gene polymorphisms and CAD. We found that variation in the Numb gene is associated with CAD among Han Chinese. This is the first endeavor to study common allelic variants in the Numb gene and their association with CAD. Numb is an important protein for regulating cholesterol absorption, and it plays a pivotal role in the development of atherosclerosis. Further, some studies have observed that hypercholesterolemia can cause multiple physiologic outcomes, such as coronary artery disease, diabetes and obesity [39][40][41]. Therefore, we assumed that the Numb gene and coronary artery disease might be associated. However, few studies have been conducted on the relationship between the Numb gene and cardiovascular diseases.
Our study showed that among Han Chinese, the C allele frequency of rs2108552 was higher among male CAD patients than among male control subjects. The distribution of the dominant model (CC vs CG + GG) was significantly higher among CAD patients compared to control subjects. The difference remained significant after multivariate adjustment (Table 5). These findings suggest that males carrying the CC genotype of rs2108552 may have a higher risk of CAD. In the total and male Han population, the distributions of the recessive model (TT vs CT + CC) of rs1019075 were significantly higher among CAD patients compared to control subjects. This suggests that people carrying the T allele of rs1019075 may have a higher risk of CAD. However, the T allele frequency of rs1019075 was not higher among CAD patients than among control subjects in the total and male groups. In logistic regression analysis (TT vs CT + CC), there was no evidence of a statistically significant difference before or after multivariate adjustment (P > 0.05, data not shown). A possible explanation is that the higher CAD risk of T allele carriers is codetermined by the TT genotype and the T allele frequency of rs1019075 in the CAD patients.
We found that rs2108552 of the Numb gene was associated with CAD only in the Han male subgroup. BMI, Glu, TG, TC, LDL-C, and the prevalence of DM, EH, hyperlipidemia and smoking were higher in male CAD patients than in female CAD patients. This indicates that male patients might have a higher chance of suffering from cardiovascular disease than female patients.
In Uighur Chinese subjects, there was no difference between CAD patients and controls with respect to rs2108552 and rs1019075. For another polymorphism of the Numb gene, rs12435797, the distribution of the additive model (GT vs GG + TT) was significantly higher among CAD patients than among control subjects in the total population and in the male subgroup. However, statistical significance did not remain after multivariate adjustment (P > 0.05). The underlying reason might be ethnic differences, which entail different genetic backgrounds and lifestyles.
Fig. 2 Pairwise estimates of linkage disequilibrium (LD) between the Numb polymorphisms, plotted for Han Chinese using the SHEsis platform. Each polymorphism is numbered according to its position in the Numb gene as presented in Fig. 1. (a) shows |D'|; different colors represent different degrees of linkage disequilibrium, and the darker the color, the stronger the linkage disequilibrium. (b) shows r².
Previous data have shown that SNP4 (rs17781919, G595D) is related to a low concentration of LDL-C in humans [29]. In our study, we neither observed the Numb (G595D) variant genotype nor found a significant difference in the genotypic distribution of SNP4 (CC and CT) between CAD patients and control subjects among all Han Chinese and Uighur Chinese.
A recent study suggested that haplotype analysis has advantages over the analysis of individual SNPs for assessing complex disease genes, especially when there is weak linkage disequilibrium between SNPs [42]. The present study is the first haplotype-based case-control endeavor to study the association between the human Numb gene and CAD in Han and Uighur Chinese. In our study, we found two haplotypes (T-C-T-C and G-G-T-C) of SNP1-SNP2-SNP3-SNP4 in Han Chinese males and, according to the logistic regression and haplotype analyses, assumed that the T-C-T-C haplotype is a risk factor for CAD and that the G-G-T-C haplotype is a protective factor for CAD in the Han Chinese male population.

Limitations and shortcomings
The shortcomings of this study are as follows: (1)Because of the limited time, we were only able to conduct a retrospective study. Further, the evidence of this study might be biased due to unbalanced matching and biased selection. Therefore, a prospective cohort study over a reasonably long time span is needed to obtain evidence of higher quality.
(2)The study sample might not be sufficiently representative due to the limited selection of CAD patients and control subjects. The source of subjects was limited to The First Affiliated Hospital of Xinjiang Medical University, and these subjects may possess some risk factors of cardiovascular disease.

Conclusions
This is the first scientific endeavor to study the correlation between the human Numb gene and CAD among Han Chinese and Uighur Chinese. The findings suggest that rs2108552 may be a novel polymorphism of the Numb gene that is associated with CAD among male Han Chinese. In addition, the CC genotype of rs2108552 and the T-C-T-C haplotype of the Numb gene are possible genetic risk markers for CAD, and the G allele and the G-G-T-C haplotype are possible protective genetic markers for CAD among male Han Chinese, which supports the hypothesis that Numb gene variations are involved in the pathogenesis of CAD. These results may broaden the knowledge of genetic variants and disease-association studies. Further studies employing larger sample sizes are required.