Association of VAMP5 and MCC genetic polymorphisms with increased risk of Hirschsprung disease susceptibility in Southern Chinese children

Hirschsprung disease (HSCR) is a genetic disorder characterized by the absence of neural crest cells in parts of the intestine. This study aims to investigate the association of vesicle-associated membrane protein 5 (VAMP5) and mutated in colorectal cancer (MCC) genetic polymorphisms and their correlated risks with HSCR. We examined the association in four polymorphisms (rs10206961, rs1254900 and rs14242 in VAMP5, rs11241200 in MCC) and HSCR susceptibility in a Southern Chinese population composed of 1473 cases and 1469 controls. Two variants in VAMP5 were replicated as associated with HSCR. Interestingly, we clarified SNPs rs10206961 and rs1254900 in VAMP5 are more essential for patients with long-segment aganglionosis (LHSCR). Relatively high expression correlation was observed between VAMP5 and MCC using data from public database showing there may exist potential genetic interactions. SNP interaction was cross-examined by logistic regression and multifactor dimensionality reduction analysis revealing that VAMP5 rs1254900 and MCC rs11241200 were interacting significantly, thereby contributing to the risk of HSCR. The results suggest that significant associations of the rs10206961 and rs14242 in VAMP5 with an increased risk of HSCR in Southern Chinese, especially in LHSCR patients. This study provided new evidence of epistatic association of VAMP5 and MCC with increased risk of HSCR.


INTRODUCTION
Hirschsprung disease (HSCR) is the most common cause of neonatal intestinal obstruction [1], defined by the partial or complete absence of the neural crest cells in the intestinal tract [2]. The overall prevalence of HSCR among the Asian population is estimated at 2.8/10,000 live births and displayed a significant racial variation [3].HSCR can be classified into three types based on the length of the aganglionic tract, including short-segment HSCR (S-HSCR), long-segment HSCR (L-HSCR) and total colonic aganglionosis (TCA) with the percentages around 80%, 15% and 5% respectively [4,5]. HSCR is a complex multifactorial disease, which is mainly determined by individual genetic factors [6,7]. The recurrence risk in siblings varied from 1% to 33% depending on the length of the aganglionic segments and gender of the probands. More than ten genes were identified as contributed to the pathogenesis of HSCR including RET, GDNF, EDNRB, EDN3 and so on [8].However, mutations in these genes account for only 50% of the known cases of HSCR [9].
Vesicle-associated membrane protein 5 (VAMP5) has been reported to provide their subcellular targeting in the synaptic vesicle fusion process of enteric nervous AGING system (ENS) neurotransmission [10,11]. Shin JG et al. suggested that two single nucleotide polymorphisms (SNP, rs10206961 and rs1254900) of VAMP5 were the potential risk locus in TCA progression by using 187 Korean HSCR patients and 283 unaffected controls (P=0.006 for rs10206961, P=8.03×10 -5 for rs1254900), we also included SNP rs14242 in VAMP5 with suggestive significance to HSCR (P=0.04) for further replication [12].
Mutated in colorectal cancer (MCC) gene which is located on chromosome 5q21 encodes a protein comprised of 829 amino acids [13]. MCC is a tumor suppressor in different types of cancers, including hepatocellular carcinoma, colorectal cancer (CRC) and acute myeloid leukemia [14]. MCC is strongly related to the plasma membrane and membrane organelles in fmouse intestinal epithelial cells and neuronal cells via an immunoelectron microscopic analysis [15]. Additionally, case-control studies suggested that MCC might confer alterative genetic susceptibility to CRC in individuals with schizophrenia, implying MCC as a factor related to neurodevelopmental disorders [16]. A genome-wide association study (GWAS) by Garcia-Barcelo et al. has revealed that MCC was a plausible candidate gene of HSCR [17]. Rs11241200 in MCC was chosen for replication in current study according to this GWAS. It is noteworthy that MCC did not only function as an independent tumor suppressor in the majority of colorectal cancers, but also functioned as a susceptibility gene in HSCR.
The underlying genes remain largely unknown, especially the interplays between these susceptibility genes. The aim of this study was to ascertain whether the genetic polymorphisms of VAMP5 (rs10206961, rs1254900 and rs14242) and MCC (rs11241200) were associated with HSCR in 1470 Chinese HSCR cases and 1473 controls. We have clarified SNPs rs10206961 and rs1254900 in VAMP5 which are more essential for patients with LHSCR. Interestingly, further elaborated of SNP rs1254900 (VAMP5) and rs11241200 (MCC) were found to be interacted significantly, thereby contributing to the risk of SHSCR. This finding may add MCC to the list of genes as associated with HSCR. Further replications and functional evaluations are still required.

Association of VAMP5 and MCC SNPs with HSCR
Four SNPs were selected for replication in this study including 3 SNPs on VAMP 5 and 1 SNPs on MCC. The selection criteria were detailed listed in the Method. Detailed information about four SNPs genotyped in this study was shown in Table 1 using 1473 HSCR patients and 1469 HSCR-free controls from South China. The genotype distribution for the 4 SNPs followed the hardy-weinberg equilibrium (HWE) in the control subjects (P_hwe=0.58 for rs10206961, P_hwe=0.46 for rs1254900, P_hwe=0.07 for rs14242 and P_hwe=0.07 for rs11241200). All four SNPs are located in the intronic region. SNPs rs10206961 and rs14242 in VAMP5 shows significant association with HSCR (0.044≤P_adj≤0.046, 1.12≤OR≤1.17 for rs10206961; P_adj=0.020, OR=1.20 for rs14242). Inconsistent with the report by Shin JG et al. [12] for the SNP rs1254900 in VAMP5 , we failed to replicate the association in our population (0.666≤P≤0.953, 0.97≤OR≤1.04 for rs1254900). For the most prominent SNP rs11241200 in MCC, 0.282≤P≤0.438 for rs11241200, we failed to replicate the association as presented previously [17]. To better digest the effective pattern for the two SNPs on VAMP5, we specified the samples following four different genetic models including additive, dominant, recessive and genotypic models. Larger effect was observed in both SNPs following dominant models (OR=1.12/1.17 for 10206961; OR=1.20 for 14242 respectively).

Independence test of the replicated SNPs
To further identify the independence of SNPs previous replicated in VAMP5, the linkage disequilibrium (LD) patterns were examined on Guangzhou replication data ( Figure 1). Consistent with the public data as shown in Supplementary Figure 1, SNPs rs10206961 and rs1254900 showed moderate LD with each other (r 2 =0.58). SNP rs14242 showed limited LD with all the other SNPs (r 2 =0.35).
Conditional logistic regression analysis was performed to investigate the independent effects among the three SNPs in VAMP5 (  AGING evidence of association to disease (Table 1), we observed SNP rs1254900 was significant if the effect of SNP rs10206961 was adjusted (P=0.018, OR=1.22), These results raised up the notion that diversified effects of these SNPs exist in this region for disease susceptibility, such as epistatic effect.

Intergenic SNPs show epistatic effect to HSCR
As shown in Table 3, pairwise epistasis test was among the three SNPs in VAMP5. Surprisingly, we failed to observe any intragenic epistatic association to disease (0.11≤P≤0.5, 0.86≤OR≤1.15). As presented above, VAMP5 and MCC may involve in the enteric nervous system (ENS) neurotransmission. We also observed the two genes showed relative high coexpression correlation in Peripheral blood mononuclear cell (PBMC) with confidence (P=0.002, OR=1.26) using the data extracted from ImmuCo [18]. Thus, we also included the SNP in MCC (rs11241200), the result suggested significant elevated effect between SNP rs1254900 (VAMP5) and rs11241200 (MCC) to disease (P_int= 2.9E-03, OR=1.42). SNP rs1254900, rs14242 showed less synergetic effect with other SNPs reflected  AGING by the insignificant P values shaded in grey. Another statistical method for testing epistatic interaction was also applied for further validation. Pairwise Multifactor dimensionality reduction (MDR) analysis was adopted here to test the epistatic interaction between SNP pairs. Table 3 showed the results of cross-validation consistency (CVC) and Balanced accuracy (BA) obtained from MDR analysis of the two-locus model, which showed significant pairwise interactions. The significance of the result was tested showing the consistent higher effect size between epistatic rs11241200 (MCC) with rs10206961, rs1254900 and rs14242 (VAMP5) to disease (P=0.002, OR=1.26).

Clinical stratification of SNPs in VAMP5 and MCC with HSCR
HSCR is heterogenous varied with different subclinical manifestations, typically classified by the aganglionisis length of the patients including SHSCR, LHSCR and TCA patients. For the two SNP including rs14242 and rs10206961 which showed evidence of association to HSCR in current study, rs14242 is associated with increased risk of the SHSCR though the effect is marginal (P=0.039, OR=1.19). SNP rs10266961 showed much stronger evidence of association to LHSCR with larger effect size and more stringent P value (2.97E-03≤P≤0.027, 0.61≤OR≤1.44), especially following the additive model. Surprisingly, despite from the epistatic effect with SNP rs1254900 to disease, we also observed a significant individual association to LHSCR patients. (P=4.62E-03, OR=0.61), highlighting extra roles for the SNP to disease. So we further examined the epistatic effect of SNP rs11241200 (MCC) and rs1254900 (VAMP5) to different HSCR patients. As shown in Table 4, we only observed significant epistatic associations in SHSCR patients to the disease (P_int=8.9E-03, OR=1.40), for LHSCR patients, marginal effect was observed which may be caused by limited samples waiting for further replications (P_int=0.066, OR=1.48). It is also possible SNP rs11254900 plays multiple roles leading to disease association which also required functional characterization.

DISCUSSION
HSCR is a complex clinical syndrome. Increasing studies focused on case-control and trio based study designs. It the largest population-based study to the correlation of VAMP5 and MCC genetic polymorphisms with HSCR risk in our study that 1473 cases and 1470 unrelated controls were enrolled in. We have replicated two SNPs rs10206961 and rs14242 in VAMP5 that were associated with HSCR (Table 1, 2). In further subclinical manifestation analysis, we further clarified SNPs rs10206961 and rs1254900 in VAMP5 and found they were essential for patients with LHSCR (Tables 3,  5). Interestingly, SNP interaction confirmed through logistic regression and multifactorial dimensionality reduction analysis revealed that the genotypes of the polymorphisms of VAMP5 rs1254900 and MCC rs11241200 were interacting significantly, thereby contributing to the risk of SHSCR (Table 4).
VAMP5 belongs to part of the vesicle-associated membrane protein (VAMP) and the soluble NSF attachment protein receptor (SNARE) superfamily. SNARE superfamily is responsible for the last stage of docking and subsequent fusion in diverse intracellular membrane transport events [19]. VAMP5 can provide their subcellular targeting in neurotransmission [11,19]. AGING It may also facilitate glucose transporter type 4 (GLUT-4) translocation from the intracellular pool to the plasma membrane. Downregulation of VAMP5 might determine reduced GLUT4 membranal expression, followed by reduced glucose transport [20]. Accordingly, we hypothesize that the VAMP5 risk alleles could unbalance the metabolism of its encoded protein leading to the disorders of intestinal protein synthesis, although there is a lack of direct experimental support for this assumption. Considering the facts that pathophysiology of HSCR, it is caused by a congenital absence of neurons in a portion of the intestinal tract. More specifically, genetic variants of VAMP5 may functionally hinder the normal the migration and proliferation of enteric neural crest cell. Similar genetic studies have suggested that there was a relationship between VAMP5 poly- AGING morphisms and HSCR [12]. Three SNPs in VAMP5 overlaps with the previously reported potential association of VAMP5 polymorphisms with 21 TCA patients [20]. But in our study, we observed the association of VAMP5 with HSCR, especially in SHSCR and LHSCR patients, but we failed to replicate the association of SNPs using 82 TCA patients. The possible sources of this discrepancy could be attributed to number of samples. As shown in independence test result, SNP rs14242 showed limited LD with all the other SNPs (Figure 1, r 2 =0.35). Besides, our study presents the epistatic association between MCC and Vamp5. MCC is known to reduced activation of NF-κB signalling in colorectal cells as well as a factor related to neurodevelopmental disorders [21]. A study on mice has shown that MCC binds SH3/ankyrin domain gene 3 isoform were able to participate in neurodevelopmental, neurobehavioral and autism spectrum disorders [22]. Epistasis between the different genes provides us a perspective for disease etiology, not only limited HSCR, we also observed in other complex diseases such as colorectal cancer [23] . Taking the advantage of large replication samples, we tested the pairwise genetic epistasis between VAMP5 and MCC. Intriguingly, a significant synergetic interaction between rs1254900 (VAMP5) and rs11241200 (MCC) was identified through the cross-validation by logistic regression and MDR analysis, thereby contributing to the risk of SHSCR. We could speculate that VAMP5 and MCC gene are associated with the neurotransmitter release process that affects neurogenesis SNPs for the subphenotype interaction examination to make a more precise and convictive assessment, pointing to the increased risk of SHSCR comparing to other related diseases.
There are several limitations to this study. First, the effect of gene-environment interactions was not emphasized. Second, more accurate ORs should be adjusted by patient factors such as medication consumption and other exposure factors. Third, as the heterogeneity in different ethnicities influenced the results significantly, the findings from the Asian based studies were not convictive enough. We also calculated the power of current study using Epistasis Power Calculator (https://gump.qimr.edu.au/general/manuelF/ epistasis/epipower4i.html), based on current sample size with the incidence rate 1 per 5000 infants, the power to detect pairwise epistatic effect is limited (0.61 for caseonly study, 0.37 for case-control study), further replication in independent cohort was still required. Fourth, although this is the largest population-based study conducted to-date, the statistical power was still limited due to the relative insufficient sample size. Replication studies from other hospital with a larger sample size are encouraged to confirm the association. Lastly, the functional mechanisms subject to the association of VAMP5 and MCC for HSCR is required in the further study, borrowing the idea from other diseases [24,25].
In summary, our study indicated significant associations between rs1254900 (VAMP5) and rs11241200 (MCC) as independently related with SHSCR status. We proposed a relationship that may fill the gap between genetic susceptibility and subclinical manifestation. These conclusions also will provide a basis for future efforts to understand the detailed mechanisms of this intestinal disorder. SNPs associated with increased severity or worsening progression of HSCR would potentially afford a better individualized treatment of patients.

SNP Genotyping and quality control
Three SNPs in VAMP5 involved in the study were selected according to a GWAS in 187 Korean HSCR cases (Supplementary Table 2). Five SNPs were replicated in their study as shown in Supplementary  Figure 1, two SNPs were removed according to the high LD (r 2 >0.9，rs1561198 and rs55971080) in Asians ( the LD in CEU among the SNPs was less tightly with r 2 >0.8). The three SNPs with high LD were annotated by the RegulomeDB database (http://www.regulomedb. org/) to estimate the potential functional roles. The potential regulatory SNP (rs10206961) was remained for further replication.
Three SNPs in MCC from the HSCR GWAS by Garcia-Barcelo et al. (Supplementary Table 3). [17] showed AGING similar likelihood and effect size of disease association, linkage disequilibrium (LD) was examined among them reflecting the association might derive from one signal (r 2 >0.9). SNP rs11241200 was chosen for replication in current study with higher annotation score by Regulome DB [17]. Four selected SNPs were genotyped by MassARRAY iPLEX Gold system (Sequenom) on the samples. Hardy-Weinberg equilibrium tests were performed and SNPs with P < 0.05 were excluded from the final analysis. Quality control of the four SNPs was performed as follows: 1. Cases/controls were excluded from the analysis on the basis of SNPs with >10% missing data. 2. Any subjects with 10% missing call were removed. After all quality control procedures, all four SNPs were kept for further analysis consisted of 1470 cases and 1473 controls.

Association analysis and subphenotype analysis
The SNPs were tested for associations with the disease by means of comparison of the minor allele frequency in cases and controls (basic allelic test) as well as other tests using PLINK1.9 (logistic regression for additive model, test of dominant and recessive models) [27]. Subphenotype association analyzes were performed by comparing cases with and without a certainly given subphenotype.

Independence testing
Linkage disequilibrium (LD) patterns were analyzed and displayed by HaploView [28]. Logistic regression tests were performed using SNPTEST v2.5b [29].Tests of independent contributions toward disease associations for SNPs in a single locus were done using logistic regression, adjusting for the effect of a specific SNP in the same locus.

Gene Coexpression
We visited ImmuCo, a database of gene Co-expression and Correlation in multiple cells including expression data for a total of 20,283 human (http://immuco.bjmu.edu.cn/) and examined the pairwise expression correlation between VAMP5 and MCC. The correlation was calculated by Graphpad 5.0

Genetic epistasis
Epistasis test (case-control analysis) by logistic regression was adopted here for the parametric analysis of genetic interaction using PLINK1.9 [30]. PLINK uses a model according to allele dosage ranging from 0 to 2 indicating the number of risk alleles for each SNP, A and B, and fits the model in the form of Y =b 0 + b 1 SNPA + b 2 SNPB +b 3 SNPA*SNPB + e. The para-meters b1, b2 and b3 indicate the contribution of SNP A and SNP B and interaction between A and B. The test for interaction is based on the coefficient b3. P values less than 0.05 were considered statistically significant.
Multifactor dimensionality reduction (MDR) was used to determine the genetic model that could most successfully predict the disease status or phenotype from several loci. Pairwise non-parametric epistasis test was also applied using MDR analysis [31].This method includes a combined cross-validation (CV)/permutation testing procedure that minimizes false positive results by multiple examinations of the data. We determined the statistical significance by comparing the average prediction error from the observed data with the distribution of average prediction errors under the null hypothesis. The MDR analysis was carried out using version 2.0 of the open-source MDR software package that is freely available online (http://www.epistasis.org) [32].

AUTHOR CONTRIBUTIONS
Yan Zhang and Huimin Xia designed experiment. Yuxiao Yao, Jinglu Zhao, Xiaoli Xie, Qiuming He, Ruizhong Zhang conducted the study. Yan Zhang and Xiaoli Xie analyzed the data. Yuxiao Yao, Jinglu Zhao and Yan Zhang wrote the paper. All authors read and approved the manuscript.
AGING 2016042036) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

CONFLICTS OF INTEREST
AGING Supplementary SNP: Single Nucleotide Polymorphism; A1/A2 indicates the risk allele and protective allele to disease; HSCR: Hirschsprung disease; The calculation of odds ratio (OR) is also based on the risk allele of each SNP. The P value indicates the significance based on different genetic models.