ERAP1 polymorphisms interactions and their association with Behçet’s disease susceptibly: Application of Model-Based Multifactor Dimension Reduction Algorithm (MB-MDR)

Background Behçet’s disease (BD) is a chronic multi-systemic vasculitis with a considerable prevalence in Asian countries. There are many genes associated with a higher risk of developing BD, one of which is endoplasmic reticulum aminopeptidase-1 (ERAP1). In this study, we aimed to investigate the interactions of ERAP1 single nucleotide polymorphisms (SNPs) using a novel data mining method called Model-based multifactor dimensionality reduction (MB-MDR). Methods We have included 748 BD patients and 776 healthy controls. A peripheral blood sample was collected, and eleven SNPs were assessed. Furthermore, we have applied the MB-MDR method to evaluate the interactions of ERAP1 gene polymorphisms. Results The TT genotype of rs1065407 had a synergistic effect on BD susceptibility, considering the significant main effect. In the second order of interactions, CC genotype of rs2287987 and GG genotype of rs1065407 had the most prominent synergistic effect (β = 12.74). The mentioned genotypes also had significant interactions with CC genotype of rs26653 and TT genotype of rs30187 in the third-order (β = 12.74 and β = 12.73, respectively). Conclusion To the best of our knowledge, this is the first study investigating the interaction of a particular gene’s SNPs in BD patients by applying a novel data mining method. However, future studies investigating the interactions of various genes could clarify this issue.


Results
The TT genotype of rs1065407 had a synergistic effect on BD susceptibility, considering the significant main effect. In the second order of interactions, CC genotype of rs2287987 and GG genotype of rs1065407 had the most prominent synergistic effect (β = 12.74). The mentioned genotypes also had significant interactions with CC genotype of rs26653 and TT genotype of rs30187 in the third-order (β = 12.74 and β = 12.73, respectively).

Conclusion
To the best of our knowledge, this is the first study investigating the interaction of a particular gene's SNPs in BD patients by applying a novel data mining method. However, future studies investigating the interactions of various genes could clarify this issue. PLOS

Introduction
Behçet's disease (BD) is a chronic vasculitis presented with multi-systemic signs and symptoms; however, it is majorly separated from other autoimmune diseases by characteristic bipolar aphthosis [1]. With a wide range of prevalence worldwide (from 0.64 per 100,000 in the UK to 420 per 100,000 in Turkey), BD is mostly distributed in countries alongside the Silk Road [2]. According to the considerable prevalence and morbidity of BD in Asian countries, understanding BD's pathophysiology might lead to new therapeutic options and increasing patients' quality of life. Years of research have proven that similar to many other rheumatic disorders, genetic factors have a significant role in BD's course [3]. HLA region has been proven to have a pivotal contribution to the genetic component of BD [4]. BD's association with HLA-B � 51 is proved by several influential studies, including a meta-analysis on 4800 patients that has shown individuals with this allele have an odds ratio of 5.78 for developing BD [5]. In addition to HLA-B � 51, studies have suggested a link between BD and other genes such as interleukin 10 (IL-10) and IL-23 receptor (IL-23R), some of which are associated with HLA-B � 51 [6]. In our previous study, we have shown that the endoplasmic reticulum aminopeptidase-1 (ERAP1) gene polymorphisms are associated with HLA-B � 51, resulting in higher BD susceptibility [7]. ERAP1 is an amino-peptidase responsible for the Nterminal trimming of peptides, which is a critical step in peptides processing and their presentation by MHC-I [8].
Furthermore, ERAP1 takes part in cleaving proinflammatory cytokine receptors such as tumor necrosis factor receptor (TNFR1) from the cell membrane [9]. Polymorphisms of ERAP1 might alternate the activity of the protein and subsequently changing the structure of peptidome available to HLA-B � 51. However, the association of ERAP1 single nucleotide polymorphisms (SNPs) and BD susceptibility is not entirely clear, and some studies suggest contradictory findings, which need to assess by more comprehensive studies [7,10,11].
Up to now, logistic regression for high dimensional and sparse data, parameter estimation is a costly and non-accurate procedure that introduces significant standard errors because sample sizes are too small compared to the order of interaction size. Also, conventional approaches (e.g., logistic regression) used for the analysis of genomic data are oversimplified and usually cannot consider all possible associations between multiple polymorphisms and gene-gene interactions [12]. Multifactor Dimensionality Reduction (MDR) approach is now a reference in the epistasis and SNPs interactions detection field. However, MDR suffers from some significant drawbacks, including that crucial interactions could be missed owing to pooling too many cells together or that proposed MDR analysis will only reveal at most one significant epistasis model, the selection being based on computationally demanding crossvalidation and permutation strategies. To overcome the aforementioned hurdles, model-based multifactor dimensionality reduction (MB-MDR) is a flexible framework to detect gene-gene or SNP-SNP interactions. MB-MDR is a non-parametric data mining method that has sufficient power and is capable of investigating the interaction of the unlimited number of genes and polymorphisms [13]. Therefore, we aimed to use the MB-MDR method to identify the interactions of ERAP1 polymorphisms and their association with BD susceptibly.

Study participants
The present study included 748 BD patients who were referred to the outpatient BD clinic in the Rheumatology Research Center, Shariati Hospital, Tehran, Iran. The International Criteria confirmed patients' diagnosis for Behçet's Disease (ICBD), and patients who were less than 16 years old or related to each other were excluded from the study [14,15]. For the control group, we have included 776 healthy individuals with no clinical presentation or family history of any rheumatic disorders or autoimmune diseases, who were matched for sex, age, and ethnicity [16]. Written informed consent was obtained from all individuals themselves or their parents in cases with the age of under 18. The ethical committee of Tehran University of Medical Sciences approved the study protocol, and the relevant university guidelines did all experiments.

DNA preparation and SNP genotyping
A peripheral blood sample was collected from all participants into EDTA-anticoagulated tubes using venipuncture. Genomic DNA was extracted using the standard phenol/chloroform method, and the extracted DNA samples were stored at −20˚C. Approximately 20 ng of the genomic DNA in each sample was used for genotyping. We assessed 10 common missense SNPs from our previous study [7] that were identified in the super-population of the 1000 Genomes project and had a minor allele frequency of more than one percent ( Table 1). We have also included an intronic SNP (rs1065407) that has been associated with BD in another study [17]. MGB-TaqMan Allelic Discrimination technique was used for SNP genotyping (Applied Biosystems, Foster City, CA, USA). Ten μl of reaction volumes, containing 0.25 μl of distilled water, 4.5 μl of genomic DNA, 0.25 μl of TaqMan genotyping assay mix, and 5 μl of the TaqMan genotyping master mix was used for amplification. The StepOnePlus Real-Time PCR System (Applied Biosystems) and the manufacturer's protocol were used for genotyping the patients and healthy individuals' samples. The allelic call was done using SDS v.1.4 software (Applied Biosystems) and the analysis of allelic discrimination plots. Finally, the genetic makeup of SNPs for each subject was considered as the genotype of that SNP.

Statistical methods
The continuous variables were indicated as mean ± SD. Allelic and genotypic frequencies of the ERAP1 SNPs were mentioned as N (%). The genotype distributions of SNPs were tested for deviation from Hardy-Weinberg equilibrium (HWE) in the control group. P-values were corrected for multiple comparisons by the Benjamini-Hochberg approach [18]. Since calculations of the main effect of ERAP1 SNPs were not available by the model-based multifactor dimensionality reduction (MB-MDR), multiple logistic regression has been used to obtain the main effects of ERAP1 SNPs, simultaneity. To adjust for main effects, main effects should be  [19]. MB-MDR method has proven to be more potent than multifactor dimensionality reduction (MDR) in the presence of genetic heterogeneity [20]. MB-MDR can unify the best of both nonparametric and parametric machine learning algorithms.
On the other hand, characterization, and identification SNP-SNP interactions lack performance in the absence of proper statistical methods and large sample sizes. Logistic regression, as a standard tool for modeling effects and interactions with binary response data, lacks power in the identification of gene interactions in high-order levels due to sparsity and separation [21]. Thus, in this study, SNP-SNP interactions were calculated by the MB-MDR algorithm. MB-MDR shows high power in the presence of all types of noises, such as missing data, genotyping error, genetic heterogeneity, and low sample size [22]. This algorithm was performed by "mbmdr" R package version 3.5.1. To assess the significance in MB-MDR, permutation test with 1000 replications has been done, which corrects for multiple testing (overall marker pairs) and adequately controls the family-wise error rate at α = 0.05.

Results
In this case-control study, 748 patients and 776 age-, sex-, and ethnicity-matched healthy controls were included according to the inclusion and exclusion criteria [16]. In BD patients, the mean age was 40.26 ± 10.88 years, and in the control group was 38.88 ± 11.54 years (Pvalue = 0.076). Out of 748 patients and 776 healthy individuals, 448 (59.9%) and 476 (61.3%) were male, respectively (P-value = 0.599). Based on the results of assessing the main effects of ERAP1 SNPs, the TT genotype of rs1065407 SNP (β = 0. 23, and adjusted P-value = 0.034) had a significant synergistic effect on BD. The synergistic effect of an allele is described as the allele increasing the disease risk, and the antagonistic effect is described as the allele having a protective effect regarding the disease susceptibility. In contrast, TT genotype of rs30187 SNP (β = -0.26 and adjusted P-value = 0.041) and AA genotype of rs469876 SNP (β = -0.20 and adjusted P-value = 0.046) had significant antagonistic effects on BD ( Table 2). Other ERAP1 SNPs do not have significant main effects concerning BD susceptibly. Table 2 summarizes the results of SNP-SNP interactions for six important SNPs (rs1065407, rs30187, rs469876, rs2287987, rs17482078, and rs26653). Based on the results of second-order interaction effects, there were only six significant 2-locus models. For instance, CC genotype of rs2287987 and GG genotype of rs1065407 (β = 12.74 and adjusted P-value = 2.12×10 −10 ) had a significant synergistic effect on BD susceptibility. rs30187 and rs1065407, CT, and TT genotype (β = -0.39 and adjusted P-value of 1.98×10 −3 ) had a significant antagonistic effect on BD. Synergistic effects of rs469876 (AA and GG) genotypes with rs1065407 (GG and GT) genotypes were significant as well (β = 0.32, adjusted P-value = 4.73×10 −3 ). Effects of rs30187 and rs469876 (CC vs. AA) and (TT vs. AG) were also significantly synergistic (β = 0.32 adjusted P-value = 2.39×10 −2 ). rs26653 (CC) with rs1065407 (GG) had a significant synergistic effect on BD (β = 0.76, adjusted P-value = 2.49×10 −2 ). However, the results of rs26653 (CT) and rs469876 (AG) showed a significant negative association with BD susceptibly (β = -0.42, adjusted P-value = 7.38×10 −2 ).

Discussion
In this study, we aimed to investigate the interactions of the ERAP1 gene polymorphisms and their associations with BD susceptibility in an Iranian cohort. Using the MB-MDR package, we have found plenty of synergistic and antagonistic significant interactions between ERAP1 polymorphisms and BD development. Considering the main effects, the TT genotype of rs1065407 had a synergistic effect on BD susceptibility. In the second-order interactions, CC genotype of rs2287987 and GG genotype of rs1065407 had the most prominent synergistic effect (β = 12.74). Furthermore, the mentioned genotypes also had significant interactions with CC genotype of rs26653 and TT genotype of rs30187 in the third-order (β = 12.74 and β = 12.73, respectively). Hence, we propose that the genotypes, as mentioned earlier of rs2287987, rs1065407, rs26653, and rs30187, could have prominent interactions resulting in a higher risk of developing BD.
ERAP1 gene is located in the 5q15 chromosome, and its expression has been observed in many tissues [23]. There are two main processes that ERAP1 is proposed to have a role in them. First, this amino-peptidase is involved in optimizing the length of peptides to bind with MHC-class I molecules by trimming their N-terminal in the endoplasmic reticulum (ER) [23]. Moreover, ERAP1 is involved in the cleavage process of various cytokine receptors such as TNFR1, Interleukin 1 receptor II (IL-1RII), and Interleukin 6 receptor α (IL-6 α), which results in receptor shedding [24,25]. Previous studies have shown that the ERAP1 gene is associated with other autoimmune disorders such as ankylosing spondylitis (AS) and psoriasis [26,27]. Homozygosity of ERAP1 polymorphisms is proposed to be correlated with a lower risk of AS and psoriasis, whereas it might be associated with a higher risk of developing BD [28,29]. These differences could be justified by the fact that loading different peptides on MHC-class I molecules can alter the subsequent immune response.
Our results indicated that the homozygous genotypes of minor alleles of rs2287987, rs1065407, rs26653, and rs30187 had the most prominent interactions causing BD susceptibility. In this regard, it has been demonstrated that the frequencies of the homozygous alleles of the ERAP1 gene are higher among BD patients [11]. As it was shown in further studies, these combinations of homozygote ERAP1 SNPs could result in alternations in the surface electrostatic potential of the protein [30]. These changes might alter the trimming activity of ERAP1, resulting in an altered composition of peptidome that is available for binding to HLA-B � 51. This claim could support the higher risk of developing BD observed in individuals carrying the mentioned genotypes. Furthermore, some SNPs such as rs30187 (Arg528Lys) are placed proximal to the entrance pocket of the protein [28]. Amino acid changes in such positions could modify the ideal structure of the protein and alter the enzyme activity.
Although several studies have investigated the association of ERAP1 polymorphisms and BD, there have been some contradictory findings that motivated us to utilize a more complex statistical method for addressing this issue. Zhang et al. evaluated 930 Chinese patients and proposed that rs1065407 and rs10050860 might be associated with increased risk of BD [17]. Sousa and colleagues studied another Iranian cohort and proposed that rs10050860 and rs13154629 might contribute to the genetic susceptibility of BD [15]. Moreover, Conde-Jaldón et al. found that homozygous genotypes for the minor alleles of rs27044, rs10050860, rs30187, and rs2287987 could be considered as risk factors for BD [10]. Takeuchi and colleagues found a haplotype consisting of 10 SNPs (five of which were non-ancestral), which was associated with a higher risk of developing BD, especially in those individuals who carry HLA-B � 51 [30]. Interestingly, our results indicated that homozygote genotypes of minor alleles of rs30187 and rs2287987 are associated with a higher risk of BD. rs30187 and rs2287987 are among those five SNPs that their non-ancestral alleles were mentioned in Takeuchi's study. Finally, the previous study by our team and the study on the Turkish population revealed that ERAP1 polymorphisms have epistatic interactions with HLA-B � 51 contributing to BD risk [7,30].
In conclusion, this is the first study investigating the interaction of a particular gene's SNPs in BD patients by applying a novel data mining method (MB-MDR package). Model-Based MDR as a flexible framework and a reference method to detect gene-gene or SNP-SNP interactions has adequate power even the presence of genotyping errors, missing genotypes, and genetic heterogeneity in this study compare with traditional methods (e.g., logistics regression). Finally, a significant interaction between minor genotypes of ERAP1 polymorphisms was observed in BD patients in comparison to healthy individuals. rs2287987, rs1065407, rs26653, and rs30187 interactions had the strongest association with developing BD in our study population. Taken together, these findings imply the contribution of ERAP1 polymorphisms in BD pathogenesis. However, further studies investigating the interactions of different genes could shed more light on this issue.
Supporting information S1 Table. Model-based multifactor dimensionality reduction algorithm for assessing the main and interaction effects of 11 ERAP1 SNPs on Behçet's disease risk (748 Iranian BD patients and 776 healthy individuals).