Epigenome-wide association data implicates DNA methylation-mediated genetic risk in psoriasis

Psoriasis is a chronic inflammatory skin disease characterized by epidermal hyperproliferation, altered keratinocyte differentiation, and inflammation, and it is caused by the interplay of genetic and environmental factors. Previous studies have revealed that DNA methylation (DNAm) and genetic markers are closely associated with psoriasis, and strong evidence has shown that DNAm can be controlled by genetic factors, which prompted us to evaluate the relationship among DNAm, genetic markers, and disease status. We utilized the genome-wide methylation data of psoriatic skin (PP, N = 114) and unaffected control skin (NN, N = 62) tissue samples from our previous study, and we performed whole-genome genotyping on peripheral blood from the same individuals to evaluate the underlying genetic effect on skin DNA methylation. The causal inference test (CIT) was used to assess whether DNAm mediates the effect of genetic variation on psoriasis and to gain a better understanding of the epigenetic basis of psoriasis susceptibility. We identified 129 SNP-CpG pairs achieving the significant association threshold, which comprised 28 unique methylation quantitative trait loci (MethQTL) and 34 unique CpGs. Eighteen SNPs were associated with psoriasis at a Bonferroni-corrected P < 0.05, and these 18 SNPs formed 93 SNP-CpG pairs with 17 unique CpG sites. We found that 11 of the 93 SNP-CpG pairs, composed of 5 unique SNPs and 3 CpG sites, presented a methylation-mediated relationship between SNPs and psoriasis. The 3 CpG sites were located on the body of C1orf106, the TSS1500 promoter region of DMBX1, and the body of SIK3. This study revealed that DNAm of some genes can be controlled by genetic factors and can also mediate risk variation for psoriasis in the Chinese Han population, providing novel molecular insights into the pathogenesis of psoriasis.


Background
Psoriasis is a chronic inflammatory skin disease with a complex etiology involving genetic risk factors and environmental triggers. Psoriasis prevalence in Europe and North America is about 2%, and the prevalence increases linearly over an individual's life course, from 0.12% at age 1 year to 1.2% at age 18 years [1]. Psoriasis may cause a substantial reduction in mental and physical functioning, strongly influencing patients' quality of life [2]. The tremendous health, social, and economic burden, combined with the lack of successful treatment for this disease, emphasizes the necessity of exploring the genetic and molecular mechanisms of psoriasis etiology.
DNA methylation (DNAm) is a molecular process that reversibly modifies cytosine residues in humans and other eukaryotic organisms. Since the finding that exogenous methylated DNA is transcriptionally repressed when transfected into cultured mammalian cells, DNAm has been closely linked to gene regulation. Epigenetic marks closely regulate gene expression; DNAm may affect the binding efficiency of transcription factors at gene regulatory elements. Hypermethylation of CpG islands (CGI) in promoter regions has been found to repress the regulated genes, while hypomethylation has been associated with active transcription. On the other hand, several lines of evidence support the notion that sequence variations correlate profoundly with DNAm in human tissues and cultured cells [3], and this may pave the way for the development of new therapeutic strategies.
The role of DNAm in cancer pathogenesis and some common immune-related diseases has been extensively studied [4]. Skin-based analysis revealed hundreds of methylation loci that contribute to disease onset or progression [5]. Only a few epigenome-wide methylation profiling studies in psoriasis patients have been conducted, and these had relatively small sample sizes [5,6]. Meanwhile, disease-associated DNAm differences may arise as a consequence of the disease or may be independently acted upon by the genotype. Thus, it is important to identify the exact sites and their involvement in the disease process.
Previously, we identified 264 differentially methylated sites (DMSs) that were significantly associated with psoriasis [7]. However, the mechanisms underlying the association of these methylation markers with psoriasis are not clear. It has been hypothesized that DNAm can mediate the genetic risk in psoriasis. Utilizing blood-derived genotype data and skin-derived epigenetic data from a total of 114 psoriasis cases and 62 normal controls, we set out to search for methylation markers that potentially mediate genetic risk for psoriasis.

Identification of methylation quantitative trait loci
We identified 129 SNP-CpG pairs that reached statistical significance (Bonferroni-corrected threshold P < 6.0 × 10⁻⁸ = 0.05/829,060); these 129 SNP-CpG pairs comprised 28 unique SNPs and 34 unique CpGs (Additional file 1: Table S1). The methylation quantitative trait loci (MethQTL) CpG probes varied across annotated regions, with 58.8% (20/34) located in gene bodies and none located in 3′ UTR or intergenic regions (Additional file 2: Figure S1). Compared to the genome distribution, the MethQTL CpG probes were significantly enriched in gene bodies (P = 0.046, Fig. 1a). CpG probes in gene body regions were more likely to harbor MethQTL than probes in intergenic and 3′ UTR regions, the two most variable regions of the genome. The likelihood of detecting MethQTL CpGs was not dependent on the variability of the methylation level when compared with the global variation of each category (Fig. 1b). To examine whether the positions of MethQTL CpGs are related to disease severity, we compared the average methylation level across annotated regions; the methylation level in promoter and gene body regions was higher in patients with severe psoriasis than in those with mild psoriasis. In contrast, the average intergenic methylation level was lower in severe than in mild psoriasis patients (Additional file 1: Table S2).

In-depth analysis of the major histocompatibility complex region
Interestingly, almost all of the MethQTL SNPs were spread over a 3-Mb region covering the major histocompatibility complex (MHC) cluster, which has been reported to harbor several susceptibility loci for psoriasis. To assess whether these SNPs are associated with psoriasis, the association between the 28 MethQTL SNPs and disease status was tested using an additive genetic effect model. Of these SNPs, 18 were significantly associated with psoriasis at P < 6.0 × 10⁻⁸ (Bonferroni correction) (Additional file 1: Table S3). The most significant signal came from rs130079 (P = 1.02 × 10⁻⁹), a coding SNP located in the 13th exon of CCHCR1, which leads to a T → C substitution and causes a Cys to Gly amino acid change (Fig. 2a). After retrieving public ChIP-seq data for the rs130079 region from the GM12878 cell line, we found that the rs130079 region did not overlap with active regulatory chromatin marked by H3K4me3 and H3K27Ac (Fig. 2b). The sequences surrounding rs130079 are highly conserved among multiple species including human, pig, sheep, horse, and dog, indicating a potential evolutionary role (Fig. 2b).

Establishing epigenetic mediation of genetic risk
Having established that 18 SNPs are tightly associated with both psoriasis and CpG sites, we used the causal inference test (CIT) to evaluate whether methylation at these CpG sites mediates the effect of the SNPs on psoriasis. The CIT is a formal statistical hypothesis test that quantifies uncertainty in a causal inference pertaining to a measured factor that potentially mediates a known causal association between a locus and a disease phenotype. These 18 SNPs formed 93 SNP-CpG pairs with 17 unique CpG sites (Additional file 1: Table S4). When the CIT was performed for the 93 SNP-CpG pairs, the SNP effects of 11 pairs were attenuated after adjusting for methylation (Table 1). These 11 SNP-CpG pairs were composed of 5 unique SNPs (rs13203895 in USP8P1, rs2853953 in HLA-C, rs10484554 near WASF5P, rs2245822 near HLA-C, and rs2853952 in HLA-C) and 3 CpG sites (cg04087571 in SIK3, cg09914444 in DMBX1, and cg06834507 in C1orf106) and presented a methylation-mediated relationship between SNPs and psoriasis. The CpG site cg04087571 (β = −0.13, 95% CI [−0.16, −0.11]), located on the body of SIK3, presented the most significant mediation for rs13203895 (Fig. 3). After adjusting for cg04087571, the effect of rs13203895 on psoriasis was attenuated from −0.49 to −0.26 (Fig. 3d). The SNP rs13203895 lies 4.1 kb downstream of HLA-C and is highly conserved among multiple species (Fig. 2b). The remaining two CpG sites are located on the body of C1orf106 and the TSS1500 promoter of DMBX1, respectively. We noticed that almost all MethQTL SNPs were located around the HLA-C gene, suggesting that they may play an important role similar to that of HLA-C in the etiology of psoriasis (Fig. 2b).
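As a rough illustration of the attenuation reported for rs13203895, the share of the SNP effect that disappears after adjustment can be computed from the two coefficients quoted above. This is only a sketch: the "proportion attenuated" summary is an illustrative assumption and not a statistic reported in this study, and the function name is hypothetical.

```python
def proportion_attenuated(beta_unadjusted: float, beta_adjusted: float) -> float:
    """Fraction of the unadjusted SNP effect removed by adjusting for the mediator.

    Illustrative summary only (an assumption); the study itself reports
    the two coefficients, not this ratio.
    """
    if beta_unadjusted == 0:
        raise ValueError("unadjusted effect must be non-zero")
    return 1.0 - beta_adjusted / beta_unadjusted


# Coefficients quoted in the text for rs13203895, before and after
# adjusting for methylation at cg04087571.
share = proportion_attenuated(-0.49, -0.26)
print(f"{share:.1%} of the SNP effect is attenuated after adjustment")
```

Under these assumptions, a little under half of the unadjusted effect is removed, which is in line with partial rather than complete mediation.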

Discussion
Psoriasis is an immune-mediated chronic inflammatory skin disorder that is characterized by abnormal interactions between keratinocytes and immune cells such as CD4+ T cells, resulting in keratinocyte hyperproliferation in the epidermis [6]. Evidence clearly suggests that the methylation of cytosine residues at CpG dinucleotides contributes to the development of psoriasis [8]. Methylation changes in naïve CD4+ T cells may affect CD4+ T cell polarization in psoriasis, indicating that perturbation of CD4+ T cell methylation may be critical for psoriasis pathogenesis [6,9,10]. DNAm of dermal mesenchymal stem cells (MSCs) probably participates in the pathogenesis and development of psoriasis through an extraordinarily complex mechanism [11]. Here, we analyzed blood-derived genotype data and skin-derived epigenetic data, aiming to identify methylation markers that potentially mediate genetic risk for psoriasis.
We implemented a causal statistical mediation framework to analyze methylome, genotype, and phenotype data. A total of 129 SNP-CpG pairs achieved the significant association threshold, comprising 28 unique MethQTL and 34 unique CpGs. MethQTL CpG probes varied across annotated regions, with more than half located in gene bodies and none located in 3′ UTR or intergenic regions. The enrichment of MethQTL CpG probes in gene bodies may be explained by the increased propensity of these sites to mutate, which raises the risk of germline or somatic mutations [12]. Upon spontaneous hydrolytic deamination, methylated cytosine turns into thymine, which is harder to repair than uracil, the product of deamination of unmethylated cytosine. Indeed, comprehensive analyses have revealed that highly frequent G → T transversion mutations are readily detected at CpG dinucleotides across the genome [13]. The likelihood of detecting MethQTL CpGs was not dependent on the variability of the methylation level when compared with the global variation of each category. Thus, genetic variations may profoundly influence DNAm through as yet undetermined mechanisms.
Associations of genetic variation with expression and methylation levels have been identified in several studies, and both local (cis) and distal (trans) genetic variants are associated with methylation levels [9,14,15]. As in a previous study on epithelial ovarian cancer [16], all of the MethQTL SNPs in the current study were trans-MethQTL, whereas some studies have reported that cis-MethQTL account for a larger proportion of MethQTL [17,18]. There are two possible explanations. First, our analysis was based on a relatively small sample size and a stringent significance threshold (P = 6.0 × 10⁻⁸); more data will be needed to evaluate cis-MethQTL when more skin tissues become available. Second, all trans-MethQTL were located in the MHC region, suggesting strong genetic control of methylation variance in the studied population, although the biological mechanism is not well understood. To check whether MethQTL SNPs regulate expression of cognate genes, we also retrieved gene expression data from our previous RNA-seq data set, which included 20 PP and 20 NN samples. However, no significant expression changes were observed among individuals with different genotypes.
Using the CIT, we identified three putative mediators (cg04087571, cg06834507, and cg09914444) for five MHC variants. Our analysis indicated that cg04087571, which is associated with salt-inducible kinase 3 (SIK3), was the most significant mediator for rs13203895 in USP8P1. SIK3 is an AMP-activated protein kinase-related kinase and a novel energy regulator that modulates cholesterol and bile acid metabolism by coupling with retinoid metabolism [19]. We also found that cg06834507, residing in the body of the C1orf106 gene, was a potential mediator of genetic risk for psoriasis. The function of C1orf106 remains to be elucidated, but coding variants in this gene have been associated with inflammatory Crohn's disease (CD) [20], which shares clinical and immunological features with psoriasis [21]. The CpG site cg09914444 is located on the TSS1500 promoter of DMBX1, which encodes a member of the bicoid subfamily of homeodomain-containing transcription factors. The encoded protein acts as a transcription factor and may play a role in brain and sensory organ development [22].
Fig. 3 Genotype-dependent CpGs potentially mediate genetic risk for psoriasis and the relationship among cg04087571, rs10484554, and psoriasis. a Association between cg04087571 and disease status. b Association between cg04087571 and rs10484554. Red horizontal bars represent average DNA methylation levels. c Percentage of cases for each rs10484554 genotype. d The coefficient (β) represents the dependence of psoriasis phenotype (P) on genotype (G) with or without adjustment for DNA methylation level of cg04087571 (M). The bars represent the 95% confidence interval for the coefficient. e, f Examples of DMSs that potentially mediate genetic risk for psoriasis. Density plots of methylation levels by genotype for two SNP-CpG pairs. Plot lines are colored by genotype.
It is well known that stressful life events, anxiety, and depression constitute major risk factors for the occurrence and recurrence of psoriasis, aggravating the severity of the disease and the duration of symptoms [23]. Conversely, psoriasis itself also contributes to depression, anxiety, and psychological stress. Although how genetic effects are mediated epigenetically is not clear, one hypothesis assumes that DNAm could mediate phenotypic plasticity in response to a varying environment [24]. In this study, we suspected that MHC variants might regulate psoriasis variability in addition to mean disease phenotype by regulating psoriasis plasticity through DNAm.
Consistent with the findings from rheumatoid arthritis (RA) [4], almost all of the psoriasis-related MethQTL were located in the MHC region, which harbors several susceptibility loci for psoriasis. We found that 18 of the 28 MethQTL SNPs were associated with psoriasis, with the most significant signal, rs130079, located in the 13th exon of CCHCR1. Interestingly, we did not find strong CpG mediation for this most significant disease-associated locus, indicating that rs130079 exerts its biological function through a different pathway. The SNP rs130079 was not located in active chromatin marked by H3K4me3 and H3K27Ac, as indicated by public ChIP-seq data, suggesting that rs130079 might not be directly involved in regulating the expression of nearby genes. The sequences surrounding rs130079 are highly conserved among multiple species including human, mouse, dog, and chicken, indicating a potential evolutionary role. Importantly, the missense polymorphism rs130079 causes an amino acid substitution from Cys to Gly. Although previous linkage analysis, fine mapping, conditional analysis, and genome-wide association studies (GWAS) have identified the MHC as associated with psoriasis and its clinical subtypes [5,25-29], we speculate that rs130079 may be a methylation mediator that regulates the genetic risk of psoriasis indirectly.
It should be acknowledged that causality cannot be established on the basis of a pure case-control study. A limitation of this study is that we did not have enough samples to conduct further replication and validation experiments. Our previously published data sets did not contain both genetic and DNA methylation data [30,31]; thus, those data sets were not suitable for a complete validation of the SNP-CpG pairs identified here. We are currently collecting more skin samples and aim to carry out a new project to reveal genetic and epigenetic crosstalk in the future. As discussed above, the fact that cg04087571 and cg09914444 can potentially connect external environmental factors and psoriasis status encourages us to believe that they could play roles in disease etiology. Meanwhile, we should also emphasize that our findings are based on an epidemiological hypothesis. Although all these findings should be validated in an independent, larger population and with pathogenesis investigations, these markers can serve as a starting point for further studies.

Conclusions
This research is one of the first studies to jointly evaluate genotypes, methylation, and disease status in psoriasis. We identified several psoriasis-associated MethQTL CpG sites and SNPs. The MethQTL CpG sites are enriched in gene body regions, and almost all the MethQTL SNPs are located in the MHC region. We found that CpG sites of C1orf106, DMBX1, and SIK3 mediate the genetic risk of psoriasis. These findings, obtained through analysis of epigenome-wide association data, provide new insights into the pathogenesis of psoriasis and represent a promising avenue through which to investigate novel therapeutic approaches for psoriasis.

Sample preparation
In our previous study, 217 human skin tissue samples from 114 psoriasis cases and 62 normal controls were collected from the Department of Dermatology, the First Affiliated Hospital, Anhui Medical University, Anhui Province, China. Three types of tissues were collected: 114 psoriatic skin tissues from psoriasis patients (PP), 41 matched uninvolved skin tissues from psoriasis cases (PN), and 62 unaffected skin tissues from normal controls (NN). The detailed clinical characteristics and the genome-wide methylation experiment have been described previously [7]. In this study, we analyzed only the methylation data of the 114 PP and 62 NN samples, together with blood-derived genotype data from the same individuals. The case and control samples were matched as closely as possible for sex and age, and no statistically significant differences were detected (P > 0.01, Table 2).

Differentially methylated sites identification
Genome-wide methylation levels were measured using Infinium HumanMethylation450 BeadChips, which quantitatively measure more than 485,000 methylation loci. The methylation detection for each sample was described in detail in our previous study [7]. In brief, we tested the association between single-probe CpG methylation and psoriasis using the non-parametric Wilcoxon rank-sum test. Probe methylation was considered significantly different between the tested groups at a Bonferroni-corrected P < 0.05. Statistical analysis was performed with the Illumina Methylation Analyzer (IMA) package in R. A total of 264 DMSs were significantly associated with psoriasis in both the PP versus PN and PP versus NN comparisons [7]. These 264 DMSs were analyzed in the current study.

Genome-wide genotyping and quality control
The genome-wide genotyping analysis was conducted using Illumina HumanOmni ZhongHua-8 BeadChips containing more than 890,000 variations. Genotypes of SNPs were called by Illumina BeadStudio 3.2 software (Illumina, San Diego, CA, USA). SNPs were excluded if they had a call rate lower than 90%, minor allele frequency (MAF) of <1%, and/or significant deviation from Hardy-Weinberg equilibrium in the controls (P < 1 × 10⁻⁷). After data cleaning and quality control, 829,060 SNPs remained for the MethQTL analysis.
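The exclusion criteria above can be sketched as a simple per-SNP filter. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the `SnpStats` record, the SNP identifiers, and the example values are hypothetical, and the summary statistics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary statistics (illustrative only)."""
    snp_id: str
    call_rate: float       # fraction of samples with a genotype call
    maf: float             # minor allele frequency
    hwe_p_controls: float  # Hardy-Weinberg equilibrium P value in controls


# Thresholds taken from the quality-control description in the text.
CALL_RATE_MIN = 0.90
MAF_MIN = 0.01
HWE_P_MIN = 1e-7


def passes_qc(s: SnpStats) -> bool:
    """Keep a SNP only if it fails none of the three exclusion criteria."""
    return (
        s.call_rate >= CALL_RATE_MIN
        and s.maf >= MAF_MIN
        and s.hwe_p_controls >= HWE_P_MIN
    )


# Made-up example SNPs, one per failure mode plus one that passes.
snps = [
    SnpStats("snp_a", 0.99, 0.20, 0.50),
    SnpStats("snp_b", 0.85, 0.20, 0.50),   # fails call rate
    SnpStats("snp_c", 0.99, 0.005, 0.50),  # fails MAF
    SnpStats("snp_d", 0.99, 0.20, 1e-9),   # fails HWE in controls
]
kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```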

Technical verification
To further evaluate the quality of the genotype data, we selected 5 SNPs (rs13203895, rs2853953, rs10484554, rs2245822, and rs2853952) to be re-genotyped by direct Sanger sequencing in 96 selected samples from the HumanOmni ZhongHua-8 panel. The concordance rate between the genotypes from Illumina and Sanger sequencing was 97.6%. We aimed to verify the methylation data of cg04087571, cg09914444, and cg06834507 by Sequenom EpiTYPER technology as described in the supplementary materials of our previous study [7]. However, we could not design suitable primers for these three regions. Instead, we evaluated the performance of the Illumina Methylation450 array using Sequenom EpiTYPER technology by measuring the nine CpGs reported in the previous study [7]. For 24 PP samples from the Illumina array, we found high consistency between the two platforms, suggesting high quality of the original data (Spearman correlation rho > 0.6, P < 1 × 10⁻⁴, Additional file 1: Table S5).
To further verify our results, we compared the overlapping differentially methylated sites shared by the HumanMethylation27K and 450K bead arrays from Roberson's study and our study, respectively. Among the 456 overlapping sites, 99.8% (455/456) showed the same hyper- or hypomethylated trend, suggesting high consistency and excellent data quality (data not shown).

Methylation quantitative trait loci analysis
Associations between all 264 differentially methylated CpG sites and common SNPs (MAF ≥5%) were examined. An additive genetic effect model was applied in the MethQTL analyses. To adjust for age and sex, we first regressed methylation on age and sex and took the residuals. The methylation residuals were then tested for association with SNP genotypes. The association analysis was performed using PLINK 1.07. To correct for multiple testing, a rigorous Bonferroni correction was applied (P < 0.05/829,060 = 6.0 × 10⁻⁸). In this study, a cis-MethQTL was defined as a SNP located less than 500 kb upstream or downstream of the CpG locus, while a trans-MethQTL was defined as a SNP located more than 500 kb from the target CpG locus on the same chromosome or on a different chromosome.
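The multiple-testing threshold and the cis/trans definition above can be written out as a short sketch. The function names and the example coordinates are hypothetical, and the handling of a SNP exactly 500 kb from the CpG (treated here as trans) is an assumption, because the text does not specify that boundary case.

```python
N_SNPS = 829_060          # SNPs remaining after quality control (from the text)
ALPHA = 0.05              # family-wise significance level (from the text)
CIS_WINDOW_BP = 500_000   # cis window on either side of the CpG (from the text)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def classify_methqtl(snp_chr: str, snp_pos: int, cpg_chr: str, cpg_pos: int) -> str:
    """Label a SNP-CpG pair as 'cis' or 'trans'.

    'cis' means same chromosome and less than 500 kb apart; anything else,
    including a distance of exactly 500 kb (an assumption), is 'trans'.
    """
    if snp_chr == cpg_chr and abs(snp_pos - cpg_pos) < CIS_WINDOW_BP:
        return "cis"
    return "trans"


threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"Bonferroni threshold: {threshold:.1e}")

# Hypothetical coordinates, chosen only to exercise each branch.
print(classify_methqtl("6", 31_000_000, "6", 31_200_000))    # same chromosome, 200 kb apart
print(classify_methqtl("6", 31_000_000, "6", 32_000_000))    # same chromosome, 1 Mb apart
print(classify_methqtl("6", 31_000_000, "11", 116_000_000))  # different chromosomes
```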

Causal inference test
CIT is applicable to data that include genotype, a possible causal mediator such as DNA methylation, and an outcome of interest. It is used to assess the relationship between a causal factor (genotype, G), a potential mediator (methylation, M), and an outcome (psoriasis status, Y). To establish that methylation acts as a mediator of genetic risk for psoriasis, the following criteria should be met: (i) G and Y are associated, (ii) G is associated with M after adjusting for Y, (iii) M is associated with Y after adjusting for G, and (iv) G is independent of Y after adjusting for M. If M is a consequence of Y or is independently controlled by G, conditioning on M should not change the effect of G on Y. However, when M mediates the genetic risk for psoriasis (Table 1), conditioning on M should substantially reduce the effect of G on Y (Fig. 4). We focused our CIT analyses on the 264 psoriasis-associated CpG sites and the MethQTL SNPs described above that were also significantly associated with psoriasis. The CIT P value is defined using the intersection-union test framework as the maximum of the four component test P values [32]. The significance threshold was set at CIT P < 0.05. The test was performed using the causal inference test ("cit") package in R.
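The intersection-union rule described above can be illustrated with a minimal sketch. It covers only the final combination step and does not reimplement the component tests of the R "cit" package; the function names and the example P values are hypothetical, and each component test is assumed to be formulated so that a small P value supports the corresponding criterion.

```python
from typing import Sequence


def cit_p_value(component_p: Sequence[float]) -> float:
    """Combine the four component P values by taking their maximum.

    Under the intersection-union framework, the overall test is only as
    significant as its weakest component.
    """
    if len(component_p) != 4:
        raise ValueError("expected exactly four component P values")
    if any(not 0.0 <= p <= 1.0 for p in component_p):
        raise ValueError("P values must lie in [0, 1]")
    return max(component_p)


def supports_mediation(component_p: Sequence[float], alpha: float = 0.05) -> bool:
    """Apply the significance threshold stated in the text (CIT P < 0.05)."""
    return cit_p_value(component_p) < alpha


# Made-up component P values for criteria (i) to (iv), for illustration only.
pair_a = [1e-9, 2e-4, 3e-3, 0.01]
pair_b = [1e-9, 2e-4, 3e-3, 0.20]
print(cit_p_value(pair_a), supports_mediation(pair_a))
print(cit_p_value(pair_b), supports_mediation(pair_b))
```

Taking the maximum means that a pair is called methylation-mediated only when all four criteria hold at the chosen level, which is why a single weak component (as in the second example) is enough to reject mediation.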

Additional files
Additional file 1: Table S1. Psoriasis-associated DMSs that are controlled by SNPs.
Additional file 2: Figure S1. Genomic annotation for methylation probes. Refseq gene annotations were from the University of California, Santa Cruz (UCSC) hg18 reference genome [National Center for Biotechnology Information (NCBI) Reference Sequence Database Release 37]. CpG sites located in or near Refseq genes are separated into TSS1500 (1500 bp upstream of the transcription start site), TSS200, 5′ UTR, first exon, gene body, and 3′ UTR regions. Intergenic probes refer to methylation sites not mapping to any of the other categories. The ambiguous regions include sites that fell into two or more different categories. The promoter region includes methylation sites located in TSS1500, TSS200, 5′ UTR, and first exon. (DOC 92 kb)