Investigating DNA methylation as a potential mediator between pigmentation genes, pigmentary traits and skin cancer

Abstract Pigmentation characteristics are well‐known risk factors for skin cancer. Polymorphisms in pigmentation genes have been associated with these traits and with the risk of malignancy. However, the functional relationship between genetic variation and disease is still unclear. This study aims to assess whether pigmentation SNPs are associated with pigmentary traits and skin cancer via DNA methylation (DNAm). Using a meta‐GWAS of whole‐blood DNAm from 36 European cohorts (N = 27,750; the Genetics of DNA Methylation Consortium, GoDMC), we found that 19 out of 27 SNPs in 10 pigmentation genes were associated with 391 DNAm sites across 30 genomic regions. We examined the effect of 25 selected DNAm sites on pigmentation traits, sun exposure phenotypes and skin cancer and on gene expression in whole blood. We uncovered an association of DNAm site cg07402062 with red hair in the Avon Longitudinal Study of Parents and Children (ALSPAC). We also found that the expression of ASIP and CDK10 was associated with hair colour, melanoma and basal cell carcinoma. Our results indicate that DNAm and expression of pigmentation genes may play a role as potential mediators of the relationship between genetic variants, pigmentation phenotypes and skin cancer and thus deserve further scrutiny.

and sun sensitivity phenotypes (Gordon, 2013). Even though there are missense and nonsense polymorphisms in pigmentation genes strongly associated with skin cancer, particularly in the gene MC1R (Nasti & Timares, 2015), it is not fully understood how non-coding variation in these genes relates to malignancy.
DNA methylation (DNAm) is an epigenetic modification with a prospective role in cancer aetiology. The association of global whole-blood DNA hypomethylation with cancer is well known and has also been described for melanoma (Cappetta et al., 2015;Shen et al., 2017). In addition, recent studies have shown the presence of hypomethylation in skin biopsy samples that were exposed to sunlight or artificial ultraviolet radiation (Grönniger et al., 2010;Holzscheck et al., 2020;Vandiver et al., 2015). Inter-individual DNAm variation at specific sites, measured in peripheral blood, has been uncovered as a predictor of a number of complex trait risk factors as well as all-cause mortality (McCartney et al., 2018). DNAm has the potential to be used as a molecular biomarker to diagnose disease or to assess prognosis in those affected by disease and is a promising candidate leading towards the realization of personalized medicine (Nikolouzakis et al., 2020).
This study investigated whether DNAm plays a role in the relationship between genetic variants, pigmentation-related skin cancer risk factors and skin cancer. First, we examined the effect of genetic variation on DNAm levels in peripheral blood across 10 regions robustly associated with pigmentation traits and skin cancer, in 36 cohorts of European descent. We then analysed the association of pigmentation SNP-associated DNAm sites with sun exposure and pigmentation phenotypes in participants of the Avon Longitudinal Study of Parents and Children (ALSPAC). Finally, using summary data-based Mendelian randomization (SMR), we explored whether SNP-related DNAm was likely to causally underlie the expression Significance Several genetic variants in pigmentation-related genes have been associated with skin cancer, particularly in populations of European ancestry. DNA methylation was investigated as a potential link between pigmentation SNPs, pigmentation phenotypes and skin cancer. DNA methylation sites near MC1R and ASIP were strongly associated with pigmentation SNPs, gene expression, hair colour, melanoma and basal cell carcinoma in an analysis that included data from the Genetics of DNA Methylation consortium, the Avon Longitudinal Study of Parents and Children and the UK Biobank. Further evaluation of the role played by DNA methylation in these relationships is warranted.
F I G U R E 1 Stages of analysis implemented and datasets used in this study. From the 391 unique DNA methylation (DNAm) sites associated with pigmentation SNPs, 25 were selected for further analysis as depicted. GoDMC, Genetics of DNA methylation consortium; ALSPAC, Avon Longitudinal Study of Parents and Children; CIT, causal inference test; SMR, summary data-based Mendelian randomization; eQTL browser, blood eQTL browser; CAGE, Cap Analysis of Gene Expression; GTEx, Genotype-Tissue Expression consortium; UKB, UK Biobank of genes associated with pigmentation phenotypes and skin cancer.
The purpose of this work was to carry out an integrative analysis to evaluate the joint contribution of genetic and epigenetic risk factors to skin cancer susceptibility. As this is the first instance of such an analysis involving genes associated with pigmentation traits and skin cancer, the clinical relevance of our findings will depend on subsequent replication and further investigation of the genes and pathways that we described here.

| MATERIAL S AND ME THODS
The different stages of this study are depicted in Figure 1.

| Identification of DNAm sites associated with pigmentation SNPs
The genetics of DNA methylation consortium (GoDMC) was created to study the genetic basis of DNAm variation and bring together resources and researchers with expertise in the epigenetics field (www.godmc.org.uk). One of its aims was to carry out a meta-GWAS were used here. We provide a brief description of the analysis in the Methods S1.
Using GoDMC data, we searched for DNAm sites that were strongly associated (p < 1 × 10 -5 ) with well-known pigmentation-related SNPs previously identified via genome-wide association stud-

| DNA methylation and pigmentation/sun exposure phenotypes in the Avon Longitudinal Study of Parents and Children (ALSPAC)
Cohort description and information regarding the collection of DNAm data in ALSPAC can be found in the Methods S1.

| Regression analysis
The pigmentation/sun exposure phenotypes evaluated included the following: skin reflectance, freckles, moles, sunburning, tanning ability, hair colour and eye colour. We examined DNAm at the nearest time point with respect to when the phenotypes were measured.
We assessed the association of cord blood DNAm and pigmentation  (Bonilla et al., 2014). Data on hair colour and tanning ability in young adults were collected in a recent ALSPAC questionnaire (Life@25+), using the same scales employed in past questionnaires (Bonilla et al., 2014).
We tested the association of SNP-associated DNAm sites with pigmentation and sun exposure phenotypes using t tests, one-way ANOVA tests, linear and logistic regression models. Regressions were adjusted for age, sex and the first 10 genetic principal components to account for population stratification. Pairwise correlation between DNAm sites was examined in ALSPAC children at age 7.
All analyses were carried out with the statistical package Stata v15.

| Mediation analysis
The causal inference test (CIT) approach (Millstein et al., 2009), implemented in the R package 'cit', was employed to investigate the causal direction between pigmentation SNPs, DNAm and red hair colour. Regressions were adjusted for sex, age and the top 10 genetic principal components.

| Summary data-based Mendelian randomization (SMR)
In order to investigate the potential for DNAm at selected sites to be causally related to gene expression and phenotypes such as pigmentation characteristics and skin cancer, we carried out a Mendelian randomization analysis with colocalization, implemented in the summary data-based Mendelian randomization (SMR) method (Zhu et al., 2016). We used the platform Complex Trait Genetics Virtual Lab (CTG-VL, https://genoma.io/; Cuellar-Partida et al., 2019) to run the statistical package SMR (Lloyd-Jones et al., 2017;Qi et al., 2018;Zhu et al., 2016). This method uses summary data from GWAS, mQTL or eQTL studies to distinguish causal or pleiotropic associations between DNAm and gene expression, or between either of these and a phenotype, from a situation where the traits are caused by different genetic variants that are in strong LD (see figure 1b of Zhu et al., 2016). If the former is true, that is a single genetic variant underlies DNAm and the other tested phenotype, the traits are said to be colocalized. The different SMR analyses run are described below. Unless otherwise reported, the SMR analyses p-value thresholds were as follows: p SMR < 5 × 10-7/5x10-6 and p HEIDI ≥ 0.05.

| Heritability of the DNA methylation sites associated with pigmentation SNPs
We checked the heritability of DNAm sites using the resource provided by the Complex Disease Epigenetics Group (www.epige nomic slab.com/onlin e-data-resou rces/; Hannon, Knox, et al., 2018) that employs twin data to report the variance in whole-blood DNAm explained by an additive genetic component, a shared environmental component and a unique environmental component.

| Identification of DNA methylation sites associated with pigmentation SNPs
We summarized available functional information on the analysed SNPs in Table 1.
Pigmentation SNPs that were strongly associated with DNAm sites in GoDMC are shown in Table S1.
We selected DNAm sites with the most reliable effects for further analysis as follows: sites associated with at least six of the DNAm sites that were selected for further analysis are shown in Table 2.
MC1R non-coding polymorphisms and functional variants exhib- DNAm at this position) and cg07130392, both associated with 7 of the 8 SNPs tested, and for cg08845973 and cg09738481, associated with 6 SNPs.
Interestingly, DNAm sites associated with both RHC variants, which are located in the first exon of the gene SPIRE2 (i.e. cg03388025, cg04287289, cg07402062, cg21285383, cg26513180), exhibited effects of similar magnitude and direction ( Figure S1). However, for these same DNAm sites, the association with rhc variants rs2228479 and rs885479 occurred in the opposite direction to that of the RHCs (Table S1).
One of the trans sites most frequently associated with MC1R SNPs The maximum variance in DNAm explained by a pigmentation SNP was 24.4% shown by the pair rs2228479-cg09569215.
Amongst the RHC, the highest R 2 was found for rs1805007, which explained 17.8% of variation at cg08854185 (Table S1).

| Other pigmentation genes
The number of SNP-DNAm site associations for the other pigmentation genes examined ranged from 1 (rs1800407 in OCA2) to 29 (rs619865 in ASIP; Table 1). There were 127 unique DNAm sites associated with at least one SNP in these 7 genes, while 102 DNAm sites were associated with only one polymorphism.
Five different DNAm sites in the ASIP region were associated with three SNPs each, whereas one site was associated with 3 SNPs in the OCA2/HERC2 region (Table S2). There was no DNAm site overlap between SNPs rs12210050 and rs12203592 in IRF4. The only fully consistent association between SNPs and DNAm sites was with cg01901788, located near ASIP (i.e. for all 3 SNPs the alleles associated with a fair skin phenotype increased DNAm at this site).

| DNA methylation sites and pigmentation/sun exposure phenotypes in ALSPAC
Pairwise correlation coefficients (Pearson's) and p-values for DNAm sites measured in ALSPAC children at age 7 are shown in Table S3a-c.
There was moderate to strong correlation between DNAm sites in the region that extends from cg04287289 to cg03388025 in MC1R (Table S3b), and some moderate correlation was detected between DNAm sites in unlinked genetic regions (Table S3a).

| Regression analysis
When we investigated the association of the selected DNAm sites with pigmentation and sun exposure traits in ALSPAC children, we found the strongest associations with red hair colour. The most robust association with this phenotype was that of cg07402062, where redhead children showed lower levels of DNAm at this site than non-redhead children (Table S4). This effect was seen with cord blood DNAm and red hair at 15 and 54 months (after correction for multiple testing), as well as with DNAm in adolescence and red hair at 18 years old (before correction for multiple testing). Associations with the other traits tested were either weaker or absent.

| Mediation analysis
The causal inference test (CIT) showed inconclusive results for SNPs rs1805007, rs1805008, rs258322, and rs4785763 (Table S5). There was evidence for a mediating effect of DNAm at cg07402062 between these pigmentation SNPs and red hair colour, as well as for red hair as a mediator of the relationship between the pigmentation SNPs and DNAm at cg07402062. On the other hand, for variants rs2228479 and rs885479, we found evidence of a causal effect of DNAm at cg07402062 on red hair colour, but not the other way around. Finally, rs11648785 appeared to influence both traits independently.

| Summary data-based Mendelian randomization (SMR) of whole-blood DNA methylation and gene expression
Out of the 25 DNAm sites under analysis, four (cg01023759, cg09738481, cg14091419 and cg26114043) were not present in any of the expression datasets used. Since the data sources only included variation in cis, cg26114043, located on chromosome 4, was left out. The 21 remaining sites showed associations with gene expression to varying degrees which were, for the most part, consistent TA B L E 2 Pigmentation SNP-associated DNA methylation (DNAm) sites selected for follow-up

| Summary data-based Mendelian randomization (SMR) of gene expression in skin, pigmentation/sun exposure traits and skin cancer
We found no evidence of a pleiotropic or causal association between gene expression at 16q24.3, 20q11.22, or any of the other pigmentation regions included in this study, and ease of skin tanning, skin colour, childhood sunburn occasions, melanocytic naevi and SCC, in sun-exposed and sun-unexposed skin.
There was evidence of an association with the same genetic variant between black hair colour and the expression of genes ASIP and FAM83C (20q11.22) in sun-exposed skin and of CDK10 (16q24.3), ASIP, NCOA6 (20q11.22) and EXOC2 (6p25.3) in sun-unexposed skin, whereas blonde hair colour was associated with the expression of ASIP and FAM83C in sun-exposed skin and with NCOA6 and EXOC2 in sun-unexposed skin. In addition, ASIP gene expression was associated with self-reported and diagnosed malignant melanoma, and BCC in both skin tissues, and CDK10 was associated with self-reported and diagnosed malignant melanoma, and BCC in unexposed skin (Table S7).

| Additional summary data-based Mendelian randomization (SMR) of whole-blood DNA methylation and gene expression
Because the DNAm sites we selected were associated with the expression of just 5 of the genes identified in the previous step (i.e.

DBNDD1, DEF8, GGT7, NCOA6, SPATA2L), we searched for additional
DNAm sites that showed an association with the expression of the 3.8 | Summary data-based Mendelian randomization (SMR) of whole-blood DNA methylation, pigmentation/sun exposure traits and skin cancer We uncovered 696 DNAm sites associated and colocalized with pigmentary and skin cancer traits (Table S9), but of these, only a reduced number (n = 18) were associated with gene expression colocalizing with these phenotypes (see Table S8). From this set of 18 DNAm sites, 9 were identified in our original analysis in GoDMC but only one (cg01901788) was on the list we followed up (the others were associated with three SNPs or less, Table S2). SNPs associated with these "mediating" 9 DNAm sites were as follows: rs11648785, rs1805008 (R160W), rs2228479 (V92M) and rs4785763 in the MC1R region; and rs1015362, rs4911414 and rs619865 in the ASIP region (Table 3).

| Heritability
Nine (38%) of 24 DNAm sites followed up showed a heritability value of 80% or more (additive genetic contribution), almost all of these are located in MC1R. DNAm site cg01023759 was unavailable in this dataset. Median heritability was 63% (IQR 31%, 85%; Table S10). The five correlated DNAm sites associated with SPIRE2 expression and also with SNPs rs1805007, rs1805008, rs2228479 and rs885479 showed a heritability higher than 63%.

| D ISCUSS I ON
In this study, we reported the existence of strong associations between pigmentation/sun exposure/skin cancer-associated SNPs and DNAm variation at specific sites, mostly in cis, with a few cases in trans. This appears to be especially marked in the MC1R gene region, which could be related to the high CpG content present in this region (Martinez-Cadenas et al., 2013). The SNP-DNAm site associations were observed for functional variants (RHC and rhc) as well as for non-coding polymorphisms within MC1R.
For a set of selected DNAm sites (those associated with several pigmentation SNPs in the same direction), we assessed their effect on pigmentation/sun exposure phenotypes measured in ALSPAC participants. Although sample sizes were modest, we detected a strong inverse association of red hair with cg07402062, which is located in the gene SPIRE2 (MC1R region). Remarkably, a recent GWAS of hair colour has identified the strongest genetic signal of association with red hair (versus brown/black hair) at SNP rs34357723 (T allele associated with red hair; LD with rs1805007, r 2 = .27), a signal that remained relevant even after adjustment for MC1R coding variants (Morgan et al., 2018). Rs34357723 is an intronic variant of SPIRE2, and GoDMC data indicate that it is also a strong mQTL for cg07402062, explaining ~16% of DNAm at this site (T allele beta = −0.82, SE = 0.01). The CIT analysis suggested that a mediation effect of cg07402062 DNAm between rs34357723 and red hair colour is compatible with ALSPAC data (see Table S5), although replication of these findings is needed.

Galanter and colleagues reported that DNAm at cg07402062
and cg21285383 differed between ethnic groups and that these differences could be accounted for by differences in genetic ancestry (Galanter et al., 2017), which would also be consistent with hair colour variation between groups.
The high heritability detected for these two DNAm sites, along with cg03388025, cg04287289 and cg26513180, and their association with RHC and rhc variants suggests a close relationship between SPIRE2 DNAm status and the MC1R gene structure.
SPIRE2 was recently shown to participate in myosin Vadependent transport of melanosomes in melanocytes via the generation of actin tracks (Alzahofi et al., 2018). However, despite SPIRE2 expression colocalizing with DNAm at cg07402062 and correlated sites, we detected no evidence of colocalization with hair colour GWAS signals, and thus, we cannot yet conclude that this gene causally underlies the phenotype.
Using summary data and two-sample Mendelian randomization implemented in SMR, we determined that the selected DNAm sites affected expression of a number of genes in cis, but there was limited overlap between this set of genes and those that showed an association of expression with pigmentation traits and skin cancer. Given that the DNAm expression analysis was carried out using whole-blood data, and the expression-phenotype analysis used skin data, the differences observed may be due to this discrepancy.
Previous reports on DNAm in pigmentation/skin cancer-associated regions include a study by Roos and colleagues (Roos et al., 2017) who showed that DNAm in healthy human skin was associated in cis with SNPs underlying melanoma risk discovered by GWAS, including two polymorphisms in MC1R (rs258322, rs4785763, which we tested) and one in the ASIP region (rs910873, not considered by us but in LD with rs619865). SNPs in SLC45A2 (rs35390) and TYR (rs1393350, rs1847134) were also investigated but yielded no evidence of association with DNAm, a finding similar to ours. Roos et al. also found that none of the DNAm sites associated with these SNPs was associated with naevus count in their epigenome-wide association study (EWAS). Likewise, in our SMR analysis presence/absence of melanocytic naevi failed to colocalize with any DNAm site.
Another study by Budden and Bowden described the regulation of MC1R expression by a CpG island overlapping a potential enhancer in the MC1R gene (Budden & Bowden, 2018). These authors found that this CpG island is unmethylated in melanocytes but highly meth- gions in gene and GC content, which has been found to be correlated . In this study, polymorphisms in MC1R were associated with DNAm variation at substantially more sites than is the case in other pigmentation genes, with lower GC content, tested (for instance, MC1R region SNPs are associated with 95 DNAm sites each on average, whereas ASIP region SNPs were associated with an average of 27 DNAm sites each, see Table 1).
From an evolutionary point of view, it is interesting to consider the potential for epigenetic modifications in pigmentation genes as a mechanism of adaptation to variable levels of ultraviolet radiation exposure, providing phenotypic flexibility in a changing environment, similar to what has been suggested for adaptation to high altitude (Julian, 2017). Since there is evidence that pigmentation genes, such as MC1R and ASIP, have been shaped by natural selection (Martinez-Cadenas et al., 2013;Norton et al., 2007;Quillen et al., 2019), polymorphic variants in them may act, in part, by affecting how they are epigenetically regulated to optimize a response to an environment with reduced sunlight.

| Limitations
Our analysis was conducted on DNA derived from whole-blood rather than skin tissue, and any pigmentation SNP-DNAm site associations may not be as strong or present in skin. We used heritability estimates provided by Hannon, Gorrie-Stone, et al. (2018) to assess the likelihood of the DNAm sites examined to vary in a similar way in different tissues. These authors reported a low contribution on average of additive genetic effects on DNAm across the genome (~16%) and a higher heritability of the sites that were correlated across tissues (median 71%). Since most of the DNAm sites (16/24) we analysed had heritability values > 50%, it is possible that the associations that we detected with pigmentation SNPs are also there in skin. Moreover, for two MC1R polymorphisms, Roos et al. (2017) already described an association in healthy skin with some of the which included cg04287289 and cg26513180, with ethnicity in placental tissue. However, the small overlap between the genes whose whole-blood expression was affected by the DNAm sites we tested and those whose expression in skin was associated with pigmentation and skin cancer phenotypes could be suggesting that there are in fact differences between tissues.
Unfortunately, mQTL data on populations of non-European ancestry are limited. In the only study involving one of these populations that we identified, Pierce and colleagues reported an association of 5 of our pigmentation DNAm sites with SNPs in cis in the population of Bangladesh (Pierce et al., 2018). While this is an indication that similar genetic-epigenetic relationships are likely to occur in other groups, knowledge of the role of these chromosomal regions on pigmentation in non-White populations is still scarce (Pavan & Sturm, 2019).
Each of the methods we used has their own particular limitations, so it is advisable to be cautious in the interpretation of our findings.
The CIT method requires that the analysis is carried out in one sample where the SNP, exposure and outcome have all been measured, and therefore, statistical power is constrained by the size of this sample. It may also be affected by collider bias and phenotype measurement error. In order to protect against collider bias, it is essential to adjust for all confounders of the exposure-outcome relationship. While we adjusted the regression models for age, sex and genetic principal components, unknown confounders may be still playing a role (Hemani et al., 2017).
The SMR method was conceived to try to tell apart pleiotropy from LD in the event of an association between gene expression and a complex trait, and thus to interpret results under a model of pleiotropy rather than causality, even though it could be consistent with the latter as well (Zhu et al., 2016). To tease out a causal effect, additional analyses are necessary, such as two-step Mendelian randomization (Relton & Davey Smith, 2012) using independent (in cis or trans) mQTLs. However, independent instrumental variables for the DNAm sites in this study have been hard to find since there is usually considerable LD with the original SNPs. As a consequence, SMR results are mostly intended to be used to highlight genes for follow-up studies and not to establish unequivocal causal relationships (Zhu et al., 2016).
In summary, our results suggest that DNAm may lie on the causal pathway leading from pigmentation-associated SNPs to pigmentation/sun exposure traits and eventually skin cancer, possibly via regulation of gene expression, although more evidence to support these findings is needed. This study puts forward a set of genes that could be prioritized in upcoming investigations (e.g. CDK10 or SPIRE2) to delve deeper into the role played by the variability in DNAm as a potentially causal risk factor for skin cancer. We have reported for to obtain a better picture of pathways leading to disease. However, there is still some way to go before human genetic and epigenetic information can be introduced fully into the dermatology clinic for the provision of personalized medicine.

CO N FLI C T O F I NTE R E S T
The authors have no conflicts of interest to declare.