Relevance of Coding Variation in FLG and DOCK8 in Finnish Pediatric Patients with Early-Onset Moderate-to-Severe Atopic Dermatitis

Early-onset, persistent atopic dermatitis (AD) is proposed as a distinct subgroup that may have specific genotypic features. FLG gene loss-of-function variants are the best known genetic factors contributing to epidermal barrier impairment and eczema severity. In a cohort of 140 Finnish children with early-onset moderate-to-severe AD, we investigated the effect of coding variation in FLG and 13 other genes with epidermal barrier or immune function through the use of targeted amplicon sequencing and genotyping. A FLG loss-of-function variant (Arg501Ter, Ser761fs, Arg2447Ter, or Ser3247Ter) was identified in 20 of 140 patients showing higher transepidermal water loss values than patients without these variants. Total FLG loss-of-function variant frequency (7.14%) was significantly higher than in the general Finnish population (2.34%). When tested separately, only Arg2447Ter showed a significant association with AD (P = 0.003104). In addition, a modest association with moderate-to-severe pediatric AD was seen for rs12730241 and rs6587667 (FLG2:Gly137Glu). Loss-of-function variants, previously reported pathogenic variants, or statistically significant enrichment of nonsynonymous coding region variants were not found in the 13 candidate genes studied by amplicon sequencing. However, higher IgE and eosinophil counts were found in carriers of potentially pathogenic DOCK8 missense variants, suggesting that the role of DOCK8 variation in AD should be further investigated in larger cohorts.


INTRODUCTION
Studies have suggested that patients with atopic dermatitis (AD) can be divided into subgroups not only on the basis of clinical phenotypes but also on the basis of biomarker and genotype status (Bieber et al., 2017). One such subgroup is early-onset AD with FLG loss-of-function (LoF) variants, increased asthma risk, high IgE levels, and parental AD history (Amat et al., 2018; Drislane and Irvine, 2020). Monomeric FLG protein is produced by cleavage from a large pro-FLG precursor containing 10–12 FLG repeats and is encoded by the FLG gene located in the epidermal differentiation complex (review by Sandilands et al. [2009]). FLG interacts with keratins in the skin, contributing to the formation of the uppermost layer of the epidermis (stratum corneum) and the natural moisturizing factor (Sandilands et al., 2009). The insufficiency of FLG results in impairment of the epidermal barrier and enhanced transepidermal water loss (TEWL), making the skin more permeable and vulnerable to diverse irritants, allergens, and pathogens (Kezic et al., 2008; Sandilands et al., 2009; Smith et al., 2006). Consequently, FLG LoF variants predispose to AD and ichthyosis vulgaris and affect eczema severity (Palmer et al., 2006; Smith et al., 2006) (review by Liang et al. [2016]).
FLG2 is part of the same gene family and similar in protein structure to FLG (Wu et al., 2009). Similar to FLG, FLG2 is found in the human epidermis where it is needed for proper cornification (Pendaries et al., 2015). Decreased FLG2 expression has been detected in human skin diseases (Makino et al., 2014), and homozygosity for FLG2 LoF variants causes the rare genodermatosis peeling skin syndrome (Alfares et al., 2017; Bolling et al., 2018; Mohamad et al., 2018). However, apart from a link to more persistent AD in one study of African Americans (Margolis et al., 2014), little is known about the effect of FLG2 variation on AD.
In addition to FLG variants, associations with higher risk or severity of AD have been found for variants in many other genes affecting epidermal barrier integrity or the immune response (Liang et al., 2016; Martin et al., 2020). To study the genetics behind moderate-to-severe pediatric AD in Finland, we investigated the relevance of sequence variation in FLG and FLG2 as well as in 12 other genes with a previous connection to AD pathogenesis in a cohort of Finnish pediatric patients with early-onset AD.

RESULTS
We studied 140 children with moderate-to-severe disease at the Skin and Allergy Hospital in Helsinki (Helsinki, Finland) as part of a 3-year randomized open-label follow-up study (Perälä et al., 2023). The baseline demographics of the study population are shown in Table 1 and Figure 1a. To determine the significance of FLG variations for Finnish pediatric AD, we genotyped selected single-nucleotide variations (SNVs) in the FLG (n = 8) and FLG2 (n = 6) genes and tested their association with AD using Fisher's exact test (Løset et al., 2019) and Benjamini–Hochberg false discovery rate correction for multiple testing (Benjamini and Hochberg, 1995). Variants detected in the patient cohort included the four most prevalent European FLG LoF variants (Arg501Ter, Ser761fs, Arg2447Ter, and Ser3247Ter), rs12730241 (G>A), and three FLG2 variants (Ser2377Ter, Cys298Ser, and Gly137Glu). Other genotyped loci included Gln1754Ter, Ser1020Ter, and Val603Met for FLG and Thr1314fs, Cys298Arg, and Leu168Phe for FLG2, but they were monomorphic in patients. Allele frequencies, carrier numbers, and association results are shown in Table 2.
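The Benjamini–Hochberg correction named above can be illustrated with a short Python sketch. This is a generic implementation of the step-up procedure, not the authors' pipeline; the function name and the example P-values are hypothetical and are not taken from Table 2.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four tested variants.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw P-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the raw ordering.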
A total of 20 of the 140 patients were heterozygous for an FLG LoF variant, which translated into a significantly higher combined FLG LoF-variant frequency of 7.14% in patients compared with 2.34% in controls (P = 2.72E-05, OR = 3.2). Although the small cohort size and low variant frequencies limited our ability to detect statistically significant associations for SNVs, the Arg2447Ter variant (n = 8) showed a significant association with AD (P = 0.003104, OR = 5.8). A modest association was also detected for rs12730241 (P = 0.028, OR = 1.5) and rs6587667 (P = 0.039, OR = 3.6). The rs6587667 variant co-occurred with the rs12730241 variant in 19 of 20 controls and in six of six patients, indicating that the two variants are in linkage disequilibrium and not independent. Hence, the contributions of these SNPs to the association signal as well as their causality should be evaluated in more detail by further studies on larger cohorts. FLG LoF-variant carrier status was not linked to disease severity, whereas TEWL at previous eczema site was significantly higher at 36 months in LoF carriers than in noncarriers (Table 3 and Figure 1a–c) (P = 0.029). In addition, carriers of Ser761fs had significantly higher eczema TEWL at baseline (P = 0.021). Changes in eczema treatment parameters (estimated marginal means) during the study in FLG LoF-variant carriers and noncarriers are shown in Figure 2a–d. Occurrence of rs12730241-A allele correlated with significantly lower TEWL at the study end (P = 0.036). There were no differences in clinical parameters between rs12730241-A allele homozygotes (n = 10) and heterozygotes (n = 49).
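The combined allele frequency and odds ratio at the start of this paragraph can be approximately reproduced from the stated numbers. The Python sketch below assumes two alleles per patient and uses the control frequency of 2.34% directly, because control allele counts are not given in this section; the function names are illustrative only.

```python
def allele_frequency(alt_alleles, n_individuals, ploidy=2):
    """Alternative-allele frequency among diploid individuals."""
    return alt_alleles / (n_individuals * ploidy)


def odds_ratio_from_frequencies(case_freq, control_freq):
    """Allelic odds ratio computed from two allele frequencies."""
    case_odds = case_freq / (1 - case_freq)
    control_odds = control_freq / (1 - control_freq)
    return case_odds / control_odds


# 20 heterozygous carriers among 140 patients -> 20 of 280 alleles.
patient_freq = allele_frequency(20, 140)
# Control frequency as reported in the text (an input, not recomputed).
combined_or = odds_ratio_from_frequencies(patient_freq, 0.0234)
print(f"{patient_freq:.2%}", f"{combined_or:.1f}")  # 7.14% 3.2
```

The odds ratio obtained this way is a point estimate only; the reported P-value depends on the control sample size, which this sketch does not model.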
We also conducted an exploratory candidate-gene study for possible, to our knowledge, previously unreported monogenic-like causes of moderate-to-severe pediatric AD by amplicon sequencing protein-coding regions of the following genes: FLG, CLDN1, DOCK8, IL13, IL17A, IL22, IL31, IL33, IL4, IL4R, SPINK5, signal transducer and activator of transcription 6 gene STAT6, and toll-like receptor 2 gene TLR2. Gene selection for the amplicon panel was based on the genes matching one or more of the following criteria: (i) cause for monogenic immunodeficiency with AD-like features (eczema, asthma, allergy); (ii) previously reported association with AD, allergy, or asthma; or (iii) known involvement in AD pathogenesis (Table 4). Unfortunately, owing to technical challenges in designing primers for highly homologous regions, we failed to cover all FLG gene coding regions. Hence, analysis of the FLG gene was limited to variants in nonhomologous regions at the ends of the gene and to the variants covered by genotyping.
After quality control and variant annotation, a total of 247 variant loci were identified in the 140 patients with AD in the 13 genes studied by sequencing (Supplementary Table S1). Of the detected variants, 13 had a previous association with AD, asthma, allergy, or eosinophil counts with modest effect sizes, but their frequency in our cohort was similar to that in population cohorts (gnomAD database, version 2.1) (Table 5) (Karczewski et al., 2020).
To identify putative high-impact variants, we filtered for rare, exonic variants causing likely LoF or amino acid change in the protein. A total of 116 of 247 variants were exonic, 69 were nonsynonymous, and 30 were rare or novel (frequency < 0.01 in gnomAD) (Figure 1d, Table 6, and Supplementary Table S2). LoF variants were not detected, but it should be noted that analyses for intronic deletions/duplications and copy number variation were not performed. Most of the rare variation was found in DOCK8 (21 missense variants at 13 loci in 19 patients), biallelic LoF variants of which cause autosomal recessive hyper-IgE syndrome (Aydin et al., 2015). When we used a Combined Annotation Dependent Depletion score cutoff of 15 (the median value for all possible canonical nonsynonymous and splice variants in Combined Annotation Dependent Depletion), nine DOCK8 missense variants in 11 patients were considered potentially harmful (Kircher et al., 2014). This included two patients with two such DOCK8 variants (NM_203447:c.A3535T p.S1179C/c.A4019G p.Y1340C and c.C305T p.T102I/c.C986T p.A329V), but it is not known whether these variants occurred in cis or trans. These two patients had parental AD, moderate disease, normal serum total IgE levels, slightly elevated eosinophil counts (0.40–0.53 E9/l), and positive aeroallergen sensitizations at baseline. They had no severe or frequent infections. One patient (IgE 277 kU/l) was diagnosed with epilepsy at 36 months. The other patient had high IgE and eosinophil levels (1,133 kU/l and 0.66 E9/l, respectively) as well as both food and aeroallergen sensitizations at the study end. Carriers of potentially harmful DOCK8 variants (n = 10) had significantly increased total IgE and eosinophil counts in comparison with noncarriers (n = 110) both at baseline (IgE: 374 vs. 70 kU/l, P = 0.003; eosinophils: 0.71 vs. 0.44 E9/l, P = 0.025) and at 36 months (IgE: 671 vs. 147 kU/l, P = 0.002; eosinophils: 0.59 vs. 0.38 E9/l, P = 0.032). FLG LoF carriers were excluded from this analysis. Clinical data in relation to DOCK8 status are shown in Supplementary Table S3, and change in eczema treatment parameters (estimated marginal means) during the study in patients with and without possibly pathogenic DOCK8 variants is shown in Figure 3a–d.
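The filtering logic described in this paragraph (exonic, nonsynonymous, rare or novel, Combined Annotation Dependent Depletion score at or above the cutoff) can be sketched in Python as follows. The record layout, the field names, the inclusive reading of the cutoff, and the example variants are assumptions made for illustration; they do not reproduce the study data or the annotation output actually used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    gene: str
    exonic: bool
    nonsynonymous: bool
    gnomad_freq: Optional[float]  # None means absent from gnomAD (novel)
    cadd: float                   # scaled CADD score


def is_rare(variant: Variant, max_freq: float = 0.01) -> bool:
    """Rare or novel: gnomAD frequency below the threshold or missing."""
    return variant.gnomad_freq is None or variant.gnomad_freq < max_freq


def potentially_harmful(variants: List[Variant], cadd_cutoff: float = 15.0) -> List[Variant]:
    """Keep rare, exonic, nonsynonymous variants with CADD >= cutoff (assumed inclusive)."""
    return [
        v for v in variants
        if v.exonic and v.nonsynonymous and is_rare(v) and v.cadd >= cadd_cutoff
    ]


# Hypothetical annotated variants, for illustration only.
example = [
    Variant("DOCK8", True, True, 0.002, 23.1),    # passes every filter
    Variant("DOCK8", True, True, None, 12.4),     # novel but below the CADD cutoff
    Variant("IL4R", True, True, 0.25, 18.0),      # common variant
    Variant("SPINK5", False, False, 0.001, 3.2),  # not exonic
]
kept = potentially_harmful(example)
print([(v.gene, v.cadd) for v in kept])
```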

DISCUSSION
Our study provides additional information on FLG LoF-variant carriers by presenting Finnish pediatric patients with moderate-to-severe AD. Many FLG variants have previously been shown to associate with AD (Brown et al., 2012; Luukkonen et al., 2017; Martin et al., 2020). However, these variants are only found in around 15–50% of patients with AD, and similarly, up to 40% of the carriers develop no AD at all (Palmer et al., 2006). In adult patients with AD, reduced FLG gene expression can occur in keratinocytes in the presence of IL-4 and IL-13 also without an FLG variant, whereas a similar finding was not made in a cohort of pediatric patients with early-onset AD (Esaki et al., 2016; Howell et al., 2009). Instead, barrier defects due to a reduced amount of epidermal lipids and tight junctions, such as claudins, have been seen in early pediatric AD (Bussmann et al., 2011).
In our pediatric cohort, FLG LoF variants were found in 14.3% (20 of 140) of the patients. The combined FLG LoF-variant frequency was significantly higher in study patients than in controls (7.14% vs. 2.34%). The frequency in patients is slightly higher than the 5.6% FLG LoF-variant frequency previously reported for a Finnish adult AD cohort (Luukkonen et al., 2017). However, when the FLG LoF variants were tested individually, only the Arg2447Ter variant (n = 8) showed a statistically significant association with AD in our cohort. The detection of other associations was restricted by the small size of the cohort, which is a notable limitation of this study. The FLG LoF-variant carriers presented with higher TEWL values than noncarriers both at baseline (Ser761fs) and at previous eczema sites at 36 months (combined FLG LoF Arg501Ter, Ser761fs, Arg2447Ter, or Ser3247Ter), consistent with epidermal barrier impairment. This is in line with previous findings showing that FLG LoF variants lead to reduced amounts of natural moisturizing factor in the skin (Kezic et al., 2008).
Previous research has shown that the genetic background of AD is heterogeneous, and there is considerable variation in the results of genetic association studies done in cohorts of varying sizes and ethnicity. In addition, the frequency of specific FLG LoF variants in people of African or Asian ancestry differs from those of European ancestry (Wong et al., 2018; Zhu et al., 2021). For instance, FLG Pro478Ser and c.3321delA are prominent variants in Asia (Kim et al., 2019; On et al., 2017), whereas Arg501Ter and Ser3249Ter are the two most common variants in northern Europe (Brown et al., 2012; Sandilands et al., 2009). The association between FLG variants and AD is less clear in people of African ancestry (Nomura and Kabashima, 2021). In the Finnish adult patients with AD, carrier frequencies of Arg501Ter, Ser761fs, and Ser3247Ter were lower than their reported frequencies in other European populations, whereas frequencies of Arg2447Ter, Gln1754Ter, and Ser1020Ter were slightly higher in Finns (Luukkonen et al., 2017 and current study). Because amplicon sequencing failed to cover most of the FLG gene, other rare or, to our knowledge, previously unreported FLG LoF variants could not be detected in our study, and hence the reported total FLG LoF frequency in our cohort may be an underestimation. FLG features intragenic copy number variation with allelic variation of either 10, 11, or 12 FLG repeats, which may affect the expressed FLG amount as well as urocanic acid concentration in the epidermis (Brown et al., 2012). In the Irish population, the rs12730241-A allele was used as a marker of the FLG allele with 12 repeats and was found to be protective against AD (Brown et al., 2012). Similarly, this variant was associated with a reduced risk of AD among European American subjects and in the Western Siberian population (Komova et al., 2014). However, in African Americans and in the previous Finnish adult AD work, rs12730241 showed an opposite effect, conferring instead increased risk of AD (Luukkonen et al., 2017). Moreover, Fernandez et al. (2017) found no association between AD severity and FLG repeat number in the Ethiopian population. In our Finnish pediatric AD cohort, we saw a modest association with AD for rs12730241 and another variant in linkage disequilibrium, rs6587667 (FLG2:Gly137Glu). The rs12730241-A allele was also associated with lower TEWL at previous eczema site at 36 months. These associations could be due to an effect of the FLG copy number, but it should be noted that a correlation between rs12730241 and the number of FLG repeats has not been confirmed in the Finnish population. Hence, it is possible that this association is driven by other genetic factors such as regulatory or structural variation present in the same haplotype. Currently, there are no validated markers for different numbers of FLG repeats in the Finnish population, and hence more in-depth analysis of the effect of FLG copy number variation in pediatric AD was out of scope for this study.
Although the significance of FLG2 in AD remains unclear for the most part, an association between two FLG2 variants and persistent AD was reported in African American patients (Margolis et al., 2014). Of these two variants, rs16833974/His1249Arg is extremely rare in the Finnish population (gnomAD FIN frequency = 0.0001592), but the rs12568784/Ser2377Ter variant is present at an allele frequency of 0.1309 and was thus included in our genotyping panel together with one other FLG2 LoF variant (Thr1314fs) and four missense variants (Gly137Glu, Leu168Phe, Cys298Ser, and Cys298Arg). No association was seen between the rs12568784/Ser2377Ter variant and the risk of moderate-to-severe pediatric AD in Finns. However, our study did not compare patients with nonpersistent and persistent AD, the setting in which the association for this variant was previously seen. Instead, we detected a modest association between rs6587667 (FLG2:Gly137Glu) and the risk of pediatric AD. However, owing to the linkage between this and the rs12730241 variant, the origin of this association signal needs further study.

Table 4 (partial). STAT6, signal transducer and activator of transcription 6: eczema, allergies; Furue (2020) and Tamura et al. (2001). References from a preceding row: Asad et al. (2016), Bergmann et al. (2020), and De Benedetto et al. (2011).
In amplicon sequencing of 13 AD-related genes, we did not identify any LoF variants, previously reported pathogenic variants, or statistically significant enrichment of nonsynonymous coding region variants. However, carriers of potentially harmful DOCK8 variants had significantly increased total IgE and eosinophil counts in comparison with noncarriers both at baseline and at 36 months. All but one carrier had allergic sensitizations at the study end. The relation of DOCK8 with AD has only been sparsely reported thus far, whereas DOCK8 deficiency due to recessive damaging variants is a well-documented cause of hyper-IgE syndrome and a tendency to viral infections (Biggs et al., 2017; Boos et al., 2014; Jacob et al., 2019; Yamamura et al., 2017). Although AD is a complex multifactorial disease, the increase in IgE and eosinophil counts in carriers of potentially pathogenic DOCK8 missense variants suggests that the role of DOCK8 variation in AD should be further investigated in larger cohorts.

Patient cohort and eczema severity measures
Genetic analyses were carried out on 140 children who had moderate-to-severe AD and participated in a 3-year randomized open-label follow-up study between 2013 and 2019 (Figure 1a and [...] (Chang et al., 2015). Analyses included samples with a maximum of two missing SNV calls and variants with missing call rates < 0.1. Amplicon sequencing for CLDN1, DOCK8, IL13, IL17A, IL22, IL31, IL33, IL4, IL4R, SPINK5, signal transducer and activator of transcription 6 gene STAT6, toll-like receptor 2 gene TLR2, and FLG was performed using Illumina Truseq Custom Amplicon Kit and the MiSeq system (Illumina, San Diego, CA). Amplicon target information can be found in Supplementary Table S4. Reads were aligned to the GRCh37 human reference genome assembly utilizing Bowtie2 (Langmead and Salzberg, 2012), and variant calling was done using an in-house pipeline as previously described (Rajala et al., 2015). Variant Call Format files were trimmed and combined using BCFtools (Li, 2011). To analyze germline variants, only variants with alternative/reference read frequency ratio > 0.2 were included in analyses. In addition, recurrent polymerase chain reaction/alignment errors were removed manually after visual inspection of data on Integrative Genomics Viewer (Robinson et al., 2017). Variants were defined as heterozygous when the alternative/reference read frequency ratio was between 0.2 and 0.8 and homozygous when the ratio was > 0.8. Variant annotations were performed with ANNOVAR (Wang et al., 2010). Rare variants were defined as having a frequency < 0.01 in gnomAD exomes and genomes of Finnish origin (Karczewski et al., 2020). Gene-wise rare variant frequencies in the pediatric AD cohort were also compared with rare variant frequencies in gnomAD to estimate the enrichment of rare variation in the selected candidate genes.
Variant frequencies in the study cohort were calculated by dividing the number of identified alternative alleles by the number of samples with a minimum of 10× coverage at the site. Variant pathogenicity was estimated with Combined Annotation Dependent Depletion (Kircher et al., 2014) and REVEL (Ioannidis et al., 2016), and evolutionary conservation was evaluated with GERP++ (Cooper et al., 2010). Information for variants with previous associations to AD, eczema, allergy, IgE levels, or eosinophil numbers was sought from the following publicly available online databases: FinnGen F6 (FinnGen, 2022), UK Biobank (Canela-Xandri et al., 2018; GeneAtlas, 2017), and The NHGRI-EBI GWAS catalog (Buniello et al., 2019).
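The read-ratio thresholds used for germline genotype calls in this section can be sketched in Python. The sketch assumes that the "alternative/reference read frequency ratio" means the fraction of alternative reads among all alternative plus reference reads, and that a fraction of exactly 0.8 counts as heterozygous; both are interpretations rather than details confirmed by the text, and the function name is hypothetical.

```python
from typing import Optional


def call_genotype(alt_reads: int, ref_reads: int) -> Optional[str]:
    """Classify a site from read counts using the 0.2 and 0.8 thresholds.

    Returns "het", "hom", or None when the variant is not retained.
    """
    total = alt_reads + ref_reads
    if total == 0:
        return None  # no coverage, no call
    alt_fraction = alt_reads / total  # assumed meaning of the ratio
    if alt_fraction > 0.8:
        return "hom"
    if alt_fraction > 0.2:
        return "het"
    return None  # at or below 0.2: excluded from germline analysis


print(call_genotype(45, 55))  # het
print(call_genotype(95, 5))   # hom
print(call_genotype(10, 90))  # None
```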

Ethical considerations and permits
All parents or legal guardians provided written informed consent. The ethics committee of the Helsinki University Central Hospital and the Finnish Medicines Agency approved the study protocol (222/13/03/03/2012, EudraCT2012-002412-95).

Statistics
Statistical analyses for clinical parameters were performed with the statistical software package SPSS 24 and 25 for Windows (IBM, New York, NY). Association of FLG and FLG2 variants with AD was tested using Fisher's exact test with Lancaster's mid-P adjustment and 95% confidence interval in PLINK. False discovery rate correction was applied to adjust P-values for multiple testing. A significance cutoff of P < 0.05 was used for all analyses. Quantitative clinical parameters in relation to genotype status were compared with the Mann–Whitney U test. Continuous variables are presented as medians with 25–75th percentiles (quartile 1–quartile 3). Categorical variables are presented as counts and percentages. Repeated measures ANOVA (general linear models) was used to analyze eczema severity parameters over time.
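For readers unfamiliar with the mid-P adjustment, the following Python sketch computes a two-sided Fisher's exact test with a mid-P correction for a 2×2 table of allele counts. Conventions for the two-sided mid-P differ between implementations; here every table exactly as probable as the observed one is counted at half weight, which is an assumption and may not match PLINK in every detail. The function name is hypothetical, and the example counts are made up rather than taken from the study.

```python
from fractions import Fraction
from math import comb


def fisher_mid_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact mid-P for the table [[a, b], [c, d]].

    Rows: cases and controls; columns: alternative and reference alleles.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> Fraction:
        # Hypergeometric probability of x alternative alleles in row 1.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), denom)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    p_mid = Fraction(0)
    for x in range(lowest, highest + 1):
        p = prob(x)
        if p < p_observed:
            p_mid += p       # more extreme tables count fully
        elif p == p_observed:
            p_mid += p / 2   # equally probable tables count at half weight
    return float(p_mid)


# Made-up allele counts for illustration.
print(fisher_mid_p(3, 1, 1, 3))
```

Exact rational arithmetic is used so that ties with the observed table are detected without floating-point tolerance.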

Data availability statement
DOCK8 variant data have been submitted to the Global Variome shared LOVD and can be found at http://databases.lovd.nl/shared/references/DOI:10.1016/j.xjidi.2023.100203