Decoding the genetic and epigenetic basis of asthma

Asthma is a complex and heterogeneous chronic inflammatory disease of the airways. Alongside environmental factors, asthma susceptibility is strongly influenced by genetics. Given its high prevalence and our incomplete understanding of the mechanisms underlying disease susceptibility, asthma is frequently studied in genome‐wide association studies (GWAS), which have identified thousands of genetic variants associated with asthma development. Virtually all these genetic variants reside in non‐coding genomic regions, which has obscured the functional impact of asthma‐associated variants and their translation into disease‐relevant mechanisms. Recent advances in genomics technology and epigenetics now offer methods to link genetic variants to gene regulatory elements embedded within non‐coding regions, which have started to unravel the molecular mechanisms underlying the complex (epi)genetics of asthma. Here, we provide an integrated overview of (epi)genetic variants associated with asthma, focusing on efforts to link these disease associations to biological insight into asthma pathophysiology using state‐of‐the‐art genomics methodology. Finally, we provide a perspective as to how decoding the genetic and epigenetic basis of asthma has the potential to transform clinical management of asthma and to predict the risk of asthma development.


| INTRODUCTION
Asthma is a common and heterogeneous non-communicable disease that affects over 300 million people worldwide. Asthma is characterized by variable respiratory symptoms such as coughing, wheezing, chest tightness, and airflow limitation caused by chronic airway inflammation, tissue remodeling, bronchial hyperresponsiveness, and mucus hypersecretion. [1][2][3] Global asthma prevalence is increasing rapidly, and despite the significant advances in asthma treatment over the past decades, approximately 10% of asthma patients do not adequately respond to standard therapy. 4 Why susceptible individuals develop asthma whereas others do not is a central question in the field that remains only partially resolved to date. Environmental factors early in life, such as viral infections or allergen exposure, provide an important piece to this complex puzzle. 5 In addition, genetics plays a critical role in explaining asthma susceptibility. Over the past 12 years, genome-wide association studies (GWAS) have made tremendous strides in annotating this genetic component of asthma susceptibility. 6 The vast majority of the identified genetic variants are not associated with altered protein function but are instead enriched in non-coding gene regulatory elements (GREs), 7,8 small genomic regions that can control gene expression. 9 GREs, such as promoters and enhancers, are bound by DNA-binding proteins called transcription factors (TFs) that regulate gene transcription. TF action is heavily influenced by epigenetic modifications of the genome, collectively referred to as the epigenome. 10 These include DNA methylation and post-translational histone modifications, which affect the ability of TFs to bind DNA and control gene expression. 11 Since DNA methylation is a heritable modification, 12 epigenome-wide association studies (EWAS) have searched for DNA methylation changes associated with the development of asthma. 13,14 By modulating gene regulatory processes, genetic and epigenetic variants can affect gene expression levels and, as a consequence, cellular function.
Despite the wealth of potentially disease-relevant information provided by GWAS and EWAS, follow-up studies have been relatively scarce due to the complexity with which (epi)genetic variants may impact gene regulation in different cell types. Fortunately, recent advances in epigenomics technology now provide a toolbox for linking disease-associated variants to transcriptional control mechanisms. This review provides a detailed overview of the (epi)genetic landscape linked to asthma susceptibility, focusing specifically on recent findings from GWAS and EWAS. Importantly, we provide an update on the latest developments in epigenomic approaches to further decode the (epi)genetic basis of complex diseases such as asthma, which offers promising opportunities to move our knowledge on the (epi)genetics of asthma toward clinically relevant applications.

| THE PATHOPHYSIOLOGY OF ASTHMA
Traditionally, asthma has been grouped into two broad subtypes: allergic and non-allergic asthma. However, this classification turned out to be an oversimplification, 15 as extensive inflammatory and clinical heterogeneity exists among asthma patients that is not captured by the allergic/non-allergic dichotomy. 2,4 More recently, the field has adopted a distinction based on the presence ("T2-high") or absence ("T2-low") of substantial type-2 inflammation in the airways. 3,16 T2-high asthma mainly involves allergic disease types and affects most children with asthma as well as approximately 50% of adult patients.
The chronic type-2 immune response in T2-high asthma, including allergic asthma, is characterized by an eosinophilic airway inflammation. In allergic patients, the disease starts with sensitization to inhaled allergens such as house dust mite (HDM), animal dander, pollen, or fungal spores during childhood that eventually results in chronic airway inflammation. Allergic patients can be defined by the presence of immunoglobulin E (IgE) antibodies in the serum and/or a positive skin prick test for inhaled allergens. In contrast, patients with T2-low asthma are usually non-allergic and often have a more neutrophilic or paucigranulocytic inflammatory profile. 2,3 In individuals susceptible for developing allergic T2-high asthma, a type-2 immune response is initiated by lung epithelial cells that release the alarmin cytokines IL-25, IL-33, and thymic stromal lymphopoietin (TSLP) upon allergen exposure (Figure 1). Alarmins stimulate dendritic cells (DCs) to take up allergens and migrate to the lymph

FIGURE 1 Pathophysiology of type-2 asthma. In susceptible individuals, allergens induce the airway epithelium to produce and release alarmin cytokines IL-25, IL-33, and thymic stromal lymphopoietin (TSLP). These activate dendritic cells (DC) to migrate toward the lymph node and present allergen-derived antigens to naive T cells (TH0). IL-4 produced by T follicular helper (TFH) cells stimulates naive T cells to differentiate towards TH2 cells, which migrate to the lung. In addition, IL-4 produced in the lymph node by the TFH cells initiates class switching of B cells to produce IgE antibodies. In the lung, not only TH2 cells but also type 2 innate lymphoid cells (ILC2s) and CD8+ T cells produce type 2 cytokines, which subsequently instigate effector functions in B cells, M2 macrophages, mast cells, and eosinophils. Finally, the chronic type-2 inflammation induces hallmark asthma symptoms (listed in the orange box).

23 29,30 ILC2s are innate counterparts of TH2 cells, which lack an antigen-specific receptor but respond to alarmins by producing copious amounts of type 2 cytokines. 31 T2-low asthma does not feature a prominent type-2 immune response and is associated with a late onset of disease, poor response to corticosteroid therapy, and obesity. 3 Clinical heterogeneity and a lack of proper mouse models have resulted in a poor understanding of the immunological basis of T2-low asthma. The inflammatory pathways most consistently associated with T2-low asthma are related to inflammasome activation, including IL-1β and IL-6 signaling, 3 as well as IL-17-mediated neutrophilic inflammation. 3,28 These pathways are likely triggered by environmental factors such as microbes, pollutants, or cigarette smoke. 32 In this context, neutrophils can act as pathogenic effector cells inducing epithelial cell damage and mucus hyperproduction. 3 Given the complexity and heterogeneity of asthma as a disease, it is important to acquire a better understanding of asthma susceptibility and pathophysiology.

| GENETIC ASSOCIATIONS WITH ASTHMA
Before the advent of GWAS, decades of work involving classic linkage analysis and candidate-gene approaches had linked numerous genetic loci to asthma. 33,34 Moreover, twin cohort studies have shown that up to 70% of asthma susceptibility originates from genetic factors. 35,36 More recently, GWAS has been used to further explore the genetic basis of asthma. In GWAS, two cohorts of case and control individuals are genotyped on single nucleotide polymorphism (SNP) arrays that include so-called "tag" SNPs that represent haplotype blocks in the genome. The genotypes of the two cohorts are then compared through statistical methods in order to link genetic variants to specific traits. 37 Since the first asthma GWAS appeared in 2007, 38 the NHGRI-EBI GWAS catalogue has grown to contain 179 published association studies on asthma (both allergic and non-allergic asthma) or asthma-related traits. 6 Table 1
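The case-control comparison described above can be illustrated for a single tag SNP. The sketch below is a minimal example, not the procedure of any cited study: it assumes a simple allelic chi-square test with one degree of freedom on a 2x2 table of allele counts, and the function name and all counts are hypothetical. Real GWAS pipelines typically use regression models with covariates and apply a genome-wide significance threshold (conventionally p < 5e-8) to account for the number of tests.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 df) and odds ratio for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts for one tag SNP (not taken from any study).
chi2, p_value, odds_ratio = allelic_association(
    case_alt=300, case_ref=700, ctrl_alt=200, ctrl_ref=800
)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, OR={odds_ratio:.2f}")
```

In this toy table the alternative allele is more frequent in cases than in controls, which yields an odds ratio above 1. The same calculation would be repeated for every genotyped SNP.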

| ASTHMA: INTEGRATING GENETIC AND EPIGENETIC ASSOCIATIONS
EWAS investigate the relationship between epigenetic modifications and traits and have mostly focused on identifying regions of differential DNA methylation across cohorts, which could in part explain trait susceptibility not captured by GWAS. 67 Importantly, the epigenome, in particular DNA methylation, can be influenced by environmental factors such as air pollution and dietary components. 68 In the context of asthma, the sum of all these external environmental influences (commonly referred to as the "exposome") may thus have a profound effect on asthma susceptibility. [69][70][71] It is important to note that epigenetic modifications are cell type-specific, and therefore care should be taken when interpreting EWAS results. The first asthma EWAS was published by Stefanowicz et al. in 2012. 72 Since then, the National Genomics Data Center EWAS Atlas has come to list 12 published asthma EWAS 73 (summarized in Table 2).
Differentially methylated cytosine-phosphate-guanine (CpG) nucleotides (DMCs) were identified in non-coding regions near genes previously associated with asthma by GWAS, including SMAD3, IL5, and ORMDL3, revealing significant colocalization with GWAS SNPs. 76,77 Integration of GWAS and EWAS findings has the potential to provide new insights into asthma susceptibility and disease mechanisms. To this end, we integrated asthma GWAS and EWAS data and found a moderate proportion of GWAS- and EWAS-associated genes to overlap (n = 609), which were mostly involved in T cell differentiation and activation, cell migration and inflammation (Figure 2A and Figure 2B). Among these overlapping genes were most of the canonical asthma-associated genes such as GATA3, ORMDL3, and IL5. 33

Interestingly, CpG islands that are located in these distal regions can exert substantial enhancer activity. 78
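A gene-level integration of this kind amounts to intersecting two gene lists and asking whether the overlap is larger than expected by chance. The following sketch is illustrative only: the text does not state which statistical test, if any, was applied to the overlap, so the one-sided hypergeometric test used here is an assumption. The gene sets are toy examples (GENE_A and GENE_B are placeholders), and the universe size of 100 is hypothetical.

```python
from math import comb


def overlap_enrichment(set_a, set_b, universe_size):
    """Return shared genes and a one-sided hypergeometric p-value for the overlap."""
    shared = set_a & set_b
    k, a, b = len(shared), len(set_a), len(set_b)
    # P(overlap >= k) when b genes are drawn at random from the universe.
    favourable = sum(
        comb(a, i) * comb(universe_size - a, b - i)
        for i in range(k, min(a, b) + 1)
    )
    p_value = favourable / comb(universe_size, b)
    return shared, p_value


# Toy gene sets; GENE_A and GENE_B are placeholders, not real associations.
gwas_genes = {"GATA3", "ORMDL3", "IL5", "SMAD3", "GENE_A"}
ewas_genes = {"GATA3", "ORMDL3", "IL5", "SMAD3", "GENE_B"}

shared, p_value = overlap_enrichment(gwas_genes, ewas_genes, universe_size=100)
print(sorted(shared), f"p={p_value:.2e}")
```

With real data, the universe would be the set of genes that could in principle be assigned to a variant or CpG site in both study types, a choice that strongly affects the resulting p-value.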

| LINKING VARIANTS TO CHANGES IN GENE EXPRESSION: QUANTITATIVE TRAIT LOCUS ANALYSIS
The localization of asthma-associated variants in non-coding regions precludes the direct determination of causal variant-gene relationships. Hence, an important next step is to link (epi)genetic variants to differences in gene expression. A classic statistical approach to uncover the relationship between a phenotypic trait and genotype is by

TWAS was also able to yield new asthma-gene associations, as whole blood eQTLs were used to identify 4 reproducibly affected novel asthma genes involved in nucleotide synthesis and nucleotide-dependent cell activation. 25 Although TWAS does not inform on the exact causal genetic variants, when combined with additional fine-mapping strategies, TWAS is a powerful method for elucidating the mechanisms underlying GWAS findings. 109 An online collection of TWAS data can be found in the TWAS hub (http://twas-hub.org). 110
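In its simplest form, an expression QTL (eQTL) test regresses the expression level of a gene on the number of alternative alleles (0, 1, or 2) that each individual carries at a SNP. The sketch below shows this with ordinary least squares; it is a minimal illustration under that assumption, with hypothetical dosages and expression values, and it omits the covariates, normalization, and multiple-testing correction used in published eQTL analyses.

```python
def eqtl_slope(dosages, expression):
    """Ordinary least-squares slope and r^2 of expression on allele dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx                  # expression change per extra allele
    r_squared = sxy ** 2 / (sxx * syy)  # fraction of variance explained
    return slope, r_squared


# Hypothetical data: six individuals, allele dosage and normalized expression.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

slope, r_squared = eqtl_slope(dosages, expression)
print(f"slope={slope:.2f}, r^2={r_squared:.2f}")
```

A positive slope indicates that each additional copy of the alternative allele is associated with higher expression of the gene in the tissue or cell type that was profiled.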

| EPIGENOMICS TO PINPOINT PUTATIVE CAUSAL VARIANTS
The non-coding localization of trait-associated variants is not the only challenge that complicates the interpretation of GWAS findings. Another main hurdle is linkage disequilibrium (LD), which refers to the linked heritability of neighboring genetic variants in the genome as they co-segregate through meiosis. GWAS in fact make use of this: by focusing on a limited set of variants that represent regions of high LD ("tag SNPs"), modern GWAS can assay genome-wide for genotype-phenotype associations without having to analyze every individual SNP. However, many SNPs will reside in high LD with tag SNPs, thus often obscuring the identification of true causal variants.
Hence, GWAS associations generally do not implicate single variants but rather identify regions of high LD linked to the trait under investigation.
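The strength of LD between two biallelic SNPs is commonly summarized as r², the squared correlation between their alleles. The sketch below computes r² from phased haplotypes; the haplotype counts are hypothetical, and the function name is chosen for illustration only.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes given as (a, b) 0/1 pairs."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # allele frequency at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # allele frequency at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # LD coefficient D
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical sample of 10 haplotypes: the two alleles mostly travel together.
haplotypes = [(1, 1)] * 4 + [(0, 0)] * 5 + [(1, 0)] * 1

r2 = ld_r_squared(haplotypes)
print(f"r^2 = {r2:.3f}")
```

An r² close to 1 means that the genotype at one SNP almost fully predicts the genotype at the other, which is why a tag SNP can stand in for its neighbors and also why association data alone cannot tell them apart.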
Making biological sense of non-coding genetic variants or CpG methylation changes is greatly facilitated by integrating epigenomics data. 99 Functional effects of (epi)genetic variants are, similar to eQTLs, highly cell type-specific, since transcriptional regulation is orchestrated by cell type-specific GREs such as enhancers. 111 In addition, the activation status of certain cell types, in particular immune cells, is reflected in highly dynamic epigenomes and transcriptomes. 85 show that GWAS SNPs are enriched in Th2-specific GREs, and that some of these GREs exhibited differential H3K4Me2 levels in cells from asthma patients compared with healthy controls. 118 Notably, many GWAS variants localize to stimulation-responsive GREs, highlighting the critical importance of including epigenome information from activated (immune) cells. 112,113,122 Indeed, we showed that asthma-associated genetic variation is concentrated in H3K4Me2+ putative GREs in ILC2s activated by epithelial alarmins IL-25 and IL-33, revealing both shared and unique SNP-GRE colocalization with those observed in Th2 cells. 119 Of note, altered histone modifications and regulatory protein binding have also been linked to asthma pathophysiology outside the context of (epi)genetic risk variants. 123 For example, studies in epithelial cells from asthmatics and healthy controls revealed widespread changes in histone acetylation, indicating changes in gene regulatory mechanisms in the airways of asthma patients. 124 Enhancers are often located at large genomic distances from their target genes, regulating gene expression at long range through spatial interaction (or "chromatin looping") with promoter regions. [125][126][127] Hence, enhancers do not necessarily regulate the expression of the nearest gene, which poses a problem for linking SNP-GRE combinations to candidate genes.
Together with QTL data, chromosome conformation capture (3C) methods that measure spatial proximity between genomic regions can be highly informative. 128 Combined with functional assays to directly test whether a SNP affects GRE activity, detailed analysis of the epigenome at disease-associated loci provides a first step toward identifying the mechanistic basis of an (epi)genetic association.

After the initial list of tag and LD SNPs has been reduced to a smaller list of putative causal SNPs based on QTL and regulatory SNP annotations (Figure 3), several experimental approaches exist to further strengthen the functional relationships between candidate variants, GREs and target genes. To test whether a GRE has enhancer or promoter activity, in vitro or in vivo reporter assays can be used. In these assays, the putative GRE is coupled to a reporter gene (e.g., firefly luciferase) in a DNA construct (e.g., a plas-

Additionally, obesity-associated variants within the FTO locus were shown to disrupt a long-range enhancer of IRX3, a gene critical for controlling body mass in mice. 148 Importantly, this GWAS follow-up strategy has already resulted in the development of new therapies.

Bauer et al. reported that common genetic variation in an intron of BCL11A associated with elevated fetal hemoglobin levels disrupts TF binding to an erythroid-specific enhancer of BCL11A, effectively reducing BCL11A levels and as a direct consequence increasing fetal hemoglobin levels. 149 Gene therapy targeting this BCL11A enhancer has now been successfully used to treat patients with adult hemoglobin defects (e.g., sickle cell disease). 150 Thus, a combination of epigenetics, QTL analyses and experimental validation can translate GWAS findings into disease-relevant biological insights.
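The computational part of such a prioritization strategy, in which tag and LD SNPs are narrowed down using QTL status and regulatory annotations, can be sketched as a simple filter. The example below is a hedged illustration rather than a published pipeline: the `Snp` class, the SNP identifiers, the coordinates, and the enhancer peaks are all hypothetical, intervals are treated as half-open (BED-style), and the filter requires both criteria although either one alone may be used in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int       # 0-based position
    is_eqtl: bool  # True if the SNP is an eQTL in the cell type of interest


def in_any_region(snp, regions):
    """True if the SNP lies in a (chrom, start, end) interval; end is exclusive."""
    return any(
        chrom == snp.chrom and start <= snp.pos < end
        for chrom, start, end in regions
    )


def prioritize(snps, regions):
    """Keep SNPs that are eQTLs and overlap a putative regulatory region."""
    return [s.rsid for s in snps if s.is_eqtl and in_any_region(s, regions)]


# Hypothetical candidates (tag SNP plus LD SNPs) and enhancer peaks.
candidates = [
    Snp("snp_1", "chr1", 1_050, True),
    Snp("snp_2", "chr1", 1_900, True),
    Snp("snp_3", "chr1", 1_100, False),
    Snp("snp_4", "chr2", 1_050, True),
]
enhancer_peaks = [("chr1", 1_000, 1_200), ("chr1", 5_000, 5_400)]

print(prioritize(candidates, enhancer_peaks))  # prints ['snp_1']
```

Only the SNP that is both an eQTL and located inside an enhancer peak survives the filter. The remaining short list is what would then be taken forward to reporter assays or genome editing.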

| FOLLOW-UP STUDIES OF MAJOR ASTHMA-ASSOCIATED GENETIC VARIANTS
Functional studies into the biological mechanisms underlying genetic risk loci for asthma remain relatively scarce. To date, only a handful of asthma-associated loci have been subjected to more

FIGURE 3 Schematic overview of a generic workflow to link non-coding GWAS/EWAS associations to gene regulatory mechanisms in specific cell types. Tag SNPs and LD SNPs from GWAS integrated with DMCs from EWAS studies provide a list of candidate SNPs and DMCs. Variants can subsequently be prioritized by filtering SNPs with QTL analyses and/or to regulatory regions in the genome. The latter is illustrated by a schematic example of a ChIP-Seq and ATAC-seq-based approach to prioritize SNPs that reside in putative enhancer regions of T helper cells or fibroblasts, which are marked by H3K27Ac. This significantly reduces the number of candidate causal SNPs and assigns (a lack of) cell-type specificity to the remaining variants. The resulting list of putative causal variants can then be validated in in vitro or in vivo based experiments such as reporter assays combined with chromosome conformation capture assays, CRISPR-Cas9-based approaches and transcription factor binding motif analyses, in order to gain biological insight into the regulatory function of the tested variants. Abbreviations: DMC, differentially methylated CpG; GRE, gene regulatory element; MPRA, massively parallel reporter assay; QTL, quantitative trait locus; SNP, single nucleotide polymorphism; TF, transcription factor; TWAS, transcriptome-wide association study.

variants and asthma susceptibility. 121 In addition, EWAS identified a DMC (cg05616858) in a putative GRE associated with asthma and ORMDL3 expression. 13 However, in vivo models of allergic asthma using transgenic mice in which ORMDL3 levels were altered have generated conflicting results, 157,158

which was further supported by reduced TNFAIP3 expression in lung epithelial cultures of (severe) asthmatics. 64 Notably, an exonic SNP in TNFAIP3 (rs2230926) was associated with allergy and asthma risk in children, 64 suggesting multiple mechanisms through which common genetic variation could affect A20 protein function.
Other asthma-associated loci from GWAS have received less attention. The 5q31 GWAS signal spanning the Th2 cytokine locus was shown to harbor a SNP (rs2240032) that is located in the Th2 locus control region (LCR), a cluster of enhancer elements that have been extensively characterized in mouse models as a critical driver of IL4, IL5, and IL13 transcription in Th2 cells. 171,172 Rs2240032 was suggested to influence IL4 expression and IL13 promoter methylation in cord blood cells, which may be relevant for asthma pathogenesis given the central role of these cytokines in the development of allergic and T2-high asthma. 173 5q31 variants may also affect type-2 cytokine production by other cell types, including human ILC2s. 119

adult onset, obesity-associated asthma or asthma with specific comorbidities could yield endotype-specific variants and offer novel biological insights. 178 Indeed, well-powered GWAS have, for example, identified genetic variants specifically linked to childhood- or adult-onset asthma, 56,60,179 and a smaller-scale study identified (m)eQTLs specifically associated with obesity-associated asthma in children. In addition, well-powered and ethnically diverse studies also have the potential to improve the power of common variants to identify individuals at risk of developing asthma via polygenic risk scores, which can be used in the clinic for patient stratification and precision medicine approaches. 179,180 Through such approaches, GWAS/EWAS findings may also prove valuable as biomarkers in a clinical context, for example, by supporting patient phenotyping or for predicting risk of developing exacerbations and therapy response. 33,181,182 Given that the cost of (epi)genotyping assays has been rapidly decreasing, we envision that well-curated sets of (epi)genetic variants could in the future be employed in various clinical settings.
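A polygenic risk score is, at its core, a weighted sum of the risk alleles an individual carries, with weights usually taken from GWAS effect sizes (for example, log odds ratios). The sketch below shows this calculation under simplifying assumptions: the SNP identifiers, weights, and dosages are hypothetical, and a missing genotype is counted as zero risk alleles, which a real pipeline would instead handle by imputation.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages; missing genotypes count as 0."""
    return sum(weight * dosages.get(rsid, 0) for rsid, weight in weights.items())


# Hypothetical per-allele weights (log odds ratios) for three SNPs.
weights = {"snp_a": 0.20, "snp_b": 0.10, "snp_c": -0.05}

# Hypothetical risk-allele dosages (0, 1, or 2) for one individual.
individual = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(individual, weights)
print(f"PRS = {score:.2f}")
```

A raw score of this kind is only meaningful relative to the distribution of scores in a reference population with matching ancestry, which is one reason why ethnically diverse GWAS matter for clinical use.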
Regarding EWAS findings, it should be noted that replication of these associations has remained challenging, likely due to relatively small cohort sizes and differences in the investigated cell types or tissues. 76 Perhaps the most significant challenge that we now face is how to capitalize on the wealth of available asthma GWAS/EWAS data.
Given that human genetics evidence supports two-thirds of the 2021 FDA-approved drugs 183 and that drug targets with genetic support are twice as likely to get FDA approval, 184 we believe it is important for the field to focus on the functional dissection of asthma-associated variants. Indeed, current asthma treatment modalities include biologicals that act on proteins that have been extensively associated with asthma in GWAS and EWAS data, including IL-5, IgE, and TSLP. [185][186][187][188] However, very few associations have been thoroughly investigated to determine which variant(s) are causal, what molecular mechanisms underlie the impact of the genetic variants, and which biological pathways relevant for the risk of developing asthma are affected. Furthermore, in order to fully understand the mechanistic basis of asthma, it is crucial to validate findings from association studies in more sophisticated models, such as animal or primary cell-based models, rather than remaining at the association level. It will be important in this endeavor to not only focus on genes and pathways that are well-known to be relevant for asthma pathogenesis. Rather, loci with less clear links to the disease should also be explored, as these may provide novel entry points for biomarker or therapy development. 57,58 With an ever-increasing capacity for "omics" data generation and analysis, we propose to take full advantage of publicly available epigenomics and QTL resources to prioritize non-coding variants for functional follow-up studies to arrive at actionable biological processes relevant for asthma pathophysiology.

AUTHOR CONTRIBUTIONS
B.S., R.W.H., and R.S. conceptualized and wrote the review.

CONFLICT OF INTEREST STATEMENT
All authors have read and approved the manuscript. There are no conflicts of interest.