Models of Aire-Dependent Gene Regulation for Thymic Negative Selection

Mutations in the autoimmune regulator (AIRE) gene lead to autoimmune polyendocrinopathy syndrome type 1 (APS1), characterized by the development of multi-organ autoimmune damage. The mechanism by which defects in AIRE result in autoimmunity has been the subject of intense scrutiny. At the cellular level, the working model explains most of the clinical and immunological characteristics of APS1, with AIRE driving the expression of tissue-restricted antigens (TRAs) in the epithelial cells of the thymic medulla. This TRA expression results in effective negative selection of TRA-reactive thymocytes, preventing autoimmune disease. At the molecular level, the mechanism by which AIRE initiates TRA expression in the thymic medulla remains unclear. Multiple different models for the molecular mechanism have been proposed, ranging from classical transcriptional activity, to random induction of gene expression, to epigenetic tag recognition effect, to altered cell biology. In this review, we evaluate each of these models and discuss their relative strengths and weaknesses.


INTRODUCTION
The autoimmune regulator, AIRE (OMIM #607358), has been the focus of intense research since mutations in the gene were identified in 1997 as the cause of autoimmune polyendocrinopathy syndrome type 1 (APS1, OMIM #240300, also known as autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy syndrome; Finnish-German APECED Consortium, 1997; Nagamine et al., 1997). The syndrome is a rare Mendelian disease, found more commonly in isolated populations, with prevalence rates of 1/25000 among Finns (Ahonen, 1985), 1/5600 to 1/9000 among Iranian Jews (Zlotogora and Shapiro, 1992), and 1/14400 among Sardinians (Rosatelli et al., 1998). The disease is clinically classified by the presence of any two out of three primary disorders: hypoparathyroidism, primary adrenocortical failure, and chronic mucocutaneous candidiasis, with additional endocrine failure common (Betterle et al., 1998). Most of these clinical conditions result from progressive autoimmune destruction, with lymphocytic infiltrate and autoantibody production (Betterle and Zanchetta, 2003). The enhanced susceptibility to candidiasis also appears to be autoimmune in nature, with autoantibodies against Th17-associated cytokines neutralizing anti-Candida immunity (Kisand et al., 2010; Puel et al., 2010).

THE FUNCTIONAL ROLE OF AIRE IN IMMUNOLOGICAL TOLERANCE
Key insights into the mechanism by which mutations in AIRE affect tolerance have come through the development of Aire-deficient mouse models. Despite relatively mild disease on the C57BL/6 background, each knockout strain develops lymphocytic infiltrate and autoantibodies (Anderson et al., 2002; Kuroda et al., 2005; Su et al., 2008), with more severe clinical presentation on alternative genetic backgrounds (Jiang et al., 2005; Kuroda et al., 2005). A role for Aire in central tolerance was first suggested through experiments demonstrating that expression of Aire is largely restricted to rare cells in the thymic medulla (Bjorses et al., 1999; Heino et al., 1999; Ramsey et al., 2002). Thymic transplant experiments demonstrated that the key nonredundant function of Aire lies within the thymic stroma, and that medullary thymic epithelial cells show reduced transcription of tissue-restricted antigens (TRAs) in Aire knockout mice (Anderson et al., 2002). While TRA expression in the thymic medulla had been previously documented (Jolicoeur et al., 1994), the finding of a specific molecular mediator driving this expression suggested an endogenous tolerogenic function. The function of Aire-dependent TRA expression in maintaining immunological tolerance was first demonstrated using a neo-self transgenic system, in which TRA expression in the thymus was linked to the efficacy of negative selection of autoreactive thymocytes and consequently to the development of autoimmune disease in the periphery (Liston et al., 2003, 2004). Additional experiments have demonstrated that thymic TRA expression is likewise able to drive regulatory T cell conversion, with the alternative fates (negative selection or regulatory T cell conversion) likely depending on antigen quantity and TCR affinity (Aschenbrenner et al., 2007; Hinterberger et al., 2010; Wirnsberger et al., 2011).
Several alternative mechanisms of central tolerance were proposed for Aire (Kuroda et al., 2005); however, the TRA-transcription function appears to be the most robust, and has subsequently been extended to multiple endogenous self-antigens, including insulin 2, salivary protein 1, desmoglein 3, seminal vesicle secretory protein 2 (SVS2), and interphotoreceptor retinoid-binding protein (IRBP; DeVoss et al., 2006; Yano et al., 2008; Hou et al., 2009; Wada et al., 2011).
A potential role for Aire in peripheral tolerance remains controversial. Several groups have found extra-thymic Aire expression in multiple peripheral locations (Halonen et al., 2001; Kogawa et al., 2002; Gardner et al., 2008), while other studies have found Aire expression to be restricted to thymic epithelial cells. Initial experiments suggested that the function of Aire was restricted to the thymic epithelium, as loss of Aire in the thymus was sufficient to induce disease (Anderson et al., 2002) and expression of selected TRAs in the lymph nodes was not Aire-dependent (Kont et al., 2008). However, more recent experiments have suggested that Aire is expressed in the periphery and does have a functional role. Aire-expressing stromal cells from the lymph nodes have been demonstrated to express TRAs in a tolerogenic form, and this process has been demonstrated to be Aire-dependent for at least a subset of TRAs. Independent experiments have also confirmed that the lymph node stroma mediates TRA immune tolerance (Lee et al., 2007; Nichols et al., 2007; Magnusson et al., 2008; Fletcher et al., 2010), including both Aire-dependent and Aire-independent mechanisms (Cohen et al., 2010). Overall, it is likely that Aire has some function in peripheral tolerance for a subset of TRAs, but that failures in this function are not the primary cause of autoimmunity in APS1.

MECHANISMS OF AIRE-DEPENDENT TOLERANCE
Despite the crucial function of Aire, the molecular mechanism by which it drives TRA expression in medullary thymic epithelium and lymph node stromal cells remains opaque. The key difficulty is the scale of Aire-dependent gene expression, with dependent TRAs numbering on the order of several hundred to thousands of genes (Anderson et al., 2002; Derbinski et al., 2005). In this review, we address the various models that have been proposed so far to explain how Aire regulates its target genes. The models fall into two broad categories: molecular biology models and cell biology models. The molecular biology models use the molecular structure of Aire to explain the mechanism of its target gene activation, and include the classical transcription factor model, the random transcriptional activator model, and the epigenetic tag recognition model. In each of these models, Aire is proposed to have direct transcriptional activation properties, driving the expression of target genes. The cell biology models provide an alternative approach to the mechanism of Aire target gene regulation and encompass two contrasting models, the "developmental retardation" and "dysregulated death" models, both of which propose that Aire influences the differentiation of medullary thymic epithelial cells to enhance TRA expression. With new studies revealing insights into both the molecular and cellular impacts of Aire expression, this review evaluates the relative strengths and weaknesses of each model.

Aire as a classical transcription factor
The first model for Aire transcriptional activity describes Aire as a classical transcription factor, able to bind the promoter sequence of target genes (Figure 1A). Initial protein domain analysis of the human Aire protein identified several functional domains indicative of a transcriptional regulator (Nagamine et al., 1997; Kumar et al., 2001). Most importantly, Aire contains DNA binding domains, which have been mapped to two zinc-finger-binding plant homeodomains (PHD1 and PHD2) and a SAND ("human Sp100, Aire1, NucP41/P75, and Drosophila DEAF1") domain (Gibson et al., 1998; Bottomley et al., 2001; Kumar et al., 2001). This model is experimentally supported by the ability of Aire to bind DNA via these domains in a sequence-specific manner (Kumar et al., 2001). Gel-shift analysis revealed that the SAND domain binds the motif TTATTA in the presence of guanine residues, while the zinc-finger-containing PHD domains were found to bind the ATTGGTTA sequence (Kumar et al., 2001; Purohit et al., 2005). Furthermore, when tethered to DNA, Aire has the capacity to activate transcription (Pitkanen et al., 2000). Also in favor of this model is the finding that Aire is capable of recruiting molecular partners involved in chromatin binding, transcription, and mRNA elongation and processing, such as TOP2a, DNA-PK, Ku80, PARP-1, and H2A (Abramson et al., 2010). Importantly, the knockdown of these molecular partners reduces the expression of Aire-dependent genes, indicating a functional role in Aire-mediated transcription (Abramson et al., 2010). The most important support for this model comes from the finding that transgenes using target gene promoters, such as the insulin promoter-driven hen egg lysozyme transgene, also show Aire-dependence (Liston et al., 2004), despite being embedded in a different chromatin context and using only the core promoter sequence.
Likewise, the insulin promoter-driven ovalbumin transgene, while initially characterized as Aire-independent, has now been demonstrated to have partial Aire-dependency despite its foreign chromatin context (Hubert et al., 2011).
Despite the attractive simplicity of the classical transcription factor model, there are several notable disadvantages. Firstly, the use of KNKA as the binding motif in the SAND domain, instead of the KDWK or KNW(K/R) found in other SAND-containing proteins, suggests atypical DNA binding activity (Purohit et al., 2005; Peterson et al., 2008). Secondly, a consensus sequence for DNA binding has not been found among the numerous target genes. Thirdly, the sheer number of target genes (∼200-1000), with highly diverse peripheral expression patterns, makes a direct binding function unlikely. Finally, the expression of different Aire-dependent TRA targets in thymic epithelial cells, peripheral lymph node stromal cells (Lee et al., 2007; Gardner et al., 2008), and modified monocytes (Sillanpaa et al., 2004) is not consistent with a fixed promoter recognition site. Nevertheless, the classical transcription factor binding model remains viable, as various modifications could explain these discrepancies, such as the expression of target genes being modulated by the chromatin context in an Aire-independent manner, or the existence of binding partners to Aire that modify target recognition (Abramson et al., 2010). One modification of the model that could explain many of the discrepancies is one in which Aire is responsible for initiating the transcription of a small subset of master transcription factors, which then in turn initiate the expression of larger pools of downstream TRAs (Figure 1B). This would enable Aire to drive the expression of large numbers of genes without any apparent conservation within the promoter, and is supported by the finding that diverse tissue-specific master transcription factors, such as Pdx1, are expressed in thymic epithelial cells in an Aire-dependent manner (Gillard et al., 2007).

Aire as a random transcriptional activator
An alternative to the classical transcription factor model is the random transcriptional activator model. Early analysis of the cohort of genes expressed as TRAs suggested that genes were being turned on at low levels in a random manner (Derbinski et al., 2001), similar to the way "leaky" expression of genes is observed in the testes (Derbinski et al., 2001) and stem cells (Miyamoto et al., 2002; Zipori, 2004). In this model, Aire would contribute to the loosening of chromatin structure or general accessibility of genomic DNA in a way that allows non-specific gene expression, or permits otherwise illegitimate expression to be productive (Abramson et al., 2010). The expression of Aire in the testes (Derbinski et al., 2001) and stem cells (Nishikawa et al., 2010), and the observation of infertility in Aire knockout mice (Anderson et al., 2002; Ramsey et al., 2002; Hubert et al., 2009), generated the plausible model that Aire had a DNA accessibility function primarily for proliferative processes, which was co-opted for immunological tolerance in the thymic epithelium. However, the infertility observed in Aire knockout mice reflects autoimmunity against reproductive organs, as it is absent in Aire knockout mice crossed to the Rag knockout background (unpublished observation), and TRA expression in the testes is Aire-independent, unlike that in thymic epithelial cells (Liston et al., 2004). A primary function for Aire in the cell division process has therefore been discredited. A modified form of the random transcriptional activator hypothesis, however, is still viable, given the observation that Aire-dependent TRAs tend to be found in chromosomal clusters (Derbinski et al., 2005). It is therefore feasible that Aire recognizes sequence-specific DNA regions (as in model 1) or chromatin tags (as in model 3, below) in order to open up transcription of a small genomic region (Figure 2).
Without genome-wide knowledge of chromatin boundaries and insulators it is difficult to draw firm conclusions on the existence of these "regional" effects.

FIGURE 2 | Random gene expression model. In this model Aire contributes to random activation of genes by loosening up the chromatin structure to increase the general accessibility of TRA genes. This would in turn allow the recruitment of transcriptional activators to bind genes that would otherwise be physically restricted.

Aire as an epigenetic tag recognition factor
A third model for the activity of Aire is based on recent biochemical analysis of Aire, indicating a capacity to bind modified histones (Koh et al., 2008; Chignola et al., 2009). Unlike the classical transcription factor model, this epigenetic tag model does not require sequence-specific recognition of target genes, but instead requires a common histone modification. In this scenario Aire would bind the modified histones and activate the transcription of nearby genes (Figure 3), probably through the recruitment of a complex of transcriptional activators and mRNA elongation/processing factors (Abramson et al., 2010). In support of this model, studies have shown that Aire is able to bind to unmethylated K4 on histone 3 (Koh et al., 2008; Org et al., 2008). Furthermore, Aire has been shown to interact with the transcription co-factors CBP (CREB binding protein; Pitkanen et al., 2000), PIAS1 (protein inhibitor of activated STAT1; Ilmarinen et al., 2008), and P-TEFb (positive transcription elongation factor b; Oven et al., 2007), indicating an ability to drive transcription after tethering to the locus (Bjorses et al., 2000; Pitkanen et al., 2000, 2005; Oven et al., 2007; Ilmarinen et al., 2008).
Also in support of this model is the finding that binding to unmethylated K4 on histone 3 is dependent on the PHD1 domain (Chakravarty et al., 2009; Chignola et al., 2009; Koh et al., 2010), which is evolutionarily conserved among gnathostomes (Saltis et al., 2008) and harbors a high density of the human mutations detected in APS1 patients (Bjorses et al., 1998; Chakravarty et al., 2009). While the importance of the PHD1 domain in the function of Aire has been demonstrated (Koh et al., 2008; Org et al., 2008), it may not reflect Aire's targeting mechanism, as the PHD1 domain has also been implicated in direct DNA binding (Kumar et al., 2001; Purohit et al., 2005). An attractive feature of this model is a putative explanation for the large number of diverse target genes, as many genes throughout the genome would exhibit similar histone modification. However, the range of Aire-dependent genes is actually far smaller than would be predicted from a common histone tag, and the over-expression of an H3K4 demethylase, able to extend the Aire-binding signature, does not extend the range of Aire target genes (Koh et al., 2010). The key weakness of this model is explaining how the specificity of target gene expression is achieved, for example how the core insulin promoter can be recognized when placed in a different chromatin context in the form of a transgene (Liston et al., 2004; Hubert et al., 2011). Another weakness is the inability of this model to explain why only small numbers of genes are activated in any given Aire-expressing epithelial cell (Gillard et al., 2007; Villasenor et al., 2008), unless Aire activity itself is limiting. Overall, the biochemical data that Aire binds epigenetic tags are convincing, but it is yet to be determined whether this is the function that primarily drives TRA expression, or whether it is a mechanism used by Aire to increase the stability of its interaction with a promoter that it binds through DNA sequence recognition (Ruthenburg et al., 2007).

"Developmental retardation" model
In contrast to the direct activity models described above, several innovative alternatives have been proposed that postulate an effect of Aire on the cell biology of medullary thymic epithelial cells. The first model is the "developmental retardation" model, where it was proposed that Aire expression keeps thymic epithelial cells in an immature state (Gillard et al., 2007). This model was supported by data reporting Aire-dependent expression of Nanog, Oct4, and Sox2, which are candidates for expression by epithelial progenitor cells in the thymus and are critical for maintaining their multipotentiality (Gillard et al., 2007). This model was also supported by evidence demonstrating that Aire-deficient mice present with a reduced and altered medullary compartment (Gillard et al., 2007) with fewer terminally differentiated epithelial cells in the absence of Aire (Yano et al., 2008). Furthermore, Aire has been found to be actively expressed in early embryogenesis (Nishikawa et al., 2010). Under this model, the expression of Aire in the thymic epithelium would prevent full differentiation into mature epithelial cells, thereby allowing Aire-expressing cells to differentiate into alternative epithelial fates, a process which would initiate the expression of TRAs (Farr et al., 2002).
Ultimately, the "developmental retardation" model for Aire function is incompatible with more recent findings that Aire-expressing thymic epithelial cells are concentrated within the mature CD80 hi subpopulation and have a low level of proliferation and a high level of apoptosis (Gabler et al., 2007; Gray et al., 2007; Irla et al., 2008; White et al., 2010). A study using 5-bromo-2′-deoxyuridine (BrdU) incorporation instead revealed that Aire-expressing epithelial cells are terminally differentiated and highly apoptotic (Gray et al., 2007). However, this new evidence does not negate the possibility of a cell biology model for Aire-dependent expression of TRAs.

"Dysregulated death" model
An alternative model, that of "dysregulated death," proposes that Aire expression dysregulates normal cell biology of the thymic epithelial cell, resulting in the partial differentiation toward one or more alternative epithelial fates in a terminal process (Figure 4; Gillard and Farr, 2006; Dooley et al., 2008). The terminal nature of Aire expression may even aid tolerance by facilitating the cross-presentation of TRAs to dendritic cells (Gray et al., 2007). Key support for the "dysregulated death" model comes from numerous studies which demonstrate changes to thymic epithelial cell biology in the absence of Aire. Beyond the well-documented decrease in TRA expression, Aire-deficiency can result in morphological changes, altered antigen presentation capacities, and enhanced apoptosis (Gray et al., 2007; Dooley et al., 2008; Yano et al., 2008; Hubert et al., 2011). This model has the capacity to explain many of the unique characteristics of Aire-dependent TRA expression, such as the broad repertoire of genes that are activated and the retained ability to activate transgenes utilizing these promoters, as conserved transcriptional pathways would be utilized. The observation that diverse tissue-specific master transcription factors, such as Pdx1, are Aire-dependent supports this hypothesis (Gillard et al., 2007). However, this latter property would predict that TRAs from the same target organ should cluster in the same Aire-expressing cells, a phenomenon that is not observed in practice (Gillard and Farr, 2006; Derbinski et al., 2008; Venanzi et al., 2008; Villasenor et al., 2008). The key disadvantage of cell biology-based models is the lack of a known molecular mechanism, as the described molecular properties of Aire are more consistent with a role in transcription than in cell physiology.
A putative explanation would involve Aire inducing "dysregulated death" via transcriptional activity; however, again, no molecular target is available to explain such behavior.

CONCLUDING REMARKS
The failure of thymic deletion in Aire-deficient mice has revealed the significant role of Aire in promoting clonal deletion of autoreactive organ-specific T cells. The main unresolved issue is the mechanism of transcriptional activation of Aire-dependent TRAs. Despite large advances that have come from the study of Aire-dependent TRA targets, the molecular biology of Aire and the cellular biology of Aire-expressing cells, there is as yet no single coherent model for Aire activity which adequately accounts for all the observed phenomena (refer to Table 1).
The provision of new data is required to definitively support a consensus model, whether it be one of the four main models outlined here, a hybrid model or a novel model yet to be proposed.