The AlkB Family of Fe(II)/α-Ketoglutarate-dependent Dioxygenases: Repairing Nucleic Acid Alkylation Damage and Beyond

The AlkB family of Fe(II)- and α-ketoglutarate-dependent dioxygenases is a class of ubiquitous direct reversal DNA repair enzymes that remove alkyl adducts from nucleobases by oxidative dealkylation. The prototypical and homonymous family member is an Escherichia coli “adaptive response” protein that protects the bacterial genome against alkylation damage. AlkB has a wide variety of substrates, including monoalkyl and exocyclic bridged adducts. Nine mammalian AlkB homologs exist (ALKBH1–8, FTO), but only a subset functions as DNA/RNA repair enzymes. This minireview presents an overview of the AlkB proteins including recent data on homologs, structural features, substrate specificities, and experimental strategies for studying DNA repair by AlkB family proteins.

AlkB Is an Escherichia coli Fe(II)/αKG-dependent Dioxygenase That Reverses DNA Alkylation Damage
E. coli cells exposed to a low dose of a methylating agent such as methylnitronitrosoguanidine upregulate a transcriptional program that confers significant resistance to subsequent, higher levels of alkylation insult. This phenomenon, discovered in 1977 by Samson and Cairns, was termed the adaptive response (1). At its core, the adaptive response relies on the activity of four proteins, Ada, AlkA, AlkB, and AidB, whose overexpression affords the observed bacterial resistance to the deleterious effects of alkylating agents (1). Three of the four proteins (Ada, AlkA, and AlkB) are DNA repair proteins that combat the mutagenic and toxic effects of alkylated bases. Although the biological functions of Ada (a methyltransferase) and AlkA (a glycosylase) were established relatively quickly, the function of AlkB remained mysterious until 2001, when Aravind and Koonin, using sequence homology alignments, predicted that AlkB is an Fe(II)- and α-ketoglutarate (αKG)-dependent dioxygenase (2). Soon after, AlkB was established as a prototypical oxidative dealkylation DNA repair enzyme (3-5) that protects the bacterial genome against alkylation damage (6-10). AlkB uses molecular oxygen to oxidize the alkyl groups on alkylation-damaged nucleic acid bases, such as 1-methyladenine (m1A), 3-methylcytosine (m3C) (7,8), and 1,N6-ethenoadenine (εA) (9,11); the oxidized alkyl groups are subsequently released as aldehydes, regenerating the undamaged bases (Fig. 1). The decade of research that followed the 2002 discovery of the enzymatic properties of AlkB (7,8) greatly expanded our understanding of the biological functions of the AlkB dioxygenases. This minireview summarizes the most salient aspects of this research.

Bacterial Homologs of the E. coli AlkB
The E. coli AlkB protein is by far the best studied enzyme in the family. However, AlkB homologs exist in most bacteria and almost all eukaryotes. Speaking to the remarkable versatility and ubiquity of the AlkB proteins, even certain single-stranded plant-infecting RNA viruses (e.g. Flexiviridae (now classified as Alphaflexiviridae, Betaflexiviridae, and Gammaflexiviridae)) encode an AlkB homolog that can repair alkylated bases in DNA and, preferably, RNA (12).
The majority of aerobic bacterial species express AlkB proteins, which show a great diversity of substrate specificity (4,13,14). Given the AlkB requirement for molecular oxygen, obligate anaerobic bacteria (e.g. Clostridium, Bacteroides, Bifidobacterium) do not appear to have AlkB-type proteins (4,14). A careful bioinformatics analysis of the homologous sequences found four phylogenetic groups of bacterial AlkB proteins, denoted 1A, 1B, 2A, and 2B (14). Group 1A proteins, which include the E. coli AlkB, are characterized by robust oxidative dealkylation activities and broad substrate specificities. This group includes the Streptomyces AlkB proteins, which share 79% sequence identity with the E. coli AlkB and have been shown to repair both methylated and etheno lesions (14). Group 1B proteins, primarily found in the β and γ subdivisions of Proteobacteria and in cyanobacteria, also show wide substrate preferences. They are most closely related to the eukaryotic ALKBH2 and ALKBH3 homologs. Group 2A proteins, often found in the α Proteobacteria (e.g. Agrobacterium, Rickettsia, and Rhizobium), share strong homology with the ALKBH8 proteins from animals and plants, which have been implicated in tRNA posttranscriptional modification rather than true DNA or RNA repair. Finally, group 2B proteins, often found in soil bacteria (e.g. Actinobacteria) but also in Xanthomonas and Burkholderia, are most likely to show substrate specialization; some enzymes have high activity on monoalkyl lesions but no activity on bridged adducts (e.g. MT-2B from Mycobacterium tuberculosis), whereas others show exactly the opposite pattern (e.g. XC-2B from Xanthomonas campestris) (14). Such specialization presumably reflects an adaptation to a specific environmental stressor. Additionally, unlike E. coli, many bacterial genomes contain two or even three AlkB family proteins (14). So far, no AlkB homologs have been found in Archaea (14).

Eukaryotic and Mammalian AlkB Homologs
Most eukaryotic cells have several AlkB homologs, with the notable exception of Saccharomyces cerevisiae, which lacks this class of DNA repair enzymes. Mammalian cells have nine homologs (2,4,15); the first eight have been denoted ALKBH1-8, whereas the ninth is known as FTO (fat mass and obesity-associated) (16). Evolutionarily speaking, ALKBH5 and FTO are the newest AlkB proteins, being found only in vertebrates (17,18). The other seven AlkB homologs are conserved across all metazoans, including worms and fruit flies (13).
The remaining homologs have no known activity on DNA substrates; instead, they demethylate RNA or proteins. Both FTO (39-41), whose overexpression is closely linked to obesity and diabetes (16,42,43), and ALKBH5 (44-48) repair primarily N6-methyladenine (m6A) in RNA (Fig. 1). FTO also repairs 3-methylthymine (m3T) and 3-methyluracil (m3U) in ssDNA (49). ALKBH8 (50-54) is involved in the maturation of tRNA; it features both an S-adenosylmethionine-dependent methyltransferase domain that methylates 5-carboxymethyluridine (cm5U) to 5-methoxycarbonylmethyluridine (mcm5U), and a dioxygenase domain that hydroxylates mcm5U to generate (S)-5-methoxycarbonylhydroxymethyluridine (mchm5U), a common functional modification at the wobble position of tRNA (51,52). The main function of ALKBH4 seems to be actin demethylation (55,56). ALKBH7, which plays a role in alkylation-induced necrosis (57-59), is believed to act on protein substrates, but their identity is not currently known. Finally, the function and substrates of ALKBH6 remain to be established. Although they are also members of the Fe(II)/αKG dioxygenase superfamily, the mammalian TET (ten-eleven translocation) proteins, which play an important role in the epigenetic reprogramming of the cell (60), are only distantly related to the AlkB proteins, and they will be covered in other minireviews.

[...] significant insight into the molecular mechanism of the oxidative dealkylation catalyzed by AlkB dioxygenases. The active site of these enzymes is contained on a characteristic double-stranded β-helix domain, also known as a "jelly-roll" fold, consisting of eight β-strands arranged in pairs in a helical conformation to form the core that binds the Fe(II) and αKG cofactors using conserved residues (61,62) (Fig. 2A). The E. coli AlkB also contains a unique 90-residue N-terminal subdomain that interacts with the oligonucleotide substrate containing the damaged base and covers the active site by forming a "nucleotide recognition lid." Being conformationally flexible, this lid can accommodate substrates of variable sizes, which can explain the diverse substrate specificity of the enzyme (see below). However, recent data on bacterial AlkB homologs suggest that this domain is not the only factor governing the substrate specificity (67). The active site of AlkB contains a 3 Å-wide oxygen diffusion tunnel from the protein surface to the oxygen binding site (61). His-131, Asp-133, and His-187 of E. coli AlkB, together with molecular oxygen (or water under anaerobic conditions) and αKG bound in a bidentate fashion, form the octahedral primary coordination sphere around the non-heme iron (Fig. 2, B-D). The αKG cofactor is held in place by two salt bridge interactions between Arg-204 and Arg-210 and the carboxylates of αKG (61,63). The octahedral coordination and the global geometry are conserved in AlkB complexes in which Fe(II) is replaced with Co(II) or Mn(II), but such replacement results in closure of the oxygen diffusion tunnel, producing an oxygen-resistant phenotype (63).

The Mechanism of AlkB-catalyzed Dealkylation
Similar to other non-heme iron dioxygenases, the AlkB-catalyzed oxidative dealkylation reaction is initiated by the binding of molecular oxygen to Fe(II) in the active site by replacing the bound water molecule (68-75). Cleavage of molecular oxygen generates an Fe(IV)-oxo reactive moiety, whereas the αKG ligand is oxidatively decarboxylated to succinate. Once CO2 is released from the active site, the reactive Fe(IV)-oxo ligand migrates adjacent to the target alkyl group on the nucleobase substrate and hydroxylates the alkyl moiety (72-75). The resulting carbinoliminium is typically unstable and dissociates (in a spontaneous or catalyzed fashion) into an aldehyde product and the unmodified (repaired) base.
For simple alkyl substrates, AlkB oxidizes the carbon attached to a nucleobase nitrogen atom (i.e. ring nitrogen or exocyclic amine). The resulting carbinoliminium species (e.g. 3-hydroxymethylcytosine from m3C (Fig. 1), or 1-hydroxymethyladenine from m1A) have been experimentally observed in AlkB oxidation reactions performed in crystallo (64). Specifically, the AlkB-substrate complex, together with Fe(II) and αKG, is first crystallized under anaerobic conditions. The reaction is then initiated by exposing the crystals to air (64). In the case of etheno lesions (e.g. εA), the AlkB dealkylation was proposed to involve the oxidation of the etheno bridge to an epoxide, followed by hydrolysis of the epoxide to a glycol and subsequent release of the dialdehyde glyoxal (Fig. 1) (9,11). The glycol corresponding to εA has been experimentally observed in crystallo (64). The epoxide formation, however, has been recently challenged in a computational study, which suggested instead that the AlkB reaction with εA proceeds via a zwitterion intermediate (76). Further experimental work is needed to establish the identity of the reaction intermediate.
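The mass bookkeeping implied by the two pathways described above can be sketched in a few lines. This is an illustrative calculation rather than part of the cited studies: it assumes standard monoisotopic atomic masses, tracks only the change in nucleobase mass at each step, and covers the proposed epoxide route for etheno lesions (not the zwitterion alternative). All function and variable names are hypothetical.

```python
# Monoisotopic atomic masses in Da (standard values).
MASS = {"C": 12.000000, "H": 1.007825, "O": 15.994915}

def formula_mass(counts):
    """Monoisotopic mass of a fragment given as {element: count}."""
    return sum(MASS[el] * n for el, n in counts.items())

OXYGEN = formula_mass({"O": 1})
WATER = formula_mass({"H": 2, "O": 1})
FORMALDEHYDE = formula_mass({"C": 1, "H": 2, "O": 1})
GLYOXAL = formula_mass({"C": 2, "H": 2, "O": 2})

# Methyl lesion (e.g. m1A, m3C): hydroxylation, then release of formaldehyde.
methyl_steps = [
    ("hydroxymethyl intermediate", +OXYGEN),
    ("repaired base", -FORMALDEHYDE),
]

# Etheno lesion (e.g. epsilon-A), assuming the proposed epoxide route:
# epoxidation, hydrolysis to the glycol, then release of glyoxal.
etheno_steps = [
    ("epoxide", +OXYGEN),
    ("glycol", +WATER),
    ("repaired base", -GLYOXAL),
]

def cumulative_shifts(steps):
    """Return [(label, cumulative mass change relative to the lesion)]."""
    total, out = 0.0, []
    for label, delta in steps:
        total += delta
        out.append((label, round(total, 4)))
    return out

if __name__ == "__main__":
    for name, steps in (("methyl", methyl_steps), ("etheno", etheno_steps)):
        for label, shift in cumulative_shifts(steps):
            print(f"{name:7s} {label:28s} {shift:+.4f} Da")
```

Under these assumptions, the net change on full repair corresponds to the loss of CH2 (about 14.016 Da) for a methyl lesion and of C2 (24.000 Da) for an etheno lesion, whereas the oxidized intermediates are heavier than the starting lesion.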
The binding affinity of E. coli AlkB to damaged DNA is influenced by both the damaged base and the backbone phosphates of the substrate (61). Residues Thr-51, Tyr-76, and Arg-161, conserved across all eubacterial AlkB homologs, form extensive hydrogen-bonding interactions with the backbone phosphates (29,61,62). In the AlkB repair complex, the central alkylated base is splayed out of the DNA helix; the stacking interactions between the normal flanking bases stabilize the resulting backbone conformation. The flipped-out alkylated base is sandwiched between AlkB conserved residues Trp-69 and His-131, whereas the DNA backbone is distorted such that the bases flanking the flipped damaged base stack with each other (62).
Unlike E. coli AlkB, ALKBH2 promotes the flipping of the damaged base by using an aromatic finger residue, Phe-102, which intercalates into the duplex DNA and fills the gap left by the flipped base (62). Moreover, although AlkB only interacts with the lesion strand, ALKBH2 interacts with both DNA strands, which may explain the observed differences in substrate preference: ALKBH2 prefers dsDNA, whereas E. coli AlkB prefers ssDNA or RNA (61,62).

Substrate Specificity of AlkB Dioxygenases
In addition to the originally reported AlkB substrates, m1A and m3C, a wide variety of modified bases are substrates for the AlkB family proteins, including monoalkylated bases with alkyl moieties of various sizes (Table 1) and bases featuring exocyclic bridged adducts (Table 2). Additionally, AlkB proteins can process both single-stranded and double-stranded substrates, and in both DNA and RNA.
Interestingly, the ability of AlkB to demethylate exocyclic DNA adducts such as m4C and m6A suggests a potential additional biological function for AlkB, besides DNA repair. Often, the m4C and m6A modifications are not deleterious, but rather physiological post-replicative DNA markers that control strand discrimination, replication initiation, and even gene expression in certain bacteria (81,82). Furthermore, the demonstrated chemical competence of E. coli AlkB to oxidize a wide range of substrates has anticipated the putative substrate specificity of other AlkB homologs (e.g. m6A in RNA is now considered the canonical substrate for ALKBH5 and FTO), including functional roles beyond DNA repair.
The efficiency with which AlkB enzymes process different N-alkyl nucleobases varies considerably, depending on both the identity of the base and the position of the alkyl group on the base. For the E. coli AlkB, m1A is a better substrate than m6A (these damaged bases are regioisomers) (83); similarly, m3C is a better substrate than m4C (80). When comparing two different bases alkylated at the same position, m1A is a better AlkB substrate than m1G both in vivo and in vitro (77,79). Generally, alkylated adenines and cytosines are repaired more efficiently than alkylated guanines and thymines; alkyl groups on the ring nitrogens are removed more efficiently than alkyl groups on the exocyclic amines.

TABLE 1 Chemical structures of the monoalkyl substrates processed by AlkB dioxygenases
The repair target within each base is highlighted in red. The lesion phenotype in AlkB-deficient E. coli cells is provided in terms of % relative bypass (RB) and % mutagenesis, as determined by the CRAB and REAP assays. Bypass efficiencies are reported as a percentage relative to the unmodified DNA base at the lesion site. a Very strong replication block (RB < 10%), strong replication block (RB 10-50%), mild replication block (RB 50-90%), not a replication block (RB > 90%). b Not mutagenic (< 2%), slightly mutagenic (2-10%), mutagenic (10-50%), very mutagenic (> 50%). c AlkB notation refers to the E. coli protein.

The preference of the AlkB enzymes toward m1A and m3C may reflect the fact that these lesions are positively charged under physiological pH, which is believed to help with the AlkB recognition and binding to the damaged bases, and may also allow a faster release of formaldehyde from the carbinoliminium intermediates (84). Additionally, a stabilizing hydrogen-bonding interaction exists between the AlkB invariant residue Asp-135 and the 6-amino or 4-amino group of adenine or cytosine, respectively (61,63,84). For neutral lesions, such as m1G and m3T that contain hydrogen bond acceptors at the equivalent position, the interaction is weaker, being mediated through water molecules; this may account for the less efficient repair of m1G and m3T relative to m1A and m3C.
One notable feature of the E. coli AlkB is its ability to repair equally well the top two substrates, m1A and m3C, despite the large size difference between them. Biochemical and structural studies have shown that AlkB achieves its diverse substrate specificity by tailoring its kcat and Km values for various substrates (63). For example, AlkB has a significantly higher kcat and Km for m3C when compared with m1A. Thus, AlkB maintains a similar net catalytic activity (kcat/Km) by increasing the turnover rate of the substrate with lower affinity. Co-crystals of AlkB with both substrates suggest a correlation between the kcat and Km compensation and the atomic packing density in the active site (i.e. the extent to which the substrate fills the volume of the active site). Smaller substrates such as m3C have a lower atomic packing density, which seems to promote a faster rate (kcat) at the expense of a weaker binding (higher Km) (63). The mechanistic basis for this relationship is not fully understood, but it is suspected that the stereochemical properties of the substrate directly influence (via quantum mechanical effects) the rate with which the electrons or atoms rearrange during the course of the reaction (63).
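The compensation described above follows directly from the definition of catalytic efficiency. The sketch below uses made-up kinetic constants, chosen only to illustrate the arithmetic; they are not the measured values from ref. 63, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Kinetics:
    kcat: float  # turnover number, min^-1 (hypothetical value)
    km: float    # Michaelis constant, uM (hypothetical value)

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency kcat/Km, in min^-1 uM^-1."""
        return self.kcat / self.km

    def rate(self, substrate_um: float) -> float:
        """Michaelis-Menten rate per enzyme: v/[E] = kcat*[S]/(Km + [S])."""
        return self.kcat * substrate_um / (self.km + substrate_um)

# Placeholder numbers only: the second substrate binds 3-fold more weakly
# (higher Km) but turns over 3-fold faster (higher kcat).
tight_slow = Kinetics(kcat=2.0, km=1.0)
weak_fast = Kinetics(kcat=6.0, km=3.0)

if __name__ == "__main__":
    for label, k in (("tight/slow", tight_slow), ("weak/fast", weak_fast)):
        print(f"{label:10s} kcat/Km = {k.efficiency:.2f}  "
              f"rate at 0.01 uM = {k.rate(0.01):.4f}  "
              f"rate at 1000 uM = {k.rate(1000.0):.3f}")
```

With equal kcat/Km, the two hypothetical substrates are processed at nearly the same rate when the substrate concentration is far below Km, which is the regime in which kcat/Km governs turnover; the rates diverge only as the enzyme approaches saturation.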

Monoalkyl RNA Lesions
E. coli AlkB and human ALKBH1 and ALKBH3 can oxidize and remove the methyl group of m1A, m3C, and m1G in RNA substrates (15,30,85). The bacterial AlkB protein XC-1B can also repair RNA substrates (14). The ability of these proteins to work on RNA correlates with their preference to repair ssDNA over dsDNA. Both AlkB and ALKBH3 can function as RNA repair enzymes in the cell (86). In fact, the primary function of ALKBH3 is speculated to be RNA repair, argued by its equal activity on RNA and ssDNA and its diffuse cellular localization (cytosol and nucleus) (30,85).
Other AlkB homologs, such as ALKBH5, FTO, and ALKBH8, have been shown to work exclusively on RNA substrates. Both ALKBH5 and FTO remove the methyl group from m6A in RNA (Fig. 1) (41,47). However, these oxidation/demethylation reactions likely constitute an additional layer of mRNA regulation (ALKBH5 and FTO) or post-transcriptional tRNA modification (ALKBH8), and thus not true repair mechanisms for alkylation-damaged RNA.

TABLE 2 Chemical structures of the exocyclic bridged substrates of AlkB dioxygenases
The repair target within each base is highlighted in red. The lesion phenotype in AlkB-deficient E. coli cells is provided in terms of % relative bypass (RB) and % mutagenesis, as determined by the CRAB and REAP assays. Bypass efficiencies are reported as a percentage relative to the unmodified DNA base at the lesion site. All lesions are in DNA.
Other exocyclic bridged lesions can also be processed by AlkB dioxygenases (Table 2), but in most cases, the substrate is processed initially as a monoalkyl lesion. After the initial oxidation, the exocyclic bridge can open, forming another monoalkyl lesion that can either fall apart spontaneously or be further oxidized by AlkB. For example, 1,N6-ethanoadenine (EA), the saturated analog of εA, is completely repaired by E. coli AlkB, with the reaction involving two successive oxidation steps at the carbons on the N1 and N6 positions of adenine (83,88). Similarly, 3,N4-α-hydroxyethanocytosine (the hydrated version of εC) and the three-carbon bridge analog 3,N4-α-hydroxypropanocytosine are also good AlkB substrates, but require only one oxidation at the carbon attached to the N3 of cytosine (10,84). Additionally, three propano-exocyclic lesions of guanine (α-hydroxypropanoguanine (αHOPG), γ-hydroxypropanoguanine (γHOPG), and malondialdehydeguanine (M1G)) are also processed by E. coli AlkB in vitro, but the oxidation steps are thought to occur primarily on the open ring forms of the lesions (89).

In Vitro Strategies
Most in vitro strategies require expressing and purifying the AlkB protein of interest, which is then incubated directly with single- or double-stranded substrates, typically DNA or RNA oligonucleotides containing chemically defined modified bases (lesions). Phosphoramidites for many common modified bases are commercially available, which allows for a straightforward preparation of AlkB substrates by using a DNA/RNA synthesizer. Careful purification and characterization (by LC-MS) of such oligonucleotide substrates are essential because the modified bases are often unstable.
Two general methods are commonly used to analyze the outcome of AlkB reactions. The most sensitive method relies on high resolution MS to identify the reaction products and intermediates by their specific masses. Because this method may detect relatively stable reaction intermediates, it often provides insight into the mechanism of more complex AlkB reactions (83,89). The main caveats of this method are the low throughput (each reaction needs a dedicated HPLC-MS run) and the cost (MS equipment is expensive). The second method can be utilized when the lesion under analysis is either a block for a methylation-sensitive restriction enzyme (e.g. DpnII) or a good substrate for a glycosylase. After the AlkB reaction, the glycosylase would generate an abasic site if the lesion is still present, but would leave the DNA intact if the canonical base has been restored by direct reversal (67). Following chemical or enzymatic cleavage at the abasic site, a simple PAGE experiment would distinguish between the repaired (uncut) and unrepaired (cut) oligonucleotides (67). When using a modification-sensitive restriction enzyme, which only cuts a canonical sequence, the digested product (cut) would signify repair, whereas the undigested product (uncut) is unrepaired (14). The key advantage of these approaches is speed; the efficiency of repair of one or more lesions with an entire panel of purified AlkB enzymes can be analyzed in one run, in parallel (14, 67). The caveat of the method is the requirement of an efficient and specific glycosylase for every lesion studied, or a suitable restriction endonuclease that is inhibited by the studied lesion. Additionally, this method detects only the fully repaired canonical base product of the AlkB reaction, and thus, provides no information regarding the reaction intermediates.
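Because the two gel-based readouts have opposite meanings (a cut oligonucleotide signals an unrepaired lesion in the glycosylase assay but a repaired one in the restriction enzyme assay), it can help to make the interpretation explicit. The following is a minimal sketch of that logic; the names and the use of band intensities from gel quantification are illustrative assumptions, not a published analysis pipeline.

```python
from enum import Enum

class Assay(Enum):
    GLYCOSYLASE = "glycosylase"         # strand is cleaved where the lesion persists
    RESTRICTION = "restriction enzyme"  # cuts only the restored canonical site

def is_repaired(assay: Assay, oligo_was_cut: bool) -> bool:
    """Interpret a single cut/uncut call for the given assay type."""
    if assay is Assay.GLYCOSYLASE:
        return not oligo_was_cut   # uncut means the lesion was removed by AlkB
    return oligo_was_cut           # cut means the canonical sequence was restored

def fraction_repaired(assay: Assay, cut_intensity: float,
                      uncut_intensity: float) -> float:
    """Estimate the repaired fraction from (hypothetical) PAGE band intensities."""
    total = cut_intensity + uncut_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    cut_fraction = cut_intensity / total
    if assay is Assay.RESTRICTION:
        return cut_fraction
    return 1.0 - cut_fraction

if __name__ == "__main__":
    # The same gel pattern (25% cut) reads very differently in the two assays.
    for assay in Assay:
        print(assay.value, fraction_repaired(assay, 25.0, 75.0))
```

This estimate presumes that the glycosylase or restriction enzyme digestion itself goes to completion; incomplete digestion would bias the apparent repair in opposite directions for the two assays.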

In Vivo Strategies
Genetic strategies have been used to establish the importance of AlkB in protecting cells against alkylation damage, long before the Fe(II)/αKG-dependent mechanism was known (6). Later, such genetic approaches were combined with biochemical tools to evaluate the repair efficiency of AlkB on many chemically defined DNA lesions. Specifically, an M13 single-stranded viral vector is engineered to contain site-specifically a modified base and allowed to replicate in isogenic AlkB+ and AlkB− cell lines. Two metrics are calculated by analyzing the resulting viral progeny: 1) lesion bypass, the ability of the lesion to block viral replication, when compared with a normal base, evaluated with the CRAB assay (90); and 2) mutagenesis, the ability of the lesion to generate mutations at that site, evaluated with the REAP assay (77,90). Any significant positive change in these metrics between the AlkB− and AlkB+ strains (i.e. improvement in bypass, or decrease in mutagenesis) indicates that AlkB contributes to the repair of the studied lesion. When compared with the in vitro strategies, this method allows evaluation of the AlkB repair efficiency in a cellular context, where both the enzyme and the putative substrates are present at physiologically relevant concentrations. As a caveat, this approach will not work if the lesion studied does not produce a phenotype measurable by the two metrics above. For example, m2G is neither a block to replication nor mutagenic; therefore the ability of AlkB to repair this lesion in vivo cannot be discerned with this method (91). So far, the in vivo genetics strategy has been successfully used in E. coli cells to establish that m1A, m3C, e3C, m1G, m3T (77), εA, εC (9), EA (88), and 1,N2-εG (87) are substrates for AlkB repair in vivo, while at the same time establishing that N2,3-εG is not an AlkB substrate (87).
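The two metrics can be turned into the qualitative categories used in the legend of the monoalkyl substrate table (replication block strength and mutagenicity). The sketch below encodes those thresholds; how values falling exactly on a boundary are assigned is an assumption made here, the function names are hypothetical, and the example numbers are invented for illustration.

```python
def classify_bypass(rb_percent: float) -> str:
    """Map % relative bypass (RB) to the replication-block categories."""
    if rb_percent < 10:
        return "very strong replication block"
    if rb_percent < 50:
        return "strong replication block"
    if rb_percent <= 90:
        return "mild replication block"
    return "not a replication block"

def classify_mutagenesis(mut_percent: float) -> str:
    """Map % mutagenesis to the mutagenicity categories."""
    if mut_percent < 2:
        return "not mutagenic"
    if mut_percent < 10:
        return "slightly mutagenic"
    if mut_percent <= 50:
        return "mutagenic"
    return "very mutagenic"

def alkb_effect(alkb_minus: dict, alkb_plus: dict) -> dict:
    """Change in each metric between AlkB- and AlkB+ cells.

    Each argument is {"rb": % relative bypass, "mut": % mutagenesis}.
    Positive values mean AlkB improved bypass or lowered mutagenesis.
    """
    return {
        "bypass_gain": alkb_plus["rb"] - alkb_minus["rb"],
        "mutagenesis_drop": alkb_minus["mut"] - alkb_plus["mut"],
    }

if __name__ == "__main__":
    # Invented numbers for a hypothetical lesion, for illustration only.
    minus = {"rb": 8.0, "mut": 30.0}
    plus = {"rb": 95.0, "mut": 1.0}
    print(classify_bypass(minus["rb"]), "->", classify_bypass(plus["rb"]))
    print(classify_mutagenesis(minus["mut"]), "->", classify_mutagenesis(plus["mut"]))
    print(alkb_effect(minus, plus))
```

Whether a given change counts as significant still depends on replicate variability, which this sketch does not model. A lesion such as m2G, which would be classified as "not a replication block" and "not mutagenic" even in AlkB-deficient cells, leaves no room for improvement in either metric, which is why the method is uninformative for it.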

Future Directions and Perspective
As the body of knowledge regarding the specific cellular functions of the AlkB family dioxygenases expands, and the complete list of substrates for each enzyme, particularly for the human homologs, becomes known, the field will be poised to explore in more depth the regulation of AlkB proteins. From the point of view of DNA repair, two directions merit attention. First, as gatekeepers of genomic integrity, certain AlkB homologs may function as tumor suppressors (92). When their activity is impaired, excessive DNA alkylation damage may accumulate, which can lead to mutations and malignant transformation or cell death (93,94). Understanding the biochemical mechanisms by which AlkB enzymes are rendered inoperative may help connect environmental or endogenous factors to mutagenesis and cancer. Molecules that compete for binding with αKG (e.g. the oncometabolite 2-hydroxyglutarate) or metal ions that compete with the required Fe(II) (e.g. Ni(II)) have already been shown to inhibit the activity of certain Fe(II)/αKG-dependent dioxygenases (95,96). However, the relevance of these mechanisms of inhibition to the AlkB family of enzymes has not been fully evaluated.
Second, from an opposite perspective, AlkB family proteins may also be key factors that help tumor cells withstand chemotherapy and promote tumor cell growth (24,26,27). Here, the development of potent and specific inhibitors of Fe(II)/αKG-dependent dioxygenases becomes an important challenge (65,97). By using the structural and mechanistic information available, the development of anti-cancer agents or adjuvants that target AlkB homologs is certainly within reach.
The last decade of research on the AlkB family dioxygenases has been filled with unexpected findings that have propelled the field in leaps and bounds. One can only wonder about the still-to-be-discovered surprises this family of enzymes has to offer for the decades to come.