Emerging roles for thiol dioxygenases as oxygen sensors

Cysteine dioxygenases, 3-mercaptopropionate dioxygenases and mercaptosuccinate dioxygenases are all thiol dioxygenases (TDOs) that catalyse oxidation of thiol molecules to sulphinates. They are Fe(II)-dependent dioxygenases with a cupin fold that supports a 3xHis metal-coordinating triad at the active site. They also have other, broadly common features including arginine residues involved in substrate carboxylate binding and a conserved trio of residues at the active site featuring a tyrosine important in substrate binding and catalysis. Recently, N-terminal cysteinyl dioxygenase enzymes (NCOs) have been identified in plants (plant cysteine oxidases, PCOs), while human 2-aminoethanethiol dioxygenase (ADO) has been shown to act as both an NCO and a small molecule TDO. Although the cupin fold and 3xHis Fe(II)-binding triad seen in the small molecule TDOs are conserved in NCOs, other active site features and aspects of the overall protein architecture are quite different. Furthermore, the PCOs and ADO appear to act as biological O2 sensors, as shown by kinetic analyses and hypoxic regulation of the stability of their biological targets (N-terminal cysteine oxidation triggers protein degradation via the N-degron pathway). Here, we discuss the emergence of these two subclasses of TDO including structural features that could dictate their ability to bind small molecule or polypeptide substrates. These structural features may also underpin the O2-sensing capability of the NCOs. Understanding how these enzymes interact with their substrates, including O2, could reveal strategies to manipulate their activity, relevant to hypoxic disease states and plant adaptive responses to flooding.


Introduction
Thiol dioxygenases (TDOs) have been known for many decades to have important roles in cysteine catabolism, taurine metabolism and the turnover of other small molecule thiols. These enzymes are Fe(II)-dependent oxygenases from the cupin superfamily of proteins, and a large body of work has been dedicated to understanding the structural and mechanistic features that underpin their catalytic properties [1]. Until 2014, there was no reason to link their dioxygenase function with a role in O2 sensing. However, the identification of plant cysteine oxidases (PCOs) with an O2-sensing role revealed a new subset of TDOs, enzymes that oxidise cysteine residues at the N termini of proteins, triggering their degradation in an O2-dependent manner via the Cys/Arg branch of the N-degron pathway [2,3]. We will refer to this subset of TDOs as NCOs (N-terminal cysteinyl dioxygenases). Subsequently, cysteamine dioxygenase (also known as 2-aminoethanethiol dioxygenase, ADO), until then understood to oxidise cysteamine, was also found to act as an NCO, regulating the stability of certain Cys-initiating proteins in an O2-sensitive manner [4]. Emerging structural information and sequence analysis have revealed that these two subfamilies of TDO appear to have divergent features, which may rationalise their different functions and could help to reveal their O2-sensing mechanisms [5,6]. In this review, we summarise what is known to date about the two subfamilies and what is still to be uncovered. As TDOs are conserved across eukaryotic and prokaryotic kingdoms, such knowledge could help identify novel O2-sensitive pathways.

Cysteine dioxygenase
Cysteine dioxygenase (CDO) was the first characterised TDO and has been known for several decades to regulate cysteine levels in animals [7,8]. It catalyses the O2-dependent oxidation of L-cysteine to L-cysteine sulphinic acid, incorporating both oxygen atoms into the product (Scheme 1) [1]. Homeostasis of cysteine levels in the body is crucial as high cysteine levels are toxic and have been linked to neurological disorders such as Parkinson's disease and Alzheimer's disease [9]. CDO is expressed in a tissue-specific manner, particularly in liver, adipose tissues and lungs [10,11]. Sequence and structural analyses have identified that CDO belongs to the cupin superfamily of enzymes, which are non-haem metalloenzymes that catalyse a wide range of reactions [12]. They share common structural features such as two conserved motifs (named cupin motifs 1 and 2) and a double β sheet helix (DBSH), and the active site of the enzymes is formed within this cupin fold. While CDO shares these conserved structural features (Fig. 1AI), its active site Fe(II) coordination is different to that of other cupin superfamily enzymes. Typically, 3-His-1-Glu or 2-His-1-Glu/Asp motifs coordinate the metallocentre [12]; however, in CDOs just three conserved histidine residues coordinate the active site iron [13]. This motif is uncommon among cupin enzymes and has been seen in only two others, gentisate 1,2-dioxygenase (GDO) and diketone dioxygenase (Dke1) [1]. Sequence analysis shows that the metal-coordinating glutamate residue seen in other non-haem metalloenzymes is replaced by a cysteine residue in mammalian CDOs (Cys93 in RnCDO).
A number of high-resolution crystal structures of mammalian CDOs have been published, including from mouse (PDB 2ATF) [13], rat (PDB 2B5H, 3ELN) [14,15] and human (PDB 2IC1) [16], which reveal features of the enzymes' active sites (Fig. 1A). A substrate-free structure of rat CDO (PDB 2B5H) showed the active site iron cofactor in an unusual tetrahedral coordination with the three His residues as well as a water molecule, which was also coordinated with Tyr157 [14]. The substrate-free mouse CDO structure (PDB 2ATF) instead suggested a distorted octahedral geometry at the active site (in which Fe(II) was substituted with Ni). Other conserved features of CDO active sites, at least in eukaryotes, are a Ser153-His155-Tyr157 motif (RnCDO numbering), which forms a hydrogen-bonding network, a Cys93-Tyr157 thioether bond, and Arg60, which is involved in binding the substrate carboxylate group (Fig. 1AIII; Fig. 2B). CDOs are also present in certain phyla of bacteria, where their active sites differ slightly from their eukaryotic homologues, with a Gly residue replacing Cys93, meaning a Cys-Tyr thioether bond is not observed [13,14].
The human CDO structure (PDB 2IC1) was the first reported with substrate present in the active site, although subsequent analysis of the electron density of this structure disputes the presence of L-cysteine [16]. Structures of rat CDO with L-Cys bound and a proposed intermediate (persulfenate) trapped in the active site have since been reported (PDB 3ELN), along with similar structures at a range of pH values [15,17]. These structures (e.g. PDB 4IEV at pH 8.0; Fig. 1AIV) show that the thiolate and amine groups of L-cysteine bind directly to the active site, coordinating the iron in a bidentate manner [15,17]. CDO Arg60 is positioned to interact with the carboxylate group of the substrate via its guanidinium group. Tyr157 is also in close proximity (2.2 Å) to the bound substrate, and Tyr157Phe variants show significantly reduced activity compared with the wild-type, suggesting an important role of the Tyr in substrate interaction and the catalytic mechanism (likely including formation of the Cys-Tyr cross-link) [16,18-20].
The post-translational covalent cross-link observed between residues Cys93 and Tyr157 is a modification which has otherwise been seen to date only in galactose oxidase [21]. The role of the resulting Cys-Tyr thioether bond has been examined in some detail. It is not necessary for CDO activity, forming over the course of multiple turnovers, but enzymatic activity is enhanced in its presence. Cys93 variants of CDO, as well as the Tyr157Phe variant, have been shown to be functional but with decreased activity [22-24]. Cross-link formation is proposed to be initiated by O2 activation at the active site, resulting in oxidation of Cys93 instead of L-Cys (substrate) to generate a thiyl radical, which subsequently oxidises Tyr157 to form the cross-link [25]. Although the mechanism by which the Cys93-Tyr157 cross-link enhances enzyme kinetics has not been definitively established, current evidence indicates that it acts through stabilisation of the Tyr157-OH group, which forms part of a hydrogen-bonding network across the Ser153-His155-Tyr157 motif [13,14,18], such that it is at a favourable distance and position to interact with the substrate as well as O2, and may play a role in catalysis of substrate oxidation [26,27].
Despite reports of structural, spectroscopic and enzymatic characterisation, the CDO catalytic mechanism is not fully resolved. An electron paramagnetic resonance study of CDO with iron cofactor, NO (substituted for O2) and L-cysteine showed that the cysteine coordinates the active site iron first, followed by molecular O2 [27]. This study also showed that in the substrate-free form of CDO, Fe(II) is stabilised in the reduced state in aerobic conditions even in the absence of a reductant [27]. Hence, the high selectivity of substrate binding to the enzyme prior to O2 binding may be due to the high reduction potential of the substrate-free Fe(II) centre. This order of binding means CDO is likely to act via activation of O2 rather than activation of substrate. There are currently two proposed mechanisms for thiol oxidation (Scheme 2). The first is based on a crystallographically observed persulfenate species and predicts that, once substrate and O2 are bound, the thiolate attacks the proximal O coordinating the iron cofactor, resulting in thiadioxirane formation and heterolytic cleavage of the O-O bond [15,17]; this mechanism proposes a role for Tyr157 (RnCDO numbering) in H-bonding with the proximal O (Scheme 2). The second mechanism is based on theoretical calculations, which suggest that the thiolate is attacked by the distal O, resulting in O-O bond cleavage and a reactive Fe(IV)-oxo species from which the second O is transferred to the substrate (Scheme 2) [28,29]. Kinetic experiments attempting to trap and identify catalytic intermediates to resolve the mechanism have not yet been successful: although a catalytically relevant, spectroscopically visible intermediate has been observed, it was short-lived (< 20 ms) and its identity has not yet been experimentally resolved [29].

Bacterial thiol dioxygenases
TDOs are also found in certain phyla of bacteria, where they are divided into two subclasses, 'Arg-type' and 'Gln-type' TDOs [23]. CDOs fall into the Arg-type category, where an active site Arg residue is retained in an equivalent position to Arg60 in RnCDO to stabilise substrate binding (Fig. 1AIII) [23]. 'Gln-type' TDOs replace this conserved Arg with a Gln residue which is not involved in substrate binding (Fig. 2B). These TDOs in fact appear to preferentially act as 3-mercaptopropionate dioxygenases (3MDOs, Scheme 1), another class of TDO found in bacteria [23,30]. 3MDOs have a conserved Arg elsewhere in the active site (e.g. Av/Pa3MDO Arg168; Fig. 1BIII and IV; Fig. 2B) which has been shown to stabilise substrate binding via interaction with the substrate carboxylate group, fulfilling a similar role to RnCDO Arg60 [18]. Mercaptosuccinate dioxygenase (MSDO, Scheme 1) also has both Gln and Arg residues in the active site; while Variovorax paradoxus B4 MSDO Gln64 characterises this enzyme as a Gln-type TDO, Arg66 is important in catalysis, likely via substrate binding (Fig. 2B) [31]. Substitution of the Tyr157-equivalent residue to Phe in MSDO did not impact activity, suggesting its OH group is non-essential for mercaptosuccinate oxidation [31]. In line with this observation, in bacterial TDOs (including CDOs) the equivalent Cys is replaced with a Gly, and the Cys-Tyr cross-link is not observed (Fig. 2B) [30]. Neither mercaptosuccinate nor 3-mercaptopropionic acid (3MPA) has an amine group as is seen for the CDO substrate, L-cysteine. Therefore, substrate binding to the active site of these TDOs cannot involve amine as well as thiol coordination to the active site metal. Kinetic and spectroscopic data have indicated that substrate coordination is instead monodentate [30]; however, recent modelling and structural reports, including Av3MDO with a 3-hydroxypropionic acid (3HPA) substrate analogue in the active site, indicate that bidentate coordination of substrate via its carboxylate and hydroxyl groups is possible (Fig. 1BIV) [32,33].

Cysteamine dioxygenase
A second mammalian TDO, capable of oxidising cysteamine to hypotaurine, has been postulated since the 1960s, when an enzyme extracted from animal tissue was found to be specific for cysteamine over cysteine and other cysteine derivatives [34]. This reaction is part of an alternative pathway in taurine biosynthesis and also regulates levels of cysteamine, which is released during coenzyme A degradation. The enzyme, cysteamine (2-aminoethanethiol) dioxygenase (ADO), is more widely expressed than CDO, with higher levels of expression in the brain and muscles, and was first biochemically characterised in 2007 by Dominy et al. [35]. They reported that the recombinant ADO protein, like the extracted enzyme, was very specific for the dioxygenation of cysteamine to hypotaurine (Scheme 1) and could not use L-cysteine as a substrate. L-Cysteine was also found to be a very poor inhibitor of ADO's cysteamine dioxygenase activity, as was 2-mercaptoethanol, indicating that neither of these alternative thiols bound readily to the active site.
There is no published structure of ADO; its sequence suggests that, like CDO, it belongs to the cupin superfamily of enzymes but, also like CDO, the common glutamate residue usually found in cupin motif 1 is not present, being replaced by glycine or valine (Cys93 in RnCDO) [36]. ADO is therefore expected to have a DBSH, with the active site located within the cupin fold. Sequence alignment of ADO across different organisms and with CDO shows three conserved histidine residues (HsADO His112, His114 and His193), which are expected to be involved in binding iron via a facial triad. Indeed, as for CDO, iron coordination is needed for ADO activity: when iron binding was ablated using a His to Ala variant, cysteamine dioxygenase activity was lost [35]. It is also possible that ADO shares another structural feature with CDO: in 2018, Wang et al. published evidence that a Cys-Tyr cross-link can occur in ADO between Cys220 and Tyr222 [37]. Interestingly, recent spectroscopic investigations indicate that metal-substrate coordination in ADO is distinct from that of CDO: while crystallographic evidence has shown CDOs to bind their substrates in a bidentate manner, spectroscopic evidence has indicated that ADO binds cysteamine in a monodentate manner via the thiol but not via its amine group [38,39].

Plant cysteine oxidases
In 2011, two reports identified that in plants, cysteine oxidation plays a major role in adaptive responses to the reduced oxygen availability (hypoxia) associated with flooding [40,41]. Group VII ethylene response factors (ERF-VIIs) are transcription factors which upregulate genes that trigger either a metabolic quiescence response or an escape response involving rapid growth above the level of the floodwater. While the ERF-VIIs are stable in these hypoxic conditions, they are destabilised when O2 is available, through oxidation of their N-terminal cysteine and subsequent degradation via the N-degron pathway. This is an analogous mechanism to the hypoxic regulation of the hypoxia-inducible factor (HIF), a transcription factor which performs a similar role in triggering hypoxic adaptation in animals [43].
Knowing that HIF stability is controlled by O2-sensing HIF hydroxylase enzymes, equivalent enzymatic control of ERF-VII N-terminal Cys oxidation was suspected. By searching plant genomes for homologues of the small molecule TDOs described above, Weits et al. [2] reported in 2014 that they had identified five enzymes in Arabidopsis thaliana that could oxidise ERF-VII N termini and regulate ERF-VII stability; these were termed plant cysteine oxidases (PCOs). Subsequent analyses confirmed that the PCOs catalyse oxidation of the Nt-Cys of ERF-VII to Cys-sulfinic acid (Scheme 1) [40,41] and that these sulfinylated N termini are then arginylated by arginyl-tRNA-protein transferase 1 (ATE1) to generate N-degrons [44-46]. The N-degron pathway then recognises and ubiquitinates N-degron proteins or fragments (degradation signals), directing them for proteolysis by the 26S proteasome [47]. The PCOs therefore represent a form of TDO that can catalyse oxidation of N-terminal cysteine residues in the context of a protein (N-terminal cysteinyl dioxygenases, NCOs), in contrast to the role described for small molecule TDOs. Since this discovery, PCOs have also been shown to catalyse oxidation of Nt-Cys residues of the Arabidopsis proteins polycomb group protein vernalisation 2 (VRN2) [48] and LITTLE ZIPPER 2 (ZPR2) [49], expanding the role of these enzymes beyond flood tolerance to developmental signalling processes. Indeed, as plants have hundreds of Nt-Cys-initiating proteins, this suggests the potential for multiple PCO substrates.
[Fig. 2 caption fragment: Secondary structure for VpMSDO was determined using a predicted structure produced with MODELLER [56] using RnCDO (PDB 2B5H) as a template. (C) Alignment of N-terminal cysteinyl dioxygenases. Secondary structure for AtPCO4 was determined using the published structure (PDB 6S7E). Secondary structure for the remaining NCOs was determined using predicted structures produced with MODELLER [56] using AtPCO4 (PDB 6S7E) as a template.]
Kinetic studies have been performed with AtPCO enzymes towards peptides representing the Met-excised N termini of two of the ERF-VII substrates in Arabidopsis (RAP2.2 and RAP2.12). These studies revealed that the catalytic activity among the five AtPCOs varied significantly [3], with one enzyme, AtPCO4, having a significantly higher turnover rate than the other enzymes. AtPCO4 and AtPCO5 have kcat/Km values of 44927 and 49068 M⁻¹ s⁻¹, respectively, which are at least twice those of AtPCOs 1-3, indicating these are the most catalytically efficient enzymes [3]. AtPCO4 and AtPCO5 are both constitutively expressed, along with AtPCO3, in contrast to AtPCO1 and AtPCO2, which are hypoxically upregulated [2,50] [3]. Combined with the apparent differing substrate preferences of AtPCO1/2 and AtPCO4/5 towards ERF-VII substrates and the nature of their expression in plant cells (hypoxia-induced vs. constitutively expressed) [2], the kinetic data suggest the possibility of distinct physiological roles for the five enzymes, such as stabilisation and destabilisation of ERF-VIIs on submergence/desubmergence. Further investigation of the dynamics of PCO activity across O2 gradients in planta will be of great interest.
To date, crystal structures of three AtPCOs have been reported: PCO4 (PDB 6S0P [5], 6S7E [5] and 7CHJ [6]), PCO5 (PDB 6SBP [5] and 7CHI [6]) and PCO2 (PDB 7CXZ) [6]. These structures reveal that, similar to other TDOs, the PCOs are members of the cupin superfamily of enzymes, with DBSH core motifs flanked by an N-terminal α-helical region and a C-terminal loop region (Fig. 1CI). Their active sites are formed within the cupin fold, where the active site iron is coordinated by 3xHis residues (His98, His100 and His164, AtPCO4/5 numbering here and throughout) [5]. In the absence of substrate, the octahedral coordination at the iron centre is completed by three water molecules bound in the vacant sites [5]. Beyond the iron-binding residues, the active sites of the PCOs appear quite different to those of the small molecule TDOs (Fig. 1CIII). The Ser-His-Tyr catalytic triad is replaced with Ile-Asp-Leu, while the equivalent residues to the CDO Cys93-Tyr157 cross-link are not present in the PCOs (Fig. 1CIII; Fig. 2). Although there are other potential cross-link sites (Cys190, Tyr182, Tyr192), thioether bond formation has not been identified in PCOs to date [5,37]. Mutagenesis studies have revealed roles for Asp176 and Tyr182 in substrate binding and/or catalysis, where conservative substitutions to Asn and Phe (respectively) resulted in decreased activity compared with the wild-type [5,6]. Curiously, given that 2-His-Asp metal coordination is a common feature in cupin superfamily enzymes [52], an AtPCO4 His164Asp variant lost all activity.
A structure of AtPCO2 with Cys bound in the active site has been modelled based on a structure of AtPCO2 with Tris bound in the active site (PDB 7CXZ) [6]. This model suggested quite different substrate-interacting residues to those seen in CDO, for example, van der Waals interactions between L-Cys and residues Phe123, Ile131 and Phe199 (AtPCO2 numbering) [6]. Although these residues are conserved in all five isoforms, to date there are no mutagenesis studies reported to validate their role in substrate recognition and/or binding in the active site.

ADO functions as an N-terminal cysteinyl dioxygenase
Following identification of the ability of the AtPCOs to oxidise N-terminal cysteinyl residues of target proteins, examination of human TDO sequences suggested that ADO showed greater homology to PCO sequences than to CDO sequences, particularly with respect to secondary active site residues (Fig. 2). Regulators of G-protein signalling, RGS4 and RGS5, were already known targets of the Cys/Arg branch of the N-degron pathway in humans [45]; ADO was therefore investigated as a potential NCO-type enzyme catalysing oxidation of the RGS4/5 N-terminal Cys, and was shown to be required for O2-dependent degradation of RGS4 and RGS5 [4]. As for the PCOs, ADO was shown to catalyse dioxygenation of the N-terminal cysteine residue of peptides representing the N termini of each of these proteins using both oxygen atoms from molecular oxygen. This activity was not replicated with human CDO. The stability of the pro-inflammatory cytokine interleukin 32 (IL32) was also found to be regulated by ADO [4], although other Nt-Cys-initiating mammalian proteins were found to be poor ADO substrates, suggesting that ADO may only regulate O2-dependent protein degradation in a select group of Nt-Cys-initiating proteins.
Similar to the PCOs, the Nt-Cys oxidation activity catalysed by ADO was shown to be sensitive to O2 availability, with an apparent KM(O2) of around 50% O2 with respect to RGS4 and RGS5 oxidation, a degree of O2 sensitivity otherwise observed to date only for the HIF hydroxylase enzyme prolyl hydroxylase 2 (PHD2) [53]. Kinetic studies indicated that RGS4, RGS5 and IL32 were significantly better substrates for ADO than its small molecule substrate, cysteamine: ADO had a catalytic efficiency (kcat/Km) of 420 M⁻¹ s⁻¹ for cysteamine [35], compared to kcat/Km values of 164 000, 237 000 and 307 000 M⁻¹ s⁻¹ with RGS4, RGS5 and IL32 peptides, respectively. The ability of cysteamine and L-cysteine to compete with turnover of the RGS5 peptide was determined and IC50 values reported as 37.65 mM and 13.63 mM, respectively. Although these values are high, they were determined using a high RGS5 2-15 substrate concentration (320 µM) to give maximal ADO activity (KM(app) = 71.5 µM). Assuming cysteine and cysteamine act as competitive inhibitors, it is therefore possible that they could compete with the NCO activity of ADO under physiological conditions with lower concentrations of Nt-Cys-initiating substrates (the normal concentration of cysteine in human cells is 80-200 µM [54] while cysteamine is likely less than this [55]) [38,39]. Although the kinetic data therefore suggest that the primary biological role of ADO is hypoxic regulation of Nt-Cys-initiating protein stability, the role of ADO in catalysing cysteamine oxidation likely remains biologically relevant, dependent on the relative concentrations of cysteamine and Nt-Cys-initiating substrates in cells.
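As a rough worked example of the competitive-inhibition argument above, the Cheng-Prusoff relation, Ki = IC50 / (1 + [S]/KM), can be applied to the quoted values. This is only a sketch: it assumes purely competitive inhibition, treats the apparent KM of 71.5 µM as the true KM, and uses an illustrative function name (`cheng_prusoff_ki`) that does not come from the cited studies. It also computes the fold difference in kcat/Km between the peptide substrates and cysteamine.

```python
def cheng_prusoff_ki(ic50, substrate_conc, km):
    """Estimate Ki for a competitive inhibitor (returned in the units of ic50).

    substrate_conc and km must share the same units.
    """
    return ic50 / (1.0 + substrate_conc / km)


# Values quoted in the text for the RGS5 2-15 peptide assay.
SUBSTRATE_UM = 320.0   # peptide concentration used in the assay
KM_APP_UM = 71.5       # apparent KM; assumed here to equal the true KM
IC50_MM = {"cysteamine": 37.65, "L-cysteine": 13.63}

ki_mm = {
    name: cheng_prusoff_ki(ic50, SUBSTRATE_UM, KM_APP_UM)
    for name, ic50 in IC50_MM.items()
}

# Catalytic efficiencies (M^-1 s^-1) quoted in the text.
KCAT_KM_CYSTEAMINE = 420.0
KCAT_KM_PEPTIDES = {"RGS4": 164_000.0, "RGS5": 237_000.0, "IL32": 307_000.0}

fold_over_cysteamine = {
    name: value / KCAT_KM_CYSTEAMINE for name, value in KCAT_KM_PEPTIDES.items()
}

for name, ki in ki_mm.items():
    print(f"{name}: estimated Ki = {ki:.2f} mM")
for name, fold in fold_over_cysteamine.items():
    print(f"{name}: kcat/Km is {fold:.0f}-fold that of cysteamine")
```

Under these assumptions the estimated Ki values are roughly 6.9 mM for cysteamine and 2.5 mM for L-cysteine, several-fold lower than the measured IC50 values, and the peptide substrates are turned over with roughly 390- to 730-fold higher catalytic efficiency than cysteamine.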
Sequence similarity between ADO and PCOs enabled us to predict a structure of ADO using MODELLER [56] with AtPCO4 (PDB 6S7E) [5] as a template (Fig. 1D). The predicted structure had 93% of residues in the Ramachandran favoured region, with just 1.49% Ramachandran outliers and 2.65% rotamer outliers. This predicted structure featured the expected distinctive DBSH and active site iron cofactor coordination via a facial triad of histidine residues (Fig. 1DI). Other aspects of the AtPCO4 active site are also predicted to be conserved in the ADO model, including a motif comprising Phe202, Asp204 and Leu206, similar to the Ile174-Asp176-Leu178 triad seen in the PCOs (Fig. 1DIII). The predicted structure of ADO also featured a hairpin loop near the active site opening (residues 212-220), similar to a hairpin loop observed in the PCO structures (Fig. 1DIV) [5]. Intriguingly, a recent spectroscopic study suggests that ADO binds RGS5 peptide in a monodentate manner through coordination to the Fe(II) by the thiol of the Nt-Cys, similar to the proposed binding mode for cysteamine described above [39].

Two subclasses of thiol dioxygenase
The structural and functional analyses of PCOs and ADO have revealed clear differences from the related TDOs: CDOs, MSDO and 3MDO. These suggest that TDOs can be divided into two subclasses depending on the thiol substrate: protein or small molecule. This is seen upon phylogenetic analysis of their sequences (Fig. 2A), and available crystal structures of these two subclasses of TDO also appear to show different intrinsic structural features which may fulfil their requirements for substrate binding (Fig. 1).
The Ser-His-Tyr motif seen in small molecule TDOs (Fig. 1AIII and BIII; Fig. 2B), known as the 'catalytic triad', forms an H-bond network adjacent to the active site iron. This motif is conserved in all small molecule thiol dioxygenases and facilitates substrate binding by the Tyr-OH group [14,15,33]. It is not conserved in AtPCOs or ADO, where it is replaced at equivalent positions by Ile-Asp-Leu residues in the AtPCOs, or Phe-Asp-Leu in ADO [5]. This alternative motif is conserved in all AtPCOs as well as PCO sequences from a wide range of other plants, although Leu is a common substitution for Ile (Fig. 2C) [5]. The precise role of the alternative triad in NCOs is not yet clear, but an Asp176Asn variant in AtPCO4 resulted in near-ablated activity. If, as reported for ADO, substrate coordination of NCOs is monodentate to the active site metal [38], it is possible that the conserved Asp in the NCO form of this triad could play a role in binding the N-terminal amine of their polypeptide substrates. Tyr157 plays a particularly important role in CDO substrate binding, so although this role could be fulfilled by other Tyr residues in the active sites of NCOs, distinct modes of primary substrate (and/or O2) binding between the two classes of TDO seem likely, especially as there is no equivalent in NCOs to the interaction between Arg residues and substrate carboxylate groups seen in CDOs, 3MDOs and MSDOs. The PCOs do, however, appear to share a common feature with CDOs, which is a cis-peptide bond following the catalytic triad: this comprises Ala179-Pro180 and Ser179-Pro180 in AtPCO4 and AtPCO5, respectively, and Ser158-Pro159 in RnCDO (Fig. 2). This may prove to be important in aligning substrate for oxidation; NCO:substrate structures will be invaluable in understanding these and other active site interactions.
As PCO:substrate interactions are likely to occur across a greater surface area of the enzyme than is the case for small molecule TDOs, it was interesting to find that the position of a C-terminal loop in the AtPCO substrate-free structures creates an open tunnel-like cavity leading to the active site (Fig. 1CII) [5]. Our modelled ADO structure also indicates similar accessibility to the active site (Fig. 1DII). In contrast, the equivalent C-terminal loop in small molecule TDO enzymes sits across this active site face, rendering access to the active site more hindered (Fig. 1AII, BII). It could be that the more accessible active sites seen for the NCOs enable binding of a protein substrate, whereas the more restricted enzymes are better suited for small molecule thiol substrate binding. Situated at the entrance to this proposed active site cavity in the AtPCO4 structure (PDB 6S0P) is a hairpin loop (β9-β10 loop, residues 182-190, AtPCO4 numbering) consisting of charged (Glu185, Asp187 and Arg188) and polar (Tyr182, Ser183, Ser184 and Cys190) residues [5] (Fig. 1CIV). The predicted structure of ADO also features a hairpin loop near the active site opening (residues 212-220; Fig. 1DIV, Fig. 2C), similar to the hairpin loops observed in the PCO structures and rich in aspartate residues [5]. This region is noticeably variable among NCO sequences (Fig. 2C) and, given its location, may play a role in substrate recognition and/or binding [5].
[…] [57], this nevertheless implies that there will be a rapid stabilisation of its substrates in hypoxic conditions that may be important in relevant disease conditions such as ischaemia or in hypoxic tumour cores. The sequence homology between ADO from the animal kingdom and the PCOs from the plant kingdom implies that these O2 sensors could have arisen from a common evolutionary ancestor, earlier than the PHD/HIF-mediated O2-sensing system, which is only present in metazoans [58].
Our current understanding of NCO/PHD-mediated O2 sensing is centred on their ability to regulate transcription factors that evolved in complex eukaryotes, as is the case for both HIF and ERF-VIIs. It will be of interest to uncover substrates for these enzymes in simpler organisms, which may inform our understanding of how O2-sensing capacity has driven adaptation to different environments [57].

An emerging role for N-terminal cysteine dioxygenases in oxygen sensing
It is important to note that, although no connection has been identified between CDO, 3MDO or MSDO activity and O2-sensitive responses, the kinetics of their activity specifically with respect to O2 availability have not been determined. Such knowledge could help elucidate whether all TDOs, or just the NCO subset, are responsive to physiologically relevant levels of hypoxia. Whether or not this is the case, the O2-sensitive regulation of protein stability via NCO activity is an important new pathway in hypoxia biology, and common structural features could underpin this property. Although the mechanistic rationale can only be speculated upon at present, O2 sensitivity may be conferred by active site features regulating the rate at which O2 binds and is activated by iron, or through a restricted rate of O2 delivery to the active site, which in turn could be impacted by the nature of the enzyme:substrate interaction and accessibility. Further mechanistic and kinetic assays will reveal how the NCOs control their rate of reaction with O2, with such knowledge potentially enabling rational manipulation of plant PCOs to alter their O2 sensitivity and, with it, their capacity to respond to flood-induced hypoxia.
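To illustrate why a high apparent KM(O2) is associated with O2-sensing capacity, the following sketch compares how strongly the reaction rate responds to a fall in O2 for two enzymes under simple Michaelis-Menten kinetics. It is a simplification under stated assumptions: the primary substrate is taken to be saturating, the KM(O2) of 50% O2 is the approximate value quoted earlier for ADO, and the low-KM enzyme (1% O2) and the two O2 levels (21% and 1%) are hypothetical values chosen only for illustration.

```python
def fractional_rate(o2_percent, km_o2_percent):
    """Michaelis-Menten rate as a fraction of Vmax at a given O2 level."""
    return o2_percent / (km_o2_percent + o2_percent)


KM_HIGH = 50.0   # % O2; approximate apparent KM(O2) quoted in the text for ADO
KM_LOW = 1.0     # % O2; hypothetical enzyme with a low KM(O2), for comparison

NORMOXIA = 21.0  # % O2; illustrative atmospheric level
HYPOXIA = 1.0    # % O2; illustrative hypoxic level


def fold_drop(km_o2_percent):
    """Fold decrease in rate when O2 falls from NORMOXIA to HYPOXIA."""
    return (fractional_rate(NORMOXIA, km_o2_percent)
            / fractional_rate(HYPOXIA, km_o2_percent))


for label, km in (("high KM(O2)", KM_HIGH), ("low KM(O2)", KM_LOW)):
    print(f"{label}: rate falls {fold_drop(km):.1f}-fold from 21% to 1% O2")
```

With these illustrative numbers the high-KM enzyme slows about 15-fold as O2 falls from 21% to 1%, whereas the low-KM enzyme slows less than 2-fold, so only the former reports changes in O2 availability across this range.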

Conclusions
We have described the recent characterisation of a new class of thiol dioxygenase, the N-terminal cysteinyl dioxygenases (NCOs), which have an O2-sensing role in regulating protein stability via the N-degron pathway. While there are many common features, there are distinct structural differences between the NCOs and small molecule thiol dioxygenases characterised to date, particularly in the active site. Some of these […] The common features of the NCOs, particularly at the active site, in combination with their O2-sensing function, raise the possibility of being able to predict whether other organisms may have such O2-sensing enzymes. The existence of such sequences in early land plants (Fig. 2C) and even algae is intriguing [57], particularly as known substrates of the PCOs have evolved with flowering and seeding plants. Investigating NCOs from distantly related organisms could reveal new O2-sensing functions in a range of biological niches.
Although there is much work to be done to correlate the structural and functional features of the NCOs, particularly with respect to O2 sensing, doing so could reveal opportunities for targeted manipulation of their activity. Such manipulation, whether via pharmaceutical, agrochemical or genetic intervention, could impact on biological responses to hypoxia. Ultimately, this may be of use from a therapeutic perspective in humans or from an agronomic perspective, where reduced PCO activity could stabilise ERF-VII levels and enhance adaptive responses to flooding [59].