Thermophilic bacteria are potential sources of novel Rieske non-heme iron oxygenases

Rieske non-heme iron oxygenases, which have a Rieske-type [2Fe–2S] cluster and a non-heme catalytic iron center, are an important family of oxidoreductases involved mainly in regio- and stereoselective transformation of a wide array of aromatic hydrocarbons. Though present in all domains of life, the most widely studied Rieske non-heme iron oxygenases are found in mesophilic bacteria. The present study explores the potential for isolating novel Rieske non-heme iron oxygenases from thermophilic sources. Browsing the entire bacterial genome database led to the identification of 45 homologs from thermophilic bacteria distributed mainly among Chloroflexi, Deinococcus–Thermus and Firmicutes. Thermostability, measured according to the aliphatic index, showed higher values for certain homologs compared with their mesophilic relatives. Prediction of substrate preferences indicated that a wide array of aromatic hydrocarbons could be transformed by most of the identified oxygenase homologs. Further identification of putative genes encoding components of a functional oxygenase system opens up the possibility of reconstituting functional thermophilic Rieske non-heme iron oxygenase systems with novel properties.


Introduction
Rieske non-heme iron oxygenases (ROs) constitute a large family of oxidoreductase enzymes involved primarily in the oxygenation of various aromatic compounds. Gibson et al. (1968) first detected the involvement of such an enzyme system in an alkylbenzene-degrading Pseudomonas sp., and the family has since garnered a great deal of attention for two major reasons. First, ROs are key enzymes responsible for the initial attack on otherwise inert aromatic nuclei, thereby making them targets of a cascade of downstream enzymes, leading to their complete mineralization (Gibson and Subramanian 1984; Allen et al. 1995; Gibson and Parales 2000; Mallick et al. 2011). Second, the regio- and stereoselective cis-dihydroxylation of aromatic compounds catalyzed by ROs generates chiral intermediates for the synthesis of a wide array of agrochemically and pharmaceutically important compounds (Ensley et al. 1983; Wackett et al. 1988; Hudlicky et al. 1999; Bui et al. 2002; Newman et al. 2004; Boyd et al. 2005; Zezula and Hudlicky 2005).
Members of the RO family are usually either two- or three-component systems in which one or two soluble electron transport (ET) proteins (such as ferredoxin and reductase) transfer electrons from reduced nucleotides, such as NAD(P)H, to the terminal oxygenase component (a large α-subunit, often accompanied by a small β-subunit), which in turn catalyzes the di- or mono-oxygenation of the aromatic nucleus of the substrate (Mason and Cammack 1992; Ferraro et al. 2005). Numerous ROs have been identified and characterized from bacteria, thereby enriching the available information on their diversity in terms of both sequence and function (Habe and Omori 2003; Iwai et al. 2010, 2011; Chakraborty et al. 2012). Although ROs are found in all three domains of life, studies have shown that they occur more commonly in bacteria than in archaea and eukaryotes (Chakraborty et al. 2012). Homologs of the large (α) subunit of RO terminal oxygenase (RO ox) have also been investigated in certain plant species, such as Arabidopsis thaliana, Zea mays, Pisum sativum, Oryza sativa, Physcomitrella patens, Amaranthus tricolor, Ocimum basilicum and Spinacia oleracea (Caliebe et al. 1997; Meng et al. 2001; Reinbothe et al. 2004; Berim et al. 2014), as well as in insects, nematodes and vertebrates (Rottiers et al. 2006; Yoshiyama et al. 2006; Yoshiyama-Yanagawa et al. 2011). ROs from these taxa, however, have entirely different functions from those of bacterial aromatic ring-hydroxylating ROs. In plants, they either act as protein translocons, facilitating transport across the chloroplastic envelope membranes during chlorophyll biosynthesis, or are involved in flavone and hormone metabolism. In insects, they have been suggested to be involved in the regulation of cholesterol metabolism or trafficking during steroid synthesis (Caliebe et al. 1997; Meng et al. 2001; Reinbothe et al. 2004; Rottiers et al. 2006; Yoshiyama et al. 2006; Yoshiyama-Yanagawa et al. 2011; Berim et al. 2014).
Interestingly, a few bacterial RO homologs with novel functions, such as oxidative cyclization during biosynthesis of certain antibiotics, hydroxylation and desaturation of short-chain tertiary alcohols and alkane monooxygenation, have been reported in recent years (Sydor et al. 2011; Schäfer et al. 2012; Li et al. 2013). This suggests that ROs bear much more catalytic potential than previously realized. Almost all bacterial ROs characterized biochemically to date have been isolated from mesophilic bacteria, with the sole exception of the polychlorinated biphenyl-degrading ring-hydroxylating dioxygenase from Geobacillus sp. JF8 (Mukerjee-Dhar et al. 2005; Shintani et al. 2014). As such, very little is known about RO homologs present in bacteria that live in extreme environments. Extremophiles, and in particular their enzymes, have proved to be a potentially valuable resource in the development of novel biotechnological processes. The best-studied extremophiles include thermophiles and hyperthermophiles, and enzymes isolated from such microorganisms are often extremely thermostable and resistant to proteolysis, chemical denaturants, detergents, and organic solvents (Vieille and Zeikus 2001). Apart from enzymatic stability at high temperatures, which is often desired in industrial processes, thermophilic systems offer several advantages in bioremediation studies. Owing to the poor aqueous solubility of aromatic hydrocarbons, biodegradation studies often encounter problems related to bioavailability. These issues can be overcome at elevated temperatures, since bioavailability tends to increase with temperature owing to increases in solubility (Margesin and Schinner 2001; Feitkenhauer and Märk 2003; Perfumo et al. 2007). Thermophilic microorganisms may thus be attractive sources of novel thermostable ROs with potential utility in industrial biosynthesis and bioremediation at elevated temperatures.
In recent years, microbial genome sequencing projects have generated an enormous quantity of data for public databases. Since publication of the genome sequence of the first extremophile in 1996 (Bult et al. 1996), there has been a substantial increase in the number of extremophilic genome sequences. Metagenomics and single-cell genomics further add to this repertoire (Hedlund et al. 2014). The present study explored all available genome sequences of thermophilic bacteria for the presence of RO homologs and predicted their suitability as novel RO candidates for biotechnological applications.

Screening thermophilic bacterial genomes for the presence of RO ox α-subunit homologs
Functionally characterized ROs have been categorized into five different similarity classes (A, B, C, D and D*) based on their phylogenetic distribution, substrate preferences and mode of attack on aromatic nuclei (Chakraborty et al. 2012). The National Center for Biotechnology Information (NCBI) 'genome' and 'taxonomy' databases were searched to characterize the distribution of thermophilic bacteria among different bacterial lineages and the availability of their genome sequences. Representative RO ox α-subunit sequences from each class were used as query probes (Table 1) to perform blastp (Altschul et al. 1990) searches against the translated set of genome sequences. Blast searches were also performed using each thermophilic RO as a query against the NCBI non-redundant database to characterize their distribution among thermophiles (and/or other extremophiles) and mesophiles.
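The hit-selection step of such blastp searches can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: it assumes the search was exported in tabular form with the columns `qseqid sseqid pident qcovs` (an assumed column layout), and it applies the thresholds stated in the heat-map figure legend (identity of at least 40% and query coverage of at least 80%). The function name `filter_hits` and the example rows are hypothetical.

```python
import csv
import io

MIN_IDENTITY = 40.0  # percent identity threshold (from the figure legend)
MIN_COVERAGE = 80.0  # percent query coverage threshold (from the figure legend)


def filter_hits(handle, min_identity=MIN_IDENTITY, min_coverage=MIN_COVERAGE):
    """Yield (query, subject, identity, coverage) for rows passing both thresholds.

    Assumes tab-separated rows with the columns: qseqid, sseqid, pident, qcovs.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, coverage = float(row[2]), float(row[3])
        if identity >= min_identity and coverage >= min_coverage:
            yield query, subject, identity, coverage


# Made-up rows for illustration only; these are not results from the study.
example = io.StringIO(
    "queryA\thit1\t62.5\t95\n"
    "queryA\thit2\t38.0\t99\n"
    "queryA\thit3\t55.1\t60\n"
    "queryA\thit4\t40.0\t80\n"
)
hits = list(filter_hits(example))
for query, subject, identity, coverage in hits:
    print(query, subject, identity, coverage)
```

Both comparisons are inclusive, so a hit lying exactly on a threshold is retained, matching the "equal to or exceeding" wording of the legend.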

Phylogenetic clustering and prediction of substrate preferences
The RHObase server (Chakraborty et al. 2014) was used to categorize each candidate thermophilic RO ox α-subunit into a similarity class and to obtain the closest biochemically characterized homologs. The substrate prediction module of RHObase was further used to predict the substrate preference of the thermophilic homologs and the possible sites of oxygenation. ClustalX v1.81 (Thompson et al. 1997) was used to obtain multiple sequence alignments and to eliminate redundancy among sequences. Default settings were retained for all parameters except the substitution matrix, for which the BLOSUM series was used in both pairwise and multiple alignments. Phylogenetic trees were constructed based on distance data using the neighbor-joining method (Saitou and Nei 1987) implemented in ClustalX. The trees were visualized and manipulated using the program TreeExplorer v2.12 (Tamura et al. 2007).

Verification of the integrity of conserved motifs and domain architecture
The RO ox α-subunit homologs obtained from the genomes of thermophiles were subjected to ScanProsite (De Castro et al. 2006) and NCBI conserved domain database searches (Marchler-Bauer et al. 2002) to verify the presence of conserved sequence motifs. The relevant motifs were C-X-H-Xn-C-X2-H, corresponding to the N-terminal Rieske [2Fe-2S] center, and D-X2-H-X3,4-H-Xn-D, corresponding to the C-terminal conserved 2-His-1-carboxylate motif preceded by a conserved aspartate (involved in electron transport), as these are the functional prerequisites of ROs (Jiang et al. 1996; Parales 2003). The motifs were compared with those of phylogenetically close mesophilic ROs. For each protein, the aliphatic index (relative volume occupied by aliphatic side chains) (Ikai 1980) was calculated using ProtParam (Gasteiger et al. 2005).
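Both checks described above can be approximated with a short script. The following Python sketch is illustrative only and does not reproduce ScanProsite or ProtParam. The two motifs are written as regular expressions, and because the text leaves the spacer lengths Xn unspecified, the ranges used here (10 to 30 residues for the Rieske motif, 100 to 200 for the iron-binding motif) are assumptions made for this example. The aliphatic index follows the formula of Ikai (1980), AI = X(Ala) + 2.9·X(Val) + 3.9·[X(Ile) + X(Leu)], where X is the mole percent of each residue. All function names and the test sequence are hypothetical.

```python
import re

# Assumed spacer ranges: the motifs in the text give Xn without a length,
# so these bounds are illustrative guesses, not values from the study.
RIESKE_MOTIF = re.compile(r"C.H.{10,30}C.{2}H")           # C-X-H-Xn-C-X2-H
IRON_MOTIF = re.compile(r"D.{2}H.{3,4}H.{100,200}D")       # D-X2-H-X3,4-H-Xn-D


def has_ro_motifs(sequence):
    """Return True if both the Rieske and the mononuclear iron motifs are found."""
    seq = sequence.upper()
    return bool(RIESKE_MOTIF.search(seq)) and bool(IRON_MOTIF.search(seq))


def aliphatic_index(sequence):
    """Aliphatic index (Ikai 1980) computed from mole percentages of A, V, I and L."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Synthetic sequence built to contain both motifs; it is not a real protein.
toy = ("MA" + "CAH" + "G" * 16 + "CAAH" + "K" * 20
       + "DAAHAAAH" + "S" * 120 + "D" + "LL")
print(has_ro_motifs(toy))
print(round(aliphatic_index(toy), 2))
```

A pattern match of this kind only confirms that the residues occur with plausible spacing; it says nothing about folding or metal binding, which is why the comparison with phylogenetically close mesophilic ROs remains necessary.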

Identification of putative ET components
Genomes containing RO ox α-subunits were searched (using blastp) for genes putatively encoding ET components (both ferredoxin and reductase) using a set of queries (Table 1), followed by manual inspection of each genomic locus when necessary. The queries included the oxidoreductase sequences (e.g., ferredoxin-NAD reductases and glutathione reductase-type reductases and ferredoxins) commonly associated with ROs, as well as other possible oxidoreductases (e.g., flavin reductase and rubredoxin reductase).

Distribution of RO homologs among thermophiles
Browsing the bacterial taxonomy database revealed the existence of several thermophilic genera belonging to different classes/orders. These taxa were concentrated mainly among the phyla Thermotogae, Deinococcus-Thermus, Chloroflexi, Aquificae, Firmicutes, and to some extent, Bacteroidetes/Chlorobi, Actinobacteria and Proteobacteria. Blast searches against all thermophile genomes initially led to the identification of 95 putative RO ox α-subunit homologs distributed among 20 different genera (data not shown). After removal of redundant sequences, 45 remained (Table 2); of these, the one obtained from Alicyclobacillus acidoterrestris ATCC 49025 (GenBank: EPZ42375) was found to be truncated at the N-terminal end and was therefore excluded from further analysis. Analysis of the distribution of the remaining candidate ROs among both thermophiles and mesophiles revealed that they were present mainly among thermophilic strains belonging to the phyla Chloroflexi, Deinococcus-Thermus, Firmicutes and Thermotogae (Fig. 1). However, distant homologs were abundant among mesophilic strains belonging to the phyla Actinobacteria, Firmicutes and Proteobacteria.

Functional clustering of thermophilic RO ox α-subunit homologs
The candidate α-subunit sequences from thermophiles were subjected to phylogenetic studies to assess their relatedness with functionally characterized ROs from other bacteria, as well as from eukaryotes. The phylogenetic tree (Fig. 2) showed that thermophilic homologs were unevenly distributed among all classes of ROs, being clustered in a few specific regions of the tree, again suggesting a radical diversification followed by independent evolution of these genes in thermophiles.
As can be seen in Fig. 2 […] functioning of an RO (Jiang et al. 1996; Parales 2003), is consistent with phylogenetically related, previously characterized ROs (Fig. 3).
Each thermophilic homolog (represented by the corresponding protein name or locus tag followed by the accession number and strain name) was used as a blastp query, and only entries equal to or exceeding the threshold identity of 40% and query coverage of 80% were considered to be positive hits. The distribution is categorized into different taxa (or taxonomic hierarchies), shown on the top, with bacteria belonging to each phylum grouped separately as thermophiles and mesophiles (highlighted with yellow and green backgrounds, respectively). In the heat map, each cell is divided into two blocks; the upper (wider) block shows the percentage identity obtained from blastp, while the lower (narrower) block indicates the number of distinct species obtained from the blast search. Color codes used for each identity range are shown as an inset. Values corresponding to each cell can be obtained from Additional file 1: Table S1.
The aliphatic index is regarded as a positive factor for the increased thermostability of globular proteins (Ikai 1980). Therefore, we calculated this index for each protein as a measure of thermostability. Not all thermophilic proteins showed significantly higher values than their mesophilic homologs (Fig. 3). However, the average value for the thermophilic homologs (80.88) was higher than that for the mesophilic homologs (75.25) (Fig. 3). Table 3 lists the closest biochemically characterized homolog of each candidate RO ox α-subunit, as obtained from RHObase (Chakraborty et al. 2014). Preferred substrate(s) for most candidate ROs could be predicted using the RHObase substrate prediction module (Fig. 4). RO ox α-subunit homologs belonging to Class A showed a preference for polycyclic aromatic hydrocarbons and heterocyclic polyaromatic hydrocarbons. For one of the Class A ROs, obtained from Alicyclobacillus acidocaldarius subsp. acidocaldarius Tc-4-1, the predicted substrate was a ketosteroid. The predicted substrates for the Class B RO ox α-subunit from Thermus thermophilus JL-18 were carboxylated aromatics, such as p-cumate, while members of Class D showed a preference for carboxylated aromatics such as phthalate, chlorobenzoate, vanillate and phenoxybenzoates, as well as for toluene-4-sulfonate. However, owing to the lack of information regarding the function of Class D* ROs, the substrate preference of these ROs could not be predicted. Apart from MupW and GbcA, involved in mupirocin biosynthesis (El-Sayed et al. 2003) and glycine betaine catabolism (Wargo et al. 2008), respectively, all other sequences belonging to this class have been derived from whole genome annotations and lack complete information regarding their biochemical function. This makes Class D* the 'dark matter' of Rieske oxygenases.

Reconstitution of functional RO systems
As discussed earlier, the oxygenase α-subunit is often accompanied by a small β-subunit, and these subunits function in combination with one or two ET component(s). All observations discussed thus far concern the α-subunits of RO ox. However, to reconstitute a functional RO system, all of the above components must work together in a coordinated manner. Whenever present, the genes encoding both α- and β-subunits are usually co-localized. Thus, the genome of each thermophile bearing the candidate RO ox α-subunit homologs was screened for the presence of genes putatively encoding the ET component(s). Several putative genes (listed in Table 4) were identified and, in most cases, were located at a distance from the terminal oxygenase genes. As expected, putative ferredoxin and reductase components, along with the β-subunit of RO ox, could be identified in most organisms bearing Class A and B ROs. The genome of Thermus thermophilus ATCC 33923 had no adjacent β-subunit and did not yield any ferredoxin hits with the queries used. Alicyclobacillus hesperidum URH17-3-68 and Brevibacillus thermoruber PM1 were also found to lack ferredoxin. Among Class D ROs, D-IVα and D-VIIα type ROs form three-component systems containing both ferredoxin and reductase (Chakraborty et al. 2012). Similarly, putative ferredoxin and reductase components could be identified in several thermophiles bearing Class D ROs. The only exceptions were Bacillus thermotolerans SGZ-8 and Thermoactinomycetaceae bacterium GD1, […]
Fig. 2 Neighbor-joining tree representing the phylogenetic relationship of putative α-subunits of oxygenase components of ROs obtained from thermophilic bacteria (orange font) with homologous sequences from other bacteria and eukaryotes. Each entry is represented by the corresponding protein name or locus tag, followed by the accession number (within parentheses) and the strain name. Values at each node indicate the level of bootstrap support based on 100 resampled datasets; bootstrap values below 60% are not shown. The bar represents 0.1 substitutions per amino acid. The sequences have been clustered according to similarity class as defined in Chakraborty et al. (2014).
Fig. 3 Comparison of conserved N-terminal Rieske [2Fe-2S] and C-terminal 2-His-1-carboxylate motifs among α-subunits of oxygenase components of putative ROs obtained from thermophilic bacteria (orange font) and those obtained from mesophilic bacteria and eukaryotes. The horizontal bars represent the aliphatic index of each sequence. Blue and orange vertical dotted lines indicate the average aliphatic indices obtained for mesophilic (75.25) and thermophilic (80.88) RO homologs, respectively. Homologs with an aliphatic index ≥80.88 are indicated by an arrow, while clades representing only the thermophilic homologs are denoted by asterisks.

Discussion
Owing to their extensive presence among Proteobacteria and Actinobacteria, all extant ROs are postulated to have originated and evolved within these lineages (Chakraborty et al. 2014). Though the thermophilic RO homologs identified in this study were present among taxonomically close organisms, their distributions were very specific for certain phyla, often with very low abundance (Fig. 1). Thus, it is quite likely that the thermophiles also acquired RO ox genes from Proteobacteria or Actinobacteria and further evolved separately. The role of Firmicutes in the evolution of ROs in thermophiles cannot be ruled out, as several RO homologs were found among mesophilic strains of Firmicutes (Fig. 1), especially within the order Bacillales. Similarly, it would be difficult to claim that only the distantly located ET components identified in this study complement the putative α- and β-subunit genes. It is highly likely that unknown gene(s) vicinal to those encoding the terminal oxygenase(s) are responsible for the proper functioning of the ROs. Similar observations have previously been made in Rhodococcus opacus TKN14, in which rubredoxin and another hypothetical protein were found to be crucial for the oxidation of o-xylene (Maruyama et al. 2005). In another study, the purified large subunit of a novel alkane monooxygenase (belonging to Class B ROs), identified from a cold-tolerant Pusillimonas sp. T7-7, showed NADH-dependent alkane monooxygenase activity (Li et al. 2013). Transformation of aromatic hydrocarbons has also been attained by heterologous expression of the terminal oxygenase components using non-specific ET proteins complemented by the host strain (Mukerjee-Dhar et al. 2005).
Although the present study indicates that the RO homologs present in these organisms are either cryptic in nature or are involved in some other physiological function, we cannot rule out the possibility of reconstituting a thermostable functional RO system with novel properties (in terms of substrate preference or mode of catalysis) by combining the terminal oxygenase genes with all possible combinations of ET components. The integrity of the motif signatures and the predicted enhanced thermostability (Fig. 3) further strengthen this hypothesis. The existence of unexplored microbial diversity, together with the availability of whole genomes, represents a large pool of future industrial catalysts. Thermostable ROs may be attractive candidates for carrying out efficient biotransformation at elevated temperatures. Apart from enhancing our understanding of the distribution of ROs in nature, the present study may aid in designing new bioremediation strategies or industrial biosynthetic processes. Based on the information provided here, functional RO systems can be reconstituted from each organism by cloning both terminal oxygenase and ET genes into suitable vectors and performing biotransformation assays using the predicted substrates (Fig. 4). In addition, gene knockout studies can be performed (provided that appropriate genetic tools are available) to help elucidate the physiological role of RO homologs with unknown functions in thermophilic bacteria.