CYP154C5 Regioselectivity in Steroid Hydroxylation Explored by Substrate Modifications and Protein Engineering

Abstract
CYP154C5 from Nocardia farcinica is a P450 monooxygenase able to hydroxylate a range of steroids with high regio- and stereoselectivity at the 16α-position. Using protein engineering and substrate modifications based on the crystal structure of CYP154C5, an altered regioselectivity of the enzyme in steroid hydroxylation was achieved. Thus, conversion of progesterone by mutant CYP154C5 F92A resulted in formation of the corresponding 21-hydroxylated product 11-deoxycorticosterone in addition to 16α-hydroxylation. According to MD simulations, this altered regioselectivity appears to result from an alternative binding mode of the steroid in the active site of mutant F92A. MD simulations further suggested that the entrance of water into the active site causes the higher uncoupling in this mutant. Moreover, exclusive 15α-hydroxylation was observed for wild-type CYP154C5 in the conversion of 5α-androstan-3-one, which lacks an oxy-functional group at C17. Overall, our data give valuable insight into the structure-function relationship of this cytochrome P450 monooxygenase for steroid hydroxylation.


Introduction
Cytochrome P450 monooxygenases (P450s or CYPs) are hemoproteins carrying a heme b molecule covalently linked to a cysteine side chain. [1] From a biocatalytic perspective, they are remarkable enzymes as they are able to catalyze the selective hydroxylation of non-activated carbon atoms using molecular oxygen. [2,3] For the activation of molecular oxygen during the catalytic cycle, they require electrons from NAD(P)H, which are in most cases delivered to the monooxygenase by additional redox partners. [4] One of the most important industrial uses of cytochrome P450 monooxygenases is their application in steroid synthesis in the pharmaceutical industry, owing to their remarkable selectivity in steroid hydroxylation. Well-known examples include the 11β-hydroxylation of 11-deoxycortisol (Reichstein S) to hydrocortisone by Curvularia sp. or the conversion of progesterone to cortisone by Rhizopus sp. [5-8] Though P450s are already applied on an industrial scale, there is always a need for yield improvement, including an increase in hydroxylation specificity (e.g., fewer by-products), for an altered selectivity of the enzyme to hydroxylate, for example, new sites in a known substrate, or for the adaptation of a known P450 to a new substrate. [9] In that respect, protein engineering has proven to be a powerful tool to alter enzyme characteristics such as activity and selectivity. One example, reported by Kille et al., is the generation of P450BM3 mutants hydroxylating testosterone selectively at either the 2β- or the 15β-position, while the starting mutant, P450BM3 F87A, forms a 1 : 1 mixture of 2β- and 15β-hydroxytestosterone. [10] In this case, mutants were generated by iterative combinatorial active-site saturation mutagenesis, a protein engineering approach that is significantly facilitated by the availability of structural information for a given protein.
Recently, the crystal structure of CYP154C5, a cytochrome P450 monooxygenase from Nocardia farcinica, was elucidated in the presence of four steroid substrates. [11] This enzyme catalyzes the highly regio- and stereoselective hydroxylation of different pregnanes and androstanes, producing exclusively the corresponding 16α-hydroxylated products. [12] As the natural redox partners of this P450 monooxygenase are unknown, putidaredoxin (Pdx) and putidaredoxin reductase (PdR) from Pseudomonas putida can be applied in bioconversions to supply CYP154C5 with electrons. The active site pocket of CYP154C5 forms a hydrophobic "tube" with two polar regions at opposite ends. These polar regions are occupied by Gln239 and Gln398, which form hydrogen bond interactions with the hydroxy or ketone functionalities of steroids at positions C3 (via water molecules) and C17. [11] Additionally, several hydrophobic interactions between enzyme and steroid substrate within the active site could be identified. With the help of the crystal structure, the remarkably high regio- and stereoselectivity of CYP154C5 towards steroids 1-6 (Figure 1) was explained. [11] With the future goal of modifying the enzyme's regioselectivity in steroid hydroxylation reactions, we herein aimed to explore the selectivity of CYP154C5 in more detail based on the previously obtained structural insight. To this end, selected active-site residues of CYP154C5 that mediate important enzyme-substrate interactions were mutated by site-directed mutagenesis, and the resulting mutants were applied in bioconversions with steroids 1-6. In a complementary approach, steroid substrates lacking oxyfunctional groups at C3 and C17, which otherwise enable hydrogen bonding with active site residues of CYP154C5, were tested in bioconversions with the wild-type enzyme.

CYP154C5 mutagenesis
Based on the CYP154C5 crystal structure and a detailed analysis of the enzyme active site in the presence of different steroid substrates, [11] four active-site residues have been identified to play an important role in steroid binding. Among them are the two glutamines at positions 239 and 398 forming hydrogen bonds with the carbonyl or hydroxy groups at C3 (via a water molecule) and C17 of the steroid substrates (Figure 2). Moreover, residues M84 and F92 are involved in hydrophobic interactions with the steroid backbone. Specifically, M84 interacts with C11 and C12 of ring C as well as methyl substituents (C18 and C19) on the steroid backbone, while residue F92 forms hydrophobic contacts with C5, C6, C7, and C8 of ring B. [11] Thus, residues M84, F92, Q239 and Q398 were selected for mutagenesis to investigate their impact on steroid binding and CYP154C5's selectivity in steroid conversions. To this end, single alanine mutations were prepared using site-directed mutagenesis. Afterwards, wild-type CYP154C5 as well as mutants M84A, F92A, Q239A and Q398A were heterologously produced in Escherichia coli C43(DE3) and purified by anion exchange and affinity chromatography (Figure S1 in the Supporting Information). Similarly, redox partners Pdx and PdR from P. putida, which are required for P450 activity, were produced in E. coli C43(DE3) and subsequently purified by anion exchange and hydrophobic interaction chromatography (Figure S1).
Figure 1. Chemical structures of previously tested (1-6) [11,12] and five new steroid substrates (7-11) used in bioconversions with CYP154C5 (wild-type and mutants), Pdx and PdR in this study.

ChemBioChem
Full Papers doi.org/10.1002/cbic.202000735

Steroid conversion by CYP154C5 mutants
In order to analyze the influence of the different mutations on steroid binding and catalysis by CYP154C5, dissociation constants (KD), turnover numbers (TONs) and coupling efficiencies of the mutants towards steroids 1-6 were determined. In the absence of substrate, P450 enzymes exhibit an absorbance maximum around 420 nm. Upon substrate binding, a water molecule is removed as the sixth ligand at the heme iron. [13] This causes a spin shift of the heme iron which, in type I spectral changes, involves a shift of the P450 absorbance maximum to approximately 390 nm. From the peak-to-trough difference in absorbance (ΔA) between P450 with high-spin iron (A390) and low-spin iron (A420), dissociation constants can be inferred. Using the four different CYP154C5 mutants and steroids 1-6, KD values for almost all combinations could be determined (Table 1; respective plots are displayed in Figures S2-S6). Some mutants, however, yielded only a partial or no spectral shift upon substrate addition (Figure S7), even at high steroid concentration, hampering KD determination. A missing spectral shift upon substrate addition could indicate that the respective steroid no longer binds in the mutated active site of CYP154C5. It is also possible, however, that due to the mutation in the active site, the bound substrate is no longer positioned close enough to the heme iron to displace the water molecule as the sixth ligand.
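The inference of KD from the spectral titration can be illustrated with a short sketch. It assumes the commonly used quadratic (Morrison-type) tight-binding model, ΔA = ΔAmax · [(E + S + KD) − sqrt((E + S + KD)² − 4·E·S)] / (2·E), where E and S are total enzyme and substrate concentrations. The function names, the search range and the simple log-spaced search are illustrative assumptions and do not reproduce the MATLAB fitting procedure used in this study.

```python
import math

def tight_binding(s, e, kd, da_max):
    """Predicted absorbance difference for total substrate s and total enzyme e.

    All concentrations share one unit (e.g. µM); da_max is the ΔA at saturation.
    """
    b = e + s + kd
    return da_max * (b - math.sqrt(b * b - 4.0 * e * s)) / (2.0 * e)

def fit_kd(s_values, da_values, e, kd_lo=1e-3, kd_hi=1e3, steps=400, rounds=4):
    """Least-squares estimate of (KD, ΔAmax).

    KD is scanned on a log-spaced grid (hypothetical default range); for each
    trial KD the best ΔAmax follows in closed form because the model is linear
    in ΔAmax. The grid is then narrowed around the best KD and rescanned.
    """
    lo, hi = kd_lo, kd_hi
    best = None  # (sum of squared errors, kd, da_max)
    for _ in range(rounds):
        ratio = (hi / lo) ** (1.0 / (steps - 1))
        for i in range(steps):
            kd = lo * ratio ** i
            shape = [tight_binding(s, e, kd, 1.0) for s in s_values]
            denom = sum(f * f for f in shape)
            if denom == 0.0:
                continue
            da_max = sum(y * f for y, f in zip(da_values, shape)) / denom
            sse = sum((y - da_max * f) ** 2 for y, f in zip(da_values, shape))
            if best is None or sse < best[0]:
                best = (sse, kd, da_max)
        # Narrow the search window to the neighbours of the current best KD.
        lo, hi = best[1] / ratio, best[1] * ratio
    return best[1], best[2]
```

With 3 μM enzyme, as used in the binding studies, `fit_kd(substrate_uM, delta_a, 3.0)` would return the estimated KD in μM together with ΔAmax. A partial spectral shift shows up as a low fitted ΔAmax rather than as a change in the model itself.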
Notably, the KD values obtained for the alanine mutants were in many cases higher than for wild-type CYP154C5, indicating a substantial influence of the mutated active-site residues on steroid binding. The main exception is CYP154C5 Q239A, for which the obtained KD values were even lower than for wild-type CYP154C5. This suggests a positive influence of mutation Q239A on steroid binding. The exact molecular reasons, however, are difficult to anticipate, as the exchange of glutamine by alanine at this position will actually result in the loss of a hydrogen bond, via a water molecule, to the oxygen atom at C3 of the steroids. Here, a crystal structure of CYP154C5 Q239A with one of the steroids 1-6 bound in the active site could be solved in the future to gain deeper insight. Furthermore, binding of pregnenolone (1) and progesterone (3) to CYP154C5 was hardly affected by amino acid exchange Q398A, whereas significantly higher KD values were obtained for all other steroid substrates. Removal of the glutamine side chain in CYP154C5 Q398A leads to the loss of a hydrogen bond to the C17-substituent oxygen of the steroid substrate and generates space that could allow movement of steroids 2, 4, 5 and 6 in the active site. In contrast, steroids 1 and 3 carry a more spacious acetyl side chain at C17, which could fill this space, resulting in tighter binding. Interestingly, mutation M84A in CYP154C5 affected substrate binding the most among all tested variants. More than 100-fold lower binding affinities compared to wild-type CYP154C5 were obtained with androstenedione (4) and testosterone (5). Additionally, no spectral shifts of CYP154C5 M84A were observed with steroids 1 and 6, while substrates 2 and 3 induced only partial shifts. This clearly emphasizes the importance of position M84 for steroid binding and correct positioning within the active site of CYP154C5.
Turnover numbers and coupling efficiencies of the purified mutants were determined in the conversion of steroids 1-6, together with Pdx and PdR as redox partners (Tables 2 and 3). Mutants CYP154C5 M84A, CYP154C5 F92A and CYP154C5 Q398A exhibited significantly decreased TONs in comparison to the wild-type enzyme, independent of the substrate used, while for mutation Q239A the TONs were less affected (Table 2). This is in general agreement with the higher KD values obtained for mutants CYP154C5 M84A, CYP154C5 F92A and CYP154C5 Q398A. Interestingly, CYP154C5 M84A converted pregnenolone (1) to a small extent even though no shift from low to high spin could be detected. Such behavior was previously observed in the case of cytochrome P450 from Bacillus megaterium, CYP106A2, where no change in absorbance was obtained upon addition of deoxycortisone (DOC) although this substrate is converted by this enzyme, producing 15β-hydroxydeoxycortisone. [15] Simgen et al., however, proved by FTIR spectroscopy using the stretch vibration of the heme iron CO-ligand that DOC indeed enters the active site and binds close to the heme prosthetic group. [15] Similarly, the coupling efficiencies of the CYP154C5 mutants in the conversion of steroids 1-6 were, in many cases, also negatively affected (Table 3). This is especially evident for substrates pregnenolone (1) and progesterone (3) as well as for coupling efficiencies of mutant CYP154C5 M84A with all tested steroids. In contrast, coupling efficiencies of CYP154C5 F92A with nandrolone (6), CYP154C5 Q239A with 3, 5 and 6 as well as coupling efficiencies of CYP154C5 Q398A with testosterone (5) are still similar to the wild-type values.
Table 1. Dissociation constants (KD) of CYP154C5 wild type and mutants for steroid substrates 1-6. KD values were determined at 30 °C for the His-tagged CYP154C5 mutants using the quadratic tight-binding equation. [14] All values are the result of duplicate measurements given as mean ± SD.
In almost all cases, the CYP154C5 mutants still formed the corresponding 16α-hydroxylated products in steroid conversions of 1-6. Hence, regioselectivity of the single mutants was generally not affected, except for the conversion of progesterone (3) by mutant CYP154C5 F92A. Here, formation of a second hydroxylation product was observed (Figure S12), which was identified as 11-deoxycorticosterone (hydroxylation at position 21) by NMR analysis (Scheme 1). Both products, 16α- and 21-hydroxylated progesterone, were produced in a ratio of 4 : 1 by CYP154C5 F92A. Hydroxylation of progesterone as well as 17α-hydroxyprogesterone at C21, yielding 11-deoxycorticosterone and 11-deoxycortisol, respectively, are important steps in adrenal steroidogenesis required for the synthesis of glucocorticoids and mineralocorticoids. In humans, CYP21A2 is the enzyme responsible for catalyzing this step. Interestingly, for CYP21A2 the formation of 16α-hydroxyprogesterone as side product has been reported as well. [16] Moreover, exchange of Val359 by alanine yielded a mutant with significantly increased hydroxylation in 16α position, resulting in a ratio of 21-hydroxyprogesterone to 16α-hydroxyprogesterone of 60 : 40, while mutant CYP21A2 V359G gave even 90 % 16α-hydroxyprogesterone. [17] To reveal the structural basis of the change in regioselectivity of mutant CYP154C5 F92A in progesterone hydroxylation, this enzyme-substrate complex was studied by computational tools.
Table 2. [Table body not recovered; surviving row fragment: (6) 1.33 ± 0.10.] [a] Data taken from ref. [11] for comparison. [b] No conversion observed.
Table 3. Coupling efficiencies of CYP154C5 wild type and mutants in the conversion of steroids 1-6. Measurements were performed in duplicate. Values are given as mean ± SD.

Modeling of CYP154C5 F92A
A structural model of the compound I intermediate of the F92A mutant was generated and used in docking and molecular dynamics simulations. As a reference, a crystal structure of wildtype CYP154C5 with progesterone (3) bound was converted to the compound I state ( Figure 3A) and subjected to docking as well. [11] The docking simulations with the F92A mutant suggested two possible binding orientations for substrate 3.
One orientation is similar to the one found in the crystal structure (Figure 3C), whereas the second is clearly different: progesterone is reoriented with its A and B rings now occupying the space created by the F92A mutation, and only the hydrogens of carbon 21 are close to the reactive oxygen of compound I (Figure 3B). This alternative orientation of the substrate would cause extreme steric hindrance if F92 remained in the position that is observed in all X-ray structures; the aromatic ring of F92 would have to interlock with the A or B ring of the steroid. The first step in P450-catalyzed hydroxylation is the abstraction of a hydrogen atom from the substrate by the electrophilic oxygen of the compound I intermediate. [18] The predicted reactivity of different enzyme-substrate complexes was therefore examined by molecular dynamics (MD) simulations with scoring of near-attack conformations (NACs) in which substrate hydrogens approach the oxygen of the compound I intermediate (Figure S8). For each complex, three independent 22 ns MD simulations were performed. The enzyme-substrate complexes were stable (Figure 4) and the simulations gave reproducible results. In simulations of progesterone (3) in complex with wild-type CYP154C5, both the 16α and 21 hydrogen atoms stayed close to the reactive oxygen atom (Figure 4A). The overall X-ray structure was maintained and high percentages of NACs were found for both positions (Table 4). The observation that only 16α hydroxylation took place with the wild-type enzyme is in agreement with the higher reactivity of this secondary carbon atom compared to the primary carbon atom at position 21, [19] even though the latter might be influenced by the flanking carbonyl group.
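The idea of NAC scoring can be sketched as counting the fraction of trajectory frames in which a substrate hydrogen is geometrically poised for abstraction by the compound I oxygen. The criteria actually applied here are defined in Figure S8 and are not reproduced in the text; the distance and angle cut-offs below are hypothetical placeholders, as are the function names.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.sqrt(sum(p * p for p in v1)) * math.sqrt(sum(q * q for q in v2))
    cos_value = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_value))

def is_nac(carbon, hydrogen, oxygen, max_h_o=2.7, min_c_h_o_angle=130.0):
    """True if one frame meets the (assumed) near-attack criteria.

    max_h_o: H...O distance cut-off in angstrom (hypothetical value).
    min_c_h_o_angle: minimum C-H...O angle in degrees (hypothetical value).
    """
    close_enough = math.dist(hydrogen, oxygen) <= max_h_o
    well_aligned = angle_deg(carbon, hydrogen, oxygen) >= min_c_h_o_angle
    return close_enough and well_aligned

def nac_percentage(frames, **criteria):
    """Percentage of frames counted as NACs.

    frames: iterable of (carbon_xyz, hydrogen_xyz, oxygen_xyz) tuples,
    one per MD snapshot, with coordinates in angstrom.
    """
    frames = list(frames)
    hits = sum(1 for c, h, o in frames if is_nac(c, h, o, **criteria))
    return 100.0 * hits / len(frames)
```

Evaluating `nac_percentage` separately for the 16α and 21 hydrogens of each trajectory would give per-position values of the kind reported in Table 4, provided the same criteria as in Figure S8 are supplied.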
With the substrate oriented in the F92A mutant as it is in the wild-type enzyme, the distances and NAC percentages for the 16α and 21 hydrogens calculated from the simulations were similar to those found with wild-type CYP154C5 (Figure 4C). In contrast, MD simulations of the F92A mutant with progesterone (3) bound in the alternative orientation (Figure 4B) showed that in this case only the 21 position can undergo oxidation. The distance between the 16α hydrogen and the reactive oxygen is predicted to exceed 6 Å, making hydrogen abstraction impossible, and only the 21 hydrogens gave significant levels of NACs (Table 4). Thus, the modeling suggests that the F92A mutation provides an additional progesterone (3) binding mode that is particularly suitable for oxidation at the 21 position. Pallan et al. also suggested, based on an observed partial burst in pre-steady-state kinetics, that two alternative binding modes of 3 in CYP21A2's active site are responsible for the formation of 21- and 16α-hydroxylated products by this enzyme. [20] Furthermore, two alternative orientations for progesterone (3) binding have also been observed for CYP260A1 by molecular docking, explaining the low selectivity of this CYP in progesterone conversion. Moreover, through targeted mutagenesis of active-site residue S276, one or the other binding orientation could be suppressed, yielding two highly regio- and stereoselective CYP260A1 mutants, which formed either 1α- or 17α-hydroxyprogesterone selectively. [21] Furthermore, CYP154C5 F92A with progesterone (3) also gave the highest uncoupling (Table 3). Uncoupling is commonly observed with P450 mutants and is a highly relevant challenge as it hinders their use in applied catalysis. [22] We found that substrate 3 is more mobile in the active site of mutant F92A. The root-mean-square fluctuations (RMSF) of this substrate are higher in the mutant than in wild-type CYP154C5 (Figure S9).
This suggests that the shape complementarity between enzyme and substrate is not optimal in the mutant. Also, while in the case of the wild type the closest water stayed at a distance of 7 Å from the reactive oxygen during the entire MD simulations (in agreement with a good enzyme-substrate shape complementarity), with mutant F92A a water molecule did approach the reactive oxygen for both substrate orientations (Figure 5). Some of the observed distances are less than 2 Å. Moreover, the intruding water forms an H-bond with the reactive oxygen. While the encroaching water does not drive out the substrate (Figures 4, S10 and S11), the H-bonding should change the reactivity of compound I. Uncoupling at the stage of compound I (the oxidase shunt) involves the addition of two protons and two electrons, after which the reactive oxygen dissipates to water. Mechanistically, it seems unlikely that the water molecule would protonate compound I (at least not before the latter has been further reduced by one or two electrons), as compound I already has a strongly positive charge. It seems more feasible that the H-bonding would decrease the reactivity of compound I or increase its redox potential, which in either case would increase the chance that compound I becomes reduced before the substrate has had time to react. Theoretical modeling found that H-bonding to the sulfur that coordinates the iron has a strong effect on reactivity. [23,24] H-bonding by water to the reactive oxygen atom of compound I is expected to have even stronger effects, as it is extremely close to the site of the reaction.

Bioconversions of new steroid substrates by wild-type CYP154C5
Instead of introducing mutations, an alternative approach to study the enzyme's selectivity is the selection of steroid substrates that are either lacking key functional groups for the enzyme-substrate interaction or that are carrying new features. Hence, based on the known CYP154C5-steroid complex structures, five new steroid substrates were selected. As previously reported, oxyfunctional groups at C3 and C17 were shown to form hydrogen bond interactions with residues Q239 (via a water molecule) and Q398, respectively. [11] Therefore, steroid substrates lacking one (10 and 11) or both (9) oxyfunctional groups, as well as steroids containing a larger side chain at position C17 (7 and 8), were chosen to be tested in bioconversions with wild-type CYP154C5 (Figure 1).
Initially, small-scale reactions employing whole cells or cell-free extract of E. coli containing wild-type CYP154C5, Pdx and PdR were carried out in order to investigate whether compounds 7-11 are converted. Reactions employing whole cells or cell-free extract of E. coli containing only Pdx and PdR were used as negative controls under the same reaction conditions. No conversion of finasteride (7) and etiadienic acid ethyl ester (8) by CYP154C5 could be observed with either whole cells or cell-free extract. This is probably the result of the larger side chain at C17 preventing binding of the steroids in the enzyme's active site. This is further supported by the fact that 7 and 8 did not induce any spectral shift during KD measurements (data not shown). Similarly, CYP154C3 from Streptomyces griseus, a homologue of CYP154C5 that also hydroxylates steroids selectively at the 16α position, is unable to convert steroids with bulky substituents at the D ring. [25] In the case of ethioallocholane (9) conversion by CYP154C5, a possible product peak was identified by GC-MS (Figure S15), though conversion was too low for product isolation. A NIST-library search suggested a steroid-related structure for this product. Additionally, steroid 9 induced a partial spectral shift of CYP154C5 during KD measurements with a high-spin species content of roughly 50 % (Figure S7-G). These results suggest that ethioallocholane (9) is indeed converted by CYP154C5, but further tests will be necessary to identify the formed product. In contrast, conversion of 3-deoxydehydroepiandrosterone (10) led to the formation of several products (Figure S16), indicating that the regio- and/or stereoselectivity of CYP154C5 was altered. Moreover, conversion of 5α-androstan-3-one (11) by CYP154C5 resulted in one hydroxylated product (Figure S20).
In order to elucidate the structures of the formed products, whole-cell bioconversions were performed on preparative scale for substrates 3-deoxydehydroepiandrosterone (10) and 5α-androstan-3-one (11). Similar to the results obtained on analytical scale, several products were formed in the preparative-scale conversion of 10 by wild-type CYP154C5. Of these products, two were obtained in sufficient amount and purity for subsequent NMR analysis (see the Supporting Information for NMR spectra). Results revealed that 16α-hydroxy-3-deoxydehydroepiandrosterone was formed as the main product (Scheme 2). Interestingly, the second purified product seems to be the result of a double hydroxylation, as indicated by GC-MS and NMR data (Figures S18 and S19). The exact hydroxylation positions, however, could not be determined due to low product quantities. Preliminary docking studies with 3-deoxydehydroepiandrosterone (10) and the compound I model of wild-type CYP154C5 suggest position 4β as a potential hydroxylation site in addition to the observed 16α-hydroxylation (Figure S12A). Similarly, product 16α-hydroxy-3-deoxydehydroepiandrosterone was also docked in the active site with position 4β as potential hydroxylation site (Figure S12B), which would ultimately result in a double hydroxylation of 10. Furthermore, position 2α was identified as a potential additional hydroxylation site when using 4β-hydroxylated 10 as docking substrate (Figure S12C). Even though these docking poses were only obtained in silico, they can give a first indication of potential hydroxylation sites in the other observed products.
In contrast, the product formed in the preparative-scale conversion of 5α-androstan-3-one (11) by CYP154C5 was identified as 15α-hydroxy-5α-androstan-3-one by NMR (Figure S21 and Scheme 2; details on the assignment of the hydroxylation product are given in the Supporting Information). This result indicates a change in CYP154C5's regioselectivity in the conversion of 11, likely caused by the lack of the functional group at position C17 and/or the saturated A-ring of the steroid substrate. To gain structural insight into the altered regioselectivity of wild-type CYP154C5 in the conversion of 11, the P450 was co-crystallized with this steroid. The overall structure of CYP154C5 is very similar to the previously determined CYP154C5 structures with Cα-RMSD values of 0.20-0.37. The electron density of the ligand and the derived model indicate that 5α-androstan-3-one (11) binds in a similar position as androstenedione (4) and testosterone (5; Figure 6), the latter being hydroxylated in 16α position by wild-type CYP154C5. Interestingly, however, C15 of steroids 4, 5 and 11 in the corresponding crystal structures of CYP154C5 is in a similar position as C16 of pregnenolone (1) and progesterone (3; Figure 6A), which are also hydroxylated in 16α position by wild-type CYP154C5. Hence, the crystal structure alone cannot explain the observed hydroxylation of 5α-androstan-3-one (11) in 15α-position. Therefore, MD simulations (10 trajectories of 22 ns each) of the compound I state of the CYP154C5 structure with 11 bound in the active site have been performed with scoring of NACs as mentioned before (Figure S13). This way, predicted NAC percentages of 0.31 % for C15 and 15.8 % for C16 were obtained. This confirms that C15 can be reached in productive conformations by the oxygen of compound I, even though the probability for attack of C16 seems to be higher as judged by distance. The question of why only 15α-hydroxy-5α-androstan-3-one is observed as product in conversions of 11 by CYP154C5 cannot be conclusively answered, but this might be caused by a higher reactivity of C15 compared to C16. Here, quantum mechanical calculations could be performed in the future to provide further insight.
Scheme 2. Conversion of 3-deoxydehydroepiandrosterone (10) and 5α-androstan-3-one (11) by wild-type CYP154C5 together with redox partners Pdx and PdR and the obtained hydroxylation products, as revealed by NMR spectroscopy.
Additionally, KD values, TONs and coupling efficiencies for wild-type CYP154C5 in the conversion of 3-deoxydehydroepiandrosterone (10) and 5α-androstan-3-one (11) were determined. CYP154C5 exhibited a rather high affinity towards substrates 10 and 11 with KD values of 94 ± 52 and 20 ± 16 nM, respectively. In contrast, coupling efficiencies are dramatically decreased in both cases, resulting in only 7 ± 6 and 26 ± 10 % for steroids 10 and 11, respectively. Similarly, obtained TONs (0.67 ± 0.17 and 0.77 ± 0.17 min⁻¹ for 10 and 11, respectively) are also rather low compared to steroids 1-6. This suggests that steroid conversion by CYP154C5 is indeed significantly affected if one of the oxyfunctional groups at C3 or C17 of the steroid backbone, and hence the corresponding hydrogen bond, is missing. On the other hand, TONs of wild-type CYP154C5 for conversion of 10 and 11 were determined for the whole-cell system and not for the isolated enzyme. Hence, the resulting data are not directly comparable to TONs determined in the conversion of steroids 1-6.

Conclusion
With our study, we were able to demonstrate experimentally that the previously reported high regioselectivity of CYP154C5 is dependent on the presence of oxyfunctional groups at C3 and C17 of the steroid substrate, as products with new hydroxylation sites were obtained in reactions with steroids lacking one of those substituents. Here, hydroxylation of 5α-androstan-3-one in 15α position is especially interesting, as 15α hydroxylation of steroids by bacterial cytochrome P450 monooxygenases has only been reported once so far, in the conversion of testosterone and dehydroepiandrosterone by specific mutants of CYP102A1 from B. megaterium. [26] In contrast, the replacement by alanine of active site residues Q239 and Q398 in CYP154C5, which have been shown to interact with the oxyfunctional groups through hydrogen bonding, did not alter the enzyme's regioselectivity but had a negative impact on catalytic efficiency. Similarly, mutagenesis of residues M84 and F92, which form hydrophobic interactions with the steroid backbone, resulted in reduced turnover numbers and coupling efficiency as well as higher dissociation constants for most steroid substrates tested. This was especially evident for mutant M84A, which confirms the importance of residue M84 for delimiting the active site pocket and confining the steroid in a catalytically active position. Moreover, mutation F92A appeared to enable a second binding orientation of progesterone in the enzyme active site, resulting in C21 hydroxylation. Overall, our data demonstrate the feasibility of future modification of CYP154C5 regioselectivity by protein engineering and give valuable insight into the structure-function relationship of this cytochrome P450 monooxygenase for steroid hydroxylation.
Bacterial strains and plasmids: E. coli DH5α (Invitrogen) was used for genetic manipulations, while E. coli C43(DE3) (Lucigen, Middleton, WI, USA) was used for recombinant gene expressions. Plasmid pET28a(+) was purchased from Novagen (EMD Biosciences, San Diego, CA, USA). Preparation of plasmid pACYCcamAB for coexpression of putidaredoxin reductase (CamA or PdR) and putidaredoxin (CamB or Pdx) from P. putida was described elsewhere. [27] Preparation of plasmid pIT2cyp154C5 was described elsewhere. [12] The gene of CYP154C5 was subcloned from vector pIT2cyp154C5 into pET28a(+) using restriction sites NdeI and HindIII. The resulting plasmid was named pET28cyp154C5. Expression of CYP154C5 from vector pET28cyp154C5 results in a fusion protein with N-terminal His-tag.
Generation of CYP154C5 mutants: Mutants were prepared by QuikChange® site-directed mutagenesis using the Pfu-Turbo Hotstart PCR Master Mix (Agilent) according to the manufacturer's instructions. The primers applied in the PCR reactions are listed in Table S1 in the Supporting Information. Correct introduction of the mutations was confirmed by sequencing at GATC Biotech (Konstanz, Germany) and final plasmids were transformed into E. coli C43(DE3) for protein expression.
Expression and purification of enzymes: Production of CYP154C5 wild type and its mutants using E. coli C43(DE3), as well as expression of camA (PdR gene) and camB (Pdx gene) using E. coli C43(DE3) (pACYCcamA) and E. coli C43(DE3) (pACYCcamB), respectively, were performed as described elsewhere. [11] Protocols for the production of E. coli C43(DE3) (pIT2cyp154C5) (pACYCcamAB) and E. coli C43(DE3) (pETcyp154C5-F92A) (pACYCcamAB) whole-cell biocatalysts have also been described previously. [12] Purification of wild-type CYP154C5 and its mutants via the N-terminal His-tag, as well as purification of PdR and Pdx by anion exchange and hydrophobic interaction chromatography, was also performed as previously described. [11]
Enzyme assays: The P450 concentration was measured using CO-difference spectra. [28] The activity of the purified electron transfer components (ETC) Pdx and PdR was determined by cytochrome c reduction assay, monitoring the increase in absorbance at 550 nm of reduced cytochrome c (ε₅₅₀ = 19.1 mM⁻¹ cm⁻¹) in a mixture containing a PdR/Pdx ratio of 3 : 16. [29] Additionally, all enzyme assays were also performed with cell lysate of the respective E. coli C43(DE3) cells containing P450, PdR and Pdx before whole-cell catalysis, as described elsewhere. [11] Total protein concentration was determined by Bradford assay. [30]
Substrate binding studies: Dissociation constants (KD) of CYP154C5 wild type and CYP154C5 mutants for the different steroids were determined by spectroscopic measurements upon titration of purified P450 with increasing steroid concentrations. [13] For this, purified CYP154C5 carrying an N-terminal His tag was diluted with 50 mM potassium phosphate buffer, pH 7.4, to a final enzyme concentration of 3 μM. Substrate was added to this mixture at concentrations from 0 to 150 μM, using three substrate stock solutions of 0.1, 0.5 and 1 mM prepared in 0.1-4.5 % (w/v) hydroxypropyl-β-cyclodextrin in deionized water (diH₂O). The absorbance spectra of each sample were measured on a Cary 50 spectrophotometer (Agilent) between 300 and 500 nm at 30 °C. As a blank, 3 μM P450 in 50 mM potassium phosphate buffer, pH 7.4, with addition of an equivalent amount of buffer instead of substrate solution was used. Each sample was prepared and measured in duplicate. KD values for the different steroids were obtained by plotting the resulting absorbance difference (Abs at 386 nm − Abs at 420 nm) against the applied substrate concentration and fitting the data with the tight-binding equation using MATLAB. [31,14]
Turnover number determination: Reactions for the determination of turnover numbers were carried out at 5 mL scale. Each reaction contained 3 μM P450, 3 μM PdR, 16 μM Pdx, 0.5 U mL⁻¹ formate dehydrogenase from C. boidinii, 150 mM sodium formate, 300 U mL⁻¹ catalase from bovine liver, 50 μM NADH and 2 mM of the respective steroid substrate (1-6) in 50 mM potassium phosphate buffer, pH 7.4. Steroid stock solutions of 4 or 5 mM were prepared in 1.8-4.5 % (w/v) hydroxypropyl-β-cyclodextrin in diH₂O, depending on the substrate. All bioconversions were carried out at 30 °C and 250 rpm for up to 20 h. During bioconversions, samples were taken at different time points for subsequent HPLC and GC analysis. For that, 0.25 mL of each reaction was extracted as described elsewhere. [12] Turnover numbers were calculated from substrate consumption over the period in which the highest substrate consumption rate was observed (usually within the first 2-3 h of reaction). Each reaction was performed in duplicate.
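For the substrate binding studies described above, the tight-binding fit can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB routine used in this work. It assumes the standard Morrison form of the tight-binding equation and a simple least-squares search. The function names, the search bounds and the synthetic data in the usage note are hypothetical.

```python
import math

def tight_binding(s, kd, da_max, e_total=3.0):
    """Morrison (tight-binding) equation.

    Returns the absorbance difference at total substrate concentration s (uM)
    for a total enzyme concentration e_total (uM; 3 uM P450 in this protocol).
    """
    b = e_total + s + kd
    return da_max * (b - math.sqrt(b * b - 4.0 * e_total * s)) / (2.0 * e_total)

def fit_kd(s_values, da_values, e_total=3.0, kd_lo=1e-3, kd_hi=1e3):
    """Least-squares estimate of (K_D, dA_max).

    dA_max enters the model linearly, so it is solved in closed form for each
    trial K_D; K_D is then located by golden-section search on log10(K_D).
    The bounds kd_lo and kd_hi (uM) are assumed, not taken from the paper.
    """
    def best_da_max(kd):
        f = [tight_binding(s, kd, 1.0, e_total) for s in s_values]
        return sum(fi * yi for fi, yi in zip(f, da_values)) / sum(fi * fi for fi in f)

    def sse(log_kd):
        kd = 10.0 ** log_kd
        a = best_da_max(kd)
        return sum((tight_binding(s, kd, a, e_total) - y) ** 2
                   for s, y in zip(s_values, da_values))

    lo, hi = math.log10(kd_lo), math.log10(kd_hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio
    for _ in range(200):
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if sse(c) < sse(d):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    kd = 10.0 ** ((lo + hi) / 2.0)
    return kd, best_da_max(kd)
```

A typical call passes the titration concentrations (0 to 150 μM) and the measured absorbance differences, for example `kd, da_max = fit_kd(s_values, da_values)`. Because the enzyme concentration (3 μM) is of the same order as a low-micromolar KD, the quadratic form is preferable to a simple hyperbolic fit, which would assume that the free and total substrate concentrations are equal.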
In contrast, turnover numbers for steroids 10 and 11 were determined from whole-cell conversions. During preparative-scale reactions of 10 and 11 using frozen cells of E. coli C43(DE3) (pIT2cyp154C5) (pACYCcamAB), samples were taken over time and turnover numbers were calculated from substrate consumption as described above for the reactions with purified enzymes.
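The turnover number calculation from a sampled time course can be illustrated with a short sketch. It assumes that the period of highest substrate consumption is taken as the interval between two consecutive samples with the steepest decrease, and that the result is expressed per minute. Both choices, the function name and the example values in the usage note are assumptions and not data from this work.

```python
def max_turnover_rate(times_h, substrate_uM, p450_uM):
    """Highest substrate consumption rate per P450 (umol umol^-1 min^-1).

    times_h      : sampling times in hours, ascending
    substrate_uM : remaining substrate concentration at each time point (uM)
    p450_uM      : P450 concentration in the reaction (uM)
    """
    # Consumption rate (uM per hour) between each pair of consecutive samples
    rates = [
        (substrate_uM[i] - substrate_uM[i + 1]) / (times_h[i + 1] - times_h[i])
        for i in range(len(times_h) - 1)
    ]
    # Normalise the steepest interval to the enzyme concentration, per minute
    return max(rates) / p450_uM / 60.0
```

For a hypothetical time course such as `max_turnover_rate([0, 1, 2, 3, 20], [2000, 1640, 1300, 1100, 200], 3.0)`, the first hour shows the steepest decrease (360 μM h⁻¹), which corresponds to 2 substrate molecules converted per P450 per minute.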
Coupling efficiency determination: NADH depletion during bioconversions of steroids by CYP154C5 mutants, Pdx and PdR was monitored in a spectrophotometer at 340 nm (ε₃₄₀ = 6.22 mM⁻¹ cm⁻¹). Reactions of 0.7 mL total volume included 0.4 μM purified P450, 0.4 μM purified PdR, 14 μM purified Pdx, 600 U mL⁻¹ catalase from bovine liver, 200 U mL⁻¹ superoxide dismutase, 200 μM NADH and 1 mM of the respective steroid (4 or 5 mM stock in 1.8-4.5 % (w/v) hydroxypropyl-β-cyclodextrin in diH₂O) in 50 mM potassium phosphate buffer, pH 7.4. After the NADH was completely consumed, 0.5 mL of the reaction mixture was extracted as described for the whole-cell catalysis and further analyzed by HPLC and GC in order to determine the conversion.
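The arithmetic behind this assay can be sketched as follows. The text does not spell out the formula, so the sketch assumes the common definition of coupling efficiency as the amount of hydroxylated product formed relative to the amount of NADH consumed. The 1 cm path length and the function names are likewise assumptions.

```python
EPSILON_NADH = 6.22  # mM^-1 cm^-1 at 340 nm, as given in the protocol

def nadh_consumed_uM(delta_a340, path_cm=1.0):
    """Convert a decrease in A340 into consumed NADH (uM) via Beer-Lambert."""
    return delta_a340 / (EPSILON_NADH * path_cm) * 1000.0

def coupling_efficiency(product_uM, nadh_uM):
    """Percentage of consumed NADH that ended up in hydroxylated product."""
    return 100.0 * product_uM / nadh_uM
```

With these assumptions, an absorbance decrease of 1.244 corresponds to the full 200 μM NADH, and a hypothetical conversion of 8 % of the 1 mM steroid (80 μM product) would give a coupling efficiency of 40 %.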

Analytical-scale bioconversions:
In the case of whole-cell bioconversions, frozen cells of E. coli C43(DE3) (pIT2cyp154C5) (pACYCcamAB) overexpressing Pdx, PdR and CYP154C5 were resuspended in 50 mM potassium phosphate buffer, pH 7.4, to a final OD₆₀₀ of 40. All bioconversions were carried out at 1 mL scale at 30 °C and 250 rpm with addition of glucose (0.54 mg mL⁻¹ final concentration) for cofactor regeneration and 1 mM substrate. Substrate stock solutions of steroids were prepared in 36 % (w/v) hydroxypropyl-β-cyclodextrin in potassium phosphate buffer, pH 7.4. In detail, stocks with final concentrations of 2.5, 3.2, 4.2, 4.1 and 4.2 mM were prepared for the substrates finasteride (7), etiadienic acid ethyl ester (8), ethioallocholane (9), 3-deoxydehydroepiandrostendione (10) and 5α-androstan-3-one (11), respectively. Control reactions were carried out in parallel with E. coli C43(DE3) (pACYCcamAB) containing only Pdx and PdR. After 20 h of reaction, the bioconversions were extracted for subsequent HPLC and GC analysis. For that, 500 μL of sample was extracted twice with ethyl acetate (300 μL) and once with chloroform (250 μL). The organic phases were combined and dried with sodium sulfate, and the solvent was removed under reduced pressure. As an exception, conversions performed with steroid 8 were acidified with 2 M HCl prior to extraction.
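As a small worked example of the dosing arithmetic, the stock volume needed to reach 1 mM substrate in a 1 mL reaction follows from C(stock) × V(stock) = C(final) × V(final). The sketch below is illustrative only. It assumes that the stock is simply diluted into the final reaction volume, which the protocol does not state explicitly, and the function name is hypothetical.

```python
# Stock concentrations (mM) for substrates 7-11, as listed in the protocol
STOCKS_MM = {7: 2.5, 8: 3.2, 9: 4.2, 10: 4.1, 11: 4.2}

def stock_volume_uL(stock_mM, final_mM=1.0, final_volume_uL=1000.0):
    """Volume of stock (uL) giving final_mM substrate in final_volume_uL."""
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * final_volume_uL / stock_mM

# Volume of each stock required for a 1 mL reaction at 1 mM substrate
volumes_uL = {cpd: stock_volume_uL(c) for cpd, c in STOCKS_MM.items()}
```

Under this assumption, the 2.5 mM finasteride stock would account for 400 μL of the 1 mL reaction, which illustrates why the low solubility of the steroids limits the achievable substrate concentration.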

Preparative-scale bioconversions:
For the conversion of steroids 10 and 11, preparative-scale bioconversions were carried out in 50 mM potassium phosphate buffer, pH 7.4, using resting whole cells of E. coli C43(DE3) (pIT2cyp154C5) (pACYCcamAB) at 30 °C and 250 rpm with the addition of glucose (0.54 mg mL⁻¹) for cofactor regeneration. In the case of substrate 10, resting cells were resuspended in 100 mL buffer to OD₆₀₀ ~40, equivalent to 7.8 μM CYP154C5 and an ETC activity of 8.1 U mL⁻¹ (1.1 U mg⁻¹ of total protein), as determined by CO-difference spectra and cytochrome c assay, respectively. Similarly, conversion of substrate 11 was performed using resting cells resuspended in 100 mL buffer to OD₆₀₀ ~60, equivalent to 4.2 μM CYP154C5 and an ETC activity of 13.8 U mL⁻¹ (1.4 U mg⁻¹ of total protein). Initial substrate concentrations of 1 mM (10 and 11) were used; for that, stock solutions of substrates 10 and 11 were prepared in 36 % (w/v) hydroxypropyl-β-cyclodextrin (in diH₂O) and DMSO, respectively. Preparative-scale bioconversions of progesterone (3) were carried out in 50 mM potassium phosphate buffer, pH 7.4, in shake flasks using resting whole cells of E. coli C43(DE3) (pIT2cyp154C5_F92A) (pACYCcamAB) at 30 °C and 250 rpm. In this case, cells were resuspended in 80 mL buffer to OD₆₀₀ ~40, equivalent to 8 μM CYP154C5 and an ETC activity of 4.7 U mL⁻¹ (0.6 U mg⁻¹ of total protein), as determined by CO-difference spectra and cytochrome c assay, respectively. After 24 h of reaction, the complete reaction volume was extracted twice with ethyl acetate (50 and 40 mL) and once with chloroform (30 mL). The organic phases were combined and dried with sodium sulfate, and the solvent was removed under reduced pressure. Hydroxylated steroid products were afterwards purified by silica gel column chromatography with ethyl acetate/n-heptane (8 : 2) as mobile phase.

GC-MS and NMR analyses:
Preliminary product identification was performed with a gas chromatograph-mass spectrometer (GC-MS-QP2010S, Shimadzu) equipped with an OPTIMA 17 ms column (Macherey-Nagel) using a linear temperature gradient starting at 250 °C and heating at 10 °C min⁻¹ to 300 °C. Injector, interface and ion source temperatures were set to 300, 300 and 200 °C, respectively. Structure elucidation of the products formed was performed by ¹H, ¹³C, COSY, HSQC, DEPT and NOESY NMR analysis on Bruker AV400, AV500 or AV600 instruments using deuterated chloroform or DMSO as solvent with TMS as internal standard. Chemical shifts (δ) are given in ppm and coupling constants (J) in Hz.
MD simulations: Molecular dynamics simulations were performed using Yasara with algorithms that were recently described in detail elsewhere. [32] Periodic boundary conditions were applied. Long-range electrostatics, beyond 7.86 Å, were calculated with the particle mesh Ewald method using 4th-degree spline functions. The time step was 2.0 fs, with the nonbonded interactions updated every two time steps. The force field was Yamber3, an Amber99 derivative that was specifically parameterized for structural accuracy. [33] The compound I structure was generated and its atomic point charges assigned as described by the Pleiss group. [34] Point charges on the steroid substrates were generated with AM1-BCC, which gives an accuracy similar to that of RESP at a much lower computational cost. [35] Prior to the MD simulations, an energy minimization (described previously [36]) was used to remove subtle steric clashes. For each of the modeled enzyme-substrate complexes, three independent MD simulations were carried out. These simulations were started with different initial atom velocities assigned via a random seed number. [37] The atom velocities always followed a Boltzmann distribution. In the first 30 ps of the MD simulation, the temperature was gradually increased from 5 to 298 K. After that, the simulation was allowed to equilibrate for 1970 ps. The subsequent production phase was 20 ns, and all reported results were collected from this phase. Snapshots were saved every 50 ps.
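The simulation schedule above translates into step and snapshot counts as sketched below. This is plain bookkeeping arithmetic on the stated settings and not part of the Yasara workflow. The constant and function names are hypothetical.

```python
TIMESTEP_FS = 2              # integration time step (fs)
HEATING_PS = 30              # temperature ramp from 5 to 298 K
EQUILIBRATION_PS = 1970      # equilibration after heating
PRODUCTION_NS = 20           # production phase per simulation
SNAPSHOT_INTERVAL_PS = 50    # snapshot spacing in the production phase
REPLICAS = 3                 # independent simulations per complex

def md_schedule():
    """Derive step and snapshot counts from the settings stated in the text."""
    production_ps = PRODUCTION_NS * 1000
    snapshots = production_ps // SNAPSHOT_INTERVAL_PS
    return {
        "pre_production_ps": HEATING_PS + EQUILIBRATION_PS,
        "production_steps": production_ps * 1000 // TIMESTEP_FS,
        "snapshots_per_run": snapshots,
        "snapshots_per_complex": REPLICAS * snapshots,
    }
```

Heating and equilibration together span 2 ns, so each run covers 22 ns in total. The 20 ns production phase corresponds to 10 million integration steps and 400 saved snapshots per run, or 1200 snapshots per enzyme-substrate complex across the three replicas.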
During the production phase of the MD simulations, geometric information was recorded to quantify the extent to which hydrogen atoms of the substrate were in a suitable orientation to be attacked by the oxygen atom of compound I. These geometries were recorded every 1 ps, on the fly. As suggested by the Bruice group, near attack conformations (NACs) were defined as having interatomic distances of less than the van der Waals contact distance and angles between the reacting atoms within 20° of those in the quantum mechanically modeled transition state (Figure S8). [38,39] A Yasara script that automatically performs the MD simulations, the recording of these NACs, and the analysis of the resulting data is available upon request.
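The NAC criterion can be illustrated with a minimal geometric check. This is not the Yasara script mentioned above. It is a simplified sketch that tests a single O···H distance and a single O···H–C angle per frame. The contact distance of 2.72 Å (sum of the Bondi van der Waals radii of H and O), the reference angle of 180° and all names are assumptions. The actual reference geometry is the modeled transition state shown in Figure S8.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def is_nac(o, h, c, contact=2.72, ts_angle=180.0, tol=20.0):
    """True if the frame meets the distance and angle criteria.

    o, h, c  : coordinates (in angstrom) of the compound I oxygen, the
               substrate hydrogen and the carbon bearing that hydrogen
    contact  : assumed O...H van der Waals contact distance
    ts_angle : assumed O...H-C angle in the transition state
    tol      : allowed deviation from ts_angle (20 degrees in the text)
    """
    close_enough = math.dist(o, h) < contact
    aligned = abs(angle_deg(o, h, c) - ts_angle) <= tol
    return close_enough and aligned

def nac_fraction(frames, **criteria):
    """Fraction of recorded frames (o, h, c) that qualify as NACs."""
    hits = sum(1 for o, h, c in frames if is_nac(o, h, c, **criteria))
    return hits / len(frames)
```

Applied to the geometries recorded every 1 ps, `nac_fraction` would give the share of the production phase in which a given hydrogen atom is poised for abstraction. This allows a per-position comparison of candidate hydroxylation sites.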
Docking: A challenge with the flexible P450 class of enzymes is that substrate binding often requires small backbone changes. Therefore, docking was carried out essentially as previously performed by the Reetz group for modeling steroid binding and conversion by P450 BM3 variants. [10] First, the F92A mutation was introduced into the four experimentally determined CYP154C5 structures that have a steroid substrate bound. [11] Subsequently, 12 ns MD simulations were carried out to sample the possible backbone changes around the active site. Progesterone was docked to snapshots of these MD simulations (one snapshot was used per ns of MD simulation). The docking was performed using Autodock4 [40] with 4995 docking runs for each snapshot and 25000 energy evaluations per docking run. Substrate orientations with an unrealistic binding orientation were avoided by eliminating all poses whose predicted binding energy fell outside the 90 % confidence interval of the substrate orientation with the best binding energy, as described earlier. [41] The same protocol was used for docking steroid 10 as well as 16α- or 4β-hydroxylated 10 into the active site of CYP154C5, but using the wild-type structures as starting points.
Structure determination: Data sets were processed using DIALS, [42] POINTLESS [43] and AIMLESS [44] of the CCP4 suite, [45] applying a resolution cut-off of 2 Å yielding a CC1/2 higher than 0.5 in the lowest resolution shell. Initial maps were calculated by Fourier synthesis using phenix.refine [46] and the atomic coordinates of CYP154C5 co-crystallized with testosterone (PDB ID: 4J6D). The structure was further refined by alternating rounds of manual adjustment in COOT [47] and computer-driven refinement with phenix.refine, including TLS refinement. Geometry restraints for 5α-androstan-3-one were calculated using the Grade Web Server (http://grade.globalphasing.org; Global Phasing Inc.). Data processing and refinement statistics are listed in Table S3. Diffraction data and coordinates were deposited in the Protein Data Bank [48] (PDB ID: 6TO2).