Shy is a proteobacterial steroid hydratase which catalyzes steroid side chain degradation without requiring a catalytically inert partner domain

Shy (side chain hydratase) and Sal (side chain aldolase) are involved in successive reactions in the pathway of bile acid side chain catabolism in Proteobacteria. Untagged Shy copurified with His-tagged Sal, indicating that the two enzymes form a complex. Shy contains a MaoC domain and a DUF35 domain. When coexpressed with Sal, the DUF35 domain but not the MaoC domain of Shy copurified with Sal, indicating that Sal interacts with Shy through the DUF35 domain of Shy. The MaoC domain of Shy (ShyMaoC) remained catalytically viable and could hydrate cholyl-enoyl-CoA with a catalytic efficiency similar to that of the Shy-Sal complex. Sal expressed with the DUF35 domain of Shy (Sal-ShyDUF35) was similarly competent for the retro-aldol cleavage of cholyl-3-OH-CoA. ShyMaoC showed a preference for C5 side chain bile acid substrates, exhibiting low activity toward C3 side chain substrates. The ShyMaoC structure was determined by X-ray crystallography, showing a hot dog fold with a short central helix surrounded by a twisted antiparallel β-sheet. Modeling and mutagenesis studies suggest that the bile acid substrate occupies the large open cleft formed by the truncated central helix and repositioning of the active site housing. ShyMaoC therefore contains two substrate binding sites per homodimer, making it distinct from previously characterized MaoC steroid hydratases that are (pseudo)heterodimers with one substrate binding site per dimer. The characterization of Shy provides insight into how MaoC family hydratases have adapted to accommodate large polycyclic substrates, which can facilitate future engineering of these enzymes to produce novel steroid pharmaceuticals.

Cholesterol, β-sitosterol, and ergosterol make up a significant portion of cell membranes in animal, plant, and fungal cells, respectively (2). Bile acids are hydroxylated amphipathic cholesterol derivatives that are synthesized in the liver of animals and secreted into the digestive tract to emulsify dietary fats, facilitating absorption in the intestine (3). These natural steroids are eventually deposited into the environment and represent a significant potential carbon source. Despite their ubiquity in many environments (4,5), steroids are inaccessible to most organisms because their complex structure (including two quaternary carbon atoms and limited functional groups) renders them resistant to degradation. The ability to utilize steroids as a sole carbon and energy source for growth is limited to certain bacteria from the Proteobacteria and Actinobacteria phyla (6). This rare ability to completely degrade steroids allows these bacteria to fulfill an important role in natural carbon cycling. Metagenomic analysis indicates that these organisms are globally distributed in soil, plant rhizospheres, wastewater treatment plants, and marine environments (6). Furthermore, these bacteria have been harnessed to produce valuable precursors of steroid pharmaceuticals from low-cost phytosterols (7,8). There are also potential biotechnological applications for the bioremediation of natural and synthetic steroid pollutants in wastewater, which commonly originate from livestock farming (9-11). These pollutants include anthropogenic steroids that can have detrimental effects on the endocrine systems of animals (7,8). Realizing this wealth of potential applications, however, is hampered by a lack of knowledge of the biochemical details of bacterial steroid degradation pathways and the enzymes involved.
Steroids are composed of a four-ring nucleus (designated as rings A, B, C, and D) with an alkyl side chain on the D ring. Variations in the length of the D-ring side chain and the placement of hydroxyl substituents on the steroid rings contribute to the structural and therefore functional diversity of steroids (8). The catabolic pathways utilized by the bacteria reflect the structure of these molecules, with distinct pathways responsible for catabolism of the AB-rings, CD-rings, and D-ring side chain. While the pathways of steroid ring catabolism are generally conserved between Actinobacteria and Proteobacteria, the D-ring side chain catabolism pathway is divergent between these phyla. The catabolism of the D-ring side chain is analogous to fatty acid β-oxidation, with repeating cycles of enzymatic reactions removing either two carbon atoms as acetyl-CoA or three carbon atoms as propionyl-CoA (12, 13) (Fig. 1A). In Actinobacteria, after activation of the C5 carboxylate side chain of cholate with coenzyme A, the resulting thioester can be dehydrogenated by an acyl-CoA dehydrogenase, followed by the addition of water across the resulting double bond by an enoyl-CoA hydratase (ECH) (14,15). This hydrated product is oxidized by a hydroxyacyl-CoA dehydrogenase to produce a 3-ketoacyl-CoA that can then be cleaved by a β-ketothiolase, releasing acetyl-CoA (in the case of cholate) and generating a CoA ester shortened by two carbon atoms (16,17). This ester then proceeds through an analogous series of reaction steps; however, hydration now results in a tertiary alcohol that cannot be oxidized further and is instead cleaved by an aldolase, Ltp2, releasing propionyl-CoA and a 17-keto steroid (18-20).
Proteobacterial cholate side chain catabolism has been primarily studied in Stutzerimonas stutzeri strain Chol1 (previously Pseudomonas sp. strain Chol1 (21)) through gene knockouts (KOs) (22,23). The C-C bond cleavage of the C5 side chain of cholate in the first cycle of β-oxidation proceeds via an aldolytic deacetylation instead of a thiolase-catalyzed reverse Claisen reaction (Fig. 1A). This aldolytic reaction is thought to be catalyzed by the enzyme Sal (steroid aldolase), producing a free aldehyde that is then oxidized and reesterified to produce a C3 CoA ester, which can enter the second round of β-oxidation analogous to the pathway in Actinobacteria (Fig. 1A) (22). The two aldolases in the pathway, Sal and Ltp2, are homologous.
Ltp2 was found to associate with the heteromeric hydratase that catalyzes the preceding reaction in the pathway (20). The heteromeric hydratases, exemplified by ChsH1-ChsH2 from Mycobacterium tuberculosis, are members of the MaoC family of hydratases. However, only ChsH1 has the canonical MaoC catalytic His/Asp dyad required for the hydration reaction, while the ChsH2 MaoC domain lacks these residues and is modified to accommodate the bulky steroid rings (Fig. 1B) (19). Therefore, the steroid rings of the substrate bind to ChsH2 while the side chain extends toward ChsH1 for hydration to occur (19). ChsH2 also possesses a DUF35 domain (domain of unknown function 35) at the C terminus (19), which interacts with the aldolase Ltp2, forming a ChsH1-ChsH2-Ltp2 complex (20). The crystal structure of Ltp2 with the DUF35 domain of ChsH2 showed two DUF35 protein chains at the periphery of a central Ltp2 dimer, bridging the two protomers (24). The DUF35 domain does not appear to be important for the activity of the aldolase or the hydratase (19,20). The aldolytic cleavage of the hydratase product by Ltp2 is, however, necessary for the hydratase to overcome the unfavorable hydration equilibrium, explaining why the tight coupling of these two enzymes is likely metabolically desirable (20).
Similar hydratases containing two distinct MaoC domains catalyze hydration of the C5 steroid side chains in Actinobacteria, including ChsH3 involved in cholesterol catabolism, and CasD and CasQ involved in bile acid catabolism (16,25). Here, however, the two MaoC domains are fused into a single polypeptide and there are no DUF35 domains (Fig. 1B) (25). In contrast, the hydratase Shy (side chain hydratase), thought to be responsible for a similar reaction in Proteobacteria, contains a single MaoC domain possessing the His/Asp catalytic dyad in the C terminus and a DUF35 domain in the N terminus (Fig. 1B) (23). The shy gene is downstream of sal and there are no other genes encoding MaoC domains in its vicinity. Since a noncatalytic MaoC domain is seemingly a requirement for the heteromeric hydratases or double MaoC domain hydratases in Actinobacteria to utilize steroid substrates, it is unclear how Shy (which contains only a single MaoC domain) accommodates bile acid substrates. As the functions of Shy and Sal are inferred only from gene KO studies, we sought to confirm the function of both Shy and Sal in vitro and to determine whether these enzymes form a complex and whether this association is necessary for the function of each enzyme. We also report the crystal structure of the MaoC domain of Shy, shedding light on the molecular basis for steroid binding in this unique homomeric MaoC hydratase.

Shy associates with Sal via its DUF35 domain
Genes encoding Shy and Sal (WP_003057309.1 and WP_003057311.1, respectively) from the Betaproteobacterium Comamonas testosteroni KF-1 (23) were amplified from genomic DNA via PCR and inserted into expression vectors. When produced recombinantly and separately in Escherichia coli, His-tagged Shy and His-tagged Sal were expressed in low yields and the enzymes were impure after Ni-NTA chromatography. However, when the two were coexpressed, untagged Shy copurified with N-terminal His-tagged Sal in good yield on a Ni-NTA column (2.7 mg per liter of culture), indicating that the two enzymes associate (Fig. 2). Purified Shy-Sal, when subjected to analytical size-exclusion chromatography, eluted as a single peak corresponding to a molecular weight of 160 kDa, suggesting a stoichiometry of two molecules of Shy and two molecules of Sal in the complex (the theoretical molecular weights of Shy and tagged Sal are 34.81 and 46.08 kDa, respectively) (Fig. S1). We next created a truncated form of Shy that contains the MaoC but not the DUF35 domain (Shy MaoC). Shy MaoC can be expressed and purified in good yield from recombinant E. coli without Sal. However, untagged Shy MaoC did not copurify with His-tagged Sal when coexpressed. Conversely, an untagged truncated form of Shy containing only the DUF35 domain (Shy DUF35) copurified with His-tagged Sal, confirming that Sal interacts with Shy through the DUF35 domain of Shy (22).
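The stoichiometry inference above is simple arithmetic and can be sketched as a brute-force comparison of candidate assemblies against the apparent mass from size-exclusion chromatography. The following is only an illustrative sketch, not the analysis used in this work: the function name `best_stoichiometry` and the search limit of four copies per protein are assumptions made for the example, while the masses are those quoted in the text.

```python
from itertools import product

# Masses quoted in the text (kDa).
SHY_KDA = 34.81        # untagged Shy
SAL_KDA = 46.08        # His-tagged Sal
OBSERVED_KDA = 160.0   # apparent mass from analytical size-exclusion chromatography


def best_stoichiometry(observed_kda, shy_kda=SHY_KDA, sal_kda=SAL_KDA, max_copies=4):
    """Return (n_shy, n_sal, predicted_kda) for the Shy:Sal assembly whose
    predicted mass lies closest to the observed mass.

    max_copies is an arbitrary search limit chosen for this illustration.
    Only assemblies with at least one copy of each protein are considered.
    """
    candidates = product(range(1, max_copies + 1), repeat=2)
    n_shy, n_sal = min(
        candidates,
        key=lambda c: abs(c[0] * shy_kda + c[1] * sal_kda - observed_kda),
    )
    return n_shy, n_sal, n_shy * shy_kda + n_sal * sal_kda


if __name__ == "__main__":
    n_shy, n_sal, predicted = best_stoichiometry(OBSERVED_KDA)
    print(f"Shy:Sal = {n_shy}:{n_sal}, predicted {predicted:.2f} kDa")
```

Because the apparent mass from size-exclusion chromatography also depends on molecular shape, a comparison of this kind is best read as a consistency check rather than a proof of stoichiometry.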
The Shy-Sal complex was able to hydrate cholyl-enoyl-CoA with a high catalytic efficiency (kcat/Km) of (2.64 ± 0.191) × 10⁷ M⁻¹ s⁻¹ (Table 1). A similar catalytic efficiency toward this compound was also observed for Shy MaoC, indicating that efficient hydration requires neither the DUF35 domain nor the association with Sal.
We further elucidated the specificity of Shy toward steroid substrates with different side chain lengths and nucleus substituents (Fig. 3). Shy MaoC is capable of hydrating 3β-hydroxy-chol-5,22-diene-24-oyl-CoA (3-HCDO-CoA), which bears a C5 side chain and a cholesterol-like nucleus, with 37-fold decreased catalytic efficiency versus cholyl-enoyl-CoA. Shy MaoC possessed low activity toward the C3 side chain metabolite 3-oxo-4,17-pregnadiene-20-carboxyl-CoA (3-OPDC-CoA), with a specific activity at 35 mM 3-OPDC-CoA approximately 1000-fold lower than that obtained with the same concentration of cholyl-enoyl-CoA ((3.70 ± 0.305) × 10⁻² versus 51.2 ± 0.0833); this activity was too low to determine kinetic parameters. Given that 3-OPDC-CoA has a similar nucleus to 3-HCDO-CoA, these results suggest that Shy is specific toward C5 side chains and has a broad specificity for steroid ring nucleus conformations and substituents, with a preference for bile acid nuclei. Lastly, Shy MaoC was assayed for activity against crotonyl-CoA, an aliphatic enoyl-CoA with a C4 acyl chain. The catalytic efficiency for this substrate is six orders of magnitude lower than that for cholyl-enoyl-CoA, indicating that the steroid ring nucleus is an important determinant for productive substrate binding.
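The fold difference in specific activity quoted above can be reproduced from the two reported values. The sketch below is illustrative only: it assumes that the reported ± values are independent uncertainties that can be combined as relative errors to first order, which the text does not state, and the helper name `fold_difference` is hypothetical.

```python
import math


def fold_difference(a, a_err, b, b_err):
    """Return (ratio, ratio_err) for a / b, combining the two uncertainties
    as independent relative errors (first-order approximation; an assumption)."""
    ratio = a / b
    rel_err = math.sqrt((a_err / a) ** 2 + (b_err / b) ** 2)
    return ratio, ratio * rel_err


# Specific activities quoted in the text (value, uncertainty), same units for both.
CHOLYL_ENOYL_COA = (51.2, 0.0833)
OPDC_COA = (3.70e-2, 0.305e-2)

if __name__ == "__main__":
    ratio, err = fold_difference(*CHOLYL_ENOYL_COA, *OPDC_COA)
    print(f"cholyl-enoyl-CoA / 3-OPDC-CoA = {ratio:.0f} +/- {err:.0f}")
```

The ratio comes out somewhat above 1000, in line with the "approximately 1000-fold" description, and its uncertainty is dominated by the error on the much smaller 3-OPDC-CoA activity.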

Structure of Shy MaoC
The crystal structure of Shy MaoC was determined to 2.05 Å resolution by molecular replacement using a protomeric model of the protein as predicted by ColabFold (26) as the search model. Data collection and structure refinement statistics are summarized in Table 2; electron density is shown in Fig. S3. Although the crystal was grown in the presence of acetyl-CoA, electron density for the ligand was not observed in the structure. Shy MaoC crystallizes with two chains in the asymmetric unit; these protomers superimpose closely with an r.m.s.d. of 0.3 Å. In both protomers, the first 13 residues (E¹⁴⁸SEVAVVEQSVAA¹⁵⁸) are disordered and were not modeled. Shy MaoC adopts a hot dog fold, with a central bent five-stranded antiparallel β-sheet (topology 2, −5, 6, −7, 4) "bun" wrapping around a central helix "sausage", inserted between β2 and β3 (Fig. 4, A and B). A three-helix motif that contains residues important for catalysis is inserted N terminal to the central helix; this motif is termed the active site housing segment (27,28). An additional auxiliary domain is built from the N-terminal strand-helix pair β1 and α1, with β1 paired parallel with β3 (contributed by the loop joining the central helix, α5, to β4); this auxiliary domain is positioned at the end of the central β-sheet distal from the active site housing. The two Shy MaoC protomers interact to form a homodimer with an interface area of 1001.9 Å² (as determined by proteins, interfaces, structures and assemblies, PISA) (Fig. 4C) (29). The β4 strands from adjacent protomers interact in antiparallel fashion to form an extended 10-strand β-sheet; in addition, the central helix, α5, and α4 from the active site housing of each protomer interact to form a four-helix bundle that buries an extended nonpolar surface. The analytical size-exclusion chromatogram of Shy MaoC displays a single peak corresponding to a molecular weight of 40 kDa, suggesting that a homodimer is the preferred oligomeric state of this construct in solution (the theoretical molecular weight of tagged Shy MaoC is 20.5 kDa) (Fig. S1).

Comparison to other hydratases
Structural homologues of Shy MaoC were initially identified via a heuristic Protein Data Bank (PDB) search using DALI (http://ekhidna2.biocenter.helsinki.fi/dali/) (statistics listed in Table S1) (30). The highest ranking hit that has been biochemically characterized is the C-terminal ECH domain of PaaZ from E. coli K12 (PaaZ MaoC) (PDB ID: 6JQN), a bifunctional enzyme involved in phenylacetic acid metabolism (31) (Fig. 5A). PaaZ MaoC hydrates oxepin-CoA, a seven-membered C-O heterocyclic compound with a C2 side chain, with high catalytic efficiency (32). An interesting aspect of this structural similarity is that PaaZ MaoC, unlike most other MaoC domains, shares a pronounced twisting of the β-sheet with Shy (Fig. 5B).
[Table footnote] Specific activity measurements were performed at least in triplicate.
Another high-ranking hit is PhaJ from Aeromonas caviae (PhaJ Acav) (PDB ID: 1IQ6), an ECH that is involved in the biosynthesis of polyhydroxyalkanoates (PHAs) and is specific toward short chain enoyl-CoAs with C4-C6 acyl chains (27,33,34) (Fig. 5C). Surprisingly, the steroid-specific ChsH1-ChsH2 ECH from M. tuberculosis (PDB ID: 4W78), while also among the high-ranking hits, is less similar to Shy than PaaZ and PhaJ despite also utilizing steroid substrates (Fig. 5D). ChsH1-ChsH2 is specific toward C3 side chain steroid substrates (3-OPDC-CoA) and can also hydrate crotonyl-CoA with a catalytic efficiency three orders of magnitude lower than that for 3-OPDC-CoA (19,25). The three-helix active site housing motifs in these proteins are structurally conserved and, except for the noncatalytic ChsH2, contain the conserved catalytic dyad composed of a water-activating histidine (substituted with tyrosine in ChsH2) and an aspartate that positions the water for hydration (Fig. 5E). The catalytic roles of these residues have been well established in multiple enzymes including ChsH1 and PhaJ Acav (19,27,35). The water molecule that hydrates the substrate in PhaJ Acav is observed in an equivalent position in Shy MaoC, within hydrogen bond distance to the carboxylate of Asp-201, while in ChsH1 the catalytic water is displaced by a cadmium ion (28).
All of these proteins form dimers with a similar interface, including the pairing of each protomer's β-sheet into an extended 10-strand antiparallel β-sheet and packing between the adjacent active site housings. Shy MaoC, PhaJ Acav, and PaaZ MaoC form homodimers with two active sites at the dimer interface, in which the active site housing from one protomer faces a substrate-binding pocket formed by the bun and helix of the opposite protomer. In contrast, the ChsH1-ChsH2 heterodimer contains only one active site, made up of the active site housing of ChsH1 and the substrate-binding pocket of ChsH2.

Active site architecture of Shy
In canonical MaoC ECHs like PhaJ, the active site takes the form of an enclosed tunnel with a single, narrow opening through which the acyl portion of the substrate is threaded. The depth of the PhaJ Acav tunnel is defined in part by the central helix "sausage", which is 16 residues in length and connected to the equivalent of β4 through a short loop. The tunnel entrance is defined by the active site housing of the opposite protomer, which contacts the central helix, connecting loop, and β-strand (Fig. S4A). The long central helix of PhaJ restricts the volume of the central substrate-binding cavity, resulting in a relatively short tunnel with an approximate volume of 827 Å³ (Fig. 6A). PaaZ MaoC possesses a similar enclosed tunnel-like active site (Fig. S4B). However, the central helix is shortened by three residues at its C-terminal end, with a longer loop connecting it back to the subsequent β-strand. The shortened helix increases the volume of the tunnel to approximately 1144 Å³, consistent with the larger physiological substrate of PaaZ MaoC (Fig. 6B).
In contrast, Shy MaoC represents a remodeling of this canonical MaoC active site architecture. Firstly, the central helix of Shy MaoC is only nine residues long, with the shortening of the helix by an additional turn at the C terminus (relative to the already shortened PaaZ MaoC) opening up considerable additional space in the active site. This helix also shows mixed geometry, with the first four hydrogen bonds conforming to an α-helical pattern (i to i + 4), while the subsequent four follow a 3₁₀-helix pattern (i to i + 3); this narrows the C-terminal half of this helix. The loop C terminal to this central helix is also greatly elongated, forming β3 of the auxiliary domain before linking back to β4 of the central β-sheet; this pushes these residues to the periphery of the binding site, where they form an extended wall of an open pocket rather than forming one side of the substrate-binding tunnel. In order to accommodate this loop, the N-terminal end of the otherwise long (in PhaJ Acav and PaaZ MaoC) β2 strand does not hydrogen bond with the central β-sheet, instead packing behind it as an extended loop. Together, these changes expose much of the surface of the β-sheet to solvent. The active site is further opened up by a small shift outward of the active site housing; as a result of this shift (along with the shortening of the central helix and the α5-β3 loop), α3 and α4 do not make contact with the dimeric partner (Fig. S4C). Together these changes produce a continuous open cleft that spans the width of the β-sheet, with an approximate volume of 2895 Å³ (Fig. 6C).
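To put the three cavity volumes quoted above on a common scale, each can be expressed relative to the canonical PhaJ Acav tunnel. The sketch below only restates the approximate volumes given in the text; the dictionary keys and the function name `fold_expansion` are illustrative choices.

```python
# Approximate active-site volumes quoted in the text (cubic angstroms).
CAVITY_VOLUMES = {
    "PhaJ_Acav": 827.0,
    "PaaZ_MaoC": 1144.0,
    "Shy_MaoC": 2895.0,
}


def fold_expansion(volumes, reference="PhaJ_Acav"):
    """Return each volume divided by the reference enzyme's volume."""
    ref = volumes[reference]
    return {name: vol / ref for name, vol in volumes.items()}


if __name__ == "__main__":
    for name, fold in fold_expansion(CAVITY_VOLUMES).items():
        print(f"{name}: {fold:.2f}x the PhaJ_Acav volume")
```

On these numbers the Shy MaoC cleft is roughly three and a half times the volume of the PhaJ Acav tunnel, whereas the PaaZ MaoC tunnel is less than one and a half times as large.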

While the single active site of ChsH1-ChsH2 also diverges from the canonical MaoC organization, the remodeling is less extensive than that observed in Shy MaoC. Specifically, the central helix of ChsH2 is also nine residues long with mixed geometry, but with the four N-terminal residues following a 3₁₀ pattern, while the final three residues are α-helical. ChsH2 also features an auxiliary domain, albeit with a different composition than Shy MaoC, with the extended loop C terminal to the central helix connecting to an α-helix that forms a two-helix bundle with an N-terminal helix. Intriguingly, ChsH2 also reorganizes the edge of β-strand 2 of the hot dog "bun" into a loop that connects α1 to the active site housing, analogous to the α1 to β2 loop in Shy MaoC. A key difference between ChsH1-ChsH2 and Shy is that the active site housing of ChsH1 maintains contact with the β-sheet, preserving the canonical tunnel entrance through which CoA is threaded, but with an extended "back door" opening formed by the remodeling of the ChsH2 active site (Fig. S4D). The distinct auxiliary domain of ChsH2 is positioned closer to the dimer core relative to Shy MaoC, which combined with the tunnel entrance results in a reduced internal volume of approximately 2017 Å³ (Fig. 6D). As previously mentioned, ChsH2 lacks the catalytic histidine required for activation of water for hydration. In addition, the substrate-binding pocket formed by ChsH1 is too small to accommodate steroid substrates because its 18-residue central helix connects directly back to the central β-sheet via a short loop without an intervening auxiliary domain, forming a restrictive enclosed tunnel akin to that of canonical MaoC ECHs. Overall, the architecture of the active site of ChsH1-ChsH2 differs significantly from that of Shy MaoC despite both hydratases using steroidal substrates.

We next constructed a maximum likelihood phylogenetic tree from these three types of hydratases (Fig. S5). Homologs of PaaZ and PhaJ were selected, along with Shy homologs from Proteobacteria confirmed to catabolize steroids. Homologs of ChsH1 and ChsH2 were selected from cholesterol-degrading Actinobacteria, while their homologs, CasM and CasO, were selected from bile acid-degrading Actinobacteria. To align the MaoC domains of these proteins, the DUF35 domains and the aldehyde dehydrogenase domains of all sequences were removed from the alignment. The resulting tree shows that Shy is most closely related to the other proteobacterial enzymes rather than to the actinobacterial steroid hydratases, sharing a most recent common ancestor with PaaZ, followed by PhaJ. The heteromeric hydratases form distinct clades, with the catalytic ChsH1/CasO and the noncatalytic ChsH2/CasM each grouping together.

Determinants of steroid specificity
To gain further insight into steroid binding, a derivative of cholyl-3-OH-CoA (cholyl-22-(R)-hydroxy-24-methylthioate), the product of the hydration of cholyl-enoyl-CoA, was modeled into the structure of Shy MaoC and energy minimized using Rosetta (Fig. 7) (36). Based on the consistent stereospecificity of other MaoC hydratases, the 3-hydroxyl group was placed in the R-configuration. CoA is both large and highly flexible; the docking ligand was therefore terminated as a methyl thioester so that interactions with the relatively rigid, nonpolar steroid moiety drive binding. The resulting structure shows extensive ligand-protein interactions, with both good nonpolar shape complementarity and formation of favorable hydrogen bonds, including those required for catalysis (35). In the forward direction, MaoC hydratases catalyze the syn addition of water to trans-α,β double bonds, protonating the α-carbon and hydroxylating the β-carbon from the same side (35). The water molecule is positioned via hydrogen bonding with the catalytic Asp and His, with the His acting as the catalytic base/acid, deprotonating the water, which then attacks the β-carbon, followed by protonation of the α-carbon (Fig. S6) (35). The third conserved catalytic residue is a Gly located at the N-terminal end of the central helix; the backbone amide of this residue hydrogen bonds to the thioester carbonyl oxygen of the substrate, forming an oxyanion hole that acts to stabilize the enolate intermediate formed by the attack on the β-carbon (35). The modeled substrate exhibits all the expected interactions required for the reverse reaction; the 3-hydroxyl group is positioned to hydrogen bond with Asp-201, with His-206 positioned in range to act as the catalytic base, deprotonating the α-carbon (35). Cα-Cβ of the substrate are positioned such that the resulting enoyl double bond would be trans, and the carbonyl oxygen is positioned to hydrogen bond with the amide of Gly-225, forming the conserved oxyanion hole interaction.
The steroid nucleus packs across the β-sheet, interacting with an extended hydrophobic patch composed of the side chains of Met-289 and Leu-291 on β5, Leu-307 and Ile-309 on β6, and Met-249 on β4. The distal edge of the pocket shifts inward during minimization, allowing Met-249 to contact the edge of the C-ring. This surface patch shows good shape complementarity with the substrate, with defined pockets for the C18 and C19 methyl carbons. The bent conformation of the bile acid nucleus, induced by the cis conformation at the A/B ring juncture, places the A ring out of plane with the rest of the nucleus. This ring interacts with residues from the loop connecting the C terminus of the central helix with the auxiliary domain, with the side chain of Ile-237 extending downward and interacting with the β-face of the A ring. The C3 α-hydroxyl is positioned close enough to the exposed amide nitrogen of Ile-237 (3.8 Å) that additional small local adjustments may allow a hydrogen bond to form; alternatively, a water molecule might bridge these groups. The back of the pocket is delimited by Ala-232 and Ser-233 at the C-terminal end of the central helix, with the Ser-233 hydroxyl within 3.6 Å of the 7α-hydroxyl group; the shortening of this helix, and its narrower 3₁₀ character in this region, creates the necessary space to accommodate the bulky steroid ring. Note that the steroid ring binds with its hydrophobic β-face interacting with the enzyme, while the hydrophilic α-face remains exposed to the solvent. Because the bile acid substrate is rigid and interfaces with the enzyme primarily through nonpolar interactions, binding appears to be primarily driven by shape complementarity between the steroid ring and the exposed nonpolar surfaces.
The side chain methyl thioester is positioned proximal to the gap between the β4 strands of the adjacent protomers. This positioning coincides with the CoA binding orientation observed in other hydratases, including PaaZ and ChsH1-ChsH2, where a typical "boomerang" conformation bends the cysteamine arm through the notch between the β4 strands and around the β-sheet, while positioning the ADP moiety on the opposite face of the β-sheet. In the structure of PaaZ in complex with octanoyl-CoA, Tyr-607 and Arg-258 from opposite protomers form hydrogen bonds with the 3′ adenine phosphate, while Lys-636 from the same protomer as Arg-258 forms hydrogen bonds with the 5′ β-phosphate (31). Tyr-252 and Arg-232 from opposite protomers in Shy MaoC superimpose well with the corresponding PaaZ residues (Fig. S7A). Additionally, sequence conservation analysis of Shy MaoC in ConSurf (37,38) shows that both Tyr-252 and Arg-232 exhibit strong conservation (Fig. S7B). Together, this suggests that Tyr-252 and Arg-232 are plausibly involved in CoA binding, and that CoA in Shy MaoC is likely to bind in a manner similar to that observed in PaaZ.
To test the substrate binding model in Shy MaoC, we selected residues within the proposed binding site for mutagenesis. In addition, we attempted mutagenesis of the conserved catalytic histidine. We generated Phe replacements of Leu-307 and Ile-309 and Ala and Gln replacements of His-206. Unfortunately, L307F, H206A, and H206Q expressed at low yield and purity, precluding further CD analysis; however, I309F expressed well and had a CD spectrum consistent with that of WT Shy MaoC (Fig. S8). Steady state kinetic parameters for Shy MaoC I309F were acquired with cholyl-enoyl-CoA and crotonyl-CoA. Shy MaoC I309F had a two-fold increased Km for cholyl-enoyl-CoA and an approximately three-fold decreased kcat, resulting in an approximately six-fold reduction in catalytic efficiency (Table 1). The model predicts that Ile-309 packs closely on the cholyl-enoyl-CoA C19 methyl group; the increase in bulk and reduction in flexibility introduced by the I309F replacement would be anticipated to introduce steric clashes that result in suboptimal binding and potentially reduce the fraction of time the bound substrate is able to adopt a productive binding orientation. In contrast, this replacement would not be expected to impact the binding of the much smaller crotonyl-CoA; consistent with this, the catalytic efficiency toward crotonyl-CoA was nearly identical to that of the WT enzyme.
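The relationship between the reported changes in Km and kcat and the resulting change in catalytic efficiency follows directly from the definition kcat/Km. The sketch below is illustrative only: the wild-type parameters are set to 1.0 in arbitrary units because they cancel in the ratio, so no measured value is assumed, and the function names are hypothetical.

```python
def catalytic_efficiency(kcat, km):
    """Return kcat / Km."""
    return kcat / km


def efficiency_fold_reduction(km_fold_increase, kcat_fold_decrease):
    """Fold reduction in kcat/Km for a variant relative to wild type,
    given the fold increase in Km and the fold decrease in kcat.

    Wild-type kcat and Km are set to 1.0 (arbitrary units); they cancel
    in the ratio, so no measured value is assumed here.
    """
    wt = catalytic_efficiency(1.0, 1.0)
    variant = catalytic_efficiency(1.0 / kcat_fold_decrease, 1.0 * km_fold_increase)
    return wt / variant


if __name__ == "__main__":
    # Approximate fold changes quoted in the text for I309F with cholyl-enoyl-CoA.
    fold = efficiency_fold_reduction(km_fold_increase=2.0, kcat_fold_decrease=3.0)
    print(f"kcat/Km reduced about {fold:.1f}-fold")
```

A two-fold increase in Km combined with a three-fold decrease in kcat therefore multiplies out to the approximately six-fold loss of catalytic efficiency described above.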

Discussion
The C-C bond cleavage in the first round of β-oxidation of bile acid side chains is catalyzed by an aldolase in Proteobacteria, instead of by a thiolase. We showed that this aldolase, Sal, forms a complex with the hydratase, Shy, that catalyzes the preceding reaction. Shy contains a MaoC domain and a DUF35 domain, the latter of which is responsible for interaction with Sal. Notably, the DUF35 domain is dispensable for substrate hydration, with the isolated MaoC domain (Shy MaoC) retaining the catalytic efficiency of the full protein. This is analogous to the hydratase-aldolase complex ChsH1-ChsH2-Ltp2 involved in the last round of β-oxidation reactions in bacterial steroid degradation, where the DUF35 domain involved in hydratase-aldolase association is not essential for the hydration activity of ChsH1-ChsH2. However, Shy differs from the corresponding ChsH1-ChsH2 hydratase, as it is homomeric instead of heteromeric and is specific for C5 side chain rather than C3 side chain steroid metabolites.
The structure of Shy MaoC reveals a drastically remodeled active site, with a large, exposed cleft resulting from the truncated central helix and extended, repositioned loops; this creates space for the bulky gonane nucleus. The other major change is the opening of the active site entrance via repositioning of the active site housing, resulting in the narrow tunnel mouth typical of MaoC hydratases being replaced with an extended open cleft. This suggests that accommodation of the steroid substrate requires increasing not only the size of the substrate-binding pocket but also the size of the entrance, as the large rigid steroid substrate may be too unwieldy to thread through a tunnel opening as a linear acyl chain could in an ECH like PhaJ. While the active site housing helices α2 and α3 of ChsH1 do make contact with ChsH2, the contact area is much smaller than that seen in PhaJ Acav or PaaZ MaoC, and this motif may possibly shift to open up the active site and allow entry of the substrate.
While the shape and size of Shy's substrate binding site are quite distinct from those of other MaoC hydratases, the general mode of substrate binding shares many similarities with PhaJ Acav. Docked structures of PhaJ Acav and other Pseudomonas homologues place the acyl tail of the substrate with the ω-carbon interacting with the longer central helix (37,39). The Shy docked model places the gonane nucleus of the steroid along a similar axis, with the A/B rings extending into the space opened by the shortened central helix, sitting alongside the end of the helix and interacting with the distal connecting loop. The conservation of CoA binding residues from PaaZ MaoC also supports the general position of the substrate being similar to that in these canonical enzymes. This similarity extends to the specific amino acid side chains involved in substrate binding. In PhaJ Acav and homologs from Pseudomonas aeruginosa and Pseudomonas putida (37,39), the width of the active site is defined on one side by a Ser residue on the central helix, while the opposite side is defined by a Val/Ile residue on the central β-sheet. These residues are conserved in Shy MaoC from C. testosteroni, which has a Ser residue on the central helix and an Ile residue at the corresponding position on the β-sheet. Shy homologs from other Proteobacteria also contain a conserved Ser and Ile/Met/Val residue at the respective positions (Fig. S9). The modeled structure suggests that the Cδ methyl group of this Ile residue (Ile-309) on β7 of Shy MaoC is in van der Waals contact with the C11 ring carbon and the C18 and C19 methyl carbons; replacement with the bulkier Phe led to reduced activity with the steroid substrate.
The other residues interacting with the steroid rings identified in the docked structure are also conserved in various Shy homologues. Together, this suggests that the adaptations to accommodate the steroid nucleus in Shy require significant remodeling of secondary structure and main chain geometry, where an alternative set of exposed residues define the distal side of the binding pocket, but those closest to the catalytic dyad are conserved.
Shy's ability to catalyze reactions with steroid substrates without requiring a noncatalytic dimeric partner calls into question the prevalence of a specialized inactive MaoC domain in actinobacterial and eukaryotic steroid hydratases, such as in ChsH1-ChsH2. While ChsH1-ChsH2 and its bile acid degrading homologs are specific toward C3 side chains, heteromeric/pseudoheteromeric MaoC ECHs are also involved in β-oxidation of steroids of different structures. For example, Mfe-2 ECH in mammals uses cholesterol metabolites with a C8 side chain (38), while CasD and CasQ in actinobacteria use bile acid metabolites with a C5 side chain (25). Therefore, the specialized domain is not simply due to constraints for binding steroid substrates of specific side chain lengths or ring structures.
In MaoC hydratases specific toward small substrates, such as PhaJ, the longer central helix spans the entire length of the β-sheet bun, forming extensive interactions that serve to stabilize the protein. When the central helix is shortened to accommodate larger substrates, these interactions are no longer present, requiring other stabilizing interactions to compensate. Homodimeric proteins such as Shy and (pseudo)heterodimeric proteins such as ChsH1-ChsH2 with a specialized domain may represent two distinct solutions to this issue. In heterodimers, the specialized substrate binding subunit with the short helix (ChsH2) may be stabilized via association with the catalytic unit (ChsH1). ChsH1 possesses a long helix, closing off its substrate binding pocket, which may serve to brace ChsH2 and compensate for the stabilizing interactions lost due to the short ChsH2 helix. In contrast, Shy could be considered a homodimer of ChsH2 analogues, with both proteins having short central helices.
The β-sheet of Shy MaoC has a pronounced twist that is especially notable in the distal strand (β5). The twist in these sheets may introduce increased rigidity that stabilizes the sheet in lieu of interactions between the helix and sheet, circumventing the need for a binding partner with a longer central helix to stabilize the dimer. A similar twist can be observed in PaaZ MaoC, which also accommodates a large cyclic substrate without the need for a specialized domain. In addition, Shy MaoC features an auxiliary domain which packs against the exposed end of the sheet. Given that the central helix of Shy is even shorter than that of PaaZ MaoC, this domain may serve to buttress the β-sheet of Shy, further stabilizing it. ChsH2 has an analogous domain composed of a two-helix bundle that is positioned similarly, albeit closer toward the core of the dimer. However, when superimposing different copies of the proteins from their respective structures, the distal edge superposes relatively poorly in ChsH2 while the equivalent residues in Shy MaoC superpose much more closely, suggesting that Shy MaoC is relatively less flexible. This may reflect their substrate specificity, as Shy MaoC only needs to interact with one face of the amphipathic bile acid substrate, while ChsH2 uses a cholesterol substrate in which both faces are hydrophobic. This flexibility may allow the ChsH2 active site to partially close so as to bury both faces of the substrate in a productive complex.
While intuitively the heteromeric/pseudo-heteromeric architecture seems less efficient, it allows the specialized protomer to be optimized for stability and dynamics, separately from the evolutionary pressure for activity. As demonstrated by the catalytic efficiency of ChsH1-ChsH2 and its bile acid degrading homologs, this does not result in any net loss in productivity relative to Shy MaoC (25). Another potential advantage can be observed in actinobacterial heteromeric MaoC type enzymes involved in synthesis of long-chain fatty acids that are incorporated into mycolic acids of the cell wall. In this system, HadB contains the catalytic residues while HadA and HadC do not (40). HadB can associate with either HadA or HadC, forming HadAB or HadBC complexes. Interestingly, these complexes have distinct substrate specificities, with HadAB preferring C12-C18 fatty acyl chains while HadBC is specific for longer C22-C26 acyl chains (41). Crystal structures suggest that HadA and HadC bind to the acyl chains of the substrate for HadB to hydrate (40,42). Therefore, the heteromeric architecture could enable modularity, by allowing the evolution of specialized binding partners with differing substrate specificities.
Shy's architecture, distinct from that of other steroid-using MaoC hydratases and similar to that of other proteobacterial hydratases, raises interesting questions about its evolutionary origins. Previous metagenomic studies of steroid degrading bacteria have suggested that the 9,10-seco pathway for cholate A/B ring degradation originated via a gene duplication event of the homologous cholesterol pathway in an ancestor of Rhodococcus sp. (43). This cluster was then horizontally transferred to a proteobacterium before subsequent further horizontal transfers between proteobacteria (43). The origin of the cholate side chain catabolism pathway in proteobacteria was, however, not explained. Based on their initial structural similarity, we conducted a phylogenetic analysis of Shy, PhaJ, PaaZ, ChsH1, and ChsH2 homologs, which indicated that Shy's closest relatives are PaaZ and then PhaJ, suggesting that Shy evolved from a canonical MaoC hydratase rather than from a horizontally transferred heteromeric enzyme. Shy and ChsH1-ChsH2 therefore represent independently evolved solutions to the challenges of steroid hydration, with some convergent aspects such as the displaced active site housings and the presence of auxiliary domains, albeit with different structural compositions.
The structural and biochemical characterization of Shy sheds light on a unique steroid-using enoyl-CoA hydratase. To our knowledge, Shy is the only characterized enoyl-CoA hydratase capable of using steroid substrates while maintaining a homodimeric architecture with both active sites functional. Our data are consistent with a relationship between the size, shape, and positioning of the central helix and the upper limit of substrate size an enzyme can accept. However, many specific details on the determinants of substrate specificity in enoyl-CoA hydratases remain unclear; for example, factors determining the specificity toward steroid side chain lengths and nucleus substituents have not been elucidated. Determination of additional hydratase structures and, if possible, enzyme-substrate complexes could shed light on these details. Possible candidates for further study may be found among the anaerobic cholesterol degrading Proteobacteria, such as the aforementioned Sterolibacterium denitrificans (44). These enzymes could provide an avenue for comparison with bile acid utilizing Shy. This information could facilitate their use as biocatalysts to modify steroid side chains with desired steroid ring structures that can be used as synthons for producing existing or novel steroidal pharmaceuticals.

Experimental procedures

Chemicals
Potassium hexacyanoferrate (III) was purchased from Sigma-Aldrich. Cholic acid was purchased from Alfa Aesar. 4-pregnen-3-one-20β-carboxylic acid (3-OPC) was purchased from Steraloids Inc. Coenzyme A was purchased from BioShop Canada Inc. ATP was purchased from Bio Basic Inc. Restriction enzymes and Pfu polymerase were purchased from Thermo Fisher Scientific. T4 DNA ligase was purchased from New England Biolabs. Ni²⁺-NTA Superflow resin was purchased from Qiagen. 3-OPDC-CoA and 7-HOPC-CoA were synthesized as described previously (20). All other chemicals were purchased from Thermo Fisher Scientific or Sigma-Aldrich unless otherwise stated.

Bacterial strains and plasmids
C. testosteroni KF-1 and Thermomonospora curvata DSM 43183 were obtained from DSMZ-German Collection of Microorganisms and Cell Cultures and E. coli BL21 LOBSTR from Kerafast Inc.

DNA manipulation
DNA was purified, digested, and ligated using standard protocols. The shy, sal, shy maoC, and shy DUF35 genes were amplified from the genomic DNA of C. testosteroni KF-1 with the primers listed in Table S3. DNA fragments containing shy, shy maoC, and shy DUF35 were inserted between the NdeI and HindIII sites of the pBTLactac plasmid to enable expression of tagless proteins in E. coli. The pBTLactac plasmid was derived from the pBTL-4 plasmid (45), with the weak lac promoter removed by BsrBI and HindIII digestion and replaced with the lacIq and tac promoter from pVLT31 (46). The DNA fragment containing shy maoC was inserted into pET28a (Novagen) using NdeI and HindIII to produce His-tagged protein. DNA containing sal was inserted into pMCSG7 (47) via ligation independent cloning, as there are internal NdeI sites in the gene that preclude restriction site-based insertion into common E. coli expression vectors. Plasmids were transformed into E. coli BL21 LOBSTR for expression. Site-directed mutagenesis was performed via a modified QuikChange method using primers listed in Table S3 (48). Cloned genes and mutations were confirmed by DNA sequencing at Laboratory Services (University of Guelph).

Protein expression and purification
Recombinant E. coli was grown in 4 L of LB media supplemented with kanamycin (50 μg/ml), ampicillin (100 μg/ml), and/or tetracycline (15 μg/ml) at 37 °C. At mid-log phase (A600 of 0.4-0.6), recombinant protein expression was induced by the addition of 1 mM IPTG. Cells were incubated for a further 24 h at 15 °C and harvested by centrifugation at 9605 × g for 10 min. E. coli cell pellets were resuspended in 20 mM Hepes buffer (pH 7.5) and lysed by passage through a French press at 103,421 kPa. Cell lysates were centrifuged at 39,191 × g for 15 min. Cell extracts were filtered through a 0.45 μm filter and incubated for 1 h at 4 °C with Ni²⁺-NTA resin in buffer (50 mM sodium phosphate buffer pH 8.0, 300 mM sodium chloride) containing 20 mM imidazole. The mixture was poured into a gravity column and washed with the same buffer. The His-tagged proteins were eluted with buffer containing 150 mM imidazole (pH 8.0). The buffer was exchanged for 20 mM Hepes (pH 7.5) by dilution in a stirred cell equipped with a YM10 filter (Amicon). Purified enzymes were stored at −80 °C.

CD spectroscopy
Protein solutions (0.15 mg/ml) were prepared in 1 mM sodium phosphate, pH 7.0, buffer and analyzed in 1 mm pathlength quartz cuvettes. CD spectra were collected using a Jasco, Inc, J-815 spectropolarimeter from 260 to 190 nm with 0.5 nm pitch, 1 nm bandwidth, 2 s of data integration time, and 100 nm/min scan speed. Nine individual spectra were averaged and then blank subtracted, followed by processing via a Savitzky-Golay filter in Jasco, Inc, Spectra Manager 2 software (https://jascoinc.com/products/spectroscopy/molecular-spectroscopy-software/).
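The averaging, blank subtraction, and smoothing steps described above can be sketched as follows. This is a minimal standard-library illustration, not the Spectra Manager implementation: the 5-point quadratic Savitzky-Golay window is an assumption (the filter settings are not reported), and all function names are hypothetical.

```python
from statistics import fmean

# 5-point quadratic Savitzky-Golay smoothing coefficients.
# ASSUMPTION: window length and polynomial order are illustrative only;
# the settings used in Spectra Manager are not stated in the text.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def average_spectra(scans):
    """Point-by-point mean of scans recorded on the same wavelength grid."""
    return [fmean(point) for point in zip(*scans)]


def subtract_blank(sample, blank):
    """Subtract the buffer-only spectrum from the averaged sample spectrum."""
    return [s - b for s, b in zip(sample, blank)]


def savgol5(values):
    """Apply the 5-point filter; the two points at each edge are left unsmoothed."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG5, window))
    return smoothed


def process_cd(scans, blank):
    """Average the scans, subtract the blank, then smooth (order as in the text)."""
    return savgol5(subtract_blank(average_spectra(scans), blank))
```

In this sketch `scans` would hold the nine individual spectra as equal-length lists of ellipticity values and `blank` the buffer spectrum on the same grid. A quadratic filter of this kind reproduces locally quadratic signals exactly, so it suppresses noise without distorting broad CD bands.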

Cholyl-enoyl-CoA and cholyl-3-OH-CoA synthesis
Cholyl-enoyl-CoA was synthesized similarly to previously described methods (25). A 10 ml reaction mixture of 100 mM sodium Hepes buffer (pH 7.5) with cholic acid (1.0 mM), CoA (1.0 mM), ATP (2.5 mM), magnesium sulfate (5.0 mM), and the acyl-CoA synthetase CasG from Rhodococcus jostii RHA1 (15) (1.5 μM) was incubated overnight at 22 °C with gentle agitation. Produced cholyl-CoA was dehydrogenated to cholyl-enoyl-CoA using the acyl-CoA dehydrogenase CasC from R. jostii RHA1 (14) (500 nM) and potassium hexacyanoferrate (III) (300 mM) in 40 ml 100 mM sodium Hepes buffer (pH 8.5) for 2.5 h at room temperature with gentle agitation. To produce cholyl-3-OH-CoA, the hydratase ChsH3 from M. tuberculosis (1 μM) was added (25). The reaction was then halted by acidification to pH 4.0 with HCl and the solution passed through 0.45 μm and YM10 filters. Twenty milliliters of the filtrate was loaded onto a 2.8 ml HyperSep Disposable C18 column (500 mg bed weight; Thermo Fisher Scientific) equilibrated with 10% acetonitrile in 50 mM sodium phosphate buffer (pH 5.3). The column was washed with the same buffer and CoA-esters were eluted with 10 ml 40% acetonitrile in 50 mM sodium phosphate buffer (pH 5.3). This process was repeated twice to purify all the filtrate. Acetonitrile was evaporated and water was removed by lyophilization. The identities of the steroid esters were confirmed by electrospray ionization-mass spectrometry.

Analysis of CoA esters and reaction products
Reactions were performed with 50 μM cholyl-enoyl-CoA or cholyl-3-OH-CoA in 100 mM Hepes pH 7.5 buffer. CoA-ester substrates and products were analyzed on an Agilent 1200 HPLC liquid chromatograph interfaced with an Agilent UHD 6530 Q-Tof mass spectrometer at the Mass Spectrometry Facility of the Advanced Analysis Centre, University of Guelph. A C18 column (Agilent Poroshell 120, 50 mm × 4.6 mm, 2.7 μm) was used for chromatographic separation of 10 μl sample volume with the following solvents: water with 0.1% formic acid for A and acetonitrile with 0.1% formic acid for B. The mobile phase gradient was as follows: initial conditions, 5% B increasing to 100% B in 15 min followed by column wash at 100% B and 10 min reequilibration. The flow rate was maintained at 0.4 ml/min. The mass spectrometer electrospray capillary voltage was maintained at 4.0 kV and the drying gas temperature at 250 °C with a flow rate of 8 L/min. Nebulizer pressure was 30 psi, and the fragmentor was set to 160. Nitrogen was used as nebulizing gas, drying gas, and collision-induced dissociation gas. The mass-to-charge ratio was scanned across the m/z range of 50 to 2000 in 4 GHz (extended dynamic range) positive- and negative-ion auto MS/MS modes. Two precursor ions per cycle were selected for MS2 fragmentation scanning from 25 to 2000 m/z. The instrument was externally calibrated with the ESI Tuning Mix (Agilent Technologies). The data were analyzed using Agilent Qualitative Analysis software 10 (https://www.agilent.com/en/product/software-informatics/mass-spectrometry-software/data-analysis/qualitative-analysis).
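For readers reproducing the separation, the mobile phase program can be written as a simple piecewise function. The sketch below is illustrative only: the duration of the 100% B wash is not reported, so it is passed in as a caller-supplied parameter (`wash_min`, hypothetical), and re-equilibration is modeled as an immediate return to the initial 5% B.

```python
def percent_b(t_min, wash_min):
    """Return %B at time t_min (minutes) for the gradient described in the text.

    ASSUMPTION: wash_min, the length of the 100% B column wash, is not given
    in the source and must be chosen by the caller. Re-equilibration is
    modeled as a step back to the initial 5% B.
    """
    ramp_end = 15.0               # 5% B -> 100% B over 15 min (from the text)
    start_b, end_b = 5.0, 100.0
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= ramp_end:
        return start_b + (end_b - start_b) * t_min / ramp_end
    if t_min <= ramp_end + wash_min:
        return end_b              # column wash at 100% B
    return start_b                # re-equilibration at initial conditions


def run_volume_ml(wash_min, flow_ml_per_min=0.4, reequil_min=10.0):
    """Total solvent consumed per injection at the stated 0.4 ml/min flow rate."""
    return (15.0 + wash_min + reequil_min) * flow_ml_per_min
```

With a hypothetical 5 min wash, for example, `percent_b(7.5, 5.0)` gives the midpoint of the ramp and `run_volume_ml(5.0)` gives the solvent used over the 30 min cycle.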

Steady-state kinetic assays
All assays were performed in at least triplicate, in 100 mM sodium Hepes buffer (pH 7.5) with a total volume of 1 ml, at 25 °C, using a Varian Cary 3 spectrophotometer equipped with a temperature-controlled cuvette holder. Hydratase activity was determined by measuring the decrease in absorbance at 263 nm due to Cα-Cβ saturation during hydration (ε263 of 6700 M⁻¹ cm⁻¹) (Fig. S10) (51). Aldolase activity against cholyl-3-OH-CoA was assayed with 1 mM NAD⁺ and 7 nM C. testosteroni KF-1 Sad. Activity was determined by measuring the increase in absorbance at 340 nm corresponding to reduction of NAD⁺ to NADH (ε340 of 6220 M⁻¹ cm⁻¹). Data were fitted to the Michaelis-Menten equation by nonlinear regression using GraphPad Prism software (https://www.graphpad.com/).
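The two calculations underlying these assays, converting an absorbance slope to a rate with the Beer-Lambert law and fitting rates to the Michaelis-Menten equation, can be sketched as below. This is not the GraphPad Prism procedure: the fit is a small damped Gauss-Newton least-squares routine written with the standard library only, the 1 cm path length is an assumed default, and all function names are hypothetical.

```python
def rate_uM_per_min(delta_abs_per_min, epsilon_M_cm, path_cm=1.0):
    """Convert an absorbance slope (per min) to a rate in µM/min (Beer-Lambert).

    ASSUMPTION: a 1 cm path length unless stated otherwise. The magnitude is
    returned, so a decreasing signal (hydratase assay) gives a positive rate.
    """
    return abs(delta_abs_per_min) / (epsilon_M_cm * path_cm) * 1e6


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_michaelis_menten(substrate, rates, iterations=100):
    """Estimate (vmax, km) by damped Gauss-Newton nonlinear least squares."""
    # Initial guesses: largest observed rate, and the substrate concentration
    # whose rate is closest to half of it.
    vmax = max(rates)
    km = min(zip(substrate, rates), key=lambda p: abs(p[1] - vmax / 2.0))[0]

    def sse(vm, k):
        return sum((v - michaelis_menten(s, vm, k)) ** 2
                   for s, v in zip(substrate, rates))

    for _ in range(iterations):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for s, v in zip(substrate, rates):
            d_vmax = s / (km + s)                  # partial derivative wrt vmax
            d_km = -vmax * s / (km + s) ** 2       # partial derivative wrt km
            resid = v - vmax * d_vmax
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            g1 += d_vmax * resid
            g2 += d_km * resid
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        # Solve the 2x2 normal equations for the parameter step.
        step_vmax = (a22 * g1 - a12 * g2) / det
        step_km = (a11 * g2 - a12 * g1) / det
        current = sse(vmax, km)
        scale = 1.0
        while scale > 1e-6:                        # halve the step until it helps
            new_vmax = vmax + scale * step_vmax
            new_km = km + scale * step_km
            if new_km > 0 and sse(new_vmax, new_km) <= current:
                break
            scale /= 2.0
        else:
            break                                  # no improving step was found
        vmax, km = new_vmax, new_km
        if abs(scale * step_vmax) < 1e-12 and abs(scale * step_km) < 1e-12:
            break
    return vmax, km
```

As a usage illustration, an absorbance slope at 263 nm would be passed to `rate_uM_per_min` with the ε263 value given above, and the resulting rates at each substrate concentration passed to `fit_michaelis_menten`. Dividing the fitted `vmax` by the enzyme concentration would then give kcat.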

Protein crystallization and structure determination
Conditions for crystallization of Shy MaoC were screened using the JCSG+ and PACT premier kits (Molecular Dimensions Inc). Crystals used for data collection were grown using the sitting drop method at 4 °C, with 1 μl of reservoir solution (0.1 M sodium cacodylate, 0.2 M calcium acetate hydrate, 40% PEG 300) combined with 1 μl 30 mM acetyl-CoA and 1 μl of 20 mg/ml Shy MaoC. Crystals were soaked in Paratone N as cryoprotectant prior to freezing. The datasets were collected at the Canadian Light Source, Canadian Macromolecular Crystallography Facility (CMCF-ID), and processed using XDS (52). The structure was phased using molecular replacement in Phaser, using a ColabFold model of the Shy monomer as the molecular replacement model (53). Refinements were carried out in PHENIX refine (https://phenix-online.org) (54), with manual building in Coot (https://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/) (55). Initial refinements were carried out to 1.9 Å, with the resolution cut to 2.05 Å on the final round of refinement. Figures of protein structure were generated using PyMOL version 2.0 (https://pymol.org).

Modeling
The Shy MaoC complex with cholyl-22-(R)-hydroxy-24-methylthioate was initially modeled by hand in PyMOL, with the positioning of the substrate guided by the shape of the pocket and the interactions usually observed between the substrate and catalytic residues in productive hydratase complexes. Candidate complexes were then subjected to energy minimization in Rosetta (36).

Sequence analysis
Multiple sequence alignment was performed using Clustal Omega (56). A maximum likelihood tree was constructed using IQ-TREE with 1000 bootstrap replications (57). Sequences were selected from an NCBI BLAST search using Shy Ctest and the proteins identified via DALI search as queries, and gene synteny was confirmed by analysis of the corresponding genome sequences in the NCBI nucleotide database.

Figure 1. Overview of steroid side chain catabolism in Actinobacteria and Proteobacteria. A, overview of steroid side chain catabolism pathway. Names of relevant enzymes that catalyze the shared steps are indicated in green (actinobacterial) and blue (proteobacterial). For simplicity the steroid rings are not shown in the figure. B, gene arrangement of shy and sal in Proteobacteria and chsH1, chsH2, ltp2, and chsH3 in Actinobacteria. The same colors are used to depict the corresponding homologous genes.

Figure 2. Coomassie blue-stained 10% SDS-PAGE gel of copurified proteins from Ni²⁺-NTA chromatography. Loaded samples are purified from Escherichia coli BL21 LOBSTR expressing His-tagged Sal and untagged Shy (lane 2), His-tagged Sal and untagged Shy DUF35 (lane 3), His-tagged Shy MaoC (lane 4), and His-tagged Sal and untagged Shy MaoC (lane 5), along with molecular weight ladders (lanes 1 and 6). Molecular weights of relevant ladder bands in kDa are indicated on the left, while the positions of the expressed proteins are indicated on the right.

Figure 3. Structures of synthesized steroid substrates. Carbon atom numbering scheme and ring designations are shown for cholyl-enoyl-CoA.

Figure 4. Structure of Shy MaoC. A, secondary structure organization of Shy MaoC with secondary structure elements labeled. Side chains of the catalytic His and Asp residues are shown in sticks. B, topology diagram of Shy MaoC. Approximate positions of the catalytic His and Asp residues are indicated. C, organization of Shy MaoC homodimer with protomers colored blue and orange. Key secondary structure elements for the orange protomer are labeled. D, Shy MaoC homodimer depiction rotated 90° from (C).

Figure 5. Comparison of Shy MaoC, PaaZ MaoC, PhaJ Acav, and ChsH1-ChsH2 structures. A, structural alignment of Shy MaoC dimer (shown in blue) with PaaZ MaoC dimer (shown in yellow); secondary structure elements outside of the central helix are semi-transparent for clarity and the catalytic dyads are shown as sticks. B, structural alignment of Shy MaoC protomer with PaaZ MaoC protomer (colored as in A); secondary structure elements outside of the central β-sheet are translucent for clarity. C, structural alignment of Shy MaoC dimer with PhaJ Acav dimer (shown in green), displayed as in (A). D, structural alignment of Shy MaoC dimer with ChsH1-ChsH2 dimer (shown in cyan and lavender, respectively), displayed as in (A). E, structural alignment of Shy MaoC, PaaZ MaoC, PhaJ Acav, and ChsH1-ChsH2 showing the catalytic dyads, coordinated water of Shy MaoC and PhaJ Acav, and coordinated cadmium cation of ChsH1, colored as in (A-C).

Figure 6. Comparison of PhaJ Acav, PaaZ MaoC, ChsH1-ChsH2, and Shy MaoC substrate binding pockets. (A) PhaJ Acav, (B) PaaZ MaoC, (C) Shy MaoC, and (D) ChsH1-ChsH2. Each dimer shows a single substrate binding site in light orange. The protomer contributing the catalytic dyad is dark gray, with the residues shown as sticks. The opposite protomer is light gray, with the central helix and connecting loop that define the end of the binding site colored green and red, respectively. In ChsH1-ChsH2 and Shy MaoC the auxiliary domain is colored purple. Pockets were detected using HOLLOW (58) and their volumes were estimated with 3V (59).

Figure 7. Structure of Shy MaoC substrate binding site with modeled cholyl-22-(R)-hydroxy-24-methylthioate.Shy MaoC is shown in gray while the substrate is shown in purple.The functional groups of residues that define the binding cleft and participate in catalysis are shown in gray and magenta, respectively, and numbered (' indicates residues from the opposite protomer).Potential hydrogen bonds are shown as dashed lines with their length in angstroms indicated.Those between the substrate and catalytic residues are colored yellow, while those between the substrate and noncatalytic residues are gray.

Table 1
Steady-state kinetic parameters of Shy and Sal

Table 2
Data collection and refinement statistics
*Data for highest resolution shell shown in parentheses.