Structural basis of a bi-functional malonyl-CoA reductase (MCR) from the photosynthetic green non-sulfur bacterium Roseiflexus castenholzii

ABSTRACT Malonyl-CoA reductase (MCR) is a NADPH-dependent bi-functional enzyme that performs alcohol dehydrogenase and aldehyde dehydrogenase (CoA-acylating) activities in the N- and C-terminal fragments, respectively. It catalyzes the two-step reduction of malonyl-CoA to 3-hydroxypropionate (3-HP), a key reaction in the autotrophic CO2 fixation cycles of Chloroflexaceae green non-sulfur bacteria and the archaea Crenarchaeota. However, the structural basis underlying substrate selection, coordination, and the subsequent catalytic reactions of full-length MCR is largely unknown. For the first time, we here determined the structure of full-length MCR from the photosynthetic green non-sulfur bacterium Roseiflexus castenholzii (RfxMCR) at 3.35 Å resolution. Furthermore, we determined the crystal structures of the N- and C-terminal fragments bound with reaction intermediates NADP+ and malonate semialdehyde (MSA) at 2.0 Å and 2.3 Å, respectively, and elucidated the catalytic mechanisms using a combination of molecular dynamics simulations and enzymatic analyses. Full-length RfxMCR was a homodimer of two cross-interlocked subunits, each containing four tandemly arranged short-chain dehydrogenase/reductase (SDR) domains. Only the catalytic domains SDR1 and SDR3 incorporated additional secondary structures that changed with NADP+–MSA binding. The substrate, malonyl-CoA, was immobilized in the substrate-binding pocket of SDR3 through coordination with Arg1164 and Arg799 of SDR4 and the extra domain, respectively. Malonyl-CoA was successively reduced through protonation by the Tyr743–Arg746 pair in SDR3 and the catalytic triad (Thr165–Tyr178–Lys182) in SDR1 after nucleophilic attack from NADPH hydrides. IMPORTANCE The bi-functional MCR catalyzes NADPH-dependent reduction of malonyl-CoA to 3-HP, an important metabolic intermediate and platform chemical, from biomass. 
The individual MCR-N and MCR-C fragments, which contain the alcohol dehydrogenase and aldehyde dehydrogenase (CoA-acylating) activities, respectively, have previously been structurally investigated and reconstructed into a malonyl-CoA pathway for the biosynthetic production of 3-HP. However, no structural information for full-length MCR has been available to illustrate the catalytic mechanism of this enzyme, which greatly limits our capacity to increase the 3-HP yield of recombinant strains. Here, we report the cryo-electron microscopy structure of full-length MCR for the first time and elucidate the mechanisms underlying substrate selection, coordination, and catalysis in the bi-functional MCR. These findings provide a structural and mechanistic basis for enzyme engineering and biosynthetic applications of the 3-HP carbon fixation pathways.

Here, we determined for the first time the cryo-electron microscopy (cryo-EM) structure of full-length MCR from Roseiflexus castenholzii (20) (RfxMCR), which shares 58% sequence identity with CfxMCR (21), at 3.35 Å resolution. The crystal structures of NADP+-MSA-bound RfxMCR-N and RfxMCR-C were also determined at 2.0 Å and 2.3 Å resolution, respectively. Full-length RfxMCR was a homodimer of two cross-interlocked subunits that each contained four tandemly arranged SDR domains; only the catalytic SDR1 and SDR3 domains, which incorporated additional components (such as the α10 helix and the ED) into the SDR core, were capable of accommodating NADP+-MSA as reaction intermediates. Molecular dynamics (MD) simulations of full-length RfxMCR further revealed that malonyl-CoA binding required the cooperation of Arg1164 from SDR4 and Arg799 from the ED. Malonyl-CoA was successively reduced through a proton-relay network formed by the Tyr743-Arg746 pair in SDR3 and the catalytic triad Thr165-Tyr178-Lys182 in SDR1 after nucleophilic attack by the NADPH hydrides. Our work illustrates the molecular mechanisms underlying substrate selection, binding, and consecutive reduction by the bi-functional enzyme MCR. These findings will serve as the structural basis for future enzyme engineering and biosynthetic applications of the malonyl-CoA pathway for 3-HP production.

Protein expression and purification
The gene sequence encoding R. castenholzii MCR (Rcas_2929) was amplified from R. castenholzii DSM 13941 genomic DNA and inserted into the pET20b expression vector at the XhoI and NdeI restriction sites to construct the C-terminal His6-tagged expression vector. The sequenced plasmid was transformed into E. coli BL21(DE3) cells for the recombinant expression of C-terminal His6-tagged full-length RfxMCR. The gene sequences encoding RfxMCR-N (Met1-Phe572) and RfxMCR-C (Gly573-Val1229) were inserted into pET28a vectors at the XhoI and NcoI restriction sites and at the NdeI and NcoI restriction sites, respectively, to express the C-terminal and N-terminal His6-tagged recombinant proteins. The transformed cells were grown in 1 L Luria-Bertani broth containing 100 mg/mL ampicillin at 37°C until the optical density at 600 nm (OD600) reached 0.6-0.8. Gene expression was then induced with 0.2 mM isopropyl-β-D-thiogalactopyranoside overnight at 25°C. Cells were harvested by centrifugation at 7,500 × g for 10 minutes at 4°C and resuspended in wash buffer containing 50 mM Tris-HCl pH 8.0, 300 mM NaCl, and 2 mM MgCl2 prior to homogenization with a high-pressure homogenizer (Union, People's Republic of China). The insoluble cell debris was removed by centrifugation at 22,000 × g for 40 minutes at 4°C. The supernatant containing crude soluble proteins was loaded onto a Ni2+-chelating affinity chromatography column (GE Healthcare, Fairfield, CT, USA), which was rinsed with 100 mL of binding buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 2 mM MgCl2, 50 mM imidazole) to remove non-specifically bound proteins. The bound full-length RfxMCR, RfxMCR-N, and RfxMCR-C proteins were eluted with the binding buffer containing 300 mM imidazole. The eluates were further purified to 95% purity on a HiLoad 16/600 Superdex 200PG size exclusion column (GE Healthcare) with buffer containing 25 mM Tris-HCl pH 8.0 and 150 mM NaCl.

Cryo-electron microscopy
Three-microliter aliquots of full-length RfxMCR (0.3 mg/mL) were placed on glow-discharged GiG R1.2/1.3 300-mesh gold holey nickel titanium grids (Zhenjiang Lehua Electronic Technology, China) and blotted for 3.0 seconds under a blot force of 4 at 100% humidity and 16°C before being flash-frozen in liquid ethane with a Mark IV Vitrobot system (FEI). Micrographs were acquired on a Titan Krios microscope (FEI) operated at 300 kV with a K3 Summit direct electron detector (Gatan). SerialEM (22) was used for automatic data collection. A nominal magnification of ×22,500 was used for imaging, which yielded a pixel size of 1.07 Å. The defocus range was between 1.3 and 1.8 µm. Each micrograph was dose-fractionated to 32 frames under a dose rate of 9.4 e−/Å2 per second and an exposure time of 6.4 seconds, which resulted in a total dose of about 60 e−/Å2.
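The exposure arithmetic quoted above can be checked with a few lines of Python. This is only a sketch: the three constants are taken from the text, while the function names and the assumption that all 32 frames have equal duration are ours.

```python
# Exposure settings as reported in the text.
DOSE_RATE = 9.4      # electrons per square angstrom per second
EXPOSURE_TIME = 6.4  # seconds
N_FRAMES = 32        # frames per dose-fractionated micrograph


def total_dose(dose_rate: float, exposure_time: float) -> float:
    """Accumulated dose (e-/A^2) over the whole exposure."""
    return dose_rate * exposure_time


def dose_per_frame(dose_rate: float, exposure_time: float, n_frames: int) -> float:
    """Dose per frame, ASSUMING frames of equal duration (not stated in the text)."""
    return total_dose(dose_rate, exposure_time) / n_frames


print(f"total dose: {total_dose(DOSE_RATE, EXPOSURE_TIME):.2f} e-/A^2")
print(f"per frame:  {dose_per_frame(DOSE_RATE, EXPOSURE_TIME, N_FRAMES):.2f} e-/A^2")
```

The product of the reported dose rate and exposure time is 60.16 e−/Å2, consistent with the stated total of about 60 e−/Å2.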

Image processing
Motion correction and exposure weighting were performed with the MotionCor2 program (23), and the contrast transfer function (CTF) parameters were estimated using the CTFFIND program (24). All image processing steps were performed using the RELION 3.0 (25) and CryoSPARC (26) programs. To generate a template for two-dimensional (2D) classification, 14,112 particles were auto-picked from 20 micrographs and subjected to 2D classification. Using a reference generated from 2D classification of 12,105 particles, 4,043,667 particles were auto-picked from 4,647 micrographs and imported into CryoSPARC. After five iterative rounds of 2D classification, 2,391,743 particles were selected and imported into RELION for three-dimensional (3D) classification. Using the best 3D class as a reference, 3D classification generated four classes of particles. Then, 1,766,605 particles from the best two classes (42.89% and 30.97%, respectively) were selected, extracted, and subjected to another round of reference-based 3D classification and non-uniform refinement in CryoSPARC, which produced an EM map with a global resolution of 3.35 Å based on the gold-standard Fourier shell correlation (FSC). Local resolution was estimated with ResMap (27) (Fig. S2).
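As a quick consistency check on the particle bookkeeping above, the following sketch recomputes the retained fractions and compares the reported 3D-class percentages with the reported particle count. All numbers are those quoted in the text; the function names are illustrative, and the tolerance reflects only the two-decimal rounding of the percentages.

```python
PICKED = 4_043_667               # auto-picked particles
AFTER_2D = 2_391_743             # kept after five rounds of 2D classification
AFTER_3D = 1_766_605             # kept from the best two 3D classes
CLASS_PERCENTS = (42.89, 30.97)  # share of the two best 3D classes


def retained_fraction(kept: int, total: int) -> float:
    """Fraction of particles surviving a selection step."""
    return kept / total


def expected_from_percents(total: int, percents) -> float:
    """Particle count implied by the reported class percentages."""
    return total * sum(percents) / 100.0


expected = expected_from_percents(AFTER_2D, CLASS_PERCENTS)
print(f"kept after 2D: {retained_fraction(AFTER_2D, PICKED):.1%}")
print(f"kept after 3D: {retained_fraction(AFTER_3D, PICKED):.1%}")
print(f"implied by class percentages: {expected:.0f} (reported {AFTER_3D})")
```

The count implied by the rounded percentages differs from the reported 1,766,605 by fewer than 100 particles, which is within the rounding error of percentages given to two decimals.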

Model building, refinement, and validation
Based on the cryo-EM density map, de novo atomic model building of RfxMCR was conducted in Coot (28), and the model was then refined by real-space refinement in PHENIX (29,30). All figures were drawn in PyMOL (The PyMOL Molecular Graphics System, Version 2.5.2; Schrödinger, LLC), UCSF Chimera (31), or ChimeraX (32). The refinement statistics are summarized in Table 2.

Crystallization of NADP+-MSA-bound RfxMCR-N and RfxMCR-C
The purified RfxMCR-N and RfxMCR-C were concentrated using an Amicon Ultra centrifugal filter device (10-kDa molecular weight cutoff; Millipore) at 4°C. Protein concentrations were determined using a NanoDrop device (IMPLEN) by recording the absorption at 280 nm. The protein samples were diluted to 25 mg/mL in buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl) for crystallization. RfxMCR-N and RfxMCR-C were incubated with the cofactor NADP+ at a 1:10 molar ratio for 30 seconds at 50°C before crystallization. Crystallization was performed using the hanging-drop vapor diffusion method, with 1.2 µL of protein sample mixed with an equal volume of reservoir solution, and the mixture was equilibrated against 200 µL of reservoir solution. Crystals of NADP+-MSA-bound RfxMCR-N were obtained with a reservoir solution containing 1 M sodium malonate pH 5.0, 0.1 M sodium acetate trihydrate pH 4.5, and 3% polyethylene glycol 20,000 at 16°C. NADP+-MSA-bound RfxMCR-C was crystallized with a reservoir solution containing 1.6 M sodium formate and 0.1 M Bis-tris propane pH 7.0 at 16°C.

Crystal data collection, structure determination, and refinement
The optimized crystals were cryo-protected by adding 30% glycerol to the reservoir solution and flash-frozen in liquid nitrogen. A 2.0-Å data set of NADP+-MSA-bound RfxMCR-N and a 2.3-Å data set of NADP+-MSA-bound RfxMCR-C were collected at SSRF BL10U2 (Table 3). Diffraction data were automatically processed, integrated, and scaled with Porpoise XDS software (33). The quality of the data was assessed using SFCHECK (34), and the solvent content was calculated using Matthews_Coef from the CCP4 package (35,36). The NADP+-MSA-bound RfxMCR-N and RfxMCR-C structures were determined by the molecular replacement method using the cryo-EM structural model of full-length RfxMCR as a search model. The Phaser program (37) from the CCP4 package was employed to determine the initial phases; iterative model building and refinement were performed using Coot (28), Refmac5 (38), and Phenix (29) to obtain the refined model (Table 3).

Site-directed mutagenesis and enzyme activity assay
Site-directed mutagenesis of the catalytic triad and of the amino acid residues involved in NADP+-MSA binding was performed using a mutagenesis kit (Yeasen). The constructed plasmids containing the mutated gene sequences were sequenced and transformed into E. coli BL21(DE3) cells for expression of the mutant proteins. The mutant proteins were expressed and purified following the same procedure and under the same conditions as the wild-type proteins.
The enzymatic activity of RfxMCR was assayed spectrophotometrically at 50°C by measuring the absorbance of NADPH at 340 nm (ε = 6.22 mM−1·cm−1). The standard assay mixture (200 µL) was composed of 100 mM Tris-HCl pH 8.0, 150 mM NaCl, 5 mM MgCl2, 0.4 mM NADPH, and 0.1 nmol RfxMCR as the enzyme. Enzyme concentration was determined by the Bradford method. The absorbance of NADPH at 340 nm was recorded for 10 minutes at 50°C. One unit of enzymatic activity was defined as the amount of enzyme that catalyzes the oxidation of 1 µmol NADPH per minute. The apparent Michaelis-Menten constant (Km) and Vm were determined from the reaction velocities measured at malonyl-CoA concentrations of 0.05, 0.1, 0.15, 0.2, 0.4, 0.5, and 1.0 mM (Table 1). The kcat and Km values were determined by non-linear least-squares fitting. The enzymatic activity of the site-directed mutants was measured under the same conditions as that of wild-type RfxMCR. All the enzymatic data were obtained from triplicate experiments.
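The two calculations described above, converting an absorbance slope into enzyme units and fitting the Michaelis-Menten equation by least squares, can be sketched as follows. This is a minimal illustration and not the authors' fitting procedure: the 1-cm path length is an assumption (the text does not state it), the fitting routine (a scan over Km followed by golden-section refinement) is our own choice, and the velocities in the usage note are synthetic values generated from the model, not measured data.

```python
import math

EPSILON_NADPH = 6.22   # mM^-1 cm^-1 at 340 nm (from the text)
PATH_LENGTH_CM = 1.0   # ASSUMPTION: the optical path length is not stated
ASSAY_VOLUME_ML = 0.2  # 200 uL standard assay mixture (from the text)


def nadph_rate_mM_per_min(delta_a340_per_min: float) -> float:
    """Beer-Lambert: convert an absorbance slope to mM NADPH oxidized per minute."""
    return abs(delta_a340_per_min) / (EPSILON_NADPH * PATH_LENGTH_CM)


def units_in_assay(delta_a340_per_min: float) -> float:
    """Enzyme units (umol NADPH per minute) in the assay volume."""
    # 1 mM equals 1 umol/mL, so multiplying by mL gives umol/min.
    return nadph_rate_mM_per_min(delta_a340_per_min) * ASSAY_VOLUME_ML


def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Reaction velocity at substrate concentration s."""
    return vmax * s / (km + s)


def _best_vmax(km, s_values, v_values):
    """For fixed Km the model is linear in Vmax, so Vmax has a closed form."""
    f = [s / (km + s) for s in s_values]
    vmax = sum(v * fi for v, fi in zip(v_values, f)) / sum(fi * fi for fi in f)
    sse = sum((v - vmax * fi) ** 2 for v, fi in zip(v_values, f))
    return vmax, sse


def fit_michaelis_menten(s_values, v_values, km_lo=1e-3, km_hi=10.0):
    """Least-squares fit: log-spaced scan over Km, then golden-section refinement."""
    n = 200
    grid = [km_lo * (km_hi / km_lo) ** (i / n) for i in range(n + 1)]
    best = min(range(n + 1), key=lambda i: _best_vmax(grid[i], s_values, v_values)[1])
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n)]
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if _best_vmax(c, s_values, v_values)[1] < _best_vmax(d, s_values, v_values)[1]:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = (a + b) / 2
    return _best_vmax(km, s_values, v_values)[0], km


# Substrate concentrations (mM) used in the text; velocities are SYNTHETIC.
S_MM = [0.05, 0.1, 0.15, 0.2, 0.4, 0.5, 1.0]
synthetic_v = [michaelis_menten(s, vmax=1.0, km=0.38) for s in S_MM]
vmax_fit, km_fit = fit_michaelis_menten(S_MM, synthetic_v)
print(f"fitted Vmax = {vmax_fit:.3f} (arbitrary units), Km = {km_fit:.3f} mM")
```

With noise-free synthetic input the fit simply recovers the parameters used to generate it; with real triplicate data, the same routine would be applied to the measured velocities, and kcat would follow from Vmax divided by the enzyme concentration in the assay.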

Sedimentation velocity analytical ultracentrifugation
Sedimentation velocity analytical ultracentrifugation (AUC) was performed to check the oligomerization state of RfxMCR-N and RfxMCR-C in solution. Sedimentation experiments were performed on a Beckman Coulter ProteomeLab XL-I ultracentrifuge using a 4-hole An-60Ti rotor. Protein samples with an initial absorbance at 280 nm of approximately 0.5-0.8 were equilibrated for 2 hours at 20°C under vacuum prior to sedimentation. The absorbance at 280 nm was measured in continuous scan mode during sedimentation at 55,000 rpm in 12-mm double-sector cells. The data were analyzed using SEDFIT (39).

Molecular dynamics simulation and binding free-energy calculations
To obtain the binding structure of full-length RfxMCR with the substrate malonyl-CoA, the structures of full-length RfxMCR and NADP+-MSA-bound RfxMCR-C were superimposed with the CoA-bound PdMCR-C (PDB 6K8T) (19) to construct the binding structure of NADP+-CoA-MSA-bound full-length RfxMCR. The structure of full-length RfxMCR in complex with NADPH-malonyl-CoA was then obtained through structure editing using the UCSF ChimeraX software (32). The missing residues in the determined structures were also added with UCSF ChimeraX. To investigate the binding features of full-length RfxMCR in complex with the reaction intermediate NADP+-MSA and with the substrate for the second reduction reaction, NADPH-MSA, the binding structures of full-length RfxMCR with NADP+-MSA and NADPH-MSA were constructed based on the crystal structures of NADP+-MSA-bound RfxMCR-C and RfxMCR-N, respectively.
To characterize the binding features of full-length RfxMCR with malonyl-CoA and MSA, MD simulations were performed on the binding structures of full-length RfxMCR bound with NADPH-malonyl-CoA, NADP+-MSA (bound at the C-terminal domain of full-length RfxMCR), and NADPH-MSA (bound at the N-terminal domain of full-length RfxMCR) using the AMBER 20 software (40). Each binding complex was immersed in the center of a truncated octahedron box of TIP3P water molecules with a margin distance of 12.0 Å. Sodium counterions were added to keep the system electrically neutral. The AMBER ff14SB force field was applied for full-length RfxMCR (41). The force field parameters of NADPH and NADP+ were retrieved from previous reports (42,43). For malonyl-CoA, the atomic partial charges were calculated using the restrained electrostatic potential (RESP) method at the HF/6-31G(d) level on the structures optimized at the B3LYP/6-31G(d) level (44). The other force field parameters of malonyl-CoA were generated from the Generalized Amber Force Field with the Antechamber module of AmberTools (45). Following the same procedure as in our previous report (46), each MD simulation was conducted on a time scale of 100 ns.
To evaluate the binding affinities between full-length RfxMCR and the bound substrate and cofactors, their binding free energies (ΔG_bind) were obtained through the molecular mechanics/generalized Born surface area (MM/GBSA) calculation approach (47):

\[ \Delta G_{\mathrm{bind}} = \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solv}} - T\Delta S \]
\[ \Delta E_{\mathrm{MM}} = \Delta E_{\mathrm{vdW}} + \Delta E_{\mathrm{ele}} \]
\[ \Delta G_{\mathrm{solv}} = \Delta G_{\mathrm{GB}} + \Delta G_{\mathrm{SA}} \]

where E_MM is the gas-phase interaction energy comprising the van der Waals energy (E_vdW) and the electrostatic energy (E_ele). G_solv is the solvation free energy, including the contributions from a polar part (G_GB) and a non-polar part (G_SA). ΔG_GB was estimated using the generalized Born model with the interior and exterior dielectric constants set to 4 and 80, respectively (48). ΔG_SA was estimated using the LCPO algorithm: ΔG_SA = γΔSASA + β, where γ and β were set to 0.0072 and 0, respectively (49). The solute entropy term TΔS is sometimes approximated by normal mode entropy (50), but such treatment rarely leads to improvement in the correlation with experiments (51). Therefore, the solute entropy term was not included in the current study. In calculating ΔG_bind, 200 snapshots were evenly extracted from the last 20 ns of each trajectory for the calculation of ΔE_vdW, ΔE_ele, ΔG_GB, and ΔG_SA.
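The MM/GBSA bookkeeping described above (gas-phase van der Waals and electrostatic terms plus polar and non-polar solvation terms, with the entropy term omitted) can be sketched as a short calculation. This is an illustration only, not the AMBER implementation: γ and β are the values given in the text, while the function names, the snapshot-time helper (which assumes a 100-ns run sampled over its last 20 ns), and the two example snapshots are hypothetical placeholders, not simulation results.

```python
from statistics import mean

GAMMA = 0.0072  # surface-area coefficient quoted in the text
BETA = 0.0      # offset quoted in the text


def delta_g_sa(delta_sasa: float) -> float:
    """Non-polar solvation term: gamma * delta-SASA + beta."""
    return GAMMA * delta_sasa + BETA


def delta_g_bind(d_e_vdw: float, d_e_ele: float, d_g_gb: float, delta_sasa: float) -> float:
    """Single-snapshot binding free energy; the entropy term is omitted."""
    return d_e_vdw + d_e_ele + d_g_gb + delta_g_sa(delta_sasa)


def mean_delta_g_bind(snapshots) -> float:
    """Average over snapshots given as (dE_vdW, dE_ele, dG_GB, dSASA) tuples."""
    return mean(delta_g_bind(*s) for s in snapshots)


def snapshot_times(start_ns: float = 80.0, end_ns: float = 100.0, n: int = 200):
    """Evenly spaced sampling times (ns) over the chosen trajectory window."""
    step = (end_ns - start_ns) / n
    return [start_ns + step * (i + 1) for i in range(n)]


# HYPOTHETICAL component values, for illustration only.
example = [(-60.0, -120.0, 130.0, -900.0), (-62.0, -118.0, 128.0, -910.0)]
print(f"mean dG_bind over example snapshots: {mean_delta_g_bind(example):.3f}")
print(f"snapshots: {len(snapshot_times())}, spacing: 0.1 ns")
```

In practice the per-snapshot components would come from the MM/GBSA tool's output; the sketch only shows how the terms combine and how 200 evenly spaced frames map onto a 20-ns window.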

Cryo-EM structure of full-length RfxMCR
The gene sequence (Rcas_2929) encoding full-length RfxMCR (Met1-Val1229) was expressed with a C-terminal His6-tag in E. coli BL21(DE3) cells (17). The encoded recombinant protein was purified via nickel-nitrilotriacetic acid affinity and size exclusion chromatography (Fig. S1A and B). Gel filtration analysis of purified RfxMCR revealed an elution peak at 59.8 mL (Fig. S1A), which corresponded to the elution profile of an RfxMCR dimer with a calculated molecular weight of ~270 kDa. Additionally, a single protein band with a molecular weight near 268 kDa was observed in the gel after Native PAGE (Fig. S1C), confirming the existence of the RfxMCR dimer in solution. Using malonyl-CoA as the substrate and NADPH as the cofactor, purified full-length RfxMCR was able to catalyze the reduction of malonyl-CoA to 3-HP (Fig. 1A and B). The apparent catalytic constant (kcat) was 5.65 ± 0.57 per second, and the Michaelis constant (Km) was 0.38 ± 0.08 mM for malonyl-CoA (Table 1). Compared to CfxMCR, RfxMCR exhibited a lower catalytic efficiency for malonyl-CoA reduction, characterized by decreased substrate-binding affinity and turnover number (17). This difference may be attributed to the varied amino acid residues that affect the conformation of the substrate-binding pocket, or it may result from the different temperature and pH values used for measuring the enzyme activity.
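The catalytic efficiency implied by the kinetic constants quoted above is kcat/Km. The sketch below computes it together with a first-order propagated uncertainty; the four input values are those reported in the text, whereas the function names and the assumption that the errors in kcat and Km are independent are ours.

```python
import math

K_CAT = 5.65     # s^-1 (value quoted in the text)
K_CAT_SD = 0.57  # s^-1
K_M = 0.38       # mM
K_M_SD = 0.08    # mM


def catalytic_efficiency(k_cat: float, k_m: float) -> float:
    """kcat/Km in mM^-1 s^-1."""
    return k_cat / k_m


def propagated_sd(k_cat: float, k_cat_sd: float, k_m: float, k_m_sd: float) -> float:
    """First-order error propagation for a ratio, ASSUMING independent errors."""
    relative = math.sqrt((k_cat_sd / k_cat) ** 2 + (k_m_sd / k_m) ** 2)
    return catalytic_efficiency(k_cat, k_m) * relative


eff = catalytic_efficiency(K_CAT, K_M)
sd = propagated_sd(K_CAT, K_CAT_SD, K_M, K_M_SD)
print(f"kcat/Km = {eff:.1f} +/- {sd:.1f} mM^-1 s^-1")
```

This gives a catalytic efficiency of roughly 15 mM−1·s−1, with most of the relative uncertainty coming from Km.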
Table 1 footnote: The kinetic parameters were determined under the optimal conditions. Data represent mean ± SD (N = 3).

Prior to our study, the structure of full-length MCR had not been reported. To investigate the dimer structure, RfxMCR was imaged on a 300 kV Titan Krios cryo-EM with a K3 Summit direct electron detector (Gatan) in counting mode (Fig. S1D). From 4,647 micrographs, 4,043,667 particles were selected and used to construct an electron potential map with an average resolution of 3.35 Å (Fig. S1D through G; Fig. S2). The final reconstructed cryo-EM map was clearly resolved and enabled us to build an accurate model of the protein side chains of full-length RfxMCR (Table 2; Fig. S3A). The overall structure of RfxMCR was a homodimer of two cross-interlocked subunits with the N- and C-terminal ends assembled in the same direction; the dimensions were 160 × 110 × 95 Å (Fig. 1C). Each subunit was composed of four tandemly arranged SDR domains (SDR1-4).
The ED (Gly783-Arg932), which comprised eight α-helices, was inserted into the SDR3 domain (Gly573-Glu782 and Met933-Gly962) and was exposed at the two sides of the homodimer (Fig. 1C and D). Each SDR domain adopted a typical Rossmann fold.

From the full-length RfxMCR structure, we first resolved the conformation of the region connecting RfxMCR-N and RfxMCR-C, which was folded into an α-helix (G573WAESL578). In the center of the homodimer, two of these α-helices were arranged anti-parallel to each other (Fig. 1E). Gel filtration and Native-PAGE analyses of the site-directed mutants W574A/E576A/F579A showed the same dimer formation as wild-type RfxMCR (Fig. S1H and I), indicating that this helix is not essential for forming the homodimer. Instead, it plays an important role in connecting the RfxMCR-N and RfxMCR-C fragments and also serves as a point of intersection for the two subunits to form a dimer. This also confirmed that the two subunits were cross-interlocked to form the homodimer; this finding was inconsistent with the simulated SAXS model of PdMCR, in which the homodimer is formed by two parallel-contacted subunits (19). The dimer interface itself was formed by symmetrically distributed hydrogen bonding interactions between amino acid residues from the RfxMCR-N (SDR1-SDR2) and RfxMCR-C (SDR3-ED-SDR4) regions of the two subunits (Fig. S4A and B).

Crystal structure of substrate-bound RfxMCR-N and RfxMCR-C
Attempts to obtain a cryo-EM structure of substrate-bound full-length RfxMCR were not successful. We did not resolve any densities of substrate or cofactor from the cryo-EM maps of full-length RfxMCR incubated with NADP+ or malonyl-CoA (Fig. S3E and F). To elucidate the substrate binding and catalytic mechanisms of RfxMCR, we therefore determined the crystal structures of RfxMCR-N (Met1-Phe572) and RfxMCR-C (Gly573-Val1229) bound with NADP+-MSA (Fig. 2 and 3). The crystal structure of RfxMCR-N in complex with NADP+-MSA was determined by the molecular replacement method and refined to an Rwork of 19.62% and an Rfree of 20.81% at 2.0 Å resolution (Table 3). Although the P6₅22 crystal contained only an RfxMCR-N monomer in one asymmetric unit, it was packed into a homodimer in the crystal (Fig. S4C). This was consistent with both the gel filtration and AUC analyses, which showed the existence of an RfxMCR-N dimer in solution (Fig. S5A and D). Superimposition of the crystal-packed RfxMCR-N dimer with full-length RfxMCR yielded a root mean square deviation (RMSD) of 1.031 Å (Fig. S4D), indicating that RfxMCR-C truncation did not affect the overall conformation of RfxMCR-N.
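For readers unfamiliar with the RMSD metric used above, the following sketch shows how it is computed for two sets of matched atoms. It assumes the two structures have already been superimposed and that atoms are paired in the same order; the optimal superposition step itself (for example, by a Kabsch-type alignment in a structure-analysis program) is not shown, and the coordinates in the example are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation between two matched, pre-superimposed atom sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates (in angstroms): every atom is displaced by 1 A along z.
set_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(f"RMSD = {rmsd(set_a, set_b):.3f} A")
```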
Although SDR1 (Met1-Pro292) and SDR2 (Thr293-Gly572) in RfxMCR-N had identical architectures, only SDR1, which contained the conserved Tyr-X-X-X-Lys motif (52), was resolved with NADP+ bound (Fig. 2A; Fig. S6A). Surprisingly, we observed an extra electron density that precisely matched the reduction intermediate MSA near the nicotinamide ring of NADP+ (Fig. S3C). The carboxyl group of MSA was immobilized by hydrogen bonding interactions with Tyr172 (2.6 Å), Arg175 (2.7 Å), and Tyr178 (2.9 Å), whereas the aldehyde group was hydrogen bonded with the hydroxyl group of Thr165 (2.9 Å) (Fig. 2B). NADP+ was also coordinated in the cofactor-binding pocket through extensive hydrogen bonding interactions (Fig. 2B). The nicotinamide ring was stabilized by a hydrogen bond with Ser213 (3.2 Å), and the ribose oxygen formed a hydrogen bond with the amino nitrogen of Lys182 (2.7 Å). The free oxygen atom of the pyrophosphate group was hydrogen bonded with the guanidine side chain of Arg215 (3.1 Å). On the other side, the guanidine amine group of Arg41 covered the adenine ring and formed a hydrogen bond with one oxygen of the 2′-phosphate group of the ribose ring (3.0 Å). The adenosine ring was further hydrogen bonded with Asp70 (3.0 Å) and Asn138 (3.0 Å) (Fig. 2B). Site-directed mutation of Thr165, Tyr172, Arg175, Tyr178, and Lys182 to Ala resulted in a dramatic decrease in the overall enzymatic activity of full-length RfxMCR (Fig. 2C), confirming the essential role of these amino acid residues in malonyl-CoA reduction.
The crystal structure of RfxMCR-C in complex with NADP+-MSA was also determined by the molecular replacement method and refined to an Rwork of 22.05% and an Rfree of 25.28% at 2.3 Å resolution (Table 3). In contrast to RfxMCR-N, both the gel filtration and AUC analyses showed the existence of an RfxMCR-C monomer in solution (Fig. S5C and E). The overall structure contained 28 α-helices and 13 β-strands that were organized into tandemly arranged SDR3 (Gly573-Glu782 and Met933-Gly962), ED (Leu783-Thr932), and SDR4 (Phe963-Val1229) domains (Fig. S3B). Superimposition of a crystal-packed RfxMCR-C dimer with full-length RfxMCR yielded an RMSD of 0.995 Å (Fig. S4E and F), with the dimer interface stabilized by extensive hydrogen-bonding interactions similar to those of the full-length RfxMCR dimer (Fig. S4B). Similar to RfxMCR-N, an NADP+ cofactor and the intermediate MSA were resolved at the interface between the SDR3 and ED domains (Fig. 3A; Fig. S3D). In the substrate-binding pocket, MSA was immobilized by hydrogen bonding interactions with Ser731 (3.0 Å) and Tyr749 (3.2 Å) from the SDR3 domain and Arg799 (3.5 Å) from the ED. NADP+ was coordinated in the cofactor-binding pocket through extensive hydrogen bonding interactions with Tyr749 (2.6 Å), Lys753 (3.1 Å), Ser593 (2.7 Å), Arg616 (2.9 Å), and Asp651 (3.2 Å). The essential catalytic roles of these residues were also confirmed by enzymatic analyses (Fig. 3C). Mutation of the NADP+-MSA coordinating residues Ser731, Tyr749, Lys753, and Arg799 completely eliminated enzymatic activity. In addition, mutagenesis of Arg746, Arg785, and Leu795 (the highly conserved residues that are essential for CoA binding in PdMCR) markedly decreased RfxMCR enzymatic activity in each case (Fig. 3C; Fig. S7).

Fig. 3 (caption, partial): (C) Enzymatic analyses of mutant forms of full-length RfxMCR. Several residues were mutated to Ala: members of the catalytic triad (Ser731-Lys753-Tyr749); Arg799, which is necessary for coordinating MSA; and the highly conserved amino acid residues Arg746, Arg785, and Leu795, which are essential for CoA binding in Porphyrobacter dokdonensis MCR. All the enzymatic data were obtained from triplicate experiments. WT, wild type.

RfxMCR conformational changes occurred during substrate binding
To investigate the conformational changes that occurred during substrate binding, we compared the NADP+-MSA-bound RfxMCR-N and RfxMCR-C structures with that of full-length RfxMCR (Fig. 4). Superimposition of NADP+-MSA-bound RfxMCR-N with full-length RfxMCR revealed striking conformational changes in the substrate-binding pocket. Specifically, the side chains of Arg41 and Asp70 in SDR1 were flipped by ~7.2 Å and 9.3 Å, respectively, to stabilize the NADP+ adenine ring. Notably, the side chains of members of the catalytic triad (Thr165-Tyr178-Lys182) were also shifted toward the substrate-binding pocket, as were the side chains of Tyr172 and Arg175, which were essential for coordinating MSA. In particular, the α10 helix (Ser213-Gly228) was rotated by ~20° toward the substrate-binding pocket to cover MSA and the NADP+ nicotinamide ring (Fig. 4A). This resulted in the closure of the MSA-binding site and shrinking of the substrate-binding pocket (Fig. 4B). Interestingly, no conformational changes were observed for the corresponding helix between the apo- and NADP+-bound PdMCR-N structures (PDB 6K8V and 6K8W), which adopted similar conformations to that of full-length RfxMCR; their nicotinamide ring-binding sites were still exposed when NADP+ was bound (Fig. 4B; Fig. S8A and B). These analyses indicated that the dramatic conformational changes of the α10 helix were specific structural features that occurred when the reaction intermediate MSA was bound. Similarly, extensive side chain conformational changes were observed in the RfxMCR-C substrate-binding pocket during NADP+-MSA binding (Fig. 4C). When NADP+-MSA was bound, members of the SDR3 catalytic triad (Ser731-Lys753-Tyr749) were shifted toward the substrate-binding pocket to form hydrogen bonds with NADP+ and MSA.
Specifically, the guanidine side chain of Arg799 was flipped by ~1.6 Å to stabilize the MSA carbonyl group, and the side chains of Asp651 and Arg616 were shifted by ~2.0 Å and 4.1 Å, respectively, to immobilize the adenosine ring of NADP+; this further resulted in partial closure of the adenosine ring-binding site in NADP+-MSA-bound RfxMCR-C (Fig. 4D; Fig. S8C and D). In PdMCR-C bound with NADP+ (PDB 6K8U) and CoA (PDB 6K8T), the substrate-binding pockets adopted similar conformations, even with respect to the secondary structures (Fig. S8C). There was a lack of clear electron density in the region of Glu782-Pro793 (corresponding to the α7 helix in PdMCR-C); however, the nicotinamide ring-binding site was exposed in NADP+-MSA-bound RfxMCR-C, whereas it was closed in PdMCR-C bound with NADP+ and CoA (Fig. 4D; Fig. S8D through G), indicating that this region likely played important roles in CoA binding.

Malonyl-CoA-binding conformations revealed by MD simulations
To further explore the catalytic mechanism of RfxMCR, MD simulations were performed for the substrate malonyl-CoA, the cofactor NADPH (Fig. 5A), and the reaction intermediates NADP(H)-MSA with full-length RfxMCR. Owing to the differences in the MD starting structures and the bound cofactor, as well as the substrate-induced conformational changes, RfxMCR in the binding complexes RfxMCR-NADPH-malonyl-CoA, RfxMCR-NADP+-MSA, and RfxMCR-NADPH-MSA showed different RMSD profiles. However, RfxMCR together with the bound cofactor and substrate showed converged RMSD profiles in the second half of all MD simulations, indicating that the simulations had reached the equilibration stage (Fig. S9A through C). An MM/GBSA calculation (47) revealed that malonyl-CoA had the most favorable binding free energy to full-length RfxMCR, −77.76 kcal/mol (Table 4), indicating that malonyl-CoA was a much more favorable substrate than MSA for binding to RfxMCR. The binding free energy was further decomposed to identify per-residue contributions (Fig. 5B). The largest per-residue contribution came from Arg1164 in the SDR4 domain, which formed hydrogen bonds with the pyrophosphate oxygen of CoA and the guanidine nitrogen of Arg799 (Fig. 5B and C). Similarly, the residues Arg799, Lys807, and Lys919 (from the ED) contributed to immobilizing the CoA moiety through hydrogen bonding interactions (Fig. 5C). The essential roles of these amino acid residues in malonyl-CoA binding were further verified by the decreased binding free energy of Arg799 and the absence of Lys807 and Lys919 in the per-residue decomposition of NADPH-MSA-bound RfxMCR (Fig. S9D). These results were also in agreement with the structural comparisons of NADP+-MSA-bound RfxMCR-C and full-length RfxMCR, in which Arg799 underwent dramatic conformational changes during MSA binding (Fig. 4C).
Importantly, the malonate moiety of malonyl-CoA was hydrogen bonded with residues Tyr749, Arg746, and Tyr743 within the sequence motif YXXRXXY, which was conserved between the SDR3 and SDR1 domains (Fig. 5C; Fig. S6C). In contrast with Arg746 and Tyr749, which contributed to both malonyl-CoA and MSA binding, Tyr743 made significant contributions to malonate binding but no contributions to MSA binding (Fig. 5B through D; Fig. S9D). This indicated that Tyr743 played an essential role in malonyl-CoA reduction.

MD simulations of the reduction intermediate NADP(H)-MSA with full-length RfxMCR revealed a relatively higher binding affinity with the SDR1-2 domains than with SDR3-ED-SDR4 (Table 4), demonstrating that it was more favorable for MSA to bind to the SDR1-2 domains to proceed through the second reduction step of MSA to 3-HP. Consistent with the crystal structure of NADP+-MSA-bound RfxMCR-N, Thr165, Tyr172, Arg175, and Tyr178 contributed considerable free energies for MSA binding to full-length RfxMCR. Specifically, Arg175 and Arg215 contributed the largest free energies for MSA binding (Fig. 5D; Fig. 2B). Notably, the amino acid residues Arg215, Ile216, and Met219, which had moderate binding free energies, were all located within the α10 helix (Ser213-Gly228) that underwent striking conformational changes during MSA binding (Fig. 5D; Fig. 4A). Superimposition analyses revealed that, in the simulated NADPH-MSA-bound full-length RfxMCR, the α10 helix also adopted the same conformation that was observed in the crystal structure of NADP+-MSA-bound RfxMCR-N (Fig. 5E and F). This confirmed that the conformational changes of the α10 helix were necessary for MSA binding and closure of the binding pocket for subsequent reduction reactions.

Molecular mechanism of RfxMCR in catalyzing the two-step reduction of malonyl-CoA to 3-HP
In the equilibrated binding structure with NADPH-malonyl-CoA, electrostatic interactions were observed between the carbonyl C3 atom and the hydride from the NADPH nicotinamide ring (3.4 Å); this provided the prerequisite for a nucleophilic attack at the C3 atom. On the other side of the molecule, oxygen atoms at the malonyl end were immobilized in a chain of hydrogen bonding interactions with the OH of Tyr743 (2.7 Å), the NH1 of Arg746 (2.9 Å), and Tyr749 (2.8 Å) (Fig. 5C). Based on the crystal structures of NADP+-MSA-bound RfxMCR-N and RfxMCR-C, and the binding characteristics of full-length RfxMCR with NADPH-malonyl-CoA and NADP(H)+-MSA, we propose a catalytic mechanism for the two consecutive reduction reactions (Fig. 6A). Initially, the hydride from the NADPH nicotinamide group launches a nucleophilic attack at the carbonyl C3 atom of malonyl-CoA, leading to breakage of the S-C bond and generation of MSA, a CoAS− thioanion, and NADP+ (Fig. 6B). Subsequently, CoAS− is protonated through sequential deprotonation of the Tyr743 hydroxyl group and the Arg746 guanidine amine; the deprotonated Arg746 could be restored by surrounding solvent. As indicated in the structural analyses of NADP+-MSA-bound RfxMCR-C, the reaction intermediate MSA could be further stabilized in a proton relay network by hydrogen bonding interactions with Ser731, Tyr749, and Arg799 for the second reduction step (Fig. 3B).
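The geometric reasoning above, in which short donor-acceptor distances are treated as evidence of productive contacts, can be written as a simple cutoff check. This is a minimal sketch: the distances are those quoted in the text, whereas the 3.5 Å hydrogen bond cutoff, the 4.0 Å hydride transfer cutoff, and the function name are assumptions chosen for illustration and are not criteria stated in this study.

```python
# Distances (Å) quoted in the text for the equilibrated
# RfxMCR-NADPH-malonyl-CoA complex: (contact type, partner 1, partner 2, distance).
CONTACTS = [
    ("hydride", "NADPH nicotinamide hydride", "malonyl-CoA C3", 3.4),
    ("hbond", "Tyr743 OH", "malonyl oxygen", 2.7),
    ("hbond", "Arg746 NH1", "malonyl oxygen", 2.9),
    ("hbond", "Tyr749", "malonyl oxygen", 2.8),
]

# Assumed heuristic cutoffs in Å (illustrative; not taken from the paper).
CUTOFFS = {"hbond": 3.5, "hydride": 4.0}


def classify_contacts(contacts, cutoffs):
    """Map each (partner 1, partner 2) pair to True if its distance is
    within the cutoff assigned to its contact type."""
    return {
        (first, second): distance <= cutoffs[kind]
        for kind, first, second, distance in contacts
    }


result = classify_contacts(CONTACTS, CUTOFFS)
for (first, second), ok in result.items():
    status = "within cutoff" if ok else "beyond cutoff"
    print(f"{first} to {second}: {status}")
```

Distance alone does not establish a hydrogen bond or a reactive geometry; angle criteria are normally applied as well and are omitted here for brevity.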
In the structure of NADP+-MSA-bound RfxMCR-N, Thr165 stabilized the reaction intermediate MSA by forming hydrogen bonds with the C3 carbonyl oxygen (Fig. 2B), which facilitates nucleophilic attack by the NADPH hydride on the C3 atom of MSA. The MSA anion generated by this process is protonated by extracting a proton from the Tyr178 hydroxyl group to form 3-HP (Fig. 6C). Owing to steric hindrance and the distance between the Tyr178 hydroxyl and the Lys182 ε-amino group (4.6 Å in the crystal structure) (Fig. 2B), a direct proton transfer from Lys182 to Tyr178 is unlikely. However, Lys182 could play a dual role, not only orienting the cofactor NADPH by hydrogen bonding with the nicotinamide ribose (2.7 Å) but also mediating protonation of Tyr178 through an NADPH hydroxyl-mediated proton transfer (Fig. 2B; Fig. 6C). The deprotonated Lys182 could feasibly be recovered by the solvent, completing the catalytic reaction. Overall, using two molecules of NADPH, malonyl-CoA could be converted into MSA and then further reduced to 3-HP through two consecutive reactions occurring at domains SDR3-ED-SDR4 and SDR1-2 of RfxMCR (Fig. 6).
Figure caption (partial): Proposed mechanism by which RfxMCR catalyzes the two-step reduction of malonyl-CoA to MSA and the further reduction of MSA to 3-hydroxypropanoate (3-HP). The functional groups of malonyl-CoA and NADPH are indicated as R1 and R2, respectively.

DISCUSSION
We here investigated the catalytic mechanism of a bi-functional MCR from R. castenholzii, a chlorosome-less green nonsulfur bacterium that is closely related to C. aurantiacus (20), the species in which the 3-HP autotrophic carbon fixation pathway was first identified and demonstrated. For the first time, we determined the cryo-EM structure of a full-length bi-functional MCR and found that it contained two subunits, each with four tandemly arranged SDR domains and one ED domain, which were cross-interlocked to form a homodimer. First, the cryo-EM structure showed that RfxMCR-N and RfxMCR-C in each subunit were connected by an α-helix (G573WAESL578). Second, it revealed a homodimer of two cross-interlocked subunits with the N- and C-terminal ends pointing in the same direction. In the center of the homodimer, two of these helices were arranged antiparallel and served as the point of intersection of the two subunits, further indicating that the two subunits were interlocked, rather than in paired contact, to form the homodimer (Fig. 1E). This disagrees with a previous study of PdMCR, in which the authors fit the dimeric crystal structures of PdMCR-N and PdMCR-C into the SAXS model of full-length PdMCR and proposed that two PdMCR subunits were in paired contact to form a homodimer (19).
MSA is an essential reaction intermediate of malonyl-CoA reduction, and it is either released or remains enzyme bound (17). Like other aldehydes, MSA that accumulates in cells can react with free amino groups to form adducts and exert toxic effects. The growth of E. coli K-12 cells at high temperature was severely restricted by the accumulation of MSA when pyrimidines were used as the sole nitrogen source (5).
The bi-functional MCR is superior to the mono-functional enzymes in that it combines the alcohol and aldehyde dehydrogenase (CoA-acylating) activities in a single enzyme, which directly consumes the reaction intermediate MSA and thus decreases its cellular accumulation and toxicity. However, no structural information on the MSA-bound conformation of MCR enzymes has been available. Notably, we here determined the crystal structures of NADP+-MSA-bound RfxMCR-N and RfxMCR-C and revealed the conformational changes necessary for substrate selection and intermediate coordination. Superimposition of the catalytic SDR1 domain with the non-catalytic SDR2 domain gave a main chain RMSD of 6.680 Å. Compared to SDR1, the non-catalytic SDR2 lacked the α10 and α11 helices, which were necessary for covering the NADP+-MSA binding pocket; it also lacked two antiparallel β-strands (β7-β8) and two short α-helices (α5-α6) that were exposed on the SDR1 surface (Fig. S6A). Consistent with the crystal structure of NADP+-MSA-bound RfxMCR-N, MD simulations of the full-length RfxMCR with NADP(H)+-MSA revealed similar conformational changes of the α10 helix, confirming that this helix played an important role in stabilizing the binding of the reaction intermediate MSA (Fig. 4A; Fig. 5F). In addition, SDR2 and SDR1 had completely different amino acid residues in the NADP+-MSA binding sites; SDR2 had a distinct substrate-binding pocket that disfavored MSA binding (Fig. S6A; Table 4). Similarly, the superimposition of the SDR3 and SDR4 main chains gave a larger RMSD of 18.511 Å (Fig. S6B), indicating that the dramatic differences between these two SDR domains arose from the incorporation of ED into the SDR3 domain. Sequence alignment further confirmed that the substrate-binding residues in SDR3 and SDR4 were less conserved, which resulted in the absence of a malonyl-CoA-NADPH binding pocket in SDR4 (Fig. S6B).
Interestingly, the superimposition of RfxMCR-N and RfxMCR-C revealed an excellent match of the tandem SDR1-2 and SDR3-4 domain core architectures and relatively high conservation of the NADP+-MSA binding site (Fig. S6C). Notably, SDR1 required additional secondary structures (such as the α10 and α11 helices) to cover the MSA-binding site. In contrast, SDR3 incorporated a larger module, including SDR4 and ED, to form the active site pocket, in which Arg1164 and Arg799 made major contributions to malonyl-CoA binding (Fig. 5C; Fig. S6C; Table 4). As a result, the conformation of the substrate-binding pocket in SDR3 was more favorable for binding the substrate malonyl-CoA, whereas the SDR1-binding pocket was optimal for stabilizing the reaction intermediate MSA. However, mutation of amino acid residues in SDR4 of CfxMCR increases enzymatic activity (18). Therefore, although SDR2 and SDR4 were not involved in substrate binding, cofactor binding, or catalysis, they served as auxiliary domains to maintain RfxMCR catalytic activity. Most importantly, we found that the substrate specificity of the bi-functional MCR was determined by the incorporation of additional secondary structures into the core SDR architectures (Fig. S6).
RfxMCR has not been examined in biosynthetic 3-HP production, but CfxMCR has been reconstructed in E. coli BL21 (DE3) cells together with CfxACCase to generate a 3-HP-producing strain (9). Separation of the CfxMCR-N and CfxMCR-C fragments in this strain increases 3-HP production by 1.5 times (17), indicating that the full-length CfxMCR enzyme activity is regulated by coordination between the CfxMCR-N and CfxMCR-C modules. However, the expression levels of CfxMCR-N and CfxMCR-C in the cells were different, which generated a functional imbalance that further decreased the 3-HP yield. Directed evolution of CfxMCR-C has yielded three mutants with increased enzymatic activity (Fig. S10A) and 270-fold higher 3-HP production (after fine-tuning of CfxMCR-N expression levels) (17,18); these mutants were N940V (corresponding to Asn951 in RfxMCR), K1106W (Lys1116 in RfxMCR), and S1114R (Asn1124 in RfxMCR). The highly conserved Lys1116 was located at the interface between SDR4 and the ED domain. Mutation of Lys1116 to Trp would increase steric hindrance with Asn887 from helix α33, which likely enhanced malonyl-CoA binding by facilitating closure of the substrate-binding pocket during catalysis (Fig. S10B). Mutation of the conserved Asn951 to Val likely enhanced the hydrophobic interactions with residues Pro934 and Pro959, located in α36 and the loop region covering the substrate-binding pocket, which probably enhanced NADPH binding during catalysis (Fig. S10C). In contrast, Asn1124, which was located at the end of the SDR4 domain, was not conserved and was far away from the substrate pocket (Fig. S10D). Combination of these mutations enhanced the enzyme activity of CfxMCR-C, but the mutation "hot spots" identified via saturation mutagenesis were distant from the substrate- and cofactor-binding pockets.
Nevertheless, the structural basis of RfxMCR revealed in this work serves as an accurate template for the future rational design of full-length MCR enzymes, which could help to eliminate the expression and functional imbalance between the dissected MCR-N and MCR-C fragments and to decrease the cytotoxicity resulting from accumulation of the toxic reaction intermediate MSA.
In summary, we here revealed the molecular bases underlying substrate binding, specificity determination, and catalytic mechanisms of a bi-functional MCR from R. castenholzii. This was achieved through a combination of approaches, including cryo-EM, X-ray crystallography, enzymatic analyses, and MD simulations. The results of this study not only broaden our understanding of the key catalytic steps in the 3-HP autotrophic carbon fixation pathways but are also expected to advance industrial applications of this metabolic pathway in biosynthetic 3-HP production.

ADDITIONAL FILES
The following material is available online.
Flowchart of the cryo-EM data processing for the full-length RfxMCR. To generate a template for two-dimensional (2D) classification, 14,112 particles were auto-picked from 20 micrographs and subjected to 2D classification. Using a 2D reference generated from 12,105 particles, 4,043,667 particles were auto-picked from 4,647 micrographs and imported into CryoSPARC. After five iterative rounds of 2D classification, 2,391,743 particles were selected and imported into RELION to generate a 3D map with the initial model program, which was used as a reference for three-dimensional (3D) classification that generated four classes of particles. Then 1,766,605 particles from the best two classes (42.89% and 30.97%, respectively) were selected, extracted, and subjected to another round of reference-based 3D classification and non-uniform refinement in CryoSPARC, which produced an EM map with a global resolution of 3.35 Å based on the gold standard Fourier shell correlation (FSC).
Fig. S10 (mBio03233-22-s0010.tif). Spatial organization of the RfxMCR amino acid residues that correspond to the hot spot mutations of Chloroflexus aurantiacus MCR (CfxMCR). (A) Distribution of the amino acid residues that correspond to the hot spot mutations in CfxMCR. The catalytic centers of RfxMCR are indicated with bound substrate malonyl-CoA (magenta) and cofactor NADPH (green); the SDR3 (yellow), SDR4 (sage), and ED (orange) domains are shown as surfaces, and the amino acid residues are shown as stick models. (B-D) Spatial organization of the amino acid residues Lys1116 (B), Asn951 (C), and Asn1124 (D), which correspond to the hot spot mutations K1106W, N940V, and S1114R of CfxMCR. The amino acid residues in RfxMCR are shown as stick models, and the mutations in CfxMCR are shown as gray outlines.