CryoEM reveals oligomeric isomers of a multienzyme complex and assembly mechanics

Propionyl-CoA carboxylase (PCC) is a multienzyme complex consisting of up to six α-subunits and six β-subunits. Belonging to a metabolic pathway converging on the citric acid cycle, it is present in most forms of life and irregularities in its assembly lead to serious illness in humans, known as propionic acidemia. Here, we report the cryogenic electron microscopy (cryoEM) structures and assembly of different oligomeric isomers of endogenous PCC from the parasitic protozoan Leishmania tarentolae (LtPCC). These structures and their statistical distribution reveal the mechanics of PCC assembly and disassembly at equilibrium. We show that, in solution, endogenous LtPCC β-subunits form stable homohexamers, to which different numbers of α-subunits attach. Sorting LtPCC particles into seven classes (i.e., oligomeric formulae α0β6, α1β6, α2β6, α3β6, α4β6, α5β6, α6β6) enables formulation of a model for PCC assembly. Our results suggest how multimerization regulates PCC enzymatic activity and showcase the utility of cryoEM in revealing the statistical mechanics of reaction pathways.

Prior structural studies have resolved a 3.2 Å crystal structure of a recombinant PCC chimera (PCC chi ), where its α-subunit is from Ruegeria pomeroyi and β-subunit is from Roseobacter denitrificans (Huang et al., 2010). A cryoEM structure of the recombinant HsPCC has been determined at 15 Å resolution showing similar architecture to PCC chi (Huang et al., 2010). More recently, a 3.48 Å cryoEM structure of the recombinant Methylorubrum extorquens PCC (MePCC) was used to guide the design of a new-to-nature enzyme for improved CO 2 fixation (Scheffen et al., 2021). In the aforementioned cryoEM studies, PCC was found to oligomerize as an α 6 β 6 dodecamer.
For PCC to catalyze carboxylation, the enzyme must first be biotinylated (Waldrop et al., 2012). Following biotinylation, the multienzyme acts in two steps involving αand β-subunits. In the first step, biotin is carboxylated at an α-subunit active site with bicarbonate as the carbon dioxide donor upon concomitant ATP hydrolysis (Huang et al., 2010;Tong, 2013). In the second step, the carboxylated biotin is translocated to the corresponding β-subunit active site, and the carboxyl group is transferred from biotin to form methylmalonyl-CoA (Huang et al., 2010;Tong, 2013). Given that assembly and disassembly propensities of multienzyme complexes may influence catalytic efficiency, it is crucial to model their assembly mechanisms. CryoEM enables observation of multienzymes in different assemblies and reconstruction of different oligomeric isomers. Unlike X-ray crystallography, endogenous proteins in different stages of assembly can be classified and counted in cryoEM micrographs to reveal the statistical mechanics of chemical reactions, along with obtaining high-resolution atomic structures.
Here, we report three structures of endogenous LtPCC: an α 6 β 6 dodecamer, α 5 β 6 undecamer, and α 4 β 6 decamer. We utilize the α 5 β 6 and α 4 β 6 architectures of PCC to demonstrate that PCC oligomeric isomers differ only in the number of α-subunits. We devised a sorting method to calculate the number of LtPCC proteins with the same oligomeric formula, from α 0 β 6 to α 6 β 6 . From this statistical information, we characterized the dynamics of PCC's molecular assembly system and reaction mechanics in solution.

LtPCC α 6 β 6 dodecamer and domain organization
To capture cryoEM structures of various biotin-binding complexes, we performed streptavidin pull-down of endogenous complexes from L. tarentolae mitochondrial lysate fractionated by glycerol density gradient and determined their structures. One subset of these structures has a three-fold symmetric architecture reminiscent of carboxylases (Huang et al., 2010;Tong, 2013;Huang et al., 2012;Jurado et al., 2015;Hu et al., 2023); by using the cryoID approach (Ho et al., 2020), we confirmed its identity as propionyl-CoA carboxylase from L. tarentolae (LtPCC).
The α-subunit of LtPCC contains three domains: a biotin carboxylase (BC) domain, BC-CT mediating (BT) domain, and biotin carboxyl carrier protein (BCCP) domain. A BT-BCCP linker connects the BT and BCCP domains, allowing BCCP to transport biotin between the αand β-subunits. As in PCC chi , the active sites in LtPCC are positioned 55 Å apart; therefore, the BCCP domain must translocate between the active sites, as proposed by the swinging-domain model (Huang et al., 2010;Tong, 2013). The β-subunit is composed of the structurally homologous Ncarboxyltransferase (N-CT) and C-carboxyltransferase (C-CT) domains (Huang et al., 2010;Scheffen et al., 2021) In carboxylases, biotin attaches to a conserved lysine residue in the alanine-methionine-lysine-methionine (AMKM) biotinylation motif within the BCCP domain (Samols et al., 1988;Baumgartner et al., 2001). Accordingly, we observed a clear density for biotin next to Lysine 631 of the AMKM motif in the β-subunit active site. The biotin's carboxyl group, which is the point of covalent attachment to lysine, neighbors the lysine residue. There is weak density connecting lysine and biotin (E) Atomic model of an α-subunit and a β-subunit of LtPCC colored by domains and shown as ribbons, with active sites labeled. Residues at the α-β binding site are boxed with dotted lines in (F) and labeled in zoomed-in views (G-I). Residues that form hydrogen bonds are labeled in blue. BTI stands for biotin. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) ( Fig. 1D), suggesting that some LtPCC assemblies are covalently biotinylated. CoA was not observed at its β-subunit binding site.

α-β binding sites within LtPCC
Using our 3.2 Å map of the dodecamer, we analyzed inter-subunit interactions within LtPCC. Interactions between LtPCC αand β-subunits enable complex assembly ( Fig. 1F-I) and make up the α-β binding site. No interactions exist among LtPCC α-subunits. There are seven hydrogen bonds between αand β-subunits. Three hydrogen bonds occur between the BCCP domain of the α-subunit and the β-subunit, and two hydrogen bonds form between the BT-BCCP linker and the β-subunit. Glutamate 628 from the BCCP domain forms two hydrogen bonds with Lysine 292 from the β-subunit (Fig. 1I). There is another mainchainmainchain hydrogen bond between Arginine 633 in the BCCP domain and the β-subunit. Glutamate 628, the biotin-attachment residue Lysine 631, and Arginine 633 form a U-shape with Lysine 631 in the middle, with their three hydrogen bonds stabilizing the U-shaped loop to hold the lysine residue in proximity to biotin (Fig. 1I). Lysine 45 from the β-subunit forms two hydrogen bonds with the BT-BCCP linker mainchain ( Fig. 1G), stabilizing the linker. These interactions hold the BCCP domain in place near the β-subunit.
Among the α-subunits and β-subunits, residues contributing to the α-β binding site primarily occur along the small α-helix of the BT domain LtPCC α 4 β 6 and α 5 β 6 oligomeric isomers differ from α 6 β 6 dodecamer in α-subunit occupancies In the α 6 β 6 dodecameric structure, α-subunits disappear when observed at high density threshold, indicating that α-subunits are either flexible and/or have lower occupancy (Fig. S2). The 2D class averages also contain classes with missing α-subunits (Fig. S3). To sort out LtPCCs with missing α-subunits, we performed symmetry-relaxed 3D classification and determined two additional LtPCC structures ( Fig. 2A-C): α 5 β 6 undecamer, α 4 β 6 decamer. The α 5 β 6 undecamer lacks one α-subunit, and the α 4 β 6 decamer lacks two α-subunits on opposite sides of the LtPCC complex. In our decamer, undecamer, and dodecamer, the conformation of individual subunits is preserved. In other words, the structures of these complexes differ from one another only in the number and positioning of α-subunits. It follows that a single α-β binding site gives rise to a binary choice of the binding site being occupied or unoccupied ( Fig. 2A-C).

Reaction mechanics and model of a multienzyme assembly and disassembly
To facilitate the description of the various structures we observed, we introduce three terms: oligomeric formula, oligomeric isomer, and structural conformation. An oligomeric formula is akin to a chemical formula, and a LtPCC oligomeric formula can be written as α n β m . Multiple oligomeric isomers could share the same oligomeric formula but have different arrangements of subunits in space. Though the architectural arrangement of subunits in an oligomeric isomer is unique, each subunit could assume different structural conformations. In the case of LtPCC, as described above, each oligomeric isomer has one, and only one structural conformation.
The above-mentioned three structures are only three possible isomers that LtPCC could have in solution. There are sixteen possible different LtPCC oligomeric isomers represented by seven oligomeric formulae (one isomer for α 6 β 6 , α 5 β 6 , α 1 β 6 and α 0 β 6 , and four for α 4 β 6 , α 3 β 6 and α 2 β 6 ). Even though there is only one structural conformation for each oligomeric isomer, the large number of potential oligomeric isomers and the structural similarity among these oligomeric isomers present a technical challenge in relying on traditional classification methods to sort out all these isomers.
To tackle this problem, we used a sorting method to determine the number of particles belonging to each oligomeric formula by classifying its α-subunits. As indicated above, for each oligomeric isomer, the α-β binding site is either occupied or unoccupied. We searched for the presence or absence of an α-subunit at each of the six α-β binding sites on a PCC particle, giving rise to 2 6 possible permutations, sharing seven oligomeric formulae and sixteen oligomeric isomers (Fig. 3A-B). We counted the number of unoccupied α-β binding sites in each PCC particle (Table S1), to assign each particle to its corresponding oligomeric formula.
The solution in our sample is at an equilibrium state, as it has mixed and settled for hours before freezing. The equilibrium constant K c can be obtained by calculating the ratio of product concentration to reactant concentration. The assembly of LtPCC can be represented by this reaction: α n β 6 + α⇌α n+1 β 6 The equilibrium constant K c can be found by: α and β represent αand β-subunits, respectively; square brackets indicate concentration; n is an integer that denotes the number of α-subunits in LtPCC. We are unable to measure the concentration of α-subunits because single α-subunits are small and flexible, and therefore cannot yet be recognized in cryoEM images. Nevertheless, we can calculate the product of [α] and K c as an attribute of the PCC assembly reaction. We assume that [α] remains constant throughout due to the assembly and disassembly of α-subunits at equilibrium (Fig. 4).
The plot of K c [α] as a function of n (Fig. 4) shows that K c [α] decreases exponentially with increasing n. With [α] remaining constant, this means that particles with a smaller n value have greater tendency to attach new α-subunits compared to particles with a larger n value. This can be partially explained by complexes with more α-subunits having less reaction sites for α-subunit attachment.

Discussion
Our present work shows promise of utilizing cryoEM for statistical analysis of thermodynamics and structural dynamics to understand the behaviors and assembly of biological complexes. By simultaneously determining different structural conformations of a protein, we can discern its reaction mechanics and calculate the rate of conversion Fig. 2. LtPCC conformations differ only in occupancies of the α-β binding sites. Cartoon representations of α 6 β 6 (A) α 5 β 6 (B) and α 4 β 6 (C) cryoEM densities, with α-subunits in orange and β-subunits in green (top row). Two views of the cryoEM densities for α 6 β 6 (A) α 5 β 6 (B) and α 4 β 6 (C) colored by domain as in Fig. 1C  between its oligomeric isomers. The future prospect of experimentally determining statistical mechanics presents exciting opportunities for a deeper understanding of the catalytic mechanisms of multienzyme complexes and the working-cycle of molecular machines.
Crystallographic studies have elucidated β-homohexamer structures with no α-subunits in various carboxylases, including PCC (Huang et al., 2012;Diacovich et al., 2004). Therefore, we propose that β-homohexamers are stable in solution, and the lack of interactions between PCC α-subunits allow for a continuous assembly and disassembly of α-subunits from β-homohexamers. As a dodecamer, PCC contains six pairs of active sites per enzyme available for catalysis. In other oligomeric isomers, at least one active site cannot participate in catalysis, and so all non-dodecameric PCCs are less catalytically active than the dodecamer.
As the concentration of α-subunits increases, to maintain equilibrium, the reaction will favor the formation of PCC isomers with more α-subunits. The maintenance of β-homohexamers allows for quick regulation of PCC activity, where only α-subunits need to assemble.
Isomers arise from protein conformational heterogeneity, and, in other systems, an oligomeric isomer can have multiple structural conformations. In theory, cryoEM allows one to determine all these structural conformations and derive their reaction kinetics. An example of a dynamic macromolecular machine is the spliceosome. During transcription, the spliceosome splices introns from pre-messenger RNAs. Throughout the splicing cycle, some spliceosome components are displaced while others assemble, and assembled components undergo conformational changes. Though structures of the spliceosome at different stages of the splicing cycle exist, the assembly kinetics of this process remains to be resolved Yan et al., 2015;Galej et al., 2016;Liu et al., 2017;Li et al., 2019). Similarly, ribosomes undergo compositional and conformational changes during translation; despite structures of multiple ribosome states, information about its reaction kinetics remains limited (Xue et al., 2022). The complexity of resolving all isomers in multi-subunit systems would require imaging and computational resources that are cost-prohibitive at present. The relatively simple LtPCC system with just two different molecules, each of a single conformation, already has sixteen oligomeric isomers. The presence of D3 symmetry further allows application of a sorting method to count the number of particles sharing the same oligomeric formula without having to solve all structures exhaustively using an enormous cryoEM dataset.
Currently, the difficulty in seeing small molecules (<80 kDa) with cryoEM makes it unfeasible to directly find the rate constant of molecular assembly involving small components, though future developments should enable the determination of such components by cryoEM (Wu and Rapoport, 2021;Kim et al., 2019). Just like how Google's AlphaFold (Jumper et al., 2021) came about in 2020 to solve the previously computationally-prohibitive protein-folding-prediction problem, future cryoEM imaging and computational resources should enable determination of all structural conformations in a complex assembly to derive reaction kinetics. If the cells are thin enough, not only can we determine structural conformations in solution but also those in situ (Xue et al., 2022;Chmielewski et al., 2022;Liu et al., 2020). Such prospects offer exciting opportunities for experimentally "visualizing" statistical mechanics within an enormous conformational space, and to assist drug design in targeting the rate-limiting step of a complex of interest's assembly (Giraldo-Barreto et al., 2021;Chen and Ludtke, 2021;Zhou et al., 2022;Ma et al., 2018). Taking advantage of the limited number of conformational isomers of PCC, the current work showcases the utility of cryoEM beyond determining static structures towards statistical analysis of thermodynamics and structural dynamics.

Experimental model and subject details
We grew L. tarentolae cells in brain heart infusion media at 27 • C. The media was supplemented with 5 mg/L of hemin. We harvested the cells at ~ 2x10 8 cell/ml during the late-exponential growth phase.

Preparation of mitochondrial lysate
We enriched mitochondrial fraction through hypotonic cell lysis, and used RenoCal76 density gradients for the sequential separation of membrane-containing fractions (Pelletier et al., 2007). By sonication at 24 W for 15 s and centrifugation at 30,000 RPM in a SW55 rotor for 15 min, we lysed mitochondrial pellets in 1 ml of pH 7.3, 50 mM HEPES, 150 mM KCl, 2 mM EDTA, 1% NP40, and 50 μL of 20x complete protease cocktail. We recovered and separated the supernatant on a continuous 10-30% gradient glycerol in 20 mM HEPES, pH 7.3, 100 mM KCl, and 1 mM EDTA at 72,000 g for 15 h, in SW32 rotor (Beckman). We collected glycerol gradient fractions of 1.5 ml from the top and combined the fractions corresponding to the 20S-40S value region.

Purification of LtPCC by streptavidin affinity pulldown
We supplemented glycerol gradient fractions with octylglucoside to 2 mM. On a nutating mixer, we incubated the fractions with 20 μl Strep-Tactin®XT magnetic beads in a Binding Buffer (50 mM Tris-HCl, pH 8.0, 2 mM OG, 1 mM EDTA, 100 mM KCl) at 4 • C for 1 h. We washed the beads twice, with 5 ml of Binding Buffer each time. Then, we incubated the beads for 10 min and at 4 • C, in 0.2 ml of Elution Buffer (20 mM Tris-HCl, pH 8.0, 100 mM biotin, 1 mM EDTA, 100 mM KCl, 2 mM OG). Using Zeba™ Spin Desalting Columns, 7 K MWCO (0.5 ml), we exchanged the 130 μL of purified material into the Sample Buffer (20 mM Tris-HCl, pH 7.5, 60 mM KCl, 5 mM MgCl2, 1 mM DTT, 5 mM OG). We centrifuged the sample for 10 min at 21,000 g. We stored the supernatant on ice before cryoEM grid preparation.

CryoEM sample preparation and image acquisition
We first used PELCO easiGlow TM , with a target current of 15 mA, to glow discharge Lacey carbon cryoEM grids with a 2 nm continuous carbon film (Ted Pella) for 45 s. Then, we applied 2.5 μL of sample onto the grids. We waited for 5 s before blotting the grids for 4 s with blot force 0, at 100% humidity and 4 • C (to minimize temperate changes and structure flexibility). After we blotted the grids, we used a FEI Mark IV Vitrobot (Thermo Fisher Scientific) to plunge-freeze the grids into liquid ethane. We stored the grids in a liquid nitrogen dewar until cryoEM image acquisition.
We loaded and imaged the cryoEM grids through a Titan Krios (Thermo Fisher Scientific) with a Gatan Imaging Filter Quantum LS and K3 camera, operated at 300 kV. We recorded movies at a pixel size of 0.55 Å/pixel with SerialEM (Mastronarde, 2005), by electron counting in super-resolution mode. We set an exposure time of 2 s, fractionated to 40 frames, and a defocus range between − 2.5 to − 1.5 μm. We had an estimated total dosage of 40 e-/Å 2 . We collected 3,328 movies.

Reconstruction of cryoEM maps
To produce drift-corrected and dose-weighted micrographs, we processed the movies with MotionCor2 (Zheng et al., 2017). After motion correction, the movies had a calibrated pixel size of 1.1 Å. Due to severe drift of the first frame, it was discarded. We determined the defocus values of the micrographs with Gctf (Zhang, 2016). We first used Warp's BoxNet (Tegunov and Cramer, 2019) to automatically pick particles. Afterwards, in RELION 3.1 (Zivanov et al., 2018), we performed reference-free 2D classification, and selected the classes with features for 3D classification with D3 symmetry. After 3D classification, we selected good classes, with a total of 8,214 particles, to train a topaz model. Then using this model, we picked 979,436 particles through topaz. We used RELION 4.0 (Kimanius et al., 2021) to perform multiple rounds of reference-free 2D classification of the topaz-picked particles. We selected 2D classes with protein features. After six rounds of 2D classification, we selected 15,859 particles for 3D refinement in RELION 4.0 (Kimanius et al., 2021). We used a map from the previous 3D classification of BoxNet particles as the reference map. After 3D refinement with D3 symmetry, we obtained a map of 3.49 Å. We iteratively refined the 3.49 Å map through CTF refinement and 3D refinement (Kimanius et al., 2021). After all processing, its final resolution was 3.2 Å (Fig. S4).
We observed through the D3 map that LtPCC α-subunits exhibit flexibility and/or lower occupancy, but the β-homohexamer remains stable (Fig. S2). To investigate this further, we performed 3D classification with C1 symmetry.
We performed D3 symmetry expansion on the particles from D3 3D refinement. Using a mask of one α-subunit, we performed 3D classification with C1 symmetry. For 3D classification, we skipped both angular and offset search in RELION 4.0 (Kimanius et al., 2021). We then performed 3D refinement with C1 symmetry and local search, obtaining an α 5 β 6 map of 8.6 Å after postprocessing (Fig. S4-S5). Afterwards, we performed C3 symmetry expansion on the particles from D3 3D refinement. Using a mask of two α-subunits, we performed 3D classification with C1 symmetry, skipping both angular and offset search (Kimanius et al., 2021). We then performed 3D refinement with C1 symmetry and local search, obtaining an α 4 β 6 map of 10.3 Å after postprocessing ( Fig. S4-S5). Next, to determine the number of particles sharing each oligomeric formula, we utilized a sorting method as described below.

Sorting method
The sorting method is based on the focused classification and symmetry expansion method described here (Gong et al., 2019). We performed 3D reconstruction with D3 symmetry using all the particles to obtain the rot, psi and tilt Euler angles of each particle. We then checked each particle's six α-β binding site locations to see if the α-subunit is present. To do this, we expanded the particles for D3 symmetry (using RELION's relion_particle_symmetry_expand command), which duplicated each particle into six particles, corresponding to its six different orientations. Then we masked out one α-subunit from the D3 symmetry 3D reconstruction, and using this mask, we performed 3D focused classification without angular and offset search on the symmetry expanded particles, classifying the particles into two classes. We found that in the two classes generated from the 3D classification, one class contains density of an α-subunit (α-present class) and the other class is missing α-subunit density (α-missing class). We counted how many symmetryexpanded particles (from zero to six) for each LtPCC complex were in the α-missing class, which corresponds to the number of unoccupied α-β binding sites on that LtPCC particle (Table S1). From this, we calculated the number of LtPCC particles sharing an oligomeric formula.

Atomic modeling and model analysis
We first modeled and refined one α-subunit and one β-subunit in Coot (Emsley et al., 2010), based on the AlphaFold (Jumper et al., 2021) prediction for LtPCC. Then, through ChimeraX (Goddard et al., 2018), we duplicated the subunits to D3 symmetry and fit them into the LtPCC cryoEM map reconstructed with D3 symmetry. The model was then refined first through Phenix's real space refine function (Liebschner et al., 2019), and manually checked in Coot (Emsley et al., 2010). This model was fitted into the α 5 β 6 and α 4 β 6 maps. Based on the two maps, one and two α-subunits were removed from the original model, respectively. Hydrogen bonds and interfacial residues were analyzed through ChimeraX (Goddard et al., 2018) and verified in PISA (Krissinel and Henrick, 2007). The sequence alignment between LtPCC and HsPCC was done through Clustal Omega (Sievers and Higgins, 2018) and visualized through ESPript 3 (Robert and Gouet, 2014).

Data and code availability
The α 6 β 6 , α 5 β 6 and α 4 β 6 cryoEM maps have been deposited in the Electron Microscopy Data Bank under accession numbers EMD-40474, EMD-40473, and EMD-40472, respectively. The coordinates of the α 6 β 6 , α 5 β 6 and α 4 β 6 LtPCC models have been deposited in the Protein Data Bank under accession numbers 8SGZ, 8SGY, and 8SGX, respectively. All aforementioned deposited data are publicly available as of the date of publication. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.