Many-molecule encapsulation by an icosahedral shell

We computationally study how an icosahedral shell assembles around hundreds of molecules. Such a process occurs during the formation of the carboxysome, a bacterial microcompartment that assembles around many copies of the enzymes ribulose 1,5-bisphosphate carboxylase/ oxygenase and carbonic anhydrase to facilitate carbon fixation in cyanobacteria. Our simulations identify two classes of assembly pathways leading to encapsulation of many-molecule cargoes. In one, shell assembly proceeds concomitantly with cargo condensation. In the other, the cargo first forms a dense globule; then, shell proteins assemble around and bud from the condensed cargo complex. Although the model is simplified, the simulations predict intermediates and closure mechanisms not accessible in experiments, and show how assembly can be tuned between these two pathways by modulating protein interactions. In addition to elucidating assembly pathways and critical control parameters for microcompartment assembly, our results may guide the reengineering of viruses as nanoreactors that self-assemble around their reactants. DOI: http://dx.doi.org/10.7554/eLife.14078.001


Introduction
Encapsulation is a hallmark of biology. A cell must co-localize high concentrations of enzymes and reactants to perform the reactions that sustain life, and it must safely store genetic material to ensure long-term viability. While lipid-based organelles primarily fulfill these functions in eukaryotes, self-assembling protein shells take the lead in simpler organisms. For example, viruses surround their genomes with a protein capsid, while bacteria use large icosahedral shells known as bacterial microcompartments (BMCs) to sequester the enzymes and reactions responsible for particular metabolic pathways (Kerfeld et al., 2010;Axen et al., 2014;Shively et al., 1998;Bobik et al., 1999;Erbilgin et al., 2014;Petit et al., 2013;Price and Badger, 1991;Kerfeld and Erbilgin, 2015). Within diverse bacteria, BMC functions have been linked to bacterial growth, carbon fixation, symbiosis, or pathogenesis (Kerfeld and Erbilgin, 2015). Other protein-based compartments are found in bacteria and archea (e.g. encapsulins (Sutter et al., 2008) and gas vesicles (Pfeifer, 2012;Sutter et al., 2008)) and even eukaryotes (e.g. vault particles (Kickhoefer et al., 1998)), while some viruses may assemble around lipidic globules (Lindenbach and Rice, 2013;Faustino et al., 2014). Thus, understanding the factors that control microcompartment assembly and encapsulation is a central question in modern cell biology. From the perspectives of synthetic biology and nanoscience, there is great interest in reengineering BMCs or viruses as nanoreactors that spontaneously encapsulate enzymes and reagents in vitro (e.g. Luque et al., 2014;Douglas and Young, 1998;Patterson et al., 2014;Patterson et al., 2012;Zhu et al., 2014;Rhee et al., 2011;Wö rsdö rfer et al., 2012;Comas-Garcia et al., 2014), or as customizable organelles that assemble around a programmable set of core enzymes in vivo, introducing capabilities such as carbon fixation or biofuel production into bacteria or other organisms (e.g. Kerfeld and Erbilgin, 2015;Bonacci et al., 2012;Parsons et al., 2010;Choudhary et al., 2012;Lassila et al., 2014). However, the principles controlling such co-assembly processes have yet to be established, and it is not clear how to design systems to maximize encapsulation.
In this article we take a step toward this goal, by developing theoretical and computational models that describe the dynamical encapsulation of hundreds of cargo molecules by self-assembling icosahedral shells. Although our models are general, we are motivated by recent experiments on a type of BMC known as the carboxysome (Kerfeld et al., 2010;Schmid et al., 2006;Iancu et al., 2007;Tanaka et al., 2008). Carboxysomes are large (40-400 nm), roughly icosahedral shells that encapsulate a dense complex of the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) and other proteins to facilitate the Calvin-Bensen-Bassham cycle in autotrophic bacteria (Price and Badger, 1991;Iancu et al., 2007;Kerfeld et al., 2010;Tanaka et al., 2008). Recently, striking microscopy experiments visualized bÀcarboxysome shells assembling on and budding from procarboxysomes (the condensed complex of RuBisCO and other proteins found in the interior of carboxysomes) (Cameron et al., 2013;Chen et al., 2013). Genomic analysis suggests that many BMCs with diverse functions assemble via similar pathways (Cameron et al., 2013;Kerfeld and Erbilgin, 2015). However, the mechanisms of budding and pinch-off to close the shell remain incompletely understood because of the small size and transient nature of assembly intermediates. Moreover, experiments suggest that eLife digest Bacterial microcompartments are protein shells that are found inside bacteria and enclose enzymes and other chemicals required for certain biological reactions. For example, the carboxysome is a type of microcompartment that enables the bacteria to convert the products of photosynthesis into sugars. During the formation of a microcompartment, the outer protein shell assembles around hundreds of enzymes and chemicals. This formation process is tightly controlled and involves multiple interactions between the shell proteins and the cargo -the enzymes and other reaction ingredients -they will enclose. Understanding how to control which enzymes are encapsulated within microcompartments could help researchers to re-engineer the microcompartments so that they contain drugs or other useful products.
Recent studies have used microscopy to visualize how microcompartments are assembled. However, most of the intermediate structures that form during assembly are too small and shortlived to be seen. It has therefore not been possible to explore in detail how shell proteins collect the necessary cargo and then assemble into an ordered shell with the cargo on the inside. Experiments alone are probably not enough to understand the process, especially since microcompartment assembly can currently only be studied within live cells or cellular extract. Within these complex environments it is difficult to determine the effect of any individual factor on the overall assembly process.
Perlmutter, Mohajerani and Hagan have now taken a different approach by developing computational and theoretical models to explore how microcompartments assemble. Computer simulations showed that microcompartments could assemble by two pathways. In one pathway, the protein shell and cargo coalesce at the same time. In the other pathway, the cargo molecules first assemble into a large disordered complex, with the shell proteins attached on the outside. The shell proteins then assemble, carving out a piece of the cargo complex. The simulations showed that many factors affect how the shell assembles, such as the strengths of the interactions between the shell proteins and the cargo. They also identified a factor that controls how much cargo ends up inside the assembled shell.
Perlmutter, Mohajerani and Hagan found that, in addition to revealing how microcompartments may assemble within their natural setting, the simulations provided guidance on how to re-engineer microcompartments to assemble around other components. This would enable researchers to create customizable compartments that self-assemble within bacteria or other host organisms, for example to carry out carbon fixation or make biofuels.
A future challenge will be to investigate other aspects of microcompartment assembly, such as the factors that control the size of these compartments.
aÀcarboxysomes (another form of carboxysome) assemble by a different mechanism, in which shell assembly encapsulates an initially diffuse pool of RuBisCO (Iancu et al., 2010;Cai et al., 2015). The factors determining which of these assembly pathways occurs are unknown. BMC assembly is driven by a complex interplay of interactions among the proteins forming the external shell and the interior cargo. It is difficult, with experiments alone, to parse these interactions for those mechanisms and factors that critically influence assembly pathways, especially due to the lack of an in vitro assembly system. Models which can correlate individual factors to their effect on assembly are therefore an important complement to experiments.
We present phase diagrams and analysis of dynamical simulation trajectories showing how the thermodynamics, assembly pathways, and emergent structures depend on the interactions among shell proteins and cargo molecules. Within distinct parameter ranges, we observe two classes of assembly pathways, which resemble those suggested for respectively aÀ or bÀcarboxysomes. We find that tunability of cargo loading is a key functional difference between the two classes of pathways. Shells assembled around a diffuse cargo can be varied from empty (containing almost no cargo) to completely full, whereas assembly around a condensed, procarboxysome-like complex invariably produces full shells. While we find that the encapsulated cargo becomes ordered due to confinement, complete crystalline order in the globule before encapsulation inhibits budding. We discuss these results in the context of recent observations on carboxysome assembly, and their implications for engineering BMCs, viruses or drug delivery vehicles that assemble around a fluid cargo (e.g. Refs. [Kerfeld and Erbilgin, 2015;Parsons et al., 2010;Choudhary et al., 2012;Lassila et al., 2014;Luque et al., 2014;Douglas and Young, 1998;Patterson et al., 2014; (B)

Results
Our model system is motivated by icosahedral viral capsids and BMCs (Tanaka et al., 2008;Kerfeld et al., 2010). Since icosahedral symmetry can accommodate at most 60 identical subunits, formation of large icosahedral structures requires subunits to assemble into different local environments. The subunits can be grouped into pentamers and hexamers, with 12 pentamers at the icosahedron vertices and the remaining subunits in hexamers. Viruses typically assemble from small oligomers of the capsid protein, which we refer to as the basic assembly unit (Hagan, 2014). Recent AFM experiments demonstrated that hexamers are the basic assembly unit during the assembly of BMC shell facets (Sutter et al., 2016), and the carboxysome major shell proteins crystallize as pentamers and hexamers (Tanaka et al., 2008). Motivated by these observations, our model considers two basic assembly units, one a pentamer and the other a hexamer, with interactions designed so that the lowest energy structure corresponds to a truncated icosahedron with 12 pentamers and 20 hexamers ( Figure 1). While BMCs generally have more hexamers, our model is intended to explore the general principles of assembly around a fluid cargo rather than model a specific system. Further details of the model and a thermodynamic analysis are given in section 3 and the appendices.
To understand how assembly around multiple cargo molecules depends on the relative strengths of interactions between components, we performed dynamical simulations as a function of the parameters controlling shell subunit-subunit (" SS ), shell subunit-cargo (" SC ), and cargo-cargo (" CC ) interaction strengths. All energy values are given in units of the thermal energy, k B T. We focus on parameters for which shell subunit-subunit interactions are too weak to drive assembly in the absence of cargo (" SS 4:5). Except where mentioned otherwise, the cargo diameter is set equal to the circumradius of a shell subunit.
For the simulated density of cargo particles, the phase behavior (in the absence of shells) corresponds to a vapor at " CC ¼ 1:3, liquid-vapor phase coexistence for " CC 2 ½1:6; 2:0 (the phase coexistence boundary is slightly below " CC ¼ 1:6), and a solid phase at " CC ¼ 3:0. We find that tuning " CC through phase coexistence dramatically alters the typical assembly process. Strong cargo interactions (" CC ! 1:6) drive formation of a globule followed by assembly and budding of a shell, such as observed for bÀcarboxysomes ( Figure 2A, Simulation Video 1), while under weak interactions (" CC <1:6) shell assembly usually proceeds in concert with cargo encapsulation ( Figure 2B, Simulation Video 2), as suggested for assembly of aÀcarboxysomes. We now elaborate on these classes of assembly pathways, and how the resulting assembly products depend on parameter values.

Assembly and budding from a cargo globule
We begin by discussing assembly behavior when the cargo-cargo interactions are strong enough to drive equilibrium phase coexistence (" CC ! 1:6). Near the phase boundary (" CC ¼ 1:6) a system of pure cargo particles is metastable on the timescales we simulate. However, for " SC >4, adding shell Video 1. Animation of a typical simulation showing assembly around a cargo globule. Parameters are " CC ¼ 1:6, " SC ¼ 7, and " SS ¼ 2:5. DOI: 10.7554/eLife.14078.007 Video 2. Animation of a typical simulation showing simultaneous assembly and cargo condensation. Parameters are " CC ¼ 1:3, " SC ¼ 9, and " SS ¼ 3:5. DOI: 10.7554/eLife.14078.008 subunits drives nucleation of a cargo globule with shell subunits adsorbed on the surface. The subsequent fate of the globule depends on parameter values; typical simulation end-states are shown as a function of parameter values in Figure 3. For moderate interaction strengths (2:5 " SS 3:5) the globule grows to a large size, typically containing at least twice the cargo molecules that can be packaged within a complete shell. Adsorbed shell subunits then reversibly associate to form ordered clusters. Once a cluster acquires enough inter-subunit interactions to be a stable nucleus, it grows by coagulation of additional subunits or other adsorbed clusters. For the parameter set corresponding to Figure 2A, nucleation is fast in comparison to cluster growth, and thus two nuclei grow simultaneously. The last three images show the system immediately preceding and following detachment of the lower shell. Missing only one of its 32 subunits, the shell is connected to the remainder of the droplet only by a narrow neck of cargo. Insertion of the final subunit breaks the neck and completes shell detachment. The complete shell contains 120-130 cargo particles, which is slighty above random close packing ( » 120 particles) but below fcc density ( » 150 particles, see appendix 1.2).
Increasing the shell-shell interaction strength drives faster shell assembly and closure, thus limiting the size of the globule before budding. For the largest interaction strength we simulated (" SS ¼ 4:5) the globule typically does not exceed the size of a single shell, and multiple globules nucleate within the simulation box (  on shell-shell interaction strengths, since multiple nucleation events were rare in the carboxysome assembly experiments (Cameron et al., 2013) (however, we discuss potential complicating factors within the cellular environment below). To quantify the relationship between assembly mechanism and parameter values, we calculate an assembly order parameter, defined as the maximum number of unassembled subunits adsorbed onto a globule during an assembly trajectory. The order parameter is shown as a function of the interaction strengths in Figure 4. For " CC ! 1:6 and " SS 3 we observe large values of the order parameter (e.g. >32, the red and yellow regions in Figure 4), which indicate formation of a large amorphous globule consisent with the procarboxysome precursor to carboxysome shell assembly (Cameron et al., 2013).

Other assembly products
Outside of the optimal parameter ranges, we observe several classes of alternative outcomes. Overly weak shell-shell interactions fail to drive assembly. For " CC ¼ 1:6 and " SC 4 the cargo vapor phase is metastable, and the system remains 'Unnucleated' (with no cargo globule) on simulated timescales (we discuss alternative initial conditions below). Stronger cargo-cargo or shell-cargo interactions result in unassembled 'Globules', where a cargo globule forms but the shell subunits on its surface fail to nucleate. As " SS increases, we observe assembly on the globule, leading either to complete shells or two classes of incomplete assembly. In the first incomplete case, 'Attached', one or more shells almost reaches completion, but fails to detach from the droplet within simulated timescales. 'Attached' configurations occur for low " SC , when the subunit-cargo interaction does not provide a strong enough driving force for the last subunit(s) to penetrate the cargo and close the shell. Overly strong interactions drive the other class of incomplete assembly: 'Over-nucleated/Malformed', in which an excess of partially assembled shells deplete the system of free subunits before any shells are completed. In this regime it is also common to observe malformed structures, in which defects become trapped within growing shells.
As the cargo-cargo interaction increases (" CC ! 1:8), multiple effects narrow the parameter range that leads to complete assembly and detachment. Firstly, cargo globules nucleate rapidly at multiple locations within the simulation box, increasing the likelihood of the 'Over-nucleated' outcome. Secondly, the threshold value of " SC required for cargo penetration increases, resulting in 'Attached' shells over a wider parameter range. We also observe a configuration we refer to as 'Stalled', in which shell assembly fails to penetrate the globule surface (and thus does not even proceed to the attached stage). The latter is especially prevalent for " CC ¼ 3:0, when the cargo crystallizes even in the absence of shell encapsulation. For both 'Attached' and 'Stalled' configurations, regardless of the initial number of nucleation events, we typically observe coarsening into a large globule.  Simultaneous shell assembly and cargo condensation For " CC ¼ 1:3 the cargo forms an equilibrium vapor phase in the absence of shell subunits. However, above threshold values of " SS and " SC , the diffuse cargo molecules drive nucleation of shell assembly. The subsequent assembly pathway depends sensitively on the shell-cargo interaction strength. For low " SC ( Figure 2C), assembly captures only a few cargo molecules, leading to complete, but nearly empty shells. For larger " SC ( Figure 2B, and Simulation Video 2), the shell-cargo interactions drive local condensation of cargo molecules. Shell assembly and cargo complexation then proceed in concert, resembling the mechanism proposed for assembly of a-carboxysomes (Iancu et al., 2010). Thus, tuning the shell-cargo interaction dramatically affects cargo loading, with a sharp transition from empty to filled shells around " SC ¼2. This transition closely tracks the equilibrium filling fraction ( Figure 5C), measured by simulating a complete shell made permeable to cargo molecules. This effect is comparable to the condensation of water vapor below its dew point inside of hydrophilic cavities. In contrast, assembly around a globule only generates full shells.

Complete (full shell)
Complete ( Assembly of full shells (by either pathway, Figure 2A or Figure 2B) is typically about two orders of magnitude faster than assembly of empty shells ( Figure 2C). This disparity demonstrates the key role that the cargo plays in promoting shell association, during all stages of assembly. Cargo molecules initially promote shell nucleation by stabilizing interactions among small, sub-nucleated clusters. Then, the presence of a condensed globule provides a large cross-section for adsorption of additional subunits, significantly enhancing the flux of subunits to the partial capsid, thus increasing its growth rate. The condensed cargo particularly facilitates insertion of the last few subunits, which are significantly hindered by steric interactions, as noted previously for simulations of empty virus capsids (Nguyen et al., 2007). Figure 5A shows how the products of assembly around cargo with weak interactions depends on parameters. While moderate parameter values lead to complete assembly, overly weak " SC and " SS (lower left region of Figure 5A) prevent shell nucleation, leading to the 'Unnucleated' outcome. In the limit of large " SC but weak " SS the shell-cargo interaction stabilizes small disordered globules (~50 cargo particles, lower right region of Figure 5A), while under strong subunit and weak cargo interactions (" SS ¼ 4:5, " SC <5) shells nucleate but cannot condense the cargo, leading to the complete but slow assembly just discussed. As for assembly around a globule, overly strong interactions lead to overnucleation and malformed shells. However, the predominant mode of malformation is now shell collapse. Because the cargo is below its dew point, the locally condensed globule leads to a negative pressure on the shell subunits, which can flatten the shell and thus prevent closure of a symmetric shell.

Thermodynamic model
The simple free energy model (Equations (1-2)) reproduces the threshold parameter values required for shell assembly with no adjustable parameters (color map in Figure 3). Since it is an equilibrium model and only considers the free energy difference between complete and unassembled configurations, it cannot distinguish between parameter values that lead to complete assembly or kinetic traps at the long but finite simulation times. However, the thermodynamic calculation does suggest that the simulations resulting in 'Attached' shells would eventually reach completion on a longer timescale. We do not show Df assem in Figure 5A because the globule is always less favorable than assembled shells for " CC ¼ 1:3, but the yield of well-formed shells in our simulations roughly follows the prediction of the equilibrium theory ( Figure 5-figure supplement 1).

Effects of varying other parameters or initial conditions
To investigate whether the results described above depend on assumptions within our model, we performed several sets of additional simulations. Firstly, we performed simulations in which the ratio between cargo diameter in shell subunit size was varied. As shown in Figure 5-figure supplement 2, assembly is most robust for our default cargo diameter (for which the model was parameterized), but productive assembly occurs for cargo diameters varied over a factor of four. Secondly, we performed assembly simulations with anisotropic cargo molecules with a shape motivated by the octomer structure of the RuBisCO holoenzyme (Figure 2-figure supplement 2).
Thirdly, we performed a set of simulations in which we pre-equilibrated the cargo globule before introducing shell subunits into the system (Figure 3-figure supplement 2, Simulation Video 3). Investigating this alternative initial condition was motivated by the fact that RuBisCO is present in the cell before induction of the Video 3. Animation of a simulation with a preequilibrated cargo globule. Parameters are " CC ¼ 1:6, " SC ¼ 6, and " SS ¼ 3:5. DOI: 10.7554/eLife.14078.021 carboxysome gene in the experiments of Ref. (Cameron et al., 2013), and by the observation that multiple carboxysomes bud sequentially in time from a single procarboxysome. For " CC ¼ 1:6 the results are very similar to those obtained without pre-equilibrating the cargo. However, for " CC >1:6, successful assembly and detachment is limited to more narrow ranges of shell-shell and shell-cargo interaction strengths than in Figure 3, due to an increased prevalence of 'Attached' and 'Stalled' configurations. The latter are particularly common for " CC ¼3, when the cargo forms a hexagonally close packed crystal which strongly resists deformation by shell protein assembly.
Taken together, the results from both assembly protocols (Figure 3 and Figure 3-figure supplement 2) suggest that moderate effective cargo-cargo interactions are most consistent with the observations of shell assembly and budding in Refs. (Cameron et al., 2013;Chen et al., 2013). Such interactions are strong enough to drive cargo globule formation, but malleable enough to allow shell assembly to deform and eventually sever intra-globule interactions.

Organization of encapsulated cargo
Studies of assembled carboxysomes report varying degrees of order for the encapsulated cargo, ranging from none to paracrystalline order (Iancu et al., 2007;Kaneko et al., 2006;Schmid et al., 2006). We therefore studied the relationship between cargo order and interaction parameters using equilibrium simulations (see Figure 6 and Figure 6-figure supplement 1). Below " CC <3k B T, we do not observe true fcc order of the encapsulated cargo. However, for all parameters leading to significant filling, even those well below the cargo liquid-vapor transition, the cargo becomes organized in concentric layers ( Figure 6). We observe similar cargo organizations within shells which have budded from cargo globules in dynamical simulations. These results demonstrate that ordering of the cargo does not require crystallinity of the initial globule. Moreover, the magnitude of ordering increases with cargo loading, but, for fixed loading, is essentially independent of the cargo-shell interaction strength " SC . We observe ordering within filled shells due to confinement, even if even if " SC is set to 0 ( Figure 6-figure supplement 1), as previously noted by Iancu et al. (Iancu et al., 2007).

Discussion
We have described an equilibrium theory and a dynamical computational model for the assembly of shells around a fluid cargo. Our simulations show that assembly can proceed by two classes of pathways: (i) a multi-step process in which the cargo forms a dense globule, followed by adsorption, assembly, and budding of shell proteins, or (ii) single-step assembly, with simultaneous aggregation of cargo molecules and shell assembly. This result demonstrates that the minimal interactions included in our model are sufficient to drive both classes of assembly pathways, suggesting that they are a generic feature of assembly around a fluid cargo. Moreover, while we cannot rule out the existence of active mechanisms in biological examples such as carboxysomes, our model demonstrates that the same interactions which drive assembly of shells can also drive budding from and closure around an amorphous globule of cargo.
Our results suggest bounds on the relative strengths of interactions that drive BMC assembly in cells. The decisive control parameter determining the assembly pathway is the cohesive energy between cargo molecules, which could arise through direct cargo-cargo interactions or be mediated by auxiliary proteins (Cameron et al., 2013). Relatively weak cargo interactions lead to single-step assembly pathways, while stronger interactions favor formation of the cargo-shell globule. However, the strength of cargo-shell and shell-shell interactions also play a role. Strong shell-shell interactions cause assembly to proceed rapidly during globule formation, limiting the size of the globule. Moreover, if a large globule is already present (e.g. due to time-dependent protein concentrations within a cell), strong interactions tend to drive malformed assemblies. We find that an important functional difference between the two classes of assembly pathways is control over the amount of packaged cargo. While the multi-step assembly pathways always generate a shell filled with cargo molecules, shells assembling around a diffuse cargo can be tuned from nearly empty to completely full by controlling the strength of cargo-shell interactions.
These results have implications for reengineering BMCs to encapsulate new core enzymes. Recent works demonstrated that protein cargos can be targeted to BMCs via encapsulation peptides that mediate cargo-shell interactions. However, packaged amounts were much lower than for native core enzymes (Parsons et al., 2010;Choudhary et al., 2012;Lassila et al., 2014). Our simulations show that both cargo-shell and cargo-cargo interactions (direct or mediated) must be controlled to assemble full shells.
We also find that a general equilibrium theory describes the ranges of parameter values for which assembly occurs. However, the dynamical simulations demonstrate that, at finite timescales, there is a rich variety of assembly morphologies. Formation of ordered, full shells requires a delicate balance of cargo-cargo, cargo-shell, and shell-shell interactions, all of which must be on the order 5 À 10k B T. This constraint is consistent with previous studies on viruses and other assembly systems, which found that formation of ordered states requires multiple, cooperative weak interactions between subunits (Hagan, 2014;Whitelam and Jack, 2015). Outside of optimal parameter regimes, the simulations predict alternative outcomes, ranging from no assembly to various alternative trapped intermediates, with the morphology depending on which interaction is strongest. We find that assembly is least robust to parameter variations when the cargo crystallizes before shell assembly. The assembling shell is unable to deform or penetrate the cargo complex, leading to defect-riddled, non-budded complexes. Within the limits of our simplified model, this observation suggests that procarboxysome complexes are at least partially fluid prior to successful shell assembly. Moreover, we find that observations of ordered cargo within assembled shells may be explained by packing constraints. An important limitation of the present study is that the model interactions are specific to the shell geometry shown in Figure 1 (containing 20 hexamers) because alternating edges on hexagonal subunits have attractive interactions only with pentagonal subunits. In reality BMCs contain many more hexamers (formed from multiple protein sequences) and thus must include a greater range of hexamer-hexamer interactions. Extension of the model to allow for this possibility would allow consideration of two important questions: (1) The mechanism controlling insertion of the 12 pentagons required for a closed shell topology. (2) The relationship between assembly pathway and BMC size polydispersity. In particular, experiments suggest that b-carboxysomes are more polydisperse than a-carboxysomes (Price and Badger, 1991;Iancu et al., 2007;Kerfeld et al., 2010;Tanaka et al., 2008). We speculate that in the case of assembly around vapor-phase cargo, the size of the assembling shell will be primarily dictated by the preferred shell protein curvature and thus relatively uniform. However, during assembly around a condensed globule, the shell protein interactions could be strained to accommodate a globule which is larger or smaller than the preferred curvature, causing the shell size to depend on a complex balance of intermolecular interaction strengths and variables such as the local RuBisCO concentration.
Our model is minimal, intended to elucidate general principles of assembly around a fluid cargo, and thus may apply to diverse systems including prokaryotic microcompartments, viruses, and engineered delivery vehicles. The predicted trends for how assembly mechanisms and morphologies vary with control parameters can be experimentally tested by microscopy experiments. Such testing will be most straightforward in vitro (e.g. Luque et al., 2014;Douglas and Young, 1998;Patterson et al., 2014;Patterson et al., 2012;Zhu et al., 2014;Rhee et al., 2011;Wö rsdö rfer et al., 2012), where subunit-subunit interactions can be tuned by varying solution conditions and the stoichiometries of shell and cargo species can be readily varied. While there is currently no BMC assembly system starting from purified components, our findings can be tested in vivo by mutations which alter known protein binding interfaces, or by altering expression levels of RuBisCO or carboxysome proteins.
We anticipate that our model can serve as a qualitative guide for understanding how such multicomponent complexes assemble in natural systems, or to reengineer them for new applications. More broadly, our results demonstrate that the properties of encapsulated cargo, such as its topology, geometry and interaction strengths, strongly influence assembly pathways and morphologies.

Computational model Shell subunits
We have adapted a model for virus assembly (Perlmutter et al., 2013;2014;Perlmutter and Hagan, 2015a;Wales, 2005;Fejer et al., 2009;Johnston et al., 2010;Ruiz-Herrero and Hagan, 2015) to describe assembly of an icosahedral shell around a fluid cargo. Each subunit contains 'Attractors' on its perimeter that mediate subunit-subunit attractions (as in Ruiz-Herrero and Hagan, 2015). Attractor interactions are specific -complementary pairs of Attractors (see Figure 1A,B and appendix 1) have short-range interactions (modeled by a Morse potential), whereas non-complementary pairs have no interactions. A repulsive interaction between pairs of 'Top' (type 'T') pseudoatoms favors the correct subunit-subunit angle. The 'Bottom' (type 'B') pseudoatoms mediate shortranged subunit-cargo attractions (e.g. due to interactions with shell 'encapsulation peptides' (Kinney et al., 2012;Cameron et al., 2013;Fan et al., 2010)), represented by a Morse potential. We also add a layer of 'Excluders' in the plane of the 'Top' pseudoatoms, which represent subunitcargo excluded volume interactions. The strengths of subunit-subunit and subunit-cargo attractions are parameterized by potential well depths " SS and " SC respectively (appendix 1).

Cargo
As a minimal representation of globular proteins, the cargo is modeled as spherical particles which interact via an attractive Lennard-Jones (LJ) potential, with well-depth " CC . The attractions implicitly model hydrophobic and screened electrostatics interactions between cargo molecules, as well as effective cargo-cargo interactions mediated by auxiliary proteins (e.g. the carboxysome protein CcmM (Cameron et al., 2013)).

Simulations
We simulated assembly dynamics using the Langevin dynamics algorithm in HOOMD (a software package that uses GPUs to perform highly efficient dynamics simulations [Anderson et al., 2008]) and periodic boundary conditions to represent a bulk system. The subunits are modeled as rigid bodies (Nguyen et al., 2011). The simulations were performed using a set of fundamental units (URL. http://codeblue.umich.edu/hoomd-blue/doc/page_units.html), with 1d u defined as the circumradius of the pentagonal subunit (the cargo diameter is also set to 1 d u ). Unless specified otherwise, each simulation contained enough subunits to form four complete shells (48 pentamers and 80 hexamers) and 611 cargo particles (a shell typically encapsulates 120-130 cargo particles) in a cubic box with side length 40d u . The simulation time step was 0:001 in dimensionless time units, and dynamics was performed for 3 Â 10 8 timesteps unless mentioned otherwise.
We performed two sets of simulations, using different initial conditions. In the first, simulations were initialized by introducing cargo particles and shell subunits simultaneously with random positions and orientations (except avoiding high-energy overlaps). The second set of initial conditions was motivated by the possibility that the cargo globule could form before shell subunits reach sufficient concentrations within the cell to undergo assembly. To model this situation, we pre-equilibrated the cargo by performing a long simulation with only cargo particles present. Shell subunits were then introduced with random positions and orientations (excluding high-energy overlaps). For " CC ! 1:6, the assembly simulations thus began with a cargo globule already present. For " CC <1:6 the two protocols are equivalent, since no globule forms during cargo equilibration.

Sample sizes
To cover the largest range of parameter space possible given the computational expense associated with each simulation, we performed 5 independent simulations at most parameter sets. To assess statistical error and to estimate the distribution of different assembly outcomes, we performed 10 independent trials for one value of " SS at each value of " SC and " CC . We also performed additional simulations at parameter sets for which 5 trials did not result in a majority outcome, or when necessary to obtain better statistics on the number of encapsidated cargo particles. Based on these results, performing additional simulations at other parameter values would not qualitatively change our results. (It would increase the statistical accuracy of estimated boundaries between different outcomes; however, these boundaries correspond to crossovers rather than sharp transitions.)

Thermodynamics of assembly around a fluid cargo
To complement the finite-time simulations, we have developed a general thermodynamic description of assembly around a fluid cargo. We consider shells composed of species a ¼ 1; 2; . . . M, with n shell a subunits of species a in a complete shell, which encapsulates n 0 cargo molecules (the index 0 refers to cargo molecules henceforth). Assembly occurs from a dilute solution of cargo molecules with density 0 , shell subunits with density a for each species, and the density of assembled, full shells as shell . These are in equilibrium with a globule containing n glob 0 cargo molecules and n glob a subunits for each species a. We assume that, due to the asymmetric nature of the shell-cargo interaction, the shell subunits reside at the exterior of the globule (as we observe in our simulations). The globule containing unassembled shell subunits thus resembles a spherical microemulsion droplet (Safran, 1994). Minimizing the total free energy (see appendix 2) gives: where G shell is the interaction free energy of the assembled shell and a are the chemical potentials of free cargo molecules and shell subunits, given by a ¼ k B T lnð a v 0 Þ, with v 0 a standard state volume and the globule composition given by qG glob ðfn glob a gÞ qn glob a ¼ a for a ¼ 0 . . . M; (2) with G glob ðn glob s ; n glob 0 Þ as the globule free energy. (1) -(2) are the general equilibrium description for a system of assembling shells with a disordered-phase intermediate; application to a specific system requires specifying the forms of G shell and G glob . In appendix 2 we specify these equations for our computational model, allowing us to compare the equilibrium calculation with simulation results, using no free parameters.
To compare the relative stabilities of the globule and assembled shells, we also calculate the free energy difference Df assem ¼ f tot ðfn glob a ¼ 0gÞ À f tot ð shell ¼ 0Þ; ( 3) where the first term on the right-hand side is the minimized free energy for a system containing shells and free subunits but no globule, while the second term corresponds to the minimized free energy for a system containing subunits and the globule, but no assembly.
where P sub i P sub j<i is the sum over all distinct pairs of subunits in the system, P sub i P cargo j is the sum over all subunit-cargo particle pairs, etc.

Subunit-subunit interactions
The subunit-subunit potential U SS is the sum of the attractive interactions between complementary attractors, and geometry guiding repulsive interactions between 'Top' -'Top', 'Bottom' -'Bottom', and 'Top' -'Bottom' pairs. There are no interactions between members of the same rigid body. Thus, for notational clarity, we index rigid bodies and non-rigid pseudoatoms in Roman, while the pseudoatoms comprising a particular rigid body are indexed in Greek. For subunit i we denote its attractor positions as fa ia g with the set comprising all attractors a, its 'Top' position ft i g, and 'Bottom' position fb i g. The subunitsubunit interaction potential between two subunits i and j is then defined as: where " SS is an adjustable parameter which both sets the strength of the subunit-subunit attraction at each attractor site and scales the repulsive interactions which enforce the geometry, N ai is the number of attractor pseudoatoms in subunit i, s tb ¼ 1:8r b is the diameter of the 'Top' -'Bottom' interaction (this prevents subunits from binding in inverted configurations (Johnston et al., 2010), and s b ¼ 1:5r b is the diameter of the 'Bottom' -'Bottom' interaction.
In contrast to the latter parameters, s t;ij the effective diameter of the 'Top' -'Top' interaction, depends on the species of subunits i and j; denoting a pentagonal or hexagonal subunit as p or h respectively, s t;pp ¼ 2:1r b , s t;hh ¼ 2:436r b , and s t;ph ¼ 2:269r b . The parameter r 0 is the minimum energy attractor distance, set to 0:2r b , % is a parameter determining the width of the attractive interaction, set to 4r b , and r att cut is the cutoff distance for the attractor potential set to 2:0r b . Since the interactions just described are sufficient to describe assembly of the shell subunits, we included no excluder-excluder interactions.

Cargo-cargo interactions
The interaction between cargo particles is given by with L the full Lennard-Jones interaction: Lðx; s; r cut Þ ¼ ðx À r cut Þ Â and " CC is an adjustable parameter which sets the strength of the cargo-cargo interaction, N l is the number of LJ particles, s C is the cargo diameter set to 1:0r b except where mentioned otherwise, and r c cut is set to 3s C .

Subunit-cargo interactions
The subunit-cargo interaction is a short-range repulsion between cargo-excluder and cargo-'Top' pairs reresenting the excluded volume plus an attractive interaction between the cargo -'Bottom' pairs. For subunit i with excluder positions fx ia g and 'Bottom' psuedoatom fb ia g and cargo particle j with position R j , the potential is: where " SC parameterizes the shell-cargo interaction strength, N x , N t , and N b are the numbers of excluders, 'Top', and 'Bottom' pseudoatoms on a shell subunit, s ex ¼ 0:5r b and s t ¼ 0:5r b are the effective diameters of the Excluder -cargo and 'Top' -cargo repulsions, r 0 is the minimum energy attractor distance, set to 0:5r b , % is a parameter determining the width of the attractive interaction, set to 2:5r b , and r cut is the cutoff distance for the attractor potential set to 3:0r b .

Motivation for choice of interaction potentials
The choices we have made for potential functions (Morse or Lennard-Jones) between different classes of pseudoatoms are based on the need for tunability of the interaction length scale and the extent to which guidance on parameterization is available from the existing literature. In particular, the Morse potential enables controlling the interaction length scale independently from the particle excluded volume size, whereas the interaction length scale and excluded volume size are tuned by a single parameter in the Lennard-Jones potential. Our shell-shell interaction potential is based on previous models for viral capsid assembly (Wales, 2005;Johnston et al., 2010;Ruiz-Herrero and Hagan, 2015;Perlmutter et al., 2013;2014;Perlmutter and Hagan, 2015b), and the choice of a Morse potential for attractor-attractor interactions and a Lennard-Jones potential for Top-Top interactions follows these previous works. The attractor interactions are modeled using a Morse potential because the length scale of their interaction strongly affects the subunit orientational specificity. We chose to model the cargo-cargo interaction using a Lennard-Jones potential because the phase behavior for this model has been extensively studied in the literature, thus limiting the need for model parameterization. However, we note that it could be of interest to study how the probability of shell detachment depends on the length scale of the cargo-cargo interaction; we speculate that a longer-range interaction would increase the probability of detachment by making it easier for shell subunits to penetrate into the globule. Finally, the shell-cargo interactions could have used either choice of potential; we elected to use a Morse potential due to its greater flexibility.

Maximum cargo loading
To give context to the densities of packaged cargo particles that we observe in simulations, we estimate the maximum possible cargo loading here. Our assembled shell has the geometry of a truncated icosahedron with an edge length of approximately 1:5d u . Accounting for the volume occluded to cargo particles by the shell pseudoatoms, the interior volume is V in » 109d 3 u . The maximum number of cargo molecules that can be packaged (assuming hexagonal close packing) is thus N HCP » 154. However, this is an overestimate since the shell geometry is not commensurate with perfect hexagonal close packing. We thus estimate N HCP ¼ 148, the maximum number of packaged cargo particles seen in an equilibrium simulation. The maximum cargo loading for random close packing is then N RCP » 120.