Minimal Out-of-Equilibrium Metabolism for Synthetic Cells: A Membrane Perspective

Life-like systems need to maintain a basal metabolism, which includes importing a variety of building blocks required for macromolecule synthesis, exporting dead-end products, and recycling cofactors and metabolic intermediates, while maintaining steady internal physical and chemical conditions (physicochemical homeostasis). A compartment, such as a unilamellar vesicle, functionalized with membrane-embedded transport proteins and with metabolic enzymes encapsulated in the lumen, meets these requirements. Here, we identify four modules designed for a minimal metabolism in a synthetic cell with a lipid bilayer boundary: energy provision and conversion, physicochemical homeostasis, metabolite transport, and membrane expansion. We review design strategies that can be used to fulfill these functions, with a focus on the lipid and membrane protein composition of a cell. We compare our bottom-up design with the equivalent essential modules of JCVI-syn3A, a top-down genome-minimized living cell with a size comparable to that of large unilamellar vesicles. Finally, we discuss the bottlenecks related to the insertion of a complex mixture of membrane proteins into lipid bilayers and provide a semiquantitative estimate of the relative surface area and lipid-to-protein mass ratios (i.e., the minimal number of membrane proteins) that are required for the construction of a synthetic cell.

energy conservation | metabolite transport | membrane composition | physicochemical homeostasis

Table of contents:
Table S1. Cell volumes and genome sizes of representative prokaryotes

Table S1. Cell volumes and genome sizes of representative prokaryotes. To avoid variability among different microbial subspecies, the genome size for each species was arbitrarily chosen from the NCBI Assembly database (links available in the 'Assembly' entry). The genome size is defined as the sum of the sizes of all chromosomes and plasmids, while the chromosome size refers to the number of nucleotides of the main chromosome.

Figure S1. Probabilities of encapsulating soluble components as a function of the vesicle radius. Three different concentrations (1, 5 and 10 µM) of soluble components were analyzed. The vesicle radius required for a given encapsulation probability varies inversely with the concentration. By contrast, the vesicle radius correlates with the expected abundance, that is, larger vesicles are required to obtain higher abundance values at a given concentration.

METHODS
Encapsulation probability. The number of enzyme molecules per vesicle (E) was calculated as a function of the vesicle size. A given internal concentration (ε) was multiplied by the Avogadro constant (N_A, 6.022 × 10²³ mol⁻¹) and the internal vesicle volume, as calculated for a sphere of given radius (r):

$$ E = \varepsilon \, N_A \, \tfrac{4}{3} \pi r^{3} $$
The cumulative probability (P) that a vesicle contains an enzyme at a given (or larger) abundance (x) was determined from the Poisson probability mass function with mean E:

$$ P(X \geq x) = 1 - \sum_{k=0}^{x-1} \frac{E^{k} e^{-E}}{k!} $$
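As an illustration of the calculation described above, the following minimal sketch computes the expected copy number per vesicle and the Poisson probability of encapsulating at least x copies. It assumes that the concentration is given in mol/L and the radius in nm, and that the mean of the Poisson distribution equals E; the function names are hypothetical and not taken from the original analysis.

```python
import math

AVOGADRO = 6.022e23  # mol^-1, value used in the text


def expected_copies(conc_molar, radius_nm):
    """Mean number of molecules (E) in a sphere of the given radius."""
    radius_dm = radius_nm * 1e-8  # 1 nm = 1e-8 dm, so the volume comes out in litres
    volume_l = (4.0 / 3.0) * math.pi * radius_dm ** 3
    return conc_molar * AVOGADRO * volume_l


def prob_at_least(x, mean):
    """P(X >= x) for a Poisson-distributed copy number with the given mean."""
    term = math.exp(-mean)  # P(X = 0)
    below = 0.0
    for k in range(x):  # accumulate P(X = 0) ... P(X = x - 1)
        below += term
        term *= mean / (k + 1)  # recurrence: P(X = k + 1) from P(X = k)
    return 1.0 - below


if __name__ == "__main__":
    # Concentrations from Figure S1; the 100 nm radius is an arbitrary example.
    for conc_um in (1, 5, 10):
        e = expected_copies(conc_um * 1e-6, radius_nm=100)
        print(f"{conc_um} uM, r = 100 nm: E = {e:.2f}, "
              f"P(X >= 1) = {prob_at_least(1, e):.3f}")
```

The term-by-term recurrence avoids large factorials; for very large means (above roughly 700) the first term underflows to zero and the function returns 1, which is adequate only when x is much smaller than the mean.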

Copy number of full complexes. The oligomeric state of a protein complex is taken into account in calculating the copy number, when it is known or can be inferred from a structure of a homologous protein (the copy number is halved when the protein forms a dimer). When information on the oligomeric state is missing, the protein is assumed to be a monomer.
Planar section area of protein complexes. The planar section area was determined by manually inspecting structures of homologs or by using AlphaFold predictions. The widest distance between two amino acid residues was measured at a 90° angle to the vertical protein axis; for protein complexes with bulkier cytosolic domains, these domains were measured instead to avoid steric clashes. The planar section area (s) was then calculated by approximating the structure as a cylinder:

$$ s = \pi \left( \frac{D}{2} \right)^{2} $$

where D is the measured widest distance. The cumulative planar section area (S) was calculated as the sum of the planar section areas of all proteins:

$$ S = \sum_{i} s_{i} $$

Next, the planar section area of each complex was multiplied by the abundance (c) of the protein in JCVI-syn3A to give the weighted planar section area (s_c):

$$ s_{c} = s \cdot c $$

The cumulative planar section area (S_c) was weighted accordingly:

$$ S_{c} = \sum_{i} s_{c,i} $$

Relative surface occupancy of protein complexes. A spherical geometry was assumed for calculating the total surface area of vesicles, in agreement with the shape of phospholipid vesicles under iso-osmotic conditions. The total surface area (A) was calculated accordingly, for radii (r) in the range 0-1000 nm:

$$ A = 4 \pi r^{2} $$

The relative surface occupancy of the full complexes was determined as the ratio between the cumulative planar section area and the total surface area:

$$ O = \frac{S_{c}}{A} $$

where O is the relative surface occupancy of all membrane proteins used in the analysis.
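The occupancy calculation can be sketched as follows, under the assumptions that each complex is approximated as a cylinder of diameter D, that the cumulative area is weighted by copy number, and that the vesicle is a sphere. The function names and the two example complexes (diameters and copy numbers) are hypothetical placeholders, not values from the JCVI-syn3A data set.

```python
import math


def planar_section_area(widest_distance_nm):
    """Cross-section of a cylinder whose diameter is the widest distance D (nm^2)."""
    return math.pi * (widest_distance_nm / 2.0) ** 2


def cumulative_area(complexes):
    """Sum of planar section areas weighted by copy number.

    complexes: iterable of (widest distance in nm, copy number of the full complex).
    """
    return sum(planar_section_area(d) * c for d, c in complexes)


def sphere_area(radius_nm):
    """Surface area of a spherical vesicle (nm^2)."""
    return 4.0 * math.pi * radius_nm ** 2


def occupancy(complexes, radius_nm):
    """Fraction of the vesicle surface covered by the protein complexes."""
    if radius_nm <= 0:
        raise ValueError("radius must be positive")  # r = 0 has no surface
    return cumulative_area(complexes) / sphere_area(radius_nm)


if __name__ == "__main__":
    # Hypothetical example: 100 copies of a 5 nm wide complex, 20 copies of a 10 nm one.
    example = [(5.0, 100), (10.0, 20)]
    for r in (50.0, 200.0, 1000.0):
        print(f"r = {r:.0f} nm: occupancy = {occupancy(example, r):.4f}")
```

An occupancy above 1 indicates that the chosen set of complexes does not fit on a vesicle of that radius, which is one way to read off a minimal vesicle size.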
Total protein mass. The mass of each gene product was taken from the database [1]. The mass of gene products that are part of oligomeric complexes was scaled accordingly (e.g., the mass of a protein was doubled when it is part of a homodimeric complex); this correction ensures that at least one active complex is counted. Proteins for which structural information was not available were assumed to be monomeric. The sum of the mass values (M) represents the total protein mass when one copy is present:

$$ M = \sum_{i} m_{i} $$

where m is the mass of each gene product. The weighted mass of each gene product was obtained by multiplying by the corresponding copy number:

$$ m_{c} = m \cdot c $$

where m_c is the mass of a gene product times the copy number (c) of the protein, and m is the mass of a gene product; the total protein mass (M_c) is the sum of the individual masses times the copy number:

$$ M_{c} = \sum_{i} m_{c,i} $$

Total lipid mass and lipid-to-protein mass ratio. The surface area available to lipids was determined by subtracting the surface occupied by protein complexes (see section above) from the surface areas of spheres with radii in the range 0-1000 nm; the bilayer arrangement was taken into account by including both inner and outer leaflets:

$$ A_{out} = 4 \pi r^{2} (1 - O) $$

$$ A_{in} = 4 \pi (r - 4\ \text{nm})^{2} (1 - O) $$

where A_out and A_in are the outer and inner leaflet surface areas, respectively, r is the radius in the range 0-1000 nm, and O is the relative surface occupancy of proteins (see above); a correction factor was applied to the inner radius to account for the approximate bilayer thickness of di-oleoyl phospholipids (~4 nm) [7]. The total lipid surface area (B, in nm²) was obtained from the sum of the outer and inner leaflet surface areas:

$$ B = A_{out} + A_{in} $$

Given a 25:25:50 mol:mol:mol DOPE:DOPG:DOPC composition of the vesicles, the total number of phospholipids was determined and converted into mass:

$$ L = \frac{B}{p} \cdot \frac{w}{N_A} $$

where L is the total lipid mass, p is the weighted phospholipid surface area (0.63 nm², calculated from DOPE = 0.524 nm² [8], DOPC = 0.651 nm² [8], and DOPG = 0.694 nm² [9]), w is the weighted molecular weight (777.8 g/mol), and N_A is the Avogadro constant (6.022 × 10²³ mol⁻¹). The total lipid mass obtained for each radius was divided by the total protein mass values (see above) to yield the lipid-to-protein mass ratio:

$$ \text{lipid-to-protein mass ratio} = \frac{L}{M_{c}} $$
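The mass calculations can be illustrated with the short sketch below. It assumes that the inner-leaflet radius is obtained by subtracting the ~4 nm bilayer thickness from the outer radius (one possible reading of the correction factor), that protein masses are supplied in Da per full complex, and that the same occupancy applies to both leaflets. The function names and the example input values are hypothetical.

```python
import math

AVOGADRO = 6.022e23        # mol^-1
BILAYER_NM = 4.0           # approximate bilayer thickness (assumed radius correction)
AREA_PER_LIPID_NM2 = 0.63  # weighted area per phospholipid, from the text
LIPID_MW = 777.8           # weighted molecular weight in g/mol, from the text


def total_protein_mass_g(complexes):
    """Total protein mass in grams.

    complexes: iterable of (mass of the full complex in Da, copy number).
    """
    return sum(mass_da * copies for mass_da, copies in complexes) / AVOGADRO


def lipid_mass_g(radius_nm, occupancy):
    """Mass in grams of the lipids covering both leaflets not occupied by protein."""
    if radius_nm <= BILAYER_NM:
        raise ValueError("radius must exceed the bilayer thickness")
    free = 1.0 - occupancy
    a_out = 4.0 * math.pi * radius_nm ** 2 * free
    a_in = 4.0 * math.pi * (radius_nm - BILAYER_NM) ** 2 * free
    n_lipids = (a_out + a_in) / AREA_PER_LIPID_NM2
    return n_lipids * LIPID_MW / AVOGADRO


def lipid_to_protein_ratio(radius_nm, occupancy, complexes):
    """Lipid mass divided by total protein mass (w/w)."""
    return lipid_mass_g(radius_nm, occupancy) / total_protein_mass_g(complexes)


if __name__ == "__main__":
    # Hypothetical input: two complexes of 50 kDa and 120 kDa with 100 and 20 copies.
    example = [(50_000.0, 100), (120_000.0, 20)]
    ratio = lipid_to_protein_ratio(200.0, 0.007, example)
    print(f"lipid-to-protein mass ratio at r = 200 nm: {ratio:.1f}")
```

Because the protein mass is independent of the radius while the lipid mass grows roughly with r², the ratio rises steeply with vesicle size for a fixed protein set.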