An interaction network approach predicts protein cage architectures in bionanotechnology

Significance
Protein containers play important roles in nanotechnology applications, including diagnostics, drug delivery, and vaccination. These applications require programmable control over container size and stoichiometry. The seminal Caspar–Klug approach for the classification of virus architecture predicts nanocontainer geometry if protein subunits exhibit similar interaction types across the capsid surface. However, in many nanoparticles, there are identical protein subunits that do not interact with other subunits in the same way, thus presenting gaps in the container surface. We demonstrate that in these cases, the interaction network between the assembly units can be used to gain predictive control over the spectrum of particle morphologies. This paves the way to programmable control over particle polymorphism and informs the design of nanoparticles for specific applications.


Introduction
Protein containers are ubiquitous in nature. Prominent examples are the viral protein shells, called capsids, that provide protection and transport for viral genomes between rounds of infection. Protein cages also serve vital functions in bacteria as microcompartments, or in prokaryotic cells, where encapsulins, ferritin and lumazine synthase cages facilitate catalysis [4], intracellular trafficking [8], and transport [40,45]. Nanoparticles, either derived from these naturally occurring protein containers or de novo designed, play pivotal roles in a host of applications, including vaccine development [5,50], cargo storage, drug delivery, gene therapy and diagnostics [15,43].
Viruses have evolved mechanisms to assemble specific geometric designs with high fidelity and efficiency. The vast majority of virus architectures exhibit icosahedral symmetry as a consequence of the principle of genetic economy [14], as this symmetry type allows container volume to be maximised without increasing the coding cost for its components. The additional volume in the confines of the capsid provides room to package genes supporting other functions, thus allowing viruses to gain more complexity with time. Insights into the geometric and mechanical properties of viral capsids afford a better understanding of viral life cycles. For example, buckling transitions from an initial spherical procapsid to the final icosahedral faceted shell have been shown to enhance a capsid's tolerance of internal pressures [2], and models of thermal dissociation have elucidated the processes of viral assembly and disassembly [13]. The structures of artificial nanoparticles, on the other hand, are not as well understood to date and exhibit a much wider spectrum of morphologies [6,25,29,44,30,42,31,50]. This is because they frequently violate the quasi-equivalence principle. A capsid is considered quasi-equivalent if it is held together by the same type of bonds throughout, allowing for deformations in slightly different ways in the different, non-symmetry related environments [11]. For example, violation of the principle occurs if a subset of the constituent protein subunits of the assembly unit of the capsid (capsomer) does not interact with neighbouring capsomers, resulting in larger gaps in the particle surface. As such gaps are biologically important, among other things for diffusion-limited encapsulation of complementarily charged guest molecules [38], a better understanding of the geometric construction principle of such artificial protein cages is required. This is also an important step towards control over nanoparticle size and stoichiometry, enabling their manufacturing to be optimised, and their biophysical properties to be tuned for specific applications [6].
The seminal Caspar-Klug (CK) theory was the first to propose quasi-equivalence as a geometric design principle for the structural organisation of icosahedral viruses [11]. CK theory indicates capsid protein (CP) positions relative to surface triangulations, ascribing CPs to the polyhedral angles (corners) of the triangular tiles (Fig. 1A).
The dual tilings, polyhedra with hexagonal and pentagonal faces akin to Buckminster Fuller's domes [32], predict the same capsid layout, again assuming CPs to be located in the polyhedral angles of the faces (Fig. 1A, grey). Therefore, viral capsids are interchangeably modelled via hexagonal surface lattices and triangulations in CK theory. These models predict protein positions correctly for capsids formed from pentagonal, hexagonal or triangular capsomers, i.e. from pentamers, hexamers and trimers. However, they do not accurately reflect the layout of capsids assembled from dimers (SI Fig. S1). Viral Tiling theory (VTT) recognises that this is because the tiles in the surface lattice must be in a one-to-one correspondence with biological units. It therefore represents bacteriophage MS2, which assembles from protein dimers, by a rhomb tiling (Fig. 1B), thus capturing the correct relative CP orientations. More general types of tilings are required for other capsomer types if viral capsids are formed from more than one type of protein unit [49].
A further complication arises for human papillomavirus (HPV), a capsid formed from 72 identical pentamers. Due to the crystallographic restriction [41] there is no all-pentamer surface lattice with more than 12 identical pentagonal tiles, implying that pentamers cannot be modelled in a biologically meaningful way by pentagonal tiles. Again recognising the need for tiles to have an interpretation in terms of biological units, VTT introduces two types of tiles for HPV, a rhomb and a kite, that are in a one-to-one correspondence with the two types of interactions mediated by the C-terminal arms stabilising the capsid: kites corresponding to three proteins forming a trimer interaction, and rhombs representing two proteins forming a dimer interaction [47] (Fig. 1C). This tiling is reminiscent of the Penrose tiling [36], an aperiodic tiling given in terms of kites and rhombs, and 3D Penrose tilings can also be used to approximate virus structure [37]. There are other approaches modelling virus architecture, using a local rules approach [39] or a dodecahedral nets approach that associates the centres of mass of protein molecules in icosahedrally symmetric viral cages with the nodes of a chiral pentagonal tiling [26]. However, the lack of a direct correspondence between tiles and biological units limits the predictive power of these approaches. They describe the layouts of different viruses built according to the same mathematical principle, but do not allow for the classification of particle morphologies that can be formed from the same types of building blocks. By contrast, this is possible in VTT as tiles have an interpretation in terms of assembly units or specific types of protein-protein interactions.
Protein nanoparticles exhibit a wider spectrum of morphologies than viruses because they also include structures violating the quasi-equivalence principle. This occurs if some of the protein subunits in a capsomer do not interact with other capsomers, as is the case for the 36-pentamer particles formed from Aquifex aeolicus lumazine synthase (AaLS) [38] (Fig. 1D). In this case, neither capsomers nor interactions between individual CPs can be represented by tiles in a meaningful way. Of course, the particle surface can always be represented as a tessellation in terms of multiple copies of the fundamental domain (also called asymmetric unit) of the underlying symmetry group, in this case consisting of three Voronoi cells (SI Fig. S2). However, in such mathematical representations, there is no clear biological interpretation of the lattice unit in terms of individual capsomers or interactions between protein subunits. Therefore, such an approach does not allow prediction of other possible cages made from AaLS pentamers.
In order to achieve this, we construct the interaction network between capsomers (rather than between their constituent protein subunits) [9]. For this, the centres of mass (CoMs) of the capsomers are computed based on the coordinates in the PDB file. These then form nodes in a network in which connecting edges indicate interactions between capsomers (Fig. 1D). This coarse-grained topological descriptor of capsid architecture ignores the geometry of individual capsomers, and interactions formed by individual protein subunits, and is therefore distinct from the surface lattice models in VTT. However, the geometric structure of this interaction network can be embedded into a tiling. For this, the symmetry axes of the particle are aligned with those of a reference cube (see also SI Fig. S3), and then the cubic surface is embedded into a planar tiling that continues the interaction network periodically in the plane. This tiling can then be used to construct models for other particle types via different embeddings of the cubic surface, akin to the embedding of icosahedral surfaces into hexagonal lattices in Caspar-Klug theory.
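The construction of the interaction network can be illustrated with a minimal Python sketch. It is not the authors' implementation: the contact criterion (any pair of atoms within a distance `cutoff`), the cutoff value, the unweighted centroid used in place of a mass-weighted centre of mass, and the function names are all assumptions made for illustration. Reading coordinates from a PDB file is omitted; the capsomers are passed in as plain coordinate lists.

```python
import math
from itertools import combinations

def centre_of_mass(coords):
    """Unweighted centroid of (x, y, z) positions (assumption: equal atomic masses)."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def interaction_network(capsomers, cutoff=4.0):
    """Build a capsomer-level interaction network.

    capsomers: dict mapping a capsomer label to a list of (x, y, z) atom positions.
    cutoff:    hypothetical contact distance; two capsomers count as interacting
               if any pair of their atoms lies within this distance.
    Returns (nodes, edges): nodes maps labels to centroids, edges is a set of label pairs.
    """
    nodes = {name: centre_of_mass(atoms) for name, atoms in capsomers.items()}
    edges = set()
    for a, b in combinations(sorted(capsomers), 2):
        if any(math.dist(p, q) <= cutoff for p in capsomers[a] for q in capsomers[b]):
            edges.add((a, b))
    return nodes, edges

# Toy input: three "capsomers" of two atoms each, placed along the x-axis.
toy = {
    "A": [(0, 0, 0), (2, 0, 0)],
    "B": [(5, 0, 0), (7, 0, 0)],
    "C": [(20, 0, 0), (22, 0, 0)],
}
nodes, edges = interaction_network(toy)
print(nodes["A"], edges)  # (1.0, 0.0, 0.0) {('A', 'B')}
```

In this toy input only A and B are within the cutoff, so the network has three nodes and a single edge. The pairwise atom comparison is quadratic and would need a spatial index for real structures.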
The cubic protein container designed by Lai et al. [28] (Fig. 1E, left) provides a simple example of this interaction network approach. Representing each of its 24 proteins as a node (Fig. 1E, left) and drawing connections between interacting proteins results in the interaction network (shown in black). By aligning the 4-fold and 3-fold symmetry axes of the particle with those of a reference cube (Fig. 1E, middle), the interaction network can be mapped onto the cubic surface. The latter is then embedded into a planar tiling by "unfolding" the cubic surface in the plane (Fig. 1E, right). Any other particles assembled from the same protein units should exhibit similar local interactions. From a mathematical point of view, this means that their interaction networks can be constructed by working backwards from the planar tiling. In particular, any other planar embedding of the cubic surface, obtained via rescaling and reorienting the surface in the plane such that the symmetry axes of the cube again coincide with those of the tiling (see also SI Fig. S3), then presents an alternative particle layout. To reconstruct the biological model, vertices have to be replaced by biological units, oriented such that interacting units meet along the edges of the interaction network (Fig. 1F). This cage architecture, inferred via our method, has been observed [28,29], suggesting that our method can indeed be used to predict viable alternative protein container designs.
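The role of the reference cube can be made concrete with a short Python sketch. It assumes, for illustration only, that the cage has the full rotational symmetry of the cube and that each protein corresponds to one copy of a generic point under that symmetry; the function names and the sample point `(1, 2, 3)` are hypothetical. The proper rotations of a cube are the signed permutation matrices with determinant +1, and the orbit of a generic point under them has 24 members, matching the 24 nodes of the network.

```python
from itertools import permutations, product

def det3(m):
    """Determinant of a 3x3 integer matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cube_rotations():
    """All proper rotations of a cube: signed permutation matrices with det = +1."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[signs[r] if perm[r] == c else 0 for c in range(3)] for r in range(3)]
            if det3(m) == 1:
                rotations.append(m)
    return rotations

def apply(m, p):
    """Apply the matrix m to the point p."""
    return tuple(sum(m[r][c] * p[c] for c in range(3)) for r in range(3))

rotations = cube_rotations()
# A point off every symmetry axis (hypothetical position of one protein).
orbit = {apply(m, (1, 2, 3)) for m in rotations}
print(len(rotations), len(orbit))  # 24 24
```

A point placed on a symmetry axis would have a smaller orbit, for example 6 copies on a 4-fold axis and 8 on a 3-fold axis, which is why the positions of the symmetry axes constrain where capsomers can sit.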
The ability to predict alternative particle morphologies that can assemble from the same protein unit(s) is important in nanotechnology. It can be used in the context of kinetic models to compute relative ratios of different particle morphologies for different experimental conditions, thus opening up the opportunity to tune experiments to favour production of desired particle types [6]. It also informs the reconstruction of less frequent particle types from cryo-EM data in the case of polymorphic assembly, and guides the selection of particle morphologies with desired biophysical properties for specific applications [9]. In the following, we demonstrate the predictive power of our approach for a more complex system (cages formed from AaLS pentamers) in which nodes in the interaction network represent assembly units composed of multiple protein subunits. This analysis illustrates how the interaction network approach can be used to predict and classify particle structures for any system of interest in bionanotechnology.

The building blocks of the interaction network
Given a nanoparticle of interest, the first step consists in computing its interaction network. In AaLS-based nanoparticles, pentamers and their genetic variants self-assemble into a spectrum of different particles with tetrahedral and icosahedral symmetry [38]. Representing the CoMs of the pentamers as nodes and drawing edges between interacting pentamers, we obtain the interaction network (Fig. 2A). The next step in the analysis is to identify the distinct geometric shapes in the interaction network. In this case, there are two types: interactions in groups of three, corresponding to triangles, and interactions in groups of six, corresponding to non-regular hexagons. The latter, called squashed hexagons in the following because of their characteristic shapes, can be divided into two triangles and one square as indicated by dashed red lines in Fig. 2B.

Planar tilings representing the interaction network
The interaction network of AaLS pentamer cages can therefore be embedded into planar tilings made of triangles and squashed hexagons. For other nanoparticles, the nature of tiles will depend on the local interaction patterns and may therefore differ. However, as nanoparticles self-assembling from (potentially multiple different) protein units exhibit only a limited spectrum of distinct local interaction patterns between neighbouring capsomers, planar tilings associated with their interaction networks must all be k-uniform tilings, meaning they are tessellations of the plane with only a limited number (k) of distinct vertex types (cf. SI Fig. S4 and SI text). In the case of AaLS-based nanoparticles, we therefore consider k-uniform tilings made of triangles and squares as an intermediate step to obtaining all possible tilings in terms of triangles and squashed hexagons by deleting edges in the triangle-square tilings.
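The vertex types available in a triangle-square tiling follow from a simple angle count, sketched below in Python. The sketch assumes regular tiles (interior angles of 60° for triangles and 90° for squares) and a hypothetical helper name `vertex_types`; it enumerates how many triangles and squares can meet at a vertex so that their angles sum to 360°.

```python
def vertex_types(max_polygons=6):
    """(triangles, squares) counts whose interior angles fill 360 degrees at a vertex."""
    return [
        (t, s)
        for t in range(max_polygons + 1)
        for s in range(max_polygons + 1)
        if t + s > 0 and 60 * t + 90 * s == 360
    ]

print(vertex_types())  # [(0, 4), (3, 2), (6, 0)]
```

Six triangles give the vertex type of the triangular tiling and four squares that of the square tiling. Three triangles and two squares can be arranged around a vertex in two cyclic orders, with the squares adjacent or separated by a triangle, which is why two distinct vertex symbols appear for this count in the tilings considered in the text.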

Classification of tilings associated with the interaction network
k-uniform tilings have been classified. There are in total 575 tilings with polygonal faces up to k = 5 (see SI text), 140 of which are made entirely of triangles and squares [12,19,18]. The combinatorial task is to identify all possible tilings that are given exclusively in terms of the building blocks of the interaction network, here triangles and squashed hexagons. Note that the tiling can potentially also include a set of squares aligning with the 3-fold symmetry axes of the cage, as these become triangles in the 3D surface (see Fig. 1E and F for an example). As tilings in which four squares tessellate a bigger square cannot be subdivided into triangles and squashed hexagons, these tilings are excluded from further analysis, reducing the number of candidates to 58. In order to construct a particle with tetrahedral or octahedral symmetry from such tilings, the symmetry axes of the protein cage must be identified with symmetries in the tiling, as illustrated in Figs. 1E and F. This requires the tiling to have 3-fold and/or 4-fold symmetries. Of the 58 tilings, 46 have only 2-fold symmetry and therefore cannot be used to construct the surface lattices of particles with tetrahedral or octahedral symmetry. We checked each of the 12 remaining tilings individually, excluding five tilings (SI Fig. S5A) from further consideration (cf. SI Fig. S6A and SI text). Three of the remaining seven tilings (SI Fig. S5B) contain local 6-fold symmetry axes and therefore must also be excluded as they would require vertices representing pentamers to be positioned on a 6-fold symmetry axis (cf. SI Fig. S6B and SI text). In summary, only four tilings fulfil all necessary criteria to allow for the embedding of an AaLS interaction network. These are the triangular $(3^6)$ and the snub square $(3^2 \cdot 4 \cdot 3 \cdot 4)$ tilings, together with the 2-uniform tilings $(3^3 \cdot 4^2,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ and $(3^6,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ (Fig. 2C).
For each of these tilings, we next identify all possible ways in which they can be reorganised into triangles and squashed hexagons. First, we focus on the snub square tiling, which has 4-fold symmetry axes at the centres of its squares (Fig. 2C, left). There are four inequivalent options of organising triangles and squashed hexagons around these axes (Fig. 2D). One option is to accommodate four triangles around the square (yellow) and continue with squashed hexagons (type-1). The other option is to organise four squashed hexagons around a square (magenta) and then continue in one of the following ways: either locate triangles between the squashed hexagons (type-2), or accommodate further squashed hexagons such that they either connect (type-3) or are placed between (type-4) the previously added hexagons.
Starting with a type-1 square, the only option is to continue the tiling with type-2 squares (SI Fig. S7A), and vice versa, resulting in the tiling called $T_{1l}$ (Fig. 2E) as the unique solution. Starting from a type-3 square, there are two squares where choices have to be made (SI Fig. S8A). In each case, a type-4 square is required next, and then the tessellation must be continued with alternating type-3/type-4 squares (SI Fig. S8B and C), leading to the tilings $T_{2l}$ and $T_{3l}$ (Fig. 2E). The same construction can be applied to the right-handed versions of the squares, i.e. the opposite handed versions obtained using the mirror images of the configurations in Fig. 2D. This results in analogous right-handed tilings, denoted as $T_{1r}$ (SI Fig. S7B), $T_{2r}$ and $T_{3r}$ (SI Fig. S8D). Thus, modulo handedness, the snub square tiling can be divided into squashed hexagons and triangles in precisely three inequivalent ways, corresponding to the tilings $T_{jl}$ ($j = 1, 2, 3$) shown in Fig. 2E.
For all other tilings, the combinatorics are much simpler. The $(3^3 \cdot 4^2,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ tiling can be divided into triangles and squashed hexagons in only one way (Fig. 2F). In the $(3^6,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ tiling, placement of the 6-fold vertices must be such that the particle generated from the tiling does not contain any 6-fold symmetric vertices as this would be incompatible with vertices representing pentamers. Thus, this tiling can only be subdivided into triangles and squashed hexagons in a unique way (Fig. 2G).
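The successive exclusions described in this section can be tallied in a few lines of Python as a consistency check. All counts are taken from the text; the variable names are illustrative only, and the snippet does not reproduce the tiling enumeration itself.

```python
# Counts quoted in the text; names are illustrative.
triangle_square_tilings = 140   # k-uniform (k <= 5) tilings made of triangles and squares
after_subdivision_filter = 58   # tilings with four squares forming a bigger square removed
only_two_fold = 46              # no 3-fold or 4-fold symmetry

with_3_or_4_fold = after_subdivision_filter - only_two_fold   # checked individually
after_individual_check = with_3_or_4_fold - 5                 # five excluded (SI Fig. S5A)
final_tilings = after_individual_check - 3                    # local 6-fold axes (SI Fig. S5B)

print(with_3_or_4_fold, after_individual_check, final_tilings)  # 12 7 4
```

The tally reproduces the 12, 7 and 4 tilings quoted at each stage of the classification.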

Construction of protein cage architectures
Given the exhaustive list of tilings embodying the characteristics of the interaction network derived above, models for particles with tetrahedral or octahedral symmetry can then be constructed via different embeddings of a cubic surface (see SI Fig. S3). For this, a planar representation of the cubic surface must be embedded into the tiling such that its corners align with the centres of appropriately spaced squares in the tiling. For example, allocating the vertices of the cubic surface to the centres of adjacent coloured squares in the $T_{1l}$ tiling (Fig. 3A, left), and mapping these onto the vertices of a cube (Fig. 3A, middle), generates a model for an AaLS cage made of 24 pentamers (Fig. 3A, right). This particle has been observed in the self-assembly of AaLS pentamers [46]. It is not possible to embed the surface of a cube into this tiling in any other way without mapping squares in the tiling onto the faces of the cube (cf. SI Fig. S9A). As vertices represent pentamers, this would generate a ring-like interaction between four AaLS pentamers in the particle surface, which is a local interaction type that has not been observed in any AaLS cage to date. We therefore exclude it from our classification. There are thus no other biologically viable AaLS cage models that can be constructed from this tiling. Similarly, both $T_{2l}$ and $T_{3l}$ result in only one tetrahedral model each. The former corresponds to a protein cage made from 48 pentamers (Fig. 3B), and the latter to a 60-pentamer cage (Fig. 3C). These particles have not yet been observed but are consistent with the experimentally observed local interaction rules and therefore provide viable geometric models for AaLS cages. It is possible that these cages have previously been overlooked because they occur less frequently during polymorphic assembly than other variants. Note that all known AaLS cages only exhibit the left-handed version of the tilings. We therefore only consider the left-handed versions in our analysis, assuming that the interactions in all cages should have similar characteristics.
The 2-uniform tiling $(3^3 \cdot 4^2,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ has only one type of 4-fold symmetry axis, shown in orange (Fig. 3D). In analogy to the snub square tiling, only the smallest embedding of the cubic surface is possible, and is obtained by associating neighbouring squares with the vertices of a cubic face (Fig. 3D). This particle is made from 36 pentamers that are organised with tetrahedral symmetry, and corresponds to one of the AaLS pentamer cages reported previously [38]. Note that, as before, any larger models, obtained via an embedding of a rescaled cubic surface (SI Fig. S9), would necessarily contain a square, i.e. a group of four pentamers, and we again reason that this would not be a biologically viable option.
The remaining two tilings include 6-fold symmetry axes. It is therefore possible to construct particle architectures with icosahedral symmetry from them following the Caspar-Klug construction [11]. The smallest particle with icosahedral symmetry that can be derived from the triangular tiling is the icosahedron, a polyhedron with 12 vertices. This model corresponds to the wild-type (WT) AaLS cage (Fig. 3E). In the Caspar-Klug construction, higher order triangulations are possible in which the faces of the icosahedron are subdivided into triangular facets. However, in their surface lattice interpretation, proteins are allocated in the 60° angles of the triangular facets, thus generating models with 12 pentamers and otherwise hexamers. Here, on the other hand, the triangulation is representing the interaction network between pentamers, whose positions are indicated by the vertices. Therefore, larger particles with icosahedral symmetry are not feasible as they would map pentamers onto 6-coordinated vertices in the triangulation. Similarly, tetrahedral and octahedral particles can be constructed via triangular surface lattices [49], but are not viable in the framework of AaLS interaction networks as they would locate pentamers on 3- or 4-fold symmetry axes. Similar arguments show that there exists only one planar embedding of an icosahedral surface into the $(3^6,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ tiling leading to a viable AaLS cage (Fig. 3F). The resulting cage morphology, a particle made of 72 pentamers, corresponds to the AaLS-13 cage [38].
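The argument that only the smallest icosahedral triangulation is viable can be checked numerically. The Python sketch below uses the standard Caspar-Klug relations, which are not spelled out in the text and are introduced here as background: the triangulation number is T = h² + hk + k², and an icosahedral triangulation with this T has 10T + 2 vertices, of which 12 are 5-coordinated and the rest 6-coordinated. Function names and the search range are illustrative.

```python
def t_number(h, k):
    """Caspar-Klug triangulation number for lattice steps (h, k)."""
    return h * h + h * k + k * k

def vertex_counts(h, k):
    """Vertex statistics of an icosahedral triangulation with T = t_number(h, k)."""
    t = t_number(h, k)
    total = 10 * t + 2
    return {"T": t, "vertices": total, "five_coordinated": 12, "six_coordinated": total - 12}

# Triangulations in which every vertex could host a pentamer (no 6-coordinated vertices).
viable = sorted({
    t_number(h, k)
    for h in range(4)
    for k in range(4)
    if (h, k) != (0, 0) and vertex_counts(h, k)["six_coordinated"] == 0
})
print(viable)               # [1]
print(vertex_counts(1, 1))  # {'T': 3, 'vertices': 32, 'five_coordinated': 12, 'six_coordinated': 20}
```

Only T = 1, the icosahedron with 12 vertices, is free of 6-coordinated vertices; already T = 3 would place 20 of its 32 nodes on 6-coordinated positions.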
There are thus precisely six cage structures with 3D symmetry that can self-assemble according to the known local interaction pattern of the AaLS pentamers (Fig. 3 and Table 1), ranging in size from the WT AaLS cage (12 pentamers) to the AaLS-13 cage (72 pentamers). Our classification implies that there are no AaLS particles with octahedral symmetry. It also predicts intermediate structures formed from 48 and 60 pentamers that could potentially be observed but have not been reported to date. Note that the distinct symmetry types observed here for AaLS cage assembly can also occur in the self-assembly of other protein cages [33,35]. Our analysis demonstrates how tilings encoding the interaction network of a protein cage can be used to systematically enumerate all possible alternative protein cage structures with 3D symmetry that can also assemble from the same protein units. Whilst we have demonstrated our approach for the AaLS system, it can readily be applied to any protein nanocontainer architecture of interest following the methodology introduced here. This analysis is thus a primer for the modelling of protein nanocontainers in bionanotechnology.
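For reference, the six cage models can be collected in a small Python data structure. This is a summary assembled from the text and the Figure 3 caption, not a reproduction of Table 1. Two points are assumptions: the tetrahedral label for the 24-pentamer cage is inferred from the statement that no octahedral particles occur, and `None` simply marks cages for which the text cites no structure. The subunit count uses five proteins per pentamer.

```python
PENTAMER_SIZE = 5  # protein subunits per pentamer

cages = [
    {"tiling": "triangular (3^6)", "pentamers": 12, "symmetry": "icosahedral", "pdb": "5MPP"},
    # Symmetry of the 24-pentamer cage is inferred, not stated explicitly in the text.
    {"tiling": "T_1l", "pentamers": 24, "symmetry": "tetrahedral", "pdb": "7A4F"},
    {"tiling": "(3^3.4^2, 3^2.4.3.4)", "pentamers": 36, "symmetry": "tetrahedral", "pdb": "5MQ3"},
    {"tiling": "T_2l", "pentamers": 48, "symmetry": "tetrahedral", "pdb": None},
    {"tiling": "T_3l", "pentamers": 60, "symmetry": "tetrahedral", "pdb": None},
    {"tiling": "(3^6, 3^2.4.3.4)", "pentamers": 72, "symmetry": "icosahedral", "pdb": "5MQ7"},
]

for cage in cages:
    cage["subunits"] = PENTAMER_SIZE * cage["pentamers"]

# Predicted cages without a cited structure.
unobserved = [cage["pentamers"] for cage in cages if cage["pdb"] is None]
print(unobserved)                           # [48, 60]
print([cage["subunits"] for cage in cages])  # [60, 120, 180, 240, 300, 360]
```

Filtering on the `pdb` field recovers the two predicted but unreported cages of 48 and 60 pentamers.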

Discussion
Protein containers, either adapted from naturally occurring protein cages or de novo designed, are pillars of bionanotechnology. Many groups worldwide are developing novel types of nanoparticles for a host of applications, for example using the Rosetta software [50,16,20,10]. The simultaneous assembly of a wide spectrum of particle morphologies, a phenomenon known as particle polymorphism, poses a challenge for nanocontainer production. Such polymorphism, which has also been observed in the assembly of capsid proteins in the presence and absence of viral RNA genomes [7,51], is often triggered by genetic modifications of the capsid protein subunit. This includes insertion of amino acid sequences (SpyTags) into the outward facing portion of the protein subunits, a method that is routinely used to functionalise the particle surface (using SpyCatchers) with antigens for vaccine production. Such modifications are known to result in the assembly of multiple different particle morphologies that contain the WT morphology as only one of many distinct options [6]. Genetic modifications to alter the chemical properties of the protein units, such as their charges or sensitivity to pH, have similar effects, both in nanocontainers derived from bacterial enzymes [38,1,3,27] and in virus-like particles [7,34]. Understanding the determinants of this particle polymorphism is an important step in controlling the assembly outcome.
We introduce here a theoretical framework to characterise the spectrum of nanoparticle morphologies that can assemble from a given set of protein units. This interaction network approach can be used even for container architectures violating the principle of quasi-equivalence, which is central to the seminal CK theory [11], VTT [47,22,23,24,17,49,48] and related models explaining the surface architectures of de novo designed nanoparticles used as malaria vaccines [21]. These approaches fail because the existence of protein subunits that do not interact with neighbouring capsomers makes it difficult to define a biologically meaningful mathematical unit for the tiling models. The method introduced here closes this gap in our understanding of protein nanocontainer architecture. It uses knowledge of the local interactions between the self-assembling capsomers to systematically enumerate all viable container designs with 3D symmetry that can be formed from them. The predictive power of this approach is demonstrated for particles formed from AaLS pentamers [38,46], for which our method not only characterises all experimentally observed variants of different sizes and 3D symmetries, but also predicts structures that have not been observed yet. As tiles in CK theory and VTT represent capsomers, the dual tilings (obtained by replacing tiles by vertices and connecting vertices corresponding to adjacent tiles) can also be viewed as interaction networks (SI Fig. S10). Thus, the interaction network approach provides a unifying framework for the modelling of virus and nanoparticle architecture alike. It is applicable in bionanotechnology for the classification of nanocontainer architecture and can be built as local constraints into programmes like Rosetta to support nanoparticle design. The geometric models also open up novel avenues for the study of the biophysical properties of protein cages, such as their propensity for fragmentation and cargo release [9], or their kinetics of self-assembly along assembly pathways leading to distinct particle types [6]. Such models predict the relative ratios of different particle types depending on experimental conditions, revealing how assembly can be biased towards specific outcomes. This, in turn, provides a means to increase the yield of desired particles, supporting the rational design of protein containers for diverse applications in bionanotechnology.

[...] As vertices represent pentamers, this would correspond to a hole surrounded by four pentamers interacting with each other in a circular arrangement, which is a local interaction pattern that has not been observed experimentally. Thus, we reason that these particles are not biologically viable options. However, they might be engineered if pentamers are mutated to enable this type of local interaction.

[...], in which protein subunits interact via dimer and trimer interactions, is modelled in terms of rhomb and kite tiles in VTT. Its interaction network is a weighted triangular tiling, in which different types of interactions between pentamers are shown colour-coded. In particular, as the close-up at the bottom shows, the dimer and trimer interactions between protein subunits give rise to three distinct interactions between pentamers in the interaction network: red edges correspond to dimer interactions, i.e. interactions represented by a rhomb tile in VTT; green edges indicate interactions within a trimer, i.e. one kite tile; and black edges represent interactions in two adjacent trimers, i.e. two neighbouring kite tiles.

Figure 1. Tiling models of virus architecture. (A) A triangular tiling according to Caspar-Klug theory is a surface lattice model for the Pariacoto virus capsid (PDB: 1F8V); its dual, a hexagonal lattice (grey), also correctly models the relative positions of the capsid proteins. (B) A different surface lattice model, a rhomb tiling, is required to capture the relative CP positions in bacteriophage MS2 (PDB: 2MS2); rhombs are one-to-one with the protein dimers from which the capsid assembles. (C) The surface lattice of human papillomavirus (PDB: 3J6R) is made of two tiles, a kite and a rhomb, that represent trimer and dimer interactions in the capsid surface. Note that each CP interacts with CPs in other pentamers. (D) The pentamers in AaLS-neg, a protein cage made from 36 pentamers (PDB: 5MQ3), cannot be mapped onto pentagons in a surface lattice in which every tile has an interpretation in terms of protein positions. As a subset of CPs do not interact with proteins in other pentamers, this cage violates the quasi-equivalence principle, resulting in gaps. However, the interaction network between capsomers (not protein subunits) can be represented as a tiling; its geometric information is exploited here to construct and classify alternative capsid architectures. (E) The interaction network of the protein cage reported in Ref. [28] (PDB: 4QCC) can be mapped onto a cube; coloured vertices indicate the centres of mass of the CPs (middle). This cubic surface can be embedded into a gyrated square tiling (right). (F) Other embeddings of the cubic surface into the gyrated square tiling (left) predict the morphologies of other cages that can assemble from the same protein units, such as the smaller cage reported in [28,29] (right).

Figure 2. Classification of AaLS protein container architectures based on the interaction network approach. (A) The AaLS-13 cage, made of 72 pentamers (PDB: 5MQ7), reveals two types of interactions between pentamers: in groups of three (triangles) and groups of six (squashed hexagons). (B) Close-up view of a squashed hexagon and its schematic representation in terms of two triangles and one square; dashed red lines are used to divide the squashed hexagon into a square and two triangles. (C) The only k-uniform tilings (up to k = 5) that can be partitioned into triangles and squashed hexagons. (D) There are four distinct ways in which a 4-fold symmetry axis in the snub square tiling can be surrounded by triangles and squashed hexagons. (E) There are only three types of tilings, modulo handedness, that can be constructed from these vertex environments: $T_{1l}$ from type-1 and type-2, and $T_{2l}$ and $T_{3l}$ from type-3 and type-4. (F) and (G) The unique way in which the $(3^3 \cdot 4^2,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ and $(3^6,\ 3^2 \cdot 4 \cdot 3 \cdot 4)$ tilings can be partitioned into triangles and squashed hexagons, respectively.

Figure 3. AaLS cage architectures derived from interaction networks. (A) The $T_{1l}$ tiling provides the layout of a particle made from 24 pentamers and corresponds to the surface structure of a known AaLS protein cage (PDB: 7A4F). (B) and (C) Particle architectures derived from the $T_{2l}$ and $T_{3l}$ tilings correspond to cages with tetrahedral symmetry made from 48 and 60 pentamers, respectively. These cages have not been reported to date. (D) The $(3^3 \cdot 4^2, 3^2 \cdot 4 \cdot 3 \cdot 4)$ tiling predicts a particle made from 36 pentamers with tetrahedral symmetry. Its surface architecture corresponds to a known AaLS protein cage (PDB: 5MQ3). (E) The triangular tiling corresponds to a polyhedron with 12 vertices, which embodies the architecture of the WT AaLS cage (PDB: 5MPP). (F) The $(3^6, 3^2 \cdot 4 \cdot 3 \cdot 4)$ tiling yields an icosahedral particle made from 72 pentamers, matching the AaLS-13 protein cage (PDB: 5MQ7).
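
Because every vertex of these layouts stands for one pentamer, the pentamer counts in this caption translate directly into protein stoichiometries. The short sketch below does that arithmetic; it is a bookkeeping aid rather than part of the original work. The names `PENTAMERS` and `subunit_count` are hypothetical, and the entry for panel (E) assumes that the 12 vertices of the polyhedron correspond to 12 pentamers.

```python
SUBUNITS_PER_PENTAMER = 5

# Pentamer counts per panel, as stated in the caption.
# ASSUMPTION: panel E's "polyhedron with 12 vertices" means 12 pentamers,
# since vertices represent pentamer positions in these layouts.
PENTAMERS = {"A": 24, "B": 48, "C": 60, "D": 36, "E": 12, "F": 72}


def subunit_count(pentamers):
    """Number of protein subunits in a cage built from the given pentamers."""
    return SUBUNITS_PER_PENTAMER * pentamers


SUBUNITS = {panel: subunit_count(n) for panel, n in PENTAMERS.items()}

for panel in sorted(SUBUNITS):
    print(f"({panel}) {PENTAMERS[panel]} pentamers -> {SUBUNITS[panel]} subunits")
```

Under these assumptions the layouts in panels (A) to (F) contain 120, 240, 300, 180, 60 and 360 subunits, respectively.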

Figure S1. Distinct tiling types predict different orientations of the capsomers (assembly units, usually composed of several protein subunits) in the capsid surface. A triangulation modelling Pariacoto virus (A), and a rhomb tiling representing bacteriophage MS2 (B), predict different relative positions of the protein subunits (red dots). Locating protein positions in the corners of the triangular or rhomb tiles (blue) following the convention in Caspar-Klug theory, the resulting capsid blueprints have different orientations with respect to the underlying pentagonal/hexagonal lattice architecture. In the triangulation, pairs of protein subunits in neighbouring hexamers face each other, whereas in the rhomb tiling individual protein subunits in neighbouring hexamers face each other.

Figure S2. Surface tessellation of the 36-pentamer AaLS cage architecture in terms of Voronoi cells. The fundamental domain (or asymmetric unit) of the tiling consists of three Voronoi cells (right). Pentamers are coloured in cyan; black and white circles indicate monomers that do and do not bind, respectively, to a protein subunit of an adjacent pentamer.

Figure S3. Deriving tetrahedral and octahedral symmetries from a cubic net. (A) A planar embedding of the surface of a cube. If all vertices are identical, the cube has octahedral symmetry, but colouring its vertices in red and blue to match the cube in (B) reduces it to tetrahedral symmetry. (C) As the octahedron is the dual of the cube, both have the same symmetry. The vertices of the octahedron (green) correspond to the 4-fold symmetry axes of the cube.
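
The geometric statements in this caption can be checked numerically on a cube with vertices at (±1, ±1, ±1). The sketch below is illustrative only. It assumes that the red/blue colouring alternates along every edge of the cube, which is the colouring that leaves tetrahedral symmetry; the figure itself is not reproduced here, so this is an assumption. The names `colour`, `face_centres` and `squared_distance` are hypothetical.

```python
from itertools import combinations, product

# Cube vertices at (+-1, +-1, +-1).
CUBE = list(product((-1, 1), repeat=3))


def colour(vertex):
    """ASSUMED alternating colouring: red if the coordinate product is +1."""
    x, y, z = vertex
    return "red" if x * y * z > 0 else "blue"


def squared_distance(a, b):
    """Squared Euclidean distance between two integer points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


# Cube edges join vertices that differ in exactly one coordinate.
EDGES = [(a, b) for a, b in combinations(CUBE, 2) if squared_distance(a, b) == 4]
assert len(EDGES) == 12
# Every edge joins a red vertex to a blue vertex.
assert all(colour(a) != colour(b) for a, b in EDGES)

# Each colour class has four mutually equidistant vertices: a regular
# tetrahedron inscribed in the cube.
RED = [v for v in CUBE if colour(v) == "red"]
BLUE = [v for v in CUBE if colour(v) == "blue"]
for tetrahedron in (RED, BLUE):
    assert len(tetrahedron) == 4
    assert {squared_distance(a, b) for a, b in combinations(tetrahedron, 2)} == {8}


def face_centres():
    """Centres of the six cube faces, i.e. the vertices of the dual octahedron."""
    centres = []
    for axis in range(3):
        for sign in (-1, 1):
            face = [v for v in CUBE if v[axis] == sign]
            # Coordinate sums over the four corners are -4, 0 or 4, so
            # integer division by 4 is exact.
            centres.append(tuple(sum(c) // 4 for c in zip(*face)))
    return centres


OCTAHEDRON = face_centres()
# Each octahedron vertex lies on a coordinate axis, and the coordinate axes
# are the 4-fold rotation axes of this cube.
assert all(sorted(abs(c) for c in v) == [0, 0, 1] for v in OCTAHEDRON)
```

Integer coordinates keep every comparison exact, so the assertions test the geometry rather than floating-point tolerance.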

Figure S6. Graphical illustrations of the arguments used to exclude specific tiling types from the classification of AaLS cage architectures. (A) Partitioning the 3-uniform tiling $(3^6, 3^3 \cdot 4^2, 3^2 \cdot 4 \cdot 3 \cdot 4)$ into triangles and squashed hexagons is not possible as the centre of the square marked by a disk cannot be the location of a 4-fold symmetry axis, nor can a squashed hexagon be placed on it. (D) The smallest cubic surface that can be embedded into the 5-uniform tiling $(3^6, (3^2 \cdot 4 \cdot 3 \cdot 4))$ contains a vertex on a local 6-fold symmetry axis (red dot). As vertices indicate pentamer positions, it is not possible to construct an AaLS cage from this tiling.

Figure S7. AaLS surface models of opposite handedness. For each tiling in our classification there is a right-handed counterpart that is its mirror image in the plane. For example, the $T_{1l}$ and $T_{1r}$ tilings implied by the arrangements of triangles and squashed hexagons around a left-handed (A) and right-handed (B) type-1 symmetry axis are mirror images of each other. The squares marked by stars must correspond to left-handed (A) or right-handed (B) type-2 squares.

Figure S9. Construction of the second smallest particle layouts from the tilings with 4-fold symmetry axes. Embeddings of rescaled versions of the cubic surface result in octahedral particles that each contain a square in their faces (yellow (A), orange (B), and green (C and D)). As vertices represent pentamers, this would correspond to a hole surrounded by four pentamers interacting with each other in a circular arrangement, which is a local interaction pattern that has not been observed experimentally. Thus, we reason that these particles are not biologically viable options. However, they might be engineered if pentamers are mutated to enable this type of local interaction.

Figure S10. The tiling models in Caspar-Klug and Viral Tiling theory can also be represented by interaction networks. (A) A hexagonal lattice, connecting the midpoints of the triangular tiles representing the Pariacoto virus capsid (Fig. 1A; PDB: 1F8V), is an example of an interaction network associated with a Caspar-Klug model. (B) The Kagome lattice, consisting of hexagonal and triangular faces, is the interaction network of bacteriophage MS2 (PDB: 2MS2), which is represented by a rhomb tiling in Viral Tiling theory (VTT). (C) The human papillomavirus capsid (PDB: 3J6R), in which protein subunits interact via dimer and trimer interactions, is modelled in terms of rhomb and kite tiles in VTT. Its interaction network is a weighted triangular tiling, in which different types of interactions between pentamers are shown colour-coded. In particular, as the close-up at the bottom shows, the dimer and trimer interactions between protein subunits give rise to three distinct interactions between pentamers in the interaction network: red edges correspond to dimer interactions, i.e. interactions represented by a rhomb tile in VTT; green edges indicate interactions within a trimer, i.e. one kite tile; and black edges represent interactions in two adjacent trimers, i.e. two neighbouring kite tiles.