Structure of the HIV immature lattice allows for essential lattice remodeling within budded virions

For HIV virions to become infectious, the immature lattice of Gag polyproteins attached to the virion membrane must be cleaved. Cleavage cannot initiate without the protease formed by the homo-dimerization of domains linked to Gag. However, only 5% of the Gag polyproteins, termed Gag-Pol, carry this protease domain, and they are embedded within the structured lattice. The mechanism of Gag-Pol dimerization is unknown. Here, we use spatial stochastic computer simulations of the immature Gag lattice as derived from experimental structures, showing that dynamics of the lattice on the membrane is unavoidable due to the missing 1/3 of the spherical protein coat. These dynamics allow for Gag-Pol molecules carrying the protease domains to detach and reattach at new places within the lattice. Surprisingly, dimerization timescales of minutes or less are achievable for realistic binding energies and rates despite retaining most of the large-scale lattice structure. We derive a formula allowing extrapolation of timescales as a function of interaction free energy and binding rate, thus predicting how additional stabilization of the lattice would impact dimerization times. We further show that during assembly, dimerization of Gag-Pol is highly likely and therefore must be actively suppressed to prevent early activation. By direct comparison to recent biochemical measurements within budded virions, we find that only moderately stable hexamer contacts (–12kBT<∆G<–8kBT) retain both the dynamics and lattice structures that are consistent with experiment. These dynamics are likely essential for proper maturation, and our models quantify and predict lattice dynamics and protease dimerization timescales that define a key step in understanding formation of infectious viruses.

... the membrane via lipid binding and myristoylation (3,4), and thus the increased concentration on the budded membrane will drive different dynamics and stability from those expected in a 3D volume, due to dimensional reduction (28,29). Identifying regimes of binding stabilities and rates that can support assembly and simultaneously support dynamics or remodeling of the immature lattice is thus important for understanding the requirements for forming infectious virions.

With our simulations, we are then prepared to test distinct mechanisms of protease dimerization possible within the immature lattice. Two primary dynamic mechanisms are possible: 1) large-scale remodeling of the lattice could bring together two fragments that contain protease monomers, and 2) protease monomers could unbind and reattach at new ...

Previous modeling work studying the HIV-1 immature lattice has captured similar structural features to our work but has not interrogated the membrane-bound lattice dynamics and their implications for protease dimerization.
Coarse-grained molecular-scale models of the immature Gag lattice established interaction strengths between Gag domains that are necessary to maintain a hexagonal lattice ordering, as well as changes in structure following mutation (34). Molecular dynamics simulations of incomplete hexamers along the immature lattice gap-edge demonstrated conformational changes in Gag monomers that indicate lower stability and likely targets for protease cleavage (11). Coarse-grained simulations of lattice assembly in solution (35) and on membranes (36) have identified the importance of co-factors, including the membrane, RNA, and IP6, in stabilizing hexamer formation and growth. Similar to these molecular dynamics simulations, our reaction-diffusion simulations also track the coarse-grained coordinates of each Gag monomer in space and time. In contrast, our model is parameterized not by empirical energy functions describing how each site in the model attracts or repels other sites, but instead by rates that control the probability of binding upon diffusive collisions (20). With this reaction-diffusion approach, we have access to longer timescales despite the large system size (~2500 monomers), and precise control over the association kinetics and free energies, which are directly input as parameters to our model. We can thus quantify the dynamics and kinetics of the assembled lattice over several seconds for multiple model strengths and rates.

In this work, we initialize Gag monomers into their immature lattices on the membrane, as they would be structured after budding from the host cell but prior to maturation (11). We use reaction-diffusion simulations both to assemble these immature lattices and to characterize the timescales of remodeling and Gag dynamics within the incomplete lattices.
We validate that our structured lattices conform to those observed in cryoET through a quantitative analysis, and we verify the specified free energies and rates of association between our Gag monomers in simpler models. We first characterize the likelihood of the Gag-Pol monomers to dimerize during the assembly process. We find that although they represent only 5% of the monomers that assemble into the lattice, the stochastic assembly will ensure that at least one pair of them is adjacent within the lattice, even if they do not engage in a specific interaction. We next show that if, on the other hand, the molecules are distant from one another, they would need to detach, diffuse, and reattach stochastically at the site of another Gag-Pol molecule. By modulating the kinetics and energetics of Gag-Gag contacts, we quantify how the overall time for dimerization depends on unbinding and rebinding, with the 2D diffusion contributing negligibly to the overall time. Lastly, we show how the mobility of the lattice causes binding events that are consistent with biochemical measurements (37), and decorrelation of the lattice that is qualitatively consistent with recent microscopy measurements on immature Gag lattices (30). Our results show that the stochastic dimerization of two Gag-Pol molecules would need to be actively suppressed or inhibited to effectively prevent early activation, and that otherwise, even stable lattices can support Gag-Pol dimerization events due to dynamic remodeling.

The position of the MA site is not in the cryoET structure, and we position it to place each monomer normal to the surface. The distance of the MA site from the center of mass is set to 2 nm. The hexamerization sites (green and blue) mediate the front-to-back binding between monomers to form a cycle.
The dimerization site (purple) forms a homodimer between two Gag monomers, as illustrated on the right. The reactive sites are point particles that exclude volume only with their reactive partners at the distances shown. Thus, the hexamer-hexamer binding radius is 0.42 nm, whereas the longer dimer-dimer binding radius is 2.21 nm. Positions and orientations are defined in Source Data. The experimental lattice has an intrinsic curvature, and our model recapitulates this to assemble a sphere. The binding kinetics between the interaction types for multiple rates were validated against theory (figure supplement 1 and 2), and we verified that the lipid binding site model did not significantly impact the dynamics of the lattice ( ...

... Gag-Pol in the initial immature lattice increases with more surface coverage. Normally, we set all parameters for Gag and Gag-Pol to be identical (blue circles). During assembly, we tested turning off any explicit Gag-Pol to Gag-Pol interactions, rendering them unfavorable (black circles), but they can still end up adjacent to one another. However, this is sensitive to the assembly conditions: when monomers can unbind during assembly, they can correct these unfavorable interactions and reduce the Gag-Pol to Gag-Pol pairs further (red circles). C) Formation of the lattice produces structures that are similar to cryoET, with a single large continent and a large vacancy, as well as several defects or incomplete hexamers throughout the large lattice, which are shown in red in these 4 independent assemblies. An incomplete hexamer in the simulated lattice is quantified as a sub-structure with 2-5 monomers present in the ring. The size distribution of these defect regions is also found to be similar to the cryoET results (figure supplement 1).

A. Assembled lattices on the membrane are structurally similar to those present in cryoET
Our model captures the coarse structure of the Gag and Gag-Pol monomers as derived from a recent cryoET structure (38) of the immature lattice (Fig 1A) (Methods). The Gag-Pol is structurally identical to the Gag but represents 5% of the total monomer population to track protease locations within the lattice. We were able to assemble a variety of spherical Gag lattices that grew from monomers to a single sphere with our targeted coverage of the membrane surface using our stochastic reaction-diffusion simulations (20) (Fig 2) (Methods). The lattices in Fig 2 are a single connected continent, with imperfect edges, a large gap on the surface (~1/3), and regions with defects present in the tri-hexagonal lattice (Fig 2). Our lattice topologies are in very good agreement with the structures determined by cryoET, which also show a single continent and a large gap, such that the spherical lattice is truncated (11). We quantified the fraction of hexamers in our lattices that are incomplete, finding that 36-40% have fewer than 6 monomers when binding events during assembly are irreversible, or 30-32% when we allow unbinding during assembly (see Methods). This is in excellent agreement with the 34±4% we calculated from the cryoET datasets (11). We also observe a similar distribution in the sizes of the regions containing incomplete hexamers, with most regions being localized and small, but with a few larger strands or 'scars' (Fig 2-figure supplement). Along the incomplete edge, we find that a larger fraction of the free binding sites are hexamer sites, in agreement with experiment (11), although we acknowledge that our simulations do not exclude free dimer sites (which are not observed in the cryoET) given the assembly parameters (dimer and hexamer rates are equally fast).
It is illuminating that the structures of our lattices share features with the experimental lattices, given that our assembly simulations (see Methods) do not directly mimic the physiologic process of Gag assembling in the cytoplasm, at the plasma membrane, with RNA (39). To promote the nucleation and growth of only a single lattice (rather than nucleating multiple lattice structures), we combined fast Gag-Gag binding (6×10^6 M^-1 s^-1) with a slow titration of Gag monomers into the volume. The slow titration does mimic the role of co-factors, however, in that Gag does not assemble without being effectively 'turned on' by co-factors like RNA (40). The similarity of our structures to experiment suggests that our assembled model is constrained to incorporate topological defects at a similar frequency to the biological proteins. Interestingly, while these lattices must have defects because a sphere cannot be perfectly tiled by a hexagonal lattice, the number of non-hexamers or imperfect contacts within them is significantly higher than the number required by Euler's theorem, which is only 6 for a spherical lattice with a hole in it (41). We speculate that during assembly, the lattice is not undergoing a significant amount of remodeling and annealing to correct these defects. This would be consistent with a fast and more irreversible nucleation and growth, and indeed we see fewer defects (~31% vs 38%) when we allow for unbinding during assembly vs irreversible binding. The biological lattices seem to be 'good enough' despite the possibility of more perfect lattice arrangements, and the lower stability of these more defective lattices should facilitate the remodeling necessary for maturation.
Although only 5% of the Gag monomers in our simulation are tagged as Gag-Pol (~125 out of ~2625 simulated proteins), we find it is extremely unlikely that a lattice will be assembled without a pair of them already adjacent (Fig 2B). This is due to the stochastic nature of the assembly and the fact that each monomer has 3 adjacent monomers, two via its hexamer interfaces and one via its dimer interface. However, we can reduce the number of Gag-Pol to Gag-Pol pairs if we turn off any specific interaction between them by setting their binding rates to zero. Even with this interaction made highly unfavorable relative to a Gag to Gag or Gag to Gag-Pol interaction, we still find pairs of them adjacent, as they can be brought into proximity via their specific interactions with the Gag monomers (Fig 2B). The number of pairs given the unfavorable interaction also depends on the assembly conditions; when we allow for unbinding between Gag contacts, this allows for annealing and correction of such unfavorable contacts during assembly, and the Gag-Pol to Gag-Pol pairs are largely eliminated. Overall, these results indicate that to prevent early activation of the proteases, one cannot just rely on the lower frequency of Gag-Pol to Gag-Pol interaction, as the lattice is simply too densely packed. Instead, these Gag-Pol dimers would have to be actively inhibited from initiating protease activity, either by having a highly unfavorable affinity for one another or by otherwise forming dimers that are enzymatically inhibited, as any activation preceding budding can leak proteases back to the cytoplasm (42) and is known to reduce infectivity (32). Regardless of how the activation is prevented, inhibition would have to be released following budding, and this mechanism is not known.
We assume below that the Gag-Pol to Gag-Pol dimers following budding can now interact favorably, with the same parameters as Gag to Gag, since we know that activation must ultimately occur.
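The claim that a lattice without any adjacent Gag-Pol pair is extremely unlikely can be checked with a back-of-envelope estimate. The sketch below is not the reaction-diffusion simulation used in this work; it is a mean-field calculation that assumes Gag-Pol tags are placed uniformly at random, that every monomer has all three neighbors (which overestimates the number of contacts, since many hexamers are incomplete), and that the number of Gag-Pol to Gag-Pol contacts is approximately Poisson distributed. The function names are hypothetical.

```python
import math

def expected_adjacent_pairs(n_monomers, n_gagpol, neighbors_per_monomer=3):
    """Mean number of Gag-Pol/Gag-Pol contacts under random tagging.

    Assumption: each monomer has `neighbors_per_monomer` partners (two
    hexamer contacts plus one dimer contact), so the lattice holds
    n_monomers * neighbors / 2 distinct contacts.
    """
    contacts = n_monomers * neighbors_per_monomer / 2
    # Probability that both ends of a given contact are tagged as Gag-Pol.
    p_both = (n_gagpol / n_monomers) * ((n_gagpol - 1) / (n_monomers - 1))
    return contacts * p_both

def prob_no_adjacent_pair(n_monomers, n_gagpol, neighbors_per_monomer=3):
    """Poisson approximation to the chance that no Gag-Pol pair is adjacent."""
    mean = expected_adjacent_pairs(n_monomers, n_gagpol, neighbors_per_monomer)
    return math.exp(-mean)

# Numbers quoted in the text: ~125 Gag-Pol among ~2625 monomers.
mean_pairs = expected_adjacent_pairs(2625, 125)
p_none = prob_no_adjacent_pair(2625, 125)
print(f"expected adjacent Gag-Pol pairs: {mean_pairs:.1f}")
print(f"probability of zero adjacent pairs: {p_none:.1e}")
```

Under these assumptions several adjacent pairs are expected per lattice, so the probability of having none is very small, in line with the qualitative conclusion above. The estimate says nothing about the case where the Gag-Pol to Gag-Pol interaction is switched off, which requires the full simulation.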

C. The Gag lattice disassembles with the weaker hexamer contacts of -5.62 kBT
We perform all our simulations from the same starting structures, but with a range of hexamer strengths of -5.62 to -11.62 kBT, and a slower (0.015 μM^-1 s^-1), medium (0.15) and faster (1.5) rate of binding for each ∆G_hex. For the weakest hexamer contacts of -5.62 kBT, we find that the lattice is not able to retain its single continental structure, and instead fragments into a distribution of much smaller lattices (Video 2). For a fixed ∆G_hex, speeding up the on- and off-rates produces, as expected, more rapid disintegration of the lattice structure. As we stabilize the lattice by strengthening ∆G_hex (making it more negative), we still see departure from the single continental structure due to unbinding of monomers and small complexes from the lattice edge (Fig 3). Hence, we see the emergence of a bimodal distribution of lattices, with a peak at the monomer/small oligomer end, and another peak containing the majority of the lattice in one large continent. The large continent remains largest for the more stable ∆G_hex and for the slower rates, over the course of these ~17-20 s simulations (Fig 3). Importantly, these dynamics occur in all our simulations and would not be possible if not for the incompleteness of the lattice. Specifically, the Gag contacts are dissociating not from the membrane but from each other, predominantly along the edge, at which point they can then diffuse along the membrane surface (Video 1, Video 3). If the lattice were covering 100% of the surface, dissociation events would not allow Gags to diffuse away, and no dynamic remodeling would occur. From the sizes of the lattices present in the simulations, we can also report on the distribution of diffusion constants represented on the surface, as larger lattices diffuse more slowly.
For the weaker lattices, the distribution is very broad, spanning 4 orders of magnitude, whereas for the most stable lattice there is primarily one very slowly diffusing time-scale, and a separate time-scale for the faster moving oligomers (Fig 3-figure supplement). Lastly, the lifetimes of hexamers in our lattices are controlled by ∆G_hex, by a ∆G_strain penalty, and by the extent to which the hexamers are constrained by further dimer contacts. Our ∆G_strain penalty is small at 2.3 kBT, but it does shorten the hexamer lifetimes relative to having zero strain (Methods). This means that the strain penalty can increase lattice dynamics, but we see that the relaxation dynamics from the initial lattices are much more sensitive to the magnitude of ∆G_hex (Fig 3). The effect becomes negligible for more stable lattices as the remodeling we observe is dominated by Gag subunits ...

... which is largely bimodal for all systems: a population of small oligomers and one giant connected component. As time progresses (from left to right columns), the initial structure, which was one giant connected component, continues to fragment somewhat, indicating that the starting structure was not at equilibrium. As the on- and off-rates increase (from top to bottom) with a fixed ∆G_hex = −9.62 kBT, the largest component shrinks, as shown by the peak denoting the large giant component shifting to the left, and the peak denoting the small oligomers shifting to the right. B) For a weaker hexamer free energy shown in the blue data (∆G_hex = −7.62 kBT), the lattice is breaking apart more rapidly and moving towards a more uniform distribution of lattice patch sizes as both peaks shift to the center. Note that we cut off the y axis at 0.005 to make the peak at ~2500 visible. The bars at small sizes extend up to ~0.05.
C) Representative structures at the later times (t=17 s) for each case, illustrating the increased fragmentation as the rates accelerate, or as the hexamer contacts destabilize (lowest row). We quantify the corresponding diffusivity of the structures in figure supplement 1. We show how changes to ∆G_strain have a minimal impact on the structural dynamics in figure ...

... already in contact with one another, so we ignore those pairs to focus instead on spatially separated Gag-Pols. By tracking the separation between all pairs of Gag-Pol monomers (Fig 4), we can quantify the first passage time (FPT), or the time for the first pair to find one another in each simulated stochastic trajectory. The edge of the incomplete lattice supports multiple detachment events (Video 1); of the ~250 Gag monomers along the edge, 10% have only one link to the lattice, which offers the easiest path to disconnect, by breaking only a single bond (Fig 4-figure supplement).

In Fig 4, our results show how the first-passage time for two Gag-Pols to dimerize with one another depends on both unbinding rates and binding rates, as both events are required to bring two Gag-Pols together. We observe two intuitive trends. One is that for a given free energy ∆G_hex, a faster association (and thus faster dissociation) rate results in faster dimerization events between the Gag-Pol monomers. The second trend is that as the lattice free energy stabilizes, dimerization events are slowed due to the slower dissociation times, despite having the same on-rates (Video 3). These timescales are thus consistent with the dimerization events requiring at least one of the monomers to dissociate from the lattice, and then rebind at a new location containing a Gag-Pol. We report the association rates as their 2D values, because binding occurs while the proteins are affixed to the 2D membrane surface, as unbinding from the surface is rare (Methods).
The corresponding 3D rates are representative of slow to moderately fast rates of protein-protein association (1.5×10^4 to 1.5×10^6 M^-1 s^-1), where they are converted to 2D values via a molecular length-scale h=10 nm (Methods). Activation explicitly requires that the 5% of monomers carrying the proteases be involved. Hence, additional unbinding and rebinding events will occur that are not 'activating' because they involve a Gag monomer without a protease. Gag-Pol molecules can also unbind and rebind multiple times before successfully finding another Gag-Pol.

... analogy to a MFPT model for bimolecular association, with an inverse dependence on k_a (see, e.g. (43)). Second, we empirically find that the MFPT also has a power-law dependence on the K_D, or equivalently an exponential dependence on ∆G_hex. Our phenomenological formula thus had two fit parameters, the power-law exponent and a constant pre-factor (see Methods). After fitting, we find the approximate relationship:

where 7×10^-5 is a dimensionless fit parameter, R is the radius of the sphere, and our convention has ∆G_hex < 0. We see excellent agreement between our formula and our data ( ...

Importantly, in the models we have studied, the activation of a dimer can occur in well under a minute up through several minutes (Fig 4). For hexamer stabilities of -5.62 and -7.62 kBT, all rates support dimerization events in less than 10 s. For the more stable lattices of -9.62 and -11.62 kBT, only the medium and fast rates ensure a MFPT that is less than or comparable to ~50 s, that is, an event occurring within 100 s of the start. Using our Eq. 4, we can determine that for a moderate rate of 2.5×10^-2 nm^2 μs^-1, a ∆G_hex more stable than -12 kBT will be slower than 100 seconds. For the slowest rate of 2.5×10^-3 nm^2 μs^-1, anything more stable than -9.4 kBT will be slower than 100 seconds.
Our most stable lattices at the slowest rates take 10 minutes on average for an activation event. Our results thus quantify and predict how the kinetics and the stability of the lattice must be tuned to allow sufficiently fast dimerization events involving the 5% of Gag-Pol molecules carrying proteases.

G. Lower lattice coverage does not dramatically change the first-passage times

When comparing lattices with 66% coverage vs 33% coverage, we see in some cases a minor slow-down in dimerization times, but the MFPT is overall much less sensitive to coverage than it is to the binding rates. With 33% coverage, the edge of the lattice does have a comparable size to the 66% lattice, but the 'bulk' interior is smaller, with more free space required to diffuse to a partner. However, most significantly, the concentration of Gag is smaller, and now with only 66 Gag-Pols (vs 125) present in the lattice, we see in some cases an increase in the time it takes for a pair to find one another (Fig 5A).

We also simulated the system where diffusion of all species was slowed by a factor of 10 (red data). The MFPT is also not sensitive to changes in the dimer strength (figure supplement ...

... as ∆G_hex over this range of free energies (Fig 5-figure supplement). This likely emerges because these dimer contacts are typically more stable than the hexamer. Thus, the hexamer unbinding events are more frequent and more likely to directly provoke the first activation events. Further, the hexamer interface has two binding sites, giving more contacts in the lattice, and because hexamers nucleate the stable cycles needed for higher-order assembly, the frequency of complete vs incomplete hexamers is significantly more sensitive to ∆G_hex than to a single dimer bond. This result shows that breaking and formation of the hexamer contacts is important in driving Gag-Pol dimerization events, given that the MFPT shows clear sensitivity to these rates.
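The rate and free-energy conventions used in these sections can be cross-checked with a few lines of arithmetic. The sketch below converts the quoted 3D association rates to 2D values using the length-scale h = 10 nm, computes a single-bond dissociation time from K_D = c0·exp(∆G_hex/kBT), and evaluates a rough encounter delay of the form t ~ (k_a^2D·N/SA)^-1. The standard-state concentration c0 = 1 M, the use of the full sphere area with R = 70 nm, and all function names are assumptions made for illustration, not details taken from the Methods.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole
NM3_PER_LITRE = 1e24       # 1 L = 1e24 nm^3

def rate_3d_to_2d(k3d_per_M_per_s, h_nm=10.0):
    """Convert a 3D rate (M^-1 s^-1) to a 2D rate (nm^2 us^-1) via k2D = k3D / h."""
    k3d_nm3_per_us = k3d_per_M_per_s * NM3_PER_LITRE / AVOGADRO * 1e-6
    return k3d_nm3_per_us / h_nm

def off_rate_per_s(k3d_per_M_per_s, dG_kBT, c0_M=1.0):
    """k_off = k_on * K_D with K_D = c0 * exp(dG/kBT); dG < 0 means stable."""
    return k3d_per_M_per_s * c0_M * math.exp(dG_kBT)

def encounter_delay_us(k2d_nm2_per_us, n_partners, radius_nm):
    """Rough delay t ~ (k2D * N / SA)^-1 on a sphere of the given radius."""
    surface_area = 4.0 * math.pi * radius_nm ** 2
    return 1.0 / (k2d_nm2_per_us * n_partners / surface_area)

rates_3d = {"slow": 1.5e4, "medium": 1.5e5, "fast": 1.5e6}   # M^-1 s^-1
rates_2d = {name: rate_3d_to_2d(k) for name, k in rates_3d.items()}

# Slowest rate with the most stable hexamer contact quoted in the text.
tau_off_s = 1.0 / off_rate_per_s(rates_3d["slow"], -11.62)

# Fastest rate, 125 Gag-Pol partners, virion radius ~70 nm.
delay_fast_us = encounter_delay_us(rates_2d["fast"], n_partners=125, radius_nm=70.0)

for name, k2d in rates_2d.items():
    print(f"{name}: {k2d:.2e} nm^2/us")
print(f"dissociation time (slow, -11.62 kBT): {tau_off_s:.1f} s")
print(f"encounter delay (fast): {delay_fast_us / 1000:.1f} ms")
```

With these assumptions the slow and medium rates come out at about 2.5×10^-3 and 2.5×10^-2 nm^2 μs^-1, matching the 2D values quoted above, and single-bond dissociation for the most stable, slowest case takes several seconds. The much longer MFPTs therefore reflect repeated unbinding and rebinding attempts, not any single step.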
We similarly found minimal dependence of the MFPT on the diffusion constant. With a 10-fold slower diffusion constant for all membrane-bound Gag monomers, which effectively slows all lattices down by 10-fold, the MFPTs were not significantly slower (Fig 5b). This is not surprising given that the diffusional search along the membrane to find a new partner is not ultimately the rate-limiting step in the association process. The rates we report are intrinsic rates that control binding upon collision, whereas the macroscopic rates one measures through standard biochemistry experiments in the bulk depend on both this intrinsic rate and on diffusional times to collision (45). Faster intrinsic rates are more diffusion-limited and produce binding that is more sensitive to diffusion (46). However, given the small dimensions of the virion, traveling ~70 nm for example (the radius) takes on the order of milliseconds for monomers and small oligomers of Gag. Rough estimates of delay times for a binding event, using t ~ (k_a^2D · N_gagpol / SA)^-1, indicate that even for the fastest binding, the delay is on the order of a few milliseconds. The slowest timescale given all the rates is for the stable lattice, where dissociation has a timescale of ~7 seconds. Altogether, these timescales of individual steps show that the observed MFPTs are not merely controlled by the slowest single events, but by the need for multiple attempts of un- and re-binding to ensure a pair of Gag-Pols find one another. Our results further illustrate how the crowding due to the lattice on the surface can actually accelerate rebinding events compared to a freely diffusing pair when the lattice is unstable (Fig 5a), whereas for stronger Gag contacts the lattice will dramatically slow rebinding.

I. Biochemical measurements of Gag mobility in VLPs agree with our moderately stable lattices

We find that the dynamics of our simulated lattices agree with experimental measurements of binding within the lattice for parameters that exclude the most stable, slowest regimes. Experimental measurements within the Gag lattice of budded virus-like particles (VLP) tracked the biochemical formation of a Gag dimer involving a population of Gag molecules tagged with a SNAP-tag (10-40%) and the same fraction of Gag molecules tagged with a HALO-tag (37). A covalently linked dimer was formed through addition of a HAXS8 linker at time zero, with one linker forming an irreversible bridge between a HALO and SNAP protein. Formation of this covalently linked dimer was quantified to reveal an initial rapid formation of dimers, followed by an increasingly slower growth that reaches 42% dimer pairs formed for the 10% tagged populations. In our simulations, we thus performed a comparable 'experiment' given our trajectories (see Methods). We tracked the encounter between two populations of our Gag molecules that had been randomly tagged as either 10% HALO or 10% SNAP (Fig 6). We similarly found that the majority of the dimers formed rapidly, because they were already adjacent in the lattice when the covalent linker was introduced. The dynamics of the lattice then allowed a slow growth in additional dimers (Fig 6a). We calculated the fraction bound over the course of our 20 second simulations and used a simple extrapolation to define an upper bound on the number formed at 3 minutes (see Methods). For the least stable lattices, the upper bound is close to all dimers formed, over all 3 association rates, which is much higher than observed experimentally.
For the most stable lattices, in contrast, even when our model assumes maximal efficiency of the covalent linker, we extrapolate to an upper bound that is less than the 42% of dimers formed experimentally (Fig 6b). These results thus indicate that these lattices are too stabilized to support the dynamics observed in the Gag VLPs.

We also verified that these simulations are consistent with the trends expected as the population of tagged Gag monomers is increased. Indeed, we find, similar to experiment, that as a larger fraction of Gag monomers have tags, corresponding to a higher concentration of binding partners, a larger fraction of dimers is formed (Fig 6C).

We quantify the dynamics of the Gag lattice on fixed viewpoints on the spherical surface using number auto-correlation functions (ACFs), which report on correlations of collective motion that can emerge due to heterogeneity within the lattice (Fig 7) (Methods). We expect this heterogeneity because our lattices all exhibit a large component and smaller oligomers (Fig 2). We find that as the lattice becomes more stable and the bimodal separation of lattice sizes becomes more pronounced, the measured correlations in Gag copy numbers per quadrant increase in amplitude and slow in timescales, and that these dynamics are sensitive to slowing diffusion (Fig 7-figure supplement). All the ACFs will eventually asymptote to 1 at long delay times as the copy numbers become independent (Fig 7a).

... signal, which is 1 as expected (bleaching of the Gag monomers causes limited drops in total copies across 20 s), as the total copy numbers across the membrane surface do not change. We note that the ACF values at our longest delays (i.e. >~10 s) are not statistically robust, because of the limited number of frames separated by these timescales. B) ACF of each of the 8 quadrants of one simulated lattice.
C) As the lattice is stabilized by increasing ∆G_hex, the ACF shows higher amplitude correlations that decay to 1 at longer times, additional trends shown in ...

... correlations because some quadrants contain large lattice fragments, and others contain mostly empty space (Fig 7b). The same trend is observed in a single VLP measured using super-resolution microscopy imaging (Fig 7e). Our simulations further show that as the lattice is stabilized, the ACF increases in amplitude and decays more slowly, which is qualitatively the same as is observed in imaging of Gag lattices in budded VLPs that have been stabilized with a fixative (30) (Fig 7c, 7f). We cannot quantitatively compare the ACFs, as the experiments produced ACFs with much higher amplitudes of correlations, and even with the background correlations divided out (Fig 7d), the experimental signal contained additional sources of correlation likely due to measurement noise. However, we were able to use our simulations to illustrate how sources of measurement noise in stochastic localization imaging experiments can produce increased correlations beyond the background. We specifically find that short-term blinking of the fluorophore does not appreciably change the ACF (Fig 7-figure supplement). However, we do see increased amplitude of correlations in the ACF if we introduce a distribution of activation probabilities for the fluorophores, mimicking the fact that the populations initially activated may have a higher probability of activation than those appearing at later times (Fig 7-figure supplement).
Further, if we assume that the lattice is not perfectly centered with respect to the activating laser pulses, then Gag monomers that are initially 'dark' can diffuse into view and then have a probability of being activated (Fig 7-figures [...]

[...] Pol monomers, we show that dimerization can proceed in less than a few minutes, and for less stable lattices much faster, despite the embedding of these molecules within the lattice. By comparison with experimental measurements of lattice structure (via cryoET) and lattice dynamics (via biochemistry and time-resolved imaging), we conclude that the stability of the hexamer contacts should be in the range of $-10\,k_BT < \Delta G_{hex} < -8\,k_BT$ for binding rates that are slower than $10^{5}$ M$^{-1}$s$^{-1}$. If the binding rates are faster, then the free energy could be further stabilized ($\Delta G_{hex} < -10\,k_BT$), as the dimerization events would still be fast enough to be consistent with the biochemical measurements. If the lattice is less stable than $-6\,k_BT$, we found that the large-scale structure of the lattice is not maintained even within seconds, which is not consistent with structural measurements, and between $-6\,k_BT$ and $-8\,k_BT$, the dimerization is likely too fast relative to the biochemical measurements, although it is feasible given that our simulations predict an upper bound. These hexamer-hexamer contact strengths report the stabilities that would be expected given the presence of co-factors, as without co-factors the lattice does not assemble at all (47, 48), and co-factors are present in all experiments used for comparison.

Our simulations also demonstrate that during assembly, the fraction of Gag-Pol monomers, while only 5%, is still too high to prevent stochastic dimerization events between them.
This means that preventing early activation, which can result in loss of proteases from the virion (42) and significant reduction in virion formation (32), requires active suppression of the interaction between adjacent Gag-Pol monomers. This suppression could occur in the form of highly unfavorable dimerization events between Gag-Pol monomers, which we found could significantly reduce the number of adjacent pairs, particularly if the assembly process allows for unbinding and 'correction' of such unfavorable contacts. Suppression could also occur by having adjacent pairs that are somehow enzymatically inhibited. The exact mechanism is not known. Ultimately, the suppression must be relieved to allow for protease activity in the budded virion, and our results show that two protease domains will be able to find one another even if seemingly locked within the lattice at distant locations.

Our model explicitly accounts for the crowding effects of localizing the lattice to a small, 2D surface, ensuring that excluded volume is maintained between all monomers. However, we do not explicitly include the genomic RNA that would be packaged within the immature virion and attached to the Gag lattice (through non-competing binding sites). It is known that binding to RNA (49, 50), membrane, and other co-factors is important in stabilizing the lattice for assembly (21, 25, 26, 44, 51-55). IP6 has been shown to accelerate and stabilize immature lattice assembly in vitro (25) and in vivo (21). Our Gag-Gag interaction free energies thus presuppose that RNA and IP6 have bound already, as otherwise the lattice would not have assembled productively. Because we do not explicitly incorporate IP6 binding throughout the lattice, however, we are assuming it uniformly affects the lattice, whereas it could locally stabilize only where it is bound.
IP6 is highly abundant in cells (~50 µM), and visible in cryo structures of the immature lattice (35), so it is likely that the majority of hexamers are interacting with IP6, but in future work it will be important to confirm this explicitly. Our Gag monomers and oligomers diffuse along the membrane surface, not through the interior of the budded virion where the RNA would be packaged, consistent with excluded volume in the virion center. When our lattice coverage changes from 33% to 66%, for example, we see only small changes in our mean first-passage times, which primarily reflect the increase in total Gag-Pol monomers available. However, the attachment of the Gag lattice to a large RNA polymer of 9600 nucleotides (~3 µm) could change the mobility of the Gag monomers following their detachment from the lattice. Proteins can still unbind and diffuse when bound to a polymer like RNA (36, 56), but the effective rates could slow, and the distance that a monomer typically travels can be limited by the fluctuations of the attached RNA polymer. Hence rebinding may be more restricted to shorter excursions from the start point. We note the Gag VLPs contain smaller RNA polymers and not the full gRNA, so the model is in that way more consistent with the dynamics of a VLP. Somewhat remarkably, the Gag monomers within the virion are able to reassemble around the gRNA to form the mature conical capsid (following cleavage) (57), which indicates there is a clear capacity for diffusion-driven remodeling. This mature lattice is also subsequently disassembled (58), and the principles of our model here indicate how destabilization of hexamer contacts could help promote disassembly.
Our models here contain pairwise interactions, and cooperativity enters only in that the formation of a completed cycle (whether a hexamer or a higher-order cycle of multiple hexamers) is significantly more stable, because it requires two bond-breaking events. However, coordination of the hexamer by IP6 can produce conformational changes (48) or kinetic effects (35) that could change the stability of hexamer contacts between, say, a dimer vs a 5-mer. We did not include this additional cooperativity to keep the model as simple as possible; we expect that added cooperativity in hexamer formation would change the prefactors in the quantitative relationship we predict between the hexamer free energies and the first-passage times, as intermediates would be biased away from smaller fragments. However, because the lattice would inevitably still have the 'dangling' edges and partial hexamers observed experimentally (11), we would still see dissociation, diffusion, and rebinding events. Our model also does not incorporate any mechanical energy, so while we capture local changes in stability due to defects in the lattice that reduce the protein contacts and thus free energy, we cannot measure directional forces or stresses within our lattice. Inhomogeneities in assembled lattices, like pentamers vs hexamers, result in varying mechanical stress (59), and defects or 'scars' in lattices on curved surfaces are known to represent mechanical weak points that are susceptible to cracking or fragmenting (41). This will be a particularly important extension for coupling the lattice with the mechanical bending of the membrane, which can be performed using continuum models (60). Lastly, other proteins are packaged into HIV-1 virions, including curvature inducers (61), and like RNA, additional protein interactions could shift the Gag unbinding kinetics.
Ultimately, however, our models clearly show that despite the significant amount of protein-protein contacts and ordered structure within the membrane-attached Gag lattice, there is nonetheless enough disorder along the incomplete edge to support multiple unbinding and rebinding events over the seconds to minutes time-scale (Video 1, Video 3).

Although our work here is focused on the HIV-1 immature lattice, our approach could be insightfully applied to other retroviruses, particularly given the morphological differences between the closely related HIV-1 and HIV-2 immature lattices (62). The HIV-2 Gag polyprotein similarly forms the immature lattice at the plasma membrane, but imaging of the budded virion shows that the HIV-2 lattice is largely complete with an average membrane coverage ratio of 76% ± 8% (62, 63). Hence although this lattice contains defects and gaps, it does not have the large vacancy present in the HIV-1 lattice studied here. Given the important role that this incomplete edge played in facilitating unbinding and rebinding events of Gag-Pol, we would expect that the protease dimerization events would be significantly slowed in the HIV-2 lattice. With higher surface coverage, the concentration of Gag-Pol is overall higher in the virion, which would help promote dimerization, but with less access to a long, incomplete edge, the number of un(re)binding events would be reduced. We found here that lattices that were initially assembled into 2-3 fragments rather than a single continent would have less of a large vacancy on the surface and exhibited slightly slower remodeling dynamics and increased first-passage times for Gag-Pol dimerization.
Ultimately, the HIV-2 lattice does still need to be cleaved and reassembled into the mature capsid, just like HIV-1 (62), so we would hypothesize that the binding kinetics between Gag contacts would have to be faster, to more readily promote the remodeling needed both for protease dimerization and the cleavage and disassembly of the immature lattice. Currently there is significantly less detail on the assembly and maturation of HIV-2, and future research will be essential to gain a more comprehensive understanding of protease dimerization, activation, and maturation across various retroviruses.

Overall, the model and simulations here reveal a level of detailed Gag dynamics coupled to structural changes that are inaccessible to any single experiment but can nonetheless be compared to a range of experimental observables, as we have done here. Although diffusion does influence the collective dynamics of the lattice, for example, we find it does not significantly influence activation rates, as those are limited by binding and unbinding events rather than mobility. By defining a formula that allows us to extrapolate our model to other rates and free energies, we can predict how mutations that would change the strength or kinetics of the hexamer contacts would impact the time-scales of the initial protease dimerization event. Mean first-passage times can be predicted from theory in surprisingly complex geometries (64), but for the immature lattice, the problem is intractable without using simulation data due to the ability of Gag-Pol to rebind or 'stick' back onto the lattice through multiple contacts before successful dimerization encounters.
More generally, modeling stages of viral assembly has been critical for establishing the regimes of energetic and kinetic parameters that distinguish successful assembly from malformed or kinetically trapped intermediates, such as in viral capsid assembly (65-68). Computational models of self-assembly can be used to assess how additional complexity encountered in vivo, such as macromolecular co-factors (35, 36, 69), crowding (67, 70), and changes to membrane-to-surface geometry (29), could help to promote or suppress assembly relative to in vitro conditions. Our reaction-diffusion model developed here provides an open-source and extensible resource (20) to study preceding and following steps in the Gag assembly pathway (as done in recent work (40)) with the addition of co-factors. A model of mature capsid assembly, for example, would involve Gag monomers that have a modified interface geometry and orientations relative to one another, as quantified above. With rates and energies that match biochemical measurements, the model can act as a bridge between in vitro and in vivo studies of retroviral assembly and budding, and a tool to predict assembly conditions that disrupt progression of infectious virions.

METHODS
Model components and structural details. Our model contains Gag and Gag-Pol monomers enclosed by a spherical membrane. The membrane contains binding sites for the Gag monomers. The Gag-Pol is structurally identical to the Gag but represents 5% of the total monomer population to track protease locations within the lattice. The model captures the coarse structure of the Gag/Gag-Pol monomers as derived from a recent cryoET structure (38) of the immature lattice (Fig 1A). The key features of our rigid body models are the locations of the four binding sites/domains that mediate protein-protein interactions between a pair of Gag monomers and the Gag-membrane interaction. Each Gag/Gag-Pol contains a membrane binding site, a homo-dimerization site, and two distinct hexamer binding sites that support the front-to-back type of assembly needed to form a ring. When two molecules bind via these specific interaction sites, they adopt a pre-defined orientation relative to one another (20) that ensures the lattice will have the correct contacts, distances between proteins, and curvature (Fig 1 and Source Data). The Gag monomers bind to the membrane from the inside of the sphere, as would be necessary for budding, and we model this as a single binding interaction that captures stabilization from PI(4,5)P$_2$ binding and myristoylation (3, 4). Each reactive site excludes volume from only its reactive partners at a distance $\sigma$. The dimer site reacts with another dimer site at a binding radius of $\sigma = 2.21$ nm. The MA site binds to the membrane at $\sigma = 1$ nm. The hexamer site 1 binds to hexamer site 2 at $\sigma = 0.42$ nm (Fig 1b). Once reactive sites have bound to one another, they are no longer reactive and no longer exclude volume. Therefore, to maintain excluded volume between monomers throughout the simulation, we introduce an additional dummy reaction between the monomer centers-of-mass (COM).
The COM sites exclude volume with a binding radius of $\sigma = 2.5$ nm between all monomer pairs. This is necessary to prevent monomers from unphysically diffusing 'through' one another when their reactive sites are fully bound.

Reaction-diffusion simulations. Computer simulations are performed using the NERDSS software (20). The software propagates particle-based and structure-resolved reaction-diffusion using the free-propagator reweighting (FPR) algorithm (71). The membrane is treated as a fixed continuum surface that contains a population of specific lipid binding sites, or PI(4,5)P$_2$. We model these binding sites using an implicit lipid algorithm that replaces explicit diffusing lipid binding sites with a density field that will change with time as proteins bind or unbind from the membrane. Hence PI(4,5)P$_2$ are assumed well-mixed on the surface. This method reproduces the same kinetics and equilibria as the explicit lipid method but is significantly more efficient (Fig 1-figure supplement) (72). We use a time-step $\Delta t = 0.2$ µs. We validated the model kinetics as described in the next section. Software is open source here github.com/mjohn218/NERDSS and executable input files for the models are here github.com/mjohn218/NERDSS/sample_inputs/gagLatticeRemodeling.

We briefly describe here how the stochastic reaction-diffusion simulations work. Each protein or protein complex moves as a rigid body obeying rotational and translational diffusive dynamics using simple Brownian updates, e.g. $x(t + \Delta t) = x(t) + \sqrt{2D\Delta t}\,R$, where $D$ is the diffusion constant of the rigid body and $R$ is a normally distributed random number with mean 0 and standard deviation of 1. Each protein binding site is a point particle that can react with a site on another molecule to define the reaction network, as illustrated by the contacts in Fig 1.
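The Brownian update described above can be sketched for the translational part only. This is a minimal illustration, not the NERDSS implementation: the function name `brownian_step` and the diffusion constant used here are assumptions chosen for the example (only the 0.2 µs time-step comes from the text), and rigid-body rotation is omitted.

```python
import numpy as np

def brownian_step(x, D, dt, rng):
    """Advance positions x (shape (N, 3), in nm) by one Brownian step.

    Each Cartesian component receives an independent Gaussian displacement
    with standard deviation sqrt(2 * D * dt), i.e. x(t+dt) = x(t) + sqrt(2 D dt) R.
    """
    return x + np.sqrt(2.0 * D * dt) * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
D = 0.2    # nm^2/us, hypothetical diffusion constant (not taken from the paper)
dt = 0.2   # us, time-step quoted in the text

x0 = np.zeros((10_000, 3))
x1 = brownian_step(x0, D, dt, rng)

# In 3D the mean squared displacement after one step should be close to 6 * D * dt.
msd = np.mean(np.sum((x1 - x0) ** 2, axis=1))
print(f"measured MSD = {msd:.4f} nm^2, expected = {6 * D * dt:.4f} nm^2")
```

Checking the mean squared displacement against $6D\Delta t$ is a quick sanity test that the step size and noise amplitude are consistent.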
Reactions can occur upon collisions, with the probability that the reaction occurs evaluated using the Green's function for a pair of diffusing sites, parameterized by an intrinsic reaction rate $k_a$, a binding radius $\sigma$, and the sum of the diffusion constants of both species (71). This reaction probability is corrected for rigid body rotational motion (73). Proteins that are restricted to the 2D membrane perform 2D association reactions with 2D rate constants (46), which are derived from the 3D rate constants by dividing out a lengthscale $h$ that effectively captures the fluctuations of the proteins when on the membrane,

$k_a^{2D} = k_a^{3D}/h$. (Eq. 1)

Proteins that do not react during a time-step undergo diffusion as a rigid complex, and excluded volume is maintained for all unbound reactive sites at their binding radius $\sigma$ by rejecting and resampling displacements that result in overlap. All binding events are reversible, with dissociation events parameterized by intrinsic rates $k_b$ that are sampled as Poisson processes. We have that for each reaction, $k_b = k_a K_D$, and for the corresponding 2D reaction, we assume the unbinding rates are unchanged, and thus $K_D^{2D} = h K_D^{3D}$.

Binding interactions are dependent on collisions between sites at the binding radius $\sigma$ and are not orientation dependent. Orientations are thus enforced after an association event occurs by 'snapping' components into place. Association events are rejected if they generate steric overlap between components of two complexes. Steric overlap is determined using a distance threshold: here, if the distance between molecule centers-of-mass is less than 2.3 nm, we reject the event due to overlap. Association events are also rejected if they generate large displacements due to rotation and translation into the proper orientation, using a cutoff of 10 times the expected diffusive displacement.
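The two rejection criteria for association events can be illustrated with a short sketch. The 2.3 nm overlap threshold and the factor of 10 come from the text; everything else is an assumption made for illustration, including the function names, the use of the 3D root-mean-square displacement $\sqrt{6D\Delta t}$ as the "expected diffusive displacement", and the example diffusion constant. It is not the NERDSS code.

```python
import math

OVERLAP_CUTOFF_NM = 2.3     # COM-COM distance threshold quoted in the text
MAX_DISPLACE_SCALE = 10.0   # scaling of the expected diffusive displacement

def min_com_distance(coms_a, coms_b):
    """Smallest center-of-mass distance between monomers of two complexes."""
    return min(math.dist(a, b) for a in coms_a for b in coms_b)

def accept_association(coms_a, coms_b, snap_displacement_nm, d_total, dt):
    """Return True if a proposed association passes both checks.

    coms_a, coms_b: monomer COM coordinates (nm) after snapping into place.
    snap_displacement_nm: distance moved to reach the bound orientation.
    d_total: summed diffusion constant (nm^2/us); dt: time-step (us).
    """
    if min_com_distance(coms_a, coms_b) < OVERLAP_CUTOFF_NM:
        return False  # steric overlap between the two complexes
    # Assumption: take the expected displacement as the 3D RMS value.
    expected = math.sqrt(6.0 * d_total * dt)
    if snap_displacement_nm > MAX_DISPLACE_SCALE * expected:
        return False  # snapping would move the complex unphysically far
    return True

complex_a = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
complex_b = [(10.0, 0.0, 0.0)]
print(accept_association(complex_a, complex_b, 1.0, d_total=0.4, dt=0.2))
```

Both checks act only on geometry after the proposed snap, so they can be applied without knowing which binding sites triggered the event.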
Defects ultimately emerge in the lattice because a hexagonal lattice cannot perfectly tile a spherical surface by the Euler polyhedron formula. These defects result in contacts that are not perfectly aligned (Fig 2); if the contacts are within a short cutoff distance of $1.5\sigma$, they can still form a bond to stabilize the local order, otherwise they are left unbound, weakening the local order. [...]

[...] We set that energy $\Delta G_{strain}$ here to $+2.3\,k_BT$ for all models, meaning that the stability of any closed hexagon within the lattice is slightly lower compared to 6 ideal bonds (i.e. for $\Delta G_{hex} = -11.62\,k_BT$ it is 5.8 bonds) but still much more stable than a linear arrangement of six Gag monomers, which has only 5 bonds. This is an entropic penalty to forming closed cycles, which require the final subunit to fit into the 5-mer structure and form 2 bonds simultaneously. This [...]. We do not apply this strain penalty to dimer bonds that can also end up in higher-order cycles, thus assuming that they can accommodate spacing or small structural rearrangements without any free energy cost. We ran a set of comparison simulations where $\Delta G_{strain} = 0$, to illustrate how it can impact the structures of the weaker lattices. Quantitatively, setting $\Delta G_{strain}$ to zero gives the hexamer cycles a lifetime that is 10-fold longer. For $\Delta G_{hex} = -11.6\,k_BT$, the hexamer lifetime thus increases from 1380 s to 13800 s when $k_a^{3D} = 1.5\times10^{5}$ M$^{-1}$s$^{-1}$, but these are both dramatically longer than the lifetime of a single hexamer bond, which is 0.74 s. For the weakest lattice, however, the hexamer cycle is only 4x more long-lived than a single bond, so the strain penalty is more impactful. With additional dimer interactions stabilizing subunits in the lattice, however, hexamer lifetime increases further.
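The numbers quoted above can be checked with a few lines of arithmetic. The sketch below assumes the standard relation $K_D = c_0 \exp(\Delta G/k_BT)$ with a 1 M standard state and $k_{off} = k_a K_D$; the standard state and the helper name `k_off` are assumptions for illustration, although they reproduce the values stated in the text.

```python
import math

C0 = 1.0  # standard-state concentration in M (assumption)

def k_off(delta_g_kbt, k_on):
    """Dissociation rate (1/s) from a free energy in units of kBT and an on-rate (1/(M s))."""
    return k_on * C0 * math.exp(delta_g_kbt)

dG_hex = -11.62     # kBT, hexamer contact free energy from the text
dG_strain = 2.3     # kBT, strain penalty on a closed hexamer cycle
ka_3d = 1.5e5       # 1/(M s), on-rate quoted with the lifetimes

# Lifetime of a single hexamer bond.
bond_lifetime = 1.0 / k_off(dG_hex, ka_3d)

# Stability of a closed hexagon expressed as an equivalent number of ideal bonds.
effective_bonds = (6 * dG_hex + dG_strain) / dG_hex

# Fold change in cycle lifetime when the strain penalty is removed.
strain_fold = math.exp(dG_strain)

print(f"single-bond lifetime  ~ {bond_lifetime:.2f} s")
print(f"effective bonds/cycle ~ {effective_bonds:.2f}")
print(f"lifetime fold change  ~ {strain_fold:.1f}x")
```

Under the same assumption, an on-rate of $6\times10^{6}$ M$^{-1}$s$^{-1}$ with free energies of $-11\,k_BT$ and $-13\,k_BT$ gives off-rates close to 100 s$^{-1}$ and 13.6 s$^{-1}$, matching the values quoted for the reversible assembly simulations.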
We validated the kinetics and equilibrium of our model as it assembled on the membrane when we set the hexamer rates to zero, so it formed purely dimers (Fig 1-figure supplement), and when we set the dimer rates to zero, so it formed purely hexamers (Fig 1-figure supplement). The observed kinetics and equilibria were compared to solutions of the corresponding system of non-spatial rate equations, showing very good agreement with apparent intrinsic rates that systematically accounted for excluded volume and the criteria used for accepting association events (Fig 1-figure supplement). Thus, all of the rates and free energies reported in the paper agree with the kinetics and equilibria observed and expected for the sets of binding interactions that make up the full lattice system.
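As an illustration of the kind of non-spatial rate equation used for such a comparison, the sketch below integrates the simplest case, reversible homodimerization A + A ⇌ A2 in a well-mixed volume, and checks the long-time limit against the analytical equilibrium. The rate convention (flux = $k_{on}[A]^2 - k_{off}[A_2]$), the parameter values, and the function names are assumptions for the example; the membrane-binding step and the hexamer reactions of the full model are not included.

```python
import math

def simulate_dimerization(a_total, k_on, k_off, t_end, dt=1e-3):
    """Integrate d[A2]/dt = k_on*[A]^2 - k_off*[A2] with forward Euler.

    Concentrations in M, k_on in 1/(M s), k_off in 1/s, times in s.
    Returns the final ([A], [A2]).
    """
    a, a2 = a_total, 0.0
    for _ in range(int(round(t_end / dt))):
        flux = k_on * a * a - k_off * a2   # net dimer formation rate
        a -= 2.0 * flux * dt               # two monomers consumed per dimer
        a2 += flux * dt
    return a, a2

def equilibrium_monomer(a_total, k_on, k_off):
    """Solve [A]^2/[A2] = K_D together with [A] + 2[A2] = a_total for [A]."""
    kd = k_off / k_on
    return kd * (math.sqrt(1.0 + 8.0 * a_total / kd) - 1.0) / 4.0

# Hypothetical parameters chosen to give round numbers (K_D = 10 uM).
a_total, k_on, k_off_rate = 1e-5, 1e5, 1.0
a_end, a2_end = simulate_dimerization(a_total, k_on, k_off_rate, t_end=10.0)
a_eq = equilibrium_monomer(a_total, k_on, k_off_rate)
print(f"[A] after 10 s = {a_end:.3e} M, analytical equilibrium = {a_eq:.3e} M")
```

Comparing simulated copy numbers against a curve like this, with an apparent on-rate as the only adjustable quantity, is one way to confirm that the spatial model reproduces the intended kinetics and equilibrium.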

Simulations for constructing the initial lattices on the membrane. The HIV lattice is composed of Gag and Gag-Pol bound to the inner leaflet of the lipid membrane. To study the remodeling dynamics, we must construct the initial configurations where the lattice is assembled such that it has a specific coverage of the surface (67% or 33%), and is linked to the membrane via lipid binding. We define a sphere of radius 67 nm to represent the membrane surface (57). PI(4,5)P$_2$ is populated on the membrane surface at a concentration of 0.07 nm$^{-2}$, or 4000 copies, which exceeds the number of Gag monomers, meaning there is always a pool of free PI(4,5)P$_2$ available for (re)binding.

Assembling the Gag monomers into a single spherical lattice is non-trivial due to the size of the lattice. Because the lattice is so large, requiring N~2400 monomers at 67% coverage, it is very difficult for a single nucleated lattice to complete growth (which scales approximately with N) before another lattice nucleates. These multiple intermediate fragments do not readily combine. In a recent study we quantified how titrating in monomers instead of trying to assemble from the bulk can dramatically improve assembly yield (40). Therefore, here we titrate in the Gag and Gag-Pol monomers at a rate of $6\times10^{-5}$ M/s and $3\times10^{-6}$ M/s respectively, which can ensure a ratio of Gag : Gag-Pol of ~20 : 1, consistent with experiment (57, 75). Gag (and Gag-Pol) molecules can bind in solution (3D), to the membrane (3D to 2D), and when on the membrane (2D). In one set of assembly simulations we set binding rates between Gag-Pol/Gag-Pol pairs to zero to suppress the 'activation' events that could otherwise occur during assembly (Fig 2).
While the titration of the monomers reduced multiple nucleation events for the membrane system, we found that the easiest and most efficient way to form a single lattice was by assembling the structure fully in solution, in a volume of (250 nm)$^3$. We then put the assembled single lattice into a spherical system by linking this structure to the membrane using one PI(4,5)P$_2$ attachment per monomer. The Gag rates of dimerization and hexamerization were both set to $6\times10^{6}$ M$^{-1}$s$^{-1}$. For the lattices studied below, we made binding events irreversible, as it improved growth of single lattices. For comparison, we also ran a few assembly simulations where the binding was reversible, using free energies of $-11\,k_BT$ and $-13\,k_BT$, with $k_{off} = 100$ s$^{-1}$ and 13.6 s$^{-1}$, respectively. These reversible binding simulations also used titration, and although they often nucleated 2 structures, we could keep adding monomers until at least one lattice reached our target size. Because the hexamer and dimer rates are identical during the assembly process, we do not see selection for only complete dimers along the lattice periphery, as is observed in the cryoET maps (11). To recover this feature, we would instead need to assemble the lattice under more native-like conditions where the dimer is more rapidly and stably formed compared to the hexamer contact. We generated 16 initial configurations for each coverage area (67% and 33%). Some initial configurations are shown in Fig. 2.

Simulations for lattice remodeling dynamics. For each initial configuration we have generated, we perform 6 independent trajectories. See Video 1 for one trajectory. We perform these 96 simulations for each set of model parameters to generate statistics both within and across initial configurations. For some simulations, fragments of the lattice become sterically overlapped with one another, due to the high density and the time-step size.
While this could be eliminated by lowering the time-step, we instead keep the more efficient time-step, and discard the simulation traces that produce overlap. We finally analyze 60 remodeling traces for each parameter set. All the simulation parameters are listed in Table 1. The number of monomers is fixed for each simulation by the initial configuration, so that only binding, unbinding, and diffusion can occur throughout the simulation. During lattice construction, in one set of simulations we set all Gag-Pol to Gag-Pol binding interactions to zero (Fig 2B). For the remodeling dynamics, we now allow all interactions involving Gag-Pol, at rates that are identical to those involving Gag, meaning there is no difference between the types except for their label. The Gag/Gag-Pol molecules are allowed to diffuse on the membrane, where they can unbind from a molecule and rebind to another with the specified binding rates. Each monomer can also unbind and rebind to the membrane lipids. However, dissociation to solution is extremely rare, as it requires that all Gag monomers in an assembled complex unbind from their lipid before any of the sites rebind. In Fig 1- [...] Fig 2. We note that two Gag-Pols have a chance to be adjacent at the initial configuration (Fig 2), but we ignore these events since dimerization of Gag-Pol before viral release has been experimentally shown to result in loss of Pol components from the virions (42) [...]

[...] were identifiable because one sub-population of Gag monomers carried a HALO tag (a protein that fuses to a target of interest, here Gag), and another carried a SNAP tag. The addition of a linker HAXS8 produced a covalent linkage which we will call HALO-link-SNAP. The concentration of these HALO-link-SNAP structures was then quantified vs time. We therefore reproduced this experiment via analysis of our simulation trajectories.
We defined a population of our Gag monomers randomly selected to have a 'SNAP tag', and a population randomly selected to have a 'HALO tag'. For each trajectory, the tagged populations of each were either 5, 10, 20 or 40% of the total monomers, to match the experimental measurements (37). We then monitored the number of dimerization events that occurred as a function of time. A dimerization event required that a monomer with a SNAP-tag and a monomer with a HALO-tag encountered one another at a distance less than 3 nm (77), where one and only one of these partners must have the covalent linker attached. The 3 nm cutoff distance is comparable to the molecular length-scales of the two protein tags with the linker between them (77). We randomly selected half of the population of SNAP-Gags to have a linker attached, and half of the population of HALO-Gags to have a linker attached, and thus some encounters between a SNAP and HALO Gag were not productive if zero or 2 linkers were present. Ultimately, however, all dimers could be formed given the symmetric populations containing linkers. These binding events were irreversible, consistent with a covalent bond formed.

This model assumes that the arrival of the linker to the inside of the virion is relatively rapid. The permeability coefficient of the linker when exposed to the membrane-enclosed Gag lattice is approximately 0.0004 nm/µs (77), and assuming a membrane thickness of ~5 nm, the diffusion across the membrane occurs at ~0.002 nm$^2$/µs. To test the role of linker permeability, we solved the diffusion equation for a 1 µM concentration of linker molecules diffusing into a sphere of radius R=67 nm, which mimics the experiments. Within 100 ms, the concentration of the linker at 60 nm (close to the Gag-tagged end) has already reached 0.8 µM.
Hence although there is some delay following addition of the linker, it is much less than the time (20 s to 3 minutes) over which most of the dimerization occurs. The model also assumes that the linker does not saturate all HALO and SNAP molecules independently, which would prevent any dimers forming. The rates of binding of SNAP and HALO to the linker HAXS8 are $3\times10^{4}$ and $3\times10^{6}$ respectively (77). We solved a system of ordinary differential equations (ODEs) for binding of HALO and SNAP to a linker given these rates. The HALO and SNAP concentrations were controlled by the size of the virions with 250 of each present (10%), and the linker concentration was 1 µM, which was found experimentally to ensure high dimerization success (30). Although the linker binds more rapidly to HALO, there is still plenty of time for the HALO-linked molecules to bind to a free SNAP before all the sites are occupied by linkers, as the copy numbers of linkers in the volume are low. In particular, if the HALO and SNAP tags are adjacent in the lattice, the SNAP is much more likely to bind the adjacent HALO-linker than a free linker. Since our simulations span ~20 s, we performed a linear fit of the last second of the dimer-formation kinetics to extrapolate the dimer copies at 3 minutes, which can be used for comparison with the experiment. This extrapolation therefore assumes that dimer formation does not slow down, which it almost certainly does. All of our assumptions contribute to the maximal possible [...]

[...] (Eq. 3) [...] agreement between the stochastic measurement of the ACF and the direct measurement of the copy numbers ACF are excellent (see below). We use this stochastic localization method so that we can introduce additional sources of correlation to our measurement of the simulated lattice dynamics, since these measurement artifacts can appear in the real experimental system.
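A number ACF of the kind used here, together with the division by the background (total copy number) ACF, can be sketched as follows. The normalization $G(\tau) = \langle n(t)\,n(t+\tau)\rangle / \langle n\rangle^2$ is an assumption chosen so that constant or independent copy numbers give a value of 1, consistent with the asymptote described in the text; the exact estimator used for the published analysis may differ, and the function names and synthetic Poisson data are hypothetical.

```python
import numpy as np

def number_acf(counts, max_lag):
    """Number ACF G(tau) for tau = 0..max_lag frames.

    Normalized by the squared mean so that uncorrelated (or constant)
    copy numbers give G = 1 at nonzero lag.
    """
    n = np.asarray(counts, dtype=float)
    norm = n.mean() ** 2
    return np.array([np.mean(n[: n.size - lag] * n[lag:]) / norm
                     for lag in range(max_lag + 1)])

def background_corrected_acf(quadrant_counts, max_lag):
    """Divide each quadrant ACF by the ACF of the total copy number."""
    q = np.asarray(quadrant_counts, dtype=float)   # shape (n_quadrants, n_frames)
    background = number_acf(q.sum(axis=0), max_lag)
    per_quadrant = np.array([number_acf(row, max_lag) for row in q])
    return per_quadrant / background

rng = np.random.default_rng(1)

# A conserved total copy number gives a background ACF of exactly 1.
g_const = number_acf(np.full(2000, 2400.0), max_lag=50)

# Eight quadrants with independent (uncorrelated) synthetic counts.
independent = rng.poisson(50.0, size=(8, 5000))
g_corr = background_corrected_acf(independent, max_lag=20)
print(g_const[:3], g_corr[0, 1:4])
```

With real or simulated lattice data, values above 1 at short delays that decay toward 1 indicate correlated fluctuations in the copy number per quadrant, which is the signature discussed for the more stable lattices.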

Analysis of autocorrelation function (ACF) from experimental data on VLPs
The time-resolved microscopy (iPALM) experiments to characterize lattice dynamics in virus-like particles (VLPs) were previously described and published (30). We describe the analysis of these stochastic localization experiments here because we focus on analyzing a shorter part of the measurement. We analyzed only the first 500 seconds of the measurement (5000 frames), because after that time the laser intensity was changed. Of the data collected on 25 VLPs, we analyzed only the VLPs that reported a large enough fraction of localization events to indicate a reliable measurement: we included only VLPs where >75% of the quadrants had more than 250 localizations, leaving 11 VLPs. We used the same algorithm (78) as applied to the simulation data to quantify the ACF from the time-dependent sequences of localization events (typically a series of 1s and 0s, with occasionally 2 events per frame). For each of the 8 quadrants, one effectively measures the copies of monomer per quadrant: $n_1(t), n_2(t), \ldots, n_8(t)$. The total copies are then $N(t) = n_1(t) + n_2(t) + \ldots + n_8(t)$. The ACF for any single quadrant is given by [...], which we denote as the background signal, as the full surface was visualized once per experiment, with localization events then assigned to quadrants. This background signal reports on fluctuations in the total copy numbers of Gag on the surface, which we would expect to be 1 given a perfect measurement, but which was always higher than this due to measurement noise. To remove this effect of total copy number variations, and instead focus on the local fluctuations in concentrations per quadrant, we would like to report a corrected