Breathing and Tilting: Mesoscale Simulations Illuminate Influenza Glycoprotein Vulnerabilities

Influenza virus has resurfaced recently from inactivity during the early stages of the COVID-19 pandemic, raising serious concerns about the nature and magnitude of future epidemics. The main antigenic targets of influenza virus are two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA). Whereas the structural and dynamical properties of both glycoproteins have been studied previously, the understanding of their plasticity in the whole-virion context is fragmented. Here, we investigate the dynamics of influenza glycoproteins in a crowded protein environment through mesoscale all-atom molecular dynamics simulations of two evolutionary-linked glycosylated influenza A whole-virion models. Our simulations reveal and kinetically characterize three main molecular motions of influenza glycoproteins: NA head tilting, HA ectodomain tilting, and HA head breathing. The flexibility of HA and NA highlights antigenically relevant conformational states, as well as facilitates the characterization of a novel monoclonal antibody, derived from convalescent human donor, that binds to the underside of the NA head. Our work provides previously unappreciated views on the dynamics of HA and NA, advancing the understanding of their interplay and suggesting possible strategies for the design of future vaccines and antivirals against influenza.


Mesoscale all-atom modeling of the influenza virus
In the work presented here, two glycosylated, whole-virion models of the H1N1 influenza A (H1N1) virus were built, simulated, and analyzed. The first corresponds to the strain A/swine/Shandong/N1/2009(H1N1) (hereafter referred to as H1N1-Shan2009), whereas the second one accounts for the A/45/Michigan/2015(H1N1) strain (hereafter referred to as H1N1-Mich2015). Both models are based on the unglycosylated H1N1-Shan2009 whole-virion model previously built by us and presented in Durrant et al. 1 The main difference lies in the addition of N-linked glycans to HA and NA glycoproteins, which substantially improved the biological relevance of the simulated constructs. Although we refer to Durrant et al. 1 for a detailed description of the original modeling of the individual proteins, assembly procedure of the whole-virion model, and molecular dynamics (MD) simulations of both the individual proteins (HA, NA, M2) and the whole-virion model, a summary is also provided in the following paragraph.
The unglycosylated, H1N1-Shan2009 whole-virion model constructed and simulated by Durrant et al. 1 comprises 236 HA trimers, 30 NA tetramers, and 11 M2 ion channels embedded in a 3-palmitoyl-2-oleoyl-dglycero-1-phosphatidylcholine (POPC) quasi-spherical lipid bilayer. 1 HA, NA and M2 were individually modeled and separately equilibrated (HA and NA) through MD simulations before being incorporated into the viral membrane. 1 We note that in Durrant et al. 1 glycans were not modeled. LipidWrapper 2 was used to create the spherical lipid bilayer starting from a large planar POPC-bilayer model generated with the CHARMM-GUI server. 3,4 As specified in reference 1, 1 slight differences in the leaflet densities are magnified by several times in the mesoscale systems, possibly creating large instabilities during simulation. 5 Although a bilayer made of only POPC represents an oversimplification of the actual membrane composition, 6 it has the advantage of minimizing such instabilities during simulations. This was the main reason behind the decision to build a POPC-only lipid bilayer. The M1 matrix proteins, coating the inner leaflet of the lipid bilayer, and the ribonucleoproteins contained inside of the virion were not modeled. As described in Amaro et al. 7 , the shape of the virion and the distribution of the glycoproteins in the viral membrane were adopted from a point model of the virion exterior provided by collaborators, which was in turn derived from a cryoelectron tomography (cryoET) map of the influenza virus pleiomorphy. 8 Positioning of the glycoproteins and insertion of M2 channels in the lipid bilayer according to the point model was done with PyMolecule. 9,10 Full procedure, including handling of clashes, is described in references 1 and 7. 1,7 The whole-virion model was solvated inside and outside of the lipid bilayer with explicit TIP3P water molecules, 11 and ionized at 150 mM concentration with Na + /Cl -, tallying 160,653,271 atoms. All-atom MD simulations were performed on the Blue Waters supercomputer using NAMD2.10 12 and CHARMM36 all-atom additive force fields for protein, 13 lipids, 14,15 and ions. 16 In the next paragraphs, we detail the system setup and all-atom MD simulations of the glycosylated H1N1-Shan2009 and H1N1-Mich2015 whole-virion models presented in this work. molecules 11 located inside and outside of the lipid bilayer, Na + and Clions, as well as Ca 2+ ions bound to the NA heads, were retained from the original unglycosylated system. 1 The amino acid mutations and the addition of glycans altered the overall electric charge of the solute, thus requiring further tweaking of the total number of Na + and Clions to establish electrical neutrality. These modifications did not impact the overall ionic strength, which remained at ~150 mM. The following minimization steps solved eventual clashes involving mutated residues or between newly added glycans and pre-existing water molecules and ions. The total number of atoms for the final system is 160,981,954, with an orthorhombic periodic cell of 1138 Å × 1189 Å × 1150 Å. After generating suitable files for mesoscale simulations with NAMD, 35 the system was ready to undergo all-atom MD simulations.
All-atom molecular dynamics simulations All-atom MD simulations of H1N1-Shan2009 and H1N1-Mich2015 were performed on the TITAN supercomputer at Oak Ridge National Laboratory (ORNL) and the NSF Blue Waters supercomputer at the University of Illinois at Urbana-Champaign, respectively, using the CUDA memory-optimized version of NAMD 2.13. 12 The same simulation protocol was adopted for both H1N1-Shan2009 and H1N1-Mich2015. CHARMM36 all-atom additive force fields were used to describe proteins, 13 lipids, 14,15 glycans, 34 and ions. 16 Simulations were performed in explicit water using the TIP3P model. 11 Preliminary tests were initially run to assess the stability of the systems, familiarize with the computing facilities, and debug possible errors. Then, the systems were initially minimized in two consecutive cycles of 15,000 steps, each using the NAMD's conjugate gradient energy approach. During the first cycle, the position of protein backbone atoms and of POPC P atoms within 15 Å of the M2 ion channels was kept harmonically restrained at 100 (kcal/mol)/Å 2 . We note that the orientation of M2 ion channels was modified with respect to our previous work, 1 thus requiring more caution. We also remark that the initial coordinates for water molecules, ions, lipids, and proteins (except for M2) were retained from a previous MD simulation as described in the previous sections, 1 thus being already equilibrated. During the second minimization cycle, the harmonic restraints on the protein backbone atoms were released, whereas the harmonic restraints on the POPC P atoms within 15 Å of M2 ion channels were decreased to 10 (kcal/mol)/Å 2 . After minimization, the systems were gradually heated up with no restraints from 0 K to 298 K in 220 ps using isothermal-isobaric conditions (NPT ensemble). The temperature was controlled using the Langevin thermostat 36 with a damping coefficient of 5/ps, whereas the Nosé-Hoover Langevin piston 37,38 was used to maintain the pressure at 1.01325 bar. A time-step of 2 fs was used. Next, the systems were equilibrated for 0.5 ns under the NPT ensemble. Temperature (298 K) and pressure (1.01325 bar) control were achieved again using the Langevin thermostat 36 with a damping coefficient of 5/ps, and the Nosé-Hoover Langevin piston. 37,38 During this phase, harmonic restraints at 100 (kcal/mol)/Å 2 were applied again to POPC P atoms within 15 Å of M2 ion channels and gradually released to 10 (kcal/mol)/Å 2 .
Finally, the systems were submitted to productive MD simulations in NPT conditions using the same barostat and thermostat as above to achieve pressure (1.01325 bar) and temperature (298 K) control. Langevin damping coefficient was set to 1/ps during productive MD. One continuous replica was performed for each system using a time-step of 2 fs. To preserve the overall geometry of the virion models in the absence of the interior component, we applied a harmonic position restraint of 20 (kcal/mol)/Å 2 to every 20 th inner-leaflet P atom. A similar protocol was adopted in our previous simulation of the unglycosylated virion, 1 where instead every 10 th POPC head group was kept fixed. Non-bonded interactions (van der Waals and short-range electrostatic) were calculated at each timestep using a cutoff of 12 Å and a switching distance of 10 Å. All simulations were performed using periodic boundary conditions, employing the particle-mesh Ewald method 39 with a grid spacing of 2.1 Å to evaluate long-range electrostatic interactions every three timesteps. SHAKE algorithm 40 was adopted to keep the atomic bonds involving hydrogens fixed with the NAMD option rigidbonds all. Frames were saved every 30,000 steps (60 ps).
For H1N1-Shan2009, we collected a total of ~441.78 ns (7363 frames) continuous productive MD (Movie S1). Simulations were performed with the CUDA memory-optimized version of NAMD 2.13 12 running on 4,096 physical nodes (28,672 processors) on the TITAN supercomputer, with an average benchmark of ~13.83 ns/day. A total of ~17 TB of data were generated, including stripped trajectories.
For H1N1-Mich2015, we collected a total of 424.98 ns (7083 frames) continuous productive MD (Movie S2). Simulations were performed with the CUDA memory-optimized version of NAMD 2.13 12 running on 4,096 physical nodes (114,688 processors) on the Blue Waters supercomputer, with an average benchmark of ~13.78 ns/day. A total of ~16 TB of data was generated including stripped trajectories.
Root Mean Squared Deviation (RMSD) analysis of HA and NA For both strains, each of the 236 HA trimeric and 30 NA tetrameric trajectories was separately aligned using as a reference the respective initial coordinates of all the backbone atoms (C, N, O, Cα). Then, for each of the HA trimers/NA tetramers, we calculated the RMSD of the backbone atoms using the measure rmsd command implemented in VMD. 27 Subsequently, we computed the mean and standard deviation of the RMSD values obtained at each frame across all the 236 HA trimers / 30 NA tetramers, generating, respectively, the averaged RMSD profiles for HA and NA ( Figure S9).

Root Mean Squared Fluctuation (RMSF) analysis of HA and NA residues
We performed the calculation of Cα-atom RMSF to probe the average mobility of HA and NA residues. Before the calculation of Cα-atom RMSF, rotation and translation motions of each individual HA trimer and NA tetramer were removed through specific alignments. In detail, two sets of aligned trajectories, based on different residue masks, were generated for each of the 236 HA trimers and 30 NA tetramers of both strains. For each of the HA trimers, the first set of alignment was performed using as a reference the respective initial coordinates of all the backbone atoms (C, N, O, Cα) of the ectodomain region (residues 18 to 503) of the respective glycoprotein, whereas the second set was based on the region comprising the transmembrane domain (TMD) (residues 529 to 549). The former served for the calculation of RMSF of residues 1 to 503, the latter for the calculation of RMSF of residues 504 to 566. For each of the NA tetramers, the first set of alignment was performed using as a reference the respective initial coordinates of all the backbone atoms (C, N, O, Cα) of the head region (residues 96 to 461) of the respective glycoprotein, whereas the second set was based on the region comprising the stalk and the TMD (residues 7 to 81). The former served for the calculation of RMSF of residues 96 to 469, the latter for the calculation of RMSF of residues 1 to 95. This procedure was performed due to peculiar structures of HA and NA that present domains undergoing tilting motions during the dynamics, i.e., HA ectodomain tilting and NA head tilting. If we had performed a single alignment based on the backbone atoms of the whole structure, the extensive motions of the HA ectodomain and NA head, which are larger in size and account for most of the residues, would have biased the RMSF values of the complementary region encompassing the hinge linker (both for HA and NA), the stalk (for NA), the TMD, and the cytosolic tail.
For each trajectory of both sets and for each monomer, we calculated the RMSF of the Cα atoms of the residues belonging to the respective region using the measure rmsf command implemented in VMD. 27 All the frames were used for the calculation. We then combined the RMSF values obtained for the two regions into a single RMSF profile for each monomer. Finally, we computed the mean and standard deviation of the RMSF values obtained for each Cα atom across all the 708 HA monomers / 120 NA monomers, generating the respectively averaged RMSF profiles for HA and NA ( Figure S10).
Positional fluctuations of HA's and NA's TMD and ectodomain center of masses For both H1N1-Shan2009 and H1N1-Mich2015, we calculated the RMSF of HA's and NA's TMD center of mass (COM) and ectodomain COM using the first frame as a reference. The analysis was performed using in-house tcl scripts within the VMD framework. 27 We note that the aim of this analysis was estimating the average positional fluctuations of the TMD COM and ectodomain COM of each glycoprotein within the virion. Therefore, only the global rotation and translation motions of the whole virion were removed and not those of each individual glycoprotein within the virion. Global rotation and translation motions of the whole virion coordinates during dynamics were removed by aligning the whole-virion trajectory using as reference the coordinates of all the protein backbone atoms (C, N, O, Cα) at the initial frame. The COM of HA TMD was defined using the coordinates of backbone atoms of residues 529 to 549. The COM of HA ectodomain was defined using the coordinates of backbone atoms of residues 1 to 528. The COM of NA TMD was defined using the coordinates of backbone atoms of residues 7 to 29. The COM of NA ectodomain was defined using the coordinates of backbone atoms of residues 30 to 469.
At each frame of the whole-virion trajectory, we calculated, for each glycoprotein, the squared positional fluctuation of both COMs (i.e., ectodomain COM and TMD COM) with respect S9 to their initial position (reference). Next, always for each glycoprotein, we averaged the squared fluctuations of the COMs and calculated the square root of the averages, obtaining the respective RMSF value.

NA head-tilt angle
The NA head-tilt angle was calculated for all 30 NA tetramers at each frame of both virion trajectories using in-house tcl scripts within the VMD framework. 27 The NA head-tilt angle is defined by the principal axis of the tetrameric upper stalk region (residues 50 to 78) and the vector joining residue 78 at the top of the stalk with the COM of the tetrameric head (residues 91 to 469). Only Cα atoms were considered to calculate the COMs. The distribution of NA head-tilt angle values was computed through kernel density estimation using Gaussian kernels with gnuplot plotting program. 41

HA ectodomain-tilt angle
The HA ectodomain-tilt angle was calculated for all 236 HA trimers at each frame of both virion trajectories using in-house tcl scripts within the VMD framework. 27 The HA ectodomaintilt angle is defined by the principal axis of the trimeric TMD (residues 529 to 549) and the vector joining residue 529 at the top of the TMD with the COM of the trimeric ectodomain's long αhelices (LαHs) (residues 419 to 471). Only Cα atoms were considered to calculate the COMs. The distribution of HA ectodomain-tilt angle values was computed through kernel density estimation using Gaussian kernels with gnuplot plotting program. 41

HA head breathing
The extent of HA head breathing was estimated by calculating the distance between the COM of each monomeric HA head (residues 66 to 286) and the COM of the apical residues of the three LαHs. The apical residue of each LαH corresponds to residue 419, and the COM was calculated based on the three residues 419 as in the trimeric assembly. Only Cα atoms were considered to calculate the COMs. This distance was calculated for all 708 HA monomers at each frame of both virion trajectories using in-house tcl scripts within the VMD framework. 27 The distribution of HA head-LαHs distance values was computed through kernel density estimation using Gaussian kernels with gnuplot plotting program. 41

NA head breathing
The extent of NA head breathing was estimated by calculating the inter-protomer distances between the four NA head COMs (residues 96 to 461). Only Cα atoms were considered to calculate the COMs. Considering the four NA heads within each tetramer, a total of six distances have been monitored for each NA tetramer ( Figure S27), at each frame of both virion trajectories, using inhouse tcl scripts within the VMD framework. 27 The distribution of the values of the inter-protomer distances was computed through kernel density estimation using Gaussian kernels with gnuplot plotting program. 41 FluA-20 epitope antibody accessible surface area (AbASA) Antibody accessible surface area (AbASA) of the FluA-20 42 epitope, located on the HA head domain trimer interface, was calculated using the measure sasa VMD command, 27 an implementation based on the Shrake and Rupley algorithm, 43 along with in-house tcl scripts to handle data. The following HA residues were considered for the epitope: residues 109, 230, 232-237, 243. An 18.6 Å probe radius, which approximately encloses the FluA-20 Fab's variable domains (see Figure 5 in the main text), was used. ASA evaluations were conducted on the HA trimer showing the highest degree of head domain breathing, at every frame of the trajectory.

Evaluation of glycoprotein-glycoprotein interactions
For both virion simulations, interactions between glycoproteins were evaluated using the measure contacts implemented in VMD 27 along with in-house tcl scripts to handle data.
For each glycoprotein, we first pre-computed a list of neighbor glycoproteins using a cutoff of 75 Å. Then, always for each glycoprotein, we assessed the number of connections formed with any of the glycoproteins included in its respective neighbor list every 10 th frame (i.e., every 0.6 ns) of the virion trajectory. At each evaluation, a "connection" was considered formed and counted as one when any protein or glycan heavy atoms of the screened glycoprotein were detected to be within 5 Å of any protein or glycan heavy atoms of the screened neighbor glycoprotein. Only heavy atoms belonging to the ectodomains were considered for the analysis. Importantly, if glycoprotein "A" was, for example, found to form a connection with glycoprotein "B", that connection was counted one time for glycoprotein "A" and one time also for glycoprotein "B". Relative fractions of glycoproteins forming respectively 0, 1, 2, 3, 4, or 5 connections at each evaluation were subsequently computed with respect to the total number of glycoproteins (266). Results are shown in Figure 7 for all the glycoproteins, i.e., HAs + NAs, and in Figure S26 for HA and NA separately.
Within the same workflow, we then evaluated the formation of macroclusters of glycoproteins. At each assessment, we calculated the number of glycoproteins that were part of a macrocluster using the same cutoff of 5 Å between protein/glycan heavy atoms. If glycoprotein "A" was, for example, found to form a connection with glycoprotein "B," and, glycoprotein "B" was in turn found to establish at the same time a connection with glycoprotein "C," the three glycoproteins A, B, and C were considered as part of a macro-cluster of 3, regardless of the presence of a direct connection between A and C. Results are shown in Figure S27.
Next, we evaluated the nature of the connections formed by each glycoprotein. The connections were broken down into three categories depending on the residues used by each glycoprotein to form such connections: i) "glycan-only" if only glycan residues were used to form the connection; ii) "protein-only" if only protein residues were used to form the connection; "protein and glycan" if both protein and glycan residues were used at the same time to interact with the screened neighbor glycoprotein. Relative frequencies of "glycan-only", "protein-only", and "protein and glycan" types of connection were subsequently computed with respect to the total number of counted connections. Results are shown in Figure 7 in the main text.
Finally, if glycans were involved in a connection between two glycoproteins, for each glycoprotein we counted the number of stalk/head glycans involved in the connection, respectively. Relative frequencies of the interacting stalk/head glycans were subsequently calculated for HA and NA with respect to the respective total number of stalk/head glycans in both strains. Results are shown in Figure S28.
Evaluation of conditional probabilities between HA and NA tilting/breathing and interprotein connectivity For each motion investigated in this work, namely NA head tilting, HA ectodomain tilting, and HA head breathing, we calculated the conditional probability that a glycoprotein (either NA or HA) exhibits a certain degree of tilting/breathing during the dynamics given that it forms a connection with one or more neighboring glycoproteins. Here, the conditional event, i.e., B, the formation of one or more connections during the dynamics, should account for the interprotein connectivity. Therefore, with this analysis, we investigate whether the interprotein connectivity somehow affects the probability of achieving a certain degree of tilting/breathing. In order to calculate such conditional probability, we used the following formula: Where: • P(A) is the probability that a glycoprotein (HA or NA) tilts/breathes by a certain amount during the dynamics with respect to the initial frame. • P(B) is the probability that the glycoprotein forms one or more connections during the dynamics. P(B) must be > 0. • P(A∩B) is the probability at which the events A and B occur together.
• P(A|B) is the probability of event A occurring if event B has happened. P(A), P(B) and P(A∩B) were determined by counting the number of frames where events A, B, and A∩B occurred, respectively, and dividing it by the total number of screened frames. H1N1-Shan2009 and H1N1-Mich2015 trajectories were analyzed with a stride of 10. To assess whether event A occurred or not for HA ectodomain tilting and NA head tilting, we evaluated, for each HA/NA, the tilting angle at the examined frame with respect to the tilt angle at frame 0 and computed the difference, namely (tilt-angle_frame_n -tilt-angle_frame_0). If the obtained difference value was within a certain (positive) range, then event A would be counted as occurred for the evaluated range of values (see Tables S4-S9). Instead, to assess whether event A occurred or not for HA head breathing, we evaluated, for each HA trimer, the breathing distance of each monomer at the screened frame with respect to the respective breathing distance at frame 0 and computed the difference, namely (breathing-distance_frame_n -breathing-distance_frame_0). If the obtained difference value for at least one monomer within the trimer was within a certain (positive) range, then event A would be counted as occurred for the evaluated range of values (see S12   Tables S4-S9). To assess whether event B occurred or not we evaluated the number of connections made by the screened glycoprotein at each analyzed frame. If the number of connections was equal or larger than one, then event B would be counted as occurred. To assess whether event A∩B occurred or not, we evaluated whether event A and B occurred at the same time or not. If true, then event A∩B would be counted as occurred.
The absolute error of P(A) was calculated as P(A) / sqrt((Count of number of times A was observed) -1). The same was done for the absolute errors of P(B) and P(A∩B).
The absolute error of P(A|B) was calculated as ER(A|B) × P(A|B), where ER(A|B) is the relative error of P(A|B).
The relative error of P(A|B), namely ER(A|B), was calculated as sqrt( is the relative error of P(A), ER(B) is the relative error of P(B), and ER(A∩B) is the relative error of P(A∩B).
The relative error of P(A), namely, ER(A) was calculated as E(A) / P(A). The same was done to calculate ER(B) and ER(A∩B).
Next, we also estimated in a similar way the opposite conditional probability (P(B|A)) that a glycoprotein (either NA or HA) forms a connection with one or more glycoproteins, given that it exhibits a certain degree of tilting/breathing during the dynamics. Here, the conditional event is A, i.e., the degree of tilting/breathing achieved during the dynamics. Therefore, with this analysis, we investigate whether tilting/breathing affects the probability of HA/NA forming one or more connections. In order to calculate such conditional probability, we used the following formula: Where: • P(B) is the probability that the glycoprotein forms one or more connections during the dynamics. • P(A) is the probability that a glycoprotein (HA or NA) tilts/breathes by a certain amount during the dynamics with respect to the initial frame. P(A) must be > 0. • P(A∩B) is the probability at which the events A and B occur together.
• P(B|A) is the probability of event B occurring if event A has happened.
Events A, B, and A∩B were counted as above, and errors were computed as described above.

Markov State Models (MSMs)
The first step was the preparation of individual NA tetrameric (30) and HA trimeric (236) trajectories, which were generated from the whole-virion MD simulations of H1N1-Shan2009 and H1N1-Mich2015, respectively. Glycans were not included in the generated trajectories because a common topology, i.e., equal for all the NA tetramers and HA trimers, respectively, was necessary to load all the trajectories in the MSM workflow. However, the impact of glycans on the S13 glycoproteins remains accounted for as the trajectories were generated from the simulations of the glycosylated whole-virions.
NA head tilting. The MD simulation trajectories were imported into the PyEMMA2 software 44 for MSM analysis. The MSM of the NA head tilting was created as follows. We used two features; the first was defined as the shortest distance between the COM of the head and the COM of the center of the stalk, while the second feature was defined as the shortest distance between the COM of the head and the COM of the bottom of the stalk. The exact residues defining these COM selections are given in the accompanying Jupyter notebook (Data S1). With these two features, we ran a time-lagged independent component analysis (TICA) as is commonly used on MSMs with multiple features 45-47 using a lag time of 100 steps. Our step size is 0.06, so 100 steps are equivalent to 6 ns. We ran k-means clustering 48 of the two features, selecting 300 clusters. From our implied timescale plots 49,50 we selected an MSM lag time of 250 steps ( Figure S12). We then created Bayesian Markov state models, selecting two macrostates for each MSM. Using these parameters, we created MSMs and validated our two-state systems through Chapman-Kolmogorov (CK) tests 51-54 at a confidence interval of 95% ( Figure S13). The estimated and predicted lines overlapped out to 750 steps (3x the MSM lag time number), which meant these MSMs are statistically valid. 51 We then used Robust Perron Cluster Center Analysis (PCCA++) to cluster the first two eigenvectors of the MSM transition matrices to cluster the states. 55 The first macrostate in the H1N1-Shan2009 strain MSM (untilted) had a stationary distribution of 51.4% of the total population, while the second macrostate (tilted) had a distribution of 48.6%. The first macrostate in the H1N1-Mich2015 strain MSM (untilted) had a stationary distribution of 44.3% of the total population, while the second macrostate (tilted) had a distribution of 55.7%. Finally, we extracted NA structures from the two macrostates per strain by creating a 99.99999% probability cutoff for the H1N1-Shan2009 strain macrostates and a 99.999999% probability cutoff for the H1N1-Mich2015 strain macrostates. This means that the NA structures (microstates) that we selected have a >99.99999% probability of residing in the macrostate state that we think they are in through the PCCA++ fuzzy spectral clustering method. Confirmation of which microstates corresponded to which conformations was performed through the MDTraj software program. 56 HA ectodomain tilting. The MD simulation trajectories were imported into the PyEmma2 software 44 for MSM analysis. The MSM of the HA ectodomain tilting was created as follows. We used two features, defined as the distance between the COM of the head and the center of the HA stalk, and the distance between the HA monomer head tips and the bottom of the stalk. The exact residues defining these COM selections are given in the accompanying Jupyter notebook. With these two features, we ran TICA 45-47 using a lag time of 25 steps. Our step size is 0.06, so 25 steps are equivalent to 1.5 ns. We ran k-means clustering 48 of the three features, selecting 300 clusters. From our implied timescale plots 49,50 we selected an MSM lag time of 50 steps ( Figure S18). We then created Bayesian Markov state models, selecting two macrostates for each MSM. Using these parameters, we created MSMs and validated our two-state systems through CK tests 51-54 at a confidence interval of 95% ( Figure S19). The estimated and predicted lines overlapped out to 150 steps (>3x the MSM lag time number), which meant these MSMs are statistically valid. 51 We then S14 used PCCA++ to cluster the first two eigenvectors of the MSM transition matrices to cluster the states. 55 The first macrostate in the H1N1-Shan2009 strain MSM (tilted) had a stationary distribution of 6.3% of the total population, while the second macrostate (tilted) had a distribution of 93.7%. The first macrostate in the H1N1-Mich2015 strain MSM (tilted) had a stationary distribution of 6.2% of the total population, while the second macrostate (untilted) had a distribution of 93.8%. Finally, we extracted HA structures from the two macrostates per strain by creating a 99.999999% probability cutoff for all macrostates. This means that the HA structures (microstates) that we selected have a 99.999999% probability of residing in the macrostate state that we think they are in through the PCCA++ fuzzy spectral clustering method. Confirmation of which microstates corresponded to which conformations was performed through the MDTraj software program. 56 HA head breathing. The MD simulation trajectories were imported into the PyEMMA2 software 44 for MSM analysis. The HA head breathing MSM was created as follows. We used one feature, defined as the smallest mean distance of the monomers to each other. The exact residues defining this feature are given in the accompanying Jupyter notebook. With this one feature, we ran TICA 45-47 using a lag time of 100 steps to properly format the data, even though the one dimension cannot be reduced, obviously. Our step size is 0.06, so 100 steps are equivalent to 6 ns. We ran k-means clustering 48 of the features, selecting 300 clusters. Running TICA and k-means clustering should not be necessary for an MSM with one feature, but we ran these anyways for continuity. From our implied timescale plots 49,50 we selected an MSM lag time of 500 steps ( Figure  S22). We then created Bayesian Markov state models, selecting two macrostates for each MSM. Using these parameters, we created MSMs and validated our two-state systems through CK tests 51-54 at a confidence interval of 95% ( Figure S23). The estimated and predicted lines overlapped out to 1250 steps (>2.5x the MSM lag time number), which meant these MSMs are statistically valid. 51 We then used PCCA++ to cluster the first two eigenvectors of the MSM transition matrices to cluster the states. 55 The first macrostate in the H1N1-Shan2009 strain MSM (open) had a stationary distribution of 24.6% of the total population, while the second macrostate (closed) had a distribution of 75.4%. The first macrostate in the H1N1-Mich2015 strain MSM (open) had a stationary distribution of 26.7% of the total population, while the second macrostate (closed) had a distribution of 73.3%. Finally, we extracted HA structures from the two macrostates per strain by creating a 99.99999% probability cutoff for the H1N1-Shan2009 strain macrostates and a 99.9999% probability cutoff for the H1N1-Mich2015 strain macrostates. This means that the HA structures (microstates) that we selected have a >99.9999% probability of residing in the macrostate state that we think they are in through the PCCA++ fuzzy spectral clustering method. Confirmation of which microstates corresponded to which conformations was performed through the MDTraj software program. 56

Figures and movies of the whole-virion simulations
Figure panels and movies of the whole-virion simulations were rendered from the trajectories using VMD GPU-accelerated ray tracing. 27,57

Experimental Materials and Methods
Single-cell sorting, immunoglobulin amplification, sequencing and mAb production Cryopreserved PBMC samples were thawed and stained with monoclonal antibodies against human CD3, CD14, CD56, CD19, CD21, CD27, CD38, IgM and IgG (BioLegends, BD BioSciences, Beckam Coulter). The NA probe N2 (A/Wisconsin/67/2005) was conjugated with either AX488 or AX647 fluorochrome according to the manufacturer's protocol (Microscale Protein Labeling Kit, Thermo Fisher Scientific) prior to use in flow cytometry. Non-B cells and dead cells were excluded with CD3, CD14, CD56, and aqua (live-dead) staining in the same fluorescent channel. Memory B cells were further gated on CD38low, CD27high, IgM-and IgG+ and NA-binding memory B cells were single-cell sorted into a 96-well plate (Bio-Rad) using an FACS Aria Instrument (BD Immunocytometry Systems). Plates were subjected to reverse transcription PCR and cDNA were used for PCR to amplify immunoglobulin (Ig) heavy and light chain genes as described previously. 58,59 PCR products were sequenced by Sangar sequencing. Ig heavy and light chain sequences were synthesized and cloned into IgG1 heavy and kappa backbone expression vector (GenScript). Antibody NDS.1 was expressed recombinantly by transient transfection of Ig expression plasmids in Expi293 cells with ExpiFectamine (Thermo Fisher Scientific). Supernatant was harvested 5 days post transfection and antibody was purified by using Protein A Sepharose (Thermo Fisher Scientific).

Biolayer interferometry (BLI)
BLI experiments were performed by using the Octet HTX instrument (Sartorius). HIS1K biosensors (Sartorius) were hydrated in PBS prior to use. Recombinant N2 NA tetramers derived from A/Wisconsin/67/2005 and A/Darwin/9/2021 were immobilized on HIS1K biosensors through their hexahistidine tags. After brief equilibration in assay buffer (25 mM Tris pH 8.0, 150 mM NaCl, 1% BSA), the biosensors were dipped into a two-fold dilution series of NDS.1 Fab for 5 min. Starting concentration of NDS.1 was 800 nM. Biosensors were then dipped in the assay buffer to allow Fab to dissociate from NA for 10 min. All assay steps were performed at 30°C with agitation set at 1,000 rpm. Baseline correction was carried out by subtracting the measurements recorded for a sensor loaded with the NA in the same buffer with no Fab. Data analysis and curve fitting were done with the Octet analysis software (version 11). Experimental data were fitted with the binding equations describing a 1:1 (Langmuir model) interaction.

Negative-stain electron microscopy
Samples were diluted to 0.02 mg/ml with buffer containing 10 mM HEPES, pH 7.0, and 150 mM NaCl, adsorbed to glow-discharged carbon-coated copper grids for 10 s, washed with three drops of the same buffer and negatively stained with 0.75% uranyl formate. Micrographs were recorded at a nominal magnification of 100,000 (corresponding to a pixel size of 2.2 Å) on an FEI Tecnai T20 electron microscope equipped with a 2k x 2k Eagle CCD camera and operated at 200 kV. Data collection was performed with SerialEM. 60 Particles were selected from micrographs using in-house written software (YT, unpublished) and extracted into 144 x 144-pixel boxes. After S16 2D classification, initial models were obtained and refined using Relion 3.1 61 with C4 symmetry enforced. The final datasets contained 2,304 and 4,368 particles for NDS.1 Fab and NDS.1/1G01 Fabs, respectively. The corresponding resolutions determined using the Fourier shell correlation threshold of 0.5 was 24.1 and 24 Å. UCSF Chimera 62 was used for docking and map visualization. S17 Figure S1. All-atom glycosylated model of H1N1-Mich2015 influenza A virion. Neuraminidase (NA) and hemagglutinin (HA) glycoproteins, and the M2 ion channels, are depicted with red, blue, and white surfaces, respectively. N-linked glycans are shown in yellow. The spherical lipid envelope is represented with pink van der Waals (vdW) spheres.  22,30,31 in this work we evaluated the accessibility of each selected sequon by calculating the accessible surface area (ASA) of the asparagine side chain's ND2 atom using a probe radius of 10 Å. This was done since the initial coordinates of the whole-virion construct are derived from previous MD simulation of the unglycosylated whole-virion construct. 1 C) Addition of glycan.        The solid line (the "estimate") represents the transition probabilities of the model created at a lag time of 250 steps, with these probabilities then estimated at multiples of the lag time. The dotted line (the "prediction") represents predicted probabilities of the same model independently created at each of these multiples of the lag time. Dark blue shading represents a 95% confidence interval for the predicted probabilities. Since the estimates of the MSM model that we used overlap with the predicted probabilities of the model at 3x the lag time, this model satisfies Markovianity, passes the CK test and can be used to extract transition kinetics. The numbers in the lower left corners represent macrostate transitions; thus "1 -> 1" represents a transition from macrostate 1 to macrostate 1.             S44 Figure S28. Analysis of glycoprotein connections involving glycans. A) Time evolution (ns) of the relative frequency (%) of interacting glycans, i.e., glycans participating in glycoprotein interconnections. The relative frequency of interacting glycan was calculated with respect to the total number of glycans. B) Grouped bar plot depicting the time evolution (ns) of the relative frequency (%) of interacting glycans in H1N1-Shan2009, where glycans have been divided into "head" and "stalk" categories according to their position, both for HA (shades of blue) and NA (shades of red). The relative frequency for each category of interacting glycans was calculated with respect to the total number of glycans of the same category. C) Grouped bar plot depicting the time evolution (ns) of the relative frequency (%) of interacting glycans in H1N1-Mich2015, where glycans have been divided into "head" and "stalk" categories according to their position, both for HA (shades of blue) and NA (shades of red). The relative frequency for each category of interacting glycans was calculated with respect to the total number of glycans of the same category. Figure S29. Analysis of NA head breathing. Distribution of inter-protomer distances between the four NA head COMs in H1N1-Shan2009 (black) and H1N1-Mich2015 (gray) virions. The distribution of the values is shown as a kernel density. On the right, a top view molecular representation of the NA head using red cartoons is provided, with the monitored distances between NA head COMs highlighted with cyan cylinders. Tables   Table S1.

Supplementary
Tilt angles (degree) of ten tilted and untilted structures (microstates) extracted from the MSM of the NA head tilting. Note: the average is the average of the ten selected microstates (ten selected conformational states and their respective angles). This will be representative of the average tilt angle of all microstates assigned to a macrostate of interest.

NA HEAD-TILT ANGLES [°]
H1N1-Shan2009 H1N1-Mich2015 , the largest of the three distances existing within the conformation of the HA trimer represented by the microstate. This applies also to the "closed" microstates, where the reported distance is still the largest distance seen in those microstates. Note: the average is the average of the ten selected microstates (ten selected conformational states and their respective angles). This will be representative of the average tilt angle of all microstates assigned to a macrostate of interest.

HA HEAD COM-LαHs apical residue COM DISTANCE [Å]
H1N1-Shan2009 H1N1-Mich2015  Table S5. P(A|B) for HA head breathing, where event A is the achievement of a certain extent of breathing by at least one of the HA head monomer within the same HA trimer during the dynamics with respect to the initial frame (calculated as breathingdistance_frame_n -breathingdistance_frame_0, with only positive values counted), whereas event B is the formation of one or more connections during the dynamics.  Table S6. P(A|B) for NA head tilting, where event A is the achievement of a certain tilt-angle swing range by the NA head during the dynamics with respect to the initial frame (calculated as tilt-angle_frame_n -tilt-angle_frame_0, with only positive values counted), whereas event B is the formation of one or more connections during the dynamics. glycoproteins, and the M2 ion channels, are depicted with red, blue, and white surfaces, respectively. N-linked glycans are shown in yellow. The spherical lipid envelope is represented with pink van der Waals (vdW) spheres. The HA exhibiting ectodomain tilting motion is highlighted in blue during the dynamics, whereas the other HAs and NAs are colored with gray and dark gray, respectively.

Movie S7.
HA head breathing motion. The head breathing motion of a representative HA trimer extracted from the H1N1-Shan2009 whole-virion simulation is displayed from a side (left) and a top (right) viewpoint simultaneously. Each HA monomer within the trimer is depicted with a transparent surface overlaid on top of a cartoon representation colored with a viridis palette according to the extent of the respective head breathing. N-linked glycans and sequon asparagine are shown with yellow van der Waals (vdW) spheres.

Movie S8.
HA head breathing motion. The head breathing motion of an HA trimer is displayed together with the dynamics of the surrounding crowded environment of the H1N1-Shan2009 whole-virion. ~442 ns of MD are shown. Neuraminidase (NA) and hemagglutinin (HA) glycoproteins, and the M2 ion channels, are depicted with red, blue, and white surfaces, respectively. N-linked glycans are shown in yellow. The spherical lipid envelope is represented with pink van der Waals (vdW) spheres. The HA exhibiting head breathing motion is highlighted in blue during the dynamics, whereas the other HAs and NAs are colored with gray and dark gray, respectively.

Movie S9.
The dynamic interplay between glycoproteins. The glycoprotein interplay taking place during the simulations of the glycosylated whole-virion models of H1N1-Shan2009 (left) and H1N1-Mich2015 (right) is displayed. ~442 ns and ~425 ns of MD are shown, respectively. The glycoproteins are depicted with a surface colored according to the number of connections made by the respective glycoprotein at that frame (0 to 5, colored in viridis palette). Interacting glycans are highlighted with yellow vdW spheres, whereas non-interacting glycans with white VdW spheres.
Movie S10. Glycoprotein macrocluster formation. The formation of macrocluster of glycoproteins taking place during the simulations of the glycosylated whole-virion models of H1N1-Shan2009 (left) and H1N1-Mich2015 (right) is displayed. ~442 ns and ~425 ns of MD are shown, respectively. The glycoproteins are depicted with a surface colored according to the size of the macrocluster they are involved in at that frame (colored in viridis palette). All glycans are highlighted with white vdW spheres.

Captions for Supplementary Data
Data S1. Jupyter-notebooks for the MSM analysis of NA head tilting, HA ectodomain tilting, and HA head breathing are provided along with PDB files of representative NA/HA conformations extracted for each state.