A biophysical model of viral escape from polyclonal antibodies

Abstract A challenge in studying viral immune escape is determining how mutations combine to escape polyclonal antibodies, which can potentially target multiple distinct viral epitopes. Here we introduce a biophysical model of this process that partitions the total polyclonal antibody activity by epitope and then quantifies how each viral mutation affects the antibody activity against each epitope. We develop software that can use deep mutational scanning data to infer these properties for polyclonal antibody mixtures. We validate this software using a computationally simulated deep mutational scanning experiment and demonstrate that it enables the prediction of escape by arbitrary combinations of mutations. The software described in this paper is available at https://jbloomlab.github.io/polyclonal.


Introduction
Many viruses evolve antigenically to escape polyclonal antibodies elicited by vaccination or prior infection (Smith et al. 2004;Hensley et al. 2009;Bedford et al. 2014;Eguia et al. 2021). Neutralization assays are the gold standard for experimentally assessing if a new viral variant has mutations that erode antibody immunity. However, neutralization assays require generating the actual viral variant(s) after they emerge for individual testing, and can therefore only be effectively applied retrospectively to a modest number of viral variants of interest (DeGrace et al. 2022). For this reason, neutralization assays are having difficulty keeping pace with the identification of vast numbers of emerging viral variants by genomic epidemiology (Elbe and Buckland-Merrett 2017;Rambaut et al. 2020;Viana et al. 2022), and so it would be useful to have a method for accurately predicting the antigenic phenotype of viral variants with arbitrary combinations of mutations.
Unfortunately, predicting how a polyclonal serum will neutralize a new viral variant remains a challenge. Deep mutational scanning can systematically measure how large libraries of viral protein variants affect antibody neutralization or binding (Dingens et al. 2017;Lee et al. 2019;Wu et al. 2020;Greaney et al. 2021aGreaney et al. , 2021c. However, although such high-throughput experimental methods can assess the effects of all single mutants on some viral proteins, the number of possible multiply mutated variants far exceeds the limits of these experiments. Therefore, a variety of computational approaches have been developed that attempt to predict escape by new viral variants. These approaches include basic transformations of deep mutational scanning data (Greaney, Starr, and Bloom 2022), models that integrate antigenic data with phylogenetic (Neher et al. 2016) or sequence data (Sun et al. 2013;Harvey et al. 2016), and neural networks that can be trained using deep mutational scanning data (Taft et al. 2022) or sequence data alone (Hie et al. 2021;Thadani et al. 2022).
Here we introduce a new model of viral polyclonal antibody escape that has several advantages over existing computational approaches. Our model is interpretable in terms of underlying biophysical parameters, can be directly fit to experimental deep mutational scanning data, and can predict how new viral variants will be neutralized by the polyclonal sera used to generate the experimental data. We implement our model in a software package and validate it on simulated experimental data. Finally, we demonstrate how the parameters of the model provide quantitative intuition for how mutations combine to escape antibodies that target distinct viral epitopes.

The concept of antibody epitopes
Our approach is inspired by the idea that viral antigens can be partitioned into distinct epitopes. This idea can be traced back over four decades to classic experiments on influenza, which tested viral mutants against large panels of monoclonal antibodies (Laver et al. 1979;Yewdell, Webster, and Gerhard 1979;Figure 1. Antibodies can be grouped into epitopes based on whether they share viral escape mutants. Heatmap displaying classic experimental data extracted from Webster and Laver (1980). Influenza virus mutants (a-p) were selected using individual antibodies from a large panel of monoclonal antibodies. Each viral mutant was then tested to see if it was neutralized or escaped by the other antibodies in the panel. Antibodies were then grouped into epitopes based on their shared escape mutants. Webster and Laver 1980). Viral escape mutants were first selected using individual antibodies and then tested against other antibodies in the panel. A pattern that emerged from these experiments was that groups of antibodies were escaped by similar viral mutants (Fig. 1). Antibodies that share common escape mutants were inferred to recognize a common epitope on the viral protein.
For instance, in Fig. 1, Antibodies 2-5 recognize a similar epitope since they are escaped by many of the same viral mutants.
Based on these studies, the H3 influenza hemagglutinin was divided into five distinct epitopes (Wiley, Wilson, and Skehel 1981;Skehel et al. 1984). This type of coarse epitope map also provided a straightforward way to conceptualize viral escape from polyclonal serum. Unlike monoclonal antibodies, polyclonal serum can contain multiple antibodies, each recognizing discrete epitopes. As such, any variant that fully escapes polyclonal serum would need to escape antibodies binding at multiple epitopes.
The concept of dividing viral antigens into distinct epitopes has proven to be very useful and continues to be applied to new viruses such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Barnes et al. 2020;Piccoli et al. 2020;Cao et al. 2022). However, grouping antibodies by epitopes is a simplifying approximation, as two antibodies targeting a similar epitope may be escaped by slightly different mutations, and some antibodies may bind idiosyncratic regions outside or between major epitopes (Doud, Hensley, and Bloom 2017;Piccoli et al. 2020;Starr et al. 2021a;Greaney et al. 2021bGreaney et al. , 2021c. For instance, in Fig. 1, Antibodies 6 and 12 are grouped into the same epitope, even though some viral mutants only escape one of these two antibodies. Nonetheless, the concept of epitopes provides a valuable way to interpret polyclonal antibody escape and forms the basis of the quantitative approach we take here.

An epitope-based model of viral escape from multiple antibodies
Given the concept of epitopes described earlier, consider viral escape from multiple antibodies. For simplicity, consider a hypothetical polyclonal mixture of just two antibodies (1 and 2) that bind distinct epitopes on a viral antigen. The functional activities of these two antibodies in the polyclonal mixture can differ due to antibody-intrinsic factors (e.g. binding affinity and neutralization potency) and extrinsic factors (e.g. antibody concentration). In our hypothetical antibody mixture, we assume Antibody 1 has a higher functional activity than Antibody 2.
To understand how mutations affect viral escape from this hypothetical two-antibody mixture, consider the viral variants in Fig. 2. In the first variant, a single mutation escapes Antibody 1 while leaving Antibody 2 binding intact. Since Antibody 1 has a higher functional activity, losing its contribution causes a marked escape from the overall antibody mixture ( Fig. 2A). However, a viral variant with a single mutation that escapes Antibody 2 but leaves Antibody 1 binding intact has little escape from the overall antibody mixture, since the more active Antibody 1 can still bind (Fig. 2B).
Importantly, this framework makes distinct predictions about the effects of multiple mutations depending on whether they are in the same or different epitopes. Imagine if the viral variant has two mutations in Antibody 1's epitope. If one of the mutations is already sufficient to mostly escape binding by Antibody 1, then the second mutation will have a largely redundant effect (Fig. 2C). It follows that multiple escape mutations in the same epitope will have diminishing returns-even when all Antibody 1 molecules are unbound, Antibody 2 can still bind. However, the situation is very different if the viral variant gains mutations in the epitopes of both Antibodies 1 and 2. Since both antibodies are escaped, the polyclonal mixture will have no activity (Fig. 2D). Note that the principles described previously can be readily extended to mixtures of antibodies that target more than two distinct epitopes.
The simple hypothetical example described previously is mirrored in real-world data by Kuzmina et al. (2021) on SARS-CoV-2 escape from neutralization by polyclonal serum pooled from vaccinated individuals (Fig. 3). The K417N mutation, which is located in the subdominant, Class 1 epitope (Greaney et al. 2021b) of the SARS-CoV-2 receptor-binding domain (RBD), has little effect on polyclonal antibody escape. But E484K, which is located in the immunodominant Class 2 epitope (Greaney et al. 2021b), causes a substantial drop in polyclonal antibody neutralization. But while K417N has no effect on its own, combining K417N with E484K causes a larger drop in neutralization than E484K alone, presumably due to escape from two distinct groups of antibodies in the sera targeting each epitope (Fig. 3).

Biophysical modeling of polyclonal antibody escape
Here we formalize the epitope-based model of polyclonal antibody escape described previously in terms of experimental measurables and relevant biophysical quantities. First, let ( , ) be the fraction of viral variant that escapes a mixture of polyclonal antibodies at concentration . The quantity ( , ) is experimentally measurable, for instance, from a neutralization assay. Additionally, suppose the antibodies can bind to one of epitopes on the viral antigen. Let ( , ) be the fraction of variant v that has an unbound epitope when the antibody mixture is at concentration . Assuming antibodies bind independently without competition, we can express ( , ) as follows: Note that relaxing the assumption that antibodies bind independently without competition to each epitope has little effect on ( , ) (see Appendix). Note also that this model simplifies escape to be the state where no antibodies are bound to a single idealized antigen, although in reality, multiple antigens decorate each virion. Next, we can define ( , ) in terms of underlying biophysical properties. Again assuming that there is no competition among antibodies binding to different epitopes, that antibodies to each epitope are completely neutralizing, and that the antibody binding to a given epitope e can be described by a Hill curve with coefficient of 1, then ( , ) is given by a Hill equation (Einav and Bloom 2020) as follows: where − ( ) represents the functional activity of antibodies to epitope against variant . Note that ( ) depends on both antibody-intrinsic factors (i.e. affinity as quantified by the free energy of binding Δ ( )) and extrinsic factors (i.e. relative fraction of antibodies in the mix that bind epitope ). The overall functional activity directed to an epitope is a combination of these intrinsic and extrinsic factors and can be expressed as ( ) = Δ ( ) − ln ( ). Note that is the product of the molar gas constant and the temperature, and , ( ) = exp ( Δ ( ) ) is the dissociation constant. For simplicity, we consider each ( ) to be independent, so antibody binding at one epitope does not affect the affinity of antibodies binding at another epitope. Furthermore, we group together all antibodies targeting a given epitope and let ( ) be their effective activity, simplifying the fact that each of these antibodies can have different activities and affinities. More positive values of − ( ) indicate that antibodies that bind epitope e have higher functional activity against variant v.
Lastly, we can write ( ) in terms of the contributions of specific mutations. To do this, assume that mutations have additive effects on the free energy of binding for antibodies targeting any given epitope (Otwinowski, McCandlish, and Plotkin 2018;Otwinowski and Wilke 2018). In this way, we assume that antigenic epistasis arises via the non-linear Hill function that relates ( ) to ( , ) and the product over ( , ) across epitopes. Specifically, let , be the functional activity of antibodies that bind epitope on the unmutated viral antigen, with larger values of , indicating stronger antibody activity targeting this epitope. Let , be the extent to which mutation reduces antibody binding or neutralization at epitope e, with larger values of , corresponding to a larger contribution to antibody escape. So, ( ) can also be written as: where ( ) is one if variant has mutation and 0 otherwise, and m ranges over all M possible mutations. Taken together, the above equations relate the pre-mutation functional activity of antibodies at each epitope ( , ) and the antibody escape effects of individual mutations at each epitope ( , ) to the experimentally measured fraction ( , ) of variants that escape an antibody mixture at concentration .

Fitting biophysical models to deep mutational scanning data
We reasoned that the parameters of our biophysical model could be learned from experiments that measure ( , ) for a large number of variants, containing the most possible single mutants and a sizable number of multiple mutants (Fowler et al. 2010;Kinney et al. 2010). These rich genotype-phenotype measurements can be obtained by viral deep mutational scanning. Briefly, viral deep mutational scanning involves generating a large library of viral or viral protein variants, incubating this library with antibodies or sera, and using deep sequencing to identify which variants successfully escape binding or neutralization (Dingens et al. 2017;Lee et al. 2019;Wu et al. 2020;Greaney et al. 2021a;Greaney et al. 2021c). While previous studies have been restricted to variants that predominantly contain single amino acid mutations, recent advances enable combinations of mutations to be assayed as well (Dadonaite et al. 2022). Although only an infinitesimal fraction of all combinations of mutations can be assayed, this approach does measure ( , ) for each mutation in many different backgrounds, which should be sufficient for revealing the epitopes targeted by polyclonal antibody mixtures and their associated escape mutations.
To fit our biophysical model using deep mutational scanning data, we created a Python software package named polyclonal that uses gradient-based optimization to fit the model to a large set of viral variants v and their corresponding experimentally measured escape values, ( , ). This software package estimates the , and , parameters that best predict the measured ( , ) under tunable and biologically motivated constraints. These constraints are key to ensuring the inference of biologically accurate epitopes. We typically enforce a sparsity constraint that encourages most , values to be close to 0, as most mutations to a , values are penalized to be close to 0 unless there is clear evidence to the contrary. We also usually enforce an evenness constraint, as most mutations to a site that mediates antibody escape at a specific epitope tend to have similar effects (or , 's). The software itself is available at https://github.com/jbloomlab/ polyclonal, and detailed documentation is at https://jbloomlab. github.io/polyclonal. The software also provides methods to visualize the resulting mutation-level escape values in interactive plots, as described in the documentation.

Validation with computationally simulated deep mutational scanning data
To validate the performance of our software, we simulated a hypothetical polyclonal antibody mixture that contains antibodies targeting three major neutralizing epitopes (Class 1-3) on the SARS-CoV-2 RBD (Fig. 4A) (Barnes et al. 2020;Greaney et al. 2021b). We assigned each epitope a different pre-mutation functional activity ( , ) based on the order of typical neutralizing activities against these epitopes in actual human polyclonal serum (Class 2 > Class 3 > Class 1) (Greaney et al. 2021b) (Fig. 4B). Next, we assigned , values using prior deep mutational scanning studies (Starr et al. 2021b(Starr et al. , 2021c that measured the effects of all single RBD mutations on escape from prototypical monoclonal antibodies targeting each epitope (Fig. 4C). Specifically, we used LY-CoV016 (etesevimab) for the prototype Class 1 antibody, LY-CoV555 (bamlanivimab) for Class 2, and REGN10987 (imdevimab) for Class 3 (Starr et al. 2021b;Starr et al. 2021c). Note that the site of greatest total escape (the sum of positive , values at a site) is 417 for Class 1, 484 for Class 2, and 444 for Class 3.
Based on this hypothetical polyclonal antibody mixture, we computationally simulated a realistic deep mutational scanning dataset containing 30,000 RBD variants. These variants contained an average of two amino acid mutations, with the number of mutations per variant following a Poisson distribution ( Supplementary Fig. S1A). These variants also only contained mutations at sites found to be functionally tolerated in prior deep mutational scanning, and these sites were well-represented in the simulated dataset (Supplementary Fig. S1B). We then calculated the true ( , ) for each variant using our biophysical model with the , and , defined earlier. We did this for three concentrations of the hypothetical serum that represent IC97.5, IC99.9, and IC99.998 against the unmutated RBD ( Supplementary Fig. S1C). Lastly, we added gaussian noise, N(0, 0.05), into the ( , ) measurements and then truncated the noisy values to be between 0 and 1 to reflect experimental errors and built in a 1 percent probability of adding or subtracting a mutation from a variant to portray sequencing errors.
We found that polyclonal could successfully infer the underlying properties of our hypothetical polyclonal antibody mixture when fit to the noisy, simulated deep mutational scanning dataset. The predicted , values were nearly identical to the true , values (Fig. 5A), suggesting that we can deconvolve the dominance hierarchy of epitopes targeted by a polyclonal antibody mixture. Furthermore, the predicted , values strongly correlated with the true , values (Fig. 5B, R 2 = 0.63 for Class 1, R 2 = 0.91 for Class 2, and R 2 = 0.85 for Class 3), indicating that we can learn the effects of mutations on antibody escape at each epitope. Note that the lower correlation for the Class 1 epitope can be attributed to its relative subdominance. As shown in Fig. 2B, mutations to a subdominant epitope manifest little effect on the functional activity of the overall antibody mixture when the immunodominant epitopes remain unaffected, and so only have measurable effects in the subset of mutants with mutations in more dominant epitopes. Nonetheless, our approach can still learn these subdominant effects reasonably well. Taken together, these results demonstrate that noisy ( , )'s measured by deep mutational scanning can be modeled to reveal fine details of polyclonal antibody mixtures: the extent to which antibodies target specific epitopes on a viral antigen and the extent to which mutations escape antibodies to each epitope.
We used the fit biophysical model to predict the extent to which variants that were not seen in our experiments will escape the same polyclonal antibody mixture. To do this, we simulated an independent deep mutational scanning dataset, grounded in the same true , and , values, but with a higher number of mutations (three) on average per variant ( Supplementary  Fig. S1A). We found that our fit model could predict the IC90 (the concentration required for ( , ) = 0.1) of each variant in the independent dataset with high accuracy (Fig. 5C, R 2 = 0.98). Ultimately, polyclonal can be applied to make predictions over all variants with mutations that were observed by deep mutational scanning.

Model fitting is dependent on experimental design
We next sought to clarify the experimental conditions for deep mutational scanning that lead to accurate model fitting with polyclonal. To that end, we systematically explored the impact of three important experimental conditions on model fitting through simulation (see Methods).
We first examined the effect of the library mutation rate, an important consideration for library design because variants containing multiple mutations are key to revealing epitopes. If a library only contained variants with single mutations, it should not be possible to detect the presence of subdominant epitopes, due to the phenomenon in Fig. 2B. Indeed, we found that a library containing 30,000 variants with an average of one mutation could only infer the , values for the immunodominant Class 2 epitope (Fig. 6A), highlighting the need for including multiply mutated variants. On the other hand, a library containing 30,000 variants with two mutations on average was sufficient for inferring the , values at all three epitopes, but the accuracy improved as the mutation rate increased (Fig. 6A), consistent with our expectation that greater coverage of variants with multiple epitope mutations is helpful. Note that these simulations do not capture the real-world fact that variants will be increasingly less likely to be functional as they accumulate more mutations.
Next, we investigated the effect of library size, another important library design consideration. If there are too few variants in the library, mutations may not be observed in enough backgrounds to be able to deconvolve their effects on different epitopes. Previously, we showed that a simulated library with 30,000 variants with three mutations on average could accurately infer the , values at all three epitopes. Therefore, we set out to determine the minimum number of variants, with three mutations on average, required to accurately infer the , values at all three epitopes. We found that at least 20,000 functional variants were required to accurately infer the , values at each epitope (Fig. 6B). We also noticed that the , values of immunodominant epitopes can be inferred with fewer variants (Fig. 6B), consistent with the idea that effects of mutations at subdominant epitopes can only be observed in the backgrounds where the immunodominant epitope is already mutated.
Lastly, we explored the effect of antibody concentration, an important consideration for deep mutational scanning selections. If the concentration is too high, all variants will be neutralized. If the concentration is too low, there will be little measurable effect from the mutations. To test the effect of different concentrations, we again used the simulated library containing 30,000 variants with three mutations on average. First, we tested if the data collected at a single concentration was sufficient. We found that a single concentration of IC99.9 against the unmutated RBD was most effective at inferring the , values for each epitope, but there does indeed exist a fine balance (Fig. 6C). Particularly for the subdominant Class 1 epitope, the accuracy is lower when the concentration is too low or too high. We then tested if model fitting could be improved by including additional concentrations flanking the IC99.9, spanning the IC91.443 to the IC99.998. Indeed, we found that the accuracy increases as the number of concentrations measured increases (Fig. 6D), consistent with our expectation that obtaining more data points along the binding curve is helpful for quantifying the effects of mutations.

Discussion
We have described a biophysical model of viral escape from polyclonal antibodies. Notably, this model is composed of easily interpretable parameters that capture the interactions between antibodies and the viral epitopes they bind. In addition, we developed a software package that can infer these parameters using deep mutational scanning data. Using a simulated deep mutational scanning dataset, we demonstrated that this approach can infer true parameters from noisy experiments if the deep mutational scanning experiment is appropriately designed.
There are several limitations to our approach. Grouping antibodies by discrete epitopes is an approximation because each antibody is unique; however, decades of experimental work have shown that an epitope-based representation offers a useful way to interpret viral escape. Our model also does not explicitly consider the fact that there are multiple antigens per virion and instead models escape as simply being the state where no antibodies bind a single idealized antigen. Additionally, our model assumes a specific shape of antigenic epistasis: mutations have additive effects on binding affinity at each epitope but can manifest non-linear effects on the overall measured phenotype (e.g. neutralization) due to both the non-linear Hill curves that relate binding affinity to the total fraction bound at an epitope and the product of these fractions across epitopes. Note that the single-epitope version of our model is similar to the global epistasis models that have proven so useful for interpreting some other types of deep mutational scanning data (Otwinowski, McCandlish, and Plotkin 2018;Tareen et al. 2022), while the multi-epitope version resembles a fuzzy logic model of multiple biophysical traits (Warszawski et al. 2014). Our model also assumes that antibody binding to one epitope does not influence the affinity of antibodies to other epitopes and that the Hill curves have a coefficient of one. It is possible that these assumptions could be relaxed in elaborated versions of our model. We expect that these assumptions will hold reasonably well for viral variants with modest numbers of mutations but may not extrapolate accurately to variants with dozens of mutations relative to the parental strain used for the deep mutational scanning experiment.
Despite these limitations, our model not only predicts the escape potential of viral variants with arbitrary combinations of mutations that are observed in a simulated deep mutational scanning library-it also clarifies how escape mutations combine to determine the magnitude of viral escape. While this manuscript was under review, our model was successfully applied to real deep mutational scanning data measuring the escape of SARS-CoV-2 spike variants from monoclonal antibodies, yielding predictions that strongly correlated with neutralization assays (Dadonaite et al. 2022). While this paper only considers simulated data, we envision that our model can be further applied to appropriately designed deep mutational scanning experiments to address two main questions: (1) delineating the epitopes targeted by polyclonal serum and (2) predicting the antigenic properties of new variants with arbitrary combinations of mutations.

Model fitting
Our goal is to estimate the biophysical model parameters that best predict the deep mutational scanning escape measurements under biologically motivated constraints. We defined a loss function that is robust to outliers: where ( , ) is the actual antibody escape fraction for variant at concentration , ̂( , ) is the predicted antibody escape fraction for variant at concentration , and ℎ loss ( ) is a scaled Pseudo-Huber function, defined as follows: where is a parameter that indicates when the loss transitions from being quadratic (L2-like) to linear (L1-like). Note that the loss is mostly L2-like when , the residual, is small. However, the loss transitions to become more L1-like when is large, making it more robust to outliers. A default loss = 0.01 was used in all fit models. To minimize this loss function, we use the gradient-based L-BFGS-B method implemented in scipy.optimize. minimize.

Regularization
Additionally, we implemented penalty terms to regularize the parameters to behave under biologically motivated constraints. We regularize the escape values ( , ) using a Pseudo-Huber function, based on the notion that most mutations should not mediate escape.
where escape is the strength of the regularization and escape is the Pseudo-Huber delta parameter. For all fit models, the default parameters escape = 0.02 and escape = 0.1 were used. We also regularize the variance of escape values ( , ) at each site, based on the notion that mutations at a site involved in escape will exhibit similar effects.
where spread is the strength of the regularization, ranges over all sites, is the number of mutations at site , and indicates all mutations at site . For all models fit, the default parameter spread = 0.25 was used.
Lastly, we regularize the pre-mutation functional activities ( , ) using a Pseudo-Huber function as these should be less positive (or even negative) unless there is clear evidence to the contrary. activity = activity ∑ ℎ activity ( , ) if , ≥ 0 where activity is the strength of the regularization, activity is the Pseudo-Huber delta parameter, and activity = 0 when , < 0. For all models fit, the default parameters activity = 1.0 and activity = 0.1 were used. For more information on these penalties and model fitting, see https://jbloomlab.github.io/polyclonal/optimization.html.

Determining the number of epitopes
Fitting the model requires specifying the number of epitopes a priori. However, it is not possible to know the number of epitopes that are targeted by antibodies in polyclonal serum in practice. Our approach to resolving this is similar to the 'elbow method' commonly used to determine the optimal number of clusters in k-means clustering. We start by fitting a model with one epitope and iteratively fit models with an increasing number of epitopes. At some point, the N-th epitope becomes redundant. This is evidenced by a highly negative , value (i.e. if antibodies existed against this epitope, they would never be bound) and all nearzero , values, indicating that the previous fit model, containing N − 1 epitopes, is the one that best describes the polyclonal mixture. To view how this approach was applied to the simulated RBD example, see https://jbloomlab.github.io/polyclonal/specify_ epitopes.html.

Experimental design simulation
We computationally simulated deep mutational scanning libraries based on the hypothetical polyclonal antibody mixture shown in Fig. 4. All libraries contained 30,000 variants but differed by their mutation rate with variants containing an average of one, two, three, or four mutations. Furthermore, the number of mutations per variant in each library followed a Poisson distribution. Variant escape was also simulated under six concentrations in each library. These concentrations represented IC91.443, IC97.488, IC99.441, IC99.9, IC99.985, and IC99.998 against the unmutated RBD antigen. For more details on how these experiments were simulated, see https://jbloomlab.github.io/polyclonal/simulate_RBD. html.
To make comparisons about model fitting, models were fit with identical parameters except for the experimental variable of interest: library mutation rate, library size, and antibody concentration.
To determine the impact of library mutation rate, models were fit to simulated datasets for libraries containing one, two, three, or four mutations on average. All datasets contained 30,000 variants and were measured at three different concentrations.
To determine the impact of library size, models were fit to simulated datasets containing different-sized subsets of variants that were randomly sampled from a library containing an average of three mutations per variant and measured at three different concentrations.
To determine the impact of antibody concentration, models were fit to simulated datasets to a library measured at one or multiple antibody concentrations. This library contained 30,000 variants and three mutations on average per variant.

Monoclonal antibody case
Before considering the polyclonal antibody case, we first consider the case of a monoclonal antibody that binds a viral antigen. Here, the viral protein can exist in two microstates: bound or unbound by the antibody. The Boltzmann weight is 1 for the unbound state and for the bound state, where is the antibody concentration and is the dissociation constant of antibody-antigen binding. These weights can be derived using the steady-state approximation (Einav, Bloom, and Antia 2020). We can then define the partition function Ξ as follows: represents the Boltzmann weights of the microstates. Given this, the probability of a viral antigen being unbound is unbound = unbound Ξ = 1 1 +

Polyclonal antibody case
For a polyclonal antibody mixture, we modify to represent the concentration of the polyclonal antibody mixture. We assume that the polyclonal antibody mixture contains antibodies that bind one of epitopes. As follows, the Boltzmann weight of the state where epitope is bound is modified to , , where represents the fraction of antibodies in the mixture that target epitope , and , is the dissociation constant of antibodies binding to epitope .

Two distinct epitopes
In a polyclonal antibody mixture, new microstates exist where multiple epitopes are bound by antibodies. For example, we can consider a viral antigen that contains two distinct epitopes (1 and 2) that are targeted by polyclonal antibodies. In addition to the microstates where a single epitope is bound, we now require an additional microstate where both epitopes are bound. The Boltzmann weight for this new microstate is: We can then rewrite the partition function Ξ as follows: Ξ = ∑ = unbound + 1,bound + 2,bound + 12,bound and the probability of a viral antigen being unbound is unbound = unbound Ξ = 1 Note that this is the biophysical model that is described in the main text.

Two overlapping epitopes
In Appendix section 2.1, the two epitopes were distinct, and there was no competition among antibodies. However, if the epitopes are overlapping and there is competition, then the microstate where both epitopes are bound ( 12, bound ) can no longer exist. In this case, the probability of a viral antigen being unbound is unbound = unbound Ξ = 1

Extending beyond two epitopes
The same logic applies to viral antigens with more than two epitopes targeted by antibodies. For example, we can write unbound for the case of three distinct epitopes as: .
Overall, we found that the predicted unbound does not differ by much between the independent and overlapping epitope models under realistic 's, 's, and 's. However, the predicted unbound becomes more discordant between the two models as the number of epitopes increases, given the number of microstates with multiply bound epitopes increases. In practice, we only fit models to a handful of epitopes and not all of them will overlap. As such, we maintain that the independent epitope assumption is appropriate. To interactively explore how unbound differs between the two models, see https://jbloomlab.github.io/polyclonal/partition_function. html.