Fragment-based computational design of antibodies targeting structured epitopes

De novo design methods hold the promise of reducing the time and cost of antibody discovery while enabling the facile and precise targeting of predetermined epitopes. Here, we describe a fragment-based method for the combinatorial design of antibody binding loops and their grafting onto antibody scaffolds. We designed and tested six single-domain antibodies targeting different epitopes on three antigens, including the receptor-binding domain of the SARS-CoV-2 spike protein. Biophysical characterization showed that all designs are stable and bind their intended targets with affinities in the nanomolar range without in vitro affinity maturation. We further discuss how a high-resolution input antigen structure is not required, as similar predictions are obtained when the input is a crystal structure or a computer-generated model. This computational procedure, which readily runs on a laptop, provides a starting point for the rapid generation of lead antibodies binding to preselected epitopes.


INTRODUCTION
Antibodies are key tools in biomedical research and are increasingly used to diagnose and treat a wide range of human diseases. Now, there are over 120 antibodies approved or undergoing regulatory review in the United States and Europe (1). Existing antibody discovery methods have been widely successful, but still have important limitations (2). Extensive laboratory screenings are required to isolate those antibodies binding to the intended target, which can be time consuming and costly. Some classes of hard targets remain, including some receptors and channels, proteins within highly homologous families, aggregation-prone peptides, and disease-related short-lived protein aggregates (3,4). Most notably, it is often highly challenging to obtain antibodies targeting preselected epitopes. Screening procedures typically select for the tightest binders, which usually occur for immunodominant epitopes, thus disfavoring the discovery of antibodies with lower affinities but binding to functionally relevant sites (5). Furthermore, screening campaigns often yield antibodies with favorable binding affinity but otherwise poor biophysical properties, such as stability, solubility, and production yield, which may hinder their development into effective reagents. Computational antibody design has the potential to overcome these limitations by markedly reducing time and costs of antibody discovery and, in principle, allowing for a highly controlled parallel screening of multiple biophysical properties. Moreover, rational design inherently enables the targeting of specific epitopes.
Most available methods for the design of binding proteins rely at least in part on the minimization of a calculated interaction free energy, through the sampling of the mutational space and the conformational space (2,6,7). The nature of these calculations, which are based on molecular modeling and classical force fields, and the challenges of achieving exhaustive sampling make simulations rather imprecise and highly resource intensive. For these reasons, the de novo design of antibody binding has generally met low success rates and required recursive experimental screenings and large libraries (5,8–10). Computational design of binding has been most successful in synergy with in vitro affinity maturation and, in particular, when applied to miniproteins (11–13). The small size of these miniproteins is amenable to the high-throughput gene synthesis required to experimentally screen designed candidates on a massive scale, and their rigidity reduces the need for accurate conformational sampling. However, antibody domains are considerably larger and bind their target using complementarity-determining regions (CDRs) located within hypervariable loops on the antibody surface, which are often extended and highly flexible.
Here, we describe a novel method to design antibody CDR loops targeting epitopes for which a structure is available, from either an experimentally determined structure or a computational model. Designed CDRs are then grafted onto antibody scaffolds and further optimized computationally for solubility and conformational stability. Novel antibody-antigen interactions are designed by combining protein fragments identified as interacting with each other within known protein structures.

De novo CDR design strategy
To overcome some of the limitations of molecular modeling mentioned above, in particular those associated with the approximations of the interatomic interactions, we exploited the availability of large structural databases to implement a fragment-based procedure to design CDRs (paratope) complementary to a target epitope. To implement this idea, we compiled from the nonredundant Protein Data Bank (PDB) a database of CDR-like fragments and corresponding antigen-like regions, which we call the AbAg database. CDR-like fragments are defined as linear motifs structurally compatible with an antibody CDR loop, which may be found in any protein structure in the PDB. Conversely, antigen-like regions comprise those residues interacting with CDR-like fragments in the context of the protein where the CDR-like fragment is originally found (see Materials and Methods).
Given the structure of a target epitope, the database can be searched to identify antigen-like regions similar to this epitope or to fragments of it. In this way, the CDR-like fragments interacting with the identified antigen-like regions may also interact with the target epitope. To perform this search, the structure of the input epitope is fragmented using two different strategies (Fig. 1A): (i) a linear fragmentation, which generates fragments of at least four consecutive residues, and (ii) a surface-patch fragmentation, which takes each residue and yields the closest n ≥ 4 solvent-exposed residues in the three-dimensional structure of the epitope. The minimum size of four was chosen because n < 4 results in a substantially slower search, and small fragments are unlikely to capture enough of the epitope complexity to yield CDR-like fragments that would actually bind to the target epitope. These two approaches allow for covering a wider search space, as the first one conducts an exhaustive search for contiguous epitopes, whereas the second one is more suitable for conformational epitopes comprising multiple segments, which are generally distant in the sequence space. Next, each epitope fragment is compared to the antigen-like regions to identify those with compatible backbone structure and similar sequence. More specifically, the search is carried out with the MASTER algorithm (14), and the comparison is based on the root mean square deviation (RMSD) of the full backbone and on sequence similarity (see Materials and Methods). Therefore, a hit antigen-like region is similar to its query epitope fragment in both sequence and structure. In practice, the fragmentation is carried out starting from large fragments (i.e., from the full region defined as epitope) and moving to smaller ones, down to a minimum size of four residues. Most commonly, no hits are found for larger fragments, while many hits are typically found for smaller ones (n ≤ 6).
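The two fragmentation modes described above can be illustrated with a minimal sketch. This is not the code used in this work: the function names `linear_fragments` and `surface_patch`, the representation of the epitope as a list of residue labels or coordinates, and the choice to count the central residue as part of its own patch are all assumptions made for illustration. The sketch also assumes that solvent-exposed residues have already been selected.

```python
import math
from typing import List, Sequence, Tuple

MIN_FRAGMENT_SIZE = 4  # minimum fragment size stated in the text


def linear_fragments(residues: Sequence[str],
                     min_size: int = MIN_FRAGMENT_SIZE) -> List[List[str]]:
    """Contiguous windows of the epitope, largest first, down to min_size."""
    n = len(residues)
    fragments = []
    for size in range(n, min_size - 1, -1):  # start from the full epitope
        for start in range(n - size + 1):
            fragments.append(list(residues[start:start + size]))
    return fragments


def surface_patch(coords: Sequence[Tuple[float, float, float]],
                  center: int,
                  n: int = MIN_FRAGMENT_SIZE) -> List[int]:
    """Indices of the n residues closest in space to residue `center`.

    Assumption: every entry of `coords` is a solvent-exposed residue, and
    the central residue is included in its own patch.
    """
    by_distance = sorted(range(len(coords)),
                         key=lambda i: math.dist(coords[center], coords[i]))
    return sorted(by_distance[:n])


# Hypothetical six-residue epitope: one window of size 6, two of size 5,
# and three of size 4 are generated, in that order.
epitope = list("ACDEFG")
fragments = linear_fragments(epitope)
```

Generating the largest windows first mirrors the order given in the text, so that a search could stop early if a large fragment already yields hits.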
Because of the nature of the AbAg database, this procedure yields those CDR-like fragments that interact with the identified antigen-like fragments (Fig. 1B). These CDR-like structures are then rotated to match the orientation of the epitope, by superimposing each antigen-like region, together with its interacting CDR-like fragments, to the matching part of the epitope (Fig. 1 and fig. S1). When possible, different CDR-like fragments whose backbones are partly overlapping and compatible with a single longer CDR loop are joined together to yield longer interacting motifs (Fig. 1C; see Materials and Methods).
Some of the original interactions of each CDR-like fragment may be affected when this fragment is transferred onto the epitope, for instance, if the sequence of the antigen-like region is not identical to the corresponding epitope sequence, or if the epitope side chains are found in different conformations (Fig. 1D). Similarly, new interactions may arise when a CDR-like fragment forms contacts with parts of the epitope that were not matched onto its antigen-like region. To overcome potential issues arising from these suboptimal interactions, we implemented a side-chain optimization procedure that seeks to maximize the number of favorable interactions between the CDR-like fragment and the antigen. Briefly, for each CDR-like side chain with interactions different, or additional, to those found in the original hit, a structural neighborhood is defined by taking the backbone coordinates of all contacting residues (see Materials and Methods). These residues are then used as a query to interrogate the AbAg database, retrieving as hits those CDR-like side chains that better match the native local environment of the epitope, therefore increasing the total number of favorable interactions to yield a fully optimized designed CDR motif (Fig. 1D; see Materials and Methods).

Fig. 1. (A) The antigen structure is shown in gray, with the epitope of interest highlighted in red. At this step, the epitope is fragmented into its structural fragments, both in a linear mode and in a surface-patch mode that yields also noncontiguous fragments (see text). Some example fragments are indicated by red arrows. (B) These fragments are used as queries for a structural search against a custom database of structures of antigen-like and CDR-like interacting fragments. Hits are selected on the basis of structural and sequence similarity with the query epitope fragments, and two example hits are depicted: an epitope fragment (pink top example, purple lower example) matching antigen-like fragments (yellow) interacting with a CDR-like fragment (cyan top example, blue lower example). These two examples originate respectively from the structures of human EML1 protein and of a bacterial transferase, as antigen-like and CDR-like fragments may be found in any structure from the PDB. (C) When possible, identified CDR-like fragments are joined together. Here, the overlapping CDR-like fragments from B are merged as they meet the stated compatibility criteria. (D) The sequence of the designed CDR fragment resulting from the merging is optimized to increase the probability of favorable CDR-epitope contacts. The final fully optimized designed CDR motif can then be grafted onto suitable antibody frameworks. The example in this figure corresponds to the designed binding motif within the CDR3 of DesAb-RBD-C1 targeting the ACE2-binding site on the RBD domain.

Typically, multiple CDR motifs are designed in this way for a given input epitope, as multiple CDR-like fragments are usually identified as suitable starting points for the combination and the optimization procedures. Therefore, all possible CDR motif candidates generated for the input epitope are ranked according to the total number of favorable interactions, the number of interactions that could not be optimized, and a solubility score calculated with the CamSol method (15).
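As an illustration of this ranking step, the sketch below orders candidate motifs by the three criteria named above. The text does not state how the criteria are combined, so the lexicographic ordering used here (more favorable interactions first, then fewer unoptimized interactions, then higher solubility) is an assumption, as are the `CdrCandidate` class, its field names, and the example values. The solubility score is taken as a precomputed number; no CamSol call is made.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CdrCandidate:
    sequence: str        # designed CDR motif (hypothetical example sequences below)
    favorable: int       # total number of favorable interactions
    unoptimized: int     # interactions that could not be optimized
    solubility: float    # precomputed solubility score; higher assumed better


def rank_candidates(candidates: List[CdrCandidate]) -> List[CdrCandidate]:
    """Sort candidates best-first using an assumed lexicographic priority."""
    return sorted(
        candidates,
        key=lambda c: (-c.favorable, c.unoptimized, -c.solubility),
    )


# Hypothetical candidates, for illustration only.
candidates = [
    CdrCandidate("AAAAAA", favorable=10, unoptimized=2, solubility=0.5),
    CdrCandidate("CCCCCC", favorable=12, unoptimized=3, solubility=-0.2),
    CdrCandidate("DDDDDD", favorable=12, unoptimized=1, solubility=0.1),
]
ranked = rank_candidates(candidates)
```

A weighted score combining the three terms would be an equally plausible choice; the lexicographic key is used here only because it needs no weights.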
Top-ranking, designed CDR motifs can then be grafted into an antibody scaffold (Fig. 2). Our pipeline can structurally match the generated motifs either to complete CDRs or entire antibody structures (specifically Fv regions), which can result in longer CDR loops harboring multiple motifs, or in multiple motifs being grafted in different CDR loops of the same Fv region (Fig. 2, A to C; see Materials and Methods). If needed, any new interactions between the grafted antibody scaffold and the antigen are optimized using the side-chain optimization procedure described above. Furthermore, as an alternative to this structural matching, designed CDR motifs can also be grafted directly into an antibody scaffold that is already known to be highly tolerant to loop replacements. In this work, we tested experimentally both approaches (Fig. 2D).
To validate our design strategy, we tested it experimentally on single-domain antibodies, because of their monomeric nature, ease of production in prokaryotic systems, and small size (16). Nonetheless, the computational design pipeline described here can readily be applied to other antibody fragments, including whole Fv regions, on which designed CDR motifs can be structurally matched and grafted in the same way on either heavy- or light-chain CDRs.

Description of designs and biophysical characterization
We designed six single-domain antibodies for three different antigens by exploring two grafting strategies: the direct grafting of the designed CDRs onto stable scaffolds and the matching of the designed CDRs to a scaffold that is structurally compatible with them. The first strategy provides the opportunity to test the de novo CDR design procedure by minimizing possible complications arising from the grafting, while the second is a more complex approach that allows the design of multiple CDR loops onto a scaffold structurally matched to the epitope. Two designed single-domain antibodies (DesAbs) target the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein receptor-binding domain (RBD), three target human serum albumin (HSA), and one targets pancreatic bovine trypsin (Table 1). HSA and trypsin were selected for the initial validation. Both proteins are available off the shelf, and binding of therapeutic proteins to HSA is a key determinant of pharmacokinetics. Therefore, single-domain antibodies targeting HSA may provide a tool for enhancing the half-life of biologics (17). Conversely, trypsin offers the opportunity of testing the design strategy on poorly accessible concave epitopes harboring an active site. The RBD of SARS-CoV-2 exemplifies the power of targeting specific epitopes, as binding to regions overlapping with, or close to, the ACE2 receptor binding site, while avoiding glycosylation sites, is known to yield neutralizing antibody candidates, which would sterically hinder virus binding to the human cell receptor (18). In this case, we used as the starting point for the design the first-released cryo-electron microscopy (cryo-EM) model of the SARS-CoV-2 spike protein in the prefusion conformation (PDB ID 6VSB) (19). The reason for this choice was to assess how the design strategy performs with a lower-resolution structure used as input. Specifically, we ran the design on the surface of the up RBD around the ACE2-binding region, which has some regions of low resolution (~6 to 8 Å) (19) and several missing residues in the model.

Fig. 2. […] Fig. 1C) or, if not overlapping, may still be grafted in the same CDR loop as shown in (B). (D) The structure of HSA is shown in gray, and the designed CDR motifs selected for experimental validation are shown in blue, yellow, and purple docked onto their respective epitopes. Two fragments (blue) are grafted into separate CDRs (CDR1 and CDR3) of an antibody scaffold, which they match structurally (PDB 4DKA). The resulting design is DesAb-HSA-D3 (Table 1). The yellow and purple motifs are instead grafted into the CDR3 of a scaffold resilient to CDR3 substitutions to yield DesAb-HSA-P1 and DesAb-HSA-P2. The motif grafted onto DesAb-HSA-P1 comprises two fragments joined together as in Fig. 1C. DesAb structural models were obtained with the SAbPred webserver (51).
All DesAbs expressed well in Escherichia coli, were obtained to high purity, and showed circular dichroism (CD) spectra fully compatible with a well-folded variable heavy (VH) domain (fig. S2; see Materials and Methods). All designs were highly stable, with a melting temperature on par with or better than that of immune system-derived nanobodies (Table 1 and fig. S2C) (20).
Two of the three anti-HSA single-domain antibodies, DesAb-HSA-P1 and DesAb-HSA-P2 (Table 1 and Fig. 2D), consisted of designed CDR motifs grafted in place of the CDR3 of a previously characterized single-domain antibody scaffold highly amenable to CDR3 substitutions (21,22) (table S1). The third design, DesAb-HSA-D3, was made by structurally matching two separate CDR-like candidates onto two CDR loops of a nanobody scaffold identified as highly compatible with these two binding motifs (Fig. 2D; see Materials and Methods).
We note that this pipeline recovered and scored highly the sequences of the CDRs of an existing nanobody targeting HSA, called Nb.B201, whose structure in complex with the antigen was processed during the AbAg database construction (table S4). While this observation serves as an in silico consistency check for our design method, when selecting fragments for all designs used in this study, we excluded fragments originating from antibodies or peptides already known to bind to the target antigen, and in the case of HSA, we also selected different epitopes.
Binding to HSA was measured in solution with microscale thermophoresis (MST), which yielded Kd values ranging from 140 to 800 nM (Fig. 3, A to C and E), while a control single-domain antibody showed extremely weak signal in this assay (fig. S4A). To put this in context, the Nb.B201 nanobody, which was isolated with yeast display from a state-of-the-art naïve library, was reported to bind HSA with a Kd of 430 nM (23), which is in the same range as those of our de novo designs.
To confirm the binding, we also carried out biolayer interferometry (BLI) with immobilized HSA, obtaining Kd values compatible with those measured in solution (Fig. 3, D and E). The trypsin-targeting DesAb-Tryp used as a negative control gave no binding signal for HSA in this assay (Fig. 3D), while the yeast display-derived anti-HSA nanobody Nb.B201, used as a positive control, yielded a Kd compatible with that reported in the literature (fig. S4B) (23). DesAb-Tryp has the same sequence as DesAb-HSA-P1 and DesAb-HSA-P2, except for the designed CDR motif grafted in the CDR3 loop (table S1), and therefore represents a particularly suitable negative control to confirm that the observed binding is coming from the grafted designed motif. Besides, DesAb-Tryp was able to bind its intended target trypsin, while DesAb-HSA-P1 and DesAb-HSA-P2 showed no binding signal and were likely digested by the protease during the binding assay (fig. S5).
The crystal structure of DesAb-HSA-P1 in the unbound form further confirms the correct folding of the VH domain. This structure also reveals the dynamic nature of the CDR3 loop, which harbors the designed motif, as the electron density is missing for most of this region (fig. S3). A highly dynamic CDR3 loop was expected for this scaffold. For example, two of the four identical chains comprising the asymmetric unit of the structure of the original single-domain scaffold (PDB ID 3B9V) also have unassigned coordinates in their CDR3, even if the loop here is eight residues shorter than that of DesAb-HSA-P1. The highly dynamic nature of this loop likely stems from the lack of strong CDR3-framework contacts, which is why folding and stability of this scaffold have been shown to be insensitive to mutations in its CDR3 loop by several studies (21,24–27). We selected this scaffold precisely because it can harbor virtually any sequence in its CDR3 without marked consequences on its stability. However, the dynamic nature of the loop harboring the designed motifs likely also explains why we were unable to obtain a crystal structure of DesAb-HSA-P1 bound to HSA. We speculate that this dynamic loop, even when bound to the antigen, retains enough hinge flexibility to give the resulting complex a degree of dynamics unsuitable for structural determination.
In the absence of an atomic-level structural characterization of the designed interaction, we resorted to epitope binning through competition experiments. BLI competition experiments show that DesAb-HSA-P1 and DesAb-HSA-D3 compete with each other for binding to HSA, as the binding of one is hindered by the presence of the other antigen-bound DesAb (Fig. 3F). Conversely, DesAb-HSA-P2 does not compete with the other two, as its binding is not affected by the presence or absence of other antigen-bound DesAbs (Fig. 3F). This competition behavior is fully compatible with the rational design, as DesAb-HSA-D3 and DesAb-HSA-P1 were designed to target partly overlapping epitopes, while DesAb-HSA-P2 targets a different epitope on the opposite side of the antigen (Fig. 2D).

†PDB ID of scaffold in whose loop the designed CDRs are grafted. 6Z3X is from this study.
‡Melting temperature rounded to the closest 0.5°C to reflect the accuracy of the thermal shift assay used (see fig. S3C).
§As measured with BLI, rounded to the closest 10 nM with exception of DesAb-Tryp, which was rounded to 100 nM (fig. S6).

Like the HSA-targeting DesAbs, the two designs made to target the RBD of the spike protein showed a binding affinity in the nanomolar range. We first tested the binding in solution to the full trimeric spike protein using MST (Fig. 4B; see Materials and Methods). Both RBD-targeting DesAbs showed binding to the spike protein, while the HSA-targeting DesAb-HSA-P2 used as a negative control gave no signal in the assay (Fig. 4C), confirming that the observed binding comes from the designed CDR3 motif. Fitting the binding curves with a 1:1 binding model reveals apparent Kd values of 150 and 580 nM for DesAb-RBD-C1 and DesAb-RBD-C2, respectively. As the spike protein is a trimer, a 3:1 binding model could have been, in principle, more suitable. However, while three distinct drops may be discernible in the binding curve of DesAb-RBD-C1, these are largely absent from that of DesAb-RBD-C2, and in both cases, the error bars are too large for a reliable 3:1 fit. To confirm the binding, we carried out a BLI assay with immobilized natively glycosylated RBD, which yielded Kd values of 210 and 130 nM for DesAb-RBD-C1 and DesAb-RBD-C2, respectively (Fig. 4, D and E). Conversely, these two anti-RBD antibodies showed no binding signal for immobilized HSA used as a negative control and as a blocker in the assay (fig. S4C; see Materials and Methods). We note that the lower apparent affinity of DesAb-RBD-C2 for the full spike, together with the absence of a three-step transition in its MST binding curve, is compatible with DesAb-RBD-C2 having a more sideway epitope (Fig. 4A), which may be poorly accessible in the down RBD conformation of the full spike (19). Last, both anti-RBD DesAbs were able to compete with the binding of the human ACE2 receptor to the viral RBD, which suggests that affinity-matured versions of these DesAbs may have neutralizing potential (Fig. 4F).

Applicability of the design strategy
Fig. 3. […] Curves are labeled with "X1 versus X2" to identify the anti-HSA DesAbs used. The plot shows the last three steps, and reference sensors monitoring the background dissociation of X1 during these steps were subtracted from the traces shown here. The traces P1 versus P1 and D3 versus D3 were taken as positive controls for the competition, and the small signal observed is due to the facts that not all epitopes are occupied by the first DesAb (X1) and that this is dissociating in the background. The trace Buffer versus P2 was taken as a negative control for the competition.

Having established that our computational method can yield stable single-domain antibodies that bind their intended targets with Kd values down to the nanomolar range, we asked how readily and generally applicable the design strategy is. Given the fragment-based combinatorial nature of our method, we first asked what the chances are that suitable CDR-like fragments can be designed to target a given epitope, i.e., how typical it is for an epitope to have appropriate matching fragments in the AbAg database. To address this question, we ran our design pipeline on the whole surface of all experimental target structures from the Critical Assessment of Techniques for Protein Structure Prediction competition (CASP14) (28). The target structures of the CASP assessments are selected ensuring that they represent a diverse sample of native folds, characterized by different sequences, secondary structures, and overall shape (29). Therefore, these structures also constitute a particularly suitable test set to explore the applicability of our design strategy. Having obtained all possible designed CDRs for each structure, we computed the solvent-accessible surface area (SASA) of the structure in the presence and absence of bound designed CDR fragments to reveal how much of the antigen surface is covered (see Materials and Methods). Our results reveal that most of the surface of each antigen is typically targetable with our strategy, with a median surface coverage of 78% (Fig. 5A). Furthermore, for each epitope, there are typically many candidate designed CDR loops to choose from, with a median density of 19 designed CDRs per nm² of antigen surface (Fig. 5B). Together, these results reveal that, while some epitopes that cannot be targeted with our combinatorial strategy exist, most epitopes can be targeted by choosing between multiple different designed CDR candidates.
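The two summary quantities reported above can be written out explicitly. The sketch below assumes that coverage is the fraction of the free antigen SASA that becomes buried when all designed CDR fragments are present, and that density is normalized by the total free antigen surface; the text does not spell out either definition, so both are assumptions, as are the function names. SASA values are taken as precomputed inputs in Å² (1 nm² = 100 Å²).

```python
def surface_coverage_percent(sasa_free: float, sasa_bound: float) -> float:
    """Percent of antigen surface covered by designed CDR fragments.

    sasa_free:  antigen SASA with no fragments bound (Å^2)
    sasa_bound: antigen SASA with all designed fragments present (Å^2)
    """
    if sasa_free <= 0:
        raise ValueError("sasa_free must be positive")
    return 100.0 * (sasa_free - sasa_bound) / sasa_free


def cdr_density_per_nm2(n_cdrs: int, sasa_free: float) -> float:
    """Designed CDRs per nm^2 of antigen surface (assumed: total free surface)."""
    if sasa_free <= 0:
        raise ValueError("sasa_free must be positive")
    return n_cdrs / (sasa_free / 100.0)  # convert Å^2 to nm^2


# Hypothetical antigen with 10,000 Å^2 of accessible surface.
coverage = surface_coverage_percent(10000.0, 2200.0)
density = cdr_density_per_nm2(1900, 10000.0)
```

The input numbers are invented so that the outputs match the scale of the reported medians; they do not correspond to any CASP14 target.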
Having established that most of the epitopes can be targeted with our design strategy, the most apparent bottleneck of the pipeline is the need for a structure to be used as input. As structural determination can be challenging for some antigens, this aspect could limit the applicability of the method, in particular in the cases of emerging diseases or of poorly investigated areas, where novel antibodies are often most needed. Recent advances in structure prediction are changing this scenario, as it is now possible to readily obtain rather accurate models of most protein structures of interest (30,31). However, the accuracy of many methods of computational design, and in particular of those relying on energy functions that depend on interatomic distances, is known to rapidly deteriorate with lower-quality input models (32). Therefore, we next asked how applicable our method is on computationally modeled protein structures.
To test the dependence of our design method on the quality of the input structural model, we ran our CDR design procedure on all CASP14 models generated with AlphaFold2, which was the best-performing algorithm assessed (28,30). By using all models deposited in CASP14 for each target structure, we also included in our analysis lower-quality models that were not top ranking in CASP (see Materials and Methods). Our results reveal that most of the designed CDR-like fragments obtained by using each model as input are effectively identical to those obtained using the corresponding experimentally determined structure (Fig. 6A). More specifically, the median number of designed CDRs in common between each model and its corresponding experimental structure, expressed as a percent of the total number of designed CDRs obtained for each model, is 77%, and only 20 (10%) of the 200 models analyzed have fewer than 50% of their CDRs in common with their target structures (Fig. 6A, fig. S6, and table S3). These results suggest that if one were to use an AlphaFold2 model as input for our antibody design pipeline, typically about 75% of the generated CDRs would be identical to those that would be obtained from the corresponding crystal structure, and at least 50% would be identical in 90% of the cases. Besides, we only observe a very weak correlation (R² = 0.06) between the percent of CDRs in common among model and structure and the quality of the model itself as quantified by the global distance test total score (GDT; Fig. 6B). This weak correlation indicates that the performance on modeled structures is not excessively determined by those very high-quality models (GDT ≥ 90) that are almost identical to their corresponding crystal structure.
Together, these results imply that the CDR design procedure could be expected to yield similar results when running on computer-predicted models or on experimental structures, and that these results do not strongly depend on the quality of the model used as input, at least within the quality range we explored (GDT > 40).
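The overlap statistic used in this comparison can be sketched as follows. The metric is the number of designed CDRs shared between a model and its experimental structure, expressed as a percent of the CDRs obtained for the model. How two designed CDRs are judged identical is not detailed here, so the sketch assumes each designed CDR can be reduced to a hashable key (for example, its sequence); the function name and example keys are hypothetical.

```python
from statistics import median
from typing import Hashable, Iterable


def percent_in_common(model_cdrs: Iterable[Hashable],
                      structure_cdrs: Iterable[Hashable]) -> float:
    """Percent of the model's designed CDRs also found for the structure."""
    model = set(model_cdrs)
    if not model:
        return 0.0  # assumed convention when a model yields no designs
    shared = model & set(structure_cdrs)
    return 100.0 * len(shared) / len(model)


# Hypothetical designed-CDR keys for one model and its target structure.
model_designs = ["AAAA", "CCCC", "DDDD", "EEEE"]
structure_designs = ["AAAA", "CCCC", "DDDD", "FFFF"]
overlap = percent_in_common(model_designs, structure_designs)

# A median over several models is then a one-line summary.
median_overlap = median([overlap, 50.0, 100.0])
```

Because the denominator is the number of designs from the model, the metric is asymmetric: designs found only for the experimental structure do not lower the score.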

DISCUSSION
We have described a fragment-based strategy for the rational design of antibodies targeting structured epitopes. We use protein fragments of at least four residues and typically longer to assemble designed CDRs in a combinatorial way. The idea behind this choice is that these fragments should be large enough to contain nontrivial sequence determinants of structure and interactions (6,21,33).
Our experimental results demonstrate that the design pipeline that we presented can yield highly thermostable single-domain antibodies, which bind their intended targets with Kd values down to the nanomolar range (Table 1). This affinity range was confirmed with two independent experimental techniques, one relying on equilibrium thermodynamics in solution (MST) and one on binding kinetics with a surface-immobilized ligand (BLI). We explored slightly different design strategies, using single or multiple motifs to construct designed CDRs, and two grafting strategies (Fig. 2). In one, the designed motifs are grafted in the CDR3 of a stable scaffold, and in the other, they are structurally matched into two distinct loops of a compatible framework (DesAb-HSA-D3). We did not find substantial differences in the binding affinities of DesAbs obtained through different strategies.
We further verified, through various negative control experiments, that the DesAbs do not bind antigens that they are not intended for. Given that all DesAbs in this study, except for the two-loop design DesAb-HSA-D3, share the same framework sequence (table S1), these experiments make us confident that the observed interaction is coming from the designed binding motif grafted in the CDR loop. In a recent publication, we also show that dimeric conjugates of our anti-RBD DesAb-RBD-C1 and DesAb-RBD-C2 have 10- to 60-fold improved binding affinity toward the spike protein over their monomeric counterparts (Kd, 8 to 15 nM), as one may expect from functional anti-RBD nanobodies (34). Furthermore, we observed a binding competition behavior fully compatible with the location of the target epitopes on the antigen surface (Figs. 3F and 4F).
Our failed attempts to obtain a structure of the bound complex, together with the structure of DesAb-HSA-P1 with missing electron density in the CDR3 region (fig. S3), suggest that these DesAbs differ from immune system-derived ones in their loop dynamics. This possibility is supported by recent results from molecular dynamics simulations, which compared the loop dynamics of DesAbs obtained with these grafting strategies with those of a nanobody obtained from llama immunization (35). Future work will focus on addressing this limitation, to enable the design of rigid DesAbs amenable to structural characterization, which may even be applicable as crystallization chaperones like natural nanobodies (36), and on assessing the immunogenicity of the designed antibodies.
We have been able to obtain DesAbs binding in the nanomolar range without the need of experimentally screening a large number of designs, but rather by preselecting in silico those designed CDRs that appeared most promising according to the metrics implemented, which include proxies for the predicted binding and side-chain complementarity, as well as predictions of solubility (see Materials and Methods) (15). The fragment-based combinatorial approach presented here does not require the calculation of interaction energies and is also substantially faster than approaches based on the sampling of conformational and mutational space (2). Besides, this strategy is not highly sensitive to small variations in interatomic distances in the input model as, for example, force field calculations are, which helps explain why using models of varying quality for a given antigen results in similar CDR-like fragments (Fig. 5, C and D). An intrinsic limitation of this strategy, however, is that its applicability to epitopes of interest depends on the availability of suitable CDR-like fragments in the databases used. Nonetheless, the growing number of available protein structures in public databases makes the procedure generally applicable, as for most epitopes one obtains a number of candidate CDRs to choose from (Fig. 5 and fig. S1).
Our results, which were obtained with a computer code that can run on standard laptops, demonstrate that it is becoming increasingly possible to design de novo antibodies binding to preselected epitopes of interest. We have exploited recent advances in protein-folding predictions and ab initio structural modeling to show that our design pipeline yields similar results when running on experimental structures or on computer-generated models, even when these do not reach high accuracy. We envisage that, taken together, these advances in computational biotechnology will make it possible in the future to obtain lead antibodies in a matter of days from the release of a pathogen genome, or from the identification of a novel disease-relevant target.

Data collection
All protein structures in the PDB (37) were downloaded from the rcsb.org website using a 90% sequence identity cutoff to reduce redundancy. Downloaded files were further cleaned by removing noncanonical amino acids and structures with no side-chain information. We refer to this dataset as the PDB90 database.
We further assembled a database of nonredundant CDRs, which we call the CDR database. To create this dataset, the Structural Antibody Database (SAbDab) (38) was downloaded, and the structures of all heavy and light CDR types (CDRH1,2,3 and CDRL1,2,3) according to the Chothia definition were extracted from the antibody structures and filtered for redundancy. A database of antibody-antigen structures, filtered for peptide or protein antigens only, was also obtained directly from the SAbDab website and will be referred to as the AntibodyAntigen database. Last, a database of complete structures of antibody Fv regions, comprising both variable heavy (VH) and variable light (VL) domains, as well as heavy chain-only antibodies (VHH), was retrieved from SAbDab and named the Ab database.

Generation of a database of antigen-CDR-like interactions
Each of the binding loop structures in the CDR database was used as a query to look for structurally similar motifs in the PDB90 database. To do so, each template CDR loop of length N residues was fragmented using a sliding window approach with window sizes in the range [4, N] amino acids. Then, each of the generated fragments was matched against the whole PDB90 database using the MASTER program (14) (version 1.3.1) to find CDR-like structures. This structural search is based on the Kabsch algorithm (39), which uses the RMSD of the Cα positions. An RMSD cutoff of 0.4 Å was used for fragments of length 4 and increased by 0.05 Å for each additional residue (the maximum cutoff value was set to 1.0 Å). In this way, we obtained a database of CDR-like fragments whose backbone is found in a conformation compatible with that observed in at least one known antibody CDR, but with no constraints on sequence similarity with known CDRs.
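The fragmentation and the length-dependent cutoff described above can be sketched as follows. This is a minimal illustration and not the code used in this work: the function names and the example loop sequence are hypothetical, and the structural matching itself (performed by MASTER on Cα coordinates) is not reproduced.

```python
def sliding_windows(loop, min_len=4):
    """Return every contiguous fragment of `loop` with length in [min_len, N]."""
    n = len(loop)
    return [loop[i:i + w] for w in range(min_len, n + 1) for i in range(n - w + 1)]


def cdr_search_cutoff(length, base=0.4, step=0.05, cap=1.0):
    """RMSD cutoff in Å: 0.4 at length 4, +0.05 per extra residue, capped at 1.0."""
    return min(base + step * (length - 4), cap)


# Hypothetical 8-residue loop, used only to show the enumeration.
fragments = sliding_windows("GFTFSSYA")
for frag in fragments[:3]:
    print(frag, round(cdr_search_cutoff(len(frag)), 2))
```

With these numbers the cap of 1.0 Å is reached at a fragment length of 16 residues.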
Next, we sought to establish whether these CDR-like fragments had an antigen-like partner region in their native environments (i.e., in the structures where they have been identified). Here, we define as an antigen-like region any part of a protein structure within the PDB90 database comprising one or more fragments of at least four consecutive residues that is in contact with a CDR-like fragment. Two different definitions of contacting residues were used: first, those residues in the structure whose calculated SASA increases upon removal of a CDR-like fragment; second, those fragments found within a distance of 10.5 Å between Cα atom pairs from a CDR-like fragment. Therefore, as a final product, two databases of antigen-like regions associated with the corresponding interacting CDR-like fragments were obtained, based on the two different residue-contact definitions. They will be individually referred to as the AbAgSASA and AbAgCACA databases and collectively as the AbAg database.
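The second, distance-based contact definition can be illustrated with a short sketch. The coordinates below are synthetic, the function names are hypothetical, and treating the 10.5 Å limit as inclusive is an assumption; the SASA-based definition would require a surface-area calculation that is not shown.

```python
import math


def ca_contacts(fragment_ca, other_ca, cutoff=10.5):
    """Indices of residues in other_ca whose Cα lies within cutoff Å of any fragment Cα."""
    return [j for j, q in enumerate(other_ca)
            if any(math.dist(p, q) <= cutoff for p in fragment_ca)]


def consecutive_runs(indices, min_len=4):
    """Group residue indices into runs of consecutive residues of at least min_len."""
    runs, current = [], []
    for i in sorted(indices):
        if current and i == current[-1] + 1:
            current.append(i)
        else:
            if len(current) >= min_len:
                runs.append(current)
            current = [i]
    if len(current) >= min_len:
        runs.append(current)
    return runs


# Synthetic example: a 4-residue fragment and a 10-residue strand 5 Å away.
fragment = [(3.8 * i, 0.0, 0.0) for i in range(4)]
strand = [(3.8 * i, 5.0, 0.0) for i in range(10)]
contacts = ca_contacts(fragment, strand)
regions = consecutive_runs(contacts)
print(regions)
```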

Identification of CDR-like fragments interacting with a structured epitope
Given the structure of an epitope of interest as input, the two AbAg databases can be searched to identify antigen-like regions structurally similar to those within the input epitope. In this way, the CDR-like fragments interacting with the identified antigen-like regions have the potential to also interact with the structure of the epitope used as a query, as long as these regions have a reasonable sequence similarity. To perform this search, the structure of the epitope is fragmented into smaller regions to increase the probability of identifying matching antigen-like regions in the databases. Two fragmentation modes are used. The first one uses a sliding window approach to fragment contiguous peptides; window sizes are in the range [4, N], where N is the length of the input epitope, or the length of the epitope region under fragmentation in the case of input epitopes formed by multiple noncontiguous fragments. This fragmentation approach constitutes the "linear" mode. The other fragmentation mode takes each individual residue and calculates the closest n residues based on distances between the centers of mass of their side chains (Fig. 1). This is done with various n values in the range [4, N]. This fragmentation approach constitutes the "conformational" mode, as it can readily generate regions comprising noncontiguous polypeptide segments that are close to each other in the input structure of the epitope. All the generated fragmentations of the input epitope are used as queries to interrogate the AbAg databases using the MASTER program, doing full backbone-to-backbone comparisons with the same RMSD metrics as those used for the generation of the AbAg databases.
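A minimal sketch of the conformational mode is given below. It assumes that the side-chain centers of mass have already been computed and that each fragment contains n residues including the seed residue; both the function name and the hairpin-like coordinates are hypothetical.

```python
import math


def conformational_fragments(centroids, n):
    """For each residue, collect the n residues (seed included) with the closest
    side-chain centroids; return the unique sets as sorted index tuples."""
    fragments = set()
    for seed in centroids:
        ranked = sorted(range(len(centroids)),
                        key=lambda j: math.dist(seed, centroids[j]))
        fragments.add(tuple(sorted(ranked[:n])))
    return sorted(fragments)


# Synthetic hairpin: residues 0-3 run one way, residues 4-7 run back alongside.
hairpin = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0),
           (12, 4, 0), (8, 4, 0), (4, 4, 0), (0, 4, 0)]
frags = conformational_fragments(hairpin, 4)
print(frags)
```

In this toy example, residue 0 yields the fragment (0, 1, 6, 7), which joins two segments that are distant in sequence but adjacent in space, the situation the linear mode cannot capture.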
To speed up the structural search, when using the linear fragmentation mode, the sequences of the generated epitope fragments are used as queries for a much faster blastp search (40) against the sequences of all antigen-like fragments within the AbAg databases (blast command: blastp -query input_fragment_sequence.fasta -db AbAg_databases.fasta -qcov_hsp_perc 100.0 -matrix BLOSUM62 -task 'blastp-short' -word_size 2 -seg 'no' -evalue 20000 -ungapped -comp_based_stats F -max_target_seqs 60000 -outfmt 6 -out blast_hits.txt). This strategy is used to restrict the search space of the MASTER program within the AbAg databases to only those antigen-like regions with a sequence identity meeting a user-selected threshold. When using the conformational fragmentation mode, sequence identity is checked during the AbAg structural search whenever a match is found. In both modes, whenever a matching antigen-like region meets both sequence identity and structure similarity criteria, the corresponding interacting CDR-like fragments are retrieved. While the sequence identity threshold is specified by the user, the RMSD threshold (in Å) is given by the function RMSDcutoff = 0.4 + n*0.033, where n represents the number of residues in the epitope fragment used as query. The retrieved CDR-like structures are then rotated to match the orientation of the input epitope by superimposing the matching antigen-like region together with its interacting CDR-like fragment(s) onto the input epitope. As the matching region is typically smaller than the full input epitope, steric clashes may occur between the identified CDR-like fragments and the rest of the epitope or of the antigen, in which case the CDR-like fragments are discarded. Otherwise, these are labeled as CDR-like candidates (fig. S1).
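For readers who wish to script this prefiltering step, the sketch below assembles the blastp call as an argument list and evaluates the RMSD threshold function. The option values are those quoted above; the helper names are hypothetical, the file names are placeholders, and the command is only printed, not executed.

```python
import shlex


def epitope_rmsd_cutoff(n):
    """RMSD threshold in Å for an epitope fragment of n residues (0.4 + n*0.033)."""
    return 0.4 + 0.033 * n


def blastp_short_command(query_fasta, db, out_path):
    """Build the blastp argument list with the options quoted in the text."""
    return ["blastp", "-query", query_fasta, "-db", db,
            "-qcov_hsp_perc", "100.0", "-matrix", "BLOSUM62",
            "-task", "blastp-short", "-word_size", "2", "-seg", "no",
            "-evalue", "20000", "-ungapped", "-comp_based_stats", "F",
            "-max_target_seqs", "60000", "-outfmt", "6", "-out", out_path]


cmd = blastp_short_command("input_fragment_sequence.fasta",
                           "AbAg_databases.fasta", "blast_hits.txt")
print(shlex.join(cmd))
print(round(epitope_rmsd_cutoff(4), 3), round(epitope_rmsd_cutoff(13), 3))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute the search, provided BLAST+ is installed and the database has been formatted beforehand.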

Optimization of the identified antigen-CDR-like interactions and ranking of the hits
Each of the CDR-like candidates has a set of native interactions, which are defined as those interactions observed in the corresponding antigen-like region within the PDB90 database according to the SASA criterion of interaction described above. However, these interactions might not be fully conserved when the CDR-like candidate is paired with its corresponding epitope fragment, because of differences in amino acid sequence and side-chain orientation between the epitope fragment and the matching antigen-like region. If this is the case, the probability that the CDR-like candidate interacts with the epitope of interest might decrease. To address this issue, for each CDR-like candidate, we run an optimization procedure on those residues whose interactions with the input epitope differ from the corresponding native ones. For each of these CDR-like residues, the optimization starts by defining a local interacting structural motif. This motif comprises all epitope residues that are found interacting with the CDR-like residue under scrutiny according to the SASA criterion of interaction described above. Next, this local structural motif, which also includes the backbone atoms of the CDR-like residue itself, is used as a query to look for similar regions in the PDB90 database (RMSDcutoff = 0.6 + n*0.025, where n is the number of residues). The aim is to find a matching region where the hit amino acid corresponding to the CDR-like residue under optimization has a backbone orientation very similar to that of the query and is therefore structurally compatible with the CDR-like candidate.
Then, if the matching residues corresponding to the epitope residues have a sequence identity with the epitope higher than the current value, the side chain of the CDR-like residue is replaced with that of the new hit, always avoiding hits that cause steric clashes, as well as proline and cysteine residues, which may respectively alter the CDR backbone conformation or later cause covalent dimerization of designed antibody candidates. This procedure is applied to all CDR-like candidates that need it, to maximize the number of native interactions. As multiple residue positions within each CDR-like candidate may be optimized, and as each of them may have multiple optimization options, all possible combinations are generated. For example, a candidate with three optimization options at position 1 and two options at position 4 will yield a total of 12 CDR candidates.
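The combinatorial enumeration can be sketched with a Cartesian product. One reading consistent with the count of 12 in the example above is that the original residue is retained as an additional choice at each optimized position, giving (3 + 1) × (2 + 1) = 12 variants; this reading is an assumption, as are the function name and the example sequence and substitutions.

```python
from itertools import product


def enumerate_variants(sequence, options):
    """options maps a 0-based position to its alternative residues.
    The original residue is kept as an extra choice (assumption); proline and
    cysteine substitutions are skipped, as described in the text."""
    positions = sorted(options)
    choices = [[sequence[p]] + [aa for aa in options[p] if aa not in "PC"]
               for p in positions]
    variants = []
    for combo in product(*choices):
        residues = list(sequence)
        for p, aa in zip(positions, combo):
            residues[p] = aa
        variants.append("".join(residues))
    return variants


# Hypothetical candidate: three options at position 1 and two at position 4
# (1-based in the text, hence indices 0 and 3 here).
variants = enumerate_variants("AYSGW", {0: ["D", "E", "K"], 3: ["N", "T"]})
print(len(variants), variants[:4])
```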
All candidates are ranked according to their solubility, as computed by the CamSol method (15). Furthermore, we also compute for each CDR-like candidate the number of native interactions, the number of shared interactions, and the number of interactions that are not shared, before and after optimization. Shared interactions are defined as interactions present in the original CDR-like/antigen pair found in the PDB90 database (native interactions) that are also preserved in the optimized CDR-like candidate bound to the epitope of interest. On the basis of these metrics, candidates with a high number of shared interactions, a low number of nonshared ones, and better solubility scores are regarded as the best ones. These scores and the resulting rankings can be used to shorten the list of candidates and aid the selection of the most promising binding CDRs.

Fragment assembly and CDR grafting
After optimization, the shortlisted CDR-like candidates are grafted into either full-length CDRs or directly into full Fv antibody regions. At this stage, CDR-like candidates can also be combined with each other to obtain longer CDR candidates (Fig. 1). To do so, the first candidate of the shortlist is matched against the Ab or CDR database using MASTER, and the best match with no steric clashes between the epitope and the selected full CDR or complete Fv region is saved. Then, to combine multiple CDR-like fragments in the same design, the same fragment is paired with all other fragments in the shortlist, and the pairs are matched against the Ab or CDR databases to see whether both fragments could fit together in different parts of the same CDR loop, or in different CDR loops of the same Fv region. If any of the pairs of candidates is successfully matched, the result is taken to build triplets, and the matching process is repeated until no further match can be identified. After that, the process is repeated with the second candidate in the list, avoiding the already tested combinations. The iteration continues until all candidates and combinations are tested. Structural matching is done using Cα atom comparisons with RMSDcutoff = 0.4 + n*0.05, where n represents the total number of residues in the query. This opens the opportunity of generating CDR loops comprising multiple CDR-like fragments, as well as antibodies with multiple candidates in different CDRs (fig. S1).
This joining and grafting procedure may introduce new interactions between the CDRs in which the candidates were grafted and the epitope, and possibly also between the epitope and other parts of the Fv region. If that is the case, each new set of interacting residues on the antibody side is subjected to the optimization procedure described above to increase the chances of successful binding. Last, the structure of the grafted candidates (either in CDRs or full Fv region) is produced as a final output.

Generation of antibodies targeting HSA, SARS-CoV-2 spike protein, and trypsin catalytic site
The described algorithm was applied to the entire surface of HSA (PDB ID 1AO6, chain B), to the antigen binding region of the RBD of the SARS-CoV-2 spike protein (PDB ID 6VSB), and to a small region comprising the catalytic site of the trypsin protease (PDB ID 1S0Q). Both linear and conformational epitope fragmentation modes were used, with 70 and 60% sequence identity thresholds used during the CDR-like candidate search, respectively. The search was constrained to fragments of length 4 to 13 amino acids. Both AbAg databases were used. The list of CDR candidates was shortened by selecting those whose number of shared interactions was greater than the number of nonshared interactions and corresponded to at least two-thirds of the number of native interactions. For HSA, all shortlisted CDRs were then matched to full nanobody structures to find an amenable scaffold, and the top hit (based on the metrics describing the interactions, the solubility scores, and the quality of the grafting) was selected for experimental validation (DesAbHSAD3, consisting of two CDR fragments matched to the CDR1 and CDR3 of the VHH scaffold PDB 4DKA). In addition, the two top hits from the shortlisted CDR candidates (DesAbHSAP1 and DesAbHSAP2) were taken to be grafted directly into the CDR3 of a stable VHH scaffold (21). The latter strategy was also used for the spike RBD designs (DesAbRBDC1 and DesAbRBDC2) and the trypsin active site design (DesAbTryp).
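The shortlisting rule stated above translates directly into a filter. In the sketch below the class, the field names, and all numerical values are hypothetical, and ordering the retained candidates by solubility score alone is a simplifying assumption, since the final selection also took the grafting quality into account.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    native: int        # interactions in the original PDB90 pair
    shared: int        # native interactions preserved with the epitope
    nonshared: int     # interactions that are not shared
    solubility: float  # solubility score, higher is better


def shortlist(candidates):
    """Keep candidates with shared > nonshared and shared >= 2/3 of native,
    then order them by decreasing solubility score (assumption)."""
    kept = [c for c in candidates
            if c.shared > c.nonshared and 3 * c.shared >= 2 * c.native]
    return sorted(kept, key=lambda c: c.solubility, reverse=True)


pool = [Candidate("c1", 9, 7, 2, 0.8), Candidate("c2", 9, 5, 3, 1.2),
        Candidate("c3", 6, 4, 4, 1.5), Candidate("c4", 6, 5, 1, 1.1)]
selected = [c.name for c in shortlist(pool)]
print(selected)
```

The two-thirds condition is written with integers (3 × shared ≥ 2 × native) to avoid floating-point comparisons at the boundary.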

Analysis of the AlphaFold2 models and corresponding experimental structures within the CASP14 competition
Experimentally determined structures (targets) were downloaded from the CASP14 website, from https://predictioncenter.org/download_area/CASP14/targets/ (file casp14.targets.Tdom.public_11.29.2020.tar.gz, downloaded in January 2021). AlphaFold2 models were downloaded from the same website using the Table Browser feature at https://predictioncenter.org/casp14/results.cgi?view=tbsel, by selecting "427 AlphaFold2," "all models," and "all targets." Selecting "all models" instead of the default "model 1" is important, as it makes it possible to assess multiple models for each experimental target, including those that were not top ranking and hence have lower quality. This table also contained all the model quality metrics as calculated by the authors of the CASP14 competition, such as the RMSD and the GDT (table S3) (29,41).
Given that the experimental structures of some of the targets had not yet been released publicly at the time of analysis, coordinates were available for 31 of the 63 targets expected from the table. As a consequence, we restricted our analysis to those 200 AlphaFold2 models that mapped on a target with available coordinates. These corresponded to five models per target, with the exception of three targets (T1024, T1030, and T1038) that had a total of 15 models each, as for these targets two domains had been independently modeled (five models per domain) and five additional complete models with both domains modeled together were generated, and of one target (T1050) that had a total of 20 models, as three domains had been independently modeled for it (again five models per domain plus five complete models). Furthermore, in a number of cases, there were amino acid residues present in the AlphaFold2 models but not in the corresponding experimental structure (e.g., regions of missing electron density), or vice versa (residues not present in the model but present in the experimental structures). In these cases, we removed the extra residues before running the design calculations so that these ran on models and corresponding structures containing exactly the same residues (table S3, column "Processed"). This was a necessary precaution, as the presence or absence of stretches of residues can generate different designed CDRs when running the design calculations. All PDB files from structures and models were cleaned using the PDBcleaner tool available on our web server (https://www-cohsoftware.ch.cam.ac.uk/) to remove HETATM records and to rebuild any missing atoms.
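Two points in this paragraph lend themselves to a quick check: the model count (27 targets with 5 models, 3 with 15, and 1 with 20) and the trimming of each model-structure pair to a common residue set. The sketch below is illustrative only; it assumes that residues can be matched by chain identifier and residue number, and the function name and toy data are hypothetical.

```python
def trim_to_common(model, structure):
    """Keep only residues, keyed by (chain, residue number), present in both inputs."""
    common = sorted(model.keys() & structure.keys())
    return ({k: model[k] for k in common},
            {k: structure[k] for k in common})


# 31 targets in total: 27 with 5 models, 3 with 15 models, 1 with 20 models.
n_models = 27 * 5 + 3 * 15 + 1 * 20
print(n_models)

# Toy example: the model covers residues 1-6, while the structure lacks
# residues 1 and 4 (e.g., missing density) and has an extra residue 7.
model = {("A", i): "ALA" for i in range(1, 7)}
structure = {("A", i): "ALA" for i in (2, 3, 5, 6, 7)}
trimmed_model, trimmed_structure = trim_to_common(model, structure)
print(list(trimmed_model))
```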
It is worth noting that the authors of the CASP competition select their targets while also ensuring that they represent a diverse sample of native folds characterized by different secondary structure contents and overall shape, thus making these structures a particularly suitable test set to explore the generality of our antibody design strategy.
To obtain the results presented in the main text (Fig. 5), we ran our algorithm on the selected models and their corresponding experimental structures using a 70% sequence identity cutoff, and the requested minimum length of the CDR-like fragments was set to four residues. We then calculated the SASA for the entire input structure as well as for the structure in complex with all the identified CDR-like candidates. Subtraction of these values indicates the "surface coverage" per input structure (table S3).
In addition to the results presented in the main text, it is worth noting that the observation that data points corresponding to different models of the same structure tend to cluster together in Fig. 5B suggests that the nature of the antigen may play a bigger role in determining the robustness of the design procedure than the quality of the model itself. For example, the five lowest-ranking models, three of which are outliers in the distribution with less than 20% of CDRs in common with their structure, are all for the same target (T1064 in table S3, PDB ID 7JTL, Fig. 5, A and B). This is a viral protein with a long disordered loop on one side and several missing residues, which are the main culprits for the large number of CDRs that differ between models and target. Furthermore, it is worth noting that (fig. S6): (i) the overall number of designed CDRs is typically very similar between models and target (Pearson's R = 0.96), (ii) the fraction of designed CDRs that are obtained for the experimental structure and not for its models is typically small (median, 17%), and (iii) the total number of designed CDRs for a model appears to correlate with the overall fraction of CDRs that would also be obtained from the experimental structure (R = 0.51).

Protein production and characterization
Genes encoding the anti-HSA single-domain antibody candidates (plus a C-terminal 7× His-tag) were synthesized and cloned into an isopropyl-β-D-thiogalactopyranoside (IPTG)-inducible vector (by Atum in vector PD444) including a leading OmpA sequence to enable translocation to the periplasm and ultimately facilitate intradomain disulfide bond formation and the secretion of the product to the media. The anti-spike RBD and anti-trypsin designs were introduced via restriction-free cloning into the CDR3 of the DesAbHSAP2 plasmid. For all DesAbs, versions with a free C-terminal cysteine residue were created using site-directed mutagenesis. This cysteine was inserted as part of an Asp-Cys-Glu motif right before the start of the C-terminal His-tag. For the anti-spike RBD designs, versions with a clamp sequence (FCPF) (42) followed by a TEV cleavage site right before the C-terminal 7× His-tag were created by restriction-free cloning.
Plasmids were transformed into E. coli Shuffle LysY strain to further facilitate the formation of the disulfide bond of the antibody. Cultures (0.5 liter) of LB media were inoculated at an initial OD600 (optical density at 600 nm) of 0.03, grown at 37°C until reaching an OD600 of 0.8, and then induced with 500 µM IPTG. Overnight expression was carried out at 30°C. The cell pellet was discarded after centrifugation, and the supernatant was filtered using a 0.45-µm filter to remove remaining cell debris. The supernatant was passed twice through a gravitational flow column packed with Ni Sepharose Excel IMAC resin (Cytiva, 17371201) previously equilibrated in phosphate-buffered saline (PBS) (pH 7.4). Then, the column was washed with PBS (pH 7.4) and with a gradient of imidazole in PBS (10 and 30 mM). The protein was then eluted at 200 mM imidazole. Fractions were analyzed using SDS-polyacrylamide gel electrophoresis (SDS-PAGE), and those with the most protein and highest purity were dialyzed against PBS to remove the imidazole. Purified proteins were diluted to 20 µM, aliquoted, flash-frozen in liquid nitrogen, and stored at −80°C. The positive control nanobody Nb.B201 was expressed in the same way but using the temperature and timing described in the original work (23), as well as the expression plasmid deposited in Addgene (pET26b_Nb.b201 Plasmid #131404).

Antigens
HSA was purchased from Sigma-Aldrich (A3782) as lyophilized powder, resuspended in PBS, and further purified via gel filtration using a Superdex 200 size-exclusion chromatography column before use in binding assays. Pancreatic bovine trypsin was purchased from Sigma-Aldrich (T1426). Recombinantly produced [from human embryonic kidney (HEK) 293 cells] SARS-CoV-2 Spike Glycoprotein (S1) His-tagged RBD was purchased from The Native Antigen Company (REC31849) and supplied at high purity on dry ice. SDS-PAGE analysis showed purity of >95% and a molecular weight consistent with the fully glycosylated RBD (data from The Native Antigen Company). Human ACE2 (18-615) recombinant protein, used in the competition assay in Fig. 4F, was also purchased from The Native Antigen Company, where it was expressed in HEK293 cells with a sheep Fc-tag (REC31876). Trimeric His-tagged SARS-CoV-2 Spike Glycoprotein was purchased from The Native Antigen Company (REC31871100). Protein and antibody concentrations were determined by absorbance measurements at 280 nm using theoretical extinction coefficients calculated with the Expasy ProtParam web server.

Protein-thermal shift stability measurements
The melting temperature of the DesAbs was measured with a protein-thermal shift assay on a Bio-Rad CFX96 Touch quantitative polymerase chain reaction (PCR) machine using the ROX filter in white PCR plates. Samples were heated at 0.2°C/min from 25° to 95°C and consisted of purified antibody in PBS at a final concentration of 8 µM and of SYPRO Orange dye in Protein Thermal Shift Buffer (Thermo Fisher Scientific, 4461146) accounting for 25% of the final volume at a final concentration of 2× the recommended dilution. Sample volumes of 50 µl per well were used. The signal from the dye in the absence of proteins was subtracted from the sample signal before analysis. The melting temperature (T_m) was determined as the point of steepest derivative, and the values reported in fig. S2 are averages and SDs over four replicates. In Table 1, we chose to round the T_m values to the closest 0.5°C because, while SDs across wells in the same plate are typically very small, interexperiment variations tend to be slightly larger.
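The determination of the melting temperature as the point of steepest derivative, and the rounding to the closest 0.5°C, can be illustrated as follows. The melting curve is a synthetic sigmoid with a midpoint chosen arbitrarily at 60°C, and the use of central finite differences without smoothing is an assumption; real data would typically need smoothing first.

```python
import math


def melting_temperature(temps, signal):
    """Temperature at which the central-difference derivative dF/dT is largest."""
    slopes = [(signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
              for i in range(1, len(temps) - 1)]
    best = max(range(len(slopes)), key=slopes.__getitem__)
    return temps[best + 1]  # slopes[k] belongs to temps[k + 1]


def round_to_half(value):
    """Round to the closest 0.5 °C."""
    return round(value * 2) / 2


# Synthetic, background-subtracted unfolding curve sampled every 0.5 °C.
temps = [25 + 0.5 * i for i in range(141)]  # 25 to 95 °C
signal = [1 / (1 + math.exp(-(t - 60.0) / 2.0)) for t in temps]
tm = melting_temperature(temps, signal)
print(tm, round_to_half(61.3))
```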

Circular dichroism
Far-ultraviolet (UV) CD spectra of the DesAbs were recorded using a Chirascan Applied Photophysics spectropolarimeter equipped with a Peltier holder, using a 0.1-cm path length quartz cuvette. Samples contained 6 µM protein in PBS. The far-UV CD spectra of all DesAbs were recorded from 200 to 250 nm at 25°C, and the spectrum of the buffer was systematically subtracted from the spectra of all DesAbs to yield the plots in fig. S2.

Maleimide labeling
To obtain conjugates of the designed antibodies to Alexa Fluor 647 dye, the C-terminal cysteine variants of the designed antibodies were incubated with 1 mM dithiothreitol (DTT) for 10 min to reduce any inter-DesAb disulfide bonds, yielding covalent C terminus-C terminus dimers, that may have formed during storage. DTT was then removed using Zeba desalting columns (Thermo Fisher Scientific, 89882), and samples were concentrated to 100 µM before incubation with Alexa Fluor 647-maleimide reagent (Thermo Fisher Scientific, A20347) for 1 hour at room temperature. Free dye was removed using PD-10 desalting columns (Cytiva, 17085101), and the labeling efficiency was assessed by absorbance measurements. Trimeric SARS-CoV-2 spike protein was fluorescently labeled by incubating a 2.8 µM protein solution with 50 molar equivalents of Alexa Fluor 647-N-hydroxysuccinimide ester reagent (Thermo Fisher Scientific, A20006) in the dark for 2 hours at room temperature. Excess dye was removed by desalting three times with Zeba columns (Thermo Fisher Scientific, 89882), and labeling efficiency was determined by absorbance (estimated to be a 13:1 dye-to-protein labeling ratio).

MST binding affinity measurements
For the anti-HSA designs, starting from 30 µM HSA (150 µM for the KK5 control DesAb in fig. S4), 16 samples of 1:1 serial dilutions were incubated with 70 nM Alexa Fluor 647-labeled antibody for 1 hour at room temperature. Samples were prepared in 170 mM NaCl, 50 mM tris-HCl, 10 mM MgCl2 (pH 7.4) with 0.05% Tween 20. After incubation, samples were run in triplicate in the Monolith NT.115 System (NanoTemper Technologies) using 20% light-emitting diode (LED) excitation power and 60% MST power, at 25°C. For the anti-RBD designs, DesAbRBDC1 and DesAbRBDC2 (variants with the C-terminal clamp and TEV cleavage site) at 14.4 µM and the anti-HSA control DesAbHSAP2 at 4 µM were used as starting concentrations for preparing 16 1:1 serial dilutions in PBS (pH 7.4) with 0.05% Tween 20. They were incubated with a final concentration of 8 nM Alexa Fluor 647-labeled trimeric SARS-CoV-2 spike protein at room temperature for 1 hour. After incubation, samples were run in triplicate in the Monolith NT.115 System (NanoTemper Technologies) using 15% LED excitation power and 80% MST power, at 25°C. All data were analyzed and fitted using the Monolith System software assuming a 1:1 binding interaction.
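The titration series and the shape of a 1:1 binding curve can be sketched as below. The sketch assumes that the first of the 16 samples is at the stated starting concentration, uses the standard quadratic expression for 1:1 binding with ligand depletion (whether this is the exact expression implemented in the instrument software is an assumption), and takes a purely hypothetical K_d of 0.1 µM for illustration.

```python
import math


def serial_dilution(start, n=16, factor=2.0):
    """Concentrations of n samples obtained by repeated 1:1 dilution."""
    return [start / factor ** i for i in range(n)]


def fraction_bound(titrant, labeled, kd):
    """Fraction of the labeled molecule bound in a 1:1 model with ligand
    depletion; all three arguments must share the same concentration unit."""
    s = titrant + labeled + kd
    return (s - math.sqrt(s * s - 4.0 * titrant * labeled)) / (2.0 * labeled)


hsa_um = serial_dilution(30.0)  # µM, anti-HSA titration
curve = [fraction_bound(c, 0.07, 0.1) for c in hsa_um]  # 70 nM label, hypothetical Kd
print(round(hsa_um[-1] * 1000, 3), "nM lowest concentration")
print(round(curve[0], 3), round(curve[-1], 4))
```

Under these assumptions the series spans 30 µM down to about 0.9 nM, more than four orders of magnitude.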

BLI binding affinity measurements
BLI measurements were performed using an Octet-BLI K2 system (ForteBio). All assays were carried out in a black 96-well plate, 200 µl per well, and all sensors were subjected to prehydration in the assay buffer for at least 15 min before usage. The assay plate was kept at 25°C throughout the entire experiment. For consistency with the MST measurements, anti-HSA design binding assays were carried out in a buffer containing 170 mM NaCl, 50 mM tris-HCl, and 10 mM MgCl2 (pH 7.4). First, two aminopropylsilane (APS) sensors (sample and reference) were preincubated in buffer for 15 min. The assay program consisted of a 150-s baseline in buffer; 300-s loading using 4 µM HSA; 300-s wash in buffer; 90-s baseline in buffer; 300-s association in 1 µM, 500 nM, and 250 nM anti-HSA DesAbs for the sample sensor and buffer for the reference sensor; and 300-s dissociation in buffer (Fig. 2D). As a control for nonspecific binding to the sensors, the same experiment was carried out with DesAbTryp instead of the anti-HSA DesAbs (Fig. 2D). The positive control Nb.B201 experiment was carried out in the same way but using 800 and 400 nM as analyte concentrations (fig. S4B). The binding competition experiment of the anti-HSA designs was carried out in a similar way, in a buffer consisting of one-third PBS (pH 7.4) and two-thirds of the aforementioned 50 mM tris-HCl, 170 mM NaCl, and 10 mM MgCl2 (pH 7.4). APS sensors were loaded with HSA for 600 s, left in baseline for 300 s, then dipped in wells containing 5 µM of a first DesAb X1 for 600 s, moved into buffer wells for 60 s, then into wells containing 5 µM of a second DesAb X2 for 300 s, and finally back to buffer wells for 600 s to monitor dissociation. DesAbs X1 and X2 refer to different combinations of the anti-HSA DesAbs as in the legend of Fig. 2F.
Because trypsin could not be loaded effectively on APS sensors, the trypsin binding assay was carried out with Ni-nitrilotriacetic acid sensors using the same buffer composition as above. Sensors were loaded with 7.5 µM His-tagged DesAbTryp or control DesAb (either DesAbHSAP1 or DesAbHSAP2 as in fig. S5) for 900 s. We found that loading these sensors to saturation was the only viable way to fully suppress the nonspecific binding of trypsin to the nickel sensors; hence, we systematically used control DesAbs for all trypsin concentrations tested. These controls are identical to DesAbTryp except for the designed CDR3 (Table 1). Following loading, a baseline was taken for 180 s, followed by association and dissociation steps as in fig. S5. Assays for DesAbRBD designs were carried out in PBS with APS sensors, following the program: 120-s baseline in buffer, 90-s loading using 400 nM RBD, 300-s blocking with 4 µM HSA, 120-s baseline, 300-s association, and 300-s dissociation.
Data with multiple DesAb concentrations were fitted globally with in-house Python scripts, using K_on and K_off as global fitting parameters and R_max as a local parameter (i.e., each DesAb concentration was allowed its own value of R_max, as these are probed by different BLI sensors). The K_d was then derived as the ratio K_off/K_on. Because of the shape of its dissociation curve, the positive control Nb.B201 experiment was fitted with a model that does not assume full dissociation at infinite time (fig. S4B).
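The structure of such a global fit can be sketched with the standard 1:1 kinetic model. The code below is not the in-house script: it only defines the association and dissociation equations and a sum-of-squares objective in which the two rate constants (written k_on and k_off in the code) are shared across concentrations while each concentration has its own R_max. The rate constants, R_max values, and synthetic traces are hypothetical, and the minimization itself (e.g., with a least-squares optimizer) is omitted.

```python
import math


def association(t, conc, k_on, k_off, r_max):
    """1:1 association phase: response after t seconds at analyte concentration conc (M)."""
    k_obs = k_on * conc + k_off
    r_eq = r_max * k_on * conc / k_obs
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, k_off):
    """1:1 dissociation phase starting from response r0, assuming full dissociation."""
    return r0 * math.exp(-k_off * t)


def global_ssr(k_on, k_off, r_max_by_conc, traces):
    """Sum of squared residuals with shared k_on and k_off and one R_max per
    concentration; traces maps conc -> list of (t, response) association points."""
    return sum((r - association(t, c, k_on, k_off, r_max_by_conc[c])) ** 2
               for c, points in traces.items() for t, r in points)


# Hypothetical parameters: k_on in 1/(M s), k_off in 1/s.
true_on, true_off = 1e5, 1e-2
r_max = {1e-6: 1.0, 5e-7: 0.9, 2.5e-7: 1.1}  # one R_max per concentration
traces = {c: [(t, association(t, c, true_on, true_off, r_max[c]))
              for t in range(0, 301, 10)] for c in r_max}
kd = true_off / true_on
print(kd, global_ssr(true_on, true_off, r_max, traces))
```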

Crystallization, data collection, data reduction, structure determination, refinement, and final model analysis
DesAbHSAP1 was concentrated before the setup of crystallization trials to a final concentration of 10 mg/ml. Crystals of DesAbHSAP1 were obtained with the vapor diffusion technique, in sitting drops, using equal volumes of the protein and of a precipitant solution containing 0.1 M sodium cacodylate (pH 6.5) and 27% polyethylene glycol 2000 monomethyl ether (PEGMME). Diffraction data were collected on cryoprotected crystals (25% glycerol) at 100 K, at the I04 beamline of the Diamond Light Source using a 0.9795-Å wavelength. The collected dataset was processed with DIALS (43) and Aimless (44) from the CCP4 suite (45). The structure was solved by molecular replacement with Phaser (46) using PDB ID 3B9V as a search model. The correct amino acids of the DesAbHSAP1 construct were built manually using COOT (47,48). The initial model was refined by alternating cycles of automatic refinement using Phenix (version 1.17_3644) (49) and manual model building in COOT. Data collection and refinement statistics are reported in table S2. Analysis of molecular interfaces was performed using PISA (50).