Humanized V(D)J-rearranging and TdT-expressing mouse vaccine models with physiological HIV-1 broadly neutralizing antibody precursors

Significance Mouse models that express human precursors of HIV-1 broadly neutralizing antibodies (bnAbs) are useful for evaluating vaccination strategies for eliciting such bnAbs in humans. Prior models were handicapped by nonphysiological frequency and/or diversity of B lymphocytes that express the bnAb precursors. We describe a new class of mouse models in which the mice express humanized bnAb precursors at a more physiologically relevant level through developmental rearrangement of both antibody heavy- and light-chain gene segments that encode the precursors. The model also incorporated a human enzyme that diversifies the rearranging gene segments and promotes the generation of certain variable region sequences needed for the response. This new class of mouse models should facilitate the preclinical evaluation of candidate HIV-1 vaccination strategies.

expression diversifies antigen receptor variable region repertoires generated in mouse and human developing B and T cells that develop postnatally, with the notable exception of LC variable region repertoires in mice (10,22,23). Thus, while TdT is expressed during LC V(D)J recombination in postnatal human pre-B cells (24), it is not expressed in postnatal mouse pre-B cells (25,26), leading to decreased junctional diversity and much more abundant MH-mediated joins in primary mouse LC repertoires compared to those of humans (22,23). Lack of TdT expression in fetal repertoires is also known to promote recurrent MH-mediated V(D)J junctions that are not dominant in postnatal repertoires due to TdT expression. Some such recurrent MH-mediated V(D)J joins in fetal T or B cell repertoires generate TCRs or BCRs critical for certain physiological responses (13,14,27,28). However, the potential role of TdT and N regions in promoting specific responses has remained largely unaddressed.
To elicit VRC01-class broadly neutralizing antibodies (bnAbs), sequential vaccine immunization approaches propose a priming immunogen to drive precursors into GCs followed by boost immunogens designed to lead them through rounds of SHM/affinity maturation. Based on a structurally designed eOD-GT8 immunogen that binds to the inferred VRC01 unmutated common ancestor (UCA) BCR, potential human VRC01-like precursor B cell frequency was estimated to be one in 400,000 or fewer (43,44). To test the priming and sequential immunogens that could elicit VRC01-class bnAbs in humans, mouse models are needed that reflect as closely as possible the biology of human B cell responses. Early models expressed knock-in V H 1-2 HCs and, in some, VRC01-class LC Vs, both with mature CDR3s (45–47). These models were nonphysiologic as their BCR repertoire was dominated by a single human HC/LC combination or a single human HC with diverse mouse LCs. Mice with fully human HC and LC gene segment loci assembled by V(D)J recombination were also tested, but precursor frequencies were 150- to 900-fold lower than that of humans (48), likely due to inability to express immense human-like CDR3 repertoires in mice with orders of magnitude fewer B cells. A V H 1-2-rearranging mouse model generated diverse V H 1-2 HC CDR3s, but it employed a germline-reverted VRC01 precursor LC with a 5-aa CDR3 from mature VRC01 bnAb (49). While useful for HC maturation studies during sequential immunization, this model was limited by overabundance of VRC01 lineage LC precursors. More recently, B cells from transgenic VRC01-class UCA or eOD-GT8-binding precursor knock-in mice were adoptively transferred into congenic recipient mice at human-like frequencies (50–53). While this elegant approach has been very useful, it still has certain limitations as it focused only on eOD-GT8 priming and tested just a small subset of potential VRC01 lineage precursors (50–53).

Generation of Mice with VRC01-Class-Rearranging Human HC and LC Vs. To address issues of prior models, we developed complete VRC01 mouse models in which individual B cells express one of a multitude of different VRC01 precursors at human-like frequencies, based on enforced rearrangement of both V H 1-2 and VRC01-class Vκs (Fig. 1A). All complete VRC01-class models employ our previously described V H 1-2-rearranging HC allele in which the most D-proximal functional mouse V H (V H 81X) was replaced with human V H 1-2 (49,54). The CTCF-binding site (CBE)-based IGCR1 element in the V H to D interval is also inactivated on this allele, which leads to dominant rearrangement of human V H 1-2 in an otherwise intact upstream mouse V H locus (55). On this allele, high-level V H 1-2 utilization in the absence of IGCR1 is mediated by its closely associated downstream CBE element (56). Our new models also use a version of this rearranging HC allele in which the mouse J H segments were replaced with human J H 2, which can contribute a tryptophan residue (Trp100B) conserved in the HC CDR3 of VRC01-class bnAbs (54). We have retained mouse Ds in the model for reasons we have previously described (57). Briefly, mouse Ds are highly related to certain human Ds and the contribution of D sequences to CDR3s is often obscured by V(D)J recombination-associated junctional diversification mechanisms including nucleotide deletions and N region additions (58). When the replacement allele in our new VRC01-class models is bred to homozygosity, V H 1-2 rearrangements represent nearly 73.8% of primary V(D)J rearrangements (SI Appendix, Fig. S1 A, Upper). Due to the counter selection of lower frequency upstream mouse V H rearrangements, V H 1-2 contribution to primary B cell BCR repertoires is reduced to 43% (SI Appendix, Fig. S1 A, Bottom), with immense CDR3 diversity (SI Appendix, Fig. S1B).
Such CDR3 diversity is critical, as V H 1-2-encoded HC CDR3s were implicated in Env recognition by precursor VRC01-class BCRs and also implicated in the maturation of VRC01-class bnAbs (59,60).
To generate human Vκ-rearranging LC alleles, we used a strategy similar to that which we used for V H 1-2, as recently described (57). The CBE-based Cer/Sis element in the Vκ to Jκ interval has been implicated in promoting distal versus proximal Vκ rearrangements (61). To test Cer/Sis functions in more detail, we deleted this element from the wild-type mouse allele and assessed its impact on Vκ rearrangement via our high-throughput HTGTS-Rep-seq method (62) (SI Appendix, Fig. S2A). Homozygous Cer/Sis deletion substantially increased (up to eightfold) the frequency of 7 of the 11 most Jκ-proximal Vκs (SI Appendix, Fig. S2B). Indeed, these 7 Vκs contributed to the vast majority of the primary BCR repertoire of these mice (SI Appendix, Fig. S2C), as upstream Vκ rearrangements were essentially abrogated in the absence of Cer/Sis. We note that Vκ3-2 and Vκ3-7 showed the greatest increase in utilization in the absence of Cer/Sis. Our initial plan for our VRC01-rearranging mouse models, analogous to our V H 1-2-rearranging Igh allele (49), was to increase the utilization of human Vκs in the model by introducing them into proximal positions on Cer/Sis-deleted Igκ alleles.
We replaced the Vκ3-2 leader-intron-V sequence with the corresponding sequence of human Vκ1-33 on a wild-type Igκ allele ("Vκ1-33-rearranging" allele) and then also deleted Cer/Sis on that allele ("Vκ1-33 CS∆ -rearranging" allele) (57). In these replacement alleles, we maintained the mouse Vκ3-2 sequence upstream of the start codon (ATG) (including the promoter) and the Vκ3-2 downstream sequence starting at the Vκ3-2 recombination signal sequence (RSS). HTGTS-Rep-seq revealed that, similarly to Vκ3-2, human Vκ1-33 on homozygous replacement alleles in our VRC01-class models accounted for approximately 2% or 17% of primary Vκ rearrangements in the presence or absence of the Cer/Sis element, respectively (SI Appendix, Fig. S3A). Vκ1-33 contributed to the splenic BCR repertoire at similar frequencies (approximately 2% and 15%, respectively; SI Appendix, Fig. S3B). We also generated a "Vκ3-20-rearranging allele" in which mouse proximal Vκ3-7 was replaced with human Vκ3-20 (Fig. 1A and SI Appendix, Fig. S3 C and D). When homozygous in mice, the Vκ3-20-rearranging allele contributed about 6% of primary Vκ rearrangements and contributed similar frequencies in splenic BCR repertoires (SI Appendix, Fig. S3E). We considered these levels sufficiently high to leave Cer/Sis intact for initial experiments. Based on studies of the Igh locus (56), we also inserted CBEs just downstream of the RSSs of the inserted Vκ1-33 and Vκ3-20 gene segments (Fig. 1A) (57). However, we found that, compared to the rearrangement frequencies of mouse Vκs they replaced, inserted CBEs had no measurable effect on Vκ1-33 rearrangement either in the presence or absence of Cer/Sis (SI Appendix, Figs. S2C and S3 A and B) and only modestly increased Vκ3-20 rearrangement in the presence of Cer/Sis (SI Appendix, Figs. S2C and S3E).
The inability of an attached CBE to dominantly increase Vκ1-33 rearrangement in the absence of Cer/Sis suggests that mechanisms underlying CBE-enhanced dominant utilization of proximal V H s in the absence of IGCR1 may not similarly operate in the context of Igκ V(D)J recombination in the absence of Cer/Sis. This notion is consistent with recent findings, published after these models were generated, indicating that mechanisms that promote long-range V H to DJ H joining are, at least in part, distinct from those that promote long-range Vκ to Jκ joining (63).
We refer to these new VRC01-class mouse models with human V H 1-2- and Vκ-rearranging ("R") alleles as the V H 1-2R JH2 /Vκ1-33R model, the V H 1-2R JH2 /Vκ1-33R CS∆ model ("CS∆" indicates Cer/Sis deletion), and the V H 1-2R JH2 /Vκ3-20R model. Based on fluorescence-activated cell sorting (FACS) analyses of cell surface markers, splenic B and T cell populations in all three models were comparable to those of wild-type mice (SI Appendix, Fig. S3F). During our studies of the V H 1-2R JH2 /Vκ3-20R model, we discovered that the inserted Vκ3-20 sequence had acquired a single in-frame point mutation in CDR1 that changes an S to an I residue (AGT to ATT) (SI Appendix, Fig. S4). We then corrected this mutation in the Vκ3-20 allele, introduced it into all mouse models described, and repeated all experiments originally performed with the mutated allele with mouse models harboring the corrected allele. Based on FACS analyses of cell surface markers, splenic B and T cell populations in the Vκ3-20 corrected model were also comparable to those of wild-type mice and those of the mouse models harboring the mutated Vκ3-20 sequence (SI Appendix, Fig. S3F). Indeed, in all experiments described below, mouse models harboring the mutated and corrected Vκ3-20 sequence gave very similar results with respect to Vκ3-20-based VRC01-class responses, which, for comparison, are included in all immunization experiments and related figures described below.

Enforced Human TdT Expression Diversifies LC Repertoires.
VRC01-class bnAb LCs commonly have a 5-aa CDR3 with a relatively conserved QQYEF amino acid sequence (32,64). However, as compared to the frequency of LC 5-aa CDR3s in human BCR repertoires, our initial VRC01-class mouse models had 20- to 50-fold lower frequencies of LC 5-aa CDR3s (0.02%) in their mouse Vκ and human Vκ1-33 or Vκ3-20 LC BCR repertoires (SI Appendix, Fig. S5A) (48,64). In this regard, approximately 80% of human LC 5-aa CDR3s are encoded by sequences with TdT-generated N regions (SI Appendix, Fig. S5B). Thus, to enforce more human-like TdT expression in mouse bone marrow precursor B cells, which normally lack TdT expression, we targeted human TdT (hTdT) into the Rosa locus of ES cells containing the Vκ3-20R allele. As compared to splenic B cells of V H 1-2R JH2 /Vκ3-20R mice, those of V H 1-2R JH2 /Vκ3-20R hTdT mice had markedly increased frequencies of N regions in both mouse Vκ to Jκ junctions and human Vκ3-20 to Jκ junctions (Fig. 1C), and, correspondingly, much more diverse CDR3s (Fig. 1D). Notably, while enforced N region addition increased the proportion of longer LC CDR3s (>9-aa), it also increased, up to fivefold, the proportion of short mouse and Vκ3-20 LC CDR3s (<7-aa), including 5-aa CDR3s (Fig. 1 E and F). Correspondingly, the proportion of N regions in short LC CDR3s was significantly increased (Fig. 1G) and the proportion of MH-mediated short Vκ to Jκ joins (<7-aa) was significantly reduced in splenic B cells of V H 1-2R JH2 /Vκ3-20R hTdT mice as compared to those of V H 1-2R JH2 /Vκ3-20R mice (Fig. 1H). In addition, we compared the LC CDR3s in splenic B cells of V H 1-2R JH2 /Vκ3-20R and V H 1-2R JH2 /Vκ3-20R hTdT mice to those in human tonsil naive B cells and found that enforced TdT expression in V H 1-2R JH2 /Vκ3-20R hTdT mice yielded more human-like CDR3s (Fig. 1 C-E, G, and H).
As endogenous mouse TdT expression is already robust in V H 1-2R JH2 /Vκ3-20R progenitor-stage B cells that undergo HC locus V(D)J recombination, human TdT expression had no obvious effect on HC CDR3 length and diversity in V H 1-2R JH2 /Vκ3-20R hTdT mice (SI Appendix, Fig. S5 H and I).
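The CDR3 length classes referred to above (5-aa, short <7-aa, and long >9-aa) can be tabulated directly from a list of CDR3 amino acid sequences. The following is a minimal sketch under stated assumptions, not the authors' analysis pipeline: the function name `cdr3_length_fractions` and the example sequences are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def cdr3_length_fractions(cdr3s):
    """Return the fraction of CDR3s falling in each length class named in the text.

    Hypothetical helper: the class boundaries (5-aa, <7-aa, >9-aa) come from
    the text; everything else is an illustrative assumption.
    """
    total = len(cdr3s)
    if total == 0:
        raise ValueError("no CDR3 sequences supplied")
    lengths = Counter(len(seq) for seq in cdr3s)
    return {
        "5-aa": lengths[5] / total,
        "short (<7-aa)": sum(n for k, n in lengths.items() if k < 7) / total,
        "long (>9-aa)": sum(n for k, n in lengths.items() if k > 9) / total,
    }


# Illustrative input only (not data from the study).
example = ["QQYEF", "QQYGSSPLT", "QQYGSSPPWT", "QQYGSSPLT"]
fractions = cdr3_length_fractions(example)
```

Comparing such fractions between two repertoires (for example, with and without enforced TdT expression) gives the kind of fold change in short or long CDR3s described in this section.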
On day 8 postimmunization, Glu96 (E), a conserved residue in 5-aa LC CDR3s of VRC01-class bnAbs, was dominantly selected by eOD-GT8 in VRC01-class 5-aa LC CDR3s from V H 1-2R JH2 /Vκ1-33R CS∆, V H 1-2R JH2 /Vκ1-33R CS∆/hTdT, and V H 1-2R JH2 /Vκ3-20R hTdT mice but not from V H 1-2R JH2 /Vκ3-20R mice (Fig. 2D and SI Appendix, Fig. S7D). This finding indicated that Vκ to Jκ joining events involving Vκ3-20 or mouse Vκs in the Vκ3-20 mice require N regions added by hTdT to generate the critical E residue in the VRC01-class 5-aa CDR3. Examination of Vκ3-20 and mouse Vκ sequences confirmed that this is the case (SI Appendix, Fig. S7E). On the other hand, examination of the Vκ1-33 sequences shows that they can directly form the E residue in the VRC01-class 5-aa CDR3 when joined to mouse Jκ1 and human Jκ1 in the absence of hTdT activity (SI Appendix, Fig. S7E). Lack of this E residue in 5-aa mouse LC CDR3s in primary GCs that arose after a single eOD-GT8 immunization was also noted in prior studies (46,49,65,66). Thus, hTdT expression substantially enhanced the VRC01/Vκ3-20 and VRC01/mVκ GC response to eOD-GT8 immunization by generating Vκ3-20-based VRC01-class 5-aa CDR3s that, as a result of N-region addition, have the capacity to encode the critical CDR3 E residue.
We bred the V H 1-2R JH2 /Vκ1-33R CS∆/hTdT and V H 1-2R JH2 /Vκ3-20R hTdT mouse lines together to make an even more human-like model that rearranges both VRC01-class Vκs. In this new V H 1-2R JH2 /Vκ1-33R CS∆/hTdT /Vκ3-20R hTdT mouse model, Vκ1-33 and Vκ3-20 LCs were expressed in 7.8% and 3.4% of splenic B cells, respectively (SI Appendix, Fig. S8A). However, on day 8 postimmunization with eOD-GT8 60mer, VRC01/Vκ3-20 GC B cells were outcompeted by VRC01/Vκ1-33 GC B cells and were hardly represented in GCs, suggesting the frequency or affinity of responding VRC01/Vκ1-33 precursors was much higher than that of VRC01/Vκ3-20 precursors in this model (SI Appendix, Fig. S8B).
Thus, we further generated the V H 1-2R JH2 /Vκ1-33R/Vκ3-20R hTdT model, in which Cer/Sis is still present on the Vκ1-33 allele, leading to a reduction in Vκ1-33 LC-expressing splenic B cell frequency to 0.74% (Fig. 3 A and B). Indeed, the relative frequency of Vκ1-33- versus Vκ3-20-expressing splenic B cells in the V H 1-2R JH2 /Vκ1-33R/Vκ3-20R hTdT model is more comparable to that of humans (67). To assess the frequency of VRC01 precursors, we sorted eOD-GT8-specific naive B cells and identified their BCR sequences (Fig. 3C and SI Appendix, Fig. S8C). The frequency of eOD-GT8-specific VRC01 precursors using Vκ1-33 or Vκ3-20 LCs in this mouse model was approximately 1 in 230,000 (VRC01/Vκ1-33: 1 in 500,000; VRC01/Vκ3-20: 1 in 420,000) (Fig. 3D), which is comparable to the approximately 1 in 400,000 frequency of eOD-GT8-specific VRC01 precursors measured in humans (44). We also estimated the VRC01-precursor frequency based on HTGTS-Rep-seq data by multiplying the frequency of V H 1-2 HCs by the frequency of Vκ1-33 and Vκ3-20 LCs with 5-aa CDR3s (Fig. 3E). The results suggest that only a small proportion of B cells expressing V H 1-2 HCs and Vκ3-20 LCs with 5-aa CDR3s bound to eOD-GT8.
B cells at sufficient levels to support future prime-boost studies, we immunized them with eOD-GT8 60mer and then boosted them with eOD-GT8 60mer at day 28 (Fig. 4A). VRC01/Vκ1-33, VRC01/Vκ3-20, and VRC01/mVκ B cells were highly enriched in CD4bs-specific GC B cells at both 8 d and 36 d postimmunization (Fig. 4B and SI Appendix, Fig. S9 A-C). Evaluation of GC responses at day 8 and day 36 revealed that the frequencies of VRC01/Vκ3-20 GC B cells and VRC01/Vκ1-33 GC B cells were comparable at day 8, but the frequency of VRC01/Vκ3-20 GC B cells was higher than that of VRC01/Vκ1-33 GC B cells at day 36 (Fig. 4C).
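The combined precursor frequency quoted above follows from adding the two per-LC frequencies, since a given precursor uses either a Vκ1-33 or a Vκ3-20 LC. The sketch below reproduces that arithmetic and also illustrates the product-style estimate from repertoire data. It is a hedged illustration, not the authors' code: the function names are hypothetical, and the product estimate assumes independent HC and LC pairing, which the text does not state explicitly.

```python
def combined_one_in_n(*one_in_n):
    """Combine mutually exclusive '1 in N' frequencies into a single '1 in N'.

    Frequencies of non-overlapping classes add, so the combined N is the
    reciprocal of the summed reciprocals.
    """
    total_frequency = sum(1.0 / n for n in one_in_n)
    return 1.0 / total_frequency


def product_estimate(hc_fraction, lc_5aa_fractions):
    """Estimate the fraction of B cells pairing a given HC with any listed LC.

    Assumption (not from the source): HC and LC usage are independent,
    so the fractions multiply.
    """
    return hc_fraction * sum(lc_5aa_fractions)


# Values stated in the text: 1 in 500,000 (Vκ1-33) and 1 in 420,000 (Vκ3-20).
combined = combined_one_in_n(500_000, 420_000)
# combined is about 228,000, consistent with the reported "approximately 1 in 230,000".
```

The product estimate counts every B cell with a suitable HC and a 5-aa LC CDR3, so it is expected to exceed the sorted eOD-GT8-binding frequency, in line with the observation that only a small proportion of such cells bound eOD-GT8.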
Sequencing analyses of VRC01-class antibodies cloned from both day 8 and day 36 GCs revealed extensive SHM, with a maximum of 17 aa mutations and a median of 9 aa mutations at day 36 (Fig. 4D and SI Appendix, Fig. S9 D and E), and wide ranges of HC CDR3 length (SI Appendix, Fig. S9F). To further analyze VRC01-class GC B cell sequence mutations, we compared them to intrinsic mutation patterns generated from nonproductive rearrangements of GC B cells without affinity selection (Fig. 4E and SI Appendix, Fig. S9 G-I) (see Methods) (68). The Q61R mutation on the V H 1-2 HC reported for VRC01-class bnAbs was significantly enriched in day 36 VRC01-class antibodies (Fig. 4F) (42). The Glu96 (E) residues in LC CDR3s were dominant in all types of day 36 VRC01-class antibodies (Fig. 4G). We expressed several VRC01-class antibodies with different LCs cloned from day 8 and day 36 GCs. Antibodies from day 8 GCs showed a range of binding affinities, with a median of 100 nM K D, to eOD-GT8 (Fig. 4H). For the antibodies from day 36 GCs, about 50% showed much higher binding activities, below 1 nM K D, representing an average affinity improvement of 100-fold (Fig. 4H and SI Appendix, Table S1). Altogether, our findings strongly indicate that the V H 1-2R JH2 /Vκ1-33R/Vκ3-20R hTdT VRC01-class and related models will facilitate testing prime-boost immunization strategies aimed at advancing eOD-GT8-primed vaccination studies toward use in human clinical trials.

Discussion
Many prior mouse models employed to test vaccine strategies designed to elicit VRC01-class HIV-1 bnAbs had exceedingly high or extremely low levels of VRC01-class precursor B cells (45–49). Other approaches to generate more physiological levels of VRC01 precursors in mouse models were limited by being designed to test only the eOD-GT8 priming immunogen in the context of very limited precursor diversity (50,51). We have now described more physiologically relevant VRC01-class V(D)J-rearranging mouse models for testing priming and boosting strategies designed to elicit VRC01-class bnAbs. These new VRC01-class rearranging mouse models rearrange both human VRC01-class V H 1-2 and Vκ3-20 and/or Vκ1-33 variable region gene segments, along with mouse V H s and Vκs during normal B cell development. The various mouse lines generated to make the VRC01-class rearranging models described here employ several different genetic strategies that should allow titration of the expression level of diverse Vκ3-20- and/or Vκ1-33-based variable region exons to establish mouse models that generate VRC01 precursor B cells over a wide range of levels (SI Appendix, Table S2). Of these models, the V H 1-2R JH2 /Vκ1-33R/Vκ3-20R hTdT model, described in depth in this report, generates a highly diverse set of potential VRC01-class precursors in mouse repertoires at similar relative levels to those found in human B cell repertoires. Importantly, the potential VRC01-class precursors with highly diverse CDR3s generated in the VRC01-class rearranging models should not be biased with respect to evaluating the efficacy of any particular VRC01-class priming immunogen (SI Appendix, Fig. S9F).
In this initial study, we have tested the eOD-GT8 priming immunogen in several VRC01-class rearranging models, including the most human-like V H 1-2R JH2 /Vκ1-33R/Vκ3-20R hTdT model, and found robust engagement of VRC01-class precursors into GCs where they generated equally robust eOD-GT8-specific responses. Other types of priming immunogens that may not be as robust in engaging VRC01-class precursors as eOD-GT8, such as 426c-degly3 Ferritin (40,47) or GT1 trimer (69), should also be readily evaluable in our new models. Conceivably, studies of some VRC01-class immunogens that have lower affinity for precursors may benefit initially through the use of VRC01-class models that express higher levels of VRC01-class precursors (SI Appendix, Table S2). Also, as individual VRC01-class precursor B cells in these new VRC01-class rearranging models express one of a multitude of different variations of the potential VRC01 precursors, they may, in theory, be useful for identifying new pathways that could lead to the generation of potent VRC01-class bnAbs. For any tested priming immunogen that generates a response, our new models could also be used to test sequential boost immunogens designed to lead responding B cells through rounds of SHM/affinity maturation that drive responses toward the generation of VRC01-class bnAbs, as described for less diverse earlier versions of these models (49,70).
A key feature of our new models is their ectopic TdT expression that forces their mouse pre-B cells to further diversify their mouse and human LC variable region repertoires and make them more human-like, both with respect to contributing N-region diversity and by dampening recurrent MH-mediated join levels in their postnatal LC repertoires. As mentioned, the absence of TdT in fetal repertoires promotes recurrent MH-mediated junctions that lead to the generation of particular Ig or TCR variable region exon sequences (14–21). For example, generation of recurrent "canonical" joins in fetal repertoires in the absence of TdT and N region additions underlies the generation of canonical junctions encoding recurrent γ/δ TCRs expressed on "innate-like" intraepithelial γ/δ T cells that persist into adulthood in both mice and humans (71,72). Notably, enforced TdT expression during fetal lymphocyte development dampens some such responses (13,28). In this study, we found that enforced TdT expression in mouse pre-B cells increased the frequency of short 5-aa CDR3 sequences, such as those used in a VRC01-class response, and promoted a specific Vκ3-20-based eOD-GT8 primary response by generating N sequences that contribute to encoding a critical VRC01-class 5-aa CDR3 residue. Analyses of human Vκ3-20-based VRC01-class sequences indicate that this mechanism also operates in humans (e.g., SI Appendix, Fig. S7E). By extension, it is likely that postnatal TdT expression in mouse developing B cells will similarly contribute to other responses.
The strategies we employed for constructing the VRC01-rearranging mouse model can be generally adopted for generating mouse models for other classes of anti-HIV-1 bnAbs. In this regard, CDR3 diversification, including engineering the models to make very long human CDR3s, will be especially relevant for testing immunogens for bnAbs that rely heavily on CDR3 to contact Env epitopes, such as those of the V2 apex, V3 glycan, and MPER classes (73). The limitations with previously employed strategies to generate mouse models to test VRC01-class immunization strategies outlined above will also apply to mouse models designed to test immunogens in the context of these other bnAb lineages. Beyond this, all straight precursor variable region knock-in strategies are limited by difficulty in accurately inferring the CDR3 of the UCA sequence of precursors, which may include contributions from both nontemplated nucleotides and somatic hypermutations (74). Indeed, due to the enormous CDR3 diversity in human antibody repertoires, a specific bnAb precursor may not be present in all individuals. To work at a population level, a vaccine should stimulate B cells expressing a range of related precursors. Mouse models expressing a unique bnAb precursor cannot assess this critical parameter. Also, the expression of certain bnAb precursor HCs or LCs can interfere with B cell development, leading to B cell deletion in bone marrow and/or anergy in peripheral lymphoid tissues (73, 75–77). The prototype VRC01-class rearranging mouse model we have described here addresses these potential issues in the VRC01 lineage. Thus, V(D)J recombination generates human VRC01-class precursors that express highly diverse CDR3s, many of which may be compatible with bnAb development. This type of mouse HIV-1 vaccine model does not depend on UCA inference.
Additionally, the CDR3 diversity in the model facilitates the assessment of the ability of immunogens to tolerate CDR3 flexibility and mobilize related precursors for bnAb development. Finally, by generating diverse human primary BCR repertoires, rearranging mouse models can provide precursors that support normal B cell development and, correspondingly, generate B cells responsive to immunization.

VRC01-Rearranging Mouse Model and Embryonic Stem Cells.
The genetic modifications in the Igκ locus were introduced into previously generated V H 1-2 ES cells (129/Sv and C57BL/6 F1 hybrid background), using targeting strategies described previously (49). The mouse Vκ3-7 segment was replaced with the human Vκ3-20 segment, with an attached CBE (atccaggaccagcagggggcgcggagagcacaca) inserted 50 bp downstream of the human Vκ3-20 segment. The replacement was mediated by homologous recombination using a PGKneolox2DTA.2 (Addgene #13449) construct and one guide RNA that targeted the mouse Vκ3-7 segment. The human TdT cDNA was cloned into the CTV (Addgene #15912) construct, in which TdT expression was driven by the CAG promoter and followed by EGFP expression mediated by an internal ribosome entry site (IRES) (78). The TdT expression cassette was inserted by homologous recombination into the first intron of the mouse Rosa26 gene, which is on chromosome 6, the same chromosome as the Igκ locus. The sequence of the guide RNA used for targeting is listed in SI Appendix, Table S3. The ESCs were grown on a monolayer of mitotically inactivated mouse embryonic fibroblasts (iMEF) in DMEM medium supplemented with 15% bovine serum, 20 mM HEPES, 1× MEM nonessential amino acids, 2 mM glutamine, 100 units of penicillin/streptomycin, 100 mM β-mercaptoethanol, and 500 units/mL leukemia inhibitory factor (LIF). The V H 1-2 JH2 /Vκ3-20 hTdT -rearranging mouse was generated by blastocyst injection of the ES cells described above and several rounds of breeding to obtain germline transmission and homozygous mice. The V H 1-2 JH2 /Vκ1-33/Vκ3-20 hTdT -rearranging mouse was generated by cross-breeding of V H 1-2 JH2 /Vκ3-20 hTdT and V H 1-2 JH2 /Vκ1-33 mice. Thus, human Vκ1-33 and Vκ3-20 segments were used on separate alleles. All mouse experiments were performed under protocol 20-08-4242R approved by the Institutional Animal Care and Use Committee of Boston Children's Hospital.
Immunogen and Immunization. Immunogen eOD-GT8 60mer was made as previously described (49). For immunization, each 8- to 12-wk-old mouse received, by intraperitoneal injection, a 200 μL mixture containing 25 μg of filter-sterilized immunogen and 60 μg of poly I:C in PBS.

Splenic B Cell, GC B Cell Purification and Antigen-Specific GC B Cell Sorting. Splenic B cells used for HTGTS-Rep-seq were purified from unimmunized 5- to 8-wk-old mice by MACS® Microbeads according to the manufacturer's protocol. In brief, spleens were dissected out from unimmunized mice, prepared into single-cell suspensions, and stained with anti-B220 Microbeads for 20 min at 4 °C. The splenic B cells were collected using the LS column and MACS™ Separator. GC B cells used for Rep-SHM-seq were purified from 8- to 12-wk-old mice after eOD-GT8 60mer immunization. GC B cells were sorted for the phenotype B220 + (BV711: BioLegend 103255), CD95 + (PE-Cy7: eBioscience 557653) and GL7 + (PE: BioLegend 144607). CD4-binding site-specific GC B cells for single-cell RT-PCR were further selected for the phenotype eOD-GT8 Fc + and ∆eOD-GT8 Fc − . The eOD-GT8 Fc was conjugated with Alexa Fluor 647 fluorescence (Thermo Fisher Scientific A30009). The ∆eOD-GT8 Fc was conjugated with Biotin (Thermo Fisher Scientific A30010) and then stained with SA-BV605 (BioLegend 405229).

Human Tonsil Mature Naive B Cell Isolation and Genomic DNA Extraction.
Human tonsils were obtained from discarded tissues as part of a routine tonsillectomy from patients at Boston Children's Hospital. Human tissues were obtained under the IRB-approved protocol IRB-P00026526, to J.P.M. Tonsils were minced in RPMI 1640 with 10% FBS, forced through a 45 μm mesh, and washed twice with media. The single-cell suspension was stained with 7-AAD (BioLegend) for viability and antibodies directed against human CD19 (APC clone SJ25-C1, Thermo Fisher Scientific), CD38 (PE-Cy7 clone HB-7, BioLegend), IgD (FITC polyclonal, Thermo Fisher Scientific) and CD27 (APC-Cy7 clone M-T271, BioLegend). Live naive B cells were obtained by sorting the stained cells using a FACS Aria (BD Biosciences) as 7-AAD − CD19 + CD38 − IgD + CD27 − . Genomic DNA from sorted cells was prepared using a DNeasy Blood and Tissue Kit (Qiagen) according to the manufacturer's protocol.
HTGTS-Rep-seq and Rep-SHM-seq Analysis. Ten micrograms of DNA from purified splenic B cells was used for generating HTGTS-Rep-seq libraries as previously described (62). Four bait primers that target mouse Jκ1, Jκ2, Jκ4, and Jκ5 were mixed to capture the entire Igκ LC repertoire in one library. One bait primer that targets human J H 2 was used to capture the HC repertoire. The sequences of the human J H 2 and mouse Jκ primers were the same as previously reported (54,68). These HTGTS-Rep-seq libraries were sequenced with an Illumina NextSeq 2 × 150-bp paired-end kit and analyzed with the HTGTS-Rep-seq pipeline (62). DNA from GC B cells was used for generating Rep-SHM-seq libraries as previously described (68). To capture the full-length V(D)J sequence, especially the CDR1 region, for intrinsic SHM analysis, we designed bait primers that target human V intron regions. The primer sequences are in SI Appendix, Table S3. These Rep-SHM-seq libraries were sequenced with an Illumina MiSeq 2 × 300-bp paired-end kit and analyzed with the Rep-SHM-seq pipeline, which uses IgBLAST to annotate V, D, J, and CDRs for each read (68). The HTGTS-Rep-seq and Rep-SHM-seq data are available through the Gene Expression Omnibus (GEO) database (GSE214884).
Single-Cell RT-PCR and Monoclonal Antibody Production. Single-cell RT-PCR was performed as described previously (57). In brief, single antigen-specific GC B cells were sorted into 96-well plates containing 5 µL of lysis buffer per well. After sorting, we used a primer mixture that specifically targets Cµ, Cγ1, Cγ2a, and Cκ to perform reverse transcription and then two rounds of nested PCR to amplify the V(D)J sequences of V H 1-2 HC, human Vκ3-20/Vκ1-33 LC, and mouse LC. The first round of PCR was performed at 94 °C for 5 min followed by 25 cycles of 94 °C for 30 s, 60 °C (V H 1-2) or 58 °C (Vκ3-20/Vκ1-33) for 30 s, 72 °C for 60 s, and final incubation at 72 °C for 5 min. The second round of PCR was performed with 2 µL of unpurified first-round PCR product at 94 °C for 5 min followed by 35 cycles of 94 °C for 30 s, 60 °C (V H 1-2) or 58 °C (Vκ3-20/Vκ1-33) for 30 s, 72 °C for 60 s, and final incubation at 72 °C for 5 min. The mouse LCs were amplified with different annealing temperatures (50 °C in the first round and 45 °C in the second round) and a different cycle number (50 cycles in both rounds). PCR products were run on agarose gels and Sanger sequencing was used to confirm their identity. The V(D)J sequences of these antibodies have been deposited to GenBank (accession Nos. OP598882-OP599353). The primer sequences for V H 1-2 HC, Vκ3-20 and Vκ1-33 LC amplification are listed in SI Appendix, Table S3. The primer sequences for mouse LC amplification were the same as previously reported (79). Genes encoding the antibody Fv regions were synthesized by GenScript and cloned into antibody expression vectors pCW-CHIg-hG1 and pCW-CLIg-hk (GenBank accessions ON512569 and ON512571). Monoclonal antibodies were generated using the Expi293 expression system (Thermo Fisher Scientific) and purified by rProtein A Sepharose Fast Flow resin (Cytiva).
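The nested PCR cycling conditions described above can be written down as structured data, which makes the differences between targets and rounds easy to compare. This is only an illustrative sketch: the `PcrProgram` class and its field names are hypothetical, and the step durations for the mouse LC reactions are assumed to match the other reactions because the text specifies only their annealing temperatures and cycle numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """One thermocycler program; default durations follow the text."""
    name: str
    anneal_c: float                 # annealing temperature in °C
    cycles: int
    initial_denature_s: int = 300   # 94 °C for 5 min
    denature_s: int = 30            # 94 °C for 30 s
    anneal_s: int = 30              # annealing step, 30 s
    extend_s: int = 60              # 72 °C for 60 s
    final_extend_s: int = 300       # 72 °C for 5 min

    def hold_minutes(self) -> float:
        """Total programmed hold time in minutes (ignores ramping)."""
        per_cycle = self.denature_s + self.anneal_s + self.extend_s
        total_s = self.initial_denature_s + self.cycles * per_cycle + self.final_extend_s
        return total_s / 60


PROGRAMS = [
    PcrProgram("VH1-2 round 1", anneal_c=60, cycles=25),
    PcrProgram("VH1-2 round 2", anneal_c=60, cycles=35),
    PcrProgram("Vk3-20/Vk1-33 round 1", anneal_c=58, cycles=25),
    PcrProgram("Vk3-20/Vk1-33 round 2", anneal_c=58, cycles=35),
    # Step durations below are assumed, not stated, for mouse LC reactions.
    PcrProgram("mouse LC round 1", anneal_c=50, cycles=50),
    PcrProgram("mouse LC round 2", anneal_c=45, cycles=50),
]

hold_times = {p.name: p.hold_minutes() for p in PROGRAMS}
```

Under these assumptions the programmed hold time is 60 min for a 25-cycle round, 80 min for a 35-cycle round, and 110 min for a 50-cycle round, excluding temperature ramping.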
Carterra Human IgG Capture. Kinetics and affinity of antibody-antigen interactions were measured on a Carterra LSA using an HC30M or CMDP Sensor Chip (Carterra) and 1× HBS-EP+ pH 7.4 running buffer (20× stock from Teknova, Cat. No. H8022) supplemented with bovine serum albumin (BSA) at 1 mg/mL. Chip surfaces were prepared for ligand capture following Carterra software instructions. In a typical experiment, about 1,000 to 1,700 RU of capture antibody (SouthernBiotech Cat. No. 2047-01) in 10 mM sodium acetate pH 4.5 was amine coupled. The regeneration solution was 1.7% phosphoric acid, injected three times per cycle with a 30 s contact time. The solution concentration of ligands was above 10 μg/mL, and the contact time was 10 min as per the Carterra manual. Raw sensorgrams were analyzed using Kinetics software (Carterra) with interspot and blank double referencing and a Langmuir model. Analyte concentrations were quantified on a NanoDrop 2000c Spectrophotometer using the absorption signal at 280 nm.

Analyses of CDR3 Diversity and MH-Mediated V(D)J Recombination. The lengths of insertion and MH for Vκ to Jκ rearrangement were annotated based on HTGTS-Rep-seq results. Insertion nucleotides can be classified into P (palindromic) nucleotides and N (nontemplated) nucleotides. For a read that can be aligned to the 3' end of the V segment or the 5' end of the J segment, the length of P nucleotides was determined by greedy alignment of the read sequence outside the V or J end to the reverse complementary V or J sequence from the end. The remaining insertion nucleotides were classified as N nucleotides. The length of MH was determined by the length of overlapping read sequence that could be aligned to both V and J (V_end_on_read - J_start_on_read + 1) after greedy alignment to V and J. CDR3 diversity was represented by the percentage of unique CDR3s for a series of downsampled read numbers (e.g., 20, 50, 100, and 200), which could be viewed as rarefaction and was estimated by the R package "iNEXT." Welch's t test was used to compare the percentage of unique CDR3s between groups.
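The junction annotations described above can be sketched in a few lines. The code below is a simplified illustration, not the HTGTS-Rep-seq pipeline: all function names are hypothetical, coordinates are taken to be 1-based and inclusive, both coding ends are assumed untrimmed when P nucleotides are counted, and the diversity helper uses plain random subsampling as a stand-in for the rarefaction estimate that the authors obtained with the R package iNEXT.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def mh_length(v_end_on_read, j_start_on_read):
    """MH length = V_end_on_read - J_start_on_read + 1; zero if V and J do not overlap."""
    return max(0, v_end_on_read - j_start_on_read + 1)


def p_length_v_side(v_germline, insertion):
    """Greedily extend from the V end: insertion bases matching revcomp(V) from its start."""
    template = revcomp(v_germline)
    n = 0
    while n < min(len(insertion), len(template)) and insertion[n] == template[n]:
        n += 1
    return n


def p_length_j_side(j_germline, insertion):
    """Greedily extend from the J start: insertion bases matching revcomp(J) from its end."""
    template = revcomp(j_germline)
    n = 0
    while n < min(len(insertion), len(template)) and insertion[-1 - n] == template[-1 - n]:
        n += 1
    return n


def classify_insertion(v_germline, j_germline, insertion):
    """Split an insertion into P and N nucleotide counts (untrimmed ends assumed)."""
    p_v = p_length_v_side(v_germline, insertion)
    p_j = p_length_j_side(j_germline, insertion[p_v:])  # avoid double counting
    return {"P": p_v + p_j, "N": len(insertion) - p_v - p_j}


def percent_unique(cdr3s, depth, rng):
    """Percentage of unique CDR3s in a random subsample of `depth` reads."""
    sample = rng.sample(list(cdr3s), depth)
    return 100.0 * len(set(sample)) / depth


# Toy sequences for illustration only (not real germline segments).
junction = classify_insertion("CAGTA", "GTGGA", "TAGGAC")
```

In this toy junction, the insertion `TAGGAC` begins with `TA` (palindromic to the V end `TA`) and ends with `AC` (palindromic to the J start `GT`), leaving `GG` as N nucleotides.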
Statistical Analysis. Statistical tests with appropriate underlying assumptions on data distribution and variance characteristics were used. The t test was used as indicated in the figure legends. Statistical analysis was performed in Prism (v.8, GraphPad Software).
Data, Materials, and Software Availability. All data needed to evaluate the conclusions of the paper are presented in the paper or deposited in online databases. Nucleotide sequences have been deposited to GenBank (accession Nos. OP598882-OP599353). The next-generation sequencing data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database under the accession number GSE214884. Previously published data were used for this work [GSE197255 (57)]. The computational pipeline of Rep-SHM-Seq and the code for statistical analysis tools used in this study are available at https://github.com/Yyx2626/HTGTSrep.