Deciphering the antigen specificities of antibodies by clustering their complementarity determining region sequences

ABSTRACT Recent advances in adaptive immune receptor repertoire sequencing have provided abundant B cell receptor (BCR) sequences under various conditions, including vaccination and disease. However, determining target antigen and epitope specificity of the corresponding antibodies is a major challenge due to their exceptional sequence diversity. Here, we introduce a novel method to cluster antibodies sharing antigenic targets based on their complementarity determining region (CDR) sequences. Using the proposed method, we demonstrate that SARS-CoV-2 spike protein receptor-binding domain (RBD) binders and non-RBD binders from publicly available BCR data were classified correctly, with a cluster purity of 95%. These clusters were then leveraged for annotating unlabeled COVID-19 patient BCR data, enabling the discovery of novel anti-RBD antibodies. We further validated the method by clustering BCR repertoires obtained from single-cell immune profiling of diphtheria-tetanus-pertussis (DTP)-vaccinated donors. Antibody expression and antigen-binding assays demonstrated that the clusters exhibited 96% antigen purity, surpassing the apparent 82% purity achieved by assigning antigens to the same B cells using fluorescently labeled DTP antigen probes. Moreover, antibodies within certain clusters were found to possess neutralizing activity, suggesting that CDR clusters contain epitope-level information. Together, this study offers a simple approach for antigen- and epitope-specific BCR discovery that is reproducible, inexpensive, and applicable to a wide range of antigen targets.

IMPORTANCE Determining antigen and epitope specificity is an essential step in the discovery of therapeutic antibodies as well as in the analysis of adaptive immune responses to disease or vaccination. Despite extensive efforts, deciphering antigen specificity solely from BCR amino acid sequence remains a challenging task, requiring a combination of experimental and computational approaches. Here, we describe and experimentally validate a simple and straightforward approach for grouping antibodies that share antigen and epitope specificities based on their CDR sequence similarity. This approach allows us to identify the specificities of a large number of antibodies whose antigen targets are unknown, using a small fraction of antibodies with well-annotated binding specificities.
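
The two quantities mentioned above, cluster purity and the annotation of unlabeled antibodies from a few labeled ones, can be illustrated with a minimal sketch. It is not the authors' software. It assumes that cluster assignments already exist, that purity means the fraction of labeled antibodies carrying the majority label of their cluster, and that unlabeled members simply inherit that majority label. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def label_counts(cluster_ids, labels):
    """Count known labels per cluster; None marks an unlabeled antibody."""
    counts = defaultdict(Counter)
    for cid, lab in zip(cluster_ids, labels):
        if lab is not None:
            counts[cid][lab] += 1
    return counts


def cluster_purity(cluster_ids, labels):
    """Assumed definition: share of labeled antibodies that carry the
    majority label of their own cluster."""
    counts = label_counts(cluster_ids, labels)
    total = sum(sum(c.values()) for c in counts.values())
    if total == 0:
        raise ValueError("no labeled antibodies")
    agree = sum(max(c.values()) for c in counts.values())
    return agree / total


def annotate(cluster_ids, labels):
    """Give each unlabeled antibody the majority label of its cluster.
    Antibodies in clusters without any labeled member stay None."""
    counts = label_counts(cluster_ids, labels)
    majority = {cid: c.most_common(1)[0][0] for cid, c in counts.items()}
    return [lab if lab is not None else majority.get(cid)
            for cid, lab in zip(cluster_ids, labels)]


# Toy data (hypothetical): seven antibodies in three clusters.
cluster_ids = [1, 1, 1, 2, 2, 2, 3]
labels = ["RBD", "RBD", None, "non-RBD", "RBD", "non-RBD", None]

print(cluster_purity(cluster_ids, labels))
print(annotate(cluster_ids, labels))
```

In this toy case four of the five labeled antibodies agree with their cluster's majority label. The unlabeled antibody in cluster 1 is annotated as an RBD binder, while the one in cluster 3 stays unannotated because that cluster contains no labeled member.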

• Point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER.
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file.
• Each figure must be uploaded as a separate file, and any multipanel figures must be assembled into one file.
• Manuscript: A .DOC version of the revised manuscript
• Figures: Editable, high-resolution, individual figure files are required at revision; TIFF or EPS files are preferred
ASM policy requires that data be available to the public upon online posting of the article, so please verify all links to sequence records, if present, and make sure that each number retrieves the full record of the data. If a new accession number is not linked or a link is broken, provide production staff with the correct URL for the record. If the accession numbers for new data are not publicly accessible before the expected online posting of the article, publication of your article may be delayed; please contact the ASM production staff immediately with the expected release date.
For complete guidelines on revision requirements, please see the journal Submission and Review Process requirements at https://journals.asm.org/journal/mSystems/submission-review-process. Submission of a paper that does not conform to mSystems guidelines will delay acceptance of your manuscript.
Please return the manuscript within 60 days; if you cannot complete the modification within this time period, please contact me. If you do not wish to modify the manuscript and prefer to submit it to another journal, please notify me of your decision immediately so that the manuscript may be formally withdrawn from consideration by mSystems.
Thank you for submitting your paper to mSystems.

Reviewer comments:
Reviewer #1 (Comments for the Author):
This manuscript introduces a novel method to cluster antibodies sharing antigenic targets based on their CDR sequences. The authors applied this method to publicly available COVID-19 patient BCR data and to diphtheria-tetanus-pertussis (DTP)-vaccinated donors. Using this method, antibody expression and antigen-binding assays demonstrated that the clusters exhibited 96% antigen purity, compared to the 82% purity achieved by assigning antigens to the same B cells using fluorescently labeled DTP antigen probes. This approach allows us to identify the specificities of many antibodies whose antigen targets are unknown, using a small fraction of antibodies with well-annotated binding specificities. The manuscript overall is well written, and studies are well controlled. Here are some comments:
1. As mentioned in the manuscript, the four DTP-vaccinated donors display considerable heterogeneity in antibody clusters as well as humoral immune responses. For example, donor 3's humoral immune response is weaker than that of the other donors. It will be interesting to show whether the binding and neutralization abilities of antibodies derived from donor 3 differ from those of other donors whose immune responses are stronger after vaccination.
2. It will be helpful to indicate in Figure 5C which clusters these antibodies belong to. This will provide useful information for antibody binding, neutralization, and clustering information.
3. It's unclear whether the difference in purity between the COVID-19 patient data and the DTP vaccination data is due to the sample size difference or to different basal antibody levels. COVID-19 appeared as a new pathogen with no basal-level antibodies in the population, whereas the majority of the population might have been vaccinated with DTP before this study, so that DTP vaccination serves as a boost to their immune response. Please discuss this in the Discussion section, as this will provide useful information.

Reviewer #2 (Comments for the Author):
This is a very interesting manuscript describing a novel approach of clustering B cell receptor repertoires based on their paratope (CDR), rather than on the epitope. The authors used two approaches to validate their method, first using the publicly available database for SARS-CoV-2, and second applying it to immune sera from human volunteers previously exposed to DTaP or the respective infections. The authors confirmed that in both cases their approach allowed them to cluster the repertoire based on the CDRs. Thus, the authors suggest that, when used in tandem with traditional approaches, this could help to characterize antibody repertoires at higher resolution in a less complicated fashion.

Two quick comments that the authors may want to address:
1. There is no discussion of affinity maturation through somatic hypermutation in the manuscript, and one wonders if clustering the repertoire based on paratopes would also allow one to predict clustering of high- versus low-affinity antibodies?
2. Secondly, as future vaccines target broadly neutralizing antibodies to increase the breadth of immune protection, one wonders how clustering based on the paratope might allow selection for broadly neutralizing antibodies.
It would be great to see the authors addressing these two points in their discussion.

Response to Reviewers
Dear Prof. Ileana Cristea,
Thank you for giving us the opportunity to submit a revised version of the manuscript "Deciphering the antigen specificities of antibodies by clustering their complementarity determining region sequences" for publication in mSystems. We appreciate the time and effort that you and the reviewers dedicated to providing insightful comments and valuable feedback on our manuscript. Where feasible, we have incorporated the suggestions made by the reviewers and highlighted those changes within the manuscript. Below, please find our point-by-point responses to the reviewers' comments and concerns. All page and line numbers refer to the revised manuscript file with tracked changes.

Reviewer #1
The manuscript overall is well written, and studies are well controlled.
Author response: Thank you for reviewing our manuscript and giving constructive comments.
1. As mentioned in the manuscript, the four DTP-vaccinated donors display considerable heterogeneity in antibody clusters as well as humoral immune responses. For example, donor 3's humoral immune response is weaker than that of the other donors. It will be interesting to show whether the binding and neutralization abilities of antibodies derived from donor 3 differ from those of other donors whose immune responses are stronger after vaccination.
Author response: Thank you for pointing this out. Indeed, among the expressed antibodies from Clusters 1-25, Donor 3 only contributed to clusters that did not bind to DTP (Clusters 3, 7, and 20) and, as a result, no neutralizing antibodies were derived from this donor. Because repertoire analysis and antibody expression sample only part of the repertoire, we cannot say that neutralizing antibodies were entirely absent in Donor 3, but these results suggest that Donor 3 was a weak responder. To highlight these observations, we have modified the main text in the Results and Discussion sections as follows:
Results (Line 241-243): Indeed, among the expressed antibodies from Clusters 1-25, Donor 3 only contributed to clusters that didn't bind to DTP antigen (Clusters 3, 7, and 20), indicating that Donor 3 is a weak responder to the vaccine.
Discussion (Line 300-302): Moreover, we were able to identify a weak responder to the vaccine, highlighting the heterogeneity of human antibody responses to the same immune perturbation.
2. It will be helpful to indicate in Figure 5C which clusters these antibodies belong to. This will provide useful information for antibody binding, neutralization, and clustering information.
Author response: As suggested by the reviewer, we have modified Figure 5C by adding cluster information. We hope this makes it easier to see the antibody binding, neutralization, and clustering information in one place.
3. It's unclear whether the difference in purity between the COVID-19 patient data and the DTP vaccination data is due to the sample size difference or to different basal antibody levels. COVID-19 appeared as a new pathogen with no basal-level antibodies in the population, whereas the majority of the population might have been vaccinated with DTP before this study, so that DTP vaccination serves as a boost to their immune response. Please discuss this in the Discussion section, as this will provide useful information.
Author response: Although it is premature to generalize based on only two examples, we would rather emphasize that the 95% and 96% purities are very similar despite the differences in antigen, sample size, basal level of antibodies, and other factors. For this reason, we have not discussed the 1% difference in purity. We hope we have not misinterpreted the reviewer's comment.

Reviewer #2
This is a very interesting manuscript describing a novel approach of clustering B cell receptor repertoires based on their paratope (CDR)
Author response: We appreciate the positive feedback very much.
1. There is no discussion of affinity maturation through somatic hypermutation in the manuscript, and one wonders if clustering the repertoire based on paratopes would also allow one to predict clustering of high- versus low-affinity antibodies?
Author response: Thank you for the suggestion to discuss affinity maturation and somatic hypermutation in more detail. Here, we want to emphasize that the paradigm of paratope is more general than that of clonotype in terms of accommodating somatic hypermutation (SHM) in the CDRs. Unfortunately, predicting the affinities of antibodies is currently beyond the ability of our clustering software. However, we agree that this is a very important question to be addressed by additional analysis. To this end, we have added sentences to the Discussion section as follows:
Discussion (Line 290-293):