Quantitatively Visualizing Bipartite Datasets

Open Access

Tal Einav, Yuehaw Khoo, and Amit Singer

Phys. Rev. X 13, 021002 – Published 4 April 2023

Abstract

As experiments continue to increase in size and scope, a fundamental challenge of subsequent analyses is to recast the wealth of information into an intuitive and readily interpretable form. Often, each measurement conveys only the relationship between a pair of entries, and it is difficult to integrate these local interactions across a dataset to form a cohesive global picture. The classic localization problem tackles this question, transforming local measurements into a global map that reveals the underlying structure of a system. Here, we examine the more challenging bipartite localization problem, where pairwise distances are available only for bipartite data comprising two classes of entries (such as antibody-virus interactions, drug-cell potency, or user-rating profiles). We modify previous algorithms to solve bipartite localization and examine how each method behaves in the presence of noise, outliers, and partially observed data. As a proof of concept, we apply these algorithms to antibody-virus neutralization measurements to create a basis set of antibody behaviors, formalize how potently inhibiting some viruses necessitates weakly inhibiting other viruses, and quantify how often combinations of antibodies exhibit degenerate behavior.

Received 26 July 2022
Revised 31 December 2022
Accepted 24 February 2023

DOI: https://doi.org/10.1103/PhysRevX.13.021002

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Physics Subject Headings (PhySH)

Immune system diseases; Viral diseases

Data analysis

General Physics; Physics of Living Systems; Interdisciplinary Physics

Authors & Affiliations

Tal Einav¹,*, Yuehaw Khoo², and Amit Singer³

¹Divisions of Computational Biology and Basic Sciences, Fred Hutchinson Cancer Center, Seattle, Washington 98109, USA
²Department of Statistics, University of Chicago, Chicago, Illinois 60637, USA
³Department of Mathematics and PACM, Princeton University, Princeton, New Jersey 08540, USA

*Corresponding author: teinav@fredhutch.org

Popular Summary

As datasets grow and become more complex, they necessitate tools that build an intuitive understanding of a system by revealing its underlying structure. We look at the interactions between two classes of entries—a broad definition that includes datasets on how antibodies inhibit viruses, how transcription factors bind to DNA segments, and even how people rate movies. We develop three algorithms to embed these pairwise interactions into a low-dimensional space where the distance between entries corresponds to their interaction strength. Such embeddings not only predict unmeasured interactions, but they also provide a basis set of behaviors for a system. For example, given an antibody’s inhibition against one virus, we can predict its possible behaviors against other variants.
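The embedding idea described above can be illustrated with a small sketch. This is not one of the three algorithms developed in the article; it is a generic gradient-descent fit, assuming the interaction strengths have already been converted to target distances. The function name `embed_bipartite`, the `mask` argument marking which entries were measured, and the synthetic "antibody" and "virus" coordinates are all hypothetical and introduced only for illustration.

```python
import numpy as np

def embed_bipartite(D, mask, dim=2, steps=3000, lr=0.01, seed=0):
    """Fit coordinates X (rows of D) and Y (columns of D) so that
    ||X[i] - Y[j]|| approximates D[i, j] wherever mask[i, j] == 1.
    Plain gradient descent on 0.5 * sum(mask * (dist - D)**2).
    Illustrative only; not the method from the article."""
    rng = np.random.default_rng(seed)
    m, n = D.shape
    X = rng.normal(size=(m, dim))
    Y = rng.normal(size=(n, dim))
    history = []
    for _ in range(steps):
        diff = X[:, None, :] - Y[None, :, :]      # shape (m, n, dim)
        dist = np.linalg.norm(diff, axis=2)       # shape (m, n)
        resid = mask * (dist - D)                 # unobserved entries contribute 0
        history.append(0.5 * np.sum(resid ** 2))  # stress before this update
        # Gradient of the stress with respect to X[i]; the gradient for Y[j] has the opposite sign.
        grad = (resid / np.maximum(dist, 1e-9))[:, :, None] * diff
        X -= lr * grad.sum(axis=1)
        Y += lr * grad.sum(axis=0)
    return X, Y, history

# Synthetic example: hypothetical ground-truth positions in 2D.
rng = np.random.default_rng(1)
antibodies = rng.uniform(size=(6, 2))
viruses = rng.uniform(size=(8, 2))
D = np.linalg.norm(antibodies[:, None, :] - viruses[None, :, :], axis=2)

mask = np.ones_like(D)
mask[0, 0] = 0.0  # pretend this antibody-virus pair was never measured

X, Y, history = embed_bipartite(D, mask)
predicted = np.linalg.norm(X[0] - Y[0])
print(f"stress: {history[0]:.3f} -> {history[-1]:.3f}")
print(f"held-out distance: true {D[0, 0]:.3f}, predicted {predicted:.3f}")
```

Only distances between the two classes enter the fit, which is what distinguishes the bipartite setting from classic localization. A simple descent like this can stall in local minima and has no special handling of noise or outliers, so it should be read as a toy illustration of the objective rather than a robust solver.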

To create robust embeddings, we leverage tools from data science, geometric computation, and biophysics to explore three embedding schemes and quantify how they (and their combinations) can tolerate noise, missing measurements, and large outliers. We apply these methods to antibody-virus inhibition data and show that they conform with a 2D representation. Using this framework, we find that most two-antibody cocktails can be mimicked by a single antibody, whereas cocktails with three or more antibodies often exhibit novel behavior that no single antibody can replicate.

Such embeddings can be applied to diverse systems to harness a wealth of available data and extrapolate new behaviors. This not only amplifies the amount of data but also provides insight into the underlying trade-offs and constraints of a system. A key open question is how far such extrapolations can be pushed before they break.

Issue

Vol. 13, Iss. 2 — April - June 2023

Physical Review X