There is a newer version of the record available.

Published February 14, 2024 | Version v1
Preprint Open

A Scalable Implementation of Mapper for Topological Data Analysis via Vantage Point Trees

Creators

Description

The Mapper algorithm is a tool from Topological Data Analysis for extracting information about the shape of point clouds. Many existing open-source Mapper libraries construct open covers in a naive way that scales poorly in high dimensions and limits the overall performance of Mapper. In this study, we propose a methodology for building open covers for Mapper using vantage point trees (vp-trees). This approach yields a more efficient algorithm that handles high-dimensional data with improved scalability. It also produces a simpler, more concise Mapper graph, which makes the results easier to interpret. To support adoption, we introduce the tda-mapper Python library, hosted at https://github.com/lucasimi/tda-mapper-python, which implements the proposed approach for building open covers. Finally, we benchmark tda-mapper against widely used open-source alternatives; the benchmarks provide quantitative evidence of its better scalability and efficiency.
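
To make the idea in the description concrete, the following is a minimal sketch of how a vp-tree can support an open cover made of metric balls. It is not the tda-mapper API and not necessarily the construction used in the paper: the names VPTree, range_query, and ball_cover, the Euclidean metric, and the greedy "cover each uncovered point with a ball of fixed radius" strategy are all assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two coordinate tuples."""
    return math.dist(a, b)


class VPTree:
    """Minimal vantage-point tree over point indices (hypothetical, illustrative)."""

    def __init__(self, points, dist=euclidean, seed=0):
        self.points = points
        self.dist = dist
        self._rng = random.Random(seed)
        self.root = self._build(list(range(len(points))))

    def _build(self, ids):
        if not ids:
            return None
        # Pick a random vantage point and remove it from the candidates.
        vp = ids.pop(self._rng.randrange(len(ids)))
        if not ids:
            return (vp, 0.0, None, None)
        # Split the remaining points at the median distance from the vantage point.
        ids.sort(key=lambda i: self.dist(self.points[vp], self.points[i]))
        mid = len(ids) // 2
        mu = self.dist(self.points[vp], self.points[ids[mid]])
        # Inner subtree: distance <= mu. Outer subtree: distance >= mu.
        return (vp, mu, self._build(ids[:mid]), self._build(ids[mid:]))

    def range_query(self, query, radius):
        """Return indices of all points within `radius` of `query`."""
        found, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if node is None:
                continue
            vp, mu, inner, outer = node
            d = self.dist(query, self.points[vp])
            if d <= radius:
                found.append(vp)
            # Triangle inequality: skip a subtree that cannot meet the query ball.
            if d - radius <= mu:
                stack.append(inner)
            if d + radius >= mu:
                stack.append(outer)
        return found


def ball_cover(points, radius, dist=euclidean):
    """Greedy cover: each still-uncovered point becomes the center of a ball."""
    tree = VPTree(points, dist)
    covered = [False] * len(points)
    cover = []
    for i in range(len(points)):
        if covered[i]:
            continue
        ball = sorted(tree.range_query(points[i], radius))
        for j in ball:
            covered[j] = True
        cover.append(ball)
    return cover


# Five points on a line; overlapping balls share point 1.
cover = ball_cover([(0.0,), (0.5,), (1.0,), (5.0,), (5.2,)], radius=0.6)
```

In this sketch, points that fall in more than one ball are what would link cover elements in a Mapper graph. Pruning through the triangle inequality lets each range query skip subtrees that cannot intersect the query ball, so it avoids scanning every point, and the tree needs only a distance function rather than coordinates.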

Files

paper.pdf

Files (975.0 kB)

Checksum Size
md5:928eca6f4aa46d38b9311f8e1f3d15fd 496.3 kB
md5:3d7d57b54cc54d9d319e921f76ce3198 478.7 kB

Additional details

Related works

Is supplement to
Software: 10.5281/zenodo.10642381 (DOI)