Persistence Curves: A canonical framework for summarizing persistence diagrams

Advances in Computational Mathematics

Abstract

Persistence diagrams are one of the main tools in the field of Topological Data Analysis (TDA). They contain rich information about the shape of data. Applying machine learning algorithms to the space of persistence diagrams is challenging because the space lacks an inner product. For that reason, transforming these diagrams in a way that is compatible with machine learning is an active research topic in TDA. In this paper, our main contribution consists of three components. First, we develop a general and unifying framework for vectorizing diagrams that we call Persistence Curves (PCs), and show that several well-known summaries, such as Persistence Landscapes, fall under the PC framework. Second, we propose several new summaries based on the PC framework and provide a theoretical foundation for their stability analysis. Finally, we apply the proposed PCs to two applications: texture classification and determining the parameters of a discrete dynamical system. Their performance is competitive with other TDA methods.



Author information

Corresponding author

Correspondence to Yu-Min Chung.

Additional information

Communicated by: Gitta Kutyniok

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The majority of this work was done when Yu-Min Chung was employed at the Department of Mathematics and Statistics, University of North Carolina at Greensboro.

Appendices

Appendix A: Explicit calculations of bounds using Theorem 1

In this section, we will assume \(C,D\in \mathcal {D}\) and utilize the definitions of the curves defined in Table 1. To ease notation we will write κ1 in place of κ1(ψ, C, D) and analogously for \(\kappa _{\infty },\delta _{1},\delta _{\infty }\). We recall the following definitions.

$$ \begin{array}{@{}rcl@{}} \kappa_{1}(\psi, C, D) & =&\sum\limits_{(b,d)\in C\setminus{{\varDelta}}}\underset{ t\in [b, d]}{\max} |\psi(b, d, t)| + \sum\limits_{(b^{\prime},d^{\prime})\in D\setminus{{\varDelta}}}\underset{t\in [b^{\prime}, d^{\prime}]}{\max} |\psi(b^{\prime},d^{\prime}, t)| \end{array} $$
(A.1)
$$ \begin{array}{@{}rcl@{}} \kappa_{\infty}(\psi, C, D)\! &=&\underset{(b,d)\in C\setminus{{\varDelta}}}{\max} \underset{ t\in [b, d]}{\max} |\psi(b, d, t)| + \underset{(b^{\prime},d^{\prime})\in D\setminus{{\varDelta}}}{\max} \underset{t\in [b^{\prime}, d^{\prime}]}{\max} |\psi(b^{\prime}, d^{\prime}, t)|. \end{array} $$
(A.2)
$$ \begin{array}{@{}rcl@{}} \delta_{1}(\psi, C, D) &=&\underset{\eta:C\to D}{\inf} \sum\limits_{i=1}^{n_{\eta}}\underset{{ t\in [b_{i}, d_{i}]\cap [\eta_{b_{i}}, \eta_{d_{i}}]}} {\max} |\psi(b_{i}, d_{i}, t) - \psi(\eta_{b_{i}}, \eta_{d_{i}}, t)|. \end{array} $$
(A.3)
$$ \begin{array}{@{}rcl@{}} \delta_{\infty}(\psi, C, D) &=&\underset{\eta:C\to D}{\inf} \underset{\substack{1\leq i \leq n_{\eta} \\ t\in [b_{i}, d_{i}]\cap [\eta_{b_{i}}, \eta_{d_{i}}]}}{\max} |\psi(b_{i}, d_{i}, t) - \psi(\eta_{b_{i}}, \eta_{d_{i}}, t)|. \end{array} $$
(A.4)
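
For the curves treated in this appendix, ψ does not depend on t, so the inner maxima over t in (A.1) and (A.2) reduce to |ψ(b, d)|. The following is a minimal Python sketch of κ1 and κ∞ under that assumption. The function names and the representation of a diagram as a list of (birth, death) pairs are illustrative choices, not taken from the paper.

```python
def kappas(psi, C, D):
    """Return (kappa_1, kappa_inf) from (A.1)-(A.2) for a psi that is constant in t.

    psi(diagram, b, d) is a hypothetical callable returning psi^diagram(b, d).
    Diagonal points (b == d) are skipped, mirroring the sums over C minus the diagonal.
    """
    c_vals = [abs(psi(C, b, d)) for (b, d) in C if b != d]
    d_vals = [abs(psi(D, b, d)) for (b, d) in D if b != d]
    kappa_1 = sum(c_vals) + sum(d_vals)
    kappa_inf = max(c_vals, default=0.0) + max(d_vals, default=0.0)
    return kappa_1, kappa_inf


def betti_psi(diagram, b, d):
    # Betti number curve: every off-diagonal point contributes 1.
    return 1.0


def normalized_betti_psi(diagram, b, d):
    # Normalized Betti curve: every off-diagonal point contributes 1 / n^D.
    n = sum(1 for (x, y) in diagram if x != y)
    return 1.0 / n


C = [(0.0, 2.0), (1.0, 3.0)]
D = [(0.0, 2.5), (1.0, 3.0), (0.5, 1.0)]
print(kappas(betti_psi, C, D))
print(kappas(normalized_betti_psi, C, D))
```

Passing the diagram itself to `psi` is a design choice that lets normalized curves, whose ψ depends on the whole diagram, share the same helper.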

A.1 Betti-based curves

In this section we consider the curves based on the Betti number. As we will see, the ψ function for each of the curves in this section is not continuous at the diagonal, which causes the δ bounds to be large.

A.1.1 Betti number curve (β)

Recall that the Betti curve is obtained by taking \(\psi (D;b,d,t) := \psi ^{D}(b,d) = 1\) when \((b,d)\in D\), \(b\neq d\), and 0 otherwise. The values for \(\kappa _{1}\) and \(\kappa _{\infty }\) are straightforward to calculate.

$$ \begin{array}{@{}rcl@{}} \kappa_{1} &=& \sum\limits_{i=1}^{n^{C}}1 + \sum\limits_{i=1}^{n^{D}} 1 =n^{C} + n^{D},\\ \kappa_{\infty} &=& 2. \end{array} $$

On the other hand, notice that if \(t\in [b,d]\cap [\eta _{b},\eta _{d}]\), then \(|\psi ^{C}(b,d)-\psi ^{D}(\eta _{b},\eta _{d})| = 1\) when exactly one of the points \((b,d)\) and \((\eta _{b},\eta _{d})\) lies on the diagonal Δ, and 0 otherwise. This means that the worst-case scenario for the Betti curve is when all points are paired with the diagonal, yielding the following δ values:

$$ \begin{array}{@{}rcl@{}} \delta_{\infty} &\le& 1. \\ \delta_{1} &\le& \left( n^{C}+ n^{D}\right). \end{array} $$

Hence, we may conclude by Theorem 1 that

$$ \begin{array}{@{}rcl@{}} \|\beta(C)-\beta(D)\|_{1}&\le& (n^{C}+ n^{D})W_{\infty}(C,D) + L^{C}\land L^{D}, \end{array} $$
(A.5)
$$ \begin{array}{@{}rcl@{}} \|\beta(C)-\beta(D)\|_{1}&\le& 2W_{1}(C,D) + \left( L_{\infty}^{C}\land L_{\infty}^{D}\right)(n^{C}+ n^{D}). \end{array} $$
(A.6)

Neither of these bounds is desirable. Not only do both bounds tend to infinity as the number of points increases, but both also contain additive constants that do not depend on \(W_{1}\) or \(W_{\infty }\). Moreover, the same occurs as the minimum lifespan grows.
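
To see how loose (A.5) and (A.6) can be, the sketch below evaluates both sides on a tiny pair of diagrams. Several things here are assumptions, since the underlying definitions are not restated in this appendix: the Betti curve counts half-open intervals [b, d), matched points are compared in the sup norm, unmatched points go to the nearest diagonal point, \(L^{C}\) denotes the total lifespan, and \(L_{\infty }^{C}\) the maximum lifespan. The brute-force matching is only meant for diagrams with a handful of points.

```python
from itertools import permutations


def betti(diagram, t):
    """Betti curve value at t: number of intervals [b, d) containing t."""
    return sum(1 for (b, d) in diagram if b <= t < d)


def betti_l1_distance(C, D):
    """Exact L1 distance between the two step functions, summed piece by piece."""
    pts = sorted({x for point in C + D for x in point})
    total = 0.0
    for lo, hi in zip(pts, pts[1:]):
        mid = (lo + hi) / 2.0  # both curves are constant on (lo, hi)
        total += abs(betti(C, mid) - betti(D, mid)) * (hi - lo)
    return total


def wasserstein_bruteforce(C, D):
    """Return (W_inf, W_1) by trying every matching; only viable for tiny diagrams."""
    n, m = len(C), len(D)

    def cost(i, j):
        if i < n and j < m:  # point of C matched to a point of D (sup norm)
            return max(abs(C[i][0] - D[j][0]), abs(C[i][1] - D[j][1]))
        if i < n:            # point of C matched to the diagonal
            return (C[i][1] - C[i][0]) / 2.0
        if j < m:            # point of D matched to the diagonal
            return (D[j][1] - D[j][0]) / 2.0
        return 0.0           # diagonal matched to diagonal

    w_inf = w_1 = float("inf")
    for perm in permutations(range(n + m)):
        costs = [cost(i, perm[i]) for i in range(n + m)]
        w_inf = min(w_inf, max(costs))
        w_1 = min(w_1, sum(costs))
    return w_inf, w_1


C = [(0.0, 2.0), (1.0, 3.0)]
D = [(0.0, 2.5), (1.0, 3.0)]
n_C, n_D = len(C), len(D)
L_C, L_D = sum(d - b for b, d in C), sum(d - b for b, d in D)        # assumed: total lifespan
Linf_C, Linf_D = max(d - b for b, d in C), max(d - b for b, d in D)  # assumed: max lifespan

lhs = betti_l1_distance(C, D)
w_inf, w_1 = wasserstein_bruteforce(C, D)
rhs_A5 = (n_C + n_D) * w_inf + min(L_C, L_D)
rhs_A6 = 2 * w_1 + min(Linf_C, Linf_D) * (n_C + n_D)
print(lhs, rhs_A5, rhs_A6)
```

For this pair the curves differ only on [2, 2.5), so the left-hand side is 0.5, while the right-hand sides evaluate to 6 and 9 under the stated assumptions. The gap comes almost entirely from the additive terms that do not shrink with the Wasserstein distance.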

A.1.2 Normalized Betti curve (sβ)

According to Definition 2, we may define the normalized Betti curve by taking \(\psi (D;b,d,t) :=\psi ^{D}(b,d)= \frac {1}{n^{D}}\) when \((b,d)\in D\), \(b\neq d\), and 0 otherwise. Again we can get the κ values quickly:

$$ \begin{array}{@{}rcl@{}} \kappa_{1} &=&\sum\limits_{i=1}^{n^{C}} \frac{1}{n^{C}} + \sum\limits_{i=1}^{n^{D}}\frac{1}{n^{D}}=2,\\ \kappa_{\infty} &=& \frac{1}{n^{C}}+\frac{1}{n^{D}}\le\frac{2}{n^{C}\land n^{D}}. \end{array} $$

On the other hand, for all \(\eta : C\to D\)

$$ \begin{array}{@{}rcl@{}} \delta_{\infty} &\le& \underset{1\le i\le n_{\eta}}{\max} |\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})|\le \frac{1}{n^{C}\land n^{D}}.\\ \delta_{1} &\le& \sum\limits_{i=1}^{n_{\eta}}|\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})| \le \sum\limits_{i=1}^{n^{C}} \frac{1}{n^{C}} + \sum\limits_{i=1}^{n^{D}}\frac{1}{n^{D}}=2. \end{array} $$

Hence, we may conclude by Theorem 1 that

$$ \begin{array}{@{}rcl@{}} \|\mathbf{s}\boldsymbol\beta(C)-\mathbf{s}\boldsymbol\beta(D)\|_{1}&\le &2W_{\infty}(C,D) + \frac{L^{C}\land L^{D}}{n^{C}\land n^{D}}, \end{array} $$
(A.7)
$$ \begin{array}{@{}rcl@{}} \|\mathbf{s}\boldsymbol{\beta}(C)-\mathbf{s}\boldsymbol{\beta}(D)\|_{1}&\le& \frac{2W_{1}(C,D)}{n^{C}\land n^{D}} + 2\left( L_{\infty}^{C}\land L_{\infty}^{D}\right). \end{array} $$
(A.8)

Even with the normalization, these two bounds still depend heavily on the number of diagram points and the maximum lifespan.
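
As a concrete illustration, take the hypothetical sizes \(n^{C}=2\) and \(n^{D}=4\) (values chosen here only for the arithmetic). The κ values above become

$$ \kappa_{1} = 2\cdot\frac{1}{2} + 4\cdot\frac{1}{4} = 2, \qquad \kappa_{\infty} = \frac{1}{2}+\frac{1}{4} = \frac{3}{4} \le \frac{2}{2\land 4} = 1, $$

and (A.7) and (A.8) read

$$ \|\mathbf{s}\boldsymbol\beta(C)-\mathbf{s}\boldsymbol\beta(D)\|_{1} \le 2W_{\infty}(C,D) + \frac{L^{C}\land L^{D}}{2}, \qquad \|\mathbf{s}\boldsymbol\beta(C)-\mathbf{s}\boldsymbol\beta(D)\|_{1} \le W_{1}(C,D) + 2\left( L_{\infty}^{C}\land L_{\infty}^{D}\right). $$

In both cases a term that does not vanish with the Wasserstein distance remains.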

A.1.3 Betti entropy curve (βe)

By Definition 3, the Betti entropy curve is defined by taking \(\psi (D;b,d,t)=\psi ^{D}(b,d) = -\frac {1}{n^{D}}\log \frac {1}{n^{D}}\) when \((b,d)\in D\), \(b\neq d\), and 0 otherwise. If the points of C and D are indexed according to the optimal matching for the \(W_{\infty }(C,D)\) distance and if \(n^{C}\land n^{D}\ge 3\) (so that \(\frac {1}{n^{C}\land n^{D}}\le \frac {1}{e}\)), then by Lemma 3

$$ \begin{array}{@{}rcl@{}} \kappa_{1} &=& \sum\limits_{i=1}^{n} \psi^{C}(b_{i},d_{i}) + \sum\limits_{i=1}^{n} \psi^{D}(b_{i},d_{i}) =-\log \frac{1}{n^{C}}-\log \frac{1}{n^{D}} \le 2\log\left( n^{C}\lor n^{D}\right),\\ \kappa_{\infty} &=& \underset{1\le i \le n}{\max} \psi^{C}(b_{i},d_{i}) + \underset{1\le i \le n}{\max} \psi^{D}(b_{i},d_{i}) \le\frac{2}{e}, \end{array} $$

On the other hand, if the points of C and D are indexed according to the optimal matching for W1(C, D), then by Lemma 3

$$ \begin{array}{@{}rcl@{}} \delta_{\infty} &\le& \underset{1\le i\le n}{\max} |\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})|\le -\frac{1}{n^{C}\land n^{D}}\log \frac{1}{n^{C}\land n^{D}}.\\ \delta_{1} &\le& \sum\limits_{i=1}^{n}|\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})|\le -\frac{n^{C}\lor n^{D}}{n^{C}\land n^{D}}\log \frac{1}{n^{C}\land n^{D}}. \end{array} $$

Hence, we may conclude by Theorem 1 that if \(n^{C}\land n^{D}\ge 3\), then

$$ \begin{array}{@{}rcl@{}} \|\boldsymbol\beta\mathbf{e}^{C}-\boldsymbol\beta\mathbf{e}^{D}\|_{1}&\le& 2\log\left( n^{C}\lor n^{D}\right)W_{\infty}(C,D) - \left( L^{C}\land L^{D}\right)\left( \frac{1}{n^{C}\land n^{D}}\log \frac{1}{n^{C}\land n^{D}}\right), \end{array} $$
(A.9)
$$ \begin{array}{@{}rcl@{}} \|\boldsymbol{\beta}\mathbf{e}^{C}-\boldsymbol{\beta}\mathbf{e}^{D}\|_{1}&\le& \frac{2W_{1}(C,D)}{e} - \left( L_{\infty}^{C}\land L_{\infty}^{D}\right)\frac{n^{C}\lor n^{D}}{n^{C}\land n^{D}}\log \frac{1}{n^{C}\land n^{D}}. \end{array} $$
(A.10)
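
The threshold \(n^{C}\land n^{D}\ge 3\) can be read off from the elementary behaviour of \(g(x) = -x\log x\) on (0, 1], taking the logarithm to be natural. The short derivation below is a reader's aid and is not a restatement of Lemma 3, which is not reproduced in this appendix.

$$ g(x) = -x\log x, \qquad g^{\prime}(x) = -\log x - 1 \ge 0 \iff x\le \frac{1}{e}, \qquad \underset{x\in (0,1]}{\max}\, g(x) = g\left(\frac{1}{e}\right) = \frac{1}{e}. $$

Since \(\frac {1}{3} < \frac {1}{e} < \frac {1}{2}\), the value \(\frac {1}{n}\) lies in the range where g is increasing exactly when n ≥ 3. On that range \(g(\frac {1}{n^{C}})\) and \(g(\frac {1}{n^{D}})\) are both at most \(g(\frac {1}{n^{C}\land n^{D}})\), and each is at most \(\frac {1}{e}\), which matches the estimate \(\kappa _{\infty }\le \frac {2}{e}\).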

A.2 Midlife-based curves

For this section, we will assume that all birth and death values are non-negative (so that b + d ≠ 0). The midlife-based curves, similar to the lifespan-based curves, benefit from continuous ψ functions. Moreover, the bounds in this section take a form similar to those of the lifespan-based counterparts. For this section, we will use the notation \({u_{i}^{D}}:=\frac {\eta _{b_{i}}+\eta _{d_{i}}}{2}\), \(U^{D}:={\sum }_{i=1}^{n^{D}}{u_{i}^{D}}\), and \(U_{\infty }^{D} := \max \limits _{i}{u_{i}^{D}}\). Finally, because we are assuming non-negativity, we also have \(2U^{D}\ge L^{D}\).
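
The quantities introduced above are simple to compute. The sketch below does so for a small diagram and checks the inequality relating \(U^{D}\) and \(L^{D}\). It assumes that \(L^{D}\) denotes the total lifespan (by analogy with \(U^{D}\); its definition is not restated in this appendix), and the helper name `midlife_summary` is hypothetical.

```python
def midlife_summary(D):
    """Return (U, U_inf, L) for a diagram D given as (birth, death) pairs.

    U     : sum of midlives (b + d) / 2
    U_inf : largest midlife
    L     : sum of lifespans d - b (assumed meaning of L^D)
    """
    midlives = [(b + d) / 2.0 for (b, d) in D]
    lifespans = [d - b for (b, d) in D]
    return sum(midlives), max(midlives), sum(lifespans)


D = [(0.0, 2.0), (1.0, 3.0), (0.5, 1.0)]  # all births and deaths non-negative
U, U_inf, L = midlife_summary(D)
print(U, U_inf, L, 2 * U >= L)
```

The inequality holds pointwise: with b ≥ 0 each point satisfies b + d ≥ d − b, and summing over the diagram gives the stated relation.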

A.2.1 Midlife curve ml(D)

The midlife curve takes \(\psi (D;b,d,t) := \psi ^{D}(b,d) = \frac {b+d}{2}\). The values are \(\kappa _{1} = U^{C} + U^{D}\le 2(U^{C}\lor U^{D})\) and \(\kappa _{\infty } = U_{\infty }^{C} + U_{\infty }^{D}\le 2(U_{\infty }^{C} \lor U_{\infty }^{D})\). We see then that for any \(\eta : C\to D\)

$$ \begin{array}{@{}rcl@{}} \delta_{\infty} &\le& \underset{1\le i\le n}{\max} \left|\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})\right|\\ &=&\underset{1\le i\le n}{\max} \left|\frac{(d_{i}+b_{i})}{2}-\frac{(\eta_{d_{i}}+\eta_{b_{i}})}{2}\right|\\ &\le&\frac{1}{2}\underset{1\le i\le n}{\max} |(d_{i}-\eta_{d_{i}})|+|(b_{i}-\eta_{b_{i}})|\\ &\le& W_{\infty}(C,D). \end{array} $$
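
The final step in the chain above is compressed. One way to fill it in, assuming that matched points are compared in the sup norm and that the indexing follows an optimal matching for \(W_{\infty }(C,D)\) (both are assumptions here, since the ground metric is not restated in this appendix), is

$$ \frac{1}{2}\left( |d_{i}-\eta_{d_{i}}| + |b_{i}-\eta_{b_{i}}|\right) \le \max\left\{ |d_{i}-\eta_{d_{i}}|,\ |b_{i}-\eta_{b_{i}}|\right\} = \left\| (b_{i},d_{i})-(\eta_{b_{i}},\eta_{d_{i}})\right\|_{\infty} \le W_{\infty}(C,D), $$

that is, an average of two non-negative numbers never exceeds their maximum, and each matched pair costs at most the bottleneck value.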

If instead the indexing follows the optimal matching for W1(C, D) we see

$$ \begin{array}{@{}rcl@{}} \delta_{1} &\le& \sum\limits_{i=1}^{n}\left|\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})\right|\\ &=&\sum\limits_{i=1}^{n}\left|\frac{(d_{i}+b_{i})}{2}-\frac{(\eta_{d_{i}}+\eta_{b_{i}})}{2}\right|\\ &\le&\frac{1}{2}\sum\limits_{i=1}^{n}|(d_{i}-\eta_{d_{i}})|+|(b_{i}-\eta_{b_{i}})|\\ &\le& W_{1}(C,D). \end{array} $$

Therefore, by Theorem 1, we conclude

$$ \begin{array}{@{}rcl@{}} \|\mathbf{ml}(C)-\mathbf{ml}(D)\|_{1}&\le& 2\left( U^{C}\lor U^{D}\right)W_{\infty}(C,D) + \left( L^{C}\land L^{D}\right)W_{\infty}(C,D), \end{array} $$
(A.11)
$$ \begin{array}{@{}rcl@{}} \|\mathbf{ml}(C)-\mathbf{ml}(D)\|_{1}&\le& 2\left( U_{\infty}^{C}\lor U_{\infty}^{D}\right)W_{1}(C,D) + \left( L_{\infty}^{C}\land L_{\infty}^{D}\right)W_{1}(C,D). \end{array} $$
(A.12)

A.2.2 Normalized midlife curve sml(D)

The normalized midlife curve takes \(\psi (D;b,d,t) := \psi ^{D}(b,d) = \frac {b+d}{{2U^{D}}}\). The values satisfy \(\kappa _{1}\le 2\) and \(\kappa _{\infty } \le 2\). Moreover, suppose \(L^{C}\le L^{D}\). We note that \(|U^{C}-U^{D}|\le {\sum }_{i=1}^{n}|{u_{i}^{C}}-{u_{i}^{D}}|\le nW_{\infty }(C,D)\).

$$ \begin{array}{@{}rcl@{}} \delta_{\infty} &\le& \underset{1\le i\le n}{\max} \left|\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})\right|\\ &\le&\underset{1\le i\le n}{\max} \frac{|{u_{i}^{C}}-{u_{i}^{D}}|}{U^{D}} +\underset{1\le i\le n}{\max} {u_{i}^{C}}\frac{|U^{C}-U^{D}|}{U^{C}U^{D}} \\ &\le&\underset{1\le i\le n}{\max} \frac{|b_{i}+d_{i}-\eta_{b_{i}}-\eta_{d_{i}}|}{L^{D}} +\underset{1\le i\le n}{\max} \frac{4(U^{C}_{\infty}\lor U^{D}_{\infty})|U^{C}-U^{D}|}{L^{C}L^{D}}\\ &\le& \frac{2W_{\infty}(C,D)}{L^{C}\lor L^{D}} + \frac{4n(U^{C}_{\infty}\lor U^{D}_{\infty})W_{\infty}(C,D)}{\left( L^{C}\land L^{D}\right)(L^{C}\lor L^{D})}. \end{array} $$

Moreover,

$$ \begin{array}{@{}rcl@{}} \delta_{1} &\le& \sum\limits_{i=1}^{n}|\psi^{C}(b_{i},d_{i})-\psi^{D}(\eta_{b_{i}},\eta_{d_{i}})|\\ &\le&\sum\limits_{i=1}^{n}\frac{|{u_{i}^{C}}-{u_{i}^{D}}|}{U^{D}} +\sum\limits_{i=1}^{n}{u_{i}^{C}}\frac{|U^{C}-U^{D}|}{U^{C}U^{D}} \\ &\le&\sum\limits_{i=1}^{n}\frac{|b_{i}+d_{i}-\eta_{b_{i}}-\eta_{d_{i}}|}{L^{D}} +\sum\limits_{i=1}^{n}\frac{2|U^{C}-U^{D}|}{L^{D}}\\ &\le& \frac{4W_{1}(C,D)}{L^{C}\lor L^{D}}. \end{array} $$

Therefore, by Theorem 1, we conclude

$$ \begin{array}{@{}rcl@{}} \|\mathbf{sml}(C)-\mathbf{sml}(D)\|_{1}&\le& 2W_{\infty}(C,D) + \left( L^{C}\land L^{D}\right)\left( \frac{2W_{\infty}(C,D)}{L^{C}\lor L^{D}} + \frac{4n(U^{C}_{\infty}\lor U^{D}_{\infty}) W_{\infty}(C,D)}{\left( L^{C}\land L^{D}\right)(L^{C}\lor L^{D})}\right), \\ &\le&4W_{\infty}(C,D)\left( 1+\frac{U_{\infty}^{C}\lor U_{\infty}^{D}}{L^{C}\lor L^{D}}n\right) \end{array} $$
(A.13)
$$ \begin{array}{@{}rcl@{}} \|\mathbf{sml}(C)-\mathbf{sml}(D)\|_{1}&\le& 2W_{1}(C,D) + L_{\infty}^{C}\land L_{\infty}^{D}\frac{4W_{1}(C,D)}{L^{C}\lor L^{D}} \le 6W_{1}(C,D). \end{array} $$
(A.14)

A.2.3 Midlife entropy curve mle(D)

In this final section we will discuss the bounds for the midlife entropy curve, which uses \(\psi (D;b,d,t):=\psi ^{D}(b,d)=-\frac {b+d}{U^{D}}\log \frac {b+d}{U^{D}}\). Straightforward calculation reveals \(\kappa _{1} = \log n^{C} + \log n^{D}\) and \(\kappa _{\infty }\le \frac {2}{e}\). The bounds recovered for midlife entropy are the same as for life entropy. That is, if \(r_{\infty }(C,D),r_{1}(C,D)\le \frac {1}{2e}\), Theorem 1 guarantees that

$$ \begin{array}{@{}rcl@{}} \|\mathbf{mle}(C)-\mathbf{mle}(D)\|_{1}&\le& 2\log\left( n^{C}\lor n^{D}\right)W_{\infty}(C,D) - \left( L^{C}\land L^{D}\right)2r_{\infty}(C,D)\log{2r_{\infty}(C,D)}, \end{array} $$
(A.15)
$$ \begin{array}{@{}rcl@{}} \|\mathbf{mle}(C)-\mathbf{mle}(D)\|_{1}&\le& \frac{2}{e}W_{1}(C,D) - \left( L_{\infty}^{C}\land L_{\infty}^{D}\right)2\left( n^{C}\lor n^{D}\right)r_{1}(C,D)\log{2r_{1}(C,D)}. \end{array} $$
(A.16)

Appendix B: Proof of Lemma 4 in Section 5.4

For convenience, we will restate the lemma here:

Lemma 4 (restated)

Let \(f(t) = | \psi _{1}(t) \chi _{[b_{1},d_{1})}(t) -\psi _{2}(t) \chi _{[b_{2},d_{2})}(t) |\), where \(\psi _{i}(t) = \min \limits \{ t-b_{i}, d_{i} - t \}\) for i = 1, 2. Then,

$$ \|f\|_{\infty} \leq |d_{2}-d_{1}| \lor |b_{2} - b_{1}|. $$

Proof

We need estimates similar to those in Lemma 1. Consider

$$f(t) = | \psi_{1}(t) \chi_{[b_{1},d_{1})}(t) -\psi_{2}(t) \chi_{[b_{2},d_{2})}(t) |,$$

where \(\psi _{i}(t) = \min \limits \{ t-b_{i}, d_{i} - t \}\) for i = 1, 2. Recall that ψi(t) can also be expressed as

$$\psi_{i}(t) = \left\{\begin{array}{lll} 0 & \text{if } t\notin (b_{i},d_{i})\\ t-b_{i} & \text{if } t\in [b_{i},\frac{b_{i}+d_{i}}{2})\\ d_{i}-t & \text{if } t\in[\frac{b_{i}+d_{i}}{2}, d_{i}) \end{array}\right.. $$

Finally, define \(m_{i} = \frac {b_{i}+d_{i}}{2}\).

Case 1: \(b_{1}\le d_{1}\le b_{2}\le d_{2}\) and \([b_{1},d_{1})\cap [b_{2},d_{2}) = \emptyset \). Then,

$$ \left\{\begin{array}{ll} \max_{t\in [b_{1},d_{1})} | \psi_{1}(t) | = (d_{1} - b_{1})/2 \leq (b_{2} - b_{1})/2 \\ \max_{t\in [b_{2},d_{2})} | \psi_{2}(t) | = (d_{2} - b_{2})/2 \leq (d_{2} - d_{1})/2 \end{array}\right.. $$
(A.17)

Thus, \(f(t)\le |b_{2}-b_{1}|\lor |d_{2}-d_{1}|\).

Case 2: \(b_{1}\le b_{2}\le d_{1}\le d_{2}\) and \([b_{1},d_{1})\cap [b_{2},d_{2}) = [b_{2},d_{1})\). Then, we have three maximums to consider.

$$ \left\{\begin{array}{lll} \max_{t\in [b_{1},b_{2}]} | \psi_{1}(t) | \\ \max_{t\in [b_{2},d_{1})} | \psi_{1}(t) - \psi_{2}(t) | \\ \max_{t\in (d_{1},d_{2})} | \psi_{2}(t) | \end{array}\right.. $$
(A.18)

For the first, there are two subcases: \(b_{1}\le m_{1}\le b_{2}\) and \(b_{1}\le b_{2}\le m_{1}\). In either case we see

$$ \left\{\begin{array}{lll} \max_{t\in [b_{1},m_{1}]} t - b_{1} = m_{1} - b_{1} = \frac{(d_{1}-b_{1})}{2} \\ \max_{t\in [m_{1},b_{2})} d_{1} - t = d_{1} - m_{1} = \frac{(d_{1}-b_{1})}{2} \\ \max_{t\in [b_{1},b_{2})} t - b_{1} = b_{2} - b_{1} \end{array}\right. $$
(A.19)

Observe that for \(b_{1}\le m_{1}\le b_{2}\), we have \(\frac {(d_{1}-b_{1})}{2} - (b_{2} - b_{1}) = m_{1} - b_{2} \leq 0 \). Thus, in each of the cases above, \(\max \limits _{t\in [b_{1},b_{2}]} | \psi _{1}(t) | \leq |b_{2} - b_{1}|\). Similarly, one may verify that when \(t\in (d_{1},d_{2})\), \(\max \limits _{t\in (d_{1},d_{2})} | \psi _{2}(t) | \leq |d_{2} - d_{1}|\).

When \(t\in [b_{2},d_{1}) = [b_{1},d_{1})\cap [b_{2},d_{2})\), there are 5 subcases to consider.

  • i) \(m_{1}\le b_{2}\le d_{1}\le m_{2}\);

  • ii) \(b_{2}\le m_{1}\le d_{1}\le m_{2}\);

  • iii) \(m_{1}\le b_{2}\le m_{2}\le d_{1}\);

  • iv) \(b_{2}\le m_{1}\le m_{2}\le d_{1}\);

  • v) \(b_{2}\le m_{2}\le m_{1}\le d_{1}\).

One may verify that each case is bounded by \(|d_{2}-d_{1}|\lor |b_{2}-b_{1}|\).

Case 3: \(b_{1}\le b_{2}\le d_{2}\le d_{1}\). We have three maximums to consider.

$$ \left\{\begin{array}{lll} \max_{t\in [b_{1},b_{2}]} | \psi_{1}(t) | \\ \max_{t\in [b_{2},d_{2})} | \psi_{1}(t) - \psi_{2}(t) | \\ \max_{t\in (d_{2},d_{1})} | \psi_{1}(t) | \end{array}\right.. $$
(A.20)

The first and third maximums follow the same arguments as in Case 2. For the middle maximum, we have four subcases:

  • i) \(m_{1}\in (b_{1},b_{2}]\);

  • ii) \(m_{1}\in (b_{2},m_{2}]\);

  • iii) \(m_{1}\in (m_{2},d_{2})\);

  • iv) \(m_{1}\in (d_{2},d_{1})\).

For i), note that \(d_{1}-b_{2}\le d_{1}-m_{1} = m_{1}-b_{1}\); thus, \(d_{1}-b_{2}\le b_{2}-b_{1}\). We complete this case by noting that for all \(t\in [b_{2},d_{2})\), \(\psi _{1}(t) = d_{1}-t\ge \psi _{2}(t)\ge 0\). Moreover, \(\psi _{1}\) is decreasing over the interval \([b_{2},d_{2})\). Therefore, the difference \(|\psi _{1}(t)-\psi _{2}(t)|\) is maximized at \(t=b_{2}\), where it equals precisely \(d_{1}-b_{2}\).

One may verify the remaining cases satisfy the conclusion. □
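
The case analysis can be sanity-checked numerically. The sketch below samples random pairs of intervals, approximates the supremum of f on a grid that also contains the kinks of both tent functions, and counts how often the bound of the lemma is exceeded. It is an illustrative check rather than part of the proof; the helper names, sampling ranges, and tolerance are arbitrary choices made here.

```python
import random


def tent(b, d, t):
    """psi(t) * chi_[b, d)(t) with psi(t) = min(t - b, d - t)."""
    return min(t - b, d - t) if b <= t < d else 0.0


def sup_f(b1, d1, b2, d2, samples=401):
    """Approximate sup_t f(t) on a grid that also contains the kinks of both tents."""
    lo, hi = min(b1, b2), max(d1, d2)
    grid = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    grid += [b1, (b1 + d1) / 2, d1, b2, (b2 + d2) / 2, d2]
    return max(abs(tent(b1, d1, t) - tent(b2, d2, t)) for t in grid)


def random_interval(rng):
    """Draw a non-degenerate interval [b, d) with b in [0, 5]."""
    b = rng.uniform(0.0, 5.0)
    return b, b + rng.uniform(0.1, 5.0)


rng = random.Random(0)  # fixed seed so the check is reproducible
violations = 0
for _ in range(500):
    b1, d1 = random_interval(rng)
    b2, d2 = random_interval(rng)
    bound = max(abs(d2 - d1), abs(b2 - b1))
    if sup_f(b1, d1, b2, d2) > bound + 1e-9:  # small slack for rounding
        violations += 1

print("violations:", violations)
```

The nested configuration \([b_{1},d_{1}) = [0,4)\), \([b_{2},d_{2}) = [1,3)\) from Case 3 is a useful spot check: there the supremum of f equals 1, which is exactly \(|d_{2}-d_{1}|\lor |b_{2}-b_{1}|\), so the bound is attained.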

About this article

Cite this article

Chung, YM., Lawson, A. Persistence Curves: A canonical framework for summarizing persistence diagrams. Adv Comput Math 48, 6 (2022). https://doi.org/10.1007/s10444-021-09893-4

  • DOI: https://doi.org/10.1007/s10444-021-09893-4
