Centroid Clustering of Cellular Lineage Trees

  • Conference paper
Information Technology in Bio- and Medical Informatics (ITBAM 2014)

Abstract

Trees representing hierarchical knowledge are prevalent in biology and medicine. Examples include phylogenetic trees, the hierarchical structure of biological tissues, and cell lines. The increasing throughput of techniques that generate such trees poses new challenges for the analysis of tree ensembles. Typical tasks include determining common patterns of lineage decisions in cellular differentiation trees. Partitioning the dataset is crucial for further analysis of the cellular genealogies. In this work, we develop a method to cluster labeled binary tree structures. Furthermore, for every cluster, our method selects a centroid tree that captures the characteristic mitosis patterns of the group. We evaluate this technique on synthetic data and apply it to experimental trees that embody the lineages of differentiating cells under specific conditions over time. The results for the cell lineage trees are thoroughly interpreted with expert domain knowledge.
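
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea of partitioning items and choosing one member per cluster as its representative. It assumes that pairwise distances between trees have already been computed by some tree distance and stored in a matrix, and it uses a generic k-medoids-style loop, which is not necessarily the authors' method. The function name `k_medoids`, the initial medoid indices, and the toy matrix `toy_dist` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def k_medoids(
    dist: Sequence[Sequence[float]],
    medoids: List[int],
    max_iter: int = 100,
) -> Tuple[List[int], List[int]]:
    """Generic k-medoids-style loop on a precomputed distance matrix.

    dist[i][j] is an assumed, already computed distance between items i and j
    (for example, between two trees). `medoids` holds the initial
    representative indices, one per cluster.
    """
    n = len(dist)
    medoids = list(medoids)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster of its nearest medoid.
        labels = [
            min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(n)
        ]
        # Update step: the new representative of a cluster is the member
        # with the smallest total distance to all members of that cluster.
        new_medoids = []
        for c in range(len(medoids)):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old representative if a cluster became empty.
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:
            break  # representatives no longer change
        medoids = new_medoids
    return labels, medoids


# Toy symmetric distance matrix (hypothetical values): items 0-2 are close
# to each other, and items 3-4 are close to each other.
toy_dist = [
    [0, 1, 2, 9, 9],
    [1, 0, 1, 9, 9],
    [2, 1, 0, 9, 9],
    [9, 9, 9, 0, 1],
    [9, 9, 9, 1, 0],
]
labels, medoids = k_medoids(toy_dist, [0, 3])
```

Restricting each representative to an actual member of the dataset means the loop only ever needs the distance matrix, which is convenient when the items are trees rather than vectors and no average can be computed directly.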

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Khakhutskyy, V. et al. (2014). Centroid Clustering of Cellular Lineage Trees. In: Bursa, M., Khuri, S., Renda, M.E. (eds) Information Technology in Bio- and Medical Informatics. ITBAM 2014. Lecture Notes in Computer Science, vol 8649. Springer, Cham. https://doi.org/10.1007/978-3-319-10265-8_2

  • DOI: https://doi.org/10.1007/978-3-319-10265-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10264-1

  • Online ISBN: 978-3-319-10265-8

  • eBook Packages: Computer Science, Computer Science (R0)
