DOI: 10.1145/1754239.1754284
research-article

Biochemical network matching and composition

Published: 22 March 2010

ABSTRACT

Graph composition has a variety of practical applications. In drug development, for instance, understanding possible drug interactions requires merging known networks and examining the topological variants that arise from such composition. Similarly, the design of sensor nets may use existing network infrastructures, and the superposition of one network on another can help with network design and optimisation. The problem of network composition has not received much attention in algorithm and database research. Here, we work with biological networks encoded in the Systems Biology Markup Language (SBML), which is based on XML syntax. We focus on XML merging and examine the algorithmic and performance challenges we encountered in our work, together with possible solutions to the graph merge problem. We show that our XML graph merge solution performs well in practice and improves on the existing toolsets. This leads us to directions for future work and a research plan that aims to implement graph merging primitives in a database engine.
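To make the idea of merging two XML-encoded networks concrete, the following is a minimal sketch and not the method of the paper. It assumes that species and reactions sharing an `id` denote the same entity, and it uses simplified SBML-like documents without the XML namespace that real SBML files carry. The function name `merge_by_id` and both sample models are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two tiny SBML-like models (hypothetical; real SBML declares a namespace).
MODEL_A = """<sbml><model id="a">
  <listOfSpecies>
    <species id="glucose"/>
    <species id="g6p"/>
  </listOfSpecies>
  <listOfReactions>
    <reaction id="hexokinase"/>
  </listOfReactions>
</model></sbml>"""

MODEL_B = """<sbml><model id="b">
  <listOfSpecies>
    <species id="g6p"/>
    <species id="f6p"/>
  </listOfSpecies>
  <listOfReactions>
    <reaction id="pgi"/>
  </listOfReactions>
</model></sbml>"""


def merge_by_id(xml_a, xml_b):
    """Union the species and reactions of two models, keyed on the id attribute.

    Assumption: equal ids mean equal entities. Elements from xml_b whose id
    already occurs in xml_a are skipped; all others are appended to xml_a.
    """
    root_a = ET.fromstring(xml_a)
    root_b = ET.fromstring(xml_b)
    model_a = root_a.find("./model")
    for list_tag in ("listOfSpecies", "listOfReactions"):
        source = root_b.find(f"./model/{list_tag}")
        if source is None:
            continue  # nothing to merge for this list
        target = model_a.find(list_tag)
        if target is None:
            # The first model lacks this list, so create an empty one.
            target = ET.SubElement(model_a, list_tag)
        seen = {child.get("id") for child in target}
        for child in source:
            if child.get("id") not in seen:
                target.append(child)
                seen.add(child.get("id"))
    return root_a


merged = merge_by_id(MODEL_A, MODEL_B)
species_ids = [s.get("id") for s in merged.find("./model/listOfSpecies")]
reaction_ids = [r.get("id") for r in merged.find("./model/listOfReactions")]
```

In this sketch the shared species `g6p` appears once in the merged model, which is what joins the two reaction networks into a single graph. Matching purely on `id` is a deliberate simplification; entities that are named differently in the two inputs would not be matched.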

    • Published in

      EDBT '10: Proceedings of the 2010 EDBT/ICDT Workshops
      March 2010
      290 pages
      ISBN: 9781605589909
      DOI: 10.1145/1754239

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 7 of 10 submissions, 70%
