ABSTRACT
Graph composition arises in a variety of practical applications. In drug development, for instance, understanding possible drug interactions requires merging known networks and examining the topological variants that arise from such composition. Similarly, the design of sensor nets may use existing network infrastructures, and the superposition of one network on another can help with network design and optimisation. The problem of network composition has not received much attention in algorithm and database research. Here, we work with biological networks encoded in the Systems Biology Markup Language (SBML), which is based on XML syntax. We focus on XML merging and examine the algorithmic and performance challenges we encountered in our work, together with possible solutions to the graph merge problem. We show that our XML graph merge solution performs well in practice and improves on the existing toolsets. This points to directions for future work and to a plan of research that aims to implement graph merging primitives in a database engine.
- Biochemical network matching and composition