Abstract
Graph database management systems (GDBs) are gaining popularity. They are used to analyze the huge graph datasets that arise naturally in many application areas as a model of interrelated data. The objective of this paper is to raise a new topic of discussion in the benchmarking community and to give practitioners a set of basic guidelines for GDB benchmarking. We strongly believe that GDBs will become an important player in the data analysis market and that, as a result, their performance and capabilities will also become important. For this reason, we discuss the aspects that we consider most important: the characteristics of the graphs to be included in the benchmark, the characteristics of the queries that matter in graph analysis applications, and the evaluation workbench.
The members of DAMA-UPC thank the Ministry of Science and Innovation of Spain and Generalitat de Catalunya for grant numbers TIN2009-14560-C03-03 and GRC-1087, respectively.
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Dominguez-Sal, D., Martinez-Bazan, N., Muntes-Mulero, V., Baleta, P., Larriba-Pey, J.L. (2011). A Discussion on the Design of Graph Database Benchmarks. In: Nambiar, R., Poess, M. (eds) Performance Evaluation, Measurement and Characterization of Complex Systems. TPCTC 2010. Lecture Notes in Computer Science, vol 6417. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18206-8_3
DOI: https://doi.org/10.1007/978-3-642-18206-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18205-1
Online ISBN: 978-3-642-18206-8
eBook Packages: Computer Science (R0)