Benchmarking RDF Query Engines and Instance Matching Systems

Sakr, Sherif; Wylot, Marcin; Mutharaju, Raghava; Le Phuoc, Danh; Fundulaki, Irini

doi:10.1007/978-3-319-73515-3_7

Abstract

Standards and benchmarks have traditionally been the main tools for formally defining, and demonstrably illustrating, how adequately systems address new challenges. In this chapter, we discuss benchmarks for RDF query engines and instance matching systems. In practice, benchmarks inform users of the strengths and weaknesses of competing tools and approaches; more importantly, they advance technology by giving both academia and industry clear targets for performance and functionality.

Author information

Authors and Affiliations

College of Public Health & Health Informatics, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
Sherif Sakr
Fakultät IV - Open Distributed Systems, Technische Universität Berlin, Berlin, Germany
Marcin Wylot & Danh Le Phuoc
Knowledge Discovery Lab, GE Global Research, Niskayuna, New York, USA
Raghava Mutharaju
Foundation for Research and Technology - Hellas (FORTH), Institute of Computer Science (ICS), Heraklion, Greece
Irini Fundulaki

About this chapter

Cite this chapter

Sakr, S., Wylot, M., Mutharaju, R., Le Phuoc, D., Fundulaki, I. (2018). Benchmarking RDF Query Engines and Instance Matching Systems. In: Linked Data. Springer, Cham. https://doi.org/10.1007/978-3-319-73515-3_7

DOI: https://doi.org/10.1007/978-3-319-73515-3_7
Published: 02 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73514-6
Online ISBN: 978-3-319-73515-3
eBook Packages: Computer Science, Computer Science (R0)
