Abstract
Most mediation systems rely on a caching policy to overcome their performance challenges. One of the most widely adopted strategies is semantic caching, so named because the cache stores the descriptions of all submitted queries. Despite the name, these caches are not really based on semantics: to retrieve responses, they compare the syntax of the cached queries with the syntax of the new query. This can lead to significant delays, especially when many requests are stored in the cache. In this work, we propose a new ontology-based approach to compute the semantic similarity between two given queries, and we also provide a new algorithm that filters out all regions of the cache that do not semantically cover a user query. In this way, cache use remains both optimal and fast despite the large number of regions, because only the most beneficial regions are processed to retrieve data from the cache.
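The abstract does not specify the similarity measure or the filtering algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a hypothetical toy taxonomy (`PARENT`), reduces each cache region to a single concept (a strong simplification), and substitutes the well-known Wu-Palmer measure for the paper's own similarity measure. The names `wu_palmer`, `filter_regions`, and the `threshold` parameter are illustrative and do not come from the article.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "person": "entity",
    "student": "person",
    "professor": "person",
    "place": "entity",
    "city": "place",
}


def ancestors(concept: str) -> List[str]:
    """Return the path from the concept up to the root, concept first."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(concept: str) -> int:
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(concept))


def wu_palmer(a: str, b: str) -> float:
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    up_b = set(ancestors(b))
    # The first shared node on the way up from `a` is the deepest common ancestor.
    lcs = next(c for c in ancestors(a) if c in up_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def filter_regions(
    query_concept: str, regions: Dict[str, str], threshold: float
) -> List[str]:
    """Assumed filtering rule: keep regions whose concept is similar enough
    to the query concept, ordered from most to least similar."""
    scored = [(wu_palmer(query_concept, c), name) for name, c in regions.items()]
    kept = [(score, name) for score, name in scored if score >= threshold]
    return [name for score, name in sorted(kept, key=lambda t: -t[0])]


if __name__ == "__main__":
    cache = {"R1": "professor", "R2": "city", "R3": "student"}
    print(filter_regions("student", cache, threshold=0.5))
```

In this sketch, regions that fall below the threshold are never inspected further, which mirrors the stated goal of processing only the most beneficial regions. A real mediation cache would compare full query descriptions rather than single concepts.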
About this article
Cite this article
Ajarroud, O., Zellou, A. & Idri, A. A new ontology-based similarity approach for measuring caching coverages provided by mediation systems. Knowl Inf Syst 66, 959–987 (2024). https://doi.org/10.1007/s10115-023-01974-8
DOI: https://doi.org/10.1007/s10115-023-01974-8