A new ontology-based similarity approach for measuring caching coverages provided by mediation systems

  • Regular Paper
Knowledge and Information Systems

Abstract

Most mediation systems rely on a caching policy to overcome performance challenges. One of the most widely adopted strategies is semantic caching, so called because the cache stores the descriptions of all submitted queries. Despite the name, these caches are not truly semantic: to retrieve responses, they compare the syntax of the cached queries with the syntax of the new query. This can cause significant delays, especially when many queries are stored in the cache. In this work, we propose a new ontology-based approach for computing the semantic similarity between two given queries, and we also provide a new algorithm that filters out all cache regions that do not semantically cover a user query. As a result, the cache can be used effectively and quickly even when it contains a large number of regions, because only the most beneficial regions are processed to retrieve data from the cache.
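To make the idea of filtering cache regions by semantic similarity concrete, here is a minimal sketch. It is not the algorithm proposed in the article, whose details are not given in the abstract. It assumes a toy is-a taxonomy (`PARENT`), uses the well-known Wu-Palmer measure as a stand-in similarity, represents each query and cache region as a set of concept names, and applies a hypothetical threshold of 0.7. All names and values are illustrative assumptions.

```python
# Toy is-a taxonomy (child -> parent). Hypothetical; not taken from the article.
PARENT = {
    "entity": None,
    "location": "entity",
    "city": "location",
    "country": "location",
    "person": "entity",
    "student": "person",
}


def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first ancestor of b that also subsumes a is the least common subsumer.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def query_similarity(query, region):
    """Average, over the query's concepts, of the best match found in the region."""
    return sum(max(wu_palmer(q, r) for r in region) for q in query) / len(query)


def filter_regions(query, regions, threshold=0.7):
    """Keep regions whose similarity reaches the (assumed) threshold, best first."""
    scored = [(name, query_similarity(query, concepts))
              for name, concepts in regions.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical cache: each region is described by the concepts it covers.
regions = {
    "R1": {"city", "country"},
    "R2": {"student"},
    "R3": {"location"},
}
print(filter_regions({"city"}, regions))  # [('R1', 1.0), ('R3', 0.8)]
```

In this toy setting, region R2 is discarded before any data is fetched, which mirrors the abstract's point that only the most beneficial regions need to be processed. A real system would draw the taxonomy from an ontology and would likely use a richer query representation than a set of concept names.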

Author information

Corresponding author

Correspondence to Ouafa Ajarroud.

About this article

Cite this article

Ajarroud, O., Zellou, A. & Idri, A. A new ontology-based similarity approach for measuring caching coverages provided by mediation systems. Knowl Inf Syst 66, 959–987 (2024). https://doi.org/10.1007/s10115-023-01974-8

  • DOI: https://doi.org/10.1007/s10115-023-01974-8
