Abstract
Data volumes grow daily, and most data are stored in databases after manual transformations and derivations. The behavior of the stored data is unpredictable. Furthermore, the data are collected from various kinds of sources: physical, geological, environmental, chemical, and biological. A relational database management system (RDBMS) provides a high-level data interface. Inside an RDBMS, source and intermediate data items are relations, tuples, and attributes. In the context of data provenance, this paper describes how data are produced. When data are retrieved from an RDBMS using queries, it is sometimes necessary to trace an output data product back to its source values if that output appears to have an unexpected value. The aim of this paper is to show the source values for output data using a query inversion approach, and to propose a technique for creating an inverse query for queries with aggregation functions, multiple (join, set) operations, and sub-queries.
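To make the idea of tracing an aggregated output back to its source values concrete, the following is a minimal sketch and not the technique proposed in the paper. It assumes a hypothetical table `readings(id, station, value)` and a hand-written inverse query for a single `SUM ... GROUP BY` query; the table, column names, and data are illustrative assumptions only.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, station TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO readings (station, value) VALUES (?, ?)",
        [("A", 10.0), ("A", 12.0), ("B", 11.0), ("A", 950.0), ("B", 9.0)],
    )
    return conn

# Forward query: one aggregated output tuple per station.
FORWARD = (
    "SELECT station, SUM(value) AS total "
    "FROM readings GROUP BY station ORDER BY station"
)

# Hand-written inverse query (an assumption, not derived automatically):
# the grouping attribute of the output tuple selects the source tuples
# that contributed to that group's aggregate.
INVERSE = "SELECT id, station, value FROM readings WHERE station = ? ORDER BY id"

def forward(conn):
    return conn.execute(FORWARD).fetchall()

def inverse(conn, station):
    return conn.execute(INVERSE, (station,)).fetchall()

conn = build_db()
totals = forward(conn)        # aggregated outputs
sources = inverse(conn, "A")  # source tuples behind the output for station "A"

for row in totals:
    print("output:", row)
for row in sources:
    print("source:", row)
```

In this sketch the total for station A looks unexpectedly large, and the inverse query returns the three source rows behind it, which exposes the single outlying reading. Queries with joins, set operations, or sub-queries need a more involved inverse than this single `WHERE` clause.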
Acknowledgements
We would like to thank all the colleagues at the School of Software Engineering of NRU HSE for their feedback and useful recommendations that contributed to bringing this paper to its final form.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Salah Uddin, M., Alexandrov, D.V. (2019). A Query Inversion Technique for Detection of Unexpected Values in Relational Databases. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 869. Springer, Cham. https://doi.org/10.1007/978-3-030-01057-7_1
DOI: https://doi.org/10.1007/978-3-030-01057-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01056-0
Online ISBN: 978-3-030-01057-7
eBook Packages: Intelligent Technologies and Robotics (R0)