research-article

Evaluating top-k queries with inconsistency degrees

Published: 01 July 2020

Abstract

We study the problem of augmenting relational tuples with inconsistency awareness and answering top-k queries under a set of denial constraints (DCs). We define a notion of inconsistent tuples with respect to a set of DCs and two measures of inconsistency degree, which consider single and multiple violations of constraints, respectively. To compute these measures, we leverage two models of provenance, namely why-provenance and provenance polynomials. We investigate top-k queries that rank the answer tuples by their inconsistency degrees. Since one of our measures is monotonic and the other is non-monotonic, we design an integrated top-k algorithm that computes the top-k results of a query w.r.t. both inconsistency measures. By means of an extensive experimental study, we gauge the effectiveness of inconsistency-aware query answering and the efficiency of our algorithm with respect to a baseline in which query results are fully computed and ranked afterwards.
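To make the baseline mentioned above concrete, the following is a minimal sketch of "compute everything, then rank". It is not the paper's algorithm and does not use provenance. The toy relation, the two denial constraints, and the helper names (`degrees`, `baseline_top_k`) are all assumptions made for illustration, and the two counts (distinct constraints violated versus total violations) are only one plausible reading of "single and multiple violations", since the abstract does not give the formal definitions.

```python
from itertools import combinations

# Toy relation; attribute names and values are made up for illustration.
ROWS = [
    {"id": 1, "zip": "69001", "city": "Lyon", "salary": 50, "tax": 10},
    {"id": 2, "zip": "69001", "city": "Paris", "salary": 60, "tax": 5},
    {"id": 3, "zip": "75001", "city": "Paris", "salary": 40, "tax": 8},
    {"id": 4, "zip": "69001", "city": "Lyon", "salary": 70, "tax": 12},
]

# A denial constraint is modelled here as a predicate over a pair of tuples
# that returns True when the pair is forbidden (hypothetical constraints).
def same_zip_different_city(t, u):
    return t["zip"] == u["zip"] and t["city"] != u["city"]

def higher_salary_lower_tax(t, u):
    return (t["salary"] > u["salary"] and t["tax"] < u["tax"]) or (
        u["salary"] > t["salary"] and u["tax"] < t["tax"]
    )

CONSTRAINTS = [same_zip_different_city, higher_salary_lower_tax]

def degrees(rows, constraints):
    """Return {id: (distinct constraints violated, total violations)}.

    Both counts are assumed stand-ins for the two measures in the abstract.
    """
    violated = {r["id"]: set() for r in rows}
    total = {r["id"]: 0 for r in rows}
    for index, dc in enumerate(constraints):
        for t, u in combinations(rows, 2):
            if dc(t, u):
                for r in (t, u):
                    violated[r["id"]].add(index)
                    total[r["id"]] += 1
    return {i: (len(violated[i]), total[i]) for i in total}

def baseline_top_k(rows, constraints, k, measure=0):
    """Score every tuple first, then sort and keep the k least inconsistent."""
    deg = degrees(rows, constraints)
    ranked = sorted(rows, key=lambda r: (deg[r["id"]][measure], r["id"]))
    return [r["id"] for r in ranked[:k]]

if __name__ == "__main__":
    print(degrees(ROWS, CONSTRAINTS))
    print(baseline_top_k(ROWS, CONSTRAINTS, k=2))
```

The sketch scores every tuple before ranking, which is the cost that a top-k algorithm with early termination tries to avoid. It ranks base tuples directly; ranking answers of a join query would additionally require tracking which base tuples contribute to each answer.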



  • Published in

    Proceedings of the VLDB Endowment, Volume 13, Issue 12
    August 2020
    1710 pages
    ISSN: 2150-8097

    Publisher

    VLDB Endowment

