Editorial Notes
A corrigendum was issued for this paper on March 31, 2022. You can download the corrigendum from the Supplemental Material section of this citation page.
Abstract
We present a new approach to e-matching based on relational join; in particular, we apply recent database query execution techniques to guarantee worst-case optimal run time. Compared to the conventional backtracking approach that always searches the e-graph "top down", our new relational e-matching approach can better exploit pattern structure by searching the e-graph according to an optimized query plan. We also establish the first data complexity result for e-matching, bounding run time as a function of the e-graph size and output size. We prototyped and evaluated our technique in the state-of-the-art egg e-graph framework. Compared to a conventional baseline, relational e-matching is simpler to implement and orders of magnitude faster in practice.
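To illustrate the idea of e-matching as a relational join, here is a minimal sketch. It assumes a toy e-graph stored as one relation per function symbol, where each tuple holds an e-class id followed by the child e-class ids. The pattern f(x, g(x)), the integer e-class ids, and the names `ENODES`, `build_relations`, and `match_f_x_g_x` are all hypothetical and chosen only for illustration. The join is a plain hash lookup, not the worst-case optimal join algorithm the abstract refers to, and it is not the egg implementation.

```python
from collections import defaultdict

# Hypothetical toy e-graph: (e-class id, symbol, child e-class ids).
ENODES = [
    (1, "f", (2, 3)),  # e-class 1 contains f(2, 3)
    (3, "g", (2,)),    # e-class 3 contains g(2)
    (4, "f", (2, 5)),  # e-class 4 contains f(2, 5)
    (5, "g", (6,)),    # e-class 5 contains g(6)
    (2, "a", ()),      # e-class 2 contains the constant a
    (6, "b", ()),      # e-class 6 contains the constant b
]


def build_relations(enodes):
    """Build one relation per symbol; each tuple is (eclass, child_1, ..., child_k)."""
    relations = defaultdict(set)
    for eclass, symbol, children in enodes:
        relations[symbol].add((eclass, *children))
    return relations


def match_f_x_g_x(relations):
    """Match the pattern f(x, g(x)) as the conjunctive query
    Q(root, x) :- R_f(root, x, y), R_g(y, x).
    """
    r_g = relations["g"]  # a set of (y, x) tuples doubles as a hash index
    matches = []
    for root, x, y in relations["f"]:
        # The shared variable x is part of the lookup key, so f-nodes whose
        # second child has no g-node applied to the same x are rejected at once.
        if (y, x) in r_g:
            matches.append((root, x))
    return sorted(matches)


if __name__ == "__main__":
    print(match_f_x_g_x(build_relations(ENODES)))
```

In this sketch the equality constraint on the repeated variable x becomes part of the join condition. A top-down backtracking search would instead visit each f-node, then each g-node in its second child's e-class, and only afterwards compare the two bindings of x.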
Supplemental Material
Corrigendum to "Relational e-matching" by Zhang et al., Proceedings of the ACM on Programming Languages, Volume 6, Issue POPL (PACMPL 6:POPL).