Abstract
Datalog has become a popular implementation language for large-scale, real-world applications, including bug finders, network analysis tools, and disassemblers. These applications express complex behaviour with hundreds of relations and rules, and they often require a non-deterministic choice among the tuples of a relation in order to express worklist algorithms.
This experience report describes the implementation of a choice construct in the Datalog engine Soufflé. With the choice construct, worklist algorithms such as spanning-tree construction can be expressed in a few lines of code. We highlight the differences between rule-based choice, as described in prior work, and the relation-based choice introduced by this work. We show that a choice construct enables certain worklist algorithms to be computed up to 10k\(\times\) faster than without a choice construct.
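To make the idea of relation-based choice concrete, the following Python sketch mimics it outside of Datalog. It is an illustration under stated assumptions, not Soufflé's implementation or syntax: the relation is modelled as a dictionary keyed by the child node, so at most one parent tuple can exist per child, and the first candidate tuple derived for a key is kept while later ones are rejected. The function name `spanning_tree`, the edge list, and the convention that the root is its own parent are all hypothetical choices made for this example.

```python
from collections import deque


def spanning_tree(edges, root):
    """Return a {child: parent} map for every node reachable from root.

    The map plays the role of a relation with a functional dependency
    child -> parent: inserting a second tuple for an existing child is refused.
    """
    # Adjacency lists built from the directed edge tuples.
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)

    # Assumption for this sketch: the root is recorded as its own parent.
    parent = {root: root}
    worklist = deque([root])

    while worklist:
        u = worklist.popleft()
        for v in successors.get(u, []):
            # The "choice": only the first parent derived for v is accepted.
            if v not in parent:
                parent[v] = u
                worklist.append(v)
    return parent


# Hypothetical example graph.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (2, 4)]
tree = spanning_tree(edges, root=1)
print(tree)
```

Which parent a node receives depends on the order in which candidate tuples are derived, which is the non-determinism the construct is meant to capture; any outcome that respects the one-parent-per-child constraint is a valid spanning tree.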
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Hu, X., Karp, J., Zhao, D., Zreika, A., Wu, X., Scholz, B. (2021). The Choice Construct in the Soufflé Language. In: Oh, H. (eds) Programming Languages and Systems. APLAS 2021. Lecture Notes in Computer Science, vol 13008. Springer, Cham. https://doi.org/10.1007/978-3-030-89051-3_10
DOI: https://doi.org/10.1007/978-3-030-89051-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89050-6
Online ISBN: 978-3-030-89051-3
eBook Packages: Computer Science (R0)