ABSTRACT
We report on our experience implementing a lightweight, fully verified relational database management system (RDBMS). The functional specification of RDBMS behavior, the RDBMS implementation, and the proof that the implementation meets the specification are all written and verified in Coq. Our contributions include: (1) a complete specification of the relational algebra in Coq; (2) an efficient realization of that model (B+ trees) implemented with the Ynot extension to Coq; and (3) a set of simple query optimizations proven to respect both semantics and run-time cost. In addition to describing the design and implementation of these artifacts, we highlight the challenges we encountered in formalizing them, including the choice of representation for finite relations of typed tuples and the difficulty of reasoning about data structures with complex sharing. Our experience shows that, though many challenges remain, building fully verified systems software in Coq is within reach.
Toward a verified relational database management system. POPL '10.