
1 Introduction

Separation logic [20, 37] has been used successfully to reason about programs that manipulate pointer structures. It enables reusability and scalability through compositional reasoning [6, 7]. A compositional verification system relies on bi-abduction technology, which is, in turn, based on entailment proof systems. The entailment problem is defined as follows: given an antecedent \(A\) and a consequent \(C\), where \(A\) and \(C\) are formulas in separation logic, decide whether \(A ~{\models }~ C\) is valid. An efficient decision procedure for entailments is therefore a vital ingredient of an automatic verification system in separation logic.

To enhance the expressiveness of the assertion language, for example to specify unbounded heaps and interesting pure properties (e.g., sortedness, parent pointers), separation logic is typically combined with user-defined inductive predicates [9, 31, 35]. In this setting, one key challenge for an entailment procedure is the ability to support induction reasoning over the combination of heaps and data content. Induction is challenging, especially for an automated inductive theorem prover, where the induction rules are not explicitly stated. Indeed, this problem is undecidable [1].

Developing a sound and complete entailment procedure that can be used for compositional reasoning is not trivial. It is unknown how model-based systems, e.g. [14, 15, 17, 18, 22, 23], could support compositional reasoning. In contrast, there is evidence that proof-based decision procedures, e.g., Smallfoot [2] and its variant [12], and Cycomp [42], can be extended to solve the bi-abduction problem, which enables compositional reasoning and scalability [7, 25]. Smallfoot was the centre of the bi-abductive procedure deployed in Infer [7], which has greatly impacted academia and industry [13]. Furthermore, Smallfoot is very efficient due to its use of the “exclude-the-middle” rule, which avoids proof search over the disjunctions in the consequent. However, Smallfoot works for hardwired lists and binary trees only. In contrast, Cycomp, a recent complete entailment procedure, is a cyclic proof system without “exclude-the-middle” that supports general inductive predicates, but it has double exponential time complexity due to the proof search (and back-tracking) in the consequent.

This paper introduces a cyclic proof system with an “exclude-the-middle”-styled decision procedure for decidable yet expressive inductive predicates. In particular, we show that our procedure runs in polynomial time when the maximum number of fields of data structures is bounded by a constant. The decidable fragment, \({{\texttt{SHLIDe}}}\), contains inductive definitions of compositional predicates and pure properties. These predicates can capture nested list segments, skip lists and trees. The pure properties, which have small models, can capture a wide range of common data structures, e.g. lists with fast-forward pointers, sorted nested lists, and binary search trees [22, 32]. This fragment is much more expressive than Smallfoot’s and is incomparable to Cycomp’s [42]: there exist entailments that our system can handle but Cycomp cannot, and vice versa.

Our procedure is a variant of the cyclic proof system introduced by Brotherston [3, 5], which has become one of the leading approaches to induction reasoning in separation logic. Intuitively, a cyclic proof is naturally represented as a tree of statements (entailments in this paper). The leaves are either axioms or nodes linked back to inner nodes; the root of the tree is the theorem to be proven, and nodes are connected to one or more children by proof rules. Alternatively, a cyclic proof can be viewed as a tree possibly containing some back-links (a.k.a. cycles, e.g., “C, if B, if C”) such that the proof satisfies a global soundness condition. This condition ensures that the proof can be read as a proof by infinite descent. For instance, in a cyclic entailment proof with inductive definitions, if every cycle contains an unfolding of some inductive predicate, then that predicate is reduced infinitely often into a strictly “smaller” predicate. This is impossible, as the semantics of inductive definitions only allows finitely many unfolding steps. Hence, the proof path containing the cycle can be disregarded.

The proposed system advances Brotherston’s system in three ways. First, the proposed proof search algorithm is specialized to \({{\texttt{SHLIDe}}}\): it includes “exclude-the-middle” rules and avoids any back-tracking. Existing proof procedures typically search for proofs (and back-track) over the disjunctive cases generated from unfolding inductive predicates in the RHS of an entailment. To avoid such costly searches, we propose “exclude-the-middle”-styled normalised rules in which the unfolding of inductive predicates in the RHS always produces a single disjunct. Therefore, our system is much more efficient than existing systems. Second, while a standard Brotherston system is incomplete, our proof search is complete in \({{\texttt{SHLIDe}}}\): if it gets stuck (i.e., it cannot apply any inference rule), then the root entailment is invalid.

Lastly, while the global soundness in [5] must be checked globally and explicitly, every back-link generated in \({{\texttt{SHLIDe}}}\) is sound by design. We note that Cycomp, introduced in [42], was the first work to show the completeness of a cyclic proof system. However, in contrast to ours, it did not discuss the global soundness condition, which is the crucial ingredient underpinning the soundness of cyclic proofs.

Contributions Our primary contributions are summarized as follows.

  • We present a novel decision procedure, \(\mathtt {S2S_{Lin}}\), for the entailment problem in separation logic with inductive definitions of compositional predicates.

  • We provide a complexity analysis of the procedure.

  • We have implemented the proposal in a prototype tool and tested it with the SL-COMP benchmarks [38, 39]. The experimental results show that \(\mathtt {S2S_{Lin}}\) is effective and efficient compared to state-of-the-art solvers.

Organization The remainder of the paper is organised as follows. Sect. 2 describes the syntax of formulas in the fragment \({{\texttt{SHLIDe}}}\). Sect. 3 presents the basics of an “exclude-the-middle” proof system and cyclic proofs. Sect. 4 elaborates on the main result, the novel cyclic proof system, including an illustrative example. Sect. 5 discusses soundness and completeness. Sect. 6 presents the implementation and evaluation. Sect. 7 discusses related work. Finally, Sect. 8 concludes the work.

2 Decidable Fragment \({{\texttt{SHLIDe}}}\)

Subsection 2.1 presents the syntax of separation logic formulas and the recursive definitions of linear predicates with local properties. Subsection 2.2 presents their semantics.

2.1 Separation Logic Formulas

Concrete heap models assume a fixed finite collection of data structures Node, a fixed finite collection of field names Fields, a set Loc of locations (heap addresses), and a set of non-addressable values Val, with the requirement that \({\textit{Val}} {\cap } {\textit{Loc}} {=} \emptyset \) (i.e., no pointer arithmetic). \({{{\texttt{null}}}}\) is a special element of Val. \(\mathbb {Z}\) denotes the set of integers (\(\mathbb {Z} {\subseteq } \textit{Val}\)) and \(k\) denotes integer constants. \(\textit{Var}\) is an infinite set of variables, and \(\bar{v}\) denotes a sequence of variables.

Syntax Disjunctive formula \(\varPhi \), symbolic heaps \(\varDelta \), spatial formula \(\kappa \), pure formula \(\pi \), pointer (dis)equality \(\phi \), and (in)equality formula \(\alpha \) are as follows.

$$ \begin{array}{ll} \begin{array}{l} \varPhi :\,\!:= \varDelta ~|~ \varPhi ~{\vee }~ \varPhi \qquad ~ \varDelta :\,\!:= \kappa {\wedge }\pi \mid \exists {v}{.}~\kappa {\wedge }\pi \\ \kappa :\,\!:= {\texttt{emp}}~|~ {x}{{\mapsto }}c(f{:}v,..,f{:}v) ~|~ {{\texttt{P}}}(\bar{v}) ~|~\kappa {*}\kappa \end{array} &{}\quad \begin{array}{l} \pi :\,\!:= {{\texttt{true}}}\,\mid \alpha \mid {\lnot } \pi \mid \pi {\wedge } \pi \\ \alpha :\,\!:= a{=} a\mid a{\le } a\qquad a :\,\!:= \!k \mid v \\ \end{array} \end{array} $$

where \(v {\in } \textit{Var}\), \(c {\in } {\textit{Node}}\) and \(f{\in } {\textit{Fields}}\). Note that we often omit the field names \(f\) of points-to predicates \({x}{{\mapsto }}c(f{:}v,..,f{:}v)\) and use the short form \({x}{{\mapsto }}c(\bar{v})\). \(v_1 {\ne } v_2\) is the short form of \(\lnot (v_1{=}v_2)\). \(E\) denotes either a variable or \({{{\texttt{null}}}}\). \(\varDelta [E {/} v]\) denotes the formula obtained from \(\varDelta \) by substituting \(E\) for \(v\). A symbolic heap is referred to as a base, denoted \({\varDelta ^b}\), if it does not contain any occurrence of inductive predicates.
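To make the grammar concrete, the following OCaml sketch renders \(\varPhi \), \(\varDelta \), \(\kappa \), \(\pi \) and \(\alpha \) as algebraic datatypes; the type and constructor names are illustrative choices of ours, not those of the actual implementation.

```ocaml
(* Illustrative OCaml datatypes for the SHLIDe syntax; names are our own. *)

type var = string

(* Arithmetic terms a: integer constants and variables. *)
type term = Const of int | Var of var

(* Pure formulas pi built from (in)equalities alpha. *)
type pure =
  | True
  | Eq of term * term           (* a = a  *)
  | Le of term * term           (* a <= a *)
  | Not of pure
  | And of pure * pure

(* Spatial formulas kappa. *)
type spatial =
  | Emp
  | PointsTo of var * string * (string * var) list  (* x |-> c(f1:v1,..,fn:vn) *)
  | Pred of string * var list                       (* P(v1,..,vn) *)
  | Star of spatial * spatial

(* Symbolic heaps Delta (with existentials) and disjunctive formulas Phi. *)
type symb_heap = { exists : var list; heap : spatial; pure : pure }

type formula = Delta of symb_heap | Or of formula * formula
```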

Inductive Definitions We write \({\mathcal {P}}\) to denote a set of \(n\) defined predicates \({\mathcal {P}}{=}\{{{\mathtt {P_1}}},...,{{\mathtt {P_n}}}\}\) in our system. Each inductive predicate has the following types of parameters: a pair of root and segment parameters defining segment-based linked points-to heaps, reference parameters (e.g., parent pointers, fast-forwarding pointers), transitivity parameters (e.g., singly-linked lists where every heap cell contains the same value \(a\)), and pairs of ordering parameters (e.g., trees that are binary search trees). An inductive predicate is defined as

$$ \begin{array}{l} {{\mathtt {pred~P}}}(r{,} F{,} \bar{B}{,}u{,}sc{,}tg) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F {\wedge } sc{=}tg \\ ~\qquad \vee ~ \exists {X}_{tl}, \bar{Z},sc'. r{\mapsto }c(X_{tl}{,}\bar{p}{,}u{,}sc') ~{*}~ \kappa ' ~{*}~ {{\texttt{P}}}({X}_{tl}{,} F{,} \bar{B}{,}u{,}sc'{,}tg) \wedge r{\ne } F \wedge sc \diamond sc' \end{array} $$

where \(r\) is the root, \(F\) the segment, \(\bar{B}\) the borders, \(u\) the parameter for a transitivity property, \(sc\) and \(tg\) the source and target parameters of an order property, respectively, \(r{\mapsto }c(X_{tl}{,}\bar{p}{,}u{,}sc') ~{*}~ \kappa '\) the matrix of the heaps, and \(\diamond \in \{=,\ge ,\le \}\). (The extension to multiple local properties is straightforward.) Moreover, this definition is constrained by the following three conditions on heap connectivity, establishment, and termination.

Condition C1. In the recursive rule, \( \bar{p} = \{{{{\texttt{null}}}}\} {\cup } \bar{Z}\). This condition implies that if two variables point to the same heap cell, their contents must be the same. For instance, the following definition of singly-linked lists of even length does not satisfy this condition.

$$ \begin{array}{l} {{\mathtt {pred~ell}}}(r{,}F) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F ~\vee ~ \exists {x_{1}}{,}{X} . r{\mapsto }c_1(x_{1}) {*}x_1{\mapsto }c_1(X) {*} {{\texttt{ell}}}(X{,}F) {\wedge } r{\ne }F \end{array} $$

as \(X\) is not a field variable of the node pointed to by \(r\).

Condition C2. The matrix heap defines nested and connected list segments as:

$$ {\kappa ' {:=} {{\texttt{Q}}}(Z{,}\bar{U}) \mid \kappa '{*}\kappa ' \mid {\texttt{emp}}} $$

where \(Z {\in } \bar{p}\) and \((\bar{U} \setminus \bar{p}) \cap Z = \emptyset \). This condition ensures connectivity (i.e. all allocated heaps are connected to the root) and establishment (i.e. every existentially quantified variable is either allocated or equal to a parameter).

Condition C3. There is no mutual recursion. We define an order \(\prec _{\mathcal {P}}\) on inductive predicates as follows: \({{\texttt{P}}} ~{\prec _{\mathcal {P}}}~ {{\texttt{Q}}}\) if at least one occurrence of predicate \(\texttt{Q}\) appears in the definition of \(\texttt{P}\); in this case, \(\texttt{Q}\) is called a direct sub-term of \(\texttt{P}\). We use \({\prec ^*_{\mathcal {P}}}\) to denote the transitive closure of \(\prec _{\mathcal {P}}\).
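As a small illustration of condition C3, the following OCaml sketch (our own; it assumes the dependency order \(\prec _{\mathcal {P}}\) is given as an association list mapping each predicate to the predicates occurring in its definition) checks that no two distinct predicates are in each other's transitive closure \({\prec ^*_{\mathcal {P}}}\).

```ocaml
(* [reaches deps p q visited] holds if q is in the transitive closure of the
   dependency order starting from p (self-recursion is permitted). *)
let rec reaches deps p q visited =
  List.exists
    (fun r ->
      (not (List.mem r visited)) && (r = q || reaches deps r q (r :: visited)))
    (try List.assoc p deps with Not_found -> [])

(* Condition C3: no two distinct predicates reach each other. *)
let no_mutual_recursion (deps : (string * string list) list) : bool =
  List.for_all
    (fun (p, _) ->
      List.for_all
        (fun (q, _) -> p = q || not (reaches deps p q [] && reaches deps q p []))
        deps)
    deps
```

For instance, no_mutual_recursion [("nll", ["ll"; "nll"]); ("ll", ["ll"])] returns true, whereas a pair of definitions that mention each other would be rejected.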

Several example definitions are shown below.

$$ \begin{array}{l} {{\mathtt {pred~ll}}}(r{,}F) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F ~ \vee ~ \exists {X_{tl}}. r{\mapsto }c_1(X_{tl}) {*} {{\texttt{ll}}}(X_{tl},F) {\wedge } r{\ne }F \\ {{\mathtt {pred~nll}}}(r{,}F{,}B) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F \\ \quad \vee ~ \exists X_{tl}{,}Z. r{\mapsto }c_3(X_{tl}{,}Z) {*} {{\texttt{ll}}}(Z,B){*} {{\texttt{nll}}}(X_{tl}{,}F{,}B) {\wedge } r{\ne }F\\ {{\mathtt {pred~skl1}}}(r{,}F) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F ~\vee ~ \exists {X_{tl}}. r{\mapsto }c_4(X_{tl}{,}{{{\texttt{null}}}}{,}{{{\texttt{null}}}}) {*} {{\texttt{skl1}}}(X_{tl},F) {\wedge } r{\ne }F \\ {{\mathtt {pred~skl2}}}(r{,}F) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F \\ \quad \vee ~ \exists {X_{tl},Z_1}. r{\mapsto }c_4(Z_1{,}X_{tl}{,}{{{\texttt{null}}}}) {*} {{\texttt{skl1}}}(Z_1{,}X_{tl}) {*} {{\texttt{skl2}}}(X_{tl},F) {\wedge } r{\ne }F \\ {{\mathtt {pred~skl3}}}(r{,}F) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F \\ \quad \vee ~ \exists {X_{tl}{,}Z_1{,}Z_2}. r{\mapsto }c_4(Z_1{,}Z_2{,}X_{tl}) {*} {{\texttt{skl1}}}(Z_1{,}Z_2) {*} {{\texttt{skl2}}}(Z_2{,}X_{tl}) {*} {{\texttt{skl3}}}(X_{tl}{,}F) {\wedge } r{\ne }F \\ {{\mathtt {pred~tree}}}(r{,}B) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}B \\ ~\quad \vee ~ \exists {r_l}, {r_r}. r{\mapsto }c_{t}(r_l{,}r_r) {*} {{\texttt{tree}}}(r_l{,}B) {*} {{\texttt{tree}}}(r_r{,}B) \wedge r{\ne }B \end{array} $$

\(\texttt{ll}\) defines singly-linked lists, \(\texttt{nll}\) defines lists of acyclic lists, and \(\texttt{skl1}\), \(\texttt{skl2}\) and \(\texttt{skl3}\) define skip-lists. Finally, \(\texttt{tree}\) defines binary trees. We extend predicate \(\texttt{ll}\) with transitivity and order parameters to obtain predicates \(\texttt{lla}\) and \(\texttt{lls}\), respectively, as follows.

$$ \begin{array}{l} {{\mathtt {pred~lla}}}(r{,}F{,}a) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F ~ \vee ~ \exists X_{tl}. r{\mapsto }c_2(X_{tl}{,}a) *{{\texttt{lla}}}(X_{tl}{,}F{,}a) {\wedge } r{\ne }F \\ {{\mathtt {pred~lls}}}(r{,}F{,}mi{,}ma) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F {\wedge }ma{=}mi \\ \quad \vee ~ \exists X_{tl}{,}mi_1. r{\mapsto }c_4(X_{tl}{,}mi_1) *{{\texttt{lls}}}(X_{tl}{,}F{,}mi_1{,}ma) {\wedge } r{\ne }F \wedge mi {\le } mi_1\\ \end{array} $$

Unfolding Given \({{\mathtt {pred~P}}}(\bar{t})\equiv \varPhi \) and a formula \({{\texttt{P}}}(\bar{v}){*}\varDelta \), unfolding \({{\texttt{P}}}(\bar{v})\) means replacing \({{\texttt{P}}}(\bar{v})\) by \(\varPhi [\bar{v}/\bar{t}]\). We annotate each occurrence of an inductive predicate with a number, called its unfolding number. Suppose \(\exists \bar{w}. r{\mapsto }c(\bar{p}) ~{*}~ {{\texttt{Q}}}_1(\bar{v}_1){*}...{*}{{\texttt{Q}}}_m(\bar{v}_m) ~{*}~ {{\texttt{P}}}(\bar{v}_0) {\wedge } \pi \) is the recursive rule; then, in the unfolded formula, if \({{\texttt{P}}}(\bar{v}_0[\bar{v}/\bar{t}])^{k_1}\) and \({{\mathtt {Q_i}}}(...)^{k_2}\) are direct sub-terms of \({{\texttt{P}}}(\bar{v})^{k}\) as above, then \(k_1{=}k{+}1\) and \(k_2 = 0\). When it is unambiguous, we omit the unfolding-number annotation for simplicity.
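The bookkeeping of unfolding numbers can be sketched in OCaml as follows; the datatype is a simplified, illustrative version of our formulas, and the renaming of the existential variables \(\bar{w}\) to fresh ones is elided.

```ocaml
type var = string

type spatial =
  | Emp
  | PointsTo of var * var list        (* x |-> c(v1,..,vn), node name omitted *)
  | Pred of string * var list * int   (* P(v1,..,vn) with its unfolding number *)
  | Star of spatial * spatial

(* Instantiate the body of the definition of [p] (formal parameters [params])
   at the actual arguments [args], where the unfolded occurrence of [p]
   carried the unfolding number [k]. *)
let instantiate p params args k body =
  let sigma = List.combine params args in
  let sub v = try List.assoc v sigma with Not_found -> v in
  let rec go = function
    | Emp -> Emp
    | PointsTo (x, vs) -> PointsTo (sub x, List.map sub vs)
    | Pred (q, vs, _) ->
        (* the recursive occurrence of p gets k+1, nested predicates get 0 *)
        Pred (q, List.map sub vs, (if q = p then k + 1 else 0))
    | Star (k1, k2) -> Star (go k1, go k2)
  in
  go body
```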

2.2 Semantics

The program state is interpreted by a pair \((s{,}h)\) where \(s{\in } {\textit{Stacks}}\), \(h{\in } {\textit{Heaps}} \) and stack \({\textit{Stacks}}\) and heap \({\textit{Heaps}}\) are defined as:

$$ \begin{array}{lcl} {\textit{Heaps}} &{} {\overset{{\text {def}}}{=}} &{} {\textit{Loc}} {\rightharpoonup _{fin}} ({\textit{Node}} ~{\rightarrow }~ (\textit{Fields}~{\rightarrow }~ \textit{Val}\cup \textit{Loc})^m) \\ {\textit{Stacks}} &{} {\overset{{\text {def}}}{=}} &{} {\textit{Var}} ~{\rightarrow }~ \textit{Val}\cup \textit{Loc}\end{array} $$

Note that we assume that every data structure contains at most \(m\) fields. Given a formula \(\varPhi \), its semantics is given by the relation \(s{,}h~{\models }~ \varPhi \), which states that the stack \(s\) and the heap \(h\) satisfy the constraint \(\varPhi \). The semantics is shown below.

$$ \begin{array}{lcl} {s},{h} \models {\texttt{emp}}&{} {\mathtt { iff~}}&{} \textit{dom}({h}) {=} \emptyset \\ {s},{h} \models {v}{{\mapsto }}c(f_i:v_i) &{} {\mathtt { iff~}}&{} {\textit{dom}}(h) {=}\{s(v)\}, h(s(v)){=}g, g(c,f_i){=} s({v_i})\\ {s},{h} \models {P}(\bar{v}) &{} {\mathtt { iff~}}&{} {(h,s(\bar{v}_1),..,s(\bar{v}_k)) \in \llbracket P\rrbracket } \\ {s},{h} \models \kappa _1 *\kappa _2 &{} {\mathtt { iff~}}&{} \exists h_1,h_2 ~\text{ s.t. }~ h_1 {\#} h_2 \text{, } h{=}h_1 {\cdot } h_2 \text{, } s,h_1 \models \kappa _1 \text{ and } s,h_2 \models \kappa _2\\ {s},{h} \models {{\texttt{true}}}\,&{} {\mathtt { iff~}}&{} \text{ always } \\ {s},{h} \models \kappa {\wedge }\pi &{} {\mathtt { iff~}}&{} {s},{h} \models {\kappa } \text { and } {s} \models {\pi } \\ {s},{h} \models \exists {v} {.}\varDelta &{} {\mathtt { iff~}}&{} \exists {\alpha } . {s}[v {\mapsto }{\alpha }],{h} \models {\varDelta } \\ {s},{h} \models \varPhi _1 \vee \varPhi _2 &{} {\mathtt { iff~}}&{} s,h\models \varPhi _1 \text{ or } s,h\models \varPhi _2\\ \end{array} $$

\(dom(g)\) is the domain of \(g\), \(h_1 {\#} h_2\) denotes disjoint heaps \(h_1\) and \(h_2\) i.e., \({\textit{dom}}(h_1) {\cap } {\textit{dom}}(h_2) {=} \emptyset \), and \(h_1 {\cdot } h_2\) denotes the union of two disjoint heaps. If \(s\) is a stack, \(v {\in } \textit{Var}\), and \(\alpha {\in } \textit{Val}{\cup } \textit{Loc}\), we write \( {s}[v {{\mapsto }} {\alpha }] = {s}\) if \(v {\in } \textit{dom}(s)\), otherwise \( {s}[v {{\mapsto }} {\alpha }] = {s} {\cup } \{(v, \alpha )\}\). Semantics of non-heap (pure) formulas is omitted for simplicity. The interpretation of an inductive predicate \({{\texttt{P}}}(\bar{t})\) is based on the least fixed point semantics \(\llbracket {{\texttt{P}}}\rrbracket \).
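As a concrete reading of the semantics above, here is a small OCaml model-checking sketch for the spatial part of base (predicate-free) formulas. The representations of stacks, heaps and formulas are illustrative choices of ours, and every variable is assumed to be bound in the stack.

```ocaml
module M = Map.Make (String)

type value = Null | Loc of int | Int of int

(* A heap cell: its node type c and its field contents. *)
type cell = { node : string; fields : (string * value) list }

type stack = value M.t   (* s : Var -> Val u Loc *)
type heap = cell M.t     (* h : Loc -> Node x (Fields -> Val u Loc), keyed by string_of_int *)

type spatial =
  | Emp
  | PointsTo of string * string * (string * string) list  (* x |-> c(f:v,..) *)
  | Star of spatial * spatial

let loc_key = function Loc i -> Some (string_of_int i) | _ -> None

(* s,h |= kappa: emp needs an empty heap, a points-to needs a singleton heap,
   and Star splits the heap into two disjoint parts. *)
let rec sat (s : stack) (h : heap) = function
  | Emp -> M.is_empty h
  | PointsTo (x, c, fvs) -> (
      match loc_key (M.find x s) with
      | None -> false
      | Some l -> (
          M.cardinal h = 1
          &&
          match M.find_opt l h with
          | None -> false
          | Some cl ->
              cl.node = c
              && List.for_all
                   (fun (f, v) ->
                     List.assoc_opt f cl.fields = Some (M.find v s))
                   fvs))
  | Star (k1, k2) ->
      (* enumerate all splits h = h1 . h2 with h1 # h2 *)
      let rec splits = function
        | [] -> [ (M.empty, M.empty) ]
        | (k, v) :: rest ->
            List.concat_map
              (fun (h1, h2) -> [ (M.add k v h1, h2); (h1, M.add k v h2) ])
              (splits rest)
      in
      List.exists (fun (h1, h2) -> sat s h1 k1 && sat s h2 k2) (splits (M.bindings h))
```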

Entailment \(\varDelta \models \varDelta '\) holds iff for all \(s\) and \(h\), if \(s,h\models \varDelta \) then \(s, h\models \varDelta '\).

3 Entailment Problem & Overview

Throughout this work, we consider the following problem.

$$ \begin{array}{|ll|} \hline \quad \text {PROBLEM:} &{} {\mathtt{QF{\_}ENT{-}SL_{LIN}}}. \\ \quad \text {INPUT:} &{} {\varDelta _a \equiv \kappa _a{\wedge }\pi _a} \text { and } {\varDelta _c \equiv \kappa _c{\wedge }\pi _c}\text { where } \textit{FV}(\varDelta _c) \subseteq \textit{FV}(\varDelta _a)\cup \{{{{\texttt{null}}}}\}. \quad \\ \quad \text {QUESTION:} &{} \text {Does } {\varDelta _a ~\models ~ \varDelta _c} \text { hold? } \qquad \\ \hline \end{array} $$

An entailment, denoted \({\texttt{e}}\), is syntactically formalized as \(\varDelta _a~{\vdash }~\varDelta _c\), where \(\varDelta _a\) and \(\varDelta _c\) are quantifier-free formulas whose syntax is defined in the preceding section.

In Sect. 3.1, we present the basis of an exclude-the-middle proof system and our approach to \(\mathtt{QF{\_}ENT{-}SL_{LIN}}\). In Sect. 3.2, we describe the foundation of cyclic proofs.

3.1 Exclude-the-Middle Proof System

Given a goal \(\varDelta _a~{\vdash }~\varDelta _c\), an entailment proof system might derive entailments with a disjunction in the right-hand side (RHS). Such an entailment can be obtained by a proof rule that replaces an inductive predicate by its definition rules. The authors of Smallfoot [2] introduced a normal form and proof rules to prevent such entailments when the predicates are lists or trees. Smallfoot considers the following two scenarios.

  • Case 1 (Exclude-the-middle and Frame): The inductive predicate matches a points-to predicate in the left-hand side (LHS). For instance, consider an entailment of the form \({\texttt{e}}_1: {x}{{\mapsto }}c(z) *\varDelta ~{\vdash }~ {{\texttt{ll}}}(x,y) *\varDelta '\), where \(\texttt{ll}\) denotes singly-linked lists and \({{\texttt{ll}}}(x,y)\) matches \({x}{{\mapsto }}c(z)\) as they have the same root x. A typical proof system might search for a proof through the two definition rules of predicate \(\texttt{ll}\) (i.e., by unfolding \({{\texttt{ll}}}(x,y)\) into two disjuncts): one includes the base case with \(x=y\), and the other contains the recursive case with \(x\ne y\). Smallfoot prevents such unfolding by excluding the middle in the LHS: it reduces the entailment into two premises, \({x}{{\mapsto }}c(z) *\varDelta \wedge x=y ~{\vdash }~ {{\texttt{ll}}}(x,y) *\varDelta '\) and \({x}{{\mapsto }}c(z) *\varDelta \wedge x\ne y ~{\vdash }~ {{\texttt{ll}}}(x,y) *\varDelta '\). The first one considers the base case of the list (that is, \({{\texttt{ll}}}(x,x)\)) and is equivalent to \({x}{{\mapsto }}c(z) *\varDelta \wedge x=y ~{\vdash }~ \varDelta ' \). Furthermore, the second premise checks the inductive case of the list and is equivalent to \(\varDelta \wedge x\ne y ~{\vdash }~ {{\texttt{ll}}}(x,z) *\varDelta '\).

  • Case 2 (Induction proving via hard-wired Lemma). The inductive predicate matches other inductive predicates in the LHS. For example, consider the entailment \({\texttt{e}}_2:{{\texttt{ll}}}(x,z) *\varDelta ~{\vdash }~ {{\texttt{ll}}}(x,{{{\texttt{null}}}}) *\varDelta '\). Smallfoot handles \({\texttt{e}}_2\) by using a proof rule that is the consequence of applying the hard-wired lemma \({{\texttt{ll}}}(x,z) *{{\texttt{ll}}}(z,{{{\texttt{null}}}}) \models {{\texttt{ll}}}(x,{{{\texttt{null}}}})\), which reduces the entailment to \(\varDelta ~{\vdash }~ {{\texttt{ll}}}(z,{{{\texttt{null}}}}) *\varDelta '\).

In doing so, Smallfoot does not introduce a disjunction in the RHS. However, as it uses specific lemmas for induction reasoning, it only works for hardwired lists.

This paper proposes \(\mathtt {S2S_{Lin}}\) as an exclude-the-middle system for user-defined predicates, those in \({{\texttt{SHLIDe}}}\). Instead of using hardwired lemmas, we apply cyclic proofs for induction reasoning. For instance, to discharge the entailment \({\texttt{e}}_2\) above, \(\mathtt {S2S_{Lin}}\) first unfolds \({{\texttt{ll}}}(x,z)\) in the LHS and obtains two premises:

  • \({\texttt{e}}_{21}: ({\texttt{emp}}\wedge x=z) *\varDelta ~{\vdash }~ {{\texttt{ll}}}(x,{{{\texttt{null}}}}) *\varDelta '\); and

  • \({\texttt{e}}_{22}: ({x}{{\mapsto }}c(y) *{{\texttt{ll}}}(y,z) \wedge x \ne z) *\varDelta ~{\vdash }~ {{\texttt{ll}}}(x,{{{\texttt{null}}}}) *\varDelta '\)

While it reduces \({\texttt{e}}_{21}\) to \(\varDelta [z/x] ~{\vdash }~ {{\texttt{ll}}}(z,{{{\texttt{null}}}}) *\varDelta '[z/x]\), for \({\texttt{e}}_{22}\), it further applies the frame rule as in Case 1 above and obtains \( {{\texttt{ll}}}(y,z) *\varDelta \wedge x \ne z ~{\vdash }~ {{\texttt{ll}}}(y,{{{\texttt{null}}}}) *\varDelta '\). Then, it makes a backlink between the latter and \({\texttt{e}}_2\) and closes this path. Doing so does not introduce disjunctions in the RHS and can handle user-defined predicates.

3.2 Cyclic Proofs

Central to our work is a procedure that constructs a cyclic proof for an entailment. Given an entailment \(\varDelta ~ {\vdash }~\varDelta '\), if our system can derive a cyclic proof, then \(\varDelta \models \varDelta '\). If, instead, it gets stuck without a proof, then \(\varDelta \models \varDelta '\) is not valid.

The procedure includes proof rules, each of which is of the form:

$$ \frac{{\texttt{e}}_1 \qquad \cdots \qquad {\texttt{e}}_n}{{\texttt{e}}} ~ {{\mathtt {PR_0}}}~[\texttt{cond}] $$

where entailment \({\texttt{e}}\) (called the conclusion) is reduced to entailments \({\texttt{e}}_1\), ..,\({\texttt{e}}_n\) (called the premises) through inference rule \(\mathtt {PR_0}\) given that the side condition \(\texttt{cond}\) holds.

A cyclic proof is built over a proof tree \({\mathcal T}_{i}\), which is a tuple \(({V}, {E}, {\mathcal C})\) where

  • V is a finite set of nodes representing entailments derived during the proof search;

  • A directed edge \(({\texttt{e}}, {{\texttt{PR}}}, {\texttt{e}}') \in {E}\) (where \({\texttt{e}}'\) is a child of \({\texttt{e}}\)) means that the premise \({\texttt{e}}'\) is derived from the conclusion \({\texttt{e}}\) via inference rule \(\texttt{PR}\). For instance, suppose that the rule \(\mathtt {PR_0}\) above has been applied, then the following \(n\) edges are generated: \(({\texttt{e}}, {{\mathtt {PR_0}}}, {\texttt{e}}_1)\), .., \(({\texttt{e}}, {{\mathtt {PR_0}}}, {\texttt{e}}_n)\);

  • and \({\mathcal C}\) is a partial relation which captures back-links in the proof tree. If \({\mathcal C}({\texttt{e}}_c{\rightarrow }{\texttt{e}}_b, \sigma )\) holds, then \({\texttt{e}}_b\) is linked back to its ancestor \({\texttt{e}}_c\) through the substitution \(\sigma \) (where \({\texttt{e}}_b\) is referred to as a bud and \({\texttt{e}}_c\) is referred to as a companion). In particular, \({\texttt{e}}_c\) is of the form: \(\varDelta ~{\vdash }~ \varDelta '\) and \({\texttt{e}}_b\) is of the form: \(\varDelta _1 {\wedge }\pi ~{\vdash }~ \varDelta _1'\) where \(\varDelta \equiv \varDelta _1\sigma \) and \(\varDelta ' \equiv \varDelta '_1\sigma \).

A leaf node is marked as closed if it is evaluated as valid (i.e. an axiom rule has been applied to it), invalid (i.e. no rule applies), or it has been linked back. Otherwise, it is marked as open. A proof tree is invalid if it contains at least one invalid leaf node. It is a pre-proof if all its leaf nodes are either valid or linked back. Furthermore, a pre-proof is a cyclic proof if a global soundness condition is established in the tree. Intuitively, this condition requires that for every \({\mathcal C}({\texttt{e}}_c{\rightarrow }{\texttt{e}}_b, \sigma )\), there exist inductive predicates \({{\texttt{P}}}(\bar{t_1})\) in \({\texttt{e}}_c\) and \({{\texttt{Q}}}(\bar{t_2})\) in \({\texttt{e}}_b\) such that \({{\texttt{Q}}}(\bar{t_2})\) is a subterm of \({{\texttt{P}}}(\bar{t_1})\).
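The structure \(({V}, {E}, {\mathcal C})\) can be pictured with the following OCaml types. This is an illustrative encoding of ours (entailments, rules and substitutions are kept abstract), not the data structures of the actual tool.

```ocaml
type entailment = string                 (* an entailment Delta |- Delta', abstract here *)
type rule = string                       (* name of the applied proof rule *)
type subst = (string * string) list      (* a substitution sigma *)

type status = Open | Valid | Invalid | LinkedBack

type node = {
  ent : entailment;
  mutable status : status;
  mutable children : (rule * node) list;      (* edges (e, PR, e') of E *)
  mutable backlink : (node * subst) option;   (* C(e_c -> e_b, sigma): companion and sigma *)
}

let mk_node ent = { ent; status = Open; children = []; backlink = None }

(* A pre-proof has every leaf either closed by an axiom (Valid) or linked back. *)
let rec is_pre_proof (n : node) : bool =
  match (n.children, n.backlink) with
  | [], None -> n.status = Valid
  | [], Some _ -> true
  | cs, _ -> List.for_all (fun (_, c) -> is_pre_proof c) cs
```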

Definition 1

(Trace)

Let \({\mathcal T}_{i}\) be a pre-proof of \(\varDelta _{a} ~{{\vdash }}~ \varDelta _{c}\) and \(({\varDelta _{a_i} ~{{\vdash }}~ \varDelta _{c_i} })_{i{\ge }0}\) be a path of \({\mathcal T}_{i}\). A trace following \(({\varDelta _{a_i} {{\vdash }} \varDelta _{c_i} })_{i{\ge }0}\) is a sequence \((\alpha _i)_{i{\ge }0}\) such that each \(\alpha _i\) (for all \(i{\ge }0\)) is a subformula of \({\varDelta _{a_i}}\) containing predicate \({{\texttt{P}}}(\bar{t})^u\), and either:

  • \(\alpha _{i{+}1}\) is the subformula occurrence in \(\varDelta _{a_{i+1}}\) corresponding to \(\alpha _{i}\) in \(\varDelta _{a_{i}}\).

  • or \({\varDelta _{a_i} ~{{\vdash }}~ \varDelta _{c_i} }\) is the conclusion of a left-unfolding rule, \(\alpha _{i} \equiv {{\texttt{P}}}(\bar{t})^u\) is unfolded, and \(\alpha _{i+1}\) is a subformula in \(\varDelta _{a_{i+1}}\) and is the definition rule of \({{\texttt{P}}}(\bar{x})^{u}[\bar{t}/\bar{x}]\). In this case, \(i\) is said to be a progressing point of the trace.

Definition 2

(Cyclic proof) A pre-proof \({\mathcal T}_{i}\) of \(\varDelta _{a} ~{{\vdash }}~ \varDelta _{c}\) is a cyclic proof if, for every infinite path \((\varDelta _{a_i} {{\vdash }} \varDelta _{c_i})_{i{\ge }0}\) of \({\mathcal T}_{i}\), there is a tail of the path \(p{=}(\varDelta _{a_i} ~{{\vdash }}~ \varDelta _{c_i})_{i{\ge }n}\) such that there is a trace following p which has infinitely many progressing points.

Suppose that all proof rules are (locally) sound (i.e., if the premises are valid, then the conclusion is valid). The following theorem states global soundness.

Theorem 1

(Soundness [5]). If there is a cyclic proof of \(\varDelta _a ~{\vdash }~ \varDelta _c\), then \(\varDelta _a \models \varDelta _c\).

The proof is by contradiction (cf. [5]). Intuitively, if we could derive a cyclic proof for \(\varDelta _a ~{\vdash }~ \varDelta _c\) while \(\varDelta _a \not \models \varDelta _c\), then the inductive predicates at the progressing points would be unfolded infinitely often. This contradicts the least fixed point semantics of the predicates.

4 Cyclic Entailment Procedure

This section presents our main proposal, the entailment procedure \(\omega \)-ENT with the proposed inference rules (subsection 4.1), and an illustrative example (subsection 4.2).

4.1 Proof Search

Fig. 1. Proof tree construction procedure

The proof search algorithm \(\omega \)-ENT is presented in Fig. 1. \(\omega \)-ENT takes \({\texttt{e}}_0\) as input, produces a cyclic proof, and, based on that, decides whether the input is \({{{{\texttt{valid}}}}}\) or \({{{{\texttt{invalid}}}}}\). The idea of \(\omega \)-ENT is to iteratively reduce \({\mathcal T}_{0}\) into a sequence of proof trees \({\mathcal T}_{i}\), \(i\ge 0\). Initially, for every \({{\texttt{P}}}(\bar{v})^k \in {\texttt{e}}_0\), \(k\) is reset to \(0\), and \({\mathcal T}_{0}\) only has \({\texttt{e}}_0\), the root, as an open leaf. On line 3, through the procedure \(\mathtt {is\_closed}\)(\({\mathcal T}_{i}\)), \(\omega \)-ENT chooses an open leaf node \({\texttt{e}}_i\) and a proof rule \(PR_i\) to apply. If \(\mathtt {is\_closed}\)(\({\mathcal T}_{i}\)) returns \(\texttt{valid}\) (that is, every leaf has been closed by an axiom rule or is involved in a back-link), \(\omega \)-ENT returns \({{{{\texttt{valid}}}}}\) on line 4. If it returns \(\texttt{invalid}\), then \(\omega \)-ENT returns \(\texttt{invalid}\) (on line 5). Otherwise, it tries to link \({{\texttt{e}}_i}\) back to an internal node (on line 6). If this attempt fails, it applies the rule (line 7).

Note that at each leaf, \(\mathtt {is\_closed}\) attempts rules in the following order: normalization rules, axiom rules, and reduction rules. A rule \(PR_i\) is chosen if its conclusion can be unified with the leaf through some substitution \(\sigma \). Then, on line 7, for each premise of \(PR_i\), procedure \(\texttt{apply}\) creates a new open node and connects the node to \({\texttt{e}}_i\) via a new edge. If \(PR_i\) is an axiom, procedure \(\texttt{apply}\) marks \({\texttt{e}}_i\) as closed and returns.
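A schematic rendering of this loop in OCaml is given below; the types and sub-procedures are placeholders for the components described above (and in Fig. 1), not the actual implementation.

```ocaml
type verdict = Valid | Invalid

(* Result of is_closed: either every leaf is closed (valid) or some leaf is
   invalid, or an open leaf together with an applicable rule is returned. *)
type ('leaf, 'rule) answer =
  | Decided of verdict
  | Open of 'leaf * 'rule

let rec omega_ent
    ~(is_closed : 'tree -> ('leaf, 'rule) answer)
    ~(link_back : 'tree -> 'leaf -> 'tree option)   (* try to create a back-link *)
    ~(apply : 'tree -> 'leaf -> 'rule -> 'tree)     (* apply the chosen rule   *)
    (t : 'tree) : verdict =
  match is_closed t with
  | Decided v -> v                                  (* lines 3-5 of Fig. 1 *)
  | Open (leaf, rule) -> (
      match link_back t leaf with                   (* line 6: back-link attempt *)
      | Some t' -> omega_ent ~is_closed ~link_back ~apply t'
      | None ->
          (* line 7: apply the rule, creating new open premises *)
          omega_ent ~is_closed ~link_back ~apply (apply t leaf rule))
```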

Procedure \(\mathtt {is\_closed}\)(\({\mathcal T}_{i}\)) This procedure examines the following three cases.

  1. First, if all leaf nodes are marked closed and none is \(\texttt{invalid}\), then \(\mathtt {is\_closed}\) returns \({{{{\texttt{valid}}}}}\).

  2. Secondly, \(\mathtt {is\_closed}\) returns \({{{{\texttt{invalid}}}}}\) if there exists an open leaf node \({\texttt{e}}_i:~\varDelta ~{\vdash }~\varDelta '\) in NF such that one of the four following conditions holds:

    (a) no inference rule can be applied to \({\texttt{e}}_i\);

    (b) there exists a predicate \(op_1(E) \in \varDelta \) such that \(op_2(E) \notin \varDelta '\) and one of the following conditions holds:

      • either \( {{\texttt{P}}}(E'{,} E{,}...)\) or \(E'{\mapsto }c(E{,}..)\) occurs on both sides,

      • or both \({{\texttt{P}}}(E'{,} E{,}...) \not \in \varDelta \) and \(E'{\mapsto }c(E{,}..) \not \in \varDelta \);

    (c) there exists a predicate \(op_1(E) {\in } \varDelta '\) such that \(G(op_1(E)) {\in } \varDelta \) and \(op_2(E) {\notin } \varDelta \);

    (d) there exist \({x}{{\mapsto }}c_1(\bar{v}_1) \in \varDelta \) and \({x}{{\mapsto }}c_2(\bar{v}_2) \in \varDelta '\) such that \(c_1 \not \equiv c_2\) or .

  3. Lastly, if an open leaf node \({\texttt{e}}_i\) can be applied with an inference rule (e.g. \(PR_i\)), then \(\mathtt {is\_closed}\) returns the triple (\({{{{\texttt{unknown}}}}}\), \({\texttt{e}}_i\), \(PR_i\)).

In the rest of this section, we discuss the proof rules and the auxiliary procedures in detail.

Normalisation An entailment is in normal form (NF) if its LHS is in NF. We write \(op(E)\) to denote either \({E}{{\mapsto }}c(\bar{v})\) or \({{\texttt{P}}}(E{,} F{,} \bar{B}{,}\bar{v})\). Furthermore, the guard \(G(op(E))\) is defined by: \(G({E}{{\mapsto }}c(\bar{v})) \overset{{\text {def}}}{=}{{\texttt{true}}}\,\) and \(G({{\texttt{P}}}(E{,} F{,} \bar{B}{,}\bar{v})) \overset{{\text {def}}}{=}E{\ne }F\).

Definition 3

(Normal Form) A formula \(\kappa {\wedge } \phi {\wedge } a\) is in normal form if:

$$ \begin{array}{clcl} 1.&{} {op(E) \in \kappa } \text { implies } {G(op(E)) \in \phi } &{} \qquad 4.&{} {E_1{=}E_2 \not \in \phi } \\ 2.&{} {op(E) \in \kappa } \text { implies } {E{\ne }{{{\texttt{null}}}}\in \phi } &{} \qquad 5.&{} {E{\ne }E \not \in \phi } \\ 3.&{} {op_1(E_1)*op_2(E_2) \in \kappa } \text { implies } {E_1{\ne }E_2\in \phi } &{} \qquad 6.&{} {a} \text { is satisfiable} \end{array} $$

If \(\varDelta \) is in NF, then for any \(s, h\models \varDelta \), \(\textit{dom}(h)\) is uniquely determined by \(s\).

The normalisation rules are presented in Fig. 2. Basically, \(\omega \)-ENT applies these rules to a leaf exhaustively, transforming it into NF before applying any other rules. Given an inductive predicate \({{\texttt{P}}}(E,F,...)\), rule \(\texttt{ExM}\) excludes the middle by performing a case analysis for the predicate between the base case (i.e., \(E{=}F\)) and the recursive case (i.e., \(E{\ne }F\)). The normalisation rule \({\ne }{{{{\texttt{null}}}}}\) relies on the following facts: \(E{\mapsto }c(\_\,) \Rightarrow E{\ne }{{{\texttt{null}}}}\) and \({{\texttt{P}}}(E{,} F{,} \_\,) {\wedge }E{\ne }F \Rightarrow E{\ne }{{{\texttt{null}}}}\). Similarly, rule \({\ne }{*}\) relies on the following facts: \(x{{\mapsto }}\_\,{*} {{\texttt{P}}}(y{,} F{,} \_\,) {\wedge }y{\ne }F \Rightarrow x{\ne }y\), \(x{{\mapsto }}\_\,{*} y{{\mapsto }}\_\,\Rightarrow x{\ne }y\), and \({{\mathtt {P_i}}}(x{,} F_1{,} \_\,) {*} {{\mathtt {P_j}}}(y{,} F_2{,} \_\,) {\wedge }x{\ne }F_1 {\wedge }y{\ne }F_2 \Rightarrow x{\ne }y\).

Fig. 2. Normalization rules
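As a small illustration (our own example, using only the facts listed above), consider normalising the LHS \({{\texttt{ll}}}(x,F) * {y}{{\mapsto }}c(z) \wedge {{\texttt{true}}}\,\). Rule \(\texttt{ExM}\) splits on \(x{=}F\) versus \(x{\ne }F\); in the \(x{\ne }F\) branch, rule \({\ne }{{{{\texttt{null}}}}}\) adds \(x{\ne }{{{\texttt{null}}}}\) and \(y{\ne }{{{\texttt{null}}}}\), and rule \({\ne }{*}\) adds \(x{\ne }y\), yielding the normal form

$$ {{\texttt{ll}}}(x,F) * {y}{{\mapsto }}c(z) \wedge x{\ne }F \wedge x{\ne }{{{\texttt{null}}}}\wedge y{\ne }{{{\texttt{null}}}}\wedge x{\ne }y $$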

Axiom and Reduction

Fig. 3. Reduction rules (where \(\sharp {:}~{{\texttt{P}}}(x{,}F{,}\bar{B}{,}u{,}sc{,}tg){\not \in }\kappa _2\), \(\dagger {:}~ {x}{{\mapsto }}c(X{,}E_1{,}E_2{,}u{,}sc') {\not \in } \kappa _2\))

Axiom rules include \(\texttt{Emp}\), \(\texttt{Inconsistency}\) and \(\texttt{Id}\), presented in Fig. 3. If any of these rules is applied to a leaf node, the node is evaluated as \({{{{\texttt{valid}}}}}\) and marked as closed. The remaining rules in Fig. 3 are reduction rules.

For simplicity, the unfoldings in rules \(\texttt{Frame}\), \(\texttt{RInd}\), and \(\texttt{LInd}\) are applied with the following definition of inductive predicates:

$$ \begin{array}{l} {{\texttt{P}}}(x{,}F{,}\bar{B}{,}u{,}sc{,}tg) \equiv {\texttt{emp}}{\wedge }x{=} F {\wedge }sc{=}tg \\ \quad \vee ~ \exists X{,} sc'{,}d_1{,} d_2 . {x}{{\mapsto }}c(X{,}d_1{,}d_2{,}u{,}sc){*}{{\mathtt {Q_1}}}(d_1{,}B){*}{{\mathtt {Q_2}}}(d_2{,}X){*}{{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg){\wedge }\pi _0 \end{array} $$

where \(B {\in } \bar{B}\), the matrix \(\kappa '\) contains two nested predicates \(Q_1\) and \(Q_2\), and the heap cell \(c \in \textit{Node}\) is defined as \({{\texttt{data}}}~ c \{c ~ next;c_{1}~down_1;c_{2}~down_2;\tau _s~scdata;\tau _u ~udata\}\) where \(c_{1},c_{2} {\in } \textit{Node}\), \(down_1\) and \(down_2\) fields are for the nested predicates in the matrix heaps, the \(udata\) field is for the transitivity data, and the \(scdata\) field is for ordering data. The rules for the general form of the matrix heaps \(\kappa '\) are presented in [28].

\(\mathtt {{=}R}\) and \(\texttt{Hypothesis}\) eliminate pure constraints in the RHS. In rule \(\mathtt {{*}}\), \({{\texttt{roots}}}(\kappa )\) is defined inductively as: \({{\texttt{roots}}}({\texttt{emp}}){\equiv }\{\}\), \({{\texttt{roots}}}(r{\mapsto }\_\,){\equiv }\{r\}\), \({{\texttt{roots}}}(P(r,F,..)){\equiv }\{r\}\) and \({{\texttt{roots}}}(\kappa _1{*}\kappa _2)\equiv {{\texttt{roots}}}(\kappa _1)\cup {{\texttt{roots}}}(\kappa _2)\). This rule is applied in three ways. First, it is applied to an entailment of the form \(\kappa {\wedge } \pi ~{\vdash }~ \kappa {\wedge } \pi '\). It matches and discards the identical heap predicates on the two sides to generate a premise with empty heaps. As a result, this premise may be applied with the axiom rule \(\texttt{Emp}\). Secondly, it is applied to an entailment of the form \({x_i}{{\mapsto }}c_i(\bar{v}_i){*}...{*}{x_n}{{\mapsto }}c_n(\bar{v}_n) {\wedge } \pi ~{\vdash }~ \kappa ' {\wedge } \pi '\). For each points-to predicate \({x_i}{{\mapsto }}c_i(\bar{v}_i) {\in } \kappa '\), \(\omega \)-ENT searches for a points-to predicate \({x_j}{{\mapsto }}c_j(\bar{v}_j)\) in the LHS such that \({x_j}{{\mapsto }}c_j(\bar{v}_j) \equiv {x_i}{{\mapsto }}c_i(\bar{v}_i)\). Lastly, it is applied to an entailment of the form \(\varDelta _1 *\varDelta ~{\vdash }~ \varDelta _2 *\varDelta '\) where either \(\varDelta _1 ~{\vdash }~\varDelta _2\) or \(\varDelta ~{\vdash }~ \varDelta '\) could be linked back to an internal node.
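The \({{\texttt{roots}}}\) function used by rule \(\mathtt {*}\) is straightforward to compute; a minimal OCaml sketch over an illustrative spatial datatype:

```ocaml
type spatial =
  | Emp
  | PointsTo of string * string list       (* r |-> c(..)                        *)
  | Pred of string * string list           (* P(r, F, ..): first argument = root *)
  | Star of spatial * spatial

(* roots(emp) = {}, roots(r |-> _) = {r}, roots(P(r,F,..)) = {r},
   roots(k1 * k2) = roots(k1) u roots(k2). *)
let rec roots = function
  | Emp -> []
  | PointsTo (r, _) -> [ r ]
  | Pred (_, args) -> (match args with r :: _ -> [ r ] | [] -> [])
  | Star (k1, k2) -> roots k1 @ roots k2
```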

In \(\texttt{RInd}\), for each occurrence of an inductive predicate \({{\texttt{P}}}(r{,} F{,} \bar{B}{,}u{,}sc{,}tg) \) in \(\kappa '\), \(\omega \)-ENT searches for a points-to predicate \(r{\mapsto }\_\,\). If any of these searches fails, \(\omega \)-ENT decides the conclusion as \(\texttt{invalid}\). Rule \(\texttt{LInd}\) unfolds the inductive predicates in the LHS. The LHS of every entailment in this rule also records the unfolding numbers, which capture the subterm relationship and later give rise to the progressing points of the cyclic proofs. These numbers are essential for our system to construct cyclic proofs. This rule is applied in a depth-first manner, i.e., if there is more than one occurrence of an inductive predicate in the LHS to which this rule could be applied, the one with the greatest unfolding number is chosen. We emphasise that the last five rules still work well when the predicate in the RHS contains only a subset of the local properties wrt. the predicate in the LHS.

Back-Link Generation Procedure \(\mathtt {link\_back_{e}}\) generates a back-link as follows. In a pre-proof, given a path containing a back-link, say \({\texttt{e}}_1,{\texttt{e}}_2,..,{\texttt{e}}_m\) where \({\texttt{e}}_1\) is a companion and \({\texttt{e}}_m\) a bud, then \({\texttt{e}}_1\) is in NF and of the following form:

  • \({\texttt{e}}_1{\equiv }{{{\texttt{P}}}(x{,}F{,}\bar{B}{,}u{,}sc{,}tg)^{k}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}} ~ {\vdash }~{{{\texttt{Q}}}(x{,}F_2{,}\bar{B}{,}u{,}sc{,}tg_2){*}\kappa '{\wedge }\pi '}\).

  • \({\texttt{e}}_2\) is obtained from applying \(\texttt{LInd}\) into \({\texttt{e}}_1\). \({\texttt{e}}_2\) is of the form:

    $$ \begin{array}{l} {{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc) {*}\kappa ' {*} {{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg)^{k{+}1}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}{\wedge }\pi _1} \\ \quad {\vdash }~{{{\texttt{Q}}}(x{,}F_2{,}\bar{B}{,}u{,}sc{,}tg_2){*}\kappa '{\wedge }\pi '} \end{array} $$

    We remark that \(sc \diamond sc' \in \pi _1\), and if \(k \ge 1\), then \(sc_i \diamond sc \in \pi \).

  • \({\texttt{e}}_3\), .., \({\texttt{e}}_{m{-}4}\) are obtained from applications of normalisation rules to normalise the LHS of \({\texttt{e}}_2\) due to the presence of \(\kappa '\). As the roots of inductive predicates in \(\kappa '\) are fresh variables, the applications of the normalization rules above do not affect the RHS of \({\texttt{e}}_2\). That means the RHS of \({\texttt{e}}_3\), .., and \({\texttt{e}}_{m{-}4}\) are the same as that of \({\texttt{e}}_2\). As a result, \({\texttt{e}}_{m{-}4}\) is of the form:

    $$ \begin{array}{l} {{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc) {*}\kappa _1'' {*} {{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg)^{k{+}1}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}{\wedge }\pi _1{\wedge }\pi _2} \\ \quad {\vdash }~{{{\texttt{Q}}}(x{,}F_2{,}\bar{B}{,}u{,}sc{,}tg_2){*}\kappa '{\wedge }\pi '} \end{array} $$

    where \(\kappa _1''\) may be \({\texttt{emp}}\) and \(\pi _2\) is a conjunction of disequalities coming from \(\texttt{ExM}\).

  • \({\texttt{e}}_{m{-}3}\) is obtained from the application of \(\texttt{ExM}\) over \(x\) and \(F_2\) and of the form:

    $$ \begin{array}{l} {{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc) {*}\kappa _1'' {*} {{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg)^{k{+}1}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}{\wedge }\pi _1{\wedge }\pi _2} \\ \quad {\wedge }x{\ne }F_2~{\vdash }~{{{\texttt{Q}}}(x{,}F_2{,}\bar{B}{,}u{,}sc{,}tg_2){*}\kappa '{\wedge }\pi '} \end{array} $$

    (For the case \(x{=}F_2\), rule \(\texttt{ExM}\) keeps being applied until either \(F\equiv F_2\), that is, the two sides reach the end of the same heap segment, or the search is stuck.)

  • \({\texttt{e}}_{m{-}2}\) is obtained from the application of \(\texttt{RInd}\) and is of the form:

    $$ \begin{array}{l} {{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc) {*}\kappa _1'' {*} {{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg)^{k{+}1}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}{\wedge }\pi _1{\wedge }\pi _2} \\ \quad {\wedge }x{\ne }F_2~{\vdash }~{{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc){*}\kappa _2'' {*}{{\texttt{Q}}}(X{,}F_2{,}\bar{B}{,}u{,}sc'{,}tg_2){*}\kappa '{\wedge }\pi '}{\wedge }\pi _2' \end{array} $$
  • \({\texttt{e}}_{m{-}1}\) is obtained from the application of the \(\texttt{Hypothesis}\) to eliminate \(\pi _2'\) (otherwise, it is stuck) and is of the form:

    $$ \begin{array}{l} {{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc) {*}\kappa _1'' {*} {{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg)^{k{+}1}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}{\wedge }\pi _1{\wedge }\pi _2} \\ \quad {\wedge }x{\ne }F_2~{\vdash }~{{x}{{\mapsto }}c(X{,}\bar{p}{,}u{,}sc){*}\kappa _2'' {*}{{\texttt{Q}}}(X{,}F_2{,}\bar{B}{,}u{,}sc'{,}tg_2){*}\kappa '{\wedge }\pi '} \end{array} $$
  • \({\texttt{e}}_{m}\) is obtained from the application of \(\mathtt {*}\) and is of the form:

    $$ \begin{array}{l} { {{\texttt{P}}}(X{,}F{,}\bar{B}{,}u{,}sc'{,}tg)^{k{+}1}{*}\kappa {\wedge }\pi {\wedge }x{\ne }F{\wedge }x{\ne }{{{\texttt{null}}}}{\wedge }\pi _1{\wedge }\pi _2 {\wedge }x{\ne }F_2} \\ \quad {\vdash }~{{\texttt{Q}}}(X{,}F_2{,}\bar{B}{,}u{,}sc'{,}tg_2){*}\kappa '{\wedge }\pi ' \end{array} $$

When \(k \ge 1\), it is always possible to link \({\texttt{e}}_m\) back to \({\texttt{e}}_1\) through the substitution \(\sigma {\equiv }[x/X,sc/sc']\) after weakening some pure constraints in its LHS.

4.2 Illustrative Example

We illustrate our system through the following example:

$$ \begin{array}{l} {\texttt{e}}_0{:}~ {{\texttt{lls}}}(x{,}{{{\texttt{null}}}}{,}mi{,}ma)^0 \wedge x{\ne }{{{\texttt{null}}}}~ {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi) \\ \end{array} $$

where the sorted linked list \(\texttt{lls}\) (\(mi\) is the minimum value and \(ma\) is the maximum value) is defined in Sect. 2.1 and \(\texttt{llb}\) defines singly-linked lists whose values are greater than or equal to a constant. In particular, predicate \(\texttt{llb}\) is defined as follows.

$$ \begin{array}{l} {{\mathtt {pred~llb}}}(r{,}F{,}b) ~{\equiv }~ {\texttt{emp}}{\wedge } r{=}F \\ \quad \vee ~ \exists X_{tl}{,}d. r{\mapsto }c_4(X_{tl}{,}d) *{{\texttt{llb}}}(X_{tl}{,}F{,}b) {\wedge } r{\ne }F \wedge b {\le } d\\ \end{array} $$

Since the LHS is stronger than the RHS, this entailment is valid. Our system generates the cyclic proof shown in Fig. 4 to establish the validity of \({\texttt{e}}_0\). In the following, we show step by step how the proof is created. Firstly, rule \(\texttt{LInd}\) is applied to \({\texttt{e}}_{0}\), which is in NF, to unfold predicate \({{\texttt{lls}}}(x{,}{{{\texttt{null}}}}{,}mi{,}ma)^0\) and obtain \({\texttt{e}}_{1}\) as:

$$ \begin{array}{ll} {\texttt{e}}_{1}{:}&{} {x}{{\mapsto }}c_4(X,m') *{{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' ~ {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi) \end{array} $$

We remark that the unfolding number of the recursive predicate \(\texttt{lls}\) in the LHS is increased by \(1\). Next, our system normalizes \({\texttt{e}}_{1}\) by applying rule \(\texttt{ExM}\) to \(X\) and \({{{\texttt{null}}}}\) to generate two children, \({\texttt{e}}_{2}\) and \({\texttt{e}}_{3}\), as follows.

$$ \begin{array}{ll} {\texttt{e}}_{2}{:}&{} {x}{{\mapsto }}c_4(X,m') *{{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' \wedge X{=}{{{\texttt{null}}}}\\ &{} \quad {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi) \\ {\texttt{e}}_{3}{:}&{} {x}{{\mapsto }}c_4(X,m') *{{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' \wedge X{\ne }{{{\texttt{null}}}}\\ {} &{} \quad {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi) \end{array} $$
Fig. 4. Cyclic proof of \({{\texttt{lls}}}(x{,}{{{\texttt{null}}}}{,}mi{,}ma)^0 {\wedge }x{\ne }{{{\texttt{null}}}}~ {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi)\).

For the left child, the system applies normalization rules to obtain \({\texttt{e}}_4\) (substituting \({{{\texttt{null}}}}\) for \(X\)) and then applies \(\texttt{LBase}\), which unfolds \({{\texttt{lls}}}({{{\texttt{null}}}}{,}{{{\texttt{null}}}}{,}m'{,}ma)^1\) into its base case, to obtain \({\texttt{e}}_5\):

$$ \begin{array}{ll} {\texttt{e}}_{4}{:}&{} {x}{{\mapsto }}c_4({{{\texttt{null}}}},m')*{{\texttt{lls}}}({{{\texttt{null}}}}{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' ~ {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi) \\ {\texttt{e}}_{5}{:}&{} {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma) \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }ma ~ {\vdash }~ {{\texttt{llb}}}(x{,}{{{\texttt{null}}}}{,}mi) \end{array} $$

Now, \({\texttt{e}}_5\) is in NF. \(\mathtt {S2S_{Lin}}\) applies \(\texttt{RInd}\) and then \(\texttt{RBase}\) to \(\texttt{llb}\) in the RHS as:

$$ \begin{array}{ll} {\texttt{e}}_{6}{:}&{} {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma) \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }ma \\ {} &{} \quad {\vdash }~ {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma)*{{\texttt{llb}}}({{{\texttt{null}}}}{,}{{{\texttt{null}}}}{,}mi) \wedge mi{\le }ma\\ {\texttt{e}}_{6'}{:}&{} {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma) \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }ma ~ {\vdash }~ {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma) {\wedge }mi{\le }ma \end{array} $$

After that, as \(mi{\le }ma \Rightarrow mi{\le }ma\), \({\texttt{e}}_{6'}\) is applied with \(\texttt{Hypothesis}\) to obtain \({\texttt{e}}_{7}\).

$$ {\texttt{e}}_{7}{:}~ {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma) \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }ma ~ {\vdash }~ {x}{{\mapsto }}c_4({{{\texttt{null}}}},ma) $$

As the LHS of \({\texttt{e}}_{7}\) is in NF and a base formula, it is sound and complete to apply rule \(*\) to obtain \({\texttt{e}}_{8}\) as \( {\texttt{emp}}\wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }ma ~ {\vdash }~ {\texttt{emp}}\). By \(\texttt{Emp}\), \({\texttt{e}}_{8}\) is decided as \({{{{\texttt{valid}}}}}\). For the right branch of the proof, \({\texttt{e}}_{3}\) is applied with rule \(\mathtt {{\ne }{*}}\) and then \(\texttt{RInd}\) to obtain \({\texttt{e}}_{9}\):

$$ \begin{array}{ll} {\texttt{e}}_{9}{:}&{} {x}{{\mapsto }}c_4(X,m'){*} {{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' \wedge X{\ne }{{{\texttt{null}}}}\wedge x{\ne }X \\ {} &{} \quad {\vdash }~ {x}{{\mapsto }}c_4(X,m'){*} {{\texttt{llb}}}(X{,}{{{\texttt{null}}}}{,}mi) {\wedge }mi{\le }m' \end{array} $$

Then, \({\texttt{e}}_{9}\) is applied with \(\texttt{Hypothesis}\) to eliminate the pure constraint in the RHS:

$$ \begin{array}{ll} {\texttt{e}}_{10}{:}&{} {x}{{\mapsto }}c_4(X,m'){*} {{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' \wedge X{\ne }{{{\texttt{null}}}}\wedge x{\ne }X \\ {} &{} \quad {\vdash }~ {x}{{\mapsto }}c_4(X,m'){*} {{\texttt{llb}}}(X{,}{{{\texttt{null}}}}{,}mi) \end{array} $$

\({\texttt{e}}_{10}\) is then applied with the rule \(\mathtt {*}\) to obtain \({\texttt{e}}_{11}\) and \({\texttt{e}}_{12}\) as follows.

$$ \begin{array}{ll} {\texttt{e}}_{11}{:}&{} {x}{{\mapsto }}c_4(X,m')~ {\vdash }~ {x}{{\mapsto }}c_4(X,m') \\ {\texttt{e}}_{12}{:}&{} {{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 \wedge x{\ne }{{{\texttt{null}}}}\wedge mi{\le }m' \wedge X{\ne }{{{\texttt{null}}}}\wedge x{\ne }X ~ {\vdash }~ {{\texttt{llb}}}(X{,}{{{\texttt{null}}}}{,}mi) \end{array} $$

\({\texttt{e}}_{11}\) is valid by \(\texttt{Id}\). \({\texttt{e}}_{12}\) is successfully linked back to \({\texttt{e}}_{0}\) to form a pre-proof, since

$$ {({{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1 {\wedge }X{\ne }{{{\texttt{null}}}})[x/X,mi/m'] ~{\vdash }~ {{\texttt{llb}}}(X{,}{{{\texttt{null}}}}{,}mi)[x/X,mi/m'] } $$

is identical to \({\texttt{e}}_0\). Since \({{\texttt{lls}}}(X{,}{{{\texttt{null}}}}{,}m'{,}ma)^1\) in \({\texttt{e}}_{12}\) is a subterm of \({{\texttt{lls}}}(x{,}{{{\texttt{null}}}}{,}mi{,}ma)^0\) in \({\texttt{e}}_{0}\), our system decides that \({\texttt{e}}_{0}\) is valid, with the cyclic proof presented in Fig. 4.

5 Soundness, Completeness, and Complexity

We describe the soundness, termination, and completeness of \(\omega \)-ENT. First, we state an invariant about the entailments derived by our system.

Corollary 1

Every entailment derived from \(\omega \)-ENT is quantifier-free.

The following lemma shows the soundness of the proof rules.

Lemma 1

(Soundness). For each proof rule, the conclusion is valid if all premises are valid.

As every backlink generated contains at least one pair of inductive predicate occurrences in a subterm relationship, the global soundness condition holds in our system.

Lemma 2

(Global Soundness). Every pre-proof derived by \(\omega \)-ENT is indeed a cyclic proof.

Termination relies on the number of premises/entailments generated by \(*\). As the number of inductive symbols and their arities are finite, there are finitely many equivalence classes of these entailments, where any two entailments in the same class are equivalent under some substitution and can be linked back together. Therefore, taking back-link generation into account, the number of premises generated by the rule \(*\) is finite.

Lemma 3

\(\omega \)-ENT terminates.

In the following, we present the complexity analysis. First, we show that every occurrence of an inductive predicate in the LHS is unfolded at most twice.

Lemma 4

Given any entailment \({{\texttt{P}}}(\bar{v})^k *\varDelta _a~{\vdash }~\varDelta _c\), \(0 \le k \le 2\).

Let n be the maximum number of predicates (both inductive predicates and points-to predicates) among the LHS of the input and the definitions in \(\mathcal {P}\), and let \(m\) be the maximum number of fields of data structures. Then, the complexity is as follows.

Proposition 1

(Complexity). \({\mathtt{QF{\_}ENT{-}SL_{LIN}}}\) is \(\mathcal {O}(n \times 2^{m} + n^3)\).

If \(m\) is bounded by a constant, the procedure runs in polynomial time.

Our completeness proof proceeds in two steps. First, we prove the result for entailments whose LHS is a base formula. Second, we prove correctness when the LHS contains inductive predicates. In the following, we first define the base formulas of the LHS derived by \(\omega \)-ENT from occurrences of inductive predicates. Based on that, we define bad models to capture counter-models of invalid entailments.

Definition 4

( \({{\texttt{SHLIDe}}}\) Base) Given \(\kappa \), define \({\overline{\kappa }}\) as follows.

$$ \begin{array}{l} {\overline{{{\texttt{P}}}(E{,}F{,}\bar{B}{,}u{,}sc{,}tg)}} ~{\overset{{\text {def}}}{=}}~ {E}{{\mapsto }}c(F{,}E_1{,}E_2{,}u{,}tg) *{\overline{{{\mathtt {Q_1}}}(E_1{,}B)}} {*} {\overline{{{\mathtt {Q_2}}}(E_2{,}F)}} {\wedge } \pi _0 \\ {\overline{{E}{{\mapsto }}c(\bar{v})}} ~{\overset{{\text {def}}}{=}}~ {E}{{\mapsto }}c(\bar{v}) \qquad \quad {\overline{{\texttt{emp}}}} ~{\overset{{\text {def}}}{=}}~ {\texttt{emp}}\qquad \quad {\overline{\kappa _1 {*} \kappa _2}} ~{\overset{{\text {def}}}{=}}~ {\overline{\kappa _1}} {*} {\overline{\kappa _2}} \end{array} $$

The definition for general predicates with arbitrary matrix heaps is presented in [28]. As \(\mathcal {P}\) does not include mutual recursion (Condition C3), the definition above terminates in a finite number of steps. In a pre-proof, these \({{\texttt{SHLIDe}}}\) base formulas of the LHS are obtained once every inductive predicate has been unfolded.

Lemma 5

If \(\kappa \wedge \pi \) is in NF, then \({\overline{\kappa }} \wedge \pi \) is in NF, and \({\overline{\kappa }}\wedge \pi ~{\vdash }~ \kappa \) is valid.

In other words, \({\overline{\kappa }}\wedge \pi \) is an under-approximation of \({\kappa }\wedge \pi \); invalidity of \({\overline{\kappa }}\wedge \pi ~{\vdash }~ \varDelta '\) implies invalidity of \({\kappa }\wedge \pi ~{\vdash }~ \varDelta '\).

Definition 5

(Bad Model) The bad model for \({\overline{\kappa }} \wedge \phi \wedge a\) in NF is obtained by assigning

  • a distinct non-\({{{\texttt{null}}}}\) value to each variable in \(\textit{FV}({\overline{\kappa }} \wedge \phi )\); and

  • a value to each variable in \(\textit{FV}(a)\) such that \(a\) is satisfied.
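The construction of Definition 5 can be sketched in OCaml as follows; the representation and the external arithmetic solver solve_arith are hypothetical placeholders of ours.

```ocaml
module M = Map.Make (String)

type value = Loc of int | Int of int

(* Assign pairwise distinct non-null locations 1, 2, 3, ... to the pointer
   variables of FV(kappa-bar /\ phi). *)
let bad_pointer_assignment (fvs : string list) : value M.t =
  fst
    (List.fold_left
       (fun (s, next) v -> (M.add v (Loc next) s, next + 1))
       (M.empty, 1) fvs)

(* The arithmetic variables of FV(a) receive values from any model of the
   satisfiable constraint [a]; solve_arith stands for an external solver. *)
let bad_model ~(solve_arith : string -> (string * int) list)
    (fvs : string list) (a : string) : value M.t =
  List.fold_left
    (fun s (v, n) -> M.add v (Int n) s)
    (bad_pointer_assignment fvs)
    (solve_arith a)
```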

Lemma 6

  1. For every proof rule except the rule \(\mathtt {*}\), all premises are valid only if the conclusion is valid.

  2. For the rule \(\mathtt {*}\), where the conclusion is of the form \({{\varDelta ^b}} ~{\vdash }~ \kappa '\), all premises are valid only if the conclusion is valid and \({\varDelta ^b}\) is in NF.

The following lemma states the correctness of the procedure \(\mathtt {is\_closed}\) for cases 2(b)-(d).

Lemma 7

(Stuck Invalidity). Given \(\kappa {\wedge }\pi ~{\vdash }~\varDelta '\) in NF, it is \(\texttt{invalid}\) if the procedure \(\mathtt {is\_closed}\) returns \(\texttt{invalid}\) for cases 2(b-d).

A bad model of \({\overline{\kappa }}{\wedge }\pi \) is a counter-model. Cases 2(b) and 2(c) show that the heaps of bad models are not connected, and thus, according to conditions C1 and C2, a model of the LHS cannot be a model of the RHS. Case 2(d) shows that the heaps of the two sides cannot be matched. We next show the correctness of case 2(a) of the procedure \(\mathtt {is\_closed}\) and that invalidity is preserved during the proof search of \(\omega \)-ENT.

Proposition 2

(Invalidity Preservation). If \(\omega \)-ENT is stuck, the input is invalid.

In other words, if \(\omega \)-ENT returns invalid, we can construct a bad model.

Theorem 2

\(\mathtt{QF{\_}ENT{-}SL_{LIN}}\) is decidable.

6 Implementation and Evaluation

We implemented \(\mathtt {S2S_{Lin}}\) in OCaml. This implementation is an instantiation of a general framework for cyclic proofs. We utilize the cyclic proof system that derives bases for inductive predicates, shown in [24], to discharge the satisfiability of separation logic formulas. We use the solver presented in [29, 31] for formulas beyond this fragment. We also developed a built-in solver for discharging equalities.

We evaluated \(\mathtt {S2S_{Lin}}\) to show that i) it can discharge problems in \({{\texttt{SHLIDe}}}\) effectively; and ii) its performance is competitive with state-of-the-art solvers. The evaluation of \(\mathtt {S2S_{Lin}}\) is provided as a companion artifact [27].

Experiment settings We evaluated \(\mathtt {S2S_{Lin}}\) on entailment problems taken from the SL-COMP benchmarks [38], a competition of separation logic solvers. We took 356 problems (out of 983) from two divisions of the competition, qf_shls_entl and qf_shlid_entl, and one new division, qf_shlid2_entl. All these problems semantically belong to our decidable fragment, and their syntax is written in the SMT 2.6 format [39].

  • Division qf_shls_entl includes 296 entailment problems, \(122\) \(\texttt{invalid}\) problems and \(174\) \(\texttt{valid}\) problems, with only singly-linked lists. They were randomly generated by the authors of [33].

  • Division qf_shlid_entl contains 60 entailment problems handcrafted by the authors of [15]. They include singly-linked lists, doubly-linked lists, lists of singly-linked lists, or skip lists. Furthermore, the system of inductive predicates must satisfy the following condition: for two different predicates \({{\texttt{P}}}\), \({{\texttt{Q}}}\) in the system of definitions, either \({{\texttt{P}}} \prec ^*_{\mathcal {P}}{{\texttt{Q}}}\) or \({{\texttt{Q}}} \prec ^*_{\mathcal {P}}{{\texttt{P}}}\).

  • In the third division, we introduce new benchmarks, with 27 problems, beyond the above two divisions. In particular, every system of predicate definitions includes two predicates, \({{\texttt{P}}}\) and \({{\texttt{Q}}}\), that are semantically equivalent. We have submitted this division to the GitHub repository of SL-COMP.

To evaluate \(\mathtt {S2S_{Lin}}\)’s performance, we compared it with the state-of-the-art tools \(\mathtt {Cyclist_{SL}}\) [5], \(\texttt{Spen}\) [15], \(\texttt{Songbird}\) [40], SLS [41] and Harrsh [23]. We omitted Cycomp [42], as these benchmarks are beyond its decidable fragment. Note that \(\mathtt {Cyclist_{SL}}\), \(\texttt{Songbird}\) and SLS are not complete; for non-valid problems, \(\mathtt {Cyclist_{SL}}\) returns \(\texttt{unknown}\), while \(\texttt{Songbird}\) and SLS use some heuristics to guess the outcome. For each division, we report the number of correct outputs (\(\texttt{invalid}\), \(\texttt{valid}\)) and the time (in minutes and seconds) taken by each tool. Note that we use the status (\(\texttt{invalid}\), \(\texttt{valid}\)) annotated with each problem in the SL-COMP benchmark as the ground truth. If the output is the same as the status, we classify it as correct; otherwise, it is marked as incorrect. We also note that in these experiments, we used the competition pre-processing tool [39] to transform the SMT 2.6 format into the corresponding formats of the tools before running them. All experiments were performed on an Intel Core i7-6700 CPU at 3.4GHz with 8GB RAM. The CPU timeout is 600 seconds.

Table 1. Experimental results

Experiment results The experimental results are reported in Table 1. The first column of this table gives the names of the tools. The following three columns show the results of the first division: the number of correct \(\texttt{invalid}\) outputs, the number of correct \(\texttt{valid}\) outputs, and the time taken (m for minutes, s for seconds). The number in parentheses in the third row gives the number of problems in the corresponding column. Similarly, the following six columns, in two groups of three, describe the results of the second and third divisions, respectively.

In general, the experimental results show that \(\mathtt {S2S_{Lin}}\) is the only tool that produced all the correct results. The other solvers either produced wrong results or discharged only a fraction of the problems. Moreover, \(\mathtt {S2S_{Lin}}\) took little time over the whole suite (8.38 seconds, compared to 15.91 seconds for \(\texttt{Spen}\), 324 minutes for \(\texttt{Songbird}\), 635 minutes for Harrsh, 739 minutes for SLS and 2120 minutes for \(\mathtt {Cyclist_{SL}}\)). While SLS returned 14 false negatives, \(\texttt{Spen}\) reported 20 false positives; \(\mathtt {Cyclist_{SL}}\), \(\texttt{Songbird}\) and Harrsh did not produce any wrong results. Of 569 tests, \(\mathtt {Cyclist_{SL}}\) handled 85 (15%), Harrsh handled 215 (38%), and \(\texttt{Songbird}\) decided 235 (41.3%). Out of the 223 \(\texttt{valid}\) tests, \(\mathtt {Cyclist_{SL}}\) handled 85 (38%) and \(\texttt{Songbird}\) decided 222 (99.5%).

Now we examine the results for each division in detail. For qf_shls_entl, \(\texttt{Spen}\) returned all correct results, \(\texttt{Songbird}\) 186, Harrsh 155, and \(\mathtt {Cyclist_{SL}}\) 58. With the timeout raised to 2400 seconds, both \(\texttt{Songbird}\) and Harrsh produced all correct results. Division qf_shlid_entl includes \(24\) \(\texttt{invalid}\) and \(36\) \(\texttt{valid}\) problems. \(\texttt{Songbird}\) solved 37 problems correctly, and \(\mathtt {Cyclist_{SL}}\) 24. \(\texttt{Spen}\) reported 27 correct results and 13 false positives (\(\mathtt {skl2{-}vc\{01-04\}}\), \(\mathtt {skl3{-}vc01}\), \(\mathtt {skl3{-}vc\{03-10\}}\)). The last division, qf_shlid2_entl, includes 14 \(\texttt{invalid}\) and 13 \(\texttt{valid}\) problems. \(\texttt{Songbird}\) decided only 12 problems correctly, and \(\mathtt {Cyclist_{SL}}\) produced 3 correct outcomes. \(\texttt{Spen}\) reported 10 correct results but also 7 false positives (\(\mathtt {ls{-}mul{-}vc\{01-03\}}\), \(\mathtt {ls{-}mul{-}vc05}\), \(\mathtt {nll{-}mul{-}vc\{01-03\}}\)).

We believe that engineering design and effort play an essential role alongside theory development. Since our experiments provide a breakdown of the results over two SL-COMP divisions, we hope they offer an initial understanding of the SL-COMP benchmarks and tools, and thereby reduce the effort needed to set up experiments over these benchmarks when evaluating new SL solvers. Finally, one might point out that \(\mathtt {S2S_{Lin}}\) performed well because the entailments in the experiments are within its scope. We do not entirely disagree, but we emphasize that tools do not always work well even on favourable benchmarks: \(\texttt{Spen}\) produced wrong results on qf_shlid_entl, and Harrsh did not handle qf_shlid_entl and qf_shlid2_entl well, although these problems lie in their decidable fragments.

7 Related Work

\(\mathtt {S2S_{Lin}}\) is a variant of the cyclic proof systems [3,4,5, 26] and [42]. Unlike existing cyclic proof systems, the soundness of \(\mathtt {S2S_{Lin}}\) is local, and its proof search does not back-track. The work presented in [42] shows the completeness of its cyclic proof system; its main contribution is the introduction of the rule \(*\) for entailments whose RHS contains a disjunction obtained from predicate unfolding. In contrast to [42], our work includes a normalization that soundly and completely avoids disjunctions in the RHS during unfolding. Moreover, our decidable fragment \({{\texttt{SHLIDe}}}\) does not overlap with the cone predicates introduced in [42]. Furthermore, due to the empty heap in the base cases, the matching rule in [42] cannot be applied to the predicates in \({{\texttt{SHLIDe}}}\). Finally, our work also shows how to obtain the global soundness condition for cyclic proofs.

Our work relates to the inductive theorem provers introduced in [10, 40] and to Smallfoot [2]. While [10] is based on structural induction, [40] is based on mathematical induction. Smallfoot [2] proposed a decision procedure for linked lists and trees; it used a fixed compositional rule, as a consequence of induction reasoning, to handle inductive entailments. Compared with Smallfoot, our proof system replaces the compositional rule with a combination of the rule \(\texttt{LInd}\) and back-link construction, and it supports induction reasoning over a much more expressive fragment of inductive predicates.

Our proposal also relates to works that use lemmas as consequences of induction reasoning [2, 16, 30, 41]. The works in [16, 25, 30, 41] automatically generate lemmas for certain classes of inductive predicates. S2 [25] generated lemmas (e.g., split and equivalence lemmas) to normalize the shapes of the synthesized data structures. [16] proposed generating several sets of lemmas, not only for compositional predicates but also for relating different predicates (e.g., completion lemmas, stronger lemmas and static parameter contraction lemmas). SLS [41] aims to infer general lemmas to prove an entailment. Similarly, S2ent [30] solves a more general problem, frame inference, using cyclic proofs and lemma synthesis: it infers a shape-based residual frame in the LHS and then synthesizes the pure constraints over the two sides.

\(\mathtt {S2S_{Lin}}\) relates to model-based decision procedures that reduce the entailment problem in separation logic to a well-studied problem in another domain. For instance, in [8, 11, 17], the entailment problem for singly-linked lists and their invariants is reduced to an inclusion-checking problem over graphs. The authors of [18] reduced the entailment problem to the satisfiability problem of monadic second-order logic; this reduction handles an expressive fragment of spatial predicates of bounded tree width. Moreover, the work presented in [23] gives a model-based decision procedure for a sub-fragment of the bounded tree-width fragment. Furthermore, while the work in [15, 19] reduced the entailment problem to the inclusion-checking problem for tree automata, [21] presented an idea to reduce it to inclusion checking for heap automata. While the procedure in [15] supported compositional predicates (with single and double links) well, the procedure in [19] could handle predicates satisfying local properties (e.g., trees with parent pointers). Our decidable fragment subsumes the ones described in [2, 11, 15] but is incomparable to the ones presented in [8, 17,18,19]. The works in [34] and [35, 36] reduced the entailment problem in separation logic to the satisfiability problem in SMT. While GRASShopper [35, 36] can handle transitive-closure pure properties, \(\mathtt {S2S_{Lin}}\) supports local ones. Unlike GRASShopper, which reduces entailment to SMT problems, \(\mathtt {S2S_{Lin}}\) reduces an entailment to admissible entailments and detects repetitions via cyclic proofs.

Decidable fragments and complexity results for the entailment problem in separation logic with inductive predicates have been well studied. Entailment is in 2-EXPTIME for cone predicates [42] and for the bounded tree-width predicates and beyond [14, 18], and in EXPTIME for a sub-fragment of cone predicates [19]. In the other class, entailment is in polynomial time for singly-linked lists [11] and for semantically linear inductive predicates [15]. Moreover, the extensions with arithmetic [17] remain in polynomial time but become EXPTIME when the lists are extended with double links [8]. \({{\texttt{SHLIDe}}}\) (with nested lists, trees and arithmetic properties) lies roughly in the “middle” of these two classes: its entailment problem is in EXPTIME and becomes polynomial time under the upper-bound restriction on the number of fields.

8 Conclusion

We have presented a novel decision procedure for the quantifier-free entailment problem in separation logic combined with inductive definitions of compositional predicates and pure properties. Our proposal is the first complete cyclic proof system for this problem that does not require back-tracking. We have implemented the proposal in \(\mathtt {S2S_{Lin}}\) and evaluated it on a set of nontrivial entailments taken from the SL-COMP competition. The experimental results show that our proposal is effective and efficient compared to state-of-the-art solvers. For future work, we plan to develop a bi-abductive procedure based on an extension of this work with the cyclic frame inference procedure presented in [30]. This extension is fundamental to obtaining a compositional shape analysis beyond lists and trees. Another direction is to formally prove that our system is as strong as Smallfoot on the decidable fragment with lists and trees [2]: given an entailment, if Smallfoot can produce a proof, so can \(\mathtt {S2S_{Lin}}\).