Induction and Skolemization in saturation theorem proving

We consider a typical integration of induction in saturation-based theorem provers and investigate the effects of Skolem symbols occurring in the induction formulas. In a practically relevant setting we establish a Skolem-free characterization of refutation in saturation-based proof systems with induction. Finally, we use this characterization to obtain unprovability results for a concrete saturation-based induction prover.


Introduction
Automated inductive theorem proving (AITP) is a branch of automated deduction that aims at automating the process of finding proofs that involve mathematical induction. In first-order automated theorem proving (ATP) we try to establish validity, whereas in AITP one is usually interested in proving that a formula is true in the standard model of some inductive type, such as natural numbers, lists, or trees. By Gödel's incompleteness theorems, truth in the standard model is in general not semi-decidable (even worse, it is in general not even arithmetically definable). Hence, for AITP there is much more freedom in the choice of proof systems than there is for ATP. In practice we see methods that make use of typical first-order induction schemata, Hilbert-style induction rules (for example [KP13,Ker14]), and even more exotic cyclic calculi (see [Bro05,BGP12]) that can exceed the power of the first-order induction schema [BT17,BT19].
The most prominent applications of automated inductive theorem proving are found in formal methods for software engineering. For example, the formal verification of software relies strongly on one or another form of induction since any non-trivial program contains some form of loops or recursion. Besides the applications in software engineering, AITP methods have applications in the formalization of mathematics. For instance, AITP methods can be employed by proof assistants to explore a theory in order to provide useful lemmas [JRSC14], [JDB09].
A wide variety of methods for automated inductive theorem proving have been developed: there are methods based on recursion analysis [BM79, Ste88, BvHH+89], proof by consistency [Com01], rippling [BSvH+93], cyclic proofs [BGP12], extensions of saturation-based provers [BHHW86, KP13, Ker14, Cru15, Cru17, EP20, RV19, HHK+20, Wan17], tree grammar provers [EH15], theory exploration based provers [CJRS13], rewriting induction [Red90], encoding [Sch20], and extensions of SMT solvers [RK15]. Many methods integrate the induction mechanism more or less tightly within a proof system that is well-suited for automation. Therefore, these methods exist mainly at lower levels of abstraction, often close to an actual implementation. Such methods are traditionally evaluated empirically on a set of benchmark problems such as the one described by Claessen et al. [CJRS15]. Formal explanations backing the observations obtained by the empirical evaluation are still rare. As of now, it is difficult to classify methods according to their strength and to give theoretical explanations of an empirically observed failure of a given method in a particular context.
The work in this article is part of a research program that aims at analyzing methods for AITP by applying techniques and results from mathematical logic. The purpose of this is twofold. Firstly, formal analyses allow us to complement and to explain the empirical knowledge obtained by the practical evaluations of AITP methods. Secondly, the analyses carried out during this program will inevitably lead to a development of the logical foundations of automated inductive theorem proving. In particular, we believe that practically relevant negative results are especially valuable in revealing the features a method is lacking. Thus, negative results may drive the development of new methods. Moreover, we believe that this research program will strengthen the link between the research in automated inductive theorem proving and mathematical logic, and therefore, may lead to cross-fertilization by providing interesting theoretical techniques from mathematical logic and new problems for mathematical logic.
As part of this research program Hetzl and Wong [HW18] have given some observations on the logical foundations of inductive theorem proving. Vierling [Vie18] has analyzed the n-clause calculus [KP13,Ker14] resulting in an estimate of the strength of this calculus. Building on this analysis Hetzl and Vierling [HV20] have further abstracted the n-clause calculus and situated this calculus with respect to some logical theories. The authors are currently also working on an unprovability result for the n-clause calculus.
Research on AITP has recently focused increasingly on integrating mathematical induction in saturation-based theorem provers [KP13, Ker14, Cru15, Cru17, Wan17, EP20, RV19, HHK+20]. In this article we propose abstractions of these systems and investigate how Skolemization interferes with induction in such a system. In a fairly general yet practically relevant setting we are able to show that Skolem symbols take the role of induction parameters. We use this insight to provide unprovability results for a family of methods using induction for quantifier-free formulas. This allows us in particular to obtain unprovability results for the concrete method described in [RV19, HHK+20].
In this article we will provide a unified view of a commonly used strategy to integrate induction into saturation-based theorem proving and concentrate on the role of Skolemization in these systems. To our knowledge the interaction between induction and Skolemization has not been investigated in the related literature. Section 2 introduces all the necessary notations related to our logical formalism, our presentation of Skolemization, and the arithmetic theories used in this article. We will give a precise presentation of Skolemization that imposes a concrete naming schema, which will be particularly useful in dealing with the languages generated by saturation systems. In Section 3 we give an abstract description of saturation-based proof systems and describe abstractly a common strategy to integrate induction in such systems. We furthermore present a restriction of this system that generalizes a way to handle induction found in most practical saturation systems with induction. Section 4 gives a clear characterization of refutation in saturation systems with an unrestricted induction rule (see Theorem 4.11) and analyzes the effects of Skolemization on the induction. In Section 5 we analyze the effect of Skolemization in syntactically restricted systems that are closer to the practical methods. This section culminates in a Skolem-free characterization of these systems (see Theorem 5.23). Finally, in Section 6 we make use of the results from Section 5 to provide practically relevant unprovability results for a family of methods using quantifier-free induction formulas (see Theorem 6.6) and apply this result to the concrete method presented in [RV19, HHK+20].

Preliminary Definitions
In this section we settle the details of the logical formalism that we use throughout the article. For the sake of clarity we try to adhere as much as possible to standard terminology, but we introduce some non-standard notations where it is beneficial for the presentation. In Section 2.1 we describe our logical formalism and the related notations such as clauses. Section 2.2 introduces some definitions and well-known results related to Skolemization and in particular the naming schema for Skolem symbols that we adopt in this article. Finally, in Section 2.3 we recall some notions of formal arithmetic and introduce a particular theory of formal arithmetic that will be of use at various occasions.

Formulas, theories, and clauses
We work in a setting of classical single-sorted first-order logic with equality. That is, besides the usual logical symbols we have a logical binary predicate symbol = denoting equality. In the context of automated theorem proving it is common to work in a many-sorted setting, but in order to keep the presentation simple we only use one sort. All our definitions and results easily generalize to the many-sorted case. A first-order language L is a countable set of function symbols and predicate symbols with their respective arities. Let σ be a (function or predicate) symbol, then we write σ/n to denote that σ has arity n ∈ N. Terms are constructed from function symbols and variables. Formulas are constructed as usual from atomic formulas, the connectives ¬, ∨, ∧, →, and the quantifiers ∃ and ∀. In order to save some parentheses we assume the following order of precedence for the propositional connectives: ¬, ∨, ∧, →. By F(L) we denote the set of L formulas. The notions of bound variables and free variables are defined as usual. By FV(ϕ) we denote the set of free variables of a formula ϕ. A formula that has no free variables is called a sentence. By $(\exists! y)\phi(\vec{x}, y)$ we abbreviate the formula $(\exists y)\phi(\vec{x}, y) \wedge (\forall y_1, y_2)(\phi(\vec{x}, y_1) \wedge \phi(\vec{x}, y_2) \rightarrow y_1 = y_2)$.
In this article we are more interested in the axioms of a theory than in the deductive closure of these axioms. Hence, we define a theory as a set of axioms and manipulate the deductive closure by means of the first-order provability relation (see Definition 2.2).
Definition 2.1 (Theories). Let L be a first-order language, then a first-order L theory T is a set of L sentences called the axioms of T .
For the sake of legibility we often present the axioms of a theory as a list of formulas with free variables, with the intended meaning that these formulas are universally closed. By L(T ) we denote the language of the theory T . When no confusion arises we sometimes write T in places where L(T ) is expected.
Definition 2.2 (Provability). Let ϕ be a sentence and T a theory, then we write T ⊢ ϕ to denote that ϕ is provable in first-order logic from the axioms of T. Let Γ be a set of sentences, then we write T ⊢ Γ to denote that T ⊢ ϕ for all sentences ϕ ∈ Γ. Let $T_1$ and $T_2$ be theories, then we write $T_1 \equiv T_2$ if $T_1 \vdash T_2$ and $T_2 \vdash T_1$.
Let $\phi(\vec{x})$ be a formula and T a theory, then in order to ease the notation we will sometimes write $T \vdash \phi(\vec{x})$ in place of $T \vdash (\forall \vec{x})\phi(\vec{x})$.
Definition 2.3 (Conservativity). Let $T_1$ and $T_2$ be theories, and Γ a set of formulas. We say that for some first-order language L, then we may simply write $T_1 \sqsubseteq_L T_2$ for $T_1 \sqsubseteq_{F(L)} T_2$.
Automated theorem provers, in particular saturation systems, usually do not operate directly on formulas but instead operate on clauses and clause sets (see Section 3).
Definition 2.4 (Literals and clauses). Let L be a first-order language. An L literal is an L atom or the negation thereof. An L clause is a finite set of L literals. An L clause set is a set of L clauses. By $\square$ we denote the empty clause. Let C and D be clauses, then we write C ∨ D for the union of the clauses C and D. Let $\mathcal{C}$ be a clause set and D a clause, then we write $\mathcal{C} \vee D$ to denote the clause set $\{C \vee D \mid C \in \mathcal{C}\}$. Furthermore, we write $L(\mathcal{C})$ to denote the language of $\mathcal{C}$, that is, the set of non-logical symbols that occur in clauses of $\mathcal{C}$.
Whenever the language L is clear from the context or irrelevant, we simply speak of clauses and clause sets instead of L clauses and L clause sets.
We will now recall some basic model-theoretic concepts and notations. Let L be a language, then an L structure is a pair M = (D, I), where D is a non-empty set and I is an interpretation. The interpretation I is a function that assigns to each symbol σ/k ∈ L an interpretation $\sigma^I$ such that if σ is a predicate symbol, then $\sigma^I \subseteq D^k$ and if σ is a function symbol, then $\sigma^I : D^k \to D$. Let $\phi(x_1, \dots, x_n)$ be an L formula and $d_1, \dots, d_n \in D$, then we write $M, \{x_i \mapsto d_i \mid i = 1, \dots, n\} \models \phi$ if ϕ is true in M under the variable assignment that assigns $d_i$ to $x_i$ for i = 1, . . . , n.
Definition 2.5 (Notation). Let L be a language, M = (D, I) an L structure, then we define |M| = D. Moreover, we sometimes write d ∈ M if d ∈ D and for a symbol σ ∈ L, we also denote its interpretation $\sigma^I$ in M by $\sigma^M$. Let $\phi(x_1, \dots, x_n)$ be an L formula and $d_1, \dots, d_n \in D^{|\vec{x}|}$, then we write
Definition 2.6. Let L be a language and M a first-order structure, then we define
We are often interested in formulas that have a certain structure.
Definition 2.7. We say that a formula is $\exists_0$ (or $\forall_0$ or open) if it is quantifier-free. We say that a formula is $\exists_{n+1}$ ($\forall_{n+1}$) if it is of the form $(\exists \vec{x})\phi(\vec{x}, \vec{y})$ ($(\forall \vec{x})\phi(\vec{x}, \vec{y})$), where ϕ is $\forall_n$ ($\exists_n$) and $\vec{x}$ is a possibly empty vector of variables. Let L be a first-order language, then by Literal(L), Open(L), $\exists_n(L)$, and $\forall_n(L)$ we denote the set of literals, open formulas, $\exists_n$ formulas, and $\forall_n$ formulas of the language L. We say that a theory is $\forall_n$ ($\exists_n$) if all of its axioms are $\forall_n$ ($\exists_n$).
As mentioned above, automated theorem provers often work on sets of clauses, rather than formulas. Hence, it is necessary to discuss how formulas are associated with clause sets. In the following definition we fix one such translation that we use throughout the article.
Definition 2.8. By CNF we denote a fixed function that assigns to any $\forall_1$ sentence ϕ a clause set $\mathcal{C}_\phi$ such that $L(\phi) = L(\mathcal{C}_\phi)$ and ϕ and $\mathcal{C}_\phi$ are logically equivalent. Let T be a $\forall_1$ theory, then $\mathit{CNF}(T) := \bigcup_{\phi \in T} \mathit{CNF}(\phi)$.
The function CNF fixed by the definition above could for example be the translation to conjunctive normal form that proceeds by moving negations inwards and by distributing disjunction over conjunction. We did not fix this particular translation because it is irrelevant for us how a conjunctive normal form is obtained as long as the translation preserves the language and is logically equivalent to the original sentence. Since this article focuses on the interaction of induction and Skolemization, we choose to exclude conjunctive normal form translations that do not preserve the language. The question how these more advanced transformations interact with induction is clearly also important and should be investigated separately.
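The translation mentioned above (moving negations inwards, then distributing disjunction over conjunction) can be sketched as follows for the quantifier-free matrix of a sentence. This is only an illustrative sketch: the tuple encoding of formulas and the function names `nnf`, `clauses`, and `cnf` are assumptions made for this example and are not taken from the article. Note that no new symbols are introduced, so the language is preserved.

```python
from itertools import product

# Hypothetical encoding of quantifier-free formulas as nested tuples:
#   ("atom", "P"), ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g)

def nnf(f, positive=True):
    """Move negations inwards; `positive` is the polarity of the current subformula."""
    tag = f[0]
    if tag == "atom":
        return f if positive else ("not", f)
    if tag == "not":
        return nnf(f[1], not positive)
    if tag == "imp":
        # f -> g is treated as (not f) or g
        return nnf(("or", ("not", f[1]), f[2]), positive)
    # De Morgan: under negation, "and" and "or" are swapped.
    dual = {"and": "or", "or": "and"}
    new_tag = tag if positive else dual[tag]
    return (new_tag, nnf(f[1], positive), nnf(f[2], positive))

def clauses(f):
    """Clause set (a set of frozensets of literals) of a formula in negation normal form."""
    tag = f[0]
    if tag in ("atom", "not"):
        return {frozenset([f])}
    left, right = clauses(f[1]), clauses(f[2])
    if tag == "and":
        return left | right
    # Distribute disjunction over conjunction.
    return {c | d for c, d in product(left, right)}

def cnf(f):
    """Clause set of an arbitrary quantifier-free formula in the encoding above."""
    return clauses(nnf(f))
```

For instance, `cnf(("imp", P, ("and", Q, R)))` yields the two clauses {¬P, Q} and {¬P, R}. The distribution step can blow up exponentially, which is one reason practical provers use more advanced, language-extending translations of the kind excluded above.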

Skolemization
We essentially use inner Skolemization with canonical names. On the one hand this form of Skolemization is convenient from a theoretical point of view, because it can be described as a function on formulas. In particular, the canonical naming schema for Skolem symbols allows us to be precise about the languages generated during the saturation processes considered in this article. On the other hand, inner Skolemization performs comparatively well with respect to proof complexity [BL94], and furthermore using canonical Skolem symbols does not increase proof complexity. Hence, this form of Skolemization is also a reasonable choice from the perspective of automated deduction.
We start by defining an operator describing all the Skolem symbols that can be obtained by Skolemizing a single quantifier over a given language L. This operator is then iterated on the language L in order to produce all the Skolem symbols that are required to Skolemize L formulas.
Definition 2.9. Let L be a first-order language, then we define where Q ∈ {∀, ∃}. We set S(L) := S∀(L) ∪ S∃(L). Now we define $\mathrm{sk}(L) := L \cup S(L)$. By $\mathrm{sk}^i(L)$ we denote the i-fold iteration of the sk operation. Finally, we define $\mathrm{sk}^{\omega}(L) := \bigcup_{i<\omega} \mathrm{sk}^i(L)$. We call the stage of a symbol the least i ∈ N such that the symbol belongs to the language $\mathrm{sk}^i(L)$. A first-order language L is Skolem-free if it does not contain any of its Skolem symbols, that is, if $L \cap S(\mathrm{sk}^{\omega}(L)) = \emptyset$.
Now we can define the universal and existential Skolem form of a formula.
Definition 2.10. We define the functions $\mathrm{sk}^{\forall}, \mathrm{sk}^{\exists} : F(\mathrm{sk}^{\omega}(L)) \to F(\mathrm{sk}^{\omega}(L))$ mutually inductively as follows $\mathrm{sk}^{Q}(P(\vec{t})) := P(\vec{t})$,
Before we discuss the $\mathrm{sk}^{\exists}$ function in more detail, we will look at an example that illustrates how it operates.
Observe that the symbols that are introduced by $\mathrm{sk}^{\exists}$ depend on the names of the variables. Thus, in particular, the symbols introduced for two formulas that only differ in the names of bound variables may not be the same. For example, let P be a unary predicate symbol, then $\mathrm{sk}^{\exists}((\exists x)P(x)) = P(s_{(\exists x)P(x)}) \neq P(s_{(\exists y)P(y)}) = \mathrm{sk}^{\exists}((\exists y)P(y))$.
Clearly, we could build the equivalence of formulas up to renaming into the Skolemization function. However, we prefer not to draw logical reasoning into the definition of the Skolemization function. Identification of provably equivalent formulas can be added by means of additional axioms, such as the Skolem axioms given in Definition 2.13.
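As a further illustration of canonical naming, consider a binary predicate symbol P. The following is a sketch under the assumption that, as is usual for inner Skolemization, the Skolem symbol for an existential quantifier is indexed by the quantified subformula and takes the free variables of that subformula as arguments; the predicate P and the formula are chosen only for this example.

```latex
% Assumption: inner Skolemization with canonical names, where the Skolem
% symbol s_{(\exists y)\psi} is applied to the free variables of (\exists y)\psi.
\[
  \mathrm{sk}^{\exists}\bigl((\forall x)(\exists y)P(x, y)\bigr)
  \;=\; (\forall x)\,P\bigl(x,\, s_{(\exists y)P(x, y)}(x)\bigr)
\]
```

Here the subformula (∃y)P(x, y) has the single free variable x, so the Skolem symbol is unary, whereas the symbol introduced for (∃x)P(x) in the example above is nullary.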
The following property of Skolemization is well-known.
In general we do not have the converse of the above implications. We will now introduce Skolem axioms. These axioms essentially correspond to the existential Skolem form of the logical axioms ϕ → ϕ.
Definition 2.13. Let L be a first-order language, and ϕ(x, y) an $\mathrm{sk}^{\omega}(L)$ formula, then we define We define L-SA :
The Skolem axioms allow us to also obtain the converse of Proposition 2.12.

Proof. Straightforward.
Skolem axioms over a Skolem-free theory have the following well-known conservation property.
Proposition 2.15. Let L be a Skolem-free first-order language and T be an L theory, then $L\text{-SA} + T \equiv_L T$.
With the property above we now immediately obtain the well-known fact that Skolemizing a theory results in a conservative extension of that theory.
Lemma 2.16. Let L be a Skolem-free language and T be an L theory, then
Proof. The direction $\mathrm{sk}^{\exists}(T) \sqsubseteq_L T$ is an immediate consequence of Proposition 2.12. For the other direction we have T ≡ Prop. 2.15
This also immediately gives us the following weaker statement that is perhaps more familiar in automated deduction.
Corollary 2.17. Let L be a Skolem-free language and T be an L theory, then T is consistent if and only if $\mathrm{sk}^{\exists}(T)$ is consistent.

Induction and arithmetic
We conclude the preliminary definitions with the definition of some notions related to formal arithmetic. Let us start by discussing the setting for induction that we use in this article. In automated inductive theorem proving it is customary to work with various inductively defined objects such as the natural numbers, lists, trees, and mutually recursive constructions. Typically inductive theorem proving concentrates on a multi-sorted setting where a subset of the sorts is interpreted as the term algebra constructed over a set of function symbols, called the constructors. Such a construction, while of great practical relevance, incurs significant notational complexity. Therefore, in order to avoid overloading the presentation, we restrict our setting to the natural numbers. However, we expect that our results straightforwardly carry over to the more general case mentioned above, because the structure of the induction axiom remains essentially the same.
We can now define induction axioms and the first-order structural induction schema.
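Definition 2.19. Let L be a language, and $\phi(x, \vec{z})$ be an L formula, then the $L \cup L_0$ formula $\tilde{I}_x\phi$ is given by
$$\tilde{I}_x\phi := \bigl(\phi(0, \vec{z}) \wedge (\forall x)(\phi(x, \vec{z}) \rightarrow \phi(s(x), \vec{z}))\bigr) \rightarrow (\forall x)\phi(x, \vec{z}).$$
We refer to the variable x as the induction variable and to the variables $\vec{z}$ as the induction parameters. Moreover we define the induction axiom $I_x\phi$ by $I_x\phi := (\forall \vec{z})\tilde{I}_x\phi$. Let Γ be a set of L formulas, then the set of $L \cup L_0$ sentences Γ-IND is given by $\{I_x\gamma \mid \gamma(x, \vec{z}) \in \Gamma\}$.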
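To make the roles of the induction variable and the induction parameters concrete, here is a worked instance. It assumes that the induction axiom has the usual shape of the structural induction schema over 0 and s (base case and step case together imply the universal statement); the formula x + z = z + x is chosen only for illustration.

```latex
% Worked instance for \phi(x, z) := x + z = z + x,
% with induction variable x and a single induction parameter z.
\begin{align*}
  \tilde{I}_x\phi \;&=\; \bigl( 0 + z = z + 0 \;\wedge\;
      (\forall x)( x + z = z + x \rightarrow s(x) + z = z + s(x) ) \bigr)
      \rightarrow (\forall x)\, x + z = z + x, \\
  I_x\phi \;&=\; (\forall z)\, \tilde{I}_x\phi .
\end{align*}
```

The parameter z is quantified outside the whole implication, so the induction is carried out for each value of z separately.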
By an arithmetical language we understand a first-order language containing the symbols 0/0, s/1, and possibly some symbols representing primitive recursive functions. In the following definition we recall some standard terminology for arithmetic.
Definition 2.20. Let L be an arithmetical language. By $\mathbb{N}_L$ we denote the structure whose domain is the set of natural numbers and that interprets the non-logical symbols of L in the natural way. An arithmetical theory is a theory over an arithmetical language. Let T be an L theory. We say that the theory T is sound if $\mathbb{N}_L \models T$. Furthermore, we say that T is $\exists_1$-complete if $\mathbb{N}_L \models \phi$ implies $T \vdash \phi$ for all $\exists_1$ L sentences ϕ.
We conclude this section by describing the setting of linear arithmetic that will in particular serve us in Section 6.2 for obtaining unprovability results for the methods [RV19, HHK+20]. Besides 0/0 and s/1, the language of linear arithmetic contains only the function symbols p/1 and the infix symbol +/2, where p denotes the predecessor function and + denotes addition. Clearly, the setting of linear arithmetic is closely related to Presburger arithmetic. However, we are not interested in the theory of the standard interpretation, but rather in its subtheories such as the ones that were already studied by Shoenfield [Sho58]. This setting of linear arithmetic turns out to be quite useful in the analysis of methods for automated inductive theorem proving, because on the one hand it is simple enough to still allow for straightforward model-theoretic constructions, yet it is complex enough to provide interesting independence results.
Let us fix some notational conventions. Let m ∈ N and t be a term, then by m · t we denote the term $t + (t + \cdots + (t + t) \cdots)$. Let f be a unary function symbol, then $f^m(t)$ stands for $f(\cdots f(t) \cdots)$. By $\overline{m}$ we denote the term $s^m(0)$. Our base theory for linear arithmetic is defined as follows.
Definition 2.21. By T we denote the theory axiomatized by the universal closure of the following formulas p(s(x)) = x, (A3) x + s(y) = s(x + y),
We conclude with two basic observations about the theory T. We shall make use of these observations on several occasions and will for the sake of readability not mention them explicitly every time.
Proof. The soundness part is obvious. For the ∃ 1 -completeness observe that T decides ground formulas.
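The notational conventions and the evaluation of ground terms in the standard model can be sketched as follows. The tuple encoding of terms and the names `numeral`, `times`, and `evaluate` are assumptions made for this example. The sketch also assumes that the predecessor of 0 is interpreted as 0 and only covers m · t for m ≥ 1, since the text above does not fix these points.

```python
# Hypothetical encoding of ground terms of linear arithmetic as nested tuples:
#   ("0",), ("s", t), ("p", t), ("+", t1, t2)
ZERO = ("0",)

def numeral(m):
    """The numeral for m, that is, the term s^m(0)."""
    t = ZERO
    for _ in range(m):
        t = ("s", t)
    return t

def times(m, t):
    """The term m . t = t + (t + ... + (t + t)...) built from m >= 1 copies of t."""
    if m < 1:
        raise ValueError("this sketch only covers m >= 1")
    result = t
    for _ in range(m - 1):
        result = ("+", t, result)
    return result

def evaluate(t):
    """Value of a ground term in the standard model over the natural numbers."""
    tag = t[0]
    if tag == "0":
        return 0
    if tag == "s":
        return evaluate(t[1]) + 1
    if tag == "p":
        # Assumption: the predecessor of 0 is 0.
        return max(evaluate(t[1]) - 1, 0)
    if tag == "+":
        return evaluate(t[1]) + evaluate(t[2])
    raise ValueError(f"unknown symbol: {tag}")
```

A ground equation between two such terms is true in the standard model exactly when both sides evaluate to the same number; for a sound theory that decides ground formulas, this coincides with provability of the equation.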

Saturation-based systems and induction
Induction can be integrated into a saturation-based proof system in different ways. One possibility is to contain the induction mechanism in a separate module that may use a saturation prover to discharge subgoals. Moreover, the induction module may receive additional information from the saturation prover, for instance information about failed proof attempts [BHHW86]. Another, currently more popular, way is to integrate the induction mechanism more tightly into the saturation system as some form of inference rule [KP13,Ker14], [RV19, HHK+20], [Cru15,Cru17], [Wan17], [EP20]. In this section we give an abstract framework for AITP methods integrating induction in saturation-based proof systems in terms of a general induction rule. This framework will allow us to investigate in Sections 4 and 5 the role of Skolem symbols in these systems. In Section 6 we show that the methods described in [RV19, HHK+20] fit into our framework. In Section 3.1 we define saturation systems abstractly and introduce some related notions. After that, Section 3.2 introduces the notion of induction rule as a general way to integrate induction into a saturation system and presents a practically relevant specialization of this induction rule.

Saturation-based proof systems
Saturation is a technique of automated theorem proving that consists of computing the closure of a set of formulas or clauses under some inference rules. The saturation process goes on until some termination condition, such as the derivation of the empty clause, is met or until no more "new" formulas can be generated. Typically saturation-based theorem provers operate in a clausal setting because clauses have less structure and are therefore better suited for automated proof search.
In what follows we concentrate on the refutational setting, because most state-of-the-art theorem provers are refutation provers. That is, in order to determine for some theory T whether a given sentence ϕ is provable in T, the prover saturates the clause set $\mathit{CNF}(\mathrm{sk}^{\exists}(T + \neg\phi))$ until the empty clause is derived. However, our definitions can easily be adapted to the positive case by dualizing them, so as to cover for example connection-like methods.
Practical saturation proof systems are usually based on a variant of the superposition calculus. In order not to get involved in the technical details of these saturation-based proof systems we will abstractly think of such a prover as a state transition system whose current state is a set of derived clauses and whose state transitions are inference rules that generate new clauses. In particular, our notion of saturation system does not include redundancy mechanisms such as simplification rules and deletion rules. Since this article is mostly about upper bounds on the logical strength of AITP methods, the assumption that clauses are never deleted is unproblematic.
Definition 3.1 (Saturation systems). A saturation system S is a set of inference rules of the form $\frac{\mathcal{C}}{\mathcal{D}}$, also written as $\mathcal{C}/\mathcal{D}$, where $\mathcal{C}$ is a set of clauses and $\mathcal{D}$ is a finite set of clauses. Let $S_1$ and $S_2$ be two saturation-based proof systems, then by $S_1 + S_2$ we denote the system obtained by the union of the inference rules of $S_1$ and $S_2$.
Informally, an inference rule $\mathcal{C}/\mathcal{D}$ indicates that if the system is in the "state" $\mathcal{C}$, then the system changes into the "state" $\mathcal{C} \cup \mathcal{D}$. The reason why we consider inference rules of this form is that they allow us to keep track of global properties of the prover such as for example the language of the currently derived clauses. Observe that our notion of inference rules is very general since $\mathcal{C}$ may be infinite. Hence we could formulate an ω-rule for saturation systems. However, we will only work with inference rules that operate with the language of $\mathcal{C}$ and a finite set of clauses $\mathcal{C}_0 \subseteq \mathcal{C}$.
Example 3.2. The resolution rule can be presented as follows: where $\mathcal{C}$ is a clause set, C and D are clauses, and µ is the most general unifier of the literals l and m.
Definition 3.3 (Deduction, Refutation). Let $\mathcal{C}_0$ be a set of clauses and S a saturation-based proof system. A deduction from $\mathcal{C}_0$ in S is a finite sequence of clause sets $\mathcal{D}_0, \dots, \mathcal{D}_n$ such that
Since we are usually interested in extending saturation systems for pure first-order logic by inference rules for induction we need to introduce the notions of soundness and refutational completeness.
Definition 3.4. Let S be a saturation system. We say that S is sound if whenever a clause C is derivable from a clause set $\mathcal{C}_0$ in S, then $L(C) \subseteq L(\mathcal{C}_0)$ and $\mathcal{C}_0 \models C$. The saturation system S is said to be refutationally complete if there is a refutation from $\mathcal{C}_0$ in S whenever $\mathcal{C}_0$ is inconsistent.
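The view of a saturation prover as a state transition system whose state only grows can be sketched for the ground (propositional) case. This is a deliberate simplification: there is no unification, so it is not the first-order resolution rule of Example 3.2, and the string encoding of literals as well as the names `negate`, `resolvents`, and `saturate` are assumptions made for this example.

```python
def negate(lit):
    """Complement of a literal; literals are strings and '~' marks negation."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c, d):
    """All ground resolvents of the clauses c and d (frozensets of literals)."""
    return {(c - {lit}) | (d - {negate(lit)}) for lit in c if negate(lit) in d}

def saturate(clauses, max_rounds=100):
    """Close a clause set under ground resolution.

    Returns True if the empty clause is derived, False if the state is
    saturated without the empty clause, and None if the round budget
    is exhausted first.
    """
    state = set(clauses)
    for _ in range(max_rounds):
        if frozenset() in state:
            return True
        new = {r for c in state for d in state for r in resolvents(c, d)} - state
        if not new:
            return False  # no new clauses can be generated
        state |= new      # the state only grows; clauses are never deleted
    return True if frozenset() in state else None
```

For example, the clause set {P}, {¬P, Q}, {¬Q} is refuted, while {P, Q}, {¬P} saturates after deriving {Q} without producing the empty clause.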

Induction rules
Typically induction is integrated in a saturation prover by a mechanism that, upon some condition, selects some clauses out of the generated clauses and constructs an induction formula based on the selected clauses. After that, the resulting induction axiom is clausified and the clauses are added to the search space [KP13, Ker14, RV19, HHK+20, Cru15, Wan17]. The systems differ in the heuristics that are used to construct the induction formula, in the shape of the resulting induction formulas, and in the conditions upon which an induction axiom is added to the search space. For instance, Kersani and Peltier's method [KP13,Ker14] carries out an induction only once, namely when the generated clauses are sufficient to derive the empty clause. Thus this method does, technically speaking, not even generate clauses. We abstract the induction mechanisms of the aforementioned methods by the following induction rule.
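Definition 3.5. The induction rule $\mathrm{IND}^R$ is given by
$$\frac{\mathcal{C}}{\mathit{CNF}(\mathrm{sk}^{\exists}(I_x\phi))}\ \mathrm{IND}^R$$
where $\mathcal{C}$ is a set of clauses and $\phi(x, \vec{z})$ is an $L(\mathcal{C})$ formula.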
Despite being limited to natural numbers, the induction rule presented above is very general in the sense that it does not impose any restrictions on the complexity of the induction formulas. None of the methods known to us comes even close to making use of the full power offered by that rule. Nevertheless, it will serve us as a useful tool for theoretical analyses.
There are two important observations that we can make about this induction rule. First of all, in a saturation system with this induction rule Skolemization may happen at any time and not just once before the saturation process begins, as is the case in saturation systems for pure first-order logic. Secondly, the induction rule $\mathrm{IND}^R$ permits Skolem symbols to appear in induction formulas. In other words, the induction rule $\mathrm{IND}^R$ iteratively extends the language of the induction formulas by Skolem symbols. Interestingly, a similar situation has been considered in the literature on mathematical logic [Bek03]. In saturation systems for pure first-order logic, the role of Skolemization is clear: It allows us to obtain an equiconsistent formula without existential quantifiers (see Corollary 2.17). In saturation systems with the induction rule $\mathrm{IND}^R$ the role of Skolemization is no longer clear in the sense of Corollary 2.17. This raises the question of how the extension of the language of induction formulas by Skolem symbols affects the power of the system. Also note that this feature is not artificial but actually appears in the concrete methods mentioned above.
We shall address this question in Section 4. In particular we will provide a logical characterization of refutability in a sound and complete saturation system extended by the induction rule $\mathrm{IND}^R$ in terms of a theory with an induction schema (see Theorem 4.11). As a corollary we obtain the soundness of the rule $\mathrm{IND}^R$ (see Corollary 4.12).
The following example illustrates how to use the above induction rule.
Example 3.6. Let us work in the setting of linear arithmetic and let S be a sound and refutationally complete saturation system. We will now outline a refutation in $S + \mathrm{IND}^R$ of the clause set $\mathcal{C}_0$ given by Let $\mathrm{sk}^{\exists}(\neg(\forall x)(\forall y)\, x + y = y + x) = (c_1 + c_2 \neq c_2 + c_1)$, then we have $c_1 \in L(\mathcal{C}_0)$ and $\mathcal{C}_0 \models c_1 + c_2 \neq c_2 + c_1$. (1) Let $\phi_1(x) := (c_1 + x = x + c_1)$, then we may apply the induction rule $\mathrm{IND}^R$ to obtain the clause set $\mathcal{C}_1$: , then we have $c_3 \in L(\mathcal{C}_1)$ and furthermore by (1) we have Now observe that $T \models 0 = 0 + 0$ and $T \models 0 + s(c_4) = s(0 + c_4)$. Hence, $T \models \phi_2(0)$. Therefore, by (3) we obtain , then by the above we obtain Now we apply the induction rule $\mathrm{IND}^R$ in order to obtain the clause set Hence, by (6), we have $\mathcal{C}_3 \models \bot$. Hence, by the refutational completeness of S we obtain a refutation of $\mathcal{C}_3$. Therefore, by combining the applications of $\mathrm{IND}^R$ used to obtain $\mathcal{C}_3$ with the S refutation of $\mathcal{C}_3$ we obtain an $S + \mathrm{IND}^R$ refutation of $\mathcal{C}_0$.
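To see what a single application of the induction rule contributes in this example, the following sketch spells out the clauses for the induction formula c₁ + x = x + c₁. It rests on two assumptions: that the induction axiom has the usual structural shape over 0 and s, and that the rule adds the clause normal form of the existential Skolem form of that axiom. The constant c is a placeholder name for the Skolem constant of the step case; under canonical naming it would be indexed by the quantified formula.

```latex
% Induction axiom for \phi_1(x) := c_1 + x = x + c_1 (no induction parameters):
\begin{align*}
  I_x\phi_1 \;&=\; \bigl( c_1 + 0 = 0 + c_1 \wedge
      (\forall x)( c_1 + x = x + c_1 \rightarrow c_1 + s(x) = s(x) + c_1 ) \bigr)
      \rightarrow (\forall x)\, c_1 + x = x + c_1 \\
% The universal quantifier of the step case occurs negatively, so it is
% replaced by a Skolem constant, here written c:
  \mathrm{sk}^{\exists}(I_x\phi_1) \;&=\; \bigl( c_1 + 0 = 0 + c_1 \wedge
      ( c_1 + c = c + c_1 \rightarrow c_1 + s(c) = s(c) + c_1 ) \bigr)
      \rightarrow (\forall x)\, c_1 + x = x + c_1 \\
% Clause normal form, with x implicitly universally quantified:
  & \{\, c_1 + 0 \neq 0 + c_1,\;\; c_1 + c = c + c_1,\;\; c_1 + x = x + c_1 \,\} \\
  & \{\, c_1 + 0 \neq 0 + c_1,\;\; c_1 + s(c) \neq s(c) + c_1,\;\; c_1 + x = x + c_1 \,\}
\end{align*}
```

The new constant c belongs to the language of the extended clause set and may therefore occur in later induction formulas, which is exactly the phenomenon discussed above.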
Analyzing the rule $\mathrm{IND}^R$ will give us some general insights about the role of Skolem symbols in saturation systems with induction. However, in order to be more specific about particular methods we have to consider some restricted forms of this induction rule. We start by introducing some additional terminology. We call initial Skolem symbols those Skolem symbols that arise from the Skolemization of the input problem and induction Skolem symbols those Skolem symbols that are generated by an application of the induction rule.
Before we introduce a restriction of the induction rule that is of practical relevance, we will discuss some remarkable design choices encountered in practical methods that we will incorporate into the induction rule:
• Syntactical restriction of induction formulas: The methods presented in [RV19, HHK+20] restrict induction formulas to literals, [KP13,Ker14] restricts induction formulas to $\exists_1$ formulas, and [Cru15,Cru17] restricts induction formulas to $\forall_1$ formulas.
• Control over occurrences of Skolem symbols: The practical induction mechanisms exert control over occurrences of the induction Skolem symbols either by avoiding the introduction of Skolem symbols altogether [KP13,Ker14] or by introducing nullary Skolem symbols only [RV19, HHK+20], [Cru15,Cru17]. In particular none of these methods allows for parameters in the induction formula. As a consequence induction Skolem symbols trivially occur as subterms of ground terms.
Restricting the shape of the induction formulas is a feature that is common to all methods for automated inductive theorem proving, because it is currently still difficult to search efficiently for a syntactically unrestricted induction formula. We incorporate this feature into the induction rule by parameterizing it by a set of formulas from which the induction formulas are constructed. The second feature is only slightly more complicated to generalize. If we are to allow induction formulas with quantifier alternations, then Skolemizing the corresponding induction axioms introduces Skolem symbols that are not nullary. Hence variables may occur in the scope of induction Skolem symbols. Therefore we generalize the second feature by explicitly requiring that variables do not occur within the scope of a Skolem symbol.
In other words we require that Skolem symbols may appear in the induction formula only in subterms of ground terms. Both generalized features are captured by the following restricted induction rule.
Definition 3.7. Let Γ be a set of formulas, then the rule Γ-GIND R is given by where C is a set of clauses, ϕ(x, z) ∈ Γ, and t is a vector of ground L(C) terms.
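The side condition on Skolem symbols can be made concrete with a small sketch. The Python fragment below uses a hypothetical term representation (the classes Var and App and the flag skolem are illustrative assumptions, not taken from any of the provers discussed here) and checks that no variable occurs within the scope of a Skolem symbol, that is, that every subterm headed by a Skolem symbol is ground.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A first-order variable (hypothetical representation)."""
    name: str


@dataclass(frozen=True)
class App:
    """A function symbol applied to arguments; skolem marks Skolem symbols."""
    symbol: str
    args: Tuple["Term", ...] = ()
    skolem: bool = False


Term = Union[Var, App]


def is_ground(t: Term) -> bool:
    """A term is ground if it contains no variables."""
    if isinstance(t, Var):
        return False
    return all(is_ground(a) for a in t.args)


def skolem_scope_ok(t: Term) -> bool:
    """True if every subterm headed by a Skolem symbol is ground."""
    if isinstance(t, Var):
        return True
    if t.skolem and not is_ground(t):
        return False
    return all(skolem_scope_ok(a) for a in t.args)


x = Var("x")
c = App("c", (), skolem=True)                 # nullary Skolem symbol
allowed = App("+", (x, App("s", (c,))))       # x + s(c): c sits in the ground term s(c)
rejected = App("f", (x,), skolem=True)        # f(x) with f Skolem: x is in the scope of f

print(skolem_scope_ok(allowed), skolem_scope_ok(rejected))
```

In this reading, the vector t of ground terms in Definition 3.7 is where Skolem symbols may enter an induction formula, while the formula ϕ(x, z) itself is drawn from Γ.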
Remark 3.8. This restriction on occurrences of Skolem symbols is not only motivated by abstracting the current practice in AITP, it is also of independent theoretical interest: As described in [Dow08], Skolemization without this restriction in simple type theory makes the axiom of choice derivable, hence this restriction has been introduced in [Mil87]. This restriction is also used as an assumption for proving elementary deskolemization of proofs with cut in [BHW12], [Kom21].
Let us again consider an example to illustrate the rule.
Example 3.9. Consider the refutation carried out in Example 3.6. We have used the induction rule three times to derive the clause sets CNF (sk . All three induction formulas are equational atoms in which only nullary Skolem symbols appear. Hence the refutation outlined in Example 3.6 is also a refutation in S +Eq(T )-GIND R , where Eq(L) denotes the set of equational atoms over the language L.
As with the rule IND R , we now have to ask how the system behaves. There are two major cases that we need to distinguish, depending on whether the set of formulas Γ may contain initial Skolem symbols. By letting Γ be a set of Skolem-free formulas, we can restrict the occurrences of all Skolem symbols in the induction formulas. In Section 5 we mainly concentrate on this case and provide a characterization for the refutability in a sound and refutationally complete saturation system with the rule Γ-GIND R , thus settling the question. In practical systems the initial Skolem symbols usually can appear in the induction formulas without restriction, that is, these systems correspond to the case where the formulas in Γ may contain initial Skolem symbols. However, this case is actually part of a more general open problem concerning occurrences of Skolem symbols in axiom schemata, which we will not address in this article (see Remark 3.8). Nevertheless, we can handle the simple case when the initial Skolem symbols are nullary. We will mainly deal with this case in Section 6 in order to provide an unprovability result for the methods described in [RV19] and [HHK+20].

Unrestricted induction and Skolemization
In the previous section we have abstractly described a common integration of induction into a saturation system via the induction rule IND R . In this section we will first represent a sound and refutationally complete saturation system extended by the rule IND R as a logical theory. After that we make use of this representation in order to investigate the interaction between Skolemization and the induction rule.

Representation as logical theory
A useful technique when analyzing AITP methods is to reduce the system to an "equivalent" logical theory. Alternatively, when such a theory cannot be found it is a good practice to approximate the system by a logical theory as closely as possible. The construction of that theory usually reveals the essential features of the method. Moreover, we can then make use of powerful techniques from mathematical logic in order to study the theory. In particular, we can compare methods in terms of their representative theories.
In the following we will show that the theory SI ω (T ) is a faithful representation of a saturation system extended by the induction rule IND R and operating on an initial clause set corresponding to a theory T . In other words, we will show that for a sound and refutationally complete saturation system S and a theory T , the saturation system S + IND R refutes the clause set CNF (sk ∃ (T )) if and only if SI ω (sk ∃ (T )) is inconsistent. Intuitively, we can see that this is the case because the operation SI(T ) corresponds to a simultaneous application of IND R to all L(T ) formulas. However, by the compactness theorem for first-order logic, only finitely many of these induction formulas actually appear in a proof of the inconsistency of SI ω (sk ∃ (T )). Hence we can derive the same induction axioms with the induction rule IND R .
Lemma 4.2. Let S be a sound saturation system and T be a theory. If S + IND R refutes CNF (sk ∃ (T )), then the theory SI ω (sk ∃ (T )) is inconsistent.
Lemma 4.3. Let S be a refutationally complete saturation system and T be a theory. If the theory SI ω (sk ∃ (T )) is inconsistent, then S + IND R refutes CNF (sk ∃ (T )).
Proof. Assume that SI ω (sk ∃ (T )) is inconsistent, then by the compactness theorem there exists a finite subset S of SI ω (sk ∃ (T )) such that S is inconsistent. Furthermore there clearly exist sets S 0 , S 1 , . . . , S n with n ∈ N such that S 0 ⊆ sk ∃ (T ), S ⊆ S n , and S i = S i−1 ∪ {sk ∃ (I i )}, with I i ∈ SI i−1 (sk ∃ (T ))-IND and L(I i ) ⊆ L(S i ), for i = 1, . . . , n.
Now we can easily construct a refutation of CNF (sk ∃ (T )) in S + IND R by letting C 0 = CNF (sk ∃ (T )), and obtaining C i = C i−1 ∪ CNF (sk ∃ (I i )) for i = 1, . . . , n by the IND R rule. Clearly, C n is logically equivalent to S n , therefore we obtain a refutation from C n because of the refutational completeness of S.
We summarize the results so far in the following proposition.
Proposition 4.4. Let S be a sound and refutationally complete saturation-based proof system and T be a theory. Then S + IND R refutes CNF (sk ∃ (T )) if and only if the theory SI ω (sk ∃ (T )) is inconsistent.
Proof. An immediate consequence of Lemma 4.2 and Lemma 4.3.
The theory SI ω (sk ∃ (T )) is still not very convenient to work with. With some work we can, on the one hand, eliminate the recursion that interleaves induction and Skolemization and, on the other hand, even "factor out" the Skolemization part. We start by analyzing which Skolem symbols occur in the theories generated by SI ω (·). Our first observation is that induction axioms that do not bind a free variable of the formula inducted upon allow us to introduce all the Skolem symbols.
The formulas of the form sk ∃ (ϕ → ϕ) are of interest because they correspond, roughly speaking, to Skolem axioms.
Remark 4.6. The requirement in Lemma 4.5 that the induction formula does not contain the induction variable is peculiar, but convenient to handle. A result similar to Lemma 4.5 can be achieved without this assumption by working, for example, with induction formulas of the form u = u ∧ ϕ, where the variable u is not free in the formula ϕ. In practice a system usually does not intentionally use its induction mechanism to introduce Skolem axioms. Instead some systems (for example [Cru15,Cru17]) provide a lemma rule that introduces the clauses CNF (sk ∃ (ϕ → ϕ)) into the search space.
Hence, it suffices to show that for every symbol σ ∈ sk ω (L(T ) ∪ L 0 ), there exists k ∈ N such that σ ∈ L(SI k+1 (T )). We proceed by induction on the stage of the symbol σ. For the base case let σ have stage 0, then it belongs to L(T ) ∪ L 0 and we already have σ ∈ L(SI 1 (T )). Now if σ ∈ sk ω (L(T ) ∪ L 0 ) has stage n > 0, then it is a Skolem symbol of the form σ = s (Qx)ϕ with Q ∈ {∀, ∃} and (Qx)ϕ only contains symbols of stage less than n. Hence by the induction hypothesis L((Qx)ϕ) ⊆ L(SI k+1 (T )) for some k ∈ N. Therefore sk ∃ (I u (Qx)ϕ) ∈ SI k+2 (T ), thus by Lemma 4.5 the symbol s (Qx)ϕ belongs to L(SI k+2 (T )), where u is a variable that does not occur freely in (Qx)ϕ.
With this in mind we see that SI ω (T ) contains the existential Skolemization of the sk ω (L(T )) induction schema. This allows us to eliminate the iteration of the operator SI(·) that was used to build up the language of the induction.
Proof. Let ϕ be an sk ω (L(T ) ∪ L 0 ) formula. By Lemma 4.7 we have L(SI ω (T )) = ⋃ k<ω L(SI k (T )) = sk ω (L(T ) ∪ L 0 ). Hence, there exists k ∈ N such that L(ϕ) ⊆ L(SI k (T )). Therefore, SI k+1 (T ) ⊢ sk ∃ (I x ϕ).
Proposition 4.10. Let T be a theory, then
Proof. First of all observe that sk ω (L(sk ∃ (T )) ∪ L 0 ) = sk ω (L(T ) ∪ L 0 ) and therefore (L(sk ∃ (T )) ∪ L 0 )-SA = (L(T ) ∪ L 0 )-SA. For the direction from right to left we observe that With this in mind it is straightforward to see that (L(T ) ∪ L 0 )-SA + T + sk ω (L(T ) ∪ L 0 )-IND ⊢ SI ω (sk ∃ (T )). For the direction from left to right, we observe that by Lemmas 4.8, 4.9 we have Hence, by Proposition 2.14 we obtain
As an immediate consequence of the results above we obtain the following characterization of refutability in a sound and refutationally complete saturation-based system extended by the induction rule IND R .
Theorem 4.11. Let S be a saturation system, T a theory, and ϕ an L(T ) sentence.
(i) If S is sound and S + IND R refutes CNF (sk ∃ (T + ¬ϕ)), then
We conclude this section with a remark.
Remark 4.13. In the presence of the Skolem axioms every formula is equivalent to an open formula. In particular, for a language L, we have Thus, we can formulate Theorem 4.11 in a slightly more canonical way, by using Open(sk ω (L))-IND in place of sk ω (L)-IND. In other words, in the presence of Skolem axioms Skolem symbols permit us to simulate quantification. Conceptually, we can thus split the unrestricted induction rule of Definition 3.5 into a lemma rule and an induction rule for clause sets.

Conservativity
In the previous section we have characterized the extension of a sound and refutationally complete saturation system by the induction rule IND R in terms of a theory with induction over formulas that contain Skolem symbols. This gives rise to the question of how the addition of Skolem symbols to the language of the induction schema affects the strength of the system. In particular, can we provide an equivalent Skolem-free induction schema? Let L be a Skolem-free language and T an L theory, then a natural candidate for a Skolem-free characterization of the strength of L-SA + T + sk ω (L)-IND is the theory T + L-IND.
Question 4.14. Let L be a Skolem-free language and T an L theory, do we have
In the following we give a partial answer to the above question. The general case remains open. Our answer relies on the following idea: If a Skolem function is definable in terms of an L formula, then wherever the Skolem symbol occurs we can instead use its definition to eliminate the symbol.
For the sake of the presentation we have moved the proof of Proposition 4.17 to Appendix A. The proof essentially proceeds by replacing in each model the occurrences of the Skolem symbols by instances of their defining formulas.
In order to illustrate Proposition 4.17 we will consider some practically relevant special cases. An important special case of Proposition 4.17 is when the Skolem functions are definable already in a theory.
Proposition 4.19. Let T be a Skolem-free theory with definable Skolem functions, then every model of T has definable Skolem functions.
In particular, a theory has definable Skolem functions if it has a definable well-order. We simply need to define each Skolem function as the least of the candidate values at each point.
Definition 4.20. Let L be a language, and θ(x, y) an L formula in two variables. For the sake of legibility we write θ(x, y) as x ≺ θ y and by (∀x≺ θ y)ψ(x, y) we abbreviate the formula (∀x)(x ≺ θ y → ψ(x, y)). The total order axioms TO θ for θ are given by the universal closure of the following formulas The least number principle L-LNP θ for θ(x, y) consists of the axioms where ψ(x, z) is an L formula. We define L-WO θ := TO θ + L-LNP θ .
Proposition 4.21. Let T be a Skolem-free theory. If there exists an L(T ) formula θ(x, y) such that T ⊢ L(T )-WO θ , then T has definable Skolem functions.
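The idea behind Proposition 4.21 can be illustrated on a finite, totally ordered domain, where every non-empty set has a least element. The following Python sketch assumes that the domain is a finite collection of integers with the usual order; the names least_witness_skolem, psi and default are hypothetical and serve only this illustration. It defines a Skolem function for (∃y)ψ(x, y) by picking the least witness, and falls back to a default value when no witness exists.

```python
from typing import Callable, Iterable


def least_witness_skolem(
    domain: Iterable[int],
    psi: Callable[[int, int], bool],
    default: int,
) -> Callable[[int], int]:
    """Return f with f(x) = least y in the domain such that psi(x, y) holds.

    If psi(x, y) has no witness in the domain, f(x) is the default value,
    so the Skolem axiom (exists y psi(x, y)) -> psi(x, f(x)) still holds.
    """
    ordered = sorted(domain)

    def f(x: int) -> int:
        for y in ordered:           # scan candidates in increasing order
            if psi(x, y):
                return y            # first hit is the least witness
        return default

    return f


# Illustration: psi(x, y) says that y * y >= x.
dom = range(20)
psi = lambda x, y: y * y >= x
f = least_witness_skolem(dom, psi, default=0)
print(f(10), f(17))
```

The choice of the least witness makes the function definable by a formula of the original language, which is what allows the Skolem symbol to be eliminated.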
These results are quite far-reaching. For example, for every sound arithmetic theory T containing the symbol +/2 with the usual primitive recursive definition of + we have where θ := (∃z)x + z = y. Therefore, extending the full induction principle by all the Skolem symbols based on such a theory results in a system that proves the same L(T ) formulas as the Skolem-free system. So far we have considered the effects of extending the full induction schema by all Skolem symbols. We have concluded that not only is this extension always sound but it is also conservative over the Skolem-free system in a setting where Skolem functions are definable in all models and in particular if the theory provides a well-order. We have left open the case where there are models in which a Skolem function is not definable.

Restricted induction and Skolemization
In the previous section we have considered some high-level questions about the soundness and conservativity of Skolemization in saturation theorem proving with an unrestricted induction rule. In this section we will focus on the role of Skolem symbols in the more practical setting corresponding to the induction rule Γ-GIND R given in Definition 3.7, where Γ is a set of formulas. We start by providing in Section 5.1 a representation as a logical theory for sound and refutationally complete saturation systems extended by the induction rule Γ-GIND R . After that we will make use of that characterization in order to clarify the role of the Skolem symbols in saturation systems extended by the rule Γ-GIND R mostly under the assumption that Γ is Skolem-free. As already mentioned earlier, the restriction to a Skolem-free Γ deviates from practical systems in which Γ may contain initial Skolem symbols but not induction Skolem symbols. Nevertheless, studying the effect of restricting the occurrences of all Skolem symbols in the induction schema is an interesting theoretical question and allows us to better understand the overall role of Skolem symbols.

Representation as logical theory
We will now provide a preliminary representation as a logical theory for sound and refutationally complete saturation systems extended by the induction rule Γ-GIND R . We start by introducing some additional notions that will be used throughout this section.
So far we have considered the traditional induction schema with induction parameters. In the following we introduce a notation for induction without induction parameters. Parameter-free induction schemata have been investigated in mathematical logic [Ada87, KPD88, Bek97, CFM11, Jeř20], hence, we adopt a similar notation.
Definition 5.1. Let Γ be a set of formulas, then the parameter-free induction schema for Γ formulas Γ-IND − is given by Γ- The grounding operator given in the following definition allows us to obtain all instances of a set of formulas obtained by replacing some of the variables by ground terms.
We can now introduce an operator corresponding to the rule Γ-GIND R .
Definition 5.3. Let T be a theory and Γ be a set of formulas.
It is straightforward to see that Γ-GSI ω (·) characterizes a sound and refutationally complete saturation-based proof system extended by the induction rule Γ-GIND R .
Proposition 5.4. Let S be a sound and refutationally complete saturation-based proof system and T be a theory. Then S + Γ-GIND R refutes CNF (sk ∃ (T )) if and only if Γ-GSI ω (sk ∃ (T )) is inconsistent.
Proof. Analogous to the proof of Proposition 4.4.
In Section 5.2 we will have a closer look at the role of the Skolem symbols in such saturation systems.

Induction parameters and Skolem symbols
The induction rule Γ-GIND R only generates parameter-free induction axioms, but on the other hand the generated induction axioms may contain Skolem symbols whose role is not yet clear at this point. Thus, it appears reasonable to begin by comparing sound and refutationally complete saturation systems extended by the rule Γ-GIND R with the induction schema Γ-IND − . In the setting of linear arithmetic with Γ := Open(T ) and θ(x, y) := y + x = x → y = 0 we readily obtain an example where both systems differ in strength.
Let c := s (∀x)θ(x,x) , then Open-GSI 1 (sk ∃ (T + ¬(∀x)θ)) ⊢ I x θ(x, c). Hence we now work in the theory Open-GSI 1 (sk ∃ (T + ¬(∀x)θ(x, x))) and proceed by induction on x in the formula θ(x, c). For the base case we assume c + 0 = 0 and have to show c = 0; indeed c = c + 0 = 0 by (A4). For the induction step we assume that c + x = x → c = 0 and c + s(x) = s(x). By (A5) we obtain s(c + x) = s(x) and therefore we obtain c + x = x. Hence c = 0 by the assumptions. Therefore we now obtain θ(c, c) and ¬θ(c, c), that is, ⊥.
On the other hand we also have the following.
The proof of Lemma 5.6 can be found in Appendix B and consists of the elimination of the symbol p from induction formulas followed by the construction of a model M. The domain of M consists of elements of the form (b, i) ∈ {0, 1} × Z such that b = 0 implies i ∈ N. Furthermore, the symbol 0 is interpreted as the element (0, 0) and + is interpreted as the operation (b 1 , n 1 ) + M (b 2 , n 2 ) = (max{b 1 , b 2 }, n 1 + n 2 ). Hence, M ⊭ θ((1, 0), (1, 0)), since (1, 0) + M (1, 0) = (1, 0) but (1, 0) is not the interpretation of 0.
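The behaviour of θ in this structure can be checked mechanically. The following Python sketch encodes only the parts of M described above, namely the domain condition, the interpretation of 0 and the interpretation of +; the names in_domain, add, theta and witness are illustrative assumptions, and the interpretations of the remaining symbols are deliberately left out because they are not given here.

```python
from typing import Tuple

Elem = Tuple[int, int]      # an element (b, i) of the domain
ZERO: Elem = (0, 0)         # interpretation of the symbol 0


def in_domain(e: Elem) -> bool:
    """(b, i) with b in {0, 1}, i an integer, and i >= 0 whenever b = 0."""
    b, i = e
    return b in (0, 1) and (b == 1 or i >= 0)


def add(x: Elem, y: Elem) -> Elem:
    """Interpretation of +: maximum on the first, sum on the second component."""
    return (max(x[0], y[0]), x[1] + y[1])


def theta(x: Elem, y: Elem) -> bool:
    """theta(x, y) := y + x = x -> y = 0, evaluated in the structure."""
    return add(y, x) != x or y == ZERO


witness: Elem = (1, 0)
print(theta(witness, witness))   # False: (1,0) + (1,0) = (1,0), yet (1,0) != (0,0)
```

On the elements of the form (0, n), which behave like the natural numbers, θ(x, x) evaluates to true, so the counterexample is located entirely in the non-standard part of the domain.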
Remark 5.7. We clearly have T + Open(T )-IND ⊢ θ(x, x) by proceeding by induction on x in the formula θ(x, y). Hence Lemma 5.6 is highly interesting for AITP because it provides us with a simple formula that requires induction on a syntactically more complex formula.
The proof of Lemma 5.5 is reminiscent of the obvious proof of θ(x, x) in the theory T + Open(T )-IND. Thus the proof suggests that the occurrences of Skolem symbols in ground terms of the induction formulas provide some of the strength of induction parameters. In the following we will confirm this intuition (see Theorem 5.22).
We start by showing that the Skolem symbols appearing in the ground terms of the induction axioms of Γ-GSI ω (sk ∃ (T )) are not more powerful than induction parameters. This is relatively straightforward because ground terms can be abstracted by induction parameters. In particular, the grounding operation given in Definition 5.2 is absorbed by parameterized induction.
Lemma 5.8. Let Γ be a set of formulas and L a language, then Proof. Observe that ⊢ I x ϕ(x, y, z) → I x ϕ(x, y, t).
We have announced that this section deals mainly with the case where the set of formulas Γ is Skolem-free. This corresponds to a saturation system that also restricts the occurrences of the initial Skolem symbols. In practical systems this is usually not the case, because the restriction mainly applies to induction Skolem symbols. We briefly address this more general case in the following lemma. We can now apply the above lemma to the case that is relevant for us in order to show that allowing occurrences of Skolem symbols in ground terms of induction formulas is not stronger than induction parameters.
Corollary 5.11. Let L be a Skolem-free first-order language, T an L theory, and Γ a set of L formulas. If Γ-GSI ω (sk ∃ (T )) is inconsistent, then T +Γ-IND is inconsistent.
In the following we will show by a proof-theoretic argument that we even have the converse, that is, ground Skolem terms behave in the refutational setting exactly as induction parameters. Thus, we start by recalling the necessary concepts from proof theory. We introduce a partially prenexed form of the induction schema in which the strong quantifier of the induction step is pulled into the quantifier prefix. Moving this quantifier into the quantifier prefix will simplify the subsequent arguments.
Definition 5.12. Let γ(x, z) be a formula, then we define the sentence I ′ x γ by .
Let Γ be a set of formulas, then we define Γ- This induction schema is clearly equivalent to the usual one given in Definition 2.19.
We will work with the following Gentzen system, which is essentially a variant of the calculus G1c given in [TS00] with atomic logical axioms extended by a cut rule and axioms for equality.
Definition 5.14. A sequent is an expression of the form Γ ⇒ ∆, where Γ and ∆ are finite multisets of formulas.
Definition 5.15. The sequent calculus G consists of the following rules Axioms: Rules for weakening, contraction, and cut: Rules for logical connectives: where Γ, ∆, Λ, Π stand for multisets of formulas, F, G stand for formulas, A stands for atomic formulas, t, r stand for terms, and for R ∈ {L∃, R∀} the variable α is called the eigenvariable of R and α does not occur freely in the conclusion of R.
We recall some important notions and properties of the calculus G. The calculus G is sound and complete for first-order logic.
Lemma 5.16. Let ϕ be a sentence, then ⊢ ϕ if and only if there exists a G proof of the sequent ⇒ ϕ.
The calculus G has the following form of cut elimination.
Definition 5.17. In a cut inference the formula F is called the cut formula. We say that a G proof is in atomic cut-normal form (ACNF, for short) if all of its cut formulas are atomic.
Lemma 5.18. If a sequent Γ ⇒ ∆ is provable in G, then it has a G proof in ACNF.
Definition 5.19. The inference rules L∃ or R∀ are called strong quantifier inference rules. Let π be a G proof, then by sqi(π) we denote the number of strong quantifier inferences in π.
In the argument to follow the number of strong quantifier inferences of a proof will be used as the induction measure.
Proof. We follow the ancestors of the formulas in Σ and ∆ in π and replace eigenvariables of these ancestors by their respective Skolem terms.
Proposition 5.21. Let T be a theory with L 0 ⊆ L(T ) and Γ a set of formulas. If T + Γ-IND is inconsistent, then Γ-GSI ω (sk ∃ (T )) is inconsistent.
Proof. Assume that T + Γ-IND is inconsistent, then clearly sk ∃ (T ) + Γ-IND ′ is inconsistent as well. Hence by Lemma 5.16 and Lemma 5.18 there exists a G proof π in ACNF of a sequent of the form Π, I ⇒, where Π is a finite subset of sk ∃ (T ) and I is a finite subset of Γ-IND ′ . Observe, furthermore, that we can assume without loss of generality that the symbol 0 occurs in Π since L 0 ⊆ L(T ).
Let µ be a proof in ACNF of a sequent of the form Σ, I ⇒ with Π ⊆ Σ ⊆ Γ-GSI ω (sk ∃ (T )). We proceed by induction on the number of strong quantifier inferences of µ in order to obtain a proof of a sequent Σ ′ ⇒ where Σ ′ ⊆ Γ-GSI ω (sk ∃ (T )). If µ does not contain strong quantifier inferences, then we obtain a proof of Σ ⇒ by permuting inferences on ancestors of I downward. For the induction step assume that µ contains at least one strong quantifier inference. Because µ does not contain non-atomic cuts, we can permute quantifier inferences toward the bottom of the proof without introducing any new strong quantifier inferences. Since Σ is free of strong quantifiers, any strong quantifier inference takes place on an ancestor of a formula in I. Hence, by permuting a strong quantifier inference toward the bottom of the proof µ, we obtain a proof ν with sqi(ν) ≤ sqi(µ) of the form where ϕ(x, z) is a Γ formula and t is a vector of ground terms for which we can assume without loss of generality that L( t) ⊆ L(Σ). If t contains a symbol σ of I that does not already occur in Σ, then there is a formula γ( x) ∈ Γ containing σ and we introduce sk ∃ (I x γ(0, . . . , 0)) into Σ by a left weakening. Now we let c := s (∀x)(ϕ(x, t)→ϕ(s(x), t)) .
We can summarize the results in the following proposition.
Proposition 5.22. Let L be a Skolem-free first-order language, T an L theory with L 0 ⊆ L(T ), and Γ a set of L formulas, then Γ-GSI ω (sk ∃ (T )) is inconsistent if and only if T + Γ-IND is inconsistent.
Proof. An immediate consequence of Corollary 5.11 and Proposition 5.21.
The above result shows that in a refutational setting allowing Skolem symbols to appear in ground terms of induction formulas corresponds exactly to induction with parameters. This confirms our initial intuition that Skolem symbols in ground terms behave like induction parameters. We can rephrase the result of Proposition 5.22 as follows.
Theorem 5.23. Let L be a Skolem-free first-order language, T an L theory, Γ a set of L formulas, ϕ an L formula such that L 0 ⊆ L(T ) ∪ L(ϕ), and S a sound and refutationally complete saturation system. Then S + Γ-GIND R refutes CNF (sk ∃ (T + ¬ϕ)) if and only if T + Γ-IND ⊢ ϕ.
We have thus obtained a Skolem-free characterization of a sound and refutationally complete saturation-based proof system with the induction rule Γ-GIND R . We conclude this section with a question about a generalization of Theorem 5.23.
Question 5.24. Consider again the situation of Lemma 5.9, where we have shown that Γ-GSI ω (T ) is L conservative over L-SA+T +Γ-IND where L ⊇ L 0 is a first-order language, T an L theory, and Γ a set of L formulas. This gives rise to the question whether we can characterize a system that allows initial Skolem symbols to occur in the induction formulas without restriction, but restricts the occurrences of induction Skolem symbols in an analogous way to Proposition 5.21. In particular, is Γ-GSI ω (T ) inconsistent if and only if L-SA + T + Γ-IND is inconsistent?

Unprovability
In the previous sections we have studied two forms of induction rules occurring in saturation-based induction provers. In particular we were able to give a Skolem-free characterization as a logical theory of the induction rule Γ-GIND R where Γ is a set of Skolem-free formulas. In this section we will make use of this result in order to provide concrete unprovability results for saturation systems that make use of this induction rule. In Section 6.1 we will provide unprovability results for saturation-based systems that are based on the induction rule Open(L)-GIND R , where L stands for the language of the initial clause set. Then in Section 6.2 we show that the concrete methods described in [RV19, HHK + 20] belong to this family and that therefore we obtain unprovability results for these methods.

Open induction
The setting of linear arithmetic described in Section 2.3 proves to be a source of very simple and practically relevant unprovability examples. We make use of an elegant characterization proved by Shoenfield [Sho58].
The following formulas were already studied by Shoenfield in [Sho58]. Their interesting relation to the theory T ′ will be crucial for our unprovability results.
We now have everything at hand to formulate the unprovability result.
Definition 6.5. Let m, n ∈ N, then the clause sets X m and Y m,n are given by X m := CNF (sk ∃ (T ′ + ¬C m )) and Y m,n := CNF (sk ∃ (T ′ + ¬D m,n )).
Theorem 6.6. Let S be a sound saturation system and C ∈ {X m , Y m,n | 0 < n < m}, then S + Open(L(C))-GIND R does not refute the clause set C.
Proof. We consider the case for C = X m with 1 < m. The other case is treated analogously. Proceed indirectly and assume that S+Open(L(X m ))-GIND R refutes X m . Then by Lemma 5.9 we have First of all, observe that sk ∃ (T ′ ) = T ′ . By applying Proposition 2.14 we obtain
This result raises the question of which features a system needs in order to prove the sentences C m and D m,n for 0 < n < m. In the following we briefly mention some extensions of the open induction schema that would allow us to overcome our unprovability results. The extensions we suggest are purely theoretical in the sense that we do not take into account whether they can be implemented efficiently in a saturation system. A possible extension follows from a remark by Shoenfield [Sho58] that C m and D m,n with 0 < n < m can be proved with parameterized double induction (also known as simultaneous induction) on open formulas.
Definition 6.7. Let γ(x, y, z) be a formula, then the formula Ĩ (x,y) γ is given by ((∀x)γ(x, 0, z) ∧ (∀y)γ(0, y, z) ∧ (∀x, y)(γ(x, y, z) → γ(s(x), s(y), z))) → (∀x, y)γ(x, y, z).
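To see why the double induction schema of Definition 6.7 is sound in the standard model, it may help to observe that the two base cases together with the diagonal step reach every pair of natural numbers. The following Python sketch checks this on a finite grid; the function name reachable_pairs and the bound are illustrative assumptions.

```python
from typing import Set, Tuple


def reachable_pairs(bound: int) -> Set[Tuple[int, int]]:
    """Pairs (x, y) below the bound that are covered by double induction.

    Base cases: all pairs (x, 0) and (0, y).
    Step: from (x, y) move to (x + 1, y + 1).
    """
    reached = {(x, 0) for x in range(bound)} | {(0, y) for y in range(bound)}
    frontier = list(reached)
    while frontier:
        x, y = frontier.pop()
        nxt = (x + 1, y + 1)
        if nxt[0] < bound and nxt[1] < bound and nxt not in reached:
            reached.add(nxt)
            frontier.append(nxt)
    return reached


bound = 8
full_grid = {(x, y) for x in range(bound) for y in range(bound)}
print(reachable_pairs(bound) == full_grid)
```

Every pair (x, y) with x ≥ y is reached from the base pair (x − y, 0) after y diagonal steps, and symmetrically for x < y, which is the informal reason why the conclusion (∀x, y)γ(x, y, z) follows over the natural numbers.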
Lemma 6.8. Let m, n ∈ N with 0 < n < m, then The second possibility is to extend the induction rule used by the system at least to ∀ 1 formulas without parameters.
Lemma 6.9. Let m, n ∈ N with 0 < n < m, then
Proof. The proof of (i) is left as an exercise. For (ii) we work in T + ∀ 1 (T )-IND − and proceed by induction on the formula (∀y)(m · x = m · y → x = y). For the base case we have to show that m · 0 = m · y → 0 = y. By Lemma 2.23 we have m · 0 = 0. By (B1) we need to distinguish two cases. If y = 0, then we are done, otherwise we obtain a contradiction by (A1). For the induction step we assume (∀y)(m · x = m · y → x = y) and m · s(x) = m · y. We want to obtain s(x) = y. By (A5) and (B2) we obtain s m (m · x) = m · s(x) = m · y. By (B1) we can distinguish two cases. If y = 0, then by Lemma 2.23 we obtain s m (m · x) = 0, which contradicts (A1). Hence by Lemma 2.22 we have m · x = m · p(y) and it suffices to show x = p(y). By the induction hypothesis we have m · x = m · p(y) → x = p(y). Thus we obtain x = p(y).
For (iii) we proceed analogously.
Shoenfield has shown the following interesting theorem.
From this it follows that at least in the setting of linear arithmetic double induction and parameter-free ∀ 1 induction are sufficient to prove all true quantifier-free formulas.
In a similar way to what we did in this section, we obtain many more unprovability results by using independence results of Shepherdson [She64] and Schmerl [Sch88]. However, these results are formulated in a language that, besides the symbols of linear arithmetic, contains the symbols −/2 and ·/2 for the truncated subtraction and multiplication, respectively. The statements that are shown to be independent of the base theory with open induction express slightly more complicated properties such as the irrationality of the square root of two, Fermat's last theorem for n = 3, and similar diophantine equations. Hence, these independence results are currently less practically realistic.

Literal induction: a case study
In the previous section we have provided unprovability results for sound saturation systems that are extended by the rule Open(L)-GIND R , where L is a Skolem-free language. In this section we will show that these results apply to the concrete systems described in [RV19,HHK + 20].
In [RV19] Reger and Voronkov describe an AITP system that extends a sound saturation-based proof system by the induction rule where a is a constant, l(x) is a literal free of a, and l(a) is ground. We informally refer to this induction rule as the first analytical literal induction rule. Basically, this induction rule operates as follows: Whenever a clause of the form l(a) ∨ C is encountered, the rule generates the clauses corresponding to the induction axiom I x l(x) and immediately resolves these against l(a) ∨ C. In a practical implementation the rule will not apply to every clause of the form l(a) ∨ C but only when some additional conditions are satisfied. We call this induction rule analytical because an induction is carried out only for literals that are actually generated during the saturation process.
The motivation for choosing the very restricted induction rule Literal-AIND R 1 is to solve problems that require "little" induction reasoning and complex first-order reasoning [RV19]. In particular, the induction rule is chosen so as not to generate too many clauses, which otherwise would potentially result in performance issues. Empirical observations [HHK+20], however, suggest that this method is unable to deal even with very simple yet practically relevant problems such as
In order to relax the overly restrictive analyticity, [HHK+20] introduces the following induction rule: where l(x) is a literal and a is a constant such that l(a) is ground. This rule reduces the degree of analyticity by allowing induction to be carried out on slight generalizations of the currently derived literals. This results in more possibilities to add induction axioms to the search space and thus makes search more difficult, but the degree of analyticity of the induction is reduced sufficiently to make the method able to prove some challenging formulas such as, for example, x+(x+x) = (x+x)+x (see [HHK+20] for details).
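The difference between the two rules can be illustrated by enumerating candidate induction literals. The following Python sketch assumes that a ground literal is given as a list of tokens and that a generalization replaces a non-empty selection of the occurrences of the constant a by the induction variable x; the function name generalizations and the token representation are hypothetical, and actual implementations restrict this enumeration by heuristics that are not modelled here.

```python
from itertools import combinations
from typing import List


def generalizations(tokens: List[str], const: str, var: str = "x") -> List[str]:
    """All literals l(x) obtained by replacing some occurrences of const by var.

    Every returned literal l(x) satisfies l(const) == original literal.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == const]
    results = []
    for k in range(1, len(positions) + 1):          # size of the selection
        for chosen in combinations(positions, k):   # which occurrences to replace
            new_tokens = list(tokens)
            for i in chosen:
                new_tokens[i] = var
            results.append(" ".join(new_tokens))
    return results


# Negated goal literal for x + (x + x) = (x + x) + x with Skolem constant a.
literal = "a + ( a + a ) != ( a + a ) + a".split()
candidates = generalizations(literal, "a")

print(len(candidates))    # number of non-empty selections of the six occurrences
print(candidates[-1])     # the literal in which every occurrence is replaced
```

Under these assumptions, the first rule corresponds to the single candidate in which all occurrences of a are replaced, since it requires l(x) to be free of a, whereas the second rule may use any of the candidates.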
It is clear that the rule Literal-AIND R 2 is at least as strong as the rule Literal-AIND R 1 . Hence we will in the following concentrate on the rule Literal-AIND R 2 . In the next step we will show how the induction rule Literal-AIND R 2 can be expressed in terms of the restricted induction rule given in Definition 3.7. The proof proceeds in three steps: first, we extract the induction axioms that are introduced with Literal-AIND R 2 ; secondly, we derive these induction axioms with the induction rule of Definition 3.7; finally, we use first-order inferences to reconstruct a refutation.
As an immediate consequence, we can transfer the previously established unprovability results to the concrete method described in [RV19, HHK + 20].
Theorem 6.13. Let S be a sound and refutationally complete saturation system. Then the system S + Literal-AIND R 2 refutes neither the clause set X_m nor the clause set Y_{m,n} for 0 < n < m.
Proof. We consider the case for the clause set X_m with 1 < m. The other case is analogous. Suppose that S + Literal-AIND R 2 refutes X_m. Then by Proposition 6.12 the saturation system S + Literal(L(X_m))-GIND R refutes X_m. This contradicts Theorem 6.6.
Theorem 6.13 gives us a family of simple and practically relevant clause sets that cannot be refuted by the calculi presented in [RV19, HHK+20].
Let us now briefly discuss these results. A possible source of criticism for Theorem 6.13 may be that the underlying independence results (Lemma 6.4) are overly strong. That is, they do not exploit the restriction of the induction to literals, but instead rely on the fact that the sentences C_m and D_{m,n} with 0 < n < m are already unprovable with induction for all quantifier-free formulas. We can address this point by the following results.
Proof. Proving B2 and B3 is straightforward. For B4 we show the contrapositive y ≠ z → x + y ≠ x + z. We assume y ≠ z and proceed by induction on x in the formula x + y ≠ x + z. For the base case we have to show 0 + y ≠ 0 + z. By B2 and the definition of + the formula 0 + y ≠ 0 + z is equivalent to y ≠ z, which we have assumed. For the induction step we assume x + y ≠ x + z and, towards a contradiction, s(x) + y = s(x) + z. By B2 and A5 we obtain s(x + y) = s(x + z), hence x + y = x + z, which contradicts the induction hypothesis, and we are done.
Proving B1 is slightly more complicated because the induction interacts even more with the context. We assume x ≠ 0 and we have to show x = s(p(x)). We proceed by induction on y in the formula x ≠ y. The induction base is trivial since we have assumed x ≠ 0. For the induction step we assume x ≠ y_0 and we have to show x ≠ s(y_0). Suppose, to the contrary, that x = s(y_0). Then we have s(p(x)) = s(p(s(y_0))) = s(y_0) = x and we are done. Otherwise the induction step goes through, so we obtain the formula (∀y) x ≠ y and in particular x ≠ x, which is a contradiction. Hence we obtain x = s(p(x)).
In the light of Shoenfield's theorem it is now clear that induction for literals is as powerful as quantifier-free induction.
Proof. The direction from right to left is obvious. The direction from left to right follows from Lemma 6.14 and Shoenfield's theorem (Theorem 6.2).
The underlying independence results are therefore not too strong and it is not possible to improve the result by taking into account the restriction of the induction to literals. The result may also be interesting from a practical point of view, because induction for literals is much easier to implement efficiently than induction for quantifier-free formulas. It would therefore be interesting to investigate under which conditions induction for quantifier-free formulas collapses to induction for literals. However, we believe that there are practically relevant theories in which the induction schema for literals is strictly weaker than the induction schema for quantifier-free formulas. Such a theory could allow us to provide unprovability results that give a motivation for the development of stronger induction mechanisms.
Another possible source of criticism is that our results focus on abstractions that are quite far from practical reality. Most importantly, we do not exploit the fact that the induction rules Literal-AIND R i (i = 1, 2) attempt induction only for literals whose dual has an instance among the derived clauses. Selecting the induction literals in this way seems to be a strong theoretical and practical restriction. However, this restriction is crucial for current practical systems because it permits an efficient operation of the prover. In practice, the restriction is usually weakened by the use of heuristics for the selection of induction formulas [HHK+20]. Another promising method for discovering induction formulas is introduced in [CJRS13, VJ15], but it is unclear how to integrate this efficiently into a saturation-based system. We currently do not have a candidate clause set that exploits the way in which Literal-AIND R i (i = 1, 2) select induction literals, but we plan to investigate this restriction in the future.
On the other hand, working with high-level abstractions allows us to obtain results that are robust against minor refinements of the induction rule from [RV19], such as the refinement proposed in [HHK+20]. Moreover, the underlying independence results together with Lemmas 6.8 and 6.9 suggest natural, yet not necessarily practical, extensions of the induction rule by allowing simultaneous induction on multiple variables or by allowing quantification inside the induction formula.
In Section 4, we have considered a general framework for induction over natural numbers in saturation-based provers that extend the language by Skolem symbols. By reducing this induction mechanism to a logical theory (see Theorem 4.11), we have shown that in many relevant cases extending the language of the induction schema by Skolem symbols does not grant any additional power (see Proposition 4.21). Furthermore, we have considered, in Section 5, an induction rule that restricts occurrences of Skolem symbols to ground terms according to similar restrictions observed in practical systems.
We have shown that under this restriction Skolem symbols correspond to induction parameters (see Theorem 5.22). Finally, in Section 6, we have used the results from Section 5 and independence results from the literature on mathematical logic to obtain some practically relevant unprovability results for the systems described in [RV19, HHK+20] (see Theorem 6.13).
We plan to continue the work on induction in saturation-based theorem proving by analyzing the methods developed by Cruanes [Cru15, Cru17], Wand [Wan17] and Echenim and Peltier [EP20]. We are particularly interested in Cruanes' method because its mode of operation is very similar to the methods described in [RV19, HHK+20]. We suspect that under reasonable assumptions, the induction in Cruanes' system corresponds to the restricted induction rule (see Definition 3.7) over ∀_1 formulas. Furthermore, Cruanes' method also allows induction on several formulas simultaneously and introduces definitions by the AVATAR splitting mechanism [Vor14].
Furthermore, the work in this article has given rise to a number of questions that we hope to address in the future. In Section 4 we have established some very coarse results concerning the conservativity of extensions of the language of the induction formulas by Skolem symbols. In particular, we have shown that in many relevant cases extending the induction schema by Skolem symbols does not result in a more powerful system. We have, however, left open the general case (see Question 4.14). This question is not specific to induction but is part of a more general question concerning the extension of the language of an axiom schema by Skolem symbols. In Section 5 we have mainly considered the case where the occurrences of all Skolem symbols in the induction formulas are subject to the restriction mentioned above. Practical systems only impose this restriction on Skolem symbols that are generated by the induction rule. We have left open the question about a characterization of these systems (see Question 5.24). Finally, it seems worthwhile to investigate the effects of the analyticity properties of induction rules used in concrete systems such as [RV19, HHK+20] and their interaction with redundancy rules.
Next we show that if a p-free term contains the variable x, then after substituting s(x) for x we can propagate one occurrence of the successor function to the root of the term.
Lemma B.2. Let t(x) be a non-ground p-free term, then there exists a p-free term t′(x) such that T ⊢ t(s(x)) = s(t′(x)).
Proof. We proceed by induction on the structure of the term t. If t = x, then we are done by letting t′ = t. If t = s(u(x)), then u is non-ground and p-free. We let t′ = u(s(x)); then we have T ⊢ t(s(x)) = s(u(s(x))) = s(t′(x)). If t = u_1 + u_2, then we have to consider two cases depending on whether u_2 is ground. If u_2 is not ground, then by the induction hypothesis there exists u′_2 such that T ⊢ u_2(s(x)) = s(u′_2(x)). Then we have T ⊢ u_1(s(x)) + u_2(s(x)) = u_1(s(x)) + s(u′_2(x)) = s(u_1(s(x)) + u′_2(x)) and we set t′ = u_1(s(x)) + u′_2. If u_2 is ground, then u_1 is non-ground and by the induction hypothesis there exists u′_1 such that T ⊢ u_1(s(x)) = s(u′_1(x)). We have T ⊢ t(s(x)) = u_1(s(x)) + u_2 = s(u′_1(x)) + k = s(s^k(u′_1(x))), where k is a numeral with T ⊢ u_2 = k; hence we choose t′ = s^k(u′_1).
Now we will show that given a term t(x), we can eliminate the occurrences of p in t(s^N(x)) when N is large enough.
Lemma B.3. Let t(x) be a term, then there exists N ∈ N and a p-free term t′ such that T ⊢ t(s^N(x)) = t′.
Proof. We proceed by induction on the structure of t. If t is a ground term, then we have T ⊢ t = k for some k and we let t′ = k and N = 0. If t = x, then we let N = 0 and t′ = t. If t = s(u), where u is a term, then by the induction hypothesis there exist N′ and a p-free u′ such that T ⊢ u(s^{N′}(x)) = u′. We let N := N′ and t′ = s(u′); then we have T ⊢ t(s^N(x)) = s(u(s^N(x))) = s(u′). If t = p(u), then by the induction hypothesis we have some N′ and a p-free u′ such that T ⊢ u(s^{N′}(x)) = u′. Hence by Lemma B.2 we have T ⊢ p(u(s^{N′+1}(x))) = p(u′(s(x))) = p(s(u′′)) = u′′ for some p-free term u′′, and we let N := N′ + 1 and t′ = u′′. If t = u_1 + u_2, then by the induction hypothesis there exist for i ∈ {1, 2} a natural number N_i and a p-free term u′_i such that T ⊢ u_i(s^{N_i}(x)) = u′_i. Let N = max{N_1, N_2}, then we have T ⊢ t(s^N(x)) = u_1(s^N(x)) + u_2(s^N(x)) = u′_1(s^{N−N_1}(x)) + u′_2(s^{N−N_2}(x)), thus we let t′ = u′_1(s^{N−N_1}(x)) + u′_2(s^{N−N_2}(x)).
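The constructions in the proofs of Lemmas B.2 and B.3 are effective, and the following Python sketch mirrors them case by case. It is an illustration under assumptions, not part of the development: terms are encoded as nested tuples, all function names are hypothetical, and T is assumed to prove p(0) = 0, p(s(x)) = x, x + 0 = x and x + s(y) = s(x + y). The function `evaluate` interprets terms in the standard model and serves only as a sanity check, not as a derivation in T.

```python
# Terms over 0, s, p, + and the single variable x, encoded as nested tuples.
X, ZERO = ("x",), ("0",)

def S(t): return ("s", t)
def P(t): return ("p", t)
def Plus(a, b): return ("+", a, b)

def numeral(k):
    """The numeral s^k(0)."""
    t = ZERO
    for _ in range(k):
        t = S(t)
    return t

def is_ground(t):
    return t[0] != "x" and all(is_ground(u) for u in t[1:])

def is_p_free(t):
    return t[0] != "p" and all(is_p_free(u) for u in t[1:])

def subst(t, r):
    """Replace every occurrence of x in t by the term r."""
    if t[0] == "x":
        return r
    return (t[0],) + tuple(subst(u, r) for u in t[1:])

def shift(t, k):
    """The term t(s^k(x))."""
    r = X
    for _ in range(k):
        r = S(r)
    return subst(t, r)

def evaluate(t, n):
    """Value of t in the standard model with x := n (assumes p(0) = 0)."""
    op = t[0]
    if op == "x": return n
    if op == "0": return 0
    if op == "s": return evaluate(t[1], n) + 1
    if op == "p": return max(evaluate(t[1], n) - 1, 0)
    return evaluate(t[1], n) + evaluate(t[2], n)

def pull_succ(t):
    """Lemma B.2: for non-ground p-free t, return t' with t(s(x)) = s(t'(x))."""
    assert is_p_free(t) and not is_ground(t)
    op = t[0]
    if op == "x":
        return X
    if op == "s":
        return shift(t[1], 1)
    u1, u2 = t[1], t[2]
    if not is_ground(u2):
        return Plus(shift(u1, 1), pull_succ(u2))
    # u2 is ground and equals the numeral k; push k successors onto u1'.
    result = pull_succ(u1)
    for _ in range(evaluate(u2, 0)):
        result = S(result)
    return result

def elim_p(t):
    """Lemma B.3: return (N, t') with t' p-free and t(s^N(x)) = t'."""
    if is_ground(t):
        return 0, numeral(evaluate(t, 0))
    op = t[0]
    if op == "x":
        return 0, X
    if op == "s":
        n, u = elim_p(t[1])
        return n, S(u)
    if op == "p":
        n, u = elim_p(t[1])
        return n + 1, pull_succ(u)
    (n1, u1), (n2, u2) = elim_p(t[1]), elim_p(t[2])
    n = max(n1, n2)
    return n, Plus(shift(u1, n - n1), shift(u2, n - n2))

# Example: for t = p(p(x)) + x the sketch returns N = 2 and a p-free term.
N, t_prime = elim_p(Plus(P(P(X)), X))
```

The sketch relies on the observation, implicit in the proofs, that both constructions map non-ground terms to non-ground terms, so the precondition of `pull_succ` holds in the case t = p(u).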
Lemma B.4. Let ϕ(x) be a formula, then there exists N ∈ N and a p-free formula ϕ′(x) such that T ⊢ ϕ(s^N(x)) ↔ ϕ′(x).
Let θ_1(x), ..., θ_n(x) be all the atoms of ϕ. For each i ∈ {1, ..., n}, apply the argument above to θ_i in order to obtain a natural number M_i and a p-free atom θ′_i such that T ⊢ θ_i(s^{M_i}(x)) ↔ θ′_i. Let M = max{M_i | i = 1, ..., n} and obtain ϕ′ by replacing in ϕ(s^M(x)) every atom θ_i(s^M(x)) by θ′_i(s^{M−M_i}(x)). Clearly we have T ⊢ ϕ(s^M(x)) ↔ ϕ′.
We can now "factor" the symbol p out of the induction schema. The idea is that instead of starting the induction at 0, we start it at some N ∈ N that is large enough for p to be eliminated according to the lemma above.
Proof. Let ϕ(x) be an L(T) formula. We want to show I_x ϕ(x). By Lemma B.4 above we obtain an N ∈ N and a p-free formula ψ such that T ⊢ ϕ(s^N(x)) ↔ ψ(x). Now we work in T + (B1) + Open(L′)-IND^− and assume ϕ(0) and (∀x)(ϕ(x) → ϕ(s(x))), and we want to show ϕ(x). By an (N − 1)-fold application of (B1) it suffices to show ϕ(0), ϕ(1), ..., ϕ(s^N(p^N(x))). By starting with ϕ(0) and iterating ϕ(x) → ϕ(s(x)) we obtain ϕ(n) for all n ∈ N. Hence it remains to show ϕ(s^N(p^N(x))). We proceed by induction on x in the formula ψ(x). For the induction base we have to show ψ(0), which is equivalent to ϕ(N), hence we are done. For the induction step we assume ψ(x) and we have to show ψ(s(x)). We have ψ(x) ↔ ϕ(s^N(x)) and by (∀x)(ϕ(x) → ϕ(s(x))) we obtain ϕ(s^N(x)) → ϕ(s^{N+1}(x)), thus by modus ponens ϕ(s^{N+1}(x)), which is equivalent to ψ(s(x)). This completes the induction step. By the induction we thus obtain ψ(x), and in particular ψ(p^N(x)), which is equivalent to ϕ(s^N(p^N(x))). This completes the proof.
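The case distinction behind the proof can be written out as follows. This is a sketch which assumes that (B1) has the form x = 0 ∨ x = s(p(x)), as the proof of B1 given earlier suggests; the notation \overline{k} for the numeral s^k(0) is introduced here only for readability.

```latex
% (B1), assumed form
\[
  x = 0 \;\vee\; x = s(p(x))
\]
% Applying (B1) to p(x) and substituting into the second disjunct
\[
  x = 0 \;\vee\; x = s(0) \;\vee\; x = s(s(p(p(x))))
\]
% Iterating on p(x), p(p(x)), ...
\[
  \bigvee_{k < N} x = \overline{k} \;\;\vee\;\; x = s^{N}(p^{N}(x))
\]
% Since T proves \varphi(s^N(x)) <-> \psi(x), the last case reduces to
\[
  \psi(p^{N}(x)) \;\leftrightarrow\; \varphi(s^{N}(p^{N}(x)))
\]
```

The finitely many numeral cases are handled by iterating the induction step from ϕ(0), so induction is needed only for the p-free formula ψ.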
As an immediate consequence of the above lemma we can factor all the occurrences of p/1 in the induction formulas into a single axiom.