1 Introduction

Reactive systems are notoriously difficult to design and even to specify correctly [1, 13]. As a consequence, formal methods have emerged as useful tools to help designers build reactive systems that are correct. For instance, model-checking asks the designer to provide a model, in the form of a Mealy machine \({\mathcal {M}}\), that describes the reactions of the system to events generated by its environment, together with a description of the core correctness properties that must be enforced. Those properties are expressed in a logical formalism, typically as an LTL formula \(\varphi _{\textsf{CORE}}\). Then an algorithm decides if \({\mathcal {M}}\models \varphi _{\textsf{CORE}}\), i.e. if all executions of the system in its environment satisfy the specification. Automatic reactive synthesis is more ambitious: it aims at automatically generating a model from a high level description of “what” needs to be done instead of “how” it has to be done. Thus the user is only required to provide an LTL specification \(\varphi \) and the algorithm automatically generates a Mealy machine \({\mathcal {M}}\) such that \({\mathcal {M}}\models \varphi \) whenever \(\varphi \) is realizable. Unfortunately, providing only the core correctness properties \(\varphi _{\textsf{CORE}}\) is usually not sufficient to obtain a Mealy machine \({\mathcal {M}}\) that is useful in practice, as illustrated next.

Example 1

[Synthesis from \(\varphi _{\textsf{CORE}}\) - Mutual exclusion] Let us consider the classical problem of mutual exclusion. In the simplest form of this problem, we need to design an arbiter that receives requests from two processes, modeled by two atomic propositions \(r_1\) and \(r_2\) controlled by the environment, and that grants accesses to the critical section, modeled as two atomic propositions \(g_1\) and \(g_2\) controlled by the system. The core correctness properties (the what) are: (i) mutual access, i.e. it is never the case that the access is granted to both processes at the same time, (ii) fairness, i.e. processes that have requested access eventually get access to the critical section. These core correctness specifications for mutual exclusion (ME) are easily expressed in LTL as follows: \(\varphi ^{\textsf{ME}}_{\textsf{CORE}} \equiv \square ( \lnot g_1 \vee \lnot g_2) \wedge \square ( r_1 \rightarrow \lozenge g_1) \wedge \square ( r_2 \rightarrow \lozenge g_2)\). Indeed, this formula expresses the core correctness properties that we would model check no matter how \({\mathcal {M}}\) implements mutual exclusion, e.g. Peterson, Dekker, Bakery algorithms, etc. Unfortunately, if we submit \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\) to an LTL synthesis procedure, implemented in tools like Acacia-Bonzai [11], BoSy [17], or Strix [25], we get the solution \({\mathcal {M}}\) depicted in Fig. 1-(left) (all three tools return this solution). While this solution is perfectly correct and realizes the specification \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\), it ignores the inputs from the environment and grants access to the critical section in a round robin fashion. Arguably, it may not be considered an efficient solution to the mutual exclusion problem. This illustrates the limits of solving the design problem with a synthesis algorithm when only the core correctness specification of the problem, i.e. the what, is provided. To produce useful solutions to the mutual exclusion problem, more guidance must be provided.

Fig. 1. (Left) The solution of Strix to the mutual exclusion problem for the high level specification \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\). Edge labels are of the form \(\varphi /\psi \) where \(\varphi \) is a Boolean formula over the input atomic propositions (Boolean variables controlled by the environment) and \(\psi \) is a maximally consistent conjunction of literals over the set of output propositions (Boolean variables controlled by the system). (Right) A natural solution that could be drawn by hand, and is automatically produced by our learning/synthesis algorithm for the same specification together with two simple examples.

The main question is now: how should we specify these additional properties? Obviously, if we want to use the “plain” LTL synthesis algorithm, there is no choice: we need to reinforce the specification \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\) with additional lower level properties \(\varphi ^{\textsf{ME}}_{\textsf{LOW}}\). Let us go back to our running example.

Example 2

[Synthesis from \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\) and \(\varphi ^{\textsf{ME}}_{\textsf{LOW}}\)] To avoid solutions with unsolicited grants, we need to reinforce the core specification. The Strix online demo website proposes to add the following 3 LTL formulas \(\varphi ^{\textsf{ME}}_{\textsf{LOW}}\) to \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\) (see Full arbiter \(n=2\), at https://meyerphi.github.io/strix-demo/): (1) \(\bigwedge _{i \in \{1,2\}} \square ((g_i \wedge \square \lnot r_i) \rightarrow \lozenge \lnot g_i)\), (2) \(\bigwedge _{i \in \{1,2\}} \square (g_i \wedge \bigcirc (\lnot r_i \wedge \lnot g_i) \rightarrow \bigcirc (r_i \textsf{R} \lnot g_i))\), and (3) \(\bigwedge _{i \in \{1,2\}} (r_i \textsf{R} \lnot g_i)\). Strix, on the specification \(\varphi ^{\textsf{ME}}_{\textsf{CORE}} \wedge \varphi ^{\textsf{ME}}_{\textsf{LOW}}\), provides us with a better solution, but it is more complex than needed (it has 9 states, see [5]) and clearly does not look like an optimal solution to our mutual exclusion problem. E.g., the model of Fig. 1-(right) is arguably more natural. How can we get this model without coding it into the LTL specification, which would greatly diminish the interest of using a synthesis procedure in the first place?

In general, higher level properties are properties that need to be met by all implementations, e.g. safety-critical properties. In contrast, lower level properties are more about a specific implementation, its expected behaviour and efficiency. At this point, it is legitimate to question the adequacy of LTL as a specification language for lower level properties, and so as a way to guide the synthesis procedure towards relevant solutions to realize \(\varphi _{\textsf{CORE}}\). In this paper, we introduce an alternative to guide synthesis toward useful solutions that realize \(\varphi _{\textsf{CORE}}\): we propose to use examples of executions that illustrate behaviors of expected solutions. We then restrict the search to solutions that generalize those examples. Examples, or scenarios of executions, are accepted in requirements engineering as an adequate tool to elicit requirements about complex systems [12]. For reactive system design, examples are particularly well-suited as they are usually much easier to formulate than full-blown solutions, or even partial solutions. This is because, when formulating examples, the user controls both the inputs and the outputs, avoiding the main difficulty of reactive system design: having to cope with all possible environment inputs. We illustrate this on our running example.

Example 3

[Synthesis from \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\) and examples] Let us keep, as the LTL specification, \(\varphi ^{\textsf{ME}}_{\textsf{CORE}}\) only, and let us consider the following simple prefixes of executions that illustrate how solutions to mutual exclusion should behave:

(1) \(\{!r_1,!r_2\} . \{!g_1,!g_2\} \#\{r_1,!r_2\} . \{g_1,!g_2\} \# \{!r_1,r_2\} . \{!g_1,g_2\}\)

(2) \(\{r_1,r_2\} . \{g_1,!g_2\} \# \{!r_1,!r_2\} . \{!g_1,g_2\}\)

These trace prefixes prescribe reactions to typical fixed finite input sequences: (1) if there is no request initially, then no access is granted (note that this already excludes the round robin solution), and if processes 1 and 2 then request one after the other, process 1 is granted first and process 2 is granted next; (2) if both processes request simultaneously, then process 1 is granted first and process 2 is granted next. Given those two simple traces together with \(\varphi _{\textsf{CORE}}\), our algorithm generates the solution of Fig. 1-(right). Arguably, the solution is now simple and natural.
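To make the role of the two examples concrete, the following minimal Python sketch encodes them as sequences of (input, output) letters and replays them on a hand-written round robin arbiter. The arbiter `round_robin` and the helper `is_compatible` are hypothetical names introduced for illustration only; the arbiter merely mimics the behaviour described in the text (alternating grants, inputs ignored) and is not claimed to be the exact machine of Fig. 1-(left).

```python
def letter(*props):
    """A letter is the set of atomic propositions that hold (others are false)."""
    return frozenset(props)

# Examples (1) and (2) from the text, as lists of (input, output) pairs.
EXAMPLE_1 = [
    (letter(), letter()),          # no request, no grant
    (letter("r1"), letter("g1")),  # process 1 requests and is granted
    (letter("r2"), letter("g2")),  # process 2 requests and is granted
]
EXAMPLE_2 = [
    (letter("r1", "r2"), letter("g1")),  # both request, process 1 first
    (letter(), letter("g2")),            # then process 2
]

def round_robin(state, _inp):
    """Hypothetical arbiter: ignores its input and alternates g1, g2."""
    if state == 0:
        return letter("g1"), 1
    return letter("g2"), 0

def is_compatible(step, init, example):
    """True if replaying the inputs of `example` produces exactly its outputs."""
    state = init
    for inp, expected in example:
        out, state = step(state, inp)
        if out != expected:
            return False
    return True

print(is_compatible(round_robin, 0, EXAMPLE_1))
print(is_compatible(round_robin, 0, EXAMPLE_2))
```

Under these assumptions the round robin arbiter happens to agree with example (2) but disagrees with example (1) on the very first step, since it grants access although nobody requested it.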

Contributions First, we provide a synthesis algorithm SynthLearn that, given an LTL specification \(\varphi _{\textsf{CORE}}\) and a finite set E of prefixes of executions, returns a Mealy machine \({\mathcal {M}}\) such that \({\mathcal {M}}\models \varphi _{\textsf{CORE}}\), i.e. \({\mathcal {M}}\) realizes \(\varphi _{\textsf{CORE}}\), and \(E \subseteq \textsf{Prefix}(L({\mathcal {M}}))\), i.e. \({\mathcal {M}}\) is compatible with the examples in E, if such a machine \({\mathcal {M}}\) exists. It returns unrealizable otherwise. Additionally, we require SynthLearn to generalize the decisions illustrated in E. This learnability requirement is usually formalized in automata learning with a completeness criterion that we adapt here as follows: for all specifications \(\varphi _{\textsf{CORE}}\), and for all Mealy machines \({\mathcal {M}}\) such that \({\mathcal {M}}\models \varphi _{\textsf{CORE}}\), there is a small set of examples E (polynomial in \(|{\mathcal {M}}|\)) such that \(L({\textsc {SynthLearn}}(\varphi _{\textsf{CORE}},E))=L({\mathcal {M}})\). We prove this completeness result in Theorem 4 for safety specifications and extend it to \(\omega \)-regular and LTL specifications in Section 4, by reduction to safety.

Second, we prove that the worst-case execution time of SynthLearn is 2ExpTime (Theorem 7), and this is worst-case optimal as the plain LTL synthesis problem (when \(E=\emptyset \)) is already known to be 2ExpTime-Complete [27]. SynthLearn first generalizes the examples provided by the user while maintaining realizability of \(\varphi _{\textsf{CORE}}\). This generalization leads to a Mealy machine with possibly missing transitions (called a preMealy machine). Then, this preMealy machine is extended into a (full) Mealy machine that realizes \(\varphi _{\textsf{CORE}}\) against all behaviors of the environment. During the completion phase, SynthLearn reuses as much as possible decisions that have been generalized from the examples. The generalization phase is essential to get the most out of the examples. Running classical synthesis algorithms on \(\varphi _{\textsf{CORE}} \wedge \varphi _E\), where \(\varphi _E\) is an LTL encoding of E, often leads to more complex machines that fail to generalize the decisions taken along the examples in E. While the overall complexity of SynthLearn is 2ExpTime and optimal, we show that it is only polynomial in the size of E and in a well-chosen symbolic representation of a set of Mealy machines that realize \(\varphi _{\textsf{CORE}}\), see Theorem 6. This symbolic representation takes the form of an antichain of functions and tends to be compact in practice [19]. It is computed by default when Acacia-Bonzai is solving the plain LTL synthesis problem of \(\varphi _{\textsf{CORE}}\). So, generalizing examples while maintaining realizability only comes at a marginal polynomial cost. We have implemented our synthesis algorithm in a prototype, which uses Acacia-Bonzai to compute the symbolic antichain representation. We report on the results we obtain on several examples.

Related works Scenarios of executions have been advocated by researchers in requirements engineering to elicit specifications, see e.g. [12, 14] and references therein. In [28], learning techniques are used to transform examples into LTL formulas that generalize them. Those methods are complementary to our work, as they can be used to obtain the high level specification \(\varphi _{\textsf{CORE}}\).

In non-vacuous synthesis [8], examples are added automatically to an LTL specification in order to force the synthesis procedure to generate solutions that are non-vacuous in the sense of [23]. The examples are generated directly from the syntax of the LTL specification and they cannot be proposed by the user. This makes the two approaches orthogonal and complementary. Indeed, we could use the examples generated automatically by the non-vacuous approach and ask the user to validate them as desirable or not. Our method is semi-automatic and user centric: the user can provide any example they like, which offers more flexibility to drive the synthesis procedure to solutions that the user deems interesting. Furthermore, our synthesis procedure is based on learning algorithms, while the algorithm in [8] is based on constraint solving and does not offer guarantees of generalization, unlike our algorithm (see Thm 4).

Supplementing the formal specification with additional user-provided information is at the core of the syntax-guided synthesis framework (SyGuS [3]), implemented for instance in program by sketching [31]: in SyGuS, the specification is a logical formula and candidate programs are syntactically restricted by a user-provided grammar, to limit and guide the search. The search is done by using counter-example guided inductive synthesis techniques (CEGIS) which rely on learning [32]. In contrast to our approach, examples are not user-provided but automatically generated by model-checking the candidate programs against the specification. The techniques are also orthogonal to ours: SyGuS targets programs syntactically defined by expressions over a decidable background theory, and heavily relies on SAT/SMT solvers. Using examples to synthesise programs (programming by example) has been for instance explored in the context of string processing programs for spreadsheets, based on learning [30], and is a current trend in AI (see for example [26] and the citations therein). However this approach only relies on examples and not on logical specifications.

[4] explores the use of formal specifications and scenarios to synthesize distributed protocols. Their approach also follows two phases: first, an incomplete machine is built from the scenarios and second, it is turned into a complete one. But there are two important differences with our work. First, their first phase does not rely on learning techniques and does not try to generalize the provided examples. Second, in their setting, all actions are controllable and there is no adversarial environment, so they are solving a satisfiability problem and not a realizability problem as in our case. Their problem is thus computationally less demanding than the problem we solve: Pspace versus 2ExpTime for LTL specs.

The synthesis problem targeted in this paper extends the LTL synthesis problem. Modern solutions for this problem use automata constructions that avoid Safra’s construction as first proposed in [24], and simplified in [18, 29], and more recently in [16]. Efficient implementations of Safraless constructions are available, see e.g. [9, 15, 17, 25]. Several previous works have proposed alternative approaches to improve on the quality of solutions that synthesis algorithms can offer. A popular research direction, orthogonal and complementary to the one proposed here, is to extend the formal specification with quantitative aspects, see e.g. [2, 6, 10, 22], and only synthesize solutions that are optimal.

The first phase of our algorithm is inspired by automata learning techniques based on state merging algorithms like RPNI [20, 21]. Those learning algorithms need to be modified carefully to generate partial solutions that preserve realizability of \(\varphi _{\textsf{CORE}}\). Proving completeness as well as termination of the completion phase in this context requires particular care.

2 Preliminaries on the reactive synthesis problem

Words, languages and automata An alphabet is a finite set of symbols. A word u (resp. \(\omega \)-word) over an alphabet \(\varSigma \) is a finite (resp. infinite) sequence of symbols from \(\varSigma \). We write \(\epsilon \) for the empty word, and denote by \(|u|\in \mathbb {N}\cup \{\infty \}\) the length of u. In particular, \(|\epsilon |=0\). For \(1\le i\le j\le |u|\), we let u[i : j] be the infix of u from position i to position j, both included, and write u[i] instead of u[i : i]. The set of finite (resp. \(\omega \)-) words over \(\varSigma \) is denoted by \(\varSigma ^*\) (resp. \(\varSigma ^\omega \)). We let \(\varSigma ^\infty =\varSigma ^*\cup \varSigma ^\omega \). Given two words \(u\in \varSigma ^*\) and \(v\in \varSigma ^\infty \), u is a prefix of v, written \(u\preceq v\), if \(v = uw\) for some \(w\in \varSigma ^\infty \). The set of prefixes of v is denoted by \(\textsf{Prefs}(v)\). Finite words are linearly ordered according to the length-lexicographic order \(\preceq _{ll}\), assuming a linear order \(<_{\varSigma }\) over \(\varSigma \): \(u\preceq _{ll} v\) if \(|u|<|v|\) or \(|u|=|v|\) and \(u=p\sigma _1u'\), \(v=p\sigma _2 v'\) for some \(p,u',v'\in \varSigma ^*\) and some \(\sigma _1<_{\varSigma } \sigma _2\). In this paper, whenever we refer to the order \(\preceq _{ll}\) for words over some alphabet, we implicitly assume the existence of an arbitrary linear order over that alphabet. A language (resp. \(\omega \)-language) over an alphabet \(\varSigma \) is a subset \(L\subseteq \varSigma ^*\) (resp. \(L\subseteq \varSigma ^\omega \)).
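As a small illustration of these definitions, the following Python sketch implements the prefix set of a finite word and the length-lexicographic order. It assumes that the linear order \(<_{\varSigma }\) is Python's built-in ordering on the symbols; the function names `prefixes`, `ll_key` and `ll_less` are ours.

```python
def prefixes(word):
    """All prefixes of a finite word, from the empty word to the word itself."""
    return [word[:k] for k in range(len(word) + 1)]

def ll_key(word):
    """Sort key for the length-lexicographic order: length first, then symbols."""
    return (len(word), tuple(word))

def ll_less(u, v):
    """Strict length-lexicographic comparison of two finite words."""
    return ll_key(u) < ll_key(v)

print(sorted(["ba", "b", "", "ab", "a"], key=ll_key))
```

Shorter words always come first, and words of equal length are compared symbol by symbol at their first difference.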

In this paper, we fix two alphabets \({\mathcal I}\) and \({\mathcal O}\) whose elements are called inputs and outputs respectively. Given a word \(u\in ({\mathcal I}{\mathcal O})^\infty \), we let \(\textsf{in}(u) \in {\mathcal I}^\infty \) be the word obtained by erasing all \({\mathcal O}\)-symbols from u. We define \(\textsf{out}(u)\) similarly and naturally extend both functions to languages.

Automata over \(\omega \)-words A parity automaton is a tuple \({\mathcal {A}}=(Q,Q_{\textsf{init}},\varSigma ,\delta ,d)\) where Q is a finite non empty set of states, \(Q_{\textsf{init}} \subseteq Q\) is a set of initial states, \(\varSigma \) is a finite non empty alphabet, \(\delta : Q \times \varSigma \rightarrow 2^Q \setminus \{\emptyset \}\) is the transition function, and \(d : Q \rightarrow \mathbb {N}\) is a parity function. The automaton \({\mathcal {A}}\) is deterministic when \(|Q_{\textsf{init}}|=1\) and \(|\delta (q,\sigma )|=1\) for all \(q\in Q\) and \(\sigma \in \varSigma \). The transition function is extended naturally into a function \(\textsf {Post}^* : Q\times \varSigma ^*\rightarrow 2^Q \setminus \{\emptyset \}\) inductively as follows: \(\textsf {Post}^*(q,\epsilon )=\{q\}\) for all \(q\in Q\) and for all \((u,\sigma )\in \varSigma ^*\times \varSigma \), \(\textsf {Post}^*(q,u\sigma ) = \bigcup _{q'\in \textsf {Post}^*(q,u)}\delta (q',\sigma )\).

A run of \({\mathcal {A}}\) on an \(\omega \)-word \(w=w_0 w_1 \dots \) is an infinite sequence of states \(r=q_0 q_1 \dots \) such that \(q_0 \in Q_{\textsf{init}}\), and for all \(i \in \mathbb {N}\), \(q_{i+1} \in \delta (q_i,w_i)\). The run r is said to be accepting if the minimal colour it visits infinitely often is even, i.e. \(\liminf (d(q_i))_{i\ge 0}\) is even. We say that \({\mathcal {A}}\) is a Büchi automaton when \(\textsf{dom}(d)=\{0,1\}\) (0-coloured states are called accepting states), a co-Büchi automaton when \(\textsf{dom}(d)=\{1,2\}\), a safety automaton if it is a Büchi automaton such that the set of 1-coloured states, called unsafe states and denoted \(Q_\textsf{usf}\), forms a trap: for all \(q\in Q_\textsf{usf}\), for all \(\sigma \in \varSigma \), \(\delta (q,\sigma )\subseteq Q_\textsf{usf}\), and a reachability automaton if it is \(\{0,1\}\)-coloured and the set of 0-coloured states forms a trap.

Finally, we consider the existential and universal interpretations of nondeterminism: under the existential (resp. universal) interpretation, a word \(w \in \varSigma ^{\omega }\) is in the language of \({\mathcal {A}}\), if there exists a run r on w such that r is accepting (resp. for all runs r on w, r is accepting). We denote the two languages defined by these two interpretations \(L^{\exists }({\mathcal {A}})\) and \(L^{\forall }({\mathcal {A}})\) respectively. Note that if \({\mathcal {A}}\) is deterministic, then the existential and universal interpretations agree, and we write \(L({\mathcal {A}})\) for \(L^\forall ({\mathcal {A}}) = L^\exists ({\mathcal {A}})\). For a deterministic automaton \({\mathcal {A}}\), the initial state is fixed to the singleton \(\{q\}\).

For a co-Büchi automaton, we also define a strengthening of the acceptance condition, called K-co-Büchi, which requires, for \(K \in \mathbb {N}\), that a run visits states labelled with 1 at most K times to be accepting. Formally, a run \(r=q_0 q_1 \dots q_n \dots \) is accepting for the K-co-Büchi acceptance condition if \(|\{ i \ge 0 \mid d(q_i)=1 \}| \le K\). The language defined by \({\mathcal {A}}\) for the K-co-Büchi acceptance condition and universal interpretation is denoted by \(L^{\forall }_K({\mathcal {A}})\). Note that this language is a safety language because if a prefix \(p \in \varSigma ^*\) of a word is such that \({\mathcal {A}}\) has a run prefix on p that visits states labelled with colour 1 more than K times, then all possible extensions \(w \in \varSigma ^{\omega }\) of p are rejected by \({\mathcal {A}}\).
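The safety nature of the K-co-Büchi condition can be illustrated with a short Python sketch that, for a finite prefix, computes the largest number of 1-coloured states visited by any run prefix. The encoding is an assumption of ours: `delta` maps a pair (state, symbol) to a non-empty set of successors, `colour` maps states to 1 or 2, and the function names are hypothetical.

```python
def max_visits(delta, inits, colour, word):
    """Largest number of 1-coloured states seen by a run prefix on `word`."""
    # counts[q]: best count among run prefixes on the input read so far ending in q
    counts = {q: int(colour[q] == 1) for q in inits}
    for sym in word:
        nxt = {}
        for q, c in counts.items():
            for q2 in delta[(q, sym)]:
                c2 = c + int(colour[q2] == 1)
                nxt[q2] = max(nxt.get(q2, 0), c2)
        counts = nxt
    return max(counts.values())

def prefix_is_safe(delta, inits, colour, word, k):
    """True if no run prefix on `word` visits 1-coloured states more than k times."""
    return max_visits(delta, inits, colour, word) <= k

# Toy universal co-Buchi automaton over {x, y}: state "b" has colour 1.
DELTA = {
    ("a", "x"): {"a"}, ("a", "y"): {"a", "b"},
    ("b", "x"): {"a"}, ("b", "y"): {"b"},
}
COLOUR = {"a": 2, "b": 1}
print(max_visits(DELTA, {"a"}, COLOUR, "yy"))
```

Because every run prefix can be extended (the transition function never returns the empty set), the computed maximum can only grow along extensions of the word, so a prefix that exceeds the bound K can never be repaired.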

(Pre)Mealy machines Given a (partial) function f from a set X to a set Y, we denote by \(\textsf {dom}(f)\) its domain, i.e. the set of elements \(x\in X\) such that f(x) is defined. A preMealy machine \({\mathcal {M}}\) on an input alphabet \({\mathcal I}\) and output alphabet \({\mathcal O}\) is a triple \((M,m_{\textsf{init}},\varDelta )\) such that M is a non-empty set of states, \(m_{\textsf{init}} \in M\) is the initial state, \(\varDelta : M \times {\mathcal I}\rightarrow {\mathcal O}\times M\) is a partial function. A pair \((m,\textsf{i})\) is a hole in \({\mathcal {M}}\) if \((m,\textsf{i}) \not \in \textsf{dom}(\varDelta )\). A Mealy machine is a preMealy machine such that \(\varDelta \) is total, i.e., \(\textsf{dom}(\varDelta )=M \times {\mathcal I}\).

We define two semantics of a preMealy machine \({\mathcal {M}}= (M,m_{\textsf{init}},\varDelta )\) in terms of the languages of finite and infinite words over \({\mathcal I}\cup {\mathcal O}\) it defines. First, we define two (possibly partial) functions \(\textsf {Post}_{\mathcal {M}}: M\times {\mathcal I}\rightarrow M\) and \(\textsf {Out}_{\mathcal {M}}: M\times {\mathcal I}\rightarrow {\mathcal O}\) such that \(\varDelta (m,\textsf{i}) = (\textsf {Out}_{\mathcal {M}}(m,\textsf{i}),\textsf {Post}_{\mathcal {M}}(m,\textsf{i}))\) for all \((m,\textsf{i})\in M\times {\mathcal I}\) if \(\varDelta (m,\textsf{i})\) is defined. We naturally extend these two functions to any sequence of inputs \(u\in {\mathcal I}^+\), denoted \(\textsf {Post}_{\mathcal {M}}^*\) and \(\textsf {Out}_{\mathcal {M}}^*\). In particular, for \(u\in {\mathcal I}^+\), \(\textsf {Post}_{\mathcal {M}}^*(m,u)\) is the state reached by \({\mathcal {M}}\) when reading u from m, while \(\textsf {Out}_{\mathcal {M}}^*(m,u)\) is the last output in \({\mathcal O}\) produced by \({\mathcal {M}}\) when reading u. The subscript \({\mathcal {M}}\) is omitted when \({\mathcal {M}}\) is clear from the context. Now, the language \(L({\mathcal {M}})\) of finite words in \(({\mathcal I}{\mathcal O})^*\) accepted by \({\mathcal {M}}\) is defined as \(L({\mathcal {M}})=\{ \textsf{i}_1\textsf{o}_1\dots \textsf{i}_n\textsf{o}_n\mid \forall 1\le j\le n,\ \textsf {Post}_{\mathcal {M}}^*(m_\textsf{init},\textsf{i}_1\dots \textsf{i}_j) \text { is defined and } \textsf{o}_j = \textsf {Out}_{\mathcal {M}}^*(m_\textsf{init},\textsf{i}_1\dots \textsf{i}_j)\}\). The language \(L_\omega ({\mathcal {M}})\) of infinite words accepted by \({\mathcal {M}}\) is the topological closure of \(L({\mathcal {M}})\): \(L_\omega ({\mathcal {M}}) = \{ w\in ({\mathcal I}{\mathcal O})^\omega \mid \textsf{Prefs}(w)\cap ({\mathcal I}{\mathcal O})^*\subseteq L({\mathcal {M}})\}\).
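The definitions of holes, \(\textsf {Post}^*\), \(\textsf {Out}^*\) and \(L({\mathcal {M}})\) can be sketched in a few lines of Python. This is only an illustrative encoding under our own assumptions: states and letters are strings, the partial function \(\varDelta \) is a dictionary from (state, input) to (output, state), and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PreMealy:
    init: str
    delta: dict = field(default_factory=dict)  # (state, input) -> (output, state)

    def states(self):
        """States mentioned by the machine (initial state and transition ends)."""
        found = {self.init}
        for (m, _), (_, m2) in self.delta.items():
            found.update((m, m2))
        return found

    def holes(self, inputs):
        """Pairs (state, input) on which delta is undefined."""
        return sorted((m, i) for m in self.states() for i in inputs
                      if (m, i) not in self.delta)

    def run(self, input_word):
        """(Post*, Out*) from the initial state on a non-empty input word,
        or None if some transition is missing."""
        state, out = self.init, None
        for i in input_word:
            if (state, i) not in self.delta:
                return None
            out, state = self.delta[(state, i)]
        return state, out

    def accepts(self, word):
        """Membership of a finite word, given as (input, output) pairs, in L(M)."""
        state = self.init
        for i, o in word:
            if (state, i) not in self.delta:
                return False
            out, state = self.delta[(state, i)]
            if out != o:
                return False
        return True

machine = PreMealy("m0", {("m0", "a"): ("x", "m1"),
                          ("m1", "a"): ("y", "m0"),
                          ("m1", "b"): ("x", "m1")})
print(machine.holes(["a", "b"]))
```

In this toy machine the only hole is the pair made of the initial state and input `b`, so the machine is a preMealy machine but not a Mealy machine.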

The reactive synthesis problem A specification is a language \({\mathcal {S}}\subseteq ({\mathcal I}{\mathcal O})^\omega \). The reactive synthesis problem (or just synthesis problem for short) is the problem of constructing, given a specification \({\mathcal {S}}\), a Mealy machine \({\mathcal {M}}\) such that \(L_\omega ({\mathcal {M}})\subseteq {\mathcal {S}}\) if it exists. Such a machine \({\mathcal {M}}\) is said to realize the specification \({\mathcal {S}}\), also written \({\mathcal {M}}\models {\mathcal {S}}\). We also say that \({\mathcal {S}}\) is realizable if some Mealy machine \({\mathcal {M}}\) realizes it. The induced decision problem is called the realizability problem.

It is well-known that if \({\mathcal {S}}\) is \(\omega \)-regular (recognizable by, e.g., a parity automaton [33]) the realizability problem is decidable [1] and moreover, a Mealy machine realizing the specification can be effectively constructed. The realizability problem is 2ExpTime-Complete if \({\mathcal {S}}\) is given as an LTL formula [27] and ExpTime-Complete if \({\mathcal {S}}\) is given as a universal co-Büchi automaton.

Theorem 1

([7]). The realizability problem for a specification \({\mathcal {S}}\) given as a universal co-Büchi automaton \({\mathcal {A}}\) is ExpTime-Complete. Moreover, if \({\mathcal {S}}\) is realizable and \({\mathcal {A}}\) has n states, then \({\mathcal {S}}\) is realizable by a Mealy machine with \(2^{O(n \log _2 n)}\) states.

We generalize this result to the following realizability problem which we describe first informally. Given a specification \({\mathcal {S}}\) and a preMealy machine \({\mathcal {P}}\), the goal is to decide whether \({\mathcal {P}}\) can be completed into a Mealy machine which realizes \({\mathcal {S}}\). We now define this problem formally. Given two preMealy machines \({\mathcal {P}}_1,{\mathcal {P}}_2\), we write \({\mathcal {P}}_1\preceq {\mathcal {P}}_2\) if \({\mathcal {P}}_1\) is a subgraph of \({\mathcal {P}}_2\) in the following sense: there exists an injective mapping \(\varPhi \) from the states of \({\mathcal {P}}_1\) to the states of \({\mathcal {P}}_2\) which preserves the initial state (\(s_0\) is the initial state of \({\mathcal {P}}_1\) iff \(\varPhi (s_0)\) is the initial state of \({\mathcal {P}}_2\)) and the transitions (\(\varDelta _{{\mathcal {P}}_1}(p,\textsf{i})=(\textsf{o},q)\) iff \(\varDelta _{{\mathcal {P}}_2}(\varPhi (p),\textsf{i})=(\textsf{o},\varPhi (q))\)). As a consequence, \(L({\mathcal {P}}_1)\subseteq L({\mathcal {P}}_2)\) and \(L_\omega ({\mathcal {P}}_1)\subseteq L_\omega ({\mathcal {P}}_2)\). Given a preMealy machine \({\mathcal {P}}\), we say that a specification \(\mathcal {S}\) is \({\mathcal {P}}\)-realizable if there exists a Mealy machine \({\mathcal {M}}\) such that \({\mathcal {P}}\preceq {\mathcal {M}}\) and \({\mathcal {M}}\) realizes \({\mathcal {S}}\). Note that if \({\mathcal {P}}\) is a (complete) Mealy machine, \({\mathcal {S}}\) is \({\mathcal {P}}\)-realizable iff \({\mathcal {P}}\) realizes \({\mathcal {S}}\). The next result is proved in [5]:

Theorem 2

Given a universal co-Büchi automaton \({\mathcal {A}}\) with n states defining a specification \({\mathcal {S}}= L^\forall ({\mathcal {A}})\) and a preMealy machine \({\mathcal {P}}\) with m states and \(n_h\) holes, deciding whether \({\mathcal {S}}\) is \({\mathcal {P}}\)-realizable is ExpTime-hard and in ExpTime (exponential in n and polynomial in m). Moreover, if \({\mathcal {S}}\) is \({\mathcal {P}}\)-realizable, it is \({\mathcal {P}}\)-realizable by a Mealy machine with \(m+n_h2^{O(n \log _2 n)}\) states. Hardness holds even if \({\mathcal {P}}\) has two states and \({\mathcal {A}}\) is a deterministic reachability automaton.

3 Synthesis from safety specifications and examples

In this section, we present the learning framework we use to synthesise Mealy machines from examples and safety specifications. Its generalization to any \(\omega \)-regular specification is described in Sec. 4 and solved by reduction to safety specifications. It is a two-phase algorithm: (1) it generalizes the examples while maintaining realizability of the specification, and outputs a preMealy machine, (2) it completes the preMealy machine into a full Mealy machine.

Phase 1: Generalizing the examples This phase exploits the examples by generalizing them as much as possible while maintaining realizability of the specification. It outputs a preMealy machine which is consistent with the examples and realizes the specification, if it exists. It is an RPNI-like learning algorithm [20, 21] which includes specific tests to maintain realizability of the specification. In particular, it first builds a tree-shaped preMealy machine whose accepted language is exactly the set of prefixes \(\textsf{Prefs}(E)\) of the given set of examples E, called a prefix-tree acceptor (PTA). Then, it tries to merge as many states of the PTA as possible. The strategy used to select the state with which a given state is merged is a parameter of the algorithm, called a merging strategy \(\sigma _G\). Formally, a merging strategy \(\sigma _G\) is defined over 4-tuples \(({\mathcal {M}},m,E,X)\) where \({\mathcal {M}}\) is a preMealy machine, m is a state of \({\mathcal {M}}\), E is a set of examples and X is a subset of states of \({\mathcal {M}}\) (the candidate states to merge m with), and returns a state of X, i.e., \(\sigma _G({\mathcal {M}},m,E,X)\in X\).
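The construction of the prefix-tree acceptor can be sketched as follows in Python. The encoding is our own: an example is a list of (input, output) pairs, a state of the PTA is the prefix that leads to it (a tuple of such pairs), and the function name `build_pta` is hypothetical. We also assume here that a set of examples is inconsistent when two examples prescribe different outputs after the same history and input, which is the situation in which no tree-shaped preMealy machine exists.

```python
def build_pta(examples):
    """Build the prefix-tree acceptor of a set of examples.

    Returns (init, delta) where delta maps (state, input) to (output, state).
    Raises ValueError if two examples disagree on an output.
    """
    init = ()
    delta = {}
    for example in examples:
        state = init
        for i, o in example:
            target = state + ((i, o),)
            known = delta.get((state, i))
            if known is not None and known != (o, target):
                raise ValueError(f"conflicting outputs after {state} on input {i}")
            delta[(state, i)] = (o, target)
            state = target
    return init, delta

E = [[("a", "x"), ("b", "y")],
     [("a", "x"), ("a", "y")]]
init, delta = build_pta(E)
print(len(delta))
```

The two examples share their first step, so the resulting tree has three transitions and four states, one per prefix in \(\textsf{Prefs}(E)\).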

The pseudo-code is given by Algo 1. Initially, it tests whether the set of examples E is consistent and, if so, checks if \(\textsf {PTA}(E)\) can be completed into a Mealy machine realizing the given specification \({\mathcal {S}}\), thanks to Thm. 2. If that is the case, then it takes all prefixes of E as the set of examples, and enters a loop which consists in repeatedly coarsening some congruence \(\sim \) over the states of \(\textsf {PTA}(E)\), by merging some of its classes. The congruence \(\sim \) is initially the finest equivalence relation. It does the coarsening in a specific order: examples (which are states of \(\textsf {PTA}(E)\)) are taken in length-lexicographic order. When entering the loop with example e, the algorithm computes at line 4 all the states, i.e., all the examples \(e'\) which have been processed already by the loop (\(e'\prec _{ll} e\)) and whose current class can be merged with the class of e (predicate \(\textsf {Mergeable}(\textsf {PTA}(E),\sim ,e,e')\)). State merging is a standard operation in automata learning algorithms; mergeability intuitively means that merging the \(\sim \)-class of e and the \(\sim \)-class of \(e'\), and propagating this merge to the descendants of e and \(e'\), does not result in any conflict. The formal definition is in [5]. At line 5, it filters the previous set by keeping only the states which, when merged with e, produce a preMealy machine which can be completed into a Mealy machine realizing \({\mathcal {S}}\) (again by Thm. 2). If after the filtering there are still several candidates for merge, one of them is selected with the merging strategy \(\sigma _G\) and the equivalence relation is then coarsened via class merging (operation \(\textsf {MergeClass}(\textsf {PTA}(E),\sim ,e,e')\)). At the end, the algorithm returns the quotient of \(\textsf {PTA}(E)\) by the computed Mealy-congruence. As a side remark, when \({\mathcal {S}}\) is universal, i.e. \({\mathcal {S}}= ({\mathcal I}{\mathcal O})^\omega \), then it is realizable by any Mealy machine and therefore line 5 does not filter any of the candidates for merge. So, when \({\mathcal {S}}\) is universal, Algo 1 can be seen as an RPNI variant for learning preMealy machines.
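The generalization loop can be sketched in Python as follows. This is a simplified illustration under our own assumptions and not the algorithm of [5]: the realizability test of Thm. 2 is abstracted as a caller-supplied predicate `realizable` (by default always true, which corresponds to the universal specification), the merging strategy is reduced to a function `pick` choosing among candidate states, and mergeability is approximated by the usual fold that fails on an output conflict. All function names are hypothetical.

```python
def find(parent, s):
    """Representative of the class of state s."""
    while parent[s] != s:
        s = parent[s]
    return s

def try_merge(parent, trans, a, b):
    """Merge the classes of a and b and fold their descendants.

    Returns updated copies (parent, trans), or None on an output conflict.
    """
    parent = dict(parent)
    trans = {s: dict(edges) for s, edges in trans.items()}
    todo = [(a, b)]
    while todo:
        x, y = todo.pop()
        x, y = find(parent, x), find(parent, y)
        if x == y:
            continue
        parent[y] = x  # the class of y is absorbed by the class of x
        for i, (o, target) in trans.pop(y).items():
            if i in trans[x]:
                o_x, target_x = trans[x][i]
                if o != o_x:
                    return None  # same input, different outputs
                todo.append((target_x, target))  # propagate to descendants
            else:
                trans[x][i] = (o, target)
    return parent, trans

def generalize(states, trans, realizable=lambda parent, trans: True, pick=min):
    """`states` must be listed in length-lexicographic order."""
    parent = {s: s for s in states}
    trans = {s: dict(trans.get(s, {})) for s in states}
    processed = []
    for e in states:
        options = {}
        for e2 in processed:
            merged = try_merge(parent, trans, e2, e)
            if merged is not None and realizable(*merged):
                options[e2] = merged
        if options:
            parent, trans = options[pick(options)]
        processed.append(e)
    return parent, trans

def quotient(parent, trans):
    """Transitions of the quotient machine, over class representatives."""
    return {(rep, i): (o, find(parent, target))
            for rep, edges in trans.items()
            for i, (o, target) in edges.items()}

# PTA of the single example (a,x)(a,x)(a,x): a chain s0 -> s1 -> s2 -> s3.
CHAIN = {"s0": {"a": ("x", "s1")}, "s1": {"a": ("x", "s2")}, "s2": {"a": ("x", "s3")}}
print(quotient(*generalize(["s0", "s1", "s2", "s3"], CHAIN)))
```

On this chain, all states collapse into a single state with a self-loop, which is the expected generalization of "always answer x on a". Passing a predicate that rejects every merge leaves the PTA unchanged.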

Algorithm 1. Generalization phase Gen\((E,{\mathcal {S}},\sigma _G)\).

Phase 2: completion of preMealy machines into Mealy machines As it only constructs the PTA and tries to merge its states, the generalization phase might not return a (complete) Mealy machine. In other words, the machine it returns might still contain some holes (missing transitions). The objective of this second phase is to fill those holes so as to obtain a Mealy machine that realizes the specification. More precisely, when a transition is not defined from some state m and some input \(\textsf{i}\in {\mathcal I}\), the algorithm must select an output symbol \(\textsf{o}\in {\mathcal O}\) and a state \(m'\) to transition to, which can be either an existing state or a new state to be created (in that case, we write \(m' = \textsf {fresh}\) to denote the fact that \(m'\) is a fresh state). In our implementation, if it is possible to reuse a state \(m'\) that was created during the generalization phase, it is favoured over other states, in order to exploit the examples. However, the algorithm for the completion phase we describe now does not depend on any particular strategy to pick states. Therefore, it is parameterized by a completion strategy \(\sigma _C\), defined over all 4-tuples \(({\mathcal {M}}, m, \textsf{i}, X)\) where \({\mathcal {M}}\) is a preMealy machine with set of states M, \((m,\textsf{i})\) is a hole of \({\mathcal {M}}\), and \(X\subseteq {\mathcal O}\times (M\cup \{\textsf {fresh}\})\) is a list of candidate pairs \((\textsf{o},m')\). It returns an element of X, i.e., \(\sigma _C({\mathcal {M}},m,\textsf{i},X)\in X\).

In addition to \(\sigma _C\), the completion algorithm takes as input a preMealy machine \({\mathcal {M}}_0\) and a specification \({\mathcal {S}}\), and outputs a Mealy machine which \({\mathcal {M}}_0\)-realizes \({\mathcal {S}}\), if it exists. The pseudo-code is given in Algo 2. Initially, it tests whether \({\mathcal {S}}\) is \({\mathcal {M}}_0\)-realizable; if not, it returns UNREAL. Otherwise, it keeps on completing holes of \({\mathcal {M}}_0\). The list of output/state candidates is computed by the loop at line 5. Note that the for-loop iterates over \(M\cup \{\textsf {fresh}()\}\), where \(\textsf {fresh}()\) is a procedure that returns a fresh state not in M. The algorithm maintains the invariant that at any iteration of the while-loop, \({\mathcal {S}}\) is \({\mathcal {M}}\)-realizable, thanks to the test at line 7, based on Thm. 2. Therefore, the list of candidates is necessarily non-empty. Amongst those candidates, a single one is selected and the transition on \((m,\textsf{i})\) is added to \({\mathcal {M}}\) accordingly at line 10.

Algo 2. Pseudo-code of the completion phase Comp\(({\mathcal {M}}_0,{\mathcal {S}},\sigma _C)\) (figure).
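The completion loop described above can be sketched in Python as follows. This is a minimal sketch, not the pseudo-code of Algo 2: the realizability test of Thm. 2 is abstracted as a hypothetical oracle `is_realizable`, the names `complete`, `lazy_strategy` and `FRESH` are illustrative, and fresh states are named `("fresh", n)` on the assumption that no existing state already carries such a name.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

State = Hashable
FRESH = "fresh"   # marker meaning "create a new state" (assumed unused as a state name)
Delta = Dict[Tuple[State, Hashable], Tuple[Hashable, State]]


def lazy_strategy(delta, hole_state, hole_input, candidates):
    """A lazy strategy: pick an existing state whenever one is available."""
    existing = [c for c in candidates if c[1] != FRESH]
    return (existing or candidates)[0]


def complete(delta: Delta, states: Set[State], inputs, outputs,
             is_realizable: Callable[[Delta], bool], strategy=lazy_strategy):
    """Fill every hole of the preMealy machine `delta`.

    `is_realizable` is a hypothetical oracle standing for the
    M-realizability test; it is expected to keep the invariant that at
    least one candidate survives for every hole."""
    delta, states = dict(delta), set(states)
    if not is_realizable(delta):
        return None                                    # UNREAL
    while True:
        holes = [(m, i) for m in sorted(states, key=repr) for i in inputs
                 if (m, i) not in delta]
        if not holes:
            return delta, states                       # no hole left
        m, i = holes[0]
        new_state = ("fresh", len(states))             # name of a would-be new state
        candidates = []
        for target in sorted(states, key=repr) + [FRESH]:
            for o in outputs:
                real_target = new_state if target == FRESH else target
                trial = {**delta, (m, i): (o, real_target)}
                if is_realizable(trial):               # keep only safe choices
                    candidates.append((o, target))
        o, target = strategy(delta, m, i, candidates)
        if target == FRESH:
            target = new_state
            states.add(new_state)
        delta[(m, i)] = (o, target)
```

With an oracle that always answers yes (a universal specification), the lazy strategy never creates a state, so the loop stops as soon as every pair (state, input) has a transition.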

Two-phase synthesis algorithm from specifications and examples. The two-phase synthesis algorithm for safety specifications and examples, called SynthSafe\((E, {\mathcal {S}}, \sigma _G,\sigma _C)\), works as follows: it takes as input a set of examples E, a specification \({\mathcal {S}}\) given as a deterministic safety automaton, and generalizing and completion strategies \(\sigma _G\) and \(\sigma _C\) respectively. It returns a Mealy machine \({\mathcal {M}}\) which realizes \({\mathcal {S}}\) and such that \(E\subseteq L({\mathcal {M}})\), if it exists. In a first step, it calls Gen\((E,{\mathcal {S}},\sigma _G)\). If this call returns UNREAL, then SynthSafe returns UNREAL as well. Otherwise, the call to Gen returns a preMealy machine \({\mathcal {M}}_0\). In a second step, SynthSafe calls Comp\(({\mathcal {M}}_0,{\mathcal {S}},\sigma _C)\). If this call returns UNREAL, so does SynthSafe; otherwise SynthSafe returns the Mealy machine computed by Comp. The pseudo-code of SynthSafe can be found in [5].

The completion procedure may not terminate for some completion strategies, because the completion strategy could, for instance, keep on selecting pairs of the form \((\textsf{o},m')\) where \(m'\) is a fresh state. However, we prove that it always terminates for lazy completion strategies. A completion strategy \(\sigma _C\) is said to be lazy if it favours existing states, which formally means that if \(X\setminus ({\mathcal O}\times \{\textsf {fresh}\})\ne \varnothing \), then \(\sigma _C({\mathcal {M}},m,\textsf{i},X)\not \in {\mathcal O}\times \{\textsf {fresh}\}\). The next theorem states correctness and termination of the algorithm for lazy completion strategies (assuming the functions \(\sigma _G\) and \(\sigma _C\) are computable in worst-case exponential time in the size of their inputs).

Theorem 3

(termination and correctness). For all finite sets of examples \(E\subseteq ({\mathcal I}.{\mathcal O})^*\), all specifications \({\mathcal {S}}\subseteq ({\mathcal I}.{\mathcal O})^\omega \) given as a deterministic safety automaton \({\mathcal {A}}\) with n states, all merging strategies \(\sigma _G\) and all completion strategies \(\sigma _C\), if SynthSafe(\(E,{\mathcal {S}},\sigma _G,\sigma _C\)) terminates, then it returns a Mealy machine \({\mathcal {M}}\) such that \(E\subseteq L({\mathcal {M}})\) and \({\mathcal {M}}\) realizes \({\mathcal {S}}\), if it exists, otherwise it returns UNREAL. Moreover, SynthSafe(\(E,{\mathcal {S}},\sigma _G,\sigma _C\)) terminates if \(\sigma _C\) is lazy, in worst-case exponential time (polynomial in the size (see Footnote 2) of E and exponential in n).

The proof of this theorem follows from several results on the generalization and completion phases, and is given in [5].

A Mealy machine \({\mathcal {T}}\) is minimal if for all Mealy machines \({\mathcal {M}}\) such that \(L({\mathcal {T}}) = L({\mathcal {M}})\), the number of states of \({\mathcal {M}}\) is at least that of \({\mathcal {T}}\). The next result, proved in [5], states that any minimal Mealy machine realizing a specification \({\mathcal {S}}\) can be returned by our synthesis algorithm, provided it is given representative examples.

Theorem 4

(Mealy completeness). For all specifications \({\mathcal {S}}\subseteq ({\mathcal I}.{\mathcal O})^\omega \) given as a deterministic safety automaton, for all minimal Mealy machines \({\mathcal {M}}\) realizing \({\mathcal {S}}\), there exists a finite set of examples \(E\subseteq ({\mathcal I}.{\mathcal O})^*\), of size polynomial in the size of \({\mathcal {M}}\), such that for all generalizing strategies \(\sigma _G\) and completion strategies \(\sigma _C\), and all sets of examples \(E'\) s.t. \(E\subseteq E'\subseteq L({\mathcal {M}})\), SynthSafe(\(E',{\mathcal {S}},\sigma _G,\sigma _C) = {\mathcal {M}}\).

The polynomial upper bound given in the statement of Theorem 4 is more precisely the following: the cardinality of E is \(O(m+n^2)\) where n is the number of states of \({\mathcal {M}}\) while m is its number of transitions. Moreover, each example \(e\in E\) has length \(O(n^2)\). More details can be found in Remark 1 of [5].

4 Synthesis from \(\omega \)-regular specifications and examples

We now consider the case where the specification \({\mathcal {S}}\) is given as a universal coBüchi automaton. We consider this class of specifications as it is complete for \(\omega \)-regular languages and allows for compact symbolic representations. Further in this section, we consider the case of LTL specifications.

Specifications given as universal coBüchi automata. Our solution for \(\omega \)-regular specifications relies on a reduction to the safety case treated in Sec. 3. It builds on previous works that develop so-called Safraless algorithms for \(\omega \)-regular reactive synthesis [18, 24, 29]. The main idea is to strengthen the acceptance condition of the automaton from coBüchi to K-coBüchi, which is a safety condition. This strengthening is complete for the plain synthesis problem (without examples) if K is large enough (in the worst case, exponential in the number of states of the automaton, see e.g. [18]). Moreover, it allows for incremental synthesis algorithms: if the specification defined by the automaton with a k-coBüchi acceptance condition is realizable, for \(k\le K\), so is the specification defined by taking K-coBüchi acceptance. Here, as we also take examples into account, we need to slightly adapt the results. The next theorem is proved in [5] while the next lemma is immediate:

Theorem 5

Given a universal co-Büchi automaton \({\mathcal {A}}\) with n states defining a specification \({\mathcal {S}}= L^{\forall }({\mathcal {A}})\) and a preMealy machine \({\mathcal {P}}\) with m states, we have that \({\mathcal {S}}\) is \({\mathcal {P}}\)-realizable iff \({\mathcal {S}}'=L^{\forall }_K({\mathcal {A}})\) is \({\mathcal {P}}\)-realizable for \(K = nm|{\mathcal I}|2^{\textbf{O}(n\log _2 n)}\).

Lemma 1

For all co-Büchi automata \({\mathcal {A}}\), for all preMealy machines \({\mathcal {P}}\), for all \(k_1 \le k_2\), we have that \(L^\forall _{k_1}({\mathcal {A}}) \subseteq L^\forall _{k_2}({\mathcal {A}})\) and so if \(L^\forall _{k_1}({\mathcal {A}})\) is \({\mathcal {P}}\)-realizable then \(L^\forall _{k_2}({\mathcal {A}})\) is \({\mathcal {P}}\)-realizable. Furthermore, for all \(k \ge 0\), if \({\mathcal {S}}'=L^{\forall }_k({\mathcal {A}})\) is \({\mathcal {P}}\)-realizable then \({\mathcal {S}}= L^{\forall }({\mathcal {A}})\) is \({\mathcal {P}}\)-realizable.

Thanks to the latter two results applied to \({\mathcal {P}}= \textsf {PTA}(E)\) for a set E of examples of size m, we can design an algorithm for synthesising Mealy machines from a specification defined by a universal coBüchi automaton \({\mathcal {A}}\) with n states and E: it calls \(\textsc {SynthSafe}\) on the safety specification \(L^\forall _k({\mathcal {A}})\) and E for increasing values of k, until it concludes positively, or reaches the bound \(K = 2^{\textbf{O}(m n\log _2 mn)}+1\). In the latter case, it returns UNREAL. However, to apply SynthSafe properly, \(L^\forall _k({\mathcal {A}})\) must be represented by a deterministic safety automaton. This is possible as k-coBüchi automata are determinizable [18].
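The incremental scheme just described can be sketched as a small driver loop. This is an illustrative sketch only: the callback `synth_safe_for_k` is hypothetical and stands for a call to SynthSafe on E and a deterministic safety automaton for \(L^\forall _k({\mathcal {A}})\), returning `None` for UNREAL, and increasing k one by one from 0 is an assumption about the schedule.

```python
from typing import Callable, Optional, TypeVar

Machine = TypeVar("Machine")


def synth_incremental(synth_safe_for_k: Callable[[int], Optional[Machine]],
                      max_k: int) -> Optional[Machine]:
    """Try k = 0, 1, ..., max_k and return the first machine found.

    `synth_safe_for_k(k)` is a hypothetical callback that returns a Mealy
    machine for the k-co-Buchi strengthening, or None for UNREAL."""
    for k in range(max_k + 1):
        machine = synth_safe_for_k(k)
        if machine is not None:
            # By Lemma 1, a machine for the k-bounded specification also
            # realizes the original universal co-Buchi specification.
            return machine
    return None  # bound reached without success: report UNREAL
```

Starting from small values of k keeps the determinized automaton small in the cases where a low bound already suffices.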

Determinization. The determinization of k-co-Büchi automata \({\mathcal {A}}\) relies on a simple generalization of the subset construction: in addition to remembering the set of states that can be reached by a prefix of a run while reading an infinite word, the construction counts the maximal number of times a run prefix that reaches a given state q has visited states labelled with color 1 (recall that a run can visit at most k such states to be accepting). The states of the deterministic automaton are so-called counting functions, formally defined for a co-Büchi automaton \({\mathcal {A}}=(Q,q_{\textsf{init}},\varSigma ,\delta ,d)\) and \(k \in \mathbb {N}\), as the set, denoted \(CF({\mathcal {A}},k)\), of functions \(f : Q \rightarrow \{-1,0,1,\dots ,k,k+1\}\). If \(f(q)=-1\) for some state q, it means that q is inactive (no run of \({\mathcal {A}}\) reaches q on the current prefix). The initial counting function \(f_{\textsf{init}}\) maps all 1-colored initial states to 1, all 0-colored initial states to 0 and all other states to \(-1\). We denote by \(\mathcal {D}({\mathcal {A}},k)=(Q^\mathcal {D} = CF({\mathcal {A}},k),q^\mathcal {D}_{\textsf{init}}=f_{\textsf{init}},\varSigma ,\delta ^\mathcal {D},Q^{\mathcal {D}}_\textsf{usf})\) the deterministic automaton obtained by this determinization procedure. It is formally defined in [5]. We can now give algorithm SynthLearn, in pseudo-code, as Algo 3.

Algo 3. Pseudo-code of SynthLearn (figure).
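The counting functions used in the determinization paragraph above can be made concrete with a short Python sketch. The update rule below is the usual one from the Safraless literature (take the maximum over active predecessors and saturate at k+1); since the formal definition is only given in [5], the details here, as well as the names `initial_cf`, `successor` and `is_unsafe`, should be read as assumptions.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

State = Hashable
CountingFunction = Dict[State, int]   # -1 means "inactive"


def initial_cf(states: Iterable[State], initial: Set[State],
               color: Dict[State, int]) -> CountingFunction:
    """Initial states get their own color (0 or 1), all others are inactive."""
    return {q: (color[q] if q in initial else -1) for q in states}


def successor(f: CountingFunction, letter: Hashable,
              delta: Dict[Tuple[State, Hashable], Set[State]],
              color: Dict[State, int], k: int) -> CountingFunction:
    """One step of the determinized automaton on `letter`.

    `delta` maps (state, letter) to the set of successor states of the
    co-Buchi automaton. Counters are saturated at k + 1."""
    g = {q: -1 for q in f}
    for q, count in f.items():
        if count == -1:
            continue                   # inactive states contribute nothing
        for q2 in delta.get((q, letter), ()):
            g[q2] = max(g[q2], min(k + 1, count + color[q2]))
    return g


def is_unsafe(f: CountingFunction, k: int) -> bool:
    """A counting function is unsafe once some run exceeded k visits."""
    return any(count > k for count in f.values())
```

For instance, with a 0-colored initial state p that loops on itself and can move to a 1-colored state q that also loops, two steps are enough to push the counter of q beyond the bound k = 1.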

Complexity considerations and improving the upper-bound. As the automaton \(\mathcal {D}({\mathcal {A}},k)\) is in the worst-case exponential in the size of the automaton \({\mathcal {A}}\), a direct application of Thm. 3 yields a doubly exponential time procedure. This complexity is a consequence of the fact that the \({\mathcal {P}}\)-realizability problem is in ExpTime in the size of the deterministic automaton as shown in Thm. 2, and that the termination of the completion procedure is also worst-case exponential in the size of the deterministic automaton.

We show that we can improve the complexity of each call to SynthSafe and obtain an optimal worst-case (single) exponential complexity. First, we provide an algorithm to check \({\mathcal {P}}\)-realizability of a specification \({\mathcal {S}}=L^{\forall }_k({\mathcal {A}})\) that runs in time singly exponential in the size of \({\mathcal {A}}\) and polynomial in k and the size of \({\mathcal {P}}\). Second, we provide a finer complexity analysis for the termination of the completion algorithm, which exhibits a worst-case exponential time in \(|{\mathcal {A}}|\). Those two improvements lead to an overall complexity of SynthLearn which is exponential in the size of the specification \({\mathcal {A}}\) and polynomial in the size of the set of examples E. This is provably worst-case optimal because for \(E=\emptyset \) the problem is already ExpTime-Complete. We explain next the first improvement; the upper bound for termination is provided in [5].

Checking \({\mathcal {P}}\)-realizability of a specification \({\mathcal {S}}=L^{\forall }_k({\mathcal {A}})\). To obtain a better complexity, we exploit some structure that exists in the deterministic automaton \(\mathcal {D}({\mathcal {A}},k)\). First, the set of counting functions \(CF({\mathcal {A}},k)\) forms a complete lattice for the partial order \(\preceq \) defined by \(f_1\preceq f_2\) if \(f_1(q)\le f_2(q)\) for all states q. We denote by \(f_1\bigsqcup f_2\) the least upper-bound of \(f_1,f_2\), and by \(W_k^{\mathcal {A}}\) the set of counting functions f such that the specification \(L(\mathcal {D}({\mathcal {A}},k)[f])\) is realizable (i.e. the specification defined by \(\mathcal {D}({\mathcal {A}},k)\) with initial state f). It is known that \(W_k^{\mathcal {A}}\) is downward-closed for \(\preceq \) [18], because for all \(f_1\preceq f_2\), any machine realizing \(L(\mathcal {D}({\mathcal {A}},k)[f_2])\) also realizes \(L(\mathcal {D}({\mathcal {A}},k)[f_1])\). Therefore, \(W_k^{\mathcal {A}}\) can be represented compactly by the antichain \(\lceil W^{{\mathcal {A}}}_k \rceil \) of its \(\preceq \)-maximal elements. Now, the first improvement is obtained thanks to the following result:

Lemma 2

Let \({\mathcal {P}}=(M,m_0,\varDelta )\) be a preMealy machine, \({\mathcal {A}}\) a co-Büchi automaton, and \(k \in \mathbb {N}\). For all states \(m\in M\), we let \(F^*(m)=\bigsqcup \{ f \mid \exists u \in ({\mathcal I}{\mathcal O})^* \cdot \textsf {Post}_{{\mathcal {P}}}^*(m_0,u) = m \wedge \textsf {Post}_{\mathcal {D}}(f_0,u)=f \}\). Then, \(L(\mathcal {D}({\mathcal {A}},k))\) is \({\mathcal {P}}\)-realizable iff there does not exist \(m \in M\) such that \(F^*(m) \not \in W^{{\mathcal {A}}}_k\).

It is easily shown that the operator \(F^*\) can be computed in pTime. Thus, the latter lemma implies that there is an algorithm, running in time polynomial in \(|{\mathcal {P}}|\), \(|{\mathcal {A}}|\), k, and the size of \(\lceil W^{{\mathcal {A}}}_k \rceil \), to check the \({\mathcal {P}}\)-realizability of \(L^{\forall }_k({\mathcal {A}})\). Formal details can be found in [5].
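To make the check of Lemma 2 more tangible, here is a hedged Python sketch that computes \(F^*\) by a simple fixpoint iteration and tests membership in \(W^{{\mathcal {A}}}_k\) through its antichain of maximal elements. The names `f_star`, `p_realizable` and the callback `post` are hypothetical; `post(f, i, o)` stands for the successor of f in \(\mathcal {D}({\mathcal {A}},k)\) after reading an input and an output, and the iteration assumes that this successor is monotone, distributes over joins and has bounded counters, so that the loop terminates. Only states of the preMealy machine reachable from the initial state are considered.

```python
from typing import Callable, Dict, Hashable, List, Tuple

CF = Dict[Hashable, int]                        # a counting function
Transition = Tuple[Hashable, Hashable, Hashable, Hashable]  # (m, i, o, m')


def leq(f1: CF, f2: CF) -> bool:
    """Pointwise order on counting functions."""
    return all(f1[q] <= f2[q] for q in f1)


def join(f1: CF, f2: CF) -> CF:
    """Least upper bound: pointwise maximum."""
    return {q: max(f1[q], f2[q]) for q in f1}


def in_downward_closure(f: CF, antichain: List[CF]) -> bool:
    """f belongs to the downward-closed set iff it is below a maximal element."""
    return any(leq(f, g) for g in antichain)


def f_star(transitions: List[Transition], m0: Hashable, f0: CF,
           post: Callable[[CF, Hashable, Hashable], CF]) -> Dict[Hashable, CF]:
    """Join, for each reachable state m, of the counting functions reached
    along all words leading to m (computed as a least fixpoint)."""
    F = {m0: f0}
    changed = True
    while changed:
        changed = False
        for m, i, o, m2 in transitions:
            if m not in F:
                continue                        # m not reached yet
            g = post(F[m], i, o)
            new = join(F[m2], g) if m2 in F else g
            if F.get(m2) != new:
                F[m2] = new
                changed = True
    return F


def p_realizable(transitions, m0, f0, post, antichain) -> bool:
    """Lemma 2: every F*(m) must lie in the winning set."""
    return all(in_downward_closure(f, antichain)
               for f in f_star(transitions, m0, f0, post).values())
```

Representing the winning set by its maximal elements keeps the membership test linear in the size of the antichain rather than in the size of the whole set.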

We end this subsection by summarizing the behavior of our synthesis algorithm for \(\omega \)-regular specifications defined as universal co-Büchi automata.

Theorem 6

Given a universal coBüchi automaton \({\mathcal {A}}\) and a set of examples E, the synthesis algorithm SynthLearn returns, if it exists, a Mealy machine \({\mathcal {M}}\) such that \(E \subseteq L({\mathcal {M}})\) and \(L_{\omega }({\mathcal {M}}) \subseteq L^{\forall }({\mathcal {A}})\), in worst-case exponential time in the size of \({\mathcal {A}}\) and polynomial in the size of E. Otherwise, it returns UNREAL.

Specifications given as an LTL formula. We are now in a position to apply Algo 3 to a specification given as an LTL formula \(\varphi \). Indeed, thanks to the results of the subsection above, to provide an algorithm for LTL specifications, we only need to translate \(\varphi \) into a universal co-Büchi automaton. This can be done thanks to the following well-known result (see [24]): given an LTL formula \(\varphi \) over two sets of atomic propositions \(P_{{\mathcal I}}\) and \(P_{{\mathcal O}}\), we can construct in exponential time a universal co-Büchi automaton \({\mathcal {A}}_{\varphi }\) such that \(L^{\forall }({\mathcal {A}}_{\varphi })={[\!\![ \varphi ]\!\!]}\), i.e. \({\mathcal {A}}_{\varphi }\) recognizes exactly the set of words \(w \in (2^{P_{\mathcal I}} 2^{P_{\mathcal O}})^{\omega }\) that satisfy \(\varphi \). We then get the following theorem that gives the complexity of our synthesis algorithm for a set of examples E and an LTL formula \(\varphi \). This complexity is provably worst-case optimal, as deciding if \({[\!\![ \varphi ]\!\!]}\) is realizable with \(E=\emptyset \), i.e. the plain LTL realizability problem, is already 2ExpTime-Complete [27].

Theorem 7

Given an LTL formula \(\varphi \) and a set of examples E, the synthesis algorithm SynthLearn returns a Mealy machine \({\mathcal {M}}\) such that \(E \subseteq L({\mathcal {M}})\) and \(L_{\omega }({\mathcal {M}}) \subseteq {[\!\![ \varphi ]\!\!]}\) if it exists, in worst-case doubly exponential time in the size of \(\varphi \) and polynomial in the size of E. Otherwise it returns UNREAL.

5 Implementation and Case study

We have implemented the algorithm SynthLearn of the previous section in a prototype tool, in Python, using the tool Acacia-Bonzai [11] to manipulate antichains of counting functions. We first explain the heuristics we have used to define state-merging and completion strategies, and then demonstrate how our implementation behaves on a case study whose goal is to synthesize the controller for an elevator. The interested reader can find in [5] other case studies, including a controller for an e-bike and two variations on mutual exclusion.

Merging and completion strategies implemented in our prototype. Our tool implements a merging strategy \(\sigma _G\) where, given an example e that leads in the current preMealy machine to a state m and a set \(\{m_1,m_2, \dots , m_k\}\) of candidates for merging, as computed in line 7 of Algorithm 1, we choose state \(m_i\) with a \(\preceq \)-minimal counting function \(F^*(m_i)\), as defined in Lemma 2. Intuitively, favouring minimal counting functions preserves as much as possible the set of behaviors that are possible after the example e.

Our tool also implements a completion strategy \(\sigma _C\) which, for every hole \((m,\textsf{i})\) of the preMealy machine \({\mathcal {M}}\), selects from the list of candidate pairs an element that again favours states associated with \(\preceq \)-minimal counting functions. For more details, we refer the reader to [5].

Lift Controller Example. We illustrate how to use our tool to construct a suitable controller for a two-floor elevator system.

Considering two floors is sufficient to illustrate most of the main difficulties of a more general elevator. Inputs of the controller are given by two atomic propositions \(\texttt {b0}\) and \(\texttt {b1}\), which are true whenever the button at floor 0 (resp. floor 1) is pressed by a user. Outputs are given by the atomic propositions \(\texttt {f0}\) and \(\texttt {f1}\), true whenever the elevator is at floor 0 (resp. floor 1); and \(\texttt {ser}\), true whenever the elevator is serving the current floor (i.e. doors are opened). This controller should ensure the following core properties:

  1. Functional Guarantee: whenever a button of floor 0 (resp. floor 1) is pressed, the elevator must eventually serve floor 0 (resp. floor 1): G(b0 -> F (f0 & ser)) & G(b1 -> F (f1 & ser))

  2. Safety Guarantee: The elevator is always at one floor exactly: \(\texttt {G(f0<->!f1)}\)

  3. Safety Guarantee: The elevator cannot transition between two floors when doors are opened: G((f0 & ser) -> X(!f1)) & G((f1 & ser) -> X(!f0))

  4. Initial State: The elevator should be in floor 0 initially: f0

Additionally, we make the following assumption: whenever a button of floor 0 (or floor 1) is pressed, it must remain pressed until the floor has been served, i.e., G(b0 -> (b0 W (f0 & ser))) & G(b1 -> (b1 W (f1 & ser))).

Before going into the details of this example, let us explain the methodology that we apply to use our tool on this example. We start by providing only the high level specification \(\varphi _{\textsf{CORE}}\) for the elevator given above. We obtain a first Mealy machine from the tool. We then inspect the machine to identify prefixes of behaviours that we are unhappy with, and for which we can provide better alternative decisions. Then we run the tool on \(\varphi _{\textsf{CORE}}\) and the examples that we have identified, and we get a new machine. We repeat this process until we are satisfied with the synthesized Mealy machine.

Fig. 2. Machine returned by our tool on the elevator specification without examples. Here, \(q_0, q_1, q_2, q_3\) represent, respectively, the state where f0 is served when required, the state where b1 is pending, the state where f1 is served, and the state where b0 is pending.

Fig. 3. Mealy machine returned by our tool on the elevator specification with additional examples. The preMealy machine obtained after generalizing the examples and before completion is highlighted in red. This took 3.10s to be generated.

Let us now give details. When our tool is provided with this specification without any examples, we get the machine depicted in Fig. 2. This solution makes the controller switch between floor 0 and floor 1, sometimes unnecessarily. For instance, consider the trace s # {!b0 & !b1}{!f0 & f1 & !ser} # {!b0 & !b1}{f0 & !f1 & !ser}, where we let s = {!b0 & b1}{f0 & !f1 & !ser} # {!b0 & b1}{!f0 & f1 & ser}. Here, we note that the last transition goes back to state \(q_0\), where the elevator is at floor 0, although the elevator could have remained at floor 1 after serving floor 1. The methodology described above allows us to identify the following three examples:

  1. The 1st trace states that after serving floor 1, the elevator must remain at floor 1 as b0 is false: s # {!b0 & !b1}{!f0 & f1 & !ser} # {!b0 & !b1}{!f0 & f1 & !ser}

  2. The 2nd trace states that the elevator must remain at floor 0, as b1 is false: {!b0 & !b1}{f0 & !f1 & !ser} # {!b0 & !b1}{f0 & !f1 & !ser}

  3. The 3rd trace ensures that after s, there is no unnecessary delay in serving floor 0 after floor 1 is served in s: s # {b0 & !b1}{!f0 & f1 & !ser} # {b0 & !b1}{f0 & !f1 & ser}
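The three example traces can be handled mechanically. The following Python sketch parses the textual notation used above into sequences of input/output valuations and checks the two safety guarantees on these finite prefixes. It is only an illustration: the concrete input format of the prototype is not described here, so the parser targets the notation of this section, and the names `parse_trace` and `respects_safety` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

Valuation = Dict[str, bool]
Step = Tuple[Valuation, Valuation]     # (inputs, outputs)


def parse_valuation(text: str) -> Valuation:
    """Turn '!b0 & b1' into {'b0': False, 'b1': True}."""
    valuation = {}
    for literal in text.split("&"):
        literal = literal.strip()
        valuation[literal.lstrip("!")] = not literal.startswith("!")
    return valuation


def parse_trace(trace: str) -> List[Step]:
    """Parse '{inputs}{outputs} # {inputs}{outputs} # ...'."""
    steps = []
    for chunk in trace.split("#"):
        inputs, outputs = re.findall(r"\{([^}]*)\}", chunk)
        steps.append((parse_valuation(inputs), parse_valuation(outputs)))
    return steps


def respects_safety(steps: List[Step]) -> bool:
    """Check safety guarantees 2 and 3 on a finite prefix."""
    for index, (_, out) in enumerate(steps):
        if out["f0"] == out["f1"]:
            return False                       # not at exactly one floor
        if out["ser"] and index + 1 < len(steps):
            nxt = steps[index + 1][1]
            if (out["f0"] and nxt["f1"]) or (out["f1"] and nxt["f0"]):
                return False                   # moved while doors were open
    return True


# The common prefix s and the three examples listed above.
S = "{!b0 & b1}{f0 & !f1 & !ser} # {!b0 & b1}{!f0 & f1 & ser}"
EXAMPLES = [
    S + " # {!b0 & !b1}{!f0 & f1 & !ser} # {!b0 & !b1}{!f0 & f1 & !ser}",
    "{!b0 & !b1}{f0 & !f1 & !ser} # {!b0 & !b1}{f0 & !f1 & !ser}",
    S + " # {b0 & !b1}{!f0 & f1 & !ser} # {b0 & !b1}{f0 & !f1 & ser}",
]
```

Such a sanity check is cheap and catches examples that contradict the safety part of the specification before they are handed to the synthesis procedure.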

With those additional examples, our tool outputs the machine of Fig. 3, which generalizes them and now ensures that moves of the elevator occur only when required. For example, the end of the first trace has been generalized into a loop on state \(q_1\) ensuring that the elevator does not go to floor 0 from floor 1 unless b0 is pressed. We note that the number of examples provided here is much smaller than the theoretical (polynomial) upper bound proved in Theorem 4.

6 Conclusion

We have introduced synthesis with a few hints, which allows the user to guide synthesis using examples of expected executions of high quality solutions. Existing synthesis tools may provide unnatural solutions when fed with high-level specifications only. As providing complete specifications goes against the very goal of synthesis, we believe our algorithm has a greater potential in practice.

We have studied the computational complexity of problems that need to be solved during our synthesis procedure. We have proved our algorithm is complete: any minimal Mealy machine \({\mathcal {M}}\) realizing a specification \(\varphi \) can be obtained from \(\varphi \) and a representative example set E, whose size is bounded polynomially in the size of \({\mathcal {M}}\). We have implemented our algorithm in a prototype tool that extends Acacia-Bonzai [11] with tailored state-merging learning algorithms. We have shown on case studies that a small number of examples suffices to obtain high quality machines from high-level LTL specifications. The tool is not fully optimized yet. While this is sufficient to demonstrate the relevance of our approach, we will work on efficiency aspects of the implementation.

As future work, we will consider extensions of the user interface to interactively and concisely specify sets of (counter-)examples to solutions output by the tool. Along the same lines, an interesting future direction is to handle parametric examples (e.g. an elevator with the number of floors given as a parameter). This would require providing a concise syntax to define parametric examples and designing efficient synthesis algorithms in this setting. We will also consider the possibility to formulate negative examples, as our theoretical results readily extend to this case and their integration in the implementation should be easy.