LETHE: Forgetting and Uniform Interpolation for Expressive Description Logics

Uniform interpolation and forgetting describe the task of projecting a given ontology into a user-specified vocabulary, that is, of computing a new ontology that only uses names from a specified set of names, while preserving all logical entailments that can be expressed with those names. This is useful for ontology analysis, ontology reuse and privacy. Lethe is a tool for performing uniform interpolation on ontologies in expressive description logics, and it can be used from the command line, using a graphical interface, and as a Java library. It furthermore implements methods for computing logical difference and performing abduction using uniform interpolation. We present the tool together with an evaluation on a varied corpus of realistic ontologies.


Introduction
Description logic (DL) ontologies are used in a range of application areas as a means to define terminological domain knowledge via concept and role names. Applications in medicine, biology and the semantic web often lead to the development of large and complex ontologies that cover wide areas of knowledge. Understanding and maintaining such complex ontologies becomes difficult without appropriate tool support. On the other hand, some information from existing ontologies might be useful for reuse in new ontologies, while one does not want to import the complexity of the whole ontology. Uniform interpolation, also studied under the name of forgetting, has the potential to address these challenges [4,14]. Given an ontology O and a signature Σ of concept and role names, a uniform interpolant for O over Σ is a new ontology that covers all logical entailments of O in Σ, while using no names that are outside of the signature (see Fig. 1 for an example). Uniform interpolation can be used for ontology reuse by computing a specialised ontology that only deals with the names that are relevant for the new application. Furthermore, it can be used to make implicit, hidden relations between names visible, which can be helpful for ontology understanding and maintenance. In addition, uniform interpolation can be used to solve other non-classical reasoning problems relevant in the context of ontology maintenance, such as logical difference [19] and abduction [3,15].
Lethe is a tool that can be used to compute uniform interpolants in different expressive DLs. Internally, it uses a resolution method presented in [9] for ALCH TBoxes, and later extended to SHQ [10] and to knowledge bases consisting of both a TBox and an ABox [11]. Since those publications, a few bugfixes, optimisations and new features have been implemented. This paper presents the current version of Lethe: the reasoning services it supports out of the box, the different user interfaces, and an evaluation comparing Lethe with Fame [20], the other state-of-the-art uniform interpolation tool for expressive DLs.

Preliminaries
We first give an overview of the DLs relevant for Lethe, and then discuss the supported reasoning services.

Description Logics
In the DLs we consider, concepts are constructed from the pairwise disjoint sets N_C, N_R and N_I of concept, role and individual names, respectively, according to the following syntax rule:
The basic DL ALC supports just the constructs ⊤, ⊥, A, ¬C, C ⊓ D, C ⊔ D, ∃r.C and ∀r.C, no universal roles, and only axioms of the form C ⊑ D. S additionally allows for axioms of the form trans(r). EL restricts ALC by only allowing ⊤, A, C ⊓ D and ∃r.C. If L is a DL, LH denotes its extension with role axioms r ⊑ s, LQ its extension with number restrictions ≥nr.C, LO its extension with nominals {a}, and LU its extension with universal roles. For instance, ALCH extends ALC with axioms of the form r ⊑ s, and SHQ supports axioms of the forms r ⊑ s and trans(r), and concepts of the form ≥nr.C.
The semantics of DLs is defined in terms of interpretations I = ⟨Δ^I, ·^I⟩, with the non-empty set Δ^I as domain, and the interpretation function ·^I mapping each a ∈ N_I to a^I ∈ Δ^I, each A ∈ N_C to A^I ⊆ Δ^I, each r ∈ N_R to r^I ⊆ Δ^I × Δ^I, with ∇^I = Δ^I × Δ^I, and which is extended to concepts inductively. An axiom α is satisfied in an interpretation I, in symbols I ⊧ α, if α = C ⊑ D and C^I ⊆ D^I, α = r ⊑ s and r^I ⊆ s^I, α = trans(r) and r^I is transitive, α = A(a) and a^I ∈ A^I, or α = r(a, b) and ⟨a^I, b^I⟩ ∈ r^I. If for a KB K, I ⊧ α for all axioms α ∈ K, then I is a model of K. An axiom α is entailed by a KB K, in symbols K ⊧ α, if I ⊧ α for all models I of K.
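To make the semantics concrete, the following Python sketch evaluates ALC concepts in a finite interpretation and checks whether a concept inclusion is satisfied. It uses the standard textbook semantics of the ALC constructors; the tuple encoding of concepts and the names `extension` and `satisfies_inclusion` are assumptions made for this illustration and are not part of Lethe.

```python
# Concepts are encoded as nested tuples (an encoding chosen for this sketch):
#   "TOP", "BOT", ("name", A), ("not", C), ("and", C, D), ("or", C, D),
#   ("some", r, C) for an existential restriction, ("all", r, C) for a universal one.

def extension(concept, domain, concepts, roles):
    """Return the set of domain elements that belong to `concept`."""
    if concept == "TOP":
        return set(domain)
    if concept == "BOT":
        return set()
    op = concept[0]
    if op == "name":
        return set(concepts.get(concept[1], ()))
    if op == "not":
        return set(domain) - extension(concept[1], domain, concepts, roles)
    if op in ("and", "or"):
        left = extension(concept[1], domain, concepts, roles)
        right = extension(concept[2], domain, concepts, roles)
        return left & right if op == "and" else left | right
    if op in ("some", "all"):
        filler = extension(concept[2], domain, concepts, roles)
        pairs = roles.get(concept[1], ())
        result = set()
        for d in domain:
            successors = {e for (x, e) in pairs if x == d}
            if op == "some" and successors & filler:
                result.add(d)
            if op == "all" and successors <= filler:
                result.add(d)
        return result
    raise ValueError(f"unknown constructor: {concept!r}")


def satisfies_inclusion(lhs, rhs, domain, concepts, roles):
    """Check whether the interpretation satisfies the inclusion lhs ⊑ rhs."""
    return extension(lhs, domain, concepts, roles) <= extension(rhs, domain, concepts, roles)


# A small interpretation with three elements.
domain = {1, 2, 3}
concepts = {"A": {1, 2}, "B": {2}}
roles = {"r": {(1, 2), (3, 3)}}

some_r_B = ("some", "r", ("name", "B"))
all_r_B = ("all", "r", ("name", "B"))
```

In this interpretation, only element 1 has an r-successor in B, while element 2 belongs to the universal restriction vacuously because it has no r-successors at all.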
In addition to these classical concept constructors, a less common concept constructor we use is the greatest fixpoint νX.C[X] [2], which corresponds to the limit of the sequence ⊤, C[⊤], C[C[⊤]], … Given a DL L, we denote by Lν the extension of L with greatest fixpoint operators. For formal details on the semantics of fixpoint operators, we refer to [2].
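On a finite interpretation, a greatest fixpoint can be computed by starting from the whole domain and applying the operator until nothing changes. The sketch below illustrates this for the concept "has an infinite chain of r-successors", that is, the greatest fixpoint of X ↦ ∃r.X. The function names and the example are assumptions of this illustration; the iteration is only guaranteed to reach the greatest fixpoint if the supplied operator is monotone.

```python
def greatest_fixpoint(step, domain):
    """Iterate a monotone operator on subsets of `domain`, starting from the full domain."""
    current = frozenset(domain)
    while True:
        following = frozenset(step(current))
        if following == current:
            return set(current)
        current = following


# Example: elements from which an infinite r-path starts.
domain = {1, 2, 3, 4}
r = {(1, 2), (2, 1), (3, 4)}


def exists_r(candidates):
    """Elements with at least one r-successor inside `candidates`."""
    return {d for (d, e) in r if e in candidates}


infinite_r_chain = greatest_fixpoint(exists_r, domain)
```

Elements 1 and 2 lie on a cycle and are kept. Element 4 has no successor and is removed in the first round, after which element 3 loses its only witness and is removed in the second.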

Uniform Interpolation and Related Tasks
Definition 1 (Uniform interpolation) Let K be a KB, L a DL, and Σ a signature. Then, we call a KB K^{L,Σ} a uniform ⟨L, Σ⟩-interpolant of K iff 1. sig(K^{L,Σ}) ⊆ Σ, and 2. for every L axiom α with sig(α) ⊆ Σ, we have K ⊧ α iff K^{L,Σ} ⊧ α.
Note that we do not require K^{L,Σ} to be an L ontology itself. If the DL does not allow for fixpoints, then already for acyclic ontologies there can be signatures for which no uniform interpolant exists in that DL [14]. On the other hand, for the DLs considered here, uniform interpolants of ontologies always exist in DLs with fixpoints. Furthermore, when interpolating KBs with assertions, a uniform interpolant may only exist if we allow for nominals in the result [8,11]. We often speak of the uniform interpolant, referring to the logically strongest among the possible options. The dual notion of uniform interpolation is forgetting: the result of forgetting a name x from an ontology O is the uniform ⟨L, Σ⟩-interpolant for Σ = sig(O)⧵{x}.
Fixpoint operators are not supported by the web ontology standard OWL, which is why Lethe offers two ways of eliminating them when producing the result: either by approximation, or by using auxiliary concept names, so-called definers, that simulate the behaviour of greatest fixpoints and make sure all logical entailments of the uniform interpolant are preserved (see [9]). If we want to reuse a uniform interpolant in a different context, it may be useful to compute it in a DL with universal roles. The following theorem is an easy consequence of [6, Theorem 3].
Intuitively, if we want to reuse an ALC or ALCQ ontology O in another context that speaks about Σ, we can replace O by its uniform ⟨LU, Σ⟩-interpolant and still preserve all consequences over Σ. A corresponding property does not hold for uniform ⟨L, Σ⟩-interpolants in general.
In addition to uniform interpolation, Lethe implements logical difference and abduction by reduction to uniform interpolation with some dedicated optimisations.
Lethe computes representations of logical differences by checking for entailments of axioms in the uniform interpolant. Additional optimisations are used to restrict the number of reasoner calls and forgetting steps performed for comparing large ontologies with large syntactical overlap. Uniform interpolation is also used for computing representatives of logical differences in [12,19].
Definition 3 (Abduction) Let O be an ontology, α an axiom s.t. O ⊭ α (the observation), and Σ a signature (the set of abducibles). Then, a hypothesis for O ⊧ α in Σ is an ontology H s.t.

User Interfaces of Lethe
Lethe implements three different algorithms for computing uniform interpolants: one for ALCH ontologies based on [9] (ALCHForgetter), one for SHQ ontologies based on [10] (SHQForgetter), and one for forgetting in SH knowledge bases based on [8,11] (KBForgetter). The general approach and implementation idea of these methods is described in Sect. 4. Uniform interpolation is always performed by forgetting one name after the other. While the logic supported by KBForgetter is more general than the one supported by ALCHForgetter, the implementations differ substantially, and thus may perform differently on the same input.

Graphical User Interface.
With the application of ontology analysis in mind, Lethe comes with a simple graphical user interface that can be used to quickly try out the tool (see Fig. 1, which illustrates uniform interpolation with the pizza ontology). Ontologies in OWL syntax can be loaded and are displayed in DL syntax. The user then selects the target signature, the method to be used, and whether greatest fixpoint operators should be approximated or simulated with helper concepts. During computation, a progress bar shows the name currently being forgotten. The first 80-90% of names are usually forgotten very fast, while the more difficult names are forgotten at the end. A user who does not want to wait can cancel the forgetting process, in which case the uniform interpolant computed so far is shown.

Console Interface.
Lethe can furthermore be used as a command line tool. Here, the user can also set a timeout, after which the partial uniform interpolant is saved if the computation has not terminated yet. A second command is provided for computing logical differences.

Use as Java Library.
Probably most relevant for practical applications is the possibility to use Lethe as a Java library. Though implemented in Scala, Lethe provides a facade supporting standard Java data structures that is compatible with the OWL API 5.1.7 [5]. The use of this facade is documented on the website. Classes and interfaces are provided for the three different forgetting methods, and for performing uniform interpolation, logical difference computation and abduction with abducibles.

General Method
In order to forget a specific name, Lethe performs the following steps: 1. normalise the input, 2. compute all inferences on the name to forget, 3. filter out occurrences of the name, and 4. denormalise.
For illustration, we describe the method for ALC TBoxes; the idea is the same for the more expressive DLs (for details, see [8-11]). In our normal form, every axiom is of the form ⊤ ⊑ L_1 ⊔ … ⊔ L_n with L_i ::= A | ¬A | ∃r.D | ∀r.D, where A ∈ N_C, r ∈ N_R, and D is taken from a special set N_D ⊆ N_C of definers. We call the L_i literals and normalised axioms clauses, usually omit the leading ⊤ ⊑, and treat them as sets, that is, no literal occurs twice and the order is not important. We further have the restriction that no clause contains more than one literal of the form ¬D, where D ∈ N_D.
In Step 2, we make use of the calculus shown in Fig. 2. In r-Prop, the definer D_12 is a definer representing D_1 ⊓ D_2, which we introduce if it does not exist yet. In r-Res, O refers to the current set of clauses. Here, Lethe uses HermiT to decide the entailment. The rules A-Res and r-Res are used to perform the inferences on the symbol to forget (concept name A or role name r). Since a clause may contain at most one negative definer literal, the rules are not applicable if the premises contain different negative definer literals, for instance ¬D_1 and ¬D_2. The rule r-Prop is a so-called combination rule and may combine the different definers D_1 and D_2 into a new definer D_12, resulting in new clauses which contain ¬D_12 instead of ¬D_1 and ¬D_2, which makes new inferences with A-Res and r-Res possible (recall that clauses may contain at most one negative definer literal). The calculi for more expressive DLs have additional combination rules that reflect the additional expressivity. Our method makes sure that in the worst case, at most exponentially many new definers are introduced, which ensures termination of the forgetting procedure.
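The interplay of Step 2 (inferences on the name to forget) and Step 3 (filtering) can be illustrated in a purely propositional setting, where clauses contain only positive and negative concept names. The sketch below is such a simplified analogue: it resolves on the chosen name and then discards every clause that still mentions it. It deliberately ignores role restrictions, definers and the combination rules, so it is not the calculus of Fig. 2; the encoding of literals as `(name, polarity)` pairs and the function names are assumptions of this illustration.

```python
def is_tautology(clause):
    """A clause is tautological if it contains a name both positively and negatively."""
    return any((name, not polarity) in clause for name, polarity in clause)


def forget_name(clauses, name):
    """Eliminate `name` from a set of propositional clauses by resolution."""
    clauses = {frozenset(c) for c in clauses if not is_tautology(c)}
    positive = [c for c in clauses if (name, True) in c]
    negative = [c for c in clauses if (name, False) in c]
    # Filtering step: keep only the clauses that do not mention the name.
    result = {c for c in clauses
              if (name, True) not in c and (name, False) not in c}
    # Inference step: resolve every positive occurrence with every negative one.
    for p in positive:
        for n in negative:
            resolvent = (p - {(name, True)}) | (n - {(name, False)})
            if not is_tautology(resolvent):
                result.add(frozenset(resolvent))
    return result


# A ⊑ B and B ⊑ C as clauses ¬A ⊔ B and ¬B ⊔ C.
tbox = [
    {("A", False), ("B", True)},
    {("B", False), ("C", True)},
]
without_b = forget_name(tbox, "B")
```

Forgetting B from the two clauses leaves the single clause ¬A ⊔ C, which corresponds to the inclusion A ⊑ C that was only implicit in the input.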
In the denormalisation step, the definers are eliminated again using standard rewriting rules. It is in this step that we may introduce fixpoint operators into the ontology. Alternatively, if fixpoints are not desired in the output, we keep the definers for which the corresponding fixpoint cannot be simplified away (see below), or we approximate the fixpoint expression by unfolding it up to a certain depth.
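Approximation by unfolding can be pictured as repeated substitution: starting from ⊤, the body of the fixpoint is plugged into its own variable a fixed number of times. The following sketch performs this substitution on concepts encoded as nested tuples. The encoding, the function names and the choice of ⊤ as the starting point are assumptions of this illustration, and the sketch is not taken from Lethe's implementation. For a monotone body, every unfolding obtained this way is implied by the greatest fixpoint, so the result is a weaker, fixpoint-free concept.

```python
# Encoding used in this sketch: "TOP", ("name", A), ("var", X), ("and", C, D),
# ("or", C, D), ("some", r, C), ("all", r, C), ("nu", X, C).

def substitute(concept, var, replacement):
    """Replace free occurrences of the variable `var` in `concept`."""
    if concept == ("var", var):
        return replacement
    if isinstance(concept, tuple):
        if concept[0] == "nu" and concept[1] == var:
            return concept  # `var` is rebound here, so nothing inside is free
        return tuple(
            substitute(part, var, replacement) if isinstance(part, tuple) else part
            for part in concept
        )
    return concept


def unfold(body, var, depth):
    """Unfold the fixpoint with the given body `depth` times, starting from TOP."""
    approximation = "TOP"
    for _ in range(depth):
        approximation = substitute(body, var, approximation)
    return approximation


# Body of the fixpoint over X of (A ⊓ ∃r.X).
body = ("and", ("name", "A"), ("some", "r", ("var", "X")))
depth_one = unfold(body, "X", 1)
depth_two = unfold(body, "X", 2)
```

With depth 1 the result is A ⊓ ∃r.⊤, and with depth 2 it is A ⊓ ∃r.(A ⊓ ∃r.⊤). A larger depth gives a closer approximation at the cost of a larger concept.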

Implementation of Forgetting Calculus
The implemented forgetting methods use different strategies for determining when a combination rule has to be applied. ALCHForgetter keeps a map for each definer that stores its "distance" to the name to be forgotten. A combination rule then only combines definers that have the same distance. SHQForgetter and KBForgetter instead use a "lazy" approach: they first apply resolution unrestricted, allowing more than one negative definer. If a clause with negative definers ¬D_1, …, ¬D_n is inferred, clauses containing D_1, …, D_n under a role restriction are determined, and combination rules are tried in order to introduce a definer representing D_1 ⊓ … ⊓ D_n. In the first approach, we try to predict when definer combination is necessary. In the second approach, we apply combination on demand. In addition, we use a set of standard techniques from resolution methods, such as indexing and forward and backward subsumption.
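Forward and backward subsumption can be sketched for propositional clauses, where a clause subsumes another if it is a subset of it. The function below is a minimal illustration under that assumption; the name `add_with_subsumption` is hypothetical, and subsumption between DL clauses with role restrictions requires a more refined test than plain set inclusion.

```python
def add_with_subsumption(clauses, new_clause):
    """Add `new_clause` to `clauses` while keeping the set free of subsumed clauses."""
    new_clause = frozenset(new_clause)
    # Forward subsumption: the new clause is redundant if an existing clause subsumes it.
    if any(old <= new_clause for old in clauses):
        return set(clauses)
    # Backward subsumption: existing clauses subsumed by the new clause are removed.
    kept = {old for old in clauses if not new_clause <= old}
    kept.add(new_clause)
    return kept


clause_set = {frozenset({"A", "B"}), frozenset({"C", "D"})}
after_weaker = add_with_subsumption(clause_set, {"A", "B", "C"})
after_stronger = add_with_subsumption(clause_set, {"A"})
```

Adding the weaker clause A ⊔ B ⊔ C changes nothing, while adding the stronger clause A replaces A ⊔ B.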

Implementation of Uniform Interpolation
To compute the uniform interpolant, we apply Steps 1-4 for each name in the ontology that is not in the desired signature. These steps are only applied to the axioms that contain the name to be forgotten, which are then replaced by the forgetting result. It turns out that the order in which we forget is crucial to the performance: our heuristics take into account the positive and negative occurrences of the name to be forgotten, and we generally start with the least frequent ones.
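One simple way to realise an ordering of this kind is to count positive and negative occurrences per name and to sort the names to be forgotten by these counts. The sketch below does this for propositional clauses whose literals are `(name, polarity)` pairs. The concrete score (total number of occurrences, then the product of positive and negative occurrences as a tie-breaker) is an assumption chosen for illustration and is not claimed to be the heuristic implemented in Lethe.

```python
from collections import Counter


def forgetting_order(clauses, to_forget):
    """Order the names in `to_forget` so that the least frequent ones come first."""
    positive, negative = Counter(), Counter()
    for clause in clauses:
        for name, polarity in clause:
            (positive if polarity else negative)[name] += 1

    def score(name):
        total = positive[name] + negative[name]
        # The product bounds the number of resolvents created on this name.
        return (total, positive[name] * negative[name], name)

    return sorted(to_forget, key=score)


example = [
    {("A", True), ("B", False)},
    {("B", True), ("C", False)},
    {("B", True), ("A", False)},
    {("C", True)},
]
order = forgetting_order(example, {"A", "B", "C"})
```

Here A and C each occur twice and B three times, so B is scheduled last.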
In addition, we use pre- and post-processing to reduce the number of axioms to be processed, and to improve the shape of the computed uniform interpolant. First, we use module extraction as in [18], and as implemented in the OWL API, to compute a subset of the ontology that contains all relevant axioms for the uniform interpolant. Second, we use purification to quickly forget all names which occur either only positively or only negatively, in which case they can be replaced by ⊤ and ⊥, respectively. As post-processing, we use a set of beautification rules that improve the syntactic shape of the axioms by detecting tautological or contradictory subexpressions (including fixpoints), detecting redundancies, or applying associativity. A cheaper version of beautification is used during the forgetting phase to keep the size of the current uniform interpolant small. A more expensive form is applied at the end to make the final uniform interpolant more human-readable and to keep the expressivity of the used DL small. For instance, an EL ontology might be preferable over an equivalent ALC ontology, if this transformation is possible in simple steps.
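In clausal form, purification has a particularly simple effect: replacing a name that occurs only positively by ⊤, or one that occurs only negatively by ⊥, turns every clause containing that name into a tautology, so these clauses can be dropped. The sketch below shows this for propositional clauses with `(name, polarity)` literals and repeats the check because removing clauses can make further names pure. It is a simplified illustration under these assumptions; polarity in the presence of role restrictions needs more care than shown here.

```python
def purify(clauses, to_forget):
    """Remove all clauses that mention a name from `to_forget` with only one polarity."""
    clauses = {frozenset(c) for c in clauses}
    changed = True
    while changed:
        changed = False
        for name in to_forget:
            has_positive = any((name, True) in c for c in clauses)
            has_negative = any((name, False) in c for c in clauses)
            if has_positive != has_negative:  # the name occurs with exactly one polarity
                clauses = {c for c in clauses
                           if (name, True) not in c and (name, False) not in c}
                changed = True
    return clauses


chain = [
    {("A", False), ("B", True)},   # A ⊑ B
    {("B", False), ("C", True)},   # B ⊑ C
    {("C", False), ("D", True)},   # C ⊑ D
]
purified = purify(chain, ["A", "B"])
```

A occurs only negatively, so the first clause is dropped; afterwards B occurs only negatively as well, so the second clause follows. Only C ⊑ D remains, without any resolution step.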

Evaluation
We evaluate Lethe and compare it with Fame, the other state-of-the-art tool for uniform interpolation in expressive DLs. While being faster, later versions of Fame compute uniform interpolants in a very expressive DL not supported by OWL, which explains why we could not produce OWL files for most inputs with Fame 2.0, the latest version. For this reason, we used the ALCOIQ-forgetter of Fame 1.0 in our experiments. Evaluations of older versions of Lethe [8-11] and comparisons with other tools [1,20,21] can be found in the literature. Since then, additional optimisations and features have been implemented, and some bugs have been fixed. The evaluation presented here differs in three further aspects from earlier evaluations. (1) We compute uniform interpolants with universal roles, which are now directly supported by Lethe. Fame always does this, and since it makes forgetting roles much easier, this provides for a fairer comparison. Furthermore, universal roles in the uniform interpolant can be useful in practice (see Theorem 1). (2) We do not discard computations that caused timeouts, but instead evaluate the uniform interpolants computed within the given time frame, since in many applications, a quickly computed uniform interpolant with a few more symbols is sufficient and preferable over long waiting times. (3) We use different heuristics for selecting samples of signatures.

System Specification.
The experiments were run on an Intel Core i5-4590 CPU machine with 3.30 GHz and 32 GB RAM, using Debian/GNU Linux 9 and OpenJDK 11.0.5.

Corpus.
We use the ontologies from the OWL Reasoner Evaluation 2015 [17], for the track DL Classification, which has been balanced in terms of size, expressivity and complexity of ontologies. From each ontology, we removed axioms outside of ALCH, where we translated n-ary equivalence and disjointness axioms, as well as domain and range axioms, into corresponding ALCH concept inclusions. From the resulting corpus, we removed all ontologies that had fewer than 100 names or more than 100,000 axioms. Figure 3 shows sizes and expressivity of the ontologies in the resulting corpus of 198 ontologies.

Signatures.
We focused on uniform interpolants for small signatures, which are particularly useful for ontology analysis, and thus selected signatures of 100 names for each computed uniform interpolant. We used different strategies to select signatures: (1) fully random signatures, obtained by selecting each name with equal probability, (2) weighted signatures, obtained by selecting each name weighted with the frequency of its occurrences in the ontology, and (3) coherent signatures, obtained by selecting names related to each other using genuine modules [18]. A genuine module is a module extracted for the signature of some axiom, and thus has as signature names that are related to that axiom in the ontology, and are consequently related to each other. To obtain a coherent signature, we took the union of randomly selected genuine modules until the overall signature size was above 100, and then randomly selected names from the resulting signature.
The results of the evaluation are shown in Figs. 4 and 5. For the coherent signatures, Lethe produced an out-of-memory error each time for 7 of the ontologies. Apart from these cases, Lethe always computed a uniform interpolant, and in most cases completely for the desired signature. Unlike Lethe, Fame has no timeout functionality and was terminated after the timeout passed, which is why we have fewer results for it. Still, one can see that Lethe was more often able to compute the uniform interpolant for the required signature size of 100. Note that Fame can also produce uniform interpolants with more than 100 symbols, due to definer symbols used to simulate fixpoints, and because the method used is incomplete and sometimes fails to forget some of the names.
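The three signature selection strategies can be sketched as follows. The sketch assumes that the names of the ontology, their occurrence counts, and the signatures of the genuine modules have already been computed elsewhere; it does not perform module extraction itself. All function names are hypothetical, occurrence counts are assumed to be positive, and the code is an illustration of the strategies as described in the text rather than the scripts used in the evaluation.

```python
import random


def random_signature(names, size, rng):
    """Strategy (1): every name is selected with equal probability."""
    names = sorted(names)
    return set(rng.sample(names, min(size, len(names))))


def weighted_signature(occurrences, size, rng):
    """Strategy (2): names are drawn without replacement, weighted by frequency."""
    remaining = dict(occurrences)  # name -> positive number of occurrences
    chosen = set()
    while remaining and len(chosen) < size:
        names = sorted(remaining)
        weights = [remaining[n] for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.add(pick)
        del remaining[pick]
    return chosen


def coherent_signature(module_signatures, size, rng):
    """Strategy (3): union randomly chosen module signatures until the pool
    exceeds `size`, then sample `size` names from the pool."""
    candidates = [set(s) for s in module_signatures]
    rng.shuffle(candidates)
    pool = set()
    for signature in candidates:
        if len(pool) > size:
            break
        pool |= signature
    return set(rng.sample(sorted(pool), min(size, len(pool))))


rng = random.Random(0)  # fixed seed so that a run can be repeated
occurrences = {"A": 5, "B": 1, "C": 3, "D": 2, "r": 4}
modules = [{"A", "B"}, {"C", "r"}, {"A", "D", "r"}]

sig_random = random_signature(occurrences, 3, rng)
sig_weighted = weighted_signature(occurrences, 3, rng)
sig_coherent = coherent_signature(modules, 3, rng)
```

Sorting the names before sampling makes the outcome depend only on the seed and not on the iteration order of Python sets.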

Related Work
For an overview on forgetting in logics, see [4]. Theoretical properties of uniform interpolation in expressive description logics have been investigated in [14]. The main competing tool for uniform interpolation in expressive DLs is Fame [20,21], which we used in our evaluation. Which tool to use depends on the application, as Fame can be faster and supports more expressive DLs. The faster versions however often fail to compute results in OWL, as interpolants may use non-classical constructs. Another difference to Lethe is that the method underlying Fame is incomplete in the sense that it is not guaranteed to compute a uniform interpolant for every given signature. A very recent tool that can be used for forgetting in expressive DLs is DLS-Forgetter [1], which applies the DLS algorithm on first-order logic formulae. More similar to Lethe is the resolution-based method for ALC presented in [12], which however is not able to deal with cyclic ontologies. For the light-weight DL EL, there exists an implemented method for acyclic terminologies [7], and one for general ontologies [13].

Outlook
We are currently investigating further reasoning services that are based on forgetting, and which will be implemented in future versions of Lethe. Specifically, we are looking at module extraction, abduction, and using forgetting to explain entailments in an ontology. Regarding abduction, we want to support arbitrary signatures and ABox assertions. For this generalised abduction problem, we have to adapt the forgetting procedure as well, as it has, for instance, to handle negated role assertions. Furthermore, we noticed that uniform interpolants are often smaller than modules extracted with the OWL API, but in some cases also larger and with more complex axioms. Another line of research to pursue is to develop a method that sits in between uniform interpolation and module extraction, and is optimised to compute small and simple ontologies that capture the entailments of a given signature, similar to [16].
In the concept syntax, A ∈ N_C, r ∈ N_R ∪ {∇}, and n ∈ ℕ with n ≥ 1, where ∇ denotes the universal role. A knowledge base (KB) is a finite set of concept inclusions (CIs) of the form C ⊑ D, role inclusions (RIs) of the form r ⊑ s, and assertions of the forms C(a) and r(a, b), where C, D are concepts, r, s ∈ N_R and a, b ∈ N_I. CIs, RIs and assertions are collectively called axioms. A KB without assertions is called an ontology or TBox.