LTCS–Report A Goal-Oriented Algorithm for Unification in ELH_{R+} w.r.t. Cycle-Restricted Ontologies

Unification in Description Logics (DLs) has been proposed as an inference service that can, for example, be used to detect redundancies in ontologies. For the DL EL, which is used to define several large biomedical ontologies, unification is NP-complete. A goal-oriented NP unification algorithm for EL that uses nondeterministic rules to transform a given unification problem into solved form has recently been presented. In this report, we extend this goal-oriented algorithm in two directions: on the one hand, we add general concept inclusion axioms (GCIs), and on the other hand, we add role hierarchies (H) and transitive roles (R+). For the algorithm to be complete, however, the ontology consisting of the GCIs and role axioms needs to satisfy a certain cycle restriction.


Introduction
The DL EL, which offers the constructors conjunction (⊓), existential restriction (∃r.C), and the top concept (⊤), has recently drawn considerable attention since, on the one hand, important inference problems such as the subsumption problem are polynomial in EL, even in the presence of general concept inclusions (GCIs) [11]. On the other hand, though quite inexpressive, EL can be used to define biomedical ontologies, such as the large medical ontology SNOMED CT. A tractable extension of EL [6], which includes role hierarchy and transitivity axioms, is the basis of the OWL 2 EL profile of the new Web Ontology Language OWL 2.
Unification in DLs has been proposed in [10] as a novel inference service that can, for instance, be used to detect redundancies in ontologies. For example, assume that one developer of a medical ontology defines the concept of a patient with severe injury of the frontal lobe as
∃finding.(Frontal_lobe_injury ⊓ ∃severity.Severe), (1)
whereas another one represents it as
∃finding.(Severe_injury ⊓ ∃finding_site.∃part_of.Frontal_lobe). (2)
These two concept descriptions are not equivalent, but they are nevertheless meant to represent the same concept. They can obviously be made equivalent by treating the concept names Frontal_lobe_injury and Severe_injury as variables, and substituting the first one by Injury ⊓ ∃finding_site.∃part_of.Frontal_lobe and the second one by Injury ⊓ ∃severity.Severe. In this case, we say that the descriptions are unifiable, and call the substitution that makes them equivalent a unifier.
Our interest in unification w.r.t. GCIs, role hierarchies, and transitive roles stems from the fact that these features are important for expressing medical knowledge. For example, assume that the developers use the descriptions (3) and (4) instead of (1) and (2):
∃finding.∃finding_site.∃part_of.Brain ⊓ ∃finding.(Frontal_lobe_injury ⊓ ∃severity.Severe) (3)
∃status.Emergency ⊓ ∃finding.(Severe_injury ⊓ ∃finding_site.∃part_of.Frontal_lobe) (4)
The descriptions (3) and (4) are not unifiable without additional background knowledge, but they are unifiable, with the same unifier as above, if the GCIs
∃finding.∃severity.Severe ⊑ ∃status.Emergency,
Frontal_lobe ⊑ ∃proper_part_of.Brain
are present in a background ontology and this ontology additionally states that part_of is transitive and proper_part_of is a subrole of part_of.
In [7], we were able to show that unification in the DL EL (without GCIs and role axioms) is NP-complete. In addition to a brute-force "guess and then test" NP-algorithm [7], we have developed a goal-oriented unification algorithm for EL, in which nondeterministic decisions are only made if they are triggered by "unsolved parts" of the unification problem [9], and an algorithm that is based on a reduction to satisfiability in propositional logic (SAT) [8], which enables the use of highly-optimized SAT solvers [13]. Whereas both approaches are clearly better than the brute-force algorithm, neither of them is uniformly better than the other. First experiments with our system UEL [1] show that the SAT translation is usually faster in deciding unifiability, but it needs more space than the goal-oriented algorithm and it produces more uninteresting and large unifiers. In fact, the SAT translation generates all so-called local unifiers, whereas the goal-oriented algorithm produces all so-called minimal unifiers, though it may also produce some non-minimal ones. The set of minimal unifiers is a subset of the set of local unifiers, and in our experiments the minimal unifiers usually made more sense in the application.
In [9] it was shown that the approaches for unification of EL-concept descriptions (without any background ontology) mentioned above can easily be extended to the case of a so-called acyclic TBox (a simple form of GCIs, which basically introduce abbreviations for concept descriptions) as background ontology without really changing the algorithms or increasing their complexity. For more general GCIs, such a simple solution is no longer possible. In [3], we extended the brute-force "guess and then test" NP-algorithm from [7] to the case of GCIs, which required the development of a new characterization of subsumption w.r.t. GCIs in EL. Unfortunately, the algorithm is complete only for general TBoxes (i.e., finite sets of GCIs) that satisfy a certain restriction on cycles, which, however, does not prevent all cycles. For example, the cyclic GCI ∃child.Human ⊑ Human satisfies this restriction, whereas the cyclic GCI Human ⊑ ∃parent.Human does not. In [4] we provide a more practical unification algorithm that is based on a translation into SAT, and can also deal with role hierarchies and transitive roles, but still needs the ontology (now consisting of GCIs and role axioms) to be cycle-restricted. In the presence of role hierarchies (H) and transitive roles (R+), we use the name ELH_{R+} rather than EL for the logic.
Motivated by our experience that, for the case of EL without background ontology, the goal-oriented algorithm sometimes behaves better than the one based on a translation into SAT, we introduce a goal-oriented algorithm for unification in ELH_{R+} w.r.t. cycle-restricted ontologies. In a previous report [2], we have described a specialized version of the algorithm that only deals with cycle-restricted EL-ontologies.

The Description Logics EL and ELH_{R+}
The expressiveness of a DL is determined both by the formalism for describing concepts (the concept description language) and the terminological formalism, which allows one to state additional constraints on the interpretation of concepts and roles in a so-called ontology.
The concept description language considered in this report is called EL. Starting with a finite set N_C of concept names and a finite set N_R of role names, EL-concept descriptions are built from concept names by the constructors conjunction (C ⊓ D), existential restriction (∃r.C for every r ∈ N_R), and top (⊤). Since we only consider EL-concept descriptions, we will sometimes dispense with the prefix EL.
On the semantic side, concept descriptions are interpreted as sets. To be more precise, an interpretation I = (∆^I, ·^I) consists of a non-empty domain ∆^I and an interpretation function ·^I that maps concept names to subsets of ∆^I and role names to binary relations over ∆^I. This function is inductively extended to concept descriptions as shown in the semantics column of Table 1.
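The inductive extension of the interpretation function can be illustrated on a finite interpretation. The following is only a minimal sketch under an assumed encoding that is not part of the report: a concept description is a frozenset of its top-level atoms (the empty set stands for ⊤), and an atom is either a concept name (a string) or a pair (role, concept) standing for ∃role.concept. The function names `extension` and `satisfies_gci` are likewise hypothetical.

```python
# Assumed encoding (not from the report):
#   concept = frozenset of atoms, read as their conjunction; frozenset() is ⊤
#   atom    = concept name (str) or (role, concept) for ∃role.concept

def extension(concept, domain, names, roles):
    """Return the set of domain elements that belong to `concept`.

    `names` maps concept names to subsets of `domain`;
    `roles` maps role names to sets of pairs over `domain`.
    """
    result = set(domain)                      # ⊤ is interpreted as the whole domain
    for atom in concept:                      # conjunction is interpreted as intersection
        if isinstance(atom, str):
            result &= names.get(atom, set())
        else:
            role, filler = atom
            targets = extension(filler, domain, names, roles)
            # x belongs to ∃role.filler iff x has a role-successor in the filler
            result &= {x for (x, y) in roles.get(role, set()) if y in targets}
    return result


def satisfies_gci(lhs, rhs, domain, names, roles):
    """An interpretation satisfies C ⊑ D iff the extension of C is a subset of that of D."""
    return extension(lhs, domain, names, roles) <= extension(rhs, domain, names, roles)
```

In a domain {1, 2, 3} where 1 and 2 are Human and child relates 1 to 2 and 3 to 1, the description ∃child.Human is interpreted as {1, 3}, so this particular interpretation does not satisfy the GCI ∃child.Human ⊑ Human.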
A general concept inclusion axiom (GCI) is of the form C ⊑ D for concept descriptions C, D, a role hierarchy axiom is of the form r ⊑ s for role names r, s, and a transitivity axiom is of the form r ∘ r ⊑ r for a role name r. An interpretation I satisfies such an axiom if the corresponding condition in the semantics column of Table 1 holds, where ∘ in this column stands for composition of binary relations. An ELH_{R+}-ontology is a finite set of such axioms. It is an EL-ontology if it contains only GCIs. An interpretation is a model of an ontology if it satisfies all its axioms.
An EL-concept description is an atom if it is an existential restriction or a concept name. The atoms of an EL-concept description C are the subdescriptions of C that are atoms, and the top-level atoms of C are the atoms occurring in the top-level conjunction of C. Obviously, any EL-concept description is the conjunction of its top-level atoms, where the empty conjunction corresponds to ⊤. The atoms of an ELH_{R+}-ontology O are the atoms of all the concept descriptions occurring in GCIs of O.

Table 1: Semantics of the concept constructors and axioms.
An atom is called flat if it is a concept name or an existential restriction of the form ∃r.A for a concept name A. A GCI is called flat if it is of the form A ⊓ B ⊑ C, where A, B are flat atoms or ⊤ and C is a flat atom. An ELH_{R+}-ontology is called flat if all its GCIs are flat. Every ELH_{R+}-ontology can be transformed in polynomial time into an equivalent flat ontology (see [5] for details).
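The notions of atoms, top-level atoms, and flat atoms can be made concrete with a small sketch. It assumes a hypothetical encoding that is not part of the report: a concept description is a frozenset of its top-level atoms (the empty set is ⊤), and an atom is a concept name (a string) or a pair (role, concept) for ∃role.concept.

```python
# Assumed encoding (not from the report):
#   concept = frozenset of atoms, read as their conjunction; frozenset() is ⊤
#   atom    = concept name (str) or (role, concept) for ∃role.concept

def top_level_atoms(concept):
    """The atoms occurring in the top-level conjunction of `concept`."""
    return set(concept)


def atoms(concept):
    """All subdescriptions of `concept` that are atoms."""
    found = set()
    for atom in concept:
        found.add(atom)
        if not isinstance(atom, str):          # descend into the filler of ∃r.C
            found |= atoms(atom[1])
    return found


def is_flat_atom(atom):
    """A flat atom is a concept name or ∃r.A for a concept name A."""
    if isinstance(atom, str):
        return True
    _, filler = atom
    return len(filler) == 1 and all(isinstance(a, str) for a in filler)
```

For description (1) from the introduction, the only top-level atom is the existential restriction itself, while its atoms additionally include Frontal_lobe_injury, ∃severity.Severe, and Severe; of these, only the outer restriction is not flat.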

Subsumption in ELH_{R+}
All previous unification algorithms for EL depend on a structural characterization of subsumption. While [7] provides one for the case of an empty background ontology, in [3] a more general characterization is used that accounts for GCIs. This is proved using a Gentzen-style proof calculus for subsumption. We now extend this calculus to prove a similar characterization for arbitrary ELH_{R+}-ontologies.

Proving Subsumptions by Inference Rules
In [2] we defined a system of inference rules that characterize subsumption in EL modulo a flat EL-ontology O. More precisely, they define a binary relation ⊢_O on all concept descriptions such that C ⊢_O D iff C ⊑_O D. For details about this relation, we refer to [2]. We will describe here only the changes that are necessary to generalize this approach to ELH_{R+}-ontologies.
We only need to modify the inference rule that deals with existential restrictions:
Let r ⊑ s ∈ O and (E, F) ∈ r^I, i.e., there is a proof tree T for E ⊢_O ∃r.F. Then the following is a proof tree for E ⊢_O ∃s.F: This shows that r^I ⊆ s^I, i.e., I satisfies this role hierarchy axiom.
For each transitive role r, we have to show that r^I ∘ r^I ⊆ r^I holds. Let (E_1, E_2) ∈ r^I and (E_2, E_3) ∈ r^I, i.e., there are proof trees T_1 for E_1 ⊢_O ∃r.E_2 and T_2 for E_2 ⊢_O ∃r.E_3. Then the following is a proof tree for E_1 ⊢_O ∃r.E_3: To conclude the proof, we notice that C ∈ C^I, since C ⊢_O C holds by rule (R_3). On the other hand, we assumed that C ⊢_O D does not hold, which implies C ∉ D^I, and thus C^I ⊈ D^I.

A Structural Characterization of Subsumption
Based on this proof-theoretic characterization of subsumption, we will now generalize a structural characterization of subsumption from [3] to ELH_{R+}-ontologies.
We say that a subsumption between two atoms is structural if their top-level structure is compatible. To be more precise, following [4] we define structural subsumption between atoms as follows: the atom C is structurally subsumed by the atom D w.r.t. O (C ⊑^s_O D) iff one of the following holds: [5]). The following lemma extends Lemma 6 from [2] and is an important foundation for the algorithm we will present later.
Proof. If one of the alternatives 1. or 2. holds for every atom D_j, then clearly the claimed subsumption relationship holds. In [2], the other direction was first reduced to the case of a flat EL-ontology. This reduction also works in the same way for ELH_{R+}-ontologies, which is why we can reuse most of the proof of Lemma 6 from [2].
We prove by induction on the height of T that for every atom D_j one of the alternatives 1. or 2. holds. We consider the rule applied at the root of T. If this rule is one of the original rules, we can show the claim using the same arguments as in [2] since they only depend on reflexivity and transitivity of ⊑^s_O and the GCIs in O, but not on any other properties derived from O. Hence we consider only proof trees that end with the application of one of our new inference rules (R_8) or (R_8).

Unification in ELH_{R+}
We partition the set N_C into a set N_v of concept variables (which may be replaced by substitutions) and a set N_c of concept constants (which must not be replaced by substitutions). A substitution σ maps every concept variable to an EL-concept description. It is extended to concept descriptions in the usual way: σ(A) := A for all A ∈ N_c ∪ {⊤}, σ(C ⊓ D) := σ(C) ⊓ σ(D), and σ(∃r.C) := ∃r.σ(C). An EL-concept description C is ground if it does not contain variables. Obviously, a ground concept description is not modified by applying a substitution. An ELH_{R+}-ontology is ground if it does not contain variables.
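Applying a substitution can be sketched in a few lines. The encoding is an assumption made for illustration only: a concept description is a frozenset of its top-level atoms (the empty set is ⊤), an atom is a concept name (a string) or a pair (role, concept) for ∃role.concept, and a substitution is a dictionary from variable names to concept descriptions. The names `apply_substitution` and `is_ground` are hypothetical.

```python
# Assumed encoding (not from the report):
#   concept      = frozenset of atoms, read as their conjunction; frozenset() is ⊤
#   atom         = concept name (str) or (role, concept) for ∃role.concept
#   substitution = dict mapping variable names to concepts

def apply_substitution(sigma, concept):
    """Replace every variable occurring in `concept` by its image under `sigma`."""
    result = set()
    for atom in concept:
        if isinstance(atom, str):
            if atom in sigma:
                result |= sigma[atom]          # the image is conjoined into the result
            else:
                result.add(atom)               # concept constants stay unchanged
        else:
            role, filler = atom
            result.add((role, apply_substitution(sigma, filler)))
    return frozenset(result)


def is_ground(concept, variables):
    """A concept description is ground if it contains no variables."""
    for atom in concept:
        if isinstance(atom, str):
            if atom in variables:
                return False
        elif not is_ground(atom[1], variables):
            return False
    return True
```

With this encoding, applying the substitution from the introduction to descriptions (1) and (2) yields syntactically identical conjunctions, because the order of conjuncts is ignored by the set representation.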
We say that Γ is unifiable w.r.t. O if it has a unifier.
Note that some of the previous papers on unification in DLs use equivalences C ≡^? D instead of subsumptions C ⊑^? D. This difference is, however, irrelevant since C ≡^? D can be seen as a shorthand for the two subsumptions C ⊑^? D and D ⊑^? C, and C ⊑^? D has the same unifiers as C ⊓ D ≡^? C. Also note that we have restricted the background ontology O to be ground. This is not without loss of generality. If O contained variables, then we would need to apply the substitution also to its GCIs, and instead of requiring σ(C_i) ⊑_O σ(D_i) we would thus need to require σ(C_i) ⊑_{σ(O)} σ(D_i), which would change the nature of the problem considerably (see [5] for a more detailed discussion).
As mentioned in the introduction, the unification algorithm we will present in Section 5 is complete only for ELH_{R+}-ontologies that satisfy a certain restriction on cycles.
In [5] we show that a given ELH_{R+}-ontology can be tested for cycle-restrictedness in polynomial time. The main idea is that it is sufficient to consider the cases where C is a concept name or ⊤.
To simplify the description of the algorithm, it is convenient to first flatten the ontology and the unification problem. The unification problem Γ is called flat if it contains only flat subsumptions of the form C_1 ⊓ ... ⊓ C_n ⊑^? D, where n ≥ 0 and C_1, ..., C_n, D are flat atoms. Let Γ be a unification problem and O an ELH_{R+}-ontology. By introducing auxiliary variables and concept names, respectively, Γ and O can be transformed in polynomial time into a flat unification problem Γ' and a flat ELH_{R+}-ontology O' such that the unifiability status remains unchanged, i.e., Γ has a unifier w.r.t. O iff Γ' has a unifier w.r.t. O'. In addition, if O was cycle-restricted, then so is O' (see [5] for details). Thus, we can assume without loss of generality that the input unification problem and ontology are flat.

Local Unifiers
The main idea underlying the "in NP" results in [7,3] is to show that any unification problem that is unifiable has a so-called local unifier.
We denote by At the set of atoms occurring as subdescriptions in subsumptions in Γ or axioms in O and define Furthermore, we define the set of non-variable atoms by At_nv := At_tr \ N_v. Though the elements of At_nv cannot be variables, they may contain variables if they are of the form ∃r.X for some role r and a variable X.
We call a function S that associates every variable X ∈ N_v with a set S_X ⊆ At_nv an assignment. Such an assignment induces the following relation >_S on N_v: >_S is the transitive closure of {(X, Y) ∈ N_v × N_v | Y occurs in an element of S_X}. We call the assignment S acyclic if >_S is irreflexive (and thus a strict partial order). Any acyclic assignment S induces a unique substitution σ_S, which can be defined by induction along >_S:
• If X ∈ N_v is minimal w.r.t. >_S, then we define σ_S(X) := ⊓_{D ∈ S_X} D.
• Assume that σ_S(Y) is already defined for all Y such that X >_S Y. Then we define σ_S(X) := ⊓_{D ∈ S_X} σ_S(D).
We call a substitution σ local if it is of this form, i.e., if there is an acyclic assignment S such that σ = σ_S. If the unifier σ of Γ w.r.t. O is a local substitution, then we call it a local unifier of Γ w.r.t. O.
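The construction of the substitution induced by an acyclic assignment can be sketched as follows. This is an illustration under assumptions that are not part of the report: concept descriptions are frozensets of top-level atoms, atoms are strings or pairs (role, concept), an assignment is a dictionary whose keys are exactly the variables, and X depends on Y whenever Y occurs in an element of S_X. The function names are hypothetical.

```python
# Assumed encoding (not from the report):
#   concept    = frozenset of atoms; atom = str or (role, concept)
#   assignment = dict mapping every variable X to its set S_X of non-variable atoms

def substitute(concept, sigma):
    """Apply the (partial) substitution `sigma` to `concept`."""
    out = set()
    for atom in concept:
        if isinstance(atom, str):
            out |= sigma.get(atom, frozenset({atom}))
        else:
            out.add((atom[0], substitute(atom[1], sigma)))
    return frozenset(out)


def variables_in(concept, variables):
    """The variables occurring anywhere inside `concept`."""
    found = set()
    for atom in concept:
        if isinstance(atom, str):
            if atom in variables:
                found.add(atom)
        else:
            found |= variables_in(atom[1], variables)
    return found


def induced_substitution(assignment):
    """Return the substitution induced by `assignment`, or None if it is cyclic."""
    sigma = {}
    pending = set(assignment)
    while pending:
        # A variable is ready once all variables it depends on have an image.
        ready = [x for x in pending
                 if variables_in(frozenset(assignment[x]), assignment) <= set(sigma)]
        if not ready:
            return None                       # the dependency relation has a cycle
        for x in ready:
            sigma[x] = substitute(frozenset(assignment[x]), sigma)
            pending.remove(x)
    return sigma
```

For the assignment S_X = {A, ∃r.Y}, S_Y = {B}, the sketch first defines the image of the minimal variable Y and then obtains A ⊓ ∃r.B for X; an assignment with S_X = {∃r.X} is rejected as cyclic.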
The main technical result shown in [3] is that any unifiable EL-unification problem w.r.t. a cycle-restricted ontology has a local unifier. This yields the following brute-force unification algorithm for EL w.r.t. cycle-restricted ontologies: first guess an acyclic assignment S, and then check whether the induced local substitution σ_S solves Γ. As shown in [3], this algorithm runs in nondeterministic polynomial time. NP-hardness follows from the fact that already unification in EL w.r.t. the empty ontology is NP-hard [7]. In [3] it is also shown why cycle-restrictedness is needed: there is a non-cycle-restricted EL-ontology O and an EL-unification problem Γ such that Γ has a unifier w.r.t. O, but it does not have a local unifier.
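The "guess and then test" idea can be illustrated for the special case of the empty ontology, where subsumption between EL-concept descriptions can be checked structurally: C ⊑ D holds iff every top-level atom of D is matched by a top-level atom of C. The sketch below is a simplification and not the algorithm of [3]: it replaces the nondeterministic guess by a deterministic (exponential) enumeration, ignores GCIs and role axioms entirely, and uses an assumed encoding (concepts as frozensets of atoms, atoms as strings or pairs (role, concept)). All function names are hypothetical.

```python
from itertools import combinations, product

def subsumed(c, d):
    """C ⊑ D w.r.t. the empty ontology (structural check on top-level atoms)."""
    return all(any(atom_subsumed(a, b) for a in c) for b in d)

def atom_subsumed(a, b):
    if isinstance(a, str) or isinstance(b, str):
        return a == b                          # concept names must coincide
    return a[0] == b[0] and subsumed(a[1], b[1])

def substitute(concept, sigma):
    out = set()
    for atom in concept:
        if isinstance(atom, str):
            out |= sigma.get(atom, frozenset({atom}))
        else:
            out.add((atom[0], substitute(atom[1], sigma)))
    return frozenset(out)

def variables_in(concept, variables):
    found = set()
    for atom in concept:
        if isinstance(atom, str):
            if atom in variables:
                found.add(atom)
        else:
            found |= variables_in(atom[1], variables)
    return found

def induced(assignment):
    """Substitution induced by `assignment`, or None if the assignment is cyclic."""
    sigma, pending = {}, set(assignment)
    while pending:
        ready = [x for x in pending
                 if variables_in(frozenset(assignment[x]), assignment) <= set(sigma)]
        if not ready:
            return None
        for x in ready:
            sigma[x] = substitute(frozenset(assignment[x]), sigma)
            pending.remove(x)
    return sigma

def brute_force_unifier(problem, variables, candidates):
    """Enumerate assignments over `candidates` and test the induced substitution.

    `problem` is a list of pairs (C, D), each standing for a subsumption C ⊑? D.
    """
    subsets = [frozenset(s) for n in range(len(candidates) + 1)
               for s in combinations(candidates, n)]
    for choice in product(subsets, repeat=len(variables)):
        sigma = induced(dict(zip(variables, choice)))
        if sigma is not None and all(
                subsumed(substitute(c, sigma), substitute(d, sigma))
                for c, d in problem):
            return sigma
    return None
```

For the problem {X ⊑? A, ∃r.X ⊑? ∃r.B} with candidate atoms A and B, the enumeration rejects the assignments ∅, {A}, and {B} for X and succeeds with {A, B}.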

A Goal-Oriented Unification Algorithm
The brute-force algorithm is not practical since it blindly guesses an acyclic assignment and only afterwards checks whether the guessed assignment induces a unifier. In this section, we introduce a more goal-oriented unification algorithm, in which nondeterministic decisions are only made if they are triggered by "unsolved parts" of the unification problem. In addition, failure due to wrong guesses can be detected early. This goal-oriented algorithm generalizes the goal-oriented algorithm for unification in EL (without background ontology) introduced in [9], though the rules look quite different because here we consider unification problems that consist of subsumptions whereas in [9] we considered equivalences. It is more closely related to the algorithm presented in [2] for unification w.r.t. cycle-restricted EL-ontologies.
We assume that the cycle-restricted ELH_{R+}-ontology O and the unification problem Γ_0 are flat. Given O and Γ_0, the sets At, At_tr, and At_nv are defined as above.
Starting with Γ_0, the algorithm maintains a current unification problem Γ and a current acyclic assignment S, which initially assigns the empty set to all variables. For each subsumption in Γ it maintains the information on whether it is solved or not. Initially, all subsumptions of Γ_0 are unsolved, except those with a variable on the right-hand side. Rules are applied only to unsolved subsumptions. A (non-failing) rule application does the following:
• it solves exactly one unsolved subsumption,
• it may extend the current assignment S, and
• it may introduce new flat subsumptions built from elements of At_tr.
Each rule application that extends S additionally expands Γ w.r.t. X as follows: every subsumption s ∈ Γ of the form
Subsumptions are only added if they are not already present in Γ. If a new subsumption is added to Γ, either by a rule application or by expansion of Γ, then it is initially designated unsolved, except if it has a variable on the right-hand side. Once a subsumption is in Γ, it will not be removed. Likewise, if a subsumption in Γ is marked as solved, then it will not become unsolved later.
If a subsumption is marked as solved, this does not necessarily mean that it is indeed already solved by the substitution induced by the current assignment. Instead, it may be the case that the task of satisfying the subsumption was deferred to solving other subsumptions, which are "smaller" than the given subsumption in a certain well-defined sense. A subsumption whose right-hand side is a variable is always marked as solved since the task of solving it is deferred to solving the subsumptions introduced by expansion.
The rules of the algorithm consist of the three eager rules Eager Ground Solving, Eager Solving, and Eager Extension (see Figure 1), and several nondeterministic rules (see Figures 2 and 3). Eager rules are applied with higher priority than nondeterministic rules, and among the eager rules, Eager Ground Solving has the highest priority, then comes Eager Solving, and then Eager Extension.
Algorithm 5. Let Γ_0 be a flat EL-unification problem. We initialize Γ := Γ_0 and S_X := ∅ for all variables X ∈ N_v. While Γ contains an unsolved subsumption, do the following:
(1) Eager rule application: If some eager rules apply to an unsolved subsumption s in Γ, apply the one with the highest priority. If the rule application fails, return "not unifiable".
(2) Nondeterministic rule application: If no eager rule is applicable, let s be an unsolved subsumption in Γ. If one of the nondeterministic rules applies to s, choose one of these rules and apply it. If none of these rules apply to s or the rule application fails, then return "not unifiable".
(3) Eager application of Decomposition: If in the previous step one of the rules Mutation 2 or 3 was applied, do the following for all subsumptions s' added to Γ by this rule application: If one of the rules Decomposition 1 or 2 applies to s', nondeterministically choose one of the applicable decomposition rules and apply it to s'.
Once all subsumptions in Γ are solved, return the substitution σ induced by the current assignment.
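The control flow of steps (1) and (2) can be sketched as a small driver loop. This is only a skeleton under several assumptions that are not part of the report: rules are modelled by a hypothetical `Rule` record with an applicability test and an action, the action itself is responsible for marking the subsumption as solved and returns False on failure, the don't know nondeterministic choice is delegated to a caller-supplied `choose` function (a complete search would have to backtrack over its answers), and both step (3) and the expansion of Γ are omitted.

```python
from typing import Callable, NamedTuple

class Rule(NamedTuple):
    name: str
    applies: Callable   # (state, subsumption) -> bool
    apply: Callable     # (state, subsumption) -> bool; False signals failure


def find_eager(state, eager_rules):
    """Return the highest-priority eager rule applicable to some unsolved subsumption."""
    for rule in eager_rules:                   # eager_rules is ordered by priority
        for s in state["gamma"]:
            if s not in state["solved"] and rule.applies(state, s):
                return rule, s
    return None


def run(problem, eager_rules, nondet_rules, choose):
    """Return the final assignment, or None for "not unifiable" on this run."""
    state = {"gamma": list(problem), "solved": set(), "assignment": {}}
    while any(s not in state["solved"] for s in state["gamma"]):
        hit = find_eager(state, eager_rules)
        if hit is not None:                    # step (1): eager rule application
            rule, s = hit
        else:                                  # step (2): nondeterministic rule application
            s = next(s for s in state["gamma"] if s not in state["solved"])
            options = [r for r in nondet_rules if r.applies(state, s)]
            if not options:
                return None
            rule = choose(options)
        if not rule.apply(state, s):
            return None
    return state["assignment"]
```

The actual rules of Figures 1 to 3 would be plugged in as `Rule` instances; the loop terminates only if every successful action marks its subsumption as solved, which mirrors the requirement that each rule application solves exactly one unsolved subsumption.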

Eager Ground Solving:
Condition: This rule applies to s if s is ground.
Action: If the ground subsumption s does not hold w.r.t. O, the rule application fails. Otherwise, s is marked as solved.

Eager Solving:
Condition: This rule applies to s
• D is ground and ⊓G ⊑_O D holds, where G is the set of all ground atoms in {C_1, ..., C_n} ∪ ⋃_{X ∈ {C_1, ..., C_n} ∩ N_v} S_X.
Action: Its application marks s as solved.

Eager Extension:
Condition: This rule applies to
Action: Its application adds D to S_X. If this makes S cyclic, the rule application fails. Otherwise, Γ is expanded w.r.t. X and s is marked as solved.
In step (2), the choice which unsolved subsumption to consider next is don't care nondeterministic. However, choosing which rule to apply to the chosen subsumption is don't know nondeterministic. Additionally, the application of nondeterministic rules requires don't know nondeterministic guessing. This is an extension of the algorithm from [2] by more general nondeterministic rules and the addition of step (3). This additional step immediately applies all possible Decomposition rules after each application of Mutation 2 or 3. This ensures that in each rule application the generated subsumptions are "smaller" than the triggering subsumption in some well-defined sense, which will be used to prove soundness (see Section 5.1).
The eager rules are mainly there for optimization purposes, i.e., to avoid nondeterministic choices if a deterministic decision can be made. For example, a ground subsumption, as considered by Eager Ground Solving, either holds, in which case any substitution solves it, or it does not, in which case it does not have a solution. This condition can be checked in polynomial time using the subsumption algorithm for ELH_{R+} [6]. In the case considered by Eager Solving, the substitution induced by the current assignment already solves the subsumption. The Eager Extension rule solves a subsumption that contains only a variable X and some elements of S_X on the left-hand side. The rule is motivated by the following observation: for any assignment S' extending the current assignment, the induced substitution σ' satisfies σ'(X) ≡ σ'(C_1) ⊓ ... ⊓ σ'(C_n). Thus, if S'_X contains D, then σ'(X) ⊑ σ'(D), and σ' solves the subsumption. Conversely, if σ' solves the subsumption, then σ'(X) ⊑ σ'(D), and thus adding D to S'_X yields an equivalent induced substitution.
The nondeterministic rules only come into play if no eager rules can be applied. In order to solve an unsolved subsumption s = C_1 ⊓ ... ⊓ C_n ⊑^? D, we consider the two conditions of Lemma 2. Regarding the first condition, which is addressed by the rules Decomposition 1 and 2 and Extension, assume that γ is induced by an acyclic assignment S. To satisfy the first condition of the lemma with γ, the atom γ(D) must structurally subsume a top-level atom in γ(C_1) ⊓ ... ⊓ γ(C_n). This atom can either be of the form γ(C_i) for a non-variable atom C_i, or of the form γ(C) for C ∈ S_{C_i} and a variable C_i. In the second case, the atom C can either already be in S_{C_i} or it can be put into S_{C_i} by an application of the Extension rule. The two versions of Decomposition correspond to the cases (2) and (3) in the definition of structural subsumption.
The Mutation rules cover the second condition in Lemma 2. Each rule covers a different case: For example, Mutation 1 is applicable only to subsumptions with more than one conjunct on the left-hand side. It solves the subsumption by making sure that all parts of the second condition of Lemma 2 are satisfied. The rule guesses atoms A_1, ..., A_k and B such that A_1 ⊓ ... ⊓ A_k ⊑_O B holds. This can be checked using the polynomial-time subsumption algorithm for ELH_{R+}. Whenever the second condition of Lemma 2 requires a structural subsumption γ(E) ⊑^s_O γ(F) to hold for a (hypothetical) unifier γ of Γ, the rule creates the new subsumption E ⊑^? F, which has to be solved later on. This way, the rule ensures that the substitution built by the algorithm actually satisfies the conditions of the lemma. The other Mutation rules follow the same idea, but they consider cases where only a single atom occurs on the left-hand side of the subsumption to be solved. The reason for considering these cases separately is that in the proof of soundness we need the newly introduced subsumptions to be "smaller" than the subsumption that triggered their introduction. For Mutation 1 this is the case due to the smaller left-hand side (only one atom), whereas for the other mutation rules this is not so clear. Actually, for Mutation 2 and 3, the new subsumptions turn out to be smaller only after Decomposition is applied to them. Mutation 4 implicitly applies a form of decomposition.

Mutation 1:
Action: Its application chooses such atoms, marks s as solved, and generates the following subsumptions:
• it chooses for each η ∈ {1, ..., k} an i ∈ {1, ..., n} and adds the subsumption C_i ⊑^? A_η to Γ,
• it adds the subsumption B ⊑^? D to Γ.

Mutation 2:
Condition: This rule applies to s = ∃r.X ⊑^? D if X is a variable, D is ground, and there are atoms ∃r
In contrast to the algorithm presented in [2] for unification w.r.t. cycle-restricted EL-ontologies, this algorithm uses two modified Decomposition rules instead of only one to account for the generalized definition of structural subsumption. Additionally, the new algorithm separates the implicit application of the Decomposition rules in the rules Mutation 2 and 3 into a separate step in the algorithm. For Mutation 1 this is not necessary, while for Mutation 4 it is not possible since applying condition (3) of the definition of structural subsumption to the resulting subsumption could lead to non-termination (see the proof of Lemma 8 for details).

Soundness
The soundness of the unification algorithm in [2] is shown with the help of a measure assigned to all subsumptions generated during the process of computation. The measure is equipped with a well-founded order ≻, which is used for an induction argument. In order to show that the modified algorithm is also sound, we have to slightly modify the measure and show that it works for the new rules, i.e., the measure of the subsumptions obtained by applications of these rules is smaller than the measure of the unsolved subsumptions to which they were applied.
In the following, let S be the final assignment computed by Algorithm 5 on input Γ_0 and σ be the substitution induced by S. With Γ we denote the final set of subsumptions computed by this run, i.e., the original subsumptions of Γ_0 together with the new ones generated by rule applications.
• s is small if n = 1 and C_1 is ground or C_{n+1} is ground.
- m_2(s) := X if C_{n+1} = X or C_{n+1} = ∃r.X for a variable X and a role name r ∈ N_R, and m_2(s) := ⊥ otherwise.
- m_4(s) := rd(σ(C_{n+1}))
• The strict partial order ≻ on such tuples is the lexicographic order, where the first, third, and fourth components are compared w.r.t. the normal order > on natural numbers. The variables in the second component are compared w.r.t. the relation >_S induced by the acyclic assignment and ⊥ is smaller than any variable.
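The lexicographic comparison of such measures can be sketched as follows. The sketch only implements the order, not the components themselves; it assumes that a measure is given as a 4-tuple whose first, third, and fourth entries are natural numbers and whose second entry is a variable name or None (standing for ⊥), and that the relation >_S is supplied as a hypothetical callable `var_greater`.

```python
def greater(m, n, var_greater):
    """Return True iff the measure m is strictly greater than n.

    `m` and `n` are 4-tuples; the component at index 1 is a variable name or
    None (for ⊥).  `var_greater(x, y)` is assumed to decide x >_S y.
    """
    for index, (a, b) in enumerate(zip(m, n)):
        if a == b:
            continue                           # equal so far, look at the next component
        if index == 1:
            if b is None:                      # ⊥ is smaller than any variable
                return True
            if a is None:
                return False
            return var_greater(a, b)           # incomparable variables give False
        return a > b                           # natural numbers
    return False                               # equal tuples are not strictly ordered
```

Since >_S is only a partial order, two measures that first differ in incomparable variables are themselves incomparable, in which case the function returns False in both directions.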
As in the proof of Lemma 30 in [2], we use induction on this well-founded strict partial order to show that σ solves Γ.
Lemma 7. σ is a unifier of Γ w.r.t. O, and thus also of its subset Γ_0.
Proof. Let s ∈ Γ and assume that σ solves all subsumptions s' ∈ Γ with s' ≺ s.
There are several cases to consider depending on the way s was solved.
• If s has a non-variable atom on the right-hand side, then it was initially marked as unsolved and must have been solved by a successful rule application. The cases of the Eager rules and the Extension rule can be handled exactly as in [2] since the measure is not needed. We now consider the new nondeterministic rules.
- Decomposition 1: Then s is of the form C_1 ⊓ ... ⊓ C_n ⊑^? ∃s.D' with C_i = ∃r.C' and r ⊑_O s and we have s' = C' ⊑^? D' ∈ Γ. We will show that s ≻ s' holds. By induction, this implies that σ solves s', and by Lemma 2 thus also s. To compare m(s) and m(s'), we observe first that m_2(s) = m_2(s'). We now make a case distinction on m_1(s'). If s' is small, then s is either not small, i.e., m_1(s) > m_1(s'), or small and of the form ∃r.C' ⊑^? ∃s.D'. In the second case, we have m_1(s) = m_1(s') and m_3(s) > m_3(s'). If s' is not small, then both C' and D' are variables, and thus s is also not small, which yields m_1(s) = m_1(s'). Furthermore,
Thus, in all cases we have s ≻ s'.
Furthermore, for every η ∈ {1, ..., k} there is a subsumption s_η = C_i ⊑^? A_η ∈ Γ for some i ∈ {1, ..., n} and the subsumption s' = B ⊑^? D is also in Γ. Since s is not small and all the subsumptions s_1, ..., s_k, s' are small, they are smaller than s w.r.t. ≻. By induction, σ solves those small subsumptions, and thus we have
- Mutation 2: Then s is of the form ∃r.X ⊑^? D, D is ground, and ∃r_1.A
• The case that s has a variable as its right-hand side can again be handled exactly as in the proof of Lemma 30 in [2].

Completeness
Assume that Γ_0 is unifiable w.r.t. O and let γ be a ground unifier of Γ_0 w.r.t. O. As in [2], we can use this unifier to find a successful computation path of Algorithm 5 on Γ_0 such that the following invariants are preserved by the successive rule applications for the current set of subsumptions Γ and the current assignment S:
(I) γ is a unifier of Γ.
(II) For all B ∈ S_X, we have γ(X) ⊑_O γ(B).
In [2], it is shown that these invariants are maintained by expanding Γ and by applying eager rules. We also know that invariant (II) guarantees that the current assignment is always acyclic. We now reprove a variant of Lemma 34 from [2] that deals with the modified nondeterministic rules.
Lemma 8. Let s be an unsolved subsumption of Γ to which no eager rule applies. Then there is a nondeterministic rule that can be successfully applied to s while maintaining the invariants.
Proof. s must be of the form C_1 ⊓ ... ⊓ C_n ⊑^? D. By Lemma 2, one of the following alternatives holds:
1. There is an index i ∈ {1, ..., n} such that E ⊑^s_O γ(D) for a top-level atom E of γ(C_i). We consider the following cases for C_i, which can obviously not be ⊤.
• If C_i is a concept name, then C_i = E = D and Eager Solving is applicable to s, which contradicts the assumption.
and B ⊑^s_O γ(D). If n > 1, we can apply Mutation 1 in such a way that all created subsumptions are solved by γ. The possible subsequent applications of Decomposition rules again preserve the invariants and cannot fail.
If n = 1, we distinguish several cases for C_1 and D:
• If both C_1 and D are ground, then the Eager Ground Solving rule is applicable to s, which contradicts our assumption.
• If C_1 = X is a variable, then the Eager Extension rule is applicable to s, which again contradicts our assumption.
• If C_1 = ∃r.X for a variable X, then we have A_η = ∃r_η.A'_η with r ⊑_O r_η and γ(∃r.X) ⊑_O A_η for every η ∈ {1, ..., k}. Thus, we can add the subsumptions In this case, we can apply Mutation 3 to s while preserving the invariants.
• If C_1 is ground and D = ∃s.Y for a variable Y, then we have
To summarize, we have shown that for any unifiable input problem Γ_0 there is a non-failing run of Algorithm 5 on Γ_0 during which the invariants (I) and (II) are satisfied. Together with the fact that any run of the algorithm terminates (see below), this shows completeness, i.e., whenever Γ_0 has a unifier w.r.t. O, the algorithm computes one.

Termination and Complexity
Similar termination arguments as in [2] also hold for the algorithm with the new nondeterministic rules.The only remark we have to make concerns the number and the form of the subsumptions created in the course of a computation.These subsumptions can have one of the following forms: 1. subsumptions from Γ 0 ; Hence for the modified algorithm, the set of atoms appearing in the subsumptions is larger since the right-hand sides can be from At tr and not only from At.But the increase is only polynomial in the size of the input and therefore there are only polynomially many subsumptions of the above forms.
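Hence, since each rule application solves one subsumption and each such application takes only polynomial time, every run of the modified algorithm terminates in time polynomial in the size of the input $\Gamma_0$ and $O$. Because the rules are applied nondeterministically, this yields a nondeterministic polynomial-time procedure. Together with soundness and completeness, we obtain the following result.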
Theorem 9. Algorithm 5 is an NP-decision procedure for unifiability in $\mathcal{ELH}_{R^+}$ w.r.t. cycle-restricted ontologies.

Conclusions
We have presented a goal-oriented NP-algorithm for unification in $\mathcal{ELH}_{R^+}$ w.r.t. cycle-restricted ontologies. In [4], we have developed a reduction of this problem to SAT, which is based on a characterization of subsumption different from the one in Lemma 2. Though clearly better than the brute-force algorithm introduced in [3], both algorithms suffer from a high degree of nondeterminism due to having to guess true subsumptions between concepts built from atoms of the background cycle-restricted ontology. We must find optimizations to tackle this problem before an implementation becomes feasible.
On the theoretical side, the main topic for future research is to consider unification w.r.t. unrestricted $\mathcal{ELH}_{R^+}$-ontologies. In order to generalize the brute-force algorithm in this direction, we need to find a more general notion of locality.
Starting with the goal-oriented algorithm, one idea could be not to fail when a cyclic assignment is generated, but rather to add rules that can break such cycles, similar to what is done in procedures for general E-unification [14].
Another idea could be to use just the rules of our goal-oriented algorithm, and not fail when a cyclic assignment $S$ is generated. Our conjecture is that then the background ontology $O$ together with the cyclic TBox $T_S := \{X \equiv \bigsqcap_{C \in S_X} C \mid X \in N_v\}$ induced by $S$ satisfies $C \sqsubseteq_{O \cup T_S} D$ for all subsumptions $C \sqsubseteq^? D$ in $\Gamma_0$ if an appropriate hybrid semantics [12] for the combined ontology $O \cup T_S$ is used.
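To make the notions of a cyclic assignment and its induced TBox concrete, here is a minimal Python sketch. It assumes that a variable $X$ depends on every variable occurring in an atom of $S_X$ and that $S$ is cyclic if some variable depends on itself through a chain of such steps. The encoding of atoms as strings or `("some", r, filler)` tuples and the names `is_cyclic` and `induced_tbox` are hypothetical choices for this illustration only.

```python
def variables_in(atom, variables):
    """Variables occurring in a flat atom: a name or ("some", r, name)."""
    name = atom[2] if isinstance(atom, tuple) else atom
    return {name} & variables


def is_cyclic(assignment):
    """True if some variable depends on itself via the assignment S.

    `assignment` maps each variable X to the list of atoms in S_X
    (assumed encoding, not the report's notation).
    """
    variables = set(assignment)
    depends = {
        x: set().union(*(variables_in(a, variables) for a in atoms))
        for x, atoms in assignment.items()
    }

    def reaches(start, current, seen):
        # Depth-first search for a dependency path leading back to `start`.
        for y in depends[current]:
            if y == start:
                return True
            if y not in seen:
                seen.add(y)
                if reaches(start, y, seen):
                    return True
        return False

    return any(reaches(x, x, set()) for x in variables)


def show(atom):
    """Render an atom as text."""
    return f"∃{atom[1]}.{atom[2]}" if isinstance(atom, tuple) else atom


def induced_tbox(assignment):
    """One definition X ≡ (conjunction of S_X) per variable; ⊤ if S_X is empty."""
    return [
        f"{x} ≡ " + (" ⊓ ".join(show(a) for a in atoms) or "⊤")
        for x, atoms in assignment.items()
    ]
```

For instance, the assignment with $S_X = \{\exists r.Y, A\}$ and $S_Y = \{\exists s.X\}$ is cyclic under this reading, and its induced TBox consists of the two definitions $X \equiv \exists r.Y \sqcap A$ and $Y \equiv \exists s.X$.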
All the results on unification in Description Logics mentioned here are restricted to relatively inexpressive logics that do not support all Boolean operators.If we close EL under negation, then we obtain the DL ALC, which corresponds to the modal logic K [15].Whether unification in K is decidable is a long-standing open problem.It is only known that relatively minor extensions of K have an undecidable unification problem [16].

Closure under existential restriction (R8): For all EL-concept descriptions $C, D$ and each $r \in N_R$,
$$\frac{C \sqsubseteq_O D}{\exists r.C \sqsubseteq_O \exists r.D}$$
We replace it by the following two rules:
Role hierarchy: For all EL-concept descriptions $C, D$ and all role names $r, s$ with $r \sqsubseteq s \in O$,
$$\frac{C \sqsubseteq_O D}{\exists r.C \sqsubseteq_O \exists s.D}$$
Role transitivity: For all EL-concept descriptions $C, D$ and each transitive role $r$,
$$\frac{C \sqsubseteq_O \exists r.D}{\exists r.C \sqsubseteq_O \exists r.D}$$
Notice that if $r = s$, then the role hierarchy rule is exactly (R8). The following lemma is an extension of Lemma 9 from [2].

Lemma 1. Let $O$ be a flat $\mathcal{ELH}_{R^+}$-ontology and $C, D$ be two EL-concept descriptions. Then $C \sqsubseteq_O D$ holds iff $C \sqsubseteq_O D$ is derivable.

Proof. It is easy to verify that the two new inference rules are sound, i.e., we have $C \sqsubseteq_O D$ whenever we can derive $C \sqsubseteq_O D$ using these rules. If $C \sqsubseteq_O D$ is not derivable, we can show that $C^I \not\subseteq D^I$ holds in the following canonical model $I$ of $O$. The domain of $I$ is the set $\mathcal{C}$ of all EL-concept descriptions built over $N_C$ and $N_R$. For every concept name $A$, we define $A^I := \{E \in \mathcal{C} \mid E \sqsubseteq_O A \text{ is derivable}\}$ and for every role name $r$, we set $r^I := \{(E, F) \in \mathcal{C}^2 \mid E \sqsubseteq_O \exists r.F \text{ is derivable}\}$. With exactly the same arguments as in [2], we can show that $(C')^I = \{E \in \mathcal{C} \mid E \sqsubseteq_O C' \text{ is derivable}\}$ holds for each concept description $C'$. There it was also shown that $I$ is a model of all GCIs in $O$. It remains to verify that it also satisfies the role axioms.
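and there is a proof tree for $C' \sqsubseteq_O D'$. By Lemma 1, we have $C' \sqsubseteq_O D'$. Since $r \trianglelefteq_O s$, we have $C_1 \sqsubseteq^s_O D_1$ by definition of $\sqsubseteq^s_O$, i.e., the first alternative holds for $D_1$. If the role transitivity rule has been applied, then $n = m = 1$, $C_1 = \exists r.C'$, $D_1 = \exists r.D'$, $r \circ r \sqsubseteq r \in O$, and there is a proof tree for $C' \sqsubseteq_O \exists r.D'$. Again, Lemma 1 shows that $C' \sqsubseteq_O \exists r.D'$, and thus we have $C_1 \sqsubseteq^s_O D_1$ since $r \trianglelefteq_O r \trianglelefteq_O r$, i.e., the first alternative holds for $D_1$.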

Figure 2: The nondeterministic rules Decomposition 1 and 2 and Extension.

Decomposition 2: Then $s$ is of the form $C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? \exists s.D'$ with $C_i = \exists r.C'$ and we have $s' = C' \sqsubseteq^? \exists t.D' \in \Gamma$ for some transitive role $t$ with $r \trianglelefteq_O t \trianglelefteq_O s$. Again, $s \succ s'$ can be shown in exactly the same way as in the case of Decomposition 1.
Mutation 1: Then $s$ is of the form $C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? D$ with $n > 1$ and there are atoms $A_1, \dots$

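2. subsumptions created by expansion of $\Gamma_0$: these are of the form $C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? A$ for a subsumption $C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? X \in \Gamma_0$ and $A \in \mathrm{At}_{nv}$;
3. subsumptions of the form $C \sqsubseteq^? D$, where $C \in \mathrm{At}$, $D \in \mathrm{At}_{tr}$.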

Table 1 :
[11,6] and semantics of EL.
A concept description $C$ is subsumed by a concept description $D$ w.r.t. an ontology $O$ (written $C \sqsubseteq_O D$) if every model of $O$ satisfies the GCI $C \sqsubseteq D$. We say that $C$ is equivalent to $D$ w.r.t. $O$ ($C \equiv_O D$) if $C \sqsubseteq_O D$ and $D \sqsubseteq_O C$. If $O$ is empty, we also write $C \sqsubseteq D$ and $C \equiv D$ instead of $C \sqsubseteq_O D$ and $C \equiv_O D$, respectively. As shown in [11,6], subsumption w.r.t. $\mathcal{ELH}_{R^+}$-ontologies (and thus also w.r.t. EL-ontologies) is decidable in polynomial time.

the role hierarchy can be computed in polynomial time in the size of $O$. It is easy to see that $r \trianglelefteq_O s$ implies that $r^I \subseteq s^I$ for all models $I$ of $O$.
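As a rough illustration of why this computation is cheap, the following Python sketch builds the role hierarchy under the assumption that it is the reflexive-transitive closure of the role inclusions stated in the ontology. The function name `role_hierarchy` and the encoding of an inclusion as a pair `(r, s)` are choices made for this example, not notation from the report.

```python
from itertools import product


def role_hierarchy(role_names, inclusions):
    """Return all pairs (r, s) such that r lies below s in the hierarchy.

    Assumed definition: the reflexive-transitive closure of the stated
    role inclusions, each given as a pair (r, s) of role names.
    """
    below = {(r, r) for r in role_names} | set(inclusions)
    # Warshall-style closure: k (the slowest-varying index) is the
    # intermediate role allowed on a path from r to s.
    for k, r, s in product(role_names, repeat=3):
        if (r, k) in below and (k, s) in below:
            below.add((r, s))
    return below
```

The triple loop performs a number of steps cubic in the number of role names, which is one simple way to see a polynomial bound in the size of the ontology.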
It is easy to see that subsumption w.r.t. $\emptyset$ between two atoms implies structural subsumption w.r.t. $O$, which in turn implies subsumption w.r.t. $O$. Other important properties of $\sqsubseteq^s_O$ are reflexivity and transitivity (see
$r \trianglelefteq_O s$, and $C' \sqsubseteq_O D'$.
3. $C = \exists r.C'$, $D = \exists s.D'$, and $C' \sqsubseteq_O \exists t.D'$ for a transitive role $t$ such that $r \trianglelefteq_O t \trianglelefteq_O s$.

Decomposition 1:
Condition: This rule applies to $s = C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? \exists s.D'$ if there is an index $i \in \{1, \dots, n\}$ with $C_i = \exists r.C'$ and $r \trianglelefteq_O s$.
Action: Its application chooses such an index $i$, adds the subsumption $C' \sqsubseteq^? D'$ to $\Gamma$, expands it w.r.t. $D'$ if $D'$ is a variable, and marks $s$ as solved.

Decomposition 2:
Condition: This rule applies to $s = C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? \exists s.D'$ if there is an index $i \in \{1, \dots, n\}$ and a transitive role $t$ with $C_i = \exists r.C'$ and $r \trianglelefteq_O t \trianglelefteq_O s$.
Action: Its application chooses such an index $i$, adds the subsumption $C' \sqsubseteq^? \exists t.D'$ to $\Gamma$ and marks $s$ as solved.

Extension:
Condition: This rule applies to $s = C_1 \sqcap \dots \sqcap C_n \sqsubseteq^? D$ if there is an index $i \in \{1, \dots, n\}$ with $C_i \in N_v$.
Action: Its application chooses such an index $i$ and adds $D$ to $S_{C_i}$. If this makes $S$ cyclic, the rule application fails. Otherwise, $\Gamma$ is expanded w.r.t. $C_i$ and $s$ is marked as solved.
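The conditions of the two Decomposition rules can be sketched in Python as follows. This is only an illustration under assumed encodings: an atom is a string or a tuple `("some", r, filler)`, a subsumption is a pair of a list of left-hand atoms and one right-hand atom, and `below` is a precomputed set of role-hierarchy pairs. The function name `decomposition_choices` is hypothetical, and the sketch enumerates the possible choices instead of guessing one nondeterministically. Expansion and the marking of $s$ as solved are not modeled.

```python
def decomposition_choices(lhs, rhs, below, transitive):
    """List the subsumptions that Decomposition 1 or 2 could add to Γ.

    Assumed encoding: `rhs` is ("some", s, D'), `below` contains the pairs
    (r, s) of the role hierarchy, `transitive` is the set of transitive roles.
    """
    _, s, d = rhs  # both rules need a right-hand side of the form ∃s.D'
    choices = []
    for atom in lhs:
        if not (isinstance(atom, tuple) and atom[0] == "some"):
            continue  # only existential restrictions ∃r.C' qualify
        _, r, c = atom
        if (r, s) in below:
            # Decomposition 1 adds C' ⊑? D'
            choices.append(("Decomposition 1", ([c], d)))
        for t in sorted(transitive):
            if (r, t) in below and (t, s) in below:
                # Decomposition 2 adds C' ⊑? ∃t.D'
                choices.append(("Decomposition 2", ([c], ("some", t, d))))
    return choices


# Example: r below t below s with t transitive, applied to A ⊓ ∃r.X ⊑? ∃s.B.
below = {("r", "r"), ("s", "s"), ("t", "t"), ("r", "t"), ("t", "s"), ("r", "s")}
choices = decomposition_choices(
    ["A", ("some", "r", "X")], ("some", "s", "B"), below, {"t"}
)
```

In this example both rules are applicable to the same conjunct, which reflects the nondeterministic choice the algorithm has to make between them.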
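$\dots \sqcap \exists r_k.A_k \sqsubseteq_O D$ holds for atoms $\exists r_1.A_1, \dots, \exists r_k.A_k$ of $O$ with $r \trianglelefteq_O r_1, \dots, r \trianglelefteq_O r_k$. We take a look at one of the produced subsumptions $\exists r.X \sqsubseteq^? \exists r_\eta.A_\eta$ for $\eta \in \{1, \dots, k\}$. This subsumption itself is not smaller than $s$ w.r.t. $\succ$. However, one or more of the Decomposition rules applies to it, and thus there must also be a subsumption $s' = X \sqsubseteq^? A_\eta$ or $s'' = X \sqsubseteq^? \exists t.A_\eta$ with $r \trianglelefteq_O t \trianglelefteq_O r_\eta$ in $\Gamma$. Both of these are smaller than $s$ w.r.t. $\succ$ since they are both small, their right-hand sides are ground, and $m_3(s') = m_3(s'') = rd(\sigma(X)) < rd(\sigma(\exists r.X)) = m_3(s)$. By induction, the produced subsumption is solved by $\sigma$, which implies by Lemma 2 that $\sigma(\exists r.X) \sqsubseteq_O \exists r_\eta.A_\eta$. Since this holds for all $\eta \in \{1, \dots, k\}$, we conclude that $\sigma(\exists r.X) \sqsubseteq_O \exists r_1.A_1 \sqcap \dots \sqcap \exists r_k.A_k \sqsubseteq_O D$.

Mutation 3: Then $s$ is of the form $\exists r.X \sqsubseteq^? \exists s.Y$ and the subsumption $\exists r_1.A_1 \sqcap \dots \sqcap \exists r_k.A_k \sqsubseteq_O \exists u.B$ holds between atoms of $O$ with $r \trianglelefteq_O r_1, \dots, r \trianglelefteq_O r_k$, $u \trianglelefteq_O s$. As above, we can show that the produced subsumptions $\exists r.X \sqsubseteq^? \exists r_\eta.A_\eta$ for $\eta \in \{1, \dots, k\}$ are solved by $\sigma$. We now consider the remaining subsumption $\exists u.B \sqsubseteq^? \exists s.Y$. Again, one or more of the Decomposition rules applies to it, and thus the subsumption $s' = B \sqsubseteq^? Y$ or $s'' = B \sqsubseteq^? \exists t.Y$ with $u \trianglelefteq_O t \trianglelefteq_O s$ must be in $\Gamma$. Both are smaller than $s$ w.r.t. $\succ$ since they are both small while $s$ is not. By induction, the produced subsumption is solved by $\sigma$, and thus we have $\exists u.B \sqsubseteq_O \sigma(\exists s.Y)$ by Lemma 2. We can conclude that $\sigma(\exists r.X) \sqsubseteq_O \exists r_1.A_1 \sqcap \dots \sqcap \exists r_k.A_k \sqsubseteq_O \exists u.B \sqsubseteq_O \sigma(\exists s.Y)$.

Mutation 4: Then $s$ is of the form $C \sqsubseteq^? \exists s.Y$, $C$ is ground, and there is an atom $\exists u.B$ of $O$ with $u \trianglelefteq_O s$ such that either $C \sqsubseteq_O \exists u.B$ or $C \sqsubseteq_O \exists t.B$ for a transitive role $t$ with $u \trianglelefteq_O t \trianglelefteq_O s$. Furthermore, the subsumption $s' = B \sqsubseteq^? Y$ is in $\Gamma$. Both $s$ and $s'$ are small and $m_2(s) = Y = m_2(s')$. Since $B$ is a constant, i.e., has depth 0, we have $m_3(s) \geq m_3(s')$. Finally, $m_4(s) = rd(\sigma(\exists s.Y)) > rd(\sigma(Y)) = m_4(s')$, and thus $s \succ s'$. By induction, $B \sqsubseteq_O \sigma(Y)$ holds, and thus Lemma 2 implies that either $C \sqsubseteq_O \exists u.B \sqsubseteq_O \sigma(\exists s.Y)$ or $C \sqsubseteq_O \exists t.B \sqsubseteq_O \sigma(\exists s.Y)$.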
and thus $D = \exists s.D'$ and $r \trianglelefteq_O s$ and either (i) $\gamma(C') \sqsubseteq_O \gamma(D')$ or (ii) $\gamma(C') \sqsubseteq_O \exists t.\gamma(D')$ for a transitive role $t$ with $r \trianglelefteq_O t \trianglelefteq_O s$. In case (i), Decomposition 1 can be successfully applied to $s$ and results in a new subsumption $C'$
• If $C_i = X$ is a variable, then invariant (II) is preserved by adding $D$ to $S_X$ since $\gamma(X) \sqsubseteq E \sqsubseteq_O \gamma(D)$. This implies that $S$ stays acyclic, and thus we can successfully apply Extension to $s$.
2. There are atoms $A_1, \dots, A_k, B$ of $O$ such that for all $\eta \in \{1, \dots, k\}$ there is $i \in \{1, \dots, n\}$ with $\gamma($
$\sqsubseteq^? D'$ that is solved by $\gamma$. Similarly, in case (ii) Decomposition 2 can be applied.
• Since $B_1$ is a concept name, again the second case of Lemma 2 applies and we can derive the same consequences as above. In particular, there is an atom $\exists u_2.B_2$ of $O$ such that $B_1 \sqsubseteq_O \exists u_2.B_2$ and $u_2 \trianglelefteq_O t_1$ and either (i) $B_2 \sqsubseteq_O \gamma(Y)$ or (ii) $B_2 \sqsubseteq_O \exists t_2.\gamma(Y)$ for some transitive role $t_2$ with $u_2 \trianglelefteq_O t_2 \trianglelefteq_O t_1$. An important consequence is that $C_1 \sqsubseteq_O \exists u_1.B_1 \sqsubseteq_O \exists u_1.\exists u_2.B_2 \sqsubseteq_O \exists t_1.\exists t_1.B_2 \sqsubseteq_O \exists t_1.B_2$ since both $u_1$ and $u_2$ are subroles of the transitive role $t_1$. In case (i), we can thus again successfully apply Mutation 4 to $s$. In case (ii), we can apply the same case analysis as before. This process cannot go on indefinitely since at some point the same concept name $B_i$ would occur twice and then we would have a subsumption $B_i \sqsubseteq_O \exists t_1.B_i$, which is impossible since $O$ is cycle-restricted. Thus, at some point we must find an atom $\exists u_k.B_k$ of $O$ such that $u_k \trianglelefteq_O t_1$, $C_1 \sqsubseteq_O \exists t_1.B_k$, and $B_k \sqsubseteq_O \gamma(Y)$, i.e., we can successfully apply Mutation 4 to $s$.
$B_1$ and $u_1 \trianglelefteq_O s$ and either (i) $B_1 \sqsubseteq_O \gamma(Y)$ or (ii) $B_1 \sqsubseteq_O \exists t_1.\gamma(Y)$ for some transitive role $t_1$ with $u_1 \trianglelefteq_O t_1 \trianglelefteq_O s$. In case (i), we can successfully apply Mutation 4 to $s$ while maintaining the invariants. In case (ii), note that the resulting subsumption $B_1 \sqsubseteq_O \gamma(\exists t_1.Y)$ is similar to the original $C_1 \sqsubseteq_O \gamma(\exists s.Y)$ since $B_1$ is also ground.