Memory-Limited Model-Based Diagnosis

Various model-based diagnosis scenarios require the computation of the most preferred fault explanations. Existing algorithms that are sound (i.e., output only actual fault explanations) and complete (i.e., can return all explanations), however, require exponential space to achieve this task. As a remedy, and to enable successful diagnosis both on memory-restricted devices and for memory-intensive problem cases, we suggest RBF-HS, a diagnostic search method based on Korf's well-known RBFS algorithm. We show that RBF-HS can enumerate an arbitrary fixed number of fault explanations in best-first order within linear space bounds, without sacrificing the desirable soundness or completeness properties. We discuss the potential impact of RBF-HS and reflect on promising synergies with existing algorithms such as Reiter's HS-Tree. Closer investigations of one hybrid between RBF-HS and HS-Tree, e.g., reveal that runtimes comparable to those of HS-Tree can be achieved while capping the consumed memory at a prespecified amount.


Introduction
Model-based diagnosis (MBD) (Reiter 1987; de Kleer and Williams 1987) is a popular and well-understood paradigm that has over the last decades found widespread adoption for troubleshooting systems as different as programs, circuits, physical devices, knowledge bases, spreadsheets, robots, vehicles, or aircraft. The principle behind MBD is to model the system to be diagnosed by means of a logical knowledge representation language. Beside general knowledge about the system, this system description includes a characterization of the normal behavior of all system components relevant to the diagnosis task. Logical theorem provers can then be used to verify if the predicted system behavior (deduced from the system description under the assumption that all components work nominally) is consistent with factual evidence (observations) about the real system behavior. In case of an inconsistency, the goal is to find a (minimal) diagnosis, i.e., an (irreducible) set of components whose abnormality explains the discrepancy between real and predicted system behavior. Unfortunately, however, there is often a multitude of competing diagnoses.

(Best-First) Search in MBD. (Heuristic) search techniques have proven to be a powerful tool in MBD, e.g., for diagnosis computation (Reiter 1987), for the calculation of so-called conflicts (i.e., per-se faulty sub-systems) (Stern et al. 2012), or for the determination of optimal system measurements to narrow down the diagnosis candidates (Rodler 2016). Among those applications, search methods have had the highest impact on the core MBD task, which is diagnosis finding (Jannach, Schmitz, and Shchekotykhin 2016; Wotawa 2001; Lin and Jiang 2003; Greiner, Smith, and Wilkerson 1989; Shchekotykhin et al. 2014; Rodler and Herold 2018; Feldman and Van Gemund 2006). Due to the inherent (NP-)hardness of this problem, 1 it is in most applications infeasible to determine all (minimal) diagnoses.
Thus, diagnosis approaches have to make do with often just a small computationally feasible set of diagnoses.
For this reason, the focus is usually laid on minimal diagnoses, which are often as informative as all diagnoses. 2 That is, the minimum requirements on employed diagnosis search methods in such cases are that only minimal diagnoses are found (soundness) and that all minimal diagnoses are considered (completeness). Moreover, one usually demands that minimal diagnoses are enumerated in best-first order according to some preference criterion, such as cardinality (diagnoses including fewest components first) or probability (most likely diagnoses first). In particular, the set of the best diagnoses appears to be more appropriate than just any sample of diagnoses, i.a.,
• if all components have a very low failure probability (actual diagnosis among minimum-cardinality diagnoses), as is the case for many physical devices,
• if the given probabilistic information is trustworthy and well-founded (actual diagnosis among most probable diagnoses), like in projects with a long history of bug-fixes, or
• in sequential diagnosis 3 scenarios where an early termination appears to be reasonable only if the best remaining solution is always known.
Space Complexity in MBD Search. Like for any type of algorithm, a trade-off between time and space complexity is necessary for MBD search methods. Among the factors time and space, the latter appears to be the more critical criterion. This is because, if the memory consumption of an algorithm exceeds the amount of available memory, the problem becomes practically unsolvable, whereas, with a higher time demand, an algorithm still works in principle and will deliver a solution (albeit after a potentially longer waiting time for the user). In MBD, there is a range of scenarios which either (a) pose substantial memory requirements on the diagnostic search methods or (b) suffer from too little memory. Examples of (a) are problems involving high-cardinality diagnoses, e.g., when two systems are integrated and a multitude of errors emerge at once (Meilicke 2011a; Shchekotykhin et al. 2014). Manifestations of (b) are frequently found in today's era of the Internet of Things (IoT), distributed or autonomous systems, and ubiquitous computing, where low-end microprocessors, often with only a small amount of RAM, are incorporated into almost any device. Whenever such devices should perform (self-)diagnosing actions (Klein 1999; Williams and Nayak 1996), memory-aware diagnosis algorithms are a must (Williams and Ragno 2007; Zoeteweij et al. 2008).
Linear-Space Best-First MBD Search. Traditional (sound and complete) best-first diagnosis search methods require an exponential amount of memory in that all paths in the search tree must be stored in order to guarantee that the best one is expanded in each iteration. Hence, they are unsuitable for scenarios like (a) and (b) above. As a remedy, we have engineered a hitting set version of Korf's well-known Recursive Best-First Search (RBFS) algorithm (Korf 1992), which is best-first and exhibits worst-case linear-space requirements for classic path-finding problems. We show that our suggested hitting set version, dubbed Recursive Best-First Hitting Set Search (RBF-HS), features all the desirable properties of diagnostic searches (soundness, completeness, best-first property) and is able to return an arbitrary fixed number of the best existing solutions within linear memory bounds. We discuss the potential impact of RBF-HS on MBD and reflect upon variants of RBF-HS and hybrid versions that combine RBF-HS with existing diagnostic search methods like Reiter's popular HS-Tree. The core idea behind these suggestions is the creation of a synergy between different algorithms by exploiting their complementary advantages while overcoming their shortcomings. To demonstrate the benefits of such hybrid search methods on one concrete instance, we propose an algorithm called HBF-HS (Hybrid Best-First Hitting Set Search) that integrates RBF-HS with HS-Tree to flexibly trade off time and space, while retaining the soundness, completeness and best-first properties. The rationale of HBF-HS is to initially run HS-Tree as long as enough memory is still available (optimize time), and to then switch to RBF-HS to minimize the additional used memory (optimize space) in order to preserve problem solvability.
Extensive evaluations on real-world diagnosis cases evince that the performance of RBF-HS and HBF-HS in terms of runtime is comparable with that of an existing sound, complete and best-first MBD search (HS-Tree), even though the latter requires significantly (up to orders of magnitude) more space. In the case of HBF-HS, we additionally observed that the required memory virtually stops growing after the switch to RBF-HS has taken place, i.e., the required memory can be seen to be "capped" at some prespecified amount.
Organization. The rest of the paper is organized as follows. To make this work self-contained, we repeat fundamental concepts from the fields of MBD and heuristic search in Sec. 2. The RBF-HS algorithm is discussed in Sec. 3 wrt. how it was derived from RBFS, its functioning in terms of a walkthrough and a detailed example, and its theoretical properties. Then, in Sec. 3.6, we reflect on variations of RBF-HS and suggest HBF-HS. Finally, Sec. 5 presents our experiments and reviews the obtained results, whereas concluding remarks and pointers to future work are given in Sec. 6.

Preliminaries
We first briefly characterize MBD concepts used throughout this work, based on the framework of (Shchekotykhin et al. 2012; Rodler 2015), which is (slightly) more general than Reiter's theory (Reiter 1987). 4 Next, we concisely review important notions from heuristic search and contrast classic path-finding with diagnosis search problems. This comparison should serve to facilitate the understanding of our "translation" of the path-finding algorithm RBFS to a diagnosis computation procedure RBF-HS in Sec. 3.

Diagnosis Problem. The system to be diagnosed, comprising a set of components c 1 , . . . , c n , is described by a set of logical sentences, where K (system description) characterizes the behavior of the components, and B (background knowledge) comprises any additional available system knowledge and system observations. More precisely, there is a one-to-one relationship between sentences ax i ∈ K and components c i , where ax i describes the nominal behavior of c i (weak fault model). E.g., if c i is an AND-gate in a circuit, then ax i := out(c i ) = and(in1(c i ), in2(c i )); B in this case might contain sentences stating, e.g., which components are connected by wires, or observed circuit outputs. The inclusion of a sentence ax i in K corresponds to the (implicit) assumption that c i is healthy. Evidence about the system behavior is captured by sets of positive (P ) and negative (N ) measurements (de Kleer and Williams 1987; Felfernig et al. 2004; Reiter 1987). Each measurement is a logical sentence; positive ones p ∈ P must be true and negative ones n ∈ N must not be true. The former can be, depending on the context, e.g., observations about the system, probes or required system properties. The latter model properties that must not hold for the system, e.g., if K is a medical knowledge base to be debugged, a negative test case might be "every tumor causes pain". We call ⟨K, B, P , N ⟩ a diagnosis problem instance (DPI).
Example 1 (Diagnosis Problem) Tab. 1 depicts an example of a DPI, formulated in propositional logic. The "system" (which is the knowledge base itself in this case) comprises five "components" c 1 , . . . , c 5 , and the "nominal behavior" of c i is given by the respective axiom ax i ∈ K. There is neither any background knowledge (B = ∅) nor any positive measurements (P = ∅) available from the start. But, there is one negative measurement (i.e., N = {¬A}), which postulates that ¬A must not be an entailment of the correct system (knowledge base). Note, however, that K (i.e., the assumption that all "components" work nominally) in this case does entail ¬A (e.g., due to the axioms ax 1 , ax 2 ) and therefore some axiom in K must be faulty (i.e., some "component" is not healthy).
Diagnoses. Given that the system description along with the positive measurements (under the assumption K that all components are healthy) is inconsistent, i.e., K ∪ B ∪ P |= ⊥, or some negative measurement is entailed, i.e., K ∪ B ∪ P |= n for some n ∈ N , some assumption(s) about the healthiness of components, i.e., some sentences in K, must be retracted. We call such a set of sentences D ⊆ K a diagnosis for the DPI ⟨K, B, P , N ⟩ iff (K \ D) ∪ B ∪ P ⊭ x for all x ∈ N ∪ {⊥}. We say that D is a minimal diagnosis for dpi iff there is no diagnosis D' ⊂ D for dpi . The set of minimal diagnoses is representative of all diagnoses under the weak fault model, i.e., the set of all diagnoses is equal to the set of all supersets of minimal diagnoses. Therefore, diagnosis approaches usually restrict their focus to only minimal diagnoses. We furthermore denote by D * the actual diagnosis which pinpoints the actually faulty axioms, i.e., all elements of D * are in fact faulty and all elements of K \ D * are in fact correct.
Example 2 (Diagnoses) For the DPI depicted in Tab. 1, there are four minimal diagnoses, given by D 1 := {ax 1 , ax 3 }, D 2 := {ax 1 , ax 4 }, D 3 := {ax 2 , ax 3 }, and D 4 := {ax 2 , ax 5 }. For instance, D 1 is a diagnosis because (K \ D 1 ) ∪ B ∪ P = {ax 2 , ax 4 , ax 5 } is consistent and does not entail ¬A.
Diagnosis Probability Model. In case useful meta information is available that allows to assess the likeliness of failure for system components, the probability of diagnoses (of being the actual diagnosis) can be derived. Specifically, given a function pr that maps each sentence (system component) ax ∈ K to its failure probability 0 < pr (ax ) < 1, the probability pr (X) of a diagnosis candidate 6 X ⊆ K (under the common assumption of independent component failure) is computed as the probability that all sentences in X are faulty, and all others are correct, i.e.,

pr (X) := ∏_{ax ∈ X} pr (ax ) · ∏_{ax ∈ K\X} (1 − pr (ax ))    (1)

Example 3 (Diagnosis Probabilities) Reconsider the DPI depicted in Tab. 1 and let the component probabilities ⟨pr (ax 1 ), . . . , pr (ax 5 )⟩ = ⟨.1, .05, .1, .05, .15⟩. Then, we can compute the probabilities of all minimal diagnoses from Example 2 as ⟨pr (D 1 ), . . . , pr (D 4 )⟩ = ⟨.0077, .0036, .0036, .0058⟩. For instance, pr (D 1 ) is calculated as 0.1 * (1 − 0.05) * 0.1 * (1 − 0.05) * (1 − 0.15). The normalized diagnosis probabilities would then be ⟨.37, .175, .175, .28⟩. Note, this normalization makes sense if not all diagnoses, but only minimal diagnoses are of interest, which is usually the case in model-based diagnosis applications for complexity reasons.
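To make Eq. 1 concrete, the following Python sketch recomputes the probabilities from Example 3; the diagnosis sets are assumed to be those of Example 2, which is consistent with the stated probability values.

```python
from math import prod

def pr_diagnosis(X, pr, K):
    """Eq. 1: probability that exactly the sentences in X are faulty,
    under independent component failure."""
    return prod(pr[ax] for ax in X) * prod(1 - pr[ax] for ax in K - X)

K = {"ax1", "ax2", "ax3", "ax4", "ax5"}
pr = {"ax1": .1, "ax2": .05, "ax3": .1, "ax4": .05, "ax5": .15}
# minimal diagnoses as in Example 2 (assumed sets)
diagnoses = [{"ax1", "ax3"}, {"ax1", "ax4"}, {"ax2", "ax3"}, {"ax2", "ax5"}]

probs = [pr_diagnosis(D, pr, K) for D in diagnoses]
total = sum(probs)
print([round(p, 4) for p in probs])          # [0.0077, 0.0036, 0.0036, 0.0058]
print([round(p / total, 3) for p in probs])  # normalized: [0.37, 0.175, 0.175, 0.279]
```

The numbers reproduce Example 3, including pr(D1) = 0.1 · 0.95 · 0.1 · 0.95 · 0.85 ≈ .0077.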
Conflicts. Instrumental for diagnosis computation is the notion of a conflict (de Kleer and Williams 1987; Reiter 1987). A conflict is a set of healthiness assumptions for components c i that cannot all hold given the current knowledge about the system. More formally, C ⊆ K is a conflict for the DPI ⟨K, B, P , N ⟩ iff C ∪ B ∪ P |= x for some x ∈ N ∪ {⊥}. We call C a minimal conflict for dpi iff there is no conflict C' ⊂ C for dpi .
Example 4 (Conflicts) For our running example, dpi , in Tab. 1, there are four minimal conflicts, given by C 1 := ⟨ax 1 , ax 2 ⟩, C 2 := ⟨ax 2 , ax 3 , ax 4 ⟩, C 3 := ⟨ax 1 , ax 3 , ax 5 ⟩, and C 4 := ⟨ax 3 , ax 4 , ax 5 ⟩. 7 For instance, C 4 , in CNF equal to (¬A ∨ ¬C) ∧ (¬B ∨ C) ∧ (¬A ∨ B ∨ C), is a conflict because adding the unit clause (A) to this CNF yields a contradiction, which is why the negative test case ¬A is an entailment of C 4 . The minimality of the conflict C 4 can be verified by removing from C 4 a single axiom at a time and checking for each so-obtained subset that it is consistent and does not entail ¬A.
Relationship between Conflicts and Diagnoses. Conflicts and diagnoses are closely related in terms of a hitting set and a duality property (Reiter 1987):
Hitting Set Property: A (minimal) diagnosis for dpi is a (minimal) hitting set of all minimal conflicts for dpi . (X is a hitting set of a collection of sets S iff X ⊆ ⋃_{S i ∈ S} S i and X ∩ S i ≠ ∅ for all S i ∈ S.)
Duality Property: Given a DPI dpi = ⟨K, B, P , N ⟩, X is a diagnosis (or: contains a minimal diagnosis) for dpi iff K \ X is not a conflict (or: does not contain a minimal conflict) for dpi .
Example 5 (Conflicts vs. Diagnoses) Let us again consider our example DPI from Tab. 1. Regarding the Hitting Set Property, e.g., the minimal diagnosis D 1 (see Example 2) is a hitting set of all minimal conflict sets because each conflict (see Example 4) contains ax 1 or ax 3 . It is moreover a minimal hitting set since the elimination of ax 1 implies an empty intersection with, e.g., C 1 , and the elimination of ax 3 means that, e.g., C 4 is no longer hit. Thus, given the collection C of all minimal conflicts, we can determine all the minimal diagnoses as the collection of minimal hitting sets of C.
Concerning the Duality Property, e.g., D 4 is a diagnosis because K \ D 4 = {ax 1 , ax 3 , ax 4 } is not a conflict (this can be easily verified by checking that no minimal conflict in Example 4 is a subset of this set), or, equivalently, (K \ D 4 ) ∪ B ∪ P = {ax 1 , ax 3 , ax 4 } is consistent and does not entail ¬A. Inversely, e.g., C 2 is a conflict since K \ C 2 = {ax 1 , ax 5 } is not a diagnosis (again, this can be easily seen by verifying that no minimal diagnosis in Example 2 is a subset of this set), or, equivalently, (K \ (K \ C 2 )) ∪ B ∪ P = C 2 ∪ B ∪ P = {ax 2 , ax 3 , ax 4 } entails the negative measurement ¬A.
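The Hitting Set Property suggests a direct, purely illustrative way to compute all minimal diagnoses of the running example: enumerate subsets of K in ascending cardinality and keep the subset-minimal ones that hit every minimal conflict of Example 4. The following sketch does exactly that (brute force, exponential; real diagnosis engines avoid this enumeration):

```python
from itertools import combinations

# the four minimal conflicts of Example 4
conflicts = [{"ax1", "ax2"}, {"ax2", "ax3", "ax4"},
             {"ax1", "ax3", "ax5"}, {"ax3", "ax4", "ax5"}]
universe = sorted(set().union(*conflicts))

def hits_all(X):
    # X is a hitting set iff it intersects every conflict
    return all(X & C for C in conflicts)

# enumerate by increasing cardinality; keep only subset-minimal hitting sets
minimal = []
for k in range(1, len(universe) + 1):
    for combo in combinations(universe, k):
        X = set(combo)
        if hits_all(X) and not any(M <= X for M in minimal):
            minimal.append(X)

print(sorted(sorted(m) for m in minimal))
# [['ax1', 'ax3'], ['ax1', 'ax4'], ['ax2', 'ax3'], ['ax2', 'ax5']]
```

The output coincides with the four minimal diagnoses D1, . . . , D4 of Example 2, as the Hitting Set Property predicts.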

Search
Path-Finding Problem. A path-finding problem instance (PPI) (Russell and Norvig 2010) can be characterized as a tuple ⟨S 0 , succ(), goal(), g()⟩ where S 0 is a distinguished initial state, succ() is a successor function that returns all directly reachable neighbor states of any given state, goal() is a Boolean goal test that returns true iff a given state is a goal state, and g() is a cost function that assigns a real-valued cost to any given sequence of states (called path). A solution to a PPI is a path from the initial state to some goal state, and the objective is often to find an optimal solution, i.e., one with the least cost among all solutions.
7 In this work, we always denote conflicts by angle brackets.
Example 6 (Path-Finding Problem) A pretty intuitive instance of a path-finding problem is the task of searching for the (shortest) route between two cities, say Berlin and Vienna. In this case, we would define S 0 := Berlin, succ() to return all (major) cities reachable from a given city by a direct motorway, g() to return the summed up (motorway-)distances through all cities along a given path, and goal() to return false for all cities except for Vienna.
Search Algorithms. Various algorithms exist to tackle PPIs, which usually produce a systematic search tree. A search tree is composed of nodes and edges, where the root node n 0 corresponds to the state S 0 , and from a node n corresponding to state S there are |succ(S)| emanating edges to other nodes, each of which represents one of the states in succ(S). For a node n that represents a state S, we also say that n's node label is S. The creation of child nodes from a current leaf node n by means of succ() is called expansion of n. Inversely, the creation of a child node n when its parent is expanded is called generation of n. Importantly, each generated node n stores a pointer to its parent to allow for the reconstruction of the path to n in case it is a goal. If there are specific (named) actions that can be taken in a state S, each of which results in some successor state in succ(S), the respective action name is often used as an edge label between S and the successor node reached through this action. Note that one and the same state can occur multiple times in a search tree, depending on the used algorithm. In general, different ways of constructing the search tree (i.e., in which order nodes are selected for expansion, and how much of the tree construction "history" (e.g., already expanded nodes) is stored) yield a variety of search methods with different properties regarding completeness (will a solution be found whenever one exists?), best-first property 8 (will the best solution be found first?), as well as time and space complexity (how much runtime and memory will the algorithm need to find a solution?). Search algorithms that solve PPIs usually stop after the first path to a goal state is found.
Informed Search. If problem-specific information beyond the mere PPI is (not) available to an algorithm, the problem is called (un)informed. If applicable, such problem-specific information is normally given as a heuristic function h(n) which assigns to each node n a non-negative real value as an estimation of the cost of the best path from n's state to some goal state. This heuristic value h(n) can then be combined with the costs g(n) already incurred to reach n, in terms of f (n) := g(n) + h(n), which estimates the overall cost of the path from the start to some goal state via node n.
Example 7 (Heuristic Function) Recall the route planning task from Example 6. For this problem, a simple heuristic function is given by the straight-line distance between a particular city n and the closest destination city (goal state).
Example 8 (Search Algorithms) Important uninformed search strategies are depth-first, breadth-first, uniform-cost and iterative deepening search; popular informed searches are A* and IDA* (Russell and Norvig 2010). Each of them maintains a queue of nodes that is sorted in a specific way, where the first node of this queue is chosen for expansion at each step. Each expanded node is deleted from the queue and its generated successors are added to it in a way that preserves the defined sorting. (If the set of expanded nodes is explicitly stored in addition to the queue and no already-expanded nodes are generated or expanded, then the respective search is called graph search, otherwise tree search.) Whenever a node is expanded whose state satisfies the goal() test, the respective path is returned and the search terminates. Now, depth-first search maintains a LIFO queue, breadth-first search a FIFO queue, and uniform-cost search and A*, respectively, a queue sorted in ascending order by g() and f (). Iterative deepening and IDA* run in iterations, executing one depth-first search per iteration. Here, each iteration uses an incremented depth-limit l = 1, 2, . . . (iterative deepening) or an incremented cost-limit equal to the best known node from the last iteration that has not been expanded (IDA*). A depth-limit (cost-limit) k means that no successors are generated for any node at tree depth k (with cost > k).
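The shared skeleton of these uninformed strategies can be sketched as a single tree-search routine whose queue discipline is the only varying part. The four-node graph below is a hypothetical example of our own (tree search, i.e., no duplicate elimination; the goal test is applied on expansion, as described above):

```python
import heapq
from collections import deque

def search(start, succ, is_goal, strategy):
    """Generic tree search: the queue discipline alone distinguishes
    depth-first (LIFO), breadth-first (FIFO) and uniform-cost (priority by g)."""
    if strategy == "uniform-cost":
        frontier = [(0, [start])]
        while frontier:
            cost, path = heapq.heappop(frontier)   # expand cheapest path
            if is_goal(path[-1]):
                return path, cost
            for s, step in succ(path[-1]):
                heapq.heappush(frontier, (cost + step, path + [s]))
    else:
        frontier = deque([[start]])
        pop = frontier.pop if strategy == "depth-first" else frontier.popleft
        while frontier:
            path = pop()
            if is_goal(path[-1]):
                return path, len(path) - 1         # cost = number of edges
            for s, _ in succ(path[-1]):
                frontier.append(path + [s])
    return None

# hypothetical graph: edges with step costs
edges = {"A": [("B", 1), ("C", 4)], "B": [("D", 5)], "C": [("D", 1)], "D": []}
succ = lambda s: edges[s]
goal = lambda s: s == "D"
print(search("A", succ, goal, "uniform-cost"))   # (['A', 'C', 'D'], 5)
print(search("A", succ, goal, "breadth-first"))  # (['A', 'B', 'D'], 2)
```

Note that uniform-cost search returns the cheapest path (cost 5) while breadth-first returns the path with fewest edges, illustrating how the queue discipline determines which solution is found first.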
Diagnosis Search Algorithms. Given a DPI ⟨K, B, P , N ⟩, a diagnosis search algorithm 9 is characterized by the definition of a node processing procedure. The latter is divided into two parts, node labeling and node assignment. A generic diagnosis search algorithm then works as follows:
• Start with a queue including only the root node ∅.
• While the queue is non-empty and not enough minimal diagnoses have been found, 10 poll the first node n from the queue and process it. That is, compute a label L for n, and assign n (or potentially its successors) to an appropriate node class (e.g., solutions, non-solutions) based on L.
Different concrete diagnosis search algorithms are obtained by (re)defining (i) the sorting of the queue and (ii) the node processing procedure, which means specifying how nodes are labeled, and to which collections nodes are assigned.
Example 9 (Reiter's HS-Tree) To make this more concrete, let us examine how (i) and (ii) are realized in Reiter's seminal HS-Tree (Reiter 1987):
Sorting of the queue: Depending on the desired preference criterion to be optimized, either a FIFO queue is used (breadth-first search; minimum-cardinality diagnoses first) or the queue is kept sorted in descending order of pr (n), cf. Eq. 1 (uniform-cost search; most probable diagnoses first).
Node labeling: The following checks are made in order, and a label is returned as soon as the first check is positive.
(non-minimality) Is n a superset of some already found diagnosis? If yes, return L = closed .
(duplicate) Is there another node equal to n in the queue? If yes, return L = closed .
(reuse label) Is there a conflict C among the already used node labels such that n ∩ C = ∅? If yes, return L = C.
(compute label) Compute a minimal conflict for ⟨K \ n, B, P , N ⟩. If some set C is computed, return L = C. If 'no conflict' is output, return L = valid .
Node assignment: If n's computed label
L = ⟨ax 1 , . . . , ax k ⟩, then k new successor nodes n 1 , . . . , n k are generated and added to the queue, where n i = n ∪ {ax i }.
L = valid , then n is a solution and added to the collection of minimal diagnoses.
L = closed , then n is irrelevant or a proven non-solution and not added to any collection, i.e., it is discarded.
Note, apart from guiding the node assignment, there is no purpose of node labels L. Thus, labels are not further stored.
Remark: In order for this algorithm to be sound, complete and best-first,
• the function for conflict computation used in (compute label) must be sound (if a set is returned, it is a conflict), complete (a conflict is returned whenever there is one), and must return only minimal 11 conflicts, and
• the probability model pr (ax ) for ax ∈ K needs to be cost-adjusted, i.e., pr (ax ) < 0.5 for all ax ∈ K. 12
Clearly, the latter condition is not needed if a FIFO queue is used.
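For illustration, here is a compact Python sketch of the HS-Tree scheme of Example 9, using a FIFO queue (minimum-cardinality diagnoses first). For simplicity, the conflict-computation oracle is simulated by a lookup over a known collection of minimal conflicts (below, those of Example 4); a real implementation would instead call a theorem prover:

```python
from collections import deque

def hs_tree(conflicts, ld=float("inf")):
    """Reiter-style HS-Tree sketch with a FIFO queue (breadth-first).
    `conflicts` plays the role of a sound and complete minimal-conflict oracle."""
    diagnoses, used_labels = [], []
    queue = deque([frozenset()])                      # root node is the empty set
    while queue and len(diagnoses) < ld:
        n = queue.popleft()
        if any(D <= n for D in diagnoses):            # (non-minimality) -> closed
            continue
        if n in queue:                                # (duplicate) -> closed
            continue
        label = next((C for C in used_labels if not (n & C)), None)   # (reuse label)
        if label is None:                             # (compute label)
            label = next((C for C in conflicts if not (n & C)), None)
        if label is None:                             # 'no conflict' -> valid
            diagnoses.append(n)
        else:                                         # conflict label -> expand
            used_labels.append(label)
            for ax in sorted(label):
                queue.append(n | {ax})
    return diagnoses

conflicts = [{"ax1", "ax2"}, {"ax2", "ax3", "ax4"},
             {"ax1", "ax3", "ax5"}, {"ax3", "ax4", "ax5"}]
print([sorted(D) for D in hs_tree(conflicts)])
# [['ax1', 'ax3'], ['ax1', 'ax4'], ['ax2', 'ax3'], ['ax2', 'ax5']]
```

On the running example this returns exactly the minimal diagnoses of Example 2, in ascending order of cardinality, mirroring the breadth-first variant of HS-Tree.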
Diagnosis Search vs. Path-Finding. The main properties that distinguish diagnosis search from path-finding are:
(I) PPI-formulation does not suffice: Although the problem of searching for minimal diagnoses for a DPI can be stated as a PPI (where S 0 = ∅; succ() gets a labeled node n with label L and returns the successors of n if L is a set, and ∅ else; goal(n) returns true iff n is a diagnosis; and g(n) := pr (n) as per Eq. 1), this characterization is not a sufficient basis to run a diagnosis search. What is missing is the definition of a node labeling and a node assignment strategy (see above). Importantly, these missing building blocks decide over the soundness, completeness and best-first property of the diagnosis search. By contrast, for path-finding, the PPI includes all relevant information for the problem to be directly solved by an off-the-shelf path-finding algorithm.
(II) States, nodes and paths coincide: In diagnosis search, the state of a search tree node n corresponds to n itself (i.e., to a set of ax i -elements). So, no distinction between states and nodes is made. When viewing successor node generation performed by succ(n) as an application of actions, then the only possible action at any node n would be "add one element ax i from n's label L to n". When the label ax i is assigned to the edge pointing from n to its child node n ∪ {ax i }, nodes (and states) can be seen as representatives of the (edge labels along the) paths in the search tree.
(III) Solutions are sets, not paths: Solutions to a diagnosis search problem are nodes (sets of edge labels along a tree path) which are minimal diagnoses for the given DPI. Unlike in path-finding problems, the order of labels along the path does not matter.
(IV) Multiple solutions are sought: In diagnosis search, it is usually of interest to find multiple solutions, i.e., after the first solution is determined, the search must be (correctly) continuable until sufficient solutions are found.
(V) Search for maximal-cost solutions: In diagnosis search, one wants to calculate the maximal-cost (i.e., most probable or maximal cardinality⁻¹) solutions, whereas path-finding is usually about finding a minimal-cost solution.
(VI) Slightly stricter conditions on cost function: Like for path-finding, the used cost function must fulfill certain criteria in order for soundness, completeness and the best-first feature to be guaranteed. While it is common for (uninformed) path-finding problems to specify the cost function f () in a way that the path costs amount to the sum of the action (or: step) costs along the path, the cost function f (n) := pr (n) as per Eq. 1 used for diagnosis search cannot be seen as a sum of step costs. As a consequence, it suffices in the former case to make sure step costs are non-negative (f () is said to be monotonic in this case), as opposed to the latter case, where the cost-adjustment (see Example 9) is necessary (and it does not suffice that merely pr (ax i ) > 0 for all ax i ∈ K). Note, this cost-adjustment makes the function f (n) := pr (n) anti-monotonic, i.e., f (n i ) ≥ f (n j ) whenever n j is a successor of n i (i.e., n i ⊂ n j ). 13
(VII) Soundness is not trivial: Whereas in path-finding any path whose end state satisfies the goal test is a valid solution to the PPI, in diagnosis search an appropriate combination of suitable goal test, node labeling, node assignment and cost function is necessary to ensure soundness, i.e., that each found solution is indeed a minimal diagnosis for the given DPI.
13 In fact, f (n) is even strictly anti-monotonic (f (n i ) > f (n j )) as 0 < pr (ax ) < 1 holds for all ax ∈ K (this can easily be seen from Eq. 1). Moreover, anti-monotonicity when searching for maximal-cost solutions (as for diagnosis search, cf. (V)) is the equivalent of monotonicity when searching for minimal-cost solutions (as for path-finding problems).
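The effect of the cost-adjustment in (VI) can be checked directly. With hypothetical component probabilities all below 0.5, f(n) := pr(n) per Eq. 1 strictly decreases along every tree path, whereas a single component probability of 0.5 or more can break this (the probability values below are made up for illustration):

```python
from math import prod

def f(n, pr, K):
    # f(n) := pr(n) as per Eq. 1
    return prod(pr[ax] for ax in n) * prod(1 - pr[ax] for ax in K - n)

K = {"ax1", "ax2", "ax3"}
adjusted   = {"ax1": .1, "ax2": .3, "ax3": .4}  # all < 0.5 (cost-adjusted)
unadjusted = {"ax1": .1, "ax2": .3, "ax3": .8}  # ax3 violates the condition

parent, child = {"ax1"}, {"ax1", "ax3"}         # child node extends its parent
print(f(parent, adjusted, K) > f(child, adjusted, K))      # True: anti-monotonic
print(f(parent, unadjusted, K) > f(child, unadjusted, K))  # False: pr(ax3) >= 0.5
```

In the unadjusted case the child node looks more promising than its parent, so a best-first search could report the non-minimal superset before the minimal diagnosis, which is exactly what the cost-adjustment rules out.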

Algorithm 1 RBFS
Input: a PPI ppi := ⟨S 0 , succ(), goal(), g()⟩ and a heuristic function h() (if ppi is an uninformed problem, then h(n) := 0 for all nodes n)
Output: a path from S 0 to some goal state, if a goal state is reachable from S 0 by means of successive applications of the succ() function; null otherwise
[Pseudocode listing of Alg. 1 not reproduced; only fragments survive, e.g., line 22: Child_Nodes ← SORTINCREASINGBYF(Child_Nodes).]

Recursive Best-First Search (RBFS) (Korf 1992) provides the inspiration for RBF-HS. Historically, the main motivation that led to the engineering of RBFS was the problem that best-first searches by that time required exponential space. The idea behind RBFS is to trade (more) time for (much less) space by means of a "(re)explore-current-best & backtrack & forget-most & remember-essential & update-cost" cycle. As a result, RBFS is complete and best-first and works within linear-space bounds.
RBFS: Briefly Explained. RBFS is presented by Alg. 1. In a nutshell, it works as follows (Russell and Norvig 2010). Initial node costs are the f-values computed from g() and h(), and backed-up node costs are named F-values. Initially, all backed-up node costs are the nodes' initial costs. Starting from the root node corresponding to S 0 , the principle is to follow the best (lowest-F) path downwards (recursive RBFS'-calls, line 26). At each downward step, the variable bound is used to keep track of the (backed-up) cost of the best alternative path available from any ancestor of the current node (note, this is the globally best alternative path). If the current node exceeds bound , the recursion unwinds back to the alternative path. As the recursion unwinds, the cost of each node along the path is replaced with a (new) backed-up cost value, which is the best (backed-up) cost of its child nodes (cf. line 30). In this way, RBFS always remembers the backed-up cost of the best leaf in the forgotten subtree and can therefore decide whether it is worth re-expanding the subtree at some later time (this decision is made through the condition of the while-loop). When expanding a subtree rooted at a node n that has already been expanded and forgotten before (condition in line 16 is true), there may be child nodes n i whose initial cost (f-value) appears more promising than the algorithm, via the stored backed-up cost F(n) from a previous iteration, knows it actually is. This information is not tediously learned again by RBFS; instead, n's F-value is directly used to update n i 's F-value (see line 17). If some node is recognized to correspond to a goal state, the path to this node is returned and RBFS terminates (lines 7-9).
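Since the full listing of Alg. 1 is not reproduced above, the following Python sketch (after the textbook formulation of RBFS; all function and variable names are our own) conveys the mechanism just described: follow the lowest-F child under a bound given by the best alternative, unwind when the bound is exceeded, and back up the cost of the best forgotten leaf:

```python
import math

def rbfs(start, succ, is_goal, h):
    """Linear-space recursive best-first search sketch (after Korf 1992).
    succ(s) yields (successor, step_cost) pairs; returns (path, cost)."""
    def _rbfs(state, path, g, F, bound):
        if is_goal(state):
            return path, F
        children = []
        for s, step in succ(state):
            f = g + step + h(s)
            # if this subtree was expanded and forgotten before, F(state)
            # exceeds f(state): pass the backed-up knowledge down to the child
            children.append([max(f, F) if F > g + h(state) else f, s, g + step])
        if not children:
            return None, math.inf
        while True:
            children.sort(key=lambda c: c[0])
            best = children[0]
            if best[0] > bound or best[0] == math.inf:
                return None, best[0]   # unwind; best[0] is the backed-up cost
            alternative = children[1][0] if len(children) > 1 else math.inf
            result, best[0] = _rbfs(best[1], path + [best[1]], best[2],
                                    best[0], min(bound, alternative))
            if result is not None:
                return result, best[0]

    return _rbfs(start, [start], 0, h(start), math.inf)

# hypothetical toy graph: the cheapest A-to-D route is A-C-D with cost 5
edges = {"A": [("B", 1), ("C", 4)], "B": [("D", 5)], "C": [("D", 1)], "D": []}
path, cost = rbfs("A", lambda s: edges[s], lambda s: s == "D", lambda s: 0)
print(path, cost)  # ['A', 'C', 'D'] 5
```

In the example run, the branch via B is explored first (f = 1), its backed-up cost 6 is remembered when the bound of 4 is exceeded, and the search then switches to the alternative via C, returning the optimal path while storing only one path at a time.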
From RBFS to RBF-HS: Necessary Modifications. In order to translate a path-finding algorithm into a diagnosis search algorithm, we have to make adequate amendments to the former with due regard to all differences between the two paradigms discussed in (I)-(VII) in Sec. 2. Next, we list and explain the main modifications necessary to derive RBF-HS from RBFS (line numbers given refer to the respective locations of the changes in the RBF-HS algorithm, i.e., in Alg. 2).
(Mod1) A node labeling (line 12 and LABEL procedure) and a node assignment (lines 13-19) strategy have to be added. Importantly, the goal test (check, whether a node is a minimal diagnosis, lines 39, 42 and 44) as well as the preparation of nodes for expansion (i.e., the provision of a minimal conflict, line 43 or 49) is part of these two code blocks. Note, it is crucial for achieving soundness, completeness and the best-first property that node labeling and node assignment are properly engineered. Justification: Bullet (I).
(Mod2) Differentiation between nodes, states and paths is no longer necessary, which is why the functions MAKE-NODE (generates node from state), STATE (extracts state from node), and GETPATHTO (returns path from root to node) can be omitted. This becomes evident in line 9 (root node is simply equal to initial state ∅; cf. line 3 in Alg. 1), line 16 (a set n is added to the solutions D; cf. line 8 in Alg. 1), line 20 and EXPAND function (successors are generated directly from the node n; cf. line 12 in Alg. 1), and lines 39, 42 and 45 (goal test performed on node, not state; cf. line 7 in Alg. 1). Justification: Bullets (II) and (III).
(Mod3) The requirement that multiple solutions are generally desired in diagnosis search is handled in lines 17-19. Note, it is essential to return −∞ (i.e., the worst possible cost) as the backed-up F -cost of the solution node n in order to allow the search to continue in a well-defined and correct way. More precisely, this will cause the F -value of n's best sibling node to be propagated upwards. As a consequence, the backed-up value for any subtree including n will be the so-far found best cost over all nodes in this subtree except for n. In fact, any backed-up value F * := F (n) > −∞ would prevent RBF-HS' from terminating and thus would make it incomplete (intuitively, at some point all other nodes would have a value lower than F * and the algorithm would loop forever exploring n again and again). Justification: Bullet (IV).
(Mod5) The used probability measure pr needs to be cost-adjusted. For any given model that assigns some failure probability pr(ax) ∈ (0, 1) to each ax ∈ K, this can be achieved as explained in footnote 12. Justification: Bullet (VI).
(Mod6) To achieve soundness (only minimal diagnoses are added to the solutions D in line 16), the following provisions are made. The function f is cost-adjusted (since pr is cost-adjusted and f := pr, cf. inputs of Alg. 2), which implies, by the sorting of Child_Nodes (line 28), that minimal diagnoses will be found prior to non-minimal ones (cf. bullet (VI)). Moreover, the LABEL function is designed such that only nodes n can be labeled valid for which no already-found diagnosis exists that is a subset of n (goal test, part 1, line 39), and which are provably diagnoses (goal test, part 2, line 45). Finally, the node assignment ensures that only nodes labeled valid can be assigned to the solution list D (line 16). 14 Justification: Bullet (VII).

RBF-HS Algorithm Walkthrough
Inputs and Output. RBF-HS is depicted by Alg. 2. It accepts the following arguments: a DPI dpi = ⟨K, B, P, N⟩, a probability measure pr (see Sec. 2), and a stipulated number ld of minimal diagnoses to be returned. It outputs the ld (if existent) minimal diagnoses of maximal probability wrt. pr for dpi. To make diagnoses of minimum cardinality (instead of maximal probability) preferred, the probability model must satisfy pr(ax) := c for all ax ∈ K for some arbitrary fixed c ∈ (0, 0.5). Note, this is equivalent to defining pr(n) := 1/|n| for all nodes n.
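The effect of a uniform failure probability c < 0.5 can be checked with a small sketch. We assume, for illustration only, a product-form (unnormalized) node probability and a hypothetical 5-axiom KB; the concrete cost adjustment of the paper's footnote 12 is not reproduced here.

```python
# Hypothetical 5-axiom KB; any fixed c in (0, 0.5) works.
c = 0.3
K = {"ax1", "ax2", "ax3", "ax4", "ax5"}

def pr_node(n):
    """Unnormalized node probability under a uniform per-axiom failure
    probability c: c^|n| * (1-c)^(|K|-|n|). Since c < 0.5 implies
    c/(1-c) < 1, this is strictly decreasing in |n|, so sorting nodes
    by probability coincides with sorting them by cardinality."""
    return c ** len(n) * (1 - c) ** (len(K) - len(n))
```

Hence best-first order by pr yields minimum-cardinality diagnoses first, as stated above.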
Trivial Cases. At the beginning (line 2), RBF-HS initializes the solution list of found minimal diagnoses D and the list of already computed minimal conflicts C. Then, two trivial cases are checked, i.e., whether no diagnosis exists for dpi (lines 4-5), or whether the empty set is the only diagnosis for dpi (lines 6-7). Note, the former case applies iff the empty set is a conflict for dpi, which implies that K \ ∅ = K is not a diagnosis for dpi by the Duality Property (cf. Sec. 2), which in turn means that no diagnosis can exist, since diagnoses are subsets of K and each superset of a diagnosis must be a diagnosis as well (weak fault model, cf. Sec. 2). The latter case holds iff there is no conflict at all for dpi, i.e., in particular, K is not a conflict, which is why K \ K = ∅ is a diagnosis by the Duality Property, and consequently no other minimal diagnosis can exist. If neither of these trivial cases applies, the call of FINDMINCONFLICT (line 3) returns a non-empty minimal conflict C (line 8 is reached), which entails by the Hitting Set Property (cf. Sec. 2) that a non-empty (minimal) diagnosis exists. For later reuse (note: conflict computation is an expensive operation), C is added to the computed conflicts C, and then the recursive sub-procedure RBF-HS' is called (line 9). The arguments passed to RBF-HS' are the root node ∅, its f-value, and the initial bound set to −∞.

[Algorithm 2 (RBF-HS). Input: a tuple ⟨dpi, pr, ld⟩ comprising a DPI dpi = ⟨K, B, P, N⟩; a probability measure pr that assigns a failure probability pr(ax) ∈ (0, 1) to each ax ∈ K (cf. Sec. 2), where pr is cost-adjusted (cf. footnote 12) and the cost function is f(n) := pr(n) as per Eq. 1 for all tree nodes n ⊆ K; and the number ld of leading minimal diagnoses to be computed. Output: the list D of the ld (if existent) most probable (as per pr) minimal diagnoses wrt. dpi, sorted by probability in descending order. The line numbers referenced throughout this section refer to the pseudocode listing of Alg. 2, which is not reproduced here.]
Recursion: Principle. The basic principle of the recursion is very similar to the one sketched above for RBFS. That is, always explore the open node with the best F-value in a depth-first manner, until the best node has worse costs than the globally best alternative node (whose cost is always stored in bound). Then backtrack and propagate the best F-value among all child nodes upwards at each backtracking step. Based on their latest known F-value, the child nodes at each tree level are re-sorted in best-first order of F-value. When re-exploring an already explored, but later forgotten, subtree, the cost of nodes in this subtree is, if necessary, updated through a cost inheritance from parent to children. In this vein, a re-learning of already learned backed-up cost values, and thus repeated and redundant work, is avoided. Exploring a node in RBF-HS means labeling this node and assigning it to an appropriate collection of nodes based on the computed label (cf. Example 9). The recursion is executed until either D comprises the desired number ld of minimal diagnoses or the hitting set tree has been explored in its entirety.
Recursion: Details. The first argument passed to RBF-HS' (line 9 or 32) is the node n it will process. Node Labeling. As a first step, n is labeled by the LABEL function (line 12).
Node Assignment. The computed label is then handled very similarly as in the case of Reiter's HS-Tree (cf. Example 9), i.e., closed nodes are discarded, valid ones are added to D, and those labeled by a conflict L are expanded by the EXPAND function (line 20). In addition, since a value has to be returned by each recursive RBF-HS'-call (cf. line 32) in order for the recursion to be properly resumed, the (worst possible) backed-up F-value −∞ is returned for nodes without successors (labels closed and valid). Intuitively, the value −∞ can be interpreted as "this node is hopeless or already explored". The rationale behind this is to avoid misleading the algorithm into re-exploring such nodes once their costs become better than those of all other nodes. In fact, any F-value larger than −∞ would even imply the algorithm's non-termination and thus incorrectness, cf. (Mod3).
Notably, nodes with an F-value equal to −∞ can be considered again (given their parent nodes are expanded again), but, if so, they are directly labeled closed in line 40, because they are either equal to or proper supersets of some node in D. Equality holds for nodes originally labeled valid, which are therefore in D; the superset property is given in the case of nodes originally labeled closed, for which there was already a proper subset in D and thus there still must be one. This (inexpensive) catching of re-explored nodes at the very beginning of LABEL is critical since the FINDMINCONFLICT operation later in LABEL involves costly theorem prover calls, and must thus be performed as rarely as possible.
Node Expansion. Whenever n is neither a closed nor a valid node, it is labeled by a minimal conflict L and its successors Child_Nodes are created via a call of the EXPAND function (line 20). The result of this node expansion is |L| nodes, generated as n ∪ {ax_i} for each ax_i ∈ L (line 53).
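Sketched in Python (per the description above; node and conflict are plain sets, not the paper's data structures):

```python
def expand(n, conflict):
    """EXPAND sketch: a node n labeled by a minimal conflict L gets
    one successor n ∪ {ax} per axiom ax in L."""
    return [n | {ax} for ax in conflict]
```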
Node Cost Inheritance. 15 Next, the F-value of each of the newly-generated child nodes n_i is set (lines 21-25). Note, this is necessary at each node expansion since a (child) node's F-value exists only as long as the node is in memory; it is no longer stored after the node is discarded through a backtracking step of the algorithm. Intuitively, the ideal F-value would be: (a) the original f-value for child nodes never explored before, for which there cannot be a "learned" F-value yet, (b) the last known F-value for child nodes already explored before.
Basically, there are two possibilities for how RBF-HS may specify the F-value of a child node n_i: either the F-value of the parent n is passed on to the child node, or n_i's (original) f-value is used. In fact, the algorithm first checks whether n has already been explored before, which is true if f(n) > F(n) (line 22).
In the case f(n) > F(n), the child nodes can be partitioned into those that have been explored before and those that have not. For the latter class, we have F(n) ≥ f(n_i), which implies that each non-explored child node keeps its original f-cost (min in line 23). For the former class, it indeed holds that F(n) < f(n_i), which is why all already-explored nodes inherit the F-value of the parent n (min in line 23). Note, the child nodes' last known F-value (before they were discarded) might have been lower than the inherited F(n), because only one F-value is remembered by the algorithm when a subtree is forgotten; however, F(n) is at least to some extent lower than f(n_i), which implies that at least some "fraction" of n_i's already learned backed-up cost is restored by the inheritance.
Alternatively, given f(n) = F(n) (note that F(n) ≤ f(n) for all nodes n is an invariant throughout RBF-HS'), n can, but does not need to, have been explored already. If n was not yet explored, then clearly none of its child nodes n_i can have been explored either, which is why it is reasonable to set the F-value of all children to their f-value (line 25). Otherwise, i.e., if n was explored, then the latest backed-up value F(n) (which is necessarily less than f(n)) must have been overridden during backtracking by a greater F-value of another node (which is possible). Since, however, the f-value of each node is greater than the f-value of any of its successors (anti-monotonicity of f, cf. footnote 13), the "learned" F-value should in any case be "closer to the real cost" (i.e., lower) than the original estimate given by f. For this reason, it does not make sense to set the F-value of any child node n_i to the value F(n) (> f(n_i)). Hence, it is plausible also in this case to set the F-value of all children to their original f-value (line 25).
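The case distinction of the two preceding paragraphs can be condensed into a single assignment rule, sketched below (an illustration of lines 21-25, with values being probabilities, i.e., higher is better):

```python
def child_F(f_parent, F_parent, f_child):
    """F-value assigned to a (re)generated child node (sketch).
    f(n) > F(n) signals that the parent n was expanded and forgotten
    before; the backed-up F(n) then caps the child's value, restoring
    (part of) the cost information learned about the forgotten subtree.
    Otherwise the child simply keeps its original f-value."""
    if f_parent > F_parent:
        return min(F_parent, f_child)
    return f_child
```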

Child Node Preparation. Once all nodes in Child_Nodes have been assigned their F-value, Child_Nodes is prepared for node exploration (while-loop, line 31) in the following way: First, if there is only a single node in Child_Nodes, a second "dummy" node is added. The reason for this is that lines 30 and 35 require a second node to be present in Child_Nodes. In order not to compromise the correctness of RBF-HS, the F-value of this dummy node is set to the worst possible value −∞ (cf. the argumentation for Node Assignment above). Second, the nodes in Child_Nodes are sorted in descending order of F-value, such that exactly the nodes with the highest and second-highest F-value are extracted from Child_Nodes in lines 29 and 34, respectively.

15 For a detailed argumentation why the assertions about the f- and F-costs of nodes made in this paragraph hold, please consider the proof of Theorem 2.
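The two preparation steps can be sketched as follows (an illustration, not the paper's code; children are represented as (node, F) pairs):

```python
NEG_INF = float("-inf")

def prepare_children(children):
    """Child-node preparation sketch: guarantee a second entry (a dummy
    with the worst possible F, so it can never mislead the search) and
    sort in descending order of F. children is a list of (node, F)."""
    if len(children) == 1:
        children = children + [(None, NEG_INF)]  # dummy alternative
    return sorted(children, key=lambda e: e[1], reverse=True)
```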
Recursive Child Node Exploration. Now, as the child nodes have been generated, their F -costs have been set, and the list Child_Nodes has been prepared for being processed, the final block of RBF-HS' involves the best-first exploration of nodes in Child_Nodes by means of the algorithm's while-loop. Throughout the iteration of the loop, the variables n 1 and n 2 always comprise the best and second-best node among Child_Nodes, according to their (backed-up) F -value. This is guaranteed by lines 33, 34, and 35, where INSERTSORTEDBYF inserts a node to a list such that the sorting of the list according to F is preserved. The while-loop is iterated by always exploring the best node n 1 through a recursive call of RBF-HS' (line 32) as long as the current n 1 's F -value is better than bound . The latter stores the maximal F -value over all child nodes of all ancestors of n 1 (see the max which determines the bound at each recursive downward step in line 32). This value at the same time corresponds to the maximal F -value of any alternative node in the entire hitting set tree, which in turn is greater than or equal to the f -value (i.e., the probability pr ) of any existing solution other than n 1 (see the proof of Theorem 2 for a precise argumentation why these things hold). Hence, the use of bound as a ruler of backtracking actions guarantees that the most probable (remaining) solution is always found first (next). At the point where all nodes in Child_Nodes have an F -value lower than bound , the while-loop is exited and the currently best F -value among the nodes in Child_Nodes is returned, i.e., propagated upward to their parent node n. Note, in the course of the recursive explorations of the subtrees rooted at nodes in Child_Nodes throughout the iteration of the while-loop, solutions might be located and added to D.
Termination. Whenever D is extended, a check is run which tests if the list of solutions D has already reached the stipulated size ld (line 17). If so, the RBF-HS' procedure terminates (line 18). Otherwise, i.e., if there are fewer than ld minimal diagnoses existent for the tackled DPI, RBF-HS' terminates once all nodes in the hitting set tree have been explored and assigned the backed-up value −∞, which is why all recursive while-loops must stop (condition in line 31). In any case, RBF-HS finally returns D (line 10).
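Putting the pieces of the walkthrough together, the following self-contained Python sketch mimics the overall control flow under simplifying assumptions: conflicts are minimized by a simple deletion-based routine rather than QUICKXPLAIN, the node probability is an unnormalized product form, and `is_consistent` and `pr_ax` are hypothetical stand-ins. This is an illustration of the scheme, not the authors' implementation.

```python
import math

def rbf_hs(K, is_consistent, pr_ax, ld):
    """RBF-HS-style best-first diagnosis search (illustrative sketch).

    K             -- set of axioms (components) of the knowledge base
    is_consistent -- hypothetical oracle: True iff the given axioms,
                     assumed to behave normally, raise no inconsistency
    pr_ax         -- dict: failure probability in (0, 1) per axiom
    ld            -- number of leading minimal diagnoses to compute
    Returns the ld most probable minimal diagnoses, best first.
    """
    NEG = -math.inf
    D, conflicts = [], []            # found diagnoses, stored conflicts

    def f(n):                        # unnormalized node probability
        p = 1.0
        for ax in K:
            p *= pr_ax[ax] if ax in n else (1.0 - pr_ax[ax])
        return p

    def find_min_conflict(assumed_ok):
        """Deletion-based minimization (stand-in for QUICKXPLAIN)."""
        if is_consistent(assumed_ok):
            return None              # 'no conflict'
        core = list(assumed_ok)
        for ax in list(core):
            rest = [a for a in core if a != ax]
            if not is_consistent(rest):
                core = rest          # ax not needed for inconsistency
        return set(core)

    def label(n):
        if any(d <= n for d in D):
            return "closed"          # non-minimal or already found
        for C in conflicts:
            if not (C & n):
                return C             # cheap reuse: n does not hit C
        C = find_min_conflict([ax for ax in K if ax not in n])
        if C is None:
            return "valid"           # n is a (minimal) diagnosis
        conflicts.append(C)
        return C

    def rbf_hs_(n, F_n, bound):
        L = label(n)
        if L == "closed":
            return NEG
        if L == "valid":
            D.append(set(n))
            return NEG               # worst value: do not revisit
        children = [[f(n | {ax}), n | {ax}] for ax in L]
        if f(n) > F_n:               # n was expanded and forgotten before
            for e in children:
                e[0] = min(e[0], F_n)
        if len(children) == 1:
            children.append([NEG, None])      # dummy alternative
        while True:
            children.sort(key=lambda e: e[0], reverse=True)
            best, second = children[0], children[1]
            if best[0] < bound or best[0] == NEG:
                return best[0]       # unwind; back up best known value
            best[0] = rbf_hs_(best[1], best[0], max(bound, second[0]))
            if len(D) >= ld:
                return NEG           # enough diagnoses found: stop

    root = set()
    rbf_hs_(root, f(root), NEG)
    return D
```

Note the probability (maximization) orientation: bound holds the best alternative F-value, exploration continues while the best child's F-value is not less than bound, and −∞ is returned for valid and closed nodes as required by (Mod3).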
• FINDMINCONFLICT takes a DPI ⟨K, B, P, N⟩ and outputs a minimal conflict C ⊆ K if one exists, and 'no conflict' otherwise. A well-known algorithm that can be used to implement this function is QUICKXPLAIN (Junker 2004; Rodler 2020b).
• ADD(x, L) takes an object x and a list of objects L as inputs, and returns the list obtained by appending the element x to the end of the list L.
• ADDDUMMYNODE(L) takes a list of nodes L, appends an artificial node n with f(n) := −∞ to L, and returns the result.
• GETANDDELETEFIRSTNODE(L) accepts a sorted list L, deletes the first element from L, and returns this deleted element.
• GETFIRSTNODE(L) accepts a sorted list L and returns L's first element.
• SORTDECREASINGBYF(L) accepts a list of nodes L, sorts L in descending order of F-value, and returns the resulting sorted list.
• INSERTSORTEDBYF(n, L) accepts a node n and a list of nodes L sorted by F-value, and inserts n into L such that the sorting of L by F-value is preserved.
Finally, the LABEL function can be seen as a series of the blocks discussed above. Note that this LABEL function of RBF-HS' is equal to the one used in Reiter's HS-Tree (cf. Example 9), except that the duplicate check is obsolete in RBF-HS'. The reason for this is that there cannot ever be any duplicate (i.e., set-equal) nodes in memory at the same time during the execution of RBF-HS. This holds because, for all potential duplicates n_i, n_j, we must have |n_i| = |n_j|, but equal-sized nodes must be siblings (depth-first tree exploration), which is why n_i and n_j must contain |n_i| − 1 equal elements (same path up to the parent of n_i, n_j) and one necessarily different element (label of the edge pointing from the parent to n_i and n_j, respectively).
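For illustration, INSERTSORTEDBYF can be sketched with Python's `bisect` module (a sketch only; nodes are opaque, and list entries are assumed to be (node, F) pairs):

```python
import bisect

def insert_sorted_by_F(entry, lst):
    """INSERTSORTEDBYF sketch: insert a (node, F) pair into lst, which
    is kept sorted in descending order of F-value."""
    keys = [-F for (_, F) in lst]          # ascending keys for bisect
    lst.insert(bisect.bisect_right(keys, -entry[1]), entry)
    return lst
```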

RBF-HS Exemplification
The following example illustrates the workings of RBF-HS.
Illustration (Figures). The way of proceeding of RBF-HS is depicted by Figs. 1 and 2. In the figures, we use the following notation. Axioms ax_i are simply referred to by i (in node and edge labels). Numbers k indicate the chronological node labeling (expansion) order. Recall that nodes in Alg. 2 are sets of (integer) edge labels along tree branches. E.g., node 9 in Fig. 1 corresponds to the node n = {ax2, ax4}, i.e., to the assumption that components c2, c4 are at fault whereas all others are working properly. The probability pr(n) (i.e., the original f-value) of a node n is shown by the black number from the interval (0, 1) that labels the edge pointing to n; e.g., the cost of node 9 is 0.18. We tag minimal conflicts that label internal nodes by C if they are freshly computed (expensive; FINDMINCONFLICT call, line 44), and by R if they result from a reuse of some already computed and stored (see list C in Alg. 2) minimal conflict (cheap; reuse-label check, lines 41-43). Leaf nodes are labeled as follows: "?" is used for open (i.e., generated, but not yet labeled) nodes; (Di) for a node labeled valid, i.e., a minimal diagnosis named Di, that is not yet stored in D; × (Expl) for a node labeled closed, i.e., one that constitutes a non-minimal diagnosis or a diagnosis that has already been found and stored in D; Expl is an explanation for the non-minimality in the former case and for the redundancy of the node in the latter case, i.e., Expl names a minimal diagnosis in D that is a proper subset of the node, or it names a diagnosis in D which is equal to the node, respectively. Whenever a new diagnosis is added to D (line 16), this is displayed in the figures by a box that shows the current state of D. For each expanded node, the value of the bound variable relevant to the subtree rooted at this node is denoted by a red-colored value above the node.
By green color, we show the backed-up F-value returned in the course of each backtracking step (i.e., the best known probability of any node in the respective subtree). Further, f-values that have been updated by backed-up F-values are indicated by green-colored edge labels; see, e.g., in Fig. 1, how the label of the left edge emanating from the root node has been reduced from 0.41 (f-value) to 0.09 (F-value) after the first backtrack. Finally, F-values of parents inherited by child nodes (line 23) are indicated by brown color, see the edge between node 14 and node 15 in Fig. 2.
Discussion and Remarks. Initially, RBF-HS starts with an empty root node, labels it with the minimal conflict ⟨1, 2, 5⟩ at step 1, generates the three corresponding child nodes {1}, {2}, {5} shown by the edges originating from the root node, and recursively processes the best child node (left edge, f-value 0.41) at step 2. The bound for the subtree rooted at node 2 corresponds to the best edge label (F-value) of any open node other than node 2, which is 0.25 in this case. In a similar manner, the next recursive step is taken in that the best child node of node 2 with an F-value not less than bound = 0.25 is processed. This leads to the labeling of node {1, 4} with F-value 0.28 ≥ bound at step 3, which reveals the first (provenly most probable) diagnosis D1 := [1, 4] with pr(D1) = 0.28, which is added to the solution list D. Note that −∞ is at the same time returned for node 3. After the next node has been processed and the second-most-probable minimal diagnosis D2 := [1, 6] with pr(D2) = 0.27 has been detected, the best remaining child node of node 2 has an F-value of 0.09 (leftmost node). This value, however, is lower than bound. Due to the best-first property of RBF-HS, this node is not explored right away, because bound suggests that there are more promising unexplored nodes elsewhere in the tree which have to be checked first. To keep the memory requirements linear, the current subtree rooted at node 2 is discarded before a new one is examined. Hence, the first backtrack is executed. This involves the storage of the best (currently known) F-value of any node in the subtree as the backed-up F-value of node 2. This newly "learned" F-value is indicated by the green number (0.09) that by now labels the left edge emanating from the root. Analogously, RBF-HS proceeds for the other nodes, where the used bound value is always the best value among the bound value of the parent and all siblings' F-values.
Please also observe the F-value inheritance that takes place when node {2, 4} is generated for the third time (node 15, Fig. 2). The reason for this is that the original f-value of {2, 4} is 0.18 (see top of Fig. 1), but the meanwhile "learned" F-value of its parent {2} is 0.11 and thus smaller. This means that {2, 4} must have already been explored, and the de-facto probability of any (minimal) diagnosis in the subtree rooted at {2, 4} must be less than or equal to 0.11.
Output. Finally, RBF-HS immediately terminates as soon as the ld-th (in this case: fourth) minimal diagnosis D4 is located and added to D. The list D of minimal diagnoses, arranged in descending order of probability pr, is returned.

RBF-HS Complexity Analysis
Time Complexity. We can distinguish between two sources of time complexity inherent in RBF-HS: (t1) logical consistency checking, and (t2) tree construction and management.
As to (t1), both the hardness and the number of performed consistency checks are of relevance.
First, the hardness of the consistency checks executed by RBF-HS depends on the knowledge representation language adopted to model the diagnosed system and thus cannot be assessed in general. It might range from polynomial for Horn logic, through NP-complete for propositional system descriptions, to much harder, such as (2)NEXPTIME-complete for some Description Logics (Grau et al. 2008). Note, despite these somewhat discouraging theoretical complexities, experience with real-world diagnosis cases has shown that practical runtimes for consistency checks are often reasonable, even for interactive scenarios and very expressive logics (Kalyanpur 2006; Horridge 2011; Rodler et al. 2019; Shchekotykhin et al. 2012; Shearer, Motik, and Horrocks 2008).
Regarding the number of consistency checks, in contrast, we are able to derive the upper bound O(|K|(|minC| + |ld|)), where minC denotes the set of all minimal conflicts for the DPI dealt with. To see why this holds, observe that (i) the only place where RBF-HS issues consistency checks is in line 44 (FINDMINCONFLICT), (ii) each FINDMINCONFLICT call either yields a minimal conflict (line 49) or a minimal diagnosis (line 46), (iii) RBF-HS terminates once the desired ld minimal diagnoses have been found, (iv) each minimal conflict is actually computed only once (but it might be reused multiple times by means of the stored list of conflicts C), and (v) one call of FINDMINCONFLICT requires O(|K|) consistency checks in the worst case (Marques-Silva, Janota, and Belov 2013) if a minimal conflict C is returned, and only a single check if a minimal diagnosis is found (i.e., 'no conflict' is output). Hence, no more than |minC| + |ld| calls of FINDMINCONFLICT, each issuing no more than |K| consistency checks, can be made throughout the execution of RBF-HS.
Factor (t2) is somewhat harder to estimate, as one and the same node might be explored multiple times (cf., e.g., node {2, 4}, which is processed three times in our Example 10). Essentially, there are two main aspects that affect this factor: (i) the more different f-values there are among all nodes, and (ii) the more widely the promising nodes are distributed over the search tree, the more backtrackings and node re-explorations RBF-HS will perform (Hatem, Kiesel, and Ruml 2015). In the worst case, each node has a different f-value and, when sorting all nodes according to their f-value, any two neighbors in this sorting are in different subtrees of the root node. In such a scenario, O(n) node explorations have to be executed per newly expanded node, where n is the number of all nodes in the complete hitting set tree (as constructed by HS-Tree). The reason for this is that each node expansion requires forgetting the entire last explored subtree of the root and expanding another one until the newly expanded node is reached. Since n nodes will be explored overall (as many as HS-Tree explores 16), we have a resulting complexity of O(n^2) (cf. the analogous argumentation in (Hatem, Kiesel, and Ruml 2015) for RBFS). However, this scenario is only possible when most probable diagnoses are sought.
In the minimum-cardinality case, we can deduce 17 from the findings of (Korf 1992) that RBF-HS explores O(n) nodes, i.e., for sufficiently large problem size, no more than a constant factor more than HS-Tree does. Intuitively, the plausibility of this can be verified by considering (i) and (ii). As to (i), we have only d different node costs in the minimum-cardinality case, where d is the size of the last-found diagnosis. Regarding (ii), it is straightforward to see that the next explored node after any node n will be the sibling of n's closest ancestor which has not been processed in the current iteration. 18 Thus, each next-best node will be "close" to the current node and a minimum number of backtracking steps will have to be performed to reach the next-best node from the current one.
Space Complexity. First, the space complexity of Korf's original RBFS algorithm, which acts as the basis for RBF-HS, is linear (Korf 1992), i.e., in O(bd), where b is the maximal number of successor states of any state (a.k.a. the branching factor) and d the maximal length of any path in the search space. Second, no amendments to the recursive (depth-first) nature of RBFS have been made while deriving RBF-HS (cf. Sec. 3.1). Third, RBF-HS stores computed minimal conflicts and minimal diagnoses, information RBFS does not need. In RBF-HS, recorded conflicts allow for a more efficient labeling of nodes (reuse instead of recalculation), whereas the storage of diagnoses is essential for the algorithm's correctness and moreover trivially necessary, as diagnoses constitute exactly the solutions which should finally be returned.

16 This holds under the assumption that HS-Tree does not close duplicate nodes, i.e., the (duplicate) criterion is left out, cf. Example 9. In this case, it will explore exactly the same nodes as RBF-HS (which, by construction, cannot eliminate duplicates, cf. 'Sub-Procedures' in Sec. 3.2), except that the latter might explore nodes more than once. Note that we have observed in diverse experiments with HS-Tree that it usually runs faster if the duplicate check is omitted, because the latter has to search a potentially exponential-sized collection of nodes at (almost) each processing of a node. The correctness of HS-Tree is not harmed by this modification.

17 (Korf 1992) derived this for RBFS in comparison to breadth-first search. We can transfer this result to RBF-HS and HS-Tree (without duplicate check, cf. footnote 16) for the following reasons: First, HS-Tree performs exactly a breadth-first search when minimum-cardinality diagnoses are sought, due to the f-cost of any node n being reciprocal to its cardinality (tree depth) |n| in this case, cf. Sec. 3.2. Second, the fact that RBF-HS and HS-Tree usually execute until multiple solutions are found (while RBFS and breadth-first search terminate with the finding of the first solution) is not detrimental to Korf's analysis, as his result is independent of the goal function. In other words, if k diagnoses should be found, the k-th found diagnosis is interpreted as the first goal node (and the first k − 1 diagnoses are simply interpreted as non-goal nodes without successors).

18 Like (Korf 1992), we define an iteration of RBF-HS as "the interval of time when those nodes being expanded for the first time are all of the same cost."
Hence, the space complexity of RBF-HS is affected by three factors: (s1) |D| (number of stored minimal diagnoses), (s2) |C| (number of stored minimal conflicts), and (s3) the space required to store the search tree.
Factor (s1) is bounded by the fixed input argument ld, which is arbitrarily preset by the user of RBF-HS, and thus in O(1). 19 Clearly, factor (s2) is bounded by |minC|, where minC is the set of all minimal conflicts for the considered DPI. Analogously to RBFS, factor (s3) is bounded by |C_max| · |minC|, where C_max is a minimal conflict of maximal cardinality for the DPI. The explanation for this is that (i) no node can have more than |C_max| child nodes (reason: exactly k successors result from a node-labeling conflict of size k; no other ways of successor generation exist in RBF-HS), (ii) no node (set of edge labels along a tree path) can include more than |minC| elements (reason: any node including |minC| elements must hit all minimal conflicts and thus must be a diagnosis; diagnoses are labeled valid or closed and never further expanded by RBF-HS), and (iii) at any tree depth, only a single node can be expanded at one particular point in time (reason: depth-first recursion, line 32). All in all, given finite ld, we thus have a space complexity of O(|C_max| · |minC|), which can be interpreted as branching factor times maximal depth, equivalently as for RBFS.
In many practical diagnosis scenarios, the number of minimal conflicts is relatively small. Only in exceptional situations, like when two different, but related, systems are integrated, might a somewhat larger number of different (independent) faults, and thus minimal conflicts, emerge at once (Shchekotykhin et al. 2014). In both cases, however, absolute values of |minC| usually range from single-digit to medium double-digit numbers. In addition, and more importantly, experience in the diagnosis field suggests that usually 20 the number of minimal conflicts does not depend on (or: grow with) the size of the diagnosed system. There are small systems with a high number of conflicts, just as there are huge systems with negligible numbers of conflicts. So, from an empirical perspective, it appears justified in many cases to interpret |minC| to be in O(1). This assumption implies that RBF-HS is linear in the size of the DPI ⟨K, B, P, N⟩, because clearly |C_max| ≤ |K| due to C_max ⊆ K (cf. Sec. 2.1). Note, if both b and d are assumed to be not in O(1) (i.e., dependent on the problem size), then the original RBFS algorithm also loses its linear space bound.

19 Note, if ld := ∞ is specified, meaning that the intention is to find all minimal diagnoses for the given DPI, then ld is not in O(1), but conditioned by the number of existent minimal diagnoses. Obviously, a generally linear algorithm to accomplish that task is theoretically impossible, since the mere maintenance of the collection of (potentially exponentially many) solutions D might require more than linear space.

20 Still, we see in (de Kleer 1991) that there are systems (from the domain of digital circuits) that include exceptionally long connected chains of components which altogether determine some system output. If such an output is observed to be wrong, this long component chain gives rise to a large set of minimal conflicts, which does depend on the system size |K|.
Summary. In summary, our complexity results are as follows. Theorem 1. Let dpi = ⟨K, B, P, N⟩ be an arbitrary DPI, ld the number of diagnoses to be computed, n the number of nodes expanded by HS-Tree (without the duplicate criterion) for dpi and ld, t_CC the worst-case time of a consistency check for dpi, minC the set of all minimal conflicts for dpi, and C_max the conflict of maximal size for dpi. Further, let TPT := t_CC |K|(|minC| + |ld|) (theorem proving time). Finally, assume that ld ∈ O(1) and |minC| ∈ O(1). Then:

RBF-HS Correctness
The following theorem shows that RBF-HS is correct. The proof can be found in Appendix A. Theorem 2. Let FINDMINCONFLICT be a sound and complete method for conflict computation, i.e., given a DPI, it outputs a minimal conflict for this DPI if a minimal conflict exists, and 'no conflict' otherwise. RBF-HS is sound, complete and best-first, i.e., it computes all and only minimal diagnoses in descending order of probability as per the cost-adjusted probability measure pr .

RBF-HS: Potential Impact and Synergies with Other Techniques
Beside RBF-HS's direct use
• as a space-efficient alternative to (exponential-space) best-first diagnosis search algorithms such as HS-Tree (Reiter 1987), HST-tree (Wotawa 2001), DynamicHS (Rodler 2020a), GDE (de Kleer and Williams 1987), or StaticHS (Rodler and Herold 2018), or
• as a best-first alternative to sound and complete linear-space any-first searches like Inv-HS-Tree (Shchekotykhin et al. 2014), or
• as a complete alternative to best-first, but incomplete algorithms like CDA* (Williams and Ragno 2007) or STACCATO (Abreu and Van Gemund 2009),
several employments of RBF-HS combined with existing techniques can be conceived of. We briefly sketch some of them:
(A) Informed HS-Tree: The idea is to run RBF-HS as a preprocessor in order to provide more informed node probabilities, and to subsequently run HS-Tree using these "learned" probabilities as f-values. To this end, RBF-HS could, e.g., be executed with a fixed time limit and modified to store the backed-up F-values of all (or some) visited nodes, not only of the ones that are kept in memory after backtracking steps. Like a heuristic for classic A*, this additional "lookahead" information might allow the preferred diagnoses to be found while expanding significantly fewer nodes.
(B) Faster, Relaxed-Bound RBF-HS: To remedy the issue, pertinent to RBF-HS, that node probabilities which differ only slightly can lead to a large number of backtracking and re-exploration iterations, variants of RBF-HS are possible which explore "a little" further than until the given bound is violated (cf. (Hatem, Kiesel, and Ruml 2015)). A challenge arising from this change is the preservation of the soundness and best-first properties (as non-minimal or non-preferred diagnoses might be found earlier than minimal or preferred ones). If cardinality is the preference criterion, one possibility to address this problem would be the use of a post-processing routine, such as Inv-QX (Shchekotykhin et al. 2014), which can efficiently minimize possibly non-minimal output diagnoses after the fact.
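The post-processing idea can be illustrated with a simple deletion-based minimizer. This is a crude stand-in for Inv-QX (which needs far fewer checks in practice); `is_diagnosis` is an assumed oracle that tests the diagnosis property for the DPI at hand.

```python
def minimize_diagnosis(candidate, is_diagnosis):
    """Shrink a possibly non-minimal diagnosis to an irreducible one by
    tentatively deleting one element at a time (deletion-based sketch)."""
    diag = list(candidate)
    i = 0
    while i < len(diag):
        trial = diag[:i] + diag[i + 1:]
        if is_diagnosis(set(trial)):
            diag = trial   # element was redundant; drop it
        else:
            i += 1         # element is needed; keep it and move on
    return set(diag)
```

For a conflict-based DPI, `is_diagnosis` can be instantiated as "hits all minimal conflicts"; e.g., the non-minimal candidate {1, 2, 3} shrinks to {2} for the conflicts {1, 2} and {2, 3}.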
(C) RBF-HS as a Decision Heuristic...: The rationale is to run RBF-HS for a certain time and to afterwards take the "learned" F-values of nodes as an estimate of the hardness or some other relevant property of the diagnosis problem.
...for Algorithm Choice: E.g., if RBF-HS runs a cardinality search, the backed-up F-values provide an estimate of the least depth of the search tree, i.e., of the least size c of minimum-cardinality diagnoses. If c exceeds a specific threshold, then it can be a good idea to drop certain nice-to-have requirements on the adopted diagnosis computation algorithm (such as completeness or the best-first property) in order to keep the runtime within acceptable bounds. E.g., when completeness is abandoned, possible algorithm alternatives are SAFARI (Feldman, Provan, and Van Gemund 2008), STACCATO (Abreu and Van Gemund 2009), or NGDE (de Kleer 2009); if the necessity for best-first solution generation is dropped, Inv-HS-Tree (Shchekotykhin et al. 2014) could be a good choice.
...for Parameter Choice: E.g., given a reasonable approximation of the cardinality (or: probability) of the most preferred diagnosis, computed by means of RBF-HS, this approximation can be used for an informed selection of a limit for depth-limited (or: cost-limited) search (Russell and Norvig 2010). Depth-limited search using a suitable limit can be a powerful linear-space strategy to find the preferred diagnoses, and might be substantially faster than iterative deepening or IDA* (hitting set) searches and RBF-HS.
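A minimal sketch of such a dispatch rule; the threshold value and the returned strategy names are purely illustrative, not prescribed by the text.

```python
def choose_strategy(estimated_min_card, threshold=8):
    """Toy decision heuristic based on the least diagnosis cardinality c
    estimated from the backed-up F-values of a bounded RBF-HS run."""
    if estimated_min_card <= threshold:
        # problem looks tractable: keep soundness, completeness, best-first
        return ('RBF-HS', None)
    # otherwise relax requirements and reuse the estimate as an informed
    # depth limit for a depth-limited (linear-space) search
    return ('depth-limited search', estimated_min_card)
```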
(D) RBF-HS as a Plug-In: Given a diagnosis search method that uses a hitting set generation routine as a black-box, such as SDE (Stern et al. 2012), RBF-HS can be used as a plugin, e.g., in case memory issues are faced when using other best-first algorithms.

(E) Hybrid Best-First Hitting Set Search (HBF-HS):
The goal is to allow for a sound, complete and best-first diagnosis search that is as fast as possible, also in cases where state-of-the-art searches boasting these three properties (e.g., HS-Tree) run out of memory. The principle is to initially execute standard HS-Tree (see Example 9), but to equip it with a switch criterion (e.g., a maximal number of processed nodes, or a maximal amount or fraction of memory consumed) that, when triggered, prompts a switch to RBF-HS. The latter then continues the search while only consuming a linear amount of additional memory. In this vein, HS-Tree can utilize as much memory as it needs while executing (focus on time optimization), and, before the available memory is depleted, RBF-HS takes over (focus on space optimization) such that the problem remains solvable. As a theoretical analysis revealed, the transfer of control between HS-Tree and RBF-HS is rather straightforward while guaranteeing the retention of the soundness, completeness and best-first properties. Specifically, merely three steps are necessary to set up the execution of RBF-HS after the switch criterion stops HS-Tree: (S1) Generate an imaginary root node and set its bound to −∞. (S2) Connect this root node by one edge each to the current non-duplicate open nodes of HS-Tree, which thereby become the child nodes Child_Nodes of the root. (S3) Carry over the collections D (found minimal diagnoses) and C (computed minimal conflicts) from HS-Tree to RBF-HS. Then, execute plain RBF-HS.
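Under our reading of the three switch steps, the hand-over can be sketched as follows; the data-structure details (paths as lists of edge labels, the root as a plain dictionary) are assumptions for illustration.

```python
import math

def switch_to_rbfhs(open_paths, diagnoses, conflicts):
    """HBF-HS hand-over sketch: discard duplicate open nodes (a node is
    identified by its *set* of edge labels), make the remaining ones the
    Child_Nodes of an imaginary root node with bound set to -infinity, and
    carry over the collections of known diagnoses (D) and conflicts (C)."""
    child_nodes, seen = [], set()
    for path in open_paths:
        node = frozenset(path)      # diagnoses are sets of edge labels
        if node not in seen:
            seen.add(node)
            child_nodes.append(node)
    root = {'bound': -math.inf, 'children': child_nodes}
    return root, list(diagnoses), list(conflicts)
```

With five open paths of which two coincide as sets (e.g., [1, 2] and [2, 1]), the imaginary root ends up with four children, mirroring the situation described in Example 11 (the concrete labels used below are partly made up).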
Example 11 (HBF-HS) Let us reconsider the DPI introduced in Example 10 and have a look at how HBF-HS would proceed for it. Assume the switch criterion is defined as "ten generated nodes". Specifically, this means: Execute HS-Tree until ten nodes are generated, then execute steps (S1)-(S3), and finally run RBF-HS. Fig. 3 shows on the left the end state of HS-Tree before the switch is performed, and on the right the state of the transformed tree on which RBF-HS will begin its operations. Observe the following: • At the time the switch takes place, ten node generations have occurred, and seven nodes are currently maintained by HS-Tree, encompassing two valid nodes and five open nodes ("?"). Note that two of the open nodes, the leftmost and the fourth-leftmost one, are equal (i.e., the path labels {1, 2} and {2, 1} coincide). Hence, one of them is a duplicate and does not need to be further considered (recall that diagnoses are sets of edge labels). Now, step (S1) of the switch process prompts the construction of a new tree through (i) the generation of a virtual root node with bound (red color) set to −∞ and (ii) the connection of this root node by one edge each to the four non-duplicate open nodes (cf. step (S2)), as shown on the right of Fig. 3. Note that the labels of the edges emanating from the root node are now sets of elements from K (cf. the right part of Fig. 3). Nevertheless, all labels of other edges, not linked to the root node, are singletons, just as in plain RBF-HS 21 (cf. Example 10). • Two minimal diagnoses have already been located by HS-Tree (nodes 3 and 4), and three minimal conflicts have been computed (node labels 1, 2 and 5). These are copied to the respective collections D and C maintained by RBF-HS in step (S3), as depicted at the top of the right part of Fig. 3.
[Figure 3: Sketch of the execution of HBF-HS on the DPI from Example 10; the right part shows the state of RBF-HS before the first node (the rightmost one) is explored.]
• The execution of RBF-HS works exactly as discussed in Example 10, with the difference that it starts with the partial hitting set tree displayed on the right of Fig. 3, where we have one root node and four elements among the Child_Nodes of the root. That is, the first explored node would be the rightmost one, {5}, with the maximal F-value 0.25 among Child_Nodes, and the bound for the processing of {5} would be 0.18, the second-best F-value (that of node {2, 4}) among Child_Nodes. Intuitively, the RBF-HS execution in the course of HBF-HS can be regarded as a warm-started version of RBF-HS with some conflicts and open nodes, and potentially also some diagnoses, provided from the outset.
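The selection of the first node to explore and its bound can be sketched as follows; the F-values 0.25 (node {5}) and 0.18 (node {2, 4}) are taken from Example 11, while the other two children and their values are made up for illustration.

```python
def next_exploration(child_F):
    """Explore the child with maximal F-value next; the bound for its
    processing is the best F-value among its siblings (the search
    maximizes probabilities, hence 'best' means largest)."""
    ranked = sorted(child_F.items(), key=lambda kv: kv[1], reverse=True)
    best_node = ranked[0][0]
    bound = ranked[1][1] if len(ranked) > 1 else float('-inf')
    return best_node, bound

F = {frozenset({5}): 0.25, frozenset({2, 4}): 0.18,
     frozenset({1, 2}): 0.12, frozenset({2, 6}): 0.10}  # last two made up
```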

Classifying Diagnosis Computation Methods
Literature offers a wide variety of diagnosis computation algorithms, motivated by different diagnosis problems, domains and challenges. These algorithms can be compared along multiple dimensions, e.g.: 22
• best-first: minimal diagnoses are output in order, most-preferred first, according to a given preference criterion (de Kleer 1991; de Kleer and Williams 1987; Greiner, Smith, and Wilkerson 1989; Reiter 1987) vs. any-first: minimal diagnoses are output in no particular order,
• complete: all minimal diagnoses can be found vs. incomplete: not all minimal diagnoses might be found (Abreu and Van Gemund 2009; de Kleer 2009; Feldman, Provan, and Van Gemund 2008; Li and Yunfei 2002; Siddiqi, Huang, and others 2007; Williams and Ragno 2007),
• conflict-based: minimal diagnoses are built as hitting sets of conflicts (de Kleer 2011; de Kleer and Williams 1987; Greiner, Smith, and Wilkerson 1989; Lin and Jiang 2003; Reiter 1987; Rodler 2015; Rodler and Herold 2018; Stern et al. 2012; Wotawa 2001; Xiangfu and Dantong 2006) vs. direct: minimal diagnoses are built without reliance on conflicts, e.g., through divide-and-conquer or compilation techniques (Darwiche 2001; Felfernig, Schubert, and Zehentner 2012; Metodi et al. 2014; Shchekotykhin et al. 2014; Torasso and Torta 2006),
• stateful: the state of the search data structure, usually a tree or graph, is maintained and reused throughout a diagnosis session, i.e., even if the diagnosis problem changes through the acquisition of new information about the diagnosed system (de Kleer and Williams 1987; Rodler 2015; Rodler and Herold 2018; Siddiqi and Huang 2011; Rodler 2020a) vs. stateless: whenever the diagnosis computation algorithm is called, it computes diagnoses by means of a fresh search data structure (de Kleer 2011; Greiner, Smith, and Wilkerson 1989; Reiter 1987; Wotawa 2001; Xiangfu and Dantong 2006),
• black-box: the theorem prover called throughout the diagnosis search is used as is, i.e., as a pure oracle, which makes the diagnosis search very general in that no dependency on any particular logic or reasoning mechanism is given (Greiner, Smith, and Wilkerson 1989; Reiter 1987; Rodler 2015; Rodler and Herold 2018; Shchekotykhin et al. 2014; Wotawa 2001) vs. glass-box: the used theorem prover is internally optimized or modified for diagnostic purposes, which can bring performance gains, but makes the method reliant on one particular logic and reasoning mechanism (Baader and Penaloza 2008; Horridge 2011; Kalyanpur 2006; Schlobach et al. 2007),
• on-the-fly: conflicts are computed on demand in the course of the diagnosis search (Greiner, Smith, and Wilkerson 1989; Reiter 1987; Rodler and Herold 2018; Wotawa 2001; de Kleer and Williams 1987) vs. preliminary: the set of minimal conflicts must be known in advance and given as an input to the diagnosis search (Abreu and Van Gemund 2009; de Kleer 2011; Li and Yunfei 2002; Lin and Jiang 2003; Pill and Quaritsch 2012; Xiangfu and Dantong 2006),
• worst-case linear-space: the algorithm requires an amount of memory that is linear in the problem size, even in the worst case (Felfernig, Schubert, and Zehentner 2012; Shchekotykhin et al. 2014) vs. worst-case exponential-space: the algorithm's memory consumption may grow exponentially with the problem size in the worst case.

22 The list of references quoted for each dimension (bullet point) is not intended to be exhaustive. We rather tried to give some representatives of each property and to give credit to (hopefully) most of the relevant works over all the discussed dimensions.

Towards Improving Existing Methods
Our study of these existing works suggests two things. First, the best choice of algorithm generally depends largely on the particular problem tackled (domain and requirements). Consequently, there is little hope of finding an algorithm that comes anywhere near improving upon all of the existing ones. Second, performance improvements for algorithms are often achieved at the cost of losing desirable properties (e.g., completeness or the best-first guarantee). Hence, it is particularly noteworthy that RBF-HS as well as HBF-HS aim to improve existing sound, complete and best-first diagnosis searches while preserving all these favorable properties. Moreover, to the best of our knowledge, RBF-HS is the first linear-space diagnosis computation method that ensures soundness, completeness and the best-first property.

Related Works in Diagnosis Domain
In terms of the above-mentioned dimensions, RBF-HS and HBF-HS are best-first, complete, stateless, conflict-based, black-box, and on-the-fly. Moreover, RBF-HS is worst-case linear-space whereas HBF-HS is not. 23 We now discuss diagnosis algorithms related to the ones proposed in this work and point out crucial differences wrt. the enumerated dimensions. Specifically, these related algorithms can be categorized into compilation-based (not black-box; can be polynomial-space or linear-space, but only under certain circumstances), duality-based (either not best-first or not linear-space) and best-first search (whenever sound and complete, then exponential-space) approaches: Compilation-Based Approaches: These techniques compile the diagnosis problem into some target representation such as SAT (Metodi et al. 2014), OBDD (Torasso and Torta 2006) or DNNF (Darwiche 2001). Often, the generation of (minimum-cardinality, but not maximal-probability) diagnoses can be accomplished in worst-case polynomial time in the size of the respective compilation. For a polynomial-sized compilation, this implies polynomial-time diagnosis generation. However, first, the size of the compilation might be exponential in the size of the diagnosis problem for all these approaches, which means that no guarantee for polynomial-space (or polynomial-time), let alone linear-space, diagnosis generation can be given. Second, for these compilation approaches to be applicable to a DPI, the diagnosed system must be amenable to a propositional-logic description, which is not always the case (Shchekotykhin et al. 2012; Kalyanpur 2006; Horridge 2011). Beyond that, compilation approaches usually do not allow one to influence the exact order in which diagnoses are output. In summary, these methods are in general not linear-space, not best-first, and not black-box.
A compilation-based approach that is based on abstraction techniques and especially suited for sequential diagnosis scenarios is SDA (Siddiqi and Huang 2011). One difference between RBF-HS and SDA is that SDA outputs only a single best diagnosis (instead of a set of best diagnoses) at the end of the sequential diagnosis process. Second, it is questionable whether abstraction techniques similar to those used in SDA are applicable to logics more expressive than propositional logic and to systems that are structurally different from typical circuit topologies. (El Fattah and Dechter 1995) present an approach that translates a circuit diagnosis problem into a constraint optimization problem. When the dual constraint graph of this problem is a tree, the minimum-cardinality diagnoses can be generated in linear time and space. However, it is unclear if and how non-circuit problems and more expressive or other types of logics can be addressed.
Duality-Based Approaches: FastDiag (Felfernig, Schubert, and Zehentner 2012) and its sequential diagnosis extension Inv-HS-Tree (Shchekotykhin et al. 2014) perform a linear-space depth-first diagnosis search that is grounded on the relationship between diagnoses and conflicts according to the Duality Property (cf. Sec. 2.1). The soundness and completeness of the diagnosis computation despite the depth-first search is accomplished by interchanging the roles of conflicts and diagnoses in the hitting set tree. That is, in these approaches the node labels correspond to minimal diagnoses and the tree paths represent conflicts. The computation of minimal diagnoses instead of minimal conflicts during the labeling process is achieved by a suitable adaptation (Shchekotykhin et al. 2014) of the QuickXPlain algorithm (Junker 2004; Rodler 2020b). The main difference of these approaches to RBF-HS (and HBF-HS) is that they cannot ensure that diagnoses are computed in any particular (preference) order. (Stern et al. 2012) present a sound and complete approach that interleaves conflict and diagnosis computation in a way that information from conflict computation aids the diagnosis computation and vice versa. However, unlike RBF-HS, this approach is not linear-space in general. In addition, it cannot compute most-probable, but only minimum-cardinality diagnoses.
Best-First-Search Approaches: 24 First and foremost, we have the seminal methods HS-Tree (Reiter 1987), along with its amended version HS-DAG proposed by (Greiner, Smith, and Wilkerson 1989), and GDE (de Kleer and Williams 1987), which are sound, complete 25 and best-first. (Kalyanpur 2006; Rodler 2015) describe sound and complete uniform-cost search variants of Reiter's HS-Tree which enumerate diagnoses in some order of preference. Here, (Rodler 2015) defines the preference order by means of a probability model over diagnoses (as characterized in Sec. 2.1), whereas (Kalyanpur 2006) relies on a heuristic model that ranks single axioms based on their "importance"; the sum over the axioms included in a diagnosis is used to determine the rank of the diagnosis. (Meilicke 2011b) goes one step further and incorporates a heuristic function into the search, yielding a hitting set version of A*. Note that the specification of a useful heuristic function, as suggested in (Meilicke 2011b) for an additive cost function, is an open problem in uniform-cost hitting set search with non-additive costs (cf. (VI) in Sec. 2.2), as in the case of our proposed methods. 26 (Wotawa 2001) suggests a variant of HS-DAG which builds a hitting set tree based on a subset-enumeration strategy in order to improve the diagnosis computation time. The same objective is pursued by (Jannach, Schmitz, and Shchekotykhin 2016), who propose parallelization techniques for Reiter's HS-Tree. Further, there are sound, complete and best-first diagnosis searches that are particularly useful for fault isolation and sequential diagnosis, StaticHS (Rodler and Herold 2018) and DynamicHS (Rodler 2020a). These are stateful in that they exploit a persistently stored and incrementally adapted (search) data structure to make the diagnostic process more efficient.
More specifically, StaticHS aims at reducing the number of interactions required from a user, e.g., to make system measurements or answer system-generated queries, whereas DynamicHS targets the minimization of the computation time.
In contrast to RBF-HS, all these best-first search approaches require exponential space in general.

Related Works in Heuristic Search Domain
Next, we discuss other memory-limited general search algorithms that are related to RBFS (and thus to RBF-HS), works that aim at improving RBFS, and methods related to HBF-HS.

24 We restrict the discussion here to sound and complete methods. The consideration of all best-first algorithms would go beyond the scope of this work.
25 Reiter's original HS-Tree is complete only if no non-minimal conflicts are generated during the construction of the HS-Tree (cf. Example 9).
26 Note, multiplicative costs can be reframed as additive costs by applying the negative logarithm to each component probability pr(ax) for ax ∈ K (Chung, Van Eepoel, and Williams 2001). However, this does not solve the problem of finding a useful heuristic.
Memory-Limited Search Algorithms: Beside RBFS, there is a range of alternative linear-space heuristic search techniques. Some examples are IDA* (Korf 1985), MREC (Sen and Bagchi 1989), MA* (Chakrabarti et al. 1989), DFS* (Rao, Kumar, and Korf 1991), IDA*-CR (Sarkar et al. 1991), MIDA* (Wah 1991), ITS (Nau, Ghosh, and Kanal 1992), IE (Russell 1992), and SMA* (Russell 1992). In contrast to RBFS, these algorithms generally do not expand nodes in best-first order if the given cost function is non-monotonic. This property, however, does not pose a problem in the hitting set computation scenario. The reason for this is that the cost function in hitting set search has to be anti-monotonic (cf. (VI) in Sec. 2.2) to find solutions in best-first order. Recall from footnote 13 that anti-monotonicity (for maximal-cost solutions) in hitting set search is the equivalent of monotonicity (for minimal-cost solutions) in classic heuristic search. Hence, in principle, any of these algorithms could have been used as a basis for this work, i.e., for being translated to a hitting set version. The reasons for choosing RBFS as a foundation for our presented algorithms are twofold: First, RBFS is particularly well-understood and covered by a rich collection of literature including both theoretical and empirical analyses of the algorithm. Second, and more importantly, RBFS is asymptotically optimal, 27 requiring O(b^d) time 28 when being used for minimum-cardinality diagnosis computation (Korf 1992), which is one of the most central and fundamental problems in model-based diagnosis. Compared to IDA*, 29 which is the most prominent 30 linear-space best-first search algorithm and also asymptotically optimal for minimum-cardinality hitting set search, RBFS exhibits a better practical (empirical) time complexity 31 (Zhang and Korf 1995), which can be intuitively explained by the fact that RBFS, unlike IDA*, does not discard the entire search tree between any two iterations (Korf 1992).
This runtime advantage of RBFS over IDA* holds especially when the cost of node expansion is high (Hatem and Ruml 2014). This is absolutely the case in diagnosis search, where node expansion requires a conflict, which must either be sought in a maintained list of conflicts (reuse case) or must be newly generated using expensive theorem proving (computation case); see the LABEL function in Alg. 2. This is why RBFS appeared to be a more appropriate basis for constructing a hitting set search than IDA*. Finally, there is CDA* (Williams and Ragno 2007), a version of A*, originally proposed for solving optimal constraint satisfaction problems, which is also employable for diagnosis search. It incorporates an any-space algorithm that generates the most preferred diagnoses first. The two important differences to RBF-HS are that CDA* is not black-box, i.e., appears not to be as flexibly usable with arbitrary logics and reasoners as RBF-HS, and that CDA* is generally incomplete (Stern et al. 2012).

27 An algorithm algo is called asymptotically optimal for some problem class C iff it is (for the problem size n growing to infinity) not more than a constant factor worse than the best achievable running time best on problems of class C. Formally: time(algo, C) ∈ O(time(best, C)).
28 Here, b is the branching factor, i.e., the maximal number of successors of any node, and d is the maximum search depth.
29 Cf. Example 8, where we give a brief description of IDA*.
30 Here, we judge "prominence" by the citation tally on Google Scholar (as of April 2020).
Works towards Improving RBFS: The price to pay for the guaranteed linearity of RBFS in terms of space consumption is that nodes have to be forgotten each time a backtracking step is made. Whenever an already explored subtree becomes attractive again (because all other better subtrees have been explored), it will be re-examined. This scheme results in a potentially large number of node re-explorations. In the worst case, when every node has a unique f-value and the node with the next-best f-value is always located in a different subtree of the root, O(n^2) nodes will be expanded, where n is the number of nodes A* would expand (Hatem, Kiesel, and Ruml 2015). Addressing this problem, Hatem, Kiesel, and Ruml have proposed three techniques for controlling the overhead caused by excessive backtracking in RBFS, at the cost of generating suboptimal solutions in general. These techniques are called RBFS_ε, RBFS_kthrt and RBFS_CR. The idea of RBFS_ε is to allow the algorithm to explore a little (as ruled by the choice of the parameter ε) further than suggested by bound, i.e., bound in line 25 of Alg. 1 is replaced by bound + ε. While this slight change yields good results in practice under an adequate setting of ε, it does not lower the quadratic worst-case time complexity. RBFS_kthrt goes one step further by loosening both bound and the f-function, thereby achieving fewer backtrackings and fewer node expansions, albeit still without theoretical time complexity savings. Finally, RBFS_CR adopts a concept originally introduced by (Sarkar et al. 1991) for IDA* in order to reduce re-expansions. The idea is to track the distribution of f-values under each node along the currently explored path, which makes it possible to adapt the backed-up F-value in a way that guarantees that, each time a node is re-explored, twice as many successor nodes will be investigated as when this node was last explored.
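To make the effect of the ε-relaxation concrete, here is a compact cost-minimizing RBFS on a hypothetical toy tree (classic Korf-style search, not the hitting-set variant of Alg. 1; tree and f-values are made up). With ε > 0, the bound check is relaxed exactly as described, which saves re-expansions at the price of optimality guarantees in general.

```python
import math

# Toy search tree: node -> list of children; 'B1' is the goal node.
TREE = {'root': ['A', 'B'], 'A': ['A1'], 'A1': ['A2'],
        'B': ['B1'], 'A2': [], 'B1': []}
F_STATIC = {'root': 0, 'A': 1, 'A1': 2.5, 'A2': 5, 'B': 2, 'B1': 3}
GOAL = 'B1'

def rbfs(node, f_node, bound, eps, stats):
    """RBFS sketch: recurse into the best child with the second-best
    sibling F-value as bound; eps relaxes the bound check (RBFS_eps)."""
    if node == GOAL:
        return node, f_node
    children = TREE[node]
    if not children:
        return None, math.inf
    stats['expansions'] += 1
    # inherit the backed-up value if it exceeds a child's static f-value
    F = [max(F_STATIC[c], f_node) for c in children]
    while True:
        best = min(range(len(children)), key=lambda i: F[i])
        if F[best] > bound + eps:    # relaxed bound check
            return None, F[best]
        alt = min((F[i] for i in range(len(children)) if i != best),
                  default=math.inf)
        result, F[best] = rbfs(children[best], F[best],
                               min(bound, alt), eps, stats)
        if result is not None:
            return result, F[best]

def solve(eps):
    stats = {'expansions': 0}
    result, cost = rbfs('root', F_STATIC['root'], math.inf, eps, stats)
    return result, cost, stats['expansions']
```

On this tree, the alternating F-values of the two subtrees cause re-expansions for ε = 0, while ε = 1 reaches the same goal with fewer expansions.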
In this vein, the number of explored nodes can be shown to be in O(n), i.e., asymptotically at most a constant factor worse than for A*. All three of these techniques are applicable to RBF-HS as well, and we expect the (positive) implications on practical performance in the hitting-set case to be in line with what was observed in (Hatem, Kiesel, and Ruml 2015) for classic search problems. A clear shortcoming of such an approach, however, will be the potential non-minimality of the returned diagnoses (unsoundness) and the potential violation of the preference order on the output diagnoses (best-first property not given). Whereas the soundness problem can be taken care of by post-processing the returned diagnoses, e.g., by means of Inv-QX (Shchekotykhin et al. 2014), it is not straightforward how to handle the best-first violation, i.e., how to ensure that the returned collection D includes exactly the |D| best diagnoses. Both the implementation of these suboptimal RBF-HS variants as well as the study of this latter question will be part of our future work. If only one solution is demanded, i.e., only the single most probable or single minimum-cardinality diagnosis is to be found, then techniques discussed in (Hansen and Zhou 2007) can be applied to RBF-HS. However, this is useful only if a reasonable heuristic function (for non-additive, probabilistic costs) can be expressed for hitting set searches, which is to date still an open problem.
Works Related to HBF-HS: HBF-HS follows a similar principle with respect to RBF-HS as MREC (Sen and Bagchi 1989) does with respect to IDA*. MREC trades off time and space complexity by a single parameter that determines how much memory is available to the algorithm. In the same way that MREC behaves like IDA* for minimal available memory and like A* for a large amount of granted memory, HBF-HS resembles RBF-HS and HS-Tree, respectively, in these two cases. Two other strategies that attempt to optimally exploit and exhaust the available memory in order to increase search speed are MA* (Chakrabarti et al. 1989) and SMA* (Russell 1992). Their underlying principle is to store every node until the memory limit is reached, and to then purge the least promising node(s) in order to make room for the next node to be explored. Whenever the search problem is solvable within the given amount of memory, these algorithms will not run out of memory and will return a best solution. Theoretically, this property cannot be proven for HBF-HS, as it acts like RBF-HS from the point where the (memory-dependent) switch criterion is triggered. In other words, if the switch takes place too late (such that the little remaining memory cannot hold the linear number of nodes additionally explored after the switch), then HBF-HS can run out of memory. However, first, we observed in our experiments that the number of additional nodes stored by HBF-HS after the switch was always minor (a small single-digit percentage) relative to those generated before the switch took place. Second, (S)MA*'s concept of on-demand node pruning can be integrated into HBF-HS as well in order to resolve this problem. Still, as future work, we plan to carry over these algorithms to the diagnosis domain as well, and to study their hitting set versions.
Moreover, (Russell 1992) suggested the IE algorithm, which, however, behaves the same as RBFS for a monotonic cost function (and thus a hitting set version of it would act identically to RBF-HS, cf. footnote 13).

Dataset
As a test dataset for our experiments with RBF-HS and HBF-HS we used inconsistent real-world knowledge bases (KBs) expressed in Description Logics (Baader et al. 2007). Table 2 lists the investigated KBs. These cases were already analyzed in studies conducted by other works. Specifically, the KBs in rows 1, 2, 5, 6, 9, 10, 12, 14 of Tab. 2 were considered in (Shchekotykhin et al. 2012), those in rows 3, 4 were analyzed by (Rodler et al. 2019), those in rows 9, 10 were used in (Stuckenschmidt 2008), and those in rows 11, 13 are accessible online. 32 As the table shows, these KBs cover a spectrum of different problem sizes (number of axioms or components; column 2), logical expressivities (which determine the complexity of consistency checking; column 3), as well as diagnostic structures (number and size of minimal diagnoses; column 4).
[Table 2 notes: 1) Description Logic expressivity, cf. (Baader et al. 2007); the higher the expressivity of a logic, the higher the complexity of reasoning for this logic. 2) #D/min/max denotes the number / the minimal size / the maximal size of minimal diagnoses for the DPI resulting from each input KB K. If tagged with a *, a value signifies the number or size determined within 1200 sec using HS-Tree (for problems where finding all minimal diagnoses was impossible within reasonable time).]
The reasons for choosing the domain of knowledge base debugging (KBD) to evaluate our approach are:
• KBD is more general than MBD, i.e., any MBD problem can be stated as a KBD problem. 33 Moreover, KBD can capture not only positive, but also negative measurements (cf. Sec. 2), which are necessary to model, e.g., test-driven development as well as ontology and CSP debugging scenarios (Shchekotykhin et al. 2012; Felfernig et al. 2004).
• KBD is a diagnosis application domain where RBF-HS's features of soundness, completeness and best-firstness are important requirements on diagnosis searches (Meilicke 2011c; Kalyanpur 2006; Rodler 2015).
• KBD problems represent a particularly challenging domain for diagnosis approaches, as they usually deal with logics that are harder wrt. the complexity of theorem proving (up to 2NEXPTIME-complete (Grau et al. 2008)) than traditional MBD problems (which often use propositional knowledge representation languages that are not beyond NP-complete).

32 ..link??..
33 In other words, when formulated as KBD problems, all MBD problems differ just in terms of their manifestations regarding the attributes characterized by the columns of Tab. 2, i.e., in terms of their size, their hardness of theorem proving, and their diagnostic structure.
In our experiments, we considered a multitude of different diagnosis scenarios. A diagnosis scenario is defined by the set of inputs given to Alg. 2, i.e., by a DPI dpi, a number ld of minimal diagnoses to be computed, as well as a (cost-adjusted) setting of the component fault probabilities pr. The DPIs for our tests were defined as ⟨K, ∅, ∅, ∅⟩, one for each K in Tab. 2. That is, the task was to find a minimal set of axioms (faulty components) responsible for the inconsistency of K, without any background knowledge or measurements initially given (cf. Example 10). For the parameter ld we used the values {2, 6, 10, 20}. The fault probability pr(ax) of each axiom (component) ax ∈ K was either chosen uniformly at random from (0, 1) (maxProb), or specified in such a way (cf. Sec. 3.2) that the diagnosis search returns minimum-cardinality diagnoses first (minCard). As a Description Logic reasoner, we adopted Pellet (Sirin et al. 2007).
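Our reading of the minCard setting can be illustrated as follows: if every axiom receives the same fault probability p < 0.5, the probability of a diagnosis strictly decreases with its cardinality, so a probability-ordered search returns minimum-cardinality diagnoses first. The KB size, the probability value and the sample diagnoses below are hypothetical.

```python
def diagnosis_probability(diag, components, pr):
    """Probability of a diagnosis: its components are assumed faulty,
    all remaining components are assumed to behave normally."""
    p = 1.0
    for ax in components:
        p *= pr[ax] if ax in diag else (1.0 - pr[ax])
    return p

components = set(range(10))            # hypothetical KB with |K| = 10 axioms
pr = {ax: 0.01 for ax in components}   # equal small fault probabilities
```

With p = 0.01, any single-fault diagnosis is more probable than any double-fault diagnosis, since each additional assumed fault multiplies the probability by p/(1-p) < 1.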
To simulate diagnosis circumstances that are as realistic as possible, where the actual diagnosis (i.e., the de-facto faulty axioms) is of interest and needs to be isolated from a set of initial minimal diagnoses (cf. col. 4 of Tab. 2), we ran five sequential diagnosis (de Kleer and Williams 1987; Shchekotykhin et al. 2012) sessions for each diagnosis scenario defined above. In each session, a different randomly chosen actual diagnosis was set as the target solution.
Such a sequential diagnosis session can be conceived of as having two alternating phases that are iterated until a single diagnosis remains: diagnosis search, and measurement conduction. More precisely, the former involves the determination of ld minimal diagnoses D for a given DPI; the latter involves the computation of an optimal system measurement (to rule out as many spurious diagnoses in D as possible), as well as the incorporation of the new system knowledge resulting from the measurement outcome into the DPI. Measurement computation is accomplished by means of a measurement selection function which gets a set of minimal diagnoses D as input, and outputs one system measurement such that any measurement outcome eliminates at least one spurious diagnosis in D. In our experiments, a measurement was defined as a true-false question to an oracle (Shchekotykhin et al. 2012; Rodler et al. 2019; Rodler 2015), e.g., for a medical KB one such query could be Q := Tumor ⊑ ∃causes.Pain ("does every tumor cause pain?"). Given a positive (negative) answer, Q is moved to the positive (negative) measurements of the DPI. The new DPI is then used in the next iteration of the sequential diagnosis session. That is, a new set of diagnoses D is sought for this updated DPI, an optimal measurement is calculated for D, and so on. Once there is only a single minimal diagnosis for a current DPI, the session stops and outputs the
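The two alternating phases described above can be sketched as follows (all function and attribute names are illustrative placeholders, not taken from Alg. 2):

```python
def sequential_diagnosis_session(dpi, ld, diagnosis_search,
                                 select_measurement, answer):
    """Sketch of a sequential diagnosis session.

    diagnosis_search(dpi, ld)  -> list of up to ld minimal diagnoses
    select_measurement(D, dpi) -> a measurement (query) discriminating D
    answer(measurement)        -> True / False (the oracle's answer)
    """
    while True:
        D = diagnosis_search(dpi, ld)            # phase 1: diagnosis search
        if len(D) == 1:
            return D[0]                          # single diagnosis left: done
        q = select_measurement(D, dpi)           # phase 2: measurement
        if answer(q):
            dpi.positive_measurements.append(q)  # positive outcome
        else:
            dpi.negative_measurements.append(q)  # negative outcome
```

Each iteration updates the DPI with the measurement outcome, so the next diagnosis search operates on a strictly more constrained problem.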
remaining diagnosis. To determine measurement outcomes (i.e., to answer the generated questions), we used the predefined actual diagnosis, i.e., each question was automatically answered in such a way that the actual diagnosis was not ruled out.
As measurement selection functions we adopted split-in-half (SPL), which suggests a measurement, if existent, that eliminates half of the diagnoses in D regardless of the outcome, and entropy (ENT), which selects the measurement with the highest information gain. These functions were also used in the evaluations carried out by (Shchekotykhin et al. 2012; Rodler et al. 2013; Rodler and Schmid 2018).
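A minimal sketch of how such selection functions could score candidate measurements, assuming each candidate query partitions the current diagnoses D into those predicting a positive answer and those predicting a negative one (names and scoring details are our simplification; ENT is approximated here by the entropy of the answer distribution):

```python
import math

def spl_score(d_yes, d_no):
    # split-in-half: best when the query eliminates half of D either way
    return -abs(len(d_yes) - len(d_no))

def ent_score(d_yes, d_no, prob):
    # entropy: how uncertain the answer is, weighted by diagnosis probabilities
    p_yes = sum(prob[d] for d in d_yes)
    total = p_yes + sum(prob[d] for d in d_no)
    if total == 0 or p_yes in (0, total):
        return 0.0  # answer already determined: no information gained
    p = p_yes / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_query(queries, prob, heuristic="ENT"):
    """queries: list of (query, D_yes, D_no) partitions of the diagnoses."""
    if heuristic == "SPL":
        return max(queries, key=lambda q: spl_score(q[1], q[2]))
    return max(queries, key=lambda q: ent_score(q[1], q[2], prob))
```

With uniform diagnosis probabilities both heuristics agree; they diverge when probabilities are skewed, since ENT then favors queries that split the probability mass, not the diagnosis count, in half.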
The advantages of using sequential diagnosis sessions in our evaluations (instead of just applying a single diagnosis search execution to the DPIs listed in Tab. 2) are: (i) Multiple diagnosis searches, each for a different (updated) DPI, are executed during one sequential session and flow into the experiment results, which gives us a more representative picture of the algorithm's real performance. (ii) The potential impact of measurement selection functions on algorithms' performances can be assessed. (iii) Sequential diagnosis is one of the main applications of diagnosis searches.
To sum up: We ran five diagnosis sessions, each searching for a randomly specified minimal diagnosis, for each algorithm among RBF-HS and HS-Tree, for each measurement selection function among ENT and SPL, for each DPI from Tab. 2, for each probability setting among maxProb and minCard, and for each number of diagnoses ld ∈ {2, 6, 10, 20} to be computed (in each iteration of the session, i.e., at each call of a diagnosis search algorithm).

Experiment Results
The results for the minCard experiments are shown by Figures 4-6. Each figure compares the runtime and memory consumption we measured for RBF-HS and HS-Tree averaged over the five performed sessions (note the logarithmic scale). More specifically, the figures depict the factor by which RBF-HS consumed less memory (blue bars), as well as the factor by which RBF-HS needed more time (orange bars), in relation to HS-Tree. That is, blue bars tending upwards (downwards) mean a better (worse) memory behavior of RBF-HS, whereas upwards (downwards) orange bars signify worse (better) runtime of RBF-HS. For instance, a blue bar of height 10 means that HS-Tree required 10 times as much memory as RBF-HS did in the same experiment; or a downwards orange bar representing the value 0.5 indicates that RBF-HS finished the diagnosis search task in half of HS-Tree's runtime. Regarding the absolute runtime and memory expenditure (not displayed in the figures) in the experiments, we measured a min/avg/max runtime of 0.04/24/744 sec and 0.05/17/1085 sec for ENT and SPL, respectively, as well as a min/avg/max space consumption of 9/17.5K/1.3M and 9/4.4K/183K tree nodes for ENT and SPL, respectively.
We make the following observations: (1) More space gained than extra time expended: Whenever the diagnosis problem was non-trivial to solve, RBF-HS trades space favorably for time, i.e., the factor of space saved is higher than the factor of time overhead (blue bar is higher than orange one). Only for KBs K,C for SPL10 and K for SPL20 (Fig. 4), as well as K in ENT10 and ENT20 (Fig. 5), the time-space balance of RBF-HS was negative; however, these were the easiest cases in our dataset both in terms of required memory (no more than 10 and 100 nodes) and runtime (no more than 0.8 sec and 0.5 sec), respectively.
(2) Substantial space savings: Space savings of RBF-HS range from significant to tremendous, and often reach values larger than 10 and up to 50 (ENT) and 57 (SPL). In other words, HS-Tree required up to 57 times as much memory for the same tasks as RBF-HS did.
(3) Often favorable runtime: In 40 % (ENT) and 46 % (SPL) of the cases RBF-HS exhibited even a lower or equal runtime compared with HS-Tree. Note that studies comparing classic (non-hitting-set) best-first searches have also observed that linear-space approaches can outperform exponential-space ones in terms of runtime. One reason for this is that, when processing each node, the management (node insertion and removal) of an exponential-sized priority queue of open nodes requires time linear in the current tree depth (Zhang and Korf 1995). Hence, when the queue management time of HS-Tree outweighs the time for redundant node regenerations expended by RBF-HS, then the latter will outperform the former.
(4) Whenever it takes RBF-HS long, use HBF-HS: In those cases (for ENT) where RBF-HS manifested a 20 % or higher time overhead, the use of HBF-HS (with a mere allowance of 400 nodes in memory before the switch is triggered) could always reduce the runtime to as much as or even less than that of HS-Tree. At the same time, remarkably, memory consumption of HBF-HS never exceeded 416 nodes (whereas HS-Tree required memory for up to more than half a million nodes, which amounts to a deterioration factor of over 1000 compared to HBF-HS). Similar observations can be made in case of SPL, where, e.g., a runtime overhead factor of 10.4 (the worst value for RBF-HS we measured in all our experiments, see SPL20, case O, Fig. 4) could be reduced to a factor of 0.98 (i.e., to even a 2 % better runtime than HS-Tree's) by means of HBF-HS. This suggests that, whenever RBF-HS gets caught in redundant re-explorations of subtrees and thus requires notably more time than HS-Tree, the allowance of a relatively short run of HS-Tree (until it creates 400 nodes) before switching to RBF-HS can yield a runtime comparable to HS-Tree's. One reason for this phenomenon is that RBF-HS can save a significant number of re-explorations through the information gained by the initial breadth-first exploration of the top of the search tree. A potential second reason might be the above-mentioned high expense of managing an increasingly large queue of open nodes required by HS-Tree, as opposed to a set of open nodes of smaller and almost fixed size in case of HBF-HS.
(5) HBF-HS allows to quasi "cap" the used memory: The number of nodes additionally consumed by HBF-HS after the switch (at 400 nodes) to RBF-HS was less than 2 % on average, and never more than 4 %. Very similar values could be observed for HBF-HS performing the switch at 50, 100 and 200 generated nodes. This suggests that, despite no memory bound being theoretically guaranteed, the consumed amount of memory can practically be more or less arbitrarily limited by the definition of a suitable switch condition; of course, only as long as the specified limit is not lower than the (very low) memory requirement of standalone RBF-HS.
(6) Performance independent of number of computed diagnoses and measurement selection function: The relative performance of RBF-HS versus HS-Tree appears to be largely independent of the number ld of computed minimal diagnoses as well as of the used measurement selection function.
(7) Performance dependent on diagnosis problem: The gain of using RBF-HS instead of HS-Tree increases with the hardness of the considered diagnosis problem. This tendency can be clearly seen in Figs. 4 and 5, where the KBs on the x-axis are sorted in ascending order of RBF-HS's achieved memory reduction, for each value of ld . Note that roughly the same group of (more difficult / easy to solve) diagnosis problems ranks high / low for all values of ld .
(8) Performance dependent on diagnosis preference criterion: The discussion of the results so far concentrated on the consistently good results attained for the minCard probability setting. In case of the maxProb setting, we see a quite different picture, where time is more or less traded one-to-one for space, i.e., k orders of magnitude of space savings against HS-Tree require approx. k orders of magnitude more runtime of RBF-HS (blue and orange bars roughly equal). The reason for this performance degradation in case of maxProb is the known phenomenon of Korf's RBFS algorithm to perform relatively poorly when original f -values (probabilities) of nodes vary only slightly (Hatem, Kiesel, and Ruml 2015). As a result, RBF-HS suffers from too many "mind shifts" and spends most of the time doing backtracking and re-exploration steps while making very little progress in the search tree. However, like in the case of minCard, when we allow for the utilization of a small amount of memory beyond what RBF-HS uses, this problem is remedied to a great extent. In fact, adopting HBF-HS with a switch at 400 generated nodes led to a runtime comparable to HS-Tree's (in 43 % of scenarios even lower). Only in a single scenario, i.e., SPL20 with KB O, HBF-HS400 still required substantially more time than HS-Tree did. Obviously, this exact combination represents a particularly demanding case for HBF-HS and RBF-HS (cf. bullet (4)).
As additional tests revealed, the answer to this problem is the employment of HBF-HS equipped with a relative switch criterion (instead of an absolute one). Concretely, we allowed HS-Tree to consume 60 % of the available memory before handing over to RBF-HS. Runtimes equal to HS-Tree's could be achieved in this way (while using only slightly more than 60 % of the disposable memory, which was always below ...??...).
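The absolute and relative switch criteria just described can be sketched as follows (the function and parameter names are ours, not from the paper's pseudocode):

```python
def make_switch_criterion(mode="absolute", max_nodes=400,
                          frac=0.6, capacity=None):
    """Illustrative switch criteria for HBF-HS: HS-Tree runs until the
    criterion fires on the number of generated nodes, then the search
    hands over to RBF-HS.
    """
    if mode == "absolute":
        # switch after a fixed number of generated nodes (e.g., 400)
        return lambda generated: generated >= max_nodes
    if mode == "relative":
        # switch once a fraction of the available node capacity is consumed
        return lambda generated: generated >= frac * capacity
    raise ValueError(f"unknown mode: {mode}")
```

The relative criterion adapts the switch point to the memory actually available on the device, which is what resolved the remaining hard SPL20 case above.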
(9) Scalability tests: The observations discussed so far have brought to light that DPIs with thousands of axioms and diagnoses (cf. 2nd + 4th col. in Tab. 2) could be well handled by RBF-HS in our tests (Figs. 4 and 5), and even led to a better relative performance in comparison to HS-Tree than problems with fewer components and possible faults. In order to evaluate the scalability of RBF-HS wrt. ld , i.e., the number of diagnoses to be computed, we conducted an additional scalability experiment. To this end, we first selected the most demanding DPIs based on their absolute runtime and memory cost in the normal experiments, and then ran the same experiments on these DPIs as described in Sec. 5.2, but with ld := 100.
The results we obtained are presented by Fig. 6 (left). It shows massive space savings at the cost of only minor runtime overheads (7 cases; RBF-HS's runtime overhead was always lower than a factor of 1.65), roughly equal runtimes (2 cases), and even time savings (5 cases; runtime savings of RBF-HS between 7 % and 22 %). Space savings achieved by RBF-HS ranged from 83 % (case O, SPL) to 99 % (case Cig, SPL; case IT, SPL+ENT). Note, even the combination of SPL and O, which proved to be a particularly unfavorable case as regards runtime in the normal experiments, turned out to be unproblematic in the scalability tests.
(10) Results for the hardest cases: For the purpose of clarity of Figs. 4 and 5, we excluded the results for the two DPIs ccc and cce. These two DPIs result from the integration (alignment (Euzenat et al. 2011)) of two KBs describing a common domain (in this case: a conference management system) in a different way. As a consequence of the automated alignment process, a multitude of inconsistent sub-KBs (conflicts, cf. Sec. 2) emerge at once. This leads to large sizes of minimal diagnoses (cf. Tab. 2, 4th col.), which causes a high depth and thus an enormous size of the hitting set tree. The runtime and memory measurements for these hard cases are shown in Fig. 6 (right). We observe gigantic space savings of up to four orders of magnitude, while runtime still remains comparable with HS-Tree (sometimes RBF-HS's runtime is even better). Again, as discussed above, HBF-HS can be used to level any significant time overheads of RBF-HS while consuming a quasi-constant amount of memory.

Conclusions and Future Work
Given a collection of sets from a universe, an (irreducible) set of universe elements which has a non-empty intersection with each set in the collection is called a (minimal) hitting set wrt. the collection. Hitting set computation is an important task since a multitude of real-world problems can be formulated as a hitting set problem. One prominent application domain is model-based diagnosis, where possible explanations of the faulty behavior of a system correspond to hitting sets. In this work we introduced a hitting set variant, RBF-HS, of the seminal RBFS algorithm proposed by Korf (1992). We showed that RBF-HS, given some preference function on elements of the universe, computes minimal hitting sets in a sound and complete way, and enumerates the hitting sets in best-first order as prescribed by the preference function. In contrast to existing systems in model-based diagnosis, RBF-HS guarantees these three properties under linear-space memory bounds.
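For illustration, a brute-force enumeration of minimal hitting sets can be written in a few lines (this naive method needs exponential time and space and only serves to make the definition concrete; RBF-HS, in contrast, enumerates preferred minimal hitting sets within linear space):

```python
from itertools import combinations

def minimal_hitting_sets(collection):
    """Return all subset-minimal hitting sets of a collection of sets."""
    universe = sorted(set().union(*collection))
    # all hitting sets: subsets of the universe intersecting every set
    hitting = [frozenset(c)
               for r in range(len(universe) + 1)
               for c in combinations(universe, r)
               if all(set(c) & s for s in collection)]
    # keep only those with no proper subset that is itself a hitting set
    return {h for h in hitting if not any(o < h for o in hitting)}
```

For example, the conflicts {1, 2} and {2, 3} have exactly the minimal hitting sets {2} and {1, 3}, corresponding to the minimal diagnoses of a system with these two conflicts.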
We discuss various potentially useful employments of RBF-HS and how synergies with existing methods might be achieved. In particular, we introduce a parameterizable hybrid between Reiter's influential HS-Tree algorithm (Reiter 1987) and RBF-HS, called HBF-HS. The underlying idea is to first run HS-Tree and to switch to RBF-HS as soon as some switch criterion (e.g., some amount of memory consumed) is triggered. As empirical evidence indicates, HBF-HS can trade off runtime and memory consumption such that memory-intensive problems remain solvable within acceptable time while barely exceeding a predefined maximal amount of memory.
In comprehensive experiments on a corpus of real-world knowledge-based diagnosis problems of various size, reasoning complexity, and diagnostic structure, we compared RBF-HS against HS-Tree, the state-of-the-art sound, complete and best-first hitting set algorithm in the knowledge base debugging domain. The results show that, when minimum-cardinality diagnoses are sought: (1) RBF-HS achieves significantly higher space savings than time losses in all non-trivial cases, and the performance gains tend to increase with increasing problem size and complexity; (2) in many cases, RBF-HS's improvements of memory costs are enormous, reaching factors of up to 57 for normal cases, and up to 90 and 8900 for the scalability tests and hardest test cases, respectively; (3) the memory advantages reached by RBF-HS mostly do not come at the cost of notable runtime increases; (4) in the rare cases where RBF-HS's runtime overheads are significant, the working solution is to use HBF-HS instead of RBF-HS to reduce the runtime to values comparable with HS-Tree while staying within almost fixed memory bounds.
(5) For the use case of searching for most probable diagnoses, it is recommendable to draw upon HBF-HS rather than RBF-HS, since the latter tends to be impaired by the execution of significant redundant actions due to a well-known issue already inherent in the original RBFS algorithm. If HBF-HS is equipped with an adequate switch criterion, it proved to make do with bounded memory at HS-Tree-comparable runtimes.
Finally, note that although the original intention underlying RBF-HS lies in solving diagnosis problems, the algorithm is well-suited for any other application domain where preferred hitting sets need to be enumerated, especially under restricted memory circumstances, e.g., on mobile or IoT devices.
Future work topics include (1) the integration of RBF-HS and HBF-HS into our ontology debugging plug-in OntoDebug for Protégé (Noy et al. 2000), (2) closer investigations of and experiments with other approaches (apart from HBF-HS) outlined in Sec. 3.6, and (3) experiments with (diagnosis) problems from other domains, such as spreadsheet (Jannach et al. 2014) or software debugging (Wotawa 2010).
Lemma 1. In RBF-HS, only diagnoses can be added to the collection D.
Proof. Let us start backwards from line 16, which is the only place in RBF-HS where elements are added to D. The condition that must be fulfilled for this line to be reached is that L = valid must be returned for the currently processed node n that is added to D. Considering the LABEL function, we find that it must return in line 46 which in turn requires that FINDMINCONFLICT( K \ n, B, P , N ) before must have returned 'no conflict'. This means that K \ n does not contain a minimal conflict, or, equivalently, is not a conflict. By the Duality Property (cf. Sec. 2.1), we obtain that n is a diagnosis.
Lemma 2. If line 9 is executed, then a non-empty minimal diagnosis exists.
Proof. The statement of this lemma follows from the algorithm's analysis (lines 4 and 6) of the output of the FINDMINCONFLICT call in line 3, along with the Duality Property (cf. Sec. 2.1). See the paragraph "Trivial Cases" in Sec. 3.2 for a more detailed argumentation.
Lemma 3. If a node n corresponding to a minimal diagnosis D is processed for the first time by RBF-HS, then n will be (directly) added to D in line 16. (Equivalently: After any call of RBF-HS' which processes a node n corresponding to D returns, D is an element of D.)

Proof. Assume that, for the first time throughout the execution of RBF-HS, a node n equal to D is processed, where D is a minimal diagnosis. Initially, in line 12, a label L is computed for n. Within the LABEL function, the first thing executed is the non-minimality check in lines 38-40, where a node n i is sought in D which is a subset of n. Since (1) only diagnoses can be in D as per Lemma 1, (2) n = D is a minimal diagnosis, and (3) it is the first time that a node equal to D is processed, there cannot be any subset n i of n in D. Hence, line 41 is reached. Due to the Hitting Set Property (cf. Sec. 2.1) and the fact that n is a (minimal) diagnosis, there cannot be any (minimal) conflict C such that C ∩ n = ∅. Consequently, line 44 is reached. The FINDMINCONFLICT call in line 44 will return 'no conflict' due to the Duality Property and because n is a diagnosis. As a result, LABEL will return in line 46, which means that n will be added to D in line 16. (The equivalent statement of the lemma holds since no element once added to D can ever be removed from it, for the simple reason that there is no statement in RBF-HS that modifies D except for the one that adds elements to D in line 16.)

Lemma 4. For any call RBF-HS'(n, F (n), bound ), a value X < F (n) is returned (unless the RBF-HS' procedure is exited in line 18 before a return takes place).
Proof. Assume an execution of some call of RBF-HS'(n, F (n), bound ) throughout which no exit of the RBF-HS' procedure takes place in line 18. Observe that there are three spots where RBF-HS' might return, i.e., in any of the lines 14, 19 or 36. In the first two cases (the returns in lines 14 and 19), −∞ is returned. However, F (n) > −∞ must hold. To prove this, let us consider the two places where the RBF-HS' call can have been issued, i.e., lines 9 or 32. In the former case, F (n) is equal to f (∅), which can only attain values in (0, 1) (cf. Sec. 2.1). In the latter case, n is equal to a child node n 1 of some node and F (n) = F (n 1 ) > −∞ due to the while-condition in line 31. Therefore, the statement of the lemma holds for the returns in lines 14 and 19.
For the return in line 36, we first point out that, for any call RBF-HS'(n, F (n), bound ), F (n) ≥ bound must hold. To see this, consider again lines 9 and 32, where RBF-HS' can be invoked. In the former case, bound = −∞ and F (n) > bound follows from the argumentation in the previous paragraph. In the second case, as explained above, n is equal to a child node n 1 of some node. Through the while-condition, we thus know that F (n) = F (n 1 ) is larger than or equal to the old value of the bound. Moreover, we know by the sorting of Child_Nodes and the fact that n 1 is the node in Child_Nodes with the largest F -value (due to lines 28, 29, 33 and 34), that F (n) = F (n 1 ) ≥ F (n 2 ) for the node n 2 with second-largest F -value in Child_Nodes (cf. lines 30 and 35). Since bound is defined as the maximum among the old value of bound and F (n 2 ), F (n) ≥ bound must be true.
Finally, note that a return in line 36, which is our current assumption, can only take place if the condition of the while-loop is violated. This implies that the returned value (F (n 1 )) is either equal to −∞ or strictly less than bound . As F (n) must be greater than −∞, as demonstrated in the first paragraph of this proof, we deduce that the statement of the lemma also holds for the return in line 36.
Lemma 5. Throughout the entire execution of RBF-HS and for any node n, the following invariant holds: F (n) ≤ f (n).
Completeness: Let ld := ∞ (all existing minimal diagnoses should be found) and let there be a minimal diagnosis D such that D ∉ D for the collection D returned by RBF-HS. The return can take place in lines 5, 7 or 10. Line 5 cannot apply since, in this case, the FINDMINCONFLICT call in line 3 returns ∅, which means that there cannot be any diagnosis by the Duality Property; this is a contradiction to our assumption that D is a diagnosis. If line 7 applies, then 'no conflict' was output by FINDMINCONFLICT in line 3, which implies that ∅ is the only diagnosis, again by the Duality Property. Hence, D = ∅ must hold. Since D = [∅] is returned, we have a contradiction to the assumption that D is not returned.
Finally, let the return of D be in line 10. This means that RBF-HS' must have been called in line 9. By Lemma 3, our assumption from above can be stated as: No node corresponding to D is processed throughout the execution of RBF-HS'. First, note that, for each minimal diagnosis, there is a possible path from the root to that diagnosis, due to the Hitting Set Property (i.e., each diagnosis, in particular D , includes some element of every minimal conflict) and the fact that RBF-HS' can generate a node equal to D by starting with the empty (root) node (cf. line 9), labeling it with a minimal conflict C 1 (see LABEL function, line 43 or 49), and by selecting a child node equal to {x} for some element x ∈ C 1 ∩ D , and labeling this child again with a conflict C 2 with C 2 ∩ {x} = ∅, and so on. We next show that each node n ⊆ D along some path from the root to D will be processed.
First, let us assume that some node n ⊆ D of cardinality k ≥ 1 is generated, but never processed. By Lemma 7, it follows that F (n ) = f (n ) > 0 > −∞ will hold throughout the entire execution of RBF-HS'. Since RBF-HS terminates, any RBF-HS' call for the parent node n p of n must return, and since n (i.e., a child node) was generated it must return exactly in line 36. (Note that n p (⊂ D ) can be processed multiple times; however, each time the respective RBF-HS' call that processes n p will return in line 36 since (1) D is a minimal diagnosis by assumption, (2) only diagnoses can be labeled valid or closed by Lemma 6, and (3) ld = ∞ ensures that line 18 can never be executed.) Thus, for any call that processes n p , the returned value satisfies F (n p ) ≥ F (n ) > −∞ (due to the sorting of Child_Nodes, see lines 28 and 33, and due to the fact that the child node with maximal F -value is always returned, see lines 29 and 34). The same argumentation can be applied along the branch from n p to the root node, until the new n p is equal to the root. Finally, we can derive that F (n 1 ) ≥ F (n ) > −∞ will hold throughout the entire execution of the first call of RBF-HS' made in line 9, which means that the condition of the while-loop is satisfied forever (recall that bound = −∞ at the first RBF-HS' call in line 9). This is a contradiction to the fact that RBF-HS always terminates. Thus, we have demonstrated that, for k ∈ {1, . . . , |D |}, if some n ⊆ D with |n | = k is generated, it will also be processed. In particular, this implies that D will be processed, given that it is generated.
It remains to be shown that D will be generated. To this end, observe that the root ∅ is trivially processed (see line 9) and must be labeled with a non-empty minimal conflict (as line 9 was reached, see above), which entails by line 20 (EXPAND function) that all tree nodes of cardinality k = 1 are generated, among them one subset n of D . Since n must be processed (note: maybe not immediately, but definitely at some stage of the algorithm's execution), as proven, some n ∪ {x} ⊆ D of cardinality k + 1 is generated. The same inductive argument can be applied to all nodes n ⊂ D . Consequently, D will eventually be generated and processed, as argued above. This is a contradiction to the assumption that D is never processed, which finalizes the completeness proof.
Best-First Property: We already know that RBF-HS is complete, i.e., that all minimal diagnoses for the given DPI will be in the returned list D. We now have to show that this list is sorted in descending order by f -value. Since any node corresponding to a minimal diagnosis that is processed by RBF-HS will be (directly) added to D by Lemma 3, it suffices to demonstrate that, for any two minimal diagnoses D , D with f (D ) < f (D ), some node equal to D is processed prior to all nodes equal to D .
To this end, let ld = ∞ (the algorithm does not terminate before all minimal diagnoses have been found) and assume the opposite, i.e., some node corresponding to D is processed earlier than all nodes equal to D . Take the (first ever) call RBF-HS'(n, F (n), bound ) with n = D (i.e., the first call that processes D ). Then we have that F (n) ≥ bound (while-condition) and bound = max{F (n 1 2bst ), F (n 2 2bst ), . . . , F (n k 2bst )} with k = |D | − 1 where n r 2bst denotes the best alternative node (according to F -value) at tree depth r. (Note that, at any time during its execution, RBF-HS' involves only one expanded node at each tree level; amongst the generated nodes at one level r, the best one is expanded and the second best one is precisely n r 2bst . To see that bound is equal to the maximum of the stated set of best alternative nodes, observe that bound = −∞ at the very first call of RBF-HS' in line 9, and for each node that is expanded, the new bound is the maximum of the current bound and the current best alternative node, cf. line 32). Now, let n * be the deepest common ancestor node of D and D in the tree, i.e., n * = D ∩D . Since both D and D are minimal diagnoses, n * ⊂ D and n * ⊂ D . Moreover, let n * r,D denote the r-th successor node of n * along a path to a node equal to D . E.g., n * 1,D describes the child node of n * along the path to D ; note that n * r,D = D for r = |D | − |n * | and that n * r,D is a node at tree depth |n * | + r. For s = |n * | + 1, we know from above (F (n) ≥ bound ) that F (n) ≥ F (n s 2bst ) and, since n s 2bst is the best alternative node at level s, that F (n s 2bst ) ≥ F (n * 1,D ). Furthermore, by Lemma 5, f (n) ≥ F (n) must hold. Overall, since n = D , we so far have f (D ) ≥ F (n * 1,D ). If |D | − |n * | = 1, i.e., n * 1,D = D , then (*) F (n * 1,D ) = f (n * 1,D ) = f (D ) must be true. 
The reason for this is Lemma 7 and that no node corresponding to D can have been processed yet, as this would be a contradiction to our assumption that we are considering the first call that processes a node equal to D and that this one is processed earlier than any node equal to D . Thus, we have deduced that f (D ) ≥ f (D ), which gives a contradiction to our assumption.
Finally, assume (b). From Lemma 7, we know that n * 1,D must already have been processed. In addition, since D is a minimal diagnosis and n * 1,D ⊂ D , we have that n * 1,D can never be labeled valid or closed when it is processed, due to Lemma 6. Therefore, and because ld = ∞, every (and, in particular, the last) call of RBF-HS' that processed n * 1,D must have returned in line 36. From this, we infer that F (n * 1,D ) = max n∈Child_Nodes (F (n)) where Child_Nodes refers to the child nodes of n * 1,D . Since n * 2,D ⊆ D is one node among Child_Nodes, we obtain that F (n * 1,D ) ≥ F (n * 2,D ). If |D | − |n * | = 2, i.e., n * 2,D = D , then the same argumentation as in (*) above can be applied to show that f (D ) ≥ f (D ), a contradiction.

Soundness:
We have to prove that every node that is added to D is a minimal diagnosis. To this end, assume that some D ∈ D is not a minimal diagnosis. That is, D is (a) not a diagnosis or (b) a diagnosis, but not minimal. Suppose (a). Here we immediately get a contradiction to Lemma 1. Now, suppose (b). That is, D is a non-minimal diagnosis, or, in other words, there is a minimal diagnosis D ⊂ D . By the fact that f is cost-adjusted, f (D ) > f (D ) must hold. Further, D must have been added to D in line 16 as node n because this is the only place in RBF-HS where D is extended. Thus, the LABEL function must have been executed for n, in particular lines 38-40. However, no return can have taken place in line 40, due to the fact that n was assigned the label valid, which implies that line 46 must have been reached. As a consequence, the test n ⊇ n i in line 39 must have been negative for all n i ∈ D. Hence, no node in D is a subset of n = D , which means that, in particular, D ∉ D at the time D is processed. Now, since D is a minimal diagnosis and has a higher f -value than D , we obtain a contradiction to the completeness and best-first properties shown above. This completes the soundness proof.