Expert Systems With Applications



Introduction
Urban infrastructure assets, such as roads, ground and utilities (e.g. water, electricity, gas), are critical to the functioning of modern society (Clarke et al., 2017). Without efficient and effective diagnosis and maintenance, asset failures, such as ground sinking caused by underground sewer collapse, can lead to significant economic, social, and environmental costs (Hojjati, Jefferson, Metje, & Rogers, 2016). These problems are particularly challenging in urban areas facing increasingly disruptive street works due to extreme weather and ageing infrastructure. Research efforts have been devoted to developing various kinds of decision support systems (DSSs) for proactive urban infrastructure maintenance (Halfawy, 2010; Hojjati et al., 2016; Quintero, Konar, & Pierre, 2005; Rogers et al., 2012). For example, Arsene, Gabrys, and Al-Dabass (2012) proposed a decision support system for water pipe leakage detection, and Moazami, Behbahani, and Muniandy (2011) proposed a supporting tool for pavement rehabilitation and maintenance prioritisation using fuzzy logic. In general, however, these systems face four practical challenges in achieving proactive maintenance.
The first challenge is that urban infrastructure assets are interdependent at multiple levels (Ouyang, 2014; Rogers et al., 2012), but they are usually constructed and maintained by different stakeholders who plan and conduct street works independently, without considering these interdependencies. Construction works or deterioration related to one asset may damage other assets nearby, causing cascading problems (Ouyang, 2014). For example, breaking up or opening a road may damage the underlying ground and buried utilities. Although it has been widely recognised that an integrated (Halfawy, 2010; Quintero et al., 2005) or a "system of systems" approach (Hall, Tran, Hickford, & Nicholls, 2016) is needed for infrastructure management, the lack of explicit knowledge of asset interdependencies makes it difficult for decision makers to have a holistic view of the potential impact of their actions. Moreover, the ground, which supports the road and the buried utilities and transmits actions (e.g. traffic load) between them, is rarely considered by practitioners as an asset (Clarke et al., 2017; Rogers et al., 2012). Successful implementation of an integrated approach largely depends on the ability to share comprehensive multi-sector knowledge, especially the broad knowledge of asset interdependencies.
The second challenge is that decision making in urban infrastructure management requires a variety of data (Quintero et al., 2005), such as underground utility maps, road construction details and road closure regulations. These data are often held by different owners and stored in disconnected or even incompatible platforms, which makes it difficult for decision makers to gather useful data in a short period of time. The ability to integrate disconnected datasets into one single system would be helpful for decision makers (Michele & Daniela, 2011). Although semantic techniques have been proposed to integrate various buried asset data based on ontologies (Balasubramani et al., 2017; Halfawy, 2010; Quintero & Pierre, 2002), none of these works considered other contextual information in the urban infrastructure system, such as weather, road traffic, and ground conditions, which significantly limits their applicability in complex decision scenarios.
The third challenge is how to devise appropriate methods for proactive infrastructure maintenance, i.e. to predict the potential consequences of actions/observations in infrastructure management and suggest appropriate countermeasures. This requires identifying potential consequences/hazards for infrastructure assets (e.g. road collapse), society (e.g. traffic delays/disruptions, damage to property) and the environment (e.g. ground contamination), as well as identifying the causes (e.g. possible behaviours) that may lead to these consequences and the internal mechanism. For example, model-based techniques, such as probabilistic models and neural networks, have been used for water pipe failure prediction (Arsene et al., 2012; Hadzilacos et al., 2000) and electrical utility maintenance (Bumblauskas, Gemmill, Igou, & Anzengruber, 2017); case-based reasoning techniques have been used for selecting infrastructure intervention techniques (Quintero et al., 2005). However, all these techniques require a set of historical data or cases as training samples, which do not always exist in practice. Instead of learning from voluminous historical data, Marlow, Gould, and Lane (2015) used logical rules formulated by domain experts to suggest suitable pipe and road pavement rehabilitation techniques. The advantage of a rule-based approach lies in the fact that rules are based on experts' knowledge, underpinned by observation, experiments and theory, so they have limited dependence on historical data; the experience from one city's infrastructure can also be easily generalised to another city. However, this approach has not yet been fully examined for diagnosis and consequence prediction in urban infrastructure management.
The last challenge is that, in the application of a rule-based approach, rules formulated by domain experts are not always certain but require hedging with a confidence. To address this, approaches such as Certainty Factors (CFs) (Shortliffe & Buchanan, 1975) have been proposed, attaching degrees of belief to propositions and rules. However, researchers have warned that certainty factors can yield disastrously incorrect degrees of belief through over-counting of evidence in several circumstances (Heckerman, 1986), especially when the rule sets become larger. Consequently, almost all CF-based rule systems were either purely diagnostic (e.g. MYCIN) or purely predictive (Heckerman, 1986; Heckerman & Shortliffe, 1992). Additionally, the belief in rules is usually specified by domain experts using numerical values, whereas human judgemental reasoning is often more qualitative than numerical (Parsons & Parsons, 2001). Fuzzy rules have also been used to encode the uncertainty of knowledge (Chen, 1994; Malmir, Amini, & Chang, 2017; Moazami et al., 2011), but this method still requires numerical range values for deciding the membership functions and may be challenging for non-academic domain experts without a logic background. The domain engineers we consulted in civil engineering also suggested that it is difficult to formulate rules with numeric values, especially when dealing with the ground. In fact, in many cases, a precise specification of numerical values may not be necessary to support decision making (Goldszmidt & Pearl, 1996; Wellman, 1990).

Our contribution
In order to meet the challenges described above, we present an intelligent web-based decision support system for urban infrastructure inter-asset management based on a system-of-systems approach (Fig. 1), focusing especially on the assets in direct contact with the underground world: the road, the ground and buried utilities. The system is called the Assessing the Underworld DSS, referred to as ATU-DSS hereafter. The above four challenges were addressed as follows. (1) To help address the challenge of multi-sector knowledge sharing, urban infrastructure is considered as a system of multiple subsystems, and a family of interlinked modular ontologies was developed to capture the domain knowledge of each sector, including assets (e.g. pipes), related triggers (e.g. road cracks), potential consequences (e.g. loss of utility service) and investigation techniques (e.g. ground penetrating radar surveys). Then, based on the concepts defined in these domain ontologies, a set of logical rules was developed to encapsulate the interdependencies between different assets (e.g. "IF RoadBaseWaterContent increases and PipeLeakingRate is Severe, THEN RoadBaseStiffness will definitely decrease") and the relations leading to serious asset failures or other hazardous consequences. (2) To help address the challenge of disconnected data, various spatial datasets about the infrastructure assets and their contextual information were sourced from different owners and integrated into a single system to provide instant location-specific data retrieval. (3) To face the challenge of proactive maintenance with limited historical data, our system adopts a rule-based reasoning approach, which can be enriched as more human decisions or real data are fed in. An inference engine is applied to infer the potential consequences of a reported trigger based on the knowledge in the rule base and the data retrieved from the integrated database. (4) Lastly, a qualitative uncertainty-based reasoning approach is proposed to handle the challenge of uncertainty in human knowledge; the system can also make assumptions about the states of missing data to derive potential issues when not all data is available, and suggest investigation methods to obtain the missing data. This function allows practitioners to plan further surveys to reduce the potential risk.
Our contribution in this work is twofold. Theoretically, we propose a framework (shown in Fig. 1) for developing a knowledge-driven decision support system using a system-of-systems approach, which can be easily generalised to various engineering applications. The framework starts with the identification and modelling of subsystems using modular ontologies and the capture of their interdependencies using logical rules, followed by the definition of triggers which may affect the subsystems and of consequences which may have a serious impact on the subsystems or the external environment (e.g. social consequences). Then, an inference engine is applied to predict the potential consequences of given triggers, advise on whether additional information is needed, and suggest ways of obtaining such information. Practically, a prototype system has been developed based on the above concept by combining real-time site-specific data retrieval with automated reasoning. It also allows users to modify data values for alternative analyses.
The prototype system can help decision makers (e.g. incident managers, contractors, local authorities) to gather relevant data in one stop, store and re-use previously collected data, codify requirements from local authorities (e.g. restrictions), predict possible issues of observations or actions in advance, and apply wisdom learned from previous experience (e.g. issues encountered) to aid data interpretation and decision making. These functions can improve safety in infrastructure management, reduce costs and prevent delays. For example, this system can help institutions pass on knowledge to junior engineers and help answer questions like (Clarke et al., 2017): (a) How will the condition of the road surface, adjacent pipes and ground at a specific site change because of extreme weather, deterioration of assets, or human actions (e.g. a planned excavation)? (b) Will this change cause any undesirable consequences (e.g. traffic disruption, loss of services or even fatalities)? And (c) which asset should we maintain or replace first, or when and where should we install a new underground asset? This evaluation of undesirable consequences is different from traditional traffic or environmental impact assessment, as it takes into account the knowledge of different factors and their interdependencies to achieve a more comprehensive assessment. The system can also suggest the likelihood and severity of potential consequences, which can help decision makers prioritise their maintenance tasks, prepare health and safety files, and reduce the potential risks of high likelihood and severity.
The rest of the paper is structured as follows: Section 2 briefly introduces the ATU domain ontologies; Section 3 explains the qualitative uncertainty based reasoning approach and the strategy for handling missing data in this system. We then provide a detailed description of the prototype system in Section 4, including how the rule base was developed, what exemplar data sources were integrated and the different functions of the user interface; followed by discussions in Section 5 and conclusions in Section 6.

ATU Ontologies: A common vocabulary for data integration and reasoning
An ontology is a formal representation of the knowledge within a domain using a set of concepts and the relationships between them (Staab & Studer, 2009). It can be used as a common vocabulary and thus plays an important role in information sharing (Gruber, 1993; Noy & McGuinness, 2001). More formally, an ontology consists of a TBox, which defines terminologies or knowledge at the conceptual level, and an ABox, which describes facts about individuals using terminologies defined in the TBox (Baader, Calvanese, McGuinness, Nardi, & Patel-Schneider, 2007; Du, Alechina, Jackson, & Hart, 2013). A TBox contains definitions of classes or concepts and definitions of roles or object properties, which cover the conceptual hierarchies and relations among classes. An ABox contains assertions about individuals or instances.
At the heart of the ATU-DSS is a suite of interlinked modular ontologies (Fig. 2), developed by following the NeOn methodology (Suárez-Figueroa, Gómez-Pérez, & Fernández-López, 2012) and consulting various domain experts (e.g. geotechnical engineers). Unlike fuzzy ontologies (Cabrerizo, & Herrera-Viedma, 2019), which additionally allow fuzzy relations compared to regular ontologies, the ontologies we defined here are regular or crisp ontologies. To ensure the high quality of the knowledge defined in them, the ATU ontologies were created semi-automatically, going through an iterative process involving several discussions and meetings with domain experts. As shown in Fig. 2, the ATU ontologies model the knowledge on infrastructure assets, triggers, environment and investigation techniques (e.g. sensors) for urban infrastructure management. Each ontology is introduced in the following sections.
The ATU Urban Infrastructure Asset Ontologies describe the main concepts and relations of underground-related urban infrastructure assets using three ontologies, covering the soil/ground, road (surface infrastructure) and pipe (buried utilities) (Du et al., 2017; 2016). Each asset ontology models the investigated asset using a set of properties (e.g. ground clay content), processes (e.g. ground biological process) and simple relations about how properties and processes affect each other (Du et al., 2016). The processes and properties are grouped into categories as the characteristics of the asset, such as GroundBiologicalProcess and GroundChemicalProperty. The numbers of processes and properties in the three asset ontologies are shown in Table 1. An example of the hierarchy of the ATU Pipe Ontology is shown in Fig. 3.
A key feature of these ontologies is that, within an asset ontology, a change in a property activates a process which leads to a change in other properties. This cascading structure was achieved by encoding the dependency between classes in each sub-ontology with six relationships: hasImpactOn and its inverse influencedBy, as well as increases and decreases and their inverses increasedBy and decreasedBy (Du et al., 2017; 2016). An example of the cascading relationships between different ontology concepts is "GroundSwelling decreases the GroundStiffness which hasImpactOn some GroundSWaveVelocity". The complex relationships between multiple concepts are defined with rules and will be explained in Section 3. The urban infrastructure asset ontologies are publicly available at https://doi.org/10.5518/190 .

The ATU Trigger Ontology defines the categories and properties of events that may cause cascading effects on infrastructure assets (Clarke et al., 2017). They are often human actions, such as planned construction works, or observable phenomena, such as natural phenomena (e.g. excessive rainfall, extreme temperatures) and abnormal observations on the road (e.g. road cracks), the ground (e.g. ground movement) or buried utilities (e.g. a drop in water pressure). For example, rainfall is an external trigger which infiltrates the ground, leading to an increase in GroundWaterContent. 32 types of triggers were included in the trigger ontology, as shown in Fig. 4, each of which relates to a scenario that we want to tackle in urban infrastructure management. The properties of triggers include general properties such as severity, location/spatial geometry (e.g. point, linestring) and time (a time point or a period), and specific properties such as the type of a construction work. In the prototype DSS, a decision process starts with a trigger reported by users through the user interface.
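The cascading dependency structure described above can be sketched as a small graph traversal. The Python encoding below is hypothetical and for illustration only; the actual ontologies are expressed in OWL.

```python
# Minimal sketch of the cascading increases/decreases/hasImpactOn
# relations between asset-ontology concepts (hypothetical encoding;
# the real ontologies are OWL, this is only an illustration).
RELATIONS = {
    ("GroundSwelling", "GroundStiffness"): "decreases",
    ("GroundStiffness", "GroundSWaveVelocity"): "hasImpactOn",
    ("PipeLeakingRate", "GroundWaterContent"): "increases",
}

def cascade(concept):
    """Follow the dependency relations outward from a changed concept."""
    effects = []
    for (src, dst), rel in RELATIONS.items():
        if src == concept:
            effects.append((rel, dst))
            effects.extend(cascade(dst))  # recurse along the chain
    return effects

print(cascade("GroundSwelling"))
# [('decreases', 'GroundStiffness'), ('hasImpactOn', 'GroundSWaveVelocity')]
```

A query starting from GroundSwelling thus recovers the full chain of affected properties, mirroring the example sentence above.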
The ATU Consequence Ontology identifies the various consequences in infrastructure management, such as direct consequences for stakeholders (e.g. cost overrun) or infrastructure assets (e.g. pipe burst, trench collapse), and indirect legal, social, economic (e.g. traffic disruption) or environmental consequences (e.g. water pollution) for the general public. Each consequence is attached to a context-dependent severity level (e.g. Negligible, Marginal, Critical, Catastrophic).

ATU Investigation Ontology (Sensors)
Being different from the well-known Semantic Sensor Network (SSN) ontology, the ATU Investigation Ontology (Sensors) encodes the knowledge of currently available techniques for obtaining different infrastructure asset properties. These techniques can include a pointer to an external institution (e.g. a website), a laboratory test, or different types of sensor surveys. By working with domain experts in geotechnical engineering and geophysics and reviewing the literature ( sen, 1988 ), the current ontology includes seven classes (e.g. seismic methods, electrical methods) and 26 types of geophysical techniques (e.g. Ground Penetrating Radar), together with their relationships with the ATU Urban Infrastructure Asset Ontologies. Two relationships were defined to describe the suitability of an investigation method for measuring different asset properties in shallow (0-5 m depth) surveys: measures and its inverse measuredBy, which mean "SensorA measures PropertyB with usefulness_score N" and "PropertyB is measuredBy SensorA with usefulness_score N". The investigation suggestions and the corresponding usefulness score N (an integer between 0 and 4, where 0 means "not considered applicable" and 4 means "generally considered an excellent and well developed approach") of each relation were assigned by domain experts and implemented in OWL Protégé as annotations (which can be queried). With these relationships established, appropriate investigation techniques can be recommended to users in the prototype decision support system when the data for an asset property is missing. Other investigation methods can also be added to the knowledge base in the future.
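The measuredBy relation with its usefulness score lends itself to a simple ranked lookup. The sketch below illustrates the idea with invented technique names and scores; the real values live in the ontology annotations.

```python
# Sketch of recommending investigation techniques for a missing asset
# property via the measuredBy relation and its usefulness score (0-4).
# The techniques and scores below are invented for illustration only.
MEASURED_BY = {
    "GroundSWaveVelocity": [("SeismicRefraction", 4), ("GroundPenetratingRadar", 1)],
    "GroundWaterContent": [("ElectricalResistivity", 3), ("GroundPenetratingRadar", 2)],
}

def recommend(prop, min_score=1):
    """Return applicable techniques for `prop`, best usefulness first."""
    candidates = MEASURED_BY.get(prop, [])
    return sorted((t for t in candidates if t[1] >= min_score),
                  key=lambda t: -t[1])

print(recommend("GroundWaterContent"))
# [('ElectricalResistivity', 3), ('GroundPenetratingRadar', 2)]
```

Filtering on a minimum score lets the system hide techniques marked "not considered applicable" (score 0).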
The ATU Environment Ontology models the environmental factors (e.g. rainfall, drought) affecting or being affected by the infrastructure assets. Instead of building this ontology from scratch like the modular ontologies presented above, the Environment Ontology was created based on several existing external ontologies (e.g. NASA's SWEET Ontology, the Environment Ontology, Ordnance Survey's Buildings and Places Ontology). This is because our work does not need a thorough modelling of the environment, but only several essential concepts such as rainfall, drought and contamination.
The concepts in these ATU ontologies are used to guide relevant data sourcing, and serve as a common vocabulary for defining inference rules (complex relationships between multiple ontology concepts) and for integrating various datasets from different domains, such that heterogeneous data can be used seamlessly in automated reasoning. Though ontologies have been used widely for knowledge modelling and information retrieval (Munir & Anjum, 2018), most existing approaches use a single domain ontology rather than a series of ontologies covering various domains, as we do here. For more details of the ATU ontologies, interested readers are referred to our earlier work on the soil ontology (Du et al., 2016). The authors are also preparing a separate paper to introduce the modelling process of the ATU ontologies.

Qualitative uncertainty based reasoning in rule-based systems
Based on the concepts defined in the ATU ontologies, we continue to develop logical rules in collaboration with domain experts to encapsulate the broad knowledge of internal dependencies within one subsystem, as well as the external dependencies between different infrastructure assets, environmental factors and human activities. For example, a rule "Heavy and Long rainfall will infiltrate the road if the road crack penetrates the road surface." is defined referring to the concepts in the Environment Ontology and the Road Ontology, written as: "Environment Rainfall Intensity (Heavy) ∧ Environment Rainfall Duration (Long) ∧ Road Crack Penetrates Surface → Road Infiltration". As discussed previously, rules are often conditional and require augmenting with a confidence, but the domain engineers we consulted in urban infrastructure management found it difficult to formulate rules with numeric probabilities. Instead, they preferred expressing uncertain information using qualitative linguistic expressions, which is in accordance with the theory proposed by Wallsten and Budescu (1995) on human reasoning. In this work, we present a qualitative confidence-level-based uncertainty management scheme, which our domain experts considered more accessible than the linguistic quantifiers used with fuzzy set theory (Bonissone, Gans, & Decker, 1987; Cid-Lpez, Hornos, Carrasco, Herrera-Viedma, & Chiclana, 2017; Zadeh, 1984) or kappa calculus (Clinton & David, 2004; Goldman & Maraist, 2015; Goldszmidt & Pearl, 1996; Poole & Smyth, 2005). The proposed confidence levels can also be interpreted in terms of degrees of surprise using kappa calculus, or in terms of probability. Different calculation formulae are proposed to sequentially propagate the qualitative confidences of data and rules, as well as to combine parallel chains leading to the same conclusion. This is an extension of our previous work (Mahesar et al., 2017), adding probabilistic interpretations of the approach, a mechanism to avoid multi-counting the same fact in different rules, checking for potential contradictions in the rule base, and consideration of diagnostic rules for abductive inference.

Qualitative confidence levels
In the scope of this paper, six qualitative confidence levels {Impossible, Very Unlikely, Unlikely, Likely, Very Likely, Definite} were empirically selected to describe experts' confidence in a rule, i.e. the degree of a person's (e.g. a domain expert's) belief that the conclusion is true given the premise. The confidence levels could be any other ordered linguistic list (i.e. words/phrases), provided the list follows an order from impossible to definite. For example, a list {impossible, very improbable, improbable, probable, very probable, definite} could replace the one used in this work. Furthermore, the granularity of the confidence levels depends on the requirements of different applications. For example, the designer of a decision support system could simplify the confidence levels of rule sets from six to four, as {Impossible, Unlikely, Likely, Definite}, to ease the workload of domain experts.
Definitions of the six confidence levels used in our work are listed in Table 2. These confidence levels constitute three pairs of symmetrical confidence levels: {Impossible vs Definite}, {Very Unlikely vs Very Likely}, and {Unlikely vs Likely}. With this feature, domain experts are free to use either of the two states of a binary variable (e.g. On/Off, Active/Inactive) for authoring rules, since the complementary confidence level can be automatically inferred to guarantee that other related rules can be fired. For example, assume a binary variable B with two states On and Off, and two rules: "If A, then it is Definite that B is On" and "If B is Off, then C". Originally, if we observe A, the first rule will be fired but not the second. However, based on our definition above, we can generate a new rule from the first rule: "If A, then it is Impossible that B is Off". With this rule added and fired, the second rule will also be fired to further infer the state of C. Such complementary rules are added to the system automatically in the rule definition phase. We can also attach qualitative confidences to input facts to reflect their imprecision due to inaccuracies in observation or measurement limitations (e.g. instrument precision).
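The automatic generation of complementary rules can be sketched as follows; the tuple encoding of a rule is a hypothetical simplification for illustration.

```python
# Sketch of automatic complementary-rule generation for binary variables.
# The six confidence levels form symmetric pairs, so a rule
# "If A, then it is Definite that B is On" implies
# "If A, then it is Impossible that B is Off".
COMPLEMENT = {
    "Definite": "Impossible", "Impossible": "Definite",
    "VeryLikely": "VeryUnlikely", "VeryUnlikely": "VeryLikely",
    "Likely": "Unlikely", "Unlikely": "Likely",
}
OPPOSITE_STATE = {"On": "Off", "Off": "On"}

def complementary_rule(rule):
    """rule = (premise, (variable, state), confidence_level)"""
    premise, (var, state), level = rule
    return (premise, (var, OPPOSITE_STATE[state]), COMPLEMENT[level])

r = ("A", ("B", "On"), "Definite")
print(complementary_rule(r))  # ('A', ('B', 'Off'), 'Impossible')
```

The generated rule asserts the opposite state with the paired confidence level, so a rule conditioned on "B is Off" can still fire (with Impossible confidence) once A is observed.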
Confidence levels can also be interpreted in terms of degrees of surprise using kappa calculus (Goldszmidt & Pearl, 1996), an order-of-magnitude calculus in which each rank is an order of magnitude more surprising than the next; a probability distribution P can be mapped to a kappa ranking κ such that P/ε^κ is finite but not infinitesimal for an infinitesimal ε. Kappa rankings can be interpreted as an approximation to probabilities through the following relations (Darwiche & Goldszmidt, 1994):

κ(α ∧ β) = κ(α | β) + κ(β),  (2)

κ(α ∨ β) = min(κ(α), κ(β)),  (3)

where multiplication of probabilities ( P(α ∧ β) = P(α | β) P(β) ) is replaced by addition of kappa values (Eq. (2)), and addition of probabilities ( P(α ∨ β) ) is replaced by minimisation in the kappa calculus (Eq. (3)) (Clinton & David, 2004; Goldszmidt & Pearl, 1996; Poole & Smyth, 2005).
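The mapping between probabilities and kappa rankings can be illustrated with a small numerical sketch, using a finite illustrative base ε = 0.01 rather than a true infinitesimal.

```python
import math

# Sketch of the kappa abstraction of probabilities: kappa(alpha) is the
# order of magnitude of P(alpha) in a small epsilon, i.e. the integer k
# such that P/eps**k is finite but not infinitesimal.
EPS = 0.01  # illustrative base; in the theory, eps is infinitesimal

def kappa(p):
    """Order-of-magnitude rank of probability p (0 maps to infinity)."""
    if p == 0:
        return math.inf
    return round(math.log(p, EPS))

p_b = EPS ** 2          # P(beta)
p_a_given_b = EPS ** 1  # P(alpha | beta)
p_ab = p_a_given_b * p_b

# Multiplication of probabilities becomes addition of kappa values (Eq. 2):
assert kappa(p_ab) == kappa(p_a_given_b) + kappa(p_b)

# Addition of probabilities (disjunction) becomes minimisation (Eq. 3):
p_c = EPS ** 3
assert kappa(p_ab + p_c) == min(kappa(p_ab), kappa(p_c))
print(kappa(p_ab), kappa(p_ab + p_c))  # 3 3
```

With a finite ε the relations hold only approximately (hence the rounding); they become exact in the infinitesimal limit.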

Qualitative confidence vectors of facts and rules
For computation purposes, instead of directly using the linguistic confidence levels or a single confidence value (i.e., a scalar), we use an ordered vector of four numerical elements to encode the confidence level of a rule or a fact, called a confidence vector:

C = ⟨VU, U, L, V⟩,  (4)

where the elements abbreviate {Very Unlikely, Unlikely, Likely, Very Likely}. The four elements of the confidence vector C_F of a fact can be any non-negative integers, while the confidence vector C_R of a predictive rule can have at most one positive element, corresponding to the confidence levels defined in Table 2. For example, a rule "If A happens, then it is likely that B will happen" is attached with the confidence level Likely, and the corresponding confidence vector is C_R = ⟨0, 0, 1, 0⟩. It should be noted that the confidence level D (Definite) is implicit in the definition when VU, U, L and V are all 0, and the confidence level I (Impossible) is implicit when any of the vector elements is an extremely large number (for example, VU = 1000 is assigned to rules marked as Impossible in our application). This vector representation is designed to store the accumulated uncertainties from data and rules using simple element-wise addition. We can also easily restore the approximate probability from a confidence vector at the end of an inference process. However, if the uncertainties were represented using one single scalar, addition could not be used any more, as it would be difficult to separate and reconstruct the uncertainty from one single accumulated scalar. More explanations will be given in the next section.
Interpretation of confidence vectors. Given the confidence vector C = ⟨VU, U, L, V⟩ of a fact E (or a rule R), and assuming that each element of this vector can be represented by a numerical value a_i between 0 and 1, the probability p (a measure of an expert's belief) of this fact (rule) being true can be calculated as:

p_E = a_1^VU · a_2^U · a_3^L · a_4^V.  (5)

For example, assume C = ⟨1000, 0, 0, 0⟩ and the probability related to VU is a_1 = 0.01; the corresponding probability of this confidence vector is p = a_1^1000 = 0.01^1000 ≈ 0; therefore, we consider this fact/rule as impossible. More details about Impossible will be given in the following sections. It can be noted that p_E approaches 0 when any of the vector elements gets extremely large (e.g. a conclusion inferred from a long sequence of uncertain rules), especially for VU since a_1 is close to zero; p_E will be close to 1 when the sum of {VU, U, L, V} gets close to 0. The extreme case is when VU, U, L and V are all 0, where p_E equals 1, which is consistent with the definition of "Definite" in Eq. (4). The numerical values of {a_1, a_2, a_3, a_4} can vary for different applications or different experts (though they should be consistent within one knowledge base). The confidence vector provides an easy and intuitive way to encode and propagate the uncertainties in a rule-based system.
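Eq. (5) can be sketched directly; the element probabilities a_1 to a_4 below are illustrative assumptions, since the paper leaves them application-specific.

```python
# Sketch of restoring an approximate probability from a confidence vector
# (Eq. 5): p = a1**VU * a2**U * a3**L * a4**V. The element probabilities
# below are invented for illustration; they are application-specific.
A = (0.01, 0.2, 0.8, 0.99)  # hypothetical values for (a1, a2, a3, a4)

def probability(cv):
    """cv = (VU, U, L, V): non-negative integer confidence vector."""
    p = 1.0
    for a_i, count in zip(A, cv):
        p *= a_i ** count
    return p

assert probability((0, 0, 0, 0)) == 1.0  # Definite: all elements zero
print(probability((0, 0, 1, 0)))         # Likely -> 0.8
print(probability((1000, 0, 0, 0)))      # Impossible -> ~0.0
```

Because the vector stores counts, the product collapses a whole inference chain into one probability at the end, which is exactly what a single accumulated scalar could not recover.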
In terms of kappa calculus, by taking logs of the probability defined in Eq. (5) (with a positive infinitesimal base ε) and replacing multiplication by addition, we get the surprise level of E (or R) happening as:

κ(E) = VU · log_ε a_1 + U · log_ε a_2 + L · log_ε a_3 + V · log_ε a_4 = ⟨VU, U, L, V⟩ · K,  (6)

where K = ⟨log_ε a_1, log_ε a_2, log_ε a_3, log_ε a_4⟩^T represents the surprise of the different confidence levels; the surprise that E happens is thus the sum of the surprise contributed by each element of ⟨VU, U, L, V⟩. As ε is infinitesimal, {a_1, a_2, a_3, a_4} ∈ [0, 1] and VU, U, L, V are non-negative numbers, the kappa values are always non-negative. When all elements of ⟨VU, U, L, V⟩ are zero, the surprise of E happening is zero, which is consistent with the definition of Definite in Table 2. In the following sections, we explain how to propagate the confidence vectors of data and rules in different situations.

Formula 1: The confidence level of a rule's conclusion given one fact
For a rule R1 with one premise, R1: If a, then b; if our confidence in a is C_a = ⟨VU_a, U_a, L_a, V_a⟩ and the experts' confidence in this rule is C_R1, then our confidence in the conclusion b, denoted C_{a,b} (based on a), can be computed by adding the confidence vectors of the premise and the rule:

C_{a,b} = C_a + C_R1.  (7)

For example, if we have C_a = ⟨0, 0, 0, 0⟩ and C_R1 = ⟨0, 0, 0, 1⟩ (Very Likely), then we can estimate the confidence in b as C_{a,b} = ⟨0, 0, 0, 1⟩. In terms of kappa calculus, the surprise of (a ∧ b) happening can be obtained from the addition formula in Eq. (2) and linked to confidence vectors through the definition of kappa values in Eq. (6):

κ(a ∧ b) = κ(b | a) + κ(a) = C_R1 · K + C_a · K = (C_a + C_R1) · K.

The derived confidence vector is the same as in Eq. (7); the formula can thus be considered a simple way to accumulate the kappa values of the different facts and rules. We can then decide the surprise level of a derived fact by multiplying the derived confidence vector by K and tuning the values of a_1, a_2, a_3, and a_4.
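Formula 1 amounts to element-wise vector addition, as the following sketch shows:

```python
# Sketch of Formula 1 (Eq. 7): the confidence in a rule's conclusion is
# the element-wise sum of the premise's and the rule's confidence vectors.
def combine(c_premise, c_rule):
    return tuple(p + r for p, r in zip(c_premise, c_rule))

c_a = (0, 0, 0, 0)   # fact a is Definite
c_r1 = (0, 0, 0, 1)  # rule "if a, then b" is Very Likely
print(combine(c_a, c_r1))  # (0, 0, 0, 1): b is Very Likely
```

Chaining rules simply repeats this addition, so uncertainty accumulates monotonically along a derivation.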

Formula 2: The confidence level of a rule's conclusion given a conjunction of facts
For a rule R2 with n uncertain premises A_i and confidence vectors C_{A_i}, and assuming all the facts are independent, our confidence in the conjunction (A_1 ∧ ... ∧ A_n) can be calculated by taking the sum of the confidence vectors of all these facts:

C_{A_1 ∧ ... ∧ A_n} = Σ_{i=1}^{n} C_{A_i}.  (10)

After obtaining the confidence in the conjunction of premises, our confidence in the conclusion B can be derived based on Formula 1 (Eq. (7)) by adding the confidence vectors of the rule and the premises:

C_B = C_R2 + Σ_{i=1}^{n} C_{A_i}.  (11)

For example, if our confidence in a rule "If c and d, then h" is C_R = ⟨0, 1, 0, 0⟩ (Unlikely), the confidence in h is obtained by adding C_R to the confidence vectors of the two facts c and d. In the above formulae, we assume all premises are independent. In terms of risk, we may want to know whether there is any chance the premises are not independent, as this may suggest a higher confidence in the inferred fact. The following solution to the multi-count problem partially handles the dependencies of premises (when one premise is used to infer other premises).
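Formula 2 can be sketched in the same style, summing the premises' vectors before adding the rule's vector; the fact confidences below are illustrative.

```python
# Sketch of Formula 2 (Eqs. 10 and 11): with independent premises, the
# confidence vector of the conjunction is the element-wise sum of the
# facts' vectors; the rule's vector is then added on top (Formula 1).
def vec_sum(*vectors):
    return tuple(sum(col) for col in zip(*vectors))

c_c = (0, 0, 1, 0)   # fact c: Likely (illustrative)
c_d = (0, 0, 0, 1)   # fact d: Very Likely (illustrative)
c_r = (0, 1, 0, 0)   # rule "if c and d, then h": Unlikely

c_conj = vec_sum(c_c, c_d)  # confidence in (c AND d): (0, 0, 1, 1)
c_h = vec_sum(c_conj, c_r)  # confidence in h: (0, 1, 1, 1)
print(c_h)
```

Each premise contributes its own uncertainty exactly once here; the next subsection deals with the case where that assumption breaks.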
Avoidance of the "Multi-count" Problem of Uncertainties. In rule-based systems, whenever we have a rule of the form A ⇒ B, we can conclude B given A without worrying about other rules; and once B is proved, it can be used regardless of how it was derived. However, when dealing with probabilities, the source of the premises of a conclusion is important for subsequent reasoning (Russell & Norvig, 2010). Ignoring this may result in "multi-counting" the uncertainties of some facts during inference (Heckerman, 1986; Heckerman & Shortliffe, 1992). For example, assume there are two input facts {a, b} and two rules in the knowledge base:

Rule 1: if a and b, then c;  Rule 2: if c and a, then d.  (12)

Based on Formula 2 (Eqs. (10) and (11)), we can derive the uncertainty of c and d and record all the facts used to deduce each fact (Mahesar, Dimitrova, Magee, & Cohn, 2017), where GF records the antecedents of an inferred fact. It can be seen that, in inferring d, the uncertainty of a is accumulated twice. To avoid multi-counting the uncertainty of a fact, we can first propagate the confidence vectors of rules to conclusions (inferred facts) and store the corresponding antecedents of all premises (i.e. unique values) during an inference process; then, the confidence of each inferred fact can be calculated by adding the confidence vectors of the rules and the unique antecedents when inference finishes.
To implement this, a given fact can be initialised as gf(CF, CR, GF), where CF stores the confidence vector of this fact, CR stores the accumulated confidence vector of the rules used to infer this fact, and GF stores the antecedents of this fact. For an inferred fact, its CR is the sum of the CR of all its premises (based on Formula 1) and its GF is the union of the GF of all its premises. For the example in Eq. (12), the reasoning process is shown in Table 3 from top to bottom.
Then, we can infer the confidence vector of c accordingly. In summary, the "over-counting" problem can be avoided by checking the antecedents of all the premises of a rule, and this solution deals in part with the dependencies between premises.
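A minimal sketch of this bookkeeping for the {a, b} example, with hypothetical confidence vectors for the rules and the given facts:

```python
def add_cv(a, b):
    return tuple(x + y for x, y in zip(a, b))

class Fact:
    """gf(CF, CR, GF): own confidence, accumulated rule confidence,
    and the set of given-fact antecedents."""
    def __init__(self, name, cf=(0, 0, 0, 0), cr=(0, 0, 0, 0), gf=None):
        self.name, self.cf, self.cr = name, cf, cr
        self.gf = gf if gf is not None else {name}

def fire(rule_cv, premises, conclusion):
    """Propagate rule confidence to the conclusion; GF is the union of
    the premises' antecedents, so each given fact is stored only once."""
    cr, gf = rule_cv, set()
    for p in premises:
        cr = add_cv(cr, p.cr)
        gf |= p.gf
    return Fact(conclusion, cr=cr, gf=gf)

# Rule 1: if a and b, then c (Unlikely); Rule 2: if c and a, then d (Likely)
a, b = Fact("a", cf=(0, 0, 0, 1)), Fact("b", cf=(0, 0, 1, 0))
c = fire((0, 1, 0, 0), [a, b], "c")
d = fire((0, 0, 1, 0), [c, a], "d")

# When inference finishes: rule confidences plus each unique antecedent once,
# so the uncertainty of `a` is not counted twice.
given = {"a": a, "b": b}
cv_d = d.cr
for name in sorted(d.gf):
    cv_d = add_cv(cv_d, given[name].cf)
```

Because GF is a set, `a` contributes its uncertainty to d exactly once even though it appears in the antecedents of both rules.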
Interpretation of Formula 2. For the example in Eq. (12), since the confidence in Rule 2 suggests the conditional probability P(d | c, a), we can obtain the joint probability P(c, a, d) accordingly. In our system, we consider a premise a to be dependent on a premise c if a was used for deriving c (i.e. a is an antecedent of c). In terms of kappa-calculus, the surprise of (c ∧ a ∧ d) happening can be obtained using the kappa values (Eq. (6)) and the addition rule (Eq. (2)); κ(a | c) = 0 means there is no surprise that a happens if we observe that c happens, since a is one of the antecedents of c. Let Q denote the set of independent premises of Rule 2; the kappa value of the conclusion (and the premises) happening, and hence the corresponding confidence vector of the conclusion, can then be derived. This formula is in accordance with Formula 2; moreover, the multi-count problem has also been addressed by excluding the dependent premises.

Formula 3: Combining confidence levels of the same conclusion derived from parallel rules
Let CF = ⟨VUF, UF, LF, VF⟩ denote our confidence in an inferred fact F. If the same fact F is inferred from two separate rules, the combined confidence is obtained by an operator that returns the minimum of the two argument vectors (Eq. (19)). For example, in Fig. 5 (a knowledge base with six predictive rules), nodes represent uncertain variables and arrows represent the rules from premises to conclusions.
Facts a, c and d are observed with confidence vectors Ca, Cc and Cd. It can be seen from the figure that fact e can be inferred from two different rules, Rule 3 (Definite) and Rule 4 (Likely). Based on the propagation Formula 1 defined in Eq. (7), the confidence vectors of e inferred separately from the two rules can be computed; based on Formula 3 in Eq. (20), the confidence vector of e is their minimum (as shown in Fig. 5). Given two mutually exclusive events A and B, and assuming P(A) ≥ P(B), we have P(A ∨ B) = P(A) + P(B) ≥ P(A). As b (Rule 3) and h (Rule 4) provide two independent reasons to believe e, the two observations together should infer e with a belief that is stronger than either component in isolation, so P(e) is always at least the maximum probability of either individual event. Replacing the kappa values with confidence vectors (Eq. (6)), the minimum confidence vector corresponds to the maximum probability of A and B, and the derived confidence vector is in accordance with the formula in Eq. (19). Derivation of the minimum confidence vector is detailed in Appendix B.
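Assuming the minimum is taken in the lexicographic (order-of-magnitude) sense used for vector comparison in Eq. (20), Formula 3 can be sketched as follows; the two derived vectors for e are hypothetical:

```python
def combine_parallel(cv1, cv2):
    """Formula 3: combine two derivations of the same fact by taking the
    minimum confidence vector (lexicographic order on tuples, matching
    the order-of-magnitude comparison of kappa values)."""
    return min(cv1, cv2)

# Hypothetical confidence vectors for e derived via Rule 3 and Rule 4:
cv_e_rule3 = (0, 0, 1, 0)
cv_e_rule4 = (0, 1, 0, 0)
cv_e = combine_parallel(cv_e_rule3, cv_e_rule4)
```

Python's built-in tuple comparison is already lexicographic, so the smaller (less surprising) vector wins, mirroring min(κ(A), κ(B)).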

Rule base consistency validation with qualitative confidence levels
In this section, a mechanism is proposed to check the consistency of a rule base based on the definition of qualitative confidence levels in Section 3.1.1. The principle is to ensure that, given a knowledge base and assuming all the rules are satisfied, no contradictory conclusions will be inferred from the same group of facts. For example, assume both a fact p and its negation ¬p are derived with confidence level Definite from the same group of facts; since "¬p is Definite" implies that "p is Impossible", p (Definite) and p (Impossible) are contradictory. Based on the six confidence levels defined in Table 2, six pairs of contradictory confidence levels are defined in Table 4, such as Definite vs Impossible.
For example, suppose we have four rules in a rule base (Fig. 6) and A is observed with confidence vector ⟨0, 0, 0, 0⟩ (Definite). The four rules then imply that D is considered very likely to happen through Rule 1 and Rule 3, but also very unlikely to happen through Rule 2 and Rule 4. If such contradictions are found in the rule-definition phase, an alert is shown to the domain experts; the experts can either accept these inconsistencies,15 add more conditions to the left-hand side of the relevant rules, or adjust the confidence levels of these rules.
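The check can be sketched as below. Only the Definite/Impossible pair is stated explicitly in the text, so the CONTRADICTORY set here is deliberately limited to that single pair (the full six pairs come from Table 4); the level-complement table is likewise an assumption extrapolated from the "¬p is Definite implies p is Impossible" example:

```python
# Assumed complements: "NOT p is L" implies "p is complement(L)".
COMPLEMENT = {
    "Definite": "Impossible", "VeryLikely": "VeryUnlikely",
    "Likely": "Unlikely", "Unlikely": "Likely",
    "VeryUnlikely": "VeryLikely", "Impossible": "Definite",
}

# Only the pair given in the text; the paper's Table 4 defines six pairs.
CONTRADICTORY = {frozenset({"Definite", "Impossible"})}

def find_contradictions(derived):
    """`derived` maps each fact (negations prefixed '~') to the confidence
    level inferred for it from the same group of facts."""
    alerts = []
    for fact, level in derived.items():
        neg = fact[1:] if fact.startswith("~") else "~" + fact
        if neg in derived:
            implied = COMPLEMENT[derived[neg]]   # level implied for `fact`
            if frozenset({level, implied}) in CONTRADICTORY:
                alerts.append((fact, level, implied))
    return alerts

alerts = find_contradictions({"p": "Definite", "~p": "Definite"})
```

Deriving both p and ¬p as Definite from the same facts triggers an alert for each side of the pair, which in the prototype would be surfaced to the experts.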

Extended confidence vector for diagnostic rules
Rules in the system can be classified as predictive rules or diagnostic rules. Predictive rules describe the relationship from cause to effect ("If Cause, then Effect"), and diagnostic rules describe the relationship from evidence to hypothesis ("If Effect, then Possible Cause"). For example, given an observed infrastructure defect, the confidence in a predictive rule reflects a stable property of this defect (i.e. the likelihood of a consequence happening, given the defect). In contrast, the confidence in a diagnostic rule (i.e. the likelihood of a defect, given a consequence) depends on the incidence rate of that defect and on the other causes that may produce the same consequence ( Heckerman & Shortliffe, 1992 ). Generally, domain experts are more comfortable formulating predictive rules than diagnostic rules, since the incidence rates of different causes (prior probabilities) are often hard to define.

Extended confidence vector with an abduction count
Although the confidence levels of diagnostic/abductive rules are hard to define, in order to warn users of all possible causes and the potential consequences of an observation without mixing the confidence in predictive and diagnostic rules, we extend the qualitative confidence vectors described in the previous sections by adding an extra element A to the front of a confidence vector, written as ⟨A; VU, U, L, V⟩. For predictive rules, A is fixed at 0 whilst VU, U, L, V are defined by domain experts; for diagnostic rules, A is fixed at 1 whilst VU, U, L, V are fixed at ⟨0, 0, 0, 0⟩. For a fact, A records how many diagnostic rules have been used for inferring this fact, whilst VU, U, L, V represent its uncertainty (either provided at the beginning of an inference or propagated from other facts and predictive rules).
A fact is also attached with a list of given facts (GF) and abductive facts (AF) storing the corresponding antecedents from predictive and diagnostic rules. The three formulae (Eqs. (7), (10) and (19)) defined in the previous sections also apply to the extended (ordered) confidence vector by including the element A in the calculation.
In our applications, predictive rules are defined by domain experts and diagnostic rules are automatically generated by reversing the cause and effect. For clarity, in the following sections we use R→ to represent predictive rules and →R to represent diagnostic rules. For example, a road crack could be triggered by several factors, such as traffic overloading, extreme temperature (e.g. freezing), or water infiltration into the road due to nearby road cracks and rainfall. A predictive rule "Surface deformation can cause road cracks" can be reversed as "Road cracks could be caused by surface deformation"; the two rules are written "RoadDeformation (Active) −⟨0; 0, 0, 0, 0⟩→ RoadCracking (Active)" and "RoadCracking (Active) −⟨1; 0, 0, 0, 0⟩→ RoadDeformation (Active)".
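For a single-premise rule, this automatic inversion can be sketched as follows (in the prototype the inversion is applied when the rule base is generated; the tuple encoding here is illustrative):

```python
def invert_rule(premise, conclusion, cv):
    """Reverse a predictive rule into a diagnostic rule: the extended
    vector <A; VU, U, L, V> gets A = 1 and <VU, U, L, V> fixed at zeros."""
    assert cv[0] == 0, "predictive rules carry A = 0"
    return conclusion, premise, (1, 0, 0, 0, 0)

predictive = ("RoadDeformation(Active)", "RoadCracking(Active)", (0, 0, 0, 0, 0))
diagnostic = invert_rule(*predictive)
```

The diagnostic rule swaps premise and conclusion and discards the expert confidence, keeping only the abduction count A = 1.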

Reasoning with both predictive and diagnostic rules
In practice, predictive and diagnostic rules are often used seamlessly. For example, given a defect, we hypothesise what is happening in the world to explain why this defect appears; then, we apply the predictive rules to infer all consequences potentially caused by deterioration related to this defect. But if both predictive and diagnostic rules exist in one knowledge base, inter-causal reasoning may happen. For example, suppose we have the predictive rule "Sprinkler ⇒ (Likely) WetGrass" together with the corresponding diagnostic rule "WetGrass → Rain" in a conventional rule base: if we see the sprinkler is on, chaining forward through the rules will increase the belief that the grass is wet, which in turn increases the belief that it is raining. To mitigate this interaction, pre-defined salience scores are added to all the rules such that abductive inference (using diagnostic rules) is performed first to find all possible causes of the observed facts, followed by predictive inference (using predictive rules) to find all potential consequences. For example, as shown in Fig. 7, if road cracks are observed and the ground principal type is sand, all possible causes of the cracks are first inferred based on diagnostic rules; the additional potential consequences of these inferred facts are then inferred through predictive rules.

Reasoning with missing facts
In addition to the uncertainty in rules, the domain experts we consulted also wanted to know what facts were assumed to be present in the derivation of a potential consequence, so they could conduct further investigations to check whether these missing facts hold ( Mahesar et al., 2017 ). To support this, a mechanism is provided in our system to handle incomplete data: if any premises (facts) of an inference rule are missing,16 the system will make assumptions about all possible states of the missing facts so that the related rules can still be fired. These missing facts (with assumed values) are also attached to the inferred fact as {MF}, in the same way as the given facts and abduced facts. The facts used for inferring a consequence are displayed on the user interface for further guidance.
In order to infer all potential consequences, assumed facts are added into the knowledge base with all possible states of the missing facts (e.g. subgrade type = sand/rock/clay/gravel); these facts are combined with the rule base for reasoning. However, feeding different values of a fact (inconsistent information) into the same knowledge base may cause problems, since one inferred fact can be used regardless of its justification. For example, suppose we have one given fact A (Severe) and three rules in a knowledge base of the form: Rule 1: if A and B is Clay, then C increases; Rule 2: if A and B is Sand, then C increases; Rule 3: if C increases and B is Sand, then D decreases. As A is given, the system will make two assumptions about the missing fact B: "B is Clay" and "B is Sand". Ideally, we only want Rule 1 to fire to infer C (increases), or Rule 2 and Rule 3 to fire together to infer C (increases) and D (decreases). However, the C (increases) inferred from Rule 1 would also cause Rule 3 to fire, in which case "B is Clay" and "B is Sand" cannot hold together. To avoid this, in Truth Maintenance Systems (Reason Maintenance Systems) a dependency network is often constructed to record the dependencies of derived facts so as to retract the inconsistent facts at the end of inference ( Doyle, 1979 ), but this post-processing approach may allow exponentially many subsequent inconsistent facts to be inferred during the inference process. Therefore, in our system, we avoid logically inconsistent derivations on the fly by checking whether there are any contradictory antecedents in a rule. As shown previously in the example in Table 3, all given facts are initialised by including themselves in the given fact list, whilst all assumed facts (missing facts) are initialised by adding themselves to the missing fact list. For the above example, since the missing-fact list of the C inferred from Rule 1 contains "B is Clay", which contradicts the premise "B is Sand", Rule 3 will not fire with the C inferred from Rule 1. This step guarantees that each derived fact has consistent antecedents.
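The two ingredients of this mechanism, enumerating all possible states of the missing facts and rejecting contradictory antecedents, can be sketched as:

```python
from itertools import product

def expand_missing(missing):
    """Enumerate every combination of possible states of the missing facts,
    e.g. {'Subgrade': ['Clay', 'Sand']} -> one assumption set per state."""
    names = list(missing)
    return [dict(zip(names, combo))
            for combo in product(*(missing[n] for n in names))]

def consistent(ante_a, ante_b):
    """Antecedent sets are consistent if no shared fact holds two values."""
    return all(ante_a[f] == ante_b[f] for f in set(ante_a) & set(ante_b))

assumptions = expand_missing({"B": ["Clay", "Sand"]})
# A fact derived under {'B': 'Clay'} cannot feed a rule premise {'B': 'Sand'}:
blocked = not consistent({"A": "Severe", "B": "Clay"}, {"B": "Sand"})
```

In the Rule 3 example above, the consistency check blocks the C derived under "B is Clay" from firing a rule whose premises require "B is Sand".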
Maximally consistent sets of assumptions. With the conclusions (e.g. consequences in our ATU-DSS) inferred from different combinations of given and assumed facts, we need to group consistent consequences based on the different assumptions: 1) first, an adjacency matrix (undirected graph) of all consequences is generated, in which two consequences are connected if their antecedents are consistent (i.e. no fact holds different values); 2) then, all maximal cliques in this undirected graph are identified; within each clique the consequences are mutually consistent, and the clique cannot be extended by including one more adjacent vertex. Whilst systems would typically choose a preferred set of assumptions automatically based on certain criteria, we look at all sets simultaneously, and the user interface of our prototype lets users compare the reasoning chains of different consequences and decide which group of assumptions is more reasonable ( Fig. 12 ).
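The clique enumeration can be done with any maximal-clique algorithm; a minimal Bron-Kerbosch sketch over a toy consequence graph (the consequences c1-c3 and their consistency links are hypothetical):

```python
def maximal_cliques(adj):
    """Bron-Kerbosch: enumerate maximal cliques of an undirected graph
    given as {node: set(neighbours)}. Each clique is a maximal set of
    mutually consistent consequences."""
    cliques = []
    def bk(R, P, X):
        if not P and not X:
            cliques.append(R)
            return
        for v in list(P):
            bk(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}
    bk(set(), set(adj), set())
    return cliques

# c1 is consistent with both c2 and c3, but c2 and c3 assume different
# values for the same missing fact, so they are not connected.
adj = {"c1": {"c2", "c3"}, "c2": {"c1"}, "c3": {"c1"}}
groups = maximal_cliques(adj)
```

Here the two maximal cliques {c1, c2} and {c1, c3} are exactly the two alternative consistent views presented to the user.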

A prototype decision support system for urban infrastructure inter-asset management
A prototype has been developed based on the uncertain reasoning approach and the ATU ontologies presented in the previous sections. The system architecture is shown in Fig. 8. It includes a data layer, a logic layer and an interface layer. In the following sections, we first introduce the ATU-DSS rule base, then present the data sources, and finally present the user interface and demonstrate how to use the system with an example. A video demonstrating the prototype is available at http://bit.ly/2MRHMCc .17

Developing a rule-base for ATU-DSS based on scenarios
Since the information to be encoded as rules in ATU-DSS is very extensive and the time of domain experts is precious, we adopted a scenario-based strategy for rule-base development. First, several representative scenarios were selected by our domain experts, such as rainfall with road cracking and underground pipe leakage with active traffic loading; new scenarios can be added incrementally. Then, for each scenario, rules were defined by domain experts by following the deterioration process of the assets and considering all contextual possibilities. The flowcharts of these processes were sent to external domain experts and practitioners for validation. For example, in a scenario about rainfall with road cracking, the relevant variables include road construction properties, road cracking depth, rainfall duration/intensity and subgrade (ground) type. If the road surface is cracked and the cracks extend to the underlying road base, then any rainfall event will lead to infiltration into the road construction and underlying subgrade; if the subgrade is clay and the ground water level is low, it is likely that the ground water level will rise, softening the clay; if the subgrade is a soluble rock, it is possible that solution cavities could form; etc.
Rules with confidence levels were first created by domain experts in an agreed format and stored in text files; they were then automatically converted to a format recognisable by inference engines using a Python script. We used the rule inference engine Jess18 in our prototype (this can be interchanged with CLIPS19 or other rule engines) for rule implementation and reasoning. A full-length exemplar code and explanation of the implementation are given in Appendix D. Diagnostic rules were also automatically generated by inverting the predictive rules.

Data sources and spatial criteria for data retrieval
Informed by the ATU ontologies and rules, various infrastructure and contextual datasets were sourced from different owners (e.g. UK Met-Office, UK Department for Transport, British Geological Survey, utility companies) and integrated in the prototype system to provide instant location/time-specific data retrieval. The sourced datasets are mostly in (or converted to) the form of GIS tables and stored in a PostgreSQL database.
The Meteorological Data is sourced from the UK Met-Office.20 When a trigger is reported through the system user interface, the 30-day weather data (e.g. daily/hourly rainfall, maximum and minimum air/concrete/soil temperatures) up to the day of occurrence of the trigger is calculated from the data of nearby (less than 10 km away) weather stations. A historic flood outline map is also sourced from the UK Environment Agency to provide information on the flood risk around the site of the trigger. Weather forecast data may be added in the future.
The Road and Traffic Information is sourced from the UK Department for Transport21 and Ordnance Survey. For a reported trigger, its nearest road segment is first retrieved from the road network database to identify the corresponding traffic counting point for obtaining the historical traffic data, from which the weighted annual traffic on this road is calculated according to the wear factors of seven types of heavy vehicles (e.g. buses).22 The designed traffic loading of the road can be provided by external road datasets (e.g. the National Street Gazetteer) or added by users. By comparing the traffic volume on a road with the average volume in the locality, the importance of this road can be assessed, which is also an indicator of the effect that the trigger and the subsequent consequences or mitigation measures could have on the traffic flow. Past and future planned roadworks on this road are also retrieved (data from the Highways England website23) to help evaluate the vulnerability of the road system.
The Ground Conditions Data (e.g. ground water level, geological faults) is sourced from the British Geological Survey (BGS) and local councils. For example, the BGS 50K dataset24 provides geological information such as superficial and bedrock geology; the corrosivity dataset25 gives an indication as to whether the ground conditions below the topsoil are likely to cause corrosion of underground iron assets; and the SuDS26 dataset suggests the potential presence of geological and hydrogeological hazards that could be initiated or worsened by water infiltration into the ground. Brownfield information sourced from different local councils has also been added to the system.
Buried Utilities Data is sourced from different asset owners (e.g. United Utilities, National Grid, North West Electricity), vectorised and integrated in a back-end spatial database,27 in which multiple attributes of the buried assets are recorded (e.g. utility type, location, depth, material, diameter, pressure/voltage, year of installation, owner, operation status). Based on the location of a reported trigger, data on buried utilities in the general vicinity of this trigger (within a 200 m radius) is retrieved from the database ( Fig. 11 (a)).
Fig. 9. Workflow in the ATU-DSS (from left to right) ( Wei et al., 2018 ).
Nearby Services. Information on nearby services, such as hospitals, schools and banks, is also important for estimating the potential social and economic impact of a trigger. For example, people at schools and hospitals may be more vulnerable to harm and harder to evacuate in an emergency, such as a gas explosion due to damage to a gas pipe. The data on sensitive services around a trigger ( ≤ 2 km) is fetched from OpenStreetMap when a trigger is reported ( Fig. 11 (b)). To assess the potential impact on each service, their shortest driving distances to the trigger are calculated using the widely used A* path-planning algorithm.
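The shortest-distance computation can be sketched as a standard A* search; the toy road network and zero heuristic below are illustrative (in the prototype the graph and heuristic come from the sourced road data):

```python
import heapq

def a_star(graph, start, goal, h):
    """A* shortest path on a weighted graph {node: {neighbour: distance}};
    h(n) is an admissible heuristic, e.g. straight-line distance to goal."""
    open_set = [(h(start), 0.0, start)]
    best = {start: 0.0}
    while open_set:
        _, g, node = heapq.heappop(open_set)
        if node == goal:
            return g
        if g > best.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, w in graph[node].items():
            ng = g + w
            if ng < best.get(nbr, float("inf")):
                best[nbr] = ng
                heapq.heappush(open_set, (ng + h(nbr), ng, nbr))
    return float("inf")

# Toy network: trigger T, junction J, hospital H (edge weights in km)
roads = {"T": {"J": 1.0}, "J": {"T": 1.0, "H": 2.0}, "H": {"J": 2.0}}
dist = a_star(roads, "T", "H", h=lambda n: 0.0)
```

With a zero heuristic, A* degenerates to Dijkstra's algorithm; a straight-line distance heuristic speeds the search up on a real road network without changing the result.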
Mapping Data to ATU Ontology Concepts. The sourced data is mapped to corresponding ontology concepts based on a predefined correspondence table so that it can be used for inference together with the logical rules.
Although automatic methods exist for matching data and ontologies ( Munir & Anjum, 2018 ), in our case all the correspondences were manually defined by experts to guarantee their correctness. We carefully read the related documents of each dataset, especially the definitions, the meaning of each table column (or attribute, field), the data unit, and how the value in each cell was derived. Some examples of mapping are given below:
- In some cases, the name of a data table (or a column) is similar to an ontology concept, which gives a hint for finding the correct match. For example, the table column length in the OS Open Roads Dataset28 (table uk_road_network ) suggests the length of a road segment and can thus be mapped to the ontology concept RoadLength in the Road Ontology;
- In other cases, similar names may not indicate a correct correspondence. For example, there exists a geological dataset Depth_to_water_table29 and an ontology concept GroundWaterTableDepth in the Ground Ontology. The two names look similar, but the depth data cannot be mapped to the ontology concept because the data values are not real depths but categories in [1, 2, 3] (1: > 5 m below ground surface; 2: 3-5 m below ground surface; 3: < 3 m below ground surface);
- In some cases there is no similarity between the names, and correspondences can only be established after checking the definitions of the data and ontology concepts. For example, the table column Function in the OS Open Roads Dataset can be mapped to the ontology concept RoadType in the Road Ontology because both give information on road types such as A road, B road, etc. Another example concerns a geological dataset called GroundWaterLevels30 . Its documentation states that this dataset provides the depth to the groundwater level and that the unit of its table column Value is metres, which matches the definition of the ontology concept GroundWaterTableDepth; therefore, a correspondence is established between them.
All these correspondence relations are stored in a predefined correspondence table ( Table 5 ).

ATU-DSS User interface
For ease of use, a web-based user interface has been developed using Python Django, GeoServer and OpenLayers. The workflow of utilising ATU-DSS is shown in Fig. 9 and comprises four steps: first, users access the system from standard web browsers to report new triggers by providing the trigger type and properties; then, relevant contextual information is retrieved from local or online databases; after that, the retrieved data is both displayed on the user interface and fed into the rule engine to infer potential consequences with their associated uncertainty, severity levels and antecedents; finally, the system gives suggestions on how to obtain missing data and supports alternative assessment.
For example, assume that a member of the public observes a water pipe leakage and phones the local authority to report it.
In order to estimate the potential consequences of this trigger, the local authority first needs to report the leakage through the user interface:
- Reporting New Triggers. Users can either report a trigger by manually typing the information ( Fig. 10 ) or by uploading an XML file containing the trigger information. The second option allows ATU-DSS to be connected with existing information systems, such as the pothole-reporting systems used in many local councils, and to use external data sources as triggers to start the decision support process. When manually reporting a trigger, the trigger type can be selected from a type list (defined in the Trigger Ontology ), as shown in Fig. 10. The trigger severity level ( High, Medium, Low ), geographic location (e.g. postcode, GPS coordinates, a location pinpointed by users on the displayed map, or an uploaded spatial file, Fig. 10 ) and time should also be provided. Users can also upload multiple photos of a trigger at different times to help analyse/monitor the development of this trigger.
- Localised Data Retrieval and Automated Reasoning. The relevant localised contextual data of the reported trigger is then automatically retrieved based on its occurrence location and time using different spatial criteria. The retrieved/processed data is displayed on the user interface ( Fig. 11 ). The data provided by users (e.g. trigger information) and the data retrieved from the database are written as a fact file (e.g. GroundPrincipalType (Clay), EnvironmentRainfallDuration (Long) ), then fed into the rule engine for automated reasoning about potential consequences.
- Identification of Potential Consequences. Once the reasoning process finishes, potential consequences are identified from the inferred facts. Currently we consider four types of consequences in ATU-DSS: consequences for buried utilities (e.g. utility failure), for roads (e.g. road collapse), for the ground (e.g. ground collapse), and social/economic/environmental/legal consequences (e.g. traffic disruption, loss of business, loss of utility service, damage to property, injury, and loss of life). Each consequence has five attributes attached: uncertainty level, severity level, given facts { GF }, missing information { MF } and consequence type (e.g. ground, social/economic).31 The uncertainty level, given facts and missing facts of a consequence are propagated/accumulated by the uncertainty reasoning approach, whilst the severity level ( Negligible, Marginal, Critical, Catastrophic ) of a consequence is context dependent and defined in each logical rule. For example, if road deformation happens on a road, the severity of the potential traffic disruption is only marginal; but if a pipe burst happens on a road, the potential traffic disruption can be critical. Knowing the likelihood and severity of the potential consequences of a reported trigger can help users prioritise their tasks and take appropriate mitigation measures to reduce the potential risks, especially for those with higher likelihood and/or higher severity.
- Visualisation of the Potential Consequences. The uncertainty level of potential consequences can be shown to users as a confidence vector or using a textual representation. In this work, the textual representation is obtained by taking the first element in the ⟨VU, U, L, V⟩ vector with a
non-zero value, since we assume an order-of-magnitude semantics for the confidence vectors, i.e. the left-most component is more surprising than the next unless a huge number of rules were applied to obtain the first value ( Appendix A ). For example, for an inferred fact with a confidence vector ⟨0, 0, 2, 1⟩, since the 2 in the position of Likely is the first non-zero value, this fact is considered Likely to happen. To meet the different needs of users, two views have been designed in ATU-DSS for visualising the potential consequences. The first option displays the consequences in an impact matrix table ( Fig. 12 ) according to their severity levels/impact and uncertainty levels/likelihood. The number of missing facts (if any) used for deducing a certain consequence is also displayed after each consequence. As we make assumptions about all possible values of the missing facts, multiple instances of the same consequence (e.g. utility failure) can be derived from different sets of facts and with different likelihoods. To ease the analysis, several filtering boxes have been added on the right panel of the matrix view so that users can filter consequences by their categories (e.g. social consequences) or by combinations of different assumptions. When users hover the mouse over one consequence in the risk table, other duplicates of that consequence are also highlighted, and a tooltip appears showing the facts used to derive the hovered consequence ( Fig. 12 ).
The second option is a list view ( Fig. 13 ) in which consequences and their related parameters (e.g. confidence vector, likelihood, severity, number of missing facts, data used) are displayed in a table and can be sorted according to different attributes. It should be noted that, by keeping all confidence vectors the same length, the alphabetical sorting used in an HTML table is equivalent to the formula for confidence vector comparison defined in Eq. ( 20 ).
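Both the textual representation and the sorting equivalence can be sketched in a few lines; the level labels and the fixed-width padding (needed so that string order matches numeric order when a count exceeds one digit) are assumptions of this sketch:

```python
LEVELS = ["Very Unlikely", "Unlikely", "Likely", "Very Likely"]

def textual_level(cv):
    """Take the first non-zero element of a <VU, U, L, V> vector;
    an all-zero vector corresponds to Definite."""
    for label, count in zip(LEVELS, cv):
        if count:
            return label
    return "Definite"

def sort_key(cv, width=2):
    """Fixed-width string form so that alphabetical sorting matches
    the order-of-magnitude vector comparison of Eq. (20)."""
    return "".join(str(c).zfill(width) for c in cv)

label = textual_level((0, 0, 2, 1))
ordered = sorted([(0, 1, 0, 0), (0, 0, 2, 1)], key=sort_key)
```

The string keys compare exactly as the tuples do lexicographically, which is why a plain alphabetical sort in the HTML table suffices.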
- Reasoning Chain of Potential Consequences. Users can explore the details of each potential consequence by clicking on the links in the risk view or the list view. The system can explain each consequence as a sequence of text descriptions or as a network diagram of nodes and directed arcs. For example, the reasoning chain of a potential utility-support decrease caused by a reported pipe leakage is illustrated in Fig. 14, in which the colour of the ellipses indicates whether a fact is given, assumed or inferred, and the arrows indicate the reasoning flow together with the likelihood of each rule. By showing users the reasoning process behind a particular consequence, the system helps them make more reasonable and confident decisions. Users can also manually update (e.g. decrease) the confidence and severity levels of a consequence based on their expert opinion.
- Investigation Suggestions for Collecting and Updating Missing Data. As mentioned previously, where real data is missing in the reasoning process of a potential consequence, the system will suggest suitable investigation techniques to obtain the missing data based on the ATU Investigation Ontology ( Fig. 16 ). Users can decide whether to accept the assumed value of a missing fact or to carry out an investigation and update the data later ( Fig. 16 ). When new data is added or updated, the whole reasoning process automatically re-activates. We designed this function because complete datasets are rare in an information system; even where such datasets exist, they are subject to change, and observations from different people may vary. For example, the system may suggest that the ground type at a specific location is Sand, but a user may report the ground type as Clay based on his/her investigation. In ATU-DSS, the value of a fact can be updated regardless of whether it was retrieved from the database or provided by a user. The system records which user makes a modification so that the provenance of all data is preserved. The ability to see alternative results by modifying assumptions can help users better understand the impact of a trigger in different contexts.
- Updating Rule Likelihoods. Users can also modify the confidence level of a rule used in a reasoning chain. On the rule page, users can select the new rule likelihood from a drop-down list and add their comments or explanations. Once submitted, this rule will replace the original rule in this case study (i.e. for this specific trigger) and the whole reasoning process re-activates.32 The re-reasoning step is essential since the modified rule may also have been used for inferring other potential consequences. Users can see whether a rule has been modified in a scenario from the colour of the likelihood on a reasoning graph: the original ones are shown in blue while the modified ones are shown in red. The modification history of each rule is recorded and shown at the bottom of the page so users can see others' insights.

Computational complexity
We used the rule inference engine Jess for rule implementation and reasoning in the prototype. As Jess uses an improved Rete algorithm ( Forgy, 1982 ) for reasoning, the performance is largely independent of the number of rules/facts; but as we are also dealing with missing facts, the computational time of different scenarios depends on the number of missing facts in the antecedents of a rule and the possible values of each missing fact, as well as on similar patterns in the rules' LHS (left-hand side) in the knowledge base. Therefore, it is good practice to put the most specific patterns near the top of each rule's LHS. The system currently comprises 377 predictive rules; each rule has four antecedents on average, the most complex rule has 12 antecedents and the simplest has two. For the pipe-leaking scenario illustrated in Fig. 12, two facts were provided ( PipeLeaking (Active), PipeLeakingRate (Severe) ), two facts were retrieved ( RoadType (A class road), TrafficLoad (Active) ), and three missing facts were assumed with possible values: PipeDepth (Deep) (Shallow), RoadSlope (High) (Medium) (Slight), Subgrade (Clay) (Sand) (SolubleRock). In total, 235 facts were inferred from different combinations of facts and 65 were identified as potential consequences. The maximum depth of inference is 22 and the minimum depth is only one. The inference, data tidying and figure rendering took about 4 seconds on a laptop with an Intel 2.7-GHz processor.

Evaluation
In addition to testing the system with several real scenarios (e.g. the pipe leaking scenario presented in the previous sections), we have also evaluated users' acceptance of the system by demonstrating the ATU-DSS prototype to a wide range of potential users in two workshops and collecting their feedback. The first workshop was organised in September 2017 to assess the system framework and user interface design. The participants included an invited experienced utility manager, a utility surveyor trainer, and around twenty academics from different UK universities and institutions (with diverse backgrounds such as civil engineering, geotechnical engineering, geophysics and computer science). An overview of the decision support system (including the framework, data and underlying semantic technologies) was given at the beginning of the workshop, followed by a live demonstration with real data from a historic ground collapse in Manchester which caused major disruption. Feedback from participants was then gathered in a plenary discussion. The participants showed great interest and generally praised the effectiveness of the system, especially the rich data provided, the transparency of the reasoning module and the informative investigation suggestions. Several suggestions were received regarding the user interface design (e.g. colours in the matrix table, legend).
The second workshop was organised in November 2017 and attracted attendees from various backgrounds, e.g. local authorities, utility companies, survey companies, contractors (utility pipe lining/design), risk managers, sensor manufacturing companies, individual consultants and academics. The workshop followed the same procedure as the first one, except that feedback from participants was acquired via individual questionnaires as well as a plenary discussion. In the questionnaire (available at http://bit.ly/2oL7MAt, accessed 2020-02-23), participants were asked whether the prototype addressed key problems in their practice and whether it would fit into their current work. 18 questionnaires were collected; some sample responses from the participants are shown in Table 6. Among these, ten participants answered the question "Is this useful for you or somebody you can think of?" and suggested that the system could be a useful tool for different stakeholders, such as incident managers, survey company developers, constructors, asset owners and local authorities. Possible tasks included risk mitigation, and the prioritisation and justification of asset design and maintenance expenses and activities. The participants also pointed out that the system could be useful for the general public and could have potential benefits for training novice or junior staff in streetworks management. Eight people did not answer this question, but their responses to the questions "How is this related to your practice?" and "Does it address any specific challenges you are facing?"
indicated that the DSS was not directly relevant to them because they were from a company selling sensors, were academics, or in one case because "current practice is so far behind the presented ideas that it is hard to imagine it in use for xx years yet". Regarding the specific functions of the system, the users were particularly interested in the integrated data platform that brought various critical contextual datasets together. The participants suggested that the automated reasoning module was useful for determining the impact of an incident in a short period of time, identifying potential consequences of seemingly insignificant triggers and potentially reducing streetworks disruption. One participant was particularly pleased with the visualisation of the reasoning chain, which could help users better understand cause and effect in relation to the spatio-temporal correlation between utility and road problems.
As for future improvements of the system, participants suggested adding further data sources, such as bus routes, agriculture data and archaeological data. It was also recommended to develop a smartphone application for easier access to the system. We are encouraged by the number of new stakeholders who have expressed interest in the system since the workshops, and we are exploring further case studies (e.g. diagnosing leaking pipes, surveyor training, route planning for street excavations) under a recently started follow-on impact acceleration project.

Discussion
In this section, we discuss the key challenges we met and lessons we learned from building the above knowledge-driven decision support system for urban infrastructure inter-asset management, as well as the advantages and limitations of the proposed framework.

Distributed and integrated rule base development
In order to predict the future behaviour of infrastructure assets, in this work we captured the asset interdependencies using rules. However, it is not feasible to create an exhaustive rule base for urban infrastructure management, especially when working with few domain experts whose time is extremely valuable. In this work, we created a rule base covering two scenarios to demonstrate the applicability of the proposed reasoning and decision support framework. For larger-scale applications, rule bases can be created by different experts or organisations (e.g. industry) for different scenarios/applications in a distributed manner. Importantly, the ATU ontologies can provide a common language to facilitate this process. We discussed the problem of rule base inconsistency in Section 3.5 under the assumption that all rules were created with the same confidence levels. In practice, however, different experts may have different understandings of a probability phrase ( Wallsten & Budescu, 1995 ), and these phrases tend to change their meanings in different contexts. Future work is required to investigate how to combine/align the rules created by several domain experts (inter-variance) and the rules created by the same domain expert at different times (intra-variance) or in different contexts.
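As a minimal illustration of aligning experts' probability phrases, one simple option is to map each phrase to an ordinal rank and take a robust consensus. The phrase set, ranks and median rule below are assumptions for illustration, not the ATU-DSS confidence-level definitions:

```python
from statistics import median

# Hypothetical ordinal scale for qualitative likelihood phrases.
SCALE = {"rare": 0, "unlikely": 1, "possible": 2, "likely": 3, "certain": 4}
PHRASE = {v: k for k, v in SCALE.items()}

def align(phrases):
    """Combine the likelihood phrases several experts attached to the
    same rule by taking the median rank, a simple robust consensus."""
    ranks = sorted(SCALE[p] for p in phrases)
    return PHRASE[int(median(ranks))]

print(align(["likely", "possible", "likely"]))   # -> likely
```

A median is less sensitive than a mean to one expert using a phrase unusually strongly or weakly, which is exactly the inter-variance problem noted above.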

Transferability to other applications
In this work, we provided a framework for developing knowledge-driven DSSs based on ontologies and rules with qualitative confidences. This framework can be adapted for various engineering applications that require system-of-systems thinking and involve qualitative uncertainties. Since users can change data values and compare different alternatives with explanations, the framework can also be used to train less experienced engineers in complex decision making that requires multi-sector knowledge. In addition, the system can serve as an evidence base storing all reported triggers, the decisions/mitigations made by users, and the actual consequences, for sharing experience and lessons learned.

System maintenance: balancing rule-driven and data-driven approaches in the long term
For building a decision support system for infrastructure management, both historic and current data are important. As historic data on infrastructure behaviour is often unavailable or incomplete, it is practical to start by working with domain experts to identify the key data/knowledge and encode the essential process rules (i.e. cause and effect) for representative situations. It should be noted that these hand-crafted rules are not immutable; instead, they serve only as starting points for understanding a domain, especially how a domain expert would evaluate the risk or potential consequences of certain observations, such as the level of detail of the knowledge they use and the order of their inference. As more data gradually becomes available, especially in the context of the Internet of Things as more smart sensors are installed to monitor cities, the hand-crafted rules could be used to guide quantitative/logic rule learning and to validate the existing rules provided by domain experts. Another function our system provides is to let users specialise a rule by modifying its likelihood (or adding more conditions, or deleting the rule) in their applications, enabling case-based analysis. As more feedback (i.e. modifications and comments) is collected from different users, it will be used to update the original likelihood of a rule in the knowledge base or to further delineate the rule under different conditions.
Moreover, data is also important for the inference (or prediction) of potential consequences. As commented by the stakeholders (decision makers) who participated in the ATU-DSS workshops, bringing together different types of relevant data on one platform greatly simplifies the data collection procedure for them, which is particularly useful in emergency situations. However, in practice, the availability of relevant datasets is still limited, even though we have tried to include as many open datasets and privately licensed datasets as possible. For this reason, our system also provides an interface to other existing data warehouses via an API (e.g. the buried utility search service at https://www.linesearchbeforeudig.co.uk/, accessed 2020-02-23) and allows users to manually upload their previous project data or records (e.g. previous geophysical survey results, private borehole scans) into the system for reuse in future projects. The increase in data sources may also lead to another challenge related to data redundancy, i.e. the same asset information (e.g. buried utilities) being provided by multiple data sources. Further investigation is needed to decide whether to perform data fusion and assert one unified value into the inference engine ( Dou et al., 2016 ), or to resolve the data redundancy problem inside the inference engine. Furthermore, data can have different granularity and semantics, which may not accord with the definitions of ontology concepts or rules. Currently, the mapping between the datasets and ATU ontology concepts is performed manually to ensure quality, but in future work it would be useful to investigate how to automatically source relevant datasets from the internet and link them to the system ontologies to make the system more powerful, although data discrepancies may increase as a result.
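One simple way to fold collected user feedback back into a rule's likelihood is a majority-style update. This is a sketch under assumed names and an illustrative threshold; the actual update policy is left open above:

```python
from collections import Counter

def revise_likelihood(original, feedback, threshold=0.6):
    """Promote a user-suggested likelihood into the knowledge base only
    when a clear majority of the collected modifications agree on the
    same value. The 60% threshold is an illustrative assumption."""
    if not feedback:
        return original
    value, count = Counter(feedback).most_common(1)[0]
    return value if count / len(feedback) >= threshold else original

print(revise_likelihood("likely", ["possible", "possible", "likely"]))
```

Keeping the original likelihood until a consensus emerges prevents a single user's case-specific modification from silently changing the shared rule base.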

Conclusion
In this work we have presented a novel knowledge-based decision support system for integrated urban infrastructure inter-asset management. It provides a systematic way to handle interdependencies between different infrastructure assets and to model both the uncertainty and the incompleteness of data and knowledge. A web-based prototype has also been developed with several visualisation functions, so that users can easily access the system through a standard web browser to report new triggers, examine contextual data and interact with the system for alternative analyses of the potential consequences. Feedback from external domain experts suggested that the reasoning processes (rules) in ATU-DSS and the estimated consequences are appropriate for current practice. Users' feedback collected from the two workshops showed that the system is widely seen as easy to fit into current practice and helpful for quickly obtaining contextual data and inferring the potential consequences of triggers based on multi-sector knowledge.
In summary, the paper has made the following novel contributions: • We have presented the first decision support system that provides integrated, holistic decision support for streetworks, presenting the user with integrated data and a qualitative risk table; • We have formalised and developed an inference system for qualitative confidence levels, combined with an inference mechanism that allows assumptions to be made and tracked; • We have proposed a set of ontologies for the streetworks domain and related domains (in particular an environment and sensor ontology); • We have built a web-based system that has received positive user feedback.
As future work, we will continue to expand the current system by considering other scenarios included in the Trigger Ontology, such as potholes and different types of construction works. Since these scenarios share the same contextual information, such as weather and traffic, the major work in extending the current system is to develop the relevant rule sets for each scenario in collaboration with domain experts. More datasets may also need to be added if more factors are considered in the different rule sets. Qualitative temporal information will also be added to the knowledge base for more informative analysis. Since everybody shares responsibility for maintaining a sustainable street infrastructure system, the general public could also become involved as citizen sensors.
Slots are also reserved to store whether a fact is given or assumed, as well as the missing/assumed facts and the given/input facts used to infer it.
2) Implementing rules with confidence vectors in Jess. The rules developed by experts are converted to Jess format and facts are expressed using the defined fact template, for example a rule such as "Rule 1: PipeLeaking is Active ∧ PipeLeakingRate is Severe". 3) Asserting facts. To assert facts into a knowledge base, a given fact is initialised by including itself in the given-fact list, whilst a missing fact is initialised by adding itself to the missing-fact list. For example, a trigger "PipeLeaking is Active" and an assumed fact "PipeLeakingRate is Severe" are asserted in Jess. 4) Inference. The predefined fact template, functions and rules are stored in a rule file and loaded into Jess for reasoning. With new facts asserted, Jess applies all applicable rules to the asserted facts using the Rete algorithm ( Forgy, 1982 ); the confidence levels of rules and facts are propagated during this process.
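The original Jess snippets are not reproduced in this extract. As a rough illustration of the assert-and-run cycle with confidence propagation, the following minimal forward-chaining sketch in Python uses a naive fixpoint loop and a min-combination of confidences; both are simplifications of the actual Jess/Rete implementation, and the rule and fact names are taken from the pipe leaking example:

```python
# Minimal forward chaining with qualitative confidence propagation.
# A derived fact is taken to be no more certain than the least certain
# of its premises and its rule (a simplifying assumption).
LEVELS = ["unlikely", "possible", "likely", "certain"]

def weakest(*confs):
    return min(confs, key=LEVELS.index)

facts = {"PipeLeaking:Active": "certain",
         "PipeLeakingRate:Severe": "likely"}     # an assumed fact

rules = [  # (antecedents, consequent, rule confidence)
    ({"PipeLeaking:Active", "PipeLeakingRate:Severe"},
     "GroundErosion:Active", "likely"),
    ({"GroundErosion:Active"}, "RoadSubsidence:Active", "possible"),
]

changed = True
while changed:                  # naive fixpoint; Jess's Rete network
    changed = False             # avoids re-matching every rule each pass
    for ante, cons, rconf in rules:
        if ante <= facts.keys() and cons not in facts:
            facts[cons] = weakest(rconf, *(facts[a] for a in ante))
            changed = True

print(facts["RoadSubsidence:Active"])   # -> possible
```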

Fig. 1 .
Fig. 1. Building blocks of the ATU-DSS: given a trigger, an inference engine is able to predict the potential consequences by reasoning with uncertain data and rules developed based on a set of modular ontologies.

Fig. 3 .
Fig. 3. An example of the ontology concept hierarchy of pipe properties.

Fig. 4 .
Fig. 4. Triggers and their properties included in the Trigger Ontology.
How to combine the uncertainty of a fact inferred from parallel rules will be discussed in Section 3.4. Interpretation of Formula 1. As rule R1 defines the belief in b when a happens (P(b | a)), we can obtain the joint probability of b and a through rule R1 as: P(a, b) = P(a)P(b | a) (8). Note that we cannot derive P(b), since P(b | ¬a) is not defined in this rule.

c) and P(a | c) = 1; otherwise, we assume that different premises are independent of each other (e.g. P(a | c) = P(a)). Then, the joint probability can be written as: P(c, a; d) = P(c) · 1 · P(d | c, a), if a is an antecedent of c; P(c)P(a)P(d | c, a), if a is not an antecedent of c. (15)

Fig. 5 .
Fig. 5. A directed graph representing six predictive rules in a knowledge base (shaded circles denote given facts; white circles denote derived facts).

Interpretation of Formula 3. For the above example, let the conjunction of b and e be one event A with P(A) = P(b ∧ e) = P(e | b)P(b), and the conjunction of h and e be one event B with P(B) = P(h ∧ e) = P(e | h)P(h), where b and h are mutually exclusive.

Fig. 6 .
Fig. 6. A directed graph representing the predictive rules in a knowledge base. In terms of kappa-calculus, the surprise of e happening is based on the smaller surprise (kappa value) of A (e ∧ b) and B (e ∧ h), according to Eq. (3): κ(e) = min(κ(e, b), κ(e, h)).
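The min-combination of surprise values can be checked with a tiny sketch (the kappa values below are invented for illustration; in kappa-calculus, κ = 0 means unsurprising and larger integers mean more surprising):

```python
# kappa-calculus: an event derivable via parallel rules inherits the
# smallest (least surprising) kappa value among the derivations.
def kappa_or(*kappas):
    return min(kappas)

kappa_e_and_b = 2   # surprise of e via the rule from b (illustrative)
kappa_e_and_h = 1   # surprise of e via the rule from h (illustrative)

print(kappa_or(kappa_e_and_b, kappa_e_and_h))   # -> 1
```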

Fig. 7 .
Fig. 7. An example of city infrastructure assessment with diagnosis and predictive inference. Colour scheme: turquoise for input facts, white for inferred facts, and light blue for final conclusions. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 10 .
Fig. 10. User interface for reporting new triggers with a drop-down menu of trigger types (©OSM).

Fig. 11 .
Fig. 11. Snapshots of the user interface with a list of retrieved contextual data (background maps: (a) Google Satellite Image, (b) OpenStreetMap).

Fig. 12 .
Fig. 12. The impact matrix view for visualisation of the potential consequences of a reported pipe leakage. There are several filtering options on the user interface: (a) users can select different combinations of assumptions for the missing facts from the multi-checkboxes on the right panel, and the corresponding consequences will be shown in the impact matrix table (N.B. colours in the table are assigned based on generic heuristics); the facts used to infer a specific consequence are shown with a mouse-over effect in a tooltip; (b) users can also filter the consequences to be displayed by their categories (e.g. road, social/economic consequences); (c) users can click on a consequence and more details will be shown in a new page.

Fig. 13 .
Fig. 13. The list view for visualisation of the potential consequences of a given trigger. Consequences in the table can be sorted according to different attributes.

Fig. 14 .
Fig. 14. Example of a consequence reasoning chain. Colour scheme: turquoise for given facts, pink for assumed (missing) facts, white for intermediate facts and yellow for the final output. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 15 .
Fig. 15. User interface for displaying and modifying the likelihood of a rule.

Fig. 16 .
Fig. 16. Interface for suggesting the investigation techniques for obtaining a missing fact (e.g. Ground Cavity), where 0 = not considered applicable; 1 = limited use; 2 = used or could be used, but not the best approach or has limitations; 3 = excellent potential but not fully developed; 4 = generally considered an excellent approach, techniques well developed. There are also two buttons linking to two different pages for updating the data value.

Table 1
Number of processes and properties in ATU urban infrastructure asset ontologies.

Table 2
Definitions of confidence levels for an event E happening.

Table 4
Contradictory confidence levels.

Table 5
A predefined correspondence table between data and ontologies.

Table 6
Users' feedback about the ATU-DSS from a user workshop (sample responses).
Useful functions. Integrated data platform: "The pool of information on a single platform, from the output this seems limitless which is very exciting"; "A one step location for critical information"; "by having a system with the potential to show and provide all records and local specific information is a huge benefit". Help estimate potential consequences through automated reasoning: "Ability to determine the impact of an incident in a short period of time and consider vulnerability of all assets"; "Bring expert analysis across a consistent analysis / reasoned approach"; "Help to better visualise cause and effect in relation to understanding temporal spatial correlation between utility and road problems".
Is this useful for you or somebody you can think of? Stakeholders: "Incident manager"; "Senior Management"; "survey companies developers"; "The asset owner to prioritise their spend and its justification"; "definitely for our customers; power utilities / water utility, etc., and Highways England. Also for constructors. Safety!"; "Business decision makers, contract managers, consultant analysis specialists, CDM (Construction, Design and Management) managers". General public: "possibly accountable to member / public who sees a trigger"; "Should this be available to the general public Google disaster?"
How is this related to your practice? Incident and risk management: "The ability for risk management to support the delivery validation of the build, safety, environment, and all other factors". Installation design: "most likely use is for design; need to acquire knowledge of design decisions". Utility location, condition survey: "We are surveying practitioner, the more information we have available to use, the better judgement and decision we can make"; "We can provide additional local information. Use of EM/GPR survey technologies to get best outcome data in important, high risk scenarios". Education/training: "Useful for training".
Does it address any specific challenges you are facing? Yes: "Yes, by having a system with the potential to show and provide all records and local specific information is a huge benefit"; "Helps consultancy sales / training best practice decision making for customers. Leading to ROI decision making in equipment investment decisions"; "Provide different choices for surveying or monitoring approaches, and subsequent selection of techniques". Potentially yes: "Not at the present time. Current practice is so far behind the presented ideas that it is hard to imagine it in use for xx years yet."