Templates as a method for implementing data provenance in decision support systems.

Decision support systems are used as a method of promoting consistent guideline-based diagnosis supporting clinical reasoning at point of care. However, despite the availability of numerous commercial products, the wider acceptance of these systems has been hampered by concerns about diagnostic performance and a perceived lack of transparency in the process of generating clinical recommendations. This resonates with the Learning Health System paradigm that promotes data-driven medicine relying on routine data capture and transformation, which also stresses the need for trust in an evidence-based system. Data provenance is a way of automatically capturing the trace of a research task and its resulting data, thereby facilitating trust and the principles of reproducible research. While computational domains have started to embrace this technology through provenance-enabled execution middlewares, traditionally non-computational disciplines, such as medical research, that do not rely on a single software platform, are still struggling with its adoption. In order to address these issues, we introduce provenance templates - abstract provenance fragments representing meaningful domain actions. Templates can be used to generate a model-driven service interface for domain software tools to routinely capture the provenance of their data and tasks. This paper specifies the requirements for a Decision Support tool based on the Learning Health System, introduces the theoretical model for provenance templates and demonstrates the resulting architecture. Our methods were tested and validated on the provenance infrastructure for a Diagnostic Decision Support System that was developed as part of the EU FP7 TRANSFoRm project.

Vasa Curcin, Elliot Fairweather, Roxana Danger, Derek Corrigan

Introduction
The importance of data, its origins and quality, has long been recognised in clinical research. In recent years, we have also witnessed increased reliance of clinical practice on data, through routine data capture in Electronic Health Record systems, quality improvement initiatives at multiple levels, and growing adoption of evidence-based medicine.
The patient safety implications of diagnostic error in family practice are potentially severe for both patient and clinician [1]. Diagnostic decision support and other data-driven scenarios can then be associated with the routes that data takes through the LHS. The trust information associated with the data needs to be made available at each step of these use cases, to support auditability and transparency.
When applied to DSSs, this trust requirement translates to the ability to readily demonstrate the clinical reasoning that was performed in a clinical encounter, together with the recommendation received. In addition to supporting the auditability of the process, this capability also promotes transparency and traceability from the recommendation back to the rules applied to produce it. The data provenance community has been working on methods for ensuring reproducibility in scientific research, through the use of Semantic Web techniques and the W3C PROV standard [3], which are highly relevant to the challenges of decision support in the LHS environment. Computational provenance provides a uniform data-centered audit trail of what actually happened during some task, and we shall describe how these methods can be adapted to the needs of the LHS.
There are two main technical challenges to be addressed in applying data provenance to the decision support system scenario: firstly, how to have heterogeneous, distributed software agents (security systems, rule engines, and so on) construct unified, verifiable provenance traces, and secondly, how to formally guarantee that the resulting provenance traces will satisfy domain constraints, often expressed in ontologies, and user data requirements.
In order to address these issues, we introduce provenance templates, abstract provenance fragments representing meaningful domain actions that can be used to generate a model-driven service interface for domain software tools to routinely capture the provenance of their data and tasks. A template defines a provenance graph in a generic manner by means of variables such that it may be later instantiated and grafted onto pre-existing provenance graphs. Importantly, this paper introduces the idea that templates may describe subgraphs subject to bounded iteration in both serial and parallel manner.
The EU FP7 TRANSFoRm project [4] has developed a diagnostic decision support tool that promotes numerous state-of-the-art practices of good clinical decision support. These include precisely defined usability patterns, integration with an electronic health record (EHR), allowing for recommendations at the point of care as part of the clinician workflow, and a provenance backend that captures provenance data about the computational aspects of the diagnostic task.
The paper first introduces the concepts of the Learning Health System, data provenance and decision support systems in section 2, before presenting the requirements of the LHS-enabled DSS, novel provenance templates formalism and the associated provenance architecture in section 3. Section 4 demonstrates how the new model was used to construct DSS audit trails in TRANSFoRm and in section 5 we consider how our approach addresses the wider LHS requirements for trust in decision support systems, its impact with respect to some recent developments, and list related work. Section 6 offers conclusions and presents pointers for future research.

Background
We shall now review the Learning Health System paradigm and the data provenance technologies, and relate them to the challenges of clinical Decision Support Systems, presenting as an example the DSS developed as part of the TRANSFoRm project.

Learning Health System
The Learning Health System (LHS) movement aims to establish a next-generation healthcare system, "... one in which progress in science, informatics, and care culture align to generate new knowledge as an ongoing, natural by-product of the care experience, and seamlessly refine and deliver best practices for continuous improvement in health and health care." [5] Each participant in the LHS, be they clinician, patient, or researcher, acts as a consumer and a producer of knowledge, with the LHS providing: a) routine and secure aggregation of data from multiple sources, b) conversion of data to knowledge and c) dissemination of that knowledge, in actionable form, to everyone who can benefit from it [2]. Thus, the LHS creates routes for knowledge transfer between different parts of the health system, thereby increasing its research and learning capacity.
Different data-driven scenarios, such as decision support systems, clinical trial recruitment and management, epidemiological studies, all represent applications within the LHS, each associated with the movements and processing of data and knowledge. A number of LHS implementations have been developed at varying scales [4,6,7,8].
Attempts to define the core requirements of the Learning Health System [5] have highlighted concerns about a perceived lack of transparency and tracking in current systems when demonstrating how clinical reasoning was actually applied in any given clinical case. A fundamental feature of the LHS is the generation and curation of clinical evidence using electronic data sources.
Such a process is critically dependent on full transparency of how evidence is produced, maintained and consumed, as a means of generating trust in the underlying system. Trust in the evidence base leads to the acceptance of responsibility for the clinical recommendations made by it, which is essential if these tools are to gain widespread acceptance in the clinical community.

Data provenance
Put simply, data provenance describes what actually happened for some data entity to achieve its current form. The W3C standards body defines provenance as a form of contextual resource metadata that describes the entities and processes involved in producing and delivering or otherwise influencing that resource. Provenance provides a critical foundation for assessing authenticity, enabling trust, and allowing reproducibility. The Office of the National Coordinator (ONC) for Health IT describes it as attributes about the origin of health information at the time it is first created, which track the uses and permutations of that information over its lifecycle. The term data provenance is used to emphasise the focus on the data entities produced in these processes.
Data provenance provides traceability by automatically capturing the trace of the research task and resulting data in a uniform and domain-independent way, thereby facilitating reproducible research. The original concept comes from the eScience and cyber-infrastructure communities, where it was used for capturing the exact parameterisations and configurations of scientific workflows that produced a particular data set [9,10]. Although the original users of provenance data were the scientific programmers creating and maintaining research workflows, the increasing number of tools and technologies available has resulted in a wide array of stakeholders who can benefit from provenance information using visual front-end tools and interactive reports.

PROV model
The provenance technology, as defined in the W3C PROV standard [3], provides a common platform for automated capture of metadata about the data artifacts (e.g. databases, individual patient records, diagnostic recommendations), all processes that use or create those artifacts, and all actors that participate in those processes, such as clinicians, patients, researchers, or computer software. The resulting provenance data stores are typically semantically annotated databases, shared between all different software tools in some software system that can be mined for generating new knowledge, or investigated for audit purposes [11].
PROV is an interoperability standard, so there is no need for every system to use it as its core data model, or even to use a graph data model, but the W3C recommendation is for each provenance-enabled system to support import and export in the PROV format.
Nodes in a provenance graph come in three flavours: entities, which represent immutable states of some data for which one wants to provide a history; activities, which produce and consume such entities; and agents, associated in some capacity with either of the former. The edges of a graph represent various inter-relations between the node types, such as usage, generation, and association [3]. Validity of graphs is defined using a number of typing, ordering and impossibility constraints to be checked upon a normalised form of a graph, if one exists [12]. All nodes have a mandatory identifier given as a qualified name. A qualified name consists of an optional namespace followed by a local name, of the form ns:name. Identifiers belong to the prov namespace. Nodes and edges may be annotated with an optional dictionary of attribute-value pairs, each formed of a qualified name and a data value, which can be used to attach ontological annotations onto nodes, specifying their meaning in some domain. These features are shown in diagrammatic form using the standard PROV representation of entities as yellow ellipses, activities as blue rectangles and agents as orange pentagons; node annotations are shown as dashed grey boxes.
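To make this terminology concrete, the following minimal sketch uses the Python prov library to build a small graph with one agent, one activity and one entity, annotated with illustrative ontology concepts; the namespace URIs and identifiers are assumptions, not those used in TRANSFoRm.

```python
from prov.model import ProvDocument

doc = ProvDocument()
# Namespace URIs here are illustrative placeholders
doc.add_namespace('ex', 'http://example.org/dss#')
doc.add_namespace('onto', 'http://example.org/myOntology#')

# An agent (the clinician), an activity (a diagnosis task) and an entity
# (the resulting recommendation), each annotated with an ontological type
clinician = doc.agent('ex:drSmith', {'prov:type': 'onto:Clinician'})
diagnosis = doc.activity('ex:diagnosisTask1',
                         other_attributes={'prov:type': 'onto:DiagnosisTask'})
recommendation = doc.entity('ex:diagRec1',
                            {'prov:type': 'onto:DiagnosticRecommendation'})

# Edges: the recommendation was generated by the task, which was
# associated with the clinician
doc.wasGeneratedBy(recommendation, diagnosis)
doc.wasAssociatedWith(diagnosis, clinician)

print(doc.get_provn())  # serialise the graph in PROV-N notation
```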

Clinical Decision Support Systems
Decision support systems (DSS) have a long and sometimes controversial research history [13,14]. A clinical decision support system is defined as software that is designed to be a direct aid to clinical decision-making, in which the characteristics of an individual patient are matched to a computerised clinical knowledge base, and patient-specific assessments or recommendations are then presented to the clinician or the patient for a decision [15].
The exact nature of the patient-specific assessments or recommendations and the delivery mechanism used to present that information to the patient or clinician can vary greatly [16]. This has resulted in a number of different types of clinical decision support system that address particular clinical areas, ranging from computerised physician order entry and appropriate medicines management, via risk calculators, diagnostic aids, and triggered alerts and reminders to full electronic implementations of clinical guidelines.
However, the demonstrable efficacy of DSS in clinical practice has been limited. One reason is that the impact of implementing such systems has frequently been assessed as a technical driver of process change, whereas ideally they should demonstrate a measurable positive impact on practitioner performance that leads to directly attributable and measurable improvement in patient outcomes [17]. More promising results have been demonstrated in research environments outside the clinical area of diagnostics [18,19,20,21].
Traditional approaches to diagnostic decision support have lacked broad acceptance for a number of other well documented reasons: poor integration with EHRs and clinician workflow, static black-box rule-based evidence that lacks transparency and trust, and the use of proprietary technical standards that hinder wider interoperability [22,18,23,24,25]. Despite these problems there is an increasing recognition of the need to realise the potential value of implementing decision support systems more generally. This is reflected in their inclusion as important components of wider government ICT-based health policy legislation in practice [26,27].
The evolution of clinical decision support development reflects attempts to address workflow and integration issues and interoperability standards, as well as the separation of the knowledge base into a service distinct from the tools themselves [28]. The focus has largely been on implementations of what can be described as diagnostic symptom checkers, relying on a knowledge base defined as a series of rules in the form of a database of knowledge facts. These may be triggered or combined together in the form of guidelines based on statements using knowledge rule languages or rule engines such as Arden Syntax [29], GLIF [30] and GELLO [31]. These approaches have led to a recent shift towards model-based approaches to knowledge representation for the purposes of clinical decision support [32].

TRANSFoRm Decision Support System
The EU FP7 TRANSFoRm project (2010-2015) [4], working with 20 partners in 10 European countries, developed and evaluated a single unified international platform to support the main Learning Health System scenarios that combine research and clinical practice, and to reduce barriers to entry for using Electronic Health Record (EHR) systems and large medical data sources.
The project developed a next generation diagnostic decision support tool that addresses many of the issues highlighted as being essential for good clinical decision support [22]. These include integration with an electronic health record (EHR), allowing for recommendations at the point of care as part of the clinician workflow. An essential part, and the subject of this research paper, has been the support for the LHS concepts of transparent generation and use of evidence in this system. The evidence repository itself is manually curated or populated using a separately developed data mining module [34].
The DSS system used the provenance infrastructure that forms a core part of the TRANSFoRm middleware, together with the security framework and semantic interoperability modules.

Material and methods
We shall now look into how TRANSFoRm implemented the provenance infrastructure for its diagnostic decision support system. First, we shall present the requirements stemming from the context of the Learning Health System, and then present the theoretical framework for provenance template architecture.

Reproducibility requirements of a provenance-enabled decision support system
To inform our design for provenance templates as a means of implementing reproducibility in DSS, we now establish the reproducibility requirements for a provenance-enabled DSS, by placing them in the context of the key Learning Health System challenges [5]:
• An LHS that is trusted and valued by the public and all stakeholders.
Privacy, security, and transparency are key elements related to building public trust and generating value. Trust and confidence at all stages of the LHS operation are essential, from inputs to outputs (and outcomes). This implies the need for traceability - a continuous trail of data artifacts and operations on those artifacts, starting from the data creation (e.g. routine data capture or import from a data source), through the transformations (knowledge base processing, rule application), all the way to recommendations made by the DSS.
In the context of an adaptable system, how do we determine what adapts? How can a system adaptably ingest, manage, refine, and emit data from a rapidly growing source environment? What evidence must be gathered about the development, design, and operation of the system and about the environment in which it operates to enable certification?
The LHS software architectures need to provide a mechanism for such evidence to be routinely curated - gathered, organized, interpreted, and maintained.
• An LHS Capable of Engendering a Virtuous Cycle of Health Improvement. How do we develop ways to communicate the generated results, information, or knowledge to others who may wish to replicate (or build upon) the work done, as well as to the general public? How can the computational procedures employed in the system be documented in ways that are assuredly consistent, understandable, checkable, and repeatable, and how can the computational provenance of derived data be tracked from its points of production through consumption and use?
These features rely on permanent auditability of the system, with all necessary audit data being automatically generated from the provenance traces and the models. Based on these, we define the key reproducibility requirements that apply to decision support.
1. System transparency. The black box approach and lack of transparency result in a lack of trust and are cited as among the main reasons behind the poor take-up of clinical decision support systems [35]. Therefore, in a provenance-enabled DSS, activities related to the usage and generation of evidence need to be readily available for users to review.

2. Auditability of recommendations. Medical/legal liability concerns are considered a potential stumbling block for decision support systems [22], in that it is unclear who takes responsibility for the various elements in the DSS that could potentially go wrong. This relates to the auditability of the system, which must enable the user to look up a diagnostic recommendation and find all the relevant detail about how it was made - the evidence base used, the patient cues entered, and the software employed. The level of detail captured must be validated against the required report granularity.
3. Understandability of data. The data that is captured about the workings of the DSS needs to be not only accessible to the users (clinicians, auditors, researchers, patients) but also expressed using standardised concepts from terminologies the users are familiar with.
4. Validation readiness. In order to guarantee that the provenance metadata being captured is at the right level of granularity and encompasses all the necessary features, the structure of the provenance data needs to be modelled and verified separately from the software implementation.

5. Traceability of evidence. The evidence repository will evolve through the lifecycle of any recommendation software. It is imperative that the content of the repository is subject to an orderly release cycle and an associated quality assurance procedure, including an evidence curation process. This is to ensure that the exact versions of the knowledge bases used in each specific recommendation can be traced back and analysed if needed.
6. Reproducibility of recommendations. An underlying feature for a number of these characteristics is that the recommendations made can be reproduced from the captured provenance trace. The system must also be able to demonstrate that the patient data is never used contrary to some set of rules. Furthermore, the transformations and anonymisations performed on patient data need to be captured in order for the trace to be validated against privacy constraints.
An important feature of provenance support in the DSS is that it must do no harm and must not impede the normal running of the DSS. This requires seamless integration with no noticeable degradation in performance that would adversely affect the clinician in their daily routine. Furthermore, the system must be able to scale up in line with the expected usage volume, so the provenance store needs to be appropriately specified to cope with the accumulation of usage data over time.
These nine requirements were used to guide the design of our provenance solution. We shall now introduce the theory behind provenance templates.

Provenance templates
Data provenance originated in research communities that rely on uniform computational infrastructures, such as the life and earth sciences. The resulting techniques [36] are not directly applicable to LHS scenarios and decision support systems, due to the heterogeneity of the software systems involved and the need to ensure consistency of provenance graphs produced by different systems. To that end, we introduce provenance templates as abstractions that have domain meaning and can easily be mapped to the actions of the client software tools. The formalism described here is based on W3C PROV [3] as the current standard for representing provenance data as graph models.
However, the authors can see no barriers to generalising the approach to any graph-based provenance representation. Informally, a provenance template is an abstract provenance graph which may be instantiated to generate a concrete provenance graph, possibly connected to some existing graph structure. We refer to that instantiation process, together with the associated linkage and validation steps, as graph generation.
The template may contain fragments which are to be repeated, for example, a series of editing operations on some data, and it may specify the places where the generated graph will be grafted (attached) onto some existing graph.
A template, T, is a provenance graph with some reserved annotations. For example, a node act1 may be annotated with the type value Concept taken from the ontology myOntology. In this way the semantic type of the node is constrained, allowing us to assign clear domain meaning to the concepts in the templates.
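As a hedged illustration of the idea (the node names, namespace URIs and the use of the Python prov library are our assumptions; the var namespace holds template variables, as in the TRANSFoRm templates shown later), a minimal template fragment might be constructed as follows:

```python
from prov.model import ProvDocument

template = ProvDocument()
# Namespace URIs are placeholders; the 'var' namespace holds template variables
template.add_namespace('var', 'http://example.org/var#')
template.add_namespace('onto', 'http://example.org/myOntology#')

# Identifiers in the 'var' namespace are variables: they carry no concrete
# value until a substitution instantiates the template
user = template.agent('var:user', {'prov:type': 'onto:Clinician'})
act1 = template.activity('var:act1', other_attributes={'prov:type': 'onto:Concept'})
session = template.entity('var:session', {'prov:type': 'onto:Session'})

# The template fixes the shape of the graph to be generated
template.wasAssociatedWith(act1, user)
template.wasGeneratedBy(session, act1)

print(template.get_provn())   # abstract graph; instantiation happens later
```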

Series and parallel zones
An important requirement for our templates is to represent repetition in provenance graphs, often used to describe similar segments that are created by repeated instantiations of a template.
The concept of a repeated pattern in a template is represented using a zone Z, a connected subgraph that is to be iterated either in series or in parallel upon generation of the graph. The attributes of zones belong to the new zone namespace. Each zone has a unique identifier, zone:id and may optionally be assigned minimum and maximum bounds zone:min and zone:max, setting the minimum and maximum number of iterations allowed for the zone. min(Z) and max(Z) respectively denote the minimum and maximum bounds of the zone Z, if such values are defined.
A zone is defined by the set of template nodes which belong to it. A node may only belong to one zone. A node that belongs to a zone is denoted an internal node, Nι, and its identifier must be a variable. Each internal node of a zone is also annotated with the zone identifier using the pgt:zone attribute, and inherits the zone's type and bounds. In the figures below, for readability purposes, zones are represented as frames around associated internal nodes; in practice they are still PROV annotations, as described in section 2.2.1. A value variable is deemed to belong to a zone if it occurs in an annotation of an internal node of that zone. Let zvars(Z) denote the set of variables and value variables belonging to a given zone Z.
An external node, N , is any node of a template that does not belong to a zone. A fixed external node represents a constant node with a fixed value that is not instantiated further. Any external node of a template may also act as a graft node, annotated with the pgt:graft flag, serving as the point at which the template instance can be linked to another graph. A fixed graft node may share the identifier of a node from a pre-existing graph and similarly a variable graft node may be given an existing node identifier upon substitution. We write tvars(T ) to represent the set of variables and value variables belonging to external nodes of a template T .
Every edge of a graph has a unique identifier. If the edge is between two internal nodes, it is called an internal edge, while an edge between two external nodes of a template T is called an external edge. Edges that enter and exit the zone are called entry and exit edges, respectively. The entry and exit edges of a zone define the manner in which the subgraphs generated by zone iterations are connected to the instantiated external nodes of the template.
A zone may be iterated in parallel or in series, specified by the zone:type attribute that can take values of parallel or series respectively. Intuitively, a parallel iteration represents provenance derivations which may happen independently, where the entry and exit edges of the zone are duplicated to create forking and synchronising points respectively in the final graph, whereas a series type zone represents one which is repeated in sequential fashion and the entry and exit edges define the connection to an initial and terminal state.
A parallel zone must have at least one entry or exit edge in order to ensure graph connectedness upon generation. Series type zones have some additional notation and requirements. A recursive edge is a virtual edge of a template by which generated serial iterations of a zone are to be joined.
Each such edge defines a connection to be generated from the instantiation of an internal node in one iteration to the instantiation of an internal node in the following iteration. Such an edge is declared by annotating the exit node of the edge with the identifier of the entry node as the value of the pgt:rec entry attribute. The entry node must be another internal node belonging to the same zone. We write rec(Z) for the set of recursive edges of a zone Z. Each node given a value for the pgt:rec entry attribute must also be given a value for the pgt:rec type attribute, specifying the PROV type of the edge to be created. Each series type zone must have at least one recursive edge, to ensure that a graph generated from the template is connected.
A template is valid if it is a valid provenance graph as defined by [12], and if all recursive edges defined in the template also conform to the typing and impossibility constraints applied to normal graph edges.
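The following sketch shows one possible encoding of the zone annotations defined above, using the running example of repeated editing operations on some data. The attribute names follow the zone and pgt conventions in the text, while the node names, bounds and the exact attachment of the zone declaration are illustrative assumptions.

```python
# Illustrative annotation dictionaries for a series-type zone 'z1' that is
# iterated between 1 and 20 times; all identifiers are hypothetical.
zone_z1 = {'zone:id': 'z1', 'zone:type': 'series', 'zone:min': 1, 'zone:max': 20}

# Internal nodes carry variable identifiers and reference the zone via pgt:zone,
# inheriting the zone's type and bounds.
edit_activity = {                      # internal activity: one editing operation
    'pgt:zone': 'z1',
    'prov:type': 'onto:EditTask',
}
doc_version = {                        # internal entity: the document state produced
    'pgt:zone': 'z1',
    'prov:type': 'onto:DocumentVersion',
    # Recursive edge declaration: joins this node's instantiation in one
    # iteration to the entry node's instantiation in the next iteration,
    # here typed as a derivation between successive document versions.
    'pgt:rec entry': 'var:docVersion',
    'pgt:rec type': 'prov:wasDerivedFrom',
}
```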

Template generation
The generation of a particular instantiation of a provenance graph, G, from a template is specified by a substitution. A substitution S is defined as a mapping from a pair, comprising a qualified name and a non-negative integer representing the iteration number, to a PROV value. Note that no variables or value variables remain after a substitution has been applied. To encode the templates in a standard way, we extend the notation of PROV-N [38] by introducing a new predicate named sub and writing a substitution as a list of expressions of the form sub(qn, i, val), where qn is the qualified name of a variable or value variable, i is a non-negative integer and val the value to be substituted for that name in that particular iteration.
In order for a substitution to be valid, every variable or value variable has to have at least one value to be substituted, and if multiple instantiations of a zone are given, all variables and value variables belonging to that zone must be given a value to be substituted in each iteration, and these iterations must be numbered in increasing order. The total number of iterations to be made for a zone Z specified in a substitution S is written bound(Z, S) and must fall within any minimum or maximum bound constraints given for the zone, that is, min(Z) ≤ bound(Z, S) ≤ max(Z). Finally, for each variable v ∈ tvars(T), every value given for v in S must be a PROV identifier, which must not occur in any pre-existing graph unless the node to which v belongs has been labelled as a graft node. (Value variables may be substituted for any PROV value.) Template generation may proceed in two ways, either in a single step when given a complete substitution, or step-wise using incremental substitutions. Fig. 6 describes the generation of a graph G for a template T given a complete valid substitution S.
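As an illustration, a substitution can be held as a mapping from (qualified name, iteration number) pairs to PROV values and serialised in the extended PROV-N sub(qn, i, val) form described above; all identifiers below are hypothetical.

```python
# A hypothetical substitution: external variables receive a single value
# (iteration 0), zone variables receive one value per iteration.
substitution = {
    ('var:user', 0): 'ex:drSmith',
    ('var:session', 0): 'ex:session42',
    ('var:cueSet', 0): 'ex:cues-chestPain-1',
    ('var:diagRec', 0): 'ex:rec-angina',
    ('var:cueSet', 1): 'ex:cues-chestPain-2',
    ('var:diagRec', 1): 'ex:rec-mi',
}

# Serialise as the extended PROV-N sub(...) expressions described in the text
for (qn, i), val in sorted(substitution.items()):
    print(f"sub({qn}, {i}, {val})")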

Examples of Generated Graphs
To illustrate the generation process, consider the valid instantiation for template T1 that is shown in Fig. 7 alongside the corresponding generated graph G1. As previously noted, the template contains no zones and so all that occurs is the substitution of the external variables and value variables with the identifiers and values given by the instantiation. Now consider the instantiation and provenance graph shown in Fig. 8, generated from the template T2 with a parallel type zone given in Fig. 4.

Results
In order to demonstrate the applicability of the template-based data provenance architecture to providing the relevant audit trail for decision support systems, we have implemented such an architecture within the context of the TRANSFoRm project. The starting point for defining the provenance use cases was expressing their requirements as a set of basic provenance-related questions, describing the provenance information that we require to be automatically recorded and available through our decision support system:
1. Which decision support user was responsible for initiating a decision support tool session that resulted in a specific diagnostic recommendation being generated on a certain date?
This is a typical audit-style question that assigns responsibility for the diagnosis made. One could think of further provenance questions that could be asked about the operation of a decision support system, most notably around the provenance of the evidence base itself and the creation and management of rules therein; however, in the TRANSFoRm project, the evidence base was manually curated and thus not suitable for inclusion in our use cases.

Representing DSS concepts as PROV annotations
One strength of using PROV as the provenance representation language is that it allows for provenance nodes and edges to be annotated with key-value pairs. In order to precisely define the items that are being captured in provenance traces, we have assigned each node an ontological concept and a value, thus allowing provenance graphs to be queried using precise semantics. TRANSFoRm uses ontology classes that are also subclasses of PROV-O ontology [40] concepts, creating identifiers and text labels which are then used as PROV annotations on provenance template nodes, as shown in Table 1.

Clinical Decision Support Templates
Two use cases for the TRANSFoRm decision support system were defined and expressed in the form of provenance templates. The first describes the user logging into the system and getting authenticated by the security framework, while the second supports provenance collection during evidence consumption and subsequent clinical recommendation provided by the deployed evidence repository accessed by the decision support tool itself. Note that the two template instantiations are invoked by two different pieces of software in the TRANSFoRm system, the former by the security subsystem and the latter by the decision support tool itself.
In order to represent the semantic categories, each node in the template is further constrained by the ontological annotations described in section 4.1 and shown as PROV key-value attribute pairs in the grey boxes.
The template in Fig. 10 shows the task of a user logging into the decision support system and being authenticated. The template in Fig. 11 depicts the operation of the diagnostic decision support system. External nodes var:ehr and var:patient denote the Electronic Health Record system used and the patient presenting for diagnosis, respectively, while var:ceRepo and var:dss represent the clinical evidence repository used and the decision support system. The zone represents a single diagnosis task for the patient, of which it is assumed there will be several, with different sets of cues (var:cueSet) producing different diagnostic recommendations (var:diagRec). Patient symptoms are noted (var:collectCues) and used to generate a record of the patient visit var:patientVisit, which is used by the decision support system var:dssSys to make a comparison (var:evidenceComp) against the available clinical evidence in its knowledge base var:ceRepo, generating a matching set of rules var:matchSet and a diagnostic recommendation var:diagRec.
Note that the two templates overlap on the session entity, which is a graft node in the second template, so there needs to be one login provenance fragment for each diagnosis fragment. One example of the provenance data collected in the TRANSFoRm DSS system is shown in Figure 12, visualised in the Neo4j database.

Provenance template server architecture
The architecture of the system is illustrated in Figure 13. The overall structure is simple. The provenance server itself is accessible as a web service via an endpoint offering a RESTful API, with the data stored in a MySQL relational database with a D2RQ relational-to-RDF adapter [41], allowing querying via the SPARQL language, which is targeted at semantically annotated data. The data is also transferred into a Neo4j graph database using an Extract-Transform-Load (ETL) process, with an instance of the Neo4j web server allowing captured data to be queried and visualised by users via a browser-based interface using the Cypher query language [42].
The main subcomponent of the server is the template engine, which is responsible for validating substitutions for a given template and generating the subsequent graph. Provenance graphs are converted to and from the database format by a translator component. Since Neo4j employs a higher-level, graph-theoretic model, the ETL represents PROV node types as Neo4j node labels and PROV annotations as Neo4j properties.
As described in section 3.2, the current prototype implementation employs a graph model similar to that of Neo4j. Graphs are described in terms of vertices and edges which may be annotated by dictionaries of named data values. Templating, provenance and user-defined data are kept in separate namespaces. This design allows the engine to be agnostic as regards the provenance standard used by the server, whether that be PROV, the Open Provenance Model (OPM) or any other. However, the remaining components, the REST endpoint, provenance validator and graph translator are specific to a particular standard and must be implemented on a case-by-case basis along with adapters for the template engine.
The main use case, storing a graph generated from a template and an accompanying instantiation, proceeds as follows. Given a description of the provenance template and an instantiation for that template, both are serialised in JSON format and sent to the REST endpoint in a single API call to generate and store the data. After being received by the server, this description is deserialised and then passed to the template engine, where it is first validated in order to ensure that the substitution is valid for the template provided. If this succeeds then the graph synthesis component proceeds to generate the expanded graph following the algorithm described in Fig. 6.
This graph is then run through the provenance validator component, which checks the validity of the generated graph. If the graph is valid then it is passed to the translator, which commits it to the database. Note that if a template includes graft nodes then the provenance validator may need to query the database about existing graphs in order to assess the validity of the generated graph. The second use case, the storage of a complete graph, is a subcase of the first. A description of the graph is presented to the server directly via the endpoint, at which point it is immediately handed to the provenance validator component rather than to the template engine, and then stored if deemed valid.
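A client-side sketch of the single-step use case is given below; the endpoint URL and JSON field names are assumptions for illustration, and the actual TRANSFoRm API schema may differ.

```python
import json
import requests

# Hypothetical endpoint for single-step template generation
PROV_SERVER = "https://provenance.example.org/api/templates/generate"

payload = {
    # Serialised template document, loaded from a hypothetical local file
    "template": json.load(open("login_template.json")),
    # Substitution entries mirroring the sub(qn, i, val) notation
    "substitution": [
        {"name": "var:user", "iteration": 0, "value": "ex:drSmith"},
        {"name": "var:session", "iteration": 0, "value": "ex:session42"},
    ],
}

# Single API call: the server validates the substitution, generates the graph,
# validates it and commits it to the store
response = requests.post(PROV_SERVER, json=payload, timeout=5)
response.raise_for_status()
```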
As discussed in Section 3.3, it may also be desirable to generate graphs from a template in a step-wise rather than single-step fashion. In this situation, three API methods would be published. An initialisation method would first provide a template and a substitution for the external variables and value variables of the template. Following this, one or more calls to a zone iteration method would then be made for each zone within the template, providing substitution data for the instantiation of a new iteration of the zone.
A finalisation method would then signal that a template was considered complete, which would trigger the completion of series type zones and the validation of the instantiation and of the generated graph. Intermediate graph states would be committed as temporary graph fragments in the database and either committed fully or rolled back following final validation.
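A corresponding client-side sketch of this step-wise mode is shown below; the three endpoints and payload shapes are hypothetical, mirroring the initialisation, zone iteration and finalisation methods just described.

```python
import requests

BASE = "https://provenance.example.org/api"   # hypothetical base URL

# 1. Initialise: provide the template and substitutions for external variables
r = requests.post(f"{BASE}/generation", json={
    "template": "diagnosis_template",
    "substitution": [{"name": "var:user", "iteration": 0, "value": "ex:drSmith"}],
})
gen_id = r.json()["id"]

# 2. Iterate: one call per zone iteration, with that iteration's substitutions
requests.post(f"{BASE}/generation/{gen_id}/zones/z1/iterations", json=[
    {"name": "var:cueSet", "value": "ex:cues-chestPain-1"},
    {"name": "var:diagRec", "value": "ex:rec-angina"},
])

# 3. Finalise: completes series zones, validates and commits the generated graph
requests.post(f"{BASE}/generation/{gen_id}/finalise")
```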

Provenance data collected
The TRANSFoRm diagnostic decision support system was evaluated using a high-fidelity simulation of the clinical consultation, using a real EHR system in a simulated clinic environment. The evaluation employed 34 real physicians and a series of actors re-enacting real patient scenarios around three presenting problems (chest pain, shortness of breath and dyspnea), resulting in a total of 408 patient encounters. Each clinician would log into the system once per day, and produce one diagnostic template instantiation with 3-4 zone repetitions per encounter. In a real-world clinic with 8 general practitioners, each seeing 40 patients per day, which is standard for an inner London practice, this would translate into around 1000 zone instantiations per day, giving a sense of the data size and velocity involved in a real-life environment.
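A back-of-envelope calculation of this estimate, with the figures taken from the text and 3-4 zone repetitions approximated as 3.5:

```python
# Rough daily provenance volume for a typical practice
gps = 8                      # general practitioners in the practice
patients_per_gp = 40         # patients seen per GP per day
zones_per_encounter = 3.5    # 3-4 zone repetitions per diagnostic encounter

encounters_per_day = gps * patients_per_gp               # 320 encounters
zone_instantiations = encounters_per_day * zones_per_encounter
print(zone_instantiations)   # ~1120, i.e. around 1000 zone instantiations per day
```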

Queries developed
The traditional way of querying provenance data uses SPARQL as a semantically enabled query language. In our architecture, the provenance data store accepts SPARQL queries and returns answers. However, we are also interested in interactive query capabilities, for which the Neo4j database front-end has suitable tooling via its Cypher query language [42] and browsing features. In this type of query, the reply serves as a starting point for interactive graph exploration, which is not feasible in SPARQL. Thus, in order to demonstrate the viability of using provenance graphs as the audit trail of the decision support tools, we have mapped our initial set of questions onto queries in both SPARQL and the Cypher graph query language, making use of the ontologies we developed. Note that for readability, Cypher queries use human-readable labels derived from ontological categories.
Query 1: Which decision support user was responsible for initiating a decision support tool session that resulted in a specific diagnostic recommendation being generated on a certain date?
As can be seen from the structure of the implemented queries, the SPARQL queries operating on the RDF representation are, in effect, recreating the structure of the instantiated template to ask questions. Cypher queries, meanwhile, can make use of generic graph connectivity queries through the -[*]-> construct, which, while computationally expensive, provides for more expressive query construction. Further navigation and querying from the original result is also simpler and faster in Neo4j, which supports the exploratory investigation of provenance traces. Broadly speaking, the strength of Cypher is in processing queries once the entry point has been found [43], while SPARQL running on RDF representations is better at aggregated queries that need to traverse the entirety of the database. However, with improved indexing capabilities in Neo4j v3, this may be subject to change, and we are planning to do a full comparison on a larger, simulated data set as part of future work.
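For illustration, a Cypher formulation of Query 1 in the spirit described above might look as follows when run through the Neo4j Python driver; the node labels, relationship types and property names are assumptions modelled on the ontological annotations, not the exact TRANSFoRm schema.

```python
from neo4j import GraphDatabase

# Illustrative Cypher query: find the user whose login session eventually led,
# via any path of provenance edges (-[*]->), to a given recommendation
query = """
MATCH (user:DSSUser)-[:WAS_ASSOCIATED_WITH]-(login:LoginTask)
      -[:GENERATED]->(session:Session)
MATCH (session)-[*]->(rec:DiagnosticRecommendation {id: $recId})
WHERE rec.generatedAt STARTS WITH $date
RETURN user.id AS responsibleUser
"""

# Connection details are placeholders
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(query, recId="ex:rec-angina", date="2015-03-01"):
        print(record["responsibleUser"])
```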

Discussion
In section 3, nine requirements were defined for achieving reproducibility in decision support systems, which we now revisit to demonstrate how our solution addresses them:
1. System transparency. The provenance trace, in any of its forms, provides insight into the workings of the decision support system, with the granularity defined by the ontologies used, allowing varying levels of detail.
2. Auditability of recommendations. To ensure recommendations made are auditable, the system must guarantee that the required subset of information is present in the provenance traces. Templates provide exactly this functionality, by specifying the metadata that will be captured.
8. Privacy and security.
The provenance data stored contains the unbroken chain of actions that transformed various pieces of data in the system. If this includes confidential information that should not be presented to the system users, there are several techniques for abstracting parts of the provenance data according to predefined security policies [44,45,46].
9. Usability and scalability. The provenance system is hidden away from the decision support system users, so by implementing light-weight components and asynchronous REST calls, we ensure that its presence does not impede the clinical consultation process by introducing delays. With regard to scalability, the use of templates allows us to precisely determine the volume of the provenance data collected and the velocity at which it accumulates, allowing system administrators to implement appropriate storage policies.
By addressing these, we believe that our approach can be used as a basis for integrating trust into computerised decision support systems. The SPARQL and Cypher interfaces allow system developers to quickly query the accrued provenance data, but in order to expose this information to a broader range of end-users, such as institutional and external auditors, commissioning groups, and even clinicians, more visual tooling is necessary.
TRANSFoRm implemented two prototype front-end tools: a web interface containing several representative queries, such as the ones above, hiding away the complexity and allowing the user to enter only the relevant parameters and obtain the results in graph form; and a set of interactive reports containing tabular and chart information, implemented in the Eclipse Business Intelligence Reporting Toolkit (BIRT). This approach is currently being explored further by the authors in a follow-up industrial collaboration that will develop end-user query tools.
Historically, a major challenge for the adoption of decision support systems has been the lack of transparency in the recommendations and in the rules that those recommendations have been based on. In order to evaluate whether provenance technologies can make a tangible difference in clinical practice, and to determine whether this information can benefit clinicians directly, provenance elements need to be embedded into a DSS user interface, and a full usability study, e.g. using the Technology Acceptance Model, will be necessary.
Thus, evaluation of provenance technologies should encompass both the computational and other operational costs involved in running the tools, and the effort needed to design the necessary models and queries should also be measured. Another issue of note is the quality of the collected provenance data, its completeness and accuracy, which should be improved by the use of provenance templates. In order to evaluate this, a separate method of data collection should be specified for each provenance question, and the two compared against each other. These and other software engineering challenges related to provenance are being addressed through the work on the PRIME methodology [47]. The authors are applying and extending this work within the LHS-Stroke secondary stroke prevention project that is part of the CLAHRC South London programme.
While this paper has focused on diagnostic decision support systems, because this was the DSS use case in the TRANSFoRm project, identical issues arise in other types of DSS, such as patient management tools and higher-level reporting tools such as management portals used by commissioners, insurers, and health administrators. Indeed, some of these have been found to have more impact on patient care than diagnostic systems [21].

Recent developments
The practical need for research into the area of computable provenance for diagnostic decision support has been starkly highlighted by recently reported events in UK general practice surrounding incorrect recommendations made by a decision support system [48]. The QRISK2 score is a validated, accepted and widely used decision support aid for predicting cardiovascular risk in patients in the UK, which is integrated with EHR systems via a parameterised programmatic interface allowing triggering of rules from within the EHR. One particular implementation of the QRISK2 score within a widely used general practice EHR system was found to be overstating the risk for some patients, prompting an effort to identify GP practices, to notify potentially impacted patients and to re-examine their cardiovascular risk. The provenance work described in this research is highly applicable in such a logistically difficult scenario. This damage limitation is a classic example of provenance taint analysis as described in section 4, and in such a scenario our system could be used to identify:
• Potential GP practices where the QRISK2 score has been used to give diagnostic recommendations (similar to Query 1, looking at the practice instead of a user).
• Diagnostic recommendations actually made for identifiable patients from identifiable EHR systems that actually used the QRISK2 tool (similar to Query 5).
• The actual patient cue sets that were submitted to the QRISK2 tool interface to make the diagnostic recommendations (Query 3).
• The returned diagnostic recommendation risk score result for each individual patient involved (similar to Query 5, including the risk score in the template model).
Due to this unfortunate incident, we expect the topics of trust and auditability of decision support systems to come even more to the forefront.

Tracing evidence evolution
In more generic decision support systems that rely on a collection of rules that are combined to produce recommendations, questions about the lineage of the rules used to produce a recommendation become important: What data sets and algorithms were used in rule production? How was the rule validation performed? We considered these questions during the TRANSFoRm project, and produced example templates that would satisfy them, but they were not implemented at this stage due to the data mining framework remaining separate from the rest of the system. In this extension to the model, each addition, change, or deletion of a rule in the evidence base is tracked, whether the rule has been manually modified or is the result of an automated data mining algorithm.

Related work
The prototype of template-based provenance was introduced in its early form in [49,50] and successfully demonstrated the feasibility of using provenance templates to support a large, heterogeneous software infrastructure, albeit missing the theoretical foundation and full architecture presented here.
A separate effort at Southampton [51] is currently looking into lower-level provenance templates that abstract individual PROV variables, rather than larger graph fragments. The instantiation then proceeds by performing cross-products of all variable value spaces, as restricted by constraints. The concept of substructures in provenance graphs has also been researched in the context of SPARQL queries for RDF provenance repositories and generic graph fragment queries that use a graph motif with a set of constraints on that motif [52]. Related efforts have also been made in the area of graph summarisation [53], which use the graph structure as a basis for summarising and compressing relational knowledge by detecting patterns and compressing structural knowledge encoded within relational graphs, including repetitive or sequential structures. Finally, a body of work exists in abstracting provenance graphs for security purposes. The ZOOM system uses the concept of user views to abstract nodes that are not of interest to the consumer [54]. The TACLP [44], ProvAbs [45], and ProPub [46] approaches use security policy definitions to determine which components of the provenance graph to include and which to abstract.

Conclusions and future work
A defining characteristic of the Learning Health System is the trust that must be placed in every aspect of the system [5]. The participants in the LHS must be able to gain insight into its workings if they are to put faith in its actions and entrust it with their data. Furthermore, the system must possess introspective qualities in order to be able to learn about itself and continuously improve, engendering a Virtuous Cycle of Health Improvement. This implies capabilities for data and knowledge sharing between the research and clinical actors, under clear and automatically enforced privacy and security rules. A semantically clear and unambiguous provenance trace provides a mechanism for such sharing.
This paper has looked into the reproducibility challenges facing decision support systems, guided by the LHS paradigm, and proposed a solution based on data provenance technologies and abstract provenance template constructs. The semantic complexity of the medical domain was modelled using ontologies annotated onto provenance graphs, and the software architecture used the templates to facilitate provenance capture from the decision support tool. The work was originally prototyped in the diagnostic decision support system developed within the TRANSFoRm project, where it was used to capture data from over four hundred simulated diagnostic patient encounters, and key analytical queries on that data were shown.
Ultimately, this work contributes to the efforts in integrating trust into computerised decision support systems, enabling transparency and auditability by creating a basis for implementing validation mechanisms. The complexity of decision support systems offers numerous opportunities for problems to arise, from quality of data capture and accuracy of EHR interactions via usability issues to algorithmic errors in rule design. Thus, their increased use puts more and more focus on the techniques for ensuring correctness of the tasks involved. Data provenance offers the mechanism to achieve this, and through use of provenance templates, we have shown how such infrastructure can be implemented in the context of decision support systems.
In the era of Big Data, deep learning systems such as IBM Watson, and other technologies that often rely on black-box analytical environments, it is of paramount importance to support transparency in computerised systems whose actions may have direct consequences on human lives. A particularly dangerous assumption of some Big Data evangelists is that, with sufficiently large data, correlation can replace causality in our analytical models. While this may be perfectly fine for market analysis questions, such as investigating customer churn or supermarket shopping baskets, medical research in particular depends on a full understanding of finer points such as bias, data quality, and statistical significance to derive its conclusions. Thankfully, there is an increasing understanding of the fact that we require not just intelligent machines but intelligible machines [55]. Rather than avoiding Big Data technologies, we need to understand which aspects of them are well suited to medical research, and then build research software frameworks that support transparency, auditability, replicability and reproducibility [56].
The version of the DSS provenance infrastructure employed in TRANSFoRm is currently being updated for use by further projects, such as the DSS for prevention of secondary stroke in patients in South London. The tool, developed as part of the CLAHRC South London programme, has been designed by the team at King's College London with key stakeholders including clinicians, patients, and commissioners, and the provenance module will provide auditability and traceability of decisions made with the tool.
With the recent changes in scalability support in Neo4J, we intend to do away with the relational/RDF store and use Neo4J as our main database storage, providing a SPARQL query front end for backwards compatibility.
Furthermore, we plan to add more templates that cover non-diagnostic decision support scenarios and implement some more advanced PROV concepts such as hyperedges representing relations between more than two nodes.

Acknowledgements
The work presented here has received funding from EPSRC under grant