Verification of User Interface Software: The Example of Use-Related Safety Requirements and Programmable Medical Devices

One part of demonstrating that a device is acceptably safe, often required by regulatory standards, is to show that it satisfies a set of requirements known to mitigate hazards. This paper is concerned with how to demonstrate that a user interface software design is compliant with use-related safety requirements. A methodology is presented based on the use of formal methods technologies to provide guidance to developers about addressing three key verification challenges: 1) how to validate a model, and show that it is a faithful representation of the device; 2) how to formalize requirements given in natural language, and demonstrate the benefits of the formalization process; and 3) how to prove requirements of a model using readily available formal verification tools. A model of a commercial device is used throughout the paper to demonstrate the methodology. A representative set of requirements are considered. They are based on US Food and Drug Administration (FDA) draft documentation for programmable medical devices, and on best practice in user interface design illustrated in relevant international standards. The methodology aims to demonstrate how to achieve the FDA's agenda of using formal methods to support the approval process for medical devices.


I. INTRODUCTION
Design anomalies in user interface software are an important concern in safety-critical application domains, including Aviation, Power Generation and Medicine.In these domains, manufacturers are mandated by regulators to ensure that their system or device has been developed using best practice, and that risks associated with the use of their products are minimal or "as low as reasonably practicable".This must be done before the system or device can be deployed in a safety critical context.Such a demonstration usually includes "proof" that the device satisfies a set of safety requirements designed to mitigate identified hazards (see [1]).The FDA has produced one such set of requirements in draft documentation [2] focusing on programmable medical devices, particularly infusion pumps.
This paper is concerned with how to demonstrate that a software design is compliant with use-related requirements.FDA guidelines propose that such a demonstration is "highly dependent upon comprehensive software testing, inspections, analyses and other verification tasks performed at each stage of the software development cycle" [3].The data produced for such analysis is usually substantial and does not prove the absence of compliance breaches.
Formal techniques provide additional information.They are concise, precise and exhaustive, and can be applied before a complete implementation is available.Using formal techniques for the analysis of a system design involves two main steps.The first step is to develop a formal (i.e., mathematical) model of the device that captures relevant characteristics and functionalities of the system.The second step is to use mechanized tools to perform a systematic analysis of the developed model, to check that the behaviors described in the model comply with relevant requirements.If the model correctly represents the real system, then requirements proved of the model apply also to the real system.Formal techniques provide information when the analysis of a requirement fails.This information, typically captured in a counter-example, identifies a precise scenario in which design aspects of the device can violate the constraints imposed by the requirements being analyzed.Developers can use counter-examples constructively, to explore the significance of the failure, and decide whether redesign is necessary or other factors or processes can be taken into account to mitigate the identified failure.
Three types of formal techniques are used in this work to analyze the user interface software of an interactive system: model checking, theorem proving, and simulation.
• Model checking is used to validate models and analyze the interface mode behavior of the device against relevant safety requirements.This verification technology focuses on state transition systems.By exploring the state space of the transition system exhaustively, developers can check automatically whether a requirement is true of the model.The ingenuity in model checking is to build abstract models whose complexity can be handled by the analysis tools, and to formulate the requirements appropriately.• Theorem proving is used for the analysis of detailed models of the device that cannot be analyzed efficiently using model checking.The verification technology is based on natural deduction, and is concerned with resolving logic problems by mechanized application of inference rules.Arbitrarily complex models and requirements can be handled.However, proof of requirements is usually not fully automatic, and guidance from the analyst may be required.• Simulation is used throughout the analysis process to validate a developed model against the real system.The specific type of simulation technology adopted in the present work involves the use of executable formal models and prototyping.The detailed formal models guide the behavior of interactive prototypes whose features and characteristics closely resemble those of the real system.These prototypes can be used by domain experts to check whether the formal model mimics the behavior of the real device, and by human factors specialists to discuss userelated requirements with engineers and formal methods experts.Formal methods experts can demonstrate results obtained using model checking or theorem proving in a way that is accessible to the other team members.Finally, engineers can explore alternative design solutions at reduced cost.This process is fundamental in human-centered design processes carried out by a multidisciplinary team of developers [4].
To date, little or no guidance has been produced to demonstrate how to use formal methods technologies for the analysis of user interface software.As a result, formal verification of usability aspects have seen slow take-up in industry.We address this challenge, providing software developers with a demonstration of how formal techniques can be used to gain high assurance that user interface software design is compliant with given requirements, while also considering how software engineers, domain experts and human factors experts can collaborate to ensure that the manner in which the requirements are satisfied will mitigate use-error.
Contribution.We present a methodology that can be used as guidance by developers when applying formal methods technologies to the analysis of user interface software against use-related requirements.The paper describes the steps taken to demonstrate that the requirements are satisfied.This includes demonstrating that: (i) the model correctly describes the interactive behavior of the device; (ii) requirements given in natural language are correctly translated into formal properties of the model; (iii) the model satisfies the formal properties and (iv) the consequence of failure, if there is failure, is considered leading either to redesign of the device, modification of the property, or changes to the overall system to mitigate against the failure.A case study based on a commercial infusion pump is used to illustrate the methodology.
We progress an agenda suggested by the FDA's use of formal techniques [5].However, whether the analysis is feasible in the organizations that develop medical devices is an important consideration, and this challenge is part of the paper's discussion.
Organization.Section II sets the work described in the paper in context, discussing related work and scoping what is achieved.Section III describes the methodology adopted for the formal analysis of user interface software design.Section IV illustrates the user interface features of the medical device.Section V illustrates the two forms of the model that were developed.Section VI discusses how the models were validated against the actual device.Section VII describes the requirements and indicates how theorems representing the requirements were proved.Finally, Section VIII provides discussion and conclusions.

II. RELATED WORK
Other work shares similar concerns with this paper about how to verify use-related requirements.Bolton et al. [6] use models of user tasks as a basis for generating properties for verification, while in [7] a set of patterns that embody usability principles are proposed.Bolton and Bass [8], [9] used SAL [10] to analyze related properties of the Baxter iPump.They translate formal representations of normative tasks into properties that can be checked on a model of the pump user interface.They also analyze the normative tasks systematically to identify potentially erroneous sequences.Example tasks include: turning the pump on and off, stopping the infusion, and entering a volume to be infused.Their papers include detailed discussion of the challenges faced when modeling and analyzing a realistic system model with a model checker.Mori and others [11] and Fields [12] also represent tasks in a formal language and use a model checker to analyze tasks.Berstel and others [13] use a formal notation to model and analyze WIMP style interfaces.
Bowen and Reeves have produced some initial proposals for design patterns for modeling user interfaces [14], for example the callback pattern, representing the behavior of confirmation dialogs used to confirm user operations and the binary choice pattern, representing the behavior of input dialogs used to acquire data from the user.These patterns address use-related design issues.Their utility is orthogonal to our approach, as they focus on best modeling practice for user interface software models, rather than verification of given safety requirements.
Most previous research, relating to the formal verification of safety requirements has been devoted to the analysis of the control part of a system, rather than to the human-machine interface.For example, a set of FDA safety requirements was also formalized in [15] using the UPPAAL [16] model checker.Their analysis focuses on design aspects of the controller of the pump rather than user interface design and use-related requirements.Li and others [17] have developed verification patterns that can be used for the analysis of safety interlock mechanisms in interoperable medical devices.Although they use the patterns to analyze use-related properties such as "When the laser scalpel emits laser, the patient's trachea oxygen level must not exceed a threshold Θ O2 ", the aim is the integration of a model checker in the actual implementation of the safety interlock as a runtime fault prevention mechanism, rather than the analysis of use-related aspects of the safety interlock.This and other similar research activities, e.g., [18]- [20], are not concerned with the analysis of use-related requirements.Proving requirements with similar characteristics to those described in this paper (though not explicitly userelated) has been the focus of a mature set of tools developed by Heitmeyer's team using SCR [21].Their approach uses a tabular notation to describe requirements which makes the technique relatively acceptable to developers.
Combining simulation with model checking, as discussed in Section III when validating the model, has also been a focus in, for example, [22]- [24].Recent work concerned with simulations of PVS specifications has been used to support the specific modeling process described in this paper with simulation [25].
While the present paper takes an existing device as its starting point, this must be seen in the context of work that uses formal specifications as part of the design process.Tools such as Event B [26] have been developed with such a goal in mind.With Event B, an initial model is first developed that specifies the device characteristics and incorporates the safety requirements.This model is gradually refined using details about how specific functionalities are implemented.Bowen and Reeves [27] have presented a similar refinement approach for user interface design.Their work specifically targets the design of user interface layouts.Presentation Interaction Models (PIMs) are used to describe the user interface layout in terms of its component widgets.A mixture of formal and informal refinement is used to transform the initial specification into its implementation.

III. MODELING AND ANALYSIS APPROACH
The process of proving regulatory requirements described in this paper involves a sequence of steps.These are as follows.
1) Developing a model of the user interface.This model captures the states of the device's user interface, the available user and internal actions, and how these change the device's state.These models are expressed as state machines.The concrete choice of modeling language depends on the type of formal user interface analysis to be carried out.2) Validating the model.This is done using both model checking and simulation.The validation process involves generating witnesses for properties p that should be true of the device.This is done by creating invalid assertions in the form always not p, and using the model checker to find counter-examples for these invalid assertions.The generated counter-examples are witnesses for property p, in the sense that they identify sequences of actions that satisfy p.These sequences can be compared with logs from the actual device, to assess the plausibility of the identified actions and device behaviors.This technique is similar to that used in [28] for generating test cases.Simulation of the model provides an opportunity for analysts who are not formal methods experts to explore the behavior of the user interface interactively.This enables verification that the model reflects the design of the real system, or discussion of the significance of a given requirement.Whether model checking or simulation is used as validation methodology depends on the analyst (e.g., formal methods experts are likely to prefer model checking), and the model (e.g., complex models may exceed the capabilities of model checking tools, therefore simulation is the only option).Concrete examples are discussed in Section VI. 3) Formalizing the requirements.This involves two stages.
The first disambiguates the requirements so that they can be translated more easily into a device-specific property.This is described in more detail in [29].These precise requirements are designed to be implementation independent, and can be used over a range of devices.
The second stage involves refining each formalized requirement so that it is specifically about the device under analysis.Both stages are typically interactive and can involve discussion with both human factors specialists, checking the validity of the interpretation of the requirement, and regulator to check that the spirit of the original requirement is correctly captured by the developed property.4) Proving the formalized requirements.The property representing the formalized requirement is proved of the model by model checking or theorem proving.In the example, all the formalized requirements were proved using both technologies except those that involve full number entry, which could only be proved efficiently using theorem proving.5) Iterating the process.The steps of the analysis process may be iterated as the model is successively refined or may trigger the generation of arguments as to why the design, as modeled, satisfies or fails to satisfy the requirements.Whilst the described process assumes a model-based view of development, the same methodology can also be applied retrospectively to existing devices.For already developed devices, the process of building the model involves reverse engineering the device implementation.A white box view of the system may be taken by reverse engineering the code, or a black box view of the system can be developed by modeling the system based on user manuals and experience of the device itself.In the present paper, this retrospective application of the methodology has been used to perform the analysis of the example medical device.

IV. THE MEDICAL EXAMPLE
Intravenous infusion pumps are used in many contexts in hospitals, for example intensive care and oncology.They are designed to infuse prescribed doses of medication intravenously over specified periods of time.Use error is a particular concern in such devices.Nurses program them, often under pressure, from paper prescriptions provided by doctors or pharmacies.The format of prescriptions can vary in terms of presentation, legibility, units used and whether the prescription focuses on volume and rate, or volume and time -two alternative ways of describing a prescription.The chosen device (see Figure 1) is an existing commercial product [30] that has characteristics that are common to many devices that control processes over time.The clinician user sets infusion pump parameters and monitors the infusion process using the device.The values and settings can be changed using a combination of function and chevron keys (see Figure 1).Fig. 1.Actions and attributes used to model the pump.Actions define functionalities provided by the device to enable user interaction (e.g., the fup key) and to handle device internal events (e.g., a function for handling alarms).Attributes define the characteristics of display elements used to present feedback to the user (e.g., the top-line display), as well as internal state variables necessary to represent the device behavior.
Chevron keys are used to increase (using specified actions fup or sup), or decrease (using fdown or sdown), entered numbers incrementally.Holding the chevron key down accelerates the size of the increment or decrement.
Because the device is small it is reliant on modes to make effective use of the screen and available action keys.The mode structure first distinguishes whether the device is infusing (i.e., pumping medication into the patient) or holding (i.e., paused).Additional modes are offered by the device for changing therapy settings and pump configuration options.For example, data entry modes govern whether the chevron keys change infusion rate, vtbi or time, or alternatively allow the user to move between options in a menu.Menus are available to select predefined settings in bag mode and in query mode.Bag mode allows the user to select from a set of infusion bag options, thereby setting vtbi to a predetermined value.Query mode, invoked by the query button, generates a menu of options configured by the manufacturer.Options include: locking the infusion rate; disabling the locking of it; setting vtbi and time rather than vtbi and infusion rate; and changing the units of volume and infusion rate.The device allows movement between display modes via three function keys (key1, key2 and key3).Each function key has a display associated with it indicating its present function (fndisp1, fndisp2 and fndisp3).
In the analysis of the requirements that follows, further details of the device and modes will be provided as necessary, particularly when the analysis of a requirement leads to interesting consequences.

V. DEVELOPING A MODEL OF THE DEVICE
Neither a process of developing and refining models satisfying requirements nor producing models from program code was feasible in the present case.The device had already been developed, and the program code was not available to us.The model was therefore developed by hand using a combination of user manuals, simulations and the device itself.
At the stage of the analysis described in this paper, a model of the particular pump had already been developed as part of a general analysis of usability properties of the device [31], and without the particular FDA requirements considered in the present paper in mind.It had been used to analyze properties of the interactive modes, such as whether device modes were presented by the device without ambiguity.This initial model was analyzed using the NuSMV model checking tool [32].
This model was further extended to facilitate analysis of the number entry system of the device as required to prove various FDA requirements.One effect of this extension was that the size of the model increased significantly, making the use of the NuSMV model checker infeasible.We therefore changed verification technology when analyzing the extended model.Our option was theorem proving, which is less automated than model checking, but can handle the analysis of more complex models (compare [8]).The specific theorem prover is PVS [33], which offers an expressive specification language that made it easy to translate and extend the initial model.NuSMV uses symbolic model checking.Alternative model checking technologies exist (e.g., bounded model checking), also available within the NuSMV toolset, that might have been able to analyze efficiently the extended model for the considered properties, possibly at the expense of completeness.
Another key reason for the adoption of PVS was its ability to generate the prototypes that enabled validation of the model and broader discussion of the requirements.

A. Initial model of the device
The model of the device has two main elements: a generic "pump" component modeling the pumping mechanism controlled by the device; and an "interface" component that is specific to the particular user interface of the device.The pump component has been reused in other models.For example, an infusion pump developed by another manufacturer has also been studied in detail in [31] which uses this component.The initial model focused on features of the device that involved interaction with the user.This model used the underlying pump process but focused on the effect of actions insofar as they change the basic modes of the device (infusing, paused, off), the interaction modes of the device and the information that is displayed.
To ease the modeling process, a first order notation oriented around actions was used to describe user actions that were provided by the infusion pump.The notation that was used (Modal Action Logic or MAL), and its mapping to SMV and analysis using NuSMV is supported by the IVY tool [7], [34].MAL is a simple state transition language, easily translated from state transition diagrams or the SCR tabular format [21].The notation is used because it is of a type that is more readily acceptable by developers, making the actions and state transitions explicit in their description.The properties that translate the requirements are expressed in Temporal Logic.Both CTL and LTL logics (see [35]) are supported by IVY and NuSMV.LTL model checking is implemented by NuSMV in terms of CTL model checking, and is typically less efficient [36].In practice, CTL is used unless the property can only be expressed in LTL.To facilitate tractability using model checking, token values were used for the pump variables vtbi, infusion rate, time and volume infused.They were assumed to be integers in the range [0 . . .7].These simplifications were considered to be valid when limiting the analysis to the modal behavior of the device.
The following MAL modal axiom describes the conditions in which key1 (see Figure 1) has the effect of confirming a device reset (Figure 2 shows the pump display at this step).
This axiom describes (after [key1]) the effect of action key1.The expression to the left of the action, namely (topline = clearsetup), states the condition under which the behavior described for the action is enabled.This specifies that when the top line of the display shows "clear setup", and the action is invoked, then the expression after [key1] describes the behavior.The rule describes changes to visible attributes middisp, topline, fndisp1, fndisp2, and fndisp3.The priming of an attribute (topline , for example) indicates that the action changes the value of that attribute.The action key1 also changes the mode of the device (entrymode) to allow entry of infusion rate (rmode).Finally the rule describes how the action further invokes an action in the pump component (device.reset) that initializes all the pump variables.The action reset is accessed in the reusable pump component by using the pump's identifier device in the MAL specification.The keep(. . . ) expression specifies which attributes are not affected by the action and remain unchanged.This example shows how the MAL model focuses on interface features and the modes of the device, describing concretely how actions change the display and modes of the device.It has a simple discrete model of time.An action tick increments time as the infusion process continues, or while the device is paused.In the latter case the value of time is used to determine how long the pause has been.This model, even without full number entry, requires substantial processing for analysis in NuSMV, around 90 minutes for the subset of properties that are tractable.The analysis was performed on an Apple MacBook Pro 2.9 GHz Intel core i5 with 8GB of RAM.A detailed discussion of the challenges related to model checking realistic user interface models can be found in [8].

B. Detailed model of the device
The second model of the device, which includes additional details needed for the analysis of various FDA requirements, was developed by translating the MAL systematically into the PVS [33] theorem proving system, and then extending it with details related to the data entry system.PVS allows the analysis, in principle, of models and properties involving infinitely many states.The equivalent specification for the MAL fragment described in the previous sub-section is: The PVS theory captures all the characteristics of the MAL model, including time, but also includes a full number entry model and other specific details.The PVS features that correspond to MAL elements can be seen in the specification.This function key1 case clearsetup is invoked in the more general key1 function when the condition topline(st) = clearsetup is true.The function has domain (per key1), where per key1 is a predicate that restricts the function domain to the set of states for which the action is accessible to the user.This information is used by PVS to ensure wellformedness of the model.

VI. VALIDATING THE MODEL AGAINST THE REAL DEVICE
Fidelity of the model to the implemented device was first demonstrated using the NuSMV [32] model checker by proving a range of properties.This validation of the model inevitably included state abstractions required to reduce the state space for the model checking analysis.The sequences generated as witnesses provided structural details such as a sequence of mode transitions, rather than the details of entry of a particular value for the infusion rate.
NuSMV accepts a finite state model (translated from the MAL model illustrated above) and analyzes it exhaustively to prove or disprove a property.An example of such a property is that, once relevant pump variables had been entered, infusion would lead to a state in which the volume infused was equal to the entered vtbi: The property is expressed as a negation.It asserts that it is always the case, for all paths, that if infusion rate is set to 1 (a token value) and vtbi is set to 7, then a state cannot be reached in which volume infused is 7.This particular property is generic in the sense that it would be a desirable characteristic of any programmable infusion pump.It does not depend on the details of the device user interface, depending only on the generic pump model, but produces results that enable an analysis of the interface, making possible a comparison between alternative interfaces.As expected, the property fails when checked and produces a trace of steps (a witness) in which the infusion rate is set to 1 and vtbi is set to 7. It indicates that once this has happened, eventually the device is set to infuse, and then after more steps a state is reached where the volume that has been infused becomes 7.The trace can be compared with the actual device logs.This model was used to validate a set of plausibility properties before translating it into the fuller PVS model.For each action specified in the first model, a function was described that transformed PVS descriptions of states, and for each permission describing when the action was enabled, a PVS predicate was produced.Texture was then added to provide a more detailed description of the user interface behavior, in particular related to the data entry system of the pump.A set of informal rules were used to achieve this translation.The correctness preserving properties of the translation were not however checked formally.A prototype was also produced automatically from the PVS model to obtain an interactive simulation with the "look and feel" of the actual device (see Figure 2).The traces and simulations were indistinguishable from the behavior of the physical device.The only difference between the logs and traces obtained in the simulation and those of the real device was that the precise timings differed.The simulations were generated with the aim that they could be explored by regulator, manufacturer or for training purposes by the user.These simulations only allow, of course, an exploration of the paths that the user (for example, regulator) chooses to explore.Simulation can also be used to illustrate what the failure of a property means.Part of the argument that a failure is acceptable may then involve a demonstration of the features of the device that fail the requirement, showing that they do not present a risk.Validation of the PVS model was checked in this case not by proving the equivalence of the PVS and MAL models, though this is feasible, but by simulating the PVS model using PVSio-web [37].Both MAL sequences and PVSio-web simulations were explored by hand with the help of expert users and the user manual of the real device.
The remainder of the present paper focuses on the PVS model and the theorems that were generated to prove use-related requirements.Relevant snippets of the developed models and theorems will be presented.The full MAL model with CTL properties, and the full PVS theory and theorems with proofs can be found in the specification repository 1 .All proofs can be reinvoked using the IVY and PVS tools.

VII. FORMALIZING AND PROVING REQUIREMENTS
In this section, we demonstrate how the process of formalizing and proving use-related safety requirements can encourage a constructive dialog between the analyst and the other experts involved in the development of a user interface.
The developed device model was used to analyze two small but representative sets of requirements.The FDA requirements described in [5] are considered first.They are designed to mitigate use hazards that could lead to mis-programming of the infusion pump and, consequently, delayed therapy or even patient harm.A further set of requirements is then considered based on the property templates described in [38], [39].These additional requirements capture best practice in user interface design.Note that, whilst the process is demonstrated for a specific device, the applicability of the methodology is general, and can help design better user interfaces.

A. Formalization process
To formalize the requirements given in natural language, it is necessary to consider their precise interpretations (see [29]).A way to develop a precise interpretation is to translate the requirements into logic properties.These properties are expressed using the PVS language, which combines a functional notation, similar to that used in programming languages, with logic connectives such as AND (conjunction), OR (disjunction), IMPLIES (implication).Precision is achieved by defining abstractions that can be more readily understood by different stakeholders.Also, this process makes it easier to construct properties that match the requirements as PVS theorems.The formalization must capture the essence of the requirements as understood both by the regulator, who developed them in the first place, and the human factors specialists, who can comment on the user aspects of the requirements and whether they are fulfilled by the specific properties of the device.

B. The FDA Requirements
The FDA requirements are specific to the safety of infusion pumps.For each requirement, the following information is presented: the original formulation described in FDA documentation; the safety concerns addressed by the requirement; a formalization of the requirement and a proof of compliance with the requirement for the example device.
R1: Clearing the pump settings and resetting of the pump shall require confirmation.
Safety concerns.This requirement is designed to check whether there is a barrier (in this case confirmation) to reduce the risk that infusion settings will be cleared or will revert to predefined settings inadvertently.
Formalization.To formalize this requirement, a logic property needs to be constructed that requires that relevant pump variables are not cleared until a confirm action has occurred.The property needs to specify also that no other action (no confirm) will change the specified pump variables.For the sake of clarity, the formalization is described for each pump variable separately.The requirement, for pump variable vtbi, can be expressed as follows: In the above formula, ready to clear(vtbi) is a predicate that identifies the states in which the device is ready to clear the vtbi value.The action clear setting (vtbi)  specifies the first step, prior to possible confirmation, in which the settings are cleared.This action does not clear the setting itself but must precede the confirmation action.
Interpretation for the specific pump.The process of further refining the terms in the abstract property is valuable in reaching agreement about how the requirement applies to the specific pump.It leads to consideration as to what, for example, should be the state of the device when it is "ready to clear" and how the state of the device should be communicated to the user.Further, it leads to consideration of what the confirmation should be and how it should be signaled to the user.The actions captured by no confirm are also a matter of concern, as they define what actions should be permitted to abandon the clearing of the setting.For example, in the example device, vtbi can be cleared when the device is turned on and is in the holding state.The action of clearing the pump variable is only possible if its value is non-zero.Relevant state attributes for expressing whether the pump is infusing and turned on are infusing?and powered on, respectively.Predicate ready to clear can therefore be expressed as: NOT device(st)'infusing? AND device(st)'powered_on?AND device(st)'vtbi = x AND x /= 0 where x is a parameter representing a generic, but given, vtbi value.Normally, a specification of ready to clear is necessary that includes visual attributes of the user interface.For example, in this case the device should be ready to clear unless the top line of the display indicates that it is infusing.It was felt unnecessary however to include this in the formulation -other requirements were considered in the analysis that demonstrate the visibility within the user interface of operating modes.The required clearing of the pump variables is achieved by switching the pump off and then switching it back on again.When the settings are cleared in this way, as the device is switched back on, a request is made for confirmation.This confirmation can be given by the user through action key1.
The request for confirmation is indicated by a top line display of "clear setup".
Proving the requirement.The requirement can be proved in PVS with no human intervention by using the predefined proof strategy grind (which performs quantifier elimination, expansion of definitions, and propositional simplification) to prove each sub-goal.

R2:
The pump shall issue an alert if paused for more than t minutes.
Safety concerns.This requirement aims to ensure that the user is alerted if the device is left unattended.
Formalization.The requirement can be formalized using a predicate user input strictly overdue, which indicates whether the device has been paused without activity for a specified period, and a predicate alert, which describes an appropriate alert produced by the device: Interpretation for the specific pump.
The generic formulation can be refined to capture specific scenarios relevant for the device, for example if the device is left unattended when paused.In this case, predicate user input strictly overdue can be expressed in more detail as: where paused and elapsed will have specific meanings for the particular infusion pump.In this case, the pump is paused when the device is powered on and not infusing: pause(st) = device(st)'powered_on?AND NOT device(st)'infusing?
Attribute elapse specifies the time since the device was last used when in holding mode.This attribute is incremented each tick action (which simulates the evolution of time in the developed model) when the device is paused.
Proving the requirement.This requirement can be proved for all reachable states through structural induction.That is, the PVS theorem contains the following two parts: the first part proves that the formalized requirement (denoted as R2assertion in the theorem) is true of the initial device state; the second part proves that, given a generic state pre for which the requirement is true, it is always the case that the requirement is also true for any state post reached from pre by any available action: In the theorem, predicate state transitions is used as a means to relate states st1 to st2 if st2 can be reached by any available action from st1.This requirement can be proved in PVS by splitting the theorem into sub-goals based on the available actions, and then using grind to complete the proof of each sub-goal.
R3: If the pump is in a state where user input is required, the pump shall issue periodic alerts/indications every t minutes until the required input is provided.
Safety concerns.This requirement aims to mitigate situations where the clinician has entered incomplete infusion parameters.This might occur, for example, if the clinician is interrupted while programming the device.
Formalization.The formalized requirement includes the following elements: alert, a predicate describing an appropriate alert produced by the device; ready, a predicate identifying the device state before the alert was issued; and user confirm, an action that clears the alert: user_input_strictly_overdue(st) IMPLIES (alert(st) AND ready(user_confirm(st))) Interpretation for the specific pump.Providing an interpretation for these abstract terms challenges the designer and the human factors specialist to consider the features of the design that need to be considered in the requirement.For example, user input strictly overdue should be true when the device has been paused without activity for a specified period (compare requirement R2).The formalization also challenges the analysis team to consider how this device state is communicated to the user, and which key is used as confirmation key.In the specific pump, alert is indicated by an appropriate top line display (attention) which is characterized also by an audible alarm.Action user confirm is defined as key3.Note that the confirm does nothing if key3 is not enabled.Finally, predicate ready needs to be defined.It checks that the device returns to a pause state and the elapsed time is set to a value less than the time out.
Proving the requirement.This requirement can be proved in PVS with minimal human intervention, using a structural induction similar to the previous example.

R4:
The flow rate for the pump shall be programmable.
Safety concerns.This requirement aims to ensure that the clinician can program the pump with the flow rate values indicated in the prescription provided by the pharmacy.
Formalization.This requirement is more challenging to make precise.It requires a description of what is meant by "programmable" in this context.The requirement is designed to ensure that any flow rate value indicated in a prescription can be programmed in the pump.A reasonable interpretation is that there is always an available action (when in a relevant mode) that will change the flow rate programmed in the pump so that it is closer to the target rate.The requirement can be reformulated therefore as follows: "If the device is ready to enter the flow rate, then there is always an action that will take the flow rate closer to the expected rate, and eventually the intended rate will be reached."This reformulation can be translated into the following logic property: where e is a target value for the flow rate, a1 is an action that reduces the rate, and a2 is an action that increases the rate.
Interpretation for the specific pump.This requirement, as expressed, raises questions about the nature of the actions a1 and a2, how the target rate is made visible to the user, and how the device indicates the progress that is being made to reaching it.For the specific pump, the device is ready to accept a rate value (rate entry ready) when: the device is switched on, infusion rate is not locked, and the top line display shows "holding" or "infusing".The two actions that provide the expected programmability are single chevron up (sup) and single chevron down (sdown).Note that the requirement is not concerned with how efficient the programmability is.Rather it aims to check that there exists a sequence of actions for entering a given value.A separate requirement needs to be defined to address efficiency of programming.
Proving Requirement R4.The PVS prover completes the proof of this theorem unaided, using grind.
R5: To avoid accidental tampering of the infusion pump's settings such as flow rate/vtbi, at least two steps should be required to change the setting.
Safety concerns.This requirement is designed to mitigate hazards resulting from accidental tampering of pump settings, as a result for example of a single erroneous button click.
Formalization.This requirement is useful to further illustrate the role of abstraction as a means of communication with different stakeholders.Whilst this requirement shares similarities with requirement R1, it is interesting to note the different safety mechanism indicated in the requirement: R1 requires a "confirmation action"; R5 requires "at least two steps".This is done intentionally, because a confirmation action can be performed automatically by the user without further thought, e.g., after a timeout.In R5, the requirement is that two user actions are performed -a safety mechanism based on timeouts can be easily defeated in the case of accidental key presses.To ease the formalization, the requirement can be reformulated as follows: "For any value x of a given pump setting, if data entry is ready (vtbi entry ready) for that pump setting (in this example vtbi), then the pump setting cannot be updated to x in a single step."This property needs to be proved for all possible actions (state transitions xkey1) with the exception of key1 which does update the pump setting: (vtbi_entry_ready(pre) AND vtbi_value(pre) = x AND newvtbi(pre) /= x AND state_transitions_xkey1(pre, post)) IMPLIES vtbi_value(post) = x Interpretation for the specific pump.The example pump is ready to accept a vtbi value when the pump is turned on and in a relevant data entry mode.Specifically, the top line display needs to be either dispvtbi (i.e., when entering vtbi and rate) or vtbitime (i.e., when entering vtbi and time).
Proving the requirement.PVS is able to prove the requirement unaided for vtbi and time, using grind.For infusion rate, on the other hand, the requirement fails.Based on the counter-example provided by PVS, it can be shown that the requirement can be satisfied for the infusion rate only if it is assumed that the clinician always locks the rate before starting the infusion.Whilst there is always a reminder to lock the rate when starting the infusion, the assumption needs to be validated against clinical practice adopted in the hospital within which the pump is to be used.

C. Requirements from Property Templates
The second route to generating use-centred safety requirements is to adopt a set of property templates based on usability design principles.The templates are designed to help developers to construct requirements appropriate to the analysis of user interface features.They can be instantiated to the particular details of the device, and provide indications of how to develop user interfaces that are easier to use and promote more transparency of the effect of actions.
Three property templates will be used to illustrate the approach: "feedback", "consistency", and "reversibility".Further property templates and detailed examples can be found in [39].

Feedback template
When certain important actions are taken, a user needs to be aware of whether the resulting device status is appropriate or problematic [40].Feedback can be considered conveniently as action feedback, requiring that an action always has an effect that is visible to the user, and state feedback, requiring that a change in the state (usually specific attributes of the state rather than the whole state) is visible to the user.Two example requirements are now illustrated that are based on this template.R6: Whenever a pump variable is being entered, the variable should be clearly identified and its current value visible to the user.The instantiation of the general form of action feedback (as described in [39]) requires that entry of the relevant variable is enabled (i.e., the device is in the appropriate data entry mode), and that the variable relevant to the mode is visible.This requirement can be formalized as follows: Proving the requirement.The requirement can be proved automatically in PVS for all reachable states through structural induction (compare requirement R2).

R7:
The current mode should be clearly identified, and changes in mode should have perceivable feedback.This requirement is an example of state feedback.It requires that, in any situation, if the mode changes then the mode is visible.For the considered pump, top line is the indicator of the entry mode.The formulation can be further refined therefore as follows: "When the entry mode changes then the top line changes."The requirement expressed as logic formula is: entrymode(pre) /= entrymode(post) IMPLIES topline(pre) /= topline(post) Proving the requirement.The proof of this requirement illustrates the type of human intervention that is necessary to complete an interactive proof attempt.The theorem prover fails to prove the theorem, and generates a first counter-example at the point of failure when the device is in bag mode, which allows standard bag volumes to be assigned to vtbi.The counter-example shows that: • The top line indicates "volume to be infused" • Action key1 can be used to exit the mode • The new mode, after performing action key1, has a top line which also indicates "volume to be infused", but in this mode chevron keys change the value of vtbi through up and down adjustments rather than by navigating the infusion bag menu.The fact that the prover identifies this counter-example leads the analyst, along with domain and human factors experts, to consider whether this ambiguity is likely to be an issue.Further discussion might suggest that the format of the display as a whole between the two modes is significantly different, and therefore enough to prevent mode confusion.This case can be therefore excluded by introducing the following guard in the theorem: entrymode(pre) /= bagmode When the proof is attempted again with the guard, the theorem prover throws up another similar situation for another data entry mode, where a top line "vtbi over time" is displayed in two different modes.As in the previous case, this counter-example can be considered a false positive because the overall format of the two displays is significantly different.An additional guard is therefore added to the theorem to exclude this new situation: entrymode(pre) /= ttmode This further refinement of the theorem is sufficient to complete the proof of the requirement.

Consistency template
Users quickly develop a mental model that embodies their expectations of how to interact with a user interface.Because of this, the overall structure of a user interface should be consistent in its layout, screen structure, navigation, terminology, and control elements [40].The consistency template is formulated as a property of a group of actions, or it may be the same action under different modes, requiring that all actions in the group have similar effects on specific state attributes.The following requirement for example can be constructed based on this template.R8: When entering numbers, a given action will always confirm the value entered.This requirement aims to check that the same key can be used across different data entry modes to confirm user input.A general formulation of the requirement includes the following elements: a predicate guard em ok restricts the set of relevant entry modes where a given confirmation action confirm is available; predicate temp em filter extracts the state attributes that have been changed "temporarily" because of the mode; and predicate real em filter extracts the state attributes that are changed as a result of exiting from the mode.All predicates have device data entry mode em as parameter, as the precise definition of the predicates depends on the data entry mode in the general case.
guard_em_ok(em, st) IMPLIES temp_em_filter(em, st) = real_em_filter(em, confirm(st)) For the example pump, key1 is typically associated with the ok function during data entry.The verification effort therefore aims to ensure that, whenever key1 is enabled and associated with function display ok, the confirmation action behaves in a similar way in all data entry modes.
Proving the requirement.When attempting the proof in PVS, a counter-example is found by the theorem prover.The use of the confirmation key is not consistent in data entry mode "vtbi over time".This particular data entry mode involves setting vtbi and time.The theorem fails because when vtbi has been entered, and key1 is pressed with the ok function display, the pump value of vtbi is not changed.It is only changed after the next step when time has been changed.The failure, identified while proving the theorem, may require further scrutiny to demonstrate that the system's safety is not affected.The device in fact assumes that vtbi and rate are the standard mechanisms for setting up the infusion rate.There may therefore be issues when a prescription is received that presents vtbi and time.

Reversibility template
Users may perform incorrect actions, and the device needs to provide them with functions that allow them to recover by reversing the effect of the incorrect action.In [41], it has been shown that lack of compliance to this requirement could lead to data entry errors.An example requirement based on this template is as follows.
R9: Any data entry action should be reversible.To facilitate the formalization process, this requirement can be reformulated as follows: "For any particular action a1, there is a reversing action a2 which returns the device to its original state."This reformulation can be easily translated into the following generic formula: data_entry_ready(st) IMPLIES pump_variable(a2(a1(st))) = pump_variable(st) For the example pump, the formula needs to consider all the chevron keys and entry of infusion rate, time and vtbi.For the sake of clarity the example is given only for the case of entering infusion rate and the single chevron up (sup) key with inverse single chevron down (sdown) key.The example pump is designed so that the user can accelerate the size of the increment by holding the chevron key down.This possibility is reflected in the specification but is simplified here for illustration.The formulation for the other chevron keys and data entry modes is identical.The generic version of the requirement is therefore instantiated as follows: Proving the requirement.This final proof example further shows how feedback from the theorem prover can be used constructively to refine the understanding of the interaction design of the device.Proof of the theorem fails, and the theorem prover returns counter-examples indicating anomalies for certain specific values.An example counter-example identified by the theorem prover occurs at 99.9: pressing sup and then sdown produces 99 (instead of 99.9).This happens because of "step boundaries" where the size of the increment or decrement changes.Up to, but not including, 100, both the single chevron up key and the single chevron down key make a step of 0.1.From 100 upwards (up to 1000), the increment is 1.Therefore, to prove that sup can be reversed by sdown, the theorem must take account of these step boundaries.The additional conditions necessary to prove the theorem for the range 0 to 100 for the sup and sdown keys are illustrated here.The first condition ensures that the analysis is performed within a range of values where the same increment step (in this case, 0.1) is always maintained: where floor returns the largest integer less than or equal to the specified number, and ceil returns the smallest integer greater than or equal to the given number.Wrapping all these constraints together, for all step boundaries, and for all chevron keys, the property to be proved for requirement R9 becomes much more complex.Proof of this requirement, with all its qualifications, provides limited assurance that number entry actions are easily reversible by users.Clearly the process was valuable in understanding the characteristics of the number entry system, and identify precisely in the design space where the property fails.Formulating and proving the theorem raised practical questions about whether the data entry system of the device is acceptable, or whether it is likely to lead to use error.
It should be noted that later releases of the firmware for this device have fixed these issues with step boundaries.

VIII. DISCUSSION AND CONCLUSIONS
Demonstrating that the design of a medical device is compliant to relevant safety and usability requirements is a serious problem.The techniques described in this paper are designed to address this issue.It is estimated that there were 56,000 adverse event reports relating to infusion pumps between 2005 and 2009 in the United States including at least 500 deaths [42].This has resulted in 87 infusion pump recalls to address identified safety concerns, according to FDA data.Of these adverse event reports, use error has been a significant factor.The documentation provided by manufacturers to regulators as part of a safety argument is usually substantial.The scale of the argument inevitably makes it difficult for regulators to comprehend them and to be confident that the evidence provided is of satisfactory quality.The use of formal techniques has advantages.(1) It is precise and concise, potentially avoiding the documentation explosion generated by a typical deposition.(2) Tools like model checkers and theorem provers enable mechanical and exhaustive verification.(3) The use of simulation techniques combined with formal modeling can clearly demonstrate how potential problems are addressed that can be of value to regulators, human factors and domain specialists.
There are well known obstacles to the immediate take-up of these facilities.They are not routinely part of a typical developer's suite of tools.They are not routinely used in product development.However there are signs of interest in these techniques.For example, the FDA has developed generic PCA models [43] using Simulink.It is not however a feasible option to expect regulators to construct models after the fact.An ideal option would be that manufacturers produce models as part of their design process demonstrating that a submitted product adheres to safety requirements.The regulators would then use tools to validate the models provided as part of the developer's submission.
We have used model checking combined with simulation to support the process of validation of models by generating traces to be validated on the device.However, manufacturers have access to source code, and even if they do not develop their devices using models, they can create faithful models systematically.A further important issue, not addressed here, is how to validate that the considered safety requirements correctly address the usability and safety of the device for the context in which the device is to be used.This problem is orthogonal to the techniques presented in this paper.Safety aspects can be addressed with a human factors emphasis using a hazard analysis such as the one presented in [44].
We have illustrated how formalizing the requirements provides benefits in addition to the ability to prove them.It has led to much more detailed thinking about the precise nature of the requirements, both in general and for a specific device, than was possible in the informal natural language version.The pragmatic and informal combination of model checking and theorem proving provided powerful tools for analysis.By using each flexibly for requirements they were suited to, rather than ideologically favoring one for all requirements, or trying to combine them into a single tool applying both, it was possible to prove the requirements with relatively low effort.One potential drawback of this approach is the need to master the two verification techniques.Indeed, both verification methods currently require significant skills for analysis.However, we have observed recurrent patterns in the structure of the formal models of devices from different manufacturers, and in the strategies needed to complete verification of several types of safety requirements.Therefore there are clear opportunities to create automated proof strategies that can be used to reduce the analysis effort.