Modelling and verification of parameterized architectures: A functional approach

The merit of higher order functions for hardware description and transformation is widely acknowledged by hardware designers. However, the use of higher order types makes their correctness proofs very difficult. Herein, a new proof approach based on the principle of partial application is proposed, which transforms higher order functions into partially applied first-order ones. Parameterised architectures modelled by higher order functions can therefore be redefined over first-order types only. The proof can be performed by induction within the same specification framework, which avoids translating higher order properties between different semantics, a task that remains extremely difficult. Using the notion of parameterisation, where verified components are used as parameters to build more complex ones, the approach fits elegantly into incremental bottom-up design, where both the design and its proof can be developed in a systematic way. The features of the proposed proof approach are demonstrated on a detailed example of circuit design and verification within a functional framework.


| INTRODUCTION AND MOTIVATION
Within the framework of functional programming, the merit of higher order functions (HOFs), called combining forms, is widely acknowledged by hardware designers. Their ability to make abstractions allows them to be used for many purposes, such as structuring and reusing circuit specifications [1,2], expressing regular designs [3,4], optimising designs [2,5], and developing parallel pattern-based models [6][7][8]. However, while HOFs provide many advantages at the specification level, their correctness proof remains a tedious task. The main reason is the use of higher order types, which makes it extremely difficult to show that a design meets its specification for all input values, or to define standard higher order homomorphisms. Such obstacles have led some authors to weaken the higher order homomorphism [9] (dropping the congruence relation, which requires a complicated logic calculus involving partial ordering on types), and others to simulate higher order types by first-order ones within an appropriate framework, such as higher order logic [10,11], term rewriting systems [12,13], or functional programming [14][15][16][17]. Most of the proposed approaches use a proof tool to reason about HOF properties, which remain very difficult to translate and prove [18][19][20][21]. Moreover, some higher order cases are still difficult [22].
HOFs have been extensively used for building complex hardware designs whose proofs remain an unaccomplished goal. In Ref. [2], the author uses HOFs within CλaSH (a subset of Haskell) to develop complex regular architectures, which are validated only by simulation before being translated into VHDL code. The authors of Ref. [6] propose a reconfigurable architecture framework called Plasticine for developing, simulating, and implementing efficient pipelined SIMD architectures over HOFs (called parallel patterns) within a Scala-based HDL. Parallel patterns are implemented in Plasticine by a collection of low-level Pattern Units which precisely capture the semantics of functional and memory hardware units.
In Ref. [8], the authors present a similar approach within a functional framework called Lift, which provides both high-level patterns for describing designs at a high level of abstraction, and low-level patterns (as rewrite rules) for making optimisations and generating FPGA target implementations. However, such patterns, which support both the design construction and its transformations, are used without verification. The authors of Ref. [23] use HOFs within CλaSH for modelling Run-Time Reconfigurable designs, and use assertion-based verification to ensure their correctness. Assertions are described by functions and implemented on hardware designs. Their proof approach therefore requires more space and time, and consequently remains critical with respect to design scalability. In Ref. [24], the authors make use of full Haskell to develop higher order descriptions of processor designs, which are defunctionalised (for the purpose of synthesis) into first-order implementations in ReWire (a subset of Haskell). The correctness of the high-level functional description is ensured by a theorem which is proved outside of ReWire. To show the design scalability of their approach, they applied it to the design of both a single core and a dual core (by instantiating two copies of the single core). While their approach scales well with the design complexity, it remains unclear how it scales with respect to the proof.
This article proposes a new methodological approach within the Haskell framework for the formal specification and verification of parameterised hardware designs (that is, designs built over HOFs). The proposed approach combines the notion of parameterisation (to structure the design) with the notion of partial application (to ease the proof) in order to incrementally build and prove complex hardware designs.
In our context, both first-order and higher order specifications are functional programs that can be used for both formal verification and simulation (actual designs are validated by mixing these two techniques [25,26]). Moreover, the methodology carries out the proof by induction within the same specification framework; that is, both the design and its proof are developed using the same semantics. The use of a separate proof tool requires translating the design properties from the specification framework to the proof framework, which makes the soundness of the translation difficult to establish.
The proposed proof methodology follows the layered vertical-horizontal design approach, which was proposed as one of the most innovative contributions of our previous work [27] on first-order models. That work involves only the system behaviour and says nothing about its structure. Complex systems are incrementally built one over the other, and consequently their structure (as well as their reliability) is crucial. The objective of the present work is to show how such systems are built and proved. For this purpose, we introduce two basic concepts: the notion of parameterisation (implemented by HOFs), which is the key point for structuring large designs, and the notion of partial application, which simplifies the proof. This methodology shares with our previous work both the notion of state functions (to formally represent machine states) and the notion of time functions (to link synchronous points where the states of two different implementations can be compared). Throughout this work, we provide an effective framework in which both the design and its proof can be developed in a systematic way.
Herein, Section 2 introduces the basic definitions underlying our proof methodology. Section 3 presents the principle of the proposed proof methodology through the layered vertical-horizontal design approach. Section 4 is dedicated to the modelling and verification of the different microarchitectures: sequential, pipelined, and superscalar. Section 5 presents a detailed case study of our proof methodology applied to MIPS processors. Finally, the conclusion outlines the main contributions of this research work.

| PRELIMINARIES
Haskell [28] is a non-strict, purely functional language. In addition to its formal semantics (which supports formal reasoning), it provides powerful features such as function composition, parallelism, lazy evaluation, polymorphism, and higher order functions, which have demonstrated its viability for complex hardware designs. Throughout this work, we use plain Haskell (to fully exploit these features) instead of a functional HDL (Hardware Description Language). General Haskell can subsequently be translated into a restricted intermediate representation [29,30] before being translated into an imperative HDL such as VHDL or Verilog.

| Polymorphic higher-order functions
A higher order function (HOF) is a function that takes functions as arguments or returns a function as a result. Polymorphic HOFs express elegant regular designs while enhancing their abstraction through parametric polymorphism. Figure 1 shows both the definition and the implementation of the mscanr (map-scan-right) HOF [31]. It combines three standard HOFs (map, fold and scan) in a single operation.
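Since Figure 1 is not reproduced here, the following is a minimal sketch of what a map-scan-right combinator could look like. The exact type and argument order are assumptions and may differ from the definition in Figure 1; the helper `suffixSums` is a hypothetical example that is not part of the original text.

```haskell
-- Sketch of a map-scan-right combinator (assumed shape; Figure 1 may differ).
-- The accumulator travels from right to left through the list, and each
-- application of f yields both the new accumulator and one output element.
mscanr :: (a -> b -> (b, c)) -> b -> [a] -> (b, [c])
mscanr _ acc []       = (acc, [])
mscanr f acc (x : xs) = (acc'', y : ys)
  where
    (acc', ys) = mscanr f acc xs   -- process the tail first (right to left)
    (acc'', y) = f x acc'          -- then combine the head with the carried value

-- Hypothetical example: suffix sums together with the overall total.
suffixSums :: [Int] -> (Int, [Int])
suffixSums = mscanr (\x s -> (x + s, x + s)) 0
```

Under these assumptions, `suffixSums [1, 2, 3]` evaluates to `(6, [6, 5, 3])`: the list component behaves like a scan, the first component like a fold, and each output element is produced as in a map.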

| Curried and partially applied HOFs
Haskell allows a (higher order) function to be defined in curried form; that is, it takes its arguments one at a time (juxtaposed). In this way, it can be partially applied (to fewer arguments than it was defined with) to form a new function (called a specialisation) defined only over first-order types. For instance, let us consider the polymorphic HOF mscanr specified above in Haskell. Let data Bit = Zero | One, and let fadd be a function whose type matches the type of the functional argument f.
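The specialisation described above can be sketched as follows. This is an illustration under assumptions: the shape of `mscanr`, the type chosen for `fadd` (a full adder) and the name `rippleAdd` are hypothetical and may differ from the original figures.

```haskell
data Bit = Zero | One deriving (Eq, Show)

-- Assumed shape of mscanr (see the hedge above).
mscanr :: (a -> b -> (b, c)) -> b -> [a] -> (b, [c])
mscanr _ acc []       = (acc, [])
mscanr f acc (x : xs) = (acc'', y : ys)
  where
    (acc', ys) = mscanr f acc xs
    (acc'', y) = f x acc'

-- Full adder: takes a pair of operand bits and an incoming carry,
-- and returns (carry out, sum bit).
fadd :: (Bit, Bit) -> Bit -> (Bit, Bit)
fadd (a, b) cin = (cout, s)
  where
    ones = length (filter (== One) [a, b, cin])
    s    = if odd ones then One else Zero
    cout = if ones >= 2 then One else Zero

-- Partial application: fixing the functional argument yields a function
-- defined over first-order types only (a ripple-carry adder whose operand
-- list is ordered most significant bit first).
rippleAdd :: Bit -> [(Bit, Bit)] -> (Bit, [Bit])
rippleAdd = mscanr fadd
```

For example, `rippleAdd Zero [(Zero, Zero), (One, Zero), (One, One)]` adds 011 and 001 and yields `(Zero, [One, Zero, Zero])`. The type of `rippleAdd` no longer mentions any functional argument, which is the property the proof approach relies on.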

| State and next-state functions
Throughout this work we will use the notion of state functions for modelling complex processor microarchitectures involving memories.
Let S be a non-empty set, called the state space. A state function F with an initial state c :: S and a next-state function f :: S → S is recursively defined as follows: $F(0) = c$ and $F(n+1) = f(F(n))$. Because the state space of a machine is distributed over its components, the state and the next-state functions must be decomposed into coordinates.
Let $S = (S_1, \dots, S_k)$ be the state space distributed over k components (the observables), where $S_i$ is the state of the i-th component, for $1 \le i \le k$. Thus, the state function is decomposed into coordinate functions $F = (F_1, \dots, F_k)$, one per component. Taking the initial state $(c_1, \dots, c_k)$ into account, the state function F is redefined in a distributed form that takes the initial component states as explicit arguments. This form is very important for the specification of designs that require performing only one step, such as the execution of one instruction.
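A minimal Haskell sketch of these definitions is given below. The two-component state (a program counter and an accumulator) and all function names are hypothetical and chosen only for illustration; the original equations may use a different formulation.

```haskell
-- State function: the state reached after n applications of the
-- next-state function f, starting from the initial state c (n >= 0).
stateFn :: (s -> s) -> s -> Int -> s
stateFn f c n
  | n <= 0    = c
  | otherwise = f (stateFn f c (n - 1))

-- A toy state space distributed over two components (hypothetical):
-- a program counter and an accumulator.
type S = (Int, Int)

-- Coordinate next-state functions; each one may read the whole state.
nextPc, nextAcc :: S -> Int
nextPc  (pc, _)   = pc + 4
nextAcc (pc, acc) = acc + pc

-- The next-state function assembled from its coordinates.
next :: S -> S
next st = (nextPc st, nextAcc st)

-- Distributed one-step form: the initial component states are explicit
-- arguments, and exactly one step is performed.
oneStep :: Int -> Int -> S
oneStep c1 c2 = stateFn next (c1, c2) 1
```

With these definitions, `stateFn next (0, 0) 3` evaluates to `(12, 12)` and `oneStep 0 10` to `(4, 10)`.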

| Layered design approach
The proposed proof methodology makes it possible to incrementally model and prove state-of-the-art parameterised designs. The verification process follows the layered vertical-horizontal design approach shown in Figure 3. The highest level is the Instruction-Set-Architecture (ISA) specification, which describes the semantics of the processor's operations. The Micro-Architectural (MA) level is the top-level design implementing the ISA specification. All MA designs, which are hierarchically built one over the other, are different implementations of the same ISA specification. Three MA designs are considered: the sequential MA design (SMA), the pipelined MA design (PMA), and the superscalar MA design (SSMA). The SMA design involves only first-order functions (and consequently its proof can be easily performed against an ISA specification); it is the reference core architecture over which both the PMA and the SSMA designs are hierarchically developed. So, the PMA is parameterised with the SMA over which it is built, and the SSMA is parameterised with the PMA over which it is built in turn. Both pipelined and superscalar designs are proved against the reference SMA design. Other approaches, such as the flushing technique [32][33][34], the completion functions method [35], the one-step theorem [36] and well-founded bisimulation [37], attempt to prove the Micro-Architecture/Register-Transfer design against an ISA specification. However, such approaches run into many difficulties [38] when linking the MA/RT level to the ISA level for pipelined and superscalar designs (where instructions overlap), because they require defining complex abstraction functions.

| Formal verification of the first-order implementation
Let S and W be two finite sets of first-order types, representing values at the ISA and MA levels, respectively.
Let abs :: W → S be an abstraction function that maps the MA level to the ISA level. Then, proving the correctness of the sma implementation with respect to the isa specification requires proving the following commutative property:

$\forall v_1 :: w_1, \dots, v_n :: w_n.\ \ isa(abs(v_1), \dots, abs(v_n)) = abs(sma(v_1, \dots, v_n))$

This property is depicted in Figure 4 by the corresponding commutative diagram.
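To make the commutative property concrete, here is a toy instance in Haskell. It is only a sketch under assumptions: the 4-bit adder, the names `absW`, `isa`, `sma` and `commutes`, and the exhaustive check over all inputs are illustrative choices, whereas the article establishes such properties by proof rather than by enumeration.

```haskell
-- MA-level value: four bits, least significant bit first.
type Word4 = [Bool]

-- Abstraction function from the MA level to the ISA level.
absW :: Word4 -> Int
absW = foldr (\b n -> fromEnum b + 2 * n) 0

-- ISA-level specification: addition modulo 16.
isa :: Int -> Int -> Int
isa x y = (x + y) `mod` 16

-- MA-level implementation: ripple-carry addition, dropping the final carry.
sma :: Word4 -> Word4 -> Word4
sma = go False
  where
    go _ [] _ = []
    go _ _ [] = []
    go c (a : xs) (b : ys) = s : go c' xs ys
      where
        s  = (a /= b) /= c                  -- sum bit: a xor b xor c
        c' = (a && b) || (c && (a /= b))    -- carry out

-- All sixteen 4-bit words.
allWords :: [Word4]
allWords = mapM (const [False, True]) [1 .. 4 :: Int]

-- The commutative property, checked exhaustively on the toy domain.
commutes :: Bool
commutes =
  and [ isa (absW v1) (absW v2) == absW (sma v1 v2)
      | v1 <- allWords, v2 <- allWords ]
```

On this small domain `commutes` evaluates to `True`, which mirrors the two paths of the commutative diagram: abstracting the inputs and applying the specification gives the same result as applying the implementation and abstracting its output.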
Throughout this work, the sequential microarchitecture implementation will be assumed correct. A detailed proof about sequential microarchitectures is given in our previous work [39].

| Formal verification of the higher order implementation
Let H be a higher order curried function modelling a system design as follows:

$H :: t_{f_1} \to \dots \to t_{f_n} \to t_{x_1} \to \dots \to t_{x_n} \to (t_{y_1}, \dots, t_{y_n})$
$H\ f_1 \dots f_n\ x_1 \dots x_n = (y_1, \dots, y_n)$

In our framework, such a higher order definition describes the architecture of a system design (its structure) and is interpreted as follows: the function name H denotes the system name, the functional parameters $f_i$ represent its components, the variables $x_i$ denote the inputs, and the $y_i$ denote the outputs, as depicted in Figure 5a. The polymorphism of the parameters $f_i$ makes it possible to change the functionality of the system components simply by instantiating them with real design operations (independently of the implementation technology). However, to avoid dealing with functional types in the proof, our approach derives new partially applied first-order functions from HOFs. Notice that HOFs must in any case be defunctionalised into first-order functions during the synthesis process to make them translatable.
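Let F be a partial first-order function (specialisation) defined over H by applying H to its functional parameters only: $F = H\ f_1 \dots f_n$, so that $F\ x_1 \dots x_n = (y_1, \dots, y_n)$.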
The function F involves only first-order types and describes precisely the behaviour of the system design whose structure is described by H, as shown in Figure 5b. Therefore, our approach reduces the verification of the design modelled by H to the verification of the design modelled by F. In this way, our approach naturally assigns a structural semantics to the higher order implementation and a behavioural (computational) semantics to the first-order implementation. Other approaches give a dual semantics to the same design: a structural semantics to build it and a computational semantics to simulate it [2,6].
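The relation between H and F can be sketched in Haskell as follows. The two-component system, the concrete operations and the lower-case names `h` and `f` are hypothetical; they only illustrate how partial application removes the functional arguments from the type.

```haskell
-- Higher order description (structure): a system with two component slots.
-- f1 and f2 are the component parameters, x1 and x2 the inputs,
-- and the resulting pair holds the outputs.
h :: (a -> b) -> (b -> c -> d) -> a -> c -> (b, d)
h f1 f2 x1 x2 = (y1, y2)
  where
    y1 = f1 x1       -- first component
    y2 = f2 y1 x2    -- second component, fed by the first one

-- First-order specialisation (behaviour): the components are instantiated
-- by partial application, so the resulting type mentions no function arguments.
f :: Int -> Int -> (Int, Int)
f = h (* 2) (+)
```

Here `f 3 4` evaluates to `(6, 10)`. Any property of interest can then be stated about `f`, whose arguments and results are plain first-order values, while `h` remains available for building other instances of the same structure.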

| Modelling at the instruction level
At the microarchitectural level, the state of a sequential machine could be modelled by the sma next-state function which is depicted in Figure 6b. It allows observing the evolution of the state

| PMA higher order implementation
The PMA (Pipelined Microarchitectural) model is formalised in terms of clock cycles at the program level. Its construction requires two steps, as depicted in Figure 8a. The first step, which is an irregular computation, progressively fills the pipe until cycle $k = S - 1$, where S is the number of stages. At each clock cycle, a new stage function is activated and computes a new component state. The second step, which starts from cycle $k \ge S$, is a regular computation: it recursively computes the PMA state by applying the next-state function f, defined as $f = (f_1, \dots, f_S)$. It involves S parallel functions $f_i$, each of which performs the task of a stage $S_i$.
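The two phases can be observed in the following simplified Haskell sketch of a pipeline parameterised by its stage functions. It is an illustration under assumptions, not the PMA definition of Figure 8: all stages share one value type, there are no hazards or stalls, and the names `pma` and `sampleStages` are hypothetical.

```haskell
-- pma is parameterised by the stage functions; the state holds one register
-- per stage. Nothing marks a stage that has not been reached yet (fill phase).
pma :: [a -> a] -> [a] -> Int -> [Maybe a]
pma fs inputs k = foldl step (map (const Nothing) fs) (take k inputs)
  where
    -- One clock cycle: stage 1 consumes a new input, and every other stage
    -- consumes the register written by its predecessor in the previous cycle.
    step regs x = zipWith fmap fs (Just x : init regs)

-- Three hypothetical stage functions.
sampleStages :: [Int -> Int]
sampleStages = [(+ 1), (* 2), subtract 3]
```

With the input stream `[10, 20 ..]`, cycle 2 gives `[Just 21, Just 22, Nothing]` (the pipe is still filling), whereas cycle 3 gives `[Just 31, Just 42, Just 19]`: from that cycle on, all three stage functions are applied in parallel at every step.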

| PMA first-order implementation
From the PMA higher order definition, we will derive a partial first-order definition that captures the behaviour of the same design and is easier to prove.
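Let FPMA be the partially applied first-order function obtained from PMA by fixing its stage functions, $FPMA = PMA\ f_1 \dots f_s$, and evaluated from the initial state $(c_1^s, \dots, c_s^s)$. In case of no stalls, the time function which links the pipelined and the sequential states is defined as follows: $t_n(k,j) = (k-j) \cdot s + j$.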
The corresponding sequential state is the state of the sequential model at time $t_n(k,j)$. In case of stalls, the time function is rewritten to take into account e, the number of stalls. Figure 10 shows this synchronisation.
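The no-stall time function can be written directly in Haskell, as in the sketch below. Reading k as the clock cycle and j as the stage index is an interpretation of the text, and the argument order of `tn` (with the number of stages s passed first) is an assumption; the stall case is not covered because its formula is not given here.

```haskell
-- tn s k j: sequential time matching stage j at pipeline cycle k,
-- for an s-stage machine without stalls (meaningful for 1 <= j <= s, k >= j).
tn :: Int -> Int -> Int -> Int
tn s k j = (k - j) * s + j

-- Sequential times matched by the five stages of a 5-stage pipe at cycle k.
syncPoints :: Int -> [Int]
syncPoints k = [ tn 5 k j | j <- [1 .. 5] ]
```

For instance, `syncPoints 5` is `[21, 17, 13, 9, 5]`. At cycle 5 the last stage is synchronised with sequential time 5, whereas the first stage is already working on the instruction that a sequential multicycle machine would only reach at time 21.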

| Timed sequential model for pipelined designs
The TSMP (Timed Sequential Model for Pipelined designs) model against which the FPMA model is verified is a first-order model developed over the SMA model and the time abstraction function. It takes as input the same clock cycle k as the FPMA model, unlike the SMA model, which takes as input the number of instructions to execute. For each clock cycle k, it builds S terms, as shown in Figure 11, each of which computes a partial result bound to an instruction within the pipe.

| Criterion of correctness
Proving the behavioural equivalence of the FPMA model with respect to TSMP model, requires proving the following equation.
This means that, starting from the same initial state $(c_1^s, \dots, c_s^s)$, both models must reveal the same states at each clock cycle k, according to the timing diagram shown in Figure 10. The proof of such an equation decomposes systematically into the proof of the following equations, which are defined only over first-order types and are provable by induction over clock cycles.
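The criterion can be illustrated on a toy model. The sketch below is hypothetical throughout: it assumes independent instructions (no hazards, no stalls), a single value type for all stage registers, and it checks the equality by bounded testing rather than by the induction used in the article. The functions `fpma`, `sma` and `tsmp` are simplified stand-ins for the models of the same names.

```haskell
type Stage = Int -> Int

-- Time function without stalls (s stages, cycle k, stage j).
tn :: Int -> Int -> Int -> Int
tn s k j = (k - j) * s + j

-- Pipelined model: all stage registers at clock cycle k.
fpma :: [Stage] -> [Int] -> Int -> [Maybe Int]
fpma fs inputs k = foldl step (map (const Nothing) fs) (take k inputs)
  where
    step regs x = zipWith fmap fs (Just x : init regs)

-- Sequential multicycle model: one stage per tick, one instruction at a time.
-- Returns all stage registers after t ticks.
sma :: [Stage] -> [Int] -> Int -> [Maybe Int]
sma fs inputs t = foldl tick (map (const Nothing) fs) [0 .. t - 1]
  where
    s = length fs
    tick regs n =
      [ if j == r then fmap (fs !! j) src else v | (j, v) <- zip [0 ..] regs ]
      where
        (q, r) = n `divMod` s   -- instruction index and 0-based stage index
        src    = if r == 0 then Just (inputs !! q) else regs !! (r - 1)

-- Timed sequential model: for cycle k, one term per stage, each read from
-- the sequential model at the synchronised time tn s k j.
tsmp :: [Stage] -> [Int] -> Int -> [Maybe Int]
tsmp fs inputs k =
  [ if k >= j then sma fs inputs (tn s k j) !! (j - 1) else Nothing
  | j <- [1 .. s] ]
  where
    s = length fs

sampleStages :: [Stage]
sampleStages = [(+ 1), (* 2), subtract 3]

sampleInputs :: [Int]
sampleInputs = [10, 20 ..]

-- Bounded check of the correctness criterion on the sample data.
equivalentUpTo :: Int -> Bool
equivalentUpTo kMax =
  and [ fpma sampleStages sampleInputs k == tsmp sampleStages sampleInputs k
      | k <- [0 .. kMax] ]
```

On this toy instance `equivalentUpTo 12` holds. The list comprehension in `tsmp` also shows why the criterion splits into one equation per observable: each stage register is compared separately, at its own synchronised sequential time.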

| SSMA higher order implementation
Superscalar designs extend pipelined designs by replicating pipelines so as to issue multiple instructions per clock cycle [40]. For lack of space, only in-order execution designs are considered. The SSMA (SuperScalar Micro-Architectural) model can thus be developed upon the PMA model as follows. Let $W = ((W^1_1, \dots, W^1_s), \dots, (W^n_1, \dots, W^n_s))$ be the SSMA state distributed over n pipelines (each one with S stages), and let $pma_i$, for $1 \le i \le n$, be the component function that performs the functionality of pipeline i, using the stage functions $f_i$. Thus, given an initial state $(c_{11}^s, \dots, c_{s1}^s), \dots, (c_{1n}^s, \dots, c_{sn}^s)$ and n pipelines $pma_1, \dots, pma_n$, the SSMA state at clock cycle $k \ge S$ is captured by the SSMA higher order function shown in Figure 12.
As we can see the superscalar model is built in a modular way over the pipeline model and consequently, its proof is subject to the proof of the pipeline modules.
Let n be the number of pipelines. Two cases are considered. In case of no stalls, the time function is defined as follows: $t_n(k,j,i) = (n \cdot (k-j) + (i-1)) \cdot s + j$, and the corresponding sequential state is the state of the sequential model at that time. In case of stalls, the time function is rewritten to take into account e, the number of stalls (Figure 14).
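The superscalar time function of the no-stall case can be sketched in Haskell as follows. The name `tnS` and its argument order are assumptions, as is the reading of k as the clock cycle, j as the stage index and i as the pipeline index; the stall case is omitted because its formula is not given here.

```haskell
-- tnS n s k j i: sequential time matching stage j of pipeline i at clock
-- cycle k, for n pipelines of s stages each, without stalls
-- (meaningful for 1 <= i <= n, 1 <= j <= s and k >= j).
tnS :: Int -> Int -> Int -> Int -> Int -> Int
tnS n s k j i = (n * (k - j) + (i - 1)) * s + j

-- With a single pipeline, the function reduces to the pipelined case.
tnP :: Int -> Int -> Int -> Int
tnP s k j = tnS 1 s k j 1
```

For a dual-issue design with five stages (n = 2, s = 5) at cycle 5, the last stages of the two pipelines are synchronised with sequential times `tnS 2 5 5 5 1 = 5` and `tnS 2 5 5 5 2 = 10`, that is, with the completion of the first and of the second instruction in the sequential model.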

| Timed Sequential Model for Superscalar designs
The TSMS (Timed Sequential Model for Superscalar designs) model against which the FSMA model will be verified is shown in Figure 15. It is a first-order model built in a modular way over the TSMP model developed so far (Section 3.5.2) for the verification of the pipelined model. By substituting the tsmp by its definition, we get a TSMS definition expressed with SMA terms that reflect exactly the timing between superscalar and sequential models.

| Modelling and verification flow diagram
The Build-and-Prove flow diagram depicted in Figure 16 summarises the different steps of our proof methodology for developing (using parameterisation) and proving (using partial application) higher order models of microarchitectural designs.

Figure 14. Synchronisation between Superscalar and Sequential Models.

| APPLICATION TO THE MIPS PROCESSOR MICROARCHITECTURES
This section applies our proof methodology to the MIPS processor microarchitectures. MIPS processors are well-structured, modular RISC architectures, and consequently they fit well in the incremental design approach. Three processors are considered, one for each category: the sequential MIPS core processor (SMA), the pipelined MIPS processor (PMA) and the superscalar MIPS dual-issue processor (SSMA). These processors, which are all compatible with the MIPS ISA, are incrementally developed one over the other. So, the pipelined processor is parameterised with the sequential core over which it is built, and the superscalar processor is parameterised with the pipelined processor over which it is built in turn. Both pipelined and superscalar designs are proved against the reference sequential core, which is assumed to be correct. A detailed proof methodology for sequential processors is described in Ref. [25].

| Instruction set format
This case study considers the implementation of three MIPS instruction types: an R-type instruction (add), the memory reference M-type instructions (load and store) and a branch B-type instruction (beq). Their formats are shown in Table 1.
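Table 1 is not reproduced here. As an illustration, the sketch below decodes the four instructions using the standard MIPS32 field layout (opcode in bits 31 to 26, rs in 25 to 21, rt in 20 to 16, rd in 15 to 11, funct in 5 to 0, and a 16-bit signed immediate in bits 15 to 0), under which load, store and beq all share the I-type encoding. The `Instr` data type and the `field` and `decode` functions are hypothetical names, not taken from the article.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word32)

data Instr
  = Add Int Int Int   -- add rd, rs, rt
  | Lw  Int Int Int   -- lw  rt, imm(rs)
  | Sw  Int Int Int   -- sw  rt, imm(rs)
  | Beq Int Int Int   -- beq rs, rt, imm
  deriving (Eq, Show)

-- Extract a bit field of the given width starting at bit position lo.
field :: Int -> Int -> Word32 -> Int
field lo width w = fromIntegral ((w `shiftR` lo) .&. (2 ^ width - 1))

-- Decode one 32-bit instruction word; unsupported encodings give Nothing.
decode :: Word32 -> Maybe Instr
decode w = case (op, funct) of
    (0x00, 0x20) -> Just (Add rd rs rt)
    (0x23, _)    -> Just (Lw rt rs imm)
    (0x2B, _)    -> Just (Sw rt rs imm)
    (0x04, _)    -> Just (Beq rs rt imm)
    _            -> Nothing
  where
    op    = field 26 6 w
    rs    = field 21 5 w
    rt    = field 16 5 w
    rd    = field 11 5 w
    funct = field 0 6 w
    raw   = field 0 16 w
    imm   = if raw >= 0x8000 then raw - 0x10000 else raw   -- sign extension
```

For example, `decode 0x00221820` gives `Just (Add 3 1 2)` (add $3, $1, $2) and `decode 0x8C220004` gives `Just (Lw 2 1 4)` (lw $2, 4($1)).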

| MIPS core microarchitecture
The simplified microarchitecture of the MIPS core processor implementing the above instruction set is drawn in Figure 17. Its datapath is divided into five stages, which are clocked in a round-robin fashion. Each stage performs its functionality within one clock cycle. The Fetch and Decode stages are common to all instructions, while the remaining stages (Execute, Memory-access and Write-back) are specific to each instruction. Table 2 gives an informal description of the stages.

| Functional implementation of the MIPS SMA stages
The functions $F_i$ which implement the functionality (in terms of hardware components) of the different stages are described in Table 3. The outputs produced by a stage constitute the inputs of the next stage.

| Modelling the MIPS SMA at the instruction level
The state produced by the different stages of the MIPS sequential microarchitecture during the instruction execution is modelled by the sma function which is shown in Figure 18. It is built as follows: These three observables are computed and captured at the instruction level and the program level as shown in Figure 20a and b, respectively.

| MIPS pipelined microarchitecture
The MIPS pipelined microarchitecture extends the MIPS sequential multicycle one by adding pipeline registers and an appropriate control logic to clock the different stages in parallel, as shown in Figure 21.

| PMA higher order implementation
The function modelling the pipelined MA design will be parameterised with the stage functions f i of the MIPS sequential microarchitecture, as shown in Figure 22.
The functions f i , describing the internal structure of the stages operate in parallel. At each cycle, each one returns a partial result. Thereafter, we could capture any component state. At this level, the stage functions are polymorphic and therefore, the microarchitecture model is technology independent.

| PMA first-order implementation
The PMA partial first-order implementation that describes the behaviour of the MIPS pipelined microarchitecture is represented in Figure 23 by the FPMA function, which is defined as follows:

$FPMA :: (Int, State) \to State$
$FPMA = PMA\ f_1\ f_2\ f_3\ f_4\ f_5$
$FPMA\ (k=5, pc, dm, rf) = (pc, dm, rf)$

| TSMP model
The FPMA model will be verified against the TSMP model which is drawn in Figure 24.

| Correctness Criterion
Let us limit the proof to three observables: the program counter (pc), the data memory (dm) and the register file (rf). Thus, proving the behavioural equivalence of these two first-order models requires proving the following equation.

| MIPS Dual-Issue Microarchitecture
The dual-issue MIPS processor uses the same five-stage structure as the single-issue pipelined MIPS processor. Instructions are paired and aligned on a 64-bit boundary: the processor fetches a 64-bit instruction word, decodes two individual instructions (2 × 32-bit instructions), executes two operations, accesses memory to load or store a 64-bit word, and finally writes back two results. Its microarchitecture is shown in Figure 25, and a modular representation of this microarchitecture is depicted in Figure 26.

| SSMA higher order implementation
The SSMA function modelling the dual-issue microarchitecture will be parameterised with the PMA function which will be distributed over two identical pipelines operating in parallel, as shown in Figure 27.

| SSMA first-order implementation
The partial first-order implementation of the MIPS dual issue is represented by the FSMA function which is derived from the higher order SSMA function as follows: We get the following definition which is represented in Figure 28.

| TSMS model
The FSMA first-order model which represents the behaviour of the MIPS dual-issue MA will be verified against the TSMS first-order model which is shown in Figure 29.

| CONCLUSION
A methodological design approach based on functional techniques for the formal modelling and verification of digital circuits using higher order functions has been presented. The approach exploits both the mechanism of parameterisation and the notion of partial application to incrementally develop both the design and its proof in a systematic way. With respect to alternative approaches, the proposed proof approach brings several contributions, in particular:
- It develops accurate functional models useful for both formal verification and simulation (actual designs combine both techniques).
- It scales well with the design complexity while abstracting from implementation details. Hence, more complex designs can be tackled in a straightforward way.
- It reduces the design proof to the first-order case by simple partial application. It then refines the proof (down to the component level) into a set of verification properties which are separately provable. Consequently, the proof can be limited to the observations of interest, which is very helpful for debugging complex designs. Furthermore, it carries out the proof by induction within the same specification framework, which avoids translating the design properties between different semantics (from the specification framework to the proof framework), a task that remains very difficult.
- It does not require a data abstraction function for the verification of pipelined and superscalar microarchitectural designs; such a function is difficult to construct, and its complexity grows with the design complexity. Only a time abstraction function is needed to map between the models involved in the proof.
- The stalling case (due to hazards) is captured by the proof as well.
One limitation remains, however: a functional graph rewriting framework such as Clean [41] is better suited to mechanical reasoning than a purely functional framework such as Haskell.