Journal of Logical and Algebraic Methods in Programming

The specification of a concurrent program module, and the verification of implementations and clients with respect to such a specification, are difficult problems. A specification should be general enough that any reasonable implementation satisfies it, yet precise enough that it can be used by any reasonable client. We survey a range of techniques for specifying concurrent modules, using the example of a counter module to illustrate the benefits and limitations of each. In particular, we highlight four key concepts underpinning these techniques: auxiliary state, interference abstraction, resource ownership and atomicity. We demonstrate how these concepts can be combined to achieve two powerful approaches for specifying concurrent modules and verifying implementations and clients, which remove the limitations highlighted by the counter example. © 2018 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The specification of a concurrent program module and the verification of implementations and clients with respect to such a specification are difficult problems. When concurrent threads work with shared data, the resulting behaviour can be complex. Reasoning about such modules in a tractable fashion requires effective abstractions that hide this complexity. To be effective, an abstract specification of a module must balance two key requirements: it must be general enough that any reasonable implementation satisfies it; and it must be precise enough that any intended client can use it. A specification that is too precise will disallow some reasonable implementations, while one that is too general will disallow reasonable clients. The specification should support modular verification, in that the verification of the module implementation and clients should only reference the specification, and not each other's code. This requires the specification to be modular, in that it should capture the entire contract between a module and its clients. Since the 1970s, substantial progress has been made on reasoning techniques for concurrency, and recent developments have brought us closer than ever to a general approach to effective modular specification and verification.
In this survey paper, we describe some of the key techniques for reasoning about concurrency that have been developed in recent decades. We restrict our exposition to four concepts which are pervasive and underpin modern program logics for concurrency: auxiliary state, interference abstraction, resource ownership and atomicity. To illustrate these concepts, we consider a concurrent counter module, with an implementation using a spin loop (Section 2.1) and a ticket-lock client (Section 2.2). In Section 3, we look at a range of historical reasoning techniques for concurrency, and how they embody the key concepts:
• Owicki-Gries reasoning [1] introduces auxiliary state (Section 3.2) to abstract the internal state of threads;
• rely/guarantee reasoning [2] introduces interference abstraction (Section 3.3) to abstract the interactions between different threads;
• concurrent separation logic [3] introduces resource ownership (Section 3.4) to encode interference abstraction as auxiliary state;
• linearisability [4] introduces atomicity (Section 3.5) to abstract the effects of an operation so that it appears to take place instantaneously.
Modern program logics, such as TaDA [5,6], Iris [7] and FCSL [8,9], combine these techniques, allowing us to prove effective modular specifications for concurrent modules such as the counter. We compare two approaches: a first-order approach used in TaDA (Section 3.6.2), and a higher-order approach introduced by Jacobs and Piessens [10] and used in Iris (Section 3.6.1). In Section 4, we compare these approaches by showing how the spin-counter implementation can be verified against such a counter specification and how the ticket-lock client can be verified using the specification.

Concurrent modules
We use a concurrent counter module as the case study for this paper. This section describes a spin-counter implementation and a ticket-lock client.

A spin-counter implementation
Consider the spin-counter implementation of a concurrent counter shown in Fig. 1. We make use of three primitive atomic operations (i.e. operations that take effect at a single, discrete instant in time) for manipulating the heap. The load operation x := [E]; reads the value of the heap at the address given by E and assigns it to the variable x. The store operation [E1] := E2; stores the value E2 in the heap at the address given by E1. Finally, the compare-and-set (CAS) operation x := CAS(E1, E2, E3); checks if the value in the heap at the address given by E1 is equal to E2: if so, it replaces it with the value E3 and assigns 1 to x; otherwise, x is assigned 0.
The counter module has three operations. The read operation returns the value of the counter. The incr operation increments the value of the counter and returns the old value, using the compare-and-set operation to do this atomically. The compare-and-set can fail if the value of the counter is changed concurrently, so the operation loops (or spins) until it  succeeds. The wkIncr operation also increments the value of the counter and returns the old value. However, this operation uses a store instead of a CAS, which can lead to different behaviour in a concurrent setting. A specification of the counter module should describe how each operation affects the value of the counter. It should express that the counter must be allocated as a precondition of the operations that access it. It should also describe the permitted interference from the context of concurrent operations. Intuitively, the read and incr operations are robust with respect to concurrent operations that change the value of the counter. By contrast, the potentially faster wkIncr requires that no concurrent operation changes the value of the counter between the load and store operations, in order for it to behave as intended. This informal specification is subtle, and so it is an interesting case study to capture formally.
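The behaviour described above can be sketched as executable code. The following is an illustrative model, not the paper's formal semantics: the class and function names (Cell, make_counter, read, incr, wk_incr) are our own, and atomicity of the primitives is modelled with a per-cell lock.

```python
import threading

# A model of the three primitive atomic operations over a heap cell,
# using a lock so that each primitive takes effect at a single instant.
class Cell:
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):                 # x := [E]
        with self._lock:
            return self._value

    def store(self, value):         # [E1] := E2
        with self._lock:
            self._value = value

    def cas(self, old, new):        # x := CAS(E1, E2, E3)
        with self._lock:
            if self._value == old:
                self._value = new
                return 1
            return 0

def make_counter():
    return Cell(0)

def read(x):
    return x.load()

def incr(x):
    # Spin until the CAS succeeds: robust under concurrent increments.
    while True:
        v = x.load()
        if x.cas(v, v + 1):
            return v

def wk_incr(x):
    # Load then store: correct only if no concurrent update intervenes
    # between the two primitive operations.
    v = x.load()
    x.store(v + 1)
    return v
```

Running several threads that each call incr repeatedly always yields the sum of their increments, while wk_incr offers no such guarantee under contention.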

A ticket-lock client
Consider a ticket-lock client [11] that uses the counter module to provide synchronisation. The code for the ticket lock is given in Fig. 2. The lock uses two counters, the ticket counter next and the serving counter owner, which both initially have the value 0. A thread acquires the lock by calling the acquire operation. This operation increments the next counter to obtain a notional ticket. When the value of the owner counter agrees with this ticket, the thread has acquired the lock.
It can then use whatever resources are protected by the lock, without interference from other threads. Control of these resources is relinquished by calling the release operation. This increments the owner counter, passing the lock on to the next waiting thread. Intuitively, the use of incr for the acquire operation is necessary, since it needs to be robust with respect to concurrent threads taking tickets. The use of wkIncr for the release operation is possible, since only the thread holding the lock should release it.
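The ticket-lock structure just described can be sketched as follows. This is an illustrative model under the same lock-based encoding of atomic cells as before; the class and method names are our own, and the structure follows the paper's description of Fig. 2.

```python
import threading
import time

# Lock-based model of an atomic heap cell.
class Cell:
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def store(self, value):
        with self._lock:
            self._value = value

    def cas(self, old, new):
        with self._lock:
            if self._value == old:
                self._value = new
                return 1
            return 0

def incr(x):                  # robust increment: spins on CAS
    while True:
        v = x.load()
        if x.cas(v, v + 1):
            return v

def wk_incr(x):               # weak increment: load then store
    v = x.load()
    x.store(v + 1)
    return v

class TicketLock:
    def __init__(self):
        self.next = Cell(0)   # ticket counter
        self.owner = Cell(0)  # serving counter

    def acquire(self):
        t = incr(self.next)   # take a ticket; incr must be robust to contention
        while self.owner.load() != t:
            time.sleep(0)     # yield while waiting to be served

    def release(self):
        wk_incr(self.owner)   # only the holder calls this, so wkIncr suffices
```

Threads that update a shared variable only between acquire and release observe no lost updates, which is the mutual exclusion the lock is meant to provide.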
The spin-counter and its ticket-lock client provide a case study that illustrates some of the key difficulties in specifying concurrent modules, and verifying their implementations and intended clients. The challenge is to develop a concurrent specification of the counter module that is strong enough to allow us to reason about the ticket lock. This mandates a precise description of how each operation affects the value of the counter, and a detailed account of concurrent interference, which distinguishes between incr and wkIncr. We require a reasoning technique that can formally express such specifications, and verify implementations and clients using these formal specifications. In Section 3, we concentrate on how to specify the counter module, while in Section 4, we demonstrate how to verify the spin-counter implementation and ticket-lock client using our eventual specification.

Specification
Our objective is to give a specification for the counter module that, in particular, is satisfied by our spin-counter implementation and can be used to verify the ticket lock as a client. We start by considering a sequential specification, before exploring how different techniques can be used to give concurrent specifications for the counter module. In particular, we show how these techniques use the concepts of auxiliary state, interference abstraction, resource ownership, and atomicity.
We then show how recent logics combine these ideas in a way that can be used to give an effective specification for the counter.

Sequential specification
It is straightforward to give a sequential specification for the counter module using standard Hoare triples [12]. A Hoare triple, written {P} C {Q}, states that if program C is executed from a state satisfying assertion P, then it either does not terminate or terminates with the resulting state satisfying assertion Q. The assertions P and Q are given in first-order logic, with predicates of the form E1 → E2 describing those heaps containing a heap cell at address E1 with value E2. The sequential specification of the counter module gives each operation a triple of this form: each operation requires the counter cell, with incr and wkIncr incrementing its value, and all three returning the current (old) value. With Hoare logic, it is possible to verify that the spin-counter implementation satisfies the sequential specification, and to verify the correctness of sequential clients that use the counter module. However, this specification gives no information about the behaviour of the operations in a concurrent setting.
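The triples elided in the extraction above can be sketched as follows. This is a reconstruction from the operations' informal descriptions in Section 2.1; the original display may differ in notation.

```latex
\begin{align*}
&\{\mathsf{emp}\}\;\; x := \mathsf{makeCounter}() \;\;\{x \mapsto 0\} \\
&\{x \mapsto n\}\;\; \mathsf{read}(x) \;\;\{x \mapsto n \wedge \mathit{ret} = n\} \\
&\{x \mapsto n\}\;\; \mathsf{incr}(x) \;\;\{x \mapsto n + 1 \wedge \mathit{ret} = n\} \\
&\{x \mapsto n\}\;\; \mathsf{wkIncr}(x) \;\;\{x \mapsto n + 1 \wedge \mathit{ret} = n\}
\end{align*}
```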

Auxiliary state
Owicki and Gries [1] developed the first tractable proof technique for concurrent programs, identifying the importance of reasoning about interference between threads and of using auxiliary state. With the Owicki-Gries method, each thread is given a sequential proof. When the threads are composed, we must check that they do not interfere with each other's proofs. This is achieved by extending standard Hoare logic with the Owicki-Gries rule for parallel composition:

OG-Parallel

    ⊢ {P1} C1 {Q1}     ⊢ {P2} C2 {Q2}     the proofs of C1 and C2 are non-interfering
    ─────────────────────────────────────────────────────────────
    ⊢ {P1 ∧ P2} C1 ∥ C2 {Q1 ∧ Q2}
The non-interference side-condition constrains the proof derivations for C1 and C2. It requires that every intermediate assertion between atomic actions in the proof of C1 must be preserved by every atomic action in the proof of C2, and vice versa. This side-condition leads to non-compositional reasoning, in the sense that it refers to details of the proof derivations that are not represented in the specifications. An abstract specification for the counter needs to be robust with respect to the non-interference condition. However, in general, the condition will vary depending on the concurrent context. Let us assume that the client may invoke any of the counter operations concurrently, but will not directly interact with the state of the counter. That is, we will only consider interference caused by the counter operations themselves. To this end, we can use an invariant: that is, an assertion that is preserved by each atomic action in the module. For the counter specification, the invariant ∃n. x → n asserts that the counter at x is allocated and has some value. Using this invariant, we can give a specification for the counter module in which each operation simply preserves the invariant. However, these specifications are too weak to verify clients such as the ticket lock. They lose all information about the value of the counter, and give no information about how the operations change this value. In fact, the read operation could change the value of the counter and still satisfy the specification! Unfortunately, assertions that describe the precise value of the counter are not invariants.
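For concreteness, the invariant-based triples presumably take the following shape, with each pre- and postcondition just the invariant itself (a reconstruction; the original display may differ):

```latex
\begin{align*}
&\{\exists n.\, x \mapsto n\}\;\; \mathsf{read}(x) \;\;\{\exists n.\, x \mapsto n\} \\
&\{\exists n.\, x \mapsto n\}\;\; \mathsf{incr}(x) \;\;\{\exists n.\, x \mapsto n\} \\
&\{\exists n.\, x \mapsto n\}\;\; \mathsf{wkIncr}(x) \;\;\{\exists n.\, x \mapsto n\}
\end{align*}
```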
The Owicki-Gries method is able to provide stronger specifications by using auxiliary state, which records extra information about the execution history via auxiliary variables. The auxiliary state is updated by auxiliary code, which instruments the program code. Since the auxiliary code only updates auxiliary variables, it has no effect on the program behaviour and so can be erased. The auxiliary code is not required when the program is run; it is only used for the static logical reasoning.
By way of example, consider two threads that both increment a counter, as in Fig. 3. The auxiliary variables y and z, with initial values 0, are used to record the contribution (that is, the number of increments) of each thread. For each thread, the code of the incr operation is instrumented with code that updates the appropriate auxiliary variable when the CAS operation succeeds. This auxiliary variable must be updated at the same instant as the counter, so that the counter always holds the sum of the two contributions, which is our invariant. This is expressed by the angle brackets, ⟨_⟩, which indicate that the CAS and auxiliary code should be executed in a single atomic step. Note that the auxiliary state can be seen as an abstraction of the internal state of the threads. That is, we can recover all of the relevant information about the auxiliary variables by knowing each thread's program counter and local variables. The auxiliary state captures just the information about the internal state that we need for our reasoning.
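The instrumented two-increment program of Fig. 3 can be replayed as a sketch. This is an illustrative model: the dictionary and lock are our own encoding, with the lock standing in for the angle brackets that bundle the CAS and the auxiliary update into one atomic step.

```python
import threading

# Counter x and auxiliary variables y, z recording each thread's contribution.
state = {"x": 0, "y": 0, "z": 0}
atomic = threading.Lock()

def instrumented_incr(aux):
    while True:
        with atomic:
            v = state["x"]          # read the counter
        with atomic:                # models <CAS(x, v, v + 1); aux := aux + 1>
            if state["x"] == v:
                state["x"] = v + 1
                state[aux] += 1     # auxiliary update at the same instant
                break
        # CAS failed: another thread moved the counter; spin and retry.

t1 = threading.Thread(target=instrumented_incr, args=("y",))
t2 = threading.Thread(target=instrumented_incr, args=("z",))
t1.start(); t2.start(); t1.join(); t2.join()
```

After both threads finish, the invariant x = y + z holds, each thread having contributed exactly one increment.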
The resulting specification of the two-increment program is strong, with precise information about the initial and final value of the counter. However, it comes at the price of modularity. Firstly, the incr operations require different specifications depending on the client's use: in our example, the assertion x → y + z uses auxiliary variables y and z; with three threads, the specification requires three auxiliary variables. A modular specification would be one that captures all use cases.
Secondly, each use of the incr operation requires the underlying implementation to be extended with auxiliary code to increment the appropriate auxiliary variable. Modular verification would not modify the module code for each use by the client. Thirdly, and more subtly, the Owicki-Gries method requires the global non-interference condition. To meet this, we made the implicit assumption that the client only interacts with the state of the counter through the counter operations. Modular specification would be explicit about such assumptions regarding the behaviour of the client.
Thesis The concept of auxiliary state, introduced in the Owicki-Gries method, is important for the specification of concurrent modules. Auxiliary state abstracts the internal state of threads, and is a powerful mechanism for giving precise specifications. However, auxiliary variables can violate modularity; module code may be instrumented with different auxiliary code and its specification may give a different description of the auxiliary state, depending on how the client uses the code. As we shall see, various subsequent approaches have taken a more modular approach to auxiliary state than that provided by auxiliary variables in the Owicki-Gries method.

Interference abstraction
Jones [2] introduced interference abstraction, providing the rely/guarantee method as a way to improve the compositionality of the Owicki-Gries approach. To avoid the global non-interference condition, specifications both explicitly constrain the interference from the concurrent context and describe the interference that a thread may cause. To this end, each specification incorporates two relations, the rely and guarantee relations, that abstract the interference between threads. The rely relation abstracts the actions of other threads; each assertion in the derivation must be stable under all of these actions. The guarantee relation abstracts the actions in the current derivation; each atomic update by the thread must be described by the guarantee. (See Fig. 4.) Rely/guarantee specifications have the form R, G ⊢ {P} C {Q}, where the additional relations, R and G, are the rely and guarantee relations respectively. We denote the elements of the rely and guarantee relations as actions p ⤳ q. The actions of the rely relation describe the changes that may be made by the concurrent environment, while the actions of the guarantee relation describe the changes that may be made by the thread (or threads) under consideration. When composing concurrent threads, each thread belongs to the environment of the other and, hence, the guarantee of each thread must be included in the rely of the other.

Fig. 4. Reasoning about concurrent increments using interference abstraction.

The parallel composition rule is therefore adapted to:

RG-Parallel

    R ∪ G2, G1 ⊢ {P1} C1 {Q1}     R ∪ G1, G2 ⊢ {P2} C2 {Q2}
    ─────────────────────────────────────────────────────
    R, G1 ∪ G2 ⊢ {P1 ∧ P2} C1 ∥ C2 {Q1 ∧ Q2}
The rely for C1 consists of the actions that may be performed by the wider environment, namely R, together with the actions that may be performed by C2, namely G2. The guarantee for C1 ∥ C2 consists of both the actions of C1 and the actions of C2, namely G1 ∪ G2. The rely/guarantee specifications for the read and incr operations share an action set A describing increments of the counter. The read specification has an empty guarantee relation, indicating that nothing is changed by the read. It has the rely relation A, stating that other threads can only increment the counter, and that they can do so as many times as they like. The incr specification has the same rely relation. Its guarantee relation is also A, stating that the increment can increase the value of the counter. The guarantee must be defined for all n, as the environment can change the counter value. This means that we cannot express that the incr operation only does a single increment. The rely/guarantee specification for the wkIncr operation is subtle. Recall that, intuitively, the wkIncr operation is intended to be used when no other threads are concurrently updating the counter. As a first try, we can give a simple specification whose rely relation enforces this constraint: the rely relation is empty, so this specification cannot be used in a context where concurrent updates may occur. This means that the guarantee relation can be very precise, consisting of a single action. With this specification, the wkIncr operation will effectively appear as a single atomic operation. Although this specification captures some of the intended behaviour of wkIncr, it is insufficient to reason about the ticket lock. With the ticket lock, it is possible for two invocations of the wkIncr operation to be executing concurrently.
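The elided rely/guarantee triples might take a shape like the following. This is only a plausible reconstruction: stability under the rely forces the postconditions of read and incr to state lower bounds rather than exact values, but the paper's exact form may differ.

```latex
\begin{align*}
A &\triangleq \{\, x \mapsto n \rightsquigarrow x \mapsto n + 1 \mid n \in \mathbb{N} \,\} \\
A, \emptyset &\vdash \{x \mapsto n\}\;\; \mathsf{read}(x) \;\;\{\exists m.\, x \mapsto m \wedge m \geq \mathit{ret} \wedge \mathit{ret} \geq n\} \\
A, A &\vdash \{x \mapsto n\}\;\; \mathsf{incr}(x) \;\;\{\exists m.\, x \mapsto m \wedge m > n \wedge \mathit{ret} \geq n\} \\
\emptyset, A &\vdash \{x \mapsto n\}\;\; \mathsf{wkIncr}(x) \;\;\{x \mapsto n + 1 \wedge \mathit{ret} = n\}
\end{align*}
```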
Consider a situation in which one thread currently holds the lock, while another thread is attempting to acquire the lock (that is, it is in the loop of the acquire operation). Suppose that the first thread executes the release operation, in which it calls wkIncr. After the body of wkIncr has executed, but before the call has returned, the second thread can perform its read and observe that it holds the lock. Moreover, it can then call the release operation, executing wkIncr before the first thread's call to wkIncr has returned. Thus we have two concurrent invocations of wkIncr.
The above specification does not allow this concurrent behaviour, since the empty rely relation rules out all concurrent updates to the counter. It is possible to allow such concurrent updates by changing the rely, but at the expense of weakening the postcondition, with G as before. Notice that the generalised rely states that concurrent increments can only happen when the value of the counter is above n. Also notice that, in generalising the rely, we must weaken the postcondition to make it stable.
In summary, this specification is too weak to reason about the ticket lock. It is possible to instrument the code with auxiliary variables, as with the Owicki-Gries method, again leading to a loss of modularity.
Thesis The concept of interference abstraction, introduced in the rely/guarantee method, is important for the specification of concurrent modules. By abstracting the interactions between different threads, specifications are able to express constraints on their concurrent contexts. This abstraction leads to more compositional reasoning: since the interference is part of the specification, we do not need to examine proofs in order to justify parallel composition. While it may be specified differently, some form of interference abstraction is generally present in subsequent approaches to verifying concurrent programs.

Resource ownership
O'Hearn and Brookes developed a new style of Hoare-logic reasoning for concurrency based on resource ownership, extending the ideas of separation logic [13,14] to concurrency. They introduced concurrent separation logic [3,15], which provides a highly compositional approach to reasoning about concurrency. Resource ownership can be seen as a specialised form of auxiliary state and interference abstraction. Resource ownership provides auxiliary information that a thread has the right to access some resource; the program does not explicitly record which threads own which resources. Ownership also provides the simple interference abstraction that only a thread that owns a resource can update that resource.
Concurrent separation logic uses an assertion language based on the Bunched Logic of O'Hearn and Pym [16]. Assertions in separation logic treat data, such as heap cells or counter objects, as resources. Each operation acts on some specific resource, with the precondition requiring ownership of the resource it represents. When threads operate on disjoint resources, they do not interfere with each other and so their effects can be combined simply. This principle is embodied in the disjoint parallel composition rule:

    {P1} C1 {Q1}     {P2} C2 {Q2}
    ──────────────────────────────
    {P1 * P2} C1 ∥ C2 {Q1 * Q2}

where the assertion P1 * P2 in the conclusion describes the disjoint combination of the resources described by the assertions P1 and P2.

Remark 1. (Disjoint Resources)
The fundamental idea behind separation logic [13,14] is to treat the heap as a resource, which can be subdivided into separate disjoint sub-heaps, also treated as resources. Heap operations only require parts of the heap for their execution. For example, the update of a heap cell only requires the resource of the heap cell; the rest of the heap is not needed for the update. Such operations are local to the specific resource on which they operate, as they do not affect other parts of the heap. Locality is expressed by the frame rule:

    {P} C {Q}
    ─────────────────
    {P * R} C {Q * R}

The frame rule allows us to reason about programs in a local way. We can focus our reasoning on the resource that the program uses; any additional resource, which would not be affected by the program, can be added using the frame rule. In particular, the premiss states that, if a program is run in a state described by precondition P then it will not fault and, if it terminates, the resulting state will be described by postcondition Q. The conclusion states that the program has the same behaviour if the disjoint resource R is added to the precondition and postcondition. This is possible because the separating conjunction * enforces disjointness.

In the original concurrent separation logic, resources can be shared between threads by using invariants. The resource associated with an invariant can only be accessed by a thread during a conditional critical region [3,15], which enforces coarse-grained synchronisation between accesses to the shared resource. We have seen that invariants with auxiliary state allow for precise reasoning, but they are less compositional than the interference abstraction provided by rely/guarantee reasoning. Subsequent developments in concurrent separation logic [17][18][19] incorporate various forms of rely/guarantee reasoning over shared resources in order to support reasoning about fine-grained concurrency, where threads typically make multiple accesses to shared resources through atomic operations.
The concurrent abstract predicate (CAP) [20] approach builds on this with abstractions that hide the shared resources, effectively allowing disjoint concurrent reasoning at the abstract level.
Let us illustrate this CAP reasoning on the counter module. Consider the concurrent abstract predicate Counter(x, n), which denotes the existence of a counter at address x with value n. With our spin-counter implementation, this abstract predicate simply describes the concrete sub-heap that is the heap cell x → n. Treating Counter(x, n) as a resource, we could use the original sequential specification as a concurrent one. However, for multiple threads to use the counter, they would have to transfer the resource between each other using some form of synchronisation. Such a specification effectively enforces sequential access to the counter. This is because the client has no mechanism for dividing the resource: in particular, the entailment Counter(x, n) ⟹ Counter(x, n) * Counter(x, n) just does not hold, since it is not possible to split the concrete heap into two parts which both contain the cell x → n.

Fig. 5. Reasoning about concurrent increments using resource ownership.
Following Boyland [21], Bornat et al. [22] introduced permission accounting to separation logic. This allows shared resources to be divided by associating with them a fraction in the interval (0, 1]. Shared resources may be subdivided by splitting this fraction. For instance, we may associate fractions with our counter resource and declare the logical axiom: Counter(x, n, π1 + π2) ⇐⇒ Counter(x, n, π1) * Counter(x, n, π2), for π1 + π2 ≤ 1. We can now modify our counter specification to give concurrent read access: read requires only some fractional permission π, while both increment operations require the full permission 1. This means that only concurrent reads are permitted; concurrent updates must be synchronised with all other concurrent accesses (both increments and reads). If only partial permission were necessary for the increments, then the specification for read would be incorrect, since it could no longer guarantee that the value being read matched the resource it had.
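The fractional-permission specification sketched above plausibly reads as follows (a reconstruction; the original display may differ in notation):

```latex
\begin{align*}
&\{\mathsf{Counter}(x, n, \pi)\}\;\; \mathsf{read}(x) \;\;\{\mathsf{Counter}(x, n, \pi) \wedge \mathit{ret} = n\} \\
&\{\mathsf{Counter}(x, n, 1)\}\;\; \mathsf{incr}(x) \;\;\{\mathsf{Counter}(x, n + 1, 1) \wedge \mathit{ret} = n\} \\
&\{\mathsf{Counter}(x, n, 1)\}\;\; \mathsf{wkIncr}(x) \;\;\{\mathsf{Counter}(x, n + 1, 1) \wedge \mathit{ret} = n\}
\end{align*}
```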
It is possible to specify concurrent increments by changing how we interpret the counter predicate Counter(x, n, π). Now, the resource Counter(x, n, π) no longer asserts that the value of the counter is n, except if π = 1. Instead, it asserts that the thread is contributing n to the value of the counter; other threads may also have contributions. We can split this counter resource by declaring the logical axiom: Counter(x, n1 + n2, π1 + π2) ⇐⇒ Counter(x, n1, π1) * Counter(x, n2, π2), for n1, n2 ∈ N and π1, π2 ∈ (0, 1]. With the counter operations specified in terms of contributions, at last we have a specification that allows concurrent reads and increments. Fig. 5 shows how this specification can be used to verify the example of two concurrent increments. Whereas in Fig. 3 each thread was instrumented with different auxiliary code, here the code has not been changed. Rather than each thread having an auxiliary variable to record its contribution to the counter, the contribution is recorded in an auxiliary resource that is owned by the thread and encapsulated in the Counter(x, n, π) predicate. This idea of subjective auxiliary state is at the core of Subjective Concurrent Separation Logic (SCSL) [23] (and the subsequent Fine-grained Concurrent Separation Logic (FCSL) [8,9]).
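Under this contribution reading, the operation specifications plausibly take the following shape: read and incr need only a partial permission and learn a lower bound from their own contribution, while wkIncr still demands the full permission (a reconstruction; the original display may differ):

```latex
\begin{align*}
&\{\mathsf{Counter}(x, n, \pi)\}\;\; \mathsf{read}(x) \;\;\{\mathsf{Counter}(x, n, \pi) \wedge \mathit{ret} \geq n\} \\
&\{\mathsf{Counter}(x, n, \pi)\}\;\; \mathsf{incr}(x) \;\;\{\mathsf{Counter}(x, n + 1, \pi) \wedge \mathit{ret} \geq n\} \\
&\{\mathsf{Counter}(x, n, 1)\}\;\; \mathsf{wkIncr}(x) \;\;\{\mathsf{Counter}(x, n + 1, 1) \wedge \mathit{ret} = n\}
\end{align*}
```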
This specification still has weaknesses. It requires the wkIncr operation to be synchronised with the other operations.
It also does not guarantee that sequenced reads will never see decreasing values of the counter (since the contribution is not changed and only provides a lower bound). It is possible to describe a more elaborate permission system that allows wkIncr in the presence of reads, and to extend the predicate to record the last known value as a lower bound for reads.
This would give us a more useful, if somewhat cumbersome, specification. However, it would still not handle the ticket lock. While a ticket lock has been verified using CAP reasoning [20], the proof depends on the atomicity of the underlying counter operations in order to synchronise access to shared resources. The proof does not work with any of our abstract specifications, since they simply do not embody the necessary atomicity.
Thesis The concept of resource ownership, developed in the work on concurrent separation logic and its successors, is important for the specification of concurrent modules. The idiom of ownership can be seen as a form of auxiliary state, which critically embodies a notion of disjointness and interference abstraction. Various approaches have explored the power of ownership for reasoning about concurrency [20,23,24,8,25,26,9,7]. While it is an effective concept, and can be used to give elegant specifications, something more is required to provide the strong specifications required for our ticket-lock example.

Atomicity
Atomicity is the abstraction that an operation takes effect at a single, discrete instant in time. The concurrent behaviour of such operations is equivalent to a sequential interleaving of the operations. A client can use such operations as if they were simple atomic operations.
Herlihy and Wing introduced linearisability [4], a well-known correctness condition for atomicity, which identifies when the operations of a concurrent module appear to behave atomically. Using the linearisability approach, each operation is given a sequential specification. The operations are then proved to behave atomically with respect to each other. One way of seeing this is that there is an instant during the invocation of each operation at which that operation appears to take effect. This instant is referred to as the linearisation point. With linearisability, the interference of every operation is tolerated at all times by any of the other operations. Consequently, the interference abstraction is deemed to be the module boundary.
Given our sequential specification for the counter in Section 3.1, is our implementation linearisable? If we only consider the read and incr operations, then yes, it is. However, the addition of the wkIncr operation breaks linearisability. The problem with wkIncr is that, for instance, two concurrent calls can result in the counter only being incremented once.
This is not consistent with atomic behaviour.
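The lost-update scenario can be replayed deterministically. The following sketch is illustrative, not from the paper: generators model the two halves of wkIncr (load, then store), so that we can force the problematic interleaving by hand.

```python
# Two overlapping wkIncr calls: both load the same value, so two
# increments have the net effect of one.
counter = {"x": 0}

def wk_incr_steps():
    v = counter["x"]        # load the current value
    yield                   # interleaving point: the other call may run here
    counter["x"] = v + 1    # store the incremented value
    yield v                 # return the old value

t1, t2 = wk_incr_steps(), wk_incr_steps()
next(t1)           # t1 loads 0
next(t2)           # t2 loads 0
ret1 = next(t1)    # t1 stores 1 and returns 0
ret2 = next(t2)    # t2 also stores 1 and returns 0
```

Both calls return 0 and the counter ends at 1, which no sequential interleaving of two atomic increments can produce.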
The essence of the problem is that we only envisage calling wkIncr in a concurrent context where there are no other increments. In such a case, it would appear to behave atomically. However, the sequential specification cannot express this constraint. We need an interference abstraction that constrains the concurrent context. Linearisability is related to the notion of contextual refinement. With contextual refinement, the behaviour of program code is described by (more abstract) specification code. (In general, this specification code need not be directly executable.) Contextual refinement asserts that the specification code can be replaced by the program code in any context, without introducing new observable behaviours; we say that the program code contextually refines the specification code. Filipović et al. [27] have shown that, under certain assumptions about a programming language, linearisability implies contextual refinement for that language. For a linearisable module, each operation contextually refines the operation itself executed atomically. For instance, the code for incr(x) contextually refines the atomic command ⟨incr(x)⟩. Conversely, contextual refinement implies linearisability.
CaReSL [24] is a logic for proving contextual refinement of concurrent programs. CaReSL makes use of auxiliary state, interference abstraction and ownership in the technical proofs. However, these concepts are not exposed in specifications.
This means that it is not obvious what a suitable specification of wkIncr in CaReSL should be.
Thesis The concept of atomicity, put forward by linearisability, is important for the specification of concurrent modules. Atomicity can be seen as a form of interference abstraction: it effectively guarantees that the only observable interference from an operation will occur at a single instant in its execution. This is a powerful abstraction, since a client need not consider intermediate states of an atomic operation (which, for non-atomic operations, might violate invariants) but only the overall transformation it performs.

Synthesis
We now examine two approaches, a higher-order approach and a first-order approach, that bring together the ideas we have so far discussed to provide expressive modular specifications for concurrent modules.

A higher-order approach
One way of overcoming the non-modularity of the Owicki-Gries method was introduced by Jacobs and Piessens [10]. Their key idea is to give higher-order specifications for operations, which are parametrised by auxiliary code that is performed when the abstract atomic operation appears to take effect (the linearisation point). Where previously we instrumented the code of the incr operation differently for different call sites, here it is instrumented uniformly; the auxiliary code is a parameter that is determined at the call site.
Applying this idea to the incr operation, the implementation is instrumented with an auxiliary code parameter ρ. When the atomic update to the counter occurs, ρ is invoked, which can update the client's auxiliary state. The function ρ is parametrised by the value of the counter immediately before the update occurs, which allows the update to the auxiliary state to depend on this value.
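As a concrete illustration, the following Python sketch models the instrumented incr. This is our own minimal encoding, not the paper's formal development: the atomic block is modelled with a lock, and the auxiliary-code parameter ρ (here `rho`) runs in the same atomic step as the update.

```python
import threading

class Counter:
    """Illustrative counter whose incr takes an auxiliary-code parameter rho
    (the names here are ours, not the paper's formal development). rho(n)
    runs at the linearisation point, in the same atomic step as the update,
    and may update client-side auxiliary (ghost) state; it receives the
    value of the counter immediately before the update."""

    def __init__(self):
        self._value = 0
        self._atomic = threading.Lock()   # models the atomic block

    def incr(self, rho):
        with self._atomic:                # the single atomic step
            n = self._value
            self._value = n + 1
            rho(n)                        # auxiliary code, run atomically
            return n

# Client: two threads increment; ghost variables y and z record each
# thread's contribution, so the invariant  value == y + z  holds at every
# atomic step (cf. Fig. 6).
ghost = {"y": 0, "z": 0}

def bump(key):
    return lambda n: ghost.__setitem__(key, ghost[key] + 1)

c = Counter()
t1 = threading.Thread(target=c.incr, args=(bump("y"),))
t2 = threading.Thread(target=c.incr, args=(bump("z"),))
t1.start(); t2.start(); t1.join(); t2.join()
assert c._value == ghost["y"] + ghost["z"] == 2
```

Because ρ runs inside the atomic block, the ghost state is updated in the same indivisible step as the counter, which is exactly what makes the client-side invariant stable.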
The specification of incr is parametrised by the specification of the auxiliary code. Written as a proof rule, the specification takes the following shape: the premisses are the equivalence I * P ⇔ ∃n. x → n * R(n) and the auxiliary-code specification ∀n. {x → n + 1 * R(n)} ρ(n) {I * T(n)}; the conclusion is {I * P} incr(x)[ρ] {I * ∃n. T(n)}. In the conclusion of this rule, the assertion I is an invariant; it is disjoint from the pre- and postcondition, and must be preserved by the atomic updates of all threads. At the point where the counter is atomically incremented, the following steps conceptually take place: 1. the equivalence from the premiss is used to convert the disjoint combination of the invariant I and the precondition P into the disjoint combination of the counter heap assertion x → n and R(n), for some value of n; 2. the module performs the increment, updating x → n to x → n + 1; and 3. the auxiliary code ρ(n) is run, updating the combination of x → n + 1 and R(n) to the combination of I and T(n).
This specification enables us to exploit the expressivity of auxiliary variables in a modular way. In particular, Fig. 6 shows how this technique can be used to prove two concurrent increments. The proof is very similar to the one shown in Fig. 3. The new specification allows us to abstract the atomic update performed by incr and use the same module implementation for both threads. The invariant I is instantiated as x → y + z. The predicate R(n) is instantiated as n = y + z. The predicates P and T are instantiated with the pre- and postconditions of incr at each call site. Since the lifetime of the threads is syntactically scoped, we can create an invariant that holds for this scope: we require it to hold before we enter the scope and assume that it holds after the scope; outside the scope, it is no longer invariant. (When threads are created by a fork operation, their lifetimes are not syntactically scoped, and so a different approach to invariants is required.)

The read operation can be specified similarly, with the auxiliary-code premiss {x → n * R(n)} ρ(n) {I * T(n)}: since read does not change the counter, the auxiliary code is run with the counter unchanged at x → n.

Finally, recall that the wkIncr operation is intended to be used when there are no updates from the environment. This can be specified by a similar rule. A key difference in the wkIncr specification is that n is not quantified in the premisses. This is because the value of the counter must be preserved by other threads before the update.

Note that, although these specifications are written in the form of proof rules, they are actually implications: the implementation must show that the conclusion follows from the premisses, and a client can use the conclusion if it establishes the premisses. The predicates I, P, T and R, as well as the ghost code ρ, are universally quantified: the client can instantiate them as necessary.
This higher-order specification approach has been adopted in other higher-order logics such as HOCAP [25], iCAP [26] and Iris [7]. In these logics, auxiliary state is not manipulated by auxiliary code, but by view shifts [28]. These view shifts serve essentially the same purpose: they are able to update auxiliary state, but have no effect on the concrete state.

A first-order approach
An alternative way of providing specifications for concurrent modules was introduced in the program logic TaDA [5,6] using atomic triples. Rather than treating atomic specifications as a higher-order construct, atomic triples build such specifications into TaDA as a first-order construct. An atomic triple has the form:

A x ∈ X. ⟨P(x)⟩ C ⟨Q(x)⟩

Abstractly, this can be read as: C atomically updates P(x) to Q(x), under the assumption that the environment ensures that, before the atomic update, P(x) holds continuously for some x ∈ X, which may change over time. The "pseudo-quantifier" A is part of the syntax of the TaDA specification, and not a quantifier in the underlying assertion language. If x were bound by the standard universal quantifier ∀ instead, C would still update P(x) to Q(x) for arbitrary x, but the environment would not be permitted to change the value of x. The A x binding combines the arbitrary nature of x (hence the resemblance to ∀) and the changeable nature of x.

This abstract description hides the subtle behaviour permitted by an implementation. The implementation may assume that the assertion P(x0) holds initially for some x0 ∈ X. The implementation must tolerate continual interference from the environment updating P(x) to P(x′) for any x, x′ ∈ X. The implementation may make updates, but it must preserve P(x) at each step; it cannot change the value of x itself. At some point, if the implementation terminates, it must update P(x) to Q(x) for the current choice of x. After this update, Q(x) is no longer available to the implementation, and another thread may be using it.
In TaDA, resources may belong to a particular thread or be shared between all threads. Shared resources are encapsulated by shared regions, a kind of invariant that establishes protocols for threads to use the encapsulated resources. To ensure that the protocol is followed, a thread can only access the contents of a shared region for the duration of an atomic operation, and the atomic update must conform to the region's protocol. To use C with the above atomic specification to update a shared region, the region's protocol must ensure that P(x) currently holds, and will continue to do so for arbitrary, changeable x ∈ X. Moreover, the thread must have the right to perform the update from P(x) to Q(x) according to the protocol.
With the counter example in TaDA, the counter operations are specified in terms of an abstract predicate [30] that represents the state of a counter: the abstract predicate Counter(s, x, n) asserts the existence of a counter at address x with value n. The first parameter s ranges over an abstract type T1, which captures implementation-specific information about the counter. To the client, the type is opaque; the implementation realises the type appropriately. The predicate confers ownership of the counter: it is not possible to have more than one Counter(s, x, n) for the same value of x.
The specification for the makeCounter operation is a simple Hoare triple:

{True} makeCounter() {∃s ∈ T1. Counter(s, ret, 0)}

The operation creates a new counter, which is initially set to value 0, and returns its address. The specification says nothing about the granularity of the operation. In fact, the granularity is hardly relevant, since no concurrent environment can meaningfully observe the effects of makeCounter until its return value is known: that is, once the operation has been completed.

Remark 2 (On the abstractly-typed parameters).
Typically, a proof of a specification concludes with a step that existentially quantifies over some fixed parameters in the representation of the data structure. With the atomic specifications of TaDA, this approach is not possible, as the rule that existentially quantifies such a parameter inside an atomic triple — from A x ∈ X. ⟨P(s, x)⟩ C ⟨Q(s, x)⟩ (for all s), infer A x ∈ X. ⟨∃s. P(s, x)⟩ C ⟨∃s. Q(s, x)⟩ — is unsound. This rule is unsound because, in the conclusion, the environment is able to change the value of s, while in the premiss the value cannot be changed by the environment. (A limited form of atomic existential rule is sound [29], where the parameter in the premiss is bound by A. In Ntzik's thesis [29], which combines TaDA with refinement, A x is interpreted as a combination of universal quantification with stuttering and mumbling refinement rules that account for x changing over time.) Since we cannot abstract in this fashion, we instead expose the parameter s. The client does not need to know any particular information about s, only that it should not be changed; hence, the type of s can be abstracted.
In contrast, a higher-order logic can avoid this additional parameter. This is done by specifying, in the postcondition of the constructor (makeCounter), that there exists some predicate Counter for which the counter operations satisfy their specifications. Since the parameter is specific to the particular instance of the counter, it is abstracted in the existentially quantified Counter predicate. (Indeed, the address of the counter can also be abstracted in the predicate.) The parameter s can be viewed as an artefact of defunctionalising the higher-order specification.

The specification for the read(x) operation is the atomic triple:

A n ∈ N. ⟨Counter(s, x, n)⟩ read(x) ⟨Counter(s, x, n) * ret = n⟩

Intuitively, this specification states that the read operation will read the state of the counter atomically, even in the presence of concurrent updates by the environment that may change the value of the counter, which are possible as n is bound by A. However, the environment must preserve the counter and cannot, for instance, deallocate it. This atomicity means that the resources in the specification may be shared: that is, concurrently accessible by multiple threads. Sharing in this way is not possible with ordinary Hoare triples, since they make no guarantee that intermediate steps preserve invariants on the resources. The atomic triple, by contrast, makes a strong guarantee: as long as the concurrent environment guarantees that the (possibly) shared resource Counter(s, x, n) is available for some n, the read operation will preserve Counter(s, x, n) until it reads it; after reading, the operation no longer requires Counter(s, x, n), and is consequently oblivious to subsequent transformations by the environment, such as another thread incrementing the counter.
It is significant that the notion of atomicity is tied to the abstraction in the specification. The predicate Counter(s, x, n) can abstract multiple underlying states in the implementation. If we were to observe the underlying state, the operation might no longer appear to be atomic.

The specification of the incr operation is similar:

A n ∈ N. ⟨Counter(s, x, n)⟩ incr(x) ⟨Counter(s, x, n + 1) * ret = n⟩

The specification states that the incr operation will increment the state of the counter atomically and return its previous value, even in the presence of concurrent updates by the environment that may change the value of the counter, which are possible as n is bound by A.

The specification of the wkIncr operation is slightly different:

∀n ∈ N. ⟨Counter(s, x, n)⟩ wkIncr(x) ⟨Counter(s, x, n + 1) * ret = n⟩

The specification states that wkIncr will increment the state of the counter atomically and return the previous value, as long as the environment guarantees that the value of the shared counter will not change before the atomic update. This specification holds for arbitrary n, but this n cannot be changed by the environment, as it is universally quantified in the standard sense. This means that, if the counter is shared, other threads can only perform read operations concurrently until the counter has been incremented. It is, however, possible for other incr or wkIncr operations to occur between the update and the return of the operation.

Atomic triples specify operations with respect to an abstract assertion, such as Counter(s, x, n). This means that each operation can be verified independently of the other operations of the module. This makes it possible to extend modules with new operations without having to verify the existing operations again. Linearisability, by contrast, is a whole-module property: the addition of new operations such as wkIncr can break the linearisability of the module.
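The operational difference between the two specifications can be made concrete. The following Python sketch is our own illustration (the `Cell` class and the hand-driven interleaving are ours, not the paper's): it shows a weak, read-then-write increment losing an update under concurrent interference, while a CAS-based increment detects the interference and retries. The interleaving is driven by hand, so the outcome is deterministic.

```python
# Model: a heap cell x; the strong increment uses CAS, the weak increment
# is a plain read-then-write. We drive an adversarial interleaving by hand
# (no real threads), so the outcome is deterministic.

class Cell:
    def __init__(self, v):
        self.v = v

    def cas(self, old, new):
        # compare-and-set, assumed atomic in a real implementation
        if self.v == old:
            self.v = new
            return True
        return False

def wk_incr_steps(x):
    n = x.v          # step 1: read the counter
    yield            # the environment may run here
    x.v = n + 1      # step 2: write back

# The weak increment loses an update when another increment runs
# between its two steps:
x = Cell(0)
g = wk_incr_steps(x)
next(g)              # weak increment reads 0
x.v += 1             # environment increments: counter is now 1
try:
    next(g)          # weak increment writes back 0 + 1 = 1
except StopIteration:
    pass
assert x.v == 1      # two increments happened, but the counter shows 1

# A CAS-based increment detects the interference and retries:
x = Cell(0)
interfered = False
while True:
    n = x.v
    if not interfered:
        x.v += 1     # environment increments once, between read and CAS
        interfered = True
    if x.cas(n, n + 1):
        break        # first CAS fails (x.v moved on); the retry succeeds
assert x.v == 2      # no update is lost
```

This is precisely why the wkIncr specification uses ∀ rather than A: its correctness depends on the environment not changing the counter before the update.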
TaDA [5] introduced a generalised version of the atomic triple that combines atomic updates to shared resources with non-atomic updates to resources owned by the thread. For example, we can use this to specify an operation that reads the value of the counter into a buffer: the read happens atomically, but the write to the buffer does not, and so ownership of the buffer must be transferred between the client and implementation. Such specifications are not possible with traditional linearisability, although Gotsman and Yang [31] have proposed an extension of linearisability that supports ownership transfer.

Remark 3 (On relating first-order and higher-order approaches).
We can relate the first-order approach to the higher-order approach by encoding atomic triples in the higher-order setting. This approach was taken with Iris [7]. In the Jacobs-Piessens logic, the atomic triple can be encoded as a higher-order "rule" whose premisses capture how an atomic triple may be used; in TaDA, this usage is instead expressed through the proof rules for atomic triples. The first premiss is used to guarantee that the environment maintains P(x) initially. The second premiss establishes that it is legal to update P(x) to Q(x).
In Section 4, we show proofs using both approaches. While the details differ, the essence of the proofs is the same in each approach.

Evaluation

A combination of auxiliary state, interference abstraction, resource ownership and atomicity makes it possible to specify modules in a way that is both precise and modular. The specifications described in this section are precise enough to derive the earlier specifications we have considered. As we shall see in Section 4, they are also precise enough to verify a client such as the ticket lock, which uses counters to provide synchronisation. Furthermore, the specifications do not expose implementation details. This makes it possible to vary the implementation without changing the specification, and hence without updating the proofs of client modules. The result is an expressive, modular approach to the specification and verification of concurrent modules.

Verification
We have discussed both higher-order and first-order approaches to giving expressive, modular specifications for concurrent programs. We now show how these approaches can be used to verify the spin-counter implementation given in Section 2.1, and the ticket-lock client of the counter given in Section 2.2. We begin with the first-order approach of TaDA, before comparing it with a higher-order approach in the style of Jacobs and Piessens.

Remark 4 (Constructors).
We omit the proofs for the constructors makeCounter and makeLock, focusing instead on the operations that specifically involve abstract atomicity.

First-order approach
In Section 3.6.2, we used TaDA's first-order approach to specify the spin counter using atomic triples. TaDA has proof rules for establishing atomic triples. These rules involve shared regions, which are TaDA's mechanism for providing interference abstraction over shared state.

Spin counter
Recall the spin counter implementation from Section 2.1 and the counter specification from Section 3.6.2. To verify the implementation against the specification, we must give an interpretation of the abstract predicate Counter(s, x, n), including an interpretation of the abstract type T1 of its first parameter, and prove the implementation against the specification under this interpretation. Since the specifications are atomic, we cannot simply interpret Counter(s, x, n) as x → n. Instead, TaDA requires that x → n be encapsulated by a shared region, which determines the interference abstraction associated with the counter.
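It may help to recall the shape of the implementation. The following Python sketch of a spin counter is illustrative only (Python has no primitive CAS on attributes, so we model the atomic compare-and-set with a lock): read returns the current value, incr retries a CAS until it succeeds, and wkIncr performs an unsynchronised read-then-write.

```python
import threading

class SpinCounter:
    """Illustrative spin counter in the style of Section 2.1. The _cas
    method models a single atomic compare-and-set instruction with a
    lock, since Python lacks a primitive CAS on attributes."""

    def __init__(self):
        self._v = 0
        self._cas_lock = threading.Lock()

    def read(self):
        return self._v

    def _cas(self, old, new):
        with self._cas_lock:
            if self._v == old:
                self._v = new
                return True
            return False

    def incr(self):
        # retry loop: the linearisation point is the successful CAS
        while True:
            v = self.read()
            if self._cas(v, v + 1):
                return v

    def wkIncr(self):
        # weak increment: read then write, not a single atomic step;
        # sound only when no other thread increments concurrently
        v = self.read()
        self._v = v + 1
        return v

c = SpinCounter()
threads = [threading.Thread(target=c.incr) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert c.read() == 8    # no increments are lost with incr
```

The CAS retry loop in incr is what the TaDA proof must show is abstractly atomic, even though it takes multiple concrete steps.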
A shared region encapsulates resource that is available to multiple threads when they perform atomic operations. The region enforces a protocol that determines how threads can mutate the encapsulated resource. Rather than expressing the protocol for a region directly in terms of the resource it encapsulates, the region is associated with a set of abstract states, and the protocol is specified in terms of these. An interpretation function determines the concrete resource that is associated with each abstract state.
The region is associated with abstract resources called guards -a form of auxiliary state -that determine the role that a thread can play in the protocol. The protocol is defined as a labelled transition system on the abstract states of the region, where the labels are guards. To change the state of the shared region, a thread needs to own the guard associated with the transition it will perform. The guard that a thread owns determines the possible guards that the environment can own, thus limiting the transitions that are available to the environment. Consequently, the guards determine what knowledge a thread can have about the region that is stable: that is, continues to hold under the actions of other threads. In TaDA, the guards for a region are specified as a partial commutative monoid (PCM). This gives us the flexibility to specify complex usage patterns for regions.

Remark 5 (Partial commutative monoids as auxiliary state).
Since concurrent separation logic, there has been extensive work [32,33,28,7] on using partial commutative monoids to model auxiliary state. Partial commutative monoids allow us to express complex patterns for subdividing resources. A partial commutative monoid (G, •, 0) consists of a carrier set G equipped with a partial binary operator • : G × G ⇀ G and a neutral (or identity) element 0 ∈ G, satisfying commutativity (x • y = y • x), associativity ((x • y) • z = x • (y • z)) and identity (x • 0 = x), where each equation means that if either side is defined then so is the other, and they are equal.

The guard PCM, protocol and interpretation function associated with a region are determined by a region type. We define a region type Counter for counter regions. Multiple regions can have the same region type; for example, the ticket lock uses two counters, and hence two instances of the Counter region type. We distinguish instances by giving each region a distinct region identifier. A region type can be parametrised: Counter is parametrised by the address of the heap cell representing the counter. Region type parameters do not change during the lifetime of the region, unlike the region's abstract state. For a Counter region, the abstract state is a natural number representing the current value of the counter. To specify a Counter region with region identifier a, parameter x and current state n, we write Counter_a(x, n).
The guard PCM associated with Counter regions simply comprises an indivisible guard Inc, which is used to increment the counter, and the empty guard 0. The composition Inc • Inc is undefined, and all other compositions are determined by 0 being the neutral element.
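This two-element guard PCM can be written out concretely. The following Python sketch is an illustrative encoding of ours, with None standing for an undefined composition; it checks the PCM laws on the carrier {0, Inc}.

```python
# The guard PCM for Counter regions, written out concretely (an
# illustrative Python encoding). The carrier is {0, Inc}; composition is
# partial, with None standing for "undefined".

ZERO, INC = "0", "Inc"
CARRIER = [ZERO, INC]

def compose(a, b):
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return None    # Inc composed with Inc is undefined: the guard is indivisible

# Check the PCM laws on this carrier:
for a in CARRIER:
    assert compose(a, ZERO) == a                   # 0 is the neutral element
    for b in CARRIER:
        assert compose(a, b) == compose(b, a)      # commutativity
assert compose(INC, INC) is None                   # Inc is indivisible
```

Indivisibility of Inc is what makes the increment capability exclusive: no two threads can simultaneously own it.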
The set of abstract states for Counter regions is the set of natural numbers N, representing the possible values of the counter. The labelled transition system for the region enables the counter to be incremented using the guard Inc. This is specified by:

Inc : ∀n. n ⇝ n + 1

The region interpretation function I for Counter regions is:

I(Counter_a(x, n)) ≜ x → n

With this interpretation, the heap cell that contains the value of the counter is always in the region, and its value corresponds to the abstract state of the region.
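The transition system can be phrased as a predicate on (guard, pre-state, post-state) triples; a hypothetical checker of ours, in Python, makes the role of the guard explicit: a thread may update the region only along a transition labelled by a guard it owns.

```python
# The labelled transition system for Counter regions as a predicate on
# (guard, pre-state, post-state) triples. This checker is illustrative:
# a thread may update the region only along a transition labelled by a
# guard it owns.

def permitted(guard, pre, post):
    # the single transition  Inc : n ~> n + 1
    return guard == "Inc" and post == pre + 1

assert permitted("Inc", 5, 6)        # owning Inc permits an increment
assert not permitted("0", 5, 6)      # the empty guard permits no update
assert not permitted("Inc", 5, 7)    # and Inc permits only +1 steps
```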

Remark 6 (Protocols).
Protocols enforce a set of rules governing how threads can mutate and exchange resources. Protocols exist in many forms, the simplest being unary invariants, such as the ones used in concurrent separation logic. With a unary invariant, all updates must preserve the invariant assertion. Another form of protocol is the relational invariants used in rely/guarantee reasoning. With a relational invariant, updates can only change the state in accordance with the invariant relation: for the environment, this is the rely relation; for the thread, this is the guarantee relation. Various approaches, such as RGSep [17], LRG [18], CAP [20], VCC [34], VeriFast [10] and HOCAP [25], have localised the notion of protocols to specific shared resources, often as regions or other similar constructs. CaReSL [24], SCSL [23] and iCAP [26] extended the concept of regions with a notion of abstract state and a transition system over those abstract states: in CaReSL these protocols are called islands; in SCSL they are called concurroids; and in iCAP and, following iCAP, TaDA [5], Total-TaDA [35] and Caper [36], they are called shared regions. Iris [7] encodes regions with state transition systems using unary invariants and partial commutative monoids. Finally, some logics support additional abstraction over the protocol actions, such as LRG [18], SCSL [23] and CoLoSL [37].

Having defined the Counter region type, we can now give a concrete interpretation to the abstract predicate Counter(s, x, n) and the abstract type T1:

T1 ≜ RId        Counter(a, x, n) ≜ Counter_a(x, n) * [Inc]_a

Here, RId is the set of region identifiers. The region assertion Counter_a(x, n) asserts that there exists a Counter region with identifier a and parameter x in state n. The guard assertion [Inc]_a asserts ownership of the guard Inc for region a. Notice that the Counter(a, x, n) predicate encapsulates ownership of both a Counter shared region and the guard Inc required to update the region.
We are now in a position to prove that the counter implementation satisfies the specification using the above definitions.
The proof outlines for the read, incr and wkIncr operations are given in Figs. 7, 8 and 9, respectively. These proofs use TaDA's core proof rules for deriving atomic specifications: MakeAtomic and UpdateRegion. The MakeAtomic rule allows us to derive an atomic specification that updates a shared region, provided evidence that the code performs a single atomic update on the region, under suitable constraints on how the environment can update the region. The UpdateRegion rule is used to perform the update at the linearisation point, and produces the evidence of this update.
We consider a simplified version of TaDA's MakeAtomic rule. The region assertion t_a(z, x) asserts that the region with identifier a is of type t (for example, Counter), with parameters z, and is in state x. The conclusion of the rule establishes that C effectively atomically updates this region from some state x ∈ X to some state y ∈ Q(x), using the guard G for the region. The first premiss requires that this update is permitted by the transition system, given the guard G. Here, T_t(G) is the set of transitions for region type t that are labelled by G; we assume that this set is closed upwards under adding resource to the guard. T_t(G)* is then the reflexive-transitive closure of this relation. The second premiss requires the assertion R to be pure: that is, independent of resources and regions. The final premiss captures the notion of atomicity of C, with respect to the abstraction in the conclusion, as a proof obligation. Specifically, the region must be in state x for some x ∈ X, which may be changed by the environment, until at some point the thread updates it to some y ∈ Q(x). This obligation is expressed using two new technical concepts that are used in the premiss. The first, a : x ∈ X ⇝ Q(x), is called the atomicity context. The atomicity context records the actual abstract atomic action that is to be performed: from some state x ∈ X to a state in Q(x). The second, a ⇛ −, is the atomic tracking resource. The atomic tracking resource indicates whether the atomic update has occurred (a ⇛ ♦ indicates that it has not) and, if it has, the state of the shared region immediately before and after (a ⇛ (x, y) indicates an update from x to y). The resource a ⇛ − also plays two special roles that are normally filled by guards.
Firstly, it limits the interference on region a: the environment may only update the state so long as it remains in the set X, as specified by the atomicity context. Secondly, it confers permission for the thread to update the region from state x ∈ X to any state y ∈ Q(x); in doing so, the thread also updates a ⇛ ♦ to a ⇛ (x, y). This permission is expressed by the UpdateRegion rule (described below), and ensures that the atomic update only happens once.
In the proof of the read operation given in Fig. 7, the MakeAtomic rule is instantiated so that the atomicity context allows the region to be in any abstract state n ∈ N, with read leaving the state unchanged. The second key TaDA proof rule is the UpdateRegion rule, which uses the atomicity tracking resource to update a region.
We use a simplified version of this rule (in which * binds tighter than ∨). In its premiss, either the atomic operation updates the state of the region to some y ∈ Q(x), or the state is unchanged. In the conclusion, this is represented by either updating the atomic tracking resource to a ⇛ (x, y), or leaving it as a ⇛ ♦. Note that, if y = x in the postcondition, the abstract state of the region is not changed, and we can either perform the atomic update or not.
In the proof of the read operation in Fig. 7, the UpdateRegion rule is instantiated by choosing P(x) ≜ True, Q1(x, y) ≜ (v = x) and Q2(x) ≜ False, and simplifying the assertions. The proof of the read implementation (Fig. 7) first rewrites the specification using the definition of the Counter predicate. It is then possible to apply the MakeAtomic rule. The atomicity context allows the region a to be in any abstract state n ∈ N. The UpdateRegion rule performs the atomic action, which leaves the region in the same state and records the state in the atomic tracking resource.
The proof of the incr implementation (Fig. 8) follows a similar style. The main difference is that, when entering the loop, it first performs a read operation and stores the current value of the counter in v. The OpenRegion rule allows a region to be opened for an atomic operation, provided that the abstract state is unchanged. Here, it is used to read the value of the counter. The UpdateRegion rule is then used to perform the atomic action conditionally. If the atomic compare-and-set operation succeeds, the region transitions from state n to n + 1 and the atomic tracking component is updated. If it fails, the region remains in the same state and the atomic tracking component is not updated. The loop will repeat until the compare-and-set succeeds, with the loop invariant ensuring that the region has not yet been updated. After the loop, the region is guaranteed to have been updated.
The proof of the wkIncr implementation (Fig. 9) is somewhat similar to incr, except that the atomicity context does not allow the environment to change the abstract state of the region before the atomic update occurs. Consequently, the update to the region is always successful.

Ticket lock
We give a specification for a lock based on ownership transfer. We show how to prove that the ticket lock implementation from Section 2.2 satisfies this specification using the atomic specification of the counter.

Lock specification
We start by specifying the lock module using a specification based on ownership transfer, first given in the work on CAP reasoning [20]. The specification provides two abstract predicates: IsLock(x), which is a non-exclusive resource that allows a thread to compete for the lock; and Locked(x), which is an exclusive resource that represents that the thread has acquired the lock, and allows it to release the lock. The lock is specified as follows:

{IsLock(x)} acquire(x) {Locked(x)}
{Locked(x)} release(x) {True}
IsLock(x) ⇔ IsLock(x) * IsLock(x)
Locked(x) * Locked(x) ⇒ False

When a thread acquires the lock, it gets the Locked(x) predicate, which can subsequently be used to release the lock. The last two axioms respectively allow us to duplicate the non-exclusive resource describing the existence of a lock, and guarantee that two threads cannot hold the Locked(x) resource at the same time.
Implementation To verify this implementation against the atomic specification, we first define a shared region for the ticket lock. Recall that the ticket lock comprises two counters: the first counter records the next available ticket, while the second counter records the ticket which currently holds the lock. The lock is considered unlocked when the two counters are equal. In order for a thread to acquire the lock, it must obtain a ticket by incrementing the first counter and then must wait until the second counter reaches the value of the obtained ticket. To release the lock, a thread simply increments the second counter.
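The algorithm just described can be sketched executably. The following Python class is our own illustration, not the paper's code: the atomic increment of the `next` counter is modelled with a lock, while the `owner` counter is released with a plain (weak) increment, which is safe because only the lock holder ever updates it.

```python
import threading
import time

class TicketLock:
    """Illustrative ticket lock built from two counters, in the style of
    Section 2.2: `next` hands out tickets via an atomic increment (incr);
    `owner` says whose turn it is and is bumped on release with a weak
    increment (wkIncr), safe because only the lock holder releases.
    The class and field names are ours, not the paper's."""

    def __init__(self):
        self._next = 0
        self._owner = 0
        self._next_lock = threading.Lock()   # models atomic incr on `next`

    def acquire(self):
        with self._next_lock:                # incr(next): take a ticket
            t = self._next
            self._next = t + 1
        while self._owner != t:              # spin until our ticket is served
            time.sleep(0)                    # yield to other threads
        return t

    def release(self):
        self._owner += 1                     # wkIncr(owner): uncontended

lock = TicketLock()
shared = []

def worker(i):
    lock.acquire()
    shared.append(i)     # critical section: mutual exclusion protects the list
    lock.release()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(shared) == 4 and lock._owner == 4
```

The verification below must justify exactly this division of labour: an atomic incr for taking tickets, and a wkIncr for release, whose weak specification suffices because no other thread can release concurrently.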
To verify the implementation, we introduce a new shared region type, TLock. The abstract state of the region is a natural number n, representing the ticket that currently holds the lock. The guard PCM is generated by the guards Pending(n, m), for n ≤ m ∈ N, and Key(n), for n ∈ N, subject to rules including:

Pending(n, m) • Key(n) = Pending(n + 1, m)

Conceptually, Key(n) represents ownership of ticket n for the lock. The guard Pending(n, m) tracks the yet-to-be-used tickets that are currently held by threads, namely those between n and m − 1 inclusive. These rules allow us to extract tickets from, and merge tickets back into, the Pending guard. This PCM can be seen as an instance of the authoritative monoid of Iris [7].
(Specifically, it is the authoritative monoid on the monoid of finite subsets of N under disjoint union. Here, Pending(n, m) = ●{i | n ≤ i < m} and Key(n) = ○{n}.) An alternative approach would be to have only Key guards, one for each natural number, and define Pending as an infinite combination of Key guards, as in [38,20].

The labelled transition system is given by:

Key(n) : n ⇝ n + 1

It ensures that a thread must hold the ticket Key(n) in order to pass ownership of the lock to the next ticket. We define the region interpretation for TLock regions so that x, the address of the lock, enables us to retrieve the addresses of both counters, located at owner and next respectively, and n is the value of the owner counter. In addition, the logical variables s and t denote the parameters associated with the two counter abstract predicates.
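The stated merge rule and the exclusivity of tickets can be modelled concretely. The following Python encoding is an illustrative sketch of ours, not the paper's formal PCM; compositions we do not need are left undefined.

```python
# The ticket guards modelled concretely (an illustrative encoding, not the
# paper's formal definition). Composition is partial: None means undefined.
# We implement the stated merge rule
#     Pending(n, m) . Key(n) = Pending(n + 1, m)
# and the exclusivity of tickets; other cases are elided in this sketch.

def Pending(n, m):
    assert 0 <= n <= m
    return ("Pending", n, m)

def Key(n):
    return ("Key", n)

def compose(a, b):
    if a[0] == "Key" and b[0] == "Pending":
        a, b = b, a                       # composition is commutative
    if a[0] == "Pending" and b[0] == "Key":
        _, n, m = a
        _, k = b
        if k == n and n < m:              # merge ticket n back into Pending
            return Pending(n + 1, m)
        return None                       # elided in this sketch
    if a[0] == "Key" and b[0] == "Key":
        # distinct tickets compose; the same ticket twice is undefined
        return None if a[1] == b[1] else ("Keys", frozenset({a[1], b[1]}))
    return None

assert compose(Pending(3, 7), Key(3)) == Pending(4, 7)   # merging ticket 3
assert compose(Key(3), Pending(3, 7)) == Pending(4, 7)   # commutativity
assert compose(Key(2), Key(2)) is None                   # tickets are exclusive
```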
We now define the interpretations of the IsLock(x) and Locked(x) predicates in terms of the TLock region and its guards. Since the Locked(x) predicate holds the guard Key(n), for the current state n, exclusively, we guarantee that the region abstract state cannot be changed by the environment, as no other thread holds the guard necessary to perform the update action. It remains to prove the specifications for the operations and the axioms.

The last key TaDA rule that we mention is the UseAtomic rule. In simplified form, this rule allows a region a, with region type t, to be opened so that it may be updated by C, from some state x ∈ X to state f(x). In order to do so, the precondition must include a guard G that is sufficient to perform the update to the region, in accordance with the labelled transition system, as established by the first premiss.
The proofs of the release and acquire operations are given in Fig. 10 and Fig. 11. The interesting part of release is the call to wkIncr. Here, the thread has the Key(n) guard for the current state of the lock n. The UseAtomic rule is applied choosing X = {n} and f (x) = n + 1. When the region is opened, the guards Pending(n, m) and Key(n) combine to give Pending(n + 1, m). Together with the update to the owner counter, the region is closed in state n + 1. In the postcondition of the UseAtomic rule, we must stabilise the assertion to account for the environment's possible changes to the region. Ultimately, we weaken the postcondition to True, as required by the specification.
The acquire proof uses the A quantifier in the premiss of the UseAtomic rule to account for the fact that the state of the lock is not stable. The first use of the UseAtomic rule increments the counter and retrieves a Key(t) for the value read.
After the read, because we own Key(t), we can guarantee that the state of the region cannot be larger than t, i.e. that the environment does not have the necessary guards to perform such a transition. The loop then simply waits until the state of the region matches the ticket. When that happens, we know it cannot change as long as we own the guard Key(t) and as such we can satisfy the Locked(x) predicate.

Higher-order approach
We now consider how the spin-counter implementation and the ticket-lock client can be verified in a higher-order approach based on that of Jacobs and Piessens [10]. Similar proofs are possible in other higher-order separation logics, such as HOCAP [25], iCAP [26] and Iris [7], although the details vary.


Spin counter
Recall the higher-order specification of read, given in Section 3.6.1 in the form of a rule whose premiss is an atomic triple stating that the auxiliary code ρ(n) updates x → n * R(n) to I * T(n). The rule should be read as a logical implication between the premisses and conclusion, with the predicates I, P, R, T and the parameters x, ρ universally quantified. A proof outline for this specification is given in Fig. 12.
In the read operation, the concrete memory read is sequenced with the auxiliary code ρ in a single atomic block. Since it is atomic, we can transfer the invariant I into the local state for the duration, reestablishing it at the end. The assumed bi-implication I * P ⇔ ∃n. x → n * R(n) is used to obtain the x → n resource, which is then read into v. The auxiliary code ρ then updates x → n * R(n) to I * T(n). Once the invariant is closed, we are left with the required postcondition. Compare this proof with the corresponding TaDA proof (Fig. 7). We may view the MakeAtomic rule as establishing a ⇛ ♦ and a ⇛ (n, n) as proxy resources for P and T(n) respectively. The Counter region plays a similar role to the invariant, in that both are opened for the duration of atomic operations. The interpretation of the region plays the role of the bi-implication I * P ⇔ ∃n. x → n * R(n). The UpdateRegion rule both opens the region for the duration of the atomic update and plays the role of ρ in updating the auxiliary state.

Recall the higher-order specification of incr. The proof outline for this operation is given in Fig. 14. By comparison with read, the operation involves multiple atomic steps and so exploits the bi-implication in both directions: at every atomic step where the update does not occur, the bi-implication is used to restore the invariant and precondition. The linearisation point occurs when the CAS succeeds, so the auxiliary code ρ is executed conditionally at this point.
Note that after the first atomic read, no relationship between the value that was read and the current value is retained. This is necessary, since it may in fact change arbitrarily before the CAS operation. This contrasts with the wkIncr operation, where the value cannot change.
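The shape of such a CAS loop can be sketched as follows. Here compare_and_swap is our illustrative model of the primitive, operating on a one-element list that stands for the cell x → n; in real code it must be a single atomic instruction.

```python
def compare_and_swap(cell, expected, new):
    # Model of CAS on a one-element list; in real code this whole
    # check-and-update must be one atomic instruction.
    if cell[0] == expected:
        cell[0] = new
        return True
    return False

def incr(cell):
    """CAS-loop increment: the value v read may be stale by the time of
    the CAS, so no relationship between v and the current value is kept;
    the linearisation point is the successful CAS."""
    while True:
        v = cell[0]                           # atomic read
        if compare_and_swap(cell, v, v + 1):  # succeeds only if unchanged
            return v
```

A failed CAS simply retries with a fresh read, which is exactly why the proof must restore the invariant and precondition at every atomic step where the update does not occur.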
Recall the wkIncr specification. The proof outline for this operation is given in Fig. 13. Here, the value of the cell is fixed at some n, and so when the update is performed, v + 1 will be n + 1. As with read, the higher-order proofs of incr and wkIncr somewhat resemble their TaDA counterparts, but with abstract predicates taking on the roles of the region and atomic tracking resources. The higher-order approach does lead to specifications that obscure the atomic update. However, we can see them as encoding a notion of atomic triple, as per Remark 3.
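By contrast with the CAS-based increment, a weak increment can be sketched as a plain read followed by a write; this is sound only under the assumption, reflected in the wkIncr specification, that the caller has the exclusive right to update the cell, so the value cannot change between the read and the write. The function name and the list representation of the cell are ours.

```python
def wk_incr(cell):
    # Weak increment: a plain read then a write, with no CAS.
    # Sound only when the caller has the exclusive right to update the
    # cell (e.g. the lock holder updating the owner counter), so the
    # value read is still current when the write happens.
    v = cell[0]
    cell[0] = v + 1
```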

Ticket lock
We now show how to verify the ticket lock in the higher-order setting, using the above specifications for the counter. We give the ticket lock a specification where the lock itself (represented by the IsLock predicate) belongs to the invariant. Other than treating IsLock as an invariant rather than a duplicable assertion, this specification is essentially the same as the TaDA specification given in Section 4.1.2.
We define the IsLock and Locked predicates, along with the auxiliary predicate LockDescr, as follows:

LockDescr(x, o, n) ≜ ∃π1, π2. x.owner →π1 o * x.next →π2 n

IsLock(x) ≜ ∃owner, next, n, m. LockDescr(x, owner, next) * n ≤ m * owner → n * next → m * x.ghost →1/3 n * ((n < m * gbag(x.ts, {n + 1, . . . , m − 1})) ∨ (x.ghost →2/3 n * gbag(x.ts, {n, . . . , m − 1})))

Locked(x) ≜ ∃n. x.ghost →2/3 n

The invariant for the lock, IsLock(x), establishes that the x.owner and x.next cells point to owner and next, respectively. It holds only fractional permission for x.owner and x.next, allowing all threads to obtain knowledge of owner and next and to be sure that they will not change; when a thread reads either of these cells, it takes a fractional permission from the invariant. The invariant holds full ownership of the owner and next cells, which represent the owner and next counters, and asserts that the value of the owner counter is at most the value of the next counter. The invariant also uses auxiliary (or ghost) resources, which can only be accessed by auxiliary code. The auxiliary heap cell x.ghost tracks the owner counter. A 1/3 permission for this cell always belongs to the invariant. When a thread holds the lock, the remaining 2/3 permission belongs to that thread, encapsulated by the Locked(x) predicate. Since permissions cannot exceed 1, the axiom Locked(x) * Locked(x) ⇒ False holds. Holding the Locked(x) predicate ensures that no other thread can change the value of the ghost cell, and hence the value of the owner counter, which it tracks. When no thread holds the lock, the remaining 2/3 permission belongs to the invariant. (It might seem that we could use fractional permissions on the owner counter instead of this auxiliary heap cell. However, the specification of the counter read operation requires full permission, so full permission for the owner counter must belong to the invariant.) The other auxiliary resource in the invariant is a ghost bag [10], represented by the gbag predicate.
The ghost bag is an abstract data type that represents a bag (or multiset). Ghost bags have two associated abstract predicates: gbag(b, B), which represents a ghost bag with identifier b and contents B; and gbagh(b, v), which represents the knowledge that the ghost bag with identifier b contains element v, together with the permission to remove this element. The two operations for updating ghost bags are specified as follows: gbag_add(b, v) updates gbag(b, B) to gbag(b, B ⊎ {v}) and produces gbagh(b, v); gbag_remove(b, v) consumes gbagh(b, v) and updates gbag(b, B) to gbag(b, B \ {v}). Here, ⊎, ∈ and \ are bag join, membership and difference operations, respectively. We treat a set as a bag where the elements have multiplicity 1.
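As a runtime analogue of these ghost operations, the following Python sketch models a ghost bag as a multiset. The class and method names are ours; real ghost bags are purely logical resources with no runtime behaviour, so this is only an aid to intuition.

```python
from collections import Counter

class GhostBag:
    """Runtime analogue of the ghost bag: the internal Counter plays the
    role of gbag(b, B), and the value returned by add plays the role of
    the handle gbagh(b, v)."""
    def __init__(self):
        self._contents = Counter()  # the bag B, a multiset

    def add(self, v):
        # analogue of gbag_add: B becomes B join {v}; returns the handle
        self._contents[v] += 1
        return v

    def remove(self, v):
        # analogue of gbag_remove: requires v in B; B becomes B \ {v}
        assert self._contents[v] > 0, "no matching element for this handle"
        self._contents[v] -= 1

    def count(self, v):
        return self._contents[v]
```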
The invariant uses a ghost bag with identifier x.ts to track the pending tickets. When the acquire operation takes a ticket, it adds the ticket to the bag using gbag_add. In doing so, it obtains a gbagh(x.ts, v) resource, representing ownership of ticket v. When it successfully obtains the lock, it removes the ticket from the bag using gbag_remove, losing the resource in the process.
The invariant allows for two cases, depending on whether or not the lock is currently held. If the lock has been successfully acquired by some thread, the disjunct gbag(x.ts, {n + 1, . . . , m − 1}) * n < m holds; in this case, the thread holding the lock owns the fractional permission x.ghost →2/3 n. Fig. 15 gives the proof outline for the release operation. When the invariant is opened, the first disjunct must hold, since otherwise there would be too much ownership of the x.ghost heap cell. We can thus establish that I * P ⇔ owner → n * R(n). The auxiliary code [x.ghost] := [x.ghost] + 1 updates x.ghost to match owner at n + 1. To establish that this code takes owner → n + 1 * R(n) to I * T(n), observe that after the update we hold owner → n + 1 * ∃next, m. LockDescr(x, owner, next) * n + 1 ≤ m * next → m * gbag(x.ts, {n + 1, . . . , m − 1}) * x.ghost → n + 1; choosing the new owner value n′ = n + 1, this repackages as ∃owner, next, n′, m. LockDescr(x, owner, next) * n′ ≤ m * owner → n′ * next → m * x.ghost . . . , reestablishing the invariant. Fig. 16 gives the proof outline for the acquire operation. When incr is used to take a ticket, the auxiliary code adds the new ticket m to the ghost bag, reestablishing the invariant while obtaining the ticket resource gbagh(x.ts, m) for the thread. Holding the ticket resource gbagh(x.ts, t) does not in itself constitute ownership of the lock in the case that t matches the owner counter. Instead, the gbagh(x.ts, t) resource is exchanged for x.ghost →2/3 t when a thread acquires the lock.
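The interplay between the counters and the auxiliary state in these proofs can be sketched by instrumenting a single-threaded, illustrative ticket lock with concrete stand-ins for the x.ghost cell and the ghost bag x.ts. All names here are ours, and the "ghost" fields would be purely logical in the actual proof.

```python
class InstrumentedTicketLock:
    """Single-threaded sketch making the proof's auxiliary state concrete:
    ghost mirrors the x.ghost cell (tracking the owner counter) and
    tickets mirrors the ghost bag x.ts of pending tickets."""
    def __init__(self):
        self.owner = 0
        self.next = 0
        self.ghost = 0          # auxiliary cell: always equals owner
        self.tickets = set()    # pending tickets (ghost bag contents)

    def acquire(self):
        t = self.next             # incr on the next counter ...
        self.next += 1
        self.tickets.add(t)       # ... with auxiliary code: gbag_add(x.ts, t)
        while self.owner != t:    # spin; never loops in this single-threaded sketch
            pass
        self.tickets.remove(t)    # on success, the ticket resource is exchanged
                                  # for the thread's 2/3 share of the ghost cell

    def release(self):
        self.owner += 1           # wkIncr on the owner counter ...
        self.ghost += 1           # ... with auxiliary code: [x.ghost] := [x.ghost] + 1
```

Running acquire then release keeps ghost in step with owner, mirroring how the auxiliary code in Figs. 15 and 16 keeps the ghost cell tracking the owner counter.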

Conclusions
We have examined four major techniques for specifying and verifying concurrent modules: Owicki-Gries, rely/guarantee, concurrent separation logic, and linearisability. For each technique, we identified a particularly valuable contribution. We demonstrated how a synthesis of these contributions can be used to produce effective modular specifications for concurrent modules, using a counter module as a case study. We gave specifications for the counter module in both the first-order approach of TaDA and the higher-order approach of Jacobs and Piessens. With each approach, we demonstrated the expressivity and modularity of the counter specification by proving that it is satisfied by a spin-counter implementation and sufficient to verify a ticket-lock client.