Session typing and asynchronous subtyping for the higher-order π-calculus

The higher-order π -calculus Asynchronous subtyping


Introduction
The higher-order π-calculus with session types. In global computing environments, applications are executed across multiple distributed sites or devices. The use of mobile code is prominent in such environments, where several participants are synthesised by communication of not only passive values but also of runnable code: for example, a service can be delegated to different participants by sending either a channel via which it is accessible or code that accesses it; and incoming code may transit through several devices that alter their computational behaviour or their data through interaction with it. Indeed, mobile code has become pervasive at many levels. For example, when we speak of "software updates" we are in fact referring to mobile code, and we use it in mobile phone applications, operating systems, and all kinds of networked applications.
The Higher-Order π-calculus (HOπ-calculus) [46] is a general formalism of interaction in which two kinds of mobility, name passing and process passing, are integrated in a simple and universal form: in this model, processes can be instantiated by names and other processes, just like a piece of mobile code is instantiated with local capability after migration. This additional expressiveness inherited from the λ-calculus provides a powerful basis for describing and analysing dynamic behaviour in global computing scenarios.
As a type-theoretic foundation for highly structured communication protocols often found in distributed applications, this paper focuses on the notion of sessions and their types [48,54,24]. A session is a series of communications between two parties which form a meaningful logical unit, just like a web session between a browser and a server when a human user interacts with an e-commerce site. Session types model such interactions as an abstract structure of typed choice, inputs and outputs. The study of session typing systems is now widespread due to the need for structured communications in various scenarios in distributed computing. While many advanced session types for the π-calculus and programming languages have been studied, before our work [35] there existed no session typing systems for the HOπ-calculus. Incorporation of sessions into this language offers a general theoretical basis for examining the interplay between two non-trivial features in communication-based programming: higher-order mobility and session-based structured interaction.
As the first contribution, this article establishes the first session type theory for the HOπ-calculus which can statically validate the type safety of complex distributed scenarios with code mobility. In spite of their simple type syntax, the previous literature has shown that obtaining type soundness for session types is an intricate task because of delegation of sessions [54]. Preservation of typability becomes even more non-trivial in the presence of higher-order process passing, especially when the mobile processes contain free sessions.
Higher-order processes with asynchronous sessions. We now outline technical challenges by examples. Code mobility in the HOπ-calculus is facilitated by sending not just ground values and channels, but also abstracted processes that can be received and activated locally, reducing the number of transmissions of remote messages. The simplest code mobility operations are sending a thunked process P via channel s (denoted as s!⟨⌜P⌝⟩), and receiving and running it by applying the unit (denoted as s?(x).x()). In our calculus, communications are always within a session, established when accept and request processes synchronise on a shared channel:

a(x).x!⟨5⟩.x!⟨true⟩.x?(y).(y() | R) | ā(x).x?(z₁).x?(z₂).(x!⟨⌜P⌝⟩ | Q)
This results in a fresh session, consisting of two channels s and s̄, each private to one of the two processes, and their queues initialised to be empty:

(νs)(s!⟨5⟩.s!⟨true⟩.s?(y).(y() | R) | s̄?(z₁).s̄?(z₂).(s̄!⟨⌜P⌝⟩ | Q) | s : ε | s̄ : ε)

To avoid conflicts, an output on a channel s (resp. s̄) places the value on the dual queue s̄ (resp. s), while an input on s (resp. s̄) reads from queue s (resp. s̄). Thus, after two steps the outputs of 5 and true are placed on queue s̄ as follows:

(νs)(s?(y).(y() | R) | s̄?(z₁).s̄?(z₂).(s̄!⟨⌜P⌝⟩ | Q) | s : ε | s̄ : 5 · true)

The session types S₁ of s and S₂ of s̄, where U is the type of ⌜P⌝, have the property S₁ = S̄₂ derived from a duality relation on types, and this guarantees that values are communicated in a complementary order.

Asynchronous communication optimisation with code mobility
The main idea of optimisation by message permutation, in the context of buffered communications, is that outputs can be performed in advance without affecting correctness with regard to the delayed inputs. This is based on the fact that there are two buffers per session (as there are two streams per socket in network programming), which means that we only need to preserve the relative order of outputs (resp. inputs) to avoid communication mismatches. In the previous example, suppose the size of P is very large and it does not contain z₁ and z₂, for example because they appear in Q and the program is not optimised. Then, if s does not appear in P, the right process might wish to start transmission of ⌜P⌝ to queue s concurrently, without waiting for the delivery of 5 and true on queue s̄. Thus, we can send ⌜P⌝ ahead, obtaining s̄!⟨⌜P⌝⟩.s̄?(z₁).s̄?(z₂).Q where s̄ now has the type S₂′ = ![U].?[nat].?[bool].end. The interaction with the left process is still safe since both s and s̄ continue to receive the expected type of value and in the expected order: specifically, s will receive U and s̄ will receive first nat then bool. However, the optimised code is not composable with the other party by the original session system [48], since s̄ cannot be assigned S₂, which is the only type such that S₁ = S̄₂.
To make this optimisation valid, we proposed asynchronous subtyping in [37], by which we can refine a protocol to maximise asynchrony without violating the session. For example, in the above case, S₂′ is an asynchronous subtype of S₂, written S₂′ ⩽c S₂, so the optimised process can also be assigned S₂, and can therefore compose with the left process as before.
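The three session types involved in this example can be written out side by side. The following is a reconstruction from the prose, assuming that 5 and true have the base types nat and bool, that S₁ types the left endpoint s, and that the primed name denotes the type of the optimised right process; only the last line is stated explicitly in the text.

```latex
\begin{align*}
S_1  &= {!}[\mathsf{nat}].{!}[\mathsf{bool}].{?}[U].\mathsf{end}
      && \text{(left process, endpoint } s\text{)}\\
S_2  &= {?}[\mathsf{nat}].{?}[\mathsf{bool}].{!}[U].\mathsf{end} = \overline{S_1}
      && \text{(right process, endpoint } \overline{s}\text{)}\\
S_2' &= {!}[U].{?}[\mathsf{nat}].{?}[\mathsf{bool}].\mathsf{end}
      && \text{(right process with the output of } U \text{ moved ahead)}
\end{align*}
```

Under this reading, the optimised type differs from S₂ only in that the single output has been moved in front of the two inputs, while the relative order of the inputs is unchanged.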
Unsafe optimisations, such as one where the left process sends values in a different order, first ![bool] and then ![nat], are filtered out by the typing system; otherwise z₁ of type nat would receive a value of type bool. The idea of this subtyping is intuitive, and the combination of two kinds of optimisations is vital for typing many practical protocols [50,23] and parallel algorithms [38], but it requires subtle formal formulations due to the presence of higher-order code. The linear functional typing permits sending a value that contains free session channels: for example, s̄!⟨⌜P⌝⟩ can be s̄!⟨⌜s′?(x).s′!⟨x⟩⌝⟩ or even s̄!⟨⌜s̄?(x).s̄!⟨x⟩⌝⟩, which contains its own session (if R conforms with the dual session, e.g., R = s!⟨7⟩.s?(z).0). In the first case, we can permute the output s̄!⟨⌜P⌝⟩ as explained, but in the second case it would be unsafe, since the input action s̄?(x) from the thunk will appear in parallel with s̄?(z₁).s̄?(z₂).Q, creating a race condition, as seen in:

(νs)(s̄?(x).s̄!⟨x⟩ | R | s̄?(z₁).s̄?(z₂).Q | s : … )

This article shows that the combination of two optimisations is indeed possible by establishing soundness and communication-safety. The technical challenge is to prove the transitivity of the asynchronous subtyping integrated with higher-order (linear) function types and session-delegation, since the types now appear in arbitrary positions, both covariantly and contravariantly. Moreover, the definitions are now exposed in detail. Another challenge is to formulate a runtime typing system which handles both stored higher-order code with open sessions and asynchronous subtyping. We demonstrate all aspects of type-preserving optimisations explained above by using e-commerce scenarios.
Outline. This article is a full version of the extended abstracts published in two conference papers [35,36] and the first author's PhD thesis [32]. Here it includes the detailed definitions, expanded explanations, more detailed examples, and complete proofs. We have also updated the related work with recent literature. In the rest of the article, Section 2 defines the syntax and operational semantics, and demonstrates the combined use of sessions, code mobility and asynchronous optimisation with examples. Section 3 defines types and Section 4 introduces the asynchronous subtyping. Section 5 illustrates the typing system for programs and Section 6 extends it to the typing system for runtime processes. Section 7 proves the main theorems, type soundness and communication safety of the typed processes. Section 8 discusses related work and Section 9 concludes the article. Appendices A-C list the detailed definitions and proofs which are omitted from the main sections.

The higher-order π-calculus with asynchronous sessions
The HOπ-calculus with asynchronous sessions, HOSπ, is a variant of the HOπ-calculus [46]. There are two notable differences compared to [46]. First, in HOSπ communications occur in the context of an initiated session synchronising two processes that perform a prescribed protocol. Second, communications are buffered in message queues, to realise asynchronous FIFO semantics. HOSπ encompasses two types of mobility: name passing, with which dynamic communication topologies can be programmed, and code passing, where dynamic behaviour can be achieved by transmitting processes. Note that the calculus is monadic, i.e., only one value is sent/received at each communication step, but this does not affect the results and serves for simplicity.

Syntax
The syntax of HOSπ is given in Fig. 1. The calculus extends the HOπ-calculus with a small kernel of session primitives: a way to initiate a session over a shared channel, a class of session names (which we call endpoints) used for communications within sessions, and primitives for offering and making choices indexed by labels.
Identifiers. Variables range over x, y, z, …. Shared channel names, which are used only to initiate sessions (we describe this in detail further below), are ranged over by a, b, c, …. We write u, v, w, … to represent shared identifiers, that is, those that are either variables or shared channel names. Session channels, ranged over by s, … and s̄, …, are the endpoints through which values are communicated within an established session (which as we shall see is always between exactly two processes).
The name s̄ denotes the dual of s, that is, if one process in a session uses s, the other process uses s̄, and in this way each of the two processes possesses a unique endpoint. This separation of endpoints is similar to the use of two polarities in [19,54]. We define duality to be an involution; thus, the dual of s̄ is again s. This property of endpoint names is used in the reduction semantics, where a communication is synchronised over the two endpoints of a session. We write k, k′, k″, … for linear identifiers, consisting of variables and session channels.
Values. We write V, V′, W, … for those terms that may be used as values, that is, as the object of a communication or as the argument in function application. First, we have identifiers, shared and linear (as standard). Abstraction, written λ(x : U).P, encapsulates a process P, where x may occur free, into a function over x (with type annotation U). This is the basic mechanism for the exchange of processes, and the unit () is useful when we wish to obtain a value from an arbitrary process P: take a variable x not free in P, then λ(x : unit).P is a value, usually referred to as a thunk, and abbreviated to ⌜P⌝. To reveal and execute a process within a thunk, we use the run function λ(y : unit → ⋄).(y ()), which takes a thunk as argument and applies it to the unit value to obtain the hidden process.
To facilitate terms that exhibit infinitary behaviour, we introduce a recursive function constructor μ(x :U → T ).λ(y :U ).P .
In this fixpoint representation, instances of the variable x within P represent the function itself.
Terms. Terms range over P, Q, R, …. The main constructs are as follows.
Session initialisation. u(x).P and ū(x).Q are prefixed processes that may synchronise and commence a session. The interactions will adhere to the session type assigned to the shared identifier u, and since each session consists of two endpoints used in a complementary way, we distinguish the two different behaviours with respect to this type using u and ū. The bound variable x is a placeholder for a fresh session endpoint, initialised after the prefixes react to establish a session.
Input and output. k?(x).P is the standard input prefixed process, with linear subject k and using x as a placeholder for the received value. k!⟨V⟩.P is an output prefixed process, sending value V over session k.
Branching and selection. k ▷ {l₁ : P₁, …, lₙ : Pₙ} offers a set of label-indexed choices lᵢ : Pᵢ on endpoint k, with a process continuation Pᵢ corresponding to each label lᵢ. It is often written k ▷ {lᵢ : Pᵢ}ᵢ∈I with index set I. The dual (or co-action) of a branch is a process ready to perform a selection k ◁ l.P where the chosen label is within the domain of the branch set. Essentially, a branching is an input expecting a label and performing case analysis (which covers all cases) on this label to choose a continuation. Dually, a selection is an output of a label designating a choice. Clearly, it is undesirable to allow the empty set in branching, since no selection can be made (that is, there is no effective co-action), and henceforth we assume that there is at least one branch (and the respective indexing sets, when used, are non-empty).

Fresh names
We write (νa : S) P to denote a process P in which the shared channel a (typed by S) is unique. With (νs) P we denote that the two endpoints s and s̄ are unique in P, that is, no external process can perform a session action on either of these endpoints; this gives non-interference within a session.
Message queues. A message queue s : h provides access, via a session that uses s, to the ordered messages h. It can be thought of as a network pipe in a TCP-like transport mechanism. The messages can be values, or labels which are required for selection and branching.
Other constructs are the nil process 0, parallel composition P | Q , and functional application P Q , which are standard from π-calculus and λ-calculus.
We often omit 0 and some type annotations when not relevant.
The bindings are induced by (νa : S)P, (νs)P, u(x).P, ū(x).P, k?(x).P, λ(x : U).P, and μ(x : U → T).λ(y : U).P. The derived notions of bound and free identifiers, alpha equivalence and substitution are mostly standard. We write fv(P)/fn(P) for the set of free variables/names, respectively, extended to queue processes (which can contain labels) as follows; the complete definition is in Fig. A.13.

fn((νs)
As usual, in all mathematical contexts we assume Barendregt's variable convention, that is, free and bound variables are always chosen to be different, and all bound variables are distinct; the same applies to names. Note that queues and session restrictions appear only in our formalisation of runtime systems, since programmers do not normally write protocols with "open" sessions. Furthermore, we use the terminology program for a process which does not contain such runtime elements.

Reduction semantics
We define the standard structural congruence, denoted '≡', as the smallest equivalence relation which is congruent with respect to the calculus constructors (parallel composition, name restriction, prefixes) and respects the axioms and rules in Fig. 2. The only non-standard rule is for garbage collecting queues from completed sessions: (νs)(s : ε | s̄ : ε) ≡ 0.
The single-step call-by-value reduction, denoted −→, is a binary relation from closed terms to closed terms, defined by the rules in Fig. 3. Rule (beta) is standard from the call-by-value λ-calculus. The case of (rec) is similar, with the added step of unfolding the recursive function, by substituting it in place of the variable y within the function body P. Rule (conn) establishes a new session between two processes a(x).P and ā(z).Q ready to synchronise on a. The result of this rewriting is a parallel composition of the session bodies P and Q with a fresh set of endpoints s and s̄ substituted for the session variables x and z, respectively. The side condition ensures that the new endpoints do not already appear free in either P or Q. The result contains empty queues corresponding to the session channels (ε denotes the empty sequence).
Rules (send) and (sel) respectively enqueue a value or label at the tail of the queue for the dual endpoint k̄. When V is a function, we have higher-order code passing; when V is a session endpoint, we call it higher-order session passing. Rules (recv) and (bra) dequeue, from the head of the queue, a value or label. The rule (recv) substitutes value V for x in P, while (bra) selects the corresponding branch for index m. The received label lₘ must be in the branch set, as indicated by the side condition. Due to the self-inverse duality property of endpoints, if k = s then we have an output from s to s̄, and if k = s̄, the output is from s̄ to s.
Since (conn) provides a queue for each channel, these rules say that a sending action is never blocked (asynchrony) and that two messages from the same sender to the same channel arrive in the sending order (order preservation).
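The queue discipline described by (send), (sel), (recv) and (bra) can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: endpoints are modelled by the hypothetical strings "s" and "s_bar", messages are plain Python values rather than HOSπ values or labels, and a blocked input is represented by returning None.

```python
from collections import deque


class Session:
    """One session with two endpoints, "s" and "s_bar", each owning a FIFO queue."""

    def __init__(self):
        # (conn) creates one empty queue per endpoint.
        self.queues = {"s": deque(), "s_bar": deque()}

    @staticmethod
    def dual(k):
        # Duality is self-inverse: dual(dual(k)) == k.
        return "s_bar" if k == "s" else "s"

    def send(self, k, message):
        # (send)/(sel): enqueue at the tail of the dual endpoint's queue; never blocks.
        self.queues[self.dual(k)].append(message)

    def recv(self, k):
        # (recv)/(bra): dequeue from the head of k's own queue.
        # Returns None when the queue is empty (the input prefix stays blocked).
        return self.queues[k].popleft() if self.queues[k] else None


session = Session()
session.send("s", 5)            # left process outputs on s
session.send("s", True)
session.send("s_bar", "thunk")  # right process outputs early, before its inputs
right_inputs = [session.recv("s_bar"), session.recv("s_bar")]
left_input = session.recv("s")
blocked = session.recv("s")     # nothing left to read on s
```

Because each direction has its own queue, the early output on s_bar does not disturb the order in which 5 and True are delivered, which is the intuition behind the permutation of outputs over inputs.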
In the remaining rules: (app-l) and (app-r) implement a left to right reduction order for functional application; (par) reduces the leftmost parallel process; (resc) and (ress) are standard and reduce a process under name hiding. The last rule, (str), introduces structural congruence [31] into the reduction relation. This is necessary for re-arranging terms to match reduction rules.
Encoding replication. By using recursion, we can represent infinite behaviours of processes such as, e.g., the definition agent def, or the replication !u(x).P of [30,54,24,35]. Replication on a shared name, useful for defining persistent servers, can be encoded as follows:

!u(x).P ≝ (μy.λz.z(x).(P | y z)) u, taking y, z ∉ fv(P)

Hereafter, when writing a replicated connection-prefixed process we shall mean that this encoding is used. Note that we did not (and by typing we cannot) replicate a session endpoint, since that would violate linearity. To validate the encoding, we can observe a reduction using a replicated connection !a(x).P and a suitable co-action ā(z).Q. Note that in the application of rule (conn), since x is bound in !a(x).P, the substitution {s/x} has no effect on this subterm. Once a connection is established via (conn), we can apply structural congruence ≡ to obtain a term where !a(x).P can react again; for this we used the fact that s and s̄ do not occur free in !a(x).P, which is ensured by the conditions of the previous reduction with (conn).

Example: business protocol with code mobility
We show a simple protocol which contains essential features by which we can demonstrate the expressivity of the code mobility and session primitives for the HOSπ -calculus; it consists of a combination of code mobility, session delegation and branching.This extends a typical collaboration pattern that appears in many web service business protocols [23,8,47] to code mobility.
In Fig. 4, we show the sequence diagram for a protocol which models a hotel booking: first, Booking Agency and Client initiate interaction at session x over channel a; then Client starts exchanging a series of information with Agency; during this initial communication, Agency calculates the Round Trip Time (RTT) between Client and Agency; Agency selects an appropriate Hotel and creates a new session y over channel b with that Hotel. If the RTT is short (Fig. 4), then Agency delegates to Client its part of the remaining activity with Hotel, by sending session channel y to Client; then Client and Hotel continue negotiations by message passing. If the RTT is long (Fig. 5), since many remote interactions increase the communication time as well as the danger of communication failures, Agency asks Client to send mobile code to the Hotel (via y) which contains the communications pertaining to the Client's room plan and negotiation behaviour. Client sends the code to Hotel, then Hotel runs it locally, finishing a series of interactions in its location. Finally, Agency receives a commission fee (10 percent of the room rate) via session x, concluding the transaction.
The given scenario is straightforwardly encoded in our calculus, where session primitives make the structure of interactions clearer.Agency first initiates at a and starts the interactions with Client; then it initiates at b and establishes session y; next it invokes either label cont or label move in Client depending on the RTT and sends y (higher-order session passing) to it, and waits for completion of the transaction between Client and Hotel at x ("if-then-else" can be encoded using branching, and we use other base types and their operators).
Client requests a service at a and starts a series of interactions with Agency, and either continues the remaining activity with Hotel or sends the code (a thunk in Line 4). Note that Client can safely send the commission fee back to Agency because the return message x!⟨z × 0.1⟩, which uses session channel x, is embedded in the thunk.
Hotel performs the interactions with Agency and Client via a single session at y (by the facility of higher-order sessions).In Line 6, the code sent by Client is run locally.
Hotel ≝ !b(y).y ▷ { cont : y?(z).y!⟨roomrate(z)⟩.Q ; (5)
                    move : y?(code).(run code | y?(z).y!⟨roomrate(z)⟩.Q) } (6)

This example demonstrates a couple of subtle points whose slight modification would violate the expected "complementarity" of session actions, leading to obvious violations of soundness. First, in Line 4, if we send code which does not complete the session, e.g. if we have interactions at y (say y!⟨w⟩) after sending the thunk in Line 4 of Client, the session at y will eventually appear in three threads (two in Hotel, one in Client), so values may get mixed up due to the nondeterminism on y-actions. Secondly, in Line 6, if we have two or more applications (say run code | run code) instead of exactly one run code, we will end up with duplication of session endpoints (both y and x). Finally, if the code is not activated in Line 6 (for example if we use (λx.0) code instead of run code), the other end of the session, y?(z).y!⟨roomrate(z)⟩.Q, will never find a matching output. Hence, the variable code must appear exactly once and become instantiated into a process exactly once. We type this example in Section 6.2.

Example: optimised business protocol with code mobility
We now show a business protocol which integrates the two kinds of type-safe optimisation: code mobility, by which a protocol can be executed at the location of the receiver, which is especially useful when latency is high; and also message re-ordering, which allows an implementation to perform outputs in advance, essentially permitting both participants of a session to send at the same time. We thus extend the previous protocol to highlight the behaviours that are possible using our methods. Fig. 6 draws the sequencing of actions modelling a hotel booking through a process Agent. On the left, Client behaves dually to Agent; on the right, an optimised MClient utilises type-safe asynchronous behaviour.
The Agent behaves the same towards both clients: initially it calculates the round-trip time (RTT) of communication (rtt) and sends it; it then offers to the other party the option to consider the RTT and either send mobile code to interact with the Agent on its location, or to continue the protocol with each executing their behaviour remotely. When mobile code (after choice move) is received, it is run by the Agent, completing the transaction on behalf of the client in a sequence of steps. The behaviour of Client is straightforward and complementary to Agent, but MClient has special requirements: it represents a mobile device with limited processing power, and irrespective of the RTT it always sends mobile code; moreover, it does not care about money, and provides the credit card number (card) before finding out the rate.
To represent this optimised scenario, we start from the process for Agent (which is a simplification of Agency): x?(roomtype).x!rate .x?(creditcard) . . .
The session is initiated over a, then the rtt is sent, then the choices move and local are offered. If the first choice is made then the received code is run in parallel to the process Q which continues the agent's session, performing optimisation by code mobility. As expected, Client has dual behaviour:

Client = ā(x).x?(rtt).x ◁ move.x!⟨⌜x!⟨ritz⟩.x!⟨suite⟩.x?(rate).x!⟨card⟩ …⌝⟩ …
A more interesting optimisation is given by MClient, one which at first may seem to disagree with the intended protocol: x?(rate) . . .
After the session is established, it eagerly sends its choice move, ignoring rtt, followed by a thunk that will continue the session; another important point is that in the mobile code the output of the card happens before rtt and rate are received. As explained in the previous subsection, even without subtyping, the typing of sessions in the HOπ-calculus poses delicate conditions; in the present system, we can further verify that the optimisation of MClient does not violate communication safety: when values are received they are always of the expected type, conforming to a new subtyping relation given in Section 4. Optimisation by permutation is very delicate: for example, as explained in the introduction, we cannot optimise s?(z₁).s?(z₂).s!⟨⌜s!⟨5⟩⌝⟩.0 into s!⟨⌜s!⟨5⟩⌝⟩.s?(z₁).s?(z₂).0, because the thunk in the first process contains the sender's session (on s) and a permutation to the left (before the inputs) will cause interference as explained in the previous example.
In fact, the second term is untypable.

Higher-order linear types
This short section presents the syntax of the types, which combine linear functions, unrestricted functions, and session types.

Types
The syntax of types is given in Fig. 7. It is an integration of the types from the simply typed λ-calculus with unit and the session types from the π-calculus. Term types range over T, and can be value types, ranging over U, or the process type ⋄. Value types consist of the unit type unit, the type U → T of shared functions, the type U ⊸ T of linear functions, the type S of sessions, and the shared channel type ⟨S⟩, which enforces that sessions initiated on the corresponding channel will follow the protocol defined by S. The session types are defined inductively as follows. The type ![U].S represents the sending of a value of type U, followed by the remaining session S. Dually, with ?[U].S the action will be to receive a value of expected type at least U, followed by S as before. The selection type ⊕[l₁ : S₁, …, lₙ : Sₙ] signifies that one of the choices l₁, …, lₙ will be made (operationally this is an output of a label), and depending on this label the corresponding session continuation chosen from S₁, …, Sₙ will take place. The co-type of selection is the branch type &[l₁ : S₁, …, lₙ : Sₙ], corresponding to the reception of a label followed by the corresponding continuation type as in selection. Recursive session types are written μt.S, where the type variable t is bound and may occur free in S. We only consider contractive recursive types [18,54]. Practically, contractiveness of μt.S means that every free instance of t in S is guarded under at least one input, output, selection or branching constructor.
More general recursive types. Our restriction to tail-recursive types may cause a slight limitation with regard to expressiveness, and as noted in [3] there are safe processes that are not tail-recursive. For example, if we were to encode a data type such as a tree with elements of type T, we would need a type of the shape μt. …
The first branch uses recursion to send the left and right subtrees to the client (which will have a dual type).However, we can easily lift the restriction without changing anything substantial with respect to subtyping, and with minor modifications to some definitions (e.g., Definition 4.1 which defines how recursive types are unfolded), and so it serves for simplicity.

Examples of types
Session types can encode many common interactions. For example, the following type can be used to iterate through a list containing elements of type U: The type describes the behaviour of the client process accessing the list: first a choice is made, either to query the list and discover if it has more elements, by choosing hasnext; or alternatively the choice finished can be made, in which case the protocol reaches its end. If hasnext is chosen, then the list can respond by choosing next, after which the client can receive a value of type U. Moreover, the type variable t signifies that at this point the protocol is repeated from the point of definition, that is, from the μ-binder at the beginning. If the list replies by choosing finished, the protocol is complete.
Abstractions that contain running sessions must be used exactly once, which demonstrates the difference between linear and unrestricted functional types: This term is safe, since the thunk which contains s is used exactly once within the function that receives it. To denote linear usage, the argument has type U = unit ⊸ ⋄.
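The display of the iterator type does not appear in the text above. A reconstruction that follows the prose description step by step, assuming that hasnext, next and finished are the only labels involved and that the client performs the outer selection, would be:

```latex
\[
\mu t.\ \oplus\bigl[\,\mathsf{hasnext} :
      \&\bigl[\,\mathsf{next} : {?}[U].t,\ \ \mathsf{finished} : \mathsf{end}\,\bigr],
   \ \ \mathsf{finished} : \mathsf{end}\,\bigr]
\]
```

The outer selection is the client's choice, the inner branching is the list's reply, and the occurrence of t after the input of U returns the protocol to the μ-binder.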

( λ(x : U ).0 ) • a(x).x! 5 .0
Although the function disappears after the application, this term is safe, because even if the thunk will not be used in the function, it does not contain any linear or session element that needs to be preserved. Hence, the argument must have type U = unit → ⋄. These examples are easy to check with the typing rules in Section 5.
Duality. In the above example (Section 3.2) we show the type of the iterator, but not of the list. In fact the list's type can be obtained by duality. Each session type S has a dual type, denoted by S̄, which describes complementary behaviour. This is inductively defined by the rules in Fig. 8. Essentially, dualisation interchanges input (?) with output (!), and branching (&) with selection (⊕), leaving end, type variables and μ binders unchanged. Duality is an involution: the dual of S̄ is S. Note that we do not need to define duality for other types such as function types, as these are never dualised.
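The description of dualisation can be turned into a short function. The sketch below is not the definition of Fig. 8 itself; it assumes a hypothetical encoding of session types as tagged Python tuples, with payload types kept as opaque strings that are never dualised.

```python
# Hypothetical encoding of session types (an assumption of this sketch):
#   ("end",)                     end
#   ("var", t)                   type variable t
#   ("mu", t, S)                 recursive type  mu t.S
#   ("send", U, S)               ![U].S
#   ("recv", U, S)               ?[U].S
#   ("select", {label: S, ...})  selection
#   ("branch", {label: S, ...})  branching


def dual(S):
    """Swap send/recv and select/branch; leave end, variables and mu binders alone."""
    tag = S[0]
    if tag in ("end", "var"):
        return S
    if tag == "mu":
        return ("mu", S[1], dual(S[2]))
    if tag == "send":
        return ("recv", S[1], dual(S[2]))   # payload type S[1] is not dualised
    if tag == "recv":
        return ("send", S[1], dual(S[2]))
    if tag == "select":
        return ("branch", {label: dual(Si) for label, Si in S[1].items()})
    if tag == "branch":
        return ("select", {label: dual(Si) for label, Si in S[1].items()})
    raise ValueError(f"unknown session type constructor: {tag!r}")


# Client side of a list iterator over elements of type "U", and the list's side.
iterator = ("mu", "t", ("select", {
    "hasnext": ("branch", {"next": ("recv", "U", ("var", "t")),
                           "finished": ("end",)}),
    "finished": ("end",),
}))
list_side = dual(iterator)
```

Applying dual twice returns the original type, which is the involution property stated above.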

Higher-order asynchronous subtyping
This section presents our theory of asynchronous session subtyping: reordered communications between two processes, in the presence of higher-order values and session mobility, can preserve the type-safety of the original protocol.
As we have seen in the introduction, asynchronous subtyping allows processes to perform output actions (which include selections) in advance within the same session, taking advantage of the underlying buffered model of communication.
Thus, we enable certain permutations of inputs with outputs. However, a permutation of two inputs or two outputs is not permissible because it can violate type-safety. Suppose:

P = s!⟨2⟩.s!⟨true⟩.s?(x).0 and Q = s̄?(y).s̄?(z).s̄!⟨y + 2⟩.0.

These processes interact correctly. If we permute the outputs of P to get P′ = s!⟨true⟩.s!⟨2⟩.s?(x).0, then the parallel composition (P′ | Q) causes a type-error. By duality, it is easy to understand why two inputs cannot be permuted. Moreover, an alteration in the relative order of inputs and outputs such that an input is done in advance may cause deadlock, losing progress in session s. For example, consider exchanging s!⟨true⟩ and s?(x) in P, obtaining:

P″ = s!⟨2⟩.s?(x).s!⟨true⟩.0 and Q = s̄?(y).s̄?(z).s̄!⟨y + 2⟩.0.

Obviously (P″ | Q) ends with deadlock, since the two inputs (the second action on both P″ and Q) are blocked after the initial prefixes interact. The only way to optimise the communication within a session is to place outputs before inputs, for example:

Asynchronous subtyping
We begin with some preliminary notions. An occurrence of a type constructor not under a recursive prefix in a recursive type is called a top-level action. For example, $![U_1]$ and ?
Consider the following types:

Intuitively, we want to include $S_1$ in the subtypes of $S_2$, because in the infinite expansion of the types any action of $S_1$ can be matched to one in $S_2$. The first output $![U_1]$ of $S_1$ needs to be matched with a copy of the same output obtained after unrolling the recursion in $S_2$ once, resulting in:

This unrolling is necessary because under the μ binder every action is repeated, and by unrolling once we can obtain one of the possibly infinite instances of the action. For this strategy to succeed, we need to obtain the output $![U_1]$ in $S_2$ which is guarded under the input action $?[U_2]$. Then, the input action can be compared, and the remaining types checked, following the standard coinductive method.
To summarise, in asynchronous coinductive subtyping we need to formalise both the unfolding of a type and also the type contexts specifying the top-level actions that may guard an output (or selection).
We generalise the type unfolding function defined in [19] so that it can be applied to guarded types, yielding the following definition, based on [37]:

For any recursive type S, $\mathrm{unfold}^n(S)$ is the result of inductively unfolding the top-level recursion up to a fixed level of nesting. For example:

From the definition we have that $\mathrm{unfold}^1(\mathrm{unfold}^n(S)) = \mathrm{unfold}^n(\mathrm{unfold}^1(S))$, even though normally we apply from the outside. Also, since recursive types are not unfolded until they become guarded, but only n times, $\mathrm{unfold}^n(S)$ terminates. Moreover, because our recursive types are contractive, there is no need to apply unfolding indefinitely to obtain a guarded type.
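To make the n-times unfolding concrete, here is a minimal Python sketch. It rests on assumptions that are not taken from the definition itself: one unfolding step is assumed to descend through input and output prefixes and to replace each μ it meets by one copy of its body, types are assumed closed, and choice types are left out for brevity. All names (`Rec`, `unfold1`, `unfold`, `subst`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class End:
    """The terminated session type, end."""


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Rec:
    var: str
    body: "Session"


@dataclass(frozen=True)
class Send:
    payload: str
    cont: "Session"


@dataclass(frozen=True)
class Recv:
    payload: str
    cont: "Session"


Session = Union[End, Var, Rec, Send, Recv]


def subst(s: Session, var: str, repl: Session) -> Session:
    """Replace free occurrences of `var` in `s` by `repl` (closed types assumed)."""
    if isinstance(s, End):
        return s
    if isinstance(s, Var):
        return repl if s.name == var else s
    if isinstance(s, Rec):
        # An inner binder for the same variable shadows the outer one.
        return s if s.var == var else Rec(s.var, subst(s.body, var, repl))
    return type(s)(s.payload, subst(s.cont, var, repl))


def unfold1(s: Session) -> Session:
    """One unfolding step: go under prefixes, unroll the first mu on the path."""
    if isinstance(s, (End, Var)):
        return s
    if isinstance(s, Rec):
        return subst(s.body, s.var, s)
    return type(s)(s.payload, unfold1(s.cont))


def unfold(s: Session, n: int) -> Session:
    """Apply the one-step unfolding n times."""
    for _ in range(n):
        s = unfold1(s)
    return s


# A hypothetical type whose output is guarded by an input under the binder.
S2 = Rec("t", Recv("U2", Send("U1", Var("t"))))
```

With this reading, `unfold(S2, 1)` exposes one copy of the input and the output in front of the recursion, and a second step unrolls the recursion again underneath those prefixes, which is why the function has to work on guarded types as well.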
Then, we proceed to define the contexts corresponding to a nested structure of top-level input actions (where branching is treated like input in the sense that a label is to be received). The rationale is that a supertype is less asynchronous than a subtype, hence may consist of input actions before any outputs that need to be checked first, based on the prefix of the subtype. Thus, the multi-hole asynchronous contexts are defined as follows:

Definition 4.2 (Asynchronous contexts).
A ::

We write $A\langle S_h\rangle^{h\in H}$ for the context A with holes indexed by h ∈ H, where each hole $\bullet^{h\in H}$ is substituted with $S_h$. For example, taking H = {1, 2} and

To formalise subtyping in the presence of recursive types a simulation-based (or coinductive) method is used, in which subtyping is determined by membership of the goal within a binary relation on types. We adapt the standard simulation approaches from [19,44,11], extending the method non-trivially to account for asynchrony.

Definition 4.3 (Asynchronous subtyping). A relation
The coinductive subtyping $T_1 \leq_c T_2$ (read: $T_1$ is an asynchronous subtype of $T_2$) is defined when there exists a type simulation $\Re$ with $(T_1, T_2) \in \Re$. Formally, $\leq_c$ is the largest type simulation, defined as the union of all type simulations.
Most cases are similar to the ones in [37,11], but in order to facilitate the asynchronous rules the unfolding of the supertype is performed at each case for some level n. Clauses (1)–(4) and (6) are the base cases, while (5) says that the shared channel type is invariant (as in the standard session types [19,37,24]). Now we focus on the new rules: in (7), an output prefix of $T_1$ can be simulated when $T_2$ can be unfolded to obtain a type that has an output hidden under an asynchronous context A, which by definition consists of only inputs and branchings. The type $U_1$ is compared to $U_2$, the first available top-level output; this is contravariant, which is standard in the π-calculus [44].
Then, the continuation $S_1$ of $T_1$ is compared with the type $A\langle S_{2h}\rangle^{h\in H}$ consisting of the asynchronous context in which the output(s) have been removed, since they were matched with the output prefix of $T_1$. For the input in (8), we do not use any context, since the input must appear as the first action after unfolding. No action can appear before the desired input at the supertype: if there is a branching (which is a form of input, with labels as values) it is not comparable, and if there is an output or selection then $T_2$ cannot be a supertype of the input-prefixed type $T_1$, since it would be intuitively more asynchronous.
In (9), selection is defined similarly to output and any label appearing in $T_1$ must be included in the top-level selections of the asynchronous context derived from $T_2$. Note that in the supertype, each hole in the context may use a different indexing set $J_h$, but the set I of the subtype is included in each of these sets ($\forall h \in H.\ I \subseteq J_h$). Dually to selection, in (10), branching is defined like input and any labelled branch of (the unfolding of) $T_2$ must be supported in $T_1$. Finally (11) forces $T_1$ to be unfolded until it becomes a guarded type.
Remark To include subtyping between base types, we would need to follow [32] where we employ a slightly more elaborate definition, in which for all types except session types output is covariant and input is contravariant. There, we define:

not session types
The subtyping simulation in [32] has the following output-input clauses:

In this definition output appears covariant, but because of the inversion applied only to session types it becomes, in this case, contravariant. This explains why our present definitions show contravariant output subtyping (unit and other invariant types are not affected). Now, if we consider the types int and real with int $\leq_c$ real, then we have (int, real) = (int, real), i.e., no inversion, hence in (7) above we would obtain a covariant subtyping. For example, we would have ![int].end $\leq_c$ ![real].end, and not the opposite, which would be nonsensical.
The usual subtyping for functional types can also be integrated into (3,4) using the above definition from [32], but it is orthogonal to our purposes and therefore omitted for simplicity.

Examples of asynchronous subtyping
We show four small but representative examples which highlight key points of our subtyping relation. The first example shows that permuting outputs in advance of inputs in an infinite type preserves subtyping. The second example demonstrates that in some subtypings, a finite number of extra outputs can appear in the subtype, and dually, a finite number of extra inputs can appear in the supertype; this is acceptable when the total outputs remain infinite without losing type compatibility, and similarly for inputs. The third example demonstrates a case where n-level unfolding is required. The fourth example, which is atypical, exposes a class of subtypings that induce infinite simulation relations, due to asynchronous subtyping.
Three typical examples Consider the types given previously:

It is easy to verify that $S_1 \leq_c S_2$ by checking that the following relation is a type simulation:

It is also straightforward to show that for the following types:

it holds that $S_3 \leq_c S_4$ using the following simulation:

We can demonstrate easily that for the following types:

we have that $S_5 \leq_c S_6$ with the following simulation: $\Re$ = { ($S_5$, $S_6$), (U, U), ($\mathrm{unfold}^1(S_5)$, $S_6$),

in which the fourth pair (which is added when matching the output) is obtained after unfolding $S_6$ at level n = 2, i.e., using $\mathrm{unfold}^2(S_6)$; this is because there are two μ-binders guarding the asynchronous context where the output is located.
Moreover, since $\leq_c$ is transitive, as we prove in the next subsection, we can also find a simulation $\Re$ such that:

whenever $(U_2, U_1) \in \Re$ and $(U_3, U_1) \in \Re$. For this the simulation will support the intermediate results

A more controversial example Consider the types:

Perhaps surprisingly, it holds that $S_7 \leq_c S_8$, as evidenced by the following simulation:

(by the outputs on $\overline{s}$) but not in the program that implements s : $S_7$. As a consequence, in a naive implementation the buffer can increase in size indefinitely, which is undesirable and in some cases unsafe (e.g., buffer overflow). However, dealing with unreachable data is typically the job of a garbage collector, as in most mainstream languages, so we do not think this is a real problem.
Type soundness and progress in the presence of asynchronous subtyping As shown in the last example, a surprising property of our notion of asynchronous subtyping is that it allows an implementation to not actually receive all the values sent to its buffer. It is then natural to ask how this may affect the properties one expects from a sessions system. First, type safety is not violated since no value of unexpected type is ever received within a term, because two inputs (resp. two outputs) on the same endpoint cannot be permuted. However, one property that can be affected is progress. Specifically, if a session on s or a linear function containing s is never received from a buffer (due to a subtyped process not performing the input at all) then a process waiting to perform an action on the dual endpoint $\overline{s}$ may become stuck. This situation is not easy to address in the present framework, because asynchronous optimisation means that we can postpone inputs ad infinitum, which is not so different from not having those inputs at all. On the other hand, the "standard" sessions systems only guarantee progress on a per-session basis, allowing the interleaving of sessions even if it may cause deadlocks, so in that sense not much is lost. We should note that, if one wishes to ensure that all messages are received, there are some solutions: we can restrict subtyping as in [32, p. 181], or follow the recent work [9], motivated in part by our subtyping; we return to this later.

The relation c is a preorder
We conclude this section with the main theorem, stating that $\leq_c$ is a preorder. In inductively defined subtyping systems, commonly presented as a set of deduction rules, transitivity is a property by definition [18,43]. In a coinductive setting, transitivity cannot be assumed, and not every simulation is guaranteed to contain the necessary hypotheses; however, we can prove that $\leq_c$ is transitive by careful construction of supporting simulations, containing the necessary components up to unfolding and context manipulation.
If $\leq_c$ were not transitive, there would not be type safety. The typical explanation is that, if there exist $U_1 \leq_c U_2$ and $U_2 \leq_c U_3$ such that $U_1 \not\leq_c U_3$, then from two consecutive applications of subsumption we may provide a value of type $U_1$ when $U_3$ is expected, which is unsafe when $U_1 \not\leq_c U_3$. For a detailed exposition of the issues arising from the use of coinductive definitions in subtyping, see Chapter 21 of [43].
The standard method of relational composition [19] is not enough for proving the transitivity of $\leq_c$. The difficulty is that, given $S_1 \leq_c S_2$ and $S_2 \leq_c S_3$, we need to find a subtyping relation that includes enough elements to justify $S_1 \leq_c S_3$ directly. However, due to the use of nested n-times unfolding with manipulation of asynchronous contexts, $S_1 \leq_c S_2$ provides insufficient information, which cannot be straightforwardly combined with the hypotheses from $S_2 \leq_c S_3$ to obtain the result.
Our objective is to discover how to obtain the "missing elements," and to achieve it we gradually formalise a set of extensions on simulations, essentially monotone functions from simulations to simulations, and then utilise them to prove the main result, Theorem 4.4, stating that $\leq_c$ is a preorder.

Theorem 4.4 ( c is a preorder). The relation c is reflexive and transitive.
Overview of proof. The proof of Theorem 4.4 is given in Appendix B. Specifically, we perform the following steps:
1. We prove as standard that unfolding $S_1$ or $S_2$ or both in $S_1 \leq_c S_2$ preserves subtyping. We formalise the unfolding extension of a simulation to include such n-times unfoldings. (Lemmas B.1 and B.2, Definition B.3, Proposition B.1.)
2. We define a class of single-step permutation contexts representing an input/branching guarded type. Then we formalise rules for moving an output/selection appearing within such a context (that is, immediately after the initial input/branching), to the position ahead of it. This represents the finest granularity of permutation since it is not defined to be transitive. (Definition B.4.)
3. The contextual extension of a simulation is defined, which is a supporting construction. It is necessary in order to obtain the subtypings that arise when removing an output/selection from a single-step permutation context, thus changing its original structure. (Definition B.5 and Lemma B.6.)
4. The asynchronous extension of degree n is defined by applying n consecutive single-step permutations on the subtypes in a simulation relation, and up to contexts A (that is, also deep within the structure of types). Both the contextual and the unfolding extensions are necessary to prove that this relation is also a simulation. (Definition B.7 and Lemma B.8.)
5. Multi-step permutations that can extract an output/selection from deep within a context A, placing it ahead of all actions (that is, prefixing A), are shown to be included in the asynchronous extension of degree ω. This is effectively a proof that the transitive application of nested single-step permutations is included in the asynchronous extension. (Corollary B.9.)
6. The transitivity connection of two simulations is then defined, utilising a composition of asynchronous extensions for the given simulations. The proof that the transitivity connection is a simulation implies that $\leq_c$ is transitive. (Definition B.10 and Lemma B.11.)
7. The relation $\leq_c$ is shown to be a preorder: reflexivity is easy to obtain using straightforward techniques, and transitivity is proved directly by utilising the result for transitivity connections. (Theorem 4.4.)

Typing system
We now present the typing system, which combines techniques from linear λ-calculus and session typing, integrating the asynchronous subtyping from the previous section. The system presented here is for initial programs, i.e., for terms without any queues or already activated sessions. We will augment the type system later, so as to also cover the runtime constructs.
Environments We first define three kinds of finite mappings for environments, needed when typing a term with free identifiers:

Γ is a finite mapping, associating shared value types to identifiers. Λ associates variables and linear function types. Σ is a finite mapping from variables/session channels to session types. Σ, Σ′ and Λ, Λ′ denote disjoint-domain unions. Γ, u : U means u ∉ dom(Γ), and similarly for the other environments.
Typing judgement The typing judgement takes the shape: Γ; Λ; Σ ⊢ P : T, which is read: under a (global) shared environment Γ and a linear function environment Λ, a term P has type T with session usages described by Σ. We say that a judgement is well-formed if the environments (pairwise) do not share elements in their domains, that is, when the disjoint union dom(Γ) ⊎ dom(Λ) ⊎ dom(Σ) is defined.
To reduce the number of type rules, we make use of the following abbreviation:

Typing rules The typing rules for identifiers, subtyping, and functions are given in Fig. 9. The rules for processes and sessions are given in Fig. 10. In each rule, we assume that the environments in the consequence are defined.
Starting from Fig. 9, the first group is (Common). First we have a rule for the unit value (), assigning the type unit. In the conclusion, notice that an arbitrary Γ is allowed, but no linear variables (Λ = ∅), or sessions (Σ = ∅). This restriction agrees with the use of weakening only for shared environments, a condition necessary for the preservation of linearity. (Shared) is an introduction rule for identifiers with shared types, i.e., not including U ⊸ T or S. (LVar) is for linear variables and (Session) is for session endpoints, recording x : U ⊸ T in Λ and k : S in Σ, respectively. The general strategy is that the environments Λ and Σ record precisely the desired usages of linear variables/sessions, and then within a derivation these usages are combined using disjoint union (to ensure that no copying takes place) and prefixing composition in the case of sessions (to ensure that certain separated usages are seen as one largest use). The use of disjoint union effectively forbids contraction. The absence of weakening guarantees that all linear hypotheses are actually used.
The group (Structure) consists of two rules from Linear Logic [21]. The rule (Promotion) ensures that shared functions do not contain linear terms, as unrestricted functions may be used more than once, breaking linearity, or may not be used at all, again violating linearity by making endpoints or linear functions disappear. The rule is a special case of linear promotion [21], since the type U → T is basically !(U ⊸ T). Dually, (Dereliction) allows a shared function to be used in a linear way, which is perfectly safe, and this is convenient when we wish to record, e.g., ![U ⊸ T].S in an environment where the sent function has the unrestricted type U → T. The group (Subtyping) consists of one subsumption rule, (Sub), introducing the coinductive subtyping $\leq_c$ into typing derivations. We write Σ $\leq_c$ Σ′ when dom(Σ) = dom(Σ′) and for all k : S ∈ Σ, we have k : S′ ∈ Σ′ with S $\leq_c$ S′. Notice that subsumption can apply to the session environment, but not to other environments, and it can also apply to the given type T for the term P.
The second group, (Function), comes from the simply typed linear λ-calculus. In the abstraction rule (Abs), the argument x : U is from the appropriate environment following the definition of #, and it is removed in the conclusion, as expected. (App) is the rule for functional application, and allows the arrow type to be either linear or unrestricted, similarly to [53]; this is needed due to (Rec), since abstractions and variables can always be assigned a linear arrow type, by rules (Abs) and (Dereliction), respectively. The conclusion says that the session environments and linear variable sets of P and Q must be disjoint; otherwise, there is copying (more than one usage) of the respective linear terms, which is forbidden. Rule (Rec) is similar to (Abs), but with the addition of a hypothesis for x in the premise, representing the function itself, and used for typing instances of the function within its body. It is required that the linear function and session environments are empty, since a recursive function may rewrite itself repeatedly, copying all its contents.
In Fig. 10 we have the final group, (Process), for processes integrated with linear functional and session typing. Rule (Nil) types the empty process. (New) hides a shared name. There is no typing rule for session channels (s, $\overline{s}$) in initial programs, but in Section 6 we define a rule (New$_s$) that verifies the communication patterns for the two endpoints s and $\overline{s}$, in order to ensure compatible dyadic interactions up to asynchronous permutations.
(Conn) and (ConnDual) are for initiating sessions. In the premises of (Conn), the usage S of the endpoint x in P has to agree with the type S recorded for the shared identifier u in the typing environment Γ. Rule (ConnDual) is similar; however, the type in the environment Γ is dual to the usage in the session body P. This is needed in order to indicate which side of the session is followed with respect to a shared channel type, since connecting processes must use their endpoints dually. (Recv) is for receiving values, and uses the notation with # to cover the different cases for linear, session, and unrestricted types. The new session type is composed in the conclusion's session environment, in a way that agrees with the protocol, that is, the input is appended before any subsequent actions on k within P.
(Send) is the most complex rule, integrating session typing and linear typing. Either $\Sigma_1$ or $\Sigma_2$ contains the complete session k : S, which in practice means that after sending a value, the rest of the session on endpoint k must appear (and be completed) either in the continuation P of the sending process, or inside the value V. In the latter case, we can even have that V = k, which implements higher-order session passing of k over k, i.e., a self-delegation. The composition $\Sigma_1, \Sigma_2$ is defined in the conclusion, which entails that no endpoint appears in both the remaining sender P and the sent value V, because, in that case, we would have a race condition between the receiver of V and P, in the usage of communications over these common sessions. The same applies to linear variables free in V and P. If V has a functional type, all session endpoints within it must be complete, that is, suffixed with end, because they should not compose further. This is achieved by the necessary use of a suitable instance of (Close). This rule uniformly generalises the corresponding rules in the session types literature [19,48,54,24]. In the conclusion, we delete k : S where it occurs, either in $\Sigma_1$ or $\Sigma_2$, and the updated type for k is recorded in the conclusion's session environment, consisting of the continuation type S prefixed with the output ![U].
In (Par), we parallel-compose two processes, assuming disjointness of linear function and session environments, as in (App). (Bra) and (Sel) are the standard rules for branching and selection from [24]. In (Bra) all continuations $P_i$ must have corresponding session usages on k that agree with the branch type. In (Sel) the continuation P must have a usage $S_j$ on k that agrees with the type corresponding to the selected label $l_j$ on the selection type of the conclusion.
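The disjointness requirement shared by (App), (Send) and (Par) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: linear and session environments are modelled as plain dictionaries from names to type strings, and the helper name `disjoint_union` is hypothetical. The point is only that composing two environments is a partial operation, undefined when a linear name occurs on both sides.

```python
from typing import Dict

Env = Dict[str, str]  # hypothetical model: identifier -> type, written as a string


def disjoint_union(left: Env, right: Env) -> Env:
    """Combine two linear environments; fail if any name would be used twice."""
    shared = set(left) & set(right)
    if shared:
        # Sharing a linear name between two components would amount to copying it.
        raise ValueError(f"linear names used on both sides: {sorted(shared)}")
    return {**left, **right}


# Two parallel components using different endpoints compose without trouble.
combined = disjoint_union({"k": "![int].end"}, {"s": "?[bool].end"})
```

Calling `disjoint_union({"x": "S"}, {"x": "S"})` raises `ValueError`, which mirrors why a term that uses the same endpoint in two parallel components cannot be typed.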
Closing sessions In the above rules for session communication, the premises always contain a hypothesis for the subject of the session action, e.g., k : S appears in $\Sigma_i$ located in the premise of the typing for $k!\langle V\rangle.P$. This does not necessarily imply that k appears in P, as the usage {k : end} can be obtained using (Close). This rule is used to effectively close a session on k by introduction of a hypothesis k : end, in order for further composition (i.e., more session actions on k) to be rejected.

Examples of typing
Here we state a few examples and counter-examples that demonstrate the purpose of the type system. We revisit some examples from Section 3.2 and from the Introduction.
First, session endpoints must not become "forgotten": ( λ(x: S).0 ) • s

In the above term, after reduction by the (beta) rule, the endpoint s will not appear any more, and the session on s might become stuck. This term is only typable if S = end; otherwise it is not typable, because in the premises of rule (Abs) we require a session hypothesis x : S which cannot be introduced in the typing of 0 except by use of (Close). Second, session endpoints must not be copied:

The above term reduces to:

in which we have copied s, breaking the condition of linearity, which is undesirable as the endpoint $\overline{s}$ will nondeterministically interact with one of the outputs, leaving the other waiting forever. The first term is untypable because typing the function body $x!\langle V\rangle \mid x!\langle V\rangle$ with (Par) requires that the sessions in each parallel process are disjoint, which is not the case here due to the common presence of x. We also revisit the examples in Section 3.2.

( λ(x
This term is unsafe as the thunk which contains s does not appear in the function that receives it, after reduction. This is an indirect way for an endpoint to become "forgotten" as before. The typing fails because U = unit ⊸ ⋄ (as above) and (Abs), used for the left subterm of the application, requires x : U to appear in the linear function environment of the typing of 0, which is impossible.
We finally type the optimised higher-order mobility from the Introduction. In the connect process: .end and U is the type of y (receiving the mobile code P ). This is obtained by applying (Conn), (Send), and (Recv) appropriately. On the other hand, in the optimised session: a(x).x! P .x?(z 1 ).x?(z 2 ).Q , .end, applying (ConnDual) with a : S 2 , then (Send) and (Recv). By an application of (Sub) in the body of the session, x can also be typed by S 2 (the dual of S 1 ), because S 2 c S 2 by Definition 4.3 (7). So, the same term can also be assigned a : S 2 which is the same as a : S 1 , and we are done.

Typing system for runtime
The typing system extends the one for programs given previously, replacing a few rules with more general versions. New formulations are needed for the integration of typing at the level of session queues, and for ensuring that the asynchronous calculus is sound.
Queue types Due to the presence of labels in session queues, we need to extend the types to facilitate all buffer components, as follows:

Therefore, every label induces a singleton type identified with the label value.
Session remainder Type soundness is established by also typing the queues created during the execution of a well-typed initial program. We track the movement of linear functions and channels to and from a queue to ensure that linearity is preserved, and we check that endpoints continue to have dual types up to asynchronous subtyping after each use. To analyse the intermediate steps precisely, we utilise a session remainder $S - \vec{\tau} = S'$ which subtracts the vector $\vec{\tau}$ of the queue types of the values stored in a queue from the complete session type S of the queue, obtaining a remaining session $S'$. When the remainder $S'$ is end, then the session has been completed; otherwise it is not closed yet. The rules are formalised in Fig. 11.
(Empty) is a base rule. (Get) takes an input-prefixed session type ?[U].S and subtracts the type U at the head of the queue, then returns the remainder $S'$ of the rest of the session S minus the tail $\vec{\tau}$ of the queue type. (Put) disregards the output action type of the session and calculates the remainder $S'$ of $S - \vec{\tau}$, which is returned prefixed with the original output, giving $![U].S'$. This is because we are subtracting the input queue types, and therefore the output is not consumed.
(Branch) is similar to (Get), but it only records the remainder of the k-th branch with respect to a stored label $l_k$. Dually, (Select) records the remainder on the nested types, similarly to (Put), because selection is an output action. An example of the use of session remainders can be found in Section 6.3.

Fig. 12. Runtime typing for asynchronous HOπ-calculus.
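The five remainder rules (Empty), (Get), (Put), (Branch) and (Select) can be read as a recursive function. The Python sketch below follows the prose description only and makes several assumptions that are not in Fig. 11: recursive types are omitted, payload types are opaque strings compared by equality (no subtyping on queue entries), labels in a queue are wrapped in a hypothetical `Label` class, and an undefined remainder is represented by `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class End:
    """The terminated session type, end."""


@dataclass(frozen=True)
class Send:
    payload: str
    cont: "Session"


@dataclass(frozen=True)
class Recv:
    payload: str
    cont: "Session"


@dataclass(frozen=True)
class Select:
    branches: Tuple[Tuple[str, "Session"], ...]


@dataclass(frozen=True)
class Branch:
    branches: Tuple[Tuple[str, "Session"], ...]


@dataclass(frozen=True)
class Label:
    name: str  # singleton queue type of a stored label


Session = Union[End, Send, Recv, Select, Branch]
QueueType = Union[str, Label]


def remainder(s: Session, queue: Tuple[QueueType, ...]) -> Optional[Session]:
    """Subtract the queue types from session type s; None if undefined."""
    if not queue:                       # (Empty): nothing left to subtract
        return s
    head, tail = queue[0], queue[1:]
    if isinstance(s, Recv):             # (Get): the input consumes the queue head
        return remainder(s.cont, tail) if head == s.payload else None
    if isinstance(s, Send):             # (Put): the output is kept as a prefix
        rest = remainder(s.cont, queue)
        return None if rest is None else Send(s.payload, rest)
    if isinstance(s, Branch):           # (Branch): follow the stored label
        if not isinstance(head, Label):
            return None
        chosen = dict(s.branches).get(head.name)
        return None if chosen is None else remainder(chosen, tail)
    if isinstance(s, Select):           # (Select): subtract inside every option
        rests = [(l, remainder(c, queue)) for l, c in s.branches]
        if any(r is None for _, r in rests):
            return None
        return Select(tuple(rests))
    return None                         # end cannot absorb a non-empty queue


# ?[int].![bool].?[str].end against a queue holding an int and then a str.
session = Recv("int", Send("bool", Recv("str", End())))
result = remainder(session, ("int", "str"))
```

In this example the two inputs are consumed by the queued values and the output survives, so `result` is the type `![bool].end`; a queue whose head does not match the next input yields `None`.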
Typing system for terms with session queues We first extend the session environment as follows:

The typing judgement is now of the form: Γ; Λ; Δ ⊢ l : l and Γ; Λ; Δ ⊢ P : T. The first judgement is used for typing any labels appearing in a session queue. Δ contains usage information for queues in a term (s :: $\vec{\tau}$), so that the cumulative result can be compared with the expected session type; for this we use the pairing (s :: (S, $\vec{\tau}$)) that combines the usage of a channel and the sequence of types already on its queue. Observe that the lighter notation (k : $\vec{\tau}$) is ambiguous, since $\vec{\tau}$ can be $\vec{\tau}$ = S. This is why we use (k :: (S, $\vec{\tau}$)) and (k :: $\vec{\tau}$), respectively.
We define a composition operation on Δ-environments, used to obtain the paired usages for channels and queues:

The typing rules for runtime are listed in Fig. 12. (Label) types a label in a queue, while (Queue) forms a sequence corresponding to the types of the values in a queue: we ensure the disjointness of session environments of values, and apply a weakening of ended session types ($\Sigma_0$) for closure under the structure rules. (New$_s$) is the main rule for typing the two endpoint queues of a session. Types $S_1$ and $S_2$ can be given to the queues s and $\overline{s}$ when the session remainders $S_1'$ and $S_2'$ of $S_1 - \vec{\tau}_1$ and $S_2 - \vec{\tau}_2$ are dual session types up to asynchronous subtyping; more precisely, $S_1'$ must be a subtype of the dual of $S_2'$, written $S_1' \leq_c \overline{S_2'}$. This is equivalent to $S_2' \leq_c \overline{S_1'}$. Since the session endpoints are compatible, we can restrict s. The combination of coinductive subtyping with a syntactic duality operator, which is practically the same as the compatibility relation in [20], has two advantages: first, it avoids the need for a separate coinductive duality as in [19]; secondly, as is detailed in [3], a simple syntactic duality does not work with equi-recursive types, and our solution avoids such problems.
(Par) composes processes, including queues, and records the session usage using the composition operation on Δ-environments; this rule subsumes (Par) for programs. Note that, as this is a runtime typing system, there are no free variables at the top level. Moreover, queues can only appear at the top level, in parallel to the terms that may appear in initial programs, and never inside functions. Finally, we had to redefine (New) to account for restriction over queues, i.e., with a Δ-environment.

Typing the mobile business protocol
We can now type the hotel booking example in Section 2.3, guaranteeing its type safety. Agency has the following types:

$S_1$ contains higher-order session passing of type $S_2$, and the thunk in $S_2$ has a linear arrow type. Client and Hotel just have the dual of Agency's type at a and the dual of Agency's type at b, respectively. Note that in Client, the received session y appears subsequently in the sent code V, which is typed by (Send) with the side condition k : $S_3$ ∈ $\Sigma_2$ explained in Section 6.

Typing the optimised mobile business protocol
Now, using also the runtime typing system, we can type the hotel booking example of Section 2.4, in the presence of asynchronous optimisation for higher-order mobility. Agent and standard Client can be typed, by using the rules in Figs. 9 and 10, as follows:

Type soundness and communication safety
This section studies the key properties of our typing system. First, we show that typed processes enjoy subject reduction and communication safety.
We begin by introducing balanced environments which specify the conditions for composing environments of runtime processes. Our definition extends the one in [19] to accommodate the presence of buffers, using session remainders.

Definition 7.1 (Balanced Δ). balanced(Δ) holds if for all s :: ($S_1$, $\vec{\tau}_1$), s ::

The definition is based on (New$_s$) in the runtime typing system (Fig. 12): intuitively, all subprocesses generated from an initial typable program should conform to the balanced condition.
Next, we define an ordering between session environments which abstractly represents an interaction at session channels.

Definition 7.2 ( ordering).
Recall defined in Section 6. We define

The first four axioms capture the transfer of types (corresponding to values) between programs and queues. For example the first axiom captures how an input session against a non-empty queue will evolve by removing the prefix and head element, respectively. The output axioms can be understood by duality. Then we have rules that introduce n-times unfolding (this is needed due to asynchrony) and arbitrary contexts ( ) which simplify the other rules. In the last rule, which allows us to deal with asynchronous subtypes, there are two notable points. First, we are only interested in output actions, and this is why we use the queue k. Second, note that the queue type k :: τ is the same for all premises ( j ∈ H ), since we are performing a common asynchronous action. In fact, τ will be equal to τ h where h is a label or value type; this is evident from the output axioms. Note that if 1 s 2 and 1 is defined, then

The proofs can be found in Appendix C. We make use of a number of supporting lemmas; the actual proof of Type Soundness begins on page 260.

Communication safety
We now formalise communication-safety (which subsumes the usual type-safety).First, a k-buffer is a queue process k : h.A k-input is a process of the shape k?(x).P or k £ {l i : P i } i∈I .A k-output is a process k! V .P or k ¡ l.P .
Then, a k-process is a k-buffer, k-input, or k-output.Finally, a k-redex is a parallel composition of a k-input and non-empty k-buffer, or of a k-output and k-buffer.Definition 7.4 (Error process).We say P is an where Q is one of the following: (a) a |-composition of two k-processes or of a k-process and a k-process, that does not form a k-redex or a k-input with an empty k-buffer; or application containing a k-buffer.
The above says that a process is an error if (a) it breaks the linearity of k by having e.g. two k-inputs in parallel; (b) there is communication-mismatch; (c) there is no corresponding opponent process for a session; or (d) it encloses a queue under prefix, thus making it unavailable.As a corollary of Theorem 7.3, we achieve the following general communication-safety theorem:
Proof. It is enough to consider a one-step reduction from a well-typed term. From Theorem 7.3 we know that the result is well-typed. Therefore it suffices to prove that a well-typed term cannot be an error. We consider the given cases. For (a), we may have a composition of two k-processes such as, e.g., k : h 1 | k : h 2 or k?(x 1 ).P 1 | k?(x 2 ).P 2 . It is clear that no such combination is typable: we cannot compose by any environments 1 and 2 with k on both, unless one is a queue typing k :: τ and the other is a session typing k : S. For (b), a communication mismatch is untypable since the session remainder will be undefined, e.g., ?[U ].S − l k τ is not defined, and similarly for &[l i : S i ] i∈I − l k τ when k ∉ I . When the remainder is undefined, the rule (New s ) cannot apply, and therefore the term is untypable. For (c), a missing occurrence of the dual buffer is excluded by (New s ). In particular, even if a session on k is ended and so does not occur in communications, the buffer on k will still exist under the scope of k/k. For (d), a buffer cannot occur in the body of an abstraction or under an input prefix or a branching, as can be seen by the use of Σ -environments in the user-level typing rules in Figs. 9 and 10. □
Corollary 7.6 (Open communication safety). If Γ ; Λ; P : with balanced( ), then P never reduces to an error.
Proof. This follows easily from Theorem 7.5, since we can close the linear interface with abstractions and then apply to linear function arguments, obtaining a term of the same type for which safety holds. In particular, P σ , with σ a closing substitution for Λ, will never reduce to an error, so the same is easily shown to hold for P . □

Related work
There is a large literature on linear and session types for both the λ-calculus and the π -calculus. Below we discuss the most closely related work in three parts: the first focuses on linear typing systems for the λ-calculus and session types for functional programming languages, the second on asynchronous subtyping systems, and the last explains the relationship between linearity and asynchrony from the perspective of proof theory, following recent developments.
See also [16] for discussions on other type disciplines of the π -calculus as well as on applications of session types.

Linear and session typing systems for higher-order functions
Our typing system is substructural in the sense that for session environments Σ we do not allow weakening and contraction, ensuring that a session channel is recorded as having been used only when it actually occurs in session communication expressions. Similarly, no structural transformations can apply to linear variable environments, ensuring that the occurrence of a variable manifests that it has indeed been used exactly once. The ways in which our typing system enforces linearity can be seen as an amalgamation of the two approaches in [53], retaining the simplicity of declarative systems and the decidability of algorithmic ones. Walker's work [53] provides a good exposition of substructural typing (in which linear and affine usages can be seen as special cases). Note that in our system there is no need to enforce linear usage for other than functional types. Applying the inference techniques of [17,15] and [51], with the algorithmic subtyping of [19], it should be possible to construct a type inference system.
Session types in functional languages have been studied in various works. In the first study [52], the authors define a concurrent multi-threaded functional language with sessions. Their language supports sending of channels and higher-order values, branching and selection, recursive sessions and channel sharing. It has an explicit multi-threading primitive (fork) and explicit stores. The paper [20] extends the previous language to a variant of sessions where message sending is nonblocking. This is handled by explicitly storing an entry for the two endpoint channels in a buffer. Its functionality is the same as our use of two session channels for distinguishing the two endpoints (similarly to [19]). They simplify their previous type judgement, which required input and output environments [52], by integrating linear typing with a split operator, which is more directly related to the original non-deterministic typing of [53]. While a precise typability comparison is difficult due to our additional primitives, their work also shows a use of linear types for functional languages with sessions.
One of the active areas in the functional setting is the integration of session types into the lazy functional language Haskell [39,45,27]. Incorporating primitives for session interaction into Haskell requires defining an appropriate IO-monad, which is also suitable for solving aliasing problems. Instead of extending the type system of an existing language and adding linear types, as in our work and [52,20,6], the works [39,45] encode sessions using the features of Haskell's type system. In general, the encoding approach in [39,45] generates more cumbersome types, but can take advantage of Haskell's type inference (in most cases). The work [27] establishes a more advanced session type inference technique. An ML-style polymorphism based on [20] is also investigated in [6].
Also, the work [10] uses (the synchronous part of) our typing system (published in [35]) to encode session types in linear behavioural types in the HOπ -calculus. This demonstrates that the substructural features of our typing system make it easy to translate session types mechanically into other structures.
Finally, the work [49] presents an alternative session system for higher-order processes, based on a logical interpretation integrated with a functional language. This system enjoys stronger progress properties, so that processes do not deadlock, but it is slightly more complex due to the heavy use of monads. Moreover, it does not make use of any form of asynchronous subtyping.

Asynchronous session typing and subtyping
Asynchronous subtyping was first studied in [37] for multiparty session types [25]; however, that work supports neither higher-order sessions (delegation) nor code mobility (higher-order functions). Both of these features provide powerful abstractions for structured distributed computing; delegation is the key primitive in our implementation of session types in Java [26] and web service protocols [23], to which we can now apply our theory for flexible optimisation. The proof of transitivity in this work requires a more complex construction of the transitive closure trc( 1 , 2 ) (Definition B.10) than the one in [37], due to the higher-order constructs. In spite of the richness of the type structures, we proposed a more compact runtime typing and proved communication safety in the presence of higher-order code, which is not presented in [37]. Moreover, our new typing system naturally extends the synchronous account of the linear typing system published in [35], demonstrating a smooth integration of two kinds of type-directed optimisation.
The coinductive subtyping of recursive session types was first studied in [19], adapting standard methods from IO-subtyping in the π-calculus [44]. The subtyping system of [19] does not provide any form of asynchronous permutation, and thus does not need the nested n-times unfolding (Definition 4.1). Moreover, our transitivity proof is significantly more involved than in [19] due to the incorporation of n-times unfolding, permutation, and higher-order functions.
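To make the role of unfolding concrete, the following Python sketch unfolds a recursive session type a bounded number of times. It is only an illustration under stated assumptions: types are encoded as nested tuples (an encoding chosen for this example), the recursive type is closed, and only the standard unfolding of a top-level μ is shown. The nested unfolding of Definition 4.1 is not reproduced in this section and is not claimed to coincide with this function.

```python
def subst(S, var, repl):
    """Replace free occurrences of the type variable `var` in S by `repl`.

    `repl` is assumed to be closed, so no variable capture can occur.
    Carried value types are left untouched, in line with the restriction
    to tail-recursive session types.
    """
    if S == "end":
        return S
    tag = S[0]
    if tag == "var":
        return repl if S[1] == var else S
    if tag == "mu":
        # An inner binder with the same name shadows `var`.
        return S if S[1] == var else ("mu", S[1], subst(S[2], var, repl))
    if tag in ("in", "out"):
        return (tag, S[1], subst(S[2], var, repl))
    # Branching and selection: substitute in every continuation.
    return (tag, {l: subst(c, var, repl) for l, c in S[1].items()})


def unfold(S, n):
    """Unfold a top-level mu binder at most n times."""
    for _ in range(n):
        if S == "end" or S[0] != "mu":
            break
        S = subst(S[2], S[1], S)
    return S
```

For instance, with `T = ("mu", "t", ("out", "nat", ("var", "t")))`, one unfolding exposes the output prefix and leaves `T` as its continuation.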
Our treatment of runtime typing, specifically our method for typing session queues and the use of session remainders, is more compact than previous asynchronous session works (e.g. [25,4]), which use the method of rolling-back messages: the head type of a queue typing moves to the prefix of the session type of a process using the queue, and then compatibility is checked on the constructed types. Our method is simpler, as we remove type elements appearing in a queue from its typing, and also more flexible, as it naturally extends to asynchronous contexts. Our queue typing is more similar to that in [20], where smaller types are obtained after matching with buffer values. However, our method works with queue types rather than with values directly, which allowed it to be extended smoothly to handle asynchronous optimisation, which is not treated in [20]. For example, we allow a type consisting of an output followed by an input action to be reduced with a type corresponding to the input, leaving the output prefix intact. Moreover, using a more delicate composition between values and queue typing, our system enables linear mobile code to be stored in the queues.
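The idea of removing queued type elements from a session type, while leaving output prefixes intact, can be sketched as follows. This is a minimal illustration and not the paper's definition of the session remainder: it assumes finite (non-recursive) types encoded as nested tuples, queue elements tagged as values or labels, and syntactic equality on value types. The function name `remainder` and the encoding are ours.

```python
def remainder(S, tau):
    """Remove the queue type `tau` from the session type `S`.

    Inputs and branchings consume the head of `tau`; outputs and selections
    are kept in place and the removal continues underneath them.
    Returns None when the remainder is undefined.
    """
    if not tau:
        return S
    if S == "end":
        return None                      # queued elements but nothing to receive them
    tag = S[0]
    if tag == "in":
        _, U, cont = S
        return remainder(cont, tau[1:]) if tau[0] == ("val", U) else None
    if tag == "branch":
        kind, name = tau[0]
        if kind != "lab" or name not in S[1]:
            return None                  # a value, or a label outside the offered set
        return remainder(S[1][name], tau[1:])
    if tag == "out":
        _, U, cont = S
        rest = remainder(cont, tau)
        return None if rest is None else ("out", U, rest)
    if tag == "select":
        rests = {l: remainder(c, tau) for l, c in S[1].items()}
        if any(r is None for r in rests.values()):
            return None
        return ("select", rests)
    return None
```

With this encoding, removing a queued `bool` from an output of `int` followed by an input of `bool` leaves the output prefix in place, while removing a label from a plain input type is undefined, which corresponds to a communication mismatch.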
An analysis of asynchronous session action permutations, encompassing an asynchronous "acceptance" relation which accommodates output actions performed in advance, appears in an unpublished manuscript [40]. The authors suggest that their algorithm is terminating. However, if their system admits μt.![U 1 ].t as a subtype of μt.![U 1 ].?[U 2 ].t, which as we show on page 239 induces an infinite simulation, then it is unclear how it avoids divergence without any special provision.
Finally, a notion of asynchronous context and a definition of asynchronous duality that resembles our subtyping (combined with duality) appears in [5]. However, this notion is only developed in order to prove type soundness and is not integrated with the typing system, which was mentioned as interesting future work. It is developed for finite sessions that, additionally, do not support delegation (name passing). Our work develops such a subtyping for a much more expressive calculus supporting name and code mobility, and also in the context of recursive session types. These features require coinductive methods that bring to the surface a number of challenges, such as those arising from infinite simulations.
The recent work [9] studies a notion of preciseness in session subtyping, including an adaptation of our notion of asynchronous subtyping. As we mentioned in Section 4.2, the subtyping of [9] avoids so-called "orphan" messages, i.e., those that are never received from a queue, by restricting the subtyping relation to contain a finite number of branchings (in our case this would also include inputs) before an output can be fetched from inside an asynchronous context. In simple terms, they do not allow the accumulation of messages which follows from missing inputs. We believe a practical application of asynchronous subtyping will make use of both approaches, ours and that of [9]: for many kinds of values that do not require linear constraints, messages can safely be left on a queue and later garbage collected; for messages containing linear values, a restriction might be needed such as the one in [9] or the one in [32, p. 181], or a buffer bound as in [20].
As a general remark, note that our choice to use μ-types instead of (infinite) regular trees better serves our aim of informing programming technology, and given the restrictions, notably that of contractiveness, the two notions are equivalent [18,2]. In fact, the coinductive treatment would not differ much, except in notational aspects, since we would have to fetch output actions from deep elements of the tree representation as we do with asynchronous contexts.
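The contractiveness restriction can be checked by a simple traversal. The sketch below is an illustration under assumptions of our own: μ-types are encoded as nested tuples, and a type is taken to be contractive when every occurrence of a recursion variable is separated from its binder by at least one prefix. It accepts μt.![nat].t and rejects μt.μt'.t, the two examples given for contractiveness elsewhere in the paper.

```python
def is_contractive(S):
    """Return True if every recursion variable in S occurs under a prefix."""

    def guarded(S, unguarded):
        # `unguarded` holds the variables bound since the last prefix.
        if S == "end":
            return True
        tag = S[0]
        if tag == "var":
            return S[1] not in unguarded
        if tag == "mu":
            return guarded(S[2], unguarded | {S[1]})
        if tag in ("in", "out"):
            # A prefix guards every variable bound above it.
            return guarded(S[2], frozenset())
        # Branching and selection also count as prefixes.
        return all(guarded(c, frozenset()) for c in S[1].values())

    return guarded(S, frozenset())
```

The check does not enforce the separate tail-recursion requirement, so a type such as μt.![t].end would need an additional test on carried types.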

Linearity and asynchrony from the proof theoretical perspective
A typical use of linearity in processes is to simply require that linear channels are used exactly once, which differs from session-based linearity, where channels are used once "at any moment" and can be reused in order to complete a protocol. In that sense, linearity in sessions is about avoiding race conditions on channels, but the two notions can be interchanged, as seen in recent works [22].
There is, however, a deeper notion of linearity that arises from propositions-as-types interpretations, starting from [1]. Recently, the work [7] gave the first such correspondence for session types, matching session typed processes to Intuitionistic Linear Logic proofs. This kind of interpretation becomes more relevant for asynchrony once the constraints of sequentiality (arising from sequent proofs) are relaxed, as has been done in [14], where logical sessions are obtained for the asynchronous π-calculus, and even more in [33], where logical sessions based on Proof Nets are obtained for a Solos [29] calculus. Indeed, once we eliminate many of the prefixes, the need to perform asynchronous subtyping may seem redundant; however, this is not the case: in distributed computing, communications are implemented using sockets or channels of some form, so our buffered model is in fact more realistic. In the case of [14], our subtyping would allow output actions hidden under an input prefix to be extracted, which corresponds to valid transformations in Linear proofs. For example, we would allow B ⊗ (A C ) to be a subtype of A (B ⊗ C ), under certain conditions.
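The extraction of an output hidden under input prefixes can be pictured with a small Python sketch. It is an informal illustration only: types are finite and encoded as nested tuples (our own encoding), only chains of input prefixes are handled, and the side conditions that the subtyping relation imposes are not checked. The function name `hoist_output` is hypothetical.

```python
def hoist_output(S):
    """Move the first output found under leading input prefixes to the front.

    Returns the permuted type, or None if the inputs are not followed
    by an output.
    """
    inputs = []
    while S != "end" and S[0] == "in":
        inputs.append(S[1])              # remember the inputs we pass through
        S = S[2]
    if S == "end" or S[0] != "out":
        return None
    _, U, cont = S
    for V in reversed(inputs):           # rebuild the inputs in their original order
        cont = ("in", V, cont)
    return ("out", U, cont)
```

Applied to an input of A followed by an output of B, the sketch yields an output of B followed by an input of A, which is the shape of the permutation described above.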

Conclusion
We formalise for the first time session typing for a process language that allows not only data but also runnable code to be the subject of structured type-safe communications. The ability to exchange code is fundamental in concurrent and distributed systems, where programs cannot be fully fixed ab initio and dynamicity is a prerequisite. We then relax the strict compatibility requirements that govern pairs of interacting processes to allow certain classes of message-passing actions to be permuted, offering not only greater flexibility in composing programs, but also guidance toward type-safe optimisations. Our session typing system for the HOπ -calculus can serve as a theoretical foundation for process and functional languages, and our asynchronous subtyping has already been implemented in order to allow message overlapping in message-passing parallel algorithms in a C language with sessions [41]. The first author's thesis also demonstrates that the theory developed in this article is smoothly extensible to object-orientation [32]. Our future work includes a type-preserving fully abstract encoding of HOSπ into the session π -calculus based on a session-based asynchronous bisimulation [28] or behavioural equivalences [42]; the development of a decidable algorithm for the asynchronous subtyping relation along the lines of [19]; extensions to multiparty session types [4,25]; and an integration with actor-based languages for concurrency, following the Erlang-based development in [34]. In summary, an automatic optimisation that preserves the intended semantics and does not violate type safety is interesting, both theoretically and practically, and in this work we have established a solid theory to support this development.
Proof. Trivial, as U n ( ) is defined as the union of two simulations. □
We now define the single-step permutation transformations for top-level actions, which enable us to obtain more asynchronous subtypes; this is needed further on when, given a simulation, we obtain more asynchronous simulations utilising single and multi-step permutations. There are two components, permutation contexts C and permutation rules , defined as follows: and (10), with the required assumptions provided in . We do not need to examine the subcomponent as it is a simulation by assumption. □
Next we define the asynchronous extension of a simulation, with degree n. The degree represents the number of single-step permutations, applied successively to all the components of the given simulation, up to asynchronous contexts A.

Definition B.7 (Asynchronous extension).
Given a simulation , the asynchronous extension of with degree n is defined as follows:
The notation U ω (α n−1 ( )) stands for the union of all U m (α n−1 ( )) such that m ∈ N.
Proof. We proceed by induction on the degree n. The base case of n = 0 holds because is a simulation by assumption.
We then prove the inductive case for any n ≥ 1.
By the inductive hypothesis α n−1 ( ) ⊆ c , then by Proposition B.1 we have U ω (α n−1 ( )) ⊆ c , and by Lemma B.6 we obtain CE(U ω (α n−1 ( ))) ⊆ c .Therefore, it is not necessary to examine pairs in this subset of α n ( ).Then, it remains to examine an arbitrary pair (A S 1k k∈K , S 2 ) ∈ α n ( ) such that (A S 1k k∈K , S 2 ) ∈ α n−1 ( ) with ∀ k ∈ K .S 1k S 1k .We proceed by taking cases on the shape of the context A.
Case A = • k∈K .Then let S 1 = A S 1k k∈K , and S 1 = A S 1k k∈K .We have S 1 S 1 , and proceed by examination of the permutation applied.

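The ordering of Definition 7.2 is given as follows:

\[
\begin{array}{ll}
k : {?}[U].S,\ k :: U\tau \ \to_s\ k : S,\ k :: \tau & \\[2pt]
k : {!}[U].S,\ \overline{k} :: \tau \ \to_s\ k : S,\ \overline{k} :: \tau U & \\[2pt]
k : \&[l_i : S_i]_{i\in I},\ k :: l_j\tau \ \to_s\ k : S_j,\ k :: \tau & (j \in I) \\[2pt]
k : \oplus[l_i : S_i]_{i\in I},\ \overline{k} :: \tau \ \to_s\ k : S_j,\ \overline{k} :: \tau l_j & (j \in I) \\[2pt]
k : \mathsf{unfold}^{n}(S) &
\end{array}
\]

\[
\frac{k : S_j,\ \overline{k} :: \tau \ \to_s\ k : S'_j,\ \overline{k} :: \tau' \qquad \text{for all } j \in H}
     {k : A\langle S_h\rangle^{h\in H},\ \overline{k} :: \tau \ \to_s\ k : A\langle S'_h\rangle^{h\in H},\ \overline{k} :: \tau'}
\]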

Lemma B.8.
If ⊆ c then α n ( ) ⊆ c .That is, for any simulation and degree n ∈ N, the asynchronous extension α n ( ) is a type simulation.
νs) P ≡ (νs) (νa : S ) P (νs) P | Q ≡ (νs) (P | Q ) s, s ∉ fn(Q ) (νa : S ) (νb : S ) P ≡ (νb : S ) (νa : S ) P
For example, μt.![nat].t is contractive, but μt.μt .t is not. Moreover, we only consider tail-recursive session types; therefore types such as μt.![t].end are not well-formed. To indicate that a session is finished, we use the terminal end. We write T for the set of types.
Abbreviated forms. We often write &[l i : S i ] i∈I and ⊕[l i : S i ] i∈I for branching and selection types, T for unit → T and T 1 for unit T . The terminal end is sometimes omitted.
s : &[move :?[unit ].S s :: (S MClient , int), s :: (&[move :?[unit ].S
h 1 | k : h 2 or k?(x 1 ).P 1 | k?(x 2 ).P 2 . It is clear that no such combination is typable: we cannot compose by any environments 1 and 2 with k on both, unless one is a queue typing k :: τ and the other is a session typing k : S. For (b), a communication mismatch is untypable since the session remainder will be undefined, e.g., ?[U ].S − l k τ is not defined, and similarly for &[l i : S i ] i∈I − l k τ when k /
![U 1 ].t as a subtype of μt.![U 1 ].?[U 2 ].t, which as we show on page 239 induces an infinite simulation, then it is unclear how it avoids divergence without any special provision.