Typechecking Protocols with Mungo and StMungo: A Session Type Toolchain for Java

Static typechecking is an important feature of many standard programming languages. However, static typing focuses on data rather than communication, and therefore does not help programmers correctly implement communication protocols in distributed systems. The theory of session types provides a basis for tackling this problem; we use it to develop two tools that support static typechecking of communication protocols in Java. The first tool, Mungo, extends Java with typestate definitions, which allow classes to be associated with state machines defining permitted sequences of method calls: for example, communication methods. The second tool, StMungo, takes a session type describing a communication protocol, and generates a typestate specification of the permitted sequences of messages in the protocol. Protocol implementations can be validated by Mungo against their typestate definitions and then compiled with a standard Java compiler. The result is a toolchain for static typechecking of communication protocols in Java. We formalise and prove soundness of the typestate inference system used by Mungo, and show that our toolchain can be used to typecheck a client for the standard Simple Mail Transfer Protocol (SMTP).


Introduction
Many popular programming languages use static typechecking to ensure correct manipulation of data. This offers significant practical benefit to software developers. However, modern software is increasingly reliant on communication, which must also be programmed correctly: messages must be sent and received in the correct sequence and with the correct format, in order to implement a desired communication protocol. The theory of session types [25,47] provides a basis for supporting communication-based programming by static typechecking of communication operations; it allows the structure of communication to be codified as type definitions, analogous to data type definitions, which can be used by compilers for typechecking. However, session types are not yet a feature of standard mainstream programming languages, so software developers are not able to benefit from them.
We present two tools that use the theory of session types to support static typechecking of communication protocols in Java. Our first tool, Mungo, extends Java with typestate definitions, which associate classes with state machines defining permitted sequences of method calls [45]. (Saint Mungo is the founder and patron saint of the city of Glasgow.) To associate a typestate definition with a class, the programmer adds a @Typestate annotation to the class telling Mungo where to find the typestate definition file. Mungo then ensures that instances of the class are used in a manner consistent with the declared typestate. To control aliasing, objects with typestate definitions are used according to a linear type discipline; this is analogous to the linear control of channels in session type systems. Our second tool, StMungo (Scribble-to-Mungo), uses this typestate feature to connect Java to the broader setting of communication protocols specified as session types in the Scribble protocol language [43]. Given a Scribble protocol projected to a particular endpoint (a so-called local protocol), StMungo generates a typestate specification capturing the sequences of sends and receives permitted along that endpoint. Each endpoint implementation can be validated separately by Mungo against its typestate definition and then compiled as usual with javac. The separate typechecking of each endpoint is integral to our approach, and is justified by the theory of multiparty session types [26], the formal foundation of Scribble. Multiparty session types provide an important safety guarantee: once each endpoint implementation is known to conform to its local protocol, the various implementations can be composed into a system free of communication errors.
Our work advances a line of research applying session types to real-world programming languages [9,15,17,16,22,29,34,35,37,41]. In particular, we build on the work of Gay et al. [23], which connected session types to the object-oriented notion of typestate. They observed that the valid sequences of messages for a given endpoint could be captured by a typestate definition for a class, allowing the channel endpoint to be modelled as an object. While an important idea, this earlier work lacked a practical implementation and relied on typestate declarations on parameters and return types. The Mungo/StMungo toolchain is the first integration of session types and a practical typestate system for checking the communication behaviour of Java programs. Moreover, Mungo uses a typestate inference system to eliminate the need for typestate declarations on parameters and return types. The Mungo/StMungo toolchain offers other practical advances over previous efforts to combine session types with objects. For example, SJ [29] only supports binary session types, whereas StMungo generates Mungo specifications from multiparty session types. Furthermore, Mungo permits objects with typestates to be stored in fields, whereas SJ requires them to be created and fully used within methods. Using the @Typestate annotation means we avoid any need for language extensions.

Contributions
• Mungo. We describe the Mungo typestate checker for Java (§ 3). Mungo currently supports a subset of Java; support for the full language is discussed in § 8.
• StMungo. We describe StMungo (§ 2), which translates Scribble local protocols into Mungo typestate specifications. StMungo also generates Java method stubs for each endpoint. Both tools can be downloaded from [1].
• SMTP case study. We present a statically typechecked SMTP client (§ 4), which illustrates the toolchain provided by Scribble, StMungo and Mungo.
• Typestate inference system. We formalise the essential features of Mungo as a core object-oriented calculus (§ 5). We define a typestate inference system for that language and prove its type safety (§ 6).
A summary of this work was presented at PPDP 2016 [31].

StMungo: Scribble-to-Mungo
The integration of session types and typestate, defined by Gay et al. [23], consists of a formal translation of session types for communication channels into typestate specifications for channel objects. The main idea is that a channel object has methods for sending and receiving messages and the typestate specification defines the order in which these methods can be called; therefore it is a specification of the permitted sequences of messages, i.e. a channel protocol.
We extend this translation from binary to multiparty session types [26] and implement it as the StMungo (Scribble to Mungo) tool, which translates Scribble [43,49] local protocols into typestate specifications and skeleton socket-based implementation code. The resulting code is typechecked using Mungo. A Scribble local protocol describes the communication between one role and all the other participants in a multiparty scenario, including the way in which messages sent to different participants are interleaved. This interleaving is not captured by binary session types and by tools based on them, like SJ [29]. StMungo is based on the principle that each role in the multiparty communication can be abstracted as a Java class following the typestate corresponding to the role's local protocol. The typestate specification generated from StMungo together with the Mungo typechecker can guide the user in the design and implementation of distributed multiparty communication-based programs with guarantees on communication safety and soundness. StMungo is the first tool to provide a practical embedding of Scribble multiparty protocols into object-oriented languages with typestate.
We illustrate StMungo on a multiparty protocol that models the process of booking flights through a university travel agent. There are three participants: Researcher (abbreviated R), who intends to travel; Agent (A), who is able to make travel reservations; and Finance (F), who approves expenditure from the budget. After the request, quote and check messages requesting authorisation for a trip, Finance can choose to approve or refuse the request. The global protocol is defined as follows.
The Scribble tool is used to check the above protocol definition for well-formedness and to derive a local version of the protocol for each role, according to the multiparty session types theory [26]. This is known as endpoint projection. Here we show the local protocol for Researcher, which describes only the messages involving that role. The self keyword indicates that R is the local endpoint.
StMungo generates an API for this role, class RRole, which provides an implementation of RProtocol. When instantiated, it connects to the other role objects in the session (ARole and FRole). The method calls, describing the messages exchanged with the other roles, follow the interleaving specified by the RProtocol typestate. Alternatively, the developer may choose to ignore this API (and the Mungo socket library that it depends on), and use only the generated typestate protocols to develop their own implementation. The developer can also refine the generated state machine further, for example by giving appropriate names to states, or by using anonymous states to obtain a coarser state refinement.
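The idea that a local protocol fixes the order of one role's sends and receives can be sketched in plain Java as a small state machine that accepts or rejects sequences of actions. This is only an illustration: the state names, the action names and the assumed direction of each message (Researcher sends request and check, and receives quote and the approve or refuse decision) are hypothetical and are not taken from the Scribble protocol or from StMungo output. Mungo performs the corresponding check statically, whereas this sketch checks a recorded trace at run time.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: the Researcher's local view of the booking protocol as a state machine. */
class ResearcherProtocolSketch {
    // State and action names are illustrative assumptions, not StMungo output.
    enum State { START, AWAIT_QUOTE, HAVE_QUOTE, AWAIT_DECISION, END }

    /** Returns the successor state, or null if the action is not permitted in this state. */
    static State step(State state, String action) {
        switch (state) {
            case START:
                return action.equals("send_request") ? State.AWAIT_QUOTE : null;
            case AWAIT_QUOTE:
                return action.equals("receive_quote") ? State.HAVE_QUOTE : null;
            case HAVE_QUOTE:
                return action.equals("send_check") ? State.AWAIT_DECISION : null;
            case AWAIT_DECISION:
                // Finance makes the choice, so both outcomes must be accepted here.
                return action.equals("receive_approve") || action.equals("receive_refuse")
                        ? State.END : null;
            default:
                return null; // END permits no further actions
        }
    }

    /** True if the actions form a complete run from START to END. */
    static boolean accepts(List<String> actions) {
        State current = State.START;
        for (String action : actions) {
            current = step(current, action);
            if (current == null) {
                return false;
            }
        }
        return current == State.END;
    }

    public static void main(String[] args) {
        System.out.println(accepts(Arrays.asList(
                "send_request", "receive_quote", "send_check", "receive_refuse")));
        System.out.println(accepts(Arrays.asList("send_request", "send_check")));
    }
}
```

A trace that skips the quote, or that stops before the decision arrives, is rejected; this is the kind of ordering error that a typestate specification is meant to rule out.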

Mungo
Mungo extends Java with an optional typestate definition. The tool is implemented in Java using the JastAdd framework [24], a meta-compiler based on reference attribute grammars. Source files are typechecked in two phases: first according to the regular Java type system, and then according to our typestate extension. The source files can then be compiled using javac and executed in the standard Java 1.8 runtime environment.
A typestate is attached to a Java class and it defines an object protocol in the form of a state machine. Each state offers a set of methods that must be a subset of the methods defined by the class; each method specifies a transition to a successor state. Typestate definitions are provided in separate files, using a Java-like syntax, shown in Example 1 below. A typestate definition is attached to a class using the annotation @Typestate("ProtocolName"), where "ProtocolName" names the file where the typestate is defined. The typestate inference algorithm, presented in § 6, constructs the sequences of methods called on all objects associated with a typestate, and then checks if the inferred typestate is a subtype of the object's declared typestate.
Some Java features are not yet supported. Some we anticipate to be relatively straightforward extensions (synchronised statements, the conditional operator ?:, inner and anonymous classes, and static initialisers). Generics, inheritance and exceptions are non-trivial and are discussed in future work (§ 8). Currently, generics are not supported; inheritance is supported for classes without typestate definitions; and exceptions are supported syntactically but are typechecked under the (unsound) assumption that no exceptions are thrown. (A try-catch statement is typechecked by typechecking the try body; if an exception is thrown a typestate violation may result.)
Example 1. We introduce Mungo through the example of an unbounded stack data structure that follows a typestate specification. Given the following enumerated type:
In this case Mungo would report a linearity error on argument s in su.pushN(s, 64), informing the programmer that variable s is used uninitialised, because the usage of variable s in line 19 as an argument consumed its linear value.
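To make the notion of an object protocol concrete, the following sketch shows a stack that tracks its own state and rejects out-of-order calls at run time. It is not Mungo code and does not use the @Typestate annotation: Mungo checks such protocols statically, before the program runs. The class name MonitoredStack, the three states and the exact transitions are assumptions chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: a stack that enforces a simple protocol with run-time checks. */
class MonitoredStack {
    /** Result of isEmpty(), returned as an enumerated value rather than a boolean. */
    enum Check { EMPTY, NONEMPTY }

    // Protocol states assumed for this sketch.
    private enum State { EMPTY, NONEMPTY, UNKNOWN }

    private final Deque<Integer> items = new ArrayDeque<>();
    private State state = State.EMPTY;

    /** push is permitted in every state and always leads to NONEMPTY. */
    void push(int value) {
        items.push(value);
        state = State.NONEMPTY;
    }

    /** pop is permitted only in NONEMPTY; afterwards emptiness is unknown. */
    int pop() {
        require(state == State.NONEMPTY, "pop");
        state = State.UNKNOWN;
        return items.pop();
    }

    /** isEmpty is permitted only in UNKNOWN; the returned value selects the next state. */
    Check isEmpty() {
        require(state == State.UNKNOWN, "isEmpty");
        Check result = items.isEmpty() ? Check.EMPTY : Check.NONEMPTY;
        state = (result == Check.EMPTY) ? State.EMPTY : State.NONEMPTY;
        return result;
    }

    private void require(boolean permitted, String method) {
        if (!permitted) {
            throw new IllegalStateException(method + " is not permitted in state " + state);
        }
    }

    public static void main(String[] args) {
        MonitoredStack stack = new MonitoredStack();
        stack.push(1);
        System.out.println(stack.pop());
        System.out.println(stack.isEmpty());
    }
}
```

Calling pop() twice in a row raises an IllegalStateException here; under a typestate checker the same mistake would instead be reported at compile time.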
Inferring typestate for fields. Using fields to store objects can lead to a more idiomatic object-oriented style than explicitly passing values between methods. To illustrate this, we define a second client, StackUser2, that stores a Stack as a field. Typechecking the field s of StackUser2 follows the possible sequences of method calls specified by StackUserProtocol, and also takes into account the constructor body of StackUser2. Then, Mungo can guarantee that if a StackUser2 instance is used according to StackUserProtocol, then the Stack field of the object is also used according to StackProtocol.
Short-circuit boolean expressions. Line 18 in the StackUser2 example above illustrates a final technical detail of typestate inference. The inference algorithm takes into account the fact that logical disjunction short-circuits if the first disjunct evaluates to true. Mungo will ensure that the typestate of su is consistent with there being either one, two or three successive invocations of pushN().
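The short-circuit behaviour that the inference algorithm has to account for can be observed directly in plain Java. In the sketch below, attempt() is a hypothetical stand-in for a boolean-returning method such as pushN(); the counter shows that a three-way disjunction makes one, two or three calls depending on the results.

```java
/** Hypothetical sketch: counting the calls made by a short-circuiting disjunction. */
class ShortCircuitSketch {
    static int calls = 0;

    /** Stand-in for a boolean-returning method: records the call and reports the given result. */
    static boolean attempt(boolean result) {
        calls++;
        return result;
    }

    /** Evaluates a three-way disjunction and returns how many calls were actually made. */
    static int callsMade(boolean first, boolean second, boolean third) {
        calls = 0;
        // Evaluation stops at the first disjunct that yields true.
        boolean ignored = attempt(first) || attempt(second) || attempt(third);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsMade(true, false, false));
        System.out.println(callsMade(false, true, false));
        System.out.println(callsMade(false, false, false));
    }
}
```

A typestate for the receiver therefore has to permit each of the three call counts, not just the longest one.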

Case Study: Typechecking SMTP
In order to show the practicality and robustness of our StMungo and Mungo toolchain, we have developed a substantial case study in which we statically typecheck an SMTP client. We use this client to communicate with the gmail server. The full source code of the SMTP client can be found in [1].
SMTP (Simple Mail Transfer Protocol) is an Internet standard electronic mail transfer protocol, which typically runs over a TCP (Transmission Control Protocol) connection. We consider the version defined in RFC 5321 [44]. An SMTP interaction consists of an exchange of text-based commands between the client and the server. For example, the client sends the EHLO command to identify itself and open the connection with the server. The commands MAIL FROM:<address> and RCPT TO:<address> specify the e-mail address of the sender and the receiver of the e-mail, respectively. The DATA command allows the client to specify the text of the e-mail. The QUIT command is used to terminate the session and close the connection. The responses from the server have the following format: three digits followed by an optional dash "-", such as 250-, and then some text, like OK. The server might reply to EHLO with 250 <text> or to MAIL FROM or RCPT TO with 250 OK.
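The reply format described above (three digits, an optional dash, then text) can be split with a few lines of Java. The sketch below is not part of the toolchain; the class name SmtpReplySketch and its fields are hypothetical, and the label() method merely mimics the style of the _250DASH and _250 message names used in the case study.

```java
/** Hypothetical sketch: one SMTP reply line split into code, dash flag and text. */
class SmtpReplySketch {
    final int code;      // the three-digit reply code, e.g. 250
    final boolean dash;  // true if the code is followed by "-"
    final String text;   // the remaining text, possibly empty

    private SmtpReplySketch(int code, boolean dash, String text) {
        this.code = code;
        this.dash = dash;
        this.text = text;
    }

    /** Parses a single reply line without its line terminator. */
    static SmtpReplySketch parse(String line) {
        if (line.length() < 3) {
            throw new IllegalArgumentException("reply too short: " + line);
        }
        for (int i = 0; i < 3; i++) {
            char c = line.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("reply does not start with three digits: " + line);
            }
        }
        int code = Integer.parseInt(line.substring(0, 3));
        boolean dash = line.length() > 3 && line.charAt(3) == '-';
        // The fourth character is the separator (dash or space); the text follows it.
        String text = line.length() > 3 ? line.substring(4) : "";
        return new SmtpReplySketch(code, dash, text);
    }

    /** Builds a label in the style of the _250DASH and _250 message names. */
    String label() {
        return "_" + code + (dash ? "DASH" : "");
    }

    public static void main(String[] args) {
        System.out.println(parse("250-smtp.example.com").label());
        System.out.println(parse("250 OK").label());
    }
}
```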
To typecheck the SMTP protocol using StMungo and Mungo, we first represent the text-based commands as messages in a Scribble global protocol, based on Hu and Yoshida [28]. Then, we use the Scribble tool to validate and project the above global protocol into local protocols, one for each role. We focus only on the client side and describe the SMTP_C local protocol. This fragment of the SMTP protocol code describes a loop (rec X1), in which the server S performs a choice between the messages _250DASH and _250. Next, other loops follow (rec Z1 and rec Z3), where in the second one the client C chooses among the messages SUBJECT, to send the subject, DATALINE, to send a line of text, or ATAD to terminate the e-mail by sending a dot.
In addition to the typestate specification, StMungo generates a skeletal implementation based on sockets, although other implementations are possible. Every interaction in the local protocol becomes a method call in the typestate specification, as we will see shortly. State definitions group methods into choices and impose sequencing. Running the StMungo tool on SMTP_C produces the following files:
1. CProtocol.protocol, which captures the interactions local to the SMTP_C role as a typestate specification.
2. CRole.java, which is a class that implements CProtocol by communication over Java sockets. This is an API that can be used to implement the SMTP client endpoint.
3. CMain.java, which is a skeletal implementation of the SMTP client endpoint. It runs as a Java process and provides a main() method that uses CRole to communicate with the other participants, in this case the SMTP server.
The CProtocol generated by StMungo is defined as follows. After choosing one of the branches, _250DASH or _250, the payload of type String is received via another method call, following receive_250dashStringFromS() in line 6 and receive_250StringFromS() in line 7, respectively for the available choices. The internal choice made at self, namely role C (lines 11-17 of SMTP_C), is translated into a set of send methods, one for each branch of the choice (lines 12-14 of CProtocol). When running the program, only one of these methods will be called, thus performing a single message selection corresponding to it.
CRole implements all the methods in CProtocol. In this implementation, since communication occurs on Java sockets, we declare and create a new socket to connect to the gmail server. This is given in lines 3 and 5, respectively, in CRole.
We now describe the correspondence between the text-based commands in SMTP and the method calls in Mungo. Consider "SUBJECT: Hello World" which is an atomic command composed of the keyword SUBJECT and the subject text. We use an intermediate layer to split this command into two separate method calls, shown in lines 7-9 in the main method below. The first, send_SUBJECTToS(), selects the command SUBJECT. The second, send_subjectStringToS("Hello World"), sends the message "SUBJECT: Hello World". The intermediate layer is also used when receiving a command from the server. The command name is converted into a value of an enumeration, and the associated text is treated as a separate message.
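One way to picture the sending side of such an intermediate layer is a small class that remembers the selected keyword and emits a single line only when the payload arrives. This is a sketch under stated assumptions, not the generated CRole code: the class CommandLayerSketch, its method names and the Command enumeration are hypothetical, and the CRLF terminator is added here because SMTP lines end with CRLF.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical sketch: two method calls that together emit one atomic text command. */
class CommandLayerSketch {
    // Illustrative subset of keywords; not the enumeration generated by StMungo.
    enum Command { SUBJECT, DATALINE }

    private final PrintWriter out;
    private Command pending; // keyword chosen by the first call, not yet sent

    CommandLayerSketch(PrintWriter out) {
        this.out = out;
    }

    /** First call: selects the command; nothing is written yet. */
    void select(Command command) {
        pending = command;
    }

    /** Second call: writes keyword and payload as a single line. */
    void sendText(String payload) {
        if (pending == null) {
            throw new IllegalStateException("no command selected");
        }
        out.print(pending.name() + ": " + payload + "\r\n");
        out.flush();
        pending = null;
    }

    public static void main(String[] args) {
        StringWriter wire = new StringWriter();
        CommandLayerSketch layer = new CommandLayerSketch(new PrintWriter(wire));
        layer.select(Command.SUBJECT);
        layer.sendText("Hello World");
        System.out.print(wire);
    }
}
```

Splitting selection from payload lets the choice appear as its own method in the typestate, while the peer still sees one atomic command on the wire.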
Finally, CMain.java contains the main method where the CRole object is created and used to implement the client logic. Typically the programmer would flesh out the skeletal implementation with extra logic that, for example, gets relevant input from the user or decides which choice to make when several are available, or customise CMain by adding SSL connection code for authentication with the gmail server. Mungo is able to statically check CMain, or any code that uses a CRole object, to ensure that methods of the protocol are called in a valid sequence and that all possible responses are handled. The programmer is not required to use the skeleton implementation of CMain, or even the CRole API. It is possible to write new code that uses the API, or to use the typestate specification to guide the development of an alternative API, or to refactor the typestate specification itself.

A Core Calculus for Mungo
We define the syntax and operational semantics of a core object-oriented calculus, based on Gay et al. [23], which we use to formalise Mungo. We only formalise the typestate inference system and not the ability of Mungo to work with full Java, as this would require formalising a large subset of Java.
Syntax. The syntax of the calculus is given in Fig. 1. We use · to denote a possibly empty set of elements that range over the subject meta-variable. A program is a set of type declarations D, each of which declares either a class or an enumerated type. A class declaration defines a class named C with typestate specification S, fields F and methods M. An enumeration declaration defines an enumerated type named E with a non-empty set l of enum values. For simplicity, our language has no support for inheritance or interfaces.
We assume as an implicit context a program D, where every class or enumeration has a unique name, and the fields and methods of a given class also have unique names. For any class C : S {F; M} ∈ D we write fields(C) for F, methods(C) for M, and typestate(C) for S. For any enum E {l} ∈ D we write enums(E) for l.
A typestate definition S specifies a state machine that has as actions the methods of a class. A typestate definition is either an internal choice H of method signatures, or a recursive typestate µX.S, which may contain the recursive typestate variable X. A method signature H can have two forms, depending on whether the method transitions to a state S, or it is an external choice E m(T) : ⟨l : S_l⟩_{l∈E} with the method signature defining the transition to one of the possible states S_l, l ∈ E; in the latter case the return type of the method must be E. The empty or inactive typestate is written end.
A type is either the name of a class or enumeration, void or bool. A field declaration is a field name f associated with a type T. A method declaration T m(T x) {e} specifies a return type T, the name m of the method, the type T of the parameter x, and the expression e that comprises the method body. A path is either the atomic path this denoting the current object (receiver), the composite path r.f denoting the field f of the object denoted by r, or a parameter x. At runtime paths are resolved to heap locations (see runtime syntax below). A constant is the special value null, which is assignable to any class type, a literal value tt or ff of type bool, a literal value * of type void, or an enum value l.
In the expression forms method call r.m(e), field assignment r.f = e, and object creation r.f = new C, the target object of the invocation or assignment is restricted to a path r, rather than an arbitrary expression. This is to allow the typestate of the target to be tracked by the type system. The other expression forms include constants, paths, sequential composition e; e', switch expressions, if ... else expressions, labelled expressions λ : e, and continue expressions that jump to the enclosing expression labelled by λ.
Configurations and runtime syntax. Fig. 2 extends the source syntax with additional runtime constructs used by the operational semantics.
A configuration h, e is a pair of a heap h and runtime expression e. The heap h is defined as an object C[f : o], where C is the class of the object and f : o are its fields; the contents o of each field is either a constant c or another object. The "heap" is a tree of objects, with neither cycles nor sharing, due to the linearity of object references enforced by the type system (see § 6).
The expression e in a configuration h, e is a runtime expression in which every (compile-time) path of the form this, r.f or x has been replaced by a runtime path that refers to a heap value. A runtime path r in a heap h is either the atomic path root denoting h itself or the composite path r'.f denoting the field f of the object denoted by r', where r' is also a path in h. Runtime expressions also include the form e@r, which is an expression e that has been tagged with @r to track the active receiver. A value v is either a constant c or runtime path r. Every runtime expression is either a value, or uniquely of the form E[e], where E is an evaluation context (an expression with a hole). As usual, the notation E[e] denotes the plugging of the hole in E with an expression e.
The operational semantics is annotated with labels that denote the creation of a new object (r.f.new C), an enum value choice (r.l), method call (r.T m T), assigning a field (r.f = v), the conditional label (if), and the silent label (τ). The definition of states is extended to the set of enum values ⟨l : S_l⟩_{l∈E} and we define action labels s for labels: internal choice T m(T), external choice E m(T) : l, and for enum values l.
Labelled reduction semantics. We define heap access and update functions that are used by the reduction relation in Fig. 3. The root object is accessed via h(root). The access of a field h(r.f) is inductively defined on the access of h(r). Similarly, we use the heap access function to update object fields as in h{r.f → o}. Fig. 3 defines the labelled reduction semantics; hereafter by "expression" we shall mean runtime expression, and by "path" runtime path, unless otherwise indicated. Rule R-Ctx lifts the semantic rules to an arbitrary expression using an evaluation context. Rule R-Seq discards the value v with a τ label and proceeds with the evaluation of e. Rules R-True and R-False are the usual rules for the if ... else expression and are annotated with label if. Rule R-New is labelled with r.f.new C and overwrites the contents of the field r.f by a new object C[f = init(T)] whose fields are all initialised to the value init(T), where T is the type of the field, defined by:
where for every enumerated type E we require there to be a distinguished element E_init ∈ enums(E). The result of R-New is the void value *. The object is constructed at a location within an already existing object r.f. There are two assignment rules, depending on whether the value being assigned is a constant or an object path. Both forms return the void value *. A constant c has no associated typestate and may be used unrestrictedly; therefore the R-AsgnC rule is labelled with τ and simply updates the heap to store c in r.f.
A path r', on the other hand, refers to an object and must be used linearly. Therefore the effect of the R-AsgnR rule is to relocate the object from r' to h.r, leaving null at its old location. The annotation label for R-AsgnR is r.f = r'. Although any existing object at h.r will be overwritten, the type system (§ 6) only permits the assignment if the typestate of that object is fully consumed.
The R-Call rule is labelled with r.T m T and resolves the method m by first looking up the receiver r in the heap, which must be an object C[f : o], and then selecting the method m from the definition of C. Prior to executing the selected method, we convert its body e, which is a source-level expression, into a runtime expression by substituting the runtime path r for this and also v for the formal parameter. In addition, the resulting runtime expression is tagged with @r, recording the fact that r is the active receiver. The active receiver tag @r on a value is removed using a τ label when the value is fully evaluated and it is not an enum label, as defined by rule R-Value. If the value returned by the method is an enum label l, then it must occur as the scrutinee of a switch expression; rule R-Switch defines the reduction, via action r.l, of the switch expression to the branch indicated by l. The r is used in the reduction label to indicate which object made the choice. Rule R-Label is labelled with τ, and says that a labelled expression λ : e discards the λ and substitutes a copy of the labelled expression for every occurrence of continue λ that occurs in the loop body e.
It is easy to show that the operational semantics is deterministic. Assume a heap consisting of an instance of class C, where given fields(C) = T f , each field of C is initialised with the corresponding value init(T ). Execution can then be initiated using a top-level expression that substitutes path this with path root.
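The tree-shaped heap, the access and update functions, and the relocating assignment can be mimicked with ordinary Java objects. The sketch below is an informal model, not the formal semantics of Fig. 3: the class HeapSketch, the representation of a runtime path as a list of field names (the empty list standing for root) and the move helper are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a heap as a tree of objects addressed by paths from the root. */
class HeapSketch {
    /** An object C[f : o]: a class name plus fields holding constants or nested objects. */
    static final class Obj {
        final String className;
        final Map<String, Object> fields = new LinkedHashMap<>();

        Obj(String className) {
            this.className = className;
        }
    }

    /** h(r): follows the field names of the path from the root; the empty path is root. */
    static Object access(Obj root, List<String> path) {
        Object current = root;
        for (String field : path) {
            current = ((Obj) current).fields.get(field);
        }
        return current;
    }

    /** h{r.f -> o}: overwrites the last field on the path; the path must be non-empty. */
    static void update(Obj root, List<String> path, Object value) {
        List<String> parent = path.subList(0, path.size() - 1);
        String field = path.get(path.size() - 1);
        ((Obj) access(root, parent)).fields.put(field, value);
    }

    /** Relocating assignment: the object moves to its new field and null is left behind. */
    static void move(Obj root, List<String> from, List<String> to) {
        Object moved = access(root, from);
        update(root, from, null); // Java null models the calculus constant null
        update(root, to, moved);
    }

    public static void main(String[] args) {
        Obj root = new Obj("StackUser");
        update(root, Arrays.asList("s"), new Obj("Stack"));
        update(root, Arrays.asList("k"), null);
        move(root, Arrays.asList("s"), Arrays.asList("k"));
        System.out.println(access(root, Arrays.asList("s")));
        System.out.println(((Obj) access(root, Arrays.asList("k"))).className);
        System.out.println(access(root, Collections.<String>emptyList()) == root);
    }
}
```

Because an object is only ever moved and never copied, the structure stays a tree without sharing, which is the property that linearity is meant to preserve.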

Typestate Inference
In this section we formalise the core of Mungo's typestate inference system and prove its safety properties. The formalisation makes a simplification, which is that the body of a method is analysed every time the method is called. If the implementation of Mungo used this simplification, then as well as being inefficient it would be impossible to analyse recursive methods. Section 6.4 informally explains how Mungo uses the concept of partial typestate to remove this restriction. The typestate inference system infers a typestate specification for a class, consistent with the static usage of instances of the class; the inferred typestate is then checked against the declared typestate of the class. Proving the soundness of the inference system requires proving that the trace of the execution of a well-typed program is included in the trace of the inferred type for that program. A sound inference system should be able to guarantee the progress property requiring that a program either reduces or is a value. The syntax of the inferred types, ranged over by U, and the typing context, ranged over by ∆, are defined below:
The inferred types U differ from top-level types T in that every class type C is refined with a typestate specification S. There is a distinguished bottom type bot, which is used to type continue statements. Its use will be illustrated when we discuss recursion and choice, later in this section. Typing context ∆ is a partial function from runtime paths r to types U, and from expression labels λ to recursive type variables X. A type U that is not a class type is referred to as a constant type. The inference system uses a subtyping relation ≤sbt and a binary operator join(·, ·).
Definition 2. (≤sbt, =sbt, join) The following relations are defined on typestates, inferred types and typing contexts.
• The subtyping relation ≤sbt is defined by the rules in Fig. 4.
• The join operator join(·, ·) is defined by a separate set of rules.
Subtyping on typestates is essentially a simulation preorder and is given in an algorithmic style [40, Chapter 21]. It constructs a set R of pairs of typestates using rules S-Rec1 and S-Rec2. The algorithm terminates either when end matches end (rule S-End) or when a pair of typestates is revisited (rule S-Terminate). Rule S-Method checks for prefix-matching. Rule S-Set requires covariance on subtyping with the empty set being treated as a special case. Rule S-Enum matches the external choice prefix. It requires subtyping on typestates for every value of the enumerated type. By analogy with subtyping on session types [21], it should be possible to generalise the definition to allow contravariant subtyping in the set of enumerated values. This might be relevant if we wanted to support inheritance, but it is not necessary for the case studies that we have implemented so far. The subtyping relation extends to inferred types and typing contexts. Proving that it is a preorder is standard; we use reflexivity and transitivity in proofs about the type system.
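The general shape of an algorithmic simulation check, with a set of visited pairs that guarantees termination on recursive typestates, can be sketched as follows. This is a generic simulation check over finite state machines and not a transcription of the rules in Fig. 4: the Machine representation, the method names, and the reading that an inactive state is related only to an inactive state are assumptions, and external choices over enumerated values are not modelled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: simulation check between two finite typestate machines. */
class SimulationSketch {
    /** Maps each state name to its transitions (method name -> successor state name). */
    static final class Machine {
        final Map<String, Map<String, String>> states = new HashMap<>();

        /** Adds a state; pairs alternate method name and successor state name. */
        Machine state(String name, String... pairs) {
            Map<String, String> transitions = new HashMap<>();
            for (int i = 0; i + 1 < pairs.length; i += 2) {
                transitions.put(pairs[i], pairs[i + 1]);
            }
            states.put(name, transitions);
            return this;
        }
    }

    /** True if every call sequence of the inferred machine is permitted by the declared one. */
    static boolean subtype(Machine inferred, String start, Machine declared, String declaredStart) {
        return check(inferred, start, declared, declaredStart, new HashSet<String>());
    }

    private static boolean check(Machine inferred, String i, Machine declared, String d,
                                 Set<String> visited) {
        // A revisited pair is assumed related; this is what makes recursion terminate.
        // State names are assumed not to contain the separator used in the key.
        if (!visited.add(i + " <= " + d)) {
            return true;
        }
        Map<String, String> used = inferred.states.get(i);
        Map<String, String> offered = declared.states.get(d);
        if (used.isEmpty()) {
            return offered.isEmpty(); // assumption: the inactive state only matches itself
        }
        for (Map.Entry<String, String> call : used.entrySet()) {
            String next = offered.get(call.getKey());
            if (next == null || !check(inferred, call.getValue(), declared, next, visited)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Machine declared = new Machine()
                .state("Init", "push", "Full", "close", "End")
                .state("Full", "pop", "Init")
                .state("End");
        Machine usage = new Machine()
                .state("U0", "push", "U1").state("U1", "pop", "U2")
                .state("U2", "close", "U3").state("U3");
        System.out.println(subtype(usage, "U0", declared, "Init"));
    }
}
```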
The join operator computes an upper bound with respect to ≤sbt. It is used to compute a common typestate in typing rules that combine multiple execution paths. The most interesting case of join on typestates is the join of method signatures.
For methods common to the two typestates, the continuation typestates are joined; the remaining methods are combined using set union. A join involving a recursive typestate is defined by unfolding. The operation extends to inferred types U (in particular, to types of the form C[S]) and typing contexts. Finally, we define a transition relation on typestates as follows. The first two rules state that a method-prefixed typestate reduces to its continuation, using the method signature as the label. The next two rules are contextual, stating that reduction can occur in a set of typestates and under recursion, respectively. The last rule defines a reduction on a runtime typestate, as defined in Fig. 2. It states that a branching typestate reduces to one of its components, using the corresponding enumerated value as the label. Later we will use the notation S -s-> S' for a sequence of transitions, where s is a sequence of method signatures and labels.
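The join of method signatures described above (join the continuations of shared methods, take the union of the rest) can be sketched for finite, non-recursive typestates. This is an illustration only: the Typestate class is hypothetical, methods are identified by name alone rather than by full signature, and recursive typestates and external choices are left out.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: join of two finite, non-recursive typestates. */
class JoinSketch {
    /** Each available method name maps to its continuation; no methods means inactive. */
    static final class Typestate {
        final Map<String, Typestate> methods = new TreeMap<>();

        Typestate with(String method, Typestate continuation) {
            methods.put(method, continuation);
            return this;
        }

        @Override
        public String toString() {
            return methods.isEmpty() ? "end" : methods.toString();
        }
    }

    static Typestate end() {
        return new Typestate();
    }

    /** Shared methods get joined continuations; all other methods are kept as they are. */
    static Typestate join(Typestate a, Typestate b) {
        Typestate result = new Typestate();
        for (Map.Entry<String, Typestate> entry : a.methods.entrySet()) {
            Typestate other = b.methods.get(entry.getKey());
            result.methods.put(entry.getKey(),
                    other == null ? entry.getValue() : join(entry.getValue(), other));
        }
        for (Map.Entry<String, Typestate> entry : b.methods.entrySet()) {
            result.methods.putIfAbsent(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Typestate left = end().with("push", end().with("pop", end()));
        Typestate right = end().with("push", end()).with("isEmpty", end());
        System.out.println(join(left, right)); // prints {isEmpty=end, push={pop=end}}
    }
}
```

In this simplified setting the result offers every method that either argument offers, which is why it can serve as a common typestate for two execution paths.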

Typestate Inference Rules
Before introducing the inference rules, we give the form of the judgements:
∆ ▷ e : U ◁ ∆'    ∆ ⊢ C[S]    ⊢ class C : S {F; M}    ⊢ D
The first one is the typing judgement for expressions. The judgement is read from right to left. It takes as input the typing context ∆' and the expression e, and algorithmically computes the type U. The effects of the expression on ∆' are then captured in ∆. (However, it is interesting to note that the judgement can also be read from left to right in a type system fashion, where the expression "consumes" ∆ in order to produce ∆'.) The second judgement infers the typestates of the fields of a class when the class is used according to its declared typestate. The last two typing judgements state the well-formedness of classes and programs.
The typestate inference rules for expressions are given in Fig. 6. The rules are syntax-directed, meaning that at any point in the derivation there is only one rule which is applicable. Rules Void, Bool, Enum and Null type the constants with
The output typing context, ∆₂, for PathR states that k has an end typestate before the assignment. Rule PathR "guesses" a type for a path expression. However, the combination of PathR and AsgnR enforces a match on the type of s in the output typing context ∆₂ and the type of k in the input typing context ∆₀. For the first expression in (1) we use rule New. By assumption we satisfy its premise; we have S ≤sbt StackProtocol, meaning path s is used according to the StackProtocol typestate. Rule New always infers void; this satisfies the premise of rule Seq, which requires the type of the first expression to be neither a class type (so that discarding does not violate linearity) nor bot (to forbid dead code after a continue λ expression; see rule Continue). The type of the sequential expression is the type of the second expression, void. To summarise the derivations described so far:
To preserve linearity, s and k exchange their typestates before and after assignment. If the type of s in ∆₀ were not end, path s would still be usable after being read from, violating linearity, as in:
s = new Stack; k = s; s.push(5)    (3)
Note that, in rules AsgnR and New, the path this cannot be assigned to, otherwise the current object would be overwritten in the heap. In rule PathR, the path this cannot be read from, because using this in an expression would violate the linearity condition that the unique reference to the object is at the call site.
The other rules for paths and assignments are as follows. Rule PathC infers a constant type U for a path r and has no effect in the input typing context, if r is mapped to U in the input typing context. Rule AsgnC is similar to AsgnR, except that e has a constant type U that is left unchanged in the input and output typing contexts.

Loops, Choice and Recursive Typestate. We now explain how a loop can be controlled by the enumerated value returned by a method, and how this leads to the inference of a recursive typestate specification. The example illustrates rules LExpr, Continue, Switch, and If. Consider the following class StackUser that defines methods that use a Stack. Note that the example is expressed in the core calculus, using Java-style formatting, so there is no return keyword. The body of method popAll in line 4 is a labelled expression, and so rule LExpr applies. The premise computes a typestate for the switch expression, using an input context ∆₀ augmented with the assumption loop : X, where X is fresh. Let ∆₁ = ∆₀, loop : X. LExpr closes all free occurrences of X in the output typing context. For the switch expression, rule Switch computes a typestate for each branch, using the typing context for the entire switch expression, ∆₁, as the input context. The inferred output contexts of the branches are then joined and used as an input to infer a typestate for the method call expression which is the condition of the switch. The condition should have an enumeration type that matches the labels of the switch branches. Finally, the type of the switch is the join of the types of its branches. For the EMPTY branch we use rule PathR: We join the output typing contexts ∆₂ and ∆₄ of the EMPTY and NONEMPTY branches, and use the result as an input typing context for the method call x.isEmpty(), as per the second premise of Switch.
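To make the shape of such a loop concrete, here is a plain Java sketch of a loop controlled by an enumerated return value. It is not the StackUser listing of the core calculus: the names Check, IntStack, StackUserSketch and pop are assumptions introduced for illustration, and only isEmpty(), EMPTY, NONEMPTY, popAll and the label loop are taken from the discussion above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical enumeration of the results of isEmpty().
enum Check { EMPTY, NONEMPTY }

// Hypothetical stack whose isEmpty() returns an enumerated value.
final class IntStack {
    private final Deque<Integer> items = new ArrayDeque<>();

    void push(int v) { items.push(v); }
    int pop() { return items.pop(); }

    // The enumerated result decides which branch the caller takes next.
    Check isEmpty() { return items.isEmpty() ? Check.EMPTY : Check.NONEMPTY; }
}

final class StackUserSketch {
    // Pops until the stack reports EMPTY; returns the number of pops.
    static int popAll(IntStack x) {
        int popped = 0;
        loop:
        while (true) {
            switch (x.isEmpty()) {
                case EMPTY:
                    break loop;      // leave the labelled loop
                case NONEMPTY:
                    x.pop();
                    popped++;
                    continue loop;   // jump back to the label, as with continue in the calculus
            }
        }
        return popped;
    }

    public static void main(String[] args) {
        IntStack s = new IntStack();
        s.push(1);
        s.push(2);
        s.push(3);
        System.out.println(popAll(s)); // prints 3
    }
}
```

The jump back to the label in the NONEMPTY branch is what gives rise to a recursive typestate when the usage of x is inferred.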
The output typing context of Switch is: To complete the derivation for LExpr we close the recursive variable X in ∆₅ and obtain the output typing context for the labelled expression in lines 4-5, namely: Notice the equivalence of the type µX.join(end, X), that ∆₆ assigns to path this, and the type end, meaning that rule Equiv can be applied. Rule If is similar to Switch. Both conditional branches are individually inferred and then joined to obtain the output typing context of the conditional expression. We further require that the condition has type bool.

Method Call. Rule Call records the method call trace of paths in a program, to respect the principle that the trace of the execution of an object follows its inferred typestate. It uses the function initT(·), defined by T ≠ C ⟹ initT(T) = T and initT(C) = C[end]. Rule Call typechecks the method body every time a method is called. This is a simplification for presentational purposes; an algorithm directly extracted from the rules will be unable to construct a type for a recursive method call. However, the rules can be used to derive typings if suitable pre- and postconditions are put into the derivation by hand. The implementation of Mungo's type inference system uses a more complex notion of partial typestate so that method bodies do not need to be checked at every call site; recursive methods are also supported.
Rule Call includes the hypothesis S =sbt S′. This condition means that the typestate of this is preserved by the body of the method. We can now apply rule Call on c.pushVal(s) as follows:

Stack pushVal(Stack x) {x.push(2); x} ∈ methods(StackUser)        (1)

Premise (1) looks up the definition of method pushVal in the class of the receiver, StackUser. Premise (2) infers a typestate for the method body in which c has been substituted for the keyword this. Both the method call and the body use the same input typing context ∆₁. The output typing context from the method body must contain a typestate assumption for the method parameter and receiver: Then, premise (3) requires a typestate inference in order to match the typestate of the method parameter with the type of the argument s. For this, rule PathR is used where ∆₃ also updates the type of the receiver: The parameter of a method must be consumed by the body of the method. This is shown by the typing x : initT(T) in rule Call, where initT(T) is end if T has a typestate. This can be done in several ways: by following the typestate of the parameter to end; by returning the parameter as the result of the method (perhaps after following part of its typestate); by assigning the parameter to a field (again, perhaps after following part of its typestate). Assigning a parameter to a field is always possible as a default way of consuming the typestate of a parameter. Rule Call requires that the types of the receiver c in the input and output typing contexts for the body of the method are equivalent, according to the relation =sbt. This avoids exposing to the caller how the method uses its receiver. For example, assume method pushVal is defined as: Given that ∆₁(c) = StackUser[{{Stack popAll(Stack) : end}}], it is revealed that the body of pushVal calls method popAll on its receiver, exposing this implementation detail to the caller.
Methods, Fields, Classes and Programs. The rules for methods, fields, classes and programs are given in Fig. 7. Rule Method-St uses the relation refines(U, T), which means that U adds a typestate to T if T is a class. It is defined by: refines(C[S], C) holds for any S, and refines(T, T) holds if T is not a class. Rule Method-St infers a method-prefixed typestate: it first computes the continuation typestate, and then uses the output typing context to infer the method prefix, by first inferring a typestate for its body. The auxiliary definition refines(Uᵢ, Tᵢ) is used to check that the return type and parameter types declared in the syntax of the method match the corresponding inferred types for return and parameters, respectively; essentially, we expect to infer a typestate of type C[S], for some S, for (return and parameter) types that are declared with type C. As in Call, a self-call should preserve the typestate of the receiver up to =sbt. Rule Enum-St is similar to rule Method-St, inferring and then joining the typestates of all branches and then inferring the method prefix. Rule Set-St requires the inference and join of the typestates of all branches. Rule End-St requires all fields of the class to finish in the end typestate. Rules Rec-St and Var-St are similar to rules LExpr and Continue, binding and using a recursive variable, respectively. Rule Class initiates the inference of the typestate of the class. It states that a class declaration is well-typed if every field of the class has an end typestate in the typing context computed in the premise of Class. Rule Program states that a program is well-typed if all of its classes are well-typed.
The inference rules for runtime expressions are given in Fig. 8. We show only those that differ from the ones in Fig. 6. Rule Switch-AtR is similar to Switch, the difference being the condition of the switch, which is evaluated to an active receiver rather than a method call. Rule AtR infers a typestate for e@r, by first inferring a typestate for e. The other rules are used to type runtime configurations. Rule Object, similarly to rule New, checks that the typestate of the objects in the context matches the declared typestate of their class. Rule Heap uses rule Object to check whether a typing context is consistent with all the objects in the heap. Finally, rule Config infers a typestate for a runtime configuration, by first inferring a typestate for the expression and then using its output typing context to type the heap. The output typing context and the typestate of the configuration match those of the expression.

Properties of the Typestate Inference System
Progress and type preservation require that the output typing context of an expression mimic the reductions of the expression itself. To this end, we define reductions on typing contexts in Fig. 9, which use the same labels as the reductions on expressions. Rule Ty-Id states that ∆ remains unchanged under a τ-reduction. Rule Ty-New states that a path in ∆ mapped to an end typestate reduces under the label r.f.new C and its typestate is updated accordingly. Rules Ty-AsgnR and Ty-AsgnC label the reduction with an assignment of a path and a constant, respectively. The former ensures linearity is respected when an assignment takes place; the latter leaves the typing context unchanged. Rule Ty-Call performs a reduction of a method-prefixed typestate with the method prefix as the label. Similarly, rule Ty-Label reduces with an enumerated value for paths that have a runtime switch typestate. The behaviour of the if label is captured by rule Ty-If. In both of the last two rules the result of the reduction is a subtype of the starting typing context. We now state the progress and type preservation theorem. The proof is given in Appendix A.

Type preservation implies the runtime safety property that for every object, the sequence of method calls and enumerated return values is a path within the declared typestate of the object's class. The reasoning is the following. The type preservation theorem, in the case when l is a method call or a return of an enumerated value, states that l is the first step of a path in the inferred typestate. By repeatedly using type preservation along a sequence of steps, we find that a sequence of method calls and returns is always a path within the inferred typestate. When an object is created (rule New), its inferred typestate is a subtype of its declared typestate. Subtyping is essentially simulation, so it implies trace inclusion. Therefore the sequence of method calls and enumerated return values is a path within the declared typestate.
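The safety property can be pictured as membership of a trace in the language of a state machine. The sketch below is illustrative only: TypestateMachine and TraceDemo are hypothetical names, and the stack-like protocol (including the state names and the label deallocate) is assumed for the example and is not the StackProtocol definition from the paper. Labels are either method names or enumerated return values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A typestate modelled as a finite state machine: state -> (label -> next state).
final class TypestateMachine {
    private final Map<String, Map<String, String>> transitions = new HashMap<>();
    private final String initial;

    TypestateMachine(String initial) { this.initial = initial; }

    TypestateMachine edge(String from, String label, String to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(label, to);
        return this;
    }

    // True if the trace is a path starting from the initial state.
    boolean accepts(List<String> trace) {
        String state = initial;
        for (String label : trace) {
            Map<String, String> out = transitions.get(state);
            if (out == null || !out.containsKey(label)) {
                return false; // this call or returned label is not permitted here
            }
            state = out.get(label);
        }
        return true;
    }
}

final class TraceDemo {
    // Assumed stack-like protocol, used only for illustration.
    static TypestateMachine stackLike() {
        return new TypestateMachine("Empty")
            .edge("Empty", "push", "NonEmpty")
            .edge("Empty", "deallocate", "end")
            .edge("NonEmpty", "push", "NonEmpty")
            .edge("NonEmpty", "pop", "Unknown")
            .edge("Unknown", "isEmpty", "Choice")
            .edge("Choice", "EMPTY", "Empty")
            .edge("Choice", "NONEMPTY", "NonEmpty");
    }

    public static void main(String[] args) {
        TypestateMachine m = stackLike();
        // A permitted trace: prints true
        System.out.println(m.accepts(Arrays.asList("push", "pop", "isEmpty", "EMPTY", "deallocate")));
        // pop is not allowed on an empty stack: prints false
        System.out.println(m.accepts(Arrays.asList("pop")));
    }
}
```

Under this reading, trace inclusion says that every trace accepted by the inferred machine is also accepted by the declared one.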
Typability also guarantees that the program completely uses every object according to its typestate, by following a path to end. This does not mean that every object will be completely used at runtime, for example because there could be non-terminating code before a particular method in the typestate is called. But there must be code that can potentially follow every typestate to termination.

Typechecking expressions and typechecking programs
The proof of type preservation, and hence the runtime safety property, does not require the program D to be typable. That is, the rules in Fig. 7 are not used. The type preservation theorem can be applied to an initial configuration of the form described at the end of § 5, and typability of the configuration uses the rules in Figures 6 and 8. What, then, is the purpose of the rules in Fig. 7? These rules check that the declared typestate of each class is consistent, in the sense that the sequences of method calls allowed by the typestate are consistent with the effects of the method calls on the typestates of the fields of the class. If we consider our formal system without Fig. 7, typestates are inferred and then compared with the declared typestates, but the declared typestates might be inconsistent: they might allow additional sequences of method calls that are not consistent and could never be inferred. This is not a problem for type safety, but the presence of "junk" in a declared typestate would be undesirable from a documentation point of view.
If Fig. 7 is used to check consistency of the declared typestates, then an additional property is guaranteed, which we state informally. For every class C, it is possible to construct a typable expression that creates an object of class C and uses all of the behaviour allowed by the declared typestate of C. StMungo does this when it generates the skeleton main method that uses a generated API.
Mungo implements the rules from Fig. 7 for two reasons: (1) to avoid the declaration of inconsistent and therefore confusing typestates; (2) to avoid checking method definitions at every call site, as explained in §6.1.

Implementation of the Type System
The type system in this section captures most of the principles that are implemented in Mungo. However, for the sake of simplicity not all features of the implementation have been formalised.
Firstly, the formalisation omits some Java constructs that are not central to the treatment of typestate.
Secondly, the formalisation associates a typestate specification with every class, and therefore all objects are subject to linear typing in order to ensure unique references. The implementation also supports classes that do not have typestate specifications, and objects of such classes are not controlled by linear typing.
Thirdly, the most important feature not reflected in the formalisation is support for recursive method calls and a related technique for avoiding repeated analysis of method bodies. Mungo treats a method body in isolation by inferring a typestate for all objects used within the method body. It also keeps track of the objects referenced by the fields used within a method when the method is called and when the method returns. At a method call, instead of re-analysing the method body, Mungo combines the inferred typestates from the previous analysis of the method body with the current usage of the fields. This technique also supports recursive method calls, by constructing cycles. The data structure that represents a typestate resembles a state machine.
Formalising this aspect of Mungo would require defining a notion of partial typestate. This is because the analysis of a method leaves the usage of some objects referenced by fields incomplete, i.e. truncated. We leave formalising partial typestate to future work, but here we illustrate the operation of Mungo on a simple example.
Consider the Java code where the body of method void m() uses field f and also makes a recursive call within a conditional statement. The Mungo typechecker constructs the following data structure to represent the typestate of field f. For each method call on field f, there is a node labelled by the name of the method. Directed edges (arrows) between nodes represent the control flow. The initial arrow comes from a control point that designates the beginning of the method body. The continuation from m2() is the recursive call, which is represented by an arrow to the initial control point. The control point that designates the return of the method body has an arrow towards method m3(). This is because after a recursive call the control flow proceeds to the call of m3(). If there were no recursive call then the continuation from f.m2() would be f.m3(). A similar control flow structure is created for every field that is used inside a method. These control flow structures are what we mean by partial typestates, as the circular nodes are points at which further typestate transitions can be added when other methods are typechecked. The partial typestates can be regarded as truncated or fragmentary state machines.
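A method of the shape described above might look as follows. This is a guess at the general form, not the original listing: the class names Resource and RecursiveUser, the counter remaining that stands in for the condition, and the recorded trace are all assumptions added so that the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical field type: records the order in which its methods are called.
final class Resource {
    final List<String> trace = new ArrayList<>();
    void m2() { trace.add("m2"); }
    void m3() { trace.add("m3"); }
}

final class RecursiveUser {
    private final Resource f = new Resource();
    private int remaining; // assumed stand-in for the condition guarding the recursion

    RecursiveUser(int recursions) { this.remaining = recursions; }

    void m() {
        if (remaining > 0) {
            remaining--;
            f.m2(); // the continuation of m2 is the recursive call, i.e. the start of m
            m();
            f.m3(); // reached from the return point of the recursive call
        }
    }

    List<String> trace() { return f.trace; }

    public static void main(String[] args) {
        RecursiveUser u = new RecursiveUser(2);
        u.m();
        System.out.println(u.trace()); // prints [m2, m2, m3, m3]
    }
}
```

The recorded trace shows why the return control point needs an arrow to m3(): every call of m3() follows the return of a recursive call, not a call of m2() directly.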

Related and Future Work
Session types and programming languages. The Session Java (SJ) language [29] builds on earlier work [15,14,17] to add binary session type channels to Java. SJ has been applied to a range of situations including scientific computation [39] and event-driven programming [27]. SJ implements a library for binary sessions that have a pre-defined interface. The Java syntax is extended with communication statements that enable typechecking. The scope of a session is restricted to the body of a single method. Mungo lifts these restrictions by allowing the abstraction of multiparty session types as user-defined objects that can be passed and used throughout different program scopes. Gay et al. [23] outlined an implementation of their type system as a language called Bica, which is not currently maintained and is unusable. Mungo improves on Bica by using type inference to remove the need for typestate declarations on methods. Hu et al. [27] extend Session Java with runtime type inspection and asynchronous communication semantics to enable an event-driven framework based on binary session types. As a use case they implement a binary session-typed SMTP server that uses a reactive structure to handle multiple clients concurrently. In our work we implement an SMTP client by using StMungo, which automatically generates code from a global protocol. Extending Mungo with runtime typestate inspection would enable us to investigate event-driven programming with multiparty session types.
In Capecchi et al. [9] a class defines sessions instead of methods. A session generalises a method to an extended session typed dialogue over a communication channel. As far as we know, this new paradigm has not yet been implemented.
Ng et al. [38] typecheck the operations of a library that implements multiparty session types using a restricted set of MPI [32] primitives. In contrast, our framework typechecks Java statements and expressions, instead of higher-level operations. Ng et al. [37] use Scribble to automatically generate MPI code based on user-defined kernels that produce and consume data. The generated code does not require typechecking. On the other hand, the StMungo transpiler can be used together with the Mungo typechecker to develop more flexible multiparty session type implementations.

Monitoring based on Scribble. Neykova et al. [36] use Scribble protocol definitions to achieve dynamic monitoring in Python, by translating local protocols into finite state machines that intercept communication and check the validity of runtime messages. Subsequently, [35] implements a session-based Actor framework that uses runtime monitoring to integrate multiparty session types. A hybrid approach is used by Hu [28] to analyse an SMTP client in Java. Hu's SMTP API implements multiparty session types using a pattern in which each communication method returns the receiver object with a new type that determines which communication methods are available at the next step. If the pattern is used properly, then standard Java typechecking can verify correctness of communication, but runtime monitoring is needed to check linearity constraints. In contrast, our analysis of SMTP is able to statically check all aspects of the protocol implementation.
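The receiver-returning pattern and its need for a runtime linearity check can be sketched in a few lines of Java. This is a minimal illustration of the general pattern, not Hu's SMTP API: the state classes Start, Greeted and Closed, the methods hello, send and quit, and the base class LinearState are all hypothetical.

```java
// Hypothetical base class: each state object may be used for exactly one step.
abstract class LinearState {
    private boolean used;

    // Runtime linearity check; standard Java typing cannot enforce this statically.
    protected final void consume() {
        if (used) {
            throw new IllegalStateException("state object reused");
        }
        used = true;
    }
}

// Initial state: only hello() is available.
final class Start extends LinearState {
    Greeted hello() { consume(); return new Greeted(0); }
}

// After the greeting: send() may repeat, quit() ends the session.
final class Greeted extends LinearState {
    private final int sent;

    Greeted(int sent) { this.sent = sent; }

    Greeted send(String msg) { consume(); return new Greeted(sent + 1); }
    Closed quit() { consume(); return new Closed(sent); }
}

// Final state: no communication methods remain.
final class Closed {
    final int messagesSent;
    Closed(int messagesSent) { this.messagesSent = messagesSent; }
}

final class ReceiverReturningDemo {
    // The Java typechecker enforces the order hello, send*, quit.
    static int run() {
        return new Start().hello().send("a").send("b").quit().messagesSent;
    }

    // Using the same state object twice typechecks, so it must be caught at runtime.
    static boolean reuseDetected() {
        Start s = new Start();
        s.hello();
        try {
            s.hello();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(run());           // prints 2
        System.out.println(reuseDetected()); // prints true
    }
}
```

Calling quit() before hello() is a compile-time error in this sketch, whereas calling hello() twice on the same object is only detected by the runtime flag; this is the gap that a static linearity discipline closes.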
The receiver-returning pattern is at the basis of functional programming with session types [22] and has been used to achieve protocol checking in Idris [30] and as a replacement for explicit typestate in Rust [42].

Typestate. There have been several projects to add typestate to practical languages, since its introduction in [45]. Vault [12,19] is an extension of C, and Fugue [13] applies similar ideas to C#. Plural [6] is based on Java and has been used to study access control systems [5] and transactional memory [4], and to evaluate the effectiveness of typestate in Java APIs [6]. In contrast, Mungo follows Gay et al., which is inspired by session types; the possible sequences of method calls are explicitly defined, rather than being consequences of pre- and post-conditions. Like Plural, a typestate in Mungo can depend on the return value of a method call.
Sing# [18] is an extension of C# which was used to implement Singularity, an operating system based on message-passing. It incorporates typestate-like contracts, which are a form of session type, to specify protocols. Bono et al. [8] have formalised a core calculus based on Sing# and proved type safety.
Aldrich et al. [2,46] propose a new paradigm of typestate-oriented programming, implemented in the Plaid language. Instead of class definitions, a program consists of state definitions containing methods that cause transitions to other states. Transitions are specified in a similar way to Plural's pre- and post-conditions. Like classes, states are organised into an inheritance hierarchy. Recent work [20,48] uses gradual typing to integrate static and dynamic typestate checking. We focus on the object-oriented paradigm in order to be able to apply our results to Java.
Bodden and Hendren [7] developed the Clara framework, which combines static typestate analysis with runtime monitoring. The monitoring is based on the tracematches approach [3], using regular expressions to define allowed sequences of method calls. The static analysis attempts to remove the need for runtime monitoring, but if this is not possible, the runtime monitor is optimised. Mungo uses a purely static analysis, and can allow the state after a method call to depend on the method's (enumerated type) result.
Typestate systems must control aliasing, otherwise method calls via aliases can cause inconsistent state changes. The literature includes the "adoption and focus" approach of Vault and Fugue, the permission-based approaches of Plural and Plaid, and an expressive fine-grained system by Militão et al. [33]. Also relevant is recent work by Crafa and Padovani [11] which applies the chemical approach to concurrent typestate-oriented programming, allowing objects to be accessed and modified concurrently by several processes, each potentially changing only part of their state. We expect that many of these systems can be applied to Mungo. However, linear typing has not been a limiting factor for the applications described in the present paper.

Future Work. The combination of Mungo and StMungo is effective for statically checking the correct implementation of communication protocols. We intend to extend Mungo to increase its power for general-purpose programming with typestate. Our first aim is to generalise the use of linear typing as a mechanism for the alias control required by typestate systems. Candidates include the "adoption and focus" technique of Vault and Fugue, the permission-based approaches of Plural and Plaid, and the system by Militão et al. [33]. Another aim is to support generics and inheritance. Inheritance between typestate classes requires a subtyping relation between their typestate specifications, based on standard definitions of subtyping for session types [21]. Method calls on an object whose type is a generic parameter must be typechecked against the typestate specification of the parameter's upper bound. To extend typechecking to exception handlers, we need to allow typestate specifications to define the state transitions corresponding to exceptions, and check that these transitions are consistent with the states of fields at the point where an exception is thrown.
Existing work on exceptions in session types [10] provides inspiration, but does not address the complexities of Java's exception mechanism. Using these Mungo extensions with StMungo for more sophisticated protocol verification will also require extensions to Scribble to support generic protocols, inheritance between protocols, and more general handling of exceptions.

Conclusion
We have presented two tools, Mungo and StMungo, which extend the Java development process with support for static typechecking of communication protocols. Mungo extends Java with typestate definitions, which associate classes with state machines defining permitted sequences of method calls. StMungo uses the typestate feature to connect Java to Scribble, the latter being a language used to specify communication protocols. In order to illustrate the practicality and robustness of Mungo and StMungo, we have implemented a substantial case study, an SMTP client, which we are able to statically typecheck. We use this client to communicate with the Gmail server. Finally, we have formalised the essential features of Mungo by defining a typestate inference system for a core object-oriented language. We proved safety and progress properties (Theorem 4), which mean that typestate inference guarantees correct behaviour of a program with respect to the declared typestate specifications.
• d is a subderivation of d concluding ∆ e : U e ∆ e , • the position of d in d corresponds to the position of the hole in E, • ∆ e : U e ∆ e , such that U e sbt U e , then ∆ E[e ] : U ∆ such that U sbt U.
Proof. Follows [23], by replacing the derivation d in d with the derivation for ∆ e : U e ∆ e . Lemma 8 (Subtyping and join). The following relate subtyping and join on inferred types U and typing contexts ∆.
1. Let U, U be inferred types such that join(U, U ) is defined. Then, U sbt join(U, U ) and U sbt join(U, U ).
Proof. The proof follows immediately by combining the definition of subtyping in