Language components for modular DSLs using traits

https://doi.org/10.1016/j.cl.2015.12.001

Highlights

  • The contribution of this work is to synthesize a collection of patterns and techniques that can be used to implement language components using traits.

  • Separate the syntactic concern from the construction of the abstract representation of the language.

  • Separate the abstract representation of the language from the implementation of its semantics.

  • Modularize the implementation of the semantics in distinct phases; decouple the abstract representation from the semantics of each phase, possibly expressing dependencies between phases.

  • The benefit of representing language concepts through traits is an improved modularization, thereby simplifying code sharing across language implementations.

  • Since traits in most languages can be written as separate code units, employing them in the modularization of a language makes it possible to compile each language component separately and independently from the others, allowing them to be shared as binary assets that can nevertheless still be combined together post-compilation.

  • Scala's trait implementation has been used to demonstrate our contribution.

Abstract

Recent advances in tooling and modern programming languages have progressively brought back the practice of developing domain-specific languages as a means to improve software development. Consequently, the problem of making composition between languages easier by emphasizing code reuse and componentized programming is a topic of increasing interest in research. In fact, it is not uncommon for different languages to share common features, and, because in the same project different DSLs may coexist to model concepts from different problem areas, it is interesting to study ways to develop modular, extensible languages. Earlier work has shown that traits can be used to modularize the semantics of a language implementation; a lot of attention is often spent on embedded DSLs; even when external DSLs are discussed, the main focus is on modularizing the semantics. In this paper we will show a complete trait-based approach to modularize not only the semantics but also the syntax of external DSLs, thereby simplifying extension and therefore evolution of a language implementation. We show the benefits of implementing these techniques using the Scala programming language.

Introduction

In recent years, the practice of developing domain-specific languages (DSLs) to deal with domain-specific problems has started to regain interest among researchers and practitioners, as demonstrated by surveys and books [1], [2] and a more recent study about research trends and applications of DSLs [3]. Support tooling is becoming more and more powerful, flexible and convenient, with the introduction of new frameworks and platforms. Modern API design is progressively converging to a style that resembles an embedded language, to the point where the very distinction between a DSL and a general-purpose language (GPL) is becoming thinner; this is a style that Martin Fowler and Eric Evans dubbed fluent interface [4], and that languages such as Scala [5], Smalltalk, Ruby and Groovy actively promote. The flexible parser and syntax of such languages even allow users to omit some punctuation, making it simple to simulate the embedding of a foreign language. For instance, Listing 1 shows a Scala internal DSL (punctuation that can be omitted has been dimmed).

Listing 1

A state machine language as a Scala embedded DSL

However, language embedding as a fluent interface must still obey the limitations of the host language; inevitably, external DSLs provide an unrivaled level of syntactic flexibility (for instance, compare the state machine in Listing 1 with the equivalent written in an external DSL and reported in Listing 3), at the cost of requiring developers to write their own parsing routines and to implement the semantics of every single construct. Nevertheless, modern libraries and languages provide programmers with tools that put the development of their own external DSL within reach. For instance, parser combinator libraries [6], [7], [8] make it possible to define an executable parser in such a way that the code that describes it closely resembles the structure and look of its formal grammar. We are getting closer and closer to a full componentization of language implementations, where domain-specific languages could be implemented as the combination of features concretized as reusable assets.

Modular language development is a research branch that investigates tools and techniques to componentize the design and implementation of languages, with particular attention to DSLs, where each feature may be easy to represent as a distinct code unit. A language implementation then becomes very close to a combination of a selection of such components, realizing a form of feature-oriented language composition [9], [10], [11], [12]. In general, a language implementation can be described by (i) a parser for a concrete syntax, yielding (ii) an abstract syntax tree that relates the concrete syntax of an input program to an abstract syntax representation, and (iii) the semantics that can be associated with the nodes of the abstract syntax tree [13], [14]. These three parts of a language can be decomposed into a collection of language components [9], [15]: a language component is a bundle that includes the syntax and the semantics of a single construct; the semantics may be represented as a sequence of evaluation phases (e.g., type checking and evaluation) that pertain to that particular construct (Fig. 1). A language component can be shared and distributed as a whole across different language implementations, possibly as a binary, pre-compiled package; the final objective is to provide the features of the language as a self-contained bundle of components that can simply be combined together. The contribution of this work is to synthesize a collection of patterns and techniques that can be used to implement language components using traits, lightweight entities of code reuse that are often contrasted with single and multiple inheritance [16], [17], and that have already been shown (e.g., [18], [19], [12]) to be especially well suited to language componentization. To this end, we will show how to

  • separate the syntactic concern from the construction of the abstract representation of the language;

  • separate the abstract representation of the language from the implementation of its semantics;

  • modularize the implementation of the semantics in distinct phases;

  • decouple the abstract representation from the semantics of each phase, possibly expressing dependencies between phases.

We will use traits to componentize the parser implementation in such a way that the code resembles a grammar, thereby making it easier to understand and develop, and to implement the interpreter pattern separating different concerns of the semantics of the language constructs.
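The separation listed above can be illustrated with a minimal sketch, assuming a hypothetical expression language that is not the paper's running example. The names `Expr`, `TypeCheckPhase`, `EvalPhase` and `Interpreter` are invented for illustration. The abstract representation carries no semantics, each phase lives in its own trait, and a Scala self-type is one way to express that the evaluation phase depends on the type-checking phase.

```scala
// Abstract representation: plain data with no semantics attached (hypothetical nodes).
trait Expr
final case class Num(value: Int) extends Expr
final case class Bool(value: Boolean) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr

// Types produced by the type-checking phase.
sealed trait Type
case object IntType extends Type
case object BoolType extends Type

// Phase 1: type checking, written as a standalone trait.
trait TypeCheckPhase {
  def typeOf(e: Expr): Either[String, Type] = e match {
    case Num(_)  => Right(IntType)
    case Bool(_) => Right(BoolType)
    case Add(l, r) =>
      (typeOf(l), typeOf(r)) match {
        case (Right(IntType), Right(IntType)) => Right(IntType)
        case (Left(err), _)                   => Left(err)
        case (_, Left(err))                   => Left(err)
        case _                                => Left("Add expects two integers")
      }
    case other => Left(s"unknown node: $other")
  }
}

// Phase 2: evaluation. The self-type states that this phase can only be
// mixed into something that also provides the type-checking phase.
trait EvalPhase { self: TypeCheckPhase =>
  def run(e: Expr): Either[String, Int] = typeOf(e) match {
    case Right(IntType)  => Right(evalInt(e))
    case Right(BoolType) => Left("expected an integer expression")
    case Left(err)       => Left(err)
  }

  // Only called on expressions that passed type checking as integers.
  private def evalInt(e: Expr): Int = e match {
    case Num(v)    => v
    case Add(l, r) => evalInt(l) + evalInt(r)
    case other     => sys.error(s"not an integer expression: $other")
  }
}

// The interpreter is the composition of the two phases.
object Interpreter extends TypeCheckPhase with EvalPhase

object PhaseDemo {
  def main(args: Array[String]): Unit = {
    println(Interpreter.run(Add(Num(1), Num(2))))
    println(Interpreter.run(Add(Num(1), Bool(true))))
  }
}
```

Mixing `EvalPhase` into an object that lacks `TypeCheckPhase` is rejected at compile time, which is the kind of static guarantee a typed trait system can offer for phase dependencies.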

The benefit of representing language concepts through traits is an improved modularization, thereby simplifying code sharing across language implementations. Moreover, since traits in most languages can be written as separate code units, employing them in the modularization of a language makes it possible to compile each language component separately and independently from the others, allowing them to be shared as binary assets that can nevertheless still be combined together post-compilation.

The approach that we present has been influenced by several sources of inspiration. First, Scala's parser combinator library bundles traits with predefined combinators for commonly used literals and regex patterns, which users can mix into their classes. Second, it draws on our experience with the implementation of the Neverlang framework [15], [20], [21], [22] for componentized language development, with which the trait-based model that we will present shares a few commonalities. Finally, it builds on the previous work on modularizing the semantics of an interpreter (e.g., [12], [18], [19]). The objective of this work is to present a complete solution, including syntax and semantic composition, to realize the implementation of language components. The final goal will be, in the future, to be able to implement languages in a feature-oriented way, possibly using feature modeling techniques to present the variability in a language family. Such an experience has already been carried out (see [9], [23], [24]) using our own programming language framework, which provides first-class support for language components (known as slices). In this work we want to show that, although a dedicated tool simplifies a componentized model of language development, a similar degree of code reuse can be reached by employing constructs and features that are already available in many modern GPLs.

For this work we chose to use Scala's trait implementation, since it completes Schärli's original prototype [16] with the additional guarantees of correctness that a static type system provides. Nevertheless, the approach should be portable to any language that supports trait-like composition and a library for parser combinators, such as Smalltalk, Ruby, and Groovy.

A simple state machine DSL: As our running example we will use a simple state machine language similar to the one from Tratt's paper [25] (see Listing 2). Like Tratt, we will also show how to extend the basic state machine DSL with guards and an action language; in our case, however, the extended DSL will be the result of composing traits from the basic state machine language and a separate action language.

Listing 2

State machine DSL grammar.

Listing 3

Door state machine for the grammar in Listing 2.

The rest of this paper is structured as follows. Section 2 describes the background. The paper is then divided into two parts. In Section 3 we will draw a parallel between grammars and traits, and we will show that it is possible to modularize a parser implementation in the same way we partition the set of rules of a formal grammar; in this section we will use Scala's traits and its parser combinator library. In Section 4 we will show how to implement the semantics of our DSLs using traits to decouple the semantic implementation of the interpreter from the abstract representation of language concepts. Section 5 expands the running example of state machines into a full case study, by extending the basic state machine language with support for an action language and guard expressions. Section 6 compares our solution to some related work, and in Section 7 we draw our conclusions.

Section snippets

Background

We will give a few details on the technical background that is required to understand the rest of this paper. We first briefly recap formal grammars, then we define the concept of trait as found in [16], [17], and finally we describe the peculiarities of Scala's trait implementation, with respect to the features we will use here.

Trait-based grammar modularization

A parser is usually defined as a single, self-contained entity, but reasoning by analogy with language grammar, a parser may be easily componentized. A grammar can be partitioned into a collection of interdependent sets of productions. Using parser combinators and traits, we provide a construction method to represent such sets and their dependencies as pluggable, shareable and reusable components. The resulting traits implement parser components that can be easily combined together, unplugged
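As a rough illustration of partitioning a grammar into traits, the following sketch uses a tiny hand-rolled combinator trait, `MiniParsers`, rather than the API of Scala's parser combinator library, so that it stays self-contained. The three productions (`state`, `transition`, `machine`) and all trait names are hypothetical and are not the grammar of Listing 2. Each trait holds one set of productions, and a self-type records that the `machine` production depends on nonterminals defined in the other two components.

```scala
// Minimal hand-rolled combinators over a pre-tokenized input (an assumption made
// for brevity; this is not the API of Scala's parser combinator library).
trait MiniParsers {
  // A parser consumes a prefix of the tokens and returns a value plus the rest.
  type Parser[A] = List[String] => Option[(A, List[String])]

  // Accepts exactly the given token.
  def token(expected: String): Parser[String] = {
    case head :: tail if head == expected => Some((head, tail))
    case _                                => None
  }

  // Accepts any non-empty token made of letters only.
  def ident: Parser[String] = {
    case head :: tail if head.nonEmpty && head.forall(_.isLetter) => Some((head, tail))
    case _                                                        => None
  }

  // Sequence: run p, then q on the remaining input.
  def seq[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] = in =>
    p(in).flatMap { case (a, rest1) =>
      q(rest1).map { case (b, rest2) => ((a, b), rest2) }
    }

  // Repetition: apply p zero or more times.
  def many[A](p: Parser[A]): Parser[List[A]] = in =>
    p(in) match {
      case Some((a, rest)) => many(p)(rest).map { case (as, rest2) => (a :: as, rest2) }
      case None            => Some((List.empty[A], in))
    }

  // Transform the result of a successful parse.
  def mapped[A, B](p: Parser[A])(f: A => B): Parser[B] = in =>
    p(in).map { case (a, rest) => (f(a), rest) }
}

// Component 1: state ::= "state" ident
trait StateSyntax extends MiniParsers {
  def state: Parser[String] =
    mapped(seq(token("state"), ident)) { case (_, name) => name }
}

// Component 2: transition ::= ident "->" ident
trait TransitionSyntax extends MiniParsers {
  def transition: Parser[(String, String)] =
    mapped(seq(seq(ident, token("->")), ident)) { case ((from, _), to) => (from, to) }
}

// Component 3: machine ::= state* transition*
// The self-type declares the dependency on the other two sets of productions.
trait MachineSyntax extends MiniParsers { self: StateSyntax with TransitionSyntax =>
  def machine: Parser[(List[String], List[(String, String)])] =
    seq(many(state), many(transition))
}

// The full parser is obtained by mixing the components together.
object StateMachineParser extends StateSyntax with TransitionSyntax with MachineSyntax

object ParserDemo {
  def main(args: Array[String]): Unit = {
    val tokens = "state opened state closed opened -> closed".split(" ").toList
    println(StateMachineParser.machine(tokens))
  }
}
```

Because each component is an ordinary trait, it could live in its own compilation unit, and a different language could reuse `StateSyntax` alone by mixing it with other productions.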

Trait-based semantics composition

In a typical interpreter or compiler implementation, the concrete syntax of a language is mapped onto an abstract representation, the abstract syntax tree (AST). In a functional programming language, we would usually define an eval(AstNode) function that would pattern match on the type of these nodes. In Scala we could write: This solution has the limitation of centralizing the implementation of the semantics of our interpreter in the code of such a function, thus making the interpreter less modular and
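The centralized eval function described above might look like `CentralEval` in the sketch below; the node types `Num`, `Add` and `Neg` are hypothetical. For contrast, the second half shows one possible trait-based alternative based on stackable traits, where each construct contributes its own case and delegates everything else to `super.eval`. This is an illustrative assumption and not necessarily the construction developed in the paper.

```scala
// Hypothetical AST for a small arithmetic language.
trait AstNode
final case class Num(value: Int) extends AstNode
final case class Add(left: AstNode, right: AstNode) extends AstNode
final case class Neg(operand: AstNode) extends AstNode

// Centralized style: every construct is handled in a single function, so adding
// a construct means editing this function.
object CentralEval {
  def eval(node: AstNode): Int = node match {
    case Num(v)    => v
    case Add(l, r) => eval(l) + eval(r)
    case Neg(e)    => -eval(e)
    case other     => sys.error(s"no semantics for $other")
  }
}

// Trait-based alternative: a base trait with a failing default...
trait Eval {
  def eval(node: AstNode): Int = sys.error(s"no semantics for $node")
}

// ...and one trait per construct. Each handles its own node and passes
// anything else along the linearization chain through super.eval.
trait NumEval extends Eval {
  override def eval(node: AstNode): Int = node match {
    case Num(v) => v
    case other  => super.eval(other)
  }
}

trait AddEval extends Eval {
  override def eval(node: AstNode): Int = node match {
    case Add(l, r) => eval(l) + eval(r) // recursive calls use the fully composed eval
    case other     => super.eval(other)
  }
}

trait NegEval extends Eval {
  override def eval(node: AstNode): Int = node match {
    case Neg(e) => -eval(e)
    case other  => super.eval(other)
  }
}

// The interpreter is assembled by choosing which constructs to mix in.
object ModularEval extends NumEval with AddEval with NegEval

object EvalDemo {
  def main(args: Array[String]): Unit = {
    val program = Add(Num(1), Neg(Num(3)))
    println(CentralEval.eval(program))
    println(ModularEval.eval(program))
  }
}
```

In the stacked version, a language without negation is simply `NumEval with AddEval`; evaluating a `Neg` node then falls through to the failing default in `Eval`.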

Case study

The main point of a DSL is to describe the solution of a domain problem concisely, and the purpose of componentizing a language implementation is to reuse part of its features in different language implementations. This is particularly useful when put in the perspective of evolving a DSL implementation. In the previous sections we introduced the state machine DSL as our running example. In Section 3 we showed that, given a partition over the grammar of a language, it is possible to componentize

Evaluation and comparison with related work

Many component-based language development frameworks have been proposed over the years (e.g., [11], [33], [34], [35]). These frameworks emphasize the separation of the concepts of a language as pluggable and composable units, but do not rely on a particular host language; rather, they provide a programmable platform to implement external DSLs; some of them even provide IDEs and generate IDEs for the implemented languages. For this work we took inspiration from Neverlang [15], [20], [21], where

Conclusions

DSL development is an aspect that modern GPLs have been emphasizing more and more. In this work, we exploited well-known patterns, techniques and constructs to implement external DSLs with a high degree of flexibility and modularity. The final objective is being able to implement DSLs by combining components together, maximizing code reuse and minimizing duplication. The approach revolves around the use of traits both for the realization of the parser of the DSL and for the implementation of

References (42)

  • Moors A, Piessens F, Odersky M. Parser combinators in scala. CW report 491. Leuven, Belgium: Katholieke Universiteit...
  • Renggli L, Ducasse S, Gîrba T, Nierstrasz O. Practical dynamic grammars for dynamic languages. In: Proceedings of the...
  • E. Vacchi et al.

    Variability support in domain-specific language development

  • S. Erdweg et al.

    Language composition untangled

  • H. Krahn et al.

    MontiCore: a framework for compositional development of domain specific languages

    Int J Softw Tools Technol Transf

    (2010)
  • B.C.d.S. Oliveira et al.

    Feature-oriented programming with object algebras

  • P. Wadler

    Monads for functional programming

  • S. Liang et al.

    Monad Transformers and Modular Interpreters

  • N. Schärli et al.

    Traits: composable units of behaviour

  • S. Ducasse et al.

    Traits: a mechanism for fine-grained reuse

    ACM Trans Program Lang Syst

    (2006)
  • Zenger M, Odersky M. Independently extensible solutions to the expression problem. In: Proceedings of the 12th...