1 Introduction

A collective adaptive system (CAS), often also called ensemble, consists of collaborating entities that are able to adapt at runtime to dynamically changing open-ended environments and to new requirements [40, 45]. Often the entities of such a system have their own individual properties and objectives; interactions with other entities or with humans may lead to the emergence of unexpected phenomena. Many modern systems are collective and adaptive: smart systems such as smart cities and smart traffic, collective cyber-physical systems such as robot swarms and sensor networks, as well as socio-technical systems such as distributed energy systems.

Rigorous engineering of collective adaptive systems requires appropriate methods and tools that help guaranteeing that—in spite of dynamically changing environments—they live up to their intended purpose. This includes theories for designing and analysing collective adaptive systems, techniques for programming and operating such systems, rigorous methods for devising adaptation mechanisms, validation and verification techniques as well as approaches for ensuring security and trust and optimising performance.

Research projects In the last 10 years research on collective adaptive systems has been fostered by European and national research programs and projects. The notion of collective adaptive system was elaborated in a workshop on “Fundamentals of Collective Adaptive Systems”, organised by the European Commission in 2009 [45]; earlier in 2006–2008, the notion of ensemble was developed in a series of workshops of the Interlink project [71], funded by the Future and Emerging Technologies (FET) Unit of the European Commission. As a sequel, the FET unit launched two proactive initiatives. The action ICT-2009.8.5 “Self-Awareness in Autonomic Systems”Footnote 1 called for “new concepts, architectures, foundations and technologies” for computing and communication systems “that are able to optimise overall performance and resource usage in response to changing conditions, adapting to both context (such as user behaviour) and internal changes (such as topology).” The follow-up action ICT-2011.9.10 on “Fundamentals of Collective Adaptive Systems”Footnote 2 focused on socio-technical systems “that are constructed as a collective of heterogeneous components and that are tightly entangled with humans and social structures”. Altogether 12 projects were funded including ASCENS [73, 74] on “Autonomic Service- Component ENSembles”, SAPERE [75] on “Self-Aware PERvasive service Ecosystems”, ALLOW Ensembles [17] on large scale collective adaptive ensembles, QUANTICOL [14] on a quantitative approach to the design and management of collective and adaptive behaviours, and the coordination action FoCAS.Footnote 3

More recently, a number of National projects or initiatives have been funded in Europe, here we would like to mention just two that are very close to us.

The German Transregional Collaborative Research Centre 248 “Foundations of Perspicuous Software Systems”Footnote 4 aims at enabling comprehension in a cyber-physical world with the human in the loop. The starting consideration is that computer programs increasingly participate in actions and decisions that affect humans but that the understanding of how these applications interact and what is the cause of a specific automated decision is lagging far behind. These “smart systems” calculate and propagate outcomes of computations, but unfortunately do not provide explanations.

The Italian collaborative project IT-MattersFootnote 5 has the goal of developing and the experimenting with a novel methodology for the specification, implementation and validation of trustworthy smart systems based on formal methods. Three basic steps are envisaged: (i) providing and analysing system models to find design errors, (ii) moving from models to executable code by translation into domain-specific programming languages and, finally, (iii) monitoring run-time execution to detect anomalous behaviours and to support systems in taking context-dependent decisions autonomously.

ISoLA This special section on ‘’Rigorous Engineering of Collective Adaptive Systems” was inspired by three successful tracks [30, 44, 72] on rigorous engineering at the ISoLA conferences in 2014 [51], 2016 [52], and 2018 [53]. In 2014, the topic of the track was “Rigorous Engineering of Autonomic Ensembles” and the six papers of the track presented results of the ASCENS project [73]. The tracks of 2016 and 2018 were not dedicated to the outcome of a single project, but addressed more widely the theme of rigorous engineering collective adaptive systems. The 12 scientific contributions of 2016 and the 18 contributions of 2018 covered a broad range of topics such as formally modelling, analysing and programming collective adaptive systems as well as security, machine learning, and software support for collective adaptive systems.

Contents The seven papers of this special section are revised and extended versions of papers presented at the ISoLA 2018 track [30]. In this paper we give a very short account of the field of rigorously engineering collective adaptive systems by structuring the research about collective adaptive systems along three different lines:

  • systematic development of CAS,

  • methods and theories for modelling and analysing CAS,

  • techniques for programming and operating CAS.

The seven contributions of the special section are presented as part of this general overview.

2 Systematic development of CAS

Key features of collective adaptive systems are the so-called self*-properties, such as self-awareness and self-adaptation. Systematic development approaches address these features and support (some or most of) the classical phases of the software development life cycle such as requirements elicitation and specification, design, implementation, validation and verification, and deployment. In the following we distinguish goal-oriented methods with focus on the early development phases, methods based on feedback control loops, and automated synthesis.

Goal-oriented methods Well-known early goal-oriented engineering approaches are KAOS [28] and Tropos [15]. KAOS [28] distinguishes hard and soft goals, is formally based on (linear temporal) logic, and proposes activities for refining the goals and deriving operational requirements which serve as the basis for system design. Tropos [15] is a method for developing multi-agent systems. Its notion of agent is founded on the BDI (Belief, Desire, and Intention) architecture [62]. Key modelling concepts are actors, their dependencies as well as the goals, plans and capabilities. For KAOS and Tropos, the development process covers all phases from requirements to implementation; special focus is put on the identification of stakeholders and the specification of the environment in the early requirements phases. Neither KAOS nor Tropos offer direct support for adaptation, but in [24, 58] it is shown how to use KAOS for developing dynamic self-adaptive systems, while a Tropos extension, called Tropos4AS, is presented in [57] as a tool for engineering self-adaptive systems.

ADELFE [11] is a methodology for developing self-organising cooperative and thus collective multi-agent systems. Its main ingredients are the characterisation of the system environment and the identification and treatment of so-called non-cooperative situations. ADELFE 3.0 [55] proposes an iterative process that integrates modelling, programming and simulation techniques. The method supports the UML Unified Process and tailors it to the specificity of self-organising systems. In [56] ADELFE is combined with Tropos4AS to obtain a goal-oriented development method for self-organising multi-agent systems.

Autonomous Requirements Engineering [67] (ARE) is a goal-oriented method for systematically eliciting so-called autonomy requirements. These are the self*-objectives of the system which are derived by applying a model for generic autonomy requirements to any system goal and its environmental constraints. The ARE method was applied to unmanned space missions [68] as well as to ensembles such as a peer-to-peer cloud [67].

The paper in this special section “The SOTA approach to engineering collective adaptive systems” [4] by Dhaminda Abeywickrama, Nicola Bicocchi, Marco Mamei, and Franco Zambonelli presents SOTA, a goal-oriented requirements engineering and modelling method for describing the overall domain and the requirements of a collective adaptive system. SOTA focuses on modelling and analysing functional and non-functional requirements of self-adaptation, and enables the early verification of requirements, the identification of knowledge requirements for self-adaptation, and the identification of the most suitable architectural patterns for self-adaptation. A more detailed and formal counterpart of SOTA specifications is the general ensemble model GEM [41].

The invariant refinement method IRM [19] supports the transition from early high-level requirements—specified, e.g. in SOTA and ARE, see [4, 67]—to software architecture. IRM captures goals and requirements as invariants that describe desired system states over time. High-level invariants are iteratively decomposed into more specific sub-invariants up to the level that they can be operationalised by autonomous components and component ensembles.

Methods based on feedback control loops An early reference model for adaptive systems is the MAPE-K architecture introduced by IBM [42]. It comprises a control loop with the four phases Monitor, Analyse, Plan, Execute, all of which exploit the Knowledge of the system itself and its environment. The research roadmap for self-adaptive systems [29] suggests a life cycle based on MAPE-K and proposes the use of a process modelling language to describe the self-adaptation workflows and feedback control loops. The “generic life cycle for context-aware adaptive systems” [43] is also based on MAPE-K and addresses foreseen and unforeseen evolution of the environment. In [16] feedback loops are considered as first-class entities and are explored from a control engineering perspective. Similarly, in [64] a bio-cybernetic loop serves as basis for a three-layered software architecture for adaptive systems.

The ensemble development life cycle EDLC [39] deals with awareness and adaptation in a “runtime feedback control loop.” It is an agile process whose development phases are arranged in two “wheels”: at “design” time, the classical development phases—requirements engineering, modelling and programming, verification and validation—are iterated; at “runtime”, the entities of the ensemble iterate the “runtime feedback control loop” consisting of monitoring, awareness, and self-adaptation. The connection between design time and runtime is established by the deployment or hot update of the system and in the other direction by the runtime system providing feedback to the design cycle. The construction of collective adaptive systems using EDLC is supported by eight engineering principles [9]. EDLC has been used in the development of several ensemble systems covering swarm robots [74], peer-to-peer cloud [54], and e-mobility applications [18].

Automated synthesis Automated synthesis of programs is a lifely and important research area (for an overview see [36]). Collective adaptive systems are typically too complex to be automatically generated but some crucial parts can be synthesised or generated.

Autonomous systems are subject to faults caused by the interaction with the environment. The paper [33] proposes an algorithm for automatically synthesising runtime monitors for fault detection and recovery strategies for controller synthesis. The approach is currently under implementation in the BIP [7] framework.

The gap between simulated hardware and real hardware implementations is addressed in [70]. It proposes a novel architecture for realising collective adaptive systems on hardware devices for real-world scenarios. Hardware devices are equipped with self-descriptions and these self-descriptions are used to automatically find appropriate devices for the tasks required by the system or the users.

Modern energy systems comprise several local community energy systems [23] which produce, consume, and sell energy. In [23] community energy systems are modelled as so-called electronic institutions [60, 63] and two optimisation methods are proposed to automatically find appropriate policies for selling and consuming energy. The optimisation methods are based on genetic programming and reinforcement learning. The results show that the evolved policies clearly outperform the initially human-designed policy and enable the energy system to remain sustainable over time.

The paper in this special section “Learning-based coordination model for spontaneous self-composition of reliable services in a distributed system” [50] by Houssem Ben Mahfoudh, Giovanna Di Marzo Serugendo, Nabil Naja, and Nabil Abdennadher studies decentralised services which are built and composed on-demand and arise from the interaction of multiple sensors and devices. The bio-inspired SAPERE coordination model [75] is extended to a learning-based coordination model which accommodates spontaneous self-composition of services in fully distributed scenarios using multiple nodes. Through learning, agents progressively update their behaviours and initial results show that the collective adaptive system are able to converge towards reliable composition of services, in terms of both functionality and expected quality of services.

3 Methods and theories for modelling and analysing CAS

Process algebraic, logical, and model-based methods are used for modelling dynamically reconfigurable architectures and ensemble requirements.

As of now there exist a large number of instances of Process Description Languages (PDLs), but overall they share a common ground: they can be generally described as action-based formalisms relying on behavioural operators that support compositionality and abstraction, where the meaning of process terms is formally defined via a set of structural operational semantic rules that can be used to associate a state transition graph to each process term. Such state transition graphs are then used as the basis for model checking properties expressed with one of the many temporal logics that have been proposed. After the success of such logics in specifying and verifying qualitative properties of system, other richer logics and tools for considering also quantitative properties have been proposed, to perform also probabilistic, stochastic or statistical model checking: see, e.g. [46] and [47]. For an extensive overview of such logics and associated tools, we refer the reader to [6] and references therein.

In parallel with the above approach, Architecture description languages (ADLs) have been developed and used to model the architecture of software-intensive systems. With the term architecture we refer to the description of the components that comprise a system, the behavioural specifications of those components, and the patterns and mechanisms for their interactions. However, as pointed out in surveys on the main ADLs [27, 34, 59] their semantics is not always fully formal and early analysis of system architectures is often difficult if not impossible.

Below we describe some of the logic-based approaches that have been more concerned with the specification and verification of collective adaptive systems. Then we will consider (architectural) languages and their models.

Logic-based methods and analysis techniques Complex requirements of collective adaptive systems are specified concisely by modal logics that exploit dynamic, spatial and temporal concepts.

Spatial Temporal Logics have been introduced in order to specify spatial aspects of computation that are very important in systems distributed in physical space and thus in many collective adaptive systems.

These logics offer a topology-based approach to formal verification of spatial properties depending upon physical space based on a spatial logic, We extend the framework with a spatial surrounded operator, a propagation operator and with some collective operators.

A spatial logic (SLCS), stemming from the tradition of topological interpretations of modal logics, dating back to earlier logicians such as Tarski, where modalities describe neighbourhood is introduced in [25]. SLCS extends the classical framework with a spatial surrounded operator, a propagation operator and with some collective operators which are interpreted over arbitrary sets of points instead of individual points in space. A variant of SLCS, named SSTL, with two new spatial modalities: the somewhere operator and a novel bounded version of the topological surrounded operator is introduced in [26] and is used in [25] to detect problems in vehicle location data for city buses and, e.g. spot the undesirable phenomenon of “clumping” which occurs when there is not enough separation between subsequent buses serving the same route.

Properties of dynamically changing ensembles can be described in dynamic logics. In [61] a differential dynamic logic (dL) is proposed to lay the foundations for developing adaptive cyber-physical system models by rigorously specifying their behaviour: the logic is then accompanied by a continuous monitoring to safeguard the decisions of learning agents and guide run-time decisions. In [38] a dynamic logic is presented for specifying the global behaviour of systems by desired and forbidden interaction scenarios.

Languages and models Several formalisms for specialised component models and languages for collective adaptive systems have been proposed, in particular in the context of the ASCENS project [73]. One of the main objectives of these formalisms is the actual formation of ensembles of components forming a collective system with the idea that ensembles are formed according to the overall goals of the system and possibly to the environmental conditions.

The Service Component Ensemble Language (SCEL) [32] has been designed to deal with collective adaptation. It brings together programming abstractions to directly address aggregations (how different components interact to form ensembles and systems), behaviours (how components progress) and knowledge manipulation according to specific policies. These ingredients constitute the basis of so-called attribute-based communication, a novel paradigm that enables groups of partners to interact by considering the predicates over the (dynamic) values of the attributes they expose. SCEL is equipped with an operational semantics that permits verification of properties of systems. SCEL has also been the main source of inspiration for CARMA, a language recently defined to support specification and analysis of collective adaptive systems. CARMA [13, 48] combines the lessons learned from SCEL and the long tradition of stochastic process algebras [31].

The Helena approach [37] proposes a role-based method for modelling collaborations using a UML-like notation and is founded on a rigorous formal semantics. Helena focuses on the description of the behaviour of each role as well as on the behaviour on the ensemble level. In Helena, a component instance explicitly indicates which of the ensembles it belongs to.

The DEECo (Dependable Emergent Ensembles of Components) component model [21] aims at providing the relevant software engineering abstractions that ease the programmers’ tasks. A component in DEECo features an execution model based on the MAPE-K autonomic loop. Like SCEL, it consists of well-defined knowledge and processes that are executed periodically in a soft real-time manner. The component concept is complemented by the first-class ensemble concept. An ensemble stands as the only communication mechanism between DEECo components. In DEECo ensembles are specified explicitly and ensemble formation employs a “greedy” strategy, in that a component is a member of each of the ensemble instances, the membership condition of which it satisfies, and that at the same time, this condition is solely based on the component’s current knowledge.

Another formalism developed outside ASCENS is BIP [7]. This formalism aims at combining the approach based on PDLs and ADLs by relying on a logic language for modelling the interaction of components. Within the framework, components are obtained as the superposition of three layers that give name to the framework itself: Behaviour, Interaction between transitions of the behaviour in the form of a set of connectors, and Priority to describe the scheduling policies of interactions. The framework is designed to support incremental construction of systems starting from atomic components combined through a parameterised binary composition operator.

The two papers in this special section address reconfiguration modelling and formal modelling of scenario evolution.

The paper “The DReAM framework for dynamic reconfigurable architecture modelling: theory and applications” [49], Alessandro Maggi, Joseph Sifakis, and Rocco De Nicola introduces a new framework for modelling dynamic reconfigurable architectures. A system is understood as a dynamically changing set of typed components and a system architecture is characterised by a set of coordination constraints. The DReAM language is based on the Propositional Interaction Logic [12] of the BIP framework and extends it with operations for data exchange between components and by coordination terms for regulating interactions and reconfigurations among a set of components. Static architectures are modelled by Interaction Logic formulas. The main ingredient for modelling dynamic architectures is the new notion of ‘motif’ which is defined by a DReAM coordination term. The DReAM framework is implemented as a Java API together with an execution engine.

In “Adapting quality assurance to adaptive systems: the scenario coevolution paradigm” [35] by Thomas Gabor, Andreas Sedlmeier, Thomy Phan, Fabian Ritz, Marie Kiermeier, Lenz Belzner, Bernhard Kempter, Cornel Klein, Horst Sauer, Reiner Schmid, Jan Wieghardt, Marc Zeller, and Claudia Linnhoff-Popien take a more abstract view and present a formal framework for adaptation and testing of adaptive systems using scenarios; it also discusses how such a framework can be used for increasing the trustworthiness of complex adaptive systems. In particular, the work extends the system model and the notion of adaptation space of [41] of the ASCENS project [74] by abstract definitions of self-adaptation and scenarios. A main contribution is also the concept of scenario coevolution for making quality assurance self-adaptive in order to match the capabilities of the self-adaptive system-under-test.

4 Techniques for programming and operating CAS

In collective adaptive systems interactions are typically performed following an opportunistic pattern and might involve several selected partners among a huge number of alternatives. It is thus very costly or infeasible to rely on channels, addresses or component identities for selecting such partners. A number of alternative programming frameworks have been proposed with different intercommunication strategies and abstractions.

ECA programming Unlike traditional programming languages, Event-Condition-Action (ECA) programs consist of a small set of rules, each of which specifies an action that is performed when an event occurs and a specific condition holds [10]. This event-driven model allows for the specification of robust programs in which the static data model is separated from the dynamic part. Tools have been developed to support runtime execution and static/dynamic analysis of ECA programs to guarantee satisfaction of invariants [65, 66].

Attribute-based programming The “attribute-based” paradigm allows for specifying communicating systems by abstracting from channels and identities. Systems are described as sets of parallel components equipped with attributes encoding relevant features such as role, status, and position that can be modified at runtime. Communication links among components are dynamically established through predicates over their attributes. Attribute-based programming was first used in SCEL [32] and has been then refined in AbC [1, 2]. This more abstract calculus concentrates on primitives and interaction mechanisms that are crucial for dealing with CAS and relies on a notion of implicit, anonymous, and non-blocking multicasting. An AbC system is rendered as a set of parallel components, each equipped with a set of attributes, termed as attribute environment, and with a behaviour, termed as a process. The attribute values can be modified by internal actions and the behaviour of a component is parameterised to these values.

Aggregate programming A similar approach is adopted in the “aggregate programming” paradigm [8]: systems are coherent collections of interacting devices, whose coordination and composition are hidden to programmers. In aggregate programming, a program is specified over a region of the computational environment expressing e.g. network structure, continuous space and time. Programs manipulate data computed by the individual devices in that region, via local sensing and interactions with (spatial or network) neighbours. Several models and programming languages supporting encapsulation, modulation, and composition of services have been already proposed, e.g. the Field Calculus and the language Protelis [69].

The last part of this special session is exactly dedicated to setting up appropriate programming frameworks to support a programming style that would ease programming, evolution, management, and security of collective adaptive systems. It also addresses testing methods and fault handling techniques for self-organising adaptive and autonomic systems.

The paper “A distributed API for coordinating AbC programs” [3] by Yehia Abd Alrahman and Giulio Garbi concentrates on the interaction models for CAS where environmental conditions largely influence interactions. In particular, it considers a possible implementation of message exchange based on anonymous multicast communication as advocated in AbC. The paper describes an efficient and distributed coordination infrastructure for AbC and proves its correctness with respect to the formal operational semantics of ABC. The proposed approach guarantees that the individual components are infrastructure agnostic. Thus the code of a component does not specify how messages are routed in the infrastructure but rather what properties a target component must satisfy. The actual communication infrastructure is implemented through a Go API, named GoAt, and an Eclipse plugin that enables one to program in a high-level syntax which can be automatically used to generate matching Go code. The approach is illustrated through a non trivial case study implementing a distributed graph colouring algorithm. The performance of the tree-based coordination infrastructure is also evaluated against two alternative ones based on clusters and rings.

The paper “A language and framework for dynamic component ensembles in smart systems” [20] by Tomáš Bureš, Ilias Gerostathopoulos, Petr Hnetynka, František Plášil, Filip Krijt, Jiří Vinárek, and Jan Kofron presents the ”Trait-based COmponent Ensemble Language” TCOEL. This specification and architecture description language builds on the experience on using DEECo [21] within the same group and is based on the consideration that in complex real-life systems ensembles may overlap, be nested, and dynamically formed and dismantled in a distributed environment. TCOEL is built on top of Scala and supports the combination the ensemble specifications with existing libraries in Java and other JVM-based languages available on multiple platforms. Ensembles are constructed by integrating domain-independent concepts (such a number and type of the members of an ensemble) with domain-dependent features, so-called traits, similar to mixins in object-oriented languages. Examples for traits are spatial constraints, predictions of certain values, or statistical tests. For evaluating the approach it is shown how a RoboCup Rescue application can be specified and simulated in TCOEL.

The last paper of this section, ‘’Toward autonomically composable and context-dependent access control specification through ensembles” [5] by Rima Al Ali, Tomáš Bureš, Petr Hnetynka, Jan Matejek, František Plášil, and Jiří Vinárek is concerned with the possibility of extending the concept of ensemble and exploiting it to define dynamic access control rules to govern interactions in a system of evolving autonomic components. Autonomic component ensembles [22] are enhanced with autonomously composable and context-dependent access control rules that follow the dynamicity and context-dependence of the components of a system. These are the so-called (dynamic) security ensemble rules that describe the allowed interactions in the system and follow the system during its evolution. The rules are complemented with an approach to match ensembles to the current state of the system so as to resolve the binding of dynamic access control rules to current situations and to generate the set of actions permitted at a given moment given the components in the system and their context. Care is taken for deriving the state of components that are not controlled by the system (e.g. humans and 3rd party components) and thus are not directly observable. For the specification of the rules, an internal Scala-based DSL has been proposed, but for end-user usage, an external a user friendly DSL is also offered.

5 Concluding remarks

Collective adaptive systems pose many research challenges which range from engineering adaptivity to novel requirements for the classical topics of modelling, analysing, testing, safety, security, and performance. Adaptivity is probably the most challenging topic for further evolution of our current technologies. But adaptivity is not only concerned with technical challenges and solutions. Even more important are the questions of acceptance and trust as well as those concerned with liability and, in general, with law. More research will be needed to master the rigorous engineering of such collective adaptive systems, to make them reliable and trustworthy. The story continues ...., in the mean time, enjoy this special section of STTT.