Introduction

This research work presents a theoretical framework for implementing and applying AI models that interact with a designer to suggest design options. We use case studies to uncover the general properties of AI models that can be leveraged by CAD to support conceptual design. We focus on (1) understanding how and what the model learns and (2) what new forms of human–machine interaction AI can introduce into CAD.

Understanding how and what an AI model learns is important because the design community has demonstrated a growing interest in these techniques since the very first models for data generation were developed [1]. We believe that a better understanding of AI is crucial to exploit the full potential of these models for design applications.

Understanding how AI can be integrated with CAD software and enhance human–machine interaction in design is significant because it will facilitate the implementation of such techniques in practice and in architecture schools. Over the years, inventions in CAD have only become true innovations when they have been implemented in user-friendly Graphical User Interfaces (GUIs). For example, parametric design software became extremely popular when Grasshopper was released in 2007, even though similar software had been around for years [2]. The ease of use of the Grasshopper GUI allowed non-experts to familiarise themselves with the technology and apply it to ordinary design tasks. It also facilitated the spread of parametric finite element modelling and evolutionary optimisation in design professions.

AI has already been tested over a wide range of architectural and engineering applications. Many engineering studies have focused on analysis and optimisation, prioritising efficiency and computational speed over creativity and human–machine interaction. Conversely, in architecture, AI applications tend to focus on the conceptual design phase. For example, the group led by del Campo tested AI in a variety of ways, including text-to-image translation [3] and style transfer [4]. Bolojan and Vermisso [5] used AI to simulate a visual analogy process that translates forest images into interior views of the Sagrada Familia. However, these applications are currently limited to the production of visual outputs, and have not considered how AI can be used to develop more autonomous and participative design systems.

In this article, we hypothesise that AI models can support design applications at a deeper level. Specifically, AI can be trained in design and integrated into CAD software to support conceptual design and idea generation.

To test our hypothesis, we addressed the following three questions:

  • which AI models are suitable for design applications?

  • to what extent can AI models acquire design knowledge and synthesise design propositions?

  • how can AI exchange acquired knowledge with a designer to suggest new design possibilities?

Some of these questions were already addressed in the early years of AI in design research. The first applications were informed with evidence from the emerging field of cognitive psychology and were aligned with some of the objectives of design research. However, the current research into AI for design applications does not seem to have been developed in continuity with this background. Nevertheless, we propose that some of the results from this early season of experimentation can provide the basis of a theoretical framework for the future use of AI in design applications.

Our theoretical framework indicates three learning mechanisms that can be simulated by AI: expertise, playfulness and analogical reasoning. In design education, expertise is related to studying and analysing design precedents, while playfulness is linked to model making, and analogical reasoning pertains to finding inspiration in domains other than architecture, such as nature, art, music and literature.

We have used three case studies to investigate expertise, playfulness and analogical reasoning separately. Collectively, the three case studies respond to the aforementioned questions, each of them contributing to a part of the answer. The applications are based on structural design, because of our familiarity with the topic, but other applications could have been selected to test the hypothesis.

Furthermore, our applications demonstrate the ability of AI to learn from and synthesise data in two formats: 2D images and 3D models. We selected these formats because most AI models are developed to process visual information, and because a case study should be simple enough – but also representative enough – to be managed by the already existing AI techniques. However, the currently available models are also capable of learning from other data representations, such as natural language. These data representations may be used in the future to expand the capabilities of the models discussed in this article.

The article is structured as follows. In “The different meanings of AI in design”, we provide a historical overview of AI in design applications, and illustrate how this research topic developed over the years within the discipline as part of the broader conversation on CAD research. In “The three learning mechanisms”, we present our theoretical framework, which is based on the identification of three learning mechanisms that can be simulated by AI to acquire and manipulate design knowledge. In “Applications”, we demonstrate the potential of this theoretical framework through three applications. We conclude the article by comparing our work with current approaches and describing future directions for our research.

The different meanings of AI in design

Investigating why designers began studying AI – and understanding what issues they were trying to address by integrating AI within the design workflow – is essential to explain the origin of our research questions. Such questions were first formulated by pioneers in the field in the early 1990s, but have remained unanswered because of technological limitations.

This historical overview begins with the invention of the first CAD system [6]. It links many research fields that are closely related to design disciplines – such as CAD research, design research, performance-driven design – with more distant ones, including cognitive psychology and computer science.

The narrative of this introduction is based on the evolution of the concept of ‘intelligence’, i.e. the many meanings intelligence has taken on over the years, and how such understandings have informed the development of AI applications in the design field. It is only a brief overview that does not aim to be exhaustive but rather serves to position our work within the research field.

What is intelligence?

Legg and Hutter [7] analysed 70 definitions of intelligence and proposed a synthetic one: “intelligence measures an agent’s ability to achieve goals in a wide range of environments”. The notion of ‘agent’ implies that intelligence is an emergent property of any system – biological or artificial – capable of ‘achieving goals’ or solving problems. The last part of the definition concerns the universality of intelligence: an intelligent agent needs to demonstrate its ability to solve a large variety of problems.

Kurzweil [8], one of the founding fathers of AI applied to speech recognition, observed that humans define intelligence, and tasks requiring intelligence, in relation to what current AI models can or cannot do. For instance, playing chess at a professional level stopped being an indicator of intelligence once the Deep Blue chess computer defeated world champion Garry Kasparov. For other examples, see Kurzweil [8].

It is therefore possible to argue that the definition of intelligence by Legg and Hutter [7] is the result of a process of continuous refinement. It comprises tasks that current AI is not yet able to achieve, such as succeeding in a large variety of environments, and it would likely change again once AI accomplishes this goal. For this reason, we do not downplay the historical and current achievements of AI – in and beyond the field of design – even though some of these achievements would be reclassified as non-intelligent according to the definition described above.

Intelligence as the ability to interact

Sketchpad [6] is generally considered to be the first software to have a GUI, as well as the first modern CAD system. The ability of Sketchpad to solve geometric problems in an iterative fashion nourished the hopes of such pioneers of CAD research as Coons, who ascribed certain features, like intelligent behaviour, to Sketchpad. In an interview by John Fitch, Coons described Sketchpad as an artificial ‘human assistant’ that ‘communicates’ with a designer and ‘understands’ drawings [9].

It is worth noting that Coons’ definitions should be considered in the context of the technological developments of the 1960s, when human–machine interaction was limited to the preparation of punch cards. Real-time interaction made Sketchpad closer to a human assistant – and therefore apparently intelligent – than any other computational system available at that time.

Achieving a quasi-human level of interaction with CAD systems was also the main objective of Negroponte’s ‘architecture machine’ [10]. The architecture machine was a concept for a CAD system that never actually progressed beyond the prototyping stage. Despite this, its development allowed Negroponte to push the concept of human–machine interaction beyond the mere ability of CAD to communicate with the designer via a GUI. He observed that CAD systems had to acquire design knowledge through a “full range of sensors and effectors” to become a true design assistant. In other words, they had to learn how to design rather than simply compute a set of instructions provided by the designer.

The work of Negroponte anticipated the efforts made in the 1990s to integrate such AI techniques as Knowledge-Based and Case-Based Reasoning systems with CAD software. These techniques became popular in the 1980s through applications for medical diagnosis, which led to the first commercial products based on AI technology.

Intelligence as the ability to retrieve knowledge

In the 1980s, many researchers proposed integrating AI technology with CAD software [11]. The kind of software that was developed using this approach was named ‘Intelligent CAD’ (I-CAD) [12].

I-CAD software was meant to perform as a design expert, and it therefore featured procedures for design knowledge storage, retrieval, and manipulation [13,14,15]. The problem of developing I-CAD led to three fundamental questions:

  • What kind of design knowledge should be represented, and how can it be represented?

  • How should design knowledge be stored to be easily retrieved for future use?

  • What kind of mechanism should a CAD system be endowed with to assist designers in synthesising new knowledge?

Researchers have proposed many strategies to represent design knowledge. Gero [16] proposed categorising information about design precedents into three classes: function, behaviour, and structure. Function was defined as the purpose of a designed object, behaviour as what an object does, and structure as what an object is. These classes represent the general properties of abstract design objects – named ‘design prototypes’. Another proposed strategy involved storing information about specific design precedents. This information was encoded as classes of property-value pairs [17] or concept libraries [14, 15]. The knowledge base of these I-CAD systems was queried by means of either keywords or diagrams and sketches [18].
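
As an illustration of this style of knowledge representation, the following sketch encodes a precedent as a function–behaviour–structure record with property–value pairs and a simple keyword query. The class, field names and example entry are hypothetical and are not taken from the cited I-CAD systems.

```python
# Illustrative sketch only: a precedent encoded as function-behaviour-structure
# plus property-value pairs, with a naive keyword query over the records.
from dataclasses import dataclass, field


@dataclass
class DesignPrototype:
    name: str
    function: list[str]            # what the object is for
    behaviour: dict[str, str]      # what the object does
    structure: dict[str, str]      # what the object is (system, components, material)
    properties: dict[str, float] = field(default_factory=dict)  # property-value pairs


# Hypothetical precedent entry, used only for illustration.
pavilion = DesignPrototype(
    name="timber gridshell pavilion",
    function=["cover an open-air exhibition space"],
    behaviour={"load_path": "membrane action", "daylight": "diffuse through lattice"},
    structure={"system": "gridshell", "material": "timber laths"},
    properties={"span_m": 24.0, "rise_m": 6.5},
)


def query(precedents, keyword):
    # A much simplified analogue of keyword-based retrieval in I-CAD knowledge bases.
    return [p for p in precedents if keyword in p.function or keyword in p.structure.values()]


print([p.name for p in query([pavilion], "gridshell")])
```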

However, these I-CAD systems did not meet the initial expectations of researchers because their representation of design knowledge was oversimplified and excluded any experiential knowledge of the designers [19]. Furthermore, these I-CAD systems suffered from the same limitations as all early AI techniques: the categorisation of design knowledge was arbitrary and the query procedures were based on pre-defined rules. All this made I-CAD systems incapable of realising Negroponte’s vision of a more autonomous artificial design assistant.

Intelligence as the ability to learn

In the 1990s, CAD researchers also explored the application of a different AI technique: connectionist models. These models simulate the structure of the human brain through a set of processing units – called neurons – that are organised in layers. Information between layers flows through connections that represent synapses. The most recent AI applications are based on such models, which are also known as Artificial Neural Networks (ANNs).

Early applications

The early applications of ANNs in the design field were based on the connectionist models described by Rumelhart [20]. Coyne et al. [21] applied an ANN for the analysis and synthesis of architectural layout descriptions. The objective was to learn schemata of room types – such as ‘living room’ – from a dataset of house plans described by combinations of room features. After training, the model was able to generate hybrid room types that did not fall into any human-defined category. Silva and Bridges [22] used a similar ANN for the design of door entrances. Early applications of these models in architectural design also exploited alternative strategies for data representation, such as pixel grids. Coyne and Postmus [23] used this representation to train an ANN for the design of building footprints. Petrovic [24] carried out more ambitious research, proposing the integration of multiple ANNs, trained to perform different tasks, as ‘agents’ of a complete design system called a ‘design agency’.

Despite the potential of ANNs for the development of autonomous design assistants, these experiments did not develop into actual CAD systems. A major limitation was the low computing capabilities of the early ANNs, which progressively reduced interest in these techniques in and beyond the design fields.

For the sake of completeness, we should also mention that ANNs have instead been applied continuously in the engineering design field for the development of surrogate analysis models and optimisation solvers [25, 26]. We do not provide an extensive review of these applications here as they do not address the general problem of design knowledge acquisition and representation. However, we do discuss some of the most recent applications of ANNs for this sort of problem in “Current applications”, because of their similarity with the techniques described later on in this article.

Current applications

The use of ANNs in the design field has continued throughout the 2000s with applications for data clustering and visualisation [27]. However, a major leap in this field only occurred after 2012, when complex ANN architectures, specifically designed to process visual information – convolutional neural networks (CNNs) – first excelled in image classification [28] and then in image generation tasks [1]. A second breakthrough was achieved one year later with the development of reinforcement learning agents (see “Simulating playfulness with reinforcement learning”) that were able to play video games [29].

Many researchers started speculating about the possible implications of using these techniques in architectural design [30]. CNNs were successfully used for the classification of architecture pictures [31, 32] and the generation of architectural layouts [33], whereas reinforcement learning was tested to design optimal architectural layouts that satisfied a set of environmental requirements [34, 35].

The use of these techniques in structural design was mostly limited to performance-driven design. For instance, data generation models were used to solve topology optimisation problems [36,37,38,39], while reinforcement learning was proposed as a general tool for structural optimisation [40, 41].

Independently of the design field in which ANNs have been applied, we have observed that current research works focus more on developing task-specific design tools than on artificial design assistants that actively participate in the design process. It seems that the current use of AI in design research does not pursue the objectives set out by the pioneers in this field. This can be deduced from the overall lack of reference to the design process and the designer’s cognition in most publications on the topic. Our work aims to bridge the gap between the original motivations for applying AI in design and what the technology can do today. We believe that a major step in human–AI interaction can only be achieved in design disciplines if more general questions are addressed.

Intelligence as the ability to create

The aim of the early research in AI for design purposes was to produce functional CAD systems and develop models for future AI-supported creative design. This aim was explicitly stated in the book ‘Modelling Creativity and Knowledge-Based Creative Design’ [42]. In the introductory chapter, Gero classified design products into three categories: (1) routine designs; (2) innovative designs; and (3) creative designs. Gero defined these three categories in relation to the characteristics of the design/search space considered by the designer:

  • Routine designs are contained in bounded design spaces. All the design variables are known, and the design problem is one of instantiation.

  • Innovative designs can be found by extending the space of known solutions. The process involves making variations or adaptations of existing designs.

  • Creative designs have few obvious relationships with existing designs. The process requires reformulating the design space.

Rosenman and Gero [43] described four computational processes that could be implemented to generate innovative and creative designs: (1) combination; (2) mutation; (3) analogy; and (4) first principles. According to the authors, combination could be achieved by combining parts of different designs into a new whole; mutation involved modifying a single element of a design object; analogy required making associations between two different design domains; and design from first principles allowed the form of an object to be inferred directly from the requirements.

However, these processes were never fully implemented in I-CAD software. Notwithstanding this, as observed by Cross [44], they provided “useful explanatory models of creative design both within and outside the artificial intelligence community”, which means that these processes could also explain how human designers produce innovative and creative designs.

We consider the processes described by Gero as a starting point to further investigate the nature of the design process. However, the aim of such an investigation is not to produce AI systems that would be able to generate creative design: creativity cannot be considered an intrinsic property of design objects, as it is based on a human evaluation conditioned by culture and time [45]. Instead, we are interested in the possibility of using AI to simulate some of the processes that underlie the autonomous acquisition of design knowledge. This is a necessary condition to enable a design conversation between humans and AI, which may or may not lead to the development of a creative design idea. Following the work of Cross, we aim to “give the machine a sufficient degree of intelligent behaviour and a corresponding increase in participation in the design process” [46].

The three learning mechanisms

In this section, we describe three strategies that designers use to acquire design knowledge, which were partially inspired by the processes described by Gero (see “Intelligence as the ability to create”). Borrowing terminology used in cognitive psychology, we define such strategies as ‘learning mechanisms’. Our objective is to demystify how designers acquire design knowledge so that we can develop models that acquire knowledge in a similar – but clearly simplified – fashion.

We expect that an AI model that is able to acquire design knowledge by simulating these mechanisms would build a form of ‘design understanding’ that is substantially different from that of human designers. It is exactly such a difference that makes the prospect of a human-AI partnership so interesting.

Expertise

The first learning mechanism – expertise – concerns the acquisition of knowledge from design precedents.

Precedents are descriptions of existing design solutions that “suggest possible ways of doing things in design” [47]. These descriptions include not only technical information – such as material properties, components, and geometry – but also conceptual information, such as explanations for a particular design choice.

Learning from design precedents plays a pivotal role in design education. Hertzberger et al. [48] observed that the memorisation of design ideas from the study of precedents expands a designer’s frame of reference, which in turn supports decision making during the design process. Studying precedents may involve analysing and interpreting the multidimensional information embedded in a design object, but it often focuses on visual information. Cross [49] noted that design objects are per se a form of knowledge that is concrete, tangible and can be experienced through the senses. A way of learning from this ‘material culture’ is to reproduce – for instance, by drawing – the experienced object.

Unlike novices, expert designers possess a great deal of knowledge about precedents, which can readily be retrieved to develop ideas for a variety of design problems. Experts also seem to access this knowledge in a way that is completely different from novices: they can filter the information almost instantly and identify the design features that are relevant to the problem at hand [47].

Playfulness

Playfulness is a learning mechanism that plays a key role in the development of human intelligence [50]. The explorative nature of playfulness is recognised as the main reason why learning by playing is superior to learning by tuition: it can lead to more innovative behaviour [51].

Learning by playing is at the core of the kindergarten, an educational method that was developed by the German pedagogue Fröbel to stimulate the cognitive development of children. With the kindergarten method, children were encouraged to produce forms by assembling wooden objects – called Fröbel blocks – into spatial configurations. The role of the educator was passive: they only had to guide the children through the autonomous invention and discovery of forms [52].

The relevance of playfulness as a mechanism for the acquisition of design knowledge is best exemplified by Wright’s self-account of his experience with Fröbel blocks. In his autobiography [53], Wright explained how playing with Fröbel blocks allowed him to develop architectural composition skills. Rubin [54] found that Wright used to reproduce the formal patterns he learnt during his kindergarten years to articulate his architectural plans.

The kindergarten method has a correspondence with design education strategies based on model making. Dunn [55] observed that the exploration of spatial configurations while producing architectural models – in particular, ‘development models’ – can lead to the emergence of a creative idea. One of the reasons is that physical models lend themselves to endless manipulation, which favours revision and reinterpretation.

Analogical reasoning

Analogical reasoning is a learning mechanism that involves recognising similarities and transferring knowledge from a source domain to a target domain [56]. In architecture, the target domain is represented by design knowledge, while the source domain is comprised of knowledge from any field that is different from architectural design.

Some cognitive psychologists have observed that the ability to construct analogies is an important characteristic of human intelligence [57], as, apart from learning, it also supports recognition and classification [58]. For example, when humans do not fully understand/recognise a situation, they resort to analogies to acquire extra information and fill the gaps in their knowledge. Humans also use analogies in communication – for instance, in scientific writing – to explain complex concepts to an audience that has little to no technical background on the topic. However, the most important aspect of analogical reasoning is that it can empower creativity and scientific discovery [59, 60]. It is generally accepted that the more distant the target and source domains are, the higher the chances that the knowledge transfer process will lead to the generation of a creative idea.

The role of analogical reasoning in design learning and creativity has been extensively studied in design research [61, 62]. Design studies have revealed that the use of knowledge from various domains occurs naturally when a design brief is broad and generic [63]. Moreover, the use of analogical reasoning does not depend on expertise. Both experts and novices use this learning mechanism [64], although experts seem to recognise deeper – relational – similarities between source and target domains than novices and can therefore produce more useful analogies [65].

A notable example of analogical reasoning is the transfer of knowledge from biology to design, which is generally referred to as bio-informed design [66]. Some architects have used a bio-informed approach to develop innovative solutions to design problems. For example, Frei Otto drew inspiration from the self-organising processes of natural systems to develop form-finding methods for the design of large-span structures [67].

Analogies can also be made within the design domain. Le Corbusier developed the concept for the Unité d’Habitation by studying the organisation of steamships, therefore relating architecture to the industrial developments in new materials and machines.

A unified view of design knowledge acquisition

The three learning mechanisms are clearly not independent of one another. For instance, as observed by Lawson [47]: “some (precedents) are recent buildings, and some are historical buildings. Some may be from other objects such as the clinker-built hulls of boats and so on”, which suggests that expertise might involve transferring knowledge between two different design domains and – therefore – learning through analogical reasoning. Wright’s self-account also shows that playfulness made him aware of patterns – such as windmills – that he used to articulate his architectural plans. Frei Otto’s form-finding methods involved model making and thus learning through experimentation and playful exploration. These examples suggest that playfulness can also rely on analogical reasoning.

Analogical reasoning seems to link the other two mechanisms, which aligns with Hofstadter’s claim that analogy is at the “core of cognition” [57].

The distinction remains useful, for the purpose of this article, to describe three different ways of addressing AI-simulated design knowledge acquisition: (1) learning from a dataset; (2) learning by pure exploration; and (3) learning by exploration conditioned by a dataset. We describe these approaches in detail in “Applications”.

Requirements for AI-driven acquisition and communication of design knowledge

The following lists summarise a set of requirements for AI-driven acquisition and the communication of design knowledge that we identified from the analysis of the three learning mechanisms.

The requirements for knowledge acquisition are:

  • Autonomy: ability to acquire knowledge with little or no supervision. Datasets can be provided, but the model should rely on its own experience to extract relevant design features.

  • Memorisation and categorisation: ability to structure the acquired knowledge into meaningful categories. Categories do not need to correspond to those used by humans but should capture the correlations between similar chunks of knowledge.

In terms of knowledge communication, the requirements are:

  • Immediacy: communication with the AI model should be instantaneous. This requirement enables a form of conversation with the AI model that supports the natural flow of design ideas.

  • Interpretation of partial and ambiguous information: the reasoning capabilities of the AI model should be flexible enough to deal with different input representations.

  • Interactivity: the model should be able to provide design suggestions interactively. The obtained results may be partial rather than fully defined solutions so as not to unduly condition the designer’s reasoning process.

Applications

This section presents three applications – one for each learning mechanism – that were developed to test how AI models can inform conceptual design and idea generation, and therefore be used to develop more autonomous and participative design systems.

These applications were informed by the requirements described in “Requirements for AI-driven acquisition and communication of design knowledge”. For each application we:

  • introduce the AI model chosen to simulate the learning mechanism and explain why such a model is suitable for the task.

  • provide a cognitive interpretation of the chosen AI model.

  • illustrate how the AI model is trained on a design task and interfaced with CAD software.

We do not provide any technical details about the implementations because such details fall outside the scope of this article. Details on similar methods to those used for the first application are published in Mirra and Pugnale [68]. Two other papers on playfulness and analogical reasoning are currently under review and will soon be published.

Here, we use diagrams to describe and compare the functional components of the AI models and how they were interfaced with CAD software. These components include (1) trainable AI modules; (2) the input required for the training process; (3) the output produced by the AI models; (4) the input provided to the AI model at the inference time, that is, after completion of the training and once the model has been interfaced with CAD software.

Simulating expertise with generative models

Generative Models are a class of AI techniques that are used for data generation. They have received ever-increasing attention from the AI research community since the invention of Generative Adversarial Networks (GANs) [1], that is, models that can be trained to synthesise artificial, albeit extremely realistic, images of human faces and objects that have never existed [69, 70]. Apart from images, these models have been successfully applied to synthesise audio, 3D models and other data typologies.

GANs and other Generative Models, such as Variational Autoencoders (VAEs) [71], learn in an unsupervised fashion. They do not require any human input, apart from the provision of a dataset. The application presented in this section uses VAEs to (1) extract features from precedents and (2) recombine such features to generate design propositions that do not exist.

Acquiring knowledge by copying

The VAE architecture comprises two components: an encoder and a decoder. The encoder compresses input data into a low-dimensional representation, whereas the decoder attempts to reconstruct the input data from such a low-dimensional representation. The VAE is trained to perform this task on every dataset sample, which results in a set of low-dimensional representations that are smoothly distributed in a ‘latent space’. After training, this latent space can be sampled to produce data that resemble those that populate the dataset. The resemblance is the result of the preservation of some underlying features of the dataset in the newly generated data.
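
For readers unfamiliar with the architecture, a minimal, generic VAE can be sketched as below; this is not the implementation used in this article (which relies on an existing image-based model [74]), and the layer widths, latent dimension and loss weighting are assumptions.

```python
# Generic VAE sketch (PyTorch): encoder -> latent distribution -> decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VAE(nn.Module):
    def __init__(self, n_pixels=128 * 128, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_pixels, 512), nn.ReLU())
        self.to_mu = nn.Linear(512, latent_dim)
        self.to_logvar = nn.Linear(512, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, n_pixels), nn.Sigmoid()
        )

    def encode(self, x):
        h = self.enc(x)
        return self.to_mu(h), self.to_logvar(h)

    def reparameterise(self, mu, logvar):
        # Sample a latent vector while keeping the operation differentiable.
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterise(mu, logvar)
        return self.dec(z), mu, logvar


def vae_loss(x, x_rec, mu, logvar):
    # Reconstruction term + KL term that regularises the latent space so it can be sampled.
    rec = F.binary_cross_entropy(x_rec, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl
```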

Since VAEs acquire knowledge by reconstructing data – in this case, design precedents – we observed that they could simulate the strategy of knowledge acquisition described by Cross (see “Expertise”), i.e. learning through copying/reproducing. The reconstruction process follows the encoding of information, which has a psychological analogue: the process of constructing mental representations from perceptual stimuli [72]. The set of ‘mental representations’ constructed by a VAE is constituted by the low-dimensional representations that define the latent space.

In a previous work [68], we highlighted the similarity between the latent space constructed by a VAE and a design space constructed in conventional computational design applications, such as parametric design and optimisation. In this article, we extend the analogy by stating that the latent space can also be understood as a surrogate of the designer’s ‘conceptual space’, that is, the frame of reference that includes design knowledge but also a cultural background within which new ideas can be generated [45].

We realise that this analogy may sound inappropriate: AI does not possess a cultural background as intended in social sciences. However, if we extend the definition of culture provided by Hofstede [73] to artificial systems – i.e. “the collective programming of the mind that distinguishes the members of one group or category of people from another” – it can be argued that the group of AI models, collectively, does in fact exhibit a certain form of culture.

Learning from structural design precedents

In this section, we describe an application of a VAE used to learn from a dataset of 40 shell and spatial structures designed by influential architects and engineers. The dataset comprised a variety of structural typologies, including masonry and RC shells, gridshells and membranes. We modelled each design sample in 3D and converted it into a 128 × 128 pixel 2D depth map. This conversion process was necessary to train an existing VAE implementation that learns from images [74]. A data augmentation strategy was used to increase the number of samples from 40 to 4000. This strategy involves performing a set of rigid transformations of the depth maps – rotation and translation – and it allows the VAE to recognise geometric features in the samples that are independent of position and orientation. Figure 1 illustrates the trainable VAE components and the source of the input data.

Fig. 1 Diagram of the VAE architecture used for the application on expertise
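
The augmentation step can be sketched as follows. The 90-degree rotations, wrap-around translations and step sizes below are simplifying assumptions, which is why the resulting count differs from the 4000 samples used in the actual application.

```python
# Sketch of rigid-transformation augmentation of 128 x 128 depth maps.
import numpy as np


def augment(depth_map, n_rotations=4, shifts=(-8, 0, 8)):
    samples = []
    for k in range(n_rotations):                  # 90-degree rotations
        rotated = np.rot90(depth_map, k)
        for dx in shifts:                         # horizontal translations
            for dy in shifts:                     # vertical translations
                samples.append(np.roll(np.roll(rotated, dx, axis=1), dy, axis=0))
    return samples


base = np.zeros((128, 128), dtype=np.float32)     # stand-in for one depth map
dataset = [m for sample in [base] * 40 for m in augment(sample)]
print(len(dataset))  # 40 samples x 36 transforms = 1440 with these (assumed) settings
```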

Figure 2 shows the result of the qualitative test that was performed to evaluate the characteristics of the features learnt by the model and the quality of the synthesised data. We selected four design samples from the dataset, fed them to the VAE encoder and obtained their low-dimensional representations. We then linearly interpolated these low-dimensional representations and fed them to the VAE decoder to synthesise new forms. Our results demonstrate that, even with the provision of only 40 3D models, the AI model was able to learn to extract and recombine geometric patterns in a meaningful way. In this case, the model was able to synthesise hybrid designs that blended the main features – such as openings, support edges, and curvature inversions – that characterised the selected designs.

Fig. 2 Evaluation of the features learnt by the VAE through the linear interpolation of four dataset samples
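
In code, the interpolation test can be sketched as below, assuming a trained VAE exposing an encode method and a decoder dec as in the earlier sketch (both names are ours); the article interpolates between four samples, shown here pairwise for brevity.

```python
# Sketch: blend two precedents by interpolating their latent representations.
import torch


def interpolate(vae, sample_a, sample_b, steps=5):
    mu_a, _ = vae.encode(sample_a)
    mu_b, _ = vae.encode(sample_b)
    forms = []
    for t in torch.linspace(0.0, 1.0, steps):
        z = (1 - t) * mu_a + t * mu_b       # linear interpolation in the latent space
        forms.append(vae.dec(z))             # decode the blended representation into a depth map
    return forms
```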

Exploring the VAE conceptual space

We developed an AI-CAD interface to explore the potential uses of the model in design applications. Our interface comprised two nodes: (1) a server on which the AI model was run, and (2) a client that sent information to the server and waited for the output. The client exploited the GUI of existing CAD software – Rhinoceros 3D – and relied on a visual scripting environment – Grasshopper – to manage the I/O communication with the AI model.

The interface allowed the conceptual space of the AI model to be explored through the exchange of visual information. The designer sketches 2D footprints on a canvas in Rhinoceros and waits for the model to transform the sketch into a 3D model that can be visualised on the same canvas. Therefore, unlike conventional computational design approaches, the exploration of design options involves neither the manipulation of variables nor the analysis of design performance.
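
One possible server-side sketch of such an interface is shown below: a client (e.g. a Grasshopper component) sends a rasterised 2D footprint and receives the generated depth map back. The transport (raw TCP), the fixed 128 × 128 float32 payload and the function names are illustrative assumptions, not the implementation used in this work.

```python
# Sketch of a minimal AI-CAD bridge: receive a footprint image, return a depth map.
import socket

import numpy as np

N_BYTES = 128 * 128 * 4  # one 128 x 128 float32 image


def run_server(model_fn, host="127.0.0.1", port=5005):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((host, port))
        srv.listen(1)
        while True:
            conn, _ = srv.accept()
            with conn:
                buf = b""
                while len(buf) < N_BYTES:                  # read one footprint image
                    chunk = conn.recv(N_BYTES - len(buf))
                    if not chunk:
                        break
                    buf += chunk
                footprint = np.frombuffer(buf, dtype=np.float32).reshape(128, 128)
                depth_map = model_fn(footprint)            # query the trained model
                conn.sendall(depth_map.astype(np.float32).tobytes())
```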

Figure 3 (on the right-hand side) shows a set of design propositions developed by the model, starting from 2D footprints sketched in Rhinoceros. Each row shows multiple solutions generated from the same input by recursively feeding the model with the forms generated over consecutive iterations. We define this process as ‘interpretation’: the model first produces a 3D model by reproducing coarse design features extracted from the dataset, and then progressively refines the form into a plausible design proposition.

Fig. 3 Application of a VAE-CAD interface to synthesise 3D forms from 2D footprints

Simulating playfulness with reinforcement learning

Reinforcement Learning (RL) is one of the most popular areas of research in machine learning. It is the technique that allowed AI to master board games, such as ‘Go’, at a higher-than-human level [75]. It has also been used successfully in video games [29] and robotic control applications [76].

Here, we provide an interpretation of design as an RL problem. In particular, we focus on the possibility of modelling drawing as a time-dependent decision process and training an AI model to produce design options that satisfy certain requirements.

A playful exploration of design possibilities through drawing

RL applications are formalised as Markov Decision Processes (MDPs), which are abstractions used to represent sequential decision-making problems where an agent interacts with an environment [77]. In plain terms, the goal of an MDP is to find a function that maps states observed from the environment into actions that maximise the agent’s reward. This function can be derived deterministically or learnt – i.e. approximated – by an artificial neural network. For an overview of the available techniques, see Sutton and Barto [78].

An RL model consists of a network – also known as a ‘policy’ – that predicts the actions an agent must perform in an environment. In the case of chess, the agent is the player, while the environment consists of the chessboard on which actions – or game moves – are performed. The network is trained to predict actions that maximise the agent’s future reward, which in this case corresponds to the quality of a chess move in relation to the expectation of winning the game. Unlike other techniques, RL models are usually trained without any dataset. They learn autonomously by trying different actions, observing the reward of each action, and ‘reinforcing’ – i.e. performing more frequently – those actions that led to a higher reward. The agent must: (1) interpret the state of the environment – e.g., a specific configuration of the chessboard – at each time step of the decision process; and (2) balance the exploration of the environment with the exploitation of the acquired knowledge.
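
The learning loop can be illustrated with a tabular Q-learning sketch. The article’s applications use a Deep Q-Network, where the table below is replaced by a neural network operating on images; the environment interface (reset/step/actions) is an assumed convention in the style of common RL toolkits.

```python
# Sketch of the reinforcement-learning loop: explore, observe rewards, reinforce good actions.
import random
from collections import defaultdict


def train(env, n_episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)   # Q(state, action) estimates (states must be hashable in this tabular version)
    for _ in range(n_episodes):
        state, done = env.reset(), False
        while not done:
            # Balance exploration (random action) with exploitation (best known action).
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # 'Reinforce': move the estimate towards the observed reward plus the
            # discounted value of the best follow-up action.
            best_next = max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q
```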

The learnt policy defines the behaviour of the agent in the time domain, that is, how it will act/move within the environment in consecutive time steps. Since the design process is also dynamic, we propose modelling it through the MDP formalism. We define this implementation as the Markov Decision Design Process (MDDP).

In an MDDP, the agent is the designer who learns a design strategy to maximise a reward that can either be the achievement of a design goal or the pleasure of engaging in the design activity itself. The environment of an MDDP can be interpreted in many different ways: it can be (1) a conceptual space in which the designer performs design moves [79]; (2) a working environment in which the designer cooperates with other designers – i.e. with other agents – and/or negotiates with the client; or (3) a drawing board or spatial grid in which actions correspond to placing either strokes on paper or blocks in space to form a drawing or a spatial configuration.

In the following application, we implement the third interpretation of an MDDP, which is based on an interaction with a drawing board. We assume that the agent does not have any prior design knowledge and thus cannot rely on an existing policy. Like a child, the agent engages in the ‘playful’ exploration of an environment and develops its own policy from scratch.

Learning to design an arch/frame

We describe an application of an MDDP to train an AI agent to solve a simple design task. The task consisted in designing a 2D frame structure made of welded steel pipes and connected to the ground by two support nodes. The objective was to develop feasible design options for a variety of boundary conditions. The feasibility of a design option depended on the satisfaction of two requirements: (1) avoiding collisions of the structure with differently sized obstacles, which were randomly placed in the environment; and (2) minimising the displacement of the structure under vertical loads.

The agent was trained using a custom-made implementation of Deep Q-Network (DQN) [29]. Figure 4 shows the trainable components of the model and the input provided for the training process. In this case, the input does not consist of a dataset but of an environment with which the agent interacts. The environment includes: (1) a drawing board, which is represented by a 32 × 32 pixel greyscale image; (2) a set of actions that control the placement of single pixels onto the drawing board; (3) an FEM solver that converts the placed pixels into nodes of a 2D structural frame, assigns mechanical properties and loading conditions, and computes the maximum displacement.

Fig. 4 Diagram of the functional components of the RL model used for the second application on playfulness

The agent observes, at each step, the current state of the drawing board – which at time zero includes the obstacle and the support nodes – and decides where to place the next structural node. The agent receives a negative reward if it positions a node within the obstacle boundaries and a positive reward if it reaches the second support point within a maximum of 300 time-steps, at which point the agent will also receive an additional reward based on the results of the structural analysis.
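
A simplified sketch of this environment is given below. The relative-move action encoding, the reward magnitudes and the stubbed FEM callback are assumptions that stand in for the custom implementation described above; in the actual application the state image is fed to a Deep Q-Network.

```python
# Sketch of a drawing-board environment: place nodes, avoid the obstacle, reach the second support.
import numpy as np


class FrameBoard:
    def __init__(self, obstacle_mask, start, goal, fem_reward, max_steps=300):
        self.obstacle = obstacle_mask            # boolean 32 x 32 mask
        self.start, self.goal = start, goal      # support node coordinates (row, col)
        self.fem_reward = fem_reward             # callable: list of nodes -> structural reward
        self.max_steps = max_steps
        self.actions = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]

    def reset(self):
        self.nodes = [self.start]
        self.steps = 0
        return self._state()

    def _state(self):
        board = np.zeros((32, 32), dtype=np.float32)       # greyscale drawing board
        board[self.obstacle] = 0.5
        for r, c in self.nodes:
            board[r, c] = 1.0
        return board

    def step(self, action):
        self.steps += 1
        r, c = self.nodes[-1]
        r = int(np.clip(r + action[0], 0, 31))
        c = int(np.clip(c + action[1], 0, 31))
        self.nodes.append((r, c))
        if self.obstacle[r, c]:
            return self._state(), -1.0, True               # penalty: node inside the obstacle
        if (r, c) == self.goal:
            # bonus for closing the frame, plus a structural-analysis reward
            return self._state(), 1.0 + self.fem_reward(self.nodes), True
        return self._state(), 0.0, self.steps >= self.max_steps
```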

We evaluated the trained model by analysing its ability to generate complete structures from new boundary conditions. Figure 5 shows the test result, which demonstrated that the agent was able to produce a complete structure in about 90% of the cases and that most of the designed structures presented a negligible displacement.

Fig. 5 Analysis of the predictive capabilities of the trained RL agent through the synthesis and evaluation of multiple structural frames from different boundary conditions

Interacting with the AI agent through drawing

We developed and tested an interface to allow humans to communicate with the trained AI agent. The test required the agent to interpret and complete a partially drawn structure in a meaningful way.

Figure 6 shows the results of the test. We observed that the agent was able to complete a structurally sound frame most of the time (top row) but failed to produce performative frames when the input was significantly different from the paths produced during the training process (bottom row).

Fig. 6 AI-human interface that allows a partially drawn design option to be iteratively completed

The most relevant outcome of this test was the confirmation that the dynamic nature of the agent-environment interactions at training time also characterised the agent-human interaction at test time. Overall, we considered this form of interaction more powerful than the static interaction mode supported by VAEs (see “Simulating expertise with generative models”).

Simulating analogical reasoning with reinforced adversarial learning

Reinforced adversarial learning is an AI technique that was first introduced by Ganin et al. [80], and then refined by Mellor et al. [81]. The implementation of the technique – named SPIRAL – involves combining reinforcement learning with generative models to train AI agents in image synthesis. Unlike conventional data generation models, SPIRAL does not generate images through the recombination of features in the pixel space. The model instead learns to perform actions in a drawing software.

Here, we describe our implementation of SPIRAL, in which the agent interacts with a 3D modelling environment – and therefore synthesises 3D models instead of images – to design artificial replacements for natural habitats. We used a dataset of biological forms to guide the agent in the extraction of formal features that were relevant for the task.

AI-driven visual abstraction

Ganin et al. [80] tested SPIRAL for two kinds of application: (1) inverse graphics and (2) non-photorealistic rendering. The first type of application concerns, for instance, finding a set of commands to reconstruct an image within drawing software, whereas the second involves producing an artistic representation of a target image that preserves the main features of such an image. SPIRAL is based on reinforcement learning, and therefore, in a similar way to the application described in “Simulating playfulness with reinforcement learning”, it models an agent that explores different drawing actions to maximise a reward. In SPIRAL, this reward is the ‘similarity between images produced by the agent and target images’, which can include human faces, handwritten digits or 2D projections of 3D sceneries. However, the reward is not provided to the agent through a deterministic function, like the FEM solver described in “Simulating playfulness with reinforcement learning”, but is learnt by the model together with the policy. Ganin et al. [80] included an additional network in the SPIRAL model – a GAN discriminator [1] – to learn the similarity function. The discriminator is trained to produce a similarity score that differentiates between images generated by the agent and the images that populate the dataset. The similarity score informs the agent about how good the images that it has produced are.
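
The learned-reward mechanism can be condensed as in the sketch below. It is our own simplification (a small fully connected discriminator over a flattened rendering), not the architecture of [80], and the names are illustrative.

```python
# Sketch: a discriminator scores agent renderings against dataset samples; the score is the reward.
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    def __init__(self, n_voxels=32 * 32 * 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_voxels, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, x):
        return self.net(x.flatten(1))          # higher score = more 'dataset-like'


def similarity_reward(discriminator, rendered_form):
    # The agent's reward: how convincingly its rendering resembles the dataset
    # (the discriminator itself is trained adversarially, alongside the policy).
    with torch.no_grad():
        return discriminator(rendered_form.unsqueeze(0)).item()
```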

Because of the features described above, SPIRAL can be classified as a generative model. However, SPIRAL can do more than just generate realistic images: it can also produce ‘visual abstractions’ of images, that is, simplified representations that still convey the figurative meaning [82]. The simplification is enabled by the possibility of defining which and how many drawing actions the agent can perform, which effectively constrains its representational capabilities.

We exploited the process of visual abstraction to make the SPIRAL agent synthesise simplified representations of complex natural forms. Through the specification of design constraints, this process transfers knowledge from biology to design and can therefore be regarded as a form of analogical reasoning.

In order to relate the reinforcement-learning application presented here with the MDDP formalism described in “A playful exploration of design possibilities through drawing”, we define the problem as a ‘conditional MDDP’. The agent still engages in the playful exploration of design possibilities, but its behaviour is conditioned by knowledge acquired from a different domain, which, in this case, is biology.

Learning to design simplified tree forms

We tested the ability of our SPIRAL implementation to acquire knowledge from a dataset of tree forms and to synthesise visual abstractions of such forms. The forms synthesised by the agent can be used to inform the design of human-made replacements for deforested areas that are easy to build and scalable. For an overview of the challenges related to this sort of design problem, see Hannan et al. [83].

We simulated analogical reasoning by specifying the following design constraints. We limited the agent’s action space to the placement of lines in the 3D modelling environment. These lines represented wooden poles through which a digital design option, produced by the agent, could be materialised in the real world. Furthermore, we limited the number of actions to 10. This set an upper bound for the number of lines the agent could use to synthesise a 3D form.

We developed a simple 3D modelling environment consisting of a 32 × 32 × 32 spatial grid. The agent could move a cursor within the grid boundaries and place lines that were rendered as voxels. Figure 7 illustrates the source of data used for this application, including the dataset and environment, and the two trainable components of the SPIRAL architecture.

Fig. 7 Diagram of the functional components of the SPIRAL implementation used for the third application on analogical reasoning
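
The constrained modelling environment can be sketched as follows; the nearest-voxel line rasterisation, the coordinate conventions and the usage example are our own simplifications.

```python
# Sketch of a 32 x 32 x 32 voxel canvas in which the agent may place at most ten line segments.
import numpy as np


class VoxelCanvas:
    def __init__(self, size=32, max_lines=10):
        self.grid = np.zeros((size, size, size), dtype=np.uint8)
        self.size, self.max_lines, self.n_lines = size, max_lines, 0

    def place_line(self, start, end, samples=64):
        if self.n_lines >= self.max_lines:                 # action budget exhausted
            return False
        start, end = np.asarray(start, float), np.asarray(end, float)
        for t in np.linspace(0.0, 1.0, samples):           # crude nearest-voxel rasterisation
            x, y, z = np.clip(np.round(start + t * (end - start)), 0, self.size - 1).astype(int)
            self.grid[x, y, z] = 1
        self.n_lines += 1
        return True


canvas = VoxelCanvas()
canvas.place_line((16, 16, 0), (16, 16, 20))   # a 'trunk'
canvas.place_line((16, 16, 20), (26, 16, 30))  # a 'branch'
print(int(canvas.grid.sum()), canvas.n_lines)
```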

We tested our implementation by analysing the forms produced by the agent at the last iterations of the training process. Figure 8 shows a sample of such forms. Our analysis involved (1) visually inspecting the forms generated by the agent to assess their similarity with the tree form dataset, and (2) applying a set of synthetic measures to extract geometric features and evaluate the suitability of the synthesised forms for the design of natural habitat replacements. We found that the agent was able to successfully reproduce the main features of the tree forms, such as the branching patterns and trunk-canopy articulations, even with a budget of only 10 modelling actions.

Fig. 8 Evaluation of the capabilities of the trained SPIRAL agent to synthesise visual abstractions of tree forms

Interacting with the AI agent through 3D modelling

At the current stage of development, our application does not feature a human-AI interface. However, we imagine that an interface like the one described in “Interacting with the AI agent through drawing” could easily be extended to 3D modelling and implemented to interact with the SPIRAL agent. Such an interface would allow the designer to define a partial 3D form and seek design suggestions from the agent on how to further develop their design proposition.

Discussion

The three applications presented in this article have demonstrated how AI can be used to inform conceptual design. In continuity with the work of the CAD and I-CAD research pioneers, we have developed a theoretical framework for implementing AI models, which has allowed us to set out key requirements for design knowledge acquisition and communication.

Overall, our work contributes to the discussion on how emerging AI technologies can be applied to the design field. However, unlike most studies on the topic, our work is not limited to the investigation of a particular technique to address a specific design problem. Instead, it shows the potential of a variety of techniques and how such techniques could enable new forms of human-AI design partnership.

Comparison with existing studies

Table 1 highlights the chosen AI techniques in relation to the learning mechanism. It describes how similar techniques are currently applied in design applications and summarises the results of the applications presented in this article.

Table 1 Comparison of our results and the current uses of the chosen AI techniques in design practice

The currently available research works have not fully addressed the problem of human-AI interaction within existing CAD software. As stated in the introduction, we believe that investigating this problem is crucial to expand the range of applications of AI in the design field. Our contribution in this area consisted in testing different forms of design exploration via simple AI-CAD interfaces. We observed that the selected techniques enabled two main forms of exploration. The first one involves retrieving a complete design description from an ambiguous representation. This form of exploration was tested through the application on expertise, in which the AI model had to interpret a 2D footprint sketched in CAD software. The model was able to interpret the 2D footprint and return a 3D form that represented a possible instantiation of a design idea. The second one allows a partially defined design description to be completed, while maintaining the same level of representation abstraction. This kind of exploration was tested through applications based on reinforcement learning. The agent was able to interpret partially drawn frames and decide on the next drawing iterations.

It is worth mentioning that the designer can fully control both forms of exploration. For instance, in the first application, the designer could decide how many times a 2D footprint had to be interpreted by the AI model. This allowed less or more refined design suggestions to be visualised (Fig. 3). Similarly, in the second application, the designer could ask the AI agent not to complete the design but to continue it for a limited number of steps. These features enable a form of communication between designers and AI that is not deterministic but encourages exploration and revision of design ideas.

Future developments

Our results demonstrate that the chosen AI techniques are suitable for design applications and can extend the scope of computational design beyond analysis and optimisation. Nevertheless, we acknowledge that the techniques have several limitations, in terms of knowledge acquisition and design generation, which we shall address in future work.

First, the ability of AI to acquire design knowledge from representations that are only visual is questionable. As stated in “Expertise”, design precedents are multidimensional and include information that is conceptual, which can only be represented through symbols. We will integrate visual representation with symbols to expose the AI model to a variety of design attributes.

Second – with reference to an aspect that is related to data representation – our interfaces only enable the exchange of visual information with the AI model. Using symbols would allow humans to communicate with the agent at a deeper level. This could also lay the ground for interfaces based on other AI technologies such as speech recognition.

Third, our implementation of reinforcement learning has shown great potential in terms of knowledge acquisition and communication, but failed to produce useful design options. Conditioning the exploration of design possibilities using an external dataset – as we did for the application on analogical reasoning – proved to be effective in guiding the agent in the synthesis of 3D forms. We will explore other conditioning strategies to guide the agent in generating more interesting design options. These strategies might involve providing the agent with data about how designers would address analogous design problems, which the agent could use as the starting point for the playful exploration of alternative solutions.

Finally, we believe that an AI system that supports design exploration should implement all the possible learning mechanisms at once. Our ultimate aim is to develop new I-CAD systems that are able to engage in a variety of design problems, which can only be achieved with a hard form of AI – or artificial general intelligence (AGI) [84, 85]. This has been a hot topic in computer science since the origins of computing technology, and it inspired both the first applications of AI in design and our research work.

Conclusion

This study demonstrates that AI can simulate aspects of human cognition and support conceptual design and idea generation, by interacting with the designer through visual data formats, such as 2D images and 3D models.

AI applications have been used successfully in various research fields. For this reason, we argue that it is important to understand how AI can be integrated with CAD systems to explore new and enhanced forms of human–machine interaction in design disciplines.

We have identified expertise, playfulness, and analogical reasoning as three learning mechanisms that AI can simulate. We used three case studies to test these mechanisms separately. An application related to structural design illustrated how Variational Autoencoders learn from precedents to design shell structures from a simple 2D footprint, thereby simulating design expertise. Another structural application, concerning the design of a simple frame structure, described how reinforcement learning works in such processes as playing and model making, strategies that children and designers commonly use. This is the only performative application in this study because it involves FEM analysis. It also enables an interactive form of AI-human partnership. A third application involved analogical reasoning and the transfer of knowledge from biology to design. This is an AI-driven process of visual abstraction that extracts relevant features from 3D-scanned tree forms to design artificial replacements for natural habitat structures.

We here contribute to exploring emerging AI techniques in design applications and the integration of AI into CAD systems. However, considerations about human–machine interaction are currently limited to the study of how such forms of interaction take place at the interface level, i.e. in terms of input–output.

The aim of this work is not to demonstrate that AI supports design exploration but instead to show that different forms of exploration from those that characterise the current computational design tools and approaches can be used. For the same reason, we do not seek to demonstrate whether such forms of exploration lead to the emergence of creative design output.

To conclude, this work offers a theoretical framework, supported by practical applications, that can be used to implement AI models in architectural and structural design applications. It should be considered as a step towards the development of autonomous and participative design systems.