The City of Tomorrow from... the Data of Today

In urban planning, a common unit of measure for housing density is the number of households per hectare. However, the actual size of the physical space occupied by a household, i.e., a dwelling, is seldom considered, either in 2D or in 3D. This article proposes a methodology to estimate the average size of a dwelling in existing urban areas from available open data, and to use it as one of the design parameters for new urban-development projects. The proposed unit of measure, called “living space”, includes outdoor and indoor spaces. The idea is to quantitatively analyze the city of today to help design the city of tomorrow. First, the “typical” dwelling size and a series of Key Performance Indicators are computed for all neighborhoods from a semantic 3D city model and other spatial and non-spatial datasets. A limited number of neighborhoods are then selected based on their similarity to the envisioned development plan. The size of the living space of the selected neighborhoods is subsequently used as a design parameter to support the computer-assisted generation of several design proposals. Each proposal can be exported, shared, and visualized online. As a test case, a to-be-planned neighborhood in Amsterdam, called “Sloterdijk One”, has been chosen.


Introduction
Nowadays, urban transformations are characterized by ever-increasing complexity and uncertainty [1]. As a consequence, and according to the specific case and set of challenges, advisors, stakeholders, and often even end-users are sitting at the table right from the beginning in order to achieve consensus on the strategic pathways to undertake together. The involved experts no longer work following a pre-assigned order. Process and design have ceased to be the signature of a particular leader or author, but are much more the result of the collaboration of everyone involved [2]. As all involved parties are coming from diverse backgrounds and use different ways of communication, there is currently more and more need for designs and design tools enabling the interaction and knowledge sharing among all disciplines and stakeholders [3].
Given the above-mentioned framework, and considering that the urban transformations of the future will always have to relate in some way to the existing city ("the city of today"), this article elaborates on a widely used notion in urban planning, i.e., the number of households per hectare, as one of the common units of measure for urban housing density [4].
Note that the term "household" refers to the group of people sharing the same living space (e.g., a family). Although the term is often used in urban planning, it is rather surprising that the actual size of the physical space used by a household (i.e., a dwelling) is seldom considered, either in 2D or in 3D. Dovey and Pafka [5], for example, define the concept of "internal density", associated with the size of the household, as an indicator of slums and luxury housing (measured in m²/resident). This is a variation of the "urban footprint" concept proposed by Berghauser Pont and Haupt [6] and the "BUA" (built-up area per capita) [7].
In this paper, the overall idea is to add the volumetric size of the physical space used by a household and, more generally, the size of the typical "living space" to the urban design process. For this reason, the term "dwelling" will be used instead of "household" wherever necessary to avoid confusion.
The size of the "living space" will first be estimated using (open) urban data, and then used as a quantitative design parameter for a new development area. Please note that the concept of living space is intended here to comprise indoor, outdoor, residential, and non-residential spaces, and considers all the spatial elements inside a neighborhood, such as green areas, water, streets, pedestrian areas, bicycle lanes, parking lots, etc. The test case is located within the Haven-Stad project area, in the west of Amsterdam, the Netherlands. This area was chosen due to its envisioned development plans: up to 70,000 new dwellings are expected to be built in the next 20 years. Specifically, the second stage of the project, the so-called "Sloterdijk One", was chosen due to its size (58 hectares) and the number of planned new dwellings (11,220). Figure 1 shows the position and the approximate extent of the area.
The proposed methodology is organized in sequential steps. To begin with, the current situation of a city (and its neighborhoods) is described by extracting a number of meaningful KPIs (Key Performance Indicators). These KPIs can help urban planners in "reading" and understanding the city, and they eventually contribute to the improvement of the project during the design process. The approach can be considered innovative in that knowledge and quantitative design parameters are extracted directly from existing urban situations, i.e., from "real-life" examples of successful urban interventions. For a whole city (e.g., Amsterdam), a variety of spatial and non-spatial datasets are retrieved and harmonized to compute the KPIs. Among others, aspects tied to land use, housing density, average price of residential dwellings, quality of life, and the year of construction of the buildings are considered.
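For orientation, the two figures quoted for "Sloterdijk One" (58 hectares, 11,220 planned dwellings) already imply a gross density. The following minimal sketch works through the arithmetic. It counts dwellings rather than households and uses the whole project area, so the result is only a rough gross value; the function names are illustrative and not part of the prototype tool.

```python
# Figures quoted in the text for "Sloterdijk One".
AREA_HA = 58.0              # size of the project area, in hectares
PLANNED_DWELLINGS = 11_220  # number of planned new dwellings

M2_PER_HECTARE = 10_000.0


def dwellings_per_hectare(dwellings: int, area_ha: float) -> float:
    """Gross housing density: dwellings divided by the total area."""
    return dwellings / area_ha


def gross_land_per_dwelling_m2(dwellings: int, area_ha: float) -> float:
    """Total project area (in m2) divided by the number of dwellings."""
    return area_ha * M2_PER_HECTARE / dwellings


density = dwellings_per_hectare(PLANNED_DWELLINGS, AREA_HA)
land_share = gross_land_per_dwelling_m2(PLANNED_DWELLINGS, AREA_HA)
print(f"{density:.1f} dwellings/ha, {land_share:.1f} m2 of gross land per dwelling")
```

The density figure alone says nothing about how large each dwelling is or how the remaining space is used, which is the gap the living-space parameter is meant to fill.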
Thereafter, additional parameters are computed in order to obtain the size of the average "typical" living space for each neighborhood, namely the average sizes of indoor spaces (calculated as volumes in 3D) and of open spaces (calculated as areas in 2D). A semantic 3D city model is used for this purpose. Although outdoor spaces are obtained as areas rather than volumes, a conceptually similar approach is used to calculate them. In addition, to facilitate the (visual) comparison of the results, all living-space values are normalized and referred to one prototypical average building, in our case a composition of a single dwelling with additional non-residential and outdoor spaces. In the subsequent urban-planning design phase, the sizes of the living spaces are used as design parameters.
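The normalization described above can be sketched as follows: the neighborhood totals (indoor volumes in 3D, open-space area in 2D) are divided by the number of dwellings to obtain one "typical" living space. This is a simplified illustration under assumed inputs; the class names, field names, and all numeric values are hypothetical and not taken from the Amsterdam data.

```python
from dataclasses import dataclass


@dataclass
class Neighborhood:
    name: str
    dwellings: int
    residential_volume_m3: float      # indoor, residential (3D)
    non_residential_volume_m3: float  # indoor, non-residential (3D)
    open_space_area_m2: float         # outdoor: green, water, streets, etc. (2D)


@dataclass
class LivingSpace:
    residential_m3: float      # residential volume per dwelling
    non_residential_m3: float  # non-residential volume per dwelling
    outdoor_m2: float          # open-space area per dwelling


def living_space_per_dwelling(n: Neighborhood) -> LivingSpace:
    """Normalize the neighborhood totals to a single average dwelling."""
    if n.dwellings <= 0:
        raise ValueError(f"{n.name}: at least one dwelling is required")
    return LivingSpace(
        residential_m3=n.residential_volume_m3 / n.dwellings,
        non_residential_m3=n.non_residential_volume_m3 / n.dwellings,
        outdoor_m2=n.open_space_area_m2 / n.dwellings,
    )


# Hypothetical neighborhood, for illustration only.
example = Neighborhood(
    name="Example",
    dwellings=2_000,
    residential_volume_m3=600_000.0,
    non_residential_volume_m3=100_000.0,
    open_space_area_m2=150_000.0,
)
space = living_space_per_dwelling(example)
print(space)
```

Keeping the indoor values in cubic meters and the outdoor value in square meters mirrors the mixed 3D/2D nature of the parameter described in the text.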
A prototypical, semi-automatic design tool has been developed. The goal is to speed up the design process while allowing at the same time, in a quick and interactive way, the creation of various proposals regarding the same study area. Several design parameters can be set and changed. In the meantime, feedback is given by the tool on whether certain design constraints are respected or not. A number of parameters (and constraints) can be added based, for example, on the guidelines issued by the municipality regarding a specific development project or area.
In order to enhance both transparency and communication, it is possible to generate a specific report containing all design parameters for each design configuration. Additionally, and if needed, each configuration can be exported, shared, and visualized in 3D (locally or online). The user can export and integrate 3D geometries as well as attributes (and semantics), the lack of which is a rather well-known limitation of most CAD-based design tools. Having several scenarios available, along with the possibility to integrate stakeholders' feedback in the design process, can indeed help facilitate the collaborative process.

Related Work
Several solutions to facilitate urban planning work have been developed using computer-based design tools. Some of these tools work fully automatically, generating design solutions that require no intervention at all by the designer. The Interactive Urban Synthesis Computational Methods [8] are an example focusing on the automatic creation of multiple design scenarios with no or limited interaction with the designer during the process. Other tools are semi-automatic, entailing a limited amount of interaction between the software and the user. These tools can be very effective because they combine professional knowledge with the strength of computers. One of the most popular examples in Europe is the Kaisersrot project in Zurich, Switzerland [9]. In this project, the software uses a database containing the future residents' needs as primary input, together with information about the context, to determine a balance in needs, and finally gives shape to an optimal parcellation solution. Its limitation is the reduced flexibility in generating regular geometries, as it is based on Voronoi diagrams. Another 2D tool is InViTo [10], which targets the analysis and design of large-scale urban projects. This and the Kaisersrot project are powerful examples of tools, but the 3D element is still missing in the analysis. Another solution is the CItyMaker proposed and developed by Beirão [11]. It works in 3D and has a strong background of induction patterns for urban design. CItyMaker focuses mainly on the design of urban fabrics, including spatial analysis, but only partially covers the shape of buildings. Finally, the Möbius modeler [12] is a web-based parametric modelling tool running geo-computational procedures in a 3D environment. It is comparable with the Grasshopper tool for Rhinoceros, taking advantage of the possibility to combine geometric data with semantic information. Its limitation, however, lies in the rather limited interaction between 3D model and user during the model generation process.
Several tools for urban planning, many of them based on Rhinoceros/Grasshopper, have been developed in recent years, either as academic projects or as commercial software. For example, the "parametric urban design" tool [13] belongs to the former group and is mentioned here because of its similarity with the prototype tool presented in this article. Both share similar principles in terms of urban design systems; however, the "parametric urban design" tool uses a limited list of design parameters and lacks support for geospatial data and context information outside of the design area.
In terms of commercial software, "Ostate" [14] and "Parametric Smart Planning" [15] are two examples of solutions using Grasshopper and Rhinoceros. The former is specifically oriented to the Dutch market and to small projects (project areas of up to 10-20 ha). It consists of four main modules that allow the user to read official Dutch datasets directly from the source (including context), create the volumetric shape of buildings, define the internal distribution of the buildings, and finally calculate their energy efficiency. The latter is more oriented to the global market and to large-scale projects (project areas of up to 100-200 ha). It offers a vast number of design parameters, mainly focused on blocks and superblocks. The main drawback of these proprietary solutions is, however, that they cannot be customized at will, as they are conceived and sold as closed-source software.
Another set of software tools commonly used for parametric design is Autodesk Revit [16] and its visual scripting environment for designers Dynamo [17]. Even though Revit was conceived as a tool for Building Information Modelling (BIM), sometimes it is also used in projects related to urban design [18]. Nevertheless, the nature of Revit is mostly tied to the BIM world: typical application field is with projects at building-scale (or block-scale) as the main focus is on the single building and its components (windows, doors, walls, etc.), making the use of this software more suitable for disciplines like building physics [19] or construction technologies [20].
Concerning city analyses based on open data, it is worth mentioning a recent project in urban morphology analysis by Kepczynska-Walczak and Pietrzak [21], in which the city of Lodz is analyzed in 3D, enabling comparisons among different areas within the city. This new way of analyzing the current situation of a city based on data makes possible the understanding of the complexity of the urban environment. The drawback in this research is the lack of 3D data, a fundamental dimensional factor in city spatial analyses.
Road design is one of the backbones contributing to open-space quality in urban planning. Governments and municipalities around the world are investing considerably to make interactive tools available for creating street cross-sections. Examples are the Abu Dhabi Urban Street Utility Design Tool [22], Streetmix [23], and Streetplan [24]. These online tools are available to everyone and are powerful in terms of visualization capabilities, but in the best-case scenario one can only export an image of the created design or just basic information about it. The work of de Klerk and Beirão [25] deals with another road modelling tool: implemented in Rhinoceros and Grasshopper, it allows the user to create street cross-sections using semantic structures and real-time visual analytics to support the design process. However, all the above-mentioned road-modelling tools only support design in 2D and do not include considerations concerning the infrastructure surrounding the project area.
The living space and its size are nowadays studied from different angles, e.g., in terms of quality of life, health, or space efficiency. With regard to the quality of life in residential areas, there is a direct correlation with the size of spaces (i.e., bigger spaces correspond to higher quality of life); however, this holds true only in some cases. For the Netherlands, Maas et al. [26] demonstrated that quality of life, defined as the good health of the inhabitants, is in fact more directly linked to the "greenness" of the neighborhood; White et al. [27] report a similar link for the UK, where inhabitants of greener areas are described as "happier". Regarding indoor spaces, the size of the households is another important indicator of quality of life. According to Foye [28], in the UK, bigger houses can increase the level of subjective well-being, but only for a short period of time. Finally, referring to space efficiency, The Why Factory group at TU Delft carried out a design research project called On-The-Go [29], in which the minimum volume needed for daily activities like cooking or sleeping was first derived by tracking the project participants by means of wearable sensors. The results were then used to determine the typical size of the living space. Once known, the spaces themselves were clustered and re-organized to generate the shape of a building. Figure 2 (adapted from [29]) shows a simplified procedure overview of the On-The-Go project. In a similar way to The Why Factory's approach, this article puts forward a methodology to first calculate the average size of a single dwelling based on the way people currently live, and then extends it to envision a new urban district. The added value of the work presented in this article is the inclusion of outdoor spaces (in 2D), in order to consider not only the indoor living spaces, but also the "outdoor" open spaces in the city.
Considering the large amount of heterogeneous spatial and non-spatial data to take into account in urban planning, the adoption of standard-based semantic 3D city models can provide valuable support. Thanks to the constant progress in all fields connected to geomatics, more and more cities in the Netherlands and elsewhere are creating and using 3D virtual city models as "digital twins" (or, as recently proposed, "digital geotwins" [30]) and as a means for the integration, harmonization, and storage of data. A unique and spatio-semantically coherent urban model [31] can deliver multiple beneficial effects for a city, functioning as a hub of integrated and harmonized spatial and non-spatial information for further applications ranging from noise mapping and augmented reality to energy simulation tools [32][33][34]. Biljecki et al. [35] provide a review of applications based on 3D city models. In this context, CityGML [36] is an international standard conceived specifically as an information and data model for semantic city models at urban and territorial scale. To the authors' knowledge, it is the only open and well-documented standard of this type. CityGML is increasingly being adopted and used as an integrated source of data in various scientific fields. One of its advantages is the possibility to extend the standard for domain-specific needs via the so-called Application Domain Extensions (ADE). Several examples of ADEs have already been developed in recent years [37].

Objectives and Structure of the Article
The analysis of the current situation of a city through existing (open) standardized data is the first objective of the proposed methodology. The generated knowledge is meant for urban planners, facilitating their way of reading the city and proposing a new unit of measure in urban development projects, namely the size of the living space per household. As previously mentioned, in urban planning and urban development, the parameter of density normally used in projects is traditionally given by the number of households per hectare. However, as the actual size of the associated dwellings is seldom considered, the goal is to "extend" the concept of living space out of the literature, both in terms of dimensionality (i.e., going from 2D to 3D as much as possible) and in terms of scale (i.e., from a single building/dwelling to the whole city).
Directly following the first objective, a second goal of the proposed methodology is to test and embed semantic 3D city models (based on CityGML) in the whole process, thus being able to exploit their added value as a source of spatial (2D and 3D) and non-spatial information.
An additional goal is to facilitate the quick creation of several design configurations, based on the knowledge previously created. The long time needed for decision-making in the urban-design process is a well-known problem that this study tries to reduce. This process can sometimes last up to ten years, or longer. Over such a long period, guidelines, design parameters, or other issues regarding a project can change. Therefore, as time passes and additional conditions might require changes, being able to quickly adapt a project is clearly a benefit.
Finally, the proposed methodology, together with the accompanying prototype tool, aims to facilitate mutual communication between stakeholders from various fields of expertise (and the collection of their feedback). It does so by supporting the urban designer in the quick, interactive generation and visualization of design configurations, a process that nowadays is often expensive and very time-consuming.
The following five points are the main research questions this article wants to address. They are formulated as follows: (1) Can the city of today be quantitatively "read" by means of existing (open) data in order to provide suggestions for the to-be-designed city of tomorrow? (2) Is it possible to define some (3D) spatial KPIs (related to the above-mentioned concept of living space) that may support the urban designer in the generation of computer-assisted spatial configurations? (3) What is the effect of these KPIs in the case of a real project (e.g., "Sloterdijk One" in Amsterdam)? (4) How can semantic 3D city models, as hubs of harmonized and integrated spatial and non-spatial information, contribute to the interpretation of the city of today? (5) How can they be integrated with existing tools that are generally used in GIS and urban design?
The article is structured as follows: after this introduction, Section 2 contains the overview of the methodology. All steps and substeps are introduced and described, while all software architecture and implementation details of the developed prototype are presented in Section 3, which accounts for the major part of the article. The case study area and the results are then described in Section 4, while Section 5 contains some further reasoning concerning the current limitations and future improvements. Section 6 is dedicated to the concluding remarks. A list with all abbreviations used in this article, as well as their expansion, is given in Appendix A.

Overview of the Proposed Methodology
This section gives an introductory description of the proposed methodology, whereas the technical details and the specific implementation decisions will be dealt with in Section 3. Figure 3 shows a graphical overview of all envisioned workflow blocks. Each step is identified by a number and contains some substeps, identified by letters. The reader is invited to keep this schematic overview at hand for reference. The overall approach is divided into five major successive steps, of which only the first three will be described in detail in this article, as the last two are not of a technical nature and are, therefore, beyond the scope of this work. They are nevertheless shown and briefly mentioned for the sake of completeness. Every step is further illustrated with figures in order to facilitate the comprehension of the overall workflow. The main idea is to develop a tool that uses open data coming as much as possible (ideally: exclusively) from official national, regional, and local data sources, and is based on existing open standards. This decision is intended to facilitate the re-use of the tool for other case studies in Amsterdam, but also, with minor adjustments, in other Dutch cities, as long as the same or equivalent data contents are provided.
In the first step, existing spatial and non-spatial datasets are collected and evaluated. A set of Key Performance Indicators (KPIs) is first decided upon to characterize the overall urban fabric and then extracted from the available datasets. The KPIs are computed for each existing neighborhood in the city and then used to perform a first analysis at city level. In this work, the following KPIs were defined and used:


1. Percentage of Residential Area Index (RA-I), i.e., the ratio between the area used for residential purposes and the total area of the neighborhood;
2. Housing density index (AND-I);
3. Socio-economic status index (SEL-I), approximated by the average price per square meter of residential construction;
4. Quality of Life Index (QOL-I);
5. Average age of the building stock index (ABS-I).
These KPIs are, on the one hand, quite general in scope. On the other hand, they are rather common in urban planning; they were evaluated and chosen with the target study area of "Sloterdijk One" in mind, as they align with the overall goals of the Haven-Stad project. For example, the RA-I is fundamental to find predominantly residential neighborhoods. Similarly, the housing density (AND-I) plays a relevant role because there is a direct relation between the size of the dwelling and the density of an urban area. The average price per square meter of residential construction is meant to act as a proxy of the socio-economic status of the neighborhoods (SEL-I), classified into social, medium-level, or high-level classes. The Quality of Life Index (QOL-I) was introduced as a synthetic parameter to "quantify" the livability in each neighborhood. For this index, the national dataset Leefbaarometer (in English: "barometer of life [quality]") was used, as it is published every two years and contains up-to-date data for the whole of the Netherlands. It therefore provides a nation-wide reference parameter, which is also relevant for possible extensions of the tool to other Dutch cities. Finally, the average age of the building stock index (ABS-I) is intended to help characterize the building stock of each neighborhood, i.e., to understand whether the area is of recent development or a historical part of the city.
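As an illustration of how such indicators can be derived per neighborhood, the sketch below computes the RA-I as a percentage and an average building age. The RA-I follows the definition given above; the exact formula of the ABS-I is not stated in this section, so the mean age relative to a reference year is an assumption made here for illustration. Function names and input values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def residential_area_index(residential_area_m2: float, total_area_m2: float) -> float:
    """RA-I: residential area as a percentage of the total neighborhood area."""
    if total_area_m2 <= 0:
        raise ValueError("total area must be positive")
    return 100.0 * residential_area_m2 / total_area_m2


def average_building_age(construction_years: Sequence[int], reference_year: int) -> float:
    """Mean age of the buildings at the reference year (assumed ABS-I definition)."""
    return mean(reference_year - year for year in construction_years)


# Hypothetical neighborhood: 12 ha of residential land out of 40 ha in total.
ra_i = residential_area_index(120_000.0, 400_000.0)

# Hypothetical construction years, evaluated at 2016 (the reference year of
# most datasets used in this work).
abs_i = average_building_age([1930, 1965, 2005, 2012], reference_year=2016)

print(f"RA-I = {ra_i:.1f} %, average building age = {abs_i:.1f} years")
```

In practice the inputs would come from the land-use and building datasets rather than from hard-coded values.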
Finally, in step 1C, the 3D city model of the city is either created (as in the case of Amsterdam) or, if it is already available, enriched and enhanced in order to be usable as a source of data. The 3D city model itself is also used to compute some KPIs. It must be stressed that in the first step the order of the substeps is not purely sequential, as it may take a number of iterations to identify, collect, harmonize, and integrate the required datasets and to compute the set of KPIs. Such an iterative process is well established among practitioners and stakeholders in urban planning.
In the second step, some additional data transformation parameters and rules are first defined and implemented. They may comprise, for example, the net-to-gross floor area ratio for non-residential usage zones, and the rules to compute the residential volume in buildings from the 3D city model, once the volume of all non-residential usage zones is known. The size of the living space is computed for each neighborhood, both in terms of indoor volumes (for residential and non-residential buildings) and areas for open spaces. Eventually, a query is performed to identify and select those neighborhoods that most closely represent the envisioned, to-be-designed, new transformation. The selected neighborhoods will act as a sort of "urban design template" or blueprint. Figures 4-6 contain some visual examples of data analyses carried out at city level using the KPIs, a screenshot of the city model of Amsterdam, as well as an example of neighborhoods resulting from the selection process based on some of the KPIs. All images are produced using the prototype tool that will be described in more detail in the next section.
In the third step, additional design parameters and constraints are defined and added. They can be classified into two main groups. The first group consists of so-called "external" design parameters. They are derived from the city regulations and guidelines. Examples are the maximum allowed height of buildings in the city (or in the new development area), the number of planned new dwellings, etc. The second group consists of so-called "internal" design parameters and objects. It comprises, for example, a library of "typical" (simplified) residential and non-residential building classes (or typologies) to be later parametrically instantiated when generating the scenarios. Once the internal and external design parameters are set, the interactive generation of the scenarios can start.
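The selection of "template" neighborhoods in the second step can be pictured as a two-stage query: a filter on accepted KPI ranges, followed by a ranking by similarity to the target values. The sketch below is one possible way to express this and is not the query used in the prototype; the range-scaled Euclidean distance, the KPI keys (`ra_i`, `density`), and all values are assumptions made for illustration.

```python
from math import sqrt
from typing import Dict, List, Tuple

Kpis = Dict[str, float]


def select_templates(
    neighborhoods: Dict[str, Kpis],
    target: Kpis,
    ranges: Dict[str, Tuple[float, float]],
    top_n: int,
) -> List[str]:
    """Filter neighborhoods by KPI ranges, then rank them by distance to the target."""
    candidates = []
    for name, kpis in neighborhoods.items():
        # Stage 1: discard neighborhoods with any KPI outside its accepted range.
        if not all(lo <= kpis[k] <= hi for k, (lo, hi) in ranges.items()):
            continue
        # Stage 2: Euclidean distance to the target, each KPI scaled by its range width.
        distance = sqrt(
            sum(((kpis[k] - target[k]) / (hi - lo)) ** 2 for k, (lo, hi) in ranges.items())
        )
        candidates.append((distance, name))
    return [name for _, name in sorted(candidates)[:top_n]]


# Hypothetical KPI values (RA-I in percent, density in dwellings per hectare).
neighborhoods = {
    "A": {"ra_i": 62.0, "density": 180.0},
    "B": {"ra_i": 75.0, "density": 140.0},
    "C": {"ra_i": 30.0, "density": 200.0},  # rejected: RA-I below the accepted range
    "D": {"ra_i": 55.0, "density": 210.0},
}
target = {"ra_i": 60.0, "density": 190.0}
ranges = {"ra_i": (50.0, 80.0), "density": (120.0, 250.0)}

selected = select_templates(neighborhoods, target, ranges, top_n=2)
print(selected)
```

Scaling each KPI by the width of its accepted range keeps indicators with large numeric values from dominating the ranking.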
All relevant parameters and KPIs of the originally selected "template" neighborhoods are then used to generate a number of alternative scenarios, together with accompanying reports that contain all parameters used. It must be stressed that the scenario generation process allows for some interaction with the user. For example, the user is notified on-the-fly if certain constraints cannot be met, so that the design can be changed accordingly.
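The kind of on-the-fly feedback described here can be sketched as a simple check of a scenario against two of the "external" parameters mentioned earlier, i.e., the maximum allowed building height and the number of planned dwellings. The prototype itself runs in Rhinoceros/Grasshopper; the plain-Python version below, with its class names and values, is only a hypothetical illustration of the logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Building:
    height_m: float
    dwellings: int


@dataclass
class Constraints:
    max_height_m: float     # maximum allowed building height
    planned_dwellings: int  # number of dwellings the scenario must provide


def check_constraints(buildings: List[Building], c: Constraints) -> List[str]:
    """Return one message per violated constraint (empty list if all are met)."""
    messages = []
    for index, building in enumerate(buildings):
        if building.height_m > c.max_height_m:
            messages.append(
                f"building {index}: height {building.height_m} m exceeds "
                f"the maximum of {c.max_height_m} m"
            )
    total = sum(b.dwellings for b in buildings)
    if total < c.planned_dwellings:
        messages.append(
            f"scenario provides {total} dwellings, {c.planned_dwellings} are planned"
        )
    return messages


# Hypothetical scenario with one building that is too tall and too few dwellings.
scenario = [Building(height_m=30.0, dwellings=120), Building(height_m=75.0, dwellings=400)]
limits = Constraints(max_height_m=60.0, planned_dwellings=600)

for message in check_constraints(scenario, limits):
    print(message)
```

Returning messages instead of raising an error matches the interactive use case, in which the designer keeps working and adjusts the parameters until the list is empty.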
The scenarios are eventually stored and integrated into the original 3D city model, and prepared for visualization and sharing. The user has different options to explore and compare the different scenarios. From within CesiumJS, the user can interact with the 3D geometries, access a dashboard where graphs are displayed, compare two or more scenarios side-by-side, and finally print a report as a PDF file with all input and design parameters. This is meant to facilitate the dissemination and visual comparison of the different scenarios, and is important for the potential involvement of a wider audience in the evaluation or participatory process. Figure 7 presents an example of two scenarios of a test area, identified here simply as Scenario A and Scenario B, obtained during the interactive urban-design process in Rhinoceros/Grasshopper. Scenario B is also presented later in CesiumJS: existing buildings are shown in white, while new buildings are represented in green (Figure 8, left). Alternatively, the UsageZones can be visualized and queried (Figure 8, right). From the CesiumJS interface, the user can also access a Scenario dashboard (Figure 9) and a Scenario comparison tool (Figure 10).
In the fourth step, the stakeholders can examine and discuss the proposed scenarios, and either choose one scenario or provide feedback to request changes. Depending on the type of feedback, the changes can lead back to different previous steps or substeps, in order to allow for more degrees of intervention in the whole design process.
In the fifth and last step, the final scenario is chosen and the further steps in the urban planning design can roll out. As mentioned before, these last two steps (4 and 5) are out of the scope of this article; however, they are mentioned here for the sake of completeness and to allow for a global view of the whole proposed methodology.

Implementation
In this section, the implementation of the methodology is presented. Heterogeneous, spatial, and non-spatial data were first gathered, harmonized, and integrated. A centralized database was set up and used to store both the semantic 3D city model and ancillary data. All subsequent interactions are carried out between a specific tool and the database. Data exchange is ensured via a number of interfaces, depending on the target application consuming the data. In this work, there are four main software architecture components:


1. At the core, acting as the central data repository, is a PostgreSQL/PostGIS-based database [38,39]. The PostGIS version of the CityGML 3D City Database [40] is additionally used to store the CityGML-based 3D city model and, by means of a number of extensions to it, additional data and the scenarios. Data is transferred to the client applications via interfaces based on Python, PostgREST, PHP, etc., depending on the specific technology used.
2. QGIS 3 [41] is used as a generic, all-purpose client for geo-data exploration, both in 2D and, to a limited extent, in 3D. The decision to rely on QGIS has multiple reasons: its well-proven support for PostgreSQL/PostGIS, its open-source nature, and its growing adoption in municipalities for a number of common GIS tasks.
3. For the urban design step, the 3D-modelling software package Rhinoceros and its extension Grasshopper [42] are chosen. Despite not being open-source, their adoption is motivated by their wide acceptance and usage in the urban-planning and design domain (e.g., by architects and urban planners). Their parametric-modelling capabilities are well known and suit the overall purpose of the proposed methodology well.
4. For the online visualization and geo-data exploration of the 3D city model and the generated scenarios, CesiumJS [43] is chosen due to its open-source nature, the number of existing projects and available tools already using it, and, more generally, its documented good performance when dealing with large datasets (e.g., as in the case of 3D city models).
In the following sections, the overall procedure will be divided and described from the data and software point of view. The order corresponds approximately to the scheme presented in Figure 3, while an overview of datasets and technologies is depicted in Figure 11. The reader is invited to refer to these figures for visual aid and guidance.
Figure 11. Overview of the datasets and technologies used to implement the methodology presented in Section 2, and associated data flows. The acronyms of the datasets are explained in Section 3.1.1.

Data Collection
First of all, heterogeneous data from national and local sources were collected, analyzed, and selected depending on their suitability for the goals of this work. The list of datasets presented hereafter is the result of an interactive selection process. On the one hand, there are datasets "typically" needed to generate a city model (e.g., footprints, point clouds, DTM, DSM, addresses, etc.). More details can be found in literature reporting experiences from other cities [44][45][46]. On the other hand, some datasets were chosen depending on the discussions stemming from the KPI identification process (step 1B). At the same time, the KPIs were defined also considering the existing and available datasets.
In the case of Amsterdam, the datasets used are described below. Unless otherwise indicated, they can all be accessed and downloaded either from the Amsterdam WebGIS portal [47] or via the Dutch geo-portal for open data [48]:

1. Basisregistratie Adressen en Gebouwen (BAG). This is the most detailed, openly available dataset on buildings and addresses in the Netherlands. It contains information about each address in a building, such as its current main use (residential, commercial, industrial, etc.), construction date and registration status. The polygons in the BAG represent the footprint of the building as the projection of the roof outline. The dataset is regularly updated as new buildings are registered, built or demolished. Additional information is given for the net area associated with each address.
2. Actueel Hoogtebestand Nederland (AHN). This dataset is the openly available elevation model of the Netherlands obtained by aerial laser scanning. It is accessible in raster and raw point cloud (LAZ) formats. It is delivered both as DSM (Digital Surface Model) and derived DTM (Digital Terrain Model). In the case of the raster version, it is available at 0.5-m and 5-m grid cell resolution.
3. 3D BAG [49]. This dataset is available as open data and is a subset of the above-mentioned BAG. It contains only building information, but it is supplemented by a set of height values of the roof and of the ground surfaces extracted by intersecting the BAG with the AHN point clouds. This dataset is generated automatically by the 3D Geoinformation Group of TU Delft on a regular basis and covers the whole of the Netherlands [50]. It hence allows LoD1 buildings to be generated by simply extruding the BAG footprints by the corresponding roof extrusion value. For both roof and ground heights, several values are given corresponding to different percentiles. For example, attribute roof-99 corresponds to the height of the building when the roof surface is set at the 99th percentile of the z-coordinates of the point cloud of the building. In this work, the height values of ground-50 and roof-50 were considered. Additional attributes are used to exclude buildings with invalid (or null) height values. One possible reason is that those buildings were built (or significantly changed in shape) after the AHN survey from which the datasets are derived (2015/2016).
4. Basisregistratie Grootschalige Topografie (BGT). This dataset is the official large-scale topographic map of the Netherlands. It is regularly updated and freely available for download.
5. Basisregistratie Kadaster (BRK). Dutch dataset with information about the land registry. This dataset contains geometries and information regarding the cadastral parcel boundaries, parcel numbers, and the most important buildings.
6. Leefbaarometer 2016 [51]. Dutch dataset containing a "Quality of Life" index. This dataset is meant to provide aggregated information about "the extent to which the living environment meets the conditions and needs that are imposed on it by people".
7. CBS Wijken en Buurten 2016. Dutch dataset containing the information related to the census of the Netherlands. It includes, for example, the number of households per neighborhood.
8. Functiekaart. Amsterdam dataset containing all non-residential activities in the city. An approximate indication of the non-residential surface (in m²) is also given.
9. Grondgebruik 2017. Amsterdam

Please note that for several datasets, data from 2016 (or around 2016) were chosen in order to avoid temporal misalignments and inconsistencies as far as possible, despite the availability in certain cases of more recent datasets. The decision is mainly due to the availability of the AHN3 data (and derived products), which were last collected in 2015/2016 and upon which the current 3D city model is based. A newer campaign is, as a matter of fact, already ongoing as of 2019/2020, with the first new datasets expected approximately by the end of 2020 or the beginning of 2021.
All data were copied and loaded into the PostgreSQL-based database. For national datasets, only data related to Amsterdam were kept. Despite the availability of several data sources via web services, it was chosen to rely on a local copy mainly for practical and performance-related reasons, although this decision may change in future as the developed prototype evolves. For most ETL (Extract, Transform, and Load) operations, Safe Software's FME (Feature Manipulation Engine) Workbench [53] was used.

Generation of the Semantic 3D City Model
Regarding the 3D city model of Amsterdam, two existing datasets were carefully considered in the first place as possible candidates upon which to build a CityGML-based city model. Given the relevance of the volume enclosed by each building in the computation of the KPIs and the living-space parameters, particular attention was paid to identifying possible issues that might significantly affect the volume computation.
A LoD2-"alike" model of the whole city was kindly delivered for testing by the Municipality of Amsterdam [54]. In its current form, it is used for online visualization using the Unity engine. It is mainly meant to be informative to the public and therefore contains just very general information, i.e., only the BAG ID is associated with each building. The model is reconstructed from AHN3 data and consists of a triangulated mesh. The model was delivered in CityGML format; however, it contained some issues such as the geometries being not water-tight due to "holes" in the mesh or the lack of ground surfaces in the building envelope. As volume computation would not be possible, this model was not used. An example of a building from this dataset is shown in Figure 12a.
As a second candidate, the 3D BAG dataset was considered. As mentioned before, it consists of footprint geometries and the associated height values both for the ground and roof surfaces, de facto allowing for the generation of a LoD1 model. For the height values, the "ground-50" and the "roof-50" attributes were used, i.e., each corresponding to the median height of the Lidar points used to estimate the respective values. Initial tests, however, revealed a number of issues, such as the lack of height information for all buildings north of the IJ (i.e., the body of water that separates the borough of Amsterdam-Noord from Amsterdam-Centrum and the rest of the city) and a tendency to largely overestimate the enclosed volume of irregularly-shaped buildings. An example is given in Figure 12b, which highlights the volume overestimation compared to the same building shown in Figure 12a.
As a result, a new 3D city model was generated from scratch. From the AHN3 DSM and DTM rasters at 0.5-m grid resolution, a normalized DSM (nDSM) was computed and intersected with the BAG polygons. The resulting reconstruction process focused on generating "LoD1-alike" prismatic buildings, i.e., obtained through extrusion of the footprints by the median of the intersected height values. The focus is, however, not on the building's shape itself, but more on the enclosed volume. For this reason, the resulting geometries actually define a volumetric city model, although they look like LoD1 buildings. An example is given in Figure 12c. A comparison of the three models of the same building can be seen in Figure 12d.
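The extrusion described above can be illustrated with a minimal sketch. It assumes the DSM and DTM are already clipped to a common grid and represented as nested lists, and that a boolean mask marks the cells inside a BAG footprint; the function name, the mask, and the toy values are hypothetical and not taken from the actual workflow.

```python
from statistics import median


def prismatic_volume(dsm, dtm, inside, footprint_area_m2):
    """Return (median height, volume) of a prismatic building, or None.

    dsm, dtm: nested lists of elevations on the same grid (assumed input layout).
    inside:   nested list of booleans, True for cells within the footprint.
    """
    # Normalized DSM (nDSM): height above terrain, kept only inside the footprint
    heights = [
        dsm[r][c] - dtm[r][c]
        for r in range(len(dsm))
        for c in range(len(dsm[r]))
        if inside[r][c]
    ]
    if not heights:
        return None  # no raster cell intersects the footprint
    height = median(heights)  # extrusion height = median nDSM value
    return height, height * footprint_area_m2


# Toy 2 x 3 grid: the two left-hand columns belong to the building
dsm = [[10.0, 10.0, 2.0], [10.0, 12.0, 2.0]]
dtm = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
inside = [[True, True, False], [True, True, False]]
height, volume = prismatic_volume(dsm, dtm, inside, footprint_area_m2=1.0)
```

Using the median rather than the maximum makes the extrusion height less sensitive to outliers such as chimneys or antennas, which is in line with the focus on the enclosed volume rather than on the building shape.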
For practical reasons, the volumetric city model was indeed treated as a CityGML-based LoD1 city model, and successively enriched with building-relevant attributes coming from other datasets. Information from the BAG (addresses, functions, year of construction, etc.) was integrated in order to better characterize the building stock. Particular care was required to integrate the Amsterdam dataset containing the non-residential functions and areas associated with each address point (see Functiekaart dataset). For this purpose, in terms of data model, the UsageZone class defined by the Energy Application Domain Extension [55] was used, as it allows a building to be partitioned into volumetric units having the same usage or function. Originally conceived for energy simulation purposes, the Energy ADE offers classes that can also be used in other application domains. As of the current version of CityGML, 2.0, this is not possible without additional classes from an ADE, unless one resorts to the Generics module, which contains the GenericCityObject and GenericAttribute classes. Another reason for adopting the Energy ADE is the availability of a database implementation for PostgreSQL/PostGIS [56] or, alternatively, the possibility to automatically generate a database schema using the latest version of the 3D City Database Importer/Exporter tools [57].
Data-wise, the decision to integrate the Amsterdam Functiekaart dataset is due to the fact that its information regarding the non-residential functions within a building is more detailed than that of the national BAG dataset. In other words, information from the BAG dataset was used only to identify buildings containing residential spaces, while the Functiekaart was used to identify buildings containing non-residential spaces and to derive the gross volume of the non-residential spaces. To convert the Functiekaart non-residential net floor area values to gross volumes, two conversion parameters were introduced (step 2A). Both parameters are commonly adopted in the architectural and construction fields. The first one is the conversion factor from net to gross floor area. According to the literature [58], it generally oscillates between 1.2 and 1.4. A single value of 1.3 (i.e., an increase of 30% from net to gross floor area) was chosen. The second parameter defines the average height of a non-residential story, which in the literature generally varies between 3.5 m and 4.5 m. A value of 4 m was chosen in this case.
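As a worked example of this conversion, the following sketch multiplies a net floor area by the two parameters given in the text (1.3 and 4 m). The function and constant names are illustrative only.

```python
NET_TO_GROSS_FACTOR = 1.3   # net -> gross floor area, value chosen in the text
STORY_HEIGHT_M = 4.0        # average non-residential story height, value chosen in the text


def gross_volume_m3(net_floor_area_m2,
                    factor=NET_TO_GROSS_FACTOR,
                    story_height_m=STORY_HEIGHT_M):
    """Convert a Functiekaart net floor area (m2) into a gross volume (m3)."""
    gross_floor_area_m2 = net_floor_area_m2 * factor
    return gross_floor_area_m2 * story_height_m


# A shop with 100 m2 of net floor area: 100 * 1.3 * 4 = 520 m3 of gross volume
shop_volume = gross_volume_m3(100.0)
```

With the extreme values reported in the literature (1.2 and 3.5 m, or 1.4 and 4.5 m), the same 100 m² would yield 420 m³ or 630 m³, which gives an impression of how sensitive the result is to the chosen coefficients.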
In addition, a set of rules was defined to fuse the two datasets, to classify the buildings into five classes, i.e., (fully) "Residential", "Mixed-use", "Non-residential (single-function)", "Non-residential (multi-function)", or "Unknown", and to compute the volume of the residential UsageZones accordingly. For example, if a building is classified as residential by the BAG, and no other information is available, then the entire volume is assigned to a single newly-created UsageZone of type "Residential", and the whole building is classified as "Residential". If additional information is available about non-residential functions, then the volume of each non-residential UsageZone is computed and subtracted from the building volume. If the remaining volume is positive, then it is assigned to a newly created UsageZone of type "Residential" and the whole building is classified as a "Mixed-use" building. If only non-residential UsageZone(s) are known for a building, their volume is computed, and the building is classified as "Non-residential (single-function)" or "Non-residential (multi-function)", respectively. If no information at all is available, then the building is classified as "Unknown".
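The fusion rules above could be sketched as follows. This is one possible reading of the rules, not the actual implementation: in particular, it assumes that a building flagged as residential in the BAG and carrying non-residential zones is always "Mixed-use" (even when no volume is left for a residential zone), and that single- and multi-function buildings are distinguished by the number of non-residential zones.

```python
def classify_building(building_volume, bag_residential, nonres_zone_volumes):
    """Return (building class, list of (zone type, volume)) for one building.

    building_volume:      volume from the volumetric city model (m3)
    bag_residential:      True if the BAG reports a residential use
    nonres_zone_volumes:  gross volumes derived from the Functiekaart (m3)
    """
    zones = [("Non-residential", v) for v in nonres_zone_volumes]
    if not zones:
        if bag_residential:
            # Whole volume goes to a single residential UsageZone
            return "Residential", [("Residential", building_volume)]
        return "Unknown", []
    if bag_residential:
        remaining = building_volume - sum(nonres_zone_volumes)
        if remaining > 0:
            # The leftover volume becomes a new residential UsageZone
            zones.append(("Residential", remaining))
        return "Mixed-use", zones
    if len(zones) == 1:
        return "Non-residential (single-function)", zones
    return "Non-residential (multi-function)", zones


building_class, usage_zones = classify_building(1000.0, True, [300.0])
```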
All UsageZone volumes were then divided by the corresponding footprint area in order to obtain an extrusion height, and vertically stacked if belonging to the same building. A visual example of the resulting UsageZones is given in Figure 13, while Table 1 contains some statistics regarding the number of classified buildings and associated volumes of the volumetric city model.

Additional analyses were carried out to investigate inconsistencies between the city model and the UsageZones from the Functiekaart. Ideally, for each building the difference between the volume given by the UsageZone(s) and the volume from the city model should be negative (i.e., the volume of the UsageZones is smaller than the volume of the whole building) or zero, but never positive (i.e., the volume of the UsageZones exceeds the volume of the building). Nevertheless, a number of errors were found. The reasons for such inconsistencies are manifold and can be traced back to possible errors in the reconstruction of the building from AHN3 data, inadequate conversion coefficients from net floor area to gross volume, errors in Functiekaart data, or a combination thereof. In the case of the Functiekaart, sometimes the value of the net floor area is erroneously associated with a single building, although it actually also includes areas of one or more adjacent buildings. A visual example is given in Figure 14. Finally, it is worth noting that the volumetric city model reflects building volumes above ground, hence all underground stories and their volumes are not considered. While this is an acceptable simplification for residential buildings, the error may become more and more relevant with larger, non-residential buildings.

Tables 2 and 3 contain some details to better describe some of these inconsistencies.
For each building, the difference between the UsageZone(s) volume and the volume from the city model was computed, distinguishing between positive, negative, or zero differences (ΔV+, ΔV-, ΔV0, respectively).
As a consequence of the data fusion rules, buildings classified as "Residential" and "Unknown" were not affected by volumetric inconsistencies. Globally, in 5950 buildings (out of 171,515, i.e., 3.5%) a positive volume difference was found, and in 2806 buildings (1.6%) a negative one, hence for the remaining 94.9% of the buildings there are no volume inconsistencies. The most problematic building classes are the non-residential ones (both single- and multi-function), where the respective percentage of correct buildings is only 40.0% (4682 out of 11,714) and 13.4% (86 out of 640). Further details are given in Table 2. In terms of volume differences, the sum of the UsageZone volumes yields a global net difference of 6,459,194 m³ in excess, i.e., 2.5% more than the reference total volume of buildings in the city model (259,064,056 m³). However, this apparently small difference is actually the algebraic sum of positive and negative volume differences, i.e., 28,258,106 and −21,798,912 m³, corresponding to 10.9% and −8.4% of the reference volume. In other words, non-negligible volumetric inconsistencies exist, especially for non-residential buildings, although they largely even out when summed. For the "Mixed-use" buildings, only positive volume differences could be identified automatically, because in the case of negative differences the missing volume was used to create a new residential UsageZone. For "Mixed-use" buildings, the volume in excess amounts to 2,981,263 m³, i.e., 4.8% of the 62,729,605 m³ from this building class in the city model. Further details are given in Table 3.
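The figures reported above can be reproduced with a few lines of arithmetic. The sketch below only uses numbers stated in the text; the helper names are illustrative.

```python
TOTAL_BUILDINGS = 171_515
REFERENCE_VOLUME_M3 = 259_064_056
POSITIVE_DIFF_M3 = 28_258_106    # UsageZones exceed the building volume
NEGATIVE_DIFF_M3 = -21_798_912   # UsageZones smaller than the building volume


def delta_class(usage_zone_volume, building_volume):
    """Label a building by the sign of the UsageZone minus city-model volume."""
    diff = usage_zone_volume - building_volume
    if diff > 0:
        return "dV+"
    if diff < 0:
        return "dV-"
    return "dV0"


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to one decimal."""
    return round(100 * part / whole, 1)


net_diff_m3 = POSITIVE_DIFF_M3 + NEGATIVE_DIFF_M3           # 6,459,194 m3
net_share = percent(net_diff_m3, REFERENCE_VOLUME_M3)        # 2.5
positive_share = percent(POSITIVE_DIFF_M3, REFERENCE_VOLUME_M3)
negative_share = percent(NEGATIVE_DIFF_M3, REFERENCE_VOLUME_M3)
mixed_use_share = percent(2_981_263, 62_729_605)
```

The net difference of 2.5% hides gross differences of roughly 11% and 8% in opposite directions, which is why the per-class breakdown is more informative than the global total.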
A completely automatic solution of the above-mentioned inconsistencies resulting from the data fusion process is not possible, and in order to reduce their effect, it was decided to keep the buildings classified as mentioned before and to use the volumetric city model for the computation of the living spaces, with the exception of the "Mixed-use" buildings, for which the respective UsageZone volumes were considered.
Finally, a TIN-based DTM was also generated, as well as datasets like land use, the tree cadastre, and city boundaries. As a destination format, CityGML 2.0 was chosen and the generated CityGML files were finally imported in a PostgreSQL/PostGIS instance of the 3D City Database. In order to visualize the city model, the CesiumJS platform is used and a connection to the PostgreSQL database was established using the 3DCityDB web-map client's interface [59] based on PostgREST. PostgREST [60] is a stand-alone web server that turns a PostgreSQL database directly into a RESTful API (Application Programming Interface).

Living Space (Step 2)
Once the datasets are collected and the 3D city model generated, the KPIs described in Section 2, the size of living-space, as well as a number of other ancillary zonal parameters are computed. In general, all calculations are carried out by means of classical GIS operations consisting mostly of different spatial overlay and aggregation functions, and take place at database level: several stored procedures were written in the PostgreSQL procedural language PL/pgSQL exploiting also the additional PostGIS spatial functions. Table 4 presents an excerpt of these zonal parameters. They can be computed at different spatial levels of aggregation: from the block level, to the district/quarter one, up to the whole city. A number of database views is then generated in order to enable data exploration, analysis, and visualization, e.g., via QGIS (see Figure 4) or in Grasshopper (Figure 20).
At this point, the actual selection of the existing "template" neighborhoods can take place (step 2C). A number of predefined queries has been prepared for this project, but it can be further extended or customized depending on the specific selection criteria that are defined during the project, or resulting from the feedback session in step 4. In particular, the calculation of the living space is aimed to obtain, for each neighborhood, the average size of a "typical" dwelling. The living space consists of two main sub-spaces: the volumetric indoor space and the areal open space (computed in 2D). Indoor space is further specialized into residential and non-residential spaces. In Figure 15, an overview is given of the building usages and space classes considered.
For the indoor space, the gross volumes of the residential and non-residential spaces of each building are required. All indoor-space values are then aggregated and averaged at the neighborhood level. In the case of Amsterdam, the 3D city model was used as the only source of information for this operation. The non-residential 3D indoor space was computed from the volume of all non-residential buildings (both single- and multi-function) and the non-residential UsageZones of the "Mixed-use" buildings. In a similar way, the residential 3D indoor space was obtained from the "Residential" buildings and the residential UsageZones of the "Mixed-use" buildings. When it comes to the 2D open spaces, some layers in the BGT dataset were filtered and processed to calculate the amount of traffic, green, parking, bike path, water, and pedestrian areas in Amsterdam. Similarly to the 3D indoor spaces, all open spaces were then aggregated to obtain a characteristic surface value for each neighborhood.
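The aggregation described above can be illustrated with a small in-memory sketch. In the actual workflow these operations run as PL/pgSQL procedures in the database; here the input tuples, the dictionary keys, and the choice of averaging by the number of households per neighborhood are assumptions made for illustration.

```python
from collections import defaultdict


def living_space_by_neighborhood(indoor_zones, open_spaces, households):
    """Average indoor volumes and open-space area per household.

    indoor_zones: iterable of (neighborhood, zone type, gross volume in m3)
    open_spaces:  iterable of (neighborhood, area in m2)
    households:   dict mapping neighborhood -> number of households
    """
    totals = defaultdict(
        lambda: {"residential_m3": 0.0, "non_residential_m3": 0.0, "open_m2": 0.0}
    )
    for hood, kind, volume in indoor_zones:
        key = "residential_m3" if kind == "Residential" else "non_residential_m3"
        totals[hood][key] += volume
    for hood, area in open_spaces:
        totals[hood]["open_m2"] += area

    result = {}
    for hood, sums in totals.items():
        n = households.get(hood, 0)
        if n > 0:  # skip neighborhoods without household counts
            result[hood] = {key: value / n for key, value in sums.items()}
    return result


example = living_space_by_neighborhood(
    [("A", "Residential", 3000.0), ("A", "Non-residential", 600.0),
     ("A", "Residential", 1000.0)],
    [("A", 500.0)],
    {"A": 10},
)
```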
Once the living-space values are computed, they can be used to generate and visualize a "typical" building for simple visual inspection and comparison, as shown for example in Figure 16: the values of two neighborhoods in Amsterdam (Bellamybuurt Noord and Passeerdersgrachtbuurt) are imported in Rhinoceros/Grasshopper to parametrically generate a prototypical building on the assumption of a common footprint of 10 × 5 m, which is indeed a rather common size for Dutch houses.

Parametric Urban Design (Step 3)
The third step of the methodology is where the "city of tomorrow" starts being shaped by urban planners combining professional knowledge, experience, and data from the "city of today". In this work, the chosen environment to design the new development area (via multiple scenarios) is Rhinoceros, as 3D modelling software, and Grasshopper, as its embedded graphical algorithm editor. The reason is that these are two very well-known and commonly used software packages in the field of urban and architectural parametric design.

Internal and External Design Parameters
Two different sets of design parameters and objects were first defined and integrated into a newly-developed prototypic workbench based on Rhinoceros/Grasshopper. The first one, defined as a set of "external parameters and constraints", is meant to implement guidelines established by a municipality or another policy-making institutional body. In the case of the Municipality of Amsterdam, for example, these guidelines are conceived to allow for a seamless transition between the new development area and its surroundings. They prescribe the maximum height of different types of buildings, as well as the overall development targets to be achieved in the new development area, e.g., in terms of total number of new dwellings, number of new jobs, schools, etc. A list of such parameters and constraints in the case of Amsterdam is presented in Section 4.
The second set of so-called "internal parameters" complements the first one and provides the designer with additional information during the design process to obtain a wider panorama of the urban characteristics. The implemented parameters are well-known and commonly used in architecture and urban planning. Some of these parameters can be set by the user and others are interactively calculated on-the-fly. The user can, for example, decide upon the typology of the building to design in terms of footprint shape (e.g., solid or courtyard building, Figure 17). Similarly, the user can define and use different types of road configurations. In general, road intersections are considered as traffic space (Figure 18), while road segments are subdivided into the six outdoor functions (Figure 19). Table 5 lists some of the internal parameters with some ancillary information.

Note to Table 5: for the road segments, one of the values results from the subtraction of all the previous outdoor functions from the total width of the road (see Figure 19).

Connection to the Database
In order to retrieve information from the 3D city model and use the previously computed living-space parameters, a data interface is needed for bidirectional data exchange between Rhinoceros/Grasshopper and the 3D city model stored in PostgreSQL/PostGIS. Such an interface was implemented using GH Python Remote [61] and Psycopg2 [62]. GH Python Remote is a set of open-source tools in Grasshopper enabling the execution of Python code directly in the GH Python component. Psycopg2 is the most popular PostgreSQL adapter for the Python programming language. An example of how to establish a connection between Rhinoceros/Grasshopper and a PostgreSQL instance of the 3DCityDB (including the Energy ADE) was described in Wang [63]. An excerpt of data extracted from the city model and imported into Rhinoceros/Grasshopper is shown in Figure 20.
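A minimal sketch of the database side of such an interface is given below. It relies only on standard Psycopg2 calls (`psycopg2.connect`, cursors, `%s` placeholders); the view name `living_space_view`, its columns, and the connection parameters are hypothetical, and the GH Python Remote part is deliberately left out.

```python
def connect_to_citydb(host, dbname, user, password, port=5432):
    """Open a connection to the PostgreSQL/PostGIS instance of the 3DCityDB."""
    import psycopg2  # imported lazily: only needed when a real connection is opened
    return psycopg2.connect(
        host=host, port=port, dbname=dbname, user=user, password=password
    )


def fetch_living_space(conn, neighborhood_names):
    """Fetch living-space parameters for the given template neighborhoods.

    `living_space_view` and its columns are placeholders for whatever view
    exposes the precomputed parameters in the database.
    """
    sql = (
        "SELECT name, res_volume_m3, nonres_volume_m3 "
        "FROM living_space_view WHERE name = ANY(%s)"
    )
    with conn.cursor() as cur:
        # Psycopg2 adapts a Python list to a PostgreSQL array
        cur.execute(sql, (list(neighborhood_names),))
        return cur.fetchall()
```

Passing the neighborhood names as a query parameter, instead of formatting them into the SQL string, leaves quoting to the adapter and avoids SQL injection when the names come from user input in the design tool.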

Interactive Design Process
Once the data is retrieved, the workbench developed in Rhinoceros/Grasshopper can be used as a simple front-end to explore and visualize (3D) data of the neighborhoods (Figure 20), or as a parametric modelling tool. In the latter case, the user (here an urban designer) must make design decisions and provide additional input, namely which existing buildings in the new development area will be kept, which ones will be demolished, which plots will be used for the new buildings, and the layout of the new street network (as road centerlines). The geometries of the existing buildings to keep are imported from the 3D city model. The new plots are generated based on a parcellation plan of the area provided as input by the user. Leaving the user free to set both the street network and the plots fulfils a requirement by the urban designers: these input datasets could themselves be the result of a previous design process, hence the need to import them manually instead of retrieving them from the existing 3D city model. Finally, the designer may want to create multiple scenarios using different input datasets. It must be noted that the automatic generation of a new street network and related parcellation is outside the scope of this work: the focus here is specifically on the buildings and the open spaces. It would, however, be possible, if desired, to import the existing street centerlines and the plot polygons from the PostgreSQL database.
Finally, the list of "template" neighborhoods is loaded. It contains, for each neighborhood, the set of parameters regarding the size of the living space, namely the average gross volumes (both for residential and non-residential). For each new scenario, the user can choose which template neighborhood to use as reference. Then, the corresponding total gross volume to be built in the new development area is obtained by multiplying these two parameters by the planned number of new dwellings, while the average size of the working space is obtained by dividing the total non-residential gross volume by the number of planned jobs.
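The dimensioning step can be illustrated with a short worked example. The numbers of dwellings and jobs are those reported for "Sloterdijk One" in the case study, whereas the per-dwelling volumes of the template neighborhood (300 and 60 m³) are purely hypothetical placeholders.

```python
def planned_volumes(res_m3_per_dwelling, nonres_m3_per_dwelling, dwellings, jobs):
    """Derive the total gross volumes and the average working space per job."""
    total_res = res_m3_per_dwelling * dwellings
    total_nonres = nonres_m3_per_dwelling * dwellings
    working_space_per_job = total_nonres / jobs
    return total_res, total_nonres, working_space_per_job


# Hypothetical template values; dwellings and jobs from the case study
total_res, total_nonres, per_job = planned_volumes(
    res_m3_per_dwelling=300.0,
    nonres_m3_per_dwelling=60.0,
    dwellings=11_220,
    jobs=7_480,
)
```

Under these assumed template values, the new area would need 3,366,000 m³ of residential and 673,200 m³ of non-residential gross volume, i.e., 90 m³ of working space per planned job.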
For each selected plot where a new building can be built, the user can choose which typology of building to use, for example, solid or courtyard shape. The footprint depends, on the one hand, on the shape of the parcels and, on the other hand, on the selected building typology (e.g., for a courtyard building, the thickness of the building is an additional design parameter). The new buildings can be residential, non-residential, or mixed-use. When generating the new buildings, their volume is computed and deducted from the total planned gross volume previously computed. The open spaces are generated from the centerlines of the street network. Depending on the selected road typology, the amount of space for road, bike lanes, pedestrian spaces, green, and water areas is generated automatically (Figure 19).
The parametric modelling tool is accessible via color-coded panels: the yellow ones provide real-time information to the user, the grey ones allow the design parameters to be edited (Figure 21). The parametric modelling process is guided by design rules: if certain constraints are not respected, warning messages are issued to inform the user about irregularities or oversized generated objects, together with suggestions that guarantee the minimum space quality standards in the project. An example is shown in Figure 22. The user is warned and assisted with regard to the following:

- The maximum allowed height of a building is reached;
- The minimum distance between buildings is reached;
- The selection of the plots, depending on their area. This is intended to guide the user in the selection of the plots for solid and courtyard buildings.

During the design process, additional information from the surroundings such as 2D and 3D maps can be added to help the user (Figure 23). This can be done using external sources of information or by retrieving data directly from the 3D city model in the database. These additional "layers" can be conveniently enabled and disabled in Rhinoceros when needed. Further details about the implementation and the functionalities are described in García González [64]; a demonstration video can be retrieved from https://www.youtube.com/watch?v=cPYT5_cFIgw&t=24s (accessed on 15 August 2020).
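The design-rule checks listed above could be expressed along the following lines. This is only a sketch in plain Python, not the Grasshopper implementation: the function names, the warning texts, and the plot-area threshold used to suggest a typology are hypothetical.

```python
def check_design_rules(height_m, max_height_m, distance_m, min_distance_m):
    """Return the warnings triggered by a newly generated building."""
    warnings = []
    if height_m >= max_height_m:
        warnings.append("maximum allowed building height reached")
    if distance_m <= min_distance_m:
        warnings.append("minimum distance between buildings reached")
    return warnings


def suggest_typology(plot_area_m2, courtyard_min_area_m2=1500.0):
    """Suggest a building typology from the plot area.

    The 1500 m2 threshold is an arbitrary placeholder, not a project value.
    """
    return "courtyard" if plot_area_m2 >= courtyard_min_area_m2 else "solid"


issues = check_design_rules(height_m=60.0, max_height_m=60.0,
                            distance_m=20.0, min_distance_m=15.0)
```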

Export and Sharing of the Scenarios
Once the interactive parametric modelling is completed, different types of output can be generated for each scenario. In general, the user can store them either locally on their machine or online in the PostgreSQL/PostGIS database. Locally, each scenario can be stored as a set of .dwg-files containing the geometries and their IDs, and a .csv-file containing the IDs and all associated attributes. At database level, the 3DCityDB first had to be extended in order to cope with scenarios, as it originally lacks this functionality. To this end, inspiration was taken from preliminary work on a CityGML Scenario ADE [56,65], for which both a UML diagram and a database implementation are available. The Scenario ADE was conceived to deal with multiple representations of the city (or portions of it) at the same time, considering geometries, attributes, and, optionally, additional parameters that might be needed to describe how a certain scenario is obtained and which constraints it must obey [66]. Examples are the parameter settings to perform a certain simulation or, as in the case of this work, the internal and external design parameters.
Using the above-mentioned Python-based data interface, data are then exported from Rhinoceros/Grasshopper and imported to ad hoc temporary tables. Then they are automatically converted to CityGML-compliant data and finally stored into the 3DCityDB via database stored procedures written in the procedural language PL/pgSQL. From the objects generated in Grasshopper, the equivalent UsageZones (and parent new buildings) are stored in the 3DCityDB, as well as the open spaces (as CityGML TransportationComplex objects), and information about their appearance.
Once the data are in the 3DCityDB, scenario geometries are exported as 3D Cesium Tiles via FME, while the attribute data are exposed via PostgREST.
Both geometries (as 3D Cesium Tiles) and attributes (via PostgREST) are then consumed by a customized version of the 3DCityDB web-map client, which itself is based on CesiumJS. The 3DCityDB web-map client has been adapted and extended in order to allow for user interaction with the different scenarios (Figure 24). Once a certain project area is selected, the user can choose which scenario to load. At this point, for each scenario, the web-map client takes care of loading all necessary information, hiding the existing buildings that are going to be demolished, adding the corresponding scenario layers with the new buildings, the new UsageZones, and the new open spaces, and setting the connection parameters so that all layers are linked to the corresponding database tables via PostgREST for attribute retrieval upon click.
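To give an idea of the attribute retrieval upon click, the sketch below builds a PostgREST request URL using its standard filter syntax (`column=eq.value`) and the `select` parameter. The base URL, the table name `usage_zone`, and the column names are hypothetical.

```python
from urllib.parse import urlencode


def build_attribute_url(base_url, table, filters, columns):
    """Build a PostgREST URL returning `columns` for rows matching `filters`."""
    # PostgREST expresses equality filters as column=eq.value
    query = {column: f"eq.{value}" for column, value in filters.items()}
    query["select"] = ",".join(columns)
    # Keep commas unescaped so the select list stays readable
    return f"{base_url.rstrip('/')}/{table}?{urlencode(query, safe=',')}"


url = build_attribute_url(
    "http://localhost:3000/",
    "usage_zone",
    {"scenario_id": 2, "building_id": 42},
    ["usage", "volume_m3"],
)
```

The web-map client would issue a GET request to such a URL and display the returned JSON records for the clicked object.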
Further scenario comparison functionalities (either via a dashboard containing graphs or a data comparison interface) and an overall scenario management GUI are implemented in PHP and embedded in the web-map client GUI, too. The PHP code is first generated using PHPGenerator by SQL Maestro [67] and then further adapted and customized. Via the PHP interface, the user can explore the scenario data at a higher level of detail (e.g., via master-detail views, etc.), and generate PDF reports that contain all design parameters of that specific scenario.

Evaluation and Feedback (Step 4)
Once the scenario generation is completed, another step is planned in the methodology. Step 4 is meant to allow the different stakeholders involved in the decision making to evaluate the different proposals, provide feedback, and decide whether it is necessary to review some decisions made in steps 1, 2, or 3. For example, such feedback might consist in a selection of different neighborhoods as design templates, changes to decisions made to calculate the size of the living space, or the need to generate more scenarios using different design parameters. In any case, the proposed methodology allows data scientists, urban planners and stakeholders to review the decisions made during the whole design process, resulting in an interactive and iterative way of generating scenarios. Furthermore, the tools used in this research regarding data processing and 3D modelling allow the revisiting of design parameters in a relatively short period of time.

Case Study and Results
"Sloterdijk One" is the name of the second stage of the Haven-Stad project in Amsterdam. The position and the approximate extents of the study area are shown in Figure 1. "Sloterdijk One" was chosen as a study area because of its projected high-density (192 households/ha) and predominantly residential use (minimum 80%). The Municipality of Amsterdam provides a list of guidelines related to the Haven-Stad project. They are contained in the "Ontwikkelstrategie Haven-Stad" document (in English: Development Strategy Haven-Stad) [68]. Additional parameters (and constraints) are based on the guidelines of the "Sloterdijk One" project published in the "Sloterdijk I Strategie nota" (in English: Strategy Document Sloterdijk One) [69]. Other requirements prescribe a high quality of life and three types of dwellings, each corresponding to a different socio-economic status: high-level residences, medium-level residences, and social housing.
The KPIs previously calculated for each neighborhood in Amsterdam were used to find the most similar areas to "Sloterdijk One" within the present city. The land use index (RA-I) was used to find predominantly residential areas with a residential index higher than 80%. The housing density (AND-I) was used to find high-density neighborhoods because the planned density for "Sloterdijk One" is 192 households/ha, close to the highest in Amsterdam (211 households/ha). The selection criteria regarding household density were set to identify neighborhoods with more than 100 households per hectare, which in the Netherlands is considered as high-density. As the Haven-Stad project aims to replicate the highest quality of life already present in the city, the neighborhoods with a minimum score of 7 (out of 9) in the Quality of Life Index (QOL-I) were selected as areas of "urban success". Finally, the average year of construction of the building stock (ABS-I) was used to understand whether the area is a recent development or a historical part of the city. Accordingly, only the neighborhoods with an ABS-I higher than 1920 were chosen.
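The selection criteria above amount to a simple filter over the neighborhood KPIs. In the actual workflow this is a database query; the sketch below shows the same logic in plain Python, with hypothetical dictionary keys and made-up sample records.

```python
MIN_RESIDENTIAL_INDEX = 80   # RA-I, percent (strictly higher)
MIN_HOUSING_DENSITY = 100    # AND-I, households per hectare (strictly more)
MIN_QUALITY_OF_LIFE = 7      # QOL-I, out of 9 (minimum score)
MIN_CONSTRUCTION_YEAR = 1920 # ABS-I (strictly higher)


def select_templates(neighborhoods):
    """Return the names of the neighborhoods satisfying all four criteria."""
    return [
        n["name"]
        for n in neighborhoods
        if n["ra_i"] > MIN_RESIDENTIAL_INDEX
        and n["and_i"] > MIN_HOUSING_DENSITY
        and n["qol_i"] >= MIN_QUALITY_OF_LIFE
        and n["abs_i"] > MIN_CONSTRUCTION_YEAR
    ]


# Made-up records, for illustration only
sample = [
    {"name": "X", "ra_i": 85, "and_i": 150, "qol_i": 8, "abs_i": 1930},
    {"name": "Y", "ra_i": 90, "and_i": 80, "qol_i": 9, "abs_i": 1950},
    {"name": "Z", "ra_i": 88, "and_i": 120, "qol_i": 7, "abs_i": 1900},
]
selected = select_templates(sample)
```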
The average price of the households per square meter of residential construction (SEL-I) was used as a proxy value for the socio-economic status of a neighborhood. Therefore, the following criteria were defined to classify the neighborhoods:

- Social housing: below 3000 €/m²,
- Medium-level housing: between 3000 and 4000 €/m²,
- High-level housing: above 4000 €/m².

The socio-economic status of the neighborhoods was not used in the selection criteria; instead, it was used to understand the type of each neighborhood in the query result.
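The three price classes can be written as a small classification function. The thresholds come from the text; how the exact boundary values (3000 and 4000 €/m²) are assigned is not specified, so including both in the medium-level class is an assumption of this sketch.

```python
def housing_class(sel_i_eur_per_m2):
    """Classify a neighborhood by its SEL-I value (EUR per m2)."""
    if sel_i_eur_per_m2 < 3000:
        return "Social housing"
    if sel_i_eur_per_m2 <= 4000:  # boundary handling is an assumption
        return "Medium-level housing"
    return "High-level housing"


classes = [housing_class(v) for v in (2500, 3500, 4500)]
```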

Living Space (Step 2)
The result of the query consists of 10 neighborhoods (in Dutch: buurten) to be used as "template" neighborhoods. An overview with the respective KPIs is presented in Table 6 and a map showing their position in the city center is shown in Figure 25. For each of these neighborhoods, the size of the living spaces was computed and later used to design the new scenarios for "Sloterdijk One". A selection of the computed parameters is shown in Table 7.
From the analysis of the selected neighborhoods, it is interesting to notice that all of them are in the high-level housing range, with an SEL-I higher than 4000 €/m². This means that the Municipality of Amsterdam envisioned "Sloterdijk One" as a neighborhood with characteristics similar to those of the high-level housing in the city. In this case, these areas are located in the central part of the city because of the combination with high quality of life values.
The ABS-I of the selected neighborhoods (between 1920 and 1952) in combination with their location reflects the historical moment in which they were planned. These buurten were built before the two great expansion plans of the city: the "Algemeen Uitbreidingsplan van Amsterdam-AUP" (in English: General Expansion Plan of Amsterdam), adopted by the city council in 1935 [70] and the "Zuidoost Structuurplan 1965" (in English: Structure Plan South-East 1965) [71].
These two plans represent exemplary cases of modern urban planning with lower densities and larger open spaces. Besides, during the last decades, the high demand for housing in the city center has led to the subdivision of large(r) historical dwellings into smaller ones, considerably increasing the density in those areas.

Urban Design (Step 3)
The external design parameters were taken from the "Ontwikkelstrategie Haven-Stad" document. Table 8 contains an excerpt of these parameters for "Sloterdijk One". In fact, "Sloterdijk One" was already pre-dimensioned by the Municipality of Amsterdam, mainly by means of two parameters: the number of households (11220) and the number of jobs (7480). As a consequence, these two parameters were considered the most important ones in the design tool. More precisely, the "Sloterdijk One" guidelines request that new dwellings be distributed as follows: social housing (30%), medium-level housing (40%), and high-level housing (30%). The Floor Space Index (FSI, i.e., the ratio between the buildings' gross floor area, including all stories, and the size of the parcels on which they are located) must be between 2.2 and 3.5. The split between residential and non-residential use is 80% and 20%, respectively. Additional constraints are:
Whenever possible, these guidelines were formalized as parameters or constraints, and eventually implemented in the Rhinoceros/Grasshopper design tool. However, some constraints could not be mapped and transformed directly into a numeric value, e.g., in the case of the concept of "high quality of life".
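To make the numeric guidelines more tangible, the sketch below derives the number of dwellings per housing class from the figures quoted above and checks an FSI value against the prescribed range. The function names and the example gross floor area and parcel size are hypothetical; the code does not reproduce the Rhinoceros/Grasshopper implementation.

```python
TOTAL_HOUSEHOLDS = 11220  # from the guidelines quoted in the text
DWELLING_SHARES_PCT = {"social": 30, "medium-level": 40, "high-level": 30}
FSI_MIN, FSI_MAX = 2.2, 3.5


def split_households(total: int, shares_pct: dict[str, int]) -> dict[str, int]:
    """Distribute households over the classes using integer percentages.

    Integer division is used; for the figures above the shares divide
    exactly, so no household is lost to rounding.
    """
    if sum(shares_pct.values()) != 100:
        raise ValueError("shares must add up to 100%")
    return {name: total * pct // 100 for name, pct in shares_pct.items()}


def floor_space_index(gross_floor_area_m2: float, parcel_area_m2: float) -> float:
    """FSI: gross floor area of all stories divided by the parcel area."""
    return gross_floor_area_m2 / parcel_area_m2


def fsi_within_guidelines(fsi: float) -> bool:
    return FSI_MIN <= fsi <= FSI_MAX


dwellings = split_households(TOTAL_HOUSEHOLDS, DWELLING_SHARES_PCT)
# Hypothetical parcel: 300,000 m2 of gross floor area on 100,000 m2 of land.
example_fsi = floor_space_index(300_000.0, 100_000.0)
```

With these inputs, the split gives 3366 social, 4488 medium-level, and 3366 high-level dwellings, which add up to the 11220 planned households.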
Regarding the road network, "Sloterdijk One" is today an active low-density industrial area with existing road infrastructure and land subdivisions in parcels. For the modelling purposes in the design tool, the network layout was preserved, but the roads were (re)classified as primary, secondary, and tertiary. This allows the modeler to generate new street cross-sections.
Once the internal and external design parameters were set, and the parameters for the size of the living space for the selected "template" neighborhoods were imported, the actual interactive generation of the scenarios took place as described in Section 3.3. As a result, several scenarios were generated that differ in terms of building size, position, and typology. They were subsequently stored back into the database and made accessible via the CesiumJS web-map client. An overview of the different scenarios is presented in Figure 26.
Figure 26. Examples of scenarios generated for "Sloterdijk One" using different "template" neighborhoods.

Results
A detailed analysis and evaluation of the different scenarios from an urban-planning point of view is beyond the scope of this article. However, to facilitate the comprehension of how the design tool works, 10 additional scenarios were generated. These additional scenarios were generated specifically for comparison purposes: the number of buildings, their typologies and layout, the road layout, as well as all remaining parameters were kept the same, de facto reducing the choices of the designer once the first scenario is defined. The only differences among them consist in the size of the indoor space parameters, which stem from the different template neighborhoods.
In the following, only two scenarios are presented and discussed: the first one based on Bellamybuurt Noord and the second one based on Passeerdersgrachtbuurt. The former has the smallest amount of residential and non-residential space (in total: 1,007,100 m²), while the latter has the largest (1,288,200 m²). Table 9 gives an excerpt of the parameters extracted from each scenario, and Figure 27 shows both scenarios visualized in CesiumJS. Although both of them satisfy the requirements of the municipality in terms of new planned households and number of jobs, they are quite different regarding their (volumetric) size. For example, the tallest buildings in the scenario based on Passeerdersgrachtbuurt are 3 stories higher than the tallest ones in the other scenario.
The differences are recognizable even from a simple visual inspection. The tallest buildings in the Bellamybuurt Noord-based scenario are of the mixed-use class (max height: 27.6 m), while in the Passeerdersgrachtbuurt-based scenario, the tallest buildings are residential and 39.8 m high. The latter scenario therefore conflicts with the maximum height constraints set by the municipality.
The municipality's planned average dwelling size is 100 m²; it includes both residential and non-residential spaces and is obtained by dividing 1,122,000 m² (total gross floor area) by 11,220 (number of planned households). Both values are given in Table 8. In the Bellamybuurt Noord-based scenario, this value is 88.1 m², while in the Passeerdersgrachtbuurt-based scenario it is 107.3 m² and therefore above the reference value (Table 9).
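The reference value can be reproduced with a one-line division, and the two scenario values can then be compared against it. In the sketch below, the two per-scenario figures are the ones reported in the text; they are taken as given and not recomputed, since the underlying totals are listed in Table 9. Variable names are illustrative.

```python
TOTAL_GROSS_FLOOR_AREA_M2 = 1_122_000  # from Table 8, as quoted in the text
PLANNED_HOUSEHOLDS = 11_220            # from Table 8, as quoted in the text

# Planned average dwelling size (residential plus non-residential space).
reference_m2 = TOTAL_GROSS_FLOOR_AREA_M2 / PLANNED_HOUSEHOLDS

# Per-scenario averages as reported in the text (taken as given).
reported_m2 = {
    "Bellamybuurt Noord": 88.1,
    "Passeerdersgrachtbuurt": 107.3,
}

# Signed difference to the reference value, in square meters.
deviation_m2 = {
    name: round(value - reference_m2, 1) for name, value in reported_m2.items()
}
exceeds_reference = [name for name, d in deviation_m2.items() if d > 0]
```

Only the Passeerdersgrachtbuurt-based scenario ends up in `exceeds_reference`, in line with the observation above.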
These simple examples show that the size of the living space can play a relevant role in urban development projects, as it may help to sort out apparently possible scenarios that are, however, not compliant with the overall regulations.
Socio-economic and monetary values are not yet present as parameters in the design tool, but the total amount of new gross floor area in new buildings is provided for each scenario as one of the output parameters. It may be considered as a proxy value for developers to calculate the total investment costs. Nevertheless, analyzing the scenarios may give insight into some qualitative matters, e.g., the characteristic factors that relate new and existing building blocks to one another, or their possible added value as influenced by the proximity to amenities or other contextual utilities.

Discussion
This section contains some further reasoning about the current limitations and shortcomings of the proposed methodology, as well as some ideas for future improvements. Finally, the overall direction that this work could lead to from the research point of view is described.

Current Limitations
Since the proposed methodology is largely data-driven, it is clear that all input data sources should be as complete and reliable as possible. Perfect and error-free datasets are hardly achievable in reality; this is one of the reasons why only official data sources were used in this work, as they are considered more complete and reliable. Besides, data availability for the whole of the Netherlands makes the methodology potentially extendable to other cities. Nevertheless, a number of issues were encountered, especially when creating the 3D city model, as described in Section 3.1.
Regarding the estimation of the buildings' volume, for example, none of the existing datasets could be used, implying that a volumetric 3D city model had to be generated on purpose. It is, however, worth mentioning that the issues affecting the 3D BAG are known [50] and that a newer and completely updated version is expected during summer 2020. The new 3D BAG 2.0 will offer both improved LoD1 and LoD2 buildings for the entire territory of the Netherlands. At the same time, the Municipality of Amsterdam is working on improving its own LoD2 model.
Regarding the integration of BAG data with the Functiekaart for the generation of the UsageZones, the buildings were classified into five main classes, distinguishing, for example, between residential, mixed-use, and non-residential buildings. Due to a number of issues discovered, the inclusion of the UsageZones volumes in the computation of the living space was limited to the mixed-use buildings. For the remaining building classes, the volume of the entire building was used instead. Besides the reasons already identified in Section 3.1, it is worth mentioning here the assumptions made regarding the conversion factors from net to gross floor area and from gross floor area to gross volume. The decision to adopt just these two values for the entire city is surely an (intentional) oversimplification, which is, however, not uncommon in urban planning: it reduces to a minimum the number of parameters to be set by the user and, in the case of Amsterdam, it has led to a 2.5% volumetric deviation compared to the city model, which altogether is an acceptable price to pay for such a simplified approach (see Table 4). However, significant deviations, both positive and negative (ΔV+, ΔV−), have been highlighted. Further statistical investigations are therefore needed to better understand how such deviations are distributed spatially and with regard to the different building classes and types of UsageZones. Solving the issues related to these deviations would be beneficial, but it must be clear that a number of other issues would remain unsolved, such as the lack of underground volumes in the city model or the errors in associating floor area values with nearby buildings.
One major limitation is the availability of the city model as a single temporal snapshot of the city (for Amsterdam: 2016). Although care was taken to enrich the city model as much as possible with data from the same year, it is clear that using a "static" city model could soon become a limitation, especially with quickly changing/growing cities. Although, for the Amsterdam region, the new AHN4 Lidar data will allow, among other things, for a temporal update, this does not completely solve the problem: city models should be dynamic and change continuously, so that data can be extracted from them according, for example, to certain temporal criteria. Fully "dynamic" city models are, however, yet to come and are still the subject of current research worldwide [72,73].
Apart from the temporal aspects, most of the above-mentioned issues might be solved (or reduced) if an "official", already existing, semantic 3D city model were available that also contained detailed information about the spaces (volumes) of the buildings used for different purposes. This would make a great part of the city modelling step unnecessary, greatly speeding up the whole process, reducing the number of assumptions (and therefore raising the overall accuracy) and, on the whole, enhancing the quality of the results. Although such a semantic 3D city model may not be available yet (at least in the Netherlands), it is conceivable that it will be in the near future, given the current trends in data integration, harmonization, and standardization [74].
When it comes to the developed tool itself, there are a number of limitations, too. The number of identified KPIs, as well as the number of other design parameters, is still rather limited. However, defining new ones depends on the purpose of the analyses and on the availability of data. Therefore, identifying new KPIs and parameters, as well as implementing their calculation, is not an intrinsic major limitation, but rather a question of further development. The same applies to the limited number of building typologies in the parametric modelling phase (i.e., the solid and the courtyard one) and of road composition rules. In fact, the number of possible typologies was kept low in order to focus more on the overall structure and setup of the tool, and to avoid adding further layers of complexity at too early a stage.
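The two-factor conversion mentioned above (net floor area to gross floor area, then gross floor area to gross volume) and the resulting volumetric deviation can be sketched as follows. The factor values used here are hypothetical placeholders chosen only to make the arithmetic easy to follow; the values actually adopted for Amsterdam are those discussed in Section 3.1, and the function names are illustrative.

```python
def estimate_gross_volume(
    net_floor_area_m2: float,
    net_to_gross_area: float,
    gross_area_to_volume_m: float,
) -> float:
    """Convert a net floor area to a gross volume using two fixed factors."""
    gross_floor_area_m2 = net_floor_area_m2 * net_to_gross_area
    return gross_floor_area_m2 * gross_area_to_volume_m


def relative_deviation_pct(estimated_m3: float, model_m3: float) -> float:
    """Signed deviation of the estimate from the city-model volume, in percent."""
    return (estimated_m3 - model_m3) / model_m3 * 100.0


# Hypothetical factors and areas, NOT the values used in the article.
estimated = estimate_gross_volume(
    net_floor_area_m2=82.0,
    net_to_gross_area=1.25,      # hypothetical net-to-gross factor
    gross_area_to_volume_m=3.0,  # hypothetical gross storey height in meters
)
deviation = relative_deviation_pct(estimated, model_m3=300.0)
```

A positive result corresponds to an overestimated volume (ΔV+), a negative one to an underestimated volume (ΔV−).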

Future Improvements and Outlook
In order to overcome some of the above-mentioned limitations, and to continue developing the prototype tool, a number of improvements are planned.
All the design tools presented in this article (see the related work section, and the prototype tool itself) have strengths and weaknesses in assisting the urban planner during the design process. However, it is quite difficult to think of a tool that is "complete" or "best", as the specific purposes that led to its implementation may differ, hence impacting the type, quantity, and quality of the available functionalities. Nevertheless, a comparison between the different existing tools can also be a source of ideas and suggestions for further improvements.
Regarding the prototype presented here, first and foremost, a new round of interviews and feedback is expected with the intended users of this tool (mainly urban designers on the one hand, and stakeholders involved in the decision process on the other hand), in order to test how far the current implementation fulfils their requirements and what is still missing to make it more usable. Before delving into software development, this is indeed the most important part to check what has been done so far and to prioritize in which direction to go.
Taking inspiration from the Ostate [14] and the "Parametric Smart Planning" [15] tools, the goal is to further enrich the number of spatial urban parameters and indicators. However, before focusing on adding further KPIs and more building typologies, there is certainly a need to implement an in-between layer to convert purely numeric output parameters into easier-to-evaluate classified indicators, which is a rather common practice in urban planning. This would facilitate communication of the results and allow the end-user to quickly understand whether a certain value falls in an acceptable range or not.
The Kaisersrot project [9] made clear the importance of having multiple constraints while designing a new development area. The current prototype could be enhanced by adding extra datasets that provide additional hints regarding the distribution of new buildings in the lots. Additionally, more information about topography and land ownership could help to better understand the location of the project as well as open spaces like squares and parks, or open spaces adjacent to hospitals, schools, etc. This would represent a further step towards the seamless integration between the urban planning and the geomatics worlds.
When it comes to the software implementation part, several further developments are already envisioned. In general, attention should be paid to better integrating the different steps and, wherever reasonable, reducing the dependence on proprietary solutions. For example, the export of the scenarios into Cesium 3D tiles could be hard-coded instead of passing through FME. Strictly speaking, and given the modularity of the entire software architecture, even the whole Rhinoceros/Grasshopper element could be swapped with another solution for parametric modelling, e.g., the Mobius modeler [12]. Although this is not a planned change in the near future, there are already examples that could be used as reference, e.g., as in Bilieki et al. [75].
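Such an in-between layer could be as simple as a mapping from a numeric value to a labelled class. The sketch below is one possible shape of it, using the FSI range from the "Sloterdijk One" guidelines (2.2 to 3.5) as an example; the function name and the three labels are assumptions and are not part of the current prototype.

```python
def classify_indicator(value: float, lower: float, upper: float) -> str:
    """Turn a numeric output parameter into an easy-to-read class label."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if value < lower:
        return "below range"
    if value > upper:
        return "above range"
    return "within range"


# Example with the FSI range quoted in the guidelines.
fsi_labels = {fsi: classify_indicator(fsi, 2.2, 3.5) for fsi in (1.8, 2.9, 3.7)}
```

A finer classification (for example, several graded classes per indicator) would follow the same pattern with more thresholds.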
The CesiumJS-based client could be extended to perform more analysis, both in terms of data visualization and interaction. For now, the most "classical" GIS operations are carried out in QGIS, but some query-functionalities could be integrated directly into the 3D environment.
The whole scenario management could be updated drawing on the experience gained so far. The improvements could cover the complete chain, from the data modelling part up to the generation, storage, online visualization, and interaction of the scenarios. Currently, the online visualization is provided via CesiumJS, but further (long-term) improvements could consider other ways of interaction, e.g., via AR (Augmented Reality) or VR (Virtual Reality). Particularly in the field of urban planning, this is still a current subject of research, where solutions covering different applications have been proposed by a number of authors in recent years [76][77][78][79]. Speaking of additional modules, more functionalities could be added and coupled with the 3D city model (and the scenarios) to perform specific simulations. Preliminary tests have already been conducted, e.g., when it comes to coupling tools for micro-climate simulations [80] or energy-related applications [63,81].
In terms of data modelling, the decision to use the UsageZones from the Energy ADE and the Scenario ADE for the scenario management seems the most logical one, given their ready availability both as data models and database implementations. However, it must be observed that the upcoming CityGML 3.0 [82] (the release of the final specifications is expected in the second half of 2020) introduces new concepts that may reduce the dependency on external ADEs. As mentioned in the previous section about the limitations, the current version of CityGML does not support time-dependent variations; therefore, each city model is de facto a static representation of a city at a certain timestamp. The Versioning module and Dynamizer module in CityGML 3.0 are meant to manage time-dependent properties. In particular, the Versioning module manages qualitative changes that are slower in nature, such as the history or evolution of cities (e.g., construction or demolition of buildings), or even multiple versions of the city models. The Dynamizer module focuses more on the management of quantitative changes representing frequent variations of object properties (e.g., variations of attribute values such as energy demands, temperature, etc.), or real-time sensor observations. Although the adoption of CityGML 3.0 will call for deep changes to the current implementation and is therefore a rather long-term improvement, it is expected that having a dynamic semantic 3D city model at the core, acting as a source of integrated, heterogeneous information, will facilitate the interoperability between the different to-be-coupled modules, for whatever specific urban analyses may be needed.

Conclusions
City planners commonly use the number of households as a unit of measure for urban developments. They need to know how many families can be hosted in a given urban project. In this work, the available data from a city (namely Amsterdam) were used to give a spatial dimension to the physical space occupied by a household, i.e., a dwelling. However, the spatial dimension does not only consider the average gross volume of the dwelling, but also some associated 3D and 2D spaces, such as non-residential spaces and open spaces (e.g., roads). All these spaces together are defined as living space, and their definition therefore plays a crucial role in the shaping of a new development area.
The main idea comes from one of the ambitions of the Municipality of Amsterdam itself: analyze areas of "urban success" in the city center and reproduce them in the future expansion areas outside of the city center. Furthermore, the living space of a typical household can be computed for a specific area within the city, for example a district or a neighborhood. This process is meant to help urban planners analyze and take inspiration from areas considered as an urban success.
The move toward the adoption of design parameters extracted from a virtual 3D city model of a "real" city represents an innovative aspect of the methodology. The city of today is quantitatively analyzed and characterized using existing open spatial and non-spatial datasets. The resulting knowledge is then transferred to help design the city of the future. This shows that the adoption of a semantic 3D city model as a hub of integrated and harmonized spatial and non-spatial information facilitates the exchange of information and helps in the analysis of the city of today.
From the urban planning point of view, this is indeed a rather new way of imagining the city of the future that calls, however, for a tight connection between the geomatics, urban planning, and architecture disciplines. In a sort of round trip, one can imagine starting from the geo-domain (e.g., geospatial analyses), then moving to the urban-planning and architecture domains (e.g., design and parametric modelling), and finally closing the circle back to the geo-domain. The structure and the functionalities of the prototype tool described in this article show that a semantic 3D city model can be integrated with existing tools from the GIS and urban design domains.
The semantic 3D city model of Amsterdam has been generated using open data from national and local data sources. At the same time, the availability of such data has allowed the definition of KPIs to assist the urban planner in the computer-assisted design process of a new development area. In Section 4, different spatial configurations have been purposely generated, ceteris paribus, by changing only the size of the living space. Despite the need to introduce some simplifications and assumptions in the design process, the results show that the size of the living space plays a role in the shaping of urban developments. The definition of just the number of new households for the development area is not enough, as the generation of rather different scenarios having the same number of new households has shown. This illustrates that KPIs dealing with the concept of living space do have an influence on the shaping of different scenarios.
The novelty of this work is supported by the apparent lack of similar approaches in the literature. As written in Section 1.1 (Related Work), there exist a number of tools and works that focus on specific aspects of either discipline, but they seldom or only weakly integrate them. Nevertheless, this is indeed a rather new and promising way to go, given, on the one hand, the steady growth of geospatial data and related tools available to cities, and, on the other hand, the increasing availability of and need for tools to perform different types of analyses at urban scale, but with a granularity that reaches down to the architectural scale of the building.
One last aspect that this work tries to address is the traditionally long period of time that passes between the inception of a project and its final realization, i.e., the realization of a new development area.
Project plans are generated and then carried out, but changes "on the fly" are rather hard to achieve, although the existing boundary conditions may have changed in the course of time. Instead of a quasi-static and linear process, the methodology (and the developed prototype tool) aims to add a fast(er) feedback loop in the process, which could affect several steps in the planning process at different times, for example:


Before: to help establish the minimum parameters and constraints of a new project;
During: to review the guidelines and check whether the parameters and constraints are up to date;
After: to adjust the parameters or add extra information to the analyses over time.

The information associated with each scenario is intended to help developers in the different stages of the design process to set up, review, or update the guidelines of housing development projects.
A tool for the interactive and semi-automatic design of new development areas has therefore been implemented and tested with a real-world case study in Amsterdam. Despite its prototypical nature, it already allows the definition of design parameters and constraints based on GIS-based spatial analyses and the city's regulations, and the generation of different design configurations (called scenarios) that fulfil certain criteria. The urban designer working in Rhinoceros/Grasshopper is informed throughout the design process whenever a certain requirement is not met. Several scenario export possibilities are offered in order to make the design process more transparent. The possibility to publish and explore the results in a web-based digital globe, integrated with a 3D city model, is a welcome side effect that can further valorize the use of (open) datasets for design purposes.
In everyday practice, most design proposals end up "forgotten" or locally stored "somewhere", sometimes in closed, proprietary data formats, and, eventually, they become inaccessible or lost. In this work, each scenario is stored in a centralized database and the data are modelled following an international open standard (here: CityGML). The advantages become clearer if one thinks of the added value of re-using each scenario, together with the surrounding city it is embedded into, as input for further domain-specific analyses (e.g., solar irradiance, 3D noise and micro-climate simulations, urban heat island assessment, etc.). These domain-specific applications can differ greatly from each other, but they generally all need large quantities of input data (both geometric and semantic) that would otherwise have to be generated on purpose every time.
Finally, it is worth mentioning that the current tool uses for the most part national open data, so it is reasonable to think that it could be used for other (Dutch) cities, too. There is indeed a relatively small number of local datasets provided by the Municipality of Amsterdam, but it is rather safe to assume that equivalent ones exist for other cities.

Funding: This research received no external funding.
Acknowledgments: The authors would like to express their gratitude to the Municipality of Amsterdam for kindly providing their LoD2 model for testing purposes, and to the following people: Son H. Nguyen (TU Munich) for the hints on how to extend the 3DCityDB web-map client, Daniela Maiullari (TU Delft) for the fruitful discussions, Patricia Hernández-Lelli and Leyden Durand for their help in checking the manuscript and creating some figures, respectively. Finally, the authors would like to thank the reviewers for their constructive comments and suggestions.

Conflicts of Interest: The authors declare no conflicts of interest.

Appendix A
Please refer to Table A1 for the list of abbreviations used in the text. For the listed datasets, a detailed description can be found in Section 3.1.1.