From CFD to GIS: a methodology to implement urban microclimate georeferenced databases

The aim of this contribution is to present a methodology for integrating the Computational Fluid Dynamics (CFD) microclimate simulation environment with a Geographic Information System (GIS). The first workflow involves assigning spatial coordinates to the point data extracted from the CFD, implementing an SQLite database, and connecting to the database to visualise the information.


Introduction
Research on urban climatology is a well-established field of study in environmental design. Many elements simultaneously contribute to characterising climate at the local (with a resolution from 50 m to 100 km) and micro-urban (1 cm to 1 km) scales (Oke, 1987). Variations in wind direction and intensity, air temperature, relative humidity (Lobaccaro et al., 2021) and, more generally, air quality inform the design of the city, with a view to mitigating the Urban Heat Island (UHI). The impact of built environment characteristics (urban density and form, building orientation and positioning, quality of surface materials and presence of ecosystem services) on user comfort, and their role in responding to a rapidly changing climate, has been widely debated in the literature (Losasso et al., 2021; Pollo et al., 2020).
The rapid spread of enabling technologies, with the consequent increase in computing capabilities, is redefining design practice in favour of computational approaches, which require interoperability, modelling, 'simulability' and connectivity (Tucci, 2020). In the context of open space design, Computational Fluid Dynamics (CFD) numerical models, in particular, allow the main environmental and comfort variables to be simulated down to a resolution of less than 1 m, providing scientific support to professionals and decision-makers for the ex-ante and ex-post evaluation of urban transformation scenarios. On the other hand, managing simulations at this resolution requires a high degree of specialisation. Moreover, the data produced are rarely accessible and are difficult to use outside the simulation environment, for example in Geographical Information Systems (GIS), through which georeferenced data actually become information, potentially able to reveal new patterns and to 'cross-reference' data of a different nature for multi-criteria spatial analyses. Therefore, the objective of this paper is to implement a methodology for the integration between CFD and GIS environments, to extract information from the microclimate datasets output by the simulation process. The methodology, the software used and the case study are presented in the following section. Two approaches for the integration and georeferencing of datasets (Workflows) are then introduced, whose potential is discussed in the fifth section. The conclusions report on future perspectives related to the optimisation of interoperability processes, georeferencing and integration of microclimate information in advanced analysis processes.

Materials and methods
The modelling process was conducted in the ENVI-met 5.0.2 environment, which is able to simulate climate-related phenomena occurring in the Urban Canopy Layer (UCL), the lowest layer of the urban atmosphere, extending from the ground to the height of the buildings (Erell et al., 2012). Simulations may concern the current state and possible micro-urban transformation scenarios, and are carried out by assessing the interactions between local climate variables, vegetation, and horizontal and vertical surfaces (Bruse and Fleer, 1998). ENVI-met can reconstruct a large number of environmental variables (e.g. solar radiation, air and soil temperature and humidity, surface temperatures and pollutant concentration), as well as the main comfort indices (e.g. PET, PMV, UTCI). Among the output options, it is possible to extract, for each time step and variable, raster images in bitmap format (*.bmp), datasets with point values referring to each model cell, and NetCDF files (*.nc). Georeferencing of the ENVI-met output data was carried out in ArcGIS Pro, used for demonstration purposes to highlight the possibilities offered by the interaction between the microclimatic information obtained from the model and traditional geographic information, available in numerous formats and accessible through GIS (Fig. 1). The main limitation to the integration of microclimate data in a GIS environment is their georeferencing, due to the 'internal' reference system used by ENVI-met. In Workflow (A), the georeferencing process takes place through the extraction of the point dataset in .xlsx format (manageable in Microsoft Excel) and the attribution of spatial coordinates. The microclimate data were subsequently organised in an SQLite database. Connecting to the database during the work session in ArcGIS Pro makes it easy to work alongside other spatial data.
In the case study, for example, vector data of the buildings in the area of interest were also imported from the OpenStreetMap database (*.osm). Workflow (B), instead, is based on point shapefiles. Following the import of the vector data on the built-up area (*.osm) and the georeferencing of the simulation raster (*.bmp) (Fig. 2) in ArcGIS Pro, a grid with the same number of cells as the modelling grid in ENVI-met was created. Finally, a unique ID was assigned to the microclimate point data to allow the dataset to be merged with the centroids of each cell. As an example, the results for Air Temperature (TA) are considered.

Case study
The survey area covers ~0.18 km² (420 m x 420 m) and is located in north-east Turin, Italy (Cfa climate according to the Köppen-Geiger classification). The peculiarity of the urban fabric, mainly oriented along the north-south axis, does not allow the definition of a true urban canyon. Approximately 65% of the plot's land area is unbuilt; of this, approximately 53% is covered by horizontal green surfaces, giving a ratio of green area to inhabitants higher than the city average (~31 m²/inhab. vs ~24 m²/inhab.) (Fig. 3).

Workflow (A): database implementation
Modelling was carried out using the 'Space' module of ENVI-met. The spatial resolution was set to 2x2x2 m (xyz); therefore, the digitisation grid of the area measures 210x210x30 cells (based on the actual dimensions of the plot). The raster supporting the modelling was downloaded from the Geoportal of Turin¹, from which information on asphalt maintenance (good/sufficient condition, albedo: 0.10; degraded, 0.15; very degraded, 0.20) and public trees was extracted. Setting up the simulation file, using the 'ENVI-Core' and 'ENVI-Guide' modules, required the input of the meteorological boundary conditions. Hourly data on air temperature [°C], relative humidity [%], solar radiation [W/m²], wind speed [m/s] and wind direction [°] were downloaded from the ARPA Piemonte² portal, selecting the urban station closest to the analysed area ('Torino Grassi', ~4 km as the crow flies from the case study); they refer to the hottest day of 2019. Given the size of the plot and the high spatial resolution, the simulation process took 148 h.

Managing point data
The results were extracted via the 'Leonardo' module in *.bmp and *.xlsx formats. The simulated point data were managed in a spreadsheet, in which the spatial coordinates of the points (x,y) are based on the modelling grid in the software's internal reference system (coordinate origin 1,1). The point data are ordered sequentially by ordinate, so that the first 'series' of data covers all the abscissae of ordinate 1 (from 1,1 to 210,1). Given the horizontal spatial resolution (2x2 m), there are 210 abscissae and 210 ordinates in total (44,100 points).
Two columns were added to the table for the spatial coordinates in the WGS84 system, in Decimal Degrees, to allow each point to be georeferenced. Using ArcGIS Pro, the coordinates of the origin point of the internal reference system (1,1) were identified (45.09070, 7.70930). In the spreadsheet, the longitudes of the points sharing the same ordinate were assigned incrementally, adding +2 to the fifth decimal place for each abscissa, given the 2 m size of the modelling cells (e.g. point (2,1) will have coordinates (45.09070, 7.70932), and so on). Similarly, the latitude of the next ordinate was obtained by adding +2 to the fifth decimal place of the starting latitude (e.g. point (1,2) will have coordinates (45.09072, 7.70930)), and the longitude values were then assigned along that row as before. Finally, the 'null' data, referring to cells occupied by buildings, were filtered out and eliminated, in order to handle point data referring exclusively to open spaces (35,500 points).
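The spreadsheet procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `georeference_grid`, the `GridPoint` type and the `step_deg` parameter are hypothetical, and the fixed step of 0.00002 decimal degrees per cell is taken directly from the example coordinates in the text. It assumes that the abscissa drives the longitude and the ordinate drives the latitude, and that points are listed row by row, as in the extracted dataset.

```python
from typing import Iterator, NamedTuple


class GridPoint(NamedTuple):
    x: int      # abscissa in the ENVI-met internal grid (1-based)
    y: int      # ordinate in the ENVI-met internal grid (1-based)
    lat: float  # WGS84 latitude, decimal degrees
    lon: float  # WGS84 longitude, decimal degrees


def georeference_grid(
    origin_lat: float,
    origin_lon: float,
    n_x: int,
    n_y: int,
    step_deg: float = 0.00002,  # assumption: increment per cell used in the text
) -> Iterator[GridPoint]:
    """Yield one georeferenced point per cell, ordered row by row."""
    for y in range(1, n_y + 1):
        for x in range(1, n_x + 1):
            # Offsets are counted from the origin cell (1,1).
            lat = round(origin_lat + (y - 1) * step_deg, 5)
            lon = round(origin_lon + (x - 1) * step_deg, 5)
            yield GridPoint(x, y, lat, lon)


points = list(georeference_grid(45.09070, 7.70930, n_x=210, n_y=210))
```

The step is kept as a parameter so that a different degree-per-cell conversion can be substituted without touching the rest of the logic.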

Database implementation
Once the spatial coordinates were assigned, the point data were organised in an SQLite/Spatialite database. Spatialite is an open-source library that provides a spatial extension to SQLite, a relational Database Management System (DBMS) that concentrates all the necessary database information in a single file. SQLite was chosen for its speed and compactness, which make it well suited to storing large amounts of data. Its widespread use also allows for good interoperability with software other than GIS. At an operational level, an initial 'empty' SQLite database was created in ArcGIS Pro using the 'Catalog' function, and the *.xlsx file with the point data was then imported.
At this point, spatial coordinates were provided via the 'Display XY Data' function.
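Outside ArcGIS Pro, a comparable single-file database can be sketched with Python's standard `sqlite3` module. The sketch below is an assumption-laden illustration rather than the procedure used in the paper: the table name `ta_points`, its columns and the function `build_ta_database` are hypothetical, and the table stores latitude and longitude as plain numeric columns (no Spatialite geometry is created), so a GIS would still need to interpret them as XY data. Rows without a TA value, corresponding to built-up cells, are skipped.

```python
import sqlite3
from typing import Iterable, Optional, Tuple

# (point_id, lat, lon, TA in degrees Celsius or None for built-up cells)
Row = Tuple[int, float, float, Optional[float]]


def build_ta_database(path: str, rows: Iterable[Row]) -> sqlite3.Connection:
    """Create a single-file SQLite database holding open-space TA points."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ta_points (
            point_id   INTEGER PRIMARY KEY,
            lat        REAL NOT NULL,
            lon        REAL NOT NULL,
            ta_celsius REAL NOT NULL
        )
        """
    )
    # Drop 'null' records so that only open-space cells are stored.
    open_space_rows = [row for row in rows if row[3] is not None]
    conn.executemany("INSERT INTO ta_points VALUES (?, ?, ?, ?)", open_space_rows)
    conn.commit()
    return conn
```

Passing `":memory:"` as the path gives a throwaway database for testing; a file path produces the single portable file that motivates the choice of SQLite.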

Data analysis and utilisation
As a test, some information layers were imported from OpenStreetMap³ to observe the integration of spatial data with the SQLite database containing the microclimatic data on TA. Specifically, this approach allows the GIS analyst to work with simulated data, which are easier to obtain than direct measurements. However, the georeferencing process carries a certain degree of error, for two main reasons. The first is intrinsic to the modelling process in ENVI-met, which requires the 'digitisation' of the area using square cells whose size is a function of the available computing capacity and the real extent of the plot. The second is due to the georeferencing process itself, applied by 'picking' several notable points identified in the area. This problem can be partially addressed with dedicated tools provided by the GIS. Since we are dealing with point data, it is possible to represent the points according to their density weighted by a given parameter, identifying the areas that are more or less subject to a certain phenomenon. Fig. 5 shows the TA parameter as a dynamic heatmap, superimposed on the vector data of the buildings and the road network.
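To give a rough idea of how point values can be summarised by area, the sketch below averages a parameter over square blocks of grid cells. It is a deliberately simplified stand-in, not the heatmap rendering performed by the GIS: the function name `mean_by_block` and the block size are hypothetical, and points are assumed to carry their 1-based internal grid indices together with the value of interest.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_by_block(
    points: Iterable[Tuple[int, int, float]],
    block: int = 5,
) -> Dict[Tuple[int, int], float]:
    """Average a parameter over blocks of `block` x `block` grid cells.

    `points` holds (x, y, value) with 1-based grid indices; cells that were
    filtered out (e.g. buildings) are simply absent and do not affect the mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        key = ((x - 1) // block, (y - 1) // block)  # 0-based block index
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Blocks with a higher mean value indicate areas more exposed to the phenomenon under the chosen parameter; the block size trades spatial detail against readability.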

Workflow (B): raster georeferencing
After completing the modelling and simulation process and importing the vector data on buildings and roads (*.osm), the bitmap, i.e. ENVI-met's raster output, is georeferenced. The operation is performed using the 'Add Control Points' function, georeferencing (at least) four notable points. A working grid is then applied to the raster through the 'Fishnet' function, with a number of 'rows' and 'columns' equal to the number of cells used in the ENVI-met environment (in this case, 210x210). The output of the process is a new Feature Class⁴ containing polygons and their centroids, each with its own ID (from 1 to 44,100). To link the air temperature point values unambiguously to the 'cells' created, a matching ID must be assigned, proceeding by 'rows', to the point data in the spreadsheet, which this time do not require individual spatial coordinates. The data source can finally be joined to the Feature Class via the 'Joins' function (Fig. 6).
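The ID-based link between spreadsheet records and grid cells can be sketched as follows. The helper names `row_wise_id` and `join_by_id` are hypothetical, and the sketch assumes that the grid created in the GIS numbers its cells row by row starting from the same corner as the ENVI-met origin; if the numbering in the GIS differs, the ID formula has to be adapted accordingly.

```python
from typing import Dict, Tuple


def row_wise_id(x: int, y: int, n_columns: int) -> int:
    """Return the 1-based ID of cell (x, y) when cells are numbered by rows."""
    return (y - 1) * n_columns + x


def join_by_id(
    centroids: Dict[int, Tuple[float, float]],
    values: Dict[int, float],
) -> Dict[int, Tuple[Tuple[float, float], float]]:
    """Attach a simulated value to each centroid that has a matching ID.

    Centroids without a value (e.g. cells occupied by buildings) are omitted.
    """
    return {
        cell_id: (centroid, values[cell_id])
        for cell_id, centroid in centroids.items()
        if cell_id in values
    }
```

With a 210x210 grid, `row_wise_id` runs from 1 for cell (1,1) to 44,100 for cell (210,210), matching the ID range of the Feature Class described above.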

Discussions
04 | Axonometric view of the case study. Source: Calorio, F. (2021), "Urban morphology and microclimate. The case study of the Regio Parco district in Turin". Master's thesis in Architecture for the Sustainable Project, Politecnico di Torino. Supervisors: R. Pollo, M. Trane
Concerning the development of the ENVI-met software, there are already several options that allow the integration of *.osm vector data and shapefiles 'in input', to facilitate the modelling phases. We also point out the possibility of using the NetCDF⁵ (*.nc) format 'in output' for direct georeferencing of climate simulations. However, the authors encountered considerable difficulties in importing the TA files (*.nc) into ArcGIS Pro. A single hourly file took about 50 minutes to import, and it was impossible to handle the overall climate file for the entire simulation (over 5 GB in size). It was also not possible to import any *.nc file into QGIS⁶. Importing the data into ArcGIS Pro, on the other hand, made it possible to reconstruct a TA trend closely comparable to the raster view (Figs. 2, 5, 6). As regards the limitations of the proposed approach, data from an ARPA meteorological station near the study area were used for the simulation, as mentioned. These data are undoubtedly useful for carrying out the analysis; however, they were acquired at a single 'point' in space, about 4 km from the area of interest. In this regard, a climate reanalysis dataset with a spatial resolution of 2.2x2.2 km (VHR-REA_IT Dataset) was made available during the final drafting phase of the article (Raffa et al., 2021). Its integration will therefore be taken into account in further developments.
In addition, a large amount of point data was extracted even though it relates only to the TA variable in the 3:00 PM simulation. This suggests that, for larger-scale applications, a coarser modelling grid could be used. As far as Workflow (A) is concerned, working through a database connection would make it possible to overcome the main limitations of shapefiles (*.shx, *.shp, *.dbf, etc.), linked to the difficulty of transferring and managing multi-file folders with different formats, metadata and reference systems. Among the critical points encountered is the need for the user to georeference the point data independently, which, in the case of simulations over larger areas, can represent a strong limitation to the implementation of the methodology. We also point out that a Spatialite database was chosen because it can condense all the data into a single file, but it is not the only type of database that can be used (see Oracle, GeoPackage or PostgreSQL). As for Workflow (B), the process described is more agile, but the limitations already mentioned of working locally with shapefiles (the centroid points created by the 'Fishnet' function) remain. The two approaches could be adopted depending on the needs, the scale of the survey and the type of users who will use the information produced. However, the use of a database, whose main strength lies in the possibility of uploading it to a server available to other users, might be preferable in the presence of large quantities of information and a heterogeneous range of users needing to access it.
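The effect of a coarser modelling grid on the volume of point data can be estimated with simple arithmetic. The function below is a hypothetical helper, assuming a square plot and square cells; only the 420 m side and the 2 m cell size come from the case study, while the other cell sizes are illustrative.

```python
import math


def horizontal_point_count(side_m: float, cell_m: float) -> int:
    """Number of horizontal grid cells covering a square plot of side `side_m`."""
    cells_per_side = math.ceil(side_m / cell_m)
    return cells_per_side * cells_per_side


# Points per variable and per time step for the 420 m x 420 m case study plot.
counts = {cell: horizontal_point_count(420, cell) for cell in (2, 4, 5)}
```

Doubling the cell size from 2 m to 4 m cuts the points per variable and time step from 44,100 to 11,025, i.e. by a factor of four.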
05 | Importing vector data from OpenStreetMap into ArcGIS Pro (left); zoom on georeferenced point data by linking to the database (centre); visualisation of data as a dynamic heatmap (right)
06 | Georeferencing the raster in ArcGIS Pro (left); applying the 'Fishnet' function (centre) and assigning TA values via ID; displaying the data (right). For illustrative purposes and for the sake of visibility, the image shows a fictitious grid of 20x20 cells

Conclusions and future perspectives
In the context of microclimate research, the use of data derived from Remote Sensing (RS) techniques, although widely accessible and georeferenced, implies certain limitations. RS data usually have a lower resolution than CFD simulations at the micro-urban scale. Moreover, they mainly concern air and surface temperatures. In addition, RS data may not allow the daily evolution of urban microclimate variables to be investigated, as they are strictly linked to the satellite coverage of a given area. Nor do they allow comfort indices to be derived, which are essential to properly design open spaces, as they act as actual 'tools' to check the quality of the environmental design. On the other hand, CFD software is among the most advanced and widely adopted approaches in climatology studies at the UCL scale, as well as in project quality assessment by architects and other professionals. Therefore, the enabling role of digital technologies for modelling and simulation in the field of micro-urban climatology is crucial for the ex-ante assessment of design options with respect to their impact on user comfort, UHI mitigation and, thus, climate adaptation scenarios. Consolidating the evidence-based approach of technological environmental design at the urban and micro-urban scales, centred on promoting well-being and the quality of spaces, requires updated datasets that are accessible to stakeholders. Through advanced simulation and georeferencing, these datasets can provide advanced knowledge of the main environmental variables under different (hourly and seasonal) climatic conditions, of comfort indices, and of trends in rising temperatures.
Indeed, among the most promising future developments is the possibility of carrying out microclimatic analyses considering not only the observed climatic conditions, but also future projections based on the IPCC's Shared Socioeconomic Pathways (2022), to consolidate the role of the project as a means of coping with forthcoming environmental crises (Gherri et al., 2021). Furthermore, the use of georeferenced information at such a high space-time resolution would make it possible to distinguish thermal comfort conditions as 'boundary conditions' change, in support of epidemiological studies (e.g. heat-health nexus) (Ellena et al., 2020), targeted project analyses (Zhou and Dai, 2021) and context-specific assessments (e.g. neighbourhood climate action) (Johnson et al., 2021). Finally, it might be appropriate to create (a set of) plug-ins to facilitate interoperability between simulation and georeferencing/analysis environments at the 'output' stage too. This would help to expand the designers' 'toolkit', allowing architects, urban planners and engineers to rely on the integration of geographical and modelling data for design at multiple scales. From this point of view, an important step forward could be the automation of the generation of CFD outputs and of the attribution of spatial coordinates to simulated data. These data may be stored in accessible databases (e.g. organised per environmental/comfort variable, season, day, hour, climate change scenario, etc.), to be updated and overwritten with subsequent simulations, making the interaction with the GIS much more agile and opening up interesting scenarios in visualising 'online almost real-time' data.

NOTES
1 Geoportale di Torino: http://geoportale.comune.torino.it/web/
2 ARPA Piemonte: http://www.arpa.piemonte.it/dati-ambientali/datimeteoidrografici-giornalieri-richiestaautomatica
3 OpenStreetMap: https://www.openstreetmap.org/export#map=5/51.500/-0.100
4 A Feature Class is a homogeneous category of elements with the same spatial representation (points, lines or polygons) and a set of attribute columns in common
5 The Network Common Data Format (NetCDF) is a set of software libraries and machine-independent data formats that support the creation, access and sharing of scientific data, mainly used in the field of climate data
6 The same problem was faced by several users of the 'ENVI-met support centre' forum: http://www.envi-hq.