Advancing urban building energy modelling through new model components and applications: A review

Due to rapid urbanisation and the significant contribution of cities to worldwide energy use and greenhouse gas emissions, urban energy system planning is growing more important. Urban building energy modelling (UBEM) draws increasing attention in the energy modelling field due to its inherent capacity for modelling entire cities or building stocks, and the potential of varying data inputs, approaches and applications. This review aims to identify best practices and improvements for UBEM applications by examining previous research, with a focus on the currently least established approaches. Different archetype development procedures are analysed for common problems, six main under-developed input approaches or parameters are identified, and applications for future scenario development are surveyed. By analysing previous studies in related fields, this paper provides an overview of gaps in the published research and possible additions to future UBEM projects that can help expand the existing modelling procedures. Comprehensive human behaviour models with additional aspects beyond occupant presence are identified as a major point of interest. Further research on socio-economic parameters, such as household income and demographics, is also suggested to further improve modelling. This study also underlines the potential for utilising UBEM as a tool for evaluating future climate change scenarios. © 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license.


Background
As our cities continue to grow, the need for management of urban energy demand and sustainable urban planning increases. The last few decades have seen a rapid increase in greenhouse gas emissions and an alarming intensification of climate change worldwide. Urban areas are responsible for 78% of the worldwide energy consumption and around 66% of the global greenhouse gas emissions [1]. Urban areas already hold a greater share of the global population than rural ones, and the gradual shift from rural to urban lifestyles, together with the increasing global urbanisation rate and population growth, has been predicted to cause a 50% increase in urban population by 2050 [2].
Urban development has traditionally focused only on design and land use, but it has recently shifted towards a more sustainable perspective, which increases the need to integrate energy system planning. Residential and commercial buildings are responsible for nearly 40% of the total CO2 emissions, while also having a major potential for energy efficiency [3]. City-scale energy modelling and simulation has emerged as a way to implement energy system planning and create city-integrated sustainable energy solutions.
Building energy modelling (BEM) has been widely implemented for building performance evaluation, building stock analysis and architecture as well as heating, ventilation, and air conditioning (HVAC) design since the 1970s [4]. Individual BEM focuses on building and energy system simulation, utilising thermal modelling. Following the societal developments with increased urbanisation and the awareness of the role of cities in energy system planning, city-scale implementations of BEM approaches have become increasingly common, forming the concept of Urban Building Energy Modelling (UBEM). In city-scale building modelling, multiple building simulations are aggregated on a higher level to be able to quantify the energy performance of buildings for a whole city, district or neighbourhood, and to be able to analyse the energy demand for different time scales. One goal of UBEM is to combine the modelling of thermal systems with the study of the building's interaction with the surrounding urban environment. Ang et al. [5] list four main application categories for fully developed UBEM models: urban planning and new neighbourhood design, stock-level carbon reduction strategies, individual building-level recommendations, and buildings-to-grid integration.
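The aggregation step described above can be illustrated with a minimal sketch. It assumes that each building simulation has already produced a demand profile on a common time grid; the function names (aggregate_profiles, resample) and all numbers are hypothetical and are not taken from any of the reviewed studies.

```python
from typing import Dict, List


def aggregate_profiles(profiles: Dict[str, List[float]]) -> List[float]:
    """Sum per-building demand profiles (kWh per time step) into one district profile."""
    lengths = {len(p) for p in profiles.values()}
    if len(lengths) != 1:
        raise ValueError("all profiles must cover the same time steps")
    return [sum(step) for step in zip(*profiles.values())]


def resample(profile: List[float], steps_per_period: int) -> List[float]:
    """Sum consecutive time steps, e.g. 24 hourly values into one daily total."""
    return [
        sum(profile[i:i + steps_per_period])
        for i in range(0, len(profile), steps_per_period)
    ]


# Hypothetical simulation output for two buildings over four time steps.
buildings = {"A": [1.0, 2.0, 3.0, 4.0], "B": [0.5, 0.5, 1.5, 1.5]}
district = aggregate_profiles(buildings)
coarser = resample(district, 2)
```

The same two operations, summation over buildings and summation over time steps, give district or city totals at hourly, daily or annual resolution.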
BEM has historically been conducted using a wide range of methods and approaches, and is the subject of extensive research, including several comprehensive studies reviewing its use for residential energy demand estimations [6,7]. As UBEM is receiving increasing attention, the knowledge base of the field has also been growing rapidly in the last few years.

Previous reviews
Reinhart and Cerezo Davila [8] were the first to define UBEM as a concept in 2015. They describe UBEM as a meta-scale, bottom-up engineering-based model that applies "physical models of heat and mass flows in and around buildings to predict operational energy use as well as indoor and outdoor environmental conditions for groups of buildings". A main finding of the study is that the largest uncertainty for UBEM simulations is associated with the archetype definition process and how well the created archetypes represent the building stock, and that access to building energy data is severely restricted. In the EU, as compared to the United States, the access to Energy Performance Certificate (EPC) databases or national building registers makes this issue less relevant.
Li et al. [9] review basic associated workflows in UBEM, with a specific focus on model calibration. The study mentions surrounding climate conditions as highly important for energy modelling. Local climate variations, like the Urban Heat Island (UHI) effect, could cause future challenges and have a greater impact on the energy demand. On a national level, this effect is not resolved by dividing the building stock into regional climate zones. The UHI effect is defined as heightened air and surface temperatures in urban areas compared to the surrounding areas, especially at night. In urban areas, as compared to low-density areas, building materials absorb a greater proportion of short-wave solar radiation during the day and then re-radiate it as long-wave radiation less efficiently during the night [10].
Archetype development and urban micro-climate challenges are also brought up by Johari et al. [11], who survey and summarise modelling approaches, techniques, and research gaps in the UBEM field. The authors suggest further research in data mining and machine learning techniques, and point out that building energy use data is often hard to access, making the availability of measured data one of the other main challenges for UBEM, especially for probabilistic approaches and model validation. Further research gaps include studies on improved computing power and the integration of UBEM models with spatio-temporal human activity patterns (urban occupancy and mobility models).
In a recent review, Ali et al. [12] evaluate different UBEM approaches, also including the reduced-order methodology. They agree with previous reviews that data availability and quality are important issues, and that future research should further integrate spatio-temporal human activity patterns and socio-technical factors for improved modelling results.
As a UBEM model can be viewed as a function of what is known about the actual building stock, building stock research can lay a solid foundation for UBEM studies. Mangold et al. [13] review the Swedish residential building stock research up until 2015 and conclude that the energy use in kWh/m2 has decreased, showing an increased efficiency, while the total energy use has been stationary. They argue for more energy-related studies of the building stock that take socio-economic factors and population density into consideration, including efficient usage as a parameter complementary to energy efficiency in building stock analysis, measuring energy use per capita instead of energy use per area. Another area of proposed future research is studies using EPC data to describe the building stock, which have since been published, e.g. by Hjortling et al. [14].
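To illustrate why a per-capita indicator can complement a per-area indicator, the following sketch compares two dwellings. All figures are hypothetical and chosen only to show that the two indicators can rank the same dwellings in opposite order.

```python
def energy_per_area(energy_kwh: float, heated_area_m2: float) -> float:
    """Annual energy use per heated floor area, in kWh/m2."""
    return energy_kwh / heated_area_m2


def energy_per_capita(energy_kwh: float, residents: int) -> float:
    """Annual energy use per resident, in kWh/person."""
    return energy_kwh / residents


# Hypothetical dwellings with the same heated floor area.
family_home = {"energy_kwh": 12000.0, "area_m2": 100.0, "residents": 4}
single_home = {"energy_kwh": 9000.0, "area_m2": 100.0, "residents": 1}

family_area = energy_per_area(family_home["energy_kwh"], family_home["area_m2"])
single_area = energy_per_area(single_home["energy_kwh"], single_home["area_m2"])
family_capita = energy_per_capita(family_home["energy_kwh"], family_home["residents"])
single_capita = energy_per_capita(single_home["energy_kwh"], single_home["residents"])
```

In this example the single-person dwelling looks more efficient per area (90 against 120 kWh/m2) but uses three times as much energy per person (9000 against 3000 kWh/person).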
The review papers summarised above extensively describe the nature of UBEM and the strengths and weaknesses of different building simulation approaches, and go into great detail about model optimisation and calibration. The main focus is on already common modelling approaches and the optimisation of existing techniques, and not on possible extensions of the applied framework. Although statistical and probabilistic methods and analyses are extensively discussed, assessment of uncertain parameters is not given as much attention as data acquisition of more certain parameters. All surveyed papers point out to some extent that access to high quality measured data is a central issue in the UBEM field.
Archetype development is a common topic in the papers, where it is shown that archetype-based UBEM models can provide a more promising and extensive modelling framework with further development possibilities. Some model components are also consistently referred to as lacking in previous research, predominantly spatio-temporal human activity models, machine learning techniques and inclusion of micro-climate models and socio-technical factors. None of the papers thoroughly discuss applying UBEM for integration with larger-scale future scenario estimations, for applications such as future climate models or urban development models. On this scale, only Ali et al. [12] review possible applications for UBEM models.
Under-developed research in the field includes combining the framework with larger scale development models and future scenarios, along with including modelling aspects that are largely ignored in the field, such as socio-technical factors.

Aim and scope of the review
The aim of this review is to identify best practices and possible improvements for UBEM, and to provide recommendations for future research with a focus on the less established aspects within the field.
This paper considers that the UBEM framework can be described as five distinct main processes which follow each other chronologically: (1) data acquisition and processing, (2) building stock segmentation and archetype development, (3) simulation (modelling components), (4) model calibration, and (5) model application (see Fig. 1). In this framework, the survey of previous reviews suggests that further development is required for three of these processes: building archetype development, possible extensions of the applied modelling framework, and possible applications for UBEM models, including future scenarios.
These three areas for further development have influenced the structure of the paper and the following research questions, which guided the literature review and are answered in the paper: This paper assesses and examines a range of modelling aspects for UBEM archetype-based models and applications, and their usefulness for integration in the energy system and for creating energy demand scenarios, rooted in the previous knowledge about the UBEM concept. The purpose is not to go into depth about technical modelling details and optimisation, as previous reviews have thoroughly surveyed the state of the art of UBEM and the theoretical and technical details. The review instead focuses on the general possibilities for development and applications, regarding the possible methods, components or applications that are explicitly mentioned by contemporary research as areas of development in future studies. This paper gives a short overview of UBEM research, and as building stock representation and data availability are regarded as predominantly important aspects in previous research, archetype development procedures with or without using EPC data as source material for classification are also given a thorough overview.
The method for literature review followed the 'Snowballing' procedure, where the search is based on key papers and interesting additional papers are found among cited and citing papers. First, comprehensive UBEM studies and review papers concerned with the entire process from data gathering to simulation and model application were examined to see which aspects were highlighted as important. If an important aspect was identified, that is, if the component or application was highlighted or explicitly mentioned as an area for future development by at least one study, more detailed or specific studies on this aspect were sought. The component or application was then examined more specifically and evaluated for its relevance, and for whether it actually was a research gap, by cross-checking with contemporary research. The search was predominantly conducted through the ScienceDirect database website and the Uppsala University Library.
To limit the extent of this paper, the articles that were deemed most relevant for the scope of this review were selected to be included in the text. The findings are presented in the chronological order previously defined, from over-arching UBEM studies to archetype development, model components and finally future applications.
The review focused on studies from areas located in a temperate climate; different climate zones affect how buildings are constructed and used, and in turn how energy modelling is performed and the relative importance of certain parameters.

Outline of the paper
Section 2 gives a brief overview of the field of Urban Building Energy Modelling. Section 3 provides an overview of UBEM archetype development processes, with and without EPC database involvement. Section 4 overviews additional model components for UBEM, specifically examining the aspects of occupancy and behaviour models, socio-economic factors, probabilistic characterisation of parameters, building renovation status, historic buildings, and energy system models. Section 5 examines studies of future energy system developments using future scenario simulation, focusing on building retrofitting, heat demand reduction and climate change impacts. Section 6 provides further discussion and recommendations for further research. Section 7 concludes the paper.

Brief overview of Urban Building Energy Modelling
For coherence in the field, the concept of UBEM has been more or less defined as meta-scale, bottom-up engineering-based modelling of heat and mass flows to simulate the energy performance of multiple buildings (see Section 1.2). This is the definition that will be used in this paper, and it will not consider top-down models, (bottom-up) statistical models or individual BEM. However, the hybrid approach is common in the reviewed UBEM research. In hybrid models, the buildings are modelled based on their physical characteristics, while the required data, particularly user-driven and/or typically stochastic parameters (such as building air leakage or occupancy), are treated as in statistical models, creating empirical distributions of the uncertain parameters [11]. Hybrid models can therefore make use of either deterministic or probabilistic archetype classification and characterisation processes. For deterministic approaches, building model parameters can be acquired from building codes and standards, previous case studies, literature or databases such as TABULA, a residential building typology database developed for European countries [15]. Probabilistic methods have not yet been completely adapted into UBEM research [11]. They are often inherently more complex and will often increase the number of needed data input sources as well as the overall simulation complexity.
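As a minimal sketch of the hybrid idea, the example below keeps one archetype parameter deterministic and draws another from a distribution. The Archetype class, the choice of a normal distribution, the truncation at 0.01 air changes per hour and all numerical values are assumptions made for illustration; they are not taken from TABULA or from any reviewed study.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Archetype:
    name: str
    u_wall: float             # deterministic wall U-value, W/(m2 K) (assumed value)
    infiltration_mean: float  # mean air changes per hour (assumed value)
    infiltration_sd: float    # standard deviation of air changes per hour (assumed value)


def sample_infiltration(archetype: Archetype, n: int, seed: int = 0) -> List[float]:
    """Draw n infiltration rates from a normal distribution, truncated below at 0.01."""
    rng = random.Random(seed)
    return [
        max(0.01, rng.gauss(archetype.infiltration_mean, archetype.infiltration_sd))
        for _ in range(n)
    ]


# Hypothetical archetype: every simulated building shares u_wall,
# but each one receives its own sampled infiltration rate.
row_house = Archetype("row house 1960s", u_wall=0.8,
                      infiltration_mean=0.6, infiltration_sd=0.2)
samples = sample_infiltration(row_house, n=1000)
sample_mean = sum(samples) / len(samples)
```

Each sampled value would be passed to one building simulation, so the simulated stock shows a spread of energy performance rather than a single value per archetype.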
UBEM development follows a clear-cut workflow. The procedure starts with identifying the building stock and the geometrical properties of the buildings, i.e. shape, geometry and geospatial positions, sometimes through 3D models of the city. Subsequently, the non-geometrical properties, i.e. construction materials, HVAC systems and occupancy, are defined and any necessary calibrations are performed. After defining the parameters, they are imported to a simulation engine where the building models and equivalent thermal zones are defined and simulated under the prevailing weather and climate conditions. Models that incorporate a higher level of detail will be more accurate but also require more complex and time-demanding simulation. Higher resolution data, such as high resolution LiDAR (Light Detection and Ranging) data, also increases accuracy but is not available everywhere [16]. There are also numerous different approaches for segmentation, characterisation and calibration, which is something that has been reviewed several times [5,8,11,17].
As modelling and finding data for all buildings in the building stock individually would be an insurmountable task, abstracting the building stock into representative buildings, i.e. archetypes, is a useful approach. Classification of the building stock into archetypes is a common process in UBEM development and may involve several different techniques, usually deterministic or probabilistic classification, or cluster analysis.
Clustering is a well-known data mining technique, but it is a new approach in UBEM [11]. It is increasing in popularity mostly due to its ability to examine various building features simultaneously and advances in machine learning techniques [18]. De Jaeger et al. [19] evaluated different clustering techniques on multiple combinations of all parameters considered relevant for archetype classification, to find the optimal method for UBEM purposes. They concluded that for predicting the annual energy demand, the optimal parameters are heated floor area, total loss area and construction period. The clustering technique that performs best is the k-means method. The k-means method and similar building parameters have been successfully used to develop representative archetypes for UBEM [18,20,21].
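A minimal sketch of k-means clustering on the three parameters named above is given below. It is a plain standard-library implementation for illustration, not the procedure of De Jaeger et al. [19]; the building data are hypothetical, construction year is used as a numeric stand-in for construction period, and the feature standardisation is an assumption made so that no single feature dominates the distance.

```python
import math
import random


def standardise(rows):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def kmeans(points, k, iterations=50, seed=0):
    """Basic k-means: assign each point to its nearest centroid, then update centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical buildings: (heated floor area m2, total loss area m2, construction year).
buildings = [
    (120, 310, 1955), (135, 340, 1960), (110, 290, 1950),
    (2400, 3100, 2005), (2600, 3300, 2010), (2200, 2900, 2000),
]
labels, centroids = kmeans(standardise(buildings), k=2)
```

Each resulting cluster can then be represented by one archetype, for example the building closest to the cluster centroid.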
Archetype classification can be of varying complexity and the number of archetypes required for building stock representation can vary greatly. The first step of classification is usually segmentation of the building stock, which involves dividing the buildings into sub-groups to better represent the differences in energy demand or simulation behaviour. Typical parameters for segmentation are building type, use, construction age, size, or similar. Second, the representative building groups are characterised into archetypes. To be able to simulate the buildings accurately, each reference building has to be characterised by defining non-geometric parameters, such as construction materials, energy/heating system, infiltration and occupancy. A third step involving calibration of the developed archetypes is also often necessary. Depending on the availability of data, segmentation or characterisation can be deterministically or probabilistically based. It is not necessary to use archetypes when creating a fully probabilistic model. Some studies instead characterise each building separately, using statistical methods [22].
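The first two steps above, segmentation followed by deterministic characterisation, can be sketched as follows. The period boundaries, the parameter table and the example stock are all hypothetical and serve only to show the structure of the procedure; the calibration step is left out.

```python
from collections import defaultdict


def construction_period(year):
    """Map a construction year to a period label (hypothetical boundaries)."""
    if year < 1945:
        return "pre-1945"
    if year < 1980:
        return "1945-1979"
    return "1980-"


def segment(buildings):
    """Step 1: group building ids by (use type, construction period)."""
    groups = defaultdict(list)
    for b in buildings:
        groups[(b["use"], construction_period(b["year"]))].append(b["id"])
    return dict(groups)


# Step 2: deterministic characterisation through a lookup of assumed values.
ASSUMED_PARAMETERS = {
    "pre-1945": {"u_wall": 1.4, "infiltration_ach": 0.9},
    "1945-1979": {"u_wall": 0.8, "infiltration_ach": 0.6},
    "1980-": {"u_wall": 0.3, "infiltration_ach": 0.3},
}


def characterise(segments):
    """Attach non-geometric parameters to each segment, keyed by its period."""
    return {key: ASSUMED_PARAMETERS[key[1]] for key in segments}


stock = [
    {"id": 1, "use": "residential", "year": 1930},
    {"id": 2, "use": "residential", "year": 1965},
    {"id": 3, "use": "residential", "year": 1972},
    {"id": 4, "use": "office", "year": 1995},
]
segments = segment(stock)
archetypes = characterise(segments)
```

A probabilistic variant would replace the fixed values in the lookup table with distributions from which each building draws its own parameters.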

Archetype classification in UBEM studies
Archetype classification is the process of identifying representative buildings in the building stock and typologically dividing them into sub-groups of identical buildings. There is no standard or generally optimal way of implementing this, but rather the studies involving archetype classification have adopted different methodologies depending on the aim of the study. However, some general patterns can be found within previous research.
This section reviews archetype development processes in selected UBEM studies that are considered representative for their specific scientific objective or methodology. The studies are reviewed for their main classification methods and for which model components are considered as issues or prospects for future studies, to get a picture of the aspects or topics that are missing in current UBEM research. EPCs are currently one of the most extensive data sources for buildings in the EU [23], and are rapidly growing as a tool for decision-planners, real-estate analysts and energy modelling researchers alike [24]. Therefore, studies are divided based on whether or not they have utilised any variant of an EPC database. The studies chosen for review will also reflect historic development and novelties within the field, with stand-out approaches or notable reflections. The review is limited to studies with an archetype approach for building stock representation and studies that encompass areas the size of a city district or larger. The focus is also on UBEM and building stock studies from areas located in temperate climates, as mentioned in Section 1.3. Table 1 summarises archetype characterisation aspects for the UBEM studies and reviews surveyed in this paper.

Classification and characterisation without Energy Performance Certificates
Cerezo Davila et al. [28] created a full UBEM model of the city of Boston in a large and thorough modelling project. The citywide model uses 52 archetypes based on GIS and energy data from surveys. The non-geometric building parameters are deterministically defined and utilise occupancy schedules divided by use type. The UBEM successfully models a large city but requires long simulation times and is hard to validate as the data inputs are based on surveys. The authors also criticise the use of surveys as energy use input data and specifically call for hourly metering data and occupant behaviour models in further work.
Kristensen et al. [31] created a model for all single family housing in a Danish city and were able to predict the annual heat demand profiles with high accuracy. The study proves that archetype-based models with normally distributed parameters can predict hourly demand profiles with very high accuracy, given accurate enough training data. This model utilises real meter heat consumption data for calibrating the archetype distribution parameters. However, this makes the model highly specific and the method not usable if one cannot obtain meter data for the studied area.
Yang et al. [37] developed a combined GIS-archetype approach to model the energy demand for residential space heating in the Netherlands. The study claims the model is 'pure engineering based', as it uses no statistical analysis or values, utilising GIS data as the only input data, with U-values and similar taken from the TABULA database. TABULA is a residential building typology database developed for building stock energy assessment in 13 EU countries [15]. The model takes advantage of easily obtainable data, and throughout different stages in the process the differences in energy performance (and its correctness) are shown, which is favourable for visualisation. However, the shortcomings of the TABULA data are discussed, as well as the number of parameters or characteristics that have to be assumed, such as heating system and window-to-wall ratio (WWR), and the fact that the internal room temperature has an assumed fixed value. The model by Yang et al. [37] produces accurate mean values but struggles to capture the spread of energy performance values in the building stock, and the problem of the stochastic nature of building occupancy is also discussed. Through the modelling process it can be noted that only 44% of the building stock is covered due to the lack of physical data or high quality parameter values. This low coverage means that even though it is a pure engineering based model, it does not achieve a higher coverage than the hybrid approach. This way of approaching UBEM is more or less the opposite of the study by Kristensen et al. [31].
Monteiro et al. [33] created a UBEM model with multi-detail building archetypes to assess the impact of considering different levels of detail in the characterisation process. The results show that a higher number of archetypes is correlated with a higher model accuracy, acknowledging that a more diverse building stock needs a higher number of archetypes to be equally well represented. The model also uses only GIS input data and deterministic characterisation of non-geometric parameters. The archetype main tier segmentation is determined by geometry, construction, systems, operation, roof type and neighbouring. The authors point out that the impact of subdividing archetypes by a specific parameter is highly dependent on how representative the parameter is.
Mohammadiziazi et al. [32] in a recent study created a UBEM model combining several databases and a novel way of assigning building envelope properties through street view photogrammetry and image processing. It is one of few studies that includes commercial buildings, and the authors conclude that the energy use intensity is highly correlated with the use type.
Sokol et al. [36] developed a methodology for defining residential archetypes focusing on Bayesian calibration. Their model uses statistical classification and clustering from GIS data, and performs calibration using measured energy consumption. The characterisation is probabilistic (by distributions) for infiltration, thermostat set points, occupant density, plug load and lighting power density, and the domestic hot water flow rate. The study presents the simulated energy use intensities as probability distributions and concludes that deterministic archetype characterisation is unable to account for the variety in building energy performance. The authors moreover conclude that the modelling accuracy increases with the number of archetypes and that calibration leads to significantly better annual building energy performance fits. For future studies they propose using stochastic modelling of hourly schedules and more detailed calibration procedures. Cerezo et al. [29] continue the Bayesian calibration research with a comparison of four building archetype characterisation methods, describing deterministic and probabilistic (distribution) approaches. The calibration of unknown archetype parameters creates a better model fit, but the authors point out that due to privacy and ownership concerns, individual energy data is still extremely difficult to access for samples of buildings large enough for UBEM models.
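The principle of Bayesian calibration can be sketched for a single unknown parameter. The example below is not the procedure of Sokol et al. [36] or Cerezo et al. [29]: it replaces the building simulation with a hypothetical linear surrogate (surrogate_eui), assumes a uniform prior over a grid of infiltration rates and a Gaussian error with an assumed standard deviation, and uses invented measurements.

```python
import math


def surrogate_eui(infiltration_ach):
    """Hypothetical linear stand-in for a building simulation, in kWh/m2 per year."""
    return 80.0 + 60.0 * infiltration_ach


def grid_posterior(measured, grid, prior, sigma):
    """Posterior over grid values, given measurements and a Gaussian error model."""
    log_post = []
    for theta, p in zip(grid, prior):
        predicted = surrogate_eui(theta)
        log_like = sum(-0.5 * ((y - predicted) / sigma) ** 2 for y in measured)
        log_post.append(math.log(p) + log_like)
    peak = max(log_post)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Candidate infiltration rates from 0.1 to 1.5 air changes per hour, uniform prior.
grid = [0.1 * i for i in range(1, 16)]
prior = [1.0 / len(grid)] * len(grid)
measured = [116.0, 118.0, 114.0]  # invented measured energy use intensities
posterior = grid_posterior(measured, grid, prior, sigma=5.0)
best = grid[max(range(len(grid)), key=posterior.__getitem__)]
```

The result is a distribution over the parameter rather than a single calibrated value, which is what allows simulated energy use intensities to be reported as probability distributions.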
The surveyed studies demonstrate the diversity within the field, even though only archetype-based, bottom-up UBEMs were considered. Nearly all of the examined studies utilise GIS to gather geometric building data. Most often the characterisation process is done by using deterministic non-geometric parameters and a database such as TABULA for remaining data, which is not representative of all buildings in a country but rather a specific collection of generic residential buildings. A significant share of the studies also have access to metered or measured energy use data for characterisation or calibration. As neither GIS nor deterministic databases contain information about building energy use, the only other reasonable alternative to measured energy use data is energy use data from surveys. Consistent findings include issues regarding gathering reliable data, that the number of archetypes relates to model accuracy, that stochastic occupancy models are particularly important, and that some parameters are evidently more relevant than others.

Classification and characterisation using Energy Performance Certificates
EPCs are public domain documents with the goal of comparing buildings in terms of energy use and providing help with the identification of possible energy efficiency improvements [38]. The most used method of energy performance assessment is calculation based on a number of building parameters and statistical data. Some EU countries, including Belgium, France, Germany and Hungary [39], however, use measured ratings for parts of the building stock. A notable exception is Sweden, where nearly all collected EPCs are assessed using measured values [14,40]. EPCs are often mandatory to some extent and issued by governmental agencies. In Sweden, for example, they have been collected by the National Board of Housing, Building and Planning since 2006 and are mandatory for newly constructed buildings, if the building is larger than 250 m2 or if it is made available for sale. In the United Kingdom, EPCs have been mandatory since 2008 if you own a property and want to rent it out or sell it [41]. As a result, EPC databases are one of the most extensive data sources for building energy performance currently available, especially in EU countries and the UK [23].
In recent years, there has been an increasing use of EPCs to thoroughly map a building stock with regards to energy use and energy efficiency [42,43] and to analyse renovation potential, to perform energy planning, and to predict future energy demands [24]. Hjortling et al. [14] created an energy mapping of the Swedish building stock, mainly for energy efficiency and climate studies. The study divides the buildings into categories from the EPC energy data and provides a statistical analysis. The authors argue that the EPCs in Sweden are quite reliable since they are based on data from energy bills, but that the calculated and measured energy performance sometimes differ.
There has been some criticism over the need for stronger requirements regarding data quality and content of EPCs [24]. Data quality studies have shown that the methods of deriving the heated floor area are not consistent [40]. In the issued EPCs, heated area is generally underestimated which makes energy performance generally overestimated. Other issues mentioned are under- and over-representation of certain building types, heating system types averaged over many buildings and that the EPC assessors are dividing the electricity use differently. Around 30% of homes in the UK may be placed in the wrong EPC energy class due to assessors disagreeing on parameters such as wall type and building form [44].
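The link between an underestimated heated floor area and an overestimated energy performance value follows directly from the ratio of energy use to area. The sketch below works through one hypothetical dwelling; the 10% underestimation and all other numbers are invented for illustration and are not taken from the cited data quality studies.

```python
def energy_performance(energy_kwh, heated_area_m2):
    """Energy performance as annual energy use per heated floor area, kWh/m2."""
    return energy_kwh / heated_area_m2


def correct_for_area_bias(reported_ep, area_underestimation):
    """Rescale a reported value when the area was underestimated by a known fraction."""
    return reported_ep * (1.0 - area_underestimation)


# Hypothetical dwelling: 15000 kWh/year, true area 150 m2, recorded area 135 m2.
true_ep = energy_performance(15000.0, 150.0)
reported_ep = energy_performance(15000.0, 135.0)
corrected_ep = correct_for_area_bias(reported_ep, area_underestimation=0.10)
```

Here the recorded area gives about 111 kWh/m2 instead of the true 100 kWh/m2, and the correction recovers the true value only because the size of the area bias is assumed to be known.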
The database has however since been updated in many countries. von Platten et al. [23] matched old EPCs in Sweden with updated ones, to examine which have been renewed and changed or renovated. They concluded that new and old EPCs can be matched with certainty, and that the differences demonstrate the energy performance development in different parts of the building stock. The paper provides a solution to the issue with incorrectly derived heated floor area. Newly issued EPCs have adopted a new way of deriving the areas, and a regression performed by the authors shows that the changed method for deriving heated floor area overestimates the improvement in building energy performance by approximately 7 kWh/m2 in over half of the surveyed buildings, and has to be corrected for an accurate representation.
Johansson et al. [30] used EPCs and LiDAR data to create a 3D energy model of a Swedish city with a methodology considerably similar to the UBEM framework, although the study does not use this terminology. The methodology was sufficient to create an accurate energy model and is reproducible for any Swedish city since all data is nationally available.
Broström et al. [27] created a cross-disciplinary categorisation method for historic buildings in Visby, Sweden, by combining EPC data with local building survey databases. Using the local databases, they were able to develop archetype buildings with extensive detail about construction materials, something that is not present in EPC data. However, this type of data is not commonly available. For future studies the authors propose that the inventory data is coordinated with GIS for a more manageable and precise model, and with the use of automated statistical methods for models on a larger scale. More parameters can also be taken into account, such as cultural values, national energy saving targets and constructional limitations.
Ahern and Norton [25] derived a residential stock model for the Irish building stock using EPC databases. The model uses data-driven segmentation by U-values and uses EPC data to determine deterministic parameters such as heating system and construction year, resulting in 35 archetypes. The study claims that some EPC values, such as energy performance and heating system proportion, are not statistically significant, but still uses default U-values determined from EPC data on building construction year. The authors mention the lack of information on the composition of building stocks as a major issue, meaning that EPCs or other (deterministic) databases cannot be the only data source.
Pasichnyi et al. [34] created building archetypes with a data-driven categorisation method with the purpose of aiding energy retrofitting decisions. The study only defines three archetype buildings, which is far fewer than most other studies. The study instead focuses on the characterisation of a predefined segmentation, and represents the relevant EPC parameters as distributions, decreasing the need for specific clusters or patterns that single out particular ranges of energy performance within the building stock. The study characterises a significant number of parameters, and uses measured hourly heating data to calibrate the characterisation. The study brings up that single-family houses are significantly under-represented in the EPC database, undermining the accuracy of the results. EPCs are also reissued every 10 years, and the database is not annually updated. This creates a temporal lag in the data, and excludes buildings constructed after the current issue date.
Ali et al. [21] developed a data-driven approach to study the differences between different scales when using building archetypes in UBEM models.The Irish building stock is categorised by dwelling type and age, and characterised by building U-values and clustering via the k-means method.The authors emphasise the need for more detailed data and processes to enhance the simulation results.They propose to integrate dynamic occupant behaviour, include commercial buildings in the studied building stock, and use stochastic building energy stock models to handle uncertainties.They also mention automating the process of archetype development and simulation.
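The clustering step in such a characterisation can be illustrated with a minimal sketch. It assumes a hypothetical list of wall U-values and a plain one-dimensional k-means with deterministic initialisation; it is not the procedure of Ali et al. [21], whose feature set and implementation are not reproduced here.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain 1-D k-means (Lloyd's algorithm) with evenly spaced initial centroids."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic initialisation: pick k evenly spaced points from the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each value.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical wall U-values in W/(m2 K); not taken from any EPC database.
u_values = [0.21, 0.25, 0.27, 0.55, 0.60, 0.58, 1.60, 1.75, 2.10]
centroids, labels = kmeans_1d(u_values, k=3)
```

Each resulting cluster centroid could then serve as the representative U-value of one archetype. In practice several parameters would be clustered jointly and the number of clusters would need to be justified.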
As can be seen from the above account, EPCs have been labelled as reliable and as the most promising data source for building modelling purposes, at least in the EU, but they have also been stated to have some deficiencies.Several studies have been able to create accurate and easily reproduced archetype-based models using different parameters from the EPC database, such as registered energy performance or heating system.When compared to models that utilise other data sources, EPCs are sufficient as the only data source, while other models are more often created using several databases and validated or calibrated using measured data.Thereby, they have eliminated some of the data gathering or reliability issues that were mentioned in the previous section.Many of the concerns for future research raised by these studies do not regard the used data, but rather missing inputs such as overall composition of the building stock or dynamic occupant behaviour models.With EPCs, data issues for UBEM have been relocated from data gathering to data quality.But as EPCs are reissued and updated regularly, and with the knowledge about statistical deviations in the database and developed correction factors, these factors may be easily corrected for.

Model components for further UBEM development
This section provides an overview of model components and further inputs important for future UBEM research, possible improvements and applications.The section starts with an overview of model components considered lacking in the current research, and tries to identify the parameters or inputs which are not included in any or most studies.The components thought to be more under-developed or of particular interest in the research field are then more closely examined in the following subsections.Each subsection also begins with a summary of how this particular component is treated and used in current research.Not only UBEM studies are considered; as the components are examined for their availability and usefulness for related research, it is important to gain knowledge of how the particular component has been used in other types of studies and fields.

Component overview
Building simulation and modelling have some widely recognised essential components (see Section 2), but there is a potential for adding external data or models. General model components that could be of particular interest for further research are listed in Table 2, along with the papers that mention the specific component as a gap in current research. Note that not all papers in the field are included, but only the most significant and representative articles that are otherwise featured in this review.
The model components that are examined further in this paper are chosen based on the number of studies in which they are mentioned as important for further research, or because the component is regarded as highly important or interesting, but is scarcely used, studied or discussed within the field. These components are described and further examined in the following subsections. Some of the listed components were analysed but not considered reasonable to examine further in this paper, as explained below.
As previously mentioned, the most common deficiencies mentioned in the reviewed articles were the lack of measured or metered energy use data, and the lack of reliable calibration and validation methods, which in turn rely heavily on the availability of measured data. As this cannot be regarded as a model component, it is not listed in Table 2 nor included as a topic for further discussion within the scope of this paper.
Similarly, several reviewed papers mention the fast developing field of machine learning as promising for improving modelling techniques within UBEM. Recent UBEM projects have used machine learning-based methods extensively for predictive studies [12] and the number of UBEM studies built on machine learning or deep learning is increasing [49,50]. As this is also a fast-moving and big field without the need of recognition here as an under-developed area, it is not treated in more detail in this section.
Studies that integrate urban micro-climate models are still scarce.City-level simulation of such integrated models will be so complex, that computational fluid dynamic (CFD) methods are essential for resolving them, which will render the overall model too complex or in need of simplification [45].Therefore, and because it has already been thoroughly reviewed previously [11], this topic is not further reviewed in this paper.Integration of UBEM with urban mobility modelling is another important extension; however, this is also discussed in detail by Johari et al. [11] and is therefore not included here.
Apart from urban mobility and micro-climate models, the other model components in Table 2 are discussed below, each in its own subsection.

Stochastic occupancy and behaviour models
The energy use of an individual building can be viewed as the result of a combination of two subsystems, the physical household and the social household [51], the physical household being the building characteristics, physical parameters and appliances, and the social household being best described as the behaviour of the people in the building, defining the use of the building.The behaviour of the building occupants determines the energy demand by interaction with the physical household and has been proven to influence the thermal load on a building as well as having a significant effect on the total energy demand [26,52].Multiple studies also point out that user behaviour and occupant preferences are commonly acknowledged reasons for the performance gap between measured and simulated building energy use [8,53,54].
In previous UBEM studies, building occupancy is an essential parameter and has generally been modelled with deterministic approaches, while stochastic occupancy modelling has been a more common approach for individual Building Energy Modelling (BEM) [48]. A common deterministic approach is assigning a fixed value of occupants per square meter of the building and using assigned hourly schedules, often fairly simple, for occupant presence. These schedules are typically different for different building use types, such as single family homes, offices or schools, and days of the week or year. Probabilistic methods require more data and are inherently more complex, which increases the overall complexity in larger scale models, but they are considered important or lacking by a significant share of the UBEM research.
Recent reviews on occupant behaviour models conclude that they are still mostly considered at the individual building level and not at the urban level, and that stochastic models perform better than deterministic ones [48,55]. Stochastic models generate more realistic energy demand patterns at the district level, which is especially true when predicting peak loads [56]. Happle et al. [48] specifically compare modelling approaches for bottom-up engineering-based UBEM models, and show that the available models can be divided into three types: deterministic models, stochastic space-based models and stochastic person-based models. The review notes that models that take differences between certain occupant groups into account are lacking in the research field. In follow-up studies to this review, the authors also conclude that data-driven schedules or probabilistic occupant presence models should be used instead of the standard deterministic schedules [57,58].
Occupant user behaviour has an effect on internal heat gain (heat gain from people, heat losses from electricity use, etc.) and on energy use (appliance use, lighting and hot water use, etc.). Domestic electricity use has been proven to be significantly increased by the number of occupants and their use of household appliances [59-61]. It has also been proven to be affected by a number of socio-economic and societal indicators. Differences in occupant behaviour and preferences may influence this user-related energy demand, including feeling responsibility for reducing energy use due to climate change, or the willingness to spend economic resources on energy or personal comfort preferences [48]. Yarrow [62] argues in his paper about heritage and energy conservation that the people who value the character of a cultural heritage building can be more tolerant to lower levels of heating or lighting, as this is associated with said character, and would as such be keener to adapt to these. Other indicators and parameters that may affect such behaviour, like income and socio-economic status, differences in gender or age and similar aspects are further discussed in Section 4.3.
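The difference between a deterministic schedule and a stochastic person-based model can be sketched as follows. The hourly presence fractions, occupant density and floor area are hypothetical, and the independent hourly draws are a strong simplification of the stochastic models discussed in the reviewed literature, which typically account for the dependence between consecutive hours.

```python
import random

# Hypothetical hourly presence fractions for a residential building (24 values).
PRESENCE_SCHEDULE = [0.9] * 7 + [0.5] * 2 + [0.2] * 8 + [0.6] * 3 + [0.9] * 4


def deterministic_occupancy(floor_area_m2, occupants_per_m2, schedule=PRESENCE_SCHEDULE):
    """Fixed occupant density scaled by a fixed hourly schedule."""
    max_occupants = floor_area_m2 * occupants_per_m2
    return [max_occupants * fraction for fraction in schedule]


def stochastic_occupancy(n_occupants, schedule=PRESENCE_SCHEDULE, rng=None):
    """Person-based sketch: each occupant is present in each hour with the
    scheduled probability, drawn independently for every hour."""
    rng = rng or random.Random()
    return [sum(rng.random() < p for _ in range(n_occupants)) for p in schedule]


det = deterministic_occupancy(floor_area_m2=100.0, occupants_per_m2=0.03)
sto = stochastic_occupancy(n_occupants=3, rng=random.Random(42))
```

The deterministic profile is identical for every building of the same type and size, whereas repeated stochastic draws give a different integer profile per building, which is what produces diversity in aggregated district-level loads.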
Building occupancy is also heavily dependent on human mobility, the movements of occupants between buildings.These patterns of human behaviour are possible to predict using human mobility models, something that has recently seen increasing attention in the UBEM field [11,46].Mobility models are also related to infrastructures such as transportation and electric vehicle charging.They can also have a significant impact on maximum occupancy rate [47], and may increase the accuracy when developing dynamic urban occupancy models [63].
There have been a number of studies utilising mobile phone call records or smartphone/social media location-based service (LBS) data to understand human mobility [46]. However, these types of data, which rely on the use of specific technology, are arguably not representative of the whole population, and are generally biased towards the younger and wealthier. Representing mobility for various socio-demographic groups, building types and locations is still a knowledge gap [46,47], and closing it would be of importance for connecting UBEM with human mobility models.
Barbour et al. [26] extended a recent modelling framework with mobile phone data to assign occupants to residential and commercial buildings using the individual trajectories of 3.5 million inhabitants.The study found that for commercial buildings, the occupancy rates are five times lower than implied by current assumptions, which may lead to building loads being consistently mispredicted, while the occupancy rates vary widely by neighbourhood for residential buildings.This difference between actual capacity and estimated occupancy implies that better space utilisation may improve building efficiency.Since mobile phone data is used, the occupancy levels are normalised to the actual number of inhabitants rather than reference occupancy, which is derived from predefined occupant density schedules based on building type.
Occupancy that is normalised to the actual number of inhabitants seems to be less common than the reference alternative, normalisation by area, regardless of type of occupancy model.The review by Happle et al. [48] lists a few deterministic occupancy models using real occupant densities for individual buildings, but no probabilistic or agent-based ones.However, models exist for all three approaches, which all use some way of defining individual building occupant densities sampled from a probability distribution.
To summarise, occupancy is an essential parameter in UBEM and the importance of its accuracy is getting more attention. Stochastic occupancy models and models on an urban (city or neighbourhood) scale are the most sought after for further research, along with models for capturing relevant behaviour aspects and variations. Stochastic models based on real measurements, like mobile phone data, and normalised against real numbers, are on the rise but are still not commonly used. The increased complexity has been stated as the main reason for the lack of probabilistic models in UBEM, but the number of studies highlighting the need for such methods, along with the studies that have used such methods successfully, suggests that this added complexity is beneficial and essential for further research. Differences between occupant groups are evidently not investigated or included enough, but the topic is related to socio-economic factors, so a model that includes such aspects will likely influence the occupancy modelling as well.

Socio-economic data
Society-related economic factors including employment, income, education, and sometimes also ethnic background, gender or similar can be referred to as socio-economic data or indicators.Socio-economic indicators, factors and parameters have been considered important for model accuracy and representation, but are as of now remarkably under-developed in existing UBEM research.Reasons given in the papers are typically that socio-economic data is only fit for top-down approaches or hard to integrate in engineering models or that there is a risk of significant errors as they change over time [9,12].
The social household as a subsystem can be divided into occupancy (see Section 4.2) and socio-economic parameters. Regarding the effect of occupant behaviour on energy demand, studies point out temperature or lighting preferences, employment and socio-economic status, lifestyle or culture and differences in gender or age as significant parameters that affect the building energy demand [53,54]. The application of the UBEM model by Barbour et al. [26] found that the residential occupancy is not only highly neighbourhood dependent but also that the occupancy distribution curve has a long tail. This implies that the occupant densities have a wide range, and that other factors connected to different neighbourhoods, like socio-economic status, significantly affect the occupancy. The study gives an example regarding student housing neighbourhoods, which show significantly higher occupant density than the residential average. Some examples from various studies show what kind of socio-economic parameters have been shown to have an impact on building energy use and could be considered for inclusion into UBEM.
Jones et al. [59] conducted a comprehensive review of socio-economic, dwelling and appliance related factors that affect domestic electricity consumption. The study found that 13 socio-economic factors affected household electricity use, of which three proved to have a significant positive effect. More occupants, more teenagers in the household, and higher household income all led to an increased electricity use. The review also concludes that dwelling size and number of appliances are strongly related to the electricity demand, and that a considerable number of the studied factors had only been examined by a few previous studies. For the building heating demand and total energy use, no comprehensive review could be found within the scope of this review.
In a study of renovation investments and incentives, Mangold et al. [64] find that the form of ownership has an impact on the energy performance of multi-family buildings. They point out that building ownership is context specific, but that municipality-owned and rental housing has a lower energy performance than resident-owned, private real estate. The study used EPCs for energy use data and real-estate ownership data from the Swedish National Land Survey and the Retriever Business database. Yohanis et al. [61] also showed that privately owned homes have an electricity demand in the evenings that is more than 100% higher than that of rented homes.
Studying variations in gas usage for space heating in the UK, Fuerst et al. [51] found that the gas usage is largely determined by households' socio-economic rather than physical characteristics. They also found that single person households consume 3-6 times more gas compared to larger households, and that the wealthiest households use more gas per capita than the poorer households. Persons aged 60 or older also consume double the amount of gas compared to younger families. A 2009 study on Swedish and German people also concludes that income, age, but also gender and social status influence the energy demand [65].
Another recent Swedish study found that the energy use and electricity consumption per capita is strongly correlated with income [66,67].More wealthy households can have larger homes and more appliances while low-income residents have higher residential density, and more people living together tend to be more energy efficient in terms of electricity use.Low income can also have a direct effect on the energy demand as residents cut heating or electricity use due to energy costs [52].However, the energy performance has also been found to be inversely correlated with income [68]; the highest energy use per square meter in Sweden is found in the lowest income decile.
Households and businesses with a shortage of capital, short- or long-term, may also reduce spending on building refurbishments and modern equipment. Less wealthy areas may consequently postpone energy efficiency-improving actions, resulting in a higher overall energy demand. von Platten et al. [68] also point out that energy use does not increase linearly with floor area, and that measuring energy use normalised by area has been proven to penalise small buildings because of this. They also argue that for maintaining a sustainable built environment, increased space utilisation is needed as urbanisation continues to increase demands for housing in cities; the amount of construction of new residential buildings is directly tied to the total energy use and the overall energy demand per capita. A study from Switzerland shows that while the population increase in the country is low, the total building stock area and domestic energy consumption have increased by a factor of three over a period of 50 years [69]. This results in a significant net increase in energy demand per capita although the energy performance (in kWh/m²) has only increased marginally. This pattern has led several studies to argue for the implementation of energy use per capita as a second key number for energy efficiency [68,70], or that energy use per area is insufficient as a key number for energy efficiency [26]. Effective space utilisation could therefore be a parameter of interest for creating energy demand scenarios.
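A small worked example shows how the two key numbers can rank the same dwellings differently. The dwellings and their values are hypothetical and serve only to illustrate the arithmetic.

```python
def energy_indicators(annual_energy_kwh, heated_floor_area_m2, n_residents):
    """Return (kWh per m2, kWh per capita, m2 per capita) for one dwelling."""
    if heated_floor_area_m2 <= 0 or n_residents <= 0:
        raise ValueError("floor area and number of residents must be positive")
    per_area = annual_energy_kwh / heated_floor_area_m2
    per_capita = annual_energy_kwh / n_residents
    space_per_capita = heated_floor_area_m2 / n_residents
    return per_area, per_capita, space_per_capita


# Hypothetical dwellings: (annual energy in kWh, heated floor area in m2, residents).
large_sparse = energy_indicators(15000.0, 150.0, 2)  # 100 kWh/m2, 7500 kWh/capita
small_dense = energy_indicators(8400.0, 70.0, 4)     # 120 kWh/m2, 2100 kWh/capita
```

Measured per area, the large and sparsely occupied dwelling appears more efficient (100 against 120 kWh/m²), while per capita it uses more than three times as much energy. The third value, floor area per resident, is one possible way to express space utilisation.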
As indicated above, several studies have shown that the household income is closely related to the energy demand, which should encourage including resident income (or median neighbourhood income) as input parameters for city-scale energy modelling.By using energy demand per capita, the concept of effective space utilisation becomes available, which is an important but neglected measure of energy efficiency.Demographics (age, gender and persons per apartment) may also affect the energy demand, but perhaps not as much as income.However, demographics are likely to influence the overall occupancy of buildings or city districts which may affect the occupancy modelling.
How to incorporate these data in building modelling is however still a complex issue.Socio-economic indicators are usually only included in statistical, not physics-based, models, and top-down models are more suited to include such aspects compared to bottom-up approaches [9].Data for indicators such as household income and persons per apartment can be collected by using tax records or similar databases.The data can then be used for classification of archetypes or for cross-checking with model simulation results to be able to find patterns or to draw useful conclusions.

Probabilistic characterisation of parameters
Many parameters in BEM follow a stochastic pattern, which can vary greatly between buildings and be hard to measure accurately.Therefore, many of them are most often deterministically defined as a fixed value, or several fixed values following a schedule.Probabilistic characterisation has been generally regarded as more representative and accurate for UBEM applications than its deterministic counterpart [29].However, not many of the common parameters have been sufficiently researched or fully integrated in UBEM models [11].Of the studies summarised in Table 1, only three use probabilistic characterisation to some extent.
Building infiltration, or air leakage, is one such non-geometric parameter that is usually modelled deterministically. Infiltration is the unintended introduction of air in a building, as compared to ventilation which is intended air exchange. Infiltration is highly important when calculating and modelling the energy demand in buildings, and is chosen for further examination here because of its high uncertainty [29,36] and its tendency for causing discrepancy between modelled and actual energy use in BEM [8].
Building infiltration has a significant effect on the energy performance of buildings, accounting for around 15-30% of the energy use for space heating [71], and largely depends on building characteristics and outdoor wind speed. The parameter is most often measured in air changes per hour (ACH), the share of the total air volume in the building or system that is replaced with incoming air in one hour.
Since the infiltration rate varies with building characteristics and renovation rate it can be difficult to model accurately for a large set of buildings.In previous UBEMs the infiltration is set deterministically, at a set rate according to building standards [28].As a reference, these values could correspond to around 0.3-0.7 ACH for new construction and 0.5-1.5 ACH for old or low-income construction [72].These values are derived from histograms of measured air leakage in buildings.
One way to solve the infiltration modelling issue is to model the infiltration in a building stock as a statistical distribution, assigning values for individual buildings from the distribution.This has been done by Shi et al. [73] where air infiltration rate distributions of multi-family residences in Beijing were mapped.The goal was not energy performance assessment but instead air quality; however the air exchange parameter is relevant for both areas of study.The study found that the leakage area was dependent on building size and year of construction, and that the infiltration rate of all studied buildings could be well represented by the lognormal distribution with a median value of 0.16 ACH.The study also notes other studies agreeing that the lognormal distribution best represents the distribution of the infiltration rate.They also note that the lognormal distribution may underestimate small infiltration rates and overestimate large infiltration rates, respectively.
A similar distribution is also shown by Eskola et al. [74] in their study of air exchange and energy performance in historic residential buildings in Estonia, Finland and on Gotland. They also found that the air exchange rate was significantly higher for two-storey houses than one-storey houses, and for wind-exposed houses as compared to sheltered buildings.
Pietrzyk and Hagentoft [75] offer a more mathematical approach, presenting a probability density function of the air change rate as a result of variations of the climatic conditions. This approach also renders the ACH distribution as lognormal, with a mean value of 0.19 ACH.
There seems to be broad agreement that the stochastic behaviour of infiltration follows a lognormal distribution. The findings stem from individual BEM and air quality research, but since the fields are similar it is noteworthy that they have not yet found their way into UBEM projects. Regarding stochastic parameters, an advantage of city-scale modelling of multiple buildings over BEM is that the characterisation of building archetypes can be in the form of a probability distribution, and the values for the simulated individual buildings can be randomly sampled from the distribution. If sampled from a known distribution, for example a lognormal distribution with known mean and standard deviation for the infiltration, the large-scale model will likely provide more accurate simulations.
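Sampling of this kind can be sketched in a few lines. The example below parameterises the lognormal distribution by its median, using the 0.16 ACH reported by Shi et al. [73]; the spread (`sigma_log`, the standard deviation of ln(ACH)) is an assumed, illustrative value, and both parameters would have to be derived from measurements for the building stock being modelled.

```python
import math
import random
import statistics


def sample_infiltration_ach(n_buildings, median_ach, sigma_log, seed=None):
    """Draw one infiltration rate (ACH) per building from a lognormal distribution.

    median_ach is the distribution median; sigma_log is the standard deviation
    of ln(ACH). Both are inputs that must come from measured data.
    """
    rng = random.Random(seed)
    mu_log = math.log(median_ach)  # the median of a lognormal equals exp(mu)
    return [rng.lognormvariate(mu_log, sigma_log) for _ in range(n_buildings)]


# Median from Shi et al. [73]; sigma_log = 0.5 is an assumed, illustrative value.
ach_values = sample_infiltration_ach(n_buildings=10000, median_ach=0.16, sigma_log=0.5, seed=1)
sample_median = statistics.median(ach_values)
```

Each simulated building of an archetype then receives its own sampled rate instead of a single fixed value, and fixing the seed keeps the assignment reproducible between simulation runs.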
Similar to infiltration, there are other modelling parameters that are usually deterministically defined in BEM and which could be possible to represent with probabilistic characterisation, for example the window-to-wall ratio (WWR) [11,28].WWR in UBEM was studied in detail by Mohammadiziazi et al. [32] with a resulting WWR range that could be represented by a probability distribution.Indoor temperature (thermostat set points) is generally deterministically defined, but is also regarded as being a parameter with high uncertainty [29,37,48].Sokol et al. [36] modelled indoor temperature as a probability distribution with promising results.

Renovation and refurbishment records
As part of the transition to a low-emission society, existing buildings need to be adapted to comply with strict environmental and energy performance requirements [76].As part of the European Green Deal, the European Commission has launched a new strategy to boost energy efficiency by doubling the renovation rates in a 10 year period [77].When renovation is mentioned in previous UBEM research it is most often as a target for energy efficiency or as a possible model application.Some studies use the renovation year as a complement to the construction year, to more accurately evaluate the materials and U-values, but also state that such data can be too hard to find [25,28].
Deep renovation has the potential to greatly reduce the energy demand of a building, and the last major retrofit has been proven to have a greater correlation with building energy use than the original construction year [36].This is also proven by Mangold et al. [78], who provide an estimate of the economic and societal challenges of renovating the ageing building stock in Gothenburg, Sweden.They conclude that there is a risk of decreasing societal equity due to rent increases as buildings are renovated.For the study, they introduce a method for assigning the value year of a building based on renovation cost and year.The value year is a term used when a building's characteristics and energy performance better align with a more recent construction period, as opposed to the actual construction year, due to renovation or similar procedures.If a renovation goes beyond ordinary maintenance, the cost and renovation year is registered by the Swedish Tax Agency.This also gives a number on how many deep renovations have been carried out in the building stock, and consequently how many buildings are in need of refurbishment ahead.This could make renovation tax data an interesting data source addition for a UBEM project.
When refurbishing a building to a more energy-efficient state, it is done with the intent of decreasing the energy demand.However, the actual energy use may not reflect this, due to a rebound effect [51,62,67,79].Within energy economics, the term rebound effect is used to describe mechanisms that reduce the expected energy savings from applied energy efficiency improvements [80].One result of such improvements is that the marginal cost of related energy services is reduced, which may lead to an increase in consumption of those services, which is observed as a direct rebound effect.There are also other indirect reasons for the net rebound effect, for example re-spending effects, where the money saved from energy efficiency may be spent on other services that also require energy to be provided.
If the rebound effects due to energy efficiency measures are so strong that they result in a net increase in energy consumption, it is known as Jevons' Paradox. This could have major implications for sustainability as it implies that it may be counterproductive to apply energy efficiency measures as a means of reducing carbon emissions [80].
A direct rebound effect related to building renovations is observed as efficiency measures create room to increase electricity consumption or to improve the thermal comfort of the building. Comfort- or preference-related actions like increasing indoor temperatures, increasing the electric load or similar (see Section 4.2) are often taken by residents if given the possibility. Renovation often increases this possibility, and may lower the cost of taking such actions, which could then create a new baseline after the renovation has been made.
Studies have also confirmed that identical buildings can still have an energy use for heating that varies by a factor of 2-3, due to user practices [67]. Non-intrusive measures targeting user behaviour also have an energy savings potential of around 10-15% per building and are especially relevant for cultural heritage buildings due to the restrictions regarding refurbishments in historic buildings [52].
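The size of the rebound effect is commonly expressed as the share of the expected energy saving that is not realised. The sketch below uses this definition with hypothetical annual heating demands; the function and variable names are illustrative and not taken from the cited studies.

```python
def rebound_effect(baseline_kwh, expected_kwh, actual_kwh):
    """Share of the expected energy saving that is lost to increased consumption.

    0.0 means the full expected saving was realised; values above 1.0 mean that
    energy use after the measure exceeds the baseline (a net increase).
    """
    expected_saving = baseline_kwh - expected_kwh
    if expected_saving <= 0:
        raise ValueError("expected use must be lower than the baseline")
    actual_saving = baseline_kwh - actual_kwh
    return (expected_saving - actual_saving) / expected_saving


# Hypothetical annual heating demand (kWh) before and after a renovation.
partial = rebound_effect(baseline_kwh=20000, expected_kwh=14000, actual_kwh=15500)
backfire = rebound_effect(baseline_kwh=20000, expected_kwh=14000, actual_kwh=21000)
is_net_increase = backfire > 1.0
```

In the first case a quarter of the expected saving is lost; in the second the energy use ends up above the pre-renovation level, which corresponds to the net increase described by Jevons' Paradox.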
Building renovation has a great potential to reduce the energy demand of a building, and knowledge of renovation year may be important to accurately represent the building stock, especially in terms of construction materials.Renovation tax data and value year could therefore be additional model inputs.Due to the observed rebound effect, as renovated buildings may not necessarily be more energy efficient, it may not be as simple as to classify or divide buildings based on whether or not renovations have been made.The energy efficiency related to renovation is also therefore closely tied to resident behaviour.

Historic and cultural heritage buildings
Historic buildings have a heritage significance to present and future generations, which corresponds to values of importance that people assign to a building, often expressed with character-defining building elements [81]. For historic buildings, the age of the building is the main indicator, but there is no agreed-upon age that renders a building historic. Similarly, historic buildings do not necessarily have to be assigned as cultural heritage.
Building retrofits and the conflict between energy efficiency and the conservation of historic buildings are common research areas within the field of cultural heritage and conservation [76]. Even when refurbishing for energy efficiency, the energy performance goals set for newly constructed buildings can be hard to reach for existing buildings. Not all building features can be refurbished, as the energy-saving methods and strategies used in modern buildings are not necessarily usable or appropriate for older buildings [69]. The discussion about energy efficiency in cultural heritage buildings is based on the premise that the consideration of cultural values and technical properties in these buildings has a negative effect on the potential for energy saving and retrofit measures [82]. It is, however, also a risk to neglect the need for energy efficiency measures. They are a prerequisite for cultural heritage buildings to be preserved, largely because increased heating and maintenance costs almost always lead to decay and demolition.
A number of cultural heritage building stock inventory studies have been published [27,82], several with the main purpose of evaluating the potential for or effect of energy efficiency measures. The overall energy use is often lower for historic buildings due to different energy behaviour [52]. More knowledge about a building's historic values can affect the user's evaluation of the accepted levels of comfort, so that residents in historic buildings respond to thermal comfort more positively [83]. Berg and Fuglseth [76] show that refurbishments of historic buildings are favourable from a climate change mitigation perspective, and that it can take more than 50 years for an equivalent new building to compensate for its environmental impact. A related concept is embodied energy, which is the energy required for all stages of a life cycle and which for buildings includes, for example, production and transportation of building materials, construction processes and maintenance [84]. When reviewing this concept, Dixit [85] found that embodied energy can be up to 60% of total building energy use. New, energy-efficient constructions also have a higher share of embodied energy than older buildings, due to increased material use. New buildings have a superior overall operational energy performance, but this should not necessarily be used as an argument against the conservation of historic buildings. In addition to effective space utilisation, it is important to use and re-use old buildings to avoid constructing new buildings when possible. Regardless of energy efficiency measures, new construction always increases the environmental impact.
In UBEM, distinguishing historic buildings from the rest of the building stock is probably most useful for later applications, as this distinction dictates which measures or actions can be implemented, and for inclusion in the archetype classification process. Construction materials affect building energy performance and the U-values used in energy simulation. Historic buildings are built differently than new construction and can as such be represented in archetype classification simply with construction year as input. To keep down the overall energy use and environmental impact, regardless of efficiency measures, it is important to use and reuse old buildings to avoid constructing new buildings when possible. The overall energy consumption in historic buildings is also tied to resident behaviour, just like general building renovation, indicating that UBEM can benefit from including renovation and the presence of historic buildings in the behaviour and occupancy models.

Energy system models
Urban energy flows concern both demand and production, so an urban energy system model needs to cover consumption and generation as well as the energy infrastructure.
Energy system models related to UBEM are often generation and distribution systems, where projects with the goal of simulating large-scale thermal city models integrate a UBEM with energy system models such as distribution systems and energy supply units [86]. Research on district-level energy systems is considerable [87], as are studies on a larger scale [45]. However, some parts of large-scale energy systems are not as well represented, examples being photovoltaic (PV) power, electric vehicles (EVs) and building or district scale load matching strategies.
The global sales of EVs are growing rapidly and several large car manufacturers are currently planning to reform their product lines to produce only EVs [88]. A larger share of buildings today have integrated PV power production and EV home-charging. One way to include these technologies in energy demand estimation is to represent consumption, PV power production and EV charging with probability distribution models [89]. Another is to use models that generate time series [90], which can be obtained from stochastic models or from historical data. It is important to connect these models to occupancy models, if such are used, so that the various parts of the system model are synchronised. The charging time for an EV depends on household occupancy and on the placement and movement of the vehicle, which is also a vital part of urban mobility models. Data on the presence of PV systems and their total energy yield are available from EPCs and can as such also be represented with distributions within the building stock. However, the importance of these data is limited, as the fraction of residential buildings with installed PV systems is still low in many countries and cities. Knowledge of installed PV capacity is nevertheless important for energy simulations, since power produced locally (i.e., in the same building) is not accounted for when measuring the total energy consumption, which then results in better energy performance. For future scenarios, the rooftop area and other building surface areas potentially available for PV systems are more important than data for existing PV systems. Using LiDAR data, the potential of rooftop-mounted PV systems can be assessed for the entire area included in a UBEM [16].
Lingfors et al. [91] created a target-based visibility assessment model for rooftop PV systems, specifically targeted at cultural heritage buildings, that can also be used for assessing roof type and inclination. With a high share of cultural heritage buildings, many factors restrict the placement potential of PV systems, including fragile or protected roofs and the loss of value of a building if the modern technology is visible from the ground or from windows. Using this model it is possible to quickly gain knowledge of the potential for solar energy in a city, which can be of great interest for UBEM applications.
An aspect of energy system modelling that could be of interest for BEM and UBEM is load matching and on-site energy storage. In many locations around the world, a building with installed PV power that produces the same amount of electricity annually as the building consumes will usually produce surplus electricity in the summer months and under-produce in winter, leading to a seasonal mismatch (the same being true for diurnal cycles, i.e. day- and night-time) [92]. Self-sufficiency and self-consumption of building electricity are important for cutting power peaks, but also for better utilising electric power when it is available and for decreasing the amount of power imported from the electricity distribution grid, which will decrease the overall electricity consumption.
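As a minimal illustration of the load-matching metrics discussed above, the following Python sketch computes self-consumption (the share of PV production used on site) and self-sufficiency (the share of the load covered by on-site PV) from two time series. The function name, the toy data and the absence of any storage are assumptions made for illustration and are not taken from the reviewed studies.

```python
def load_matching(load_kwh, pv_kwh):
    """Return (self_consumption, self_sufficiency) for two equal-length series.

    Assumes no storage: PV energy can only be used in the time step
    in which it is produced.
    """
    if len(load_kwh) != len(pv_kwh):
        raise ValueError("load and PV series must have the same length")
    # Energy used on site in each step is limited by both demand and production.
    on_site = sum(min(load, pv) for load, pv in zip(load_kwh, pv_kwh))
    total_pv = sum(pv_kwh)
    total_load = sum(load_kwh)
    self_consumption = on_site / total_pv if total_pv > 0 else 0.0
    self_sufficiency = on_site / total_load if total_load > 0 else 0.0
    return self_consumption, self_sufficiency


# Hypothetical four-step example: production equals consumption in total,
# but is concentrated in the two middle steps.
load = [1.0, 1.0, 1.0, 1.0]
pv = [0.0, 2.0, 2.0, 0.0]
sc, ss = load_matching(load, pv)
print(f"self-consumption: {sc:.2f}, self-sufficiency: {ss:.2f}")
```

In this toy case the building is net-zero over the whole period, yet only half of the PV production is used on site and only half of the load is covered, which is the kind of temporal mismatch described in the text.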
Smart charging of EVs could improve the synergy between PV, EVs and electricity consumption [93], which can lead to technical improvements but also to reduced power and electricity demand. Electricity consumption and user mobility are also considered major uncertainties in this research field, further suggesting the possibility of linking such models to developed UBEM energy demand models.
Roof type and potential PV capacity are important parameters for building energy modelling and can be obtained from LiDAR data. Data on existing installed PV capacity could also be useful and can be obtained from EPCs, even though these data are still mostly too scarce. Load matching, PV-EV synergy and impacts on energy distribution infrastructure are promising topics and could be part of UBEMs, perhaps not for individual buildings due to computational complexity, but at the level of districts and larger. UBEMs could also provide important inputs to distribution system simulations, which require data on energy use in the network nodes. For evaluating future scenarios with the help of UBEM, such combined energy system models could be a valuable tool.

Future scenarios
This section reviews existing UBEM studies in terms of applications, and examines studies that develop future scenarios based on, or related to, building modelling, along with approaches for energy demand forecasting. Scenario approaches are widely used to inform and support various planning and decision-making processes, and are usually performed from an exploratory, normative or optimisation perspective [94]. UBEM research does not necessarily have to be intended for a specific application and could strive only for optimising the model itself.
Here, only UBEM projects or related studies that aim to create future scenarios of some sort, or to aid energy efficiency planning, are examined and described. The major focus of each study is determined and gaps in the research are analysed.

Retrofitting or heat demand reduction
To quantify, forecast or accurately predict the future energy demand in the building stock is a common goal for UBEM studies [5], with the largest percentage of them being data-driven models [12]. Forecasting and prediction are common topics for other bottom-up energy demand studies as well [95,96]. A significant number of studies have been published that use a UBEM or similar to create and evaluate scenarios regarding methods for retrofitting existing buildings and their potential for decreasing the heat demand [32,42,94,97-99]. Yang et al. [100] use a pre-developed bottom-up model to predict the energy use and carbon emissions in China's building stock up until 2050. The model framework is similar to the deterministic UBEM concept and based on collected real historical and monitored data. Guo et al. [101] use a similar model to predict the future energy demand and carbon emissions under different future policy scenarios: a reference scenario, a scenario with stronger strategies for transition, an energy sufficient scenario and a scenario to reach the 1.5°C goal. The scenarios are well defined and show significant differences. However, the authors point out several weaknesses and uncertainties in the model used: renewable energy carriers are cut from the energy system, hybrid models are mentioned as a stronger alternative to deterministic models, and the model is calibrated for history and status quo but not for future analysis.
Holck Sandberg et al. [79] use a data-driven building stock model for scenario analysis of the energy demand in Norway from 2016 to 2050. The stock model is similar to a UBEM, but is dynamic and takes stock change into account. Construction, renovation and demolition over time are connected with population models to visualise the development of the building stock. Seven scenarios of changes in the building stock and energy mix are analysed with renovation as a major parameter, showing that the total delivered energy is decreased in 2050 even though the total building stock area has significantly increased. They also find that the rebound effect (see Section 4.2) results in an actual total delivered energy that is larger for all scenarios, except some that remain almost stagnant when compared to 2016. In the simulations, user behaviour reduces the saving potential from 51% for the theoretical estimate to 36% for the estimated actual energy demand from 2016 to 2050.
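To put the two saving figures above in relation to each other, the short Python sketch below expresses the gap between theoretical and actual savings as a share of the theoretical saving. This ratio is one common way of quantifying a rebound effect; it is an illustrative calculation on the reported percentages and is not claimed to be the metric used in [79]. The function name is hypothetical.

```python
def rebound_share(theoretical_saving, actual_saving):
    """Share of the theoretical saving that is not realised in practice."""
    if theoretical_saving <= 0:
        raise ValueError("theoretical saving must be positive")
    return (theoretical_saving - actual_saving) / theoretical_saving


# Saving potentials reported in the text: 51% theoretical, 36% estimated actual.
share = rebound_share(0.51, 0.36)
print(f"Unrealised share of theoretical saving: {share:.1%}")
```

With these figures, roughly 29% of the theoretical saving potential is lost to user behaviour.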
Rakha and El Kontar [35] create a neighbourhood UBEM based on a clustering approach with deterministic characterisation and use the model to evaluate scenarios for a number of design parameters. The scenarios have different heating/cooling setpoints, shading and window-to-wall ratios, and effects on energy use and solar radiation control are observed. In this way the different scenarios work more like a sensitivity analysis.
Constructing and evaluating future weather or climate scenarios is common within the field of environmental science, as in the IPCC reports or the EEA's PRELUDE scenarios [102]. UBEMs usually use real weather data sets for building thermal simulations. As weather conditions directly influence the thermal conditions and energy demand in buildings, using appropriate weather data is important for the accuracy of the results [103]. However, these weather data are historical and do not consider climate change, uncertainties or extreme conditions [12], which can render the model conclusions unreliable regarding future outcomes.
Scenarios regarding retrofit potentials are common in the UBEM field, while scenarios for energy demand estimation are common in general but not specifically in the UBEM field. This creates good opportunities for using UBEM for bottom-up estimation of energy use and, similarly, for UBEM to benefit from other types of energy demand estimation techniques.

Climate change scenarios
Climate change will not only affect how new buildings are constructed but also directly impact the existing building stock. Li et al. [9] point out that climate change could have a significant impact on future energy consumption in buildings, and that "the prediction of energy use in future climate scenarios can inform decision making by governments and private sectors". It is possible to use a UBEM framework to predict building energy use due to weather variations caused by climate change [32]. However, as of 2018, there had been no study assessing the effects of climate change and future weather scenarios on the energy performance of urban buildings [104], indicating the scarcity of such studies.
The impact of climate change on building energy consumption can be substantial, due to the direct relationship between the heating and cooling demand and the outdoor temperature. Heating demand is generally predicted to decrease in winter, while the cooling demand in the summer months will increase [105]. Most buildings are expected to see an increase in cooling demand of 50-100% or more due to climate change, while some studies have reported predicted increases of 200-1050% [106]. The largest numbers correspond to countries that have virtually no current cooling demand, but are predicted to have a demand in future scenarios. Because of this, the impact of climate change on building energy demand is also a frequently researched subject, relying on building simulation as a baseline tool [107]. However, most of those studies focus on novel climate analyses or similar aspects and hence simulate the building stocks with limited detail. Methodologies used for these studies include utilising BEM for individual buildings [108], using real case study buildings for measurements [109], and modelling neighbourhoods with specific known details [110].
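The temperature dependence described above can be illustrated with a simple degree-day calculation. This is a simplified proxy chosen here for illustration and is not the method used in the cited studies, which rely on building simulation. The base temperatures (18 °C for heating, 22 °C for cooling), the five-day temperature series and the uniform +2 °C shift are all hypothetical assumptions.

```python
def degree_days(daily_mean_temps_c, heating_base_c=18.0, cooling_base_c=22.0):
    """Return (heating, cooling) degree days for a series of daily mean temperatures."""
    # Heating degree days accumulate when the outdoor temperature is below the heating base.
    hdd = sum(max(0.0, heating_base_c - t) for t in daily_mean_temps_c)
    # Cooling degree days accumulate when the outdoor temperature is above the cooling base.
    cdd = sum(max(0.0, t - cooling_base_c) for t in daily_mean_temps_c)
    return hdd, cdd


# Hypothetical daily mean temperatures (deg C) and a uniform +2 deg C warming.
temps_now = [-5.0, 0.0, 10.0, 20.0, 24.0]
temps_future = [t + 2.0 for t in temps_now]

hdd_now, cdd_now = degree_days(temps_now)
hdd_future, cdd_future = degree_days(temps_future)

print(f"Heating degree days: {hdd_now} -> {hdd_future}")
print(f"Cooling degree days: {cdd_now} -> {cdd_future}")
```

In this toy series the heating degree days fall from 49 to 43, while the cooling degree days double from 2 to 4. The example also shows why very large relative increases appear where the current cooling demand is close to zero: a small absolute change is divided by a small baseline.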
Some studies have specifically examined several climate change scenarios and also made analyses on an urban scale. Nik and Sasic Kalagasidis [111] studied the future impacts of climate change and specific climate uncertainties on the energy performance of the building stock in Stockholm, Sweden. The study used 153 buildings to represent the Stockholm building stock, with probabilistic characterisation, and utilised a custom Simulink energy simulation module containing the heat balance equation. Several years later, Perera et al. [112] quantified the future impacts of climate change and extreme climate events on the Swedish energy system with 13 climate change scenarios. The authors used the same building energy model to simulate the residential building stock of 30 Swedish cities. They combined a certain number of buildings to represent typical urban areas in each city and assumed that the heating demand was supplied by heat pumps.
Hietaharju et al. [113] created a building stock model for evaluating the future district heating demand impacted by climate change. The stochastic building stock model is dynamic and considers demolition, new construction and renovation of existing buildings. They conclude that their model can tie the annual renovation rate to future energy efficiency goals, and that future scenarios run in the model can accurately predict the changes in overall district heating demand and relative daily variation due to climate change.
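The idea of a dynamic stock model that tracks demolition, renovation and new construction can be sketched as a yearly bookkeeping of floor area. The Python example below is a deterministic toy version written for illustration only; it is not the stochastic model of [113] or the model of [79], and every rate, floor area and energy intensity in it is a hypothetical assumption.

```python
def simulate_stock(years, area_m2, rates, intensity_kwh_m2):
    """Advance a three-segment building stock year by year.

    area_m2: floor area per segment ('old', 'renovated', 'new').
    rates: yearly 'demolition' and 'renovation' fractions of the old stock,
           and 'construction_m2' of new floor area added per year.
    intensity_kwh_m2: annual energy use intensity per segment.
    Returns the final areas and the list of yearly total demands (kWh).
    """
    area = dict(area_m2)  # copy so the caller's input is not modified
    demand = []
    for _ in range(years):
        demolished = area["old"] * rates["demolition"]
        renovated = (area["old"] - demolished) * rates["renovation"]
        area["old"] -= demolished + renovated
        area["renovated"] += renovated
        area["new"] += rates["construction_m2"]
        demand.append(sum(area[k] * intensity_kwh_m2[k] for k in area))
    return area, demand


# Hypothetical inputs for a small stock of 1000 m2.
start = {"old": 1000.0, "renovated": 0.0, "new": 0.0}
rates = {"demolition": 0.01, "renovation": 0.02, "construction_m2": 20.0}
intensity = {"old": 150.0, "renovated": 100.0, "new": 50.0}

final_area, yearly_demand = simulate_stock(10, start, rates, intensity)
print(f"Total area after 10 years: {sum(final_area.values()):.1f} m2")
print(f"Demand in year 1 and year 10: {yearly_demand[0]:.0f}, {yearly_demand[-1]:.0f} kWh")
```

With these assumed inputs the total floor area grows while the total demand falls, which is qualitatively the pattern reported for the Norwegian scenarios above. A stochastic variant would draw the yearly rates from distributions instead of fixing them.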
Yang et al. [107] provide an impact assessment of climate change on the energy performance and thermal comfort of European residential building stocks. The large model simulates the future building energy performance in 38 cities, considering 13 future climate scenarios over 90 years. The model uses the TABULA database (see Section 2) for deterministically characterised archetype buildings and average values for the heating and cooling demand. The model simulates building energy performance and indoor comfort using the software IDA ICE, uses deterministic occupancy, and calculates the cooling capacity by finding the enthalpy difference. For further research with similar models, the study suggests using a wider range of building archetypes and finer spatial resolutions of building stocks, and including socio-economic parameters.
All the previously mentioned studies point out the importance of considering climate change when determining future building energy performance or designing urban energy systems. Furthermore, they all conclude that the impact of climate change on future heating demands and power supply reliability can be assessed with the applied models. Scenarios regarding climate change impacts on energy systems related to buildings are also shown to be prevalent, but not in the UBEM field. Since several studies exist for either building stock models or individual BEM, there seems to be plenty of room for combining these into comprehensive UBEM projects.
Table 3 summarises studies of future scenarios in the reviewed building modelling research papers. Note that the parameter for comparison is listed as it is presented in each paper. Several are notably similar and all use kWh/m² as the unit of measure, namely energy use intensity, energy need intensity, energy saving, energy performance and primary energy consumption. This could cause significant confusion when comparing results from similar studies.

Recommendations for future research
Common UBEM practices and approaches have a considerable knowledge base through the extensive research that has been carried out during the last decade. However, most existing papers consider only one approach, or focus on optimisation of a pre-defined modelling procedure. Several papers suggest including the same few auxiliary parameters and/or applications that are still not particularly well researched in the field. In this review, previous research was surveyed to find research gaps, auxiliary model components of interest were reviewed, and possible improvements and applications for future research were examined and analysed.
The articles reviewed in Sections 1.2, 2 and 3 were surveyed for the aspects that the authors regarded as the most important for further research or as missing in the studies. It was found that the most common issue was that measured or metered building energy data was limited or missing. The second most frequently raised aspect was the need for including stochastic occupant behaviour models, followed by including socio-economic factors and machine learning methods in modelling. For Sections 4 and 5, certain aspects or research fields related to UBEM and relevant model inputs or approaches were examined with the purpose of assessing their position in the corresponding field, or their usefulness for integration in future UBEM research. The following are the main points.
UBEM is a diverse research field, yet similar methods are used for archetype classification. It is evident from the surveyed studies that some parameters are more relevant for the classification process; better assessment, especially through using clustering methods, and more included parameters could result in more representative archetypes. Comprehensive databases such as the EPCs may contribute to decreasing some of the common data issues in UBEM, relocating them from data gathering to data quality. However, EPCs have been labelled as reliable and as the most promising data source for building modelling purposes.
Several modelling aspects, such as stochastic occupancy models and socio-economic input parameters, are frequently mentioned as important components that models should strive to include, yet only a few models actually do so. Stochastic and dynamic occupancy and mobility models are emerging and will probably be included to a greater extent in the near future. However, differences between occupant groups are hardly considered at all, apart from the finding that occupancy rates vary widely by neighbourhood, and should be included in further research. Socio-economic factors tie in well with occupancy, mainly because these factors contribute to how people behave and cause many of the differences between occupant groups and neighbourhoods. The stochastic nature of human behaviour is also connected to the probabilities that certain events occur, which may be related to socio-economic factors. Certain behaviour may also be tied to household income, and could affect an integrated mobility model. Future research should further investigate the possibilities of including socio-economic factors in UBEM.
The vast majority of energy efficiency studies consider kWh/m² and not kWh per capita. Energy demand per capita (per building) along with income data could make an interesting tool for analysing the building stock and possibly for archetype classification, which has not yet been observed in the field. Uncertain probabilistic parameters in UBEM, such as infiltration, can be thoroughly assessed and assigned a probability distribution, which makes it possible to assign building parameter values that are statistically accurate. Future research should look into this and is likely to find that other uncertain parameters, like WWR and indoor temperature, can be assessed similarly and represented in a more accurate way. The rebound effect is important to take into account in future studies, since it makes it difficult to see how energy demand is correlated with refurbishing measures. This also casts some doubt on using renovation status as a classification parameter. It would certainly be of interest for further research to investigate the connections between the rebound effect and occupant behaviour, and to find ways to include the rebound effect in UBEM.
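Probabilistic characterisation of an uncertain parameter, as discussed above for infiltration, can be sketched as drawing one value per building from an assigned distribution instead of using a single archetype-wide value. In the Python example below, the choice of a log-normal distribution, the median of 0.5 air changes per hour and the spread parameter are hypothetical assumptions for illustration; in practice they would be fitted to measured data.

```python
import math
import random
import statistics


def sample_infiltration(n_buildings, median_ach=0.5, sigma=0.3, seed=42):
    """Draw one infiltration rate (air changes per hour) per building.

    A log-normal distribution is assumed so that all sampled values are
    positive; median_ach and sigma are illustrative, not measured, values.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    mu = math.log(median_ach)  # the median of a log-normal is exp(mu)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_buildings)]


rates = sample_infiltration(1000)
print(f"Sample median: {statistics.median(rates):.2f} ACH")
print(f"Sample range: {min(rates):.2f} to {max(rates):.2f} ACH")
```

The same pattern could be applied to other uncertain inputs mentioned in the text, such as WWR or indoor temperature, with a distribution suited to each parameter.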
There is a growing body of research on retrofit and energy efficiency measures for cultural heritage buildings. However, very few studies have utilised UBEM or even energy modelling. Cultural heritage status can be an important distinction for archetypes, as these buildings are usually different in terms of construction material, indoor temperature or renovation potential.
Local energy system models for energy conversion or city-wide energy distribution infrastructures are interesting to connect to UBEMs. Building-integrated or building-applied technologies such as PV systems are particularly important to include in UBEM. When applying PV system models, data for identifying roof types and characteristics are important and can be of help for building modelling.
It is possible to create future scenarios using place-based models that evaluate climate change impacts, energy security, sustainability, social cost-benefit energy strategies, affordability, etc. However, most studies still focus on energy efficiency and retrofitting, for which extensive research exists. Future scenario modelling studies that evaluate climate change impacts do exist but do not use 'true' UBEMs, which could be further investigated.
For studies on future scenarios, commonly used ways of evaluating the energy demand may not hold up, as variations in the energy demand might increase. Changes such as increased cooling demand due to climate change, or increased electricity demand due to electrification of industry and transport, affect the building stock and building modelling approaches. This is interesting for further research, and by including a UBEM perspective, such applications can be introduced to the demand forecasting field.

Conclusion
This review has identified best practices and possible improvements for common UBEM applications, and provided recommendations for future research within the field, based on research gaps concerning auxiliary model inputs and future scenario development.
Further potential developments for building archetype classification in UBEM are advances in data gathering from comprehensive databases such as the EPC database, and increasing the number of parameters used for classification, especially through clustering processes. There seems to be a consensus in the UBEM field that stochastic occupancy and urban mobility models are considerably important and that more research is needed. Advanced occupant behaviour models could not only be important for improving model accuracy regarding occupancy, but also be capable of incorporating changes in socio-economic and environmental conditions, as well as behaviour connected to renovation and cultural heritage, which may be the cause of differences between groups and districts. Probabilistic characterisation of traditionally deterministic parameters is another area that shows promise, and should benefit further UBEM research if applied to a larger extent.
UBEMs are currently applied to quantify, forecast or predict the future energy demand in the building stock, often utilising future scenarios. Scenarios regarding the potential for building retrofits are common, while the prevalence of most other scenarios is not as high in the UBEM field. UBEM frameworks can also be used to accurately predict building energy use due to weather variations caused by climate change, and it is of high importance to consider climate change when determining future building energy performance or designing new urban energy systems. The scarcity of studies assessing scenarios for the effects of climate change on the energy performance of urban buildings indicates a great potential for combining UBEM with scenario building and analysis.
Based on this review of current research in UBEM and similar fields, it can be concluded that there are areas in which further research and projects can be expanded and presumably improved, and that UBEM could provide an important framework for further advances in urban planning and a wide range of applications and analyses.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table 1
Archetype development in the reviewed UBEM studies.

Table 2
Model components relevant for further research, as mentioned in existing UBEM studies or reviews.

Table 3
Studies of future scenarios in building modelling research.