Spatiotemporal upscaling errors of building stock clustering for energy demand simulation

Energy demand of buildings forms a key component of energy system analyses. To integrate building energy demand into energy system models, detailed simulation of every single building is impractical and often computationally infeasible. Simulations are therefore typically performed on a sample of buildings for bottom-up analysis. For a simplified representation of the building stock, grouping and clustering are widely applied. However, a spatiotemporal quantification of the resulting upscaling errors is lacking. Here, a grouping and clustering approach is performed for all Swiss buildings for bottom-up energy demand analysis. The spatiotemporal simulation error is quantified at the national, regional and neighbourhood (300 × 300 m) scales. Whereas national energy demands are well represented, the error induced by the grouping and clustering is significant at the neighbourhood scale: simulated demand is easily over- or underestimated by double-digit percentages. Regional full load hours and daily load factors obtained with the grouping and clustering approach are comparable to those of a full simulation of all buildings. However, careful consideration must be given to the climate data used when relying on clustering approaches. Our analysis reveals the challenge of representing peak demands well by aggregation of building demand profiles, which is exacerbated by grouping and clustering. Bottom-up studies thus need to consider errors arising from the grouping and clustering approach and distinguish between errors resulting from the upscaling process and other sources of uncertainty such as data inputs or modelling concepts. Finally, calculated energy signatures are shown to perform well as a fast and viable alternative to grouping and clustering.


Introduction
Detailed simulation of building energy demand is crucial for informed policy decisions to realise the necessary sustainability transition towards net-zero emissions [1,2]. Geospatially explicit knowledge of building energy demand is indispensable, particularly for planning future energy systems, integrating renewables or designing energy infrastructure from the local to the national level. Robust decision-making depends on knowing precise building-specific and regional energy demands. Building energy modelling programs are used to simulate buildings under changing conditions such as climate change or to assess changes in demand due to building retrofit measures [3,4,5,6]. In contrast to commonly used top-down approaches for national scale analysis, bottom-up simulation approaches use detailed building representation [7]. This is crucial for capturing dynamic changes of the building stock composition, for studying the climatic impact on buildings and neighbourhoods [8] or for analysing urban densification [9]. Representing spatiotemporal patterns at an urban or regional scale based on simulation of building energy demand is, however, complex due to the large amounts of required input data and the diversity arising from occupancy or the urban built environment [10].
Energy demand in buildings is driven by diverse factors such as the climate, building characteristics, or occupant behaviour [11,12]. Thus, integrated simulation at a high spatiotemporal resolution is computationally intensive. Detailed simulation of a large number of buildings, i.e. thousands or millions, becomes time-consuming and data-intensive when considering multiple climate, weather or retrofit scenarios. To address the challenge of simulating building energy demands for a large number of buildings at a regional or national scale, grouping and clustering techniques are commonly applied. These techniques enable simulations on a subset of representative buildings, also commonly referred to as building archetypes, which can then be used for upscaling. One advantage of relying on a simplified building stock representation is the reduced computational cost of the analysis. Relying only on a limited set of buildings for upscaling simulation results will, however, inevitably lead to errors. Whereas various clustering algorithms are compared in the literature [13], the impact of grouping and clustering buildings used for upscaling is far less studied and understood. With a few notable exceptions [14,15], the quantification of this error source has largely been ignored. Given the number of authors who rely on building stock simplification, exploring the uncertainties arising from the upscaling and the simplified building stock representation is obligatory. Here, we aim to provide insights into how building energy simulations based on a simplified Swiss building stock affect regional and national scale energy demand estimates.

⇑ Corresponding author. E-mail address: sven.eggimann@empa.ch (S. Eggimann).
https://doi.org/10.1016/j.enbuild.2022.111844
0378-7788/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Energy demand simulations using a generalized building stock
This section provides a brief overview of energy studies relying on generalized building stock representations for Switzerland, which is, however, representative of similar studies in other countries. The overview focuses on approaches that group and cluster buildings to generate a simplified representation of the building stock used for upscaling. In energy research, numerous studies have used bottom-up techniques for national scale analysis based on detailed building stock representation [7]. The necessary national datasets on detailed building attributes are, however, often unavailable, outdated or incomplete. Automated approaches based on machine learning can be used to obtain more complete building attribute datasets [16,17] that serve as a basis for detailed clustering and grouping analysis.
For Switzerland, Streicher et al. [18] perform a building stock grouping to estimate national heating demands. Tardioli et al. [19] use clustering and predictive modelling to obtain 67 representative buildings and associated clusters, and Girardin et al. [20] group buildings according to age and type for integrated energy analysis for Geneva. Streicher et al. [21,22] develop a bottom-up building stock model and explore retrofit pathways using building archetypes. Silva et al. [23] apply supervised classification to distinguish between different types of buildings to explore passive cooling opportunities. Murray et al. [24] investigate optimal transformation strategies for Swiss buildings by finding representative buildings and districts with clustering methods. Gupta et al. [25] rely on a clustered building stock to study the deployment of renewables and heat pumps for assessing the impact on the distribution grid. These studies demonstrate that the energy demand of buildings is driven by various factors and consequently many different features have been used for clustering and grouping the building stock. De Jaeger et al. [14] provide a literature overview of typical features used in energy demand simulation and Goy et al. [15] provide an extensive overview of different building grouping and clustering approaches. Studies have so far typically focused on the grouping and clustering of individual buildings, even though the focus has recently been shifting to neighbourhoods [9,26,27]. Spatial or density-based clustering is furthermore commonly used, particularly in relation to district heating or other network-based energy infrastructure [28]. Whereas such approaches similarly divide and segment buildings into (spatial) clusters, the motivation for clustering is different: the goal is typically to simulate energy supply systems and to speed up the solving of energy optimization problems involving network-based infrastructure. Density-based clustering has, for example, been used to divide cities into multiple districts to perform energy hub optimization [29,30]. We consider such spatially-driven building clustering approaches as distinct and do not review them further.
Whereas we have shown that grouping and clustering are commonly used in building energy research, their validation is generally lacking. If energy demand simulations are validated, typically no clear distinction is made between the simulation error arising from the energy simulation model and that arising from the upscaling methodology [15,31]. More often, the uncertainty resulting from simulation errors due to input data or input models is quantified, which is, however, separate from upscaling errors [32]. This study builds on a few notable exceptions that consider upscaling uncertainties from grouping and clustering. De Jaeger et al. [14] compare how the number of clusters affects the simulation of peak and annual space heating demand. They find that randomly selecting buildings performs significantly worse than relying on a clustering approach. In the absence of real-world measured data, De Jaeger et al. [14] suggest analysing clustering performance by comparing values obtained from the clustering with a full simulation of all buildings. Since a full-scale comparison at the national scale may not be possible due to limited computational power, such a comparison can typically only be done for a subset of the data.

Original contribution
We note that energy demand studies typically use grouping and clustering techniques for upscaling simulated energy demands to regional or national scales without an in-depth analysis of upscaling uncertainties. To address this research gap, we perform a spatially and temporally explicit error assessment of grouping and clustering the entire Swiss building stock. We evaluate the upscaling at national, regional and neighbourhood levels for annual demand and explore hourly (peak) demands and complications arising from aggregation effects. The generic methodology outlined here is replicable, equally well suited to other countries and only constrained by data availability.
The simulation results of all simulated buildings in this study were condensed to provide building type and building age-specific energy signatures. Energy signatures (also called change-point regression models) [33] are regression models that relate outdoor climatic variables to a building's energy demand in a simple, robust and accurate way [34]. Energy demand estimations are typically based only on specific annual indicators (e.g. kWh/m²), and the dynamics related to temperature, radiation or construction properties are ignored. Even though other studies have published energy signatures for Switzerland [20,35,36], the energy signatures provided here are, to the best of the authors' knowledge, the most detailed, considering both heating and cooling demands across different building types and building construction ages.

Methods
Fig. 1 provides a schematic overview of the workflow to obtain national scale heating, cooling, domestic hot water and electricity demand profiles at an hourly time resolution based on upscaling from individual Swiss buildings. For each building, energy demands are calculated based on their floor area and simulated cluster-specific energy demands that allow regional or national upscaling.

Data collection and preparation
First, different spatial and non-spatial building databases and weather data are collected for Switzerland. All datasets and their use are summarized in Table 1. The data collection and preparation methods are outlined in more detail in the following sections.

Building geometry and attributes
The most comprehensive national-scale building dataset is the Federal Building Registry (GWR dataset) [37], which is used for determining the building type, building age and number of floors. Additionally, information based on the statistics of the company structure (STATENT) is used for detailed building characterization [38] (see Fig. 2 for detailed attribute mapping). The energy simulation framework used here, CESAR-P (Section 2.5), requires georeferenced building data (LOD1) with an average building height and detailed footprint information [24]. The building footprint polygons are taken from OpenStreetMap [39]. Having detailed building geometries available is critical for energy demand simulations [40], and therefore the swissBUILDINGS3D 2.0 dataset [41] is used for calculating the building height: the average height of each 3D building is calculated as the mean height of all roof patches of the building's multi-patch geometry. A 3-meter offset is subtracted from the building height to remove cellars, which are included in the raw dataset. This building height is then spatially merged with the building footprints obtained from OpenStreetMap. OpenStreetMap data typically distinguishes between individual buildings, even if they are attached, and thus provides individual building footprints. The information on the number of floors is spatially merged from the GWR dataset. Where the number of floors is not available (20.9% of all buildings), it is estimated by dividing the building height by an assumed standard floor height including floor slab thickness. For residential buildings (single- and multi-family homes) a floor height of 2.85 m is assumed; for service buildings, the floor height is assumed to range between 2.75 m and 4 m (plus 50 cm floor slab thickness per floor level), depending on the building footprint [42]. In case of implausibly low building heights (e.g. the building was under construction when the height measurement was recorded), the building height is corrected based on the number of floors and the same floor height assumptions.
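The floor-count fallback described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the rounding rule and the minimum of one floor are assumptions. Only the 3 m cellar offset and the 2.85 m residential floor height are taken from the text; the slab thickness is exposed as a parameter because the text specifies it only for service buildings.

```python
def estimate_floor_count(raw_height_m, storey_height_m=2.85, slab_m=0.0,
                         cellar_offset_m=3.0):
    """Estimate the number of floors from a raw 3D-model building height.

    raw_height_m    : height from the 3D dataset, cellar included
    storey_height_m : assumed clear height of one storey
    slab_m          : assumed slab thickness added per storey
    cellar_offset_m : offset subtracted to remove the cellar
    """
    usable_height = raw_height_m - cellar_offset_m
    height_per_floor = storey_height_m + slab_m
    # Rounding to the nearest integer and enforcing at least one floor
    # are illustrative assumptions.
    return max(1, round(usable_height / height_per_floor))


# Residential example: 11.55 m raw height -> 8.55 m usable -> 3 floors
print(estimate_floor_count(11.55))
# Service building example with a 4 m storey and a 0.5 m slab
print(estimate_floor_count(16.5, storey_height_m=4.0, slab_m=0.5))
```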

Weather data processing
Different climatic parameters are required to capture the effect of weather or climate change on building energy demand. Regional population-weighted weather data have been proposed to consider the spatial distribution of buildings and to capture regional climatic effects [43,44]. We generate population-weighted weather files for each climate region (Fig. 3). They provide the building energy simulation software with all required climatic input variables. As outlined in Mutschler et al. [35], an hourly population-weighting was performed for the most important climate parameters (j), for each climate zone and each climate parameter separately, calculated as follows:

$$\bar{P} = \sum_{i=1}^{n} w_i P_i, \qquad w_i = \frac{p_i}{\sum_{i=1}^{n} p_i}$$

whereby $\bar{P}$ is the population-weighted value of a climate parameter and the local weather-station (i) based parameter $P_i$ is weighted by the population weighting factor ($w_i$), given by the fraction of the population for the weather station ($p_i$) within the total population of the climate zone ($\sum_{i=1}^{n} p_i$). The population statistics used for population weighting are obtained from household statistics at a 2 × 2 km resolution [45]. The temperature data were obtained from the Swiss Federal Office of Meteorology [46] for the year 2016, which was one of the warmest years on record and 0.7 °C milder than the 1981-2010 norm [47]. The data are provided by SwissMetNet, which consists of about 160 fully automated weather stations [48]. Missing measurement values are linearly interpolated. As direct normal irradiance (DNI) profiles are not measured at any of the stations, they were derived from measured global horizontal irradiance (GHI) and diffuse horizontal irradiance (DHI) using

$$DNI = \frac{GHI - DHI}{\cos \theta_z}$$

where $\theta_z$ is the zenith angle [49]. In cases of missing DHI profiles (about 50% of the stations), DNI profiles were calculated using the DIRINT model [50,51], which is implemented in the Python package pvlib [52]. Finally, only weather stations up to an elevation of 1,200 m are considered. For 2016, these processing steps resulted in 79 viable weather stations, which are distributed across the different climate regions and are shown in Fig. 3.
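The two processing steps above, population weighting and the closure relation between GHI, DHI and DNI, can be illustrated with a short sketch. The function names are hypothetical, and the cut-off at high zenith angles is an assumption added here to avoid dividing by a near-zero cosine; the text does not state how such hours were handled.

```python
import math


def population_weighted(values, populations):
    """Weight station values by each station's share of the zone population."""
    total = sum(populations)
    if total <= 0:
        raise ValueError("total population must be positive")
    return sum(v * p / total for v, p in zip(values, populations))


def dni_from_ghi_dhi(ghi, dhi, zenith_deg, max_zenith_deg=87.0):
    """Direct normal irradiance from the closure relation (GHI - DHI) / cos(zenith).

    max_zenith_deg is an assumed cut-off: above it the sun is treated as
    too low for a meaningful result and 0.0 is returned.
    """
    if zenith_deg >= max_zenith_deg:
        return 0.0
    return max(0.0, (ghi - dhi) / math.cos(math.radians(zenith_deg)))


# Two stations at 10 °C and 20 °C serving 1000 and 3000 inhabitants
print(population_weighted([10.0, 20.0], [1000, 3000]))
# GHI 500 W/m², DHI 100 W/m², zenith 60°
print(dni_from_ghi_dhi(500.0, 100.0, 60.0))
```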

Grouping
The entire Swiss building stock is segmented and grouped according to building type, building age and climate region, which are very commonly used features in urban simulation studies [14]. As this grouping is done purely on attributes, without any clustering algorithm and in a fully supervised way [15], we use the terminology of group (as opposed to cluster) for the segmentation results. Goy et al. [15] point out that the selection of the grouping technique typically lacks justification. For upscaling results in the energy domain, we argue that grouping along key drivers of energy demand (i.e. building type, building age and climate) is a plausible methodological approach.

Building type
Building geometry and building use typically characterise the building type. Single-family homes, for example, are buildings with a relatively small building footprint and show very different occupancy profiles or appliance use compared to, for example, shops. The building type classification is based on the GWR dataset (Federal Building Registry), which contains building type information for residential and mixed-use buildings, and the STATENT dataset, which provides information for non-residential buildings. For each building, the closest GWR data point within a 20 m radius is used to define the building type. After this step, 6.8% of all buildings have no data assigned from the GWR database, and spatial interpolation is applied by attributing the most frequent building type of the 40 closest buildings. This spatial interpolation is justified by the assumption that districts are typically developed within a similar period and that building types and ages are highly spatially correlated. We only show aggregated results of the GWR and STATENT datasets to prevent privacy infringements.

Fig. 2. Workflow of how building types are derived using the GWR (Federal Building Registry) and STATENT (company structure statistics) datasets. For STATENT, the attribute NOGA08_SECTOR is used; for the GWR dataset, the attribute GKLAS.
Nine different building types are considered here, namely single-family homes (SFH), multi-family homes (MFH), restaurants and hotels, shops, offices, hospitals, schools, industrial buildings and other buildings. The method for the building type classification based on the two datasets is specified in Fig. 2 (see Appendix C for detailed attribute mapping). All buildings classified as 'other' and 'industry' are not further considered for this study.

Building age
Building energy demand is closely linked to the building age. Typically, construction periods are used [53,21], which allow construction properties to be assigned to characterize the buildings within the energy simulation software [6]. For upscaling and simplification purposes, we only consider five different building age classes (Table 2). When building age information is unavailable (11.3% of all buildings), we perform the same spatial interpolation as for the building type. For the sampling process in Section 2.4, we always randomly assign a specific year within the building age class period.

Climate region
Climate regions are obtained from the CH2018 [54] dataset to geographically group the buildings and calculate regional climate weather files. We ignore large urban agglomerations in the Southern and Eastern parts of Switzerland so that all urban agglomerations roughly fall in a similar climate zone (Fig. 3). The reason for this simplification is to obtain a weather file that represents the specific climate within each climate zone, for which we make the simplifying assumption that the climate is comparable within the total extent of each climate region. Within each climate zone, weather data are processed as outlined in Section 2.1.2.

Clustering
After the grouping using building types, building ages and climate regions, all buildings of the same group are further partitioned by clustering. Whereas the first grouping (Section 2.2) was performed directly without any clustering algorithm, the follow-up clustering is based on feature variables that first need to be calculated (Section 2.3.1) before applying the k-medoids algorithm (Section 2.3.2).

Clustering feature value calculation
For the clustering analysis, three feature values (Fig. 4) that capture key drivers of energy demand are calculated for every building:

Building compactness ($f_c$): A full range of morphological building measures has been proposed considering only the building footprint [55]. Other authors propose compactness measures based on 3-dimensional geometries [56]. The compactness ratio relates the heated space to the associated external surfaces through which energy flows and describes the thermal behaviour of a building well. The most common compactness indicator is the surface-to-volume ratio, which is derived by dividing the surface of a building by its volume.

Building density ($f_d$): The building footprint coverage in the neighbourhood of every building is calculated. This measure indicates the urban or structural density, i.e. the degree to which a building is embedded in the urban built environment. This feature is interesting from an energy point of view, as it provides information on the solar radiation exposure of the façades or the geothermal potential. Exposed façades influence both heating and cooling demands through passive solar gains [57,58] or enable effective solar energy harvesting on façades. With a high density of buildings, the specific geothermal supply capacity for heating and cooling is reduced [59,60]. At a given borehole depth, the energy density or building density determines what proportion of the energy demand can be met with geothermal energy. For the calculation, the building footprint is first buffered with a radius ($r$) to obtain a buffer area ($A$). To capture the effect of solar irradiation on the building, the neighbourhood is limited to the area that potentially affects the building by shading. The buffer radius depends on the building height ($h$) and the inclination angle ($\alpha$):

$$r = \frac{h}{\tan \alpha}$$

An angle of 20° is assumed for the inclination angle, approximately representing the inclination of the sun at 1 pm CET on the 20th of December in Zürich, Switzerland. The building buffer is then intersected with neighbouring building footprints and the buffer area covered with buildings ($A_b$) is divided by the total buffer area:

$$f_d = \frac{A_b}{A}$$

Building size ($f_s$): Building floor area is assumed to be an approximation of the energy reference area (see Section 3.3 where we correct for this assumption). The floor area is derived by multiplying the number of floors by the building footprint. The number of floors is either taken from available floor level information or calculated by dividing the building height by the assumed floor height including floor thickness (cf. Section 2.1.1).
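A minimal sketch of the three feature calculations is given below. It works on pre-computed areas and volumes rather than on real geometries, so the intersection of the buffer with neighbouring footprints (typically done with a GIS library) is not shown. The function names are hypothetical, and the buffer radius is assumed to follow the shadow-length relation r = h / tan(α), which is the geometric reading of the description above.

```python
import math


def compactness(surface_m2, volume_m3):
    """Surface-to-volume ratio f_c in 1/m."""
    return surface_m2 / volume_m3


def shading_buffer_radius(height_m, sun_angle_deg=20.0):
    """Assumed shadow-length relation r = h / tan(alpha)."""
    return height_m / math.tan(math.radians(sun_angle_deg))


def building_density(covered_area_m2, buffer_area_m2):
    """Share f_d of the buffer area covered by neighbouring footprints."""
    return covered_area_m2 / buffer_area_m2


def building_size(footprint_m2, n_floors):
    """Floor area f_s as footprint times number of floors."""
    return footprint_m2 * n_floors


# A 10 m cube: 600 m² surface, 1000 m³ volume
print(compactness(600.0, 1000.0))
# A 10 m high building with the 20° sun angle from the text (about 27.5 m)
print(round(shading_buffer_radius(10.0), 1))
print(building_density(250.0, 1000.0))
print(building_size(120.0, 3))
```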

Clustering algorithm
The k-medoids algorithm is applied for partitioning buildings within a group of buildings into different sub-groups. Medoids are the objects in each cluster that minimize the sum of dissimilarities, i.e. distances, from the medoid to all other objects in the cluster [61]. The k-medoids clustering is closely related to the k-means algorithm but is more robust to noise and outliers [62]. This is crucial here to exclude non-conventional buildings that are least representative. The clustering is performed in 3-dimensional feature space with the previously outlined features. Because the variables used are measured on a continuous scale and in different units, we standardize each variable to unit variance [63]. Before clustering, scaled values ($Z$) are calculated for all values ($x$), as described in Eq. 4, where $u$ represents the mean feature value and $s$ the standard deviation:

$$Z = \frac{x - u}{s} \qquad (4)$$

The number of clusters for the k-medoids clustering is set a priori to four per building group. If a group has fewer than 100 buildings, all buildings are assigned to a single cluster. The scikit-learn package is used for scaling and clustering [64].
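To make the two steps concrete, the following dependency-free sketch standardizes feature columns and then runs a simple alternating k-medoids procedure. It is an illustration under stated assumptions and not the library routine used by the authors: the initialisation with the first k points, the Euclidean distance and the stopping rule are all choices made here for brevity.

```python
import math


def standardize(rows):
    """Scale every feature column to zero mean and unit variance (z-score)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column would have zero deviation; fall back to 1.0.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda c: math.dist(p, points[medoids[c]]))
            for p in points]


def k_medoids(points, k, max_iter=100):
    """Alternate between assignment and medoid update until stable.

    Returns (medoid_indices, labels). Initialising with the first k points
    is a naive, deterministic assumption made for this sketch.
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:            # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance
            # to all other members of its cluster.
            best = min(members, key=lambda i: sum(
                math.dist(points[i], points[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


raw = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
medoids, labels = k_medoids(standardize(raw), k=2)
print(medoids, labels)
```

Because medoids are actual data points, each cluster centre corresponds to a real building, which is what makes the method convenient for selecting representative buildings.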

Sampling
Simulating all buildings within each cluster to obtain averaged results is computationally expensive, and the computational cost increases with the number of building samples. Therefore, averaged specific energy demands (kWh/m²) are calculated for each cluster based on 50 sampled buildings per cluster. The conceptual assumption is that buildings whose feature values have the smallest distance to the cluster feature mean are most representative of the entire cluster. Buildings within each cluster are therefore sampled according to the distance to the cluster mean of all clustering feature variables. To explore the sensitivity of selecting only a limited number of buildings, we analyse how the specific energy demand changes with increasing sample size (see Appendix A, Note B).
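The sampling rule can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that the feature values passed in are already standardized and that the Euclidean distance is used; neither detail is stated explicitly for the sampling step.

```python
import math


def sample_closest_to_mean(features, n_samples=50):
    """Return indices of the n_samples buildings nearest the cluster feature mean.

    features: list of per-building feature vectors (assumed standardized).
    """
    n = len(features)
    dims = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dims)]
    # Sort building indices by Euclidean distance to the cluster mean.
    order = sorted(range(n), key=lambda i: math.dist(features[i], mean))
    return order[:n_samples]


cluster = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
print(sample_closest_to_mean(cluster, n_samples=2))
```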

Energy simulation
Only useful energy demand is considered here, i.e. the efficiencies of technologies and how energy demands are met are not considered. To simulate useful heating, cooling and electricity energy demands, the Combined Energy Simulation and Retrofit in Python (CESAR-P) software [65,6] is used. CESAR-P is an urban building energy simulation framework based on EnergyPlus [66]. A predecessor of CESAR-P was validated and compared to measured data in [6,67]. For building element characterization, CESAR-P assigns typical values based on the building age class. Cooling and heating setpoints follow the design parameters in the SIA 2024 standard [68]. As the simulation software assumes infinitesimally thin floor slabs, a slab thickness (30 cm for SFH and MFH, 50 cm for all other building types) per floor level is assumed and deducted when setting up the building geometry for the energy simulation. Accounting for the slab thickness is necessary when calculating the building volume that needs to be heated or cooled. Analogously, a 40 cm external wall buffer is assumed. Buildings within a radius of 100 m are considered to capture the effect of shading and reflections from surrounding buildings. CESAR-P relies on weather files that are used for building performance simulation (Section 2.1.2).
The entire analysis of this study, i.e. the simulation of energy profiles and the spatiotemporal analysis, is performed for the year 2016.
The spatial aggregation of multiple simulated building demand profiles bears the challenge of representing peak demands well. Accurate peak representation is particularly challenging for bottom-up approaches [69], as aggregating multiple identical profiles does not capture the variability in occupancy or behaviour well [70,71]. The coincidence, load and diversity factors, which are typically derived from empirical studies or observations, are commonly used metrics to describe the variability in the temporal shape or characteristics of energy demand profiles [72]. Typically, different profiles are assumed and a selection of pre-defined profiles is chosen to introduce variability [73]. CESAR-P allows buildings to be simulated using nominal and variable schedules. In the case of nominal schedules, buildings having the same characteristics (e.g. building type, building age etc.) are all run with the same default parameters and the same occupancy schedules. In the case of variable schedules, the occupancy schedules, together with heating and cooling, lighting, and electricity schedules, are randomly selected from a library of 100 schedules to capture inter-building variability for each of the obtained clusters. For a full description of the method of randomizing profiles, we refer the reader to Wang et al. [6]. No variability in absolute demands is considered and only the impacts of horizontal variability are simulated, i.e. variability related to the time of energy use. Nominal schedules are used for calculating average demands per cluster (Section 2.4) and for comparing annual demands at neighbourhood and regional levels (Figs. 8-9). Variable profiles are used to investigate the aggregation effects at the national scale (Section 3.3).

Upscaling analysis
With the help of the cluster-specific energy demands (kWh per m²), obtained by averaging across all sampled buildings per cluster, total building energy demands are calculated based on their floor area (Fig. 6). The product of specific energy demand and floor area at the building level can be aggregated at different scales (e.g. municipal, district, cantonal, national). Industrial buildings are excluded, as industrial processes often dominate compared to heating and cooling and require a different modelling approach.
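The upscaling step amounts to a multiplication followed by a grouped sum, as the following sketch shows. The data layout (tuples of region, cluster identifier and floor area) and all names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def upscale(buildings, specific_demand):
    """Aggregate building demands per region.

    buildings       : iterable of (region, cluster_id, floor_area_m2)
    specific_demand : dict mapping cluster_id to kWh per m² and year
    Returns a dict mapping region to total demand in kWh per year.
    """
    totals = defaultdict(float)
    for region, cluster_id, floor_area_m2 in buildings:
        totals[region] += specific_demand[cluster_id] * floor_area_m2
    return dict(totals)


buildings = [("A", "c1", 100.0), ("A", "c2", 200.0), ("B", "c1", 50.0)]
specific_demand = {"c1": 80.0, "c2": 50.0}
print(upscale(buildings, specific_demand))
```

Choosing a different aggregation key (municipality, canton, grid cell) changes only the first element of each tuple.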

Spatiotemporal validation
For three sample communities, we run simulations for archetypical buildings and upscale results to the neighbourhood and community scale. Additionally, we simulate each building individually (i.e. full simulation) to assess upscaling errors. This enables the error of the grouping and clustering approach to be determined by comparing the full simulation to the bottom-up clustering demand estimation. Because the full simulations are computationally intensive, this validation was only performed for selected communities, namely Brig-Glis (2,731 buildings), Muri bei Bern (3,874 buildings) and Chur (5,265 buildings). They reflect different climate zones, types and sizes of communities across Switzerland. The spatial comparison of the full simulation and the upscaling result was performed across the communities on a 300 × 300 m grid to gain insight into the spatial characteristics and dimensionality of the upscaling error. This spatial extent was chosen to reflect differences at urban neighbourhood scales. For this analysis, two types of weather data are used: (1) weather data from the geographically closest weather station and (2) the same population-weighted weather data as used for obtaining the average clustering results. The spatial comparison is performed using nominal schedules for the full-simulation approach. For the comparison, all 'other' and 'industrial' buildings, for which no simulation was performed, were excluded, as were raster cells with fewer than two buildings. Additionally, building polygons with inner polygon rings (i.e. courtyards) are ignored, as these otherwise resulted in errors in the simulation process since CESAR-P currently does not support such geometries.
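The grid-based comparison can be sketched as follows. The function name and data layout are hypothetical, and the sign convention of the error (upscaled minus full simulation, relative to the full simulation) is an assumption; the 300 m cell size and the exclusion of cells with fewer than two buildings follow the description above.

```python
import math
from collections import defaultdict


def grid_errors(buildings, cell_size_m=300.0, min_buildings=2):
    """Percentage upscaling error per grid cell.

    buildings: iterable of (x_m, y_m, full_sim_kwh, upscaled_kwh)
    Returns a dict mapping (column, row) to the error in percent.
    """
    cells = defaultdict(lambda: [0, 0.0, 0.0])  # count, full, upscaled
    for x, y, full, upscaled in buildings:
        key = (math.floor(x / cell_size_m), math.floor(y / cell_size_m))
        cells[key][0] += 1
        cells[key][1] += full
        cells[key][2] += upscaled
    return {
        key: 100.0 * (upscaled - full) / full
        for key, (count, full, upscaled) in cells.items()
        if count >= min_buildings and full > 0
    }


buildings = [
    (10.0, 10.0, 100.0, 120.0),   # cell (0, 0)
    (200.0, 250.0, 100.0, 100.0), # cell (0, 0)
    (400.0, 10.0, 50.0, 40.0),    # cell (1, 0), dropped: only one building
]
print(grid_errors(buildings))
```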
We rely on different measures to compare and validate the simulated building load profiles against the clustering simulations (see results in Sections 3.4 and 3.5). Load factors are used, which are defined by dividing the average daily demand by the daily peak demand. Additionally, annual full load hours are used, which are defined by dividing the annual demand by the maximum hourly peak demand. The analysis is performed over the entire year, i.e. peak hourly demand represents the hour with the highest aggregated demand in a year over a case study area (i.e. hourly demand is summed over all buildings). However, as a practical approach to remove outliers and account for uncertainties from aggregation, the 95th percentile of all hourly values is used to define the annual peak demand. We limit ourselves to a regional (community) peak comparison, since we consider the uncertainty to be too high when aggregating hourly profiles for a limited number of buildings at the neighbourhood level.
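Both metrics can be computed from an aggregated hourly profile as sketched below. The function names are hypothetical, and the linear-interpolation percentile is an assumption, since the text does not specify how the 95th percentile was evaluated.

```python
def daily_load_factors(hourly):
    """Average daily demand divided by daily peak, for each complete day."""
    factors = []
    for start in range(0, len(hourly) - len(hourly) % 24, 24):
        day = hourly[start:start + 24]
        peak = max(day)
        factors.append(sum(day) / 24 / peak if peak > 0 else float("nan"))
    return factors


def percentile(values, q):
    """Percentile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def full_load_hours(hourly, peak_percentile=95.0):
    """Annual demand divided by the percentile-based peak demand."""
    return sum(hourly) / percentile(hourly, peak_percentile)


day = [1.0] * 12 + [2.0] * 12       # half the day at 1 kW, half at 2 kW
print(daily_load_factors(day))      # one value per day
print(full_load_hours([2.0] * 48))  # a flat profile runs at peak all the time
```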

Grouping and clustering
Of the 2.3 million Swiss buildings in our dataset, about 4.7% were classified as industrial buildings and 21.8% as 'other' (e.g.silos, very small buildings, sheds, barns) and therefore ignored.In total, 1.72 million remaining buildings were grouped into 270 groups of buildings.We obtain the theoretical maximum of 270 groups (5 building ages * 6 climate zones * 9 building types) as buildings are available for all possible combinations.Fig. 5 shows detailed statistics of the grouping results.The average number of buildings within one group is about 8 0 600 (median about 1 0 800).The most frequent building types are single-family (41.4%) and multi-family homes (23.2%).The distribution according to age class shows that the historic building stock (<1945) is considerable (36.0%).Most buildings are found in the densely populated Central Plateau (35.6%) according to the climate zone definition in Fig. 3.
The mean values of each group, obtained by grouping based on age, climate and building type, are visualized in Appendix B, which reveals that a simple grouping yields distinctive groups of buildings. Whereas some of the obtained groups only contain a few buildings, other groups contain thousands of buildings. After the grouping step, clusters are calculated for each group as outlined in Section 2.3. This clustering step results in a total of 756 clusters for which we calculate average energy demands based on 50 sampled buildings (cf. Appendix A, Note B). The sampling methodology is outlined in Section 2.4.
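The grouping step can be illustrated with a short sketch. The class labels below are placeholders, since the actual age classes, climate zones and building types are defined elsewhere in the paper, and the dictionary keys `age`, `zone` and `type` are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Placeholder labels: 5 age classes, 6 climate zones, 9 building types.
AGE_CLASSES = [f"age_{i}" for i in range(1, 6)]
CLIMATE_ZONES = [f"zone_{i}" for i in range(1, 7)]
BUILDING_TYPES = [f"type_{i}" for i in range(1, 10)]

# Every possible (age class, climate zone, building type) combination.
ALL_GROUPS = list(product(AGE_CLASSES, CLIMATE_ZONES, BUILDING_TYPES))

def group_sizes(buildings):
    """Count buildings per (age class, climate zone, building type) group."""
    return Counter((b["age"], b["zone"], b["type"]) for b in buildings)
```

`len(ALL_GROUPS)` gives the theoretical maximum of 270 groups; comparing it with the number of keys returned by `group_sizes` shows whether every combination is actually populated.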

Heating and cooling signatures
Energy signatures provide valuable information for approximating energy demand, relying only on ambient temperature and on assumptions about building type and building age. Energy demand signatures are obtained from all energy simulations across all climate zones (36,616 buildings) per building type and building age and calculated with averaged cluster-specific energy demands (Fig. 6). Fig. 6 shows energy signatures based on mean daily temperatures. For hourly signatures, see Appendix A, Note 1. The differences between the intersection points of heating and cooling for different age classes reveal not only that heating demand is lower for newer buildings, but also that the duration of the heating and cooling season depends on the building age. The heating period is longer for older buildings, which require heating at higher temperatures. The cooling period is longer for newer buildings, as cooling is already required at lower outdoor temperatures. The energy signatures were calculated across all climate zones; differences in irradiation are thus not distinguished and the signatures are only an overall approximation. The linear approximation works better for heating than for cooling, as indicated by the coefficients of determination (Appendix D). It needs to be noted that using temperature alone to approximate cooling demand is a simplified approach [74], as other factors such as solar radiation play a more prominent role in cooling than in heating.
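A linear energy signature of this kind can be sketched as an ordinary least-squares fit of demand against mean daily temperature. The snippet below is an illustration under that assumption, with hypothetical function names; it does not reproduce the fitting parameters of Appendix D.

```python
import numpy as np

def fit_signature(temps, demand):
    """Fit demand = slope * T + intercept and return (slope, intercept, R^2)."""
    temps = np.asarray(temps, dtype=float)
    demand = np.asarray(demand, dtype=float)
    slope, intercept = np.polyfit(temps, demand, 1)
    predicted = slope * temps + intercept
    ss_res = np.sum((demand - predicted) ** 2)
    ss_tot = np.sum((demand - demand.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot

def crossing_temperature(heating, cooling):
    """Temperature at which the heating and cooling lines intersect."""
    (slope_h, intercept_h), (slope_c, intercept_c) = heating, cooling
    return (intercept_c - intercept_h) / (slope_h - slope_c)
```

Fitting the heating and the cooling signature separately and passing both `(slope, intercept)` pairs to `crossing_temperature` gives the intersection point discussed above; the returned R² value indicates how well the linear approximation holds.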
For validation purposes, the simulated heating and cooling signatures for MFH and offices for the age class >2010 are compared to measured values of a building demonstrator (NEST) at Empa in Dübendorf [75]. NEST consists of residential and office units for which high-quality measurement data are obtained under realistic conditions. Compared to NEST, the simulated crossing point of the heating and cooling curves is less pronounced [35]. The measured heating demand starts at higher outdoor temperatures and hence the crossing point is more pronounced in the measured values. Furthermore, the measured slope of the heating signature is steeper, particularly for offices.

Specific energy demands per building type and age
Specific heating and cooling useful energy demands per age class and building type for the year 2016 are shown in Table 3. To compare specific cooling and heating demand to values from other years, we recommend scaling based on heating or cooling degree-day differences [76]. The results are consistent with those obtained in [6,67], where the simulations were based on the same simulation software. A cross-comparison with Streicher et al. [21], however, shows that heating demands are overestimated for older buildings. This can be explained by the combination of a colder year and unaccounted retrofits of older buildings (<1970), where the portions of retrofitted building elements, specifically ground, roof and window, were shown to be frequently above 50% [18]. Comparing the heating energy demands for SFH and MFH to Streicher et al. [21], the simulated values for buildings built after 2010 are lower by at least a factor of two. This is, however, within the expected range for newer buildings according to Wang et al. [67]. Comparing the results of older age classes of MFH with Streicher et al. [21], heating energy demands are slightly lower in our simulation for the 1986-2010 building age category and higher for buildings older than 1961. For SFH older than 1961, our values are in a comparable range, but they are lower for the 1986-2010 age class. Simulations are furthermore compared to data from the year 2019 of the building demonstrator NEST (Section 3.2.1), which was on average approximately 0.3 °C warmer than the reference year 2016. The residential NEST units correspond to a modern (>2010) multi-family home and the specific space heating demand of 38 kWh/m²/year is significantly higher than the simulated value of 13.8 kWh/m²/year (Table 3). This difference may be explained by the different choice of year or by differences in setpoints, as the setpoints in the SIA norms are lower than those at NEST. In summary, the energy analysis can be improved by revisiting the parametrization of CESAR-P to better account for past energy retrofits or U-values.
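The recommended degree-day scaling can be sketched as follows. This is a simplified illustration: the single base temperature of 18 °C is an assumption made for this example, not a value taken from the paper, and the function names are hypothetical.

```python
def heating_degree_days(daily_mean_temps, base=18.0):
    """Sum of positive differences between a base temperature and daily means."""
    # The base temperature is an illustrative assumption.
    return sum(max(base - t, 0.0) for t in daily_mean_temps)

def scale_to_year(demand_ref, degree_days_ref, degree_days_target):
    """Scale a specific demand proportionally to the degree-day ratio."""
    return demand_ref * degree_days_target / degree_days_ref
```

The same proportional scaling applies to cooling if cooling degree days are used instead, i.e. the positive differences between daily mean temperatures and a cooling base temperature.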
For cooling, we simulate the highest demands in modern SFH and shops. New buildings show considerably higher cooling loads due to high glazing ratios and low infiltration rates [23]. Most simulated cooling demand is currently not serviced by mechanical cooling technologies: particularly in the residential sector, cooling plays only a minor role, although this could change with climate change [35]. Compared to Streicher et al. [21], the measured and simulated values for residential cooling are comparable. Similarly, for offices, the simulated values for heating differ significantly for newer (>2010) offices (17.8 kWh/m²/year compared to 60 kWh/m²/year). The cooling energy demand for offices is again comparable.
Electricity demand and domestic hot water statistics are shown in Table 4. Electricity demands do not include electricity used for space heating, such as electricity consumption from heat pumps.

National upscaling
National electricity, cooling and heating energy demands are calculated based on the grouping and clustering analysis and the respective simulated building-specific energy demands (Section 2.6). The upscaling is based on the 36,616 simulated buildings, which represent about 2% of all considered buildings for which the demand is estimated. Fig. 7a compares the calculated Swiss floor area with energy reference area (ERA) estimates of other studies. To validate the obtained bottom-up results, we use national energy statistics provided by SFOE [77]. The ERA is deduced from the total simulated floor area assuming an average heated area factor of 0.9, which is a conservative estimate according to pom+ [78]. For the simulation year 2016, we obtain combined residential and service national annual demands of 82.6 TWh for heating, 10.9 TWh for domestic hot water, 24.4 TWh for electricity, and 4.2 TWh for cooling (Fig. 7b). Electricity demand excludes electrified space heating, domestic hot water, and space cooling. The electricity demand for cooling from SFOE [77] is converted into space cooling demand assuming a COP value of 3.
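The two conversions used for the national validation are simple multiplications. The sketch below only restates them with the factors given in the text; the function and constant names are hypothetical.

```python
HEATED_AREA_FACTOR = 0.9  # average heated share of the floor area (from the text)
COOLING_COP = 3.0         # assumed coefficient of performance (from the text)

def energy_reference_area(total_floor_area, factor=HEATED_AREA_FACTOR):
    """Energy reference area (ERA) deduced from the total floor area."""
    return factor * total_floor_area

def cooling_demand_from_electricity(electricity, cop=COOLING_COP):
    """Useful space cooling demand delivered by a given electricity input."""
    return cop * electricity
```

Both functions preserve the unit of their input, so a floor area in m² yields an ERA in m² and an electricity demand in TWh yields a cooling demand in TWh.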
Comparing our calculated ERA values (Fig. 7a), we overestimate the combined residential and service ERA by about 9% compared to the SFOE statistics. In the residential sector, we fall in the upper 20% of the main reviewed studies and overestimate the ERA by around 6%. In the service sector, for which fewer studies are available, our results are 54% higher than those of Schneider et al. [27], which could be due to classification differences of building types. Nevertheless, the difference compared to the SFOE study is smaller (17%). Concerning the total national demands (Fig. 7b), the highest discrepancy occurs for the space heating demand (about 27%). Residential heating demand is, in general, higher compared to all the main studies, with the largest deviation (about 35%) compared to the SFOE study. In the service sector, the heating demand is within 8.8%. For both the residential and the service sectors, the higher estimated heating demand is likely due to a combination of the higher floor area estimates and not accounting for retrofitting of older buildings, as evidenced by the higher specific heating demands (see Table 3). When considering the residential heating demand of other studies [79,80,18], all overestimate the demand compared to the SFOE while, at the same time, underestimating the ERA. In other words, the average specific heating demand in the residential sector in the SFOE study, with a value of 86 kWh/m²/year, is significantly lower compared to the literature. Additionally, while the deviations in the domestic hot water and electricity demand are within 9% of the SFOE values, deviations within each sector can be significant (Fig. 7b). Similarly, for the cooling demand, while we are within 15% of the total demand, we grossly overestimate residential demands (about 1,400%) since the majority of the cooling demand in Switzerland is unmet, as most households do not use mechanical cooling. In the service sector, the cooling demand is significantly underestimated (close to 63%). However, differences in internal heat gains from occupancy and equipment, solar heat gains through the windows, as well as temperature set points, can significantly alter cooling demands. Finally, a further source of uncertainty is that the building footprints from OpenStreetMap used in this study sometimes include roof overhangs and thus overestimate the actual footprint area. The difference between the national-scale upscaling results and the data used for validation provided by SFOE [77] is also summarized in Appendix Note E.
Fig. 8 shows hourly aggregated Swiss demands based on the grouping and clustering analysis. Considerable peaks and unrealistic ramping behaviour are observed when nominal profiles are assumed. It has been established that demand in a district is different from the sum of the individual buildings' demands [10]: when we introduce variability in the profiles, where each cluster is represented by the average of up to 50 different variable profiles, the effect of variability is maintained and peak coincidence is reduced at least to some degree when the profiles are upscaled, which can also be seen from the lower values of the coefficients of variation.
However, when aggregating clustering results for a large number of buildings, as is the case here, peak demands could still be overestimated. The mechanism behind a potential overestimation is that identical profiles are aggregated for each building assigned to the same cluster. This effect of overlaying peaks is expected even when using 50 variable electricity profiles if thousands of buildings are aggregated, let alone millions of buildings. We note a reduction in peak demand when using variable load profiles. This effect is also captured in the coefficient of variation (defined by dividing the standard deviation by the mean) for each of the annual profiles (see table inset in Fig. 8). Peak reduction is highest for electricity, followed by cooling demand. This is to be expected, as electricity and cooling demand contain an overall higher number of variable profiles, with electricity and cooling demand more evenly split between the residential and service sectors. On the other hand, higher fluctuations are observed for both space heating and domestic hot water demand, as they are dominated by residential demand and therefore contain an overall lower number of variable profiles. For weather-dependent energy demand, particularly heating, a combination of coinciding occupancy and lowest temperatures in the early morning hours, dominated by the residential sector, leads to high ramp rates during this time.
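The mechanism of overlaying peaks can be reproduced with a toy example. The daily profile, the number of buildings and the hourly shifts below are invented purely for illustration and are not data from this study; shifting one profile in time is only a stand-in for the variable occupancy profiles.

```python
import numpy as np

def coefficient_of_variation(profile):
    """Standard deviation of a profile divided by its mean."""
    profile = np.asarray(profile, dtype=float)
    return profile.std() / profile.mean()

# Invented 24 h profile with a two-hour morning peak.
base = np.array([1.0] * 6 + [5.0] * 2 + [1.0] * 16)
n_buildings = 50

# Nominal case: the identical profile is aggregated for every building.
nominal = n_buildings * base

# Variable case: profiles are shifted by 0 to 4 hours before aggregation.
shifts = np.arange(n_buildings) % 5
variable = sum(np.roll(base, int(s)) for s in shifts)
```

Both aggregates contain the same total energy, but `nominal.max()` is 250 whereas `variable.max()` is 130, and the coefficient of variation of `variable` is correspondingly lower. Multiplying one profile by the number of buildings leaves the coefficient of variation unchanged, which is why aggregating identical cluster profiles preserves the peaks.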
Quantifying the degree of hourly peak overestimation at the national scale is out of scope here (cf. Section 3.4 for a discussion of this issue at the regional scale). To account and correct for this aggregation problem, further procedures would need to be developed and applied; this is addressed and studied elsewhere [10,81].

Regional and neighbourhood upscaling errors
For a selection of Swiss communities, simulated annual energy demands for each individual building ("full simulation") and the energy demands for each building obtained with the grouping and clustering approach ("clustering") are compared at the neighbourhood and regional scale. For the neighbourhood analysis, the differences between the two approaches are calculated across all buildings that are within the same cell of a 300 × 300 m grid. Population-weighted climate data and nominal occupancy profiles are used for the clustering and the full simulation. For comparing annual demands, the error can be assessed equally well with nominal or variable occupancy profiles. Analysing the spatial upscaling errors for all neighbourhoods in Fig. 9, we note that the errors are generally within ±25%. A combination of building type and building density is likely to cause these differences. In the case of large building coverage, the annual upscaling error at the neighbourhood scale is often below 10%. Cells with only a limited number of buildings (<20) show the highest error. Larger differences between the two approaches are, however, to be expected if only very few buildings are compared. If energy demands are aggregated across an increasing number of buildings, e.g. by selecting a larger cell size, the error diminishes.
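The cell-wise comparison can be sketched as follows. The input format (one tuple of projected coordinates in metres plus both annual demands per building) and the sign convention (clustering minus full simulation, relative to the full simulation) are assumptions made for this illustration.

```python
from collections import defaultdict

def cell_errors(buildings, cell_size=300.0, min_buildings=2):
    """Relative upscaling error in percent for each grid cell.

    buildings: iterable of (x, y, full_demand, clustered_demand) tuples.
    """
    # Per cell: [number of buildings, full demand, clustered demand].
    cells = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y, full, clustered in buildings:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key][0] += 1
        cells[key][1] += full
        cells[key][2] += clustered
    # Cells with too few buildings or without demand are skipped.
    return {
        key: 100.0 * (clustered - full) / full
        for key, (count, full, clustered) in cells.items()
        if count >= min_buildings and full > 0
    }
```

Increasing `cell_size` aggregates more buildings per cell, which is the case in which the error is reported to diminish.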
The impact of using different weather data is shown in Fig. 10, i.e. the differences in energy demands when using population-weighted weather data (capturing the weather of the entire climate zone) versus specific local weather data of each community. If the geographically closest weather station is used for each community, the error between the two approaches is larger. This is to be expected, as population-weighted weather data representing an entire climate region is used in the grouping and clustering approach. Therefore, to isolate the error resulting from the two approaches from the error arising from using different climate data, the same analysis is performed using identical climate data. Fig. 9 and Fig. 10 allow several conclusions to be drawn: The choice of the weather station is critical to obtain sensible regional results [82,83]. Not surprisingly, using local climate data leads to bigger differences between the two approaches, and calculations based on local climate data outperform the clustering approach. However, we note that, independently of the choice of weather stations, the variability resulting from the methodological upscaling errors is significant at the neighbourhood scale. Whereas the regional differences across entire Swiss communities are reasonable, particularly if using the same climate data, errors at the neighbourhood scale can be in the double-digit percentage range. However, the annual error at the urban neighbourhood level between the full simulation and the clustering approach depends on urban density and is frequently in the ±10% range. Therefore, the use of clustering and grouping represents the energy demands considerably well. Extreme outliers in Fig. 10 could be omitted with a more realistic neighbourhood representation instead of a grid. The regional deviation across the case studies is comparable (red points). The error at the neighbourhood scale is largest for cooling and negligible for electricity and domestic hot water. Green points show the error of the full simulation relative to a calculation based on the calculated energy signatures. For cooling, the energy-signature-based calculations are less accurate, as cooling also depends on many factors other than temperature. However, the generated energy signatures in combination with local weather data are a fast and viable alternative to the grouping and clustering approach.

Fig. 8. Comparison of the aggregated (i.e. upscaled) hourly demand profiles for the Swiss building stock using variable (red lines) and nominal (grey lines) load profiles. The table inset summarises the coefficients of variation (CVs) for each of the profiles: Electricity (EL), domestic hot water (DHW), space heating (SH) and space cooling (SC). The total annual demands are equal for the nominal and variable profiles. For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.
In summary, even though the grouping and clustering approach is powerful for approximating regional demands and, from these, demand at a national scale, careful consideration is necessary in the case of neighbourhood simulations. Energy signatures yield very similar results, particularly for heating.
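Applying an energy signature to local weather data can be sketched as evaluating the fitted line for every day and clipping negative values to zero. The units assumed here (slope in kWh/m²/day per °C, intercept in kWh/m²/day) and the function name are hypothetical; the actual fitting parameters are provided in Appendix D.

```python
def annual_demand_from_signature(daily_mean_temps, slope, intercept, floor_area):
    """Annual demand from a linear signature and local daily mean temperatures."""
    # Negative values of the fitted line mean that no demand occurs on that day.
    specific = sum(max(slope * t + intercept, 0.0) for t in daily_mean_temps)
    return specific * floor_area
```

For a heating signature the slope is negative, so demand vanishes above the temperature at which the line crosses zero; a cooling signature with a positive slope is handled by the same function.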

Peak demand analysis
Peak demands are critical for system dimensioning and require special attention [84]. The differences in hourly peak demands between the full-simulation and clustering approach are shown in Fig. 11 for the case study region Chur (see Appendix A for all results). Daily heating and cooling load factors are plotted at the regional scale to compare the overall shape of the energy demand profiles. Additionally, the full load hours and the difference between the maximum peak hour of the full simulation and the clustering approach are calculated. Full load hours are a common indicator that provides valuable information for choosing technologies as well as for assessing their economic efficiency at the system level.
Fig. 11 reveals that even though the load factor distribution differs between the two approaches, the overall pattern is in good agreement for heating. For cooling, the daily differences between the full simulation and the clustering analysis are more pronounced; these more distinct differences arise from overall lower cooling demands, which impact the calculation of the load factors. The same observations hold for all case studies (Appendix A). The calculated full load hours are also comparable: for heating, the differences are smaller than 2% across the different case studies; for cooling, the maximum difference is 6.7%.
Typical full load hours for heating in a Swiss residential building are between 1,800 and 2,700 [85,86], depending on the building type, building age and climate zone. For cooling, the typical full load hours for an office are between 800 and 1,500 [87]. Our full load hours are significantly overestimated owing to the aggregation effects of the energy demand profiles (Section 3.3) and because our national profile includes multiple building types.
When comparing the absolute maximum annual peak hour demand for the nominal profiles, we note significant differences between the full simulation and the clustering approach, which illustrates the aggregation challenge discussed before. For Chur and Brig-Glis, the clustering approach resulted in higher maximum peak demands for heating (Chur: 14.2%, Brig-Glis: 14.7%) and cooling (Chur: 19.2%, Brig-Glis: 22.8%). For Muri bei Bern, peak demand is lower with grouping and clustering for heating (3.8%) and cooling (1.4%). These differences in peak demands resulting from the aggregation effect can be explained by the respective diversity of the building stock of the different case studies. We argue that the challenge of aggregating demands exists for the full simulation as well as for the grouping and clustering approach, as in both cases the aggregated demands are calculated based on a limited number of occupancy profiles. However, grouping and clustering resulted in significantly higher peaks for Brig-Glis and Chur. Consequently, clustering and grouping amplify the outlined temporal aggregation problem, particularly in the case of homogeneous building stocks.

Limitations and research needs
Several further research needs arise from the limitations of this study. For more detailed and realistic energy demand simulations, recent data collection efforts for 3D buildings increasingly allow moving away from abstracted building geometries to detailed building geometries, which can be included in bottom-up approaches [40]. More research effort is also needed to address the problem of variability when aggregating individual building energy demands. This includes better capturing the heterogeneity of buildings and their demands within an archetype [40]. Whereas our focus was to analyse annual energy demands at the neighbourhood scale, we have argued that the complexity increases considerably when aiming to capture peak demands, particularly at high spatial and temporal (hourly) resolution. Finally, whereas we validated the grouping and clustering approach at the national scale, we did not provide a regional validation, as measured data at regional scales are typically not readily available [88].
Quantifying the error contribution in energy demand simulations that stems from the complexity of the numerous data, algorithms or model inputs used is highly challenging. Whereas we do not provide easy solutions for such a quantification, we could, in this work, isolate the potential errors of relying on a simplified building stock representation due to grouping and clustering. Other approaches such as sensitivity analysis are required to explore and quantify other sources of errors.

Conclusion
We have argued that grouping and clustering a heterogeneous building stock with thousands or millions of buildings for energy system analysis is commonly performed without detailed analysis of upscaling errors. Our bottom-up energy demand analysis of Switzerland shows that simulations on a limited set of buildings lead to systematic errors. An uncertainty quantification was achieved by comparing the clustering results with full-scale simulations at the neighbourhood (300 × 300 m) and regional scale. Whereas national or regional scale demand estimates are plausible based on grouping and clustering, considerable upscaling errors can result at the neighbourhood scale. At the regional or national level, the grouping and clustering approach is a powerful method for energy research: it yields similar results in terms of daily load factors and full load hours compared to a full simulation. For the three exemplary case study regions, the difference in full load hours for heating was below 2%. For cooling, the maximum difference in full load hours was 6.8%. However, estimating highly localized demands at the neighbourhood level needs careful consideration, as annual demands are frequently over- or underestimated in the double-digit percentage range. Particularly if only a small number of buildings (<20) is considered, clustering and grouping is not practical. We have also shown that capturing variability due to different occupancy is particularly challenging: for bottom-up simulations, the temporal aggregation results in overestimated peak demands when identical load profiles are aggregated multiple times. This challenge is amplified when using clustering approaches.
Detailed building type- and building age-specific energy signatures are made available from all building simulations. They are proposed as a viable alternative to grouping and clustering for fast and simplified bottom-up energy demand simulation to derive regional or national energy demands with minimal data requirements. In particular, energy signatures enable the consideration of local climate, which is difficult to capture with grouping and clustering approaches. Finally, the presented analysis demonstrates the importance of isolating uncertainties related to upscaling (when relying on grouping or clustering buildings) from other sources of uncertainty, such as data inputs or the applied energy demand simulation framework.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Energy & Buildings 258 (2022) 111844

the reduced need for detailed input data of individual buildings which are often protected and difficult to obtain.

Fig. 1. Workflow of the grouping, clustering and sampling of the Swiss building stock.

Fig. 3. Swiss climate regions used for grouping the building stock and for generating population-weighted weather data. All used weather stations are shown.

Fig. 4. Visualization of the calculated features used for clustering for an example region.

Fig. 5. Swiss building stock grouped according to building type, building age and climate region (percentages are provided on top of the bars).

Fig. 6. Building type and building age class-specific energy signatures of all heating and cooling energy simulations across all climate zones. Linear fitting parameters are provided in Appendix D. Hourly energy signatures are also provided in Appendix A.

Fig. 9. Visualization of the upscaling error at the neighbourhood scale. Annual heating demands of the clustering (clustered) are compared to full-simulation results (full) for different Swiss communities on a 300 × 300 m grid. Only raster cells with a minimum of two buildings are considered.

Fig. 11. Comparison of daily load factors, full load hours and difference in maximum peak hour demand between the grouping and clustering approach and the full simulation for the case study region Chur.

Table 1
Datasets used to characterize the Swiss building stock for simulating energy demands.

Table 2
Building age classification into building age classes according to the GWR (Federal Building Registry) dataset with the GBAUP attribute.

Table 3
Simulated building type and building age class-specific cooling and heating energy demands for the year 2016.

Table 4
Specific domestic hot water and electricity energy demands.