Harvesting big data from residential building energy performance certificates: retrofitting and climate change mitigation insights at a regional scale

The reduction of energy consumption and the increase in energy efficiency are currently important cornerstones of EU policy. Energy performance certificates (EPCs) were implemented as one of the tools to promote this agenda, and are used for the energy performance assessment of buildings. In this study, the characteristics of the Portuguese dwelling stock are regionally analysed using data from approximately 523,000 Portuguese residential EPCs. Furthermore, a bottom-up building typology approach is used to assess the regional energy needs impact of retrofitting actions and to estimate the heating and cooling energy performance gaps of the whole dwelling stock, as well as the potential CO2 emissions resulting from the gaps’ potential offset due to increased thermal comfort. The results show that Portuguese residential buildings have very low energy performance, with windows and roofs being identified as the most energy inefficient elements. Roof retrofitting has the highest potential for the reduction of energy needs. The estimated heating and cooling energy performance gaps amount to very significant percentages, due to the poorly performing building stock but also to very low energy consumption levels, with probable consequences for the thermal comfort of occupants. Assuming the current energy mix, carbon emissions would be 9.8 and 20.2 times higher associated with heating and cooling, respectively, if the actual final energy consumption were to match the estimated theoretical values derived from building regulation.
This study demonstrates several application cases and leverages the potential of the individual EPC, increasing the detail in the dwelling stock characterization and energy performance estimation, revealing its value for energy retrofit and climate change mitigation assessments, as well as establishing the ground for future work related to building retrofits, energy efficiency measure implementation, climate change mitigation, thermal comfort, and energy poverty studies.


Introduction
Energy consumption is at the core of economic development, but its severe impacts on the depletion of resources and climate change have justified a call for general reduction across all economic activities. The residential sector is crucial to achieving carbon dioxide (CO2) emission reductions, as it has an important energy-saving potential, and its environmental controls are difficult to displace to other countries (Pablo-Romero et al 2017).
Energy consumption in residential buildings represents a significant share of energy consumption in OECD countries; however, it varies significantly among the EU28 (e.g. from 12% in Luxembourg, to 37% in Croatia) (PORDATA 2016). A wide range of variation is also observed on a per capita basis, from 7.6 to 37.4 GJ per capita/annum, with the lowest consumption indicator observed in southern EU countries. Additionally, the share of space heating in the total final energy consumption of residential buildings varied across member states from 22% (in Portugal) to 83% (in Denmark) in 2016 (Odyssee-Mure 2019). Space cooling is becoming an increasingly relevant issue in Europe, albeit still representing a residual percentage of consumption (Odyssee-Mure 2019). Data on space cooling final energy demand in residential buildings of the EU member states is still limited (Jakubcionis and Carlsson 2017).
Significant efforts have been made to improve energy efficiency (EE) in buildings and appliances (e.g. Energy Performance of Buildings Directive, Eco-design and Labelling Directives, the Energy Efficiency Directive). Energy Performance Certificates (EPCs) are an integral part of the Energy Performance of Buildings Directive (2002/91/EC; 2010/31/EU) and are regarded as one of the most important tools of energy performance assessment for European building stock (BPIE 2014). EE policies and measures are an important cornerstone of building energy transition and could potentially reduce around 15% of EU energy consumption by 2040 (IEA 2016). Nevertheless, EE alone is not sufficient to meet the targets of energy and greenhouse gas (GHG) emission reduction, as it is also necessary to look beyond efficiency improvements towards the reduction of absolute energy demand. This is especially true in countries where a trade-off assessment needs to be performed between complying with targets of energy consumption reduction and covering the energy demand for increased thermal comfort levels and other basic energy services.
In this context, climate change mitigation is also a top priority, as buildings currently represent a significant share of carbon emissions (36%) in the EU (European Commission 2019). At the worldwide scale, building energy consumption and emissions may considerably increase with greater access of the population of developing nations to adequate housing (Lucon et al 2014) and the increasing global shift from rural to urban societies (Seto et al 2014). There is a lock-in effect of energy consumption and GHG emissions pathways, related to the long life-span of buildings (Seto et al 2014). Therefore, deep and urgent building retrofit and energy efficiency measures are paramount to mitigate energy consumption levels and consequently GHG emissions, as stated by Lucon et al (2014) and evidenced in case studies such as that by Asdrubali et al (2019).
Climate change mitigation assessments are the basis of climate change adaptation and mitigation plans, which are increasingly important in city planning, as highlighted by Reckien et al (2018). Creutzig et al (2019) stress the importance of building stock data for assessing the impact of retrofitting and ultimately to address urban climate solutions for the reduction of GHG emissions in the urban context. There is a lack of bottom-up approaches and data, as top-down methods often produce simplified and misinformed representations of regional building systems (Lucon et al 2014). Conducting case studies on different urban settings is pivotal to understand the impact of different urban climate solutions, in order to generate insights for broader scales and generalize knowledge (Lamb et al 2019), which can strengthen the research and policy arenas. Steinberg (2015) reinforces that in research, case studies can be suitable for generalization, just as large N-studies can.
Several researchers have analysed the energy performance of buildings, with the objective of estimating the energy needs, actual energy use of the building, or even GHG emissions, at different spatial scales (single buildings, or at the neighbourhood, city and country building stock level). These studies can be of value for the optimization of energy supply and enabling the increase of renewable energy in the electricity mix (e.g. Koehler et al 2016), assessment of energy conservation interventions in the properties of buildings and adaptation strategies (e.g. Cox et al 2015), and to foster urban sustainability (e.g. Braulio-Gonzalo et al 2016).
In the context of these kinds of analyses, EPCs have been used for: (1) energy planning at different scales
This paper aims to regionally characterize the Portuguese building stock, using the data from EPCs on the parameters of buildings and climatization equipment ownership. Furthermore, the potential of EPCs is leveraged by developing a building typology approach to scale up the information delivered by individual EPCs, i.e. extracting buildings' characteristic data at the dwelling level, estimating space heating and cooling energy needs to assess retrofitting measures per NUT3 region and climatic zone, and computing the theoretical and actual heating and cooling (H&C) final energy consumptions to analyse the energy performance gaps of the whole country's dwelling stock at a very high-resolution scale. Additionally, the CO2 emissions stemming from a potential offset of the energy performance gaps were also computed.
This paper therefore highlights the need for, and the benefits of, looking deeper into residential sector energy consumption in a southern European country, since the differences in consumption suggest that it is relevant to look into different EU countries through case-based approaches, to accommodate the particularities of individual countries. This case-based approach can span multiple scales, both spatial and temporal, giving insights into similar contexts. It showcases the advantages of using EPCs for a more detailed analysis of building stocks' energy performance and to support the development of multiple assessments (e.g. building retrofitting potential, climate change impact on energy needs, GHG emissions mitigation).

Methodology
This research work was developed in three distinct parts: (1) analysis of EPC data for dwelling stock characterization at NUT3 level; (2) regional assessment of energy retrofitting measures; (3) estimation of the space heating and cooling (H&C) theoretical and actual final energy consumptions and energy performance gaps, using the raw data from EPC samples. The consequences for CO2 emissions of a potential increase in actual final energy consumption were also estimated.

Case study and energy performance certificates
Portugal was chosen as a case study due to its aging building stock, with 15% of the buildings dating back to 1945. About 70% of the building stock was built prior to 1990 (INE 2011), when energy performance regulations concerning residential buildings were still non-existent in the country. About 29% of Portuguese residential buildings need some type of intervention work, and 1.6% are severely degraded (INE 2011). All these indicators point to deficient energy performance of Portuguese residential buildings, related to the use of poor materials and thermal insulation in the construction process (Vasconcelos et al 2011). The decaying state of the Portuguese building stock further stresses the need for a detailed regional characterization analysis, to identify potential sources of energy inefficiency and adequate measures to mitigate them.
EPCs have been the subject of research work in Portugal before. Araújo et al (2013) analysed the influence of building parameters on the energy performance certification of buildings. Magalhães and Leal (2014) used EPC data to assess the energy performance of Portuguese building stock, according to the previous 2006 energy performance regulation (RCCTE). Fonseca and Oliveira Panão (2017) developed a Monte Carlo model to predict EPC indicators using data from 170,000 EPCs, while Panão and Brito (2018) also used the Monte Carlo method to estimate hourly electricity consumption for a group of residential buildings, using EPC data.
A total of nearly 1.4 million residential EPCs have been issued so far in Portugal (ADENE 2019), representing approximately 23% of the dwelling stock. EPCs are produced by qualified architects or engineers, with at least five years of experience in building energy performance and climatization, and with specific training for the activity. These experts visit households and collect the required information regarding the building characteristics, heating, ventilation and air-conditioning (HVAC) equipment and domestic water heating systems, to determine the energy performance parameters and subsequently calculate the required energy performance indicators, as well as identify opportunities for improvement. The quality of the EPC is dependent on the detail and precision of the experts' inventory and analysis of the dwelling's characteristics, which will determine the energy performance parameters, as well as on the available documents about the building, provided by the owner. Some experts conduct more thorough examinations, for instance employing heat transfer coefficient measurement techniques, whilst others opt for simpler approaches, only using qualitative characterization to quantify the energy performance parameters. The whole process is conducted by the expert, so human error is a factor to consider. EPC classes are defined by the ratio ($R_{NT}$) between the estimated and reference annual nominal primary energy consumption. The $R_{NT}$ intervals for the A+ (best performance), A, B, B-, C, D, E and F (worst performance) classes are, respectively: $R_{NT} \leq 0.25$; $0.26 \leq R_{NT} \leq 0.50$; $0.51 \leq R_{NT} \leq 0.75$; $0.76 \leq R_{NT} \leq 1.00$; $1.01 \leq R_{NT} \leq 1.50$; $1.51 \leq R_{NT} \leq 2.00$; $2.01 \leq R_{NT} \leq 2.50$; and $R_{NT} \geq 2.51$. The first building energy performance regulation in the country was implemented in 1990. The most recent recast of the building energy performance regulation was implemented in 2013, replacing the previous 2006 regulation.
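The class intervals above can be read as a simple lookup on the ratio between estimated and reference primary energy. The following is a minimal sketch of that lookup, not the certification software's implementation; the function name `epc_class` is hypothetical, and ratios falling between two listed intervals (e.g. 0.255) are assumed to belong to the higher-ratio class.

```python
from bisect import bisect_left

# Inclusive upper bounds of the ratio for classes A+ to E, as listed in the text.
# Any ratio above the last bound is class F.
UPPER_BOUNDS = [0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50]
CLASSES = ["A+", "A", "B", "B-", "C", "D", "E", "F"]


def epc_class(r_nt: float) -> str:
    """Return the EPC class for a ratio of estimated to reference primary energy."""
    if r_nt < 0:
        raise ValueError("the ratio must be non-negative")
    # bisect_left returns the index of the first bound >= r_nt,
    # so each bound acts as an inclusive upper limit of its class.
    return CLASSES[bisect_left(UPPER_BOUNDS, r_nt)]


if __name__ == "__main__":
    for ratio in (0.25, 0.9, 1.2, 3.0):
        print(ratio, epc_class(ratio))
```

Under this reading, a dwelling whose estimated consumption is 20% above the reference (ratio 1.2) falls in class C.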
The major change centres on the redefinition of the set temperature for thermal comfort in the heating season, from 20°C to 18°C, which directly influences energy needs in the heating season. The indoor comfort temperature for summer remained the same. The climatic zoning and the heating and cooling season duration were also changed. The reference value of certain parameters such as the air change rate per hour (from 0.6 to 0.4 for the heating season) was also revised, and slight amendments were introduced in the methodology, such as for the calculation of the solar thermal gains. The current national regulation sets three different heating season climatic zones (I1, I2, and I3) and three cooling season zones (V1, V2, and V3) in the country (figure 1). The heating zones are defined by the heating degree days in the heating season (I1<1300°C.days; 1300°C.days<I2<1800°C.days; and I3>1800°C.days), for an optimal indoor base temperature of 18°C. The cooling zones are set according to the average outside temperature in the cooling season (V1<20°C; 20°C<V2<22°C; V3>22°C), for an optimal indoor base temperature of 25°C.
The majority of the EPCs pertain to existing dwellings (about 98.1%) while smaller percentages represent renovated and new dwellings (0.3% and 1.7%, respectively) (ADENE 2019). Approximately 1.6% of the EPCs correspond to dwellings dating from before 1919, 4.1% from the period between 1919 and 1945, 6.6% from 1946 to 1960, 16.3% from between 1961 and 1980, 57.2% from between 1981 and 2005, and 14.1% correspond to dwellings built after 2005. Dwellings in house typologies represent about 31.1% of all certificates, whilst the remaining 68.9% correspond to households in apartment buildings. The EPCs provide data on the building elements, equipment, energy needs, and carbon emissions.
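As an illustration of the zoning thresholds quoted above, the sketch below assigns a heating and a cooling zone to a location. The function names are hypothetical, and because the text uses strict inequalities, the treatment of values exactly on a threshold (assigned here to the milder zone) is an assumption.

```python
def heating_zone(heating_degree_days: float) -> str:
    """Heating season zone from heating degree days (base temperature 18 degC)."""
    if heating_degree_days <= 1300:   # boundary handling is an assumption
        return "I1"
    if heating_degree_days <= 1800:
        return "I2"
    return "I3"


def cooling_zone(mean_summer_temp_c: float) -> str:
    """Cooling season zone from the average outside temperature in the cooling season."""
    if mean_summer_temp_c <= 20:      # boundary handling is an assumption
        return "V1"
    if mean_summer_temp_c <= 22:
        return "V2"
    return "V3"


def climatic_zone(heating_degree_days: float, mean_summer_temp_c: float) -> str:
    """Combined label in the form used later in the text, e.g. 'V3I1'."""
    return cooling_zone(mean_summer_temp_c) + heating_zone(heating_degree_days)


if __name__ == "__main__":
    # Illustrative inputs only, not values taken from the regulation's tables.
    print(climatic_zone(1100, 23.5))
```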

Regional building typology characterization
The data of approximately 523,000 residential building energy performance certificates, issued from December 2013 to October 2017 in Portugal, under the most recent energy performance regulation, was used to statistically analyse and characterize selected building parameters that define the Portuguese dwelling stock at a regional level, for all 25 NUT3 regions of the country (figure 1). A total of 253 regionally representative building typologies was established. Due to their importance for the energy performance of buildings, several building elements were selected and analysed in terms of their thermal transmittance (U-value or psi value) and physical characteristics (e.g. area or height): (1) roof (flat/pitched roof with/without insulation); (2) pavement (type of material, insulation and contact with the ground); (3) walls (simple/double, type of material and insulation); (4) windows (from single glazing wooden frame to triple glazing with metal frame); (5) linear thermal bridges (wall junctions, parts of window frames, different parts of the façade); (6) flat thermal bridges (beams, pillars, roller blind boxes); (7) doors. Other significant parameters related to the thermal performance of the dwellings were also analysed, in particular the solar factor, air change rate and thermal inertia.
As stated by Mangold et al (2015), EPC data quality should be remediated before its use. Therefore, preliminary data analysis and data cleansing were performed prior to the analysis. The EPCs that presented data with typographical errors were discarded. Other certificates presented unrealistic data concerning the areas of dwelling elements (such as the floor, windows and roof), probably stemming from estimation or typographical errors. For instance, some EPCs presented a floor area equivalent to the whole building, when they should only have presented the area of the dwelling. Others presented a very small floor area (5 m²), significantly below the limit area defined in the regulations. These outliers were identified and removed (approximately 3% of the sample) after conducting a literature review to identify general minimum and maximum area values, consulting the General Regulation of Urban Buildings (Decree-Law no. 38382/51) to identify officially set limit values, and applying common sense.
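The area-based screening described above can be sketched as a range filter over the certificate records. This is only an illustration: the field name `floor_area_m2` and both area limits are hypothetical placeholders, since the limits actually applied in the study are not stated here.

```python
# Hypothetical plausibility limits for a dwelling's floor area [m2].
# These are placeholders, not the limits used in the study.
MIN_FLOOR_AREA_M2 = 20.0
MAX_FLOOR_AREA_M2 = 1000.0


def is_plausible(record: dict) -> bool:
    """True when the record has a numeric floor area inside the plausible range."""
    area = record.get("floor_area_m2")
    if not isinstance(area, (int, float)):
        return False  # missing or non-numeric entries are treated as errors
    return MIN_FLOOR_AREA_M2 <= area <= MAX_FLOOR_AREA_M2


def clean(records: list) -> tuple:
    """Return the plausible records and the share of records that were removed."""
    kept = [r for r in records if is_plausible(r)]
    removed_share = 1 - len(kept) / len(records) if records else 0.0
    return kept, removed_share


if __name__ == "__main__":
    sample = [
        {"floor_area_m2": 85.0},     # plausible dwelling
        {"floor_area_m2": 5.0},      # too small, as in the example in the text
        {"floor_area_m2": 12000.0},  # area of a whole building, not a dwelling
        {"floor_area_m2": "n/a"},    # typographical error
    ]
    kept, share = clean(sample)
    print(len(kept), share)
```

Reporting the removed share alongside the cleaned data makes it easy to check the filter against the figure of roughly 3% mentioned in the text.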

Energy impact of retrofitting measures
First, using the raw EPC data on building parameters such as the heat transfer coefficient (U-value) and areas, the space H&C useful energy needs of these typologies were estimated using a simple bottom-up, stationary building typology model, supported by a set of key building characteristics such as area, walls, floor and roof, and other relevant aspects. This analysis was conducted per building typology, for each climatic zone and NUT3 region. The model used for this calculation was developed for this purpose, based on the work of Palma et al (2019), and it is derived from the national residential buildings' energy performance regulation (IteCons 2013). This model calculates the energy needs necessary to assure thermal comfort conditions for the household occupants, i.e. considering the maintenance of the optimal indoor temperatures, in both the heating and cooling seasons, for the whole useful area of the dwelling and whole season duration. The generic equations to compute the H&C useful energy needs of a building typology are displayed respectively in equations (1) and (2):
$$N_{ic} = \frac{Q_{tr} + Q_{ve} - Q_{gu}}{A_p} \qquad (1)$$
where $N_{ic}$ is the useful heating energy need per unit of floor area in [kWh m$^{-2}$]; $Q_{tr}$ is the heat transfer through conduction between the building and the surroundings in [kWh]; $Q_{ve}$ is the heat transfer through ventilation in [kWh]; $Q_{gu}$ is the total useful heat gain in [kWh]; $A_p$ represents the useable floor area in the building in [m$^2$].
$$N_{vc} = \frac{(1 - \eta)\, Q_g}{A_p} \qquad (2)$$
where $N_{vc}$ is the useful cooling energy need per unit of floor area in [kWh m$^{-2}$]; $\eta$ represents the utilization factor of the heat gains; $Q_g$ represents the heat gains in [kWh]; $A_p$ is the useable floor area in the building in [m$^2$].
Subsequently, the H&C energy needs were calculated again to assess the impact of retrofitting measures, using the optimal heat transfer coefficients (U-value) for the walls, roofs and windows and the optimal solar factor, according to the current energy performance regulation, defined for each climatic zone, NUTS2 region of the country (mainland or islands) and building typology as shown in table 1. The percentage difference between the energy needs calculated using the EPC parameter data and those calculated using the optimal reference values of the regulation was then computed.
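The two-step calculation described above (energy needs with the EPC parameters, then again with the reference values, followed by a percentage difference) can be sketched as follows. The sketch assumes that the heating need is the sum of conduction and ventilation heat transfer minus the useful gains, divided by the floor area, and that the cooling need is the non-utilized share of the heat gains divided by the floor area; function names and all numeric inputs are hypothetical.

```python
def heating_need(q_tr: float, q_ve: float, q_gu: float, area: float) -> float:
    """Useful heating need per unit floor area [kWh/m2] (assumed form of equation (1))."""
    return (q_tr + q_ve - q_gu) / area


def cooling_need(eta: float, q_g: float, area: float) -> float:
    """Useful cooling need per unit floor area [kWh/m2] (assumed form of equation (2))."""
    return (1.0 - eta) * q_g / area


def percent_reduction(baseline: float, retrofitted: float) -> float:
    """Percentage reduction in energy needs relative to the EPC baseline."""
    return 100.0 * (baseline - retrofitted) / baseline


if __name__ == "__main__":
    area = 100.0  # m2, illustrative
    # Baseline with EPC U-values (illustrative seasonal totals in kWh).
    baseline = heating_need(q_tr=9000.0, q_ve=2000.0, q_gu=3000.0, area=area)
    # Same dwelling with a lower roof U-value: only the conduction term changes.
    retrofitted = heating_need(q_tr=6000.0, q_ve=2000.0, q_gu=3000.0, area=area)
    print(baseline, retrofitted, percent_reduction(baseline, retrofitted))
```

Keeping the percentage difference as a separate function mirrors the text: the same comparison is applied to each retrofit measure, typology and climatic zone.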
The optimal values for the energy needs calculations are used to assess the impact of the retrofitting measures of the building elements on the energy needs, showing the most impactful measures and/or regions. The doors and floors were not accounted for in the methodology, to simplify the assessment; the focus was instead set on the building elements with the greatest surface area and lowest thermal performance. The change in the wall U-value pertains to the potential improvement of external insulation with fiberglass or expanded polystyrene (closed cell foam) (EPS) or the improvement of internal insulation with extruded polystyrene (closed cell foam) (XPS) or polyurethane/polyisocyanurate (closed cell foam). The roof U-value improvements result from the potential installation of insulation such as open cell structure fiberglass or polyurethane/polyisocyanurate (closed cell foam) (XPS). The improvement of the windows' U-value and solar factor implies the implementation of solutions such as double glazing with a 12 mm air gap or double glazing with a 20 mm air gap and low-E coating.

Energy performance gap and related CO2 emissions
The theoretical final energy consumption per civil parish was subsequently estimated by multiplying the energy needs of each building typology by the corresponding (1) heating and cooling equipment ownership rate, (2) typical climatization systems' efficiencies (ADENE 2018), and (3) the number of occupied main residence dwellings and respective usable area per building typology (INE 2011). The actual final energy consumption for space heating and cooling per civil parish was calculated using: (a) municipal statistics on total final energy consumption per energy carrier in the residential sector (DGEG 2019); (b) regional heating and cooling shares, obtained from representative municipal energy matrices for each of the country's climatic zones (adapted from Palma et al 2019); (c) national heating and cooling energy consumption per household, available in the National Survey on Energy Consumption in the Domestic Sector (INE and DGEG 2011), for the energy carriers whose end-uses were not discriminated in the energy matrices; (d) civil parishes' area and number of dwellings per typology. Subsequently, based on previous work from Palma et al (2019), the percentage difference between the theoretical final energy consumption and the actual final energy consumption, for space heating and cooling, was assessed.
Following the approach of Seixas et al (2018), the carbon dioxide emissions associated with the theoretical and the actual final energy consumption were calculated, using the different emission factors of 2018 (EDP 2019, APREN 2019) for the energy carriers, with a national average of 273.6 gCO2/kWh, and assuming the same energy mix for both consumptions. The difference in the total emissions of the theoretical and actual energy consumption was then estimated. This calculation highlights the potential impact of increasing energy consumption for improved thermal comfort and reduced energy poverty levels on the achievement of climate change mitigation targets.
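A minimal sketch of the gap calculation for a single typology in a single parish is given below. It assumes that useful energy needs are converted to final energy by dividing by the system efficiency (one reading of the efficiency step described above) and that the gap is expressed relative to the theoretical consumption; all names and numbers are hypothetical.

```python
def theoretical_consumption(need_kwh_m2: float, area_m2: float, dwellings: int,
                            ownership_rate: float, efficiency: float) -> float:
    """Theoretical final energy consumption [kWh] for one typology in one parish."""
    useful_energy = need_kwh_m2 * area_m2 * dwellings * ownership_rate
    # Assumption: final energy = useful energy / system efficiency.
    return useful_energy / efficiency


def performance_gap(theoretical_kwh: float, actual_kwh: float) -> float:
    """Energy performance gap [%], relative to the theoretical consumption (assumed)."""
    return 100.0 * (theoretical_kwh - actual_kwh) / theoretical_kwh


if __name__ == "__main__":
    # Illustrative inputs only.
    theoretical = theoretical_consumption(
        need_kwh_m2=80.0, area_m2=100.0, dwellings=500,
        ownership_rate=0.8, efficiency=0.8,
    )
    actual = 400_000.0  # kWh, illustrative value derived from statistics
    print(theoretical, performance_gap(theoretical, actual))
```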
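The emissions comparison can be illustrated with the national average emission factor quoted above. This sketch simplifies the approach by applying a single average factor instead of carrier-specific factors; under that simplification, and with the same energy mix for both consumptions, the ratio of emissions equals the ratio of consumptions. The function names and consumption figures are hypothetical.

```python
NATIONAL_AVERAGE_G_PER_KWH = 273.6  # national average emission factor quoted in the text


def emissions_tonnes(final_energy_kwh: float,
                     factor_g_per_kwh: float = NATIONAL_AVERAGE_G_PER_KWH) -> float:
    """CO2 emissions in tonnes for a given final energy consumption."""
    return final_energy_kwh * factor_g_per_kwh / 1_000_000.0


def emissions_ratio(theoretical_kwh: float, actual_kwh: float) -> float:
    """How many times higher emissions would be if actual consumption matched theoretical."""
    return emissions_tonnes(theoretical_kwh) / emissions_tonnes(actual_kwh)


if __name__ == "__main__":
    # Illustrative consumptions only.
    theoretical_kwh, actual_kwh = 4_000_000.0, 400_000.0
    extra = emissions_tonnes(theoretical_kwh) - emissions_tonnes(actual_kwh)
    print(extra, emissions_ratio(theoretical_kwh, actual_kwh))
```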

Results and discussion
Regional building typology characterization
Of the sample of approximately 523,000 residential EPCs, which are mandatory for new houses and houses on the market (sale and rent), about 87.8% are rated 'C' or lower, which is an indicator of the poor energy performance and energy inefficiency of the dwelling stock. The percentage of dwellings per EPC rating is displayed in figure 2.
The box plots representing the regional and national variation of the U-value for different building parts (roof, walls, floor, doors and windows) are displayed in figures 3-8 respectively. The main statistical indicators describing the variation of the U-values and psi values, areas, and height, thermal inertia and air changes per hour in the winter and summer are displayed respectively in the tables in figures A1-A3 of appendix A. An additional Excel file is available as online supplementary material at stacks.iop.org/ERL/14/095007/mmedia, with the regional dwelling typologies' characterization regarding these parameters.
It is possible to observe that the U-value data of the different building elements has significant dispersion at a regional and national level. The same occurs for most of the other parameters analysed. The national average EPC U-values for the roof, walls, floor, doors, and windows are 2.1, 1.2, 1.9, 2.5 and 3.6 W/(m².K), respectively. For the roof, walls, floor and windows, these average U-values are significantly higher than the reference U-values defined in the latest regulation, which is to be expected, as the first residential building energy performance regulation to be implemented in the country was the Buildings' Thermal Behaviour Characteristics regulation, in 1990. Although the average values have been decreasing in recent decades (Sousa et al 2013), Portugal, alongside Spain, still registers consistently higher weighted mean U-values across building typologies for windows, pavement, walls and roofs, in comparison to the majority of European countries. The building characteristics vary across the EU: in northern countries, the dwelling stock has better and more insulation (and lower U-values), in response to the colder climate. Moreover, regulation requirements are stricter there than in southern countries such as Portugal (Inspire 2014, Anagnostopoulos and De Groote 2016).
According to the data of the EPC sample, doors are the only building element whose average U-value is in accordance with the reference value in the regulation. Comparing the average U-values of the roof, walls, floor and windows with the maximum values allowed, it is possible to observe that these averages still surpass the limit values for every climatic zone and for every building element, except the walls. After the doors, walls are the element with the best thermal performance in the thermal envelope of the Portuguese dwelling stock, according to the EPC data. Roofs and windows are the least effectively insulated building elements, with the greatest potential for improvement in that regard. The solar factor of the window glazing has an average value of 0.8, above the highest value of the maximum values range set in the regulation, which further emphasizes the low energy performance of the windows. The air change rate per hour is at adequate levels (higher than 0.6) for both winter and summer.
On the other hand, the average psi value (0.5 W/(m.K)) of the linear thermal bridges and the average U-value (0.8 W/(m².K)) of the flat thermal bridges fall inside the regulation's range of values, indicating appropriate thermal performance.
Looking at the regional scale, regarding roofs, dwellings in the regions of Alentejo (south Portugal), in climatic zone V3I1, present higher U-values (between 2.3 and 2.5 W/(m².K)), which may be associated with more frequent use of the old practice of constructing a type of roof that consists of tiles laid on top of others, without a mortar substrate connecting them, and without insulation. The pavement thermal transmittance is higher (2.0 W/(m².K)) in Algarve and Lisboa, in climatic zones V3I1 and V2I1 respectively, potentially due to the higher frequency of dwellings without pavement insulation (>75%), whilst in Baixo Alentejo, where house typologies are more common, the pavement is in contact with the ground and the use of insulation is more frequent (17%), resulting in lower U-values (1.6 W/(m².K)). The data shows that walls in the country's residential dwellings are mostly simple or double plastered walls, without insulation (average 78%). The second most common type is the double wall with in-between airspace insulation (average 9%), which guarantees better thermal transmittance levels. The Região Autónoma da Madeira, where the majority of the walls are simple with no insulation, has the highest wall U-value (1.7 W/(m².K)) of all NUT3 regions.
Regarding the windows, dwellings in all regions mostly have wood framed windows with simple glazing (between 13% and 35% across regions), metal framed simple glazed windows (23% to 44%) and metal framed double glazed windows (24% to 43%). The higher U-values are found in the dwellings of regions such as Alto Alentejo (south) and Região Autónoma da Madeira, in I1 winter climatic zones, which have higher percentages of simple glazing windows. In regions with low average U-values, there is a greater percentage of metal-framed double glazed windows with a thermal break, which are associated with better energy performance, as in Alto Tâmega (20%) and Terras de Trás-os-Montes (22%), in the northern region of the country (winter climatic zones I2 and I3).
There is no qualitative information about the type of doors used in the dwellings, so no relation can be made to the data available. The Região Autónoma da Madeira presents the highest thermal transmittance associated with flat thermal bridges, with the dwellings in this region having the lowest percentage of thermal bridges in roller blind compartments and the highest in beams and pillars. The linear thermal bridge psi-value has residual variations amongst the regions, maintaining a near-constant value of 0.5 W/(m.K) throughout the whole dwelling stock.
The thermal inertia is frequently stronger, i.e. has the highest resistance to temperature change, in dwellings located in the southern regions of the country, as approximately 60% of the dwellings present a 'strong' classification for this parameter in these regions. The other 40% represents the dwellings with medium thermal inertia. Virtually no dwellings have weak thermal inertia, in all regions.
The air change parameter is fairly stable among regions for both the heating and cooling seasons, registering slightly smaller figures in the winter, potentially due to the colder temperatures and consequent occupant behaviour. The solar factor value remains nearly constant amongst dwellings in all the regions, with slight differences and an approximate value of 0.8. Overall, the dwellings in the Região Autónoma da Madeira register, on average, the least adequate parameter values, whilst the dwellings in the northern regions of Cávado and Terras de Trás-os-Montes present the best indicators of energy performance in the country.
Analysing the difference between the 3rd and 1st quartile of the different building elements, a higher diversity of characteristics and U-values of the dwellings located in the region of Terras de Trás-os-Montes, in the north, is observed, whereas there is less diversity of characteristics in the dwellings of the Metropolitan Region of Lisbon, in the south, reflected in a smaller range of U-values.
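The dispersion measure used in this comparison, the difference between the 3rd and 1st quartiles, can be computed with the Python standard library as sketched below. The U-value samples are hypothetical and only illustrate a region with diverse dwellings against one with homogeneous dwellings; note that quartile estimates depend on the interpolation method, and the library's default ('exclusive') is used here.

```python
from statistics import quantiles


def interquartile_range(values) -> float:
    """Difference between the 3rd and 1st quartiles of a sample."""
    q1, _median, q3 = quantiles(values, n=4)  # default method: 'exclusive'
    return q3 - q1


# Hypothetical roof U-values [W/(m2.K)], for illustration only.
diverse_region = [0.6, 1.0, 1.8, 2.4, 3.0]
homogeneous_region = [1.9, 2.0, 2.1, 2.2, 2.3]

if __name__ == "__main__":
    print(interquartile_range(diverse_region))
    print(interquartile_range(homogeneous_region))
```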
Further information retrieved from the EPCs on the space heating and cooling system average efficiencies and ownership split per climatic zone and building typology can be found respectively in tables B1 and B2 of appendix B. Data on the energy needs and carbon emissions are displayed in table C1 of appendix C. The data provided is disaggregated by NUT3 region and climatic zone, establishing the basis for more in-depth regional analysis.
The equipment efficiencies are in line with the requirements of the regulation, which rules out the possibility of a significant issue concerning the energy efficiency of the HVAC systems in Portuguese dwellings. Nevertheless, looking at the climatization equipment split, in the colder climatic zones, there is a prevalence of biomass stoves/fireplaces, which are the least efficient heating equipment. A considerable part of the population still relies on this type of system, as wood was the most common and accessible fuel. As for the warmest climatic zones, the multi-split air conditioner is the most common choice, which has an adequate energy efficiency. However, it is important to note that the ownership rate of space cooling equipment in Portuguese dwellings is very low (around 15.4%, according to INE 2017), which, together with the low thermal performance of the building envelope, can create a thermal comfort issue for the occupants.
Regarding energy needs, for space heating, the highest values are associated with house typologies in the most severe winter climatic zones (I2 and I3), as the number of heating degree days in these zones is higher. Older typologies, from before 1980, generally present higher energy needs, due to the lower energy performance of the building envelope. The difference in the cooling energy needs according to the building typology is not as discernible but, as expected, energy needs are highest in V3 climatic zones, where the average outside temperature is greater. Carbon emissions are directly related to the magnitude of the energy needs, the percentage rate of the heating and cooling equipment and the type of fuel, which are used to calculate the primary energy. A greater percentage rate of lower efficiency equipment that uses a fuel with a high emission factor results in higher carbon emissions, which is why typologies using boilers, either oil- or gas-fuelled, are generally responsible for higher carbon emissions. According to the regulation (IteCons 2013), these two fuels have the highest emission factors, 0.267 and 0.202 kgCO2/kWh, respectively. Although to a lesser degree, electric equipment also contributes to an increase in the carbon emissions (emission factor of 0.144 kgCO2/kWh), particularly electrical heaters due to their lower efficiency. Older house typologies regularly have higher carbon emissions, due to the combination of higher energy needs and the significant use of boilers and electric heaters.
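The interplay between energy needs, equipment efficiency and fuel emission factor described above can be made concrete with a small calculation. The emission factors are those quoted from the regulation; the function name, the useful energy need and the equipment efficiencies are hypothetical values chosen only for illustration.

```python
# Emission factors quoted in the text [kgCO2/kWh].
EMISSION_FACTORS = {"oil": 0.267, "gas": 0.202, "electricity": 0.144}


def emissions_per_m2(useful_need_kwh_m2: float, efficiency: float, carrier: str) -> float:
    """CO2 emissions [kgCO2/m2] to cover a useful energy need with a given system."""
    final_energy = useful_need_kwh_m2 / efficiency  # useful -> final energy
    return final_energy * EMISSION_FACTORS[carrier]


if __name__ == "__main__":
    need = 100.0  # kWh/m2, illustrative
    # Efficiencies below are hypothetical, not values from the EPC database.
    systems = {
        "oil boiler": (0.8, "oil"),
        "gas boiler": (0.9, "gas"),
        "electric heater": (1.0, "electricity"),
        "heat pump": (3.0, "electricity"),
    }
    for name, (efficiency, carrier) in systems.items():
        print(name, round(emissions_per_m2(need, efficiency, carrier), 1))
```

With these illustrative efficiencies, the boilers emit the most per square metre, and the same electricity emission factor yields very different results for a resistive heater and a heat pump, which is the efficiency effect noted in the text.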

Energy impact of retrofitting measures
The assessment of the impact of energy retrofitting measures is an important example of how EPC data can be used to leverage research and inform policy making. Although this is a simplified method to calculate energy needs, as it does not use all the building details that EPCs provide, it allows the upscaling of the analysis to the whole building stock, which constitutes a significant advantage.
The results of the energy needs calculation method show that the application of building envelope retrofitting measures can lead to significant reductions in the space heating and cooling energy needs of the building stock. In order to abide by the requirements of the regulation, the increase in thermal insulation, i.e. the reduction of the U-value of walls and roof had to be more significant, resulting in a higher average reduction in energy needs related to the implementation of these measures. The percent reductions in space heating and cooling energy needs by type of retrofit measure for each building typology and climatic zone are provided in table C2 of appendix C.
Regarding space heating, the roof is the building element with the highest potential for contributing to a decrease in the energy needs of the Portuguese building stock, as its retrofit might result in a 43.2% reduction. Wall renovation is also relevant, potentially resulting in an average national reduction of 29.7%. Window retrofit is not an effective action to decrease heating energy needs because, according to the applied method, by decreasing the solar factor and the windows' heat transfer coefficient, both the heat loss and the solar energy gains decrease, which have opposite effects on the energy needs. When the reduction of heat loss is lower than that of the solar energy gains, an increase in the energy needs might even occur, compared to the reference situation.
Regarding space cooling, the calculation of energy needs depends on the ratio between the heat gains and losses. The greater this ratio, the higher the energy needs. According to the data and methodology used, when the wall and roof U-values decrease, both heat gains and heat losses also decrease. However, the decrease in heat losses is generally proportionally larger, which raises the ratio and results in higher cooling energy needs. As a result, only window retrofit is an effective measure for reducing the cooling energy needs. The retrofit of this building element could potentially result in a national average reduction of 16.3% of the cooling energy needs.
Looking into building typologies, for all the analysed building elements, higher reductions can be achieved in house typologies compared to dwellings in apartment buildings. The difference is not substantial for the retrofit of walls and windows (1.4% and 3.2%), but it is more significant for the roof retrofit (11.7%), due to the still common absence of roof insulation in Portuguese houses, whilst dwellings in apartment buildings have lower heat losses as there is frequently another dwelling on top, which results in better insulation. Regarding the walls, houses and apartment dwellings from 1919-1945 and 1945-1960 have higher potential for energy saving, particularly in the climatic zones I2 and I3 in the north NUT3 regions, with no considerable difference between the two zones, and the energy needs reduction varying between 34% and 41%. This is explained by the frequent use of simple plastered walls in those typologies, as well as the conditions of the winter season, namely the higher number of degree days, which increase the energy needs and consequently the potential for reduction. As for the roofs, the higher potential of reduction through retrofit action is associated with house typologies, particularly those from the period before 2005, across all climatic zones, and especially in the typology built from 1980-2005. These typologies were built, for the most part, without any energy performance regulations in place, which resulted in roofs with low insulation and performance. Even in the mildest winter climatic zones such as V3I1, in the Alentejo NUT3 regions, considerable reductions in energy needs from roof retrofit can be achieved, due to the previously described type of tiled roof without insulation, which is still frequent in that area.
When looking at climatic zones, the single most effective energy reduction measure, in percentage terms, is the roof retrofit of the house typology from 1980-2005 in the V3I1 climatic zone, with a reduction of about 63% of the space heating energy needs.
At NUT3 level, the dwelling stock of the Oeste region has the highest average heating needs reduction potential for wall retrofit, 34%, whereas the dwelling stock of Beira Baixa has the highest for roof retrofit, 49%. The house typology from 1945-1960, in Área Metropolitana do Porto, has the greatest potential for wall retrofit (about 47%), whilst the house typology from 1980-2005, in Baixo Alentejo, has the highest percent reduction for roof retrofit, 64%, which is the single most effective measure when looking at NUT3 regions.
Regarding energy need reductions for space cooling, house and apartment dwellings built between 1919 and 1980, particularly the typologies of 1960-1980, have the largest potential for savings. Significant energy savings can be achieved regardless of the climatic zone. However, dwellings in the zone V2I2, which encompasses several NUT3 regions, have the highest average potential (19.7%) compared to the other zones. The Oeste region also has the greatest potential for cooling needs reduction through window retrofit, at 21%. The dwellings in apartment buildings in Região Autónoma da Madeira are the typology with the most significant reduction in cooling needs through this kind of retrofit, with about 28%.

Energy performance gap and related CO2 emissions
After the regional dwelling stock characterization, the work of Palma et al (2019) was replicated using this EPC dataset, to demonstrate the potential of its use for wider energy performance assessment studies. The estimated theoretical energy consumption of the dwelling stock, for space heating and cooling respectively, is about 57.3 TWh and 5.7 TWh, compared to the values Palma et al (2019) put forward of 66.6 TWh and 6.9 TWh, for nominal conditions defined in the national regulation. The country's aggregated global heating and cooling gaps, calculated using the same actual final energy consumption figures, amount to 89.6% and 95.1%, compared to 91.5% and 96.2% in Palma et al (2019). The average civil parish heating and cooling gaps were also reduced using the EPC data, from 92.5% and 97.2% to 89.7% and 96.2%, respectively. At NUT3 level, Região Autónoma da Madeira and Alto Tâmega have the highest average civil parish heating and cooling gaps, respectively, with 94.5% and 98.3%. At climatic zone level, the zone V3I3 has the highest average civil parish heating gap, with 92.9%, and the zone V1I3 has the highest cooling gap, with 98.4%. All national and regional energy performance gaps are considerably high, due to the low energy performance and energy efficiency of the building stock, which results in very high theoretical energy consumption, combined with very low actual space heating and cooling energy consumption. The heating and cooling energy performance gap maps are displayed in figures 8 and 9.
Figure 9. Cooling energy performance gap (%).
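The distinction drawn above between the aggregated national gap and the average civil parish gap can be sketched in Python as follows. The sketch assumes the gap is defined as the share of theoretical consumption that is not actually consumed, which is consistent with how the percentages are discussed here but is not spelled out in this section. The function name and the two parishes, with their consumption values, are hypothetical.

```python
def performance_gap(theoretical: float, actual: float) -> float:
    """Energy performance gap as a percentage of theoretical consumption (assumed definition)."""
    if theoretical <= 0:
        raise ValueError("theoretical consumption must be positive")
    return (theoretical - actual) / theoretical * 100.0


# Hypothetical civil parishes: (theoretical, actual) final energy in the same unit.
parishes = {"parish_a": (100.0, 10.0), "parish_b": (300.0, 15.0)}

# Average of the individual parish gaps (each parish weighs the same).
parish_gaps = {name: performance_gap(t, a) for name, (t, a) in parishes.items()}
average_parish_gap = sum(parish_gaps.values()) / len(parish_gaps)

# Aggregated gap: sum consumption first, then compute a single gap
# (larger parishes weigh more).
total_theoretical = sum(t for t, _ in parishes.values())
total_actual = sum(a for _, a in parishes.values())
aggregated_gap = performance_gap(total_theoretical, total_actual)
```

Because the aggregated figure weights parishes by their theoretical consumption while the average does not, the two indicators generally differ, as they do in the values reported above.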
The main difference between the two approaches resides in the characterization of the building typologies, which simultaneously represents the upgrade that the EPC data can bring to this type of analysis. Additionally, using the EPC data, it may be possible to establish new building typologies and further increase the representativeness of the Portuguese dwelling stock diversity. The outcome of the new typology characterization using EPC data put forward by this study does not considerably change the perspective already given by the Palma et al (2019) study, as the heating and cooling gaps still register very high percentages, despite a slight decrease. Although the nominal conditions set in the regulations are not a realistic standard to assess thermal discomfort, as they do not properly represent the real climatization habits of the population, the magnitude of these gaps unavoidably continues to point to low consumption rates and the poor thermal energy performance of buildings. These indicators also raise questions about a potential thermal discomfort issue in Portuguese homes and high energy poverty vulnerability, as stated by Gouveia et al (2019).
Assuming the existing electricity generation mix in the country, and testing the possibility of bridging the gap between the current low final energy consumption and the theoretical final energy consumption levels necessary for indoor thermal comfort, carbon dioxide emissions would increase by approximately 5497 kt and 351 kt for space heating and cooling, respectively, i.e. the emissions would be respectively 9.8 and 20.2 times higher. This would work against the national climate change mitigation goals defined for 2030, in the National Energy and Climate Plan, and for 2050, in the Portuguese Carbon Neutrality Roadmap. Therefore, it is crucial to continue the recent pathways of an energy mix shift, the electrification of consumption, and an increase in the speed of investment in and roll-out of renewable energy technologies and more efficient demand-side equipment. Additionally, as demonstrated by the assessment conducted of the retrofit of building elements, deep retrofitting schemes should be put in place in a country where more than two-thirds of the dwelling stock is very inefficient, with consequences for indoor thermal comfort and people's health and wellbeing.
All these measures linked to the energy transition will entail a reduction of GHG emissions in the country, even though final energy consumption would potentially still need to increase to guarantee thermal comfort conditions across the whole dwelling stock.
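As a back-of-the-envelope check on the emission figures quoted above, the current emissions implied by an absolute increase and a "times higher" multiplier can be backed out as in the Python sketch below. The text does not state whether "9.8 times higher" means that the new total is 9.8 times the current emissions or that the increase itself is 9.8 times the current emissions, so both readings are computed. The function name and its flag are hypothetical, and the resulting baselines are inferred values, not figures reported by the study.

```python
def implied_baseline_kt(increase_kt: float, multiplier: float,
                        multiplier_is_new_total: bool = True) -> float:
    """Back out current emissions (kt) from an increase and a 'times higher' multiplier."""
    # Reading A: new = multiplier * current, so increase = (multiplier - 1) * current.
    # Reading B: increase = multiplier * current.
    divisor = multiplier - 1.0 if multiplier_is_new_total else multiplier
    if divisor <= 0:
        raise ValueError("multiplier too small for the chosen reading")
    return increase_kt / divisor


# Figures quoted in the text: increases of 5497 kt (heating) and 351 kt (cooling),
# with multipliers of 9.8 and 20.2.
heating_a = implied_baseline_kt(5497.0, 9.8)         # reading A
heating_b = implied_baseline_kt(5497.0, 9.8, False)  # reading B
cooling_a = implied_baseline_kt(351.0, 20.2)
cooling_b = implied_baseline_kt(351.0, 20.2, False)
```

Either reading implies current emissions from space heating and cooling that are small relative to the increase, which is the point the argument rests on.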
As demonstrated in this work, systematic quantitative data can feed case study assessments. Consequently, the insights from these studies can serve as the bridge from systematic individual data collection to general knowledge. This particular case study provides relevant methodological insights for assessing the impact of retrofitting as a climate change mitigation solution, in a residential urban and rural setting with considerable climate change mitigation potential. This study shows that EPCs can provide high-detail data that is suitable for use in bottom-up approaches, not only for national but also regional and local analyses, which could be used to improve the evaluation of the impact of retrofit measures on energy consumption and GHG emissions.
There is a need for a continued increase in big data quality and availability, e.g. smart meter data, to further investigate regional consumption patterns and also thermal comfort attainment, which continues to be a real concern for a considerable part of the Portuguese population. This is an issue that should be quickly and seriously addressed, as it constitutes a risk to the population's health and standard of living.
The EPC data has proven to be of significant value, as demonstrated by this study, and could be further used to: (1) test the effect of climate change on the energy demands and consumption of buildings; (2) test the effect of energy efficiency and retrofit measures on buildings' energy performance and GHG emissions. Notwithstanding its utility and value, it is necessary to acknowledge the uncertainties associated with EPC data, related to the audit and issuing process. This level of uncertainty might be one of the causes of the great range in values concerning the studied parameters. Nonetheless, the EPC is currently the most detailed and up to date source of data for analyzing a dwelling stock, though increasing the quality of its production should continue to be pursued.

Conclusions
Residential sector consumption is a moving target. In some countries, Portugal among them, this complicates the design of adequate policies and instruments, which must reconcile increased demand for, e.g., climatization, driven by the current lack of thermal comfort, with energy efficiency objectives that ultimately intend to reduce energy consumption (Gouveia 2017). This calls for different levels of knowledge to feed multiscale policies and a deeper assessment of the potential impact on climate mitigation targets.
EPCs are a useful source of data, not only at the individual dwelling level but also for leveraging into bigger scale studies encompassing a whole dwelling stock, namely for the assessment of its energy performance and GHG emissions. In this study, EPCs are used to analyse the whole Portuguese dwelling stock for 25 NUTS3 regions and 9 climatic zones, in terms of the thermal performance of the building elements. Furthermore, the raw EPC data is also used to assess the impact of building retrofit on the energy needs, to replicate a previous study calculating the theoretical final energy consumption and heating and cooling gap, and in the estimation of carbon emissions, of the whole Portuguese dwelling stock. The results of this study demonstrate that the Portuguese dwelling stock does not have the appropriate characteristics for adequate energy performance. Dwellings located in Região Autónoma da Madeira are the most vulnerable and unprepared, whereas the dwellings in the northern regions of Cávado and Terras de Trás-os-Montes have better performing building elements. The roof is the building element with the highest potential for reducing heating energy needs, particularly in the house typology from 1980-2005, whilst the retrofit of windows can be effective in reducing cooling needs, particularly in house and apartment building typologies from 1960-1980. The theoretical final energy consumption estimated using the EPC data is still considerably higher than the actual final energy consumption in most of the regions, which translates into very significant heating and cooling energy performance gaps. Using the current electricity generation mix, the future reduction of the energy performance gaps would entail a considerable increase in carbon dioxide emissions, which stresses the importance of renewable energy as a route to sustainability, and deep dwelling retrofitting to reduce the demand for climatization services.
Besides these three examples of potential EPC data upscaling, several other uses for the data might be explored in the future. This analysis and methodology lay the groundwork and further stress the need for future work in the country on the identification of vulnerable consumers and regions (e.g. under energy poverty conditions), social support policies, planning for local energy efficiency instruments and measures, renewable energy source integration and climate change impact evaluation on energy needs and thermal comfort. Future data availability and quality play a major part in increasing the detail of similar studies. The implementation of Display Energy Certificates in the country, which provide actual consumption data, would be an important additional contribution to the improvement of these assessments and of related energy efficiency and building climate mitigation analyses. The outcomes of these potential future uses could be of great value to policy designers and decision makers at both local and national scales, allowing for the implementation of tailor-made measures, which might have a greater impact on the population.

Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request. The data are not publicly available for legal and data protection reasons and the whole database is managed by ADENE, the Portuguese Energy Agency. Simplified versions of the information of individual energy performance certificates are available at www.sce.pt/pesquisa-certificados/.
Appendix A
Figure A1. Minimum value, 1st quartile, median, average, maximum value and 3rd quartile of the U-value and psi value data for the different elements of dwellings for all NUT3 regions.
Figure A2. Minimum value, 1st quartile, median, average, maximum value and 3rd quartile of the area data for the different elements of dwellings for all NUT3 regions.
Figure A3. Minimum value, 1st quartile, median, average, maximum value and 3rd quartile of the data concerning the height, air change per hour, solar factor of the different elements of dwellings, as well as the thermal inertia characterization for all NUT3 regions.