Exploring energy geography: Data insights on household consumption

Household energy consumption (HEC) varies across neighbourhoods and gender groups. This database provides raw and analysed data on HEC determinants and their estimated influence on HEC in 2707 residential neighbourhoods (Wijk) in the Netherlands in 2018. The raw data consists of 17 indicators on energy demand, socioeconomic characteristics, microclimate and buildings. The indicators are retrieved from and calculated based on open national and international datasets. The analysed data presents the local coefficients of the HEC determinants, the outcome of the geographically weighted regression model (GWR) presented in the related article [1].


Value of the Data
• The dataset offers a comprehensive set of high-resolution, geo-referenced data and estimated impacts on HEC to study household energy consumption, environmental inequality, and energy poverty.
• Researchers in human geography, spatial planning, energy, and environmental studies can use the database.
• The data can be further developed for studying climate change, environmental hazards and electric mobility.

Background
The database is produced to identify and map the gender groups with the highest energy demand in Dutch neighbourhoods. The database includes geo-referenced socioeconomic, housing, environmental, and land cover data. This article paves the way for a comprehensive study at the scale of Dutch neighbourhoods (similar to the original article [1]) and finer scales.

Raw datasets
The presented database is retrieved, calculated, and aggregated from six publicly available data sources stored in the "Raw" folder.

Neighborhood census
Published by the Dutch Central Bureau for Statistics (CBS), Wijk-en-buurtkaart 2018 provides the annual census on gas and electricity consumption and socioeconomic characteristics of neighbourhoods [2]. The data is saved in the folder "Raw \ WijkBuurtkaart_2018_v3".

Building database
The 3D BAG dataset provides GIS data on the buildings' shape, location, height, and construction year [3]. The raw data is stored in raster format: age of buildings (Raw \ Buildings \ Buildings_Age) and height of buildings (Raw \ Buildings \ Buildings_DEM).

Land cover data
The CORINE Land Cover 2018 raster (Raw \ CORINE_raster100m \ CORINE_raster100m) [5] is used, together with KNMI data, to estimate aerodynamic roughness length and wind speed.

Land surface temperature
Nine MODIS/Terra and nine MODIS/Aqua composites, each representing an eight-day average land surface temperature for a specific period, are retrieved and stored (Raw \ LST_MOD11A2_MYD11A2) [6].

Interpolated meteorological rasters
The observations of the KNMI stations and satellite imagery are interpolated and aggregated. (For a detailed description of data processing, see the original publication [1].) The results are stored in five geo-referenced raster files in the "Interpolated_rasters \ Final_Rasters.gdb" folder: CDD_2018; HDD_2018; Humidity2018; NDVI2018; Wind_2018.
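As an illustration of what the CDD_2018 and HDD_2018 rasters are likely to hold, the sketch below computes degree days from daily mean temperatures. It assumes that CDD and HDD denote cooling and heating degree days; the base temperature of 18 °C and the function name `degree_days` are assumptions made for this example, not values taken from the dataset or from [1].

```python
def degree_days(daily_mean_temps_c, base_c=18.0):
    """Return (HDD, CDD) for a sequence of daily mean temperatures in deg C.

    base_c is an assumed base temperature; the value used for the
    published rasters is documented in the original publication.
    """
    # Heating degree days: how far each day falls below the base temperature.
    hdd = sum(max(0.0, base_c - t) for t in daily_mean_temps_c)
    # Cooling degree days: how far each day rises above the base temperature.
    cdd = sum(max(0.0, t - base_c) for t in daily_mean_temps_c)
    return hdd, cdd


# Illustrative temperatures only, not KNMI observations.
hdd, cdd = degree_days([2.0, 10.0, 18.0, 25.0])
print(hdd, cdd)  # 24.0 7.0
```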

Dataset aggregated at the scale of Dutch residential zones
The final database aggregates all the above data at the scale of neighbourhoods in the Netherlands (Aggregated \ Final_database.sav). Table 1 shows the acronyms and descriptions of the database variables.
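The aggregation of raster values to residential zones can be pictured as a zonal mean. The following is a minimal sketch, assuming the raster and a zone-identifier grid have already been aligned cell by cell; the function name `zonal_mean`, the zone codes, and the nodata value are hypothetical, and the authors' actual GIS workflow is described in [1].

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Average raster cell values per zone.

    values and zones are 2D lists of the same shape; zones holds a zone
    identifier per cell (None for cells outside every zone).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            # Skip cells outside all zones and cells flagged as nodata.
            if zone is None or value == nodata:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 2 x 3 raster and zone grid.
values = [[1.0, 3.0, -9999], [5.0, 7.0, 9.0]]
zones = [["WK0001", "WK0001", "WK0002"], ["WK0001", "WK0002", "WK0002"]]
print(zonal_mean(values, zones, nodata=-9999))
```

In practice a GIS package would handle cells that straddle zone boundaries; this sketch assigns each cell to exactly one zone.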

Estimates of the GWR Model
The final database (GWR Estimates \ GWR_Adaptive_Bisquare_band380.xlsx) contains estimates of the GWR model for each residential zone (for the specification of the model, see [1]). The dependent variable of the GWR model is annual HEC per head. The estimates show the impact (i.e. standardised coefficient) of socioeconomic characteristics, building characteristics, meteorological factors, and land cover on the HEC of each residential zone. The fields in the file are as follows:
• Wijk_code: residential zone's identification code, corresponding with the raw data from CBS [2].
• x_coord: x coordinate of the residential zone's centroid in the RD New projection
• y_coord: y coordinate of the residential zone's centroid in the RD New projection
• est_intercept: the estimate of the constant term at the Wijk
• se_intercept: standard error of the constant term at the Wijk
• t_intercept: pseudo t value (estimate / standard error) of the constant term at the Wijk
• est_Z*: the estimated standardised coefficient of variable * (see Table 1) at the Wijk
• se_Z*: standard error of the estimated standardised coefficient of variable * (see Table 1) at the Wijk
• t_Z*: pseudo t value (estimate / standard error) of the estimated standardised coefficient of variable * (see Table 1) at the Wijk
• residual: standardised residual of the GWR model at the Wijk
• localR2: R2 value of the GWR model at the Wijk
The GWR model also estimates the local standardised coefficients of the interaction terms between gender and other socioeconomic variables. The interaction terms are marked by "X" in the middle of the variables' names. For instance, est_ZFemaleXBui_Meadian_Age shows the estimated standardised coefficient of the interaction between gender (Female) and building age.
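The relation between the est_, se_ and t_ fields listed above can be sketched as follows. The example recomputes the pseudo t value as estimate / standard error and lists the zones where its absolute value exceeds a cut-off. The cut-off of 1.96, the function names, and the sample rows are assumptions for illustration only; local GWR tests are often judged against stricter, multiple-testing-adjusted thresholds.

```python
def pseudo_t(estimate, std_error):
    """Pseudo t value as defined in the field list: estimate / standard error."""
    return estimate / std_error


def flag_significant(rows, variable, threshold=1.96):
    """Return the Wijk_code of rows whose |pseudo t| reaches the threshold.

    rows is a list of dicts keyed by field name; variable is the suffix
    shared by the est_ and se_ fields (e.g. "intercept").
    threshold is an assumed cut-off, not one prescribed by the dataset.
    """
    flagged = []
    for row in rows:
        t = pseudo_t(row[f"est_{variable}"], row[f"se_{variable}"])
        if abs(t) >= threshold:
            flagged.append(row["Wijk_code"])
    return flagged


# Hypothetical rows, not values from the published file.
rows = [
    {"Wijk_code": "WK000001", "est_intercept": 0.50, "se_intercept": 0.10},
    {"Wijk_code": "WK000002", "est_intercept": 0.05, "se_intercept": 0.10},
]
print(flag_significant(rows, "intercept"))  # ['WK000001']
```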

Experimental Design, Materials and Methods
In the first step, the data on energy consumption and socioeconomic characteristics of the residential zones are retrieved. The data is in the format of polygons (.shp). Residential zones with missing data, presumably non-residential, are excluded from the database. In the second step, other data types are retrieved or interpolated as raster data. Subsequently, the raster data are aggregated at the residential zones, and the final aggregated database is obtained. In the last step, using the Golden Section Search function of the GWR4 software [7], the best bandwidth for the application of a GWR model is found, and the local coefficients are estimated (for more details, see [1]).
The data presented in this article provides a comprehensive set of determinants for analysing household energy consumption at the neighbourhood scale. The provision of data on gas and electricity demand allows for geographic analysis of different energy sources (similar to [7]) and the influence of urban morphology on energy demand (similar to [8]). Furthermore, it paves the way for the analysis of environmental impacts, such as land surface temperature, on energy demand (similar to [9]) and environmental inequality among socioeconomic groups (similar to [10,11]). The database provides the basis for geographic studies on energy poverty (similar to [12,17]) and households' energy expenditure (similar to [13]). Using the database, further studies can estimate the energy grid congestion (similar to [14]) and optimally allocate new points for energy supply to meet the upcoming charging demand of electric vehicles (similar to [15,16]).
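The bandwidth search in the last step relies on golden section search, a generic method for locating the minimum of a one-dimensional function that has a single minimum on an interval. The sketch below is a textbook version, not the GWR4 implementation. The toy `score` function stands in for a model-fit criterion (for example AICc or a cross-validation score) evaluated at a given bandwidth, and all names and bounds are assumptions for illustration.

```python
import math


def golden_section_search(score, lower, upper, tol=1e-6):
    """Return the x in [lower, upper] that minimises a unimodal score(x)."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1 / golden ratio, about 0.618
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        # Keep the sub-interval that must contain the minimum.
        if score(c) < score(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


def score(bandwidth):
    # Toy criterion with its minimum at 380; a real search would refit
    # the GWR model and return its fit criterion for this bandwidth.
    return (bandwidth - 380.0) ** 2


best = golden_section_search(score, lower=50.0, upper=1000.0)
print(round(best))
```

For an adaptive kernel the bandwidth is a number of nearest neighbours, so a real search would round the candidate values to integers before fitting.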

Limitations
Not applicable.

Ethics Statement
The raw data of this study is provided by open, public GIS data sources, in full compliance with the ethical requirements for publication in the journal Data in Brief.

Data Availability
Studying household energy consumption: 19 essential spatial data and their estimated impacts (Original data) (Mendeley Data).

Table 1
The description of data in the neighborhood-aggregated database.