Building Power Demand Forecasting Using K-Nearest Neighbors Model – Practical Application in Smart City Demo Aspern Project

As the European power system continues to transform, the increasing share of decentralized renewable energy supply will have to be balanced locally. This requires a reliable short-term load forecast at the level of single buildings. In this work, we apply a forecaster, which is based on the K-Nearest Neighbors approach and was introduced in an earlier publication, to three buildings of the Smart City Demo Aspern project. We demonstrate how this forecaster can be applied to different buildings without any manual setup or parametrization, showing that it is a viable replacement for load profiling solutions when predicting electricity consumption at the level of single buildings.


INTRODUCTION
The increasing share of renewable energy sources within electricity generation leads to new challenges for the power system infrastructure. Establishing smart grids helps to address these challenges. The concept seeks to improve the transparency and controllability of the distribution grid by installing advanced information and communication technology at the level of single buildings. It allows advanced control schemes to be applied in order to use the load flexibilities present in such smart buildings [1]. Many of these control capabilities require an accurate building-level Short-Term Load Forecast (STLF). However, appropriate techniques that can be universally applied to different buildings without explicit knowledge of them are yet to be developed.
Until recently, forecasting the load in the local distribution system was of limited interest, and buildings were modelled as low-voltage end-consumers with uncontrollable load. The electricity demand of such consumers is commonly forecast using Standardized Load Profiles (SLPs) predefined for general consumer groups. As long as power demand and supply are balanced at the high-voltage level, only large aggregations of end-consumers have to be considered for load forecasting, and for these SLPs deliver reliable accuracy.
With the increasing share of decentralized supply connected to the distribution grid, power generation and consumption will have to be balanced locally, at the low-voltage level. For this purpose, SLPs are inappropriate, as they only rudimentarily reflect the diversity and highly stochastic nature of building electricity demand. Additionally, newly commissioned smart buildings can no longer be assumed to have a relatively constant load profile, as was done in the past. Such a building is often equipped not only with photovoltaic (PV) panels producing power, but also with electrical Heating, Ventilation, and Air-Conditioning (HVAC) systems as well as electrical and thermal storage. Its electricity demand can change considerably from day to day, depending on PV generation and weather. At the same time, it has load flexibilities, which can be used for balancing purposes locally or in higher domains.
To do so, STLF at the building level is required. The wide-area introduction of smart meters in the European Union [2] will make it possible to use Individual Load Profiles (ILPs) [3] and to apply more advanced STLF concepts to buildings. While there are numerous approaches for STLF in higher domains (such as the system and balancing group levels [4]), techniques suited to wide-area application across many different buildings remain a major challenge. In general, it is easier to forecast the large aggregated loads present in higher domains [5] and, until now, there have been only few attempts [6][7][8] to adopt those approaches at the building level, where the load aggregation is much smaller. Such proposals are mostly made for a specific building, and it is still an ongoing challenge to develop a universally applicable forecaster that can be applied to various buildings regardless of their size or purpose while delivering reliable accuracy.
In a recent publication [9], we proposed an initial approach for such a forecaster based on the K-Nearest Neighbors (KNN) method. This technique is very common for data classification, but can also be extended to forecast functional time series. KNN forecasters in general [10] are attractive because of their simplicity and their ability to predict complex nonlinear behavior, such as that expected of a building load. Our proposal required no manual setup and was applied to numerous different end-consumers without any explicit knowledge about them.
In this article, we apply our forecaster to the smart buildings erected for the Smart City Demo Aspern (SCDA) project [11]. We start with a problem formulation, in which we provide information about the buildings and define the forecasting problem. Afterwards, we explain the forecasting approach of our KNN forecaster and present the results, which conclude this article.

PROBLEM FORMULATION
Three buildings (a housing building, a student dorm, and a school campus including a primary school and a kindergarten) constitute the Aspern Smart City Research testbed [11].
Together with the building energy management system and the necessary information and communication technology infrastructure, these smart buildings are used within the SCDA project to try out various load forecasting concepts, which will make it possible to use the load flexibilities in those buildings in further stages of the research.
Equipped with PV panels, solar thermal panels, hybrid panels, heat pumps, and various thermal as well as electrical storage facilities, the smart buildings of the SCDA project represent modern prosumer buildings, which are increasingly being commissioned all over Europe. An overview of the electrical equipment of each building is provided in Table 1. Due to this equipment, the building load deviates considerably from the predefined SLPs, which makes it harder to predict.
The metering infrastructure of each building delivers precise load measurements. For this study, we use the electricity consumption data between 1st of April 2016 and 30th of September 2016, obtained for each building at a 30-minute resolution.
The load curves for three randomly selected days are depicted in Figure 1. Due to the 221 kW of installed PV capacity, the dorm often feeds electricity into the grid, which results in a negative net building load. The school and the housing building appear to have more volatile curves, with the former having a load peak during the day, while the latter features a distinguishable evening peak expected of a residential load. Note how strongly the dorm load depends on PV generation. On 15th of June, probably due to bad weather, the dorm had significantly increased overall consumption around midday.

At this point, we introduce the following notation. Each metering device delivers continuous equidistant measurements that we divide into daily load curves $x_1, \dots, x_n$, where $x_1$ represents the first and $x_n$ the last day of historic data. Given the 30-minute smart meter resolution, each curve $x_i$ with $i = 1, \dots, n$ represents a set of 48 equidistant measurements:

$$x_i = \{x_i(t) : t = 1, \dots, 48\}$$

For our buildings, we formulate the forecasting problem as follows. Given historic daily load curves $x_1, \dots, x_n$, where $x_n$ corresponds to the most recently measured load curve, we want to predict the curve representing the load for the next 24 hours. In this study, we assume that the forecast is made at midnight and predicts the load curve for the entire upcoming day.
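The division of the meter readings into daily load curves can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `split_into_daily_curves`, the toy data, and the assumption that the series starts exactly at midnight are ours.

```python
from typing import List, Sequence

POINTS_PER_DAY = 48  # 24 hours at 30-minute resolution


def split_into_daily_curves(readings: Sequence[float]) -> List[List[float]]:
    """Cut a flat series of 30-minute readings into complete daily load curves.

    Assumes the series starts at midnight; a trailing partial day is dropped.
    """
    n_days = len(readings) // POINTS_PER_DAY
    return [
        list(readings[d * POINTS_PER_DAY:(d + 1) * POINTS_PER_DAY])
        for d in range(n_days)
    ]


# Toy data: three full days plus five stray readings of a fourth day.
readings = [float(i) for i in range(3 * POINTS_PER_DAY + 5)]
curves = split_into_daily_curves(readings)

# The last complete curve is the most recently measured one; the forecast
# made at midnight targets the 48 values of the following day.
latest = curves[-1]
```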

FORECASTING APPROACH
To predict the load of the buildings described above, we apply the forecaster based on the KNN technique. Its overview is provided in Figure 2, and it is described in more detail in our recent publication [9]. After automatic parametrization, it finds the KNN of the most recently measured load curve $x_n$. The historic successors of the KNN are then combined into a forecast using a predefined combination function.
Given a set of historic observations and their successors, a KNN model relies on the assumption that similar observations are also likely to produce similar successors.
In our case, the observations represent historic daily load curves $x_1, \dots, x_n$ with their historic successors $x'_1, \dots, x'_{n-1}$, where $x'_1 = x_2, \dots, x'_{n-1} = x_n$. Let us assume that each observation $x_i$ with $i = 1, \dots, n-1$ represents an object with a set of features. The coordinates of the object in a 48-dimensional feature space $\mathbb{R}^{48}$ are the 48 data points comprising a daily load curve. For a univariate model, the only feature of each object is its historic successor, so an object can be described as a set $\{x_i, x'_i\}$. We call a collection of such objects a feature space.

Having measured the load curve $x_n$ in the last 24 hours, we want to predict the curve $\hat{x}_{n+1}$ for the next 24 hours. To do so, we place $x_n$ in the feature space described above. We apply a workday calendar to use only historic load curves whose successors have the same day-type (workday, Saturday, or holiday, the latter representing public holidays and Sundays) as $\hat{x}_{n+1}$. For example, if the next day is a workday, then only historic curves $x_i$ whose successors $x'_i$ were also workdays are considered in the feature space.
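A minimal sketch of how such a day-type-filtered feature space could be assembled is given below. The helper names `day_type` and `build_feature_space` and the `public_holidays` argument are hypothetical; the workday calendar actually used in the study is not specified here.

```python
from datetime import date, timedelta
from typing import List, Set, Tuple

Curve = List[float]


def day_type(day: date, public_holidays: Set[date]) -> str:
    """Map a date to 'workday', 'saturday' or 'holiday' (Sundays and public holidays)."""
    if day in public_holidays or day.weekday() == 6:  # weekday() == 6 is Sunday
        return "holiday"
    if day.weekday() == 5:  # Saturday
        return "saturday"
    return "workday"


def build_feature_space(
    curves: List[Curve],
    first_day: date,
    public_holidays: Set[date],
) -> List[Tuple[Curve, Curve]]:
    """Pair each historic curve with its successor, keeping only the pairs whose
    successor has the same day-type as the day to be forecast."""
    target_day = first_day + timedelta(days=len(curves))
    target_type = day_type(target_day, public_holidays)
    space = []
    for i in range(len(curves) - 1):
        successor_day = first_day + timedelta(days=i + 1)
        if day_type(successor_day, public_holidays) == target_type:
            space.append((curves[i], curves[i + 1]))
    return space


# Toy history: ten flat curves for 1 to 10 April 2016; the target is Monday, 11 April.
toy_curves = [[float(d)] * 48 for d in range(10)]
space = build_feature_space(toy_curves, date(2016, 4, 1), public_holidays=set())
```

In this toy run, only the pairs whose successor falls on Monday to Friday remain, so the most recent curve is compared against five candidates instead of nine.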
Next, a predefined distance notion $d(x_n, x_i)$ is used to measure the distances between $x_n$ and the other historic load curves in the feature space. According to the distance notion, the $K$ nearest neighbors of $x_n$, corresponding to the $K$ most similar historic load curves, are selected. Their historic successors are included in a set $C := \{x'_k\}$ with $k = 1, \dots, K$.
Ultimately, the curves in C are combined into a forecast $\hat{x}_{n+1}$ using a predefined combination function.
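The select-and-combine procedure described above can be sketched in a few lines. To keep the example short and self-contained, it swaps in the plain Euclidean distance and a point-wise mean; these are stand-ins, not the permuted norm and permutation merge used by our forecaster, and all function names are illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

Curve = Sequence[float]


def euclidean(a: Curve, b: Curve) -> float:
    """Plain point-wise l2 distance (stand-in for the permuted norm)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_successors(
    latest: Curve,
    feature_space: List[Tuple[Curve, Curve]],
    k: int,
    distance: Callable[[Curve, Curve], float] = euclidean,
) -> List[Curve]:
    """Return the successors of the k historic curves closest to `latest`."""
    ranked = sorted(feature_space, key=lambda pair: distance(latest, pair[0]))
    return [successor for _, successor in ranked[:k]]


def pointwise_mean(curves: List[Curve]) -> List[float]:
    """Naive combination function (NOT permutation merge): average per time step."""
    return [sum(values) / len(values) for values in zip(*curves)]


# Toy feature space with two-point "curves": (historic curve, successor).
feature_space = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([10.0, 10.0], [11.0, 11.0]),
    ([1.0, 1.0], [3.0, 3.0]),
]
neighbours = knn_successors([0.4, 0.4], feature_space, k=2)
forecast = pointwise_mean(neighbours)
```

Passing the distance as an argument keeps the neighbor search independent of the chosen similarity notion, so a permuted norm can be plugged in without changing the rest.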

Parametrization
The forecaster parameters $K$ and $\omega$ (described further in the text) are set automatically using k-fold cross-validation [12]. It is a common technique from machine learning, which searches through a list of candidate parameters, evaluating the resulting forecaster on a training set. We use the historic data to parametrize our model before finding the KNN.
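The following sketch illustrates the idea of a cross-validated parameter search. It is a generic illustration under our own assumptions: the fold assignment, the Euclidean scoring, and the toy forecaster `knn_mean_forecast` are placeholders, and only a neighbor count `k` is searched, whereas the actual grid would also span the permutation window.

```python
import itertools
import math
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Curve = Sequence[float]
Pair = Tuple[Curve, Curve]  # (historic curve, its successor)


def l2(a: Curve, b: Curve) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_mean_forecast(train: List[Pair], latest: Curve, k: int) -> List[float]:
    """Toy forecaster: point-wise mean of the successors of the k nearest curves."""
    ranked = sorted(train, key=lambda p: l2(latest, p[0]))[:k]
    return [mean(vals) for vals in zip(*(succ for _, succ in ranked))]


def cross_validate(
    pairs: List[Pair],
    forecast: Callable[..., Curve],
    grid: Dict[str, List[int]],
    n_folds: int = 3,
) -> Dict[str, int]:
    """Return the parameter combination with the lowest mean held-out error."""
    folds = [pairs[i::n_folds] for i in range(n_folds)]
    names = sorted(grid)
    best_params, best_score = None, math.inf
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        errors = []
        for held_out in range(n_folds):
            # Train on every fold except the held-out one.
            train = [p for j, fold in enumerate(folds) if j != held_out for p in fold]
            for curve, successor in folds[held_out]:
                errors.append(l2(forecast(train, curve, **params), successor))
        score = mean(errors)
        if score < best_score:
            best_params, best_score = params, score
    return best_params


# Two clearly separated regimes: averaging over all neighbours (k = 4) blurs them.
pairs = [
    ([0.0], [0.0]), ([0.1], [0.0]), ([0.2], [0.0]),
    ([10.0], [5.0]), ([10.1], [5.0]), ([10.2], [5.0]),
]
best = cross_validate(pairs, knn_mean_forecast, grid={"k": [1, 4]})
```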

Find KNN
In order to find the K nearest neighbors of the curve $x_n$, we calculate its distances to the other historic load curves in the feature space. For this purpose, instead of using the common Euclidean distance ($\ell_2$-norm), we define the distance using the concept of the so-called permuted $\ell_2$-norm, which has been developed specifically to quantify the similarity between time series rather than to calculate a point-wise deviation.
Given time series $x_i$ and $x_j$, the distance $d(x_i, x_j)$ is calculated by allowing small time permutations of each point $x_i(t)$ comprising $x_i$ while comparing the time series, and by penalizing the amplitude differences using the $\ell_2$-norm:

$$d(x_i, x_j) = \min_{\pi \in \mathrm{P}(\omega, 48)} \sqrt{\sum_{t=1}^{48} \left(x_i(\pi(t)) - x_j(t)\right)^2}$$

In this equation, $\mathrm{P}(\omega, 48)$ represents the set of all $\omega$-local permutations $\pi$ on the $t = 1, \dots, 48$ points constituting the curve $x_i$. An $\omega$-local permutation rearranges the time series points by moving each one forwards or backwards by up to $\omega$ time units. The above equation is solved according to [13].
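To make the effect of local permutations tangible, the sketch below evaluates the permuted distance by brute force on very short series. It enumerates all permutations and is therefore exponential in the series length; it is meant only as an illustration and is not the solution method of [13]. The function name is our own.

```python
import itertools
import math
from typing import Sequence


def permuted_l2_distance(a: Sequence[float], b: Sequence[float], omega: int) -> float:
    """Brute-force permuted l2 distance between two equally long series.

    Every point of `a` may move by at most `omega` positions before the
    point-wise l2 norm to `b` is taken; the minimum over all such
    rearrangements is returned. Exponential cost: toy inputs only.
    """
    n = len(a)
    if n != len(b):
        raise ValueError("series must have equal length")
    best = math.inf
    for perm in itertools.permutations(range(n)):
        # perm[t] is the original position of the point placed at time t.
        if any(abs(src - t) > omega for t, src in enumerate(perm)):
            continue
        cost = sum((a[src] - b[t]) ** 2 for t, src in enumerate(perm))
        best = min(best, cost)
    return math.sqrt(best)


# The same peak, shifted by one time step.
early = [0.0, 5.0, 0.0, 0.0]
late = [0.0, 0.0, 5.0, 0.0]
d_plain = permuted_l2_distance(early, late, omega=0)    # identity only: plain l2 norm
d_shifted = permuted_l2_distance(early, late, omega=1)  # neighbouring points may swap
```

With `omega=0` the shifted peak is penalized twice (once where it is missing and once where it appears), whereas with `omega=1` the two curves are considered identical. This is the behavior that makes the measure suitable for volatile building loads.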

Combine
Having stored the KNN successors in the set C, we need to combine these load curves, using a predefined combination function, into a consensus representation of C. Such a representation is a curve that has minimal cumulated distance to the curves in C according to a predefined distance notion.
For our forecaster, we use permutation merge as the combination function. It is a technique that uses the permuted $\ell_2$-norm as the distance notion and determines a curve $\hat{x}_{n+1}$ that minimizes the cumulated distance, according to that norm, to all curves in C. With $d$ denoting the permuted $\ell_2$-norm distance, the combined load curve $\hat{x}_{n+1}$ can be expressed as the solution to the following optimization problem:

$$\hat{x}_{n+1} = \arg\min_{x \in \mathbb{R}^{48}} \sum_{k=1}^{K} d(x, x'_k)$$

This optimization problem is solved according to [14].
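The notion of a consensus curve with minimal cumulated distance can be illustrated with a much simpler, restricted variant: instead of searching over all possible curves, the sketch below only considers the members of C and returns the medoid. This is a deliberate simplification and not the permutation merge of [14]; the Euclidean default distance is likewise a stand-in.

```python
import math
from typing import Callable, List, Sequence

Curve = Sequence[float]


def l2(a: Curve, b: Curve) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def medoid(curves: List[Curve], distance: Callable[[Curve, Curve], float] = l2) -> Curve:
    """Return the member of `curves` with the smallest cumulated distance to all members."""
    return min(curves, key=lambda c: sum(distance(c, other) for other in curves))


# Two similar successors and one outlier: the consensus stays with the majority.
successors = [
    [0.0, 4.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 9.0],
]
consensus = medoid(successors)
```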

RESULTS
The KNN forecaster described above is used to forecast the electricity consumption of the SCDA buildings during the period between 1st of May and 30th of September.
The forecast is made day by day, with the preceding three weeks used as history for the KNN forecaster (sliding window approach). The history is also used for parametrization, which is repeated every six weeks.
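The evaluation loop can be sketched as follows. The window length and the re-parametrization interval are taken from the text; the function names, the callback signatures, and the persistence forecaster used in the toy run are illustrative assumptions.

```python
from typing import Any, Callable, List, Sequence

Curve = Sequence[float]

WINDOW_DAYS = 21         # three weeks of history
REPARAM_EVERY_DAYS = 42  # parametrization repeated every six weeks


def rolling_forecast(
    curves: List[Curve],
    first_day: int,
    forecast: Callable[[List[Curve], Any], Curve],
    parametrize: Callable[[List[Curve]], Any],
) -> List[Curve]:
    """Forecast every day from index `first_day` on, using a sliding history window."""
    if first_day < WINDOW_DAYS:
        raise ValueError("not enough history before the first forecast day")
    forecasts, params = [], None
    for step, day in enumerate(range(first_day, len(curves))):
        history = curves[day - WINDOW_DAYS:day]  # the preceding three weeks only
        if step % REPARAM_EVERY_DAYS == 0:
            params = parametrize(history)
        forecasts.append(forecast(history, params))
    return forecasts


# Toy run with a persistence forecaster (tomorrow equals today) as a placeholder.
calls = []


def toy_parametrize(history):
    calls.append(len(history))
    return {"k": 3}


def persistence(history, params):
    return history[-1]


toy_curves = [[float(d)] * 48 for d in range(100)]
forecasts = rolling_forecast(toy_curves, 30, persistence, toy_parametrize)
```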
There is an ongoing discussion about how to assess forecast accuracy at lower levels of load aggregation [13,15]. In this study, we define the Adjusted Error (AE) based on the permuted $\ell_2$-norm as:

$$AE(x, \hat{x}) = \min_{\pi \in \mathrm{P}(\omega, 48)} \sqrt{\sum_{t=1}^{48} \left(\hat{x}(\pi(t)) - x(t)\right)^2}$$

where $x$ and $\hat{x}$ are the actual and forecast daily load curves, respectively. With $\omega = 3$, we allow permutations of up to 90 minutes in total (given the 30-minute smart meter resolution). To assess the entire forecast, we calculate the median daily error over the forecast period.
In order to put the KNN forecast accuracy into perspective, we also forecast the load using individual load profiles (ILPs). Nine profiles are calculated for each building by averaging all of the available historic data for every unique combination of attributes: season (winter, summer, transition) and day-type (weekday, Saturday, holiday). Similarly to the SLP approach, the forecast is obtained by using the appropriate ILP (e.g., the profile for a holiday in summer). Using the absolute AE notion, we define a relative error that is scale-independent, allowing us to compare forecast results among the buildings, as:

$$AE_{rel} = \frac{AE_{KNN}}{AE_{ILP}}$$

Herewith, $AE_{rel} < 1$ indicates that the KNN forecaster was more accurate than the ILP, and vice versa.
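The benchmark can be sketched as follows. The profile construction follows the description above; computing the relative error as the ratio of the two median daily errors is our reading of the definition and should be treated as an assumption, as should the function names and the toy numbers.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Sequence, Tuple

Curve = Sequence[float]
Key = Tuple[str, str]  # (season, day-type), e.g. ("summer", "holiday")


def build_ilps(labelled_curves: List[Tuple[Key, Curve]]) -> Dict[Key, List[float]]:
    """Average all historic curves that share the same (season, day-type) key."""
    groups = defaultdict(list)
    for key, curve in labelled_curves:
        groups[key].append(curve)
    return {
        key: [sum(vals) / len(vals) for vals in zip(*curves)]
        for key, curves in groups.items()
    }


def relative_error(knn_daily_errors: Sequence[float], ilp_daily_errors: Sequence[float]) -> float:
    """Ratio of median daily errors; a value below 1 favours the KNN forecaster."""
    return median(knn_daily_errors) / median(ilp_daily_errors)


# Toy data with two-point "curves" and made-up daily errors.
history = [
    (("summer", "weekday"), [1.0, 3.0]),
    (("summer", "weekday"), [3.0, 5.0]),
    (("summer", "holiday"), [0.0, 2.0]),
]
ilps = build_ilps(history)
ratio = relative_error([7.0, 8.0, 9.0], [10.0, 10.0, 12.0])
```

A ratio of 0.8 in this toy example would correspond to the KNN forecaster being 20% more precise than the profile-based benchmark.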
Forecast results are presented in Table 2. In the case of the housing building, our forecaster and the ILP had similar accuracy. However, substantial improvement is achieved for the other buildings. The KNN forecaster turned out to be 7% and 23% more precise than the ILP for the dorm and the school, respectively.
To illustrate these results, we show example forecast and actual load curves in Figure 3. In the case of the housing building and the dorm, the KNN forecaster has its biggest deviations during the morning and midday, when overall building energy consumption depends substantially on the PV generation of the building. Therefore, our forecaster can be further improved by explicitly considering a PV generation forecast for the next day, something we will address in our future work.

CONCLUSION
In this publication, we have applied a forecaster based on the K-Nearest Neighbors technique to predict daily load curves of three different smart buildings participating in the Smart City Demo Aspern project, achieving convincing forecast accuracy. Without any manual setup or parametrization, our forecaster turned out to be significantly more accurate than the forecast made using individual load profiles obtained for each building. However, we have seen that our forecaster can be further improved by explicitly considering the photovoltaic generation, electrical heating and ventilation systems, as well as other load flexibilities installed in the building.
In future research, we will place our forecaster in a wider context, comparing it with other state-of-the-art forecasters for buildings. We will also extend our model to account for PV generation and the scheduled usage of load flexibilities of the building, so that it can become a powerful tool for predicting the electricity consumption of smart buildings.

Figure 1: Electricity consumption during 30-minute intervals of the SCDA buildings on three exemplary days in June 2016.

Table 1: Buildings overview including total and average daily energy consumption during the period from April to September 2016, peak power demand measured during this period, as well as size (number of dwellings / inhabitants / net area), installed PV generation capacity, battery storage capacity, and installed electrical heating power.

Table 2: Median daily forecast error (kWh)

Figure 3: Forecast and actual electricity consumed during 30-minute periods on 16th of June.