State estimation of medium voltage distribution networks using smart meter measurements

Distributed generation and low carbon loads are already leading to some restrictions in the operation of distribution networks and higher penetrations of e.g. PV generation, heat pumps and electric vehicles will exacerbate such problems. In order to manage the distribution network effectively in this new situation, increased real-time monitoring and control will become necessary. In the future, distribution network operators will have smart meter measurements available to them to facilitate safe and cost-effective operation of distribution networks. This paper investigates the application of smart meter measurements to extend the observability of distribution networks. An integrated load and state estimation algorithm was developed and tested using residential smart metering measurements and an 11 kV residential distribution network. Simulation results show that smart meter measurements, both real-time and pseudo measurements derived from them, can be used together with state estimation to extend the observability of a distribution network. The integrated load and state estimation algorithm was shown to produce accurate voltage magnitudes and angles at each busbar of the network. As a result, the algorithm can be used to enhance distribution network monitoring and control. © 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The increasing use of distributed energy resources (DERs) such as distributed generators, electric vehicles, heat pumps, demand response, and energy storage brings significant uncertainties and, at high penetrations, may lead to operational difficulties in the distribution network [1]. Therefore, accurate knowledge of system states is critical for the network operator to ensure safe, prompt and cost-effective operation of the network, while making the best use of the assets [2].
At present, comprehensive real-time monitoring and control of the medium voltage (MV) and low voltage (LV) distribution networks is limited due to technical and economic constraints. In a typical distribution network, real-time measurements are provided only at the primary (33/11kV) substations. Almost no real-time monitoring is carried out on the 11kV or the 0.4kV circuits [3,4]. A distribution network can therefore be described as an under-determined system. This means that the installed real-time measurements are insufficient to make the system fully observable.
In over-determined power networks with many measurements, e.g. transmission networks, state estimation [5] is used routinely to clean up the errors in a set of redundant measurements. In contrast, in under-determined distribution networks state estimation is used to find the state of the network from a limited number of measurements. A distribution network state estimator applies a minimum set of real-time measurements together with pseudo measurements to identify the operating states of the network. Pseudo measurements are obtained from load estimates using historical measurements (e.g. data collected from load surveys, transformer kVA ratings, and customer energy meter readings) or from near real-time measurements of smart meters and measurements of Automatic Meter Reading (AMR) systems [1]. Statistical analysis, artificial intelligence, and cluster analysis methods have all been used for load estimation in power networks [6][7][8][9][10][11].
Weighted least squares (WLS) estimators [12] have been widely applied in radial distribution networks.
WLS estimators exhibit a good performance only when the measurement errors and noise characteristics are known and normally distributed. The performance of WLS based state estimators is affected by outliers, gross errors in real-time and pseudo measurements, and measurement errors that do not follow a normal distribution [1,13,14,15].
Iteratively re-weighted least squares (IRWLS) estimators have been developed to find system states more accurately and detect, identify, and eliminate the inherent errors in the measurements, network model, or system parameters [16]. IRWLS estimators, whose outputs remain insensitive to deviations in a limited number of measurements, are more robust than WLS estimators. In contrast to WLS estimation, which assigns the same weight to a measurement throughout all iterations, IRWLS estimators iteratively change the measurement weights. Measurements with large residuals will have their weights reduced iteratively [17][18][19][20].
Smart metering is recognised as an important starting point in the evolution of smart grids [21]. Smart meters employ advanced metrology, control, data storage, and Information and Communication Technologies (ICT) to provide near real-time consumption information to consumers, helping them to manage their energy use, save money, and reduce greenhouse gas emissions [22]. At the same time, smart meter measurements will enable more accurate demand forecasts, allow improved asset utilisation in distribution networks, help locate outages, shorten supply restoration time and reduce the operational and maintenance costs of the networks [23,24].
Smart meters and their associated ICT infrastructure can improve the observability of distribution networks.
However, their communication systems face significant technical and operational challenges [1]. The technical challenges include the lack of sufficient signal strength; the shortage of tools to detect network failure; and the indoor/outdoor placement of meters. Examples of the operational challenges include planned or unplanned maintenance of the system, software and hardware faults or malfunction of the smart meters, and customers unwilling to communicate their energy consumption data. These challenges make smart meter measurements susceptible to time delays or even temporary loss when requested by the energy suppliers or network operators [25][26][27]. Imprecise and lost measurements will degrade the performance or even disable a conventional state estimator.
As smart meter measurements become widely available, distribution networks will progressively evolve from being in an under-determined to an over-determined state. Therefore, a state estimation algorithm that can work for both conditions is required. To address this requirement, an integrated load and state estimation algorithm for both under-determined and over-determined distribution networks was developed in this paper.
The integrated estimation algorithm builds upon previous research on load estimation using k-means clustering [10,11] and iteratively re-weighted least squares (IRWLS) state estimation [12].
Reference [10] presented the k-means based clustering of residential load profiles. Detailed descriptions of both the k-means cluster analysis and the load estimation algorithm were introduced in [11]. Reference [12] investigated the performance of the IRWLS state estimation algorithm using real-time measurements collected from an actual LV microgrid.
In this paper, the mathematical formulation of the IRWLS state estimation algorithm is presented in detail.
A comprehensive case study has been carried out to demonstrate the capabilities of the integrated estimation algorithm using a real 11kV residential distribution network and smart meter measurements. This paper shows that the state estimation algorithm is easy to implement and requires no prior knowledge of any variables other than primary (HV/MV) substation measurements and aggregated measurements collected from LV smart meters to reliably define the voltage magnitudes and angles at each busbar of the distribution network. Additionally, the integrated load and state estimation algorithm overcomes the requirement to have smart meter measurements in real-time by deriving pseudo measurements from past measurements of smart meters. Therefore, the integrated estimation algorithm has the capability to extend the observability and enhance monitoring, operation and control in distribution networks.

Framework of the state estimator
The integrated load and state estimator consists of three basic components: the k-means cluster analysis algorithm, a load estimation algorithm, and an IRWLS state estimator. The framework of the integrated estimation algorithm is illustrated in Fig. 1. Where smart meters are not installed in all LV network premises, the load estimation algorithm will provide the estimated MV nodal load measurements to allow the state estimator to function.
An assumption was made that there are 384 residential smart meters connected to each MV/LV transformer [28]. An aggregated daily load profile was created by summing the measurements of the 384 smart meters at each half hour time step. Eq. (1) illustrates the aggregation of smart meter measurements.
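$$\mathbf{LP}_{\text{agg. daily}} = \left[\sum_{i=1}^{384} lp_i(1),\ \sum_{i=1}^{384} lp_i(2),\ \ldots,\ \sum_{i=1}^{384} lp_i(48)\right] \quad (1)$$
where $\mathbf{LP}_{\text{agg. daily}}$ is a vector that represents the aggregated daily load profile, $lp_i(t)$ is the measurement of the $i$th smart meter at the $t$th half hour, $i$ is the smart meter index, $i = 1, 2, 3, \ldots, 384$, and $t$ is the half-hour index, $t = 1, 2, 3, \ldots, 48$.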
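The aggregation of Eq. (1) can be illustrated with a minimal Python sketch. The function name `aggregate_daily_profile` and the synthetic demand values are hypothetical and serve only to show the element-wise summation over the 384 meters assumed above.

```python
import random

HALF_HOURS = 48   # half-hourly readings per day
METERS = 384      # smart meters assumed per MV/LV transformer


def aggregate_daily_profile(meter_profiles):
    """Sum the readings of all meters at each half-hour step (Eq. (1))."""
    if any(len(p) != HALF_HOURS for p in meter_profiles):
        raise ValueError("each meter profile needs 48 half-hourly readings")
    # zip(*...) groups the readings of all meters by half-hour index
    return [sum(step) for step in zip(*meter_profiles)]


# Synthetic demand values in kW, for illustration only
rng = random.Random(0)
meters = [[rng.uniform(0.1, 2.0) for _ in range(HALF_HOURS)]
          for _ in range(METERS)]
agg = aggregate_daily_profile(meters)
print(len(agg))
```

The result is one value per half hour, which matches the 48-element vector described in the text.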

The load profiles of the training set were used as inputs to the k-means clustering algorithm to obtain a number of clusters and their corresponding centres.
The test set is made up of real-time load profiles. If the profiles of the test set:
a. contain no missing measurements, these profiles are passed to the state estimator as real-time measurements;
b. contain any missing measurements, the profiles are passed to the load estimation algorithm to estimate these measurements. Outputs of the load estimator (i.e. the load profiles with estimated measurements) are used as pseudo measurements by the state estimator.
Finally, an IRWLS state estimator using real-time smart meter measurements, MV real-time measurements (collected at the primary substation) and pseudo measurements on the MV side of MV/LV transformers (derived from LV smart meter measurements) was applied to define the voltage magnitude and angle at each busbar of a distribution network.

k-means cluster analysis algorithm
The k-means algorithm that was developed in [11] is used in this paper. The load profiles (at the 11kV level) of the training set and the maximum number of clusters are the inputs of the k-means algorithm. The outputs of the k-means algorithm include the cluster centres, and load profiles assigned to their respective clusters.
The k-means clustering method iteratively groups training load profiles (each comprised of half-hourly measurements) into clusters, by minimising the intra-cluster sum of squared distances between the load profiles and cluster centres. Eq. (2) shows the objective function of the k-means method.
$$\min \sum_{j=1}^{K}\ \sum_{\mathbf{LP}_{\text{agg. daily},n} \in \text{cluster } j} \left\lVert \mathbf{LP}_{\text{agg. daily},n} - \mathbf{C}_j \right\rVert^{2} \quad (2)$$
where $\mathbf{LP}_{\text{agg. daily},n}$ is a vector representing the $n$th aggregated daily load profile, $n = 1, 2, 3, \ldots, N$, and $\mathbf{C}_j$ is a vector that represents the $j$th cluster centre, $j = 1, 2, 3, \ldots, K$. The $n$th load profile is described as $\mathbf{LP}_{\text{agg. daily},n} = [lp_n(1), lp_n(2), \ldots, lp_n(48)]$. At each iteration the Average Euclidean Distance (AvED) is calculated between the load profiles and their cluster centres according to Eq. (3). As a result, each load profile is assigned to the cluster that has the nearest centre.
The centre of a cluster is the mean value of all load profiles that are assigned to this specific cluster, calculated at each half-hourly time step. Eq. (4) defines the centre of any cluster.
$$\mathbf{C}_j = \frac{1}{N_j} \sum_{\mathbf{LP}_{\text{agg. daily},n} \in \text{cluster } j} \mathbf{LP}_{\text{agg. daily},n} \quad (4)$$
where $N_j$ is the number of load profiles assigned to the $j$th cluster.
Pycluster [29], an open source cluster analysis package, was used to develop the k-means cluster analysis algorithm in Python 2.7.
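The assignment and centre-update steps described above can be sketched in plain Python. This is not the Pycluster-based implementation used in the paper: it is a generic Lloyd-style k-means with a fixed number of clusters, the squared Euclidean distance is assumed for the assignment step (the AvED of Eq. (3) may be scaled differently), and all function names and toy profiles are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Element-wise mean of a non-empty list of profiles (Eq. (4))."""
    n = len(profiles)
    return [sum(step) / n for step in zip(*profiles)]


def kmeans(profiles, k, iterations=100, seed=0):
    """Plain Lloyd-style k-means; returns (centres, labels)."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(profiles, k)]
    labels = [None] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centre for every profile
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in profiles]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: centre = mean of the assigned profiles
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = mean_profile(members)
    return centres, labels


# Two clearly different toy groups (4 time steps instead of 48)
low = [[1.0, 1.1, 0.9, 1.0], [1.2, 1.0, 1.0, 0.8]]
high = [[5.0, 5.2, 4.9, 5.1], [4.8, 5.0, 5.1, 5.0]]
centres, labels = kmeans(low + high, k=2)
print(labels)
```

In this toy case the two low-demand profiles end up in one cluster and the two high-demand profiles in the other, whichever profiles are drawn as initial centres.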

Load estimation algorithm
The load estimation algorithm applies the cluster centres (that were obtained from the clustering of the training load profiles) to estimate any missing measurements that exist in the test profiles. A Canberra distance function is used to link test profiles (with missing measurements) to the training cluster centres. Eq. (5) calculates the average value of Canberra distance between the $m$th test load profile and the $j$th training cluster centre.
Canberra distance measures the distance between $\mathbf{LP}_m$ and $\mathbf{C}_j$ on a rectilinear basis such that the absolute difference between any two half-hourly measurements of the load profile and cluster centre is divided by the sum of the absolute values of these two measurements. The average value is then calculated as the average Canberra distance. As a result, a set of $w$ half-hourly measurements (one measurement to be estimated plus the $w - 1$ half-hourly measurements that precede it) is paired to the nearest cluster centre (that has the same duration of half-hours). The measurements were estimated iteratively with only one half-hourly measurement estimated in each iteration. Fig. 2 demonstrates the concept of load estimation using segmented cluster centres.
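A single estimation step based on the average Canberra distance might look as follows. The sketch assumes that the distance is evaluated over the known measurements only and that the estimate is read from the last entry of the nearest centre; both choices, as well as the names `avg_canberra` and `estimate_next`, are illustrative assumptions rather than details confirmed by the text.

```python
def avg_canberra(a, b):
    """Average Canberra distance between two equal-length sequences."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        # Convention (assumption): a 0/0 term contributes zero distance
        total += abs(x - y) / denom if denom else 0.0
    return total / len(a)


def estimate_next(known, centres):
    """Estimate the reading that follows `known` from the nearest centre.

    `known` holds the readings that precede the missing one; each centre
    is one reading longer, and that extra entry is taken as the estimate.
    """
    nearest = min(centres,
                  key=lambda c: avg_canberra(known, c[:len(known)]))
    return nearest[len(known)]


# Toy segmented cluster centres with four half-hourly steps each
centres = [[1.0, 1.2, 1.5, 2.0], [3.0, 3.5, 4.0, 4.2]]
known = [2.9, 3.6, 3.9]   # three known readings, the fourth is missing
print(estimate_next(known, centres))
```

Repeating this step while sliding the window forward by one half hour would reproduce the one-measurement-per-iteration behaviour described above.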

IRWLS state estimation algorithm
Iteratively re-weighted least squares (IRWLS) estimators [19] minimise the sum of weighted squared residuals between the measured and the estimated values of the network states, as shown in Eq. (6), subject to the constraints given by the measurement equations shown in Eq. (7):
$$\min_{\mathbf{x}} J(\mathbf{x}) = \sum_{i=1}^{m} W_i\, e_i^{2} \quad (6)$$
$$\mathbf{z} = \mathbf{f}(\mathbf{x}) + \mathbf{e} \quad (7)$$
where $\mathbf{z}$ is the measurements vector, $\mathbf{f}$ is the vector of nonlinear measurement functions, $\mathbf{x}$ is the vector of system state variables, $\mathbf{e}$ is the vector of measurement errors, $W_i$ is the weight of the $i$th measurement, $i$ is the measurement index, and $m$ is the number of the measurements.
Eq. (6) is formulated in the same way as the WLS estimator, resulting in Eq. (8):
$$J(\mathbf{x}) = [\mathbf{z} - \mathbf{f}(\mathbf{x})]^{\top}\, \mathbf{W}\, [\mathbf{z} - \mathbf{f}(\mathbf{x})] \quad (8)$$
where $\mathbf{W}$ is the weight matrix, a diagonal matrix with the weights of the measurements on the main diagonal. The weight, which is equal to the reciprocal of the variance of a measurement ($1/\sigma_i^{2}$), reflects the accuracy of the measurement.
Measurements are normalised with respect to their standard deviations. In a general form, $\mathbf{h}(\mathbf{x}) = \mathbf{f}(\mathbf{x})/\boldsymbol{\sigma}$, i.e. for each $i$, $h_i(\mathbf{x}) = f_i(\mathbf{x})/\sigma_i$. Consequently, the normalised residual between the $i$th measurement and its calculated value is defined as:
$$R_i = \frac{z_i}{\sigma_i} - h_i(\mathbf{x}) = \frac{z_i - f_i(\mathbf{x})}{\sigma_i}$$
The diagonal elements of the weight matrix, $W_i$, are modified iteratively (with iteration count $k$) according to the normalised residuals. To avoid convergence problems that might result from the division by very small residuals, the range of weights was limited between minimum and maximum weight thresholds. The minimum weight threshold was set to 0.001 while the maximum weight threshold was unity. Eq. (12)-(14) define the criteria of measurement re-weighting.
The IRWLS state estimator can handle real-time measurements collected from the primary/secondary substation(s), distributed generators, smart meters and pseudo measurements based on load estimation. The integration of distributed generators brings several uncertainties which will impact the operation of distribution networks. The IRWLS state estimator is capable of accommodating uncertainties arising from the deployment of DGs. Distributed generators equipped with metering devices will have their measurements input to the state estimator as (near) real-time measurements. Alternatively, any unmetered generation can be estimated using the load estimation algorithm. In this case, the estimated generation will be modelled as a negative load with relatively low weight in the state estimation algorithm. The modelling of DG is not the focus of this paper.
Further details can be found in [12].
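The re-weighting idea can be illustrated on a deliberately simple problem: a single state observed directly by several measurements, so that each weighted least-squares step reduces to a weighted mean. The sketch is not the paper's nonlinear network estimator. The re-weighting rule (weight equal to the reciprocal of the absolute normalised residual, clipped to the thresholds 0.001 and 1 quoted above) is an assumption, as are all function names and data.

```python
W_MIN, W_MAX = 0.001, 1.0   # weight thresholds quoted in the text


def reweight(residual):
    """Assumed rule: weight = 1/|normalised residual|, clipped to limits."""
    if abs(residual) < 1.0 / W_MAX:
        return W_MAX   # also avoids dividing by a very small residual
    return max(W_MIN, 1.0 / abs(residual))


def irwls_scalar(z, sigma, iterations=50, tol=1e-9):
    """IRWLS estimate of one state x from measurements z_i = x + e_i."""
    weights = [W_MAX] * len(z)
    x = None
    for _ in range(iterations):
        # Weighted least-squares step on the normalised measurements
        num = sum(w * zi / s ** 2 for w, zi, s in zip(weights, z, sigma))
        den = sum(w / s ** 2 for w, s in zip(weights, sigma))
        x_new = num / den
        if x is not None and abs(x_new - x) < tol:
            x = x_new
            break
        x = x_new
        # Normalised residuals drive the weights of the next iteration
        weights = [reweight((zi - x) / s) for zi, s in zip(z, sigma)]
    return x, weights


# Four consistent voltage readings (p.u.) and one gross error
z = [1.00, 1.01, 0.99, 1.00, 1.50]
sigma = [0.01] * 5
x_hat, w = irwls_scalar(z, sigma)
print(round(x_hat, 3), w[-1])
```

An unweighted mean of these readings gives 1.1 p.u., whereas the re-weighted estimate stays close to 1.0 p.u. because the weight of the erroneous reading is driven towards the lower threshold.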

Smart meter measurements dataset
Load profiles using measurements from residential smart meters were used to train, test and validate the performance of the load and state estimation algorithm. The load profiles were obtained from the Irish smart metering Customer Behaviour Trials (CBT) and were accessed via the Irish Social Science Data Archive [30]. Daily and segmented (generated by partitioning the daily load profiles using different time windows) load profiles were used in this study. Whereas a daily load profile consists of 48 half-hourly measurements, a segmented load profile extends over a time window that is less than or equal to 24 hours. Fig. 3 illustrates the concept of segmented load profiles.
where $N$ is the number of the daily load profiles, and $T$ is the number of half-hourly measurements per daily load profile. The present study applies a time window of 16 hours to segment the daily load profiles.

Test distribution network
An 11kV residential distribution network was used to investigate the performance of the integrated load and state estimation algorithm. Fig. 4 is a single line diagram of the distribution network. The distribution network [32], which is located in England, consists of a radial feeder that has 12 busbars and 11 sections. Busbar 1 is the secondary side of the primary substation while busbars 2-12 represent MV/LV transformers. The network has a peak load of 3642kW and 1245kVAr and an average feeder R/X ratio of 2.2.

Load and state estimation methodology
For both the training and test sets of load profiles, measurements from the smart meters installed at customer premises were aggregated using Eq. (1) to give the active power consumption at the LV side of the MV/LV transformer. The following steps describe the approach adopted to obtain the active and reactive profiles at the MV side of the transformer.

1. The aggregated daily load profiles at the LV level were normalised with respect to their maximum values. The normalised profiles were then scaled up to match the peak (active and reactive) power demand of each transformer.

2. A random percentage in the range of 6-10% [33,34] of the load demand (at the LV side of the transformer) was added to each daily load profile. This percentage reflects the total (active and reactive) power loss (of service cables and the transformer). The outputs of this step are the daily load profiles at the MV side of each MV/LV transformer.
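The two steps above can be sketched as follows. The sketch assumes that a single loss percentage is drawn per daily load profile and applied uniformly to every half hour; the function name `lv_to_mv_profile` and the numerical values are hypothetical.

```python
import random


def lv_to_mv_profile(lv_profile, peak_kw, loss_range=(0.06, 0.10), rng=None):
    """Normalise an LV profile, scale it to a peak, then add a loss share."""
    rng = rng or random.Random()
    peak = max(lv_profile)
    # Step 1: normalise by the maximum, then scale to the transformer peak
    scaled = [peak_kw * v / peak for v in lv_profile]
    # Step 2: one random loss fraction per daily profile (assumption)
    loss = rng.uniform(*loss_range)
    return [v * (1.0 + loss) for v in scaled], loss


mv, loss = lv_to_mv_profile([120.0, 200.0, 160.0], peak_kw=400.0,
                            rng=random.Random(1))
print(max(mv) / 400.0 - 1.0, loss)
```

The same routine could be applied to the reactive power profile by passing the reactive peak demand in place of `peak_kw`.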

Cluster analysis of the load profiles
Using Eq. (15), the integrated load and state estimation algorithm started with dividing the daily load profiles

Load estimation of missing measurements
The load estimation algorithm (Section 2.2) replaced missing measurements of the test set with estimated measurements obtained from the cluster centres (Section 5.1) using the Canberra distance function.
In order to simulate the loss of smart meter measurements at an MV/LV transformer, the actual values of the test load profiles (at the MV side) were replaced by zeros. For each test load profile, different durations of lost measurements ($d$) from 1 to 24 consecutive hours were simulated. The load estimation algorithm iteratively estimated the lost measurements one half-hourly measurement at a time. Using a brute-force approach, all possible combinations of ($d$) hours of measurement loss were covered.
The estimated measurements at the MV (11kV) level are applied as pseudo measurements to the IRWLS state estimator.
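The brute-force simulation of lost measurements can be sketched as an enumeration of every start position of a gap within one daily profile. The sketch assumes that a gap does not wrap around midnight; the generator name `loss_scenarios` is hypothetical.

```python
def loss_scenarios(profile, hours):
    """Yield (start, profile_with_gap) for every run of `hours` lost hours."""
    gap = 2 * hours   # two half-hourly readings per hour
    for start in range(len(profile) - gap + 1):
        damaged = list(profile)
        # Lost readings are replaced by zeros, as in the text
        damaged[start:start + gap] = [0.0] * gap
        yield start, damaged


profile = [float(i + 1) for i in range(48)]
scenarios = list(loss_scenarios(profile, hours=3))
print(len(scenarios))  # 43 start positions for a gap of 6 readings
```

Each damaged profile would then be passed to the load estimation algorithm, which fills the zeros one half-hourly measurement at a time.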

Integrated load and state estimation
The integrated load and state estimation algorithm was applied to estimate the voltage magnitudes and angles at each busbar of the 11kV distribution network shown in Section 4, except the voltage angle at Busbar 1 (the slack busbar), which is held constant at 0°. The integrated estimation algorithm utilised the test profiles and made the following assumptions. Eq. (16)-(18) define the MAPE [37], the mean of the MAPE and the overall maximum value of the APE.

A set of real-time measurements is
where $H$ is the overall number of half-hourly time steps and $B$ is the number of busbars of the network.
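The three error measures can be computed as in the sketch below, which uses the standard definitions of the absolute percentage error (APE) and the mean absolute percentage error (MAPE). It assumes that the MAPE is taken over the time steps of each busbar and that the mean of the MAPE is then taken over the busbars; the function names and voltage values are hypothetical.

```python
def ape(actual, estimate):
    """Absolute percentage error of one estimate."""
    return abs((actual - estimate) / actual) * 100.0


def error_metrics(actual, estimated):
    """actual/estimated: one list of time-step values per busbar.

    Returns (MAPE per busbar, mean of the MAPEs, overall maximum APE).
    """
    apes = [[ape(a, e) for a, e in zip(row_a, row_e)]
            for row_a, row_e in zip(actual, estimated)]
    mape = [sum(row) / len(row) for row in apes]
    return mape, sum(mape) / len(mape), max(max(row) for row in apes)


# Load flow and estimated voltage magnitudes (p.u.) for two busbars
actual = [[1.00, 0.98], [0.97, 0.95]]
estimated = [[1.01, 0.98], [0.97, 0.931]]
mape, mean_mape, max_ape = error_metrics(actual, estimated)
print(mape, mean_mape, max_ape)
```

The APE is undefined when the reference value is zero, which is consistent with excluding the voltage angle of the slack busbar from the comparison.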

Results and discussion
Differences exist between the load and voltage profiles on weekdays and weekends of the test period. These differences result from the diversified use of appliances at different times of the day, which can be linked to the daily schedule of occupants within the residential premises. The integrated load and state estimation algorithm was applied to estimate the load and voltage profiles of each busbar (except the load and voltage angle of Busbar 1) of the residential network regardless of the day type (weekday or weekend).

Load estimation
The load estimation algorithm was applied to estimate the active and reactive power demand of each MV/LV transformer. Fig. 5 shows the actual and estimated load profiles of Busbar 11 for a representative weekday and weekend.  The performance of the load estimation algorithm using low frequency smart metering data was comprehensively analysed in [11]. Accuracy of the estimated load was reported to decrease as the duration of the estimated load increases.

Load flow analysis
For the representative weekday and weekend, Figs. 6 and 7 show the range of load flow voltage magnitudes and angles at each busbar of the residential distribution network. The solid black profile is the mean value of the voltage magnitudes (Fig. 6) and angles (Fig. 7) while the red profiles are the minimum and maximum voltage magnitudes (and angles) calculated over the 48 half-hourly time steps during a day. For any busbar, the mean voltage magnitude and angle were obtained according to Eq. (19) and (20):
$$\bar{V}_b = \frac{1}{T}\sum_{t=1}^{T} V_b(t) \quad (19)$$
$$\bar{\theta}_b = \frac{1}{T}\sum_{t=1}^{T} \theta_b(t) \quad (20)$$
The maximum and minimum voltage magnitudes and angles of any busbar are calculated according to Eq. (21)-(24):
$$V_b^{\max} = \max_{t} V_b(t), \quad V_b^{\min} = \min_{t} V_b(t), \quad \theta_b^{\max} = \max_{t} \theta_b(t), \quad \theta_b^{\min} = \min_{t} \theta_b(t) \quad (21)\text{-}(24)$$
In Eq. (19)-(24), $V$ is the busbar voltage magnitude, $\theta$ is the busbar voltage angle, $b$ is the busbar index, $t$ is the half-hourly measurement, and $T$ is the overall number of half hours within a day.
For all half hours during a day, the minimum voltage magnitude (Fig. 6) and the most negative voltage angle (Fig. 7) were observed at Busbar 11, which is located at the end of the feeder. Busbar 11 was used in the following studies.

Integrated load and state estimation
The integrated load and state estimation algorithm was applied to estimate the operating state of the residential distribution network using the assumptions presented in Section 5.3.
The estimated voltage magnitudes and angles and the active and reactive power demands at each busbar of the network were compared with their respective values obtained from the load flow solution. Fig. 8 shows the load flow and estimated voltage magnitudes at Busbar 11 for the weekday and weekend, for up to 24 hours of pseudo measurements. In Fig. 8, the grey profiles are the estimated voltage magnitudes of Busbar 11 while the red profile represents the voltage magnitudes obtained from the load flow analysis. Fig. 9 illustrates a box-whisker plot [38] of the MAPE distribution of the estimated voltage magnitudes (of the test period) at Busbar 11 for 1-24 hours of pseudo measurements. Fig. 9 shows that for 24 hours of pseudo measurements, the MAPE of the estimated voltage magnitudes at Busbar 11 was less than 0.08%.

Accuracy of the estimated voltage angles
For the weekday and weekend, the load flow and estimated voltage angles at Busbar 11, for 1-24 hours of pseudo measurements, are shown in Fig. 10.  Using the estimated voltage magnitudes and angles, the state estimator calculates the active and reactive power demands at each MV/LV transformer of the residential distribution network. Fig. 12 shows the load flow and estimated active power demand of Busbar 11 on both the weekday and weekend, for 1-24 hours of pseudo measurements.
In Section 6.1, Fig. 5 showed the estimated active power demand of Busbar 11 resulting from the load estimation algorithm. In contrast, Fig. 12 shows the active power demand calculated using the estimated voltage magnitudes and angles. The red profile in Fig. 12 is the active power demand obtained from the load flow analysis, whereas the grey profiles represent the power demand calculated using the estimated voltage magnitudes and angles. As the duration of pseudo measurements increases from 1 to 24 hours, the estimated demand shown in Fig. 12 diverges from the load flow demand.
For 1-24 hours of pseudo measurements, Fig. 13 shows a box-whisker plot representing the MAPE distribution of the estimated active power demand at Busbar 11 during the test period. Fig. 13 shows that for up to 10 hours of pseudo measurements, 75% of the estimation errors were less than 5%.
At the same time, the day-ahead active power demand was estimated with a median MAPE of less than 8%.

Sensitivity of state estimator outputs to uncertainty in LV network losses
The results shown in Figs. 8-13 were obtained assuming that the losses of the LV networks and MV/LV transformers during the test period were known and were equal to their values (selected between 6-10% of the load demand) defined in the training period (Section 5).
In order to reflect the uncertainty in network losses at its lowest voltage level, these losses were modelled  However, the mean error of the estimated voltage magnitude remains below 0.03% while the overall maximum error attains values less than 0.26% even for 100% uncertainty in the LV network losses.
The mean MAPE and the overall maximum absolute percentage errors of the estimated voltage angles are shown in Fig. 15. Fig. 15 shows that although an increase in the level of uncertainty in LV network losses is accompanied by an increase in the errors of the estimated voltage angle, the mean MAPE remains below 3% and the overall maximum error below 30%.
Despite the differences between voltage and load profiles on weekdays and weekends, the integrated load and state estimation algorithm is capable of producing reliable estimates of the load and voltage profiles at each busbar of a distribution network.

Conclusions
An integrated load and state estimation algorithm was developed and used to estimate the operating state, defined in terms of the busbar voltage magnitudes and angles, of a residential distribution network. The k-means based load estimation algorithm showed promising application of cluster analysis methods to extract consumption patterns from, and estimate the load of, aggregated smart meters at the 11kV level.
The integrated estimation algorithm was tested using real-time and pseudo measurements derived from residential smart meter measurements and an 11kV residential distribution network. Under extreme conditions of real-time measurements only at the primary substation, pseudo measurements of MV/LV transformers, and 100% uncertainty in the LV network losses, the overall maximum APE was less than 0.26% in the estimated voltage magnitudes and 29% in the estimated voltage angles. In this case, the mean values of the MAPE were less than 0.03% for the estimated voltage magnitudes and 3% for the estimated voltage angles.
Simulation results showed that the integrated load and state estimation algorithm reliably defined the operating state of a distribution network using limited MV real-time measurements and both real-time and pseudo measurements derived from smart metering data. Therefore, the algorithm can be applied to extend the observability and improve the operation and control of distribution networks.