Exploring the impact of different parameterisations of occupant-related internal loads in building energy simulation

A building energy simulation relies on accurate parameterisation of occupant-related internal loads to simulate a realistic energy balance within a building. The internal loads are inextricably linked to occupant behaviour, both directly through the contribution of occupant heat output to thermal energy balance and indirectly via the interactions between occupants, appliances and building services. While occupancy itself is difficult to measure directly, most buildings possess a wealth of data in the form of monitored electricity consumption in varying degrees of resolution. These data, particularly plug loads, may be used to inform the model of occupant-related internal loads. Different approaches to parameterisation of plug loads have been investigated, with the purpose of exploring the conditions that might lead to preference of one approach over another. The models have been tested through a case study and simulation results have been compared against a range of response variables. Conclusions have been drawn as to the most important features of plug load parameterisation for a model to be used for forecasting future demand. © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license.
Keywords: Building energy simulation; Occupancy-related internal loads; Electricity consumption; Plug loads; Uncertainty quantification; Non-domestic buildings; Stochastic analysis


Introduction
In the UK the buildings sector accounts for 37% of the total annual greenhouse gas emissions, with non-domestic buildings being responsible for 36% of the sector emissions [1]. Progress has been slow in improving this performance and building energy simulation has a role to play in assessing the impact of potential changes to building fabric and operation on building energy consumption for all types of non-domestic building [2][3][4].
A building energy simulation relies on accurate input of internal loads to facilitate a realistic simulation of the energy balance within a building. It is well known that building energy consumption simulated at the design stage rarely agrees with observed data post-design, and with increasing deployment of energy monitoring systems this so-called 'performance gap' is becoming increasingly visible [5]. One would expect that forecast consumption for an already existing building would be in closer agreement with reality, yet it is still notoriously difficult to match the simulation to the observed data [6].
One fundamental cause of the gap is the inadequacy of current approaches to definition of occupancy-related loads, even in fully operational buildings [7]. The internal loads in a building are inextricably linked to occupant behaviour, both directly through the contribution of occupant heat output to thermal energy balance and indirectly via the interactions between occupants, appliances and building energy services. Occupant-related services are a principal component of building electricity consumption and must be understood if accurate estimations are to be made. However, occupancy and occupant-related internal loads are difficult to specify as occupant behaviour is inherently stochastic; hence these loads represent a significant source of uncertainty in the simulation results [8]. A comprehensive review of the state of the art in occupant behaviour modelling has been performed [9], and many issues are being addressed under the auspices of the International Energy Agency Energy in Buildings and Communities Program (IEA EBC) Annex 66: Definition and Simulation of Occupant Behaviour in Buildings.
Not only is occupant behaviour inherently stochastic, occupant presence is also difficult to measure directly. An alternative approach to simulating occupancy is to infer building occupancy from a measurable quantity; the feasibility of such an 'implicit occupancy' approach has been demonstrated using monitored computer status to infer occupancy using existing IT infrastructure [10]. One must bear in mind, though, that occupant presence may not be the best or a complete indicator of energy demand, as many devices, e.g. lights and air-conditioning, are controlled centrally, especially in the case of non-domestic buildings. Indeed, recent studies have indicated a greater correlation in device state between one hour and the next than between occupancy and device state in any hour [11]. Nonetheless, for an operational building a wealth of data exists in the form of monitored electricity consumption, and many non-domestic buildings are now routinely sub-metered by end-use, e.g. plug loads, lights and air conditioning. Accessing these data is relatively straightforward and gives an immediate insight into actual electricity consumption, and hence building operation, that can be further augmented by an understanding of the building control settings.
This paper examines the different ways by which sub-metered electricity consumption data may be used to define the occupancy-related internal loads, specifically small power electricity consumption or 'plug loads', in a non-domestic building. Focus has been placed on plug loads alone as the demand is measurable and more closely related to occupancy than lighting, which may be centrally controlled. Small power equipment is diverse and highly dependent on the building function, but for a typical office building comprises primarily computers and peripheral equipment, together with catering equipment. Plug loads were found to account for 23% of total electricity consumption in California's commercial office buildings [12], but Ref. [13] suggests that this could increase to as high as 50% for a high-efficiency office. In the UK, the Energy Consumption Guide 19 [14] provides data for the energy consumption of typical and 'good practice' offices which suggest that plug loads account for between 28 and 58% of the total electricity consumption of an office building, and give a range of values varying between 1.9 and 19.1 W/m².
Different approaches have been identified for quantifying plug loads, comprising both the accepted methodology used in the UK and possible alternatives: top-down data-driven, bottom-up deterministic and bottom-up stochastic models. Each approach has its advantages, and the aims of this paper are threefold: 1) to explore the conditions which might lead to preference of one model over another for the quantification of plug loads; 2) to explore the extent to which the different sources of uncertainty identified in the models are adequately represented; and 3) to identify the most important features of plug load quantification for forecasting of future demand.
Recognizing that the 'adequate' level of complexity may be governed by the nature of the design problem, or 'context', the different models have been applied to an existing building. Model outputs have been compared against a range of standard Key Performance Indicators (KPIs), such as the mean weekday and weekend demand profiles, peak hourly, daily total and the timing of the peak hourly electricity consumption. A particular KPI may be more relevant than another depending on the design problem. For example, peak demand may be more important from the point of view of electricity tariffs, and mean weekday profiles become more relevant for quantifying associated heat gains to size cooling systems.
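As a minimal sketch of how the KPIs listed above could be extracted from an hourly series, the following Python function reshapes the data into days and computes each indicator. The function name, the argument names and the synthetic input are assumptions made for illustration; they are not taken from the case study.

```python
import numpy as np

def plug_load_kpis(hourly_kwh, first_weekday=0):
    """Compute simple KPIs from an hourly consumption series.

    hourly_kwh: 1-D sequence starting at midnight, length a multiple of 24.
    first_weekday: weekday of the first day (0 = Monday ... 6 = Sunday).
    """
    days = np.asarray(hourly_kwh, dtype=float).reshape(-1, 24)
    weekday_index = (first_weekday + np.arange(days.shape[0])) % 7
    is_weekend = weekday_index >= 5
    return {
        "mean_weekday_profile": days[~is_weekend].mean(axis=0),
        "mean_weekend_profile": days[is_weekend].mean(axis=0),
        "daily_peak": days.max(axis=1),      # peak hourly consumption per day
        "daily_total": days.sum(axis=1),     # total consumption per day
        "peak_hour": days.argmax(axis=1),    # hour of day at which the peak occurs
    }

# Synthetic two-week series (hypothetical values): higher demand 09:00-17:00.
rng = np.random.default_rng(0)
hours = np.arange(24)
base_day = np.where((hours >= 9) & (hours < 17), 5.0, 1.0)
demo_series = np.tile(base_day, 14) + rng.uniform(0.0, 0.1, 14 * 24)
kpis = plug_load_kpis(demo_series)
```

Which entry of the returned dictionary matters most depends on the design problem, as noted above: `daily_peak` and `peak_hour` for tariff questions, the mean profiles for heat gain estimates.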
The models have been applied both with and without making use of monitored plug-load data to tune model inputs for the building in question. An implicit question posed through this exercise is whether the availability of sub-metered data from the same building is necessary for a sufficiently accurate quantification of plug loads. The top-down models of course require some kind of relevant and applicable data set, and we use plug-loads monitored in another very similar building to train the top-down models. At the same time, bottom-up models also greatly benefit from using sub-metered data to tune model inputs.
A brief review of the current methods for characterisation of occupancy-related internal loads in building energy simulation is presented in the next section of this paper, together with an outline of the desirable qualities for such a model. This is followed by a description of the models selected for use in the comparative study and the results of the case study are presented and discussed in Sections 4 and 5. The paper concludes with a consideration of the models' performance against the desirable criteria based on the case study results.

Parameterisation of occupant-related internal loads
In a typical computational building energy simulation plug loads are characterised by the user-defined peak power demand associated with devices. These are multiplied by (user-defined) schedules of diversity factors that simulate the typical daily change in use. For an existing building a detailed energy audit may be undertaken to understand how the building operates, but it can be prohibitively time consuming to observe schedules and peak power demand for every end-use and building zone. To reduce the effort required by audit-based studies, a number of alternative approaches have been proposed in the literature.
The approaches identified for use here range from simple aggregation of demand to fully stochastic simulation. Within the simplest models it is assumed that there is different weekday/weekend power demand that fluctuates between peak and off-peak values (estimated from benchmarks, literature, or measured) according to the weekday or weekend time schedule [15]. More complexity may be added by assigning different schedules of use and power demand to different device types and hence building up an aggregate power demand; this is the 'bottom-up' deterministic approach [16]. Aggregating the demand like this may misrepresent an essentially stochastic load, however [7]; whether this is significant may depend on the purpose of the simulation and the key parameters of interest. The DELORES model [17] accounts for the stochastic nature of the power demand by generating a fully stochastic 365 day/24 h demand profile based on the probability of each individual device changing state in each hour. An alternative way to generate a stochastic demand is by using a top-down approach; synthetic time histories may be generated via a statistical analysis of monitored data [6] or a time series analysis [11]. Both of these approaches use the mean monitored daily profile, but differ in the way in which the variability about that mean is simulated.
The 'best' model may be different according to the context and the key parameters of interest [18]. If the purpose of the simulation is to extract aggregate consumption, as might be the case for an analysis of the impact of potential retrofit scenarios on the annual electricity consumption of a building, then an aggregate model may well be adequate. However if the key parameters of interest include such quantities as peak daily power demand and the timing of that peak, e.g. for demand scheduling purposes, then it is necessary to use a model which encompasses the inherent stochasticity of the power demand. Further desirable qualities include being able to assimilate large quantities of data as data acquisition becomes more prolific, and to be able to use those data to improve forecast accuracy. It is also important that a model is flexible in its ability to simulate building operation; if aspects of that operation change, for example if the building layout or occupancy are re-organised, or if building use changes, a model should be able to simulate the corresponding change in power demand.
The models are assessed against these desirable qualities in the comparative study of the different types of model currently available detailed in the following sections.

Description of internal load models
Dynamic simulation models are used routinely at the design stage to predict the operational energy demand of a building. In order to define operational power demand in the UK, guidelines such as the National Calculation Methodology (NCM) [15] may be consulted. In this approach, one standard activity is assigned to each building zone, defining the small power demand profile in terms of a nominal value and a daily schedule; electricity consumption may then be summed over the year. For a typical office, peak power demand is stated as 11.77 W/m² and the peak time period is from 7am until 7pm, weekdays only (off-peak power demand is 0.63 W/m²). The value quoted lies within a range of 0.64-27.31 W/m² for all office types, with a mean of 12.36 W/m² and a standard deviation of 4.4 W/m².
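The NCM-style schedule described above can be sketched in a few lines of Python. The peak and off-peak values and the 7am-7pm weekday window are those quoted in the text; the function names, the split of the year into 261 weekdays and 104 weekend days, and the neglect of holidays are assumptions made for illustration.

```python
import numpy as np

PEAK_W_PER_M2 = 11.77      # typical office peak demand quoted above
OFF_PEAK_W_PER_M2 = 0.63   # typical office off-peak demand quoted above

def ncm_daily_profile(is_weekday):
    """Return a 24-value hourly power density profile in W/m2."""
    profile = np.full(24, OFF_PEAK_W_PER_M2)
    if is_weekday:
        profile[7:19] = PEAK_W_PER_M2  # 07:00 to 19:00, weekdays only
    return profile

def ncm_annual_kwh(floor_area_m2, n_weekdays=261, n_weekend_days=104):
    """Sum the fixed schedule over a year (holidays ignored: an assumption)."""
    weekday_wh = ncm_daily_profile(True).sum()    # Wh/m2 per weekday
    weekend_wh = ncm_daily_profile(False).sum()   # Wh/m2 per weekend day
    wh_per_m2 = n_weekdays * weekday_wh + n_weekend_days * weekend_wh
    return wh_per_m2 * floor_area_m2 / 1000.0

annual_kwh_example = ncm_annual_kwh(1000.0)  # hypothetical 1000 m2 office
```

Because the schedule is fixed, every weekday is identical and the model produces no spread about the mean profile.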
Recently, an alternative approach for annual power demand has been proposed. Following CIBSE TM54 [19] a designer would estimate annual total power demand based on the expected number of devices, average device power demand and average annual operational hours. Disaggregation of annual operation would then be required to extract weekly and daily consumption. CIBSE TM54 recommends that engineers 'present the results as a range', but the extent of the range is unspecified; typical values for average power demand, together with 'average', 'conservative' and 'highly conservative' heat gains from desktop computers and monitors are given in CIBSE Guide F [20]. However, typical operating hours for equipment are omitted from CIBSE Guide F [21].
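Read literally, the annual estimate described above is a sum over device types of the number of devices, the average power demand and the annual operating hours. The expression below is a sketch of that reading; the symbols $n_k$, $\bar{P}_k$ and $h_k$ are notation introduced here for illustration and are not taken from TM54.

```latex
E_{\mathrm{annual}} = \sum_{k} n_k \, \bar{P}_k \, h_k
```

For example, with purely hypothetical figures, 100 computers drawing an average of 100 W for 2000 h per year would give 100 × 100 W × 2000 h = 20,000 kWh per year.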
The advantage of these simple approaches is that they require a small number of objective parameters and consequently allow consistency in simulation across a portfolio of buildings. However, this strength is also their weakness; the results are applicable only for the most 'typical' use profiles, and if use deviates significantly from the norm, the results may be misleading.

Bottom-up deterministic model
Menezes et al. [16] suggest two parameterisations for plug loads: 1) Using random sampling of monitored data. 2) Bottom-up model.
In Model 1, daily electricity consumption profiles at 1-minute intervals are randomly selected from a general database of monitored data for each equipment type. The process is repeated 30 times and a Student's t-distribution is used to calculate upper and lower prediction limits. This model avoids the need for assumptions regarding the expected usage profile of individual items of equipment provided that the monitored data set is applicable to the building being modelled; however the approach relies heavily on large amounts of good quality monitored data per device and space type that are not typically available, and this model is not considered further in this study.
Model 2 is an alternative, bottom-up approach that extends CIBSE TM54 by specifying operational power demand in more detail, including an estimate of the uncertainty associated with the calculation. Electricity consumption is estimated based on the quantity, power demand and usage of each type of device. For devices such as computers and screens in particular, which account for a significant proportion of the plug loads in non-domestic buildings [22], the device state is characterised as 'off', 'low' or 'on', corresponding to a specified power demand for each state. Here 'off' corresponds to the lowest power demand while the equipment is connected to the mains, while 'low' corresponds to the low power mode a device may enter after a period of inactivity, i.e. its 'stand-by' state. Operation times are defined by 'strict' and 'extended' switch-on and switch-off times, where 'strict' corresponds to a normal office 9am-5pm day and 'extended' corresponds to a longer day, as detailed in Table 1. Each device controlled by an individual user is assigned to one of four possible usage profiles (see Table 2); the four usage profiles relate directly to the operation times, and are termed 'strict', 'extended', 'always on' or 'transient', where 'transient' equates to being in the 'on' state for 50% of the time period corresponding to 'strict' switch-on and switch-off times. Estimates of the number of devices of each type switched off at the end of each day and the expected drop in power demand at lunchtime are also specified. Finally, Menezes et al. specify a usage diversity factor d per day type: one for a weekday and one for the weekend (see Table 1). This usage diversity factor should not be confused with the hourly diversity factors used by ASHRAE [23], which encompass both the usage diversity factor and usage type specified in the Menezes model.
The hourly electricity consumption, $Q_t$, calculated using the Menezes model can be summarised by the following equation:
$$Q_t = Q_{base} + d \sum_{i=1}^{N} \sum_{j} p_{i,j}\, q_{i,j,t}$$
where $Q_{base}$ is the base load calculated from the proportion of equipment switched off at the end of the day and assuming the remaining devices are in the 'low' state. Of the remaining parameters, i is the device type (i.e. desktop, laptop, monitor etc.), N is the number of different types of device, j is the usage profile (i.e. 'transient', 'strict' etc.), $p_{i,j}$ is the number of devices of type i assigned to usage profile j and $q_{i,j,t}$ is the power demand above the base load of device type i assigned to usage profile j, according to that usage profile for the hour of interest, t. The stochastic nature of the demand is bounded by specifying a ±10% variation on the usage diversity factor, d. Table 2 presents power demand values, distribution of type and usage profile for computers in a typical office [16]. Columns 2 (labelled proportion) and 4 (labelled usage profile) are used to derive $p_{i,j}$, e.g. if there are 100 computers, typically 14 would be high-end desktops, and of these 30%, or 4, would be operated according to 'strict' hours.
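The bottom-up summation described above can be sketched as follows, taking the hourly demand to be the base load plus the usage diversity factor times the summed demand above the base load. Where d enters the sum is an assumption of this sketch, and the function names and all numerical inputs are hypothetical rather than values from Tables 1 and 2.

```python
import numpy as np

def menezes_hourly(q_base, d, p, q):
    """Hourly demand (W) as base load plus diversity-scaled device demand.

    q_base: base load in W.
    d: usage diversity factor for the day type.
    p[i, j]: number of devices of type i assigned to usage profile j.
    q[i, j, t]: power above base load (W) per device in hour t.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return q_base + d * np.einsum("ij,ijt->t", p, q)

def menezes_bounds(q_base, d, p, q, spread=0.10):
    """Lower, central and upper profiles from a +/-10% variation on d."""
    return (menezes_hourly(q_base, d * (1.0 - spread), p, q),
            menezes_hourly(q_base, d, p, q),
            menezes_hourly(q_base, d * (1.0 + spread), p, q))

# Hypothetical example: one device type, two usage profiles.
p_demo = np.array([[4, 2]])          # 4 'strict' devices, 2 'always on'
q_demo = np.zeros((1, 2, 24))
q_demo[0, 0, 9:17] = 60.0            # 'strict': 60 W above base, 09:00-17:00
q_demo[0, 1, :] = 60.0               # 'always on': 60 W above base all day
lower, central, upper = menezes_bounds(q_base=30.0, d=0.8, p=p_demo, q=q_demo)
```

The upper and lower profiles differ from the central one only through d, so the shape of the profile and the timing of the peak are the same in all three.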
The benefit of such a bottom-up approach is that there is no need for high-resolution sub-metered data; the downside is that expert judgement may be required to define the model parameters especially those associated with estimating the base load, the proportion of power demand above the base load and the assignment of usage profiles.

Bottom-up stochastic model
Another approach that uses a 'bottom-up' summation of equipment power demand is DELORES [17]. In this model, device state is again characterised by the power demand in the 'on', 'low' or 'off' states, but here the stochasticity is simulated directly; transition probabilities are assigned to the state of each device in each hour, dependent on its prior state and the time period of the day. Each day is divided into three time periods, corresponding to 'peak', 'offpeak' and 'rest' times; the three prior states, three possible next states and three daily time periods therefore require 27 transition probabilities, as illustrated in Table 3; e.g. at peak time, if a device is 'off', the probability of it switching 'on' in the next hour is 0.87. The state of each device is calculated in each hour of the year from its state in the previous hour and the probability of transition using a Markov Chain Monte-Carlo simulation.
The stochastic nature of the model results in different daily profiles each day, mimicking the typical variation in daily profile observed over time.
A nominal time schedule defines the hours that correspond to the daily time periods, with a specified allowable potential deviation from this nominal schedule. The transition probabilities and nominal schedules are defined for weekdays, Saturdays and Sundays and potential holidays are accounted for by estimating the probability in a given month that any day will be a holiday. Example parameters for a desktop computer weekday operation are detailed in Table 3; corresponding weekend transition probabilities are given in Ref. [24].
The model can be summarised by Eq. (2):
$$Q_t = \sum_{j=1}^{N} q_j(s), \qquad s = f(T, s_{-1}, P) \qquad (2)$$
i.e. the hourly electricity consumption, $Q_t$, is equal to the summation over N devices of the power demand of each device, $q_j$, in state s, where s is a function of the time period of the day, T, the prior state, $s_{-1}$, and the matrix of transition probabilities, P. For an office computer (including the monitor), the power demand q has been assumed to be 5 W in the 'Off' state, 65 W in the 'Low' state and 100 W in the 'On' state [17].
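A minimal sketch of the hourly state-transition simulation described above is given below. The state power demands (5, 65 and 100 W) and the peak-time off-to-on probability of 0.87 are taken from the text; every other transition probability, the nominal schedule and the function names are hypothetical placeholders, not the values of Table 3. Schedule deviations, weekend parameters and holidays are omitted.

```python
import numpy as np

STATES = ("off", "low", "on")
POWER_W = np.array([5.0, 65.0, 100.0])  # per-state demand quoted in the text

# P[period][prior_state] gives probabilities of the next state (off, low, on).
# Only the peak-time off->on value (0.87) comes from the text; all other
# entries are hypothetical placeholders chosen so that each row sums to 1.
P = {
    "peak":    np.array([[0.08, 0.05, 0.87],
                         [0.05, 0.25, 0.70],
                         [0.02, 0.08, 0.90]]),
    "offpeak": np.array([[0.60, 0.10, 0.30],
                         [0.20, 0.50, 0.30],
                         [0.15, 0.25, 0.60]]),
    "rest":    np.array([[0.97, 0.02, 0.01],
                         [0.50, 0.45, 0.05],
                         [0.60, 0.30, 0.10]]),
}

def period_of_hour(hour):
    """Hypothetical nominal weekday schedule mapping hour to time period."""
    if 9 <= hour < 17:
        return "peak"
    if 7 <= hour < 9 or 17 <= hour < 20:
        return "offpeak"
    return "rest"

def simulate_day(n_devices, rng, initial_state=0):
    """Simulate one day; return the 24 hourly aggregate demands in W."""
    states = np.full(n_devices, initial_state)
    demand = np.empty(24)
    for hour in range(24):
        matrix = P[period_of_hour(hour)]
        cumulative = matrix[states].cumsum(axis=1)  # one row per device
        draws = rng.random(n_devices)
        # Next state = number of cumulative thresholds the draw exceeds.
        states = np.minimum((draws[:, None] > cumulative).sum(axis=1), 2)
        demand[hour] = POWER_W[states].sum()
    return demand

demo_demand = simulate_day(84, np.random.default_rng(1))
```

Repeating `simulate_day` with different random draws gives a different profile each time, which is the source of the day-to-day variation in this type of model.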

Top-down data-driven models
A number of studies have investigated the use of metered electricity consumption data to derive models for quantifying future electricity demand [25]. The study by Sun [6] takes this approach further by also using the data to quantify uncertainty surrounding future predictions. The basic formulation of the Sun model is simply:
$$Q_t = q_P \, D_t$$
i.e. the hourly electricity consumption, $Q_t$, is equal to the product of the peak hourly electricity consumption across all hours, $q_P$, and an hourly diversity factor, $D_t$, which takes values between 0 and 1.
This diversity factor is very different from the weekday/weekend usage diversity factor, d, specified in the Menezes model. Sun used empirical data from a series of 16 buildings analysed under ASHRAE Research Project 1093-RP [23]. The annual peak hourly electricity consumption, $q_P$, was identified from the data for each building and the values were collated as a normal distribution. The electricity consumption data were also used to derive a matrix of hourly diversity factors for the 16 buildings in the following manner: for each building the mean electricity consumption across the week and weekend were collated into a 48 h vector, where the first 24 h represent a mean weekday and hours 25-48 represent a mean weekend day. The 16 vectors were then collated into a 16 × 48 matrix, and the mean, $\mu$, and covariance, $\Sigma$, of this matrix were used as input for the random generation of 48 h vector profiles of diversity D, assuming that the distribution of D followed an n-dimensional multivariate normal distribution, i.e.
$$D \sim \mathcal{N}_n(\mu, \Sigma)$$
In generating a full covariance matrix it is inherently assumed that there is correlation between the hourly electricity consumption in one hour and in any other hour. Sun concluded that a reduced covariance matrix was more appropriate in which only the diagonal and immediately adjacent terms are retained, implying correlation only between the hourly electricity consumption in adjacent hours. While this reduces the complexity of the problem, it is possible that the reduction could be an over-simplification of the autocorrelation.
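The sampling procedure described above, including the reduction of the covariance matrix to its diagonal and adjacent terms, can be sketched as follows. The eigenvalue clipping step is an addition of this sketch (a banded sample covariance is not guaranteed to be positive semi-definite) and is not part of the method as described; the function names, the clipping of sampled factors to the range 0 to 1 and the synthetic input profiles are likewise assumptions.

```python
import numpy as np

def banded_covariance(cov, bandwidth=1):
    """Keep only the diagonal and terms within `bandwidth` of it."""
    n = cov.shape[0]
    offsets = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.where(offsets <= bandwidth, cov, 0.0)

def nearest_psd(cov):
    """Clip negative eigenvalues to zero (an assumption of this sketch)."""
    sym = (cov + cov.T) / 2.0
    vals, vecs = np.linalg.eigh(sym)
    return (vecs * np.clip(vals, 0.0, None)) @ vecs.T

def sample_hourly_demand(profiles, peak_mean, peak_std, n_samples, rng):
    """Draw 48 h demand profiles as peak consumption times diversity factor."""
    mu = profiles.mean(axis=0)
    sigma = nearest_psd(banded_covariance(np.cov(profiles, rowvar=False)))
    d = rng.multivariate_normal(mu, sigma, size=n_samples, check_valid="ignore")
    d = np.clip(d, 0.0, 1.0)  # diversity factors lie between 0 and 1
    q_peak = rng.normal(peak_mean, peak_std, size=(n_samples, 1))
    return q_peak * d

# Synthetic stand-in for the 16 x 48 matrix of mean diversity profiles.
rng = np.random.default_rng(42)
hours = np.arange(48)
shape = np.where((hours % 24 >= 8) & (hours % 24 < 18), 0.8, 0.2)
shape[24:] *= 0.5                      # lower weekend demand (hypothetical)
profiles = np.clip(shape + rng.normal(0.0, 0.05, (16, 48)), 0.0, 1.0)
demand = sample_hourly_demand(profiles, 10.0, 1.0, n_samples=100, rng=rng)
```

Changing `bandwidth` in `banded_covariance` gives a simple way to test how sensitive the sampled profiles are to the assumed extent of correlation between hours.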
In the Sun model, a sample diversity factor is generated from a multivariate normal distribution with the covariance matrix accounting for the temporal autocorrelation. Another method that takes into account the autocorrelation between hourly values is the time series modelling approach, as applied in the paper by Wang et al. [11] to investigate the correlation between the variability in occupancy and the variability in device state. Although this approach has not been widely used in the simulation of building energy consumption, we think that it potentially offers a facility to investigate the nature of the correlations in the monitored data that may be more easily interpretable than the covariance matrix proposed by Sun. We therefore evaluate this approach using a well-known time series method: the Auto-Regressive Integrated Moving Average (ARIMA) model. ARIMA models are applicable to data that are stationary, i.e. whose statistical properties do not depend on time; the variability of electricity consumption about the mean typically fulfils these requirements (although additional pre-processing of the data, such as subtraction of the mean or differencing, i.e. subtraction of the previous value from each value, may be required to achieve stationarity). It is possible, therefore, to use the mean electricity consumption together with an ARIMA model of the residual electricity consumption to simulate plug loads. The forecasting equation is a linear equation in which the terms consist of previous values of the dependent variable (Auto-Regressive) and the forecast errors (Moving Average). The number of previous values included in the model, termed the 'order' of the model, depends on the autocorrelations observed in the data.
The ARIMA model may be expressed as:
$$Q_t = \mu_t + Y_t, \qquad Y_t = \sum_{i=1}^{p} \alpha_i Y_{t-i} + W_t + \sum_{j=1}^{q} \beta_j W_{t-j}$$
i.e. the total hourly electricity consumption, $Q_t$, at time t is the sum of the mean electricity consumption at that hour, $\mu_t$, plus the residual, $Y_t$, which in turn is a combination of 3 terms:
• an Auto-Regressive AR(p) model, which is a weighted sum of the $Y_{t-i}$ values at previous time steps with weightings $\alpha_i$, back to time t-p,
• a white noise term, $W_t$, with zero mean and variance $\sigma_w^2$, and
• a Moving Average MA(q) model, which is a weighted sum of the noise terms, $W_{t-j}$, at previous time steps with weightings $\beta_j$, back to time t-q.
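The three-term residual described above can be simulated directly with a short recursion. The following is a sketch for the non-seasonal case only, written without a time series library; the function names and the example coefficients are hypothetical, and terms reaching back before the start of the series are simply dropped.

```python
import numpy as np

def simulate_arma_residual(alpha, beta, sigma_w, n_steps, rng):
    """Simulate the residual: AR terms plus white noise plus MA terms."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    w = rng.normal(0.0, sigma_w, n_steps)   # white noise, zero mean
    y = np.zeros(n_steps)
    for t in range(n_steps):
        # alpha[i] weights the residual i + 1 steps back; likewise beta[j].
        ar = sum(alpha[i] * y[t - 1 - i]
                 for i in range(len(alpha)) if t - 1 - i >= 0)
        ma = sum(beta[j] * w[t - 1 - j]
                 for j in range(len(beta)) if t - 1 - j >= 0)
        y[t] = ar + w[t] + ma
    return y

def simulate_plug_load(mean_profile_24h, alpha, beta, sigma_w, n_days, rng):
    """Total hourly consumption = repeated mean daily profile + residual."""
    mu = np.tile(np.asarray(mean_profile_24h, dtype=float), n_days)
    return mu + simulate_arma_residual(alpha, beta, sigma_w, mu.size, rng)

# Hypothetical mean profile and coefficients for illustration only.
hours = np.arange(24)
mean_profile = np.where((hours >= 9) & (hours < 17), 5.0, 1.0)
simulated = simulate_plug_load(mean_profile, alpha=[0.6], beta=[0.2],
                               sigma_w=0.3, n_days=7,
                               rng=np.random.default_rng(3))
```

In practice the coefficients would be estimated from monitored data with a fitting tool rather than chosen by hand.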
Periodic variation (termed 'seasonality' in the ARIMA literature), such as a daily variation of electricity consumption, may also be incorporated by using seasonal differencing, i.e. incorporating terms which are functions of values in the previous period. For example, for daily variation of electricity consumption, a season would be one day with a period of 24 h; hence while a non-seasonal model at time t may include terms from t-1, t-2, ... hours, a seasonal model may incorporate values from both t-1, t-2, ... hours and t-24, t-25, ... hours.
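As a small illustration of seasonal differencing with a 24 h period, the sketch below subtracts from each hourly value the value at the same hour on the previous day. The function name and the test series are hypothetical.

```python
import numpy as np

def seasonal_difference(series, period=24):
    """Return series[t] - series[t - period] for t >= period."""
    series = np.asarray(series, dtype=float)
    return series[period:] - series[:-period]

# A perfectly repeating daily pattern has zero seasonal difference.
repeating = np.tile(np.arange(24.0), 3)
differenced = seasonal_difference(repeating)
```

The differenced series is shorter than the input by one period, since the first day has no previous day to subtract.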
Typical terminology is to refer to an ARIMA model as ARIMA(p,dif,q)(pS,difS,qS)[T] where p and q are the orders of the non-seasonal autoregressive and moving average models respectively and dif is the degree of non-seasonal differencing required to achieve stationarity. The parameters, pS, difS, and qS, are the same terms, but for the seasonal part of the model, and T represents the period of the seasonality, i.e. 24 h for daily variation. Several tools exist in software such as R [26] and MATLAB [27] to facilitate the identification of the ARIMA model that best fits a data set.

Uncertainty
The models are fundamentally different in the ways in which they incorporate uncertainty. A useful way of classifying the sources of uncertainty in a model, presented by Kennedy and O'Hagan [28], is to consider six different categories, namely:
1) parameter uncertainty, i.e. uncertainty in the model inputs, e.g. computer peak power demand;
2) parametric variability, closely linked to parameter uncertainty but reflecting the range of possible parameter values over a range of scenarios, e.g. variations in peak power demand across seven desktop computers;
3) model inadequacy, i.e. the difference between the true mean value of a real world process that the model is simulating and the simulation output at the true value of the model input;
4) residual variability, i.e. in this study taken to be the variability associated with the process being stochastic;
5) observation error, e.g. in measurement of electricity consumption; and
6) code uncertainty, particularly important as code increases in complexity.
Of these, the last two are the least significant in this comparative study; observation error cancels out in a comparative study and the models are numerically quite simple, rendering code uncertainty less important. One of the purposes of this paper is to investigate the degree of model inadequacy for each model; all models are simplified representations of reality and comparisons such as this study serve to illuminate which of the models may better represent the real world in the context of interest. However, it is important to recognise that model inadequacy may arise not only from an inappropriateness of context i.e. the model not being applicable for the purpose of the simulation, but also from a failure of training or calibration data to be sufficiently representative of the scenario of interest.
For the models considered here, model inadequacy is also inextricably linked with residual variability as the process being modelled is inherently stochastic. All the models to a certain extent include residual variability in some form, with the exception of the NCM model. Indeed, even in the deterministic Menezes model, the variation about the mean electricity consumption is simulated by including a variation of ±10% on the usage diversity factor in order to generate the upper and lower bound electricity consumption (see Section 3.1 and Table 1). By comparison, the stochastic model, DELORES, explicitly incorporates inherent randomness into the daily predictions and the variation in electricity consumption is extracted from the simulation results over many days. The top-down Sun and ARIMA models focus on simulating the variability about the mean, and again, the simulations must be run for many days in order to encompass all possible variation.
Parameter uncertainty, or uncertainty in the inputs into a model, is the simplest to understand yet it is not quantified comprehensively in any of the models considered here. Let us consider a measurable parameter, the power demand of a device (a computer, for example). The CIBSE TM54 approach suggests using a range of possible power demand values to establish upper and lower bounds per device type or end-use and thus very roughly incorporates both parameter uncertainty and parametric variability jointly. For the Menezes and DELORES models, the simulation outcome is directly proportional to the specified device power demand, hence using a range as TM54 recommends would increase the uncertainty in the simulation predictions. Power demand is also different for different devices; for both the Menezes and DELORES models, uncertainty in the proportion of different devices is only significant if the power demand for each device under each usage type is significantly different (see Table 2 and Section 3.2).
Parameter uncertainty and parametric variability in operational parameters such as schedules are more difficult to quantify and are not included in the Menezes model; in DELORES the specification of possible deviation from the daily schedule simulates the possible variations in transition times from one day to another (Section 3.2). The impact of incorporating this uncertainty is to increase variability in the timing of the transition between operational states; this influences the aggregate electricity consumption and the timing of the daily peak. In DELORES it is also necessary to consider parameter uncertainty and parametric variability in the transition probabilities. At present there are few data available to facilitate quantification of these uncertainties, and the choice of transition probabilities can have a variable impact dependent on where the simulation is operating within the distribution of power states i.e. the Markov chain converges to a stationary distribution of states, but the route and rate of convergence will depend on the starting conditions.
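The remark above about convergence to a stationary distribution can be illustrated for a single time period with a fixed 3 × 3 transition matrix. This is only a sketch: in DELORES the matrix changes with the time period of the day, so a stationary distribution applies within one period at most. The function names and the matrix entries are hypothetical, apart from the off-to-on probability of 0.87 quoted earlier.

```python
import numpy as np

def state_distribution_path(transition, start, n_steps):
    """Propagate a probability vector over states through n_steps hours."""
    dist = np.asarray(start, dtype=float)
    path = [dist]
    for _ in range(n_steps):
        dist = dist @ transition
        path.append(dist)
    return np.array(path)

def stationary_distribution(transition):
    """Left eigenvector of the transition matrix for eigenvalue 1."""
    vals, vecs = np.linalg.eig(transition.T)
    vec = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return vec / vec.sum()

# Hypothetical peak-period matrix; rows are prior state (off, low, on).
peak = np.array([[0.08, 0.05, 0.87],
                 [0.05, 0.25, 0.70],
                 [0.02, 0.08, 0.90]])
pi = stationary_distribution(peak)
path_from_off = state_distribution_path(peak, [1.0, 0.0, 0.0], 50)
path_from_on = state_distribution_path(peak, [0.0, 0.0, 1.0], 50)
```

Comparing the early rows of `path_from_off` and `path_from_on` shows how the starting condition affects the route to the same limiting distribution.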
In the top-down models, uncertainties in model parameters are directly derived from the data; in the Sun model, for example, peak hourly electricity consumption, $q_P$, and diversity factor, D, are characterised as normal distributions. The ARIMA model is derived from fitting the best-fit curve to the data, and the fitting process results in an estimate of the standard error, or parameter uncertainty, associated with the fitted parameters, i.e. the weighting terms, $\alpha$ and $\beta$, and the magnitude of the variance of the white noise term, $\sigma_w^2$ (see Section 3.3). The impact of this uncertainty is likely to be small provided the residual electricity consumption is of a smaller order of magnitude than the mean. For the top-down models, parametric variability is only incorporated if more than one data set is used for model training in such a way that different model parameters are derived for the different data sets.
The parameters associated with the models considered in this paper may be standardised across buildings or building-specific, measurable or requiring expert input ('subjective'), or derived directly from monitored data. A simple categorisation of the main parameters for these models in terms of their applicability, measurability and uncertainty is proposed in Table 4; in the table a scoring system has been used where 0 indicates 'No' and 1 indicates 'Yes'. In some cases parameters may be inferred from monitored data but are not directly measurable whereas conversely other parameters cannot be inferred from aggregate data but must be measured directly.
As will be demonstrated in the following sections, with the exception of the NCM model the simulation results exhibit upper and lower bounds; it must be stressed that these bounds do not encompass all uncertainties.

Model application
The models have been applied to a case study of the Ashby Laboratory, at Cambridge University Engineering Department, UK. This is a graduate student office, 916 m² in area, comprising 5 self-contained faculty offices together with a large open-plan space intended to accommodate up to 84 students. The space is sub-metered for plug loads arising primarily from the use of desktop/laptop computers and associated monitors. Drawings are available which indicate the notional floor layout; however, the actual floor layout is somewhat different in terms of desk positioning and orientation. For the purposes of this analysis the term-time electricity consumption attributable to small power demand has been analysed using data from October 2013 to December 2014.
The models have been applied under two different data availability scenarios; the first application considers the case where there are no metered consumption data available for the space, as would be typically the case in an early design study. In this situation, there are three options for specifying the small power loads: 1) use a reference approach such as NCM, 2) use a bottom-up approach such as DELORES or the Menezes model, with a notional floor plan and model parameters taken from the literature, or 3) use a top-down data-driven approach with data from other similar buildings.
The second application considers the case where monitored data are available, as would be the case in a retrofit or operational energy management study. In this situation, a data-driven model would be the natural choice, but it is also possible to 'tune' the bottom-up models using monitored data in order to achieve a better comparison between simulation and reality and thereby improve confidence in model forecasts.

Early design stage simulation
The models have been used to ascertain whether using an alternative approach would give a better estimate of the plug loads than the NCM model at an early design stage. CIBSE TM54 has not been considered explicitly here as it relates primarily to annual loads. For the bottom-up models of Menezes and DELORES, the model parameters given in the literature and detailed previously in this paper have been used in conjunction with the notional floor/desk layout to generate electricity consumption profiles. The notional floor plan indicates that there are 93 computers in the space, represented as desktop computers in DELORES, but distributed between 'high-end' and 'low-end' desktops/laptops in the Menezes model. Small power electricity consumption has been assumed to be entirely attributable to computing as the proliferation of computers subsumes all other consumption within this space.
In the absence of directly relevant monitored data, it is useful to consider the electricity consumption of similar buildings or similarly occupied spaces in order to define parameters for the data-driven models. In this instance, monitored data from the University Computing Laboratory for the period October-December 2013 are used as the training data to derive parameters for the Sun and ARIMA models. This building has three sub-metered spaces which house a mix of faculty and graduate studies offices, similar to our case study. The sub-metered data have been normalised by area to facilitate application to a different building space. For the Sun model, analysis of the sub-metered data suggests that the peak hourly demand, q_P, is best represented by a normal distribution with a mean of 8.6 Wh/m² and a standard deviation of 3.1 Wh/m². While in Sun's original model the 48 h diversity factors for 16 buildings were collated into a 16 × 48 matrix, in this instance we have borrowed strength across weeks and meters rather than buildings; the training data set consists of 8 weeks of data from 3 meters, giving rise to a total of 24 diversity factors. These have been collated into a 24 × 48 matrix and the mean and covariance of this matrix, as illustrated in Figs. 1 and 2, have then been used in conjunction with Eq. (4) for the random generation of 48 h vector profiles of diversity D. As one might expect, the variances, i.e. the terms on the diagonal of the covariance matrix, are greatest at the start and end of the working day and during the weekend afternoons. Using Eq. (3), each randomly generated diversity factor, D, has been used in conjunction with the peak hourly demand, q_P, to generate hourly electricity consumption values over an 8 week period for comparison against the monitored data; the simulation has been performed many times with a different diversity factor, D, in order to ensure that the full range of the potential response is adequately modelled (Figs. 1 and 2).
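The sampling procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: Eq. (3) is assumed here to take the form of a product of peak demand and diversity factor, the diversity profile is drawn from a multivariate normal distribution via a Cholesky factor, and the 4-hour mean vector and covariance matrix are hypothetical stand-ins for the 48-hour quantities shown in Figs. 1 and 2. Only the mean and standard deviation of q_P (8.6 and 3.1 Wh/m²) are taken from the text.

```python
import math
import random

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def sample_diversity(mean, L, rng):
    """Draw one diversity profile D from N(mean, L L^T)."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]

def sample_profile(mean, L, qp_mean, qp_sd, rng):
    """One hourly consumption profile, assumed to be q_P multiplied by D.
    Note: a normal distribution can return negative draws; none are truncated here."""
    q_p = rng.gauss(qp_mean, qp_sd)
    return [q_p * d for d in sample_diversity(mean, L, rng)]

# Hypothetical 4-hour example; the paper uses a 48-hour mean and covariance.
mean_D = [0.4, 0.7, 1.0, 0.6]
cov_D = [
    [0.010, 0.004, 0.002, 0.001],
    [0.004, 0.020, 0.006, 0.002],
    [0.002, 0.006, 0.015, 0.005],
    [0.001, 0.002, 0.005, 0.012],
]
L = cholesky(cov_D)
rng = random.Random(1)

# Repeat the draw many times so that the range of the response is explored.
profiles = [sample_profile(mean_D, L, 8.6, 3.1, rng) for _ in range(1000)]
```

Quantities such as the mean profile or the interquartile range at each hour can then be extracted from the collection of sampled profiles.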
Derivation of an ARIMA model requires first that the monitored electricity consumption be separated into the mean (base level) and the residual values (variability about the base level). This has been performed for each of the three meters that comprise the training data set. We then analyse the correlations in the residual values between the values in each hour and the preceding hours. Fig. 3(a) and (b) illustrate the auto-correlation (ACF) and partial auto-correlation (PACF) functions respectively for time differences, or 'lags', of up to 48 h, in the residual values derived from one of the meters. Fig. 3(a) shows both a high degree of correlation between consecutive time steps (lag = 1, ACF of around 0.8), and between consecutive days at the same hour (lag = 24, ACF of around 0.5). Fig. 3(b) shows the correlation between values at different lags with the correlation due to smaller lags removed i.e. while Fig. 3(a) indicates that the correlation between hour t and hour t-3, or a lag equal to 3, is approximately 0.6, Fig. 3(b) shows that the majority of the correlation is accounted for by the correlation between values separated by 1 h (lag = 1, PACF = 0.8) and the correlation between values separated by 2 h (lag = 2, PACF = 0.15), while for lag = 3 the PACF value is just 0.07. The tools provided in R statistical software [26] have been used to identify the order of the ARIMA model that best fits the residuals data. Using the terminology introduced in Section 3.3, the order of the model that best fits the data for all three meters is an ARIMA(2,0,2)(0,1,1)[24] model i.e. the non-seasonal model consists of an autoregressive component of order p = 2 combined with a moving average component of order q = 2, and the seasonal model comprises a single order seasonal differencing (pS = 1) with a period of 24 h (T = 24) and a moving average component of order 1 (qS = 1).  
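For readers unfamiliar with the auto-correlation function, a minimal sketch of the sample ACF is given below. The authors used the tools provided in R; this standalone function is only an illustration of the quantity plotted in Fig. 3(a), and the sinusoidal "residuals" are synthetic data with a 24-hour cycle, not the monitored data.

```python
import math

def acf(x, max_lag):
    """Sample autocorrelation of the series x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares, used for normalisation
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

# Synthetic hourly residuals with a daily cycle over 8 weeks (illustrative only).
residuals = [math.sin(2 * math.pi * t / 24) for t in range(24 * 56)]
rho = acf(residuals, 48)

# A strong daily cycle shows up as a high positive value at lag 24
# and a strongly negative value at lag 12.
print(round(rho[1], 3), round(rho[12], 3), round(rho[24], 3))
```

A pronounced value at lag 24 in such a plot is the feature that motivates a seasonal component with a period of 24 h.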
An ARIMA model has been fitted separately for data sets from each of the three metered spaces in the Computing Laboratory, and used to forecast the hourly electricity consumption residuals. The model parameters are listed in Table 5. See Section 3.3 and Eq. (5) for explanation of these parameters.
Model outputs have been compared against the monitored electricity consumption data of the Ashby Laboratory from October-December 2014. Fig. 5 illustrates the mean 48-h profile of electricity consumption output by the models compared against the monitored data, where the first 24 h represent a mean weekday and hours 25-48 represent a mean weekend day. Fig. 6 shows the predicted interquartile range, or 'spread', of the results over the 48 h. As discussed in Section 3.4, this value gives an indication of some of the uncertainties in the model predictions. However, it is important to note that not all uncertainties are included in all models, hence the spread of the results does not equate to the total uncertainty.
• The NCM profile has a constant base load, constant weekday daytime consumption and no increase over the base load at the weekend (Fig. 5). In this case study, this model over-predicts the electricity consumption (and associated heat gains) substantially during the week. This may be a potential problem as it can lead to over-sizing of cooling equipment. The model also underestimates weekend electricity consumption in this case, illustrating the care that must be taken when choosing the appropriate benchmark consumption value; the comparison suggests that a graduate studies office may be used more at the weekends than the assumed commercial office benchmark.
• The two bottom-up models give quite different results. For this case study, the parameters taken from the literature for DELORES (Table 3) lead to an over-prediction of the mean hourly electricity consumption whereas the Menezes model, with parameters taken from the literature (Table 2), under-predicts the mean hourly electricity consumption (Fig. 5). This is likely due in part to differences in the mean power demand values for the devices, as the Menezes model assumes a high proportion of low power laptops. It could also be an indicator that the transition probabilities to the 'On' state assumed in DELORES should be reduced for this case.
• Looking at the results from the two top-down data-driven models, a comparison of the predicted results and monitored data highlights the issues associated with using data from another similar building to train the model; the base load and daily consumption of the Ashby Laboratory predicted by these models do not match the data well (Fig. 5).
• The interquartile range or 'spread' of the results is significant and does appear to have a periodic variation, being at a minimum during the nighttime period (Fig. 6).
• The +/−10% variation in usage diversity factor used in the Menezes model results in low values for interquartile range (Fig. 6) and if this is interpreted as the uncertainty in the forecast it could potentially engender a false level of confidence in the results, particularly when the consumption value is low.
• The stochastic variability of the interquartile range is visible in the results of DELORES and the data-driven models (Fig. 6); the drop in interquartile range at night is less pronounced in the DELORES results than in the monitored data, which suggests that the transition probabilities need adjusting.
• A higher variability in interquartile range is observed in the two data-driven models than in the monitored data, particularly at the weekend (Fig. 6). This is in part because the monitored data are for an 8-week period only, whereas the stochastic models have been run for a significantly larger number of days in order to ensure the full range of response has been extracted. In addition, the Computing Laboratory data encompass three meters and hence the variability incorporates the difference between the different metered zones.
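The two summary quantities compared in the observations above, the mean 48-h profile and the interquartile range at each hour, can be computed as in the following sketch. The input layout (a list of 48-value profiles, weekday hours first) and the choice of the "inclusive" quartile method are assumptions made for illustration; the paper does not state how its quartiles were calculated.

```python
import statistics

def profile_stats(profiles):
    """profiles: list of 48-value lists (24 weekday hours, then 24 weekend hours).
    Returns (mean profile, interquartile range per hour)."""
    means, iqrs = [], []
    for hour in range(48):
        values = [p[hour] for p in profiles]
        # Quartile cut points; 'inclusive' treats the data as the full population.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        means.append(statistics.fmean(values))
        iqrs.append(q3 - q1)
    return means, iqrs

# Hypothetical data: five flat profiles at 1, 2, 3, 4 and 5 units.
example = [[float(k)] * 48 for k in range(1, 6)]
mean_profile, iqr_profile = profile_stats(example)
```

Applying the same function to both the simulated and the monitored profiles ensures that the two spreads are at least calculated on a like-for-like basis, although, as noted above, they do not capture the same sources of uncertainty.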
Considering the two data-driven models separately, the interquartile range is greater for the ARIMA model than the Sun model, reflecting the different ways in which monitored data from the three meters have been assimilated. In the Sun model, the mean and covariance of the diversity factor have been calculated across the entire data set. For the ARIMA model, the time series cannot be compiled into a single data set, hence three separate models are developed and the results extracted from each model before compilation into the results presented here.

Operational simulation
If operational electricity consumption data are available for a building, a data-driven model may be the natural choice. However, it may also be possible to 'tune' the bottom-up Menezes and DELORES models to improve the simulation outcome depending on the quantity of interest; here the electricity consumption data for October-December 2013 were used as a basis for tuning the models, with the process comprising the following steps:
• First the base load was quantified and the type/state of devices were apportioned to match the base load.
• Next the mean daily peak load was quantified and the type/state of devices were apportioned to match the peak.
• Finally the notional schedules were adjusted to match the observed mean time schedule.
This is a straightforward process for the deterministic Menezes model. For the stochastic DELORES model, tuning requires adjustment of the transition probabilities yet it is possible only to infer net transition probabilities from the monitored data. The nature of the Markov Chain approach used in DELORES means that the distribution of operational states converges to a stationary distribution over time, dependent on the transition probabilities and the starting distribution. So for a given base load it is necessary to ensure that the model converges at night to a distribution that matches that base load. Tuning the peak necessitates ensuring convergence to the right distribution at the right time of day, whereas the 'off-peak' period corresponds to a transition from the state distribution at peak electricity consumption to a satisfactory starting distribution for the night period. In this study, only a single type of device has been assumed and hence the process is simplified; tuning transition probabilities for multiple device types could become increasingly unmanageable as the number of device types increases. The 'tuned' parameters used for the Menezes and DELORES models are given  in Tables 6 and 7. The device power demand values are the same as used in the blind simulation for each model. The Sun model has been generated in a similar manner to the early design stage simulation, but in this instance, we use metered data from the Ashby Laboratory for 8 weeks in the period October-December 2013. Rather than borrowing strength from data across different meters, in this case a single meter has been used and the mean and covariance calculated using data from 8 different weeks, i.e. a data set consisting of 8 × 48 hourly diversity factors has been used to derive a mean diversity factor and covariance matrix as shown in Figs. 7 and 8. 
The peak hourly demand, q_P, is represented by a normal distribution with a mean of 6.5 Wh/m² and a standard deviation of 0.3 Wh/m². Compared against Figs. 1 and 2, this data set has a lower base load and higher daily range both in the week and at the weekend. The variance is much lower than before, and although a similar pattern is visible in the covariance matrix it is much less marked.
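Returning to the tuning of DELORES described earlier in this section, the requirement that the chain converges at night to a distribution matching the base load can be made concrete with a deliberately simplified sketch. It assumes a two-state (Off/On) chain, for which the stationary fraction of devices in the On state is a / (a + b), where a is the Off-to-On and b the On-to-Off transition probability. DELORES itself is not limited to two states, and the device powers, base load and value of b below are hypothetical; only the count of 93 devices is taken from the case study.

```python
def required_on_fraction(base_load_w, n_devices, p_on_w, p_off_w):
    """Fraction of devices that must be On for the aggregate to equal the base load."""
    f = (base_load_w / n_devices - p_off_w) / (p_on_w - p_off_w)
    if not 0.0 <= f < 1.0:
        raise ValueError("base load cannot be matched with these device powers")
    return f

def off_to_on_probability(on_fraction, on_to_off):
    """Solve a / (a + b) = on_fraction for a, given b = on_to_off."""
    return on_fraction * on_to_off / (1.0 - on_fraction)

def stationary_on_fraction(off_to_on, on_to_off):
    """Stationary probability of the On state in a two-state Markov chain."""
    return off_to_on / (off_to_on + on_to_off)

# Hypothetical night-time values: 1500 W base load, 60 W On, 2 W Off, b = 0.1.
N_DEVICES = 93
target = required_on_fraction(1500.0, N_DEVICES, 60.0, 2.0)
a_night = off_to_on_probability(target, 0.1)
implied_base_load = N_DEVICES * (target * 60.0 + (1.0 - target) * 2.0)
```

The sketch also shows why only net transition probabilities can be inferred from aggregate data: any pair (a, b) with the same ratio gives the same stationary base load, and the pairs differ only in how quickly the chain converges.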
An ARIMA model of the residuals has been fitted using the same monitored Ashby Laboratory electricity consumption data for the period October-December 2013. Consideration of the residuals shows, perhaps surprisingly, a much lower degree of seasonality than observed in the Computing Laboratory data (Fig. 9(a)). The ARIMA model that fits these data best is an ARIMA(1,0,1) model with p_1 = 0.7678 and q_1 = −0.3327, and it was not possible to fit a seasonal model that satisfied the Ljung-Box test. The area-normalised ω² is estimated as 1.354 × 10⁻⁷, which is comparable with the mean value from our previous ARIMA model shown in Table 5. Fig. 10 illustrates the ACF and PACF for the residuals of the fitted model; again, the Ljung-Box test cannot reject the null hypothesis that the model is adequate at a 0.05 level.

Table fragment (caption and column headings not recoverable):
High-end desktop  30  20  35  15  30
Low-end desktop   20  70  10   0  20
Laptop            50  30  40   0  30
19" screen        70  50  30   0  20
21" screen        30  50  30   0  20
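The fitted non-seasonal model can be used to generate synthetic residual series, as in the sketch below. Two assumptions are made and should be checked against Eq. (5): that p_1 and q_1 are the autoregressive and moving average coefficients respectively, and that the sign convention is that of R's arima function, x_t = φ x_(t−1) + e_t + θ e_(t−1). The white noise standard deviation is taken as the square root of the quoted area-normalised variance.

```python
import random

def simulate_arma11(phi, theta, sigma, n, rng, burn_in=200):
    """Simulate x_t = phi * x_{t-1} + e_t + theta * e_{t-1}, with e_t ~ N(0, sigma^2).
    The first `burn_in` values are discarded to reduce start-up effects."""
    x_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, sigma)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn_in:
            out.append(x)
        x_prev, e_prev = x, e
    return out

def arma11_lag1_acf(phi, theta):
    """Theoretical lag-1 autocorrelation of a stationary ARMA(1,1) process."""
    return (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)

PHI, THETA = 0.7678, -0.3327   # coefficients quoted in the text (interpretation assumed)
SIGMA = (1.354e-7) ** 0.5      # square root of the area-normalised variance

# Eight weeks of hourly residuals, to be added to the mean (base level) profile.
series = simulate_arma11(PHI, THETA, SIGMA, 24 * 56, random.Random(0))
rho1 = arma11_lag1_acf(PHI, THETA)
```

Because the model has no seasonal term, the variance of the simulated residuals is the same at every hour of the day, which is consistent with the absence of a night-time reduction in the interquartile range of the ARIMA results.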
The results of the tuned models are compared against the monitored data for October-December 2014 in Figs. 11 and 12. As shown in Fig. 11, there is a much closer agreement with the monitored mean electricity consumption data, suggesting that the training data used for tuning the models are reasonably representative of the mean electricity consumption from October-December 2014.
The monitored data demonstrate a higher degree of variability in the interquartile range than the model predictions (Fig. 12). As in the early design stage simulation, the +/− 10% variation in usage diversity factor incorporated in the Menezes model appears to give too low an interquartile range when compared against monitored data.
The DELORES and Sun models demonstrate a periodic behaviour that is comparable with the monitored data. It is not surprising that the ARIMA model shows no seasonal variation in the interquartile range, as there is no seasonal component to the fitted model. What is surprising is that the Sun and ARIMA models, both being based on the same data, do not show the same degree of seasonal variation in the interquartile range i.e. while the Sun model exhibits lower variability overnight (Fig. 12, 22-30 h), the ARIMA model exhibits no such reduction. The difference lies in the way in which the data are processed; the Sun model requires the pre-processing of the training data into a mean 48-h diversity profile for each week, followed by calculation of the mean and covariance matrix of those 8 weekly profiles. By comparison, the ARIMA model uses a single time series over the 8 weeks of the training data. The different results for these two models suggest there is a greater correlation between the mean weekly profiles, i.e. in the Sun model, than there is from one hour to the next on a daily basis i.e. in the ARIMA model. Whether or not this is significant depends very much on the key parameters of interest as discussed in the following section.

Key performance indicators
While the mean electricity consumption profile and interquartile range are interesting indicators of the comparability between the simulated and monitored electricity consumption, it is comparison of the predicted Key Performance Indicators (KPIs) that provides a more useful insight. The KPIs considered here are: (a) the daily peak, (b) the timing of the daily peak, (c) the daily total, and (d) the weekly total electricity consumption values. The peak hourly electricity consumption is compared against the monitored data for the early design stage and operational simulations in Figs. 13 and 14 for weekdays and weekend days. The early design stage simulations echo Fig. 5, illustrating the under-prediction of the Menezes model and the over-prediction of all other models for this case. The operational simulation results show much better agreement for all models.
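The four KPIs listed above are straightforward to extract from an hourly series, as the following sketch shows. The function names and the input layout (24 hourly values per day, 7 days per week) are assumptions made for illustration.

```python
def daily_kpis(hourly):
    """hourly: 24 hourly consumption values for one day.
    Returns (peak value, hour at which the peak first occurs, daily total)."""
    peak = max(hourly)
    return peak, hourly.index(peak), sum(hourly)

def weekly_total(week):
    """week: list of 7 daily lists, each holding 24 hourly values."""
    return sum(sum(day) for day in week)

# Hypothetical day: flat consumption of 1 unit with a single peak of 5 at 14:00.
day = [1.0] * 24
day[14] = 5.0
peak, peak_hour, total = daily_kpis(day)
week_sum = weekly_total([day] * 7)
```

Note that a deterministic model returns the same peak hour for every simulated day, so the distribution of the peak timing can only be obtained from a model that produces day-to-day variation.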
The models that best simulate the timing of the peak are the two top-down data-driven models and DELORES. Fig. 15 shows the probability distribution of the timing of the peak hourly electricity consumption compared against monitored data. In the early design stage simulations the models predict that the peak will occur later in the day for a weekday than observed, reflecting the mean electricity consumption profile illustrated in Fig. 5; also a much less defined peak is predicted at the weekend than observed. Once tuned, the model predictions are closer to the monitored data as expected and all demonstrate a good indication of the timing of the peak, particularly on a weekday. The total daily weekday and weekend, and total weekly electricity consumption results are illustrated in Figs. 16-18, showing that the tuned or the operational model predictions match the measurements better. The Sun and ARIMA early design stage simulations predict a much wider spread in the data than observed over the 8 weeks of monitored data, consistent with the use of three different sets of metered data for the model development. For the operational simulation the ARIMA model predicts a lower spread than the Sun model, increasingly evident as the level of aggregation increases.

Discussion
The purpose of the specific analysis described in this paper was to investigate which, if any, model offers the best approach for simulating plug loads in order to generate power demand profiles for input into a dynamic simulation model. To this end, it is necessary to explore the conditions that might lead to preference of one model over another; the models considered all differ in their approach and all have potentially useful features depending on the context of the simulation. The simplest model used here, namely the NCM model, over-predicts weekday and under-predicts weekend demand for our case study, reflecting the problems associated with applying a standard electricity consumption profile to a specific building.
The early design stage simulations compared both the bottom-up and top-down approaches, with the latter using monitored electricity consumption data from a building thought by the authors to be occupied by similar users, and hence to offer a similar usage profile. The results illustrate the difficulty in predicting electricity consumption accurately when the data set does not belong to the exact same building. One could argue that increasing the volume of training data may overcome this difficulty; however, it may also increase the interquartile range of the results beyond what is reasonable. While in this study all the models produce results closer to the monitored data than the NCM approach, all have benefits and disadvantages:
• The Menezes model is simple to apply and in this case produces a reasonable representation of the mean profile. However, the level of variability assumed is too low, and the model is incapable of predicting the timing of the daily peak.
• The DELORES model results are directly dependent on the transition probabilities; in this early design stage simulation the values used have been taken from the literature and have not been subject to extensive validation for a wide range of cases. Further investigation into the possible range of applicability of these parameters would be useful.
• The Sun and ARIMA models have been based on data from a supposedly similar building, and serve to demonstrate the impact of the data on the simulation results. It is clear that while there are similarities, there are also fundamental differences between the Computing Laboratory and the Ashby Laboratory; the ratio of the peak load to the base load is lower and the base load is higher in the Computing Laboratory. These differences translate to an increase in all of the predicted KPIs.
• The use of three different meters as a basis for the two data-driven models is parallel to Sun's use of different buildings, and in this instance it increases the predicted range of the results substantially. The applicability of these approaches is only as good as the comparability between the data used and the real electricity consumption.
The operational simulation used monitored data from the case study building. The use of this monitored data in the top-down data driven models or to 'tune' the bottom-up models should improve the agreement between prediction and monitored data provided the training data is representative. In this study all of the models give a good estimate of mean consumption, suggesting that the mean of the training data is representative and that the models have been tuned adequately. The tuning process can be particularly difficult as the number and type of devices increases. In particular, the transition probabilities which characterise DELORES are difficult to infer with certainty from monitored data; uncertainty analysis of MCMC methods has been studied in some depth in the medical field [29,30], and a suggested approach is to identify the possibility space for the transition probabilities from monitored data and to perform sensitivity studies which explore this space; this would be a necessary component of uncertainty analysis using the DELORES model. The peak daily electricity consumption can only be simulated using a model that embraces the stochasticity of the demand, as other models predict a uniform daily maximum. This is significant if prediction of the timing of the peak is of interest; DELORES and the Sun and ARIMA models give a good estimate of the time of day at which the peak occurs. All of these observations help to inform the user as to which parameterisation is the most appropriate; the answer to this question is, however, context-specific. If no operational data are available then a bottom-up approach, such as DELORES or that proposed by Menezes, offers a better approach than the NCM model, as it encourages the modeller to gain a better understanding of the space use and the potential system dynamics. 
If only aggregate consumption is required then a deterministic model may be used but it is important to ensure that the range of possible results is fully understood; the +/−10% used here in the Menezes model seems low when compared against monitored data. A bottom-up stochastic model such as DELORES is necessary to predict time-dependent KPIs, and may provide greater insight into the potential variability of the results. Where monitored data are available it seems most appropriate to use those data to inform the analysis, either by tuning the bottom-up models or by using a top-down data-driven model. This study has highlighted the difficulty in tuning bottom-up models as the number of components increases and has illustrated some of the potential pitfalls when using data which are not sufficiently representative of the scenario to be modelled. It is important to understand how uncertainty is represented in the models and to explore the adequacy of that representation. With the exception of the NCM model all of the models give some indication of upper and lower bound electricity consumption. These bounds are not necessarily comparable; the Menezes model solely incorporates variation on the usage diversity factor, the DELORES bounds are generated from the stochasticity of the device state, while the Sun and ARIMA models use previous data to generate a possibility space from within which random electricity consumption profiles are drawn, enabling upper and lower bound results to be extracted. At present in the bottom-up models there is no mechanism for incorporating the difference in parameter uncertainty between the early design and the operational stages directly into the simulations, yet evidence suggests that results at the early design stage are likely to be further from reality than the results of a simulation performed using operational data to tune the models.
For the top-down models, uncertainty at the early design stage arises from the variability of the data used to derive the model parameters and care must be taken to ensure that the data set used is sufficiently representative; if the data set is too broad then the bounds of uncertainty may be too widely spread to be useful, yet if it is too narrow the KPIs may be significantly under- or over-estimated. For a model to be comprehensive in its treatment of uncertainty, it would need to include uncertainty in both measurable parameters, such as device power demand, and the more subjective operational parameters such as usage profile and time schedule. Uncertainty in the operational parameters is hard to define but may be as significant as measurable uncertainty depending on the KPIs; it may be best defined via a process of expert elicitation combined with inference from monitored data.
It is clear from the studies performed here that the most important features of plug load parameterisation for a model to be used for forecasting future demand are threefold: 1. The ability to predict the key parameters of interest, 2. The ability to assimilate data, and 3. Flexibility.
Not all models are capable of predicting all parameters e.g. the Menezes model as it stands cannot predict the timing of the peak daily power demand. Any model selected must be capable of predicting both the parameters of interest and the uncertainty around these parameters. Assimilation of data, as we have seen, is key to predicting energy consumption in line with reality. The final feature, flexibility, relates to a model's ability to simulate change in building operation. The flexibility of a top down data-driven model to change in operation of all or part of a building is low; the Sun and ARIMA models are limited to prediction based on past history, and offer no mechanism for disaggregation to component spaces. By comparison, tuning of a bottom-up model offers the facility to simulate change in use and hence offers greater flexibility, provided it is possible to quantify the impact of a change on the model parameters e.g. DELORES is able to encompass operational change provided there is sufficient understanding of the impact of that change on the transition probabilities. In the authors' opinion, what is needed is a bottom-up stochastic model that may be tuned using monitored electricity consumption data with the minimum of effort.

Conclusions
Recent approaches for the parameterisation of plug loads suitable for input into a dynamic simulation model have been assessed as regards their applicability to the prediction of electricity consumption. It has been found that by using monitored electricity consumption data it is possible to use any of the approaches to create a tuned model capable of predicting future power demand to a reasonable level of accuracy, provided the tuning of the model is appropriate and robust and that the building is not subject to change in operation. It is less clear which model is appropriate for simulation when no directly relevant monitored data are available; the difficulty of making predictions under this situation with any degree of confidence has been demonstrated.
If the desire is to simulate the impact of changes to the operation of an existing building, then the applicability of the models has to be reviewed. Conclusions can be drawn as regards the requirements for a model that will best suit the purposes in this instance. First, the applicability of a model is dependent on the key parameters of interest; an aggregated approach may be sufficient for prediction of annual electricity consumption, but to identify the associated uncertainty, or to predict the variation in timing of the daily peak demand, some measure of the stochasticity is required. Second, it must be possible to use monitored data to characterise the model. Finally, in order to simulate change in operation of all or part of a building, the model must be sufficiently flexible; either a bottom-up model is required, or a means of disaggregating top-down data needs to be developed, but it must be possible to quantify the impact of the changes on the disaggregated data. Of all of the models considered here, DELORES is the best suited of the bottom-up models but it is difficult to calibrate using aggregated data. Of the top-down models, the Sun model is more straightforward to use and the added complexity of an ARIMA model does not appear to offer significant benefits. However, it is not clear how the parameterisation by the mean and covariance matrix of the diversity factor lends itself to disaggregation and thereby to simulation of operational change.