Online energy simulator for building fault detection and diagnostics using dynamic energy performance model

Faults in building systems affect energy efficiency and occupant comfort. Simulating building behavior and comparing it with measured data makes it possible to detect discrepancies caused by faults. We propose a methodology that recursively compares actual data with dynamic energy simulations at different layers of aggregation, reducing the scope of the search for faults. To support it we developed the Online Energy Simulator, a tool that sets up automated simulations using standard interfaces usable with different building systems and simulation engines. We test our simulator on a real building at the University of Southern Denmark, showing how continuous monitoring makes it possible to quickly detect and identify building faults.


INTRODUCTION
Buildings are responsible for a large portion of energy consumption. In the USA they accounted for 41% of primary energy consumption in 2010, which is more than either the transportation or the industrial sector. Building energy consumption is also rapidly increasing over time, more than doubling from 1290 TWh in 1980 to 2784 TWh in 2010 [1]. In the European Union buildings account for 40% of the total energy used and 36% of the total CO2 emissions [2]. Thus, a focus on buildings is fundamental to achieving energy efficiency and environmental objectives, such as the European goal of saving 20% of primary energy consumption by 2020 compared to projections [3], and 30% by 2030 [4].
Modern buildings have complex control systems that monitor the current status and manage heating, cooling, ventilation and lighting. Each of these subsystems is itself increasingly complex and can therefore suffer from faults and malfunctions. Faults can impact occupant comfort, e.g. a broken radiator results in a cold room, but can also increase energy consumption. It is estimated that in 2009 the most common faults in US commercial buildings were responsible for over $3.3 billion in energy waste [5].
Without continuous monitoring of the building, faults can occur and go undetected for a long time. Moreover, many fault detection methods rely on detecting changes from previous behavior and are therefore ineffective in detecting faults that have been present since the construction of the building. 'Energy models' of buildings can be developed and used to verify that the actual energy consumption follows the design goals by simulating the building's behavior. 'Static' energy models are simpler and require little computational power, but assume steady-state conditions and require simplifications. 'Dynamic' energy models are instead more complex both to develop and to simulate, but can accurately capture interactions between components and changes over time.
In this article we propose a methodology for fault detection and diagnostics (FDD) in buildings that runs energy model simulations and compares their results with the real building at different aggregation layers. We present a software solution to automate simulations without relying on any manual procedure. Our tool uses industry standard interfaces to support different simulation engines and automatic data retrieval from the building. We then report the application and testing of our method and tool on a real case study building.
Our tool was developed under the COORDICY Project, a strategic DK-US interdisciplinary research project for advancing ICT-driven research and innovation in energy efficiency of public and commercial buildings [6]. We use our tool to monitor the daily energy usage of our case study building at several aggregation layers, such as whole building, by subsystem or by floor.
The rest of the article is organized as follows. The state of the art is reviewed in Section 2. The FDD methodology is introduced in Section 3 and the Online Energy Simulator in Section 4. Section 5 presents the case study and discusses results and implications. Finally, conclusions are drawn in Section 6.
2 STATE-OF-ART

FDD in buildings
Kim et al. present a comprehensive review of FDD for building systems in recent years. FDD studies are classified using two different schemes: based on building equipment/size, such as large/small buildings, Heating Ventilation and Air-Conditioning (HVAC) systems, lighting, water heaters and ventilation units, and based on method. FDD methods can be divided into history-based methods and qualitative or quantitative model-based methods [7].
History-based methods rely on the availability of historical data for a building. Such data is used to create black-box or gray-box models of the system under analysis, often using machine learning techniques such as artificial neural networks. Faults change the system's behavior so that it no longer matches the model's predictions. History-based models can be used when little or no information about the physical system under test is available, and can in general represent complex interactions. However, they require good quality fault-free training data and can only make accurate predictions within the range of that data. Moreover, they are specific to the system used for training and cannot easily be used on other ones.
Qualitative model-based methods rely on a priori knowledge of the system under investigation. Such knowledge, provided by documentation or expert knowledge, is used to create rule-based or qualitative physical systems. Qualitative model-based methods are simple to implement and can usually be validated by field experts. They are also usually robust to numerical uncertainty in input data. However, they often result in rigid models that cannot be applied to different systems or easily extended.
Quantitative model-based methods rely on explicit mathematical models of the system under investigation. Such models, which accurately represent the system's physical function, are used to simulate the system's expected behavior, which can be compared with the actual one. Quantitative model-based methods provide the most accurate results, and are usually able to simulate transients in dynamic systems, and even faulty behavior. However, such models are often complex, difficult to develop and computationally heavy. They also require validation and parameter estimation with experimental data before their results can be trusted, and cannot easily be used with different systems.
Methods from each category have different trade-offs and are suitable for different kinds of systems. Hybrid approaches that make use of multiple methods are also common, in order to exploit advantages and reduce disadvantages of individual methods. Using multiple methods also increases robustness and reliability.

Building simulation
Many simulation engines are available for simulating building energy performance, some explicitly oriented to this field, such as EnergyPlus [8], some more generic, such as Modelica [9].
Clarke et al. describe the overall topic of building performance simulation, its aims and achievements both at present and in the future. The authors analyze the current state of the art for building performance simulation tools with respect to different aspects, such as subsystem modeling, control, occupant representation, computation time and economic considerations [10].
Costa et al. discuss the advantages brought by monitoring buildings and comparing with energy performance simulations. The authors describe some of the available visualization techniques to display information obtained from building monitoring in a way that facilitates FDD. They also describe how results from monitoring can be used to improve model calibration and operations optimization [11].
Maile et al. propose a new methodology to compare results from simulations using energy models to actual measured data. They consider the importance of multiple hierarchies, such as by component and by location, which can be used to better evaluate the results. An assessor should gather measurement and simulation assumptions, perform the simulation and collect data, and finally compare the results. All differences between simulated and measured data must be categorized as measurement problems, simulation problems or operational problems. Not all differences are actually performance problems; some may be due to measurement or simulation assumptions. Models should be iteratively adapted to reflect the actual building [12].
Wetter proposes a framework to connect several simulation engines together using Ptolemy II modeling environment as middleware to manage communication. The author defines an interface for communication between the engines and implements it for several engines such as EnergyPlus, Modelica, Matlab and Simulink. The author tests his framework by performing a co-simulation between EnergyPlus and Modelica, exploiting the advantages of each engine in a particular domain [13].
Pang et al. present a framework for real-time simulation synchronized with the actual building using the simulation engine EnergyPlus. The simulation is managed using Ptolemy II actors and a BACnet interface is used to exchange data with the Building Management System (BMS). The authors test their methodology on a real test bed and observe large differences between measured and simulated total energy consumption. However, disaggregated plots reveal the causes. The difference in cooling energy consumption shows peaks similar to those of the difference in total energy consumption, and they are caused by a mismatch in chilling strategies between the model and the actual equipment. A similar effect was noted for lights left on overnight [14].
This framework supports only a few selected simulation engines and only BMSs that publish data over a BACnet interface. In order to overcome these restrictions Pang et al. revise their work and re-implement their framework using the Functional Mock-up Interface (FMI), a standard interface supported by many simulation engines. They also use the Simple Measurement and Actuation Profile (sMAP), an open protocol for data publication, to exchange data [15].
Sharmin et al. present a methodology for sensor-based monitoring of buildings and apply it to two residential buildings and run data analysis on the results. The authors show how monitoring reveals non-obvious information and insights about energy consumption, e.g. heating loss was higher for units on middle floors, which suggests the need for better insulation. The authors also observe that users react by improving their energy usage when introducing feedback from monitoring, but only short term. Automated control is necessary to achieve long-term results [16].
With most engines, users must perform repetitive, time-consuming and error-prone operations to set up and run a simulation. First they have to fetch the input data, optionally preprocess it, and convert it to the expected format (e.g. many engines expect data at fixed intervals corresponding to the simulation step). Then the model must be modified to point to the correct input data files. Then the user must manually start the simulation. Finally, the user can access the results, usually from a CSV file.
Often simulation results are interesting for multiple users. Either such users must each independently go through all the mentioned steps, or one user shares the results in 'unstructured' ways, such as sending files by email. The former option multiplies the necessary time (and the potential for errors), while the latter presents other problems, such as misunderstandings with respect to successive versions of results and possibly authorization issues.
Finally, models in quantitative model-based methods are complex and strictly tied to the equipment under test, and are therefore difficult to generalize and apply to different equipment. Different simulation engines are optimized for certain systems, and users need to learn the details of each of them. Thus, a solution able to automate simulations from different engines in a transparent way and make real-time results easily available online to multiple users is valuable.

METHODOLOGY FOR FDD IN BUILDINGS
Faults in buildings impact either occupant comfort or energy consumption. We use a dynamic energy performance model to simulate the building's behavior and compute the expected energy consumption. Thus, any deviation of the actual energy consumption data compared to the simulated results will highlight faults and anomalies to be investigated. Buildings record energy consumption at different layers. There is a main meter for electricity that measures the entire building consumption and sub-meters for every system, such as HVAC and lighting. Some buildings also have individual sub-meters for floors, other zones or other components. Separate energy distribution trees can be available for hot water and district heating systems, depending on the building. Figure 1 shows an example of an electrical energy distribution tree for a building. Sub-meters make it possible to split the aggregate data from the main meter and to understand in more detail how different systems use energy in the building. Building energy models are able to provide results at different granularities; it is therefore possible to compare actual and simulated values for sub-meters.
In this study we develop and implement a top-down approach for FDD as shown in Figure 2: when a deviation between actual and simulated values is detected at the main meter, the sub-meters in the next layer are compared to understand which system is affected by the fault. This recursive investigation continues until reaching the leaves of the energy distribution tree. At this point the smallest unit or zone where the fault is located has been identified. Once the scope has been reduced, it is possible to use a more focused FDD method to completely isolate the fault.
Let's assume, for instance, that we have detected a higher consumption of the building with respect to the district heating distribution tree. Hot water coming from the district heating pipes is used to heat up air in the ventilation units and water in radiators. In our next step the simulated and actual values for the respective sub-meters are compared. If the radiators are found responsible for the deviation, the ventilation units are then excluded from the investigation and labeled as not faulty. Depending on the granularity of sub-meters, we could go deeper in the distribution tree and isolate the exact areas responsible for higher energy consumption, and from there perform specific FDD for radiators.
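The recursive investigation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Meter` class, the meter names, the energy values and the 10% relative tolerance are assumptions made for the example and are not taken from the case study.

```python
from dataclasses import dataclass, field


@dataclass
class Meter:
    """A node of the energy distribution tree (hypothetical structure)."""
    name: str
    measured_kwh: float
    simulated_kwh: float
    children: list["Meter"] = field(default_factory=list)


def deviates(meter: Meter, tolerance: float = 0.10) -> bool:
    """True when measured and simulated energy differ by more than `tolerance` (relative)."""
    if meter.simulated_kwh == 0:
        return meter.measured_kwh != 0
    gap = abs(meter.measured_kwh - meter.simulated_kwh)
    return gap / abs(meter.simulated_kwh) > tolerance


def locate_faults(meter: Meter, tolerance: float = 0.10) -> list[str]:
    """Return the names of the deepest deviating meters at or below `meter`."""
    if not deviates(meter, tolerance):
        return []  # this branch matches the simulation and is ruled out
    deviating_children = [c for c in meter.children if deviates(c, tolerance)]
    if not deviating_children:
        return [meter.name]  # a leaf, or no sub-meter explains the deviation
    found: list[str] = []
    for child in deviating_children:
        found.extend(locate_faults(child, tolerance))
    return found


# Hypothetical district heating tree mirroring the example in the text.
district_heating = Meter("district heating", 120.0, 100.0, [
    Meter("ventilation units", 40.0, 41.0),
    Meter("radiators", 80.0, 59.0, [
        Meter("radiators, floor 1", 30.0, 29.0),
        Meter("radiators, floor 2", 50.0, 30.0),
    ]),
])

faulty = locate_faults(district_heating)
```

In this example the ventilation units are excluded at the second layer and the search ends at the radiators of one floor, which is where a more focused FDD method would take over.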

ONLINE ENERGY SIMULATOR
The Online Energy Simulator is a tool that
• fetches required data for the simulation (e.g. weather conditions or occupancy count) from time series on the data storage;
• maps such time series to a model's input variables;
• runs the simulation for a specified number of steps/period of time;
• collects results from model's output variables; and
• posts results to the data storage.
All these operations are automated and the Online Energy Simulator can be run without any manual intervention. The high-level architecture is shown in Figure 3.
The Online Energy Simulator uses the Simple Measurement and Actuation Profile (sMAP) for accessing building data, a protocol common for building systems [17]. The protocol supports reading and writing time series. It also supports time series metadata in form of key-value pairs. Metadata can be used to query the data storage for the correct time series. The protocol is independent of the underlying storage system. In order to add support for sMAP to a system it is enough to develop a 'driver', i.e. an application that forwards data from such system over sMAP.
In order to support different simulation engines, the Online Energy Simulator uses the Functional Mock-up Interface (FMI). FMI is an interface to perform model exchange and co-simulation of dynamic models [18]. It makes it possible to wrap an existing model in a self-contained Functional Mock-up Unit (FMU) and to make it available to other programs. A program can run simulations through FMUs without any information about the actual simulation engine.
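The five operations performed by the tool (fetch, map, run, collect, post) can be illustrated with a minimal sketch. Everything here is a stand-in chosen for the example: `InMemoryStorage` is not the sMAP API, `toy_step` is not an FMU, and the identifiers and values are hypothetical. The sketch also assumes that input data is already aligned to the simulation step.

```python
from typing import Callable, Protocol

Series = list[tuple[int, float]]  # (unix timestamp, value) pairs


class DataStorage(Protocol):
    def read(self, uuid: str, start: int, end: int) -> Series: ...
    def write(self, uuid: str, series: Series) -> None: ...


class InMemoryStorage:
    """Stand-in for a time series store; it does not implement sMAP."""

    def __init__(self) -> None:
        self.data: dict[str, Series] = {}

    def read(self, uuid: str, start: int, end: int) -> Series:
        return [(t, v) for t, v in self.data.get(uuid, []) if start <= t < end]

    def write(self, uuid: str, series: Series) -> None:
        self.data.setdefault(uuid, []).extend(series)


# One model step maps named inputs to named outputs; a real FMU would sit behind this.
StepFunction = Callable[[dict[str, float]], dict[str, float]]


def run_simulation(storage: DataStorage, step: StepFunction,
                   input_map: dict[str, str], output_map: dict[str, str],
                   start: int, end: int, step_size: int) -> None:
    # 1. Fetch the required time series from the data storage.
    inputs = {var: dict(storage.read(uuid, start, end)) for var, uuid in input_map.items()}
    results: dict[str, Series] = {var: [] for var in output_map}
    for t in range(start, end, step_size):
        # 2. Map time series values to the model's input variables.
        values = {var: series[t] for var, series in inputs.items()}
        # 3. Run one simulation step.
        outputs = step(values)
        # 4. Collect results from the model's output variables.
        for var in output_map:
            results[var].append((t, outputs[var]))
    # 5. Post results to the data storage.
    for var, uuid in output_map.items():
        storage.write(uuid, results[var])


def toy_step(inputs: dict[str, float]) -> dict[str, float]:
    """Toy model: a base load plus a load proportional to occupancy."""
    return {"energy": 0.5 * inputs["occupancy"] + 2.0}


storage = InMemoryStorage()
storage.write("occupancy-uuid", [(0, 10.0), (600, 20.0), (1200, 0.0)])
run_simulation(storage, toy_step, {"occupancy": "occupancy-uuid"},
               {"energy": "energy-uuid"}, start=0, end=1800, step_size=600)
```

Because the loop only sees the `DataStorage` and `StepFunction` interfaces, either side can be swapped without touching the other, which is the property that standard interfaces such as sMAP and FMI are meant to provide.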

Configuration
FMUs expose input and output variables through the FMI. The Online Energy Simulator uses a set of configuration files to map such variables to time series. Input variables can be provided in three different ways.
• Explicitly: the variable's value is constant over the whole simulation period and set in the configuration file.
• From a CSV file.
• From a time series on sMAP, identified by its Universally Unique IDentifier (UUID).
Basic arithmetic operations are also supported to allow unit conversion. For each input variable the Online Energy Simulator will either prepare a constant time series, load it from the CSV file or fetch it from the data storage. Then it will pass it to the FMU and start the simulation. Output variables are mapped to sMAP time series by 'source name', 'path' and 'UUID'. The Online Energy Simulator also supports setting metadata of output time series, e.g. its unit or its location. An example of mapping configuration is shown in Listing 1.
Besides input/output mappings, the Online Energy Simulator reads from configuration files the path to the FMU, simulation start/end time, simulation step size and sMAP connection details.
The FMU and configuration files completely define the behavior of the Online Energy Simulator. Therefore, it is simple to replace the model when a new more accurate version is available, or even to switch to a different simulation engine, as long as the new one supports the FMI.
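To make the three kinds of input source and the unit conversion concrete, the following sketch resolves a mapping into time series. The mapping layout, its keys (`value`, `csv`, `column`, `uuid`, `scale`, `offset`), the variable names and the CSV layout are all hypothetical; they are not the format of the tool's configuration files. The `fetch` and `open_text` callables stand in for the data storage and the file system.

```python
import csv
import io
from typing import Callable, TextIO

Series = list[tuple[int, float]]  # (unix timestamp, value) pairs

# Hypothetical mapping of model input variables to data sources.
INPUT_MAPPING = {
    "heating_setpoint": {"value": 21.0},
    "outdoor_temperature": {
        "uuid": "00000000-0000-0000-0000-000000000001",
        "scale": 1.0,
        "offset": -273.15,  # basic arithmetic for unit conversion: kelvin to degrees Celsius
    },
    "occupancy": {"csv": "occupancy.csv", "column": "count"},
}


def convert(series: Series, scale: float = 1.0, offset: float = 0.0) -> Series:
    """Apply a linear unit conversion to every value."""
    return [(t, v * scale + offset) for t, v in series]


def resolve_input(spec: dict, timestamps: list[int],
                  fetch: Callable[[str], Series],
                  open_text: Callable[[str], TextIO]) -> Series:
    """Build the input time series for one variable from its mapping entry."""
    if "value" in spec:
        # Explicit value: constant over the whole simulation period.
        series = [(t, float(spec["value"])) for t in timestamps]
    elif "csv" in spec:
        with open_text(spec["csv"]) as handle:
            rows = list(csv.DictReader(handle))
        series = [(int(r["timestamp"]), float(r[spec["column"]])) for r in rows]
    elif "uuid" in spec:
        series = fetch(spec["uuid"])
    else:
        raise ValueError("input needs one of: value, csv, uuid")
    return convert(series, spec.get("scale", 1.0), spec.get("offset", 0.0))


# Stand-ins for the data storage and the file system.
fake_store = {"00000000-0000-0000-0000-000000000001": [(0, 293.15), (600, 294.15)]}
fake_files = {"occupancy.csv": "timestamp,count\n0,12\n600,15\n"}

resolved = {
    name: resolve_input(spec, [0, 600], fake_store.__getitem__,
                        lambda path: io.StringIO(fake_files[path]))
    for name, spec in INPUT_MAPPING.items()
}
```

Keeping the mapping as plain data is what allows the model or the engine to be replaced without changing any code, as long as the new FMU exposes variables with the same names.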

Batch and real-time simulations
Necessary input data for the whole simulation period must be available at the beginning of the simulation. This assumption holds for simulations over historical data, but not for simulations over present or future time, where data become available during the simulation itself. A naive solution would be to divide the simulation period into single iterations and run independent simulations in sequence. For instance, the Online Energy Simulator could simulate one day at a time over a week. However, some engines such as EnergyPlus perform a certain number of initial 'warm-up' steps to compute initial values for room temperature and other measurements. This would result in discontinuities at the boundaries of each iteration.
To account for this use case, the Online Energy Simulator supports a special kind of execution. The simulation period is again divided in single iterations, but the Online Energy Simulator stops at the end of each iteration and waits for user input. Then it fetches input data only for the 'next iteration period' (with the exception of weather data), and runs the next iteration. The warm-up phase is only performed at the beginning of the first iteration, and all the measurements are continuous over the entire simulation period. User input for iteration start is deterministic and, therefore, the user can be replaced by another program.
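The difference between the two execution modes can be illustrated with a toy model. The sketch below uses a first-order room temperature model with an artificial warm-up phase as a stand-in for a real engine; the model, its parameters and the input values are assumptions made for the example and do not reproduce the EnergyPlus warm-up procedure.

```python
class ToyZoneModel:
    """First-order room temperature model; a stand-in for a real simulation engine."""

    def __init__(self, alpha: float = 0.1, initial: float = 20.0) -> None:
        self.alpha = alpha
        self.temperature = initial
        self.warmed_up = False

    def _advance(self, outdoor: float) -> None:
        self.temperature += self.alpha * (outdoor - self.temperature)

    def warm_up(self, outdoor: float, steps: int = 50) -> None:
        """Repeat the first input until the state settles (toy warm-up)."""
        for _ in range(steps):
            self._advance(outdoor)
        self.warmed_up = True

    def run_iteration(self, outdoor_series: list[float]) -> list[float]:
        if not self.warmed_up:
            self.warm_up(outdoor_series[0])
        output = []
        for outdoor in outdoor_series:
            self._advance(outdoor)
            output.append(self.temperature)
        return output


def run_independent(iterations: list[list[float]]) -> list[list[float]]:
    """Naive approach: a new model, and therefore a new warm-up, for every iteration."""
    return [ToyZoneModel().run_iteration(series) for series in iterations]


def run_continuous(iterations: list[list[float]]) -> list[list[float]]:
    """One model instance keeps its state across iterations; warm-up happens once."""
    model = ToyZoneModel()
    results = []
    for series in iterations:
        # In a real-time run, execution would pause here until the data
        # for the next iteration period is available.
        results.append(model.run_iteration(series))
    return results


days = [[10.0] * 5, [0.0] * 5]  # outdoor temperature for two consecutive iterations
independent = run_independent(days)
continuous = run_continuous(days)

# Size of the jump between the last value of one iteration and the first of the next.
jump_independent = abs(independent[1][0] - independent[0][-1])
jump_continuous = abs(continuous[1][0] - continuous[0][-1])
```

With these inputs the independent runs show a jump of several degrees at the iteration boundary, because the second warm-up discards the state reached at the end of the first iteration, while the continuous run changes by about one step's worth of cooling.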

EnergyPlus simulation engine
EnergyPlus is a whole building energy simulation tool developed by the US National Renewable Energy Laboratory [8]. It is used to simulate the building's behavior and energy consumption over time, at the whole building level as well as at room and subsystem level. It can simulate a large variety of building subsystems such as HVAC, water and hot water distribution and lighting.
The model describing the building is contained in a single EnergyPlus Input File (IDF). This file contains information about the whole building envelope, such as walls, floors and windows, their geometry, material and thermal properties, and about the building subsystems such as ventilation units and lights. The building is divided into thermal zones that interact with each other over time.
EnergyPlus supports wrapping its models as FMUs, which exposes a machine-friendly interface usable by the Online Energy Simulator [19].

Weather file update
Because it uses the FMI, the Online Energy Simulator is engine-agnostic, i.e. it supports EnergyPlus models but also models from other simulation engines, as long as they expose the correct interface. There is one exception, however, because EnergyPlus has limited support for weather data as input. Instead, weather data must be provided in the form of an EnergyPlus Weather (EPW) file, and it needs to be available at FMU 'creation time'.
Since providing updated weather data at execution time is a useful use case, the Online Energy Simulator supports this EnergyPlus-specific feature. FMUs are in practice renamed ZIP files containing the simulation engine (or a wrapper to call the actual engine) in form of a shared library. FMUs created from EnergyPlus contain also additional files, i.e. the model IDF file and an EPW file.
When the Online Energy Simulator loads an FMU it decompresses its ZIP file, replaces the relevant columns of its EPW file with weather data provided as input and re-compresses the contents as a new ZIP file. In this way it is possible to provide weather data at the beginning of a simulation. Providing weather data as input 'during the simulation', as is done for occupancy data or setpoints, is not possible due to limitations of the EnergyPlus engine.
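The repackaging step can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not the tool's implementation: it assumes the EPW file has 8 header lines, that the dry bulb temperature is the seventh comma-separated field of each data row, and that the weather file is the archive member whose name ends in `.epw`. Only one column is replaced here; other columns would be handled the same way.

```python
import io
import zipfile

EPW_HEADER_LINES = 8  # assumption: number of header lines before the data rows
DRY_BULB_COLUMN = 6   # assumption: zero-based index of the dry bulb temperature field


def replace_epw_column(epw_text: str, values: list[float],
                       column: int = DRY_BULB_COLUMN,
                       header_lines: int = EPW_HEADER_LINES) -> str:
    """Return the EPW text with one data column replaced by `values`."""
    lines = epw_text.splitlines()
    data = lines[header_lines:]
    if len(values) != len(data):
        raise ValueError("one value per EPW data row is required")
    for i, value in enumerate(values):
        fields = data[i].split(",")
        fields[column] = f"{value:.1f}"
        data[i] = ",".join(fields)
    return "\n".join(lines[:header_lines] + data) + "\n"


def update_fmu_weather(fmu_in, fmu_out, values: list[float]) -> None:
    """Copy an FMU archive, rewriting any member whose name ends in .epw.

    `fmu_in` and `fmu_out` may be paths or binary file-like objects.
    """
    with zipfile.ZipFile(fmu_in) as source, \
            zipfile.ZipFile(fmu_out, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            payload = source.read(item.filename)
            if item.filename.lower().endswith(".epw"):
                text = payload.decode("latin-1")
                payload = replace_epw_column(text, values).encode("latin-1")
            # Passing the original ZipInfo keeps member names and attributes.
            target.writestr(item, payload)
```

All other members of the archive, such as the model description and the shared library, are copied unchanged, so the result is still a valid FMU for the same model.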

CASE STUDY: BUILDING OU 44
In this article we present Odense Undervisning Building 44 as a case study [20]. The building, shown in Figure 4, is located on the Odense campus of the University of Southern Denmark and was built in 2015. It has four floors, is mainly used for teaching, and consists of classrooms, study rooms and offices. Regarding the HVAC system, there are four ventilation units, each serving one of the corners of the building. In addition, the building is heated using a district heating loop and, partially, through the ventilation system. Every room is equipped with its own set of sensors. Occupancy counting cameras are also located at every entrance of the building, providing an estimate of the number of people in the entire building. All sensors are accessible through a KNX bus [21] and broadcast records to the BMS according to their configuration. All energy meters are accessible through an EnergyKey system. Custom drivers fetch data from the BMS and EnergyKey system and publish it to a centralized data storage using sMAP, so that it is available to other applications, such as occupancy prediction [22] and model development and calibration [23].

Monitoring building performance with Online Energy Simulator
An overall dynamic energy performance model for Odense Undervisning Building 44 (OU44) was developed by Jradi et al. [20] considering various building characteristics and specifications including physical envelope, energy supply systems and operational parameters. The building model is continuously re-calibrated within the developed framework, considering a 3-month timeframe. The model was prepared for export by exposing selected input/output variables in the interface. This step is automated using the EPQuery tool [24], which helps to modify EnergyPlus IDF files using Python scripts. Employing the developed dynamic model, the Online Energy Simulator was configured and deployed to OU44 to monitor its energy performance. Once per day a simulation is run over the previous 24 h, providing the following input data:
• Weather data from the local weather station: outdoor temperature, wind speed and solar radiation.
• Whole building occupancy data, obtained from occupancy counting cameras.
• Single room occupancy data for the four test rooms that have occupancy counting cameras.
We focused on the four test rooms because having an estimate of the occupant count helps to understand their dynamics. These rooms also have additional room level energy meters and higher resolution sensors.
The following output variables were collected at each simulation step, i.e. every 10 min, and posted to data storage:
• Whole building electrical energy consumption.
• Whole building heating energy consumption.
• Whole building lighting energy consumption.
• Electricity consumption for the four ventilation units.
• Room temperature for the four test rooms.
• CO2 level for the four test rooms.
An overall building occupancy profile was generated using input from the different camera counts around the building [25]. The model assumes that occupants spread uniformly over the entire building. For the four test rooms, however, specific occupancy count estimates are provided to improve simulation accuracy.
Once results are posted to data storage, they are available to every other application. In particular, simulation results can be compared with the actual measured values. This makes it possible to detect any deviation between the actual and predicted performance of the building.

Results
In this section we show the results obtained by running the Online Energy Simulator on OU44. We used an EnergyPlus model and we ran simulations for 8 months from Thursday 1 September 2016 to Sunday 14 May 2017. We provided whole building occupancy count, room level occupancy counts for four test rooms, outdoor temperature, wind speed and solar radiation as simulation input. We show charts for selected time periods. Figure 5 shows the simulated and measured electrical energy consumption over a week for building OU44. Cumulative energy consumption over time is shown in the left column and energy consumed every 2 h is shown in the right column. We chose a 2 h interval because some of the sub-meters have low time resolution, which resulted in spikes when using shorter intervals. The last row shows the total number of occupants in the building, estimated through the occupancy counting cameras.
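The two views used in the charts, cumulative consumption and energy consumed per interval, are related by a simple binning step, sketched below together with a relative deviation between measured and simulated values. The function names, the hourly reporting interval of the example meter and the readings are assumptions made for illustration. The sketch attributes to each bin the difference between the last reading in that bin and the last reading in the previous non-empty bin.

```python
Series = list[tuple[int, float]]  # (unix timestamp, value) pairs


def consumption_per_bin(readings: Series, bin_seconds: int = 2 * 3600) -> Series:
    """Turn cumulative meter readings into energy consumed per time bin.

    The first bin is only used as a reference and produces no output.
    """
    last_in_bin: dict[int, float] = {}
    for timestamp, value in sorted(readings):
        last_in_bin[timestamp // bin_seconds] = value  # later readings overwrite earlier ones
    bins = sorted(last_in_bin)
    return [
        (current * bin_seconds, last_in_bin[current] - last_in_bin[previous])
        for previous, current in zip(bins, bins[1:])
    ]


def relative_deviation(measured: Series, simulated: Series) -> Series:
    """Relative difference (measured - simulated) / simulated at matching timestamps."""
    reference = dict(simulated)
    return [
        (t, (value - reference[t]) / reference[t])
        for t, value in measured
        if t in reference and reference[t] != 0
    ]


# Hypothetical sub-meter that reports a cumulative value once per hour (kWh).
hourly = [(0, 100.0), (3600, 101.0), (7200, 103.0),
          (10800, 104.0), (14400, 107.0), (18000, 108.0)]
measured_2h = consumption_per_bin(hourly)

# Hypothetical simulated consumption for the same 2 h bins (kWh).
simulated_2h = [(7200, 2.0), (14400, 4.0)]
deviation = relative_deviation(measured_2h, simulated_2h)
```

A wider bin averages out the steps of a slowly updating meter, which is the reason given above for preferring 2 h over shorter intervals.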

Results for energy performance
Energy performance at the whole building level is on par with the simulation results, with a small deviation toward the end of the week. We consider the next sub-meter layer, i.e. the ventilation system and lighting. The rest of the energy consumption is due to building operations, such as elevators and plug loads. We observe two distinct phenomena: the ventilation system performs consistently worse than the model, and energy consumption for lighting deviates significantly during the weekend.
We can explain the anomaly for lighting by looking at occupancy over time. During the weekend, the occupant count drops but the building is not completely empty. It is possible that a small number of students come to study on weekends and spread out over different rooms. In this case the lights would be turned on in many rooms even with a small number of occupants, while the model assumes a proportional lighting energy consumption.
We continue our investigation of the ventilation system and examine the sub-meters in the next layer, i.e. at the individual ventilation units. Unit 1 closely follows the simulation, but the other three deviate. Units 2 and 4 consume less energy than expected, while unit 3 consumes significantly more. There are no more meters in the ventilation units; therefore, we cannot further compare simulated and measured performance. We succeeded in reducing the scope to ventilation unit 3, which has a large deviation from the expected performance, and we can now run specific FDD techniques to completely isolate the faulty component. Further investigation should also be performed to understand why ventilation units 2 and 4 have a lower energy consumption than expected.

Results for indoor conditions
In addition to energy meters, we compared the room level indoor conditions measured by building sensors with the ones from the simulation. Figure 6 shows the simulated and measured room temperature for one of the four test rooms. Although the dynamic EnergyPlus model was calibrated based on the overall energy consumption of the building, actual indoor air temperatures were found to be in line with the model predictions, with the two values following the same trend. However, the room temperature measured by the building sensor drops quickly during the night of Tuesday 4 April 2017, deviating from the simulated value.
We can explain this anomaly by noticing that the indoor temperature closely follows the outdoor temperature recorded by the building's weather station. The most likely cause is that the room windows were left open during the night.

Computational load of simulations
In order to estimate the computational load of simulations we ran the Online Energy Simulator over periods of different lengths and recorded the elapsed time. The results are shown in Table I. Simulating an entire day or even an entire month takes only a few minutes. The elapsed times are very similar even for very different simulation periods because EnergyPlus spends a long time in the warm-up phase, which is the same for every simulation.

CONCLUSIONS
We proposed a method for FDD in building systems using dynamic energy models to simulate the expected behavior of the building and compare it with the actual one at different layers. We presented a tool for scheduling and automatically running simulations without user interaction, using industry standard interfaces to support many simulation engines and building systems. Finally, we tested our method and tool on a real building, identifying anomalies in energy consumption of lighting and ventilation units, and in room temperature. As the tool was deployed only for a short time for validation in the case study building, the savings due to its use were not evaluated, but the major expected savings include lower operational costs, faster maintenance response, lower energy consumption and higher thermal comfort.
Splitting energy consumption into sub-meters allowed us to understand how different subsystems use energy inside our building. We were able to follow the energy distribution tree from its root to its leaves, ruling out branches where measured values were on par with simulation results and exploring the ones where they deviate. We succeeded in identifying the ventilation unit responsible for higher energy consumption and gained insights about the lighting system.
We also showed how using an automated solution to schedule simulations can reduce the risk for human errors. The Online Energy Simulator developed and presented in this study has been running automatically for several months in the OU44 building within the 'ObepME Tool', Online Building Energy Performance Monitoring and Evaluation, for automatic and continuous energy monitoring and evaluation of the overall building energy performance aiming to reduce energy performance gaps and forming a backbone for FDD [26]. Thanks to a configuration-based approach, we are able to easily upgrade and calibrate the dynamic model to newer versions and repeat simulations over any period with any functional changes.

Future work
The methodology proposed in this article covers the high-level identification of a faulty subsystem, and represents an important intermediate block of a complete FDD solution for building systems. In order to perform a full FDD it is first necessary to ensure validation of input data, which we previously approached in [27], and then to use specific methods to completely isolate the faulty component inside the identified subsystem. Those methods should exploit the characteristics of the considered systems, such as individual ventilation units or room lighting, to reach the best FDD performance. Moreover, simulated and measured data are both available on our data storage for client applications, but they are not accessible in a user-friendly way. A dashboard application would enable non-technical users to assess the building status and performance. Furthermore, we are extending the Online Energy Simulator to play an important role as a component of a new 'virtual building'. The virtual building behaves as closely as possible to a real building, also with respect to control input. It waits for new actuation commands to be posted to our data storage and simulates the outcome. A BMS can then be deployed on the virtual building, making it possible to test control strategies before deploying them on a real one.

Figure 6. Comparison between simulated and measured room temperature (series: actual, simulated and outdoor; temperature in °C). Temperature dropped sharply during one night, following the outdoor temperature.