Building simulation in adaptive training of machine learning models

Combining building performance simulation (BPS) and artificial intelligence (AI) provides smart buildings with the ability to adapt by utilizing BPS's data synthesis and training capabilities. There is a scarcity of comprehensive reviews focusing on how building simulation contributes to the adaptation process. The contribution of this review is to analyze the implementation of building simulation in adaptive AI systems, as both a data acquisition and a training environment, by interpreting adaptation as a cyclical process. Here, the reviewed studies are classified into four major applications: prediction, optimization, control, and management. It is concluded that defining adaptation as a cyclical process provides a useful framework for the development of adaptive smart buildings. Among the reviewed control and management applications, 48% of decision-making AI agents were trained adaptively, with contributions from BPS. Further research is needed to fully exploit the potential of BPS in training decision-making AI, especially when aiming at continuous (cyclical) adaptation.


Background
Buildings account for 20-40% of total energy consumption in developed countries [1]. In this context, building energy management systems (BEMS) provide a multitude of opportunities to enhance energy efficiency, improve occupant comfort, promote the building's sustainability, and guarantee high-performance building operation [2]. At a larger scale, virtual power plants (VPPs), i.e., networks of decentralized energy generation and storage units plus flexible prosumers, can aggregate buildings' energy resources to participate optimally in the energy market and improve overall energy performance [3,4]. On the other hand, climate change and unforeseen events like COVID-19, heat waves, war, etc. have highlighted the need for building energy systems that remain resilient and reliable when faced with low-probability, high-impact occurrences. These challenges can be addressed by adaptive control and management systems that ensure the reliability of building system operation while increasing resilience [5].
Before its adoption in the building industry, the combination of simulation environments and artificial intelligence (AI) for developing agents that make adaptive and autonomous decisions had been applied in many fields. For instance, David Silver et al. [6] presented an intelligent program that plays sophisticated games (like chess) with superhuman performance using a deep neural network (DNN) and reinforcement learning (RL) by self-play. In fact, they substituted the handcrafted information and knowledge otherwise required with this novel approach, which starts from random play. In many engineering fields, studies have also reviewed the use of this promising approach for self-driving vehicles [7,8], smart manufacturing [9], robotics [10], etc.
In the building sector, the convergence of machine learning (ML) and buildings could result in substantial progress in performance prediction, control, and management. The broad subsets of ML in this context are supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves training the model on a significant set of labeled data, allowing the model to predict or classify new instances (e.g., building energy consumption prediction). In contrast, unsupervised learning applies to unlabeled training data to discover patterns or relations within the data (applicable for anomaly detection). While supervised and unsupervised learning primarily focus on observation and prediction tasks, reinforcement learning (RL) is suitable for training building control and management systems [11]. An RL agent interacts with a training environment, instead of learning from labeled data as in supervised learning, and receives feedback (rewards) corresponding to its decisions. It gradually learns to take the actions with the highest rewards. Incorporating deep neural networks in RL techniques enhances the learning capability to handle complex decision-making problems (for example, multi-energy systems). Regardless of the type of ML model, acquiring a sufficient dataset or providing a training environment is a challenge in the training process [11].
In this regard, there is a growing interest in utilizing building simulation tools for rapidly generating the required amount of data. Building simulation, as a computational tool, enables modeling and predicting the building energy performance, simulating different possible scenarios (e.g., low-probability critical situations), and assessing the dynamics of the environment (e.g., the impact of daily temperature fluctuations on building operation). Therefore, building energy models can quickly respond and adjust to changing conditions, providing valuable insights for accurate energy estimations and decision-making processes.
In the work of Attia et al. [12], the ability to adapt is considered a crucial element of building resilience. Adaptation is a holistic concept involving the whole building, its systems, and occupants, and it can be characterized as a circular, iterative process. When using BPS, through dynamic adjustments to changing conditions, models calibrated to the building's measured data (which might contain issues like system failures, biased data, etc.) can adapt to changes quickly and make optimum control decisions [13]. These decisions can be reactions to unexpected disturbances or proactive measures in response to predictions. In this regard, integrating building simulations into the adaptation process provides support for reactive or proactive control and management of building energy systems. The evolution of simulation-trained machine learning (ML) models in a repeating adaptive process can enhance the overall resilience of buildings during unpredicted changes [14,15]. However, in the existing literature, adaptation has not commonly been considered as a holistic, iterative process that enhances the system's intelligence in the operational phase.

Review of existing review studies
The possibilities of building performance simulation (BPS) have encouraged researchers to review simulation-related articles in various applications like predicting the building energy and thermal performance [14,16], co-simulation of buildings and different energy systems [17], investigating existing uncertainties [18], model calibration [19], building thermal resilience [15], etc. Coupling BPS with specific subjects and applications, such as optimization processes and interoperability with building information modeling (BIM), is another issue of interest in existing review papers [20-22]. The relationship between simulation models, digital twins, and artificial intelligence has also been studied [23-25].
Some reviews concentrated on the ML side of the subject when simulation and ML models are coupled. Sierla et al. [4] classified the various functionalities of machine learning models in the context of virtual power plants and building energy management into different categories like optimization, forecast, and classification. In this research, they reviewed the use of ML in the operating phase of systems. Alanne and Sierla [11] presented an overview of studies working on ML applications for building energy management, especially reinforcement learning models and autonomous agents making independent decisions. They attempted to address the research gap regarding the lack of a holistic perspective toward discussing the learning ability of buildings at a system level. However, although both papers pointed out the importance of simulation in the ML training phase, neither of them elaborated on the role of simulation models in these applications.
Finally, in some specific applications of coupling building simulation and ML, Roman et al. [26] investigated the literature concerned with artificial neural network-based simplified approximations of complex building simulation models trained and tested using simulation data. In this research, they focused on articles about generating surrogate models using simulation data and did not mention autonomous decision-making AI agents or adaptive models. On the other hand, Wang et al. [27] surveyed studies about building controllers trained by reinforcement learning algorithms. This study reported that a majority of the reviewed RL-based controllers were trained and tested using simulation environments. Liang Yu [28] reviewed various deep reinforcement learning (DRL) methods used for developing building energy management systems to address challenges related to BEMS. They classified the reviewed studies based on the system's scale and explained how these smart BEMSs can enhance the energy performance of buildings. In this study, the focus was on different approaches developed based on the DRL algorithm. However, they did not elaborate on the importance of simulation environments for training agents. Table 1 indicates how baseline reviews covered this research domain.
As seen in Table 1, the discussion of several concepts regarding adaptive, BPS-based ML models has been missing in existing review articles. Machine learning, as a subset of artificial intelligence (AI), can equip buildings with an ability to learn and can be exploited in a variety of applications throughout the life cycle of a building [4,11]. While ML offers potential opportunities in the building industry, it also presents challenges, notably the requirement for acquiring sufficient high-quality data for the training process. Efficient ML training demands a large amount of data, which cannot be obtained easily because of limited data collection resources (the need for many sensors and the inability to measure some parameters), data privacy and security, data quality problems, etc. [29]. This review aims to address the role of building performance simulation as a potential solution to the data acquisition barriers, while also addressing challenges related to the simulation model's fidelity and complexity. Interpreting the importance of using building simulations in ML training for future studies needs comprehensive insights into the various applications, implementations, and performance of simulation-based training approaches. This can provide a guiding roadmap to help researchers determine the viability of using BPS for different ML training applications, outlining advantageous approaches and contexts for effective utilization. Additionally, addressing the role of building simulation in developing adaptive controllers and management systems is another important research gap. As discussed, adaptive building energy models are developed by a cyclic process that benefits from learned experiences to increase their learnability and intelligence [12]. In this regard, we hypothesize that an iterative framework identified as the "adaptation cycle" can benefit from the potential of BPS in ML training to develop intelligent building energy systems in the operational phase. The developed systems can be adapted to real-time changes or future predictions (if available) to make decisions. The adaptation cycle is inspired by the enhanced resilience enabled by intelligent and learning buildings.

Objectives and research questions
This paper aims to address these research gaps by answering the following research questions:
• In which settings and to what extent are the reported decision-making AI applications trained using building simulations adaptive?
• What triggering mechanisms have been introduced and developed for iterative adaptive approaches so far?
• What challenges have been perceived in terms of the simulation model's complexity and fidelity while implementing BPS in the adaptation cycle?
The novelty of this work lies in the analysis of relevant applications based on the role of building simulation as a part of the cyclical adaptation process, with the primary focus on the operational phase of buildings. To fill the research gaps, we define a framework called the "adaptation cycle" to be used as a content analysis method for classifying and interpreting the reviewed adaptation approaches (inspired by [12]). Then, state-of-the-art studies that utilized building simulation environments for training and testing ML models are classified, based on the adaptation cycle and their functionalities, into four main categories: prediction, optimization, control, and management. Selected papers are reviewed from the perspective of ML application, simulation environment, and reactive or proactive adaptability to sudden changes.

H. Amini et al.
The potential of virtual models for developing and training both ML-based black-box and decision-making models, and for making it possible to adapt them to changing conditions, is addressed. Also, based on the adaptation cycle and the phases of the adaptation process, new decision-making approaches toward developing efficient simulation-based ML models/AI agents with high adaptability are discussed. This article is targeted at engineers and researchers as a roadmap to simulation-based ML models and to developing adaptive BEMS, regardless of their prior AI knowledge.

Structure of the article
This article is organized as follows. Section 2 explains the research methodology and criteria for selecting articles and analyzing their contents based on the adaptation cycle as an overarching theme. Section 3 discusses the role of BPS in the adaptation cycle, focusing on data acquisition and management. Section 4 discusses the essence of decision-making ML applications as a part of the adaptation cycle. Section 5 discusses the role of BPS in training adaptive decision-making ML applications and examines triggering mechanisms initiating the adaptation process. Section 6 addresses the fidelity and complexity challenges when implementing building simulation models in various segments of the adaptation cycle. Section 7 provides a discussion of different applications and approaches with respect to the adaptation process. Finally, Section 8 concludes the paper with the most important findings of the review.

Methodology
In this review, to become better acquainted with the research domain, a scientometric analysis was performed first, and the selected papers were then reviewed. Scientometric analysis is "a quantitative study of the research on the development of science" [30]. It involves extracting relevant information from articles to generate insights and measure various aspects of scientific output [31]. Using this methodology, the evolution of knowledge surrounding the subjects of building simulation and machine learning can be exposed.
For analyzing the research field, the Web of Science (WoS) database was selected to find articles related to the utilization of building simulations in training ML models. The search was limited to the period from 2014 to May 2023. Fig. 1 illustrates the growing number of research activities in this field.
Fig. 2 demonstrates the paper selection process in this study. The gathered database was prioritized based on three different search queries presented in Table 2. They were chosen based on the research questions of the paper to cover all articles focusing on the use of simulation for AI training and the adaptation cycle. Since diverse terminology is common in this field of research, we aimed to choose our keywords in a way that minimizes the chance of missing key studies. For example, using more specific keywords like "adaptive", "adaptation", "trigger", "decision making", "proactive", and "reactive", which are directly connected to our research questions, resulted in neglecting many related articles. This strategy produced a large pool of records, which was screened in a human-supervised process.
The selected queries were performed separately in the WoS database in the "topic" field (title, abstract, author keywords, and Keywords Plus) to narrow down the results, and the results were limited to articles with a publication date ranging from 2014 to May 2023. Additionally, the search was refined to include the relevant WoS categories: energy fuels, construction building technology, engineering electrical electronics, engineering civil, and engineering mechanical.
In addition to the WoS search engine (based on the mentioned strategy), the IEEE Xplore database, the reference lists of selected articles, the most influential articles recognized by the scientometric analysis, and citation searching results (other relevant studies identified by examining the references and citations of recognized key articles) were considered in the paper selection process. After removing duplicates, the titles and abstracts of the selected papers were screened. The screening process was conducted by the first author (single screening), with particular emphasis on the first 100 papers of each query sorted by relevance (priority screening), as well as on papers obtained through alternative searching methods. During title and abstract screening, papers were excluded because they were not related to building energy concepts and lacked machine learning or simulation in their methodology (RQ #1). Finally, the eligibility of the chosen articles was assessed by full-text assessment. In this process, 278 articles were excluded because they were not directly related to the research questions. The most common exclusion reasons were: not incorporating BPS in the training or adaptation process (RQ #1), lacking a clear connection with the adaptation cycle and its inner or outer circles (RQ #2 and #3), and not focusing on building energy systems as the primary objective. Finally, the remaining papers were selected to be investigated in this review, as presented in Fig. 3.
In this review, the criterion for selecting a paper from the databases was that it focused on the integration of physics-based models and machine learning (ML) algorithms for different building energy applications. Afterward, suitable papers were chosen for further scanning and annotation. They were classified by the role of simulations in training the models, as well as by the use case of their simulation-aided ML models, to provide a taxonomy of their functionalities (like [4]). Additionally, the adaptability (as mentioned in Section 1) of the developed models was investigated with respect to the adaptation cycle to identify the distinguishing features of different approaches and applications. Fig. 4 demonstrates the adaptation cycle, which is introduced in this paper for the first time and is exploited as an analysis method to address the existing research gaps. The selected papers and their approaches are assessed according to the criteria and items explained below.
As shown in Fig. 4, the adaptation phase is initiated by activating a triggering mechanism. A condition, event, feedback, etc., can activate this trigger automatically or manually to start the adaptation cycle. First, the virtual model (e.g., building energy simulation) is synchronized with the actual building (considering the most influential uncertain parameters), and the model's uncertainties can be determined using real-time data provided by IoT sensors (synchronization process). Second, machine learning (ML) training (or an optimization process) is carried out based on operation data resulting from continuous interactions with virtual systems. Here, based on the defined setup, the ML agent not only learns from real-time data provided by the simulation environment but can also benefit from future predictions of influential variables such as occupancy pattern, energy price, etc. in various time frames (for example, using 24 h ahead predictions). The adaptation phase concludes when the stopping criteria are met and an optimal control strategy is established. Finally, the control signals resulting from the adaptation phase are forwarded to the physical system in the operation phase of the cycle. Experiences resulting from the interactions in both the adaptation and operation phases accumulate in the outer cycle to be utilized in the next cycles.
Consequently, key characteristics of the selected articles regarding the adaptation cycle and the defined research questions, alongside the application and their BPS-trained models, were extracted to create the metadata obtained from the selected documents. The next sections are structured based on the adaptation cycle and its circles and phases.
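The sequence described above (trigger, synchronization, adaptation on the virtual model, operation, and accumulation of experience) can be illustrated with a minimal sketch. It is not taken from any reviewed study: the class name, the linear stand-in for the BPS model, the threshold-based trigger, and all numerical values are hypothetical assumptions chosen only to make the flow of the cycle concrete, and a simple search over setpoints replaces the ML training step.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptationCycle:
    """Toy adaptation cycle; every name and number here is hypothetical."""
    drift_threshold: float = 1.0   # trigger level for model/measurement mismatch [degC]
    model_bias: float = 0.0        # single calibrated parameter of the virtual model
    setpoint: float = 21.0         # control decision currently applied to the building
    experience: list = field(default_factory=list)  # outer-circle memory

    def predict(self, setpoint: float, outdoor_temp: float) -> float:
        # Stand-in for a BPS run: predicted indoor temperature [degC].
        return 0.8 * setpoint + 0.2 * outdoor_temp + self.model_bias

    def triggered(self, measured: float, outdoor_temp: float) -> bool:
        # Triggering mechanism: prediction error exceeds the threshold.
        error = abs(measured - self.predict(self.setpoint, outdoor_temp))
        return error > self.drift_threshold

    def synchronize(self, measured: float, outdoor_temp: float) -> None:
        # Synchronization: shift the bias so the model reproduces the measurement.
        self.model_bias += measured - self.predict(self.setpoint, outdoor_temp)

    def adapt(self, outdoor_temp: float, target: float = 21.0) -> None:
        # Adaptation phase: search candidate setpoints on the virtual model only.
        candidates = [18.0 + 0.5 * i for i in range(17)]  # 18.0 ... 26.0 degC
        self.setpoint = min(
            candidates,
            key=lambda sp: abs(self.predict(sp, outdoor_temp) - target),
        )

    def step(self, measured: float, outdoor_temp: float) -> bool:
        # One pass of the cycle; returns True if an adaptation was triggered.
        adapted = self.triggered(measured, outdoor_temp)
        if adapted:
            self.synchronize(measured, outdoor_temp)
            self.adapt(outdoor_temp)
        # Operation phase: the current setpoint is applied and the experience stored.
        self.experience.append((measured, outdoor_temp, self.setpoint, adapted))
        return adapted
```

In this sketch, a measurement that matches the model leaves the setpoint unchanged, whereas a measurement that deviates by more than the threshold recalibrates the model and selects a new setpoint before operation continues.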

Data acquisition and management (outer circle)
Experiences and data harnessed from the simulation models or the actual building are accumulated in the outer circle of the adaptation cycle (Fig. 4) to be used for predictions and optimization of the building's operation. Predictive models, developed based on either simulation-based datasets or measured data, facilitate making future-oriented decisions.
Detailed building performance simulation models can provide reliable and accurate predictions of building energy demand, but they impose high computational expenses. On the other hand, training data-driven predictive models requires a large amount of diverse, representative, and accurate measured data [32]. To tackle these challenges, when sufficient data is gathered from building sample simulations or from a combination of simulation and measured datasets, machine learning models can be trained to predict various objectives even if few input variables are known. Also, the predictive performance of different ML algorithms can be compared and tested using these simulation databases [33]. In this regard, studies utilizing BPS for acquiring training data are reviewed and categorized in this section.
BPS can be a part of the data acquisition process in the outer circle of the adaptation cycle. ML models trained by simulation-generated datasets can be used for applications like prediction, calibration [34], fault detection and diagnosis [35-37], etc. Fig. 5 exhibits the schematic diagram of this data generation process. A randomly selected series of the necessary input variables is determined for conducting building performance simulation using simulation tools. Then, the input variables and their corresponding simulation results are gathered to train ML algorithms to identify relationships between inputs and outputs.
Fig. 6 demonstrates the different outputs of interest addressed by the simulation-based predictive models reviewed in this study. Each article can be related to more than one output. Most of the predictive models trained by simulation-derived databases were developed for estimating the heating/cooling demand (e.g., [33,38]) and the total energy consumption of the building (e.g., [39]). They were also implemented for estimating other outputs of interest like thermal comfort [40], temperature [41], CO2 emission [42], building features [43], etc. [44,45]. ML predictive models trained by simulation data apply to various types of buildings based on their usage. Fig. 7 categorizes the reviewed predictive models based on the simulated building's function and also distinguishes whether their simulation model represents an actual building or is merely a prototype model. A majority of articles confined their studies to evaluating their prediction methodologies theoretically by implementing them in reference BPS models (like the Department of Energy (DOE) reference building). Also, most of the researchers' attention is devoted to non-residential buildings, especially office and university buildings, because they have a higher energy use intensity than residential buildings.
Most of the reviewed studies focusing on data acquisition using simulation models and predictive models explore various approaches to reach a balance between computational time and accuracy. In this regard, Singh et al. [46] acknowledged that an ML predictive model trained with a sufficient training database can demonstrate an acceptable prediction time and accuracy in comparison to alternatives like simplified simulation models. Using databases acquired from high-fidelity simulation models for training ML-based predictive models is considered a potential solution to enable accurate and fast predictions [47]. Addressing uncertainties (e.g., occupants' actions) and unknown parameters when developing prediction models increases prediction accuracy, for example by integrating a BPS-trained surrogate model into a framework capable of capturing dynamic human behavior [39]. Furthermore, utilizing both simulation data and measured data enhances the accuracy by capturing the occupants' energy consumption behavior while addressing the generalizability challenges associated with regular data-driven approaches [32]. Some studies addressed generalizability, an important concern in the context of predictive models. They attempted to generate generic and reliable datasets that can represent a group of buildings, using archetype building simulation for training ML models. These generic predictive models were exploited for specific types or locations of buildings (e.g., European non-residential buildings [38]) [48,49].
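The data generation process described above (random sampling of inputs, one simulation run per sample, and training on the resulting input-output pairs) can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow of any reviewed study: `toy_simulation` is a hypothetical steady-state heat loss formula standing in for a BPS tool, the input ranges and building constants are invented, and a k-nearest-neighbour average stands in for the ML algorithm.

```python
import random


def toy_simulation(u_value: float, outdoor_temp: float) -> float:
    """Stand-in for one BPS run: steady-state heating demand in W (hypothetical)."""
    envelope_area = 200.0   # m2, assumed
    indoor_temp = 21.0      # degC, assumed
    return max(0.0, u_value * envelope_area * (indoor_temp - outdoor_temp))


def generate_dataset(n_samples: int, seed: int = 0):
    """Randomly sample the inputs and run the simulation once per sample."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_samples):
        u_value = rng.uniform(0.2, 2.0)           # W/(m2 K)
        outdoor_temp = rng.uniform(-20.0, 15.0)   # degC
        output = toy_simulation(u_value, outdoor_temp)
        dataset.append(((u_value, outdoor_temp), output))
    return dataset


def knn_predict(dataset, query, k: int = 5) -> float:
    """Surrogate prediction: average the outputs of the k closest samples."""
    def distance(inputs):
        # Scale each input by its sampling range so both contribute comparably.
        du = (inputs[0] - query[0]) / 1.8
        dt = (inputs[1] - query[1]) / 35.0
        return du * du + dt * dt

    nearest = sorted(dataset, key=lambda row: distance(row[0]))[:k]
    return sum(output for _, output in nearest) / len(nearest)


if __name__ == "__main__":
    data = generate_dataset(2000)
    print("surrogate:", knn_predict(data, (1.0, 0.0)))
    print("simulation:", toy_simulation(1.0, 0.0))
```

Once the dataset exists, the surrogate answers a query without calling the simulation again, which is the property the reviewed studies exploit to trade simulation time for prediction speed.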
Simulation-trained ML models can be utilized in both the design and operational phases of buildings. As one of the major applications, simulation-based prediction models are used for evaluating the energy efficiency of various design strategies, materials, etc. [22,50]. Simulation-generated datasets of various building designs and locations enable training ML predictive models with reusability potential across various locations and climates [51,52]. These predictive models can provide designers with valuable insights into the performance of buildings, helping them select the most energy-efficient design strategies. On the other hand, in the building operational phase, simulation-based ML predictive models can be integrated with control algorithms to optimize the performance of buildings. The model should therefore be trained with a dataset that includes the required control variables. For instance, it has been demonstrated that the integration of simulation-trained predictive models with model predictive controllers can significantly reduce energy consumption compared to rule-based controllers [53,54].
Fig. 3. Selected articles by the year of publication.
Training datasets obtained by simulation models offer a valuable option for comparing the prediction performance of different ML algorithms to determine the most effective one [42,55-57]. Synthesizing these datasets not only reduces the effort required for data acquisition but also minimizes the potential noise found in measured data. On the other hand, simulation-trained predictive models also facilitate forecasting multiple objectives simultaneously [40,58-60]. Objectives concerning building energy demand and occupants' comfort commonly coexist in these multi-objective prediction models [40,58]. Such interrelated aspects are informative for the decision-making process in the inner circle of the adaptation cycle. Simulation-trained predictive models can estimate the desired objectives over various prediction intervals (hourly [32,51,57], daily [61], annual [51,61], etc.) and at various scales (zone, building, neighborhood [61,62], etc.).
The robustness of the developed ML models when facing conditions (like weather) different from the synthesized training data should be considered [41]. As a novel approach benefiting from BPS, adaptive prediction models, which are capable of dynamically adjusting to changing conditions (like occupancy patterns or weather data), have been proposed in some studies [63,64]. Yet, further research is needed to better address adaptive prediction models and the role of BPS in developing them.
The synthesis of the reviewed studies related to the outer circle of the adaptation cycle demonstrates the pivotal role of BPS in developing predictive ML-based models. They cover a wide range of domains, including building and neighborhood design or operation, prediction algorithm comparison, multi-objective prediction, adaptive forecasting, creating generic prediction models, etc. Among these applications, adaptive prediction models are considered a future research direction that can specifically leverage the data synthesis potential of building simulation. Integrating these predictive models with the adaptation cycle's inner circle leads to developing proactive building management and control systems. This approach is clearly under-addressed in the papers reviewed in this section. Table 3 elaborates further on the reviewed articles that focused only on feeding and supporting the adaptation cycle (outer circle).

Decision making
Decision-making is one of the key parts of the adaptation cycle's inner circle. Building control and management decisions are made based on information provided by BPS (virtual system), data acquired in the outer circle, and predictions made by predictive models. The most popular decision-making tools in this field of research are optimization techniques [65] and machine learning agents [66].

Optimization process
Conducting building optimization involves numerous simulation runs and considerable computational time. However, replacing these runs with simulation-trained ML models that mimic detailed BPSs leads to a significant reduction in computational time [67]. This facilitates time-efficient decision-making in cyclic processes such as the adaptation cycle when optimization algorithms are applied.
In the design phase of buildings, several reviewed studies coupled simulation-trained ML models and optimization algorithms to find optimum design variables, achieving, for example, up to a 75% reduction in the number of simulations required by the optimization process [68]. They sought to optimize objectives like energy performance [67,69,70], thermal comfort [69,70], visual comfort [67], heating/cooling loads [71], and CO2 emission [69]. On the other hand, some studies focused on making operational decisions using simulation, ML, and optimization [72]. One example is using an ML model trained by simulation data for model calibration, followed by an hourly multi-objective optimization to achieve a tradeoff between the building's energy consumption and thermal comfort [73]. However, operational optimization approaches can be improved when equipped with the ability to adapt to dynamic changes [74].

ML agents
Leveraging decision-making agents, developed based on ML algorithms, in the adaptation cycle's inner circle facilitates implementing adaptive building energy controllers and management systems. Fig. 8 shows how BPS can act as a training environment replacing the physical building in the training process. According to the diagram, the simulation environment provides the agent's current state based on the input variables of the time step. Then, the machine learning agent selects corresponding control actions, which are sent back to the simulation environment as inputs for the next time step. The benefit of the decision made by the agent is specified as its reward. As the process progresses, the agent finds the most effective control actions by accumulating rewards. This role of BPS enables the adaptation phase of the adaptation cycle (inner circle) through continuous interactions between agents and simulated virtual environments.
Articles that developed their approaches based on this framework are classified into two main categories in this section: control and management. Control systems usually focus on optimizing the actions related to a specific energy system, like actuating HVAC systems. Management, on the other hand, takes a broader perspective by supervising the control level and the building energy systems. In a hierarchical architecture, management sits at a higher level than control, making inclusive decisions like the determination of setpoints and overall system optimization [11]. Table 4 outlines the reviewed articles concerning the integration of BPS and ML-based decision-making agents in both control and management applications.
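The interaction loop described above (state from the simulation, action from the agent, reward back to the agent) can be sketched with tabular Q-learning. This is a minimal illustration under stated assumptions: `ToyZoneEnv` is a hypothetical one-zone model standing in for a BPS environment, the reward weights and hyperparameters are invented, and tabular Q-learning is used here only because it is compact, not because the reviewed studies necessarily used it.

```python
import random


class ToyZoneEnv:
    """Stand-in for a BPS training environment (hypothetical single-zone model)."""

    def __init__(self):
        self.temp = 18  # indoor temperature in whole degC

    def reset(self, start_temp: int) -> int:
        self.temp = start_temp
        return self.temp

    def step(self, heater_on: int):
        # Heater on: +1 degC per step; heater off: -1 degC per step; clipped to 15..25.
        self.temp = max(15, min(25, self.temp + (1 if heater_on else -1)))
        comfort_penalty = abs(self.temp - 21)   # distance from the comfort target
        energy_penalty = 0.1 * heater_on        # small cost for running the heater
        return self.temp, -(comfort_penalty + energy_penalty)


def greedy_action(q, state: int) -> int:
    # Action with the highest learned value (ties resolve to "heater off").
    return max((0, 1), key=lambda a: q[(state, a)])


def train(episodes=500, steps=48, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning against the simulated zone."""
    rng = random.Random(seed)
    env = ToyZoneEnv()
    q = {(t, a): 0.0 for t in range(15, 26) for a in (0, 1)}
    for _ in range(episodes):
        state = env.reset(rng.randint(15, 25))  # random start covers all states
        for _ in range(steps):
            # Epsilon-greedy selection: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                action = rng.choice((0, 1))
            else:
                action = greedy_action(q, state)
            next_state, reward = env.step(action)
            best_next = max(q[(next_state, 0)], q[(next_state, 1)])
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for temp in (16, 21, 25):
        print(temp, "->", "on" if greedy_action(q_table, temp) else "off")
```

Because all trial and error happens against the simulated zone, the physical building is never exposed to the poor decisions made early in training, which is the motivation for using BPS as the training environment.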

Control
Articles that concern making control decisions with the help of a BPS-based training environment (e.g., an EnergyPlus-Python co-simulation testbed developed to that end [75]) are reviewed in this sub-section. Three major techniques for developing building control systems are rule-based control (RBC), model predictive control (MPC), and ML-based control. Rule-based controllers depend on fixed, pre-determined control strategies that cannot be customized dynamically [27]. MPC, on the other hand, relies on models describing the building's energy dynamics and on predictive information to make control decisions. Although MPC controllers have demonstrated promising results, issues such as development difficulty and limited generalizability hinder their wide implementation in the building sector [76]. MPC can also be coupled with ML models to facilitate the control process, as in [77], which integrated a simulation-trained neural network (NN) dynamic model of a residential building with an MPC controller. Finally, access to operation data or a dynamic training environment can lead to the development of ML-based controllers for buildings [78]. Reinforcement learning (RL), as a branch of machine learning, can be a revolutionary building control method that tackles existing challenges. RL relies on a trial-and-error process in which an agent is trained to make sequential control decisions through interactions with the training environment [27].
RL agents are suitable options for implementation in the adaptation cycle to make control decisions (e.g., supply water temperature [79,80], HVAC on/off status [81], etc.). Decisions are made using states provided by BPS and data acquired in the outer circle, across various planning horizons (such as 24 h ahead [82]) and control time steps (e.g., hourly [81] or sub-hourly [79]). The data fed into the decision-making process (the state) and the actions taken by control agents are determined by the goals (outlined in Table 4) that the agents are trained to achieve. For example, to balance cost and thermal comfort, the agent requires dynamic energy price information as well as indoor and outdoor temperatures [81].
This approach has been demonstrated in real-time implementation, making sub-hourly decisions, in comparison with RBC [79,80]. However, in a comparison with data-driven MPC, the offline-trained RL agent faced challenges when encountering states not seen during the training process [83]. The fact that this RL agent was unable to adapt to unforeseen events underscores the importance of an iterative adaptation process to address unexpected scenarios. Furthermore, different RL algorithms offer different control performance and training times, as discussed in a study comparing tabular Q-learning and policy-gradient RL algorithms for determining passive control strategies [84].

Fig. 8. Schematic diagram of using simulation models as the training environment.

H. Amini et al.

As discussed previously, the complexity and high computational time of high-fidelity simulation models pose challenges when they are used as training environments for RL agents. As a solution, in one study the agent was first pre-trained with a simulation-trained black-box model and then deployed in a high-fidelity simulation model to evolve and adapt to various scenarios [85]. Alternatively, the simulation training environment can be completely replaced by a data-driven model trained using both simulation and historical datasets [86]. All in all, based on the reviewed studies, it can be concluded that simulation models are suitable tools for training and pre-training RL controllers, directly or indirectly.
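The idea of replacing a slow simulator with a simulation-trained black-box model can be sketched as follows. This is a minimal illustration under stated assumptions: `high_fidelity_step` is a hypothetical placeholder for one BPS time step, and the black-box model is a simple per-action linear regression rather than the models used in [85] or [86].

```python
import random


def high_fidelity_step(indoor_temp, outdoor_temp, heater_on):
    """Hypothetical placeholder for one (slow) BPS time step."""
    gain = 2.0 if heater_on else 0.0
    return indoor_temp + gain - 0.1 * (indoor_temp - outdoor_temp)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_surrogate(n_samples=200, seed=0):
    """Fit one linear model per action from transitions sampled from the simulator."""
    rng = random.Random(seed)
    models = {}
    for heater_on in (False, True):
        xs, ys = [], []
        for _ in range(n_samples):
            indoor = rng.uniform(15.0, 25.0)
            outdoor = rng.uniform(-5.0, 15.0)
            nxt = high_fidelity_step(indoor, outdoor, heater_on)
            xs.append(indoor - outdoor)  # feature: indoor-outdoor difference
            ys.append(nxt - indoor)      # target: temperature change per step
        models[heater_on] = fit_line(xs, ys)
    return models


def surrogate_step(models, indoor_temp, outdoor_temp, heater_on):
    """Fast approximation of high_fidelity_step using the fitted models."""
    slope, intercept = models[heater_on]
    return indoor_temp + slope * (indoor_temp - outdoor_temp) + intercept
```

An agent could be pre-trained against `surrogate_step` and then refined against the slower simulator. Because the placeholder dynamics here are linear, the surrogate reproduces them almost exactly; a real BPS model would generally require a richer black-box model and a validation step.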

Management
Building energy management systems (BEMS) take a broader perspective by supervising controllers and building energy systems. Management entails continuously monitoring and adjusting various parameters beyond control actions, such as setpoints and schedules, to enhance the building's energy performance, improve thermal comfort [87], minimize operating costs [88], and reduce carbon emissions [89]. In this sub-section, articles concerning the development of BEMS through interactions of ML agents with a simulation environment are discussed (Table 4). Based on the target building(s) and goals, various management strategies are defined for management agents, e.g., determining the indoor temperature setpoint [88] and the energy storage status [89].
In management applications, BPS can facilitate BEMS training by simulating the behavior of the physical building and energy systems [90], energy generation (e.g., [89]), or occupants' comfort (e.g., the occupant vote simulator in [91]). This approach can be applied to a single building or to a cluster of buildings (e.g., a centralized DRL agent trained for energy management of a cluster of buildings [90]). Depending on the complexity of the building environment, a single-agent [90] or multi-agent approach (e.g., two DRL agents for HVAC sub-systems [92] or one agent for each zone of the building [93]) is chosen to manage the building's energy performance. While simulation-trained BEMS agents can be implemented across various building situations and systems, their computational time is an important factor in the building's operational management. As a synthesis of the reviewed studies, using suitable prior knowledge to pre-train these RL-based BEMS reduces computational time. For example, using a defined baseline behavior at the beginning of the training process, rather than making random initial decisions, improves training efficiency [94]. In addition, the knowledge obtained by agents can be transferred in multi-agent implementations to decrease the training time [92].
The performance of BEMS agents trained in simulation environments can be improved by using calibrated simulation models and harnessing real-time data provided by Internet of Things (IoT) sensors [93]. Simulation-based training environments can also be coupled with predictive models that provide future information to support the RL agents' decision-making [87]. Online implementation of these management systems enhances adaptability to unforeseen events and sudden changes and evolves the agent's intelligence over time. However, such an implementation, as guided by the adaptation cycle, has not been addressed by the studies reviewed in this sub-section.

Adaptation phase
Integrating the reviewed decision-making approaches (optimization techniques [65,74,95] and ML agents coupled with BPS) within the adaptation cycle in an online implementation enhances the system's adaptability and intelligence over time. Adaptation can be formulated in different ways, depending on the data provided to the decision-making process by the simulation environment and the outer circle (data acquisition and management). When the adaptive control or management system aims to respond to immediate changes and make quick decisions, we call the approach "reactive adaptation" (e.g., [96,97]). On the other hand, when predictions of variable patterns in future time steps are used to prepare the controller and BEMS for potential changes or challenges before they occur, we categorize the approach as "proactive adaptation" (e.g., [66,98]). By exploiting the potential of predictive models, adaptive proactive control approaches can outperform RBC and non-adaptive DRL controllers [66]. Table 5 outlines the reviewed articles concerning the integration of BPS and ML agents in the adaptation phase.
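One way to picture the difference between the two settings is through the information passed to the agent. The sketch below is illustrative only: the function names, the chosen variables (temperatures and energy price), and the three-step horizon are assumptions, not a formulation taken from the reviewed studies.

```python
def reactive_state(indoor_temp, outdoor_temp, price):
    """Reactive setting: the state holds current measurements only."""
    return [indoor_temp, outdoor_temp, price]


def proactive_state(indoor_temp, outdoor_temp, price,
                    outdoor_forecast, price_forecast, horizon=3):
    """Proactive setting: current measurements plus forecasts for `horizon` steps."""
    if len(outdoor_forecast) < horizon or len(price_forecast) < horizon:
        raise ValueError("forecasts must cover the full horizon")
    return (reactive_state(indoor_temp, outdoor_temp, price)
            + list(outdoor_forecast[:horizon])
            + list(price_forecast[:horizon]))
```

With the same agent architecture, the proactive state simply exposes predicted conditions, so the agent can, for example, pre-heat before a forecast price peak.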
According to the adaptation cycle, a trigger initiates the adaptation phase in order to prepare the decision-making tool for new situations. Depending on how the triggering mechanism is defined, the cycle repeats either on a regular basis (e.g., daily [99]) or after significant changes in real-time conditions (event-based [100]). For instance, a temperature threshold policy was proposed as an event-based triggering mechanism [100]. The scarcity of studies focusing on this topic highlights the event-triggered adaptation cycle as a direction for future research. Following the initiation of the adaptation phase, the simulation environment, calibrated to the measured data of the target building, starts interacting with the decision-making agent during the training (or optimization) process [101]. This approach is an alternative to directly involving the physical building in the initial development process. The simulation model therefore requires calibration tailored to the control action time step (for example, hourly calibration [97]). As a suggestion, performing a sensitivity analysis before the calibration process helps determine the right calibration parameters.
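The two triggering mechanisms can be sketched in a few lines of Python. This is a hypothetical illustration: the `AdaptationTrigger` class, the 24 h period, and the 2 degC deviation threshold are assumptions, and the threshold rule is a generic example rather than the specific policy proposed in [100].

```python
from dataclasses import dataclass


@dataclass
class AdaptationTrigger:
    """Decides when a new adaptation phase should start (illustrative names)."""
    period_hours: int = 24        # regular trigger: re-adapt every N hours
    temp_threshold: float = 2.0   # event trigger: allowed deviation in degC
    hours_since_last: int = 0

    def check(self, measured_temp, expected_temp):
        """Advance one hour and return 'event', 'regular', or None."""
        self.hours_since_last += 1
        if abs(measured_temp - expected_temp) > self.temp_threshold:
            self.hours_since_last = 0
            return "event"
        if self.hours_since_last >= self.period_hours:
            self.hours_since_last = 0
            return "regular"
        return None
```

Calling `check` once per hour with the measured and expected zone temperatures returns `None` during normal operation, `"regular"` when the period elapses, and `"event"` as soon as the deviation exceeds the threshold.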
Optimization techniques are one option for making decisions in the adaptation phase. This approach was adopted for 24 h ahead operational cost optimization of buildings using future price predictions (a proactive adaptation setting) [65,74]. The large number of simulation runs required by the optimization process motivates the use of simulation-trained black-box models instead of using BPS directly [95]. On the other hand, reinforcement learning agents, being adaptive by nature, can evolve over time and adapt to sudden changes, in addition to the regular dynamics of the building, when implemented in the adaptation cycle. Various RL algorithms can be implemented in this framework [102]. These simulation-trained control and management agents can coordinate various building energy subsystems [97,103] and energy generators [103,104] to optimize the building's energy consumption [105], flexibility [99], thermal comfort [96], etc.
Online training of DRL agents using BPS (making decisions and receiving feedback dynamically) addresses the inability of the offline approach to adapt automatically to changes [101]. These agents can make decisions for various planning horizons depending on the application and target building (e.g., 24 h [101], 12 h [66], 3 h [96]). In addition to single buildings, this approach has been employed for controlling multiple buildings with centralized or decentralized agents [106] and hybrid (or multi-) energy systems [104]. Furthermore, implementing multiple DRL agents that interact with each other has been suggested for specific applications [107].
Reducing computational time is important, especially for online training, and some reviewed studies proposed approaches to address this issue. One example is offline pre-training of the agent (e.g., using historical data of the existing controllers [96] or interacting with reference buildings equipped with the same energy systems [108]) before online implementation [109]. In another study, a transfer learning framework was presented to reuse a pre-trained DRL agent for different buildings or climates instead of starting from scratch [105]. An overview of the studies classified as adaptive decision-making agents makes it possible to interpret the benefits of harnessing the capabilities of BPS and the adaptation cycle in developing BEMS and controllers across various implementation settings (reactive or proactive). However, the computational time challenges of BPS models still need further investigation.

Table 5
Articles focused on the integration of BPS and ML agents in the adaptation phase.

Fidelity-complexity challenge in BPS-ML integration
As observed in the reviewed studies, building simulation models emerge as a suitable replacement for physical systems in the training and adaptation process across various applications. Fig. 9 shows the distribution of the reviewed papers according to the role of simulation in the adaptation cycle and the application of the developed ML models. Here, the reviewed applications are classified into four major categories: prediction, optimization, control, and management.
Based on the synthesis of the reviewed studies, the complexity and high computational time of accurate simulation models are important challenges in using BPS in the optimization or training process, especially for iterative processes like the adaptation cycle. For example, Marzullo et al. [85] acknowledged that their high-fidelity simulation model was not fast enough to train the control agent on its own (one hour to simulate a 24 h episode). They therefore developed an alternative training framework with lower computational time.
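A back-of-the-envelope calculation shows why such simulation speeds are limiting. The one-hour-per-episode figure comes from the text above; the number of training episodes is a purely hypothetical assumption, since the episode count required in [85] is not stated here.

```python
def training_wall_clock_days(episodes, hours_per_episode=1.0):
    """Sequential wall-clock time, in days, needed to simulate all episodes."""
    return episodes * hours_per_episode / 24.0


# Assumed example: 1000 training episodes at one hour of simulation each.
days = training_wall_clock_days(1000)
```

Under this assumption, sequential training would take roughly 42 days, which is difficult to reconcile with an adaptation cycle that repeats daily or after each triggering event.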
Computational cost and accuracy are affected by the simulation tool, the modeling language, and the complexity and level of detail of the developed model [110]. In this regard, a variety of simulation software, simplification methods, and calibration strategies have been investigated to find a balance between the model's complexity and fidelity [46]. Selecting the appropriate simulation software is an important step in developing simulation models. This step is performed considering the building's characteristics, energy systems (e.g., storage systems [103]), energy resources (e.g., PV panels [89], borehole fields, etc.), and objectives. For example, implementing simulation models in the adaptation cycle requires the possibility of continuous interaction between the simulation software and the control agents [99]. In addition to differing capabilities, accuracy and computational time are other important factors in selecting simulation tools. A comparison of EnergyPlus, TRNSYS, IDA-ICE, and Dymola examined properties such as accuracy, field of application, and co-simulation capabilities [110]. Fig. 10 shows the simulation software most frequently used in the reviewed studies.
There is an obvious trend in the reviewed articles toward using EnergyPlus for ML training applications. EnergyPlus was chosen in most of the reviewed studies because it is an open-source whole-building simulation tool that enables accurate dynamic modeling of complex buildings and provides comprehensive capabilities for simulating various aspects of buildings [111]. Since EnergyPlus supports co-simulation with other engines, it can also be used to develop a simulation-based training environment for implementation in the adaptation cycle [92,101].
After selecting the appropriate simulation tool and modeling the building, finding a tradeoff between complexity and fidelity and calibrating the simulation model to measured data are required for the successful implementation of the model in the adaptation cycle [46]. For instance, simplifying the simulation model (e.g., simplifying the building geometry) reduces the computational time; however, it carries the risk of compromising the fidelity of the model. Calibrating the model to the measured data obtained from the actual building enhances the fidelity of the model and makes it a better replica of the building.

Fig. 9. Distribution of reviewed papers based on the application of their models.

Unknown building parameters and variables, such as occupant behavior, heat gains, building envelope features, the characteristics of heating, cooling, and ventilation systems, and the thermal properties of the building, together with uncertainties resulting from simplifications, can lead to simulation errors ranging from minor deviations to substantial discrepancies [112-114]. Identifying the most influential parameters of the model (using sensitivity analysis) and then tuning them so as to minimize mismatches between predictions and real-world observations can lead to the development of high-fidelity models. Also, hourly calibration ensures that the model captures the dynamic behavior of the physical system [97]. Within the reviewed studies, only 11% of articles mentioned using a specific calibration method in their approach. They used calibration to address uncertainties related to user behavior, building operation, etc.
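The two-step procedure of sensitivity analysis followed by parameter tuning can be sketched as follows. This is a minimal illustration under stated assumptions: `simulate` is a hypothetical steady-state heating model with two parameters (`ua` and `gains`), the sensitivity analysis is a simple one-at-a-time perturbation, and calibration is a grid search on RMSE. The reviewed studies may use different models, metrics, and search methods.

```python
import math


def simulate(params, outdoor_temps, indoor_temp=21.0):
    """Hypothetical toy model: hourly heating demand from a UA value and gains."""
    return [max(0.0, params["ua"] * (indoor_temp - t) - params["gains"])
            for t in outdoor_temps]


def rmse(predicted, measured):
    """Root-mean-square error between simulated and measured series."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured))
                     / len(measured))


def sensitivity(params, outdoor_temps, rel_step=0.1):
    """One-at-a-time sensitivity: mean absolute output change per parameter."""
    base = simulate(params, outdoor_temps)
    result = {}
    for name in params:
        perturbed = dict(params)
        perturbed[name] = params[name] * (1 + rel_step)
        out = simulate(perturbed, outdoor_temps)
        result[name] = sum(abs(o - b) for o, b in zip(out, base)) / len(base)
    return result


def calibrate(params, name, candidates, outdoor_temps, measured):
    """Grid search on one parameter, keeping the value with the lowest RMSE."""
    best = min(candidates,
               key=lambda v: rmse(simulate({**params, name: v}, outdoor_temps),
                                  measured))
    return {**params, name: best}
```

A typical use would rank the parameters with `sensitivity`, then pass the most influential one to `calibrate` together with hourly measurements, so that the calibration time step matches the control time step.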
Despite the importance of the simulation model's fidelity and computational time, the majority of the reviewed studies did not report the details of their simulation setup. Key details worth reporting in studies that integrate BPS into the ML training process are the simulation software, simplification strategy, calibration method, accuracy, and simulation time.

Discussion
In this review, we focus on the role of building simulation in training machine learning models, with a special emphasis on adaptive models and the adaptation cycle. There is a research trend toward simulation-trained adaptive models that can adapt to uncertainties and real-time events. Continuous adaptation to real-time conditions through a cyclic process (the adaptation cycle) gives rise to adaptability to unforeseen events such as COVID-19, heat waves, and energy crises (e.g., due to war). Thus, this paper discussed different adaptation types and levels in the context of proactive adaptation (considering future predictions) and reactive adaptation (immediate response to changes) for each reviewed article. The choice of adaptation type depends on the specific application and the desired level of adaptability in each paper.
Building performance simulations (BPS) play an important role in the adaptive training of machine learning models for different applications by reducing the need for large historical datasets and real-world training environments. In the first part of the paper, we discussed the ability of simulation environments to generate training datasets covering a wide range of scenarios, conditions, and low-probability events. Based on the reviewed papers, ML models developed with simulation-based datasets are usually used for prediction and optimization applications in the contexts of building energy consumption, demand, and thermal comfort. In the second part of the article, decision-making agents and their adaptive training using BPS as a training environment were addressed. The performance of AI controllers and management systems, which can provide adaptability, generalizability, computational efficiency, etc., can be enhanced through integration with BPS. Thus, controllers and BEMS, which regulate the HVAC system's characteristics, temperature setpoints, and schedules, can benefit from simulation-trained AI agents. In this respect, ML-based BEMS and controllers can evolve in virtual environments, learn how to handle real-world situations, and adapt to unforeseen events.
Simulation models can potentially address challenges related to the heavy reliance on large amounts of historical data in the ML training process. These challenges include the need for years of measured history, privacy concerns, high cost, etc. Simulations can efficiently synthesize a wide range of scenarios to create diverse datasets, including hypothetical situations and disruptions that cannot be derived from historical data. Additionally, measured training datasets may be biased due to physical system malfunctions. Hence, simulation models configured to represent fault-free operation can teach AI agents how the system should work correctly even when the measured data are biased.
In the reviewed studies, some simulation-trained ML models were implemented in reference and predesigned building models to investigate the adaptive performance of the proposed approaches. ML training using reference building models offers simplicity, accessibility, and computational efficiency, but does not represent realistic and diverse real-world scenarios. On the other hand, simulation environments calibrated to actual buildings take into account the variability and complexity of real buildings, which are valuable for developing adaptive ML models. Continuous interaction and synchronization between simulation and physical models enhance the fidelity of simulation environments, thereby improving the adaptability of trained models in real-world implementations. However, data quality challenges remain when dealing with measured data affected by system malfunctions or faults, which highlights the need for accurate fault detection algorithms.
In the context of adaptive systems, simulation-based training approaches built on online adaptation processes can achieve a higher level of adaptability than counterparts such as offline training. However, the high computational time of using high-fidelity simulation models in the adaptation cycle (a repeating process) gives rise to some limitations. In this regard, creating data-driven black-box models from datasets collected from high-fidelity simulations is a potential research topic that can address these obstacles.
Finally, adaptive training that exploits the potential of simulation environments can be performed based on the proposed adaptation cycle framework, which contains adaptation, operation, and data accumulation phases. Following the adaptation cycle, this review discussed the different steps and principles involved in the approaches proposed by the reviewed articles. The adaptation process, which begins with a manual (human input) or automatic (regular/event-based) trigger, leads to a series of optimum and autonomous decisions suited to real-time situations. This intelligence, which results from the experience collected in previous cycles and from the interactions of the AI agent, the simulation, and the physical model, increases incrementally. This evolution can be described using the adaptation spiral. According to the adaptation spiral, every cycle of this repeating process increases the accumulated data and experience of the adaptive controller and consequently makes it better able to make optimum and autonomous decisions in any situation. Thus, through continuous interactions of the agent with the virtual and physical systems, the management/control system becomes smarter and makes better-informed control decisions.
Based on our understanding of the existing literature, we suggest two potential future research directions: treating adaptation as an iterative process that enhances the building's intelligence and autonomy during the operational phase, and exploring the role of high-fidelity black-box models in the adaptive training of ML models in various building applications. Further research is also needed to explore how intelligence is enhanced over time and the short/long-term resilience of adaptive models. In particular, the relation between the required decision-making episodes and corresponding computational time on the one hand, and the achieved adaptability level and short/long-term resilience of the BEMS and control systems on the other, should be addressed.

Conclusion
In this paper, articles concerning the use of building simulation environments for training machine learning models were reviewed. These papers were organized, according to the role that simulation models play within the adaptation cycle, into two groups: training data acquisition and training environment. Then, the reviewed applications were classified into four major categories: prediction, optimization, control, and management. Here, we mainly focused on simulation-based ML models that were able to adapt to real-time situations and make resilient decisions when facing unpredictable circumstances. Based on the reviewed articles and their approaches, we conclude that:
• In 57% of the reviewed studies, BPS was used for generating training datasets (in the outer circle of the adaptation cycle), while in 43% it participated in the adaptation cycle's inner circle as a training environment.
• Out of the BEMS and control applications reviewed, 48% of decision-making AI agents were adaptive. Interestingly, 37.5% of these adaptive agents were trained in a proactive setting. This highlights the need for further research in this domain.
• In BEMS applications, 53% and 40% of AI agents were trained to make decisions about zone temperature setpoints and storage schedules, respectively. However, there is a scarcity of use cases for developing simulation-based adaptive virtual power plants making autonomous energy market decisions.
• In 81% of the reviewed studies, the adaptation process was initiated on a regular basis. This highlights the event-triggered adaptation cycle as a direction for future research.
• High-fidelity BPS models and digital twins will play important roles in shaping the future of adaptive BEMS, yet further research addressing their computational time challenges and the role of black-box models in this context is needed.
All in all, the adaptation cycle, supported by the potential of BPS, serves as a robust framework for understanding the adaptation process and contains almost all the steps required for developing an adaptive training approach. Following the proposed adaptation cycle, this paper discussed the different phases and stages involved in both reactive and proactive adaptation settings for training AI agents. The review was based on the Web of Science online database in combination with supplementary search methods. However, further research is still needed to explore the intelligence enhancement paradigm and the short/long-term resilience of adaptive models based on the adaptation spiral.

Fig. 5. Schematic diagram of using simulations for training data acquisition.

Fig. 7. Classification of reviewed predictive models based on simulation model representation and building type.

Table 1
Indication of perspective gaps existing in baseline reviews regarding the adaptation process.

Table 2
Search query details.

Table 3
Details of articles concerning simulation-trained ML models for prediction applications.

Table 4
Articles focused on the integration of BPS and ML-based decision-making agents.