Probabilistic occupancy forecasting for risk-aware optimal ventilation through autoencoder Bayesian deep neural networks

Ventilation plays a noteworthy role in maintaining a healthy, comfortable and energy-efficient indoor environment and mitigating the risk of aerosol transmission and disease infection (e.g., SARS-COV-2). In most commercial and office buildings, demand-controlled ventilation (DCV) systems are widely utilized to conserve energy based on occupancy. However, as the presence of occupants is often inherently stochastic, accurate occupancy prediction is challenging. This study, therefore, proposes an autoencoder Bayesian Long Short-term Memory neural network (LSTM) model for probabilistic occupancy prediction, taking account of model misspecification, epistemic uncertainty, and aleatoric uncertainty. Performances of the proposed models are evaluated using real data in an educational building at the University of Cambridge, UK. The models trained on data of one open-plan space are used to predict occupant numbers for other spaces (with similar layout and function) in the same building. The probabilistic occupant profiles are then used for estimating optimal ventilation rates for two scenarios (i.e., normal DCV mode for energy conservation and anti-infection mode for virus transmission prevention). Results show that, during the test period, for the 1-h ahead prediction, the proposed model achieved better performance with up to 5.8% mean absolute percentage error reduction than the traditional LSTM model. More flexible alternatives for ventilation can be offered by the proposed risk-aware decision-making schemes serving different purposes under real operation. The findings from this study provide new occupancy forecasting solutions and explore the potential of probabilistic decision making for building ventilation optimization.


Background
Ventilation plays an essential role in building energy conservation, improvement of indoor air quality as well as prevention of respiratory disease (such as syndrome caused by a SARS-COV-2). To achieve acceptable indoor environmental control with minimal operation costs, existing standards such as ASHRAE 62.1-2016 [1] and CEN EN15251-2007 [2] prescribe the effective minimum ventilation rates to safeguard acceptable indoor air quality based on the occupancy estimation. Demand-controlled ventilation (DCV) [3] has been introduced to conserve energy by adjusting ventilation rates in response to changes in occupancy or indoor pollutant concentration. In the continuing COVID-19 pandemic, infection control as a new ventilation control objective should be considered to reduce the transmission of the virus.
As the prescribed outdoor airflow rate of public buildings by current standards is insufficient to control the infection risk at a low level [4], it is essential to develop ventilation control schemes so as to reduce infection probability based on occupancy level.
Currently, various industry associations and public health organisations have issued ventilation guidelines in response to COVID-19, as listed in Table 1. In summary, to reduce the cross-infection of airborne transmitted diseases, the currently accepted approach is to increase the ventilation rate using large amounts of fresh air and extend the ventilation periods to dilute/remove contaminants based on occupancy levels. Simple rule-based ventilation control strategies are widely used in practical applications. For instance, the ventilation systems are suggested to operate at least 2 h with the maximum volume before and after occupied times suggested by most guidelines. Furthermore, researchers have developed real-time ventilation control strategies to reduce the risk of respiratory infection. Sha et al. [5] determined the optimal ventilation rates based on a modified Wells-Riley model to fulfil the requirements of both dilution ventilation and ventilative cooling. Wang et al. [6] proposed a metabolism-based ventilation control method based on occupants' activity intensity estimation, reducing the infection probability down to 4.3-6.3% while saving 13% of energy compared with fixed-fresh-air ventilation. These control methods determine the ventilation rates based on the real-time collection of occupancy data. However, since the fresh airflows supplied to occupied spaces are coupled with the ventilation duct network, in practice, it usually takes a relatively long time to make airflows achieve their desired values, resulting in a delay in pollutant dilution and removal [7]. Predicting future occupancy profiles could be a promising approach for achieving proactive ventilation operation. However, due to the stochastic nature of occupant presence and the inherent variability in behaviour amongst individuals, accurate occupancy prediction is challenging.
Over the past decade, forecast algorithms have significantly improved the accuracy of occupancy predictions. Generally, forecast methods can be broadly classified into four groups: conventional statistical approaches, unsupervised machine learning approaches, supervised machine learning approaches and hybrid approaches [14]. The mainstream statistical approaches are Markov chain-based and recursive models. For instance, Candanedo et al. [15] utilized the hidden Markov models to define occupancy schedules based on temperature, humidity, light, and occupancy measurements. Salimi et al. [16] developed an inhomogeneous Markov chain model to extract occupancy information with varying time steps from collected occupancy data.
Burak et al. [17] adopted a recursive occupancy learning algorithm based on maximum likelihood estimation to update the estimate of occupant profiles.
More recently, machine learning-based models began to dominate the literature, replacing traditional statistical models [18]. As typical unsupervised models, the clustering algorithms, such as k-means, k-nearest neighbour techniques and support vector clustering, were introduced to identify occupancy patterns in both residential [19,20] and commercial buildings [21,22]. The cluster results could further represent different occupancy patterns, distinguishing weekdays from weekends. For supervised models, studies have analysed the forecast performance of gradient boosting [23], support vector regression [24], decision tree [25], random forest [26], and deep neural networks [27]. Particularly, Long Short-term Memory (LSTM), one of the Recurrent Neural Network architectures, is well designed for time-series prediction due to its ease of incorporating exogenous variables, and automatic feature extraction abilities. Kim et al. [28] developed an LSTM deep learning approach for predicting real-time occupancy in a large exhibition hall with a prediction accuracy of over 84%. Qolomany et al. [29] found that the LSTM-based occupancy forecasting method significantly reduced the root mean square error (RMSE) by 80.9%-93.4% in a commercial building compared with Auto Regression Integrating Moving Average models. Normally, most existing occupancy forecasting models derive point forecasts based on the input of continuous historical occupancy data and contextual information, such as hours, weekdays,

Nomenclature
English symbols a 0 -a 3  and holidays. However, deterministic models often neglect uncertainty inherent to the non-deterministic occupant behaviour and thus fail to derive variances of occupancy predictions, especially when using an incomplete dataset for model training. Forecasting occupancy in a probabilistic manner with uncertainty bound or confidence interval enables the decision-makers to make risk-informed building energy management and ventilation operations [30]. Another challenge encountered is that some buildings or zones do not install sensors to detect occupancy but need energy-efficient ventilation control. In such buildings, the ventilation rates are calculated based on maximum occupancy conditions [31], resulting in over-ventilation and energy waste during low-occupancy periods. Since data on occupant behaviour are hard to collect and privacy considerations may preclude making the data widely available, occupant-related energy and environmental data are considered as an alternative to indicate the occupant density level [32]. For instance, electrical loads, such as plug-in equipment and lighting loads, have been shown to exhibit a strong relationship with building occupancy. It is reported that the R-squared values between plug loads and the occupant counts by linear regressions could achieve 0.65-0.78 [33,34]. In addition, environmental monitoring variables, such as sound, relative humidity, air temperature and CO 2 , can also be used as the exogenous variables to train the forecasting model [35]. Due to the increasing penetration of smart meters in the residential sector [36] and mandatory requirements for smart sub-metering systems in the commercial sector regulated by building codes [37], occupant-related energy data are more feasible to obtain. In the context of occupancy prediction, it is possible to conduct the occupancy forecasting for the zones without explicit occupant counts, by taking advantage of the rich data/knowledge obtained from other zones that have similar activities and applying transfer learning [38] or model transferrable methods [39].

Problem definition and research objectives
Based on the literature review, the following observations were made: (1) As an important feature of occupancy, building electrical load (i.e., plug and lighting electrical loads) should not be ignored in occupancy prediction. (2) Predicting the occupancy in a probabilistic manner contributes to the risk-aware ventilation decision making, due to the random nature of occupant behaviour. (3) Different operation decision schemes are required in the face of different ventilation purposes (e.g., fulfilling the requirement of thermal comfort and indoor air quality under non-pandemic periods or reducing the risk of infection under pandemic periods). The key research questions of this study are: o Which factors noticeably affect the occupancy prediction in buildings? Which features should be chosen to predict occupancy? o How are the different uncertain sources taken into account in model development? Could the developed models be transferrable to other zones with similar functions and activities? o How are the predicted probabilistic occupancy profiles utilized for risk-aware ventilation decision making?
To address the above-mentioned research gaps, this study develops an autoencoder Bayesian Long Short-Term Memory Network (ABLSTM) model for probabilistic occupancy prediction, enabling the risk-aware decision-making schemes for optimal ventilation under pandemic and non-pandemic periods based on quantified occupant profiles. To verify the proposed strategy, case studies were conducted in an educational building located at the University of Cambridge, UK. The performance and the transferability of the proposed models were investigated and compared with conventional deep neural networks.
The main contributions of this study are briefly summarized as follows:1) An autoencoder Bayesian deep neural network model is developed for probabilistic occupancy-based ventilation operation, which can take account of different uncertain sources (i.e., model misspecification, aleatoric and epistemic uncertainties). 2) The transferability of the probabilistic model framework is tested and validated for zones (similar activities) with and without explicit occupancy counts. 3) Risk-aware decision-making schemes (i.e., normal DCV and anti-infection modes) are provided for different ventilation requirements under pandemic and non-pandemic periods.

Outline of probabilistic occupancy forecasting method for optimal ventilation
The outline of the proposed probabilistic occupancy forecasting method for optimal ventilation is shown in Fig. 1, which consists of four major tasks to be addressed. The first task is to pre-process the datasets (e.g., occupant counts, plug loads and lighting loads), including data cleaning, data transformation and normalization. The second task is to build an autoencoder Bayesian LSTM (ABLSTM) framework, using the information currently available (at timestamps t, t-1, t-2 …) to forecast the occupant number for the future timestamps (t+1, t+2 …) in a probabilistic manner. Two ABLSTM models, suitable for zones with and without explicit occupancy counts respectively, are built to forecast the occupant number at different sequential input features. The third task involves the test and validation of the model transferability. The transferability test is performed, in which the models trained by one floor of a building are applied to new data from the other two floors of the same building. The fourth task is to make the risk-aware decision-making for ventilation operations based on the forecasted occupant profiles. The probabilistic decision-making schemes could be applied in two ventilation operation modes: 1) normal DCV mode under non-pandemic periods and 2) anti-infection mode under pandemic periods. The details of each task are illustrated in Sections 2.2 to 2.5, respectively.  [9] ✓ Start ventilation at the nominal speed at least 2 h before the building usage time and switch to a lower speed 2 h after the building usage time ✓ Change CO 2 setpoint to lower 400 ppm to assure the operation at nominal speed ✓ Keep the ventilation on 24/7, with lowered ventilation rates when people are absent ASHRAE [10] o Increase outdoor air ventilation o Disable demand-controlled ventilation o Open minimum outdoor air dampers and eliminate recirculation CCIAQ [11] ✓ Increase the amount of air exchanged per hour ✓ Operation of ventilation systems should be beyond the hours of occupancy (2 h before and after) ✓ Provide more outdoor air than the current maximum (i.e., preferably 100% outdoor air and exhaust) WHO [12] o Set the minimum mechanical ventilation rate higher than 10 l/s/ person. o Operate HVAC system with maximum outside airflow for 2 h before and after occupied times CDC [13] ✓ Increase the introduction of outdoor air by opening outdoor air dampers beyond minimum settings ✓ Rebalance or adjust HVAC systems to increase total airflow to occupied spaces when possible ✓ Turn off any DCV controls ✓ Run the HVAC system at maximum outside airflow for 2 h before and after the building is occupied 2.2. Data collection and pre-processing

Descriptions of test building and datasets
An educational building located at the University of Cambridge, UK was selected as the testbed for this study. The data were collected from the first to third floors with similar layouts, in which the primary use is open-plan offices. The view of the building and its first-floor plan is presented in Fig. 2. The design occupant capacity was 50 for each floor. Recessed LED modular luminaires with daylight-linked dimming, presence detection and manual override were controlled by an infrared handheld controller. The data for hourly plug-in power, lighting power, CO 2 concentration, indoor temperature, and fan operating status from June 2018 to July 2019 were collected. During the monitoring period, all the windows were closed, and the dedicated outdoor air ventilation system with constant-speed fans served the occupied spaces.
Methods for occupant detection found in the literature include CO 2based models [40,41], camera-based sensors [31,42], radio-frequency identification systems [43], and mobile sensing-based detection [43]. In this study, occupant counts are indirectly collected based on measured indoor CO 2 concentration while the plug-in and lighting load data were directly collected from the building's submetering system. The CO 2 -based models, which treat the indoor CO 2 concentration as uniform, are selected to detect occupant counts, and the indoor occupant number (O) can be solved from the mass balance equation: where V r is the volume of the room (l). Q sup is the supply airflow rate (l/ s), obtained from the design flow rate and on/off status of constantspeed fans. s is the CO 2 generation rate per person (5 mg/s) following the settings in ASHRAE Standard 62.1-2016 [1]. O t is the number of occupants. c t and c sup t are the CO 2 concentration (ppm) of indoor air and supply air at the time instant t (s), respectively. Fig. 3 illustrates the occupant-based power use and occupancy data collected from the test building. The plug load data was collected from the building's submetering system, while the occupant counts were calculated by Eq.1, based on the fan specification, measured CO 2 data  and fan operating data. Fig. 4 presents the linear regression results for occupant counts with plug and lighting loads during the test period. The R-square value varied from 0.47 to 0.55 for the linear regression between plug-in loads and occupant counts and from 0.43 to 0.50 for the linear regression between lighting loads and occupant counts, which shows a strong correlation between occupant-based internal loads and occupant counts. It can be found that the lighting power density appeared to reach a plateau due to a maximum number of lighting appliances used, whereas plug-in power density continued to increase with the number of occupants. Although the lighting power density more likely reached maximum values at the high occupancy level, there was also a probability that light loads at the peak value under low occupancy level as office users tended to leave the lighting on when they temporarily left their offices. In contrast, the level of plug-in loads certainly increased with the increase of the occupancy level. For instance, the action of turning on plug-in appliances (i.e., desktop PCs) could be a strong signal that occupants were entering their offices.

Feature selection, data transformation and normalization
The input selection of a forecast model is key to its performance because it reflects deep correlations between the inner physical features and characteristics of occupancy. The inputs of forecast models can be divided into three categories: (1) historical occupancy data, (2) building electricity usage and (3) calendar information. Historical values of occupancy counts (or the estimation of them) are traditionally used as  inputs to occupancy prediction models, due to their usual high correlation with future values. Sub-metered electricity data (i.e., plug loads and lighting loads) are selected as they are highly correlated to building occupant numbers as presented in Fig. 4. Furthermore, for open-plan office spaces, since most occupants follow a daily work schedule, calendar information such as the hour of day and day of week (i.e., weekdays versus weekends/holidays), can capture the cyclic daily behaviour of occupants.
In this study, two sets of sequential input features with the same time lag of 24 h are selected to build predictive models for different purposes, as shown in Table 2. Option A includes all three input categories, which selects the 24 lagged plug-in power variables (P t-1 , P t-2 , …, P t-24 ), 24 lagged lighting power variables 24 ) and calendar variables (e.g., hour of day [0,1] 24 , day of week [0,1] 7 ). Option B includes the last two input categories, which remove inputs of the 24 lagged occupant counts to test whether it is possible to use the information of occupant-based electricity usage for occupancy prediction. Option A and Option B are suitable for zones with and without explicit occupancy counts, respectively, as the building energy data are most available for zones regulated by the building codes, but the occupancy sensors are not always installed. Here, for the zones without occupancy sensors, Option B can be potentially applied by transferring the knowledge learned from zones with occupancy information.
The purpose of data transformation is to provide a suitable dataset for input selection and model development. Data transformation methods are used to covert the categorical variable (e.g., calendar information) into dummy/indicator variables. Data normalization aims to eliminate the influence of the scales of data/measurement. All the candidate inputs, including the plug loads, lighting loads, and occupant counts, are normalized using max-min normalization as shown in Eq.2, where X = {x 1 , x 2 , …, x n } and X is the normalized data of X.
The data are further split into three sets: the training, validation, and testing sets. The testing sets are selected from the last month of the data, whereas the training and validation sets are split randomly using the rest data with a proportion of 7:3. The training and validation sets are used for hyperparameter optimization and model training, while the testing sets are used for the performance evaluation of models.

Probabilistic deep learning framework
The complete architecture of the developed ABLSTM framework for occupancy forecasting contains two layers as shown in Fig. 5: (i) an unsupervised learning layer, autoencoder, that captures the inherent pattern in the time series and is learned during pre-training, and (ii) a supervised prediction network that receives input both from the learned representative embeddings within the autoencoder framework as well as external features (e.g., calendar information) to predict the occupancy.
Three types of prediction uncertainty are considered in this deep learning framework [44]: model misspecification, epistemic uncertainty, and aleatoric uncertainty. Model misspecification refers to the assumed model class used in a learning/reasoning algorithm that represents an imperfect approximation to reality. Aleatoric uncertainty represents the variability of true random or uncontrollable processes, which is irreducible because there will always be variability in the underlying variables. Epistemic uncertainty represents deficiencies by a lack of knowledge, information or expertise, which is reducible at least theoretically by searching for knowledge that allows predicting its outcome with greater accuracy.

Autoencoder
An autoencoder layer is introduced to eliminate model misspecification when predicting unseen samples with very different patterns from the training data sets. The autoencoder layer consists of an encoder and a decoder. The encoder extracts representative features from a time series from the training data set, and the decoder reconstructs the original data back to the original input space [45]. This network is trained by minimising reconstruction errors with a loss function, which measures the differences between the original input and the consequent reconstruction. Let the encoder be given by f(X ) and the decoder be given by g(X ), the autoencoder then minimizes the loss function ζ (e.g., mean square error) as represented by Eq.3, where input is given by X ∈R d and d is the dimension of the input.
In this study, both the encoder and decoder adopt LSTM Networks. LSTM is a special subtype of deep learning neural network that has the capability to handle sequential data, such as forecasting time-series building energy demand [45], indoor air temperature [46] and PV power generation [47]. It learns long-and short-term impacts from past time series and outperforms traditional non-linear machine learning algorithms by introducing three special gates, namely forget gate, input gate and output gate. Forget gate decides what is relevant to keep from "historical cells", given by Eq.4. The input gate decides what new information in x t should enter the cell state, accomplished by Eqs. 5-6. Then the old state was multiplied by f t to forget what is unnecessary and to update from C t− 1 to C t , given by Eq.7. Eventually, based on C t , h t− 1 and x t , the output gate provides the value of the next hidden state h t by Eqs. [8][9].
where f t , u t , and o t denote the update gate, forget gate, and output gate at time t, respectively.  autoencoder's inputs are the lagged occupant counts, lighting loads and plug loads, while for the proposed model with Option B, the autoencoder's inputs are lagged lighting loads and plug loads. The external features are the calendar information for both models.

Bayesian deep neural networks
A Bayesian LSTM will be further applied for the prediction of timeseries data, taking account of epistemic uncertainty and aleatoric uncertainty. In this study, a Monte Carlo (MC) dropout approach [48] is used for approximate Bayesian inference, to capture the ignorance of the model parameters and reduce the impacts of epistemic uncertainty. This method applies stochastic dropout in each weight layer during model training and also performs dropout at the testing process to provide an ensemble of predictions from the approximate posterior [49]. Specifically, a diagonal matrix diag(d) is applied to the weight matrix (W) and bias vector (B) in each hidden layer, as given by Eq.10. All diagonal elements of the diag(d) are Bernoulli random variables.
To consider the impacts of aleatoric uncertainty, different from that (typically mean square error) of deterministic regression models, the loss function is adopted given by Eq.11 during the model training, which captures noise inherent in the observations. It consists of two parts: the residual regression obtained with stochastic samples and an uncertainty regularization term. Therefore, the training process can be conducted by supervised learning of the regression task without "uncertainty labels" [50].
where the model predicts a mean ўy^ and variance σ 2 , and y denotes the actual output (i.e., occupant number).
To optimize the hyperparameters of the developed ABLSTM models, the Bayesian optimization [51] in the Keras Tuner [52] is implemented. In this study, the hyperparameters studied include the number of hidden neurons in each hidden layer, number of hidden layers, learning rate, and dropout rate.

Quantification of model misspecification, aleatoric uncertainty and epistemic uncertainty
As aforementioned by Eq.11, both the means (y^1,ŷ 2 y^2,…y^M) and variances (σ 1 2 , σ 2 2 , …σ M 2 ) of the occupancy forecasts can be directly derived from the ABLSTM model. As the encoder reconstructs the inputs in embedding spaces (accounting for potential model misspecification) and gets further propagated through the prediction network with random dropout, the variances are therefore the aggregated uncertainties taking account of both model misspecification and aleatoric uncertainty, as given by Eq.12. The epistemic variance of the occupancy forecasts is obtained by calculating the variance of sampled outputs using Monte Carlo, as given by Eq.13.
where, M is the number of Monte Carlo samples, taking 200 in this study. j is a certain positive integer number.

Model transferability test
The three open-plan floors in the test building have similar functions and floor areas, mainly the offices used by research students and staff. As curriculum schedules and teaching activities varied on each floor, the occupant pattern and behaviours may differ. For instance, although the design occupant capacity was 50 for each floor, the maximum occupant counts were different as presented in Fig. 4. To transfer representations of occupancy from one floor to another floor, the models trained by data of the first floor are used or fine-tuned to predict occupant numbers for the second and third floors to explore the model transferability.
For Option A, the test process consists of the following parts. (1) The source datasets from the first floor are used to pre-train a base ABLSTM model. (2) The target training and validation dataset from the second and third floors are used to fine-tune base ABLSTM and thus we can get For Option B, as the occupant counts are assumed unknown, the test process consists of the following parts: (1) The source dataset from the first floor is used to pre-train an ABLSTM model. (2) The structure and weights of the ABLSTM model will be directly applied to target datasets, while the predicted normalized occupant counts (Ȏ n ) from ABLSTM will be scaled back via inverse transform function to the original representation (Ô) based on the design occupant number (d oc ) and baseline occupant number (b oc ) of the target spaces, as presented by Eq.14. In this study, the baseline occupant number is set to zero based on the site spot observations. The flowchart of the model transferability test can be found in Fig. 6.

Baseline model
The conventional LSTM neural network is chosen as the baseline model to demonstrate the performance improvement of developed ABLSTM models. The hyperparameter tuning, model training, and feature selection are carried out in the same way as for those in the ABLSTM model.

Performance metrics
Five predictive performance metrics are used for evaluating the model performance. Three of them are for deterministic accuracy, i.e., Root Mean Square Error (RMSE), the Mean Absolute Percentage Error (MAPE), and the Coefficient of Variance of the Root Mean Square Error (CVRMSE) as given by Eqs.15-17. These indices are used to evaluate the deviations between the predictive mean and the real occupant number.
The other two are for uncertainty accuracy, χ-accuracy [53] and prediction interval coverage probability (PICP), as given by Eq.18 and Eq.19. These two indices are used to evaluate the performances of the predictive variances due to model uncertainty. The χ-accuracy can be calculated by adjusting the error tolerance threshold χ of correct prediction. For instance, the 0-accuracy (χ = 0) means a prediction is regarded as correct only when this prediction perfectly matches actual occupancy. PICP is defined as the proportion of observed occupant numbers located at the predicted occupant number confidence interval. The larger the PICP is, the better the model performs. It is worth noticing that PICP is only applicable for the proposed ABLSTM, as the baseline LSTM only derives deterministic values.
where O i and Ô i are observed and predicted occupant number at the hourly resolution, respectively. O is the mean value of the N actual occupant number. Ô i is a predictive mean at the hourly resolution. i is a certain positive integer. N is the size of the test data set. χ is the tolerance level. l i and u i are the lower and upper bounds of the predictive occupant number at the ith hour. It is worth noticing that, for actual practice, if the occupant level inside a zone or room is relatively low, it is not necessary to turn on the ventilation systems or open the windows because the indoor infiltration can provide sufficient fresh air. Therefore, to assess the performance of all models, MAPE and CVRMSE are used to evaluate the performances of the predictive means of the developed models when the actual occupant number is higher than 5 as the served space is a large-scale open-plan office (over 380 m 2 ). The tolerance level is set to 3 for χ-accuracy. Besides, the threshold of occupant number for calculating the model accuracy should be adjusted according to the floor area of the served spaces and different operating modes.

Risk quantification and risk-aware decision-making schemes for optimal ventilation
The predicted probabilistic occupant profiles from developed models are used to evaluate the required ventilation rates for zones. Two operation modes, normal DCV mode and anti-infection mode, are proposed and can be integrated into the building management systems for different ventilation requirements under pandemic and nonpandemic periods, respectively. The normal DCV mode is mainly intended to save energy with the satisfaction of indoor air quality in response to changes in occupant number. The anti-infection mode, which is a special form of DCV, aims to reduce the risk of individual infection by introducing the outdoor air to dilute the indoor air contaminant based on occupancy density. Generally, the airflow rate required per person by anti-infection mode will be higher than that of normal DCV mode. For instance, it is reported that to control the infection risk in offices below 2%, the ventilation rate should be improved from 30 m 3 /(h⋅person) (a designed rate of normal DCV mode) to 225 m 3 /(h⋅ person) under the design occupant capacity with a 2-h exposure time [4].
In this study, the ventilation decision is made by evaluating the probability of ventilation inadequate (R n ) under normal DCV mode and individual risk of infection (R c ) under anti-infection mode respectively.

Normal DCV mode
The probability of ventilation inadequacy (R n ) is used as an index to quantify the performance of the ventilation system under normal DCV mode during the non-pandemic period. R n is defined as the probability that the demand ventilation rate is lower than the supply ventilation rate, as shown in Eq.20.
where, Ф is the cumulative density function. 1 , v sup,2 , …, v sup,n ] are the probabilistic demand and supply ventilation rates respectively (l/s). In the study, we select the scenario where the mechanical ventilation systems are turned off, and all windows of the zone are closed. Therefore, the supply ventilation rate v sup approximates the infiltration rate (v inf ).
The demand ventilation rate (v dem ) is adjusted based on detected occupancy and occupied area by adopting the DCV strategy, while the infiltration (v inf ) is calculated using an empirical method described by Ref. [54], as per Eqs.20-21: where P tot is the total occupancy. A zone is the served floor area (m 2 ). R P and R b are the fresh air requirement per person (l/s⋅person) and per unit area (l/s⋅m 2 ) respectively, specified by Ref. [1]. V inf,d is the design infiltration rate (l/s), set as 0.00137 m 3 /s/m 2 of building envelope surface area [55]; |ΔT| is absolute indoor-outdoor temperature difference ( • C); U H is wind speed (m/s); a 0 , a 1 , a 2 , a 3 are regression coefficients. We used a 0 = 0.606, a 1 = 0.036, a 2 = 0.1177, a 3 = 0 for the regression coefficients, the coefficient scheme recommended by the EnergyPlus predecessor BLAST [56]. Taking account of the measurement uncertainty of the anemometer as well as the design infiltration rate uncertainty, the distributions of local wind speed and design infiltration rate are set following the relative normal distribution (1, 10%) [57][58]. The distribution of air infiltration rates can therefore be obtained by propagating these two uncertainties.

Anti-infection mode
The individual risk of infection (R c ) is used as an index to quantify the performance of ventilation systems under an anti-infection mode during the pandemic period. We consider that individuals are far apart from each other (the recommended safe social distance, i.e., 2 m [59]), and there is no risk of short-range transmission either by large or small droplets/aerosols, nor by contact with contaminated surfaces. Based on the perfect-mixing-based Wells-Riley model, R c can be calculated by Eq.22, which is the function of air changes rate per hour (ACH) of supply air, the proportion of infected people (α) and the number of occupants (O), where k p = 4.1 × 10 2 , and N virus is the total number of viral particles (copies) inhaled over some time. N virus can be calculated based on knowledge of the mask type, occupant activity, ventilation filter efficiency, etc, given by Eqs. [24][25]: where N virus (t 0 ) is the total amount of virus inhaled until t = t 0 . η is mask efficiency (η = 0 if without face masks). C e is the temporal evolution of the particle concentration (PFU/m 3 ). Q inh is an inhalation rate (0.521 l/s for an average person [60]).N˙v ,gen is viral particles per unit of time emitted by continuous exhalation (PFU/s). λ is decay coefficient, 1.026 h − 1 , considering viral decay and the settling of the aerosol droplets. The proportion of infected people (α) is chosen as 8%, selecting the peak population rate testing positive for COVID-19 in the UK [61]. More details can be found in Ref. [62].

Test results of model performance
In the case study, the performances of the developed models were evaluated for the 1-h ahead occupancy forecasting using 24-h lagged inputs.

Forecasting performances of developed models
Based on well-tuned hyperparameters in each model, the generalization capacity was further evaluated by applying each model to the testing datasets. The calculated performance metrics of the developed models were shown in Table 3.
The developed ABLSTM model outperformed the baseline model for both options. For Option A, CVRMSE and MAPE of the ABLSTM were around 7.0% and 5.8% less than the baseline model, respectively. The 3tolerance accuracy (when the error of a prediction is less than 3 occupants, the prediction regards as correct) for these two models was higher than 90%, showing satisfactory prediction performance for both models. The PICP for the ABLSTM model is higher than 83%, indicating the actual occupancy was mostly covered in the prediction band. The forecasting results of the models with Option A were compared over two typical weeks, as shown in Fig. 7. Although a good agreement in terms of the values and trends can be observed for the two models, the proposed ABLSTM was more suitable to use for the decision-making process, as its predictive confidence interval covered almost all observed values and occupant scenarios. For Option B, the ABLSTM also outperformed the baseline model. However, as lagged occupant counts were not used in the training, the CVRMSE and MAPE of ABLSTM were increased from 16.6% (Option A) to 55.2% and 15.0% (Option A) to 55.2%. However, the 3-tolerance accuracy and PICP for the proposed ABLSTM were both higher than 75%, indicating that the model's performance was acceptable most of the time in a day. The forecasting results of the models with Option B were compared in a typical two weeks, as shown in Fig. 8. The ABLSTM model had a satisfactory prediction performance at low occupant level, which can guide the ventilation fans to run at low speed under low occupant density. For instance, in existing studies [63] [64], for zones with dual speed fans but without occupant sensors, the selection of high  speed and low speed was entirely based on the anticipation of occupant flow or predefined operating schedule. Although the proposed ABLSTM model cannot accurately predict the occupant numbers, it approximated the occupant pattern (i.e., at a low level or high level). Therefore, it is suitable to be applied for dual-speed ventilation control for energy conservation. In addition, with the adoption of the Bayesian framework, most of the observed occupant values were located at the prediction confidence interval, as the model detected high uncertainty at a high occupant level.
From Figs. 7 and 8, it can be seen that the confidence interval was wider when occupancy levels were high, indicating the high prediction uncertainty at high occupancy. Different uncertainty sources of occupancy forecasting were further quantified as shown in Fig. 9. The model misspecification and aleatoric uncertainty variances altered in a small range, while the aleatoric variances fluctuated substantially. Compared with model misspecification and aleatoric uncertainty, epistemic uncertainty contributed to a large proportion of main uncertainty sources. Building energy demand may be affected not only by occupant number but also other factors such as occupant actions and activities (window opening/closing, temperature setpoint, usage patterns of electrical appliances). Under high building energy demand, occupants tend to have high spatial and temporal randomness. For instance, during lunchtime when people leave offices, building plug loads would not be changed much as occupants may keep their electrical appliances (e.g., desktop PCs) running. Taking account of past steps' occupancy information into feature selection (i.e., Option A) can effectively reduce the predictive uncertainties.

Transferability performance of developed models
The transferability of developed models was also tested using the datasets from the second and third floors in the same buildings. It is worth noticing that, for Option A, as fine-tuning the pre-trained model was needed, the target datasets were split into training, validation and testing sets. For Option B, as the developed models by source target (1 st floor) were directly applied to the datasets without pretraining, the whole target datasets were used as the testing sets. Table 4 lists the performance of the ABLSTM models when applied to the new datasets for both options.
For Option A, the CVRMSE and MAPE of ABLSTM were 17.4% and 14.0%, respectively, using second-floor datasets, and 19.1% and 15.7% for third-floor datasets. The ABLSTM using both two datasets showed a satisfactory performance with 3-accuracy higher than 98% and PICP higher than 85%. The transferability results of the models with Option A were compared in a typical two weeks, as shown in Fig. 10, it can be seen that predictive means from the developed models match well the observations.
Compared with Option A, the CVRMSE and MAPE of ABLSTM of Option B experienced a 31.2% and 30.1% increase respectively using second-floor datasets, and a 39.8% and 31.4% increase respectively for third-floor datasets. Though larger errors were observed, the ABLSTM using both two datasets for Option B still showed a good performance with 3-accuracy higher than 74% and PICP higher than 71%. The transferability results of the models with Option B were compared in a  typical two weeks, as shown in Fig. 11. Similar to that of the first-floor datasets, a higher prediction confidence interval can be observed during periods of higher occupancy. In this study, the main purpose of the model transfer is to save time for constructing and training models. The target models (i.e., second and third floors) had similar or even better performance compared with the source model (i.e., first floor) for the following two reasons: 1) All three floors have similar functions, area, occupant activity and pattern. The number of datasets was the same for each floor (from Jun 2018 to July 2019 with 1-h intervals). Compared with that of the source model solely using the first floor's datasets for training, the target models were first pre-trained by datasets of the source model and then fine-tuned using their own datasets. Therefore, the datasets for training or tuning the target models doubled that of the source model. 2) During the test Fig. 10. Transferability of BLSTM with Option A using second and third-floor datasets. Fig. 11. Transferability of BLSTM with Option B using second and third-floor datasets.
periods, the mean occupant numbers for the first, second and third floors are 7.5, 5.4 and 5.9, respectively. As it can be seen in Figs. 7 and 9, high occupancy yields high model prediction uncertainty. Due to the smaller magnitudes and ranges of actual occupancy on the second and third floors, the target models show even better performance compared with the source model.
Through the transferability test, we find that developed models for the information-rich floor (i.e., with occupant sensors) could be further applied for information-poor floors (i.e., without occupant sensors) in the same building. For real applications, the ventilation control in information-poor offices could be beneficial from this approach.

Application of probabilistic deep learning models for optimal ventilation control
As observed from Sections 3 and 4, the BLSTM with Option A shows satisfactory performance for building occupancy forecasting. Therefore, the BLSTM with Option A can be used for real-time ventilation control to reduce energy use under normal DCV mode and to prevent infection under the anti-infection mode. Fig. 12 shows the probability of ventilation inadequacy (R n ) under normal DCV mode. With the assumption that all the windows and ventilation fans were closed (only infiltration involved), the probability of ventilation inadequacy was calculated by comparing the required supply airflow rates and air infiltration rates. The meteorological data (i. e., wind speed and outdoor temperature) from a nearby weather station and indoor temperature data from the building management system were also collected to calculate the air infiltration rates by Eq.22. It can be found that, due to the low occupant levels in buildings, there was no need for mechanical invention or window opening on 4, 5, 11 and 12 Aug (R n = 0). Besides, the quantification of R n also provided alternatives for ventilation operation by setting the risk thresholds (γ). For instance, for comfort-oriented alternative, decision-makers (e.g., facilities managers) can set the low value of γ (e.g., 10%). Therefore, except for the above-mentioned four days, mechanical ventilation or window opening was required for the rest of the days to guarantee indoor thermal comfort or air quality. For the energy-oriented alternative, facilities managers can raise the value of γ (e.g., 30%). Therefore, mechanical ventilation or window opening was not required for an additional three days such as 30 July, and 8-9 Aug for energy-saving purposes. With the assumption that all the windows and ventilation fans were closed (only infiltration involved), the individual infection risk was calculated by using the infiltration air flowrate (blue curve in Fig.12) as inputs. The individual risks were also quantified in a probabilistic manner to guide the ventilation operation. For instance, facilities managers can compare the individual risks at different quantiles with the targeted infection rate to determine the operation of ventilation systems. If we set the targeted infection rate to be 2% following the suggestion by Ref. [4] and compared it with the mean infectious risk, it can be found that, apart from 6 days (30 Jul,4,5,9,11 and 12 Aug), the mechanical ventilation or window opening were required at the rest of days to achieve the targeted infection rate.

Discussion
In Sections 3 and 4, it is found that the proposed ABLSTM models with Option A had a satisfactory performance, which could potentially be used for real-time ventilation control. The performances of the developed models were first evaluated for the 1-h ahead load forecasting based on previous 24-h data. Subsequently, a representative model was selected to further investigate the impacts of the input time lags and forecasting horizons on model performances. The limitations and future work are further presented in this section.

Impact of input time lags
Based on the above results, we selected the ABLSTM model as the representative to investigate the impact of time lags of input features (Option A) on model performances. The studied range of time lags (lagged hours) was set as 1-72 h. Fig. 14 shows the forecasting performances of the developed ABLSTM model under different input time lags. Similar trends were seen between the two metrics. For the first 10 time lags, both CVRMSE and MAPE decreased, while the errors fluctuate at the rest of the time lags. This indicates that the forecasting performances of the ABLSTM model were considerately affected by the input time lag when it was short, e.g., less than 10 in this study. After the 10th time lag, most of these two metrics stabilized in the ranges of [16%, 18%] and [15%,17%], respectively.

Impact of forecasting horizons
To facilitate predictive building energy management, multiple-hour ahead occupancy forecasting is usually required. In the 1-h ahead load forecasting, it was identified that the performance of the model was not stable when the input time lags were longer than 10. Therefore, to explore the impacts of forecasting horizons, we also used the 24-h historical trajectories to forecast the occupancy. Fig. 15 shows the CVRMSE and MAPE of the ABLSTM model under different forecasting horizons (1-24 h). The overall trends in these two metrics were similar. When the forecasting horizons were higher than 3, the CVRMSE and MAPE values were high than 30%, which indicates the ABLSTM model would not be suitable for occupant detection due to the high randomness of the occupancy. By increasing the forecasting horizon from 1 to 8 h, both CVRMSE and MAPE values increased. This indicates that the accuracy of the developed ABLSTM model decreased with the increase of the forecasting horizon before reaching 8 h. From the 8-24 h forecasting horizon, most of the CVRMSE and MAPE values stabilized.

Limitations and future work
The proposed probabilistic occupancy forecasting model integrated with risk-aware ventilation decision-making schemes could offer indoor spaces with low infection probability and high energy efficiency based on quantified risks. This is especially crucial for real applications, where different sources of uncertainty may affect the operation and performance of ventilation systems. Based on the performance tests of the developed method and a review of the existing literature, several issues still need to be addressed in further studies for the successful implementation in practice: 1) The occupant counts were estimated based on a CO 2 -based model in this study. However, the latency of the CO 2 sensor responses (i.e., time delay) due to the slow gas mixture rate, the improper place locations and numbers reduce the precision in occupancy estimation [65]. Existing studies [33] [34] showed that the R-squared values could achieve 0.65-0.78 for the linear regression between plug-in loads and occupant counts in offices, while the values were only 0.47-0.55 in the study. This is partially attributed to the deficiencies of CO 2 -based occupancy detection methods for occupancy estimation. Although we proposed a Bayesian deep learning approach that forecasts occupancy in a probabilistic manner rather than a deterministic way to compensate for variations in occupancy estimation, the model prediction accuracy with Option B is expected to be further improved with advanced occupancy detection methods. More accurate occupancy detection methods, such as radio frequency-based, video camera-based, and sensor fusion methods [66] should be applied to obtain the occupant counts with higher accuracy.
2) The transferability test can be also conducted in different types of buildings to further validate the performance of the proposed models and the effectiveness of the model framework. In addition, the developed occupancy prediction model can be further expanded or improved by expanding the scale and scope of the building types, locations, and design alternatives. The proposed ventilation schemes could be implemented in real building energy management systems with different ventilation types (e.g., variable air volume systems) to improve energy efficiency and ventilation effectiveness. 3) To determine the appropriate supply airflow rate, the building design infiltration rate can be calibrated for various building types (i.e., different surface area to volume ratios, ventilation systems, and layouts). The uncertainty of the design infiltration rate can also be estimated. Possible approaches involve the use of tracer gases tests [67], CFD-assisted air-velocity method [68] and Bayesian calibration [69]. 4) For quantifying the infectious risk in anti-infection mode, we selected the scenario in which the masks and filters were not used. The infectious risk could be further evaluated by considering various mask types used by occupants (e.g., N95, surgical and 3-ply cloth), ventilation air filter types (e.g., HEPA, ISO coarse) as well as occupant activities (e.g., sitting, standing, heavy exercise). Additional commissioning is also needed to determine the optimal risk thresholds (γ) to both fulfil user requirements and meet control objectives. Different risk thresholds should be provided for the models with different prediction accuracies under different operating modes. For instance, for the prediction model with lower accuracy, γ can be set lower and vice versa. For more conservative solutions, decisionmakers can set the γ to zero to avoid potential risks. 5) The proposed strategy can be further implemented in the existing field networks of building automation systems. The ventilation supervisory controller can be developed to integrate into the central control systems, which consists of two parallel networks: the training network and the predictor network. The training network is used to learn the relationship between the indoor energy/environmental parameters and occupant numbers based on continuous online monitoring [70]. The training network weights are then passed to the predictor network where they are used for forecasting the occupant number at the next step. The predictor network subsequently finds optimal ventilation flowrate setpoints that can minimize the potential risks. Eventually, the local process controller (i.e., PID) ensures ventilation rates are controlled at the optimal setpoints by adjusting the supply air damper.

Conclusions
In this study, an autoencoder Bayesian long short-term memory neural networks (ABLSTM) were developed for probabilistic occupancy prediction, taking account of model misspecification, aleatoric and epistemic uncertainties. Two feature options with or without occupant counts (i.e., Options A and B), and their transferability for the developed models were investigated. Risk-aware decision-making schemes were proposed for optimal ventilation under pandemic and non-pandemic scenarios. The impacts of input time lags and forecasting horizons on model performances were explored. Based on the testing results, some detailed conclusions can be drawn: o The developed ABLSTM models outperform the baseline LSTM model using both feature options. Option A is more suitable for realtime ventilation control due to its satisfactory accuracy. Option B is more suitable for zones without occupant sensors but with dual speed fans for two-level ventilation control, as it can accurately capture the low-level occupancy. During the test period (i.e., one month), the mean absolute percentage errors of the developed models with Option A and Option B were reduced by 5.8% and 4.0%, respectively, compared with that of the baseline model. However, due to the random nature of occupant behaviour, it is very difficult to forecast occupancy accurately. For the proposed ABLSTM, over 80% and 70% of actual occupant numbers were within the predicted confidence interval for Option A and Option B, respectively. o The developed ABLSTM models can take three sources of uncertainty into account. Model misspecification is eliminated by the autoencoder layer, while epistemic uncertainty and aleatoric uncertainty are considered based on Monte Carlo dropout with a modified loss function. The developed models can also be transferable and used for other zones with similar activities without sacrificing the prediction accuracies. o The proposed risk-aware decision-making schemes can offer flexible alternatives to ventilation operation for decision-makers under normal demand control ventilation mode and anti-infection modes. This enables the occupied zones with high energy efficiency under non-pandemic periods and low infection probability (i.e., below 2%) under pandemic periods.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. awarded to the Alan Turing Institute, UK (EP/T001569/1 and EP/ W006022/1).