Selecting durable building envelope systems with machine learning assisted hygrothermal simulations database

Hygrothermal simulations provide insight into the energy performance and moisture durability of building envelope components under dynamic conditions. The inputs required for hygrothermal simulations are extensive, and carrying out the simulations and analyzing the results requires expert knowledge. An expert system, the Building Science Advisor (BSA), has been developed to predict the performance of building envelope systems and to select energy-efficient and durable ones for different climates. The BSA consists of decision rules based on expert opinions and thousands of parametric simulation results for selected wall systems. The number of potential wall systems runs into the millions, too many to simulate individually. We present how machine learning can help predict durability data, such as mold growth, while minimizing the number of simulations that need to be run. The simulation results are used to train and validate machine learning tools for predicting wall durability. We tested Artificial Neural Networks (ANN) and Gradient Boosted Decision Trees (GBDT) for their applicability and model accuracy. Models developed with both methods showed adequate prediction performance (root mean square error of 0.195 and 0.209, respectively). Finally, we introduce how the information supports guidance for envelope design via an easy-to-use web-based tool that does not require the end user to run hygrothermal simulations.


Introduction
Advanced data analytics, such as machine learning, are finding a place in modeling the performance of the built environment. Hong et al. reviewed the literature on machine learning and identified over 9,000 publications [1]. They took a closer look at just over 150 publications to understand how machine learning was applied to building performance, and identified design, operation, and control as the dominant application spaces. While acknowledging the benefits of machine learning, such as reduced computational time, they identified areas for improvement: availability of training data sets, transferability, and cost. Tzuc et al. used accelerated weathering tests to model the performance of a concrete wall with a vegetative façade [2]. The weather data was used as the training set to develop an artificial neural network to predict the hygrothermal performance of the "green" concrete wall. Tijskens et al. looked at the applicability of three types of neural networks (multilayer perceptron, recurrent neural networks, and convolutional neural networks) to study the hygrothermal behavior of a masonry wall [3]. They found that the convolutional neural network performed best at predicting hygrothermal quantities such as relative humidity. It also took significantly less time to train than the other models. They later optimized the convolutional neural network to show that the  [4]. In addition to predicting hygrothermal performance, machine learning has been applied to optimize a double-skin façade. One of the challenges with double-skin façades is the relationship between aesthetics and performance. Optimizing the envelope is challenging enough; adding another layer of complexity, such as the appearance and function of the second skin, makes the problem computationally demanding. To address this issue, Kim and coworkers used machine learning to optimize the design of the second skin together with building performance [5].
Results were comparable to conventional simulations, provided the training data set was large enough. These are just some examples of how machine learning is being used to evaluate the performance of the built environment. This work uses the BSA database as a training set for machine learning models and ranks the models by how well they predict hygrothermal performance for different wall assemblies as a function of components and climates.

Hygrothermal simulations to evaluate building envelope durability
The BSA tool answers users' design questions regarding thermal performance and durability. To answer the thermal question, a U-value is calculated and compared to the code requirements. The answers to the durability question come from building science principles and simulation results. Mold growth in the building assembly is predicted per ASHRAE Standard 160 [6] by running a five-year simulation and taking the maximum mold growth index [7] (on a scale of 0 to 6) in the fifth year.
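The two checks described above can be sketched in a few lines. This is a minimal illustration, not the BSA implementation: the function names, the series-resistance U-value formula, the default surface resistances, and the assumption of hourly output with 8,760 values per year are all ours and do not come from the tool.

```python
from typing import Sequence

HOURS_PER_YEAR = 8760  # assumption: hourly simulation output, no leap years


def u_value(layer_r_values: Sequence[float],
            r_surface_interior: float = 0.13,
            r_surface_exterior: float = 0.04) -> float:
    """One-dimensional U-value in W/(m2.K) from layer resistances in m2.K/W.

    Assumes layers in series; the default surface resistances are
    illustrative values for horizontal heat flow through a wall.
    """
    total_r = r_surface_interior + sum(layer_r_values) + r_surface_exterior
    return 1.0 / total_r


def meets_code(u: float, u_max: float) -> bool:
    """True if the calculated U-value does not exceed the code maximum."""
    return u <= u_max


def fifth_year_max_mold_index(hourly_mold_index: Sequence[float]) -> float:
    """Maximum mold growth index during the fifth simulated year."""
    if len(hourly_mold_index) < 5 * HOURS_PER_YEAR:
        raise ValueError("need at least five years of hourly values")
    fifth_year = hourly_mold_index[4 * HOURS_PER_YEAR:5 * HOURS_PER_YEAR]
    return max(fifth_year)
```

Taking the maximum over the fifth year only, rather than over the whole run, means that a peak during the first years of the simulation does not determine the result.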

Wall assemblies and climates included in the study
A set of one-dimensional simulations was carried out for lightweight and masonry walls using a hygrothermal simulation tool [8]. The total number of simulations was 10,960, including two orientations for each climate: north and the orientation with the most wind-driven rain at the location. The orientation that resulted in the highest mold index was selected as the design case for each wall. Thus, the number of simulation results used in the machine learning analysis was 5,480. The simulation parameters included 15 climate locations covering all US climate zones and the layer details listed in Table 1. the dataset was used for training, and the rest was used for testing. Grid search and random initialization were carried out to tune the hyperparameters and obtain models that fit the data well. The root mean square error (RMSE) values of the best ANN and GBDT models were 0.195 and 0.209, respectively. The best ANN model had nine hidden layers of 25 nodes each. The best GBDT model had 100 trees with a maximum depth of nine.
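The reduction from 10,960 simulations to 5,480 design cases and the RMSE metric can be illustrated as follows. This is a sketch under stated assumptions: the record layout (case identifier, orientation, mold index) and both function names are hypothetical, and the code is not taken from the study.

```python
import math
from typing import Dict, Hashable, Iterable, Sequence, Tuple

# Hypothetical record layout: (case_id, orientation, mold_index), where
# case_id identifies one wall assembly in one climate.
Result = Tuple[Hashable, str, float]


def select_design_cases(results: Iterable[Result]) -> Dict[Hashable, Tuple[str, float]]:
    """Keep, for each case, the orientation with the highest mold index."""
    design: Dict[Hashable, Tuple[str, float]] = {}
    for case_id, orientation, mold_index in results:
        current = design.get(case_id)
        if current is None or mold_index > current[1]:
            design[case_id] = (orientation, mold_index)
    return design


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between simulated and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```

With two orientations per case, `select_design_cases` halves the number of records, which matches the step from 10,960 to 5,480. The RMSE is expressed in the same units as the mold index, so the reported values of 0.195 and 0.209 can be read against the 0 to 6 scale.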

Conclusions
The BSA tool includes rule sets based on expert opinions to determine from users' inputs whether the wall structure has acceptable moisture performance. The tool also provides guidance on how to improve the wall. The ongoing effort includes adding simulation results to a database to provide further guidance and confidence in the results. The number of possible cases is too large to allow for simulating them all. The machine learning methods tested here showed that additional cases can be predicted from a limited set of simulations. Additionally, artificial intelligence (AI) can be built on the data to create design recommendations.
ANN and GBDT were tested to develop a metamodel (surrogate model). The models developed with both methods showed adequate prediction performance; considering the computational load of each method, we concluded that GBDT was more suitable for this dataset. However, as the dataset grows, machine learning methods with more degrees of freedom, e.g., deep neural networks, would be more suitable for accurate prediction.