Forecasting the parameters of structures from natural materials taking into account environmental impact

The paper addresses the problem of designing building structures with specified values of stability parameters based on the corresponding values of material parameters. For modeling, a passive experiment based on archival information on building structures is used. An algorithm for constructing forecast models is developed; its main stages are described and shown in a diagram. The algorithm is based on statistical approaches. Forecast models are presented for natural stone building materials, binding materials, and forest materials. In constructing the regression models, the authors assumed that the experiments were independent and that the nature of the random components remained unchanged across all experiments. The resulting models may be used to forecast changes in design parameters when the selected characteristics change.


Introduction
To date, a large number of building materials have been developed. Material properties may change as a result of external influences. In addition, the structural components that make up a material are consistent with one another, which makes it possible to evaluate the characteristics of materials. The variety of building structures makes it possible to meet a variety of needs under the appropriate environmental conditions [1,2]. The purpose of the work is to study how the parameters of building structures depend on the selected material parameters under various external conditions.

Development of an algorithm to forecast the parameters of building structures from selected characteristics
When solving control tasks in order to streamline the functioning of construction systems, the nature of control impacts depends on the current state of the studied object; however, if a forecast of the ongoing process is available, the effectiveness of decisions may increase significantly. To solve this problem, it is necessary to build forecast models, on the basis of which a simulation experiment can be conducted in order to select optimal control actions [3].
Active and passive experiment methods based on the regression analysis are most often used to construct the forecast models. The efficiency of solving problems of modeling, forecasting and optimizing the functioning of building materials increases significantly with the use of computer equipment and specialized software.
The heterogeneous nature of modeling objects requires an individual approach to the control algorithm. It is known that any control system involves two main processes: determining the state of the system and generating the optimal impact for this state. When choosing the control impact tactics, one of the most important stages is the prediction of system parameters based on the forecast models.
Given the specifics of construction systems, a passive experiment based on archival information is used to obtain their mathematical description.
Forecasting changes in construction characteristics is one of the most important tasks, and its accuracy largely determines the optimal selection of control impacts.
The algorithm for constructing the forecast models consists of the following stages:
1. A set of indicators $X_i$ is determined based on a survey of experts, which allows fully identifying the state of the modeling object.
2. One or more controlled indicators $Y_j$ are selected.
3. As a result of the dispersion analysis, those indicators $X_i$ that do not affect the change of any controlled indicator $Y_j$ are excluded.
4. Information is filtered to select valid measurements.
5. The optimal characteristic space is selected by eliminating parametric redundancy.
6. The hypothesis of the normal distribution of the values of indicators $X_i$ is checked.
7. The type of the regression model is selected (linear, incomplete quadratic, quadratic).
8. Estimates of the coefficients of the regression equation are calculated. The least squares method is used. The least squares criterion may be written as follows:
$$S = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \to \min,$$
where $y_i$ are the actual values of the controlled indicator, $\hat{y}_i$ are the values calculated by the regression equation, and $n$ is the number of observations.
9. The significance of the coefficients is checked, and insignificant coefficients are excluded from the model.
10. The adequacy of the model is checked. If the model is adequate, the algorithm ends. If the model is inadequate but can be made more complex, return to item 7; otherwise the original sample must be adjusted (by increasing its volume or reducing the number of inaccurate measurements).
The algorithm for constructing the forecast models is shown in Fig. 1.
In this case, the multiple regression equations are constructed by the so-called stepwise (multistep) analysis, during which the model is solved, the selection of factors is completed using statistical and mathematical criteria, and the form of the relationship between each factor and the effective feature is specified.
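The three model types named in item 7 can be illustrated with a minimal sketch. It assumes the usual reading of these terms: a linear model contains only the factors themselves, an incomplete quadratic model adds pairwise products of factors, and a quadratic model also adds squared factors. The function name `model_terms` and the term ordering are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def model_terms(x, kind="linear"):
    """Return the regressor row for one observation x = [x1, ..., xp].

    kind is one of:
      "linear"               -> 1, x1, ..., xp
      "incomplete_quadratic" -> linear terms plus pairwise products xi*xj
      "quadratic"            -> incomplete quadratic terms plus squares xi^2
    """
    allowed = ("linear", "incomplete_quadratic", "quadratic")
    if kind not in allowed:
        raise ValueError(f"kind must be one of {allowed}")

    row = [1.0] + [float(v) for v in x]  # intercept and linear terms
    if kind in ("incomplete_quadratic", "quadratic"):
        # pairwise interaction terms xi*xj for i < j
        row += [float(a * b) for a, b in combinations(x, 2)]
    if kind == "quadratic":
        # pure quadratic terms xi^2
        row += [float(v * v) for v in x]
    return row


# Example with two hypothetical factor values x1 = 2, x2 = 3
row_quadratic = model_terms([2, 3], "quadratic")
```

Moving from one model type to the next only lengthens the regressor row, so the same least squares procedure can be reused when the algorithm returns to item 7 with a more complex model.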
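The numerical values of the parameters of the multiple regression equation are usually determined using the least squares method, for which a system of normal equations is constructed and solved. For the linear multiple regression
$$\hat{y} = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_p x_p,$$
where $p$ is the number of factors, the system of normal equations is as follows (all sums are taken over the $n$ observations):
$$\begin{cases}
n a_0 + a_1 \sum x_1 + a_2 \sum x_2 + \dots + a_p \sum x_p = \sum y, \\
a_0 \sum x_1 + a_1 \sum x_1^2 + a_2 \sum x_1 x_2 + \dots + a_p \sum x_1 x_p = \sum x_1 y, \\
\dots \\
a_0 \sum x_p + a_1 \sum x_1 x_p + a_2 \sum x_2 x_p + \dots + a_p \sum x_p^2 = \sum x_p y.
\end{cases}$$
The coefficients at $x_i$ in the multiple linear regression equation show by how much, on average, the effective feature changes when the corresponding factor increases by one unit while the other factors included in the regression equation are held at fixed (constant) values.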
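As a minimal sketch of this step, the normal equations can be assembled and solved directly. The code below uses only the Python standard library. The helper names `solve_linear_system` and `fit_linear_regression`, as well as the data, are hypothetical and serve only as an illustration; a numerical library would normally be used in practice.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: factors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear_regression(xs, y):
    """Least squares fit of y = a0 + a1*x1 + ... + ap*xp.

    xs: list of observations, each a list of p factor values.
    Returns [a0, a1, ..., ap].
    """
    rows = [[1.0] + [float(v) for v in obs] for obs in xs]
    k = len(rows[0])
    # Left-hand side of the normal equations: sums of products of regressors
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    # Right-hand side: sums of products of regressors with the response
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coefficients = fit_linear_regression(xs, y)
```

Because the example data contain no noise, the fitted coefficients should reproduce the generating values 1, 2 and 3 up to rounding.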
The value of the aggregate (multiple) correlation coefficient $R$ may be determined from the values of the pair correlation coefficients. The value $R^2$, which is the determination coefficient, shows to what extent the variation of the effective feature is caused by the influence of the factors included in the considered correlation equation. The value of the aggregate correlation coefficient varies from 0 to 1 and cannot be numerically less than any of the pair correlation coefficients forming it. The closer the aggregate correlation coefficient is to one, the smaller the role of the factors not considered in the model and the more reason to believe that the parameters of the regression model reflect the degree of effectiveness of the factors included in it [4]. In some cases, the scattering of points of the correlation field is so large that it makes no sense to use the regression equation, since the error in the estimation of the analyzed indicator will be large. For the entire set of observed values, the mean quadratic error of the regression equation $S_e$ is calculated, which is the mean quadratic deviation of the actual values $y_i$ from the values $\hat{y}_i$ calculated by the regression equation:
$$S_e = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{n - m}},$$
where $n$ is the number of observations and $m$ is the number of parameters in the regression equation. The smaller the scattering of empirical points around the regression line, the smaller the mean quadratic error of the equation. Thus, the $S_e$ value serves as an indicator of the significance and utility of the model expressing the relationship between the features [5].
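The two quantities discussed above can be sketched in a few lines. The code assumes the standard textbook expression of the aggregate correlation coefficient through pair coefficients for the two-factor case, which may differ in form from the expression used by the authors. The function names and the data are hypothetical.

```python
import math


def pearson(u, v):
    """Pair (Pearson) correlation coefficient of two equally long samples."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def multiple_r_two_factors(y, x1, x2):
    """Aggregate correlation coefficient of y with two factors x1 and x2,
    computed from the three pair correlation coefficients."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    r_squared = min(max(r_squared, 0.0), 1.0)  # guard against rounding error
    return math.sqrt(r_squared)


def standard_error(y, y_hat, m):
    """Mean quadratic error of the regression equation with m parameters."""
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return math.sqrt(sse / (len(y) - m))


# Hypothetical data: y is an exact linear function of x1 and x2
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [a + b for a, b in zip(x1, x2)]
r_multiple = multiple_r_two_factors(y, x1, x2)

# Hypothetical actual and fitted values for a two-parameter model
s_e = standard_error([1, 2, 3, 4], [1.5, 1.5, 3.5, 3.5], m=2)
```

In this example the response is fully explained by the two factors, so the aggregate coefficient is expected to be close to one and not smaller than either pair coefficient.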
Checking the adequacy of the model is one of the most important procedures for regression analysis, since it is necessary to make sure that the practical use of the obtained model will lead to positive results. When we select a model structure, we want it to be as simple as possible, i.e. it should include as few coefficients as possible. This is the so-called principle of the model efficiency. Reducing the number of coefficients facilitates both the evaluation procedure and the use of the model [6].
The residual variance is a measure of the fluctuation of the actual values $y$ around the corresponding theoretical values, i.e. around the regression line. In mathematical statistics, the total variance of the effective feature is decomposed into the factor variance, which is explained by the regression, and the residual variance:
$$\sigma_{y}^2 = \sigma_{\hat{y}}^2 + \sigma_{res}^2.$$
To measure the tightness of the relationship between $y$ and $x$, it is therefore logical to use the ratio of the factor variance to the total variance of the effective feature. This ratio is called the theoretical determination index ($i^2$):
$$i^2 = \frac{\sigma_{\hat{y}}^2}{\sigma_{y}^2}. \qquad (10)$$
The theoretical determination index shows what part of the total variation of the effective feature $y$ is explained by the factor $x$ included in the corresponding regression equation. The square root of the determination index, called the correlation index ($i$), is also used as an indicator of the tightness of the relationship.
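The decomposition of the total variance and the determination index can be checked numerically. The sketch below fits a paired linear regression by least squares and computes the three variances; the function names and the sample values are hypothetical.

```python
import math


def paired_regression(x, y):
    """Least squares estimates (a, b) of the line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


def variance_decomposition(x, y):
    """Return (total, factor, residual) variances of y for the fitted line."""
    a, b = paired_regression(x, y)
    n = len(y)
    my = sum(y) / n
    y_hat = [a + b * xi for xi in x]
    total = sum((yi - my) ** 2 for yi in y) / n
    factor = sum((yh - my) ** 2 for yh in y_hat) / n
    residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / n
    return total, factor, residual


# Hypothetical measurements
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

total, factor, residual = variance_decomposition(x, y)
determination_index = factor / total          # i^2
correlation_index = math.sqrt(determination_index)  # i
```

For a least squares fit with an intercept, the factor and residual variances add up to the total variance, so the determination index lies between 0 and 1.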
Regression and correlation indicators (the parameters of the regression equation and the indices or coefficients of determination and correlation), when calculated from a population of limited size, may be distorted by random factors. Therefore, it is necessary to check whether these indicators are characteristic of the set of conditions in which the studied population is located or whether they result from a combination of random circumstances. The significance (materiality) of the regression and correlation indicators is checked using the Student t-test and the Fisher F-test:
$$t_{a_j} = \frac{|a_j|}{s_{a_j}}, \qquad t_r = \frac{|r| \sqrt{n - 2}}{\sqrt{1 - r^2}}, \qquad F = \frac{R^2}{1 - R^2} \cdot \frac{n - m}{m - 1},$$
where $a_j$ is a regression coefficient, $s_{a_j}$ is its standard error, $r$ is the correlation coefficient, $n$ is the number of observations, and $m$ is the number of parameters in the regression equation. The values of $t$ calculated from these formulas are then compared with their critical values at the adopted significance level and the number of degrees of freedom $k = n - 2$. The critical values of $t$ are found in the Student distribution table. The calculated value of $F$ is compared with the critical (tabular) value for the adopted significance level and the numbers of degrees of freedom $k_1 = m - 1$, $k_2 = n - m$.
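A minimal sketch of this check is given below. It assumes the standard forms of the t statistic for a pair correlation coefficient and of the F statistic for the determination coefficient, consistent with the degrees of freedom stated above. The critical values are entered by hand from statistical tables for a 0.05 significance level; the sample values of r, n and m are hypothetical.

```python
import math


def t_statistic_correlation(r, n):
    """Student t statistic for a pair correlation coefficient (k = n - 2)."""
    return abs(r) * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def f_statistic(r_squared, n, m):
    """Fisher F statistic for a regression with m parameters
    (k1 = m - 1, k2 = n - m degrees of freedom)."""
    return (r_squared / (1 - r_squared)) * ((n - m) / (m - 1))


def is_significant(statistic, critical_value):
    """The indicator is taken as significant if it exceeds the table value."""
    return statistic > critical_value


# Hypothetical paired regression: r = 0.8, n = 11 observations, m = 2 parameters
r, n, m = 0.8, 11, 2
t_value = t_statistic_correlation(r, n)
f_value = f_statistic(r * r, n, m)

# Table values for significance level 0.05:
# Student t (two-sided) with k = 9, Fisher F with k1 = 1, k2 = 9
T_CRITICAL = 2.262
F_CRITICAL = 5.12

t_ok = is_significant(t_value, T_CRITICAL)
f_ok = is_significant(f_value, F_CRITICAL)
```

For a paired regression the two tests agree, since the F statistic equals the square of the t statistic.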
Estimating the significance of regression coefficients using the t criterion is often used to complete the selection of factors during stepwise analysis. Two procedures implemented in application packages are best known: sequential increase and sequential decrease of the group of independent variables [7]. In a sequential decrease, for example, after the model is solved and the significance of all regression coefficients is estimated, the factor whose coefficient is insignificant and has the lowest t value is excluded from the model. The model is then solved again, and the significance of all regression coefficients is estimated again. If some of them are again insignificant, the factor with the smallest t value is excluded once more. The process of excluding factors continues until a regression equation is obtained in which all coefficients are significant.
Step-by-step regression is used to minimize the number of independent variables included in the studied model [8].
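The sequential decrease procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used by the authors: the helper names and the data are hypothetical, a single critical value of t is passed in for simplicity (strictly, it should be looked up again whenever the number of degrees of freedom changes), and the residual variance is assumed to be nonzero.

```python
import math


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: factors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def fit_with_t(columns, y):
    """Least squares fit with intercept; returns (coefficients, t values).

    columns: list of factor columns. t values are returned for factors only.
    """
    n = len(y)
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    c = invert(xtx)
    coef = [sum(c[i][j] * xty[j] for j in range(k)) for i in range(k)]
    sse = sum((yi - sum(b * v for b, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    s2 = sse / (n - k)  # residual variance, assumed to be nonzero
    t = [abs(coef[j]) / math.sqrt(s2 * c[j][j]) for j in range(1, k)]
    return coef, t


def backward_elimination(columns, y, t_crit):
    """Drop the least significant factor until all remaining are significant.

    Returns the indices of the factors kept in the model.
    """
    kept = list(range(len(columns)))
    while kept:
        _, t = fit_with_t([columns[j] for j in kept], y)
        worst = min(range(len(t)), key=lambda j: t[j])
        if t[worst] >= t_crit:
            break
        del kept[worst]
    return kept


# Hypothetical data: y depends on x1 only; x2 is unrelated to y
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, -1, 1, 1, -1, -1, 1]
y = [5.1, 7.9, 11.1, 13.9, 16.9, 20.1, 22.9, 26.1]

# 2.571 is the table value of Student t for 5 degrees of freedom at level 0.05
kept = backward_elimination([x1, x2], y, t_crit=2.571)
```

In this example the second factor has a coefficient close to zero, so it is excluded at the first step and only the first factor remains in the model.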

Results
In order to predict the efficiency of building materials in terms of load resistance characteristics, models are built that take into account the relationships between the analyzed indicators. To use regression analysis methods, the following prerequisites must be met:
• all experiments should be conducted independently of each other, in the sense that the random factors that caused the response to deviate from the pattern in one experiment do not affect such deviations in other experiments;
• the statistical nature of these random components should remain unchanged in all experiments; the main reasons for inaccurate information on construction components include the inability to ensure the objectivity of assessments, the difficulty or impossibility of quantifying qualitative indicators, and data recording errors;
• the indicators included in the regression equation as independent variables must be unrelated to each other.
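The last prerequisite can be screened with pair correlation coefficients between the candidate independent variables. The sketch below flags pairs whose absolute correlation exceeds a threshold; the threshold of 0.8, the function names and the data are assumptions made for illustration and are not taken from the paper.

```python
import math


def pearson(u, v):
    """Pair (Pearson) correlation coefficient of two equally long samples."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def correlated_pairs(factors, threshold=0.8):
    """Return (i, j, r) for every pair of factors with |r| above threshold."""
    pairs = []
    for i in range(len(factors)):
        for j in range(i + 1, len(factors)):
            r = pearson(factors[i], factors[j])
            if abs(r) > threshold:
                pairs.append((i, j, r))
    return pairs


# Hypothetical factor columns: the first two are almost proportional
factors = [
    [1, 2, 3, 4],
    [2, 4, 6, 8.1],
    [1, -1, -1, 1],
]
flagged = correlated_pairs(factors)
```

A flagged pair suggests that one of the two indicators is redundant and is a candidate for exclusion before the regression model is built.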
In our case, an experiment is the recording of the next set of data on the construction material, so the first two prerequisites of the regression analysis are satisfied by the data collection technology. To achieve independence, risk factors need to be pre-selected, i.e. a number of parameters, namely those that carry the least information, must be excluded. Based on the developed method, the following groups of interrelated indicators were formed:
A) for stone natural building materials: X1 - true density, X2 - porosity, X3 - thermal conductivity, X4 - heat capacity, X5 - elasticity, X6 - residual deformation,
B) for binding materials: X1 - true density, X2 - porosity, X3 - thermal conductivity, X4 - heat capacity, X5 - capillary suction, X6 - linear temperature expansion,
C) for forest materials: X1 -