Estimation of system reliability by using the PLS-regression based corrected response surface method
Ocena niezawodności systemu z wykorzystaniem poprawionej metody powierzchni odpowiedzi opartej na regresji cząstkowych najmniejszych kwadratów

A new computational method, referred to as the PLS-regression (PLSR) based corrected response surface method, has been developed for predicting the reliability of structural and mechanical systems subject to random loads, material properties, and geometry. The method involves a Corrected-Response Surface Model (C-RSM), based on the Partial Least Squares Regression Method (PLSRM) combined with correction factors, and Monte Carlo Simulation (MCS); it is named the Corrected-Partial Least Squares Regression-Response Surface Method (C-PLSRRSM). To obtain an accurate surrogate model in the region that determines the reliability of the system, a coefficient is introduced to bound the sampling region of the input random variables. Because it needs only a small number of original function evaluations, the proposed method is effective, particularly when a response evaluation entails costly finite-element, mesh-free, or other numerical analysis. Three numerical examples involving reliability problems of two structural systems and a mechanical system illustrate the method. The results indicate that the proposed method provides accurate and computationally efficient estimates of reliability. The proposed corrected model, the PLSR based corrected response surface (C-PLSR-RS), can serve as an accurate surrogate model for calculating system reliabilities, especially for implicit performance functions.


Introduction
Suppose a system has m limit-state functions associated with its constituent components. If the relationship between the system reliability and the component reliabilities is known, the system reliability $p_{Rs}$ can be computed from the component reliabilities $p_{Rj}$. For a series system, the system works well only if all components operate well, so the system reliability is the probability of the intersection of the component reliability events [21,29], as shown in Eq. (1). For a parallel system, the system is reliable if any of the components works well, so the system reliability $p_{Rs}$ is computed as the probability of the union of the component reliability events [29], as shown in Eq. (2).


$$p_{Rs} = \Pr\Big\{\bigcap_{j=1}^{m} Y_j > 0\Big\} \quad \text{(for a series system)} \tag{1}$$

$$p_{Rs} = \Pr\Big\{\bigcup_{j=1}^{m} Y_j > 0\Big\} \quad \text{(for a parallel system)} \tag{2}$$

where $Y_j$ is the jth component limit-state function included in the system performance function, which is expressed by:

$$Y_j = g_j(X), \quad (j = 1, 2, \ldots, m)$$

and the size of the random variable vector $X = [X_1, X_2, \ldots, X_n]^T$ is n. The component reliability event $E_j$ is then defined by the event $g_j(X) > 0$, and the component reliability is the probability:

$$p_{Rj} = \Pr\{g_j(X) > 0\}$$

The major task of system reliability analysis is to calculate $p_{Rs}$, given the joint distribution of $X$ and the limit state functions $g_1(X), \ldots, g_m(X)$. One class of system reliability analysis methods uses the component reliabilities without considering the dependency between component failures. Approximate methods, such as the first- and second-order reliability methods (FORM/SORM) [4,15,28] and simulation methods [1,23,25,26], are commonly employed to estimate the component reliability. A second class of methods accounts for the failure dependency through the linear correlation coefficient $\rho_{ij}$, which describes the failure relationship between the ith and the jth components. The linear correlation coefficient $\rho_{ij}$ can be easily found from the linearized limit-state functions $g_j(X)$. Details of this type of system reliability analysis can be found in [6,21,33,38].
When the correlation between component failures is ignored, or when only the linear correlation coefficient is used to express the dependency, the accuracy of the component (or system) reliability result deteriorates as the nonlinearity of the non-normal-to-normal transformation increases. To solve these problems, several more accurate methods have been developed [3,7,9]. By extending the saddlepoint approximation (SA) method [8,16] used in component reliability analysis, Du [7] developed an SA based system reliability analysis method. However, the accuracy of its results is largely determined by the accuracy of the linearization of the limit-state functions in the vicinity of their associated Most Likelihood Points (MLPs), and the MLPs are acquired by an iterative optimization process, which reduces the efficiency of the reliability calculation. The Efficient Global Reliability Analysis (EGRA) method [2] was extended to system reliability problems by Bichon [3]. It is based on the creation of Gaussian process surrogate models that are required to be locally accurate only in the regions that contribute significantly to the system failure. However, a large number of iterations and a complex optimization process are needed to obtain the surrogate model, which decreases the efficiency of the system reliability analysis. An active learning reliability method combining Kriging and MCS was presented by Echard [9]. Two kinds of active learning methods, which add experiment points to refine the meta-model, were presented. However, every point of the sample population obtained from the Monte Carlo sampling has to be searched once during each active learning step, so the computational cost is high if the sample population is large. A fuzzy multi-objective genetic algorithm approach [24] was proposed to optimize the system reliability.
This paper presents a new computational method for predicting the reliability of structural and mechanical systems subject to random loads, material properties, and geometry. The proposed method involves a small number of exact or numerical evaluations of the performance function, generation of approximate values of the performance function at an arbitrarily large number of inputs using the C-RSM, and reliability evaluation using MCS. Three numerical examples involving reliability problems of structural and mechanical systems illustrate the effectiveness and accuracy of the proposed method. Whenever possible, to evaluate the accuracy and computational efficiency of the proposed method, comparisons have been made with the direct MCS method, which evaluates the original performance functions to obtain the system reliability.
Section 2 provides a brief introduction to partial least squares regression and the response surface method. Section 3 describes the proposed corrected response surface method based on PLSR, which involves a new correction method with a coefficient and a new sampling method with a coefficient that bounds the distribution region of the input random variables. Section 4 gives the simulation theory of the MCS method, which is used to analyze the reliability of systems with complex component failure dependencies. Three numerical examples are presented in Section 5, and comparisons are made with the direct MCS method.

Partial least squares regression method and response surface method
Partial least squares regression (PLSR) has two algorithms: PLS1 (sequential algorithm) for univariate response variables and PLS2 for multivariate response variables [20]. PLSR is used to correlate the parameters and the responses simultaneously. It is a method for relating two data matrices, x and y (in this paper, a pair of realization matrices of X and Y at the sampling data), by a linear multivariate model, but it goes beyond traditional regression in that it also models the structure of x and y. The core concept of the PLSR approach is to handle multicollinearity in regression or calibration; further details of PLSR can be found in Ref. [35]. PLSR has recently been applied to component reliability analysis [39,40]. PLSR derives its usefulness from its ability to analyze data with many, noisy, collinear, and even incomplete variables in both x and y. Unlike the traditional Multiple Linear Regression (MLR) method, PLSR uses the response variable information during the decomposition process [13]; even when the x-variables are numerous and strongly correlated, the PLSR method works well. Many studies have shown the potential of PLSR for estimating parameters and have demonstrated that PLSR is a better alternative to conventional stepwise regression [18,30,32]. PLSR is also known as projection to latent structures, a relatively recent multivariate regression method that combines aspects of principal component regression (PCR) and multiple linear regression (MLR). PLSR is a pertinent statistical choice when [a] there are many variables x that are correlated with many responses y and [b] there are missing data in the experimental work [5]. In this paper, the PLSR method is used to produce the surrogate model of the original performance functions of a system. The meta-model, which has a simple form with low nonlinearity, is then used to calculate the system reliability.
The response surface method (RSM) is used to explore interactions among the parameters and to predict properties in the experimental region [5]. RSM is also an effective tool for assessing the reliability of complex structures, which requires a compromise between the reliability algorithms and the mechanical methods used to model the mechanical behavior; the interest of this method is that the user is allowed to choose and check the mechanical experiments [12,31]. In this context, RSM is used to explore interactions among parameters and to predict the failure regions. RSM is a collection of mathematical and statistical techniques based on fitting a polynomial equation to the experimental data. It is a powerful tool for describing the studied system, so that its behavior can be predicted from the response surface plots that represent the system in the studied region [5,27]. RSM has also been used for analyzing the surface maps of different responses and for detecting interactions among variables and quadratic models present in the responses [14]. The procedure of using a least squares regression analysis to obtain the parameters of a response surface around a design point was used earlier by Faravelli [10]. RSM is used for several reasons, the most important being that numerical differentiation of the analytical response surface is available, which reduces the number of mechanical computations required and provides the decision maker with information for choosing judicious experiments and chemometric tools such as Design of Experiment (DOE), RSM, PCR, or PLSR [12]. These methodologies are helpful when many correlated variables and responses are present in a process.

Corrected response surface based on PLSR
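By collecting N observations (sample points), composed of N input data vectors and the corresponding response data obtained from calculating the original performance functions, two matrices x and y of dimensions (N × n) and (N × m) are formed. The data of a PLSR method can be arranged in two tables, which are usually centered and scaled before the analysis [11]; they are expressed by Eq. (5) and Eq. (6), respectively:

$$x = [x_{ij}]_{N \times n} \tag{5}$$

$$y = [y_{ij}]_{N \times m} \tag{6}$$

The simple PLSR algorithm used in this paper is described below, where $E_0$ and $F_0$ denote the centered and scaled matrices of x and y:

(1) Find the eigenvector $w_1$ corresponding to the maximum eigenvalue of the matrix $E_0^T F_0 F_0^T E_0$, and calculate the latent variable as $t_1 = E_0 w_1$ and the residual matrix as $E_1 = E_0 - t_1 p_1^T$, where $p_1 = E_0^T t_1 / \|t_1\|^2$.
(2) Find the eigenvector $w_2$ corresponding to the maximum eigenvalue of the matrix $E_1^T F_0 F_0^T E_1$, and calculate $t_2 = E_1 w_2$ and $E_2 = E_1 - t_2 p_2^T$, where $p_2 = E_1^T t_2 / \|t_2\|^2$.
(3) The same process is carried out repeatedly until the pth step. Then $w_p$ equals the eigenvector corresponding to the maximum eigenvalue of the matrix $E_{p-1}^T F_0 F_0^T E_{p-1}$, and the latent variables are

$$t_h = E_{h-1} w_h, \quad (h = 1, 2, \ldots, p)$$

where two requirements should be fulfilled: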

Cross-validation theory
A strict test of the predictive significance of each PLS latent variable is necessary, and the extraction stops when the latent variables start to be non-significant. Cross-validation (CV) is a practical and reliable way to test this predictive significance [11,35,36]. It is used to determine whether the next latent variable needs to be extracted. Assume that the current latent variable is $t_h$. The method involves two sums of squares, which are defined as follows:

(1) First type: PRESS. The N sample points are divided into two groups each time, one with N-1 sample points and the other with one sample point. N parallel regression models are developed from the reduced data, each with one row of the observation data deleted. After a model is developed, the difference between the actual and the predicted Y-value is calculated for the deleted data. The sum of squares of these differences is computed and collected from all the parallel models to form the predictive residual sum of squares (PRESS), which estimates the predictive ability of the model. The PRESS of the jth response is expressed as:

$$PRESS_{hj} = \sum_{i=1}^{N} \big(y_{ij} - \hat{y}_{hj(-i)}\big)^2$$

where $\hat{y}_{hj(-i)}$ is the prediction of $y_{ij}$ by the model with h latent variables built without the ith sample point. The PRESS of $(Y_1, Y_2, \ldots, Y_m)$ can be defined as:

$$PRESS_h = \sum_{j=1}^{m} PRESS_{hj}$$

(2) Second type: SS. All sample points are used to regress the response functions, and the difference between the actual and the predicted Y-value is calculated for each point. The residual sum of squares (SS) of $Y_j$ corresponding to all the sample points is:

$$SS_{hj} = \sum_{i=1}^{N} \big(y_{ij} - \hat{y}_{hji}\big)^2$$

Then the SS of $(Y_1, Y_2, \ldots, Y_m)$ can be defined as:

$$SS_h = \sum_{j=1}^{m} SS_{hj}$$

(3) Stopping condition. An error threshold for the stopping condition is defined by:

$$C_{thre} = \frac{PRESS_h}{SS_{h-1}}$$

where $SS_{h-1}$ denotes the residual sum of squares before the current latent variable.
The ratio is calculated after each latent variable, and a latent variable is judged significant if the ratio is smaller than about 0.9025 (that is, $0.95^2$) for at least one of the Y-variables. If $C_{thre}$ is less than 0.9025, then the h latent variables are enough to provide an accurate regression model; otherwise, another latent variable needs to be extracted in order to reach the required accuracy. The process continues until a latent variable is not significant.
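The latent-variable extraction and the cross-validation ratio described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (`pls_fit`, `pls_predict`, `press`, `ss`, `cv_ratio`), the synthetic data, and the simplification of reusing the full-sample centering and scaling inside the leave-one-out loop are all assumptions made for the example.

```python
import numpy as np

def standardize(a):
    """Center each column and scale it to unit standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def pls_fit(E0, F0, h):
    """Extract h latent variables (h >= 1); return weights W, loadings P
    and response regression vectors R, one row per latent variable."""
    E = E0.copy()
    W, P, R = [], [], []
    for _ in range(h):
        # w: eigenvector of the largest eigenvalue of E^T F0 F0^T E
        _, vecs = np.linalg.eigh(E.T @ F0 @ F0.T @ E)
        w = vecs[:, -1]
        t = E @ w                  # latent variable (score vector)
        p = E.T @ t / (t @ t)      # loading used to deflate E
        r = F0.T @ t / (t @ t)     # regression of F0 on t
        E = E - np.outer(t, p)     # residual matrix for the next step
        W.append(w); P.append(p); R.append(r)
    return np.array(W), np.array(P), np.array(R)

def pls_predict(model, E_new):
    """Predict (scaled) responses for new (scaled) inputs."""
    W, P, R = model
    E = np.atleast_2d(E_new).astype(float).copy()
    F_hat = np.zeros((E.shape[0], R.shape[1]))
    for w, p, r in zip(W, P, R):
        t = E @ w
        F_hat += np.outer(t, r)
        E = E - np.outer(t, p)
    return F_hat

def press(E0, F0, h):
    """Leave-one-out predictive residual sum of squares with h latent variables.
    Simplification (assumption): the data are not re-standardized per fold."""
    N = E0.shape[0]
    total = 0.0
    for i in range(N):
        keep = np.arange(N) != i
        model = pls_fit(E0[keep], F0[keep], h)
        total += np.sum((F0[i] - pls_predict(model, E0[i])) ** 2)
    return total

def ss(E0, F0, h):
    """Residual sum of squares when all points are used; h = 0 means no model."""
    if h == 0:
        return np.sum(F0 ** 2)
    return np.sum((F0 - pls_predict(pls_fit(E0, F0, h), E0)) ** 2)

def cv_ratio(E0, F0, h):
    """Ratio PRESS_h / SS_(h-1), to be compared with 0.9025."""
    return press(E0, F0, h) / ss(E0, F0, h - 1)

# Hypothetical data: three inputs, one nearly linear response.
rng = np.random.default_rng(0)
x = rng.normal(size=(40, 3))
y = (x @ np.array([1.0, 0.5, -0.8]) + 0.05 * rng.normal(size=40)).reshape(-1, 1)
E0, F0 = standardize(x), standardize(y)
for h in (1, 2, 3):
    print(h, cv_ratio(E0, F0, h))
```

Because the score vectors produced by this deflation are mutually orthogonal, regressing the responses on each score separately is equivalent to a joint regression on all extracted scores, which keeps the sketch short.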

Design of experiment
Direct sampling methods (MCS, for example) perform reliability analysis by evaluating the original response functions a large number of times, which can be prohibitively expensive when these functions are highly complex and nonlinear. Various importance sampling methods have been developed to reduce the expense by focusing on samples in the important regions of the random variable space [19,34,41]. Another way of reducing the cost is to use surrogate models. Typically, a relatively small set of points is selected through a DOE method, and the true response is calculated at each sample point. These points are then used to construct an approximation of the true response, with a simple form and low nonlinearity, by a regression method (PLSR is used in this paper).
To develop the response surface surrogate model, Latin hypercube sampling (LHS) [22] is used in this paper to generate a group of sample observations. Considering the number of samples used in the dimensional reduction method (DRM) [37], which selects the same number of sample points along every axis, and based on several experiments, the proper number of samples used to construct the surrogate model is $N = 4 \times n$, where n is the number of random variables affecting the responses that determine the performance of the mechanical system. Assume that the mean value vector of the random variables is:

$$\mu = [\mu_1, \mu_2, \ldots, \mu_n]$$

and the deviation vector of the random variables is:

$$\sigma = [\sigma_1, \sigma_2, \ldots, \sigma_n]$$

Then the sampling space used in the LHS method can be given by:

$$[\mu - f\sigma, \ \mu + f\sigma]$$

The proper value, f = 4.5, is chosen from many experiments with f ranging from 3 to 6. With the selected coefficient bounding the sampling space, a more accurate reliability value is obtained.
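As a rough illustration of this design of experiment, the sketch below draws N = 4n Latin hypercube points inside a box bounded by the coefficient f = 4.5. It assumes that the sampling space is the interval from the mean minus f deviations to the mean plus f deviations for each variable; the function name `lhs_design` and the example means and deviations are hypothetical.

```python
import numpy as np

def lhs_design(mean, std, f=4.5, seed=None):
    """Latin hypercube design with N = 4n points in [mean - f*std, mean + f*std].

    Assumption: the box bounds are mean -/+ f*std for every variable.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    n = mean.size
    N = 4 * n
    u = np.empty((N, n))
    for j in range(n):
        # one random point in each of the N equal-width strata,
        # with the strata visited in an independent random order
        u[:, j] = (rng.permutation(N) + rng.random(N)) / N
    lower = mean - f * std
    upper = mean + f * std
    return lower + u * (upper - lower)

# Hypothetical example with n = 8 normally distributed variables.
mu = np.full(8, 10.0)
sigma = np.full(8, 0.5)
samples = lhs_design(mu, sigma, f=4.5, seed=1)
print(samples.shape)   # (32, 8)
```

Each column of the design places exactly one point in each of the N strata of its variable, which is what distinguishes LHS from plain uniform sampling in the same box.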

Response surface based on PLSR
With the N sample points, the corresponding N response function values of each $Y_j$, which are obtained from a structural analysis method (FEM, for example), are computed by:

$$y_{ij} = g_j(x_i), \quad (i = 1, 2, \ldots, N; \ j = 1, 2, \ldots, m)$$

Then the two data matrices, x and y, are enough to develop the surrogate models. RSM, which consists of a group of mathematical and statistical techniques, is used to develop an adequate functional relationship between a response of interest and a number of associated input variables. Without the cross-product terms of $X_1, X_2, \ldots, X_n$, the second-degree RSM model of the jth component performance function is:

$$\hat{Y}_j = a_{j0} + \sum_{i=1}^{n} a_{ji} X_i + \sum_{i=1}^{n} b_{ji} X_i^2, \quad (j = 1, 2, \ldots, m) \tag{16}$$

In this way, the nonlinearity of the original performance function can be represented. Substituting $X_i^2$ with $Z_i$, the RSM model is then defined as:

$$\hat{Y}_j = a_{j0} + \sum_{i=1}^{n} a_{ji} X_i + \sum_{i=1}^{n} b_{ji} Z_i, \quad (j = 1, 2, \ldots, m) \tag{17}$$

Therefore, combined with the two matrices consisting of the sample data of the input variables $X_i$ and $Z_i$ and the corresponding response data y, the parameters of the model in Eqs. (16) and (17) can be calculated by the PLSR method.

Corrected RSM based on PLSR
Because only the second powers of the input variables are considered and the cross-products of the input variables are neglected, the surrogate model may produce errors. Therefore, a method is needed to improve its accuracy. In the field of reliability analysis, almost all types of random variables share two characteristics: most of the data are distributed around the mean value, and the closer a value is to the mean, the larger the probability that it will be selected. These two characteristics show that the data around the mean value of the input variables have the larger impact on the reliability result of the component or system. Consequently, if the surrogate model is improved at the mean value of the input random variables, the accuracy of the reliability result calculated by the surrogate model will be improved. In light of this, a new correction method is proposed. The performance function values at the mean value of the input random variables are first calculated for both the original response functions and the surrogate functions. The difference between the two types of response values is then calculated and used as the correction coefficient to modify the meta-model. The procedure is as follows:

(1) Calculate the "mean" response function values
The value of the component response function at the mean value $\mu$ of the input random variables is evaluated as:

$$Y_j^{\mu} = g_j(\mu)$$

and the corresponding value obtained by the surrogate model is estimated as:

$$\hat{Y}_j^{\mu} = \hat{g}_j(\mu)$$

(2) Calculate the coefficients
The coefficients are then represented by:

$$cf_j = Y_j^{\mu} - \hat{Y}_j^{\mu}$$

where $cf_j$ represents the correction quantity of the surrogate model of the jth component limit state function with respect to the original exact one at the mean value of the random variables. With the coefficients $cf_j$, the surrogate model can be revised in the form of:

$$\hat{Y}_j^{C} = \hat{Y}_j + cf_j, \quad (j = 1, 2, \ldots, m)$$

With this correction, a more accurate surrogate model is obtained and the system reliability can be calculated more accurately. Combined with MCS, the model is used to simulate the system reliability.
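The quadratic surrogate without cross terms and the mean-value correction can be illustrated as follows. This is only a sketch under stated assumptions: ordinary least squares (`numpy.linalg.lstsq`) stands in for the PLSR fit used in the paper, the performance function `g` is a made-up example with a cross term, and the names `design_matrix`, `fit_surrogate`, `predict`, and `predict_corrected` are hypothetical.

```python
import numpy as np

def design_matrix(X):
    """Columns [1, X_1..X_n, X_1^2..X_n^2]; no cross-product terms."""
    X = np.atleast_2d(X)
    return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])

def fit_surrogate(X, y):
    """Least-squares fit (stand-in for PLSR, which is an assumption here)."""
    coef, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
    return coef

def predict(coef, X):
    """Uncorrected surrogate response."""
    return design_matrix(X) @ coef

def predict_corrected(coef, cf, X):
    """Corrected surrogate: uncorrected response plus the coefficient cf."""
    return predict(coef, X) + cf

def g(X):
    """Hypothetical 'original' performance function with a cross term."""
    X = np.atleast_2d(X)
    return 10.0 - X[:, 0] * X[:, 1] - 0.5 * X[:, 2] ** 2

# Hypothetical input statistics and N = 4n sample points in mean -/+ 4.5*std.
mu = np.array([2.0, 3.0, 1.0])
sigma = np.array([0.2, 0.3, 0.1])
rng = np.random.default_rng(0)
X_train = mu + 4.5 * sigma * rng.uniform(-1.0, 1.0, size=(4 * mu.size, mu.size))

coef = fit_surrogate(X_train, g(X_train))

# Correction coefficient: exact response minus surrogate response at the mean.
cf = g(mu)[0] - predict(coef, mu)[0]

print("uncorrected error at mean:", abs(predict(coef, mu)[0] - g(mu)[0]))
print("corrected error at mean:  ", abs(predict_corrected(coef, cf, mu)[0] - g(mu)[0]))
```

The correction costs exactly one additional evaluation of the original function (at the mean point) and, by construction, makes the surrogate exact there; how much it helps elsewhere depends on the neglected cross terms.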

MCS used in the reliability analysis of a dependent system
Assume that $Y_j > 0$ represents that the jth component is working well. For a series system, the system is reliable only if all of its components work well. When the component performance functions are used to express the system reliability, the performance function of the system can be defined as:

$$G = \min[Y_1, Y_2, \ldots, Y_m]$$

and the system reliability is defined as:

$$p_{Rs} = \Pr\{G > 0\}$$

For a parallel system, the system is reliable if any of the components works well. Then the performance function of the system can be computed by:

$$G = \max[Y_1, Y_2, \ldots, Y_m]$$

and the reliability of the parallel system is evaluated by:

$$p_{Rs} = \Pr\{G > 0\}$$

where all of the component functions are evaluated with their corrected surrogate models. The MCS estimates of the reliabilities of the series system and the parallel system, each with its own definition of $G$, are expressed as:

$$\hat{p}_{Rs} = \frac{1}{N_S} \sum_{k=1}^{N_S} I[G_k > 0]$$

where $G_k$ is the kth realization of $G$, $N_S$ is the sample size, and $I[\cdot]$ is an indicator function that equals one when $G_k$ is in the reliable set (i.e., when $G_k > 0$) and zero otherwise. Since the proposed method facilitates an explicit lower-dimensional approximation of a general multivariate function, the embedded MCS can be conducted for any sample size. The accuracy and efficiency of the reliability calculations using the developed method are discussed in Section 5.
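The indicator-function estimate can be written compactly once the component responses have been sampled. The sketch below assumes that the system performance function is the minimum of the component responses for a series system and the maximum for a parallel system; the function name `system_reliability` and the small response matrix are illustrative only.

```python
import numpy as np

def system_reliability(Y, kind):
    """MCS estimate of system reliability from sampled component responses.

    Y    : array of shape (N_S, m); Y[k, j] > 0 means component j works
           in realization k.
    kind : "series" (all components must work) or "parallel" (any one works).
    """
    Y = np.asarray(Y, dtype=float)
    if kind == "series":
        G = Y.min(axis=1)      # the weakest component governs
    elif kind == "parallel":
        G = Y.max(axis=1)      # the strongest component governs
    else:
        raise ValueError("kind must be 'series' or 'parallel'")
    return float(np.mean(G > 0))   # average of the indicator I[G_k > 0]

# Four hypothetical realizations of two component responses.
Y_demo = np.array([[ 1.0, -1.0],
                   [ 2.0,  3.0],
                   [-1.0, -2.0],
                   [ 0.5,  0.2]])
print(system_reliability(Y_demo, "series"))    # 0.5
print(system_reliability(Y_demo, "parallel"))  # 0.75
```

Because both estimates use the same realizations, dependence between component failures is captured automatically: no correlation coefficient between the components has to be specified.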

Numerical examples
Three methods, namely the Partial Least Squares Regression-Response Surface Method (PLSRRSM), the Corrected-Partial Least Squares Regression-Response Surface Method (C-PLSRRSM), and direct MCS (D-MCS) with $10^6$ samples, are used to analyze the reliability of systems with dependent failures. The accuracy of the proposed method is verified by three numerical examples. The system reliability calculated by MCS with the original performance functions is used as the benchmark. When computational effort is compared, the number of original performance function evaluations is chosen as the primary metric in this paper. For the direct MCS, the number of original function evaluations is the same as the sample size. In contrast, the MCS embedded in the proposed method, although it uses the same sample size as the direct MCS, is conducted with the response surface approximations. The difference in CPU time between evaluating an original function and its response surface approximation is significant when the original function involves expensive finite-element or mesh-free analysis.

Example 1-A ten-bar truss structural system
A ten-bar, linear-elastic truss structure, shown in Fig. 1, was studied to examine the accuracy and efficiency of the proposed system reliability analysis method. Two concentrated forces are applied at nodes 2 and 4. To build the limit state function of the structural system, three failure modes of the system analyzed by Huang [17] are considered: the stress failure of bar 3, in which the stress in bar 3 exceeds the allowable stress; the stress failure of bar 7, in which the stress in bar 7 exceeds the allowable stress; and the displacement failure of node 2, in which the maximum displacement at node 2 exceeds the allowable one. With the three corresponding component limit state functions $Y_1$, $Y_2$, and $Y_3$, the system limit state function of the ten-bar structure is given by:

$$G = \min[Y_1, Y_2, Y_3]$$

The system composed of these three failure modes is therefore a series system, and it is used to demonstrate the accuracy and efficiency of the proposed method. The properties of the input random variables, denoted as $X_1$ to $X_8$, are given in Table 1. All variables are normally distributed. The reliability results from PLSRRSM, C-PLSRRSM, and D-MCS, as a function of the number of sample points used to develop the surrogate model, are given in Fig. 2. When the number of sample points is 4n = 4 × 8 = 32, the reliabilities of the ten-bar truss structure system obtained by PLSRRSM, C-PLSRRSM, and D-MCS are 0.5767, 0.9443, and 0.9315, respectively. With the reliability calculated by D-MCS selected as the benchmark, the relative errors of the PLSRRSM and C-PLSRRSM results are 38.1% and 1.37%, respectively. The correction model therefore improves the accuracy of the reliability estimate by 36.74%. Moreover, when the number of sample points exceeds 32, the accuracy of the reliabilities estimated by the two surrogate-based methods remains essentially unchanged as the number of sample points increases. In other words, the accuracy of the proposed method cannot be improved further by using a large number of sample points. This confirms that the proposed correction model requires only a few original function evaluations to generate an accurate result.

Example 2-A cantilever beam system
The second test problem involves the reliability analysis of a cantilever beam, as shown in Fig. 3. Two external forces $F_1$ and $F_2$, two external moments $M_1$ and $M_2$, and external distributed loads are applied to the cantilever beam. A total of twenty-one random variables, such as the dimensions, the yield strength $S$, and the maximum allowable shear stress $\tau_{max}$, are involved in this example, as shown in Table 2. The system limit state function, composed of three component limit state functions [7], is used to assess the accuracy of the proposed method.
The first component limit state function represents the difference between the maximum normal stress and the yield strength $S$. The second component limit state function expresses that the deflection $v_{tip}$ at the tip of the beam should be less than the allowable deflection $v_{max}$; it involves the reaction force $R$ at the fixed end and the prescribed Young's modulus, and the allowable deflection is $v_{max} = 0.025$.
The third limit state function compares the shear stress $\tau$ at the root, which follows from the shear force at the root, with the maximum allowable shear stress $\tau_{max}$. Fig. 4 presents the reliabilities of the cantilever beam predicted by PLSRRSM, C-PLSRRSM, and D-MCS as the number of sample points increases. When the number of sample points is 4n = 4 × 21 = 84, the reliabilities from PLSRRSM, C-PLSRRSM, and D-MCS are 0.967, 0.9542, and 0.9537, respectively. The absolute relative errors of PLSRRSM and C-PLSRRSM with respect to D-MCS are 1.39% and 0.052%, respectively. Therefore, the reliabilities calculated by both surrogate-based methods are accurate, and the result obtained by C-PLSRRSM almost coincides with the benchmark. As can be seen in Fig. 4, when the number of sample points is more than 84, both PLSRRSM and C-PLSRRSM provide stable reliability results with small fluctuations. As in the first example, C-PLSRRSM is more accurate than PLSRRSM. The effectiveness of the proposed method is thus also demonstrated by this example.

Example 3-Vehicle side impact
The final test problem investigates the side impact crashworthiness of a vehicle subject to variations in the sizes and material properties of several key components. This problem has been investigated by many researchers in the fields of reliability based design optimization and robust design optimization. However, the reliability of each component is usually treated separately, without considering the failure dependency. In fact, every failure mode is a potential failure mode of the system, and strong dependencies exist between them. When any of the components fails, the entire vehicle, as a series system, is considered to have failed. The system limit state function constructed by Bichon [3] with ten failure modes is used to test the proposed method.
Ten equations corresponding to the failure modes are considered; they include the abdomen load, the rib deflections, and the viscous criteria at several locations. Combined with the corresponding allowable values, the system limit state function of the vehicle side impact problem is defined as:

$$G = \min[Y_1, Y_2, \ldots, Y_{10}]$$

and the system reliability can be expressed by:

$$p_{Rs} = \Pr\{G > 0\}$$

The distribution information of the random variables, denoted as $X_1$ to $X_{11}$, which involve the thicknesses and material properties of critical structures in the vehicle and the location of the impact, is given in Table 3. The reliability was calculated by using the proposed method and compared with D-MCS, as shown in Fig. 5, which also shows how the reliability changes as the number of sample points increases. For 4n = 4 × 11 = 44 sample points, the calculated reliabilities of the vehicle are 0.747, 0.8441, and 0.8231 for PLSRRSM, C-PLSRRSM, and D-MCS, respectively. The relative errors of PLSRRSM and C-PLSRRSM with respect to D-MCS are 9.25% and 2.55%, respectively. Unlike in the previous examples, when the number of sample points is more than 82, the improvement brought by the corrected model is not significant, and the reliabilities computed by PLSRRSM and C-PLSRRSM are close to each other. This may be because the original functions of the vehicle system are themselves derived from RSM. Nevertheless, the example confirms the accuracy and efficiency of the proposed method.
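The error percentages quoted in the three examples can be reproduced directly from the reported reliabilities. The short script below is only an arithmetic check; it assumes that the error is the absolute difference from the D-MCS benchmark divided by the benchmark value, and the dictionary layout and the function name `relative_error_percent` are illustrative.

```python
def relative_error_percent(estimate, benchmark):
    """Absolute relative error with respect to the benchmark, in percent."""
    return abs(estimate - benchmark) / benchmark * 100.0

# Reported reliabilities: (PLSRRSM, C-PLSRRSM, D-MCS benchmark).
reported = {
    "ten-bar truss (N = 32)":       (0.5767, 0.9443, 0.9315),
    "cantilever beam (N = 84)":     (0.967,  0.9542, 0.9537),
    "vehicle side impact (N = 44)": (0.747,  0.8441, 0.8231),
}

for name, (plsr, corrected, benchmark) in reported.items():
    e_plsr = relative_error_percent(plsr, benchmark)
    e_corr = relative_error_percent(corrected, benchmark)
    print(f"{name}: PLSRRSM {e_plsr:.3f}%, C-PLSRRSM {e_corr:.3f}%")
```

Under this definition the computed values agree with the percentages reported in the text to within rounding.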

Conclusion
Based on the ability of PLSR to analyze the dependency between a common set of input variables and several different responses, a new response surface modeling method for structural or mechanical systems was developed. To improve the accuracy of the surrogate model, a correction method that adds a coefficient to each component meta-model of the system surrogate model was presented. The coefficient is defined as the difference between the exact response value and the surrogate one at the mean values of the input variables. The corrected surrogate model, combined with the MCS method, was then used to analyze the system reliability; the approach is named Corrected-PLSR-RSM based system reliability analysis (C-PLSRRSM-SRA). For the sampling used to build the response surface model, LHS was selected, and a coefficient f = 4.5 was chosen to bound the sampling region of the input random variables.
Because it needs only a small number of original function evaluations, the proposed method is effective, particularly when a response evaluation entails costly finite-element, mesh-free, or other numerical analysis for which the limit state function is implicit. The surrogate model is also an effective way to solve problems with complex and highly nonlinear explicit limit state functions, for which it reduces the computational expense. The numerical examples tested in this paper indicate that the proposed method provides accurate and computationally efficient estimates of reliability. Compared with the PLSRRSM method, the C-PLSRRSM method brings considerable improvements in accuracy, efficiency, and stability, at the cost of only one additional evaluation of the original limit state functions. The C-PLSRRSM method can be more accurate for the reliability of a system whose highly nonlinear limit state function is composed of several component limit state functions involving a large number of input variables, and it provides a moderately accurate value of the system reliability that fulfils the requirements of engineering applications. With the proposed sampling method, another advantage of the new method is that a more accurate surrogate model can be built with a small number of sample points.
However, the C-PLSRRSM method needs a large number of original function evaluations to build the surrogate model when the number of input random variables is large, which decreases the efficiency of the proposed method. In addition, the C-PLSRRSM method may produce an error at high reliability levels (e.g., more than 99.9%). Because the response surface model is regressed without mixed terms, the surrogate model cannot represent the original performance functions over the whole distribution region of the input random variables. Developing a more accurate correction method to deal with these issues is left for future work.

Acknowledgement
The support from the High-End Talents Recruitment Program of Liaoning Province of China is gratefully acknowledged. The authors would like to thank the anonymous referees and the editor for their valuable comments and suggestions, which led to an improvement of this article.
the corresponding response data, y (of dimension N × m), the parameters of the model shown in Eqs. (16) and (17) can be calculated accurately by the PLSR method, where I is an N × 1 matrix filled with ones.
$cf_j$, the surrogate model can be revised in the form of: in Table 1. All variables are normally distributed. The reliability results from PLSRRSM, C-PLSRRSM, and D-MCS, corresponding to the change in the number of sample points used to develop the surrogate model, are given in Fig. 2. When the number of sample points is 4n = 4 × 8 = 32

Fig. 1. A ten-bar truss structure with random cross-sectional areas

3.1. A simple PLSR algorithm
lecting N observations (sample points) composed of N input data vectors

Table 1. Distribution details of input random variables

Table 3. Random variables in the vehicle side impact problem
Deviation of impact location (mm)