FACTOR ANALYSIS IN SEISMOLOGY

R.Z. Burtiev, V.G. Alcaz, S.V. Troian and V.Yu. Kardanets
Institute of Geology and Seismology of Moldova, Chisinau.

Manuscript History: Received: 22 December 2019; Final Accepted: 25 January 2020; Published: February 2020

Abstract:-

Factor analysis allows us to explore the structure of the relationships among variables: each group of variables is defined by the factor on which those variables have their maximum loadings. The result of factor analysis is a transition from a set of input parameters to a smaller number of variables, called factors. A factor is interpreted as a hidden variable, the cause of the joint variability (interrelation) of several initial parameters. A method for calculating the values of the attenuation function is proposed. On the basis of the obtained values, the seismic hazard is calculated as the probability that, within a fixed time t, n seismic shocks occur at points Q(φ, ψ) of the earth's surface, of which m have intensity I_k on the MSK-64 scale. The method is used to study the seismicity of the Vrancea focus and the earthquakes of Romania. Seismic processes are complex and diverse, since they are produced by complex and diverse geological and geophysical processes occurring in the Earth's interior; they are characterized by many different parameters, and the results of observations on them are represented as multidimensional random variables. In the study of such multiparameter processes, the question arises: is it possible to discard some of the parameters, or to replace them with a smaller number of functions of them, while retaining all the information? Factor analysis addresses this problem [Bakhtin et al., 2007]. It is based on determining the minimum number of factors that account for the largest share of the variance of the data [Harman, 1972]. In the study of the complex nature of seismicity, factor analysis helps to better understand the essence of seismic processes, since the interdependence between seismic parameters must be due to hidden relationships between them, the identification of which is the task of factor analysis.

Introduction:-
Seismology is mostly an empirical science, since its statements, like its models, are based on facts and observations. The goal of the empirical sciences is to capture the results of real-world observations in an abstract model. Describing future behavior and determining the future form of the model is their main task. In practical applications, the best model is considered to be the one that gives the most general and simplest description of the observations and has high forecasting potential. Since a model cannot cover the whole set of observations, methods of probability theory are used to draw conclusions, with some (high) probability, about the general, real relationships in a set of observations. The purpose of stochastic modeling is to recognize the true structure of the system from the observed data.

ISSN: 2320-5407 Int. J. Adv. Res. 8(02), 1268-1285

The analyzed array of seismic data consists of the values of 16 parameters characterizing the seismicity of the Vrancea source: MSK, the intensity at the epicenter; Mw, the moment magnitude; R, the hypocentral distance; Az, the azimuth to the most distant point; NP1stk, NP1dp and NP1slip, the strike, dip and slip of nodal plane 1; NP2stk, NP2dp and NP2slip, the strike, dip and slip of nodal plane 2; Paz and Ppl, the azimuth and plunge of the compression axis; Baz and Bpl, the azimuth and plunge of the neutral axis; Taz and Tpl, the azimuth and plunge of the tension axis.
Modern models of the Earth's structure and theories explaining the occurrence of earthquakes are based on indirect data, mainly seismic observations. The main purpose of geophysical research is to solve the inverse problem, i.e. to determine the structure of the medium from observations of the characteristics of physical fields. The attenuation of the intensity of seismic effects is one of the factors determining the quality of seismic hazard analysis. The seismic impact of each earthquake is determined by characteristics such as the tectonic structure of the region, the focal depth, mechanism and geometry, and the direction and course of the rupture process in the rocks. The macroseismic field reflects the influence of all these factors, and of local geological features, on the manifestation of the seismic effect at points of the ground surface. Another significant factor determining the quality of seismic hazard analysis is the seismicity model of the earthquake zone. Seismicity is the susceptibility of the Earth, or of individual territories, to earthquakes; it reflects the geological processes occurring in the Earth's interior, which take place in areas that differ in structure and in the nature of their geological development. Seismicity is characterized by:
1. Earthquake frequency,
2. Statistical distribution of the strength of shocks (magnitudes),
3. Spatial distribution of foci,
4. Macroseismic observations of strong seismic events (seismic intensity, damage pattern).
In the study of seismicity we deal with a seismic process that unfolds in time and space, and many tasks of seismology involve calculating the probabilities of events associated with this process and determining their probabilistic structure. Seismology is concerned with the study of earthquakes and related phenomena, and its main task is to learn how to predict the strength, time and place of earthquakes. A model that covers the various forms of dependence between events reflects reality faithfully, but it leads to great difficulties in studying the probabilistic structure of the events and in statistical analysis. A compromise is therefore made: a model is selected that takes into account enough dependencies to be adequate while remaining amenable to statistical analysis. Mathematical statistics, whose main task is to link the real world of data with the world of theoretical models, is built on probability theory; it is used to estimate model parameters from sample data, to forecast and to test hypotheses, and to some extent it solves inverse problems. It reveals how plausible hypotheses about phenomena are. In addition, its methods help the researcher to draw conclusions about mass phenomena from observations and to identify the most significant factors and directions of research. Seismic processes arise and develop in time and space under the influence of the internal determinism of global tectonics. Uncertainties associated with the interplay of the Earth's internal physical fields and the gravitational forces of celestial bodies, and with their influence on global tectonics, introduce an element of randomness into seismicity models [Burtiev, 2017].
Seismic processes are complex and diverse, since they are produced by complex and diverse geological and geophysical processes occurring in the Earth's interior; they are characterized by many different parameters, and the results of observations on them are represented as multidimensional random variables. In the study of such multiparameter processes, the question arises: is it possible to discard some of the parameters, or to replace them with a smaller number of functions of them, while retaining all the information? Factor analysis addresses this problem. It is based on determining the minimum number of factors that account for the largest share of the variance of the data. In the study of the complex nature of seismicity, factor analysis helps to better understand the essence of seismic processes, since the interdependence between seismic parameters must be due to hidden relationships between them, the identification of which is the task of factor analysis. Factor analysis makes it possible to explore the structure of the relationships among variables: each group of variables is defined by the factor on which those variables have their maximum loadings. The result of factor analysis is a transition from a set of initial variables to a smaller number of new variables, the factors. A factor is interpreted as a hidden variable, the cause of the joint variability (interrelation) of several source variables. Factor analysis thus solves the problem of reducing the number of variables with minimal loss of the initial information. If we assume that the correlations between variables are explained by the influence of hidden causes (factors), then the main task of factor analysis is to analyze the correlations within the set of parameters. If the researcher is interested only in the structure of the variables, this completes the factor analysis.
Continuing the factor analysis, we can calculate the values of the factors for use in further regression analysis. Most factor analysis methods are based on principal component analysis (PCA), which converts a group of correlated source variables into another group of uncorrelated variables. PCA solves the problem of reducing the number of variables while retaining the maximum proportion of the variance of the observations, by keeping only the principal components that explain most of the variance.
The main stages of factor analysis are: calculation of the correlation matrix; extraction of factors; selection and rotation of factors; interpretation of factors; calculation of factor values; assessment of the quality of the model. The analyzed array of seismic data consists of the values of 16 parameters characterizing the mechanism and geometry of the Vrancea focus: MSK, the intensity at the epicenter on the MSK-64 scale; Mw; R; Az; NP1stk; NP1dp; NP1slip; NP2stk; NP2dp; NP2slip; Paz; Ppl; Baz; Bpl; Taz; Tpl. It is assumed that the values of these parameters are due to hidden factors that cannot be observed directly, and that the significant correlations between the parameters are caused by these factors. Factor analysis can reveal the relationships that determine the correlation between seismic parameters; its task is to determine the minimum number of factors that contribute the most to the variance of the data. The analysis begins with the calculation of descriptive statistics for the array of observed values of the 16-dimensional random variable characterizing the seismicity, mechanism and geometry of the Vrancea focus. Table 1 shows that the statistical characteristics of the parameters have a large scatter, because the parameters are measured in different units that are not comparable with each other; they should therefore be brought to a single scale using standardization:
$$X^{s}_{ij} = \frac{X_{ij} - \bar{X}_j}{\sigma_j} \qquad (1)$$
where $X_{ij}$ is the value of the j-th parameter in the i-th observation, and $\bar{X}_j$ and $\sigma_j$ are the mean and standard deviation of the j-th parameter.
As Table 1 shows, the statistical characteristics of the parameters vary widely and the parameters are measured in different, incomparable units, so they are brought to a single scale by standardization. A first idea of the presence of dependent parameters can be obtained from the correlation matrix (Table 2), which characterizes the degree of correlation between the parameters of the original array. The higher the proportion of high correlations, the better suited the data are for factor analysis. For example, the correlation coefficient between the parameters np1stk and np2stk is 0.726 (Table 2). The correlation coefficient between paz and np1dp is 0.884, which indicates a rather high degree of dependence between these parameters and is the basis for including them in one group. The main diagonal of the correlation matrix consists of ones: as in the covariance matrix, the diagonal elements are the variances of the I parameters used, and because the parameters are standardized these variances equal 1.
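The standardization and correlation steps described above can be illustrated with a short sketch. It is not the authors' computation: the data are synthetic, the four columns are hypothetical stand-ins for seismic parameters on very different scales, and the names `standardize`, `X`, `Z` and `R` are introduced here only for illustration.

```python
import numpy as np

# Hypothetical data: 200 observations of 4 parameters measured on very different scales.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([
    6.0 + 0.5 * base + 0.1 * rng.normal(size=(200, 1)),      # magnitude-like column
    150.0 + 40.0 * base + 20.0 * rng.normal(size=(200, 1)),  # distance-like column, km
    rng.uniform(0.0, 360.0, size=(200, 1)),                  # azimuth-like column, degrees
    rng.uniform(0.0, 90.0, size=(200, 1)),                   # plunge-like column, degrees
])

def standardize(data):
    """Column-wise standardization: subtract the mean, divide by the standard deviation."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

Z = standardize(X)
n_obs = Z.shape[0]

# For standardized data the covariance matrix coincides with the correlation matrix,
# so its main diagonal consists of ones.
R = (Z.T @ Z) / (n_obs - 1)
```

After standardization every column has zero mean and unit variance, which is why the diagonal of `R` equals 1 regardless of the original units.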

Results and Discussion:-
The Chaddock scale (Table 3) can be used to describe the strength of the correlation coefficients verbally. For example, there is a close correlation between the NP1stk and NP2stk parameters, R = -0.820, and there is no correlation between MSK and NP1stk, since the absolute value of the correlation, R = -0.016, lies in the range (0; 0.1).
In factor analysis the correlation matrix is transformed so that all its off-diagonal elements become zero and the diagonal elements change their values. This means that the transformed variables are uncorrelated with each other. Most factor analysis methods are based on PCA, which converts a set of correlated input parameters into a set of uncorrelated factors. In this method, after the correlation matrix is calculated, an orthogonal transformation is applied to it and the factor loadings are determined from the elements of the resulting matrix.
The most common rotation method, Varimax (uncorrelated factors), is used. The method is based on finding the eigenvalues and eigenvectors of the correlation matrix by solving the equation
$$AA' = R \qquad (2)$$
where $R$ is the original correlation matrix, $A$ is the matrix whose elements are the factor loadings $a_{ij}$ of parameter $i$ on component $j$, and $A'$ is the transposed matrix. Solving this equation yields the matrix of factor loadings $(a_{ij})$ (Table 4). The optimal number of components of the factor model, i.e. the number of factors, is determined by the number of eigenvalues of the correlation matrix that exceed one. The first column of the table "Cumulative variance explained" (Table 5) shows that four eigenvalues are greater than one, that is, four factors are defined.
The attenuation of seismic intensity is an important factor determining the quality of seismic zoning. Seismic intensity values calculated from the magnitude, epicentral distance and depth often deviate from the observed MSK-64 intensities at the intensity data points (IDP) of the macroseismic field, and the seismic hazard is accordingly calculated with errors. In an isotropic geophysical medium with a uniform distribution of seismic energy, the shape of the theoretical isoseismals would be determined by the geometry of the source in the near zone and would be close to a circle in the far zone. In reality, however, the lines of equal intensity spread out from the epicenter as ovals or as curves of intricate shape. The shape of these isolines is influenced by general factors (the geometry and mechanism of the source, shear wave velocities, geometric spreading of the wave front) and by regional factors (energy dissipation on inhomogeneities, absorption by the medium, physical properties of the medium, soil types, features of the geological structure, composition of the material, thickness of the soil layer, groundwater level, velocity contrasts between bedrock and overlying layers, etc.). To take the influence of these factors on the intensity of seismic effects into account, and to improve the quality of seismic hazard maps, more complex models of intensity attenuation should be built. At the present stage of development of seismology, however, the complexity of the attenuation model is limited to an elliptical form of the dependence of intensity attenuation on distance from the earthquake source [Rautian, 1982; Shebalin, 1961, 1997].
The development of a more complex and adequate attenuation model is studied by many seismologists. For example, an analysis of macroseismic and instrumental data for intermediate-depth earthquakes of the Vrancea source revealed specific features of their effects: an impact on large areas with a predominant NE-SW orientation; a stronger dependence of the amplitude of ground displacement on local and regional geological conditions than on magnitude and distance from the source; a large variability of the parameters of strong ground motion; and isolines that reflect the topography of the earth's surface [Ismail-Zadeh et al., 2007]. In the near zone, where r ~ h (r is the epicentral distance, h is the depth of the earthquake source), the geometry of the source has a decisive influence on the configuration of the macroseismic field [Shebalin, 1961; 1980]. A study of the seismicity of Kyrgyzstan revealed that the influence of the source length on the configuration of the macroseismic field disappears at distances of 25-35 km for earthquakes of magnitude M = 6 and of 80-110 km for earthquakes of M = 7 [Januzakov, 2013]. The macroseismic field is characterized by zones of equal intensity and by the lines separating them, which are usually quite tortuous. This fact prompted us to add variables characterizing the mechanism and geometry of the earthquake source to the macroseismic field equation, in order to create a more realistic model of the attenuation of seismic intensity. Mathematical statistics provides the method of factor analysis, whose task here is to reduce the number of regressors in multiparameter problems.

As noted above, the seismic parameters are measured in different, incompatible units (Table 1) and are therefore brought to a single scale using standardization (1). The values of the regression coefficients are estimated using the least squares method. For the estimates to be consistent, unbiased and efficient, the Gauss-Markov conditions must be satisfied: the random component must have zero mean and a constant variance in all observations (homoscedasticity), i.e.
$$E(\varepsilon_i) = 0, \qquad D(\varepsilon_i) = \sigma^2 ;$$
in addition, it is assumed that the random component has a normal distribution, i.e.
$$\varepsilon_i \sim N(0, \sigma^2) .$$
The multidimensional linear regression model is
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_j x_j + \varepsilon ,$$
where $y$ is the dependent random variable; $x_1, \dots, x_j$ are the explanatory variables (regressors); $\beta_0$ is the constant and $\beta_1, \beta_2, \dots, \beta_j$ are the regression coefficients; $\varepsilon$ is the random component.
To explain 100% of the variance in the seismic data, 16 components would be required. However, the factor extraction procedure extracted only 4 factors out of 16, which explain 84.996% of the variance in the data set. That is, a factor model consisting of 4 factors retains 84.996% of the initial information. As noted earlier, some loss of information is inevitable when the initial array of parameters is grouped. Retaining as little as 60% of the information is considered a fairly good result: usually, factor analysis uses the first principal components whose cumulative share of the variance exceeds 60%. Considering that factor analysis reduces the number of parameters several times over, the use of a factor model is appropriate even with a large loss of information, 40% for example. The initial eigenvalues and the eigenvalues after extraction (extraction sums of squared loadings) are the same for PCA (Table 5); with other extraction methods they differ slightly. The graph of eigenvalues is useful for determining the number of factors, which is read off at the point where the sharp decline gives way to a gentle slope. Beyond that number of factors the curve is close to a horizontal line, that is, the decrease of the eigenvalues slows down. As can be seen (Fig. 1), the inflection point is at the 4th factor, that is, no more than 4 factors stand out. This method is used to determine the number of factors before rotation, the purpose of which is to obtain a simple model in which each parameter has a large loading on one factor and small loadings on all the others. The factor loading is the correlation coefficient of each parameter with each of the identified factors.
The sum of the squares of the loadings in a column of the factor matrix represents the variance of the factor and serves as a measure of its information content. The ratio of the variance of the factor to the total variance of the standardized initial variables (numerically equal to their number) makes it possible to judge the relative informativeness of the factor. In the principal component method, the variances of the principal components are equal to the eigenvalues of the correlation matrix of the initial variables.
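The extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the paper: the 4 x 4 correlation matrix is hypothetical, and `pca_loadings` is a helper name introduced here. It shows that the loadings reproduce the correlation matrix, that the column sums of squared loadings equal the eigenvalues, and how the "eigenvalue greater than one" rule and the cumulative share of variance are obtained.

```python
import numpy as np

def pca_loadings(corr):
    """Eigenvalues (descending) and principal-component loadings of a correlation matrix."""
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh is meant for symmetric matrices
    order = np.argsort(eigvals)[::-1]         # sort from largest to smallest eigenvalue
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative round-off
    eigvecs = eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)     # scale each eigenvector by sqrt(eigenvalue)
    return eigvals, loadings

# Hypothetical correlation matrix with two groups of related parameters.
R = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

eigvals, A = pca_loadings(R)
factor_variance = (A ** 2).sum(axis=0)        # column sums of squared loadings
share = factor_variance / R.shape[0]          # total variance equals the number of variables
cumulative_share = np.cumsum(share)
n_factors = int((eigvals > 1.0).sum())        # number of eigenvalues exceeding one
```

With all components kept, `A @ A.T` reproduces `R` exactly; keeping only the first `n_factors` columns gives the reduced factor model, and `cumulative_share[n_factors - 1]` is the proportion of variance it retains.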
Factor analysis relies on rotating the factors to obtain a simple structure of the relationships between parameters, in which each parameter has a large loading on only one factor and small loadings on the rest. The "Varimax" rotation option is usually used; it is orthogonal, since under such a rotation the axes remain at right angles. During extraction each successive factor is determined so as to maximize the variance left over from the previous factors, and orthogonal rotation keeps the factors uncorrelated with each other. In each row of the rotated component matrix (Table 7), the loading with the maximum absolute value is marked. For example, the np1dp parameter correlates most closely with factor 1: the correlation value is 0.951.
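To make the idea of orthogonal rotation concrete, here is a minimal sketch of a varimax rotation using the widely used SVD-based iteration. It is an illustration, not the SPSS implementation referred to in the paper: the loading matrix is hypothetical, and the function names, iteration limit and tolerance are assumptions chosen for this example.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally towards simple structure (varimax)."""
    n_params, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.sum(rotated ** 2, axis=0)            # sum of squared loadings per factor
        gradient = loadings.T @ (rotated ** 3 - rotated * col_ss / n_params)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                                 # closest orthogonal matrix
        new_objective = np.sum(s)
        if new_objective < objective * (1.0 + tol):       # no further improvement
            break
        objective = new_objective
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(loadings ** 2, axis=0)))

# Hypothetical simple structure, deliberately mixed by a 30-degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.85]])
angle = np.deg2rad(30.0)
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ mix

rotated, T = varimax(unrotated)
```

Because `T` is orthogonal, the row sums of squared loadings are unchanged by the rotation; only the distribution of each parameter's variance across the factors changes.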
The parameters paz, np2stk, np2dp and np1dp are likewise most strongly associated with the first factor (correlations 0.910, -0.932, -0.934 and 0.951). The factors to which the remaining parameters should be attributed are determined in the same way (Table 7). The name of each factor should be chosen on the basis of logic and the research topic. Thus the parameters np1slip and np2slip are combined in the first factor (Table 7), which is hidden from direct measurement and causes the interdependence of these parameters. The other parameters are combined into the remaining factors accordingly. It thus becomes possible to analyze data on the four identified factors instead of the initial set of 16 parameters. In PCA the value of each parameter $X_i$ is represented as a linear combination of the factors $F_j$ weighted by the factor loadings $a_{ij}$:
$$X_i = \sum_{j=1}^{J} a_{ij} F_j , \qquad i = 1, 2, \dots, I ,$$
where $J$ is the number of factors, $I$ is the number of parameters, and $J \ll I$. The matrix of factor loadings, whose number of columns equals the number of common factors and whose number of rows equals the number of input parameters, is called the factor matrix. It captures the degree of linear relationship of each parameter with each common factor. A loading close to zero means that the factor has almost no effect on the parameter. The contribution of each factor to the total variance of all parameters can be calculated as the sum of the squared loadings of that factor over all parameters. The higher the share of this contribution in the total variance, the more significant the factor. Factor scores are the estimated values of each factor for each observation, and they reveal outliers by factor (Table 8).
The matrix of factor score coefficients (Table 8) contains the regression coefficients $b_{ij}$, which are used to calculate the factor scores:
$$F_{jn} = \sum_{i=1}^{I} b_{ij} X^{s}_{in} ,$$
where $F_{jn}$ is the value of the j-th factor in the n-th observation, $X^{s}_{in}$ is the standardized value of the i-th parameter in the n-th observation of the initial sample, and $b_{ij}$ are the factor score coefficients. In the SPSS package the calculated factor scores are labeled FAC1_1, FAC2_1, and so on; they are appended to the right of the source data array and can be used as variables in further study. No outliers are seen in the calculated factor scores, which indicates a good quality of the factor analysis. The sum of the squared loadings of the j-th factor over all $I$ parameters equals the eigenvalue of this factor:
$$\lambda_j = \sum_{i=1}^{I} a_{ij}^2 .$$
It characterizes the structure of the factors, can be expressed as a percentage, and helps in understanding the nature of the factors. One of the characteristics of the quality of the factor model is the communality, the sum of the squared factor loadings along a row of the factor matrix:
$$h_i^2 = \sum_{j=1}^{J} a_{ij}^2 .$$
It is the share of the variance of a given parameter that is explained by all the factors, and it can be interpreted as the reliability of this parameter. The communality shows for which parameters factor analysis works better or worse. It must be greater than 0.5; otherwise the corresponding parameter is removed [Nasledov, 2013]. A value of 1 means that the variance of the parameter is completely determined by the extracted factors. Extraction is the proportion of a parameter's variance, taking values in the interval (0, 1), that is explained by the factors retained after extraction. For example, the extracted factors account for over 97.6% of the variance of the Ppl parameter (Table 8).
If a parameter has a low communality, for example less than 0.5, applying the factor model to it does not make sense and the parameter should be removed from the model.
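The communality check can be written in a few lines. The sketch below uses a hypothetical loading matrix and hypothetical parameter names (`p1` to `p4`); only the 0.5 threshold comes from the text.

```python
import numpy as np

# Hypothetical rotated loadings: rows are parameters, columns are factors.
names = ["p1", "p2", "p3", "p4"]
A = np.array([
    [0.95, 0.10],
    [0.20, 0.85],
    [0.40, 0.30],    # weakly explained by both factors
    [-0.90, 0.05],
])

# Communality: row sum of squared loadings, the share of each parameter's
# variance explained by all retained factors.
communality = (A ** 2).sum(axis=1)

# Parameters below the 0.5 threshold are candidates for removal.
keep = communality >= 0.5
dropped = [name for name, kept in zip(names, keep) if not kept]
```

Here only `p3` falls below the threshold, with a communality of 0.25.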
In factor analysis, each factor collects those parameters of the initial array that have the closest correlation (the largest factor loadings) with it; the loadings are contained in the rotated component matrix (Table 7). Factor loadings vary from -1 to +1 and are analogous to correlation coefficients. In the matrix of factor loadings, the significant loadings should be identified using Student's criterion. The Kaiser-Meyer-Olkin (KMO) criterion is used to check the suitability of the data for factorization on the basis of the correlations and partial correlations between the initial parameters. The sample KMO statistic is also used to detect multicollinearity in factor analysis and to determine which parameters need to be removed from the model. Here the sample KMO value of 0.57 (Table 10) exceeds the threshold value of 0.5, which indicates that the data are suitable for factor analysis.
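As an illustration of how a KMO statistic is built from correlations and partial correlations, the sketch below uses the standard definition: the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. The function name `kmo` and the 3 x 3 test matrix are assumptions made for this example; the partial correlations are obtained from the inverse of the correlation matrix.

```python
import numpy as np

def kmo(corr):
    """Overall and per-variable Kaiser-Meyer-Olkin measures of a correlation matrix."""
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)          # partial correlations given all others
    off_diag = ~np.eye(corr.shape[0], dtype=bool)    # mask that excludes the diagonal
    r2 = np.where(off_diag, corr ** 2, 0.0)
    p2 = np.where(off_diag, partial ** 2, 0.0)
    overall = r2.sum() / (r2.sum() + p2.sum())
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    return overall, per_variable

# Hypothetical matrix: three parameters, all pairwise correlations equal to 0.5.
R3 = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])
kmo_overall, kmo_each = kmo(R3)
```

For this matrix every partial correlation equals 1/3, so the overall measure is 0.25 / (0.25 + 1/9) = 9/13, about 0.69, which is above the 0.5 threshold.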
The hypothesis H0 that the parameters involved in the factor analysis are not correlated with each other is tested against the alternative hypothesis that they are correlated using the Bartlett criterion. The significance probability of the sample Bartlett statistic (the significance line) is 0.0, below the significance level α = 0.05; the hypothesis H0 is therefore rejected in favor of the alternative hypothesis that the parameters are correlated. Thus the KMO and Bartlett criteria confirm that factor analysis is appropriate for studying the set of seismic parameter values. For factor analysis the overall KMO measure must be greater than 0.50. If it is below this threshold, the parameters with the lowest individual KMO values should be removed, and the process repeated until the overall KMO, computed over all the remaining parameters, reaches the threshold. The individual KMO statistics of the parameters are the diagonal elements of the anti-image matrix (Table 11).
Multicollinearity means a high mutual correlation of the explanatory variables of a regression. The absence of strong collinearity among the regressors is one of the conditions for applying the least squares method to estimate the parameters of a multidimensional linear regression. The study of seismicity involves identifying cause-and-effect relationships in endogenous processes. Regression analysis, which makes it possible to identify trends in mass phenomena, can be useful for studying these relationships and determining their strength. Regression analysis assumes that the following conditions are met: The relationship between the variables is linear, which can be seen on a scatter diagram of the variable values.
The residuals, i.e. the differences between the predicted and observed values, are normally distributed. The nature of the distribution of the residuals can be assessed visually from their histogram by comparing it with the density of the standard normal distribution. Two regressors x and z are considered collinear if the pair correlation coefficient $r_{xz} > 0.7$ [Gabrielyan, 2006]. The absence of multicollinearity, together with homoscedasticity (constancy of the variance of the regression residuals), is a necessary condition for obtaining unbiased, efficient and consistent estimates of the regression coefficients. The model is checked for collinearity of the regressors using the variance inflation factor:
$$VIF_j = \frac{1}{1 - R_j^2} , \qquad (13)$$
where $R_j^2$ is the coefficient of determination of the regression of the j-th regressor on all the other regressors. In practical applications it is recommended that the number of observations n be at least three times the number of independent variables m [Mkhitaryan et al., 2008]. A high value of $VIF_j$, greater than 4, for at least one regressor indicates multicollinearity.
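The multicollinearity check can be sketched as follows, assuming the usual definition of the variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing the j-th regressor on all the others. Under that definition the VIF values are the diagonal elements of the inverse correlation matrix of the regressors. The data are synthetic and the names `vif`, `x1`, `x2`, `x3` are hypothetical.

```python
import numpy as np

def vif(regressors):
    """Variance inflation factors: diagonal of the inverse correlation matrix of the regressors."""
    corr = np.corrcoef(regressors, rowvar=False)
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)                  # unrelated to x1
x3 = x1 + 0.1 * rng.normal(size=500)       # nearly a copy of x1

vif_independent = vif(np.column_stack([x1, x2]))
vif_collinear = vif(np.column_stack([x1, x2, x3]))

# The tolerance of a regressor is the reciprocal of its VIF.
tolerance_collinear = 1.0 / vif_collinear
```

With the threshold of 4 used in the text, the first design passes the check, while in the second design `x1` and `x3` are flagged and `x2` is not.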
Multicollinearity is indicated by values of the VIF coefficient (13) of 4 or higher for at least one j. The quality of the regression model is characterized by the coefficient of determination R². In step-by-step regression a regressor is entered into the equation and the adequacy of the regression is checked using test criteria. If the model is significant, the next regressor is included, and so on. If the significance of a regressor is below the critical value, it is excluded; otherwise the regressors with the largest partial correlation with the dependent variable are retained in the regression equation. Forward and backward step-by-step regressions are thus combined, with parameters successively entered into and removed from the equation. The multiple regression method makes it possible to determine the significance of the linear relationship between the parameters and the seismic intensity, the quality of the approximation of the data by the regression equation, the suitability of the estimated coefficients for prediction, and the significance of individual parameters for predicting the seismic intensity. The correlation matrix of all the sample data shows a noticeable correlation between the intensity of seismic shocks and the hypocentral distance, and a moderate dependence between magnitude and MSK-64 intensity. A functional or strong relationship was observed between some of the physical parameters of the foci, probably due to endogenous processes whose nature remains to be clarified.
The table shows that the step-by-step regression method successively included the following parameters in the regression equation: Constant, lnR, Mw, np2slip, tpl, azim, starting with the constant and the hypocentral distance. In the summary table, Fisher's criterion shows the regression to be significant for all 5 models, since the p-value of the F-statistic used to test the null hypothesis H_0 that the regression equation is insignificant is zero. The best model in terms of statistical characteristics was the one with 5 regressors (Mw, lnR, np2slip, tpl and azim) in addition to the constant. In this case, the coefficient of determination is greatest: R^2 = 0.823. The sample distribution function of the regression residuals (Fig. 2) almost coincides with the theoretical distribution function, which speaks in favor of the model with 5 regressors. However, the test criteria characterizing the adequacy of the regression model indicate sufficient statistical significance of the attenuation model that includes only the constant, the magnitude Mw, and the natural logarithm of the hypocentral distance lnR. Including other parameters did not significantly improve the quality of the attenuation model, and such parameters are difficult to process statistically. It was therefore decided to include only magnitude and distance in the attenuation regression equation, as is customary in seismology. In this task, the parameters are given in different units of measurement. Therefore, to estimate the degree of influence of the parameters on the intensity of seismic shocks, the standardized version of the regression equation is used, with standardized coefficients. Instead of the values of all parameters, their standardized values are used.
In this case, only the form in which the regression equation is written changes; the correlation coefficients between all parameters, the coefficient of determination, and the significance of the regression coefficients do not change. The p-value of the t-statistic is used to test the null hypothesis that a regression coefficient is insignificant. The table shows that in the third, fourth and fifth models, the "Constant" coefficient is statistically insignificant: its p-value exceeds the significance level α = 0.05. The values of the VIF coefficient are less than 4, therefore multicollinearity is absent. Another indicator of collinearity is the tolerance of a parameter, whose sample values significantly exceed the threshold value of 0.2. Therefore, the corresponding parameter is not a linear combination of the other regressors. One of the prerequisites for the adequacy of the regression model is the normal distribution of the residuals. This is indicated by the histogram and the sample distribution function of the residuals, which deviate only slightly from the density and the distribution function of the normal law (Fig. 2, a; b).
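The statement that standardization changes only the form of the regression equation can be checked numerically. The sketch below fits the two-regressor attenuation model, intensity = a + b·Mw + c·lnR, to synthetic data and then refits it on standardized variables. The coefficients 3.0, 1.2 and -1.5, the distance range and the noise level are made-up values for illustration, not estimates from the study.

```python
import numpy as np

def ols(X, y):
    """OLS with a constant term; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2

def zscore(a):
    """Standardize to zero mean and unit sample standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

# Synthetic sample (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 500
Mw = rng.uniform(4.0, 7.5, n)
lnR = np.log(rng.uniform(90.0, 600.0, n))      # hypothetical distances in km
intensity = 3.0 + 1.2 * Mw - 1.5 * lnR + rng.normal(0.0, 0.5, n)

X = np.column_stack([Mw, lnR])
coef, r2 = ols(X, intensity)                   # raw units
beta, r2_std = ols(zscore(X), zscore(intensity))   # standardized form

# Standardized coefficients follow from the raw ones: beta_j = b_j * s_xj / s_y.
beta_from_raw = coef[1:] * X.std(axis=0, ddof=1) / intensity.std(ddof=1)
print("raw coefficients         :", coef)
print("standardized coefficients:", beta[1:])
print("R^2 raw / standardized   :", r2, r2_std)
```

The standardized coefficients are dimensionless, so their absolute values can be compared directly to judge the relative influence of magnitude and distance on intensity.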
The approximation of a macroseismic field by an ellipse is a simplified model of the real picture, which is mosaic, with meandering lines dividing zones of equal intensity. To build a more realistic model of the attenuation of the intensity of seismic shocks with increasing distance from the earthquake source, the following scheme is proposed: the origin of coordinates is placed at the epicenter of the earthquakes under study, and in each sliding azimuth sector 40° wide, moved with a step of 20°, the values of the coefficients of the macroseismic field equation are determined using regression analysis (Table 15). Two regression models of the intensity of seismic effects, in points, were built: one on the magnitude Mw and the logarithm of the hypocentral distance lnR, and one on all parameters Mw, lnR, azim, np1stk, np1dp, np1slip, np2stk, np2dp, np2slip, paz, ppl, baz, bpl, taz, tpl. The test criteria for the significance of the regression indicate that the regression of the intensity of seismic shocks on 2 parameters, Mw and lnR, is of sufficiently high quality. The regression of intensity on magnitude and hypocentral distance is characterized by the sample values R^2 = 0.784 and a high value of Fisher's test criterion, which rejects the hypothesis that the regression is insignificant. The values in Table 13 indicate the presence of weak positive autocorrelation. The values of the VIF coefficient are less than 4, therefore there is no multicollinearity. The seismic hazard was calculated and a seismic zoning map of the territory of Moldova and neighboring countries was constructed (Fig. 3, 4) based on the obtained values of the attenuation function coefficients. The methodology for calculating seismic hazard developed at the Institute of Geology and Seismology was applied.
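The sliding azimuth-sector scheme can be sketched as follows. The sketch assumes that each observation carries an azimuth measured from the epicenter and that the two-regressor model, intensity = a + b·Mw + c·lnR, is fitted by ordinary least squares in every sector. The function name `sector_fits`, the minimum sample size `min_obs` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def sector_fits(azim, Mw, lnR, intensity, width=40.0, step=20.0, min_obs=10):
    """Fit intensity = a + b*Mw + c*lnR in sliding azimuth sectors.

    Sectors are centred at 0, step, 2*step, ... degrees and wrap around 360.
    Returns a list of (centre, n_obs, coefficients [a, b, c]) tuples.
    """
    out = []
    for centre in np.arange(0.0, 360.0, step):
        # Smallest angular distance between each azimuth and the sector centre.
        diff = np.abs((azim - centre + 180.0) % 360.0 - 180.0)
        mask = diff <= width / 2.0
        if mask.sum() < min_obs:
            continue                            # too few points for a stable fit
        A = np.column_stack([np.ones(mask.sum()), Mw[mask], lnR[mask]])
        coef, *_ = np.linalg.lstsq(A, intensity[mask], rcond=None)
        out.append((float(centre), int(mask.sum()), coef))
    return out

# Synthetic field whose distance coefficient depends on azimuth (assumed values).
rng = np.random.default_rng(3)
n_obs = 2000
azim = rng.uniform(0.0, 360.0, n_obs)
Mw = rng.uniform(4.0, 7.5, n_obs)
lnR = np.log(rng.uniform(90.0, 600.0, n_obs))
c_true = -1.5 - 0.3 * np.cos(np.radians(azim))
intensity = 3.0 + 1.2 * Mw + c_true * lnR + rng.normal(0.0, 0.3, n_obs)

fits = sector_fits(azim, Mw, lnR, intensity)
for centre, count, coef in fits:
    print(f"{centre:5.0f} deg  n={count:4d}  a={coef[0]:6.2f}  b={coef[1]:5.2f}  c={coef[2]:6.2f}")
```

Because the sector width is twice the step, each observation contributes to two neighbouring sectors, which smooths the azimuthal variation of the fitted coefficients.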
The mapped territory was covered by a geographical grid with a discretization step of 0.2°, and the seismic hazard at the grid nodes was calculated using the formula from [Burtiev, 2017]. In that formula, ω_k is the probability of occurrence at a grid node of a seismic shock with intensity I_k; together these probabilities form the probability distribution vector of seismic shocks with intensity I_k, k = 1, 2, …, 12, in degrees of the MSK-64 scale.

Conclusion:-
In the study of the complex nature of seismicity, factor analysis proved useful for understanding the essence of seismic processes. A statistically significant correlation was observed between the 16 parameters reflecting the mechanism and geometry of the Vrancea source, and factor analysis revealed hidden factors that determine the relationships between these parameters. Factor analysis leads to a loss of some of the information contained in the source data, but the significant reduction in the number of parameters justifies its use and helps to identify patterns in seismic processes that are not directly observable. The seismic hazard map of the territory of Moldova, Romania and Bulgaria, in points of the MSK-64 scale, was constructed from the values of the attenuation function coefficients (Table 6), calculated using the moving average method over all seismic zones.

Recommendations:-
A method for calculating the values of the attenuation equation coefficients is proposed. On the basis of the obtained values, the seismic hazard is calculated as the probability that, within a fixed time t, n seismic shocks occur at points of the earth's surface, of which m have intensity I_k of the MSK-64 scale. For regression analysis based on the ordinary least squares method to give the best results, the random error must satisfy the Gauss-Markov conditions. In particular, the mathematical expectation of the random error in any observation must be zero, which means that it should not have a systematic bias. Usually, if the regression equation includes a free term, this condition is satisfied automatically, since the role of the constant is to capture any systematic tendency of the explained variable that is not accounted for by the explanatory variables included in the regression equation. Multicollinearity means a high cross-correlation of the explanatory regression variables. The absence of high collinearity among the regressors is one of the conditions for applying the least squares method to estimate the parameters of a multiple linear regression. To estimate the coefficients of the attenuation function in the presence of multicollinearity, we use regression on principal components, where the strongly correlated regressors are replaced by the components F1, F2, F3, F4, identified by the principal component model of factor analysis, which are mutually uncorrelated.
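Regression on principal components can be sketched in a few lines. The sketch below is a simplified illustration under stated assumptions: the components are taken as unrotated principal components of the correlation matrix of the regressors (any rotation applied in the factor analysis is not reproduced), the number of retained components `k` is chosen by hand, and the data are synthetic. The function name `pcr` is hypothetical.

```python
import numpy as np

def pcr(X, y, k):
    """Principal component regression with k retained components.

    Returns (intercept, beta_std, F): the intercept, the coefficients
    expressed for the standardized regressors, and the component scores.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized regressors
    corr = np.corrcoef(Z, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)              # ascending eigenvalues
    order = np.argsort(eigval)[::-1][:k]               # k largest components
    V = eigvec[:, order]
    F = Z @ V                                          # uncorrelated scores
    A = np.column_stack([np.ones(len(y)), F])
    gamma, *_ = np.linalg.lstsq(A, y, rcond=None)
    beta_std = V @ gamma[1:]                           # back to regressor space
    return gamma[0], beta_std, F

# Synthetic regressors with two strongly correlated pairs (assumed data).
rng = np.random.default_rng(4)
n = 400
a, b, c = rng.normal(size=(3, n))
X = np.column_stack([
    a, a + 0.1 * rng.normal(size=n),
    b, b + 0.1 * rng.normal(size=n),
    c,
])
y = 2.0 + a - b + 0.5 * c + rng.normal(scale=0.5, size=n)

intercept, beta_std, F = pcr(X, y, k=3)
score_corr = np.corrcoef(F, rowvar=False)
print("correlation of component scores:\n", np.round(score_corr, 6))
print("coefficients for standardized regressors:", beta_std)
```

Since the component scores are uncorrelated, the least squares estimates of their coefficients are not inflated by collinearity, and the mapping `V @ gamma[1:]` expresses the result in terms of the original standardized regressors.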