Elastic Net Regression and Empirical Mode Decomposition for Enhancing the Accuracy of Model Selection

Elastic net (ELNET) regression is a hybrid statistical technique for regularization and for selecting the predictor variables that have a strong effect on the response variable, while dealing with multicollinearity among the predictors when it exists. The empirical mode decomposition (EMD) algorithm decomposes a nonstationary and nonlinear dataset into a finite set of orthogonal intrinsic mode function (IMF) components and one residual component. This study applies the proposed ELNET-EMD method to determine the effect of the decomposition components of multivariate time-series predictors on the response variable and to tackle the multicollinearity between those components, thereby improving the prediction accuracy of the fitted model. A numerical experiment and a real data application are carried out. The results show that the proposed ELNET-EMD method outperforms existing methods: it identifies the decomposition components that are most significant for the response variable despite the high correlation between the components, and it improves the prediction accuracy.


Introduction
Many fields, such as medicine and economics, rely on time-series datasets, and these datasets are often both non-stationary and non-linear. However, traditional statistical methods struggle to extract oscillatory patterns from such data because they assume the dataset is stationary or linear; examples include Fourier decomposition (Titchmarsh, 1948) and wavelet decomposition (Chan, 1994). Huang et al. (1998) proposed the empirical mode decomposition (EMD) method, which decomposes non-stationary and non-linear data while preserving the time domain. In contrast to the traditional methods, EMD imposes no restrictions or pre-conditions, such as stationarity or linearity, on the nature of the data. Hybrid approaches that build on EMD include regression with noise-assisted multivariate EMD (NA-MEMD) (Masselot et al., 2018) and support vector regression with particle swarm optimization based on EMD (Hong and Fan, 2019).

Methodology
This section briefly describes the applied methods: the EMD method, which decomposes the original signal through the sifting process, and the ELNET penalized regularization regression. It also presents the proposed ELNET-EMD method. Huang et al. proposed the EMD in 1998. The EMD targets nonstationary and nonlinear time-series datasets, using the sifting process to decompose the original signal into a finite set of orthogonal decomposition components with different oscillatory patterns while keeping the time domain of the signal unchanged. These orthogonal decomposition components are called intrinsic mode function (IMF) components, plus one residual component (Huang, 2014; Moore et al., 2018).

Sifting Process
The iterative EMD algorithm that extracts all the IMF components and the one residual component is called the sifting process. It separates the original signal into orthogonal components on non-overlapping time scales (Huang, 2014). The sifting process for decomposing the original signal x(t) is summarized as follows:

Step 1: Initialize the repetition indicators to one (k = 1, i = 1), where k indexes the IMF being extracted and i indexes the sifting iteration.

Step 2: Determine all the local maxima and local minima of the signal x(t).

Step 3: Using cubic spline curves, connect all local maxima and all local minima separately to build an upper envelope u(t) and a lower envelope l(t), respectively, such that the whole signal lies between these envelopes.

Step 4: Determine the mean envelope value between the upper and lower envelopes to build a new line curve, using the following equation:

m(t) = (u(t) + l(t)) / 2.   (1)

Step 5: Take the difference between the signal x(t) and the mean envelope value m(t) as the new function h_i(t):

h_i(t) = x(t) − m(t).   (2)

We verify whether the new function h_i(t) satisfies the conditions of an IMF. If h_i(t) satisfies the conditions, then IMF_k(t) = h_i(t), where IMF_k(t) is the k-th IMF (k = 1, 2, …, K), and we continue to Step 6. If not, we replace x(t) with h_i(t) and repeat the operation from Step 2 with repetition indicator i = i + 1.

Step 6: Calculate the residual component r_k(t) as in the following formula:

r_k(t) = x(t) − IMF_k(t).   (3)

We check whether the residual component r_k(t) is a monotonic function or satisfies the stoppage criterion on the standard deviation of two consecutive successive siftings:

SD = Σ_{t=0}^{T} [ |h_{i−1}(t) − h_i(t)|² / h²_{i−1}(t) ],   (4)

with the SD threshold typically set between 0.2 and 0.3. If not, we replace x(t) with r_k(t) and repeat the operation from Step 1 with repetition indicators k = k + 1 and i = 1. If yes, we save all the IMF and residual components and end the sifting process.
The original signal x(t) is the linear combination of the finite set of orthogonal IMFs {IMF_k(t): k = 1, 2, …, K} and one monotonic residual r(t) extracted via EMD, as in the following formula:

x(t) = Σ_{k=1}^{K} IMF_k(t) + r(t).   (5)
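The sifting steps above can be sketched in a short Python program. This is an illustrative sketch, not the authors' implementation: it uses a simple neighbor-comparison extrema detector, clamps the cubic-spline envelopes at the signal endpoints, and applies an aggregate form of the SD stoppage criterion; the function names (`sift`, `emd`) and all thresholds are choices made here. Production EMD libraries handle boundary effects more carefully.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sift(x, t, max_iter=50, sd_thresh=0.25):
    """Extract one IMF from x(t) by repeated sifting (Steps 2-5)."""
    h = x.copy()
    for _ in range(max_iter):
        # Step 2: locate interior local maxima and minima
        maxima = np.where((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:]))[0] + 1
        minima = np.where((h[1:-1] < h[:-2]) & (h[1:-1] < h[2:]))[0] + 1
        if len(maxima) < 2 or len(minima) < 2:
            return None  # too few extrema: h is (near-)monotonic, no IMF left
        # Step 3: cubic-spline upper/lower envelopes, clamped at the endpoints
        u = CubicSpline(np.r_[t[0], t[maxima], t[-1]],
                        np.r_[h[0], h[maxima], h[-1]])(t)
        l = CubicSpline(np.r_[t[0], t[minima], t[-1]],
                        np.r_[h[0], h[minima], h[-1]])(t)
        m = (u + l) / 2.0                       # Step 4: mean envelope
        h_new = h - m                           # Step 5: candidate IMF
        # aggregate SD criterion for two consecutive siftings
        sd = np.sum((h - h_new) ** 2) / np.sum(h ** 2)
        h = h_new
        if sd < sd_thresh:
            break
    return h

def emd(x, t, max_imfs=10):
    """Step 6 loop: peel off IMFs until the residual is (near-)monotonic."""
    imfs, r = [], x.copy()
    for _ in range(max_imfs):
        imf = sift(r, t)
        if imf is None:
            break
        imfs.append(imf)
        r = r - imf
    return imfs, r

t = np.linspace(0.0, 9.0, 500)
x = np.sin(t) + 0.5 * np.sin(9.0 * t)
imfs, r = emd(x, t)
```

By construction, summing the extracted IMFs and the residual reconstructs the original signal exactly, mirroring Equation (5).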

IMFs
The IMFs represent a new orthogonal design of signals resulting from the division of the main original signal via EMD. Each IMF component {IMF_k(t): k = 1, 2, …, K} and the residual component r(t) has a distinct and easily interpreted physical meaning. A function is an IMF component if it satisfies two conditions (Huang et al., 1998; Huang, 2014):

(1) Over the whole length of the signal, the number of local extrema (LE) (maxima and minima) and the number of zero-crossings (ZCs) must be either equal or differ by at most one.

(2) At any point of the signal, the mean envelope value m(t) between the upper envelope u(t), defined by the local maxima, and the lower envelope l(t), defined by the local minima, is equal to zero.

The first condition indicates that each IMF has only one local maximum or local minimum between two consecutive ZCs; meanwhile, the second condition explains that all the IMFs are stationary, which makes the analysis process highly flexible when using these components (Raghuram et al., 2012).

ELNET Regression
Zou and Hastie proposed ELNET regression in 2005. It is a hybrid penalized least squares regression method that performs both regularization and variable selection (Zou and Hastie, 2005). ELNET regression combines two shrinkage regression techniques: Ridge regression (the L2 penalty), which deals with high-multicollinearity problems, and LASSO regression (the L1 penalty), which performs feature selection on the regression coefficients (Wang et al., 2019).
ELNET regression reduces the number of predictor variables by shrinking the regression coefficients of the less important variables toward or exactly to zero, using the sum of the absolute values of the coefficients (the L1 norm) multiplied by the tuning parameter λ1. The high correlation between the predictor variables is treated using the sum of the squared coefficients (the L2 norm) multiplied by the tuning parameter λ2. This principle produces an interpretable fitted model by excluding unnecessary variables from the final model, which increases the prediction accuracy. ELNET deals with multicollinearity by keeping groups of highly correlated predictor variables together, either in or out of the fitted model (Liu and Li, 2017).
The multiple linear regression model, which builds the relationship between the response variable and the predictor variables, is derived as follows:

y_i = β_0 + Σ_{j=1}^{p} x_ij β_j + ε_i,   i = 1, 2, …, n and j = 1, 2, …, p,   (6)

where y_i is the i-th response variable, β_0 is the intercept, x_ij is the j-th predictor variable of the i-th observation, β_j is the regression coefficient of the j-th predictor variable (the average effect on y of a one-unit change in the j-th predictor x_j), and ε_i is the random error. For simplicity, we assume that the predictor and response variables are standardized, by subtracting the means and dividing by the standard deviations, to obtain zero mean and unit variance (Melkumova and Shatskikh, 2017).
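The standardization assumed here is simple to carry out in practice; a minimal NumPy sketch (the helper name `standardize` is ours):

```python
import numpy as np

def standardize(A):
    """Column-wise standardization: subtract the mean and divide by the
    standard deviation, so each column has zero mean and unit variance."""
    A = np.asarray(A, dtype=float)
    return (A - A.mean(axis=0)) / A.std(axis=0)

Z = standardize([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```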
The traditional ordinary least squares (OLS) regression method estimates the unknown regression coefficients by minimizing the residual sum of squares (RSS):

RSS = Σ_{i=1}^{n} (y_i − β_0 − Σ_{j=1}^{p} x_ij β_j)².   (7)

Using Equation (7), the OLS regression coefficient estimates for the j-th element (i.e., β̂_j: j = 0, 1, 2, …, p) are calculated by minimizing this form (Montgomery et al., 2012):

β̂^OLS = argmin_β { RSS }.   (8)

The ELNET regression method penalizes the OLS estimator by adding the penalty terms (i.e., the L1 and L2 penalties) to estimate the regression coefficients β̂ (Zou and Hastie, 2005):

β̂^ELNET = argmin_β { RSS + λ1 Σ_{j=1}^{p} |β_j| + λ2 Σ_{j=1}^{p} β_j² },   (9)

where λ1 and λ2 are positive tuning parameters (λ1, λ2 > 0) that are selected automatically using cross-validation (CV); the tuning parameters control the strength of the regularization and of the selection of the predictor variables (Zou and Hastie, 2005). By denoting λ1 = 2λα and λ2 = λ(1 − α) (Haws et al., 2015), Equation (9) becomes equivalent to

β̂^ELNET = argmin_β { RSS + λ [ 2α Σ_{j=1}^{p} |β_j| + (1 − α) Σ_{j=1}^{p} β_j² ] },   (10)

where α is a regularization parameter (0 ≤ α ≤ 1) and λ is a tuning parameter (λ > 0).
The ELNET estimate reduces to the form of Ridge regression when α = 0 and to the form of LASSO regression when α = 1; ELNET regression therefore sets an appropriate value of α between zero and one. The two penalty functions are combined so that the L2 penalty stabilizes the L1 regularization while the L1 penalty generates a sparse model (Liu and Li, 2017). Thus, Ridge and LASSO regressions are special cases of ELNET regression (Haws et al., 2015). The two penalties differ in how they shrink the regression coefficients and how they handle the correlation between the predictor variables.
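These special cases can be sketched with scikit-learn (an illustration, not the study's code). Note that scikit-learn's `ElasticNet` naming differs from the notation here: its `alpha` argument is the overall penalty strength (λ in Equation (10)) and its `l1_ratio` plays the role of the mixing parameter α; its objective also rescales the penalty terms slightly, so coefficients match Equation (10) only up to that scaling. The simulated data below are assumptions of this sketch.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n, p = 100, 6
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)   # a highly correlated pair
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=n)

# scikit-learn's `alpha` is the overall strength (lambda here) and
# `l1_ratio` is the L1/L2 mixing parameter (alpha here)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)   # elastic net
lasso = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)  # pure L1: LASSO
```

Setting `l1_ratio=0` similarly reduces the fit to Ridge regression, although scikit-learn recommends its dedicated `Ridge` estimator for that case.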

Proposed ELNET-EMD Method
The ELNET-EMD method is designed to explain the significance of the decomposition components of multivariate time-series predictors X_j(t), j = 1, 2, 3, …, p, on the response variable, selecting the most necessary components and tackling the multicollinearity through the following process:

(1) Decompose each original predictor X_j(t) separately into a finite set of orthogonal IMF_{j,k}(t) components and one residual r_j(t) component via EMD, where the original time-series predictor is the summation of all decomposition components of the j-th predictor:

X_j(t) = Σ_{k=1}^{K_j} IMF_{j,k}(t) + r_j(t).   (11)

(2) Using Equation (6), all the decomposition components of the multivariate time-series predictors obtained via EMD are used as new predictor variables to explain the response variable y(t):

y(t) = β_0 + Σ_j Σ_k β_{j,k} IMF_{j,k}(t) + Σ_j β_{j,r} r_j(t) + ε(t).   (12)

(3) The ELNET regression of Equation (10) is then applied to Equation (12):

β̂^ELNET-EMD = argmin_β { Σ_t (y(t) − β_0 − Σ_j Σ_k β_{j,k} IMF_{j,k}(t) − Σ_j β_{j,r} r_j(t))² + λ [ 2α Σ |β| + (1 − α) Σ β² ] }.   (13)

The numerical experiments and the real time-series data are used to compare the ELNET-EMD method against seven methods based on EMD: OLS-EMD, SR-EMD, Ridge-EMD, LASSO-EMD, the smoothly clipped absolute deviation penalty (Fan and Li, 2001) (SCAD-EMD), the minimax concave penalty (Zhang, 2010) (MCP-EMD), and the adaptive LASSO (Zou, 2006) (Ad-LASSO-EMD).
To check the correlation among the decomposition components, the variance inflation factor (VIF) test is used. The decomposition components are free from multicollinearity when the VIF value is less than 10 (Jadhav et al., 2014). The VIF takes the form

VIF_{j,k} = 1 / (1 − R²_{j,k}),   (14)

where R²_{j,k} is the coefficient of determination from regressing the decomposition component IMF_{j,k} on the remaining components in the model.
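The VIF can be computed directly from its definition; the sketch below (the helper name `vif` is ours) regresses each column of a design matrix on the remaining columns with OLS and applies Equation (14):

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all the other columns (with an intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])        # add intercept
        beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ beta
        r2 = 1.0 - resid.var() / X[:, j].var()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)   # near-duplicate column
v = vif(X)
```

On this toy matrix the two near-duplicate columns receive very large VIF values (well above the threshold of 10), while the independent column stays near 1.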
In this article, three criteria are used to assess prediction accuracy: (a) the root mean square error (RMSE), (b) the mean absolute error (MAE), and (c) the mean absolute percentage error (MAPE):

RMSE = sqrt( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² ),   (15)

MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|,   (16)

MAPE = (100/n) Σ_{i=1}^{n} |(y_i − ŷ_i) / y_i|,   (17)

where y_i is the actual value, ŷ_i is the estimated value, and n is the sample size.
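The three criteria are one-liners in NumPy; a minimal sketch (function names are ours):

```python
import numpy as np

def rmse(y, yhat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    """Mean absolute percentage error (undefined when any y_i is zero)."""
    return float(100.0 * np.mean(np.abs((y - yhat) / y)))

y = np.array([1.0, 2.0, 4.0])
yhat = np.array([1.0, 2.0, 2.0])
```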

Simulation Studies
This section explains the numerical experiment, in which signals simulated from sine and cosine functions are used to apply the methods, and then presents the numerical results and discussion.

Numerical Experiment
Sine and cosine wave functions are used to generate simulated signals that demonstrate the ability of the ELNET-EMD method. The numerical experiments cover two cases. In the first case, the predictor variables X_1(t), X_2(t), and X_3(t) are fixed variables without white-noise error. In the second case, the predictor variables contain white-noise error with a normal distribution of zero mean and unit variance (i.e., X_j(t) = f_j(t) + ε, ε ~ N(0,1), j = 1, 2, 3). The response variable is simulated from one or two components of the predictor variables, in line with the simulation studies of Qin et al. (2016) and Al-Jawarneh et al. (2020) for generating variables.
The datasets are simulated with a sample size of n = 110 and a time sequence between zero and nine (0 ≤ t ≤ 9). For the second case, 2,000 replications of sample size n = 110 are generated. The datasets are divided into two parts: 70% for training the model and 30% for testing and evaluating the performance criteria. The test functions of the response and the predictor variables are combinations of sine and cosine terms: the response y(t) is built from sin(t), sin(2t), cos(6t), and cos(13t); X_1(t) from sin(2t), cos(t), sin(5t), and sin(9t); X_2(t) from 0.2 sin(t), cos(6t), and cos(9t); and X_3(t) from sin(t), sin(8t), and cos(7t).

Figure 2 shows the decomposition components, obtained via the EMD method, of the original multivariate predictors X_1(t), X_2(t), and X_3(t). Table 1 shows the VIF values of the test of multicollinearity between the decomposition components of the original multivariate predictors via EMD. Several decomposition components obtain VIF values larger than 10 (i.e., the IMF_{1,6}, IMF_{2,3}, IMF_{2,5}, and IMF_{3,3} components), indicating that high multicollinearity (correlation) exists between the decomposition components.

Figure 3 shows the CV and coefficient-estimation plots of the ELNET-EMD. The first plot on the left represents 10-fold CV, where the y-axis represents the mean square error (MSE) and the x-axis represents log(λ). The upper horizontal axis shows the number of non-zero coefficients selected at each log(λ) value. The first vertical dotted line from the left marks the λ selected by the minimum-MSE rule (λ_min), while the second vertical line marks the λ selected by the one-standard-error rule (λ_1se). According to the CV plot, increasing λ decreases the number of non-zero coefficients entering the final model; the selection of λ at the λ_min or λ_1se rule is based on the optimal minimum MSE value.
The second plot represents the coefficient estimation and shows the order in which the non-zero regression coefficients of the decomposition components enter the final model; it indicates their sign with respect to the response variable at the current λ, which determines the actual degrees of freedom. The number of non-zero coefficient estimates therefore differs with the value of λ selected at the λ_min or λ_1se rule: for example, 14 components are identified under the λ_min rule, while seven components are observed under the λ_1se rule. Table 2 illustrates the λ values selected by CV for the regression methods in the first case of the simulation. The results show that the smallest value is obtained for the ELNET-EMD model at the λ_1se rule (λ_1se = 0.25958).
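The λ_min and λ_1se rules can be reproduced outside glmnet. The sketch below, on assumed toy data, computes both from a scikit-learn `ElasticNetCV` fit; scikit-learn reports only the minimum-MSE λ directly, so the one-standard-error rule is applied by hand to the cross-validation error path:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(1)
X = rng.normal(size=(110, 8))                      # assumed toy predictors
y = X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.3, size=110)

cv = ElasticNetCV(l1_ratio=0.5, cv=10).fit(X, y)

# fold-wise CV errors per candidate lambda; cv.alphas_ is in decreasing order
mse = cv.mse_path_.reshape(len(cv.alphas_), -1)
mse_mean = mse.mean(axis=1)
mse_se = mse.std(axis=1) / np.sqrt(mse.shape[1])   # standard error per lambda

i_min = int(np.argmin(mse_mean))
lam_min = cv.alphas_[i_min]                        # lambda_min rule
# 1se rule: the largest lambda whose mean CV error stays within one
# standard error of the minimum
ok = np.where(mse_mean <= mse_mean[i_min] + mse_se[i_min])[0]
lam_1se = cv.alphas_[ok[0]]
```

Because the candidate λ values are sorted in decreasing order, the first index satisfying the one-standard-error condition is the largest qualifying λ, so λ_1se ≥ λ_min and the λ_1se model is at least as sparse.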

Numerical Results and Discussion
This selected λ value makes the ELNET-EMD model the best method for selecting important variables and supports the fitted regression model.

Table 3 explains the estimation of the coefficients of the decomposition components for each of the regression methods used in this study. Most regression methods reduce the number of decomposition components, except Ridge-EMD, in which all the decomposition components enter the final model. The numbers of non-zero coefficients of ELNET-EMD and the other methods in the study therefore differ from those of the Ridge-EMD method. For ELNET-EMD at the λ_1se rule, seven non-zero coefficients, the IMF_{1,2}, IMF_{1,3}, IMF_{2,1}, IMF_{2,2}, IMF_{2,3}, IMF_{3,1}, and IMF_{3,4} components, enter the regression model with varying levels of significance; the IMF_{1,3} and IMF_{2,3} components are more significant for the response variable than the other selected components.

Table 4 illustrates the performance criteria of the prediction accuracy of the regression methods, using RMSE, MAE, and MAPE, in the first case of the numerical experiment. The results show that ELNET-EMD has the smallest error values in terms of RMSE, MAE, and MAPE. Therefore, the ELNET-EMD method is highly reliable for the selected components, with high prediction accuracy. (In Table 4, 1-8 ranks the methods from minimum to maximum values.)

Moreover, in the second case of the numerical experiment, the white-noise error ε ~ N(0,1) is added to the main predictor variables. After performing the same process as in the first case, the results show that the selected components are similar to those of the first case. Table 5 reports the mean performance criteria (RMSE, MAE, and MAPE) of the ELNET-EMD method and the other regression methods in the second case; the results show that ELNET-EMD again has the smallest error values on these criteria.

Therefore, the ELNET-EMD method improves the prediction accuracy by producing the smallest error values in terms of RMSE, MAE, and MAPE. Based on the numerical experiment results, ELNET-EMD selects the necessary variables that have a significant effect on the response variable and improves the prediction accuracy despite the existence of white noise.

Application
In this section, we provide an empirical data analysis using the daily exchange rates of four countries against the US dollar (USD).

Exchange Rate
In this study, the proposed method identifies the decomposition components, obtained via the EMD method, of the multivariate original predictor variables that exert a substantial effect on the response variable when multicollinearity exists among the decomposition components, thereby enhancing the prediction accuracy. The daily close exchange rates of four countries against the USD, namely, Taiwan (TAW/USD), Malaysia (MYR/USD), Japan (JAP/USD), and China (CHN/USD), are used, with the Taiwan rate as the response variable and the other three as the predictor variables X_1(t), X_2(t), and X_3(t).

Figure 5 shows the decomposition, via the EMD algorithm, of the main predictor variables X_1(t), X_2(t), and X_3(t). Each of these variables is decomposed into seven IMFs and one residual component. The first three components have high frequency and short wavelength; these physical properties vary among the components, with the frequency decreasing and the wavelength increasing as the component number increases.

Table 6 explains the values of the VIF test of multicollinearity between the decomposition components. The results show that several VIF values are greater than ten (VIF > 10), such as those of the IMF_{1,7}, IMF_{2,7}, IMF_{2,8}, and IMF_{3,7} components. This finding indicates that a high correlation exists among the decomposition components of the MYR, JAP, and CHN variables, which subsequently indicates that multicollinearity exists.

Figure 6 shows the 10-fold CV and coefficient-estimation plots of the ELNET-EMD. The figures illustrate the number of decomposition components selected into the final model by choosing the value of λ at the λ_min or λ_1se rule according to the CV plot. Under the λ_min rule, twelve decomposition components are selected into the final model, while under the λ_1se rule, only the IMF_{1,1} and IMF_{3,1} components, which have the strongest effect on the response variable, enter the final model.

International Journal of Mathematical, Engineering and Management Sciences, Vol. 6, No. 2, 564-583, 2021. https://doi.org/10.33889/IJMEMS.2021.6.2.034

Figure 5. Decomposition of the main predictor variables X_1(t), X_2(t), and X_3(t) via EMD.

Figure 6. 10-fold cross-validation (10-CV) estimation and the coefficient estimation for ELNET-EMD.

Table 8 explains the estimation of the coefficients of the decomposition components. For ELNET-EMD at the λ_min rule, compared with the other regression methods used in the study, twelve non-zero coefficients are estimated: six components from Malaysia (IMF_{1,1} to IMF_{1,4}, IMF_{1,7}, and r_1), four components from Japan (IMF_{2,1}, IMF_{2,5}, IMF_{2,6}, and r_2), and two components from China (IMF_{3,1} and IMF_{3,2}), with varying levels of strong effect on the response variable. Thus, the ELNET-EMD method is more accurate in selecting non-zero coefficients than the other regression methods when a high correlation exists among the decomposition components; components with high correlation, such as IMF_{1,7} and r_2, are still selected into the final model.

Table 9 illustrates the performance criteria of the prediction accuracy of the regression methods, using RMSE, MAE, and MAPE, in the application on a real time-series dataset. The results show that ELNET-EMD at the λ_min rule has the smallest error values in terms of RMSE and MAE, while for MAPE it ranks fourth among the methods. Therefore, the ELNET-EMD method selects the decomposition components that have the most significant effect on the response variable, with high prediction accuracy.

Conclusions
This study applied the ELNET-EMD method to nonstationary and nonlinear time-series data. The method studies the effect of the IMFs and the residual component of multivariate predictor variables on the response variable and tackles the high correlation among the IMFs and the residual component, ensuring the accuracy and reliability of the selected fitted model.
Numerical experiments and a real time-series dataset of the daily close exchange rates of MYR, JAP, CHN, and TAW are analyzed. The ELNET-EMD method separately decomposes the multivariate predictors into a finite set of IMFs and one residual component via EMD. Thereafter, the decomposition components that have the most significance for the response variable are selected, and the multicollinearity between the decomposition components is addressed, using the ELNET regression.
The results show that the ELNET-EMD method is considerably more accurate than the other regression methods. The ELNET-EMD method is highly capable of identifying the decomposition components that have the most significance for the response variable. Although the correlation between the decomposition components is high, highly correlated components are kept together in or out of the final model. The ELNET-EMD method thus selects the best fitted model, free of multicollinearity effects and with high prediction accuracy.