Forecasting the gross domestic product using a weight direct determination neural network

Time-series forecasting is one of the most widely used data science techniques in business, finance, supply chain management, production, and inventory planning. Given the dearth of studies in the literature that propose weights and structure determination (WASD) based models for regression problems, the goal of this research is to develop such a model for time-series forecasting. Since WASD neural networks have been shown to overcome limitations of traditional back-propagation neural networks, including slow training speed and local minima, a multi-function activated WASD for time-series (MWASDT) model is proposed that uses multiple activation functions, a new auto cross-validation method and a new prediction mechanism. The MWASDT model was used to forecast the gross domestic product (GDP) of several nations to demonstrate its exceptional capacity for learning and prediction. Compared to previous WASD-based models for time-series forecasting and the traditional machine learning models that MATLAB has to offer, the new model produced noticeably better forecasting results, especially on unseen data.


Introduction
As information technology has advanced, new sophisticated computational approaches have been adopted to implement best practices in public administration, financial management, and planning. Several studies have demonstrated the necessity of implementing new information and communication technology (ICT)-based techniques for the reformation and enhancement of public and financial management strategies [1]. In numerous scientific fields, the use of new machine learning (ML) techniques has led to the adoption of new, improved intelligent methodologies for time-series forecasting problems [2]. Artificial intelligence methodologies have been used in a variety of fields, including but not limited to engineering, medicine, economics and finance, and social science research. In engineering, they are frequently employed for feedback control systems stabilization [3,4], solar systems measurements [5], wind speed forecasting [6] and alloy behavior analysis [7]. In medicine, they are often employed for diagnosing diabetic retinopathy [8], flat foot [9] and several types of cancer, including lung cancer [10] and breast cancer [11], whereas in economics and finance they are usually employed for portfolio optimization [12], time-series forecasting [13,14], and macroeconomic factors prediction [15,16]. Additionally, artificial intelligence methodologies have been successfully applied in social science research, usually for multiclass classification tasks such as characterizing occupational mobility [17], evaluating jobs' potential for teleworking [18], and classifying occupations [19].
Gross domestic product (GDP) forecasts are increasingly valuable for financial management and planning. Several studies have examined how various forecasting approaches can be used to predict the GDP, such as incorporating economic signals from the domestic economy's main trading partners [20], incorporating stochastic volatility to improve both point and density forecast accuracy [21], establishing a grey forecasting model with time power term [22], using gradient boosting and random forest models [23], and utilizing mixed-frequency factor-mixed-data sampling models [24]. ML algorithms, on the other hand, are prospective substitutes for the time-series regression models that central banks generally employ for forecasting important macroeconomic indicators [25]. For managing data sets with a large number of potential regressors, ML models are especially well suited. In this paper, we investigate the performance of different ML algorithms in obtaining accurate forecasts of GDP for the United States (U.S.), the United Kingdom (U.K.), Italy, France, Greece and India.
The main objective of this research is to develop a model for GDP forecasting using innovative neural networks enriched with cutting-edge methods. To achieve this, we employ a 3-layer feed-forward neural network that can handle regression tasks. As an alternative to the well-known back-propagation algorithm used to train feed-forward neural networks, a weights and structure determination (WASD) training algorithm is utilized. The WASD algorithm provides the following advantages when training a neural network [26]:
- It computes the ideal set of weights directly through the weights direct determination (WDD) process, as opposed to the back-propagation algorithm, which adjusts the weights iteratively.
- It avoids getting stuck in local minima.
- It ultimately contributes to lower computational complexity.

In this paper, a novel multi-function activated WASD for time-series (MWASDT) model is introduced. It builds on the unique features of the WASD based model for binary classification presented in [27], which uses multiple activation functions to reduce the training error further than single activation function models. It also builds on the unique features of the WASD based model for time-series forecasting presented in [13], which finds the optimal number of lagged observations to reduce the training error. To further improve the structure and functionality of WASD based neural networks for time-series forecasting, the MWASDT model specifically makes use of various activation functions, optimizes the ratio between the fitting and validation sets (i.e., cross-validation auto-adjustment) and employs a new prediction mechanism. Results from six experiments reveal that, compared to some of the most cutting-edge ML regression models available through MATLAB's regression learner app, the MWASDT model performed better on all measures.
The main concepts of this work can be summarized as follows:
• A novel 3-layer feed-forward WASD neural network for time-series forecasting, termed MWASDT, is presented.
• The MWASDT makes use of various activation functions, optimizes the ratio between the fitting and validation sets to reduce bias and avoid getting caught in local optima during the training phase, and employs a new prediction mechanism.
• Using the new prediction mechanism, the MWASDT model's predictions on unseen data can be kept within a user-specified reasonable range.
• In six experiments on GDP forecasting, the MWASDT model is contrasted with some of the most advanced regression models accessible through MATLAB's regression learner app.
• Six experiments on GDP forecasting demonstrate that the MWASDT model offers superior prediction abilities compared to other WASD models, including the WASDP model developed in [28].
The paper is structured as follows. An overview of the multi-function activated WDD process for time-series is given in Section 2. Section 3 presents the 3-layer feed-forward MWASDT neural network structure. Section 4 presents the MWASDT algorithm as well as the whole training and forecasting processes for the MWASDT neural network model. Section 5 shows and discusses the findings of six experiments on GDP forecasting using the MWASDT model, the WASDP model developed in [28] and some of the most cutting-edge regression models available through MATLAB's regression learner app. In Section 6, concluding observations are given.

The multi-function activated WDD process for time-series
This section describes the multi-function activated WDD process for time-series. The WDD process is an essential part of any WASD method since it does away with the requirement for labor-intensive, usually unreliable iterative computations to obtain the appropriate weights matching the current hidden layer layout. The WDD procedure reportedly allows for both speed and lower computational complexity whilst avoiding some of the accompanying challenges as opposed to traditional weight determination methods [26].
Here, comprehensive justifications of the key theoretical underpinnings behind the creation of the MWASDT neural network are provided. First, it is important to mention a few of the main symbols used in this work: a! denotes the factorial of a; (·)^T denotes transposition; (·)^† denotes pseudoinversion; sign(·) denotes the sign function; round(·) denotes the round function; (·)^⊙ denotes the elementwise exponential.
Theorem 2.1. The following holds when a target function, f(·), has the (H + 1)-order continuous derivative on the range [r_1, r_2], and H is a nonnegative integer:

f(a) = B_H(a) + C_H(a),    (2.1)

where C_H(a) and B_H(a), respectively, imply the error term and the H-order Taylor polynomial approximation (TPLA) of f(a).

Let f^(h)(z) be the value at the point z of the h-order derivative of f(x). Expanding around the point z, the H-order TPLA gives the approximate representation of f(a):

f(a) ≈ B_H(a) = Σ_{h=0}^{H} (f^(h)(z)/h!)(a − z)^h.    (2.2)

For a multi-input target function f(a_1, a_2, ..., a_v), the TPLA generalizes to

f(a_1, ..., a_v) ≈ Σ_{h_1+···+h_v ≤ H} (∂^{h_1+···+h_v} f(z_1, ..., z_v)/(∂a_1^{h_1} ··· ∂a_v^{h_v})) Π_{i=1}^{v} ((a_i − z_i)^{h_i}/h_i!),    (2.3)

where h_1, h_2, ..., h_v are nonnegative integers.
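As a quick numerical illustration of Theorem 2.1 (not from the paper; a minimal sketch using f = exp and expansion point z = 0), the H-order Taylor polynomial approximates the target function with an error that shrinks rapidly as H grows:

```python
import math

def taylor_poly(x, H, z=0.0):
    """H-order Taylor polynomial of exp at expansion point z,
    i.e., B_H(x) = sum_h exp(z) * (x - z)^h / h!."""
    return sum(math.exp(z) * (x - z) ** h / math.factorial(h) for h in range(H + 1))

# The residual C_H(x) = exp(x) - B_H(x) shrinks quickly with H.
error_H6 = abs(math.exp(0.3) - taylor_poly(0.3, 6))
```

For x = 0.3, the 6th-order polynomial already matches exp(x) to better than 1e-6; this rapid convergence is what the WDD process exploits when it replaces iterative training with a polynomial fit.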
According to [27,13], the data also require normalization to the range [−0.5, −0.25] before being input to the neural network model, because this enhances the accuracy of the WDD method. We achieve this by using a linear transformation [30] as below:

Â = (A − A_min)/(4(A_max − A_min)) − 1/2,    (2.4)

where A_max and A_min are the maximum and minimum values of the time-series data A = [A_{t−1}, A_{t−2}, ..., A_{t−m}] ∈ R^{1×m}, respectively, with t > m denoting the time. It is worth noting that the neural network can deal with over-fitting in this way. As a consequence, the normalized input A and the target vector D = A_t ∈ R are considered. According to the power activated multi-input neural networks in [26], the nonlinear function given below may be used to express the relationship between the input variables A_{t−1}, A_{t−2}, ..., A_{t−m} and the output target D of the neural network:

D = f(A_{t−1}, A_{t−2}, ..., A_{t−m}).    (2.5)

Thereafter, based on Theorem 2.1, the H-order TPLA B_H(A_{t−1}, A_{t−2}, ..., A_{t−m}) can map (2.5) as shown below:

D ≈ B_H(A_{t−1}, A_{t−2}, ..., A_{t−m}) = Σ_{h=0}^{n−1} k_h w_h,    (2.6)

where k_h = G_h(A_{t−1}, A_{t−2}, ..., A_{t−m}) ∈ R^{1×mn} signifies a power activation function, w_h ∈ R^{mn} signifies the weight that corresponds to k_h, and h is both the number of the hidden layer neurons and the power value. Additionally, the four power elementwise activation functions (AFs) presented in Table 1 are recommended when dealing with regression tasks [13].
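The normalization step above and its reverse can be sketched as follows (hypothetical helper functions of our own, not the paper's MATLAB code):

```python
import numpy as np

def normalize(A, lo=-0.5, hi=-0.25):
    """Linearly map the time-series data A into the range [lo, hi]."""
    A = np.asarray(A, dtype=float)
    return (A - A.min()) * (hi - lo) / (A.max() - A.min()) + lo

def denormalize(B, A_min, A_max, lo=-0.5, hi=-0.25):
    """Reverse the linear map, recovering the original scale."""
    return (np.asarray(B, dtype=float) - lo) * (A_max - A_min) / (hi - lo) + A_min
```

The reverse map is exactly what is applied to the network output before a forecast is reported.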
For a given number of r ∈ N observations, the input matrix A and the target vector D become:

A = [A_{t−1}, A_{t−2}, ..., A_{t−m}; A_t, A_{t−1}, ..., A_{t−m+1}; ...; A_{t+r−2}, ..., A_{t+r−m−1}] ∈ R^{r×m},  D = [A_t, A_{t+1}, ..., A_{t+r−1}]^T ∈ R^r.    (2.7)

Thereafter, setting k_{r,h} = G_h(C_1, C_2, ..., C_m) ∈ R^{r×mn}, where C_i ∈ R^r denotes the ith column of the input matrix A in (2.7), the input-activation matrix is shown below:

K = [k_{r,0}, k_{r,1}, ..., k_{r,n−1}],    (2.8)

and the weight vector is W = [w_0, w_1, ..., w_{n−1}]^T ∈ R^{mn}. As opposed to the iterative weight training used in traditional neural networks, the weights of the H-order TPLA neural network are then calculated directly using the WDD process outlined below [29]:

W = K^† D.    (2.9)
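A minimal sketch of the WDD computation (2.9) in Python/NumPy (illustrative only; the paper's implementation is in MATLAB): given an input-activation matrix K, the weights come from a single pseudoinverse solve rather than from iterative training.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(-0.5, -0.25, size=40)             # normalized inputs
K = np.column_stack([a ** h for h in range(4)])   # power-activated matrix, powers 0..3
W_true = np.array([0.3, -1.2, 0.7, 2.0])
D = K @ W_true                                    # targets generated by known weights

W = np.linalg.pinv(K) @ D                         # WDD: W = K† D
```

Because the targets were generated from K with known weights, the one-shot pseudoinverse solve recovers them up to floating-point error, with no back-propagation iterations.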

The MWASDT neural network structure
The 3-layer feed-forward neural network structure is shown in Figure 1. Particularly, the neural network receives the normalized input values A_{t−1}, A_{t−2}, ..., A_{t−m} based on (2.4) from Layer 1 (i.e., input layer) and allocates them to the relevant neurons of Layer 2 with equal weight 1. Notice that Layer 2 has a maximum number n of activated neurons. Further, the connections between Layer 2 and Layer 3 (i.e., output layer) carry the weights W_j, j = 0, 1, ..., n − 1, which are acquired using the WDD procedure.
To compute the prediction Ď of A_t, the formula shown below is used:

Ď = Σ_{h=0}^{n−1} k_h W_h = K W.    (3.1)

Last, Layer 3 has one activated neuron, which uses the function shown below:

B(Ď) = min(max(Ď, mean(A) − γ Var(A)), mean(A) + γ Var(A)),    (3.2)

where mean(A) and Var(A) denote the mean and the variance of A, respectively, and the parameter γ ≥ 0 imposes a bound on the predicted value of the neural network model. Given that the parameter γ is specified by the user, the model's predictions on unseen data can be kept within a user-specified reasonable range. Keep in mind that the predicted value of the neural network model remains unbounded for γ = +∞. It is important to mention that B(Ď) is the normalized output of the neural network. As a result, it is necessary to denormalize the value of B(Ď) using the reverse procedure of (2.4).
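The role of γ in (3.2) can be sketched as follows (a hypothetical clamp consistent with the description above; the exact bounding function and all names here are our assumptions, not the paper's code):

```python
import numpy as np

def bound_prediction(d_hat, A, gamma):
    """Keep a raw prediction d_hat within mean(A) +/- gamma * Var(A).
    gamma = np.inf leaves the prediction unbounded (assumed behavior)."""
    if np.isinf(gamma):
        return float(d_hat)
    center, radius = np.mean(A), gamma * np.var(A)
    return float(np.clip(d_hat, center - radius, center + radius))
```

A runaway prediction far above the recent history is pulled back inside the γ-band, while γ = +∞ reproduces the unrestricted behavior.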

The MWASDT algorithm
The MWASDT algorithm is in charge of training the neural network model. That is, it finds the optimal ratio between the fitting and validation input sets (i.e., p*), the optimal number of inputs m or observation delays (i.e., M*), the optimal number of hidden layer neurons n, the optimal AF of each hidden layer neuron (i.e., v), and the weights W of the neural network. Consider the time-series A_t, A_{t−1}, ..., A_{t−g}, the maximum number of hidden layer units n_max specified by the user and, according to [13], the maximum number of inputs M_max = round(g/3). Also, consider the parameter p ∈ [0, 1], which is the ratio between the fitting and validation input sets (i.e., cross-validation split). Since a typical split assigns more than 50% to the fitting set and less than 50% to the validation set, the following iterative procedure (steps (1)-(10)) is applied for p = 0.55 : 0.01 : 0.85, where p takes values from 0.55 to 0.85 with step 0.01.
(1) For M = 1 : M_max, where M takes values from 1 to M_max with step 1, repeat the following (2)-(9) steps.
(2) We create the input matrix A and the target matrix D according to (2.7) for r = g and m = M, and we set G_1 = round(pg), the number of fitting data, and G_2 = g − G_1, the number of validation data. That is, the training set of the neural network comprises the matrices A and D.
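Step (2) can be sketched in Python as follows (an illustrative helper under assumed conventions, with the most recent lag placed first; not the paper's MATLAB code):

```python
import numpy as np

def make_training_set(series, m):
    """Build the lagged input matrix A and target vector D as in (2.7):
    row i holds the m observations preceding target D[i], most recent first."""
    s = np.asarray(series, dtype=float)
    r = len(s) - m
    A = np.array([s[i : i + m][::-1] for i in range(r)])
    D = s[m:]
    return A, D

A, D = make_training_set(range(1, 11), m=3)
p = 0.7
G1 = round(p * len(D))      # fitting rows; the remaining G2 = len(D) - G1 validate
```

The first G1 rows then play the role of the fitting set and the remaining G2 rows the validation set.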

Algorithm 1 Matrix K calculation.
Require: The data A, the vector V that contains the optimal powers of every hidden-layer neuron checked so far, the vector h = [AF1, AF2, AF3, AF4] that contains the optimal AFs of V, and the delays number M.
Ensure: The matrix K.
(3) For h = 0 : n_max − 1, where h takes values from 0 to n_max − 1 with step 1, repeat the following (4)-(8) steps.
(4) For v = 1 : 4, where v takes values from 1 to 4 with step 1, repeat the following (5)-(7) steps.
(5) Create K_v (i.e., one matrix K for each of the four AFs of Table 1) for the input A in line with Alg. 1.
(6) Calculate the weights W_v for the first G_1 observations (fitting set) of K_v, i.e., K_v(1 : G_1, :), in line with the WDD process of (2.9). That is, we set

W_v = (K_v(1 : G_1, :))^† D(1 : G_1).

(7) Based on W_v of the previous step and the last G_2 observations (validation set) of K_v, i.e., K_v(G_1 + 1 : g, :), calculate the predictions' mean absolute percentage error (MAPE) over the target values D(G_1 + 1 : g), where D̂ = K_v(G_1 + 1 : g, :) W_v denotes the predictions on the validation set. Note that the MAPE is a well-known statistical tool that measures the accuracy of a forecasting approach and is commonly used in ML as a loss function for regression problems. MAPE values closer to zero are preferable, and the measure is calculated as follows:

MAPE = (1/G_2) Σ_{i=1}^{G_2} |D̂_i − D_i| / |D_i|,

where D̂ and D, respectively, are the forecasted and the target values.
(8) Keep the AF v (and the corresponding power) that yields the lowest validation MAPE for the current neuron h, and retain the neuron only if it lowers the best MAPE found so far.
(9) Compare the best MAPE of the current M to the minimum MAPE of the optimal delays number so far (i.e., M*). If the best MAPE of the current M is lower, it becomes the minimum MAPE, and the current M becomes the optimal delays number.
(10) Compare the best MAPE of the current p to the minimum MAPE of the optimal ratio so far (i.e., p*). If the best MAPE of the current p is lower, it becomes the minimum MAPE, and the current p becomes the optimal ratio between the fitting and validation input sets.
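The MAPE loss used in the validation steps above can be sketched as follows (illustrative Python; the paper's package is MATLAB):

```python
import numpy as np

def mape(d_hat, d):
    """Mean absolute percentage error between forecasts d_hat and targets d."""
    d_hat, d = np.asarray(d_hat, dtype=float), np.asarray(d, dtype=float)
    return float(np.mean(np.abs(d_hat - d) / np.abs(d)))
```

For forecasts [1.1, 1.8, 4.4] against targets [1, 2, 4], each observation is off by 10%, so the MAPE is 0.1.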
In this way, the MWASDT algorithm can keep the number of hidden-layer neurons in the neural network as low as possible while optimizing the ratio between the fitting and validation input sets through the parameter p. At the same time, it finds the optimal number of inputs and lowers the MAPE of the neural network as a whole. The full workflow of the MWASDT algorithm is depicted in the diagram in Figure 2. After finding the optimal structure of the MWASDT neural network model of Figure 1, we use the last M observations of the time-series, i.e., A_t, A_{t−1}, ..., A_{t−M+1}, to create the test set of the neural network. The forecast for the next Z ∈ N time instances can therefore be obtained through (3.2). Particularly, for z = 1 : Z, where z takes values from 1 to Z with step 1, we repeat the following (a)-(b) steps: (a) compute the next forecast through (3.2) and denormalize it; (b) append the forecast to the input window and discard the window's oldest observation. As a result, we are able to obtain the neural network forecast for the next Z time instances. The detailed procedure for training and forecasting with the MWASDT neural network model is depicted in the diagram in Figure 3.
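The recursive forecasting loop can be sketched as follows (a generic rolling-forecast pattern under our assumptions; `model` stands in for the trained network's one-step prediction, including its bounding and denormalization):

```python
def recursive_forecast(model, last_window, Z):
    """Produce Z forecasts by feeding each prediction back as the newest input:
    at each step, predict one step ahead, then slide the window forward."""
    window = list(last_window)            # oldest observation first
    forecasts = []
    for _ in range(Z):
        y = model(window)                 # one-step-ahead prediction
        forecasts.append(y)
        window = window[1:] + [y]         # drop oldest, append forecast
    return forecasts
```

With a toy model that simply adds 1 to the latest observation, the loop extends [1, 2, 3] to [4, 5, 6], illustrating how each forecast becomes an input for the next step.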

Experiments on GDP forecast
This section examines the MWASDT model's efficiency and forecasting capacity when applied to real-world data. Particularly, the performance of the MWASDT neural network is investigated and compared to some of the best-performing regression models accessible in the MATLAB regression learner app in six experiments on GDP forecasting. These regression models are the exponential Gaussian process regression (EGPR), linear support vector machine (LSVM), and ensemble bagged trees (EBT). The WASDP neural network model developed in [28] is also included in the comparison. The complete implementation of the concepts and computational methods featured in Sections 2-4 is available on GitHub at https://github.com/SDMourtas/MWASDT.
The MATLAB package provides the full implementation along with detailed installation instructions.

Experiments description
The datasets used in the experiments of this section are taken from the Federal Reserve Economic Data (FRED) at https://fred.stlouisfed.org/, which contains frequently updated U.S. macro and regional economic time-series at annual, quarterly, monthly, weekly, and daily frequencies. For our experiments, we employed the quarterly GDP time-series for the U.S., U.K., Italy, France, Greece and India. The time frames for training the neural network models are from 1/1991 to 10/2017 for the U.S., U.K. and France, from 1/1997 to 10/2017 for Italy and Greece, and from 4/2004 to 10/2017 for India. These time frames include 109, 93 and 56 observations, respectively. Therefore, we set g = 109, g = 93 and g = 56, respectively, in the MWASDT algorithm described in the diagram of Figure 2.
Additionally, we set n_max = 50 for all the time frames in the MWASDT algorithm. The time frame for testing the neural network models is from 1/2018 to 1/2023 for all the aforementioned nations and includes 20 observations. As a result, we set Z = 20 in the process described in the diagram of Figure 3. It is important to note that we set γ = 10 in (3.2) for all the time frames in the MWASDT algorithm, with the exception of Italy's GDP, where we have set γ = 1. We have also run the MWASDT algorithm with γ = +∞ for all the time frames, and in the figures and table of this section we designate this configuration as MWASDT WR (i.e., MWASDT without restrictions).
In the case of the U.S.'s GDP, the neural network training and test results are presented in Figure 4. Particularly, the MAPE of the validation set during training is shown in Figures 4a-4c for various ratios of the fitting and validation input sets p, various delays M for the optimal value of p, and various AFs for the optimal values of p and M, respectively. More particularly, Figure 4a shows that the optimal value of p is 0.75, Figure 4b shows that the optimal value of M is 8, and Figure 4c shows that the optimal number of hidden-layer neurons is 26. In Figures 4d and 4e, which depict the predicted values on the validation set during training, we can see that the MWASDT, LSVM, and EGPR models match the actual values better than the WASDP and EBT models. As shown in Figure 4f, the MWASDT and LSVM models are more accurate at forecasting the actual values than the MWASDT WR, WASDP, EGPR and EBT models.

When it comes to the U.K.'s GDP, the neural network training and test results are presented in Figure 5. Particularly, the MAPE of the validation set during training is shown in Figures 5a-5c for various ratios of the fitting and validation input sets p, various delays M for the optimal value of p, and various AFs for the optimal values of p and M, respectively. More particularly, Figure 5a shows that the optimal value of p is 0.81, Figure 5b shows that the optimal value of M is 16, and Figure 5c shows that the optimal number of hidden-layer neurons is 23. In Figures 5d and 5e, which depict the predicted values on the validation set during training, we can see that the MWASDT, LSVM, and EGPR models match the actual values better than the WASDP and EBT models. The MWASDT and MWASDT WR models are more accurate at forecasting the actual values than the other models, as shown in Figure 5f.

Regarding Italy's GDP, the neural network training and test results are presented in Figure 6.
Particularly, the MAPE of the validation set during training is shown in Figures 6a-6c for various ratios of the fitting and validation input sets p, various delays M for the optimal value of p, and various AFs for the optimal values of p and M, respectively. More particularly, Figure 6a shows that the optimal value of p is 0.63, Figure 6b shows that the optimal value of M is 9, and Figure 6c shows that the optimal number of hidden-layer neurons is 27. In Figures 6d and 6e, which depict the predicted values on the validation set during training, we can see that the EGPR model matches the actual values better than the other models. As shown in Figure 6f, the WASDP model is less accurate at forecasting the actual values than the rest of the models.
In the case of France's GDP, the neural network training and test results are presented in Figure 7. Particularly, the MAPE of the validation set during training is shown in Figures 7a-7c for various ratios of the fitting and validation input sets p, various delays M for the optimal value of p, and various AFs for the optimal values of p and M, respectively. More particularly, Figure 7a shows that the optimal value of p is 0.81, Figure 7b shows that the optimal value of M is 8, and Figure 7c shows that the optimal number of hidden-layer neurons is 28. In Figures 7d and 7e, which depict the predicted values on the validation set during training, we can see that the MWASDT and EGPR models match the actual values better than the rest of the models. The MWASDT and LSVM models are more accurate at forecasting the actual values than the MWASDT WR, WASDP, EGPR and EBT models, as shown in Figure 7f.

When it comes to Greece's GDP, the neural network training and test results are presented in Figure 8. Particularly, the MAPE of the validation set during training is shown in Figures 8a-8c for various ratios of the fitting and validation input sets p, various delays M for the optimal value of p, and various AFs for the optimal values of p and M, respectively. More particularly, Figure 8a shows that the optimal value of p is 0.81, Figure 8b shows that the optimal value of M is 9, and Figure 8c shows that the optimal number of hidden-layer neurons is 20. In Figures 8d and 8e, which depict the predicted values on the validation set during training, we can see that the WASDP model is less accurate at matching the actual values than the other models. The MWASDT and MWASDT WR models are more accurate at forecasting the actual values than the rest of the models, as depicted in Figure 8f.

Regarding India's GDP, the neural network training and test results are presented in Figure 9.
Particularly, the MAPE of the validation set during training is shown in Figures 9a-9c for various ratios of the fitting and validation input sets p, various delays M for the optimal value of p, and various AFs for the optimal values of p and M, respectively. More particularly, Figure 9a shows that the optimal value of p is 0.8, Figure 9b shows that the optimal value of M is 13, and Figure 9c shows that the optimal number of hidden-layer neurons is 49. In Figures 9d and 9e, which depict the predicted values on the validation set during training, we can see that the EBT and WASDP models are less accurate at matching the actual values than the other models. As depicted in Figure 9f, the MWASDT model is more accurate at forecasting the actual values than the other models.

Results discussion
The models statistics on the training and test sets for the GDPs of the U.S., U.K., Italy, France, Greece and India are shown in Table 2. It is crucial to note that the statistics presented in this table were generated using MATLAB and then double-checked using SPSS. The coefficient of determination (R 2 ), MAPE, symmetric MAPE (SMAPE), mean absolute error (MAE), root-mean-square error (RMSE), mean absolute scaled error (MASE) and mean directional accuracy (MDA) are the performance measures considered in our analysis. Particularly, R 2 is the proportion of the variation in the dependent variable that is predictable from the independent variable(s), SMAPE is an accuracy measure based on percentage errors, MAE is a measure of errors between paired observations expressing the same phenomenon, RMSE is a measure of the differences between values predicted by a model and the values observed, MASE is a measure of the accuracy of forecasts, and MDA is a measure of prediction accuracy of a forecasting method and compares the forecast direction to the actual realized direction. See [31] for more details and in-depth analysis of these measures.
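For reference, two of the less common measures can be sketched as follows (one common definition of each; variants exist, and these Python helpers are ours, not the paper's MATLAB code):

```python
import numpy as np

def smape(f, a):
    """Symmetric MAPE: absolute error scaled by the mean magnitude of
    forecast and actual, so the measure is bounded above by 2."""
    f, a = np.asarray(f, dtype=float), np.asarray(a, dtype=float)
    return float(np.mean(2.0 * np.abs(f - a) / (np.abs(f) + np.abs(a))))

def mda(f, a):
    """Mean directional accuracy: the fraction of steps in which the
    forecast moves in the same direction as the actual series."""
    f, a = np.asarray(f, dtype=float), np.asarray(a, dtype=float)
    return float(np.mean(np.sign(np.diff(f)) == np.sign(np.diff(a))))
```

Unlike MAPE, SMAPE penalizes over- and under-forecasts more symmetrically, and MDA ignores magnitudes entirely, scoring only whether the forecast rises and falls together with the actual series.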
In the case of the U.S.'s GDP, the neural network models' statistics on the training and test sets are presented in Table 2. In the training set, EGPR has the best R 2 , MWASDT has the second best, and WASDP has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, EGPR has the lowest values, MWASDT has the second lowest values, and WASDP has the highest values. Also, EGPR and MWASDT have the best MDA, and WASDP has the worst. In the test set, MWASDT has the best R 2 , and EBT and MWASDT WR have the worst. On MAPE, SMAPE, MAE, RMSE and MASE, MWASDT has the lowest values, and MWASDT WR has the highest values. Also, MWASDT has the best MDA, and EBT and MWASDT WR have the worst. Overall, the MWASDT has the second best statistics on the training set and the best statistics on the test set.
When it comes to the U.K.'s GDP, the neural network models' statistics on the training and test sets are presented in Table 2. In the training set, EGPR has the best R 2 , MWASDT has the second best and WASDP has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, EGPR has the lowest values, MWASDT has the second lowest values, and WASDP has the highest values. Also, EGPR has the best MDA, MWASDT has the second best, and LSVM has the worst. In the test set, all the statistics of MWASDT and MWASDT WR are identical. MWASDT has the best R 2 , and EBT has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, MWASDT has the lowest values, and WASDP has the highest values. Also, MWASDT and LSVM have the best MDA, and WASDP has the worst. Overall, the MWASDT has the second best statistics on the training set and the best statistics on the test set.
Regarding Italy's GDP, the neural network models' statistics on the training and test sets are presented in Table 2. In the training set, EGPR has the best R 2 , MWASDT has the third best, and WASDP has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, EGPR has the lowest values, and MWASDT has the highest values. Also, EGPR has the best MDA, MWASDT has the third best, and WASDP has the worst. In the test set, MWASDT WR has the best R 2 , MWASDT has the third best, and EGPR has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, EGPR has the lowest values, MWASDT has the third lowest values, and MWASDT WR has the highest values. Also, MWASDT, MWASDT WR and LSVM have the best MDA, and WASDP, EGPR and EBT have the worst. Overall, the MWASDT has the third best statistics on the training set and the second best statistics on the test set.
In the case of France's GDP, the neural network models' statistics on the training and test sets are presented in Table 2. In the training set, EGPR has the best R 2 , MWASDT has the second best, and WASDP has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, EGPR has the lowest values, MWASDT has the second lowest values, and WASDP has the highest values. Also, EGPR has the best MDA, MWASDT and LSVM have the second best, and WASDP has the worst. In the test set, MWASDT has the best R 2 , and EBT and MWASDT WR have the worst. On MAPE, SMAPE, MAE, RMSE and MASE, MWASDT has the lowest values, and MWASDT WR has the highest values. Also, MWASDT has the best MDA, and WASDP has the worst. Overall, the MWASDT has the third best statistics on the training set and the best statistics on the test set.

When it comes to Greece's GDP, the neural network models' statistics on the training and test sets are presented in Table 2. In the training set, EGPR has the best R 2 , MWASDT has the third best, and WASDP has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, EGPR has the lowest values, MWASDT has the third lowest values, and WASDP has the highest values. Also, EGPR has the best MDA, MWASDT has the third best, and WASDP has the worst. In the test set, all the statistics of MWASDT and MWASDT WR are identical. MWASDT has the best R 2 , and LSVM has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, MWASDT has the lowest values, and EBT has the highest values. Also, MWASDT has the best MDA, and WASDP, EGPR and EBT have the worst. Overall, the MWASDT has the third best statistics on the training set and the best statistics on the test set.
Regarding India's GDP, the neural network models' statistics on the training and test sets are presented in Table 2. In the training set, MWASDT has the best R 2 , and WASDP has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, MWASDT has the lowest values, and EBT has the highest values. Also, EGPR, MWASDT and LSVM have the best MDA, and EBT has the worst. In the test set, MWASDT WR has the best R 2 , MWASDT has the fourth best, and EBT has the worst. On MAPE, SMAPE, MAE, RMSE and MASE, MWASDT has the lowest values, and WASDP has the highest values. Also, LSVM and MWASDT WR have the best MDA, MWASDT has the second best, and EBT has the worst. Overall, the MWASDT has the best statistics on the training and test sets.
In the cases of the U.S., U.K. and France, we can see that these nations have steadily rising GDPs and have fully recovered from both the 2007-2008 and the COVID-19 financial crises. We can observe that whereas Italy and Greece have entirely recovered from the COVID-19 financial crisis, they have not yet fully recovered from the 2007-2008 financial crisis. India's GDP is extremely volatile, and while it entirely recovered from the financial crisis of 2007-2008, it has not yet fully recovered from the COVID-19 financial crisis. We are able to forecast the rising trend in these countries' GDPs using neural networks, but we are unable to predict with high accuracy how the COVID-19 financial crisis will impact GDP. Experiments of this kind support economic forecasting for the participating nations. To predict a country's future economic activity, economists use the most recent data available. Although the specifics of these reports vary, their fundamental methodology is the same: they forecast an economy's growth using economic indicators and models. Most central banks, global rating agencies, and organizations such as the International Monetary Fund (IMF) carry out this kind of analysis. Such forecasts are also crucial for investors because they aid in deciding whether or not to invest in a particular nation.
Altogether, the MWASDT model performed very well on the forecasting tasks, which is consistent with the information shown in Table 2. The GDPs of the U.S., Italy, France, and India, where the accuracy of the MWASDT on the test set improved considerably in comparison to MWASDT WR, clearly demonstrated the effectiveness of (3.2). When forecasting tasks are treated as regression problems, the performance of the MWASDT is quite competitive with, or even better than, traditional neural network approaches.

Conclusions
This paper introduced a novel 3-layer feed-forward WASD neural network for time-series forecasting, termed MWASDT. The findings of six experiments on GDP forecasting show that the MWASDT model outperforms the WASDP model and some of the most cutting-edge regression models available through MATLAB's regression learner app. As a result, the MWASDT model has been demonstrated to be an excellent substitute for MATLAB's regression neural network models for forecasting the GDP. It is important to mention that, due to limitations imposed by the WDD procedure used by the MWASDT algorithm, only time-series data can be used as input to train and test the MWASDT neural network model. Another limitation of the MWASDT model is that it was designed solely for time-series forecasting tasks. Its proper adjustment and application to diverse time-series forecasting problems across many scientific domains will therefore be the subject of future research.

Use of AI tools declaration
The authors declare they have not used artificial intelligence (AI) tools in the creation of this article.