Dividend Payout Forecast: Multiple Linear Regression vs Genetic Algorithm-Neural Network

This research compares two forecasting methods, i.e. Multiple Linear Regression (MLR) and the Genetic Algorithm-Neural Network (GA-NN), in forecasting the dividend payout of Indonesian manufacturing companies listed on the Indonesia Stock Exchange from 2010-2014. Based on 1384 firm-year observations, the results show that both methods can be used to predict dividend payout by considering earnings, free cash flow, growth opportunity, leverage, liquidity and size. This research finds that even though both methods are powerful predictors, in this case MLR outperforms GA-NN.


INTRODUCTION
A dividend is a form of return on investment distributed by a company to its stockholders; it can be seen as the return an investor generates on the stock of a particular company. [1] argue that a dividend conveys important information used by investors to differentiate the profitable firm from the wealth-maximizing firm. The amount distributed to stockholders carries information about the manager's expectations of future profitability and investment direction. The amount to be distributed as a dividend is determined by the company's dividend policy, a term used in relation to the company's profit. Since a dividend maximizes the wealth of the investor, forecast information on dividends is considered important for investors' decision making. Even though dividend forecasting is essential in investing, according to [2] it is difficult to find good theoretical models for forecasting dividends.
In forecasting a dividend, financial accounting data is used. Multiple Linear Regression (MLR) is one of the statistical methods commonly used to process accounting data. MLR measures the effect of multiple independent variables on a dependent variable, and the effect of each independent variable is used to forecast both the direction and the magnitude of the dividend payout.
Another useful approach to forecasting a dividend payout is computational intelligence. One of the powerful computational intelligence methods capable of forecasting is the Artificial Neural Network (ANN), which has been successfully used in business applications [2], [3]. An ANN is a network of interconnected nodes, inspired by the brain, that generates output through a training process. During training, the ANN learns the pattern of the data given to the network by comparing the actual output produced by the network with the target (desired) output; the result of this comparison is known as the error. The weights of the network are iteratively modified until the error reaches a minimum, at which point training stops. After training and testing, the network is ready to be used as a machine that produces forecast values.
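The training loop described above (compare the network's output with the target, compute the error, adjust the weights until the error is minimal) can be sketched in a few lines. The following Python sketch is illustrative only: it trains a single tanh neuron by gradient descent rather than the paper's full network, and all names and data are hypothetical.

```python
from math import tanh

def train_neuron(data, lr=0.1, epochs=200, tol=1e-4):
    """Minimal illustration of the train-until-minimum-error loop:
    a single tanh neuron fitted by gradient descent on the MSE."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        err = 0.0
        for x, target in data:
            out = tanh(w * x + b)
            e = out - target                 # actual vs desired output
            grad = (1 - out * out) * e       # tanh'(z) = 1 - tanh(z)^2
            w -= lr * grad * x               # iteratively modify weights
            b -= lr * grad
            err += e * e
        if err / len(data) < tol:            # stop at (near-)minimum error
            break
    return w, b

# Hypothetical task: fit y = tanh(2x); training should drive w toward 2.
data = [(x / 10, tanh(2 * x / 10)) for x in range(-10, 11)]
w, b = train_neuron(data)
```

After the loop, the weights reproduce the target pattern closely, which is exactly the stopping condition described in the text.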
Several studies show that ANN works well when combined with the Genetic Algorithm (GA) ([4], [5], [6]). These studies show that the two methods can be combined to solve forecasting problems in many applications, and the combination is proven to improve forecasting accuracy compared with using GA or ANN alone. Another study shows that the combination of GA and ANN performs better than the Support Vector Machine (SVM) and Vector Autoregression (VAR) [7].
GA can be used in different ways. Several recent studies use GA as an optimization algorithm to produce optimal weights for the network; these weights are then fed to the network for training using backpropagation. The results show that this approach helps increase forecast accuracy ([8], [9]).
Both MLR and the Genetic Algorithm-Neural Network (GA-NN) can be used to forecast dividends. Several studies have compared MLR with ANN in financial forecasting tasks such as stock or bankruptcy prediction, but not in dividend payout, so the relative predictive power of MLR and GA-NN is an empirical issue. This study compares MLR with ANN in forecasting dividend payout; GA is used to produce optimal weights for the ANN, which are then fed to the network for training using backpropagation. MLR and GA-NN are compared in terms of their Mean Square Error (MSE).

Data Collection
In making the comparison, purposive sampling is used. The chosen sample must fulfill the following criteria:
• Listed on the Indonesia Stock Exchange from 2000-2013
• Paid a dividend at least once during the observation period
• Provides all data needed for the research variables
Manufacturing companies are the object of this research for several reasons. Firstly, manufacturing is the largest sector on the Indonesia Stock Exchange. Secondly, it provides enough data to run multiple linear regressions; choosing other sectors with fewer companies would be risky, as some companies in those sectors might not provide all the data required for this research. Thirdly, firms in the same industry share many characteristics, which makes the results more generalizable and applicable for investors.
Data for all variables are taken from Datastream. The data used in this research are dividend per share (the proxy for dividend policy), earnings per share (earnings), free cash flow per share (free cash flow), market capitalization (firm size), change in sales (growth opportunity), quick ratio (liquidity) and debt-to-asset ratio (leverage).

Forecasting with MLR
The model developed to predict dividend payout has to pass several tests, i.e. the multicollinearity test, heteroscedasticity test, and autocorrelation test. Figure 1 describes the steps of these tests. If multicollinearity, heteroscedasticity or autocorrelation is detected, the model has to be treated until the problem is removed. By addressing all of the aforementioned problems, the model is expected to be a best linear unbiased estimator.

Multicollinearity Test
The multicollinearity test identifies whether there is exact collinearity among the independent variables. Multicollinearity can be detected through the correlations between independent variables as well as the coefficient significance of each independent variable. If the correlation between two variables is greater than 0.8, multicollinearity holds; a high R-squared together with statistically insignificant coefficients is another indicator of the problem. Dropping one of the independent variables or transforming the variables can be used as treatment. [10] identifies the consequences of multicollinearity: precise estimation becomes difficult, the "zero null hypothesis" is too readily accepted, and coefficients tend to be statistically insignificant even though the goodness of fit can be very high. Conducting a multicollinearity test is therefore indispensable.
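As an illustration of the correlation screen described above, the following Python sketch (not part of the original study, which runs the tests in EViews) computes pairwise Pearson correlations and flags pairs above the 0.8 threshold. The variable names and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def flag_collinear(variables, threshold=0.8):
    """Return variable pairs whose |r| exceeds the threshold."""
    names = list(variables)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_r(variables[names[i]], variables[names[j]])
            if abs(r) > threshold:
                flagged.append((names[i], names[j], round(r, 3)))
    return flagged

# Hypothetical data: FCF is nearly collinear with EARN, LEV is not.
data = {
    "EARN": [1.0, 2.0, 3.0, 4.0, 5.0],
    "FCF":  [1.1, 2.1, 2.9, 4.2, 5.1],
    "LEV":  [0.5, 0.1, 0.4, 0.2, 0.3],
}
pairs = flag_collinear(data)   # only the (EARN, FCF) pair is flagged
```

A flagged pair would then be treated by dropping or transforming one of the variables, as described above.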

Heteroscedasticity Test
The heteroscedasticity test identifies whether the population has the same variance (homoscedasticity). Heteroscedasticity refers to variability of the variances, a problem that is likely when regressing cross-sectional data, since each company has its own characteristics. To test for heteroscedasticity, the Glejser test is used: after running the model, the Glejser test checks whether the probability of the chi-square statistic is greater or less than the significance level (0.05). If the probability is less than 0.05, heteroscedasticity exists. If the problem exists, the t-test or F-test might give an inaccurate result: what appears statistically insignificant might actually be significant. Should this problem exist, the White heteroscedasticity-consistent procedure is employed to eliminate it.
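The core of the Glejser test is a regression of the absolute residuals on the regressors: a significant slope means the error variance moves with the regressor. A minimal one-regressor Python sketch follows (illustrative only; the study runs the test in EViews, and the data here are hypothetical).

```python
from math import sqrt

def simple_ols(x, y):
    """OLS of y on x; returns slope, intercept, residuals and the
    slope's t-statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)   # residual variance
    return slope, intercept, resid, slope / sqrt(s2 / sxx)

def glejser_t(x, y):
    """Glejser test (one regressor): regress |residuals| on x and
    return the slope and its t-statistic; a large |t| suggests
    heteroscedasticity."""
    _, _, resid, _ = simple_ols(x, y)
    slope, _, _, t = simple_ols(x, [abs(e) for e in resid])
    return slope, t

# Hypothetical example: the spread of y around its trend grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.8, 3.3, 3.6, 5.5, 5.4, 7.7, 7.2]
slope, t = glejser_t(x, y)   # positive slope: variance rises with x
```

In the full test, the t-statistic (or the Obs*R-squared chi-square probability reported by EViews) is compared against the 0.05 significance level.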

Autocorrelation Test
The autocorrelation test identifies serial correlation in the observations. Autocorrelation causes the estimators to be inefficient. The Breusch-Godfrey test is used to check for autocorrelation in the developed model: if the probability of the chi-square statistic of the Breusch-Godfrey test is less than the significance level (0.05), autocorrelation holds. To eliminate it, Newey-West standard errors are used.
Having conducted all the aforementioned tests, the model is run to test the significance of the model as well as of the independent variables. If the p-value of the F-test is less than 0.05, the model has predictive ability. Finally, the adjusted R-squared and standard error of this model are compared to those of the hybrid method.
The dependent variable of this model is dividend payout (represented by DivPay in the model), while earnings (EARN), free cash flow (FCF), firm size (SIZE), growth opportunity (GrOp), liquidity (LIQ) and leverage (LEV) are the independent variables. Each variable is proxied by data taken from Datastream as explained in Figure 1. The model used to statistically test the significance of the independent variables is:

DivPay = β0 + β1 EARN + β2 FCF + β3 SIZE + β4 GrOp + β5 LIQ + β6 LEV + ε

To statistically test the model, the EViews 6 software is used. The automatically generated adjusted R-squared and standard error are then compared to the results generated by the hybrid model.
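For illustration, the coefficients of such a model can be estimated by ordinary least squares via the normal equations (X'X)b = X'y. The following Python sketch is not the EViews procedure used in the study; it shows the estimation mechanics on hypothetical data with two regressors.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y.
    X is a list of rows; a column of 1s is prepended for the intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Build X'X and X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Solve by Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, k):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, k):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    # Back substitution.
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (xty[r] - sum(xtx[r][c] * beta[c]
                                for c in range(r + 1, k))) / xtx[r][r]
    return beta  # [intercept, b1, ..., b_k]

# Hypothetical check: data generated exactly as y = 2 + 3*x1 - 1*x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
y = [2, 5, 1, 4, 7, 3]
beta = ols(X, y)   # recovers [2.0, 3.0, -1.0] up to float error
```

With exactly linear data the estimator recovers the generating coefficients, which is the sense in which OLS is "best linear unbiased" once the assumption tests above are satisfied.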

Forecasting with GA-NN
This research uses a multilayer neural network combined with GA. GA is used to train the ANN in order to achieve a minimal Mean Square Error (MSE). Figure 2 describes the steps performed by GA-NN.

Data Normalization
The input and output data should be normalized before being used by the ANN. The purpose of normalization is to transform the data so that it conforms to the range of the Sigmoid function, which uses values between -1 and 1. The Sigmoid function is used as the activation function and determines the output of each neuron.
The data are normalized with the max-min method using the following formula:

x' = 2(x − x_min) / (x_max − x_min) − 1

where x is the original value, x' is the normalized value, and x_min and x_max are the minimum and maximum values of the variable.
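A minimal Python sketch of this max-min rescaling to [-1, 1], with its inverse for mapping network outputs back to the original scale (illustrative; the study's own preprocessing code is not published):

```python
def normalize(values, lo=-1.0, hi=1.0):
    """Max-min normalization: linearly rescale values to [lo, hi]."""
    vmin, vmax = min(values), max(values)
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]

def denormalize(scaled, vmin, vmax, lo=-1.0, hi=1.0):
    """Invert the mapping to recover values on the original scale."""
    return [vmin + (s - lo) * (vmax - vmin) / (hi - lo) for s in scaled]

print(normalize([0, 5, 10]))   # → [-1.0, 0.0, 1.0]
```

The inverse mapping is needed after forecasting, since the network's outputs live on the normalized scale.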

The GA-NN Structure/Model
The ANN structure describes the framework of connections and arrangements of neurons; these arrangements are the parameters of the ANN. The number of layers and the number of neurons in each layer are depicted in the ANN structure. The ANN structure of this research is shown in Figure 3, where the input neurons are earnings, free cash flow, firm size, growth opportunity, liquidity and leverage.
Figure 3. The Artificial Neural Network Structure

Training with GA
The training process is essential in ANN. In this process, the ANN learns from the input-output patterns given to the network: the weights in each layer are updated until the given input produces the desired output. In this research, GA is used as the training algorithm. Training an ANN with GA was first done by [11], whose results show that GA outperformed the backpropagation algorithm in training ANN. Several studies have dealt with the issues of using GA as a training method for ANN, in particular the representation schemes of the connection weights and the chromosome representation of the connectivity. These studies show that real numbers may be used to represent connection weights, and a chromosome may represent the connectivity, weights, and biases in the form of a matrix ([12], [13], [14]); this study uses the same chromosome representation. The GA parameters include the selection, mutation, and crossover options. The fitness function for GA is the MSE, where the goal is to find the optimal weights that generate the least MSE. The GA proceeds as follows:
1) Define Population: an initial population is generated, each member representing a set of network weights and biases.
2) Calculate Fitness: each member of the population is scored according to the value returned by the fitness function, the mean squared error (MSE); the lower the MSE, the larger the fitness.
3) Select Best Members: GA selects the weights and biases with better fitness to be the parents that produce the next population. The selection algorithm used in this study is stochastic uniform: each parent is mapped to a section of a line proportional to its fitness, and the algorithm moves along the line, selecting parents based on the section it lands on.
4) Produce Children: the selected parents create children through either crossover or mutation. Crossover combines 2 parents from the current population to produce a child, while mutation randomly changes the genes of a member. The crossover function used in this study is single-point crossover, which chooses a random number between 1 and the number of variables as the point at which the crossover occurs. The mutation operator is Gaussian mutation, which adds a random number to each parent; the amount of mutation usually decreases with each new generation. After forming the children, GA replaces the members of the current population with the children it produced. GA stops iterating when the stopping criterion is met.
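The steps above can be sketched as a toy GA in Python. This is an illustration only: it trains a single tanh neuron rather than the paper's full network, and truncation selection stands in for the stochastic uniform selection described above. The single-point crossover and the Gaussian mutation whose spread shrinks with each generation follow step 4; all data and parameter values are hypothetical.

```python
import random
from math import tanh

def mse(w, data):
    """Fitness: mean squared error of a single tanh neuron whose input
    weights are w[:-1] and whose bias is w[-1]."""
    err = 0.0
    for x, t in data:
        out = tanh(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])
        err += (out - t) ** 2
    return err / len(data)

def evolve(data, n_weights, pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)
    # 1) Define Population: random weight/bias chromosomes.
    pop = [[rng.uniform(-1, 1) for _ in range(n_weights + 1)]
           for _ in range(pop_size)]
    for gen in range(generations):
        # 2) Calculate Fitness: lower MSE = fitter.
        pop.sort(key=lambda w: mse(w, data))
        # 3) Select Best Members (truncation selection as a stand-in).
        parents = pop[: pop_size // 2]
        # 4) Produce Children by crossover and mutation.
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))         # single-point crossover
            child = a[:cut] + b[cut:]
            sigma = 0.5 * (1 - gen / generations)  # mutation shrinks over time
            children.append([g + rng.gauss(0, sigma) for g in child])
        pop = parents + children                   # replace the population
    return min(pop, key=lambda w: mse(w, data))

# Hypothetical 2-input pattern, roughly t = tanh(x1 - x2).
data = [((0.0, 0.0), 0.0), ((1.0, 0.0), 0.7),
        ((0.0, 1.0), -0.7), ((1.0, 1.0), 0.0)]
best = evolve(data, n_weights=2)
```

Because the fittest members survive unchanged, the best MSE never worsens across generations, mirroring the convergence behavior described in the text.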

MLR Results
EViews software is used to statistically test the model. Tables 1-4 show the MLR results for the correlation test, heteroscedasticity test, autocorrelation test, and the final result.
The result in Table 1 shows that no variable has a correlation greater than 0.8 (or less than -0.8) with any other variable. Moreover, the correlations between the independent variables are insignificant. Since no highly correlated variables exist, the multicollinearity problem does not hold and no variable needs to be dropped from the proposed model. Table 2 shows that there are two independent variables (EARN and FCF) whose p-value is less than 0.05, which signifies that as these independent variables increase, the variance of the dependent variable also increases significantly. Table 2 also shows that the p-value of the Obs*R-squared of the Glejser test is 0.0000, which is less than 0.05; in other words, the data used in this research contain a heteroscedasticity problem. If left untreated, heteroscedasticity would affect the significance of the statistical tests, so the White heteroscedasticity-consistent procedure is used to address this issue; the result is presented in Table 4. The Breusch-Godfrey test in Table 3 shows that autocorrelation exists, as one of the independent variables (EARN) has a significant effect on the residual (p-value 0.0266, less than 0.05) and the Obs*R-squared Prob. Chi-Square is 0.0000, also less than 0.05. Newey-West standard errors are used to address this issue; the result is presented in Table 4.
Having identified all of the problems, the treatments were conducted. Table 4 presents the final result after treating the heteroscedasticity and autocorrelation issues by including the White heteroscedasticity-consistent procedure and Newey-West standard errors in the multiple linear regression.

GA-NN Results
Table 5 shows the GA parameters. The Global Optimization Toolbox provided by Matlab is used to run GA. This toolbox can search for global solutions to problems that contain multiple minima. The GA solver in this toolbox can be customized by modifying its initial population or by specifying its selection, crossover and mutation functions. The GA solver returns the best fitness value and the variable values that achieve it.

Table 5. GA Parameters
After configuring the GA-NN structure, GA is used to train the network. 80% of the sample data is used for training and 20% for testing. After running GA, the minimum MSE of 0.0001458 is achieved at generation 87; the optimization terminates once the average change in the fitness value is less than the error goal. The weights produced by GA are then fed to the ANN for training using backpropagation.

Comparison Result
MLR and GA-NN are compared in terms of their forecasting performance: the forecast of each model for the dividend payout is compared with the actual dividend payout. Two methods are used to analyze and compare the results of both models: 1) comparing a graph of the actual dividend payout against the forecasts of both models on the sample data, and 2) comparing the Mean Squared Error of both methods.
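Method 2 reduces to computing the MSE of each forecast series against the actual series and choosing the model with the lower error. A minimal Python sketch with hypothetical numbers (not the study's actual forecasts):

```python
def mse(actual, forecast):
    """Mean squared error between actual and forecast series."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical dividend payouts and forecasts for illustration.
actual  = [0.10, 0.25, 0.40, 0.30]
mlr_fc  = [0.12, 0.24, 0.38, 0.31]
gann_fc = [0.15, 0.20, 0.45, 0.25]

scores = {"MLR": mse(actual, mlr_fc), "GA-NN": mse(actual, gann_fc)}
best = min(scores, key=scores.get)   # the model with the lower MSE wins
print(best)                          # → MLR
```

In this hypothetical example MLR wins because its forecasts sit closer to the actual series, which is exactly the criterion applied to the real results below.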

1). Comparing Graph of Actual vs Forecast values
Figures 5-7 show graphs comparing the forecasting results with the actual values. Figure 5 compares the GA-NN forecast with the actual values, while Figure 6 shows the comparison for the MLR results. Figure 7 overlays the forecasts of both models against the actual values.

CONCLUSION
The MLR results show that EARN, FCF, GrOp, and LIQ significantly affect a company's dividend payout decision. The model used to assess the significance of those factors is itself significant (the p-value of the F-test is 0.0000 and the model reliability is 0.88). MLR can therefore be used by investors to assess the possibility of generating a return on an investment in a particular company.
The GA-NN results show that the target data has a significant relationship with the output data (shown by the R value of 0.88). Furthermore, GA can be used as a training algorithm for ANN, as confirmed by the low MSE value achieved by GA. Using the sample data collected from Datastream, the GA-NN model can be used to forecast the dividend payout of Indonesian manufacturing companies.
Comparing the MSE of both methods, we conclude that MLR produces a better model than GA-NN because it yields a lower MSE value; the MLR forecast values are closer to the actual values. This result indicates that MLR generates a better fit and forecast than the GA-NN model.

RESEARCH LIMITATION AND RECOMMENDATION
This research has several limitations. Firstly, GA optimization has drawbacks: it is slow and requires tuning of the algorithm parameters. Secondly, MLR has to fulfill certain assumptions regarding autocorrelation, multicollinearity, and heteroscedasticity. Thirdly, the performance comparison of the two models is based only on MSE and regression.
Given these limitations, future research could consider using other training algorithms that run faster; according to [15], if the data have a small ratio of inputs to targets, the resilient backpropagation algorithm is probably best. Generalized Least Squares (GLS) might be used as the statistical forecasting method, since GLS is not restricted by the assumptions of MLR. Other evaluation measures might also be used when comparing the statistical and computational intelligence methods in forecasting dividend payout.

Figure 1. The Multicollinearity, Heteroscedasticity, and Autocorrelation Tests

Fakultas Ilmu Komputer | Universitas Klabat | CORIS | ISSN: 2541-2221 | E-ISSN: 2477-8079

Based on the graph in Figure 7, the fit of GA-NN to the actual values is poor compared to MLR; the fit is quantified by the MSE.

Figure 5. Comparison of GA-NN Forecast with Actual Values

Table 4. Multiple Linear Regression

2). Comparing Errors and Fit of Data
Table 7 shows the comparison of the MSE and R values of both methods. This table confirms the graphs shown above: the MLR model outperforms GA-NN, as the MSE of the MLR model is lower than that of GA-NN. The MLR model performs better in terms of both its errors and the fit of its data.

Table 7. Comparison of GA-NN and MLR