Flower Pollination Inspired Algorithm on Exchange Rates Prediction Case



INTRODUCTION
The importance of planning business targets, resources, investments, and other management activities makes forecasting indispensable for ensuring company continuity [1]. Today, every significant company forecasts annually or quarterly and uses computational methods to obtain better predictions. Computer science offers several methods for building a forecast model from different kinds of data; some include economic processes in the calculation, while the simplest and lightest use only time-series data [2] [3]. A forecasting model can be built using a traditional statistical model such as the autoregressive model [4] [5], curve fitting [6], or nonlinear modeling such as neural networks [7] [8]. Advances in computer speed and large-scale data handling have given rise to another family of methods, which is also the one proposed in this research: nature-inspired and intelligence-based models [9] [10].
Much related research using intelligence-based models has been published, such as comparisons of artificial neural networks (ANN) with hybrid ANN-ARMA and ANN-ARIMA models [11] [12]. The results show that a hybrid method produces a model with less error than a pure ANN model. Adding ARIMA makes the ANN robust against highly volatile, complex, and noisy data [13]. Using a hybrid method, the model can predict with a correlation coefficient as high as 99.545% [11].
Another method that has been shown to forecast exchange market values is the Genetic Algorithm Neural Network (GANN) [14]. GANN combines a neural network with the nature-inspired genetic algorithm [15]. Using one year of currency exchange data, GANN can produce a model with an RMSE as low as 0.044% [14].
Among the emerging state-of-the-art bio-inspired methodologies, Yang [16] proposed a new nature-inspired method based on the behavior of flower pollination. The Flower Pollination Algorithm is a swarm intelligence method modeled on how flower pollination occurs. A flower can be pollinated by its own pollen or by pollen from another flower. In this algorithm, self-pollination corresponds to local search, and pollination from another flower corresponds to global search. The flower pollination algorithm balances global and local search and uses Levy flights for better global search performance. Yang [16] stated four rules for the flower pollination algorithm:
• Rule 1 - Cross-pollination, or interfloral pollination, is global pollination that follows a Levy distribution.
• Rule 2 - Self-pollination is local pollination.
• Rule 3 - Pollinators can develop flower constancy, which corresponds to a reproduction probability proportional to the similarity of the two flowers involved.
• Rule 4 - The switch between local pollination and global pollination is controlled by a random variable (called the switch probability), with a slight bias toward local pollination.
In this research, the Flower Pollination Algorithm (FPA) is used to forecast the exchange rate of the United Kingdom pound sterling (GBP) to the US dollar (USD). The GBP to USD exchange rate is treated as a time-series problem in which several preceding values are used as input and the output is the value one step into the future.
A similar exchange-rate problem has been studied using classical methods such as ARIMA [17] and hybrid methods that combine artificial intelligence with nature-inspired algorithms. An artificial neural network can be combined with classical statistical methods such as ARMA and ARIMA to achieve a robust forecast model [17] [18]. The classical method is used to better understand the relationship between variables, while the ANN works as an optimizer. However, a hybrid method has more parameters and requires an additional method to determine them, as in prior studies [19]. Nevertheless, a hybrid of ANN with ARMA or ARIMA yields a model that can withstand volatile, complex, and noisy data.
The genetic algorithm is a nature-inspired algorithm that, like the classical and neural-based approaches, can be used as a forecasting tool [21]; it starts from a random population of solutions [22]. Reproduction happens at each iteration, and the genetic algorithm reproduces only among the selected population to breed new and hopefully better solutions. In the genetic algorithm, crossover and mutation produce the next-generation solutions from the prior solutions [23]. In FPA, by contrast, two crucial phases take place: local pollination and global pollination [16]. Global pollination means pollination between flowers, and local pollination is pollination of a flower by itself. The mathematical formulations of global and local pollination are given in Equation 1 and Equation 3, respectively. Using the Levy distribution [24] for diversification (exploration) in global pollination makes the flower pollination algorithm more reliable in solving global optimization problems, because it allows the search to jump out of local optima [25].
Time-series problems are nonlinear but have a relatively small dimension. Solving them with a neural-based approach or a hybrid method takes too much effort relative to that dimension. Therefore, in this work, we propose using only FPA to find the globally optimal solution to the time-series problem. We propose forecasting with flower pollination because it involves optimization, like ANN, and survival of the fittest, without a heavy computational cost.
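The equations referred to above are not reproduced in this text. For orientation only, the update rules in Yang's formulation [16] are commonly written in the form below. The symbols γ (a step-size scaling factor), λ (the Levy exponent), s (the step length), and ε (a uniform random number) follow that common form and are assumptions here, as is the mapping of these formulas to the equation numbers cited above; some presentations omit γ or write the global step with the opposite sign.

```latex
% Global pollination (assumed to correspond to Equation 1)
x_i^{t+1} = x_i^{t} + \gamma \, L(\lambda) \, \bigl(g^{*} - x_i^{t}\bigr)

% Levy-distributed step length, valid for large steps s \gg s_0 > 0
L(\lambda) \sim \frac{\lambda \, \Gamma(\lambda) \, \sin(\pi \lambda / 2)}{\pi} \cdot \frac{1}{s^{1 + \lambda}}

% Local pollination (assumed to correspond to Equation 3)
x_i^{t+1} = x_i^{t} + \epsilon \, \bigl(x_j^{t} - x_k^{t}\bigr), \qquad \epsilon \sim U(0, 1)
```

Here x_i^t is solution i at generation t, g* is the current best solution, and x_j^t and x_k^t are two solutions picked at random from the same generation.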

Figure 1 The GBP-USD data in the line chart
This research uses the exchange rate of the United Kingdom pound sterling (GBP) to the US dollar (USD). The exchange rates span January 24th, 2008, to February 29th, 2016. From this date range, we obtain a time series of 2102 records, presented in Figure 1. We use the closing rate for each day. We conduct exploratory data analysis to obtain the statistical information of the dataset; the result is shown in Table 1. The analysis shows that at its peak GBP is worth nearly twice USD, whereas at its lowest GBP is around 1.3 times USD. Based on the mean, mode, median, and standard deviation, the central tendency of the exchange rate in this dataset is around 1.6, with a relatively small deviation.

Flower Pollination Algorithm
The flower pollination algorithm is a bio-inspired optimization method based on the cross-pollination of flowers, proposed by Yang [16]. The algorithm is written as pseudocode in Figure 2 and can be visualized as a flowchart, as in prior research [22], in Figure 3. Our research aims to obtain the best solution g* from the set of solutions x at generation t. Following Figure 2, we initialize the population at t = 0 by generating d random solutions x (the population size d is explored in the next section). In every generation with d solutions, each solution x consists of regression coefficients xi, where i = 1, 2, ..., n, and a bias x0, so that x = {x0, x1, x2, ..., xn}. A solution predicts the exchange rate D′T of day T by linear regression on the n preceding days: each xi is multiplied by DT−i, the products are summed, and x0 is added. Formally, the linear regression for our research is written in Equation 4.
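The prediction described above (bias plus the sum of coefficient-times-lag products) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the function name `predict` and the oldest-to-newest ordering of the window are assumptions made for the example.

```python
import numpy as np

def predict(x, window):
    """Predict the next rate D'_T from the n previous rates.

    x      : sequence of length n + 1; x[0] is the bias, x[1:] the coefficients
    window : sequence of length n, ordered oldest to newest
             (D_{T-n}, ..., D_{T-1}); this ordering is an assumption
    """
    x = np.asarray(x, dtype=float)
    window = np.asarray(window, dtype=float)
    # x_i multiplies D_{T-i}, so reverse the window to put D_{T-1} first.
    return float(x[0] + np.dot(x[1:], window[::-1]))
```

For example, with n = 5, the solution `[0, 1, 0, 0, 0, 0]` simply repeats the most recent rate, which gives a useful baseline to compare any optimized solution against.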
The objective for each solution x is to minimize the difference between the predicted exchange rate D′T and the actual exchange rate DT. We use the root mean squared error (RMSE) to measure this difference for each solution. RMSE is written in Equation 5, where n is equal to the length of the time-series record. Ideally, the RMSE from Equation 5 is 0, because the predicted exchange rate D′T equals the actual exchange rate DT and the squared error becomes zero.
By minimizing the RMSE, we define the fitness function in Equation 6 to evaluate each solution. The best solution found across generations is g*, and it is used as the final solution. According to Equation 5, the RMSE is 0 in the ideal situation, which gives an ideal fitness of 1.
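The error measure and the fitness can be sketched as follows. RMSE is the standard definition. The exact form of the fitness function (Equation 6) is not reproduced in this text, so the form 1 / (1 + RMSE) used here is an assumption, chosen only because it yields a fitness of 1 when the RMSE is 0, as stated above.

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def fitness(predicted, actual):
    """Assumed fitness: 1 / (1 + RMSE), equal to 1 for a perfect prediction."""
    return 1.0 / (1.0 + rmse(predicted, actual))
```

Under this assumed form, fitness decreases monotonically as RMSE grows, so ranking solutions by fitness or by RMSE selects the same g*.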
For each generation, we maintain d solutions as a population. In the initial generation (t = 0), the best solution in the population is designated g*. In generation t, where t = 1, 2, ..., MaxGeneration, any solution that is better than g* replaces it as the new g*. Because g* is updated iteratively in each generation, a dynamic approach is needed. The solutions in generation t are formed by pollinating the solutions of generation t-1, either globally or locally, as stated in Equations 1 and 3, respectively. The switch between global and local pollination in generation t is controlled by the switch probability p. In the next section, we explore the hyperparameters for this research.
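The generation loop described above can be sketched as below. This is an illustrative sketch under several assumptions that the text does not specify: Levy steps are drawn with Mantegna's algorithm with exponent `beta = 1.5`, the initial population is uniform in [-1, 1], the step scale `gamma` is 0.1, and a flower is replaced only when its candidate is better (greedy replacement). The names `levy_step` and `flower_pollination` are hypothetical.

```python
import math
import numpy as np

def levy_step(rng, size, beta=1.5):
    """Draw Levy-like steps using Mantegna's algorithm (an assumed choice)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def flower_pollination(objective, dim, d=50, p=0.8, max_generation=100,
                       gamma=0.1, seed=0):
    """Minimise `objective` over vectors of length `dim`; returns (g*, score)."""
    rng = np.random.default_rng(seed)
    population = rng.uniform(-1.0, 1.0, (d, dim))  # initial range is an assumption
    scores = np.array([objective(x) for x in population])
    best = population[scores.argmin()].copy()
    best_score = float(scores.min())
    for _ in range(max_generation):
        for i in range(d):
            if rng.random() < p:
                # Global pollination: Levy flight relative to the best solution g*.
                step = gamma * levy_step(rng, dim)
                candidate = population[i] + step * (best - population[i])
            else:
                # Local pollination: combine two randomly chosen flowers.
                j, k = rng.choice(d, size=2, replace=False)
                candidate = population[i] + rng.random() * (population[j] - population[k])
            score = float(objective(candidate))
            if score < scores[i]:  # keep the candidate only if it improves flower i
                population[i] = candidate
                scores[i] = score
                if score < best_score:
                    best, best_score = candidate.copy(), score
    return best, best_score
```

In the setting of this paper, `objective` would compute the RMSE of a coefficient vector on the training windows and `dim` would be n + 1. Because of the greedy replacement, the returned score can never be worse than the best score of the initial population.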

Experiment Setup
The data is split into training data, used to obtain the optimal coefficients as the solution, and testing data, used to evaluate the final solution. The ratio of training to testing data is 80:20. The split is applied after the data has been processed into sliding windows of size n.
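The windowing and split described above can be sketched as follows. The helper names `make_windows` and `split_80_20` are hypothetical, and the chronological (unshuffled) split is an assumption, since the text does not say how the 80:20 partition is drawn.

```python
import numpy as np

def make_windows(series, n=5):
    """Turn a 1-D series into (inputs, targets) pairs of sliding windows.

    Row t of `inputs` holds n consecutive values; `targets[t]` is the value
    that immediately follows them.
    """
    series = np.asarray(series, dtype=float)
    inputs = np.lib.stride_tricks.sliding_window_view(series[:-1], n)
    targets = series[n:]
    return inputs, targets

def split_80_20(inputs, targets, train_fraction=0.8):
    """Chronological split (assumed): earliest windows train, latest test."""
    cut = int(len(targets) * train_fraction)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])
```

Splitting after windowing, as the text specifies, keeps every window intact: no window straddles the boundary between the training and testing sets.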
In this research, we experiment with some of the algorithm's hyperparameters. Others are fixed and not part of the experiment: the number of days used for predicting, n, is 5, and the number of generations (iterations), m, is 100. The population size d and the switch probability p are the hyperparameters that are tuned. We experiment with 50, 100, and 150 as the population size d and 0.3, 0.5, and 0.8 as the switch probability p. The combinations of d and p give 9 experiments.
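The experiment grid can be enumerated directly from the values stated above. The sketch below only lists the configurations; the dictionary keys and constant names are hypothetical.

```python
from itertools import product

# Fixed hyperparameters, as stated in the text.
N_DAYS = 5            # n: number of previous days used for each prediction
MAX_GENERATION = 100  # m: number of generations/iterations

# Tuned hyperparameters.
POPULATION_SIZES = (50, 100, 150)       # d
SWITCH_PROBABILITIES = (0.3, 0.5, 0.8)  # p

experiments = [
    {"d": d, "p": p, "n": N_DAYS, "m": MAX_GENERATION}
    for d, p in product(POPULATION_SIZES, SWITCH_PROBABILITIES)
]
```

Each entry corresponds to one training run, so the three population sizes and three switch probabilities yield nine runs.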

RESULT AND DISCUSSION
The flower pollination algorithm converges quickly when training to find the optimal solution. This is shown in Figure 4, which displays only the first 50 generations. Although every experiment runs for m = 100 generations, all experiments keep the same g* from the 44th to the 100th generation, which raises the question of whether the flower pollination algorithm converges prematurely. Furthermore, most experiments converge before the 20th generation.
The RMSE of each experiment on the testing data is shown in Table 2. From this result, we can see that RMSE tends to decrease as the population size increases. Although the experiment with a switch probability of 0.5 does not achieve its best RMSE at a population size of 150, the difference from a population size of 100 is insignificant. In other words, the fitness of the solution grows with the population size: a population size of 150 gives better results than 100 and 50, and 100 is better than 50. This shows that the number of solutions formed affects the quality of the best solution g*, because the solutions in generation t are generated directly from pollination in generation t-1. More solutions in generation t-1 mean more pollinations for generation t, and more pollinations imply a higher probability of generating a solution better than the prior g*.
For the switch probability p, Table 2 shows that a higher switch probability gives a better solution in terms of RMSE. For each population size d, a higher p always gives a lower RMSE, which means a better solution. A higher p means a higher chance that global pollination happens, so global pollination tends to generate better solutions than local pollination. As stated in Equation 1, in global pollination each solution of generation t is generated by pollinating the solution itself at t-1 with the best solution. The result tends to be a solution close to g*, if not better than the previous g*. In contrast, local pollination only pollinates random solutions to form a solution in the next generation. Even though it can generate a solution better than g*, the probability is lower than with global pollination, as our switch probability experiment shows.
Further analysis can be conducted based on the prediction visualizations. Figure 5, Figure 6, and Figure 7 show the predictions for each experiment. The final solution of each experiment is applied to all data. To keep the figures readable, we plot the average of every 20 consecutive data points. At a glance, the predictions of every experiment closely follow the actual data. However, in Figure 5 (where p = 0.3), the prediction misses by a large margin when the actual value suddenly peaks or drops relative to the previous value. Table 2 reflects this as well: p = 0.3 gives the worst RMSE. The solution in Figure 6 reduces the mispredictions of Figure 5 by increasing the switch probability p to 0.5, and Figure 7, which uses p = 0.8, improves further. This is also reflected in Table 2, where p = 0.8 returns the best result. This raises another question: whether using entirely global pollination in the flower pollination algorithm would be better for this exchange rate case.
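The averaging of 20 consecutive data points used for the visualizations can be read in more than one way. The sketch below assumes non-overlapping blocks of 20 and drops any incomplete trailing block; both choices, and the name `block_average`, are assumptions rather than details given in the text.

```python
import numpy as np

def block_average(values, block=20):
    """Average consecutive, non-overlapping blocks of `block` values.

    Any incomplete block at the end of the series is dropped (an assumption).
    Expects at least `block` values.
    """
    values = np.asarray(values, dtype=float)
    usable = len(values) - len(values) % block
    return values[:usable].reshape(-1, block).mean(axis=1)
```

Applying the same function to both the predicted and the actual series keeps the two curves aligned point for point in the plot.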
The training time of all experiments is also recorded in Table 2. Running time increases with the switch probability p: for an equal population size, running time is longer when p is larger, indicating that global pollination consumes slightly more computational time. However, the difference in running time across switch probabilities is insignificant, as shown in Table 2. From this experiment, it can be concluded that global and local pollination have relatively similar, if not the same, time complexity. A noticeable time difference appears between population sizes d at a constant switch probability. The running time is linearly proportional to the population size d, since a larger population means more time to evaluate each solution in a generation.

CONCLUSION
The flower pollination algorithm is an algorithm for solving optimization problems, inspired by the cross-pollination of flowers. Our research examines the use of the flower pollination algorithm for time-series prediction with linear regression, where the solutions are represented as flowers. Each solution is a set of regression coefficients, and we evaluate each flower in a generation using RMSE to find the best solution. The next generation is generated by pollinating all solutions of the prior generation, with either global or local pollination. The best solution over all generations is taken as the final solution.
This research conducts several experiments to examine the impact of population size and switch probability. All experiments converge quickly, well before the maximum number of iterations. Our results show that a larger population size gives a final solution with lower RMSE, and hence higher fitness. The same holds for the switch probability: higher probabilities give better final solutions, because global pollination is more likely to generate a solution better than the current best than local pollination, which pollinates a solution with other random solutions. In terms of running time, increasing either the switch probability or the population size increases training time, although population size contributes more than switch probability. In the future, more experiments with the flower pollination algorithm are needed. Further experiments can modify the global pollination step, which usually uses the Levy distribution. A similar attempt can be made for local pollination by modifying how the solutions to be cross-pollinated are randomly picked. Future research can also use mostly, if not entirely, global pollination.

Figure 4 RMSE of g* in each generation for each experiment of population d and switch probability p

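Figure 5 The prediction for each experiment of population d and switch probability p = 0.3
Figure 6 The prediction for each experiment of population d and switch probability p = 0.5
Figure 7 The prediction for each experiment of population d and switch probability p = 0.8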

Table 1
Exploratory Data Analysis for the Dataset

Table 2
RMSE in the final generation of testing data