Comparative Analysis of LSTM Neural Network and SVM for USD Exchange Rate Prediction: A Study on Different Training Data Scenarios

Purpose: This paper aims to investigate and compare the performance of the LSTM Neural Network and Support Vector Machines (SVM) in predicting the USD exchange rate using three different training data scenarios: 45%, 55%, and 75%. The study employs a dataset from the Indonesian Central Bureau of Statistics (BPS) for the period of January 1 to June 30, 2021, with the USD Selling Rate as the attribute of interest. Methods: The LSTM and SVM algorithms are implemented in the Python programming language using Google Colaboratory. Three distinct training data scenarios are explored to evaluate the models' robustness. Performance metrics, including Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared, are employed for evaluation. Result: Results reveal that LSTM demonstrates superior prediction accuracy compared to SVM across all scenarios, even though it incurs a longer training time. Notably, in the 75% training data scenario, LSTM achieves an MAE of 49.52, RMSE of 63.08, and R-squared of 0.37906, outperforming SVM with an MAE of 138.33, RMSE of 161.58, and R-squared of 0.34277. Novelty: This study compares the LSTM Neural Network and Support Vector Machines (SVM) for USD exchange rate prediction across different training scenarios (45%, 55%, and 75%). Unlike previous research focusing on individual models, this study systematically evaluates both methods, highlighting the balance between prediction accuracy and training time. The findings offer insights into the applicability of LSTM and SVM in currency forecasting, providing guidance for researchers and practitioners in model selection based on specific predictive task requirements.


INTRODUCTION
In the rapidly evolving landscape of economic globalization, currency exchange rates emerge as pivotal determinants of a country's economic dynamics [1], [2]. Policy decisions, industrial competitiveness, and investment strategies hinge significantly upon the fluctuations in currency values, notably the United States Dollar (USD), which serves as the global reserve currency, exerting substantial influence over the global financial market [3], [4]. Consequently, the development of predictive models that not only accurately forecast currency movements but also adapt to the dynamic nature of market conditions becomes imperative to facilitate informed decision-making among market participants, investors, and financial institutions [5], [6].
However, the landscape of predictive modeling for currency exchange rates is vast and diverse, encompassing various methodologies ranging from traditional time series models to advanced machine learning techniques. Despite this diversity, there remains a lack of consensus on the most effective approach, particularly in addressing the challenges posed by the dynamic nature of financial markets. Hence, a comprehensive review of existing literature and an identification of research gaps are essential to guide the selection of appropriate predictive models.
Against this backdrop, this research aims to address the aforementioned gap by focusing on two prominent predictive models: the LSTM (Long Short-Term Memory) neural network and the SVM (Support Vector Machine). These models are chosen for their distinct capabilities and strengths in handling time series data and capturing complex patterns. The LSTM neural network, a type of recurrent neural network, is renowned for its ability to model long-term dependencies and intricate temporal relationships [7], [8]. Conversely, the SVM, initially developed for classification tasks, has proven to be proficient in regression problems and excels in capturing nonlinear patterns within high-dimensional data [9], [10]. While both models offer promising avenues for predicting exchange rates, it is imperative to acknowledge their respective limitations. The LSTM neural network, despite its ability to capture long-term dependencies, may suffer from vanishing or exploding gradients, leading to challenges in training with prolonged sequences of data. Similarly, the SVM, while effective in capturing nonlinear patterns, may struggle with large datasets and requires careful selection of appropriate kernel functions to optimize performance.
Through a rigorous examination of the strengths and weaknesses of the LSTM neural network and SVM, this research seeks to provide a nuanced understanding of their applicability in predicting exchange rates. By conducting controlled empirical experiments and scrutinizing relevant literature, this study endeavors to offer insights into the comparative performance of these models under varying market conditions. Ultimately, the findings of this research are anticipated to inform decision-makers in selecting suitable predictive models and contribute to advancing both theoretical understanding and practical applications in navigating the complexities of currency exchange rate dynamics within the global market [4], [6].

METHODS
A more thorough literature review is warranted to provide a comprehensive understanding of the research landscape and to identify specific gaps in knowledge. Previous studies in the field of currency exchange rate prediction have addressed various predictive models, including autoregressive integrated moving average (ARIMA), neural networks, support vector machines (SVM), and long short-term memory (LSTM) networks, among others [11], [12], [13], [14], [15], [16], [17], [18], [19]. However, a significant gap exists in the comparative analysis of different models, particularly in the context of dynamic financial markets characterized by rapid fluctuations and complex interactions [20]. This research aims to address this gap by conducting a comprehensive comparative analysis of two prominent predictive models: the LSTM neural network and the SVM. While previous studies have explored the predictive capabilities of individual models, the novelty of this research lies in its comparative approach, which offers a more robust framework for model evaluation and selection [21], [22], [23].
Furthermore, by leveraging insights from previous studies on the limitations and challenges associated with each model, this research aims to identify strategies for enhancing predictive accuracy and adaptability. Through a systematic analysis of empirical data and rigorous statistical methods, this study endeavors to contribute to advancing both theoretical understanding and practical applications in the field of currency exchange rate forecasting [24], [25].
The comparison of the LSTM neural network and SVM models represents a novel contribution to the literature, as previous research has primarily focused on the performance of individual models without direct comparison [26], [27], [28]. By conducting a head-to-head comparison of these two models, this research seeks to provide valuable insights into their relative strengths and weaknesses, thus offering a more informed basis for model selection in currency exchange rate prediction.
In summary, while existing literature has provided valuable insights into the predictive performance of individual models, a comprehensive comparative analysis remains lacking. This research aims to fill this gap by providing a rigorous evaluation of the LSTM neural network and SVM, thus offering valuable insights for informed decision-making in the dynamic realm of currency exchange rate prediction.
The LSTM Neural Network is specifically designed to handle time series data with long-term memory, enabling an understanding of complex temporal patterns [7], [8]. In the context of predicting the USD selling value involving historical data, the ability of LSTM to capture long-term relationships can provide an advantage in handling exchange rate fluctuations associated with historical factors. SVM, although initially known in the context of classification, is also capable of addressing regression problems. The strength of SVM lies in its ability to handle high-dimensional data and find the best hyperplane to separate classes or model regression relationships [9], [10], [29]. In the context of exchange rate prediction, the reliability of SVM in handling regression problems can provide an interesting contrast to the recurrent LSTM approach.
LSTM has the ability to recognize more complex temporal patterns due to its recursive structure [7], [30]. This model can understand and leverage long-term historical information, which is crucial in dealing with dynamic changes in currency exchange rates. SVM has good generalization ability and can provide reliable solutions even for complex data. Additionally, SVM is often considered more interpretable, providing clarity in understanding how the model makes decisions or determines relationships between variables [9], [31].
The use of LSTM and SVM in the context of exchange rate prediction can provide advantages. Combining the recursive capabilities of LSTM in capturing long-term temporal patterns with the regression and classification capabilities of SVM can result in a robust and reliable model. Both methods are chosen due to their relevance to specific challenges associated with exchange rate prediction. In a rapidly changing financial market environment, LSTM and SVM represent two different approaches and can provide valuable insights into the performance of models in dynamic situations.

Identification of research objectives
The primary objective of this research is to compare the performance of two prediction methods, namely Long Short-Term Memory (LSTM) Neural Network and Support Vector Machine (SVM), in forecasting the selling value of the United States Dollar (USD). The main focus of performance measurement is the USD selling attribute. This study aims to provide in-depth insights into the capabilities of both methods in anticipating changes in currency exchange rates.

Data collection
The data used in this research is sourced from the Central Statistics Agency (Badan Pusat Statistik or BPS) and covers the time period from January 1 to June 30, 2021. The dataset consists of several attributes, including Date, USD Selling, Inflation, Money Supply (in Billion Rupiah), Import Value (in Million Dollars), and Export Value (in Million Dollars). This data will serve as the foundation for evaluating the performance of the LSTM Neural Network and SVM prediction methods.

Designing of experimentation scenarios
Before implementing the prediction methods, a scenario testing plan will be formulated. This involves selecting parameters for LSTM and SVM. In the scenario testing plan phase, this research will implement three different scenarios for the division of data between training and testing data. These scenarios were informed by best practices in machine learning experimentation [32], [33]. The division of data into these two sets has a significant impact on the formation of predictive models, and therefore, variations need to be made to understand the performance of the methods comprehensively.
For the LSTM neural network method in this study, the architecture specification is defined in Table 1, which consists of the LSTM model architecture specifications, model compilation, and early stopping.

Evaluation of method performance
LSTM is a specialized type of layer in recurrent neural networks (RNN) designed to handle long-term dependency issues. LSTM is used to understand temporal patterns in time series data [7], [8]. The LSTM layer in this study has 50 neuron units that determine the model's complexity and its ability to capture patterns in time series data. The ReLU activation function introduces non-linearity to the LSTM layer, allowing the model to learn from more complex data. The input shape accepted by this LSTM layer indicates the number of timesteps and features at each timestep. This aligns with the structure of the time series data used in the model. On the other hand, the Dense (Output Layer) with 1 neuron is a fully connected layer that summarizes the results from the LSTM layer into a single output value. Because this is a regression task to predict currency exchange rates, one unit is sufficient to generate the single numerical output value.
The LSTM utilizes gates to control the information that is stored or ignored in the memory cell. The LSTM gate equations reflect the specific operations that occur within an LSTM cell during one time step. In each subsequent time step, these values are updated using information from the previous time step and the current input.
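For reference, the gate operations of an LSTM cell at time step t are commonly written as below. This is the standard textbook formulation, given as a sketch rather than as the exact expressions used in this study; the symbol names and the concatenation notation [h_{t-1}, x_t] are assumptions. The function g denotes the cell activation, which is tanh in the standard formulation, while the text describes a ReLU activation for the layer used here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= g\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot g\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

Here x_t is the input at time step t, h_{t-1} and c_{t-1} are the hidden and cell states carried over from the previous step, \sigma is the logistic sigmoid, and \odot is element-wise multiplication.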
Meanwhile, SVM is a machine learning algorithm used for classification and regression problems [9], [10], [29]. The SVM model in this research uses Support Vector Regression (SVR) with a linear kernel. SVM can employ various types of kernels (linear, polynomial, radial basis function, etc.), and in this case, a linear kernel is chosen for the following reasons:
1. A linear kernel has a relatively straightforward interpretation. The decision function produced by SVM with a linear kernel can be explained in the original feature space, aiding in understanding the relationship between features and the target.
2. A linear kernel is suitable when the relationship between features and the target can be approximated linearly. If the data can inherently be described by a straight line, then a linear kernel can provide good results.
3. SVM with a linear kernel is typically computationally more efficient compared to non-linear kernels. Training the model and making predictions with a linear kernel can be faster, especially on large datasets.
4. A linear kernel tends to provide strong regularization to the model. This can help avoid overfitting, especially when the data is limited or tends to be noisy.
For SVM in the context of regression, such as Support Vector Regression (SVR), the underlying equation is as follows:
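A standard form of the SVR prediction function is shown here as a reference sketch; it is the usual textbook dual formulation and not necessarily the exact expression used in this study, and the symbol names are assumptions.

```latex
f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) K(x_i, x) + b,
\qquad
K(x_i, x) = x_i^{\top} x \quad \text{(linear kernel)}
```

Here x_i are the training feature vectors, \alpha_i and \alpha_i^{*} are the dual coefficients learned during training, b is the bias term, and K is the kernel function. With a linear kernel the sum collapses to f(x) = w^{\top} x + b with w = \sum_i (\alpha_i - \alpha_i^{*}) x_i, which is why the fitted model can be read directly in the original feature space.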

Scenario-2 (55:45)
▪ The 55%-45% data split aims to achieve a better balance between the training and testing phases.
▪ With a larger allocation of test data, the model is given the opportunity to assess the generalization of prediction outcomes.
▪ This approach emphasizes training the model by providing limited room to test the model on data not involved in the training process.
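The three data split scenarios can be sketched as follows. This is a minimal illustration that assumes a chronological split (earlier observations for training, later ones for testing), which is common for time series but is not stated explicitly in the text; the function name `chronological_split` and the dictionary `SCENARIOS` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Training share of each scenario, in percent (45:55, 55:45, 75:25).
SCENARIOS: Dict[str, int] = {
    "scenario-1": 45,
    "scenario-2": 55,
    "scenario-3": 75,
}


def chronological_split(
    series: Sequence[float], train_percent: int
) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series into a leading train part and a trailing test part.

    Assumption: the split keeps the temporal order and does not shuffle.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(series) * train_percent // 100
    return list(series[:cut]), list(series[cut:])


if __name__ == "__main__":
    # Placeholder values only, not data from the study.
    dummy_rates = [float(14000 + day) for day in range(20)]
    for name, percent in SCENARIOS.items():
        train, test = chronological_split(dummy_rates, percent)
        print(name, len(train), len(test))
```

Keeping the order intact means the test set always contains the most recent observations, so the evaluation resembles forecasting forward in time.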

Implementation of methods
The LSTM Neural Network was implemented using TensorFlow [34], a widely used deep learning framework, with specific configurations as detailed in Table 1. The choice of TensorFlow aligns with its reputation for efficient handling of sequential data and complex neural network architectures [35]. The SVM implementation utilized the Scikit-learn library [36], a popular machine learning toolkit in Python, known for its ease of use and robust implementation of SVM algorithms [37]. The programming language used was Python, and all experiments were conducted through the Google Colaboratory (Google Colab) platform, a Python-based environment running in Google Cloud, leveraging the flexibility of cloud computing for efficient analysis and model training. Tool specifications were also a critical factor in ensuring the continuity and accuracy of the analysis process. The device specifications used to run Google Colab are presented in Table 2.
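Because an LSTM layer expects input organized as timesteps and features, a univariate series is typically converted into sliding windows before training. The sketch below shows one plausible way to do this in plain Python; the window length and the helper name `make_windows` are assumptions for illustration, since the text does not state how the input sequences were built.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], timesteps: int
) -> Tuple[List[List[List[float]]], List[float]]:
    """Turn a 1-D series into (samples, timesteps, 1) inputs and next-step targets.

    Each input window holds `timesteps` consecutive values (one feature each),
    and the target is the value immediately following the window.
    """
    if timesteps < 1:
        raise ValueError("timesteps must be at least 1")
    inputs: List[List[List[float]]] = []
    targets: List[float] = []
    for start in range(len(series) - timesteps):
        window = series[start:start + timesteps]
        inputs.append([[float(value)] for value in window])
        targets.append(float(series[start + timesteps]))
    return inputs, targets


if __name__ == "__main__":
    # Placeholder values only, not data from the study.
    x, y = make_windows([1, 2, 3, 4, 5], timesteps=2)
    print(x)
    print(y)
```

With a series of length n and a window of length t, this produces n - t training samples, so short series and long windows leave few samples to learn from.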

Evaluation of method performance
To evaluate the performance of the LSTM Neural Network and SVM, several metrics were employed, including Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared. These metrics are standard in assessing the accuracy and goodness-of-fit of predictive models [27]. The choice of these metrics was informed by their relevance to regression tasks and their widespread use in evaluating machine learning models [28], [34].
MAE (Mean Absolute Error) is the average of the absolute differences between predictions and actual values. The lower the MAE value, the better the model performance [38], [39], [40].
RMSE (Root Mean Squared Error) measures the magnitude of the average prediction error in the same units as the original data. Like MAE, a lower RMSE value indicates better predictions [38], [39], [40].
R-squared represents the extent to which the variation in the dependent variable can be explained by the model. Higher values indicate a model that explains the data variation better [38], [39], [40].
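The three metrics are conventionally defined as below. These are the standard definitions, given as a reference sketch; the symbol \bar{y} for the mean of the actual values is an added assumption.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

Since the ratio in the last expression is non-negative, R^2 defined this way cannot exceed 1, and it becomes negative when the model predicts worse than the mean of the actual values.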
Notation: ŷ_i: estimated value; y_i: actual value; n: number of data points; i: index of data point
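The metrics can also be computed in a few lines of plain Python, as sketched below. The function names are hypothetical and the sample values are placeholders, not results from this study; in practice an equivalent library routine would normally be used.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the mean squared difference, in the units of the data."""
    squared = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    actual = [3.0, 5.0, 7.0]
    predicted = [2.0, 5.0, 9.0]
    print(mae(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

A prediction that is worse than simply using the mean of the actual values yields a negative R-squared, while a perfect prediction yields exactly 1.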

RESULTS AND DISCUSSIONS
This section illustrates and analyzes the experimental results of this research, focusing on the three data split scenarios previously described: 45:55, 55:45, and 75:25. Additionally, the discussion will involve the architectures of the implemented LSTM and SVM methods, as well as the tool specifications used in this study.
The exploration of data split scenarios aims to evaluate the reliability and accuracy of predictions from the LSTM Neural Network and Support Vector Machine (SVM) models in the context of predicting the USD exchange rate. As a complement, this section also provides details about the specific architectures used in the LSTM and SVM models, offering a deeper understanding of how they operate in this prediction task. Although the training time of SVM is shorter, the LSTM model appears to produce more accurate predictions based on lower MAE and RMSE values. The R-squared value reported for LSTM, which approaches 2, lies above the maximum of 1 that the coefficient of determination can take, so it should be read with caution rather than as evidence of greater explanatory power; SVM, for its part, has a relatively low R-squared value. Therefore, while SVM excels in time efficiency, the LSTM model provides more accurate and reliable results in the context of predicting the USD exchange rate in this data split scenario. In model selection, the trade-off between time and accuracy needs to be carefully considered depending on specific application needs (see Figure 1).
Critically, these results show that although LSTM requires a longer time for training, it tends to provide more accurate predictions compared to SVM in the context of currency exchange data. Thus, the decision between these two methods must weigh the trade-off between training time and desired prediction quality.

Scenario
The author should enrich the additional results by considering the implications of the findings for practical applications and further research. For example, the author could discuss how the superior performance of the LSTM Neural Network in predicting the USD exchange rate could benefit financial institutions, investors, and policymakers in making more informed decisions. Additionally, the author could explore potential extensions of the research, such as investigating the performance of other machine learning algorithms or incorporating additional features into the prediction models.
Furthermore, this segment should be enriched with comparisons to other studies to provide context for the results. By comparing the findings to existing literature, the author can assess whether the results are consistent with previous research or if there are any discrepancies that warrant further investigation. This comparative analysis could help validate the robustness of the findings and contribute to the advancement of knowledge in the field.

CONCLUSION
The conclusion drawn from the results of the three data-sharing scenarios for training reveals several significant findings. Firstly, when utilizing 75% of the data for training, the LSTM Neural Network exhibits a longer training time (11.52 seconds) than SVM.
The practical implications of these findings are significant for the finance sector. By demonstrating the superior predictive performance of the LSTM Neural Network, this research provides valuable insights for currency traders, financial institutions, and policymakers. Accurate exchange rate forecasts enable traders to make informed decisions and mitigate risks, while financial institutions can optimize their hedging strategies and minimize exposure to currency fluctuations. Policymakers can utilize these forecasts to formulate effective monetary policies and stabilize currency markets.
Furthermore, the research contributes to the advancement of predictive modeling techniques in finance. By showcasing the effectiveness of the LSTM Neural Network, the study encourages further exploration and development of deep learning algorithms for financial forecasting. Future research could explore the integration of additional data sources and the refinement of model architectures to enhance predictive accuracy further.
In conclusion, the findings of this research offer practical benefits for decision-making and risk management in the finance sector. By providing accurate exchange rate predictions, the LSTM Neural Network presents a valuable tool for navigating the complexities of global financial markets. As such, this research contributes to the ongoing development and innovation in financial forecasting methodologies.

Figure 1. Visualization of the experimental results in scenario-1: (a) LSTM; (b) SVM (data source: compiled by authors)

Scenario-2 (55:45)
Experimental results using 55% of the data as training data show a significant difference between the performance of the LSTM Neural Network and Support Vector Machines (SVM) models. The time required by the LSTM Neural Network model (5.73909) is much longer than that of SVM (0.60398), indicating higher complexity in the LSTM model training process. Nevertheless, the performance of the LSTM Neural Network, as measured by Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and coefficient of determination (R-squared), is better than SVM.

Figure 2. Visualization of the experimental results in scenario-2: (a) LSTM; (b) SVM (data source: compiled by authors)

Scenario-3 (75:25)
The results of experiments with a 75% data-sharing scenario for training show a performance comparison between the LSTM Neural Network and Support Vector Machines (SVM) methods in predicting USD currency exchange rates. Regarding the time required to train the model, the LSTM Neural Network requires more time, around 11.52 seconds, while SVM only takes 0.30 seconds. The relatively faster time of SVM is attributed to its essential characteristics, which are more computationally efficient, especially when dealing with large data sets. However, faster training times do not necessarily indicate better prediction quality (see Table 6).

Table 1. The architecture of the LSTM neural network method (data source: compiled by authors)

Notation for the LSTM equations:
▪ W_f, W_i, W_c, W_o: learned weight values
▪ b_f, b_i, b_c, b_o: learned bias values
For SVM with a linear kernel, the kernel function K(x_i, x_j) is calculated as the simple dot product between two feature vectors. For other kernels, such as polynomial or Radial Basis Function (RBF) kernels, the kernel function is adjusted according to the type of kernel used.
The model compilation process is a necessary step before training or evaluating a model in deep learning. Compilation configures the model by selecting the optimizer, loss function, and evaluation metrics. Some key elements of model compilation are:
1. Optimizer: The optimization algorithm that will be used to adjust the model weights based on the training data. 'Adam' is one of the commonly used optimizers because it is efficient and effective.
2. Loss Function: The loss function that will be minimized during the training process. For regression problems, such as predicting currency exchange rates in this case, Mean Squared Error (MSE) is often used.
3. Evaluation Metric (Optional): Metrics that will be evaluated during training and testing. While optional, these metrics provide additional insights into the model's performance.
With model compilation, the model is configured to learn from the data using the appropriate optimizer and loss function.
Early Stopping is a technique used during model training to prevent overfitting and expedite the training process. The main functions of Early Stopping are:
1. Monitor: Monitor a specific metric on the validation data, in this case, 'val_loss' or the loss on the validation data. If there is no improvement in that metric, the training will be stopped.
2. Patience: Indicates how long (how many epochs) we are willing to wait without improvement in the monitored metric before stopping the training. This helps prevent the model from learning noise in the training data.
3. Restore Best Weights: If set to True, it will revert the model to the best weights found during training. This helps ensure that we have a model with the best performance even if the training is stopped early.
Early Stopping is thus a strategy to control the training duration and prevent the model from overfitting, which can negatively impact performance on new data.
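The early-stopping behaviour described for Table 1 (monitoring the validation loss, waiting a number of epochs given by the patience, and restoring the best weights) can be illustrated with a framework-independent sketch. The function name `early_stopping_epochs` and the loss values are hypothetical, and the code only mimics the logic in plain Python rather than reproducing any library's implementation.

```python
from typing import List, Tuple


def early_stopping_epochs(val_losses: List[float], patience: int) -> Tuple[int, int]:
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the monitored validation loss has failed to improve
    for `patience` consecutive epochs. `best_epoch` is the epoch whose weights
    would be kept when restoring the best weights.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the waiting counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # The loss history ended before the patience ran out.
    return len(val_losses) - 1, best_epoch


if __name__ == "__main__":
    # Placeholder validation losses, not values from the study.
    history = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6]
    print(early_stopping_epochs(history, patience=3))
```

In this example a patience of 3 stops training before the later improvement at the final epoch is reached, which illustrates how a small patience shortens training at the risk of missing a late gain.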

Table 2. Specifications of the tools used (data source: compiled by authors)

Scenario-1 (45:55)
In the data split scenario with a training proportion of 45%, the experimental results show a performance comparison between the LSTM Neural Network and Support Vector Machines (SVM) models. The time required by the LSTM model is approximately 7.49 seconds, while SVM only takes about 0.43 seconds. These results indicate a significant advantage of SVM in terms of training speed. Turning to the performance evaluation metrics, the LSTM model shows a Mean Absolute Error (MAE) of 204.64, Root Mean Squared Error (RMSE) of 225.03, and R-Squared of around 1.99. On the other hand, the SVM model has an MAE of 132.50, RMSE of 162.81, and R-Squared of around 0.33 (see Table 4).

Table 4. Performance evaluation results of the methods in scenario-1 (data source: compiled by authors)

Table 5. Performance evaluation results of the methods in scenario-2 (data source: compiled by authors)

In scenario-2, the performance of the LSTM Neural Network as measured by MAE, RMSE, and the coefficient of determination (R-squared) is better than SVM. The lower MAE in LSTM (84.72266) indicates that its predictions have a smaller average error than SVM (134.67929). The same holds for RMSE, where the lower value in LSTM (104.92248) indicates a smaller prediction error than SVM (168.23663). In addition, the higher coefficient of determination (R-squared) in LSTM (0.35128) suggests that this model is better able to explain variations in test data than SVM (0.28888) (see Table 5). Despite the longer training time of LSTM, these results illustrate that the model complexity and time invested in the LSTM Neural Network can be offset by superior prediction performance compared to SVM in this data-sharing scenario (see Figure 2).

Table 6. Performance evaluation results of the methods in scenario-3 (data source: compiled by authors)

With 75% of the data used for training, the LSTM Neural Network requires a longer training time (11.52 seconds) compared to SVM (0.30 seconds). However, the LSTM Neural Network yields predictions with superior performance, with an MAE of approximately 49.52 and an RMSE of about 63.08, whereas SVM shows an MAE of around 138.33 and an RMSE of approximately 161.58. Even so, the R-squared values of both methods remain low, indicating that the models explain only part of the variation in the data.
Secondly, in the scenario using 55% of the data for training, the LSTM Neural Network maintains commendable performance, with a training time of around 5.74 seconds, an MAE of about 84.72, and an RMSE of 104.92. Despite a faster training time for SVM (0.60 seconds), its prediction accuracy is inferior, with an MAE around 134.68 and an RMSE around 168.24.
Lastly, with 45% of the data used for training, the LSTM Neural Network demonstrates a training time of around 7.49 seconds, but weaker prediction performance than in the other two scenarios, with an MAE around 204.64 and an RMSE around 225.03. SVM, on the other hand, requires a very short training time (0.43 seconds), but its prediction performance remains suboptimal, with an MAE of approximately 132.50 and an RMSE of around 162.81.