Time Series Prediction with Artificial Neural Networks: An Analysis Using Brazilian Soybean Production

Producing enough food to meet human demand has long been a challenge for society. Soybean is now one of the main food sources. Among agricultural food crops, soybean ranks sixth by production volume and fourth by both production area and economic value. The grain can be consumed directly by humans, but it is mostly used as a source of protein for animal production, which corresponds to 75% of the total, or as oil and derived food products. Brazil and the US are the most important players, responsible for more than 70% of world production. Reliable forecasting is therefore essential for decision-makers to plan adequate policies for this important commodity and to establish the necessary logistical resources. This study aims to predict soybean harvest area, yield, and production using Artificial Neural Networks (ANN) and to compare the results with classical methods of Time Series Analysis. To this end, we collected a time series (1961–2016) of soybean production in Brazil. The results reveal that ANN is the best approach to predict soybean harvest area and production, while the classical linear function remains more effective for predicting soybean yield. Moreover, ANN proves to be a reliable model for time series prediction and can help stakeholders anticipate the world soybean supply.


Introduction
The world's population is projected to reach 9.8 billion in 2050 [1], and food production needs to increase by 60% to meet the demand [2,3]. One reason is that developing countries, which have been growing much more rapidly than industrial countries, are shaping world food demand, mainly for animal-based products, fruits, and vegetables [4]. However, six delays computed using a Nonlinear Autoregressive Network with External Input (NARX) with Levenberg-Marquardt backpropagation for training the network.
The results show that the ANN model is the most efficient method to predict soybean harvest area and production. The novelty of this paper is that it obtains a reliable prediction of soybean production measures using an ANN model on a short time series (50 years) [25]. The period 1961-1966 was used only to initialize the delays of the ANN model.
This paper is divided into sections: Section 1 presents this introduction and literature review, Section 2 shows the methodological procedures, Section 3 deals with results and discussion, and Section 4 presents the conclusions of the study.

Artificial Neural Networks
Artificial Neural Networks, as the name suggests, use artificial neurons connected in layers to simulate human synapses (Figure 1). A mathematical model mimics the neural structure to learn and to acquire knowledge from experience (Equations (1) and (2)). This technology is effective for solving dynamic and nonlinear problems such as pattern recognition and prediction [25][26][27][28][29][30].
where x_1, ..., x_n are the input values (data set), w_1, ..., w_n are the weights, and b is the activation threshold (bias) in the neuron potential n [25,26,31]. Among several types of neuron activation functions, the most common are hyperbolic tangent (Equation (3)), hidden layer, and linear. The last one always assumes values identical to the activation potential n [25,26,31]. For the hyperbolic tangent, β is the constant associated with the slope of the function, and the output values lie between −1 and 1.
ANN uses previous data to train the network and minimizes the error between the observed values and the estimates. This process adjusts the weights and, where present, the bias of each neuron connection. The training usually stops once the optimal learning rate is found [25][26][27][28][29][30].
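The neuron model described above can be illustrated with a minimal sketch. It assumes the common textbook form in which the potential is the weighted sum of the inputs plus the bias, and a hyperbolic tangent of the form tanh(βn); the sign convention for the bias, the exact form of the tangent, and all numeric values are assumptions made for illustration only.

```python
import math


def neuron_potential(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (assumed sign convention)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def tanh_activation(n, beta=1.0):
    """Hyperbolic tangent with slope constant beta; output lies in (-1, 1)."""
    return math.tanh(beta * n)


def linear_activation(n):
    """Linear activation: returns the potential unchanged."""
    return n


# Illustrative numbers only, not data from the study.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

n = neuron_potential(x, w, b)
hidden_out = tanh_activation(n, beta=1.0)   # bounded output
linear_out = linear_activation(n)           # identical to the potential
```

A larger β makes the tangent steeper around zero, while the linear activation leaves the potential untouched, which is why it is typically used in the output layer of regression networks.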
There are various ANN techniques, such as General Regression Neural Network (GRNN), Backpropagation Neural Network (BNN), Radial Basis Function Neural Network (RBFNN), and Adaptive Neuro-Fuzzy Inference System (ANFIS) [32]. Backpropagation (BP) is a learning algorithm widely used in forecasting problems with ANN [30]. The weights between the different layers may be updated using the BP algorithm, with momentum and learning rate, where the error is propagated backward from the output layer to the input layer [33].
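The role of the learning rate and momentum in the weight update can be sketched on the smallest possible case, a single linear neuron trained by gradient descent on the mean squared error. This is a simplified illustration, not the Levenberg-Marquardt procedure used later in the study; the function name, hyperparameter values, and data are hypothetical.

```python
def train_linear_neuron(xs, ys, learning_rate=0.05, momentum=0.9, epochs=500):
    """Fit y = w*x + b by gradient descent with a momentum term."""
    w, b = 0.0, 0.0
    step_w, step_b = 0.0, 0.0   # previous updates, reused by the momentum term
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: output error for every sample.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Backward pass: gradient of the mean squared error.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # New step = momentum * previous step - learning_rate * gradient.
        step_w = momentum * step_w - learning_rate * grad_w
        step_b = momentum * step_b - learning_rate * grad_b
        w += step_w
        b += step_b
    return w, b


# Synthetic data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w_fit, b_fit = train_linear_neuron(xs, ys)
```

In a multilayer network the same update is applied to every layer, with the gradients obtained by propagating the output error backward through the layers.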
Several studies have used ANN to study the agricultural environment. Garg et al. [30] compare the performance of different training methods using an ANN model to forecast wheat production in India. The data contain 95 years of wheat production (1919-2013), and the results revealed that the most effective training algorithms are Bayesian regularization and Levenberg-Marquardt.
Almomani [34] adopted artificial neural networks to predict biofuel production from agricultural wastes and cow manure with high accuracy. The training and testing of the ANN used to predict the cumulative methane production were assessed using the root mean square error. The study confirms the capacity of the ANN model to predict the behavior of biofuel production and to identify the optimum conditions in a short time.
Sankhadeep et al. [35] use an ANN model for soil moisture quantity prediction for sustainable agricultural applications. They study soil moisture prediction in terms of soil temperature, air temperature, and relative humidity. The nonlinear relation between soil moisture and the features is realized using a hybrid modified flower pollination algorithm supported by the ANN model. They conclude that for sustainable agricultural application the model is highly suitable.
Khan et al. [8] use deep neural networks for fruit production prediction. They considered different types of fruit production, such as apples, bananas, citrus, pears, and grapes, with data from the National Bureau of Statistics of Pakistan. They adopted Levenberg-Marquardt optimization, backpropagation, and Bayesian regularization backpropagation. The results reveal that the government of Pakistan needs to further increase fruit production and create better policies for farmers to improve their production.
Wang and Xiao [36] studied recycle agriculture in West China to predict its comprehensive development status, applying a backpropagation neural network model implemented in MATLAB. They conclude that China needs to take measures to reduce resource input and improve resource reuse efficiency, protect forest resources, and reinforce the control of water loss and soil erosion.
Liu et al. [37] create an artificial neural network model for crop yield response to soil parameters. The model was established by training a backpropagation neural network with 58 samples and tested with another 14 samples. They conclude that the model can precisely describe crop yield response to soil parameters.
Fegade and Pawar [38] describe that, in India, farmers have difficulty selecting the proper crop for farming due to factors such as rainfall, temperature, humidity, and soil. Therefore, they used a support vector machine and artificial neural networks to predict the crop with 86.80% accuracy.
Regarding grains, Maimaitijiang et al. [10] evaluate the power of an unmanned aerial vehicle (UAV) to estimate soybean grain yield within the framework of deep neural networks (DNN). Thermal images were collected using a low-cost multi-sensory UAV. The results suggest that multimodal data fusion improves the yield prediction accuracy and is more adaptable to spatial variations; DNN-based models improved yield prediction accuracy and were less prone to saturation effects.
Zhang et al. [39] establish a model for forecasting soybean price in China using quantile regression models to describe the distribution of the soybean price range, and using regression-radial basis function neural networks to approximate the nonlinear component of the soybean price. They collected the monthly domestic soybean price in China, and the results of the model indicate that the proposed model is effective.
García-Martínez et al. [9] analyze different multispectral and red-green-blue vegetation indices, canopy cover, and plant density in order to estimate corn grain yield using a neural network model. The neural network model provided a high correlation coefficient between the estimated and the observed corn grain yield with acceptable errors in the yield estimation.
Abraham et al. [40] design, train, and simulate an ANN to forecast the demand for soybean production from Mato Grosso state, Brazil, that is exported through the port of Santos. A nonlinear autoregressive solution was adopted, using 80% of the data for training, 5% for validation, and 15% for testing the network. The forecast for 2017 was 9.0 million tons, an increase of about 26.5% compared with 2016.
Finally, Abraham et al. [41] also analyze the relationship between soybean supply (production) and soybean demand (export) using artificial intelligence in a hybrid neuro-fuzzy model. Data from 20 years of soybean production and exportation were used, and the results indicate that the supply tends to be low when the demands of the ports are overloaded.
Specifically, in the present article, we raise two questions regarding ANN in soybean production:
• Can soybean harvest area, yield, and production be predicted efficiently using Artificial Neural Networks?
• If so, are Artificial Neural Networks more effective than classical methods of Time Series Analysis to predict soybean production measures?
To answer these two questions, we develop an ANN model using NARX with the Levenberg-Marquardt algorithm for backpropagation and data of Brazilian soybean production.

Time Series and Classical Methods
Time series analysis studies the past behavior of historical series using different methods (Table 1). It verifies trends, seasonality, and randomness in a dataset. A series can behave in two ways: stationary, when observations oscillate around a constant horizontal level, and non-stationary, when they oscillate around changing values [42,43]. The most appropriate model for a specific dataset is selected using the coefficient of determination (R), the mean absolute error (MAE), and the mean squared error (MSE) [42,43].
Table 1. Classical methods, equations, and characteristics.

Method | Formula | Features
Linear function | y = ax + b | A linear function is a first-degree curve, that is, a simple straight line, where y is the trend, x represents the period of time, a is the slope, and b is the intercept. The intercept determines how far from the x-axis the trend begins. The slope determines the direction and the steepness.
Exponential function | y = ae^(bx) | An exponential function is a transcendental curve, where e is the base of natural logarithms, with a constant value of approximately 2.7183. The curve grows exponentially but never reaches the attracting value.
Logarithmic function | y = a ln(x) + b | The inverse of the exponential function is a logarithmic function.
Polynomial function | y = ax^2 + bx + c (second degree) | The second-degree polynomial curve is a parabola. The polynomial model can go up to the sixth degree. A higher degree gives a closer fit to the original data; however, this does not mean that it is best for forecasting. The best method is the one that performs well with the fewest parameters.
Power function | y = ax^b | The graph of a power curve is a hyperbola.
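How such trend functions can be fitted to a series may be sketched with ordinary least squares, using log transforms to linearize the exponential, logarithmic, and power forms. This is one common approach and is only assumed here; the study itself generated its trend lines in a spreadsheet, the function names are hypothetical, and the series below is synthetic. The polynomial fit is omitted for brevity.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def fit_exponential(xs, ys):
    """y = a*exp(b*x), fitted as ln(y) = ln(a) + b*x (requires y > 0)."""
    b, ln_a = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b


def fit_logarithmic(xs, ys):
    """y = a*ln(x) + b (requires x > 0); returns (a, b)."""
    return fit_line([math.log(x) for x in xs], ys)


def fit_power(xs, ys):
    """y = a*x**b, fitted as ln(y) = ln(a) + b*ln(x) (requires x, y > 0)."""
    b, ln_a = fit_line([math.log(x) for x in xs],
                       [math.log(y) for y in ys])
    return math.exp(ln_a), b


# Synthetic series that follows a power law exactly: y = 3 * t^1.5.
timesteps = list(range(1, 11))
series = [3.0 * t ** 1.5 for t in timesteps]

a_pow, b_pow = fit_power(timesteps, series)
a_lin, b_lin = fit_line(timesteps, series)
```

Using timesteps that start at 1 rather than calendar years keeps the logarithms well defined and the coefficients in a readable range.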
The coefficient of determination (Equation (4)) measures the goodness of fit of the regression, which aims to explain the relationship between the variables. The closer this number is to one, the better the model fits; a value higher than 0.7 is considered satisfactory [25,42,43]. The coefficient of determination is calculated as the ratio between the explained and the total variance:

$$R = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (4)$$

where y represents the real value of the series, ŷ is the expected value (the value of the regression line approximating the actual value), and ȳ is the average value of the series.
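Note that the explained variance is based on the difference between the expected value and the mean, and the total variance on the difference between the original value and the mean [25,42,43]. The MAE and MSE are calculated according to Equations (5) and (6):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \qquad (5)$$

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \qquad (6)$$

where n is the number of elements in the series.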
Finally, functions with error values close to 0 are the most effective in predicting future values. These time series applications are described in the Results and Discussion section.
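The three measures can be computed in a few lines. The sketch below follows the definitions given in the text (explained over total variance for R, mean absolute and mean squared differences for MAE and MSE); the function names and the sample values are illustrative assumptions, not results from the study.

```python
def mean(values):
    return sum(values) / len(values)


def coefficient_of_determination(actual, predicted):
    """Explained variation divided by total variation."""
    y_bar = mean(actual)
    explained = sum((p - y_bar) ** 2 for p in predicted)
    total = sum((a - y_bar) ** 2 for a in actual)
    return explained / total


def mean_absolute_error(actual, predicted):
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


def mean_squared_error(actual, predicted):
    return mean([(a - p) ** 2 for a, p in zip(actual, predicted)])


# Illustrative values only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]

r = coefficient_of_determination(actual, predicted)
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
```

The explained-over-total form coincides with the alternative definition, one minus the ratio of residual to total variation, only when the predictions come from a least-squares fit with an intercept; for other models the two can differ.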

Dataset
To perform this study, we collected data from the Food and Agriculture Organization of the United Nations (FAO) [44] regarding harvest area (million hectares), yield (tons per hectare), and production (million tons) between 1961 and 2016. The dataset was imported from an MS Excel 2016 spreadsheet into Matlab R2017b arrays. However, the period 1961-1966 was used only for delay configuration, and it was not plotted on the time series [29,45].
Firstly, we conducted Time Series Analysis. The historical series was extracted and processed in MS Excel 2016 spreadsheet format, generating graphs with trend lines. Tables 2-4 present the formulas.
Secondly, we used the neural network toolbox of the Matlab R2017b software to create, train, and validate the ANN model, which we tested with 10 neurons and six delays (Figure 2). We adopted the Nonlinear Autoregressive Network with External Input (NARX) type because it has proven to be the most effective and accurate solution for multivariable data series [27,46,47,48]. The NARX network applies historical input data with time delay operators [9]. We used 70% of the data for training, 15% for validation, and 15% for testing. We defined these percentages based on k-fold cross-validation, which makes efficient use of the learning abilities of the ANN model [49], and the data are distributed randomly by NARX [46]. Moreover, we adopted the Levenberg-Marquardt algorithm for backpropagation because it is the fastest supervised training algorithm and is widely used for time series prediction with ANN models [25,30,46].
For harvested area (target), we used yield and production as input variables; for yield (target), we used harvested area and production as input variables; for production (target), harvested area and yield were used as input variables.
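The effect of the six delays and of the 70/15/15 division can be sketched as follows. The sketch assumes that each sample is built from the previous six values of the target and of both input series, and that the division is a simple random shuffle; the helper names, the seed, and the placeholder series are hypothetical and do not reproduce the toolbox internals or the FAO data.

```python
import random


def make_lagged_samples(target, exogenous, delays):
    """Pair each target[t] with the previous `delays` values of all series."""
    samples = []
    for t in range(delays, len(target)):
        features = list(target[t - delays:t])
        for series in exogenous:
            features.extend(series[t - delays:t])
        samples.append((features, target[t]))
    return samples


def random_split(samples, train_pct=70, val_pct=15, seed=0):
    """Shuffle and split into training, validation, and test subsets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Placeholder series with 56 annual values (1961-2016), not FAO data.
YEARS = 56
area = [0.5 + 0.6 * i for i in range(YEARS)]
yield_per_ha = [1.0 + 0.04 * i for i in range(YEARS)]
production = [a * y for a, y in zip(area, yield_per_ha)]

# Production as target, harvested area and yield as input variables.
samples = make_lagged_samples(production, [area, yield_per_ha], delays=6)
train, val, test = random_split(samples)
```

With 56 annual observations and six delays, the first six years only fill the delay window, which leaves 50 usable timesteps, in line with the 1967-2016 span plotted in the results.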
After that, Matlab R2017b provided algorithms for closed-loop form simulation (named multistep prediction). This type of simulation is important to verify the ability of the networks to make predictions (calculation of errors) [25]. Figure 3 shows the overall flowchart of the ANN model.
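Closed-loop (multistep) prediction can be sketched as a loop in which each forecast is fed back as an input for the next step. The predictor below is a hypothetical stand-in for a trained network, and the sketch ignores the external input series that a full NARX model would also require at every step.

```python
def closed_loop_forecast(history, predict_one_step, delays, steps):
    """Forecast `steps` values, feeding each prediction back as an input."""
    window = list(history[-delays:])
    forecasts = []
    for _ in range(steps):
        next_value = predict_one_step(window)
        forecasts.append(next_value)
        # Drop the oldest value and append the prediction (feedback loop).
        window = window[1:] + [next_value]
    return forecasts


def mean_growth_predictor(window):
    """Hypothetical one-step model: last value plus the mean past increment."""
    increments = [later - earlier for earlier, later in zip(window, window[1:])]
    return window[-1] + sum(increments) / len(increments)


history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
forecast = closed_loop_forecast(history, mean_growth_predictor,
                                delays=6, steps=3)
```

Because errors are fed back together with the predictions, closed-loop error tends to grow with the horizon, which is why this form of simulation is a stricter check than one-step-ahead prediction.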

Model Classification
The differences between the original and predicted values were computed using MAE and MSE for all models, including the ANN. We compared classical models and neural networks by sorting the errors of each model from lowest to highest, whereas the regression measures were sorted from highest to lowest. Because two measures of error were used, a weighted average was applied (Equation (7)), where MAE is the mean absolute error, MSE is the mean squared error, and R is the coefficient of determination.
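One way such a classification could be implemented is to rank the models on each measure and average the ranks. The sketch below is only an assumption about the general idea: the weights, the exact form of Equation (7), the model names, and the numbers are all hypothetical.

```python
def rank_positions(scores, reverse=False):
    """Map each model name to its 1-based position after sorting by score."""
    ordered = sorted(scores, key=scores.get, reverse=reverse)
    return {name: position for position, name in enumerate(ordered, start=1)}


def combined_ranking(mae, mse, r, weights=(1.0, 1.0, 1.0)):
    """Order models by the weighted average of their three rank positions."""
    mae_rank = rank_positions(mae)             # lower error ranks first
    mse_rank = rank_positions(mse)             # lower error ranks first
    r_rank = rank_positions(r, reverse=True)   # higher R ranks first
    w_mae, w_mse, w_r = weights                # hypothetical weights
    total = w_mae + w_mse + w_r
    score = {
        name: (w_mae * mae_rank[name]
               + w_mse * mse_rank[name]
               + w_r * r_rank[name]) / total
        for name in mae
    }
    return sorted(score, key=score.get)


# Made-up values for three hypothetical models, not results from the study.
mae = {"model_a": 0.5, "model_b": 1.2, "model_c": 0.9}
mse = {"model_a": 0.4, "model_b": 2.0, "model_c": 1.1}
r = {"model_a": 0.98, "model_b": 0.80, "model_c": 0.91}

ranking = combined_ranking(mae, mse, r)
```

Ranking on positions rather than raw values keeps measures with different units (hectares, tons, squared tons) from dominating the combined score.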

Harvested Area
The first application of classical methods for prediction uses the time series for harvest area (target). Figure 4 illustrates the 1967-2016 timespan in million hectares.
The harvested area rose continuously, mainly after the harvest of 1997 (Timestep 31), and reached around 33 million hectares in 2016 (Timestep 50). In 1967 (Timestep 1), the planted soybean area was around 0.6 million hectares, or 2% of the area planted in 2016, which represents a more than 50-fold increase. By comparison, the US, Brazil's main competitor, already had 54% of its current planted area in 1967 [22].
Regarding the fit of functions, polynomial and power were more effective in predicting harvest area considering R, MAE, and MSE (Table 5).

Yield
The second application of classical methods of prediction examines soybean yield (target). The yield increase in soybean production was affected by changes in production processes and the use of new technologies. Moreover, genetic improvements of soybean created grain varieties better adapted to the climate, which affected productivity [12]. According to Pereira [50], the last 30 years have demonstrated innovative solutions in food production, such as new crop varieties and new irrigation techniques. However, Dani [12] argues that the evolution has generally been technological, without improving governance processes, causing logistics issues throughout the supply chain. Logistics has a huge impact on soybean production and directly affects the trade [51].
Regarding the fit of the functions, linear and polynomial predict soybean yield more precisely, considering R, MAE, and MSE (Table 6).

Production
Finally, we use classical methods to predict production. Figure 6 shows the time series analysis for soybean production from 1967 to 2016 in millions of tons. Brazilian soybean production rose more than 130-fold, moving from 700 thousand tons in 1967 (Timestep 1) to 96.3 million tons in 2016 (Timestep 50). In the same period, US soybean production moved from 26.5 million tons to 116.9 million tons [22]. In 2019, Brazilian soybean production was 25% higher [21] than in 2016.
Furthermore, we identify that in 1970 (Timestep 4) the production was 1.5 million tons, but in 1977 (Timestep 11) it reached 12.5 million tons. This great expansion of soybean cultivation occurred due to the expansion of international demand and of the national soybean oil industry [52].
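The growth figures quoted above can be checked with simple arithmetic. The sketch below uses only the production values stated in the text; the helper names are hypothetical, and the compound annual rate is an added illustration rather than a figure from the study.

```python
def fold_increase(start, end):
    """How many times larger the final value is than the initial one."""
    return end / start


def annual_growth_rate(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Production in million tons, as quoted in the text.
brazil_fold = fold_increase(0.7, 96.3)      # 1967 -> 2016
us_fold = fold_increase(26.5, 116.9)        # 1967 -> 2016
growth_1970_1977 = annual_growth_rate(1.5, 12.5, 1977 - 1970)
```

The Brazilian ratio comes out at roughly 138, against a little over 4 for the US, and the 1970-1977 expansion corresponds to compound growth of about 35% per year.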
Regarding soybean production, polynomial and power were more effective considering R, MAE, and MSE (Table 7). R is the coefficient of determination (value between 0 and 1), MAE is the mean absolute error (value in millions of tons), and MSE is the mean squared error (value in millions of tons).

Training, Validation, and Testing of Neural Network
Harvested area training reached an optimal value for the regression and correlation among variables after nine iterations (Figure 7). The training procedure stops when the performance on the test data does not improve following a fixed number of training iterations [39]. The main purpose of the training phase is to find the optimal set of weights for the ANN model, for which the error is minimized [35]. The training, validation, and testing indicate that the network learned from the data (R > 0.99). Moreover, the fit was well aligned, which means the model has a good capacity for generalization and prediction.
Yield training reached an optimal value for the regression and correlation among variables after 12 iterations, with an R higher than 0.9 (Figure 8). Yield presents correlation and regression results similar to those of the other two networks. However, the fit shows only a reasonable alignment, indicating the capacity of the network to generalize and predict.
Finally, the production network was trained and, after nine iterations, reached an optimal value for the regression and correlation among variables (Figure 9). The network shows an excellent rate of learning, with reasonable values of alignment. However, the validation and test sets show deviations in the fit. The overall results present proper alignment, confirming the network's ability to generalize and make predictions.
Among the three networks, the harvested area network showed the best results, followed by production and yield.
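The stopping rule described above can be sketched as an early-stopping check: training halts once the error on held-out data has failed to improve for a fixed number of consecutive iterations. The function name, the `patience` parameter, and the error sequence are illustrative assumptions, not the toolbox's settings.

```python
def early_stopping_epoch(heldout_errors, patience):
    """Return (stop_epoch, best_epoch) for a sequence of held-out errors."""
    best_error = float("inf")
    best_epoch = 0
    failed_checks = 0
    for epoch, error in enumerate(heldout_errors, start=1):
        if error < best_error:
            # Improvement: remember this epoch and reset the counter.
            best_error = error
            best_epoch = epoch
            failed_checks = 0
        else:
            failed_checks += 1
            if failed_checks >= patience:
                return epoch, best_epoch
    return len(heldout_errors), best_epoch


# Made-up error curve: improves for three epochs, then degrades.
errors = [0.9, 0.5, 0.3, 0.31, 0.35, 0.4, 0.42]
stop_epoch, best_epoch = early_stopping_epoch(errors, patience=3)
```

The weights kept are those from the best epoch, not from the epoch at which training stopped, so the extra iterations only serve to confirm that no further improvement is coming.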

Time Series Results with an Artificial Neural Network
Figures 10-12 depict the results of the time series generated by the neural networks in closed-loop form (multistep prediction). The blue line (target) represents the original data, and the red line (prediction) represents the obtained values for each period.
The neural network predictions fit the original data better than the time series analysis; in other words, they follow the series more closely. The trend lines of the classical models do not track the randomness of the series, which increases the error between the original and predicted values. The lower graph in Figure 10 presents the prediction error in millions of hectares, in Figure 11 in tons per hectare, and in Figure 12 in millions of tons.
In the classical models, each element depends only on its predecessors in the same series. In contrast, ANN is a generalization of the classical models, in which an element to be predicted also depends on the previous elements of other related time series [27].

Comparison between Artificial Neural Networks and Time Series Classical Models
Considering R, MAE, and MSE, Tables 8-10 present the ranking of the ANN model versus the classical functions for forecasting harvested area, yield, and production, respectively.
Considering the results, Artificial Neural Networks ranked first for predicting harvested area and production and third for yield. The polynomial model ranked second in all three series, showing the reliability of the model to estimate future values. The logarithmic model is the least fitted and should be discarded for these series.
Based on these results, it is possible to infer that the predictive capabilities of the developed ANN model are efficient for soybean prediction with short time series. This fact confirms the superior performance of the ANN model against classical methods. Similar results were obtained by Nedic et al. [53], who compared an ANN model with classical statistical models to predict traffic noise.
ANN has been recognized as a valuable predictive tool due to its ability to learn, adapt, and generalize from a sample of noisy data, and it is more effective and flexible than conventional statistics for dealing with nonlinearity [54]. Moreover, there is a tendency towards the adoption of artificial intelligence in decision models.

Conclusions
This study compares classical methods of time series prediction with Artificial Neural Networks using Brazilian soybean harvest area, yield, and production from 1961 to 2016. The results indicate that ANN is the best approach to predict soybean harvest area and production, while the classical linear function remains more effective for predicting soybean yield. Nevertheless, ANN is a reliable model for time series prediction and can help farmers, government, and trading companies anticipate the world soybean supply in order to organize logistics resources and public policies efficiently.
Our results confirm the important role of neural networks in dealing with agriculture issues, as shown in previous studies in the literature [8,10,35,39]. The R value above 0.9 confirms the high performance of the model. Nevertheless, regarding the agriculture concerns about low availability of planting areas, yield, and production [4][5][6], our results demonstrate that, at least in the case of soybean, this is not a concern.
Furthermore, we can conclude that the ANN model can be effective even with a short time series, which in our case was 50 years. This fact reveals the robustness of the model. However, despite the advantages of the ANN model, classical methods can also produce very good models. A comparison with other agricultural commodities can be made to confirm or refute the behavior observed in the soybean case.
Finally, we also suggest for further studies to combine neural networks in hybrid systems using, for example, ANN and Fuzzy Logic, similar to that proposed by [41]. Literature has shown that hybrid systems are more efficient. The goal is to achieve a synergy between hybrid systems to compensate for the disadvantage of one by the advantage of another [55][56][57].