Forecasting with artificial neural networks: The state of the art
Introduction
Recent research activities in artificial neural networks (ANNs) have shown that ANNs have powerful pattern classification and pattern recognition capabilities. Inspired by biological systems, particularly by research into the human brain, ANNs are able to learn from experience and to generalize from it. Currently, ANNs are being used for a wide variety of tasks in many different fields of business, industry and science (Widrow et al., 1994).
One major application area of ANNs is forecasting (Sharda, 1994). ANNs provide an attractive alternative tool for both forecasting researchers and practitioners. Several distinguishing features of ANNs make them valuable and attractive for a forecasting task. First, as opposed to the traditional model-based methods, ANNs are data-driven self-adaptive methods in that there are few a priori assumptions about the models for problems under study. They learn from examples and capture subtle functional relationships among the data even if the underlying relationships are unknown or hard to describe. Thus ANNs are well suited for problems whose solutions require knowledge that is difficult to specify but for which there are enough data or observations. In this sense they can be treated as one of the multivariate nonlinear nonparametric statistical methods (White, 1989; Ripley, 1993; Cheng and Titterington, 1994). This modeling approach, with its ability to learn from experience, is very useful for many practical problems since it is often easier to have data than to have good theoretical guesses about the underlying laws governing the systems from which the data are generated. The problem with the data-driven modeling approach is that the underlying rules are not always evident and the observations are often masked by noise. It nevertheless provides a practical and, in some situations, the only feasible way to solve real-world problems.
Second, ANNs can generalize. After learning the data presented to them (a sample), ANNs can often correctly infer the unseen part of a population even if the sample data contain noisy information. As forecasting is performed via prediction of future behavior (the unseen part) from examples of past behavior, it is an ideal application area for neural networks, at least in principle.
Third, ANNs are universal functional approximators. It has been shown that a network can approximate any continuous function to any desired accuracy (Irie and Miyake, 1988; Hornik et al., 1989; Cybenko, 1989; Funahashi, 1989; Hornik, 1991; Hornik, 1993). ANNs thus have more general and flexible functional forms than traditional statistical methods can effectively deal with. Any forecasting model assumes that there exists an underlying (known or unknown) relationship between the inputs (the past values of the time series and/or other relevant variables) and the outputs (the future values). Traditional statistical forecasting models frequently have limitations in estimating this underlying function due to the complexity of the real system. ANNs can be a good alternative method for identifying this function.
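The approximation results cited here concern single-hidden-layer feedforward networks; in a standard formulation (our notation, not taken verbatim from the cited papers), such a network computes

```latex
f(\mathbf{x}) \;\approx\; \alpha_0 + \sum_{j=1}^{q} \alpha_j \, g\!\left(\mathbf{w}_j^{\top}\mathbf{x} + b_j\right),
```

where $g$ is the activation function, $q$ the number of hidden nodes, and $\mathbf{w}_j$, $b_j$, $\alpha_j$ the network weights. The theorems state that for any continuous target function $f$ on a compact domain and any $\varepsilon > 0$, there exist a width $q$ and weights making the maximum approximation error smaller than $\varepsilon$, under mild conditions on $g$.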
Finally, ANNs are nonlinear. Forecasting has long been the domain of linear statistics. The traditional approaches to time series prediction, such as the Box-Jenkins or ARIMA method (Box and Jenkins, 1976; Pankratz, 1983), assume that the time series under study are generated by linear processes. Linear models have advantages in that they can be understood and analyzed in great detail, and they are easy to explain and implement. However, they may be totally inappropriate if the underlying mechanism is nonlinear, and it is unreasonable to assume a priori that a particular realization of a given time series is generated by a linear process. In fact, real-world systems are often nonlinear (Granger and Terasvirta, 1993). During the last decade, several nonlinear time series models such as the bilinear model (Granger and Anderson, 1978), the threshold autoregressive (TAR) model (Tong and Lim, 1980), and the autoregressive conditional heteroscedastic (ARCH) model (Engle, 1982) have been developed. (See De Gooijer and Kumar (1992) for a review of this field.) However, these nonlinear models are still limited in that an explicit relationship for the data series at hand has to be hypothesized with little knowledge of the underlying law. In fact, formulating a nonlinear model for a particular data set is a very difficult task since there are too many possible nonlinear patterns, and a prespecified nonlinear model may not be general enough to capture all the important features. Artificial neural networks, which are nonlinear data-driven approaches as opposed to the above model-based nonlinear methods, are capable of performing nonlinear modeling without a priori knowledge about the relationships between input and output variables. They are thus a more general and flexible modeling tool for forecasting.
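The contrast can be made concrete with a minimal sketch (our own illustration, not an experiment from the literature; data and settings are hypothetical). The deterministic logistic map is a classic nonlinear series on which a linear autoregression fails, while even a small feedforward network trained by plain batch gradient descent (backpropagation) captures the underlying map:

```python
# Minimal sketch: linear AR(1) vs. a small feedforward network on the logistic map.
import numpy as np

rng = np.random.default_rng(0)

# Deterministic logistic map: x_{t+1} = 4 x_t (1 - x_t), chaotic on [0, 1]
n = 600
x = np.empty(n)
x[0] = 0.2
for t in range(1, n):
    x[t] = 4.0 * x[t - 1] * (1.0 - x[t - 1])

# One-lag design: predict x_t from x_{t-1}
X = x[:-1].reshape(-1, 1)
y = x[1:]
X_tr, y_tr, X_te, y_te = X[:500], y[:500], X[500:], y[500:]

# Linear AR(1) fit by least squares (with intercept)
A = np.column_stack([X_tr, np.ones(len(X_tr))])
coef, *_ = np.linalg.lstsq(A, y_tr, rcond=None)
ar_mse = np.mean((np.column_stack([X_te, np.ones(len(X_te))]) @ coef - y_te) ** 2)

# One-hidden-layer tanh network trained by batch gradient descent
q = 8                                       # hidden nodes
W1 = rng.standard_normal((1, q))
b1 = np.zeros(q)
w2 = 0.5 * rng.standard_normal(q)
b2 = 0.0
lr = 0.05
for _ in range(8000):
    H = np.tanh(X_tr @ W1 + b1)             # hidden activations, shape (500, q)
    err = H @ w2 + b2 - y_tr                # residuals
    dH = np.outer(err, w2) * (1.0 - H ** 2)  # backpropagated error at hidden layer
    w2 -= lr * (H.T @ err) / len(y_tr)
    b2 -= lr * err.mean()
    W1 -= lr * (X_tr.T @ dH) / len(y_tr)
    b1 -= lr * dH.mean(axis=0)

nn_mse = np.mean((np.tanh(X_te @ W1 + b1) @ w2 + b2 - y_te) ** 2)
print(f"linear AR MSE: {ar_mse:.4f}, network MSE: {nn_mse:.4f}")
```

The linear fit is nearly useless here because the logistic map has almost no linear autocorrelation, whereas the network only has to learn a smooth one-dimensional parabola.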
The idea of using ANNs for forecasting is not new. The first application dates back to 1964, when Hu (1964), in his thesis, applied Widrow's adaptive linear network to weather forecasting. Due to the lack of a training algorithm for general multi-layer networks at the time, the research was quite limited. It was not until 1986, when the backpropagation algorithm was introduced (Rumelhart et al., 1986b; Werbos, 1988), that there was much development in the use of ANNs for forecasting. Werbos (1974, 1988) first formulates backpropagation and finds that ANNs trained with it outperform traditional statistical methods such as regression and Box-Jenkins approaches. Lapedes and Farber (1987) conduct a simulated study and conclude that ANNs can be used for modeling and forecasting nonlinear time series. Weigend et al. (1990, 1992) and Cottrell et al. (1995) address the issue of network structure for forecasting real-world time series. Tang et al. (1991), Sharda and Patil (1992), and Tang and Fishwick (1993), among others, report results of several forecasting comparisons between Box-Jenkins and ANN models. In a recent forecasting competition organized by Weigend and Gershenfeld (1993) through the Santa Fe Institute, the winners for each data set used ANN models (Gershenfeld and Weigend, 1993).
Research efforts on ANNs for forecasting are considerable, and the literature is vast and growing. Marquez et al. (1992) and Hill et al. (1994) review the literature comparing ANNs with statistical models in time series forecasting and regression-based forecasting. However, their reviews focus on the relative performance of ANNs and include only a few papers. In this paper, we attempt to provide a more comprehensive review of the current status of research in this area, focusing mainly on neural network modeling issues. The review serves two purposes. First, it provides a general summary of the work in ANN forecasting done to date. Second, it provides guidelines for neural network modeling and identifies fruitful areas for future research.
The paper is organized as follows. In Section 2, we give a brief description of the general paradigms of ANNs, especially those used for forecasting. Section 3 describes the variety of fields in which ANNs have been applied, as well as the research methodology used. Section 4 discusses the key modeling issues of ANNs in forecasting. The relative performance of ANNs compared with traditional statistical methods is reported in Section 5. Finally, conclusions and directions for future research are discussed in Section 6.
An overview of ANNs
In this section we give a brief presentation of artificial neural networks. We will focus on a particular structure of ANNs, multi-layer feedforward networks, which is the most popular and widely-used network paradigm in many applications including forecasting. For a general introductory account of ANNs, readers are referred to Wasserman (1989), Hertz et al. (1991) and Smith (1993). Rumelhart et al. (1986a, 1986b, 1994, 1995), Lippmann (1987),
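For forecasting, the multi-layer feedforward network discussed in this section is most often used as a nonlinear autoregressive model. In a common formulation (our notation), a network with p input lags, q hidden nodes and a single output computes

```latex
y_t \;=\; \alpha_0 + \sum_{j=1}^{q} \alpha_j \, g\!\Bigl(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij}\, y_{t-i}\Bigr) + \varepsilon_t,
```

where $g$ is the hidden-layer activation function (typically the logistic sigmoid), the $\alpha$'s and $\beta$'s are the connection weights estimated from the data, and $\varepsilon_t$ is the error term.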
Applications of ANNs as forecasting tools
Forecasting problems arise in so many different disciplines and the literature on forecasting using ANNs is scattered in so many diverse fields that it is hard for a researcher to be aware of all the work done to date in the area. In this section, we give an overview of research activities in forecasting with ANNs. First we will survey the areas in which ANNs find applications. Then we will discuss the research methodology used in the literature.
Issues in ANN modeling for forecasting
Despite the many satisfactory characteristics of ANNs, building a neural network forecaster for a particular forecasting problem is a nontrivial task. Modeling issues that affect the performance of an ANN must be considered carefully. One critical decision is to determine the appropriate architecture, that is, the number of layers, the number of nodes in each layer, and the number of arcs which interconnect with the nodes. Other network design decisions include the selection of activation
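One common way to handle the architecture decision is to train candidate networks of several sizes and compare them on held-out data. The sketch below is our own illustration on hypothetical synthetic data; to keep it short, the hidden weights are drawn at random and held fixed, and only the linear output layer is fitted by least squares, a simplification of full backpropagation training:

```python
# Hypothetical sketch: choosing the number of hidden nodes by validation error.
import numpy as np

rng = np.random.default_rng(1)

# Synthetic series and lagged design matrix (p = 2 input lags)
n = 400
x = np.sin(0.3 * np.arange(n)) + 0.1 * rng.standard_normal(n)
X = np.column_stack([x[1:-1], x[:-2]])      # inputs (x_{t-1}, x_{t-2})
y = x[2:]                                   # target x_t
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]

def fit_eval(q):
    """Train a 2-q-1 network (random fixed tanh hidden layer, least-squares
    output layer) and return its mean squared error on the validation set."""
    W = rng.standard_normal((2, q))
    b = rng.standard_normal(q)
    H = np.column_stack([np.tanh(X_tr @ W + b), np.ones(len(X_tr))])
    w_out, *_ = np.linalg.lstsq(H, y_tr, rcond=None)
    H_va = np.column_stack([np.tanh(X_va @ W + b), np.ones(len(X_va))])
    return np.mean((H_va @ w_out - y_va) ** 2)

scores = {q: fit_eval(q) for q in (1, 2, 4, 8, 16)}
best_q = min(scores, key=scores.get)
print("validation MSE by hidden nodes:", scores, "-> selected q =", best_q)
```

The same hold-out loop extends naturally to other design choices, such as the number of input lags, at the cost of a larger search.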
The relative performance of ANNs in forecasting
One should note the performance of neural networks in forecasting as compared to the widely-used, well-established statistical methods. There are many inconsistent reports in the literature on the performance of ANNs for forecasting tasks. The main reason is, as we discussed in the previous section, that a large number of factors, including network structure, training method, and sample data, may affect the forecasting ability of the networks. For some cases where ANNs perform worse than
Conclusions and the future
We have presented a review of the current state of the use of artificial neural networks for forecasting applications. This review is comprehensive but by no means exhaustive, given the fast-growing nature of the literature. The important findings are summarized as follows:
- The unique characteristics of ANNs – adaptability, nonlinearity, arbitrary function mapping ability – make them quite suitable and useful for forecasting tasks. Overall, ANNs give satisfactory performance in forecasting.
- A
Acknowledgements
We would like to thank Dr. Pelikan, the associate editor, and three anonymous referees for their constructive and helpful comments.
References (225)
- What size network is good for generalization of a specific task of interest? Neural Networks (1994)
- Forecasting the behavior of multivariate time series using neural networks. Neural Networks (1992)
- Neural networks: Forecasting breakthrough or passing fad? International Journal of Forecasting (1993)
- A neural network approach to mutual fund net asset value forecasting. Omega (1996)
- Neural network system for forecasting method selection. Decision Support Systems (1994)
- Some recent developments in non-linear time series modelling, testing, and forecasting. International Journal of Forecasting (1992)
- Analysis of univariate time series with connectionist nets: a case study of two classical examples. Neurocomputing (1991)
- Hierarchical training of neural networks and prediction of chaotic time series. Physics Letters (1991)
- Tool-wear prediction using artificial neural networks. Journal of Materials Processing Technology (1995)
- Forecasting with neural networks – An application using bankruptcy data. Information and Management (1993)
- Neural network forecasting of short, noisy time series. Computers and Chemical Engineering
- On the approximate realization of continuous mappings by neural networks. Neural Networks
- Comparative study of artificial neural network and statistical models for predicting student grade point averages. International Journal of Forecasting
- Much ado about nothing? Exchange rate forecasting: Neural networks vs. linear models using monthly and weekly data. Neurocomputing
- The design and evolution of modular neural network architectures. Neural Networks
- Artificial neural networks for forecasting and decision making. International Journal of Forecasting
- Approximation capabilities of multilayer feedforward networks. Neural Networks
- Some new results on neural network approximation. Neural Networks
- Multilayer feedforward networks are universal approximators. Neural Networks
- Training neural networks with the GRG2 nonlinear optimizer. European Journal of Operational Research
- A neural network for predicting total industrial production. Journal of End User Computing
- A new look at the statistical model identification. IEEE Transactions on Automatic Control
- A comment on "Neural networks: A review from a statistical perspective". Statistical Science
- Time series prediction with neural nets: Application to airborne pollen forecasting. International Journal of Biometeorology
- Correspondence: On the selection of error measures for comparisons among forecasting methods. Journal of Forecasting
- Short term load forecasting using fuzzy neural networks. IEEE Transactions on Power Systems
- Time series analysis by neural networks: Environmental temperature forecasting. Automazione e Strumentazione
- Analyzing financial health: Integrating neural networks and expert systems. PC AI
- A comment on "Neural networks: A review from a statistical perspective". Statistical Science
- An expert system for unit commitment and power demand prediction using fuzzy logic and neural networks. Expert Systems
- First- and second-order methods for learning: Between steepest descent and Newton's method. Neural Computation
- What size net gives valid generalization? Neural Computation
- Prediction of a continuous function with the aid of neural networks. Automatic Control and Computer Sciences
- On tests for non-linearity in time series analysis. Journal of Forecasting
- Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks
- Neural networks: A review from a statistical perspective. Statistical Science
- Maximum-entropy approach to identify time-series lag structure for developing intelligent forecasting systems. International Journal of Computational Intelligence and Organization
- Gradient radial basis function networks for nonlinear and nonstationary time series prediction. IEEE Transactions on Neural Networks
Biographies: Guoqiang ZHANG received a B.S. in Mathematics and an M.S. in Statistics from East China Normal University, and is currently a Ph.D. candidate at Kent State University. His research interests are forecasting, neural networks applications, inventory systems, and statistical quality control. In 1997, he received the Best Student Paper Award at the Midwest Decision Sciences Institute Annual Meeting.
B. Eddy PATUWO is an Associate Professor in the Administrative Sciences Department at Kent State University. He earned his Ph.D. in IEOR from Virginia Polytechnic Institute and State University. His research interests are in the study of stochastic inventory systems and neural networks. His research has been published in Decision Sciences, IIE Transactions, Journal of Operational Research Society, Computers and Operations Research, among others.
Michael Y. HU is a Professor of Marketing at Kent State University. He earned his Ph.D. in Management Science from the University of Minnesota in 1977. He has published extensively (about 80 research papers) in the areas of neural networks, marketing research, international business, and statistical process control. His articles have been published in numerous journals including Decision Sciences, Computers and Operations Research, OMEGA, Journal of Academic of Marketing Science, Journal of International Business Studies, Journal of Business Research, Financial Management, and many others.