Forecasting with artificial neural networks: The state of the art

https://doi.org/10.1016/S0169-2070(97)00044-7

Abstract

Interest in using artificial neural networks (ANNs) for forecasting has led to a tremendous surge in research activities in the past decade. While ANNs provide a great deal of promise, they also embody much uncertainty. Researchers to date are still not certain about the effect of key factors on the forecasting performance of ANNs. This paper presents a state-of-the-art survey of ANN applications in forecasting. Our purpose is to provide (1) a synthesis of published research in this area, (2) insights on ANN modeling issues, and (3) future research directions.

Introduction

Recent research activities in artificial neural networks (ANNs) have shown that ANNs have powerful pattern classification and pattern recognition capabilities. Inspired by biological systems, particularly by research into the human brain, ANNs are able to learn from and generalize from experience. Currently, ANNs are being used for a wide variety of tasks in many different fields of business, industry and science (Widrow et al., 1994).

One major application area of ANNs is forecasting (Sharda, 1994). ANNs provide an attractive alternative tool for both forecasting researchers and practitioners. Several distinguishing features of ANNs make them valuable and attractive for a forecasting task. First, as opposed to the traditional model-based methods, ANNs are data-driven self-adaptive methods in that there are few a priori assumptions about the models for problems under study. They learn from examples and capture subtle functional relationships among the data even if the underlying relationships are unknown or hard to describe. Thus ANNs are well suited for problems whose solutions require knowledge that is difficult to specify but for which there are enough data or observations. In this sense they can be treated as one of the multivariate nonlinear nonparametric statistical methods (White, 1989, Ripley, 1993, Cheng and Titterington, 1994). This modeling approach with the ability to learn from experience is very useful for many practical problems since it is often easier to have data than to have good theoretical guesses about the underlying laws governing the systems from which data are generated. The problem with the data-driven modeling approach is that the underlying rules are not always evident and observations are often masked by noise. It nevertheless provides a practical and, in some situations, the only feasible way to solve real-world problems.

Second, ANNs can generalize. After learning the data presented to them (a sample), ANNs can often correctly infer the unseen part of a population even if the sample data contain noisy information. As forecasting is performed via prediction of future behavior (the unseen part) from examples of past behavior, it is an ideal application area for neural networks, at least in principle.

Third, ANNs are universal functional approximators. It has been shown that a network can approximate any continuous function to any desired accuracy (Irie and Miyake, 1988, Hornik et al., 1989, Cybenko, 1989, Funahashi, 1989, Hornik, 1991, Hornik, 1993). ANNs have more general and flexible functional forms than the traditional statistical methods can effectively deal with. Any forecasting model assumes that there exists an underlying (known or unknown) relationship between the inputs (the past values of the time series and/or other relevant variables) and the outputs (the future values). Frequently, traditional statistical forecasting models have limitations in estimating this underlying function due to the complexity of the real system. ANNs can be a good alternative method to identify this function.
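
The input-output view described above (past values of the series as inputs, future values as outputs) can be illustrated with a minimal sketch of how a univariate series might be turned into training examples. The function name `make_lagged_pairs` and the parameters `n_lags` and `horizon` are hypothetical choices for this illustration, not something prescribed by the paper.

```python
from typing import List, Sequence, Tuple


def make_lagged_pairs(
    series: Sequence[float], n_lags: int, horizon: int = 1
) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (inputs, targets) for supervised learning.

    Each input row holds `n_lags` consecutive past values; the matching
    target is the value `horizon` steps after the last value in that row.
    Both parameter names are assumptions made for this sketch.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive integers")
    inputs: List[List[float]] = []
    targets: List[float] = []
    # The last usable window must leave room for a target `horizon` steps ahead.
    for start in range(len(series) - n_lags - horizon + 1):
        window_end = start + n_lags
        inputs.append(list(series[start:window_end]))
        targets.append(series[window_end + horizon - 1])
    return inputs, targets


# Illustrative data only: six observations, three lagged inputs per example.
demo_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_pairs(demo_series, n_lags=3)
```

Any function approximator, an ANN included, could then be fitted to the pairs in `X` and `y`; how many lags to use is itself one of the modeling decisions the survey goes on to discuss.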

Finally, ANNs are nonlinear. Forecasting has long been the domain of linear statistics. The traditional approaches to time series prediction, such as the Box-Jenkins or ARIMA method (Box and Jenkins, 1976, Pankratz, 1983), assume that the time series under study are generated from linear processes. Linear models have advantages in that they can be understood and analyzed in great detail, and they are easy to explain and implement. However, they may be totally inappropriate if the underlying mechanism is nonlinear. It is unreasonable to assume a priori that a particular realization of a given time series is generated by a linear process. In fact, real world systems are often nonlinear (Granger and Terasvirta, 1993). During the last decade, several nonlinear time series models such as the bilinear model (Granger and Anderson, 1978), the threshold autoregressive (TAR) model (Tong and Lim, 1980), and the autoregressive conditional heteroscedastic (ARCH) model (Engle, 1982) have been developed. (See De Gooijer and Kumar (1992) for a review of this field.) However, these nonlinear models are still limited in that an explicit relationship for the data series at hand has to be hypothesized with little knowledge of the underlying law. In fact, formulating a nonlinear model for a particular data set is a very difficult task since there are too many possible nonlinear patterns and a prespecified nonlinear model may not be general enough to capture all the important features. Artificial neural networks, which are nonlinear data-driven approaches as opposed to the above model-based nonlinear methods, are capable of performing nonlinear modeling without a priori knowledge about the relationships between input and output variables. Thus they are a more general and flexible modeling tool for forecasting.
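
The contrast between a linear model and a data-driven nonlinear one can be written compactly. The notation below is an illustrative assumption rather than the paper's own: $y_t$ is the series, $p$ the number of lagged inputs, $q$ the number of hidden nodes, $g$ a nonlinear activation function such as the logistic function, and $\varepsilon_t$ a noise term.

```latex
\begin{align}
\text{linear AR}(p):\quad
  y_t &= c + \sum_{i=1}^{p} \phi_i\, y_{t-i} + \varepsilon_t \\
\text{general nonlinear form}:\quad
  y_t &= f(y_{t-1}, \dots, y_{t-p}) + \varepsilon_t \\
\text{one-hidden-layer network}:\quad
  f(y_{t-1}, \dots, y_{t-p}) &\approx \alpha_0 + \sum_{j=1}^{q} \alpha_j\,
  g\Big(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij}\, y_{t-i}\Big)
\end{align}
```

In the linear case the functional form is fixed in advance and only the coefficients $\phi_i$ are estimated. In the network case the weights $\alpha_j$ and $\beta_{ij}$ are estimated from data, and the shape of $f$ is not specified beforehand.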

The idea of using ANNs for forecasting is not new. The first application dates back to 1964. Hu (1964), in his thesis, applies Widrow's adaptive linear network to weather forecasting. Because no training algorithm for general multi-layer networks existed at the time, the research was quite limited. It was not until 1986, when the backpropagation algorithm was introduced (Rumelhart et al., 1986b, Werbos, 1988), that there was much development in the use of ANNs for forecasting. Werbos (1974, 1988) first formulates backpropagation and finds that ANNs trained with backpropagation outperform traditional statistical methods such as regression and Box-Jenkins approaches. Lapedes and Farber (1987) conduct a simulation study and conclude that ANNs can be used for modeling and forecasting nonlinear time series. Weigend et al. (1990, 1992) and Cottrell et al. (1995) address the issue of network structure for forecasting real-world time series. Tang et al. (1991), Sharda and Patil (1992), and Tang and Fishwick (1993), among others, report results of several forecasting comparisons between Box-Jenkins and ANN models. In a recent forecasting competition organized by Weigend and Gershenfeld (1993) through the Santa Fe Institute, the winners for each data set used ANN models (Gershenfeld and Weigend, 1993).

Research efforts on ANNs for forecasting are considerable. The literature is vast and growing. Marquez et al. (1992) and Hill et al. (1994) review the literature comparing ANNs with statistical models in time series forecasting and regression-based forecasting. However, these reviews focus on the relative performance of ANNs and include only a few papers. In this paper, we attempt to provide a more comprehensive review of the current status of research in this area. We will mainly focus on the neural network modeling issues. This review aims at serving two purposes. First, it provides a general summary of the work in ANN forecasting done to date. Second, it provides guidelines for neural network modeling and fruitful areas for future research.

The paper is organized as follows. In Section 2, we give a brief description of the general paradigms of ANNs, especially those used for forecasting. Section 3 describes a variety of the fields in which ANNs have been applied as well as the methodology used. Section 4 discusses the key modeling issues of ANNs in forecasting. The relative performance of ANNs over traditional statistical methods is reported in Section 5. Finally, conclusions and directions of future research are discussed in Section 6.

Section snippets

An overview of ANNs

In this section we give a brief presentation of artificial neural networks. We will focus on a particular structure of ANNs, multi-layer feedforward networks, which is the most popular and widely-used network paradigm in many applications including forecasting. For a general introductory account of ANNs, readers are referred to Wasserman (1989), Hertz et al. (1991), and Smith (1993). Rumelhart et al., 1986a, Rumelhart et al., 1986b, Rumelhart et al., 1994, Rumelhart et al., 1995, Lippmann, 1987,

Applications of ANNs as forecasting tools

Forecasting problems arise in so many different disciplines and the literature on forecasting using ANNs is scattered in so many diverse fields that it is hard for a researcher to be aware of all the work done to date in the area. In this section, we give an overview of research activities in forecasting with ANNs. First we will survey the areas in which ANNs find applications. Then we will discuss the research methodology used in the literature.

Issues in ANN modeling for forecasting

Despite the many satisfactory characteristics of ANNs, building a neural network forecaster for a particular forecasting problem is a nontrivial task. Modeling issues that affect the performance of an ANN must be considered carefully. One critical decision is to determine the appropriate architecture, that is, the number of layers, the number of nodes in each layer, and the number of arcs which interconnect with the nodes. Other network design decisions include the selection of activation

The relative performance of ANNs in forecasting

One should note the performance of neural networks in forecasting as compared to the currently widely-used well-established statistical methods. There are many inconsistent reports in the literature on the performance of ANNs for forecasting tasks. The main reason is, as we discussed in the previous section, that a large number of factors including network structure, training method, and sample data may affect the forecasting ability of the networks. For some cases where ANNs perform worse than

Conclusions and the future

We have presented a review of the current state of the use of artificial neural networks for forecasting applications. This review is comprehensive but by no means exhaustive, given the fast-growing nature of the literature. The important findings are summarized as follows:

  • The unique characteristics of ANNs – adaptability, nonlinearity, arbitrary function mapping ability – make them quite suitable and useful for forecasting tasks. Overall, ANNs give satisfactory performance in forecasting.

  • A

Acknowledgements

We would like to thank Dr. Pelikan, the associate editor, and three anonymous referees for their constructive and helpful comments.


References (225)

  • W.R. Foster et al., 1992. Neural network forecasting of short, noisy time series. Computers and Chemical Engineering.
  • K. Funahashi, 1989. On the approximate realization of continuous mappings by neural networks. Neural Networks.
  • W.L. Gorr et al., 1994. Comparative study of artificial neural network and statistical models for predicting student grade point averages. International Journal of Forecasting.
  • T.H. Hann et al., 1996. Much ado about nothing? Exchange rate forecasting: Neural networks vs. linear models using monthly and weekly data. Neurocomputing.
  • B.L.M. Happel et al., 1994. The design and evolution of modular neural network architectures. Neural Networks.
  • T. Hill et al., 1994. Artificial neural networks for forecasting and decision making. International Journal of Forecasting.
  • K. Hornik, 1991. Approximation capabilities of multilayer feedforward networks. Neural Networks.
  • K. Hornik, 1993. Some new results on neural network approximation. Neural Networks.
  • K. Hornik et al., 1989. Multilayer feedforward networks are universal approximators. Neural Networks.
  • M.S. Hung et al., 1993. Training neural networks with the GRG2 nonlinear optimizer. European Journal of Operational Research.
  • M. Aiken et al., 1995. A neural network for predicting total industrial production. Journal of End User Computing.
  • H. Akaike, 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control.
  • S. Amari, 1994. A comment on “Neural networks: A review from a statistical perspective”. Statistical Science.
  • C.M. Arizmendi et al., 1993. Time series prediction with neural nets: Application to airborne pollen forecasting. International Journal of Biometeorology.
  • J.S. Armstrong et al., 1995. Correspondence: On the selection of error measures for comparisons among forecasting methods. Journal of Forecasting.
  • Azoff, E.M., 1994. Neural Network Time Series Forecasting of Financial Markets. John Wiley and Sons,...
  • Bacha, H., Meyer, W., 1992. A neural network architecture for load forecasting. In: Proceedings of the IEEE...
  • A.G. Bakirtzis et al., 1995. Short term load forecasting using fuzzy neural networks. IEEE Transactions on Power Systems.
  • A. Balestrino et al., 1994. Time series analysis by neural networks: Environmental temperature forecasting. Automazione e Strumentazione.
  • D. Barker, 1990. Analyzing financial health: Integrating neural networks and expert systems. PC AI.
  • A.R. Barron, 1994. A comment on “Neural networks: A review from a statistical perspective”. Statistical Science.
  • S. Bataineh et al., 1996. An expert system for unit commitment and power demand prediction using fuzzy logic and neural networks. Expert Systems.
  • R. Battiti, 1992. First- and second-order methods for learning: Between steepest descent and Newton's method. Neural Computation.
  • E.B. Baum et al., 1989. What size net gives valid generalization? Neural Computation.
  • Bergerson, K., Wunsch, D.C., 1991. A commodity trading model based on a neural network–expert system hybrid. In:...
  • A.N. Borisov et al., 1995. Prediction of a continuous function with the aid of neural networks. Automatic Control and Computer Sciences.
  • Bowerman, B.L., O'Connell, R.T., 1993. Forecasting and Time Series: An Applied Approach, 3rd ed. Duxbury Press,...
  • Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco,...
  • Brace, M.C., Schmidt, J., Hadlin, M., 1991. Comparison of the forecasting accuracy of neural networks with other...
  • Caire, P., Hatabian, G., Muller, C., 1992. Progress in forecasting by neural networks. In: Proceedings of the...
  • Chan, D.Y.C., Prager, D., 1994. Analysis of time series by neural networks. In: Proceedings of the IEEE International...
  • W.S. Chan et al., 1986. On tests for non-linearity in time series analysis. Journal of Forecasting.
  • Chang, I., Rapiraju, S., Whiteside, M., Hwang, G., 1991. A neural network to time series forecasting. In: Proceedings...
  • Chen, C.H., 1994. Neural networks for financial market prediction. In: Proceedings of the IEEE International Conference...
  • Chen, S.T., Yu, D.C., Moghaddamjo, A.R., 1991. Weather sensitive short-term load forecasting using nonfully connected...
  • T. Chen et al., 1995. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks.
  • B. Cheng et al., 1994. Neural networks: A review from a statistical perspective. Statistical Science.
  • Chester, D.L., 1990. Why two hidden layers are better than one? In: Proceedings of the International Joint Conference...
  • K.H. Cheung et al., 1996. Maximum-entropy approach to identify time-series lag structure for developing intelligent forecasting systems. International Journal of Computational Intelligence and Organization.
  • E.S. Chng et al., 1996. Gradient radial basis function networks for nonlinear and nonstationary time series prediction. IEEE Transactions on Neural Networks.

Biographies: Guoqiang ZHANG received a B.S. in Mathematics and an M.S. in Statistics from East China Normal University, and is currently a Ph.D. candidate at Kent State University. His research interests are forecasting, neural network applications, inventory systems, and statistical quality control. In 1997, he received the Best Student Paper Award at the Midwest Decision Sciences Institute Annual Meeting.
B. Eddy PATUWO is an Associate Professor in the Administrative Sciences Department at Kent State University. He earned his Ph.D. in IEOR from Virginia Polytechnic Institute and State University. His research interests are in the study of stochastic inventory systems and neural networks. His research has been published in Decision Sciences, IIE Transactions, Journal of the Operational Research Society, and Computers and Operations Research, among others.
Michael Y. HU is a Professor of Marketing at Kent State University. He earned his Ph.D. in Management Science from the University of Minnesota in 1977. He has published extensively (about 80 research papers) in the areas of neural networks, marketing research, international business, and statistical process control. His articles have been published in numerous journals including Decision Sciences, Computers and Operations Research, OMEGA, Journal of the Academy of Marketing Science, Journal of International Business Studies, Journal of Business Research, Financial Management, and many others.
