Elsevier

Neurocomputing

Volume 69, Issues 16–18, October 2006, Pages 2161-2170

The effect of different basis functions on a radial basis function network for time series prediction: A comparative study

https://doi.org/10.1016/j.neucom.2005.07.010

Abstract

Many applications using radial basis function (RBF) networks for time series prediction utilise only one or two basis functions, the most popular being the Gaussian function. This function may not always be appropriate and the purpose of this paper is to demonstrate the variation of test set error between six recognised basis functions. The tests were carried out on the Mackey–Glass chaotic time series, Box–Jenkins furnace data and flood prediction data sets for the Rivers Amber and Mole, UK. Each RBF network was trained using a two-stage approach, utilising the k-means clustering algorithm for the first stage and singular value decomposition for the second stage. For this type of network configuration the results indicate that the choice of basis function (and, where appropriate, basis width parameter) is data set dependent and evaluating all recognised basis functions suitable for RBF networks is advantageous.

Introduction

Feedforward artificial neural networks (ANNs) maintain a high level of research interest due to their ability to map any function to an arbitrary degree of accuracy. This has been demonstrated theoretically for both the radial basis function (RBF) network [19] and the popular multilayer perceptron (MLP) network [10]. Due to the similarity in functional mapping ability, the RBF and MLP share the same problem domains and consequently direct comparisons have been made (for example, see [18]). Feedforward ANNs have been applied to many diverse areas such as pattern recognition, time series prediction, signal processing, control and a variety of mathematical applications.

The RBF [4], [21] was developed from an exact multivariate function interpolation [20] and has attracted a lot of interest since its conception. There are a number of significant differences between RBFs and MLPs:

  • The RBF has one hidden layer while the MLP can have several.

  • The hidden and output layer nodes of the RBF are different while the MLP nodes are usually the same throughout.

  • RBFs are locally tuned while MLPs construct a global function approximation.

There have been many reports where practitioners have applied RBFs to various time series problems. In the majority of these cases only one or two basis functions have been used (for example, [5])—the most popular of which is the Gaussian kernel. This may be appropriate when the underlying phenomena are reasonably approximated by Gaussian functions but this is not always the case. Other kernels may be more appropriate for optimal network performance. While some researchers have concluded that the basis function is not crucial to network performance (for example, [12]), this study demonstrates that the optimal choice of basis function is problem dependent.

While some authors have explored the development of algorithms to optimise the number of basis functions used within RBFs (for example, see [8]) and RBF configuration (for example, see [14]), there has been little investigation into the influence of the basis functions themselves. This paper examines the variation in network performance (i.e. error on the test set) resulting from the use of different basis functions when applied to two benchmark tests and two ‘real world’ problems. An analysis of the effect of the basis width (where applicable) has also been conducted to investigate the importance of this parameter.

There are several ways of setting the basis width parameter, the simplest of which is to set it to a fixed common value. Alternatively, an ad hoc method can be employed, such as taking the average distance from each basis function to its L nearest neighbours. There are more sophisticated algorithms which perform a gradient descent on a regularised error function and enable the dynamic adjustment of the basis centres and basis widths for each node [2]. However, the trade-off is a considerably slower training time. To enable consistent evaluations of variations to the basis width parameter to be made, a common basis width is used for each node during the comparison tests presented in this paper.
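The nearest-neighbour heuristic mentioned above can be sketched as follows; the function name and the choice of L are illustrative, not taken from the paper.

```python
import math

def nearest_neighbour_widths(centres, L):
    """Ad hoc width selection: set each node's basis width to the average
    distance from its centre to its L nearest neighbouring centres."""
    widths = []
    for i, c in enumerate(centres):
        # distances from this centre to every other centre, nearest first
        dists = sorted(math.dist(c, other)
                       for j, other in enumerate(centres) if j != i)
        widths.append(sum(dists[:L]) / L)
    return widths

# The simplest alternative described above is a single fixed common width
# shared by every node, which is the scheme used in this paper's comparisons.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
widths = nearest_neighbour_widths(centres, L=2)
```

An isolated centre (such as the last one above) receives a large width, so its basis function covers the sparse region around it; tightly clustered centres receive small widths.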

This paper is split into the following sections: Section 2 provides a brief overview of the RBF network; Section 3 details the different basis functions suitable for an RBF network; Section 4 describes the data sets; Section 5 describes how the experiments were conducted; Section 6 presents the experimental results; and Section 7 provides the conclusions and recommendations from this study.


The RBF network

The RBF consists of two layers (see Fig. 1), with architecture similar to that of a two-layer MLP. The distance between an input vector and a prototype vector determines the activation of the hidden layer with the non-linearity provided by the basis function. The nodes in the output layer usually perform an ordinary linear weighted sum of these activations, although non-linear output nodes are an option. Training an RBF can be accomplished by two different approaches depending on whether the …
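The forward pass just described can be sketched directly; this is a minimal illustration assuming a Gaussian kernel and a single linear output node, with all names invented for the example (a full implementation would also include the two-stage training of centres and weights).

```python
import math

def rbf_forward(x, centres, widths, weights, bias):
    """One forward pass of a two-layer RBF network: each hidden node's
    activation is the basis function applied to the distance between the
    input vector and that node's prototype (centre); the output node forms
    an ordinary linear weighted sum of these activations."""
    hidden = [math.exp(-math.dist(x, c) ** 2 / (2.0 * s * s))  # Gaussian basis
              for c, s in zip(centres, widths)]
    return bias + sum(w * h for w, h in zip(weights, hidden))

# An input sitting on the first prototype activates that node fully (phi = 1)
# while the distant second node contributes almost nothing.
y = rbf_forward((0.0, 0.0),
                centres=[(0.0, 0.0), (5.0, 5.0)],
                widths=[1.0, 1.0],
                weights=[2.0, 3.0],
                bias=0.5)
```

The local tuning noted earlier is visible here: only prototypes near the input contribute appreciably to the output sum.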

Basis functions

For the exact interpolation problem a theorem provided by Micchelli [16] shows that for a large class of basis functions the N×N interpolation matrix is non-singular, provided the data points are distinct. There are six basis functions recognised as having useful properties for RBF networks [2]:

  1. Multiquadric [9]:

     φ(x) = (x² + σ²)^(1/2) for σ > 0 and x ∈ ℝ,

     which is a special case of

     φ(x) = (x² + σ²)^α, 0 < α < 1.

  2. Gaussian:

     φ(x) = exp(−x² / (2σ²)) for σ > 0 and x ∈ ℝ.

Of these first two functions, only the Gaussian is local in the sense that φ(x) → 0 as |x| → ∞; the multiquadric grows without bound as |x| → ∞ …
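For reference, the kernels above can be written out directly, together with the other functions evaluated in the study. The conclusions of this paper name the cubic and inverse multiquadric; the linear and thin-plate spline forms are included here as an assumption, taken from the standard list in [2].

```python
import math

def multiquadric(x, sigma):
    return math.sqrt(x * x + sigma * sigma)

def gaussian(x, sigma):
    return math.exp(-x * x / (2.0 * sigma * sigma))

def inverse_multiquadric(x, sigma):
    return 1.0 / math.sqrt(x * x + sigma * sigma)

def cubic(x):
    return x ** 3

def linear(x):
    return x

def thin_plate_spline(x):
    # phi(x) = x^2 ln(x), conventionally taken as 0 at x = 0
    return x * x * math.log(x) if x > 0.0 else 0.0
```

Note that only the first three take a basis width parameter σ, which is why the paper qualifies its basis-width analysis with "where applicable".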

Data sets

In order to provide a broad set of comparisons, four diverse data sets have been employed in this study. Two are recognised as standard benchmark tests and two are ‘real’ time-series data sets based on river flow modelling (see [7]). To facilitate comparison, all data are split following previous studies. These data sets are described below.

Comparison tests

In all cases the error measure used for evaluation was the MSE, since it is the most commonly used measure found in the literature. All data were normalised by range (0.1…0.9). While it is appreciated that more sophisticated training and testing strategies can be used (for example, cross-validation), a standard training and testing approach was adopted throughout this study to enable comparisons to be made under equivalent conditions (the purpose of this study was not to identify the best …)
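The normalisation and error measure just described amount to the following; the function names are illustrative.

```python
def normalise_by_range(values, lo=0.1, hi=0.9):
    """Linearly rescale so the minimum maps to lo and the maximum to hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def mse(targets, predictions):
    """Mean squared error over a test set."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

scaled = normalise_by_range([0.0, 5.0, 10.0])
error = mse([1.0, 2.0], [1.0, 4.0])
```

Normalising into (0.1, 0.9) rather than (0, 1) keeps targets away from the saturated extremes of the output range, a common practice when comparing network models on equal terms.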

Results

The results for the River Amber are presented first since additional experiments were conducted in order to emphasise the potential pitfalls associated with assuming that the ‘best’ model is the one with the smallest error measure on a test set.

Conclusions

The results in this paper indicate that the choice of basis function is problem dependent and, for a network configuration under consideration, testing all six basis functions proves to be advantageous. The results indicate that the ‘best’ basis functions for the data considered here include the cubic, inverse multiquadric and Gaussian. Furthermore, the choice of basis width parameter (where applicable) has a significant effect on model performance. Also, making the assumption that the model …

Acknowledgements

The Box–Jenkins furnace data and the Mackey–Glass raw data generator were obtained from the IEEE Neural Networks Council Standards Committee Working Group on Data Modelling Benchmarks, which may be found at: http://neural.cs.nthu.edu.tw/jang/benchmark/

The authors are grateful to Richard Cross (Severn Trent Environment Agency) and Brian Greenfield (Thames Environment Agency) for provision of the hydrometric data.


References (22)

  • A. Ghodsi et al.

    Automatic basis selection techniques for RBF networks

    Neural Networks

    (2003)
  • K. Hornik et al.

    Multilayer feedforward networks are universal approximators

    Neural Networks

    (1989)
  • B.G. Batchelor et al.

    Method for location of clusters of patterns to initialise a learning machine

    Electron. Lett.

    (1969)
  • C.M. Bishop

    Neural Networks for Pattern Recognition

    (1995)
  • G.E.P. Box et al.

    Time Series Analysis Forecasting and Control

    (1976)
  • D.S. Broomhead et al.

    Multivariable function interpolation and adaptive networks

    Complex Syst.

    (1988)
  • E.S. Chng et al.

    Gradient radial basis function networks for nonlinear and nonstationary time series prediction

    IEEE Trans. Neural Networks

    (1996)
  • C.W. Dawson et al.

    Inductive learning approaches to rainfall-runoff modeling

    Int. J. Neural Syst.

    (2000)
  • C.W. Dawson et al.

    An artificial neural network approach to rainfall-runoff modeling

    Hydrol. Sci. J.

    (1998)
  • R.L. Hardy

Multiquadric equations of topography and other irregular surfaces

    J. Geophys. Res.

    (1971)
  • J.R. Jang

    Adaptive-network-based fuzzy inference system

    IEEE Trans. Syst. Man, Cybern.

    (1993)

    Colin Harpham received the B.Sc. degree in Mathematics in 1997 followed by the M.Sc. degree in Industrial Mathematical Modelling in 1998, both from Loughborough University. He received a Ph.D. in Computer Science from the University of Derby in 2004. From 2002 to 2004, he was a research associate at King's College London applying artificial neural networks and statistical methods to the prediction of daily precipitation time series. He is about to undertake a research position in the Climatic Research Unit at the University of East Anglia. His research interests include applying artificial neural networks, in particular the Radial Basis Function configured using a Genetic Algorithm, to time series prediction with an emphasis on climatological applications.

    Christian Dawson completed his Ph.D. at Loughborough University in 1992 in Software Engineering. Since that time he has been heavily involved with the development and application of artificial neural networks (ANNs). He has specific expertise in the advancement and application of artificial intelligence methods to environmental and hydrological modelling. He has developed a runoff model for the River Yangtze in a project sponsored by the Three Gorges University, China and has recently managed an international comparative study in neurohydrology. He is a member of the Technical Committee of the Hydraulics, Water Resources and Ocean Engineering Conference in India and is the Local Organizing Chair for the Sixteenth International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems. He is a member of the British Computer Society and is also a Chartered Engineer.
