Elsevier

Neurocomputing

Volume 69, Issues 16–18, October 2006, Pages 2161-2170

The effect of different basis functions on a radial basis function network for time series prediction: A comparative study

https://doi.org/10.1016/j.neucom.2005.07.010

Abstract

Many applications using radial basis function (RBF) networks for time series prediction utilise only one or two basis functions, the most popular being the Gaussian function. This function may not always be appropriate and the purpose of this paper is to demonstrate the variation of test set error between six recognised basis functions. The tests were carried out on the Mackey–Glass chaotic time series, Box–Jenkins furnace data and flood prediction data sets for the Rivers Amber and Mole, UK. Each RBF network was trained using a two-stage approach, utilising the k-means clustering algorithm for the first stage and singular value decomposition for the second stage. For this type of network configuration the results indicate that the choice of basis function (and, where appropriate, basis width parameter) is data set dependent and evaluating all recognised basis functions suitable for RBF networks is advantageous.

Introduction

Feedforward artificial neural networks (ANNs) maintain a high level of research interest due to their ability to map any function to an arbitrary degree of accuracy. This has been demonstrated theoretically for both the radial basis function (RBF) network [19] and the popular multilayer perceptron (MLP) network [10]. Due to the similarity in functional mapping ability, the RBF and MLP share the same problem domains and consequently direct comparisons have been made (for example, see [18]). Feedforward ANNs have been applied to many diverse areas such as pattern recognition, time series prediction, signal processing, control and a variety of mathematical applications.

The RBF [4], [21] was developed from an exact multivariate function interpolation [20] and has attracted a lot of interest since its conception. There are a number of significant differences between RBFs and MLPs:

  • The RBF has one hidden layer while the MLP can have several.

  • The hidden and output layer nodes of the RBF are different while the MLP nodes are usually the same throughout.

  • RBFs are locally tuned while MLPs construct a global function approximation.

There have been many reports where practitioners have applied RBFs to various time series problems. In the majority of these cases only one or two basis functions have been used (for example, [5])—the most popular of which is the Gaussian kernel. This may be appropriate when the underlying phenomena are reasonably approximated by Gaussian functions but this is not always the case. Other kernels may be more appropriate for optimal network performance. While some researchers have concluded that the basis function is not crucial to network performance (for example, [12]), this study demonstrates that the optimal choice of basis function is problem dependent.

While some authors have explored the development of algorithms to optimise the number of basis functions used within RBFs (for example, see [8]) and RBF configuration (for example, see [14]), there has been little investigation into the influence of the basis functions themselves. This paper examines the variation in network performance (i.e. error on the test set) resulting from the use of different basis functions when applied to two benchmark tests and two ‘real world’ problems. An analysis of the effect of the basis width (where applicable) has also been conducted to investigate the importance of this parameter.

There are several ways of setting the basis width parameter, the simplest of which is to set it to a fixed common value. Alternatively, an ad hoc method can be employed, such as taking the average distance from each basis function to its L nearest neighbours. There are more sophisticated algorithms which perform a gradient descent on a regularised error function and enable the dynamic adjustment of the basis centres and basis widths for each node [2]. However, the trade-off is a considerably slower training time. To enable consistent evaluations of variations to the basis width parameter to be made, a common basis width is used for each node during the comparison tests presented in this paper.
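The nearest-neighbour heuristic mentioned above can be sketched as follows; the function name and the choice of L are illustrative, not taken from the paper.

```python
import math

def nearest_neighbour_widths(centres, L):
    """Ad hoc width selection: set each node's basis width to the average
    distance from its centre to its L nearest neighbouring centres."""
    widths = []
    for i, c in enumerate(centres):
        # distances from this centre to every other centre, nearest first
        dists = sorted(math.dist(c, other)
                       for j, other in enumerate(centres) if j != i)
        widths.append(sum(dists[:L]) / L)
    return widths

# The simplest alternative described above is a single fixed common width
# shared by every node, which is the scheme used in this paper's comparisons.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
widths = nearest_neighbour_widths(centres, L=2)
```

An isolated centre (such as the last one above) receives a large width, so its basis function covers the sparse region around it; tightly clustered centres receive small widths.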

This paper is split into the following sections: Section 2 provides a brief overview of the RBF network; Section 3 details the different basis functions suitable for an RBF network; Section 4 describes the data sets; Section 5 describes how the experiments were conducted; Section 6 presents the experimental results; and Section 7 provides the conclusions and recommendations from this study.


The RBF network

The RBF consists of two layers (see Fig. 1), with architecture similar to that of a two-layer MLP. The distance between an input vector and a prototype vector determines the activation of the hidden layer with the non-linearity provided by the basis function. The nodes in the output layer usually perform an ordinary linear weighted sum of these activations, although non-linear output nodes are an option. Training an RBF can be accomplished by two different approaches depending on whether the …
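The forward pass just described can be sketched directly; this is a minimal illustration assuming a Gaussian kernel and a single linear output node, with all names invented for the example (a full implementation would also include the two-stage training of centres and weights).

```python
import math

def rbf_forward(x, centres, widths, weights, bias):
    """One forward pass of a two-layer RBF network: each hidden node's
    activation is the basis function applied to the distance between the
    input vector and that node's prototype (centre); the output node forms
    an ordinary linear weighted sum of these activations."""
    hidden = [math.exp(-math.dist(x, c) ** 2 / (2.0 * s * s))  # Gaussian basis
              for c, s in zip(centres, widths)]
    return bias + sum(w * h for w, h in zip(weights, hidden))

# An input sitting on the first prototype activates that node fully (phi = 1)
# while the distant second node contributes almost nothing.
y = rbf_forward((0.0, 0.0),
                centres=[(0.0, 0.0), (5.0, 5.0)],
                widths=[1.0, 1.0],
                weights=[2.0, 3.0],
                bias=0.5)
```

The local tuning noted earlier is visible here: only prototypes near the input contribute appreciably to the output sum.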

Basis functions

For the exact interpolation problem a theorem provided by Micchelli [16] shows that for a large class of basis functions the N×N interpolation matrix is non-singular, provided the data points are distinct. There are six basis functions recognised as having useful properties for RBF networks [2]:

  1. Multiquadric [9]:

     φ(x) = (x² + σ²)^(1/2) for σ > 0 and x ∈ ℝ,

     which is a special case of

     φ(x) = (x² + σ²)^α, 0 < α < 1.

  2. Gaussian:

     φ(x) = exp(−x² / (2σ²)) for σ > 0 and x ∈ ℝ.

Of these first two functions, only the Gaussian is local in the sense that φ(x) → 0 as |x| → ∞; the multiquadric grows without bound as |x| → ∞ …
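For reference, the kernels above can be written out directly, together with the other functions evaluated in the study. The conclusions of this paper name the cubic and inverse multiquadric; the linear and thin-plate spline forms are included here as an assumption, taken from the standard list in [2].

```python
import math

def multiquadric(x, sigma):
    return math.sqrt(x * x + sigma * sigma)

def gaussian(x, sigma):
    return math.exp(-x * x / (2.0 * sigma * sigma))

def inverse_multiquadric(x, sigma):
    return 1.0 / math.sqrt(x * x + sigma * sigma)

def cubic(x):
    return x ** 3

def linear(x):
    return x

def thin_plate_spline(x):
    # phi(x) = x^2 ln(x), conventionally taken as 0 at x = 0
    return x * x * math.log(x) if x > 0.0 else 0.0
```

Note that only the first three take a basis width parameter σ, which is why the paper qualifies its basis-width analysis with "where applicable".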

Data sets

In order to provide a broad set of comparisons, four diverse data sets have been employed in this study. Two are recognised as standard benchmark tests and two are ‘real’ time-series data sets based on river flow modelling (see [7]). To facilitate comparison, all data are split following previous studies. These data sets are described below.

Comparison tests

In all cases the error measure used for evaluation was the MSE, since it is the most commonly used measure found in the literature. All data were normalised by range (0.1…0.9). While it is appreciated that more sophisticated training and testing strategies can be used (for example, cross-validation), a standard training and testing approach was adopted throughout this study to enable comparisons to be made under equivalent conditions (the purpose of this study was not to identify the best …)
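The normalisation and error measure just described amount to the following; the function names are illustrative.

```python
def normalise_by_range(values, lo=0.1, hi=0.9):
    """Linearly rescale so the minimum maps to lo and the maximum to hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def mse(targets, predictions):
    """Mean squared error over a test set."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

scaled = normalise_by_range([0.0, 5.0, 10.0])
error = mse([1.0, 2.0], [1.0, 4.0])
```

Normalising into (0.1, 0.9) rather than (0, 1) keeps targets away from the saturated extremes of the output range, a common practice when comparing network models on equal terms.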

Results

The results for the River Amber are presented first since additional experiments were conducted in order to emphasise the potential pitfalls associated with assuming that the ‘best’ model is the one with the smallest error measure on a test set.

Conclusions

The results in this paper indicate that the choice of basis function is problem dependent and, for a network configuration under consideration, testing all six basis functions proves to be advantageous. The results indicate that the ‘best’ basis functions for the data considered here include the cubic, inverse multiquadric and Gaussian. Furthermore, the choice of basis width parameter (where applicable) has a significant effect on model performance. Also, making the assumption that the model …

Acknowledgements

The Box–Jenkins furnace data and the Mackey–Glass raw data generator were obtained from the IEEE Neural Networks Council Standards Committee Working Group on Data Modelling Benchmarks, which may be found at: http://neural.cs.nthu.edu.tw/jang/benchmark/

The authors are grateful to Richard Cross (Severn Trent Environment Agency) and Brian Greenfield (Thames Environment Agency) for provision of the hydrometric data.


References (22)

  • A. Ghodsi et al.

    Automatic basis selection techniques for RBF networks

    Neural Networks

    (2003)
  • K. Hornik et al.

    Multilayer feedforward networks are universal approximators

    Neural Networks

    (1989)
  • B.G. Batchelor et al.

    Method for location of clusters of patterns to initialise a learning machine

    Electron. Lett.

    (1969)
  • C.M. Bishop

    Neural Networks for Pattern Recognition

    (1995)
  • G.E.P. Box et al.

    Time Series Analysis Forecasting and Control

    (1976)
  • D.S. Broomhead et al.

    Multivariable function interpolation and adaptive networks

    Complex Syst.

    (1988)
  • E.S. Chng et al.

    Gradient radial basis function networks for nonlinear and nonstationary time series prediction

    IEEE Trans. Neural Networks

    (1996)
  • C.W. Dawson et al.

    Inductive learning approaches to rainfall-runoff modeling

    Int. J. Neural Syst.

    (2000)
  • C.W. Dawson et al.

    An artificial neural network approach to rainfall-runoff modeling

    Hydrol. Sci. J.

    (1998)
  • R.L. Hardy

Multiquadric equations of topography and other irregular surfaces

    J. Geophys. Res.

    (1971)
  • J.R. Jang

    Adaptive-network-based fuzzy inference system

    IEEE Trans. Syst. Man, Cybern.

    (1993)

    Colin Harpham received the B.Sc. degree in Mathematics in 1997 followed by the M.Sc. degree in Industrial Mathematical Modelling in 1998, both from Loughborough University. He received a Ph.D. in Computer Science from the University of Derby in 2004. From 2002 to 2004, he was a research associate at King's College London applying artificial neural networks and statistical methods to the prediction of daily precipitation time series. He is about to undertake a research position in the Climatic Research Unit at the University of East Anglia. His research interests include applying artificial neural networks, in particular the Radial Basis Function configured using a Genetic Algorithm, to time series prediction with an emphasis on climatological applications.

    Christian Dawson completed his Ph.D. at Loughborough University in 1992 in Software Engineering. Since that time he has been heavily involved with the development and application of artificial neural networks (ANNs). He has specific expertise in the advancement and application of artificial intelligence methods to environmental and hydrological modelling. He has developed a runoff model for the River Yangtze in a project sponsored by the Three Gorges University, China and has recently managed an international comparative study in neurohydrology. He is a member of the Technical Committee of the Hydraulics, Water Resources and Ocean Engineering Conference in India and is the Local Organizing Chair for the Sixteenth International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems. He is a member of the British Computer Society and is also a Chartered Engineer.
