Neurocomputing, Volume 57, March 2004, Pages 373-387

Cell-splitting grid: a self-creating and self-organizing neural network

https://doi.org/10.1016/j.neucom.2003.09.012

Abstract

A new self-creating and self-organizing neural network model, the cell-splitting grid (CSG), is presented. In the proposed CSG algorithm, the neurons and their connections are created and organized on a 2-D plane according to the input data distribution. The CSG algorithm outperforms the self-organizing map (SOM) and other SOM-related algorithms in vector quantization while maintaining relatively good topology preservation. This paper shows that CSG is a promising and effective method, especially for non-uniformly distributed data.

Introduction

Self-organizing map (SOM), proposed by Kohonen [9], has been widely used in many areas such as pattern recognition, biological modeling, data compression, signal processing, and data mining [10]. It is an unsupervised and non-parametric neural network approach. The success of the SOM algorithm lies in its simplicity, which makes it easy to understand, simulate, and apply in many applications. The fundamental idea of SOM is based on a biological mechanism; it is a heuristic approach rather than one derived from strict mathematics.

The basic SOM consists of a set of neurons usually arranged in a 2-D structure such that there are neighborhood relations among the neurons. After training is completed, each neuron is attached to a reference vector of the same dimension as the input space. By assigning each input vector to the neuron with the nearest reference vector, SOM divides the input space into regions sharing a common nearest reference vector. This process can be considered as performing vector quantization (VQ) [7]. In addition, because of the neighborhood relations contributed by the inter-connections among neurons, SOM exhibits another important property, topology preservation: if the reference vectors are close to each other in the input space, the corresponding neurons will also be close in the output space, and vice versa.
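For concreteness, the following is a minimal sketch of the standard SOM training loop and of the nearest-reference-vector assignment described above. It assumes a square grid, a Gaussian neighborhood, and linearly decaying learning parameters; the function and parameter names are our own illustration, not anything prescribed by the paper.

```python
import numpy as np

def train_som(data, grid_h=10, grid_w=10, epochs=20, lr0=0.5, sigma0=3.0):
    """Minimal SOM: each grid node holds a reference vector of input dimension."""
    n, dim = data.shape
    rng = np.random.default_rng(0)
    weights = rng.uniform(data.min(0), data.max(0), size=(grid_h, grid_w, dim))
    # Fixed grid coordinates give the neighborhood relations in the output space.
    coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                                  indexing="ij"), axis=-1)
    t, t_max = 0, epochs * n
    for _ in range(epochs):
        for x in data[rng.permutation(n)]:
            # Best-matching unit: neuron whose reference vector is nearest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(dists.argmin(), dists.shape)
            # Linearly decaying learning rate and neighborhood radius.
            frac = t / t_max
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3
            # Gaussian neighborhood measured on the grid (output space).
            grid_d2 = ((coords - np.array(bmu)) ** 2).sum(-1)
            neigh = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * neigh[..., None] * (x - weights)
            t += 1
    return weights

def quantize(data, weights):
    """Assign each input vector to the neuron with the nearest reference vector (VQ)."""
    refs = weights.reshape(-1, weights.shape[-1])
    return np.linalg.norm(data[:, None, :] - refs[None, :, :], axis=-1).argmin(axis=1)
```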

The main drawback of SOM is that the map structure and map size must be pre-defined before training commences, so SOM is inherently limited by its fixed network. Usually, trial tests must be adopted to select an appropriate network structure and size [15]. Another problem is that the neurons in SOM are uniformly distributed in the output space, which does not reflect the distribution of the data in the input space well. The root cause of these drawbacks is that the network structure is pre-defined, whereas it should instead be determined by the data themselves. Several improved SOM and related algorithms [1], [3], [5], [6], [16] have been proposed in recent years to overcome the above shortcomings. The networks in these algorithms dynamically grow to suitable sizes, but they either make it difficult to visualize high-dimensional input data on a 2-D plane [5], [6], [16], or impose equal distances among neighboring neurons on the output map [1], [3].

In this paper, we present a new self-creating and self-organizing neural network model called the cell-splitting grid (CSG) algorithm, which resembles the cell-splitting mechanism found in biology. The proposed CSG algorithm overcomes the discussed shortcomings of SOM and its improved variants [1], [3], [5], [6], [16], and it exhibits improved performance especially for the non-uniformly distributed input data encountered in most real-world applications. The proposed network yields a 2-D representation on an output map confined to a square region, and the neurons are distributed on the 2-D map according to the density distribution of the input data: neurons representing dense regions of the input data are densely distributed on the 2-D map, whereas those lying in sparse regions of the input data are located in sparse regions of the 2-D output space. As a result, the non-uniform distribution of neurons on the output map not only preserves the data distribution of the input space, but also delivers better vector quantization performance than the SOM and other SOM-related algorithms.

The rest of this paper is organized as follows. Section 2 briefly introduces several improved SOM algorithms and the idea behind the proposed CSG algorithm. Section 3 presents our algorithm in detail and the advantages of using CSG. Section 4 discusses CSG's relations to quadtrees and tree-structured SOM. Section 5 demonstrates the ability of the proposed algorithm by applying CSG to three synthetic data sets. Section 6 draws the conclusion.

Section snippets

Self-creating and self-organizing neural network models

There have been a number of algorithms proposed to improve the SOM by dynamically increasing the number of neurons [1], [3], [5], [6], [16]. After the completion of a training process, the network can reach a desired map size and be used for data analysis. The number of neurons, connections of neurons, and the map size are not predefined by the designers, but determined incrementally from the data set. These algorithms have overcome the fixed structure of SOM to some extent.

(1) Growing cell

Goals of CSG

First, we define the goals of CSG, which are to obtain good vector quantization performance as well as good topology preservation through a dynamic self-creating and self-organizing learning process.

Problem specifications

The input data x are distributed in an n-dimensional space with an unknown probability density p(x). We denote by V = R^n the input space and by N the number of input data. The goal of CSG is to map V onto a 2-D output space A = R^2. Clearly, after the completion of the training process, the
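To make this specification concrete, the sketch below pairs each neuron with a reference vector in the input space V = R^n and a position (and square cell) in the output space A = R^2, and maps an input to the neuron with the nearest reference vector. The record layout and names are hypothetical illustrations of the stated problem setting, not the data structures used in the paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Neuron:
    ref: np.ndarray      # reference vector w in the input space V = R^n
    center: np.ndarray   # position of the neuron's square cell in A = R^2
    side: float          # side length of the square cell on the output map

def best_matching(neurons, x):
    """Map an input x in V to the neuron with the nearest reference vector,
    i.e. to a position in the 2-D output space A."""
    i = int(np.argmin([np.linalg.norm(n.ref - x) for n in neurons]))
    return neurons[i]
```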

Relation to quadtrees

The term quadtree describes a class of hierarchical data structures whose common property is that they are based on the principle of recursive decomposition of space [14]. Quadtrees are used for point data, regions, curves, surfaces, and volumes. The most studied quadtree approach to region representation is based on the successive subdivision of the image array into four equal-sized quadrants. A region quadtree starts from a single large region and subdivides the array into four
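As an illustration of the recursive decomposition described above (the textbook construction, not code from the paper), the following sketch builds a region quadtree for a small binary array by repeatedly splitting it into four equal-sized quadrants until every block is homogeneous.

```python
import numpy as np

def build_quadtree(block):
    """Region quadtree: recursively subdivide a 2^k x 2^k binary array into
    four equal-sized quadrants until every block holds a single value."""
    if block.min() == block.max() or block.shape[0] == 1:
        return int(block[0, 0])          # homogeneous leaf
    h = block.shape[0] // 2
    return {                             # internal node with four children
        "nw": build_quadtree(block[:h, :h]),
        "ne": build_quadtree(block[:h, h:]),
        "sw": build_quadtree(block[h:, :h]),
        "se": build_quadtree(block[h:, h:]),
    }

# Example: a 4x4 binary image whose upper-right quadrant is filled.
img = np.zeros((4, 4), dtype=int)
img[:2, 2:] = 1
print(build_quadtree(img))   # {'nw': 0, 'ne': 1, 'sw': 0, 'se': 0}
```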

Experimental results

In this section, the proposed CSG algorithm is applied to data sets to demonstrate its effectiveness. We use three performance criteria to compare CSG with the SOM, GCS, GNG, and DSOM algorithms: (1) quantization error, (2) topographic product [2], and (3) correlation between the pairwise distances [12]. Criteria 2 and 3 are used to compare topology preservation. A topographic product value close to zero or a correlation value close to one indicates a good performance of
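The first and third criteria can be computed as in the sketch below, assuming the usual definitions: quantization error as the mean distance from each input vector to its nearest reference vector, and topology preservation as the Pearson correlation between pairwise distances of the reference vectors in the input space and of the corresponding neuron positions on the output map. The exact formulations used in the paper may differ.

```python
import numpy as np

def quantization_error(data, refs):
    """Mean Euclidean distance from each input vector to its nearest reference vector."""
    d = np.linalg.norm(data[:, None, :] - refs[None, :, :], axis=-1)
    return d.min(axis=1).mean()

def pairwise_distance_correlation(refs, positions):
    """Pearson correlation between pairwise distances of the reference vectors
    (input space) and of the corresponding neuron positions (2-D output space).
    Values close to one suggest good topology preservation."""
    iu = np.triu_indices(len(refs), k=1)
    d_in = np.linalg.norm(refs[:, None, :] - refs[None, :, :], axis=-1)[iu]
    d_out = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)[iu]
    return float(np.corrcoef(d_in, d_out)[0, 1])
```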

Conclusion

A cell-splitting grid (CSG) algorithm is developed for reflecting the topology and distribution of the data set directly onto the output map. Based on the synthetic data examples, it has been shown that the proposed algorithm is effective for vector quantization and topology preservation especially for non-uniformly distributed data sets. Using the proposed algorithmic procedures, the network architecture is generated in accordance with the given data sets. This characteristic contributes the

Acknowledgements

The authors would like to thank the anonymous reviewers for their useful comments and suggestions.

References (16)

  • B. Fritzke, Growing cell structures: a self-organizing network for unsupervised and supervised learning, Neural Networks (1994)
  • J. Si et al., Dynamic topology representing networks, Neural Networks (2000)
  • D. Alahakoon et al., Dynamic self-organizing maps with controlled growth for knowledge discovery, IEEE Trans. Neural Networks (2000)
  • H.-U. Bauer et al., Quantifying the neighborhood preservation of self-organizing feature maps, IEEE Trans. Neural Networks (1992)
  • J. Blackmore, R. Miikkulainen, Visualizing high-dimensional structure with the incremental grid growing neural network, ...
  • D. Choi et al., Self-creating and organizing neural networks, IEEE Trans. Neural Networks (1994)
  • B. Fritzke, A growing neural gas network learns topologies, Adv. Neural Inf. Process. Syst. (1995)
  • R.M. Gray, Vector quantization, IEEE Acoust. Speech Signal Process. Mag. (1984)


Tommy W. S. Chow (M’93) received the B. Sc. (First Hons.) and Ph. D. degrees from the University of Sunderland, Sunderland, UK. He joined the City University of Hong Kong, Hong Kong, as a Lecturer in 1988. He is currently a Professor in the Electronic Engineering Department. His research interests include machine fault diagnosis, HOS analysis, system identification, and neural networks learning algorithms and applications.

Sitao Wu is currently pursuing the Ph.D. degree in the Department of Electronic Engineering, City University of Hong Kong. He obtained the B.E. and M.E. degrees from the Department of Electrical Engineering, Southwest Jiaotong University, China, in 1996 and 1999, respectively. His research interests are neural networks, pattern recognition, and their applications.
