Cell-splitting grid: a self-creating and self-organizing neural network
Introduction
The self-organizing map (SOM), proposed by Kohonen [9], has been widely used in many areas such as pattern recognition, biological modeling, data compression, signal processing, and data mining [10]. It is an unsupervised, non-parametric neural network approach. The success of the SOM algorithm lies in its simplicity, which makes it easy to understand, simulate, and apply. The fundamental idea of SOM is based on a biological mechanism; it is a heuristic approach rather than one derived from rigorous mathematics.
The basic SOM consists of a set of neurons, usually arranged in a 2-D structure, so that neighborhood relations exist among the neurons. After training is completed, each neuron is attached to a reference vector of the same dimension as the input space. By assigning each input vector to the neuron with the nearest reference vector, SOM divides the input space into regions that share a common nearest reference vector. This process can be regarded as vector quantization (VQ) [7]. Moreover, because of the neighborhood relations induced by the inter-connections among neurons, SOM exhibits another important property: topology preservation. In other words, if two reference vectors are close to each other in the input space, the corresponding neurons are also close in the output space, and vice versa.
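The nearest-reference-vector assignment described above can be sketched as follows (an illustrative vector quantization snippet, not code from the paper; the function name and example data are our own):

```python
import numpy as np

def quantize(inputs, reference_vectors):
    """Return, for each input vector, the index of its nearest reference vector."""
    # Pairwise squared Euclidean distances, shape (num_inputs, num_neurons).
    d = ((inputs[:, None, :] - reference_vectors[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

inputs = np.array([[0.1, 0.2], [0.9, 0.8], [0.15, 0.25]])
refs = np.array([[0.0, 0.0], [1.0, 1.0]])
print(quantize(inputs, refs))  # -> [0 1 0]
```

Each distinct index value corresponds to one region of the input space, i.e. one VQ codeword.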
The main drawback of SOM is that the map structure and map size must be pre-defined before training begins; SOM is thus inherently limited by its fixed network structure. In practice, trial runs are usually needed to select an appropriate network structure and size [15]. Another problem is that the neurons of SOM are uniformly distributed in the output space, which does not reflect the distribution of the data in the input space well. The root cause of these drawbacks is that the network structure is defined in advance, when it should instead be determined by the data themselves. Several improved SOM and related algorithms [1], [3], [5], [6], [16] have been proposed in recent years to overcome these shortcomings. The neural networks in these algorithms dynamically grow to suitable sizes, but they either cannot easily visualize high-dimensional input data on a 2-D plane [5], [6], [16], or impose equal distances among neighboring neurons on the output map [1], [3].
In this paper, we present a new self-creating and self-organizing neural network model, the cell-splitting grid (CSG) algorithm, which resembles the cell-splitting mechanism found in biology. The proposed CSG algorithm overcomes the discussed shortcomings of SOM and its improved variants [1], [3], [5], [6], [16]. It delivers improved performance especially for non-uniformly distributed input data, which are encountered in most real-world applications. The proposed network provides a 2-D representation on an output map confined to a square region, and the neurons are distributed on the 2-D map according to the density distribution of the input data. Neurons representing dense regions of the input data are densely distributed on the 2-D map, whereas those lying in sparse regions of the input data are located in sparse regions of the 2-D output space. As a result, the non-uniform distribution of neurons on the output map not only preserves the data distribution of the input space, but also delivers better vector quantization performance than the SOM and other SOM-related algorithms.
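Although the full procedure is given in Section 3, the growth rule suggested by the name can be sketched roughly as follows. This is a hypothetical illustration only: the class name, win counter, and split threshold are ours, not the paper's; it shows the basic idea of a square cell splitting into four equal subsquares when its neuron wins frequently, so that dense input regions end up with finer cells.

```python
# Hypothetical sketch of the cell-splitting idea: each neuron owns a square
# on the unit output map; a frequently winning square is replaced by four
# equal subsquares, each hosting a new neuron.
class Cell:
    def __init__(self, x, y, size):
        self.x, self.y, self.size = x, y, size  # lower-left corner, side length
        self.wins = 0

def split(cell):
    """Replace a cell by its four equal-sized subsquares."""
    h = cell.size / 2
    return [Cell(cell.x + dx, cell.y + dy, h)
            for dx in (0, h) for dy in (0, h)]

SPLIT_THRESHOLD = 10  # illustrative choice, not the paper's criterion

def update(cells, winner):
    winner.wins += 1
    if winner.wins >= SPLIT_THRESHOLD:  # dense input region -> finer cells
        cells.remove(winner)
        cells.extend(split(winner))

cells = [Cell(0.0, 0.0, 1.0)]  # start with one cell covering the whole map
for _ in range(10):            # the same cell keeps winning ...
    update(cells, cells[0])
print(len(cells), cells[0].size)  # -> 4 0.5: the cell has split into quarters
```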
The paper is organized as follows. Section 2 briefly introduces several improved SOM algorithms and the idea behind the proposed CSG algorithm. Section 3 presents our algorithm in detail, along with the advantages of using CSG. Section 4 discusses CSG's relations to quadtrees and tree-structured SOM. Section 5 demonstrates the ability of the proposed algorithm by applying CSG to three synthetic data sets. Section 6 draws the conclusion.
Section snippets
Self-creating and self-organizing neural network models
There have been a number of algorithms proposed to improve the SOM by dynamically increasing the number of neurons [1], [3], [5], [6], [16]. After the completion of a training process, the network can reach a desired map size and be used for data analysis. The number of neurons, connections of neurons, and the map size are not predefined by the designers, but determined incrementally from the data set. These algorithms have overcome the fixed structure of SOM to some extent.
(1) Growing cell
Goals of CSG
First, we define the goal of CSG: to achieve good vector quantization performance as well as good topology preservation through a dynamic self-creating and self-organizing learning process.
Problem specifications
Input data x are distributed in n-dimensional space with an unknown probability density p(x). We denote the input space by V = R^n and the number of input data by N. The goal of CSG is to map V into a 2-D output space A = R^2. Clearly, after the completion of the training process, the
Relation to quadtrees
The term quadtree describes a class of hierarchical data structures whose common property is that they are based on the principle of recursive decomposition of space [14]. Quadtrees are used for point data, regions, curves, surfaces, and volumes. The most studied quadtree approach to region representation is based on the successive subdivision of the image array into four equal-sized quadrants. A region quadtree starts from a single large region and subdivides the array into four
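The recursive decomposition can be illustrated with a standard region-quadtree sketch (generic textbook code, not taken from the paper): a square binary array is split into four quadrants until each block is uniform.

```python
import numpy as np

def quadtree(img):
    """Return 0/1 for a uniform block, else a 4-tuple of
    (NW, NE, SW, SE) subtrees built by recursive subdivision."""
    if img.min() == img.max():      # uniform block -> leaf
        return int(img.flat[0])
    h = img.shape[0] // 2           # split into four equal quadrants
    return (quadtree(img[:h, :h]), quadtree(img[:h, h:]),
            quadtree(img[h:, :h]), quadtree(img[h:, h:]))

img = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 0, 1]])
print(quadtree(img))  # -> (1, 0, 0, (0, 0, 0, 1)): only the SE quadrant splits again
```

Subdivision stops early in uniform regions and recurses deeper where the data vary, which is the density-adaptive behavior CSG shares.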
Experimental results
In this section, the proposed CSG algorithm is applied to several data sets to demonstrate its effectiveness. We use three performance criteria to compare CSG with the SOM, GCS, GNG, and DSOM algorithms: (1) quantization error, (2) topographic product [2], and (3) correlation between the pairwise distances [12]. Criteria 2 and 3 compare topology preservation: a topographic product close to zero, or a correlation close to one, indicates a good performance of
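As a concrete example of the first criterion, quantization error can be computed as the average distance from each input to its nearest reference vector (we assume this standard definition here; the paper's exact normalization may differ):

```python
import numpy as np

def quantization_error(inputs, reference_vectors):
    """Mean Euclidean distance from each input to its nearest reference vector."""
    d = np.sqrt(((inputs[:, None, :] - reference_vectors[None, :, :]) ** 2).sum(axis=2))
    return d.min(axis=1).mean()

inputs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
refs = np.array([[0.0, 0.0], [1.0, 1.0]])
print(round(quantization_error(inputs, refs), 4))  # -> 0.6667
```

A lower value means the codebook (the set of reference vectors) represents the input distribution more faithfully.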
Conclusion
A cell-splitting grid (CSG) algorithm is developed to reflect the topology and distribution of a data set directly onto the output map. The synthetic data examples show that the proposed algorithm is effective for vector quantization and topology preservation, especially for non-uniformly distributed data sets. Using the proposed algorithmic procedures, the network architecture is generated in accordance with the given data sets. This characteristic contributes to the
Acknowledgements
The authors would like to thank the anonymous reviewers for their useful comments and suggestions.
Tommy W. S. Chow (M’93) received the B. Sc. (First Hons.) and Ph. D. degrees from the University of Sunderland, Sunderland, UK. He joined the City University of Hong Kong, Hong Kong, as a Lecturer in 1988. He is currently a Professor in the Electronic Engineering Department. His research interests include machine fault diagnosis, HOS analysis, system identification, and neural networks learning algorithms and applications.
References (16)
- B. Fritzke, Growing cell structures: a self-organizing network for unsupervised and supervised learning, Neural Networks (1994)
- J. Si et al., Dynamic topology representing networks, Neural Networks (2000)
- D. Alahakoon et al., Dynamic self-organizing maps with controlled growth for knowledge discovery, IEEE Trans. Neural Networks (2000)
- H.-U. Bauer, K.R. Pawelzik, Quantifying the neighborhood preservation of self-organizing feature maps, IEEE Trans. Neural Networks (1992)
- J. Blackmore, R. Miikkulainen, Visualizing high-dimensional structure with the incremental grid growing neural network, ...
- D.-I. Choi, S.-H. Park, Self-creating and organizing neural networks, IEEE Trans. Neural Networks (1994)
- B. Fritzke, A growing neural gas network learns topologies, Adv. Neural Inf. Process. Syst. (1995)
- R.M. Gray, Vector quantization, IEEE Acoust. Speech Signal Process. Mag. (1984)
Sitao Wu is now pursuing Ph.D. degree in the Department of Electronic Engineering of City University of Hong Kong. He obtained B. E. and M. E. degrees in the Department of Electrical Engineering of Southwest Jiaotong University, China in 1996 and 1999, respectively. His research interest areas are neural networks, pattern recognition, and their applications.