Cell-splitting grid: a self-creating and self-organizing neural network
Introduction
The self-organizing map (SOM), proposed by Kohonen [9], has been widely used in many areas such as pattern recognition, biological modeling, data compression, signal processing, and data mining [10]. It is an unsupervised, non-parametric neural network approach. The success of the SOM algorithm lies in its simplicity, which makes it easy to understand, simulate, and apply. The fundamental idea of SOM is based on a biological mechanism; it is a heuristic approach rather than one derived from rigorous mathematics.
The basic SOM consists of a set of neurons, usually arranged in a 2-D structure, so that neighborhood relations exist among the neurons. After training is completed, each neuron is attached to a reference vector of the same dimension as the input space. By assigning each input vector to the neuron with the nearest reference vector, SOM divides the input space into regions that share a common nearest reference vector. This process can be regarded as vector quantization (VQ) [7]. Moreover, because of the neighborhood relations induced by the inter-connections among neurons, SOM exhibits another important property: topology preservation. In other words, if two reference vectors are close to each other in the input space, the corresponding neurons are also close in the output space, and vice versa.
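The nearest-reference-vector assignment described above can be sketched as follows (an illustrative vector quantization snippet, not code from the paper; the function name and example data are our own):

```python
import numpy as np

def quantize(inputs, reference_vectors):
    """Return, for each input vector, the index of its nearest reference vector."""
    # Pairwise squared Euclidean distances, shape (num_inputs, num_neurons).
    d = ((inputs[:, None, :] - reference_vectors[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

inputs = np.array([[0.1, 0.2], [0.9, 0.8], [0.15, 0.25]])
refs = np.array([[0.0, 0.0], [1.0, 1.0]])
print(quantize(inputs, refs))  # -> [0 1 0]
```

Each distinct index value corresponds to one region of the input space, i.e. one VQ codeword.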
The main drawback of SOM is that the map structure and map size must be pre-defined before training begins; SOM is thus inherently limited by its fixed network structure. In practice, trial runs are usually needed to select an appropriate network structure and size [15]. Another problem is that the neurons of SOM are uniformly distributed in the output space, which does not reflect the distribution of the data in the input space well. The root cause of these drawbacks is that the network structure is defined in advance, when it should instead be determined by the data themselves. Several improved SOM and related algorithms [1], [3], [5], [6], [16] have been proposed in recent years to overcome these shortcomings. The neural networks in these algorithms dynamically grow to suitable sizes, but they either cannot easily visualize high-dimensional input data on a 2-D plane [5], [6], [16], or impose equal distances among neighboring neurons on the output map [1], [3].
In this paper, we present a new self-creating and self-organizing neural network model, the cell-splitting grid (CSG) algorithm, which resembles the cell-splitting mechanism found in biology. The proposed CSG algorithm overcomes the discussed shortcomings of SOM and its improved variants [1], [3], [5], [6], [16]. It delivers improved performance especially for non-uniformly distributed input data, which are encountered in most real-world applications. The proposed network provides a 2-D representation on an output map confined to a square region, and the neurons are distributed on the 2-D map according to the density distribution of the input data. Neurons representing dense regions of the input data are densely distributed on the 2-D map, whereas those lying in sparse regions of the input data are located in sparse regions of the 2-D output space. As a result, the non-uniform distribution of neurons on the output map not only preserves the data distribution of the input space, but also delivers better vector quantization performance than the SOM and other SOM-related algorithms.
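Although the full procedure is given in Section 3, the growth rule suggested by the name can be sketched roughly as follows. This is a hypothetical illustration only: the class name, win counter, and split threshold are ours, not the paper's; it shows the basic idea of a square cell splitting into four equal subsquares when its neuron wins frequently, so that dense input regions end up with finer cells.

```python
# Hypothetical sketch of the cell-splitting idea: each neuron owns a square
# on the unit output map; a frequently winning square is replaced by four
# equal subsquares, each hosting a new neuron.
class Cell:
    def __init__(self, x, y, size):
        self.x, self.y, self.size = x, y, size  # lower-left corner, side length
        self.wins = 0

def split(cell):
    """Replace a cell by its four equal-sized subsquares."""
    h = cell.size / 2
    return [Cell(cell.x + dx, cell.y + dy, h)
            for dx in (0, h) for dy in (0, h)]

SPLIT_THRESHOLD = 10  # illustrative choice, not the paper's criterion

def update(cells, winner):
    winner.wins += 1
    if winner.wins >= SPLIT_THRESHOLD:  # dense input region -> finer cells
        cells.remove(winner)
        cells.extend(split(winner))

cells = [Cell(0.0, 0.0, 1.0)]  # start with one cell covering the whole map
for _ in range(10):            # the same cell keeps winning ...
    update(cells, cells[0])
print(len(cells), cells[0].size)  # -> 4 0.5: the cell has split into quarters
```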
The paper is organized as follows. Section 2 briefly introduces several improved SOM algorithms and the idea behind the proposed CSG algorithm. Section 3 presents our algorithm in detail, along with the advantages of using CSG. Section 4 discusses CSG's relations to quadtrees and tree-structured SOM. Section 5 demonstrates the ability of the proposed algorithm by applying CSG to three synthetic data sets. Section 6 draws the conclusion.
Section snippets
Self-creating and self-organizing neural network models
There have been a number of algorithms proposed to improve the SOM by dynamically increasing the number of neurons [1], [3], [5], [6], [16]. After the completion of a training process, the network can reach a desired map size and be used for data analysis. The number of neurons, connections of neurons, and the map size are not predefined by the designers, but determined incrementally from the data set. These algorithms have overcome the fixed structure of SOM to some extent.
(1) Growing cell
Goals of CSG
First, we define the goal of CSG: to achieve good vector quantization performance as well as good topology preservation through a dynamic self-creating and self-organizing learning process.
Problem specifications
Input data x are distributed in n-dimensional space with an unknown probability density p(x). We denote the input space by V = R^n and the number of input data by N. The goal of CSG is to map V into a 2-D output space A = R^2. Clearly, after the completion of the training process, the
Relation to quadtrees
The term quadtree describes a class of hierarchical data structures whose common property is that they are based on the principle of recursive decomposition of space [14]. Quadtrees are used for point data, regions, curves, surfaces, and volumes. The most studied quadtree approach to region representation is based on the successive subdivision of the image array into four equal-sized quadrants. A region quadtree starts from a single large region and subdivides the array into four
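The recursive decomposition can be illustrated with a standard region-quadtree sketch (generic textbook code, not taken from the paper): a square binary array is split into four quadrants until each block is uniform.

```python
import numpy as np

def quadtree(img):
    """Return 0/1 for a uniform block, else a 4-tuple of
    (NW, NE, SW, SE) subtrees built by recursive subdivision."""
    if img.min() == img.max():      # uniform block -> leaf
        return int(img.flat[0])
    h = img.shape[0] // 2           # split into four equal quadrants
    return (quadtree(img[:h, :h]), quadtree(img[:h, h:]),
            quadtree(img[h:, :h]), quadtree(img[h:, h:]))

img = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 0, 1]])
print(quadtree(img))  # -> (1, 0, 0, (0, 0, 0, 1)): only the SE quadrant splits again
```

Subdivision stops early in uniform regions and recurses deeper where the data vary, which is the density-adaptive behavior CSG shares.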
Experimental results
In this section, the proposed CSG algorithm is applied to several data sets to demonstrate its effectiveness. We use three performance criteria to compare CSG with the SOM, GCS, GNG, and DSOM algorithms: (1) quantization error, (2) topographic product [2], and (3) correlation between the pairwise distances [12]. Criteria 2 and 3 compare topology preservation: a topographic product close to zero, or a correlation close to one, indicates a good performance of
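As a concrete example of the first criterion, quantization error can be computed as the average distance from each input to its nearest reference vector (we assume this standard definition here; the paper's exact normalization may differ):

```python
import numpy as np

def quantization_error(inputs, reference_vectors):
    """Mean Euclidean distance from each input to its nearest reference vector."""
    d = np.sqrt(((inputs[:, None, :] - reference_vectors[None, :, :]) ** 2).sum(axis=2))
    return d.min(axis=1).mean()

inputs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
refs = np.array([[0.0, 0.0], [1.0, 1.0]])
print(round(quantization_error(inputs, refs), 4))  # -> 0.6667
```

A lower value means the codebook (the set of reference vectors) represents the input distribution more faithfully.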
Conclusion
A cell-splitting grid (CSG) algorithm is developed to reflect the topology and distribution of a data set directly onto the output map. The synthetic data examples show that the proposed algorithm is effective for vector quantization and topology preservation, especially for non-uniformly distributed data sets. Using the proposed algorithmic procedures, the network architecture is generated in accordance with the given data sets. This characteristic contributes to the
Acknowledgements
The authors would like to thank the anonymous reviewers for their useful comments and suggestions.
Tommy W. S. Chow (M’93) received the B. Sc. (First Hons.) and Ph. D. degrees from the University of Sunderland, Sunderland, UK. He joined the City University of Hong Kong, Hong Kong, as a Lecturer in 1988. He is currently a Professor in the Electronic Engineering Department. His research interests include machine fault diagnosis, HOS analysis, system identification, and neural networks learning algorithms and applications.
References (16)
- B. Fritzke, Growing cell structures: a self-organizing network for unsupervised and supervised learning, Neural Networks (1994)
- J. Si et al., Dynamic topology representing networks, Neural Networks (2000)
- D. Alahakoon et al., Dynamic self-organizing maps with controlled growth for knowledge discovery, IEEE Trans. Neural Networks (2000)
- H.-U. Bauer, K.R. Pawelzik, Quantifying the neighborhood preservation of self-organizing feature maps, IEEE Trans. Neural Networks (1992)
- J. Blackmore, R. Miikkulainen, Visualizing high-dimensional structure with the incremental grid growing neural network, ...
- D.-I. Choi, S.-H. Park, Self-creating and organizing neural networks, IEEE Trans. Neural Networks (1994)
- B. Fritzke, A growing neural gas network learns topologies, Adv. Neural Inf. Process. Syst. (1995)
- R.M. Gray, Vector quantization, IEEE Acoust. Speech Signal Process. Mag. (1984)
Sitao Wu is now pursuing Ph.D. degree in the Department of Electronic Engineering of City University of Hong Kong. He obtained B. E. and M. E. degrees in the Department of Electrical Engineering of Southwest Jiaotong University, China in 1996 and 1999, respectively. His research interest areas are neural networks, pattern recognition, and their applications.