Neural network techniques for position and scale invariant image classification

Grimes, Catherine Alison (1998). Neural network techniques for position and scale invariant image classification. PhD thesis The Open University.

DOI: https://doi.org/10.21954/ou.ro.0000e20a

Abstract

This research is concerned with the application of neural network techniques to the problems of classifying images in a manner that is invariant to changes in position and scale. In addition to the goal of invariant classification, the network has to classify the objects in a hierarchical manner, in which complex features are constructed from simpler features, and use unsupervised learning. The resultant hierarchical structure should be able to classify the image by having an internal representation that models the structure of the image.

After finding existing neural network techniques unsuitable, a new type of neural network was developed that differed from the conventional multi-layer perceptron type of architecture. This network was constructed from neurons that were grouped into feature detectors. These neurons were taught in an unsupervised manner that used a technique based on Kohonen learning. A number of novel techniques were developed to improve the learning and classification performance of the network.
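For readers unfamiliar with Kohonen learning, the following is a minimal sketch of the standard textbook update rule on a one-dimensional map. It is not the variant developed in the thesis. The function names, the learning-rate schedule, the neighbourhood radius and the map size are all illustrative assumptions.

```python
import random


def kohonen_step(weights, x, learning_rate=0.5, radius=1):
    """Apply one Kohonen update on a 1-D map and return the winning unit's index.

    Hypothetical helper: textbook competitive learning, not the thesis method.
    """
    # The winner is the unit whose weight vector is closest to the input.
    dists = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
    winner = dists.index(min(dists))
    # Move the winner and its map neighbours a fraction of the way toward x.
    lo = max(0, winner - radius)
    hi = min(len(weights), winner + radius + 1)
    for j in range(lo, hi):
        weights[j] = [w_i + learning_rate * (x_i - w_i)
                      for w_i, x_i in zip(weights[j], x)]
    return winner


def train(samples, n_units=4, epochs=20, seed=0):
    """Train a small map without labels; all parameter values are assumptions."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        # Assumed schedule: linearly decaying rate, neighbourhood shrinks halfway.
        rate = 0.5 * (1 - epoch / epochs)
        radius = 1 if epoch < epochs // 2 else 0
        for x in samples:
            kohonen_step(weights, x, rate, radius)
    return weights
```

No class labels appear anywhere in the update, which is what makes the rule unsupervised: each unit drifts toward the inputs it wins, so units come to represent recurring input patterns.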

The network was able to retain the spatial relationship of the classified features; this inherent property resulted in the capability for position and scale invariant classification. As a consequence, an additional invariance filter was not required. In addition to achieving the invariance property, the developed techniques enabled multiple objects in an image to be classified.

When the network had learned the spatial relationships between the lower level features, names could be assigned to the identified features. As part of the classification process, the system was able to identify the positions of the classified features in all layers of the network.

A software model of an artificial retina was used to test the grey scale classification performance of the network and to assess the response of the retina to changes in brightness.

Like the Neocognitron, the resulting network was developed solely for image classification. Although the Neocognitron is not designed for scale or position invariance, it was chosen for comparison purposes because it has structural similarities and the ability to accommodate slight changes in the image.

This type of network could be used as the basis for a 2D-scene analysis neural network, in which the inherent parallelism of the neural network would provide simultaneous classification of the objects in the image.