Recognition of Eudiscoaster and Heliodiscoaster Using SOM Neural Network

This research aims to design a recognition system for Eudiscoaster and Heliodiscoaster. The goal is pursued in two main steps: first, image processing techniques are applied to pictures of the fossils for data acquisition; second, neural network techniques are applied for recognition. The image processing steps produce the clear image needed to extract data from an acquired picture (.jpg) containing the fossils. The picture is first enhanced to bring out the pattern. The enhanced picture is then segmented into 144 parts, and the average of each part is computed. These values are used as inputs to the neural network for recognition. For the neural network stage, a Self-Organizing Map (SOM) neural network was used for clustering. The weights and output values are stored for later use in identification. The SOM network succeeded in identification, attaining a False Acceptance Rate of 15% and a False Rejection Rate of 15%.


Introduction
Coccolithophores are planktic unicellular algae belonging to the division (or phylum) Haptophyta, which produce coccoliths (calcareous exoskeletal plates). They belong to the class Coccolithophyceae, first occur in the Late Triassic, and are abundant from the Early Jurassic to the Recent.
Discoasteraceae includes the calcareous nannofossils of star or rosette shape [1]. Flat-lying specimens remain dark when viewed between crossed nicols, since the c-axis of the calcite is then vertical. Most species are included in Discoaster. Theodoridis (1983) proposed Eudiscoaster and Heliodiscoaster. Discoaster has over 100 species, which can be distinguished without major problems when they are well preserved [2].
Computer-based recognition and identification have been studied extensively. For example, in 2000 a multimodal biometric system was presented for face and voice identification [3]; in 2004 systems based on physical characteristics (such as fingerprint and retina) were used in picture recognition [4]; and in 2008 a system was designed for picture recognition using a linear associative memory neural network [5].
In geology, calcareous nannofossils have commonly been classified by qualitative description [1]. For this reason, this study proposes a digital classification for this important group of fossils, which provides a very good age determination for the beds that contain them.

Image acquisition
Image acquisition is an important and difficult step of this system.
Since the fossils are small and colored, it is difficult to acquire images good enough for analysis using a standard Charge-Coupled Device (CCD) camera. We therefore used a camera of sufficiently high quality and then preprocessed the images on a computer, which makes the measurement process fast, convenient, and robust against natural variations. The system employs a digital video camera (Sony Cyber-shot 3X). See

Image standardization
The acquired image always contains not only the 'useful' parts (fossils) but also some 'irrelevant' parts (e.g. sediments). Under some conditions, the brightness is not uniformly distributed. In addition, different image-to-camera distances may result in different sizes of the same image. For the purpose of analysis, the original image therefore needs to be preprocessed. The color (RGB) image is converted to a gray-level image by converting the RGB values to National Television System Committee (NTSC) coordinates, setting the hue and saturation components to zero, and then converting back to RGB color space [6]. See Fig. (2).
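The gray-level conversion described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name rgb_to_gray and the nested-list image layout are assumptions made here. Setting hue and saturation to zero in NTSC coordinates leaves only the luminance component, so the sketch computes that component directly with the standard NTSC weights.

```python
def rgb_to_gray(image):
    """Convert an RGB image to gray levels using NTSC luminance weights.

    `image` is assumed to be a list of rows, each row a list of (r, g, b)
    tuples. With hue and saturation zeroed, converting back from NTSC
    gives R = G = B = luminance, so only that value is returned per pixel.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]


# Hypothetical 2 x 2 test picture: white, black, pure red, pure blue.
sample = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
gray = rgb_to_gray(sample)
```

White stays at full intensity, black stays at zero, and saturated colors map to intermediate gray levels according to their luminance weight.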

Image enhancement
The original image has low contrast and may have non-uniform illumination caused by the position of the light source. These may impair the result of the texture analysis. Therefore, it is necessary to enhance the image to reduce the effect of non-uniform illumination.
To do so, morphological top-hat filtering is performed on the grayscale image using a structuring element (SE). SE must be a single structuring element; here it is a flat, disk-shaped structuring element whose radius R must be a nonnegative integer [6]. See Fig. (3) and Fig. (4). The well-known two-dimensional convolution equation, equation (1), is then implemented to apply the top-hat filtering.
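As a rough illustration of morphological top-hat filtering with a flat disk, the sketch below subtracts the morphological opening (erosion followed by dilation) from the image. It is written in plain Python for clarity, not speed. The helper names, the exact Euclidean disk, and the choice to ignore pixels outside the image at the borders are assumptions made here; the toolbox routine cited in [6] may approximate the disk and treat borders differently.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) covered by a flat disk of integer radius."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span
            if dy * dy + dx * dx <= radius * radius]


def _morph(image, offsets, pick):
    """Apply `pick` (min = erosion, max = dilation) over the footprint.

    Pixels falling outside the image are simply ignored (an assumption).
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [image[y + dy][x + dx] for dy, dx in offsets
                      if 0 <= y + dy < height and 0 <= x + dx < width]
            row.append(pick(values))
        out.append(row)
    return out


def top_hat(image, radius):
    """Top-hat = image minus its opening by a disk of the given radius."""
    offsets = disk_offsets(radius)
    opened = _morph(_morph(image, offsets, min), offsets, max)
    return [[p - o for p, o in zip(image_row, opened_row)]
            for image_row, opened_row in zip(image, opened)]
```

Because the opening removes bright details smaller than the disk and keeps the slowly varying background, the difference retains the small bright structures while flattening uneven illumination.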

Delusion image
The top-hat-filtered grayscale image retains unneeded boundary details. Therefore, a delusion image (DIM) is constructed to emphasize the center of the image and suppress the boundaries [6].
Besides being circularly symmetric, the DIM treats edges and lines in various directions similarly. The idea is taken from Gaussian blur filters, which have the advantage of being separable into the product of horizontal and vertical vectors.
The delusion image is therefore decomposable into the product of a vertical vector and a horizontal vector [7]; one possible vector is shown in equation (2). The DIM is then multiplied by the top-hat-filtered image to produce the enhanced image, as shown in Fig. (6). This step is important to equalize the values of the previous texture image so that the output image contains a uniform distribution of intensities.
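The separable construction can be sketched as below. Since equation (2) is not reproduced here, the Gaussian-shaped vector and its sigma parameter are assumptions chosen only to illustrate the idea of a mask that is the product of a vertical and a horizontal vector; the multiplication with the filtered image is assumed to be pixel by pixel. All function names are hypothetical.

```python
import math


def gaussian_vector(length, sigma):
    """1-D Gaussian-shaped window (assumed form), peaking in the middle."""
    center = (length - 1) / 2.0
    return [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
            for i in range(length)]


def delusion_image(height, width, sigma):
    """Separable mask: outer product of a vertical and a horizontal vector."""
    vertical = gaussian_vector(height, sigma)
    horizontal = gaussian_vector(width, sigma)
    return [[v * h for h in horizontal] for v in vertical]


def apply_mask(image, mask):
    """Multiply the filtered image by the mask, pixel by pixel (assumed)."""
    return [[p * m for p, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]
```

With such a mask, pixels near the center keep most of their value while pixels near the boundary are attenuated toward zero.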

Image segmentation
At this stage the picture is divided into a square grid of (12 × 12) boxes. As mentioned previously, the image is segmented into 144 square matrices (each of 24 × 24 pixels), and the average is calculated for each segment as (mean1, mean2, …, mean144). These values flow to the neural network input layer. Fig. (7) shows the block diagram of the steps adopted in the image preprocessing.
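The segmentation step can be sketched as follows, assuming the enhanced image is 288 × 288 pixels (12 blocks of 24 pixels per side) and that the segments are numbered row by row; the numbering order and the function name block_means are assumptions made for illustration.

```python
def block_means(image, blocks_per_side=12, block_size=24):
    """Return the mean of each square block, scanned row by row.

    `image` is assumed to be a square list of rows with side length
    blocks_per_side * block_size (288 for the defaults).
    """
    side = blocks_per_side * block_size
    if len(image) != side or any(len(row) != side for row in image):
        raise ValueError("image must be %d x %d pixels" % (side, side))
    means = []
    for by in range(blocks_per_side):
        for bx in range(blocks_per_side):
            total = 0.0
            for y in range(by * block_size, (by + 1) * block_size):
                total += sum(image[y][bx * block_size:(bx + 1) * block_size])
            means.append(total / (block_size * block_size))
    return means
```

The result is a list of 144 values, which reduces 82,944 pixel values to one feature vector per image.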

Facility of neural networks
The neural network techniques could be adopted for the purpose of comparison and identification. As is the case with most neural networks, the aim is to train the net to achieve a balance between the ability to respond correctly to the input patterns that are used for training (memorization) and the ability to give reasonable (good) responses to input that is similar, but not identical, to that used in training (generalization) [8].

SOM neural network
A SOM layer forms a part of a large number of neural networks. It groups similar input vectors together without the use of training data to specify what a typical member of each group looks like or to which group each vector belongs. A sequence of input vectors is provided, but no target vectors are specified. The net modifies the weights so that the most similar input vectors are assigned to the same output (or cluster) unit. The neural net will produce an exemplar (representative) vector for each cluster formed [9].

Architecture of SOM neural network
The neurons in a SOM layer distribute themselves to recognize frequently presented input vectors.
The architecture for a SOM network is shown in Fig. 8 below.

Figure (8): Architecture of SOM neural network
The dist box in this figure, which has R inputs, accepts the input vector p and the input weight matrix IW1,1, and produces a vector having S1 elements. The elements are the negative of the distances between the input vector and vectors IW1,1 formed from the rows of the input weight matrix [10].
The net input n1 of a SOM layer is computed by finding the negative distance between input vector p and the weight vectors and adding the biases b. If all biases are zero, the maximum net input a neuron can have is 0. This occurs when the input vector p equals that neuron's weight vector. The SOM transfer function accepts a net input vector for a layer and returns neuron outputs of 0 for all neurons except for the winner, the neuron associated with the most positive element of net input n1. The winner's output is 1. If all biases are 0, then the neuron whose weight vector is closest to the input vector has the least negative net input and, therefore, wins the competition to output a 1 [10].
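The computation described in this paragraph can be sketched in a few lines. The function names are hypothetical, and Euclidean distance is assumed for the dist box.

```python
import math


def net_input(p, weights, biases=None):
    """Negative Euclidean distance from p to each weight row, plus bias."""
    if biases is None:
        biases = [0.0] * len(weights)  # zero biases, as in the text
    return [-math.dist(p, w) + b for w, b in zip(weights, biases)]


def compete(n):
    """Output 1 for the neuron with the most positive net input, else 0."""
    winner = max(range(len(n)), key=n.__getitem__)
    return [1 if i == winner else 0 for i in range(len(n))]


# Hypothetical 2-neuron layer with 2 inputs.
weights = [[0.0, 0.0], [3.0, 4.0]]
n1 = net_input([3.0, 4.0], weights)
a1 = compete(n1)
```

Here the input equals the second weight vector, so that neuron's net input is 0, the maximum possible with zero biases, and it wins the competition.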

Suggested SOM neural network
The suggested SOM network has 144 nodes in the input layer and two nodes in the output layer. The output layer uses a competitive activation function. With this topology the SOM network is able to separate the two types of nannofossils into two clusters.
As mentioned previously, the image is segmented into 144 square matrices and the mean is calculated for each segment as (mean1, mean2, …, mean144). These values flow in parallel to the input layer for training.
The SOM network topology, as shown in Fig. 9, is a single layer consisting of 144 nodes for input and 2 nodes for output. It has 288 weights and no biases to be stored. The input data stream is parallel for each sample.

Fig. 9: Suggested SOM neural network
The network was trained on 40 samples (20 of each type) for 300 epochs, and was also tested on 40 samples, so that any image introduced to the system can be identified if it is one of the trained samples or close enough to its cluster type. The calculated weights are stored in the database and become the basis of comparison for identifying an image. Every image to be tested enters the network in the same way, and the output values indicate its cluster.
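A training loop for a two-cluster network of this size might look like the sketch below. The paper does not give the learning rule, so the details are assumptions: a winner-take-all Kohonen update with no neighborhood, a linearly decaying learning rate starting at a hypothetical value of 0.1, and weights initialized from evenly spaced training samples. Only the 144 inputs, 2 outputs, and 300 epochs come from the text.

```python
import math


def winner(x, weights):
    """Index of the weight vector closest to x (Euclidean distance)."""
    return min(range(len(weights)), key=lambda k: math.dist(x, weights[k]))


def train_som(samples, n_clusters=2, epochs=300, lr=0.1):
    """Winner-take-all training; returns n_clusters weight vectors."""
    # Assumption: start each weight vector at an evenly spaced sample.
    step = max(1, len(samples) // n_clusters)
    weights = [list(samples[min(i * step, len(samples) - 1)])
               for i in range(n_clusters)]
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # assumed linear decay
        for x in samples:
            k = winner(x, weights)
            # Move only the winning weight vector toward the input.
            weights[k] = [w + rate * (xi - w) for w, xi in zip(weights[k], x)]
    return weights
```

After training, the stored weights (2 × 144 = 288 values) are all that is needed for identification: calling winner(means, weights) on the 144 segment means of a new image returns its cluster index.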
Eudiscoaster is typically Neogene and is usually a star-shaped Discoaster with planar contact surfaces between elements, whereas Heliodiscoaster is typically Paleogene and is usually a rosette-shaped Discoaster with curved contact surfaces between elements [11]. The SOM network topology attained a False Acceptance Rate (FAR) of 15% and a False Rejection Rate (FRR) of 15%.
This means that the system reached a success rate of 85%.

Results
For pattern image comparisons, Fig. 10 and Fig. 11 show sample patterns that demonstrate the differences between the data patterns before and after image multiplication. These curves illustrate the image segments and the levels of their data values. In Fig. 11, the multiplication makes the image segments approximately Gaussian distributed: the middle positions have the highest values after multiplication because of the object position. The remaining variation in data levels reflects the pattern variation from segment to segment. Moreover, it is important to observe the differences between images of different types and the convergence between images of the same type in their data values. After image segmentation, taking the average value of each image segment gives suitable results because it reduces the large amount of image data while still exhibiting the variations between image segments. This step prepares the pattern data for use in the next stage, the neural network. Fig. 12 shows that the average values for two images have the same pattern, which indicates that the image data are nearly the same for images of the same type.

Fig. 12: Mean curves of two segmentation images for the same type
In the previous figure, the average values (patterns) for the same type lead to curves that are close together. This equalization for images of the same type in turn gives different average curves for fossil images of different types. Fig. 13 below shows the differences between samples of different types:

Fig. 13: Mean curves of the different segmentation images
In the previous figures, the curve for Eudiscoaster, which has a star shape, shows a higher frequency than the curve for Heliodiscoaster, which has a rosette shape. This difference is the basis for the clustering and, in turn, for the comparisons. Table 1 shows the SOM neural network clustering results.

Conclusion
For the neural network stage, a SOM neural network was used for comparisons. The weights and output values are stored for later use in identification. The SOM network succeeded in identification, attaining a False Acceptance Rate of 15% and a False Rejection Rate of 15%. Therefore, when a photo of calcareous nannofossils is input, the program classifies it either as Eudiscoaster, in which case the studied bed is of Neogene age, or as Heliodiscoaster, in which case the studied bed is of Paleogene age.