Face Image Retrieval with HSV Color Space using Clustering Techniques

Abstract: The main objective of this paper is to classify face images using HSV color features, and a content-based image retrieval (CBIR) system is presented that retrieves facial images from the extracted facial features. The primary principle of CBIR in retrieving face images is to retrieve almost all relevant images while minimizing the number of irrelevant images. This can be achieved with the help of a clustering algorithm. When a query image is searched, the first step determines the nearest cluster and the second step computes the distances between the query image and the target images assigned to that cluster. Finally, images that are similar to the query image are retrieved and displayed. The experimental results are compared with retrieval using the Euclidean distance metric alone; the clustering technique produces more accurate image retrieval and better classification of images.


I. INTRODUCTION
The common image retrieval system for retrieving images from a large database of digital images utilizes metadata or keywords. Manual image annotation is time consuming; locating a desired image in a small database is feasible, whereas a large database requires more effective techniques. Content-based image retrieval (CBIR) systems are very useful and efficient if the images are classified on the basis of particular aspects. CBIR is a technique which uses visual contents, normally known as features, to search images from large-scale image databases according to users' requests in the form of a query image. Given a query image, the system tries to find visually similar images in an image database. Each image is represented by a feature vector, and if the distance between two vectors is smaller than a threshold, they are counted as a match. Visual contents are colors, shapes, textures, objects, or metadata (e.g., tags) derived from images. CBIR operates on a totally different principle from keyword search, retrieving stored images from a collection by comparing features automatically extracted from the images themselves. The most common features are mathematical measures of color, texture or shape. A more recent review reports a tremendous growth in CBIR techniques. Applications of CBIR systems to medical domains already exist, although most of the systems currently available are based on radiological images. Most of the work in dermatology has focused on skin cancer detection. Different techniques for segmentation, feature extraction and classification have been reported by several authors. Here, face images are retrieved using CBIR techniques, which provide fast and efficient retrieval.
Facial image retrieval retrieves images based on information extracted from human faces. It is a specific problem of content-based image retrieval and has great potential in various applications, such as Human-Computer Interaction (HCI), digital video processing and visual surveillance. Due to the rising popularity of digital cameras and digital albums, retrieving images of human faces has become an interesting problem. In the literature, Gudivada & Raghavan (1997) proposed a framework for image retrieval systems and implemented a feature-based face retrieval system; Satoh et al. (1999) built a face retrieval and recognition system called "Name It" for video content analysis, based on an eigen-face method for face recognition. Eickeler (2002) used a pseudo 2D Hidden Markov Model to retrieve faces from a face database. Most of the current face retrieval systems are based on face recognition, i.e., retrieval by identity. Besides identity, the face carries a lot of other useful category information, such as gender, age, ethnicity and even expression. It would be helpful to use clues other than identity to retrieve facial images.

II. LITERATURE REVIEW
Early systems existed already in the beginning of the 1980s [Chang & Fu, 1980]. Image classification is treated as a preprocessing step for speeding up image retrieval in large databases and improving accuracy, or for performing automatic image annotation. Image clustering inherently depends on a similarity measure. Image classification is often followed by a step of similarity measurement, restricted to those images in a large database that belong to the same visual class as predicted for the query. In such cases, the retrieval process is intertwined, whereby classification and similarity matching steps together form the retrieval process. Similar arguments hold for clustering as well, due to which, in many cases, it is also a fundamental "early" step in image retrieval [Ritendra Datta et al., 2008]. The K-Means clustering procedure is applied for scalable image retrieval from large databases. K-Means is a fast iterative improvement heuristic. Because its result depends on the initial seed points, a common method is to run the algorithm several times and keep the best clustering found.
Color is one of the most widely used features for image similarity retrieval. Color retrieval yields the best results, in that the computed color similarities are close to those derived by the human visual system, which is capable of differentiating between a very large number of colors. One of the main aspects of color feature extraction is the choice of a color space. A color space is a multidimensional space in which the different dimensions represent the different components of color [Daniela Stan & Ishwar K. Sethi, 2001]. Most color spaces are three dimensional. An example of a color space is RGB, which assigns to each pixel a three-element vector giving the color intensities of the three primary colors: red, green and blue. The space spanned by the R, G, and B values completely describes visible colors, which are represented as vectors in the 3D RGB color space. As a result, the RGB color space provides a useful starting point for representing color features of images. For color feature extraction, the HSV color space is used because it is close to the way colors are defined in human perception, which is not always the case for the RGB color space.

III. COLOR SPACE
A color space is defined as a model for representing color in terms of intensity values in a one- to four-dimensional space. A color component, or color channel, is one of the dimensions. In this proposed work, the HSV color space is used.

HSV Color Space
HSV stands for hue, saturation, and value. The value represents the intensity of a color, which is decoupled from the color information in the represented image. The hue and saturation components are intimately related to the way the human eye perceives color, resulting in image processing algorithms with a physiological basis. As hue varies from 0 to 1.0, the corresponding colors vary from red, through yellow, green, cyan, blue, and magenta, back to red, so that there are actually red values both at 0 and 1.0. As saturation varies from 0 to 1.0, the corresponding colors (hues) vary from unsaturated (shades of gray) to fully saturated (no white component). As value, or brightness, varies from 0 to 1.0, the corresponding colors become increasingly brighter. The HSV coordinate system model and the HSV color are shown in Figure 1 and Figure 2.
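The behaviour of the three components can be checked with a short sketch. It uses Python's standard `colorsys` module, which is an assumption of this illustration and not a tool named in the paper; the sample hues and the variable names `hue_sweep` and `gray` are likewise illustrative.

```python
import colorsys

# Sample the hue at six evenly spaced points plus the wrap-around value 1.0,
# keeping saturation and value at full strength.
names = ["red", "yellow", "green", "cyan", "blue", "magenta", "red"]
hue_sweep = []
for step, name in enumerate(names):
    hue = step / 6
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    # Rounding hides tiny floating-point residue in the RGB components.
    hue_sweep.append((name, (round(r, 6), round(g, 6), round(b, 6))))

# Zero saturation removes the hue and leaves a gray whose brightness is V.
gray = colorsys.hsv_to_rgb(0.5, 0.0, 0.4)

for name, rgb in hue_sweep:
    print(name, rgb)
print("gray", gray)
```

Hue 0 and hue 1.0 both map to pure red, and any hue with zero saturation collapses to a gray level equal to the value component.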

Color Conversion
To work in a suitable color space, a conversion between color spaces is needed that preserves the perceived color differences.

A. Hue Calculation
The hue value can be calculated using the following formula:

B. Saturation Calculation
The saturation value can be calculated using the following formula:

C. Value Calculation
The value is calculated using the following formula.
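A minimal sketch of one common RGB-to-HSV conversion, the hexcone model, is given below. It is an assumption that this variant matches the hue, saturation and value formulas intended here; other definitions exist (for example, arccos-based hue). The function name `rgb_to_hsv` is illustrative, and the standard-library `colorsys` module is used only as a cross-check.

```python
import colorsys


def rgb_to_hsv(r, g, b):
    """Convert RGB components in [0, 1] to (hue, saturation, value) in [0, 1]."""
    v = max(r, g, b)                  # value: the largest component
    delta = v - min(r, g, b)          # spread between largest and smallest
    s = 0.0 if v == 0 else delta / v  # saturation: spread relative to value
    if delta == 0:
        return 0.0, s, v              # gray: hue is undefined, reported as 0
    if v == r:
        h = ((g - b) / delta) % 6     # sector between magenta and yellow
    elif v == g:
        h = (b - r) / delta + 2       # sector between yellow and cyan
    else:
        h = (r - g) / delta + 4       # sector between cyan and magenta
    return h / 6, s, v                # scale hue from [0, 6) to [0, 1)


# Cross-check against the standard library for one sample color.
print(rgb_to_hsv(0.2, 0.4, 0.6))
print(colorsys.rgb_to_hsv(0.2, 0.4, 0.6))
```

Under this definition, value is the maximum of R, G and B, saturation is the spread between maximum and minimum relative to the maximum, and hue is the position on the color circle scaled to the range 0 to 1.0.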

IV. HSV COLOR SPACE WITH CLUSTERING TECHNIQUES
The proposed system consists of three modules: feature extraction, clustering of images, and finding similar images.

Feature Extraction
An RGB color image set is converted into HSV color images using the RGB-to-HSV conversion technique discussed in Section III (Color Conversion).
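One plausible way to turn the converted pixels into a feature vector is a quantized HSV histogram. The sketch below is an assumption for illustration: the paper does not state its bin layout, so the 8 x 3 x 3 quantization, the function name `hsv_histogram` and the toy pixel list are all hypothetical.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 8, 3, 3  # hypothetical quantization: 72 bins in total


def hsv_histogram(pixels):
    """Return a normalized HSV histogram for an iterable of (r, g, b) in 0..255."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # min() keeps a component equal to 1.0 inside the last bin.
        hb = min(int(h * H_BINS), H_BINS - 1)
        sb = min(int(s * S_BINS), S_BINS - 1)
        vb = min(int(v * V_BINS), V_BINS - 1)
        hist[(hb * S_BINS + sb) * V_BINS + vb] += 1
        count += 1
    # Normalizing makes images of different sizes comparable.
    return [x / count for x in hist] if count else hist


# Toy "image": three red pixels and one blue pixel.
features = hsv_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
print(len(features), max(features))
```

Each image is then described by a fixed-length vector, which is the form the clustering and distance computations in the following steps require.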

Clustering of Images using K-Means Clustering
The images in the database are clustered using a clustering technique, the representative bin number of each cluster is found, and the query image is compared only with the cluster representatives. Given the cluster number K, the K-Means algorithm is carried out in three steps:
1. Initialisation: set K seed points and assign each object to the cluster with the nearest seed point.
2. Recompute the seed points as the centroids of the clusters of the current partition (the centroid is the centre, i.e., the mean point, of the cluster).
3. Reassign each object to the cluster with the nearest seed point and return to Step 2; stop when no assignment changes.
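The K-Means steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the seed points are passed in explicitly, Euclidean distance is used for the assignment, and the names `kmeans`, `seeds` and `max_iter` are hypothetical.

```python
import math


def kmeans(points, seeds, max_iter=100):
    """Cluster equal-length feature vectors; return (centroids, labels)."""
    centroids = [list(s) for s in seeds]  # initialisation: the given seed points
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assign each vector to the cluster with the nearest seed point.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # stop when no assignment changes
            break
        labels = new_labels
        # Recompute each seed point as the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous seed point
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Two well-separated groups; both seeds are deliberately taken from the first.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centroids, labels = kmeans(points, seeds=[points[0], points[1]])
print(labels)
```

Even with both seeds placed in the same group, the centroid update pulls one seed towards the second group in this example, and the assignments settle after a few iterations. In general the outcome depends on the seeds, so several runs with different seeds may be needed.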

V. RESULTS AND DISCUSSION
Given a query image, the cluster whose features are closest to the query image feature vector is retrieved. The efficiency of the proposed system is measured by the quality of clustering and the correct retrieval of face images. When clustering is compared with retrieval by the Euclidean distance similarity measure alone, clustering proves more efficient in retrieving images. With the Euclidean distance metric alone, similar images are retrieved together with irrelevant images that do not match the query image, which leads to confusion. The clustering technique, in contrast, minimizes the irrelevant images and classifies images by the distance values obtained from the clusters.
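The two-step search, nearest cluster first and then distances within that cluster, can be sketched as follows. The function name `retrieve`, the `top_n` parameter and the toy feature vectors are assumptions made for illustration; the centroids and labels stand in for the output of a prior clustering run.

```python
import math


def retrieve(query, centroids, labels, features, top_n=3):
    """Return indices of database images ranked within the nearest cluster."""
    # Step 1: find the cluster whose centroid is nearest to the query vector.
    nearest = min(range(len(centroids)),
                  key=lambda c: math.dist(query, centroids[c]))
    # Step 2: compare the query only with images assigned to that cluster.
    candidates = [i for i, lab in enumerate(labels) if lab == nearest]
    candidates.sort(key=lambda i: math.dist(query, features[i]))
    return candidates[:top_n]


# Toy database of six feature vectors already split into two clusters.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = [0, 0, 0, 1, 1, 1]
centroids = [(0.033, 0.033), (5.033, 5.033)]
result = retrieve((4.9, 5.0), centroids, labels, features, top_n=2)
print(result)
```

Because only the members of one cluster are ranked, images from other clusters cannot appear in the result, which is the mechanism by which irrelevant images are excluded.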

VI. CONCLUSION
In this paper, an approach for Content Based Image Retrieval using HSV color features is proposed. The K-Means clustering technique is applied so that the images are initially clustered into groups with similar HSV color content. The chosen group is then clustered again using the K-Means clustering algorithm. K-Means, a clustering method based on the optimization of an overall measure of clustering quality, is known for its efficiency in producing accurate results in image retrieval.
Since each cluster obtained is a unique set of similar images, the user can select an image set of their choice and further refine the search by applying the K-Means technique. The images are retrieved efficiently and classified according to the cluster distance value.

Figure 3 - Block Diagram of Proposed System

QBIC [Niblack et al., 1993] supports users in retrieving images by color, shape and texture. QBIC provides several query methods: Simple Query, Multi-Feature Query and Multi-Pass Query. A few of the techniques have used global color and texture features [Niblack et al., 1993; Pentland, 1994; Markus Stricker & Markus Orengo, 1995], whereas a few others have used local color and texture features [Natsev et al., 1999; Jia Li et al., 2000; Carson et al., 2002; Chen & Wang, 2002]. The latter approach segments the image into regions based on color and texture features. The regions are close to human perception and are used as the basic building blocks for feature computation and similarity measurement. These systems are called region-based image retrieval (RBIR) systems and have proven to be more efficient in terms of retrieval performance. Traditional text-based image search engines rely on manual annotation of images and use text-based retrieval methods. Manual annotation has two limitations: it does not scale to large databases, and the annotations are valid for only one language. With content-based image retrieval, these limitations should not exist. Text-based retrieval is further limited by the subjectivity of human perception and by the responsibility it places on the end user. Some queries cannot be described in words at all, but tap into the visual features of images. Among image retrieval methods, CBIR is an approach that relies exclusively on the visual features of the images, such as color histogram, texture and shape. One obvious advantage of CBIR over other methods, e.g., text-based image retrieval (TBIR), is that CBIR can be performed as a fully automatic process, since the visual features are extracted automatically.