Robust Image Hashing Using Radon Transform and Invariant Features

A robust image hashing method based on the radon transform and invariant features is proposed for image authentication, image retrieval, and image detection. Specifically, an input image is first converted into a counterpart with a normalized size. Then the invariant centroid algorithm is applied to obtain the invariant feature point and the surrounding circular area, and the radon transform is employed to acquire the mapping coefficient matrix of the area. Finally, the hashing sequence is generated by combining the feature vectors and the invariant moments calculated from the coefficient matrix. Experimental results show that this method can resist not only normal image processing operations but also some geometric distortions. Comparisons of receiver operating characteristic (ROC) curves indicate that the proposed method outperforms some existing methods in classification performance with respect to perceptual robustness and discrimination.


Introduction
In the information era, copying and distributing multimedia data have become increasingly convenient. However, illegal tampering with and misuse of images is a serious issue. To address this problem, traditional cryptographic hash functions and digital fragile watermarking techniques have been proposed. Hash functions such as SHA-1 and MD5 are extremely sensitive to the input data, because a single-bit change results in large changes in the hash result [1]. In digital fragile watermarking schemes, the original image is modified to embed the fragile watermark, which may degrade the perceptual characteristics of the original image [2]. Therefore, image hashing has been developed for image authentication.
Image hashing transforms an original image into a short sequence that represents its content, and it is widely used in image authentication, copy detection, image retrieval, etc. Even if the original image undergoes normal image processing during transmission that does not affect its perceptual content, the extracted hash sequence remains essentially unchanged, so authentication still succeeds. Perceptual robustness and discrimination are the two key issues of image hashing. Perceptual robustness requires that the image hash be invariant to content-preserving operations. Discrimination requires that the image hash be able to distinguish visually distinct images.
A variety of image hashing schemes have been proposed in recent years, including schemes based on various transformations, schemes based on matrix factorization, and schemes based on invariant features. Various transformations have been applied to image hash generation. Monga and Evans propose an image hashing method using the discrete wavelet transform (DWT), but the method cannot resist contrast adjustment, gamma correction and large-angle rotation [3]. Lin and Chang develop a hashing method based on the discrete cosine transform (DCT) of non-overlapping image blocks, but the method is not tolerant to geometric distortions [4]. Lefebvre et al. propose an image hashing method using the important features extracted in the radon transform, which has good robustness but lacks the ability to resist some conventional attacks [5], [6]. Lei et al. describe a robust scheme using the radon transform and moments for generating the hashing sequence [7]. The scheme can resist geometric transforms due to the invariant moments of the radon domain. Wu, Zhou and Niu suggest a robust hashing method based on the radon and wavelet transforms which can resist print-scan attacks, but the method does not offer good key-dependent security [8]. Other improved radon-based hashing methods using image normalization and key-dependent secure hashing methods based on the radon transform have also been proposed [9], [10]. These methods are robust to small-angle rotation, but their discrimination is not satisfactory. Swaminathan, Mao and Wu utilize the Fourier coefficients for hash generation, but the scheme can only resist some common attacks [11]. Xiang, Kim and Huang present a method using the invariant shape features of the histogram, but the method is not sensitive to changes of image content [12]. Qin, Chang and Tsou [13] propose a hashing method based on non-uniform sampling.
Some other methods using the matrix factorization have been proposed. Kozat, Mihcak and Venkatesan divide the image into blocks and conduct the singular value decomposition (SVD) in each one for image hashing generation [14]. The non-negative matrix factorizations (NMF) are utilized to form image hashing, but these algorithms are sensitive to watermark embedding [15], [16]. A hashing method using the compressive sensing (CS) is proposed, but its discrimination is limited [17]. Tang et al. develop a dictionary framework to generate the image hash sequence, but its computational complexity and the tamper detection capability need to be improved [18].
There are also some methods using invariant features. Zhao et al. propose hashing methods that combine local texture features and Zernike moments for image hash generation, but these methods cannot resist high-strength noise addition and some geometric attacks, e.g. cropping [19], [20]. Tang et al. describe some robust hashing methods based on invariant features, but these algorithms also cannot resist noise addition and some geometric attacks [21], [22]. Liu, Cheng and Leung employ the wave atom transform in their hashing method, but the method does not perform well against geometric attacks such as scaling and translation [23]. Most of the aforementioned hashing methods are sensitive to geometric attacks. Although some methods are robust against geometric transforms, their discriminative capabilities are not satisfactory.
In our previously published paper, we proposed a hashing method based on log-polar mapping (LPM) and the contourlet transform. The sub-band image is divided into non-overlapping blocks, and low and middle DCT frequency coefficients are selected from each block. The singular value decomposition (SVD) is applied to obtain the first digit of the maximum singular value. Finally, the features are scrambled and quantized as the secure hash bits [24]. In this paper, we propose a new image hashing method based on the radon transform and invariant features, which is robust to ordinary image processing operations including JPEG compression, filtering, noise contamination, scaling, translation and rotation. The invariant centroid point and the circular area around it help to resist geometric attacks. The radon transform is employed to obtain the rotation-invariant property [7]. The final image hashing sequence is then generated in the radon domain. A secret key is introduced in the feature extraction. The method is robust against normal image processing and geometric distortions, and also has good discrimination for perceptually different images.
The rest of this paper is organized as follows. Section 2 introduces some useful tools and concepts. Section 3 describes the details of the proposed scheme. Experimental results are presented in Sec. 4. Finally, conclusions are drawn in Sec. 5.

Radon Transform
Radon transform is an effective method to analyze the signal between the spatial domain and its projection space.
Let $g(r, \theta)$ denote the radon transform of a two-dimensional image $f(x, y)$, which is defined as its line integral along the line inclined at an angle $\theta$ from the $x$-axis and at a distance $r$ from the origin. The mathematical expression can be written as

$$g(r, \theta) = R\{f(x, y)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy,$$

where $\delta(\cdot)$ is the impulse function, $r = x\cos\theta + y\sin\theta$, and $0 \le \theta < 2\pi$. The radon transform has excellent properties with respect to geometric transformations, i.e. translation, scaling and rotation.
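As an illustration of the definition, the following is a minimal sketch of a discrete radon transform, not the implementation used in the experiments. It assumes a grayscale image stored as a 2-D NumPy array, measures coordinates from the image center, and replaces the impulse function by rounding $r$ to the nearest integer bin. The function name `radon_transform` and the parameter `n_angles` are illustrative choices.

```python
import numpy as np

def radon_transform(image, n_angles=360):
    """Discrete radon transform by binning r = x*cos(theta) + y*sin(theta)."""
    f = np.asarray(image, dtype=float)
    h, w = f.shape
    rows, cols = np.mgrid[0:h, 0:w]
    # Coordinates are measured from the image center (an assumption of this sketch).
    x = cols - (w - 1) / 2.0
    y = rows - (h - 1) / 2.0
    # Largest possible |r| is half the image diagonal.
    r_max = int(np.ceil(np.hypot(h, w) / 2.0))
    n_bins = 2 * r_max + 1
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    g = np.zeros((n_bins, n_angles))
    for k, theta in enumerate(thetas):
        r = x * np.cos(theta) + y * np.sin(theta)
        # Nearest-integer binning stands in for the impulse function.
        bins = np.rint(r).astype(int) + r_max
        g[:, k] = np.bincount(bins.ravel(), weights=f.ravel(), minlength=n_bins)
    return g, thetas
```

Each column of `g` is one projection; every pixel contributes to exactly one bin per angle, so each column sums to the total image intensity. Nearest-integer binning is coarse, and an interpolating implementation would give smoother projections.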

Invariant Moments
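Invariant moments, first introduced by Hu [25], are invariant to translation, scaling and rotation. The aim of choosing invariant moments as image features is to make the hash resilient to image rotation. Let $f(x, y)$ be the gray value of a pixel in a digital image of size $m \times n$, where $0 \le x \le m$ and $0 \le y \le n$. The seven invariant moments are defined as follows:

$$\begin{aligned}
\phi_1 &= \eta_{20} + \eta_{02},\\
\phi_2 &= (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2,\\
\phi_3 &= (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2,\\
\phi_4 &= (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2,\\
\phi_5 &= (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right]\\
&\quad + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right],\\
\phi_6 &= (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}),\\
\phi_7 &= (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right]\\
&\quad - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right],
\end{aligned}$$

where $\eta_{pq}$ ($p, q = 0, 1, 2, \ldots$) are the normalized central moments defined as

$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}, \qquad \gamma = \frac{p + q}{2} + 1,$$

$\mu_{pq}$ are the central moments calculated by

$$\mu_{pq} = \sum_{x}\sum_{y} (x - \bar{x})^p (y - \bar{y})^q f(x, y), \qquad \bar{x} = \frac{M_{10}}{M_{00}}, \quad \bar{y} = \frac{M_{01}}{M_{00}},$$

and $M_{pq}$ are the $(p + q)$-th order moments:

$$M_{pq} = \sum_{x}\sum_{y} x^p y^q f(x, y).$$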
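The computation of the seven invariant moments can be sketched as follows. This is a minimal sketch using the standard Hu definitions, assuming a grayscale image stored as a 2-D NumPy array with non-zero total intensity; the function name `hu_moments` and the helper `eta` are illustrative.

```python
import numpy as np

def hu_moments(image):
    """Return the seven Hu invariant moments of a 2-D gray image."""
    f = np.asarray(image, dtype=float)
    x, y = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    m00 = f.sum()                      # zero-order moment
    x_bar = (x * f).sum() / m00        # centroid coordinates
    y_bar = (y * f).sum() / m00

    def eta(p, q):
        # Central moment normalized by m00 ** ((p + q) / 2 + 1).
        mu = ((x - x_bar) ** p * (y - y_bar) ** q * f).sum()
        return mu / m00 ** ((p + q) / 2.0 + 1.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03        # recurring sums
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])
```

On a discrete grid, a shift by whole pixels (for example zero padding on one side) and a rotation by 90 degrees (`np.rot90`) leave the returned values unchanged up to rounding error, which is a convenient sanity check.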

Image Hashing Scheme
Generally, an image hashing scheme is composed of two parts: feature extraction and hash generation. Feature extraction is the crucial stage of the hashing scheme. The important robust features are extracted to represent the main contents of the original image. These features then go through quantization procedures to form the final hash sequence. The resulting hash sequence should be robust to operations that leave the image perceptually similar, and should be able to distinguish visually distinct images.

Image Preprocessing
In order to obtain a normalized image, the original image is rescaled to a fixed size $k \times k$ with bilinear interpolation. This is based on the consideration that real images with different sizes should generate hashes with a fixed length and with the same computational complexity. This step also ensures that the proposed hashing can resist image rescaling to a certain extent. We choose $k = 512$ in the experiment.
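The normalization step can be sketched as follows. This is a minimal sketch of bilinear rescaling for a 2-D grayscale array, assuming an align-corners sampling grid (the first and last samples coincide with the image borders); the sampling convention and the name `resize_bilinear` are assumptions of the sketch.

```python
import numpy as np

def resize_bilinear(image, k=512):
    """Rescale a 2-D gray image to k x k with bilinear interpolation."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    # Sample positions expressed in source-image coordinates.
    rows = np.linspace(0.0, h - 1, k)
    cols = np.linspace(0.0, w - 1, k)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)     # clamp at the last row / column
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[:, None]          # fractional offsets
    fc = (cols - c0)[None, :]
    # Interpolate along columns first, then along rows.
    top = img[np.ix_(r0, c0)] * (1 - fc) + img[np.ix_(r0, c1)] * fc
    bottom = img[np.ix_(r1, c0)] * (1 - fc) + img[np.ix_(r1, c1)] * fc
    return top * (1 - fr) + bottom * fr
```

Because every input is mapped to the same $k \times k$ grid, all later stages operate on arrays of identical shape regardless of the original image size.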

Invariant Centroid Algorithm and the Circular Area Extraction
The invariant centroid algorithm is a method to extract an invariant feature point of an image. The point remains unchanged even if the image undergoes some normal image processing and some geometric transforms. Thus the extraction of the invariant centroid point is crucial to the robustness of the hashing scheme. By applying an iterative approach, we can obtain the invariant centroid point, which is close to the center point of the image. Assume that the centroid point $C = (C_x, C_y)$ of the original image $F(x, y)$ is calculated as follows:

$$C_x = \frac{\sum_{x}\sum_{y} x\,F(x, y)}{\sum_{x}\sum_{y} F(x, y)}, \qquad C_y = \frac{\sum_{x}\sum_{y} y\,F(x, y)}{\sum_{x}\sum_{y} F(x, y)},$$

where $(x, y) \in M \times N$, that is to say that they belong to the entire image. The main steps of the scheme are as follows.
(1) Calculate a centroid point of the original image as $C_0$, which is regarded as the initial value $C_b$ of the invariant centroid, $C_b = C_0$.
(2) Taking $C_b$ as the center point, extract the centroid point $C_r$ in the circular area with radius $r_i$. If $C_r \ne C_b$, set $C_b = C_r$, take $C_b$ as the center point and $r_i$ as the radius, and extract the centroid point again. This procedure ends when the centroid points of two successive extractions are the same. The final point is the invariant centroid point, and the one near the center point is used to obtain the circular area.
(3) After the extraction of the invariant centroid, the point is set as the center of the circle and a circular area is obtained around it, where $R$ is the radius of the circular area.
Usually, $R$ should be a bit smaller than the length and width of the original image in order to keep the area unchanged after translation, cropping and rotation. Finally, the image hash is obtained by extracting the important robust features in the circular area.
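The iteration described above can be sketched as follows. This is a minimal sketch under stated assumptions: the centroid is taken to be the intensity-weighted mean position, convergence is declared when two successive centroids differ by less than a tolerance `tol` (the text requires them to be the same), and the names `centroid`, `disc_mask`, `invariant_centroid` and `circular_area` are illustrative.

```python
import numpy as np

def centroid(image, mask=None):
    """Intensity-weighted centroid (row, col), optionally inside a mask."""
    f = np.asarray(image, dtype=float)
    if mask is not None:
        f = f * mask
    rows, cols = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    total = f.sum()
    if total == 0:
        raise ValueError("region has zero total intensity")
    return np.array([(rows * f).sum(), (cols * f).sum()]) / total

def disc_mask(shape, center, radius):
    """Boolean mask of the disc with the given center and radius."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def invariant_centroid(image, radius, tol=0.5, max_iter=100):
    """Iterate the centroid inside a disc until it stops moving."""
    c_b = centroid(image)                       # initial value C_0
    for _ in range(max_iter):
        c_r = centroid(image, disc_mask(np.shape(image), c_b, radius))
        if np.hypot(*(c_r - c_b)) < tol:        # two successive points agree
            return c_r
        c_b = c_r
    return c_b                                  # fallback if no convergence

def circular_area(image, center, big_radius):
    """Keep the pixels inside the disc of radius R and zero the rest."""
    f = np.asarray(image, dtype=float)
    return f * disc_mask(f.shape, center, big_radius)
```

A typical use would be `c = invariant_centroid(img, radius)` followed by `area = circular_area(img, c, R)`. When the image content is translated, the returned point moves with it, so the extracted area covers the same content.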

1) Local Feature Extraction
For the local features, four feature values, namely the zero-order moment, the variance, the maximum singular value and the DC component of the DCT, are obtained from each selected row of the radon coefficient matrix, and these values form a feature vector. All of these features perform well in resisting normal image processing and geometric distortions.
A rotation of the image causes a cyclic translation of the radon coefficients along the $\theta$ axis. Assuming that the rotation angle is $\varphi$, which shifts each coefficient line cyclically by $s$ samples, the zero-order moment of the $i$-th coefficient line is defined as follows:

$$M_0(i) = \sum_{j=1}^{len} g_i(j).$$

Thus the following relationship holds when the rotation happens:

$$M_0'(i) = \sum_{j=1}^{len} g_i\big((j + s) \bmod len\big) = M_0(i).$$

The zero-order moment of a radon coefficient line can therefore be regarded as an invariant feature. The variance of the corresponding line can also be used as a feature value:

$$\sigma_i^2 = \frac{1}{len}\sum_{j=1}^{len} \big(g_i(j) - m\big)^2,$$

where $i$ and $j$ respectively represent the index of the line and the index of the coefficient within the line, $len$ represents the length of a line, $m$ indicates the mean value of the line, and $g_i(j)$ denotes the coefficient values of each line. Since the SVD performs well in resisting geometric attacks [14], the maximum singular value of each line is taken as a feature value. We apply the discrete Fourier transform to each line to resist translation attacks, since the generated coefficients remain constant.
Because the real parts of the coefficients are more stable, whereas the imaginary parts can easily be changed by some common attacks, the real parts of the coefficients are extracted to compose another matrix, which is not sensitive to the changes caused by ordinary attacks. The DCT is then applied to this matrix, and the direct current (DC) coefficient is taken as a feature value, because the DCT represents the energy of the image well and the DC coefficient remains stable.
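The four local feature values of one coefficient line can be sketched as follows. This is a minimal sketch of one possible reading of the description, not a definitive implementation: the variance is normalized by the line length, the line is treated as a $1 \times len$ matrix for the SVD, and the DC value is taken from an orthonormal DCT-II of the real parts of the DFT of the same line (for which the DC coefficient equals the sum divided by the square root of the length). The function name `local_features` is illustrative.

```python
import numpy as np

def local_features(line):
    """Zero-order moment, variance, max singular value and DCT DC value."""
    g = np.asarray(line, dtype=float)
    zero_moment = g.sum()
    variance = np.mean((g - g.mean()) ** 2)
    # A single line is a 1 x len matrix with exactly one singular value.
    max_singular = np.linalg.svd(g[None, :], compute_uv=False)[0]
    # Real parts of the DFT coefficients of the line.
    real_dft = np.fft.fft(g).real
    # DC coefficient of an orthonormal DCT-II: sum / sqrt(length).
    dc = real_dft.sum() / np.sqrt(g.size)
    return np.array([zero_moment, variance, max_singular, dc])
```

For example, `local_features(np.roll(line, 3))[:3]` matches `local_features(line)[:3]`, which illustrates that the zero-order moment, the variance and the singular value are unaffected by a cyclic shift of the line.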

2) Global Feature Extraction
For the global features, the invariant moments of the radon coefficient matrix are extracted by applying the Hu moments of Sec. 2.2. The global hash is obtained by combining the seven Hu moments. The global features represent the image content well, resist geometric attacks, and have the desired stability.

Image Hash Construction
A robust image hashing method should resist not only normal image processing, such as JPEG compression, noise addition, filtering and other manipulations, but also geometric distortions such as rotation, scaling and translation. Therefore, the hashing sequences extracted from perceptually identical images should be the same or similar, and the hashing sequences extracted from perceptually different images should be significantly different. The distance between the hashes of two different images should be large enough.
The proposed hash generation algorithm is illustrated in Fig. 1. The steps of the algorithm are as follows.
Input: the original image I, the key K
Output: the hashing sequence H
Begin:
(1) Image preprocessing.
(2) Calculate the invariant centroid of the normalized image.
(3) Extract the circular area around the invariant centroid.
(4) Apply the radon transform to the circular area.
(5) Generate a random sequence using the logistic chaotic system based on K, and use it to select 15 coefficient lines. For each selected coefficient line, compute the zero-order moment, variance, maximum singular value and DC component of the DCT to obtain the local features.
End
The final hashing sequence H is formed by combining the local hashes with the global hashes. The hash length L is 67 ($67 = 4 \times 15 + 7$) decimal digits.
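The key-dependent selection of coefficient lines can be sketched as follows. This is a minimal sketch: the logistic map $x_{n+1} = \mu x_n (1 - x_n)$ is iterated from an initial value derived from the key, and the lines are chosen by ranking the chaotic values. The parameter `mu = 3.99`, the burn-in length, the ranking rule and the function names are assumptions of the sketch rather than details given in the text.

```python
def logistic_sequence(key, n, mu=3.99, burn_in=100):
    """Return n values of the logistic map started from the key in (0, 1)."""
    if not 0.0 < key < 1.0:
        raise ValueError("key must lie strictly between 0 and 1")
    x = key
    for _ in range(burn_in):           # discard the initial transient
        x = mu * x * (1.0 - x)
    seq = []
    for _ in range(n):
        x = mu * x * (1.0 - x)
        seq.append(x)
    return seq

def select_lines(key, n_lines, n_select=15):
    """Pick n_select distinct line indices by ranking the chaotic values."""
    seq = logistic_sequence(key, n_lines)
    order = sorted(range(n_lines), key=lambda i: seq[i])
    return sorted(order[:n_select])

# Hash length: four local values for each of 15 lines plus seven Hu moments.
HASH_LENGTH = 4 * 15 + 7
```

With these choices, `select_lines(0.3141, n_lines)` always returns the same 15 indices for the same key, and `HASH_LENGTH` evaluates to 67.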

Image Authentication
In the image authentication stage, the hash $H_1$ is first generated from the suspect image with the same process as the original hash $H$. Then $H_1$ is compared with the original $H$ using the Euclidean distance, which is defined as follows:

$$d = \sqrt{\sum_{i=1}^{L} \big(H(i) - H_1(i)\big)^2},$$

where $H(i)$ and $H_1(i)$ represent the $i$-th value of the original hashing sequence and the extracted hashing sequence, respectively. In general, the distance $d$ between perceptually similar images is small, and the distance between perceptually different images is relatively large. A threshold $T$ can be set according to the experiments: if $d \le T$, the image is authentic and the two images are visually similar; otherwise the two images are perceptually different.
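The comparison can be sketched as follows. This is a minimal sketch in which hashes are plain sequences of numbers and the threshold is passed in by the caller; the function names are illustrative.

```python
import math

def hash_distance(h, h1):
    """Euclidean distance between two hash sequences of equal length."""
    if len(h) != len(h1):
        raise ValueError("hash sequences must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h, h1)))

def is_authentic(h, h1, threshold):
    """Accept the suspect image when the distance does not exceed T."""
    return hash_distance(h, h1) <= threshold
```

For instance, `hash_distance([0, 0], [3, 4])` is 5.0, so the pair is accepted with a threshold of 10 and rejected with a threshold of 4.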

Experimental Results
The proposed scheme is run on Windows 7 and implemented in MATLAB R2012a. Several standard test images of size $512 \times 512$ are used, such as 'lena', 'baboon', 'barbara', 'fullgold', 'airplane' and 'pepper'. The image database of Columbia University is also used for the overall performance analysis. We subject these test images to commonly used image manipulations and malicious tampering to analyze the experimental results.
Robustness means that the hash values do not change significantly after ordinary image processing and geometric distortions, which ensures image authentication. Discrimination means that the hash values differ when the images are perceptually distinct or maliciously tampered with, which reflects image integrity. Here, we mainly validate the robustness and discrimination of the proposed hashing scheme.

Security Analysis
During the image hash generation, a logistic chaotic scrambling step is introduced, so the corresponding key is required. Moreover, the chaotic system is sensitive to its initial value, so even a small difference in the key leads to a widely different sequence. If the key is correct, the right hash sequence is obtained; if the key is not correct, the extracted hash sequence is different. Thus the proposed hash sequence has better security.

Robustness Analysis
The robustness experiments are aimed at content-preserving attacks. The hash values should remain similar under these attacks, which include ordinary image processing operations such as JPEG compression, Gaussian noise, salt and pepper noise, median filtering and Gaussian filtering, as well as geometric distortions such as scaling, translation and rotation. We extracted the image hashes of the original images and their distorted versions, and then calculated their distances. Due to space limitations, only the results for five standard color images of size $512 \times 512$, namely Lena, Peppers, Baboon, Airplane and Barbara, are given as examples. The experiments demonstrate that the proposed scheme can resist not only ordinary image processing but also geometric attacks. The manipulations and corresponding parameters used in the robustness experiments are listed in Tab. 1. The results are shown in Fig. 2. In the preprocessing stage, the original images go through an image normalization procedure, so that all images have the same size, which provides resistance to scaling distortion. Due to the extraction of the invariant centroid and the circular area, the proposed scheme can resist translation and rotation attacks. Figure 2 also shows the performance in resisting geometric attacks. The image hashes of the original image and the distorted versions are extracted, and their similarities are then calculated using the Euclidean distance. The y-axis d is the Euclidean distance between the image hashes, and the x-axis represents the parameter values of the different operations. Table 2 presents the minimum, maximum and mean distance of each manipulation and its standard deviation. It can be seen that all the Euclidean distances are less than 10.

Discrimination Analysis
Discrimination means that the proposed scheme should be able to distinguish different images and the malicious operations from the content-preserving ones. In the paper, the Euclidean distance is applied to analyze the discrimination of image hashing. The greater the Euclidean distance between images, the better the discrimination.
Here, we randomly select 200 images from the image database of Columbia University for the discrimination analysis. These images cover (but are not limited to) different categories including people, buildings, landscapes and so on. The sizes of these images range from $256 \times 256$ to $2048 \times 1536$. We extract the hash sequences of the different images and calculate the Euclidean distance between each pair of hashes. In this way 19900 results are obtained, and their distribution is shown in Fig. 3. The x-axis is the Euclidean distance and the y-axis represents the number of image pairs. The minimum, maximum, mean and standard deviation of the hash distances are 6.40, 184.01, 55.01, and 25.97, respectively. As shown in Fig. 3, a small threshold improves discrimination performance, but it may simultaneously reduce perceptual robustness. Therefore, a suitable threshold should be selected in practical applications. When the threshold equals 8.0, almost no two different images are falsely identified as similar images. When the threshold reaches 10, only 0.12% of the pairs of different images are mistakenly classified as visually identical images. If the Euclidean distance between two hashes is greater than the threshold, the images are judged to be perceptually different; otherwise they are judged to be similar, and a pair of different images goes undetected. For example, we choose the threshold T = 10 for image authentication, which gives excellent detection performance.
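The pairwise evaluation can be sketched as follows. This is a minimal sketch assuming that the hashes of the different images are available as equal-length numeric sequences; the function names are illustrative and no experimental data is reproduced here.

```python
import itertools
import math

def pairwise_distances(hashes):
    """Euclidean distance for every unordered pair of hash sequences."""
    return [math.dist(a, b) for a, b in itertools.combinations(hashes, 2)]

def false_accept_rate(distances, threshold):
    """Fraction of different-image pairs whose distance is within T."""
    return sum(d <= threshold for d in distances) / len(distances)

# Number of unordered pairs among 200 images.
N_PAIRS = math.comb(200, 2)
```

`N_PAIRS` evaluates to 19900, which matches the number of results reported above. Sweeping `threshold` in `false_accept_rate` over the distances of different images shows how the share of falsely accepted pairs grows with the threshold.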
According to the comparative experiments of different images in Tab. 3, the Euclidean distances of different images are all greater than 10. Thus the proposed scheme has the ability of distinguishing different images.
To further demonstrate the performance of the proposed scheme under content-changing attacks, we consider the four examples of cut-and-paste image editing shown in Fig. 4. The Euclidean distances between the original images and their distorted versions are greater than 10, thus we can correctly declare the distorted images inauthentic. We are concerned with malicious operations that affect less than 30% of the image, since attacks on small regions are the rational ones. To some degree, we can claim that the proposed scheme is able to distinguish malicious operations from content-preserving ones.

Performance Comparisons
In this subsection, we compare the proposed scheme with several reported hashing methods. Zhao, Wang and Zhang [20] proposed a method that also combines local features and global features for image hashing, but the method cannot resist high-strength noise addition, because the noise may influence the extraction of the salient regions. The method is also not robust against cropping and large-angle rotation. Since our features are extracted from the radon domain, the method proposed by Lei, Wang and Huang [7] is selected for comparison. Moreover, the proposed scheme uses the SVD and invariant moments, so the methods proposed by Kozat et al. [14] and Tang et al. [21] are also selected for comparison. The proposed scheme is mainly aimed at grayscale images, and all color images are converted to grayscale in the experiments. The same Euclidean distance computation is applied in the analysis. We obtain the similar images by applying eight content-preserving operations to 200 original images of the above database. As perceptually distinct images, we take images with different content and forged images produced by pasting a foreign block, whose size is more than 30%, into the 200 original images. The receiver operating characteristic (ROC) is a useful tool for visualizing classification performance between robustness and discrimination. The true positive rate (TPR) and the false positive rate (FPR) are calculated first. TPR and FPR are defined as follows, respectively:

$$\mathrm{TPR} = \frac{n_1}{N_1}, \qquad \mathrm{FPR} = \frac{n_2}{N_2}.$$

Here $n_1$ is the number of pairs of visually identical images that are considered similar, $N_1$ is the total number of pairs of visually identical images, $n_2$ is the number of pairs of different images that are considered similar, and $N_2$ is the total number of pairs of different images. TPR and FPR thus indicate robustness and discrimination, respectively. If two methods have the same TPR, the one with the lower FPR outperforms the one with the higher FPR. Similarly, if they have the same FPR, the one with the higher TPR is better. To compare the proposed scheme with the methods in references [7], [14], [21], we set different threshold values and calculate the TPRs and FPRs of the different methods. Repeating this process over the thresholds yields the ROC graph shown in Fig. 5, with TPR and FPR as the y-axis and x-axis, respectively.
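The computation of the ROC points can be sketched as follows. This is a minimal sketch assuming two lists of hash distances, one for pairs of visually identical images and one for pairs of different images, and assuming that a pair is considered similar when its distance does not exceed the threshold. The function name and the toy numbers in the usage note are illustrative, not experimental results.

```python
def roc_points(similar_distances, different_distances, thresholds):
    """Return (threshold, FPR, TPR) for each threshold value."""
    points = []
    for t in thresholds:
        # n1: identical-image pairs accepted as similar, out of N1 pairs.
        n1 = sum(d <= t for d in similar_distances)
        # n2: different-image pairs accepted as similar, out of N2 pairs.
        n2 = sum(d <= t for d in different_distances)
        tpr = n1 / len(similar_distances)
        fpr = n2 / len(different_distances)
        points.append((t, fpr, tpr))
    return points
```

With the made-up distances `[1, 2, 3, 12]` for similar pairs and `[8, 20, 30, 40]` for different pairs, the thresholds 5, 10 and 15 give the points (FPR, TPR) = (0.0, 0.75), (0.25, 0.75) and (0.25, 1.0). Plotting TPR against FPR over a fine grid of thresholds traces the ROC curve of one method.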
The method in reference [7] uses invariant moments and the DFT for image hash generation, which can resist some geometric distortions. The method in reference [14] uses SVD-SVD, and the method in reference [21] uses invariant moments, which can resist rotation distortion. However, these methods use only global features of the image and cannot balance robustness and discrimination well. The proposed scheme combines local features with global features for hashing and can therefore outperform these methods. As shown in Fig. 5, the ROC curve of the proposed scheme lies above the curves of the other methods, and the area under its ROC curve is larger than that of the other methods, which means that the proposed scheme is superior to the other three methods in terms of classification performance.

Conclusions
In this work, an image hashing method is developed that has good robustness against typical content-preserving operations, such as JPEG compression, filtering, noise contamination, scaling, translation and rotation, as well as good discrimination for perceptually distinct images. The key contribution of this work is the use of the radon transform and invariant features, that is, the combination of local features and global features. However, the success of the proposed scheme depends to a large extent on multiple transformations, so its computational complexity is higher than desirable. Meanwhile, the hash length is not the shortest among state-of-the-art methods. Further research is needed to extract features that better represent the image content while maintaining a short hash length and low computational complexity.