A GLCM-Feature-Based Approach for Reversible Image Transformation

Abstract: Recently, a reversible image transformation (RIT) technology that transforms a secret image into a freely selected target image was proposed. It can generate a stego-image that looks similar to the target image and can recover the secret image without any loss, and it has proved very useful in image content protection and reversible data hiding in encrypted images. However, existing RIT methods select the standard deviation (SD) as the only feature when matching secret and target image blocks; because the SD distributions of the two images may differ considerably, the matching result is not good and needs further improvement. Therefore, this paper proposes a gray-level co-occurrence matrix (GLCM) based approach for reversible image transformation, in which an effective feature extraction algorithm is used to increase the accuracy of block matching and thereby improve the visual quality of the transformed image, while the auxiliary information that records the transformation parameters is not increased. Thus, the visual quality of the stego-image is improved. Experimental results show that the root mean square error of the stego-image can be reduced by 4.24% compared with the previous method.

and RIT methods [Zhang, Wang, Hou et al. (2016)]. Zhang [Zhang (2011)] proposed the "VRAE" framework, in which the data hider first divides the encrypted image blocks into two sets and then embeds secret bits by flipping three LSBs of one set. To decrease the extracted-bit error rate, Hong et al. [Hong, Chen and Wu (2012)] evaluated the complexity of each image block. However, the "VRAE" method used by the cloud server must be agreed upon with the receiver. Ma et al. [Ma, Zhang, Zhao et al. (2013)] designed the "RRBE" framework, in which the image owner reserves room in the LSBs by an RDH method and encrypts the self-embedded image; the cloud server then embeds secret data into the reserved LSBs of the encrypted image. Cao et al. [Cao, Du, Wei et al. (2016)] compressed pixels in local patches by sparse representation and achieved more reserved room than previous methods. The complexity of this framework falls on the sender, who must reserve room for RDH by exploiting the redundancy within the image, and thus the RDH method used by the cloud server must be agreed upon with the sender. Therefore, the "VRAE" framework cannot guarantee that the encrypted image, after data extraction, can be decrypted by the receiver to obtain the original image, and the "RRBE" framework requires the sender to bear the algorithmic complexity, since the original image must be compressed at the sender to reserve room for data hiding. In other words, the RDH method used by the cloud server in these two frameworks is either receiver-related or sender-related. However, in a public cloud environment the cloud server may be only semi-honest and should not know the encryption or decryption methods, which concern only the sender and the receiver. Therefore, data embedding at the cloud server should not affect the encryption and decryption methods.
In other words, the cloud server should be able to use an arbitrary classic RDH method to embed secret information into an encrypted image that looks similar to another image, and such a framework is independent of both the receiver-related and sender-related frameworks. How to reversibly transform the original image into an encrypted image that is similar to another image is a more challenging problem, called "reversible image transformation" (RIT). The RIT method was designed for image privacy protection because the secret information is the image itself. Although the method of Yang et al. [Yang, Ouyang and Harn (2012)] can be used for "secret sharing" by embedding an image into several other images, the transmission and storage of multiple images limit its practicability. Therefore, it is challenging and important to hide one image inside another image of the same size, which is called "image transformation". The first image transformation technique was proposed by Lai et al. [Lai and Tsai (2011)]: they chose a target image similar to the secret image from an image database, transformed each secret block to generate the final stego-image according to the map between secret and target blocks, and then embedded the map. Lai et al.'s method is reversible, but the visual quality of the stego-image is not good because the auxiliary information is very large, and choosing a target image from a database takes extra time. Lee et al. [Lee and Tsai (2014)] improved Lai et al.'s method by transforming a secret image into a freely selected target image and reducing the auxiliary information. However, their method can only reconstruct a good estimate of the secret image, because the traditional color transformation is not reversible.
To overcome the shortcomings of Lai et al.'s and Lee et al.'s methods, Hou et al. [Hou, Zhang and Yu (2016)] presented a novel RIT method that transforms a secret image into a freely selected target image and obtains a stego-image similar to the target image by designing a reversible shift transformation. Before shifting image blocks, an effective clustering algorithm is used to match secret and target blocks, which both improves the visual quality of the transformed image and reduces the auxiliary information for recording block indexes. In this method, image blocks are paired by similar means and standard deviations (SDs) between the original and target images. Let a block B be a set of pixels B = {p₁, p₂, …, pₙ}; then the mean value and SD can be calculated as

u = (1/n) ∑_{i=1}^{n} pᵢ,  (1)
σ = √((1/n) ∑_{i=1}^{n} (pᵢ − u)²).  (2)

As shown in Fig. 1, RIT transforms the secret image into an image similar to the target image. In previous methods, the auxiliary information is embedded into the transformed image reversibly, and RDH guarantees that the stego-image can be returned to the transformed image completely. In addition, RIT guarantees that the transformed image can be restored to the secret image without error, and the secret image cannot be recovered from the target image alone.
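As a concrete illustration of Eqs. (1)-(2), the per-block mean and SD used for pairing can be computed as follows. This is a minimal sketch with illustrative names, not the authors' code; a block is represented as a flat list of pixel values.

```python
import math

def block_mean(block):
    """Mean value u of a block's pixel list, Eq. (1)."""
    return sum(block) / len(block)

def block_sd(block):
    """Population standard deviation of a block's pixel list, Eq. (2)."""
    u = block_mean(block)
    return math.sqrt(sum((p - u) ** 2 for p in block) / len(block))

block = [10, 12, 14, 16]
print(block_mean(block))  # -> 13.0
print(block_sd(block))
```

In SD-based RIT, these two scalars per block are the only matching features; the proposed method replaces the SD with a GLCM-derived complexity feature.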

Figure 1: Reversible image transformation
Inspired by the RIT method, Zhang et al. [Zhang, Wang, Hou et al. (2016)] transformed the original image into an encrypted image that looks like the target image and proposed an RDH-EI framework based on RIT. Since the correlation within the transformed image is not destroyed, the cloud server can embed secret bits by a traditional RDH method. The RDH method used by the cloud server is not affected by the encryption and decryption algorithms, and thus it is independent of both the sender and the receiver. Therefore, improving the visual quality of the transformed image and the stego-image in the RIT method used to encrypt the original image is a very meaningful task.

Proposed method
In this section, the proposed method is described in three steps: (1) feature extraction for block matching; (2) reversible shift and rotate transformation; (3) secret image extraction. The details are shown in Fig. 2.
To hide the secret image, the data hider first calculates the feature of each image block and uses the K-means clustering algorithm to divide the image blocks into K classes, then moves each secret block to the position of its matched target block. After that, the quantified mean difference between each secret block and its matched target block is added to every pixel of the secret block, so that the shifted block has a mean similar to that of the matched target block. The overflow/underflow pixels are then truncated, and each block is rotated to the optimal angle that minimizes the root mean square error (RMSE) between the rotated block and the matched target block.
In the RIT method, the auxiliary information contains the class indexes, quantified differences, small overflow/underflow information and rotation angles, which must be recorded and embedded into the transformed image by an RDH method. On the receiver side, after extracting the auxiliary information, the receiver rotates each block back by the recorded angle, restores the overflow/underflow pixels, removes the quantified differences and recovers the position of each secret image block. Finally, the secret image is recovered completely.

Feature extraction for block matching
As defined by Haralick et al. [Haralick, Shanmugam and Dinstein (1973)], 13 textural features can be derived from the normalized GLCM. These features measure different aspects of the GLCM, but many of them are correlated. Ulaby et al. [Ulaby, Kouyate, Brisco et al. (1986)] proved that energy, entropy, contrast and relevance are uncorrelated; they not only characterize texture complexity with high precision, but also reduce the computational burden.
To improve the visual quality of the transformed image, the proposed method chooses these GLCM descriptors as block features because their values concentrate in a small range close to zero and their frequency drops fast as the feature value increases. Suppose the pixel distance is d and the direction is θ ∈ {0°, 45°, 90°, 135°}, so that the displacement belongs to {(0, d), (d, d), (d, 0), (−d, d)}; the GLCM can then be denoted as P(i, j, d, θ). The main characteristic parameters are described as follows.
(1) Energy. The energy E₁ describes the uniformity of the gray-level distribution and the coarseness of the texture within an image block:

E₁ = ∑_{i=1}^{L} ∑_{j=1}^{L} P(i, j)²,  (3)

where L is the number of rows (or columns) of the GLCM, (i, j) is a GLCM coordinate, and d (d > 0) is the distance in the corresponding direction.
(2) Entropy. The entropy H₁ measures the amount of information contained in an image; if the texture is complex, the entropy is correspondingly large:

H₁ = −∑_{i=1}^{L} ∑_{j=1}^{L} P(i, j) log P(i, j).  (4)

(3) Contrast. The contrast C₁ describes the clarity and the texture depth of an image; when an image has a clear and deep texture, its contrast is correspondingly large:

C₁ = ∑_{i=1}^{L} ∑_{j=1}^{L} (i − j)² P(i, j).  (5)

(4) Relevance. The relevance R₁ measures the degree of similarity between the rows (columns) of the GLCM, and is positively related to the local gray-level correlation of the image:

R₁ = ∑_{i=1}^{L} ∑_{j=1}^{L} (i − μ₁)(j − μ₂) P(i, j) / (σ₁σ₂),  (6)

where μ₁ and μ₂ are the mean values over the rows (columns) of the GLCM, and σ₁ and σ₂ are the corresponding SDs. Note that E₁, H₁, C₁ and R₁ are texture feature parameters in a single direction, and the parameters in the four directions differ. The mean square errors (MSEs) of the four directions can be used to assign weights to the four directional parameters, which restrains the orientation dependence and makes the obtained texture features independent of direction, so the texture complexity of a sub-block can be calculated more accurately. Taking the energy as an example, let the energy values in the four directions be E₁, E₂, E₃ and E₄; then

E = w₁E₁ + w₂E₂ + w₃E₃ + w₄E₄,  (10)

where MSEᵢ is the mean square error of the i-th direction and w₁, w₂, w₃ and w₄ are the weights assigned to the four directions, computed from the MSEs as in formulas (7)-(9). The weighted entropy H, contrast C and relevance R are computed in the same way as formulas (7)-(10). Finally, the complexity of each image block can be represented as

f = ω₁E + ω₂H + ω₃C + ω₄R,  (11)

where f is the complexity of the image block and ω₁, ω₂, ω₃ and ω₄ are the weights assigned to the four main feature parameters, which can be calculated as in formulas (7)-(9).
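The four descriptors in Eqs. (3)-(6) can be sketched for a single direction as below. This is a hedged, pure-Python illustration, not the authors' implementation: the number of gray levels, the offset convention (row, column displacement), and all names are illustrative, and the correlation is written in the equivalent centered form of Eq. (6).

```python
import math

def glcm(block, levels, offset):
    """Normalized GLCM P(i, j) of a 2-D block for one offset (dr, dc)."""
    rows, cols = len(block), len(block[0])
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[block[r][c]][block[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def descriptors(P):
    """Energy, entropy, contrast and relevance (correlation) of a GLCM."""
    L = len(P)
    cells = [(i, j, P[i][j]) for i in range(L) for j in range(L)]
    energy = sum(p * p for _, _, p in cells)
    entropy = -sum(p * math.log(p) for _, _, p in cells if p > 0)
    contrast = sum((i - j) ** 2 * p for i, j, p in cells)
    mu_i = sum(i * p for i, _, p in cells)
    mu_j = sum(j * p for _, j, p in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p for i, _, p in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p for _, j, p in cells))
    if sd_i == 0 or sd_j == 0:  # flat block: correlation is undefined
        relevance = 0.0
    else:
        relevance = sum((i - mu_i) * (j - mu_j) * p
                        for i, j, p in cells) / (sd_i * sd_j)
    return energy, entropy, contrast, relevance

block = [[0, 0, 1],
         [0, 0, 1],
         [0, 2, 2]]
P = glcm(block, levels=3, offset=(0, 1))  # 0-degree direction, d = 1
print(descriptors(P))
```

Running this per direction and combining the four results with MSE-derived weights, as in Eq. (10), yields the direction-independent complexity f of Eq. (11).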

Block matching
After replacing the SD with f, a suitable block pairing should be selected. In Lee et al.'s method, the secret and target blocks are separately sorted in ascending order of their SDs, and each secret tile is then paired with the corresponding target block according to this order. To restore the secret image from the transformed image, the positions of the secret tiles must be recorded and embedded into the transformed image by a reversible method, so N⌈log₂N⌉ bits are needed to record the indexes of N blocks. This decreases the visual quality of the stego-image when the number of blocks is large or the block size is small. In the proposed method, blocks with close f values are grouped into one class, since most f values are similar, and each secret block is transformed only to a target block in the same class.
(1) Cluster all secret blocks into K classes by a traditional clustering method such as K-means, and sort the K classes so that the f values in the i-th class are smaller than those in the j-th class whenever 1 ≤ i < j ≤ K.
(2) Classify the target blocks, in scanning order, according to the class volumes of the secret image, so that each target class has the same volume as the corresponding secret class. Let the α-th secret image class contain n_α blocks, where 1 ≤ α ≤ K. The first n₁ target blocks, with the smallest f, form the first class; the next n₂ target blocks, with the second-smallest f, form the second class; and so on, until all target blocks are divided.
(3) Assign a compound index A_α^β to each block, where A_α^β denotes the β-th block of the α-th class and 1 ≤ β ≤ n_α. Each A_α^β-th secret block then replaces the A_α^β-th target block, and the transformed image is generated. A simple example of block matching is shown in Fig. 3. The secret tiles are divided into three classes: (1) {0, 1, 2, 3} belongs to class 1, labeled "white"; (2) {4, 5, 6} belongs to class 2, labeled "gray"; (3) {7, 8, 9} belongs to class 3, labeled "black". The compound index of each block is defined by scanning the classes in raster order. For instance, the second secret block is the first one of class 1 and is assigned A₁¹, and the seventh block is the second one of class 1 and is assigned A₁². After that, the target blocks are classified according to the classes of the secret blocks, and a one-to-one map between secret and target blocks is created. Each A_α^β-th secret block is then transformed to the A_α^β-th target block and replaces it. Finally, the transformed image is generated, and the class index A is recorded as auxiliary information for recovering the positions of the secret image blocks.
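The pairing in steps (1)-(3) can be sketched as follows. This is a hedged simplification: instead of K-means, the K secret classes are formed by splitting the feature-sorted block list into near-equal volumes (an illustrative substitute), after which the target blocks are partitioned into classes of the same volumes and the β-th members of matching classes are paired.

```python
def pair_blocks(secret_f, target_f, K):
    """Map each secret block index to a target block index via K sorted classes.

    secret_f, target_f: per-block feature values f (same length).
    """
    n = len(secret_f)
    order_s = sorted(range(n), key=lambda i: secret_f[i])
    order_t = sorted(range(n), key=lambda i: target_f[i])
    # Class volumes n_alpha: a near-equal split of the sorted secret blocks
    # (illustrative stand-in for K-means cluster sizes).
    base, extra = divmod(n, K)
    volumes = [base + (1 if k < extra else 0) for k in range(K)]
    pairing = {}
    pos = 0
    for vol in volumes:
        for beta in range(vol):
            # beta-th secret block of a class -> beta-th target block of it
            pairing[order_s[pos + beta]] = order_t[pos + beta]
        pos += vol
    return pairing  # secret block index -> target block index

sf = [5.0, 1.0, 3.0, 9.0, 7.0, 2.0]
tf = [2.5, 8.0, 0.5, 6.0, 4.0, 1.5]
print(pair_blocks(sf, tf, 3))
```

Because each secret block stays inside its own class, only the class index A needs to be recorded, which is what keeps the auxiliary information small compared with recording full block positions.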

Reversible shift and rotate transformation
After block matching, the transformed image blocks should be shifted and rotated to be as similar to the target image as possible. Let a matched block be a set of pixels B = {p₁, p₂, …, pₙ} with mean value u, and let the target block be T = {p₁′, p₂′, …, pₙ′} with mean u′; then the shifted block B″ = {p₁″, p₂″, …, pₙ″} is generated as follows.
To keep the transformation reversible, the amplitude u′ − u is rounded to an integer:

Δu = round(u′ − u).  (13)

To solve the overflow/underflow problem, Δu is modified as follows. Denote the maximum overflow pixel value (for Δu ≥ 0) and the minimum underflow pixel value (for Δu < 0), and let T be a parameter that controls the balance between the number of overflow/underflow pixels and the distance from the mean value of the target block; Δu is then adjusted by formula (14) when Δu ≥ 0 and by formula (15) when Δu < 0. To reduce the amount of auxiliary information, Δu is further quantized to a small integer.
The quantization step λ in formula (16) must be an even number, and ⌈·⌉ is the ceiling function. The value Δu′ = 2|Δu_q|/λ is recorded as the final auxiliary information and embedded into the transformed image, where λ is a parameter that trades off the amount of auxiliary information against the distance from the mean value of the target block. The matched block B″ = {p₁″, p₂″, …, pₙ″} is then shifted as

pᵢ″ = pᵢ + Δu_q,  i = 1, 2, …, n.  (17)

Although the amplitude u′ − u has been modified to the quantized Δu_q, the overflow/underflow problem may still occur. To deal with it, pixels less than 0 are truncated to 0 and pixels greater than 255 are truncated to 255, and a location map LM = (t₁, t₂, …, tₙ) is generated to record the positions of the overflow/underflow pixels.
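The quantization, shift and truncation steps can be sketched as below. This is a hedged reading, not the paper's exact formulas (14)-(16): the parity of the recorded Δu′ encodes the sign of Δu, chosen to be self-consistent with the recovery rule described later (even Δu′ restores +λΔu′/2, odd Δu′ restores −λΔu′/2); λ (`lam`) must be even.

```python
import math

def quantize(du, lam):
    """Record du' as a small non-negative integer; parity encodes sign(du)."""
    assert lam % 2 == 0, "quantization step must be even"
    if du >= 0:
        return 2 * math.ceil(du / lam)      # even value -> non-negative shift
    return 2 * math.ceil(-du / lam) - 1     # odd value  -> negative shift

def restore(du_prime, lam):
    """Recover the quantized shift du_q from the recorded du'."""
    sign = 1 if du_prime % 2 == 0 else -1
    return sign * lam * du_prime // 2       # exact: lam is even

def shift_block(block, du_q):
    """Shift pixels by du_q per Eq. (17), truncate to [0, 255], record LM."""
    shifted, lm = [], []
    for p in block:
        q = p + du_q
        lm.append(q < 0 or q > 255)         # True where truncation loses data
        shifted.append(min(255, max(0, q)))
    return shifted, lm

du_prime = quantize(13, lam=8)   # -> 4 (even, so the shift is non-negative)
du_q = restore(du_prime, lam=8)  # -> 16
print(shift_block([250, 10, 40], du_q))
```

The round trip `restore(quantize(du, lam), lam)` reproduces Δu only up to the quantization step, which is exactly the trade-off λ controls between auxiliary-information size and mean bias.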
Because it is very small, the LM can be compressed well. To further maintain the similarity between the transformed image and the target image, the shifted block C can be rotated by one of the four angles 0°, 90°, 180° or 270°. The best angle ϑ ∈ {0°, 90°, 180°, 270°} is the one minimizing the root mean square error (RMSE) between the rotated block and the target block. The transformed image is now generated, and the auxiliary information, consisting of the class indexes of the secret image, the quantified differences Δu′, the small overflow/underflow information LM and the rotation angles ϑ, can be embedded into the transformed image by any traditional RDH method. Before embedding, the auxiliary information should be compressed by a classic method such as Huffman coding to reduce its amount, and encrypted in a traditional way such as AES encryption for security.
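The best-angle selection can be sketched as an exhaustive search over the four rotations, keeping the one with minimum RMSE against the target block. Names are illustrative; blocks are 2-D lists.

```python
import math

def rot90(m):
    """Rotate a 2-D list 90 degrees clockwise."""
    return [list(row) for row in zip(*m[::-1])]

def rmse(a, b):
    """Root mean square error between two same-sized 2-D blocks."""
    d = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(d) / len(d))

def best_rotation(block, target):
    """Return (angle, rotated block) minimizing RMSE to the target block."""
    best_err, best_angle, best_block = float("inf"), 0, block
    cur = block
    for angle in (0, 90, 180, 270):
        err = rmse(cur, target)
        if err < best_err:
            best_err, best_angle, best_block = err, angle, cur
        cur = rot90(cur)
    return best_angle, best_block

angle, rotated = best_rotation([[0, 0], [9, 9]], [[0, 9], [0, 9]])
print(angle, rotated)  # -> 270 [[0, 9], [0, 9]]
```

Only two bits per block are needed to record ϑ, which is why the rotation step improves similarity at almost no cost in auxiliary information.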

Secret image extraction
Image recovery is the inverse of image hiding. First, the transformed image and the embedded auxiliary information are recovered by the RDH method, and the information is decrypted and decompressed. The transformed image is then divided into non-overlapping blocks, and each block is rotated back by its recorded angle ϑ. Next, from the quantified difference Δu′ = 2|Δu_q|/λ: if Δu′ is even, then Δu ≥ 0 and the shift is restored as Δu_q = λΔu′/2; if Δu′ is odd, then Δu < 0 and Δu_q = −λΔu′/2. Δu_q is then subtracted from every pixel of each rotated block. Finally, the blocks are re-assigned to the positions of the matched blocks in the secret image according to the class indexes, and the secret image is recovered.
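The anti-rotation step of the recovery can be sketched as follows: a clockwise rotation of ϑ degrees is undone by rotating the remaining (360 − ϑ) degrees clockwise. This is a minimal illustrative sketch, not the authors' code.

```python
def rot90(m):
    """Rotate a 2-D list 90 degrees clockwise."""
    return [list(row) for row in zip(*m[::-1])]

def unrotate(block, angle):
    """Undo a recorded clockwise rotation of `angle` in {0, 90, 180, 270}."""
    for _ in range(((360 - angle) % 360) // 90):
        block = rot90(block)
    return block

b = [[1, 2], [3, 4]]
print(unrotate(rot90(b), 90))  # -> [[1, 2], [3, 4]]
```

Together with the parity-based restoration of Δu_q and the class indexes, this makes every step of the hiding pipeline exactly invertible, which is what guarantees lossless recovery of the secret image.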

Parameter setting for the proposed method
In the proposed method, we adopt Huffman coding to reduce the auxiliary information and use the RDH method of Sachnev et al. [Sachnev, Kim, Nam et al. (2009)] to embed the compressed information. For security, the compressed information is encrypted before embedding, and Fig. 4 gives an example reflecting the security of the algorithm: Fig. 4(a) is the secret image, Fig. 4(c) is the stego-image, which is similar to the target image in Fig. 4(b), and an eavesdropper with a wrong key obtains only the messy image in Fig. 4(d). Thus, only a receiver with the correct key can restore the secret image. The experiments were carried out in MATLAB R2014a on an Asus PC with a 4200 CPU @ 2.80 GHz and 8.00 GB RAM. The test images used in the experiments are in the PNG format adopted by many cameras and other computer equipment. In this subsection, the four typical combinations of secret and target images in Fig. 5 are used to discuss how to set the parameters of the proposed method. The secret and target images are divided into the same number of 4 × 4 blocks. To match these blocks, the GLCM feature and the clustering method are used to classify the blocks into K classes. To solve the overflow/underflow problem, T is the parameter controlling the balance between the number of overflow/underflow pixels and the bias from the target image mean; to reduce the amount of auxiliary information, λ is the parameter trading off the amount of auxiliary information against the bias from the target image mean. The auxiliary information (AI) and the root mean square error (RMSE) of the transformed image under different values of λ, K and T for the four examples are shown in Tab. 1, and Fig. 6 shows more intuitively how the average payload and RMSE change with λ, K and T. In formula (16), the parameter λ reduces the amount of auxiliary information, but it increases the mean bias of the target image blocks.
To choose an appropriate λ, we fix the other parameters, e.g. K = 10 and T = 6, and vary λ. As shown in Fig. 6(a1) and Fig. 6(b1), when λ is larger than 8, the average RMSE of the transformed image increases rapidly while the average payload changes only slowly. Thus, λ = 8 is an appropriate value.
Figure 6: The average payload and RMSE change with different parameters λ, K and T

As mentioned in Section 2.1.1, the parameter K is used to classify the image blocks into K classes. To choose an appropriate K, we fix the other parameters, e.g. λ = 8 and T = 6, and vary K. As shown in Fig. 6(a2) and Fig. 6(b2), when K is larger than 10, the average RMSE of the transformed image decreases only slowly while the average payload increases slowly. Thus, K = 10 is an appropriate value. In formulas (14)-(15), the parameter T is used to reduce the amplitude of the mean bias. To choose an appropriate T, we fix the other parameters, e.g. K = 10 and λ = 8, and vary T. As shown in Fig. 6(a3) and Fig. 6(b3), when T is larger than 8, the average RMSE of the transformed image increases rapidly while the average payload changes only slowly. Thus, T = 10 is an appropriate value.

Performance comparison
To compare performance, the same Huffman-coding compression is used to compress the auxiliary information, and the same RDH scheme of Sachnev et al. [Sachnev, Kim, Nam et al. (2009)] is used to embed the compressed information into the transformed image for both the proposed method and Hou et al.'s method. The root mean square error (RMSE) and the auxiliary information (AI) are the main performance indexes for appraising the similarity between the transformed and target images. The block size 4 × 4 usually performs best for Hou et al.'s method. Tab. 2 shows the RMSE and AI of the transformed image and stego-image with different block sizes for Example 4, and the block size 4 × 4 also performs best for Example 4. From Fig. 7 and Tab. 3, we can see that the visual quality of the transformed image and stego-image of the proposed method outperforms that of Hou et al.'s method: the RMSE of the stego-image is reduced by 4.24% compared with the previous method, while the AI is approximately equal to that of Hou et al.'s method. The reason is that an effective per-block feature extraction algorithm increases the accuracy of block matching, improving the visual quality of the transformed image, while the amount of auxiliary information for recording the transformation parameters is unchanged. Although Wu et al.'s method is reversible, it cannot ensure a relatively large payload (more than 1 bit per pixel), whereas the proposed method can. Compared with Lai et al.'s method, the stego-image in the proposed method is not expanded and has the same size as the secret image. Moreover, the proposed method can resist detection by strong steganalysis, because it is hard to recover the secret image from the stego-image alone, which looks like the freely selected target image.

Conclusion
In this paper, we proposed a GLCM-feature-based approach for reversible image transformation. An effective per-block feature extraction algorithm is used to increase the accuracy of block matching and thereby improve the visual quality of the transformed image, while the amount of auxiliary information for recording the transformation parameters is unchanged. Thus, the visual quality of the stego-image is improved, and the root mean square error of the stego-image can be reduced by 4.24% compared with the previous method. In future work, we may further improve the visual quality of the stego-image in two ways. On the one hand, the amount of auxiliary information, such as the quantified mean differences, could be reduced, or a novel RDH method could be designed to reduce the distortion of the transformed image caused by embedding the auxiliary information. On the other hand, more block features could be chosen to improve the visual quality of the transformed image, and thus the visual quality of the stego-image.