Improved reversible data hiding in JPEG images based on new coefficient selection strategy

Recently, reversible data hiding (RDH) techniques for JPEG images have become more widely used to combine image and authentication information conveniently into one file. Although embedding data in a JPEG image degrades visual quality and increases file size, it has proven useful for data communication and image authentication. In this paper, a data hiding method for JPEG images using a new coefficient selection technique is proposed. The proposed scheme embeds data using the histogram shifting (HS) method. Blocks are ordered by their number of zero AC coefficients so that data is embedded first in the blocks causing less distortion. To further reduce the distortion, the positions of the AC coefficients used for embedding are selected carefully. Finally, AC coefficients valued +1 and −1 are used for embedding, and the remaining non-zero AC coefficients are shifted to the left or right according to their sign. Experimental results show that, compared to the current state-of-the-art method, the proposed method achieves a higher peak signal-to-noise ratio (PSNR) and a smaller file size.


Introduction
The growth of multimedia technologies and the reach of the Internet are increasing dramatically, and multimedia technologies are key players in this digital information age. Data communication and information exchange between people take place in digital form, mostly over the open Internet, which means that a third party may be able to access all types of multimedia information. Such easy accessibility threatens privacy, and there is no guarantee of multimedia ownership and integrity. In general, the reliability, security, integrity, and confidentiality of multimedia information are at risk while it is in digital form, especially on the Internet. Beyond these risks, there are cases where important metadata, such as encrypted patient information or a digital signature for a file, is accidentally deleted, or the image itself is maliciously doctored. Reversible data hiding (RDH) mitigates these scenarios by allowing users to hide a payload in their cover media in such a way that the original cover image can be recovered without any distortion. RDH is therefore useful for applications such as medical and military imaging, where even the slightest distortion is undesirable.
The JPEG standard is one of the oldest and most commonly used digital image formats. Most media broadcasting corporations and digital devices use JPEG compression to store images. Reversible data hiding in the JPEG compression domain is therefore a useful research area for image archive management, image authentication, and image privacy.
The first data hiding method was proposed in Barton's patent [1] in 1997. Following that, numerous data hiding and lossless data hiding schemes have been proposed. Tian [2] proposed a difference expansion technique to embed data in the spatial domain; the host image is divided into pixel pairs, and the difference value of the two pixels in a pair is expanded to carry one message bit. Subsequently, Tian's work was improved upon in many aspects [3,4]. In 2006, Ni et al. [5] proposed a histogram shifting technique to embed data more efficiently, preferring to embed a message into coefficients belonging to some selected frequencies. The minimum points of the histogram are used for data embedding. Qu et al. [6] proposed a novel embedding strategy for reversible watermarking based on compensation: some of the modified pixel values return to their original values after data hiding, which compensates for image distortion. Many recent algorithms exploit prediction errors (PE) and pixel value ordering (PVO) [4,7,8,9]. Some authors [8,9] use block-based pixel ordering, whereas one [7] uses pixel-based ordering (neighboring pixel ordering). Sachnev et al. [4] proposed a popular and efficient data hiding scheme using sorting and prediction. Most of these methods use a histogram shifting (HS) strategy to embed data. In the HS method, the payload is embedded in the embeddable bins, while the remaining bins must be shifted to the right or to the left depending on their sign. Embeddable and expandable bins are specified by the encoder threshold values. Compared to other methods, PE and PVO schemes have better embedding performance; however, the PVO method works well mainly for low-capacity embedding. On the other hand, reversible data hiding based on least-squares prediction [10,11,12] has gained popularity and works quite well for high-capacity embedding. Most of the existing reversible data hiding techniques focus on the pixel domain.
More recently, researchers have turned their attention to data hiding in JPEG-compressed images. JPEG compression is built on the discrete cosine transform (DCT) [13,14]. The most important aspect of the DCT for JPEG compression is that the DCT coefficients can be quantized using visually weighted quantization table values. In [15,16], Huffman code mapping is used to embed the payload into a JPEG bit-stream: the unused variable-length codes (VLCs) for AC coefficients are exploited by defining a mapping between used and unused codes. Hu et al. [17] improved the first VLC-based lossless data hiding scheme of this kind [15]; their scheme embeds data directly into the bit-stream of JPEG images based on unused variable-length codes in the Huffman table. Chang et al. [18] presented a block-based lossless and reversible data hiding scheme for hiding a payload in DCT-based compressed images, in which two successive zero coefficients among the medium-frequency elements of each block are used for embedding. Mobasseri et al. [16] embedded the payload in the JPEG bit-stream by code mapping.
The rest of this paper is organized as follows. In Section 2, an overview of the JPEG image standard is provided and related works of reversible data hiding in JPEG images are discussed. In Section 3, the proposed RDH scheme for JPEG images is introduced. Experimental results and analysis are given in Section 4. Finally, Section 5 concludes the work.

Related works

Overview of legacy JPEG image compression standard
The Joint Photographic Experts Group (JPEG) ([13,14], http://www.ijg.org/) developed the first international digital image compression standard. It is one of the most common image formats used by current digital cameras and other image-capturing devices. The legacy JPEG compression standard achieves a high compression ratio with minimal loss in visual quality. The key steps of the JPEG compression process ([13,14], http://www.ijg.org/) are shown in Fig. 1. The legacy JPEG encoder consists of three parts: discrete cosine transform (DCT), quantization, and entropy encoding. The original image is divided into non-overlapping 8 × 8 pixel blocks, each block is level-shifted by subtracting 128 and transformed using the two-dimensional DCT, the resulting coefficients are quantized, and entropy coding is applied using Huffman coding [13,14]. Different quantization tables yield different image qualities and compression ratios. The recommended quantization table is shown in Fig. 2. The quality factor (QF) is a value ranging from 1 to 100, and the scaled quantization table Q_s is obtained using the following equation:
$$Q_s = \begin{cases} \left[\dfrac{50}{QF}\, Q\right], & QF < 50 \\ \left[\left(2 - \dfrac{QF}{50}\right) Q\right], & QF \ge 50 \end{cases}$$
where Q is the recommended quantization table value, and [x] represents the round operator of x.
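The quality-factor scaling can be illustrated with a short sketch. It assumes the widely used IJG scaling convention, takes the recommended table of Fig. 2 to be the standard luminance table from the JPEG specification, and clamps entries to the range 1 to 255; the function name `scale_quant_table` is hypothetical and is not taken from the paper.

```python
# Standard JPEG luminance quantization table (assumed to be the table of Fig. 2).
BASE_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_quant_table(base, qf):
    """Scale a base quantization table by quality factor qf (1..100).

    Integer arithmetic is used so that rounding is exact (round half up).
    """
    if not 1 <= qf <= 100:
        raise ValueError("QF must be in the range 1..100")
    scaled = []
    for row in base:
        out = []
        for q in row:
            if qf < 50:
                # round((50 / qf) * q)
                v = (100 * q + qf) // (2 * qf)
            else:
                # round((2 - qf / 50) * q)
                v = (q * (200 - 2 * qf) + 50) // 100
            # Assumption: clamp to 1..255 so every entry is a valid divisor.
            out.append(min(255, max(1, v)))
        scaled.append(out)
    return scaled


q70 = scale_quant_table(BASE_Q, 70)  # the QF used in the experiments
```

At QF = 50 the table is returned unchanged, and larger QF values shrink the entries, which means finer quantization and higher image quality.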

Reversible data hiding in JPEG image
Unlike pixel-based reversible data hiding techniques, JPEG reversible data hiding embeds data in the quantized DCT coefficients. Reversible data hiding in JPEG DCT coefficients is based on four general data hiding approaches [19]. The first one is a lossless compression-based method proposed by Fridrich and Goljan [20], in which the embedding space is preserved by compressing the redundant component of the image. Since its message capacity is too small, this method has received less attention.
The second is a quantization table modification approach proposed by Fridrich et al. [21] and later improved by Wang et al. [22]. Their techniques work by preprocessing the quantized DCT coefficients and modifying the quantization table to create space for data hiding. Although the experimental results of [22] achieve high peak signal to noise ratio (PSNR), the file size increases greatly.
The third method [16] modifies the Huffman table. In this method, data embedding is performed by mapping a used variable length coding (VLC) to an unused VLC. Qian and Zhang [15] improved this method, but the payload size is very small for these methods.
The fourth method [18,23] modifies the quantized DCT coefficients. Xuan et al. [23] shifted the quantized DCT coefficient histogram using an optimum searching strategy, which helped the technique achieve good performance. To keep the embedding imperceptible and the visual quality of the marked image high when a given amount of data is embedded, only lower and middle frequency DCT coefficients are chosen for embedding. Sakai et al. [24] improved this scheme, producing better image quality with a new block selection strategy. Li et al. [25] proposed a reversible data hiding scheme for JPEG images based on a smaller DCT value selection method and three slight modifications of the quantization table. Using the HS approach, Lin et al. [26] discussed high-capacity reversible data hiding for JPEG images; they used different block sizes and obtained a high embedding capacity. Celik et al. [27] also proposed lossless data hiding in the least significant bit (LSB) of the JPEG DCT coefficients, where the DCT values that cause high distortion are padded as side information in the payload. Huang et al. [19] proposed a histogram shifting technique on AC coefficients valued +1 or −1 and employed a block ordering based on the statistical properties of the number of zero AC coefficients in the blocks. This recent method achieves large improvements over earlier work.

Proposed scheme
This section describes the proposed JPEG reversible data hiding method. Subsection 3.1 explains the embedding in the AC coefficients. Subsection 3.2 explains the block ordering method proposed by Huang et al. [19] in detail. Subsection 3.3 explains the main contribution of this paper, which is the process of selecting the positions of AC coefficients so as to minimize the distortion. Subsection 3.4 details which side information is needed and how it is embedded. Subsection 3.5 briefly explains the recovery of the original DCT coefficients and the extraction of the payload. Subsection 3.6 gives a short explanation of the complexity of the algorithm. Finally, the section concludes with a description of the proposed algorithm in pseudocode.

Embedding
In a quantized coefficient block, the very first coefficient is referred to as the DC coefficient, whereas the 63 others are referred to as AC coefficients. The proposed method embeds the payload only in the AC coefficients. For each AC coefficient C, the following rule is used to embed the payload bit b ∈ {0, 1} and create the watermarked AC coefficient C′:
$$C' = \begin{cases} C + \operatorname{sign}(C)\, b, & |C| = 1 \\ C + \operatorname{sign}(C), & |C| > 1 \\ C, & C = 0 \end{cases}$$
A bit b is embedded if and only if C is either +1 or −1. If C is greater than +1 or less than −1, it is shifted by +1 or −1, respectively. Notice that the zero coefficients are not modified. Before we continue with the rest of the explanation, the following histogram shifting terms are summarized for clarity:
Embeddable coefficients: AC coefficients valued either +1 or −1.
Unchangeable coefficients: AC coefficients valued 0.
Shiftable coefficients: AC coefficients which are greater than +1 or less than −1.
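The embedding rule described above can be written as a small function. This is a minimal sketch under the stated rule only (embed in ±1, shift larger magnitudes away from zero, leave zeros untouched); the names `sign` and `embed_coefficient` are illustrative and do not come from the paper.

```python
def sign(c):
    """Return -1, 0, or +1 according to the sign of c."""
    return (c > 0) - (c < 0)


def embed_coefficient(c, b):
    """Return the watermarked value of one quantized AC coefficient.

    c: original quantized AC coefficient (integer)
    b: payload bit (0 or 1); only consumed when |c| == 1
    """
    if b not in (0, 1):
        raise ValueError("payload bit must be 0 or 1")
    if c == 0:
        return 0                    # unchangeable coefficient
    if abs(c) == 1:
        return c + sign(c) * b      # embeddable: +-1 stays (b=0) or becomes +-2 (b=1)
    return c + sign(c)              # shiftable: magnitude grows by one


# An embeddable coefficient carries one bit; the others carry none.
marked = [embed_coefficient(c, 1) for c in (0, 1, -1, 2, -3)]
```

Because shiftable coefficients always move away from zero, the marked values ±1 and ±2 can only originate from embeddable coefficients, which is what makes the rule invertible.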

Block ordering
Huang et al. [19] first proposed a block ordering based on the number of zero AC coefficients. Their experimental results show that blocks with many zero AC coefficients are likely to contain many AC coefficients valued −1 or +1. Using this statistical feature, Huang et al. [19] proposed embedding only in coefficients valued +1 and −1 and set the embedding order such that the blocks with more zero AC coefficients are embedded first. This strategy effectively reduces distortion. It also lessens the file size increase, because modifying a zero AC coefficient increases the file size: whenever a zero AC coefficient becomes non-zero, an extra symbol must be coded. Therefore, their method leads to smaller distortion and a smaller file size than the existing schemes that embed in zero AC coefficients.
The proposed scheme uses a similar block ordering before the AC coefficient selection step. A block with a higher number of zero coefficients is placed at the top with the highest embedding priority, and a block with fewer zero coefficients is placed at the bottom with the lowest embedding priority.
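The block ordering can be sketched as follows. The sketch assumes each block is a list of 64 quantized coefficients with the DC coefficient at index 0, and it breaks ties by the original block index, which is an assumption since the text does not specify tie handling. The function names are hypothetical.

```python
def count_zero_ac(block):
    """Number of zero-valued AC coefficients in one 64-coefficient block."""
    return sum(1 for c in block[1:] if c == 0)


def embedding_order(blocks):
    """Return block indices sorted by decreasing number of zero AC coefficients.

    Python's sort is stable, so blocks with equal counts keep their
    original relative order (tie-breaking rule assumed here).
    """
    return sorted(range(len(blocks)), key=lambda n: -count_zero_ac(blocks[n]))


# Three toy blocks: 60, 63, and 10 zero AC coefficients respectively.
toy_blocks = [
    [5, 1, -2, 3] + [0] * 60,
    [7] + [0] * 63,
    [3] + [1] * 53 + [0] * 10,
]
order = embedding_order(toy_blocks)
```

Since zero-valued coefficients are never modified by the embedding, a decoder that applies the same function to the marked blocks obtains the same order.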

AC coefficient position selection
(1) Motivation: The block ordering method proposed by Huang et al. [19] has proven effective for decreasing the distortion. However, it does not consider two additional points. First, each AC coefficient position has a different distribution of embeddable, unchangeable, and shiftable coefficients across all blocks, and certain AC positions have more embeddable coefficients than shiftable coefficients. Figure 3 shows the total number of embeddable and shiftable coefficients at each position over all the blocks. Figure 3a shows the statistics for the image Lena with QF = 70, and Fig. 3b those for the image Baboon with QF = 70. For example, Fig. 3a shows that position 24 has a larger number of embeddable AC coefficients than shiftable AC coefficients. In contrast, position 5 has a smaller number of embeddable AC coefficients and a larger number of shiftable AC coefficients. Naturally, position 24 is more desirable for embedding. Second, the cost of modification is not uniform across positions: because the quantization table is non-uniform, modifying an AC coefficient near the DC coefficient may lead to less distortion than modifying the last one. For these reasons, we propose a method of selecting the positions of AC coefficients for embedding.
(2) Embedding capacity and distortion: The positions are chosen by considering the embedding capacity and the distortion. The embedding capacity is measured using the number of embeddable AC coefficients. The distortion is modeled using the number of shiftable AC coefficients, the quantization table, and the PSNR function.
(3) PSNR function: Although the PSNR is not the best model for evaluating perceptual distortion, it is the most widely agreed upon measure of distortion. It penalizes a modification by the square of the deviation, i.e., f(x) = x^2.
(4) Quantization table: The quantization table carries crucial information about the compression ratio. Each entry in the quantization table defines the degree of compression (distortion) for the corresponding DCT coefficient of a block, because each DCT coefficient is divided by its corresponding quantization entry. Therefore, during the pixel reconstruction step, the distortion caused by modifying a quantized coefficient is proportional to the square of its quantization entry. Figure 4 shows the squared quantization entries for each position for QF = 70. It can easily be observed that modifying AC coefficients at position 45 causes very large distortion, while modifying those at position 1 causes very little distortion.
(5) Embedding efficiency: To select the positions that cause the least distortion, the proposed method considers the number of embeddable and shiftable AC coefficients together with the quantization table entry. It calculates a metric called the embedding efficiency R_i for each position i ∈ {1, 2, …, 63}:
$$R_i = \frac{\sum_{n=1}^{N} E(i,n)}{\left(\sum_{n=1}^{N} S(i,n) + \frac{1}{2}\sum_{n=1}^{N} E(i,n)\right) Q_i^{2}} \qquad (4)$$
where N is the number of blocks, E(i, n) ∈ {0, 1} represents whether the AC coefficient at position i in the n-th block is embeddable (1) or not (0), S(i, n) ∈ {0, 1} similarly represents whether that coefficient is shiftable (1) or not (0), and Q_i is the quantization table entry at position i. The numerator of the equation represents the total embedding capacity when the AC coefficients at position i are used for embedding, whereas the denominator represents the corresponding estimated distortion: the sum of the total number of shiftable AC coefficients and half of the embedding capacity (assuming the payload is pseudorandom, approximately half of the embeddable coefficients will be changed), weighted by the squared quantization entry as a penalty. The quantization table entry is squared to account for its effect in the de-quantization phase. Even if a position has many embeddable coefficients and few shiftable coefficients, the image may be highly distorted after data embedding if the quantization table entry is large.
Squaring the quantization table values also makes the difference between positions clearer with respect to the value of R. Of two positions that have similar distortion (the denominator of Eq. (4) without squaring Q_i), the one with the smaller Q_i value is preferred for embedding. Therefore, the best position for embedding has the largest embedding efficiency R_i, whereas the worst one has the lowest value. For Eq. (4), a 512 × 512 image is considered, so the number of blocks is 4096.
(6) Sorting and position selection: Once the embedding efficiency is calculated for all 63 positions, the values are sorted from the highest to the lowest. We then have to ensure that enough positions are chosen for the payload to be embedded. This is done by choosing the minimum number of positions from the top of the list such that the combined embedding capacity of the chosen positions is equal to or higher than the payload size. Once the positions are chosen, embedding takes place block by block, in the order previously defined by the block ordering. To make sure the encoder and the decoder use the same order for the positions, we suggest embedding from the lowest to the highest position; i.e., if positions (5,1,6,9) are chosen for embedding, then for each block, embed in position 1 first, 5 next, then 6, and finally 9, and for extraction, extract from position 1 first, 5 next, then 6, and finally 9.
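The embedding efficiency and the position selection can be sketched together. This is an illustrative reading of the description above, not the authors' code: blocks and the quantization table are assumed to be 64-entry lists in the same scan order with the DC entry at index 0, a position with no embeddable or shiftable coefficients is given efficiency 0, and the function names are hypothetical.

```python
def embedding_efficiency(blocks, quant):
    """Return {position: (R_i, capacity_i)} for AC positions 1..63."""
    stats = {}
    for i in range(1, 64):
        embeddable = sum(1 for blk in blocks if abs(blk[i]) == 1)
        shiftable = sum(1 for blk in blocks if abs(blk[i]) > 1)
        # Estimated distortion: all shifts plus about half of the embeddable
        # coefficients, weighted by the squared quantization entry.
        distortion = (shiftable + 0.5 * embeddable) * quant[i] ** 2
        r = embeddable / distortion if distortion > 0 else 0.0
        stats[i] = (r, embeddable)
    return stats


def select_positions(stats, payload_bits):
    """Pick the fewest top-ranked positions whose capacity covers the payload."""
    ranked = sorted(stats, key=lambda i: stats[i][0], reverse=True)
    chosen, capacity = [], 0
    for i in ranked:
        if capacity >= payload_bits:
            break
        if stats[i][1] == 0:
            continue  # a position without embeddable coefficients adds nothing
        chosen.append(i)
        capacity += stats[i][1]
    if capacity < payload_bits:
        raise ValueError("payload exceeds the total embedding capacity")
    # Encoder and decoder both visit positions from lowest to highest.
    return sorted(chosen)


def toy_block(p1, p2, p3):
    return [0, p1, p2, p3] + [0] * 60


toy_blocks = [toy_block(1, 1, 1), toy_block(-1, 1, -1),
              toy_block(2, 1, 1), toy_block(0, 1, 5)]
toy_quant = [1] * 64
toy_quant[2] = 10  # position 2 is heavily quantized
toy_stats = embedding_efficiency(toy_blocks, toy_quant)
```

In the toy data, position 2 has the largest capacity but the lowest efficiency because of its large quantization entry, so it is selected last. In the scheme described here, `payload_bits` would also have to count the original DC LSBs that are appended to the payload.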

Side information
To make the proposed scheme truly reversible, side information, namely the payload length and the positions of the AC coefficients used for embedding, needs to be embedded in the DC coefficients. The payload length requires at most log_2(W × H) bits, where W is the width and H is the height of the image. This is 18 bits for 512 × 512 images. The positions of the AC coefficients used for embedding can be represented as a binary vector of 63 bits, one bit for each position; i.e., if positions (5,1,6,9) are chosen for embedding, then of the 63 bits, those at vector indices 1, 5, 6, and 9 are set to 1 and the others are set to 0. Therefore, the total side information needed for a 512 × 512 image is 81 bits, and it can be embedded in the LSBs of the first 81 DC coefficients. To allow perfect recovery of the original DC values, their original LSBs are appended to the payload before embedding.
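A possible layout of the side information is sketched below. The field order (length first, then the 63-bit position vector) and the most-significant-bit-first encoding of the length are assumptions made for illustration, as the text does not fix them; the function names are hypothetical.

```python
import math

NUM_AC = 63  # number of AC positions in an 8 x 8 block


def pack_side_info(payload_len, positions, width, height):
    """Return the side information as a list of bits (length field + position mask)."""
    len_bits = math.ceil(math.log2(width * height))  # 18 for a 512 x 512 image
    if not 0 <= payload_len < 2 ** len_bits:
        raise ValueError("payload length does not fit in the length field")
    length_field = [(payload_len >> k) & 1 for k in range(len_bits - 1, -1, -1)]
    chosen = set(positions)
    position_field = [1 if i in chosen else 0 for i in range(1, NUM_AC + 1)]
    return length_field + position_field


def unpack_side_info(bits, width, height):
    """Inverse of pack_side_info: return (payload_len, sorted positions)."""
    len_bits = math.ceil(math.log2(width * height))
    payload_len = 0
    for b in bits[:len_bits]:
        payload_len = (payload_len << 1) | b
    mask = bits[len_bits:len_bits + NUM_AC]
    positions = [i for i, b in enumerate(mask, start=1) if b]
    return payload_len, positions


def replace_dc_lsbs(dc_values, bits):
    """Write bits into the LSBs of the first len(bits) DC values.

    Returns the marked DC values and the original LSBs, which the
    encoder appends to the payload so the decoder can restore them.
    """
    original_lsbs = [dc & 1 for dc in dc_values[:len(bits)]]
    marked = [(dc & ~1) | b for dc, b in zip(dc_values, bits)]
    return marked + dc_values[len(bits):], original_lsbs


side_bits = pack_side_info(1000, [5, 1, 6, 9], 512, 512)
```

For a 512 × 512 image the packed vector has 18 + 63 = 81 bits, matching the count given above.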

Extraction and image recovery
In reversible data hiding techniques, the payload should be extracted and the original image should be recovered without any errors. Extraction and recovery are done simultaneously. First, the quantized DCT coefficients are retrieved. The LSBs of the first 81 DC coefficients are read to find the size of the payload and the positions of the AC coefficients that were used for embedding. Then, the blocks are sorted in the same order that was used for embedding. Since zero-valued AC coefficients are not modified, the same ordering as in the embedding is obtained. At the positions that were used, the original AC coefficient C is recovered from the watermarked coefficient C′ using Eq. (5):
$$C = \begin{cases} \operatorname{sign}(C'), & 1 \le |C'| \le 2 \\ C' - \operatorname{sign}(C'), & |C'| \ge 3 \\ C', & C' = 0 \end{cases} \qquad (5)$$
The payload and the original LSBs of the 81 DC coefficients are extracted using Eq. (6):
$$b = \begin{cases} 0, & |C'| = 1 \\ 1, & |C'| = 2 \end{cases} \qquad (6)$$
Once the original LSBs of the DC coefficients are restored, perfect recovery of the original JPEG image is complete.
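The per-coefficient recovery and extraction can be sketched as the inverse of the embedding rule (embed in ±1, shift larger magnitudes by one). This is a minimal illustration; `recover_coefficient` is a hypothetical name, and the handling of the payload length and block traversal is left out.

```python
def sign(c):
    """Return -1, 0, or +1 according to the sign of c."""
    return (c > 0) - (c < 0)


def recover_coefficient(c_marked):
    """Return (original coefficient, extracted bit or None).

    |c'| == 1 -> the coefficient was embeddable and carried bit 0
    |c'| == 2 -> the coefficient was embeddable and carried bit 1
    |c'| >= 3 -> the coefficient was shifted, so undo the shift
    c'  == 0 -> the coefficient was never modified
    """
    magnitude = abs(c_marked)
    if magnitude == 0:
        return 0, None
    if magnitude == 1:
        return c_marked, 0
    if magnitude == 2:
        return sign(c_marked), 1
    return c_marked - sign(c_marked), None


# Marked coefficients from one block at the selected positions.
marked = [0, 2, -1, 4, -2, -3]
recovered = [recover_coefficient(c) for c in marked]
bits = [b for _, b in recovered if b is not None]
```

In this example the extracted bit sequence is read in the same position order as the encoder used, which is why both sides must agree on the lowest-to-highest position convention.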

Complexity
The computational complexity of the proposed method is quite low. To select the positions of the AC coefficients for embedding, the embeddable, shiftable, and unchangeable AC coefficients must be counted. This can be done efficiently by adding a counter to the loop that decodes the Huffman code-words of the quantized coefficients. Determining the efficiency ratios requires 63 multiplications (squaring the 63 entries of the quantization table) and 63 divisions. Finally, sorting the 63 ratios is inexpensive. Overall, the computational complexity is very low.

Encoder and decoder
In this section, a brief overview of the proposed encoder and decoder is presented to aid with understanding.

Encoder
The encoding is done as follows:
(1) Sort the blocks by the number of zero AC coefficients.
(2) Count the total number of embeddable, shiftable, and unchangeable AC coefficients for each of the 63 possible positions.
(3) Calculate the embedding efficiency of each position and determine the positions that lead to the lowest distortion for embedding the payload (including the original LSBs of the 81 DC coefficients).
(4) Sort the positions using the embedding efficiency R_i.
(5) Embed the payload.
(6) Replace the LSBs of the first DC coefficients with the side information.

Decoder
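The decoding is done as follows:
(1) Read the LSBs of the first 81 DC coefficients to obtain the payload length and the positions of the AC coefficients used for embedding.
(2) Sort the blocks by the number of zero AC coefficients.
(3) Extract the payload and recover the original AC coefficients at the selected positions.
(4) Restore the original LSBs of the DC coefficients from the extracted data.

Experimental results and analysis
The proposed method is compared in terms of visual quality and file size against the three state-of-the-art schemes [19,23,24]. Schemes that require modification of the quantization matrix, such as [22], are not included in the experimental results, as they usually retain higher visual quality at the cost of a much larger file size. This effect can be clearly observed in [19]. The PSNR is used to evaluate the visual quality of the marked JPEG image, and it is calculated between the original JPEG image and the watermarked JPEG image. File size is measured in bytes. Note that all experiments were done with QF = 70.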

Visual quality
For the visual quality comparison, the PSNR is calculated between the watermarked JPEG image and the original JPEG image. Concern for the visual quality of the cover image has led researchers to design distortion functions [28] that preserve the intrinsic statistical characteristics of the image after embedding. The complete PSNR results on the six images are shown in Fig. 6. The experiments show that the proposed method achieves a higher PSNR than all the previous works, but its PSNR eventually converges to that of [19] as the payload increases. This is because AC coefficient position selection has no effect when all AC coefficients are used for embedding. Tables 1 and 2 show the numerical PSNR results on the listed six images, and Table 3 shows the results on the 23 images from the USC-SIPI database, where the data at the bottom are the average results. According to the tables, the proposed technique performs the best. This improvement is notable given that the AC coefficient selection scheme requires only a small amount of additional computation compared with [19].
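For reference, the PSNR between two images can be computed as below. The sketch assumes 8-bit decoded pixel values (peak value 255) supplied as flat sequences of equal length; decoding the two JPEG files to pixels is outside its scope, and the function name `psnr` is illustrative.

```python
import math


def psnr(original, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(marked) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error: each deviation is penalized by its square.
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Two of four pixels differ by one gray level, so the MSE is 0.5.
example = psnr([100, 100, 100, 100], [101, 99, 100, 100])
```

The squared-error term is the same penalty f(x) = x^2 that motivates squaring the quantization entries in the position selection.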

File size
In JPEG data hiding, file size must be considered together with image quality. Figure 7 shows the file size comparison. The orange dashed line in this figure is the original file size of the image. The horizontal axis represents the payload size, while the vertical axis represents the image file size (in bytes). The file size of the watermarked image obtained by the proposed method is on average smaller than that of the previous methods (Tables 3 and 4). Tables 5 and 6 show the numerical results of the file size increment on the listed six images, and Table 4 shows the results on the 23 images from the USC-SIPI database, where the data at the bottom are the average results. The tables show that the proposed method yields the smallest file size in most cases, although [24] achieves the smallest file size for some images.

Conclusions
The popularity and easy accessibility of the JPEG image format have made it an important area of research for data hiding. Modifying a JPEG image introduces distortion and increases the file size. In this paper, a reversible data hiding method for JPEG images is proposed. The technique uses the HS strategy to embed data. Of the non-zero AC coefficients, only those valued +1 and −1 are used for embedding, and the others are shifted in either direction according to their sign. Before embedding, block ordering is performed, and then the AC coefficient positions best suited for RDH are selected based on the estimated distortion of the AC coefficients at each position. In general, the proposed scheme achieves better visual quality and a smaller file size than previous related research. The major contribution of this paper is in exploiting Eqs. (3) and (4). Using these equations, we can choose the appropriate AC coefficients that reduce distortion and file size.