Scalable Image Self-Embedding Based on Dual-Rate SPIHT-LDPC Reference Generation Scheme

Image self-embedding is a method of embedding two sets of data into the original image: authentication data for tamper detection and reference data for image recovery. In this paper, a scalable self-embedding method is proposed based on dual-rate source-channel coding for reference data generation. The proposed method uses the Set Partitioning in Hierarchical Trees (SPIHT) algorithm for source coding and Low-Density Parity-Check (LDPC) codes for channel coding. Accordingly, the proposed recovery system provides higher reconstruction quality at low tampering rates, while it can still handle higher tampering rates at lower reconstruction quality. The proposed method is therefore able both to preserve image quality and to recover from higher tampering rates. Simulation results show noticeable improvements compared with related self-embedding methods in the literature.

Passive authentication is a fundamental method for detecting counterfeit images without any information about the original image. Thus, pixel statistics and learning-based algorithms are used for detecting tampered regions [1], [2], [10]. On the other hand, in active authentication schemes, prior information about the original image is used for tamper detection [11], [12]. A digital signature is a simple method for checking the integrity of the image [13]. In this method, a hash function generates a unique code for the image. For forgery detection, another signature is produced from the test image. The image is labeled authentic if the received code matches the generated one. Digital watermarking is widely used for image authentication [14][15][16][17]. Methods based on fragile watermarking detect every small modification of the image [16][17][18], while semi-fragile watermarking permits some image modifications such as compression [8], [20]. In [21], an authentication method is proposed based on both statistical correlation and digital watermarking. For generating authentication data in the fragile watermarking scheme, digital signature techniques are very helpful [9], [10]. In this scheme, the image is partitioned into non-overlapping blocks, and a digital signature is generated for each block using a hash function and embedded into the same block. For tamper detection, the extracted data is compared with the generated one, and in the case of a mismatch the block is labeled as tampered [22].
Self-embedding is a tamper detection and image recovery algorithm based on digital watermarking. Two types of data are embedded into the original image: authentication data for tamper detection and reference data for image recovery [23]. Fridrich et al. introduced a self-embedding method based on fragile watermarking, not only for tamper detection but also for image recovery [24]. A three-level authentication system based on fragile image watermarking was proposed by Lin et al. [25]. In this method, parity check bits were generated as authentication data and the block average was used as reference data. The dual-watermarking scheme proposed by Lee and Lin [14] compensates for the drawback of the scheme in [25]. In this method, two sets of reference data are generated for every block in order to provide higher tamper resilience. Since the high reference payload was the main disadvantage of the method in [14], Zhang et al. proposed a flexible self-embedding scheme based on compressive sensing to reduce data redundancy [26]. In this scheme, the reference data is generated for several blocks in the DCT domain. By using image compression, the reference data is generated more efficiently and the image can be recovered with higher reconstruction quality. Yang and Shen proposed a tamper recovery method based on vector quantization (VQ) [27]. In this method, the indices of the blocks were embedded into the main image using fragile watermarking. Korus et al. [23] proposed a fragile watermarking scheme for tamper recovery based on quantization and channel coding. In this scheme, the image is partitioned into non-overlapping blocks and each block is transformed to the DCT domain. The DC coefficients are quantized using scalar quantization and the AC coefficients are quantized using VQ. Lee et al. were the first to introduce a tamper recovery scheme based on channel coding algorithms [28]. Since the generated reference data is hidden in the image, tampering might affect the embedded data. Channel coding is an effective way to ensure better error resilience for the reference data. Sarreshtedari et al. [29] proposed a source-channel coding self-embedding scheme based on SPIHT and Reed-Solomon (RS) algorithms. In this method, the image is compressed at a data rate of 1 bit per pixel (bpp) and 0.5 bpp of redundancy is added to the compressed data for error protection. The generated reference data is permuted and embedded into the original image. Qin et al. proposed an overlapping-block embedding strategy which provides block-based tamper detection and content recovery [30]. In this method, check bits are generated according to the complexity of each block. For tamper recovery, the average value of the overlapping blocks is used for generating reference data.
In this paper, a scalable self-embedding method is proposed based on dual-rate source-channel coding. The main contribution of this scheme is generating reference data consisting of two parts. The first part is well protected so that it survives high tampering rates, and the second part is suited to low tampering rates, where it provides higher reconstruction quality. The reference data is produced using SPIHT for image compression and LDPC codes for error protection. The compressed SPIHT bitstream is partitioned into two parts. The first part provides the fundamental quality for content recovery and is therefore protected with a higher redundancy rate. The second part consists of enhancement information and can be protected with a lower redundancy rate. For tamper detection at the receiver side, the image is partitioned into non-overlapping blocks and the authentication data is extracted to detect modifications of each block. For image recovery, the extracted reference data is inverse-permuted and partitioned into two parts according to the embedding procedure, and each part is error-corrected individually. The bitstream is then decompressed to generate an image representation, which is used for image recovery by replacing the tampered regions.
By allocating dual-rate source-channel coding, the quality-robustness performance of the proposed scheme is formed in two scales. At the first scale (scale 1), higher reconstruction quality is expected for low tampering rates. At the second scale (scale 2), the reconstruction quality is lower but higher tampering rates are tolerable. The contributions of the proposed method are as follows:
• Better quality-robustness performance: the method improves both the recovered quality and the tolerable tampering rate.
• Scalable self-recovery with two levels (scales) of image recovery: higher reconstruction quality at low tampering rates and lower reconstruction quality at high tampering rates.
• Configurable rate allocation based on the recovery requirement.
The rest of this paper is organized as follows. The proposed method is discussed in Sec. 2. In Sec. 3, the tamper detection and recovery procedure is described, and Sec. 4 is dedicated to the experimental results. Finally, Sec. 5 concludes the paper.

The Proposed Method
Figure 1 shows the block diagram of the embedding process of the proposed method. The original image is first compressed using SPIHT, an image compression algorithm based on multi-level wavelet decomposition which generates scalable bitstreams [31]. Owing to the scalable property of SPIHT, the compressed bitstream is organized in multiple bitplanes providing several quality scales, from the basic quality to the enhancement levels. In this paper, the SPIHT bitstream is not arithmetic coded and only a portion at the beginning of the bitstream is used for generating reference data. The selected portion is partitioned into two parts according to the assigned source coding rates ($R_s^1$ and $R_s^2$). The proposed scalable self-embedding method takes advantage of the scalable property of the SPIHT compressed bitstream. Two unequal redundancy rates are allocated using LDPC channel coding.
The function of the rate-allocation unit is to assign source and redundancy rates according to the data embedding payload. In the permutation stage, the protected data parts are mixed and scrambled using a security key. Besides providing security, data permutation spreads the generated watermark all over the image. Therefore, the effect of tampering is uniformly distributed between parts one and two. In effect, the permutation prevents a tampered region from erasing a large contiguous portion of either part.
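The keyed scrambling and its inverse can be illustrated with a minimal sketch. It assumes a seeded pseudo-random shuffle from the Python standard library as the scrambler; the function names and the integer key are hypothetical, and this generator is not cryptographically secure, so it only stands in for whatever keyed permutation an actual implementation would use.

```python
import random

def keyed_permutation(n, key):
    """Pseudo-random ordering of range(n), reproducible from an integer key."""
    order = list(range(n))
    random.Random(key).shuffle(order)  # assumption: not a secure keyed permutation
    return order

def permute(bits, key):
    """Scramble the concatenated protected parts before embedding."""
    order = keyed_permutation(len(bits), key)
    return [bits[i] for i in order]

def inverse_permute(scrambled, key):
    """Undo permute() at the receiver, given the same key."""
    order = keyed_permutation(len(scrambled), key)
    restored = [None] * len(scrambled)
    for dst, src in enumerate(order):
        restored[src] = scrambled[dst]
    return restored

part_one = [1] * 8  # stand-in for the strongly protected codeword
part_two = [0] * 8  # stand-in for the weakly protected codeword
scrambled = permute(part_one + part_two, key=1234)
assert inverse_permute(scrambled, key=1234) == part_one + part_two
```

Because neighbouring positions of the scrambled stream come from unrelated positions of the two codewords, a contiguous tampered region maps back to scattered erasures in both parts.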
For generating check bits, the image is partitioned into 8 × 8 non-overlapping blocks. The six most significant bits (MSBs) of each pixel intensity, a second security key (key2) and the block's share of the generated watermark are used for check bit generation. An MD5 hash function is used for producing a digital signature. The generated digital signature is truncated to 32 bits, as in [23], [29]. Generating 32 bits for 64 pixels (8 × 8 block size) gives a check bit data rate of 0.5 bpp. The watermark, consisting of 1.5 bpp of reference data and 0.5 bpp of check bits, is embedded into the two least significant bits (LSBs) of the main image. Before data embedding, the 2 LSBs of the image are set to zero, according to (1).
In this equation, p(i, j) is the pixel intensity at coordinates (i, j). Note that the LSBs do not take part in check bit generation. Therefore, data embedding does not affect the integrity check of the image.
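The block-level embedding described above can be sketched as follows. This is only an illustration under stated assumptions: the byte serialization of the hash input, the choice of the first 32 digest bits, the order of reference bits before check bits, and the names `clear_two_lsbs`, `check_bits` and `embed` are all hypothetical, and the LSB clearing is written as a bit mask.

```python
import hashlib

def clear_two_lsbs(pixel):
    """Zero the two least significant bits of an 8-bit pixel."""
    return pixel & 0xFC

def check_bits(block, key, reference_bits):
    """32 check bits from MD5 over the six MSBs, a key and the block's reference bits."""
    msbs = bytes(p >> 2 for row in block for p in row)    # LSBs are excluded
    digest = hashlib.md5(msbs + key + bytes(reference_bits)).digest()
    word = int.from_bytes(digest[:4], "big")              # assumed truncation
    return [(word >> (31 - i)) & 1 for i in range(32)]

def embed(block, key, reference_bits):
    """Write 96 reference bits and 32 check bits into the 2 LSBs of an 8 x 8 block."""
    watermark = iter(reference_bits + check_bits(block, key, reference_bits))
    return [[clear_two_lsbs(p) | (next(watermark) << 1) | next(watermark)
             for p in row] for row in block]

block = [[(r * 8 + c) * 3 % 256 for c in range(8)] for r in range(8)]
reference = [i % 2 for i in range(96)]   # 1.5 bpp x 64 pixels
marked = embed(block, b"key2", reference)
# Embedding leaves the hashed MSBs untouched, so the check bits stay valid.
assert check_bits(marked, b"key2", reference) == check_bits(block, b"key2", reference)
```

The payload adds up per block: 96 reference bits plus 32 check bits fill exactly 2 bits in each of the 64 pixels.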
In the next two subsections, more details about producing the reference data are presented. Sec. 2.1 introduces the concept of dual-rate allocation, and Sec. 2.2 describes the channel coding algorithm.

Rate Allocation
In the proposed dual-rate allocation technique, the compressed bitstream is separated into two parts, part one with rate $R_s^1$ and part two with rate $R_s^2$, where $R_s^1 + R_s^2 = 0.5$ bpp. Two unequal redundancy rates are allocated using LDPC channel coding, $R_r^1$ for the first part and $R_r^2$ for the second part (Fig. 2). Since the first part provides the fundamental quality scale, assigning a higher redundancy rate to this part enables the recovery system to cope with high tampering rates. To fit the 1.5 bpp payload available for the reference data, the allocated rates should satisfy (2).
$$R_s^1 + R_r^1 + R_s^2 + R_r^2 = 1.5\ \text{bpp} \qquad (2)$$
In this paper, the source and redundancy rates are assigned as $R_s^1$ = 0.25 bpp and $R_r^1$ = 0.75 bpp for the first part, and $R_s^2$ = 0.25 bpp and $R_r^2$ = 0.25 bpp for the second part.
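The arithmetic behind this assignment can be checked with a short sketch. It assumes that the budget is the 1.5 bpp reference payload stated earlier and that each part is protected by one code whose rate k/n equals the source rate divided by the total rate of that part; the helper name `code_rate` is hypothetical. The erasure limit printed is the bound for an ideal erasure code, which finite-length LDPC codes approach but do not reach.

```python
def code_rate(source_rate, redundancy_rate):
    """Channel code rate k/n implied by a source rate and a redundancy rate (bpp)."""
    return source_rate / (source_rate + redundancy_rate)

# (source rate, redundancy rate) in bpp, as assigned in the text
parts = {"part 1": (0.25, 0.75), "part 2": (0.25, 0.25)}

total = sum(rs + rr for rs, rr in parts.values())
assert abs(total - 1.5) < 1e-12          # fills the 1.5 bpp reference payload
assert abs(sum(rs for rs, _ in parts.values()) - 0.5) < 1e-12

for name, (rs, rr) in parts.items():
    rate = code_rate(rs, rr)
    ideal_erasure_limit = 1 - rate       # upper bound on recoverable erasure fraction
    print(name, rate, ideal_erasure_limit)
# part 1 0.25 0.75
# part 2 0.5 0.5
```

Part one therefore carries three redundancy bits per source bit, while part two carries one, which is why only part one is expected to survive the higher tampering rates.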

Channel Coding: LDPC
The self-embedding problem has been modeled as an erasure communication channel [23], [29], because the watermark embedded in a tampered block is erased. Therefore, the proposed method takes advantage of finite-length, near-ideal low-density parity-check (LDPC) codes [32], designed for the binary erasure channel (BEC).
An (n, k) LDPC block code encodes a k-bit message $U = [u_1, u_2, \ldots, u_k]$ into an n-bit codeword V with m = n − k bits of redundancy, as in (3).

$$V = U \times G \qquad (3)$$

In (3), G is a generator matrix which can be found by performing Gauss-Jordan elimination on the parity check matrix H. An LDPC code is called low density because it uses a sparse parity check matrix, which generally yields a dense generator matrix. With a dense generator matrix, the encoding process would be very complex. In [33], the idea of a semi-random parity check matrix is proposed for reducing the complexity of the encoding process. According to this method, the parity check matrix consists of two parts: a deterministic part, which is mainly a diagonal matrix, concatenated with a random matrix (4).
The fast encoding method generates the codeword according to (5). In this paper, a three-step algorithm is used for designing the parity check matrix according to [34].
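The idea of encoding directly from a semi-random parity check matrix, without forming a dense G, can be sketched as below. The sketch assumes a dual-diagonal deterministic part (ones on the main diagonal and the sub-diagonal), a random part with a fixed number of ones per column, and a codeword ordered as parity bits followed by message bits; these choices, the function names and the toy sizes are illustrative assumptions and do not reproduce the three-step design of [34].

```python
import random

def random_part(m, k, col_weight, seed):
    """m x k sparse binary matrix with col_weight ones in every column."""
    rng = random.Random(seed)
    h = [[0] * k for _ in range(m)]
    for col in range(k):
        for row in rng.sample(range(m), col_weight):
            h[row][col] = 1
    return h

def dual_diagonal(m):
    """m x m deterministic part: ones on the main diagonal and the sub-diagonal."""
    return [[1 if c in (r, r - 1) else 0 for c in range(m)] for r in range(m)]

def encode(message, h_rand):
    """Parity bits by a running XOR: p_i = p_(i-1) + sum_j h_rand[i][j] * u_j (mod 2)."""
    parity, acc = [], 0
    for row in h_rand:
        acc ^= sum(h * u for h, u in zip(row, message)) % 2
        parity.append(acc)
    return parity + list(message)

def syndrome(codeword, h_det, h_rand):
    """H * codeword over GF(2), with H = [h_det | h_rand]."""
    rows = [d + r for d, r in zip(h_det, h_rand)]
    return [sum(a * b for a, b in zip(row, codeword)) % 2 for row in rows]

h_rand = random_part(m=6, k=6, col_weight=3, seed=1)
codeword = encode([1, 0, 1, 1, 0, 0], h_rand)
assert syndrome(codeword, dual_diagonal(6), h_rand) == [0] * 6
```

Each parity bit costs only the few ones in one sparse row plus a single XOR, so the encoding effort grows with the number of ones in H rather than with n × k.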

Tamper Detection and Recovery
Figure 3 shows the block diagram of tamper detection and recovery. In this procedure, the watermark, which consists of check bits and reference data, is extracted first. For tamper detection, the check bits extracted from a block are compared with those generated for the block. The check bits are produced in the same way as in the embedding process. Since the proposed method uses fragile watermarking for strict authentication, even small changes in the image content are considered tampering. Thus, lossy image compression is not acceptable because it changes the embedded data in the LSB layers, while lossless image compression is acceptable.
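A per-block authenticity test along these lines might look like the following sketch. It assumes that the two LSBs of an 8 × 8 block hold 96 reference bits followed by 32 check bits, and that the check bits are the first 32 bits of an MD5 digest over the six MSBs, a key and the reference bits; the layout, the serialization and the function names are hypothetical.

```python
import hashlib

def md5_check_bits(block, key, reference_bits):
    """Recompute 32 check bits from the six MSBs, a key and the reference bits."""
    payload = bytes(p >> 2 for row in block for p in row) + key + bytes(reference_bits)
    word = int.from_bytes(hashlib.md5(payload).digest()[:4], "big")
    return [(word >> (31 - i)) & 1 for i in range(32)]

def extract_watermark(block):
    """Read the two LSBs of each pixel; return (reference bits, stored check bits)."""
    bits = []
    for row in block:
        for p in row:
            bits += [(p >> 1) & 1, p & 1]
    return bits[:96], bits[96:]

def block_is_tampered(block, key):
    """A block is flagged when the stored and recomputed check bits differ."""
    reference_bits, stored = extract_watermark(block)
    return md5_check_bits(block, key, reference_bits) != stored

def tamper_mask(blocks, key):
    """Map a 2-D grid of 8 x 8 blocks to a grid of tamper flags."""
    return [[block_is_tampered(b, key) for b in row] for row in blocks]
```

Any change to the MSBs or to the embedded LSBs of a block alters either the hash input or the stored bits, so a modified block passes only with a probability of about 2^-32.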
For content restoration, a tampered block is replaced with the corresponding block from the reference image. The procedure for generating the reference image starts with inverse permutation of the extracted reference data [35]. Then, the data is partitioned into two parts, according to the rates assigned in Sec. 2.1. Both parts are channel decoded using an iterative message-passing algorithm. Then part 1 and part 2 are merged for SPIHT decoding in order to generate the reference image. In the recovery stage, the tampered regions of the image are replaced with the same regions of the reference image. The generated reference image provides two quality scales for tamper recovery. For low tampering rates, the image is recovered at scale 1 with higher restoration quality. Higher tampering rates are still tolerable at lower reconstruction quality, which is called scale 2.
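On an erasure channel, iterative message passing reduces to repeatedly solving parity checks that contain exactly one erased bit. The sketch below illustrates this with a toy (7, 4) code given as index lists; the code, the function name `peel` and the use of `None` for erased bits are illustrative assumptions, not the LDPC code used in the paper.

```python
def peel(received, checks):
    """Iterative erasure decoding.

    received: list of 0/1 values, with None marking an erased bit.
    checks:   for each parity check, the indices of the bits it covers.
    Returns the bit list; entries stay None where decoding gets stuck.
    """
    bits = list(received)
    progress = True
    while progress:
        progress = False
        for check in checks:
            unknown = [i for i in check if bits[i] is None]
            if len(unknown) == 1:
                # The bits of a check XOR to zero, so the known ones fix the erased one.
                known_sum = sum(bits[i] for i in check if bits[i] is not None)
                bits[unknown[0]] = known_sum % 2
                progress = True
    return bits

# Toy code: bits 0-3 are data, bits 4-6 are parities of three data subsets.
checks = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
codeword = [1, 0, 1, 1, 0, 0, 1]
damaged = [None, 0, 1, 1, 0, None, 1]    # bits 0 and 5 fall in a tampered block
assert peel(damaged, checks) == codeword
```

When too many bits of one part are erased, no check with a single unknown remains and the decoder stops with unresolved positions, which corresponds to the loss of that part at high tampering rates.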

Experimental Results
For performance evaluation, four standard 8-bit grayscale 512 × 512 images, Lena, Airplane, Lake, and Crowd, are used as original images (Fig. 4). The proposed method uses the two LSBs of the image for data embedding: 0.5 bpp for check bits and 1.5 bpp for reference data. These images are tamper-protected using the proposed self-embedding scheme. Figure 5 shows the corresponding watermarked images; their PSNR values are 44.15, 44.12, 44.16 and 44.18 dB, respectively.
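The reported watermarked-image PSNR values are close to what a simple model predicts. The sketch below assumes that the original two-LSB value of a pixel and the embedded two-bit value are independent and uniformly distributed over {0, 1, 2, 3}; this is a modeling assumption for illustration, not a statement about the test images.

```python
import math
from itertools import product

levels = range(4)  # possible values of a two-bit LSB pair

# Mean squared error when a uniform 2-bit value replaces an independent uniform one
mse = sum((x - y) ** 2 for x, y in product(levels, levels)) / 16
psnr = 10 * math.log10(255 ** 2 / mse)

print(round(mse, 2), round(psnr, 2))  # 2.5 44.15
```

Under this model the expected distortion of two-LSB embedding is about 44.15 dB, which is consistent with the measured range of 44.12 to 44.18 dB.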
Embedding distortion that leaves the PSNR above 36 dB is not considered noticeable by the human visual system [29]. The contents of the watermarked images in Fig. 5 are modified in different ways and at different tampering rates, as shown in Fig. 6. The tampering rate is defined as the ratio of the number of tampered pixels to the total number of image pixels. The tampering rates for the images in Fig. 6 are 27, 23.73, 5.15 and 32.03 percent, respectively. Fig. 7 shows the result of tamper detection and localization. In this figure, the tamper masks of the tampered images in Fig. 6 are displayed; white pixels represent tampered regions and black pixels are authentic. The tamper masks show that the authentication algorithm based on the MD5 function can accurately detect modifications. The localization accuracy is 8 × 8 pixels as a result of the block size. Although tamper detection is block-based, the proposed content recovery method is pixel-wise. Figure 8 shows the recovered versions of the tampered images in Fig. 6. The recovery quality for the Lena, Airplane, Lake and Crowd images is 35.51, 37.16, 41.28 and 33.18 dB, respectively. In Fig. 9, the recovery performance of the proposed method is compared with two related methods [29], [30]. Both methods use two LSBs for data embedding and perform pixel-wise image recovery. In this figure, the quality of the recovered image is plotted for various tampering rates. The tampering is created by cropping, so that the proposed method is evaluated under worst-case tampering. In this case, all pixel values in the tampered area are set to zero and thus the reference and authentication data there are destroyed. The tampered area is a square in the center of the image, sized according to the desired tampering rate, as in [29], [30]. Note that the image center usually contains the most important information of the image.
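The cropping test and the two measures used in this section can be sketched in a few lines. The sketch assumes images stored as lists of rows of 8-bit integers and rounds the square side to the nearest pixel, so the achieved tampering rate only approximates the requested one; the function names are hypothetical.

```python
import math

def center_crop_tamper(image, rate):
    """Zero a centered square covering about `rate` of the pixels; return (image, mask)."""
    h, w = len(image), len(image[0])
    side = round(math.sqrt(rate * h * w))
    top, left = (h - side) // 2, (w - side) // 2
    tampered = [row[:] for row in image]
    mask = [[0] * w for _ in range(h)]
    for r in range(top, top + side):
        for c in range(left, left + side):
            tampered[r][c] = 0   # erases content, reference bits and check bits
            mask[r][c] = 1
    return tampered, mask

def tampering_rate(mask):
    """Ratio of tampered pixels to all pixels."""
    return sum(map(sum, mask)) / (len(mask) * len(mask[0]))

def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    n = len(a) * len(a[0])
    mse = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

image = [[100] * 64 for _ in range(64)]
tampered, mask = center_crop_tamper(image, rate=0.25)
assert tampering_rate(mask) == 0.25
```

In an evaluation loop, the recovered image would be compared with the original using `psnr` for each requested rate, which yields one point of a quality versus tampering rate curve.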
According to the proposed scalable recovery method, the quality-robustness performance is formed in two steps (Fig. 9(a-d)). For low tampering rates, below 30 percent, the entire reference data is completely reconstructed, leading to the highest quality level (scale 1). For tampering rates above 30 percent, the second part of the reference data, which has the lower redundancy rate, cannot be decoded. Therefore, the quality of the recovered image is reduced to a lower quality level (scale 2). In return, the image can tolerate higher tampering rates, more than 45 percent. Although the method in [29] provides higher reconstruction quality, it is applicable only for low tampering rates, below 33 percent, whereas the proposed method can handle higher tampering rates (more than 45 percent). The simulation results in Fig. 9 show that the proposed method provides higher reconstruction quality than [30] for the test images at most tampering rates. The proposed method performs better for low-texture images such as Airplane (Fig. 9(b)) because the SPIHT algorithm is more efficient for low-texture images. In Fig. 9(d), the performance of the proposed method for the Crowd image is the same as that of [30] at high tampering rates.
For a more general result, the proposed method is evaluated on 1000 images; the average PSNR is 39.7 dB for the first scale and 35.86 dB for the second scale. Moreover, the standard deviations are 6.3 and 6.29 dB for scales one and two, respectively.

Conclusion
In this paper, a scalable self-embedding method based on a source-channel coding scheme was proposed. The proposed method generated the reference data by compressing the original image at 0.5 bpp. Then, the bitstream was partitioned into two parts. The first part provided a rough approximation of the main image and the second part provided an enhancement. Therefore, the first part was protected with a higher redundancy rate so that it survives high tampering rates. The second part was protected with a lower redundancy rate, which suffices for correcting low tampering rates. The proposed method used near-optimal LDPC codes for channel coding. The contributions of the proposed method are as follows:
(1) Better quality-robustness performance relative to reported methods. The proposed method improves both the recovered quality and the tolerable tampering rate.
(2) Scalable self-recovery, which not only provides higher reconstruction quality at low tampering rates but also increases the tolerable tampering rate. The first scale of the proposed method can recover from tampering rates of up to 30 percent with high restoration quality. The second scale can handle higher tampering rates (more than 45 percent), although at a lower quality level.
(3) Configurable rate allocation based on the recovery requirement. In most self-embedding methods, the system configuration is fixed, which limits their practicality. In the proposed method, quality levels can be designed by adjusting the source coding rates. The redundancy rates can also be adjusted according to the desired tolerable tampering rate.

Fig. 1. The block diagram of the embedding process of the proposed method.

Fig. 2. Rate allocation procedure for the proposed reference generation.

Fig. 3. Block diagram of tamper detection and recovery process of the proposed method.