Abstract

Confidential information can be hidden in digital images through data hiding technology, which has practical value for copyright and intellectual property protection, public information protection, and related applications. In recent years, researchers have proposed many data hiding schemes. However, existing schemes suffer from low hiding capacity or poor stego-image quality. This paper proposes a new method, multiple pixels-value adjustment with encoding function (MPA), which improves overall performance by achieving both high hiding capacity and good stego-image quality. The main idea is to divide n adjacent cover pixels into two sub-groups and perform multi-bit-based modulus operations in each sub-group. The efficacy of the proposed method is evaluated by peak signal-to-noise ratio (PSNR), embedding payload, structural similarity index (SSIM), and quality index (QI). At the maximum embedding payload of 5 bpp (bits per pixel), the recorded PSNR is 30.01 dB. In addition, the steganalysis tests applied in this paper do not detect the technique.

1. Introduction

With the rapid development of the Internet, more and more people transmit messages over the network, and at the same time they face the risk of information leakage. Hence, information security is receiving increasing attention. In recent years, many information security techniques have been studied, such as cryptography and data hiding. Cryptography transforms secret information into an incomprehensible form, so the encrypted message looks messy and meaningless. Its shortcoming is that the presence of an encrypted message is easily detected, and an attacker can then destroy it. Unlike cryptography, data hiding ensures that there is no easily detectable change in the carrier of the hidden secret message, which helps protect confidential information. One method of data hiding is to embed secret information in a cover image and generate a stego-image whose hidden content attackers are unlikely to notice or steal [1].

Data hiding can be classified in various ways; one common division is into spatial domain and frequency domain techniques. In spatial domain-based techniques, the pixel values of the image are directly modified to embed secret information, whereas in frequency domain-based techniques, the pixels are first transformed into coefficients (e.g., by the Fourier transform, Laplace transform, or Z-transform), and these coefficients are then modified to embed secret data. Of the two, spatial domain-based techniques are simpler and less time consuming. In the spatial domain, according to whether the cover image can be recovered after the secret data are extracted, data hiding can be divided into reversible data hiding [26] and irreversible data hiding. Reversible data hiding allows the cover image to be recovered without distortion once the secret data are extracted from the stego-image, which suits fields with high image-fidelity requirements, such as medicine and the military. In contrast, irreversible data hiding cannot restore the cover image, but it can embed more secret data and is easy to understand and implement.

In this paper, we focus on irreversible data hiding. In the past decades, many data hiding methods have been proposed. They can be divided into three categories: LSB-based, PVD-based, and EMD-based. First, the least significant bit (LSB)-based method embeds secret data in the least significant bits of cover pixels, which is the simplest and most direct approach. LSB is simple to implement, but the generated stego-images may draw suspicion or be easily detected. Second, in an image, a texture region can hide more secret data than a smooth region. The pixel value difference (PVD)-based method therefore hides less secret data in smooth regions and more in texture regions. To this end, scholars have proposed many variants of PVD [7, 8] or combined PVD with methods such as LSB [9, 10] or modulus functions. The last type is the exploiting modified direction (EMD)-based method, which uses special equations or modulus functions to embed secret data. In 2006, Zhang and Wang [11] proposed EMD, which makes full use of the modulus function to change the directional characteristics. The EMD method uses a function to embed a secret digit in the (2n + 1)-ary notational system into n cover pixels. When n = 2, EMD achieves its maximum embedding payload of 1.16 bpp. To improve the embedding capacity, many data hiding techniques based on EMD have been proposed [12]. In 2007, Lee et al. [13] proposed the improved EMD (IEMD) method, which embeds secret data using a modulus function and raises the embedding payload from 1.16 bpp to 1.5 bpp. However, the fixed number of cover pixels and the fixed weights in each group make this scheme less flexible and weaker against steganalysis. In 2009, Jung [14] proposed JY, which uses a function to embed n secret data transformed in the (2n + 1)-ary notational system into one cover pixel, increasing the single-pixel embedding payload to 2 bpp.

In 2013, Kuo and Wang [15] proposed the generalized exploiting modification direction (GEMD) method, an extension of IEMD in which the fixed weights are extended to vary with n. The method has an embedding payload of up to 1.5 bpp. In 2015, Kuo et al. [16] proposed the KKWW scheme, which optimizes the parameters and the modulus function of GEMD. The simulation results showed that the scheme is feasible. In 2019, Sairam and Boopathybagan [17] proposed the SB algorithm, which carries secret data using another notational system. The maximum embedding payload of this method is 3 bpp. To improve the embedding capacity further, in 2020, Zhang et al. [18] proposed the modulus calculations on prime number algorithm (MOPNA), which is based on the modulo operation with prime numbers. The embedding payload of this algorithm is increased to 3.5 bpp.

To further increase the embedding payload per pixel while ensuring the quality of the stego-image, this paper proposes a high-capacity data hiding scheme, MPA. The main contributions of this paper are summarized as follows.
(1) We propose a new scheme that groups the cover pixels and uses a modulus function for secret data embedding, which improves the data hiding capacity. With this method, the embedding payload can reach 5 bpp while the quality of the stego-image remains acceptable (PSNR > 30 dB).
(2) The correctness of the MPA scheme is demonstrated by mathematical proofs, and the experimental results support its performance and security.
(3) To make it easier for other scholars to verify the work, we have uploaded all the code and other materials to https://github.com/SunJie0916/MPA.

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 presents the new algorithm MPA and its mathematical proofs, followed by the experimental details, results, and security analysis of MPA. Finally, Section 4 concludes the paper.

2. Related Work

2.1. Exploiting Modified Direction (EMD) Algorithm

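Zhang and Wang proposed EMD, which embeds one secret digit in the (2n + 1)-ary notational system into n cover pixels. The payload of EMD is $\log_2(2n + 1)/n$ bpp, which peaks at 1.16 bpp when n = 2, so there is room for improvement. The algorithm is as follows. The inputs of EMD are n cover pixels $(x_1, \ldots, x_n)$ and a secret digit $s$. The output is n stego-pixels $(x'_1, \ldots, x'_n)$.
Step 1. Calculate the extraction function $f = \left(\sum_{i=1}^{n} i \cdot x_i\right) \bmod (2n + 1)$.
Step 2. The stego-pixels are obtained as follows. Let $d = (s - f) \bmod (2n + 1)$ and initialize $x'_i = x_i$.
 if $d \neq 0$ and $d \le n$, then $x'_d = x_d + 1$
 else if $d > n$, then $x'_{2n+1-d} = x_{2n+1-d} - 1$
Step 3. Calculate $f$ on the stego-pixels to get the secret data.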
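The EMD steps can be illustrated with a minimal Python sketch. It assumes the commonly cited form of the EMD extraction function (pixel i weighted by i, modulo 2n + 1); the function names `emd_extract` and `emd_embed` are hypothetical, and saturation at pixel values 0 and 255 is deliberately not handled.

```python
def emd_extract(pixels):
    """Extraction function: sum of i * x_i over the group, modulo (2n + 1)."""
    n = len(pixels)
    return sum(i * p for i, p in enumerate(pixels, start=1)) % (2 * n + 1)


def emd_embed(pixels, digit):
    """Embed one (2n + 1)-ary digit by changing at most one pixel by +/-1.

    Assumption: pixels are not at the 0/255 boundary (no saturation handling).
    """
    n = len(pixels)
    base = 2 * n + 1
    if not 0 <= digit < base:
        raise ValueError("digit must lie in [0, 2n]")
    stego = list(pixels)
    d = (digit - emd_extract(pixels)) % base
    if d == 0:
        return stego                # the group already encodes the digit
    if d <= n:
        stego[d - 1] += 1           # raise pixel d by one
    else:
        stego[base - d - 1] -= 1    # lower pixel (2n + 1 - d) by one
    return stego


if __name__ == "__main__":
    cover = [100, 101]
    for digit in range(5):
        stego = emd_embed(cover, digit)
        print(digit, stego, emd_extract(stego))
```

Because each digit costs at most one ±1 change per group, distortion stays very low, which is consistent with the low payload noted above.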

2.2. Improved EMD (IEMD) Algorithm

In 2007, Lee et al. proposed the data hiding algorithm IEMD. IEMD embeds 3 secret bits into each pixel pair, so its embedding payload is 1.5 bpp. The detailed steps are as follows. The inputs of IEMD are a cover pixel pair and a decimal digit converted from 3 secret bits. The output is the stego-pixel pair.
Step 1. Calculate the extraction function of the cover pixel pair.
Step 2. Obtain the stego-pixel pair by adjusting the two cover pixels according to which of seven cases applies to the result of Step 1 and the secret digit.
Step 3. Calculate the extraction function of the stego-pixel pair to get the secret data.
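As a rough illustration of embedding 3 bits in a pixel pair, the sketch below assumes the extraction function f(x1, x2) = (x1 + 3·x2) mod 8, which is the form usually attributed to IEMD; this weight choice is an assumption, since the formula is not reproduced above. Instead of the explicit case table, the hypothetical helper `iemd_embed` searches the nine ±1 adjustments, which reach every residue modulo 8.

```python
from itertools import product


def iemd_extract(x1, x2):
    """Assumed IEMD extraction function: (x1 + 3 * x2) mod 8."""
    return (x1 + 3 * x2) % 8


def iemd_embed(x1, x2, digit):
    """Embed a digit in [0, 7] by changing each pixel by at most 1.

    A brute-force search stands in for the case table of the original scheme.
    """
    if not 0 <= digit < 8:
        raise ValueError("digit must lie in [0, 7]")
    for d1, d2 in product((0, 1, -1), repeat=2):
        y1, y2 = x1 + d1, x2 + d2
        in_range = 0 <= y1 <= 255 and 0 <= y2 <= 255
        if in_range and iemd_extract(y1, y2) == digit:
            return y1, y2
    # Only reachable for pixels at the 0/255 boundary.
    raise ValueError("no in-range +/-1 adjustment encodes this digit")


if __name__ == "__main__":
    for digit in range(8):
        print(digit, iemd_embed(100, 57, digit))
```

Three bits per pair gives the 1.5 bpp payload of IEMD.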

2.3. JY

Jung proposed the JY algorithm. The payload of the algorithm is up to 2 bpp. The details of JY are as follows. The inputs of JY are one cover pixel and n secret bits. The output is the stego-pixel.
Step 1. Calculate the extraction function of the cover pixel.
Step 2. Based on the value of the cover pixel, choose the range of x from the three cases of the scheme and substitute the candidate values in order into the extraction function. Choose the x that yields the secret data; this x gives the stego-pixel.
Step 3. Calculate the extraction function of the stego-pixel to get the secret data.

2.4. Generalized Exploiting Modification Direction (GEMD) Algorithm

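In 2013, Kuo and Wang proposed the GEMD algorithm, which is an extended version of the IEMD scheme. The embedding payload of GEMD can reach 1.5 bpp. The inputs of GEMD are n adjacent pixels $(x_1, \ldots, x_n)$ and a decimal digit $s$ converted from n + 1 secret bits. The output is n stego-pixels $(x'_1, \ldots, x'_n)$.
Step 1. Calculate $f = \left(\sum_{i=1}^{n} (2^i - 1)\, x_i\right) \bmod 2^{n+1}$.
Step 2. The stego-pixels are obtained as follows. Let $D = (s - f) \bmod 2^{n+1}$ and initialize $x'_i = x_i$.
 if $D = 2^n$, then $x'_n = x_n + 1$, $x'_1 = x_1 + 1$
 else if $0 < D < 2^n$, transform $D$ to $(d_n d_{n-1} \ldots d_0)_2$ and for i = n down to 1 do
  if $d_i = 0$ and $d_{i-1} = 1$, then $x'_i = x_i + 1$
  if $d_i = 1$ and $d_{i-1} = 0$, then $x'_i = x_i - 1$
 else if $D > 2^n$, transform $D' = 2^{n+1} - D$ to $(d_n d_{n-1} \ldots d_0)_2$ and for i = n down to 1 do
  if $d_i = 0$ and $d_{i-1} = 1$, then $x'_i = x_i - 1$
  if $d_i = 1$ and $d_{i-1} = 0$, then $x'_i = x_i + 1$
Step 3. Calculate $f$ on the stego-pixels to get the secret data.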

2.5. KKWW

KKWW was proposed by Kuo et al. in 2015. It is a high-capacity data hiding scheme based on a multi-bit encoding function. The embedding payload of this scheme is up to 4.25 bpp. The details of KKWW are as follows. The inputs of KKWW are n adjacent pixels and the decimal data converted from nk + 1 secret bits. The output is n stego-pixels.
Step 1. Calculate the extraction function of the n cover pixels.
Step 2. Obtain the stego-pixels case by case from the difference between the secret data and the result of Step 1. In two of the cases, the difference is first transformed into a digit sequence, and the pixels are then adjusted in a loop over i.
Step 3. Calculate the extraction function of the stego-pixels to get the secret data.

2.6. SB

In 2019, Sairam and Boopathybagan proposed the high-capacity information hiding scheme SB. The embedding payload of SB can reach 3 bpp. The detailed steps are as follows. The inputs of SB are a cover pixel and a digit, in the notational system used by the scheme, converted from n secret bits. The output is the stego-pixel.
Step 1. Search the candidate values of the variable x in a loop and stop (break) at the first x that satisfies the embedding condition; this x gives the stego-pixel.
Step 2. Calculate the extraction function of the stego-pixel to get the secret data.

2.7. Modulus Calculations on Prime Number (MOPNA) Algorithm

In 2020, Zhang et al. proposed MOPNA, a method based on modulo computation with prime numbers. This method uses a function called changes in pixel values (CPV) to measure the effect of embedding. The maximum embedding payload of MOPNA is 3.5 bpp. The details are as follows. The inputs of MOPNA are a cover pixel pair and the decimal data converted from 2n + 1 secret bits. The output is the stego-pixel pair.
Step 1. Calculate the extraction function of the cover pixel pair.
Step 2. Choose the variable X that satisfies the embedding equation of the scheme and its accompanying constraint.
Step 3. Calculate the extraction function of the stego-pixel pair to get the secret data.

3. Proposed Method

3.1. Multiple Pixels-Value Adjustment with Encoding Function (MPA) Algorithm

Inspired by EMD, IEMD, and KKWW, we design a new scheme, MPA, to embed secret data into cover images. MPA divides the cover image into pixel groups of n pixels each. In the embedding process, the n pixels are regrouped into two sub-groups. After that, the nk + 2 secret bits, converted to decimal, are split accordingly using equation (1) and embedded into the sub-groups. The MPA algorithm can embed up to 5 bits of secret data in each pixel while preserving the quality of the stego-image. The details of the algorithm are described as follows. Figure 1 shows the flowchart of the proposed embedding process (Algorithm 1). The inputs of MPA are n adjacent pixels and the decimal data converted from the nk + 2 secret bits s, where n is the number of pixels in a group and k adjusts the amount of secret data embedded. The output is n stego-pixels.

Step 1. Divide n adjacent pixels into two groups and , where .
Step 2. Transform s into decimal by formula
Step 3. Construct the Algorithm 1, where , are integers and .
    
Step 4. Bring into Algorithm 1 and obtain
    
 Where
Step 5. Calculate the difference D between into by Algorithm 1.
    
Step 6. is obtained by embedding into in the following steps.
 if
  
 if
  ,
 if
  transform to , for each i in do
   
   , if
 if
  transform to , for each i in
   
   , if
Step 7. Bring into Algorithm 1 to calculate and get the difference D between and by . Finally, repeat step 6 to obtain .
Step 8. Merge and to get .
Step 9. Repeat Steps 1 to 8 until all secret data are embedded.

Following the embedding algorithm, the secret data are embedded in the cover image to obtain the stego-image, which is transmitted from the sender to the receiver. The receiver extracts the secret information from the stego-image by the following extraction process. Figure 2 shows the flowchart of the data extraction process.

The inputs are n stego-pixels and . The output is the secret data s.
Step 1. Compute the secret and .
Step 2. Convert the value of and to binary and combine them to get s.
Step 3. Repeat Steps 1 and 2 until all secret data are extracted.

We use an example to illustrate the embedding process and the extraction process. The specific process is described below (Algorithms 2 and 3).

Input 4 adjacent pixels and nk+2 = 14 binary secret data s.
Output 4 stego-pixels .
Step 1. Divide 4 adjacent pixels into two groups and .
Step 2. Transform s into decimal .
Step 3. Construct the to get and .
Step 4. Bring and into Equation 2 to calculate , .
Step 5. Calculate by Algorithm 1.
Step 6. is obtained by embedding into in the following steps.
,
  transform to , ,
  for , , , ,
  for , ,
 get .
Step 7. Bring into Algorithm 1 to calculate and get the difference D between and by .
,
  transform to , ,
  for , , ,
  for , ,
 get .
Step 8. Merge and to get .
Input Stego-pixels and .
Output Secret data .
Step 1. Compute the secret and .
Step 2. Convert the value of and to binary and combine them to get .

3.2. Math Proof: MPA Can Embed (nk + 2)-Bit Secret Data into n Cover Pixels

In this section, we prove that nk + 2-bit secret data can be split according to the grouping of n cover pixels. In our scheme, the n cover pixels are divided into two groups , , and .

When s has nk + 2 secret data, the decimal .

, .

It is easy to know that can be uniquely represented, , where , are integers, and .

, can be embedded into and cover pixels, respectively.
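The splitting argument can be made concrete with a short Python sketch. It rests on an assumption inferred from the bit count nk + 2 = (n1·k + 1) + (n2·k + 1) and from the 7-bit range [0, 127] used in Section 3.3: the leading n1·k + 1 bits go to the first sub-group and the remaining n2·k + 1 bits to the second. The exact form of the split, and the names `split_secret` and `merge_secret`, are hypothetical.

```python
def split_secret(bits, n1, n2, k):
    """Split an (nk + 2)-bit string into two sub-group values (assumed layout).

    s = s1 * 2**(n2*k + 1) + s2, with 0 <= s2 < 2**(n2*k + 1).
    """
    n = n1 + n2
    if len(bits) != n * k + 2:
        raise ValueError("expected nk + 2 secret bits")
    s = int(bits, 2)
    s1, s2 = divmod(s, 2 ** (n2 * k + 1))
    return s1, s2


def merge_secret(s1, s2, n1, n2, k):
    """Inverse of split_secret: rebuild the (nk + 2)-bit string."""
    s = s1 * 2 ** (n2 * k + 1) + s2
    width = (n1 + n2) * k + 2
    return format(s, "0{}b".format(width))


if __name__ == "__main__":
    secret = "10110011100101"            # 14 bits: n = 4, k = 3
    s1, s2 = split_secret(secret, 2, 2, 3)
    print(s1, s2)                        # each value fits in 7 bits
    print(merge_secret(s1, s2, 2, 2, 3) == secret)
```

Because `divmod` yields a unique quotient and remainder, the representation is unique, which is the property the proof relies on.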

3.3. Math Proof: MPA Can Embed Secret Data into Cover Image Correctly

In this section, we need to prove that regardless of the values of and , the values of D must satisfy the four embedding conditions in Step 6. When and , the secret data and convert to decimal .

With the modulus operator, .

.

Let and , and the original equation can be changed to .

.

, which is an uncertain variable.

, where , , and they are two pixels with definite values.

is a constant value and .

, and z is a definite fixed value. , y is an uncertain variable taking values from [0, 127], and z is a definite fixed value taking values in the range [0, 127]. D must take values in the range [0, 127]. The range of D must satisfy the four embedding conditions in step 6, so all the secret data must be able to be embedded in the cover image.

3.4. Math Proof: MPA Can Extract Secret Data from Stego-Image Correctly

In the previous subsection, we have proved that the range of D must satisfy the four embedding conditions in step 6. In this section, it is proved that the secret data extracted from the equation must be equal to the embedded secret data . That is, . When and , (1)if , then (2)if , then , (3)if , then transform to , and , is to convert from octal to decimal, and (4)if , then transform to and ,

Therefore, the algorithm can recover the secret data from stego-images.

3.5. Experimental Results and Security Analysis

In this section, we test the proposed algorithm and present the results. Our method and the other algorithms are implemented in Python. The experiments are run on a personal computer with an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz and 16 GB of RAM. The operating system is 64-bit Windows 10 Professional, and the development environment is PyCharm. The images used for the experiments are a series of standard-size grayscale images (Baboon, Barbara, Boat, F-16, Goldhill, Lena, Pepper, and Tiffany) [19], which are shown in Figure 3.

The evaluation of the method in this paper covers stego-image quality, payload, PSNR, SSIM, QI, and several steganalysis tests.

3.6. The Quality of Stego-Image

In this experiment, the number of secret data (NCD) bits is set to 262144 and 327680. When n = 2 and k = 3, the payload of MPA is 4 bpp. There are 256 × 256 = 65536 pixels available for embedding secret data, so the largest number of secret data bits that can be embedded in a cover image is 65536 × 4 = 262144. The second NCD value, 327680 bits, corresponds to a payload of 5 bpp. Figure 4 shows the stego-images when the length of the secret data is 262144 bits. Figure 5 shows the stego-images when the length of the secret data is 327680 bits.

It is difficult to detect the difference between the cover images and the stego-images with the human visual system. Hiding information in an image so that it is not noticed by an attacker is exactly the purpose of data hiding techniques.

3.7. Payload

Payload is the number of secret data bits that can be embedded per pixel, measured in bits per pixel (bpp). It is defined as

$$\text{payload (bpp)} = \frac{\text{number of embedded secret bits}}{\text{number of cover pixels}}$$

A higher bpp means that more secret data can be embedded in a pixel. Conversely, a low bpp means poor embedding efficiency. The characteristics of the embedding schemes, including the number of adjacent pixels in a group, the number of embedded bits per group, and the payload (bpp), are summarized in Table 1. In this table, n is the number of pixels in a group and k adjusts the amount of secret data embedded. For MPA, the experiments use n = 2 with k = 2, 3, 4 and n = 4 with k = 2, 3, 4.

The number of adjacent pixels in a group differs between the data hiding schemes. There are n, 2, 1, n, n, 1, 2, and n adjacent pixels in a group for EMD, IEMD, JY, GEMD, KKWW, SB, MOPNA, and MPA, respectively. These schemes carry $\log_2(2n + 1)$, 3, 2, n + 1, nk + 1, k, 2k + 1, and nk + 2 bits per group, respectively. The resulting payload of each algorithm is shown in Table 1.

We calculate the specific values and plot them in Figure 6. It can be observed that the bpp of the algorithms tends to increase as n decreases or k increases, and that the MPA algorithm achieves a maximum embedding payload of 5 bpp.
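The payload figures follow directly from the bits-per-group and pixels-per-group values listed above. The following Python sketch tabulates them; the function name `payload_bpp` is hypothetical, and the per-scheme formulas are taken from the text rather than from the original references.

```python
import math


def payload_bpp(scheme, n=None, k=None):
    """Payload in bits per pixel: bits carried per group / pixels per group."""
    table = {
        "EMD": lambda: (math.log2(2 * n + 1), n),
        "IEMD": lambda: (3, 2),
        "JY": lambda: (2, 1),
        "GEMD": lambda: (n + 1, n),
        "KKWW": lambda: (n * k + 1, n),
        "SB": lambda: (k, 1),
        "MOPNA": lambda: (2 * k + 1, 2),
        "MPA": lambda: (n * k + 2, n),
    }
    bits, pixels = table[scheme]()
    return bits / pixels


if __name__ == "__main__":
    print(round(payload_bpp("EMD", n=2), 2))
    print(payload_bpp("MOPNA", k=3))
    for n in (2, 4):
        for k in (2, 3, 4):
            print("MPA", n, k, payload_bpp("MPA", n=n, k=k))
```

For MPA the payload equals k + 2/n, so the smallest group size and the largest k in the experiments (n = 2, k = 4) give the 5 bpp maximum.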

3.8. PSNR, SSIM, and QI

PSNR is a major metric in the field of information security, and many studies use it to evaluate the performance of information hiding methods. PSNR > 40 dB means that the difference between the cover image and the stego-image is small and the secret information hidden in it is not easily detected. 30 dB ≤ PSNR ≤ 40 dB means that the quality of the stego-image is acceptable. PSNR < 30 dB means that the quality of the stego-image is poor and the hidden secret information is likely to be detected and attacked. In the field of information hiding, the secret data hidden in the stego-image are not perceived by the human visual system when PSNR ≥ 30 dB. PSNR is defined in equations (5) and (6):

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \quad (5)$$

$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left(p_{ij} - p'_{ij}\right)^2 \quad (6)$$

where PSNR is measured in dB, M and N represent the length and width of the image, $p_{ij}$ represents the pixel value at position $(i, j)$ of the cover image, and $p'_{ij}$ represents the pixel value at position $(i, j)$ of the stego-image.

SSIM is another metric used to show the similarity between the cover image and the stego-image. The value of SSIM is between 0 and 1; the closer it is to 1, the more similar the stego-image is to the cover image. Its calculation is presented in equation (7):

$$\mathrm{SSIM}(o, s) = \frac{(2\mu_o \mu_s + c_1)(2\sigma_{os} + c_2)}{(\mu_o^2 + \mu_s^2 + c_1)(\sigma_o^2 + \sigma_s^2 + c_2)} \quad (7)$$

where o and s represent the cover image and the stego-image, $c_1$ and $c_2$ are constants, $\mu_o$ and $\mu_s$ are the means, $\sigma_{os}$ is the covariance, $\sigma_o^2$ and $\sigma_s^2$ are the variances, and $\sigma_o$ and $\sigma_s$ are the standard deviations of o and s.

QI is used to measure the equivalence of the stego-image with the cover image, and its value is estimated using equation (8) [20]:

$$\mathrm{QI} = \frac{4\,\sigma_{os}\,\bar{o}\,\bar{s}}{(\sigma_o^2 + \sigma_s^2)\left(\bar{o}^2 + \bar{s}^2\right)} \quad (8)$$

where $\bar{o}$ represents the average pixel value of the cover image and $\bar{s}$ represents the average pixel value of the stego-image.
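As a small worked example of these metrics, the Python sketch below computes MSE, PSNR, and a global (single-window) quality index for 8-bit grayscale images stored as nested lists. It assumes the standard definitions of PSNR and of the universal image quality index; the function names are hypothetical, and practical SSIM or QI implementations usually average the statistic over local windows, which this sketch does not do.

```python
import math


def _flatten(image):
    return [p for row in image for p in row]


def mse(cover, stego):
    """Mean squared error between two equally sized grayscale images."""
    a, b = _flatten(cover), _flatten(stego)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(cover, stego, peak=255):
    """PSNR in dB; infinite when the two images are identical."""
    error = mse(cover, stego)
    if error == 0:
        return float("inf")
    return 10 * math.log10(peak ** 2 / error)


def quality_index(cover, stego):
    """Global quality index computed over the whole image (one window)."""
    a, b = _flatten(cover), _flatten(stego)
    count = len(a)
    mean_a, mean_b = sum(a) / count, sum(b) / count
    var_a = sum((x - mean_a) ** 2 for x in a) / count
    var_b = sum((y - mean_b) ** 2 for y in b) / count
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / count
    denominator = (var_a + var_b) * (mean_a ** 2 + mean_b ** 2)
    if denominator == 0:
        raise ValueError("quality index is undefined for constant images")
    return 4 * cov * mean_a * mean_b / denominator


if __name__ == "__main__":
    cover = [[10, 20], [30, 40]]
    stego = [[11, 20], [30, 39]]
    print(mse(cover, stego), psnr(cover, stego), quality_index(cover, stego))
```

A PSNR computed this way can be compared directly against the 30 dB and 40 dB thresholds discussed above.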

To compare and analyze the overall performance of the proposed algorithm, we compare MPA with seven other algorithms. In our experiments, the NCD is set to 49000 bits and 98305 bits, respectively. Among all the compared algorithms, EMD has the lowest embedding capacity. The payload of EMD is 0.79 bpp when there are 4 adjacent pixels in each group (n = 4). There are 256 × 256 = 65536 pixels available for embedding secret data, so the maximum number of secret data bits that EMD can embed is about 65536 × 0.79 ≈ 51773. NCD = 49000 is therefore selected so that EMD can be included in the comparison.

When NCD = 49000 bits, the PSNR, SSIM, and QI of the eight algorithms are shown in Table 2. When n = 4, EMD has the highest PSNR (54.66 dB) but the lowest payload. IEMD (1.5 bpp), JY (2 bpp), and GEMD (highest payload 1.5 bpp) achieve better PSNR, but their highest payloads are lower than that of MPA. For KKWW, SB, MOPNA, and MPA, lower k values give a lower embedding payload but better PSNR, whereas higher k values give a higher embedding payload but worse PSNR. When n = 1 and k = 3, SB reaches its highest payload of 3 bpp, but its PSNR of 39.89 dB is lower than that of MPA (3 bpp, PSNR = 42.54 dB). When n = 2 and k = 2, MOPNA has a payload of 2.5 bpp, but its PSNR of 42.79 dB is lower than that of MPA (2.5 bpp, PSNR = 43.15 dB). When n = 2 and k = 3, MOPNA reaches its highest payload of 3.5 bpp with PSNR = 37.93 dB. MPA performs slightly lower than MOPNA and KKWW (PSNR = 36.58 dB), but the highest payload of MPA is higher than that of MOPNA. Because KKWW and MPA reach different payloads for a given n and k, we plot Figure 7 for a more intuitive comparison. Figure 7 shows that the PSNR curve of MPA is generally better than that of KKWW. Even when MPA reaches its highest payload of 5 bpp, the stego-image quality remains acceptable (PSNR = 30.01 dB). In conclusion, MPA does better when embedding payload and stego-image quality are considered together.

In addition to PSNR, the SSIM and QI results of MPA also indicate that the quality of the stego-images is acceptable even though a large amount of secret data is hidden in the cover images. When n = 2, the SSIM values are 0.9944, 0.9819, and 0.9580 for k = 2 to 4, respectively, and the QI values are all around 0.99, which indicates that the quality of the stego-images is good. When n = 4, the SSIM values are 0.9941, 0.9827, and 0.9597 for k = 2 to 4, respectively, and the QI values are around 0.99, so the changes introduced by MPA are imperceptible. No visible difference can be found between the cover image and the stego-image.

When NCD = 98305 bits, the simulation results are shown in Table 3. In this case, 98305 bits of secret data cannot be embedded by IEMD, GEMD, or EMD. Only schemes with a payload of at least 2 bpp can embed 98305 bits of secret data. As observed in Table 3, JY reaches its highest payload of 2 bpp with PSNR = 37.73 dB, which is lower than that of MPA (PSNR = 43.25 dB at 2.5 bpp). When k = 2, 3, the PSNR of SB is lower than that of MPA. The PSNR of MOPNA is slightly higher than that of MPA when k = 2, 3, but the maximum payload of MOPNA is 3.5 bpp, which is lower than that of MPA (5 bpp). To compare KKWW and MPA more intuitively, we plot Figure 8. The figure shows that the PSNR curve of MPA is generally better than that of KKWW. Only MPA reaches a payload of 5 bpp, and even then the quality of the stego-image remains acceptable (PSNR = 30.31 dB). In summary, MPA does better when both payload and stego-image quality are considered.

When n = 2, the SSIM values of MPA are 0.9899, 0.9655, and 0.9437 for k = 2 to 4, respectively, and the QI values are all around 0.99. When n = 4, the SSIM values of MPA are 0.9898, 0.9763, and 0.9567 for k = 2 to 4, respectively, and the QI values are around 0.99. These results indicate that it is difficult to find artificial traces in the stego-images generated by MPA.

3.9. Steganalysis

To further illustrate the security of the MPA scheme, we apply a bit-plane attack for testing. A bit-plane attack is a visual attack that tries to reveal hidden information by extracting the bit planes of an image. In this section, bit-plane attack experiments are performed on the cover image Lena and on the stego-image Lena embedded with 49000 bits of secret data.

Comparing Figures 9 and 10, there is no visible difference between the corresponding bit planes of the stego-image and the cover image after the bit-plane attack. That is, the attacker cannot obtain the secret data through the bit-plane attack, which supports the security of the MPA scheme.
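Bit-plane extraction itself is simple to reproduce. The Python sketch below (the function name `bit_plane` is hypothetical) returns one binary plane of an 8-bit grayscale image stored as nested lists; rendering the eight planes of a cover image and a stego-image side by side gives the kind of visual comparison described above.

```python
def bit_plane(image, b):
    """Return bit plane b (0 = least significant, 7 = most significant)."""
    if not 0 <= b <= 7:
        raise ValueError("b must lie in [0, 7] for 8-bit images")
    return [[(pixel >> b) & 1 for pixel in row] for row in image]


if __name__ == "__main__":
    image = [[178, 255], [0, 1]]
    for b in range(8):
        print(b, bit_plane(image, b))
```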

In RS steganalysis, n adjacent pixels $(x_1, x_2, \ldots, x_n)$ are taken as a group. Then, the discrimination function F, defined as $F(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n-1} |x_{i+1} - x_i|$, is used to quantify the smoothness or regularity of each pixel group. Finally, the flip function is applied to define three types of pixel groups: regular, singular, and unusable, represented by R, S, and U, respectively. The percentages of regular and singular groups with masks M = [1 0 0 1] and −M = [−1 0 0 −1] are denoted $R_M$, $S_M$, $R_{-M}$, and $S_{-M}$. According to the RS attack principle, if $R_{-M} - S_{-M} > R_M - S_M$, then steganography is detected. If $R_M \approx R_{-M} > S_M \approx S_{-M}$, then the RS test fails to detect the steganography. Figure 11(a) depicts the RS curves for the F-16 image (n = 2, k = 2), Figure 11(b) for the Barbara image (n = 2, k = 3), Figure 11(c) for the Boat image (n = 2, k = 4), Figure 11(d) for the Pepper image (n = 4, k = 2), Figure 11(e) for the Lena image (n = 4, k = 3), and Figure 11(f) for the Tiffany image (n = 4, k = 4). The x-axis represents the percentage of the total amount of data that MPA can embed under the given conditions. For example, when n = 4, k = 2, and bpp = 2.5, the total amount of data that MPA can embed is 65536 × 2.5 = 163840 bits.

The y-axis represents the percentage of regular ($R_M$ and $R_{-M}$) and singular ($S_M$ and $S_{-M}$) groups. It can be observed from all the subfigures that $R_M \approx R_{-M} > S_M \approx S_{-M}$. Thus, the conclusion can be drawn that the RS attack fails to break the proposed scheme.
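The RS statistics can be sketched in a few lines of Python. The sketch assumes the standard RS definitions: the flip F1 swaps 2m with 2m + 1, the shifted flip F−1 swaps 2m with 2m − 1, and a group is regular (singular) when flipping increases (decreases) the discrimination function. Function names are hypothetical, and the image is treated as a flat list of pixels cut into non-overlapping groups.

```python
def discrimination(group):
    """Smoothness measure: sum of absolute differences of neighbours."""
    return sum(abs(b - a) for a, b in zip(group, group[1:]))


def flip(value, m):
    """Apply F1 (m = 1), F-1 (m = -1) or the identity (m = 0)."""
    if m == 1:
        return value ^ 1                # 0<->1, 2<->3, ...
    if m == -1:
        return ((value + 1) ^ 1) - 1    # -1<->0, 1<->2, ...
    return value


def rs_fractions(pixels, mask):
    """Return (fraction regular, fraction singular) for the given mask."""
    size = len(mask)
    regular = singular = total = 0
    for start in range(0, len(pixels) - size + 1, size):
        group = pixels[start:start + size]
        flipped = [flip(p, m) for p, m in zip(group, mask)]
        before, after = discrimination(group), discrimination(flipped)
        if after > before:
            regular += 1
        elif after < before:
            singular += 1
        total += 1
    if total == 0:
        raise ValueError("not enough pixels for one group")
    return regular / total, singular / total


if __name__ == "__main__":
    pixels = [10, 10, 10, 10, 11, 10, 10, 11]
    print(rs_fractions(pixels, [1, 0, 0, 1]))     # R_M, S_M
    print(rs_fractions(pixels, [-1, 0, 0, -1]))   # R_-M, S_-M
```

Evaluating both masks on stego-images at increasing embedding rates yields the four curves whose relationship is checked in the test described above.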

Pixel difference histogram (PDH) analysis is performed by plotting PDH graphs. As shown in Figure 12, the PDH is a two-dimensional curve, where the x-axis represents the difference between two adjacent pixels in each group and the y-axis represents the frequency of each difference value. In general, the PDH curve of a cover image is smooth. If the PDH curve of the stego-image shows a zigzag shape, steganography is detected; otherwise, it is not detected. Figure 12 depicts the PDH test of MPA on six cover images: (a) Airplane, (b) Barbara, (c) Boat, (d) Pepper, (e) Lena, and (f) Tiffany. Each of the six plots contains two curves. The blue curve is the cover image, and the orange curve is the stego-image. In all the plots, the orange curve shows no zigzag pattern. Therefore, the PDH test fails to reveal the proposed technique.
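A PDH can be computed with a simple counter. The Python sketch below assumes non-overlapping horizontal pixel pairs taken from a flat list of pixels; the pairing convention and the function name `pixel_difference_histogram` are assumptions made for illustration.

```python
from collections import Counter


def pixel_difference_histogram(pixels):
    """Count the differences between the two pixels of each adjacent pair."""
    return Counter(
        pixels[i] - pixels[i + 1] for i in range(0, len(pixels) - 1, 2)
    )


if __name__ == "__main__":
    histogram = pixel_difference_histogram([10, 12, 7, 7, 3, 5])
    for difference in sorted(histogram):
        print(difference, histogram[difference])
```

Plotting the counts against the difference value for a cover image and its stego-image gives the two curves compared in the PDH test.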

4. Conclusions

In this paper, a new data hiding algorithm, MPA, is proposed. It groups n adjacent pixels and splits the secret data accordingly. The secret data are embedded into the pixels of each group using a multi-bit encoding function to achieve a high embedding payload and high-quality stego-images. The experiments show that MPA offers a better trade-off between payload and PSNR than the compared techniques. The SSIM and QI values are acceptable, which implies that the stego-image generated by MPA is similar to the cover image and is unlikely to make an attacker suspicious. The bit-plane attack does not reveal the secret information. The four curves from the RS test satisfy the condition under which the RS attack fails to detect steganography. Similarly, the PDH curves of the stego-images show no zigzag pattern and are consistent with those of the cover images, so the PDH test does not disclose the steganographic technique. In the future, we will try to apply this algorithm to more applications, such as watermarking and image authentication.

Data Availability

The data and code used to support MPA scheme are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.