An Adaptive Image Watermarking Method Combining SVD and Wang-Landau Sampling in DWT Domain

To keep a better trade-off between robustness and imperceptibility is difficult for traditional digital watermarks. Therefore, an adaptive image watermarking method combining singular value decomposition (SVD) and the Wang–Landau (WL) sampling method is proposed to solve the problem. In this method, the third-level approximate sub-band obtained by applying the three-level wavelet transform is decomposed by SVD to obtain the principal component, which is firstly selected as the embedded position. Then, the information is finally embedded into the host image by the scaling factor. The Wang–Landau sampling method is devoted to finding the best embedding coefficient through the proposed objective evaluation function, which is a global optimization algorithm. The embedding strength is adaptively adjusted by utilizing the historical experience, which overcomes the problem of falling into local optimization easily in many traditional optimization algorithms. To affirm the reliability of the proposed method, several image processing attacks are applied and the experimental results are given in detail. Compared with other existing related watermarking techniques based on both qualitative and quantitative evaluation parameters, such as peak signal to noise ratio (PSNR) and normalized cross-correlation (NC), this method has been proven to achieve a trade-off between robustness and invisibility.


Introduction
Digital watermarking [1][2][3] technology is an important scheme to protect the security of image, audio and video and even the data collected by the Internet of Things(IoT) technology. Through embed secret information into the target source, digital watermarking technology is widely applied in many fields, such as unauthorized copying and modification of digitized works [4][5][6]. Embedded secret information is generally obtained by some transformation of copyright information, which is generally invisible or undetectable and can only be extracted by a special reader or detector. Compared with encryption technology [7], digital watermarking technology can provide further protection for decrypted information.
Certain attacks on image watermarks compel to develop a digital watermarking algorithm that ensures the quality of the image, which denotes that the embedded watermark will not be visualized by human eyes and can still be extracted using the same pattern when the carrier has been modified. However, the robustness and imperceptibility of digital watermarking are contradictory, and the improvement of watermark robustness is often accompanied by the descending of the imperceptibility. Therefore, to obtain a trade-off between robustness and imperceptibility, many researchers introduce an intelligence optimization algorithm to obtain optimal embedding strength adaptively, such as Genetic algorithm (GA) [8,9], Particle Swarm Optimization algorithm (PSO) [10,11] and Artificial bee colony [12]. Meanwhile, these approaches have usually combined singular value decomposition (SVD) with transform algorithms over the past few years, such as DCT, DWT, DFT and RDWT [13][14][15][16][17][18][19]. Furthermore, DWT combined with the SVD algorithm has attracted extensive attention for its advantages of large embedding amount and better robustness. Therefore, to optimize the watermarking algorithm combining SVD decomposition and DWT transformation is a relatively common strategy. Zhou et al. [20] proposed a strongly robust hybrid watermarking method combining the discrete wavelet transform (DWT) with all phase discrete cosine biorthogonal transform (APDCBT) and singular value decomposition (SVD). The direct current (DC) coefficients of block-based APDCBT in high frequency subband (LH and HL) are modified to improve the imperceptibility of watermarking, compared with the traditional watermarking method, this method acquires higher image quality and strong robustness against various attacks. Wu and Wu [21] proposed an adaptive chaotic audio watermarking algorithm embedded in discrete cosine transform (DCT) and discrete wavelet transform (DWT) according to special rules, taking advantage of the insensitivity of the human ear to small changes in high-frequency components of the audio medium. Experimental results show that the algorithm can effectively improve the robustness and imperceptibility. Liu et al. [22] used the sorting strategy of wavelet coefficients to calculate quantization steps and select appropriate parameters in high-entropy image regions to embed the watermark, and the host image is not needed in watermark extraction. In the experiment, the false alarm probability and detection probability of watermarking are used to verify that the robustness of the proposed method is higher than the quantized watermarking algorithm. Liu et al. [23] proposed a method based on DWT, SVD, and Hessenberg decomposition (HD) to embed the watermark into the host image. The host image is decomposed into several sub-bands by multi-layer DWT, and the sub-band coefficient obtained is used as the input of HD. The fruit optimization algorithm [24,25] is used to obtain the scaling factor. Experimental results show that for some multi-size watermarks, the optimized watermarking method can strike a good balance between robustness and invisibility. Zheng et al. [26] proposed an algorithm Guided Dynamic-PSO (GDPSO) to solve the premature convergence problem for the PSO algorithm, in which it often gets stuck at a specific point. The new algorithm also targets the particle that an individual falls into best. On this basis, an optimal value replacement method based on Shared fitness is proposed to provide better diversity to avoid this problem. Experimental results show that this algorithm is superior to other algorithms in imperceptibility and robustness. To achieve this objective of the highest possible robustness without losing visual imperceptibility, Loukhaoukha et al. [27] proposed an optimal image watermarking method based on particle swarm optimization (MOPSO) to obtain the optimal embedding factor. Watermark embedding and extraction are based on wavelet transform and singular value decomposition. Experimental results prove significant improvement both in terms of imperceptibility and robustness under various attacks. Sharma and Mir [28] proposed a time-efficient optimization method to reduce the large time consumption in the time-sensitive applications. The watermarking is embedded in the domain of discrete cosine transform after acquiring an optimum strength value using ant colony optimization and light gradient boosting algorithm. It is found that compared with the existing optimization methods, this method takes less time to solve the optimal solution.
The optimization algorithm is essentially a meta-heuristic algorithm, making these algorithms highly common in the watermark embedding process. However, the best scaling factor is different because of the different ability to find a global optimal solution. However, most of the above optimization strategies lack sampling statistics of the previous results, which means that they easily fall into local optimization. Therefore, in this paper, we proposed a global optimization scheme based on the Wang-Landau (WL) sampling method to avoid the problem of premature convergence [29] and obtained a better embedding coefficient. In this method, the scaling factor is designed as an energy value, which reflects the current state of watermarking processing using this factor. Then, the energy state density function and histogram function are applied to census the changes in these states, and the better scaling factor can be found in subsequent searching. In addition, the three-level discrete wave transformation (DWT) is used to decompose the host image to seek for the embedding position, and then the watermark is embedded into the second-level approximate sub-band after singular value decomposition (SVD). Finally, the performances of invisibility and robustness are evaluated by comparison with different methods through noise ratio (PSNR), normalized cross-correlation (NC), structural similarity index measure (SSIM) and bit error ratio (BER). The experimental results denote the effects of this method to acquire a trade-off between robustness and imperceptibility.
The rest of this paper is designed as follows: Section 2 introduces the preliminaries in this work; in Section 3, an adaptive image watermarking method combining SVD and Wang-Landau Sampling in the DWT domain is proposed; in Section 4, the optimization of the embedding strength factor using WL is presented, and the experimental results are displayed and analyzed comparatively in Section 5; finally, Section 6 puts forward the conclusion of this paper and further research works.

Preliminaries
In this section, the related technologies, including DWT, SVD and WL methods, are introduced and used in the proposed scheme.

Discrete Wave Transformation
In the process of signal analysis and image processing, discrete wavelet transform (DWT) is very useful for the multi-resolution characteristics, and the embedded watermark is robust in compression, noise and other attacks [30,31]. In general, an image can be transformed into four subbands using DWT, which are low-high (LH), high-low (HL), high-high (HH), and low-low (LL). The LL sub-band is a low-pass filtered coefficient representing the detailed information of the image, and the remaining sub-bands, named horizontal detail sub-picture, vertical detail sub-picture and diagonal detail sub-picture, represent the edge and contour information of the image [32]. The selection of a higher-level detail sub-picture as embedding locations has better invisibility and robustness, and the approximate detail sub-picture of the host image contains more information than others. Therefore, in this paper, a three-level DWT is used to decompose the host image, and the LL sub-band is chosen to be the watermark embedding position. Figure 1 is a schematic diagram of the two-level wavelet decomposition.

Singular Value Decomposition
An image can be seen as a matrix composed of many non-negative vectors from a mathematical point of view, and the singular value decomposition technique [33] is a mathematical method for numerical extraction of feature points in pictures. In this method, the singular value decomposition (SVD) is first applied to the entire host picture or its small picture area, and then the singular value of which is modified for embedding the watermark. A represents a matrix with the size m × n, which can be decomposed into A = USV T , where U expresses the left singular matrix, S expresses the singular matrix and V expresses the right singular matrix. Here, T is the transposed symbol. In addition, S = diag(λ i ) is a matrix with a value of 0 on a non-diagonal line. The value of the diagonal is the singular value λ i (i = 1, ..., m) and satisfies Equation (1), For the singular value decomposition (SVD) of matrix A, it can be expressed as where r is the number of columns of matrix A (r ≤ m), u i and v i represent the left singular vector and right singular vector, respectively. In the research of digital watermarking, few works of literature use SVD decomposition to embed digital watermarks. The classical approach is to cooperate DWT decomposition with DCT decomposition, and such overlay algorithms are generally robust to most attacks [10,34]. In this paper, A DWT-SVD based digital image watermarking scheme is implemented.

Wang-Landau (WL) Sampling Method
The Wang-landau (WL) sampling algorithm is a kind of Monte Carlo (MC) algorithm, which is first presented by F. Wang and Landau based on a frequency histogram. The purpose of this algorithm is to get an estimate of the state density of the system characterized by the cost function, which utilizes a non-Markov stochastic process that progressively converges to a multi-regular ensemble [35]. The traditional MC method is limited to generate regular distribution under the scenario of constant temperature. Therefore, the algorithm must run multiple times under different temperatures to obtain high-quality solutions. However, the WL sampling walks freely through sampling distribution to the density of states. Therefore, any accessible state (both favorable and less favorable) can be accessed by the algorithm, and much faster than the Metropolis algorithm [36]. Then, this algorithm can obtain a flat energy histogram and the probability density function g(E). Starting from a certain state in the system, the algorithm gradually calculates the energy of each state and then excavates the distribution of each state's energy in the whole system.
Considering the robustness and invisibility of the watermark change dynamically with a different embedding strength, we assume that the trade-off between the robustness and invisibility is regarded as a state. The watermarking optimization problem that the scaling factor is difficult to find for the unknown distribution of the state of a watermark can be solved using the WL algorithm to conduct sampling statistics on the distribution. The detailed steps of the WL are outlined as Algorithm 1.
Here, Q w represents the initial set of states, and the optimal state could be eventually found based on the actual problem. Next, an example to find the minimum value of the state system is introduced.

Algorithm 1 Wang-Landau (WL) Sampling Algorithm
Require: Q w Ensure: the state with currentEnergy. 1: Initialize the parameters in this algorithm. 2: Choose the state with the highest fitness from Q w , and mark it as currentEnergy; 3: Select a random state from Q w by the roulette method, and mark it as currentEnergy; 4: if random [0, 1] < exp [ (P (currentEnergy) − P (targetEnergy) ) ] then 5: Accept targetEnergy for the new currentEnergy; then 6: Let  Figure 2 shows the process of finding the minimum using the WL algorithm. In a dimensional potential energy surface, assume that the WL algorithm starts from the state S with high potential energy (or high state density) to find the state with the lowest global energy. When the penalty factor (frequency histogram function) is equal, the algorithm is more beneficial to search for states with low potential energy (or low density of states). The penalty term of the potential energy function (or state density function) of the visited low energy state increases with the calculation of the local optimal time (i.e., the number of iterative steps s), correspondingly, which is not conducive to sample the state again. For example, the potential energy surface is filled with repeated sampling of the state near state A in Figure 2, although the energy of state C is lower than that of state D. However, because of the effect of histogram penalty term near state C, its modification energy is higher than that of state D. Hence, the WL algorithm can accept state D (although it seems to be higher than that of state C), jump out of the state A and reach globally optimal state B.

Watermark Embedding Scheme
In this section, the decomposition process of the host image via a three-level DWT transform is explained. The suitable watermark embedding positions are then selected based on features of the wavelet transform. In addition, the watermark image W is decomposed by DWT to obtain embedded information. Let the size of host image A and watermark image W be N × N and M × M, respectively. The steps of watermark information embedding are as follows: (1) The host image A is decomposed to acquire the components of low-high (LH 3 ), high-low (HL 3 ), high-high (HH 3 ), and low-low (LL 3 ), and then the watermarking W is also decomposed to LH w , HL w , HH w , and LL w , the db1 wavelet is used in the DWT. Meanwhile, the LL w has the same size as LL 3 . (2) Apply SVD to the approximate sub-band LL 3 and LL W of host image and watermark, respectively: here, A LL3 is the 3-level approximate sub-band of the host image A, U, S and V are the left singular matrix, singular matrix and the right singular matrix of A LL3 , respectively. Here, T is the transposed symbol.
here, A LLW is the three-level approximate sub-band of the watermark image W, U W , S W and V W are the left singular matrix, singular matrix and the right singular matrix of A LLW , respectively. (3) Embed the watermark information into the LL 3 using Equation (5).
here, λ * wi is singular value of S * W (see Equation (9)), λ i and λ wi (i = 1, 2..., N/4) are singular values of S and S W , respectively, and α is the embedding coefficient found by the WL optimization algorithm. (4) The diagonal matrix S * is obtained after the embedding procedure, and then the reconstructed approximate sub-band LL 3 is obtained by Equation (6).
Finally, the watermarked image A * is obtained by performing the inverse three-level DWT. Figure 3 shows the flow chart of the watermarking embedding.

Watermark Extraction Scheme
In the process of extraction, the input is the watermarked image A * , which has been under some attack. The output is the extracted watermarking W * . The steps of watermark extraction are as follows: (1) The watermarked image A * is decomposed to LH * 3 , HL * 3 , HH * 3 , and LL * 3 using 3-level DWT. (2) Apply SVD to decompose the LH * 3 : (3) The extracted singular value S w is gained by here, λ i and λ * wi (i =1, 2..., N/4) are singular values of S and S * W , respectively, and the embedding coefficient α is sent to the extractor as a key. (4) The approximate sub-band LH * w is obtained by Equation (9).
Finally, the watermarked image W * is obtained by performing the inverse DWT. Figure 4 is the procedure of the watermark extraction.

Optimization of Embedding Strength Utilizing WL
In this part, we propose a technique to balance the robustness and transparency of watermarking, which is used in embedding watermarking. This algorithm performs the following tasks: (1) Census the state of the current system with the histogram and the probability density function. (2) Correct the direction for searching the scaling factor based on the acceptance condition. (3) Generation of temporary target state using roulette selection. (4) Update the population near the current optimal state. Figure 5 shows the flowchart of scaling factor optimization using WL. Next, combining the watermarking embedding and extraction scheme presented previously, the proposed method is explained in detail.

Initial population and parameters used in the WL algorithm
Select the current state and give the current target state after statistical calculation by H(X) and G(X).
Watermark embedding  Figure 5. Scaling factor optimization using WL.

Construction of the Fitness Function
Balancing the invisibility and robustness in digital image watermarking is a research focus in this field [37]. As a meta-heuristic algorithm with global optimization ability, the WL algorithm can solve many optimization problems effectively. At present, the algorithm has successfully provided solutions for many optimization problems, such as protein structure prediction and satellite cabin layout [38][39][40]. Therefore, this algorithm is used to achieve a good trade-off between robustness and invisibility. First of all, the performance metrics to evaluate invisibility and robustness are introduced. Among them, invisibility is generally evaluated by the performance metrics such as peak signal to noise ratio (PSNR) and the structural similarity index measure (SSIM). Moreover, the robustness is measured by normalized cross-correlation (NC) and bit error ratio (BER). In this section, combined with the actual, the definition of these performance metrics is given, respectively.
here, A max is the maximum pixel value in the host image A, and A * denotes the watermarked image A * . A(m, n) and A * (m, n) are pixel values in A and A * , respectively. M and N represent the length and width of A and A * . SSIM is defined by here, u A and u A * are the average of image A and A * , respectively. σ A and σ A * are the variance of image A and A * , respectively. σ A and σ A * are the covariance of image A and A * , respectively, and c 1 and c 2 are used to keep the stabilization of the division with a weak denominator. NC is defined by NC is an index used to evaluate the robustness of the watermarking algorithm. Here, assume that K kinds of attacks are used to attack the watermarked image, the fitness function f (r i ) is given by The details of the proposed fitness function are described as follows.
(a) ∂ 1 and ∂ 2 represent quantization coefficients for invisibility or robustness and are tunable. (b) Requirements for PSNR: Image quality is acceptable only if PSNR > 35 dB is satisfied [37]. (c) Using PSNR and SSIM as the evaluation of invisibility, this function is used to compute the fitness value. Furthermore, the proposed fitness function employs a predefined quality level (Qth). Hence, the optimization method provides the maximum achievable robustness while ensuring that the image quality is higher than Qth.

Acquisition of Target State X 2
Roulette wheel selection (RWS) indicates that whether an individual can be selected mainly depends on the adaptability of the error [41]. The stronger the adaptability, the more it should be selected; on the contrary, it should be eliminated. Here, the adaptability of the individual (state) is defined by R(X) = H(E(X)) + 1, which means that individuals having been accepted by the WL algorithm many times are more likely to acquire optimal individuals around them. The main purpose of adding 1 is to prevent the case that the initial adaptability is all 0 and no individual can be produced. The details of this strategy are as below, Step 1: Initialize the adaptability of each state in the population Q w (i = 1, 2 ..., n), and n is the size of the population.
Step 2: Calculate the cumulative probability value of the link using Equations (14) and (15). Here, ∑ M j=1 R(X j ) represents the sum of the fitness of all states, and the R(X j ) is the fitness of the j-th state. Q j is the cumulative probability value of the state X j (j = 1, 2, ..., M), where Q j = P(X j ).
Step 3: Generate a random number r between [0,1]. If r ≤ Q 1 , accept the individual 1. Otherwise, the k-th state X k is selected if the condition (Q k−1 < r ≤ Q k ) is satisfied.
Step 4: Repeat Step 3 for M times until the end of voting. The state with the highest number of votes is target state X 2 .

Proposed Approach for Acquiring the Adaptive Embedding Strength
This section explains how to achieve the best embedding strength by the WL method. At the beginning of the WL algorithm, all the energy E in the state space and its corresponding state density function g(E) are unknown (the initial values are all set to 0). Once a new energy E is accessed, the state density g(E) and histogram function H(E) of this energy are initialized to 1. Since the energy values calculated in this paper are between 0 and 1. Therefore, the maximum energy is one, which is divided into 100 energy intervals, each energy interval is 0.01, i.e., In addition, the convergence rate of the algorithm is controlled by the flatness of the histogram, and the flat histogram means that the value of all H(E(X)) is not less than k(0 < k < 1) times the average value of the histogram H ([E (X)]) , and the value of k depends on the complexity of the system and the expected accuracy of g(E(X)). Actually, it is very difficult to get an absolute flat histogram. Therefore, the algorithm will check if the histogram is approximately flat after every 10 3 [E] in S 0 , then the adjustment correction factor becomes a smaller value, i.e., f i = f i , i = i + 1. Then, if the number of iterations is f ≈ exp(10 −4 ) = 1.001, f max is output, and the iteration is terminated. Otherwise, set H([E(X)]) corresponding to all visited energy intervals [E(X)] to zero, preserve g(E) value, and start the next random walk. At this point, g(E(X)) will converge to the true value with a certain precision. In the actual application, lng(E(X i )) = ln f i + lng(E(X i )) is chosen to update the state density G([E(X)]), i = 1, 2. Meanwhile, the acceptance probability from X 1 to X 2 is calculated by using the formula containing exponentials, namely, p (X 1 − > X 2 ) = min e ln[g(E1)−g(E2)] , 1 . Next, the steps of the proposed method to obtain a trade-off between robustness and imperceptibility are listed as bellow,

MC iterations. If H(E) becomes approximately flat, that is, if H([E]) >= k · H([E(x)]) for all
(1) Initialize the parameters used in the algorithm (k = 0.8, i = 0, f 0 = e, L = 0), where T = 1000 means that the algorithm checks whether the histogram is flat every 1000 executions, f i (initial f i = f 0 ) is a monotonic decreasing function used to reduce the modification factor. Choose the state with the highest fitness value from Q w , so that X max = X 1 , E max = E 1 ; make the accessed energy interval set S 0 = [E(X 1 )], that is, [E(X 1 )] is added in the temporary buffer S 0 , S 0 = S 0 + [E(X 1 )]. Make the histogram function H(E(X 1 )) = 1 and the probability density function g(E(X 1 )) = 1. (2) Checks the number of iterations reaches the maximum maxG or not. If it reaches the maximum, output the results containing the best scaling factor E m ax and the maximum fitness value f max , otherwise, go to (3).
(3) Select the target link X 2 using the RWS random generation strategy (see Section 4.2), and let g(E(X 2 )) = 1, H(E(X 2 )) = 1, L = L + 1. (4) If [E(X 2 )] is not included in a target value interval of S 0 , the target value interval containing E(X 2 ) is placed in S 0 , that is, S 0 = S 0 + [E(X 2 )], go to (5). (5) Checks the target state X 2 is accepted or not, and the acceptance probability is If random(0, 1) < p(X 1 → X 2 ), X 2 is accepted and removed from the population Q w . Then, Compute the fitness value of state X 2 using Equation (13), and then compare the value with the maximum fitness value f max . If f (X 2 ) > f max , generate a random state X rand , Here, low R and up R are the upper and lower bounds of the histogram interval where E max is located, respectively, go to (8). Otherwise, go to (7).
denotes that the state with higher robustness is added to the population Q w . Otherwise, E rand = E max ×rand[0,1], which denotes that the state with higher quality is added to the population Q w . (8) If L%T == 0, go to (9). Otherwise, go to (2).

Results and Discussion
In this section, to evaluate the performance of the proposed image watermarking method based on the WL algorithm, the invisibility and robustness experiments are implemented. All simulation experiments are run on a PC with Intel Core(TM) I7-7700HQ, 2.80 GHz processor and 8.0 GB of RAM, and using MATLAB R2016a under Windows 10. Likewise, six standard 512 × 512 gray-scale images are shown in Figure 6 are used as the host images, and a 128 × 128 gray-scale logo image is used as a watermark image, as shown in Figure 6. The key parameters of the WL algorithm are set, as shown in Table 1 (See Section 4 for the remaining parameter settings), and the different attacks are shown in Table 2. Next, the proposed method is compared with the image watermarking methods given in [42][43][44][45], where these methods are based on the optimization algorithm, and the remaining methods used in [46,47] are not. Table 1. Summary of key parameters.

Number
Key Parameter Description

1
The number of population Q w is 20 2 The maximum iterator max G = 1000 3 The quantization coefficients ∂ 1 and ∂ 2 are set to 0.5 and 0.5. 4 Qth = 40, 45 5 Energy intervals number = 100 Table 2. Summary of different attacks.

Number Attack Description
1 Median filter with window size 3 × 3 2 Rescaling 512 → 256 → 512 3 Salt and pepper noise with noise density 0.01 4 Gaussian low-pass filter of size 3 × 3 with standard deviation 5 JPEG compression with quality factor 30 6 Counterclockwise rotation with 45 • 7 Speckle noise with variance 0.01 8 Gaussian noise with mean zero and standard deviation 0.001 9 Gamma correction with gamma value 0.2 10 Histogram equalization 11 Sharpening attack

Imperceptibility Measurement
The watermark invisibility is an important metric to measure the performance. Figure 7 shows the watermarked host images suffered no attack and the NCs, SSIMs are also listed. It can be seen from Figure 7, all these metrics are greater than 40dB and 0.9993, respectively, which means the proposed method has good invisibility performance. Therefore, the results verify that the quality achieved exceeds the predetermined Qth, and the proposed watermarking method has good invisibility, whether from a subjective or objective point of view. Here, when Qth = 40 db, from Figure 7a-f, the optimal scaling factor computed by the proposed method are 0.05898, 0.06493, 0.05931, 0.06762, 0.05702 and 0.05905, respectively. The fitness value f max for these images are 0.94988, 0.94488, 0.93897, 0.95903, 0.95527 and 0.94397. The obtained scaling factors are between 0.058 and 0.068, which is corroborated with the analysis presented by [48]. The curves of PSNR(A, A * ) and SSI M(A, A * ) with the scaling factor are shown in Figure 8. Obviously, no matter which host image is used, the transparency of the image decreases as the scaling factor increases. Moreover, the reached highest PSNR value for each image are all greater than 40 db. Therefore, the most appropriate embedding factor should improve the robustness degree if possible when ensuring the PSNR value greater than Qth. It enables the extracted watermark to be correct to the maximum extent possible. Table 3 shows the quality using the PSNR metric, compared to the other methods. It is observed that the proposed method has better quality and guarantees a quality higher than the predefined value.

Robustness Measurement
In this section, the robustness is evaluated by detecting the performance of the watermarking system under different attacks, which is a similarity between the embedded and the extracted watermark [42]. Table 4 shows the NC value of the watermark extracted from the host image under different attacks and the original watermark, respectively. As shown in Tables 4 and 5, after embedding the watermark, the mean value of NC metric for each watermarked image remained above 0.89, and the mean value of BER are all lower than 0.2. Therefore, this method also has good robustness. The BER values for different methods are shown in Table 5.
To further examine the effect of scaling factor on robust for attacked images. The effects of scaling factor on NC values, between the range of [0.01,0.2], are displayed in Figure 9. It can be seen that the NC value varies only between 0.01 and 0.2, except for (e). Subsequently, the NC value tends to be stable. For gamma correction and counterclockwise rotation attacks, the NC values concerning (d) are constant throughout. Therefore, the change in scaling factor has little effect on the two types of attacks, which is also reflected in the above results. Furthermore, the robustness of images under various attacks is also different. This also makes it difficult for the same algorithm to be fully applicable to all watermarking systems.   Table 6 shows the watermark extracted from these corrupted watermarked images after various kinds of attacks. It can be seen that the watermark can be better extracted in most cases except for gamma correction and rotation attack. Combined with Figure 9, we concluded that the low robust sensitivity for these two attacks resulted in the little contribution to the NC value when altering the embedding coefficients. Finally, the proposed method is compared with other approaches. The comparison of robustness is shown in Tables 7 and 8, which is examined by the BER and NC value. It can be seen from Table 7 that BCRs of the proposed method are more robust than [42,43,46] in most attacks. The average value of BCR is higher than other methods when Qth = 40 db, and it still has high robustness when PSNR reaches 45 db. In addition, after modifying the attack parameters, the proposed method is also compared with the methods given by [44,45,47] using Lena when Qth = 40 db and 45 db. At this time, the optimal coefficients are 0.06143 and 0.0337. It is clear that the NC value of the proposed method is better in most cases except for rotation attacks. Therefore, it can be deduced that the proposed method achieves better robustness while guaranteeing predetermined invisibility.

Conclusions
In this paper, an adaptive image watermarking method combining SVD and the Wang-Landau (WL) sampling method is proposed to achieve the trade-off between robustness and invisibility through sampling statistics of the energy state density function and histogram function to find the global optimal solution. In addition, the three-level discrete wave transformation (DWT) and singular value decomposition (SVD) are used to decompose the host image to seek for the embedding position, and then the watermark is embedded. Finally, the performances of invisibility and robustness are evaluated by some metrics, such as peak signal to noise ratio (PSNR), normalized cross-correlation (NC). Meanwhile, compared with other existing related methods, the experimental results of the proposed method are effective and meaningful to improve the trade-off between robustness and imperceptibility. However, for some attacks, such as gamma correction, the robustness of the watermark is decreased, which means that it is necessary to sacrifice some invisibility to ensure the robustness of the watermark. In the future, more experimental implementations will be performed to evaluate the performance of the algorithm. In addition, since some parameters in the algorithm have a great influence on the experimental results, some machine learning methods will be used to adjust these parameters in our future works automatically.