Low Distortion Reversible Database Watermarking Based on Hybrid Intelligent Algorithm

Abstract: In many fields, such as medicine and the computer industry, databases are vital to information sharing. However, databases risk being stolen or misused, which leads to security threats such as copyright disputes and privacy breaches. Reversible watermarking techniques establish the ownership of shared relational databases, protect the rights of data owners, and enable recovery of the original data. However, most existing methods modify the original data to a large extent and cannot strike a good balance between protection against malicious attacks and data recovery. In this paper, we propose a robust and reversible database watermarking technique. It groups the tuples of digital relational databases with a hash function, defines a weighted function of data distortion and watermark capacity whose weights determine the watermark capacity and the level of data distortion, uses the firefly algorithm and the simulated annealing algorithm to improve the efficiency of the search for watermark embedding locations, and finally embeds the watermark by differential expansion. The experimental results show that the method maintains data quality and is robust against malicious attacks.


Introduction
In the era of exponential data growth, sharing information has become an important part of business, industry, and academic research, and these activities involve the trading of data. Examples include weather forecasting, medical data analysis, and credit reports of financial institutions. Protecting data copyrights and preventing leakage of shared data have therefore become challenging issues [1], and database watermarking, as an important technique for database security, can stop the illegal dissemination of data.
Scholars have proposed many approaches to the database watermarking problem, including Least Significant Bit (LSB) embedding, histogram statistical analysis, text steganography, digital fingerprinting, reversible database watermarking based on optimization algorithms, and blockchain-based watermarking. LSB embedding is simple, easy to implement, and highly covert, but it is easily destroyed by a variety of attacks, has limited watermark capacity, and shows poor robustness in practical applications. Watermarking based on histogram statistical analysis is reversible and has a large watermark capacity, but it is susceptible to random noise and copes poorly with attack analysis. Text steganography is highly covert and can cope with attack analysis, but its limited capacity restricts its use in large-capacity information transmission [2]. Digital fingerprints are robust and fast to recognize and retrieve, but the original data cannot be recovered once it has been converted into fingerprints. Blockchain technology relies on distributed databases, which are slow to process and require complex development, operation, and maintenance for large databases [3]. Reversible database watermarking based on optimization algorithms protects data by searching for the optimal locations at which to embed the watermark. It can protect intellectual property rights, prove ownership, and ensure authentication and security of the content [4], and it offers advantages in watermark capacity, robustness, and recovery of the original data in large databases.
Traditional watermarking schemes often suffer from irreversibility, limited capacity, inadequate robustness, and suboptimal optimization. To address these shortcomings, this paper introduces a new approach to reversible database watermarking. Our principal objective is to strengthen robustness across a range of database attack scenarios while remedying the deficiencies of conventional watermarking techniques. The primary contributions of this paper can be summarized as follows:
• A new reversible database watermarking algorithm is proposed. A series of experimental evaluations shows that the method performs better than traditional algorithms while maintaining the watermark embedding tolerance. With 50% of the attributes or tuples deleted, the watermark extraction rate stays stably above 75%, and it remains at 100% under attacks that add tuples or attributes. These results indicate that our method better protects the integrity of the database watermark information.
• Weighting coefficients are introduced into the design of the optimization function (the brightness function) as an indicator of the performance of the algorithm. When preparing the data for watermark embedding, the data copyright owner can change the weighting parameters to suit the actual application requirements. The weights not only determine the watermark embedding capacity but also directly affect the degree of data distortion, which gives the data owner more autonomy and flexibility in protecting the intellectual property rights of the data.
• To cope with the large data distortion that a large number of watermark embeddings can cause, a differential extension technique is used to embed watermarks into individual tuples. The aim is to spread the watermark information over various sub-parts of the tuple while keeping the data relatively stable. This avoids the serious data distortion caused by centralized large-scale embedding and ensures the reversibility and integrity of data manipulation. A symmetric encryption algorithm is used to realize fast watermark verification while protecting data security. Compared with the traditional majority voting method, this strategy is significantly more efficient and accelerates the watermark verification process.

Related work
In 2003, Kiernan et al. [5] proposed a database watermarking technique to protect the copyright of relational databases. The approach embeds the watermark data in the numeric fields of the database, classifies each tuple using a Message Authentication Code (MAC) to obtain the embedding position of each watermark bit, and finally embeds the watermark using LSB. The disadvantage of this approach is that the tuples are not uniformly categorized: in extreme cases, some groups contain many candidate tuples while others offer only a few. When an attacker alters the least significant bits of a numeric field in the database, a large number of watermark bits are lost, and the watermark capacity of this approach is also very small. In 2006, Zhang et al. [6] proposed a reversible database watermarking scheme that constructs a histogram from attribute differences. In the same year, Zhang et al. proposed a reversible watermarking scheme based on heteroskedastic operation for relational databases. In 2008, Shehab et al. [7] proposed an optimization algorithm based on the genetic algorithm to reduce the data distortion caused by embedded watermarks, which turns the search for the watermark embedding location into an optimization problem. In 2009, Gupta et al. [8] proposed a database watermark embedding technique based on DEW. Because the DEW in this scheme selects the embedding locations randomly by MAC, the data distortion can become too large; in practical use, a threshold decides whether to embed, which reduces the watermark capacity. In 2013, Chang et al. [9] proposed the histogram-transform-based BRRW technique, which inserts reversible watermarks into relational databases. In 2015, Iftikhar et al. [10] proposed a new scheme that selects the watermark embedding locations based on the concept of information, exploiting the characteristics of the Genetic Algorithm (GA). In 2017, Imamoglu et al. [11] used the one-way hash function SHA to sort the tuples of the database according to the primary key and an input key, used the Firefly Optimization Algorithm (FA) to find the best locations for embedding the watermark, used the DEW technique to embed the watermark and recover the data, and finally stored the watermark locations so that they can be used in the extraction phase. In 2019, Hu et al. [12] used a genetic algorithm to select the best key for grouping the database, and embedded the watermark by shifting the histogram of the prediction error.
In reversible watermarking, the distortion constraint ensures the imperceptibility of the watermark and avoids a large loss of data quality. A severe loss of data quality makes the watermark an easy target for attacks, which weakens its robustness. There may therefore be a conflict between robustness and distortion constraints [13]. Although GADEW and FADEW also use optimization methods to select the locations that minimize embedding distortion, the distortion caused by the embedded watermark is still significant. To maximize robustness and minimize distortion, and to approach the optimal solution as closely as possible within the required distortion range when determining the watermark embedding locations [14,15,16], this paper proposes combining the firefly algorithm with the simulated annealing algorithm.
Definitions for DE, FA and SA

Differential Extension
Differential Extension (DE) technology was initially widely used in audio decoding and audio processing to improve audio compression, transmission, and quality. DE later migrated to database watermarking as Differential Extension Watermarking (DEW), which protects data in databases by embedding almost imperceptible digital signals, ensuring that copyrighted information is not illegally exploited and authenticating the source of the data. After a watermark is embedded in a database, the distortion is usually hard to recognize by observation, but in application scenarios with high-precision data even small variations may lead to undesirable results. To preserve the robustness of data usage, reversible database watermarking techniques based on differential extension have emerged. These techniques not only allow the watermark to be extracted during the data validation phase, but also enable full recovery of the original data. For a deeper understanding of the transformation and recovery process of differential expansion, consider the following sample data: suppose x and y are two values of some tuple in the database, with x = 206 and y = 201, and the information bit to be embedded is b = 1.
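The sample values x = 206, y = 201, b = 1 can be traced with a short sketch. It assumes the standard integer difference-expansion transform (integer average plus expanded difference); the paper's own step list is not reproduced here, and the function names `de_embed` and `de_extract` are illustrative.

```python
def de_embed(x, y, b):
    """Embed one bit b into the integer pair (x, y) by difference expansion."""
    avg = (x + y) // 2            # integer average, preserved by the transform
    diff = x - y                  # original difference
    expanded = 2 * diff + b       # expanded difference, b sits in the lowest bit
    return avg + (expanded + 1) // 2, avg - expanded // 2


def de_extract(x_marked, y_marked):
    """Recover the original pair and the embedded bit from a marked pair."""
    avg = (x_marked + y_marked) // 2
    expanded = x_marked - y_marked
    b = expanded & 1              # lowest bit of the expanded difference
    diff = expanded >> 1          # floor division by 2 restores the difference
    return avg + (diff + 1) // 2, avg - diff // 2, b


marked = de_embed(206, 201, 1)
print(marked)                     # (209, 198)
print(de_extract(*marked))        # (206, 201, 1)
```

In this sketch the pair (206, 201) becomes (209, 198) after embedding, and the inverse step returns both the original pair and the bit, which is the reversibility property the scheme relies on.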

Firefly brightness and behavior
Biologically inspired algorithms have become popular for solving global optimization problems such as the Traveling Salesman Problem (TSP). They are also known as Swarm Intelligence (SI)-based algorithms because they simulate the group characteristics of living organisms. A particle swarm optimization algorithm was proposed in 1995 based on the swarming behaviors of birds and fish [17]. Subsequently, ant colony optimization [18] and artificial bee colony optimization [19] were proposed. In 2008, Yang et al. [20] proposed the Firefly Algorithm (FA), based on the glowing behavior of fireflies in nature, to solve NP problems. As a stochastic algorithm, it cannot guarantee an optimal solution within a given time, but it can converge to a value within a given time, and so it can find a locally optimal embedding location for the watermark in the database. Let F be the firefly population in a spatial range S, and let A be the attraction function, with A = 1 if two fireflies attract each other and A = 0 otherwise. Let I be the brightness function, β the attraction, and M the movement function. The FA then has the following three properties.

In the FA, the relative brightness of two fireflies is determined by the initial brightness, the light absorption coefficient, and the distance. Assuming that the light absorption coefficient is γ, the distance between any pair of fireflies i and j is $r_{ij}$, and the brightness of the selected firefly is $I_0$, the relative brightness is calculated by the formula:

$$I = I_0 e^{-\gamma r_{ij}^2}$$

The mutual attraction between fireflies determines their flight direction. Suppose there are three fireflies $F_1$, $F_2$, $F_3$ in the space region. If the attraction of $F_1$ to $F_2$ is greater than the attraction of $F_3$ to $F_2$, then $F_2$ tends to fly to $F_1$. The mutual attraction of fireflies is directly proportional to their relative luminance and is determined by the maximum attraction (the attraction at distance 0), the light absorption coefficient, and the distance. Assuming that the maximum attraction is $\beta_0$, the light absorption coefficient is γ, and the distance between the two fireflies is $r_{ij}$, the mutual attraction is:

$$\beta = \beta_0 e^{-\gamma r_{ij}^2}$$

Different moments correspond to different positions of the same firefly, and its position is determined not only by mutual attraction but also by the random movement of the firefly. Assuming that the position of firefly i at time t is $X_i(t)$ and its random movement is $\alpha \epsilon_i$, the position of firefly i at time t+1 can be expressed as:

$$X_i(t+1) = X_i(t) + \beta \left( X_j(t) - X_i(t) \right) + \alpha \epsilon_i$$

where α is a randomization parameter and $\epsilon_i$ is drawn from a random number generator. $X_j(t) - X_i(t)$ corresponds to the distance between the best firefly and the current firefly at time t, i.e. $r_{ij}$, defined as follows, where $x_{i,k}$ is the k-th of the d components of $X_i$:

$$r_{ij} = \lVert X_i - X_j \rVert = \sqrt{\sum_{k=1}^{d} \left( x_{i,k} - x_{j,k} \right)^2}$$

Mathematical Biosciences and Engineering Volume 19, Issue x, xxx-xxx
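The attraction and position-update rules of the firefly algorithm described above can be sketched as follows. This is a minimal illustration of the textbook continuous FA, not the database-specific variant introduced later in the paper; the function names, the default parameter values, and the choice of drawing each random component uniformly from [-0.5, 0.5) are assumptions.

```python
import math
import random


def distance(xi, xj):
    """Euclidean distance r_ij between two firefly positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


def attractiveness(beta0, gamma, r):
    """Attraction beta = beta0 * exp(-gamma * r^2)."""
    return beta0 * math.exp(-gamma * r * r)


def move_towards(xi, xj, beta0=1.0, gamma=1.0, alpha=0.2, rng=None):
    """Move firefly i one step towards the brighter firefly j."""
    rng = rng or random.Random()
    beta = attractiveness(beta0, gamma, distance(xi, xj))
    # Attraction term plus a small random walk (assumed uniform noise).
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]


# With gamma = 0 and alpha = 0 the firefly jumps straight onto the brighter one.
print(move_towards([0.0, 0.0], [3.0, 4.0], beta0=1.0, gamma=0.0, alpha=0.0))
```

A larger γ makes attraction decay faster with distance, so distant fireflies barely influence each other; α controls how much random exploration is mixed into each move.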

Simulated annealing
The earliest idea of the Simulated Annealing (SA) algorithm was proposed in 1953 by N. Metropolis et al. It is a stochastic optimization algorithm based on the Monte Carlo iterative solution strategy, and it starts from the similarity between the annealing of solids in physics and general combinatorial optimization problems. In essence, simulated annealing searches the solution space randomly for a globally optimal solution of the objective function: it accepts a new solution if it is better than the current one, and otherwise accepts the worse solution with a certain probability. Let the initial temperature be $T_0$, the solution be x, the objective function be E, and the cooling coefficient be k. The steps are as follows:
Step 1: Initialize t, $x_0$, and E: t = 0, $x_0$ = random(x), E = E($x_0$).
Step 2: Generate a new solution and evaluate the objective function E on it.
Step 3: Accept the new solution if it is better than the current one; otherwise accept it with a certain probability. Then lower the temperature according to the cooling coefficient k.
Step 4: Repeat Steps 2 and 3 until the temperature reaches the set threshold or drops to 0.

The watermark embedding phase, which is the focus of this research, is concerned with maintaining a balance between watermark capacity and data distortion prior to embedding, and with ensuring database availability and watermark imperceptibility.
The extraction phase and the data recovery phase cover the efficient extraction of the watermark and the reversible recovery of the data. This study examines the security of embedding watermarks into a database and evaluates the proposed FASADEW scheme through robustness experiments, assuming the presence of an insecure computer network. The possible error rates in extracting watermarks under common attacks, such as tuple insertion and deletion and attribute insertion and deletion, are analyzed. For readers' convenience, we list the notations used in this paper in Table 1.

Table 1. Notations. Databases with embedded watermarks; len: length of watermark.
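The simulated annealing procedure defined earlier in this section can be sketched as a short loop. This is a generic minimizing SA under stated assumptions: the Metropolis acceptance probability exp(-ΔE/T), geometric cooling T ← k·T, and the names `simulated_annealing`, `energy`, and `neighbor` are illustrative and not taken from the paper.

```python
import math
import random


def simulated_annealing(energy, neighbor, x0, t0=100.0, k=0.95, t_min=1e-3, rng=None):
    """Minimize energy(x) starting from x0; returns the best solution seen."""
    rng = rng or random.Random(0)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t0
    while t > t_min:
        cand = neighbor(x, rng)            # propose a nearby solution
        e_cand = energy(cand)
        delta = e_cand - e
        # Always accept improvements; accept worse moves with prob. exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
        t *= k                             # geometric cooling (assumed schedule)
    return best_x, best_e


best_x, best_e = simulated_annealing(
    energy=lambda x: (x - 3) ** 2,
    neighbor=lambda x, rng: x + rng.choice([-1, 1]),
    x0=20,
)
print(best_x, best_e)
```

Accepting worse candidates at high temperature lets the search leave local optima, and the probability of doing so shrinks as the temperature falls.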

Data preprocessing
Data preprocessing is the first step in embedding the watermark. Its purpose is to distribute the watermark randomly over different tuples and attributes of the database, to prevent a large number of watermark bits from being lost to tuple attacks, and to ensure that the insertion of the watermark is independent of the way the data is stored in the database. The primary key is extracted and used with a MAC to reorder the database tuples, and the attributes are reordered by attribute name. This makes the database watermark robust to reordering attacks on tuples and attribute columns, and improves the efficiency of the subsequent creation of the initial population. The tuple grouping operation creates non-repeating groups $\{D_i\}$ (i = 1, ..., τ), and the group number num is determined by the primary key S and the user-defined private key, where ∥ denotes concatenation and H is a secure hash function such as MD5, SHA-1, SHA-2, or SHA-512. In this scheme, SHA-512 is used for the grouping to ensure secure grouping.
The reordering groups the tuples: tuples with the same computation result are put into the same set D, and tuples with different results are separated. One bit of the watermark is inserted into each set, so the length of the watermark determines the value of τ. The distortion tolerance range of each attribute is defined by the user. If no distortion tolerance range is set, the algorithm automatically selects the minimum and maximum values of the attribute as the corresponding range, so the default distortion tolerance range DT is the interval $[AT_x^{min}, AT_x^{max}]$.
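The keyed grouping step can be illustrated with a small sketch. The exact grouping formula is not reproduced here, so the sketch assumes the simple form num = H(sk ∥ S) mod τ with SHA-512 from Python's `hashlib`; the helper names and the `id` field are hypothetical.

```python
import hashlib


def keyed_digest(primary_key, secret_key):
    """SHA-512 over the secret key concatenated with the primary key."""
    return hashlib.sha512(secret_key + str(primary_key).encode("utf-8")).digest()


def group_number(primary_key, secret_key, tau):
    """Map a tuple to one of tau groups (assumed form: H(sk || S) mod tau)."""
    return int.from_bytes(keyed_digest(primary_key, secret_key), "big") % tau


def group_tuples(rows, secret_key, tau, pk_field="id"):
    """Reorder rows by keyed digest, then split them into tau groups."""
    groups = [[] for _ in range(tau)]
    ordered = sorted(rows, key=lambda r: keyed_digest(r[pk_field], secret_key))
    for row in ordered:
        groups[group_number(row[pk_field], secret_key, tau)].append(row)
    return groups


rows = [{"id": i, "value": 10 * i} for i in range(12)]
groups = group_tuples(rows, b"owner-secret", tau=4)
# The grouping depends only on the keys, not on the storage order of the rows.
print(groups == group_tuples(list(reversed(rows)), b"owner-secret", tau=4))  # True
```

Because both the order and the group of a tuple derive from its primary key and the secret key alone, shuffling the stored rows leaves the grouping unchanged, which is the property the robustness analysis later relies on for reordering attacks.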

Watermark embedding phase
This module uses the firefly algorithm to create an initial population for each set screened in the preprocessing stage and defines a custom firefly brightness function. The fireflies are moved iteratively under the guidance of the brightness function and the simulated annealing algorithm until they gather in a cluster. Several fireflies with high brightness are then selected from the cluster; they are the optimal solution given by the algorithm, i.e., the optimal embedding locations of the watermark. The initial population of fireflies in a set $D_i$ is given in Figure 2. A random element in the tuple is selected and, assuming a watermark bit of 1, this element is embedded together with each of the other attribute values using the DEW algorithm, as shown in Figure 3. After embedding, the distortion of each attribute pair is calculated as shown in Figure 4, and the least distorted pair is added to $F_{ij}$. The loop is repeated n times. Because the selection is randomized, the chosen attribute values have locally minimal rather than globally minimal distortion, so they cannot be used directly to embed the watermark.

Define the luminance function of a firefly
The previous step creates the initial population and selects the attribute pair with the smallest distortion as an element of each firefly. In this step, the luminance function is defined to determine the moving direction of the fireflies. The luminance function is the weighted sum of the watermark capacity and the average attribute distortion, i.e.,

$$Br_i = \frac{row_w}{n} w_1 + \left| \frac{tupleUnit}{AllDistort} \right| w_2$$

The weights can be adjusted to suit the specific engineering requirements, and the calculation of the brightness function is shown in Figure 3 and Figure 4. The weights $w_1$ and $w_2$ of the brightness function $Br_i$ sum to 1. A higher value of $w_1$ indicates that higher watermark capacity is needed in the project, and a higher value of $w_2$ indicates that the normalized distortion is more important in the project.
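The brightness formula can be evaluated directly, as in the following sketch. It treats `row_w`, `n`, `tuple_unit`, and `all_distort` as plain numeric inputs corresponding to the symbols in the formula; how each quantity is measured on real data is not specified here, and the input values in the usage line are made up for illustration.

```python
def brightness(row_w, n, tuple_unit, all_distort, w1, w2):
    """Weighted brightness Br = (row_w / n) * w1 + |tuple_unit / all_distort| * w2."""
    if abs((w1 + w2) - 1.0) > 1e-9:
        raise ValueError("w1 and w2 must sum to 1")
    if n == 0 or all_distort == 0:
        raise ValueError("n and all_distort must be non-zero")
    capacity_term = (row_w / n) * w1                      # capacity share
    distortion_term = abs(tuple_unit / all_distort) * w2  # distortion share
    return capacity_term + distortion_term


# Illustrative inputs only: 3 of 5 candidate positions usable, ratio 2/4.
print(brightness(row_w=3, n=5, tuple_unit=2, all_distort=4, w1=0.6, w2=0.4))
```

Raising `w1` shifts the ranking of fireflies towards capacity, while raising `w2` shifts it towards the distortion term, which is the trade-off the data owner controls.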

Moving fireflies
The brightest firefly in each group is evaluated before moving. If its brightness is greater than the set brightness threshold Ts, the firefly can be added directly to the set E. In the original firefly algorithm, the distance between two fireflies is calculated from Cartesian coordinates, which are not available in a database, so the distance between fireflies needs to be redefined. In this paper, we use the number of differing attribute pairs to define the distance between two fireflies. Ordinary fireflies move towards the brightest firefly. The main purpose of the movement is to reduce the differences between fireflies, and the movement is guided by the firefly distance. Since each firefly contains m attribute pairs, the distance between two fireflies is the number of those m attribute pairs in which they differ.

How a firefly moves towards the brightest firefly is very important. A random number can be used to move the firefly: if a firefly contains m attribute pairs and differs from the brightest firefly in t of them, i.e., its distance to the brightest firefly is t, then a random number of attribute pairs in the range [1, t] is selected and replaced with the corresponding attribute pairs of the brightest firefly. The specific steps are shown in Figure 5. The random movement of the firefly is realized by assigning random values to a part of the remaining attribute pairs, which ensures the randomness of the movement. Next, the simulated annealing algorithm is used to update the movement of the firefly. An initial temperature T and a cooling coefficient dT are set. After each movement the brightness of the firefly is calculated, and if it is greater than the brightness before the movement, the new value is accepted. The temperature is then lowered once, the simulated annealing algorithm is applied again to a randomly moved attribute pair, and the movement operation is repeated. The cycle stops when the temperature drops to the set minimum temperature.
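The redefined distance and the move towards the brightest firefly can be sketched as follows. A firefly is modeled as a list of m attribute pairs; the function names, the tuple representation of a pair, and the Metropolis-style acceptance rule for a dimmer position are assumptions made for illustration.

```python
import math
import random


def firefly_distance(a, b):
    """Number of positions at which two fireflies hold different attribute pairs."""
    return sum(1 for p, q in zip(a, b) if p != q)


def move_towards_brightest(firefly, brightest, rng):
    """Copy between 1 and t of the t differing pairs from the brightest firefly."""
    differing = [k for k, (p, q) in enumerate(zip(firefly, brightest)) if p != q]
    moved = list(firefly)
    if not differing:
        return moved
    count = rng.randint(1, len(differing))
    for k in rng.sample(differing, count):
        moved[k] = brightest[k]
    return moved


def sa_accept(old_brightness, new_brightness, temperature, rng):
    """Accept a brighter position; accept a dimmer one with a decaying probability."""
    if new_brightness >= old_brightness:
        return True
    return rng.random() < math.exp((new_brightness - old_brightness) / temperature)


rng = random.Random(1)
firefly = [(0, 1), (2, 3), (4, 5), (1, 2)]
brightest = [(0, 1), (2, 4), (3, 5), (0, 2)]
moved = move_towards_brightest(firefly, brightest, rng)
print(firefly_distance(firefly, brightest), firefly_distance(moved, brightest))
```

Each move strictly reduces the distance to the brightest firefly whenever the two differ, so repeated moves make the population cluster, while the acceptance rule keeps some diversity at high temperature.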

Figure 6. Firefly iterative movement process
After all non-brightest fireflies have completed their moves using simulated annealing, the simulated annealing algorithm is applied to the brightest firefly. After the brightest firefly has moved once, the brightest firefly in the group is determined again, and this is iterated until the global optimal solution is found. Through the hybrid of the firefly and simulated annealing algorithms, the non-brightest fireflies also move to positions close to the brightest firefly and form a cluster, and the brightest fireflies in the resulting clusters are added to the set E.

Watermark embedding phase
The brightest fireflies from all groups are added to the set E by the hybrid firefly and simulated annealing algorithm, and each firefly in E specifies the attribute positions for watermark embedding. The first firefly in E represents the first group computed by the MAC function and contains the best embedding position for that group. Each watermark bit is embedded into the database by traversing the set E. The embedding pseudo-code describes how the watermark information is embedded into the corresponding sets of the database. Finally, the set E is encrypted with the symmetric encryption algorithm AES and saved locally to ensure the covertness and security of the watermark embedding.

Watermark extraction
The encrypted set E is decrypted, and the tuples are grouped using the MAC function to obtain a series of tuple groups. The best fireflies in the set E are then extracted; each firefly points to the attributes of the current row that were used for watermark embedding. The watermark information is extracted by applying the inverse DEW operation to the attributes specified through E, and the same operation restores the original values of the attributes. The pseudo-code of the watermark extraction algorithm is as follows.

Algorithm 3 Watermark Extraction Algorithm
Input: Watermarked database D′, encrypted file K_doc, private key sk
Output: Original database D and watermark
Step 1: Use sk to decrypt K_doc to get the best firefly set E. Get the number of elements of set E, E_num.
Step 2: Use sk with D′ to reorder the database tuples into groups, aligning the watermark array with the reordered tuples.
Step 3: Repeat Step 4 E_num times.
Step 4: Select the i-th reordered tuple and the i-th firefly in E, and extract the watermark by performing the inverse operation of DEW on all attribute pairs in the firefly.
Step 5: Obtain the original database D and the watermark.
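Steps 3 to 5 of the extraction algorithm can be sketched as a round trip over in-memory rows. The sketch skips decryption and regrouping, assumes the standard integer difference-expansion transform, and models each firefly as a non-empty list of attribute index pairs; all function names are illustrative.

```python
def de_embed(x, y, b):
    """Difference expansion: hide bit b in the integer pair (x, y)."""
    avg, expanded = (x + y) // 2, 2 * (x - y) + b
    return avg + (expanded + 1) // 2, avg - expanded // 2


def de_extract(x_marked, y_marked):
    """Inverse difference expansion: returns (x, y, b)."""
    avg, expanded = (x_marked + y_marked) // 2, x_marked - y_marked
    diff = expanded >> 1
    return avg + (diff + 1) // 2, avg - diff // 2, expanded & 1


def embed_row(row, firefly, bit):
    """Embed the same bit into every attribute pair named by the firefly."""
    out = list(row)
    for i, j in firefly:
        out[i], out[j] = de_embed(out[i], out[j], bit)
    return out


def extract_row(row, firefly):
    """Undo the embedding in reverse order; returns (original row, bits)."""
    out, bits = list(row), []
    for i, j in reversed(firefly):
        out[i], out[j], b = de_extract(out[i], out[j])
        bits.append(b)
    return out, bits


def extract_all(rows, fireflies):
    """Apply extraction to the i-th row with the i-th firefly (Steps 3 and 4)."""
    restored, watermark = [], []
    for row, firefly in zip(rows, fireflies):
        original, bits = extract_row(row, firefly)
        restored.append(original)
        watermark.append(bits[0])  # every pair of a firefly carries the same bit
    return restored, watermark


rows = [[206, 201, 50, 47], [10, 12, 7, 7]]
fireflies = [[(0, 1), (2, 3)], [(1, 2)]]
bits = [1, 0]
marked = [embed_row(r, f, b) for r, f, b in zip(rows, fireflies, bits)]
print(extract_all(marked, fireflies) == (rows, bits))  # True
```

Undoing the pairs in reverse order keeps the sketch reversible even when two pairs of the same firefly share an attribute.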

Experimental results
In this paper, we use the University of California Forest Coverage dataset for the experimental tests. The dataset includes 581,012 tuples and 54 attributes, most of which have the value 0, so only 10 attributes are used in the actual experiments. We test the watermark capacity, the degree of distortion, and the robustness under several attacks, and we vary the number of tuples used in the experiments to determine whether the method adapts to databases with an arbitrary number of tuples. We compare against the algorithms in [11,15,21,22,23,24,25], set the watermark length to 96, and carry out all experiments under the same watermark embedding conditions. The experimental environment is an Intel Core i5-13400 processor with 16 GB of DDR4 memory, a Windows 11 host, and a MySQL database. The experiments are organized as follows: the first part analyzes the watermark capacity of each algorithm, the second part analyzes the distortion of each algorithm after the watermark has been embedded, and the third part compares the robustness of the classical algorithms and the proposed algorithm under several known attacks. Because the population size in this paper is determined by the number of tuples in each MAC function group, the population size is constant in the experiments. To test the proposed method, the number of custom attributes n is varied over the interval [2,8], using 10,176 tuples taken at random from the database. The performance of the method as a function of n is shown in Figure 7, where the x-axis is the number of customized attributes n, the y-axis in (a) is the total database distortion using the FASADEW method, and the y-axis in (b) is the watermark capacity using the FASADEW method.
The results show that when n = 5 the watermark capacity is the largest and the data distortion is low, so we set n to 5. With n fixed, we vary dT to find the most suitable value and leave T unchanged, because the annealing rate can be determined by dT alone. With n = 5 and T = 2000, the watermarking performance for different dT values is shown in Figure 8. The x-axis in Figure 8 is the cooling factor dT, the y-axis in (a) is the watermark capacity, the y-axis in (b) is the total distortion of the data, and the y-axis in (c) is the time required by the algorithm. Figure 8 shows that the watermark capacity is highest at dT = 0.98, the data distortion is lowest at dT = 0.99, the running time is shortest at dT = 0.96, and the running time is longest at dT = 0.99. The average data distortion at dT = 0.99 differs little from that at dT = 0.98, so, considering the efficiency of the algorithm, we choose dT = 0.98.
Next, we test the influence of the two weight parameters. In the luminance function, $w_1$ and $w_2$ are the weights of the watermark capacity and the data distortion, respectively, and they sum to 1. A higher value of $w_1$ indicates that the capacity of embedded watermarks in each group is more important relative to the distortion, and a higher value of $w_2$ indicates that the distortion is more important relative to the capacity. Keeping n = 5, T = 2000, and dT = 0.98, the data distortion and watermark capacity were calculated for $w_1$ = 0.3, 0.4, 0.5, 0.6, and 0.7 ($w_2 = 1 - w_1$). The results are shown in Figure 9: among the five values tested, $w_1$ = 0.6 produced both the highest embedded watermark capacity and the lowest distortion. Therefore, n = 5, T = 2000, dT = 0.98, $w_1$ = 0.6, and $w_2$ = 0.4 were used in the following experiments.

Watermark capacity
Watermark capacity is the maximum number of watermarks, or the maximum amount of information, that can be embedded in a given database or dataset. A higher watermark capacity means that more watermarks or a larger amount of watermark information can be embedded. The higher the capacity, the more robust and stealthy the watermark is, which is very useful for protecting data integrity, tracking the origin of data, and resisting malicious tampering [26,27,28]. In this paper, the watermark capacity is defined as:
To evaluate the performance of the algorithm, we conducted experiments using 214, 756, 1478, 3986, 5769, and 10176 tuples of the dataset and compared the results with FADEW, PSODEW, DEW, and GADEW.
The experimental results show that the watermark capacity of the FASADEW algorithm is comparable to that of the FADEW algorithm on small datasets. On larger datasets (more than 1478 tuples), however, the watermark capacity of FASADEW is considerably higher than that of FADEW. This gain arises because FASADEW, guided by the simulated annealing algorithm, accepts with some probability fireflies whose brightness is lower than the original brightness. The watermark capacity of FASADEW is also significantly higher than that of the PSODEW and DEW algorithms, mainly because FASADEW has more embeddable locations.

Distortion value analysis
Embedding a watermark distorts the data, which can be harmful in subsequent use by the authorized party: when the data differs greatly from the original, it can lead to wrong prediction results or to data that does not meet expectations [29,30]. In this paper, the experiments use the average distortion method to calculate the distortion, and the formula is as follows: where D represents the tuples in which the watermark is embedded and DataNum represents the number of all tuples in the database. Because the DEW algorithm is too long, it is not included in the distortion value comparison. The watermark distortion of each algorithm is shown in Table 3; the distortion of the FASADEW algorithm is lower than that of the FADEW, PSODEW, and GADEW algorithms.

Table 3. Watermark distortion of each algorithm. Recovered row: 21.8 25.3 25.9 27.2 29.7 30.9

Robustness analysis
The robustness of watermarking is mainly shown in some attacks, such as anti-tuple deletion and modification attacks, addition attacks, attribute deletion attacks, tuple reordering attacks, bit reversal, etc., and the main purpose of the attacks is to destroy the watermarking message w or embedding their own watermarks [31,32].Under these attacks, whether the watermark can be extracted correctly or not is a critical issue, in this paper scheme experiments on some common attacks to observe whether the watermark can be extracted correctly under these common attacks, before this paper define the watermark extraction rate.
Here $Num_w$ is the number of watermarks detected in each type of experiment and $Num_{tuple}$ is the total number of successfully embedded watermarks, so the extraction rate is the proportion $Num_w / Num_{tuple}$ of embedded watermarks that are detected. The robustness experiments apply each common attack at intensities of 10%, 20%, 30%, 40%, and 50%. Watermarks are embedded in 10,176 tuples, and each result is the average of 10 runs. As shown in Figure 10, FASADEW performs best under the tuple reordering and tuple addition attacks, where all watermarks remain extractable. The results under the attribute deletion attack are less satisfactory: the watermarks are embedded in different attributes of a tuple, so when too many attributes are deleted the watermarks can no longer be extracted.
Figure 11(a) shows that FASADEW resists the tuple deletion attack better than the FADEW, PSODEW, DEW, and GADEW algorithms. When 50% of the tuples are deleted, FASADEW improves the watermark extraction rate by 5% over the original DEW algorithm. The improvement comes from the MAC function used to select tuples at random: a different MAC function selects different tuples, which in turn affects the extraction rate after tuple deletion, so choosing a good MAC function is important.
Attribute deletion is a more serious attack: deleting an attribute column also deletes every watermark embedded in that column, which is disastrous for embedding based on the DEW algorithm. To counter this, and to increase the watermark capacity of the attribute values, FASADEW uses fireflies to find the optimal location for embedding the watermark. The optimal location contains more than one attribute that can carry the watermark, and the watermark is embedded in all of these locations, which provides a defense against attribute deletion attacks. Comparing FASADEW with the FADEW, PSODEW, DEW, and GADEW algorithms in Figure 11(b) shows that FASADEW remains robust when the proportion of deleted attributes is large.
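As a rough illustration of the extraction rate used in these experiments, the sketch below compares the embedded bits with the bits recovered after an attack. It assumes the rate is the ratio of detected to embedded watermarks expressed as a percentage; the function name `extraction_rate` and the use of `None` for a lost bit are hypothetical choices made for this example, not part of the paper.

```python
from typing import Optional, Sequence


def extraction_rate(embedded: Sequence[int],
                    extracted: Sequence[Optional[int]]) -> float:
    """Percentage of embedded watermark bits that were detected correctly.

    `extracted[i]` is None when bit i could not be recovered
    (for example because its tuple or attribute was deleted).
    """
    if not embedded:
        raise ValueError("at least one embedded bit is required")
    if len(embedded) != len(extracted):
        raise ValueError("both sequences must have the same length")
    # Num_w: bits that were recovered and match what was embedded.
    num_w = sum(1 for e, x in zip(embedded, extracted) if x is not None and x == e)
    # Num_tuple: total number of successfully embedded bits.
    num_tuple = len(embedded)
    return 100.0 * num_w / num_tuple
```

Averaging this value over repeated runs at each attack intensity would give curves of the kind plotted in Figures 10 and 11.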
The tuple addition attack inserts a certain number of new tuples into the database. All of the above schemes use DEW in the embedding phase, and this approach keeps watermark embedding and extraction synchronized, so the watermark extraction rate of every scheme under the tuple addition attack is 100%.
The tuple reordering attack shuffles the tuples and rearranges them in some order. FASADEW selects tuples with a MAC function, and reordering does not change the result of this MAC-based selection. The watermark extraction rate of FASADEW under the tuple reordering attack is therefore 100%, which means that FASADEW is fully resistant to this attack.
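The following minimal sketch shows why a keyed selection of this kind is unaffected by tuple order and by added tuples. It assumes each tuple has a primary key and that the group is derived from an HMAC-SHA256 of that key; the helper names `group_of` and `partition`, the secret key, and the number of groups are illustrative assumptions rather than the paper's actual parameters.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List


def group_of(primary_key: str, secret_key: bytes, num_groups: int) -> int:
    """Map a tuple to a group using only its primary key and the secret key."""
    digest = hmac.new(secret_key, primary_key.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % num_groups


def partition(primary_keys: Iterable[str], secret_key: bytes,
              num_groups: int) -> Dict[int, List[str]]:
    """Group tuples by their MAC value; the position in the table is never used."""
    groups: Dict[int, List[str]] = {g: [] for g in range(num_groups)}
    for pk in primary_keys:
        groups[group_of(pk, secret_key, num_groups)].append(pk)
    # Sort inside each group so the result is independent of arrival order.
    return {g: sorted(members) for g, members in groups.items()}
```

Because the group of a tuple depends only on its own key, shuffling the table yields the same partition, and tuples inserted by an attacker do not move any original tuple to a different group.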

Conclusions and future directions
In this paper, we propose a reversible and robust digital data watermarking technique for relational databases. We combine the firefly algorithm with the simulated annealing method to reduce distortion and improve the robustness of database watermarking. FASADEW selects the initial population with the firefly algorithm and uses the simulated annealing algorithm to find the optimal embedding location of the watermark, which preserves the quality of the watermarked database. Experimental results show that FASADEW introduces less distortion and improves the robustness of the watermark. Because of the grouping mechanism, the method can extract most of the watermark information and recover most of the data even after a malicious attack. Experiments are conducted for several attack scenarios, and the results show that FASADEW recovers the embedded watermark and the original data well regardless of the percentage of tuples the attacker adds. FASADEW is compared with recently proposed techniques such as DEW, GADEW, FADEW, and PSODEW, and the results show that FASADEW outperforms all of them. Future work is to develop reversible and robust watermarking for non-digital data and to propose watermarking schemes for shared databases in a distributed environment.

Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

Step 1: Calculate the average of x and y, $avg = \lfloor (x+y)/2 \rfloor = 203$, and the difference $d = x - y = 5$.
Step 2: Define the new difference $\tilde{d} = 2d + b = 11$ and calculate the new values $\tilde{x} = avg + \lfloor (\tilde{d}+1)/2 \rfloor = 209$ and $\tilde{y} = avg - \lfloor \tilde{d}/2 \rfloor = 198$.
The above is the forward step of differential expansion. After obtaining $\tilde{x}$ and $\tilde{y}$, use them to replace the original x and y to get the data containing the watermark. The method to recover the original data and the corresponding bit b is as follows:
Step 1: Calculate the average of $\tilde{x}$ and $\tilde{y}$, $\widetilde{avg} = \lfloor (\tilde{x}+\tilde{y})/2 \rfloor = 203$, and the difference $\tilde{d} = \tilde{x} - \tilde{y} = 11$.
Step 2: Extract the watermark bit $b = \tilde{d} - \lfloor \tilde{d}/2 \rfloor \times 2 = 1$.
Step 3: The original data can be recovered from $\tilde{d}$ and b obtained in Steps 1 and 2.
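The worked example above can be checked with a short sketch of differential expansion. The input values x = 206 and y = 201 are not stated in the example; they are inferred here from avg = 203 and d = 5. The recovery formulas in `dew_extract` follow the standard inverse of difference expansion and are an assumption about how Step 3 is carried out; both function names are hypothetical.

```python
from typing import Tuple


def dew_embed(x: int, y: int, bit: int) -> Tuple[int, int]:
    """Forward differential expansion: hide one bit in the pair (x, y)."""
    avg = (x + y) // 2          # floor of the average
    d = x - y                   # original difference
    d_new = 2 * d + bit         # expanded difference carrying the bit
    return avg + (d_new + 1) // 2, avg - d_new // 2


def dew_extract(x_w: int, y_w: int) -> Tuple[int, int, int]:
    """Inverse step: return (original x, original y, embedded bit)."""
    avg = (x_w + y_w) // 2      # the average is preserved by embedding
    d_new = x_w - y_w
    bit = d_new - (d_new // 2) * 2
    d = d_new // 2              # assumed recovery of the original difference
    return avg + (d + 1) // 2, avg - d // 2, bit


print(dew_embed(206, 201, 1))   # (209, 198)
print(dew_extract(209, 198))    # (206, 201, 1)
```

Python's `//` operator rounds toward negative infinity, which matches the floor brackets in the steps above, so the same code also handles pairs with x < y.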

4.2.1. Creating the initial population
Create an initial population $F = \{F_1, F_2, \dots, F_P\}$ using P fireflies, where P equals the number of tuples in the group $D_i$. The position of each firefly is represented by a floating point number; for example, $F_1 = 3.2$ denotes the 3rd attribute and the 2nd attribute of the first firefly. The steps to create the initial population are as follows.

Algorithm 1 Creating the initial population method
Input: $D_i$, $w[i]$
Output: Initial population F
Step 1: Repeat step 2 for each j, $j \in [1, \dots, P]$.
Step 2: Repeat steps 2.1-2.3 for each k, $k \in [1, \dots, M]$.
Step 2.1: The user defines the number of firefly attribute columns n.
Step 2.2: Randomly select the nth attribute $At_x$ in the jth row, ensuring that the selection is not the same each time.
Step 2.3: Repeat steps 2.3.1-2.3.3 N times.
Step 2.3.1: Apply the DEW operation to $At_x$ paired with every other attribute, $(M\_At_x, M\_At_y) = DEW(At_x, At_y, w[i])$, $y \in [1, \dots, A]$, $x \neq y$.
Step 2.3.2: Calculate the distortion after applying the DEW algorithm, $distort = |M\_At_x - At_x| + |M\_At_y - At_y|$.
Step 2.3.3: Select the attribute pair with the least distortion and add it to $F_j$.
Step 3: Obtain the initial population F in $D_i$.
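The core of Algorithm 1, pairing a randomly chosen attribute with the partner that gives the least DEW distortion, can be sketched as follows. This is a simplified illustration, not the authors' implementation: it keeps one attribute pair per firefly, omits the M and N repetition counts, and stores a pair as a tuple instead of the floating point encoding. The names `best_pair` and `initial_population` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def dew_embed(x: int, y: int, bit: int) -> Tuple[int, int]:
    """Forward differential expansion of the pair (x, y) with one bit."""
    avg, d_new = (x + y) // 2, 2 * (x - y) + bit
    return avg + (d_new + 1) // 2, avg - d_new // 2


def best_pair(row: Sequence[int], x: int, bit: int) -> Tuple[int, int, int]:
    """Return (x, y, distortion) for the partner y with the least distortion."""
    if len(row) < 2:
        raise ValueError("a row needs at least two attributes")
    best = None
    for y in range(len(row)):
        if y == x:
            continue  # an attribute is never paired with itself
        m_x, m_y = dew_embed(row[x], row[y], bit)
        distort = abs(m_x - row[x]) + abs(m_y - row[y])
        if best is None or distort < best[2]:
            best = (x, y, distort)
    return best


def initial_population(group: Sequence[Sequence[int]], bit: int,
                       rng: random.Random) -> List[Tuple[int, int, int]]:
    """One firefly per tuple of the group, each holding its best attribute pair."""
    return [best_pair(row, rng.randrange(len(row)), bit) for row in group]
```

For the row `[206, 201, 150]` with the first attribute selected and bit 1, pairing with 201 changes the two values by 3 each, while pairing with 150 changes them by 29 and 28, so the first pair is kept.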

Figure 2. Partial schematic of the algorithm for creating the initial population

Figure 5. Schematic diagram of firefly movement

Algorithm 2 Watermark Embedding Algorithm
Input: Database D, E, watermark set w, watermark length $len_w$
Output: Database with watermarked values $D'$
Step 1: Repeat steps 2-4 $len_w$ times.
Step 2: Extract the ith element $(x_i, y_i)$ in the set E and the ith element in the watermark set, and embed the watermark $w[i]$ on all attribute pairs of the ith element.
Step 3: Embed the watermark into $At_x$ and $At_y$ using the DEW algorithm to get $M\_At_x$ and $M\_At_y$.
Step 4: Write $M\_At_x$ and $M\_At_y$ back to the database.
Step 5: Obtain the new database $D'$.
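A minimal sketch of the embedding loop in Algorithm 2 is given below. It assumes that each element of E can be written as a triple (row index, attribute index x, attribute index y) and that the database is a list of integer rows; this layout and the function name `embed_watermark` are assumptions made for the example, since the paper does not fix a data structure here.

```python
from typing import List, Sequence, Tuple


def dew_embed(x: int, y: int, bit: int) -> Tuple[int, int]:
    """Forward differential expansion of the pair (x, y) with one bit."""
    avg, d_new = (x + y) // 2, 2 * (x - y) + bit
    return avg + (d_new + 1) // 2, avg - d_new // 2


def embed_watermark(database: Sequence[Sequence[int]],
                    locations: Sequence[Tuple[int, int, int]],
                    watermark: Sequence[int]) -> List[List[int]]:
    """Return a watermarked copy of `database`; the input is left unchanged."""
    if len(locations) != len(watermark):
        raise ValueError("one embedding location is needed per watermark bit")
    marked = [list(row) for row in database]
    for (row, x, y), bit in zip(locations, watermark):
        # Steps 2-3: embed w[i] into the selected attribute pair with DEW.
        m_x, m_y = dew_embed(marked[row][x], marked[row][y], bit)
        # Step 4: write the modified values back.
        marked[row][x], marked[row][y] = m_x, m_y
    return marked
```

Returning a copy keeps the original table available for comparison, for example when measuring the distortion introduced by embedding.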

Figure 9. Distortion and watermark capacity with different weights

Figure 10. FASADEW's detection rate of watermarking under common attacks

Figure 11. Watermark detection rate of various algorithms under tuple deletion attack and attribute deletion attack. (a) Tuple deletion attack. (b) Attribute deletion attack.

Table 1. Notations used in the paper

Table 2. Comparison of watermarking capacity between algorithms

Table 3. Comparison of average distortion values by algorithm