Differentially Private Kernel Support Vector Machines Based on the Exponential and Laplace Hybrid Mechanism

Support vector machines (SVMs) are among the most robust and accurate of the well-known machine learning algorithms, especially for classification. SVMs train a classification model by solving an optimization problem that decides which instances in the training datasets become the support vectors (SVs). However, the SVs are intact instances taken from the training datasets, and directly releasing the classification model of an SVM carries significant risk to the privacy of individuals when the training datasets contain sensitive information. In this paper, we study the problem of how to release the classification model of kernel SVMs while preventing privacy leakage of the SVs and satisfying the requirement of privacy protection. We propose a new differentially private algorithm for kernel SVMs based on the exponential and Laplace hybrid mechanism, named DPKSVMEL. The DPKSVMEL algorithm has two major advantages over existing private SVM algorithms. One is that it protects the privacy of the SVs by postprocessing, so the training process of the non-private kernel SVMs does not change. The other is that the scoring function values are derived directly from the symmetric kernel matrix generated during the training process and require no additional storage space or complex sensitivity analysis. In the DPKSVMEL algorithm, we define a similarity parameter to denote the correlation or distance between the non-SVs and every SV. Then, every non-SV is divided into a group with one of the SVs according to the maximal value of the similarity. Under a given similarity parameter value, if the number of non-SVs in a group is greater than k, we replace the SV with the mean value of the top-k randomly selected most similar non-SVs chosen by the exponential mechanism. Otherwise, we add random noise to the SV by the Laplace mechanism. We theoretically prove that the DPKSVMEL algorithm satisfies differential privacy.
The extensive experiments show the effectiveness of the DPKSVMEL algorithm for kernel SVMs on real datasets; meanwhile, it achieves higher classification accuracy than existing private SVM algorithms.


Introduction
In recent years, with the rapid development of the collecting, storing, and processing capabilities of computing devices, data sharing and analysis are becoming easier and more practical [1]. Data mining and machine learning techniques have been gaining a great deal of attention for extracting useful information. The classification algorithm, one of the important data mining tasks, trains a classification model from labeled training datasets to classify unknown data in the future [2]. The support vector machine (SVM) [3,4] is one of the most widely used machine learning algorithms for classification in practice [5]. SVMs train a classification model by finding a solution to a convex optimization problem. Like most other classification algorithms, SVMs also have privacy issues when the training datasets contain sensitive information such as user behavior records or electronic health records. In SVMs, the support vectors (SVs) are an important component of the classification model, and they are intact instances taken from the training datasets. Directly releasing the classification model of SVMs carries significant risk to the privacy of individuals, especially for kernel SVMs [2].
More and more researchers have made great efforts on the privacy leakage problem. Differential privacy (DP) is one of the state-of-the-art models and has become an accepted standard for privacy protection in sensitive data analysis since it was proposed in a series of works by Dwork et al. [6][7][8]. In recent decades, two main research directions of DP have been developed: differentially private data publishing and differentially private data analysis [9]. Differentially private data publishing aims to output aggregate information to the public without disclosing any individual record, including transaction data publishing [10], histogram publishing [11], stream data publishing [12], graph data publishing [13], batch query publishing [14], and synthetic dataset publishing [15]. The essential task of differentially private data analysis is extending current non-private algorithms to differentially private algorithms, including supervised learning [16], unsupervised learning [17], and frequent pattern mining [18].
In this paper, we study the problem of how to release the classification model of kernel SVMs while satisfying the requirement of privacy protection. To overcome the shortcomings of the existing private SVM algorithms, such as the requirements on the differentiability of the objective function and the low classification accuracy, we propose a new differentially private algorithm for kernel SVMs. The main contributions of this paper are summarized as follows: (i) We propose the exponential and Laplace hybrid mechanism to prevent privacy leakage of the SVs. The hybrid mechanism takes advantage of both the exponential mechanism and the Laplace mechanism to improve the classification accuracy. (ii) We define a similarity parameter to denote the correlation or distance between the non-SVs and every SV. This can be obtained easily from the symmetric kernel matrix produced during the training process. Then, every non-SV is divided into a group with one of the SVs according to the maximal value of the similarity. (iii) Learning from the idea of top-k frequent pattern mining [1,19,20], we use different methods to protect the privacy of the SVs under given similarity parameter values. When the number of non-SVs within the group is greater than k, we replace the SV with the mean value of the top-k randomly selected most similar non-SVs chosen by the exponential mechanism. Otherwise, we add random noise to the SV by the Laplace mechanism.
(iv) We theoretically prove that the DPKSVMEL algorithm satisfies DP. The extensive experiments show the effectiveness of the DPKSVMEL algorithm for kernel SVMs on real datasets; meanwhile, it achieves higher classification accuracy than existing private SVM algorithms. The rest of the paper is organized as follows: In Section 2, we discuss the work related to private SVMs. In Section 3, we give a brief overview of the basic knowledge of SVMs, DP, and top-k frequent pattern mining. Section 4 proposes the DPKSVMEL algorithm, and Section 5 gives the experimental performance evaluation of the DPKSVMEL algorithm. Section 6 concludes the research work.

Related Work
In this section, we briefly review some work on private SVMs and then focus on the work related to differentially private SVMs.
There are some works related to private SVMs. Mangasarian et al. [21] proposed a highly efficient privacy-preserving SVM, PPSVM, via random kernels for vertically partitioned data. Lin et al. [2] pointed out the privacy violation problem of the SVs in the classification model of the SVM and proposed a privacy-preserving SVM classifier, PPSVC, which replaces the Gaussian kernel with a precisely approximated decision function. These two methods achieve classification accuracy similar to the original non-private SVM classifier. Nevertheless, their degree of privacy protection cannot be formally proved, unlike private SVMs based on DP.
DP is a rigorous privacy definition and has become an accepted standard for privacy protection in sensitive data analysis. The degree of privacy protection can be measured with the privacy budget parameter ε, and the classification model of the SVMs should be released under a DP guarantee. Chaudhuri et al. [22,23] proposed two popular perturbation-based techniques, output perturbation and objective perturbation, for privacy-preserving machine learning algorithm design. Output perturbation introduces randomness into the weight vector w after the optimization process, and the randomness scale is determined by the sensitivity of w. Objective perturbation introduces randomness into the objective function before the optimization process, and the randomness scale is independent of the sensitivity of w.
These two perturbation-based techniques have been applied to logistic regression and linear SVM algorithms. However, their sensitivity is difficult to analyze, and for objective perturbation the loss function needs to satisfy certain convexity and differentiability criteria. For nonlinear kernel SVMs, Chaudhuri et al. used the random projection method to approximate the kernel function, transforming the problem into linear classification and avoiding publishing private values in the training datasets directly. The disadvantage of this method is that it needs to provide the projection matrix in addition to the private classification model in the prediction process, which increases the risk of privacy leakage. Furthermore, choosing an appropriate projection dimension is also an issue to be considered. Rubinstein et al. [24] proposed two mechanisms for differentially private SVM learning, one with finite-dimensional feature mappings and one with potentially infinite-dimensional feature mappings. Both mechanisms are achieved by adding noise to the output classifier and are effective for all convex loss functions, including the most frequently used hinge loss. They also came up with a utility metric comparing the similarity of the classifiers released by private and non-private SVMs. Their mechanisms are valid only for translation-invariant kernels. Li et al. [25] developed a hybrid private SVM model that uses public data and private data together. They leveraged a small portion of open-consented data to calculate the Fourier transformation and alleviate excessive noise in the final outputs. However, public data are hard to obtain in the private world. Liu et al. [26] proposed a private classification algorithm, LabSam, with high classification accuracy under DP when the labeled data are limited and the privacy budget is small. Their algorithm implements random sampling under the exponential mechanism, differing from the perturbation-based methods. Zhang et al. [27] constructed a novel private SVM classifier, DPSVMDVP, based on dual variable perturbation, which adds Laplace noise to the corresponding dual variables according to the ratio of errors.

Preliminaries
In this section, we give a brief overview of the basic knowledge of SVMs, DP, and top-k frequent pattern mining.

Support Vector Machines.
The SVM is an efficient learning method for classification based on structural risk minimization [3]. It aims to find an optimal separating hyperplane with a maximal margin to separate two classes of the given instances. The maximal margin corresponds to the shortest distance between the closest data points and any point on the hyperplane. Xue et al. [28] described the complete calculation process of the decision function in detail. The main task in training an SVM is to solve the following quadratic programming problem [29,30]:

\min_{\alpha} \frac{1}{2}\alpha^{T} Q \alpha - e^{T}\alpha, \quad \text{subject to } 0 \le \alpha_i \le C,\ i = 1, \ldots, l,\ \text{and } y^{T}\alpha = 0. \tag{1}

In equation (1), Q denotes a symmetric kernel matrix with Q_{ij} = y_i y_j K(x_i, x_j), where K is the kernel function, \alpha is the dual vector, and x_i and y_i denote the i-th training instance and its label, respectively. Equation (1) can be solved efficiently by the sequential minimal optimization algorithm [30]. After the optimization process, we obtain the decision function:

f(x) = \operatorname{sign}\left(\sum_{i=1}^{l} \alpha_i y_i K(x_i, x) + b\right). \tag{2}

From equation (2), we can conclude that the classification model of the SVMs is composed of the dual variables \alpha and the SVs. Directly releasing a classification model containing original instances of the training datasets is a very serious privacy issue.
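The quantities in equations (1) and (2) can be sketched numerically as follows; the toy training set, the dual values α, and the bias b here are illustrative stand-ins (not a solved QP), assuming the radial basis kernel used later in the paper:

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

# Toy training set: 4 instances, 2 attributes, labels in {-1, +1}.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1, -1, 1, 1])
gamma = 1.0 / X.shape[1]          # default gamma = 1/n for n attributes

# Symmetric kernel matrix Q with Q_ij = y_i y_j K(x_i, x_j), as in equation (1).
l = len(X)
Q = np.array([[y[i] * y[j] * rbf_kernel(X[i], X[j], gamma)
               for j in range(l)] for i in range(l)])
assert np.allclose(Q, Q.T)        # Q is symmetric

# After solving the QP (e.g. by SMO), only instances with alpha_i > 0 are SVs.
# This alpha is a hypothetical solution used purely for illustration.
alpha, b = np.array([0.8, 0.0, 0.8, 0.0]), 0.0

def decision(x):
    # f(x) = sign(sum_i alpha_i y_i K(x_i, x) + b) -- equation (2)
    return np.sign(sum(alpha[i] * y[i] * rbf_kernel(X[i], x, gamma)
                       for i in range(l)) + b)
```

Note that evaluating `decision` requires the SVs (the rows of X with nonzero α) verbatim, which is exactly the leakage the paper addresses.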

Differential Privacy.
With the advent of the digital age, more and more personal information is being collected and shared by mobile devices and web services to improve the quality of these services. At the same time, this raises privacy concerns for data contributors. DP [6][7][8][31] provides a mathematically rigorous definition of privacy for private data analysis. It guarantees that any possible outcome of the data analysis is hardly any different whether or not an individual participates in the database. The maximal difference of the outcome is controlled by a small privacy budget parameter ε. Formally, the definitions related to DP are given in the following.
Definition 1 (ε-DP [6]). A randomized algorithm K satisfies ε-DP if, for all datasets D and D′ differing on at most one instance and for all subsets of possible outcomes S ⊆ Range(K),

\Pr[K(D) \in S] \le e^{\varepsilon} \cdot \Pr[K(D') \in S]. \tag{3}

Definition 2 (sensitivity [6]). For a given query function f: D → R^d and neighboring datasets D and D′, the sensitivity of f is defined as

\Delta f = \max_{D, D'} \lVert f(D) - f(D') \rVert_1. \tag{4}

Currently, there are two principal mechanisms used for realizing DP: the Laplace mechanism for numerical queries and the exponential mechanism for nonnumerical queries.
Definition 3 (Laplace mechanism [8]). For a numeric function f: D → R^d, the algorithm K that answers f as in equation (5) provides ε-DP:

K(D) = f(D) + \left(\operatorname{Lap}\left(\frac{\Delta f}{\varepsilon}\right)\right)^{d}, \tag{5}

where Lap(Δf/ε) is a random variable sampled from the Laplace distribution with mean 0 and standard deviation √2·Δf/ε. The Laplace mechanism retrieves the true result of the numerical query and then perturbs it by adding independent random noise calibrated to the sensitivity.
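A minimal sketch of the Laplace mechanism, assuming noise is drawn independently per coordinate with scale Δf/ε:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=np.random.default_rng()):
    # Add Lap(sensitivity / epsilon) noise independently to each coordinate
    # of the true query answer (equation (5)).
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale, size=np.shape(true_value))
```

A smaller ε means a larger noise scale, i.e., stronger privacy at the cost of accuracy.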
Definition 4 (exponential mechanism [7]). Let q(D, r) be a scoring function on a dataset D that measures the quality of an output r ∈ R, and let Δq denote its sensitivity. The algorithm K satisfies ε-DP if it outputs r with probability

\Pr[K(D) = r] \propto \exp\left(\frac{\varepsilon\, q(D, r)}{2 \Delta q}\right). \tag{6}

The exponential mechanism is useful for selecting a discrete output in a differentially private manner; it employs the scoring function q to evaluate the quality of each output r, every output having a nonzero probability of selection.
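The exponential mechanism can be sketched as follows; subtracting the maximum score before exponentiating is a standard numerical-stability trick and does not change the sampling distribution:

```python
import numpy as np

def exponential_mechanism(candidates, scores, sensitivity, epsilon,
                          rng=np.random.default_rng()):
    # Pr[output r] proportional to exp(epsilon * q(D, r) / (2 * sensitivity)),
    # as in equation (6).
    scores = np.asarray(scores, dtype=float)
    weights = np.exp(epsilon * (scores - scores.max()) / (2 * sensitivity))
    probs = weights / weights.sum()
    return candidates[rng.choice(len(candidates), p=probs)]
```

With a large ε the highest-scoring candidate is chosen almost surely; with a small ε the choice approaches uniform.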

Top-k Frequent Pattern Mining.

Frequent pattern mining aims to discover items that frequently appear together in a transaction dataset. Directly releasing the discovered frequent patterns with their support counts will violate individual privacy. Therefore, the top-k most frequent patterns should be released under a DP guarantee. Zhang et al. [1] proposed the DFP-Growth algorithm, which accurately finds the top-k frequent patterns with noisy support counts while satisfying DP. The DFP-Growth algorithm performs two key steps: first, it selects the top-k frequent patterns by the exponential mechanism; second, it perturbs the true support count of each top-k pattern by the Laplace mechanism.
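A simplified sketch of this two-step idea (not the actual DFP-Growth algorithm): the per-step budget split below is an illustrative assumption, and the scoring function is the raw support count with sensitivity 1:

```python
import numpy as np

def private_topk(counts, k, epsilon, rng=np.random.default_rng()):
    # counts: dict mapping pattern -> true support count.
    patterns = list(counts)
    eps_each = epsilon / (2 * k)        # naive split over k selections + k noisy counts
    chosen, remaining = [], set(patterns)
    # Step 1: pick k patterns one at a time by the exponential mechanism
    # (scoring function = support count, sensitivity 1).
    for _ in range(k):
        cand = sorted(remaining)
        scores = np.array([counts[p] for p in cand], dtype=float)
        w = np.exp(eps_each * (scores - scores.max()) / 2)
        pick = cand[rng.choice(len(cand), p=w / w.sum())]
        chosen.append(pick)
        remaining.discard(pick)
    # Step 2: perturb the true support count of each selected pattern
    # by the Laplace mechanism (sensitivity 1).
    return {p: counts[p] + rng.laplace(0.0, 1.0 / eps_each) for p in chosen}
```

The same select-then-perturb pattern motivates the hybrid mechanism proposed in this paper.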

Materials and Methods
To solve the privacy leakage problem of the SVs in the classification model of kernel SVMs, we propose the DPKSVMEL algorithm based on the exponential and Laplace hybrid mechanism. The privacy of the SVs is protected by postprocessing the non-private classification model with DP, while the training process of the original SVMs is not changed. Firstly, we train a non-private kernel SVM to obtain a classification model including the dual vector α and the SVs. Secondly, we define a similarity parameter to denote the correlation or distance between the non-SVs and every SV, and then every non-SV is divided into a group with one of the SVs according to the maximal value of the similarity.
Thirdly, we use either the exponential mechanism or the Laplace mechanism to generate a new SV and then replace the original SV within each group under a given similarity parameter value; which mechanism is used depends on whether the number of non-SVs in the group is greater than k. Lastly, we output the classification model with the private SVs. Figure 1 gives an example of the implementation process of the DPKSVMEL algorithm.
In Figure 1, there are three SVs and eight non-SVs. The squares represent the SVs, the small circles represent the non-SVs, the triangles represent the private SVs, and the big circles represent groups. Every group is drawn in a different color and viewed as a hypersphere with the SV as the center and the similarity as the radius. Every non-SV is divided into one group according to the maximal value of its similarity with every SV. In particular, non-SVs located at the intersection of multiple groups still belong to exactly one group, so as to satisfy the parallel composition property of DP. We set the parameter k to 2 in this example. In the red and yellow groups, the number of non-SVs is greater than k; we use the exponential mechanism to randomly select the two most similar non-SVs and then generate a new private SV as their mean value. However, there is only one non-SV in the blue group, so we use the Laplace mechanism to generate a new private SV with noise. Therefore, the SVs in the final classification model are all private ones, preventing privacy leakage.
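The grouping step described above can be sketched as follows, assuming the similarity block of Q (rows: non-SVs, columns: SVs) is already available; the function name is ours:

```python
import numpy as np

def group_non_svs(similarity, lls):
    # similarity: (num_non_svs x num_svs) block of Q, i.e. Similarity_ij
    # for non-SV i and SV j.  Each non-SV joins exactly one group (its
    # most similar SV), provided that similarity is at least the lower
    # limit LLs; otherwise it is discarded.  The resulting groups are
    # disjoint, so the parallel composition property of DP applies.
    groups = {j: [] for j in range(similarity.shape[1])}
    for i, row in enumerate(similarity):
        j = int(np.argmax(row))
        if row[j] >= lls:
            groups[j].append(i)
    return groups
```

A non-SV lying in the intersection of two hyperspheres is still assigned only to the SV with the larger similarity, keeping the groups disjoint.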

Similarity Parameter and Sensitivity.
In the DPKSVMEL algorithm, the similarity is a vital parameter. We view the symmetric kernel matrix Q in equation (1) as the probability of similarity between every two instances in the datasets, especially for the radial basis kernel.
Definition 6 (similarity). For a non-SV x_i and a SV x_j, the similarity between them is defined as

\operatorname{Similarity}_{ij} = Q_{ij} = y_i y_j K(x_i, x_j). \tag{7}

In equation (7), Similarity is a subset of Q. It is obtained easily from the classification model and requires no extra complicated computation. The smaller the distance between a non-SV and a SV, the greater the value of the Similarity when they have the same label. If they have different labels, the value of the Similarity is less than zero, and the corresponding non-SV is discarded from participating in the calculation within the group.
The Similarity is viewed as the probability of the correlation between a non-SV and a SV. In the DPKSVMEL algorithm, we set a lower limit for the Similarity, named LLs, to denote the minimum value of the correlation or the maximum value of the distance between them. If the value of the Similarity is less than LLs, the correlation is too small or the distance is too large between the non-SV and the SV, and the non-SV is also discarded from the group. After all the non-SVs are divided into groups, we use the exponential and Laplace hybrid mechanism to generate a new SV in every group. When LLs is fixed, the radius of the hypersphere corresponding to each group is determined. Then, the sensitivities of the exponential mechanism and the Laplace mechanism are easily calculated from LLs. They are denoted by Sensitivity_em and Sensitivity_lm, respectively.
In the exponential mechanism, we use the Similarity as the scoring function. Under a fixed LLs, the maximum value of the similarity within a group is 1, attained when a non-SV coincides with the SV, and the minimum value is LLs. Therefore, the sensitivity of the exponential mechanism is

\operatorname{Sensitivity}_{em} = 1 - \operatorname{LLs}. \tag{8}

In the Laplace mechanism, we define the radius of the hypersphere R to denote the maximal distance between the non-SVs and the SV within a group, which corresponds to the lower limit of the Similarity LLs. From equation (7), for the radial basis kernel the relationship between LLs and R is \operatorname{LLs} = \exp(-\gamma R^2), that is,

R = \sqrt{\frac{-\ln(\operatorname{LLs})}{\gamma}},

where γ is a scale parameter with a default value of 1/n for a dataset with n attributes, and R denotes the maximal distance between every non-SV and the SV within the group. As all the attributes in a dataset are independent, the biggest change of any single attribute within a group is at most \sqrt{-\ln(\operatorname{LLs})}, based on the formula for the distance between two points. Therefore, the sensitivity of the Laplace mechanism is

\operatorname{Sensitivity}_{lm} = \sqrt{-\ln(\operatorname{LLs})}. \tag{9}
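The two sensitivities can be computed directly from LLs, following equations (8) and (9):

```python
import numpy as np

def sensitivities(lls):
    # Equation (8): the scoring function (similarity) ranges over [LLs, 1]
    # within a group, so the exponential-mechanism sensitivity is 1 - LLs.
    sens_em = 1.0 - lls
    # Equation (9): with LLs = exp(-gamma * R^2) and gamma = 1/n, each of
    # the n independent attributes changes by at most sqrt(-ln(LLs)) inside
    # the hypersphere of radius R, giving the Laplace-mechanism sensitivity.
    sens_lm = np.sqrt(-np.log(lls))
    return sens_em, sens_lm
```

For example, LLs = e^(-1) gives Sensitivity_em = 1 - e^(-1) and Sensitivity_lm = 1; as LLs approaches 1, both sensitivities shrink toward 0.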

Privacy Budget Allocation.
In a DP algorithm, the privacy budget ε is another vital parameter. It determines the level of privacy protection for a randomized algorithm. The smaller the privacy budget, the higher the level of privacy protection. When the allocated privacy budget runs out, the randomized algorithm K loses its privacy protection. In the DPKSVMEL algorithm, every non-SV is divided into a group with one of the SVs, and there are no common instances between groups. Therefore, the DPKSVMEL algorithm satisfies the parallel composition property of DP, and there is no need to split the privacy budget between the exponential mechanism and the Laplace mechanism.

Description of the DPKSVMEL Algorithm.
In the DPKSVMEL algorithm, DP is achieved by the exponential and Laplace hybrid mechanism. The description of the DPKSVMEL algorithm is shown in Algorithm 1.
The DPKSVMEL algorithm protects the privacy of the SVs by postprocessing the non-private classification model. It builds a single group for every SV and divides every non-SV into one of the groups according to its similarity. The private SVs are constructed by the exponential mechanism when the number of non-SVs within the group is greater than k; otherwise, they are constructed by the Laplace mechanism. Finally, the DPKSVMEL algorithm outputs the private classification model. Because the running time of the postprocessing depends only on the number of SVs, its time complexity is much less than O(n), where n denotes the number of training instances.
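Putting the pieces together, a sketch of the postprocessing loop (function and variable names are ours, not from the paper; the trained SVs and the similarity block of Q are assumed given):

```python
import numpy as np

def dpksvmel_postprocess(svs, non_svs, similarity, k, epsilon, lls,
                         rng=np.random.default_rng()):
    # svs: (m x n) support vectors; non_svs: (p x n) non-support vectors;
    # similarity: (p x m) similarity block of Q.  Returns private SVs.
    sens_em, sens_lm = 1.0 - lls, np.sqrt(-np.log(lls))   # equations (8), (9)
    private = svs.copy()
    for j in range(len(svs)):
        # Group j: non-SVs whose maximal similarity points to SV j and is
        # no less than LLs (the groups are therefore disjoint).
        members = [i for i in range(len(non_svs))
                   if np.argmax(similarity[i]) == j and similarity[i, j] >= lls]
        if len(members) > k:
            # Exponential mechanism: sample k non-SVs with probability
            # proportional to exp(eps * similarity / (2 * sensitivity)),
            # and replace the SV with their mean value.
            s = similarity[members, j]
            w = np.exp(epsilon * (s - s.max()) / (2 * sens_em))
            pick = rng.choice(members, size=k, replace=False, p=w / w.sum())
            private[j] = non_svs[pick].mean(axis=0)
        else:
            # Laplace mechanism: perturb every attribute of the SV.
            private[j] = svs[j] + rng.laplace(0.0, sens_lm / epsilon,
                                              size=svs[j].shape)
    return private
```

Since every non-SV participates in at most one group, the full privacy budget ε can be spent in each group by parallel composition.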

Privacy Analysis.
In the DPKSVMEL algorithm, randomness is introduced by the exponential and Laplace hybrid mechanism. According to the definition of DP, we prove in Theorem 1 that the DPKSVMEL algorithm satisfies DP.

Theorem 1. DPKSVMEL algorithm satisfies DP.
Proof. In the DPKSVMEL algorithm, DP is achieved by postprocessing the non-private classification model. Every SV is viewed as the center of a group, and there is no intersection between groups. We consider the impact of adding one instance to the dataset on the classification model in the following three cases. The first is that the new instance becomes a SV; then one new group needs to be handled by the Laplace mechanism. The second is that the new instance is a non-SV and is divided into a group, which only adds one non-SV to be randomly selected by the exponential mechanism. The third is that the new instance is a non-SV and does not belong to any group, and the classification model shows no change. Based on the sensitivity computations in equations (8) and (9), either the exponential mechanism or the Laplace mechanism for dealing with one group satisfies DP. According to the parallel composition property of DP, the DPKSVMEL algorithm satisfies DP.

Results
In this section, we compared the performance of the DPKSVMEL algorithm with the newest private SVM algorithms, LabSam [26] and DPSVMDVP [27]. PrivateSVM [24] does not provide practical, comparable experimental results, and the experimental datasets used in the hybrid SVM [25] can no longer be obtained.

Datasets.
The datasets in our experiments are commonly used for testing the performance of SVM algorithms and are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/. Table 1 shows the basic information of the eight datasets and the classification accuracy of the non-private SVM with the default parameters based on LIBSVM (version 3.24) [33]. We use the radial basis function as the kernel function in the experiments.

Algorithm Performance Experiments.
In this section, we evaluated the performance of the DPKSVMEL algorithm by the Accuracy and the AUC (the area under a ROC curve). The higher their values, the better the usability of the algorithm. To evaluate the algorithm performance under different parameter settings, we set k to 2 and 3, set the privacy budget ε to 0.1, 0.5, and 1, and set the lower limit of the Similarity from 0.5 to 0.9. To avoid the influence of randomness on the algorithm performance, we executed the DPKSVMEL algorithm 10 times under every set of parameters. Tables 2-9 show the mean value, standard deviation, maximum value, and minimum value of the Accuracy and AUC on the eight datasets. The values in bold represent the best mean value of Accuracy and AUC under the same privacy budget. The running time of the DPKSVMEL algorithm is shown in Table 10.
Algorithm 1: The DPKSVMEL algorithm.
Input: Q: symmetric kernel matrix; ε: privacy budget; LLs: lower limit of the Similarity; N_ns: the number of non-SVs in a group; k: the number of non-SVs selected in the exponential mechanism.
Output: SV_p: private SVs.
Begin
(1) obtain a non-private classification model including the dual vector α and the SVs by training a kernel SVM;
(2) get the Similarity matrix from the subset of Q in which the Similarity value is no less than LLs;
(3) divide every non-SV into one group according to the maximal value of its similarity with every SV;
(4) for i in every group
(5)   if N_ns > k then
(6)     compute the probability Pr_ns for every non-SV from its Similarity value;
(7)     randomly select the most similar k non-SVs with probability Pr_ns by the exponential mechanism;
(8)     SV_pi = the mean value of the selected k non-SVs;
(9)   else
(10)    for every attribute of the SV
(11)      SV_pij = SV_ij + Laplace(Sensitivity_lm/ε);
(12)    end for
(13)  end if
(14) end for
(15) return SV_p
End

Based on the above experimental results, an N-way ANOVA (analysis of variance) was conducted to compare the effects of the three parameters and their interactions, illustrated by the bar graphs on datasets Australian and Breast. The effect of the three parameters on the experimental performance is consistent with the result of the ANOVA. The larger the privacy budget ε, the higher the classification accuracy of the algorithm. The experimental performance for the parameter Similarity is mainly affected by which privacy protection mechanism is adopted. When its value is small, there are more non-SVs within each group and the exponential mechanism plays a greater role.
Otherwise, the Laplace mechanism plays a greater role. The DPKSVMEL algorithm is more stable on dataset Breast.
We compared the performance of the DPKSVMEL algorithm and the non-private SVM under different privacy budgets on datasets Australian and Breast in Figures 6-9. With the increase of the privacy budget, the performance of the DPKSVMEL algorithm gradually reaches or even exceeds that of the non-private SVM. We then compared the Accuracy of the DPKSVMEL algorithm and the LabSam algorithm under different privacy budgets in Figures 10-14. Finally, we compared the Accuracy of the DPKSVMEL algorithm and the DPSVMDVP algorithm under different privacy budgets on dataset Splice in Figure 15. Compared to the LabSam algorithm and the DPSVMDVP algorithm, our DPKSVMEL algorithm has higher classification accuracy and is closer to the non-private SVM under the same privacy budget.
In addition, as the DPKSVMEL algorithm does not change the training process of the classical non-private SVMs, a number of new optimization methods can be easily combined with our proposed algorithm to improve the classification accuracy, for example, CRP algorithm [34] for bilinear analysis and NI-SVM [35] in event analysis tasks.

Conclusions
In this paper, we studied the privacy problem of the classification model of kernel SVMs and proposed the DPKSVMEL algorithm based on the exponential and Laplace hybrid mechanism. The privacy of the SVs is protected by postprocessing the non-private classification model with DP to prevent privacy leakage. The DPKSVMEL algorithm is theoretically proved to satisfy DP and overcomes some shortcomings of the existing private SVM algorithms. Firstly, the postprocessing in the DPKSVMEL algorithm does not change the training process of the non-private kernel SVMs. Therefore, no complex sensitivity analysis is required, unlike output perturbation and objective perturbation. Secondly, the DPKSVMEL algorithm avoids the additional risk of privacy disclosure and the choice of projection dimension caused by the transformation from nonlinear SVMs to linear SVMs using random projection, as in PrivateSVM and the hybrid SVM. Meanwhile, the DPKSVMEL algorithm has higher classification accuracy than the newest private SVM algorithms, LabSam and DPSVMDVP, under the same privacy budget. However, the DPKSVMEL algorithm views the value of the kernel function as the probability of similarity between a non-SV and a SV, which is valid only for kernel functions with values in the range 0 to 1. Furthermore, the DPKSVMEL algorithm performs poorly on datasets with a high proportion of SVs, especially when the similarity lower limit is small. In the future, we will work in two directions. One is to extend the DPKSVMEL algorithm to more kernel functions. The other is to consider setting different similarity lower limits for different groups.
Data Availability

The datasets in our experiments are commonly used for testing the performance of SVM algorithms and are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/.

Conflicts of Interest
The authors declare that they have no conflicts of interest.