Maximized Privacy-Preserving Outsourcing on Support Vector Clustering

Despite its remarkable capability in handling arbitrary cluster shapes, support vector clustering (SVC) suffers from pricey storage of kernel matrix and costly computations. Outsourcing data or function on demand is intuitively expected, yet it raises a great violation of privacy. We propose maximized privacy-preserving outsourcing on SVC (MPPSVC), which, to the best of our knowledge, is the first all-phase outsourceable solution. For privacy-preserving, we exploit the properties of homomorphic encryption and secure two-party computation. To break through the operation limitation, we propose a reformative SVC with elementary operations (RSVC-EO, the core of MPPSVC), in which a series of designs make selective outsourcing phase possible. In the training phase, we develop a dual coordinate descent solver, which avoids interactions before getting the encrypted coefficient vector. In the labeling phase, we design a fresh convex decomposition cluster labeling, by which no iteration is required by convex decomposition and no sampling checks exist in connectivity analysis. Afterward, we customize secure protocols to match these operations for essential interactions in the encrypted domain. Considering the privacy-preserving property and efficiency in a semi-honest environment, we proved MPPSVC’s robustness against adversarial attacks. Our experimental results confirm that MPPSVC achieves comparable accuracies to RSVC-EO, which outperforms the state-of-the-art variants of SVC.


Introduction
Clustering forms natural groupings of data samples that maximize intra-cluster similarity and minimize inter-cluster similarity. Inspired by support vector machines (SVMs), support vector clustering (SVC) has attracted many studies for remarkably handling clusters with arbitrary shape [1][2][3][4]. Various application areas are closely related to it, e.g., information retrieval and analysis, signal processing, and traffic behavior identification [2,4,5,6]. However, building a good clustering model requires a large number of valid training samples and a substantial iterative analysis under specific metrics; hence, it is frequently hard for individuals or small organizations to build their model on those quickly accumulated data. The pricey storage of the kernel matrix and costly computations by the dual

(1) A reformative SVC with elementary operations (RSVC-EO) is proposed with an updated dual coordinate descent (DCD) solver for the dual problem. In the training phase, it prefers a linear method to approach the objective without losing validity. Furthermore, it easily trades time for space by calculating kernel values on demand.

(2) A fresh convex decomposition cluster labeling (FCDCL) method is presented, by which iterations are not required for convex decomposition. In connectivity analysis, the traditional segmer sampling checks in feature space are also avoided. Consequently, to complete calculations in each step of the outsourcing scene, a single interaction is sufficient for prototype finding and connectivity analysis. Without a mass of essential iterations, the labeling phase will not hurt the outsourcing.
Towards privacy-preserving, MPPSVC is proposed by introducing homomorphic encryption, two-party computation, and linear transformation to reconstruct RSVC-EO. Although these protocols limit the calculation types, MPPSVC works well with maximized outsourcing capability. In the field of SVC, to the best of our knowledge, it is the first known client-server privacy-preserving clustering framework solving the dual problem without interaction. Furthermore, it can securely outsource the cluster number analysis and cluster assignment on demand.
The remainder of the paper is organized as follows. Section 2 briefly describes the classic SVC and the homomorphic encryption and discusses the privacy violation and limited operations of SVC outsourcing. In Section 3, we first reformulate the dual problem and propose the DCD solver for it. Meanwhile, we present FCDCL, which completes convex decomposition without iterative analysis and finishes connectivity analysis irrespective of sampling. By integrating these designs, we give an implementation of RSVC-EO in the plain domain (PD). Section 4 proposes MPPSVC for the client-server environment in the encrypted domain (ED) and presents the core idea of securely outsourcing the three crucial tasks of RSVC-EO. Section 5 gives performance and security analysis for these ED techniques. Section 6 reviews the related works. Finally, Section 7 draws the conclusions and outlines future work.


Estimation of a Trained Support Function
Assume that a dataset $X$ has $N$ data samples $\{x_1, \ldots, x_N\}$, where $x_i \in \mathbb{R}^d$ with $i \in [1, N]$. Through a nonlinear function $\Phi(\cdot)$, SVC maps data samples from data space to a high-dimensional feature space and finds a sphere with the minimal radius which contains most of the mapped data samples. This sphere, when mapped back to the data space, can be partitioned into several components, each enclosing an isolated cluster of samples. In mathematical formulation, the spherical radius $R$ is subject to

$$\min_{R,\alpha,\xi_i} R^2 + C\sum_{i}\xi_i \quad \text{s.t.} \quad \|\Phi(x_i) - \alpha\|^2 \le R^2 + \xi_i,\ \xi_i \ge 0, \tag{1}$$

where $\alpha$ is the sphere center, $\xi_i$ is a slack variable, and $C$ is a constant controlling the penalty of noise. The corresponding dual problem is

$$\max_{\beta} \sum_{j}\beta_j K(x_j, x_j) - \sum_{i,j}\beta_i\beta_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{j}\beta_j = 1,\ 0 \le \beta_j \le C,\ j = 1, \ldots, N. \tag{2}$$

Following Jung et al. [12], the sphere can be simply estimated by a support function that is defined as a positive scalar function $f : \mathbb{R}^d \to \mathbb{R}^+$. After solving the dual problem in Equation (2), we estimate the support function by support vectors (SVs) whose corresponding coefficients $\beta_i \in (0, C)$ for $i = 1, \ldots, N$.

By optimizing Equation (2) with the Gaussian kernel $K(x_i, x_j) = e^{-q\|x_i - x_j\|^2}$, the objective trained support function is formulated by

$$f(x) = K(x, x) - 2\sum_{j}\beta_j K(x_j, x) + \sum_{i,j}\beta_i\beta_j K(x_i, x_j). \tag{3}$$

Theoretically, the radius $R$ of the hypersphere is usually defined by the square root of $f(x_i)$, where $x_i$ is one of the SVs.

Cluster Assignments
Since SVs are located on the border of clusters, a simple graphical connected-component method can be used for cluster labeling. For any two samples, $x_i$ and $x_j$, we check $m$ segmers on the line segment connecting them by traveling their images in the hypersphere. According to Equation (3), $x_i$ and $x_j$ should be labeled with the same cluster index if all the $m$ segmers lie in the hypersphere, i.e., $f(x_{\hat{m}}) \le R^2$ for $\hat{m} \in [1, m]$. Otherwise, they are in two different clusters.
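The segmer check described above can be sketched in a few lines. This is a minimal illustration, assuming the standard SVC form of the support function with a Gaussian kernel; the names `support_function` and `same_cluster`, the evenly spaced sampling, and the default `m` are illustrative choices rather than the authors' implementation.

```python
import math

def gaussian_kernel(a, b, q):
    """Gaussian kernel K(a, b) = exp(-q * ||a - b||^2)."""
    return math.exp(-q * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def support_function(x, svs, betas, q):
    """Assumed standard form: f(x) = K(x,x) - 2*sum_j b_j K(x_j,x) + sum_ij b_i b_j K(x_i,x_j)."""
    cross = sum(b * gaussian_kernel(s, x, q) for s, b in zip(svs, betas))
    const = sum(bi * bj * gaussian_kernel(si, sj, q)
                for si, bi in zip(svs, betas)
                for sj, bj in zip(svs, betas))
    return 1.0 - 2.0 * cross + const  # K(x, x) = 1 for the Gaussian kernel

def same_cluster(xi, xj, svs, betas, q, r2, m=10):
    """Return True if all m interior sample points on the segment stay inside the sphere."""
    for k in range(1, m + 1):
        t = k / (m + 1)  # evenly spaced interior points (an illustrative choice)
        point = [a + t * (b - a) for a, b in zip(xi, xj)]
        if support_function(point, svs, betas, q) > r2:
            return False
    return True

# Two tight pairs of SVs that are far apart from each other.
svs = [(0.0, 0.0), (0.2, 0.0), (5.0, 0.0), (5.2, 0.0)]
betas = [0.25, 0.25, 0.25, 0.25]
r2 = support_function(svs[0], svs, betas, q=1.0)  # squared radius taken at an SV
```

Each call to `same_cluster` costs `m` evaluations of the support function, which is the sampling overhead that the labeling method proposed later avoids.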

Homomorphic Encryption
Multiplication is critical for SVC. Although fully homomorphic encryption protects multiplicative items well, the existing schemes are far from practical use. Since multiplication can be replaced by addition, we prefer using the Paillier cryptosystem [13], an additively homomorphic public-key encryption scheme, to secure clustering analysis and data exchange. It was also adopted by the authors of [14][15][16][17][18] for simplicity and generality. Based on the decisional composite residuosity problem, the Paillier cryptosystem is provably semantically secure. This means that, for some composite $n$ and a given integer $z$, we cannot decide whether there exists some $y \in \mathbb{Z}^*_{n^2}$ such that $z = y^n \bmod n^2$. A brief description of the Paillier cryptosystem is given as follows.

Let $n = pq$ where $p$ and $q$ are two large primes, $r$ is randomly selected in $\mathbb{Z}^*_n$, and $g$ is randomly selected in $\mathbb{Z}^*_{n^2}$ such that $\gcd(L(g^{\lambda} \bmod n^2), n) = 1$ with $L(u) = \frac{u-1}{n}$ and $\lambda = \mathrm{lcm}(p-1, q-1)$. We assume that the plaintext $m \in \mathbb{Z}_n$ is the numeric form of a feature value in $x_i$ ($i = 1, \ldots, N$); its ciphertext is denoted by $[[m]]$, i.e., $[[m]] \in \mathbb{Z}^*_{n^2}$. The Paillier cryptosystem's encryption, decryption, and additively homomorphic operations are as follows:

(1) Encryption: $[[m]] = g^m \cdot r^n \bmod n^2$.

(2) Decryption: $m = \dfrac{L([[m]]^{\lambda} \bmod n^2)}{L(g^{\lambda} \bmod n^2)} \bmod n$.

(3) Additive homomorphism: $[[m_1 + m_2]] = [[m_1]] \cdot [[m_2]] \bmod n^2$ and $[[\alpha m_1]] = [[m_1]]^{\alpha} \bmod n^2$.

Here, $m_1$ and $m_2$ are two numeric features in PD and $\alpha$ is an integer constant. In practical situations, if $\alpha < 0$, its equivalence class value $\alpha \bmod n$ can be substituted.
Before outsourcing data, in this study, the client should distribute his public key (i.e., (n, g)) of the predefined Paillier cryptosystem to the server while keeping his private key (i.e., (p, q, r) or (λ, r)) secret. Then, the server performs a series of clustering tasks on ciphertexts by exploiting the homomorphic properties. Only the client can decrypt all the encrypted messages that encapsulate clustering results by using his corresponding private key.
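The key distribution and homomorphic operations described above can be illustrated with a toy Paillier implementation. This is only a sketch under stated assumptions: the primes are tiny and insecure, `g = n + 1` is a common simplification rather than a randomly selected generator, and all function names are hypothetical.

```python
import math
import random

def keygen(p, q):
    """Build a toy key pair from two small primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1  # assumption: a common simplified choice of g
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)  # inverse of L(g^lam mod n^2)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """[[m]] = g^m * r^n mod n^2 with fresh randomness r."""
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(g, m % n, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(pub, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

def add(pub, c1, c2):
    """Ciphertext of m1 + m2, computed without the private key."""
    n, _ = pub
    return c1 * c2 % (n * n)

def scalar_mul(pub, c, alpha):
    """Ciphertext of alpha * m; a negative alpha is replaced by alpha mod n."""
    n, _ = pub
    return pow(c, alpha % n, n * n)

pub, priv = keygen(293, 433)
c1, c2 = encrypt(pub, 17), encrypt(pub, 25)
```

In this sketch the server would only ever call `add` and `scalar_mul`, which need the public key alone, while `decrypt` stays with the client.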

Privacy Violation of SVC
Undoubtedly, to conduct the essential computations of SVC correctly in the Cloud, we have to face the possibility of privacy leakage in either the training phase or the labeling phase, or both. Otherwise, there is no feasibility.
For the training phase, the core work is to solve the dual problem in Equation (2). Once transferred to the Cloud in PD, either the data samples or the kernel matrix can be retrieved by the server. Under a privacy-preserving policy, this situation raises several critical problems. (1) Letting the server know the data samples as plaintext is unacceptable. (2) With only separately encrypted data samples, the server cannot generate the required kernel matrix by itself, since calculating the Gaussian function needs help from the client, as discussed by Rahulamathavan et al. [17]. More than $N^2$ interactions between the server and the client are then required before the server can construct an $N \times N$ kernel matrix. Besides, the widely used SMO solver for the dual problem needs approximately $O(N^2)$ kernel evaluations to construct the support function [8]. Taking the number of iterations into account, the SMO solver would also need far more than $N^2$ interactions. Therefore, it is inadvisable to encrypt data samples separately. (3) To make the solver iterate effectively, the intermediate results of the coefficient vector should be in PD. Unfortunately, this would leak the final indices of the SVs, together with their importance, to the server. (4) Although no plain data samples appear in it, the constructed kernel matrix is not suitable to be sent to the server directly. If the server knows any data sample, it can recover all the data samples, since the kernel function is exactly a type of similarity measure [19].
For the labeling phase, labeling the remaining data samples, usually based on the Euclidean distances in data space, is not a time-consuming work if the cluster prototypes have been obtained. Samanthula et al. [20] presented a good idea for it, even though we have to do it in an outsourced environment. However, in the most recent studies, prototype finding and sampling calculation for connectivity analysis generally are based on iterative analysis, which raises frequent interactions to get help from the client because exponent arithmetic is the fundamental operation, and SVs are the component of the exponent function (i.e., support function). Otherwise, privacy cannot be guaranteed. It is worth mentioning that Lin and Chen [21] presented a method of releasing the training SVM classifier without violating the confidentiality of classification parameters, e.g., SVs. Although replacing the trained SVM classifier by the support function appears to be a good solution, unfortunately, we still have to send the plain data samples for decisions made by the outsourced service in the server. Undoubtedly, this results in a privacy violation.

Limited Operations of Homomorphic Encryption
As described in Section 2.2, additively homomorphic encryption is characterized by its additive homomorphism. However, this property only supports the addition of two ciphertexts and the multiplication of a ciphertext by a plaintext. Therefore, without outside assistance, it cannot complete multiplication or division of two encrypted values, nor complex functions such as the Gaussian function. Unfortunately, multiplication and division are frequently invoked by both the SMO solver and the iterative analysis for cluster prototypes. Furthermore, calculating values of the Gaussian function for different sample pairs is a fundamental operation in the SMO solver, the iterative analysis for cluster prototypes, and the sampling for connectivity analysis. Since the client is the only legitimate outside assistance provider, these operations dramatically increase the number of interactions, which places a heavy load on network bandwidth and degrades the usability of SVC outsourcing. Therefore, it is critical to explore practical ways of avoiding too many complex computations and massive interactions.

Reformative Support Vector Clustering with Elementary Operations
Traditionally, iterative analysis and complex operations cause massive interactions that reduce the usability of outsourcing. Towards balancing privacy, efficiency, and accuracy, we design the DCD solver by reducing unnecessary operations and replacing complex functions by elementary ones, and we also design the FCDCL to avoid iterative analysis in convex decomposition and connectivity analysis.

DCD Solver for SVC's Dual Problem
Derived from [22], we reformulate the nonlinear dual problem in Equation (2) by a linear model. Let $Q$ be the original kernel matrix with element

Then, the dual problem in Equation (2) can be reformulated by

Let $D_{svc}$ denote $\{\beta \mid \sum_j \beta_j = 1,\ 0 \le \beta_j \le C,\ j = 1, \ldots, N\}$. Obviously, we can relax $D_{svc}$ to $\{\beta \mid 0 \le \beta_j \le C,\ j = 1, \ldots, N\}$, in which an equivalent globally optimal solution can be achieved. We thus get an equivalent form of the problem in Equation (2) as

Without any prior knowledge of labels, we fix the label $y_i$ of $x_i$ to $+1$ or $-1$ for $i = 1, \ldots, N$. To solve the problem in Equation (5), the optimization process starts from an initial point $\beta^0 \in \mathbb{R}^N$ and generates a sequence of vectors $\{\beta^k\}_{k=1}^{\infty}$. We refer to the process from $\beta^k$ to $\beta^{k+1}$ as an outer iteration. In each outer iteration, we have $N$ inner iterations, so that $\beta_1, \beta_2, \ldots, \beta_N$ are updated sequentially. Each outer iteration thus generates vectors $\beta^{k,i} \in \mathbb{R}^N$, $i = 1, 2, \ldots, N+1$, such that $\beta^{k,1} = \beta^k$,

To update $\beta^{k,i}$ to $\beta^{k,i+1}$, we fix the other variables and then solve the following one-variable sub-problem:

where $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]^T$. Then, the objective function of Equation (6) is a simple quadratic function of $\theta$:

where $\nabla_i f$ is the $i$th component of the gradient $\nabla f$. Therefore, based on a decision function definition of $y = w^T\Phi(x) + b$ for SVC, we can solve the problem in Equation (7) by introducing Equations (8)-(9).

Along with the update of $\beta_i$, we can maintain $w$ by

where $\bar{\beta}_i$ is the temporary coefficient value obtained in the previous iteration. Thus, we have

Therefore, similar to the DCD method in [22], we propose a DCD solver detailed by Algorithm 1 for solving the problem in Equation (2). Briefly, we use Equation (11) to compute $\nabla_i f(\beta^{k,i})$, check the optimality of the single-variable optimization in Equation (6) by $\nabla_i^P f(\beta^{k,i}) \overset{?}{>} 1 \times 10^{-12}$, and update $\beta_i$. The cost per iteration from $\beta^k$ to $\beta^{k+1}$ is $O(Nd)$, and an appropriate stopping tolerance controls the number of iterations well for efficiency. Notice that the memory requirement is flexible. With sufficient memory, we can afford $O(N^2)$ for the full kernel matrix and look up entries on demand to finish the calculation of Line 5 in Algorithm 1; otherwise, we can either store $X$ or split it into blocks (requirement reduced to $\le O(N)$) and then calculate the required kernel values sequentially for later use in the outer iterations.
Require: Dataset X, kernel width q, and penalty C
Ensure: Coefficient vector β
1. Randomly initialize the coefficient vector β
2. while true do
3. …
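To make the outer and inner iterations concrete, the following is a minimal coordinate descent sketch over the relaxed box constraint $0 \le \beta_j \le C$. It is not a transcription of Algorithm 1: the objective $f(\beta) = \beta^T Q \beta - \sum_j \beta_j$ with $Q_{ij} = K(x_i, x_j)$, the zero initialization (chosen for reproducibility instead of a random one), the precomputed kernel matrix, and the names `dcd_solver`, `tol`, and `max_outer` are all assumptions made for illustration.

```python
import math

def gaussian_kernel(a, b, q):
    """Gaussian kernel K(a, b) = exp(-q * ||a - b||^2)."""
    return math.exp(-q * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def dcd_solver(X, q, C, tol=1e-12, max_outer=1000):
    """Coordinate descent on the assumed objective beta^T Q beta - sum(beta), 0 <= beta_j <= C."""
    n = len(X)
    # Assumption: enough memory to precompute the full kernel matrix.
    Q = [[gaussian_kernel(X[i], X[j], q) for j in range(n)] for i in range(n)]
    beta = [0.0] * n
    for _ in range(max_outer):              # outer iterations
        max_pg = 0.0
        for i in range(n):                  # inner iterations, one coordinate each
            G = 2.0 * sum(Q[i][j] * beta[j] for j in range(n)) - 1.0
            # Projected gradient: ignore directions that would leave the box.
            if beta[i] <= 0.0:
                pg = min(G, 0.0)
            elif beta[i] >= C:
                pg = max(G, 0.0)
            else:
                pg = G
            max_pg = max(max_pg, abs(pg))
            if abs(pg) > tol:
                # Exact minimizer of the one-variable quadratic, clipped to [0, C].
                beta[i] = min(max(beta[i] - G / (2.0 * Q[i][i]), 0.0), C)
        if max_pg <= tol:
            break
    return beta

X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
beta = dcd_solver(X, q=1.0, C=1.0, tol=1e-10)
```

Each inner step touches one row of the kernel matrix and uses only additions and multiplications, which is the property that matters for the later outsourcing; computing that row on demand instead of storing `Q` trades time for space as discussed above.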

Convex Decomposition without Iterative Analysis
Derived from [22], in the labeling phase, we take the convex hull as the cluster prototype for both efficiency and accuracy. However, it usually requires many iterations to get a stable equilibrium vector (SEV) from an SV. If we let the server do this, the client has to evaluate the exponential function by itself. Combined with the uncertain number of SVs $N_{SV}$, more than $N_{SV}$ rounds of interactions bring a heavy burden to the network bandwidth. Therefore, we prefer achieving the same objective without iterative analysis.

Theorem 1. If a cluster is decomposed into multiple convex hulls, the best division positions should be those SVs whose locations link up two nearest neighboring convexity and concavity.
Proof. From the convex decomposition strategy [23], theoretically, the decomposed convex hulls are constructed by SVs without exceptions. Taking Figure 2a as an example, $S_1$ and $S_2$, respectively, denote the two convex hulls in a cluster and the corresponding SEVs. Obviously, the cluster is outlined by the SVs $\{x_{11}, \ldots, x_{17}; x_{21}, \ldots, x_{25}\}$. $\{x_{11}, x_{17}\}$ and $\{x_{21}, x_{25}\}$ constitute the division position (termed the division set) for linking up two nearest neighboring convexity and concavity. To confirm this, we consider the two situations: (1) If the division position is not in the SVs subset in which each one links up two nearest neighboring convexity and concavity, for instance, $x_{21}$ is replaced by $x_{22}$, then the corresponding convex hull $S_2$ has to be further split into two convex hulls enclosed by $\{x_{21}, x'_{22}, x_{25}\}$ and $\{x_{22}, x_{23}, x_{24}\}$, respectively, even though all these SVs converge to $S_2$. Here, $x'_{22}$ is an imaginary sample that is extremely similar to $x_{22}$. Because no overlapping region or intersection of vertices of convex hulls is allowed, this result conflicts with the definition that SVs in a convex hull will converge to the same SEV.

Using SVs to construct a convex hull, massive iterations in the convergence increase the communication burden significantly. We first consider the characteristics of convex decomposition: (1) SVs are located on the cluster boundary; (2) the convergence directions for SVs in a convex hull point toward the same SEV inside; and (3) after connecting two nearest neighboring SVs, no concavity is in a convex hull. Then, the core idea behind our method is quite simple and intuitive: each SV is not only a vertex of a convex hull but also an edge pattern of a cluster. In a convex hull, if an SV's tangent plane crossing the SV is perpendicular to its convergence direction, the other SVs will sit on one side of the tangent plane.

Therefore, to avoid forming concavity around the SV, the included angle between its convergence direction and the line segment connecting it to its nearest neighboring SV in the same convex hull has a relatively small value.

As shown in Figure 2a, two tangent planes crossing $x_{15}$ and $x_{21}$ can be found, respectively. Blue arrows show their convergence directions. Apparently, $\angle S_1 x_{15} x_{16}$ and $\angle S_2 x_{21} x_{25}$ must be small enough to prevent $x_{16}$ and $x_{22}$, respectively, from entering the other side of the corresponding tangent planes. Based on this consideration, we do non-iterative convex decomposition in Algorithm 2. Obviously, calculating the convergence direction in Line 4 is the key.
With β and SVs obtained by Algorithm 1, we can use any optimization method to find a local minimizer of Equation (3). Different methods lead to different efficiency, i.e., the number of iterations.
Fortunately, in this study, an approximate convergence direction is sufficient. For the sake of simplicity, the gradient of Equation (3) is preferred and calculated by

$$\nabla f(x) = 4q\sum_{j}\beta_j K(x_j, x)(x - x_j), \tag{12}$$

where each SV $x_j$ contributes to the final convergence direction. The convergence direction $\Delta x$ in the first step for $x$ is the negative gradient of $f(x)$, i.e.,

$$\Delta x = -\gamma \nabla f(x), \tag{13}$$

where $\gamma$ is a constant factor. Now, we can calculate the cosine function on Line 5 of Algorithm 2 by

Algorithm 2 Non-iterative convex decomposition.
Require: SVs set $X_S$, decomposition threshold $\eta_1$
Ensure: Convex hulls $S_{CH}$, set of triples $S_{Tri}$
1. Randomly select …
3. $x_j$ ← find the nearest neighboring SV from $X_S \setminus X_U$
4. …
12. $x_i$ ← $x_j$
13. end while

Notice that, on Line 9, we collect into $S_{Tri}$ the point pairs that are nearest neighbors but belong to different convex hulls, together with the convergence direction. $S_{Tri}$ will be employed in the following connectivity analysis for directly fetching the nearest neighboring convex hulls.
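The one-step convergence direction and the angle test used by the decomposition can be sketched as follows. The gradient follows from differentiating the Gaussian-kernel support function; computing the cosine through a dot product is an assumption made here for illustration and may differ from the exact form of Equation (14). The function names and the default `gamma` are hypothetical.

```python
import math

def gaussian_kernel(a, b, q):
    """Gaussian kernel K(a, b) = exp(-q * ||a - b||^2)."""
    return math.exp(-q * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def support_gradient(x, svs, betas, q):
    """Gradient of the support function: 4q * sum_j beta_j K(x_j, x) (x - x_j)."""
    grad = [0.0] * len(x)
    for s, b in zip(svs, betas):
        weight = 4.0 * q * b * gaussian_kernel(s, x, q)
        for k in range(len(x)):
            grad[k] += weight * (x[k] - s[k])
    return grad

def convergence_direction(x, svs, betas, q, gamma=1.0):
    """First-step direction: the negative gradient scaled by a constant factor gamma."""
    return [-gamma * g for g in support_gradient(x, svs, betas, q)]

def cosine(u, v):
    """Cosine of the included angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

svs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
betas = [1.0 / 3.0] * 3
x_i, x_j = svs[1], svs[0]                    # x_j is the nearest neighboring SV of x_i
direction = convergence_direction(x_i, svs, betas, q=1.0)
segment = [b - a for a, b in zip(x_i, x_j)]  # line segment from x_i to x_j
cos_angle = cosine(direction, segment)
```

A value of `cos_angle` close to 1 means the neighbor lies roughly along the convergence direction, so a decomposition threshold such as $\eta_1$ could be compared against it; the exact decision rule belongs to Algorithm 2 and is not reproduced here.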

Connectivity Analysis Irrespective of Sampling
Based on the decomposed convex hulls, we try to check the connectivity of two nearest neighbors sequentially.

Lemma 1.
For two nearest neighboring convex hulls whose distance is determined by their two closest vertices, one from each hull, there are two apparent properties. (1) We can always find a tangent plane crossing one of the two vertices such that the two convex hulls are located on different sides of the tangent plane. (2) Taking one of the two vertices as the origin, we draw the included angle between the rays from the origin to its own SEV and to the other convex hull's SEV, respectively. The smaller the included angle is, the higher the possibility that the two convex hulls are connected.
Proof. Taking Figure 2b as an example, x 11 and x 21 are the nearest SV-pair in a division set thus far from two nearest convex hulls S 1 and S 2 , respectively.
According to the definition, there is no overlapping region between any two convex hulls. Apparently, the first property is always true. As a typical prototype [24], geometrically, the SEV is located in the convex hull and reflects the relative location well. Thus, we have $\angle S_2 x_{21} S'_1 < \angle S_2 x_{21} S_1$ if the convex hull $S_1$ is moved to $S'_1$. This movement uses $x_{21}x_{11}$ as the radius to keep the distance unchanged. It generates a relative displacement for the SEV along the direction of the black arrow, and the moved one gets closer to the convex hull $S_2$. Considering a plane shaped by line $S_1 x_{21} x_{22}$, the vertex $x_{21}$ will have a lower possibility of being a transition point connecting two nearest neighboring convexity and concavity. That means a higher probability of $S_1$ and $S_2$ being enclosed in one cluster as the distance reduces. On the contrary, if we increase $\angle S_2 x_{21} S_1$, this is actually achieved by moving the convex hull $S_1$ far from the side of $S_2$ with respect to the tangent plane of $x_{21}$. Then, on the plane shaped by line $x_{11} x_{21} x_{22}$, $x_{21}$ must be included in the division set again. Thus, the second property is true.
Definition 1 (Merging Factor). Let $x_{11}$ and $x_{21}$ be the two nearest vertices from two nearest neighboring convex hulls $S_1$ and $S_2$, respectively. In the connectivity analysis of $S_1$ and $S_2$, the merging factor is defined by the cosine of the included angle between the convergence direction from one vertex and its ray direction to the other convex hull's SEV, i.e., $\cos\angle(\Delta x_{21}, \overrightarrow{x_{21}S_1})$ or $\cos\angle(\Delta x_{11}, \overrightarrow{x_{11}S_2})$, where $\Delta x$ denotes the convergence direction of $x$.
However, Algorithm 2 cannot give us the exact SEV. Based on the local geometrical property discussed by Ping et al. [25], we define a density centroid as the substitution.

Definition 2 (Density Centroid). The density centroid S DC i of the ith convex hull S CH i is defined by
On the basis of Lemma 1, the presented connectivity analysis strategy is quite simple: (1) Without prior knowledge of the cluster number, we merge two nearest neighboring convex hulls if their minimal merging factor is greater than a predefined threshold η 2 . (2) Otherwise, to control the number of clusters for a specific K globally, pairs of nearest neighboring convex hulls are merged into one cluster as they move up in the hierarchy.
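The threshold-based strategy can be sketched with a small union-find structure. This is a hedged illustration only: the layout of each entry in `pairs`, the function names, and the sample numbers are hypothetical, and the centroids passed in stand for the density centroids of Definition 2, whose computation is not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine of the included angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def merging_factor(vertex, direction, other_centroid):
    """Cosine between a vertex's convergence direction and its ray to the other hull's centroid."""
    ray = [c - v for c, v in zip(other_centroid, vertex)]
    return cosine(direction, ray)

def merge_hulls(n_hulls, pairs, centroids, eta2):
    """Merge nearest neighboring hulls whose minimal merging factor exceeds eta2.

    Each entry of pairs is (hull_a, vertex_a, dir_a, hull_b, vertex_b, dir_b),
    a hypothetical layout for the point pairs and directions kept in S_Tri.
    """
    parent = list(range(n_hulls))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, va, da, b, vb, db in pairs:
        factor = min(merging_factor(va, da, centroids[b]),
                     merging_factor(vb, db, centroids[a]))
        if factor > eta2:
            parent[find(a)] = find(b)
    return [find(i) for i in range(n_hulls)]

centroids = [(0.5, -1.0), (1.5, -1.0), (8.0, 0.0)]
pairs = [
    (0, (1.0, 1.0), (0.0, -1.0), 1, (1.2, 1.0), (0.0, -1.0)),  # directions agree with the rays
    (1, (3.0, 0.0), (-1.0, 0.0), 2, (6.0, 0.0), (1.0, 0.0)),   # directions point apart
]
labels = merge_hulls(3, pairs, centroids, eta2=0.5)
```

Because hulls are merged bottom-up from the collected pairs, no adjacency matrix over all samples is needed in this sketch, and each pair costs one comparison against $\eta_2$.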

Implementation of RSVC-EO
Algorithm 3 gives a complete solution of RSVC-EO. Given q and C, CollectSVsbyDCDSolver(·) obtains β by invoking Algorithm 1 and collects the SVs $X_S$. Then, ConvexDecomposedbySVs(·) constructs the convex hulls $S_{CH}$ as the cluster prototypes by employing the non-iterative analysis strategy presented in Section 3.2. Using the prior knowledge of $\eta_2$ or K, ConnAnalysisbyNoSamp(·) checks the connectivity of two nearest neighboring convex hulls and merges them step by step on demand. It results in an array of size $N_{SV}$ which contains the cluster labels for the SVs. Notice that the adjacency matrix frequently used by the traditional methods is no longer required because ConnAnalysisbyNoSamp(·) adopts a bottom-up merging strategy. Finally, the remaining data samples are separately assigned the labels of their nearest convex hulls.

Algorithm 3 RSVC-EO.
Require: Dataset X, kernel width q, penalty C, and thresholds $\eta_1$, $\eta_2$ or the cluster number K
Ensure: Clustering labels for all the data samples
1. …
4. for each $x \in X \setminus X_S$ do
5. inx ← find the nearest convex hull from x
6. …

Time Complexity of RSVC-EO
Before introducing a privacy-preserving mechanism, RSVC-EO works locally in PD. Hence, we measure only the computational complexity in this section. Let $N$ be the number of data samples, $N_{SV}$ be the number of SVs, and $N_{CH}$ be the number of the decomposed convex hulls. In the training phase, whether to use the pre-computed kernel matrix for efficiency at the cost of storage or to calculate the corresponding row of kernel values on demand depends on the actual memory capacity. With sufficient memory, i.e., $O(N^2)$ space to store the kernel matrix, the computational complexity is $O(N^2)$, and the innermost operation is to compute $\beta_j K(\cdot, \cdot)$. Otherwise, it is up to $O(dN^2)$, whose innermost operation calculates the kernel value of two d-dimensional samples. Although this seems time-consuming, it is much lower than the $O(N^3)$ required by the traditional methods, which frequently need $O(N^2)$ storage (see [4]). Further, the innermost operations for both situations are simple.
In the labeling phase, time costs for constructing the convex hulls by SVs and completing connectivity analysis depend on N SV and N CH . By employing the proposed strategy, iterations to reach the local minimum from SVs are avoided, and the sample rate is replaced by one comparison. Therefore, to finish the labeling phase, RSVC-EO only consumes O( N 2 SV + ρN CH ) where = {1, d} and ρ ∈ (2, 3] are separately determined by whether to store the kernel matrix of SVs and input parameter η 2 or K.
Due to page limitations, we omit the comparisons with the state-of-the-art methods, which can be found in [22,26].

Maximized Privacy-Preserving Outsourcing on SVC
Taking RSVC-EO as the core method, we develop MPPSVC to maximize the capability of selective service outsourcing.

Privacy-Preserving Primitives
For privacy-preserving, some elementary operations in Algorithm 3 have to be done with the client's help, e.g., multiplication, distance measure, and comparison. We present lightweight secure protocols in our proposed MPPSVC. All the below protocols are considered under a two-party semi-honest setting, where the server is a semi-honest party, and the client is an honest data owner. Meanwhile, a potential adversary might monitor communications. In particular, we assume the Paillier's private key K pri is known only to the client, whereas K pub is public.

Secure Multiplication
Secure multiplication protocol (SMP) considers that the server, with the client's help, computes $[[m_1 m_2]]$ from the inputs $[[m_1]]$ and $[[m_2]]$. The values $m_1$ and $m_2$ are unknown to the server in the adversarial environment. In this study, we adopt the SMP described by Samanthula et al. [20].
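A blinding-based secure multiplication in the style commonly attributed to Samanthula et al. can be sketched as follows. This is an illustration under stated assumptions, not the protocol text of [20]: the Paillier parameters are toy values, both roles run in one process, and the function names are hypothetical.

```python
import math
import random

# Toy Paillier parameters (insecure sizes, for illustration only).
PRIME_P, PRIME_Q = 293, 433
N = PRIME_P * PRIME_Q
N2 = N * N
G = N + 1  # assumption: simplified generator choice
LAM = (PRIME_P - 1) * (PRIME_Q - 1) // math.gcd(PRIME_P - 1, PRIME_Q - 1)
MU = pow((pow(G, LAM, N2) - 1) // N, -1, N)

def enc(m):
    """Public-key encryption, usable by both server and client."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return pow(G, m % N, N2) * pow(r, N, N2) % N2

def dec(c):
    """Decryption with the private key, available to the client only."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def client_multiply(c_a_blind, c_b_blind):
    """Client side: decrypt the blinded values, multiply them, re-encrypt the product."""
    return enc(dec(c_a_blind) * dec(c_b_blind) % N)

def server_secure_multiply(c_a, c_b):
    """Server side: obtain [[a*b]] from [[a]] and [[b]] with one round of interaction."""
    ra, rb = random.randrange(1, N), random.randrange(1, N)
    a_blind = c_a * enc(ra) % N2           # [[a + ra]]
    b_blind = c_b * enc(rb) % N2           # [[b + rb]]
    c_h = client_multiply(a_blind, b_blind)  # [[(a + ra)(b + rb)]]
    # Remove the blinding terms a*rb, b*ra and ra*rb homomorphically.
    s = c_h * pow(c_a, N - rb, N2) % N2
    s = s * pow(c_b, N - ra, N2) % N2
    s = s * pow(enc(ra * rb % N), N - 1, N2) % N2
    return s

product_cipher = server_secure_multiply(enc(6), enc(7))
```

The client only sees the blinded values $a + r_a$ and $b + r_b$, and the server only sees ciphertexts, which is why each multiplication costs one round of interaction in this kind of protocol.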

Secure Comparison
Numeric comparison is critical for the server to finish the connectivity analysis and the labeling phase in ED. Notice that the intermediate comparison result must be kept private from the adversary (an intermediary in the network), but it is not a secret for the server, which needs it for procedure control. Therefore, based on the secure comparison protocols (SCPs) of Rahulamathavan et al. [17] and Samanthula et al. [20], we customize an SCP in Algorithm 4, which needs only one round of interaction.
Since r is picked randomly, the ciphertext sent back to the client can be either

Secure Vector Distance Measurement
Euclidean distance measurement between any two vectors is essential for extracting the convergence direction in Equation (12), computing the cosine function in Equation (14), and labeling the remaining data samples (Line 6 in Algorithm 3). For simplicity, in this study, the secure vector distance measure protocol (SVDMP) adopts the secure squared Euclidean distance of Samanthula et al. [20], which employs SMP for each dimension. It can also be done in batch if necessary.

Algorithm 4 Secure comparison protocol.
Require: Server: …

In ED, phases such as Lines 2 and 6 of Algorithm 3 should be done with the help of a secure 1-NN query (S1NNQ). Given a data sample [[x]], the server finds its nearest neighbor in the dataset [[X]] with |X| samples, as described by Algorithm 5.

…
end if
7. end for

The Proposed MPPSVC Model
In this section, we first reformulate the crucial phases based on the privacy-preserving primitives. Then, we present the flow diagram for MPPSVC.

Preventing Data Recovery from Kernel Matrix
In this study, we adopt the Paillier cryptosystem for both efficiency and security. Unfortunately, on Line 5 of Algorithm 1, $\sum_{j=1}^{N} \beta_j K(x_i, x_j)$ introduces chained multiplications in ED. If we use SMP, the massive interactions are fatal due to the privacy concerns of $[[\beta_j]]$ and $[[K(x_i, x_j)]]$. One may consider accepting $K(x_i, x_j)$ in PD, which means that the client only sends the values of $K(x_i, x_j)$ for $i, j \in [1, N]$. However, these values carry the similarities between all the data sample pairs. The server or an adversary can easily recover the whole dataset if it happens to collect any two data samples in plaintext [19].
Since the kernel matrix is sufficient for Equation (5), we design a transformation strategy to hide the actual similarity in $K(\cdot, \cdot)$ while protecting $\beta$. Let $M$ be an $N \times N$ orthogonal matrix satisfying $M^{-1} = M^T$, which is kept secret by the client. Let $\hat{Q} = M^T Q M$, where $Q$ denotes the kernel matrix in Equation (5), and let $\beta = M\hat{\beta}$; the problem in Equation (5) is then equivalent to a problem, Equation (16), that has almost the same formulation as Equation (5) with $\hat{Q}$ and $\hat{\beta}$ in place of $Q$ and $\beta$. Thus, we can find a fact for Algorithm 1: if the client sends the transformed $\hat{Q}$ to replace $Q$, we obtain a corresponding $\hat{\beta}$ which is different from $\beta$, i.e., $\hat{\beta} \ne \beta$. In plaintext, the client can obtain the expected $\beta$ by multiplying by $M$, since $\beta = M\hat{\beta}$. Furthermore, the server cannot recover $Q$ since $M$ is only held by the client. For further computation and communication savings, we suggest decomposing $Q$ to generate $M$ through $Q = M\Sigma_{eig}M^T$, where $\Sigma_{eig}$ is the diagonal matrix of the eigenvalues of $Q$ and $M$ is the right unitary matrix satisfying $M^{-1} = M^T$. We then get a diagonal matrix $\hat{Q} = M^T Q M$ with $N$ non-zero elements. If $N$ is too large, outsourcing either the QR decomposition [27] or eigen-decomposition [28] is recommended. Theoretically, an adversary cannot recover $Q$ from the diagonal matrix $\hat{Q}$.
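The effect of the secret orthogonal matrix can be checked numerically: the quadratic form of the transformed problem equals that of the original one, and only the holder of $M$ can map the returned coefficients back. The sketch below uses a Householder reflection as a stand-in orthogonal matrix, which is an assumption made for brevity; unlike the eigenvector matrix suggested above, it does not make the transformed matrix diagonal. All function names are hypothetical.

```python
import math

def householder(v):
    """Return the orthogonal (and symmetric) matrix I - 2 v v^T / (v^T v)."""
    n = len(v)
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
             for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def quad_form(Q, b):
    """Evaluate b^T Q b."""
    return sum(bi * qb for bi, qb in zip(b, matvec(Q, b)))

def transform_kernel(Q, M):
    """Client side: hide the kernel matrix as M^T Q M before sending it out."""
    return matmul(matmul(transpose(M), Q), M)

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
Q = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, r))) for r in points] for p in points]
M = householder([1.0, 2.0, 3.0])   # secret key held by the client (illustrative)
Q_hat = transform_kernel(Q, M)     # what the server would see
beta_hat = [0.2, 0.5, 0.3]         # stands for coefficients returned by the server
beta = matvec(M, beta_hat)         # client recovers beta = M * beta_hat
```

Since $\beta^T Q \beta = \hat{\beta}^T \hat{Q} \hat{\beta}$ for any orthogonal $M$, a solver working on `Q_hat` optimizes the same quadratic term without seeing the true similarities.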

Privacy-Preserving DCD Solver
Generally, carrying the similarities amongst $N$ data samples needs an $N \times N$ matrix. By employing the transformation strategy in Section 4.2.1, we hide the similarities by splitting them into the private key $M$ of the client and the diagonal matrix $\hat{Q}$. From another perspective, as presented in Equation (16), this strategy is equivalent to protecting the sensitive coefficients by encrypting them with the private key $M$. If we use the transformed $\hat{Q}$ as input and multiply the returned $\hat{\beta}$ by $M$, Line 5 of Algorithm 1 should be replaced by $\hat{G} = 2 \times \sum_{j=1}^{N} \hat{\beta}_j \hat{Q}_{jj}$, and the other steps remain unchanged.

Privacy-Preserving Convex Decomposition
Since SVs are even more sensitive than the other data samples, for Algorithm 2, the client shows the server the encrypted SVs. Due to the exponential function in $K(\cdot, \cdot)$, however, calculating the convergence direction for each data sample by Equation (12) in ED is beyond the server's ability. A practical choice is querying the result from the client, while all the SVs should be considered. Depending on the available storage, the client can either store the kernel matrix of SVs for solving Equation (12) by matrix manipulation or calculate $4q\beta_j K(x_j, x)$ on demand in each loop body. Thus, for privacy-preserving convex decomposition, we can reformulate Algorithm 2 by replacing Line 3 with Algorithm 5 and Line 4 with a query to the client. Furthermore, the cosine function on Line 4 can be easily implemented by utilizing SMP and SVDMP.

Privacy-Preserving Connectivity Analysis
In this study, connectivity analysis between two nearest convex hulls indicated by the collected set S_Tri can follow either of two strategies. In ED, we present this privacy-preserving connectivity analysis with cluster number K and with threshold η_2 in Algorithms 6 and 7, respectively.
If we want the server to supply a service of labeling the remaining samples or determining the new arrival samples for the others, S1NNQ is recommended. Either the SVs or the density centroids can be used on behalf of the convex hulls to meet the accuracy or efficiency requirements.
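The labeling service described above amounts to a one-nearest-neighbor query against a set of labeled prototypes. The sketch below is only a plain-domain analogue under stated assumptions: it uses squared Euclidean distance, the function name and the prototype format are hypothetical, and none of the encryption or protocol steps of S1NNQ are modeled.

```python
def nearest_label(sample, prototypes):
    """Return the cluster label of the prototype closest to `sample`.

    `prototypes` is a list of (vector, label) pairs; the vectors may be
    SVs or density centroids standing in for the convex hulls.
    """
    def sq_dist(a, b):
        # Squared Euclidean distance (an assumed metric for this sketch).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(prototypes, key=lambda p: sq_dist(sample, p[0]))
    return label

# Two hypothetical prototypes, one per cluster.
protos = [((0.0, 0.0), 0), ((5.0, 5.0), 1)]
print(nearest_label((1.0, 1.0), protos))
print(nearest_label((4.0, 6.0), protos))
```

Using density centroids gives one prototype per convex hull and hence fewer distance evaluations, while using all SVs gives a finer description of each hull at a higher cost.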

Work Mode of MPPSVC
Figure 3 gives the flow diagram of MPPSVC, which is obtained by introducing secure outsourcing protocols into its core method, RSVC-EO. Arrows on the client side show the communications between the client and the server for successive cluster analysis in ED, while arrows on the server side illustrate all the information accessible to the server in each step. All the crucial information is encrypted. For instance, [[β]] is protected by M, and the other items in the form of [[·]] are encrypted by K_pub. For clarity, we separate the communications according to the requirements of each step. To maximize outsourcing capability appropriately (on demand), MPPSVC allows one to outsource any step or steps without losing control of the sensitive data.

Figure 3. Flow diagram of MPPSVC (legend: K_pub, public key; Step 1, DCD solver in ED for the coefficients indicating SVs (Alg. 1); Step 2, convex decomposition in one iteration using SVs).

Time Complexity of MPPSVC
We take the classical SVC [1] as the baseline and separately measure the time complexity of each step in RSVC-EO and MPPSVC, for local use by the data owner and for the client-server environment, respectively. Let N be the number of d-dimensional data samples, N_SV be the number of SVs, N_CH be the number of obtained convex hulls, and m be the average sample rate. Because distances are measured in ED, we introduce d for accurate comparisons. Table 1 lists the computational complexities.
Step 2 is not in the classical SVC. The most recent convex decomposition based method [22] takes O(ζN_SV), and the SEV-based methods [12,29] require O(ζN), where ζ is the number of iterations. For RSVC-EO, the difference brought by the "Instance K(·, ·)" is calculating K(x_j, x) with d-dimensional input before getting the gradient in Equation (12). For MPPSVC, the client's complexity is O(N_SV^2) when the kernel matrix of SVs is used and O(dN_SV^2) otherwise. Meanwhile, the server runs a polling program in O(dN_SV^2). Notice that the essential tasks of Algorithm 2 for the client in ED have been cut down, even though their time complexity is similar to that in PD.
(3) In Step 3, connectivity analysis of the classical SVC performs m sampling checks for each pair of SVs. Because calculating Equation (3) requires N_SV SVs, the whole consumption is up to O(mN_SV^3). In contrast, RSVC-EO only requires O(ρN_CH) due to the direct use of the convergence directions, where ρ ∈ (2, 3] because Algorithms 6 and 7 are provided for choice. Similarly, the client responds to SMP, SVDMP, and SCP in Algorithm 6, or to SCP in Algorithm 7, with O(N_CH).
(4) Step 4 is specific to the classical SVC, in which the previous step leaves an adjacency matrix. Its complexity is O(N_SV^2).
(5) Similar to classification, the traversal of the whole set in Step 5 cannot be avoided by any method. Its complexity is O(dNN_SV) or O(dNN_CH), except for the server in MPPSVC. Using S1NNQ, the major operations are carried out with the client under the SVDMP and SCP protocols.
Table 1. Time complexities of the proposed methods.


Security of MPPSVC under the Semi-Honest Model
In this section, we consider the execution of MPPSVC under the semi-honest model. Owing to the semantic security of the Paillier cryptosystem, messages in ciphertext exchanged in the client-server environment are securely protected. For each step in Figure 3, the analysis is presented as follows:
(1) According to Sections 4.2.1 and 4.2.2, the execution image of the server is given by Q̂ and C. Q̂ is a diagonal matrix protected by the matrix M secretly held by the client, and C is a single-use parameter without strict limitation. Notice that the mapping from Q̂ to the kernel matrix Q is one-to-many: no one can recover an N-dimensional vector from only one number. Without any plain item of Q, the server cannot infer any data sample even if it occasionally gets several data samples. When the server finishes Algorithm 1, the output [[β]] is naturally protected by M.
(2) For the second step, the major work is carried out on the client side as a response to the server, as marked in Figure 3.
(3) Step 3 depends on the output of Step 2. It includes SMP, SCP, and SVDMP, whose prototypes are proved in [20]. The server can only get the number of clusters, but cannot infer any relationship between data samples in PD. Furthermore, the client can easily hide the real number by tuning the predefined parameters K or η_2, so that an uncertain cluster number is meaningless to the server.
(4) Step 5 employs S1NNQ. The server cannot infer the actual label for a plain data sample, even if it occasionally has several samples.

Security of MPPSVC under the Malicious Model
We extend MPPSVC into a secure protocol under the malicious model, in which an adversary exists. The adversary may be the server or an eavesdropper. Since the eavesdropper cannot get more information than the server, for simplicity we only consider the server as the implicit adversary.
The server can arbitrarily deviate from the protocol to gain some advantage over the client (e.g., learning additional information about the inputs). Such deviations include, for example, instantiating MPPSVC with modified queries and aborting the protocol after gaining partial information. Considering SCP, the malicious server might either use a fixed r to obtain the ordering of encrypted numerical values or tamper with the two compared numerical values. For the former, the immediate order of N ciphertexts is meaningless for recovering their plaintexts, because Z_N ⊂ Z_{n^2} and N ≪ n^2 in bits. For the latter, without K_pri, any modification of a ciphertext might cause a significant change in its plaintext, which the client can easily discover. Therefore, all the intermediate results are either random or pseudo-random values. Even if an adversary modifies the intermediate computations, it cannot gain any additional information; the modification may only eventually result in a wrong output. Thus, if we ensure that all the calculations performed and messages sent by the client are correct, the proposed MPPSVC is secure, and it gives the client the ability to validate the server's work.

Experimental Setup
On the premise of secure outsourcing, we evaluated the performance of RSVC-EO in PD and of MPPSVC in ED. RSVC-EO determines the validity of the results, while MPPSVC supplies the secure outsourcing framework. Accordingly, we first evaluated the validity and performance of RSVC-EO, and then the performance of MPPSVC.
In PD, the first experiment estimated the sensitivity of accuracy to η_1. Since RSVC-EO is designed for local use, its declared advantage is the flexibility of using the pre-computed kernel matrix or not; the second experiment therefore checked the performance related to kernel utilization. The third series of benchmarks gave full comparisons of RSVC-EO with the state-of-the-art methods. Besides, we verified how effectively all the compared methods capture the data distribution in terms of the discovered cluster number. Finally, in ED, our primary focus was the changes in accuracy and efficiency brought by MPPSVC.
In this study, the adjusted rand index (ARI) [4,30] was adopted for accuracy evaluation. It is a widely used similarity measure between two data partitions where both true labels and predicted cluster labels are given. Let N_ij be the number of data samples with true label i that are assigned to cluster j, and let N_i· and N_·j be, respectively, the number of data samples with true label i and the number assigned to cluster j. In its standard form, ARI is formulated by
$$\mathrm{ARI} = \frac{\sum_{i,j}\binom{N_{ij}}{2} - \left[\sum_{i}\binom{N_{i\cdot}}{2}\sum_{j}\binom{N_{\cdot j}}{2}\right]\Big/\binom{N}{2}}{\frac{1}{2}\left[\sum_{i}\binom{N_{i\cdot}}{2} + \sum_{j}\binom{N_{\cdot j}}{2}\right] - \left[\sum_{i}\binom{N_{i\cdot}}{2}\sum_{j}\binom{N_{\cdot j}}{2}\right]\Big/\binom{N}{2}}$$
Table 2 shows the statistical information of the employed twelve datasets. Here, wisconsin, glass, wine, movement_libras, abalone, and shuttle (training version) were from the UCI repository [31]. Four text corpora were employed after a pre-processing step, namely DC_GLI-CCE by Ping et al. [32]: four categories of WebKB [33], the full twenty categories of 20Newsgroups [34], the 10 largest categories of Reuters-21578 [35], and Ohsumed with 23 classes [36]. P2P traffic is a collection of the features of 9206 flows that were extracted from traffic supplied by UNIBS [37] following the method of Peng et al. [38]. Following the work of Guo et al. [39], kddcup99 is a nine-dimensional dataset extracted from KDD Cup 1999 Data [40], which was used to build a network intrusion detector. Owing to space limitations, we hereafter use the abbreviations in brackets for datasets with long names.
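The ARI used for accuracy evaluation can be computed directly from the contingency counts N_ij and their marginals N_i· and N_·j. The following is a minimal sketch of the standard pair-counting form, not the evaluation code used in the experiments; the function name and the handling of the degenerate case are assumptions of this example, and at least two samples are assumed.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted rand index between two labelings of the same samples."""
    n = len(true_labels)
    n_ij = Counter(zip(true_labels, pred_labels))  # contingency counts N_ij
    n_i = Counter(true_labels)                     # row marginals N_i.
    n_j = Counter(pred_labels)                     # column marginals N_.j

    sum_ij = sum(comb(c, 2) for c in n_ij.values())
    sum_i = sum(comb(c, 2) for c in n_i.values())
    sum_j = sum(comb(c, 2) for c in n_j.values())

    expected = sum_i * sum_j / comb(n, 2)  # chance-level agreement
    max_index = (sum_i + sum_j) / 2
    if max_index == expected:
        # Degenerate case (e.g., both partitions are a single cluster);
        # returning 1.0 here is a convention assumed by this sketch.
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The first call compares two identical partitions with swapped label names, which illustrates that ARI depends only on the grouping and not on the label values.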

Experiments in the Plain-Domain
For local use in PD, the data owner frequently lacks sufficient memory for the required kernel matrix. Consequently, the testbed was a laptop with an Intel Dual Core 2.66 GHz CPU and 3 GB of available RAM, which calculates the kernel function on demand. RSVC-EO and all the compared methods were implemented and fairly evaluated in MATLAB 2016a on Windows 7-X64.

Analysis of Parameter Sensitivity
Algorithm 2 introduces η_1 to indicate the convex decomposition, which might be directly related to RSVC-EO's performance. Figure 4 depicts the ARI variations achieved by RSVC-EO with respect to η_1 ∈ [−0.8, 0.9] with a step of 0.01. Here, the variation for each dataset is represented by a rectangle in which the square block is the mean value. The variations are very small for 9 out of 12 datasets, i.e., wisconsin, glass, mLibras, P2P-T, WebKB, abalone, Reuters, Oh, and kddcup99; in fact, they are lower than 3.21 × 10^−4. For the other three datasets, i.e., wine, 20NG, and sh, the variations are lower than 5.5 × 10^−3, and the rectangles' locations show that most of the ARIs are close to the peak value. Therefore, selecting η_1 is usually an easy task when using RSVC-EO to achieve a relatively optimal clustering result. Notice that a preset η_1 is not required when the cluster number K is known in advance, because RSVC-EO can merge convex hulls until the expected K is obtained.

Analysis of Iteration Sensitivity
The iterative analysis is essential for both RSVC-EO and MPPSVC. Although the server conducts the solver independently to get the encrypted coefficient vector Mβ, the runtime might be pricey if we have to choose the strategy of immediately calculating the kernel function for large-scale data. Therefore, we examine whether massive iterations are unavoidable. Notably, Hsieh et al. [41] proved that the general DCD solver reaches an ε-accurate solution in O(log(1/ε)) iterations. For simplicity, we checked the relationship between the achieved ARI and the iteration number in PD. Figure 5 depicts the results, where kddcup99 is omitted because of its pricey runtime at large iteration numbers.
For most of the cases, the achieved ARIs are relatively stable as the iteration number increases. With the proposed solver, a useful phenomenon is that, for each case, the best ARI is usually reached with a small iteration number (≤4). This means that a more precise objective function value of the problem in Equation (2) is not always required in practice. Therefore, a small iteration number meets the requirements of both RSVC-EO and MPPSVC for the expected results, and it does not bring a noticeable computation load to either side.

Performance Related to Kernel Utilization
In this section, we check whether RSVC-EO can flexibly balance efficiency and usability with limited memory. For efficiency, the runtimes of completing Algorithm 1 with the pre-computed kernel matrix and with the kernel function calculated immediately on Line 5 were evaluated separately. The former is denoted by "Runtime (Store Kernel)" and the latter by "Runtime (Cal. Kernel)". Their memory consumptions were also evaluated, respectively, as "Storage (Store Kernel)" and "Storage (Cal. Kernel)". (Because of the limited memory (3 GB), the client cannot in fact afford the requirements of all the experimental datasets. Thus, "Runtime (Store Kernel)" for storage exceeding the available memory was estimated by "Runtime (Cal. Kernel)" minus the runtime of calculating K(x_j, x_i).) Figure 6 shows the results. For small datasets with N < 1000, such as wisconsin, glass, and wine, there are no obvious differences between "Runtime (Store Kernel)" and "Runtime (Cal. Kernel)". Although the client handles large datasets well, its "Runtime (Cal. Kernel)" rises quickly as N increases. In contrast, with the kernel function calculated immediately, "Storage (Cal. Kernel)" is linear in the data size, whereas "Storage (Store Kernel)" easily exceeds the client's capability. The gaps become significant for datasets with N ≫ d. For instance, to deal with sh, "Storage (Store Kernel)" requires 7.05 GB while "Storage (Cal. Kernel)" only needs 1.49 MB. Therefore, efficiency and storage consumption are strongly related to kernel utilization, and it is critical for resource-limited clients to outsource the cluster analysis.
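The storage gap for sh can be reproduced with simple arithmetic. The sketch below rests on assumptions that are not stated in this section: the shuttle training set is taken to have N = 43,500 samples with d = 9 attributes, each value is assumed to occupy 4 bytes, and GB and MB are read as binary units. Under these assumptions the results appear consistent with the reported 7.05 GB and 1.49 MB.

```python
def storage_bytes(n_samples, n_dims, bytes_per_value=4):
    """Return (full kernel matrix bytes, raw data bytes).

    Storing the kernel needs N x N values, whereas calculating the kernel
    on demand only needs the N x d data samples themselves.
    """
    kernel = n_samples * n_samples * bytes_per_value
    data = n_samples * n_dims * bytes_per_value
    return kernel, data

# Assumed sizes for the shuttle training set (not given in this section).
kernel_b, data_b = storage_bytes(43500, 9)
kernel_gib = kernel_b / 2**30   # "Storage (Store Kernel)"
data_mib = data_b / 2**20       # "Storage (Cal. Kernel)"
print(round(kernel_gib, 2), round(data_mib, 2))
```

The ratio between the two quantities is N/d, which is why the gap widens for datasets with N ≫ d.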

Benchmark Results for Accuracy Comparison
To check RSVC-EO's performance, we compared it with the state-of-the-art methods: the complete graph (CG) [1], the reduced complete graph (RCG) [24], the equilibrium based SVC (E-SVC) [29,42], the cone cluster labeling (CCL) [43], the fast SVC (FSVC) [12], the position regularized SVC (PSVC) [44], the convex decomposition cluster labeling (CDCL) [23], the Voronoi cell-based clustering (VCC) [8], the fast and scalable SVC (FSSVC) [26], and the faster and reformulated SVC (FRSVC) [22]. Table 3 gives the achieved accuracies regarding ARI, and the corresponding runtime of the training phase and the labeling phase for each dataset is illustrated in Figure 7. Three points should be noted. First, due to the sampling strategy, VCC cannot achieve a fixed accuracy even though its parameters are fixed, so we use the mean and mean-square deviation of its top ten ARIs. Second, the runtime for each dataset is the average time of ten executions. Third, not all methods can finish the analysis on the client, because the kernel matrix is too large or some phase requires too much time (≥4 h in this study). For these cases, we use "-" to denote an unavailable ARI and mark the runtime with 0. Meanwhile, we collect the runtime of FRSVC and RSVC-EO by adopting the immediate calculation of the kernel function.
Note: "-" means not available due to insufficient memory or too much time consumption.
In Table 3, the first rank is highlighted in boldface. RSVC-EO reaches the best performance on 7 out of 12 datasets, especially the large ones such as Reuters, Oh, 20NG, and kddcup99, whereas FRSVC performs better on sh and WebKB, FSSVC outperforms the others on wine and wisconsin, and CDCL gets better results on P2P-T. As the data size increases, many traditional methods cannot run well on our client, e.g., CG, E-SVC, CCL, and PSVC. We directly quoted the results of E-SVC and FSVC on sh from [12]. Additionally, RSVC-EO frequently ranks among the first three in the other cases, e.g., sh, WebKB, and P2P-T. Thus, regarding accuracy, we conjecture that RSVC-EO is suitable for relatively large datasets. To verify this, we also give the results of pairwise comparisons in Table 4, following the work of Garcia and Herrera [45]. Here, RSVC-EO is the control method. A nonparametric statistical test, namely the Friedman test, was employed to get the average ranks and unadjusted p-values. By introducing an adjustment method, the Bergmann-Hommel procedure, the adjusted p-value, denoted by p_Homm, corresponding to each comparison was obtained. RSVC-EO reaches the best performance in terms of average rank. Since the Bergmann-Hommel procedure rejects those hypotheses with p-values ≤ 0.016, together with the values of p_Homm, we further confirm RSVC-EO's better performance.
In Figure 7, some obvious observations for the training phase can be made. (1) For the datasets smaller than WebKB, most of the methods perform similarly, including FRSVC and RSVC-EO. Although these two have to calculate the kernel function, this cost might be balanced by the greater number of iterations required by the others. However, CCL and PSVC still consume a lot because of their strict restrictions: CCL requires more iterative analysis to guarantee R < 1, while PSVC needs a pre-analysis to determine the weight for each data sample and imposes these weights as additional constraints. (2) As the data size increases, e.g., from WebKB to Reuters, the runtime of most methods rises dramatically. In particular, CG, E-SVC, CCL, and PSVC require more memory than the predefined upper bound. When the kernel matrix requires memory close to or greater than what the client can supply, as for Oh, 20NG, and sh, only two groups of methods remain valid. The first group includes VCC (sample rate θ ∈ [0.001, 0.5)) and FSSVC, which adopt sampling strategies. Together with the accuracies in Table 3, FSSVC obtains better accuracy and consumes much more memory because it steadily chooses boundaries, whereas VCC prefers a random strategy. The second group consists of FRSVC and RSVC-EO, which calculate the kernel function on demand to avoid huge memory consumption. Despite having greater runtime than VCC, they are rewarded with better accuracies. A remarkable finding is that, benefiting from its parameter insensitivity, RSVC-EO requires fewer iterations in learning and hence less runtime. (3) For kddcup99, the full kernel matrix with approximately 2.44 × 10^11 entries needs 909.18 GB, which is far more than the client can afford. VCC and FSSVC fail because they can hardly select appropriate data samples to describe the pattern, while FRSVC fails because of its pricey labeling strategy. Only RSVC-EO finishes the analysis, with a suboptimal result because we use only one iteration in Algorithm 1. Therefore, obtaining the optimal result will remain a big challenge for RSVC-EO as the data size continues to increase.
For the labeling phase, RSVC-EO outperforms the others significantly, which confirms the core ideas of FCDCL. First, FCDCL does not use iterative analysis. Thus, it performs well on high-dimensional data, such as mLibras and Oh, whereas E-SVC fails and the others consume much more. Second, the connectivity analysis of FCDCL avoids the traditional sampling checks in feature space, which reduces the impact of candidate sample pairs. Although the other methods try to reduce the number of sample pairs and the sample rate, the runtime of the essential sample analysis grows faster than proportionally to the size of the dataset or the candidate subset.

Effectiveness of Capturing Data Distribution
Following Xu and Wunsch [30], we find that increasing the cluster number sometimes has a positive impact on the accuracy measures. However, we should try to avoid splitting data samples of a group into multiple clusters. We intuitively expect an effective method to capture the data distribution accurately. Therefore, we use the difference between the captured cluster number N_C and the real number of classes N_R summarized in Table 2, expressed as the percentage (N_C/N_R − 1) × 100%. Comparisons amongst the eleven methods are illustrated in Figure 8. If a method is invalid on a dataset, the corresponding percentage is assigned the greatest value amongst the methods that finished, and its column is gray with slashes. The shorter the column for a dataset, the better the corresponding method performs. As shown in Figure 8, RSVC-EO outperforms the other ten methods significantly.

Experiments in the Encrypted-Domain
To integrate the Paillier cryptosystem, MPPSVC was implemented in C++ using the GNU GMP library version 6.0.0a (https://ftp.gnu.org/gnu/gmp/). Both the server and the client were modeled as different threads of a single program, which pass data or parameters to each other following the rules shown in Figure 3. We conducted experiments on a server with Quad-Core 2.29 GHz CPUs and 64 GB of main memory running Windows 7-X64.

Performance in the Encrypted-Domain
In ED, the Paillier cryptosystem only allows integers. We therefore introduce a scaling factor γ on the input data samples and the exchanged data, and round the scaled values down to integers before encryption. Table 5 shows the accuracies for various γ in ED. Although better accuracies are not always obtained with greater γ, the clustering results (marked by underlines) become steady when γ is above 10^4. For these cases, the accuracies are either the best ones (highlighted in boldface) or very close to the best. Hence, four decimal places are sufficient for the 12 datasets. Compared with Table 3, we find that γ has only a little influence on accuracy in ED, which arises because both the input data samples and the intermediate results transmitted from the client to the server might lose a certain degree of precision. Since the accuracies achieved by MPPSVC are very close to those obtained by RSVC-EO in PD, this confirms that MPPSVC preserves accuracy while guaranteeing the privacy of the input data and the clustering procedure.
Table 6 presents the runtime of each step (following Table 1) in ED consumed, respectively, by the client and the server. "#SVs" and "#Convex Hulls" denote the number of SVs and convex hulls, respectively. We omit the cost of outsourcing the pre-computation of M following Luo et al. [27] because it is not part of the proposed framework.
Table 6. Runtime (s) of each step in ED, respectively, consumed by the client (C) and the server (S).
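The role of the scaling factor γ can be illustrated with a toy additively homomorphic scheme. The following is a minimal sketch of textbook Paillier with g = n + 1, not the C++/GMP implementation used in the experiments: the primes are tiny and fixed, the random number generator is not cryptographically secure, and all function names are assumptions of this example. It shows real values being scaled by γ and rounded down, encrypted, added in the ciphertext domain, and decrypted.

```python
import math
import random

def keygen(p=1000003, q=1000033):
    # Toy primes for illustration only; real keys need large random primes.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lambda mod n (valid for g = n + 1)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt a (possibly negative) integer m under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n) * mu % n
    return m - n if m > n // 2 else m  # map back to signed integers

def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds plaintexts."""
    return (c1 * c2) % (n * n)

def scale(x, gamma):
    """Scale a real value by gamma and round down to an integer."""
    return math.floor(x * gamma)

gamma = 10**4
n, priv = keygen()
c_sum = add(n, encrypt(n, scale(1.5, gamma)), encrypt(n, scale(-0.25, gamma)))
print(decrypt(n, priv, c_sum) / gamma)   # sum recovered after unscaling
print(scale(0.12345678, gamma))          # digits beyond 1/gamma are dropped
```

With γ = 10^4, everything beyond the fourth decimal place is discarded before encryption, which is the precision loss the discussion of Table 5 refers to.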

Note: Step 5 can classify all the arriving data one by one or in parallel; we collect the consumption for a single sample for efficiency analysis.
Together with Figure 7, we can make several observations. (1) Step 1 is no longer the first barrier for the client even though the solver requires a large number of iterations. For example, in Figure 7l, Step 1 for kddcup99 still consumes 10,846.2134 s although we cut off the iterations to get a suboptimal result for efficiency. Now, it only requires 0.4751 s on the client, without considering the iterations on the server. (2) Step 2 is the most time-consuming task for both sides, and Step 3 takes second place. Comparing the two most time-consuming datasets, P2P-T and mLibras, with the others, efficiency is closely related to "#SVs" and "#Convex Hulls". From Definition 2, "#SVs" also affects "#Convex Hulls". Since S1NNQ compares each unlabeled data sample with both SVs and density centroids, "#SVs" and "#Convex Hulls" directly contribute to the computational time of Step 5. (3) Dimensionality is another critical factor. Evidence can be found in the small dataset mLibras with 90 dimensions. Since each dimension must be encrypted and decrypted separately, high dimensionality increases the computational time in ED and results in more SVs. (4) Similar to the method in [17], Step 5 is a classification task in ED, which can be conducted one by one or in parallel. Outsourcing an encrypted sample to obtain its label is a light workload. (5) The total time cost is proportional to the data size yet within an acceptable range for the client. However, we do not suggest outsourcing Steps 1-3 for small data analysis due to the increased costs. Based on a learned model, Step 5 is also suitable for being outsourced as a service.

Related Work
Despite handling arbitrary cluster shapes well, SVC suffers from pricey computation and memory as data size increases. Generally, optimizing the critical operations and asking for the Cloud's help are the potential countermeasures. For the former, the training and labeling phases are considered, respectively. (1) For model training, the core is solving the dual problem in Equation (2). Major studies prefer generic optimization algorithms, e.g., gradient descent, sequential minimal optimization [4], and evolutionary strategies [46]. Later studies rewrote the dual problem by introducing the Jaynes maximum entropy [47], the position-based weight [44], and the relationship amongst SVs [26]. However, conducting a solver with the full dataset suffers from the huge consumption of the kernel matrix. Thus, FSSVC [26] steadily selects the boundaries while VCC [8] samples a predefined ratio θ ∈ (0, 1] of the data. Other methods related to reducing the working set and to divide-and-conquer strategies were surveyed in [4]. However, bottlenecks still easily appear due to the nonlinear strategy and the pre-computed kernel matrix. Thus, FRSVC [22] employs a linear method to seek a balance between efficiency and memory cost. (2) For the labeling phase, connectivity analysis has long adopted a sampling check strategy. Reducing the number of sample pairs thus becomes the first consideration, for instance, using the full dataset by CG [1], the SVs by PSVC [44], the SEVs by RCG [24], and the transition points by E-SVC [29,42]. Although the number of sample pairs is gradually reduced, these methods have the side effect of additional iterations in seeking SEVs or TS. Thus, CDCL [23] suggests a compromise of using SVs to construct convex hulls, which are employed as substitutes for SEVs. For efficiency, CDCL reduces the average sample rate by a nonlinear sampling strategy. Later, FSSVC [26] and FRSVC [22] made further improvements by reducing the average sample rate m close to 1. Besides, Chiang and Hao [48] introduced a cell growth strategy, which starts at any data sphere, expands by absorbing fresh neighboring spheres, and splits if its density is reduced to a certain degree. Later, CCL [43] created a new way of checking the connectivity of two SVs according to a single distance calculation. However, the overly strict constraints imposed on the solver degrade its performance. In fact, for these methods, the other pricey consumption is the adjacency matrix, which usually ranges from O(N_SV^2) to O(N^2).
As data size increases, the pricey computation and huge space needed by the above solutions raise the requirement of outsourcing. However, asking for the Cloud's help raises concerns about data leakage, since both the input data and the learned model memorize information [10]. The major studies focus on securely outsourcing known SVM classifiers. Generally, the secure outsourcing protocols prefer introducing homomorphic encryption [15][16][17]49], reformulating the classifier [50], randomizing the classifier [51], or finding an approximated classifier [21]. Furthermore, based on homomorphic encryption, the existing secure outsourcing methods support the calculation of rational numbers [52], matrix computations [14,27,28], mathematical optimization [14], and k-nearest-neighbor queries [20]. However, very few works are related to privacy-preserving model training. An early work was published by Lin et al. [19], who suggested a random linear transformation for a subset of the data before outsourcing. Later, Salinas et al. [53] presented a transformed quadratic program and its solver, namely the Gauss-Seidel algorithm, for securely outsourcing SVM training while reducing the client's computation. Using a primal estimated sub-gradient solver and replacing the SVs with data prototypes, the most recent work [54] gives a solution for training an SVM model with data encrypted by homomorphic encryption.
In a sense, training an SVM model and making a decision for a data sample are similar to the training phase and the last labeling step of SVC, respectively. However, the known solutions are not suitable for SVC because of its distinct iteration/analysis strategy and its operations in feature space. To the best of our knowledge, no practical solution has been presented for securely outsourcing SVC despite strong demand.

Conclusions and Future Work
To ease the client's workload, we propose MPPSVC to make all the phases of SVC outsourceable without worrying about privacy issues. For simplicity and generality, we suggest using additively homomorphic encryption to protect data privacy. Its limited operations motivated our new design of RSVC-EO based on elementary operations. However, the inevitable iterations for Equation (2) and the complex computations in all phases of SVC may cause massive interactions. For efficiency, we consequently protect the kernel by a matrix transformation, which not only reduces data transmission but also lets the outsourced solver iterate well. Besides, for the labeling phase of RSVC-EO, FCDCL is developed without iterative analysis. Taking RSVC-EO as the core, MPPSVC consists of several customized, lightweight, and secure protocols. Theoretical analysis and experimental results on twelve datasets demonstrate the reliability of the proposed methods, i.e., RSVC-EO for local use and MPPSVC for outsourcing.
Although MPPSVC provides customizable phase outsourcing, security and efficiency are ever-lasting issues, which should be balanced as data size increases. How to securely control the iterations while reducing SVs, finding substitutes for the complex operations, avoiding unnecessary kernel matrix consumption, and making full use of distributed computing are worthy of investigation.