Fuzzy-Based Balanced Partitioning Under Capacity and Size-Tolerance Constraints in Distributed Quantum Circuits

It is important for the design of a distributed quantum circuit (DQC) to minimize the communication cost in k-way balanced partitioning. In this article, given an original quantum circuit (QC), a partitioning number k, the maximum capacity δ inside each partition, and the maximum size tolerance γ between two partitions, a new k-way (δ, γ)-balanced partitioning problem can be formulated as a k-way partitioning problem under the capacity constraint δ and the size-tolerance constraint γ, and a fuzzy-based partitioning algorithm can be proposed to minimize the communication cost in k-way (δ, γ)-balanced partitioning for a DQC design. First, an edge-weighted connection graph can be constructed from the gates in a given QC. Furthermore, based on the estimation of the probabilistic connection strength between two vertices in the connection graph and the initial k-way partitioning result in the connection graph, the fuzzy memberships on k clusters can be generated in fuzzy k-means graph clustering. Finally, based on the fuzzy memberships on k clusters in the connection graph, the maximum capacity inside each partition, and the maximum size tolerance between two partitions, all the vertices in the connection graph can be assigned onto k partitions to minimize the communication cost in k-way (δ, γ)-balanced partitioning. Compared with Daei's recursive Kernighan–Lin-based algorithm in four-way balanced partitioning, the experimental results show that the proposed fuzzy-based partitioning algorithm with three size-tolerance constraints γ = 1, γ = 2, and γ = 3 can use 58.3%, 61.3%, and 64.5% of CPU time to reduce 16.1%, 21.2%, and 24.6% of the communication cost for the eight tested circuits on the average, respectively. 
Compared with the partitioning algorithm modified from Dadkhah's partitioning algorithm in three-way, four-way, or five-way balanced partitioning, the experimental results show that the proposed fuzzy-based partitioning algorithm with the size-tolerance constraint γ = 3 can use 35.0% of CPU time to reduce 11.1% of the communication cost for the eight tested circuits on the average.


I. INTRODUCTION
Quantum computing [1] can use the principles of quantum mechanics to efficiently solve some specific problems by designing quantum algorithms. It is known that the quantum algorithms can be described by using quantum circuits (QCs) [2], [3] on an ideal quantum computer. To execute some large-scale quantum algorithms on a quantum computer, many physical qubits must be required in the QCs. Due to the technology constraints, the number of the available quantum bits (qubits) in a fabricated quantum device [4] is limited. Hence, the existence of these technology constraints will lead to the emergence of distributed quantum computing. In the design of a distributed quantum circuit (DQC), a set of smaller scale quantum devices must be constructed for large-scale quantum algorithms, and there can be some communication methods among smaller scale quantum devices.
In the concept of using DQC designs, Cleve and Buhrman [5] first studied the quantum communication among remote quantum devices. As the input data are distributed among remote quantum devices, quantum entanglement can be used for communication. Furthermore, Cirac et al. [6] showed that the use of maximally entangled states can be advantageous for a large number of quantum devices using ideal quantum channels. In addition, Beals et al. [7] also showed that each QC can be converted into a DQC design in a distributed quantum system.
For communication in DQC designs, Yepez [8] first presented a distributed architecture using two communication methods. Basically, each qubit can be entangled to any number of qubits in quantum communication, and all the remote quantum devices can be connected by a set of classical channels in classical communication. In addition, Lo [9] examined the cost of the classical communication on DQC designs. Furthermore, Caleffi et al. [10], [11] stated that the communication among remote quantum devices can become faster by using the quantum Internet. In the design of the quantum Internet, teleportation can be used as the main strategy for information transmission.
For two-way partitioning in DQC designs, Yimsiriwattana and Lomonaco [12] first proposed the distributed model of Shor's algorithm [13]. In the distributed model, the global gates can be used to implement the DQC design, and teleportation can be used as the communication method. However, the partitioning of the used qubits cannot be determined to minimize the number of teleportations in the distributed model. Furthermore, based on the teleportation for communication, Van Meter et al. [14] proposed the distributed design of a two-qubit Vedral-Barenco-Ekert (VBE) carry-ripple adder onto two equal quantum devices. However, the teleportation cost in the distributed design can still be high. Next, Zomorodi-Moghadam et al. [15] proposed a heuristic algorithm to reduce the communication cost between two partitions inside a DQC design. By finding the execution order of two-qubit controlled-not (cnot) gates, the total number of teleportations between the two partitions can be reduced to minimize the teleportation cost between two distributed devices. In addition, Houshmand et al. [16] proposed an evolutionary algorithm, and Dadkhah et al. [17] proposed a genetic algorithm to minimize the communication cost between two partitions inside a DQC design. Recently, based on the connectivity matrix model of QCs, Ghodsollahee et al. [18] further proposed a two-phase algorithm to minimize the communication cost between two partitions inside a DQC design. However, these proposed algorithms do not consider multiple-way partitioning for the minimization of the communication cost in a DQC design.
In multiple-way partitioning for a DQC design, Andrés-Martínez and Heunen [19] presented an automated method to distribute a QC over multiple devices. However, the minimization of the communication cost is not considered for multiple partitions in a DQC design, and the assignment of the global gates in different partitions is not further discussed in the proposed algorithm. Based on the construction of the bipartite graph for a given QC, Davarzani et al. [20] proposed a dynamic programming (DP) algorithm to partition the bipartite graph into some low-capacity QCs. However, not every QC can be guaranteed to be represented as a bipartite graph, and the DP algorithm takes more execution time in multiple-way partitioning. Additionally, Daei et al. [21] also proposed an iterative Kernighan-Lin-based (KL-based) algorithm in a DQC design from a monolithic QC. In the proposed algorithm, the communication cost between multiple partitions inside a DQC design can be minimized. However, based on the utilization of the KL algorithm, the proposed recursive KL-based algorithm can only be used for a constrained number of partitions in a DQC design. Recently, based on the reordering result of a QC for the improvement of the execution time and the construction of a graph model for a QC, Dadkhah et al. [22] proposed the genetic algorithm and the modified tabu-search algorithm to partition the graph model to obtain a DQC. However, the genetic algorithm and the modified tabu-search algorithm take more execution time in multiple-way partitioning.
The contributions of this article can be summarized as follows.
1) In a DQC design, a new k-way balanced partitioning (KBP) problem with two adjustable parameters on maximum capacity and maximum size tolerance can be formulated. Given an original QC, a partitioning number k, the maximum capacity δ inside each partition, and the maximum size tolerance γ between two partitions, an edge-weighted connection graph can be constructed from the gates in the given QC for k-way (δ, γ)-balanced partitioning under the capacity constraint δ and the size-tolerance constraint γ.
2) Based on the edge connections in an edge-weighted connection graph and the given partitioning number k, the probabilistic connection strength between two vertices in the connection graph can be estimated. Furthermore, the initial k-way partitioning result in the connection graph can be obtained by using a bottom-up clustering algorithm. Finally, based on the definition of the clustering distance between two vertices in the connection graph, the fuzzy memberships on k clusters in the connection graph can be generated in fuzzy k-means graph clustering (FKGC).
3) Based on the fuzzy memberships on k clusters in the connection graph, the capacity constraint δ inside each partition, and the size-tolerance constraint γ between two partitions, all the vertices in the connection graph can be assigned onto k partitions to minimize the communication cost in k-way (δ, γ)-balanced partitioning for a DQC design.
The rest of this article is organized as follows. Section II contains the motivation of the KBP in a DQC design and the formulation of the k-way (δ, γ)-balanced partitioning problem under the capacity constraint δ and the size-tolerance constraint γ in a DQC design. In Section III, based on the construction of an edge-weighted connection graph, the computation of the probabilistic connection strength between two vertices in the connection graph, the design of the FKGC, and the assignment of all the vertices in the connection graph onto k partitions, a fuzzy-based partitioning algorithm can be proposed to partition an original QC into k quantum subcircuits while minimizing the communication cost under the capacity constraint δ and the size-tolerance constraint γ in a DQC design. In Section IV, the experimental results of the proposed fuzzy-based partitioning algorithm can be listed and compared with some published algorithms in k-way (δ, γ)-balanced partitioning for a DQC design. Finally, Section V concludes this article.

II. MOTIVATION AND PROBLEM FORMULATION
In quantum computing, it is necessary for the solution of a larger problem to use more quantum bits (qubits) inside a QC. Based on the superposition principle in QCs, the state |ψ⟩ of a qubit can be represented by a unit vector in a Hilbert space labeled as α|0⟩ + β|1⟩, where |0⟩ and |1⟩ are the basis of the space and α and β are two complex coefficients satisfying |α|² + |β|² = 1.
In general, a QC can consist of some quantum gates connected by a set of quantum wires for moving quantum data. Basically, a t-qubit quantum gate U can be defined and represented as a 2^t × 2^t matrix. By performing a t-qubit quantum gate U on t quantum states, |ψ 1⟩, |ψ 2⟩, …, and |ψ t⟩, the outcome of the quantum gate U can be represented as the state U(|ψ 1⟩, |ψ 2⟩, …, |ψ t⟩). In addition, the controlled gate, controlled-U, with s controlled qubits and a t-qubit quantum gate U, operating on (s+t) qubits, can be treated as an (s+t)-qubit gate. For example, the 1-controlled gate, cnot, with one controlled qubit and the 1-qubit quantum gate, not, can be treated as a two-qubit gate.
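As a concrete instance of the 2^t × 2^t representation, the cnot gate acts on two qubits and can be written as a 4 × 4 matrix. The sketch below assumes the basis ordering |00⟩, |01⟩, |10⟩, |11⟩ with the first qubit as the controlled qubit; this ordering is a convention chosen here for illustration and is not fixed by the text.

```latex
\[
\mathrm{cnot} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix},
\qquad
\mathrm{cnot}\,|10\rangle = |11\rangle, \quad
\mathrm{cnot}\,|11\rangle = |10\rangle .
\]
```

Under this convention, the operating qubit is flipped only when the controlled qubit is in the state |1⟩.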
Due to the limited capacity inside a QC in quantum computing technologies, a DQC design consisting of many small QCs on remote locations can be connected together to cooperatively solve a larger problem using the available capacities inside all the smaller QCs. In a DQC design, a local gate can be defined as a quantum gate where all qubits are located inside the same QC. On the other hand, a global gate can be defined as a quantum gate whose qubits are located inside different remote QCs. To implement the functionality of an original QC, the remote QCs in a DQC design must communicate with each other by sending the necessary qubits' information to each other using a quantum channel via teleportation.
Due to the limited capacity inside a small QC, the maximum capacity inside each small QC must be treated as the capacity constraint of one partitioned QC in a DQC design. Basically, the state of the controlled qubits inside some remote QCs must be sent to the operating qubit inside the other remote QC by communicating the qubits using teleportation. Since teleportation in a DQC design is a costly operation, the communication cost must be minimized in the partitioning process of a DQC design, that is, the number of the global gates must be minimized in the partitioning process of a DQC design. Additionally, to make use of the computing ability of the small QCs for a larger problem in a DQC design, the balance degree of the available partitions must be considered in a DQC design. Hence, the maximum size difference between two partitioned QCs in a DQC design must be treated as the size-tolerance constraint of all the partitions in a DQC design. Clearly, the smaller the maximum size difference between two partitioned QCs is, the higher the balance degree of a DQC design is. However, the communication cost in a DQC design will become more serious due to the higher balance degree of a DQC design. Under the acceptable balance degree of a DQC design, the allowable tolerant difference between two partitioned QCs in a DQC design can be used to reduce the communication cost in a DQC design.

A. MOTIVATION
In general, a given circuit can easily be bipartitioned into two balanced subcircuits in the KL algorithm. Hence, an original QC in a DQC design can be recursively partitioned by using the KL-based algorithm. However, the recursive KL-based algorithm can only partition an original QC into smaller QCs inside some constrained partitions, that is, the number of the partitions can only be 2^p in the recursive KL-based partitioning of a DQC design, where p is the number of recursions. Hence, it is necessary for a DQC design to consider an arbitrary number of partitions in the multiple-way partitioning of a DQC design.
On the other hand, it is known that the KL-based algorithm is a two-way balanced partitioning algorithm. Hence, the recursive KL-based algorithm is a multiple-way strictly balanced partitioning algorithm, that is, the size difference between two partitions is not larger than 1 in the recursive KL-based partitioning. Due to the strict size tolerance in the recursive KL-based partitioning, the strict balance will lead to the larger communication cost in a DQC design. If the acceptable size tolerance is considered in the multiple-way balanced partitioning, the communication cost can be further reduced in a DQC design.

B. PROBLEM FORMULATION
Initially, it is assumed that the communication cost between one controlled qubit and one operating qubit inside one gate can be set as 1. Given an original QC with n qubits q 1 , q 2 , …, q n and m gates U 1 , U 2 , …, U m , a partitioning number k, the maximum capacity δ inside a partitioned QC, and the maximum size tolerance γ between two partitions, the k-way (δ, γ)-balanced partitioning can be formulated to partition the original QC into k partitions to minimize the communication cost among the k partitions while satisfying the capacity constraint δ inside each partition and the size-tolerance constraint γ between two partitions in a DQC design.
For the specification of an original QC with eight qubits q 1 , q 2 , …, q 8 and 23 gates U 1 , U 2 , …, U 23 , in Fig. 1(a), it is clear that the two gates U 19 and U 22 are 2-controlled gates and the other gates are 1-controlled gates. If the partitioning number k is given as 3, the maximum capacity δ is given as 4, and  Similarly, if the partitioning number k is given as 3, the maximum capacity δ is given as 3, and the maximum size tolerance γ is given as 1, then 1) the set of eight quantum bits {q 1 , q 2 , q 3 , q 4 , q 5 , q 6 , q 7 , and q 8 } can be partitioned into three partitions, {q 1 , q 2 , q 5 }, {q 3 , q 6 , q 7 }, and {q 4 , q 8 } in the three-way (3, 1)-balanced partitioning and 2) the communication cost can be minimized as 12 in the DQC design. As illustrated in Fig. 1(c), it is clear that the communications in the three-way (3, 1)-balanced partitioning can be placed on the 11 global gates, U 1 , U 2 , U 3 , U 6 , U 7 , U 11 , U 14 , U 15 , U 8 , U 19 , and U 20 .
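The objective and the two constraints can be checked with a few lines of Python. This is a minimal sketch, not the authors' implementation: the `(controls, target)` gate encoding, the function names, and the three sample gates are hypothetical and do not reproduce the circuit of Fig. 1. Only the partition {q 1 , q 2 , q 5 }, {q 3 , q 6 , q 7 }, {q 4 , q 8 } is taken from the text.

```python
def communication_cost(gates, part_of):
    """Count controlled/operating qubit pairs that lie in different partitions.

    gates: list of (controls, target) tuples of qubit indices (assumed encoding).
    part_of: dict mapping each qubit index to its partition index.
    """
    cost = 0
    for controls, target in gates:
        for c in controls:
            if part_of[c] != part_of[target]:
                cost += 1  # each remote controlled/operating pair costs 1
    return cost


def is_balanced(partitions, delta, gamma):
    """Check the capacity constraint delta and the size-tolerance constraint gamma."""
    sizes = [len(p) for p in partitions]
    return max(sizes) <= delta and max(sizes) - min(sizes) <= gamma


# Partition from the three-way (3, 1) example in the text.
partitions = [{1, 2, 5}, {3, 6, 7}, {4, 8}]
part_of = {q: i for i, p in enumerate(partitions) for q in p}

# Hypothetical gates: one local cnot, one global cnot, and one
# 2-controlled gate with exactly one remote controlled qubit.
gates = [((1,), 2), ((1,), 3), ((4, 6), 8)]

feasible = is_balanced(partitions, delta=3, gamma=1)
cost = communication_cost(gates, part_of)
print(feasible, cost)
```

Counting per controlled/operating pair rather than per gate is consistent with the example above, where 11 global gates, one of them 2-controlled, give a communication cost of 12.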

III. BALANCED PARTITIONING UNDER CAPACITY AND SIZE-TOLERANCE CONSTRAINTS IN DQCS
Given an original QC with n qubits and m gates, a partitioning number k, the maximum capacity δ inside each partition, and the maximum size tolerance γ between two partitions, a fuzzy-based partitioning algorithm can be proposed to minimize the communication cost in k-way (δ, γ)-balanced partitioning under the capacity constraint δ and the size-tolerance constraint γ, and the design flow of the proposed algorithm is shown in Fig. 2.
In the proposed algorithm, the process of partitioning an original QC into k partitions in k-way (δ, γ )-balanced partitioning can be divided into three sequential steps: construction of edge-weighted connection graph, generation of fuzzy matrix in FKGC, and vertex assignment in k-way (δ, γ )-balanced partitioning.
For the construction of an edge-weighted connection graph, based on the communication relation inside the gates in a given QC, an edge-weighted connection graph can be constructed. For the generation of a fuzzy matrix in FKGC, first, based on the partitioning number k and the edge connections in the connection graph, the connection strength between two vertices in the connection graph can be estimated. Furthermore, based on the edge connections in the connection graph, the initial k-way partitioning in the connection graph can be constructed by using a bottom-up clustering algorithm. Finally, based on the definition of the clustering distance between two vertices in the connection graph, the fuzzy memberships on k clusters can be generated in FKGC. For the vertex assignment in k-way (δ, γ)-balanced partitioning, based on the maximum capacity δ inside each partition, the maximum size tolerance γ between two partitions, and the fuzzy memberships on k clusters, the vertices in the connection graph can be assigned onto k partitions under the capacity constraint δ and the size-tolerance constraint γ.
5100115 VOLUME 4, 2023

A. CONSTRUCTION OF EDGE-WEIGHTED CONNECTION GRAPH
Given an original QC with n qubits q 1 , q 2 , …, q n and m gates U 1 , U 2 , …, U m , initially, it is assumed that there is only one operating qubit inside each gate in a given QC. Basically, the communication relation can be defined as the relation between a controlled qubit and its operating qubit inside one gate. It is known that the communication between two qubits is permitted to be bidirectional in a DQC design. Hence, any edge in a connection graph is undirected. In the assignment of a graph edge, the communication cost between two qubits inside a global gate can be set as 1. In contrast, the communication cost between two qubits inside a local gate can be set as 0. Based on any available qubit in a given QC as one vertex and the communication relation between one controlled qubit and its operating qubit inside one gate in a given QC as one undirected edge, an undirected edge-weighted connection graph G c (V c , E c ) can be constructed as follows: a vertex v j in V c represents one available qubit q j , 1 ≤ j ≤ n, and an undirected edge e j,l in E c represents the communication relation between one controlled qubit q j and its operating qubit q l inside one gate U i , 1 ≤ i ≤ m, 1 ≤ j, l ≤ n, j ≠ l. In addition, the weight of the undirected edge e j,l in E c represents the number of gates with the communication relation between the two qubits q j and q l .
Referring to the given QC with eight qubits q 1 , q 2 , …, q 8 and 23 gates U 1 , U 2 , …, U 23 , in Fig. 1(a), based on the eight qubits in the QC as 8 vertices and the 16 communication relations inside 23 gates in the QC as 16 undirected edges, an undirected edge-weighted connection graph G c can be constructed, as illustrated in Fig. 3.
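The construction above can be sketched in Python as follows. The `(controls, target)` gate encoding, the function name, and the four sample gates are assumptions made for illustration; they are not the circuit of Fig. 1(a).

```python
from collections import Counter


def build_connection_graph(gates):
    """Return {frozenset({j, l}): weight} for the undirected connection graph.

    Each controlled qubit in a gate contributes one communication relation
    with the operating qubit; the edge weight counts these relations.
    """
    weights = Counter()
    for controls, target in gates:
        for c in controls:
            weights[frozenset((c, target))] += 1
    return dict(weights)


# Hypothetical gates: three 1-controlled gates and one 2-controlled gate.
gates = [((1,), 2), ((2,), 1), ((1,), 3), ((1, 2), 3)]
graph = build_connection_graph(gates)
for edge, weight in graph.items():
    print(sorted(edge), weight)
```

Using a `frozenset` as the edge key makes the edge undirected, so a gate controlled by q 1 acting on q 2 and a gate controlled by q 2 acting on q 1 add to the same weight.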

B. GENERATION OF FUZZY MATRIX IN FKGC
To our knowledge, FKGC on fuzzy c-means clustering has been used for graph bisection and two-way circuit partitioning [23], [24], k-way circuit partitioning [25], partitioning-based placement [26], and layer-aware via minimization [27]. However, the fuzzy matrix in FKGC strongly depends on the definition of the clustering distance between two vertices in a corresponding graph. Due to the definition of the clustering distances in different applications, it is clear that the FKGC in circuit partitioning [23], [24], [25], placement [26], or layer-aware via minimization [27] cannot be directly used in multiple-way balanced partitioning. Hence, it is necessary for the development of multiple-way balanced partitioning in a DQC design to define the accurate clustering distance in FKGC.
For FKGC in multiple-way balanced partitioning, it is known that the construction of an initial k-way partitioning in the connection graph, the definition of the clustering distance between two vertices in the connection graph, the formulation of one error function, and the selection of an acceptable error ε must be considered. To generate the final fuzzy matrix of a given connection graph in FKGC, the generation process can be further divided into three sequential steps: estimation of probabilistic connection strength, initial k-way partitioning via bottom-up clustering, and FKGC via probabilistic connection strength.

1) ESTIMATION OF PROBABILISTIC CONNECTION STRENGTH
Given an edge-weighted connection graph G c , there may be some connection paths between any pair of two vertices v i and v j . If the vertices v i and v j are divided into two different groups, all the connection paths of the vertices v i and v j must be fully cut. If one edge on any connection path between the vertices v i and v j is cut, the weight sum of all the cut edges between the vertices v i and v j can be treated as the connection strength of the vertices v i and v j for the cut edges. Since one cut edge on any connection path between the vertices v i and v j is randomly selected, all the connection strengths of the vertices v i and v j can be obtained for all the possible cut edges. To measure the connection strength of the vertices v i and v j , the concept of the probabilistic connection strength of the vertices v i and v j can be introduced by using the uniform distribution in probability theory. Hence, the estimation of the probabilistic connection strength between two vertices in the connection graph can be divided into two following steps: extraction of estimation paths and computation of probabilistic connection strength.
For the extraction of the estimation paths in an edge-weighted connection graph, given any pair of two vertices v i and v j in an edge-weighted connection graph G c , one iterative extraction process can be proposed to find a feasible set of estimation paths between the vertices v i and v j in the graph G c . In the iterative extraction process, first, the maximum-weight shortest path p i,j 1 between the vertices v i and v j can be found and extracted from the graph G c as an estimation path. Basically, the vertices, excluding the vertices v i and v j , on the path p i,j 1 can be defined as a set of extracted vertices on the path p i,j 1 . After extracting the path p i,j 1 , the extracted vertices and edges on the path p i,j 1 and the edges connecting to the extracted vertices must be further deleted from the graph G c , and the iterative extraction process can continue for the extraction of the next estimation path p i,j 2 in the remaining graph G c . The iterative extraction process will stop when there is no path between the vertices v i and v j in the remaining graph G c . As a result, a set of estimation paths between the vertices v i and v j can be extracted for the computation of the probabilistic partitioning cut between the vertices v i and v j in the graph G c .
Referring to the two vertices v 1 and v 6 in the connection graph G c , in Fig. 3, first, the maximum-weight shortest path p 1,6 1 , including the edge e 1,6 , can be extracted from the graph G c . After extracting the path p 1,6 1 , the edge e 1,6 must be deleted from the graph G c . Furthermore, the maximum-weight shortest path p 1,6 2 , including the two edges e 1,2 and e 2,6 , can be extracted from the remaining graph G c . After extracting the path p 1,6 2 , the vertex v 2 , the two edges e 1,2 and e 2,6 , and the two edges e 2,4 and e 2,5 , connecting to the vertex v 2 , must be deleted from the remaining graph G c . Next, the maximum-weight shortest path p 1,6 3 , including the three edges e 1,3 , e 3,7 , and e 6,7 , can be extracted from the remaining graph G c . After extracting the path p 1,6 3 , the two vertices v 3 and v 7 , the three edges e 1,3 , e 3,7 , and e 6,7 , the edge e 3,4 connecting to the vertex v 3 , and the two edges e 5,7 and e 7,8 connecting to the vertex v 7 must be deleted from the remaining graph G c . Finally, the maximum-weight shortest path p 1,6 4 , including the three edges e 1,8 , e 4,6 , and e 4,8 , can be extracted from the remaining graph G c . After extracting the path p 1,6 4 , the two vertices v 4 and v 8 , the three edges e 1,8 , e 4,6 , and e 4,8 , and the edge e 4,5 connecting to the vertex v 4 must be deleted from the remaining graph G c . As a result, the four estimation paths p 1,6 1 , p 1,6 2 , p 1,6 3 , and p 1,6 4 between the vertices v 1 and v 6 can be extracted from the graph G c , as illustrated in Fig. 4(a).
Similarly, referring to the two vertices v 2 and v 7 in the connection graph G c , in Fig. 3, first, the maximum-weight shortest path p 2,7 1 , including the two edges e 2,5 and e 5,7 , can be extracted from the graph G c . After extracting the path p 2,7 1 , the vertex v 5 , the two edges e 2,5 and e 5,7 , and the two edges e 1,5 and e 4,5 connecting to the vertex v 5 must be deleted from the graph G c . Furthermore, the maximum-weight shortest path p 2,7 2 , including the two edges e 2,6 and e 6,7 , can be extracted from the remaining graph G c . After extracting the path p 2,7 2 , the vertex v 6 , the two edges e 2,6 and e 6,7 , and the two edges e 1,6 and e 4,6 connecting to the vertex v 6 must be deleted from the remaining graph G c . Next, the maximum-weight shortest path p 2,7 3 , including the three edges e 1,2 , e 1,3 , and e 3,7 , can be extracted from the remaining graph G c . After extracting the path p 2,7 3 , the two vertices v 1 and v 3 , the three edges e 1,2 , e 1,3 , and e 3,7 , the three edges e 1,5 , e 1,6 , and e 1,8 connecting to the vertex v 1 , and the edge e 3,4 connecting to the vertex v 3 must be deleted from the remaining graph G c . Finally, the maximum-weight shortest path p 2,7 4 , including the three edges e 2,4 , e 4,8 , and e 7,8 , can be extracted from the remaining graph G c . After extracting the path p 2,7 4 , the two vertices v 4 and v 8 and the three edges e 2,4 , e 4,8 , and e 7,8 must be deleted from the remaining graph G c . As a result, the four estimation paths p 2,7 1 , p 2,7 2 , p 2,7 3 , and p 2,7 4 between the vertices v 2 and v 7 can be extracted from the graph G c , as illustrated in Fig. 4(b).

IEEE Transactions on Quantum Engineering
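The iterative extraction process can be sketched in Python as follows. The sketch assumes that "shortest" means the fewest edges and that, among the shortest paths, the one with the largest weight sum is taken, which matches the order of the paths in the two examples above. The adjacency-dictionary encoding, the function names, and the four-vertex graph are hypothetical and are not the graph of Fig. 3.

```python
from collections import deque


def max_weight_shortest_path(adj, s, t):
    """Among the paths with the fewest edges from s to t, return one of maximum weight."""
    dist, best, parent = {s: 0}, {s: 0}, {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, w in adj[u].items():
            if v not in dist:                      # first visit fixes the hop count
                dist[v], best[v], parent[v] = dist[u] + 1, best[u] + w, u
                queue.append(v)
            elif dist[v] == dist[u] + 1 and best[u] + w > best[v]:
                best[v], parent[v] = best[u] + w, u  # heavier path, same hop count
    if t not in dist:
        return None
    path = [t]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def extract_estimation_paths(adj, s, t):
    """Return a list of (vertex_path, edge_weights) for the estimation paths."""
    if s == t:
        return []  # the text treats identical vertices as a special case
    g = {u: dict(nbrs) for u, nbrs in adj.items()}  # work on a copy
    result = []
    while True:
        path = max_weight_shortest_path(g, s, t)
        if path is None:
            return result
        edges = list(zip(path, path[1:]))
        result.append((path, [g[u][v] for u, v in edges]))
        for u, v in edges:            # delete the edges on the path
            del g[u][v]
            del g[v][u]
        for x in path[1:-1]:          # delete interior vertices and incident edges
            for y in g.pop(x):
                del g[y][x]


# Hypothetical graph: adj[u][v] is the weight of the undirected edge (u, v).
adj = {1: {2: 2, 3: 1, 4: 1}, 2: {1: 2, 4: 3}, 3: {1: 1, 4: 1}, 4: {1: 1, 2: 3, 3: 1}}
paths = extract_estimation_paths(adj, 1, 4)
for vertex_path, edge_weights in paths:
    print(vertex_path, edge_weights)
```

Because interior vertices are removed after each extraction, the extracted paths share no vertex other than the two end vertices.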
For the computation of the probabilistic connection strength between the vertices v i and v j in an edge-weighted connection graph G c given a set of r estimation paths p i,j 1 , p i,j 2 , …, p i,j r between the vertices v i and v j , it is assumed that there are n h edges on the hth estimation path p i,j h , and there are n h weights w h,1 , w h,2 , …, w h,nh on the n h edges, 1 ≤ h ≤ r. If there is no estimation path between the vertices v i and v j , the probabilistic partitioning cut ppc i,j between the vertices v i and v j can be set as 0. On the other hand, if the vertex v i is the same as the vertex v j , the probabilistic partitioning cut ppc i,j between the vertices v i and v j can be set as ∞. Based on the uniform distribution in probability theory, the probabilistic partitioning cut ppc i,j between the vertices v i and v j on the r estimation paths p i,j 1 , p i,j 2 , …, p i,j r can be further computed. As a result, the matrix M PPC representing the probabilistic partitioning cuts in the graph G c can be obtained.
It is known that the larger the probabilistic partitioning cut between two vertices is, the larger the connection strength between two vertices is. To estimate the probabilistic connection strength between the vertices v i and v j given the upper bound of the maximum cut U MC as the weight sum of all the edges in the graph G c , the probabilistic connection strength pcs i,j between the vertices v i and v j in the graph G c can be defined as the ratio between the probabilistic partitioning cut between the vertices v i and v j and the upper bound of the maximum cut in the graph G c . If the vertex v i is the same as the vertex v j , the probabilistic connection strength pcs i,j between the vertices v i and v j can be set as 1. If the vertex v i is different from the vertex v j , the probabilistic connection strength pcs i,j between the vertices v i and v j can be set as ppc i,j /U MC . As a result, the matrix M PCS , representing the probabilistic connection strength in the graph G c , can be obtained.
Refer to the edge-weighted connection graph G c in Fig. 3, based on the extraction of the estimation paths between two vertices in the graph G c , the matrix M PPC , representing the probabilistic partitioning cuts in the graph G c , can be obtained, as illustrated in Fig. 5(a). Based on the edge weights in the graph G c , the upper bound of the maximum cut in the graph G c can be obtained as 23. Furthermore, based on the matrix representing the probabilistic partitioning cuts in the graph G c , the matrix M PCS , representing the probabilistic connection strength in the graph G c , can be obtained, as illustrated in Fig. 5(b).
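The computation can be sketched in Python under one explicit assumption. Since one cut edge is selected uniformly at random on each estimation path, a natural reading is that each path contributes the mean of its edge weights, that is, ppc i,j = Σ_{h=1..r} (1/n h ) Σ_{l=1..n h } w h,l . This form is an assumption made for illustration and may differ from the authors' displayed equation; the function names and the sample weights are likewise hypothetical.

```python
import math


def probabilistic_partitioning_cut(path_weights, same_vertex=False):
    """Assumed form: sum over estimation paths of the mean edge weight on each path.

    path_weights: list of lists, one list of edge weights per estimation path.
    """
    if same_vertex:
        return math.inf       # the text sets ppc to infinity for identical vertices
    # An empty list (no estimation path) gives 0, as stated in the text.
    return sum(sum(ws) / len(ws) for ws in path_weights)


def probabilistic_connection_strength(ppc, u_mc, same_vertex=False):
    """pcs = 1 for identical vertices, otherwise ppc / U_MC."""
    return 1.0 if same_vertex else ppc / u_mc


# Hypothetical estimation paths with edge weights [1], [2, 3], and [1, 1].
path_weights = [[1], [2, 3], [1, 1]]
u_mc = 8  # weight sum of all edges in the hypothetical graph
ppc = probabilistic_partitioning_cut(path_weights)
pcs = probabilistic_connection_strength(ppc, u_mc)
print(ppc, pcs)
```

Dividing by U MC keeps pcs i,j no larger than 1 for distinct vertices whenever the estimation paths are edge-disjoint, because the sum of the path means cannot exceed the total edge weight.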

2) INITIAL K-WAY PARTITIONING VIA BOTTOM-UP CLUSTERING
Given an edge-weighted connection graph G c (V c , E c ), V c = {v 1 , v 2 , …, v n } and a partitioning number k, k-way graph partitioning (KGP) can be defined by specifying the vertices in V c into k subsets V 1 , V 2 , …, V k . Furthermore, the KGP result can be represented by using k characteristic functions u i : V c → {0, 1} for the ith vertex subset V i , 1 ≤ i ≤ k, as follows:
$$u_i(v_j) = \begin{cases} 1, & \text{if } v_j \in V_i \\ 0, & \text{otherwise.} \end{cases}$$
As a result, a partitioning matrix can be used to represent the k characteristic functions for the result of the k partitions in the graph G c .
It is known that a better initial KGP result can reduce the convergence time in FKGC. In FKGC, an initial KGP result in the graph G c can be constructed by using one iterative bottom-up clustering process. Initially, each vertex v j in the graph G c can be treated as one partition P j , 1 ≤ j ≤ n. If the partition number is larger than k in the graph G c , the edge e i,j with the largest weight w(e i,j ) in the graph G c can be selected and the two partitions P i and P j connected by using the edge e i,j can be merged into one larger partition P (i,j) . After constructing the new partition P (i,j) , the graph G c must VOLUME 4, 2023 Engineering uantum Input: An edge-weighted connection graph G c with n vertices; A partitioning number k; Output: A set of k subgraphs G 1 , G 2 , …, G k for k partitions; Set an initial set of n subgraphs for n initial partitions P 1 , P 2 , and P n , where P i = {v i }, 1 ≤ i ≤ n; while (Partition number in G c is larger than k) Find one edge e i,j with the largest weight between two partitions P i and P j in G c ; Merge the two partitions P i and P j into one larger partition P (i , j) ; Modify the graph G c by using the partition P (i,j) as one new vertex and summing the weights on the corresponding merged edges; end while return A set of k subgraphs G 1 , G 2 , …, G k for k partitions; be modified by using the partition P (i,j) and summing the weights on the corresponding merged edges. Furthermore, the iterative clustering process will continue for the modified graph G c . Until the partition number is equal to k in the modified graph G c , the iterative clustering process will stop. As a result, a partitioning matrix M 0 can be used to represent the result of the k partitions in the graph G c for the initial KGP result using in FKGC.
Given an edge-weighted connection graph G c and a partitioning number k, the KGP result can be obtained by running the iterative bottom-up clustering algorithm, k-way graph partitioning via bottom-up clustering (KGPBUC).
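The bottom-up clustering process can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `kgpbuc`, the dictionary-based edge representation, and the deterministic tie-breaking rule are assumptions, and the small five-vertex graph is hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def kgpbuc(n: int, edges: Dict[Tuple[int, int], float], k: int) -> List[Set[int]]:
    """Merge the two partitions joined by the heaviest edge until k partitions remain."""
    parts: Dict[int, Set[int]] = {v: {v} for v in range(n)}
    # Edge weights between current partitions, keyed by the pair of partition ids.
    w: Dict[FrozenSet[int], float] = {}
    for (a, b), weight in edges.items():
        if a != b:
            key = frozenset((a, b))
            w[key] = w.get(key, 0.0) + weight
    # The loop also stops early if no edges remain (disconnected graph).
    while len(parts) > k and w:
        # Assumption: ties are broken toward the smallest ids to stay deterministic.
        key = max(w, key=lambda e: (w[e], -min(e), -max(e)))
        a, b = sorted(key)
        del w[key]
        parts[a] |= parts.pop(b)
        # Redirect every edge of b to a, summing the weights of parallel edges.
        for old in [e for e in w if b in e]:
            (other,) = old - {b}
            weight = w.pop(old)
            new = frozenset((a, other))
            w[new] = w.get(new, 0.0) + weight
    return list(parts.values())


# Hypothetical path graph on five vertices, partitioned two ways.
example_edges = {(0, 1): 5, (1, 2): 1, (2, 3): 4, (3, 4): 2}
result = kgpbuc(5, example_edges, 2)
```

In this example the merges happen in the order (0, 1), (2, 3), and then ((2, 3), 4), so the light edge between vertices 1 and 2 is the only one left in the cut.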
Referring to the edge-weighted connection graph G c in Fig. 3, if the partitioning number is given as 3, one iterative bottom-up clustering process can be used to construct three partitions in the graph G c . Initially, the eight vertices in the graph G c can be set as eight initial partitions. As illustrated in Fig. 6(a), in the first iteration, the vertices v 3 and v 7 , representing the two partitions P 3 and P 7 , can be merged into one new vertex (v 3 , v 7 ), representing the merged partition P (3,7) in the modified graph G c . In the second iteration, the vertices v 4 and v 8 , representing the two partitions P 4 and P 8 , can be merged into one new vertex (v 4 , v 8 ), representing the merged partition P (4,8) in the modified graph G c . In the third iteration, the vertices v 1 and (v 3 , v 7 ), representing the two partitions P 1 and P (3,7) , can be merged into one new vertex (v 1 , v 3 , v 7 ), representing the merged partition P (1,3,7) in the modified graph G c . In the fourth iteration, the vertices v 6 and (v 1 , v 3 , v 7 ), representing the two partitions P 6 and P (1,3,7) , can be merged into one new vertex (v 1 , v 3 , v 6 , v 7 ), representing the merged partition P (1,3,6,7) in the modified graph G c . In the fifth iteration, the vertices (v 4 , v 8 ) and (v 1 , v 3 , v 6 , v 7 ), representing the two partitions P (4,8) and P (1,3,6,7) , can be merged into one new vertex (v 1 , v 3 , v 4 , v 6 , v 7 , v 8 ), representing the merged partition P (1,3,4,6,7,8) in the modified graph G c . As a result, the eight vertices in the graph G c can be partitioned into three partitions {v 1 , v 3 , v 4 , v 6 , v 7 , v 8 }, {v 2 }, and {v 5 }, and the partitioning cut can be obtained as 9 in three-way graph partitioning.
After completing the iterative bottom-up clustering process, the partitioning matrix M 0 , representing the result of the three partitions {v 1 , v 3 , v 4 , v 6 , v 7 , v 8 }, {v 2 }, and {v 5 } in the graph G c , can be used as the initial partitioning result in FKGC, as illustrated in Fig. 6(b).

3) FKGC VIA PROBABILISTIC CONNECTION STRENGTH
Given an edge-weighted connection graph G c , the probabilistic connection strength pcs i,j between two vertices v i and v j , 1 ≤ i, j ≤ n, in the graph G c can be computed. Clearly, the probabilistic connection strength pcs i,j between two vertices v i and v j can further reflect the clustering distance d i,j between the two vertices in the graph G c : the larger the probabilistic connection strength pcs i,j is, the shorter the clustering distance between v i and v j is. Based on the probabilistic connection strength pcs i,j between two vertices v i and v j , the clustering distance d i,j between two vertices v i and v j in the graph G c can be defined as follows:

Based on the concept of fuzzy c-means clustering on the geometrical clustering distances, the FKGC in the graph G c on the defined clustering distance d i,j between two vertices v i and v j , 1 ≤ i, j ≤ n, can be designed as follows. Similar to the definition of the characteristic function for the ith vertex subset V i , 1 ≤ i ≤ k, in KGP, the characteristic function can be defined as a fuzzy function u i : V c → [0, 1] for the ith vertex set V i , 1 ≤ i ≤ k, in FKGC. In KGP, the restricted assignment of each vertex in the graph G c onto exactly one partition may lead to a nonbalanced partitioning result. In contrast, each vertex in the graph G c is described in FKGC by its degree of belonging to each partition, and these fuzzy memberships make it easier to satisfy specific size constraints. If there are k fuzzy functions u 1 , u 2 , …, u k associated with the vertex set V c in the graph G c , the partitioning result of each vertex v i in the vertex set V c can be represented by using the k fuzzy memberships u 1,i , u 2,i , …, u k,i onto k clusters in FKGC. Hence, the main purpose of partitioning the vertex set V c in FKGC is to find the k fuzzy memberships of each vertex in the vertex set V c onto k clusters.
Given the vertex set V c = {v 1 , v 2 , …, v n } in the graph G c , the k fuzzy functions of the vertices in the vertex set V c can be represented by a fuzzy matrix U ∈ M fkm , where M fkm is the set of all possible fuzzy matrices in FKGC. Given the clustering distances d i,j between two vertices v i and v j , 1 ≤ i, j ≤ n, in the graph G c , the classical squared-error function [28] can be formulated and used as the error function in FKGC as follows:

Basically, the objective function J can be treated as a squared-error criterion, and its minimization produces a fuzzy matrix U that is optimal in the least-squared-error sense. To approximately minimize the objective function J, the function can be minimized over the fuzzy memberships u i,j , 1 ≤ i ≤ k, 1 ≤ j ≤ n, and the centers vc i , 1 ≤ i ≤ k, by using an iterative improvement algorithm. In the iterative improvement algorithm, based on the optimality analysis of fuzzy c-means clustering [29], [30] on the geometrical clustering distances, the objective function J under some constraints can be optimized by alternating between two partial optimizations: the modification of the fuzzy memberships u i,j , 1 ≤ i ≤ k, 1 ≤ j ≤ n, and the selection of the k centers vc i , 1 ≤ i ≤ k. In the modification of the fuzzy memberships, the center vector vc in the function J must be fixed, and the necessary condition of the fuzzy matrix M = {u i,j } for minimizing the function J can be found by the Lagrange multiplier method. In the selection of the k centers, the fuzzy matrix M = {u i,j } in the function J must be fixed, and the center vector vc that minimizes the function J can be found by an exhaustive search. As a result, a final fuzzy k-means matrix M can be used to represent the fuzzy memberships of the k partitions in the graph G c .
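For orientation, the textbook form of the fuzzy c-means objective and of the membership condition obtained by the Lagrange multiplier method is reproduced below. This is the generic formulation only: the fuzzifier m > 1 and the notation d(vc_i, v_j) for the clustering distance between the ith center and the vertex v_j are assumptions, and the exact expressions used in this article may differ.

```latex
% Generic fuzzy c-means objective (fuzzifier m > 1 is an assumption)
J_m(M, vc) = \sum_{i=1}^{k} \sum_{j=1}^{n} u_{i,j}^{\,m} \, d(vc_i, v_j)^2,
\qquad \text{subject to } \sum_{i=1}^{k} u_{i,j} = 1, \quad 1 \le j \le n.

% Necessary condition on the memberships for fixed centers,
% valid when d(vc_l, v_j) > 0 for all l
u_{i,j} = \left[ \sum_{l=1}^{k}
  \left( \frac{d(vc_i, v_j)}{d(vc_l, v_j)} \right)^{2/(m-1)} \right]^{-1},
\qquad 1 \le i \le k, \ 1 \le j \le n.
```

With the centers fixed, the second expression gives larger memberships to clusters whose centers are closer to the vertex, and the memberships of each vertex sum to 1 by construction.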
Table 1 presents the symbols and notations in the iterative improvement algorithm ε-FKGC.
The iterative improvement algorithm ε-FKGC takes as its inputs the clustering distances d i,j between two vertices v i and v j , 1 ≤ i, j ≤ n, in the graph G c , the KGP matrix M 0 , representing the initial k-way partitioning result of all the vertices in the graph G c , and an acceptable error ε. By alternately applying the necessary condition of the fuzzy matrix M and the selection of the center vector vc to the objective function J, the algorithm ε-FKGC obtains an ε-approximate fuzzy matrix M in FKGC.
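The alternating structure of such an iterative improvement can be sketched as follows. This is a minimal sketch under stated assumptions rather than the ε-FKGC algorithm itself: it assumes the textbook fuzzy c-means membership update with fuzzifier 2, restricts centers to vertices chosen by exhaustive search, and stops when no membership changes by more than ε. The function name `fkgc`, its parameters, and the four-vertex distance matrix are hypothetical.

```python
from typing import List, Sequence, Tuple


def fkgc(dist: Sequence[Sequence[float]], m0: Sequence[Sequence[float]],
         eps: float = 1e-3, fuzzifier: float = 2.0,
         max_iter: int = 1000) -> Tuple[List[List[float]], List[int]]:
    """Alternate center selection and membership updates until changes fall below eps."""
    k, n = len(m0), len(dist)
    u = [[float(x) for x in row] for row in m0]
    centers: List[int] = []
    power = 2.0 / (fuzzifier - 1.0)
    for _ in range(max_iter):
        # Center step: exhaustive search over all vertices for each cluster.
        centers = [
            min(range(n),
                key=lambda c: sum(u[i][j] ** fuzzifier * dist[c][j] ** 2
                                  for j in range(n)))
            for i in range(k)
        ]
        # Membership step: textbook fuzzy c-means update for fixed centers.
        new_u = [[0.0] * n for _ in range(k)]
        for j in range(n):
            d = [dist[c][j] for c in centers]
            zeros = [i for i in range(k) if d[i] == 0.0]
            if zeros:
                # A vertex that coincides with a center belongs fully to it.
                for i in zeros:
                    new_u[i][j] = 1.0 / len(zeros)
            else:
                for i in range(k):
                    new_u[i][j] = 1.0 / sum((d[i] / d[l]) ** power for l in range(k))
        change = max(abs(new_u[i][j] - u[i][j]) for i in range(k) for j in range(n))
        u = new_u
        if change < eps:
            break
    return u, centers


# Hypothetical distances: four vertices on a line at positions 0, 1, 10, 11.
pos = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in pos] for a in pos]
u, centers = fkgc(dist, [[1, 1, 0, 0], [0, 0, 1, 1]], eps=0.001)
```

Starting from a crisp initial matrix, the sketch converges in two rounds on this example because the centers stop changing after the first round, which illustrates why a good initial partition shortens the convergence time.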
Referring to the matrix M PCS , representing the probabilistic connection strength in the graph G c , in Fig. 5(b) and the KGP matrix M 0 for three-way graph partitioning in Fig. 6(b), the clustering distance d i,j between two vertices v i and v j , 1 ≤ i, j ≤ n, in the graph G c can be computed from its definition. By using the KGP matrix M 0 as the initial fuzzy matrix and iteratively improving the fuzzy memberships and the selection of the three centers in fuzzy 3-means graph clustering, as illustrated in Fig. 7(a), the final ε-approximate fuzzy matrix M of the fuzzy 3-means graph clustering can be obtained after completing the algorithm ε-FKGC with ε = 0.001, as illustrated in Fig. 7(b).

C. VERTEX ASSIGNMENT IN K-WAY (δ, γ)-BALANCED PARTITIONING
For the vertex assignment in k-way (δ, γ)-balanced partitioning, based on the fuzzy memberships of the n vertices on the k clusters in a final fuzzy matrix M = {u i,j }, 1 ≤ i ≤ k, 1 ≤ j ≤ n, the maximum capacity δ, and the maximum size tolerance γ, the vertex assignment can be divided into two sequential steps: initial assignment and iterative size modification.
In the initial assignment of all the vertices, the assignment of the n vertices v 1 , v 2 , …, v n in the graph G c is based on the largest fuzzy membership of each vertex v j , 1 ≤ j ≤ n, on the k clusters. If the largest fuzzy membership of the vertex v j , 1 ≤ j ≤ n, in the graph G c is the fuzzy membership u i,j on the ith cluster, 1 ≤ i ≤ k, the vertex v j can be directly assigned onto the ith cluster. As a result, the n vertices v 1 , v 2 , …, v n in the graph G c can be partitioned into the k partitions P 1 , P 2 , …, P k . If the number of vertices inside each partition satisfies the given capacity constraint δ and the maximum size difference between two partitions satisfies the given size-tolerance constraint γ, then the initial k-way partitioning result can be taken as the final k-way (δ, γ)-balanced partitioning result and the partitioning cut in k-way (δ, γ)-balanced partitioning can be computed. On the other hand, if the maximum size inside any partition does not satisfy the given capacity constraint δ or the maximum size difference between two partitions does not satisfy the size-tolerance constraint γ, then the iterative size modification must be applied to the k partitions P 1 , P 2 , …, P k obtained in the initial assignment.
The iterative size modification for the k partitions P 1 , P 2 , …, P k is based on the reassignment of vertices from some larger partitions to the smallest partition. Initially, each vertex v j , 1 ≤ j ≤ n, in the graph G c can be set as one original vertex inside its partition. In each iteration, first, the qth partition P q with the smallest size can be found from the k partitions P 1 , P 2 , …, P k , and the original vertices inside the other larger partitions can be treated as the selected vertices in the modification process. For any selected vertex v j inside the ith partition P i , the membership difference of the vertex v j in the modification process can be computed as (u i,j − u q,j ). Furthermore, the selected vertex v j with the smallest membership difference can be reassigned into the qth partition P q , and the vertex v j can be set as one reassigned vertex in the modification process. The iterative size modification for the k partitions P 1 , P 2 , …, P k stops once the number of vertices inside each partition satisfies the given capacity constraint δ and the maximum size difference between two partitions satisfies the given size-tolerance constraint γ. As a result, the k-way partitioning result can be obtained as the k-way (δ, γ)-balanced partitioning result and the partitioning cut in k-way (δ, γ)-balanced partitioning can be computed.

Algorithm KBP
Input: A fuzzy matrix M with fuzzy memberships u i,j on k clusters for an edge-weighted graph G c ; maximum capacity δ inside each partition; maximum size tolerance γ between two partitions;
Output: k sets of vertices V 1 , V 2 , …, V k for k partitions P 1 , P 2 , …, P k , and the partitioning cut in k-way (δ, γ)-balanced partitioning;
Initialize k sets of vertices, V 1 = V 2 = … = V k = ∅;
for j = 1 to n
    if (the largest fuzzy membership of the vertex v j is the fuzzy membership u i,j on the ith cluster) then
        V i = V i ∪ {v j };
    end if
end for
if (the number of vertices inside any partition is larger than δ or the size difference between any pair of two partitions is larger than γ) then
    Set each vertex in G c as one original vertex inside its partition;
    while (the maximum size inside any partition is larger than δ or the maximum size difference between two partitions is larger than γ)
        Find the qth partition P q with the smallest size;
        Set the original vertices inside the other larger partitions as the selected vertices;
        Find the selected vertex v j with the smallest membership difference from the partition P i to the partition P q ;
        Reassign the selected vertex v j onto the partition P q and set it as one reassigned vertex;
    end while
end if
Set the k partitions P 1 , P 2 , …, P k as the k-way (δ, γ)-balanced partitioning result;
Compute the partitioning cut in k-way (δ, γ)-balanced partitioning;
return k sets of vertices V 1 , V 2 , …, V k for k partitions P 1 , P 2 , …, P k and the partitioning cut in k-way (δ, γ)-balanced partitioning;
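The two steps of the vertex assignment can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function name `kbp`, the list-of-rows membership matrix, the tie-breaking toward the lowest index, and the error raised when no candidate vertex remains are all choices made only for this illustration, and the membership values in the example are hypothetical.

```python
from typing import List, Sequence, Set


def kbp(u: Sequence[Sequence[float]], delta: int, gamma: int) -> List[Set[int]]:
    """Assign vertices by largest membership, then rebalance toward the smallest partition."""
    k, n = len(u), len(u[0])
    # Initial assignment: each vertex goes to the cluster with its largest membership.
    assign = [max(range(k), key=lambda i: u[i][j]) for j in range(n)]
    sizes = [assign.count(i) for i in range(k)]
    moved: Set[int] = set()  # vertices already reassigned are no longer "original"
    while max(sizes) > delta or max(sizes) - min(sizes) > gamma:
        q = min(range(k), key=lambda i: sizes[i])  # smallest partition
        candidates = [j for j in range(n)
                      if j not in moved and sizes[assign[j]] > sizes[q]]
        if not candidates:
            raise RuntimeError("constraints cannot be met by this procedure")
        # Pick the vertex that loses the least membership by moving to partition q.
        j = min(candidates, key=lambda v: u[assign[v]][v] - u[q][v])
        sizes[assign[j]] -= 1
        sizes[q] += 1
        assign[j] = q
        moved.add(j)
    return [{j for j in range(n) if assign[j] == i} for i in range(k)]


# Hypothetical memberships: all four vertices initially prefer cluster 0.
u_example = [[0.9, 0.8, 0.6, 0.55],
             [0.1, 0.2, 0.4, 0.45]]
balanced = kbp(u_example, delta=2, gamma=0)
```

In the example, vertices 3 and 2 are moved in that order because they have the smallest membership differences (about 0.10 and 0.20), which yields two partitions of size 2. Because each vertex is moved at most once, the loop performs at most n reassignments.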
Based on the fuzzy memberships u i,j , 1 ≤ i ≤ k, 1 ≤ j ≤ n, of all the n vertices v 1 , v 2 , …, v n on the k clusters in a final ε-approximate fuzzy matrix M = {u i,j }, the maximum capacity δ, and the maximum size tolerance γ, the k-way (δ, γ)-balanced partitioning result can be obtained by using the algorithm KBP. Referring to the final fuzzy matrix M for fuzzy 3-means graph clustering in Fig. 7(b), based on the fuzzy memberships of all the eight vertices v 1 , v 2 , v 3 , v 4 , v 5 , v 6 , v 7 , and v 8 on the three clusters in the final fuzzy matrix M, the three partitions P 1 , P 2 , and P 3 of the eight vertices can be obtained as {v 1 , v 2 , v 5 , v 6 }, {v 3 , v 7 }, and {v 4 , v 8 } in the initial assignment. If the maximum capacity δ is given as 4 and the maximum size tolerance γ is given as 2, it is clear that the 3-way partitioning result satisfies the capacity constraint δ = 4 and the size-tolerance constraint γ = 2 in 3-way (4, 2)-balanced partitioning. Based on the three partitions P 1 = {v 1 , v 2 , v 5 , v 6 }, P 2 = {v 3 , v 7 }, and P 3 = {v 4 , v 8 } of the eight vertices surrounded by three red regions, as illustrated in Fig. 8(a), it is clear that a set of eight quantum bits {q 1 , q 2 , q 3 , q 4 , q 5 , q 6 , q 7 , q 8 } can be partitioned into three subsets {q 1 , q 2 , q 5 , q 6 }, {q 3 , q 7 }, and {q 4 , q 8 }, and the communication cost in 3-way (4, 2)-balanced partitioning can be obtained as 10 in the DQC design.
On the other hand, if the maximum capacity δ is given as 3 and the maximum size tolerance γ is given as 1, the initial three partitions P 1 = {v 1 , v 2 , v 5 , v 6 }, P 2 = {v 3 , v 7 }, and P 3 = {v 4 , v 8 } of the eight vertices cannot satisfy the capacity constraint δ = 3 and the size-tolerance constraint γ = 1. Hence, the iterative size modification must be used for the initial 3-way partitioning result.
In the iterative size modification, the smallest partition P 2 can be selected and the four vertices v 1 , v 2 , v 5 , and v 6 inside the partition P 1 can be treated as the selected vertices in the first iteration. In this iteration, the membership differences of the four vertices v 1 , v 2 , v 5 , and v 6 from the partition P 1 to the partition P 2 can be obtained as 1, 0.023, 0.041, and 0.016, respectively. Hence, the vertex v 6 with the smallest membership difference can be reassigned onto the partition P 2 , after which the capacity constraint δ = 3 and the size-tolerance constraint γ = 1 in 3-way (3, 1)-balanced partitioning are satisfied. Based on the three partitions P 1 = {v 1 , v 2 , v 5 }, P 2 = {v 3 , v 6 , v 7 }, and P 3 = {v 4 , v 8 } of the eight vertices surrounded by three red regions, as illustrated in Fig. 8(b), it is clear that a set of eight quantum bits {q 1 , q 2 , q 3 , q 4 , q 5 , q 6 , q 7 , q 8 } can be partitioned into three subsets {q 1 , q 2 , q 5 }, {q 3 , q 6 , q 7 }, and {q 4 , q 8 }, and the communication cost in 3-way (3, 1)-balanced partitioning can be obtained as 12 in the DQC design.
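The arithmetic of this example can be replayed in a few lines. The partition sizes and the four membership differences are taken from the text; the helper name `is_balanced` and the dictionary layout are assumptions made only for this illustration.

```python
def is_balanced(sizes, delta, gamma):
    """Check the capacity constraint and the size-tolerance constraint."""
    return max(sizes) <= delta and max(sizes) - min(sizes) <= gamma


# Initial assignment from the example.
partitions = {
    "P1": ["v1", "v2", "v5", "v6"],
    "P2": ["v3", "v7"],
    "P3": ["v4", "v8"],
}
sizes = [len(p) for p in partitions.values()]
ok_4_2 = is_balanced(sizes, delta=4, gamma=2)  # sizes 4, 2, 2 are accepted
ok_3_1 = is_balanced(sizes, delta=3, gamma=1)  # size 4 exceeds 3, gap 2 exceeds 1

# Membership differences from P1 to P2 quoted in the text.
diffs = {"v1": 1.0, "v2": 0.023, "v5": 0.041, "v6": 0.016}
moved = min(diffs, key=diffs.get)
partitions["P1"].remove(moved)
partitions["P2"].append(moved)
sizes_after = [len(p) for p in partitions.values()]
ok_after = is_balanced(sizes_after, delta=3, gamma=1)
```

A single reassignment is enough here because the sizes become 3, 3, and 2, which meet both the capacity constraint and the size-tolerance constraint.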

D. ANALYSIS OF TIME COMPLEXITY
In k-way (δ, γ)-balanced partitioning, the process of partitioning a given QC into k partitions in a DQC design can be divided into three sequential steps: construction of an edge-weighted connection graph, generation of a fuzzy matrix in FKGC, and vertex assignment in k-way (δ, γ)-balanced partitioning.
For the construction of an edge-weighted connection graph, based on the communication relation inside each gate by scanning all the gates in a given QC, the time complexity of constructing an edge-weighted connection graph is O(n+m), where n is the number of qubits and m is the number of gates in a given QC.
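The single scan over the gates can be sketched as follows. The weighting rule is an assumption for illustration only: each two-qubit gate adds 1 to the edge between its two qubits, and other gates are ignored. The function names `build_connection_graph` and `cut_weight` and the small gate list are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def build_connection_graph(gates: Sequence[Tuple[int, ...]]) -> Dict[Tuple[int, int], int]:
    """Scan the gate list once and count two-qubit interactions per qubit pair."""
    weights: Dict[Tuple[int, int], int] = defaultdict(int)
    for qubits in gates:
        # Assumption: only two-qubit gates contribute an edge of weight 1.
        if len(qubits) == 2 and qubits[0] != qubits[1]:
            a, b = sorted(qubits)
            weights[(a, b)] += 1
    return dict(weights)


def cut_weight(weights: Dict[Tuple[int, int], int], part_of: Sequence[int]) -> int:
    """Sum the weights of the edges whose endpoints lie in different partitions."""
    return sum(w for (a, b), w in weights.items() if part_of[a] != part_of[b])


# Hypothetical circuit on four qubits: four two-qubit gates and one single-qubit gate.
gates = [(0, 1), (1, 0), (1, 2), (2, 3), (0,)]
graph = build_connection_graph(gates)
cut = cut_weight(graph, part_of=[0, 0, 1, 1])
```

The construction touches each gate once, which is consistent with a cost linear in the number of gates plus the number of qubits under this simplified weighting.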
For the generation of a fuzzy matrix in FKGC, first, the time complexity of constructing the estimation paths between two vertices in an edge-weighted connection graph by using the maximum-weight shortest-path algorithm is O(n^4) and the time complexity of computing the probabilistic connection strength between two vertices in an edge-weighted connection graph is O(n^4). Furthermore, the time complexity of constructing the initial KGP result by using the algorithm KGPBUC in the connection graph is O(n^2). Next, based on the matrix representing the probabilistic connection strength in the connection graph, the time complexity of computing the clustering distance between two vertices in the connection graph is O(n^2). Finally, given an error tolerance ε on the fuzzy memberships, the ε-approximate fuzzy matrix M can be obtained by running the algorithm ε-FKGC. In the iterative improvement of the fuzzy memberships, the number of iterations depends strongly on the number of vertices n in the connection graph and the error tolerance ε: the larger the number of vertices or the smaller the error tolerance ε is, the larger the number of iterations is. However, the number of iterations cannot be expressed as a polynomial function of the two variables n and ε. Hence, the number of iterations is modeled and estimated as a function f(n, ε) of the two variables n and ε. In each improvement, the time complexity of finding a new center vector on the k clusters in ε-FKGC is O(n^2) and the time complexity of finding a new ε-approximate fuzzy matrix in ε-FKGC is O(n^2). Clearly, the time complexity of completing the algorithm ε-FKGC is O(n^2 f(n, ε)). Hence, the time complexity of generating an ε-approximate fuzzy matrix in FKGC is O(n^4 + n^2 f(n, ε)).
For the vertex assignment in k-way (δ, γ)-balanced partitioning, first, based on the fuzzy memberships in the ε-approximate fuzzy matrix M, the maximum capacity δ, and the maximum size tolerance γ, the time complexity of constructing k partitions in the initial assignment is O(n). Based on the assignment result of the k partitions in the initial assignment, the maximum capacity δ, and the maximum size tolerance γ, the time complexity of reassigning some vertices inside the k partitions in the iterative size modification is O(n). Hence, the time complexity of completing the vertex assignment in k-way (δ, γ)-balanced partitioning is O(n).
To sum up, the time complexity of completing the partitioning process in k-way (δ, γ)-balanced partitioning under the capacity constraint δ and the size-tolerance constraint γ is O(m + n^4 + n^2 f(n, ε)), where m is the number of gates and n is the number of qubits in a given QC.

IV. EXPERIMENTAL RESULTS
For k-way (δ, γ)-balanced partitioning in a DQC design, the proposed fuzzy-based partitioning algorithm was implemented in standard C++, compiled with gcc 4.2.4, and run on an Intel Core i7-7700HQ CPU 3.80 GHz machine with 16 GB of memory. In the experiments, eight tested circuits, Circuit01, Circuit02, Circuit03, Circuit04, Circuit05, Circuit06, Circuit07, and Circuit08, were generated by combining reversible circuits from the online resource RevLib [31]. It is clear that the smaller the error tolerance ε in FKGC is, the higher the identification degree of the fuzzy memberships inside the final ε-approximate fuzzy matrix is and the longer the running time in KBP is. To keep the running time in KBP reasonable, the error tolerance ε in FKGC was set to 0.001.
Basically, the convergence time of generating a final ε-approximate fuzzy matrix in FKGC depends strongly on the initial fuzzy matrix used. It is known that a better initial fuzzy matrix can lead to a better ε-approximate fuzzy matrix in FKGC. Because an edge with the largest weight is randomly selected in the construction of an initial partitioning result, the ε-approximate fuzzy matrix may not be unique. Hence, a set of ε-approximate fuzzy matrices in FKGC can be obtained by using a set of initial partitioning results. In this article, ten initial partitioning results are used to find ten ε-approximate fuzzy matrices in FKGC, and the ε-approximate fuzzy matrix with the least running time is taken as the final fuzzy matrix in FKGC for each tested circuit.
To compare the communication cost in k-way (δ, γ)-balanced partitioning for a DQC design, Daei's recursive KL-based algorithm [21] and a modified partitioning algorithm derived from Dadkhah's partitioning algorithm [22], which incorporates a genetic algorithm, were also implemented for the eight tested circuits, with the capacity constraint inside a partition set as n/k + 1 or n/k + 2 and the size-tolerance constraint between two partitions set as 3. Because of the bipartitioning feature of the KL-based algorithm, the value of k must be restricted to k = 2^p in Daei's recursive KL-based algorithm in k-way (δ, γ)-balanced partitioning, where p is the number of recursions. In Tables 2 and 3, "#qubits" denotes the number of qubits in a tested circuit, "#gates" denotes the number of gates in a tested circuit, "k" denotes the partitioning number in a tested circuit, "δ" denotes the maximum capacity inside each partition in KBP, "γ" denotes the maximum size tolerance between two partitions in KBP, "#CC" denotes the communication cost in KBP for a tested circuit, and "CPU Time" denotes the execution time for a tested circuit.
It is known that a good initial partitioning result can shorten the convergence time of a fuzzy matrix in FKGC. To measure the performance of our proposed initial partitioning result in k-way (δ, γ)-balanced partitioning, the fuzzy-based partitioning algorithm using a random initial partitioning result and the fuzzy-based partitioning algorithm using our proposed initial partitioning result were implemented and compared in the first experiment. For the eight tested circuits, the experimental results of the two variants in k-way (δ, γ)-balanced partitioning for a DQC design are listed in Table 2. Two different capacity constraints, δ = n/k + 1 and δ = n/k + 2, are used in the experiment. Compared with the fuzzy-based partitioning algorithm using a random initial partitioning result with two different size-tolerance constraints γ = 1 and γ = 2 in 3-way, 4-way, or 5-way balanced partitioning, the experimental results show that the fuzzy-based partitioning algorithm using our proposed initial partitioning result with the two size-tolerance constraints γ = 1 and γ = 2 can reduce the CPU time by 16.0% and 17.3%, respectively, on average over the eight tested circuits while obtaining the same communication cost.
To measure the communication cost of the fuzzy-based partitioning algorithm in k-way (δ, γ)-balanced partitioning, Daei's recursive KL-based algorithm [21], the modified partitioning algorithm from Dadkhah's partitioning algorithm [22], and the proposed fuzzy-based partitioning algorithm in k-way (δ, γ)-balanced partitioning were implemented and compared in the second experiment. For the eight tested circuits in the second experiment, the experimental results of the three algorithms are listed in Table 3. Two different capacity constraints, δ = n/k + 1 and δ = n/k + 2, are used in the experiment. Compared with Daei's recursive KL-based algorithm [21] in 4-way balanced partitioning, the experimental results show that the proposed fuzzy-based partitioning algorithm with three different size-tolerance constraints γ = 1, γ = 2, and γ = 3 can use 58.3%, 61.3%, and 64.5% of CPU time to reduce 16.1%, 21.2%, and 24.6% of the communication cost for the eight tested circuits on the average, respectively. Compared with the modified partitioning algorithm from Dadkhah's partitioning algorithm [22] in 3-way, 4-way, or 5-way balanced partitioning, the experimental results show that the proposed fuzzy-based partitioning algorithm with the size-tolerance constraint γ = 3 can use 35.0% of CPU time to reduce 11.1% of the communication cost for the eight tested circuits on the average.

V. CONCLUSION
Given a large QC in a DQC design, first, an edge-weighted connection graph can be constructed from the gates in the given QC. Furthermore, based on the edge connections in the connection graph, a given partitioning number k, and an error tolerance ε, the probabilistic connection strength between two vertices in the connection graph can be estimated and the initial KGP result via bottom-up clustering can be obtained. Based on the definition of the clustering distance between two vertices in the connection graph, the ε-approximate fuzzy matrix in FKGC can be obtained. Finally, given the maximum capacity δ inside each partition and the maximum size tolerance γ between two partitions, all the vertices in the connection graph can be assigned onto k partitions to minimize the communication cost in k-way (δ, γ)-balanced partitioning for a DQC design.
In future work, the communication cost between two qubits should be analyzed according to the characteristics of the gates used in a DQC design. In addition, the noise effect during the execution of quantum gates can be further considered in a DQC design.