Quantum Hierarchical Agglomerative Clustering Based on One Dimension Discrete Quantum Walk with Single-Point Phase Defects

Abstract: As an important branch of machine learning, clustering analysis is widely used in fields such as image pattern recognition, social network analysis, and information security. In this paper, we consider the design of clustering algorithms in the quantum setting and propose a quantum hierarchical agglomerative clustering algorithm based on the one-dimensional discrete quantum walk with single-point phase defects. The proposed algorithm exploits two nonclassical features of this kind of quantum walk: localization and the ballistic effect. First, each data point is viewed as a particle that performs this quantum walk with a parameter determined by its neighbors. The particle is then measured in the computational basis, and every attribute value of the corresponding data point is modified appropriately according to the measurement result. In this way, each data point interacts with its neighbors and moves toward a certain center point. Finally, this process is repeated several times until similar data points cluster together and form distinct classes. Simulation experiments on synthetic and real-world data demonstrate the effectiveness of the presented algorithm, which achieves better clustering results than several classical algorithms. Moreover, by incorporating the quantum cluster assignment method, the presented algorithm can further speed up the computation.


Introduction
As one of the most important fields in modern physics, quantum mechanics has not only changed the way we understand the physical world, but also provided new methods for solving problems in the field of information technology [Liu, Xu, Yang et al. (2019); Jiang, Wang, Liang et al. (2020); Lin, Guo, Huang et al. (2016)]. After the pioneering works of Shor and Grover, various properties of quantum mechanics were utilized to design many subtle quantum algorithms [Montanaro (2016)], which can be exponentially faster than their classical counterparts. Machine learning enables computers to learn hidden patterns from a data set and has a large variety of applications, such as image analysis, information retrieval, and bioinformatics. However, if the size of the data set and/or the dimension of the data points is large, machine learning frequently requires considerable time and computational resources; in the age of big data, this problem becomes more and more serious. Since quantum speed-up may be a good solution to this problem, quantum machine learning has recently been proposed and has drawn a lot of attention [Biamonte, Wittek, Pancotti et al. (2017); Harrow, Hassidim and Lloyd (2009)]. In 2013, Lloyd et al. [Lloyd, Mohseni and Rebentrost (2013)] proposed two quantum machine learning algorithms: one is a supervised cluster assignment algorithm, and the other is a cluster-finding algorithm used to obtain suitable seeds for the quantum k-means algorithm. Both offer an exponential speed-up over their classical counterparts. Later, Cai et al. [Cai, Wu, Su et al. (2015)] implemented them experimentally on a small-scale photonic quantum computer. Besides that, some subtle quantum algorithms for machine learning have been put forward, e.g., the quantum support vector machine [Li, Liu, Xu et al. (2015)], the quantum decision tree [Lu and Braunstein (2014)], quantum nearest-neighbor algorithms [Wiebe, Kapoor and Svore (2015)], quantum principal component analysis [Yu, Gao, Lin et al. (2019)], quantum deep learning [Wiebe, Kapoor and Svore (2014)], quantum association rules mining [Yu, Gao, Wang et al. (2016)], quantum clustering [Aïmeur, Brassard and Gambs (2013); Li, He and Jiang (2011)], and so on. Clustering analysis is an essential tool for knowledge discovery and has become a major branch of machine learning [Chen, Xiong, Xu et al. (2019); Zhou, Tan, Yu et al. (2019); Xiang, Shen, Qin et al. (2019)]. Over the past decades, various subtle clustering algorithms have been proposed from different points of view. In this paper, we consider hierarchical agglomerative clustering (HAC), one of the main kinds of clustering algorithms, in the quantum setting. Two features of quantum walks, localization and the ballistic effect, are utilized to design a quantum hierarchical agglomerative clustering algorithm. In this algorithm, each data point is represented by a particle that first performs a one-dimensional discrete quantum walk with single-point phase defects. Here, the phase defect, which can govern the localization effect, is determined by the local density of the corresponding data point. Then, the particle is measured in the computational basis, yielding a random outcome, on the basis of which every attribute of the data point is modified appropriately. After executing this process several times, the data points cluster together and are divided into several classes. Numerical simulations show the effectiveness and efficiency of the proposed quantum HAC algorithm. The remainder of this paper is organized as follows. In Section 2, we briefly review the essential preliminaries, i.e., the classical hierarchical agglomerative clustering algorithm and the discrete quantum walk with single-point phase defects.
Then, a quantum hierarchical agglomerative clustering algorithm with HWSPPD is described in Section 3. Its numerical simulations and experimental evaluation are presented in Section 4. Finally, a short conclusion is provided in Section 5.

Hierarchical agglomerative clustering
As compared to supervised machine learning algorithms, clustering is unsupervised, which means that the training data points are unlabeled. In general, the goal of clustering analysis is to classify the data points into categories on the basis of their similarity, namely, to group the data points such that the intra-cluster similarity is maximized and the inter-cluster similarity is minimized. A general mathematical representation of cluster analysis is as follows. Given a set of data points X = {X_1, X_2, ..., X_N}, the clustering algorithm classifies these elements into l subsets, denoted by {C_1, C_2, ..., C_l}, which satisfy C_i ≠ ∅, C_i ∩ C_j = ∅ for i ≠ j, and C_1 ∪ C_2 ∪ ... ∪ C_l = X, where ∅ represents the empty set. Hierarchical clustering is a traditional clustering approach that seeks to build a hierarchy of clusters. Generally, it is divided into two types: agglomerative and divisive. In standard hierarchical agglomerative clustering (HAC), each data point is initially treated as a singleton cluster. Afterward, the distances between every pair of clusters are calculated, and the two closest clusters are merged into one. This process is repeated until all data points belong to a single cluster or a certain termination condition is satisfied; finally, a hierarchy of clusters is built. Since any valid measure of distance can be used in HAC, it has been extensively applied in data mining and statistics. However, this clustering algorithm is time-consuming. Suppose that a data set has N data points and the dimension of each point is d. A simple calculation shows that its complexity is O(N^3), which implies that HAC is too slow for large data sets. Therefore, the main limitation of HAC is its high computational complexity. In this paper, we try to overcome this obstacle by proposing a quantum counterpart, in which quantum techniques are utilized to speed up the calculation.
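To make the cost of the classical procedure concrete, the following minimal sketch implements naive agglomerative clustering with single-linkage distance (one of the valid distance measures mentioned above). The repeated pairwise scans at every merge are what drive the cubic-order running time. Python/NumPy is used here purely for illustration; the paper's own experiments are in MATLAB.

```python
import numpy as np

def hac(points, n_clusters):
    """Naive hierarchical agglomerative clustering (single linkage).

    Starts with one singleton cluster per point and repeatedly merges
    the two closest clusters until n_clusters remain. The nested scans
    over all cluster pairs at every merge illustrate why naive HAC is
    of cubic order in the number of points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best, best_d = (0, 1), float("inf")
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: minimum pairwise distance between clusters
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best_d:
                    best_d, best = d, (a, b)
        a, b = best
        clusters[a].extend(clusters[b])   # merge the two closest clusters
        del clusters[b]
    return clusters

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(hac(pts, 2))  # two well-separated pairs -> [[0, 1], [2, 3]]
```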

Discrete quantum walk with single-point phase defects
As a quantum mechanical analog of the classical random walk, the quantum walk [Venegas-Andraca (2012)] has attracted a great deal of interest recently. It exhibits some distinct features that can be used to design new algorithms for quantum computers, e.g., quantum search algorithms. In the standard model, a discrete quantum walk on an infinite line is described in a Hilbert space H consisting of two subspaces, i.e., H = H_c ⊗ H_p. One is the two-dimensional coin space H_c, whose computational basis {|0⟩, |1⟩} corresponds to the two possible directions of movement, rightward and leftward. The other is the position space H_p, spanned by the orthogonal position vectors {|x⟩ | x ∈ Z}, where Z denotes the set of integers. Hence, an orthonormal basis of the whole quantum system is {|c, x⟩ = |c⟩ ⊗ |x⟩ | x ∈ Z, c = 0, 1}. The movement of the walker at each step is determined by the result of a coin flip, implemented by a unitary operation C ∈ U(2), after which a conditional position shift operation S is performed. The whole one-step evolution can thus be written as U = S(C ⊗ I), with S = Σ_{c=0,1} |c⟩⟨c| ⊗ T_c, where T_c|x⟩ = |x + (−1)^c⟩. In this paper, we adopt a common quantum walk, the Hadamard walk, in which C is the Hadamard operator H = (1/√2)[[1, 1], [1, −1]], and the conditional position shift acts as S|c, x⟩ = |c, x + (−1)^c⟩. Given that the walker starts from the origin, the whole system is in an initial state |ψ(0)⟩ = (1/√2)(|0⟩ + i|1⟩) ⊗ |x = 0⟩, and after t steps the final state is |ψ(t)⟩ = U^t|ψ(0)⟩. The resulting position distribution [Fig. 1(a)] shows that the movement of the quantum walker is ballistic. This phenomenon does not exist in the classical random walk, whose distribution is a Gaussian centered at the origin. Besides ballistic diffusion, the quantum walk has another nonclassical feature, localization, which has been found in some quantum walks [Schreiber, Cassemiro, Potocek et al. (2011)]. In 2012, Wojcik et al. [Wójcik, Łuczak, Kurzyński et al. (2012)] showed that the localization effect can be obtained by changing the phase at a single point in discrete quantum walks that otherwise exhibit no localization. In a discrete quantum walk with single-point phase defects, the phase of the particle is modified each time it passes through a designated position, e.g., x = 0. Later, this issue was studied in depth, with great progress in both theory and experiment [Zhang, Xue and Twamley (2014); Xue, Qin and Tang (2015)]. Considering the Hadamard walk depicted above, a single-point phase shift θ ∈ (0, 2π] is applied at the origin; that is, the shift operator S is replaced by S_θ|c, x⟩ = e^{iθδ_{x,0}}|c, x + (−1)^c⟩. In this way, a new quantum walk model is obtained (called HWSPPD, the Hadamard walk with single-point phase defects). The localization effect of this quantum walk can be observed in Fig. 1(b).
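The two regimes can be reproduced with a short simulation. The sketch below follows the conventions above (coin flip, then an extra phase e^{iθ} whenever the walker sits at the origin, then the conditional shift); the symmetric initial coin state is a common choice and is assumed here. With θ = 0 the walk is the plain ballistic Hadamard walk; with a nontrivial θ a substantial probability remains trapped near the origin.

```python
import numpy as np

def walk(steps, theta=0.0):
    """1D discrete Hadamard walk with a single-point phase defect:
    a phase e^{i*theta} is applied whenever the walker is at x = 0
    (theta = 0 recovers the plain Hadamard walk)."""
    n = 2 * steps + 1                       # positions -steps..steps
    amp = np.zeros((2, n), dtype=complex)   # amp[coin, position]
    amp[0, steps] = 1 / np.sqrt(2)          # symmetric initial coin state
    amp[1, steps] = 1j / np.sqrt(2)         # at the origin (index = steps)
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    for _ in range(steps):
        amp = H @ amp                       # coin flip at every position
        amp[:, steps] *= np.exp(1j * theta)  # single-point defect at x = 0
        new = np.zeros_like(amp)
        new[0, 1:] = amp[0, :-1]            # coin 0 shifts right
        new[1, :-1] = amp[1, 1:]            # coin 1 shifts left
        amp = new
    return (np.abs(amp) ** 2).sum(axis=0)   # position distribution

p_plain = walk(100, theta=0.0)
p_defect = walk(100, theta=0.7 * np.pi)
# probability of finding the walker at the origin after 100 steps:
print(p_plain[100], p_defect[100])
```

The origin probability of the plain walk decays with the number of steps, whereas with the defect it stays of constant order, which is the localization effect exploited by the clustering algorithm below.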

Quantum HAC algorithm with HWSPPD
At first, a simple quantum hierarchical agglomerative clustering algorithm (Algorithm 1) is given, which can be derived directly from the results of Lloyd et al. [Lloyd, Mohseni and Rebentrost (2013)]. In that work, Lloyd et al. designed a quantum cluster assignment (QCA) algorithm that assigns a new data point to one of two sets by calculating the Euclidean distances between the new point and the two sets. Moreover, since this calculation costs only O(log(Nd)) time on a quantum computer, the quantum algorithm provides an exponential speed-up over classical algorithms, which take O(Nd) time. Drawing on the QCA algorithm, a procedure for calculating the Euclidean distance between two clusters can be obtained. Given two clusters {U_i | i = 1, 2, ..., M} and {V_j | j = 1, 2, ..., M'}, qRAM is first utilized to construct a superposition state encoding the two clusters. Then, a projective measurement is performed on the first particle.
Finally, from the probability of the outcome corresponding to the reference state, the distance between these two clusters can be calculated in time O(log(Nd)). Hence, the complexity of Algorithm 1 is reduced to O(N^2 log(Nd)). Additionally, we assume the termination condition is that the minimum of the pairwise cluster distances is larger than a given threshold. The detailed algorithm is described as follows. Generally speaking, in a data set, a data point is more related to nearby points than to points farther away. Accordingly, the data points can be divided into three kinds: center points, border points, and outlier points (or noisy points). For example, in the two-dimensional data set DS1 shown in Fig. 2, there are two clusters, {X_1, X_2, ..., X_9} and {X_10, X_11, ..., X_18}, and two outlier points, X_19 and X_20. Consider three data points, X_15, X_18, and X_19. X_15 is a center point: it has many neighbors and is surrounded by them. In contrast, X_19 is an outlier point: it has few neighbors and is isolated from the other points. The neighborhood of X_18 contains some points, and these neighbors lie toward the center points; thus X_18 is called a border point.
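Returning to the QCA distance estimation above: it requires qRAM and a quantum processor, so the snippet below is only a classical Monte-Carlo sketch of its measurement statistics. It uses the ideal swap-test acceptance probability P = 1/2 + ⟨û|v̂⟩²/2 between normalized states and assumes, as in Lloyd et al., that the sign of the inner product can be recovered; the function name and shot count are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)

def quantum_distance_estimate(u, v, shots=100_000):
    """Monte-Carlo sketch of swap-test-based distance estimation.

    |u - v|^2 = |u|^2 + |v|^2 - 2 u.v, where u.v is recovered from the
    ideal swap-test acceptance probability P = 1/2 + <u_hat|v_hat>^2 / 2.
    The finite shot count models the statistical error of repeated
    measurements; sign recovery is assumed.
    """
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    overlap = np.dot(u, v) / (nu * nv)          # <u_hat|v_hat>
    p_accept = 0.5 + 0.5 * overlap ** 2         # ideal swap-test statistics
    hits = rng.binomial(shots, p_accept)        # simulated measurement record
    est_overlap2 = 2 * hits / shots - 1
    est_dot = np.sqrt(max(est_overlap2, 0.0)) * nu * nv * np.sign(np.dot(u, v))
    return nu ** 2 + nv ** 2 - 2 * est_dot

u, v = np.array([1.0, 2.0]), np.array([2.0, 0.5])
print(quantum_distance_estimate(u, v), np.linalg.norm(u - v) ** 2)
```

The estimate converges to the true squared distance as the shot count grows, mirroring how the measurement probability in the QCA procedure encodes the cluster distance.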

Figure 2: Point distribution
The second quantum hierarchical agglomerative clustering algorithm is based on this observation. In the algorithm, each data point is considered as a walker particle. According to the differences among the three kinds of data points, the corresponding particle performs a HWSPPD with a different value of θ, determined by the density of its neighbors. Concretely, for the particles representing center points or outliers, the localization effect of the quantum walk is exploited to make these particles move slowly or keep them motionless, while for border points the ballistic effect is chosen to make the corresponding particles move toward the center points quickly. In this way, the two nonclassical features of quantum walks are utilized to accomplish the clustering task. Before presenting our clustering algorithm, we define some notions used in it. Suppose there exists an unlabeled data set with N data points, denoted by X = {X_1, X_2, ..., X_N}, where each data point has d attributes, X_i = (x_i1, x_i2, ..., x_id). The Euclidean distance between two data points X_i and X_j is defined as d(X_i, X_j) = sqrt(Σ_{k=1}^{d} (x_ik − x_jk)^2). Based on this distance definition, the neighborhood of every data point can be obtained directly. Concretely, the ε-neighborhood of a data point X_i can be written as N_ε(X_i) = {X_j ∈ X | d(X_i, X_j) ≤ ε}. Then, a new quantity ρ_i = |N_ε(X_i)| is defined to represent the number of point X_i's neighbors. This quantity was used and named local density in Ref. [Rodriguez and Laio (2014)].
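The ε-neighborhood and local density defined above are straightforward to compute; a minimal sketch follows (note that, by the set definition above, each point formally belongs to its own neighborhood, so it counts itself).

```python
import numpy as np

def local_densities(X, eps):
    """rho_i = |N_eps(X_i)|: the number of data points within Euclidean
    distance eps of X_i (by the definition above, X_i counts itself)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise distance matrix
    return (D <= eps).sum(axis=1)

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [3.0, 3.0]])
print(local_densities(X, eps=0.2))  # -> [3 3 3 1]: the last point is isolated
```

A point whose density is close to the minimum value of 1, like the last point here, is exactly the kind of outlier the algorithm leaves motionless.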
In the second algorithm, each data point is considered as a particle. For convenience, we assume that the particle corresponding to the data point X_i is P_i. This particle is prepared in the initial state |ψ(0)⟩. At each step, particle P_i first performs a HWSPPD with a parameter θ_i. Generally speaking, a data point only interacts with its neighborhood points. Thus, in our algorithm, the parameter θ_i is determined by the neighborhood of the data point X_i: it is a function of the local-density difference δ_i = |ρ_i − ρ_j| between X_i and a neighbor X_j, chosen such that θ_i ≃ 0.7π as δ_i approaches 0. After the walker makes a HWSPPD with θ = θ_i, the particle is measured in the computational basis. Finally, according to the measurement result m_i, the attributes of X_i are changed. The detailed modification is x_ik ← x_ik + |m_i| × D_ik, where D_i = (D_i1, ..., D_id) is the direction vector from X_i toward its neighbors, i.e., the mean of the offsets X_j − X_i over X_j ∈ N_ε(X_i). Here, the setups of these two quantities, θ_i and D_i, are based on a general assumption, namely, that cluster centers with higher local density are surrounded by neighbors with lower density. If the data point X_i is a center point, there generally exists another data point X_j in its neighborhood that is very close to the center point. Moreover, it is common that the local density of this data point is equal to or only slightly less than that of the center point, i.e., ρ_j ≃ ρ_i. Thus, the quantity δ_i = |ρ_i − ρ_j| approaches 0, and then θ_i ≃ 0.7π. In this case, the localization effect takes effect when particle P_i performs a HWSPPD with θ_i ≃ 0.7π. On the other hand, the neighbors of a center point X_i are located around it symmetrically, i.e., Σ_{X_j ∈ N_ε(X_i)} (X_j − X_i) ≃ 0, which implies that the value of D_i is also close to 0.
Thus, the center point is kept unchanged with high probability. A similar scenario occurs when X_i is an outlier point: this point has few neighbors, so its local density ρ_i ≃ 0 and its direction vector toward neighbors also vanishes. Therefore, outlier points also stay steady in our algorithm.
However, the situation is different when X_i is a border point. Generally, there may exist a data point X_j ∈ N_ε(X_i) that is the center point or near the center point. Then the density difference δ_i ≠ 0, and the ballistic phenomenon appears during the quantum walk of the corresponding particle. Moreover, since the neighbors of the border point lie toward the center point, the direction vector D_i of point X_i is larger in magnitude than that of the center point. Under this condition, the border point moves toward the center point. Further, consider two border points X_a and X_b, where point X_a is closer to the center point than point X_b; that is, the distance between X_b and the center is larger than that between X_a and the center. In this case, the value of δ_a is less than that of δ_b, because it is common that point X_a has more neighbors than point X_b, i.e., ρ_a ≥ ρ_b. Therefore, we obtain θ_a ≥ θ_b, and furthermore it is evident that the corresponding measurement results satisfy |m_a| ≤ |m_b|. Hence, compared with point X_a, point X_b moves more quickly toward the center.
In the above manner, all points except the outlier points get together after this process is executed several times. According to this basic idea and Algorithm 1, we obtain the second quantum HAC algorithm (Algorithm 2), described as follows. Now, consider the data point set in Fig. 2. After one round of the iterative process, the border points, e.g., X_16 and X_18, move toward the center point X_15, whereas X_15 does not move much and the outlier point X_19 stays steady [as shown in Fig. 3(a)]. Moreover, point X_18 moves more quickly than point X_16. After four iterations, Fig. 3(d) shows that all data points cluster together except the two outlier points X_19 and X_20. Consequently, Algorithm 2 accomplishes the clustering task successfully.
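The qualitative behavior of one iteration can be illustrated classically. The sketch below is an illustrative stand-in, not the paper's algorithm: the random step magnitude that the quantum-walk measurement would supply is replaced by a fixed scale, and the direction vector is taken as the mean offset to the ε-neighbors. It only shows the geometry of the dynamics: neighbor-rich points contract toward their local centroid while isolated points stay put.

```python
import numpy as np

def drift_step(X, eps, step_scale=0.5):
    """One illustrative iteration of the clustering dynamics.

    Each point moves toward the mean of its eps-neighbors. In the actual
    algorithm the step length comes from the measured quantum-walk
    displacement; here a fixed step_scale stands in for that magnitude.
    """
    X_new = X.copy()
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nbrs = (d <= eps) & (d > 0)        # eps-neighborhood, excluding self
        if not nbrs.any():
            continue                        # outliers have no neighbors: steady
        direction = (X[nbrs] - X[i]).mean(axis=0)
        X_new[i] = X[i] + step_scale * direction
    return X_new

X = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [4.0, 4.0]])
Y = drift_step(X, eps=0.5)
print(Y)  # the tight trio contracts; the isolated point does not move
```

Iterating this map drives the dense group onto a single location while the outlier never moves, matching the behavior described for Figs. 3(a)-3(d).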

Numerical simulations and experimental evaluation
In this section, we implement the presented algorithms by numerical simulation on a classical computer and evaluate the performance of Algorithm 2 on synthetic and real-world data. Because it facilitates the representation and manipulation of matrices, MATLAB is frequently used to simulate quantum states and operations; therefore, the presented algorithms are programmed in MATLAB and executed on a personal computer with an Intel(R) Core(TM) i5-4590 CPU at 3.30 GHz and 8.0 GB RAM. We start by conducting experiments on two synthetic data sets, DS2 and DS3, displayed in Figs. 4(a) and 4(b). The first data set consists of four arbitrarily shaped clusters, each of which has 100 data points; the second comprises four clusters with different densities. When ε is 0.14 (0.16), all data points in DS2 (DS3) are clustered into four classes by executing Algorithm 2. The corresponding clustering results are shown in Figs. 4(c) and 4(d). This implies that the presented algorithm successfully detects all types of clusters without any errors.

Figure 4: Clustering results of two synthetic data sets
In the following, we consider four real-world data sets publicly available at the UCI machine learning repository (http://archive.ics.uci.edu/ml), i.e., Wisconsin, Iris, Ecoli, and Wine, which are clustered via Algorithm 2. For a better evaluation, the simulation results of this algorithm are compared with those of two classical clustering algorithms, X-Means and MeanShift. Furthermore, to provide an objective description of effectiveness, we use the normalized mutual information (NMI) as a measure of clustering quality. It is defined as NMI(C, C′) = I(C, C′)/sqrt(H(C)H(C′)), where C and C′ are two clustering results of a data set.
H(C) is the entropy associated with the clustering C = {C_1, C_2, ..., C_l}, i.e., H(C) = −Σ_{i=1}^{l} (|C_i|/N) log(|C_i|/N), and I(C, C′) is the mutual information between the two clusterings. The value of NMI(C, C′) ranges between 0 and 1; the higher the value, the better the clustering effect. Derived from a study on breast cancer, the Wisconsin data set comprises two classes: benign, with 444 instances, and malignant, with 239 instances, where each instance has 9 attributes. Algorithm 2 clusters this data set into two classes successfully. One cluster, with 457 instances, represents the benign class, among which 19 instances are wrongly labeled; the other, with 226 instances, represents the malignant class and contains 6 wrong results. In total, only 25 instances are wrongly clustered, which is better than the performance of the other two algorithms, as shown in Tab. 1.
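The NMI measure above can be sketched directly from its definition (entropies of the two partitions and their mutual information); the helper below is an illustration, not the paper's evaluation code.

```python
from math import log, sqrt

def nmi(a, b):
    """Normalized mutual information NMI(C, C') = I(C, C') / sqrt(H(C) H(C'))
    for two label lists a and b of equal length."""
    n = len(a)

    def H(labels):
        # entropy of a partition: -sum_i (|C_i|/n) log(|C_i|/n)
        return -sum((labels.count(v) / n) * log(labels.count(v) / n)
                    for v in set(labels))

    I = 0.0
    for va in set(a):
        for vb in set(b):
            nab = sum(1 for x, y in zip(a, b) if x == va and y == vb)
            if nab:
                I += (nab / n) * log(n * nab / (a.count(va) * b.count(vb)))
    return I / sqrt(H(a) * H(b))

print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # identical partitions up to relabeling: NMI is 1
```

Note that NMI is invariant under relabeling of the clusters, which is why it is a fair score for comparing an unsupervised result against ground-truth classes.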

Conclusion
In summary, based on the one-dimensional discrete quantum walk with single-point phase defects, a new quantum hierarchical agglomerative clustering algorithm is introduced.
Each data point is regarded as a particle that performs a HWSPPD with a parameter θ.
Here, this parameter, which controls the localization effect of the walk, is determined by the local density of the data point. Then, the particle is measured, and according to the measurement result, the corresponding data point is modified appropriately. In this way, each data point interacts with its neighbors, and, as time evolves, similar data points cluster together and form distinct classes. To illustrate the effectiveness of this algorithm, extensive simulation experiments on synthetic and real-world data are performed. Furthermore, the presented algorithm utilizes the quantum cluster assignment method to speed up the computation; hence, our approach is efficient. In addition, there are two key technical problems in the implementation of the presented algorithm. One is the realization of the quantum cluster assignment method, which has been accomplished experimentally by Cai et al. [Cai, Wu, Su et al. (2015)] on a small-scale photonic quantum computer. The other is the realization of the quantum walk with single-point phase defects; the corresponding experiment has also been achieved by Xue et al. [Xue, Qin and Tang (2015)] with optical interferometers. Therefore, the presented algorithm is experimentally feasible with current technology.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.