Demonstration of Topological Data Analysis on a Quantum Processor

Topological data analysis offers a robust way to extract useful information from noisy, unstructured data by identifying its underlying structure. Recently, an efficient quantum algorithm was proposed [Lloyd, Garnerone, Zanardi, Nat. Commun. 7, 10138 (2016)] for calculating Betti numbers of data points -- topological features that count the number of topological holes of various dimensions in a scatterplot. Here, we implement a proof-of-principle demonstration of this quantum algorithm by employing a six-photon quantum processor to successfully analyze the topological features of Betti numbers of a network including three data points, providing new insights into data analysis in the era of quantum computing.

In exploratory data analysis and data mining, data often encodes extremely valuable information but is typically large, unstructured, noisy, and incomplete, so that extracting useful information from it is an important yet challenging task. Topological data analysis (TDA) [1] provides a general framework for studying such data in a manner that is insensitive to the particular metric and robust against noise. In particular, persistent homology [2,3] has been well established as a technique for extracting useful information by identifying topological features of data. One essential feature is the number of k-dimensional holes and voids in a dataset, that is, the k-th Betti number β_k (a topological invariant). For instance, the first three Betti numbers, β_0, β_1, and β_2, represent respectively the number of connected components, one-dimensional holes, and two-dimensional voids. The Betti numbers abstract away the actual data, reducing it to a purely topological representation, which is valuable for understanding the underlying structure of datasets. The use of topological data analysis to extract Betti numbers of data has been growing rapidly in recent years, yielding applications in image recognition [4], signal processing [5], network science [6,7], sensor analysis [8][9][10][11], brain connectomics [12,13], and fMRI data analysis [14,15], to name just a few.
Practically, however, classical topological methods face a formidable computational complexity: a set of n data points possesses 2^n potential subsets that could contribute to the topology, quickly overwhelming even the most powerful classical computers, even for moderately sized datasets. So far, the best classical algorithm for estimating Betti numbers to all orders with accuracy δ takes time O(2^n log(1/δ)) [16][17][18][19][20][21]. Moreover, exact calculation of Betti numbers is known to be PSPACE-hard for some classes of topologies [22].
Recently, Lloyd et al. [23,24] extended methods from quantum machine learning to TDA for efficiently estimating Betti numbers to all orders. Indeed, if the proportion of k-simplices generated from a dataset is large enough, the quantum algorithm for calculating Betti numbers to all orders with accuracy δ has runtime O(n^5/δ), exponentially faster than the best known classical algorithms. Furthermore, the algorithm does not require a large-scale quantum random access memory (qRAM) [25]: just O(n^2) bits are sufficient to store the information of all pairwise distances between the n data points. The potential computational speedup and its practicality will likely make quantum TDA a promising application for future quantum computers, in addition to Shor's algorithm [26][27][28][29], quantum simulation [30][31][32][33], solving linear systems [34,35], and classification of linear vectors [36][37][38].
Here we report a proof-of-principle demonstration of the quantum TDA algorithm on a small-scale photonic quantum processor for the first time. The topological features of Betti numbers of three data points are revealed and monitored at two different topological scales in our experiment. Our experiment successfully demonstrates the viability of the algorithm and suggests that data analytics may be an important future application for quantum computing, with widespread applications in our increasingly data-centric world.
To calculate Betti numbers, we first represent data topologically in terms of relationships between data points. Using a cutoff distance ε, we group data points into simplices (see Fig. 1(a)), that is, fully connected subsets of data points. The set of simplices forms a simplicial complex, the topological structure from which features such as Betti numbers can be extracted. This topological construction is shown in Fig. 1(b-d).
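The grouping of points into simplices can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the experiment: the function name `rips_complex` and the distance matrix `DIST` (pairwise distances 3, 4, and 5, chosen to be consistent with the scale ranges discussed later in the text) are assumptions made for this example.

```python
from itertools import combinations

def rips_complex(dist, eps, max_dim=2):
    """Return {k: list of k-simplices} at cutoff distance eps.

    dist is a symmetric n x n matrix of pairwise distances. A k-simplex is
    a sorted tuple of k+1 point indices whose pairwise distances are all
    at most eps (a fully connected subset of the points).
    """
    n = len(dist)
    return {
        k: [s for s in combinations(range(n), k + 1)
            if all(dist[a][b] <= eps for a, b in combinations(s, 2))]
        for k in range(max_dim + 1)
    }

# Hypothetical three-point dataset with pairwise distances 3, 4 and 5.
DIST = [[0, 3, 4],
        [3, 0, 5],
        [4, 5, 0]]

edges_small = rips_complex(DIST, 3.5)[1]   # one edge
edges_large = rips_complex(DIST, 4.5)[1]   # two edges
```

Enumerating all subsets in this way is exponential in the number of points, which is the classical cost that the quantum algorithm aims to avoid.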
By determining the complete set of Betti numbers over the full range of ε, we can then construct the barcode (see Fig. 1(e)) [39], a parameterized version of Betti numbers in a distance-dependent manner. Each bar in the region of H_k represents a k-dimensional hole, and the length of the bar indicates its persistence in the parameter ε. With the barcode, we can qualitatively filter out the short bars as topological noise and capture the long bars as significant features, since the length of a bar is indicative of its persistence against changes in the distance ε. In Fig. 1(e), a bar in the region of H_1 persists for a long range, leading us to determine that the underlying topological feature of the unstructured data (Fig. 1(b)) is a circle.
arXiv:1801.06316v2 [quant-ph] 18 Dec 2019
In general, the quantum TDA algorithm has two main steps (see Fig. 2(a)). First, one accesses the data to construct the uniform mixture of the k-simplices that encode the desired topological structure. The time of this step is exponential in the worst case and in fact depends on the proportion of k-simplices. In cases where this fraction is large enough, this step can be implemented efficiently either classically or using Grover's algorithm, yielding a further quadratic algorithmic enhancement. In the quantum algorithm, this step can be realized via two smaller steps, namely: (1a) simplicial complex state preparation; (1b) uniform mixed state construction. Second, one implements step (2) to reveal the topological invariants of the structure. This step is realized using the phase-estimation algorithm [40], which provides an exponential speedup over known classical procedures; in fact, [23,24] showed that it can be executed in time O(n^5/δ) with accuracy δ. The steps of the quantum algorithm are now described in more detail.
Fig. 2 caption (fragment): The first data point is connected to the second and third points by two edges. (e) Optimized circuit with 5 qubits. The blocks with different colors represent the four basic stages.
For a scatterplot including n data points, a k-simplex s_k consists of k+1 points V_{j_0}, V_{j_1}, ..., V_{j_k}, together with k(k+1)/2 edges, creating a fully connected subset of the data. We can encode a k-simplex as an n-qubit quantum state |s_k⟩ with k+1 ones at positions j_0, j_1, ..., j_k and zeros at the remaining positions.
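As a small illustration of this encoding, the following sketch maps a simplex to the label of its computational basis state and back. The helper names are hypothetical, and points are indexed from 0 here, so position 0 is the leftmost bit.

```python
def simplex_to_bits(simplex, n):
    """Label of the n-qubit basis state for a simplex.

    The string has '1' at every position listed in the simplex and '0'
    elsewhere, so a k-simplex yields exactly k+1 ones.
    """
    members = set(simplex)
    return "".join("1" if j in members else "0" for j in range(n))

def bits_to_simplex(bits):
    """Inverse map: positions of the ones, as a sorted tuple."""
    return tuple(j for j, b in enumerate(bits) if b == "1")

edge_a = simplex_to_bits((0, 1), 3)   # edge between the first and second points
edge_b = simplex_to_bits((0, 2), 3)   # edge between the first and third points
```

With three points, the two edges attached to the first point are labeled `110` and `101` under this convention.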
The Vietoris-Rips simplicial complex S_k is the set of k-simplices in which all points are within distance ε of each other. In the quantum implementation, we can construct the simplicial complex state |ψ_k⟩ as the uniform superposition of the k-simplices in the complex,
    |ψ_k⟩ = (1/√|S_k|) Σ_{s_k ∈ S_k} |s_k⟩.
Classically verifying whether all points in each s_k are within distance ε of each other allows us to construct the simplicial complex state. Alternatively, we can implement a multi-target Grover's algorithm [41] with a membership oracle function f_k, where f_k(s_k) = 1 if s_k ∈ S_k, to verify whether s_k ∈ S_k, yielding a quadratic speedup. Let H_k be the Hilbert space spanned by the states |s_k⟩ with s_k ∈ S_k. The construction of |ψ_k⟩ also reveals the number of k-simplices, |S_k| = dim H_k, and takes time O(n^2 (ζ_k)^{−1/2}), where ζ_k = |S_k| / C(n, k+1) is the proportion of possible k-simplices that are actually in the complex at scale ε, C(n, k+1) is the binomial coefficient, and (ζ_k)^{−1/2} = (|S_k| / C(n, k+1))^{−1/2} is the number of iterations of the multi-target Grover's algorithm. When the proportion is too small, the quantum search procedure will fail to find the simplices [23,24].
In step (1b), we construct the mixed state ρ_k, the uniform mixture over the set of simplices in the complex. This procedure can be easily realized by adding an n-qubit ancillary register, performing controlled-NOT (CNOT) operations to copy |ψ_k⟩ and construct (1/√|S_k|) Σ_{s_k ∈ S_k} |s_k⟩ ⊗ |s_k⟩, and finally tracing out the ancillary register to obtain ρ_k = (1/|S_k|) Σ_{s_k ∈ S_k} |s_k⟩⟨s_k|.
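The effect of the copy-and-trace procedure can be checked numerically. The sketch below is an illustration under simplifying assumptions: the CNOT copy is modeled directly as the map |s⟩|0⟩ → |s⟩|s⟩ on basis labels, and the function name `copy_and_trace` is hypothetical.

```python
import numpy as np

def copy_and_trace(psi):
    """Copy basis labels into an ancilla register, then trace the ancilla out.

    psi is a state vector in the computational basis. After the copy, the
    joint amplitude of |s>|a> is psi[s] if a == s and 0 otherwise.
    """
    dim = len(psi)
    joint = np.zeros((dim, dim), dtype=complex)   # joint[s, a]
    idx = np.arange(dim)
    joint[idx, idx] = psi
    # Partial trace over the ancilla index a: rho[s, s'] = sum_a joint[s, a] joint*[s', a]
    return joint @ joint.conj().T

# Uniform superposition of the two edges |110> and |101> of a three-point example.
psi = np.zeros(8, dtype=complex)
psi[0b110] = psi[0b101] = 1 / np.sqrt(2)

rho = copy_and_trace(psi)
```

The resulting `rho` is diagonal with weight 1/2 on each simplex, whereas the projector onto `psi` itself would retain off-diagonal coherences of magnitude 1/2.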
Step (2) acts on the simplicial complex to reveal topological features and is the core of the exponential speedup in the algorithm. Define the boundary map ∂_k that operates from H_k to H_{k−1} by
    ∂_k |s_k⟩ = Σ_{l=0}^{k} (−1)^l |s_{k−1}(l)⟩,
where |s_{k−1}(l)⟩ is obtained from s_k with vertices j_0 ... j_l ... j_k by omitting the l-th point j_l from s_k. The k-th Betti number is defined as [17][18][19][20]
    β_k = dim(Ker ∂_k / Im ∂_{k+1}).
Classical algorithms for calculating Betti numbers to all orders with accuracy δ require time O(2^n log(1/δ)) [16][17][18][19][20][21]. In quantum TDA, an exponential speedup is achieved by employing the phase-estimation algorithm. For this purpose, the boundary map is embedded into a Hermitian matrix acting on H_{k−1} ⊕ H_k,
    B_k = [[0, ∂_k], [∂_k†, 0]].
Applying phase-estimation to decompose ρ_k in terms of the eigenvectors and eigenvalues of B_k, one obtains the probability η_k of projecting onto the kernel by measuring the eigenvalue register. The dimension of the kernel of ∂_k can then be calculated as dim(Ker ∂_k) = η_k · |S_k|. When both dim(Ker ∂_k) and dim(Ker ∂_{k+1}) are determined, we can reconstruct the k-th Betti number by
    β_k = dim(Ker ∂_k) + dim(Ker ∂_{k+1}) − |S_{k+1}|,
which follows from dim(Im ∂_{k+1}) = |S_{k+1}| − dim(Ker ∂_{k+1}). We note that in some special cases it is trivial to calculate dim(Ker ∂_k). For example, if no k-simplex exists, dim(Ker ∂_k) = |S_k| = 0, while dim(Ker ∂_0) is always equal to the number of points.
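The kernel-dimension formula can be checked classically on small complexes. The following sketch uses ordinary linear algebra (matrix rank) in place of phase estimation, so it illustrates the bookkeeping only and not the quantum routine; the function names and the example complexes are assumptions of this illustration.

```python
import numpy as np

def boundary_matrix(k_simplices, faces):
    """Matrix of the boundary map: column s_k, rows s_{k-1}, entries (-1)^l."""
    index = {f: i for i, f in enumerate(faces)}
    mat = np.zeros((len(faces), len(k_simplices)))
    for col, s in enumerate(k_simplices):
        for l in range(len(s)):
            mat[index[s[:l] + s[l + 1:]], col] = (-1) ** l   # omit the l-th vertex
    return mat

def kernel_dim(cx, k):
    """dim Ker of the k-th boundary map; cx maps k to a list of sorted tuples."""
    s_k = cx.get(k, [])
    if k == 0 or not s_k:
        return len(s_k)   # every point is in Ker of the 0-th map; an empty space has dim 0
    rank = np.linalg.matrix_rank(boundary_matrix(s_k, cx[k - 1]))
    return len(s_k) - int(rank)

def betti(cx, k):
    """beta_k = dim Ker d_k + dim Ker d_{k+1} - |S_{k+1}|."""
    return kernel_dim(cx, k) + kernel_dim(cx, k + 1) - len(cx.get(k + 1, []))

POINTS = [(0,), (1,), (2,)]
HOLLOW_TRIANGLE = {0: POINTS, 1: [(0, 1), (0, 2), (1, 2)]}
TWO_EDGES = {0: POINTS, 1: [(0, 1), (0, 2)]}
ONE_EDGE = {0: POINTS, 1: [(0, 1)]}
```

For the hollow triangle this gives β_0 = 1 and β_1 = 1, the Betti numbers of a circle; for three points joined by one or two edges it gives β_0 = 2 and β_0 = 1, respectively.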
Careful evaluation indicates that step (2) can estimate Betti numbers to all orders with accuracy δ in time O(n^5/δ) [23,24]. In the worst case, when the proportion of k-simplices is too small, step (1) will fail to find the k-simplices efficiently, and both the classical and the quantum algorithm will take exponential time. There are, however, specific cases in which step (1) can be implemented efficiently, and in these cases the overall quantum algorithm can provide exponential savings. In fact, we have tested a particular case using data points with random distances between them and found that step (1) can indeed be implemented efficiently (see Supplement 1 for details), either by a classical algorithm or, with a further square-root improvement in time, through Grover's algorithm.
To construct the corresponding uniform mixed states, we do not actually need to generate a complete copy of |ϕ_1^{ε_1}⟩ (|ϕ_1^{ε_2}⟩). Instead, we need only perform a CNOT operation between the auxiliary qubit |0⟩_A and the second qubit of |ϕ_1^{ε_1}⟩ (|ϕ_1^{ε_2}⟩) to partially copy the state of simplices. After tracing out the ancillary qubit, the uniform mixed states ρ^{ε_1} and ρ^{ε_2} are obtained.
Next, we apply quantum phase-estimation to reveal information related to Betti numbers. Since there are only three data points, k-dimensional holes for k > 1 cannot exist. Therefore, only the zeroth and first Betti numbers need to be calculated. We note that the algorithm cares not about the exact eigenvalue spectrum, but about the probability of detecting |0⟩ in the eigenvalue register. We can exploit this property to reduce the number of qubits required in the eigenvalue register. A particular treatment for boundary matrices is utilized to greatly simplify the complex circuit (see Supplement 1 for details): a single CNOT operation between the eigenvalue register, comprising only one qubit |0⟩_B, and the first bit of ρ^{ε_1} (ρ^{ε_2}) is sufficient for realizing phase-estimation. Finally, the information related to Betti numbers is read out by measuring the eigenvalue register. Note that since the quantum TDA algorithm depends only on how the points are connected, not on the precise distances between points, our circuit works for all nontrivial cases of three points (where one or two edges are present). The cases where zero or three edges are present are trivial, since the Betti numbers are known without calculation when the N points are all disconnected (β_0 = N and β_k = 0 for k > 0) or all connected (β_0 = 1 and β_k = 0 for k > 0).
Fig. 3 shows the setup of our experiment. We use single photons as qubits, where the logical qubits |0⟩ and |1⟩ are encoded into horizontal (H) and vertical (V) polarization, respectively. With these settings, the step of simplex state preparation becomes straightforward. The states at scales ε_1 and ε_2 can be prepared directly by adding or removing the polarizer in path 2, respectively, where the index i in |H (or V)⟩_i denotes the spatial mode. Photons 4 (ancilla) and 5 (eigenvalue register) are both disentangled by polarizers into |H⟩, and then photons 3 and 6 (trigger) immediately collapse into |V⟩.
Note that the CNOT gates can be simulated using combinations of a polarizing beam splitter (PBS) and a half-wave plate (HWP) [27], since the target qubits are fixed at |H . This setup, in principle, suffices to demonstrate the underlying conceptual principles of quantum TDA.
Before running the algorithm, we first characterized the performance of the optical quantum circuit.
In the case of 4 < ε_2 < 5, a three-photon entangled state |φ⟩ = (|H⟩_1|V⟩_2|V⟩_4 + |V⟩_1|H⟩_2|H⟩_4)/√2 is generated after implementing the CNOT gate in step (1a). We measured the fidelity of the experimentally prepared state (see Supplement 1 for details) as F = 0.954(6), which exceeds the threshold of 0.5 for the entanglement witness to confirm genuine multi-partite entanglement [43]. To the best of our knowledge, such a high fidelity for three-photon entanglement has never been achieved before [44].
To further quantify the experimental performance, we use the measure of Ref. [45] to characterize the overlap between experimental and theoretical values, where e_k and t_k are the experimental and theoretical output probabilities of the state |k⟩, respectively.
Figure caption (fragment): Error bars represent one standard deviation, deduced from propagated Poissonian counting statistics of the raw detection events. (c) The barcode for 0 < ε < 5. Since no k-dimensional holes for k ≥ 1 exist at these scales, only the zeroth Betti barcode is given here. For 0 < ε < 3, there is no connection between any two points, so the zeroth Betti number is equal to the number of points; that is, there are three bars at 0 < ε < 3. At scales of 3 < ε_1 < 4 and 4 < ε_2 < 5, the zeroth Betti numbers are 2 and 1, respectively.
We note that for the quantum TDA algorithm, the results are read out by measuring the eigenvalues. In general, the eigenvalue register requires only a few qubits for the quantum TDA algorithm (1 qubit in the current work), since we only care about the proportion of |0⟩ in the eigenvalue register, rather than the exact value of all eigenvalues. Thus, a small number of measurements is sufficient for obtaining reliable results, an important feature for the scalability of the algorithm.
In addition, in theory only the qubits in the eigenvalue register need to be measured for the quantum TDA algorithm, rather than all qubits. In our experiment, since the photons generated by spontaneous parametric down-conversion are probabilistic, we need to measure 6-fold coincidence events to ensure that all qubits in the circuit have been generated and that the quantum circuit has been fully implemented. In fact, this is a common problem encountered in current linear optical quantum computing. Fortunately, with the development of deterministic quantum-dot single-photon sources [46] and other techniques [47], we believe this problem can eventually be overcome. We anticipate that with more qubits (more photons [42,48] or higher-dimensional states [49,50]), our proposal could be extended to the analysis of much larger datasets in the future.
In summary, we have presented the first proof-of-principle demonstration of quantum TDA on a small-scale photonic quantum processor. The topological features of a dataset comprising three data points are revealed and tracked at two different topological scales, fully reproducing the Betti numbers associated with the topology of the data. Future advances in the field could open up new frontiers in data analysis for quantum computing, including signal and image analysis, astronomy, network and social media analysis, behavioral dynamics, biophysics, oncology and neuroscience.
Betti numbers are a way to describe the connectivity within a topological space. In simplest terms, the k-th Betti number β_k counts the number of k-dimensional holes in a topological space, for example,
- β_0 is the number of connected components;
- β_1 is the number of planar holes (1-dimensional holes);
- β_2 is the number of two-dimensional voids (2-dimensional holes);
- ...
Betti numbers are topological invariants: spaces that are homotopy equivalent have the same Betti numbers [1]. To demonstrate Betti numbers more vividly, some examples are shown in Fig. 5. A circle has one connected component and one 1-dimensional hole, thus β_0 = 1 and β_1 = 1. A circle is homotopy equivalent to a triangle, so the two have the same Betti numbers (see Fig. 5(a)); similarly, the two-dimensional hollow sphere is homotopy equivalent to a hollow tetrahedron (see Fig. 5(b)). Thus, Betti numbers can record significant topological features of a shape, which could be directly used in pattern recognition [51], anomaly detection [52], and computational linguistics [53]. For instance, in a simple shape recognition task, namely the recognition of printed letters, we could use the Betti numbers to identify and distinguish the letters "A" and "B" in Fig. 5(c), even in the presence of some deformation. We now briefly introduce some mathematical background for Betti numbers. For more details, one can refer to [54].
We first describe how to use a simplicial complex to formally describe a topological structure.
Simplex: A k-simplex σ_k = [V_{j_0}, ..., V_{j_k}] is a fully connected set of k+1 affine geometric points V_{j_0}, ..., V_{j_k}, together with k(k+1)/2 edges (see Fig. 1(a) for some examples), where k is the dimension of the simplex.
Simplicial complex: Roughly speaking, a simplicial complex K is a finite set of simplices (see Fig. 1(d) for an example) such that: i) any face of a simplex of K is a simplex of K, ii) the intersection of any two simplices of K is either empty or a common face of both.
Next, we will introduce the chain group, boundary operator, cycle group and boundary group, and then how to calculate the Betti numbers.
k-chain group: A k-chain is a formal sum of k-simplices with integer coefficients, which can be written as c = Σ_{i=1}^{p} ε_i σ_i with ε_i ∈ Z_2, where {σ_1, ..., σ_p} is the set of k-simplices of K. The set of all k-chains forms an Abelian group C_k(K).
k-boundary operator: For a k-simplex σ_k = [V_{j_0}, ..., V_{j_k}], the boundary map ∂_k : C_k(K) → C_{k−1}(K) is given by
    ∂_k σ_k = Σ_{i=0}^{k} (−1)^i [V_{j_0}, ..., V̂_{j_i}, ..., V_{j_k}],
where V̂_{j_i} indicates that V_{j_i} is removed, and [V_{j_0}, ..., V̂_{j_i}, ..., V_{j_k}] is the (k−1)-simplex spanned by all the vertices except V_{j_i}.
k-boundary group and k-cycle group: The k-boundary group is defined as B_k(K) = Im ∂_{k+1} = {c ∈ C_k(K) | ∃ c′ ∈ C_{k+1}(K), ∂_{k+1}(c′) = c}, containing elements that are boundaries of (k+1)-dimensional objects. The k-cycle group is defined as Z_k(K) = Ker ∂_k = {c ∈ C_k(K) | ∂_k c = 0}; the elements of the cycle group can be understood as 'loops'. It can be proved that B_k(K) ⊆ Z_k(K) ⊆ C_k(K).
Homology group: Let K be a k-dimensional simplicial complex. The k-th homology group H_k(K) associated with K is defined by H_k(K) ≡ Z_k(K)/B_k(K), which represents those elements of Z_k(K) (loops) that are not boundaries. The k-th Betti number is the dimension of this group, β_k = dim H_k(K) = dim Z_k(K) − dim B_k(K).
Using Betti numbers, we can detect invisible geometric features of high-dimensional objects. Applying Betti numbers to data analysis could help us analyze and exploit the complex topological and geometric structures underlying data. Next, we introduce how to use persistent homology, a sophisticated topological data analysis method, to extract useful information by identifying the topological features (Betti numbers) of data.
From points to simplicial complex: In data analysis, data is usually represented as an unordered set of points (see Fig. 1(b)); analyzing the Betti numbers of the data therefore requires a method for constructing a simplicial complex from these points.
To define a simplicial complex, the most obvious way is to use the points as the vertices of a combinatorial graph whose edges are determined by proximity. Using a cutoff distance ε and connecting points within distance ε (see Fig. 1(b-d) for the procedure), we can construct the simplicial complex (see Fig. 1(d)), called a Vietoris-Rips simplicial complex.
Computing Betti numbers: Having constructed the simplicial complex of data points, we use the method above to calculate Betti numbers, finding the topological structure of the data points.
Barcode: Converting data points into a simplicial complex requires a choice of parameter, the cutoff distance ε. However, if ε is too small, almost all points are separated and no overall structure is apparent; if ε is too large, all the points may be connected with each other, the complex is a single high-dimensional simplex, and no topological holes exist. It is challenging to select an appropriate scale for a given dataset. To address this problem, we observe the evolution of topological features over the full range of ε, rather than focusing on a particular numeric value, yielding the barcode (see Fig. 1(e)). Each bar in the region of H_k of the barcode represents a k-dimensional hole, the length of which indicates its persistence in the parameter ε. With the barcode, we can qualitatively filter out the short bars as topological noise and capture the long bars as significant, persistent topological features, since the length of a bar is indicative of its persistence against changes in the distance ε. For further details, refer to [55].
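For the zeroth Betti number, the barcode can be computed classically by merging connected components as ε grows. The sketch below is a minimal union-find illustration, not the method of the text: the function names are hypothetical, and the distance matrix (pairwise distances 3, 4, and 5) is an assumed example chosen to match the scale ranges 0 < ε < 3, 3 < ε < 4, and 4 < ε < 5 of the three-point dataset.

```python
from itertools import combinations

def h0_barcode(dist):
    """Bars (birth, death) of connected components as eps grows from 0.

    Every point is born at eps = 0; a bar dies at the edge length that
    merges its component into another. One bar never dies.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    bars = []
    for d, a, b in sorted((dist[a][b], a, b) for a, b in combinations(range(n), 2)):
        ra, rb = find(a), find(b)
        if ra != rb:                         # this edge merges two components
            parent[ra] = rb
            bars.append((0.0, d))
    bars.append((0.0, float("inf")))
    return bars

def betti0_at(bars, eps):
    """Number of bars alive at scale eps."""
    return sum(1 for birth, death in bars if birth <= eps < death)

DIST = [[0, 3, 4],
        [3, 0, 5],
        [4, 5, 0]]
BARS = h0_barcode(DIST)
```

Reading the bars at ε = 2, 3.5, and 4.5 gives 3, 2, and 1 connected components, respectively.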
There are many interesting and useful applications of topological data analysis. For instance, in the field of image recognition, Carlsson et al. found that high-contrast 3×3 pixel patches from grayscale digital images concentrate near the surface of a Klein bottle in a higher-dimensional space [4]; in the field of signal processing, Perea and Harer found that persistent homology can detect periodicity in noisy time-series data [5], and that the method is very stable and accurate, especially in the presence of damping; in unsupervised machine learning, persistent homology also provides a powerful tool for the analysis of musical data, exploring common features of classical scores [56].

II. NUMERICAL SIMULATION OF THE PROPORTION OF k-SIMPLICES IN SOME CASES
As mentioned in the main text, the efficiency of step (1) depends on the proportion of k-simplices. Here, we study by numerical simulation the relationship among the proportion of k-simplices, the number of data points n, the dimension k of the k-simplices, and the cutoff distance ε (see Fig. 6).
In our simulations, without loss of generality, we randomly set the distances between different points in the range [0,1]. In Fig. 6(a), we take k = 4 as an example to simulate the relationship among the proportion of k-simplices, the number of data points n, and the cutoff distance ε. Since the computational complexity of step (1) in quantum TDA is O(n^2 (ζ_k)^{−1/2}), and the computational complexity of step (2) is O(n^5/δ), where δ is the accuracy, we can regard step (1) as efficient in quantum TDA if n^2 (ζ_k)^{−1/2} ≤ n^5/δ, that is, ζ_k ≥ δ^2 n^{−6}, which is guaranteed for any δ ≤ 1 when ζ_k ≥ n^{−6}. In Fig. 6(a), the blue area represents ζ_k < n^{−6}, and the green area represents ζ_k ≥ n^{−6}. We can see that, as n increases, the green area becomes larger and the blue area becomes smaller. Thus, as n increases, step (1) is efficient over a wider range of the cutoff distance ε.
In Fig. 6(b), we take n = 25 as an example to simulate the relationship between the proportion of k-simplices, their dimension k, and the cutoff distance ε. It is clear that the proportion of k-simplices gradually becomes smaller at each cutoff distance ε as k becomes larger. Similar to Fig. 6(a), we let the blue area represent ζ_k < n^{−6} and the green area represent ζ_k ≥ n^{−6}, yielding Fig. 6(c). We can see that even when k = 12 and the binomial coefficient C(n, k+1) reaches its maximum C(25, 12), the green area still encompasses over 50% of the region. By analyzing all three panels of Fig. 6, we see that the regime in which step (1) can be regarded as efficient is much larger than that in which it is inefficient. That is, step (1) can be implemented efficiently in the cases of our numerical simulations.
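A brute-force version of such a simulation can be sketched as follows. This is an illustration only, practical for small n: the function names, the fixed random seed, and the use of independent uniform random distances are assumptions of this sketch rather than details confirmed by the text.

```python
import random
from itertools import combinations
from math import comb

def random_distances(n, seed=0):
    """Symmetric matrix of pairwise distances drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    dist = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        dist[a][b] = dist[b][a] = rng.random()
    return dist

def simplex_proportion(dist, k, eps):
    """zeta_k: fraction of the C(n, k+1) candidate subsets that are k-simplices."""
    n = len(dist)
    count = sum(
        1 for s in combinations(range(n), k + 1)
        if all(dist[a][b] <= eps for a, b in combinations(s, 2))
    )
    return count / comb(n, k + 1)

def step1_efficient(zeta, n):
    """Criterion used in the text: zeta_k >= n^(-6)."""
    return zeta >= n ** -6

DIST8 = random_distances(8)
ZETAS = [simplex_proportion(DIST8, 2, eps) for eps in (0.2, 0.5, 0.8, 1.0)]
```

The proportion is non-decreasing in ε and reaches 1 once ε exceeds the largest pairwise distance; sweeping ε and n with `step1_efficient` marks the region where the criterion holds.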

III. EXPERIMENTAL ERRORS ANALYSIS
In this section, we will analyze errors introduced by experimental noise and provide an error threshold analysis.
The imperfections in our experiment can be attributed to two major causes: higher-order photon emissions, and partial distinguishability of independent photons. In order to suppress the influence of higher-order photon emissions, we placed two single-photon detectors at each measurement port. This dual-channel setup can partially suppress higher-order events where both detectors trigger simultaneously at one measurement port, indicating the presence of multiple photons. To ensure a high level of indistinguishability between independent photons, all photons are spectrally filtered by 3 nm narrow-band filters.
The final result of the quantum TDA algorithm is decided by the probability of the zero eigenvalue measured in the eigenvalue register. Assume the ideal probability of measuring the zero eigenvalue is η_i; then the dimension of the kernel of ∂_k can be calculated as dim(Ker ∂_k) = η_i · |S_k|. To obtain the correct dimension in the experiment, we need to ensure that |dim(Ker ∂_k)_ideal − dim(Ker ∂_k)_experiment| < 0.5, that is, |η_e − η_i| · |S_k| < 0.5 if we use the rounding principle, where η_e is the experimentally measured probability of the zero eigenvalue. To quantify the experimental error threshold, we define the error as E_t = |η_e − η_i| and then simulate the error threshold that satisfies the constraint |η_e − η_i| · |S_k| < 0.5. The relationship between the number of k-simplices |S_k| (x axis) and the error threshold (y axis) is shown in Fig. 7. As |S_k| increases, the error threshold decreases. Thus, appropriate fault-tolerance mechanisms should be employed when dealing with large-scale datasets.
Fig. 6 caption: (a) For k = 4, the relationship among the proportion of k-simplices ζ_k, the number of data points n (y axis), and the cutoff distance ε (x axis). The blue area represents ζ_k < n^{−6}; the green area represents ζ_k ≥ n^{−6}. (b) For n = 25, the relationship among the proportion of k-simplices, the dimension k of the k-simplices, and the cutoff distance ε. (c) The blue area represents ζ_k < n^{−6} in (b), and the green area represents ζ_k ≥ n^{−6} in (b). It is clear that the green area is far larger than the blue area.
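The rounding criterion above translates directly into a bound on the tolerable error, E_t < 0.5/|S_k|. A minimal sketch, with hypothetical function names, is:

```python
def error_threshold(num_simplices):
    """Largest tolerable |eta_e - eta_i| (exclusive bound): 0.5 / |S_k|."""
    return 0.5 / num_simplices

def recovered_kernel_dim(eta_measured, num_simplices):
    """Kernel dimension obtained by rounding eta_e * |S_k| to the nearest integer."""
    return round(eta_measured * num_simplices)

# Illustration with assumed numbers: |S_k| = 3 and ideal probability 1/3.
ETA_IDEAL = 1 / 3
ETA_MEASURED = 0.30
within_threshold = abs(ETA_MEASURED - ETA_IDEAL) < error_threshold(3)
```

Because the bound scales as 1/|S_k|, a complex with a thousand k-simplices already requires the measured probability to be accurate to better than 5 × 10^{−4}.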
Note that, unlike previous quantum algorithms, the quantum TDA algorithm only cares about the probability of the zero eigenvalue, not all the individual values in the eigenvalue register. Thus, the quantum TDA algorithm could, in principle, be more robust to noise than other algorithms, such as Shor's algorithm [26] and the HHL algorithm [34], which require an exact quantum state as output.
Fig. 7 caption: The relationship between the number of k-simplices |S_k| (x axis) and the error threshold E_t (y axis). As |S_k| increases, the error threshold E_t decreases.

IV. NECESSITY OF CONSTRUCTING THE MIXED STATE
In the quantum TDA algorithm, step (1) is used to construct the uniform mixture of the k-simplices, which is realized by: (1a) simplicial complex state preparation; (1b) uniform mixed state construction. In fact, the purpose of step (1) is to sample a k-simplex uniformly, which is the essential reason for constructing the mixed state.
Next, we explain why the quantum TDA algorithm cannot directly use the pure state generated in step (1a) as the input of step (2). In step (2), we use the quantum phase-estimation algorithm to decompose a mixed state in terms of the eigenvectors of the Hermitian matrix B_k, which acts on the space H_{k−1} ⊕ H_k, and find the probability of the zero eigenvalue to compute the dimension of the kernel of ∂_k.
The mixed state is
    ρ_k = (1/|S_k|) Σ_{s_k ∈ S_k} |s_k⟩⟨s_k|,
where the k-simplices |s_k⟩ form the basis, and ρ_k is a maximally mixed state. According to quantum mechanics, the maximally mixed state ρ_k retains the above form in any other complete basis set. Thus, ρ_k can be rewritten in terms of the eigenstate set {|n_i⟩} of ∂_k,
    ρ_k = (1/|S_k|) Σ_i |n_i⟩⟨n_i|.
Introducing qubits |0⟩^{⊗t} as the eigenvalue register, the input to the phase-estimation algorithm is
    |0⟩⟨0|^{⊗t} ⊗ ρ_k = (1/|S_k|) Σ_i |0⟩⟨0|^{⊗t} ⊗ |n_i⟩⟨n_i|.
For each eigenstate |n_i⟩, the eigenvalue register will output its corresponding eigenvalue |λ_i⟩. Thus, the output is
    (1/|S_k|) Σ_i |λ_i⟩⟨λ_i| ⊗ |n_i⟩⟨n_i|.
The probability of measuring the zero eigenvalue in the register is N_k(0)/|S_k|, where N_k(0) is the number of eigenstates in {|n_i⟩} whose eigenvalue is zero, that is, the dimension of the kernel of ∂_k. However, if we directly used the pure state generated in step (1a) as the input to step (2), then after decomposing the pure state in terms of the eigenvectors of the Hermitian matrix B_k, the probability of the zero eigenvalue in the register would be meaningless due to interference effects. For ease of understanding, we give an example to show that using the pure state as the input of step (2) outputs wrong results.
Therefore, after the phase-estimation algorithm, the probability of measuring the eigenvalue of zero in the eigenvalue register should be 2/7. However, if we use the pure state as the input, the probability of measuring the eigenvalue of zero is inconsistent with the expectation 2/7. This counterexample shows that the algorithm cannot use the pure state generated in step (1a) as the input to step (2).
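The same effect can be reproduced numerically on a smaller complex. The sketch below is a separate illustration, not the example of the text: it takes the three edges of a hollow triangle, for which dim(Ker ∂_1) = 1, and compares the probability of landing in the kernel for the maximally mixed state and for the uniform superposition. The variable and function names are assumptions of this sketch.

```python
import numpy as np

# Boundary map of the edges (0,1), (0,2), (1,2); rows are the points 0, 1, 2.
D1 = np.array([[-1.0, -1.0,  0.0],
               [ 1.0,  0.0, -1.0],
               [ 0.0,  1.0,  1.0]])

def kernel_projector(d):
    """Projector onto Ker d: identity minus the projector onto the row space."""
    return np.eye(d.shape[1]) - np.linalg.pinv(d) @ d

P = kernel_projector(D1)

rho_mixed = np.eye(3) / 3            # uniform mixture over the three edges
psi = np.ones(3) / np.sqrt(3)        # uniform superposition of the three edges

p_mixed = float(np.trace(P @ rho_mixed))   # dim(Ker) / |S_1|
p_pure = float(psi @ P @ psi)              # squared overlap with the kernel
```

Here `p_mixed` equals 1/3, so multiplying by |S_1| = 3 recovers the kernel dimension 1, whereas `p_pure` equals 1/9 because the amplitudes of the superposition partially cancel against the signs of the cycle.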

V. CIRCUIT DETAILS
To implement the algorithm with a limited number of qubits, our designed circuit differs from the original algorithm in several modifications, some of which have already been mentioned in the main text. Here we show the details of the modifications to phase-estimation, the core of the quantum TDA algorithm. Before introducing the modification, we provide two preliminaries:
(i) Let U be an arbitrary unitary operator, the eigenvector and eigenvalue sets of which are {|u_1⟩, |u_2⟩, ..., |u_n⟩} and {λ_1, λ_2, ..., λ_n}, respectively. If we transform the operator U into αU^2, where α ≠ 0 is a constant, then the eigenvalue set of αU^2 becomes {αλ_1^2, αλ_2^2, ..., αλ_n^2}, and the eigenvector set does not change. We note that if λ_i = 0, then αλ_i^2 = 0, and if λ_i ≠ 0, then αλ_i^2 ≠ 0.
(ii) Suppose |0⟩^{⊗t}|u⟩ is the input of the phase-estimation algorithm, where |0⟩^{⊗t} is an eigenvalue register with t qubits, and |u⟩ is an eigenvector of the unitary operator U with eigenvalue e^{2πiφ} (φ ≈ 0.φ_1...φ_t in binary representation). The phase-estimation algorithm is designed to output |φ_1...φ_t⟩|u⟩, where |φ_1...φ_t⟩ is an approximation to the phase φ with a precision of t bits.
Specifically, consider the Hermitian boundary matrices B^{ε_1} and B^{ε_2} at scales 3 < ε_1 < 4 and 4 < ε_2 < 5, together with their eigenvalue and eigenvector sets. To reduce the number of qubits required in the eigenvalue register, we set B̃^{ε_1} = (B^{ε_1})^2/2, so that the eigenvalue spectrum becomes {λ_1^{ε_1}, λ_2^{ε_1}, λ_3^{ε_1}, λ_4^{ε_1}} = {1, 1, 0, 0} without changing the eigenvector set. We note that the algorithm cares not about the full spectrum but about the probability of |0⟩ being detected in the register, so this special treatment is justified. Transforming B̃^{ε_1} into the unitary operator e^{iπB̃^{ε_1}} then allows us to implement phase-estimation using an eigenvalue register with only one qubit |0⟩_B, applied to the input |0⟩_B|ϕ_1^{ε_1}⟩. Similarly, at the scale ε_2, we set B̃^{ε_2} = (B^{ε_2})^2 and transform B̃^{ε_2} into the unitary operator e^{iπB̃^{ε_2}} to meet experimental requirements. For the input |0⟩⟨0|_B ⊗ ρ^{ε_2}, the phase-estimation procedure outputs the state |1⟩⟨1|_B ⊗ ρ^{ε_2}, where ρ^{ε_2} = (|110⟩⟨110| + |101⟩⟨101|)/2. Thus, in our experiment, a single CNOT operation between the eigenvalue register, comprising only one qubit |0⟩_B, and the first bit of ρ^{ε_1} (ρ^{ε_2}) is sufficient to compile the phase-estimation algorithm.
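The spectra quoted above can be checked numerically. The sketch below builds boundary maps for three points joined by one edge and by two edges, under an assumed vertex ordering and sign convention (the eigenvalues do not depend on this choice); all names are hypothetical, and the matrices are not claimed to coincide entry by entry with those used in the experiment.

```python
import numpy as np

def hermitian_embedding(d):
    """B = [[0, d], [d^dagger, 0]] acting on H_{k-1} (+) H_k."""
    rows, cols = d.shape
    return np.block([[np.zeros((rows, rows)), d],
                     [d.conj().T, np.zeros((cols, cols))]])

def phase_unitary(h):
    """exp(i * pi * h) for a Hermitian matrix h, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(1j * np.pi * w)) @ v.conj().T

# Assumed boundary maps: one edge (0,1) at scale eps_1; edges (0,1), (0,2) at eps_2.
D_EPS1 = np.array([[-1.0], [1.0], [0.0]])
D_EPS2 = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])

B1 = hermitian_embedding(D_EPS1)
B2 = hermitian_embedding(D_EPS2)
B1_TILDE = B1 @ B1 / 2          # rescaled square at eps_1
B2_TILDE = B2 @ B2              # square at eps_2

# Restrict the unitaries to the edge subspace H_1 (the last block).
U1_EDGES = phase_unitary(B1_TILDE)[3:, 3:]
U2_EDGES = phase_unitary(B2_TILDE)[3:, 3:]
```

Under these assumptions `B1_TILDE` has eigenvalues {1, 1, 0, 0}, and both unitaries act as −1 on the edge subspace, so a one-qubit eigenvalue register always reads |1⟩. The probability of the zero eigenvalue is then 0, giving dim(Ker ∂_1) = 0 and β_0 = 3 + 0 − |S_1|, that is, 2 at ε_1 and 1 at ε_2.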

VI. EXPERIMENTAL IMPLEMENTATION OF THE CIRCUIT
In the experiment, we use single photons as qubits, where the logical qubits $|0\rangle$ and $|1\rangle$ are encoded into horizontal ($H$) and vertical ($V$) polarization, respectively. The setup of our experiment is shown in Fig. 3. Photons in paths 1, 2, and 3 are used to construct simplex states. Photons 4 (ancilla) and 5 (eigenvalue register) are both disentangled by polarizers into $|H\rangle$, and photons 3 and 6 (trigger) then immediately collapse into $|V\rangle$. Here we describe how to experimentally implement the circuit in Fig. 2, covering the experimental realization of the qubits and gates.
In the initialization stage, the photons in our experiment are generated by spontaneous parametric down-conversion using β-barium borate (BBO). Ultraviolet laser pulses pass through a BBO crystal to produce the entangled state $(|0\rangle|1\rangle + |1\rangle|0\rangle)/\sqrt{2}$ (see Fig. 9(a)). If the entangled state is not needed, a polarizer (POL) can be used to disentangle it into $|0\rangle$ or $|1\rangle$ (see Fig. 9(b)).
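The disentangling step can be sketched with two-qubit state vectors. This is an idealized illustration only: the polarizer is modeled as a perfect projector onto $|H\rangle$ acting on the first photon of the pair, and all variable names are assumptions.

```python
import numpy as np

H = np.array([1.0, 0.0])   # logical |0>, horizontal polarization
V = np.array([0.0, 1.0])   # logical |1>, vertical polarization

# Entangled pair (|H>|V> + |V>|H>)/sqrt(2) from the down-conversion source
pair = (np.kron(H, V) + np.kron(V, H)) / np.sqrt(2)

# Ideal polarizer transmitting H on the first photon
proj_H = np.kron(np.outer(H, H), np.eye(2))
after = proj_H @ pair
p_pass = float(after @ after)          # probability that the first photon is transmitted
after = after / np.sqrt(p_pass)        # renormalized post-selected state

# Amplitudes of the second photon, given the first is H
partner = after.reshape(2, 2)[0]
print(p_pass, partner)
```

Under these assumptions, half of the pairs pass the polarizer, and the partner photon is left in $|V\rangle$.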
In the quantum gate operation stage, we need to implement an H gate, an X gate, and a CNOT gate. The single-qubit gates H and X can be experimentally realized using half-wave plates (HWP) set at 22.5° (see Fig. 9(c)) and 45° (see Fig. 9(d)), respectively. Since the target qubit of the CNOT gate in our circuit is $|0\rangle$, the gate can be realized using a combination of a polarizing beam splitter (PBS) and a HWP, post-selecting the events where exactly one photon exits each output of the PBS [27] (see Fig. 9(e)).
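The wave-plate settings quoted above follow from the Jones matrix of an ideal half-wave plate. The sketch below assumes the common convention in which the fast axis makes an angle θ with the horizontal and a global phase is dropped; the function name `hwp` is illustrative.

```python
import numpy as np

def hwp(theta_deg):
    """Jones matrix of an ideal half-wave plate with fast axis at theta_deg (global phase dropped)."""
    t = np.deg2rad(2 * theta_deg)
    return np.array([[np.cos(t), np.sin(t)],
                     [np.sin(t), -np.cos(t)]])

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
pauli_x = np.array([[0, 1], [1, 0]])

print(np.allclose(hwp(22.5), hadamard))   # prints True
print(np.allclose(hwp(45.0), pauli_x))    # prints True
```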
In the measurement stage, each photon passes through a quarter-wave plate (QWP), a HWP, a PBS, and is finally read out by using a single-photon detector (see Fig. 9(f)). By adjusting the angle of the QWP and HWP, we can measure the photonic qubit in arbitrary bases. Error bars represent one standard deviation, deduced from propagated Poissonian counting statistics of the raw detection events.
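As an illustration of how such error bars can be obtained, the sketch below propagates Poissonian counting noise to a measured outcome probability using first-order error propagation. The counts and the function name `probability_with_error` are hypothetical and are not taken from the experiment.

```python
import math

def probability_with_error(n_a, n_b):
    """Return p = n_a / (n_a + n_b) and its one-standard-deviation error.

    Assumes independent Poissonian counts, so Var(n) = n for each outcome.
    First-order propagation then gives Var(p) = n_a * n_b / (n_a + n_b)**3.
    """
    total = n_a + n_b
    p = n_a / total
    sigma = math.sqrt(n_a * n_b / total**3)
    return p, sigma

# Hypothetical detection counts for the two outcomes of one measurement setting
p, sigma = probability_with_error(900, 100)
print(f"p = {p:.3f} +/- {sigma:.3f}")
```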

VII. PHOTON SOURCE
We developed a high-performance source of polarization-entangled photons generated via spontaneous parametric down-conversion (SPDC) using a sandwich-like bulk [42], which consists of two identically cut, 2 mm thick, beam-like type-II β-barium borate (BBO) crystals with one half-wave plate (HWP) inserted between them. The source simultaneously exhibits high brightness (∼850 Hz/mW), high efficiency (∼45% collection efficiency with 3 nm bandwidth filters, and ∼88% collection efficiency without narrowband filtering), and high fidelity (∼0.98) at a pump power of 240 mW. These three features are crucial for future scalable photonic quantum technologies.

VIII. CHARACTERIZING THE THREE-PHOTON ENTANGLED STATE
Here we show the details for determining the fidelity of the three-photon entangled state $|\phi\rangle = (|HVV\rangle + |VHH\rangle)/\sqrt{2}$ and verifying genuine multipartite entanglement [57] using an entanglement witness. The fidelity is the overlap of the experimentally produced state $\rho_{\mathrm{exp}}$ with the desired state $\rho_{\mathrm{ideal}}$,
$$F_{|\phi\rangle} = \mathrm{Tr}(\rho_{\mathrm{ideal}}\,\rho_{\mathrm{exp}}).$$
For the three-photon entangled state,
$$\rho_{\mathrm{ideal}} = |\phi\rangle\langle\phi| = \frac{1}{2}\left(|HVV\rangle\langle HVV| + |VHH\rangle\langle VHH|\right) + \frac{1}{8}\left(XXX + YXY - XYY + YYX\right),$$
where $X$, $Y$ and $Z$ are the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$, respectively. Fig. 10 shows the experimental data. The expectation values of $|HVV\rangle\langle HVV| + |VHH\rangle\langle VHH|$ and $(XXX + YXY - XYY + YYX)/4$ are 0.987(1) and 0.921(12), respectively. Thus, the state fidelity of $|\phi\rangle$ can be calculated as $F_{|\phi\rangle} = 0.954(6)$, which exceeds the threshold of 0.5 required for the entanglement witness. Genuine three-photon entanglement is therefore confirmed with high statistical significance (∼76 standard deviations).
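The Pauli decomposition and the quoted fidelity can be cross-checked numerically. The sketch below assumes the encoding $|H\rangle = |0\rangle$, $|V\rangle = |1\rangle$ with the standard Pauli matrices, and assumes that the two quoted uncertainties are independent and add in quadrature; the variable names are illustrative.

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
H = np.array([1, 0], dtype=complex)   # |H> = |0>
V = np.array([0, 1], dtype=complex)   # |V> = |1>

def kron(*ops):
    """Tensor product of several vectors or matrices."""
    return reduce(np.kron, ops)

hvv, vhh = kron(H, V, V), kron(V, H, H)
phi = (hvv + vhh) / np.sqrt(2)
rho_ideal = np.outer(phi, phi.conj())

# Population and coherence parts of the ideal density matrix
populations = np.outer(hvv, hvv.conj()) + np.outer(vhh, vhh.conj())
coherences = (kron(X, X, X) + kron(Y, X, Y) - kron(X, Y, Y) + kron(Y, Y, X)) / 4
decomposition_ok = np.allclose(rho_ideal, populations / 2 + coherences / 2)

# Fidelity from the two measured expectation values quoted in the text
fidelity = (0.987 + 0.921) / 2
sigma = np.sqrt(0.001**2 + 0.012**2) / 2   # assumes independent errors
n_sigma = (fidelity - 0.5) / sigma
print(decomposition_ok, round(fidelity, 3), round(sigma, 3))
```

With these inputs the fidelity evaluates to 0.954 with an uncertainty of about 0.006, and `n_sigma` comes out at roughly 75, in line with the quoted ∼76 standard deviations, which follows when the rounded uncertainty 0.006 is used.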

IX. STATE RECONSTRUCTIONS
The matrix forms of the reconstructed, experimentally obtained states $\rho^{\epsilon_1}_{\mathrm{exp}}$ and $\rho^{\epsilon_2}_{\mathrm{exp}}$ are,
To avoid this problem, we employ maximum likelihood estimation [58]