
Quantum tensor singular value decomposition*


Published 2 July 2021 © 2021 The Author(s). Published by IOP Publishing Ltd
Citation: Xiaoqiang Wang et al 2021 J. Phys. Commun. 5 075001. DOI: 10.1088/2399-6528/ac0d5f


Abstract

Tensors are increasingly ubiquitous in various areas of applied mathematics and computing, and tensor decompositions are of practical significance, benefiting many applications in data completion, image processing, computer vision, collaborative filtering, etc. Recently, Kilmer and Martin proposed a new tensor factorization strategy, the tensor singular value decomposition (t-svd), which extends the matrix singular value decomposition to tensors. However, computing the t-svd of high dimensional tensors is computationally expensive and thus cannot efficiently handle large scale datasets. Motivated by the advantages of quantum computation, in this paper we present a quantum algorithm of t-svd for third-order tensors and then extend it to order-p tensors. We prove that our quantum t-svd algorithm for a third-order N dimensional tensor runs in time ${ \mathcal O }\left(N\mathrm{polylog}(N)\right)$ if we do not recover classical information from the quantum output state. Moreover, we apply our quantum t-svd algorithm to context-aware multidimensional recommendation systems, where only partial classical information needs to be extracted from the quantum output state, thus achieving low time complexity.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

A tensor (hypermatrix) refers to a multidimensional array of numbers. It has been applied in several areas including image deblurring, denoising, video recovery, data completion, tensor networks, multi-partite quantum systems, and machine learning [1–19], owing to the flexibility of tensors in representing data. Some of these applications utilize tensor decompositions, including CANDECOMP/PARAFAC (CP) [20], tensor-train decomposition (TT) [21], TUCKER [22], higher-order singular value decomposition (HOSVD) [12, 23, 24], tensor singular value decomposition (t-svd) [1, 3, 19, 25], etc.

During the last decade, plenty of research has been carried out on t-svd and its applications. The t-svd factorization strategy was first proposed by Kilmer and Martin [1] for third-order tensors and was then extended to order-p tensors by Martin et al in 2013 [26]. The t-svd algorithm extends the matrix svd strategy to tensors while avoiding the loss of information inherent in unfolding tensors, as is done in the CP and TUCKER decompositions. The general idea of the t-svd factorization is to perform matrix svds in the Fourier domain; consequently it allows other matrix factorization techniques, e.g., the QR decomposition, to be extended to tensors easily using a similar idea. Moreover, the t-svd is superior to Tucker or HOSVD in the sense that the truncated t-svd gives an optimal approximation of a tensor measured by the Frobenius norm, while this best approximation cannot be obtained by truncating the full HOSVD or Tucker decomposition. Due to this optimality property, t-svd has been shown to outperform HOSVD in facial recognition [27] and tensor completion [2, 28].

However, the complexity of calculating the full t-svd of a third-order N dimensional tensor is ${ \mathcal O }({N}^{4})$, which is extremely high for large scale datasets. Hence much work has been devoted to low-rank approximate t-svd representations, which give up the optimality property in exchange for comparatively low complexity. In [29], Zhang et al propose a randomized t-svd method which can produce a factorization with similar properties to the t-svd, with the computational complexity reduced to ${ \mathcal O }({{kN}}^{3}+{N}^{3}\mathrm{log}N)$, where k is the truncation term.

Considering the high cost of the existing classical t-svd algorithms, we present a quantum version of t-svd for third-order tensors which reduces the complexity to ${ \mathcal O }(N\mathrm{polylog}(N))$. To the best of our knowledge, the efficiency of this algorithm beats any known classical t-svd algorithm in the literature. In section 4, we extend the quantum t-svd algorithm to order-p tensors.

An important step in a classical t-svd algorithm is to perform the discrete Fourier transform (DFT) along the third mode of a tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$, obtaining $\hat{{ \mathcal A }}$ with computational complexity ${ \mathcal O }({N}_{3}\mathrm{log}{N}_{3})$ for each tube ${ \mathcal A }(i,j,:)$, i = 0, ⋯ ,N1 − 1; j = 0, ⋯ ,N2 − 1. Thus, the complexity of performing the DFT on all tubes of the tensor ${ \mathcal A }$ is ${ \mathcal O }({N}_{1}{N}_{2}{N}_{3}\mathrm{log}{N}_{3})$. In the quantum t-svd algorithm to be proposed, this procedure is accelerated by the quantum Fourier transform (QFT) [30], whose complexity is only ${ \mathcal O }\left({\left(\mathrm{log}{N}_{3}\right)}^{2}\right)$. Moreover, due to quantum superposition, the QFT can be performed on the third register of the state $| { \mathcal A }\rangle $ as a whole, which is equivalent to performing the DFT on all tubes of ${ \mathcal A }$ in parallel, so the total complexity of this step is still ${ \mathcal O }\left({\left(\mathrm{log}{N}_{3}\right)}^{2}\right)$.
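As a classical point of reference for Step 1, the mode-3 DFT can be sketched in a few lines of NumPy (an illustration only; the tensor and its dimensions are made up): a single `fft` call along the third axis reproduces the tube-by-tube DFT.

```python
import numpy as np

# Classical counterpart of Step 1: the DFT along the third mode of a
# third-order tensor, applied tube by tube.  (The quantum algorithm
# replaces this with a single QFT on the third register.)
rng = np.random.default_rng(0)
N1, N2, N3 = 4, 3, 8
A = rng.standard_normal((N1, N2, N3))

# One call transforms every tube A(i, j, :) at once ...
A_hat = np.fft.fft(A, axis=2)

# ... which matches an explicit loop over all N1*N2 tubes.
A_hat_loop = np.empty_like(A_hat)
for i in range(N1):
    for j in range(N2):
        A_hat_loop[i, j, :] = np.fft.fft(A[i, j, :])

assert np.allclose(A_hat, A_hat_loop)
```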

After performing the QFT, in order to further accelerate the second step of the classical t-svd algorithm, which performs a matrix svd on every frontal slice of $\hat{{ \mathcal A }}$, we apply a modified quantum singular value estimation (QSVE) algorithm, originally proposed in [31], to the frontal slices $\hat{{ \mathcal A }}(:,:,i)$ in parallel, with complexity at most ${ \mathcal O }(N\mathrm{polylog}(N))$ for N dimensional tensors. Traditionally, the quantum singular value decomposition of non-sparse low-rank matrices involves exponentiating matrices and outputs a superposition state of singular values and their associated singular vectors. However, this Hamiltonian simulation method requires that the matrix to be exponentiated is low-rank, which is difficult to satisfy in general. In our algorithm, we use the modified QSVE algorithm, where the matrix need not be low-rank, sparse, or Hermitian.

The main contributions of this paper are listed as follows:

The original QSVE algorithm proposed in [31] has to be carefully modified to become a useful subroutine in our quantum t-svd algorithm. Specifically, the original QSVE, stated in lemma 3, requires the matrix A to be stored in the classical binary tree structure, after which the singular values of A can be estimated efficiently. Given a tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$, in algorithm 3, QSVE is performed on the matrices ${\hat{A}}^{(m)}$, which are the frontal slices of $\hat{{ \mathcal A }}$, m = 0, ⋯ ,N3 − 1. The difficulty lies in the fact that we cannot require all ${\hat{A}}^{(m)}$ to be stored in the data structure, since they are only obtained after the QFT. It is more reasonable to assume that the frontal slices of the original tensor ${ \mathcal A }$ are stored in the binary tree structure. Therefore, the main obstacle we need to overcome is estimating the singular values of ${\hat{A}}^{(m)}$ under the condition that every frontal slice A(k) of the original tensor is stored in the data structure. This problem is solved by theorem 2, whose proof presents a detailed illustration of this process.

In section 5, we design a quantum tensor approximation algorithm based on algorithm 3 and present an application of this algorithm, namely context-aware multidimensional recommendation systems. Our quantum tensor approximation algorithm suits the 3D recommendation systems model very well, mainly for two reasons.

First, compared with other tensor decompositions, t-svd has been shown to be superior in capturing spatial-shifting correlations [28], so it is suitable for modeling 3D recommendation systems. Suppose the preference information of a user is encoded in a third-order tensor in which the three modes represent locations, points-of-interest, and time frames respectively; a user's preference at a certain time is then very likely to affect the recommendations for him/her at other times. The QFT in the t-svd algorithm binds a user's preferences at different times together, so a recommendation using t-svd can improve the user experience by integrating the relations between different contexts.

Second, our quantum t-svd algorithm can achieve good results at low cost when applied to 3D recommendation systems. In fact, it is not necessary to reconstruct the entire tensor as in the classical 3D recommendation systems algorithms based on tensor factorizations, such as the truncated t-svd with complexity ${ \mathcal O }({{kN}}^{3}+{N}^{3}\mathrm{log}N)$ [3] and the truncated HOSVD (T-HOSVD) with complexity ${ \mathcal O }(3{{kN}}^{3})$ [32] for third-order N dimensional tensors, where k is the truncation rank. Our quantum 3D recommendation systems algorithm only samples high-value elements from the approximated tensor (corresponding to measuring the output state in the computational basis a certain number of times), and this is exactly what we need for recommendation systems. Consequently, our quantum 3D recommendation systems algorithm achieves the complexity ${ \mathcal O }(N\mathrm{polylog}N)$ if the preference tensor has several dominating (namely, high-value) elements.

The rest of this paper is organized as follows. A standard classical t-svd algorithm and several related concepts are introduced in section 2.2; section 2.3 summarizes the quantum singular value estimation algorithm proposed in [31]. Section 3 presents our main algorithm, quantum t-svd, and its complexity analysis. We extend the quantum t-svd algorithm to order-p tensors in section 4. In section 5, we design a quantum tensor approximation algorithm based on classical truncated t-svd and then provide an application on context-aware multidimensional recommendation systems. In section 6, we conclude the paper.

2. Preliminaries

In section 2.1, we first introduce the concept of a tensor and the notation used throughout the paper. In section 2.2, we review the concept of the t-product and the t-svd algorithm proposed by Kilmer et al [1] in 2011. Then in section 2.3, we briefly review the quantum singular value estimation (QSVE) algorithm [31] proposed by Kerenidis and Prakash.

2.1. Tensor background and notation

A tensor ${ \mathcal A }=({a}_{{i}_{1}{i}_{2}\cdots {i}_{p}})\in {{\mathbb{C}}}^{{N}_{1}\times {N}_{2}\times \cdots \times {N}_{p}}$ is a multidimensional array of data, where p is the order and (N1, ⋯ ,Np ) is the dimension. The order of a tensor is the number of modes. For instance, ${ \mathcal A }\in {{\mathbb{C}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ is a third-order tensor of complex values with dimension Ni for mode i, i = 1, 2, 3, respectively. In this sense, a matrix A can be considered as a second-order tensor, and a vector x is a tensor of order 1. For a third-order tensor, we use the terms frontal slice ${ \mathcal A }(:,:,i)$, horizontal slice ${ \mathcal A }(i,:,:)$ and lateral slice ${ \mathcal A }(:,i,:)$ (see figure 1). By fixing all indices but the last one, the result is a tube of size 1 × 1 × N3, which is actually a vector. For example, ${ \mathcal A }(i,j,:)$ is the (i, j)-th tube of ${ \mathcal A }$.


Figure 1. (a) frontal slices, (b) horizontal slices, (c) lateral slices of a third-order tensor. (d) a lateral slice as a vector of tubes.


Figure 2. The t-svd of ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$.


Notation. In this paper, script letters are used to denote higher-order tensors (${ \mathcal A }$, ${ \mathcal B }$, ⋯). Capital nonscript letters are used to represent matrices (A, B, ⋯), and vectors are written as boldface lower case letters ( x , y , ⋯). DFT( u ) refers to performing the discrete Fourier transform (DFT) on u , which is computed by the fast Fourier transform, represented in Matlab notation as fft( u ). The tensor after the DFT along the third mode of ${ \mathcal A }$ is denoted by $\hat{{ \mathcal A }}$, i.e. $\hat{{ \mathcal A }}=\mathrm{fft}({ \mathcal A },[],3)$. Hence we have ${ \mathcal A }=\mathrm{ifft}(\hat{{ \mathcal A }},[],3)$, which is the inverse of the above operation. We use A(i) to denote the i-th frontal slice ${ \mathcal A }(:,:,i)$ for short; hence the m-th frontal slice of $\hat{{ \mathcal A }}$ is ${\hat{A}}^{(m)}$. There are three types of product we would like to clarify here: ${\boldsymbol{u}}* {\boldsymbol{v}}$ refers to the circular convolution between the vectors u and v , ⊙ is the element-wise product, and ${ \mathcal A }* { \mathcal B }$ represents the t-product between the tensors ${ \mathcal A }$ and ${ \mathcal B }$.
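The three products are connected by the familiar convolution theorem: the circular convolution of two tubes becomes an element-wise product after the DFT, which is the fact underlying the Fourier-domain computation of the t-product. A small NumPy check (with made-up tubes) illustrates this:

```python
import numpy as np

# Circular convolution of two tubes equals an element-wise product
# in the Fourier domain: DFT(u * v) = DFT(u) (element-wise) DFT(v).
rng = np.random.default_rng(1)
n = 8
u = rng.standard_normal(n)
v = rng.standard_normal(n)

# Direct circular convolution: (u * v)_t = sum_k u_k v_{(t-k) mod n}.
conv = np.array([sum(u[k] * v[(t - k) % n] for k in range(n))
                 for t in range(n)])

# Via FFT: transform, multiply element-wise, transform back.
conv_fft = np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)).real

assert np.allclose(conv, conv_fft)
```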

2.2. The t-svd algorithm and t-product

In this subsection, we present the classical t-svd as well as its pseudocode, algorithm 1. For readability of the main text, all mathematical details are put in appendix A. Simply speaking, the t-svd of a tensor can be interpreted as the usual matrix svd in the Fourier domain, as can be seen in algorithm 1.

Theorem 1 (tensor singular value decomposition (t-svd) [1]). For ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$, its t-svd is given by ${ \mathcal A }={ \mathcal U }* { \mathcal S }* {{ \mathcal V }}^{T}$, where ${ \mathcal U }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{1}\times {N}_{3}}$ and ${ \mathcal V }\in {{\mathbb{R}}}^{{N}_{2}\times {N}_{2}\times {N}_{3}}$ are orthogonal tensors, and every frontal slice of ${ \mathcal S }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ is a diagonal matrix (see figure 2).

Algorithm 1. t-svd for third-order tensors [1]

Input: ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$
Output: ${ \mathcal U }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{1}\times {N}_{3}},{ \mathcal S }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}},{ \mathcal V }\in {{\mathbb{R}}}^{{N}_{2}\times {N}_{2}\times {N}_{3}}$
$\hat{{ \mathcal A }}=\mathrm{fft}({ \mathcal A },[],3)$;
for $i=1,\cdots ,{N}_{3}$ do
$[U,S,V]=\mathrm{svd}(\hat{{ \mathcal A }}(:,:,i))$;
$\hat{{ \mathcal U }}(:,:,i)=U;\hat{{ \mathcal S }}(:,:,i)=S;\hat{{ \mathcal V }}(:,:,i)=V$;
end for
${ \mathcal U }=\mathrm{ifft}(\hat{{ \mathcal U }},[],3);{ \mathcal S }=\mathrm{ifft}(\hat{{ \mathcal S }},[],3);{ \mathcal V }=\mathrm{ifft}(\hat{{ \mathcal V }},[],3)$;
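For concreteness, algorithm 1 can be transcribed almost line by line into NumPy; the sketch below (our illustration, not part of the original pseudocode) also verifies that the slice-wise factors reconstruct ${ \mathcal A }$.

```python
import numpy as np

def t_svd(A):
    """Classical t-svd (algorithm 1): an svd of each frontal slice in
    the Fourier domain, then inverse FFTs along the third mode."""
    N1, N2, N3 = A.shape
    A_hat = np.fft.fft(A, axis=2)
    U_hat = np.empty((N1, N1, N3), dtype=complex)
    S_hat = np.zeros((N1, N2, N3), dtype=complex)
    V_hat = np.empty((N2, N2, N3), dtype=complex)
    for i in range(N3):
        U, s, Vh = np.linalg.svd(A_hat[:, :, i])
        S = np.zeros((N1, N2), dtype=complex)
        np.fill_diagonal(S, s)
        U_hat[:, :, i] = U
        S_hat[:, :, i] = S
        V_hat[:, :, i] = Vh.conj().T
    # Back to the original domain.
    return (np.fft.ifft(U_hat, axis=2), np.fft.ifft(S_hat, axis=2),
            np.fft.ifft(V_hat, axis=2), (U_hat, S_hat, V_hat))

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3, 5))
U, S, V, (U_hat, S_hat, V_hat) = t_svd(A)

# Slice-wise reconstruction in the Fourier domain recovers A.
R_hat = np.stack([U_hat[:, :, i] @ S_hat[:, :, i] @ V_hat[:, :, i].conj().T
                  for i in range(5)], axis=2)
assert np.allclose(np.fft.ifft(R_hat, axis=2).real, A)
```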

For an order-p tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times \cdots \times {N}_{p}}$, the frontal slices of ${ \mathcal A }$ are referenced using linear indexing by reshaping the tensor into an ${N}_{1}\times {N}_{2}\times {N}_{3}{N}_{4}\cdots {N}_{p}$ third-order tensor; the i-th frontal slice is then ${ \mathcal A }(:,:,i)$. Using this representation, one version of the MATLAB pseudocode of the t-svd algorithm for order-p tensors is provided below.

Algorithm 2. t-svd for order-p tensors [26]

Input: ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times \cdots \times {N}_{p}},\iota ={N}_{3}{N}_{4}\cdots {N}_{p}$
for $i=3,\cdots ,p$ do
$\hat{{ \mathcal A }}=\mathrm{fft}({ \mathcal A },[],i)$;
end for
for $i=1,\cdots ,\iota $ do
$[U,S,V]=\mathrm{svd}(\hat{{ \mathcal A }}(:,:,i))$;
$\hat{{ \mathcal U }}(:,:,i)=U;\hat{{ \mathcal S }}(:,:,i)=S;\hat{{ \mathcal V }}(:,:,i)=V$;
end for
for $i=p,\cdots ,3$ do
${ \mathcal U }=\mathrm{ifft}(\hat{{ \mathcal U }},[],i);{ \mathcal S }=\mathrm{ifft}(\hat{{ \mathcal S }},[],i);{ \mathcal V }=\mathrm{ifft}(\hat{{ \mathcal V }},[],i)$;
end for
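A hedged NumPy sketch of algorithm 2, using the linear indexing of frontal slices described above (the reshape convention and the order-4 test tensor are our own illustrative choices; the sketch returns the reconstruction in order to verify the factorization):

```python
import numpy as np

def t_svd_orderp(A):
    """Classical t-svd for an order-p tensor (algorithm 2): FFTs along
    modes 3..p, an svd of every frontal slice, then inverse FFTs.
    Returns the reconstruction, to verify the factorization."""
    N1, N2 = A.shape[:2]
    A_hat = A.astype(complex)
    for ax in range(2, A.ndim):                 # modes 3, ..., p
        A_hat = np.fft.fft(A_hat, axis=ax)
    # Linear indexing: view the trailing modes as one slice index.
    slices = A_hat.reshape(N1, N2, -1)
    R = np.empty_like(slices)
    for m in range(slices.shape[2]):
        U, s, Vh = np.linalg.svd(slices[:, :, m])
        S = np.zeros((N1, N2), dtype=complex)
        np.fill_diagonal(S, s)
        R[:, :, m] = U @ S @ Vh                 # rebuild slice m
    R = R.reshape(A_hat.shape)
    for ax in range(A.ndim - 1, 1, -1):         # modes p, ..., 3
        R = np.fft.ifft(R, axis=ax)
    return R

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4, 2, 2))           # an order-4 tensor
assert np.allclose(t_svd_orderp(A).real, A)
```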

In the remainder of this subsection, we summarize some properties of t-svd, which will be used in the quantum tensor approximation algorithm to be developed in section 5.

In the t-svd literature, the diagonal elements of the tensor ${ \mathcal S }$ are called the singular values of ${ \mathcal A }$. Moreover, the l2 norms of the nonzero tubes ${ \mathcal S }(i,i,:)$ are in descending order, i.e.

$\parallel { \mathcal S }(0,0,:){\parallel }_{2}\geqslant \parallel { \mathcal S }(1,1,:){\parallel }_{2}\geqslant \cdots \geqslant \parallel { \mathcal S }(\min ({N}_{1},{N}_{2})-1,\min ({N}_{1},{N}_{2})-1,:){\parallel }_{2}\geqslant 0.$

However, it should be noted that the diagonal elements of ${ \mathcal S }$ may be unordered and even negative due to the inverse DFT. Thus, the truncated t-svd method for data approximation or tensor completion is designed by truncating the diagonal elements of $\hat{{ \mathcal S }}$ instead of ${ \mathcal S }$, as the diagonal elements of the former are non-negative and in descending order; see lemma 1.

Lemma 1 ([1, 29]). Suppose the t-svd of the tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ is ${ \mathcal A }={ \mathcal U }* { \mathcal S }* {{ \mathcal V }}^{T}$. Then we have

${ \mathcal A }={\sum }_{i=0}^{\min ({N}_{1},{N}_{2})-1}{ \mathcal U }(:,i,:)* { \mathcal S }(i,i,:)* { \mathcal V }{\left(:,i,:\right)}^{T},$

where the matrices ${ \mathcal U }(:,i,:)$ and ${ \mathcal V }(:,i,:)$ and the vector ${ \mathcal S }(i,i,:)$ are regarded as third-order tensors. For $1\leqslant k\lt \min ({N}_{1},{N}_{2})$, define ${{ \mathcal A }}_{k}\triangleq {\sum }_{i=0}^{k-1}{ \mathcal U }(:,i,:)* { \mathcal S }(i,i,:)* { \mathcal V }{\left(:,i,:\right)}^{T}$. Then

${{ \mathcal A }}_{k}=\arg \mathop{\min }\limits_{\tilde{{ \mathcal A }}\in {{ \mathcal M }}_{k}}\parallel { \mathcal A }-\tilde{{ \mathcal A }}{\parallel }_{F},$

where ${{ \mathcal M }}_{k}=\{{ \mathcal X }* { \mathcal Y }| { \mathcal X }\in {{\mathbb{R}}}^{{N}_{1}\times k\times {N}_{3}},{ \mathcal Y }\in {{\mathbb{R}}}^{k\times {N}_{2}\times {N}_{3}}\}$. Therefore, $\parallel { \mathcal A }-{{ \mathcal A }}_{k}{\parallel }_{F}$ is the theoretical minimal error, given by $\parallel { \mathcal A }-{{ \mathcal A }}_{k}{\parallel }_{F}=\sqrt{{\sum }_{i=k}^{\min ({N}_{1},{N}_{2})-1}{\parallel { \mathcal S }(i,i,:)\parallel }_{2}^{2}}$.
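Lemma 1 can be checked numerically: truncating each Fourier-domain slice to its k leading singular triplets yields ${{ \mathcal A }}_{k}$, and the resulting error matches the tail sum over the discarded tubes ${ \mathcal S }(i,i,:)$ (the tail starts at index k, since ${{ \mathcal A }}_{k}$ keeps tubes 0, ⋯ ,k − 1). A small NumPy sketch with arbitrarily chosen dimensions:

```python
import numpy as np

rng = np.random.default_rng(4)
N1, N2, N3, k = 5, 4, 6, 2
A = rng.standard_normal((N1, N2, N3))
r = min(N1, N2)

# Fourier domain: an svd of every frontal slice.
A_hat = np.fft.fft(A, axis=2)
svds = [np.linalg.svd(A_hat[:, :, m]) for m in range(N3)]

# A_k: keep only the k leading singular triplets of every slice.
Ak_hat = np.stack(
    [U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :] for (U, s, Vh) in svds], axis=2)
A_k = np.fft.ifft(Ak_hat, axis=2).real

# Tubes S(i, i, :) in the original domain: the inverse DFT of the
# Fourier-domain singular values.
sigma_hat = np.stack([s for (_, s, _) in svds], axis=1)   # r x N3
S_tubes = np.fft.ifft(sigma_hat, axis=1)                  # row i = S(i, i, :)

# Lemma 1: the truncation error is the tail sum over discarded tubes.
err = np.linalg.norm(A - A_k)
tail = np.sqrt(sum(np.linalg.norm(S_tubes[i]) ** 2 for i in range(k, r)))
assert np.isclose(err, tail)
```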

2.3. Quantum singular value estimation

In [31], Kerenidis and Prakash propose a quantum singular value estimation (QSVE) algorithm. They assume that the input data is stored in a classical binary tree data structure, as stated in the following lemma, such that the QSVE algorithm with access to this data structure can efficiently create superpositions of the rows of the stored matrix.

Lemma 2 ([31], theorem 5.1). Consider a matrix $A\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}}$ with τ nonzero entries. Let Ai be its i-th row, and ${{\boldsymbol{s}}}_{A}=\tfrac{1}{\parallel A{\parallel }_{F}}{\left[\parallel {A}_{0}{\parallel }_{2},\parallel {A}_{1}{\parallel }_{2},\cdots ,\parallel {A}_{{N}_{1}-1}{\parallel }_{2}\right]}^{T}.$ There exists a data structure storing the matrix A in ${ \mathcal O }\left(\tau {\mathrm{log}}^{2}({N}_{1}{N}_{2})\right)$ space such that a quantum algorithm having access to this data structure can perform the mapping ${U}_{P}:| i\rangle | 0\rangle \to | i\rangle | {A}_{i}\rangle $, for $i=0,\cdots ,{N}_{1}-1$ and ${U}_{Q}:| 0\rangle | j\rangle \to | {{\boldsymbol{s}}}_{A}\rangle | j\rangle $, for $j=0,\cdots ,{N}_{2}-1$ in time $\mathrm{polylog}({N}_{1}{N}_{2})$.
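The data structure of lemma 2 can be mimicked classically by a family of binary trees whose leaves hold squared amplitudes and whose internal nodes hold partial sums: one tree per row of A, plus one tree over the vector of row norms. The sketch below is a simplified illustration (it omits the sign information and the polylogarithmic addressing of the actual structure):

```python
import numpy as np

def build_tree(values):
    """Binary tree over squared values: level 0 holds the leaves, the
    last level is the single root holding the total sum of squares."""
    level = np.asarray(values, dtype=float) ** 2
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                      # pad odd levels
            level = np.append(level, 0.0)
        level = level[0::2] + level[1::2]       # pairwise parent sums
        levels.append(level)
    return levels

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 8))

# One tree per row A_i, and one tree over the row norms ||A_i||_2.
row_trees = [build_tree(A[i]) for i in range(4)]
row_norms = [np.sqrt(t[-1][0]) for t in row_trees]
norm_tree = build_tree(row_norms)

# The root of each row tree is ||A_i||_2^2, and the root of the norm
# tree is ||A||_F^2.  Walking down a tree selects an index with
# probability proportional to the squared amplitude, which is what
# enables the maps U_P and U_Q.
assert np.isclose(norm_tree[-1][0], np.linalg.norm(A) ** 2)
```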

The following lemma summarizes the main idea of the QSVE algorithm and its detailed description can be found in [31].

Lemma 3 ([31], theorem 5.2). Let $A\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}}$ and ${\boldsymbol{x}}\in {{\mathbb{R}}}^{{N}_{2}}$ be stored in the data structure mentioned in lemma 2. Let the singular value decomposition of A be $A={\sum }_{l=0}^{r-1}{\sigma }_{l}| {u}_{l}\rangle \langle {v}_{l}| $, where $r=\min ({N}_{1},{N}_{2})$. The input state $| x\rangle $ can be represented in the basis of right singular vectors of A, i.e. $| x\rangle ={\sum }_{l=0}^{{N}_{2}-1}{\beta }_{l}| {v}_{l}\rangle $. Let $\epsilon \gt 0$ be the precision parameter. Then there is a quantum algorithm, denoted as ${U}_{\mathrm{SVE}}$, that runs in time ${ \mathcal O }(\mathrm{polylog}({N}_{1}{N}_{2})/\epsilon )$ and achieves

${U}_{\mathrm{SVE}}:| x\rangle | 0\rangle \mapsto {\sum }_{l=0}^{{N}_{2}-1}{\beta }_{l}| {v}_{l}\rangle | {\overline{\sigma }}_{l}\rangle $

with probability at least $1-1/\mathrm{poly}({N}_{2})$, where ${\overline{\sigma }}_{l}$ is the estimated value of ${\sigma }_{l}$ satisfying $| {\overline{\sigma }}_{l}-{\sigma }_{l}| \leqslant \epsilon \parallel A{\parallel }_{F}$ for all l.

Remark 1. With regard to the matrix A stated in lemma 3, we can also choose the input state as $| A\rangle =\tfrac{1}{\parallel A{\parallel }_{F}}{\sum }_{l=0}^{r-1}{\sigma }_{l}| {u}_{l}\rangle | {v}_{l}\rangle $, corresponding to the vectorized form of the normalized matrix $\tfrac{A}{\parallel A{\parallel }_{F}}$ represented in the svd form. This representation of the input state is adopted in section 3. Note that we are able to express the state $| A\rangle $ in the above form even if the singular pairs of A are not known. According to lemma 3, we can obtain ${\overline{\sigma }}_{l}$, an estimate of ${\sigma }_{l}$, stored in the third register superposed with the singular vectors $\{| {u}_{l}\rangle ,| {v}_{l}\rangle \}$ after performing ${U}_{\mathrm{SVE}}$, i.e. the output state is $\tfrac{1}{\parallel A{\parallel }_{F}}{\sum }_{l=0}^{r-1}{\sigma }_{l}| {u}_{l}\rangle | {v}_{l}\rangle | {\overline{\sigma }}_{l}\rangle $, where $| {\overline{\sigma }}_{l}-{\sigma }_{l}| \leqslant \epsilon \parallel A{\parallel }_{F}$ for all $l=0,\cdots ,r-1$.
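The vectorized form of the input state used in remark 1 is easy to verify classically: $\mathrm{vec}(A/\parallel A{\parallel }_{F})$ coincides with $\tfrac{1}{\parallel A{\parallel }_{F}}{\sum }_{l}{\sigma }_{l}\,{u}_{l}\otimes {v}_{l}$. A small NumPy check with an arbitrary matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# |A> corresponds to vec(A / ||A||_F); the svd form expresses the same
# vector as sum_l sigma_l |u_l>|v_l> = sum_l sigma_l (u_l kron v_l).
vec_A = A.flatten() / np.linalg.norm(A)
svd_form = sum(s[l] * np.kron(U[:, l], Vh[l, :])
               for l in range(len(s))) / np.linalg.norm(A)
assert np.allclose(vec_A, svd_form)
```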

3. Quantum t-svd for third-order tensors

3.1. The algorithm

In this section, we first present our quantum t-svd algorithm, algorithm 3, for third-order tensors ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$, then explain it in detail.

Assumption 1. Every frontal slice of ${ \mathcal A }$ is stored in a tree structure introduced in lemma 2.

Assumption 2. We can prepare the state

$| { \mathcal A }\rangle =\displaystyle \sum _{i,j,k}{a}_{{ijk}}| i{\rangle }^{c}| j{\rangle }^{d}| k{\rangle }^{e}$ (1)

efficiently. Without loss of generality, we assume that ${\parallel { \mathcal A }\parallel }_{F}=1$.

Algorithm 3. Quantum t-svd for third-order tensors

Input: tensor ${ \mathcal A }=({a}_{{ijk}})\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ prepared in a quantum state $| { \mathcal A }\rangle $, precision ${\epsilon }_{\mathrm{SVE}}^{(m)}$, $m=0,\cdots ,{N}_{3}-1$, $r=\min \{{N}_{1},{N}_{2}\}$.
Output: state $| \phi \rangle .$
1: Perform the QFT on the third register of the quantum state $| { \mathcal A }\rangle $ in (1) to get
$| \hat{{ \mathcal A }}\rangle =\displaystyle \frac{1}{\sqrt{{N}_{3}}}\sum _{m=0}^{{N}_{3}-1}\left(\sum _{i,j,k}{\omega }^{{km}}{a}_{{ijk}}| i{\rangle }^{c}| j{\rangle }^{d}\right)| m{\rangle }^{e}$. (2)
2: Perform the controlled operation
$U\triangleq \sum _{m=0}^{{N}_{3}-1}{U}_{\mathrm{SVE}}^{(m)}\otimes | m\rangle \langle m| $ (3)
on the state $| \hat{{ \mathcal A }}\rangle $ to obtain
$U| \hat{{ \mathcal A }}\rangle =\sum _{m=0}^{{N}_{3}-1}\sum _{l=0}^{r-1}{\hat{\sigma }}_{l}^{(m)}| {\hat{u}}_{l}^{(m)}{\rangle }^{c}| {\hat{v}}_{l}^{(m)}{\rangle }^{d}| {\overline{\hat{\sigma }}}_{l}^{(m)}{\rangle }^{a}| m{\rangle }^{e}$, (4)
where ${\overline{\hat{\sigma }}}_{l}^{(m)}$ is the estimated value of ${\hat{\sigma }}_{l}^{(m)}$, and the singular value decomposition of ${\hat{A}}^{(m)}$ is ${\hat{A}}^{(m)}={\sum }_{l=0}^{r-1}{\hat{\sigma }}_{l}^{(m)}{\hat{u}}_{l}^{(m)}{\hat{v}}_{l}^{(m)\dagger }$.
3: Perform the inverse QFT on the last register of (4) and output the state $| \phi \rangle $ expressed as
$\displaystyle \frac{1}{\sqrt{{N}_{3}}}\sum _{t,m=0}^{{N}_{3}-1}\sum _{l=0}^{r-1}{\hat{\sigma }}_{l}^{(m)}{\omega }^{-{tm}}| {\hat{u}}_{l}^{(m)}{\rangle }^{c}| {\hat{v}}_{l}^{(m)}{\rangle }^{d}| {\overline{\hat{\sigma }}}_{l}^{(m)}{\rangle }^{a}| t{\rangle }^{e}$, (5)
where $\omega ={e}^{2\pi {\rm{i}}/{N}_{3}}$.
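The amplitude bookkeeping of algorithm 3 can be simulated classically on a small example (an idealized sketch that ignores the estimation error in ${\overline{\hat{\sigma }}}_{l}^{(m)}$): Step 1 is a unitary DFT along mode 3, and after Step 2 the squared amplitudes attached to the singular triplets sum to one, so the inverse QFT of Step 3 also outputs a unit-norm state.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 4
A = rng.standard_normal((N, N, N))
A /= np.linalg.norm(A)                  # assumption 2: ||A||_F = 1

# Step 1: the QFT on the third register == a unitary DFT along mode 3.
A_hat = np.fft.fft(A, axis=2, norm='ortho')

# Step 2: on each slice, the SVE rotates |A_hat^(m)> into the singular
# basis; the amplitude attached to (u_l, v_l, sigma_bar, m) is the
# singular value sigma_hat_l^(m) of that slice.
amps = np.stack([np.linalg.svd(A_hat[:, :, m], compute_uv=False)
                 for m in range(N)], axis=1)       # r x N3 amplitudes

# The state stays normalized: the squared amplitudes sum to 1, so the
# inverse QFT of Step 3 also yields a unit-norm state.
assert np.isclose(np.sum(amps ** 2), 1.0)
```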

The quantum circuit of algorithm 3 is shown in figure 3, where the blocks ${U}_{\mathrm{SVE}}^{(m)}$, m = 0, ⋯ ,N3 − 1, are illustrated in figure 4.


Figure 3. Circuit for algorithm 3. ${U}_{{ \mathcal A }}$ is the unitary operator for preparing the state $| { \mathcal A }\rangle $. The QFT is denoted by F. ${N}_{i}={2}^{{n}_{i}},i=1,2,3$. The blocks ${U}_{\mathrm{SVE}}^{(m)}$ are further illustrated in figure 4.


Figure 4. Circuit for ${U}_{\mathrm{SVE}}^{(m)}$, m = 0, ⋯ ,N3 − 1. The initial state of register c and d is $| {\hat{A}}^{(m)}\rangle $. Um refers to ${U}_{{\hat{Q}}_{m}}$. ${U}_{{f}_{m}}$ is a unitary operator implemented through oracle with a computable function fm (x). The notation is further explained in the proof of theorem 2 in appendix B.


Before illustrating the algorithm, we first interpret the final quantum state ∣ϕ〉. Similar to the quantum singular value decomposition for matrices [33], where the output allows singular values and associated singular vectors to be revealed in a quantum form, the output state ∣ϕ〉 in our algorithm also contains the estimates ${\overline{\hat{\sigma }}}_{l}^{(m)}$ of ${\hat{\sigma }}_{l}^{(m)}$, which are stored in the third register in superposition with the corresponding singular vectors. Although the singular values of the tensor ${ \mathcal A }$ are defined as ${\sigma }_{l}^{(k)}=\tfrac{1}{\sqrt{{N}_{3}}}{\sum }_{m=0}^{{N}_{3}-1}{\omega }^{-{km}}{\hat{\sigma }}_{l}^{(m)}$, according to algorithm 1 for the classical t-svd, the singular values ${\hat{\sigma }}_{l}^{(m)}$ of ${\hat{A}}^{(m)}$ have wider applications than the singular values ${\sigma }_{l}^{(k)}$ of ${ \mathcal A }$. For example, some low-rank tensor completion problems are solved by minimizing the tensor nuclear norm, which is defined as the sum of all the singular values of ${\hat{A}}^{(m)}$ [3, 5]. Moreover, the theoretical minimal error truncation is also based on the singular values of ${\hat{A}}^{(m)}$; see lemma 1. Therefore, in algorithm 3, we estimate the values of ${\hat{\sigma }}_{l}^{(m)}$, m = 0, ⋯ ,N3 − 1; l = 0, ⋯ ,r − 1, and store them in the third register of the final state ∣ϕ〉 for future use. Furthermore, in terms of the circulant matrix $\mathrm{circ}({ \mathcal A })$ defined in definition 1, $\tfrac{1}{\sqrt{{N}_{3}}}{\sum }_{t=0}^{{N}_{3}-1}{\omega }^{-{tm}}| t\rangle | {\hat{v}}_{l}^{(m)}\rangle $ is the right singular vector corresponding to its singular value ${\hat{\sigma }}_{l}^{(m)}$. Similarly, the corresponding left singular vector is $\tfrac{1}{\sqrt{{N}_{3}}}{\sum }_{t=0}^{{N}_{3}-1}{\omega }^{-{tm}}| t\rangle | {\hat{u}}_{l}^{(m)}\rangle $.
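The claimed relation to $\mathrm{circ}({ \mathcal A })$ can be checked numerically. Assuming the standard block-circulant matrix of the t-svd literature (definition 1 is not reproduced in this section), its singular values coincide with the slice singular values ${\hat{\sigma }}_{l}^{(m)}$ of the Fourier-domain tensor:

```python
import numpy as np

rng = np.random.default_rng(8)
N1, N2, N3 = 3, 2, 4
A = rng.standard_normal((N1, N2, N3))

# Block-circulant matrix built from the frontal slices A^(0..N3-1)
# (the standard circ(A) of the t-svd literature).
blocks = [A[:, :, k] for k in range(N3)]
circ = np.block([[blocks[(i - j) % N3] for j in range(N3)]
                 for i in range(N3)])

# Its singular values are exactly the slice singular values of
# fft(A, [], 3), i.e. the sigma_hat_l^(m) used by algorithm 3.
A_hat = np.fft.fft(A, axis=2)
sv_fourier = np.concatenate(
    [np.linalg.svd(A_hat[:, :, m], compute_uv=False) for m in range(N3)])
sv_circ = np.linalg.svd(circ, compute_uv=False)
assert np.allclose(np.sort(sv_circ), np.sort(sv_fourier))
```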

Next, we give a detailed explanation on Step 2. We first rewrite the state in (2) for further use. For every fixed m, the unnormalized state

$\displaystyle \frac{1}{\sqrt{{N}_{3}}}\sum _{i,j,k}{\omega }^{{km}}{a}_{{ijk}}| i{\rangle }^{c}| j{\rangle }^{d}$ (6)

in (2) corresponds to the matrix

${\hat{A}}^{(m)}\triangleq \displaystyle \frac{1}{\sqrt{{N}_{3}}}\sum _{i,j,k}{\omega }^{{km}}{a}_{{ijk}}| i\rangle \langle j| ,$ (7)

namely, the m-th frontal slice of the tensor $\hat{{ \mathcal A }}$. Normalizing the state in (6) produces a quantum state

$| {\hat{A}}^{(m)}\rangle =\displaystyle \frac{1}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}\sum _{i,j}{\hat{a}}_{{ij}}^{(m)}| i{\rangle }^{c}| j{\rangle }^{d}.$ (8)

Therefore, the state $| \hat{{ \mathcal A }}\rangle $ in (2) can be rewritten as

$| \hat{{ \mathcal A }}\rangle =\displaystyle \sum _{m=0}^{{N}_{3}-1}{\parallel {\hat{A}}^{(m)}\parallel }_{F}\,| {\hat{A}}^{(m)}\rangle | m{\rangle }^{e}.$ (9)

In Step 2, we utilize the controlled operation U defined in (3) to estimate the singular values of ${\hat{A}}^{(m)}$, m = 0, ⋯ ,N3 − 1, in parallel. Due to quantum parallelism, the operator U performed on the superposition state $| \hat{{ \mathcal A }}\rangle $ is equivalent to ${U}_{\mathrm{SVE}}^{(m)}$ performed on each of the components $| {\hat{A}}^{(m)}\rangle $ as a single input. That is,

$U| \hat{{ \mathcal A }}\rangle =\displaystyle \sum _{m=0}^{{N}_{3}-1}{\parallel {\hat{A}}^{(m)}\parallel }_{F}\left({U}_{\mathrm{SVE}}^{(m)}| {\hat{A}}^{(m)}\rangle \right)| m{\rangle }^{e}.$ (10)

Next, we focus on the result of ${U}_{\mathrm{SVE}}^{(m)}| {\hat{A}}^{(m)}\rangle $ in (10). The state $| {\hat{A}}^{(m)}\rangle $ can be rewritten in the form

$| {\hat{A}}^{(m)}\rangle =\displaystyle \sum _{l=0}^{r-1}\frac{{\hat{\sigma }}_{l}^{(m)}}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}| {\hat{u}}_{l}^{(m)}\rangle | {\hat{v}}_{l}^{(m)}\rangle ,$ (11)

where $\tfrac{{\hat{\sigma }}_{l}^{(m)}}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}$ is the rescaled singular value of ${\hat{A}}^{(m)}$. The following theorem describes ${U}_{\mathrm{SVE}}^{(m)}$, a modified quantum singular value estimation process on each matrix ${\hat{A}}^{(m)}$ utilizing the corresponding input $| {\hat{A}}^{(m)}\rangle $ represented in (11).

Theorem 2. Given every frontal slice of the original tensor ${ \mathcal A }$ stored in the data structure (Lemma 2), there is a quantum algorithm, denoted as ${U}_{\mathrm{SVE}}^{(m)}$, that uses the input $| {\hat{A}}^{(m)}\rangle $ in (11) and outputs the state

${U}_{\mathrm{SVE}}^{(m)}| {\hat{A}}^{(m)}\rangle =\displaystyle \sum _{l=0}^{r-1}\frac{{\hat{\sigma }}_{l}^{(m)}}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}| {\hat{u}}_{l}^{(m)}\rangle | {\hat{v}}_{l}^{(m)}\rangle | {\overline{\hat{\sigma }}}_{l}^{(m)}\rangle $ (12)

with probability at least $1-1/\mathrm{poly}({N}_{2})$, where $({\hat{\sigma }}_{l}^{(m)},{\hat{u}}_{l}^{(m)},{\hat{v}}_{l}^{(m)})$ is the singular triplet of the matrix ${\hat{A}}^{(m)}$ in (7), and ${\epsilon }_{\mathrm{SVE}}^{(m)}$ is the precision such that $| {\overline{\hat{\sigma }}}_{l}^{(m)}-{\hat{\sigma }}_{l}^{(m)}| \leqslant {\epsilon }_{\mathrm{SVE}}^{(m)}{\parallel {\hat{A}}^{(m)}\parallel }_{F}$ for all $l=0,\cdots ,r-1$. For a tensor ${ \mathcal A }$ with the same dimension N on every mode, the running time to implement ${U}_{\mathrm{SVE}}^{(m)}$ is ${ \mathcal O }\left(N\mathrm{polylog}N/{\epsilon }_{\mathrm{SVE}}^{(m)}\right)$.

Proof. See appendix B.

Actually, the process ${U}_{\mathrm{SVE}}^{(m)}$ proposed in theorem 2 is quite different from the original QSVE technique introduced in lemma 3. In theorem 2, it is proved that we can estimate the singular values of ${\hat{A}}^{(m)}$ under the condition that each frontal slice A(k) of the original tensor is stored in the binary tree. The proof of theorem 2 presents a detailed illustration of the procedure of ${U}_{\mathrm{SVE}}^{(m)}$, and the circuit shown in figure 4 helps in understanding it.

Thus after Step 2, the state in (10) becomes the state in (4) based on theorem 2.

Our quantum t-svd algorithm can be used as a subroutine of other algorithms; that is, it is suitable for specific applications where the singular values of ${\hat{A}}^{(m)}$ are used. For example, some third-order tensor completion problems can be efficiently solved by extracting the singular values of ${\hat{A}}^{(m)}$ and keeping only the larger ones. Moreover, some context-aware recommendation systems also utilize tensor factorizations, such as the truncated t-svd [3] and the truncated HOSVD [32]. See more details in section 5.

Remark 2. Note that the input of ${U}_{\mathrm{SVE}}^{(m)}$ is $| {\hat{A}}^{(m)}\rangle $ instead of an arbitrary quantum state, as commonly used in some quantum svd algorithms [34]. This input fits our quantum t-svd algorithm better since it keeps the entire singular information $({\hat{\sigma }}_{l}^{(m)},{\hat{u}}_{l}^{(m)},{\hat{v}}_{l}^{(m)})$, so our algorithm can output a quantum state whose representation is similar to the matrix svd. Another consideration is that we do not want information unrelated to the tensor ${ \mathcal A }$ (e.g. an arbitrary state) to be involved in our algorithm.

3.2. Complexity analysis

For simplicity, we consider a tensor ${ \mathcal A }\in {{\mathbb{R}}}^{N\times N\times N}$ with the same dimension on each mode. In Steps 1 and 3 of algorithm 3, performing the QFT or the inverse QFT in parallel on the third register of the state $| { \mathcal A }\rangle $ has complexity ${ \mathcal O }({(\mathrm{log}N)}^{2})$, compared with the complexity ${ \mathcal O }({N}^{3}\mathrm{log}N)$ of the DFT performed on the N2 tubes of the tensor ${ \mathcal A }$ in the classical t-svd algorithm. Moreover, in the classical t-svd, the complexity of performing the matrix svd (Step 2 of algorithm 1) on all frontal slices of $\hat{{ \mathcal A }}$ is ${ \mathcal O }({N}^{4})$. In contrast, in our quantum t-svd algorithm, this step is accelerated by theorem 2 (the modified QSVE), whose complexity is ${ \mathcal O }\left(N\mathrm{polylog}(N)\right)$ on each frontal slice ${\hat{A}}^{(m)}$. Since we perform this modified QSVE on each ${\hat{A}}^{(m)}$, m = 0, ⋯ ,N − 1, in parallel, the running time of Step 2 is still ${ \mathcal O }\left(N\mathrm{polylog}(N)\right)$. Therefore, the total computational complexity of algorithm 3 is ${ \mathcal O }\left(N\mathrm{polylog}(N)\right)$.

4. Quantum t-svd for order-p tensors

Following a similar procedure, we can extend the quantum t-svd for third-order tensors to order-p tensors easily.

We assume that the quantum state $| { \mathcal A }\rangle $ corresponding to the tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times \cdots \times {N}_{p}}$ can be prepared efficiently, where ${N}_{i}={2}^{{n}_{i}}$ with ni being the number of qubits on the corresponding mode and

$| { \mathcal A }\rangle =\displaystyle \sum _{{i}_{1},\cdots ,{i}_{p}}{a}_{{i}_{1}{i}_{2}\cdots {i}_{p}}| {i}_{1}\rangle | {i}_{2}\rangle \cdots | {i}_{p}\rangle .$ (13)

Next, we perform the QFT on the third to the p-th mode of the state $| { \mathcal A }\rangle $, and then use ∣m〉 to denote ∣m3〉 ⋯ ∣mp 〉, i.e. $m={m}_{3}{{\rm{\Pi }}}_{i=4}^{p}{N}_{i}+{m}_{4}{{\rm{\Pi }}}_{i=5}^{p}{N}_{i}+\cdots +{m}_{p}$. The value of m ranges from 0 to ι − 1, where $\iota ={N}_{3}{N}_{4}\cdots {N}_{p}$ as in algorithm 2; in particular, $\iota ={N}^{p-2}$ when N3 = ⋯ = Np = N. Then we obtain

Equation (14)

Let the matrix

and perform the modified QSVE on ${\hat{A}}^{(m)}$, m = 0, ⋯ ,ι − 1, in parallel using the same strategy described in section 3.1. We then obtain the state

Equation (15)

after Step 2.

Finally, we recover the ∣m3〉 ⋯ ∣mp 〉 expression and perform the inverse QFT on the p-th through the third registers, obtaining the final state

Equation (16)

corresponding to the quantum t-svd of order-p tensor ${ \mathcal A }$.

Algorithm 4. Quantum t-svd for order-p tensors

Input: tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times \cdots \times {N}_{p}}$ prepared in a quantum state, precision ${\epsilon }_{\mathrm{SVE}}^{(m)}$, $m=0,\cdots ,\iota -1$.
Output: state $| {\phi }_{p}\rangle $.
1: Perform the QFT in parallel on the third through the p-th registers of the quantum state $| { \mathcal A }\rangle $ to obtain the state $| \hat{{ \mathcal A }}\rangle $.
2: Perform the modified QSVE in parallel on each matrix ${\hat{A}}^{(m)}$ with precision ${\epsilon }_{\mathrm{SVE}}^{(m)}$, $m=0,\cdots ,\iota -1$, by applying the controlled-${U}_{\mathrm{SVE}}$ to the state $| \hat{{ \mathcal A }}\rangle $, obtaining the state $| {\psi }_{p}\rangle $.
3: Perform the inverse QFT in parallel on the third through the p-th registers of the above state and output the state $| {\phi }_{p}\rangle $.

For an order-p tensor ${ \mathcal A }\in {{\mathbb{R}}}^{N\times \cdots \times N}$, compared with the classical t-svd algorithm [26], whose time complexity is ${ \mathcal O }({N}^{p+1})$, our algorithm outputs a quantum state with the classical t-svd information encoded; by an analysis similar to that of the third-order case, its time complexity is ${ \mathcal O }\left(N\mathrm{polylog}(N)\right)$.

5. Application to context-aware POI recommendation systems

In this section, we apply algorithm 3 within another quantum algorithm, algorithm 5, which implements a classical machine learning task: context-aware POI recommendation.

In point-of-interest (POI) recommendation, a user's preference for a POI, such as a restaurant or a sightseeing site, is strongly influenced by context, such as the time slot in a day and his/her current location. Therefore, many works focus on integrating multiple contexts efficiently to improve the user experience. Tensors are a natural choice for modeling high-order contextual information; e.g. a user's ratings for different POIs can be encoded in a preference tensor ${ \mathcal T }$ whose three modes represent locations, POIs, and time frames, respectively. Classical (non-quantum) context-aware POI recommendation systems based on various tensor decomposition methods have been shown experimentally to outperform non-contextual modeling in both accuracy and execution time [35, 36]. However, these methods are computationally expensive due to the high cost of tensor decomposition. For instance, classical 3D recommendation algorithms based on tensor decomposition include the truncated t-svd with complexity ${ \mathcal O }({{kN}}^{3}+{N}^{3}\mathrm{log}N)$ [3] and the truncated HOSVD (T-HOSVD) with complexity ${ \mathcal O }(3{{kN}}^{3})$ [32], where k is the truncation rank and N is the dimension of the preference tensor. Considering the effectiveness and high cost of context-aware POI recommendation systems, we propose a quantum context-aware POI recommendation algorithm (QC-POI) with lower complexity ${ \mathcal O }(N\mathrm{polylog}N)$ for suitable parameters; the output of this algorithm is classical information, namely POI recommendation indices.

The problem of context-aware POI recommendation can be stated as follows. Suppose there is a hidden preference tensor ${ \mathcal T }$ that encodes the preference information of a given user under two contexts, e.g. location and time. In practical applications, only a part of the entries of ${ \mathcal T }$ can be observed: users typically engage with only a small subset of POIs, and a considerable number of possible interactions remain unobserved. For example, a person may simply be unaware of existing alternatives to the POIs of his/her choice. Predicting those entries helps to make better recommendations. We denote the tensor whose entries are the observed ratings by ${ \mathcal A }$, which is sparse in general. Our goal is to predict the unobserved triples (location, POI, time) and recommend some of the user's favorite POIs across all contexts based on the comparatively high predicted values. For example, suppose Alice has visited several cities (locations) in America, and her ratings of different POIs when she was in these cities are encoded in a tensor ${ \mathcal T }$ whose three modes are cities, POIs, and time, respectively. She has only scored a part of the POIs in these cities, and the observed ratings form the tensor ${ \mathcal A }$; our task is to recommend her favorite POIs across all cities and time slots based on the truncated t-svd of the tensor ${ \mathcal A }$.

In this application, we make two assumptions. Firstly, we assume that the tensor ${ \mathcal T }$ has low tubal-rank. Secondly, the original tensor ${ \mathcal T }$ has several dominating entries. The low tubal-rank assumption is also adopted in the classical truncated t-svd data completion problem [2, 3, 37]. As for the second assumption, in real situations it is very likely that a user has some favorite POIs among all contexts, which correspond to dominating entries.

Our QC-POI algorithm, algorithm 5, consists of three processes: quantum t-svd, quantum state projection, and quantum measurement. After algorithm 3 and the quantum state projection process, we obtain the state $| {{ \mathcal A }}_{\geqslant \sigma }\rangle $ corresponding to an approximation of the hidden preference tensor ${ \mathcal T }$ under certain conditions; a similar conclusion can be found in [19], theorem 1. Thus, previously unobserved values of triples corresponding to this user's favourite POIs in the hidden preference tensor ${ \mathcal T }$ may be boosted after the projected t-svd of the observed tensor ${ \mathcal A }$. As ${{ \mathcal A }}_{\geqslant \sigma }$ is non-sparse in general, we can predict the missing entries based on the relatively high predicted values in ${{ \mathcal A }}_{\geqslant \sigma }$, and provide POI recommendations by measuring the state $| {{ \mathcal A }}_{\geqslant \sigma }\rangle $ in the computational basis. The complexity of our QC-POI algorithm for obtaining a good recommendation is ${ \mathcal O }(N\mathrm{polylog}N)$ for suitably chosen parameters; see the analysis in the paragraph below algorithm 5. The output is a POI recommendation index for this user over all locations and time slots.
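The classical analogue of the projection step can be sketched as follows (a hypothetical NumPy illustration, with the function name ours: in each Fourier-domain frontal slice, only singular values at or above the threshold σ are kept):

```python
import numpy as np

def project_tsvd(A, sigma):
    """Classical analogue of the quantum projection onto |A_{>=sigma}>:
    zero out all Fourier-domain singular values below the threshold sigma."""
    A_hat = np.fft.fft(A, axis=2)
    out = np.empty_like(A_hat)
    for m in range(A.shape[2]):
        u, s, vh = np.linalg.svd(A_hat[:, :, m], full_matrices=False)
        # keep sigma_i >= sigma, discard the rest, then rebuild the slice
        out[:, :, m] = (u * np.where(s >= sigma, s, 0.0)) @ vh
    return np.fft.ifft(out, axis=2).real
```

With σ = 0 the tensor is recovered exactly; larger σ yields the low-tubal-rank, generally non-sparse approximation from which recommendations are read off.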

In fact, our quantum 3D recommendation systems algorithm borrows the idea of quantum recommendation systems for matrices proposed by Kerenidis and Prakash [31]. For recommendation systems modeled by an m × n preference matrix, Kerenidis and Prakash designed a quantum algorithm that offers recommendations by just measuring the quantum state representing an approximation of the hidden preference matrix obtained by truncated matrix SVD.

The following is a summary of the steps of algorithm 5. We first follow Steps 1 and 2 of algorithm 3, and then perform the quantum projection with a pre-specified threshold σ on state (4). Specifically, we apply to the register a and an ancillary register $| 0\rangle $ the unitary operator V that maps $| t{\rangle }^{a}| 0\rangle \to | t{\rangle }^{a}| 1\rangle $ if t < σ and $| t{\rangle }^{a}| 0\rangle \to | t{\rangle }^{a}| 0\rangle $ otherwise. Therefore, after Step 2 of algorithm 3, we get

Equation (17)

Next, we apply the inverse modified QSVE to state (17) and discard the register a. We then measure the third register of the resulting state and postselect on the outcome ∣0〉, obtaining

Equation (18)

where

Equation (19)

The probability that we obtain the outcome ∣0〉 is

Equation (20)

since the Frobenius norm is unchanged under the Fourier transform. The tensor ${\hat{{ \mathcal A }}}_{\geqslant \sigma }$ denotes the tensor whose m-th frontal slice ${\hat{A}}_{\geqslant \sigma }^{(m)}$ is obtained by truncating ${\hat{A}}^{(m)}$ with threshold σ, and ${{ \mathcal A }}_{\geqslant \sigma }$ is the inverse QFT of ${\hat{{ \mathcal A }}}_{\geqslant \sigma }$. Based on amplitude amplification, we have to repeat the measurement ${ \mathcal O }\left(1/\alpha \right)$ times to ensure that the probability of obtaining the outcome ∣0〉 is close to 1. Thus, the complexity of obtaining the state (18) is ${ \mathcal O }(N\mathrm{polylog}N/\alpha )$. Combining (9) with (11), we find that the state (18) can be seen as an approximation of $| \hat{{ \mathcal A }}\rangle $.

In Step 4, we perform the inverse QFT on state (18), obtaining the final state denoted as

Equation (21)

which is an approximation of the state $| { \mathcal A }\rangle $ in the classical counterpart algorithm [3]. The effectiveness of this classical counterpart has been tested by numerical experiments; see remark 3 for more detail.

Remark 3. The classical counterpart of algorithm 5 corresponds to a tensor completion method proposed in [3], whose effectiveness has been verified by numerical experiments. In what follows we briefly summarize this algorithm. For an ${N}_{1}\times {N}_{2}\times {N}_{3}$ tensor ${ \mathcal A }$, we use algorithm 1 to get $\hat{{ \mathcal A }}$, $\hat{{ \mathcal U }}$, $\hat{{ \mathcal S }}$ and $\hat{{ \mathcal V }}$. The total number of diagonal entries of $\hat{{ \mathcal S }}$ is ${{rN}}_{3}$, where $r=\min \{{N}_{1},{N}_{2}\}$. After sorting these entries from largest to smallest, if they decay rapidly or most of them are small, then there exists a number k such that keeping the top k diagonal entries of $\hat{{ \mathcal S }}$, denoted ${\hat{{ \mathcal S }}}_{k}$, and setting the other entries to 0 results in a good approximation. The approximate tensor is then ${{ \mathcal A }}_{k}={{ \mathcal U }}_{k}* {{ \mathcal S }}_{k}* {{ \mathcal V }}_{k}^{{\rm{T}}}$, where ${{ \mathcal U }}_{k}$, ${{ \mathcal S }}_{k}$ and ${{ \mathcal V }}_{k}$ are the inverse DFTs of ${\hat{{ \mathcal U }}}_{k},{\hat{{ \mathcal S }}}_{k}$ and ${\hat{{ \mathcal V }}}_{k}$, respectively. According to the simulation results in [3], this approximation method achieves the lowest relative square error (RSE) among the three compression methods compared there, and it performs well on video datasets from both stationary and non-stationary cameras.
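This classical compression step can be sketched in NumPy (an illustrative implementation of the remark, with the function name ours; ties at the threshold are resolved by keeping them):

```python
import numpy as np

def truncated_tsvd(A, k):
    """Remark-3 compression: keep the k largest of the r*N3 Fourier-domain
    singular values (sorted across all slices), zero the rest, transform back."""
    N3 = A.shape[2]
    A_hat = np.fft.fft(A, axis=2)
    svds = [np.linalg.svd(A_hat[:, :, m], full_matrices=False) for m in range(N3)]
    all_s = np.sort(np.concatenate([s for (_, s, _) in svds]))[::-1]
    thresh = all_s[k - 1]                    # value of the k-th largest entry
    out = np.empty_like(A_hat)
    for m, (u, s, vh) in enumerate(svds):
        out[:, :, m] = (u * np.where(s >= thresh, s, 0.0)) @ vh
    return np.fft.ifft(out, axis=2).real
```

Note the difference from the σ-projection used in algorithm 5: here a global rank budget k determines the cut, whereas the quantum projection thresholds each Fourier-domain singular value against a fixed σ.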

Algorithm 5. Quantum context-aware POI recommendation systems

Input: The observed tensor ${ \mathcal A }=({a}_{{ijk}})\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ satisfying Assumption 1, threshold σ, $r=\min \{{N}_{1},{N}_{2}\}$.
Output: a recommendation index.
1: Follow the Steps 1 and 2 of algorithm 3.
2: Apply the unitary operator V on the register a and an ancillary register $| 0\rangle $ that maps $| t{\rangle }^{a}| 0\rangle \to | t{\rangle }^{a}| 1\rangle $ if $t\lt \sigma $ and $| t{\rangle }^{a}| 0\rangle \to | t{\rangle }^{a}| 0\rangle $ otherwise, obtaining the state in (17).
3: Perform the inverse modified QSVE on (17), discard the register a and postselect the outcome $| 0\rangle $ to obtain state in (18).
4: Perform the inverse QFT on state (18) to obtain $| {{ \mathcal A }}_{\geqslant \sigma }\rangle $ in (21).
5: Measure the output state in (21) in the computational basis to get a recommendation index.

In the final step, we measure the output quantum state $| {{ \mathcal A }}_{\geqslant \sigma }\rangle $ in the computational basis in order to extract some classical information (recommendation indices) from the approximate quantum state (21). Here, a good recommendation means that the measurement outcome corresponds to one of the dominating entries of the hidden preference tensor. Next we give a rough estimate of the complexity of obtaining a good recommendation by algorithm 5. Denote by υ the sum of the squares of the dominating entries of the tensor ${{ \mathcal A }}_{\geqslant \sigma }$ (corresponding to the large amplitudes of the state $| {{ \mathcal A }}_{\geqslant \sigma }\rangle $). According to the above analysis, the total cost of the first four steps of algorithm 5 is ${ \mathcal O }(N\mathrm{polylog}N{\parallel { \mathcal A }\parallel }_{F}/\alpha )$, where α, defined in (19), is the Frobenius norm of the tensor ${{ \mathcal A }}_{\geqslant \sigma }$. After Step 5, we need to repeat the measurement ${ \mathcal O }(\alpha /\sqrt{\upsilon })$ times to ensure that the probability of obtaining a good POI recommendation index is close to 1. Therefore, the complexity of our QC-POI algorithm is ${ \mathcal O }(N\mathrm{polylog}N{\parallel { \mathcal A }\parallel }_{F}/\sqrt{\upsilon })$ when N1 = N2 = N3 = N. Under the dominating-entries assumption, the complexity of algorithm 5 can be ${ \mathcal O }(N\mathrm{polylog}N)$ for large N. The measurement outcome corresponds to a triple (location, POI, time slot) whose second entry is the recommended POI index, and this algorithm achieves a polynomial speedup over the classical 3D recommendation systems algorithm.
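The role of the final measurement can be illustrated by a toy classical simulation (not part of the quantum circuit; the function name is ours): a computational-basis measurement of the amplitude-encoded state returns a triple with probability proportional to the squared entry, so dominating entries are the most likely outcomes.

```python
import numpy as np

def sample_recommendation(A_approx, rng=None):
    """Simulate step 5: sample a triple (i, j, k) with probability
    |a_ijk|^2 / ||A||_F^2, as a computational-basis measurement of the
    amplitude-encoded state |A_{>=sigma}> would."""
    rng = np.random.default_rng(rng)
    p = np.abs(A_approx).ravel() ** 2
    p = p / p.sum()                              # Born-rule probabilities
    flat = rng.choice(p.size, p=p)
    return np.unravel_index(flat, A_approx.shape)  # (location, POI, time slot)
```

Repeating the draw ${ \mathcal O }(\alpha /\sqrt{\upsilon })$ times makes a dominating entry appear with probability close to 1, mirroring the analysis above.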

We summarize the main steps of this algorithm in algorithm 5.

6. Conclusion

In this paper, we propose a quantum t-svd algorithm for third-order tensors with complexity ${ \mathcal O }(N\mathrm{polylog}(N))$; the key tools that enable this speedup are the quantum Fourier transform and quantum singular value estimation. We then extend the algorithm to order-p tensors. Moreover, based on the quantum t-svd algorithm, we propose a quantum tensor approximation algorithm and apply it to context-aware 3D recommendation systems.

Appendix A.: Relevant definitions and results of the classical t-svd algorithm

In this section, we provide the relevant definitions and results of the classical t-svd algorithm, algorithm 1 in the main text.

Definition 1 (circulant matrix [1]). Given a vector ${\boldsymbol{u}}\in {{\mathbb{R}}}^{N}$ and a tensor ${ \mathcal B }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ with frontal slices ${B}^{(l)}$, $l=0,\cdots ,{N}_{3}-1$, the matrices $\mathrm{circ}({\boldsymbol{u}})$ and $\mathrm{circ}({ \mathcal B })$ are defined as

respectively.

Definition 2 (circular convolution). Let ${\boldsymbol{u}},{\boldsymbol{v}}\in {{\mathbb{R}}}^{N}$. The circular convolution between ${\boldsymbol{u}}$ and ${\boldsymbol{v}}$ produces a vector ${\boldsymbol{x}}$ of the same size, defined as

As a circulant matrix can be diagonalized by the discrete Fourier transform (DFT), from definition 2 we have DFT( x ) = diag(DFT( u ))DFT( v ), where diag( u ) returns a square diagonal matrix with the elements of the vector u on its main diagonal. Consequently, the circular convolution between two vectors in definition 2 is best understood in the Fourier domain, as the following result shows.

Theorem 3 (cyclic convolution theorem [38]). Given ${\boldsymbol{u}},{\boldsymbol{v}}\in {{\mathbb{R}}}^{N}$ , let ${\boldsymbol{x}}={\boldsymbol{u}} \circledast {\boldsymbol{v}}$ as defined in definition 2. We have

Equation (A.1)

where $\odot $ denotes the element-wise product.
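Theorem 3 is easy to check numerically; the following NumPy sketch builds $\mathrm{circ}({\boldsymbol{u}})$ explicitly from definition 1 (the helper name `circ_conv` is ours):

```python
import numpy as np

def circ_conv(u, v):
    """u ⊛ v computed via the circulant matrix circ(u) of definition 1."""
    N = len(u)
    # column j of circ(u) is u cyclically shifted down by j positions
    C = np.column_stack([np.roll(u, j) for j in range(N)])
    return C @ v
```

The identity DFT( x ) = DFT( u ) ⊙ DFT( v ) then holds up to floating-point error, which is exactly the diagonalization of circulant matrices by the DFT.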

If a tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ is considered as an N1 × N2 matrix whose (i, j)-th entry is a tube of dimension N3, i.e. ${ \mathcal A }(i,j,:)$, then based on definition 2, the t-product between tensors is defined as follows.

Definition 3 (t-product [1]). Let ${ \mathcal M }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ and ${ \mathcal N }\in {{\mathbb{R}}}^{{N}_{2}\times {N}_{4}\times {N}_{3}}$. The t-product of ${ \mathcal M }$ and ${ \mathcal N }$, i.e. ${ \mathcal A }\triangleq { \mathcal M }* { \mathcal N }$, is an ${N}_{1}\times {N}_{4}\times {N}_{3}$ tensor whose $(i,j)$-th tube is

Equation (A.2)

for all $i=1,\ldots ,{N}_{1}$ and $j=1,\ldots ,{N}_{4}$.

Similar to the circular convolution in definition 2, the t-product in definition 3 can be better interpreted in the Fourier domain. Specifically, let $\hat{{ \mathcal A }}$ be the tensor whose (i, j)-th tube is $\mathrm{DFT}({ \mathcal A }(i,j,:))$. Then by theorem 3 and definition 3, we have

Equation (A.3)

which is the Fourier counterpart of equation (A.2). Interestingly, for a fixed index l in the third mode, equation (A.3) is equivalent to ${\hat{A}}^{(l)}={\hat{M}}^{(l)}{\hat{N}}^{(l)}$, the conventional matrix product. This nice equivalence relation between the t-product and matrix multiplication (in the Fourier domain) is summarized in the following theorem.

Theorem 4 [1] For tensors ${ \mathcal M }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ and ${ \mathcal N }\in {{\mathbb{R}}}^{{N}_{2}\times {N}_{4}\times {N}_{3}}$, the equivalence relation

Equation (A.4)

holds for $l=0,\cdots ,{N}_{3}-1$, where ${\hat{A}}^{(l)}$ is the l-th frontal slice of $\hat{{ \mathcal A }}$.
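Theorem 4 gives the standard way of computing the t-product in practice; a minimal NumPy sketch (the function name `t_product` is ours):

```python
import numpy as np

def t_product(M, N_):
    """t-product M * N via theorem 4: slice-wise matrix products in the
    Fourier domain, then an inverse DFT along the third mode."""
    Mh = np.fft.fft(M, axis=2)
    Nh = np.fft.fft(N_, axis=2)
    Ah = np.einsum('ikl,kjl->ijl', Mh, Nh)   # A_hat^(l) = M_hat^(l) N_hat^(l)
    return np.fft.ifft(Ah, axis=2).real
```

The result agrees, tube by tube, with the direct definition (A.2) as a sum of circular convolutions, which is precisely the equivalence (A.4).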

By the equivalence relation given in theorem 4 above, the t-svd defined in theorem 1 in the main text can be interpreted as the matrix SVD in the Fourier domain, as reflected in algorithm 1. In what follows, we list some definitions used in theorem 1 and algorithm 1.

Firstly, the tensor transpose operation is used in theorem 1; it is defined as follows.

Definition 4 (tensor transpose [1]). The transpose of a tensor ${ \mathcal A }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$, denoted by ${{ \mathcal A }}^{T}$, is obtained by transposing all the frontal slices and then reversing the order of the transposed frontal slices 1 through ${N}_{3}-1$.

The tensor transpose defined in definition 4 has the same property as the matrix transpose, i.e. ${\left({ \mathcal A }* { \mathcal B }\right)}^{T}={{ \mathcal B }}^{T}* {{ \mathcal A }}^{T}$.
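Definition 4 translates directly into NumPy (an illustrative sketch; the function name is ours):

```python
import numpy as np

def t_transpose(A):
    """Definition 4: transpose every frontal slice, then reverse the order
    of the transposed slices 1 through N3 - 1 (slice 0 stays in place)."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)
```

Keeping slice 0 fixed while reversing the rest makes each Fourier-domain slice of ${{ \mathcal A }}^{T}$ the conjugate transpose of the corresponding slice of ${ \mathcal A }$, which is what makes the property ${\left({ \mathcal A }* { \mathcal B }\right)}^{T}={{ \mathcal B }}^{T}* {{ \mathcal A }}^{T}$ hold.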

Secondly, orthogonal tensors are used in theorem 1, whose definition is given below.

Definition 5 (orthogonal tensor [1]). A tensor ${ \mathcal U }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{2}\times {N}_{3}}$ is an orthogonal tensor if it satisfies ${{ \mathcal U }}^{T}* { \mathcal U }={ \mathcal U }* {{ \mathcal U }}^{T}={ \mathcal I }$, where ${ \mathcal I }\in {{\mathbb{R}}}^{{N}_{1}\times {N}_{1}\times {N}_{3}}$ is the identity tensor; that is, its first frontal slice ${I}^{(0)}$ is the ${N}_{1}\times {N}_{1}$ identity matrix and all other frontal slices are zero matrices.

Finally, we give the definition of the tensor Frobenius norm, as it is quite useful for tensor approximation.

Definition 6 (tensor Frobenius norm [1]). The Frobenius norm of a third-order tensor ${ \mathcal A }=({a}_{{ijk}})$ is defined as ${\parallel { \mathcal A }\parallel }_{F}=\sqrt{{\sum }_{i,j,k}{\left|{a}_{{ijk}}\right|}^{2}}$.

Similar to orthogonal matrices, the orthogonality defined in definition 5 preserves the Frobenius norm of a tensor: given an orthogonal tensor ${ \mathcal Q }$, we have $\parallel { \mathcal Q }* { \mathcal A }{\parallel }_{F}={\parallel { \mathcal A }\parallel }_{F}$. Moreover, when the tensor reduces to a matrix (N3 = 1), definition 5 coincides with the definition of orthogonal matrices.

Appendix B.: The proof of theorem 2

Before proving theorem 2, we first sketch the proof. According to Assumption 1, each frontal slice A(k) is stored in the binary tree structure; hence, based on the proof of lemma 3 in [31], the states $| { \mathcal A }(i,:,k)\rangle $ corresponding to the i-th rows of A(k) can be prepared efficiently by the operators ${U}_{{P}_{k}}$. Based on these operators, two new isometries ${\hat{P}}_{m}$ and ${\hat{Q}}_{m}$ are constructed in order to perform the QSVE on ${\hat{A}}^{(m)}$. Moreover, the input of our modified QSVE also differs from that in [31]. The details of the proof are given below.

Proof. Since every ${A}^{(k)}$, $k=0,\cdots ,{N}_{3}-1$, is stored in the binary tree structure, the quantum computer can perform the following mapping in ${ \mathcal O }(\mathrm{polylog}({N}_{1}))$ time, as shown in theorem 5.1 in [31]:

Equation (B.1)

where ${ \mathcal A }(i,:,k)$ is the i-th row of ${A}^{(k)}$.

Define the degenerate operator ${P}_{k}\in {{\mathbb{R}}}^{{N}_{1}{N}_{2}\times {N}_{1}}$ related to ${U}_{{P}_{k}}$ as

Equation (B.2)

That is,

Equation (B.3)

Based on the efficiently implemented operators Pk and ${U}_{{P}_{k}}$, we define another operator

It can be readily verified that the operator ${U}_{{\hat{P}}_{m}}$ achieves the state preparation of the rows of the matrix ${\hat{A}}^{(m)}$, i.e. ${U}_{{\hat{P}}_{m}}=\tfrac{1}{\sqrt{{N}_{3}}}{\sum }_{k=0}^{{N}_{3}-1}{\sum }_{i=0}^{{N}_{1}-1}{\omega }^{{km}}| i\rangle | { \mathcal A }(i,:,k)\rangle \langle i| \langle 0| ={\sum }_{i}| i\rangle | \hat{{ \mathcal A }}(i,:,m)\rangle \langle i| \langle 0| $. The isometry corresponding to ${U}_{{\hat{P}}_{m}}$ is ${\hat{P}}_{m}={\sum }_{i}| i\rangle | \hat{{ \mathcal A }}(i,:,m)\rangle \langle i| $, where $| \hat{{ \mathcal A }}(i,:,m)\rangle $ is the state of the i-th row of ${\hat{A}}^{(m)}$. It is indeed an isometry since ${\hat{P}}_{m}^{\dagger }{\hat{P}}_{m}={I}_{{N}_{1}}$.

Since ${U}_{{P}_{k}}$ can be implemented in time ${ \mathcal O }(\mathrm{polylog}({N}_{1}))$, ${U}_{{\hat{P}}_{m}}$ can be implemented in time ${ \mathcal O }\left({N}_{3}\sqrt{\tfrac{{N}_{3}}{{N}_{1}}}\mathrm{polylog}{N}_{1}+\sqrt{\tfrac{{N}_{3}}{{N}_{1}}}\mathrm{log}{N}_{3}\right)$ using the linear combination of unitaries (LCU) technique [39-43]. For a tensor ${ \mathcal A }$ with all dimensions equal to N, the complexity of implementing ${U}_{{\hat{P}}_{m}}$ turns out to be ${ \mathcal O }(N\mathrm{polylog}N)$, as analyzed below.

The LCU technique was first proposed by Long [43] in a more general form, and Shao et al summarize this result in [42]. The LCU problem can be formulated as follows: given ${\alpha }_{j}\in {\mathbb{C}}$ and unitary operators Uj , $j=0,1,\cdots ,N-1$, implement the linear operator $L={\sum }_{j=0}^{N-1}{\alpha }_{j}{U}_{j}$. The algorithm stated in [39] implements L in time ${ \mathcal O }(({T}_{\mathrm{in}}+\mathrm{log}N)N{\max }_{j}| {\alpha }_{j}| /\parallel L| \psi \rangle \parallel )$, where $| \psi \rangle $ is any given initial state and ${T}_{\mathrm{in}}$ is the time needed to implement ${U}_{1},{U}_{2},\cdots ,{U}_{N-1}$. In our case, ${T}_{\mathrm{in}}={N}_{3}\mathrm{polylog}{N}_{1}$, ${\max }_{j}| {\alpha }_{j}| =1/\sqrt{{N}_{3}}$, and the input state is chosen as $| \psi \rangle ={\sum }_{i=0}^{{N}_{1}-1}| i\rangle | 0\rangle $. Thus, $\parallel L| \psi \rangle \parallel =\parallel \tfrac{1}{\sqrt{{N}_{3}}}{\sum }_{k=0}^{{N}_{3}-1}{\omega }^{{km}}{U}_{{P}_{k}}| \psi \rangle \parallel =\parallel \tfrac{1}{\sqrt{{N}_{3}}}{\sum }_{k=0}^{{N}_{3}-1}{\sum }_{i=0}^{{N}_{1}-1}{\omega }^{{km}}| i\rangle | { \mathcal A }(i,:,k)\rangle \parallel =\sqrt{{N}_{1}}$. Hence the complexity of implementing ${U}_{{\hat{P}}_{m}}$ is ${ \mathcal O }\left({N}_{3}\sqrt{\tfrac{{N}_{3}}{{N}_{1}}}\mathrm{polylog}{N}_{1}+\sqrt{\tfrac{{N}_{3}}{{N}_{1}}}\mathrm{log}{N}_{3}\right)$.
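The identity that this LCU construction exploits can be checked classically: summing the slices with weights ${\omega }^{{km}}/\sqrt{{N}_{3}}$ reproduces the Fourier-domain slice whose rows ${U}_{{\hat{P}}_{m}}$ prepares. A NumPy sketch (the tensor dimensions and the index m are illustrative choices; we use the ${\omega }^{+{km}}$ sign convention, which matches NumPy's inverse FFT up to scaling):

```python
import numpy as np

# Combine the slices A^(k) with weights omega^{km}/sqrt(N3), as the LCU does.
rng = np.random.default_rng(6)
A = rng.random((4, 4, 8))                        # an N1 x N2 x N3 tensor
N3 = A.shape[2]
m = 3
omega = np.exp(2j * np.pi / N3)                  # N3-th root of unity
A_hat_m = sum(omega**(k * m) * A[:, :, k] for k in range(N3)) / np.sqrt(N3)
# The same slice via NumPy's FFT with the matching sign convention:
reference = np.sqrt(N3) * np.fft.ifft(A, axis=2)[:, :, m]
```

The two expressions agree up to floating-point error; with the opposite DFT sign convention an analogous identity holds with `np.fft.fft`.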

Next, we define the mapping

Equation (B.4)

where ${{\boldsymbol{s}}}_{{\hat{A}}^{(m)}}$ is a vector whose i-th entry is $\tfrac{\parallel \hat{{ \mathcal A }}(i,:,m)\parallel }{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}$. The operator ${U}_{{\hat{Q}}_{m}}$ can be implemented using the technique developed in [44] for preconditioned linear solvers. More specifically, according to the analysis of algorithm 3, we can rewrite the state $| {\hat{A}}^{(m)}\rangle $ in (8) as $| {\hat{A}}^{(m)}\rangle =\tfrac{1}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}{\sum }_{i}\parallel \hat{{ \mathcal A }}(i,:,m)\parallel | i\rangle | \hat{{ \mathcal A }}(i,:,m)\rangle $. Then we apply ${U}_{{\hat{P}}_{m}}^{-1}$ to $| {\hat{A}}^{(m)}\rangle $ to get the state $| {{\boldsymbol{s}}}_{{\hat{A}}^{(m)}}\rangle $, thus obtaining the mapping ${U}_{{\hat{Q}}_{m}}$. Finally, similar to ${U}_{{\hat{P}}_{m}}$, the corresponding isometry is defined as ${\hat{Q}}_{m}={\sum }_{j}| {{\boldsymbol{s}}}_{{\hat{A}}^{(m)}}\rangle | j\rangle \langle j| $, and it can be readily shown that ${\hat{Q}}_{m}^{\dagger }{\hat{Q}}_{m}={I}_{{N}_{2}}$.

Now we can perform the QSVE on the matrix ${\hat{A}}^{(m)}$ following a procedure similar to that in [31]. First, the factorization $\tfrac{{\hat{A}}^{(m)}}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}={\hat{P}}_{m}^{\dagger }{\hat{Q}}_{m}$ can be easily verified. Moreover, we can prove that $2{\hat{P}}_{m}{\hat{P}}_{m}^{\dagger }-{I}_{{N}_{1}{N}_{2}}$ is a reflection and that it can be implemented through ${U}_{{\hat{P}}_{m}}$. Indeed,

Equation (B.5)

where $2{\sum }_{i}| i\rangle | 0\rangle \langle i| \langle 0| -{I}_{{N}_{1}{N}_{2}}$ is a reflection. A similar result holds for $2{\hat{Q}}_{m}{\hat{Q}}_{m}^{\dagger }-{I}_{{N}_{1}{N}_{2}}$.

Now denote

Equation (B.6)

Let ${\hat{A}}^{(m)}={\sum }_{i=0}^{r-1}{\hat{\sigma }}_{i}^{(m)}{\hat{u}}_{i}^{(m)}{\hat{v}}_{i}^{(m)\dagger }$ be the singular value decomposition of ${\hat{A}}^{(m)}$. We can prove that the subspace spanned by $\{{\hat{Q}}_{m}| {\hat{v}}_{i}^{(m)}\rangle ,{\hat{P}}_{m}| {\hat{u}}_{i}^{(m)}\rangle \}$ is invariant under the unitary transformation Wm :

The matrix Wm can be written in an orthonormal basis obtained by the Gram-Schmidt orthogonalization. It is a rotation in the subspace spanned by its eigenvectors $| {\omega }_{i\pm }^{(m)}\rangle $ with corresponding eigenvalues ${e}^{\pm i{\theta }_{i}^{(m)}}$, where ${\theta }_{i}^{(m)}$ is the rotation angle satisfying $\cos ({\theta }_{i}^{(m)}/2)=\tfrac{{\hat{\sigma }}_{i}^{(m)}}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}$, i.e.

Here, we choose the input state as the Kronecker product form of the normalized matrix $\tfrac{{\hat{A}}^{(m)}}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}$ expressed via its SVD, i.e. $| {\hat{A}}^{(m)}\rangle =\tfrac{1}{{\parallel {\hat{A}}^{(m)}\parallel }_{F}}{\sum }_{i}{\hat{\sigma }}_{i}^{(m)}| {\hat{u}}_{i}^{(m)}\rangle | {\hat{v}}_{i}^{(m)}\rangle $. Then

Equation (B.7)

Performing phase estimation on Wm with running time ${ \mathcal O }\left(N\mathrm{polylog}N/{\epsilon }_{\mathrm{SVE}}^{(m)}\right)$ for ${N}_{1}={N}_{2}={N}_{3}=N$, and computing the estimated singular values of ${\hat{A}}^{(m)}$ through the oracle ${\hat{\sigma }}_{i}^{(m)}={\parallel {\hat{A}}^{(m)}\parallel }_{F}\cos ({\theta }_{i}^{(m)}/2)$, we obtain

Equation (B.8)

We next uncompute the phase estimation procedure and then apply the inverse of ${I}_{{N}_{1}}\otimes {U}_{{\hat{Q}}_{m}}$ to obtain the desired state (12) in theorem 2.

Footnotes

  • This research is supported in part by Hong Kong Research Grants Council (RGC) grants (No. 15208418, No. 15203619, No. 15506619) and Shenzhen Fundamental Research Fund, China under Grant No. JCYJ20190813165207290.
