Quantum Fourier networks for solving parametric PDEs

Many real-world problems, like modelling environment dynamics, physical processes, time series, etc., involve solving partial differential equations (PDEs) parameterised by problem-specific conditions. Recently, a deep learning architecture called the Fourier neural operator (FNO) proved to be capable of learning solutions of given PDE families for any initial conditions as input. However, its time complexity at test time is linear in the number of evaluations of the PDEs. Given the advancements in quantum hardware and the recent results in quantum machine learning methods, we exploit the efficiency these offer and propose quantum algorithms inspired by the classical FNO, which result in time complexity logarithmic in the number of evaluations and are expected to be substantially faster than their classical counterpart. At their core, we use the unary encoding paradigm and orthogonal quantum layers, and introduce a new quantum Fourier transform in the unary basis. We propose three different quantum circuits to perform a quantum FNO. The proposals differ in their depth and their similarity to the classical FNO. We also benchmark our proposed algorithms on three PDE families, namely Burgers' equation, Darcy's flow equation and the Navier–Stokes equation. The results show that our quantum methods are comparable in performance to the classical FNO. We also perform an analysis on small-scale image classification tasks where our proposed algorithms are on par with the performance of classical convolutional neural networks, proving their applicability to other domains as well.


I. INTRODUCTION

A. Fourier Neural Network
Solving Partial Differential Equations (PDEs) has been a crucial step in understanding the dynamics of nature. They have been widely used to understand natural phenomena such as heat transfer, modelling the flow of fluids, electromagnetism, etc.
Each PDE is an equation, along with some initial conditions, for which the solution is a function f of, for instance, space and time (x, t). A PDE family is determined by the equation itself, such as Burgers' equation or the Navier-Stokes equation. An instance of a given PDE family is the aforementioned equation along with a specific initial condition, represented, for instance, as f(x, t_0). Modifying this initial condition leads to a new PDE instance and, therefore, to a new solution f(x, t). Note also that the solution is highly dependent on some physical parameters (e.g. viscosity in fluid dynamics).
In practical scenarios, a closed-form solution for most PDE families' instances is difficult to find. Therefore, classical solvers often rely on discretising the input space and performing many approximations to model the solution. A large number of computations is required for each PDE instance, depending on the chosen resolution of the input space.
Recently, considerable research effort for approximating a PDE's solution has been based on neural networks. The main idea is to let a neural network become the solution of the PDE by training it either for a fixed PDE instance or with various instances of a PDE family. The network is trained in a supervised way by trying to match the same solutions as the ones computed with classical solvers. The first attempts [Y+18, BS19] were aimed at finding the PDE's solution f(x, t) for an input (x, t) given a specific initial condition (one PDE instance), while later works [ZZ18, AÖ17, BAP+19] targeted a specific discretisation resolution for all instances of a PDE family. In the first case, once trained, the neural network can output solution function values at any resolution for the instance it was trained on. However, it has to be optimised for each instance (new initial condition) separately. In the latter case, the neural network can predict solution function values for any instance of the PDE family, but only at the fixed resolution on which it was trained.
A recent proposal named the Fourier Neural Network [LKA+20] overcame these limitations and posed the problem as learning a function-to-function mapping for parametric PDEs. Parametric PDEs are families of PDEs for which the initial condition can be seen as a parametric function. Given any initial condition function of one such PDE family sampled at any resolution, the neural network can predict the solution function values at the sampled locations.
The input is usually the initial condition f(x, t_0) itself. It is encoded as a vector of a certain length N_s by sampling it uniformly at N_s locations x of the input space, given some resolution. This input is also called an evaluation of the initial condition function. The number of samples N_s is key in analysing the computational complexity, as it is the neural network input size. Note that sometimes the initial condition is also sampled for several times t as well. The output of the neural network is the corresponding PDE's solution f(x, t) applied at all sampled x and for a fixed t.

Figure 1: The input f(x, t_0) is sampled N_s times and modified via a trainable matrix P to become a matrix of size N_s × N_c. Then, T Fourier Layers (green block) are applied sequentially. In this paper, we are designing quantum circuits to implement the Fourier Layers. Each Fourier Layer consists of a row-wise Fourier Transform (FT), followed by a column-wise multiplication with trainable matrices labelled W, and the row-wise Inverse Fourier Transform (IFT). We only apply the inner matrix multiplications to the first K columns (also called modes) and leave the others unchanged or replaced by 0s before the IFT. Finally, a reverse operation with a trainable matrix Q is performed to obtain the output f(x, t) as a discretised vector. The trainable parts are updated with Gradient Descent until the outputs correspond to the actual solutions of the PDE.

Experiments on widely popular PDEs showed that it was effective in learning the mapping from a parametric initial condition function to the solution operator for a family of PDEs.
The method proposes a Fourier Layer repeated several times. It consists of a Fourier Transform (FT), then a multiplication with a trainable matrix (also called a linear transform), then an Inverse Fourier Transform (IFT) operation, and ends with a standard non-linearity. This is similar to the convolution operation, as convolution also translates to multiplication in the Fourier domain. However, a key feature of the Fourier Layer is that one can keep only part of the data before the IFT, corresponding to the lowest frequencies in the Fourier domain, reducing the amount of information and computational resources.
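As an illustration, one pass of such a Fourier Layer can be sketched in a few lines of numpy. This is our own minimal sketch: the function and variable names are ours, and the original FNO's parallel pointwise convolution and final non-linearity are omitted for brevity.

```python
import numpy as np

def fourier_layer(A, W):
    """One simplified Fourier Layer: row-wise FFT, per-mode channel
    mixing on the first K modes, row-wise inverse FFT.
    A: real input of shape (Nc, Ns); W: stack of K matrices (K, Nc, Nc)."""
    A_hat = np.fft.fft(A, axis=1)          # FT along the Ns dimension
    out = A_hat.copy()
    for k in range(W.shape[0]):            # mix channels of the first K modes
        out[:, k] = W[k] @ A_hat[:, k]
    # remaining modes are kept unchanged here (the original FNO zeroes them)
    return np.fft.ifft(out, axis=1).real   # back to the input space

# With identity weights the layer reduces to FT followed by IFT:
Nc, Ns, K = 4, 16, 4
A = np.random.default_rng(0).normal(size=(Nc, Ns))
W_id = np.stack([np.eye(Nc)] * K)
assert np.allclose(fourier_layer(A, W_id), A)
```

The shape (Nc, Ns) is preserved through the layer, which is why several such layers can be stacked.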
The major bottleneck that might hinder the scalability of this classical Fourier Neural Operator (FNO) is its time complexity, limited by the classical FT and IFT operations inside the Fourier Layer. Indeed, their time complexity is O(N_s log N_s) on a classical computer, where N_s is the input size (number of samples). In many use cases, N_s is expected to be quite high for learning the solution of a PDE family precisely. To be more precise, we will see that each input, a vector of size N_s, is first modified and reshaped to become a matrix of size N_s × N_c (see Fig. 1). N_c is named the channel dimension, and usually N_s ≫ N_c. This matrix will be the actual input of the Fourier Layer.

B. Quantum Algorithmic Proposals
Quantum computing has gained popularity due to its potential for faster performance than its classical counterpart. Among the most famous quantum algorithms is the exponentially faster Quantum Fourier Transform (QFT), although it is probably only attainable with long-term quantum computers. In the same way, long-term quantum machine learning and deep learning algorithms were proposed [AHKZ20, KLP19, BFL21].
More recently, several developments in learning techniques based on near-term quantum computing were proposed. The initial demonstrations of these algorithms involved experiments on small-scale quantum hardware [FN18, CMDK20, CEK20, GBC+18], which established their effectiveness in extracting patterns. Following this, many works [MBI+20, BBF+20] proposed small-scale implementations of fully connected quantum neural networks on near-term hardware. Other proposals [CCL19] for deploying convolution-based learning methods on quantum devices showed effective training in practice. Furthermore, [CYL+19] proposed a quantum-hardware implementation of generative adversarial networks. A different approach, where the inputs are encoded as unary states using the two-qubit quantum gate RBS (Reconfigurable Beam Splitter), was proposed in a recent work [JDM+21]. This encoding gave rise to the use of the orthogonality of pure quantum unitaries, as proposed in [LML+22, CKM+22], for training, for instance, orthogonal feed-forward networks to mitigate gradient-based issues while learning. It used a pyramid-shaped (or other architectures) circuit based on parameterised RBS gates to implement a learnable orthogonal matrix, in contrast to the existing classical approaches, which offer approximate orthogonality at the cost of increased training time. This orthogonality in neural networks results in much smoother convergence and also fewer parameters, as shown by [LJW+19] for feed-forward neural networks and [WCCY20] for convolutional nets. The effectiveness of these orthogonal quantum networks was further shown in another work on a medical image classification problem [LML+22].
In this work, we develop quantum algorithms to implement the Fourier Neural Operator (FNO). In particular, we propose three circuits equivalent or close to the Fourier Layer (see the green box in Fig. 1). The remaining parts of the FNO can be adapted using existing techniques. At the core of our circuits, we develop a new Quantum Fourier Transform (QFT) suited for near-term hardware and a specific quantum data encoding. Termed the unary-QFT, it transforms only the unary states into the Fourier domain, with an exponential speedup compared to the classical operation. We have built it upon the recently developed idea [LML+22, JDM+21] of encoding inputs as unary quantum states and applying orthogonal transformations only on these unary-basis states via learnable quantum circuits. Using this, we adapt the classical Fourier Neural Network [LKA+20] and propose several quantum algorithms to learn the functional mapping from the initial condition function of a PDE instance to the corresponding solution function.
The circuit proposed in this work is inspired by the widely popular butterfly diagram used for the Fast Fourier Transform (FFT) [CT65]. We then propose an implementation of controlled butterfly-shaped learnable quantum circuits for applying the linear transform (trainable matrix multiplications) in the Fourier domain. This results in three quantum circuits inspired by the classical Fourier Layer, which are faster than the classical operation or, said differently, require fewer parameters for the same architecture, thereby boosting their scalability. Given the Fourier Layer's matrix input of dimension N_s × N_c, where N_s corresponds to the number of samples per PDE and N_c corresponds to the channel dimension, the time complexities of the Fourier Layer and the proposed algorithms are shown in Table I.
The first algorithm corresponds to the quantum counterpart of the classical operation, where the middle trainable matrix is an orthogonal one. The other two algorithms are modifications of the first circuit with lower circuit depth (at the cost of more qubits or a different expressivity) to mitigate the fact that near-term quantum hardware might still be too noisy.
We have simulated all three proposed quantum algorithms on the three PDE families evaluated in the classical FNO paper [LKA+20], namely Burgers' equation, Darcy's flow equation and the Navier-Stokes equation, on the synthetic datasets used in that paper. We have also benchmarked our quantum algorithms against Convolutional Neural Networks (CNNs) on several benchmark datasets for image classification. In all the experiments, the three quantum algorithms perform similarly and also show accuracies comparable to state-of-the-art classical methods.

C. Contributions
To summarise, the contributions of this work are the following:

• We propose a new quantum circuit for performing a Quantum Fourier Transform on an input encoded as a superposition of unary states.
• Using the above-mentioned unary-QFT, we propose three quantum circuits with a provable equivalence or approximation to the classical Fourier Layer.These circuits can be sequentially combined to form a quantum version of a trainable Fourier neural network for solving Parametric PDEs.
• We provide an in-depth analysis of the computational complexity of the circuits and prove their logarithmic time complexity with respect to N s (input dimension), compared to the linear time complexity of their classical counterpart.
• We benchmark the proposed quantum algorithms against their classical equivalents, showing their effectiveness for solving several PDEs and for classifying images from well-known datasets.

II. CLASSICAL FOURIER NEURAL OPERATOR
For solving a Partial Differential Equation (PDE), we are provided with a dataset where each instance is a set of initial conditions of the PDE family. This initial condition is represented as a parameterised function and is sampled at various locations to generate the input. The neural network's output corresponding to this initial condition is trained to be the value of the solution f(x, t) for that instance at the same locations and for a given t. The Fourier Neural Operator (FNO) [LKA+20] tries to learn this functional mapping from the initial condition function to the solution function for this PDE family. This implies that given a set of initial conditions sampled at various locations as input to the FNO, it has to predict the solution function values at all these locations for any PDE instance in the test set.

Table I: Comparison of the order of time/depth complexities (Big-O) of the proposed circuits with the existing classical Fourier Layer (FL). Here N_s denotes the sampling dimension, N_c denotes the channel dimension, where N_s ≫ N_c, and K (usually in the range 4-16) denotes the maximum number of modes allowed [LKA+20]. This implies that the proposed quantum algorithms would be faster than the classical method. Each quantum circuit requires N_c + N_s qubits, and K independent parallel circuits are required by the Parallelised QFNO.

An overview of the FNO is given in Fig. 1. We recall that each input of the FNO is a parametric initial condition, seen for instance as a function of space f(x, 0). This function is sampled at N_s locations and forms a vector in R^{N_s} that is later modified by a trainable matrix P to become a matrix A ∈ R^{N_c×N_s}. We denote by N_c the channel dimension, and usually N_s ≫ N_c. The matrix A will be the input of the first Fourier Layer (FL), and its size will be maintained for each following FL. As shown in Fig. 1, the FNO consists of a sequence of Fourier Layers. Without loss of generality, we will denote by A ∈ R^{N_c×N_s} the input of any Fourier Layer.
Each Fourier Layer (FL) starts by transforming its input matrix to the Fourier domain. It applies a Fourier Transform (FT) on each row of the input matrix A. The resulting matrix has the same dimension, and we will refer to its N_s columns as modes to emphasise their presence in the Fourier basis. After the FT, we apply a learnable matrix multiplication to the first K modes, i.e. the first K columns among N_s (see Fig. 1). The original proposal crops the remaining modes by replacing them with 0s. In our quantum proposals, we will leave the remaining modes untouched rather than cropping them. The final operation is an Inverse Fourier Transform (IFT), which transforms the matrix back to the input space. Note that in the original proposal, the authors also apply in parallel a direct convolution to the input, and both outputs are merged (not shown in Fig. 1). In our quantum proposals, we discard this convolution part for simplicity, without any impact on the experimental accuracy.

In this work, we propose several quantum algorithms for mimicking the Fourier Layer and name these circuits Quantum Fourier Layers (QFL). We name the resulting neural network a Quantum Fourier Neural Operator (QFNO). As the other parts of the FNO (the P and Q matrix multiplications from Fig. 1) are easier to adapt with existing quantum techniques [LML+22], we have not focused our work on them. Next, we formulate the analytic expression of the FL's output, in order to prove its correspondence to the QFL.

Fourier Layer. We now discuss the mathematical details of the classical Fourier Layer for a 1D PDE case (e.g. Burgers' equation), showing the inputs and outputs of each transformation involved. For each PDE instance, the input is denoted as A ∈ R^{N_c×N_s}, where N_c denotes the number of channels per sample in the input and N_s corresponds to the number of samples for this instance (initial condition function).
Regarding notation, the elements of A are a_ij, its rows are a_i, and its columns are a_j. We denote the output corresponding to this classical operation as Y ∈ R^{N_c×N_s}, and its elements, rows, and columns as y_ij, y_i, and y_j respectively. As the quantum matrices are orthogonal and the ℓ2-norm of any quantum state vector is 1, we consider the input A such that ‖A‖_2 = 1. Enforcing this condition is easy and does not have any significant impact on the optimisation process.
Going further, a Fourier Transform (FT) is applied to this input along each row of size N_s:

$$\hat{a}_i = FT(a_i), \qquad i \in \{1, \cdots, N_c\},$$

where FT denotes the Discrete Fourier Transform on vectors of size N_s. Let Â be the resulting matrix.
Denoting the maximum number of modes by K, the intermediate linear transform is in fact a multiplication with a 3-tensor W ∈ R^{N_c×N_c×K}. Each W_k ∈ R^{N_c×N_c} corresponds to the k-th matrix of W, indexed along the last dimension and corresponding to the k-th mode (see Fig. 1). In the quantum implementation later, we will consider the matrices W_k to be orthogonal, as this naturally occurs in quantum circuits. We multiply the tensor W with the first K modes (along the N_s dimension) of Â. Said differently, for each j ≤ K, the j-th column of Â is multiplied by W_j, resulting in the following output:

$$\left( W_1 \hat{a}_1, \; \cdots, \; W_K \hat{a}_K, \; \hat{a}_{K+1}, \; \cdots, \; \hat{a}_{N_s} \right).$$

Letting $\hat{b}_j = W_j \hat{a}_j$, we can rewrite the previous matrix as

$$\left( \hat{b}_1, \; \cdots, \; \hat{b}_K, \; \hat{a}_{K+1}, \; \cdots, \; \hat{a}_{N_s} \right).$$

In the original classical proposal [LKA+20], the rest of the modes are discarded (replaced by zeros). In the quantum case, it will be simpler to leave the other modes unchanged. We found that this choice does not impact the performance.
Finally, we apply the Inverse Fourier Transform (IFT) operation on this transformed input, row by row. It results in the following output for each row i:

$$y_i = IFT\left( \hat{b}_{i1}, \cdots, \hat{b}_{iK}, \hat{a}_{i,K+1}, \cdots, \hat{a}_{iN_s} \right),$$

where $\hat{b}_{ij}$ is the i-th component of $\hat{b}_j$. In conclusion, the overall time complexity of the Fourier Layer (FT, linear transform, and IFT) is $O(2N_c N_s \log(N_s) + K N_c^2)$. This runtime can be improved if we consider a distributed algorithm, considering the current availability of efficient GPUs. By parallelising the classical operations, we can achieve the Fourier Layer in $O(N_c + 2N_s \log(N_s))$. Note, however, that the dominant term remains the same, as $N_s \gg N_c$ and K is usually a constant.
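For completeness, these bounds can be decomposed step by step (a sketch under the shapes above; constant factors are indicative only):

```latex
\underbrace{O\big(N_c\, N_s \log N_s\big)}_{\text{row-wise FT}}
\;+\;
\underbrace{O\big(K\, N_c^2\big)}_{K \text{ products } W_k \hat{a}_k}
\;+\;
\underbrace{O\big(N_c\, N_s \log N_s\big)}_{\text{row-wise IFT}}
\;=\;
O\big(2 N_c N_s \log N_s + K N_c^2\big).
```

With one processor per row and per mode, the two transforms take $O(2 N_s \log N_s)$ wall-clock time and each product $W_k \hat{a}_k$ takes $O(N_c)$, which recovers the distributed bound $O(N_c + 2 N_s \log N_s)$.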

III. QUANTUM ALGORITHMIC TOOLS
In this section, we introduce quantum tools necessary to build the Quantum Fourier Layer in Section IV.These tools are meant to be implemented on near-term quantum computers, with modularity so that they can be useful for other applications.
We first introduce the matrix unary amplitude encoding, a fast way to load a matrix as a quantum state.Then, we develop a new quantum circuit to apply a Quantum Fourier Transform (QFT) on the unary basis states.Finally, we present learnable quantum orthogonal layers, the equivalent of learnable weight matrices in classical neural networks.
We describe here a quantum gate that will be common to the next tools: the 2-qubit Reconfigurable Beam Splitter (RBS) gate [FND+20], parameterised by a single parameter θ. The RBS gate has the following unitary:

$$RBS(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

It can be observed that it modifies |01⟩ and |10⟩, while it performs the identity operation on |00⟩ and |11⟩. The RBS gate therefore preserves the Hamming weight of the input state. In particular, any superposition over the unary basis (states with Hamming weight 1) is kept in this basis through a circuit made of RBS gates. Its implementation depends on the quantum hardware considered. Now, we will discuss an identity for these RBS gates and use it in the coming section to implement controllable parameterised circuits made of these gates.
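The RBS gate and its Hamming-weight preservation can be checked with a small numerical sketch (our own code; the sign convention of the rotation block is one common choice):

```python
import numpy as np

def rbs(theta):
    """RBS(theta) on 2 qubits, basis ordered |00>, |01>, |10>, |11>.
    It rotates the unary subspace {|01>, |10>} and fixes |00>, |11>."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0],
                     [0, c, s, 0],
                     [0, -s, c, 0],
                     [0, 0, 0, 1.0]])

U = rbs(0.3)
assert np.allclose(U @ U.T, np.eye(4))              # real orthogonal, hence unitary
assert np.allclose(U @ [1, 0, 0, 0], [1, 0, 0, 0])  # |00> unchanged
assert np.allclose(U @ [0, 0, 0, 1], [0, 0, 0, 1])  # |11> unchanged
# a unary-basis superposition stays unary and normalised:
v = U @ np.array([0, 0.6, 0.8, 0])
assert np.isclose(np.linalg.norm(v), 1) and v[0] == 0 and v[3] == 0
```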
Identity III.1 Given two qubits, applying an RBS gate on them with angle θ, followed by a Z gate on either one of them, is equivalent to applying a Z gate on the same qubit followed by an RBS gate with angle −θ on the two qubits (Figure 2).
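Before the formal proof, the identity can be verified numerically (a sketch under a standard RBS sign convention, with the Z gate on the first qubit; in matrix products, the gate applied first in the circuit appears rightmost):

```python
import numpy as np

def rbs(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, s, 0],
                     [0, -s, c, 0], [0, 0, 0, 1.0]])

Z1 = np.kron(np.diag([1, -1.0]), np.eye(2))  # Z on the first (top) qubit

theta = 0.7
lhs = Z1 @ rbs(theta)       # "RBS(theta), then Z"
rhs = rbs(-theta) @ Z1      # "Z, then RBS(-theta)"
assert np.allclose(lhs, rhs)
```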
Proof: Let us first look at the circuit shown in Figure 2(a). Eq. 6 shows the calculation for the final unitary corresponding to the left-hand side:

$$(Z \otimes I)\, RBS(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & \sin\theta & -\cos\theta & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad (6)$$

and Eq. 7 shows the same for the right-hand side:

$$RBS(-\theta)\, (Z \otimes I) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & \sin\theta & -\cos\theta & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad (7)$$

where both equations finally arrive at the same unitary matrix, thereby proving the identity. A similar calculation for the circuit shown in Figure 2(b) verifies its correctness.

A. Matrix Unary Amplitude Encoding

As seen in Section II, the input of each classical Fourier Layer is a matrix A. As we are about to propose quantum circuits to process this data, we need a method to encode these matrices as quantum states. We chose to encode data as amplitude-encoded states, a superposition of basis states with amplitudes that correspond to the data itself. We chose the unary basis, namely the computational basis vectors that have a Hamming weight of 1, e.g. $|e_1\rangle = |10\cdots0\rangle$.
This choice of basis is motivated by its ability to enable near-term encoding for vectors and matrices. It also allows performing tractable linear algebra tasks with provable guarantees. Higher-order basis states can be used [KP22] but are not the focus of this work. Given an input matrix A ∈ R^{n×d}, its quantum state once loaded should be:

$$|A\rangle = \frac{1}{\|A\|} \sum_{i=1}^{n} \sum_{j=1}^{d} a_{ij}\, |e_i\rangle |e_j\rangle, \qquad (8)$$

where $|e_i\rangle$ and $|e_j\rangle$ are unary basis states of the two registers. To load a matrix in such a way, we use a quantum circuit made of two registers (one for the rows, the other for the columns), as shown in Fig. 3. We can get rid of the normalisation factors if we assume or preprocess the vectors and matrices to be normalised.
On the n-qubit top register, we first load the vector made of the norms of each row (‖A_1‖, ..., ‖A_n‖). On the d-qubit lower register, we sequentially load and unload each row A_i in a controlled fashion. Details can be found in the data loader constructions of [JDM+21, LML+22]. With the right connectivity, the circuit to load a vector of size n has depth O(log(n)); hence, loading a matrix A ∈ R^{n×d} requires a circuit of depth O(log(d) + 2d log(n)).
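The arithmetic behind this two-register loader (row norms on top, normalised rows below) can be sketched classically; the names here are ours:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))
A /= np.linalg.norm(A)                 # global normalisation, ||A||_F = 1

row_norms = np.linalg.norm(A, axis=1)  # amplitudes loaded on the top register
rows = A / row_norms[:, None]          # unit rows loaded (controlled) below

# amplitude of |e_i>|e_j> is ||A_i|| * (a_ij / ||A_i||) = a_ij,
# reproducing the target state |A> of Eq. 8
state = row_norms[:, None] * rows
assert np.allclose(state, A)
assert np.isclose(np.linalg.norm(state), 1)
```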

B. Unary Quantum Fourier Transform
The Quantum Fourier Transform (QFT), one of the most impactful algorithms in the quantum computing literature, provides an exponential speedup compared to classical computing. It performs the Discrete Fourier Transform over the entire 2^n-dimensional Hilbert space. In this work, we propose a new quantum circuit that performs the Discrete Fourier Transform over the unary basis states. This allows for a shallow-depth quantum circuit adapted to the quantum data encoding presented in the previous section. The classical algorithm for performing the Fast Fourier Transform (FFT) uses a butterfly-shaped diagram [CT65], shown in Fig. 4. Our goal is to draw inspiration from the classical FFT diagram and perform the same operation with quantum circuits. Namely, the unitary matrix, once restricted to the unary basis, must implement the FFT matrix. With an input x ∈ R^n, the FFT matrix F_n is given by:

$$F_n = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega & \omega^2 & \cdots & \omega^{n-1} \\ 1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{n-1} & \omega^{2(n-1)} & \cdots & \omega^{(n-1)^2} \end{pmatrix},$$

where $\omega^k = e^{i\frac{2\pi}{n}k}$ is the n-th root of unity raised to some power k based on the position of the radix. Note that F_n is not unitary, but its scaled version, $F_n/\sqrt{n}$, is unitary. Therefore, we will implement the scaled version with our quantum circuit. As shown in Fig. 4, the input x is permuted, which we will have to take into account when loading data quantumly.
The classical FFT algorithm is decomposed into several radix-2 operations (Fig. 4). Each one is itself a matrix multiplication which transforms an input pair $(x_a, x_b)$ into $(x_a + \omega^k x_b,\; x_a - \omega^k x_b)$.

Figure 5: The quantum circuit obtained from the classical FFT diagram (Fig. 4) by replacing each radix-2 operation with the phase gate followed by the RBS gate. The input must be a vector |x⟩ encoded in the unary basis, with the right permutation. The output will be |x̂⟩.
To implement the radix-2 operation (scaled by a factor of 1/√2 to make it unitary) on two qubits, we use one single-qubit gate and one RBS gate (Eq. III). The single-qubit gate is a phase gate $\begin{pmatrix} 1 & 0 \\ 0 & -\omega^k \end{pmatrix}$ applied on the top qubit only. Then, we apply an RBS gate with an angle of −π/4. Overall, once restricted to the unary basis, namely {|01⟩, |10⟩} (middle rows and columns) in the case of two qubits, this applies the following unitary:

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -\omega^k \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & \omega^k \\ 1 & -\omega^k \end{pmatrix},$$

which is exactly the desired radix operation. We apply these operations exactly in the manner of the classical FFT architecture. For an n-dimensional vector, we would thus require n qubits, log(n) depth and n log(n) gates.
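That the phase gate followed by RBS(−π/4) reproduces the scaled radix-2 butterfly on the unary subspace can be checked numerically (our own sketch, restricted to the 2×2 unary block):

```python
import numpy as np

def rbs(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, s, 0],
                     [0, -s, c, 0], [0, 0, 0, 1.0]])

def radix2_unary(k, n):
    """Phase gate diag(1, -w^k) on the top qubit, then RBS(-pi/4),
    both restricted to the unary subspace {|01>, |10>}."""
    w = np.exp(2j * np.pi * k / n)
    phase = np.diag([1, -w])           # |01>: top qubit 0 -> 1 ; |10>: top qubit 1 -> -w^k
    rbs_u = rbs(-np.pi / 4)[1:3, 1:3]  # unary block of RBS(-pi/4)
    return rbs_u @ phase               # phase applied first in the circuit

n, k = 8, 3
w = np.exp(2j * np.pi * k / n)
target = np.array([[1, w], [1, -w]]) / np.sqrt(2)  # unitary radix-2 matrix
assert np.allclose(radix2_unary(k, n), target)
```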
The Unary-QFT is meant to be applied on a unary quantum state, after a vector data loader for instance.
Note that the input to the circuit needs to be the permuted version of the vector.This is a fixed permutation, and we can construct our data loader accordingly.
Finally, it is simple to apply the inverse Fourier transform in a similar way; we name it the Unary-IQFT. It is simply the inverse of the above circuit (Fig. 5), along with the right permutation of the input data.
In the rest of this work, we will use the following equations. Given a normalised real vector $x = (x_1, \cdots, x_M)$ and its Fourier Transform $\hat{x} = (\hat{x}_1, \cdots, \hat{x}_M)$, the QFT operation in the unary basis and its inverse, the IQFT operation, can be defined as follows:

$$QFT: \sum_{i=1}^{M} x_i\, |e_i\rangle \;\mapsto\; \sum_{i=1}^{M} \hat{x}_i\, |e_i\rangle, \qquad IQFT: \sum_{i=1}^{M} \hat{x}_i\, |e_i\rangle \;\mapsto\; \sum_{i=1}^{M} x_i\, |e_i\rangle. \qquad (11)$$
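Restricted to the unary basis, the unary-QFT acts on the M amplitudes exactly as the scaled DFT matrix. A minimal classical sketch (we build the matrix explicitly with the positive-exponent ω convention used above, which may differ from a library FFT's sign convention):

```python
import numpy as np

def unary_qft(x):
    """Apply the scaled DFT matrix F_M / sqrt(M) to the M unary-basis amplitudes,
    with omega = e^{+i 2 pi / M}."""
    M = len(x)
    j, k = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
    F = np.exp(2j * np.pi * j * k / M) / np.sqrt(M)
    return F @ x

x = np.array([0.5, 0.5, 0.5, 0.5])   # a normalised unary-amplitude vector
x_hat = unary_qft(x)
assert np.isclose(np.linalg.norm(x_hat), 1.0)             # unitarity: norm preserved
assert np.allclose(np.conj(unary_qft(np.conj(x_hat))), x) # conjugate transform inverts it
```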

C. Quantum Circuits as Trainable Linear Transforms
To mimic the learnable part of the Fourier Layer, we need to perform quantumly some form of matrix multiplication in the unary basis. With this in mind, we propose to use Quantum Orthogonal Layers [LML+22], namely parameterised quantum circuits using Hamming-weight-preserving gates only. In our case, we are consequently ensured to preserve the superposition in the unary basis before applying the final Inverse Fourier Transform. The unitaries considered have real values and, once restricted to the unary basis, correspond to orthogonal matrices, hence their name. In the context of the Fourier Layer, we are now able to apply a learnable linear transform (see Section II) by implementing orthogonal matrix multiplications with trainable parameters.
Several circuits are possible, but we will mostly use the Butterfly circuit shown in Fig. 6: it has the same layout as the previous Unary-QFT (Section III B), and for this reason it has a logarithmic depth if the hardware connectivity allows it. For an input x ∈ R^n, the Butterfly quantum circuit has O(n log(n)) parameterised gates. The corresponding matrices, once restricted to the unary basis, lie in a subgroup of orthogonal matrices. Other circuits, with nearest-neighbour qubit connectivity, are also usable in our case. They differ in their expressivity, number of parameters, and depth.
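As a classical sketch, the unary-basis action of such a butterfly circuit can be assembled from planar (Givens-like) RBS rotations; the layout and names below are ours:

```python
import numpy as np

def butterfly_orthogonal(thetas, n):
    """Orthogonal n x n matrix from a butterfly layout of RBS rotations:
    log2(n) layers, each with n/2 two-coordinate rotations (n a power of 2)."""
    U = np.eye(n)
    t = iter(thetas)
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for i in range(start, start + span):
                j = i + span
                th = next(t)
                G = np.eye(n)             # RBS restricted to coordinates (i, j)
                G[i, i] = G[j, j] = np.cos(th)
                G[i, j], G[j, i] = np.sin(th), -np.sin(th)
                U = G @ U
        span //= 2
    return U

n = 8
rng = np.random.default_rng(2)
num_params = (n // 2) * int(np.log2(n))   # O(n log n) trainable angles
U = butterfly_orthogonal(rng.normal(size=num_params), n)
assert np.allclose(U @ U.T, np.eye(n))    # the restricted matrix is orthogonal
```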
In the next sections, we will detail how to apply such circuits in a controlled fashion. This will become useful, as K such orthogonal matrix multiplications will be applied on the first K modes (see Section II) after the Fourier Transform.

Figure 6: Parameterised quantum circuits with a butterfly shape will take the role of the linear transforms (matrix multiplications) in the Fourier Layer. Given complete hardware connectivity, their depth is logarithmic in the number of qubits.

IV. QUANTUM CIRCUITS FOR FOURIER LAYER
Using the circuits from the previous section as building blocks, we propose three quantum circuits for the classical Fourier Layer (FL), named the Sequential (Section IV A), Parallel (Section IV B), and Composite (Section IV C) Quantum Fourier Layer (QFL) respectively. These quantum circuits are differentiated by how the intermediate matrix multiplications of the classical FL are implemented using the learnable quantum circuit. We compare their computational complexities (see Table I) and their efficiency in practice in the following sections.
To reproduce the classical Fourier Layer (FL) (see Section II) using quantum circuits, we need the final quantum state to correspond to the result of the classical FL, as shown in Eq. 4. Therefore, we ideally expect a quantum output state of the following form:

$$|y\rangle = \sum_{i=1}^{N_c} \sum_{j=1}^{N_s} y_{ij}\, |e_i\rangle |e_j\rangle, \qquad (12)$$

which simply means that the matrix y is encoded in the unary basis, as in Eq. 8, where y_ij is the j-th component of the resulting vector y_i defined in Eq. 4. Note that there are no normalisation factors in the above equation, as the rows of the matrix (a_1, ..., a_{N_c}) are assumed normalised, and the normalisation is preserved through the circuit.
For our three quantum circuit proposals, the Sequential circuit (IV A), the Parallel circuit (IV B), and the Composite circuit (IV C), we will compare their output quantum state to the desired one (Eq. 12), showing how well we are able to replicate the classical operation.
Our conclusions are the following: the Sequential circuit returns the desired output |y⟩ but might have a prohibitive depth for near-term applications. On the other hand, both the Parallel and Composite circuits are designed to have a shorter depth, at the cost of producing a slightly different quantum output. Interpreting these alternative outputs is complicated, and the numerical simulations in Section V will assess their quality.
The three proposals are closely related, as shown in Figures 7, 9, and 10. All circuits start with $|0\rangle^{\otimes N_c} \otimes |0\rangle^{\otimes N_s}$, where the top and lower registers are used respectively to encode the N_c rows and the N_s columns of the input matrix A. Indeed, all circuits start by loading the input matrix A as explained in Section III A. We recall that N_s is the number of samples used to encode each initial condition function of the PDE as a vector. N_c is another dimension we use to extend this vector into a matrix. We usually have N_s ≫ N_c. After this step, the Unary-QFT from Section III B is applied on the lower register. At the very end, the inverse QFT is applied similarly, followed by a measurement of both registers. Between the two Fourier Transforms lies the core difference between the three proposals: the implementation of the K matrix multiplications (see Fig. 1 and Section II). This part is a trade-off between circuit depth, circuit repetitions, and correspondence with the classical Fourier Layer.

A. Sequential Circuit for Fourier Layer
This first proposal is represented in Fig. 7. As explained, the Sequential circuit starts by loading the input matrix A, which in general is itself the output of the previous Fourier Layer. Therefore, the first part of the circuit is nothing else than the circuit displayed in Fig. 3. Its depth is $O(\log(N_c) + 2N_c \log(N_s))$. The resulting state is:

$$\sum_{i=1}^{N_c} \sum_{j=1}^{N_s} a_{ij}\, |e_i\rangle |e_j\rangle.$$

To follow the classical operation, the Fourier Transform should be applied to each of the rows. Thus, we apply the Unary-QFT operation on the lower register, which currently encodes a superposition of these row vectors.
We first rewrite the previous state as:

$$\sum_{i=1}^{N_c} |e_i\rangle \otimes \left( \sum_{j=1}^{N_s} a_{ij}\, |e_j\rangle \right) = \sum_{i=1}^{N_c} |e_i\rangle \otimes |a_i\rangle.$$

Figure 7: Sequential-QFL: the proposed Sequential Quantum Circuit, which replicates the classical FNO operation from Fig. 1 (if we measure at the end). Further details regarding it are given in Section IV A. The yellow box comprises a controlled parameterised circuit having the j-th qubit (1 ≤ j ≤ K) of the lower register as the controlling qubit.
We then apply the QFT operation, as in Eq. 11, to the lower register:

$$\sum_{i=1}^{N_c} |e_i\rangle \otimes QFT\,|a_i\rangle = \sum_{i=1}^{N_c} \sum_{j=1}^{N_s} \hat{a}_{ij}\, |e_i\rangle |e_j\rangle,$$

where $\hat{a}_{ij}$ corresponds to the classical, row-wise Fourier Transform of A. Note that the resulting state has kept its superposition in the unary basis.
Next, we realise the learnable linear transform, namely the K matrix multiplications, as in the middle of the classical Fourier Layer (Fig. 1). Each matrix W_k, following the classical notation from Section II, will now become an orthogonal matrix in the quantum case. This multiplication is done with the parameterised quantum circuits from Section III C, which preserve the unary basis. We choose to use the learnable butterfly circuit from Fig. 6 for its shallow depth. As explained before, applying these learnable circuits effectively applies an orthogonal matrix multiplication in the unary basis. The main challenge, which will differentiate the three proposals, consists in applying these matrix multiplications to the first K columns (â_1, ..., â_K) independently. Contrary to the previous QFT, the linear transform should act on the columns, encoded in the upper register.
As shown in Fig. 7, we propose to apply sequentially K such parameterised circuits P_1, …, P_K (butterfly circuits) on the top register. To ensure that independent transformations are applied on each column â_j, we propose a controlled implementation of the parameterised circuit, or Controlled Parameterised Circuit. It effectively applies the parameterised circuit to the upper register only when a particular qubit of the lower register is in the activated state |1⟩. For the circuit P_j, controlled by the j-th qubit of the lower register, this controlled implementation begins with the application of the circuit P_j on the upper register, followed by the application of Z-gates to certain qubits in the upper register, applied when the j-th qubit of the lower register is in state |0⟩. This can be seen as a controlled implementation of some matrix U_Z (see Fig. 7), and we thus denote this transformation by CU_Z. The set of qubit indices on which these Z-gates are applied is selected using the following rule: each RBS gate of the parameterised circuit P_j has a Z-gate applied to exactly one of its two qubits. For instance, if P_j has nearest-neighbour connectivity, the desired set can be the set of even or odd qubit indices. Subsequently, we apply another parameterised circuit, similar to P_j but with the order of the RBS gates reversed, which we denote P̄_j. Finally, we repeat the controlled Z-gate operations on the same qubits. This completes the implementation of the controlled parameterised circuit. Let us now see how this implementation achieves the desired task. Fig.
8 shows the controlled version of the parameterised circuit from Fig. 6. Here, the lowest qubit controls all the Z-gates and is thus the controlling qubit for the parameterised circuit. The complete circuit includes the parameterised circuit (P_j) followed by controlled Z-gates on some of the qubits (CU_Z). These qubits are selected using the following criterion: every RBS gate in the circuit has exactly one of its two qubits lying in this set. Finally, the circuit is applied again, but this time with the order of the RBS gates reversed (P̄_j). Denoting the angles [θ_1, θ_2, …, θ_M] by θ, where M is the total number of parameterised RBS gates in P_j, and writing the circuit P_j as P_j(θ), the following relation holds between P_j and P̄_j: where P_j† denotes the conjugate transpose of P_j. Before proceeding further with the implementation of the controlled parameterised circuit, we first state a claim regarding the existence of U_Z for any parameterised butterfly circuit.
Claim 4.1 For an N-qubit butterfly-shaped parameterised circuit described in Section III, with N = 2^a for any whole number a, there exists a set of qubit indices I_a such that every RBS gate in the circuit has exactly one of its qubit indices lying in this set. Proof: We proceed by recurrence. The base case is a = 1 (N = 2): the single RBS gate connects the two qubits, so I_1 can be either {1} or {2}. Now assume I_a exists for some a ≥ 1. Since I_a contains exactly one qubit index of every RBS gate, the indices not in I_a (denoted Ī_a) cannot have any RBS gate between them; moreover, since exactly one index of every RBS gate is in I_a, the other index lies in Ī_a. Thus Ī_a is also a valid solution set. Now combine two N = 2^a qubit circuits into a butterfly on 2^{a+1} qubits, choosing I_a for the first copy and Ī_a (suitably shifted) for the second, denoted I_a^1 and Ī_a^2. The new butterfly layer only adds RBS gates connecting the i-th qubit of the first copy to the i-th qubit of the second copy, and exactly one of these two indices lies in I_a^1 ∪ Ī_a^2 (since i ∈ I_a if and only if the shifted index is not in Ī_a^2). Thus this union can serve as I_{a+1}. Hence, if I_a exists, then I_{a+1} also exists.
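As a sanity check, the recurrence in this proof can be verified numerically. The sketch below (our own illustration, not code from the paper) builds the butterfly connectivity on 2^a qubits with 0-based indices and the index set I_a from the recurrence I_1 = {0}, I_{a+1} = I_a ∪ (shifted complement of I_a), then checks that every RBS gate has exactly one endpoint in I_a:

```python
def butterfly_gates(n):
    """RBS connectivity of a butterfly circuit on n = 2**a qubits (0-based):
    strides n/2, n/4, ..., 1, one gate per pair at each stride."""
    gates, stride = [], n // 2
    while stride >= 1:
        for i in range(n):
            if i % (2 * stride) < stride:
                gates.append((i, i + stride))
        stride //= 2
    return gates

def index_set(a):
    """Claim 4.1 recurrence: I_1 = {0}; I_{a+1} = I_a on the first half,
    plus the complement of I_a shifted onto the second half."""
    I = {0}
    for k in range(1, a):
        half = 2 ** k
        I = I | {half + i for i in range(half) if i not in I}
    return I

# every RBS gate must have exactly one endpoint in I_a
for a in range(1, 5):
    I = index_set(a)
    assert all((i in I) != (j in I) for (i, j) in butterfly_gates(2 ** a))
```

For example, for N = 4 the recurrence yields I_2 = {0, 3}, which indeed touches each of the four butterfly gates (0,2), (1,3), (0,1), (2,3) exactly once.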
Going further, we now use Identity III.1 for RBS gates to arrive at the following equation: Using this, we arrive at the following relation: Thus, it can be observed that when the lowest qubit in Fig. 8 is in state |0⟩, each RBS gate in the last layer of P_K is cancelled out by the corresponding RBS gate in the first layer of P̄_K. After this, the second-to-last layer is cancelled by the second layer of P̄_K, and finally only the Z-gate operations (U_Z) remain. The application of the second CU_Z unitary then re-applies U_Z to the upper register, and using U_Z U_Z = I, we conclude that the state of the upper register is preserved. Thus, applying the X-gate on the lowermost qubit again retains the initial state before the application of this circuit.

Figure 8: The parameterised butterfly circuit followed by Z-gates on certain qubits, and its version flipped around the vertical axis. The dash-enclosed region shows the last-layer RBS gates of the parameterised circuit being cancelled out by the first-layer RBS gates of its flipped version, using the RBS identities in Fig. 2. The same then holds for the second-to-last layer once the last layer is cancelled, and so on. Eventually, only the Z-gates remain.
On the other hand, when the lowest qubit is in state |1⟩, no Z-gates are applied, and the initial state of the remaining qubits is transformed by P_K and then P̄_K. This corresponds to a controlled version of a parameterised circuit, namely P̄_j P_j. Note that it is slightly different from applying a controlled P_j alone, whose implementation remains an open question.
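This cancellation can be checked numerically in the unary subspace, where each RBS gate acts as a Givens rotation and U_Z is a diagonal ±1 matrix. The sketch below (our own illustration, with an arbitrary 4-qubit butterfly connectivity and arbitrary angles) verifies that, when the Z-gates fire (the control-|0⟩ branch), the sequence P, U_Z, P̄, U_Z composes to the identity:

```python
import numpy as np

def givens(n, a, b, theta):
    """Action of an RBS gate on the unary subspace: an n x n Givens rotation
    mixing coordinates a and b."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[a, a], G[a, b], G[b, a], G[b, b] = c, s, -s, c
    return G

n = 4
gates = [(0, 2), (1, 3), (0, 1), (2, 3)]   # butterfly connectivity, applied in this order
thetas = [0.3, -1.1, 0.7, 2.0]             # arbitrary angles
I_set = {0, 3}                             # each gate has exactly one endpoint here (Claim 4.1)
Uz = np.diag([-1.0 if i in I_set else 1.0 for i in range(n)])

P = np.eye(n)
for (a, b), t in zip(gates, thetas):
    P = givens(n, a, b, t) @ P             # P: gates applied left to right
Pbar = np.eye(n)
for (a, b), t in zip(reversed(gates), reversed(thetas)):
    Pbar = givens(n, a, b, t) @ Pbar       # Pbar: same angles, reversed gate order

# control in |0>: P, then U_Z, then Pbar, then U_Z leaves the register untouched
assert np.allclose(Uz @ Pbar @ Uz @ P, np.eye(n))
```

The cancellation works because conjugating a Givens rotation by U_Z flips its angle whenever exactly one of its two coordinates carries a Z, so U_Z P̄ U_Z applies the inverses of the gates of P in reverse order.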
Let us now apply the controlled version of P_1 on the state in Eq. (15). After re-arranging terms and applying P_1, it can be written as: Now, on applying the CU_Z unitary using the first qubit of the lower register as the control (applied when this qubit is in state |0⟩), the state becomes: Next, applying the flipped circuit P̄_1 transforms the state to: Using Eq. (18), the state reduces to: Finally, repeating the controlled application of U_Z leads to: Applying K such controlled circuits, where the j-th circuit on the upper register is controlled by the j-th qubit in the lower register, and re-arranging the terms in the last equation, leads to the following state: We now want to compare the state obtained in Eq. (24) to the classical output of the FNO from Eq. (3), before the final inverse Fourier Transform. We recall that in the classical case, each of the first K columns â_j was multiplied by an independent matrix W_j. In the quantum case, we need to understand whether the same operation is applied, and with which matrix W_Q^j. For all j ∈ [1, K], we saw in Eq. (24) that P̄_j P_j was applied to the unary encoding of â_j. We denote by W_{P_j} and W_{P̄_j} the unary matrices of P_j and P̄_j respectively. Each matrix is the N_c × N_c submatrix of the whole unitary of size 2^{N_c} × 2^{N_c}, corresponding to the basis states of unary vectors (see Section III C). Therefore, considering only the top register, the j-th operation P̄_j P_j corresponds to the submatrix W_Q^j: This matrix W_Q^j is the quantum implementation corresponding to the matrix W_j used in the classical FNO (Eq. (3)). The overall matrix (â_ij) can be decomposed into: where each b̂_j is the quantum counterpart of the one used in the classical FNO. Thus, the state after these controlled parameterised circuits can be written as: Finally, the output state of this circuit after the IQFT on the lower register becomes: Since IQFT(Σ_i x̂_i |e_i⟩) = Σ_i x_i |e_i⟩, where IFT(x̂) = x, the j-th component of the IFT would be the same as the
coefficient of the j-th state after the IQFT. From this, we can conclude that the state in Eq. (12) is equivalent to the state in Eq. (27), and thus this circuit replicates the classical operation. Finally, let us now discuss the depth complexity of this circuit.

Depth Complexity (d).
Based on the discussion of the Sequential QFL circuit, it can be divided into four parts: (a) unary loading of the input matrix (d_load), (b) applying the QFT on the lower register (d_qft), (c) applying K Controlled Parameterised Circuits on the upper register (d_cpc), and (d) applying the inverse QFT on the lower register (d_iqft). Thus, the depth of the complete Sequential QFL circuit becomes:

B. Parallelised Circuit for Fourier Layer
For the Sequential QFL discussed in the previous subsection, the depth complexity of the learnable part is linear in the number of modes (K). Given the multiplicative noise model for NISQ devices, this linear dependence might hinder learning. A helpful modification is then to parallelise the learnable butterfly circuits, which can make learning in the presence of noise more efficient and reduce the circuit's depth complexity. Fig. 9 shows this modified version of the Sequential QFL, consisting of K quantum circuits operating in parallel, each implementing only one learnable circuit controlled by one of the top K qubits in the lower register. As all the circuits up to the learnable part are the same as in the sequential circuit, we can directly write the state after the QFT using Eq. (15), as: where the index k denotes the k-th parallel circuit, and (â_ij)_k, |e_i⟩_k denote the coefficient â_ij and the state |e_i⟩ corresponding to this k-th circuit respectively. Also, in the k-th parallel circuit, the learnable butterfly part is controlled by the k-th qubit of the lower register. We recall that the parameterised circuit applied on the top register effectively maps the vector â_j to b̂_j (see Eq. (27)), and thus we can write the updated state of the circuits as: Now, applying the IQFT on the lower register in each of the circuits independently: We denote the coefficients of this state corresponding to the k-th circuit by (c_ij)_k, re-writing it as: where all of the (c_ij)_k are given explicitly by the equation for the Fourier transform as follows: Similarly, writing c_ij for the sequential circuit discussed above (using Eq. (27)): Comparing the above two equations leads to the observation that the coefficients in Eq.
(34) would not be a subset of the coefficients in Eq. (33), and there is no closed-form classical processing/transformation to achieve this. Thus, this parallel circuit results in a somewhat different operation, which might be intuitively similar to the sequential circuit, but whose output is different. However, Section V shows that this conceptually similar operation is also effective in dealing with PDEs/images and is expected to be more efficient than the sequential circuit in a noisy scenario. Also, if we remove the IQFT operation from this circuit and instead apply the classical IFT, measuring after Eq. (30), we get the following K matrices of size N_c × N_s after applying the square-root operation: where b̂_j and â_j have been defined previously. If we combine b̂_k from all of the K matrices with (â_j)_{j∈[K+1,N_s]} from any one of the K matrices, say the first one, it leads to the following N_c × N_s matrix: which is exactly the same as Eq. (3). Thus, this modified circuit (without the IQFT), followed by some classical post-processing and the IFT, can replicate the classical Fourier layer operation.
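The classical post-processing step can be sketched as follows (our own toy illustration with real amplitudes; sign recovery after the square root of measured probabilities is glossed over). Each of the K parallel circuits transforms only its own mode; combining the k-th column of the k-th matrix with the untouched columns reproduces the classical middle step of Eq. (3):

```python
import numpy as np

rng = np.random.default_rng(0)
Nc, Ns, K = 4, 8, 3
A_hat = rng.standard_normal((Nc, Ns))   # toy stand-in for the row-wise FT of the input
Ws = [np.linalg.qr(rng.standard_normal((Nc, Nc)))[0] for _ in range(K)]  # orthogonal W_j

# k-th parallel circuit: only mode (column) k is transformed by W_k
Ms = [A_hat.copy() for _ in range(K)]
for k in range(K):
    Ms[k][:, k] = Ws[k] @ A_hat[:, k]

# classical post-processing: take column k from the k-th matrix,
# and the untouched columns from (say) the first matrix
combined = Ms[0].copy()
for k in range(K):
    combined[:, k] = Ms[k][:, k]

# reference: the classical middle step of the Fourier layer (Eq. (3) before the IFT)
ref = A_hat.copy()
for k in range(K):
    ref[:, k] = Ws[k] @ A_hat[:, k]
assert np.allclose(combined, ref)
```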

Depth Complexity (d).
Given that the only difference compared to the Sequential QFL is the parallel implementation of the controlled parameterised circuits, the depth complexity of this circuit can be derived by substituting K = 1 in Eq. (37): and a total of K independent quantum circuits are required to execute it.

C. Composite Circuit for Fourier Layer

As highlighted in the previous subsection, the depth of the parameterised part of the sequential circuit might make the learning process difficult on currently available noisy quantum hardware. Even though the Parallelised QFL can deal with this, its requirement of K independent (N_c + N_s)-qubit circuits might not be feasible in many cases. Therefore, we propose a new operation corresponding to the learnable part of the sequential circuit and term the resulting overall circuit the Composite QFL. It significantly decreases the learnable part's depth complexity while requiring only one quantum circuit with N_c + N_s qubits. Instead of applying the K controlled parameterised circuits, we span a single, more extensive parameterised circuit over the upper register (N_c qubits) and the top K qubits of the lower register. Fig. 10 shows the diagram for this circuit. Note that, before the parameterised circuit, the upper and lower registers are each independently unary.
If we jointly consider the upper and lower registers, the state is in a superposition over Hamming-weight-two basis states. However, if we consider only the upper register and just the top K qubits of the lower register, the state can be in a superposition over Hamming-weight-one and -two basis states. Given that RBS gates are Hamming-weight preserving, the state after applying the parameterised circuit will also be a superposition of Hamming-weight-one and -two basis states.
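The Hamming-weight-preserving property of the RBS gate is easy to verify directly: written as a 4×4 unitary it is block-diagonal in the Hamming-weight sectors {|00⟩}, {|01⟩, |10⟩}, {|11⟩}, and therefore commutes with the Hamming-weight operator. A minimal check:

```python
import numpy as np

def rbs(theta):
    """Two-qubit RBS gate in the basis (|00>, |01>, |10>, |11>):
    a rotation inside the Hamming-weight-1 block, identity elsewhere."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0],
                     [0, c, s, 0],
                     [0, -s, c, 0],
                     [0, 0, 0, 1]])

# Hamming-weight operator on two qubits: diag(0, 1, 1, 2)
Wt = np.diag([0.0, 1.0, 1.0, 2.0])
U = rbs(0.7)
assert np.allclose(U @ Wt, Wt @ U)       # the gate commutes with the weight operator
assert np.allclose(U @ U.T, np.eye(4))   # and is orthogonal
```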
Note that the input superposition cannot contain all the possible basis states with Hamming weights 1 and 2 on the top N_c + K qubits. For instance, it does not comprise unary states in which the 1 is in the lower K qubits. Similarly, for Hamming weight 2, it does not comprise states with both 1s in the top N_c qubits or both in the bottom K qubits. In contrast, the output superposition can contain any of the Hamming-weight-1 or -2 states. Recall that the number of possible Hamming-weight-1 states on these N_c + K qubits is N_c + K, and the number of Hamming-weight-2 states is C(N_c+K, 2). Let us now discuss the application of the parameterised circuit on these N_c + K qubits. The complete unitary B is a 2^{N_c+K} × 2^{N_c+K} block-diagonal matrix with each block corresponding to a subspace of fixed Hamming weight [KP22], B = B_0 ⊕ B_1 ⊕ … ⊕ B_n, where B_i corresponds to the block for the subspace of Hamming weight i. Since our input has Hamming weight 1 or 2, we only care about the unitaries B_1 and B_2; B_1 is of size (N_c + K) × (N_c + K) and B_2 of size C(N_c+K, 2) × C(N_c+K, 2). Given that the circuit is the same as the sequential circuit up to the QFT operation, the state of this circuit after the QFT is the same as the one in Eq. (15). We now separate this complete state into two sets of states corresponding to Hamming weights 1 and 2: where the first term corresponds to the superposition of Hamming-weight-two basis states |h_2⟩ and the second term to the superposition of Hamming-weight-one basis states |h_1⟩. On application of the parameterised circuit, the unitary B_1 acts on |h_1⟩ and B_2 on |h_2⟩. Let us first focus on the term corresponding to |h_1⟩. It does not contain the states where the qubits in the upper register are all 0 and the 1 lies in the top K qubits of the lower register; this implies that the coefficients of all these states should be taken as zero. Therefore, the state corresponding to this |h_1⟩ can also be written as: where |e_0⟩ denotes the state with no ones in the upper N_c-qubit register and |e_ij⟩ denotes the Hamming-
weight-2 states of the lower register, where i and j denote the positions of the 1s. Similarly, if we consider the first term in Eq. (38), corresponding to |h_2⟩, we further have to include states where both 1s are in the upper register or both are in the top K qubits of the lower register. These new states again have zero coefficients. As a result, we can write the term corresponding to |h_2⟩ in Eq. (38) as: This results in a total of C(N_c+K, 2) states. Let us now discuss the application of the parameterised circuit (B_1, B_2) on the |h_1⟩ and |h_2⟩ states in Eq. (39) and Eq. (40) respectively. For the Hamming-weight-1 basis, the application of the parameterised circuit (B_1) has already been discussed in Section IV A. For notational consistency, we denote this operation as a multiplication with a matrix W_1. Furthermore, we also apply a post-selection operation to preserve the basis, selecting only the states with a non-zero coefficient before the application of B_1. Along the same lines, for the Hamming-weight-2 case, the application of the parameterised circuit (B_2) can be interpreted as multiplication by a matrix W_2 ∈ R^{q×q} where q = C(N_c+K, 2). Based on a recent work on subspace states [KP22], if the parameterised circuit has nearest-neighbour connectivity, then the matrix B_2 is the order-2 compound matrix [HJ12] of B_1. Therefore, in this case, each of its elements corresponds to the determinant of a 2 × 2 submatrix of W_1: for some a, b, k < N_c + K. For the butterfly-shaped circuit, we are not limited to nearest-neighbour connectivity and thus W_2 has to be extracted from the complete unitary (2^{N_c+K} × 2^{N_c+K}) directly. After applying this unitary B_2 on |h_2⟩, the transformed coefficients c_ij are: As in the Hamming-weight-1 case, here also we use a post-selection operation to discard the states which initially had zero coefficients, thereby preserving the basis.
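The order-2 compound construction can be illustrated numerically: given an orthogonal W_1, the matrix W_2, indexed by pairs (i < j), has entries equal to 2×2 minors of W_1, and by the Cauchy-Binet formula it is again orthogonal. A sketch (our own illustration):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n = 5                                               # stands in for N_c + K
W1 = np.linalg.qr(rng.standard_normal((n, n)))[0]   # orthogonal stand-in for W_1
pairs = list(combinations(range(n), 2))             # Hamming-weight-2 index pairs (i < j)
q = len(pairs)                                      # q = C(n, 2)

W2 = np.zeros((q, q))
for r, (a, b) in enumerate(pairs):
    for c, (i, j) in enumerate(pairs):
        # order-2 compound: determinant of the 2x2 submatrix (rows a,b / columns i,j)
        W2[r, c] = W1[a, i] * W1[b, j] - W1[a, j] * W1[b, i]

# Cauchy-Binet gives C2(A)C2(B) = C2(AB), so the compound of an orthogonal matrix is orthogonal
assert np.allclose(W2 @ W2.T, np.eye(q))
```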
Combining the transformed |h_1⟩, |h_2⟩ states and applying the IQFT on the lower register, the final output state of this circuit is: We term the overall circuit, when the parameterised circuit has nearest-neighbour connectivity (pyramid-shaped), the Composite Circuit (Compound), with depth complexity (N_c + K) + log(N_c) + (N_c + 2) log(N_s); for the butterfly-shaped circuit we term it the Composite Circuit (Butterfly), with depth complexity log(N_c + K) + log(N_c) + (N_c + 2) log(N_s).

V. EXPERIMENTS
This section analyses our proposed quantum algorithms on solving PDEs and on image classification tasks. We compare them against the state of the art in both domains, i.e., classical Fourier networks for PDEs and CNNs for image classification. All the details related to architecture and hyperparameters are provided at the end of the paper. All the experiments shown in this section are simulated, i.e., the quantum operations have been simulated using the classical matrices corresponding to the quantum unitaries, since currently available quantum hardware is too noisy for circuits of this size. However, we expect the upcoming generations of quantum hardware to support experiments at this scale. Also, the Butterfly version of the Composite Circuit, i.e., the Composite Circuit (Butterfly), has been used for all the experiments.

A. Partial Differential Equations
We show results on all three PDE families used in the classical Fourier Layer paper [LKA+20]: Burgers' equation, Darcy's flow equation and the Navier-Stokes equation, using the datasets proposed in that paper. All of these equations were designed for modelling the flow of fluids and have found applications in other domains as well. We describe the three tasks abstractly; for the equations and other details, please refer to [LKA+20]. We analyse the performance of the trained networks across different resolutions (N_s) for the first two, and for different viscosity values for the third. Burgers' Equation. It is a 1D partial differential equation for modelling fluid motion and is expressed as follows: ∂_t u(x, t) + u(x, t) ∂_x u(x, t) = ν ∂_xx u(x, t), x ∈ (0, 1), t ∈ (0, 1], with u(x, 0) = u_0(x), (45) where ν ∈ R_+ corresponds to the fluid viscosity and u_0 denotes the initial condition function for this PDE family. We need to learn the mapping from this u_0 to the solution at time one, u(x, 1), for a given viscosity.
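For intuition on how such input-output pairs arise, here is a minimal pseudo-spectral time-stepper for the viscous Burgers' equation on a periodic domain (our own sketch, assuming the standard form ∂_t u + u ∂_x u = ν ∂_xx u; this is not the dataset-generation code of [LKA+20]):

```python
import numpy as np

def burgers_step(u, nu, dt, L=1.0):
    """One explicit Euler, pseudo-spectral step of u_t + u u_x = nu u_xx
    on a periodic domain of length L (toy sketch, not production code)."""
    n = len(u)
    k = 2j * np.pi * np.fft.fftfreq(n, d=L / n)  # i*k factors for spectral derivatives
    u_hat = np.fft.fft(u)
    ux = np.fft.ifft(k * u_hat).real             # u_x
    uxx = np.fft.ifft(k ** 2 * u_hat).real       # u_xx
    return u + dt * (-u * ux + nu * uxx)

# toy initial condition u_0(x) = sin(2*pi*x), marched a few steps
x = np.linspace(0.0, 1.0, 64, endpoint=False)
u = np.sin(2 * np.pi * x)
for _ in range(100):
    u = burgers_step(u, nu=0.1, dt=1e-4)
```

With viscosity this large the solution decays smoothly; a learned operator replaces this expensive time-stepping with a single forward pass from u_0 to u(x, 1).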
Fig. 11 (left) shows the comparison of the relative error in estimating this mapping among the classical Fourier Layer, classical CNNs and the proposed quantum circuits for the Fourier Layers, across different resolutions. The quantum circuits perform comparably to the classical Fourier Layer and are much better than the classical CNNs. Fig. 11 (right) shows the analogous comparison on Darcy's 2D flow equation for different resolutions. Navier-Stokes Equation. It is a 2D partial differential equation defined for x ∈ (0, 1)^2 and t ∈ (0, T], where w corresponds to the vorticity, w_0 is the initial vorticity, ν is the viscosity, u is a velocity field and f(x) is a forcing function. The aim here is to model
the fluid vorticity up to instant T (> 10) given the vorticity up to time 10. Fig. 12 shows the performance comparison on this equation between our proposed circuits and the classical methods, i.e., the convergence comparison for this family with the viscosity ν fixed to 1e-3 for all methods. Here, again, it can be observed that all the proposed circuits and the classical Fourier method perform significantly better than the CNNs. Also, from Table II, the sequential circuit performs similarly to the classical method and the others converge at a slightly higher error.

B. Image Classification
We further compare our proposed quantum algorithms on downstream image classification tasks on benchmark datasets, including MNIST, FashionMNIST [XRV17] and PneumoniaMNIST [YSW+21]. The MNIST dataset consists of grayscale images corresponding to the digits from 0 to 9 (both included), with a resolution of 28 × 28; the task is to predict the digit for a given input image. FashionMNIST is likewise a 10-way classification task, into various categories of clothing, using 28 × 28 grayscale images. PneumoniaMNIST, on the other hand, involves a binary classification of a given grayscale image as pneumonia-positive or negative.
For this comparison, we have used the 2D versions of all the architectures: our quantum FNO, the classical FNO and the CNNs. Fig. 13 shows the epochs (x-axis) vs. accuracy (y-axis) plot for this evaluation. It can be observed that our proposed algorithms outperform the classical FNO and converge in close proximity (w.r.t. accuracy) to the classical CNNs, with the Sequential QFL being almost comparable to the classical CNNs. This shows that the proposed QFLs are comparable in performance to the state of the art in the vision domain as well, along with solving PDEs, thereby broadening the scope of their applicability.

VI. CONCLUSION
We proposed a quantum algorithm to carry out the recently proposed classical Fourier Neural Operator on quantum hardware. We further proposed two more quantum algorithms, which perform a different operation than the classical one and can be deployed much more efficiently on noisy quantum hardware. Experimental results further confirm that the proposed quantum neural networks perform efficiently in both solving PDEs and image classification. The sequential network matches the best-performing classical algorithm (the CNNs for images and the classical Fourier Layer for PDEs) on both tasks.
An interesting future direction is to further develop the learning process of the composite network so that it can perform better than the sequential network while remaining more efficient to deploy. The composite network intuitively performs a kind of attention mechanism that learns how to mix the mappings performed on each of the top K modes, and understanding how this mixing works can provide new ideas for improving it.

Figure 1 :
Figure 1: Overview of the Fourier Neural Network. Each initial condition f(x, 0) is sampled N_s times and modified via a trainable matrix P to become a matrix of size N_s × N_c. Then, T Fourier Layers (green block) are applied sequentially. In this paper, we design quantum circuits to implement the Fourier Layers. Each Fourier Layer consists of a row-wise Fourier Transform (FT), followed by a column-wise multiplication with trainable matrices labelled W, and the row-wise Inverse Fourier Transform (IFT). We only apply the inner matrix multiplications to the first K columns (also called modes) and leave the others unchanged or replaced by 0s before the IFT. Finally, a reverse operation with a trainable matrix Q is performed to obtain the output f(x, t) as a discretised vector. The trainable parts are updated with gradient descent until the outputs correspond to the actual solutions of the PDE.
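For reference, the Fourier-layer core described in this caption can be sketched in a few lines of NumPy (our own simplified illustration of the FT → per-mode matrix multiply → IFT pipeline only; the full FNO also includes the lifting/projection maps P and Q and a nonlinearity):

```python
import numpy as np

def fourier_layer(A, Ws):
    """Sketch of the Fourier-layer core: row-wise FT, multiply the first
    K modes (columns in frequency space) by trainable matrices W_k,
    keep the remaining modes unchanged, then row-wise inverse FT.
    A: real (Nc, Ns) array; Ws: list of K complex (Nc, Nc) matrices."""
    A_hat = np.fft.fft(A, axis=1)          # row-wise Fourier Transform
    out_hat = A_hat.copy()
    for k, W in enumerate(Ws):             # column-wise multiply on the first K modes
        out_hat[:, k] = W @ A_hat[:, k]
    return np.fft.ifft(out_hat, axis=1)    # row-wise inverse Fourier Transform

rng = np.random.default_rng(0)
Nc, Ns, K = 4, 16, 3
A = rng.standard_normal((Nc, Ns))
Ws = [rng.standard_normal((Nc, Nc)).astype(complex) for _ in range(K)]
out = fourier_layer(A, Ws)                 # complex in general; implementations keep the real part
```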

Figure 3 :
Figure 3: Quantum circuit for loading a matrix A ∈ R^{n×d} as a quantum state on the unary basis. Each white square is a circuit to load a vector; a_i is the i-th row of A. The circuit starts with the state |0^n⟩ ⊗ |0^d⟩

Figure 4 :
Figure 4: Diagram of the Cooley-Tukey algorithm [CT65], performing the classical FFT on an input x ∈ R^n. Each white box performs an elementary radix-2 operation with the root of unity ω = e^{i2π/n}.
[a, b] into [a + ω^k b, a − ω^k b]. In matrix multiplication terms, each of these operations applies the matrix [[1, ω^k], [1, −ω^k]]. Quantumly, we want to reproduce the idea of the classical FFT diagram. This results in the circuit shown in Fig. 5. To reproduce the action of the radix-2 transforms
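The classical counterpart being mimicked is the standard recursive radix-2 FFT, where each merge applies exactly the [a, b] → [a + ω^k b, a − ω^k b] map (a sketch; note that NumPy's sign convention uses ω = e^{-i2π/n}):

```python
import numpy as np

def fft_radix2(x):
    """Recursive Cooley-Tukey FFT (the diagram of Fig. 4).
    Each merge step is the radix-2 map [a, b] -> [a + w^k b, a - w^k b]."""
    n = len(x)
    if n == 1:
        return x.astype(complex)
    even, odd = fft_radix2(x[0::2]), fft_radix2(x[1::2])
    w = np.exp(-2j * np.pi * np.arange(n // 2) / n)  # roots of unity (np.fft sign convention)
    return np.concatenate([even + w * odd, even - w * odd])

x = np.random.default_rng(0).standard_normal(8)
y = fft_radix2(x)          # matches np.fft.fft(x)
```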

Figure 5 :
Figure 5: Quantum circuit for implementing a Fourier Transform on the unary basis. Single-qubit gates are phase gates; vertical lines are RBS gates with angle −π/4. It reproduces the classical FFT butterfly circuit (Fig. 4) by replacing each radix-2 operation with a phase gate followed by an RBS gate. The input must be a vector |x⟩ encoded in the unary basis, with the right permutation. The output will be |x̂⟩.

Figure 9 :
Figure 9: Parallel QFL: parallelised version of the Sequential Quantum Circuit to minimize the depth of the learning part, thus making it more efficient when deployed on noisy hardware. For each mode (out of the top K) in the transformed input, there is a different circuit performing the parameterised matrix transform.

Figure 10 :
Figure 10: Composite-QFL: variant of the Sequential-QFL where, instead of the controlled butterfly circuits, there is a single composite butterfly circuit spanning the upper register and the top K qubits of the lower register.

Table II: Comparison of the parameters required by one layer of the proposed circuits and the existing classical Fourier Layer, along with an error analysis for different ν (e.g. ν = 1e-3) and T values for the 2D case of the Navier-Stokes equation.

Figure 13 :
Figure 13: Left: performance comparison of the CNNs, the classical Fourier layer and the proposed quantum circuits on the MNIST dataset. It can be observed that all of them perform quite similarly, classical CNNs being the best. Middle: a similar comparison on the PneumoniaMNIST [YSW+21] dataset. The performance of the CNNs is somewhat noisy here whereas it is smoother for the sequential circuit, both converging to a similar value. The composite quantum circuit and the classical Fourier baseline are also quite close to the CNNs in convergence. Right: the same comparison on the FashionMNIST [XRV17] data. Here, a significant difference in performance is observed, with the CNNs being the best, followed by the Sequential circuit.
from [CKM+22]. It uses circuits from [JDM+21] that load vectors in the unary basis. For instance, a row a_i ∈ R^d is loaded as

Figure 11: Left: performance comparison (relative error, as used in [LKA+20]) of the classical Fourier networks, CNNs and the three circuit proposals for a quantum Fourier layer on the Burgers 1D PDE across different resolutions. The quantum Fourier circuits are quite close to the performance of the classical Fourier baseline and much better than the classical CNNs in minimising the error. Right: the same comparison on Darcy's 2D PDE for different resolutions. A similar relative performance is observed, where the error of the CNNs is much larger and increases with resolution, whereas the errors of the other four (the 3 quantum circuits and the classical layer) are quite similar.