Quantum implementation of an artificial feed-forward neural network

Artificial intelligence algorithms largely build on multi-layered neural networks. Coping with their increasing complexity and memory requirements calls for a paradigmatic change in the way these powerful algorithms are run. Quantum computing promises to solve certain tasks much more efficiently than any classical computing machine, and actual quantum processors are now becoming available through cloud access, enabling experiments and testing outside of research labs as well. Here we show in practice an experimental realization of an artificial feed-forward neural network implemented on a state-of-the-art superconducting quantum processor using up to 7 active qubits. The network is made of quantum artificial neurons, which individually display a potential advantage in storage capacity with respect to their classical counterparts, and it is able to carry out an elementary classification task that would be impossible to achieve with a single node. We demonstrate that this network can be equivalently operated either via classical control or in a completely coherent fashion, thus opening the way to hybrid as well as fully quantum solutions for artificial intelligence to be run on near-term intermediate-scale quantum hardware.


I. INTRODUCTION
The field of artificial intelligence was revolutionized by moving from the simple, single layer perceptron design [1] to that of a complete feed-forward neural network (ffNN), constituted by several neurons organized in multiple successive layers [2,3]. In such artificial neural network designs each constituent neuron receives, as inputs, the outputs (activations) from the neurons in the preceding layer. The advantage of ffNNs with respect to simpler designs such as single layer perceptrons or support vector machines is that they can be used to classify data with relations that cannot be reduced to a separating hyperplane [4]. The present ubiquitous use of artificial intelligence in a wide variety of tasks, ranging from pattern or spoken language recognition to the analysis of large data sets, is mostly due to the discovery that such feed-forward networks can be trained by using well established optimization algorithms [2][3][4].
Quantum computers hold promise to achieve some form of computing advantage over classical counterparts in the not-so-far future [5]. Indeed, quantum computing has been theoretically shown to offer potentially exponential speedups over traditional computing machines, especially in tasks such as large number factoring, solving linear systems of equations, and data classification [6][7][8][9][10]. More recently, quantum computers have been applied to the field of Artificial Intelligence [11][12][13][14], and recent realizations of artificial neurons [15][16][17][18] and support vector machines [19,20] on real quantum processors, even if limited to simple systems at present, have shown a promising route towards a practical realization of such advantage.
In order to harness the full potential that quantum computing may offer to the field of artificial intelligence, it is necessary to make the passage from single-layer to deep feed-forward neural networks [21][22][23], which has so greatly expanded the capabilities of artificially intelligent systems to date. Here we propose the architecture of a quantum ffNN and we test it on a state-of-the-art 20-qubit IBMQ quantum processor. We start from a hybrid approach combining quantum nodes with classical information feed-forward, obtained via classical control of unitary transformations on qubits. This design realizes a fully general implementation of a ffNN on a quantum processor assisted by classical registers. A minimal 3-node example, specifically designed to carry out a pattern recognition task exceeding the capabilities of a single artificial neuron, is used for a proof-of-principle demonstration on real quantum hardware. We then describe and successfully implement on a 7-qubit register an equivalent fully quantum coherent configuration of the same set-up, which does not involve classical control of the feed-forward links and thus potentially opens the way to the exploration of more complex and classically inaccessible regimes.
The proposed quantum implementation of a ffNN offers interesting perspectives on scalability already in the Noisy Intermediate-Scale Quantum (NISQ) [24] regime: indeed, the single quantum nodes potentially feature an exponential advantage in memory usage, thus in principle allowing high-dimensional data structures to be manipulated with intermediate-size quantum registers. Moreover, the hybrid nature of the ffNN itself suggests a seamless integration with existing classical structures and algorithms for neural network computation and machine learning [25].

II. DESIGN OF THE HYBRID FEED-FORWARD NEURAL NETWORK
In this section, we outline the general structure of our proposed hybrid ffNN, including a synthetic description of the working principles of single nodes and a more detailed discussion of layer-to-layer connections. While, for the sake of clarity, we will often refer to a specific minimal example with three nodes and two layers, the overall scheme can be generalized to arbitrary feed-forward networks.

A. Individual nodes
A ffNN is essentially composed of a set of individual nodes {n i }, or artificial neurons, arranged in a set of successive layers {L j }. Information flows through the network in a well defined direction from the input to the output layer, travelling through neuron-neuron connections (i.e. artificial synapses). Each node performs an elementary non-linear operation on the incoming data, whose result is then passed on to one or more nodes in the successive layer.
In their simplest form, individual nodes can be designed to analyze binary-valued inputs. The artificial neurons that we consider here are based on the well known perceptron model [1]: such computational units analyze information by combining input ($\vec{i}$) and weight ($\vec{w}$) vectors, providing an activation response that depends on their scalar product $\vec{i} \cdot \vec{w}$. In our case, input and weight vectors are assumed to be binary-valued $m$-dimensional arrays [26], i.e.

$$\vec{i} = (i_0, \ldots, i_{m-1}), \quad \vec{w} = (w_0, \ldots, w_{m-1}), \quad i_j, w_j \in \{-1, 1\}. \quad (1)$$
The activity of a binary artificial neuron can be implemented on a quantum register of $N = \log_2(m)$ qubits [18] by considering the quantum states

$$|\psi_i\rangle = \frac{1}{\sqrt{m}} \sum_{j=0}^{m-1} i_j |j\rangle, \quad |\psi_w\rangle = \frac{1}{\sqrt{m}} \sum_{j=0}^{m-1} w_j |j\rangle. \quad (2)$$

These encode the corresponding input and weight vectors by effectively exploiting the exponential size of the Hilbert space associated to the quantum register in use. The states of the form presented in Eq. (2) are real equally-weighted (REW) superpositions of all the computational basis states $|j\rangle \in \{|0\ldots00\rangle, |0\ldots01\rangle, \ldots, |1\ldots11\rangle\}$. The quantum procedure carrying out the perceptron-like computation for single artificial neurons can be summarized in three steps [18]. First, assuming that the $N$-qubit quantum register is initially in the idle configuration $|0\rangle^{\otimes N}$, we prepare the quantum state encoding the input vector with a unitary operation $U_i$ such that $|\psi_i\rangle = U_i |0\rangle^{\otimes N}$. We then apply the weight factors of vector $\vec{w}$ on the input state by implementing another unitary transformation, $U_w$, subject to the constraint

$$U_w |\psi_w\rangle = |1\rangle^{\otimes N}. \quad (3)$$

An optimized yet exact implementation of $U_i$ and $U_w$ exploits the close relationship between REW quantum states and the class of hypergraph states [18,27], achieving in the worst case an overall computational complexity which is linear in the size of the classical input, i.e. $O(m)$. After the two unitaries have been performed, it is easily seen that the state of the quantum register is

$$|\phi_{i,w}\rangle = U_w U_i |0\rangle^{\otimes N} = \sum_{j=0}^{m-1} c_j |j\rangle, \quad (4)$$

where $c_{m-1} = \langle\psi_w|\psi_i\rangle = (1/m)\, \vec{i} \cdot \vec{w}$. Finally, the non-linear activation of the single artificial neuron can be implemented by performing a multi-controlled NOT gate [6] between the encoding register and an ancilla initialized in the state $|0\rangle_a$, followed by a final measurement of the ancilla in the computational basis. Hence, the output of the quantum artificial neuron is found in the active state $|1\rangle_a$ with probability $p(1) = |c_{m-1}|^2$.
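As a quick sanity check of the activation rule above, the ideal probability $p(1) = |c_{m-1}|^2 = ((\vec{i}\cdot\vec{w})/m)^2$ can be evaluated classically. The following minimal Python sketch (the function name is ours, not from the paper's code) mirrors the math, not the hardware run:

```python
# Ideal activation probability of a single quantum neuron,
# p(1) = |c_{m-1}|^2 = ((i . w) / m)^2, for binary +/-1 vectors.

def neuron_activation(i_vec, w_vec):
    """Return the ideal firing probability p(1) of the quantum neuron."""
    m = len(i_vec)
    assert m == len(w_vec) and m > 0
    dot = sum(i * w for i, w in zip(i_vec, w_vec))  # scalar product i . w
    return (dot / m) ** 2  # |<psi_w|psi_i>|^2

# Parallel vectors activate with certainty, orthogonal ones never do.
print(neuron_activation([1, 1, -1, -1], [1, 1, -1, -1]))  # 1.0
print(neuron_activation([1, -1, 1, -1], [1, 1, -1, -1]))  # 0.0
```

Note that, as in the text, the non-linearity only enters through the measurement of the ancilla; the quantity computed here is the probability of that measurement yielding the active state.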

B. Information feed-forward
When several copies of the quantum register implementing the artificial neuron model outlined above work in parallel, the respective ancillae, and the results of the measurements performed on them, can be used to feed forward the information about the input-weight processing to a successive layer. Indeed, let us suppose that a layer $L_j$ contains $j$ independent nodes, $\{n_{kj}\}_{k=1}^{j}$, each of them characterized by a weight vector $\vec{w}_{kj}$: in one cycle of operation, every node is provided with a classical input $\vec{i}_{kj}$ (either coming from layer $L_{j-1}$ or directly from the original data set to be analyzed) and, upon measurement, it outputs an activation state $a_{kj} \in \{0, 1\}$, chosen according to a probability $p_{kj}(a_{kj} = 1) \propto |\vec{i}_{kj} \cdot \vec{w}_{kj}|^2$. Assuming for simplicity that the $h$-th neuron $n_{h(j+1)}$ belonging to the $L_{j+1}$ layer collects the outputs of all $\{n_{kj}\}$ nodes, the corresponding binary classical input can be constructed as

$$\vec{i}_{h(j+1)} = \left((-1)^{a_{1j}}, \ldots, (-1)^{a_{jj}}\right). \quad (5)$$

Figure 1. Abstract architecture of a hybrid ffNN. Each layer $L_j$ contains an arbitrary number of nodes $\{n_{kj}\}$, which can individually be implemented on quantum hardware. Upon measurement, information about the activation state of a layer is passed to the following one ($L_{j+1}$) in the form of classical bits controlling quantum operations. Full connectivity between nodes in successive layers is schematically shown, although sparser networks are also possible in principle. The dashed line represents classical inputs from a generic preceding stage, which can be, e.g., a collection of layers up to $L_{j-1}$ or the original input information.
Such new input vector can then be used to parametrize the appropriate $U_i$ transformation for the $n_{h(j+1)}$ node. The overall computation can then be constructed by iteratively alternating the unitary quantum computation carried out by single layers with non-linear measurement and feed-forward stages. Notice that the design is totally general in terms of the number of nodes in each layer, the number of connections and the size of the various inputs to individual nodes. Moreover, as the information is formally transferred in the form of classical bits, the same input can easily be manipulated, e.g., by making classical copies to be fed to independent nodes sharing similar connections to the previous layer. An abstract representation of the proposed architecture is shown in Fig. 1. From the technical point of view, a very natural implementation of the hybrid ffNN architecture onto a quantum processor makes use of classically controlled quantum gates. Independent quantum nodes within the same layer can either be implemented in different quantum registers, and thus computed simultaneously, or run on the same set of qubits, after proper re-initialization and by storing all the observed activation states in different positions of a classical memory register.

Figure 2. (a) The minimal example of a feed-forward neural network that we analyze in this study: it accepts four classical binary inputs and features one hidden layer containing two artificial neurons plus one output layer made of a single neuron. Next to each neuron, the ideal shape of the weight vectors achieving the desired recognition of horizontal and vertical lines is shown. The corresponding encoding scheme in terms of black and white pixels is also reported for a generic input/weight binary vector $\vec{b} = (b_0, \ldots, b_m)$. (b) Ideal results for the classification of 2×2 pixel images. Notice that the target patterns, corresponding to integer labels 12 and 3 (horizontal), 10 and 5 (vertical), all have $p_{\rm out} = 1$, while all others have $p_{\rm out} < 0.5$ (threshold shown in red).
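One cycle of the hybrid feed-forward can be sketched in a few lines of plain Python (all names are illustrative, and the mapping of a measured bit $b$ to the entry $(-1)^b$ is one consistent choice of sign convention, not necessarily the paper's):

```python
import random

# One hybrid feed-forward cycle (illustrative names): every hidden node
# fires with its quantum activation probability p_k = ((i . w_k)/m)^2;
# the measured bits b_k are then mapped to +/-1 entries of the input
# vector for the next layer via b -> (-1)^b.

def activation_prob(i_vec, w_vec):
    m = len(i_vec)
    return (sum(a * b for a, b in zip(i_vec, w_vec)) / m) ** 2

def feed_forward_cycle(i_vec, weight_vectors, rng=random):
    # Simulated measurement of each hidden node: a_k = 1 w.p. p_k.
    bits = [1 if rng.random() < activation_prob(i_vec, w) else 0
            for w in weight_vectors]
    # Classical feed-forward: bit b_k -> entry (-1)^{b_k}.
    next_input = [(-1) ** b for b in bits]
    return bits, next_input

# Deterministic corner cases: p = 1 and p = 0.
print(feed_forward_cycle([1, 1, -1, -1],
                         [[1, 1, -1, -1], [1, -1, 1, -1]]))  # ([1, 0], [-1, 1])
```

On real hardware the sampled bit would of course come from measuring the ancilla, not from a pseudo-random number; the classical register holding `bits` plays exactly the role of the classical memory described above.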

C. Example: pattern recognition
The working principles of our proposed hybrid ffNN, including the above technical details, are best clarified by describing an explicit example tailored to solve a well defined elementary classification problem. This will also set the stage for the experimental proof-of-principle demonstration on actual superconducting quantum hardware presented in the next section. First, let us recall that binary input and weight vectors can be visually interpreted as images containing black or white square pixels [18]: a natural encoding scheme associates, e.g., a white spot to an $i_j$ ($w_j$) $= -1$ entry in the corresponding input (weight) vector, as shown explicitly in Fig. 2a for the hidden ($m = 4$, i.e. 2×2 pixel images) and output ($m = 2$, i.e. 2×1 pixel images) layers of a minimal ffNN. Moreover, we can identify any such binary pattern with a unique integer label by considering the equivalent decimal representation of the binary number obtained by reading the pixels as bits (black = 1, white = 0): for instance, a black top row in a 2×2 image corresponds to the string 1100, i.e. the label 12.

The task that we set out to solve with our example ffNN is the following: the network should be able to recognize (i.e., give a positive output activation with sufficiently large probability) whether there exist straight lines in 2×2 pixel images, regardless of whether the lines are horizontal or vertical. All the other possible input images should be classified as negative. Notice that, as the data vectors encoding horizontal and vertical lines are orthogonal to each other, there is no single hyperplane separating the four positive states from all other possible input images: therefore, the desired classification cannot be carried out by a single node accepting 4-bit inputs. In this respect, quantum artificial neurons behave differently from their usual classical counterparts, which cannot correctly classify sets containing opposite vectors [4]. More explicitly, given an input vector $\vec{v}_1$ and a weight vector $\vec{w}$, a single quantum neuron would output a value proportional to $|\vec{v}_1 \cdot \vec{w}|^2$, i.e. $\cos^2\theta$, where $\theta$ is the angle formed by the two vectors. If we take a second input vector $\vec{v}_2 \perp \vec{v}_1$, the corresponding output would be upper bounded by $\sin^2\theta$. As the set of patterns that should yield a positive result includes vectors that are orthogonal (those representing horizontal lines are orthogonal to those representing vertical lines) and vectors that are opposite (for instance, the vector corresponding to a vertical line on the left column of a 2×2 pixel image is opposite to the vector corresponding to a vertical line on the right column), it is therefore impossible to find a weight $\vec{w}$ capable of yielding an output activation larger than 0.5 for all targets in the configuration space. We hereby show that a simple three-node network can accomplish the desired computation. A scheme of such an elementary ffNN is shown in Fig. 2a, where the circles indicate individual artificial neurons, and the vectors $\vec{w}_i$ refer to their respective weights. The network features a single hidden layer and a single binary output neuron. On a conceptual level, the functioning of the network can be interpreted as follows: with the a priori choice of weights represented in Fig. 2a, the top quantum neuron of the hidden layer outputs a high activation if the input vector contains vertical lines, while the bottom neuron does the same for the case of horizontal lines. The output neuron in the last layer then recognizes whether one of the neurons in the hidden layer has given a positive outcome.
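The impossibility argument above can also be verified exhaustively. The following short script (our own check, not part of the original experiment) enumerates all 16 binary weight vectors and shows that no single 4-pixel quantum neuron reaches an activation above 0.5 on all four line patterns simultaneously:

```python
from itertools import product

# Brute-force check: for every weight vector w in {-1, 1}^4, compute the
# worst-case activation p = (line . w / 4)^2 over the four line patterns.
# The best achievable worst case stays well below the 0.5 threshold.

LINES = [(1, 1, -1, -1), (-1, -1, 1, 1),   # horizontal lines (labels 12, 3)
         (1, -1, 1, -1), (-1, 1, -1, 1)]   # vertical lines   (labels 10, 5)

def p_act(i_vec, w_vec):
    return (sum(a * b for a, b in zip(i_vec, w_vec)) / len(i_vec)) ** 2

best = max(min(p_act(line, w) for line in LINES)
           for w in product((-1, 1), repeat=4))
print(best)  # 0.25 -- no single weight choice exceeds 0.5 on all targets
```

This makes the need for a hidden layer concrete: the best single neuron guarantees at most an activation of 0.25 on its worst target pattern.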
A possible quantum circuit description of the ffNN introduced above, including the classical feed-forward stage between the hidden and the output layer, is provided in Fig. 3a. We assume that each neuron within the hidden layer accepts 4-bit inputs, such that each quantum neuron can be represented on a 2-qubit encoding register plus an ancilla qubit (i.e., $m = 4$ and $N = 2$ in this case). At the same time, the output neuron takes 2-dimensional inputs coming from the previous layer and provides the global activation state of the network, thus requiring a single qubit ($m = 2$, $N = 1$) for its encoding. Classical bits are also included to store the intermediate and final results.
Let us call $n_1$ and $n_2$ the two hidden nodes, which accept the same classical input but process it in two different ways. As described at the beginning of this section, each artificial neuron will independently provide, upon measurement, an activation state $a_k \in \{0, 1\}$ (for $k = 1, 2$), which can be stored in a classical bit $b_k$. We denote $p_k$ the probability of actually observing the value $a_k = 1$ from the $k$-th neuron. When such measurement is performed, we set $b_k = a_k$: as a result, the state of the classical 2-bit register after the quantum computation in the hidden layer has been completed is one of the following

$$[b_1, b_2] \in \{[0,0],\, [0,1],\, [1,0],\, [1,1]\}, \quad (6)$$

with probability

$$p_{00} = (1-p_1)(1-p_2), \quad p_{01} = (1-p_1)p_2, \quad p_{10} = p_1(1-p_2), \quad p_{11} = p_1 p_2, \quad (7)$$

respectively. It is easy to see that feed-forwarding the information contained in the classical register to the output neuron $n_3$ corresponds to providing it with one of the classical binary inputs $\vec{i}_{b_1 b_2}$ reading

$$\vec{i}_{b_1 b_2} = \left((-1)^{b_1}, (-1)^{b_2}\right). \quad (8)$$

As shown in Fig. 3a, a straightforward strategy for preparing the corresponding $|\psi_i\rangle$ state on the single-qubit register representing $n_3$ is to first bring it from the idle state $|0\rangle$ to the superposition $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ via a Hadamard (H) gate, and then to condition the application of two Z gates (each of them adding a $-1$ phase to the $|1\rangle$ component, if applied) on the two classical bits $[b_1, b_2]$. The resulting quantum state will then be, up to an irrelevant global phase,

$$|\psi_{i_{b_1 b_2}}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + (-1)^{b_1 \oplus b_2}|1\rangle\right), \quad (9)$$

where $a \oplus b$ here denotes the usual bit sum modulo 2.
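The classically controlled preparation just described is simple enough to check by direct matrix arithmetic. The sketch below (pure Python, helper names ours) applies H, the two conditional Z gates and the final $U_{w_3} = H$, confirming that the qubit ends up in $|b_1 \oplus b_2\rangle$ for all four values of the classical register:

```python
import math

# Single-qubit check of the classically controlled input preparation:
# |0> --H--> |+> --Z^{b1}--Z^{b2}--H--> |b1 XOR b2>.

def apply(gate, state):
    """Apply a 2x2 gate (list of rows) to a 2-amplitude state vector."""
    return [gate[0][0]*state[0] + gate[0][1]*state[1],
            gate[1][0]*state[0] + gate[1][1]*state[1]]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
Z = [[1, 0], [0, -1]]

for b1 in (0, 1):
    for b2 in (0, 1):
        state = apply(H, [1.0, 0.0])       # |0> -> |+>
        if b1: state = apply(Z, state)     # classically controlled Z
        if b2: state = apply(Z, state)     # classically controlled Z
        state = apply(H, state)            # U_w3 = H for w3 = (1, -1)
        p1 = abs(state[1]) ** 2            # probability of measuring |1>
        assert round(p1) == b1 ^ b2
print("output is |b1 xor b2> in all four cases")
```

The double Z for $b_1 = b_2 = 1$ cancels, so the qubit returns to $|0\rangle$ after the final Hadamard, exactly as the $b_1 \oplus b_2$ rule predicts.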
If we now choose, as shown in Fig. 2a, a weight vector $\vec{w}_3 = (1, -1)$, we obtain $U_{w_3} \equiv H$. Therefore, the final state of the third neuron reads

$$U_{w_3} |\psi_{i_{b_1 b_2}}\rangle = |b_1 \oplus b_2\rangle. \quad (10)$$

The overall probability of observing an active state on the output neuron can be written, in general, as

$$p_{\rm out} \equiv p(a_3 = 1) = \sum_{[b_1, b_2]} p(a_3 = 1 \,|\, [b_1, b_2])\, p_{b_1 b_2}, \quad (11)$$

where we employed the usual notation for conditional probabilities and

$$p(a_3 = 1 \,|\, [b_1, b_2]) = b_1 \oplus b_2. \quad (12)$$

In our specific case, it is easy to see that, given Eq. (7) and Eq. (10), this reduces to

$$p_{\rm out} = p_{01} + p_{10} = (1-p_1)p_2 + p_1(1-p_2). \quad (13)$$

Since in this elementary example $n_3$ is encoded in a single qubit, the final measurement can be performed directly without the need for an additional ancilla. In Fig. 2b we report the exact result of the convolution of Eq. (13): as can be seen, the ffNN ideally outputs an active state with $p_{\rm out} = 1$ for the target horizontal and vertical patterns, while $p_{\rm out} < 0.5$ in all other cases. Before moving forward, it is worth mentioning that the construction of a classically conditioned $U_i$ can always be found also in more general cases, e.g. when the hidden layer contains more than two neurons. In particular, any node encoded on $N$ qubits will be able to accept inputs from $m = 2^N$ nodes in the previous layer: indeed, each output configuration of the latter will be one of the $2^m$ possible bit strings $[b_1, \ldots, b_m]$, which can be used to uniquely identify one of the $2^m = 2^{2^N}$ possible input states, and thus to classically program its preparation.
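Putting the pieces together, the ideal behaviour of the full hybrid network over all 16 possible 2×2 images can be reproduced with exact probabilities. This is a pure-Python sketch: the black = 1 / white = 0 label convention follows the text, while the helper names and the explicit weight tuples are ours:

```python
# End-to-end check of the 3-node hybrid ffNN with exact probabilities
# (no sampled shots). W1 singles out horizontal lines, W2 vertical ones.

def p_act(i_vec, w_vec):
    return (sum(a * b for a, b in zip(i_vec, w_vec)) / len(i_vec)) ** 2

W1 = (1, 1, -1, -1)   # hidden node sensitive to horizontal lines
W2 = (1, -1, 1, -1)   # hidden node sensitive to vertical lines

def p_out(label):
    bits = [(label >> k) & 1 for k in (3, 2, 1, 0)]  # pixels b0 b1 b2 b3
    i_vec = [2 * b - 1 for b in bits]                # bit 0 -> -1, bit 1 -> +1
    p1, p2 = p_act(i_vec, W1), p_act(i_vec, W2)
    # Eq. (13): p_out = p_01 + p_10 = (1 - p1) p2 + p1 (1 - p2)
    return (1 - p1) * p2 + p1 * (1 - p2)

targets = {12, 3, 10, 5}   # horizontal and vertical line patterns
for label in range(16):
    p = p_out(label)
    assert (p == 1.0) if label in targets else (p < 0.5)
print("all 16 patterns classified correctly")
```

The assertion reproduces the ideal curve of Fig. 2b: $p_{\rm out} = 1$ exactly for the four line patterns, and at most 0.375 for every other input.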

III. QUANTUM COHERENT FEED-FORWARD
The hybrid feed-forward architecture described so far, and realized in a minimal 3-node, 2-layer example, can also be reformulated in a fully quantum coherent way. As we will show below, and in contrast to the hybrid quantum-classical solution, this version always requires all nodes to be implemented simultaneously on a dedicated quantum register, thus making the quantum computation more demanding. At the same time, however, it avoids the need to store and process classical bits during intermediate stages. Moreover, fully coherent quantum neural networks offer additional opportunities for use on quantum processors, as will be discussed in the concluding section.
In Fig. 3b we show a fully quantum construction for the ffNN of Fig. 3a. The fundamental reason for the equivalence lies in the well known principle of deferred measurement [6], stating that in a quantum circuit one can always move a measurement performed at an intermediate stage to the end of the computation, provided all classically controlled operations are replaced with their quantum controlled counterparts. Indeed, assuming that the nodes $n_1$ and $n_2$ are encoded in parallel, after the operations of the first layer (except the measurement on the ancillae) have been performed we can write the global state of the total (3+3+1)-qubit network as

$$|\Psi\rangle = \bigotimes_{x=1,2} \left( c_{n_x} |1\ldots1\rangle_{n_x} |1\rangle_{a_x} + r_{n_x} |\varphi_{n_x}\rangle |0\rangle_{a_x} \right) \otimes |0\rangle_{n_3}, \quad (15)$$

where $r_{n_x} = (1 - c_{m-1,n_x}^2)^{1/2}$ and $|\varphi_{n_x}\rangle$ contains, for each neuron, all the components other than the one leading to activation, see Eq. (4). Notice that, by construction, $\langle\varphi_{n_x}|1\ldots1\rangle = 0$. In the meantime, the $n_3$ qubit is brought into the superposition $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ by applying a single-qubit Hadamard gate, H. Synapses can thereafter be implemented with two CZ gates, as represented in Fig. 3b. The overall state of the quantum ffNN then becomes

$$|\Psi'\rangle = \left( |A_{n_1}\rangle |A_{n_2}\rangle + |R_{n_1}\rangle |R_{n_2}\rangle \right) |+\rangle_{n_3} + \left( |A_{n_1}\rangle |R_{n_2}\rangle + |R_{n_1}\rangle |A_{n_2}\rangle \right) |-\rangle_{n_3}, \quad (16)$$

where $c_{n_x}$ is a short-hand notation for $c_{m-1,n_x}$, and the activated $|A\rangle$ and rest $|R\rangle$ states of $n_1$ and $n_2$ are explicitly given as

$$|A_{n_x}\rangle = c_{n_x} |1\ldots1\rangle_{n_x} |1\rangle_{a_x}, \quad |R_{n_x}\rangle = r_{n_x} |\varphi_{n_x}\rangle |0\rangle_{a_x}. \quad (17)$$

By applying $U_{w_3} \equiv H$ on $n_3$ we obtain an output state

$$|\Psi''\rangle = \left( |A_{n_1}\rangle |A_{n_2}\rangle + |R_{n_1}\rangle |R_{n_2}\rangle \right) |0\rangle_{n_3} + \left( |A_{n_1}\rangle |R_{n_2}\rangle + |R_{n_1}\rangle |A_{n_2}\rangle \right) |1\rangle_{n_3}. \quad (18)$$

It is straightforward to observe at this point that the neurons of the hidden layer can in principle be measured in an activation state, with probabilities exactly corresponding to the ones reported in Eq. (7). However, as long as we are interested only in the output state of the network, i.e. the activation state $a_3$ of $n_3$, there is no need to actually perform the final measurements on $n_1$ and $n_2$: similarly to Eq. (11), we can in fact simply discard the information contained in the variables pertaining to the hidden layer by performing a partial trace operation. This returns a density matrix for the output neuron which automatically represents the convolution of the hidden nodes, see Eq. (13).
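A compact numerical check of this equivalence is possible (our own simplification, not the 7-qubit experiment): since $|A_{n_x}\rangle$ and $|R_{n_x}\rangle$ are orthogonal in any case, each hidden block can be condensed into its ancilla qubit prepared in $r_{n_x}|0\rangle + c_{n_x}|1\rangle$ without changing the reduced state of $n_3$. The resulting 3-qubit statevector simulation reproduces the convolution of Eq. (13):

```python
import math
from itertools import product

# Reduced statevector check of the coherent construction: amplitudes are
# indexed by the bit tuple (a1, a2, n3); each ancilla starts in
# r|0> + c|1> with c^2 = p_x, and n3 in |+> after its Hadamard.

def coherent_p_out(p1, p2):
    s = 1 / math.sqrt(2)
    amp = lambda b, p: math.sqrt(p) if b else math.sqrt(1 - p)
    psi = {k: amp(k[0], p1) * amp(k[1], p2) * s
           for k in product((0, 1), repeat=3)}
    # Two CZ "synapses": phase -1 whenever an ancilla and n3 are both 1.
    for k in psi:
        psi[k] *= (-1) ** (k[0] * k[2] + k[1] * k[2])
    # U_w3 = H on n3, then trace out the hidden register by summing
    # |amplitude|^2 over the ancilla configurations.
    p_out = 0.0
    for a1, a2 in product((0, 1), repeat=2):
        out1 = s * (psi[(a1, a2, 0)] - psi[(a1, a2, 1)])  # <1| component
        p_out += abs(out1) ** 2
    return p_out

for p1, p2 in [(0.0, 0.0), (1.0, 0.0), (0.3, 0.8), (0.5, 0.5)]:
    hybrid = (1 - p1) * p2 + p1 * (1 - p2)  # Eq. (13)
    assert abs(coherent_p_out(p1, p2) - hybrid) < 1e-12
print("coherent and hybrid p_out agree")
```

The partial trace appears here as the incoherent sum over the hidden-register configurations $(a_1, a_2)$: the cross terms between activated and rest branches vanish by orthogonality, which is exactly why the coherent circuit reproduces the classical convolution.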
It is worth noticing that the role of the partial trace operation has recently been recognized and extensively discussed in the literature as a possible ingredient for a more general theory of quantum neural networks [28,29].
To conclude this section, we also point out explicitly that the conversion between the two modes of operation (hybrid vs coherent) of our proposed ffNN architecture goes beyond the specific example presented in this work. Indeed, as mentioned at the end of Sec. II C, any feed-forward link between successive layers can in general be decomposed in terms of classically controlled operations.

Figure 4. Results for the single nodes of the circuit in Fig. 3a implemented on the IBMQ Poughkeepsie superconducting processor and compared with ideal noiseless outcomes computed numerically with the Qiskit qasm simulator. (a) Neuron $n_1$, recognizing horizontal inputs. (b) Neuron $n_2$, recognizing vertical inputs. (c) Neuron $n_3$, recognizing 2-dimensional inputs with dissimilar entries. Error mitigation is applied to the data for $n_1$ and $n_2$.
Whenever such construction is known, measurement deferral and partial traces can in principle always be employed to obtain the equivalent coherent network, namely by replacing all classical controls with their quantum counterparts and by measuring only the output layer.

IV. EXPERIMENTAL REALIZATION ON A SUPERCONDUCTING NISQ PROCESSOR
We have implemented the ffNN introduced in Fig. 2 on a real superconducting NISQ processor, made available on the cloud via the IBM Quantum Experience and programmed using the Qiskit python library [30]. Employing the same device, named IBMQ Poughkeepsie, we realized both the hybrid (Fig. 3a) and the fully coherent (Fig. 3b) configurations, in both cases successfully completing all the desired classification tasks.
In Fig. 4a-b we show the results for the 3-qubit circuits implementing nodes $n_1$ and $n_2$, respectively corresponding to the first and second set of three qubits in Fig. 3a, from which the probabilities $p_1$ and $p_2$ can be estimated for all possible input vectors, assuming the weights $\vec{w}_1$ and $\vec{w}_2$ shown in Fig. 2a. The comparison with ideal results simulated numerically shows an excellent qualitative agreement and a good quantitative match of the outcomes: in particular, notice that each individual node can successfully single out either vertical or horizontal lines, see the patterns in Fig. 2b [18]. The agreement is naturally better for the simulation of all possible $n_3$ circuits, whose results are reported in Fig. 4c: indeed, in this case the probability $p(a_3 = 1 \,|\, [b_1, b_2])$ can be computed by operating on a single qubit. The final outcomes (i.e. $p_{\rm out}$) for the hybrid configuration of the ffNN, reported in Fig. 5a, are then obtained by applying Eq. (11). The latter is used in place of, e.g., Eq. (13) in order to avoid introducing unnecessary assumptions or biases in the calculation, and to take into account all possible sources of inaccuracy such as, for example, an outcome for $p(a_3 = 1 \,|\, [0, 0])$ that is not exactly zero.
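The post-processing based on Eq. (11) amounts to a weighted sum of the measured conditional activations over the hidden-layer outcomes. A minimal sketch (variable names ours, the numerical estimates invented purely for illustration):

```python
# Combine measured conditional activations of n3 with the estimated
# hidden-layer probabilities via Eq. (11), without assuming the ideal
# values p(a3=1|[0,0]) = 0 or p(a3=1|[0,1]) = 1.

def p_out_from_estimates(p1, p2, p_cond):
    """p_cond maps (b1, b2) -> measured estimate of p(a3=1 | [b1, b2])."""
    p_reg = {(0, 0): (1 - p1) * (1 - p2), (0, 1): (1 - p1) * p2,
             (1, 0): p1 * (1 - p2),       (1, 1): p1 * p2}   # Eq. (7)
    return sum(p_cond[b] * p_reg[b] for b in p_reg)          # Eq. (11)

# Example with slightly noisy (made-up) conditional estimates:
noisy = {(0, 0): 0.03, (0, 1): 0.96, (1, 0): 0.95, (1, 1): 0.04}
print(p_out_from_estimates(0.9, 0.1, noisy))
```

With the ideal conditional values 0 and 1 the sum collapses back to Eq. (13); using the measured values instead propagates the single-qubit inaccuracies honestly into $p_{\rm out}$.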
Finally, the experimental results for the fully coherent ffNN configuration are reported in Fig. 5b. These were obtained by running the 7-qubit quantum circuit introduced in Fig. 3b. As can immediately be appreciated, the outcomes are in good agreement with the corresponding ones in the hybrid version of the ffNN. We stress that such a comparison is made non-trivial from the experimental point of view by the fact that, in the fully coherent version, a register of 7 simultaneously active and typically entangled qubits is required. By contrast, the hybrid solution only requires each individual node to be separately implemented on a 3-qubit quantum register and, provided that the classical outcomes are conveniently stored, such quantum computations can be carried out in dedicated runs, thus avoiding, e.g., crosstalk effects. As in the hybrid case, and despite some residual quantitative inaccuracy in the estimation of the activation probabilities, all the possible inputs are classified correctly by the ffNN, with the target horizontal and vertical patterns singled out from all other patterns.
We also mention that raw data from the quantum processor already allow for an accurate classification in both hybrid and coherent configurations. However, the overall quality of the outcomes greatly benefits from the application of simple error mitigation techniques [31][32][33][34].

V. DISCUSSION
In this work we have presented an original architecture to build feed-forward neural networks on universal quantum computing hardware and demonstrated their use on NISQ devices. In particular, we have shown how successive layers constituted by artificial neurons and implemented on independent quantum registers can be connected to each other either via classical control operations, thus realizing a hybrid quantum-classical ffNN, or via fully coherent quantum synapses. The necessary degree of non-linearity is achieved in one case via explicit quantum measurement, in the other by a partial trace operation that effectively produces a convolution operation. We stress that our proposed procedure is hardware-independent and can therefore, in principle, be implemented on any quantum computing machine, e.g. one based on superconducting qubits [35], trapped-ion quantum processors [36], or photonic components [37,38].
In the present work, we have successfully tested a 3-node implementation of our algorithm applied to an elementary pattern classification task, both in the hybrid and fully coherent configurations. Such proof-of-principle demonstration was achieved on the IBMQ Poughkeepsie superconducting quantum processor by using up to 7 active qubits, finding substantial experimental agreement between the two proposed operating modes of the network. These results represent, to the best of our knowledge, one of the largest quantum neural network computations reported to date in terms of the total size of the quantum register. We also notice that the use of quantum artificial neurons as individual nodes gives the prospective advantage of an exponential gain in storage and processing ability: in turn, this suggests that hybrid quantum-classical neural networks could already be able to treat very large input vectors, beyond the capabilities of current systems. Such an ability is becoming increasingly needed to handle, e.g., very large image files, sanitary data for public health, market data for financial applications, and the "data deluge" expected from the Internet of Things. Moreover, the hybrid structure of our proposed ffNN could represent a relevant technical feature in the process of integrating quantum and classical processes for machine learning tasks: indeed, one could easily imagine that a few carefully distributed quantum nodes at the input of an otherwise classical network might act as a memory-efficient convolutional layer enabling the treatment of otherwise unmanageable sets of data.
A very natural extension of this work, and particularly of the fully coherent setup, would be an exploration of classically inaccessible regimes with no hybrid (i.e. classically controlled) counterpart. This could be achieved, e.g., by allowing more complex synapse operations, thus letting activation probabilities for all neurons feeding the same successive layer to interfere in a truly quantum coherent way, or by engineering non-trivial quantum correlations between quantum nodes already within the same layer. In addition to the large advantage in data treatment capacity, this could then also result in new functionalities, such as the ability to deploy complicated convolution filters impossible to be run on classical hardware.
Even further reaching consequences might be expected from the possibility to directly process quantum data instead of quantum-encoded classical information, for instance to search for patterns in the output of a quantum simulator or process quantum states coming from a quantum internet appliance. In these cases, the input would directly be given in the form of a wavefunction or a density matrix [29], without the resource cost associated to a classical input [39][40][41].
A last remark concerns the training of the quantum network. The practical example shown in this work used weights selected by the programmers, instead of weights discovered through an optimization process (training). The non-linearity coming from the measurement on the ancilla of each artificial neuron is sufficient, in principle, to guarantee the required plasticity for training [18]. This means that the architecture for quantum artificial neural networks proposed in this work is fully compatible with classical training algorithms, such as the backpropagation method or the Newton-Raphson method [4]. However, one possible drawback of such methods is that they would incur exponentially large training costs when dealing with the very large vector spaces that could be associated to quantum neural networks. A possible alternative would be to use hybrid quantum-classical methods, such as Variational Quantum Eigensolvers (VQE) [42], to find the optimal weights. In particular, some VQE protocols have been shown to be implementable with an efficient (i.e. polynomial) use of classical resources [43][44][45], and some strategies have also been put forward to deal with the well known issue of barren plateaus in quantum neural networks [46,47]. A thorough study of the training is, however, beyond the scope of the present work.
In conclusion, we provide a clear-cut recipe to map classical feed-forward neural networks onto quantum processors, and our results suggest that the whole design may eventually benefit from paradigmatic quantum properties such as superposition and entanglement. This represents a necessary step towards the final goal of approaching quantum advantage in the operation and training of quantum neural network applications.