Programming multi-level quantum gates in disordered computing reservoirs via machine learning

Novel computational tools in machine learning open new perspectives in quantum information systems. Here we adopt the open-source programming library TensorFlow to design multi-level quantum gates, including a computing reservoir represented by a random unitary matrix. In optics, the reservoir is a disordered medium or a multimodal fiber. We show that by using trainable operators at the input and at the readout, it is possible to realize multi-level gates. We study single-qudit gates, including the scaling properties of the algorithms with the size of the reservoir.

The development of multi-level quantum information processing systems has steadily grown over the past few years, with experimental realizations of multi-level, or qudit, logic gates for several widely used photonic degrees of freedom, such as orbital angular momentum and path encoding [1][2][3][4]. However, efforts are still needed for increasing the complexity of such systems while still being practical, with the ultimate goal of realizing complex large-scale computing devices that operate in a technologically efficient manner.
A key challenge is the development of design techniques that are scalable and versatile. Recent work outlined the relevance of a large class of devices, commonly denoted as "complex" or "multimode" [5,6]. In these systems, many modes, or channels, are mixed and controlled at input and readout to realize a target input-output operation. This follows the first experimental demonstrations of assisted light transmission through random media [7][8][9][10], which demonstrated many applications including arbitrary linear gates [5], mode conversion, and sorting [11,12].
The use of complex mode-mixing devices is surprisingly connected to leading paradigms in modern machine learning (ML), such as "reservoir computing" (RC) [13] and the "extreme learning machine" (ELM) [13,14]. In standard ML, one trains the parameters (weights) of an artificial neural network (ANN) to fit a given function linking inputs and outputs. In RC, due to the increasing computational effort of training a large number of weights, one internal part of the network is left untrained ("the reservoir") and the weights are optimized only at input and readout.
ML concepts such as photonic neuromorphic and reservoir computing [15,16] are finding many applications in telecommunications [17,18], multiple scattering [19], image classification [20], biophotonics [10], integrated optics [21], and topological photonics [22]. Various authors have reported the use of ML for augmenting and assisting quantum experiments [23][24][25]. Here we adopt RC-ML to design complex multi-level gates [2,3,26,27], which form a building block for high-dimensional quantum information processing systems. While low-dimensional examples of such gates have been implemented using bulk and integrated optics, efficiently scaling them up to high dimensions remains a challenge. In quantum key distribution (QKD), one uses at least two orthogonal bases to encode information. High-dimensional QKD may be realized by using the photonic spatial degree of freedom as the encoding (computational) basis, and suitable unitaries to switch between bases mutually unbiased with respect to the computational basis. However, the security of the QKD protocol may be compromised by the fidelity of such basis transformations, leading to errors in the key rate. An additional consideration is the experimental complexity of such transformations, which can scale rather poorly using established techniques based on bulk optical systems. By using a random medium and input/output readout operators, one can realize such high-dimensional operations in a controllable and scalable manner, relying only on the existing complexity of the disordered medium and a control operation at the input. Here, we explore methodologies to train a disordered medium to function as a multi-level logic gate by using different implementations of ML concepts.
Figure 1 shows the schematic of a device including the complex medium, represented by the unitary operator Û, and two trainable operators, Ŝin at the input and Ŝout at the readout. |h^(1)⟩ and |h^(2)⟩ are hidden states. The use of an optical gate in this manner is also related to the use of a disordered medium as a physically unclonable function (PUF) [28][29][30].
In our general framework, we have a random system modeled by a unitary random matrix. We want to use the random medium to perform a computation in a Hilbert space containing many qudits. The random medium is not necessarily a disordered system (for example, a dielectric assembly of scattering particles), but may also be a multimode fiber, or an array of waveguides. The input/output relation is represented by a linear unitary matrix operator U_M and only forward modes are considered. The U_M matrix has dimensions M × M, with M the dimension of the embedding space.
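Numerically, such a random unitary U_M can be drawn from the Haar measure by QR-decomposing a complex Gaussian matrix and fixing the phases along the diagonal of R (a standard recipe; the helper name `random_unitary` is ours, not from the paper):

```python
import numpy as np

def random_unitary(m, seed=None):
    """Draw an m x m Haar-random unitary: QR decomposition of a
    complex Gaussian matrix, with a phase fix on the diagonal of R."""
    rng = np.random.default_rng(seed)
    z = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))  # rescale columns so the distribution is Haar

U5 = random_unitary(5, seed=0)  # a 5 x 5 model of the disordered medium
```

The same matrix can model a disordered slab, a multimode fiber, or a waveguide array, since only the unitary input/output relation matters here.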
The "reduced" state vector at input has dimensions N × 1, with N ≤ M. This models the case in which we use a subset of all the available modes. The input to the reservoir is a "rigged" state vector x with dimension M, where the C missing components are replaced by C ancillas, with C = M − N. Our goal is to use the random medium to perform a given operation denoted by a gate unitary matrix X_N. S_M^in and S_M^out are two "training" operators that are applied at input and output (see figure 1) and whose elements can be adjusted. We first consider the presence of the input operator S_M^in = S_M, and S_M^out = 1_M, which can be implemented by spatial light modulators (we denote as 1_M the identity matrix with dimension M).
We identify two cases: (i) either we know the matrix U_M, or (ii) we have to infer U_M from the input/output relations. We show in the following how these two problems can be solved by ANNs, and we denote the two families as non-inferencing and inferencing gates.

Non-inferencing gates - We consider a target gate with a complex-valued input state with dimension N and components x_1, x_2, ..., x_N. We embed the input vector in a rigged Hilbert space with dimension M ≥ N, so that the overall input vector is x = {x_1, x_2, ..., x_N, x_{N+1}, ..., x_M}. We have a linear propagation through a medium with unitary complex transfer matrix U_M. The overall transmission matrix is

T_M = U_M · S_M.   (1)

The observed output vector is written as P · y, where P is an N-projector operator with dimensions N × M such that P = [1_N | 0], with 1_N the identity matrix with size N × N, and 0 a null matrix with dimension N × C. The goal is finding the matrix S_M such that

P · U_M · S_M = [X_N | 0],   (2)

where X_N is the N × N target gate and 0 is the null complement N × C at dimension M. Eq. (2) is a matrix equation, which guarantees that the overall system behaves as an X_N gate on the reduced input. Solving the matrix Eq. (2) may be demanding and nontrivial when the number of dimensions grows. In the following, we discuss the use of machine learning techniques.
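When U_M is known and one adopts the unitary block embedding T_M = diag(X_N, O_C) of Eq. (3), a solution of Eq. (2) can also be written in closed form by inverting the medium, S_M = U_M† · T_M; the ANN machinery pays off in the projected (non-unitary) case and when U_M must be inferred. A minimal NumPy sketch of the closed-form check (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 3, 5
C = M - N

# Random medium: QR of a complex Gaussian matrix gives a unitary U_M.
U_M, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))

# Target qutrit X gate (cyclic shift) and a random unitary complement O_C.
X_N = np.roll(np.eye(N), 1, axis=0)
O_C, _ = np.linalg.qr(rng.standard_normal((C, C)) + 1j * rng.standard_normal((C, C)))

# Unitary embedding T_M = diag(X_N, O_C) as in Eq. (3).
T_M = np.block([[X_N, np.zeros((N, C))], [np.zeros((C, N)), O_C]])

# Since U_M is unitary, S_M = U_M^dagger T_M gives U_M S_M = T_M,
# so the projected condition P T_M = [X_N | 0] of Eq. (2) holds.
S_M = U_M.conj().T @ T_M
P = np.hstack([np.eye(N), np.zeros((N, C))])  # the N-projector
```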
The transmission matrix T_M in the rigged space from x to y can be written in blocks as

T_M = [ X_N  0 ; 0  O_C ],   (3)

where O_C is a unitary matrix with dimensions C × C to be determined. If U_M and S_M are unitary, the resulting transmission matrix T_M is also unitary. However, if one uses Eq. (2) the problem may also have a nonunitary solution ("projected case"), as some channels are dropped at the output. In other words, solving Eq. (3) is not equivalent to solving Eq. (2), and we adopt two different methodologies: one can look for unitary or nonunitary solutions by ANN. By following previous work developed for real-valued matrices [31], we map the complex-valued matrix equation (2) into a recurrent neural network (RNN). In the "non-inferencing" case, the matrix U_M is known, and the solution is found by the RNN in figure 2. The RNN solves an unconstrained optimization problem by finding the minimum of the sum of the elements e_ij > 0 of an error matrix E. The error depends on a "state matrix" W_M, and one trains the elements w_ij of W_M to find

min_W Σ_ij e_ij.   (4)

In the adopted approach, the sum of the elements e_ij is minimum when the hidden-layer elements g_ij of the matrix G(W) are zero. E and G have to be suitably chosen to solve the considered problem. We found two possible G matrices: (i) the "projected" one,

G_P(W) = P · U_M · W_M − [X_N | 0],   (5)

with [X_N | 0] as in Eq. (2), and (ii) the "unitary" one (see Eq. (3)),

G_U(W) = U_M · W_M − T_M.   (6)

Figure 1. A general optical gate based on a complex random medium; the input state x is processed by the input-layer operator Ŝin, the random or multimodal system is modeled by the unitary operator Û, and the output is further elaborated at the readout by Ŝout.
These two cases are discussed below.
To find the unknown training matrix S_M, one starts from an initial guess matrix W_M(0). The guess is then recurrently updated, as in figure 2, until a stationary state W_M(∞) is reached. Once this optimization has converged, the solution is given by S_M = W_M(∞). The update equation is determined by a proper choice of the error matrix E as follows.
As the matrices are complex valued, e ij is a function of g ij and g * ij .
We set e_ij = e_ij(|g_ij|²). The corresponding dynamic RNN equation, which for large times gives the solution to the optimization problem, is

dW_M/dt = −µ U_M† · F(G),   (7)

where µ is the "learning rate", an optimization coefficient (hyperparameter) set to speed up the convergence, and the elements f_ij of the matrix F are the derivatives of the errors with respect to the hidden states, f_ij = ∂e_ij/∂g*_ij (in the projected case, U_M is replaced by P · U_M). Eq. (7) implies that the RNN is composed of two bidirectionally connected layers of neurons, the output layer with state matrix W, and the hidden layer with state matrix G. The training corresponds to sequential updates of F and W when solving the ordinary differential equations (7). As shown in [31], this RNN is asymptotically stable and its steady-state matrix represents the solution (an example of training dynamics is in figure 2b).
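A discrete-time (Euler) analogue of this training can be sketched in a few lines for the projected case, assuming e_ij = |g_ij|², so that each update is a gradient step driven by G_P = P · U_M · W_M − [X_N | 0]. This is our own illustrative sketch, not the paper's TensorFlow/odeint implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 3, 5
C = M - N

U_M, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
X0 = np.hstack([np.roll(np.eye(N), 1, axis=0), np.zeros((N, C))])  # [X_N | 0]
P = np.hstack([np.eye(N), np.zeros((N, C))])                        # N-projector

A = P @ U_M                                   # fixed part of G_P = A W - X0
W = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))  # guess W(0)

mu = 0.5                                      # learning rate (hyperparameter)
for _ in range(200):
    G = A @ W - X0                            # hidden layer, zero at the solution
    W -= mu * A.conj().T @ G                  # Wirtinger gradient of sum_ij |g_ij|^2

residual = np.linalg.norm(A @ W - X0)         # ~0 once the network has converged
```

Because A = P · U_M has orthonormal rows, the residual contracts by a factor (1 − µ) at each step, so the iteration converges for any 0 < µ < 2.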
We code the RNN in TensorFlow™ and use the ODE integrator odeint. In the case N = M, as X_N = X_M is a unitary operator, the solution of the recurrent network furnishes a unitary S_M matrix, which solves the problem. For M > N, the RNN furnishes a unitary solution S_M, and a unitary transfer function T_M, only if we embed the target gate X_N in a unitary operator as in Eq. (3), with O_C a randomly generated unitary matrix.
Single non-inferencing qutrit X gate - We consider the training of a gate X_3 defined by [2,32]

X_3 = [ 0 0 1 ; 1 0 0 ; 0 1 0 ],

i.e., the cyclic shift X_3|j⟩ = |(j + 1) mod 3⟩. The gate X_3 is obtained by an embedding dimension M = 5 and unitary transfer function U_5, as in Fig. 2.
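A quick numerical sanity check of this cyclic-shift definition (our own illustration):

```python
import numpy as np

# Qutrit X gate: X_3|j> = |(j + 1) mod 3>, i.e. a cyclic shift of the basis.
X3 = np.roll(np.eye(3), 1, axis=0)

ket0 = np.array([1.0, 0.0, 0.0])
shifted = X3 @ ket0                       # |0> -> |1>
cubed = np.linalg.matrix_power(X3, 3)     # three shifts return to the identity
```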
For G = G_P, the number of ordinary differential equations for the training of the network is minimal (N = 3); however, the solution is not unitary, as some channels are dropped out by the N-projector. After the training, the overall M × M transmission matrix T_M is such that T_M† · T_M ≠ 1_M, because the solution S_M is not unitary. However, the system always reaches a stationary state.
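This loss of unitarity can be checked directly. As an illustration (our construction, not the RNN steady state), take the minimum-norm solution of the projected condition P · U_M · S_M = [X_N | 0]: the gate acts correctly on the retained channels, yet T_M fails to be unitary:

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 3, 5
C = M - N

U_M, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
X0 = np.hstack([np.roll(np.eye(N), 1, axis=0), np.zeros((N, C))])  # [X_N | 0]
P = np.hstack([np.eye(N), np.zeros((N, C))])

# P U_M has orthonormal rows, so its pseudoinverse is its conjugate transpose;
# S_M below is the minimum-norm solution of P U_M S_M = [X_N | 0].
S_M = (P @ U_M).conj().T @ X0
T_M = U_M @ S_M

gate_ok = np.allclose(P @ T_M, X0)                      # projected gate condition
unitary = np.allclose(T_M.conj().T @ T_M, np.eye(M))    # fails: channels dropped
```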
A unitary solution is found by letting G = G_U and involving the maximum number of ordinary differential equations in (7), with a unitary embedding of X_N as in (3), i.e., adopting a further, randomly generated, unitary matrix O_C. The key point is that the system finds a solution for any random unitary rigging of the matrix X_N, that is, for any randomly assigned matrix O_C. This implies that we can train all these systems to realize different multi-level gates.

Inferencing gates - In the case in which we do not know the transfer matrix of the system, we can still train the overall transmission matrix by using a neural network and infer U_M. Here we use an ANN to determine the training operators without measuring the transfer matrix. Figure 3 shows the scheme of the ANN, where the unitary matrix U_M is represented by its elements u_ij, and the w_ij are the adjustable weights. After training, the resulting w_ij are the elements of the solution matrix S_M. For the sake of simplicity, we first consider Ŝout = 1_M, as above.
For a target X_N we build T_M as in (3) by randomly generating the unitary complement O_C. As T_M and U_M are unitary, the resulting S_M is also unitary. One can use a non-unitary T_M by choosing, for example, O_C = 0; correspondingly, after the training, S_M is not unitary. We randomly generate a set of input states x_i, with i = 1, ..., n_train. Each input state is "labelled" with the target output y_i = T_M · x_i. We remark that x_i and y_i are vectors with size M. A further set of n_valid rigged validation vectors is used to validate the training.
For any input x_i in the training set, we adjust the weights to minimize the error function between the network output and the label y_i = T_M · x_i. After this training, we test the accuracy on the validation set. Each cycle of training and validation is denoted as an "epoch".
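The inference training can be mimicked numerically as follows (our simplification: the medium is simulated, full-batch gradient descent on the mean squared error replaces the epoch-wise TensorFlow optimization, and differentiating through the fixed layer u_ij plays the role of the ANN of figure 3; all names and hyperparameters are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, n_train = 3, 5, 100

# "Unknown" medium: the trainer only ever sees input/output pairs.
U_M, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))

# Target transmission T_M = diag(X_3, O_C) as in Eq. (3).
X3 = np.roll(np.eye(N), 1, axis=0)
O_C, _ = np.linalg.qr(rng.standard_normal((M - N, M - N))
                      + 1j * rng.standard_normal((M - N, M - N)))
T_M = np.block([[X3, np.zeros((N, M - N))], [np.zeros((M - N, N)), O_C]])

# Labelled training set: random rigged inputs x_i and targets y_i = T_M x_i.
X = rng.standard_normal((M, n_train)) + 1j * rng.standard_normal((M, n_train))
Y = T_M @ X

W = np.zeros((M, M), dtype=complex)   # trainable weights w_ij (input operator)
mu = 0.2                              # learning rate
for _ in range(500):
    R = U_M @ W @ X - Y                                 # residuals on the batch
    W -= mu * U_M.conj().T @ R @ X.conj().T / n_train   # mean-squared-error gradient

# Validation on fresh rigged vectors.
Xv = rng.standard_normal((M, 50)) + 1j * rng.standard_normal((M, 50))
val_err = np.linalg.norm(U_M @ W @ Xv - T_M @ Xv) / np.linalg.norm(T_M @ Xv)
```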
Figure 3 shows the ANN for N = 3 and M = 5. In our model, we build a matrix W_M of unknown weights. As we deal with complex quantities, W_M is written as W_M = W_M' + ı W_M'', with W_M' and W_M'' real-valued matrices whose elements form the weights of the ANN. Using random matrices as initial states, we end the iteration when the validation cost is below a threshold ε_valid.

Single qutrit inference X gate - Figure 3 shows the training of a single-qutrit gate X_3. Similar results are obtained with other single-qudit gates, such as X_2 and Z, and for higher dimensions. Training typically needs tens of iterations and scales well with the number of dimensions.

Conclusions - We have investigated the use of machine learning paradigms for designing linear multi-level quantum gates by using a complex transmitting multimodal system. The developed algorithms are versatile and scalable, whether the unitary operator of the random system is known or unknown. We show that generalized single-qudit gates can be designed. The overall methodology is easily implemented by the TensorFlow application program interface (API) and can be directly adapted to experimentally retrieved data. The method can be generalized to more complex information protocols and embedded in real-world multimodal systems.

Figure 2. (a) Recurrent neural network for the matrix equation (7); the status nodes are denoted by the elements of the matrix W, and the hidden state of the system is in the nodes of the matrix F. (b) Training dynamics for the case N = M = 3, with X_T corresponding to a single-qutrit X gate (µ = 100). (c) Resulting transfer function for the case N = 3 and M = 5 in the unitary and non-unitary cases. In the latter case, the excess channels are ignored during the training. The resulting transmission matrices T_M are displayed; O_2 is the unitary complement for C = M − N = 2 in the unitary case.

Figure 3. Example of inference training of an M = 5 random system to act as an X_3 gate. (a) Neural network model (in our example S_M^out is not used); (b) numerical examples of the transmission matrix T_M = U_M · S_M^in before and after training; (c) scaling properties in terms of training epochs. Parameters: n_train = 100, n_valid = 50, ε_valid = 10⁻³, n_epoch = 6.