Tensor-network-based variational Monte Carlo approach to the non-equilibrium steady state of open quantum systems

We introduce a novel method for efficiently simulating the non-equilibrium steady state of large many-body open quantum systems with highly non-local interactions, based on a variational Monte Carlo optimization of a matrix product operator ansatz. Our approach outperforms and offers several advantages over comparable algorithms, such as an improved scaling of the computational cost with respect to the bond dimension for periodic systems. We showcase the versatility of our approach by studying the phase diagrams and correlation functions of the dissipative quantum Ising model with collective dephasing and long-ranged power-law interactions for spin chains of up to N = 100 spins.

Open quantum systems are often well described by the Lindblad master equation. However, solving it for larger system sizes is seldom straightforward, even in one dimension, due to an exponentially increasing size of the Hilbert space (in fact, with double the exponent as compared to closed quantum systems). Exact numerical treatment based on diagonalization of the Lindbladian superoperator or Monte Carlo averaging of stochastic quantum trajectories [39][40][41] is possible only for small systems, while other exact approaches are limited to specific models or conditions [42][43][44]. These limitations prompted the development of many other approaches, each compromising on different aspects of the problem. Some popular categories include mean-field [45][46][47][48], perturbative [49][50][51][52][53], phase-space [54][55][56][57], or ensemble truncation methods [58][59][60][61].
In the context of strongly-correlated low-dimensional quantum many-body lattices, tensor network methods, such as the celebrated density matrix renormalization group (DMRG) [62], time-evolved block-decimation [63], or the corner transfer renormalization group in two dimensions [64], stand out as some of the most powerful approaches for closed and open systems with limited entanglement. These methods rely on retaining only the most relevant states of the Hilbert space; the truncation is controlled by the bond dimensions of the tensor network ansatz, which can be increased to include more correlations between subsystems (at the price of costlier tensor contractions). For open quantum systems, most implementations rely on simulating the real-time dynamics of the system [65][66][67][68][69][70] (in some hybrid approaches, the dynamics is unraveled using quantum trajectories and simulated with tensor network techniques [71][72][73][74][75]), from which the steady-state properties can be extracted in the long-time limit. However, the potentially rapid increase in entanglement during the time evolution may cause the bond dimension to grow exponentially [76], causing such methods to fail to capture the correct long-time behavior [77,78]. In this regard, when studying the steady state, variational methods [77,[79][80][81] (and related imaginary-time evolution methods [82]) are powerful alternatives to real-time evolution. It is known that a correct description of the steady state frequently requires a smaller bond dimension than that of the intermediate states in the transient dynamics towards it [83]. A variational approach thus allows one to target the steady state directly, bypassing these costly intermediate states, as demonstrated in [77]. Nonetheless, appropriate cost functions for open quantum systems are non-linear in the Lindbladian [84], severely limiting the effectiveness of this approach [69].
In recent years, variational Monte Carlo (VMC) methods based on the optimization of quantum neural networks have made strides in simulating a wide variety of open quantum lattices in one and two dimensions [85][86][87][88][89][90][91][92][93][94]. In [87] it was found that the steady state of the dissipative anisotropic Heisenberg model in one dimension could be obtained with a neural network ansatz with close to 40 times fewer variational parameters than with an MPO-based method of comparable accuracy. However, in contrast to tensor networks, quantum neural networks are restricted to smaller system sizes, with the largest system hitherto studied in the literature with a time-independent approach limited to 20 spins [89]. Moreover, it is unclear how well this class of variational ansätze can capture long-ranged correlations [70].
Inspired by previous variational tensor network and quantum neural network approaches to open quantum lattices, we formulate a new time-independent variational Monte Carlo method, based on the optimization of an MPO tensor network ansatz for the steady-state density matrix, which we term variational matrix product operator Monte Carlo (VMPOMC). By combining the strengths of tensor networks and variational Monte Carlo, our method outperforms and enjoys distinct advantages over comparable approaches. It is also capable of simulating computationally-challenging, highly non-local interactions, such as long-ranged Ising interactions or collective dephasing, which we investigate in this work.
This paper is organized as follows: In Sec. 2 we review some fundamental concepts necessary for the formulation of VMPOMC, which we then develop in Sec. 3. We benchmark and illustrate the capabilities of our method in Sec. 4, concentrating on variants of the paradigmatic dissipative quantum Ising model, including with long-range interactions and collective dephasing. Sec. 5 provides a summary and outlook.

Basic concepts
Our goal is to characterize the steady-state properties of one-dimensional, translationally-invariant, interacting quantum systems in contact with a dissipative environment, described by a Markovian quantum master equation in Lindblad form (setting ℏ = 1) [95,96],

∂ρ/∂t = Lρ = −i[H, ρ] + Σ_k ( Γ_k ρ Γ_k† − ½ { Γ_k† Γ_k, ρ } ),  (1)

where the Hamiltonian H governs the coherent dynamics of the system and where the Lindblad jump operators Γ_k model the incoherent dynamics due to the coupling of the system to the dissipative environment. The non-equilibrium steady state of the system, ρ_SS, is defined as the solution to Lρ_SS = 0. In this work, we only consider models with a unique steady state.
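As a concrete illustration, the action of the Lindbladian in Eq. (1) on a density matrix can be sketched in a few lines of Python (the helper name `lindblad_rhs` is ours, not part of any library). Trace preservation, tr(Lρ) = 0, and hermiticity preservation serve as quick sanity checks:

```python
import numpy as np

def lindblad_rhs(rho, H, jump_ops):
    """Right-hand side of the Lindblad master equation, Eq. (1):
    L(rho) = -i[H, rho] + sum_k (G rho G+ - 1/2 {G+ G, rho})."""
    out = -1j * (H @ rho - rho @ H)
    for G in jump_ops:
        GdG = G.conj().T @ G
        out += G @ rho @ G.conj().T - 0.5 * (GdG @ rho + rho @ GdG)
    return out

# Single spin in a transverse field with incoherent decay (illustrative values).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)  # sigma^- lowering operator
```

Because Eq. (1) preserves the trace and hermiticity of ρ, tr(Lρ) vanishes and Lρ stays Hermitian for any Hermitian ρ; both properties hold term by term in the code above.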
The Lindbladian can be conveniently represented in Liouville space, the space of vectorized density matrices, where we map ρ = Σ_{α,β} ρ_{αβ} |α⟩⟨β| to the vector |ρ⟩ = Σ_{α,β} ρ_{αβ} |α⟩ ⊗ |β⟩ by an application of Choi's isomorphism [98,99] to each single-body state, abbreviating the indices by the pair x_i = (α_i, β_i), where we denote x = (x_1, . . ., x_N). In Liouville space, superoperators are mapped to matrices; from the properties of the Kronecker product one obtains, for a generic Lindbladian in Eq. (1),

L = −i( H ⊗ 𝟙 − 𝟙 ⊗ H^T ) + Σ_k [ Γ_k ⊗ Γ_k* − ½ ( Γ_k†Γ_k ⊗ 𝟙 + 𝟙 ⊗ (Γ_k†Γ_k)^T ) ].  (2)

When the dimensions of the Lindbladian matrix are sufficiently small, the steady state can be found by solving for the null eigenvector of (2). One can also define the completeness relation and inner product in Liouville space. The former,

Σ_x |x⟩⟨x| = 𝟙,  (3)

follows directly from the usual completeness relation. The inner product between vectorized operators |A⟩ and |B⟩, which we denote by the braket ⟨A|B⟩, is to be understood as the Hilbert-Schmidt product,

⟨A|B⟩ = tr(A†B).  (4)

It follows that the Hilbert-Schmidt (Frobenius) norm of the operator (matrix) A can be expressed as ∥A∥ = √⟨A|A⟩ [100].

A matrix product operator (MPO) representation of the density matrix has the form

ρ = Σ_{α,β} tr( A^{α_1,β_1} A^{α_2,β_2} ⋯ A^{α_N,β_N} ) |α⟩⟨β|,  (5)

where the tensors A^{α_i,β_i}, with physical indices α_i, β_i = 1, . . ., d (where d denotes the local Hilbert space dimension), can be considered matrices of (bond) dimension χ for fixed physical indices α_i, β_i. For translationally-invariant systems with a unit cell of size one and periodic boundary conditions, the matrices A^{α_i,β_i} are functions of the values of α_i and β_i but are otherwise independent of the site i. In vectorized notation, after reshaping the tensors, the MPO becomes a matrix product state (MPS),

|ρ⟩ = Σ_x tr( A^{x_1} A^{x_2} ⋯ A^{x_N} ) |x⟩.  (6)

We depict this in Fig. 2.
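The vectorized Lindbladian and its null eigenvector can be sketched for small systems (a brute-force illustration of the exact-diagonalization route mentioned above, not the variational method itself; function names are ours). With row-major vectorization, vec(AρB) = (A ⊗ B^T) vec(ρ), which fixes the Kronecker structure of Eq. (2):

```python
import numpy as np

def vectorized_lindbladian(H, jump_ops):
    """Matrix form of Eq. (2), acting on row-major-vectorized density
    matrices |rho> = sum_{a,b} rho_ab |a> (x) |b>."""
    d = H.shape[0]
    I = np.eye(d)
    L = -1j * (np.kron(H, I) - np.kron(I, H.T))
    for G in jump_ops:
        GdG = G.conj().T @ G
        L += np.kron(G, G.conj()) - 0.5 * (np.kron(GdG, I) + np.kron(I, GdG.T))
    return L

def steady_state(L, d):
    """Null eigenvector of L, reshaped and trace-normalized."""
    w, v = np.linalg.eig(L)
    rho = v[:, np.argmin(np.abs(w))].reshape(d, d)
    return rho / np.trace(rho)
```

For a single spin with a transverse field and decay, the recovered ρ_SS annihilates L, has unit trace, and is Hermitian, as required.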

Method
Having reviewed the required prerequisites, we can now formulate the VMPOMC method. In Sec. 3.1 we set up the variational parametrization of the density matrix, introduce the notions of a cost function and the local estimator of the Lindbladian, and discuss their computation with an MPO ansatz. This is followed by a discussion of the optimization of the ansatz via stochastic gradient descent and stochastic reconfiguration in Sec. 3.2. A summary and comparison with other MPO-based approaches is provided in Sec. 3.3.

Variational Monte Carlo with Lindbladians and MPOs
In our variational approach, we represent the density matrix as a translationally-invariant MPO tensor network, ρ = ρ(a), where the variational parameters a = {a_i}_{i=1}^{N_param} are the N_param = d²χ² distinct elements of the site-independent matrices {A^s}_{s=1}^{d²}. For notational brevity, we use the compound index i = (u, v, s) for the full set of indices labelling a tensor element of the MPO, a_i = a^s_uv. We seek the optimal parameter values yielding the desired steady state. To this end, we must define a suitable cost function which attains its global minimum at the steady state and which can be systematically minimized by tuning the variational parameters [86,99]. We resort to the variational principle, which for open quantum systems can be stated as

0 ≤ ∥Lρ(a)∥² / ∥ρ(a)∥²,  (7)

with the equality saturating at the steady state [84,86]. Different interpretations of the above principle are possible, depending on the chosen definition of the norm. It was argued [84,101] that the optimal choice is the trace norm. However, the trace norm is difficult to compute in practice; instead, we shall make use of the Hilbert-Schmidt norm, defining the cost function

C = ∥Lρ(a)∥² / ∥ρ(a)∥² = ⟨ρ(a)| L†L |ρ(a)⟩ / Z,  (8)

with normalization constant (purity) Z = ∥ρ(a)∥² = Σ_x |⟨x|ρ(a)⟩|². For notational brevity, we shall suppress all functional dependencies on a from now on.
The cost function in Eq. (8) cannot be evaluated exactly for larger lattices as written, due to the exponential growth of the size of the Hilbert space. We therefore resort to Monte Carlo sampling to evaluate it approximately. First, we must represent it in a form amenable to stochastic sampling. In addition, the product L†L is in general difficult to handle, as it is far less sparse than L. A particularly useful form of Eq. (8) (originally due to [86]), an expectation value of products of terms linear in L, is obtained upon insertion of the completeness relation in Eq. (3) in between the Lindbladian superoperators,

C = Σ_x (|⟨x|ρ⟩|²/Z) |L_loc(x)|² = E_{x∼p(x)}[ |L_loc(x)|² ],  (9)

where we defined the local estimator of the Lindbladian,

L_loc(x) = ⟨x|L|ρ⟩ / ⟨x|ρ⟩ = Σ_y ⟨x|L|y⟩ ⟨y|ρ⟩ / ⟨x|ρ⟩.  (10)

The expectation value E_{x∼p(x)}[·] in Eq. (9) can be referred to as a quantum ensemble average, in which an observable (here, |L_loc(x)|²) is averaged over an ensemble of many-body configurations {x}, distributed according to the Born-like probability density p(x) = |⟨x|ρ⟩|²/Z. As in classical Monte Carlo simulations, this ensemble average can be stochastically approximated by sampling a set of states {x} from the distribution p(x). In our implementation, we draw new Monte Carlo samples via a sequential Metropolis update, updating each site of the system in fixed order, which minimizes costly tensor contractions; Monte Carlo sampling is delineated in Appendix A.
Let us now consider more closely the expression for the local estimator of the Lindbladian. At first glance, its computation seems like a formidable task due to the sum over all basis states {y}. In practice, however, the Lindbladian is to a large degree local, such that the subset of configurations {y} for which ⟨x|L|y⟩ is nonzero is small (typically of O(N)). On the other hand, because the density matrix is represented by an MPO, the brakets in the local estimator of the Lindbladian are products of matrices, which are generally expensive to compute, with a formal computational complexity of O(Nχ³) per braket. If the size of the subset of important configurations {y} is of O(N), this results in an overall scaling of O(N²χ³) per sample x. However, a more careful analysis (see Appendix B) shows that, for sufficiently local Lindbladians, the above matrix products can be evaluated in sweeps where parts of previously computed matrix products are reused, reducing the formal computational scaling by a factor of N, to O(Nχ³) per sample x.
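The reuse of partial matrix products can be illustrated directly: with cached left and right products, the braket of a configuration differing from x at a single site costs only O(χ³) rather than O(Nχ³). This is a sketch with illustrative function names, not the full sweep of Appendix B:

```python
import numpy as np

def right_products(A, x):
    """R[j] = A^{x_j} A^{x_{j+1}} ... A^{x_{N-1}}, with R[N] = identity."""
    N, chi = len(x), A.shape[-1]
    R = [None] * (N + 1)
    R[N] = np.eye(chi)
    for j in range(N - 1, -1, -1):
        R[j] = A[x[j]] @ R[j + 1]
    return R

def amp_single_flip(A, x, R, Lmat, j, s):
    """<x'|rho> where x' equals x except x'_j = s, in O(chi^3):
    tr(Lmat A^s R[j+1]), with Lmat = A^{x_0} ... A^{x_{j-1}} cached."""
    return np.trace(Lmat @ A[s] @ R[j + 1])
```

During a sequential sweep, Lmat is extended by one factor after each site is settled, so the whole sweep stays at O(Nχ³) per sample.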
It must be noted that the variational steady state can generally only be an approximation to the true steady state, owing to the finite number of variational parameters. Moreover, to constitute a physically valid density matrix, it must satisfy: (1) hermiticity, (2) unit trace, and (3) positive semi-definiteness. While the first two conditions are easily enforced by renormalizing the matrix elements, the last is difficult to impose. Although it is possible to construct a positive-definite ansatz via purification [65,68], the required bond dimension may become arbitrarily large [102], and translationally-invariant MPOs need not admit a purified form valid for all system sizes [103]. Therefore, we only consider the MPO in its unpurified form, which we nevertheless observe to reliably converge towards the correct, positive-definite manifold of solutions. As argued in [77], convergence may be formally decided by considering the variation of C (and other observables) as a function of increasing bond dimension, requiring that the variation falls below some set threshold value. In what follows, we shall deem "converged" those simulations that attain C/N < 10⁻⁴.

Optimization
Having defined an appropriate ansatz and cost function, we now seek a systematic method of optimizing the variational parameters. We shall outline two approaches: stochastic gradient descent (SGD) and the more sophisticated stochastic reconfiguration (SR) method.

Stochastic gradient descent
In the gradient descent method, the variational parameters are updated in the direction of the steepest decrease of the cost function C. Although our variational ansatz is holomorphic, the cost function is not, due to the presence of complex conjugates of the variational parameters. Under these conditions, the correct expression for the steepest-descent gradient vector is given by the "conjugate cogradient" [104], i.e. the Wirtinger derivative of the cost function with respect to the conjugated variational parameters,

f_i = ∂C/∂a_i*.  (11)

We further introduce the derivative superoperators ∆_i, satisfying ∆_i |ρ⟩ = ∂_{a_i} |ρ⟩. In addition, we shall denote their diagonal matrix elements by

∆_i(x) = ⟨x|∆_i|ρ⟩ / ⟨x|ρ⟩ = ∂_{a_i} ln ⟨x|ρ⟩.  (12)

Defining the log-derivatives of the Lindbladian,

L_i(x) = ∂_{a_i} ⟨x|L|ρ⟩ / ⟨x|L|ρ⟩,  (13)

the expression for the gradient can be written as [86]

f_i = E_{x∼p(x)}[ |L_loc(x)|² L_i*(x) ] − C E_{x∼p(x)}[ ∆_i*(x) ].  (14)

As in the case of the cost function, the ensemble averages in Eq. (14) can be approximated by Monte Carlo sampling, in which case we speak of stochastic gradient descent (SGD). Likewise, the computation of the gradient can be made more manageable by exploiting the structure of the MPO ansatz; details of the efficient computation of the gradient for an MPO ansatz are given in Appendix C.
With the gradient vector known, the variational parameters can be updated by setting

a ← a − δ f,  (15)

for some small step size δ. After updating the tensor elements, the trace of the density matrix will generally no longer be unity. To alleviate this, we renormalize the local tensor elements so that tr ρ = 1 after every iteration.
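For a translationally-invariant MPO with periodic boundary conditions, tr ρ = tr(T^N) with T = Σ_α A^{(α,α)}, so the renormalization amounts to rescaling the local tensor by the N-th root of the trace. A sketch, where the storage layout (the compound physical index x = (α, β) stored as α·d + β) is our assumption:

```python
import numpy as np

def trace_mpo(A, N, d):
    """tr(rho) = tr(T^N), with T = sum_a A^{(a,a)} (PBC, unit cell of one)."""
    T = sum(A[a * (d + 1)] for a in range(d))   # x = (a, b) stored as a*d + b
    return np.trace(np.linalg.matrix_power(T, N))

def renormalize_trace(A, N, d):
    """Rescale the local tensor so that tr(rho) = 1 (principal N-th root)."""
    return A / trace_mpo(A, N, d) ** (1.0 / N)
```

Because every site carries the same tensor, dividing A by the principal N-th root of tr ρ fixes the trace to one exactly.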

Stochastic reconfiguration
The above gradient descent method can be refined by accounting for the curvature of the variational manifold. Roughly speaking, small changes of the variational parameters need not correspond to small changes in the cost function; by computing the metric tensor S_ij, it is possible to reconfigure the gradient vector f_i so that the Hilbert-Schmidt distance between successive variational density matrices is minimized, accelerating convergence. This approach is known as stochastic reconfiguration (SR) [105,106].
In the context of open quantum systems, the SR method has been adapted to the optimization of quantum neural network ansätze in [85][86][87][88]. The resultant equations of motion for the variational parameters are Σ_j S_ij δa_j = f_i [105], where the elements of the N_param × N_param metric tensor S can be expressed as [86]

S_ij = E_{x∼p(x)}[ ∆_i*(x) ∆_j(x) ] − E_{x∼p(x)}[ ∆_i*(x) ] E_{x∼p(x)}[ ∆_j(x) ].  (16)

As before, we have interpreted the above expectation values as ensemble averages over the many-body configurations {x}, which we can evaluate stochastically, reusing the same set of samples used to calculate the cost function (9) and gradients (14).
With the metric tensor and gradient vector known, we may update the variational parameters by setting

a ← a − δ S⁻¹ f,  (17)

for some small step size δ. The above requires the inversion of the metric tensor; due to stochastic sampling and numerical inaccuracy, it may happen that S becomes non-invertible. This can be prevented by adding a constant, diagonal shift to the final metric tensor,

S_ij → S_ij + ϵ δ_ij,  (18)

where δ_ij denotes the Kronecker delta and where the parameter ϵ can be tuned to modulate the strength of the reconfiguration of the gradients in Eq. (17).
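A minimal sketch of a regularized SR step: the metric tensor is estimated as a covariance of the sampled log-derivatives ∆_i(x), shifted by ϵ on the diagonal, and the shifted linear system is solved rather than explicitly inverted (a numerically preferable but otherwise equivalent choice; names are illustrative):

```python
import numpy as np

def sr_update(Delta, f, eps):
    """One stochastic-reconfiguration step.
    Delta: (N_MC, N_param) array of sampled log-derivatives Delta_i(x);
    f: gradient vector; eps: diagonal shift.
    Builds S_ij = E[D_i* D_j] - E[D_i*] E[D_j] and solves (S + eps*I) da = f."""
    mean = Delta.mean(axis=0)
    S = (Delta.conj().T @ Delta) / Delta.shape[0] - np.outer(mean.conj(), mean)
    return np.linalg.solve(S + eps * np.eye(S.shape[0]), f)
```

The sample covariance S is Hermitian and positive semi-definite by construction, so any ϵ > 0 guarantees an invertible system.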

Selecting optimal hyperparameter values
The success in the search for the global minimum depends critically on the optimization hyperparameters, such as the step size for the gradient update, the number of Monte Carlo samples per iteration, or the regularization offset when using SR. We select appropriate values for the step size and offset by trial and error, and decrease the former geometrically with the number of iterations. To minimize statistical inaccuracy during optimization, we ensure that a sufficiently large number of Monte Carlo samples (N_MC ≫ N_param) [105] is drawn every iteration. In some simulations, we periodically increase N_MC to allow for more precise, less noise-affected updates. Regarding the regularization of the metric tensor, it is in principle possible to determine the minimum offset ϵ that guarantees positive semi-definiteness of the metric tensor S by diagonalizing S and setting the magnitude of ϵ to that of its smallest (negative) eigenvalue. However, in practice, we find that a somewhat larger offset results in more stable convergence.

Summary
For the convenience of the reader, we outline here the critical steps of a single iteration of the VMPOMC algorithm. A graphical summary is also shown in Fig. 3.
1. Sample a many-body configuration x via a sequential Metropolis update, while generating a set of right matrix products {R_j}_{j=1}^N.
2. Calculate the local estimator of the Lindbladian, L_loc(x) = ⟨x|L|ρ⟩ / ⟨x|ρ⟩, while generating a set of left matrix products {L_j}_{j=1}^N.
3. Calculate the local log-derivatives ∆_i(x) and L_i(x) from the stored matrix products.
4. Repeat steps (1)-(3) N_MC times, then compute the gradients f_i = ∂_{a_i*} C.
5. Optionally: compute and regularize the metric tensor S_ij.
6. Update the tensor elements, a_i ← a_i − δ f_i (or a_i ← a_i − δ Σ_j (S⁻¹)_ij f_j).
For generic n-local Lindbladians, the formal worst-case computational scaling for one iteration of VMPOMC with SR is O(N_MC N² d^{2n} χ³ + N_MC d^{4n} χ⁴ + d^{6n} χ⁶). In practice, however, the computational cost can be greatly reduced by exploiting the sparsity and diagonality of the local Lindbladian operators (see Appendix B).
Compared to other variational MPO-based approaches [77,79,81], our method enjoys numerous distinct advantages. It eliminates the need to find an MPO representation of the Lindbladian, whose bond dimension may be large and may significantly increase the difficulty of the required tensor contractions [79]. It is readily applied to finite periodic systems; comparable MPO- or DMRG-based algorithms would have effective computational costs of O(N²χ⁵) or O(N²χ⁶) per iteration [107,108]. It relies on the evaluation of single-layer contractions (⟨x|ρ⟩), whose computational cost is far lower than that of the multi-layer contractions in a deterministic approach, further compounded by the added complexity of handling the non-local squared Lindbladian L†L (partly alleviated in our method thanks to Eq. (9)). Due to its inherent stochasticity, it has a reduced likelihood of becoming trapped in a local minimum, and therefore does not require additional warm-up protocols as in e.g. [77,81]. Moreover, it is particularly suitable for treating purely- or mostly-diagonal superoperators (contingent on sufficient expressibility of the MPO ansatz) due to the cancellation of the probability amplitudes in the expressions for the cost function and gradient vector, enabling exact treatment of otherwise prohibitively difficult non-local interactions, as we shall demonstrate presently.

Numerical results
We benchmark and illustrate the capabilities of VMPOMC by studying variants of the paradigmatic dissipative quantum Ising model (DQIM). Given the variational steady-state density matrix ρ, we measure the steady-state expectation value of an observable O by computing ⟨O⟩ = tr(Oρ). We note that, unlike in comparable neural-network-based algorithms [85][86][87], here the expectation values can be computed exactly, i.e. without carrying a Monte Carlo error, via an appropriate tensor contraction (see Fig. 4). The parameter and hyperparameter values used in all subsequent simulations are detailed in the table in Appendix D.

Dissipative Ising chain
As a first example, we study the one-dimensional DQIM in a transverse magnetic field with nearest-neighbor interactions and incoherent decay, defined by the Hamiltonian

H = J Σ_k σ^z_k σ^z_{k+1} + h Σ_k σ^x_k,  (19)

with the Lindblad spin decay operators Γ_k = √(γ⁻) σ⁻_k for all k = 1, . . ., N, where σ^n for n ∈ {x, y, z} denote the Pauli spin operators and σ^±_k = (σ^x_k ± iσ^y_k)/2. A schematic of the model (with added dephasing) is shown in Fig. 1. Non-trivial steady-state properties of the model arise from the competition between the unitary and dissipative dynamics. Variations of the DQIM have been used as a testing ground for other methods in earlier publications [70,77,[86][87][88]. Moreover, after inclusion of long-ranged interactions, the DQIM becomes experimentally realizable with e.g. Rydberg gases [101,109,110]. We assume γ⁻ = 2J = 1 in all subsequent simulations.
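For small chains, the model can be solved brute-force by diagonalizing the vectorized Lindbladian, in the spirit of the exact benchmarks used below. A sketch, where the J and h prefactor conventions and all function names are illustrative:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 0], [1, 0]], dtype=complex)

def site_op(op, k, N):
    """Embed a single-site operator at site k of an N-site chain."""
    out = np.eye(1, dtype=complex)
    for j in range(N):
        out = np.kron(out, op if j == k else np.eye(2))
    return out

def dqim_steady_state(N, J, h, gamma):
    """Exact NESS of the nearest-neighbor DQIM with PBC and spin decay,
    obtained as the null eigenvector of the vectorized Lindbladian."""
    H = sum(J * site_op(sz, k, N) @ site_op(sz, (k + 1) % N, N)
            + h * site_op(sx, k, N) for k in range(N))
    jumps = [np.sqrt(gamma) * site_op(sm, k, N) for k in range(N)]
    d = 2 ** N
    I = np.eye(d)
    L = -1j * (np.kron(H, I) - np.kron(I, H.T))
    for G in jumps:
        GdG = G.conj().T @ G
        L += np.kron(G, G.conj()) - 0.5 * (np.kron(GdG, I) + np.kron(I, GdG.T))
    w, v = np.linalg.eig(L)
    rho = v[:, np.argmin(np.abs(w))].reshape(d, d)
    return rho / np.trace(rho)
```

The cost of this brute-force route grows as 4^N, which is precisely what motivates the variational approach for large N.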
In Fig. 5 we depict an example optimization process, comparing the SGD and SR algorithms. We plot the stochastic estimates of the cost function and a selection of steady-state observables as a function of the iteration step for the DQIM with N = 12 sites and an MPO ansatz of bond dimension χ = 4. The small number of sites allows us to gauge the accuracy of our method by comparing our results against the corresponding exact steady-state expectation values, which we find via an exact diagonalization of the Lindbladian. The exact results are indicated on the plots with black, horizontal lines. SR is found to converge to the minimum much more rapidly than SGD, yielding more accurate estimates of the measured observables at only a modest increase in the computational overhead. Moreover, observable estimates from SGD plateau some distance away from the exact values, possibly becoming trapped in a local minimum due to the low number of variational parameters. It should be noted that the effectiveness of SR over SGD depends strongly on the model and chosen parameters. Nevertheless, we find it to be the superior choice in virtually all circumstances if adequately regularized. As such, we shall resort to SR exclusively in all subsequent computations (with suitably adapted values of ϵ).
In Fig. 6 we investigate the convergence of VMPOMC as a function of increasing bond dimension χ for the DQIM with N = 12 and N = 100 spins. We concentrate on the cost function, steady-state magnetizations, and steady-state Rényi-2 entropy S₂ = −log₂(Z)/N. For N = 12, we start the optimization process from entirely random initial tensors, while for N = 100 we do so from the already converged, rescaled tensors obtained with N = 12. As expected, the final averaged values of the cost function decrease monotonically with increasing χ. The criterion for convergence (C/N < 10⁻⁴) is satisfied for both system sizes for a bond dimension as low as χ = 6. Correspondingly, respectable accuracy (< 4% relative error) across all considered observables is achieved for χ = 6, and is improved further for χ ≥ 10, with ≈ 0.1% relative error for χ = 16. Most of the recorded measurements are visually indistinguishable between the two system sizes, indicating the lack of discernible finite-size effects when increasing the chain length from 12 to 100 spins for the chosen parameter values. The low bond dimension required for convergence in our variational approach is worth contrasting with time-dependent MPO-based methods, where the required bond dimensions may reach several hundred [77].
In Fig. 7 we compare directly the variational density matrix after 1000 SR iterations to the exact steady-state density matrix for a system of 6 spins. The variational density matrix approximates the steady state accurately, with an Uhlmann fidelity between the variational and exact density matrices in excess of 0.99997.
In Fig. 8 we compare the accuracy of our method against the other time-independent variational approaches of [86] and [94], based on restricted Boltzmann machine (RBM) and convolutional neural network (CNN) ansätze, respectively. Our steady-state magnetization results, obtained with an MPO of bond dimension χ = 6, are in the best agreement with the exact (quantum trajectories) results. We note that the total number of variational parameters used in our approach (144 for χ = 6) is close to 20 times smaller than for the RBM ansatz (up to 2752), and 3 times smaller than for the CNN ansatz (438). We attribute the improved performance of our method in part to the translationally-symmetric structure of the MPO ansatz, which reduces the number of variational parameters, as well as to the lack of additional statistical errors in the observable measurements.

In Fig. 9 we perform a finite-size extrapolation of the steady-state magnetizations and Rényi-2 entropy, comparing VMPOMC for the dissipative quantum Ising model with periodic boundary conditions (PBC) with a time-dependent MPS (t-MPS) approach (as described in [108]) for an equivalent model with open boundary conditions (OBC), with the Hamiltonian

H_OBC = J Σ_{k=1}^{N−1} σ^z_k σ^z_{k+1} + h Σ_{k=1}^{N} σ^x_k.  (20)

The steady-state expectation values are seen to converge to the thermodynamic limit at smaller system sizes for PBC than for OBC. In the latter case, a more accurate estimate of the thermodynamic expectation values is given by measuring the magnetizations of a single spin in the bulk of the chain instead of an average over the whole chain. On similar grounds, the estimate of S₂, a global property of the many-body system, converges much more slowly with increasing system size to its thermodynamic-limit value with OBC than with PBC. We attribute the slight deviations of the extrapolated thermodynamic-limit values for t-MPS from those of VMPOMC to additional sources of error, such as the Trotter decomposition and the finite time-step size.

Long-ranged interactions
As a second example, we study the DQIM with spin decay as before, except with the nearest-neighbor interactions replaced by long-ranged ones. The corresponding Hamiltonian is

H = (J/N_α) Σ_{k<l} σ^z_k σ^z_l / |k − l|^α + h Σ_k σ^x_k,  (21)

where the exponent α ∈ (0, ∞) sets the decay length of the long-ranged interactions, and where N_α is the Kac normalization factor [111]. These power-law interactions are experimentally realizable via e.g. Rydberg atoms (α = 3, 6) or trapped ions (0 < α < 3) [112].
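The Kac factor admits several conventions; the sketch below normalizes the couplings so that each spin's total coupling strength sums to J (an illustrative choice, not necessarily the convention of [111]), using the ring distance appropriate for periodic boundary conditions:

```python
import numpy as np

def power_law_couplings(N, J, alpha):
    """Kac-normalized couplings J_kl = J / (N_alpha * d(k,l)^alpha) on a
    periodic chain, with d(k,l) the ring distance. N_alpha is chosen here
    so that sum_{l != k} J_kl = J for every site k."""
    def dist(k, l):
        return min(abs(k - l), N - abs(k - l))
    N_alpha = sum(dist(0, l) ** (-float(alpha)) for l in range(1, N))
    Jmat = np.zeros((N, N))
    for k in range(N):
        for l in range(N):
            if k != l:
                Jmat[k, l] = J / (N_alpha * dist(k, l) ** float(alpha))
    return Jmat
```

For large α the weight concentrates on the two nearest neighbors of each site, recovering the nearest-neighbor model, while for α → 0 the couplings become uniform (fully connected).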
For α = ∞ we recover the previously studied model with nearest-neighbor interactions. In the other limiting case of α = 0, the spins become fully connected; a family of such models has been studied extensively via a variety of mean-field [45], exact [44], field-theoretic [113], semiclassical [114], and other approaches [53,115,116]. For 0 < α < ∞, variations on the above model have been studied only very recently with tree tensor network [117] and truncated Wigner approaches [55]. Our MPO-based method allows for the exact treatment of long-ranged Ising interactions as in Eq. (21) of, in principle, arbitrary decay lengths (contingent on sufficient expressibility of the MPO ansatz). This is made possible by the purely diagonal form of the interaction Hamiltonian, which implies a cancellation of the probability amplitudes in Eqs. (10) and (13), and thus the absence of costly tensor contractions.
The long-range interactions may induce long-ranged correlations whose simulation may be prohibitively difficult for the MPO ansatz. We therefore begin by again assessing the convergence of our method with bond dimension for chains of length N = 12 and N = 100 at h = 1.0 and α = 2 in Fig. 10. As before, for N = 100, we start from the already converged, rescaled tensors obtained with N = 12. Remarkably, convergence in both cases is achieved for modest bond dimensions of χ ≥ 8.
In Fig. 11 we study the effect of the long-ranged interactions on the magnetization phase diagram and spin-spin correlation functions of the dissipative quantum Ising model. We consider the decay exponents α = 2, 3, ∞ with N = 100 and χ = 10 throughout. Remarkably, convergence was achieved for all considered decay exponents and external field strengths.
In Fig. 12 we perform a finite-size extrapolation of the steady-state magnetizations and Rényi-2 entropy with the VMPOMC algorithm for the dissipative quantum Ising model with PBC and long-range dipolar interactions.

Local and collective dephasing
As a final example, we study the DQIM with nearest-neighbor interactions and single-site decay, alongside a dephasing term. We consider two possible dephasing channels: local, described by the single-site Lindblad jump operators Γ_k = √(γ_d) σ^z_k for all k = 1, . . ., N, and collective, described by the single non-local operator Γ_c = √(γ_d) Σ_k σ^z_k. Models with collective dephasing have attracted interest due to promising applications in e.g. quantum error prevention [12,13]. To the authors' best knowledge, the above DQIM with collective dephasing has not been studied in the literature, with previous numerical and mean-field studies limited only to permutationally-invariant versions [115,118]. Similarly to the case of long-ranged Ising interactions, the exact treatment of this model with our method is made possible by the purely-diagonal form of the Lindblad collective dephasing operator.
In Fig. 13 we compare the steady-state magnetization and Rényi-2 entropy S₂ phase diagrams in the presence of purely-local or purely-collective dephasing. Notably, we observe a striking difference in the behavior of S₂. For local dephasing, S₂ can be seen to increase at weak dephasing before decreasing for γ_d > 0.5, whereas for collective dephasing the entropy decreases almost monotonically for all dephasing strengths. This behavior can be understood in terms of the rapid initial reduction of the off-diagonal elements (coherences) of the density matrix, resulting in a decrease in the purity, or equivalently an increase in S₂. However, strong dephasing also suppresses the coherent dynamics induced by the external field, which counteracts spin decay in setting the populations, leading to the pure, fully-decayed spin product state in the strong-dephasing limit. For the model with collective dephasing, by contrast, a subset of the coherences (those of the decoherence-free subspace) survives, resulting in the near-immediate reduction of S₂.
Although the added computational overhead from both dephasing terms in computing the local estimator is negligible, the long-ranged correlations induced in the collective case prove too difficult to describe accurately with an MPO ansatz for larger dephasing strengths and system sizes, as indicated by the higher recorded values of the cost function and the poorer precision of the measured magnetizations and entropies in Fig. 13, despite using a twice-larger bond dimension of χ = 24. From a technical perspective, the collective dephasing term can be viewed as a dissipative analogue of an all-to-all spin interaction, and hence is not expected to be simulable with an MPO ansatz. More accurate simulation of such models may therefore be possible with e.g. long-ranged RBMs, which were shown to satisfy a volume-law entanglement entropy scaling for closed quantum systems [119].

Conclusions and perspectives
In this work, we introduced a novel time-independent variational Monte Carlo method, based on the optimization of an MPO tensor network ansatz, applicable to the non-equilibrium steady state of strongly-correlated open many-body quantum systems. Our implementation for open quantum spin chains enjoys excellent agreement with exact results and superior accuracy over related variational Monte Carlo approaches with a significantly smaller number of variational parameters. We delineated the unique advantages of VMPOMC over comparable approaches and illustrated its capabilities by simulating the one-dimensional dissipative quantum Ising model with highly non-local interactions, including collective dephasing and long-ranged power-law interactions. In particular, we have shown that the steady state can often be well described by an MPO of modest bond dimension, even in the presence of long-ranged interactions and with as many as N = 100 spins.
The main bottleneck preventing the effective simulation of larger systems is the computation of the log-derivatives of the Lindbladian in Eq. (13), the cost of which scales as $N^2$ with the system size. In this regard, more advanced optimization techniques which further reduce the total required number of iterations, such as the linear method [120], may yield a significant improvement.
The proof-of-principle approach introduced in this work can be extended in several directions. The translational invariance we imposed may be lifted, or the size of the unit cell may be increased, albeit at the cost of an increased number of variational parameters. Such an extension would enable the simulation of e.g. staggered systems with antiferromagnetic ordering in the steady state. Non-equilibrium dynamics can be simulated by resorting to the time-dependent variational principle, as was done in [85,87,90] for quantum neural network ansätze. Finally, it would be worthwhile to extend our method to two dimensions, for instance by mapping the two-dimensional lattice onto an effective one-dimensional system with long-ranged interactions simulable with our approach, similarly to implementations of DMRG for closed two-dimensional systems [121].

A.3 Sample autocorrelation
To explore the sample space efficiently, it is crucial that the Monte Carlo samples are independently distributed, i.e. that subsequent samples are uncorrelated [105]. In addition to being efficiently calculable, the sequential Metropolis update produces uncorrelated Monte Carlo samples. To see this, we investigate the autocorrelation function, which for a Markov chain of states $\{x_n\}_{n=1}^{N_{\mathrm{MC}}}$ can be estimated via

$$A_O(t) = \frac{\sum_{n=1}^{N_{\mathrm{MC}}-t} \left[ O(x_n) - \bar{O} \right]\left[ O(x_{n+t}) - \bar{O} \right]}{\sum_{n=1}^{N_{\mathrm{MC}}} \left[ O(x_n) - \bar{O} \right]^2},$$

where the integer $t$ denotes the separation between successively accepted states of the Markov chain, and where $O$ denotes an observable such that $O(x_n) = \langle \beta_n | O | \alpha_n \rangle$. We compute the mean value $\bar{O}$ exactly from the MPO ansatz.
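As a concrete illustration, the estimator above can be sketched in a few lines of plain NumPy (our own example, not the paper's implementation; here the mean is estimated from the samples themselves rather than computed exactly from the MPO ansatz). We check it on a synthetic AR(1) chain, whose lag-$t$ autocorrelation is $0.5^t$ by construction:

```python
import numpy as np

def autocorrelation(samples, t_max):
    """Normalized sample autocorrelation A_O(t), t = 0..t_max,
    for a chain of Monte Carlo measurements O(x_n)."""
    o = np.asarray(samples, dtype=float)
    o = o - o.mean()                  # sample mean (the paper uses the exact MPO mean)
    var = np.dot(o, o) / len(o)
    acf = np.empty(t_max + 1)
    for t in range(t_max + 1):
        acf[t] = np.dot(o[:len(o) - t], o[t:]) / (len(o) - t) / var
    return acf

# Synthetic AR(1) chain: autocorrelation at lag t is 0.5^t by construction
rng = np.random.default_rng(0)
noise = rng.normal(size=100_000)
x = np.zeros(100_000)
for n in range(1, len(x)):
    x[n] = 0.5 * x[n - 1] + noise[n]
acf = autocorrelation(x, 5)
```

For the AR(1) test chain, `acf[0]` is exactly 1 and `acf[1]` is close to 0.5, confirming the estimator.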
In Fig. 14 we plot example autocorrelation functions for Markov chains of Monte Carlo samples of length $N_{\mathrm{MC}} = 10^6$ for the DQIM of different system sizes, from a converged steady-state MPO ansatz, where the chosen observables are the spin-averaged magnetizations, e.g. $m_z = \frac{1}{N}\sum_{j=1}^{N} \sigma^z_j$.
We draw a new Monte Carlo sample after a single left- or rightward sequential Metropolis sweep (in alternating directions), after an initial burn-in period of 1000 samples. For both system sizes, the autocorrelations are already below 0.01 after just a single sweep, and decrease further for larger numbers of sweeps. For N = 100, the recorded autocorrelations are on average smaller than for N = 8, owing to the larger number of Metropolis spin flips in a single sequential update. Due to the low sample autocorrelation of the sequential Metropolis update, we deemed it sufficient to accept new Monte Carlo samples after only a single sweep for the purpose of the simulations discussed in the main text.
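A minimal sketch of such a sequential single-site Metropolis sweep may look as follows (our own illustrative code, not the paper's implementation: amplitudes are recomputed from scratch rather than reusing cached partial matrix products, and we take the sampling weight to be $|\langle x|\rho\rangle|$ for simplicity). With bond dimension χ = 1 the weights factorize over sites, which gives an exactly known target distribution to test against:

```python
import numpy as np

def amplitude(A, x):
    """<x|rho> = tr[A(x_1) ... A(x_N)] for a translationally invariant ansatz."""
    M = A[x[0]]
    for xi in x[1:]:
        M = M @ A[xi]
    return np.trace(M)

def sequential_sweep(A, x, rng):
    """One rightward sequential Metropolis sweep of single-site updates,
    targeting p(x) ∝ |<x|rho>|.  Sketch only: our implementation instead
    reuses cached partial matrix products to avoid full recomputation."""
    d = len(A)                                  # local (vectorized) dimension
    x = list(x)
    p = abs(amplitude(A, x))
    for j in range(len(x)):
        proposal = x.copy()
        proposal[j] = int(rng.integers(d))      # propose a new local state at site j
        p_new = abs(amplitude(A, proposal))
        if p_new > p or rng.random() < p_new / p:   # Metropolis acceptance
            x, p = proposal, p_new
    return x

# χ = 1 toy check: weights factorize, so P(x_j = 0) = 2/(2+1) = 2/3 exactly
rng = np.random.default_rng(0)
A = [np.array([[2.0]]), np.array([[1.0]])]
x = [0, 0, 0, 0]
zeros, n_sweeps = 0, 4000
for _ in range(n_sweeps):
    x = sequential_sweep(A, x, rng)
    zeros += x.count(0)
frac0 = zeros / (n_sweeps * len(x))
```

The empirical frequency `frac0` converges to 2/3, as required by detailed balance for this factorized toy weight.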

B Efficient computation of the local estimator of the Lindbladian
In this section we describe an efficient procedure to calculate the local estimator of the Lindbladian, in which we endeavor to minimize costly matrix multiplications.We shall specialize our discussion to the case of at most 2-local interactions allowed by the Lindbladian superoperator.
To begin, it is helpful to split the Lindbladian into a 1-local and a 2-local part,

$$\mathcal{L} = \mathcal{L}^{(1)} + \mathcal{L}^{(2)},$$

where $\mathcal{L}^{(i)}$ contains only the $i$-local terms. Similarly, the local estimator of the Lindbladian decomposes as

$$\mathcal{L}_{\mathrm{loc}} = \mathcal{L}^{(1)}_{\mathrm{loc}} + \mathcal{L}^{(2)}_{\mathrm{loc}},$$

which allows us to treat the two cases separately. We shall now examine the 1-local case in detail. The 1-local part of the Lindbladian is separable and can therefore be written as

$$\mathcal{L}^{(1)} = \sum_{j=1}^{N} l^{(1)}_j, \qquad l^{(1)}_j = \mathbb{1}\otimes\cdots\otimes\mathbb{1}\otimes l^{(1)}\otimes\mathbb{1}\otimes\cdots\otimes\mathbb{1},$$

i.e. $\mathcal{L}^{(1)}$ can be expressed as a sum of tensor products $l^{(1)}_j$ of identities, with the exception of a single 1-body Lindbladian $l^{(1)}$ as the $j$-th factor in the product. For the 1-local part of the local estimator, we find

$$\mathcal{L}^{(1)}_{\mathrm{loc}}(x) = \sum_{j=1}^{N} \sum_{y} \prod_{i=1}^{N} \langle x_i | (l^{(1)}_j)_i | y_i \rangle \, \frac{\langle y|\rho\rangle}{\langle x|\rho\rangle}. \tag{27}$$

Observe that $\langle x_i | (l_j)_i | y_i \rangle = \langle x_j | l^{(1)} | y_j \rangle$ whenever $i = j$, and $\langle x_i | (l_j)_i | y_i \rangle = \delta_{x_i, y_i}$ otherwise. In the latter case we have $y_i = x_i$ by orthogonality of the basis states, which also implies a cancellation of the probability amplitudes in Eq. (27). Hence, the above expression simplifies to

$$\mathcal{L}^{(1)}_{\mathrm{loc}}(x) = \sum_{j=1}^{N} \sum_{y_j} \langle x_j | l^{(1)} | y_j \rangle \, \frac{\operatorname{tr}\!\left[ L_{j-1}\, A(y_j)\, R_{N-j} \right]}{\operatorname{tr}\!\left[ L_N \right]}, \tag{28}$$

where $L_{j-1}$ and $R_{N-j}$, defined via $L_j = \prod_{i=1}^{j} A(x_i)$ and $R_j = \prod_{i=N+1-j}^{N} A(x_i)$, are the products of the matrices to the left and right of the matrix $A(y_j)$, respectively. Note the recurrence relations $L_j = L_{j-1} A(x_j)$ and $R_j = A(x_{N+1-j}) R_{j-1}$, with $L_0 = R_0 = \mathbb{1}$. The local estimator in Eq. (28) can then be calculated efficiently, minimizing the total number of required matrix multiplications, via sequential updates of the partial matrix products $L_j$ and $R_j$. This procedure has a worst-case computational scaling of ∼16Nχ³, compared to ∼4N²χ³ in a naive approach. It can be reduced even further, to ∼8Nχ³, if one already has access to the sets of partial matrix products $\{L_j\}_{j=1}^{N}$ and $\{R_j\}_{j=1}^{N}$. The latter is the approach used in our numerical implementation, where the sets $\{L_j\}_{j=1}^{N}$ and $\{R_j\}_{j=1}^{N}$ are constructed and memorised in advance, as part of prior left- or rightward sweeps.
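To make the bookkeeping concrete, the partial-product recurrences and the simplified sum in Eq. (28) can be sketched as follows (an illustrative dense-matrix NumPy version of ours, not the optimized sweeping implementation; `l1` denotes the one-body Lindbladian matrix with elements $\langle x_j|l^{(1)}|y_j\rangle$, and the toy data at the end are hypothetical):

```python
import numpy as np

def partial_products(A, x):
    """L_j = A(x_1)...A(x_j) and R_j = A(x_{N+1-j})...A(x_N),
    built via the recurrences L_j = L_{j-1} A(x_j) and
    R_j = A(x_{N+1-j}) R_{j-1}, with L_0 = R_0 = identity."""
    chi = A[0].shape[0]
    L, R = [np.eye(chi)], [np.eye(chi)]
    for j in range(len(x)):
        L.append(L[-1] @ A[x[j]])
        R.append(A[x[len(x) - 1 - j]] @ R[-1])
    return L, R                       # L[j] holds L_j, R[j] holds R_j

def local_estimator_1local(A, l1, x):
    """Simplified 1-local local estimator, Eq. (28):
    sum_j sum_{y_j} <x_j|l1|y_j> tr[L_{j-1} A(y_j) R_{N-j}] / tr[L_N]."""
    N, d = len(x), len(A)
    L, R = partial_products(A, x)
    total = 0.0
    for j in range(N):                # 0-based site index
        for yj in range(d):
            elem = l1[x[j], yj]
            if elem != 0.0:           # sparse/diagonal l1 skips contractions
                total += elem * np.trace(L[j] @ A[yj] @ R[N - 1 - j])
    return total / np.trace(L[N])

# Hypothetical toy data: d = 2 (vectorized) local states, bond dimension χ = 2
rng = np.random.default_rng(1)
d, chi, N = 2, 2, 3
A = [np.eye(chi) + 0.3 * rng.normal(size=(chi, chi)) for _ in range(d)]
l1 = rng.normal(size=(d, d))
x = [0, 1, 0]
est = local_estimator_1local(A, l1, x)
```

The `elem != 0.0` guard reflects the sparsity and diagonality simplifications exploited in the text; the result agrees with a brute-force evaluation of the full sum over sub-states $y$.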
Another important simplification occurs for the diagonal elements of $l^{(1)}$, for which $y_j = x_j$, causing the probability amplitudes in Eq. (28) to cancel out. This simplification is valuable, as it eliminates some of the tensor contractions entirely. For some operators, the one-body Lindbladian can in fact be dominated by diagonal elements, as is the case with e.g. the spin-decay Lindblad jump operator $\Gamma_k = \sigma^-_k$, whose corresponding $l^{(1)}$ matrix contains 4 non-zero elements, of which 3 lie on the diagonal.
The above discussion generalizes straightforwardly to the 2-local case, where we may write the Lindbladian as a sum of tensor products of identities and 2-body operators $l^{(2)}$, and where the sums over $y_j$ are replaced by sums over pairs of $y_j$ and $y_{j+1}$. The exception occurs at the boundary, where the operator $l^{(2)}_N$ connects the pair of spins labeled by $y_N$ and $y_1$ (which are nevertheless still 2-local due to the periodic boundary conditions). In the same vein, we can generalize this procedure to arbitrary $n$-body or $n$-local interactions, albeit at a rapidly increasing size of the local $l^{(n)}$ matrices or the requirement for additional matrix multiplications in connecting non-local couplings.
We point out that comparably performant algorithms for the computation of the local estimator and log-derivatives of the Lindbladian may also be devised based on a direct contraction of the Lindbladian and differential operators with the MPO density matrix, without an explicit decomposition over the $\{y\}$ sub-states (first equalities in Eqs. (10) and (13)), albeit at possibly higher memory costs and without enabling the discussed simplifications for non-local interactions that arise from the diagonality of the Lindbladian operators.

C Efficient computation of the gradient
In this section we discuss how to efficiently compute the gradients in Eq. (14). The main difference from computing the local estimator stems from the added presence of the log-derivative tensor $\Delta^{s}_{uv}(y)$. As such, we must first discuss how to differentiate a translationally-invariant MPO with respect to a tensor element. The probability amplitude of a translationally-invariant MPS is given by

$$\langle\alpha|\psi\rangle = \operatorname{tr}\!\left[ A(\alpha_1) A(\alpha_2) \cdots A(\alpha_N) \right]. \tag{30}$$

Differentiating with respect to the matrix element $a^{s}_{uv}$ of $A(s)$ yields

$$\frac{\partial\,\langle\alpha|\psi\rangle}{\partial a^{s}_{uv}} = \sum_{j=1}^{N} \delta_{s,\alpha_j} \left[ R_{N-j}\, L_{j-1} \right]_{vu} = \sum_{j=1}^{N} \delta_{s,\alpha_j} \sum_{w} r_{vw}\, l_{wu}, \tag{31}$$

where the lower-case $l$ and $r$ denote the matrix elements of the corresponding partial matrix products defined earlier, $L_j = [l_{uv}]$ and $R_j = [r_{uv}]$. This result differs from the one stated in Eq. (9) of [107] and is found to lead to more stable convergence. The above carries over to an MPO when vectorized into an MPS.
Hence, to efficiently compute the elements of the log-derivative tensor $\Delta^{s}_{uv}(y)$, we first compute the set of all left and right partial matrix products $\{L_j\}_{j=1}^{N}$ and $\{R_j\}_{j=1}^{N}$ for the state $y$ recursively, from which

$$\Delta^{s}_{uv}(y) = \frac{1}{\langle y|\rho\rangle} \sum_{j=1}^{N} \delta_{s,y_j} \left[ R_{N-j}\, L_{j-1} \right]_{vu} \tag{32}$$

follows directly. The total computational cost amounts to 2Nχ³ from the computation of the left and right matrix products, and a further Nχ³ from the computation of the numerator in Eq. (32), for a total cost of 3Nχ³. The cost of computing the denominator is neglected, as it is already known from the earlier computation of the cost function.
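For concreteness, this computation may be sketched as follows (our illustrative dense NumPy version, with the trace amplitude playing the role of $\langle y|\rho\rangle$ and hypothetical toy data; we verify it against a finite-difference derivative of the amplitude):

```python
import numpy as np

def log_derivatives(A, y):
    """Log-derivative tensor Δ^s_{uv}(y) = (1/<y|ρ>) Σ_j δ_{s,y_j} [R_{N-j} L_{j-1}]_{vu}
    for a translationally invariant ansatz, via cached partial products."""
    N, d, chi = len(y), len(A), A[0].shape[0]
    L, R = [np.eye(chi)], [np.eye(chi)]      # 2Nχ³ cost for both sets
    for j in range(N):
        L.append(L[-1] @ A[y[j]])
        R.append(A[y[N - 1 - j]] @ R[-1])
    delta = np.zeros((d, chi, chi))
    for j in range(N):                       # a further Nχ³ for the numerators
        # ∂ tr[L_{j-1} A(s) R_{N-j}] / ∂ a^s_{uv} = [R_{N-j} L_{j-1}]_{vu}
        delta[y[j]] += (R[N - 1 - j] @ L[j]).T
    return delta / np.trace(L[N])            # denominator <y|ρ> = tr L_N

# Hypothetical toy data: d = 2 local states, bond dimension χ = 2, N = 4 sites
rng = np.random.default_rng(2)
d, chi, N = 2, 2, 4
A = [np.eye(chi) + 0.3 * rng.normal(size=(chi, chi)) for _ in range(d)]
y = [0, 1, 1, 0]
delta = log_derivatives(A, y)
```

Note that repeated occurrences of the same local state $y_j = s$ accumulate into the same slice `delta[s]`, as required by translational invariance.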
With $\Delta^{s}_{uv}(y)$ known, the computation of the ensemble averages in Eq. (14) can again be performed in a sweeping manner, analogous to the computation of the cost function as described in Appendix B.

D Parameters and hyperparameters
In the table below we collect the parameter and hyperparameter values of all calculations discussed in the main text. The column $N_{\mathrm{MC}}$ lists the number of Monte Carlo samples per Markov chain, per iteration. The column $N_{\mathrm{work.}}$ lists the number of independent Markov chains per iteration (distributed over different MPI processes). The column $N_{\mathrm{iter.}}$ lists the total number of iterations completed before the optimization process was terminated. In all simulations, the iteration step size δ decreases geometrically with the iteration number $k$ as $\delta = \delta_0 F^k$; the values of $\delta_0$ and $F$ used are listed in the final two columns.

Figure 1 :
Figure 1: Schematic of an open quantum spin chain with periodic boundary conditions, as an array of two-level systems. Each spin experiences a coherent drive with strength h and incoherent decay with strength $\gamma_l$ from coupling with the dissipative bath, and interacts with its nearest neighbors with strength J. Additionally, the spins may experience dephasing from contact with the dissipative bath, which may be local ($\gamma_d^{\mathrm{loc}}$) or collective ($\gamma_d^{\mathrm{col}}$).

Figure 2 :
Figure 2: (a) The translationally-invariant MPO with periodic boundary conditions in the matrix element of ρ in Eq. (6), expressed in Penrose graphical notation. Lines connecting two nodes imply a tensor contraction. (b) The MPS in the coefficient of |ρ⟩ in Eq. (7) after vectorization of the MPO, where we merged the on-site physical indices as $x_j = (\alpha_j, \beta_j)$.

Figure 3 :
Figure 3: Graphical summary of the VMPOMC algorithm. (Inside gray box, clockwise) Sampling loop: given a set of variational parameters a, we draw a new sample x via a leftward sequential Metropolis update, in the process of which we generate a set of right matrix products. We reuse these when computing the local estimator and local gradient tensor via a rightward sweep through the chain, during which we generate a corresponding set of left matrix products. These are then reused when drawing the next sample x. (Outside gray box, clockwise) Optimization loop: after a sufficient number of Monte Carlo samples has been gathered, the computed quantities are used to find the gradient vector and metric tensor, which are subsequently used to update the variational parameters a. We iterate this process until convergence to the steady state.

Figure 5 :
Figure 5: Sample optimization process with SGD (blue) and SR (red, ϵ = 0.01) over 1000 iterations for the DQIM with N = 12 at h = 0.5, starting from the same random initial MPO. We plot (a) the evolution of the cost function, (b) the steady-state x- and (c) z-magnetizations, as well as (d) the purity of the variational steady state. Shown via black horizontal lines in plots (b)-(d) are the true values of the corresponding observables, obtained via an exact diagonalization of the Lindbladian. The recorded wall times for the 1000 iterations and associated measurements on an Intel i7-12700K machine were 65 s for SGD and 88 s for SR.

Figure 6 :
Figure 6: Convergence of VMPOMC for the DQIM with N = 12 (blue crosses) and N = 100 (red crosses) spins at h = 1.0 as a function of the bond dimension χ, showing the final recorded values of (a) the cost function, (b) the steady-state x- and (c) z-magnetizations, and (d) the steady-state Rényi-2 entropy. Black horizontal lines in (b)-(d) indicate the corresponding exact steady-state expectation values for N = 12. Insets: relative errors between the variational and exact results for N = 12 and χ ≥ 6, as a function of χ.

Figure 7 :
Figure 7: The real (left column) and imaginary (right column) parts of the variational (top row) and exact (bottom row) steady-state density matrices of the DQIM with N = 6 at h = 1.5. The former is obtained after 1000 SR iterations, starting from a random initial MPO with χ = 6; the latter is obtained via exact diagonalization of the Lindbladian superoperator. The Uhlmann fidelity between the variational and exact density matrices is in excess of 0.99997.

Figure 9 :
Figure 9: (a-c) Steady-state magnetizations and (d) Rényi-2 entropy for the dissipative quantum Ising chain with $\gamma_- = h = 2J = 1.0$ as a function of the inverse system size, with N between 8 and 100, obtained via VMPOMC for PBC and t-MPS for OBC, with χ = 10 throughout. The solid lines are extrapolations of the VMPOMC results to the thermodynamic limit.

Figure 10 :
Figure 10: Convergence of VMPOMC for the DQIM with long-ranged interactions with α = 2 for N = 12 (blue crosses) and N = 100 (red crosses) at h = 1.0, as a function of the bond dimension χ, showing the final recorded values of (a) the cost function, (b) the steady-state x- and (c) z-magnetizations, and (d) the Rényi-2 entropy. Black horizontal lines in (b)-(d) indicate the corresponding exact steady-state expectation values for N = 12. Insets: relative errors between the variational and exact results for N = 12 and χ ≥ 6, as a function of χ.

Figure 12 :
Figure 12: (a-c) Steady-state magnetizations and (d) Rényi-2 entropy for the dissipative quantum Ising chain with $\gamma_- = h = 2J = 1.0$ and long-ranged dipolar (α = 3) interactions, as a function of the inverse system size, with N between 8 and 100, obtained via VMPOMC for PBC with χ = 10 throughout. The solid lines are extrapolations of the VMPOMC results to the thermodynamic limit.

Figure 14 :
Figure 14: Autocorrelation functions between successive Monte Carlo samples for the DQIM with N = 8 and N = 100 at h = 0.3, computed from converged MPO ansätze with $N_{\mathrm{MC}} = 10^6$ samples.