Mixed state entanglement classification using artificial neural networks

Reliable methods for the classification and quantification of quantum entanglement are fundamental to understanding its exploitation in quantum technologies. One such method, known as separable neural network quantum states (SNNS), employs a neural network inspired parameterization of quantum states whose entanglement properties are explicitly programmable. Combined with generative machine learning methods, this ansatz allows for the study of very specific forms of entanglement which can be used to infer/measure entanglement properties of target quantum states. In this work, we extend the use of SNNS to mixed, multipartite states, providing a versatile and efficient tool for the investigation of intricately entangled quantum systems. We illustrate the effectiveness of our method through a number of examples, such as the computation of novel tripartite entanglement measures, and the approximation of ultimate upper bounds for qudit channel capacities.

The core tasks of entanglement classification [1][2][3] and quantification [4][5][6] are essential for future quantum technologies, and ask two seemingly straightforward questions: Given a quantum state ρ, is it entangled? If so, by how much is it entangled? As the size or dimension of a quantum system grows, these questions become highly non-trivial, and in general there are no universal criteria or methods to answer them. The most popular mathematical recipe for classification, the Positive Partial Transposition (PPT) criterion (or Peres-Horodecki criterion) [7,8], is necessary and sufficient only for (2⊗2) or (2⊗3) bipartite systems. As one extends to multipartite, higher-dimensional quantum systems, more sophisticated tools are required.
The application of classical machine learning tools, such as Artificial Neural Networks (ANNs), to the study of quantum systems has seen a surge of interest due to their remarkable expressive power and efficiency [9][10][11]. In particular, Carleo and Troyer [12] showed that Restricted Boltzmann Machines (RBMs) offer an appropriate classical representation of quantum states, due to their ability to perform dimensionality reduction, their non-local information distribution, and their optimisation capacity [13]. Ansatzes based on this architecture are known as Neural Network Quantum States (NNS), and they have been a successful classical simulation tool in a variety of contexts such as tomography [14][15][16][17], open quantum system dynamics [18][19][20][21][22], and the simulation of quantum computing [23][24][25].
The versatility of NNS also provides an excellent framework for the study of entanglement [26]. As introduced for pure, qubit states in Ref. [27], it is possible to manipulate and constrain these neural networks in a way that guarantees a strict form of separability. These constrained variational states are known as Separable Neural Network States (SNNS). Combined with a quantum state reconstruction algorithm, this introduces a unique entanglement witness protocol based on the reconstructive performance of an SNNS with respect to a target state.
In this paper, we generalise these results to mixed, d-dimensional quantum states. We show how SNNS can be used to perform highly specific entanglement classification, and to approximate entanglement measures to a very high degree of accuracy. The ability to implicitly characterise the space of separable states is extremely valuable, and allows one to compute entanglement measures that are otherwise extremely difficult to evaluate, such as the Relative Entropy of Entanglement (REE) [28].
This paper is structured as follows: In Section I we review the NNS architecture and its variants for pure and mixed states. Section II overviews separable architectures, and shows how specific forms of entanglement can be guaranteed. In Section III the methods of classification and quantification using SNNS are discussed. Section IV provides numerical evidence for their utility through a number of relevant examples, with applications in the study of noisy tripartite entanglement, bound entanglement, and quantum channel capacities. Finally, conclusions and future directions are addressed in Section V.

A. Pure states
The simplest neural network model we can introduce is the positive, real NNS. This model uses a real-valued restricted Boltzmann machine (RBM) architecture, with $n_v$ visible units $s = \{s_1, \ldots, s_{n_v}\}$, one for each qudit being modelled within the target quantum system, fully interconnected with $n_h$ hidden units $h = \{h_1, \ldots, h_{n_h}\}$. The visible units are typically binary valued to study $d = 2$ dimensional systems, $s_i \in \{-1, 1\}$, as are the hidden units, $h_j \in \{-1, 1\}$; however, this depends on the system being modelled. This network architecture allows us to capture the correlations of the target quantum system through the network parameters
$$\Pi = \{a, b, W\},$$
where $a$ are visible biases, $b$ are hidden biases, and $W$ is the network weight matrix. The total number of parameters is $|\Pi| = n_h n_v + n_h + n_v$ (see Fig. 1). The inherent advantage offered by the RBM architecture for generative modelling is that there are no intralayer connections (i.e. there are no connections between visible units or between hidden units). The hidden units can therefore be summed over analytically, giving an ansatz that is independent of the activations of the hidden state space. Thus, one can define a positive NNS wavefunction as [12]
$$\Psi_\Pi(s) = \sum_{h} e^{\sum_k a_k s_k + \sum_j b_j h_j + \sum_{k,j} W_{kj} s_k h_j} = e^{\sum_k a_k s_k} \prod_{j=1}^{n_h} 2\cosh\Big(b_j + \sum_k W_{kj} s_k\Big),$$
and therefore the NNS is
$$|\Psi_\Pi\rangle = \sum_s \Psi_\Pi(s)\, |s\rangle.$$
Whilst NNS have typically been applied to qubit systems using binary visible units, one can extend the modelling to d-dimensional qudits by using a set of visible binary neurons that collectively represent a single qudit [17]. One may choose to encode d-dimensional states using a collection of $d$ visible, binary neurons via an encoding function $C$ that maps each qudit basis state $|s\rangle$ to a string of $d$ binary values.
The $n_v$-qudit visible layer can then be encoded into $\tilde{n}_v = d\, n_v > n_v$ visible neurons,
$$s = \{s_1, s_2, \ldots, s_{n_v}\} \rightarrow \{g_1, g_2, \ldots, g_{\tilde{n}_v}\}.$$
We may identically define the qudit decoding function $C^{-1}$ such that $C^{-1}(g) = |s\rangle$. One may encode qudits into binary codes on the visible layer, $|s\rangle \rightarrow \mathrm{bin}(s)$, requiring $\tilde{n}_v = n_v \log_2 d$ visible binary neurons; this, however, requires $d = 2^r$ for some integer $r$ in order to admit a complete basis set. For arbitrary $d$ it may be more useful to utilise one-hot encoding, $|s\rangle \rightarrow \mathrm{onehot}(s) = e^d_s$, where $e^d_s$ is a $d$-length vector that is zero at all indices except index $s$.
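As an illustration of the one-hot encoding described above, the following minimal sketch maps a register of qudit values to $d n_v$ binary neurons and back. The function names `onehot_encode` and `onehot_decode` are hypothetical, and the neurons are taken to be 0/1 valued here purely for convenience.

```python
import numpy as np

def onehot_encode(qudits, d):
    """Map n_v qudit values in {0, ..., d-1} to d * n_v binary neurons."""
    qudits = np.asarray(qudits)
    code = np.zeros((qudits.size, d), dtype=int)
    code[np.arange(qudits.size), qudits] = 1  # one active neuron per qudit
    return code.reshape(-1)

def onehot_decode(code, d):
    """Inverse map: recover the qudit values from the binary neurons."""
    return np.asarray(code).reshape(-1, d).argmax(axis=1)

s = [2, 0, 1]                # three qutrits (d = 3)
g = onehot_encode(s, d=3)    # nine binary visible neurons
s_back = onehot_decode(g, d=3)
```

Each qudit occupies its own block of $d$ neurons, so the encoding works for any $d$, at the cost of a visible layer that grows linearly in $d$.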
In order to study non-positive quantum states one can introduce complex network parameters. Letting the weights and hidden biases have real and imaginary parts $(\Gamma_{kj}, \Lambda_{kj})$ and $(\gamma_j, \lambda_j)$ respectively, the hidden-unit activations split into $\theta_j = \theta^\gamma_j + i\theta^\lambda_j$, where $\theta^\gamma_j = \sum_k \Gamma_{kj} s_k + \gamma_j$ and $\theta^\lambda_j = \sum_k \Lambda_{kj} s_k + \lambda_j$. Thus the NNS can exhibit the phase properties of quantum states, at the cost of extending the network parameter set to include both components. Alternatively, one can preserve the reality of the network parameters by restructuring the NNS ansatz itself. In particular, we can construct an ansatz that uses two RBMs which together represent a complete state. Defining a variational phase state $\Phi_\Xi(s)$ and amplitude state $\Psi_\Pi(s)$, this network ansatz is given as [14]
$$|\Psi_{\Pi,\Xi}\rangle = \sum_s e^{i \log \Phi_\Xi(s)}\, \Psi_\Pi(s)\, |s\rangle. \qquad (7)$$
Therefore both the variational phase and amplitude networks need only be real valued, since the phase properties of the state are managed through the complex exponential. The state is now defined by two parameter sets, $\Pi = \{a_k, b_j, W_{kj}\} \in \mathbb{R}$ and $\Xi = \{c_k, d_j, U_{kj}\} \in \mathbb{R}$.
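The absence of intralayer connections is what lets the hidden layer of the positive RBM ansatz be traced out in closed form. The sketch below checks this numerically for a small network, assuming the standard unnormalised RBM weight $e^{a \cdot s + b \cdot h + s^T W h}$ with $\pm 1$ units; the sign convention and the function names are assumptions, not taken from the text.

```python
import itertools
import numpy as np

def amplitude_bruteforce(s, a, b, W):
    """Sum the Boltzmann weight over every hidden configuration h in {-1, +1}^n_h."""
    total = 0.0
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        total += np.exp(a @ s + b @ h + s @ W @ h)
    return total

def amplitude_closed_form(s, a, b, W):
    """Same quantity after summing out the hidden layer analytically."""
    theta = b + s @ W  # theta_j = b_j + sum_k W_kj s_k
    return np.exp(a @ s) * np.prod(2.0 * np.cosh(theta))

rng = np.random.default_rng(0)
n_v, n_h = 3, 4
a = rng.normal(scale=0.3, size=n_v)
b = rng.normal(scale=0.3, size=n_h)
W = rng.normal(scale=0.3, size=(n_v, n_h))

configs = [np.array(s) for s in itertools.product([-1, 1], repeat=n_v)]
brute = np.array([amplitude_bruteforce(s, a, b, W) for s in configs])
closed = np.array([amplitude_closed_form(s, a, b, W) for s in configs])
n_params = n_h * n_v + n_h + n_v  # |Pi| for this network
```

The brute-force sum costs $2^{n_h}$ terms per visible configuration, whereas the product form costs only $n_h$ hyperbolic cosines, which is why the closed form is the one used in practice.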

B. Mixed States
Extending the variational ansatz to mixed states requires the addition of a hidden mixing layer with $n_m$ hidden units, capable of encoding the classical probability distribution of the mixed quantum state [19][20][21]. The network state can be constructed from two sets of variational network parameters: $\Pi = \{c_p, U_{kp}\}$, with $c_p \in \mathbb{R}^{n_m}$ and $U_{kp} \in \mathbb{C}^{n_v \times n_m}$, encoding the mixing probabilities [29], and the previously defined $\Xi = \{a_k, b_j, W_{kj}\} \in \mathbb{C}$, which encodes the pure-state probability distribution. Let the density-matrix row and column degrees of freedom be described by basis vectors $\alpha$ and $\beta$ respectively. As these parameter sets are independent, we may describe a density-matrix element as a contribution from a classical mixing state $P_\Pi$ and a pure state $\sigma_\Xi$. The classical mixing network contributes a factor $P_\Pi(\alpha, \beta)$, which depends on the mixing weights $U_{kp}$ and their complex conjugates (with $x^*$ denoting complex conjugation), while the pure state contributes a factor $\sigma_\Xi(\alpha, \beta)$. The complete variational state can therefore be constructed as a sum over all density-matrix elements,
$$\rho_{\Pi,\Xi} = \sum_{\alpha,\beta} P_\Pi(\alpha,\beta)\, \sigma_\Xi(\alpha,\beta)\, |\alpha\rangle\langle\beta| = P_\Pi \odot \sigma_\Xi,$$
where $\odot$ is the Hadamard product. It is important to emphasise that the classical mixing state $P_\Pi$ cannot capture quantum correlations, only classical correlations. Hence the pure state $\sigma_\Xi$ alone simulates the quantum correlations within the network state. This architecture is presented in Fig. 2. The network parameters in this ansatz are necessarily complex, but one can reformulate the ansatz in order to use only real parameters. One could use the NNS of Eq. (7) to learn a vectorised density-matrix $|\rho_{\Pi,\Xi}\rangle$. Whilst optimal convergence towards the target vectorised mixed state is possible in this way, the ansatz itself is neither Hermitian nor positive semi-definite under reshaping to a density-matrix, i.e. $\rho = \mathrm{vec}^{-1}(|\rho_{\Pi,\Xi}\rangle)$ is not a valid density-matrix.
Instead, one can restructure the mixed state ansatz so that it takes a form closer to the complex exponential format utilised in the previous sections. Let the real parameter sets $\Xi$, $\Pi$ be used to describe the pure-state phase and amplitude networks respectively, and the complex parameter set $\Omega$ be used to describe the mixing network. Recall a pure state wavefunction in complex exponential form, $\Psi_{\Pi,\Xi}(\alpha) = e^{i \log \phi_\Xi(\alpha)}\, \sigma_\Pi(\alpha)$. It is useful to define corresponding phase and amplitude functions for the pure density-matrix built from these wavefunctions. In order to incorporate the classical mixing we need a mixing layer that takes a similar vectorised form. Omitting the visible biases, which are already possessed by the pure states, the mixing layer is expressed in terms of $R_{kp} = \mathrm{Re}(U_{kp})$ and $I_{kp} = \mathrm{Im}(U_{kp})$, the real and imaginary components of the mixing network respectively. One can then construct phase and amplitude functions $\vartheta_\Omega$ and $r_\Omega$ for the classical mixing, such that the vectorised mixing state takes the form $e^{i \log |\vartheta_\Omega\rangle} |r_\Omega\rangle$. This allows any element of the complete mixed state to be expressed as a product of the pure-state and mixing phase and amplitude functions [Eq. (20)].

II. SEPARABLE NEURAL NETWORK ARCHITECTURES

A. Separable Pure Network States
Through restrictions on the connectivity of the weight matrix $W_{kj}$, one can guarantee separability of the generative network state. Let us define $K$ as a collection of subsets $k_l$ that collect qudit indices from an n-qudit system. More precisely, $K = \{k_l\}_{l=1}^{K}$ with
$$\bigcup_{l} k_l = \{1, \ldots, n\}, \qquad (21)$$
$$k_l \cap k_m = \emptyset \quad \forall\, l \neq m. \qquad (22)$$
In Eq. (21) we have demanded that the global partition set necessarily contains all n qudits in the system, and in Eq. (22) that the subsets of qudits are disjoint. Hence, an n-qudit pure state $|\Psi\rangle$ is defined to be K-separable if it can be expressed as a tensor product of sub-states, $|\Psi\rangle = \bigotimes_{k \in K} |\psi_k\rangle$, i.e. it is separable with respect to the partition set $K$. This is a very precise form of separability, as it specifies exactly the arrangement of entangled parties. If we were to disregard specific party orderings we would refer to $(|K| = K)$-separability.
Disjointedness in this definition of K-separability ensures that each qudit is only entangled with respect to a single subset of the quantum system. This provides a specific level of detail to the entanglement structure, while also excluding many forms of entanglement that we may not be interested in. For example, genuine tripartite entanglement under disjoint K-separability allows for only a single set $K = \{k_1\} = \{1, 2, 3\}$ with no partitions. We may then define non-disjoint K-separability as an extension of the previous definition simply by removing the condition in Eq. (22). Using this non-disjoint definition, genuine tripartite entanglement allows for many more definitions, $K = \{1, 2, 3\}, \{1, 2|2, 3\}, \{1, 2|2, 3|1, 3\}, \ldots$, which are studied in later sections.
To strictly impose either type of separability on an NNS, the goal is to express the wavefunction of the network state in the form
$$\Psi_\Pi(s) = \prod_{l=1}^{K} \psi^{k_l}_\Pi(s),$$
where $\psi^{k_l}_\Pi$ are separable sub-wavefunctions that describe the behaviour of the qudits in the partition $k_l$. We may then construct an analogous hidden-layer partition set $H = \{h_l\}_{l=1}^{K}$, which assigns a subset of hidden units to each visible subset of entangled qudits in $K = \{k_l\}_{l=1}^{K}$. The hidden layer is segmented into these $K$ subsets and the weight matrix is restricted according to
$$W_{kj} = 0 \quad \text{unless } k \in k_l \text{ and } j \in h_l \text{ for some } l. \qquad (24)$$
This condition then provides the complete, K-separable network state.
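The weight-matrix restriction can be realised as a binary mask on $W$. The following sketch, which assumes the standard positive RBM wavefunction and uses hypothetical helper names, builds a mask for the three-qubit partition $\{1, 2|3\}$ (zero-indexed below) and confirms through the Schmidt coefficients that the resulting state is a product across the cut, whereas an unmasked network generically is not.

```python
import itertools
import numpy as np

def separable_mask(visible_parts, hidden_parts, n_v, n_h):
    """Allow W_kj only when qudit k and hidden unit j share a partition label."""
    mask = np.zeros((n_v, n_h))
    for k_l, h_l in zip(visible_parts, hidden_parts):
        mask[np.ix_(k_l, h_l)] = 1.0
    return mask

def rbm_wavefunction(a, b, W):
    """Unnormalised positive RBM amplitudes over all visible configurations."""
    configs = np.array(list(itertools.product([-1, 1], repeat=len(a))))
    theta = b + configs @ W
    return np.exp(configs @ a) * np.prod(2.0 * np.cosh(theta), axis=1)

rng = np.random.default_rng(1)
n_v, n_h = 3, 4
K = [[0, 1], [2]]      # visible partition {1,2|3}, zero-indexed
H = [[0, 1, 2], [3]]   # hidden units assigned to each visible subset
a = rng.normal(size=n_v)
b = rng.normal(size=n_h)
W_free = rng.normal(size=(n_v, n_h))
W_sep = W_free * separable_mask(K, H, n_v, n_h)

# Rows index qubits (1,2), columns index qubit 3.
schmidt_sep = np.linalg.svd(rbm_wavefunction(a, b, W_sep).reshape(4, 2), compute_uv=False)
schmidt_free = np.linalg.svd(rbm_wavefunction(a, b, W_free).reshape(4, 2), compute_uv=False)
```

A single non-zero Schmidt coefficient for the masked network indicates a product state across the $\{1,2\}|\{3\}$ cut; the mask leaves the first two qubits free to be entangled with each other.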

B. Separable Neural Network Density Matrices
Whilst pure states are K-separable when they can be expressed as the tensor product of $|K| = K$ local substates, a mixed state possesses a form of separability iff it can be expressed as a convex combination of local substates $\rho^{\{k_l\}_{l=1}^{K}}$. It is now useful to define two distinct forms of separability: consistent and inconsistent mixed multipartite separability.
A state is consistently K-separable if it can be expressed as a convex combination of states which all admit an identical form of separability,
$$\rho = \sum_j p_j \bigotimes_{l=1}^{K} \rho^{k_l}_j.$$
On the contrary, a state is inconsistently $\{K_j\}$-separable if it is a mixture of states with different entanglement properties, so its entanglement properties are defined by a combination of constituent $K_j$-separabilities. Precise classification methods are much more difficult for mixed states; however, there are still some very useful approaches that can be introduced using NNS. Consistently K-separable states require a direct application of the separability conditions given by Eq. (24) to the pure state of the NNS. Since the mixing state cannot capture quantum correlations, it is already separable and requires no restrictions. It is thus sufficient to apply the separability conditions of Eq. (24) to the pure states of the mixed NNS, restricting the capacity of the neural network to simulate quantum correlations. Enforcing separability on the pure density-matrix in this way provides an NNS guaranteed to be consistently K-separable. If one wishes to enforce complete separability, such that for an n-qudit state $\rho = \sum_j p_j \bigotimes_{m=1}^{n} \rho^m_j$, one can of course just apply consistent separability to the network state via the separability set $K = \{1|2|\ldots|n\}$ in an identical manner as before. However, as the state is completely separable, there are no quantum correlations, and the pure states in the network ansatz are not necessary for simulation of the state. The ansatz can then be simplified to $\rho_\Pi = P_\Pi$, and we can simulate completely separable mixed quantum systems using an RBM with a classical mixing layer only [30].
Unfortunately, it is not possible to strictly classify an inconsistently separable mixed state according to the ansatzes discussed in this Section. Take the tripartite example of a mixture of states that are each separable with respect to a different bipartition of the three parties, which can be thought of as a "cheap" genuine tripartite entangled state. We can certainly define an NNS that can reconstruct a state of this form (trivially, one can utilise a fully connected NNS that can reconstruct $\rho$); however, we cannot specify all three forms of separability in $\rho$ without also allowing the NNS to potentially manifest genuine, pure tripartite entanglement. One can instead utilise independent, consistently separable NNS according to the partitions $\{1, 2|3\}$, $\{1, 3|2\}$ and $\{2, 3|1\}$ in order to quantify the amount of entanglement in the target state with respect to each partition.

III. CLASSIFYING AND QUANTIFYING ENTANGLEMENT

A. Learning of Quantum States
We present a learning protocol for a pure NNS $|\Psi_{\Pi,\Xi}\rangle$ to reconstruct a target state $|\phi\rangle$ using the ansatz from Eq. (7), which is then extendible to mixed states. We employ a unified learning approach, where the variational state optimises the global, vectorised fidelity with a target state, rather than separate phase and amplitude fidelities. We may define the loss function as the negative logarithmic fidelity between two pure states as a function of our set of variational parameters,
$$C(\Pi, \Xi) = -\log F(\phi, \Psi_{\Pi,\Xi}).$$
Splitting these wavefunctions into their respective phase and amplitude functions, we wish to compute the derivatives of the unified cost function with respect to the parameter sets $\{\Pi, \Xi\}$. Since these wavefunctions utilise only real parameters, it is expedient to compute the derivatives using a chain rule formulation. Computing these gradients provides the necessary parameter update rule at the $m$th iteration for the $k$th network parameter by gradient descent, taking the form
$$\Pi_k^{(m+1)} = \Pi_k^{(m)} - \eta\, \frac{\partial C}{\partial \Pi_k}, \qquad (35)$$
and likewise for $\Xi_k$, where $\eta$ is a learning rate small enough that the network state converges to the target state over sufficient iterations of the learning scheme. The complete gradients with respect to the variational parameters [Eq. (38)] can be computed in terms of diagonal matrices containing the logarithmic derivatives of the network state with respect to the $k$th amplitude and phase network parameters respectively. Utilising Eq. (38) in the update rule given by Eq. (35), the phase and amplitude properties will optimise in a unified manner, maximising the fidelity between the network and a target state endowed with non-trivial phase structure.
This learning procedure is readily extended to mixed states via the ansatz in Eq. (20). Since the variational state is in a complex exponential format, one then formulates a cost function based on the fidelity between the vectorised density-matrix and the vectorised target state. The extension is straightforward and explained in Appendix A.
As shown in Ref. [27], separable neural network states can be used to perform entanglement classification and provide entanglement measures of pure, two-dimensional quantum states. Using qudit sub-encoding and the mixed state architectures discussed in the previous sections, these ideas can be extended to the classification of more complex quantum systems.
Let us devise a precise decision rule for classification. Consider a target n-qudit state $\sigma$, a K-separable learner $\rho^K_\Omega$, and a free, entangled learner $\rho^{\mathrm{Ent}}_\Omega$, which have both been optimised with respect to reconstructing $\sigma$. Using the Bures fidelity, $F(\sigma, \rho) = \mathrm{Tr}\sqrt{\sqrt{\sigma} \rho \sqrt{\sigma}}$, we denote the reconstruction fidelity of a learning process as the final/optimal fidelity achieved after a given number of learning iterations. A target $\sigma$ is learnable via $\rho^{\mathrm{Ent}}_\Omega$ iff its reconstruction fidelity satisfies
$$F(\sigma, \rho^{\mathrm{Ent}}_\Omega) \geq F_{\mathrm{opt}} = 1 - \epsilon$$
for a sufficiently small threshold $\epsilon$. The choice of $F_{\mathrm{opt}}$ determines the reliability of classification, and in our numerical experiments we fix $\epsilon \leq 10^{-4}$. The accuracy of this reconstruction via free learning also benchmarks the computational resources required in the network, informing the separable reconstruction. One can reliably infer that a target state is K-separable if it is learnable by both a free NNS ($\rho^{\mathrm{Ent}}_\Omega$) and a K-separable NNS ($\rho^K_\Omega$), i.e. if both reconstruction fidelities satisfy $F(\sigma, \rho^{\mathrm{Ent}}_\Omega) \geq F_{\mathrm{opt}}$ and $F(\sigma, \rho^K_\Omega) \geq F_{\mathrm{opt}}$. Otherwise, the state is entangled to a higher degree. One may then quantify the entanglement content of the target by investigating the distance between $\sigma$ and an approximation to the closest K-separable state.
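The structure of the update rule can be illustrated on a toy problem. The sketch below is not the RBM learning scheme itself: as a stand-in, it assumes a positive ansatz parameterised directly by log-amplitudes, $\psi_s = e^{\theta_s}$, and minimises the negative logarithmic fidelity with a fixed positive target by plain gradient descent. The target vector, learning rate and iteration count are arbitrary illustrative choices.

```python
import numpy as np

def neg_log_fidelity(theta, phi):
    """C(theta) = -log( <phi|psi>^2 / <psi|psi> ) for a normalised real target phi."""
    psi = np.exp(theta)
    return -np.log((phi @ psi) ** 2 / (psi @ psi))

def gradient(theta, phi):
    """Analytic gradient of the cost with respect to each log-amplitude."""
    psi = np.exp(theta)
    return -2.0 * phi * psi / (phi @ psi) + 2.0 * psi**2 / (psi @ psi)

phi = np.array([1.0, 2.0, 3.0, 4.0])
phi /= np.linalg.norm(phi)      # normalised target amplitudes

theta = np.zeros_like(phi)      # start from the uniform state
eta = 0.2                       # learning rate
initial_fidelity = np.exp(-neg_log_fidelity(theta, phi))
for m in range(5000):
    theta = theta - eta * gradient(theta, phi)   # theta^(m+1) = theta^(m) - eta * dC/dtheta

final_fidelity = np.exp(-neg_log_fidelity(theta, phi))
```

Because the cost is invariant under a global rescaling of $\psi$, the gradient components sum to zero and the optimisation does not drift along the normalisation direction.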
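The decision rule can be summarised in a few lines. The sketch below evaluates the Bures fidelity with eigendecompositions and applies the threshold test to a pair of reconstruction fidelities; the function names, the returned labels, and the example fidelity values passed to `classify` are hypothetical and purely illustrative.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semi-definite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)   # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def bures_fidelity(sigma, rho):
    """F(sigma, rho) = Tr sqrt( sqrt(sigma) rho sqrt(sigma) )."""
    root = psd_sqrt(sigma)
    inner = root @ rho @ root
    vals = np.linalg.eigvalsh((inner + inner.conj().T) / 2.0)
    return float(np.sum(np.sqrt(np.clip(vals, 0.0, None))))

def classify(fid_free, fid_sep, eps=1e-4):
    """Apply the learnability threshold to free and K-separable reconstructions."""
    if 1.0 - fid_free > eps:
        return "inconclusive: free learner did not converge"
    return "K-separable" if 1.0 - fid_sep <= eps else "entangled beyond K"

# Example: a Bell state against its fully dephased (separable) counterpart.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
sigma = np.outer(bell, bell)
dephased = np.diag([0.5, 0.0, 0.0, 0.5])
f_self = bures_fidelity(sigma, sigma)
f_sep = bures_fidelity(sigma, dephased)
verdict = classify(fid_free=0.99999, fid_sep=f_sep)
```

Requiring the free learner to pass the threshold first guards against attributing a failed separable reconstruction to entanglement when the network is simply under-resourced.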

B. Quantifying Entanglement
The most difficult aspect of quantifying entanglement stems from the complicated nature of characterising the space of separable quantum states. Thanks to the implicit guarantee of specific separability, SNNS offer an extremely useful tool to help with this, and provide the opportunity to study a variety of entanglement measures that are otherwise much too difficult to explore.
Let us consider measures $E$ that satisfy the general properties of a valid entanglement measure [31]. Many important types of $E$ are constructed as a geometric optimisation problem with respect to the space of all fully separable states $D_{\mathrm{Sep}}$. That is, given a target state $\sigma$ and a distance measure (possibly a quasi-distance measure) $f$,
$$E(\sigma) = \min_{\rho \in D_{\mathrm{Sep}}} f(\sigma, \rho). \qquad (41)$$
These are entanglement measures which are computed by locating the Closest Separable State (CSS) to $\sigma$ with respect to the distance measure $f$. For such measures, the employment of SNNS to parameterise the separable states $\rho_\Omega \in D_{\mathrm{Sep}}$ is extremely useful, as it offers an efficient way to perform this optimisation. Furthermore, since SNNS are inherently separable, they will always approximate an upper bound on $E$, since they are certifiably limited in the quantum correlations that they are able to simulate. That is, $E(\sigma) \leq f(\sigma, \rho_\Omega)$ for any separable network state $\rho_\Omega$. To generalise, we may construct a measure $E^K$ which is analogous to $E$, but is defined with respect to the space of all states which are at most K-separable. Defining the set of all states that are K-separable as $D_K$, we denote the set of all states that are at most K-separable by $\bar{D}_K$ [32]. Assuming a measure of the form of Eq. (41), we can then define
$$E^K(\sigma) = \min_{\rho \in \bar{D}_K} f(\sigma, \rho).$$
$E^K$ satisfies all the general properties of an entanglement measure, but now with respect to $\bar{D}_K$, and is therefore able to classify and quantify more complex forms of entanglement.
Let us specify some important entanglement measures which SNNS can utilise, starting from the Geometric Measure of Entanglement (GME) [33]. For pure states, the GME is defined through the maximum fidelity that can be obtained between a target state $|\sigma\rangle$ and the set of pure, at most K-separable states $\bar{B}_K$. For more sophisticated mixed state approaches, it is expedient to employ any number of density-matrix distance measures. Several important examples include the trace distance $\|\sigma - \rho\|_1$, where $\|X\|_1 = \mathrm{Tr}\sqrt{X^\dagger X}$, or the Bures metric, defined in terms of the Bures fidelity $F$ as before. These quantities are readily approximated via SNNS, and easily specified to different forms of K-separability.
Of particular interest is the Relative Entropy of Entanglement (REE) [28], an entanglement measure that has many applications in quantum communications and channel capacities [34]. The REE is based on the quantum relative entropy (QRE), a kind of distance measure between two quantum states,
$$S(\rho \| \sigma) = \mathrm{Tr}\left[\rho \left(\log_2 \rho - \log_2 \sigma\right)\right],$$
such that $S(\rho \| \sigma) \in [0, +\infty)$. Due to its asymmetry and the fact that it can diverge when its second argument is a pure state, it is not a true metric; however, it is nonetheless extremely useful. The REE is then defined as
$$E_R(\sigma) = \min_{\rho \in D_{\mathrm{Sep}}} S(\sigma \| \rho),$$
which can be readily employed with respect to parameterised NNS. This can of course be generalised to $E^K_R(\sigma)$ given a form of separability. Interestingly, the REE is sub-additive and in general $E_R(\sigma^{\otimes n}) \leq n E_R(\sigma)$. This lets us define a regularised n-shot REE, $E^n_R(\sigma) = E_R(\sigma^{\otimes n})/n \leq E_R(\sigma)$. The single-shot, standard REE alone is an extremely difficult quantity to compute, largely due to the characterisation of $D_{\mathrm{Sep}}$ and the unruliness of the QRE. Its computation has recently been explored using an active learning strategy [35], in which the authors use active learning to compress $D_{\mathrm{Sep}}$ into a more relevant subset of the separable state space that contributes strongly to the REE. Thanks to the implicit separability of SNNS, we may choose an alternative approach in which one optimises some other cost function, such as the fidelity or trace distance, that simultaneously minimises the QRE towards the optimal REE. In doing so, SNNS should allow for the accurate and efficient approximation of $E_R$, and of previously unexplored REEs with respect to other forms of separability, $E^K_R$.

Fig. 4: Optimisation for the $d = 5$ Werner state of Eq. (56) with η = −0.75. Using NNS, the REE was approximated to within $< 10^{-5}$ precision of the known analytical value $E_R \approx 0.4564$ [36]. The entangled network used 10 hidden mixing neurons and 10 hidden pure-state neurons, whilst the separable network used 10 hidden mixing neurons. The density matrices of the (approximate) CSS $\rho^{\mathrm{Sep}}_\Omega$ and of the target state approximation are also shown.

A. Mixed States in d-dimensions
The most substantial generalisation of the methods introduced in Ref. [27] is the ability to classify and quantify entanglement in mixed, d-dimensional states. To illustrate this improvement, consider the d-dimensional Werner state, parameterised by
$$W_{\eta,d} = \frac{(d - \eta)\, I_{d^2} + (d\eta - 1)\, \mathbb{F}}{d^3 - d}, \qquad (56)$$
where $\mathbb{F} = \sum_{i,j=0}^{d-1} |ij\rangle\langle ji|$ is the two-qudit flip operator, $I_{d^2}$ is the identity operator on the two-qudit space, and $\eta \in [-1, 1]$ characterises the entanglement properties of the state. For $\eta \in [-1, 0)$ the state is entangled, and we can easily quantify this entanglement using the analytically known REE [36],
$$E_R(W_{\eta,d}) = \frac{1+\eta}{2} \log_2 (1+\eta) + \frac{1-\eta}{2} \log_2 (1-\eta).$$
In Fig. 4 we display an optimisation procedure for $d = 5$, $\eta = -0.75$ using an entangled learner $\rho^{\mathrm{Ent}}_\Omega$ and a fully separable learner $\rho^{\mathrm{Sep}}_\Omega$. The free, entangled learner is able to reconstruct the target Werner state with ease and with an extremely high fidelity, while the fully separable learner fails to do so and thereby correctly classifies the target as entangled.
Beyond the entanglement classification, the SNNS is able to quantify the REE of the state by monitoring the relative entropy $E^\Omega_R(W_{\eta,d}) = S(W_{\eta,d} \| \rho^{\mathrm{Sep}}_\Omega)$ throughout the learning process. As the optimisation converges, $E^\Omega_R \rightarrow E_R$, and we gather an approximation to the REE of the state. Indeed, under typical optimisation settings, the REE is approximated to within $< 10^{-5}$ precision of the known analytical value $E_R(W_{-0.75,5}) \approx 0.4564$, reinforcing the strength of this approach.
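The quoted REE value can be cross-checked directly. The sketch below assumes the standard Werner parameterisation with $\eta = \mathrm{Tr}(W \mathbb{F})$ and assumes that the closest separable state for $\eta < 0$ is the boundary Werner state at $\eta = 0$; under these assumptions the relative entropy to that state reproduces the value 0.4564 quoted for $d = 5$, $\eta = -0.75$. The helper names are hypothetical.

```python
import numpy as np

def flip_operator(d):
    """Two-qudit flip (swap) operator, sum_ij |ij><ji|."""
    flip = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            flip[i * d + j, j * d + i] = 1.0
    return flip

def werner_state(eta, d):
    """Werner state with expectation value Tr(W F) = eta (assumed parameterisation)."""
    return ((d - eta) * np.eye(d * d) + (d * eta - 1.0) * flip_operator(d)) / (d**3 - d)

def matrix_log2(m):
    """Base-2 logarithm of a full-rank real symmetric matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.log2(vals)) @ vecs.T

def relative_entropy(rho, sigma):
    """S(rho || sigma) = Tr[rho (log2 rho - log2 sigma)]."""
    return float(np.trace(rho @ (matrix_log2(rho) - matrix_log2(sigma))))

def werner_ree(eta):
    """Analytic REE of an entangled Werner state (eta < 0)."""
    return 0.5 * (1 + eta) * np.log2(1 + eta) + 0.5 * (1 - eta) * np.log2(1 - eta)

d, eta = 5, -0.75
target = werner_state(eta, d)
css = werner_state(0.0, d)                  # assumed closest separable state
ree_numeric = relative_entropy(target, css)
ree_analytic = werner_ree(eta)
eta_check = np.trace(target @ flip_operator(d))
```

In an SNNS optimisation the separable learner plays the role of `css`, so the monitored relative entropy should approach `ree_numeric` from above as the learner converges.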

B. Classification of Bound Entangled States
The positivity of a partially transposed quantum state can be a signature of separability. However, it is not universal, and there exist classes of states which are PPT but entangled, known as bound entangled (BE) states. Here we consider the following two-qutrit state,
$$\sigma_\alpha = \frac{2}{7} |\Psi_+\rangle\langle\Psi_+| + \frac{\alpha}{7} \sigma_+ + \frac{5 - \alpha}{7} \sigma_-,$$
where $|\Psi_+\rangle = \frac{1}{\sqrt{3}} \sum_{i=0}^{2} |ii\rangle$ is a $d = 3$ dimensional Bell state, $\sigma_+ = \frac{1}{3}(|01\rangle\langle 01| + |12\rangle\langle 12| + |20\rangle\langle 20|)$ and $\sigma_- = \frac{1}{3}(|10\rangle\langle 10| + |21\rangle\langle 21| + |02\rangle\langle 02|)$. This state is known to satisfy the following entanglement properties [37]: it is separable for $2 \leq \alpha \leq 3$, bound entangled for $3 < \alpha \leq 4$, and freely entangled for $4 < \alpha \leq 5$. Here we investigate the target state in the bound entangled region, and show that this bipartite state cannot be optimally reconstructed via SNNS. Fig. 5 depicts the employment of entangled learners $\rho^{\mathrm{Ent}}_\Omega$ (blue) and fully separable learners $\rho^{\mathrm{Sep}}_\Omega$ (red) to reconstruct $\sigma_\alpha$ across the domain $3 < \alpha \leq 4$.
For all values of $\alpha$, $\rho^{\mathrm{Ent}}_\Omega$ is able to reconstruct the state to a high degree of precision, such that the trace distance is $\|\sigma_\alpha - \rho^{\mathrm{Ent}}_\Omega\|_1 \leq 10^{-4}$. However, the separable learners are unable to reach this level of reconstruction accuracy. Hence, since the $\sigma_\alpha$ are learnable via free NNS, the inability of $\rho^{\mathrm{Sep}}_\Omega$ to reconstruct $\sigma_\alpha$ implies that these states are entangled in this region. Since they are also PPT in this region, we have successfully shown the ability of SNNS to classify bound entanglement.
During each constrained optimisation we gather an upper bound on the distance between the target bound entangled state and its CSS. As noted before, this is an upper bound since $\rho^{\mathrm{Sep}}_\Omega$ offers only an approximation to the CSS, and it is potentially loose. Nonetheless, the inferred classification is informative. Fig. 5 plots the trace distance $\|\sigma_\alpha - \rho^{\mathrm{Sep}}_\Omega\|_1$, which steadily rises as $\alpha$ increases; this is expected, as $\sigma_\alpha$ becomes freely entangled for $4 < \alpha \leq 5$.
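The PPT property that makes these states invisible to the Peres-Horodecki criterion is straightforward to verify numerically. The sketch below assumes that the target is the standard two-qutrit Horodecki family of Ref. [37], $\sigma_\alpha = \frac{2}{7}|\Psi_+\rangle\langle\Psi_+| + \frac{\alpha}{7}\sigma_+ + \frac{5-\alpha}{7}\sigma_-$, and checks the sign of the smallest eigenvalue of the partial transpose; the function names are hypothetical.

```python
import numpy as np

def ket(i, j, d=3):
    """Computational basis vector |ij> of two qudits."""
    v = np.zeros(d * d)
    v[i * d + j] = 1.0
    return v

def horodecki_state(alpha):
    """Assumed two-qutrit family: separable, bound or free entangled depending on alpha."""
    bell = sum(ket(i, i) for i in range(3)) / np.sqrt(3.0)
    plus_kets = [ket(i, (i + 1) % 3) for i in range(3)]    # |01>, |12>, |20>
    minus_kets = [ket((i + 1) % 3, i) for i in range(3)]   # |10>, |21>, |02>
    sigma_plus = sum(np.outer(v, v) for v in plus_kets) / 3.0
    sigma_minus = sum(np.outer(v, v) for v in minus_kets) / 3.0
    return (2.0 / 7.0) * np.outer(bell, bell) + (alpha / 7.0) * sigma_plus \
        + ((5.0 - alpha) / 7.0) * sigma_minus

def partial_transpose(rho, d=3):
    """Transpose the second subsystem of a two-qudit density matrix."""
    return rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

def min_pt_eigenvalue(alpha):
    return float(np.linalg.eigvalsh(partial_transpose(horodecki_state(alpha))).min())

pt_bound = min_pt_eigenvalue(3.5)   # inside 3 < alpha <= 4
pt_free = min_pt_eigenvalue(4.5)    # inside 4 < alpha <= 5
```

A non-negative `pt_bound` shows that the partial transpose alone cannot flag the state at $\alpha = 3.5$, which is why a reconstruction-based witness is informative in this region.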

C. Detection and Measurement of Multipartite Entanglement
The versatility of the K-separable state design means that we can explore entanglement classification and quantification methods that are otherwise very difficult.In particular, we may construct a NNS protocol that is able to witness W/GHZ-state entanglement, and measure W/GHZ-type correlations in both pure and mixed quantum states.Consider the three-qubit W and GHZ states respectively [38,39] These are both maximally entangled three party states.However they possess two inequivalent forms of tripartite entanglement, such that |W cannot be transformed into |GHZ by means of LOCC (local operations and classical communications) strategies.The key difference in these forms of entanglement is their robustness i.e. when a party is removed from a GHZ state the remaining states are separable, whilst a W-state remains entangled.Therefore a W-state possesses strict bipartite entanglement between all three parties, whereas GHZ entanglement can be achieved via "relayed entanglement" [40].
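The difference in robustness between the two states can be checked with a partial trace followed by a PPT test, which is conclusive for two qubits. This is a minimal sketch using the standard forms $|W\rangle = (|001\rangle + |010\rangle + |100\rangle)/\sqrt{3}$ and $|GHZ\rangle = (|000\rangle + |111\rangle)/\sqrt{2}$; the helper names are hypothetical.

```python
import numpy as np

def trace_out_last_qubit(rho):
    """Reduce a three-qubit density matrix to the first two qubits."""
    return np.einsum('ikjk->ij', rho.reshape(4, 2, 4, 2))

def partial_transpose_2q(rho):
    """Transpose the second qubit of a two-qubit density matrix."""
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

def min_pt_eigenvalue(rho):
    return float(np.linalg.eigvalsh(partial_transpose_2q(rho)).min())

w = np.zeros(8)
w[[1, 2, 4]] = 1.0 / np.sqrt(3.0)      # |001>, |010>, |100>
ghz = np.zeros(8)
ghz[[0, 7]] = 1.0 / np.sqrt(2.0)       # |000>, |111>

rho_w_reduced = trace_out_last_qubit(np.outer(w, w))
rho_ghz_reduced = trace_out_last_qubit(np.outer(ghz, ghz))

pt_w = min_pt_eigenvalue(rho_w_reduced)      # negative: still entangled
pt_ghz = min_pt_eigenvalue(rho_ghz_reduced)  # non-negative: separable
```

The negative eigenvalue for the reduced W-state, $(1 - \sqrt{5})/6$, reflects the pairwise entanglement that a $K_{GHZ}$-type architecture is unable to capture.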
To classify between these states, we must define a partition set that is capable of capturing GHZ correlations, but incompletely capture W-type correlations.The nondisjoint separability set is capable of learning both W and GHZ entangled states, as it strictly specifies bipartite entanglement between all  parties.However, one can construct the partition set which is any possible permutation of two subsets of K W . Programming a NNS according to K GHZ does not allow the network to capture direct correlations between qubits j and k, and will therefore provide an insufficient ansatz to reconstruct W-states.This forms a witness for W-type entanglement; if a target state is learnable via a NNS endowed with K W -separability, but is not learnable via K GHZ -separability, then the state is verified as possessing W-type entanglement.Furthermore, by constructing entanglement measures E KGHZ Ω we are able to measure the amount of W-type correlations within a target state.
Figure 6(a) shows the pure-state classification of a three-qubit W-state, where the non-disjoint network architectures perform classification easily.Note that these three-qubit partitions can be analogously embedded into larger, n-qudit systems in order to study more complex forms of entanglement.
Realistically, multipartite entangled resources for future quantum communication/computing protocols will be noisy and imperfect.Generating and distributing multipartite entanglement over noisy quantum channels is fundamental for many future quantum technologies, particularly for secure communications and quantum networks [41][42][43][44][45][46][47][48].Therefore it is a more interesting challenge to consider the classification and quantification of tripartite entanglement subject to decoherence.For instance, one can consider versions of |W /|GHZ in which each qudit has been passed through a depolarising channel where n denotes the number of qudits being acted on (in this case n = 3).We denote these noisy, three-qubit states as Using mixed NNS programmed with different separabilities, we may then easily distinguish between the entanglement properties of noisy W/GHZ-states subject to depolarising channels.Indeed, Fig. 6 , it is clear that both are able to optimally reconstruct the noisy GHZ-state, whilst only ρ KW Ω is able to optimally reconstruct the noisy W-state, completing the classification.This is taken a step further in Fig. 6(c) where different versions of the REE of σ p W is monitored for various depolarising probabilities.This plot describes three forms of REE: • The standard E R (red) defined on the space of all fully separable states (using the partition set K FS = {1|2|3}) which measures the amount of any entanglement present.
• The genuine tripartite entangled REE, E Gen R (green), using the bi-separable partition sets K BS = {i, j|k}, i = j = k ∈ {1, 2, 3}, which measures the amount of genuine tripartite entanglement in the state (W or GHZ correlations).
• The W-REE, E W R (blue) using the partition set K GHZ in Eq. ( 61), which measures the amount of genuine, tripartite, strictly W-type entanglement within the state.
By employing more complex separable architectures, we may study how different forms of entanglement behave with respect to environmental properties, such as depolarisation. By measuring $E_R^{\mathrm{Gen}}$ and $E_R^{W}$, for instance, we may monitor the decoherence of genuine tripartite entanglement specifically, rather than of any entanglement, as $E_R$ does. Such characterisations could prove very useful in communication/networking scenarios, where genuine multipartite entanglement is critical to performance.
It is important to remind the reader that these are upper bounds. The standard REE upper bound is expected to be tight, as fully separable NNS architectures precisely capture full separability. However, $K_{\mathrm{BS}}$ and $K_{\mathrm{GHZ}}$ are degenerate, e.g. $K_{\mathrm{BS}} = \{i, j|k\}$ has 3 unique forms. Since mixed SNNS are restricted to consistent separabilities, there may be convex combinations of states of these separabilities that produce tighter bounds. It is unknown whether this is the case; nonetheless, $E_R^{\mathrm{Gen}}$ and $E_R^{W}$ provide informative upper bounds on these unique entanglement measures.
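The noisy target states discussed above can be built numerically as in the following minimal sketch. It assumes the common depolarising convention $\mathcal{E}_p(X) = (1-p)X + p\,\mathrm{Tr}(X)\,\mathbb{I}/d$ applied independently to each qudit; this convention and all function names are illustrative assumptions and may differ from the definition used in Eq. (62).

```python
import numpy as np

def ket_w():
    """Three-qubit W state (|001> + |010> + |100>)/sqrt(3)."""
    psi = np.zeros(8)
    psi[[1, 2, 4]] = 1 / np.sqrt(3)
    return psi

def ket_ghz():
    """Three-qubit GHZ state (|000> + |111>)/sqrt(2)."""
    psi = np.zeros(8)
    psi[[0, 7]] = 1 / np.sqrt(2)
    return psi

def depolarise_site(rho, site, n, d, p):
    """Apply the assumed channel (1-p) X + p Tr(X) I/d to qudit `site` of an n-qudit state."""
    # View rho as a tensor with n row indices followed by n column indices.
    t = rho.reshape([d] * (2 * n))
    # Partial trace over `site`: contract its row index with its column index.
    reduced = np.trace(t, axis1=site, axis2=n + site)
    # Tensor in a maximally mixed qudit; its two indices are appended last.
    replaced = np.multiply.outer(reduced, np.eye(d) / d)
    # Move the new row/column indices back to the positions of `site`.
    replaced = np.moveaxis(replaced, [2 * n - 2, 2 * n - 1], [site, n + site])
    return (1 - p) * rho + p * replaced.reshape(d**n, d**n)

def noisy_state(psi, p, n=3, d=2):
    """Pass every qudit of |psi><psi| through an independent depolarising channel."""
    rho = np.outer(psi, psi.conj())
    for site in range(n):
        rho = depolarise_site(rho, site, n, d, p)
    return rho

# Depolarised W and GHZ states at p = 1/3, the value used in Fig. 6(b).
sigma_w = noisy_state(ket_w(), p=1 / 3)
sigma_ghz = noisy_state(ket_ghz(), p=1 / 3)
```

The resulting density matrices would serve as reconstruction targets for the differently constrained network states; the training itself is not sketched here.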

D. Ultimate Limits for Channel Capacities
We may provide a more practical example for the use of SNNS in the realm of quantum communications, using them to approximate upper bounds of quantum channel capacities. Introduced in Ref. [34], the Pirandola-Laurenza-Ottaviani-Banchi (PLOB) bound is an ultimate upper bound on the two-way assisted quantum (and secret-key) capacity $\mathcal{C}(\mathcal{E})$ for a given quantum channel $\mathcal{E}$. Its derivation is based on the techniques of channel simulation and teleportation stretching, which have proven to be extremely versatile in a number of settings [42,[49][50][51][52][53][54]]. An essential class of quantum channels are those which are teleportation covariant, meaning that they satisfy the condition $\mathcal{E}(U \rho U^\dagger) = V \mathcal{E}(\rho) V^\dagger$ for any teleportation unitary U and some corresponding unitary V. Let us define the Choi matrix of a d-dimensional channel $\mathcal{E}$ as the result of passing one mode of a maximally entangled state $\Phi^+$ through $\mathcal{E}$, and the other through an identity channel $\mathcal{I}$, i.e. $\rho_{\mathcal{E}} = (\mathcal{I} \otimes \mathcal{E})(\Phi^+)$, where the maximally entangled state may take the form $\Phi^+ = |\Phi^+\rangle\langle\Phi^+|$ with $|\Phi^+\rangle = d^{-1/2} \sum_{i=0}^{d-1} |ii\rangle$. For teleportation covariant channels, the ultimate channel capacity can then be upper bounded in a remarkably simple way [34], $\mathcal{C}(\mathcal{E}) \leq E_R^n(\rho_{\mathcal{E}}) \leq E_R(\rho_{\mathcal{E}})$, where $E_R$ is the standard relative entropy of entanglement (and $E_R^n$ its n-shot version). SNNS can be used to approximate upper bounds on these channel capacities, via constrained reconstruction of the Choi state of the desired quantum channel.
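As a concrete illustration of the Choi state and the covariance condition described above, the following sketch constructs $(\mathcal{I} \otimes \mathcal{E})(\Phi^+)$ for a qudit depolarising channel and checks covariance under a given pair of unitaries. The depolarising convention $\mathcal{E}_p(X) = (1-p)X + p\,\mathrm{Tr}(X)\,\mathbb{I}/d$ and the helper names are assumptions made for illustration only.

```python
import numpy as np

def max_entangled(d):
    """Projector onto |Phi+> = d^{-1/2} sum_i |ii>."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(phi, phi)

def depolarising(x, p):
    """Assumed convention: E_p(X) = (1 - p) X + p Tr(X) I/d."""
    d = x.shape[0]
    return (1 - p) * x + p * np.trace(x) * np.eye(d) / d

def choi_state(channel, d):
    """Choi state (I x E)(Phi+) = (1/d) sum_ij |i><j| x E(|i><j|)."""
    rho = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            e_ij = np.zeros((d, d), dtype=complex)
            e_ij[i, j] = 1.0
            # The channel acts on the second mode only.
            rho += np.kron(e_ij, channel(e_ij)) / d
    return rho

def is_covariant_under(channel, U, V, rho, tol=1e-10):
    """Check E(U rho U^dag) == V E(rho) V^dag for one input state rho."""
    lhs = channel(U @ rho @ U.conj().T)
    rhs = V @ channel(rho) @ V.conj().T
    return np.allclose(lhs, rhs, atol=tol)

d, p = 3, 0.2
rho_choi = choi_state(lambda x: depolarising(x, p), d)
```

Under the assumed convention the resulting Choi state is the mixture $(1-p)\Phi^+ + p\,\mathbb{I}/d^2$, which is an isotropic state.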
We consider two important, teleportation covariant, d-dimensional quantum channels in an effort to illustrate the effectiveness of our approach: the depolarising channel considered in Eq. (62), and the Holevo-Werner (HW) channel [55][56][57]. The Choi states of these channels are the classes of isotropic states and Werner states respectively, whose REE bounds are known analytically. Therefore, we can compare the numerical performance of computing the REE via SNNS with the known, exact bounds. Fig. 7(a) reports REE bounds on the capacity of depolarising channels for dimensions d = 2, 3, 4. Approximating these bounds via separable network states requires the targeted reconstruction of the isotropic state. Using a bipartite SNNS $\rho^{\mathrm{Sep}}_\Omega$ and attempting to learn the target Choi state leads to an approximation of the REE of said state. Performing this optimisation for many depolarising probabilities p, the results in Fig. 7(a) can be produced. This is achieved to a very high degree of accuracy, reproducing the analytical bounds with an average error $\lesssim 10^{-5}$. Furthermore, these bounds can be computed very efficiently by performing each optimisation sequentially, initialising the network parameters using the results of previous optimisations [58].
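The analytical benchmark for the depolarising channel can be reproduced with a short numerical check. This is a sketch under stated assumptions, not the SNNS optimisation itself: the separable candidate is fixed by hand to the isotropic state at the separability threshold $F = 1/d$ instead of being learned, the depolarising convention $(1-p)X + p\,\mathrm{Tr}(X)\,\mathbb{I}/d$ is assumed (giving singlet fraction $F = 1 - p + p/d^2$), and all function names are hypothetical.

```python
import numpy as np

def max_entangled(d):
    """Projector onto |Phi+> = d^{-1/2} sum_i |ii>."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(phi, phi)

def isotropic_state(F, d):
    """Isotropic state with singlet fraction F = <Phi+|rho|Phi+>."""
    phi = max_entangled(d)
    return F * phi + (1 - F) * (np.eye(d * d) - phi) / (d * d - 1)

def relative_entropy(rho, sigma, tol=1e-12):
    """S(rho||sigma) in bits, assuming supp(rho) lies within supp(sigma)."""
    r = np.linalg.eigvalsh(rho)
    r = r[r > tol]  # convention 0 log 0 = 0
    s, V = np.linalg.eigh(sigma)
    # <v_j| rho |v_j> for each eigenvector v_j of sigma.
    overlaps = np.real(np.einsum('ij,ik,kj->j', V.conj(), rho, V))
    keep = overlaps > tol
    return float(np.sum(r * np.log2(r)) - np.sum(overlaps[keep] * np.log2(s[keep])))

def ree_isotropic_exact(F, d):
    """Known closed-form REE of an isotropic state with singlet fraction F."""
    if F <= 1 / d:
        return 0.0
    if F >= 1:
        return float(np.log2(d))
    return float(np.log2(d) + F * np.log2(F) + (1 - F) * np.log2((1 - F) / (d - 1)))

def ree_bound_depolarising(p, d):
    """Upper bound on the REE of the depolarising Choi state (assumed convention)."""
    F = 1 - p + p / d**2
    rho = isotropic_state(F, d)
    # Separable candidate: the isotropic state at (or below) the threshold F = 1/d.
    sigma = isotropic_state(min(F, 1 / d), d)
    return relative_entropy(rho, sigma)

for d in (2, 3, 4):
    for p in np.linspace(0.0, 1.0, 11):
        F = 1 - p + p / d**2
        print(d, round(float(p), 1), ree_bound_depolarising(p, d), ree_isotropic_exact(F, d))
```

Because the hand-picked candidate is already the closest separable state for this family, the two printed columns should coincide; an SNNS would have to find such a state through optimisation.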
In Fig. 7(b) we give REE upper bounds for the HW channel, whose definition involves the transpose of the input state (denoted by a superscript T). The Choi state of the HW channel is the d-dimensional Werner state, introduced in Eq. (56). The single-shot REE bounds for the HW channel are analytically known and given in Eq. (57), and are independent of the dimension d. Again, this single-shot bound is approximated to a good precision, as shown in the results. For Werner states of dimension d > 2, the REE is known to be strictly sub-additive when $\eta < -2/d$, and previous studies have explored the two-shot REE for these Choi states [56], which can therefore be used to tighten these upper bounds. For instance, in Fig. 7(b) the two-shot capacity can be seen to significantly tighten the bounds for d = 3. In order to compute these tighter bounds, one must modify the definition of the n-shot quantities slightly: the minimisation is now performed with respect to the space of all locally bi-separable states. Consider the n-copy Werner state, and let us label each copy with the indices of its modes {i, j}, where the labels are permuted into a bi-separable decomposition such that each state belongs to exclusively even or odd mode labels. This corresponds to a situation where two users each possess n local modes, and their goal is to produce the closest state to the n-copy Werner state that is bi-separable between them. In general this is a very difficult task and, while beyond the scope of this paper, it poses an interesting future application for SNNS.
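The dimension independence of the single-shot Werner bound can be checked numerically with the sketch below. It assumes the parametrisation $[(d-\eta)\mathbb{I} + (d\eta-1)\mathbb{F}]/(d^3-d)$, where $\mathbb{F}$ is the flip (swap) operator and $\eta$ is its expectation value, together with the closed form $\tfrac{1+\eta}{2}\log_2(1+\eta) + \tfrac{1-\eta}{2}\log_2(1-\eta)$ for $\eta < 0$. Both are standard forms assumed here and may differ in notation from Eqs. (56) and (57); the separable candidate ($\eta = 0$) is chosen by hand, not learned.

```python
import numpy as np

def flip_operator(d):
    """Swap operator F|i>|j> = |j>|i> on two d-dimensional systems."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F

def werner_state(eta, d):
    """Assumed parametrisation with eta = Tr(rho F), -1 <= eta <= 1."""
    return ((d - eta) * np.eye(d * d) + (d * eta - 1) * flip_operator(d)) / (d**3 - d)

def relative_entropy(rho, sigma, tol=1e-12):
    """S(rho||sigma) in bits, assuming supp(rho) lies within supp(sigma)."""
    r = np.linalg.eigvalsh(rho)
    r = r[r > tol]  # convention 0 log 0 = 0
    s, V = np.linalg.eigh(sigma)
    overlaps = np.real(np.einsum('ij,ik,kj->j', V.conj(), rho, V))
    keep = overlaps > tol
    return float(np.sum(r * np.log2(r)) - np.sum(overlaps[keep] * np.log2(s[keep])))

def _xlog2x(x):
    """x log2(x) with the convention 0 log 0 = 0."""
    return 0.0 if x <= 0 else float(x * np.log2(x))

def ree_werner_exact(eta):
    """Assumed closed-form single-shot REE; zero for separable states (eta >= 0)."""
    if eta >= 0:
        return 0.0
    return 0.5 * (_xlog2x(1 + eta) + _xlog2x(1 - eta))

def ree_bound_werner(eta, d):
    """Relative entropy to the separable Werner state at eta = 0."""
    sigma = werner_state(max(eta, 0.0), d)
    return relative_entropy(werner_state(eta, d), sigma)

# The d = 5, eta = -0.75 example of Fig. 4.
bound_fig4 = ree_bound_werner(-0.75, 5)
print(bound_fig4, ree_werner_exact(-0.75))
```

For the parameters of Fig. 4 this evaluates to approximately 0.4564, consistent with the analytical value quoted there.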

V. CONCLUSIONS AND OUTLOOK
We have generalised the concept of NNS with programmable separability to mixed, d-dimensional quantum states. We discussed a number of neural network architectures for the description of quantum states, and detailed how their entanglement properties may be controlled via constraints placed on network connectivity. It was shown that network connectivity controls entanglement structure on a very specific level, requiring distinctions between certain forms of entanglement. Outlining one of many possible optimisation protocols, methods of classification and quantification via SNNS have been logically developed, and applied in a number of important settings. We then studied a practical application of these tools in the bounding of ultimate quantum channel capacities, showing that they can reproduce the PLOB bounds for DV channels with high precision.
There are a number of valuable future directions in which SNNS may be explored and expanded. While an optimisation scheme based on the vectorised fidelity is effective for a variety of applications (as shown in this work), more sophisticated optimisation protocols could enhance performance for more specific entanglement measures. In particular, a gradient descent method that directly minimises the relative entropy (or some variant thereof) would provide a more effective computation of the REE for complex states. This would also lend itself well to the study of n-shot REE quantities with applications in quantum channel capacities, and to the characterisation of more complex bound entangled states (such as those constructed from un-extendible product bases). Combining these tools with those from practical quantum tomography could also be extremely useful, e.g. where SNNS may be used to certify the effectiveness of an entanglement distribution protocol.

Figure 1. Neural network quantum state architectures for the simulation of pure states. Panel (a) illustrates the standard NNS construction for n qudits. The visible layer consists of $n_v \times d$ units which encode the accessible basis states of the target system; here d is the number of visible units required to encode a single qudit state, where C(•) is some encoding function such that $C(|d\rangle) = \{g_i\}_{i=1}^{d}$ and its inverse $C^{-1}(\{g_i\}_{i=1}^{d}) = |d\rangle$. Correlations between qudits are captured by an $n_h$-unit hidden layer with interconnected weights and biases. Panel (b) illustrates the amplitude/phase machine that uses two hidden layers and only real-valued parameters.

Figure 2. A restricted Boltzmann machine architecture for the simulation of (generally entangled) density matrices using complex parameters.

Figure 4. The classification and entanglement quantification of a d = 5 Werner state, defined in Eq. (56), for η = −0.75. Using NNS, the REE was approximated to within $10^{-5}$ of the known analytical value $E_R \approx 0.4564$ [36]. The entangled network used 10 hidden mixing neurons and 10 hidden pure-state neurons, whilst the separable network used 10 hidden mixing neurons. The density matrices of the (approximate) CSS ρ Sep

Figure 5. Bound entangled state classification. Entangled learners $\rho^{\mathrm{Ent}}_\Omega$ (blue) are used to confirm the learnability of the target bound entangled state via NNS. Separable learners $\rho^{\mathrm{Sep}}_\Omega$ (red) are then used to classify the target state, and to approximate an upper bound on the trace distance from the CSS, $\sigma_\alpha$. Here we illustrate density matrices of the approximate CSS, and the target state for α = 3.95.

Figure 6. Classification and quantification of d = 2 W/GHZ-type entanglement using NNS. Panel (a) shows the classification of W-type entanglement using two NNS designed according to the partition sets $K_{\mathrm{GHZ}} = \{1, 2|2, 3\}$ and $K_W = \{1, 2|2, 3|1, 3\}$. If a variational state endowed with $K_W$-separability can optimally reconstruct a target that $K_{\mathrm{GHZ}}$ cannot, then it must possess W-type entanglement. In turn, we locate the closest GHZ-entangled state to |W⟩. In Panel (b) this is extended to mixed, depolarised W/GHZ-states for p = 1/3. Panel (c) depicts different versions of the REE upper bounds on a depolarised W-state $\sigma^p_W$ with respect to depolarising probability. Here we plot three types of REE: the fully separable REE $E_R$ (red), the genuine tripartite REE $E_R^{\mathrm{Gen}}$ (green) and the strictly W-type entanglement REE $E_R^{W}$ (blue).

Figure 7. PLOB channel capacity upper bounds computed via separable neural network states. Continuous plots are exact, while the scatter plots are SNNS data. Panel (a) displays the communication capacities for d = 2, 3, 4 dimensional quantum systems in a depolarising channel of depolarising probability p, using mixed, qudit SNNS ansatzes. Panel (b) depicts the capacity for Holevo-Werner (HW) qutrit channels. The network states approximate the REE to a typical accuracy of $< 10^{-5}$, hence reproducing the capacities to a very high degree of precision.