Entanglement classification via neural network quantum states

The task of classifying the entanglement properties of a multipartite quantum state poses a remarkable challenge due to the exponentially increasing number of ways in which quantum systems can share quantum correlations. Tackling such a challenge requires a combination of sophisticated theoretical and computational techniques. In this paper, we combine machine-learning tools and the theory of quantum entanglement to perform entanglement classification for multipartite qubit systems in pure states. We use a parameterisation of quantum systems using artificial neural networks in a restricted Boltzmann machine architecture, known as Neural Network Quantum States, whose entanglement properties can be deduced via a constrained reinforcement learning procedure. In this way, Separable Neural Network States can be used to build entanglement witnesses for any target state.


I. INTRODUCTION
As the size of a quantum system grows, the number of accessible states, and thus the Hilbert space dimension, scales exponentially. Therefore, the amount of information required for a complete description of a many-body quantum state quickly becomes unmanageable. For this reason, as exact descriptions of many-body systems become intractable, we must turn to approximate mathematical models for the simulation of quantum states.
Very recently, machine learning has become a prominent numerical tool for addressing problems of overwhelming complexity, with applications in many areas of physics [1][2][3]. In particular, Artificial Neural Network (ANN) architectures have been shown to provide excellent representations of quantum systems, owing to their efficiency in dimensional reduction and to an expressive power sufficient to provide efficient simulations of, and insight into, quantum problems with high-dimensional Hilbert spaces that are inaccessible by many other analytical or numerical means. A key instance of problems where ANN-based approaches hold the promise of a game-changing contribution is the discrimination of entangled and separable states, which is a known NP-hard classification problem in quantum information processing [4].
In this work, we employ the recently introduced neural network quantum states (NNSs) [5], which are ANN architectures of the restricted Boltzmann machine (RBM) form, to build an accurate entanglement-separability classifier that we show to be effective both in witnessing multipartite entangled states and in identifying the k-inseparability class of generic multipartite quantum states. Our tool requires minimal adaptation to the form of the input states, as we show by addressing various multipartite qubit states, including linear cluster states, which are crucial resources for measurement-based quantum computation [6].
The remainder of this paper is organized as follows. In Sec. II we introduce the concept of NNS and their parameterization, while Sec. III is dedicated to our strategy for the characterization of pure multipartite entangled states. In Sec. IV we present the results of our analysis for a series of benchmark examples including linear cluster states, while Sec. V presents our conclusions and a sketch of our future directions of investigation.

II. NEURAL NETWORK STATES
As mentioned above, NNSs provide a parameterisation for the wavefunction of quantum systems by means of RBM-like architectures [5], which have recently received considerable attention [7]. RBMs consist of a single visible layer and a single hidden layer of neurons, coupled by weighted inter-layer connections and with no intra-layer links. The visible layer embodies the physical degrees of freedom of the system, whilst the hidden one is used to distribute information across the network. The optimization of the latter is the intrinsic purpose of any ANN.
FIG. 1. Illustration of an N-qubit NNS based on an RBM of N binary artificial visible neurons and H binary artificial hidden neurons used to mediate the correlations within the system. There are NH weighted connections and N + H total neural biases.
We consider a generic, pure NNS with N discrete-valued degrees of freedom, for example a system of N qubits $\vec{s} = \{s_i\}_{i=1,\dots,N}$, as a visible layer of binary-valued neurons, fully connected to a hidden layer of H binary-valued neurons $\vec{h} = \{h_j\}_{j=1,\dots,H}$, where $s_i, h_j \in \{-1, 1\}$. These connections are mediated by the variational parameters of the network $\Omega = \{a_i, b_j, W_{ij}\}$. The wavefunction of this state is thus given by
$$\Psi_\Omega(\vec{s}, \vec{h}) = \sum_{\{h\}} e^{\sum_i s_i a_i + \sum_{ij} W_{ij} h_j s_i + \sum_j h_j b_j}. \tag{1}$$
The hidden layer of binary neurons $\vec{h}$ can be readily traced out, due to the lack of intra-layer connections, thus providing a representation depending only on $\Omega$ and the physical spin-like variables in the visible layer,
$$\Psi_\Omega(\vec{s}) = e^{\sum_i a_i s_i} \prod_{j=1}^{H} 2\cosh\Big(b_j + \sum_i W_{ij} s_i\Big). \tag{2}$$
The actual NNS can thus be written as $|\Psi_\Omega\rangle = \sum_{\vec{s}} \Psi_\Omega(\vec{s})\, |\vec{s}\rangle$ (up to an irrelevant normalisation constant). Note that this ansatz describes pure states.
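The step of tracing out the hidden layer can be checked numerically on a toy network. The sketch below is illustrative only: the function names and the small real-valued parameter values are assumptions made for this example, not quantities taken from the paper. It compares a brute-force sum over all hidden configurations with the closed-form product of hyperbolic cosines.

```python
import itertools
import math

def rbm_amplitude_bruteforce(s, a, b, W):
    """Sum exp(energy) over all hidden configurations h_j in {-1, +1}."""
    total = 0.0
    for h in itertools.product((-1, 1), repeat=len(b)):
        energy = sum(ai * si for ai, si in zip(a, s))
        energy += sum(bj * hj for bj, hj in zip(b, h))
        energy += sum(W[i][j] * s[i] * h[j]
                      for i in range(len(s)) for j in range(len(h)))
        total += math.exp(energy)
    return total

def rbm_amplitude_closed(s, a, b, W):
    """Closed form after the trace: exp(a.s) * prod_j 2 cosh(b_j + sum_i W_ij s_i)."""
    visible = math.exp(sum(ai * si for ai, si in zip(a, s)))
    hidden = 1.0
    for j, bj in enumerate(b):
        theta = bj + sum(W[i][j] * s[i] for i in range(len(s)))
        hidden *= 2.0 * math.cosh(theta)
    return visible * hidden

# Hypothetical toy parameters: N = 2 visible neurons, H = 3 hidden neurons.
a = [0.1, -0.2]
b = [0.3, 0.0, -0.1]
W = [[0.2, -0.4, 0.1],
     [0.05, 0.3, -0.25]]

for s in itertools.product((-1, 1), repeat=len(a)):
    print(s, rbm_amplitude_bruteforce(s, a, b, W), rbm_amplitude_closed(s, a, b, W))
```

The brute-force sum costs $2^H$ terms per configuration, whereas the closed form costs of order $NH$ operations, which is why the traced-out expression is the one used in practice.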

III. PURE STATE ENTANGLEMENT CLASSIFICATION
The non-local features of the RBM architecture allow for the assessment of entanglement throughout the system. The capacity of the NNS to represent multipartite entangled states depends on the number of network parameters being exploited [8]. Utilising more hidden neurons in the RBM structure enlarges the sets of weights and biases, and thus the expressive power of the network state. However, representing a pure state via the ansatz in Eq. (2) requires the exact parameterisation of the N-qubit state in terms of the neural network set of parameters $\Omega = \{a_i, b_j, W_{ij}\}$. Fortunately, NNSs are constructed so that they can undergo variational evolution using a learning-optimisation procedure. It is therefore straightforward to implement a learning scheme that variationally evolves an NNS $|\Psi_\Omega\rangle$ into a known target state $|\phi\rangle$ through the maximisation of the state fidelity.
Under the assumption that any entangled state is learnable, it is interesting to investigate the relationship between the set of parameters entering an NNS and the separability properties of the state. We will see that, by using Separable Neural Network States (SNNS) in conjunction with such a fidelity-maximisation learning scheme, target states can be classified based on their entanglement properties.

A. Quantum state representation and translation
In order to provide a systematic way of translating generic N-qubit states into a form represented by an NNS, we use the following approach. Given a blank NNS $|\Psi_\Omega\rangle$ with N visible neurons and H hidden neurons, parameterised by the set $\Omega$, and given a target state $|\phi\rangle$, we wish to optimise $\Omega$ in such a way that $|\Psi_\Omega\rangle$ most closely approximates $|\phi\rangle$. This can be achieved using a learning procedure that iteratively updates $\Omega = \{a_i, b_j, W_{ij}\}$, for all $i \in [1, N]$ and $j \in [1, H]$, so as to reach the set $\Omega'$ for which the fidelity between the NNS and the target state is maximum. As the target state is known and fixed throughout the entire optimisation, the state fidelity can be computed as a multi-variable function of the neural network parameters. The quantities entering this function can be computed as classical expectation values over probability distributions defined by the states at hand, which delivers a readily computable fidelity between the adaptive RBM state and the target state [9]. Here, generally, $\langle A \rangle_\alpha$ denotes the statistical expectation value of a quantity A over the probability distribution $\alpha$. These expectation values can be computed via standard Markov Chain Monte Carlo techniques. The fidelity can then be used to implement a learning scheme in which, at every iteration of the optimisation, the network parameters are adjusted along a positive fidelity gradient until convergence at a maximum (ideally unit) value is achieved. In practice, it is more convenient to consider the negative logarithm of the overlap, as its gradient takes a more compact form; this converts the maximisation into a minimisation.
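As a hedged sketch of how a fidelity can be rewritten in terms of classical expectation values, the block below assumes that the fidelity is the squared normalised overlap and that the ansatz depends holomorphically on its complex parameters. Both are assumptions of this example (under a square-root convention the logarithm only changes by a factor of 1/2), and the symbols $\mathcal{L}$ and $O_i = \partial_{\Omega_i} \ln \Psi_\Omega$ are introduced here for illustration. The identities hold for configurations on which both amplitudes are non-zero.

```latex
\begin{align}
F(\Omega) &= \frac{|\langle \phi | \Psi_\Omega \rangle|^2}
                  {\langle \phi | \phi \rangle \, \langle \Psi_\Omega | \Psi_\Omega \rangle}
           = \left\langle \frac{\phi(\vec{s})}{\Psi_\Omega(\vec{s})} \right\rangle_{|\Psi_\Omega|^2}
             \left\langle \frac{\Psi_\Omega(\vec{s})}{\phi(\vec{s})} \right\rangle_{|\phi|^2}, \\
\mathcal{L}(\Omega) &= -\ln F(\Omega), \\
\frac{\partial \mathcal{L}}{\partial \Omega_i^{*}} &=
    \langle O_i^{*} \rangle_{|\Psi_\Omega|^2}
    - \frac{\left\langle \frac{\phi}{\Psi_\Omega}\, O_i^{*} \right\rangle_{|\Psi_\Omega|^2}}
           {\left\langle \frac{\phi}{\Psi_\Omega} \right\rangle_{|\Psi_\Omega|^2}},
    \qquad O_i = \partial_{\Omega_i} \ln \Psi_\Omega .
\end{align}
```

The first line follows because $\langle \phi/\Psi_\Omega \rangle_{|\Psi_\Omega|^2} = \langle \Psi_\Omega | \phi \rangle / \langle \Psi_\Omega | \Psi_\Omega \rangle$ and $\langle \Psi_\Omega/\phi \rangle_{|\phi|^2} = \langle \phi | \Psi_\Omega \rangle / \langle \phi | \phi \rangle$, so each factor can be estimated by sampling configurations from the corresponding distribution.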
Defining $O_i = \partial_{\Omega_i} \ln \Psi_\Omega$, the loss function $\mathcal{L}(\Omega)$ and its gradients $\partial_{\Omega_i} \mathcal{L}$ for this learning scheme can be written formally in terms of such expectation values. With this at hand, we can now construct a learning scheme using stochastic gradient descent (SGD). Updates to the i-th network parameter of the RBM wavefunction at the k-th iteration will be given by
$$\Omega_i^{(k+1)} = \Omega_i^{(k)} - \eta\, \partial_{\Omega_i} \mathcal{L},$$
where $\eta$ is the learning rate of the process. Given enough iterations and a small enough learning rate, the procedure converges, and the network variational state is optimised to reconstruct the desired target state. The latter can be generally represented as a sum of Kronecker functions with unique probability amplitudes, $\phi(\vec{s}_j) = \sum_{i=1}^{2^N} \alpha_i\, \delta(\vec{s}_i, \vec{s}_j)$, where $\delta(\vec{s}_i, \vec{s}_j)$ is equal to unity if and only if $\vec{s}_i = \vec{s}_j$. However, the use of such Kronecker functions within the target wavefunction poses a very difficult optimisation problem for the learning procedure, due to the infinite magnitude of the gradients on the associated free-energy surface. By smoothing the target wavefunction into a sum over Gaussians, the task becomes much more manageable, while retaining accuracy if sufficiently small variances are used. The wavefunction for this approximation to the target state thus becomes $\phi(\vec{s}_j) \approx \sum_{i=1}^{2^N} \alpha_i\, e^{-(\beta_i - \beta_j)^2/\sigma^2}$, where $\beta_k = \mathrm{bin}(\vec{s}_k)$ is the binary conversion of the k-th N-qubit basis state, and $\sigma^2$ is the variance of the Gaussian packet.
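The effect of the Gaussian smoothing can be illustrated with a few lines of Python. This is a minimal sketch under the assumption that $\beta_k$ is simply the integer encoded by the k-th basis bit string (so $\beta_k = 0, \dots, 2^N - 1$); the function name and the two variance values are chosen here for illustration only.

```python
import math

def smoothed_target(alphas, sigma2):
    """Gaussian-smoothed amplitudes phi(s_j) for every basis index j.

    alphas[i] is the amplitude of the i-th basis state; the index i itself
    plays the role of beta_i (an assumption of this sketch).
    """
    dim = len(alphas)
    return [sum(alphas[i] * math.exp(-((i - j) ** 2) / sigma2) for i in range(dim))
            for j in range(dim)]

# Target (|00> + |11>)/sqrt(2) on N = 2 qubits, basis indices 0..3.
alphas = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]

sharp = smoothed_target(alphas, sigma2=0.05)  # narrow packets: close to the Kronecker form
broad = smoothed_target(alphas, sigma2=4.0)   # wide packets: amplitude leaks to neighbours

print("sharp:", sharp)
print("broad:", broad)
```

With a small variance the smoothed amplitudes stay very close to the original ones, while a large variance visibly populates basis states that carry no weight in the target, which is the accuracy trade-off described above.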
The ability of an NNS to represent local phases within a target state is dictated by the nature of the ANN parameters. An NNS with complex weights and biases is able to generate generally complex amplitudes, such that $\Psi_\Omega(\vec{s}) = r e^{i\theta}$ ($r \in \mathbb{R}$, $\theta \in [0, 2\pi]$) for some input vector of qubit configurations. By contrast, NNSs with purely real parameters are only capable of simulating positive wavefunctions, up to a global phase factor.
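The statement about real versus complex parameters can be made concrete with the traced-out RBM amplitude, $e^{\sum_i a_i s_i} \prod_j 2\cosh(b_j + \sum_i W_{ij} s_i)$. In the sketch below the parameter values are hypothetical and chosen only to make the point: real parameters give strictly positive amplitudes, while complex ones produce a configuration-dependent phase.

```python
import cmath
import itertools
import math

def rbm_amplitude(s, a, b, W):
    """Traced-out RBM amplitude; works for real or complex parameters."""
    visible = cmath.exp(sum(ai * si for ai, si in zip(a, s)))
    hidden = 1.0 + 0.0j
    for j, bj in enumerate(b):
        theta = bj + sum(W[i][j] * s[i] for i in range(len(s)))
        hidden *= 2.0 * cmath.cosh(theta)
    return visible * hidden

# Hypothetical real parameters: exp(real) > 0 and cosh(real) >= 1,
# so every amplitude is a positive real number.
a_real, b_real, W_real = [0.1, -0.2], [0.3], [[0.2], [-0.4]]

# Hypothetical complex parameters: the amplitudes acquire non-trivial phases.
a_cplx, b_cplx, W_cplx = [0.25j * math.pi, 0.0], [0.0], [[0.3], [0.2j]]

for s in itertools.product((-1, 1), repeat=2):
    real_amp = rbm_amplitude(s, a_real, b_real, W_real)
    cplx_amp = rbm_amplitude(s, a_cplx, b_cplx, W_cplx)
    print(s, real_amp, abs(cplx_amp), cmath.phase(cplx_amp))
```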
However, introducing a non-trivial phase structure into a target state increases the complexity of the optimisation problem. It is instead instructive to utilise an additional layer of hidden neurons, $\vec{l} = \{l_k\}_{k=1,\dots,M}$, with an associated set of weights and biases $\Xi = \{c_i, d_k, U_{ik}\}$ dedicated to learning the local phase factors of a target state [10]. The original hidden layer $\vec{h}$ and its weights and biases $\Omega$ thus become the dedicated amplitude-learning parameter set (see Fig. 2). This introduces a new global NNS ansatz that combines the contributions of both layers, with the phase layer entering through a function $\Phi_\Xi(\vec{s}) \in [0, 1]$.
FIG. 2. Graphical depiction of an NNS describing a system of N qubits, using an RBM of N binary artificial visible neurons, H binary artificial hidden neurons and M binary artificial hidden neurons that mediate amplitude and phase correlations, respectively. There are N(H + M) weighted connections and 2N + H + M total neural biases.
In our numerical experiments on target-state reconstruction, the method of natural gradient descent [11] was found to be more effective than that of stochastic gradient descent, and is thus adopted in what follows. Updates to the i-th parameter of the network at the k-th iteration [Eq. (8)] are built from the elements $\langle S_{ij} \rangle$ of a covariance matrix and the elements $f_j$ of a generalised force vector, both defined in Appendix A. The updates to $\Xi_i^k$ in Eq. (8) are based on a natural metric of the variational subspace being explored, which greatly enhances the optimisation process.
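A generic natural-gradient step can be sketched as follows. This is an illustration of the general technique, not the update rule of Appendix A: it assumes real-valued parameters, the common convention in which the step is $-\eta\, S^{-1} f$, and a small diagonal shift `eps` to keep the covariance matrix invertible. The helper names `solve_linear` and `natural_gradient_step` are hypothetical.

```python
def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(map(float, A[r])) + [float(rhs[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def natural_gradient_step(params, S, f, eta=0.1, eps=1e-6):
    """Return params - eta * (S + eps * I)^(-1) f (sign convention assumed)."""
    n = len(params)
    S_reg = [[S[i][j] + (eps if i == j else 0.0) for j in range(n)] for i in range(n)]
    direction = solve_linear(S_reg, f)
    return [p - eta * d for p, d in zip(params, direction)]

# Toy example: the second direction has a small covariance, so the same
# force component produces a larger step there than plain gradient descent would.
print(natural_gradient_step([0.0, 0.0], [[2.0, 0.0], [0.0, 0.5]], [1.0, 1.0], eps=0.0))
```

Rescaling the force by the inverse covariance matrix is what distinguishes this step from plain SGD, where every parameter would move by the same multiple of its gradient component.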
From this point forward we omit reference to the phase learning layer unless required, but recognise that its application is synonymous with the original NNS design.

B. Separable neural network states and multipartite entanglement
The ability to effectively enforce separability properties onto an NNS is extremely useful and integral to the entanglement classification protocol addressed in this paper. In order to enforce a particular form of separability onto a pure multipartite state, we must first determine the number of ways in which it may possess entanglement. An N-qubit pure quantum state is said to be K-separable if it is the tensor product of K = 2, ..., N parties of the total system, where N-separability coincides with full separability. Conversely, a genuinely multipartite entangled state cannot be factorised into any tensor product representation (K = 1).
FIG. 3. Geometric representation of the hierarchy of multipartite entangled states for N-partite quantum states. For every partition of an N-party system, there exists a set of states that admit K-separability, $U_K$. States that are (K + 1)-separable are also representable as K-separable states, thus $U_{K+1} \subset U_K \subseteq U$ (where $U$ is the total set of states). If a state $\rho \in U$ but $\rho \notin U_K$, $\forall K \in [2, N]$, then $\rho$ is genuinely multipartite entangled [12].
We define a pure K-separable state $|\Psi\rangle = \bigotimes_{i=1}^{K} |\psi_{S_i}\rangle$, such that $S = \{S_i\}_{i=1,\dots,K}$ is a set of K disjoint subsets of the N parties of the total quantum system, i.e. $S_i \cap S_j = \emptyset$, $\forall i \neq j \in \{1, \dots, K\}$.
However, at the level of constructing specific separability sets according to $S = \{S_i\}_{i=1,\dots,K}$, the number of ways in which a K-separable state can be realised is highly degenerate (cf. Appendix B). Therefore, one can describe a state as $K_j$-separable (i.e. a K-subseparability), with $j \in [1, P_K]$ where $P_K$ is the degeneracy of the K-partitioning, in order to specify the exact form of K-separability being addressed.
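The degeneracy of a K-partitioning can be explored by enumerating the partitions explicitly. The sketch below assumes that $P_K$ counts the ways of splitting N labelled qubits into K non-empty, unordered groups (the Stirling numbers of the second kind); Appendix B is not reproduced here, so this identification is an assumption, although it agrees with the three-fold degeneracy of biseparability for three qubits. The function name is hypothetical.

```python
def set_partitions(items, k):
    """Yield every partition of `items` into exactly k non-empty, unordered blocks."""
    items = list(items)
    if k == 0:
        if not items:
            yield []
        return
    if len(items) < k:
        return
    first, rest = items[0], items[1:]
    # Case 1: `first` forms a block on its own.
    for part in set_partitions(rest, k - 1):
        yield [[first]] + part
    # Case 2: `first` joins one of the k blocks of a partition of the rest.
    for part in set_partitions(rest, k):
        for idx in range(k):
            yield part[:idx] + [[first] + part[idx]] + part[idx + 1:]

# The K_j labels for three qubits and K = 2 (biseparability).
for j, partition in enumerate(set_partitions([1, 2, 3], 2), start=1):
    print(f"K_{j}:", partition)
```

Each yielded partition corresponds to one candidate set $\{S_m\}$ on which a separable learner could be built.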
We are thus left to translate $K_j$-separability from its fundamental definition into a set of relations for the parameter set of the NNS. Given a set of partitions $\{S_m\}_{m=1,\dots,K}$, we wish to find the network conditions such that the NNS with N visible neurons and H hidden neurons can only reproduce states with such a form of separability. This can be achieved by requiring that $\Psi_\Omega(\vec{s}) = \prod_{k=1}^{K} \psi_{S_k}$, where $\psi_{S_k}$ is an ansatz for a "local" wavefunction for each collection of entangled qubits. In this way we are requesting that $\Psi_\Omega(\vec{s})$ takes a desired product form, and the required conditions can thus be derived from the solutions of Eq. (10), since identical products are taken over the hidden neurons in both cases. The goal of this task is to transform the right-hand side (RHS) of Eq. (10), currently capable of describing all forms of separable states, into a form that aligns with the left-hand side (LHS) and therefore with the separability properties of the state. This separation can be achieved by segmenting the neural network architecture according to the separability being imposed. Each set of potentially entangled qubits $S_m$ is fully connected to a dedicated set of hidden neurons $H_m$ ($W_{i \in S_m,\, j \in H_m} \neq 0$), but is fully disconnected from all other hidden neurons ($W_{i \in S_m,\, j \notin H_m} = 0$). Thus, there exist K disjoint sets of hidden neurons $\{H_m\}_{m=1,\dots,K}$ corresponding to K disjoint sets of qubits $\{S_m\}_{m=1,\dots,K}$. Performing this segmentation, the RHS of Eq. (10) factorises over the pairs $(S_m, H_m)$.
In this way, an N-qubit $K_j$-Separable Neural Network State (SNNS) is defined as an RBM with N visible neurons segmented into disjoint sets $\{S_m\}_{m=1,\dots,K}$ and H hidden neurons segmented into disjoint sets $\{H_m\}_{m=1,\dots,K}$, mediated by complex variational parameters $\Omega = \{a_i, b_j, W_{ij}\}$ with the property
$$W_{ij} = 0 \quad \forall\, i \in S_m,\ j \notin H_m,\ m = 1, \dots, K. \tag{12}$$
Such a separable network architecture generally relies on a larger number of hidden neurons than a conventional NNS, due to the need for dedicated sets of neurons for each separable subsystem of the N-qubit set. However, it is important to recognise that this increase in hidden neurons does not decrease the efficiency of the optimisation procedure. This is because the null weights in $W_{ij}$ (disconnections) do not require updates during the learning protocol, and can thus be ignored. Hence the number of meaningful parameters in $W_{ij}$ is $\sum_{m=1}^{K} |S_m||H_m| \le NH$, which keeps the total parameter count comparable with the number of parameters $|\Omega_{\mathrm{Free}}| = N + H + NH$ in a free learner. This returns the computational complexity of the learning regime for SNNS to that of a typical quantum state reconstruction.
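The segmentation of the network can be sketched as a binary mask on the weight matrix. The example below is a minimal illustration with hypothetical real parameter values, zero-based qubit labels and the standard traced-out RBM amplitude: it builds the mask for the split {0} | {1, 2}, counts the surviving weights, and checks that the masked amplitudes factorise across that cut.

```python
import itertools
import math

def block_mask(n_visible, n_hidden, qubit_sets, hidden_sets):
    """mask[i][j] = 1 only if qubit i and hidden neuron j belong to the same block."""
    mask = [[0] * n_hidden for _ in range(n_visible)]
    for S_m, H_m in zip(qubit_sets, hidden_sets):
        for i in S_m:
            for j in H_m:
                mask[i][j] = 1
    return mask

def amplitude(s, a, b, W):
    """Traced-out RBM amplitude exp(a.s) * prod_j 2 cosh(b_j + sum_i W_ij s_i)."""
    visible = math.exp(sum(a[i] * s[i] for i in range(len(s))))
    hidden = 1.0
    for j in range(len(b)):
        hidden *= 2.0 * math.cosh(b[j] + sum(W[i][j] * s[i] for i in range(len(s))))
    return visible * hidden

# Three qubits split as {0} | {1, 2}; three hidden neurons split the same way.
qubit_sets, hidden_sets = [[0], [1, 2]], [[0], [1, 2]]
mask = block_mask(3, 3, qubit_sets, hidden_sets)
a = [0.1, -0.3, 0.2]
b = [0.05, -0.1, 0.4]
W_free = [[0.3, -0.2, 0.5], [0.1, 0.4, -0.6], [-0.7, 0.2, 0.3]]
W = [[W_free[i][j] * mask[i][j] for j in range(3)] for i in range(3)]

n_weights = sum(map(sum, mask))          # sum over blocks of |S_m| * |H_m|
n_params = len(a) + len(b) + n_weights   # biases plus surviving weights

def factorises_0_vs_12(weights):
    """A product state f(s0) g(s1, s2) satisfies psi(s) psi(t) = psi(s0,t1,t2) psi(t0,s1,s2)."""
    configs = list(itertools.product((-1, 1), repeat=3))
    for s in configs:
        for t in configs:
            lhs = amplitude(s, a, b, weights) * amplitude(t, a, b, weights)
            rhs = (amplitude((s[0], t[1], t[2]), a, b, weights)
                   * amplitude((t[0], s[1], s[2]), a, b, weights))
            if abs(lhs - rhs) > 1e-9 * abs(lhs):
                return False
    return True

print("surviving weights:", n_weights, "total parameters:", n_params)
print("masked network factorises across 0|12:", factorises_0_vs_12(W))
```

In this toy case 5 of the 9 weights survive, giving 11 parameters against 15 for the fully connected network of the same size. For generic unmasked weights the same check would be expected to fail, since the hidden neurons then couple qubit 0 to the other two.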

C. Entanglement Classification
With the necessary tools in place, an entanglement classification protocol can be devised. The maximum-fidelity learning regime allows a blank, randomised NNS to undergo variational evolution in order to converge towards a pre-defined pure quantum state and accurately simulate it. Such a learning process operates under the assumption that the target state can be simulated by the NNS being utilised. If a typical NNS is used then this is true, as there are no restrictions on the values that the network parameters can take. We therefore call generic NNSs free learners of target states.
However, if one uses a K-separable SNNS, this is not necessarily the case. A K-separable SNNS is a neural network state whose network parameters obey Eq. (12), and which can therefore only simulate states with this form of K-separability. Hence, if a K-separable SNNS (or restricted learner) is used in conjunction with the maximum-fidelity learning scheme in order to reconstruct a pure state $|\phi\rangle$, it will only achieve maximum fidelity if the target state is also K-separable. Otherwise, if the target state is in fact $K'$-separable, the SNNS will learn to optimise its fidelity to the maximum value that a K-separable state can achieve with the $K'$-separable state $|\phi\rangle$. If $K'$-separability is a higher order of entanglement than K, then the optimised fidelity will be less than unity. Yet if $K'$-separability is a lower order of entanglement, then the state will be representable as K-separable also.
The classification protocol thus follows. For a pure N-partite quantum state $|\phi\rangle$ which can be optimally reconstructed via unrestricted maximal-fidelity learning, if a $K_j$-separable SNNS $|\Psi^{K_j}_\Omega\rangle$ (defined by the set of disjoint sets of qubits $\{S_i\}_{i=1,\dots,K}$) is unable to reconstruct $|\phi\rangle$ through the same optimisation scheme, then the target state $|\phi\rangle$ must possess entanglement between at least two of the subsets $S_i$. This approach provides valid separability criteria for the conclusive classification of pure quantum states.

D. Witnesses and measures
The construction of a reliable and consistent entanglement classification procedure requires a quantifiable measure of performance for NNS target-state reconstructions. As ANN learning is numerical in nature, it may be prone to statistical errors and possible flaws due to the size of the Hilbert space being explored and the unbounded number of possible variational updates. Hence, we resort to a statistical treatment.
Consider a classification protocol which uses a restricted learner $|\Psi^{K}_\Omega\rangle$ in order to test a given form of K-separability for a target state $|\phi\rangle$. In order to build a level of reliability and confidence, this protocol is performed M times, yielding a set of fidelities $\{F_K^i\}_{i=1,\dots,M}$ from which the average fidelity $\langle F_K \rangle$ and the variance $\Delta F_K^2$ are calculated. Given enough samples M and enough hidden neurons (and thus free parameters) to ensure sufficient expressive power, we can define a performance set $\mathcal{F}_K = [\langle F_K \rangle - |\Delta F_K|,\ \langle F_K \rangle + |\Delta F_K|]$ that describes a window of reliability for the particular separable learner being employed. Doing the same for the free learner $|\Psi^{\mathrm{Free}}_\Omega\rangle$ is also extremely important, as it provides a benchmark for the performance of the NNS without entanglement restrictions. In fact, the learning performance of any restricted learner can be expressed relative to the behaviour of the free learner, which provides a systematic method for conclusive entanglement classification. In general, there are two cases:
• $\mathcal{F}_K \cap \mathcal{F}_{\mathrm{Free}} \neq \emptyset$. In this case there is an intersection between the computed fidelity of the restricted learner and the free learner, meaning that we can classify this state as possessing entanglement properties according to this form of K-separability.
• $\mathcal{F}_K \cap \mathcal{F}_{\mathrm{Free}} = \emptyset$. In this case there is no intersection between the computed fidelity of the restricted learner and the free learner. The learner has only been able to reconstruct (ideally) the closest state to $|\phi\rangle$ that possesses entanglement properties according to $|\Psi^{K}_\Omega\rangle$.
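The two cases above amount to an interval-overlap test, which can be sketched as follows. The fidelity values are invented purely for illustration and are not results from the paper; the use of the population standard deviation (normalisation by M rather than M - 1) and the function names are likewise assumptions of this example.

```python
import statistics

def performance_window(fidelities):
    """Return (mean - spread, mean + spread) over repeated learning runs."""
    mean = statistics.mean(fidelities)
    spread = statistics.pstdev(fidelities)  # population standard deviation (assumed)
    return (mean - spread, mean + spread)

def windows_intersect(w1, w2):
    """True if the two closed intervals share at least one point."""
    return max(w1[0], w2[0]) <= min(w1[1], w2[1])

# Made-up fidelities from M = 3 runs of each learner (illustrative only).
free_runs = [0.99, 0.98, 1.00]
restricted_close = [0.985, 0.99, 0.975]  # comparable with the free learner
restricted_far = [0.70, 0.71, 0.72]      # clearly below the free learner

w_free = performance_window(free_runs)
print(windows_intersect(performance_window(restricted_close), w_free))
print(windows_intersect(performance_window(restricted_far), w_free))
```

A wide window makes the intersection test easy to pass, so the verdict is only meaningful when the spread over the M runs is small compared with the gap between learners.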
manifested within the target state.Defining the relative fidelity an approximate, local entanglement measure can be devised in accordance with the general properties of an entanglement measure [13] Note that we now refer to K j -separability, such that this is a sub-genre of the more general K-separability.Classification does not necessarily require this distinction, but measurement of K j -separability is not a complete reflection of K-separability, as there exist P K 1 other contributions to this measure.Thus, a complete measure of K-separability requires an analysis of all such contributions.
The quantifier in Eq. ( 16) is inspired from the well known Geometric Measure of Entanglement (GME) [14].The quantity E K j strives at quantifying the lack of representability according to K j -separability.In an ideal scenario, all optimisation procedures are perfectly convergent such that for a target state |'i, any SNNS | K j ⌦ i will reconstruct the closest K j -separable state to the target state |' 0 K j i.One can define the overlap between such states as the critical fidelity as it defines maximum fidelity between a target state and the set of K j -separable states.In such ideal case, the relative fidelity and local entanglement measure become so that R K j recovers the critical fidelity and E K j the GME for multipartite, pure state entanglement.

IV. RESULTS
The following results and simulations are used to illustrate the e↵ectiveness of the entanglement classification protocol.We begin with the simplest classification problem of bipartite separability and proceed to more complex states of up to six qubits.Such sizes allow for the investigation of non-trivial multipartite entangled systems without the complications entailed by large many-body systems.Each classification is characterised by a "learning path" which depicts the evolution of the fidelity of the free learner and a separable learner throughout the state reconstruction procedure.Learning paths which follow a convergent trajectory towards unit fidelity indicate a correct classification of the state with the separability properties of the learner in question.Paths which converge with sub-optimal fidelities, or which do not converge at all, indicate a witnessing of entanglement with respect to the appropriate form of separability.
We start addressing the two-qubit case, a situation for which entanglement classification is a binary decision problem as a pure two-qubit state is simply either entangled or fully separable.Fig. 5 illustrates the use of SNNS for classification purposes when the target state is either the Bell state x the x Pauli matrix.The learning paths of the SNNS performs the classification successfully, whilst the free, entangled learner learns both states with ease.Note that SNNS when targeting the Bell state achieves a fidelity of F 1|2 ⇡ 1/ p 2 which aligns with the maximum overlap between any Bell state and the set of all separable bipartite states.Increasing the target system size to three qubits immediately increases the complexity of the classification problem, such that a state is tripartite entangled, biseparable (which is three-fold degenerate) or fully separable.The degeneracy of biseparability is due to the arrangement of entanglement between parties, which the appropriate SNNS are able to distinguish.Fig. 6 displays the ability of SNNS to both detect K-separability and identify the particular permutation of entangled parties (K jseparability).A similar investigation is illustrated for the four/six qubit cases in Fig. 
7, which show the power of the classification protocol that is capable of providing complete entanglement descriptions of pure target states.⌦ i converging to unit fidelity while reconstructing a separable two qubit state manifested within the target state.Defining the relative fidelity an approximate, local entanglement measure can be devised in accordance with the general properties of an entanglement measure [13] Note that we now refer to K j -separability, such that this is a sub-genre of the more general K-separability.Classification does not necessarily require this distinction, but measurement of K j -separability is not a complete reflection of K-separability, as there exist P K 1 other contributions to this measure.Thus, a complete measure of K-separability requires an analysis of all such contributions.
The quantifier in Eq. ( 16) is inspired from the well known Geometric Measure of Entanglement (GME) [14].The quantity E K j strives at quantifying the lack of representability according to K j -separability.In an ideal scenario, all optimisation procedures are perfectly convergent such that for a target state |'i, any SNNS | K j ⌦ i will reconstruct the closest K j -separable state to the target state |' 0 K j i.One can define the overlap between such states as the critical fidelity as it defines maximum fidelity between a target state and the set of K j -separable states.In such ideal case, the relative fidelity and local entanglement measure become so that R K j recovers the critical fidelity and E K j the GME for multipartite, pure state entanglement.

IV. RESULTS
The following results and simulations are used to illustrate the e↵ectiveness of the entanglement classification protocol.We begin with the simplest classification problem of bipartite separability and proceed to more complex states of up to six qubits.Such sizes allow for the investigation of non-trivial multipartite entangled systems without the complications entailed by large many-body systems.Each classification is characterised by a "learning path" which depicts the evolution of the fidelity of the free learner and a separable learner throughout the state reconstruction procedure.Learning paths which follow a convergent trajectory towards unit fidelity indicate a correct classification of the state with the separability properties of the learner in question.Paths which converge with sub-optimal fidelities, or which do not converge at all, indicate a witnessing of entanglement with respect to the appropriate form of separability.
We start addressing the two-qubit case, a situation for which entanglement classification is a binary decision problem as a pure two-qubit state is simply either entangled or fully separable.Fig. 5  x the x Pauli matrix.The learning paths of the SNNS performs the classification successfully, whilst the free, entangled learner learns both states with ease.Note that SNNS when targeting the Bell state achieves a fidelity of F 1|2 ⇡ 1/ p 2 which aligns with the maximum overlap between any Bell state and the set of all separable bipartite states.Increasing the target system size to three qubits immediately increases the complexity of the classification problem, such that a state is tripartite entangled, biseparable (which is three-fold degenerate) or fully separable.The degeneracy of biseparability is due to the arrangement of entanglement between parties, which the appropriate SNNS are able to distinguish.Fig. 6 displays the ability of SNNS to both detect K-separability and identify the particular permutation of entangled parties (K jseparability).A similar investigation is illustrated for the four/six qubit cases in Fig. 7, which show the power of the classification protocol that is capable of providing complete entanglement descriptions of pure target states.This approach provides a consistent rule for deciding how a target state is entangled.The application of this method relies on the accuracy of the learning regime to maintain a low variance throughout its full spectrum of fidelities, since an arbitrarily large variance will render the result redundant.Nonetheless, this method of classifying the entanglement properties of target states resembles that of entanglement witnesses.If a K-separable learner achieves an optimal reconstruction fidelity with a performance similar to the free learner, then the NNS witnesses this state as entangled in this way.Otherwise, it does not witness the state and this binary classification delivers the contrary result.
A much more detailed classification can be carried out by investigating more closely the resultant fidelities of all restricted learners according to a set of separabilities, not just those that achieve optimal fidelities with respect to the free learner. Instead, one can consider the relative fidelity of the $K_j$-separable learner with respect to the free learner as a measure of how much $K_j$-separability is manifested within the target state. Defining this relative fidelity as $R_{K_j}$, an approximate, local entanglement measure $E_{K_j}$ can be devised in accordance with the general properties of an entanglement measure [13]. Note that we now refer to $K_j$-separability, which is a sub-class of the more general K-separability. Classification does not necessarily require this distinction, but a measurement of $K_j$-separability is not a complete reflection of K-separability, as there exist $P_K - 1$ other contributions to this measure. Thus, a complete measure of K-separability requires an analysis of all such contributions.
The quantifier in Eq. (16) is inspired by the well-known Geometric Measure of Entanglement (GME) [14]. The quantity $E_{K_j}$ aims to quantify the lack of representability according to $K_j$-separability. In an ideal scenario, all optimisation procedures are perfectly convergent, such that for a target state $|\varphi\rangle$, any SNNS $|\Psi^{K_j}_{\Omega}\rangle$ will reconstruct the closest $K_j$-separable state to the target state, $|\varphi_{K_j}\rangle$. One can define the overlap between these states as the critical fidelity, as it is the maximum fidelity between the target state and the set of $K_j$-separable states. In this ideal case, the relative fidelity $R_{K_j}$ recovers the critical fidelity and the local entanglement measure $E_{K_j}$ recovers the GME for multipartite, pure-state entanglement.
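As a numerical cross-check of the critical fidelity in the simplest setting, the following sketch computes the maximum overlap between a pure two-qubit state and the set of product states, which for a bipartite pure state equals its largest Schmidt coefficient. It assumes that fidelity is taken as the overlap modulus (consistent with the value 1/√2 quoted for Bell states) and uses the standard form of the geometric measure, one minus the squared maximal overlap. The function names are illustrative and do not come from the original work.

```python
import numpy as np

def critical_fidelity(state):
    """Maximum overlap |<a,b|state>| over two-qubit product states.

    For a bipartite pure state this is the largest Schmidt coefficient,
    i.e. the largest singular value of the 2x2 amplitude matrix.
    """
    amplitudes = np.asarray(state, dtype=complex).reshape(2, 2)
    return np.linalg.svd(amplitudes, compute_uv=False)[0]

def geometric_entanglement(state):
    """Assumed form of the geometric measure: 1 - (max overlap)^2."""
    return 1.0 - critical_fidelity(state) ** 2

# Bell state (|01> + |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>
bell = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2.0)
# Separable state |+>|0>
product = np.kron(np.array([1.0, 1.0]) / np.sqrt(2.0), np.array([1.0, 0.0]))

for name, state in [("Bell", bell), ("product", product)]:
    print(name, critical_fidelity(state), geometric_entanglement(state))
```

Under these assumptions the Bell state gives a critical fidelity of 1/√2, while the product state reaches unit overlap and a vanishing measure, mirroring the two limiting cases discussed for two qubits.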

IV. RESULTS
The following results and simulations are used to illustrate the effectiveness of the entanglement classification protocol. We begin with the simplest classification problem of bipartite separability and proceed to more complex states of up to six qubits. Such sizes allow for the investigation of non-trivial multipartite entangled systems without the complications entailed by large many-body systems. Each classification is characterised by a "learning path" which depicts the evolution of the fidelity of the free learner and a separable learner throughout the state reconstruction procedure. Learning paths which follow a convergent trajectory towards unit fidelity indicate a correct classification of the state with the separability properties of the learner in question. Paths which converge with sub-optimal fidelities, or which do not converge at all, indicate a witnessing of entanglement with respect to the appropriate form of separability.
We start by addressing the two-qubit case, for which entanglement classification is a binary decision problem, as a pure two-qubit state is simply either entangled or fully separable. Fig. 5 illustrates the use of SNNS for classification purposes when the target state is either the Bell state $|\Psi^{+}\rangle = (|01\rangle_{12} + |10\rangle_{12})/\sqrt{2}$ or the separable state $|\varphi\rangle = |+\rangle_1 |0\rangle_2$, with $\sigma_x |+\rangle = |+\rangle$ and $\sigma_x$ the x Pauli matrix. The learning paths show that the SNNS performs the classification successfully, whilst the free, entangled learner learns both states with ease. Note that the SNNS, when targeting the Bell state, achieves a fidelity of $F_{1|2} \approx 1/\sqrt{2}$, which aligns with the maximum overlap between any Bell state and the set of all separable bipartite states.
Increasing the target system size to three qubits immediately increases the complexity of the classification problem, such that a state is tripartite entangled, biseparable (which is three-fold degenerate) or fully separable. The degeneracy of biseparability is due to the arrangement of entanglement between parties, which the appropriate SNNS are able to distinguish. Fig. 6 displays the ability of SNNS to both detect K-separability and identify the particular permutation of entangled parties ($K_j$-separability). A similar investigation is illustrated for the four- and six-qubit cases in Fig. 7, which shows the power of the classification protocol, capable of providing complete entanglement descriptions of pure target states.
Furthermore, GMEs can be constructed to complement the classification process and provide better insight into the entanglement properties of a target state. The example target states in Figs. 5 (a) and (b) convey the extreme cases of maximal entanglement and complete separability respectively; however, a bipartite state can contain any amount of entanglement, such that its classification is less obvious. In this way, Fig. 8 (a) depicts the relative fidelity and GME of a variable Bell state $|\varphi(p)\rangle = \sqrt{p}\,|00\rangle + \sqrt{1-p}\,|11\rangle$, constructed by monitoring the performance of the separable learner with $|\varphi(p)\rangle$ for many values of $p$ in the interval $p \in [0, 1]$. Similarly, considering a variable three-qubit state $|\varphi(p)\rangle$, it is by no means trivial to ask whether this state is separable for any value of $p$. Constructing an approximate GME for any form of entanglement, as seen in Fig. 8 (b), shows that $|\varphi(p)\rangle$ possesses a degree of entanglement for all values of $p \in [0, 1]$, and is never fully separable. Similar investigations could be performed to measure biseparability throughout the interval, and this concept may extend to any form of separability of interest in an N-qubit state (provided stable, convergent learning).
FIG. 6. Examples of the ability of SNNS to distinguish between $K_j$-separable learners. Here, three biseparable learners of the forms $|\Psi^{12|3}_{\Omega}\rangle$ (green), $|\Psi^{13|2}_{\Omega}\rangle$ (orange) and $|\Psi^{1|23}_{\Omega}\rangle$ (blue) are employed to distinguish between the exact forms of separability of the target state. In each case, only the learner with the correct form of biseparability can conclusively classify the target state through convergent learning.
FIG. 7. Classification of four- and six-qubit states. In panel (a), the NNS in the appropriate separable form (blue curve) achieves maximal fidelity throughout the optimisation process, whilst the triseparable and fully separable learners achieve sub-optimal convergences. Panel (b) depicts a similar situation but with a biseparable state in the splitting {1, 2, 3}-vs-{4} containing a tripartite entangled $|\mathrm{GHZ}\rangle$ state; the NNS in the appropriate separable form (blue line) achieves maximal fidelity throughout the optimisation process, whilst the triseparable learners achieve sub-optimal convergences. Panel (c) reports a similar classification process for a random six-qubit state that is biseparable in the {1, 2, 3}-vs-{4, 5, 6} bipartition.
Finally, for the investigation of entangled states with non-trivial phase structures, one can utilise SNNS with network architectures according to Sec. III, with the separability conditions applied, and a natural gradient descent optimisation protocol. Particularly interesting states that fall into this category are cluster states, which are of great significance to quantum computing [6,15]. Given a d-dimensional square lattice of vertices $V = \{1, \ldots, N\}$, with connections between sites that define a neighbourhood $\mathcal{N}$ of connected vertices $(i, j)$, a cluster state is given by $|C\rangle = \prod_{(i,j) \in \mathcal{N}} U^{(i,j)} |+\rangle^{\otimes N}$, where $U^{(i,j)} = \mathrm{diag}(1, 1, 1, -1)$ is a controlled-phase gate between qubits on sites $(i, j)$. The four-qubit cluster state $|C_4\rangle$ built on a one-dimensional lattice is considered in Fig. 9.
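The construction of a one-dimensional cluster state can be sketched numerically as follows. The sketch assumes the standard definition, in which every qubit is prepared in the state |+⟩ and a controlled-phase gate diag(1, 1, 1, −1) is applied to each pair of neighbouring sites. It then evaluates the largest Schmidt coefficient across a chosen bipartition, which is the maximum overlap attainable by any state that is a product across that cut. The helper names are hypothetical and only serve this illustration.

```python
import numpy as np

def cluster_state_1d(n):
    """State vector of an n-qubit cluster state on an open chain."""
    plus = np.array([1.0, 1.0]) / np.sqrt(2.0)
    psi = plus
    for _ in range(n - 1):
        psi = np.kron(psi, plus)
    # Qubit 0 is the most significant bit of the basis-state index.
    index = np.arange(2 ** n)
    for i in range(n - 1):
        bit_i = (index >> (n - 1 - i)) & 1
        bit_j = (index >> (n - 2 - i)) & 1
        # Controlled-phase gate: flip the sign when both qubits are 1.
        psi = psi * (1 - 2 * (bit_i & bit_j))
    return psi

def max_overlap_bipartition(psi, n_left):
    """Largest Schmidt coefficient for the cut (first n_left qubits | rest)."""
    matrix = psi.reshape(2 ** n_left, -1)
    return np.linalg.svd(matrix, compute_uv=False)[0]

c4 = cluster_state_1d(4)
print(max_overlap_bipartition(c4, 2))  # overlap bound for the 12|34 cut
```

For the four-qubit chain, the 12|34 cut crosses a single controlled-phase bond, so the bound evaluates to 1/√2, in line with the fidelity reported for the biseparable learner in Fig. 9.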

V. CONCLUSIONS AND FURTHER LOOK
SNNSs offer a powerful and versatile tool with which to attack the problem of entanglement classification. With a sufficiently powerful learning mechanism, this classification method could be far reaching and prove powerful for larger, many-body quantum systems.
Nonetheless, the problem of entanglement classification will remain a considerable roadblock. For systems of large N, the number of ways in which a state may be entangled is overwhelmingly large, and thus a complete, global search of separability properties using all possible K-separable learners is unfeasible. The need for some a priori knowledge about the state, or about the form of separability one wishes to classify (such as full separability, or genuine N-partite entanglement), becomes essential, and greatly narrows the search. In this way, classification of larger systems becomes much more realistic using the SNNS method.
There are many further extensions and investigations worth pursuing following the introduction of SNNSs to classify entanglement. Most important is the extension from pure states to mixed states in a manner that maintains the power and efficiency that motivates this approach. Efficient ANN parameterisations of mixed states have been developed through the addition of a hidden "mixing" layer to the RBM architecture to create a Neural Density Matrix [16,17]. Research into encoding separability properties into these machine architectures is worth exploring. A starting point may be to generalise the network conditions that invoke $K_j$-separability into K-separability, which could improve both pure and mixed state simulation abilities. A further, exciting extension of this research may be to introduce generative neural network models that can numerically simulate higher dimensional quantum systems, i.e. qudits, and even infinite dimensional systems by constructing models within a finite-dimensional phase space. Reworking the neural network framework in a way that allows for this versatility, whilst maintaining the ability to manufacture properties such as separability and potentially Gaussianity, could provide a worthy tool that has far reaching applications in quantum communications, computing and more.
With growing interest being accrued at the interface of quantum information and machine learning [10,18], the integration of entanglement classification protocols offers an exciting avenue of exploration.
S-separable if it is described by this set of exact partitions. This is the most detailed level of entanglement classification we can achieve. Now let M be the set that defines the size of each of these entangled subsets, i.e. M for S is given by $\{m_j = |S_j|\}_{j=1,\ldots,K}$. A state is deemed M-separable if it is described by entangled sub-collections of these dimensions. Given an arbitrary form of K-separability, we wish to deduce how many forms of S-separability are attributed to it.
Whilst S-separability describes a specific separable order of entangled qubits, we can define M-separability as a specific separable order of $m_j$-dimension entangled qubit sets, which is therefore less degenerate than S (many ordered sets S may correspond to a single M). By finding the number of ways that a state may be M-separable, we can then use the degeneracy of M with respect to the creation of K partitions to find the total degeneracy.
When constructing an entanglement set (S, M), as each $S_j$ is filled with indices of entangled qubits, the possible choices of qubits for subsequent subsets diminish (since they are all disjoint). Hence for $m_j \in M$, the number of possible permutations is given by the multinomial coefficient,
$$P = \binom{N}{m_1, \ldots, m_K} = \frac{N!}{\prod_{j=1}^{K} m_j!}.$$
However, counting in this manner disregards the invariance of separabilities under reordering, i.e. shuffling subsets in S does not alter the separability of the state. Hence we must further reduce P by removing these duplicates. Such duplicates will only occur whenever the total set contains subsets of equivalent size, $m_i = m_j$ for some $i \neq j \in \{1, \ldots, K\}$. We define the function $g(l) = \sum_{i=1}^{K} \delta(m_i, l)$ as that which counts the degeneracy of subsets of size $l$, where $\delta$ is the Kronecker delta function.
We can then find the number of ways that an M-separable state is S-separable,
$$\frac{P}{\prod_{l=1}^{m} g(l)!} = \frac{N!}{\prod_{j=1}^{K} m_j! \, \prod_{l=1}^{m} g(l)!},$$
where $m = \max(M)$. We are now left to find how many ways a K-separable state can be constructed using M-separability. This is equivalent to searching for the number of solutions to $\sum_{i=1}^{K} m_i = N$, for $m_i \in M$ and fixed K. That is, in how many ways can one construct a set $\{m_1, \ldots, m_K\}$ such that these elements sum to N. The solution to this is given by the partition function of exactly K parts $P(n, K)$ [19], which has the generating function
$$\sum_{n \geq 0} P(n, K)\, x^n = \prod_{i=1}^{K} \frac{x}{1 - x^i},$$
and which can thus be used to determine the degeneracy of M-separability with respect to K-separability. Hence for an N-partite state, the number of ways in which we can arrange the N qubits into K entangled collections is given by
$$G_K = \sum_{M} \frac{N!}{\prod_{j=1}^{K} m_j! \, \prod_{l=1}^{m} g(l)!}, \qquad \mathrm{(B4)}$$
where the sum runs over the $P(N, K)$ distinct sets M. Therefore the total number of unique forms of separability (discounting genuine, complete multipartite entanglement) is $G = \sum_{K=2}^{N} G_K$. This result indeed agrees with the more concise answer to the total number of S-separabilities attributed to an N-partite quantum system. This is given by the Bell numbers, which can be calculated using Dobiński's formula [20],
$$B_N = \frac{1}{e} \sum_{k=0}^{\infty} \frac{k^N}{k!},$$
with $B_N = G + 1$ once the genuinely entangled case K = 1 is included. It can be seen that the first few Bell numbers do indeed generate the number of S-separabilities for N elements, $B_1, \ldots, B_5 = 1, 2, 5, 15, 52$. The Bell numbers count all forms of entanglement for an N-partite system with respect to S-separability. However, they do not detail the degeneracy of the more specific K-separability, which is instead given by Eq. (B4).
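The counting argument above can be checked by direct enumeration. The sketch below lists every way of splitting N into exactly K subset sizes, applies the multinomial count with the correction for equally sized subsets, and sums the results. The function names are illustrative only, and the comparison values are the known Bell numbers.

```python
from collections import Counter
from math import factorial

def partitions_exact(n, k, max_part=None):
    """Yield non-increasing tuples of k positive integers that sum to n."""
    if max_part is None:
        max_part = n
    if k == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n - k + 1, max_part), 0, -1):
        for rest in partitions_exact(n - first, k - 1, first):
            yield (first,) + rest

def count_s_separabilities(sizes):
    """Number of S-separabilities sharing the subset sizes in `sizes`."""
    count = factorial(sum(sizes))
    for m in sizes:
        count //= factorial(m)             # multinomial coefficient
    for multiplicity in Counter(sizes).values():
        count //= factorial(multiplicity)  # remove reorderings of equal sizes
    return count

def count_k_separabilities(n, k):
    """Number of ways to arrange n qubits into k entangled collections."""
    return sum(count_s_separabilities(s) for s in partitions_exact(n, k))

for n in range(2, 7):
    per_k = [count_k_separabilities(n, k) for k in range(1, n + 1)]
    print(n, per_k, sum(per_k))
```

For N = 3 and K = 2 the count is 3, matching the three-fold degeneracy of biseparability for three qubits, and the totals over all K reproduce the Bell numbers 2, 5, 15, 52 and 203 for N = 2 to 6.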


FIG. 1. Illustration of an N-qubit NNS based on an RBM of N binary artificial visible neurons and H binary artificial hidden neurons used to mediate the correlations within the system. There are NH weighted connections and N + H total neural biases.
FIG. 4. Graphical depiction of an N-qubit $K_j$-SNNS based on an RBM of N binary artificial visible neurons and H binary artificial hidden neurons that mediate the correlations within the system. Separability under the set of partitions $\{S_m\}_{m=1,\ldots,K}$ is enforced by performing segmentations throughout the network. Each set of qubits $S_m$ is fully connected to a set of hidden neurons $H_m$ ($|H_m| \neq |S_m|$ generally) but is independent from all other hidden neurons. There are $\sum_{m=1}^{K} |H_m||S_m|$ non-zero weighted connections, and N + H neural biases.
FIG. 5. Classification of two-qubit quantum states. Panel (a) reports the learning paths of a free learner $|\Psi^{\mathrm{Free}}_{\Omega}\rangle$ (blue) and a separable learner (orange) attempting to reconstruct a two-qubit Bell state $|\Psi^{+}\rangle = (|01\rangle + |10\rangle)/\sqrt{2}$, such that the separable learner $|\Psi^{1|2}_{\Omega}\rangle$ is unable to acquire a maximal fidelity, whilst the free learner easily achieves unit fidelity. Instead, $|\Psi^{1|2}_{\Omega}\rangle$ converges to the maximum fidelity that the set of separable states can acquire with $|\Psi^{+}\rangle$, which is $1/\sqrt{2}$. Panel (b) illustrates the behaviour of both the entangled learner $|\Psi^{\mathrm{Free}}_{\Omega}\rangle$ and the separable learner $|\Psi^{1|2}_{\Omega}\rangle$ converging to unit fidelity while reconstructing a separable two-qubit state $|\varphi\rangle = |+\rangle_1 |0\rangle_2$.
(a) depicts the relative fidelity and GME of a variable Bell state $|\varphi(p)\rangle = \sqrt{p}\,|00\rangle + \sqrt{1-p}\,|11\rangle$, constructed by monitoring the performance of the separable learner with $|\varphi(p)\rangle$ for many values of $p$ in the interval $p \in [0, 1]$. Similarly considering a variable, three qubit state |ϕ(p

FIG. 9. Entanglement classification for a four-qubit cluster state $|C_4\rangle$ built on a one-dimensional lattice, using an SNNS with a double hidden layer architecture, separating phase and amplitude learning. The free learner $|\Psi^{\mathrm{Free}}_{\Omega,\Xi}\rangle$ (blue) reconstructs $|C_4\rangle$ to the maximal degree over a sufficient number of learning iterations, whilst the biseparable learner $|\Psi^{12|34}_{\Omega,\Xi}\rangle$ (orange) reaches a maximal fidelity of $1/\sqrt{2}$.
Fig. 5 illustrates the use of SNNS for classification purposes when the target state is either the Bell state $|\Psi^{+}\rangle = (|01\rangle_{12} + |10\rangle_{12})/\sqrt{2}$ or the separable state $|\varphi\rangle = |+\rangle_1 |0\rangle_2$, with $\sigma_x |+\rangle = |+\rangle$ and $\sigma_x$ the x Pauli matrix.