Probing Criticality in Quantum Spin Chains with Neural Networks

The numerical emulation of quantum systems often requires an exponential number of degrees of freedom which translates to a computational bottleneck. Methods of machine learning have been used in adjacent fields for effective feature extraction and dimensionality reduction of high-dimensional datasets. Recent studies have revealed that neural networks are further suitable for the determination of macroscopic phases of matter and associated phase transitions as well as efficient quantum state representation. In this work, we address quantum phase transitions in quantum spin chains, namely the transverse field Ising chain and the anisotropic XY chain, and show that even neural networks with no hidden layers can be effectively trained to distinguish between magnetically ordered and disordered phases. Our neural network acts to predict the corresponding crossovers finite-size systems undergo. Our results extend to a wide class of interacting quantum many-body systems and illustrate the wide applicability of neural networks to many-body quantum physics.

Solving a quantum many-body problem often implies a coarse-graining procedure to remove redundant degrees of freedom from the short-range, or high-energy, sector of the theory. In this case, a proper elucidation of the low-energy properties of the system, or of the type of its long-range ordering, encodes the macroscopic behavior. In turn, machine learning on multidimensional and typically unstructured datasets is inevitably linked to effective approaches to dimensionality reduction, thereby yielding a powerful technique for the detailed analysis of classical and quantum models in many-body physics [18,19]. Practical application of neural networks in the context of both supervised and unsupervised machine learning has now become commonplace for testing thermal, quantum, and topological phase transitions [2,3,4,5,6,7,8,9,10,11] as well as for formulating effective variational wave function ansätze [12,13,14,15,16,17]. The application of machine learning to quantum information problems has also received significant interest recently, promising to directly probe the entanglement entropy [20,21,22] as well as other properties. The utility of machine learning methods for quantum information purposes is driven by their great success in condensed matter physics [23,24,25,26,27,28,5,29,30,31,32,33,34,35,36] and computational many-body methods [37,38,39,40,41,42]. In this study, we employ a specific machine learning technique to create a low-dimensional representation of microscopic states, relevant for macroscopic phase identification and probing phase transitions.
More specifically, we explore phase transitions in the transverse field Ising chain and the anisotropic XY model and demonstrate that even the simplest possible neural network architecture, a binary classifier in the form of a perceptron with no hidden neurons, is capable of keeping track of the macroscopic phases as a function of a control parameter such as the external magnetic field or the anisotropy parameter, without any prior knowledge.

Transverse field Ising model
One-dimensional spin models represent strongly correlated quantum systems that can be rigorously approached at equilibrium [43]. Certain non-equilibrium properties can also be extracted [44]. In the following, we focus on the one-dimensional ferromagnetic transverse field Ising model (TFIM). The TFIM naturally appears upon solving a classical two-dimensional Ising model with ferromagnetic-type nearest-neighbor exchange coupling, and its exact solution dates back to the original works [45,46,47]. Generally, the TFIM of $L$ spins on a chain with open boundary conditions is specified by the following Hamiltonian:

$$H = -J \sum_{i=1}^{L-1} \sigma_i^z \sigma_{i+1}^z - \tau \sum_{i=1}^{L} \sigma_i^x, \qquad (1)$$

which represents a $2^L \times 2^L$ matrix, with $\sigma_i^\alpha$ ($\alpha = x, y, z$) being a Pauli matrix acting on site $i$, and $J$ and $\tau$ standing for the strength of the exchange coupling and of the external magnetic field, respectively. Interestingly, despite its relative simplicity, this model was used to describe intricate physics, e.g., the order-disorder transitions in ferroelectric crystals of KH$_2$PO$_4$. At zero temperature, quantum fluctuations may lead to a restructuring of the ground state, which is manifested by a certain non-analyticity in the ground state energy of the quantum Hamiltonian. For the Hamiltonian (1), when there is no magnetic field present ($\tau = 0$) the ground state configuration is purely determined by the exchange interaction, the first term in Equation (1), which favors collinear magnetic ordering. For $J > 0$, the ferromagnetic state is energetically preferable, meaning that all magnetic moments point in the same direction, $\sigma_i^z = +1$ (or $-1$), signaling the double degeneracy of the ground state. Increasing the transverse field beyond the critical value $\tau = \tau_c$ makes the system susceptible to spin flips, and all the spins align along the $x$ direction in the limit $\tau \to \infty$, i.e., the system is disordered in the $\sigma^z$ basis.
The one-dimensional TFIM can be worked out analytically by virtue of the Jordan-Wigner transformation, which maps an interacting spin model onto that of free spin-polarized fermions [47,48]. The exact solution unambiguously demonstrates a continuous quantum phase transition (QPT) upon passing through the critical field $\tau_c = 1$ (in units of $J$), separating the magnetically ordered ferromagnetic ($\tau < \tau_c$) and the disordered paramagnetic ($\tau > \tau_c$) states. Although there is no exact analytical solution for higher-dimensional systems, a quantum phase transition can be clearly detected [48]. It is worth noting that the phase diagram of the one-dimensional TFIM is very similar to that of a two-dimensional classical Ising model at finite temperature with a temperature-driven phase transition. Interestingly, this correspondence has a strict mathematical form given by the so-called Suzuki-Trotter decomposition, which maps a $d$-dimensional quantum model to a $(d+1)$-dimensional classical one [49].
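The closing of the gap at the critical field can be made explicit. The expression below is the standard free-fermion result for a periodic chain in the thermodynamic limit, written under the assumption that the Hamiltonian has the form $H = -J\sum_i \sigma_i^z\sigma_{i+1}^z - \tau\sum_i \sigma_i^x$; the quasiparticle operators $\eta_k$ and the symbol $\Delta$ for the gap are notation introduced here for illustration.

```latex
H = \sum_k \varepsilon_k \left( \eta_k^\dagger \eta_k - \tfrac{1}{2} \right),
\qquad
\varepsilon_k = 2\sqrt{J^2 + \tau^2 - 2 J \tau \cos k},
\qquad
\Delta = \min_k \varepsilon_k = 2\,\lvert J - \tau \rvert .
```

The gap $\Delta$ vanishes linearly at $\tau = J$, i.e., at $\tau_c = 1$ in units of $J$, which is the point separating the ordered and disordered phases.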

Anisotropic XY model
The XY model is yet another well-known quantum spin lattice model of magnetism. One can arrive at the isotropic version of this model by switching off the ZZ couplings in the Heisenberg Hamiltonian. In turn, the anisotropic XY model generalizes it in the sense that the interaction strength in the XY plane is no longer isotropic. In this study, we limit ourselves to the case when there is no field transverse to the interaction plane. The Hamiltonian of the model is thus given by

$$H = J \sum_{i=1}^{L-1} \left[ (1+\gamma)\, \sigma_i^x \sigma_{i+1}^x + (1-\gamma)\, \sigma_i^y \sigma_{i+1}^y \right], \qquad (2)$$

where $\gamma$ is the anisotropy parameter, usually restricted to $-1 \le \gamma \le 1$, and $J$ is the coupling strength, which we set to 1 hereafter. If one sets $\gamma = 0$, the fully isotropic case, which possesses the additional symmetry $[H, \sum_i \sigma_i^z] = 0$, is restored. On the other hand, it is also well known that in the opposite case, i.e., $\gamma = 1$, the ground state possesses long-range Néel order in the $x$-components of the spins, and in the $y$-components for $\gamma = -1$, as described in detail in Ref. [50]. It is clear that as $\gamma$ decreases from 1 to $-1$, the $x$- and $y$-components begin to compete. The phase diagram is thus given by $x$- and $y$-ordered states for $\gamma = 1$ and $-1$, respectively. The model is fully isotropic at $\gamma = 0$ and undergoes a second-order phase transition at this point, where the gap continuously vanishes [50,51].
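The symmetry statement for the isotropic point can be checked numerically on a short chain. The sketch below assumes an anisotropic XY Hamiltonian of the form $J\sum_i[(1+\gamma)\sigma_i^x\sigma_{i+1}^x + (1-\gamma)\sigma_i^y\sigma_{i+1}^y]$ with open boundaries; the overall sign and normalization are assumptions and do not affect the commutator test. The helper names `embed`, `xy_hamiltonian`, `total_sz` and `commutator_norm` are hypothetical.

```python
import numpy as np

# Pauli matrices (complex dtype because sigma^y is complex).
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def embed(ops, n_spins):
    """Tensor product with ops[site] on the listed sites and identity elsewhere."""
    out = np.array([[1.0 + 0.0j]])
    for k in range(n_spins):
        out = np.kron(out, ops.get(k, I2))
    return out


def xy_hamiltonian(n_spins, gamma, coupling=1.0):
    """Dense open-chain anisotropic XY Hamiltonian (assumed sign and normalization)."""
    dim = 2 ** n_spins
    h = np.zeros((dim, dim), dtype=complex)
    for i in range(n_spins - 1):
        h += coupling * (1 + gamma) * embed({i: SX, i + 1: SX}, n_spins)
        h += coupling * (1 - gamma) * embed({i: SY, i + 1: SY}, n_spins)
    return h


def total_sz(n_spins):
    """Total z-magnetization operator, sum_i sigma^z_i."""
    return sum(embed({i: SZ}, n_spins) for i in range(n_spins))


def commutator_norm(gamma, n_spins=4):
    """Frobenius norm of [H(gamma), sum_i sigma^z_i]."""
    h = xy_hamiltonian(n_spins, gamma)
    sz = total_sz(n_spins)
    return float(np.linalg.norm(h @ sz - sz @ h))
```

Under these assumptions, `commutator_norm(0.0)` vanishes to numerical precision, while any nonzero `gamma` gives a finite value, reflecting the loss of the rotational symmetry about the $z$ axis.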

General overview
The complexity of a generic quantum many-body problem grows exponentially with the size of the system (using the best known methods), making the available numerical routines computationally demanding. Because machine learning has been specifically designed to coarse-grain information while maintaining the relevant and unique features of a dataset (reminiscent of the renormalization group formalism in statistical and high-energy physics [52]), it appears to be well suited for the identification of classical and quantum phases [25,53,54]. Indeed, sampled spin-1/2 configurations can be mapped to either binary numbers or black and white pixels, which can be further classified in the form of macroscopic configurations, representing the class of problems for which machine learning has been routinely used. However, for quantum many-body systems we typically do not have predefined labels, so the use of unsupervised learning is favored. Within this paradigm we search for clustering or association rules that govern the behavior of a system. Unsupervised learning can also take measurement data and essentially reconstruct the wave function from individual images or snapshots. These reconstruction techniques based on machine learning are now being studied and compared to traditional techniques based on quantum state and quantum process tomography [8,55,56,26,57]. The advantage of using machine learning algorithms for the exploration of both classical and quantum phase transitions is associated with finding certain features related to symmetry breaking in microscopic configurations. In particular, thermal phase transitions in magnetically ordered systems result in spin directions being randomized by the temperature, and the transition temperature can be detected as the point where the magnetization drops.
When considering quantum phase transitions one typically investigates a finite region of sudden change that shrinks in the thermodynamic limit to a single point of non-analyticity [58]. Alternatively, in the vicinity of a phase transition point one can examine the behavior of the order parameter, which is known to collapse, or of the correlation length, which diverges [48,59]. Passing through the phase transition point results in the ground state of the system being restructured, which is manifested by a certain non-analyticity in the ground state energy of the quantum Hamiltonian. It is therefore not surprising that the finite overlap between two different ground states of the system is regarded as a meaningful source of information on the quantum phases of the system, and it can be rigorously worked out within the fidelity approach [60,61].
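To make the overlap criterion concrete, the following is a minimal sketch of a ground-state fidelity scan for a small chain. It assumes the TFIM in the form $H = -J\sum_i\sigma_i^z\sigma_{i+1}^z - \tau\sum_i\sigma_i^x$ with $J = 1$ and open boundaries, and it uses dense diagonalization, which is feasible only for small $L$. The function names and the step `delta` are illustrative choices, not taken from the references.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])


def embed(op, site, n_spins):
    """Place a single-site operator at `site` in an n_spins chain."""
    out = np.array([[1.0]])
    for k in range(n_spins):
        out = np.kron(out, op if k == site else np.eye(2))
    return out


def tfim_parts(n_spins):
    """Return the exchange part and the field part of the assumed TFIM (J = 1)."""
    sz = [embed(SZ, i, n_spins) for i in range(n_spins)]
    h_zz = -sum(sz[i] @ sz[i + 1] for i in range(n_spins - 1))
    h_x = -sum(embed(SX, i, n_spins) for i in range(n_spins))
    return h_zz, h_x


def ground_state(h_zz, h_x, tau):
    """Lowest eigenvector of H(tau) = h_zz + tau * h_x."""
    _, vectors = np.linalg.eigh(h_zz + tau * h_x)
    return vectors[:, 0]


def fidelity(h_zz, h_x, tau, delta=0.05):
    """Overlap |<g(tau)|g(tau + delta)>| between neighboring ground states."""
    g1 = ground_state(h_zz, h_x, tau)
    g2 = ground_state(h_zz, h_x, tau + delta)
    return float(abs(np.vdot(g1, g2)))
```

Scanning `fidelity` over a grid of `tau` values, one would expect the overlap to stay close to 1 deep inside either phase and to dip in the crossover region, with the dip sharpening as the chain length grows.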

Sampling spin configurations
In this section, we briefly describe the sampling routine we used for the interacting spin models described by the Hamiltonians (1) and (2). Note that the Hamiltonians (1) and (2) are sparse matrices with most of the elements being zero, as schematically shown in Figure 1 for a system of $L = 7$ spins. This sparsity is what makes a numerical treatment of the Hamiltonians (1) and (2) possible. Let a $2^L$-dimensional vector $|g\rangle$ be the ground state of this system. In the computational basis the vector is fully determined by $2^L$ complex-valued expansion coefficients $\alpha_{i_1 i_2 \ldots i_L}$ in the basis $|i_k\rangle \in \{|\uparrow\rangle, |\downarrow\rangle\}$, with $k = 1, \ldots, L$, which give the probability distribution $p_{i_1 i_2 \ldots i_L} = |\alpha_{i_1 i_2 \ldots i_L}|^2$ of a particular spin configuration $|i_1\rangle |i_2\rangle \ldots |i_L\rangle$, which we refer to as a bitstring and later represent explicitly as strings of 0's and 1's. Thus, sampling the physical system specified by the Hamiltonian (1) can be approached by sampling each bitstring with the corresponding probability $p_{i_1 i_2 \ldots i_L}$.
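The sampling step can be sketched as follows for a chain small enough for dense diagonalization. The sketch assumes the TFIM in the form $H = -J\sum_i\sigma_i^z\sigma_{i+1}^z - \tau\sum_i\sigma_i^x$ and the encoding 0 for $|\uparrow\rangle$ and 1 for $|\downarrow\rangle$; the function names are hypothetical, and for chains as long as $L = 20$ a sparse eigensolver would be needed instead of `numpy.linalg.eigh`.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])


def site_operator(op, site, n_spins):
    """Embed a single-site operator at position `site` of an n_spins chain."""
    out = np.array([[1.0]])
    for k in range(n_spins):
        out = np.kron(out, op if k == site else np.eye(2))
    return out


def tfim_hamiltonian(n_spins, tau, coupling=1.0):
    """Dense open-chain TFIM matrix of size 2^L x 2^L (assumed sign convention)."""
    dim = 2 ** n_spins
    h = np.zeros((dim, dim))
    for i in range(n_spins - 1):
        h -= coupling * site_operator(SZ, i, n_spins) @ site_operator(SZ, i + 1, n_spins)
    for i in range(n_spins):
        h -= tau * site_operator(SX, i, n_spins)
    return h


def ground_state(hamiltonian):
    """Eigenvector belonging to the lowest eigenvalue."""
    _, vectors = np.linalg.eigh(hamiltonian)
    return vectors[:, 0]


def sample_bitstrings(state, n_samples, rng):
    """Draw bitstrings with probabilities |alpha|^2; returns an (n_samples, L) array."""
    n_spins = state.size.bit_length() - 1
    probs = np.abs(state) ** 2
    probs = probs / probs.sum()  # guard against rounding in the normalization
    indices = rng.choice(state.size, size=n_samples, p=probs)
    # The most significant bit of the basis index corresponds to the first spin.
    shifts = np.arange(n_spins - 1, -1, -1)
    return (indices[:, None] >> shifts) & 1
```

For example, `sample_bitstrings(ground_state(tfim_hamiltonian(4, 0.01)), 200, np.random.default_rng(0))` returns a `(200, 4)` array of 0's and 1's in which almost every row is fully aligned, as expected deep in the ordered phase.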

Neural network architecture
We use a neural network architecture that consists of an input layer and one output neuron, corresponding to a binary classifier. The sampled bitstrings serve as input data. Notably, there are no hidden layers. The output is prescribed to take the value 0 when an input spin configuration is drawn from the ground state at $\tau_1 = 0.01$ ($\gamma_1 = -1$), whereas if the configuration is drawn from the ground state at $\tau_i$ ($\gamma_i$), the neuron is prescribed to take the value 1. The neural network architecture used is shown in Figure 2.

Figure 2. The neural network design. $W_i$ denotes the weights connecting the input layer neurons with the output neuron, $\sigma_i$ denotes a spin value in the z-basis fed into the input layer, and the solid blue line denotes the sigmoid activation function of the output neuron.
The spins' z-projections $\sigma_i$ are fed into the neural network via the input layer, and their linear combination is passed through the nonlinear activation of the output neuron, with $\sigma(x) = 1/(1+e^{-x})$ being the sigmoid function and the binary cross-entropy serving as the loss function. Such a simple form of the neural network architecture results in high computational speed. The neural network output is the probability that the input state should be classified as belonging to the probability distribution specified by the respective control parameter value. We update the parameters of the neural network, the weights and the bias, using the RMSProp algorithm [62].
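A perceptron of this kind is small enough to write out in full. The following is a minimal sketch, assuming full-batch training and inputs encoded as $\pm 1$; the learning rate, decay constant, number of epochs and initialization scale are illustrative assumptions, since the text does not specify them, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy, clipped for numerical safety."""
    p = np.clip(probs, 1e-12, 1.0 - 1e-12)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))


def train_perceptron(inputs, labels, epochs=200, lr=0.01, decay=0.9,
                     eps=1e-8, seed=0):
    """Train weights and a bias with RMSProp; returns (weights, bias, losses)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = inputs.shape
    weights = rng.normal(scale=0.1, size=n_features)  # assumed initialization
    bias = 0.0
    v_w = np.zeros(n_features)  # running average of squared weight gradients
    v_b = 0.0                   # running average of squared bias gradients
    losses = []
    for _ in range(epochs):
        probs = sigmoid(inputs @ weights + bias)
        losses.append(binary_cross_entropy(probs, labels))
        # For sigmoid + cross-entropy the gradient w.r.t. the pre-activation
        # is simply (probs - labels).
        err = probs - labels
        g_w = inputs.T @ err / n_samples
        g_b = err.mean()
        v_w = decay * v_w + (1 - decay) * g_w ** 2
        v_b = decay * v_b + (1 - decay) * g_b ** 2
        weights -= lr * g_w / (np.sqrt(v_w) + eps)
        bias -= lr * g_b / (np.sqrt(v_b) + eps)
    return weights, bias, losses
```

Because there is only one layer, the trained `weights` vector has one entry per spin and can be plotted directly, which is what makes the model easy to interpret.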

The Algorithm
In our numerical simulations, for chains of $L = 20$ spins we explore the model described by Equation (1) for $D$ values $\tau_i$ of the transverse field, with $N = 10^4$ spin configurations sampled for each value of $\tau_i$. Afterwards, a feed-forward neural network $\mathcal{N}_i$ is trained to distinguish the bitstrings sampled for $\tau_1 = 0.01$ from those sampled for $\tau_i$. Finally, we end up with $D-1$ pairs $(P_i, \tau_i)$, with $P_i \in (0, 1)$ being the mean output of the neural network evaluated on the samples drawn from the probability distribution given by the ground state of $H(\tau_i)$. In what follows, we show that the value of $P$ changes dramatically as a function of $\tau$, signalling a phase transition. We apply a similar procedure to the anisotropic XY model with the anisotropy parameter $-1 \le \gamma \le 1$, starting with $\gamma_1 = -0.99$. The result is then averaged over 40 runs to suppress possible effects caused by the random initialization of the neural networks' parameters (the spread is displayed as shadows in the plots).
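The procedure can be sketched end to end for a small chain. This is a toy version under stated assumptions, not the setup used for the reported results: it uses $L = 6$, dense diagonalization, full-batch RMSProp with illustrative hyperparameters, and a small longitudinal field `h_sym` that is not part of the text. The field is added here because the exact ground state of a short chain is symmetric under a global spin flip, so every $\sigma_i$ would otherwise average to zero; `h_sym` selects one of the two nearly degenerate ordered states. All function names are hypothetical.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])


def embed(op, site, n_spins):
    """Place a single-site operator at `site` in an n_spins chain."""
    out = np.array([[1.0]])
    for k in range(n_spins):
        out = np.kron(out, op if k == site else np.eye(2))
    return out


def sample_spins(n_spins, tau, n_samples, rng, h_sym=1e-2):
    """Sample z-basis spin configurations (+1/-1) from the TFIM ground state."""
    sz = [embed(SZ, i, n_spins) for i in range(n_spins)]
    h = -sum(sz[i] @ sz[i + 1] for i in range(n_spins - 1))         # exchange, J = 1
    h = h - tau * sum(embed(SX, i, n_spins) for i in range(n_spins))  # transverse field
    h = h - h_sym * sum(sz)  # ASSUMPTION: weak symmetry-breaking field, not in the text
    ground = np.linalg.eigh(h)[1][:, 0]
    probs = np.abs(ground) ** 2
    idx = rng.choice(probs.size, size=n_samples, p=probs / probs.sum())
    bits = (idx[:, None] >> np.arange(n_spins - 1, -1, -1)) & 1
    return 1.0 - 2.0 * bits  # bit 0 -> spin +1, bit 1 -> spin -1


def train_and_score(ref, cur, rng, epochs=300, lr=0.05, decay=0.9, eps=1e-8):
    """Train a perceptron (label 0: ref, label 1: cur); return mean output on cur."""
    x = np.vstack([ref, cur])
    y = np.concatenate([np.zeros(len(ref)), np.ones(len(cur))])
    w = rng.normal(scale=0.01, size=x.shape[1])
    b = 0.0
    v_w, v_b = np.zeros_like(w), 0.0
    for _ in range(epochs):
        err = 1.0 / (1.0 + np.exp(-(x @ w + b))) - y  # sigmoid output minus label
        g_w, g_b = x.T @ err / len(y), err.mean()
        v_w = decay * v_w + (1 - decay) * g_w ** 2
        v_b = decay * v_b + (1 - decay) * g_b ** 2
        w -= lr * g_w / (np.sqrt(v_w) + eps)
        b -= lr * g_b / (np.sqrt(v_b) + eps)
    return float(np.mean(1.0 / (1.0 + np.exp(-(cur @ w + b)))))


def crossover_curve(taus, n_spins=6, n_samples=500, seed=0):
    """Return the pairs (tau_i, P_i) for i >= 2, with taus[0] as the reference."""
    rng = np.random.default_rng(seed)
    ref = sample_spins(n_spins, taus[0], n_samples, rng)
    return [(tau, train_and_score(ref, sample_spins(n_spins, tau, n_samples, rng), rng))
            for tau in taus[1:]]
```

Calling `crossover_curve([0.01, 0.2, 3.0])` yields two pairs $(\tau_i, P_i)$: for a field close to the reference value the two sample sets are almost indistinguishable and $P_i$ stays near 1/2, whereas deep in the disordered phase $P_i$ moves towards 1. Averaging over several seeds would mimic the averaging over runs described above.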

Results
Below, we present and discuss the results of our numerical simulations, demonstrating how the neural network architecture and the corresponding algorithm described in the previous section are capable of probing the phase crossover point for the described models. In Figure 3, we show how our setup performs for a TFIM on an open chain of $L = 20$ spins. As expected, the neural network learns the order parameter, owing to the linearity of the latter as a function of the spin projections. Note, however, that while the resulting curve is typical of a transverse magnetization curve for the TFIM, our setup contained no information about the x-projections of the spins, only the measurements in the z-basis. Unlike in previous studies, for example [63], the simplicity of the neural network used for the simulations makes direct visualization of the weights straightforward, owing to their vectorial nature. Figure 4 clearly displays the crossover in the neighborhood of criticality, making these results intuitively clear and interpretable, in contrast to the usual deep learning routines [64,65]. Each vertical row in Figure 4 corresponds to the set of coefficients by which the z-components of the spins are multiplied before the whole sum is passed to the activation function of the output neuron. Thus, the model actually mimics the z-projections of the spin configurations for a given value of the transverse magnetic field $\tau$. This explains why the rows in the heatmap are uniform in the ferromagnetic limit and take random values in the disordered phase. Note that the boundary coefficients are different because of the open boundary conditions.
In Figure 5, one can clearly see the phase crossover induced by the change of $\gamma$, which is a sign of the well-studied anisotropy-induced phase transition in an infinite system [66], similar to the phase transition induced by the critical value of the magnetic field. Again, while our algorithm is given information only about the z-components of the spins, it is capable of exposing a phase crossover induced by the anisotropy in the x-y plane.

Conclusion
In this paper we have considered the simplest neural network architecture, with no hidden layers, and applied it to study the finite-size phase crossovers in the quantum transverse field Ising model and the quantum anisotropic XY model on a one-dimensional chain. We were able to distinguish the regions of different phases using neural networks without prior knowledge of the phase diagram by observing the corresponding phase boundary crossover in a finite-size system. The relative simplicity of the machine learning setup allowed us to visualize the weights of the corresponding neural network and unambiguously relate this plot to the configurations of the different spin orderings.