Deep Neural Network Detects Quantum Phase Transition

We detect the quantum phase transition of a quantum many-body system by mapping observed measurement results of the quantum state onto a neural network. In the present study, we utilized the simplest case of a quantum many-body system, namely a one-dimensional chain of Ising spins described by the transverse-field Ising model. We prepared several spin configurations, obtained by repeated observations of the model at a particular strength of the transverse field, as input data for the neural network. Although the proposed method can be employed with experimental observations of quantum many-body systems, we tested our technique with spin configurations generated by a quantum Monte Carlo simulation without initial relaxation. The neural network successfully classified the strength of the transverse field from the spin configurations alone, leading to a consistent estimation of the critical point of our model, $\Gamma_c = J$.

In a previous study, 7) a convolutional neural network (CNN) was used to estimate a reasonable value of the critical point of the two-dimensional Ising model on a square lattice.
This neural network (NN) approach is not restricted to the case of the classical Ising model.
In general, NNs can be applied to various types of data. Thus, in the present study, we applied the technique established in the previous study to a quantum many-body system. In a quantum many-body system, we can observe the microscopic state through measurement. To obtain the expectation value of a physical observable, the measurements must be repeated. In particular, one typical measurement in quantum spin systems is of the direction of the spin variables.
In the present study, we consider the one-dimensional transverse-field Ising model, which shows a quantum phase transition. The present study is intended to serve as a test of the use of NNs in quantum many-body systems. We also investigate the significance of the NN structure. Unlike the previous study, 7) we employ multi-layer perceptrons (MLPs), the simplest form of NN, rather than convolutional neural networks (CNNs). Despite their simplicity, MLPs can provide a robust NN. CNNs capture complex features of images by adding a convolution process to MLPs. This process functions as a kind of real-space renormalization-group analysis, 9) which enables elucidation of the critical behavior of the system. 10) However, through our analysis, we confirmed that the convolution process is not necessary for extracting critical behavior. A simple MLP can also be used to estimate the precise value of critical points.
We consider a one-dimensional transverse-field Ising model, defined by the following Hamiltonian:
$$ H = H_0 + H_1, \tag{1} $$
where we define
$$ H_0 = -J \sum_{i=1}^{L} \sigma^z_i \sigma^z_{i+1} - h \sum_{i=1}^{L} \sigma^z_i \tag{2} $$
and
$$ H_1 = -\Gamma \sum_{i=1}^{L} \sigma^x_i. \tag{3} $$
Here, $h$ stands for the strength of the longitudinal magnetic field, $\Gamma$ represents the strength of the transverse magnetic field, and $\sigma^z_i$ and $\sigma^x_i$ are the $z$ and $x$ components of the Pauli matrix at site $i$, respectively. The symbol $L$ is the number of spins in the one-dimensional chain. This model has a quantum phase transition at $\Gamma_c = J$: the ordered phase occurs when $J > \Gamma$, and the disordered phase occurs when $\Gamma > J$. However, we cannot directly simulate the quantum many-body system because the Hamiltonian includes noncommuting operators. We therefore employ the Suzuki-Trotter decomposition 11) to express our model in terms of c-numbers:
$$ Z = \mathrm{Tr}\, e^{-\beta H} = \lim_{\tau \to \infty} \mathrm{Tr} \left( e^{-\beta H_0/\tau}\, e^{-\beta H_1/\tau} \right)^{\tau}, \tag{4} $$
where $\beta$ is the inverse temperature, $\beta = 1/T$, and $\tau$ is the Trotter number. The one-dimensional transverse-field Ising model is then mapped onto a two-dimensional classical Ising model using Eq. (4):
$$ Z \simeq \sum_{\{\sigma_{i,t}\}} e^{-\beta H_{\mathrm{eff}}}. \tag{5} $$
The effective Hamiltonian is given by
$$ H_{\mathrm{eff}} = -\frac{J}{\tau} \sum_{i=1}^{L} \sum_{t=1}^{\tau} \sigma_{i,t} \sigma_{i+1,t} - \gamma \sum_{i=1}^{L} \sum_{t=1}^{\tau} \sigma_{i,t} \sigma_{i,t+1} - \frac{h}{\tau} \sum_{i=1}^{L} \sum_{t=1}^{\tau} \sigma_{i,t}, \tag{6} $$
where $\gamma = -\log(\tanh(\beta\Gamma/\tau))/2\beta$ and $\sigma_{i,t} \in \{-1, 1\}$. We impose the periodic boundary conditions $\sigma_{L+1,t} = \sigma_{1,t}$ and $\sigma_{i,\tau+1} = \sigma_{i,1}$. Therefore, our model becomes a classical model.
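Concretely, the mapping reduces to computing two classical couplings from $(J, \Gamma, \beta, \tau)$. The following sketch (function name and parameter values are our own, chosen for illustration) evaluates the spatial coupling $\beta J/\tau$ along the chain and the imaginary-time coupling $\beta\gamma$:

```python
import numpy as np

def effective_couplings(J, Gamma, beta, tau):
    """Dimensionless couplings of the classical L x tau Ising model obtained
    from the Suzuki-Trotter decomposition (illustrative helper).

    Returns (K_space, K_time): the spatial coupling beta*J/tau and the
    imaginary-time coupling beta*gamma, where
    gamma = -log(tanh(beta*Gamma/tau)) / (2*beta)."""
    K_space = beta * J / tau
    gamma = -np.log(np.tanh(beta * Gamma / tau)) / (2.0 * beta)
    K_time = beta * gamma
    return K_space, K_time

# example values at the critical point Gamma = J
Ks, Kt = effective_couplings(J=1.0, Gamma=1.0, beta=16.0, tau=32)
```

Note that $K_{\mathrm{time}}$ is always positive and diverges as $\beta\Gamma/\tau \to 0$, so the imaginary-time direction becomes strongly ferromagnetic for fine Trotter slicing.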
In the classical expression, one dimension is the original spatial dimension, while the other dimension expresses imaginary time. Imaginary time can be interpreted as repeated observations on the quantum system. We regard the two-dimensional expression of our model as a sequence of measurement results of spin configurations on the one-dimensional chain. The following analysis is not restricted to the case of the one-dimensional Ising model with a transverse field. In general, our analysis can be applied to any quantum many-body system.
In the present study, we demonstrate the procedure for detecting the quantum phase transition and reduce the complexity in constructing the NN to simplify this process, making it more accessible than the previous study.
The technique can be applied to so-called non-stochastic Hamiltonians, to which quantum Monte Carlo simulation cannot be applied in a straightforward manner. 12) In such cases, classical computer simulation is not useful for obtaining the microscopic state except in several special cases. 13) In order to obtain the spin configurations, we may also utilize a quantum simulator such as the D-Wave machine, a well-known system that has manipulated a 2000-qubit Ising model through quantum annealing by tuning the strength of the transverse field on the chimera graph. 14-16) Here, we show the process of training the MLP in detail for those who may be unfamiliar with machine learning. The MLP realizes a complicated function relating input to output by repeated linear and nonlinear transformations. We prepare pairs of input and output data and fit the MLP to these data while optimizing the parameters of the MLP. This technique is known as supervised learning. 17) We pair the discretized values of the transverse field $\Gamma_d$ with the corresponding spin configurations $\sigma_d$:
$$ D_{\mathrm{Transverse}} = \{ (\Gamma_d, \sigma_d) \}_{d=1}^{N_{\mathrm{data}}}, \tag{7} $$
where the spin configurations $\sigma_d$ are denoted
$$ \sigma_d = \{ \sigma_{i,t} \}_{i=1,\ldots,L;\ t=1,\ldots,\tau}. \tag{8} $$
The first subscript denotes the index of space, and the second subscript denotes the index of imaginary time in the quantum Monte Carlo simulation, or that of the sequence of repeated measurements from an actual experiment.
To feed the spin configurations into the MLP during training, we employ the following flattened representation of $\sigma_d$:
$$ \sigma_d = ( \sigma_{1,1}, \sigma_{2,1}, \ldots, \sigma_{L,1}, \sigma_{1,2}, \ldots, \sigma_{L,\tau} )^{\mathrm{T}}, \tag{9} $$
where $\tau$ is the Trotter number. Detecting the value of the transverse field from a spin configuration is treated as a multi-class classification problem. We discretize the real value $\Gamma \in [0, 2)$ into the class index
$$ k = \left\lfloor \frac{N_{\mathrm{class}}\, \Gamma}{2} \right\rfloor, \tag{10} $$
where $N_{\mathrm{class}}$ is the number of classes, and represent class $k$ by the vector $\Gamma$ whose $k$th component is 1 and whose remaining components are 0. This encoding is often called the one-hot representation.
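The discretization into one-hot labels can be sketched as follows; the floor-based binning rule and the choice `n_class = 100` are assumptions for illustration, since the text specifies only the interval $[0, 2)$ and a class count $N_{\mathrm{class}}$:

```python
import numpy as np

def discretize(Gamma, n_class=100):
    """Map a real transverse field Gamma in [0, 2) to a class index k
    (floor binning is an assumed rule; n_class is illustrative)."""
    return int(Gamma * n_class / 2.0)

def one_hot(k, n_class=100):
    """One-hot vector with a 1 at class k and 0 elsewhere."""
    v = np.zeros(n_class)
    v[k] = 1.0
    return v

k = discretize(0.99)   # a field value just below the critical point
label = one_hot(k)
```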
Unlike the previous study, 7) we consider a very simple three-layer MLP in the present study, consisting of the input, hidden, and output layers. We stack these layers to design a nontrivial function.
From the input layer to the hidden layer, we compute
$$ z^{(1)}_k = \phi\!\left( \mathbf{w}^{(1)\mathrm{T}}_k \sigma_d + b^{(1)}_k \right) \tag{11} $$
for $k = 1, 2, \ldots, N_h$, where $\mathbf{w}^{(1)}_k$ is the weight vector connecting to unit $k$ in the hidden layer, $N_h$ is the number of hidden units, and $b^{(1)}_k$ is the bias. The weights and biases perform a linear transformation. We use the rectified linear function (ReLU) 18) as the nonlinear transformation $\phi$, which is often called the activation function. The ReLU function is defined by
$$ \mathrm{ReLU}(u) = \max(0, u). \tag{12} $$
This type of activation function is employed to prevent the gradient from vanishing, which hampers efficient learning by the gradient descent method. From the hidden layer to the output layer, we compute
$$ y_k = \mathrm{softmax}\!\left( \mathbf{w}^{(2)\mathrm{T}}_k \mathbf{z}^{(1)} + b^{(2)}_k \right) \tag{13} $$
for $k = 1, 2, \ldots, N_{\mathrm{class}}$, where we represent $(z^{(1)}_1, z^{(1)}_2, \ldots, z^{(1)}_{N_h})^{\mathrm{T}}$ as $\mathbf{z}^{(1)}$ and
$$ \mathrm{softmax}(u_k) = \frac{\exp(u_k)}{\sum_{j=1}^{N_{\mathrm{class}}} \exp(u_j)}. \tag{14} $$
The activation function (14) is called the softmax function. These processes are known as forward propagation. We then calculate the loss function
$$ E = -\sum_{k=1}^{N_{\mathrm{class}}} \Gamma_k \log y_k, \tag{15} $$
where $\Gamma_k$ is the $k$th element of the one-hot vector $\Gamma$. This loss function is called the cross entropy. We minimize the summation of the loss function over all components of the dataset and update the parameters $W, b$ using gradient descent. One efficient algorithm for optimizing the weights while avoiding a learning plateau at a saddle point of the loss function is the Adam method. 19) To obtain generalized performance, we employ mini-batch learning, 20) in which the summation of the loss function is approximated by a partial summation over a mini-batch of randomly chosen components of the dataset. When the batch size is small, the MLP tends to attain generalized performance. 21) We utilize weight decay 22) via $L_2$-norm regularization to avoid overfitting to the training data. This technique is a standard method for achieving more generalized MLP performance.
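The forward propagation and cross-entropy loss just described can be sketched as follows; the layer sizes and random initialization are illustrative, not the values used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(u):
    return np.maximum(0.0, u)

def softmax(u):
    e = np.exp(u - u.max())          # shift for numerical stability
    return e / e.sum()

def forward(x, W1, b1, W2, b2):
    """Forward propagation of a three-layer MLP: linear map, ReLU,
    linear map, softmax."""
    z1 = relu(W1 @ x + b1)           # input -> hidden
    y = softmax(W2 @ z1 + b2)        # hidden -> output (class probabilities)
    return z1, y

def cross_entropy(y, target):
    """Cross-entropy loss against a one-hot target vector."""
    return -np.sum(target * np.log(y + 1e-12))

# toy sizes: L*tau = 8 inputs, N_h = 16 hidden units, N_class = 4 classes
W1 = 0.1 * rng.standard_normal((16, 8)); b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((4, 16)); b2 = np.zeros(4)
x = rng.choice([-1.0, 1.0], size=8)  # a flattened spin configuration
z1, y = forward(x, W1, b1, W2, b2)
```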
The process of calculating the gradient of the loss function and updating the weights is called backpropagation. We show the MLP learning protocol in Algorithm 1. An epoch is one learning pass over the dataset. The symbol $N_m$ represents the number of mini-batches.
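A minimal sketch of an Algorithm 1-style mini-batch learning loop is given below, on synthetic data standing in for the spin-configuration dataset. For brevity it uses plain gradient descent rather than the Adam method and omits weight decay; all sizes, names, and hyperparameters are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_h, n_class = 8, 32, 4
W1 = 0.1 * rng.standard_normal((n_h, n_in)); b1 = np.zeros(n_h)
W2 = 0.1 * rng.standard_normal((n_class, n_h)); b2 = np.zeros(n_class)

# synthetic stand-in for (spin configuration, field class) pairs
X = rng.choice([-1.0, 1.0], size=(256, n_in))
T = np.eye(n_class)[rng.integers(0, n_class, size=256)]   # one-hot labels

def batch_loss_and_grads(Xb, Tb):
    """Forward and backward pass over one mini-batch; softmax with
    cross entropy gives the simple output error y - t."""
    A1 = Xb @ W1.T + b1
    Z1 = np.maximum(0.0, A1)                      # ReLU
    logits = Z1 @ W2.T + b2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    Y = np.exp(logits); Y /= Y.sum(axis=1, keepdims=True)
    loss = -np.mean(np.sum(Tb * np.log(Y + 1e-12), axis=1))
    d2 = (Y - Tb) / len(Xb)                       # dE/dlogits
    gW2 = d2.T @ Z1; gb2 = d2.sum(axis=0)
    d1 = (d2 @ W2) * (A1 > 0)                     # backprop through ReLU
    gW1 = d1.T @ Xb; gb1 = d1.sum(axis=0)
    return loss, gW1, gb1, gW2, gb2

eta, batch = 0.1, 32
loss0 = batch_loss_and_grads(X, T)[0]
for epoch in range(50):
    idx = rng.permutation(len(X))
    for s in range(0, len(X), batch):             # loop over mini-batches
        b = idx[s:s + batch]
        _, gW1, gb1, gW2, gb2 = batch_loss_and_grads(X[b], T[b])
        W1 -= eta * gW1; b1 -= eta * gb1
        W2 -= eta * gW2; b2 -= eta * gb2
loss1 = batch_loss_and_grads(X, T)[0]             # training loss decreases
```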
Before discussing our experimental results, we describe our experimental environment.
We perform the quantum Monte Carlo simulation 23) of the effective Hamiltonian (6) with $h = -10^{-3}$. We generate 100 spin configurations $\sigma$ for each class $k$, taking 100 Monte Carlo steps (MCS) for each sample using the Metropolis method. 24,25) We do not discard the spin configurations generated during the initial relaxation. Starting from random initial configurations, we obtain 100 spin configurations at each $k$. Therefore, the spin configurations are not necessarily in equilibrium around the critical point. In this sense, the MLP might be able to infer the strength of the transverse field from nonequilibrium behavior such as nonequilibrium relaxation. 26) We do not yet have a definite conclusion about the effectiveness of this nonequilibrium behavior. However, when the equilibrium spin configurations, which are the output from the system, give rise to similar behavior of the weights in the MLP, we cannot obtain a precise estimation, as shown below. We discuss this point in more detail later.
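A minimal sketch of such a Metropolis simulation of the effective classical model is given below. The lattice sizes, $\beta$, and the sweep count are illustrative assumptions, and, as in the text, the configuration is taken from a random start without discarding the initial relaxation:

```python
import numpy as np

def sample_configuration(L=32, tau=16, J=1.0, Gamma=1.0, beta=16.0,
                         h=-1e-3, n_sweeps=100, seed=0):
    """Metropolis sampling of the effective classical model of Eq. (6):
    a periodic L x tau lattice with spatial coupling beta*J/tau,
    imaginary-time coupling beta*gamma, and field beta*h/tau.
    Returns the spin configuration after n_sweeps, starting from a
    random state (no initial relaxation is discarded)."""
    rng = np.random.default_rng(seed)
    Ks = beta * J / tau
    Kt = -np.log(np.tanh(beta * Gamma / tau)) / 2.0   # = beta * gamma
    Kh = beta * h / tau
    s = rng.choice([-1, 1], size=(L, tau))
    for _ in range(n_sweeps):
        for _ in range(L * tau):                      # one Monte Carlo sweep
            i, t = rng.integers(L), rng.integers(tau)
            # local field from the four periodic neighbours plus h
            nb = (Ks * (s[(i + 1) % L, t] + s[i - 1, t])
                  + Kt * (s[i, (t + 1) % tau] + s[i, t - 1]) + Kh)
            dE = 2.0 * s[i, t] * nb                   # cost of flipping s[i,t]
            if dE <= 0 or rng.random() < np.exp(-dE):
                s[i, t] *= -1
    return s

config = sample_configuration(L=8, tau=8, n_sweeps=20)
```

Flattening `config` row by row yields one input vector of the form (9) for the MLP.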
In the present three-layer MLP, the number of units in the input layer is $L \times \tau$, the number of units in the hidden layer is $N_h = 2000$, and the number of units in the output layer is $N_{\mathrm{class}}$. Following Algorithm 1, the MLP learns the dataset $D_{\mathrm{Transverse}}$. We show the heat map of the weights from the hidden layer to the output layer after 50 learning epochs in Fig. 2. We observe a sharp change in the weights near $\Gamma_c$; Fig. 2 indicates that the distribution of the weights changes near the critical point $\Gamma_c = 1$. Similarly to the previous study, 7) to extract a characteristic quantity from the weights, we define
$$ W(\tilde\Gamma(k)) = \frac{1}{N_h} \mathbf{1}^{\mathrm{T}} \left| \mathbf{W}^{(2)}_k \right|, \tag{16} $$
where $k$ represents the class of the discretized transverse field and $\mathbf{W}^{(2)}_k$ is the vector of weights connected to unit $k$ in the output layer. This quantity can be interpreted as a type of magnetization order parameter.
The notation $|\mathbf{W}^{(2)}_k|$ indicates the elementwise absolute value of $\mathbf{W}^{(2)}_k$. The vector $\mathbf{1}$ is defined by $\mathbf{1}^{\mathrm{T}} = (1, 1, \ldots, 1)$ and has $N_h$ components. In the present study, we also introduce the variance of $\mathbf{W}^{(2)}_k$ as
$$ W_{\mathrm{var}}(\tilde\Gamma(k)) = \frac{1}{N_h} \sum_{j=1}^{N_h} \left( |W^{(2)}_{k,j}| - W(\tilde\Gamma(k)) \right)^2. \tag{17} $$
This is the variance of the order parameter, which can often signify the phase transition in Markov-chain Monte Carlo methods, similarly to computing the Binder parameter to estimate the precise location of the critical point. We substitute $\tilde\Gamma(k)$ for class $k$ as
$$ \tilde\Gamma(k) = \frac{2k}{N_{\mathrm{class}}}, \tag{18} $$
where $\tilde\Gamma(k)$ represents the transverse field corresponding to class $k$.
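The two weight statistics (16) and (17) can be computed from the hidden-to-output weight matrix as follows; this sketch assumes a weight matrix of shape $(N_{\mathrm{class}}, N_h)$ and the mapping $\tilde\Gamma(k) = 2k/N_{\mathrm{class}}$, and uses a random matrix in place of trained weights:

```python
import numpy as np

def weight_statistics(W2):
    """Per-class mean absolute weight W(k) and its variance W_var(k),
    computed from a hidden-to-output weight matrix W2 of shape
    (n_class, N_h). A reconstruction of Eqs. (16)-(17) as read from
    the text, not the paper's own code."""
    A = np.abs(W2)                    # |W^(2)_k| elementwise
    W_mean = A.mean(axis=1)           # (1/N_h) * 1^T |W^(2)_k|
    W_var = A.var(axis=1)             # variance over hidden units
    n_class = len(W_mean)
    Gamma_of_k = 2.0 * np.arange(n_class) / n_class   # assumed Gamma(k)
    return Gamma_of_k, W_mean, W_var

rng = np.random.default_rng(0)
g, wm, wv = weight_statistics(rng.standard_normal((100, 2000)))
```

For trained weights, plotting `wm` and `wv` against `g` reproduces the curves whose sharp change locates the critical point.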
Similarly to the previous research, 7) we fit each of the two defined quantities to a sigmoid-type function with parameters $a$, $b$, $\hat\Gamma$, and $c$ using the nonlinear least-squares method, where $\hat\Gamma$ is the estimate of the critical point. The point $\hat\Gamma$ indicates the location where the fitted function changes sharply, as shown in Figs. 2 and 3. We obtain the estimates $\hat\Gamma = 0.9870$ from (16) and $\hat\Gamma = 0.9544$ from (17). The relative error between the exact solution $\Gamma_c = 1$ and the estimates from (16) and (17) is 1.3% and 4.56%, respectively.
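A sketch of such a nonlinear least-squares fit is given below, using `scipy.optimize.curve_fit` on synthetic data with a sharp change at $\Gamma = 1$ standing in for $W(\tilde\Gamma(k))$; the tanh step form is an assumption for illustration, since the fitting function itself is not reproduced in this excerpt:

```python
import numpy as np
from scipy.optimize import curve_fit

def step(x, a, b, x0, c):
    """Sigmoid-type step a*tanh(b*(x - x0)) + c; x0 plays the role of
    the estimated critical point (assumed functional form)."""
    return a * np.tanh(b * (x - x0)) + c

# synthetic "order parameter" data with a sharp change at Gamma = 1
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 100)
ydata = np.tanh(8.0 * (x - 1.0)) + 0.5 + 0.01 * rng.standard_normal(100)

popt, _ = curve_fit(step, x, ydata, p0=[1.0, 5.0, 0.9, 0.5])
gamma_hat = popt[2]   # recovered location of the sharp change
```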
The order parameters (16) and (17) can also be computed from sequential observations of experimental results instead of the imaginary-time distribution of the quantum system. Typically, Monte Carlo simulation requires substantial computational time to estimate the location of the critical point precisely, because long relaxation times are needed to reach the equilibrium distribution. Our estimation, by contrast, is accurate even without long relaxation times. In addition, unlike the previous study, 7) which tested a CNN on the two-dimensional Ising model on a square lattice, we utilized simple MLPs in this study. We showed that the one-dimensional transverse-field Ising model is essentially equivalent to the classical two-dimensional Ising model, and we confirmed that the MLP has sufficient performance to detect critical behavior. Therefore, the convolution process is not necessary for detecting critical behavior. We emphasize that this method has the potential to be applied to models with non-trivial order parameters.
In our future work, we plan to extend this method to models with topological phases, such as the XY model.