Deep Learning the Quantum Phase Transitions in Random Two-Dimensional Electron Systems

Random electron systems show rich phases such as the Anderson insulator, diffusive metal, quantum and anomalous quantum Hall insulators, Weyl semimetal, and strong/weak topological insulators. Eigenfunctions of each phase have specific features, but owing to the random nature of these systems, determining the phase from eigenfunctions is difficult. Here we propose a deep learning algorithm to capture the features of eigenfunctions. The localization-delocalization transition and the disordered Chern insulator-Anderson insulator transition are discussed.

Recently, there has been great progress on image recognition algorithms 15) based on deep machine learning. 16,17) Machine learning has recently been applied to several problems of condensed matter physics such as Ising and spin ice models 18,19) and strongly correlated systems. 20-25) In this Letter, we test the image recognition algorithm to determine whether the eigenfunctions for relatively small systems are localized/delocalized, and topological/nontopological. As examples, we test two types of two-dimensional (2D) quantum phase transitions: the Anderson-type localization-delocalization transition in symplectic systems, and the disordered Chern insulator to Anderson insulator transition in unitary systems.
Distinguishing Localized States from Delocalized Ones: We start with a 2D symplectic system, which is realized in the presence of spin-orbit scattering. We use the SU(2) Hamiltonian 26) that describes a 2D electron on a square lattice with nearest-neighbor hopping,

$$H = \sum_{i,\sigma} \epsilon_i c^\dagger_{i,\sigma} c_{i,\sigma} - \sum_{\langle i,j \rangle,\sigma,\sigma'} R(i,j)_{\sigma,\sigma'} c^\dagger_{i,\sigma} c_{j,\sigma'}, \qquad (1)$$

where $c^\dagger_{i,\sigma}$ ($c_{i,\sigma}$) denotes the creation (annihilation) operator of an electron at site $i = (x, y)$ with spin $\sigma$, and $\epsilon_i$ denotes the random potential at site $i$. We assume a box distribution with each $\epsilon_i$ uniformly and independently distributed on the interval $[-W/2, W/2]$. The modulus of the transfer energy is taken to be the energy unit. $R(i, j)$ is an SU(2) matrix,

$$R(i,j) = \begin{pmatrix} e^{i\alpha_{i,j}} \cos\beta_{i,j} & e^{i\gamma_{i,j}} \sin\beta_{i,j} \\ -e^{-i\gamma_{i,j}} \sin\beta_{i,j} & e^{-i\alpha_{i,j}} \cos\beta_{i,j} \end{pmatrix}, \qquad (2)$$

with $\alpha$ and $\gamma$ uniformly distributed in the range $[0, 2\pi)$. The probability density $P(\beta)$ is

$$P(\beta) = \begin{cases} \sin(2\beta) & 0 \le \beta \le \pi/2, \\ 0 & \text{otherwise}. \end{cases} \qquad (3)$$

For $E = 0$ (band center), it is known from finite size scaling analyses of the quasi-1D localization length 26,27) that the states are delocalized when $W < W_c^{\mathrm{SU2}}$ ($\approx 6.20$), while they are localized when $W > W_c^{\mathrm{SU2}}$. We impose periodic boundary conditions in the x- and y-directions, and diagonalize systems of 40 × 40 sites. From the resulting 3200 eigenstates with Kramers degeneracy, we pick the 1600th eigenstate (i.e., a state close to the band center). For simplicity, the eigenfunction is shifted so that its maximum modulus lies at the center of the system. Changing [...] whether the states belong to the localized (delocalized) phase.
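The sample preparation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes the standard SU(2)-model form, with hopping term $-R(i,j)$, $R$ parametrized by $(\alpha, \beta, \gamma)$, and $P(\beta) = \sin 2\beta$ on $[0, \pi/2]$, sampled by inverting its cumulative distribution $\sin^2\beta$. The function names `random_su2`, `su2_hamiltonian`, and `centered_density`, the site ordering, and the use of the spin-summed density $|\psi|^2$ as the image are all hypothetical choices.

```python
import numpy as np

def random_su2(rng):
    """Draw one SU(2) hopping matrix (assumed standard SU(2)-model form)."""
    alpha, gamma = rng.uniform(0.0, 2.0 * np.pi, size=2)
    # P(beta) = sin(2 beta) on [0, pi/2] has CDF sin^2(beta); invert it.
    beta = np.arcsin(np.sqrt(rng.uniform()))
    c, s = np.cos(beta), np.sin(beta)
    return np.array([[np.exp(1j * alpha) * c, np.exp(1j * gamma) * s],
                     [-np.exp(-1j * gamma) * s, np.exp(-1j * alpha) * c]])

def su2_hamiltonian(L, W, rng):
    """Dense 2L^2 x 2L^2 Hamiltonian on an L x L lattice, periodic boundaries."""
    H = np.zeros((2 * L * L, 2 * L * L), dtype=complex)
    site = lambda x, y: 2 * ((x % L) * L + (y % L))  # index of the spin-up entry
    for x in range(L):
        for y in range(L):
            i = site(x, y)
            # Box-distributed on-site potential, identical for both spins.
            H[i, i] = H[i + 1, i + 1] = rng.uniform(-W / 2, W / 2)
            for dx, dy in ((1, 0), (0, 1)):  # one bond per direction
                j = site(x + dx, y + dy)
                R = random_su2(rng)
                H[i:i + 2, j:j + 2] -= R
                H[j:j + 2, i:i + 2] -= R.conj().T  # Hermitian conjugate bond
    return H

def centered_density(psi, L):
    """Spin-summed |psi|^2 on the lattice, rolled so its maximum sits at the center."""
    rho = (np.abs(psi.reshape(L, L, 2)) ** 2).sum(axis=2)
    x0, y0 = np.unravel_index(np.argmax(rho), rho.shape)
    return np.roll(rho, (L // 2 - x0, L // 2 - y0), axis=(0, 1))
```

With `E, V = np.linalg.eigh(su2_hamiltonian(40, W, rng))`, the 1600th eigenstate is the column `V[:, 1599]`, and `centered_density(V[:, 1599], 40)` gives one 40 × 40 image. Dense diagonalization is used here only for simplicity.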
For our network architecture, we consider two types of simple convolutional neural network (CNN), each of which outputs two real numbers, i.e., the probabilities for each phase, given a 40 × 40 input eigenfunction. The first one is a very simple network with two weight layers, which first convolves the input with a 5 × 5 filter with stride 1 to 10 channels, then applies max pooling with a kernel size of 2 × 2 and stride 2, and finally performs a fully connected linear transformation [...] judge whether the states are localized or not. The resulting probability for an eigenfunction to be delocalized, P, is shown in Fig. 3(a).
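To make the shapes in the two-weight-layer network concrete, the following is a minimal NumPy sketch of its forward pass only (no training). Several details are assumptions not stated in the text above: the convolution is taken without padding (so 40 × 40 becomes 36 × 36, then 18 × 18 after pooling), no extra activation is inserted between the layers, a softmax converts the two outputs into probabilities, and the weights are random placeholders. The names `conv2d_valid`, `max_pool_2x2`, `shallow_cnn`, and `init_params` are hypothetical.

```python
import numpy as np

def conv2d_valid(img, kernels, bias):
    """Cross-correlate img (H, W) with kernels (C, k, k); stride 1, no padding."""
    k = kernels.shape[-1]
    # windows has shape (H-k+1, W-k+1, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(img, (k, k))
    return np.einsum("xyij,cij->cxy", windows, kernels) + bias[:, None, None]

def max_pool_2x2(fmap):
    """Max pooling with a 2 x 2 kernel and stride 2 on a (C, H, W) array (H, W even)."""
    C, H, W = fmap.shape
    return fmap.reshape(C, H // 2, 2, W // 2, 2).max(axis=(2, 4))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def shallow_cnn(img, params):
    """Conv (5x5, 10 channels) -> max pool -> fully connected -> 2 probabilities."""
    h = conv2d_valid(img, params["conv_w"], params["conv_b"])   # (10, 36, 36)
    h = max_pool_2x2(h)                                          # (10, 18, 18)
    logits = params["fc_w"] @ h.ravel() + params["fc_b"]        # (2,)
    return softmax(logits)

def init_params(rng, size=40, channels=10, k=5, n_out=2):
    """Random placeholder weights (assumption); a real model would be trained."""
    pooled = (size - k + 1) // 2
    return {
        "conv_w": rng.normal(0.0, 0.1, (channels, k, k)),
        "conv_b": np.zeros(channels),
        "fc_w": rng.normal(0.0, 0.01, (n_out, channels * pooled * pooled)),
        "fc_b": np.zeros(n_out),
    }
```

Under these assumptions the fully connected layer sees 10 × 18 × 18 = 3240 features, so the whole network has only a few thousand parameters.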
We then apply the results of the learning around E = 0 to judge whether the states around E = 1.0, 2.0, and 3.0 are delocalized. The results are shown in Fig. 3(b), in which we observe that, with increasing E, that is, as we move from the band center to the band edge, the electron begins to localize at a smaller disorder strength W, qualitatively consistent with the finite size scaling analysis. 27) There seems to be, however, a systematic deviation between the 50% criterion for the localization-delocalization transition and the actual critical point with increasing E. This may be due to the appearance of bound states near the band edge, which are absent from the training data around E = 0. We have further applied the results of the SU(2) model machine learning to the Ando model, 32) and verified that once the machine learns the eigenfunction features in certain systems, it can be applied to other systems belonging to the same class of quantum phase transition (see Supplemental material for details 33)).
Distinguishing Topological Edge States from Non-topological Ones: We next study the topological Chern insulator to nontopological Anderson insulator transition. 34-36) We use a spinless two-orbital tight-binding model on a square lattice, which consists of an s-orbital and a $p \equiv p_x + i p_y$ orbital, 37) H = [...], where $\epsilon_s$, $v_s(x)$, $\epsilon_p$, and $v_p(x)$ denote the atomic energies and disorder potentials for the s- and p-orbitals, respectively. Both $v_s(x)$ and $v_p(x)$ are uniformly distributed within $[-W/2, W/2]$ with identical and independent probability distributions. $t_s$, $t_p$, and $t_{sp}$ are the transfer integrals between neighboring s-orbitals, between neighboring p-orbitals, and between s- and p-orbitals, respectively.
In the absence of disorder, the system is a Chern insulator when the band inversion condition is satisfied: $0 < |\epsilon_s - \epsilon_p| < 4(t_s + t_p)$. We set $\epsilon_s - \epsilon_p = -2(t_s + t_p)$, $\epsilon_s = -\epsilon_p < 0$, and $t_s = t_p > 0$ so that this condition is satisfied, and set $t_{sp} = 4t_s/3$. The energy unit is set to $4t_s$.
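The parameter choice above fixes every clean-system parameter once the energy unit is chosen. The short sketch below simply carries out that arithmetic and checks the band inversion condition; the function names `chern_model_parameters` and `band_inverted` are hypothetical, and nothing here goes beyond the relations stated in the text.

```python
def chern_model_parameters(energy_unit=1.0):
    """Clean-system parameters implied by the stated relations (energy unit = 4 t_s)."""
    t_s = energy_unit / 4.0
    t_p = t_s                       # t_s = t_p > 0
    t_sp = 4.0 * t_s / 3.0          # t_sp = 4 t_s / 3
    diff = -2.0 * (t_s + t_p)       # eps_s - eps_p = -2 (t_s + t_p)
    eps_s = diff / 2.0              # follows from eps_s = -eps_p
    eps_p = -eps_s
    return {"t_s": t_s, "t_p": t_p, "t_sp": t_sp, "eps_s": eps_s, "eps_p": eps_p}

def band_inverted(eps_s, eps_p, t_s, t_p):
    """Band inversion condition 0 < |eps_s - eps_p| < 4 (t_s + t_p)."""
    return 0.0 < abs(eps_s - eps_p) < 4.0 * (t_s + t_p)
```

In units of $4t_s$ this gives $t_s = t_p = 0.25$, $t_{sp} = 1/3$, and $\epsilon_s = -\epsilon_p = -0.5$, so that $|\epsilon_s - \epsilon_p| = 1$ lies in the middle of the allowed window $(0, 2)$.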
A bulk band gap appears in $|E| < E_g = 0.5$, where chiral edge states exist. For E = 0, the system remains a Chern insulator for $W < W_c^{\mathrm{CI}} \approx 3.2$, 35) while it is an Anderson insulator for $W > W_c^{\mathrm{CI}}$. (Unfortunately, the estimate of $W_c^{\mathrm{CI}}$ is less precise than that of the critical disorder in the SU(2) model.) We impose fixed boundary conditions in the x- and y-directions, so that edge states appear if the system is a topological insulator.
We diagonalize square systems of 40 × 40 sites, and from the resulting 3200 eigenstates, we pick the 1600th eigenstate. Examples of the eigenfunctions in topological Chern [...] We have also increased the number of hidden units to 32 for the first convolution layer ("conv1" in Fig. 2), 128 for the second ("conv2"), and 512 for the hidden dense connection layer ("ip1").
In Fig. 5(a), we plot the probability that the eigenfunction is judged topological. A new ensemble of eigenfunctions with different random number sequences has been prepared to test this method. As in the case of the delocalization-localization transition, the probability fluctuates near the critical point and vanishes in the nontopological region. The validation accuracy is 90% for the two-layer network (dotted line) and 97% for the four-layer network (solid line), which shows that the deeper network performs better here.
We next apply the result of the deep learning around E = 0 to judge the states in the bulk band gap region at zero disorder, $|E| < E_g = 0.5$. We diagonalize a system for $W = 1 < W_c^{\mathrm{CI}}$ and $W = 6 > W_c^{\mathrm{CI}}$, take all the eigenstates with $|E| < E_g$, and let the machine judge them. Figure 5(b) shows that topological edge states other than those at E = 0 are also well distinguished from nontopological ones based on the learning around E = 0.
Concluding Remarks: In this paper, we focused on 2D random electron systems. We have demonstrated the validity of deep learning for distinguishing various random electron states in quantum phase transitions. For strong enough and weak enough randomness, the precision of judgement is 0.99999..., while in the critical regime, the judgement becomes less accurate. This regime corresponds to the critical region, where the characteristic length scale $\xi$ is comparable to or longer than the system size $L$. That is, the probability $P$ for the eigenfunction to be judged delocalized/topological obeys the scaling law $P(W, L) = f[(W - W_c)L^{1/\nu}]$, although determining the exponent $\nu$ is beyond the scope of this Letter. Since all we need to calculate are eigenfunctions of relatively small systems, the method will work for systems where the transfer matrix method is not applicable (localization problems on random 38-41) and fractal lattices, 42) for example).
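The content of the one-parameter scaling form can be illustrated with a toy calculation. The sketch below is not a fit to any data: the scaling function is a hypothetical logistic curve chosen only for illustration, and the exponent value in the usage note is an illustrative assumption, since determining $\nu$ is outside the scope of the text. The names `scaling_variable` and `toy_probability` are hypothetical.

```python
import numpy as np

def scaling_variable(W, L, Wc, nu):
    """Argument x = (W - Wc) * L**(1/nu) of the scaling function f."""
    return (W - Wc) * L ** (1.0 / nu)

def toy_probability(W, L, Wc, nu):
    """P(W, L) = f(x) with a HYPOTHETICAL f(x) = 1 / (1 + exp(x)); not fitted to data."""
    return 1.0 / (1.0 + np.exp(scaling_variable(W, L, Wc, nu)))
```

For example, with `Wc = 6.20` and an assumed `nu = 2.75`, `toy_probability(6.20, L, 6.20, 2.75)` equals 0.5 for every `L`, and curves for different `L` coincide when plotted against `scaling_variable`. The width in `W` of the crossover region shrinks as `L ** (-1 / nu)`, which is why the judgement is uncertain only near the critical point.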
We have used the known values of critical disorder to teach the machine. After learning the features of eigenfunctions near the band center, the machine could capture localized/delocalized and topological/nontopological features away from the band center. We have also verified that the results of the SU(2) model learning can be applied to the Ando model. 33) In the cases of the Anderson transition near the band edge in the SU(2) model [Fig. 3(b)] and that at the band center in the Ando model, the machine tends to predict the transition at a slightly smaller disorder than the estimate of finite size scaling analyses. 32,43) We have extracted the features in the middle layers to explain this tendency, 33) but could not clarify how the machine judges phases. The details of the judgement should be clarified in the future.
We have focused on the amplitude of eigenfunctions in 2D. In higher dimensions, the same algorithm will be applicable via dimensional reduction: integration of $|\psi|^2$ over certain directions, reducing the image to two dimensions. The dimensional reduction will also work for disordered 3D strong and weak topological insulators. 44) Other interesting quantities for machine learning are the phase and spin texture of eigenfunctions in random electron systems.
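The dimensional reduction mentioned above amounts to a single sum on a lattice. The following is a minimal sketch, assuming the 3D eigenfunction is stored as a complex array of shape `(Lx, Ly, Lz)`; the function name `reduce_to_2d` and the storage layout are hypothetical.

```python
import numpy as np

def reduce_to_2d(psi, axis=2):
    """Sum |psi|^2 over one lattice direction, leaving a 2D image."""
    return (np.abs(psi) ** 2).sum(axis=axis)
```

If `psi` is normalized, the reduced image still sums to one, so it can be fed to the same kind of 2D network without rescaling. Which direction to integrate over (or whether to use several projections) is a choice left open here.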
Classical waves (photon, phonon) in random media 45-47) as well as disordered magnons 48) are also worth machine learning.

[...] (Fig. 6). In this model, $\alpha$ in Eq. (2) in the main text is set to 0, and $\gamma$ is 0 for x-direction transfer and $\pi/2$ for y-direction transfer. We set the strength of the spin-orbit coupling to $\beta = \pi/6$ to compare with the previous results, 32,43) [...] (2)) can be applied to a different model (Ando), though the probability of delocalization starts to decrease with increasing W slightly earlier than expected. This might be due to corrections to scaling, which are present in the Ando model but negligible in the SU(2) model.
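The Ando-model hopping matrices implied by these parameter values can be written out explicitly. The sketch below assumes the standard SU(2) parametrization for Eq. (2) of the main text, with rows $(e^{i\alpha}\cos\beta,\ e^{i\gamma}\sin\beta)$ and $(-e^{-i\gamma}\sin\beta,\ e^{-i\alpha}\cos\beta)$; the function names `su2_matrix` and `ando_hoppings` are hypothetical.

```python
import numpy as np

def su2_matrix(alpha, beta, gamma):
    """SU(2) matrix in the assumed (alpha, beta, gamma) parametrization."""
    c, s = np.cos(beta), np.sin(beta)
    return np.array([[np.exp(1j * alpha) * c, np.exp(1j * gamma) * s],
                     [-np.exp(-1j * gamma) * s, np.exp(-1j * alpha) * c]])

def ando_hoppings(beta=np.pi / 6):
    """Non-random hopping matrices: alpha = 0, gamma = 0 (x) and pi/2 (y)."""
    R_x = su2_matrix(0.0, beta, 0.0)
    R_y = su2_matrix(0.0, beta, np.pi / 2)
    return R_x, R_y
```

Under this assumption the two matrices reduce to $R_x = \cos\beta + i\sigma_y \sin\beta$ and $R_y = \cos\beta + i\sigma_x \sin\beta$, so the spin-orbit rotation is fixed and only the on-site potential is random.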
We have also set $\beta = 0$ (no spin-orbit coupling, i.e., the Anderson model, which belongs to the orthogonal class), where all the states are expected to be localized, and this is indeed what the machine judges (red +). The machine judgement is, however, too good in the small disorder region W < 5, where the localization length becomes greater than 100 lattice sites, 49) larger than the system size of 40. This may be due to the standing-wave-like structure of eigenfunctions in this region, where the peak values fluctuate due to disorder, from which the eigenfunctions might have been judged to be localized.

Features in the intermediate layers: Here we show examples of features in the intermediate layers for the localization-delocalization transition (Fig. 7) and the topological-nontopological transition (Fig. 8). [...] phase is judged to be a topological edge state with probability 0.8288..., while the right one shows how a state in the Anderson insulator phase is judged to be a topological edge state with probability 0.1221...