Phase Detection with Neural Networks: Interpreting the Black Box

Neural networks (NNs) normally do not allow any insight into the reasoning behind their predictions. We demonstrate how influence functions can unravel the black box of an NN trained to predict the phases of the one-dimensional extended spinless Fermi-Hubbard model at half-filling. The results provide the first indication that the NN correctly learns an order parameter describing the transition. Moreover, we demonstrate that influence functions not only allow us to check that a network trained to recognize known quantum phases can predict new, unknown ones, but also disclose information about the type of phase transition.

Machine learning (ML) influences everyday life in multiple ways, with applications such as letter and voice recognition software, fingerprint identification, e-mail spam filtering, self-driving cars, and many others. These versatile algorithms, dealing with big and high-dimensional data, also have a prominent impact on science, as neural networks (NNs) have been harnessed to solve problems of quantum chemistry, material science, and biology [1][2][3][4]. Physics is no different in exploring ML methods, which have already been adopted in astrophysics, high-energy physics, quantum state tomography, and quantum computing [5][6][7][8][9][10][11]. The use of ML is especially abundant in phase classification, which is not surprising if one considers that determining the proper order parameters for unknown transitions is no trivial task, on the verge of being an art. It includes a search in the exponentially large Hilbert space and an examination of the symmetries existing in the system, guided by intuition and educated guesses. An alternative route was shown when NNs, commonly used for high-dimensional data analysis, located the phase transitions of known systems without a priori physical knowledge [12,13]. However, the resulting models were agnostic and largely opaque, and their 'intelligence' was extracted from data, which is in stark contrast with a physicist's main driving force: the need to understand the underlying mechanisms of the process. At the same time, it is undeniable that ML often produces surprisingly good results.
Next to all these successful applications, there are open problems, for instance ones concerning topological models and many-body localization (MBL), which include the need for pre-engineered features [32][33][34], disagreement of predicted critical exponents [19], and high sensitivity to the hyperparameters describing the training process [21]. Moreover, even in the models described by Landau's theory, these approaches have so far mostly enabled only the recovery of known phase diagrams or the location of phase transitions in qualitative agreement with more conventional approaches based, for instance, on order parameters and/or the theory of finite-size scaling. Nonetheless, ML achieved this at a much lower computational cost, e.g., using fewer samples or smaller system sizes [19,21].
Additionally, the ML techniques used are mostly black boxes, i.e., systems whose internal logic is not at all obvious to a user [35]. The missing key element is model interpretability, i.e., the ability to be explained or presented to a human in understandable terms [36]. Research on this crucial property is at the heart of the booming field of ML interpretability [37][38][39][40][41][42], which aims at designing methods that discover the internal logic of commonly used black boxes.
Such methods are needed for a plethora of reasons. Given the presence of ML in everyday life, it is no surprise that legal measures have already been taken to ensure that any individual can obtain meaningful explanations of the logic involved when automated decision-making takes place [43]. Next to the legal motivations, there is ethics: it was revealed, worryingly, that learning machines inherit biases from the humans preparing the data [44]. Also, deep NNs were shown to fit random labels perfectly [45], and a group of local features was shown to approximate them well [46]. These studies show that the learning process sometimes goes against our intuition, and indicate that, to be trusted, predictions should be accompanied by a justification understandable by humans.
Moreover, an overwhelming need for ML interpretability arose after the first successful uses of ML in physics. Its numerical power cannot be denied, but its frequent automatic use contradicts the primary motivation of physical research, namely the desire to know and understand the underlying mechanisms of the process. Instead, with black boxes, we stop at reproducing the process. Even in phase classification problems, where NNs have often been used, we cannot be fully confident that NNs learn order parameters. Hence, in this work, we show how interpretability methods can be used in the classification of physical phase transitions to understand what characteristics are learned by an ML algorithm. This approach reveals whether a relevant physical concept was indeed learned or whether the prediction cannot be trusted. We also show that an interpretable NN can give additional information on the phase transitions that was not provided to the algorithm explicitly.
Supervised learning. We consider supervised learning problems with labeled training data $D = \{z_i\}_{i=0}^{n}$, with $z_i = (x_i, y_i)$. The inputs come from some input space, $x_i \in \mathcal{X}$, and the model predicts outputs from some output space, $y_i \in \mathcal{Y}$. In our setup, the inputs $x_i$ are the state vectors of a given physical system, and $y_i$ are the corresponding phase labels. The model is determined by the set of parameters $\theta$. In the training process, the parameter space is searched for the final parameters $\hat\theta_D \equiv \hat\theta$ of the ML model, which minimize the training loss function $L(D, \theta) = \frac{1}{n}\sum_{z \in D} L(z, \theta)$, where $n$ is the size of the training data set, typically of the order of thousands. After training, the model can make a prediction for an unseen test point $z_{\mathrm{test}}$, with the test loss function value $L(z_{\mathrm{test}}, \hat\theta)$ related to the model's certainty about this prediction.
Interpreting neural networks. An intuitive way of unraveling the logic learned by the machine is to retrain the model after removing a single training point $z_r$ (starting from the same minimum, if a non-convex problem is analyzed) and to check how this changes the prediction for a specific test point $z_{\mathrm{test}}$. Such leave-one-out (LOO) training [47] studies the change of the parameters $\theta$, now shifted to a new minimum $\hat\theta_{D\setminus\{z_r\}}$ of the loss function, as depicted in Fig. 1(a). An analysis of the test loss change, $\Delta L \equiv L(z_{\mathrm{test}}, \hat\theta) - L(z_{\mathrm{test}}, \hat\theta_{D\setminus\{z_r\}})$, indicates the most influential training points for a given test point $z_{\mathrm{test}}$, namely those whose removal causes the largest change. Influential examples can be both helpful ($\Delta L > 0$) and harmful ($\Delta L < 0$). Such an analysis gives the notion of similarity used by the machine in a given problem, as training points that are closest in the $\Delta L$ space can be understood as the most similar. Once the most influential points are indicated, we can decode which characteristics the machine looks at by comparing the points that are 'similar' in the machine's 'understanding'. This can be especially useful in phase classification problems, where the analysis of $\Delta L$ enables the recovery of the patterns that are crucial for distinguishing the phases. Using this technique to check the influence of every training point in $D$ on a given test point is, however, prohibitively expensive, as the model has to be retrained for each removed $z$.
To circumvent this problem, one can make a Taylor expansion of the loss function $L$ with respect to the parameters around the minimum $\hat\theta$ and approximate the $\Delta L$ resulting from the LOO training. This method, named influence functions, was proposed for regression problems already forty years ago [47][48][49]. Not only is this interpretability method computationally feasible, but it also treats a model as a function of the training data instead of assuming that the training data set is fixed. The influence function estimates $\Delta L$ for a chosen test point $z_{\mathrm{test}}$ after the removal of a chosen training point $z_r$. It is built from three quantities: $\nabla_\theta L(z_{\mathrm{test}}, \hat\theta)$, the gradient of the loss function of the single test point; $\nabla_\theta L(z_r, \hat\theta)$, the gradient of the loss function of the single training point whose removal's impact is being approximated; and $H_\theta^{-1}(\hat\theta)$, the inverse of the Hessian. All derivatives are calculated with respect to the model parameters $\theta$ and evaluated at $\hat\theta$, corresponding to the minimum of the loss $L(D, \hat\theta)$. The existence of the inverse of the Hessian is ensured only if the Hessian is positive-definite. However, it was shown [50,51] that this method can be generalized to non-convex problems and therefore applied to ML. Example code can be found in [52].
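The comparison between LOO retraining and its influence-function estimate can be sketched on a toy convex problem. The following is a minimal illustration, not the code of [52]: it uses L2-regularized logistic regression on synthetic data instead of a CNN, and it assumes the commonly used form of the estimate, $\Delta L \approx -\frac{1}{n}\nabla_\theta L(z_{\mathrm{test}}, \hat\theta)^{T} H^{-1} \nabla_\theta L(z_r, \hat\theta)$, with the sign chosen to match the definition of $\Delta L$ given above (sign and normalization conventions differ across the literature). All sizes, names, and the data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d, lam = 200, 3, 1e-2                 # toy sizes and L2 strength (hypothetical)
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)
x_test, y_test = rng.normal(size=d), 1.0  # one arbitrary test point


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


def point_loss(theta, x, t):
    """Cross-entropy loss L(z, theta) of a single point z = (x, t)."""
    s = x @ theta
    return np.logaddexp(0.0, s) - t * s


def point_grad(theta, x, t):
    """Gradient of point_loss with respect to theta."""
    return (sigmoid(x @ theta) - t) * x


def fit(weights):
    """Minimize (1/n) sum_i weights_i L(z_i, theta) + (lam/2) |theta|^2."""
    def objective(theta):
        s = X @ theta
        return weights @ (np.logaddexp(0.0, s) - y * s) / n + 0.5 * lam * theta @ theta

    def gradient(theta):
        return X.T @ (weights * (sigmoid(X @ theta) - y)) / n + lam * theta

    result = minimize(objective, np.zeros(d), jac=gradient, method="BFGS",
                      options={"gtol": 1e-10})
    return result.x


theta_hat = fit(np.ones(n))

# Hessian of the regularized training loss at the minimum (positive-definite here).
p = sigmoid(X @ theta_hat)
H = (X.T * (p * (1.0 - p))) @ X / n + lam * np.eye(d)

# Influence-function estimate of Delta L for every training point at once.
g_test = point_grad(theta_hat, x_test, y_test)
G_train = (p - y)[:, None] * X            # row r holds grad L(z_r, theta_hat)
approx = -(G_train @ np.linalg.solve(H, g_test)) / n

# Exact LOO: retrain with point r given zero weight, then compare test losses.
exact = np.empty(n)
for r in range(n):
    w = np.ones(n)
    w[r] = 0.0
    exact[r] = point_loss(theta_hat, x_test, y_test) - point_loss(fit(w), x_test, y_test)

correlation = np.corrcoef(approx, exact)[0, 1]
print(f"correlation between estimate and LOO retraining: {correlation:.4f}")
```

The estimate needs one linear solve with the Hessian, whereas the exact result needs one retraining per training point, which is the cost argument made above. For a non-convex model such as a CNN the Hessian need not be positive-definite, so this sketch does not cover the generalization of [50,51].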
Physical model. We apply influence functions to a small convolutional neural network (CNN; described in detail in appendix A) trained to recognize phases in the extended Hubbard model, namely a one-dimensional (1D) system consisting of spinless fermions at half filling with hopping between neighboring sites with amplitude $J$, interacting with nearest neighbors with strength $V_1$ and with next-nearest neighbors with strength $V_2$. The competition between the system parameters $J$, $V_1$, and $V_2$ leads to four different phases: a gapless Luttinger liquid (LL), two gapped charge-density-wave phases with density patterns 1010 (CDW-I) and 11001100 (CDW-II), and a bond-order (BO) phase, as seen in Fig. 1(b) [53,54]. The order parameter describing the transition to the CDW-I (-II) phase is the average difference between (next-)nearest-site densities, while the BO phase is characterized by staggered effective hopping amplitudes. A detailed description is included in appendix B. We feed the CNN with ground states expressed in the Fock basis, labeled with their appropriate phases, calculated for a 12-site system with the QuSpin and SciPy packages [55,56]. To lift the degeneracy of the ground state, we use guiding fields favoring one symmetry. We define the phase transition position as the point where the order parameter's value is ten times larger than the corresponding guiding field (see appendix B). The hopping amplitude $J$ is set to 1 throughout the paper.
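How a labeled ground state and its CDW-I order parameter can be obtained is sketched below. This is a plain NumPy/SciPy illustration rather than the QuSpin-based workflow of the paper, and it rests on stated assumptions: the Hamiltonian is taken in the common form $H = -J\sum_i (c_i^\dagger c_{i+1} + \mathrm{h.c.}) + V_1\sum_i n_i n_{i+1} + V_2\sum_i n_i n_{i+2}$ with periodic boundaries, the guiding field is a staggered potential $h\sum_i (-1)^i n_i$, and the field value $10^{-2}$ is purely illustrative (far larger than the guiding fields quoted in appendix B). All function names are hypothetical.

```python
import itertools
import numpy as np
from scipy.linalg import eigh


def build_basis(n_sites, n_fermions):
    """Fock states as integers: bit i set means site i is occupied."""
    return [sum(1 << i for i in occupied)
            for occupied in itertools.combinations(range(n_sites), n_fermions)]


def build_hamiltonian(J, V1, V2, field, n_sites=12, n_fermions=6):
    """Dense Hamiltonian (assumed form, see lead-in) with periodic boundaries."""
    basis = build_basis(n_sites, n_fermions)
    index = {state: k for k, state in enumerate(basis)}
    H = np.zeros((len(basis), len(basis)))
    # A hop across the boundary moves a fermion past the other n_fermions - 1.
    wrap_sign = -1.0 if (n_fermions - 1) % 2 else 1.0
    for k, state in enumerate(basis):
        occ = [(state >> i) & 1 for i in range(n_sites)]
        for i in range(n_sites):
            j, j2 = (i + 1) % n_sites, (i + 2) % n_sites
            H[k, k] += V1 * occ[i] * occ[j] + V2 * occ[i] * occ[j2]
            H[k, k] += field * (-1) ** i * occ[i]      # assumed guiding field
            if occ[i] != occ[j]:                        # one fermion can hop on bond (i, j)
                target = state ^ (1 << i) ^ (1 << j)
                sign = wrap_sign if i == n_sites - 1 else 1.0
                H[index[target], k] -= J * sign
    return H, basis


def site_densities(psi, basis, n_sites=12):
    """Expectation values <n_i> in the state psi."""
    occupations = np.array([[(s >> i) & 1 for i in range(n_sites)] for s in basis])
    return (np.abs(psi) ** 2) @ occupations


def cdw1_order_parameter(densities):
    """Average absolute density difference between nearest neighbors."""
    return np.mean(np.abs(densities - np.roll(densities, -1)))


H, basis = build_hamiltonian(J=1.0, V1=40.0, V2=0.0, field=1e-2)
energies, vectors = eigh(H)
ground_state = vectors[:, 0]                 # input vector for the classifier
densities = site_densities(ground_state, basis)
order_parameter = cdw1_order_parameter(densities)
print(len(basis), round(order_parameter, 3))
```

The dense solver is used only because the 924-dimensional half-filled sector is small; a sparse eigensolver would be the natural choice for larger systems.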
Transition between LL and CDW-I. We train a CNN to classify ground states into two phases, LL and CDW-I, based on the transition line marked with the arrow (1) in Fig. 1(b) for $V_2 = 0$. In Fig. 2, we plot the influence functions of all training examples for a chosen test point (marked with an orange line). The order parameter describing this transition is the average difference between nearest-site densities, which is zero in the LL phase and non-zero (growing to one) in the CDW-I phase. Panels (a)-(b) show how influential the training points are for test points from the LL phase. The test state in (a) is the ground state located deep in the LL phase, while the one in (b) is closer to the transition.
If the CNN learns an order parameter, all training points from the LL phase, i.e., ground states exhibiting a zero order parameter, should be similarly positively influential, and that is exactly what we observe. They form an almost flat line in panels (a), (b), and (d). In panel (c), however, for the test point close to the transition, their influence changes linearly. This deviation from the expected behavior arises because, in our exact diagonalization calculations, the order parameter in the LL phase is not exactly constant and equal to zero. Instead, it grows very slowly, which is why the most helpful points are ultimately the ones near the transition: they are the most distinctive among the training points labeled as LL, and the information they provide is the most valuable. The non-zero order parameter is caused by three factors: the finite-size effect, the use of the guiding fields, and the numerical arbitrariness in choosing the transition point. In the perfect scenario (observed, for example, for training on states obtained from mean-field calculations), the five most influential points should be randomly distributed over the whole LL phase. The most harmful training points are, in both cases, the ones closest to the transition but on its CDW-I side. These states are the most similar (with the smallest order parameter value) but are already labeled differently.
On the side of the CDW-I phase, the influence pattern is significantly different. The curvature of the line of influential points corresponds to the growth of the order parameter. The expected behavior of the most influential points indicates not only that the CNN correctly learned the order parameter, but also that this tool enables distinguishing between the types of phase transition. In particular, the curvature of the line drawn by the influence function values is different for transitions characterized by a continuous and by a discontinuous change of the order parameter.
Transfer learning. With a similar approach, we validate transfer learning to another transition line. We take the trained CNN from Fig. 2, and in Fig. 3 we apply it to test states coming from the transition line for $V_2 = 0.25\,V_1$, where the phase transition position is shifted to higher values of $V_1/J$. Therefore, the training and test states come from different transition lines, $V_2 = 0$ and $V_2 = 0.25\,V_1$, marked in Fig. 1(b) with the arrows (1) and (2), respectively. Panels (a) and (b) of Fig. 3 show the influence function values of the training data set for test states from the LL phase, while panels (c) and (d) show them for test states from the CDW-I phase. Due to the shifted transition point for the test line, compared to the training line, we see the same shift in the behavior of the most influential points on the CDW-I side. This shift reflects the fact that the same value of $V_1/J$ no longer yields the same value of the order parameter, and that the ML algorithm still regards the states with the most similar order parameter as the most influential points.
Inferring the existence of the third phase. This time we analyze the transition line crossing three phases, LL, BO, and CDW-II, which is indicated by the arrow (3) in Fig. 1(b). Two order parameters are needed to describe this transition. One is the average difference of the next-nearest-neighbor densities, which equals zero in the LL and BO phases and grows to 1 in the CDW-II phase. The other is the staggering of the effective nearest-neighbor hoppings, which is 0 in the LL phase, non-zero in the BO phase, and slowly decays to 0 in the CDW-II phase. In the studied range of parameters, two phases (BO and CDW-II) co-exist (see appendix B). It is crucial to note that in this section we train on the mentioned transition line crossing three phases, but we label the ground states as belonging to only one of two classes.
In the first set-up, with results presented in panels (a)-(b) of Fig. 4, we label ground states as belonging either to the LL phase (blue dots, label 0) or to the BO and CDW-II phases (purple dots, label 1). Independently of the test point location, two similarity regions, understood as two groups of points with similar influence within each group, can be distinguished among the purple training points belonging to BO and CDW-II. The ML algorithm apparently learns two different patterns (order parameters) to classify the data correctly, and as such, it notices the existence of the third phase within the incorrectly labeled data. This would be impossible to notice without interpretability methods, which in this sense pave the way towards the detection of unknown phases.
The second set-up consists of labeling the same data as belonging either to the LL and BO phases (blue dots, label 0) or to the CDW-II phase (purple dots, label 1). The influence function values resulting from this classification are shown in panels (c)-(d) of Fig. 4. The pattern they form is starkly different. First of all, two similarity regions can no longer be distinguished among the training points from the LL and BO phases. This is because this transition can be fully described with one order parameter, which is zero in the LL and BO phases and non-zero in the CDW-II phase. The behavior is then more similar to the one seen in Fig. 2 for the transition between LL and CDW-I. It is not identical, though, as in the LL+BO class the most helpful training points are always distributed randomly, but deep in the LL phase, avoiding the BO phase. The most helpful points on the CDW-II side are also those deep in the CDW-II phase, in contrast to Fig. 2, where they mostly follow the test point. The difference comes mostly from the fact that the deeper in the CDW-II phase, the smaller the BO order parameter, which makes CDW-II predictions less difficult. We claim that the observed pattern is a sign that the order parameter was not learned correctly and that the network is potentially overfitting.
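The two binary labelings of the same three-phase data can be written down explicitly. The sketch below is only an illustration: it uses the transition points along arrow (3) quoted in appendix B ($V_2 = 0.51\,V_1$ and $V_2 = 1.7\,V_1$), while the sampling of the line and all names are hypothetical.

```python
import numpy as np

# Transition points along arrow (3), in units of V1, as quoted in appendix B.
V2_LL_TO_BO = 0.51
V2_BO_TO_CDW2 = 1.7


def phase_index(v2_over_v1):
    """Return 0 for LL, 1 for BO, 2 for CDW-II."""
    return np.digitize(v2_over_v1, [V2_LL_TO_BO, V2_BO_TO_CDW2])


v2_over_v1 = np.linspace(0.0, 8.0, 17)        # hypothetical sampling of the line
phases = phase_index(v2_over_v1)

# First set-up: LL (label 0) versus BO and CDW-II together (label 1).
labels_setup_1 = (phases >= 1).astype(int)
# Second set-up: LL and BO together (label 0) versus CDW-II (label 1).
labels_setup_2 = (phases == 2).astype(int)

print(labels_setup_1)
print(labels_setup_2)
```

The two label vectors differ only on the BO points, which are exactly the points whose influence pattern distinguishes the two set-ups.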
Finally, we trained a CNN on the same data, but with three labels correctly corresponding to all three phases. The influence patterns seen in Fig. 2 and in panels (c)-(d) of Fig. 4 are repeated, indicating that the CNN correctly learns both appropriate order parameters.
Conclusions. We applied influence functions, an interpretability method that approximates LOO training, to a CNN trained to classify ground states of the extended 1D half-filled spinless Fermi-Hubbard model. We acknowledge significant finite-size effects, but we see that the results achieved do not depend on the system size. We provided, for the first time, a strong indication that the ML algorithm indeed learned a relevant order parameter. Moreover, we showed that the influence functions, applied to the trained NN, were able to detect an unknown phase as well as to distinguish between types of transitions. Two aspects determined which training points were the most important for a given test point: how similar they were to the test state and how unique they were in the training data set. Together they gave a notion of distance or similarity used by the CNN in the phase classification problem and indicated that the patterns relevant for the test states coincided with the order parameters.
The next step is to address the open problems of topological models and MBL with NNs, whose logic can finally be uncovered by influence functions. Even though this work concerned quantum phase transitions, the method may be easily applied to classical models as well. Moreover, influence functions proved to be very sensitive to outliers in the data set and may serve for anomaly detection. As such, they can be useful for analyzing experimental noise in the data on which various models are built and for judging how strongly it affects them.

Appendix A. We use a neural network (NN) (see Fig. 5) consisting of three one-dimensional convolutional layers, with 5 filters acting on the input vector, 8 filters on the first hidden layer, and 10 filters in the last convolutional layer. After the first two convolutions we apply a max pooling layer to reduce the dimension, and the last convolutional layer is followed by a global average pooling (GAP) layer. The GAP architecture was introduced in [57] and has found applications in the discriminative localization of objects in image data [39]. It reduces each filter of the final convolution to a single value. After the GAP we have one fully connected layer with two output neurons that predict the labels. The GAP ensures that most of the weights of the NN are contained in the convolutional part; we use this technique here to reduce the number of weights in the fully connected part of the NN. For the training of the NN we use state vectors from each phase as input and label them with 0 or 1 according to their phase. The state vectors are obtained via exact diagonalization of the Hamiltonian B1.
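The layer structure described above can be made concrete with a small forward pass. The following NumPy sketch is an illustration under assumptions, not the trained network of the paper: the kernel width of 3, the ReLU activations, the pooling size of 2 applied after each of the first two convolutions, the softmax output, and the random weights are all hypothetical choices that the text does not specify.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
KERNEL = 3                                   # kernel width (assumption)


def init_conv(c_in, c_out):
    """Random weights of shape (c_out, c_in, KERNEL) and zero biases."""
    return rng.normal(scale=0.1, size=(c_out, c_in, KERNEL)), np.zeros(c_out)


def conv1d_relu(x, weight, bias):
    """'Valid' 1D convolution plus ReLU; x has shape (channels, length)."""
    windows = sliding_window_view(x, KERNEL, axis=1)   # (c_in, length - K + 1, K)
    out = np.einsum("clk,ock->ol", windows, weight) + bias[:, None]
    return np.maximum(out, 0.0)


def max_pool(x, size=2):
    """Non-overlapping max pooling along the length axis."""
    length = (x.shape[1] // size) * size
    return x[:, :length].reshape(x.shape[0], -1, size).max(axis=2)


conv_params = [init_conv(1, 5), init_conv(5, 8), init_conv(8, 10)]
dense_w, dense_b = rng.normal(scale=0.1, size=(2, 10)), np.zeros(2)


def forward(state_vector):
    """Map a state vector to two class probabilities."""
    x = np.asarray(state_vector, dtype=float)[None, :]   # one input channel
    x = max_pool(conv1d_relu(x, *conv_params[0]))
    x = max_pool(conv1d_relu(x, *conv_params[1]))
    x = conv1d_relu(x, *conv_params[2])
    pooled = x.mean(axis=1)                  # GAP: one value per filter
    logits = dense_w @ pooled + dense_b
    e = np.exp(logits - logits.max())
    return e / e.sum()


n_conv = sum(w.size + b.size for w, b in conv_params)
n_dense = dense_w.size + dense_b.size

psi = rng.normal(size=924)                   # stand-in for a 924-component ground state
psi /= np.linalg.norm(psi)
probs = forward(psi)
print(n_conv, n_dense, probs)
```

With these assumed kernel widths the convolutional part holds 398 parameters and the fully connected part only 22, independently of the input length, which is the effect of the GAP layer mentioned above.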
We use $L_2$ regularization during the training to effectively decrease the certainty of the NN's predictions. In fact, an undertrained NN with imperfect accuracy can provide better intuition about the problem than an overtrained one, whose predictions are affected by overfitting. The CNNs used had accuracies between 89% and 96%.
For the results in Fig. 3 we apply transfer learning, which means that we use an NN trained on a training set that comes from a different domain than the test data. In our case, the domains are different trajectories through the phase space, indicated in Fig. 1(b) with the arrows (1) and (2), where the phase transition appears for different values of $V_1/J$.

Appendix B. We study the 1D system consisting of spinless fermions at half filling with hopping between neighboring sites with amplitude $J$, interacting with nearest neighbors with strength $V_1$ and with next-nearest neighbors with strength $V_2$. The model exhibits four different phases, two of which co-exist in a limited range of parameters. Without the next-nearest-neighbor interaction $V_2$, the system can follow only the patterns of the gapless Luttinger liquid (algebraic) phase (LL) or the charge-density wave of type I (CDW-I) with the degenerate density pattern 101010. The CDW-I order parameter describing this transition reads $O_{\text{CDW-I}} = \frac{1}{L}\sum_{\langle i,j\rangle} |n_i - n_j|$, where $\langle i,j\rangle$ denotes nearest neighbors. The next-nearest-neighbor interaction $V_2$ competes with $V_1$, so for non-zero $V_2$ that is still smaller than $V_1$, the transition between LL and CDW-I shifts towards larger $V_1$. For sufficiently strong $V_2$, the bond-order (BO) phase emerges, with an order parameter $O_{\mathrm{BO}}$ that measures the staggering of the effective nearest-neighbor hoppings. It turns into the charge-density wave of type II (CDW-II), with the degenerate density pattern 11001100, for large $V_2$ values, with $O_{\text{CDW-II}} = \frac{1}{L}\sum_{\langle\langle i,j\rangle\rangle} |n_i - n_j|$, where $\langle\langle i,j\rangle\rangle$ denotes next-nearest neighbors. To calculate the ground states and order parameters of the model, we use the QuSpin package [55] to write the Hamiltonian for a 12-site system in the Fock basis, resulting in 924 basis states. We assume periodic boundary conditions. The exact diagonalization is done with the SciPy package [56]. The ground states belonging to the BO, CDW-I, and CDW-II phases are degenerate. To lift the degeneracy, we apply symmetry-breaking fields that favor one of the patterns.
This approach results in non-zero corresponding order parameters independently of the phase; therefore, we define the transition points as those parameters of the system at which the order parameter is 10 times larger than the corresponding symmetry-breaking field. Accordingly, given guiding fields of $10^{-7}$, $10^{-5}$, and $10^{-4}$ for the 101010 and 11001100 density patterns and the 1010 hopping pattern, respectively, order parameter values of $10^{-6}$, $10^{-4}$, and $10^{-3}$ signal the transition to the CDW-I, CDW-II, and BO phase, respectively. It is interesting to note that the results presented in this work remain the same without the symmetry-breaking fields and also do not depend on the size of the system.
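The threshold criterion for the transition point is simple enough to state as a helper. The sketch below is a hypothetical illustration: the function name is invented, and the order-parameter curve is synthetic, chosen only to exercise the criterion, not taken from the paper's data.

```python
import numpy as np


def transition_point(couplings, order_parameter, guiding_field, factor=10.0):
    """Return the first coupling where the order parameter reaches
    factor * guiding_field, or None if the threshold is never reached."""
    above = np.asarray(order_parameter) >= factor * guiding_field
    if not above.any():
        return None
    return float(np.asarray(couplings)[np.argmax(above)])


# Synthetic, monotonically growing curve used purely as a stand-in.
couplings = np.linspace(0.0, 4.0, 41)
order_parameter = 1e-7 * np.exp(3.0 * couplings)

v_c = transition_point(couplings, order_parameter, guiding_field=1e-7)
print(v_c)
```

On a discrete scan the returned value is the first grid point above the threshold, so its resolution is limited by the spacing of the sampled couplings.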
Within this work we train and test the convolutional neural network on three transition lines indicated with the arrows (1)-(3) in Fig. 1(b). The first transition line leads from the LL to the CDW-I phase and is calculated for constant $V_2 = 0$ and $V_1/J \in [0, 40]$. It is the source of training data for both Figs. 2 and 3 and of test data for Fig. 2. It is indicated in Fig. 1(b) with the arrow (1), and the values of the corresponding order parameter $O_{\text{CDW-I}}$ are plotted in Fig. 6(a). The transition, defined as above, occurs at $V_1/J = 1$. The second transition line is calculated for $V_2 = 0.25\,V_1$ and $V_1/J \in [0, 80]$. Indicated with the arrow (2), it is the source of test data for Fig. 3. The corresponding order parameter $O_{\text{CDW-I}}$ is plotted in Fig. 6(b), and the transition takes place at $V_1/J = 1.85$. The final transition line cuts through three phases: LL, BO, and CDW-II. It is marked with the arrow (3) and provides both training and test data for Fig. 4. It is calculated for constant $V_1 = 1/J$ and $V_2 \in [0, 8]\,V_1$. The transition between LL and BO occurs at $V_2 = 0.51\,V_1$, and that between BO and CDW-II at $V_2 = 1.7\,V_1$. It is important to notice that for the chosen range of parameters, $V_2 \in [1.7, 8]\,V_1$, two phases co-exist, which can be seen in Fig. 6(c).