Physics-Informed Neural Networks for Discovering Localised Eigenstates in Disordered Media

The Schr\"{o}dinger equation with random potentials is a fundamental model for understanding the behaviour of particles in disordered systems. Disordered media are characterised by complex potentials that lead to the localisation of wavefunctions, also called Anderson localisation. These wavefunctions may have similar scales of eigenenergies which poses difficulty in their discovery. It has been a longstanding challenge due to the high computational cost and complexity of solving the Schr\"{o}dinger equation. Recently, machine-learning tools have been adopted to tackle these challenges. In this paper, based upon recent advances in machine learning, we present a novel approach for discovering localised eigenstates in disordered media using physics-informed neural networks (PINNs). We focus on the spectral approximation of Hamiltonians in one dimension with potentials that are randomly generated according to the Bernoulli, normal, and uniform distributions. We introduce a novel feature to the loss function that exploits known physical phenomena occurring in these regions to scan across the domain and successfully discover these eigenstates, regardless of the similarity of their eigenenergies. We present various examples to demonstrate the performance of the proposed approach and compare it with isogeometric analysis.


Introduction and Background
Partial differential equations (PDEs) are powerful tools for studying dynamic systems and have applications across various fields. However, these equations are notoriously difficult to solve explicitly, leading to an increased interest in numerical approximations. In addition to classical numerical methods, recent research has shown the potential of using neural networks to estimate solutions to differential equations.
There are several advantages of using neural networks to solve PDEs over traditional numerical methods.
Firstly, neural networks are generally efficient as they can often produce solutions to PDEs more quickly than traditional numerical methods, particularly for high-dimensional problems. Secondly, they are more robust against the "curse of dimensionality," which refers to the difficulty of solving PDEs in high-dimensional spaces using traditional numerical methods. Furthermore, they can be trained to solve a wide range of PDEs, including those with complex boundary conditions or nonlinear operators. Once trained, neural networks can quickly produce solutions for new PDEs that are similar to those encountered during training, potentially without requiring further adjustments or recalibration of parameters. Overall, neural networks can provide a powerful and flexible tool for solving PDEs, especially in cases where traditional numerical methods may be insufficient or computationally expensive.
Data-driven supervised networks [10] and data-free unsupervised networks [13] have been shown to produce efficient approximations of differential eigenvalue problems. Supervised networks aim to discover patterns in labelled datasets to infer a general model that can predict examples outside of the given dataset.
The training process involves calculating approximations for given inputs and computing how much these estimations deviate from the true outputs using a loss function. Then, the networks minimise this loss function by adjusting their parameters. Artificial Neural Networks (ANNs) are comprised of layers of interconnected nodes with weights and biases that calculate the estimations. Convolutional Neural Networks (CNNs) are a newer form of neural networks that are especially known for their ability to identify patterns faster and more reliably than ANNs in solving eigenvalue problems using labelled datasets [10]. They introduce a convolutional layer built up of windows that slide around the input data, searching for patterns in whole regions at a time. They are commonly used to pass the information on to ANNs and are optimised similarly.
In situations where training data do not contain labels, unsupervised networks are proposed to tackle the given task. These networks rely on a carefully designed loss function based on known information about the desired result to push the network toward a correct approximation.
In the case of finding the eigenstates of Hamiltonians, we leverage known physical phenomena to design our loss function. This kind of model is an example of a Physics Informed Neural Network (PINN, c.f., [2,20]), one that embeds the knowledge of physical laws governing the system into the learning process to drive the network to an admissible approximation.
Differential eigenvalue problems occur in a wide range of problems in physics and applied mathematics, such as quantum energy problems. In early works, Lagaris et al. [15] proposed an ANN to discover the solu-tions to differential eigenvalue problems, tested on various problems in quantum mechanics. More recently, Finol et al. [10] presented a supervised ANN to solve eigenvalue problems in mechanics, showing that CNNs were outperforming traditional ANNs in contexts with labelled datasets. Sirignano and Spiliopoulos [21] developed a deep learning network to accurately solve partial differential equations in dimensions as high as 200. They also proposed a mesh-free algorithm, which was desirable as meshes become infeasible in such high dimensions. Chen et al. [3] adopted the emerging PINNs to solve representative inverse scattering problems in photonic metamaterials and nano-optic technologies. Their method was also mesh-free and was tested against numerical simulations based on the finite element method. These ANN models are excellent at learning time series data. The work [18] demonstrated its application in wave packet displacements in localised and delocalised quantum states. Another work [14] adopted CNNs to solve a classification problem. The authors constructed a supervised model learning from experimental data to distinguish between the dynamics of an Anderson insulator and a many-body localised phase. Yang et al. [23] proposed a combination of PINNs with Bayesian Neural Networks, which solved both forward and inverse nonlinear problems, obtaining more accurate predictions than PINNs in systems with large noises.
Jin et al. [13] have shown that PINNs can solve single and multiple square well problems, calculating the eigenfunction and eigenvalue simultaneously. Their network employs a scanning mechanism that pushes the network to search for higher eigenstates as the training process evolves, while the network updates the eigenfunction prediction accordingly. They store found eigenfunctions and exploit the orthogonality of wave functions to search for higher eigenstates. Grubišić et al. [11] applied PINNs to identify localised eigenstates of operators with random potential. They studied the effective potential of the operator, whose local minima correspond to the different eigenstates. They first built a PINN to solve these eigenvalue problems in one dimension, then generalised to a deeper model that solves higher dimensional problems. Effective potentials provide a neat way of locating the localised eigenstates, however, are limited in the number of locations they can identify, and have difficulty differentiating between eigenstates of identical eigenvalue. These limitations occur in Bernoulli distributed potentials, which we aim to solve with our model.
We build on the design of [13] to approximate these eigenstates for Hamiltonians whose potentials are distributed randomly, leading to the localisation of the solution to specific regions of the domain. In particular, the core of our work is in finding these eigenstates where the eigenenergies are nearly identical, a task that current models are unable to accomplish. This is most prevalent with potentials distributed according to the Bernoulli distribution but can occur in other distributions.
The rest of the paper is organised as follows. In Section 2, we state the Schrödinger differential eigen-value problem and present the isogeometric analysis, followed by a discussion on the challenges of its spectral approximation. In section 3, we propose our novel loss function design for the PINN for solving the Schrödinger equation. Section 4 collects and discusses various numerical tests to demonstrate the performance of the proposed method. Concluding remarks are presented in section 5.

Problem Statement and Isogeometric Analysis
In this section, we first introduce the modelling problem, followed by a presentation of a classic numerical method, namely isogeometric analysis, used to solve the problem as well as to generate the reference solutions. We then provide a general discussion on the challenges associated with the numerical computation of the model.

Problem Statement
We study the time-independent Schrödinger equation: Find the eigenpair (E, u) such that (Ω) specifies the potential and is a non-negative function, and is a bounded open domain with Lipschitz boundary ∂Ω. Throughout the paper, we will focus on the case with d = 1, followed by a discussion on its extension to d = 2, 3. The differential operator is referred to as the Hamiltonian, i.e., H = − 2 2m ∆ + V . Here, is the reduced Planck constant and m is the mass of the particle. Without loss of generality, this equation is equivalent to: This problem is a Sturm-Liouville eigenvalue problem (see, for example, [9,22]) which has a countable infinite set of positive eigenvalues with an associated set of orthonormal eigenfunctions u j where δ jk is the Kronecker delta which is equal to 1 when j = k and 0 otherwise. The set of all the eigenvalues is the spectrum of the Hamiltonian. We normalise the eigenfunctions in the L 2 space; hence, the eigenfunctions are orthonormal under the scalar inner product. Let us define two bilinear forms where H 1 0 (Ω) is the Sobolev space with functions vanishing at the boundary ∂Ω. Using this notation, the eigenfunctions are also orthogonal with each other with respect to the energy's inner product, i.e., where we have used the integration by parts on (2.2). We remark that these orthogonalities are critical in the development of the proposed neural networks in Section 3.

Isogeometric Analysis
Let H 1 0 (Ω) be the standard Hilbert space where both functions and their weak derivatives are in L 2 (Ω) with functions vanishing at the boundary. At the continuous level, the weak formulation for the eigenvalue problem (2.2) is: Find all eigenvalues E ∈ R + and eigenfunctions u ∈ H 1 0 (Ω) such that, while at the discrete level, the isogeometric analysis (IGA, c.f., [4,6,8,12]) for the eigenvalue problem (2.2) is: Find all eigenvalues E h ∈ R + and eigenfunctions u h ∈ W h such that, is the solution and test space spanned by the B-spline or non-uniform rational basis spline (NURBS) basis functions [5].
In this paper, for comparison with the state-of-the-art, we adopt the recently-developed soft isogeometric analysis (softIGA, c.f., [7,16]). SoftIGA has a similar variational formulation as (2.8) which leads to the matrix eigenvalue problem , and U contains the coefficients of the eigenfunction u h , for its representation in the linear combination of the B-spline basis functions. For simplicity, the matrix K (although it contains a scaled mass) is referred to as the stiffness matrix while the matrix M is referred to as the mass matrix, and (E h , u h ) is the unknown eigenpair. We refer to [7,16] for details.

A Few Challenges on Numerical Computation
There are a few challenges in the numerical computation, such as using IGA or softIGA, of the Schrödinger equation with random potentials. Firstly, there is a non-uniqueness of solutions. The Schrödinger equation with random potentials can have multiple solutions that correspond to the same (or very similar) energy level. This non-uniqueness can make it difficult to accurately identify the correct eigenstates, particularly when dealing with complex potential landscapes. Secondly, it can be a multi-scale problem with large sampling/discretisation errors. Random potentials can vary over multiple length scales, which leads to inaccurate numerical simulations and can make it challenging to choose an appropriate discretisation size and mesh.
Thirdly, the Schrödinger equation for many-body problems in multiple dimensions suffers from the curse of dimensionality. The number of parameters needed to describe the random potential can be very large. As the number of dimensions increases, the computational cost of solving the Schrödinger equation can increase exponentially.
Last but not least, the Schrödinger equation with random potentials is a complex mathematical problem, and solving it numerically can be computationally expensive. This is particularly true for large system sizes and high disorder strengths, which require a large number of numerical simulations. Moreover, solving the Schrödinger equation with a different random potential requires constructing a new matrix eigenvalue problem (2.9), which can be a time-consuming process. This can become particularly challenging when solving the equation for a large number of random potentials. In such cases, one potential solution is to train a neural network that takes the random potential as input and produces the corresponding eigenstates as outputs. For instance, to solve the Schrödinger equation with N different random potentials, one could use a classic method like softIGA and apply it N times, or alternatively, train a neural network using data from a subset of cases (N 0 N ) and then use the well-trained neural network to solve the remaining cases.
Some common types of deep neural networks include convolutional neural networks (CNNs) for image processing and computer vision tasks, recurrent neural networks (RNNs) for sequential data processing, deep belief networks (DBNs) for unsupervised learning, and generative adversarial networks (GANs) for image and video synthesis. Deep neural networks have demonstrated impressive performance in many applications.
In this paper, we adopt the physics-informed neural networks (PINN) with a novel loss function design to solve the Schrödinger equations. Typically, the loss function in PINN is defined as where L de specifies the loss associated with the PDE and L reg specifies the loss associated with the regularisation such as the boundary conditions. The loss function term L de involves derivatives which are usually evaluated by the automatic differentiation by Tensorflow [1] or by Pytorch [19]. In the following subsection, we present a goal-oriented loss function for the best performance in solving the Schrödinger equations with random potentials.

Loss Function Design
To demonstrate the main idea, we focus on the one-dimensional case. We assume that the potential V (x) is randomly distributed over m regions over the domain [0, 1], and each region has a random value. Each set of random values then characterises the individual differential eigenvalue problem to be solved for Anderson localised states. We partition the interval [0, 1] into n + 1 equally spaced points x j = jh, j = 0, 1, · · · , n where h = 1/n is the grid/mesh size. We denote by u h = (u h (x 0 ), u h (x 1 ), · · · , u h (x n )) T as a vector of the approximate values of the true solution u(x) at each of these coordinates x j . Similarly, we denote by V = (V (x 0 ), V (x 1 ), · · · , V (x n )) T as a vector of the approximate values of the potential V (x). In our PINN model, we use choose n such that each region has exactly 5 nodes each for accuracy as well as for training efficiency. To approximate the second derivative vector u (x), we use the centre finite difference method with an accuracy order of six, i.e., the error is of O(h 6 ).
We construct a PINN that takes the vector V as input and split it into two separate networks: one that calculates the eigenvalue (E), and the other one for the eigenfunction u(x). Figure 3.1 draws this model structure, highlighting the split into separate networks. We set the number of nodes in each hidden layer to n, allowing the network to scale with the size of the mesh.
The driving force of this network towards a correct solution is the design of its loss function, which is built of multiple terms relating to different aspects of a correct solution. The neural network's optimisation method (backpropagation) works to minimise the loss function, so we choose terms that are positive errors for incorrect predictions of E and u(x) and are zero for correct predictions. Thus, the minimisation of these terms leads the network to produce correct predictions for E and u(x).
The central term of our loss function is the term: which is always non-negative and is zero for solutions E and u(x) that satisfy equation 2.2; thus, it satisfies the consistency condition. To encourage the network to satisfy the boundary conditions u(0) = u(1) = 0, we create the term: which the minimisation of the loss function leads to u h (x 0 ) = u h (x n ) = 0. The major issue with using just these terms is that the trivial solution u(x) = 0 satisfies these terms (reducing them to zero). Thus, we need the following term to penalise the approximate solution: The minimisation of this loss term leads to approximate solutions of L 2 norm 1. For the integration, we use the mid-point rule and this calculation is majorly local. In the PINN model developed in [13], this condition was imposed using the terms like 1/u 2 (x) and 1/E 2 . We point out that since the eigenstates are localised, an eigenstate u(x) = 0 for most of the domain, leads to very large and even infinite values due to the design of the term 1/u 2 (x). Consequently, this leads to difficulties in minimising the loss function and the overall neural network training. Our design of the loss term 3.13 is novel, sufficient, and computationally efficient to approximate normalised non-trivial solutions.
These terms are sufficient to push the network to generate an admissible eigenstate. However, the model often produces the same eigenstates, preventing the discovery of higher states. To overcome this issue, once an eigenstate (E h , u h (x)) is discovered, we add the eigenstate u h (x) to a list of eigenstates S. Then, we take advantage of the known physical fact that the eigenfunctions of Hamiltonians are orthogonal and construct the following term: where u h (x) is the network's current state to be produced. We iteratively recompile and train the network, collecting the eigenstates once the prediction is acceptable. To decide whether a prediction is admissible, we determine whether L de is below a certain threshold, as well as a patience condition as in [13] that checks if the absolute value of the change in L de and that of the eigenvalue is below another threshold. This allows the network to converge to solutions with L de much lower than the chosen threshold, given that it is converging fast enough.
In this study, we discover that these terms are sufficient in training the model to discover multiple eigenstates, without the use of the drive term described in [13]. The loss function we have built so far performs adequately on problems with enough spread in the eigenvalues between different states. However, in situations where the eigenstates have nearly identical eigenvalues, the network in [13] is unable to distinguish between the states, converging to an undesirable algebraic combination of the eigenfunctions. We believe this is due to that the local minima in the high dimensional loss function graph that represent the correct eigenstates are much closer when the eigenvalues are similar, causing difficulty in the optimisation process. When studying Hamiltonians with potentials generated randomly, there is a disorder for Anderson localisation to occur. We leverage this fact in our loss function with the following procedure.
We let the model run with the above loss function for a certain number of epochs q, which is a hyperparameter. The goal is to set this parameter so that at epoch q, the network has converged to an algebraical combination of the eigenfunctions; for example, see the left plot in Figure 3.2.
Since we know the eigenstates of the Hamiltonian are localised, each spike in 3.2 represents a separate localised state. We thus scan through the eigenfunction and generate a list K of the regions of the domain where the norm of the eigenfunction is greater than some set hyperparameter, splitting apart the regions where the function returns to zero to separate the localised spikes. Then, in our iterative process of discovering eigenvalues, at each step we choose one of the intervals in K (say, [x a , x b ]) and add the following term to the loss function:  Equivalently, we could have added terms like L end for each nodal point outside of [x a , x b ]. We remark that this term is non-negative. With all these loss terms in mind, then we have 16) which is added to L de to give the overall loss function L in (3.10). Throughout the paper, we use this novel loss function in our proposed PINN with the structure shown in Figure 3.1.

Numerical Experiments
We test our PINN model on various Hamiltonians with randomly distributed potential and demonstrate the network's robustness against potential with different distributions. In particular, we consider potentials with distributions that cause the Hamiltonian to admit eigenstates with almost identical eigenenergies. This is most prevalent in potentials distributed according to the Binomial distribution. We test the network against     At this point, the model's prediction has met the convergence criteria and saves the eigenstate to use in L orth and it resets its weights to search for the next eigenstate. In our implementation, we include a function that generates a video to display the network's eigenfunction prediction as the training process evolves, which is available at our GitHub 1 page. We also test the model's robustness against the number of nodes. We note that the model achieves slightly better results for the potential distribution according to the normal distribution compared to the uniform distribution. This may be due to that the points generated by the normal distribution are more concentrated around the mean than in the other cases. However, the results for the Bernoulli distributed potentials are much better, all with relative errors below 1%. This demonstrates the success of our approach to the problem of eigenstates with similar eigenvalue. Table 2 displays the eigenvalue prediction results for a single potential generated by the Bernoulli distribution with a different number of mesh elements. As the number of elements increases, there is a clear improvement in the accuracy of the PINN-predicted eigenvalues. The convergence rate is of order approximately 2 but the theoretical analysis of this convergence is subject to future study.

Concluding Remarks
Recently, there has been a growing interest in using machine learning and neural networks to solve differential equations, particularly in the study of Hamiltonians. In this paper, we extend current models by constructing a neural network that is capable of distinguishing eigenstates from Hamiltonians with randomly  distributed potentials that lead to localisation in the states. Our approach involves incorporating a normalisation loss term to avoid trivial solutions and an orthogonal loss term to search for higher eigenstates. Our method eliminates the need for driving terms, which can hinder a network's ability to converge to an admissible solution. The novelty of our approach lies in a new localisation loss term, which encourages the network to converge to a solution that is localised to a specific region. This idea is based on the physical  fact that localised eigenstates occur in randomly distributed potentials. We utilise the network's initial attempt at convergence to identify regions of localised states before adding this term to achieve the desired approximation. We demonstrate the effectiveness of our approach in discovering eigenstates for potentials generated randomly according to different distributions, with the highlight being the Bernoulli distribution, which leads to eigenstates with identical eigenvalue. Our model successfully predicts these eigenstates with good accuracy, overcoming a challenge faced by current models.
There are several potential avenues for future work. Some potential future directions in this research area include: (1) exploring the use of more complex loss functions that can capture additional physical constraints, such as symmetries or conservation laws; (2) investigating the use of different network architectures, such as convolutional neural networks or attention-based models, to improve accuracy and scalability; (3) generalising the approach to higher-dimensional systems or non-linear differential equations; and (4) investigating the performance of the approach with larger and more complex systems, including systems with a larger number of particles or more complex interactions. Overall, the use of ANNs to solve differential equations and approximate eigenstates of Hamiltonians has shown to be a promising approach, and there are many potential directions for future research in this area.