Learning the black hole metric from holographic conductivity

We construct a neural network to learn the RN-AdS black hole metric based on the data of optical conductivity by holography. The linear perturbative equation for the Maxwell field is rewritten in terms of the optical conductivity such that the neural network is constructed based on the discretization of this differential equation. In contrast to all previous models in AdS/DL (deep learning) duality, the derivative of the metric function appears in the equation of motion and we propose distinct finite difference methods to discretize this function. The notion of the reduced conductivity is also proposed to avoid the divergence of the optical conductivity near the horizon. The dependence of the training outcomes on the location of the cutoff, the temperature as well as the frequency range is investigated in detail. This work provides a concrete example for the reconstruction of the bulk geometry with the given data on the boundary by deep learning.


I. INTRODUCTION
The AdS/CFT correspondence, as a typical implementation of gauge/gravity duality, reveals the deep connections between a (d + 1)-dimensional gravity theory and a d-dimensional quantum field theory [1][2][3][4]. In particular, due to the feature of strong/weak duality in the large N limit, the AdS/CFT correspondence has proven to be a powerful tool to study strongly correlated physics via classical gravitational theories. When applied to many-body systems in condensed matter physics, it has formed an important subject which is now dubbed AdS/CMT duality [5][6][7][8][9][10][11]. In this field, the traditional way is to properly set the structure of the bulk geometry, and then derive the properties of the many-body system living on the boundary by solving the equations of motion in the bulk with the use of the holographic dictionary. Surprisingly, one finds that some of these properties simulate the transport behavior of strongly coupled systems, which has been observed in the laboratory but is very hard to understand based on the standard perturbative method in quantum field theory [6][12][13][14][15][16][17][18][19][20][21][22]. In this direction, remarkable progress has been made in understanding the longstanding problems in condensed matter physics by holography, such as the mechanism of high-temperature superconductivity and the non-Fermi liquid behavior of strange metals [23,24]. Nevertheless, this route contains a vital limitation that prevents us from thoroughly solving the problems faced by experimental physicists in the laboratory. That is, the derived properties of the dual system greatly depend on the specific structure of spacetime in the bulk. Once the setup of the bulk geometry is given, the transport property of the dual system on the boundary is determined. Usually, based on symmetry considerations one may construct the bulk geometry with the essential ingredients to observe the expected phenomenon for a dual system, but this is not the general case.
In general, we do not know what kind of bulk geometry would give rise to the specific property of the boundary theory that one expects. For instance, there are two fundamental problems which have not been solved by AdS/CMT duality. One is to reproduce the expected phase diagram for high-temperature superconductivity; the other is to reproduce all the observed features of strange metals in a single holographic model. In this situation, we are facing the inverse problem of the traditional AdS/CMT method: given the observed features of practical materials in the lab, can we construct a holographic model to reproduce such features properly and finally provide a theoretical understanding of these features? Or more generally, in the context of holography, given the data on the boundary, how can we reconstruct the geometry of the bulk?
Without doubt, this inverse problem is much harder. Much effort has been made to learn the bulk geometry by machine learning/deep learning (DL), which is now viewed as the most advanced technique in artificial intelligence. As far as we know, one may classify the current applications of deep learning to holography into two categories by the types of boundary data used. One takes the entanglement on the boundary as the input data [34,35] and the other takes the vacuum expectation value (VEV) as the data [36,37,39,42]. For the first category, one usually constructs a tensor network as the discretized version of the AdS/CFT correspondence, and then transfers this network to a Boltzmann machine, and a method called entanglement feature learning (EFL) is suggested to learn the spatial geometry from the feature of entanglement on the boundary [34]. A generic neural network is also proposed to recover the geometry fluctuation from the multiregion entanglement entropy on the boundary [35]. The second category is less ambitious but more relevant to AdS/CMT duality. One just specifies the metric of the bulk to be some simple form with one or more unknown functions, and then constructs a neural network based on the equations of motion for matter fields. The goal of the neural network is to learn the unknown functions in the metric from boundary data, which are the VEVs of dual operators on the boundary. Such an approach is now called the AdS/DL method. As the first step, in [36], it is assumed that only one single function in the spacetime metric is unknown. With this ansatz, one constructs the neural network based on the equations of motion of a scalar field and then trains the neural network to learn the corresponding spacetime metric by inputting the experimental data of magnetization and external magnetic field as initial data. Moreover, this AdS/DL method has also been applied to AdS/QCD duality [37,40,41].
Later, the neural network is replaced by another machine learning algorithm called neural ODE, which can serve the same purpose while yielding more accurate results [43].
In all the above work, the neural network is constructed by considering the perturbations of a scalar field. In [42], the perturbation of the metric tensor has been considered and the neural network reflects the RG flow equation of the shear viscosity. In this paper, aiming to apply the AdS/DL method to AdS/CMT duality, we intend to extend the setup to investigate the perturbations of a vector field in the bulk. Specifically, we consider the electromagnetic field A_µ in a charged black hole background and construct the neural network based on the RG flow equation of the optical conductivity of the dual current operator, then learn the black hole metric from the data of the optical conductivity on the boundary.
We organize this paper as follows. In section II, we derive the equation of motion for the linear perturbation of the Maxwell field and then rewrite this equation with the optical conductivity as the fundamental variable. The neural network is constructed based on the discretized version of this equation. In section III, we explain the preparation of the input and output training data. In section IV, we illustrate the results of deep learning and discuss the effects of the chemical potential µ and the region of frequency ω on the final training results. We suggest a novel regularization term to improve training accuracy and save time in hyper-parameter tuning. We also propose several conditions on the regularization term, such as smooth metric and asymptotic AdS, to obtain physically reasonable results.
Conclusions and discussions are given in section V. Appendix A gives details of four kinds of finite difference methods for f′(z). Appendix B gives the detailed training methods, hyperparameters, and all the training results. Appendix C states the running environment of the code.

ITY
In this section, we derive the equation of motion for the optical conductivity in AdS/CMT duality and then construct the corresponding neural network for training the black hole metric. We start with the action of Einstein-Maxwell theory with a negative cosmological constant, where κ² = 8πG with G the Newton constant, and 6/L² is the cosmological constant term with L being the AdS radius. The field strength is F = dA, where A is the Maxwell field.
From this action, the equations of motion can be derived. We consider the following charged black brane solution to (2) with spatially planar symmetry, which is also called the AdS-RN metric¹, where z ∈ [0, 1] is the radial direction of RG flow and f(z) ≡ 1 − z³ − µ²z³/4 + µ²z⁴/4 is the theoretical result obtained by solving the Einstein equations. In this paper, we will treat f(z) as the target function that should be learned by the neural network via data training. The parameter µ is the chemical potential of the dual system on the boundary. In this metric form, the horizon of the black brane is located at z = 1 while the boundary of spacetime is located at z = 0. Moreover, the Hawking temperature, which is given by T = (12 − µ²)/(16π), is identified as the temperature of the dual system in equilibrium. Here, we set µ ∈ (0, √12] to ensure that the Hawking temperature is non-negative. Thus, one may change the temperature of the system by adjusting the value of µ.
Now we consider the optical conductivity of the system following the standard procedure of linear response theory in holographic gravity [7,48]. To obtain the optical conductivity, we turn on an electric field along the x direction by considering a linear perturbation in the bulk. Plugging it into the Maxwell equation, one obtains the linearized equation of motion for A_x. The Green function ⟨J_x J_x⟩ can be read off as −∂_z A_x/A_x from the holographic dictionary, while the applied electric field associated with A_x reads E_x = −∂_t A_x = iωA_x. From the Kubo formula in holographic gravity [6,49], the optical conductivity can then be expressed through these two quantities.
¹ For concreteness, we fix κ = 1 and L = 1 throughout this paper.
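As a quick numerical check of the quantities quoted above, the following minimal sketch evaluates f(z) and the Hawking temperature. The function names are illustrative, and the relation T = −f′(1)/(4π) used for the cross-check is the usual surface-gravity formula, assumed here to apply to the metric form of this paper.

```python
import math

def f_rn(z, mu):
    """Metric function f(z) = 1 - z^3 - mu^2 z^3/4 + mu^2 z^4/4 quoted in the text."""
    return 1.0 - z**3 - mu**2 * z**3 / 4.0 + mu**2 * z**4 / 4.0

def df_rn(z, mu):
    """Analytic derivative f'(z) of the expression above."""
    return -3.0 * z**2 - 3.0 * mu**2 * z**2 / 4.0 + mu**2 * z**3

def hawking_temperature(mu):
    """T = (12 - mu^2) / (16 pi), as quoted in the text."""
    return (12.0 - mu**2) / (16.0 * math.pi)

for mu in (1.0, 2.0, math.sqrt(12.0)):
    # Assumption: T = -f'(1) / (4 pi) for a horizon located at z = 1.
    t_from_slope = -df_rn(1.0, mu) / (4.0 * math.pi)
    print(f"mu={mu:.4f}  f(1)={f_rn(1.0, mu):.1e}  "
          f"T={hawking_temperature(mu):.6f}  -f'(1)/(4 pi)={t_from_slope:.6f}")
```

Both expressions for T agree for every µ, and T vanishes at µ = √12, which is why larger µ corresponds to lower temperature.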
Next, we intend to rewrite Eq.(5) as a differential equation in terms of the optical conductivity. For this purpose, we notice the relation in Eq.(8). Furthermore, dividing (5) by A_x(z) and using Eq.(8), we obtain the differential equation for the optical conductivity, Eq.(10). Now the original second-order differential equation with variable A_x(z) becomes a first-order differential equation with variable σ(z). We are ready to construct the neural network and train the metric function f(z) based on this equation. First, we discretize equation (10) by evenly sampling along the z-axis, where N is the number of layers of the network, while z_b and z_h are the locations of the cutoffs near the boundary and the horizon, respectively. In addition, n ∈ [0, N − 1], n ∈ Z. We rewrite equation (10) into the real part and the imaginary part separately. Next, we are concerned with the boundary conditions on both ends of the z-axis. At the horizon, we impose the ingoing boundary condition for A_x(z). As a result, the optical conductivity splits into two terms, where we denote the second term as σ_r(z, ω) and name it the reduced optical conductivity. With the ingoing boundary condition, the first term of the conductivity becomes divergent near the horizon, which easily sabotages the numerics.
Therefore, we intend to treat the reduced optical conductivity as the basic variable for the construction of a deep neural network. In order to reveal the relation between the discretized equations of motion for the reduced conductivity and a neural network, we convert them into a matrix form with a 2×2 matrix W and a vector b proportional to ∆z. Here, W, which contains the information of the spacetime metric, can be regarded as the weight matrix of a neural network. The weight matrix in deep learning represents the connecting parameters of the neurons of adjacent layers. Naturally, b can be regarded as the bias term of the network. In addition, according to the matrix form, the activation function is the identity mapping.
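The reduction from a second-order equation for A_x to a first-order equation for σ can be illustrated with a generic sketch. The coefficients p(z) and q(z) below are real placeholders rather than the coefficients of Eq.(5), and the normalization of σ is an assumed convention:

```latex
\begin{aligned}
A_x'' + p(z)\,A_x' + q(z)\,A_x &= 0, \qquad \sigma \equiv \frac{A_x'}{i\omega A_x}, \\
\sigma' = \frac{A_x''}{i\omega A_x} - \frac{(A_x')^2}{i\omega A_x^2}
        &= -p\,\sigma + \frac{i\,q}{\omega} - i\omega\,\sigma^2, \\
\partial_z \mathrm{Re}\,\sigma &= -p\,\mathrm{Re}\,\sigma + 2\omega\,\mathrm{Im}\,\sigma\,\mathrm{Re}\,\sigma, \\
\partial_z \mathrm{Im}\,\sigma &= -p\,\mathrm{Im}\,\sigma + \frac{q}{\omega}
        - \omega\left[(\mathrm{Re}\,\sigma)^2 - (\mathrm{Im}\,\sigma)^2\right].
\end{aligned}
```

With p = f′/f, the real part has the same structure as the Re σ update listed in Appendix A, which is how the derivative of the metric function enters the network.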
As a result, we construct the following neural network to represent the discretized equation of motion for the reduced conductivity (Fig. 1). Physically speaking, we can consider the network structure as the spacetime structure because the connecting weights of the network contain the metric information and the propagation direction of the network is the holographic direction [36]. Also, we can imagine a scene where the conductivity travels along the network by perceiving the spacetime information locally.
However, we notice that the derivative of the metric function, namely f′(z), appears in the equations of motion as well, which is in contrast to the discretized version of the equations of motion appearing in previous literature on deep learning in holography [36,42]. In principle, we may treat f(z) and f′(z) as two independent variables and train them independently. In practice, however, we find that this makes the optimization process of deep learning much more difficult. To obtain a feasible deep learning process, we discretize f′(z) in terms of f(z). There are many different ways to discretize f′(z). Here we select four distinct varieties and investigate their effects on the final results (see detailed information in Appendix A).
We introduce the loss function to evaluate the difference between the true values and the results predicted by the neural network. One criterion in designing a neural network is to make the loss function as small as possible. Following previous work [36,42], we introduce two loss functions, each of which includes a regularization term L_REG. In these loss functions, σ is the input data while σ̂ is what the network predicts. c_1 and c_2 are hyper-parameters, which we can tune manually. n_epoch is the number of epochs we run, where an epoch is defined as the period in which the full data set propagates through the neural network once. Here the regularization term L_REG contains two terms that are used to find a physically reasonable metric, which differs greatly from the common purpose of preventing overfitting. The first term is to guarantee the asymptotically AdS property of spacetime at z_b = 0, while the second term is to suppress large gradients so that the neural network can efficiently find a smooth metric function numerically. We find both terms are important for the deep learning process, just as in previous work [36,39,42].
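A minimal sketch of four finite-difference estimates of f′(z)/f(z) is given below. The labels follow the names used in this paper ("f(z) forward", "ln f(z) middle", and so on), but only the "ln f(z) middle" form is confirmed by the update rule in Appendix A; the other three formulas are assumed by analogy, and the step size is an illustrative choice.

```python
import numpy as np

def f_rn(z, mu=1.0):
    """Metric function quoted in Section II."""
    return 1.0 - z**3 - mu**2 * z**3 / 4.0 + mu**2 * z**4 / 4.0

def dlnf_exact(z, mu=1.0):
    """Exact f'(z)/f(z), used only as a reference value."""
    df = -3.0 * z**2 - 3.0 * mu**2 * z**2 / 4.0 + mu**2 * z**3
    return df / f_rn(z, mu)

def dlnf_estimates(f, z, dz):
    """Four finite-difference estimates of f'(z)/f(z) (formulas partly assumed)."""
    fm, f0, fp = f(z - dz), f(z), f(z + dz)
    return {
        "f forward": (fp - f0) / (dz * f0),
        "f middle": (fp - fm) / (2.0 * dz * f0),
        "ln f forward": (np.log(fp) - np.log(f0)) / dz,
        "ln f middle": (np.log(fp) - np.log(fm)) / (2.0 * dz),
    }

z = 0.5
dz = (0.99 - 0.01) / 11  # illustrative step for an 11-step grid
exact = dlnf_exact(z)
errors = {name: abs(value - exact)
          for name, value in dlnf_estimates(f_rn, z, dz).items()}
for name, err in errors.items():
    print(f"{name:13s} |error| = {err:.3e}")
```

At a single point the centered ("middle") estimates have the smaller local error, as expected from their higher order. This local comparison does not by itself decide which scheme trains best; that question is addressed empirically in Section III.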
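The structure of such a loss can be sketched as follows. This is a hypothetical form, not the paper's exact expression: the data term is a mean L1 or L2 mismatch, the first penalty pulls the metric function at the boundary layer towards 1 (asymptotically AdS), and the second penalizes jumps between adjacent layers. Any dependence of the coefficients on n_epoch is not reproduced here.

```python
import numpy as np

def data_loss(sigma_pred, sigma_true, order=1):
    """Mean L1 (order=1) or L2 (order=2) mismatch between complex conductivities."""
    diff = np.abs(np.asarray(sigma_pred) - np.asarray(sigma_true))
    return float(np.mean(diff**order))

def regularization(f_layers, c1, c2, f_boundary=1.0):
    """Hypothetical two-term penalty: AdS asymptotics plus smoothness."""
    f_layers = np.asarray(f_layers, dtype=float)
    ads_term = (f_layers[0] - f_boundary) ** 2      # f -> 1 at the boundary
    smooth_term = np.sum(np.diff(f_layers) ** 2)    # penalize large jumps
    return float(c1 * ads_term + c2 * smooth_term)

def total_loss(sigma_pred, sigma_true, f_layers, c1, c2, order=1):
    """Data mismatch plus regularization (assumed additive combination)."""
    return data_loss(sigma_pred, sigma_true, order) + regularization(f_layers, c1, c2)

# Illustrative numbers only.
f_guess = [1.0, 0.9, 0.7, 0.4]
print(total_loss([1.0 + 0.2j], [1.0 + 0.1j], f_guess, c1=1.0, c2=0.1))
```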

III. THE SETUP FOR TRAINING DATA AND DISCRETIZATION
In this section, we present the setup for the training data and figure out the best way to discretize f′(z). Given the theoretical result of the metric function f(z), from (10) one can numerically obtain the data of optical conductivity from the boundary (σ(z_b)) to the horizon (σ(z_h)) for any specified frequency ω > 0, as performed in the ordinary holographic approach. Now we reverse the problem by setting f(z) as an unknown function and try to learn it by inputting the data of optical conductivity. For this purpose, near the boundary z = 0, we fix the location of the cutoff at z_b = 0.01 and input 2000 numerical values of the optical conductivity with ω uniformly sampled along (0.1, 1] as initial data at the cutoff. Next, since the conductivity becomes divergent at the horizon z = 1, we also need to introduce a cutoff z_h near the horizon. Now for each input datum, one can generate the conductivity at z_h as output data through the neural network. Finally, one can train the neural network to learn the metric function by comparing the output data with the theoretical results. All the training data {(σ(z_b), σ(z_h))} we use can be found in [54]. Here, we study two cases, z_h = 0.9 and z_h = 0.99, to test the learning ability of the neural network.
Next, we need to fix the number of layers in the neural network. The discretization in the process of deep learning introduces truncation error, which can be decreased by increasing the number of layers in the neural network. In theory, constructing deeper neural networks with more layers would improve the accuracy, but at the price of more time and computational resources. In practice, we find that an 11-layer neural network in this work suffices to provide results that are strikingly close to those of deeper networks.
Finally, we intend to pick out the best way of discretizing f′(z) for the neural network.
For this purpose, we show the output data of the standard conductivity and the reduced conductivity at z_h = 0.99 with µ = 1 in Fig. 2 and Fig. 3, respectively, which are generated by the neural network with various discretizations of f′(z). We also present, as the brown curve, the numerical result obtained by directly solving the differential equation with the finite difference method, which might be viewed as the "true" values of the conductivity, namely the data obtained by the deep neural network in the continuous limit.
First, let us focus on the output of the standard conductivity in Fig. 2. It is noticed that the data of the conductivity obtained by the "f(z) forward" discretization looks closer to the data in the continuous limit. However, in practice, we find that the "ln f(z) forward" discretization performs more robustly, and with it one can get more accurate metric information than with the "f(z) forward" discretization (see the comparison in Table I). We can make the difference between the results of "f(z) forward" and "ln f(z) forward" smaller by increasing the number of network layers, but the "ln f(z) forward" method is intrinsically more robust.
Therefore, with comprehensive consideration, we decide to adopt the "ln f(z) forward" method to train the network. Similarly, we compare the output of the reduced conductivity in Fig. 3, and find that the "f(z) forward" discretization is the best choice for the deep learning process of the reduced conductivity.
We obtain similar results for the cases with (z_h = 0.99, µ = 2), (z_h = 0.9, µ = 2) and other combinations (in Appendix B). As a result, we choose the "ln f(z) forward" discretization for the standard conductivity and the "f(z) forward" discretization for the reduced conductivity in the construction of the neural network.
More importantly, we find that the output data of the reduced conductivity at z_h is much closer to the data of the continuous limit than that of the standard conductivity.
In particular, as z_h approaches the location of the horizon, the reduced conductivity exhibits its advantages more evidently since the divergent part has been peeled off. So in the next section, we focus on the results of the neural network constructed with the reduced conductivity. For full results, please see [54].
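The effect of depth on truncation error can be illustrated with a toy problem. The sketch below keeps only the −(f′/f) term of the real-part update in Appendix A (the 2ω Im σ Re σ term is dropped), because the resulting linear equation y′ = −(f′/f) y has the closed-form solution y(z) = y(z_b) f(z_b)/f(z). Identifying the number of Euler steps with the number of layers, and the grid z_n = z_b + n∆z, are assumptions of this sketch.

```python
import numpy as np

def f_rn(z, mu=1.0):
    """Metric function quoted in Section II."""
    return 1.0 - z**3 - mu**2 * z**3 / 4.0 + mu**2 * z**4 / 4.0

def propagate(n_steps, z_b=0.01, z_h=0.9, mu=1.0, y0=1.0):
    """Propagate y' = -(f'/f) y from z_b to z_h with 'ln f forward' Euler steps."""
    z = np.linspace(z_b, z_h, n_steps + 1)
    ln_f = np.log(f_rn(z, mu))
    y = y0
    for n in range(n_steps):
        # One layer: y_{n+1} = y_n - (ln f_{n+1} - ln f_n) * y_n
        y = y - (ln_f[n + 1] - ln_f[n]) * y
    return y

exact = f_rn(0.01) / f_rn(0.9)  # y(z) f(z) is conserved along the flow
errors = {n: abs(propagate(n) - exact) for n in (11, 22, 110)}
for n, err in errors.items():
    print(f"steps={n:4d}  |error| = {err:.3e}")
```

The error shrinks monotonically as the grid is refined, which is the trade-off between accuracy and cost described above.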

IV. RESULTS OF THE LEARNED METRIC
A. The result of learned metric with µ = 1 and z_h = 0.99
Firstly, we show a typical example of the training results for the metric function f(z) with µ = 1 and z_h = 0.99, which is illustrated in Fig. 4. The left figure is the result of the learned metric. It shows that after the deep learning process, the initial randomly selected metric becomes the true metric. The two plots on the right-hand side are the output data of the reduced conductivity at z_h. They show that the reduced conductivity generated by the initial metric is far away from the true one, while after the deep learning process, it is quite close to the true conductivity, indicating that the neural network has successfully learned the metric from the reduced conductivity. Also, we find that after the first training process, the results
In this subsection, we discuss the effects of the chemical potential µ on the training results, which can also be understood as the influence of temperature. Fig. 5 shows the deep learning results of the reduced conductivity at z_h = 0.99 with µ = 1 (left plot) and µ = 2 (right plot), respectively. We have tried various initial guesses and found they all converge to the true values of f(z) with great accuracy. This shows that the neural network is powerful and robust in learning the metric from optical conductivity.
We show more concrete performance criteria in Fig. 6. It is noticed that the deep learning performs better for µ = 1 than for µ = 2. This result also holds for many other training data [54]. Nevertheless, one can see that both training results are greatly improved after the second training process.
Figure caption (partial): The orange, red, blue, and purple curves represent the real part of the reduced conductivity generated by the true metric, the random initial metric, the metric after the first training process, and the final metric after the double training procedure, respectively. (b-2) The imaginary part of the reduced conductivity generated by different metrics.
to whether the neural network can learn the metric with the data in a narrow range of ω [42]. Here we intend to check whether this is also true for optical conductivity. For this purpose, we study two different ranges of ω: Fig. 8 gives the concrete performance criteria of three different ω ranges at z_h = 0.99 and µ = 1. We find that the training performance is better when the range of ω is wider. In addition, a larger µ will worsen the training outcomes, as illustrated in Appendix B.

V. CONCLUSION AND DISCUSSION
We have constructed a neural network to learn the RN-AdS black hole metric in the bulk based on the data of optical conductivity on the boundary by holography. The equation of motion that we recast into a neural network is generated by perturbing the vector field, thus enriching the prior research that only studied the scalar field or the metric tensor field. In contrast to previous models, in this circumstance the derivative of the metric function, f′(z), appears in the equation of motion, and we have proposed four distinct finite difference methods to discretize f′(z). We have investigated their performance during the deep learning process in detail. Furthermore, to recast the equations of motion into a numerically feasible neural network, we have defined the reduced conductivity to avoid the divergence of the optical conductivity near the horizon. In addition, we have proposed a novel regularization term that automatically tunes the hyper-parameters, which ensures the robustness and efficiency of the training methods. We have also discussed the dependence of the training outcomes on the location of the cutoff z_h, the temperature as well as the frequency range.
It turns out that the network is harder to train as z_h approaches the horizon and as the temperature decreases. Given the number of data points, the training results with a wider range of ω are better than those with a narrower range. This can be understood from the fact that data from wider ranges of frequency contain more information than data from narrower ranges.
This work has explicitly demonstrated the remarkable power of deep learning in the reconstruction of the spacetime with the given data on the boundary. For further study, we expect the AdS/DL method may be applied to AdS/CMT duality and shed light on the open problems in strongly coupled many-body systems. For instance, given the RG flow data of the optical conductivity of the strange metal, the neural network would learn the metric of the bulk geometry which is capable of reproducing all the transport features of the strange metal. Currently, such a metric in the framework of AdS/CMT is unknown. Without doubt, the neural network presented in this paper is too simple to accomplish this task. We expect it could be developed into a network with richer structure and functions, such that its ability to learn the background information could be greatly improved. As the next step, one could consider a neural network with more neurons such that it could learn several unknown functions rather than a single unknown function in the metric. In addition, we expect the AdS/DL method may be applied to more holographic models and learn the bulk geometry by inputting the data of other transport quantities such as the thermal conductivity.
An even more ambitious goal of AdS/DL is to learn the action of the dual gravity system from boundary data, which is of crucial significance not only for finding holographic models to understand important phenomena in dual systems but also for comprehending the implications of machine learning in holographic reconstruction of spacetime geometry. However, many challenges persist in realizing this goal, such as the degeneracy between the boundary data and the action of the dual theory, the construction of a machine learning model that can establish a relationship between the boundary data and the action represented by a symbolic system, etc. Recent advancements in machine learning offer promising prospects for directly learning the action from boundary data. For instance, the representation of symbolic space is comparable to natural language, and there exist highly effective methods, such as seq2seq [50], that can effectively address the problem. The SymbolicMathematics [51], empowered by seq2seq, is even more powerful in solving integral problems than well-known commercial software such as Mathematica and Matlab. These methods provide valuable strategies for representing and exploring the symbolic space. To solve the problem of degeneracy, on one hand, we can reduce the necessary variables in the model based on physical considerations, such as symmetry requirements. On the other hand, compared to the electrical conductivity that we currently consider, one may further reduce the degeneracy by increasing the type of boundary data, such as thermal conductivity, entropy, etc. Furthermore, a crucial capability of machine learning is generalization, meaning that it has the potential to address problems outside its training data range. Currently, advances such as Diffusion models and ChatGPT have robustly demonstrated this [52,53]. 
We have reason to believe that given sufficiently high-quality data sets, machine learning has the potential to learn more intrinsic properties of holographic gravity and greatly contribute to the development of the AdS/DL.

ln f(z) middle
\[
\mathrm{Re}\,\sigma(z+\Delta z) = \mathrm{Re}\,\sigma(z) + \frac{\ln f(z-\Delta z) - \ln f(z+\Delta z)}{2}\,\mathrm{Re}\,\sigma(z) + \Delta z\, 2\omega\, \mathrm{Im}\,\sigma(z)\,\mathrm{Re}\,\sigma(z).
\]
We remark that for both of the middle methods the first layer is not defined, since the data of the current layer depends on the data in the previous and next layers. Thus, in practice we adopt the forward difference method to define the first layer. For more details, please see Appendix C.
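The Re σ update quoted above can be written as a single-layer function. In the sketch below the middle-difference step follows that expression directly, while the forward-difference step used for the first layer is an assumed form obtained by replacing the centered difference of ln f with a one-sided one. Im σ is taken as a given input, and the numerical values are illustrative only.

```python
import numpy as np

def f_rn(z, mu=1.0):
    """Metric function quoted in Section II."""
    return 1.0 - z**3 - mu**2 * z**3 / 4.0 + mu**2 * z**4 / 4.0

def re_sigma_step_middle(re_sigma, im_sigma, ln_f_prev, ln_f_next, dz, omega):
    """One layer of the 'ln f middle' update for Re sigma."""
    return (re_sigma
            + 0.5 * (ln_f_prev - ln_f_next) * re_sigma
            + dz * 2.0 * omega * im_sigma * re_sigma)

def re_sigma_step_forward(re_sigma, im_sigma, ln_f_here, ln_f_next, dz, omega):
    """Forward-difference variant (assumed form) for the first layer,
    where the value at z - dz is not available."""
    return (re_sigma
            + (ln_f_here - ln_f_next) * re_sigma
            + dz * 2.0 * omega * im_sigma * re_sigma)

z = np.linspace(0.01, 0.99, 12)      # illustrative grid
dz = z[1] - z[0]
ln_f = np.log(f_rn(z))
re0, im0, omega = 1.0, 0.1, 0.5      # hypothetical input values

# First layer: only the forward difference is defined.
re1 = re_sigma_step_forward(re0, im0, ln_f[0], ln_f[1], dz, omega)
# Interior layer: the centered difference needs both neighbours of z[1].
re2 = re_sigma_step_middle(re1, im0, ln_f[0], ln_f[2], dz, omega)
print(re1, re2)
```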

Appendix B: DNN training parameters and results
In this appendix we present the details of the training process, including the setup for epochs, loss functions, learning rate, the optimization algorithm as well as the training criteria.
As a whole, the training process is divided into two steps. The first step contains 3001 epochs, while the second step contains 2001 epochs. The loss function for each step has been shown in the main body of the paper. Learning with the L_1-loss is faster, while the L_2-loss makes the final metric smoother.
For the optimization algorithm, we use the RMSprop optimizer in the first step and the Adam optimizer in the second step. The batch size is fixed as 200.
For the learning rate, we reduce it gradually as the epoch number increases. For the training criteria, i represents the i-th layer and f_i^p is the quantity that the network trains, which is just the metric function f(z) in this work. f_i^p represents the predicted metric, while f_i^t refers to the true metric.
To prevent contingency factors and statistical fluctuations from influencing the learning process, we train 5 times for each training process and take the average of these results as our final results. All the training results are listed below:
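A layer-wise comparison between the predicted metric f_i^p and the true metric f_i^t can be sketched as a mean relative error. The precise criterion used in this work is not reproduced here; the formula below is a hypothetical stand-in, and the sample values are illustrative.

```python
import numpy as np

def mean_relative_error(f_pred, f_true):
    """Hypothetical criterion: average over layers i of |f_i^p - f_i^t| / |f_i^t|."""
    f_pred = np.asarray(f_pred, dtype=float)
    f_true = np.asarray(f_true, dtype=float)
    return float(np.mean(np.abs(f_pred - f_true) / np.abs(f_true)))

# Illustrative values for a three-layer metric.
f_true = [1.0, 0.8, 0.5]
f_pred = [1.0, 0.84, 0.45]
print(mean_relative_error(f_pred, f_true))
```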