Deep-Learning Estimation of Band Gap with the Reading-Periodic-Table Method and Periodic Convolution Layer

We verified that the deep learning method named reading periodic table, introduced in the study "Deep Learning Model for Finding New Superconductors", which uses deep learning to read the periodic table and learn the laws of the elements, is applicable not only to superconductors, for which the method was originally developed, but also to other materials problems, by demonstrating band gap estimation. We then extended the method to learn these laws better by directly learning the cylindrical periodicity between the right- and left-most columns of the periodic table at the learning representation level, that is, by treating the left- and right-most columns as adjacent to each other. Thus, while the original method handles the table as is, the extended method treats the periodic table as if its two edges were connected. This is achieved using novel layers named periodic convolution layers, which can handle inputs exhibiting periodicity and may be applied to other problems, for example in computer vision or time series analysis, whenever the data possess some periodicity. In the reading periodic table method, no material feature or descriptor is required as input. We demonstrated two types of deep learning estimation: estimating the existence of a band gap, and estimating the value of the band gap given that its existence in the material is known. Finally, we discuss the limitations of the dataset and of the model evaluation method. A random train-test split may not allow us to distinguish good models; thus, an appropriate dataset must be prepared in which the training and test data are temporally separated. The code and data are open.


I. INTRODUCTION.
Machine learning (ML) methods have been employed in the search for inorganic materials [1,2], and useful libraries have been developed [3,4]. In the case of organic materials, machine learning methods have been studied, typically by employing graph structures [5,6], which are relevant to problems in computer science. Researchers have invested more effort into searching for organic materials than for inorganic ones. However, the search for inorganic materials still has a wide scope. Density functional theory (DFT) [7][8][9], which involves first-principles calculations, requires expensive computational resources, is difficult to apply to highly correlated systems, and often requires ordered crystal structures, although progress is being made to address such issues. Machine learning and DFT can be used to complement each other, and both must be investigated further. Deep learning [10][11][12] has achieved advances in image recognition [13], machine translation [14], image generation [15], natural language inference [16,17], raw audio generation [18], and imperfect information games [19]. Deep learning is now finding increasing applications in mathematics and physics as well, including quantum systems [20], particle physics [21], the Gauss-Manin connection [22], neural networks and quantum field theory [23], holographic QCD [24], conformal field theory [25], black hole metrics [26], supergravity [27], Seiberg-Witten curves [28], and Calabi-Yau manifolds [29,30]. Deep learning possesses superior capabilities over prior machine learning methods, such as support vector machines (SVM) [31,32], random forest [33], and k-means [34][35][36].
A deep learning method for finding new superconductors was introduced in our previous study [1]. (The critical temperature of superconductors was also studied in Ref. [37] using random forest, which is a machine learning method but not technically deep learning.) In that study [1], a deep learning model was trained to read the periodic table [...] Hence, it is better to use a learning representation that allows the deep learning model to reliably learn the laws represented by the cylindrical periodicity. We extended our previous method [1] to reflect the cylindrical periodicity at the learning representation level.
We designed this functionality as a new layer named the periodic convolution layer, which processes the periodic table as if the table were rolled up into a cylinder, even though the input is the ordinary two-dimensional periodic table. This method is named reading periodic table with cylindrical periodicity. The periodic convolution layer can be used for other problems, for example in computer vision or time series analysis, if the data being examined contain some periodicity.
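The idea of the periodic convolution layer can be illustrated with a minimal sketch. The code below is not the implementation released with this paper; it is a single-channel, stride-1 toy version in plain Python, written under the assumptions that the kernel has odd height and width, that periodicity applies only along the columns, and that the rows are zero-padded. The names `wrap_pad_columns`, `zero_pad_rows`, and `periodic_conv2d` are hypothetical.

```python
def wrap_pad_columns(grid, pad):
    """Pad each row with `pad` columns copied from the opposite edge."""
    if pad == 0:
        return [list(row) for row in grid]
    return [list(row[-pad:]) + list(row) + list(row[:pad]) for row in grid]


def zero_pad_rows(grid, pad):
    """Add `pad` rows of zeros above and below the grid (assumed, non-periodic)."""
    width = len(grid[0])
    top = [[0.0] * width for _ in range(pad)]
    bottom = [[0.0] * width for _ in range(pad)]
    return top + [list(row) for row in grid] + bottom


def periodic_conv2d(grid, kernel):
    """Cross-correlate `grid` with `kernel`, treating the left-most and
    right-most columns as adjacent. Output has the same shape as `grid`."""
    kh, kw = len(kernel), len(kernel[0])
    padded = zero_pad_rows(wrap_pad_columns(grid, kw // 2), kh // 2)
    out = []
    for r in range(len(grid)):
        out_row = []
        for c in range(len(grid[0])):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * padded[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Toy input: a single non-zero entry in the left-most column.
grid = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
# Kernel that sums each cell with its left and right neighbours.
kernel = [[0.0, 0.0, 0.0],
          [1.0, 1.0, 1.0],
          [0.0, 0.0, 0.0]]
result = periodic_conv2d(grid, kernel)
# result[0][3] is non-zero: the right-most column "sees" the left-most one.
```

With ordinary zero padding on the columns, the right-most entry of the first row would be zero; the wrap-around padding is what connects the two edges of the table.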
To solve the two aforementioned problems, we demonstrate that the extended method can predict band gaps. A band gap is a fundamental material property that forms the basis for separation among conductors, semiconductors, and insulators; it influences thermal and electrical conductivities, the functioning of light emitting and laser diodes, etc. In addition, its structure is relevant to some topological matter that has recently been reported and is receiving considerable attention. To design functional materials, knowledge of the band gap is important. ML-based band-gap estimation has already been performed in earlier studies [38][39][40][41][42][43]. We demonstrate that the proposed method of reading periodic table with cylindrical periodicity has estimation capabilities comparable to those of the SVM classification and regression studied in Ref. [43]. We also highlight a problem with the random train-test split model evaluation scheme: it may not allow us to identify good models.

The main contributions of the paper are as follows. In the previous study, we introduced the method named reading periodic table, which uses deep learning to read the periodic table and estimate the critical temperature of superconductors. In this paper, we extended the method so that deep learning also learns the cylindrical periodicity directly, i.e., by considering the fact that the right- and left-most columns are adjacent in the table. We then demonstrated that the method has wide applicability to material-related problems other than superconductors by demonstrating two band gap estimations. This verification is necessary for future applications. Furthermore, it is not necessary to input the crystal structure of the materials. This is both advantageous and disadvantageous. The advantage is as follows.
Because we do not need to input spatial information, only the chemical formula is needed to estimate material properties. We do not have to calculate the spatial structure of new materials via first-principles calculations, which have a high computational cost, before inputting the spatial structure to the machine learning model. Furthermore, we could not acquire spatial information for the experimental data on band gaps. Databases of experimentally measured spatial information are available, such as the Crystallography Open Database (COD) [44][45][46]. COD is an open-access collection of the crystal structures of organic, inorganic, and metal-organic materials, and it has approximately 460,000 entries as of 2020. However, there are no such databases for specific problems like band gaps, superconductors, and so forth. It takes significant effort and time to determine the spatial structure of the experimental data of materials related to band gap problems using such a database. In such databases, the same chemical formulas can have different spatial structures. To determine the spatial structure of a specific material, we need to check the original papers. The disadvantage of not inputting the crystal structure is that the capability of machine learning would be better if the complete spatial structure information were used, since it plays an important role in physics. Although it is not visibly apparent, the left- and right-most columns of the periodic table are adjacent, and the table thus possesses cylindrical periodicity. Thus, for the above-mentioned reasons, we extend the method so that deep learning can also learn the laws represented by the cylindrical periodicity more directly at the learning representation level, as illustrated in Fig. 1. We achieve this functionality using a new layer named the periodic convolution layer, as illustrated in Fig. 2. The layer can be applied to other problems that use input data possessing some periodicity.

A. Hyperparameters
For both estimation tasks, we used the absolute representation in the method of reading periodic table with cylindrical periodicity, with a learning rate of 10^-4, Adam as the optimizer, and a network depth of 64. A depth of 64 is not particularly deep compared with modern neural network architectures.
In the absolute representation, H2O is represented as H: 2, O: 1, whereas in the relative representation, it is H: 2/3, O: 1/3. Except for the fact that the representation is the absolute one, the hyperparameters are identical to those used in our previous study [1]; the learning rate is 10^-4, and the number of epochs is 200 for both band gap existence estimation and band gap value estimation. Parameter tuning may improve the scores.
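The difference between the two representations can be made concrete with a short sketch. The parser below is only an illustration: it assumes flat formulas with integer subscripts (no parentheses, hydrates, or fractional stoichiometry), and the function names are hypothetical rather than taken from the released code.

```python
import re
from fractions import Fraction

# One element symbol (e.g. "H", "Fe") followed by an optional integer count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def absolute_representation(formula):
    """Map each element symbol to its count in the formula."""
    counts = {}
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def relative_representation(formula):
    """Map each element symbol to its fraction of all atoms in the formula."""
    counts = absolute_representation(formula)
    total = sum(counts.values())
    return {symbol: Fraction(n, total) for symbol, n in counts.items()}


water_abs = absolute_representation("H2O")   # {'H': 2, 'O': 1}
water_rel = relative_representation("H2O")   # H: 2/3, O: 1/3
```

Note that the absolute representation distinguishes H2O from H4O2, whereas the relative representation maps both to the same fractions.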

B. Neural Network Structure
Basic periodic block. To explain the model structure, we need to describe the basic periodic block. Let a basic periodic block be denoted by basic periodic block[input channel, inside channel, output channel, kernel size], which is composed of the following layers. First, we have the periodic convolution[input channel, output channel=inside channel, kernel size], followed by batch norm, followed by a ReLU, followed by periodic convolution[input channel=inside channel, output channel, kernel size], followed by batch norm, and then followed by a ReLU.
We also have a skip connection in the basic periodic block between the input of the block and the point just before the last ReLU.
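The order of operations in the basic periodic block, including where the skip connection joins, can be sketched as follows. This is a structural illustration only: the convolution and normalization steps are passed in as placeholder callables (`conv1`, `norm1`, `conv2`, `norm2` are hypothetical names), the data are plain nested lists rather than multi-channel tensors, and the sketch assumes the block's input and output have the same shape so that they can be added.

```python
def relu(grid):
    """Element-wise ReLU on a 2D list."""
    return [[max(0.0, v) for v in row] for row in grid]


def add(a, b):
    """Element-wise sum of two 2D lists of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def basic_periodic_block(x, conv1, norm1, conv2, norm2):
    """periodic conv -> batch norm -> ReLU -> periodic conv -> batch norm,
    then add the block input (skip connection) before the last ReLU."""
    h = relu(norm1(conv1(x)))
    h = norm2(conv2(h))
    return relu(add(h, x))


# Placeholder layers: identity functions stand in for the real
# periodic convolutions and batch normalizations.
identity = lambda grid: grid
block_out = basic_periodic_block([[1.0, -2.0]], identity, identity, identity, identity)
```

Adding the input before the final ReLU, rather than after it, follows the description above and matches the usual residual-block layout.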
Next, we have conv[input channel=10, output channel=20, kernel size=(2, 4), stride=(1, 2), padding=(0, 3)] followed by batch norm and a ReLU. Then, basic periodic block[input channel=20, inside channel=20, output channel=20, kernel size=(3,3)] follows. Next, we have conv[input channel=20, output channel=30, kernel size=(2, 4), stride=(1, 2), padding=(0,

Here, we demonstrate two types of estimations: (1) estimating the existence of a band gap and predicting whether the material is a nonmetal, and (2) estimating the values of the band gap given that the existence of a band gap in the materials is known in advance. We used essentially the same dataset as the previous SVM-based study [43], with corrections, to compare our results with the SVM-based results. It is noteworthy that the previous study used an existing method, SVM, whereas we invented a novel method.

B. Summary of the Data
We use the experimental data of 3734 gapped materials, of which 2473 are unique compositions. For the binary classification, we also use the first-principles calculation data of 2450 non-gapped materials from the Materials Project [47]. The numbers are balanced to avoid possible bias caused by an imbalance between the number of gapped and non-gapped materials. Only unique compositions are used. We use only the experimental data for the regression of band gaps. We use all the experimental data of the 3734 gapped materials to accord with previous studies.

C. Classification of band gap existence.
We randomly split the dataset into training and test data in a ratio of 80 to 20. The results of the binary classification are summarized in Table I, and the ROC curve is illustrated in Fig. 3.

The dataset used had 3734 compositions. We did not remove identical compositions with different band gaps to accord with the previous study, as they may differ in their crystal structures, which could not be understood from the available data. We also randomly separated the dataset into training and test data in a ratio of 80 to 20. The results of band-gap regression (for test data) are summarized in Table II, and the scatter plot is illustrated in Fig. 4.

We demonstrated the applicability of the method for other material-related problems by solving two problems related to the band gap. The deep learning method satisfactorily provides a binary prediction of band-gap existence and a regression of the band-gap values.
The training and test data obtained by randomly splitting the dataset were used to compare our results with those of the previous studies. However, because the test data is inevitably similar to the training data, the estimation by machine learning with the test data may become very easy. To more accurately evaluate models, we require an appropriately prepared dataset. Thus, the data must be divided temporally, such that the data until a certain year constitute the training data and the data after that year constitute the test data, as discussed in Ref. [1]. This is the same situation under which we will use machine learning models for material search, and the evaluation of the machine learning models must be performed under similar conditions. However, as we could not know when the data was obtained, we could not temporally separate the dataset in this study. Contrary to the naive intuition of a machine learning novice, the preparation of an appropriate dataset is very difficult and is among the most crucial contributions to the progress of the field. We would like to stress that the development of such a dataset remains a challenge for the machine learning community, and if the dataset is appropriately prepared, we would be more capable of distinguishing and making good models based on such a dataset.
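The contrast between the two evaluation schemes can be sketched in a few lines. The dataset used in this study has no publication dates, so the `year` field below is a hypothetical attribute that an appropriately prepared dataset would need to carry; the records are placeholders, not real measurements, and the function names are illustrative.

```python
import random


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle the records and hold out a fixed fraction as test data."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, cutoff_year):
    """Train on records up to and including `cutoff_year`; test on later ones."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test = [r for r in records if r["year"] > cutoff_year]
    return train, test


# Placeholder records with a hypothetical `year` field.
records = [{"formula": f"compound_{i}", "year": year}
           for i, year in enumerate((2001, 2005, 2010, 2015, 2020))]

rand_train, rand_test = random_split(records)           # 80/20, any years mixed
time_train, time_test = temporal_split(records, 2010)   # past vs. future
```

Under the random split, materials reported in the same period (and often closely related in composition) can land on both sides of the split, whereas the temporal split mirrors the situation in which a trained model is used to search for materials that have not yet been reported.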

THE CODE AND THE DATA
The data used for the band gap existence binary classification, the data used for the band gap regression, the code for the periodic convolution layer, and the code for the neural network will be found in the link, since a link cannot be used in a paper in this journal. Readers can use the code and the data provided that they cite this paper. Furthermore, the code that transforms chemical formulas into the reading periodic