Introduction

Materials science, like any other field in science and technology, rests on four paradigms: empirical, theoretical, computational, and data-driven science.[1,2,3,4] Over the last couple of decades, the increasing availability of advanced computational resources and the generation of big data[5,6,7,8,9] using the first three paradigms have shifted our approach from traditional methods to data-driven methods for data analysis (Fig. 1). Traditional methods involve designing empirical formulations and computational methods guided by chemical intuition and performing trial-and-error hands-on experimentation and/or simulations. However, with a near-infinite space of possible candidate materials, discovering new materials with desirable properties and performance using traditional methods becomes extremely costly and time-consuming. Hence, data-driven methods have become popular for screening purposes, as they can significantly reduce the cost and development time compared to hands-on experiments and simulations. Data-driven methods use artificial intelligence (AI) techniques that have been developed and refined across many fields of research for a wide range of applications.[10,11,12,13,14,15,16] These AI techniques have also been used to solve various tasks in materials science, which can be broadly categorized into forward modeling for property prediction and inverse modeling for process optimization and materials design, and have helped materials scientists better understand the underlying correlations and advance the frontier of knowledge.[17,18,19,20,21,22,23,24,25,26,27,28,29]

Some of the learning methodologies used to perform forward and inverse modeling include reinforcement learning, active learning, generative modeling, genetic algorithms, scientific machine learning, and transfer learning. Reinforcement learning involves decision-making tasks where an agent is trained, by trial and error, to make optimal decisions within an environment so as to maximize its reward.[30,31] Active learning aims to efficiently label or acquire new data by iteratively selecting the most informative samples from an unlabeled dataset.[32,33] Generative modeling is used to train models that learn the underlying relations and patterns within the input dataset and use that information to generate new samples with characteristics similar to those of the input data.[34,35] A genetic algorithm is an optimization technique designed to search for near-optimal solutions to complex problems with a large solution space or a nonlinear objective function.[36,37] Scientific machine learning focuses on creating models that incorporate constraints based on scientific knowledge and physical principles during training.[38,39] Transfer learning is a technique that leverages pre-learned knowledge from one task or domain to improve performance on a different task or domain.[40,41,42] All these methods are gaining increasing interest and applicability in materials science to accelerate materials discovery, property prediction, and optimization, leading to the development of new materials with tailored properties and enhanced performance.
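To make one of these methodologies concrete, the sketch below illustrates a pool-based active-learning loop with uncertainty sampling, where the spread of per-tree predictions from a random forest stands in for model uncertainty. The descriptor dimensionality, the candidate pool, and the `oracle` function are synthetic placeholders; in a real materials workflow the oracle would be a simulation or an experiment.

```python
# Minimal active-learning sketch (pool-based, uncertainty sampling).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_pool = rng.random((500, 10))          # unlabeled candidate descriptors (placeholder)

def oracle(X):
    """Stand-in for a costly label source (e.g., a DFT calculation or an experiment)."""
    return X.sum(axis=1) + 0.1 * rng.standard_normal(len(X))

labeled_idx = list(rng.choice(len(X_pool), size=20, replace=False))
y_labeled = oracle(X_pool[labeled_idx])

for _ in range(5):                      # five acquisition rounds
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_pool[labeled_idx], y_labeled)

    # Estimate uncertainty as the spread of per-tree predictions.
    tree_preds = np.stack([t.predict(X_pool) for t in model.estimators_])
    uncertainty = tree_preds.std(axis=0)
    uncertainty[labeled_idx] = -np.inf  # never re-query already labeled points

    query = int(np.argmax(uncertainty)) # most informative candidate
    labeled_idx.append(query)
    y_labeled = np.append(y_labeled, oracle(X_pool[[query]]))
```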

Figure 1

The four paradigms of science and their application toward predictive modeling.

In this brief Review, we provide a high-level overview of the evolution of AI in contemporary materials science for the task of materials property prediction in forward modeling. The three stages of evolution discussed in this work include ‘Traditional Machine Learning,’ ‘Conventional Deep Learning,’ and ‘Graph Neural Networks.’ Each stage of evolution is accompanied by an outline of some of the commonly used methodologies and/or network architectures and general applications. We conclude the work by providing some possible future ideas for further development of artificial intelligence in materials science to facilitate the discovery, design, and deployment workflow.

Traditional machine learning

The advent of AI in materials science was accompanied by the application of traditional machine learning (ML), which fundamentally consists of algorithms that learn from structured data and build a (usually somewhat easily interpretable) model to make predictions. Traditional ML algorithms have been widely used for classification, regression, and clustering tasks in materials science.[43,44,45,46,47] To construct an effective and efficient ML model, one has to choose the algorithm used for model training and perform feature engineering to develop a suitable representation for the input data. Some of the traditional machine learning algorithms commonly used for predictive analysis for both classification and regression tasks are shown in Table I.

Table I Traditional machine learning algorithms for predictive analysis.

Applications

Most works involving traditional ML in materials science put emphasis on the input representation obtained via feature engineering of the unstructured data based on domain knowledge.[43,47,53,54,55,56,57,58,59,60,61] The general workflow for the data-driven approach that incorporates traditional machine learning for training predictive models in materials science is shown in Fig. 2.

Figure 2

The general workflow for the data-driven approach that incorporates traditional machine learning for training predictive models in materials science.

The workflow comprises the following steps: (1) Obtain raw input files from the first three paradigms of materials science, i.e., empirical science, theoretical science, and computational science; (2) Generate ML-friendly features with a featurizer that uses domain knowledge to obtain attributes representing a sufficiently diverse range of physical/chemical properties for a given composition and/or structure; (3) Feed the ML-friendly features into the traditional ML technique of the user's choice, depending on the input representation and application, in order to maximize the model performance; and (4) Use the trained model in further analysis involving materials property prediction for materials discovery, design, and deployment.

A few examples of solving materials science problems with traditional ML techniques trained on materials data are as follows. Work in[59] used thousands of descriptors obtained via domain knowledge-based feature engineering, containing combinations of elemental properties such as the atomic number and ionization potential, to analyze the tendency of materials to form different crystal structures. Meredig et al.[43] used the fraction of each element present and various intuitive factors, such as the maximum difference in electronegativity, as the materials representation to perform predictive modeling of the formation energy of ternary compounds. Work in[58] performed ML-based predictive modeling of different materials properties using a generalized set of composition-based attributes consisting of stoichiometric attributes (e.g., the number of elements present in the compound and several \(L^p\) norms of the element fractions), elemental property statistics (e.g., mean, mean absolute deviation, range, minimum, maximum, and mode of different elemental properties), electronic structure attributes (e.g., average fraction of electrons from the s, p, d, and f valence shells among all present elements), and ionic compound attributes (e.g., whether it is possible to form an ionic compound based on[62]) as the materials representation. Faber et al.[60] took 3938 entries from the Materials Project and used a Coulomb matrix (CM)-based representation to train an ML model that achieved a low prediction error in cross-validation. Work in[47] used a representation based on four different kinds of structural descriptors to create a model for cohesive energy from 18,903 entries consisting of compounds based on a select set of structures and elements. Schütt et al.[61] predicted the density of states at the Fermi level using an ML model with a representation based on the partial radial distribution function and demonstrated that the model can predict the property value for crystal structures outside of the original training set. Work in[63] used 126 features derived from the Voronoi tessellation of the local environment of each atom in a crystal structure (e.g., effective coordination number, structural heterogeneity attributes, chemical ordering attributes, maximum packing efficiency, and local environment attributes) to perform predictive analysis. Such feature-engineered descriptor sets tend to be effective only for a specific materials property, which limits their generalizability. Moreover, traditional ML algorithms generally scale poorly as the number of data points grows.
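As a minimal illustration of steps (2) and (3), the sketch below hand-rolls a few composition-based attributes in the spirit of the stoichiometric and elemental-property statistics described above and feeds them to a random forest from scikit-learn. The elemental property table, compositions, and target values are made-up placeholders, not data from the cited works.

```python
# Hand-rolled composition featurization + traditional ML model (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical lookup of elemental properties (electronegativity, atomic radius in Angstrom).
ELEMENT_PROPS = {
    "Fe": [1.83, 1.26], "O": [3.44, 0.66], "Ti": [1.54, 1.47], "Ba": [0.89, 2.22],
}

def featurize(composition):
    """composition: dict of element -> atomic fraction, e.g. {"Fe": 0.4, "O": 0.6}."""
    fracs = np.array(list(composition.values()))
    props = np.array([ELEMENT_PROPS[el] for el in composition])   # (n_elements, n_props)
    mean = (fracs[:, None] * props).sum(axis=0)                   # fraction-weighted mean
    spread = props.max(axis=0) - props.min(axis=0)                # range of each property
    # Stoichiometric attributes (element count, L2 norm of fractions) + property statistics.
    return np.concatenate([[len(composition), np.linalg.norm(fracs, 2)], mean, spread])

# Toy training set: two compositions with made-up formation energies (eV/atom).
X = np.array([featurize({"Fe": 0.4, "O": 0.6}),
              featurize({"Ba": 0.2, "Ti": 0.2, "O": 0.6})])
y = np.array([-2.5, -3.4])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict([featurize({"Fe": 0.5, "O": 0.5})]))
```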

Conventional deep learning

Using unstructured data as model input for traditional ML algorithms is challenging as the user has to first perform manual or domain knowledge-based feature engineering and then select a desirable algorithm to train the ML model. This makes the whole workflow costly, time-consuming, and difficult to scale with the ever-increasing data. In such scenarios, deep learning (DL) algorithms—which are ML algorithms based on deep neural networks—have emerged as a powerful tool for performing predictive analysis. Given a large materials dataset available for training the model, DL techniques can automatically and efficiently extract features from those unstructured data and build accurate models for different materials properties, often surpassing traditional ML techniques. There are different types of deep neural networks that can be used for training the model depending on the input representation of the unstructured data, some of which are shown in Table II.

Table II Well-known conventional deep learning algorithms for predictive analysis.

Applications

Deep learning offers an alternative route for accelerating the production of predictive models because it can excel on raw inputs, thereby reducing the need to design physically relevant features via manual or domain knowledge-based feature engineering. Several works have used deep neural networks to perform forward modeling for the task of materials property prediction.[67,68,69,70,71,72,73,74,75,76] The general workflow for the data-driven approach that incorporates conventional deep learning for training predictive models in materials science is shown in Fig. 3.

Figure 3

The general workflow for the data-driven approach that incorporates conventional deep learning for training predictive models in materials science.

The workflow comprises the following steps: (1) Obtain raw input files from the first three paradigms of materials science, i.e., empirical science, theoretical science, and computational science; (2) Obtain DL-friendly features directly from the raw input files, without going through a featurizer that uses domain knowledge to obtain composition- and/or structure-based attributes incorporating a sufficiently diverse range of physical/chemical properties; (3) Feed the DL-friendly features into the conventional DL technique of the user's choice, depending on the input representation and application, in order to maximize the model performance as given in Table II; and (4) Use the trained model in further analysis involving materials property prediction for materials discovery, design, and deployment. It is advisable to apply conventional DL methods when dealing with a large dataset, as they have consistently been shown to outperform traditional ML techniques in such scenarios.

A few examples of solving materials science problems with conventional DL techniques trained on materials data are as follows. The Harvard Clean Energy Project by Pyzer-Knapp et al.[77] used a three-layer network to predict the power conversion efficiency of organic photovoltaic materials. Montavon et al.[68] predicted multiple electronic ground-state and excited-state properties using a four-layer network trained on a database of around 7000 organic compounds. Zhou et al.[67] used high-dimensional vectors learned with Atom2Vec, along with a fully connected network with a single hidden layer, to predict formation enthalpy. CheMixNet[70] and Smiles2Vec[78] applied deep learning methods to learn molecular properties from the molecular structures of organic materials. ElemNet[69] used a 17-layer architecture to learn formation enthalpy from elemental fractions but showed performance degradation beyond that depth (Fig. 3). Work in[71] used a combination of principal component analysis and convolutional neural networks to predict the stress–strain behavior of binary composites. Zheng et al.[79] used multi-channel inputs for deep convolutional neural networks to improve prediction accuracy compared to single-channel inputs. Nazarova et al.[80] used recurrent neural networks along with a series of optimization strategies to achieve high learning speeds and sufficient accuracy for the task of polymer property prediction. Yang et al.[72] used convolutional recurrent neural networks to learn and predict several microstructure evolution phenomena of different complexities. IRNet[26,73] introduced deeper neural network architectures in materials science, building 17-, 24-, and 48-layer architectures with residual connections to learn different materials properties from the composition and structure information of a crystal without degrading performance. Branched Residual Network (BRNet) and Branched Network (BNet)[74] introduced branching in the network architecture to predict materials properties from composition-based attributes with improved performance under parametric constraints (Fig. 3).
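As a minimal illustration of steps (2) and (3), the sketch below defines a small fully connected PyTorch network that maps an elemental-fraction vector directly to a property value, loosely in the spirit of ElemNet[69] but far shallower; the layer sizes, the 86-element vocabulary, and the training data are illustrative placeholders.

```python
# Minimal fully connected network on elemental-fraction inputs (illustrative sketch).
import torch
import torch.nn as nn

class ElementFractionNet(nn.Module):
    def __init__(self, n_elements=86):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_elements, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, x):
        # x: (batch, n_elements) vector of elemental fractions summing to 1 per row.
        return self.net(x).squeeze(-1)

model = ElementFractionNet()
x = torch.rand(32, 86)   # batch of 32 compositions (random placeholder fractions)
y = torch.rand(32)       # placeholder formation enthalpies
loss = nn.functional.mse_loss(model(x), y)
loss.backward()          # a standard optimizer step would follow in a real training loop
```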

Graph neural networks

Conventional DL is used to perform predictive analysis when the input representation is based on Euclidean data with a fixed form (such as images, text, and numerical tables). These datasets also tend to rest on the fundamental assumption that the instances are independent of one another. However, applying conventional DL algorithms becomes challenging when presented with more complex data represented as graphs (non-Euclidean), which have no fixed form and contain intricate interactions between the instances inside the graph. Graph neural networks (GNNs) are a category of deep learning algorithms used to handle and perform inference on such graph-structured data. GNNs work by iteratively updating the representation of each node in the graph based on the representations of its neighbors. This allows GNNs to learn about the local and global structure of the graph, which can be used to perform predictive analysis for various applications. Some common examples of GNNs, categorized by how they learn node representations, are shown in Table III.
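The sketch below shows one such node-update step in plain PyTorch: every node gathers messages from its neighbors along the edges and then updates its own representation. The specific message and update functions (a linear layer and a GRU cell) are illustrative choices; the GNN variants in Table III differ mainly in how they define these two operations.

```python
# One message-passing update over a small graph (illustrative sketch).
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.message = nn.Linear(2 * dim, dim)  # builds a message from a (sender, receiver) pair
        self.update = nn.GRUCell(dim, dim)      # updates the node state with the aggregated message

    def forward(self, node_feats, edge_index):
        src, dst = edge_index                   # edge_index: (2, n_edges) tensor of node indices
        msgs = self.message(torch.cat([node_feats[src], node_feats[dst]], dim=-1))
        agg = torch.zeros_like(node_feats).index_add_(0, dst, msgs)  # sum messages per receiver
        return self.update(agg, node_feats)

# Toy graph: 4 nodes, 3 undirected bonds stored as directed edges in both directions.
feats = torch.rand(4, 16)
edges = torch.tensor([[0, 1, 1, 2, 2, 3],
                      [1, 0, 2, 1, 3, 2]])
layer = MessagePassingLayer(16)
print(layer(feats, edges).shape)  # torch.Size([4, 16])
```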

Table III Common examples of GNNs based on how they learn the representations of nodes in a graph.

Applications

As GNNs are able to capture the complex relationships between the nodes and edges in graphs, they have been used to learn the atomic interaction or the material embeddings from the crystal structure and composition.[61,86,87,88,89,90,91,92,93,94,95] The general workflow for the data-driven approach that incorporates graph neural networks for training predictive models in materials science is shown in Fig. 4.

Figure 4

The general workflow for the data-driven approach that incorporates graph neural networks for training predictive models in materials science.

The workflow comprises the following steps: (1) Obtain raw input files from the first three paradigms of materials science, i.e., empirical science, theoretical science, and computational science; (2) Obtain GNN-friendly features from the raw input files, usually represented in graph form with intricate interactions between the instances inside the graph; additionally, in some cases, atom-, bond-, or angle-based embeddings containing pre-defined knowledge are provided as further model inputs to aid the training process; (3) Feed the GNN-friendly features, with or without embeddings, into the graph neural network of the user's choice, depending on the input representation and application, in order to maximize the model performance as given in Table III; and (4) Use the trained model in further analysis involving materials property prediction for materials discovery, design, and deployment. In practice, a given compound is naturally represented as a graph with intricate interactions between the atoms. Moreover, a given compound can occur as different structure types or polymorphs with completely different materials property values, which traditional ML and conventional DL techniques find difficult to distinguish. Hence, graph neural networks tend to outperform the other techniques in such scenarios.

A few examples of solving materials science problems with graph neural networks trained on materials data are as follows. Crystal graph convolutional neural networks (CGCNN)[88] directly learn material properties via the connection of atoms in the crystal structure of crystalline materials, providing an interpretable representation; this was then improved in[89] by incorporating Voronoi-tessellated crystal structure information, explicit three-body correlations of neighboring constituent atoms, and an optimized chemical representation of interatomic bonds in the crystal graph. OGCNN[92] incorporates orbital–orbital interactions and topological characteristics into the CGCNN model to improve its performance. A-CGCNN[93] introduces an attention mechanism and node-feature normalization into the network architecture to improve the prediction accuracy of the CGCNN model. SchNet[61] incorporated continuous-filter convolutional layers to model quantum interactions in molecules for total energy and interatomic forces, which was then extended in[86], where the authors used an edge update network to allow neural message passing between atoms for better property prediction for molecules and materials. MatErials Graph Network (MEGNet)[87] was developed as a universal model for materials property prediction of different crystals and molecules, using temperature, pressure, and entropy as global state inputs. Goodall and Lee[90] developed an architecture called Representation Learning from Stoichiometry (Roost) that takes elemental fraction-based stoichiometric attributes as input features, along with embeddings obtained from the materials science literature using natural language processing (Matscholar embeddings), to learn appropriate materials descriptors from data.
The Roost architecture takes the Matscholar embedding and the elemental fraction of each element present in the compound and passes them through a series of message-passing layers stacked in parallel, weighted attention layers, and fully connected layers with residual connections before making a prediction (Fig. 4). Directional Message Passing Neural Network (DimeNet)[96] and DimeNet++[97] use directional information by transforming messages based on the angle between atoms, along with spherical Bessel functions and spherical harmonics, to achieve better performance than Gaussian radial basis representations, with the latter model being faster than the former. Geometric Message Passing Neural Network (GemNet)[98] was developed as a universal approximator for molecular predictions that is invariant to translation and equivariant to permutation and rotation, using directed edge embeddings and two-hop message passing in its architecture. Atomistic Line Graph Neural Network (ALIGNN)[91] combines different structure-based features, including atom, bond, and angle information of the materials, to obtain high-accuracy models for improved materials property prediction. The ALIGNN architecture consists of embedding layers for each input type, followed by ALIGNN layers (each containing two edge-gated graph convolution layers[99]) and GCN layers (each containing one edge-gated graph convolution layer), and finally an average pooling layer before making a prediction (Fig. 4). ALIGNN was subsequently extended as ALIGNN-d in[100], where dihedral angles were introduced along with other information as model input. DeeperGATGNN,[94] built on GATGNN,[95] combines residual connections and a global attention mechanism with differentiable group normalization to address the over-smoothing issue and improve the prediction accuracy of crystal properties when dealing with large datasets. Graphormer[101] uses a self-attention mechanism in the GNN to achieve significantly improved performance in the prediction of crystal and molecular properties in the OGB[102] and OC20[103] challenges. Crystal Edge Graph Attention Neural Network (CEGANN)[104] learns unique feature representations using a graph attention-based architecture and performs classification of materials across multiple scales and diverse classes.
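For context on step (2) of the workflow above, the sketch below converts a small set of atomic positions into the node/edge arrays that crystal GNNs such as CGCNN consume (nodes are atoms; edges connect atoms within a distance cutoff). Periodic images are ignored for brevity, and the atomic numbers, positions, and cutoff are illustrative; production codes handle periodicity and encode richer node and edge features.

```python
# Building a simple (non-periodic) crystal graph from atomic positions (illustrative sketch).
import numpy as np

def build_graph(atomic_numbers, positions, cutoff=4.0):
    """Nodes are atoms; an edge connects every pair of atoms closer than `cutoff` (Angstrom)."""
    positions = np.asarray(positions, dtype=float)
    n = len(atomic_numbers)
    senders, receivers, distances = [], [], []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = np.linalg.norm(positions[i] - positions[j])
            if d < cutoff:
                senders.append(i)
                receivers.append(j)
                distances.append(d)
    node_feats = np.array(atomic_numbers)        # would be embedded/one-hot encoded downstream
    edge_index = np.array([senders, receivers])  # shape (2, n_edges)
    edge_feats = np.array(distances)             # bond lengths, often expanded in a Gaussian basis
    return node_feats, edge_index, edge_feats

# Toy rock-salt-like fragment: alternating Na/Cl atoms on a square grid (positions in Angstrom).
nodes, edges, dists = build_graph([11, 17, 11, 17],
                                  [[0, 0, 0], [2.8, 0, 0], [0, 2.8, 0], [2.8, 2.8, 0]])
print(edges.shape, dists.round(2))
```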

Future directions

The widespread development of AI-based models in materials science, inspired by standard practices in the computer science community, has led to the use of advanced algorithms with tailor-made input representations. Given how quickly materials science is adopting state-of-the-art methodology from computer science, it is only a matter of time before researchers design a generalized workflow for extracting input representations that can be used to train the next generation of neural networks. Hence, in this section, we give a brief overview of a newer class of neural networks known as graph matching networks (GMNs),[105] which, if adapted for the materials science community, may help push model accuracy closer to chemical accuracy.

Graph matching networks

GMNs[105] are a class of neural networks used to perform supervised learning by computing the similarity between a pair of graphs given as input. As the name suggests, they are particularly well suited for tasks where the input data are structured in the form of graphs. Because GMNs take a pair of graphs as the model input and jointly compute a similarity score for the pair, they are potentially more powerful than embedding models, which independently map each graph to a vector. The network architecture comprises encoders (one for each graph) and a cross-graph attention mechanism. First, the encoders produce a vector representation for each node of each input graph. These vector representations are then passed to the cross-graph attention mechanism to compute a similarity score between each pair of nodes. Finally, the node-pair similarity scores are used to compute a final similarity score between the two graphs. Compared to embedding-based graph models, the matching model can adjust the vector representation of each graph on the basis of the other graph it is compared against: if the two graphs do not match, the model pushes their representations further apart than it would for a matching pair.
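The sketch below illustrates the cross-graph attention step in a heavily simplified form: each node of one graph attends to all nodes of the other, so the two graphs' representations are computed jointly rather than independently. The dot-product attention, the difference-style cross-graph message, and the pooled similarity score are illustrative simplifications of the formulation in[105], not a faithful reimplementation.

```python
# Simplified cross-graph attention between the node embeddings of two graphs (illustrative sketch).
import torch

def cross_graph_attention(h1, h2):
    """h1: (n1, d) node embeddings of graph 1; h2: (n2, d) node embeddings of graph 2."""
    scores = h1 @ h2.T                        # pairwise similarity between nodes of the two graphs
    attn_1to2 = torch.softmax(scores, dim=1)  # how much each node of graph 1 attends to graph 2
    attn_2to1 = torch.softmax(scores.T, dim=1)
    # Each node's cross-graph message is the gap between it and its soft match in the other graph.
    mu1 = h1 - attn_1to2 @ h2
    mu2 = h2 - attn_2to1 @ h1
    return mu1, mu2

h1, h2 = torch.rand(5, 32), torch.rand(7, 32)
mu1, mu2 = cross_graph_attention(h1, h2)
# Placeholder whole-graph similarity after mean pooling (a real GMN learns this scoring step).
graph_similarity = -torch.norm(h1.mean(0) - h2.mean(0))
print(mu1.shape, mu2.shape, graph_similarity.item())
```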

GMNs have been shown to be a powerful tool for both graph–graph classification and graph–graph regression tasks, as they are able to learn the complex relationships between the nodes and edges of graphs more effectively than traditional methods. They also generalize well to graphs not seen during training by learning robust vector representations of graphs. Although GMNs have been shown to outperform state-of-the-art models in several studies, they are computationally expensive to train, difficult to interpret, and highly sensitive to the choice of hyperparameters. Overall, GMNs are a new approach that has shown promise when utilizing graph-structured objects for graph–graph classification and regression[106,107,108,109] and are expected to improve the robustness and accuracy of predictive modeling in different scientific domains, including materials science.

Conclusion

AI has grown to become an important and flexible tool with various applications for materials discovery, design, and deployment. In this section, we discuss some other facets of AI in the context of materials informatics, which are important considering the growing interest, applicability, and impact of data-driven approaches in materials science.

Limitations and challenges

A wide variety of limitations and challenges still need to be addressed to leverage the full potential of AI-based models in the materials science community. Some of these limitations and challenges (Fig. 5) include reliability and quality assessment of datasets, uncertainty quantification of the deployed model, conversion of raw data into tailor-made input representations, explainability/interpretability of the trained model for prediction tasks, and the reproducibility, transferability, and usability of complex models. Moreover, for most datasets, it is only feasible to use traditional machine learning or conventional deep learning techniques due to the lack of tools and information to convert the raw data into a suitable input representation that can be fed into more advanced methods such as graph neural networks. Although there have been ongoing efforts, incorporating various techniques,[110,111,112,113] to address the challenges associated with the application of AI to materials discovery, design, characterization, and performance prediction, these areas of research are still in their nascent stage. Hence, more research on closing the gap between AI-based models and their application to materials science problems will help to better understand underlying correlations, create an easy-to-use pipeline for converting raw data into tailor-made input representations, potentially uncover physical laws and knowledge that are currently unknown, and bring model errors down toward chemical accuracy, eventually contributing to scientific understanding and progress with minimal human input.

Ethical considerations

Like any other scientific field, materials science must take various ethical considerations into account when incorporating AI into its research, development, and applications. These ethical considerations are essential to ensure responsible and sustainable progress and prevent unintended negative consequences. Some of the key ethical considerations in materials science include (1) potential impacts on jobs: recognizing the potential negative effects of new materials and technologies on the job market and developing proactive measures to mitigate negative impacts on workers, such as retraining programs, reskilling initiatives, and social safety nets; (2) recognizing AI-generated content: reducing the risk of spreading misinformation through AI-generated data (such as materials property and structure information obtained from generative modeling) via rigorous validation and testing using transparent and reproducible practices; and (3) bias in AI models: mitigating bias in AI models trained on unrepresentative or inherently biased data to ensure fair and unbiased predictions, by using diverse and inclusive datasets and employing bias detection and correction techniques to minimize potential biases. Addressing these considerations and establishing ethical guidelines and codes of conduct can help guide responsible research and innovation in materials science.

Collaborative efforts

Successful multidisciplinary collaborations have played a significant role in advancing AI in materials science. These collaborations bring together experts from various fields to tackle complex challenges, create innovative solutions, and open new possibilities. Some examples of how experts from different fields can work together to advance the field include: (1) Computer scientists provide expertise in algorithm development and optimization by developing AI models and algorithms tailored for materials data analysis and prediction; (2) Data scientists offer insights into handling large and complex materials datasets by ensuring data quality and accessibility for AI-driven analyses; (3) Computational materials scientists contribute their knowledge in developing efficient simulation methods by incorporating high-performance computing into the workflows; (4) Experimental materials scientists provide guidance on relevant material properties and structures and insights into material processing and performance evaluation; (5) Chemists contribute their expertise in chemical synthesis and offer domain-specific knowledge on material properties and molecular structures; and (6) Industry partners provide real-world testing and validation, ensuring the practical relevance of AI models and materials discoveries. Fostering an inclusive and collaborative research environment with experts from different disciplines can collectively advance the field, leading to transformative discoveries and developments in materials science.

Figure 5

Limitations and challenges of AI-based models in the materials science community.