Artificial intelligence for the understanding of electrolyte chemistry and electrode interface in lithium battery

Recognizing the critical role of electrolyte chemistry and electrode interfaces in the performance and safety of lithium batteries, along with the urgent need for more sophisticated methods of analysis, this comprehensive review underscores the promise of machine learning (ML) models in this research field. It explores the application of these innovative methods to studying battery interfaces, particularly focusing on lithium metal anodes. Amid the limitations of traditional experimental techniques, the review supports a hybrid approach that couples experimental and simulation methods, enabling granular insights into the formation process and characteristics of battery interfaces at the molecular level and harnessing AI to extract patterns from voluminous data sets. It showcases the utility of such techniques in electrolyte design and battery life prediction and introduces a novel perspective on battery interface mechanisms. The review concludes by asserting the potential of artificial intelligence (AI) or ML models as invaluable tools in the future of battery research and highlights the importance of fostering confidence in these technologies within the scientific community.


INTRODUCTION
The 2019 Nobel Prize in Chemistry was awarded to professors John B Goodenough, M Stanley Whittingham, and Akira Yoshino for their groundbreaking contributions to the invention of lithium batteries. They have created a rechargeable world: laptops, smartphones, electric cars, etc., which rely on lithium-ion batteries (LIBs), have truly changed human lives [1][2][3][4]. However, this does not mark the end of research: the battery life of smartphones is still criticized, and the range of electric vehicles still struggles to surpass that of conventional cars [5][6][7]. Traditional LIBs employ graphite as the anode material, so their energy densities are generally below 300 W h kg−1. Owing to the higher theoretical specific capacity (~3860 mA h g−1) and lower reduction potential (−3.04 V vs. SHE) of Li metal, lithium metal batteries (LMBs) can deliver higher energy densities (>500 W h kg−1), making them promising candidates for high-energy battery systems [8][9][10]. However, lithium metal anodes suffer from poor cycling stability and dendrite growth, which impede their commercialization [11][12][13][14].
Artificial intelligence (AI) or machine learning (ML) models offer new opportunities for the understanding of complex electrolyte chemistry and electrode interphase formation in lithium batteries [37,38]. Every year, nearly 10,000 research papers related to lithium batteries are produced, involving electrode design, new electrolyte development, electrolyte solvation structure, electrode interfacial characterization, etc. These studies provide a rich database for AI research in the field of batteries [39,40]. In addition, theoretical calculations and simulations of the electrode/electrolyte interface are increasingly mature, providing a solid foundation for AI research [41]. AI is expected to be widely used in the development and design of new electrolytes [42,43], the establishment of key descriptors for electrolyte performance [44,45], the understanding of electrode interface reactions [46], the simulation of interface structure and composition [47][48][49], and battery life prediction [50,51]. However, few reviews have focused on these aspects, especially electrolyte chemistry and the understanding of electrode interfaces. Most reviews of AI applications in the battery field focus mainly on electrode materials science [38][52][53][54][55][56][57][58], which is not the aim of our review.
To this end, we provide a comprehensive overview of the application of AI and ML techniques to the understanding of electrolyte chemistry and electrode interfaces in lithium batteries, particularly on lithium metal anodes. As shown in Figure 1, we highlight how these methods can be used to gain deeper insights into electrolyte screening and rational design, the formation processes of electrode interfaces, and Li dendrite growth simulations, as well as their potential applications in battery degradation and lifetime prediction, which in turn guide electrolyte design. Our aim in this review is to provide readers with a comprehensive understanding of the application of AI in the field of battery physics and chemistry, thereby instilling greater confidence in related technologies.
ML is a branch of AI that aims to automatically learn rules from data through computer programs and to use these learned rules to make predictions [52]. In recent years, with the development of materials genome engineering and the emphasis on data, data-driven methods represented by ML have become increasingly important in scientific research. ML algorithms require material structure or characterization data as input, called descriptors. In addition to supervised and unsupervised learning, ML includes various other types such as ensemble learning, reinforcement learning, and semi-supervised learning. In this work, we mainly introduce supervised and unsupervised learning methods. For alternative ML techniques, readers are referred to the paper by Liu et al. [52], which offers a comprehensive examination of fundamental principles and implementation strategies in ML, specifically addressing their relevance to rechargeable battery materials science.
Supervised learning is currently the most widely used ML method in materials research. For supervised learning, the data set needs to include descriptors and labels. In the field of batteries, labels can be physical and chemical properties or classification marks of battery materials. The training process of supervised ML establishes a mapping between the input (descriptors) and the output (label), f_ML: R^D → L, where D is the dimension of the descriptors, also called the number of features. The established mapping is the rule learned by the computer program, and the computer makes predictions based on this mapping. Supervised learning can be divided into classification models and regression models according to the prediction task, and the indicators for evaluating the predictive ability of the model vary accordingly. The indicators for evaluating the prediction accuracy of a classification model include the following.
(1) Accuracy: the number of correctly classified samples divided by the total number of samples. However, in the case of unbalanced categories, the accuracy cannot fully reflect the performance of the classifier.
(2) Precision rate: the number of samples correctly predicted as positive divided by the total number of samples predicted as positive. It focuses on how many of the samples that the classifier predicts as positive are truly positive.
(3) Recall: the number of samples correctly predicted as positive divided by the total number of truly positive samples. It focuses on how many of the truly positive samples the classifier identifies.
(5) AUC-ROC: the area under the receiver operating characteristic (ROC) curve, which measures the classifier's ability to sort positive and negative samples, that is, the ability to distinguish between positive and negative samples.
where n is the number of samples, y_i is the label value of the ith sample, ŷ_i is the model prediction value of the ith sample, and ȳ is the average value of the labels of all samples. When selecting evaluation indicators, it is necessary to choose appropriate indicators according to the characteristics of specific tasks and data.
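The classification indicators above, and the regression quantities n, y_i, ŷ_i and ȳ, can be turned into a short calculation. The sketch below is illustrative only: the regression indicators it computes (mean absolute error, root-mean-square error and the coefficient of determination R²) are an assumption about which indicators are intended, and the function names and toy labels are hypothetical.

```python
import math


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2); assumed standard definitions."""
    n = len(y_true)
    y_mean = sum(y_true) / n  # average label value
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Toy, hypothetical labels and predictions.
print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))
print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]))
```

In the toy classification case the accuracy (0.75) is higher than the precision and recall (both 2/3) because negative samples dominate, which illustrates why accuracy alone can be misleading for unbalanced categories.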
Commonly used supervised learning algorithms include linear models [59][60][61], tree models [62,63], support vector machines [64], and neural network models [65]. In early research, linear regression (LR), the classic linear model, was most commonly used because of its computational simplicity and strong physical interpretability [66]. LR seeks the mapping by solving a set of multiple linear equations. For a given descriptor R and label L, the predicted label value of the linear model is L_predicted = R^T W, where W is the coefficient vector of the multiple linear equations. The LR model is obtained by adjusting W to minimize the error between L_predicted and the actual L. A data set suitable for LR requires a clear linear relationship between features and labels, and the linear correlation among the features themselves needs to be low. However, features in real data sets are usually correlated to some extent. To improve the performance of linear models on real data sets, regularization techniques are often introduced, among which the least absolute shrinkage and selection operator (LASSO) [60] uses the sum of the absolute values of the coefficients in W for regularization.
Similarly, ridge regression [61] uses the sum of the squares of the coefficients in W for regularization. In recent years, the sure independence screening and sparsifying operator (SISSO) [67] model has received extensive attention. It is essentially a linear model, but what sets SISSO apart is that it combines feature engineering with a linear model, so it can also deal with non-linear data sets. The feature engineering process of SISSO performs mathematical operations on the original features, such as addition, subtraction, multiplication, division, taking the log, power, square root, etc. The new features created by these linear and nonlinear operations markedly improve the predictive ability of the linear model.
Natl Sci Open, 2024, Vol.3, 20230039
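The different effect of the two penalties can be seen in a minimal sketch. It fits W by coordinate descent on a toy data set; the objective scaling (squared error divided by 2n), the value of the regularization strength `lam` and the data are assumptions made for illustration, not settings from the cited works.

```python
def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def coordinate_descent(columns, y, lam, penalty, sweeps=100):
    """Fit a linear model without intercept (features assumed centered).

    penalty="lasso": minimize (1/2n)*sum(residual^2) + lam * sum(|w_j|)
    penalty="ridge": minimize (1/2n)*sum(residual^2) + (lam/2) * sum(w_j^2)
    """
    n = len(y)
    w = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Residual with feature j left out of the prediction.
            partial = [
                y[i] - sum(w[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            z = sum(v * v for v in col) / n
            if penalty == "lasso":
                w[j] = soft_threshold(rho, lam) / z
            else:
                w[j] = rho / (z + lam)
    return w


# Toy centered features: x1 drives the label, x2 contributes only weakly.
x1 = [-1.5, -0.5, 0.5, 1.5]
x2 = [1.0, -1.0, -1.0, 1.0]
y = [2.0 * a + 0.1 * b for a, b in zip(x1, x2)]

print("LASSO:", coordinate_descent([x1, x2], y, lam=0.2, penalty="lasso"))
print("ridge:", coordinate_descent([x1, x2], y, lam=0.2, penalty="ridge"))
```

With these toy numbers the absolute-value penalty sets the coefficient of the weak feature exactly to zero, whereas the squared penalty only shrinks it, which is why LASSO is also used for feature selection.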
Tree models are suitable for both classification tasks and regression tasks. There are many types of tree models, such as random forest (RF) [63], gradient boosting decision tree (GBDT) [68], eXtreme Gradient Boosting (XGBoost) [69], etc., and they are all based on decision trees (DT) [62]. The core idea of a decision tree is to make decisions step by step based on information entropy. If the decision branches of each step are drawn, a tree-shaped diagram is formed, called a decision tree. When used for classification tasks, a DT starts from the root node, tests a particular attribute in the descriptor, and assigns instances to its child nodes according to the test result. Each child node corresponds to a value of the attribute, so the instance is tested and allocated recursively until it is finally classified into the class corresponding to a leaf node. When used for regression tasks, the decision tree becomes a regression tree, and its internal structure becomes a binary tree. The regression tree divides the feature space into several units, each of which has a specific output value; the division proceeds through successive "yes"/"no" judgments and recurs until a stopping condition is met, thus completing the regression.
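How a single decision step can be chosen from information entropy is sketched below. The descriptor values, class names and helper names are hypothetical; the split criterion shown (information gain on a threshold test) is one common choice and is given here as an assumption rather than as the method of any cited work.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; keep the highest gain."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(midpoints, key=lambda t: information_gain(values, labels, t))


# Hypothetical one-dimensional descriptor and class labels.
descriptor = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
classes = ["class_1", "class_1", "class_1", "class_2", "class_2", "class_2"]

t = best_threshold(descriptor, classes)
print(t, information_gain(descriptor, classes, t))
```

A full tree repeats this selection recursively on each child node until the leaves are pure or a stopping condition is reached.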
Support vector machines (SVM) are another classical ML model that has received much attention [38,64]. The basic idea is to separate data samples by finding the maximum-margin hyperplane, that is, the separating hyperplane whose distance to the nearest data points is the largest. SVM can be used for both classification and regression problems. The variant used for data classification is called support vector classification (SVC), and the variant used for regression is called support vector regression (SVR). The essential difference between SVC and SVR is that SVC maximizes the distance from the nearest sample points to the hyperplane, while SVR minimizes the distance from the farthest sample points to the hyperplane.
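The margin idea can be illustrated without training a model. The sketch below compares two candidate separating hyperplanes by their margin and evaluates an epsilon-insensitive loss, the loss commonly associated with SVR; the points, the hyperplanes and the value of `epsilon` are hypothetical, and no optimization is performed.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points, labels):
    """Smallest label-signed distance; negative if a point is misclassified."""
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tube of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical two-class data, labels are -1 or +1.
points = [(1.0, 1.0), (2.0, 2.0), (4.0, 4.0), (5.0, 5.0)]
labels = [-1, -1, 1, 1]

# Both hyperplanes separate the classes, but with different margins.
print("x + y = 6:", margin((1.0, 1.0), -6.0, points, labels))
print("x + y = 5:", margin((1.0, 1.0), -5.0, points, labels))
print("loss:", epsilon_insensitive_loss(3.0, 3.5, epsilon=0.1))
```

An SVC would prefer the first hyperplane, since its nearest points on either side are farther away; in SVR, only predictions that fall outside the epsilon tube contribute to the loss.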
The neural network model [65] is a computational model that simulates the human nervous system and consists of a large number of neurons (nodes). Each neuron accepts input signals from other neurons and generates an output signal. The neural network model usually includes an input layer, several hidden layers, and an output layer. The input layer receives the input signal, namely the descriptors of the data set; the hidden layers process and transform the input signal; and the output layer outputs the predicted results. Training a neural network means learning a suitable set of network parameters from the training data by adjusting the weights and biases between neurons, so as to realize the classification or regression prediction of new data.
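The adjustment of weights and biases during training can be sketched with the smallest possible case. The example below is a deliberate simplification: it trains a single sigmoid neuron (no hidden layer) by gradient descent on a toy logical-AND data set, and the learning rate, epoch count and function names are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Output of one neuron: weighted sum of inputs plus bias, then sigmoid."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_neuron(samples, labels, lr=0.5, epochs=5000):
    """Full-batch gradient descent on the cross-entropy loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict(w, b, x) - y  # derivative of the loss w.r.t. the pre-activation
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / n
            grad_b += err / n
        # Move weights and bias against the gradient.
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy data: the label is 1 only when both inputs are 1.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [0, 0, 0, 1]

w, b = train_neuron(samples, labels)
print([round(predict(w, b, x)) for x in samples])
```

A multilayer network applies the same update to every layer, with the gradients propagated backwards from the output layer through the hidden layers.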
Unlike supervised learning, unsupervised learning does not require labels during training. Unsupervised learning can efficiently handle large volumes of data samples and help discover patterns in the data, which may provide chemical insights. Typical unsupervised learning tasks include grouping and clustering, data visualization, dimensionality reduction, and feature extraction. In materials informatics, unsupervised learning is widely used to visualize materials in a latent space to explore potential relationships among different groups of materials. According to their function, the main algorithms of unsupervised learning can be divided into: (1) Clustering [70]: the data in the data set are divided into several groups, and the data in each group have similar characteristics. Commonly used clustering algorithms include K-means, hierarchical clustering, etc. (2) Dimensionality reduction: a technique used to decrease the number of dimensions in high-dimensional data while retaining essential features. Approaches to dimensionality reduction are categorized as linear and non-linear methods. Linear dimensionality reduction algorithms map the high-dimensional data onto a lower-dimensional space through a linear transformation, which preserves the linear structure of the original data. Common linear dimensionality reduction algorithms include principal component analysis (PCA) [71], linear discriminant analysis (LDA) [72], etc.
Nonlinear dimensionality reduction techniques transform high-dimensional data into low-dimensional spaces through nonlinear mappings. These algorithms effectively address intricate nonlinear structures present in the original dataset. Common nonlinear dimensionality reduction algorithms include kernel principal component analysis (KPCA) [73], isometric mapping (Isomap) [74], locally linear embedding (LLE) [75], and t-distributed stochastic neighbor embedding (t-SNE) [76].
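Of the unsupervised algorithms listed above, K-means is the simplest to write out. The following sketch groups unlabeled two-dimensional points around two centroids; the points and the initial centroids are hypothetical, and passing the initial centroids explicitly (rather than choosing them at random) is a simplification made to keep the example deterministic.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return assignments, centroids


# Hypothetical, unlabeled two-dimensional descriptors.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
groups, centers = kmeans(points, centroids=[points[0], points[3]])
print(groups)
print(centers)
```

No labels are used at any point: the grouping emerges only from the distances between samples.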

AI APPLICATION TO ELECTROLYTE DESIGN
The electrolyte serves as the lifeblood of lithium metal batteries, not only facilitating the conduction of lithium ions but also undergoing decomposition at the negative/positive electrode interface to generate a solid electrolyte interphase (SEI) with varying components and structures that ultimately impact the voltage range and cycling stability of batteries [77]. However, the correlation between electrolyte formulation and battery performance remains unresolved, posing a challenge to developing more efficient electrolytes. Traditional electrolyte development is typically a trial-and-error process that involves preparing electrolytes, conducting numerous electrochemical tests on batteries to characterize their performance, repeating these steps to optimize the formula, and ultimately obtaining an optimal electrolyte formula [78][79][80]. However, such experiments are costly and inefficient, and they make it difficult to control the electrolyte components precisely. By statistically correlating vast amounts of data (salts, solvents, additives and their formulas), AI or ML can guide the prediction of new solvent molecules, new additives, new formulas and new solvation structures (Figure 2A), and eventually the optimal electrolyte design. In addition, AI or ML can also enable the identification of statistical descriptors that capture the overall nature of electrolytes (Figure 2B). This approach allows a model to be established that relates an electrolyte formula to its properties and correlates these properties with battery performance.
Professors Yi Cui and Stacey F. Bent from Stanford University in the United States [43] gathered a comprehensive dataset covering a vast design space to develop supervised ML models that can accurately predict and optimize the Coulombic efficiencies (CEs) of various electrolytes for lithium metal anodes (Figure 3A). Owing to the intricate nature of electrolyte effects on CE and the high cost of simulations, data-driven approaches are preferred in such research. However, CE values are mainly distributed between 85% and 99.9%, a range too narrow for the data to be well dispersed. It is therefore necessary to transform this parameter into a logarithmic form, LCE = −log10(1 − CE), called logarithmic CE (LCE). By analyzing the correlation between various parameters of the training electrolyte components and LCE, four crucial factors were identified: anion carbon ratio (aC), inorganic/organic ratio (InOr), fluorine/oxygen ratio (FO), and solvent oxygen (sO). Among them, sO is the most significant feature, and reducing its value may be an effective strategy to enhance CE (Figure 3B). Based on this finding, several solvents were screened, and the sO values of methyl butyl ether (MBE), methyl t-butyl ether (MTBE), and dibutyl ether (DBE), at 0.056, 0.056, and 0.037 respectively, were found to be significantly lower than that of dimethoxyethane (DME). Four electrolytes combining these solvents with non-solvating diluents were predicted to exhibit high performance. The electrolytes using MTBE and DBE mixed with toluene are particularly noteworthy, not only because of their high CEs, up to 99.64% and 99.70%, but also because they have the potential for large-scale production. In general, electrolytes exhibiting excellent CE have weaker solvation effects, more anion-derived SEI, and larger Li deposits, which aligns with previous research.
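The effect of the logarithmic transformation is easy to verify numerically. The sketch below implements LCE = −log10(1 − CE) and its inverse; the function names are hypothetical, and CE is assumed to be given as a fraction rather than a percentage.

```python
import math


def log_coulombic_efficiency(ce):
    """LCE = -log10(1 - CE), with CE as a fraction in [0, 1)."""
    if not 0.0 <= ce < 1.0:
        raise ValueError("CE must be a fraction in [0, 1)")
    return -math.log10(1.0 - ce)


def coulombic_efficiency(lce):
    """Inverse transformation: CE = 1 - 10**(-LCE)."""
    return 1.0 - 10.0 ** (-lce)


# CE values spanning the range quoted in the text, plus the two reported maxima.
for ce in (0.85, 0.99, 0.9964, 0.9970, 0.999):
    print(f"CE = {ce:.4f}  ->  LCE = {log_coulombic_efficiency(ce):.3f}")
```

Each additional "nine" in CE adds one unit of LCE, so values that are crowded together just below 100% are spread over a range that a regression model can resolve more easily.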
The development of critical electrolyte parameters is paramount. Many parameters have already been developed, such as dielectric constant [81] and electrostatic potential [82]. Ko et al. [83] used experimental characterization techniques and ML methods to investigate the impact of changes in Li electrode potential on CEs across 74 common electrolytes. The authors introduced ferrocene into the electrolytes so that the Li electrode potential (E_Li) in these 74 electrolytes could be measured against the Fc/Fc+ redox couple (Figure 3C). Among them, the LiFSI/dimethoxymethane (DMM) electrolyte has the highest E_Li and enables more stable cycling performance over 400 cycles with an average CE of 99.1%. More importantly, the authors used partial least squares (PLS) regression to relate E_Li to various parameters of the electrolyte, including radial distribution function (RDF), component composition, density, dipole moment, and highest occupied molecular orbital (HOMO)/lowest unoccupied molecular orbital (LUMO), etc. Figure 3D presents the normalized coefficients of the prediction function in descending order, indicating a strong correlation between E_Li and the coordination environment around Li, particularly with FSI− coordination. Raman spectroscopy was employed to investigate the Li-FSI− coordination state in various electrolytes. The results revealed a significant increase in E_Li with increasing ion pairing between Li+ and FSI− relative to solvent coordination, consistent with the prediction outcomes based on ML.
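Ranking descriptors by normalized regression coefficients, as described above, can be sketched with a one-component PLS model. This is an illustrative simplification under stated assumptions: only the first PLS component is extracted, the descriptor names and all numerical values are hypothetical toy data, and nothing here reproduces the model or data of Ko et al.

```python
import math


def standardize(column):
    """Center a column and scale it to unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]


def pls1_first_component(features, target):
    """Coefficients on standardized features from one PLS1 component.

    features: dict mapping a descriptor name to its list of values.
    """
    names = list(features)
    cols = [standardize(features[name]) for name in names]
    n = len(target)
    y_mean = sum(target) / n
    yc = [v - y_mean for v in target]
    # Weight vector: covariance of each feature with the target, normalized.
    w = [sum(x * y for x, y in zip(col, yc)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores t = X w, then regress the target on the scores.
    t = [sum(w[j] * cols[j][i] for j in range(len(cols))) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return {name: wj * q for name, wj in zip(names, w)}


# Hypothetical toy descriptors and target values.
toy_features = {
    "descriptor_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "descriptor_b": [2.0, 1.0, 2.0, 1.0, 2.0],
}
toy_target = [1.1, 2.0, 2.9, 4.2, 5.0]

coefficients = pls1_first_component(toy_features, toy_target)
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)
print(ranked, coefficients)
```

Because the features are standardized before fitting, the magnitudes of the coefficients can be compared directly, which is what allows them to be listed in descending order of importance.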
The electrostatic potential, a crucial characteristic of the electrolyte solvent, has been demonstrated to greatly influence the electrochemical performance of lithium batteries. Wu et al. [82] used density functional theory (DFT) calculations to distinguish antisolvents, weakly solvating solvents, and strongly solvating solvents. They achieved this by examining the plot of the lowest negative electrostatic potential (ESP_min) against the highest positive electrostatic potential (ESP_max). The electrostatic potential thus shows promise as an effective descriptor for predicting the molecular structure of advanced electrolyte solvents. Wang and Cheng [84] used an ML force field to accelerate ab initio molecular dynamics and obtained more accurate electrolyte solvation structures. Similarly, Wang et al. [85] developed a solution-property ML model to investigate mass transport and the polarization effect of carbonate electrolytes.
Mainstream electrolyte screening and design are still based on trial-and-error experiments and the researchers' chemical/electrochemical intuition. AI offers new opportunities to accelerate this process. However, current research on AI for electrolyte screening and design is still based on limited databases. A likely reason is that experimental standards have not yet been unified in the literature. Therefore, establishing standard experimental procedures and data analysis is crucial for applying AI to electrolyte development. In addition, key descriptors for the electrochemical performance of electrolytes are still lacking, so understanding the reaction mechanisms and interfacial modulation of electrolytes is crucial, which will be discussed in the next section.

AI APPLICATION TO INTERFACE FORMATION & CHARACTERIZATION
This section reviews the applications of AI or ML to the interfacial chemistry and reaction kinetics of electrolytes on electrode surfaces and the simulations of interphasial formation and characterizations. The multi-component compositions (anions, cations, solvents, and additives) and heterogeneity (solid-liquid-gas three-phase, crystal, and amorphous) at the electrode interface aggravate the complexity of the interfacial reactions, resulting in an insufficient understanding of the interface. AI offers new opportunities for understanding interfacial reactions and interphasial regulatory mechanisms at the molecular level.

Understandings and simulations of interfacial reactions
The interphasial formation is initiated by the interfacial reactions between the electrolyte and electrode. It is related to the oxidation or reduction stability of the components and the solvation structure of the electrolyte. Through DFT calculations, the HOMO and LUMO energy levels of each component and of different solvation aggregates in the electrolyte can be obtained. The LUMO energy levels determine the reductive activities of electrolyte components on the anodes, while the HOMO energy levels reflect their oxidation stabilities on the cathodes. The solvation structure of the electrolyte can be simulated and optimized by molecular dynamics (MD) simulations.
The electrode interfacial reactions and the SEI/CEI formation can be simulated by quantum mechanics-based MD (QM-MD), also known as ab initio MD (AIMD). QM-MD uses the projector augmented wave (PAW) method with the generalized gradient approximation (GGA) of the Perdew-Burke-Ernzerhof (PBE) functional. For example, our group has used QM-MD to simulate the reductive decompositions of FSI− and NO3− anions on Li metal in the LiFSI-LiNO3-DOL electrolyte [35]. As shown in Figure 4A, at 3.16 ps of simulation, the S-F bond of the FSI− anion broke, generating one F− anion and one FSO2NSO2• radical. At 6.96 ps, the decomposition of the NO3− anion was initiated, and one NO2− anion plus one Li2O was formed. Thereafter, the N=O bonds broke rapidly to finally form Li2O and Li3N precipitates on the Li metal interface. However, AIMD simulations are limited in time scale (~10 ps) because of the vast amount of computation. Accordingly, a hybrid scheme has been proposed that couples AIMD and reactive molecular dynamics (RMD) for accelerated simulation of interfacial reaction and evolution, denoted as hybrid ab initio and reactive force field (HAIR) MD simulations. HAIR simulations involve accurate local electrochemical reactions simulated by AIMD and longer-time chemical reactions with rapid mass transfer in RMD. This alleviates the burden of accurately fitting the ReaxFF parameters to describe every local bond-breaking barrier because the AIMD handles these steps. Therefore, HAIR simulations retain an accuracy comparable to that of DFT-based MD simulations while significantly improving computational efficiency, allowing the simulation time to reach the nanosecond level [86]. Recently, we simulated the interphasial reactions and the SEI formation processes at the Li metal interface in LiFSI-DME electrolytes with different concentrations by HAIR-MD simulations [34]. As shown in Figure 4B, in the low-concentration electrolyte (LiFSI-10DME), the FSI− anion decomposes incompletely even after a prolonged simulation time of 2.8 ns. More importantly, the decomposition of DME can be observed at 115.6 ps. One DME molecule formed one ethylene molecule and two lithium methoxide molecules under the action of lithium ions. By contrast, no DME decomposition can be observed in the high-concentration electrolytes (LiFSI-1.4DME and LiFSI-0.7DME) within the simulated time scale. Simultaneously, the complete decomposition of FSI− anions and the generation of Li2S appeared.
In addition, through HAIR-MD simulations, the relative content of each product in SEI can be counted, which is crucial for understanding the structure and composition of SEI.
Studying the reaction processes at the electrode interface by various computational and simulation methods is essential. Wang et al. [87] used AIMD simulations with free energy calculations to compute the redox abilities of different concentrations of lithium bis(trifluoromethanesulfonyl)imide (LiTFSI) in propylene carbonate (PC). First, they used AIMD to simulate the solvation structures of the electrolytes and calculated the electronic structure of each electrolyte component from the projected electronic density of states (PDOS). Then they calculated the redox potentials of different electrolytes through AIMD with the FEP-TI method [88]. The stability of the interphasial LiF component was further evaluated from the dissolution free energies of LiF in different electrolytes using the FEP-TI method. Finally, they analyzed the local solvation environments of TFSI− anions through unsupervised ML and found that the TFSI− anion cluster network could favor LiF precipitation in the SEI.
There are thousands of reaction paths for electrolytes at the electrode interface. For example, at the negative electrode interface, even for the commonly used ethylene carbonate (EC) solvent with the participation of impurity water, 570 species and nearly 9 million 5-bond-change reactions can be generated (Figure 5A). AI or ML provides an opportunity to discover and enumerate all possible reaction paths and to screen the optimal one. Xie et al. [89] used a data-driven automated methodology based on a database of DFT-calculated molecular fragments and recombinants to study the possible reaction pathways for forming lithium ethylene mono-carbonate (LEMC) from EC. They screened the most probable pathways from 769 reaction pathways by calculating the Gibbs free energy (Figure 5A and B). Their encouraging work on the data-driven methodology of reaction network-predicted pathways has provided new insights into understanding interface formation mechanisms in electrochemical systems.
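The idea of enumerating pathways in a reaction network and screening them by free energy can be sketched on a toy graph. Everything in this example is hypothetical: the species names, the free-energy changes (nominally in eV) and, in particular, the ranking criterion, which here is simply the most uphill single step along a path and is an assumption rather than the procedure of Xie et al.

```python
def enumerate_paths(network, start, goal, visited=None):
    """Depth-first enumeration of all simple paths from start to goal."""
    path = (visited or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for nxt in network.get(start, {}):
        if nxt not in path:  # do not revisit a species
            paths.extend(enumerate_paths(network, nxt, goal, path))
    return paths


def most_uphill_step(network, path):
    """Largest free-energy change of any single step along the path."""
    return max(network[a][b] for a, b in zip(path, path[1:]))


# Hypothetical network: network[reactant][product] = free-energy change (eV).
toy_network = {
    "R": {"I1": 0.3, "I2": -0.2},
    "I1": {"P": -1.0},
    "I2": {"I3": 0.1, "P": 0.8},
    "I3": {"P": -0.9},
}

all_paths = enumerate_paths(toy_network, "R", "P")
best = min(all_paths, key=lambda p: most_uphill_step(toy_network, p))
for p in all_paths:
    print(" -> ".join(p), most_uphill_step(toy_network, p))
print("preferred:", " -> ".join(best))
```

Exhaustive enumeration is only feasible for small networks; for networks with millions of reactions, the number of paths grows too quickly, which is where automated, data-driven screening becomes necessary.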
However, these theoretical simulations mainly focus on the interfacial reactions of electrolytes and the composition of SEI products, whereas the SEI structure and evolution processes, which are more important, remain unclear. Spotte-Smith et al. [90] also used data-driven chemical reaction networks with stochastic simulations based on quantum chemical calculations to study the SEI formation on the anode. They claimed to have simulated the products and structures of SEI at a mechanistic level for the first time. They first analyzed over 80 million reactions by computational reaction networks to screen key SEI products and gas components. Then they performed kinetic Monte Carlo (kMC) simulations with different tunneling barriers (Figure 5C) to simulate the response of each SEI product to the applied electrode potential for different SEI thicknesses (Figure 5D). Finally, the origin of the SEI multilayer structure was deduced from the variation of each product with the SEI thickness.
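The stochastic stepping at the heart of a kMC simulation can be sketched with a rejection-free (Gillespie-type) loop. The example is a generic illustration under stated assumptions: the species A, B and C, the first-order rate constants and the seed are hypothetical, and the tunneling barriers and potential dependence used by Spotte-Smith et al. are not modeled.

```python
import math
import random


def kinetic_monte_carlo(counts, reactions, t_end, seed=0):
    """Simulate first-order reactions (reactant, product, rate constant).

    Returns a list of (time, counts) snapshots, one per executed event.
    """
    rng = random.Random(seed)
    counts = dict(counts)
    t = 0.0
    trajectory = [(t, dict(counts))]
    while True:
        # Propensity of each reaction = rate constant * number of reactants.
        propensities = [rate * counts[reactant] for reactant, _, rate in reactions]
        total = sum(propensities)
        if total == 0.0:
            break  # nothing left to react
        # Waiting time to the next event is exponentially distributed.
        dt = -math.log(1.0 - rng.random()) / total
        if t + dt > t_end:
            break
        t += dt
        # Pick one reaction with probability proportional to its propensity.
        r = rng.random() * total
        cumulative = 0.0
        for (reactant, product, _), a in zip(reactions, propensities):
            cumulative += a
            if r < cumulative:
                counts[reactant] -= 1
                counts[product] += 1
                break
        trajectory.append((t, dict(counts)))
    return trajectory


# Hypothetical two-step decomposition A -> B -> C.
toy_reactions = [("A", "B", 1.0), ("B", "C", 0.5)]
history = kinetic_monte_carlo({"A": 50, "B": 0, "C": 0}, toy_reactions, t_end=5.0)
print(len(history) - 1, "events; final state:", history[-1][1])
```

In an SEI model, the rate constants would additionally depend on the applied potential and on the distance that electrons must tunnel through the growing layer, which is how thickness-dependent product distributions arise.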

Interfacial characterizations
Understanding the structures and chemical compositions of SEI/CEI should be based on advanced interface characterization techniques. A variety of characterization techniques have been applied to the study of SEI/CEI, including Raman or tip-enhanced Raman spectroscopy (TERS) [91], transmission electron microscopy (TEM) [92], X-ray photoelectron spectroscopy (XPS) [93], scanning electrochemical microscopy (SECM) [94], synchrotron X-ray diffraction (XRD) and pair distribution function (PDF) [95], secondary ion mass spectrometry (SIMS) [96], atomic force microscopy (AFM), reflection interference microscopy (RIM) [97], nuclear magnetic resonance spectroscopy (NMR) [98], and electrochemical impedance spectroscopy (EIS) [99], etc. AI offers new opportunities to analyze and predict the results of these interfacial characterization techniques. However, there are still few studies on the combination of AI and these interfacial characterization techniques, and they mainly focus on XPS and EIS.
As an interface-sensitive characterization technique, XPS has been widely used in the chemical composition analysis of SEI/CEI. However, because of the complex analysis processes and ex-situ test environment, XPS cannot be effectively correlated with the interfacial electrochemical reactions. By combining HAIR simulations and ML models, Sun et al. [47] developed an AI-driven ab initio framework to predict the XPS results of the SEI formed in a localized high-concentration electrolyte (Figure 6A). The XPS results predicted by ML are consistent with the experimental results. The AI-driven prediction provides a new way to reduce the calculation costs in XPS simulations.
Another important application of AI in interfacial characterizations is EIS-based prediction. As a widely used electrochemical characterization in batteries, EIS has a rich database for ML. Zhang et al. [100] collected over 20,000 EIS spectra of Li-ion batteries as an ML dataset for battery degradation prediction, and they achieved an accurate prediction of remaining life. Similarly, Xiong et al. [101] also developed two battery degradation datasets, including 4700 EIS spectra for ML training, finally achieving semi-supervised prediction of battery capacity degradation. In addition to using AI with the existing EIS data to predict battery degradation, AI can also be introduced to realize battery EIS prediction. For instance, Guo et al. [102] used ML to predict the EIS spectra of Li-ion batteries from partial charge voltage curves (Figure 6B). Temiz et al. [44] used a novel co-modeling (Figure 6C) ML framework based on minimal experimental seed data to predict EIS precisely.
AI can be helpful for the understanding of interface formation mechanisms and interfacial characterizations. However, the interfacial complexity leads to a dramatic increase in the cost of computational simulations, severely limiting the temporal and spatial scales of simulations. Therefore, more advanced AI-based interfacial simulation methods need to be developed to study battery interface mechanisms. In addition, more relevant interface characterization technologies need to be combined with AI, such as cryogenic scanning TEM (cryo-TEM), interfacial spectral characterization, NMR, etc.

AI APPLICATION FOR LITHIUM DENDRITE GROWTH AND INHIBITION
The uncontrollable growth of lithium dendrites is an inherent problem in developing rechargeable lithium batteries. It leads to internal short circuits and severe safety hazards and has been considered one of the bottlenecks hindering the commercialization of lithium metal batteries. Lithium dendrite growth involves two processes: solvated Li+ in the electrolyte desolvates as it approaches the electrode and gains one electron to form a Li atom, and the Li atoms then aggregate to form Li clusters, namely nucleation and growth. In the past few years, advanced characterization methods such as magnetic resonance imaging (MRI), cryo-TEM, scanning electron microscopy (SEM), and TEM have been used to understand the growth process of lithium dendrites in situ, and theoretical methods based on DFT, MD, and Monte Carlo (MC) simulations have also been used to understand the growth of lithium dendrites at the micro-scale.

Lithium dendrite growth
Although numerous characterization and theoretical methods have been proposed to study lithium metal deposition and dendrite growth, further efforts are still needed to elucidate the underlying mechanisms and to address lithium dendrite suppression. Experimental characterization inevitably damages the interface structure, which makes it difficult to observe dendrite formation directly. Theoretical models, in contrast, are limited by computational cost and cannot yet simulate dendrite growth accurately and reliably. More importantly, the growth of lithium dendrites is affected by complex external conditions in which temperature, stress, and electric field are coupled. Accordingly, combining physical mechanisms with AI-driven models is gradually becoming a new paradigm for studying lithium dendrite growth. Lai et al. [103] used a deep neural network interface potential for Li-Cu systems (LiCu-NNIP) with quantum-mechanical accuracy to perform large-scale MD simulations, obtaining the dynamic behavior of Li deposition on Cu surfaces with different Miller indices and the arrangement features of the Li atoms (Figure 7A). Additionally, the accuracy of machine-learning or neural network potentials could be significantly improved by training on a combination of experimental and theoretical databases. Using an accurate neural network potential trained on DFT data, Zhang et al. [104] simulated the morphology evolution of lithium dendrites with a self-consistent continuum solvation model (Figure 7B) and found that reducing the surface or grain boundary energy can drive the morphology evolution, which helps to stabilize the lithium metal anode. They proposed a symmetry-function descriptor for training the neural network model. More recently, external pressure was considered in simulations with an ML potential, which showed that hole defects and Li dendrites gradually fuse and disappear as the external pressure increases (Figure 7C) [105].
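To make the symmetry-function descriptor mentioned above more concrete, the following is a minimal sketch of a radial symmetry function of the Behler-Parrinello type, which is a common choice for neural network potentials. It is not taken from Ref. [104]: the function names, the parameter values (eta, r_s, r_c), and the three-atom cluster are illustrative assumptions, and periodic boundary conditions are ignored.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def radial_symmetry_function(positions, i, eta, r_s, r_c):
    """Radial descriptor of atom i: a cutoff-weighted sum of Gaussians
    centered at r_s, evaluated at the distances to all other atoms."""
    total = 0.0
    for j, pos_j in enumerate(positions):
        if j == i:
            continue
        r_ij = math.dist(positions[i], pos_j)
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return total


# Hypothetical three-atom Li cluster (coordinates in angstrom, illustrative only).
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
g_original = radial_symmetry_function(atoms, 0, eta=0.5, r_s=3.0, r_c=6.0)

# Translating the whole cluster leaves the descriptor unchanged, because it
# depends only on interatomic distances.
shifted = [(x + 1.0, y - 2.0, z + 0.5) for x, y, z in atoms]
g_shifted = radial_symmetry_function(shifted, 0, eta=0.5, r_s=3.0, r_c=6.0)
```

Because the descriptor depends only on interatomic distances, it is invariant to translation and rotation of the structure, which is the property that makes such functions suitable inputs for a neural network potential. In practice a set of these functions with different eta and r_s values, together with angular terms, is used per atom.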

Lithium dendrite inhibition
Inhibiting dendrite formation is an effective means to improve the interface structure and performance of batteries. By making full use of existing small datasets and relevant databases, such as the CCSD database, PubChem, and the Materials Project, to explore materials or molecules with target properties (stress, ionic conductivity, and stability), new materials or molecules that can inhibit dendrite growth can be screened quickly and efficiently with generative models (SchNet [106], GraphINVENT [107,108], and Uni-Mol [109]) built on these datasets. Ahmad et al. [110] carried out high-throughput screening of over 12,000 inorganic solid electrolytes from a private dataset, based on their stability in contact with the lithium metal anode (Figure 7D). They trained a graph neural network (GNN) model on descriptors and target properties, selecting purely structural descriptors as features to save cost. The target properties were the shear and bulk moduli, and the screening yielded candidate solid electrolytes with suppressed dendrite growth and high ionic conductivity. They also trained gradient boosting regressor and kernel ridge regression models for the elastic constants, which were found to increase with increasing mass density and ratio of Li and sublattice bond ionicity and to decrease with increasing volume per atom and sublattice electronegativity, after adding important physical descriptors such as electric field, current, structure, density, and formula. Ren et al. [111] carried out symbolic regression with SISSO on phase-field simulations of lithium dendrites and found that electrolytes with high elastic modulus and initial yield strength can inhibit lithium dendrite growth effectively. These simulated results implied that a higher elastic modulus (E e) and initial yield strength (σ y0 e) of the electrolyte lead to inhibition of lithium dendrite growth.
Although the important role of AI in understanding the lithium dendrite growth mechanism and dendrite inhibition has been verified, current AI simulations are often too idealized and model-based. To simulate lithium metal growth and inhibition strategies more realistically, effective modeling that combines multiple parameters is required, e.g., pressure, temperature, crystal plane orientation, ion transport, polarization, electric field, current density, substrate, electrolyte properties, and, most importantly, interphasial features.

AI APPLICATION FOR BATTERY DEGRADATION AND LIFETIME PREDICTION
Because LIBs have become an important energy storage system for electric and hybrid vehicles, the safety and the degree of performance degradation of power batteries are considered key indicators for evaluating vehicle performance. Accordingly, accurate state of health (SOH) estimation and remaining useful life (RUL) prediction are critical to prolong the service life of the battery and ensure the safe and reliable operation of the system; existing approaches include electrochemical models or models that simulate cell behavior with optimization algorithms and observers. However, identifying fundamental cause-effect relations for the loss of performance and predicting the battery lifetime remain challenging, because the battery capacity degrades in a stochastic manner given the complex internal electrochemical reactions of the battery and the external operational conditions. Owing to the massive amount of data and factors involved, AI methods based on "big data" analysis and related statistical/computational tools have brought data-driven approaches into use for battery degradation and lifetime prediction (Figure 8A) [112].

Aging mechanism analysis
A fundamental understanding of the aging mechanism, which includes a combination of loss of active materials (LAM) in the cathode and loss of Li inventory (LLI) through Li plating, or an increase in cell internal resistance, is the primary prerequisite for successful health estimation and prediction [112]. Mapping battery aging is complicated because multiple degradation patterns are often coupled. Recent trends in diagnostics and prognostics have been heavily influenced by ML [100,113]. However, current ML models are based on charge and discharge curves, which require long-term cycle data in the early stage. This suggests that using data from early life to predict the later behavior of batteries remains a significant challenge [100,114]. EIS is a non-invasive, information-rich, and real-time measurement that has so far been underused in battery diagnosis. Zhang et al. [100] showed that a battery forecasting system combining EIS with Gaussian process ML can predict degradation accurately and automatically determine which spectral features are predictive. Jiang et al. [115] further found that constant-frequency impedance characteristics can be used to accurately predict the SOH of the battery, with an average maximum absolute error of only 2.2%. However, a significantly larger training set is required for this method to cover the different eventualities. Furthermore, Li et al. [116] proposed a hybrid model that combines an impedance-based model with an open-circuit voltage (OCV) reconstruction model, taking both degradation recognition accuracy and computational cost into account, and presented a degradation diagnosis framework for LIBs that integrates field data, impedance-based modeling, and AI (Figure 8B). The mean absolute percentage errors for the health state estimates of battery capacity decay and power decay are less than 0.5% and 1.5%, respectively. To further reduce simulation costs, Thelen et al. [113] proposed a lightweight physics-informed ML method to estimate a battery's capacity online and diagnose its major degradation patterns using only limited early-life experimental degradation data. In addition, the estimation performance of supervised ML algorithms decreases dramatically when fewer measured-capacity samples are available. In this case, the semi-supervised approach proposed by Xiong et al. [101] reduces the root mean square (RMS) error by up to 50.66% by effectively filtering the data. To predict the aging mechanism under complex models, as shown in Figure 8C, Liu et al. [117] proposed an interpretable hybrid ML framework that enabled the discovery of a previously unknown performance indicator, the ratio of electrolyte amount to high-voltage-region capacity at the first discharge, to untangle the intractable degradation chemistries of Li-S batteries with a test mean absolute error of 8.9%.
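The studies above report accuracy with several different error metrics (maximum absolute error, mean absolute percentage error, and RMS error). As a reminder of how these quantities differ, the sketch below computes the standard textbook definitions on a small set of hypothetical capacity values; the numbers are invented for illustration, and the cited works may define or average their metrics differently.

```python
import math


def mean_absolute_percentage_error(y_true, y_pred):
    """Average of |error| / |true value|, expressed in percent."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared error (same unit as the data)."""
    squares = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squares) / len(squares))


def max_absolute_error(y_true, y_pred):
    """Largest single deviation between prediction and measurement."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical measured and predicted capacities in Ah (illustrative only).
measured = [2.00, 1.90, 1.80, 1.60]
predicted = [2.00, 1.88, 1.83, 1.60]

mape = mean_absolute_percentage_error(measured, predicted)
rmse = root_mean_square_error(measured, predicted)
worst = max_absolute_error(measured, predicted)
```

The percentage error is scale-free and therefore convenient for comparing cells of different capacity, whereas the RMS and maximum errors carry the unit of the predicted quantity and are more sensitive to individual large deviations.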

Battery lifetime prediction
Traditional battery prediction methods rely on modeling microscopic aging mechanisms (such as SEI film growth, lithium plating, and active mass loss) [50,100,112,113]. However, characterizing and simulating each aging mechanism is not scalable, because it may take years to reach thousands of cycles in traditional battery testing [51]. To overcome this challenge, recent reports have focused on data-driven approaches that run real-time, lossless measurements on batteries and directly correlate those measurements with battery health using statistical ML, showing an excellent ability to estimate the state of charge (SOC) and SOH in battery systems. Purely data-driven battery life prediction is more convenient and time-saving than prediction that combines experiments with data-driven models, and it has better accuracy.
To establish a reliable ML model with rapid and accurate predictions, relevant descriptors that affect battery performance need to be designed on the premise of fully understanding the degradation mechanism and process. Early cycle life prediction may lead to significant prediction errors, as Harris et al. [118] found only a weak correlation between capacity values at cycle 80 and cycle 500. Severson et al. [114] therefore developed data-driven models that predict the cycle life of lithium iron phosphate (LFP)/graphite cells, without knowledge of degradation mechanisms, from a dataset generated under 72 conditions with cycle lives ranging from 150 to 2300. The best models achieve a 9.1% test error for quantitatively predicting cycle life using the first 100 cycles. Based on the database generated by Severson et al. [114], Dong et al. [119] examined the chaos sparrow search optimization algorithm, random forest, XGBoost, light gradient boosting machine (LightGBM), categorical features gradient boosting (CatBoost), neural networks, etc., and discussed feature importance evaluation and the hyperparameter search process. They found that CatBoost has the highest prediction accuracy, with 88.44% of the predictions having a relative error of less than 10%. Moreover, Severson et al. [114] used a linear model with 9 battery descriptors, and Gong et al. [120] used a fusion feature selection method with 20 features related to the capacity degradation to predict LIB cells after 100 charge/discharge cycles, allowing fast training and predictions at an affordable computational cost.
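The idea of predicting cycle life from early-cycle descriptors with a linear model, and of scoring the result by the share of predictions within 10% relative error, can be sketched as follows. This is a plain one-feature least-squares fit on invented numbers, not the model or the data of Refs. [114] or [119]; the descriptor (capacity fade over the first 100 cycles) and all values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fraction_within(y_true, y_pred, tolerance=0.10):
    """Share of predictions whose relative error is at most `tolerance`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(p - t) / t <= tolerance)
    return hits / len(y_true)


# Hypothetical training cells: early capacity fade (%) and observed cycle life.
train_fade = [0.2, 0.4, 0.6, 0.8, 1.0]
train_life = [2000.0, 1650.0, 1200.0, 850.0, 400.0]
slope, intercept = fit_line(train_fade, train_life)

# Hypothetical held-out cells used only for evaluation.
test_fade = [0.3, 0.9]
test_life = [1800.0, 500.0]
test_pred = [slope * x + intercept for x in test_fade]
accuracy = fraction_within(test_life, test_pred)
```

Evaluating on held-out cells, as done here, matters because a model scored only on its training cells will look more accurate than it is. With real data, several descriptors and some form of regularization would normally be used instead of a single-feature fit.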
Concerning the prediction of the RUL of LIBs under external environmental conditions, Mansouri et al. [121] found that simple variation of voltage inputs with a random forest approach leads to a 3.3% prediction error in the RUL. The voltage, capacity, cycle number, and temperature were used as input vectors to estimate the SOH and RUL of LIBs with errors of <6.4%. These results imply that environmental and load conditions strongly influence the SOH and RUL. Jin et al. [122] defined an innovative criterion to evaluate the accuracy and computational cost of RUL forecasting methods. SVM, Gaussian process regression (GPR), and extreme learning machine (ELM) are more suitable for small sample sizes, because they have simple structures and require little computation. For a complex nonlinear system such as a lithium-ion battery, deep neural network (DNN), recurrent neural network (RNN), and long short-term memory (LSTM) models all perform well. Among them, the uncertainty management ability of RNN is relatively poor, and the training process of LSTM is long and complex, requiring expensive equipment to accelerate the training. In conclusion, DNN has strong autonomous learning and generalization abilities, making it more suitable for RUL prediction (Figure 8D) [122].
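For readers less familiar with the two target quantities, the sketch below shows one simple way SOH and RUL can be computed from a capacity record. It assumes that SOH is the ratio of current to nominal capacity, that end of life is reached at 80% of nominal capacity (a common convention, not a value taken from this review), and that capacity fades linearly, which is a strong simplification. All names and numbers are hypothetical.

```python
import math


def state_of_health(capacity, nominal_capacity):
    """SOH as the ratio of the present capacity to the nominal capacity."""
    return capacity / nominal_capacity


def remaining_useful_life(capacities, nominal_capacity, eol_fraction=0.8):
    """Cycles left until the end-of-life threshold, assuming the average
    fade per cycle observed so far continues unchanged (linear fade)."""
    threshold = eol_fraction * nominal_capacity
    if capacities[-1] <= threshold:
        return 0.0
    fade_per_cycle = (capacities[0] - capacities[-1]) / (len(capacities) - 1)
    if fade_per_cycle <= 0:
        return math.inf  # no measurable fade yet, so no finite estimate
    return (capacities[-1] - threshold) / fade_per_cycle


# Hypothetical cell: 2.0 Ah nominal, losing 1 mAh per cycle over 100 cycles.
history = [2.0 - 0.001 * k for k in range(101)]
soh = state_of_health(history[-1], 2.0)
rul = remaining_useful_life(history, 2.0)
```

Real capacity fade is usually nonlinear and depends on temperature and load, which is why the data-driven models discussed above take voltage, capacity, cycle number, and temperature as inputs rather than relying on a fixed extrapolation rule.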
AI has been applied to the analysis of degradation mechanisms and the prediction of battery lifetime for lithium batteries, using initial electrochemical analysis and a limited amount of data. However, the reliability of lifetime prediction remains doubtful and challenging because of the lack of databases of battery end-of-life cycling data. Moreover, it is difficult to guarantee the consistency and reliability of the experimental results in such databases, which is not conducive to accurate life prediction.

CONCLUSION
In this comprehensive review, we delved into the application of ML models in battery interface research, laying the foundation for a deeper understanding of battery interfaces and electrolyte designs, and predicting battery lifespans. The value of AI methods and models in battery research became evident, underscoring their potential to sift through vast amounts of data, identify correlations, and guide future experiments. The application of AI to electrolyte design heralds a new era of targeted, data-driven optimization, and the insights gained from AI application in interface formation and characterization have opened up new avenues for mitigating lithium dendrite growth and enhancing battery safety. We illuminated how AI methods have been employed to study lithium dendrite growth and its inhibition, providing novel perspectives on a complex problem, and potentially laying the groundwork for safer and more efficient battery designs. Furthermore, the use of AI for battery degradation and lifetime prediction not only improves our understanding of the aging mechanisms of batteries but also propels forward the future of predictive maintenance. In conclusion, AI and ML techniques have emerged as powerful tools in the field of battery research. They not only deepen our understanding of battery interfaces and electrolyte designs but also serve as predictive tools for battery life. Moving forward, we anticipate a proliferation of AI applications in this field, encouraging the development of more efficient, safer, and longer-lasting batteries. Further exploration of these techniques and their potential applications will undoubtedly continue to drive advancements in battery

Figure 1 Overview of the AI applications to lithium battery chemistry covered in this review, including electrolyte design, electrode interfacial simulations, Li dendrite growth, and battery lifetime prediction.

Figure 2 AI-driven electrolyte design and electrolyte performance descriptor screening. (A) AI applied to predict new solvent molecules, new additives, new formulas and new solvation structures. (B) AI applied to screen key descriptors that determine electrolyte performance.

Figure 3 Data-driven analysis of key descriptors in electrolytes for Li metal anodes. (A, B) Extraction of different features and their correlations with logarithmic Coulombic efficiencies [43]. (C) Schematic of Li electrode potential measurements by the ferrocene reference. (D) Correlations of 17 different descriptors with Li electrode potential [83]. (A, B) Adapted with permission from Ref. [43], Copyright©2023, PNAS. (C, D) Adapted with permission from Ref. [83], Copyright©2022, Springer Nature.

Figure 5 AI-driven prediction of interfacial reactions and SEI formation. (A) The reaction networks of Li+, EC, H2O, and e−. (B) Feasible reaction paths to form LEMC and the corresponding reaction free energies [89]. (C) Schematic diagram of the input parameters on the simulated negative electrode in the kMC model. (D) Average fractions of SEI and gas products after the kMC simulations and corresponding schematic of the SEI structure [90]. (B) Adapted with permission from Ref. [89], Copyright©2022, American Chemical Society. (D) Adapted with permission from Ref. [90], Copyright©2022, American Chemical Society.

Figure 7 Applications of AI-driven methods to lithium dendrite growth and inhibition. (A) The dynamic behavior of Li deposition on Cu surfaces with different Miller indices using large-scale MD simulations [103]. (B) Morphology evolution of lithium dendrites with a self-consistent continuum solvation model using a neural network potential [104]. (C) The dynamic behavior of Li deposition on Cu surfaces under external pressure using an ML potential [105]. (D) High-throughput screening of over 12,000 inorganic solid electrolytes based on stability to obtain candidate solid electrolytes with suppressed dendrite growth and high ionic conductivity [110]. (A) Adapted with permission from Ref. [103], Copyright©2022, John Wiley and Sons. (B) Adapted with permission from Ref. [104], Copyright©2022, John Wiley and Sons. (C) Adapted with permission from Ref. [105], Copyright©2023, Elsevier. (D) Adapted with permission from Ref. [110], Copyright©2022, American Chemical Society.