Systematic analysis of the impact of slurry coating on manufacture of Li-ion battery electrodes via explainable machine learning

The manufacturing process strongly affects the electrochemical properties and performance of lithium-ion batteries. In particular, the flow of electrode slurry during the coating process is key to the final electrode properties and hence the characteristics of lithium-ion cells, however it is given little consideration. In this paper the effect of slurry structure is studied through the physical and rheological properties and their impact on the final electrode characteristics, for a graphite anode. As quantifying the impact of the large number of interconnected control variables on the electrode is a challenging task via traditional trial-and-error approaches, an explainable machine learning methodology as well as a systematic statistical analysis method is proposed for comprehensive assessments. The analysis is based upon an experimental dataset in lab-scale involving 9 main factors and 6 interest variables which cover practical range of variables through various combinations. While the predictability of response variables is evaluated via linear and nonlinear models, complementary techniques are utilised for variables importance, contribution, and first and second order effects to increase the model transparency. While coating gap is identified as the most influential factor for all considered responses, other subtle relationships are also extracted, highlighting that dimensionless numbers can serve as strong predictors for models. The impact of slurry viscosity and surface tension on electrode thickness, coat weight and porosity are also extracted, demonstrating their importance for electrode quality. These variables have been rarely considered in previous works, as the relationships are difficult to extract by trial and error due to interdependencies. Here we demonstrate how model-based analysis can overcome these difficulties and pave the way towards an optimised electrode manufacturing process of next generation Lithium-ion batteries.


Introduction
With increasing concerns around climate change, environment, and sustainability, the global electric vehicle stock is expected to expand from 11 million in 2020 to almost 145 million by 2030 [1]. Consequently, demands for high quality and high-performance energy storage systems to support electric mobility is expected to rise significantly. Rechargeable lithium-ion battery (LiB) cells have proven to be a powerful technology due to their considerable energy, power density and long cycle life [2]. According to the literature, the Li-ion battery market value is expected to increase from about $34.2 billion in 2020 to $87.5 billion in 2027 [3]. Advancement of technologies for electrode manufacturing has been among one of the main factors in the reduction of LiB costs by nearly 90% and 5 times increase in its volumetric energy since 2008. The drop in prices only between 2019 and 2020 was found to be about 13% [4]. LiB manufacturing contributes about 25% of the LiB cost [5], and considering the cell manufacturing process to support the electric vehicle market as an example, it has been confirmed that the electrode production is responsible for more than 50% of the costs associated with a conventional LiB [6]. Since electrodes have a large contribution to the final characteristics of cell, improvements in their production process are critical for further cost reduction and cell performance enhancement [7].
Battery electrode manufacturing is a complex process made up of multiple stages that affect the characteristics of the electrode and the final LiB cell performance [8]. The current industrial electrode manufacturing process is based on slurry casting. This includes stages: (1) selection of the material and the decision of the formulation, (2) mixing the material and the solvents to form a slurry, (3) coating the current collector foils (typically aluminium for cathode, and copper for anode) with slurry, (4) drying the electrodes to remove solvent, (5) calendering the electrodes to achieve a prescribed porosity for a desired energy density, (6) cutting the electrodes and finally (7) assembling the cells. The main stages of electrode and cell manufacturing are visualised in Fig. 1, for further details of LiB manufacturing stages see [3,9]. Each stage mentioned above includes various control variables and parameters that impact the properties of intermediate products and the characteristics of the final cells [7]. Monitoring of these properties and controlling these variables is critical to optimise the process for desired cell characteristics such as high energy density or low internal resistance. However, this optimisation is currently done by trial and error, which is costly in time and material. In order to reduce this cost, a predictive understanding of the impact of the manufacturing variables on the quality is required. This is made challenging as the variables are highly interconnected and so cannot be studied in isolation. For example, increasing the mixing speed during slurry preparation stage, has an impact on the shear degradation of the binder which later on increases the cracks on the electrode coating surface during the drying process [10].
Evaluating the effect of the electrode manufacturing control variables and parameters has been attempted in a number of research and studies during the last few years [11][12][13]. For example, the effect of calendering process conditions on the cathode characteristics is addressed in [14,15] and the slurry mixing procedure impact on the electrochemical characteristics of composite LiB electrodes is investigated in [16]. However, these have mostly utilised trial-and-error approaches, bringing wastage and only qualitative understanding. Drakopoulos et al. [17] also looked at graphite electrodes, and an artificial intelligence (AI) approach to optimisation. However, it was based on a limited dataset and a full design of experiments was not performed. To avoid trial and error methods, quantification of the impact of the electrode manufacturing variables on the output variables, and their prediction via machine learning (ML) and advanced data mining algorithms has been proposed in a number of studies recently [18][19][20]. Unlike conventional statistical methods or physics-based models, these approaches have the potential of dealing with large number of interconnected variables. Electrode drying, and electrolyte filling process variables are studied in [21,22], and the influence of calendering variables are discussed in [23,24]. Electrode coating variables including liquid-to solid ratio, active material and coating gap are addressed with respect to the electrode's structural [25][26][27] and electrochemical features [28,29]. A ML model is built in [30] to predict the cathode characteristics via viscosity and coating gap. In silico electrode mesostructures characteristics are fed into an ML model to predict the calendared electrodes properties in [31]. Predictability of the energy capacity of half-cells at various coating conditions is studies in [32], and followed by comprehensive analysis of relative importance of the structural features of cathode and anode in [33,34]. In the broader concept of material design for lithium-ion batteries, comprehensive reviews of the application of ML techniques are also provided by Liu et al. [35,36].
One area that has received limited attention is the impact of the flow in the coater on coating quality. This is a complex problem consisting of viscoelastic, viscocapillary and particle effects [10,7]. Studies have shown that these parameters are necessary to define a coating window, outside of which defects, such as air entrapment, occur when the Capillary number is too high, [37][38][39]. Considering the underlying physics, dimensionless numbers such as Capillary number can be used to characterise the flow and allow results to be transferred between systems [40,41]. While this defect free region can be easily observed to define coating window, it is less understood how both process variables and dimensionless numbers impact the coating quality, e.g., coat thickness, coat weight and porosity, within this window. This is vitally important as coating quality can significantly impact the electrode properties. On the other hand, most of the existing studies focus on correlation between the factors, or the predictability of the response variables, and still the influence of the control variables on the response is not quantified comprehensively. Generally, unlocking the relationship between the electrode characteristics and the manufacturing control variables is difficult [42] and therefore, a combination of advanced metrology techniques (e.g. slurry rheology, surface tension) [10,43] as well as data mining technologies is proposed here to eludicate the impact of slurry flow during the electrode manufacturing process.
The objectives of this paper are: (i) To introduce a clear methodology of systematic analysis and model-based representation of LiB electrode manufacturing process (using coating as a test system). (ii) To quantify the impact of key coating control variables on the LiB electrode properties and reveal the interconnection between the variables. (iii) To produce a framework that can be extended to other areas of the manufacturing process and contribute towards building a digital twin for LiB Manufacturing.
The study has three main contributions and novelties, (1) to utilise advanced metrology to facilitate access to the key control variables during electrode coating, (2) to introduce novel control factors such as viscocapillary effects into models and (3) to develop Explainable ML models (XML) to not only predict the response variables given the control variables (predictors), but also quantify the impact of those on responses via interpretability and explainability techniques. Within this context, XML algorithms [44,45] are among well proven AI methods for improving the transparency of machine learning models [46]. While ML models have a general opacity due to complex model structures [47], XML offers several metrics to evaluate the models and shed light on the "black-box" [46]. Recently XML models have been applied to manufacturing process optimisation problems and examples include semiconductors [48] and automated composite production [49]. To the best of the authors knowledge, XML algorithms have not been comprehensively applied to battery and electrode manufacturing optimisation concepts before, therefore, this study is particularly concentrated at it.
The structure of the paper is as follows. In Section 2, the methodology of the study is given. This section includes the details of the experiments, the list of the control and interest variables, and the fundamentals of the statistical analysis and machine learning techniques. In Section 3, the results are reported and discussed. This section includes linear regression, analysis of variance (ANOVA) and the ML predictions. XML-based analysis in the form of feature importance, contribution and impact are also given in this section. Section 4 is dedicated to summary, conclusions, and future works.

Methodology
A systematic set of experiments are conducted to represent a range of the slurry physical properties, and coating parameters. The variables of Table 1 are considered as control and response variables and deliberately defined in Section 2. The parameters are selected to represent the physical properties of the slurry and electrode coating and to capture the key physical interactions during coating process, i.e. viscous and capillary forces. The ranges for parameters are decided by experimentalists in the field based on previous works, to cover the practical cases encountered during the coating process.
The collected data are then cleaned of outliers and concatenated for analysis and modelling activities. The linear regression and analysis of variance are then followed to determine the most important features of the process and to quantify the linear correlations between the response and control variables. In the next step, a ML model of random forest is built to evaluate the predictability of the response variables given the control factors. The model is then accompanied by two explainability indices of feature importance (FI) and the accumulated local effects (ALEs). While the former quantifies the relative importance of all the features for the response variables, the later quantifies the effect of each individual control factor on the response. It also highlights the effect coupled control variables on the response variables. The flowchart of the systematic analysis of the mixing and coating process is illustrated in Fig. 2. It is worth noting that all models developed here have multiple inputs and a single output as listed in Table 1.
The methods and models in this study are selected such that they facilitate a pipeline for systematic data analysis of slurry coating process and building a hierarchical methodology. This hierarchical methodology starts from investigating if there are any significant linear relationships between the control and response variables and continues towards the possibility of uncovering the nonlinear dependencies. While ANOVA is an efficient approach to reveal linear relationships, the LR model relates the response and control variables via functions. It is a model with an affordable computational complexity and acceptable accuracy for linear dependencies. For nonlinear relations, the RF is preferred because of its voting system for decision making and its ensemble structure that is suitable for small to medium size of tabular dataset compared to single contributing models such as support vector machines and decision trees.
This methodology via advanced metrology and systematic analysis paves the way towards an in-line control of the LiB electrode manufacturing process. It enables higher production efficiency by identifying the most significant influential factors and revealing the relationship of the physical and electrochemical characteristics of intermediate products such as slurry mix and coated electrodes, as well as the final products such as half or full LiB cells. This model-based prediction is critical for saving time and resource during LiB manufacturing as well as research and development activities for its design.

Experimental
In this study, a range of research-scale experiments were performed using anode slurries as a test system as there is significant room for optimisation in water-based anode manufacturing [17].
The slurries were prepared in a THINKY mixer, and the slurries characterised. Surface tension was measured using a Wilhelmy Plate Tensiometer, taking an average of 3 repeat measurements (Standard deviation ranged from 1-13%). Rheology was measured using a Netzsch Kinexus Pro+ rheometer equipped with 40 mm parallel plates at 25 • C. Flow curves were measured between shear rates from 10s-1 to 1000s-1 and fitted using a power law of Eq. (1).
Where η (Pa.s) is the viscosity as a function of shear rate, γ (s − 1 ), K is the flow consistency index and fb denotes the flow behaviour index, both fitted to the data. Shear rates from 10 s-1 to 1000s-1 were fitted and R2 was >0.99. Oscillatory data was also collected, performing frequency sweeps from 0.1 Hz-100 Hz at 0.5% strain. Oscillatory data was not utilised for the model but was used to explain the trends.
The slurries were coated using a RK K Paint applicator with a micrometer adjustable doctor blade geometry. Wet thickness was measured with a wet thickness comb. The coating was dried at 50 • C on a hotplate for 20 min, before being dried overnight at 120 • C in a vacuum oven. A 10cm 2 disc was cut from the centre of each coating, which was weighed to calculate coat weight, and its thickness was measured at multiple points using a micrometer thickness gauge and averaged. Capacity was calculated based on coat weight and a theoretical capacity for Graphite of 320 mAh/g. Shear rate in the coater was calculated using Eq. (2), where v is the coating speed (m/min) and h is the gap (um) between the blade and the foil [50]. The value of viscosity at this shear rate was used in the dataset. This was extrapolated if necessary, using the power law fit, which was deemed as valid as the fits had high R2 and the extrapolated shear rates were close to the range of the data (the maximum shear rate was 3600 s − 1 , and data was collected to 1000 s − 1 ). Density and dry density were calculated from the combined densities of the added components (with or without solvent respectively). In total 67 coating trials were used for this study, varying weight percentage, coating gap and speed according to Table 2 (the levels are given with three significant digits). The capillary number was calculated to capture the ratio of surface tension to viscous drag effects using Eq. (3), where v is the coating speed (m/s), σ the surface tension (N/m) and η the viscosity (Pa.s) at the coating shear rate. Additionally, the Ohnesorge number, [51] was calculated as the ratio of viscous forces to both inertial and surface tension forces using Eq. (4), where ρ is the wet slurry density (kg/m 3 ), σ the surface tension (N/m), L the coating gap (m) and η the viscosity (Pa.s) at the coating shear rate.

Statistical analysis
Here ANOVA and linear regression, [52], are employed to obtain the mathematical linear relationships between the control and output variables. Consider, the control and response dataset defined asO = ,with x f as the f th control variable, F is the number of various control features, n is the sample (experiment) index, and z is the response variable. Also, N refers to the sample size (the number of experiments).
The LR model is of the form expressed in Eq. (5), where β is the regression coefficient and ε is the random error. Eq. (5) is expanded to all the F control variables per each single response z.
Since not all the terms might be statistically significant in representing the response variable, model reduction based on ANOVA pvalues is employed to obtain a reduced model containing only the statistically significant terms [53]. p_value is the probability of rejecting a null hypothesis. The null hypothesis is that the model coefficient for an input is statistically significant, which means it contributes to the presentation of the response variable significantly. The p stands for probability and measures the strength of the evidence against the null hypothesis. p-values are compared against a threshold of α which is a compromise between the complexity of the model and its accuracy in representing the input-output relationship. In the present work the α is 0.1 and, p-values < 0.1 determine the statistical significance. In other words, p-values smaller than 0.1 mean that the probability of rejecting the null hypothesis is only 10% [54].
Based on LR and ANOVA it is possible to plot the residual intervals which can be used to determine the presence of possible outliers in the data set and to remove those during the analysis and modelling activities.
The goodness of fit for linear model is determined by the value of the coefficient of determination R 2 . Coefficient of determination, as in Eq. (6) is a measure of explained variance in relation to the total variance. R 2 takes values within the range of 0 and 1. It is calculated as a relation between the sum of squared errors, and the total sum of squares between each data point and the whole average. The closer the R 2 to 1 means that more variability has been captured by the model. In other words, it means that a change in the independent predictors predicts R 2 (%) of the variance in the dependent response variable.
The F-statistics value of the terms [55], computed from the ANOVA, is used to determine the feature importance for each of the output variables. ANOVA is a case of a linear regression with the ability to process categorical features. While both models are linear representation of relationships, they have different versions of reported results [56].

Machine learning
Further to the statistical analysis, a state-of-the-art approach based on machine learning is utilised to demystify the relationship between the control and response variables. The advantage of machine learning model over the linear analysis is its ability to capture nonlinear and complex relationships between inputs and output. Furthermore, an ML model can be used to explain the impact of inputs on the response via Explainable Machine Learning techniques which is detailed in the next section.
Here state-of-the-art RF model is utilised with the inputs and outputs of Table 1. Random forest [57] is a powerful machine learning model based on the ensemble learning method for prediction. It is built upon a multitude of components called 'weak learners' to form a 'strong learner'. The weak learners are individual decision trees, and a RF usually outperforms each individual decision tree and overcomes the common issue of overfitting [58]. By growing an ensemble of trees and averaging the prediction of them, significant improvements in regression accuracy can be achieved. The ensembles are grown by generating random vectors to govern the growth of each tree. One popular approach for growing the trees is bagging, where each tree is grown based on a random selection of data points available for training [58,59] which is utilised here as well.
The RF model that is meant to perform predictions, need to be validated so that it can be utilised for prediction of the electrode characteristics related to other similar data set to this study. Validation is an essential step of the model development to make sure the analysis is generalisable to an independent dataset. For this purpose, the cross validation (CV) approach is utilised [60]. CV is a validation technique that divides data into multiple portions of training and validation (test) and then builds and evaluates the model performance using each portion in an iterative way. In each iteration, the whole dataset is categorised into K different portions of equal size via random sampling, K-1 portions are used for training the model and the remaining portion is used for testing. In the next iteration, the random resampling is repeated, and different portions are used for training and test. Based on the trained RF model, the results of the predicted response variables are obtained, and accuracy of model is evaluated through an arithmetic average of the  accuracy and model performance indices for all iterations. In order to quantify the accuracy of the model, two indices are utilised, root mean square error (RMSE) and coefficient of determination (R 2 ). RMSE, Eq. (7), is the root of mean squared error (MSE) or deviations of the models, which averages the squared difference between the measured and predicted values. It is strictly positive and has the same unit as the predicted variable.
RMSE represents the variance of the estimator and it bias which means how the predicted values are spread from one data point to another and how far the predicted value is from each data point. Smaller values of RMSE mean better predictions. RMSE is not alone enough to judge the accuracy of predictions and hence the R 2 is also necessary. R 2 is obtained by Eq. (6). When the two indices are used to evaluate the performance of model for prediction, they usually have conflicting values. Therefore, a balance between the two would be required to decide about the capability of models.

Explainable machine learning
The Random Forest model built for predicting the response variables of the coating process is a black box model that does not reveal its internal mechanisms and cannot be understood completely by looking at its parameters. Explainable or interpretable machine learning [44] offers techniques that make the model's behaviour understandable.
XML supports data scientists and manufacturers when not only a good prediction is required but also there is an intention to know why a specific decision is made by the model. XML techniques clarify how the features affect the predictions and how the model comes to a decision [61]. They help gaining information regarding the relations in between control and response variables. The effect of each individual control variable on the response is referred to first order effect and the influence of two joint predictors on the response is referred as second order effects. XML highlights whether the relationship between the response and a feature is linear, nonlinear, monotonic or nonmonotonic.
For increasing the explainability and interpretability of the RF model in this study, two model agnostic analysis of feature importance [62], as well as accumulated local effects are performed [44]. The advantage of the model-agnostic methods is that they can be freely applied to any ML models and widely extendable from this research to similar ones.
Feature importance analysis is performed to quantify the weight and impact of the features on the response variables that are predicted via the model. Feature importance helps to rank the control variable in relation to each other and make key decisions regarding the improvement of the coating process.
In order to perform the feature importance analysis, the mean decrease in impurity index, (MDI) [63,64] has been used here. To reduce the bias of the MDI algorithm, the control variables have been normalised prior to analysis. For each individual tree of the RF model, each node splits the data from its parent node based on the information provided by a feature that best improves the MDI. The relative feature importance value (gain) of a single feature is obtained by adding up the impurity improvements [63,64], obtained from the nodes using that feature.
The second explanation of the results is achieved via ALE. ALE describes how the features influence the predicted variables on average [65]. ALE is particularly applicable to cases with dependant features such as the control variables of this the mixing/coating study where the dependency between variables such as density and viscosity is certain.
In order to calculate the ALEs for a single feature,x f ∈ X, f = 1,2,...,F, first the feature space is segmented into intervals ofN f (l)with limit values ofw l,f ∈ X, l = 1, 2, ..., L f .N is selected by the ML engineer considering the computational complexity and the data range. Then, for the data points within each interval, the difference between the prediction via the model M, when the feature value is replaced with the upper and lower limits of the interval is calculated. the differences are later on accumulated as shown in Eq. (8).

Herex
(i) − f refers to all features except the one ALE is calculated for, (feature f), n refers to individual samples, andx (n) f are samples of the feature f. n f (l)refers to the number of samples in each neighbourhoodN f (l). ALE by Eq. (8) is then centred via Eq. (9) so that the mean effect is zero. Fig. 3 shows the feature space intervals and limits for the ALE calculation for two correlated features of x 1 and x 2 described in the formulations above.
The value of the ALE is the effect of the specific feature, f at its certain value on the response variable compared to the average prediction of that response variable. The ALE for a single feature can be considered as a first order effect, and when calculated for two features is referred as the second order effect of features on the response variable.

Results and discussions
This section includes the results of the statistical analysis of the experimental data and data-based modelling and representation of the manufacturing processes. The statistical analysis provides insights regarding the significance of factors on the response variables and includes the results of the ANOVA and linear regression. The section then continues with the results of the ML modelling and explanations. The full experimental data can be found in Supplementary Information.
ANOVA results and the coefficients are given in Table 3 for the example output of wet coating thickness. The table contains the values of the estimated model coefficients, the p-value, F-statistics (F-value), and additional parameters used in the evaluation of the statistical significance of the control variables (standard error of the coefficients, tstatistics, sum of squares and mean-squared error). The t-statistics helps in the evaluation of the significant features. Larger t-scores imply larger differences associated with a given control variable (feature). The Fvalue is the ratio of two mean squares, in this case one for the terms of the model (or features) and one for the error. The mean squares are the variances considering the degrees of freedom. Higher values of the ratio of the mean squares indicate larger differences in the variances and as such an indication that the null hypothesis can be rejected. Fig. 4 shows an example of outliers for wet coating thickness and capacity detected via the residuals obtained during the linear regression analysis.
The linear regression models containing the statistically significant  Table 4.
In general, the models are satisfactory in representing the experimental data as shown in the values of R 2 . Coating density and porosity showed a lower level of correlation (R 2 ~ 0.73) but still satisfactory for linear modelling. It can be appreciated from the model coefficients in Table 4 that coating speed, coating gap, slurry density and viscosity are the most significant input variables out of the 9 studied variables.
As Table 4 shows, coating weight, coating density and porosity are modelled by first order terms, while second order terms are additionally required for wet/dry coating thickness and capacity. Detailed modelling results and planes are shown in Fig. 5 for example outputs. The figures indicate the response surface for the (a) wet thickness of the coating, (b) the capacity, (c) the porosity and (d) the density of the coatings as a function of two interest factors. Here the dots refer to the experimental points from the designed experiments and the planes are graphical representations of the functions obtained by linear regression method. The graphs confirm a good agreement between the experimental data and predicted results. The planes can be used to predict the output variables given the control variables, which are alternatives to the equations reported in Table 4.
To further highlight the impact and importance of factors on responses, a deeper analysis is followed via ML. Fig. 6 shows the distribution of the predicted vs measured datapoints for all response variables via RF. The results are obtained for a cross validation with K = 5 partitions, which means that at each CV iteration, 80% of the datapoints are used for training and 20% is left for test, which is a common approach in ML applications [66]. It is worth mentioning that all the results are a fusion of multiple, 40, runs of the model for a more robust representation of model performance. The number of runs has been selected as a compromise between the run time of algorithms and the stability of results.
According to Fig. 6, the distribution of data is almost a combination of two normal distributions and two peaks for all responses. The distribution does not contain any information by itself as the design of experiments has had more than one control factor varying at a time. According to this graphs, the distribution of the predicted responses follows the distribution of the observations (the experimental measurements). This confirms that the ML model is capable of representing the input-output relationships. The compatible predictions and observations also imply that given the set of coating control variables, the electrode features can be predicted or estimated with a reasonable accuracy. This is important when predicting the response variables related to control variables that are not in the experimental dataset.
To get a better view of the goodness of fit and the datapoints at which the prediction is most accurate, a closer look at the results is required (Figs. 7 and 8). While Fig. 7 highlights the error between each data point and its predicted value, Fig. 8 gives a broader view of the variability of predicted values in whole range. The solid lines of Fig. 8 are perfect predictions and the more data points clustered around that line the better the prediction.
A summary of the goodness of fit indices are given in Table 5, the         includes the mean value of indices for multiple runs, 40, as well as the standard deviations. In order to provide a basis to show the performance of the random forest model, the results are compared with the gradient boosted tree (GBT) model [67]. This model is also based on a collection of weak learners, decision trees, similar to random forest but utilises a different mechanism, which is based on gradient boosting, for obtaining the final results from the outcome of each weak learner. In order to facilitate a fair comparison, the number of weak learners has been set equally for both methods and they have been iterated for the same number of runs. Considering the results in Table 5, generally the RF performs better that the GBT model for all response variables, it is also showing smaller standard deviations in the results. Here after only the results of the random forest model are reported in detail and further investigates are based on that as well.
According to the results of Table 5 and Figs. 6-8, the models have all R 2 coefficients larger than 0.7, which shows that at least 75% of the data are well represented by models. For the capacity, coat weight this predictability is higher than 88% and for wet thickness higher than 94%. The lowest predictability belongs to porosity and coating density with about 75%. This is believed to be due to the higher levels of measurement noise associated with porosity and coating density data as those are calculated features based on coating weight and coating thickness and have an uncertainty imposed by two measured values.
At this level and after the reviewing the model performance and the accuracy indices, it is concluded that the models can predict the responses, but still further analysis are required to provide insights about the manufacturing processes and the impact of control factors on responses. This highlights the necessity of XML techniques.
The feature importance graphs obtained via MDI approach described in Section 2 are plotted for the response variables as Fig. 9. Each figure shows the relative importance of control factors on the response. The horizontal axis has values in percent and the longer the bar the stronger the effect of that variable. It is obvious each factors' importance is different for each responses. So, the immediate finding is that a detailed analysis of the impact on factors on responses is necessary. In fact, the ranking of features for one response would not be generalisable to other responses of electrode manufacturing process.
According to the graphs of (a)-(d), the coating gap is the main factor with +70% contribution to the coating weight, capacity, dry and wet thickness of the electrodes. For coating weight, capacity, and dry thickness the capillary number and viscosity are the next important factors, while for the wet thickness, coating speed is also very contributing.
The importance graphs for (e) and (f) are slightly different in terms of the contribution of factors. While still coating gap is the main factor, but the contribution of it is only about 20% and the rest of the factors have an importance at the same range and only slightly less. The graphs imply that while the coating weight, capacity, dry and wet thickness values are highly dependent to the coating gap and this factor masks the others, for porosity and coating density there is a much complex interaction of responses and factors.
Coat weight, dry coating thickness and porosity are chosen as three key response variables and analysed by ALE (Figs. 10, 13 and 14, respectively). The results are shown for six of the input variables (coating gap, viscosity, coating speed, surface tension, slurry solid weight and dry density).
According to Fig. 10, coating gap shows a linear relationship with coat weight, as would be expected, as a higher gap leads to more slurry deposited on the current collector and hence a higher mass loading. Viscosity appears to affect coat weight at lower values but reaches a plateau, having little effect on coat weight above 1 Pa.s. This may be due to slumping, where the lower viscosity slurries spread out more immediately after coating, leading to a thinner coating with lower coat weight, once the slurry is sufficiently thick to avoid this, a plateau is reached.
A higher coating speed also appears to increase the coat weight, the reason for which is unclear, as a faster speed may be expected to leave less material on the current collector. One possibility is that the slurry behaved elastically, as the slurry is viscoelastic at all weight percentages. This can be seen in the oscillatory rheology, where the elastic modulus, G' is close to the viscous modulus, G'' at all measured frequencies (Fig. 11). The higher speeds could have increased the tendency of the coating to 'spring back' giving a thicker coating than the gap set (and many of the wet thicknesses observed are slightly higher than the gap set), causing this trend.
Surface tension also appears to increase the coat weight at higher values. This could also be linked to slumping, a higher surface tension will likely increase the contact angle of the slurry on the current collector, which gives the slurry a tendency to bead up, rather than spread on the surface. In the extreme this could cause coating defects such as pinholes, as the slurry beads up around a nucleation point (e.g. a speck of dust or irregularity in the current collector), however these were not observed within the conditions used for the dataset. So, an increased surface tension could reduce the amount of slumping and increase the tendency to 'spring back' after coating, leading to higher coat weights.
Solid weight percent only appeared to impact the coat weight at higher values. It would be expected that a higher weight percentage would lead to higher mass loadings (as there is less solvent deposited, which is removed during drying).
Dry density, which is the predicted theoretical density of the components in the coating without solvent, will not vary between weight percentages of slurry and hence did not vary significantly during the experiments. However, there is a small fluctuation due to errors in weighing out components. Because it was constant, within experimental error, it has a much smaller impact on the results than other variables. According to Fig. 10 and considering the range of the ALEs for the coat weight on left axis, it is quite clear than the trend of the mixing control variables is quite nonlinear for most response variables. Also, the range of the effects is varying from one to another implying their different importance. The ALE graphs magnitude range is fairly compatible with the importance analysis provided on Fig. 9 (a) having the order of impact as coating gap, viscosity, coating speed, slurry solid wight percent Surface tension, and dry density.
As shown in Fig. 12, the relationships with dry thickness are very similar to those with coat weight (as a thicker coating has a higher mass loading). Hence the trends follow almost the same patterns for Coat weight.
As Fig. 13 shows, electrode porosity decreases with coating gap, but  Fig. 11. Oscillatory rheology of slurries at three weight percentages of the study, frequency sweeps measured at 0.5% strain.
this is most pronounced at lower gaps. This is because a lower gap produces a thinner coating, and thin coatings have little freedom to rearrange under capillary forces during drying. Thicker coatings have both more time and more spatial freedom to perform rearrangements to reduce the overall porosity [68]. Viscosity follows an inverse relationship with porosity to that with coat weight, implying that the effect of coat weight on porosity is the dominant effect, for the reasons given above. High viscosity could be expected to impair rearrangement and increase porosity, but this is not observed, implying that the coating thickness is much more important to this process. (However, the second order plot, Fig. 16 reveals more about this).
Coating speed has a similar relationship; however, the trend does not exactly follow the inverse of the trend in coat weight, in particular, at the highest speeds, coat weight plateaus, but porosity continues to decrease. This implies that faster coating speed promotes the lower porosity, possibly due to reduced air entrapment or faster rearrangement of components.
Surface tension shows what appears to be a minima in porosity at intermediate values. This could be due to the competing effects of the coat thickness (which increases with surface tension so may be expected to give a decrease in porosity) and the capillary forces. Higher surface tension will also increase the resistance to capillary forces and hence the rearrangements possible during drying, leading to a higher overall porosity.
Porosity also appears to decrease with weight percentage, most dramatically at lower weight percentage (unlike the trend with thickness, which is flat in this region). The pore structure is created by the  evaporation of solvent, so it follows that the more solvent the higher the overall porosity.
Finally increasing dry density also decreases porosity, which is expected as again a higher density of components should lead to a higher density in the coating, i.e., fewer voids. Although it should be noted that this is over a small range as the formulation was not varied in these experiments.
While first order effect graphs show the influence of each individual factor on the response, the second order ALEs, quantify the impact of a pair of features. Second order ALEs shed light on the interaction of control variables and its impact on the responses. In order to capture the second order effects, a similar approach to what described via Eqs. (8) and (9) is taken [44] and instead of intervals in Fig. 3, rectangular cells were created to cover both dimensions. It should be emphasised that, when interpreting the second order ALEs, not only adjustments for the overall mean effect, but also adjustments for the first order effects of individual features are performed. This means that the second order effects do not include the effects of individual features and only capture the additional interaction effects. The second order effects are 2 dimensional graphs in the form of heatmaps, x and y axis show the features and the colour bar is related to the response variable. The colour intensity shows the strength of the correlation. In fact, cells with lighter colour indicate an above average prediction value while darker coloured cells suggest below average predictions.
All the second order effects for responses are calculated considering the coating gap as the first feature, and one of the four features of viscosity, coating speed, surface tension and slurry density. Fig. 14 gives heatmaps for the predicted coat weight, while Figs. 15 and 16 provide results for dry thickness and porosity of the electrode, respectively. Similar to first order ALEs, the values are centred to the mean of the response variable. Fig. 14 (a) suggests that higher coating gaps and lower values of viscosity have an additional positive effect on the predicted coating weight. Similarly, when interpreting the cells, of Fig. 14 (b), high coating gaps, and high coating speeds have a negative effect on the predicted coating weight. In all results, when a cell value is almost zero the coupled features do not influence the predicted response. To understand the total effect of two features on the predicted value, three figures should be considered, two of the first order ALE Figs. 10-13 plus the values of the heatmaps in the Figs. 14-16, these three values should then be added to the average predicted response (as all calculations are centred and normalised). Fig. 14 (b) shows that a high coating speed decreases the coat weight for coatings at larger coating gaps. Also, high surface tension increases coat weight at larger gaps (Fig. 14 (c)), as a high surface tension would give a higher contact angle on the current collector and a lower tendency to slump or spread on the surface. Slurry density-coating gap heatmap interestingly shows an opposite relationship at low coating gaps compare to higher gaps, Fig. 14 (d).
Following Fig. 15 (a) shows that a high viscosity gives increased thickness at large coating gaps, which is opposite to the coat weight results, although if the highest coating gap/viscosity is ignored the general second order trends in coat weight and dry thickness are similar. The trend in coating speed and coating gap, with dry thickness, Fig. 15 (b), appears broadly similar to that with coat weight. There is also a near identical trend in dry thickness with surface tension/coating gap as coat weight, Fig. 15 (c). Slurry density shows a less clear second order trend with coating gap, Fig. 15 (d), but it does reverse between low and high gaps as with the coat weight.
The next heatmaps are dedicated to porosity. High viscosity favours higher porosity at large gaps, whereas at small gaps a lower viscosity decreases porosity, (Fig. 16 (a)). This is believed to be because a high viscosity impairs rearrangement to reduce porosity, which is most important at small gaps where the components have less space and a smaller drying time to rearrange. When the gap is increased, this trend reverses, which is believed to be due to the high viscosity preventing slumping, which leads to a thinner coating with higher porosity.
There is a complex relationship between coating speed, coating gap and porosity as demonstrated in Fig. 16 (b). There appears to be a decrease in porosity at intermediate speeds for gaps of 200 and above, but for 70, this inverts, becoming an increase. Higher speeds may be expected to leave a thinner coating, which generally have higher porosity, but it could be that at the lower speeds, more air entrapment is possible and there is a balance between these effects. For the smallest gap the shear rate is higher, and thus the slurry is more likely to behave elastically, i.e. spring back to give a thicker coating at faster speeds, with a higher porosity. Once the speed is high enough to induce this behaviour, further increases may just reduce air entrapment and reduce porosity again.
Surface tension again shows a different relationship between 70 (um) and higher coating gaps, Fig. 16 (c). This could be as slumping is more likely for the thicker coatings, which is reduced by a higher surface tension, giving thicker coatings with higher porosity. For the smallest coating gap, the ability of the components to rearrange and reduce porosity is more critical (due to smaller space and drying time) and hence it might be expected that the higher capillary forces that come with higher surface tension would impede this, leading to a higher porosity. However, the opposite trend is observed. This could be due to the higher surface tension favouring a minimisation of the air-slurry interfaces, thus reducing air incorporation in the slurry. This follows  the trend in dry thickness in Fig. 15.
Slurry density generally increases porosity, as a denser slurry is more resistant to slumping giving thicker coatings and more likely to entrap air. However, there are outliers at low density, Fig. 16 (d).

Conclusions
This study focuses on the lithium-ion battery slurry coating process and quantitatively investigating the impact of physical properties on coating procedure. Slurries are characterised with advanced metrology and, the statistical analysis together with the explainable machine learning techniques are applied to reveal the interdependency and relationships between inputs and outputs. In fact, understanding the impact of control variables on the responses is only possible through local interpolation techniques such as ALE and traditional methods would not be applicable when there is only limited experimental data and the cost of experiments with the approach of changing one-variableat-time is very high.
Conventional statistical analysis is the first choice for uncovering linear interconnections while machine learning models and explainability techniques are necessary for capturing the nonlinear relationships. The linear models can represent the slurry coating response variables with an accuracy between 73% and 98% for different variables. The prediction accuracy via random forest model is between 74% and 94%. For response variables of wet and dry thickness, coating weight and density, the validation scores of linear model are slightly higher than random forest and this implies that the responses have a nearly smooth and almost linear dependency on the control variables. For these responses, linear models are a better choice due to less complexity and easier optimisation compared to random forest. For porosity and density, the nonlinear models of random forest provide a better accuracy and are recommended models for predictions and analysis. But regardless of the accuracy, ML models are necessarily to offer the explanation via XML.
An acceptable prediction accuracy obtained via ML confirms that the presented approach and methodology can provide an opportunity to determine product quality through model-based analysis specially where experiments are hard or costly to run. In other words, the models can predict the product properties for various combinations of control factors that fall in the same range for which the model has been trained for. This would eliminate the need to perform a manufacturing run for that combination as it would lead to almost the same response predicted by model. The results here concern a current industrial process for manufacture of water-based Graphite anodes, for which there is still significant room for optimisation and the models here show promise for reduction of costly trial and error setup. However, the results also demonstrate the utility of ML more generally in battery manufacturing, and similar approaches could provide substantial benefit in the development of new technologies, such as solid-state and dry processing, where the available data is smaller and experiments, particularly on scale, will be even more costly to run. This approach provides a method for extracting the most information from that experimental data.
With the support of explainable ML techniques, key results are found regarding viscosity and surface tension. According to the results, viscosity is necessary to be kept above a minimum threshold to prevent slumping, low coat weights and high porosities. Surface tension, a metrology option rarely considered for electrode slurries is also shown to be an important variable that can be used to optimise coating. Low surface tensions, like viscosity, promote slumping, however high surface tensions promote air entrapment that increases porosity, therefore there lies an optimum intermediate value for coating. This is an important additional parameter to measure for slurry coating, which may provide opportunity to identify issues earlier and reduce optimisation time for industrial lines. The importance of dimensionless numbers of the predictability of the responses is also among the findings of this research. The importance graphs reveal that the capillary and Ohnnesorg numbers can both serve as strong predictors to estimate the response variables. They are more contributing to the coat weight, dry thickness and capacity compared to porosity and coating density.
One of the main findings of this study is that the importance of control variables is different from one response to another, as is the linearity or nonlinearity of the dependencies. The results can facilitate an efficient design of experiments for a comprehensive representation of the manufacturing processes involving coating processes. The importance graphs show that the coating gap factor is better to be set at a constant value, in order to reveal the impact of factors such as viscosity and surface tension that appear at a lower rank. In this way the coating gap will not mask the other factors. Furthermore, the ALE graphs show that in the regions that the control factor is fairly linear with the response, (such as coating gap and dry thickness), only a few number of breakpoints for control variables would be enough for training a model. On the contrary a finer breaking is necessary to represent a nonlinear dependency such as surface tension and porosity.
Although the analysis presented here have revealed the interdependencies in the complex process of slurry coating, future works are still required to consider a wider variety of control and output variables in the model at different stages, such as formulation, drying, calendering and cell making via a comprehensive design of experiments. It is also necessary move towards an online and closed-loop decision making procedure based on the analysis [69]. This extension is critical for a predictive digital twin of the LiB coating process and is currently under study by the authors. This study has been a proof of concept performed in the lab and via lab-scale production methods, therefore further studies addressing the transition from lab to higher scales of manufacturing specially industry scale is still required. It is worth noting that the modelling system, the methodology of the systematic process analysis, and the feature contribution approach that are proposed here are all independent of the nature of the dataset. Therefore, the methodology is highly transferable from one process to another and also from one electrode to another as long as the general requirements of the dataset quality in terms of the variability and the ranges are met. The specific part of this study that has been tailored to the graphite manufacturing process is the design of the experiments which include the ranges and the number of break points for each variable. Consequently, for any other processes or material combination, the DoEs need to be revised. It is also fair to consider that the hyperparameters of the ML models are also dependent to the dataset and their distributions, therefore, while the modelling methodology is applicable to other scenarios, their hyperparameters need to be tuned and the model robustness of models to the shift in the data needs to be quantified.

Supplementary information
The experimental data of this study has been included as supplementary information. Please contact the authors for further details.
To increase the reproducibility of the results, the model details and the range of the hyper-parameters used for optimising the model are mentioned as below, all analysis are performed in MATLAB 2020b and Python 3.