Comparison of methodologies to estimate state-of-health of commercial Li-ion cells from electrochemical frequency response data

Various impedance-based and nonlinear frequency response-based methods for determining the state-of-health (SOH) of commercial lithium-ion cells are evaluated. Frequency response-based measurements provide a spectral representation of dynamics of underlying physicochemical processes in the cell, giving evidence about its internal physical state. The investigated methods can be carried out more rapidly than controlled full discharge and thus constitute prospectively more efficient measurement procedures to determine the SOH of aged lithium-ion cells. We systematically investigate direct use of electrochemical impedance spectroscopy (EIS) data, equivalent circuit fits to EIS, distribution of relaxation times analysis on EIS, and nonlinear frequency response analysis. SOH prediction models are developed by correlating key parameters of each method with conventional capacity measurement (i.e., current integration). The practical feasibility, reliability and uncertainty of each of the established SOH models are considered: all models show average RMS error in the range 0.75% – 1.5% SOH units, attributable principally to cell-to-cell variation. Methods based on processed data (equivalent circuit, distribution of relaxation times) are more experimentally and numerically demanding but show lower average uncertainties and may offer more flexibility for future application.

• Metrological framework to compare SOH estimation methods for second-life cells.
• Electrochemical impedance data measured during cell aging at 4 institutions.
• Models based on raw EIS and NFRA data, and equivalent circuit and DRT fits.
• All methods predict SOH of aged cells to within 1.5% SOH units RMS error.
• SOH prediction uncertainty is principally governed by cell-to-cell variation.

Keywords:
Electrochemical impedance spectroscopy; Equivalent circuit; Distribution of relaxation times; Nonlinear frequency response analysis; State-of-health prediction

Introduction
Lithium-ion cells age due to repeated charge/discharge cycles and during storage. Once the remaining capacity falls below the defined operational limit, typically 70-80% of the original capacity, the aged lithium-ion cells need to be replaced. Approximately 200,000 tonnes of lithium-ion cells enter the EU market annually [1]. Demand for lithium-ion cells increased by about 50% from 2018 to 2020 and is projected to increase 14-fold by the end of 2030 [2], owing to the expanding electric vehicle market [3,4]. About 40% of aged lithium-ion cells are collected from the market and recycled [1]. This indicates that lithium-ion cell waste will increase rapidly and may create further problems in the future. Repurposing of aged lithium-ion cells could maximise cell utilisation prior to recycling or disposal. Aged lithium-ion cells can still be useful for applications with lower power or energy density requirements, such as domestic energy storage. Such cells are called second-use cells or second-life cells [5-7]. By repurposing aged cells, the cost of second-use applications can be significantly decreased. Also, the demand for raw materials to manufacture new cells will be reduced.
For the economically viable application of second-use cells, an accurate and cost-effective characterisation technique to estimate the state-of-health (SOH) is essential. SOH describes the actual charge capacity that can be stored in a cell, as a ratio to the nominal capacity value. Since module performance is limited by the individual cell with the lowest SOH, accurate SOH determination allows grouping of cells with closely matched SOH values, increasing module lifetime. Similar considerations apply to the grouping of modules in packs. The most common method to determine SOH is by integrating the transient current during one complete charge/discharge cycle at a nominal operating current. The major drawback of this method is the long measurement duration, so that alternative, more approximate, techniques are required for rapid and/or online SOH estimation.
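As an illustration of the conventional capacity measurement by current integration, the following minimal sketch integrates a sampled discharge current with the trapezoidal rule and expresses the result as a percentage of a reference capacity. The function names, the sample data and the choice of reference capacity are assumptions made for illustration only; they are not taken from the measurement procedure used in this work.

```python
def discharged_capacity_ah(times_s, currents_a):
    """Integrate discharge current over time (trapezoidal rule); returns Ah."""
    if len(times_s) != len(currents_a) or len(times_s) < 2:
        raise ValueError("need at least two matching time/current samples")
    charge_as = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        charge_as += 0.5 * (currents_a[k] + currents_a[k - 1]) * dt
    return charge_as / 3600.0  # ampere-seconds to ampere-hours


def state_of_health(capacity_ah, reference_capacity_ah):
    """SOH in percent, as the ratio of measured to reference capacity."""
    return 100.0 * capacity_ah / reference_capacity_ah


# Hypothetical data: a constant 1.25 A discharge lasting two hours.
times = [0.0, 3600.0, 7200.0]
currents = [1.25, 1.25, 1.25]
capacity = discharged_capacity_ah(times, currents)
soh = state_of_health(capacity, reference_capacity_ah=3.0)
```

The reference capacity may be the nominal value or the capacity measured on the fresh cell, depending on the definition adopted.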
Various SOH estimation models or algorithms have been proposed to address this shortcoming [8]. Models resolving the dynamic response of a cell seem to be highly suitable for such tasks, as they allow extraction of characteristic time constants and features from the dynamic experimental response, which can be correlated to physical processes and states, such as SOH. The variety of dynamic diagnostic models for batteries is wide, ranging from mechanistic, i.e., physics-based, approaches via (semi-)empirical models based on equivalent circuits (EC), to data-driven models, and even hybrid models composed, e.g., of data-driven and mechanistic models [9].
Mechanistic simulations employ physics-based cell models such as the pseudo-two-dimensional (P2D, "Newman") model [10] or single particle (SP) model [11] to estimate the SOH of a cell as a function of its physical parameterisation. Physics-based models may be tailored to consider the physicochemical prediction of cell performance, as well as degradation processes such as solid-electrolyte interphase (SEI) growth [12]. In this way, besides SOH estimation, the root causes of capacity fade can also be quantified and the impact on ageing of operating conditions, such as temperature, cycling depth or current, can be better understood. However, the practical utilisation of mechanistic models depends on resource-intense parameterisation [13,14], and due to model complexity, the operation and fitting of the models may not be rapid [15,16].
Compared to mechanistic simulation, exclusively data-driven (empirical) models are more easily implemented and show significant promise both for SOH estimation of cells after first life and for online applications (during use) [8,17]. A data-driven model involves the training of an empirical model that maps a defined set of ageing parameters to SOH. Various training algorithms for SOH estimation have been extensively researched, e.g., neural networks [18-24], support vector machines (SVM) [25-27], relevance vector machines (RVM) [28-31], Gaussian process regression [32,33], and extreme learning machines [34,35]. Correspondingly, different sets of ageing parameters have been identified and used as the inputs for the training algorithms. For example, Zhou et al. [28] defined mean voltage falloff within a specific voltage window as an indicator to predict SOH via simple regression and optimized RVM approaches. For online SOH estimation, Zhou et al. [36] employed an optimized grey model based on the time interval of equal discharging voltage difference, while Li et al. [37] implemented a recurrent neural network (RNN) with long short-term memory (LSTM) that uses a raw partial charging curve. Examples of other ageing parameters that have been proposed include the voltage at the beginning of discharge [38], constant current (CC) and constant voltage (CV) charging time [23,32,39], charging/discharging energy and efficiency [23,34], rate of change of CV charging current [32], nominal voltage [23,32,34], peak heights/ratio from incremental capacity analysis (ICA) [40], incremental voltage difference [22], and characteristic features from the voltage relaxation curve [41].
Besides ageing parameters drawn from the DC voltage response, electrochemical impedance spectroscopy (EIS) has also been shown to be a viable data source for SOH estimation. EIS uses a small-amplitude oscillatory load to determine the complex impedance of the object under test, as a function of the load frequency [42]. EIS is especially useful for analysing the internal state of the cell, in particular the interfacial properties that evolve as the cell ages. Efforts to date on SOH prediction using EIS have been reviewed comprehensively by Mc Carthy et al. [16], including online estimation challenges. One simple approach is the direct correlation of a spectral feature with cell degradation. For example, Zhang et al. [33] showed that, for a particular data instance, the impedances at 17.8 Hz and 2.16 Hz had the strongest correlation to the degradation, and so could be used for SOH estimation. Wang et al. [43] demonstrated that the phase shift at 79.4 Hz correlated positively to the internal temperature of the cell. A more involved impedance-based approach is SOH estimation using parameterisation from equivalent circuit (EC) models. Here, changes in circuit fit parameters are correlated to SOH. For example, both ohmic and charge transfer resistances have been shown to be reliable SOH predictors, as they increase consistently and significantly during cell ageing [19,44-49]. Other researchers have demonstrated promising SOH estimation accuracy by considering additional EC parameters, namely SEI layer resistance, capacitance dispersion, inductance, and double layer capacitance [20,21,50]. Eddahech et al. [18] pointed out that by considering operating conditions such as temperature, cycling profile and depth-of-discharge (DOD) variation in conjunction with ageing-relevant parameters such as equivalent series resistance, the established SOH estimation model is more general to different applications and operating conditions. Examples of this wider approach include the work of Wang et al. [51], who established an SOH estimation model based on charge transfer resistance fitted from an EC model, including temperature and state-of-charge (SOC) variation, and Li et al. [52], who extracted ohmic resistance from an EC model via particle swarm optimization and implemented it as an indicator for online SOH prediction in a cloud-based battery management system (BMS).
Besides EIS, nonlinear characterisation methods, e.g., nonlinear frequency response analysis (NFRA), have been shown to have potential for ageing state estimation. Here, the higher harmonic responses to a large sinusoidal input signal, typically current, are analysed [53]. Harting et al. [54] revealed that higher harmonic signals are sensitive to ageing; not only do nonlinear responses increase with decreasing SOH [36] but the higher harmonics also change differently depending on the underlying ageing process, e.g., Li plating vs. SEI growth. Quantitatively, correlation between NFRA and SOH has been achieved using various features of the response amplitude [54] either directly [55] or with machine learning [13]. In Ref. [25], a support vector machine model for SOH prediction based on the summation of the second and third harmonics gave prediction errors below 5%.
Since these various previous studies used different experimental conditions, cell formats and data analysis methods, comparison between them is currently challenging. An underpinning metrological framework for impedance-based methods is currently lacking; such a framework should include traceability, quantified measurement uncertainties and defined measurement procedures to guarantee comparability of the results. In this work, we compare various selected impedance-based methods for the purpose of offline SOH estimation on cells following first life use: direct use of EIS data, EC fits to EIS, distribution of relaxation times (DRT) analysis on EIS, and NFRA. For each of the investigated methods, we follow a common, systematic framework, progressing through data collection, data processing, parameter selection, model training, and model validation. Here, all methods are tested under equivalent experimental conditions, using a consistent source for the tested cells, with measurements undertaken at multiple institutions in parallel to demonstrate robustness of the methods with respect to hardware variation. Each of the methods is thoroughly evaluated in terms of its practical feasibility, estimation accuracy, and uncertainty. We also provide perspective on the extension of our methodology to cells aged under different cycling conditions, as well as cells with different formats and chemistries. The adaptation of the presented methodology to comparison of online SOH estimation approaches would also be feasible.

Battery testing protocols
Life-cycle tests (LCTs) were conducted by four different measurement institutions on cylindrical 18650 cells (9 cells total, obtained from a commercial supplier) with a nominal capacity of 3 Ah. The selected cell chemistry has a lithium nickel manganese cobalt oxide (LiNi0.8Co0.1Mn0.1O2) positive electrode and a silicon-graphite composite negative electrode (Si content estimated as 2.8 ± 0.5 wt% from post mortem SEM-EDX analysis, to be reported separately). The cells were cycled between 3.0 V and 4.2 V (DOD 100%) at 45 °C and 4 A (≈1.33C), with a CC-CV protocol in the charging direction (current cut-off at 300 mA) and CC in the discharging direction. Different cell cyclers were used at different institutions (Modulab-MACCOR, BaSyTec XCTS, BioLogic MPG-205). At 50- or 100-cycle intervals, the capacity of each cell was measured under consistent, repeatable conditions considered practical for a future standard measurement method, at 23 °C and 1.25 A (≈0.4C). SOH is computed by taking the ratio of the discharge capacity of the aged cell to the initially measured capacity of the fresh cell. Following each capacity measurement, the internal state of the cells was characterised via EIS and, at one institution, NFRA. Dynamic measurements were carried out under the same conditions as capacity measurement, at specified SOC values using various electrochemical workstations at different institutions (MACCOR, Zahner Zennium, BioLogic SP-200, BioLogic MPG-205). SOC was adjusted by discharging the cells at 1.25 A from the end-of-charge voltage, with the required extent of discharge being adapted according to the most recently measured capacity. After SOC adjustment, the cells were allowed to rest for at least 30 min before dynamic measurements were conducted.
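The SOC adjustment described above amounts to simple arithmetic, which can be sketched as follows. The sketch assumes that SOC is defined relative to the most recently measured capacity and that the cell starts from the fully charged state; the function name and example values are hypothetical.

```python
def discharge_time_s(target_soc_percent, capacity_ah, current_a=1.25):
    """Discharge duration (s) needed to reach a target SOC from SOC 100%."""
    if not 0.0 <= target_soc_percent <= 100.0:
        raise ValueError("target SOC must lie between 0 and 100 percent")
    charge_to_remove_ah = (1.0 - target_soc_percent / 100.0) * capacity_ah
    return 3600.0 * charge_to_remove_ah / current_a


# Hypothetical aged cell with a last measured capacity of 2.5 Ah,
# to be brought from SOC 100% to SOC 80% at 1.25 A.
duration = discharge_time_s(80.0, capacity_ah=2.5)
```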
Measurement parameters for the EIS and nonlinear frequency response (NFR) measurements are shown in Table 1 and Table 2, respectively. For the NFR measurement, the AC excitation amplitude is chosen at 5 A to ensure an excitation of harmonic signals with good signal-to-noise ratio while avoiding significant cell heating that could induce additional ageing [56]. The investigated SOC range is restricted to SOC 20-80% to avoid overcharging due to the higher AC amplitude. The frequency range has a lower bound of 10^-1 Hz to avoid drift in the cell state over longer measurement durations [57].

Life-cycle test data set and qualitative observations
All data from the life-cycle tests were collated as a set of discrete 'experiments', where each experiment represents an impedance or NFR spectrum gathered under a particular condition. The following cell information was available for each experiment:

A notable impedance increase occurs during cell ageing, especially for the lower-frequency arc (ca. 0.02 Hz-10 Hz), which is attributable to a charge transfer process for Li insertion in the active material at one or both electrodes of the cell. The impedance increase within this frequency range is more prominent when measuring at higher SOC. From the EIS data in Fig. 2, there is a promising indication that the impedance spectrum is sensitive to the SOH fade of the cell, and therefore it is suitable to investigate quantitative models to correlate the impedance data to SOH fade. Unlike EIS, which probes the linear response of the cell, NFRA utilises the higher harmonic signals that are generated when the cell responds nonlinearly to a larger perturbation [53,56]. Fig. 2 (c) and (e) show typical NFR spectra corresponding to the EIS data during cell ageing. Notable higher harmonic responses occur only in the frequency range 0.1 Hz-2 Hz. For all SOC, the voltage amplitudes of the second (Y_2) and third (Y_3) harmonics increase monotonically with decreasing SOH. As for the EIS results, higher SOCs give a higher magnitude response. Nonlinear responses are only expected for nonlinear processes, such as charge transfer reactions; slow transport processes can cause additional nonlinearities [58]. The presence of significant nonlinear responses in the frequency range 0.1 Hz-2 Hz strengthens the inference that the EIS response in this frequency range is attributable to the charge transfer process for Li insertion in one or both electrode active materials. An increase in the nonlinear response in turn can be correlated to worsening charge transfer kinetics of the Li-ion cell, as has been illustrated by mechanistic modelling studies [57]. As the amplitudes of both Y_2 and Y_3 show a stronger increase during ageing at higher SOC, the high SOC and low frequency range appears to be most suitable for correlating NFR signals to the SOH fade. This encourages the development of an SOH model that utilises the NFR data directly.
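To make the notion of harmonic voltage amplitudes concrete, the following minimal sketch extracts the amplitude of a chosen harmonic from a sampled voltage response by evaluating a single discrete Fourier component. It assumes uniform sampling over an integer number of excitation periods; the sampling settings and the synthetic signal are hypothetical and do not represent the instrument processing used in this work.

```python
import cmath
import math


def harmonic_amplitude(samples, sample_rate_hz, fundamental_hz, order):
    """Amplitude of the order-th harmonic via a single-frequency DFT.

    Assumes `samples` spans an integer number of fundamental periods.
    """
    n = len(samples)
    freq = order * fundamental_hz
    acc = 0j
    for k, v in enumerate(samples):
        acc += v * cmath.exp(-2j * math.pi * freq * k / sample_rate_hz)
    return 2.0 * abs(acc) / n


# Synthetic voltage response: fundamental plus weak 2nd and 3rd harmonics.
fs, f0, periods = 100.0, 1.0, 5
n_samples = int(periods * fs / f0)
signal = [
    0.1 * math.sin(2 * math.pi * f0 * k / fs)
    + 0.004 * math.sin(2 * math.pi * 2 * f0 * k / fs)
    + 0.001 * math.sin(2 * math.pi * 3 * f0 * k / fs + 0.3)
    for k in range(n_samples)
]
y2 = harmonic_amplitude(signal, fs, f0, order=2)
y3 = harmonic_amplitude(signal, fs, f0, order=3)
```

If the record does not span an integer number of periods, spectral leakage would bias the amplitudes, and windowing or period-synchronous sampling would be needed.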
Systematic analysis of frequency-response data is not straightforward. A particular limitation preventing direct utilisation of EIS or NFR data for machine learning is that each spectrum comprises a set of discrete data points that are highly correlated to each other (i.e., they form a continuous spectrum). If the number of individual inputs is large but the inputs are highly correlated, reliable model training may require excessive quantities of input data; therefore, it is not appropriate to use raw impedance data directly and in their totality for practical machine learning. To resolve this challenge, different data reduction methods were applied to the impedance and NFR data.
As identified in the literature review in the Introduction, the simplest approach is to limit the analysed data to specific frequencies. An alternative is to use EC or DRT analysis [59,60] to represent the raw impedance data as a smaller set of uncorrelated coefficients that are suitable as inputs to an empirical model, with minimal information loss. In the remainder of this section, we specify the mathematical methods used to derive the relevant EC (Section 3.2) and DRT (Section 3.3) parameters ("processed data") for correlation to SOH. The general methodology for formulating SOH models from raw or processed data is then introduced in Section 3.4.

Equivalent circuit (EC) fitting
Following inspection of the impedance spectra, all data were fit to the 2ZARC equivalent circuit shown in Fig. 3 (a), which was developed taking inspiration from the comparable equivalent circuits presented by Buteau and Dahn [61] and established empirically as sufficiently accurate for the measured EIS data. The equivalent circuit chosen is not intended to be a direct, physicochemical representation of underlying processes in the cell, although to maintain a clear, convenient nomenclature, the circuit elements are notated according to conventional assignments ("ct", "dl", "diff"). The purpose of the equivalent circuit fit is data reduction by means of the nonlinear transformation represented by the circuit. An exemplar quality of fit is shown in Fig. 3 (b).
The equivalent circuit in Fig. 3 (a) has an analytically expressed impedance in terms of 12 coefficients (see Equation (1)) as a function of angular frequency ω [61].

The constant phase element (CPE) components in Fig. 3 (a) are parameterised in two ways, following Buteau and Dahn [61]; for components tending ideally towards a capacitance (CPE_ct), a characteristic capacitance C_dl,k and phase angle α_dl,k are used, while for the component tending ideally towards a Warburg impedance (CPE_diff), a capacity Q_diff and phase angle α_diff are used. A 1ZARC equivalent circuit can also be achieved by setting R_ct,2 = 0 and removing α_dl,2 and C_dl,2 from the fitting process; the 1ZARC circuit is more appropriate if only one arc can be discerned in the EIS data. The detailed fitting algorithm is described in Supplementary Material S.2.
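For orientation, the impedance of a generic two-ZARC circuit can be evaluated as in the following minimal sketch. This is a common textbook form with ten parameters (series inductance and resistance, two ZARC arcs, and a CPE for diffusion); it is an assumption made for illustration and does not reproduce the 12-coefficient expression of Equation (1) or the exact parameterisation of Buteau and Dahn [61]. All parameter names and values are hypothetical.

```python
def zarc(omega, r, c, alpha):
    """Resistor in parallel with a CPE, written with a characteristic capacitance."""
    return r / (1.0 + (1j * omega * r * c) ** alpha)


def cpe(omega, q, alpha):
    """Constant phase element; alpha = 0.5 approaches Warburg-like behaviour."""
    return 1.0 / (q * (1j * omega) ** alpha)


def two_zarc_impedance(omega, p):
    """Generic series combination: L, R_ohm, two ZARC arcs and a diffusion CPE."""
    z = 1j * omega * p["L"] + p["R_ohm"]
    z += zarc(omega, p["R_ct1"], p["C_dl1"], p["alpha_dl1"])
    z += zarc(omega, p["R_ct2"], p["C_dl2"], p["alpha_dl2"])
    z += cpe(omega, p["Q_diff"], p["alpha_diff"])
    return z


# Hypothetical parameter values, chosen only to exercise the functions.
params = {
    "L": 5e-7, "R_ohm": 0.020,
    "R_ct1": 0.005, "C_dl1": 0.5, "alpha_dl1": 0.9,
    "R_ct2": 0.010, "C_dl2": 5.0, "alpha_dl2": 0.8,
    "Q_diff": 500.0, "alpha_diff": 0.5,
}
z_example = two_zarc_impedance(2.0 * 3.141592653589793 * 1.0, params)

# Setting the second arc resistance to zero removes that arc (1ZARC case).
one_arc = dict(params, R_ct2=0.0)
z_one_arc = two_zarc_impedance(2.0 * 3.141592653589793 * 1.0, one_arc)
```

In a fit, such a function would be evaluated at every measured frequency and its parameters adjusted to minimise the deviation from the measured spectrum.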

Distribution of relaxation times (DRT) fitting
The DRT method represents an impedance spectrum as the response of a number of RC elements in series [62,63]. DRT analysis assumes that the process contributing to each RC element is best described as the sum of a (nominally) infinite number of contributing serial RC elements with relaxation times distributed around a time constant τ. This distribution reflects the complex interaction of transport processes and charge transfer reactions in the heterogeneous porous electrodes [59]. For an electrochemical system, such as a lithium-ion cell, that exhibits impedance behaviour that cannot be entirely described by RC circuit elements, impedance data must be pre-processed prior to the DRT analysis; contributions to the impedance that do not possess RC circuit properties are removed from the spectrum [64]. The pre-processing approach is described fully in Supplementary Material S.3.
From the pre-processed spectra, DRT responses were deconvoluted using the program DRTtools developed by Wan et al. [65]. For a meaningful analysis of impedance spectra by means of DRT, two aspects are crucial: (a) the selection of real and/or imaginary impedance data to be used for the calculation (in this study a combined data set of real and imaginary parts was used); (b) the choice of an appropriate value of the regularisation parameter λ for the DRT calculation. It was found that in order to maintain a consistent identity of DRT features across the full SOH range, it was necessary to modify λ as a function of SOH. Full details of the data and regularisation parameter selection are given in Supplementary Material S.3.
For almost all spectra, the DRT analysis yielded the same number of time constants at each SOC. Five time constants τ_1 to τ_5 (enumerated by increasing magnitude of the time constant) were derived from each impedance spectrum recorded at SOC 100%, along with the corresponding resistance (R) and capacitance (C). Thus, 15 DRT-based impedance parameters in total were identified from each spectrum; as each time constant τ equals the product of the corresponding R and C values, only 10 of these parameters are independent.
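The relation between the three parameters of each DRT feature can be illustrated with a minimal sketch in which every feature is idealised as a single parallel RC element. This idealisation, the function names and the numerical values are assumptions for illustration; they do not reproduce the DRTtools deconvolution.

```python
def time_constant(resistance_ohm, capacitance_f):
    """Relaxation time of an RC element: tau = R * C."""
    return resistance_ohm * capacitance_f


def series_rc_impedance(omega, rc_pairs):
    """Impedance of parallel RC elements connected in series."""
    return sum(r / (1.0 + 1j * omega * time_constant(r, c)) for r, c in rc_pairs)


# Hypothetical (R, C) pairs, ordered by increasing time constant.
rc_pairs = [(0.002, 0.5), (0.004, 5.0), (0.010, 50.0)]
taus = [time_constant(r, c) for r, c in rc_pairs]
z_dc = series_rc_impedance(0.0, rc_pairs)
```

Because each τ follows from its (R, C) pair, storing R and C alone is sufficient, which is why only two of the three parameters per feature are independent.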

Methodology for SOH estimation model training
In this section we describe a consistent methodology for creating SOH estimation models using EIS or NFRA measurement data. All numerical methods were implemented in MATLAB (MathWorks, version R2019b or later).

Data reduction
To aid model training, data reduction is undertaken using various approaches; following data reduction, highly correlated ageing parameters are selected. We consider only data from experiments where the measured SOH is between 70% and 95%. This restriction ensures that the SOH models are applicable in the practical range of interest for second-life applications. To predict the highly nonlinear or anomalous ageing behaviour observed (Fig. 1) for fresh cells (SOH >95%) and highly aged cells (SOH <70%) would require a much more complicated and, overall, less accurate model, without providing additional insight in the SOH range of primary interest to our application.

Ageing parameter selection
Following data reduction according to SOH, suitable ageing parameters that correlate with the SOH fade are identified from each of the impedance-based methods. The ageing parameters take the following forms: for the direct impedance-based SOH model, raw EIS spectral data, namely real (Z′) and imaginary (Z″) impedances at the characterised SOCs and frequencies (Table 1); for the EC-based SOH model, the EC model parameters as shown in Table S.1 (Supplementary Material); for the DRT-based SOH model, time constants and their respective resistances and capacitances; and lastly, for the NFRA-based SOH model, raw NFR spectral data, namely second (Y_2) and third (Y_3) harmonics at the characterised SOCs and frequencies (Table 2).
For each method, consideration is given to whether a model could be prepared using a single choice of SOC, so that only one spectrum would need to be recorded in practice. Priority is given to models using impedance data at SOC 100% for practical reasons, since this cell condition can be more easily achieved than other SOCs (it is attainable by charging to a defined voltage). We recognise that the high sensitivity of impedance to SOC at this SOC extreme might be an argument to discourage this selection; this issue should be evaluated more fully by future work, but for the purpose of the present study we emphasise projected practicality of the methods.
The strength of the correlation between the ageing parameters and the SOH fade is assessed via Spearman rank correlation (Supplementary Material S.4), a generalized correlation method that has been used for battery SOH correlation analysis previously [25]. Complementary results from Pearson correlation analysis are given in Supplementary Material S.5.
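For readers unfamiliar with rank correlation, the following minimal sketch computes the Spearman coefficient as the Pearson correlation of the ranks, with tied values receiving their average rank. The data are hypothetical; the sketch is independent of the MATLAB implementation used in this work.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = 0.5 * (i + j) + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: impedance rises monotonically but nonlinearly as SOH falls.
soh_values = [95.0, 90.0, 85.0, 80.0, 75.0]
impedance_mohm = [20.0, 21.0, 23.5, 28.0, 40.0]
rho = spearman(soh_values, impedance_mohm)
r_linear = pearson(soh_values, impedance_mohm)
```

In this example the relationship is monotonic but not linear, so the rank correlation reaches its extreme value while the Pearson coefficient does not, which illustrates why the two analyses are complementary.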

Model definition and training
Once ageing parameters have been selected with the support of the correlation coefficient information, an empirical model is developed using the training data set. Models are trained using stepwise linear regression to coefficients of a quadratic function of SOH on ageing parameters x_i, combined with binary crosswise terms as the product of pairs of ageing parameters x_i and x_j (Equation (2)). Details of the regression algorithm are given in Supplementary Material S.6. This particular regression approach was chosen to provide a simple and repeatable model applicable to all impedance-based methods.
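The structure of such a model can be sketched as follows: each set of ageing parameters is expanded into a constant, linear, squared and pairwise product terms, and the SOH prediction is a weighted sum of these terms. The ordering of the terms, the function names and the coefficient values are assumptions for illustration; the sketch shows only the model evaluation and does not implement the stepwise term selection.

```python
from itertools import combinations


def quadratic_terms(x):
    """Constant, linear, squared and pairwise cross terms of the inputs."""
    terms = [1.0]
    terms.extend(x)
    terms.extend(v * v for v in x)
    terms.extend(a * b for a, b in combinations(x, 2))
    return terms


def predict_soh(x, coefficients):
    """Evaluate the quadratic model; dropped terms carry a zero coefficient."""
    terms = quadratic_terms(x)
    if len(terms) != len(coefficients):
        raise ValueError("coefficient count does not match the number of terms")
    return sum(c * t for c, t in zip(coefficients, terms))


# Two hypothetical ageing parameters give six candidate terms:
# 1, x1, x2, x1^2, x2^2, x1*x2.
example_terms = quadratic_terms([2.0, 3.0])
example_soh = predict_soh([2.0, 3.0], [95.0, -1.5, -0.5, 0.0, 0.0, -0.1])
```

With four ageing parameters the expansion yields 15 candidate terms, from which a stepwise procedure would retain only those that improve the fit.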

Model validation and uncertainty quantification
Once a model is trained, there exists an uncertainty for the predicted SOH from given test data. When evaluating this uncertainty, three contributions must be considered. First, there exists uncertainty within the training data used for the model (so-called "aleatoric uncertainty" [66]); that is, both measurement uncertainty and cell-to-cell variation in the EIS spectrum corresponding to a particular SOH value will propagate during model training into model error. Second, there exists uncertainty due to the quality of the chosen SOH model, in terms of how well Equation (2) describes the underlying relations between the ageing parameters and the SOH (so-called "epistemic uncertainty" [66]). Lastly, the experimental uncertainties of the new test measurement will propagate through the model.
Despite contemporary efforts, to the authors' knowledge there exists no general methodology in the field of uncertainty quantification for machine learning techniques that allows the independent evaluation of these contributions, either with respect to original training data or new test data. In this work, we take a preliminary model validation approach by means of a partial but quantitative assessment of uncertainty, using two standard methods: cross-validation and uncertainty propagation.
Cross-validation is conventionally used to assess model fit quality empirically when only one training data set is available: the data are repeatedly divided into a portion of training data used to establish fit coefficients, and independent test data which are used to quantify the accuracy of that fit [67]. We apply a four-fold cross-validation procedure to each of the models, by partitioning at random the data set of experiments (in the form of ageing parameters computed for each experiment) into four subsets (approximately 25% of input in each). Four regression models were then trained for each method; each model successively uses one of the four subsets as test data for validation, while the model is trained on the combination of the three remaining subsets. Root mean square error (RMSE) is computed for the model SOH predictions on the test data, by comparison to the corresponding measured SOH values. The mean of the RMSE for all four trained models gives an estimate of the combined aleatoric and epistemic uncertainties for SOH estimation. This quantity is used below as a quality criterion for model comparison.
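The four-fold procedure can be sketched as follows. For brevity, the sketch uses a straight-line fit on a single ageing parameter as a stand-in for the regression model; this stand-in, the random seed and the synthetic data are assumptions for illustration and do not correspond to the models trained in this work.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def rmse(predicted, measured):
    """Root mean square error between predictions and measurements."""
    return math.sqrt(
        sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)
    )


def k_fold_rmse(xs, ys, k=4, seed=0):
    """Randomly partition the data into k folds; return mean and per-fold RMSE."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        predicted = [slope * xs[i] + intercept for i in test_idx]
        scores.append(rmse(predicted, [ys[i] for i in test_idx]))
    return sum(scores) / k, scores


# Synthetic, noise-free data: SOH falls linearly with a hypothetical parameter.
xs = [float(i) for i in range(20)]
ys = [95.0 - 1.2 * x for x in xs]
mean_rmse, fold_rmse = k_fold_rmse(xs, ys)
```

With real data, the per-fold RMSE values differ, and their mean serves as the quality criterion described above.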
Uncertainty propagation can be used to estimate the likely proportion of the empirically observed RMSE attributable to experimental uncertainty. Specifically, identified experimental input uncertainties are propagated through the trained model to a resulting (partial) SOH prediction uncertainty. This propagated uncertainty does not include the contributions from cell-to-cell variation in the original data set, or model quality, and so it is not a unique value for SOH uncertainty that is directly useable with model predictions; rather, the goal of this analysis is to assess whether experimental contributions are a negligible or substantial proportion of the overall empirical RMSE.
Four potential influencing factors from the experiments were identified, which contribute to aleatoric uncertainty: temperature (inhomogeneities in the temperature distribution in the measurement chamber and in the cell), SOC (the desired SOC can only be adjusted with a certain degree of accuracy), measurement time (cells never fully equilibrate), and the calibration of the impedance meter [68]. In this work we introduce the consideration of propagated partial uncertainties for SOH prediction models by considering the influence of temperature and SOC only. Characterisation experiments needed to evaluate these input uncertainties are described comprehensively in Supplementary Material S.8. The corresponding evaluations for measurement time and impedance meter calibration exceeded the scope of this work, but these influencing factors would necessarily form part of a more comprehensive uncertainty budget and would be recommended for future study. For the EIS-based models, the uncertainties due to SOC adjustment are disregarded, since all ageing parameters were selected from data at SOC 100%, which occurs at a well-defined voltage level (end-of-charge cut-off at 4.2 V). For the NFRA-based models, NFR data at SOC 80% were considered; as such, besides cell temperature, the additional uncertainty due to SOC adjustment needs to be considered.
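A first-order propagation of input uncertainties through a trained model can be sketched as follows, assuming uncorrelated inputs and using numerically estimated sensitivity coefficients. The model, the input values and the standard uncertainties are hypothetical; in practice, the uncertainty of each ageing parameter would first have to be derived from its sensitivity to temperature and SOC.

```python
import math


def combined_standard_uncertainty(model, x, u, rel_step=1e-6):
    """First-order propagation for uncorrelated inputs.

    Sensitivity coefficients are estimated by central finite differences.
    """
    total = 0.0
    for i, (xi, ui) in enumerate(zip(x, u)):
        h = rel_step * max(abs(xi), 1.0)
        upper = list(x)
        lower = list(x)
        upper[i] = xi + h
        lower[i] = xi - h
        sensitivity = (model(upper) - model(lower)) / (2.0 * h)
        total += (sensitivity * ui) ** 2
    return math.sqrt(total)


def hypothetical_soh_model(x):
    """Stand-in for a trained SOH model with two ageing parameters."""
    return 100.0 - 2.0 * x[0] - 0.5 * x[1]


inputs = [4.0, 6.0]          # hypothetical ageing parameter values
input_u = [0.3, 0.4]         # hypothetical standard uncertainties
u_soh = combined_standard_uncertainty(hypothetical_soh_model, inputs, input_u)
```

For strongly nonlinear models or correlated inputs, the first-order assumption may not hold and a sampling-based approach would be preferable.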
For the simpler cases of the direct impedance- and NFRA-based models, which use raw data points as ageing parameters, the estimation of uncertainty contributions was calculated on the basis of the 'Guide to the expression of uncertainty in measurement' [69] (see Supplementary Material S.7). For the models using processed data (EC-based and DRT-based), the ageing parameters are no longer simply discrete impedance values; hence, a more complex Monte Carlo approach [70] is required. This method is described fully at the point of use in Section 4.5.2 below.
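The principle of a Monte Carlo uncertainty evaluation can be sketched as follows: the inputs are repeatedly drawn from assumed distributions, the model is evaluated for each draw, and the spread of the outputs is taken as the propagated uncertainty. The normal input distributions, the number of draws and the stand-in model are assumptions for illustration; for processed data, each draw would presumably also involve repeating the EC or DRT fit on the perturbed spectrum.

```python
import random
import statistics


def monte_carlo_uncertainty(model, x, u, draws=20000, seed=1):
    """Mean and standard deviation of model outputs for normally distributed inputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(draws):
        sample = [rng.gauss(xi, ui) for xi, ui in zip(x, u)]
        outputs.append(model(sample))
    return statistics.mean(outputs), statistics.stdev(outputs)


def stand_in_model(x):
    """Hypothetical trained SOH model with two ageing parameters."""
    return 100.0 - 2.0 * x[0] - 0.5 * x[1]


mc_mean, mc_std = monte_carlo_uncertainty(stand_in_model, [4.0, 6.0], [0.3, 0.4])
```

Unlike a first-order propagation, this approach requires no derivatives of the model and therefore remains applicable when the ageing parameters result from a nonlinear fitting step.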

Development and evaluation of SOH estimation models
In this section, the proposed framework for developing an SOH estimation model (Section 3.4) is applied to each impedance-based method, using the data set described in Section 3.1. The resulting models based on the four impedance-based methods are evaluated, compared, and discussed in terms of their estimation accuracy, uncertainty, and practical implementation.

Direct impedance-based SOH model
In the following, a model based on raw impedance data is developed and evaluated. To establish a reduced set of frequencies at which raw impedance data show good correlation to SOH, the impedance data at all measured SOCs and frequency points were analysed against the SOH fade via Spearman rank correlation. Fig. 4 (a) and (b) show the correlation coefficients: light colours indicate that the absolute magnitude of the correlation coefficient is close to 1, which means that impedance values in this region are strongly correlated to the SOH fade. In general, Z′ shows strong correlation to the SOH fade over a large part of the frequency and SOC range, with the strongest correlation at high SOC and low frequency. Similarly, Z′′ at high SOC and low frequency also shows strong and linear correlation to the SOH fade. From this correlation analysis, the following four spectral features were identified as the ageing parameters best correlated to the SOH fade for the direct impedance-based SOH model: 1) Z′ at SOC = 100% and 0.03 Hz, 2) Z′ at SOC = 100% and 0.01 Hz, 3) Z′′ at SOC = 100% and 0.2 Hz, 4) Z′′ at SOC = 100% and 0.5 Hz. All identified ageing parameters are at SOC 100%, due to the high sensitivity towards the degradation behaviour of the cell, as also qualitatively visible from Fig. 2. This is especially helpful in consideration of practical implementation of the impedance-based SOH estimation, due to the straightforward experimental accessibility of the SOC 100% state (see Section 3.4). Therefore, the analysis of the impedance spectra fits using EC and DRT methods is also confined to SOC 100% in the following sections.
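The screening step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the feature keys, the helper names (`ranks`, `spearman`, `rank_features`) and the numerical values are hypothetical, and the Spearman coefficient is computed from first principles as the Pearson correlation of the ranks.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            result[order[pos]] = mean_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rank_features(features, soh, top=4):
    """Score each (quantity, SOC, frequency) feature by |Spearman rho| to SOH."""
    scored = [(abs(spearman(values, soh)), key) for key, values in features.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]


# Hypothetical data: one value per characterisation measurement of a cell.
soh = [100.0, 97.0, 94.0, 90.0, 85.0, 80.0]
features = {
    ("Z_re", 100, 0.01): [20.0, 20.6, 21.5, 22.9, 24.8, 27.0],   # rises as SOH fades
    ("Z_re", 50, 1000.0): [15.0, 15.2, 14.9, 15.1, 15.0, 15.3],  # mostly scatter
}
best = rank_features(features, soh, top=1)
```

Applied to the full grid of SOCs and frequencies, the same scoring would yield the values plotted as a heatmap; keeping the top few entries corresponds to selecting a reduced set of ageing parameters.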
Instead of taking the absolute value of the selected impedance data points as the model predictors, the model takes as input the difference between the impedance at a certain aged state and the initial value at pristine state.This is to avoid the systematic error that could possibly arise from the measurement setup, as the LCTs were conducted with different devices at different institutions.
The considered training and test data sets were the impedance spectra of the 9 cells from four different institutions (Fig. 1). Firstly, impedance points were selected from these data based on the four identified ageing parameters. Fig. 5 (a) plots the results of the four-fold cross-validation and demonstrates that an average root mean square (RMS) error of <1% SOH units can be attained (full cross-validation results are shown in Figure S.10, Supplementary Material). This suggests that a full impedance spectrum is not required; instead only a few selected impedance points, i.e., at low frequency and high SOC range, are needed to achieve an acceptable SOH prediction accuracy.
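The differencing and four-fold cross-validation described above can be illustrated as follows. This is a sketch under stated assumptions: the regression model form is defined in Section 3.4 and is not reproduced here, so an ordinary least-squares linear model is assumed purely for illustration, and all data, coefficients and function names are synthetic or hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_linear(rows, targets):
    """Ordinary least squares with intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    ata = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    aty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve(ata, aty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def difference_from_fresh(aged, fresh):
    """Predictors are changes relative to the pristine cell, not absolute values."""
    return [a - f for a, f in zip(aged, fresh)]


def cross_validate(rows, targets, folds=4, seed=0):
    """Return the test RMS error of each fold (same units as targets)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    rmse = []
    for k in range(folds):
        test = set(idx[k::folds])          # roughly 25% held out
        train = [i for i in idx if i not in test]
        coef = fit_linear([rows[i] for i in train], [targets[i] for i in train])
        sq = [(predict(coef, rows[i]) - targets[i]) ** 2 for i in test]
        rmse.append((sum(sq) / len(sq)) ** 0.5)
    return rmse


# Synthetic data: two impedance features (arbitrary units) and SOH in %.
rng = random.Random(1)
fresh = (20.0, 5.0)
rows, targets = [], []
for _ in range(40):
    d1, d2 = rng.uniform(0.0, 6.0), rng.uniform(0.0, 2.0)
    aged = (fresh[0] + d1, fresh[1] + d2)
    rows.append(difference_from_fresh(aged, fresh))
    targets.append(100.0 - 3.0 * d1 - 2.0 * d2 + rng.gauss(0.0, 0.2))

fold_rmse = cross_validate(rows, targets)
mean_rmse = sum(fold_rmse) / len(fold_rmse)
```

Differencing against the fresh-cell value removes any constant offset that a given measurement setup adds to a feature, which is the motivation given above for not using absolute impedance values.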

Equivalent circuit-based SOH model
An SOH model is developed using the coefficients derived from equivalent circuit fits to the EIS data (Section 3.2), according to the general methodology of Section 3.4 as follows. First, in addition to the overall data reduction according to SOH as described in Section 3.4, and the limitation to SOC 100% as established in Sections 3.4 and 4.1, the following data are excluded from the analysis:
• All experiments for which a 1ZARC fit was returned, since these experiments exhibit a different set of ageing parameters which cannot be compared consistently to the 2ZARC spectra. In general, the discarded 1ZARC fits were for cells very early in the ageing process (SOH >95%).
• All experiments for which the normalised residual for the equivalent circuit fit was >5 × 10⁻³; this removes a small fraction (<5%) of experiments where the fit quality was poor (see Figure S.2, Supplementary Material), such that the EC coefficients do not accurately represent the underlying impedance data.
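The exclusion rules listed above amount to a simple filter over the fitted experiments. The sketch below assumes a hypothetical record type `Fit` with illustrative field names; only the SOC 100% restriction, the 2ZARC requirement and the residual threshold of 5 × 10⁻³ are taken from the text.

```python
from dataclasses import dataclass

RESIDUAL_LIMIT = 5e-3  # fits with a normalised residual above this are excluded


@dataclass
class Fit:
    soc: int         # state of charge in %
    n_zarc: int      # 1 for a 1ZARC fit, 2 for a 2ZARC fit
    residual: float  # normalised residual of the equivalent circuit fit


def keep(fit):
    """Retain only 2ZARC fits at SOC 100% with acceptable fit quality."""
    return fit.soc == 100 and fit.n_zarc == 2 and fit.residual <= RESIDUAL_LIMIT


# Hypothetical records for illustration only.
fits = [
    Fit(soc=100, n_zarc=2, residual=1.2e-3),  # kept
    Fit(soc=100, n_zarc=1, residual=0.8e-3),  # dropped: 1ZARC fit
    Fit(soc=100, n_zarc=2, residual=7.5e-3),  # dropped: poor fit quality
    Fit(soc=50, n_zarc=2, residual=1.0e-3),   # dropped: not at SOC 100%
]
selected = [f for f in fits if keep(f)]
```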
Correlation coefficients were assessed between SOH and each individual equivalent circuit coefficient (Fig. 6 (a)).
Corrections for inductive effects (coefficients L_s, L_p, R_p) show weak correlation, which is expected as these coefficients reflect external measurement circuitry and should be independent of cell state. Coefficients associated with the higher-frequency ZARC (enumerated with subscript 1) also show relatively weak dependence. The strongest dependences are for series resistance R_s and the lower-frequency ZARC parameters. Since the absolute value of the series resistance cannot be expected to be comparable between different measurement configurations, models were trained through fitting to the 5 coefficients associated with the low-frequency features (R_ct,2, C_dl,2, α_dl,2, Q_diff, α_diff), according to the method described in Section 3.4. The RMS error is consistently <1.5% SOH units, suggesting a consistently good predictive performance. While the error is slightly higher than from the direct impedance-based model, the EC fits have the increased flexibility of not requiring a reference for the impedance value of the fresh cell, once the SOH model is prepared.

Distribution of relaxation times (DRT)-based SOH model
An SOH model is developed using the coefficients derived from DRT coefficient data (Section 3.3), according to the general methodology of Section 3.4 as follows. To identify DRT-based impedance parameters that are suitable as predictors for an SOH estimation model, Spearman correlation coefficients describing the relation with the SOH were calculated (Fig. 6 (b)). The Spearman rank correlation coefficients consistently show strong correlations between the SOH and the impedance parameters derived from the low-frequency range of the spectrum, while the correlation of parameters from the high-frequency range is weaker. The parameters τ_4, τ_5, C_4, and R_5 have coefficients greater than 0.7 and therefore appear to be suitable as SOH predictors. Fig. 5 (c) shows the results for the model M1 (see Table 3; full cross-validation results are shown in Figure S.12, Supplementary Material). The mean RMS error of the four iterations of the cross-validation is 1.05% SOH units.
In addition to this model, which uses all four parameters as predictors, further models were tested: models each omitting one of the parameters, and a model using only the parameters R_5 and τ_5, as these have the highest correlation coefficients. The most accurate prediction is provided by model M1, in which all four impedance parameters are used as predictors.
The pronounced dependence of impedance spectra of lithium-ion batteries on SOC is reflected in the derived DRT plots (Fig. 7). Especially in the high time constant range, the distributions differ significantly with respect to their absolute value and peak area; furthermore, they have different numbers of distinguishable signals. Therefore, in a supplementary approach, the similarities of impedance spectra measured at different SOCs were considered in more detail and a corresponding SOH estimation model using impedance data from multiple SOCs was trained. From Spearman correlation analysis, the parameters R_1, R_2, C_1, and C_2 were identified to have a strong correlation with the SOH. As for the first approach, different SOH estimation models were trained using different combinations of the four identified parameters.
Compared to the first approach, the predictive strength of this approach is significantly lower. The mean RMS error is 3.37% SOH units (Table 4; full cross-validation results are shown in Figure S.13, Supplementary Material). Nevertheless, the model has some practical advantages. For SOH estimation, impedance data obtained at any SOC can be used, except data from fully charged cells. This means that the cells do not have to be charged to a defined SOC first. Furthermore, the frequency range considered in this model contributes further to the reduction of the measurement duration. The presented model therefore represents a tool that enables a very rapid rough estimation of the SOH. However, the range of application is partly limited: for SOH >80%, there is no apparent correlation between the capacitance C_1 and the SOH (see Figure S.8, Supplementary Material).

Nonlinear frequency response analysis (NFRA)-based SOH model
Finally, an SOH estimation based on NFRA measurements (input data set as described in Section 3.1) is developed and evaluated according to the methodology of Section 3.4. Similar to the direct impedance-based model (Section 4.1), sensitivities with respect to SOH of the raw data (in this case, higher harmonic voltage magnitudes Y_2 and Y_3) were evaluated for data across all characterised SOCs and frequencies via Spearman rank correlation as described in Section 3.4 to obtain ageing parameters for the NFRA-based model. The correlation coefficients are shown in Fig. 4 (c) and (d) and confirm the qualitative observations (Section 3.1) that the NFR signals at low frequency and high SOC show the strongest correlation to the SOH fade; a good correlation can also be seen in the range around 80 Hz, which may be especially interesting for fast measurement. The following most strongly correlated ageing parameters were identified: 1) Y_2 at SOC 80% and 8.13 Hz, 2) Y_2 at SOC 80% and 0.25 Hz, 3) Y_3 at SOC 80% and 5.89 Hz, 4) Y_3 at SOC 80% and 0.25 Hz.
The four selected ageing parameters were then differenced from their fresh-cell values and used as predictors for model training/validation. The mean RMS error for SOH estimation with the NFRA-based model from four-fold cross-validation (Section 3.4) is approximately 1.1% SOH units (Fig. 5 (d); full cross-validation results are shown in Figure S.14, Supplementary Material), which is in the same range of prediction accuracy as the three impedance-based approaches described above. Using this method would thus be feasible to estimate the SOH of a cell. Adjustment to 80% SOC would mean that the battery need not be fully charged for the measurement; however, it would imply that the SOC can be reliably measured, e.g., by correlation to voltage. Whether an application of the method at a high SOC value of 100% is possible without harming the cell would need to be evaluated.

Propagated input uncertainty evaluation
For each developed model, an evaluation was undertaken of partial uncertainties due to propagated input uncertainties from experimental sources, as discussed in Section 3.4.4.
The sensitivity coefficients for impedance and NFR signals with respect to temperature fluctuation were assessed by conducting EIS measurements at three different controlled temperatures. The impact of temperature variation of ±2-3 K on EIS and NFR spectra is shown in Figure S.6 (a) and Figure S.6 (b) (Supplementary Material), respectively. It can be seen that temperature variation has a significant impact on both impedance and NFR spectra, especially in the low-frequency range. This is highly relevant to the proposed SOH estimation methodology as the ageing parameters identified for SOH prediction mainly originate in this area. Hence, it is crucial to quantify the impact of the temperature variation on the SOH prediction uncertainties. The temperature sensitivity coefficient was established as a function of frequency by approximating the impedance data Z(T,f) using linear regression across the 3 temperature values (Equation (3)). As temperature sensitivity data are only available at discrete frequency values, the derivative ∂Z/∂T in Equation (3) is interpolated to other frequencies where required using a Hermite polynomial interpolation.
The transient temperature fluctuation inside the temperature-controlled chamber is characterised according to the manufacturer's specification for the hardware used in the temperature variation study (Weiss WK3-340/70). In this case, the temperature fluctuation range is ±0.5 K (assumed uniform distribution across this range). It should be noted that this degree of temperature control applies to the chamber air and not necessarily to cell temperature itself; we consider it reasonable as an input uncertainty for the present study, but further experimental investigation to quantify cell temperature uncertainty would improve the quality of uncertainty determination. Also, this accuracy would not be so straightforwardly achievable in targeted analysis of cells within a battery pack.
The SOC sensitivity of the NFR spectra was assessed by conducting NFR measurements at 3 different controlled SOCs (see data in Figure S.6 (c), Supplementary Material). It is noticeable that temperature variation has a significantly larger impact on the NFR signals, and that this applies predominantly in the lower frequency range. Meanwhile, the variation in SOC barely alters the NFR signals.
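The linear regression across three temperatures described above can be sketched as a per-frequency least-squares slope. The function name, the temperature set points and the impedance values below are hypothetical; the subsequent Hermite interpolation of ∂Z/∂T to other frequencies is not shown.

```python
def temperature_sensitivity(temps, spectra):
    """Least-squares slope dZ/dT at each frequency point.

    temps   : list of temperatures (one per measured spectrum)
    spectra : list of spectra, each a list of complex impedances measured
              on the same frequency grid
    """
    n = len(temps)
    t_mean = sum(temps) / n
    denom = sum((t - t_mean) ** 2 for t in temps)
    slopes = []
    for j in range(len(spectra[0])):
        z_mean = sum(s[j] for s in spectra) / n
        num = sum((t - t_mean) * (s[j] - z_mean) for t, s in zip(temps, spectra))
        slopes.append(num / denom)  # complex: real and imaginary parts fitted together
    return slopes


# Hypothetical set points and a two-point spectrum (values in mOhm).
temps = [22.0, 25.0, 28.0]
reference = [complex(20.0, -2.0), complex(18.0, -3.0)]
true_slopes = [complex(-0.40, 0.05), complex(-0.10, 0.02)]
spectra = [
    [z + s * (t - 25.0) for z, s in zip(reference, true_slopes)]
    for t in temps
]
dz_dt = temperature_sensitivity(temps, spectra)
```

Because the fit is linear in Z, applying it to complex values is equivalent to fitting the real and imaginary parts separately.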

Uncertainties for the direct impedance-based model
For the direct impedance-based model, the inputs to the SOH model are raw impedance data, whose uncertainties were quantified using Equation (S.8).
Table S.2 (Supplementary Material) summarises the individual uncertainties of the selected ageing parameters due to temperature variation, as computed using Equation (S.8). These individual uncertainties of the selected ageing parameters were then propagated through the SOH model to establish an uncertainty in the predicted SOH as shown in Fig. 8. The uncertainty attributable to temperature variation for the direct impedance-based model remains below 0.5%, which is significantly smaller than the SOH prediction RMS error of around 1%.
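For orientation, a first-order propagation of this kind can be written out as below. This is a sketch and not a reproduction of Equation (S.8), which is given only in the Supplementary Material: it assumes a rectangular temperature distribution of half-width a = 0.5 K, a linear SOH model with hypothetical coefficients β_j acting on the ageing parameters Z_j, and a single common temperature driving all parameters.

```latex
% Standard uncertainty of a rectangular distribution of half-width a
u(T) = \frac{a}{\sqrt{3}} = \frac{0.5\ \mathrm{K}}{\sqrt{3}} \approx 0.29\ \mathrm{K}

% Uncertainty of each ageing parameter Z_j due to temperature
u(Z_j) = \left| \frac{\partial Z_j}{\partial T} \right| u(T)

% Propagation through an assumed linear model SOH = \beta_0 + \sum_j \beta_j Z_j,
% with all Z_j driven by the same temperature fluctuation
u(\mathrm{SOH})_T = \left| \sum_j \beta_j \, \frac{\partial Z_j}{\partial T} \right| u(T)
```

If the parameter uncertainties were instead treated as independent, the last line would become a root-sum-square of the individual terms β_j u(Z_j); which form applies depends on the definitions in the Supplementary Material.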

Uncertainties for the EC-based model
For Monte Carlo uncertainty analysis of the partial SOH uncertainty due to temperature fluctuation in the EC-based model, a set of n temperature values T_i is sampled at random from the defined probability distribution for temperature. For each original spectrum Z_k(f), a set of n temperature-distorted spectra Z_ki(f) is generated. EC coefficients are evaluated for each spectrum Z_ki(f) using the same fitting procedure as for the original spectra (see Supplementary Material S.2), and each set of resulting fit coefficients is passed into the SOH model to yield a predicted value SOH_ki. For each spectrum k, the uncertainty in the predicted SOH across all values T_i can then be evaluated based on the empirical probability distribution of the SOH_ki. The sample standard deviation of the set of values SOH_ki for each spectrum k is evaluated and interpreted as the uncertainty u_rel(SOH)_T due to temperature variability. The set of resulting uncertainties is plotted as a function of the measured SOH in Fig. 8. From this plot, the partial SOH uncertainty due to temperature fluctuation is generally less than 0.1% SOH units, suggesting that it is a relatively small contribution to the overall RMS error noted in the cross-validation experiment.
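The Monte Carlo procedure described above can be sketched as follows. Several elements are assumptions made for illustration: the distortion is taken to be a linear shift of each impedance point by the temperature sensitivity times the sampled temperature offset, and the `fit` and `soh_model` callables are trivial hypothetical stand-ins for the equivalent circuit fitting routine and the trained SOH model.

```python
import random
import statistics


def monte_carlo_soh_uncertainty(spectrum, dz_dt, fit, soh_model,
                                n=1000, half_width=0.5, seed=0):
    """Sample standard deviation of predicted SOH under temperature fluctuation.

    spectrum   : list of complex impedances Z_k(f)
    dz_dt      : temperature sensitivity dZ/dT at the same frequencies
    fit        : callable mapping a spectrum to fit coefficients (placeholder)
    soh_model  : callable mapping fit coefficients to a predicted SOH (placeholder)
    half_width : half-width of the assumed uniform temperature distribution, in K
    """
    rng = random.Random(seed)
    predictions = []
    for _ in range(n):
        delta_t = rng.uniform(-half_width, half_width)
        # Assumed distortion: first-order shift of every impedance point.
        distorted = [z + s * delta_t for z, s in zip(spectrum, dz_dt)]
        predictions.append(soh_model(fit(distorted)))
    return statistics.stdev(predictions)


# Hypothetical stand-ins so that the sketch runs end to end.
spectrum = [complex(20.0, -2.0), complex(18.0, -3.0)]
dz_dt = [complex(0.4, 0.0), complex(0.2, 0.0)]


def fit(s):
    return s[0].real                      # pretend the fit returns one resistance


def soh_model(r):
    return 100.0 - 2.5 * (r - 20.0)       # pretend SOH is linear in that resistance


u_soh = monte_carlo_soh_uncertainty(spectrum, dz_dt, fit, soh_model)
```

For this linear stand-in pipeline the result should approach |dSOH/dT| · 0.5/√3 ≈ 0.29 SOH units; the value of the Monte Carlo approach lies in cases where the fitting step is nonlinear and no such closed form exists.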

Uncertainties for the DRT-based model
The same data set of numerically distorted impedance spectra under random temperature fluctuations as derived for the EC-based model was applied to the DRT-based model. DRT fit coefficients were evaluated from the distorted spectra as described in Section 3.2; the SOH estimations were made by applying the optimized DRT-based SOH model M6 from Table 3. The resulting uncertainties of the SOH predictions are plotted in Fig. 8 as a function of the measured SOH. The plot shows that the highest uncertainties with respect to temperature fluctuation occur in the range of SOH >85%, where they are up to 0.3% SOH units. For SOH <85% the uncertainty is smaller than 0.1% SOH units, as for the EC-based model.

Uncertainties for the NFRA-based model
The quantified sensitivities with respect to SOC and temperature variation were used to calculate the corresponding uncertainties in the NFR equivalent of Equation (S.8). The uncertainty contribution u(T) is defined as in the direct impedance-based approach above. The uncertainty contribution u(SOC) denotes the current and voltage precision for the SOC adjustment, which is 0.1% according to the manufacturer's specification for the hardware used in the SOC variation study (BaSyTec XCTS).
Table S.3 (Supplementary Material) summarises the individual uncertainties of the selected NFR ageing parameters. Fig. 8 shows the propagated SOH uncertainty due to temperature fluctuation. Here, we note that in the range above 85% SOH, the propagated temperature uncertainty is larger for the NFRA-based approach; nonetheless, it is still below 1% SOH units. Propagated uncertainties due to uncertainties in SOC are significantly smaller than those for temperature and always below 0.1% SOH units, as shown in Figure S.7 (Supplementary Material).

Summary of uncertainty evaluation
Propagated uncertainty due to temperature fluctuation is generally below 0.5% SOH units and is below 0.1% SOH units across the bulk of the SOH range for the models using processed data.Significant propagated uncertainties from temperature and SOC fluctuation are observed only for the NFRA-based model at higher SOH.These values are substantially less than the SOH prediction RMSE values reported for the models above, which suggests that the total uncertainty of the models is more attributable to intrinsic cell-to-cell variation (aleatoric) and model quality (epistemic), rather than uncertainty contributions from experimental sources.We stress that this is an empirical observation specific to this data set; other cycling conditions or cell formats might yield a different relative balance of uncertainty contributions.

Comparison of methods
The comparison of the different methods is summarised in Table 5. All models show promising performance for SOH prediction from EIS or NFRA data (four-fold cross-validation mean RMSE <1.5% SOH units). In absolute terms, for the specific data set used, the direct impedance-based SOH model offers the best SOH prediction accuracy (RMSE 0.78% SOH units). The performance is quite similar, however, to the other data analysis methods, suggesting that all methods successfully represent an underlying correlation of SOH to frequency response properties of the cells, and no one method should be recommended as superior for predictive performance reasons alone.
A clear advantage of methods using raw data (direct EIS- and NFRA-based) is the simplicity of data gathering, as only a few discrete measurement frequencies are required, rather than a full spectrum. Moreover, no pre-processing is required, unlike the methods using processed data (EC- and DRT-based), in which the measured impedance spectrum must be pre-processed or translated into the relevant properties, i.e., EC or DRT coefficients. The compression of raw impedance data to a smaller number of more expressive coefficients potentially enables a more flexible SOH prediction approach, however, since a regression model can utilise simultaneous information from different parts of the measured frequency range. While this did not yield significant advantages in terms of prediction quality in this study, it could be a decisive advantage for other combinations of cycling condition, cell format and/or cell chemistry. The propagated uncertainty due to temperature fluctuation is overall higher for methods using raw rather than processed data (Fig. 8).
In all methods developed for data measured at a fixed SOC (SOC 100% for EIS, SOC 80% for NFRA), selected ageing parameters arose in the low frequency part of the spectrum (f < 100 Hz): discrete low frequencies were selected in the direct impedance-based and the NFRA-based models; coefficients for the lowest frequency ZARC element in the EC fit show the strongest correlation to SOH; and the peak with the slowest time constant in the DRT plot is most sensitive to SOH. From the presence of significant nonlinear responses in the NFR spectra in this frequency range, the observed ageing parameters are associated with the progressive slowing of the charge transfer (Li insertion or de-insertion) process in one of the electrodes of the cell, which is strongly correlated to its SOH fade. From frequency response data alone, it is not possible to state definitively whether these parameters measure a process responsible for SOH fade, or simply evidence a parallel degradation that is strongly correlated during continuous cycling to SOH fade arising due to other processes. Additional experimental investigation, including post-mortem analysis, is required to lend further insight.
The supplementary DRT-based SOH model demonstrates that high-frequency features can show correlation to SOH independently of SOC. This model would have practical benefits due to its SOC independence, in that cell charging is not required prior to SOH determination; furthermore, using ageing parameters at higher frequency significantly reduces the implied measurement time. In contemplating future development of this method, these advantages must be balanced against the lower SOH prediction accuracy (RMSE 3.37% SOH units) than models based on the strongest ageing parameters, and a much more limited SOH prediction range (principally below SOH 80%).

Conclusion and perspective
A consolidated data set was generated by recording EIS spectra alongside coulometric measurement of SOH during life cycle testing, using hardware at four different measurement institutions to assess reproducibility of observed behaviour. This data set was used to develop SOH models based on: direct utilisation of impedance data; coefficients from equivalent circuit fits; coefficients from DRT fits. Additionally, NFR spectra were recorded by one institution and used to develop an NFRA-based SOH model.
All SOH models demonstrate the feasibility of predicting SOH of aged cells from rapid experiments (EIS, NFRA) alone (mean RMS error from cross-validation in the range 0.75%-1.5% SOH units), once correlations have been established through life cycle testing.For models utilising measurement at only one SOC (for EIS, SOC 100%, obtainable by voltage measurement alone), the implied measurement time is shorter than a conventional capacity measurement (i.e., full charge-discharge cycle).A supplementary DRT-based SOH model demonstrated the possibility of predicting SOH without prior SOC adjustment; the SOH prediction accuracy of this model is, however, lower.Over the majority of the SOH range studied, overall SOH prediction uncertainty is principally governed by cell-to-cell variation; propagated partial uncertainties from an assessed experimental control factor (temperature fluctuation) are comparatively small.
We emphasise that the predictive performance relies on a sufficient quantity of training data, gathered under a relevant ageing regime.The range of applicability of any SOH model prepared through the methodology presented in this manuscript (inclusive of all data analysis methods studied) depends on the availability and quality of the training data.Our methodology indicates that up-front electrochemical data provision by primary producers of batteries could significantly reduce the experimental overhead for accurate characterisation of aged cells, with concomitant advantages to Li-ion cell asset valuation and the economics of cell second use.

Fig. 1 plots the SOH evolution for all cells against the number of elapsed cycles. A consistent trend is observed for the data from all four institutions, indicating reproducibility of cell behaviour and SOH measurement. Stronger cell-to-cell variation is noted principally at SOH

Fig. 1. Dependence of measured SOH on number of cycles (data for 9 cells total at 4 institutions). Cells are distinguished by colour according to the institution at which experiments were performed. Distinct symbols ('○', '+', '△') are used to identify individual cells measured at each institute. The expected useful range is 70%-95% SOH, corresponding to <2000 cycles. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
H.S. Chan et al.

Fig. 3. (a) 2ZARC equivalent circuit used for fits to the measured EIS data (1ZARC circuit is obtained by setting R_ct,2 = 0). CPE: constant phase element. (b) Example EC fit (red line) compared to experimental data (cell at SOC 100%, after 250 cycles corresponding to SOH ≈ 90%). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 4. Absolute value of Spearman correlation coefficients for the spectral parameters of (a) Z′, (b) Z′′, (c) Y_2 and (d) Y_3 to SOH as a function of frequency and SOC. Experimental data identical to those shown in Fig. 2. The black region of the heatmap plot indicates the unavailability of NFR data outside the NFR measurement frequency range.

Diffusive tail coefficients were included due to relatively strong Pearson correlation coefficients (Figure S.5 (a), Supplementary Material). The performance of the model was evaluated by four-fold cross-validation; results are shown in Fig. 5 (b) (full cross-validation results are shown in Figure S.11, Supplementary Material).

Fig. 5. Mean RMS error (SOH % units) and test data predictions from four-fold cross-validation of the trained SOH models: (a) direct impedance-based model; (b) EC-based model; (c) DRT-based model; (d) NFRA-based model. The four colours (blue, green, yellow, purple) represent different random samples of 25% of the full data set (different for each subfigure) which are respectively used as test data while the remaining 75% of the data set is used for training. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 6. Spearman rank correlation coefficients (a) for correlation of each EC coefficient with SOH, across the data set subject to the defined exclusions (SOC 100%, n = 165), and (b) for correlation of each DRT coefficient with SOH, across the data set subject to the defined exclusions (SOC 100%, n = 169).

Fig. 7. DRT plots of impedance spectra measured at different SOCs at Institution 1. The cell was cycled for 250 cycles and had an SOH of approximately 90%. Each line contains an arbitrary offset in the vertical axis for visual clarity.

Figure S.6 (a) and Figure S.6 (b) (Supplementary Material): impact of temperature variation of ±2-3 K on EIS and NFR spectra, respectively.

Fig. 8. Propagated uncertainty in SOH estimation due to temperature fluctuation as a function of SOH measurement based on the different SOH models. The inset shows the same data on an expanded vertical scale, for clarity.

Table 1
EIS measurement parameters and conditions.

Table 2
NFR measurement parameters and conditions.

Table 3
Performance of SOH estimation models based on different combinations of low-frequency DRT parameters.

Table 4
Performance of SOH estimation models based on different combinations of high-frequency DRT parameters.

Table 5
Summary of comparison of methods. Empirical uncertainty is assessed as mean RMSE from four-fold cross-validation.