Comparing Machine Learning Strategies for SoH Estimation of Lithium-Ion Batteries Using a Feature-Based Approach †

Abstract: Lithium-ion batteries play a vital role in many systems and applications, making them the most commonly used battery energy storage systems. Optimizing their usage requires accurate state-of-health (SoH) estimation, which provides insight into the performance level of the battery and improves the precision of other diagnostic measures, such as state of charge. In this paper, the classical machine learning (ML) strategies of multiple linear and polynomial regression, support vector regression (SVR), and random forest are compared for the task of battery SoH estimation. These ML strategies were selected because they represent a good compromise between light computational effort, applicability, and accuracy of results. The best results were produced using SVR, followed closely by multiple linear regression. This paper also discusses the feature selection process based on the partial charging time between different voltage intervals and shows the linear dependence of these features with capacity reduction. The feature selection, parameter tuning, and performance evaluation of all models were completed using a dataset from the Prognostics Center of Excellence at NASA, considering three batteries in the dataset.


Introduction
One of the primary challenges of modern life is global warming caused by the emission of greenhouse gases from burning fossil fuels, as well as the urgency to diminish reliance on non-renewable resources. Renewable energy generation has become a top priority for governments all around the world. Focusing on photovoltaics (PVs) and wind, which are generally considered non-dispatchable and only partially participate in maintaining grid stability [1], makes battery and other energy storage systems essential. Out of the various battery technologies currently in use, lithium-ion batteries have become the preferred choice, owing to their high power and energy density as well as long service life [2]. For these reasons, Li-ion batteries are used almost exclusively in electric vehicles [3], where maximizing energy density and minimizing the weight of the battery pack are crucial. Continuous research and investments are focused on this technology to improve its performance, robustness, and stability. A battery management system (BMS) is commonly used to ensure safe and efficient operation of the battery pack by controlling the charge and discharge processes of the cells and providing cell balancing. To achieve this task, the BMS must accurately estimate crucial battery parameters, such as the state of charge (SoC), state of health (SoH), and remaining useful life (RUL) [4]. The SoC is related to the available capacity of the battery; by knowing this factor, the BMS prevents overcharging or over-discharging. The SoH provides information about the aging status of the battery and is indicated by a rise in internal resistance or a decrease in capacity. The goal of the RUL prediction, on the other hand, is to understand how long a battery will continue to operate before it fails or performs unacceptably. Battery degradation is a highly variable process, depending on cell chemistry, the BMS, ambient conditions, and use patterns.
For this reason, a considerable amount of model-based [5][6][7][8][9] and data-driven battery aging methods used for the SoH and end-of-life predictions of batteries can be found in the literature.
Recently, data-driven methods have gained popularity due to the availability of vast amounts of data, gathered through sensors and other monitoring devices, and advancements in the field of machine learning (ML). They do not necessarily rely on prior knowledge of the particular battery cell and are less expensive to develop compared to model-based methods. Data-driven methods use large datasets to identify patterns and relationships that may not be easily discernible using traditional analytical techniques. Many different features, otherwise known as health indicators (HIs), have been used to build various ML strategies. Apart from capacity and resistance changes, these HIs are based on voltage charge-discharge limits, the amount of current, battery temperature [10][11][12][13][14], and incremental capacity analysis (ICA) [15][16][17][18], as well as features derived from statistical analysis of the other health indicators [19].
In the literature, many machine learning techniques have been studied and used to perform SoH estimation. The authors of [20,21] apply regression to model battery aging behavior and compare the RUL-prediction capabilities of two fitting functions, while in [22], a combination of an exponential function and regression analysis is used. The authors of [23] discuss a strategy based on support vector regression (SVR) and ICA curves obtained from partial charging data. Similarly, [14] uses partial charging segments of voltage under constant-current charging and a support vector machine model. Another SVR strategy is presented in [24], based on curves of battery voltage as a function of charging capacity (V-Q). Finally, in [25], a solution is proposed based on the random forest algorithm. In [13], a Gaussian process regression (GPR) model is used with four specific inputs extracted from the charging curves, and a grey relational analysis method is applied to analyze the relationship between features and SoH. The authors of [26] apply GPR to discover the relationship between capacity, storage temperature, and SoC of lithium-ion batteries. By optimizing the feature selection process with an automatic-relevance-determination (ARD) structure, they provide predictions for the calendar aging of batteries tested under different conditions. GPR combined with electrochemical impedance spectroscopy is used in [27], adopting many wave shapes to obtain an estimation of the capacity of the batteries. Novel health indicators related to the lithium diffusion coefficient are provided and validated.
In [28], a capacity-estimation method based on back-propagation neural networks (NNs) and partial charging voltage segments, corresponding to 10-50% SoC, has been developed. Another solution based on recurrent neural networks (RNNs) is proposed in [29], while in [30], an echo state network (ESN) has been used together with a model-based approach to predict the SoH evolution curve of the tested batteries, starting from cycles 80, 100, or 120. From the generated curves, predictions are made for the RUL. Due to the problem of vanishing or exploding gradients, traditional RNNs are not capable of dealing with long sequences in practice. The emergence of long short-term memory (LSTM) has provided a solution to this problem [31], and [32] utilized LSTM to build a RUL model of the lithium-ion battery. In [33], another method is proposed based on LSTM NNs and signal processing methods for SoH monitoring and RUL prediction of lithium-ion batteries.
In [34], the authors proposed an approach for SoH estimation based on SVR and a feature extraction procedure. In this paper, SVR is compared to other ML approaches, including multiple linear and polynomial regression and random forest. These classical ML strategies have been chosen because they offer a good compromise between light computational effort, applicability, and accuracy of results, while also providing higher model interpretability than complex NNs. The performances of all strategies are compared using a dataset from the Prognostics Center of Excellence at NASA, considering three batteries of the dataset. This work differentiates itself from the other aforementioned papers, including the ones employing the same NASA dataset [10][11][12][13][18,28,30,33], based on the specific ML strategies implemented, the features used, and the number of features. Discussion is provided on the feature selection process based on partial charging times between different voltage limits, as well as the parameter tuning process of the different strategies. Finally, this research had the goal of minimizing the necessary number of features, considering models based on one to four features, and achieving optimal results with only two features for all considered ML strategies.

NASA Dataset
The NASA Ames Prognostics Center of Excellence (PCoE) released a data repository composed of six datasets of aged Li-ion batteries [35]. However, only the first of these datasets is suitable for prognostic degradation prediction, according to their guidelines. In this work, batteries 5, 6, and 7 were considered, which were tested until failure. The charging process follows the constant-current (CC) and constant-voltage (CV) protocol. More specifically, the cells are charged with a current of 1.5 A until the upper voltage limit of 4.2 V is met, after which CV charging proceeds until the current drops below 20 mA. The discharge phase is carried out until the battery voltage drops to 2.7 V, 2.5 V, or 2.2 V, depending on the battery. Cycles are grouped into charge, discharge, or impedance cycles. For every cycle of every cell, various quantities are measured, including current, time, temperature, voltage, and discharge capacity. To control the environmental temperature, the tests were carried out in a climatic chamber.
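The CC-CV protocol just described can be sketched as a simple phase/termination check. This is an illustrative sketch of the logic only, not code from the dataset or this paper:

```python
def cccv_phase(voltage_v, current_a, v_limit=4.2, i_cutoff=0.020):
    """Charging phase implied by the CC-CV protocol:
    'CC'   -- constant current until the 4.2 V upper limit is met,
    'CV'   -- constant voltage until the current drops below 20 mA,
    'DONE' -- charge terminated."""
    if voltage_v < v_limit:
        return "CC"
    if current_a >= i_cutoff:
        return "CV"
    return "DONE"

cccv_phase(3.9, 1.5)    # → 'CC'   (below the voltage limit)
cccv_phase(4.2, 0.8)    # → 'CV'   (limit reached, current still tapering)
cccv_phase(4.2, 0.015)  # → 'DONE' (current fell below 20 mA)
```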

Multiple Linear Regression and Stepwise Regression
Multiple linear regression (MLR) is a statistical approach for modeling the relationship between a target variable (y) and two or more available descriptor variables (x_i), otherwise called features, using a linear equation. Regression models are usually fitted using the least-squares approach, which minimizes the sum of the squared differences between the predicted and actual values of the target variable. However, fitting based on other criteria can be performed, such as least absolute deviations or minimization of a penalized version of the least-squares function, as in the case of ridge and lasso regression. MLR is a powerful tool for analyzing complex relationships between variables, but it assumes that the relationships are linear. When this is not the case, better results could be obtained using polynomial regression, a statistical technique that models the relationship between x_i and y as an n-th degree polynomial, thus fitting a nonlinear relationship. Polynomial regression utilizing multiple features can have many potential terms, resulting from the features raised to a certain power or from their combinations.
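As a minimal illustration of the least-squares fitting described above, the normal equations can be solved directly in plain Python. The data below is synthetic and chosen so the model is recovered exactly; this is not the paper's implementation:

```python
def fit_mlr(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by least squares via the normal
    equations (A^T A) beta = A^T y, solved with Gaussian elimination."""
    A = [[1.0] + list(row) for row in X]          # design matrix with intercept
    p = len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(p)]
           for i in range(p)]
    Aty = [sum(A[k][i] * y[k] for k in range(len(A))) for i in range(p)]
    # Gaussian elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(AtA[r][col]))
        AtA[col], AtA[piv] = AtA[piv], AtA[col]
        Aty[col], Aty[piv] = Aty[piv], Aty[col]
        for r in range(col + 1, p):
            f = AtA[r][col] / AtA[col][col]
            for c in range(col, p):
                AtA[r][c] -= f * AtA[col][c]
            Aty[r] -= f * Aty[col]
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        beta[i] = (Aty[i] - sum(AtA[i][j] * beta[j]
                                for j in range(i + 1, p))) / AtA[i][i]
    return beta

# Data generated exactly by y = 1 + 2*x1 + 3*x2, so the fit recovers it:
b0, b1, b2 = fit_mlr([(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)],
                     [1, 3, 4, 6, 8])
# → b0 ≈ 1.0, b1 ≈ 2.0, b2 ≈ 3.0
```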
Stepwise regression can be used to automatically identify the most important terms. It involves iteratively adding or removing terms according to a stopping criterion, which can be based on the p-value, Akaike information criterion (AIC), Bayesian information criterion (BIC), the coefficient of determination (R²), or the adjusted R². The most popular stepwise methods are forward selection (FS), backward elimination (BE), and bidirectional elimination. In FS, the model starts with no terms and iteratively adds them until a stopping criterion is met. In BE, the model starts with all the terms and iteratively removes them until a stopping criterion is met. For both the BE and the FS methods, the decision regarding a term is final and is not reconsidered. This is not the case with bidirectional elimination, which is a combination of forward and backward stepwise regression and starts with no terms. If the adjusted R² is considered as the stopping criterion, this method first adds the terms that produce the largest increase in the adjusted R² value. Subsequently, terms can also be removed if their removal results in a further increase of the adjusted R².
In this work, models based on MLR, as well as on second- and third-degree polynomial terms, have been constructed using bidirectional-elimination stepwise regression. The generated models based on stepwise regression were limited to second- and third-degree polynomial terms, including combinational terms. The adjusted R² was used as the stopping criterion. In all cases, fitting was performed using the least-squares method.
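The bidirectional-elimination loop described above can be sketched as follows, with the actual regression fit abstracted into a caller-supplied `score` function returning a model's R². The toy score used in the example is purely hypothetical:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n samples and a model with k terms (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def bidirectional_stepwise(terms, score, n):
    """Bidirectional elimination: start with no terms, greedily add terms that
    raise the adjusted R^2, then try removals that raise it further.
    `score(subset)` must return the plain R^2 of a model built from `subset`."""
    selected, best = [], float("-inf")
    improved = True
    while improved:
        improved = False
        for t in [u for u in terms if u not in selected]:      # forward step
            cand = selected + [t]
            a = adjusted_r2(score(cand), n, len(cand))
            if a > best:
                best, selected, improved = a, cand, True
        for t in list(selected):                               # backward step
            cand = [u for u in selected if u != t]
            if cand:
                a = adjusted_r2(score(cand), n, len(cand))
                if a > best:
                    best, selected, improved = a, cand, True
    return selected, best

# Toy score: additive, fictitious R^2 contributions per term.
def toy_r2(subset):
    gains = {"x1": 0.6, "x2": 0.3, "x3": 0.005}
    return sum(gains[t] for t in subset)

selected, best = bidirectional_stepwise(["x1", "x2", "x3"], toy_r2, n=20)
# selected → ['x1', 'x2'] ('x3' is rejected: its tiny R^2 gain does not
# offset the adjusted-R^2 penalty for an extra term)
```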

Support Vector Regression
The support vector machine (SVM), in ML, is a well-known supervised learning model, used mainly for binary classification tasks. It has been extensively applied in predictive and diagnostic tasks, such as in [36], where a partial-discharge-curve approach is combined with the least-squares SVM to estimate the state of health (SoH) of Li-ion batteries. Similarly, in [37], the SVM is utilized on an electric-vehicle (EV) battery-usage-profile dataset generated by simulations to determine the SoH. The SVM searches for the optimal hyperplane that maximizes the margin, i.e., the distance to the nearest training points of each class, making it effective not only in classifying points but also in finding the most robust hyperplane. When the points are not linearly separable and a higher-dimensional feature space is needed, the kernel trick is used.
SVR is a version of the SVM adapted to perform regression tasks. SVR fits the error of its predictions within the limit ε while minimizing the loss function in Equation (1), which is called the L2 loss:

\[ J(\beta) = \frac{1}{2}\,\beta^{T}\beta \quad (1) \]

subject to \( \left| y_n - \left( x_n^{T}\beta + b \right) \right| \le \varepsilon \;\; \forall n \),

where β is the weight vector and β^T its transpose, y_n are the target values, x_n are the transposed descriptor arrays, b is the bias, and ε is the maximum allowed error. The constraint is then relaxed by introducing the slack variables and applying what is called the soft-margin approach.
\[ J(\beta) = \frac{1}{2}\,\beta^{T}\beta + C \sum_{n=1}^{N} \left( \xi_n + \xi_n^{*} \right) \quad (2) \]

subject to \( y_n - \left( x_n^{T}\beta + b \right) \le \varepsilon + \xi_n \), \( \left( x_n^{T}\beta + b \right) - y_n \le \varepsilon + \xi_n^{*} \), and \( \xi_n, \xi_n^{*} \ge 0 \),

where ξ_n and ξ_n^* are the slack variables for positive and negative error and C is the weight associated with the slack variables. The prediction is expressed as a function of the training samples in Equation (3), in particular of those data points with either α_n or α_n^* different from 0, which are called support vectors:

\[ f(x) = \sum_{n=1}^{N} \left( \alpha_n - \alpha_n^{*} \right) x_n^{T} x + b \quad (3) \]
In this paper, the SVR hyperparameters were initially tuned with the MATLAB built-in function for SVR models, using the Bayesian optimization algorithm run for 500 iterations, to define a good starting point for the hyperparameters. The tunable hyperparameters are as follows:
- Box constraint: the coefficient C that weights the slack variables in Equation (2) and helps regulate overfitting.
- Epsilon (ε): the value that defines the radius of the epsilon tube within which the algorithm tries to contain the points or, in other words, the maximum error allowed.
- Kernel scale: the value that rescales the predictors; each value in the predictors is divided by the kernel scale.
- Kernel function: the function used to compute the similarity between data points in a higher-dimensional feature space.
Additional tuning of the hyperparameters was carried out during the validation process. The final values of the hyperparameters are shown in Table 1. The linear kernel function was selected because the features are quite proportional to the target value to estimate and working in a higher-dimensional space was unnecessary. In fact, different kernel functions led to lower validation accuracy.

Random Forest
A random forest (RF) is an ensemble learning method that combines many decision trees (DTs) as weak learners and is one of the best-known and most used algorithms for supervised learning tasks. In [38], an RF is used to perform an incremental capacity analysis to estimate the capacity of lithium batteries by only feeding raw measurements of new data to the model.
A decision tree is a non-parametric algorithm that develops a tree by splitting the dataset over the values of its features and associates different subsets of the dataset with different nodes of the tree. First, the entire dataset is paired with the root of the tree. Next, the dataset is split into two parts according to a decision made over some of the features, and each part is associated with a new child node of the root, forming the second level of the tree. This behavior is recursively iterated until subsets of the dataset contain only one value or a stop criterion is met, with the final subsets representing the leaves of the tree. Each split is typically made over the value of one feature, and the choice of the optimal split is made by finding the feature and splitting value that optimize a given metric. MSE metric minimization was used in this work:

\[ MSE(S) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \overline{y} \right)^2 \]

\[ SplitMSE(S, F, V) = \frac{N_L}{N}\, MSE(S_L) + \frac{N_R}{N}\, MSE(S_R) \]

\[ \left( F^{*}, V^{*} \right) = \underset{F,\, V}{\arg\min}\; SplitMSE(S, F, V) \]

where \(\overline{y}\) is the mean of the target values in the set S, y_i is the i-th target value, and N is the number of samples in the set. SplitMSE is the weighted error, S_L and S_R are the left and right subsets generated by splitting S over the feature F at value V, and N_L and N_R are the numbers of data points in the left and right subsets, respectively. F* and V* are the optimal feature-value pair over which to split the set. Other metrics, such as Gini impurity or information gain, can be used.
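A minimal sketch of the exhaustive split search implied by the MSE criterion above, in plain Python and for illustration only:

```python
def mse(ys):
    """Mean squared error of a set of targets around their mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def best_split(samples):
    """Exhaustive search for the (feature, value) pair minimizing the weighted
    split MSE. `samples` is a list of (feature_tuple, target) pairs."""
    n = len(samples)
    best_f, best_v, best_mse = None, None, float("inf")
    for f in range(len(samples[0][0])):
        for xs, _ in samples:                       # candidate split values
            v = xs[f]
            left = [y for x, y in samples if x[f] <= v]
            right = [y for x, y in samples if x[f] > v]
            if not left or not right:
                continue
            split_mse = (len(left) * mse(left) + len(right) * mse(right)) / n
            if split_mse < best_mse:
                best_f, best_v, best_mse = f, v, split_mse
    return best_f, best_v, best_mse

# Targets jump from 0 to 1 between x = 0.2 and x = 0.8, so the optimal
# split separates the two groups with zero weighted MSE:
best_split([((0.1,), 0), ((0.2,), 0), ((0.8,), 1), ((0.9,), 1)])
# → (0, 0.2, 0.0)
```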
However, decision trees are considered weak learners and strongly tend to overfit. A random forest is an ensemble algorithm that combines multiple decision trees with a bagging technique to provide higher accuracy and robustness than a single tree, reducing overfitting. Bagging is, in fact, known for reducing the variance of the model (as opposed to boosting, which reduces bias) by training each tree (or learner in general) on a randomly selected subset of the training data drawn with replacement (bootstrapping), hence introducing diversity in the training data. What differentiates the random forest from the standard tree-bagging ensemble is the use of a subset of randomly selected features for each tree in the forest, which helps reduce the correlation between the learners, thus further reducing overfitting. In this work, one-third of the total features were randomly used to train each single decision tree.
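The bagging-plus-feature-subsampling scaffolding described above can be sketched as follows. The tree learner itself is abstracted into a caller-supplied `fit_tree` function, and all names and data here are illustrative, not the paper's implementation:

```python
import random

def train_forest(samples, n_features, n_trees, fit_tree, seed=0):
    """Random-forest scaffolding: each tree is trained on a bootstrap resample
    (sampling with replacement) restricted to a random third of the features.
    `fit_tree(bootstrap, feature_ids)` stands in for the decision-tree learner."""
    rng = random.Random(seed)
    k = max(1, n_features // 3)                 # one-third of the features
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        feature_ids = rng.sample(range(n_features), k)
        forest.append((fit_tree(bootstrap, feature_ids), feature_ids))
    return forest

def predict_forest(forest, x, predict_tree):
    """Regression bagging: average the predictions of the individual trees."""
    return sum(predict_tree(tree, x, feats) for tree, feats in forest) / len(forest)

# Stand-in learner: a 'stump' that simply predicts its bootstrap's mean target.
mean_stump = lambda bootstrap, feats: sum(y for _, y in bootstrap) / len(bootstrap)
samples = [((1.0, 2.0), 2.0), ((1.5, 2.5), 2.0), ((0.5, 1.5), 2.0)]
forest = train_forest(samples, n_features=2, n_trees=5, fit_tree=mean_stump)
predict_forest(forest, (1.0, 2.0), lambda tree, x, feats: tree)   # → 2.0
```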

Feature Selection
As aforementioned, the considered approaches were applied using a specific feature of the batteries. In most battery applications, the charging stage is conducted in a more repeatable way: while different chargers can be used, resulting in different charging profiles, many charging cycles will be the same or very similar. On the other hand, the battery discharge cycles vary greatly depending on the application and use patterns. However, even though the charging phase is more similar between different cycles, complete charging cycles are by no means guaranteed. For this reason, a small portion of the charging voltage curve was used to extract useful information. More specifically, the extracted feature is the partial charging time (PCT) necessary for the battery voltage to rise across a small voltage range.
In Figure 1, the battery voltage versus time during charging is represented for different cycles. Unsurprisingly, the charging time decreases as the battery ages and the global capacity decreases; in fact, the charging time near the final cycles is half that of the initial ones. It is further noted that the beginning of the charging process is characterized by a high voltage derivative, making it difficult to appreciate the time differences between cycles. On the other hand, the middle part extends over a longer period of time and is more suitable for PCT feature extraction. This is why, in this work, the lower voltage limit of 3.7 V was set for the feature extraction process.
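A PCT feature of this kind could be extracted from a sampled charging curve as sketched below. The linear-interpolation approach and the synthetic segment are assumptions for illustration, not the paper's exact procedure:

```python
def partial_charging_time(time_s, voltage_v, v_low=3.8, v_high=3.9):
    """Partial charging time (PCT): time taken by a monotonically rising
    charging-voltage curve to go from v_low to v_high, with linear
    interpolation between samples."""
    def crossing(v_target):
        for i in range(1, len(voltage_v)):
            if voltage_v[i - 1] <= v_target <= voltage_v[i]:
                frac = (v_target - voltage_v[i - 1]) / (voltage_v[i] - voltage_v[i - 1])
                return time_s[i - 1] + frac * (time_s[i] - time_s[i - 1])
        raise ValueError(f"{v_target} V not crossed by this charging segment")
    return crossing(v_high) - crossing(v_low)

# Synthetic CC-phase segment rising linearly from 3.7 V to 4.0 V over 300 s:
t = [0, 100, 200, 300]
v = [3.7, 3.8, 3.9, 4.0]
partial_charging_time(t, v, 3.8, 3.9)   # → 100.0 (seconds)
```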

Results and Discussion
The initial choice of the voltage range and limits was made empirically by computing four features over the limits of 3.7-4.1 volts with a voltage range of 0.1 V. More specifically, the first feature represents the evolution of the charging time between 3.7 V and 3.8 V over the number of cycles, the second feature uses the range of 3.8 V to 3.9 V, etc. In Figure 2, the value of the considered features as a function of the number of cycles is plotted. The first PCT feature computed for the lowest voltage values, from 3.7 V to 3.8 V, appeared to be an almost flat curve, containing no variance and thus very little information regarding the data. Conversely, the features computed from 3.8 V to 4.1 V have a higher variance and hence are more descriptive of the aging phenomena.
To find the optimal features and model parameters, from the voltage limits of 3.7-4.1 V, many feature sets were created. These sets differ from each other depending on the number of features, the upper and lower voltage limits used, and the voltage range. For each feature set, the models obtained using the different ML strategies are compared.

The fitting accuracy of the various models was assessed through the value of the coefficient of determination (R²), a statistical measure indicating how much of the variance of the data is described by the model, i.e., how well a model can fit the data. R² is defined as

\[ R^2 = 1 - \frac{SS_r}{SS_t}, \qquad SS_r = \sum_i \left( y_i - f_i \right)^2, \qquad SS_t = \sum_i \left( y_i - \overline{y} \right)^2, \]

where SS_r is the residual sum of squares, SS_t the total sum of squares, y_i the target values, f_i the estimated values, and \(\overline{y}\) the mean of the target values.

A three-fold cross-validation (CV) procedure was applied to the three batteries of the dataset to find the best features and ML strategies. This means the SoH evolution of each battery was estimated based on the data of the other two batteries. The results are shown in Tables 2 and 3 for the voltage ranges of 0.1 V and 0.05 V, respectively. Initially, a smaller voltage range of 0.025 V and a larger voltage range of 0.2 V were also considered. However, the smaller voltage range resulted in features with low variability for most voltage limits and produced inferior results compared to the ones presented in Tables 2 and 3. The larger range of 0.2 V and higher ranges did not improve the SoH-estimation capability of the models. Since minimizing the voltage range was one of the objectives to ensure that the features would be available, even in the case of partial charging cycles, the ranges of 0.05 V and 0.1 V were regarded as optimal, and the higher voltage ranges were not further analyzed or presented.

Table 2 shows the feature sets of partial charging times obtained for a voltage range of 0.1 V. The first four single-feature sets (A1-A4) explore the whole voltage range of 3.7 to 4.1 V. Unsurprisingly, they show that all ML strategies perform better when the voltage limits of 3.8-3.9 V (A2) or 3.9-4 V (A3) are used as a feature. More specifically, the best results are obtained for the voltage limits 3.9-4 V when a single feature is used. Additionally, Table 2 shows that if the feature set is built from two features based on the limits of 3.8-3.9 V and 3.9-4 V (A6), there is only a marginal improvement in the R² value.
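The R² computation described above is straightforward to sketch in plain Python (synthetic numbers, for illustration only):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_r / SS_t."""
    y_mean = sum(y_true) / len(y_true)
    ss_r = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))  # residual sum of squares
    ss_t = sum((y - y_mean) ** 2 for y in y_true)             # total sum of squares
    return 1 - ss_r / ss_t

r_squared([1, 2, 3, 4], [1, 2, 3, 4])            # → 1.0 (perfect fit)
r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])    # → 0.98
```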
In any case, the best results for single and double feature sets are A3 and A6. Table 3 presents the feature sets obtained for a voltage range of 0.05 V. In this case, feature sets consisting of one to four features were constructed. For example, B1 is a feature set of a single feature, which is the PCT between the voltage limits of 3.8 V and 3.85 V, while B6 consists of two features, which are the PCTs between the limits of 3.8-3.85 V and 3.85-3.9 V. The best results, per number of features, are B3, B7, B10, and B14. Using a single feature, even for the voltage range of 0.05 V, is sufficient if the voltage limits are between 3.85 and 4 V. There is marginal improvement when two features are used; however, a further increase in the number of features does not lead to any meaningful increase of R². Considering the models of both tables, it can be noted that SVR delivers slightly better results than the other considered ML strategies for all feature sets. Still, using MLR also leads to satisfactory results. Furthermore, when comparing the three strategies based on regression, no significant improvement in the R² value is observed when increasing the polynomial order using stepwise regression. That means the PCT features and capacity reduction, as functions of the number of cycles, have a strong linear dependence. Hence, high model complexity will not improve the results if the correct voltage range of 3.8-4 V has been selected. Actually, a drop in the mean validation R² value can even be observed in some cases, due to the higher-complexity models overfitting the training data. This is especially apparent in models built from a higher number of features (A7 and B12 to B14). However, some improvement when increasing the polynomial order can also be observed for the voltage range of 3.7-3.8 V, which has low variance.
Finally, the models based on RF demonstrated worse performance than those of MLR and SVR.
The models based on feature sets A3, A6, B3, B7, B10, and B14 all represent satisfactory performance. Having the goal of minimizing the number of features and the voltage range, the authors consider the models based on feature set B7 as the overall best. The plots for the capacity estimation of all the batteries using MLR, SVR, and RF are plotted in Figures 3-5 increasing the polynomial order using stepwise regression. That means the PCT features and capacity reduction, as functions of the number of cycles, have a strong linear depend ence. Hence, high model complexity will not result in an improvement of the results if the correct voltage range of 3.8-4 V has been selected. Actually, a drop in the mean validation R 2 value can even be observed in some cases due to overfitting the training data of the higher-complexity models. This is especially apparent in models built from a higher num ber of features (A7 and B12 to B14). However, some improvement when increasing the polynomial order can also be observed for the voltage range of 3.7-3.8 V, which has low variance. Finally, the models based on RF demonstrated worse performance than those o MLR and SVR. The models based on feature sets A3, A6, B3, B7, B10, and B14 all represent satisfac tory performance. Having the goal of minimizing the number of features and the voltage range, the authors consider the models based on feature set B7 as the overall best. The plots for the capacity estimation of all the batteries using MLR, SVR, and RF are plotted in Figures 3-5, respectively.   increasing the polynomial order using stepwise regression. That means the PCT features and capacity reduction, as functions of the number of cycles, have a strong linear dependence. Hence, high model complexity will not result in an improvement of the results if the correct voltage range of 3.8-4 V has been selected. 
Actually, a drop in the mean validation R 2 value can even be observed in some cases due to overfitting the training data of the higher-complexity models. This is especially apparent in models built from a higher number of features (A7 and B12 to B14). However, some improvement when increasing the polynomial order can also be observed for the voltage range of 3.7-3.8 V, which has low variance. Finally, the models based on RF demonstrated worse performance than those of MLR and SVR. The models based on feature sets A3, A6, B3, B7, B10, and B14 all represent satisfactory performance. Having the goal of minimizing the number of features and the voltage range, the authors consider the models based on feature set B7 as the overall best. The plots for the capacity estimation of all the batteries using MLR, SVR, and RF are plotted in Figures 3-5, respectively.   All three figures display the previously mentioned three-fold CV. For example, the SoH estimation of battery 5 was done with a model trained using the data of the chosen feature set of batteries 6 and 7. The full lines represent the measured SoH for the batteries, while the dashed lines represent the estimated SoH over the number of cycles. Figures 3  and 4 show that MLR and SVR accurately model the SoH of the batteries, even registering the peaks in the SoH function that are due to the rest time of the battery. Likewise, the RF is able to model batteries 5 and 7 with similar success, but the same cannot be said about battery 6, as is evident in Figure 5. After the SoH of battery 6 falls to around 0.7, the estimation begins to diverge from the measurement because batteries 5 and 7, which were used for training, do not contain data with SoH lower than 0.7.
The random forest and decision trees are indeed well known for their inability to extrapolate, that is, make estimations for predictor values lying outside of the range of the observed data. From Figure 5, it is clear that the SoH value of battery 6 from cycle 90 onwards is lower than that of any other cycle of the training batteries; hence, the decision trees will not be able to correctly estimate that target value. Furthermore, Figure 6 shows that also the feature value for battery 6 is lower than that of the other batteries. Consequently, the branches of the decision trees built on batteries 5 and 7 will "explore" the features in a range that does not include the values of battery 6 predictors after cycle 90. Hence, after this cycle number, all the decision trees of the random forest will infer the lowest observed SoH value for battery 6, which will be around 0.7 because the training data is composed of batteries 5 and 7. This is the reason for the observed flat line output. It is important to specify that this result does not imply that the RF is not a suitable solution for the general problem of battery prognostic because this precise case is strictly related to the dataset distribution and data scarcity. All three figures display the previously mentioned three-fold CV. For example, the SoH estimation of battery 5 was done with a model trained using the data of the chosen feature set of batteries 6 and 7. The full lines represent the measured SoH for the batteries, while the dashed lines represent the estimated SoH over the number of cycles. Figures 3 and 4 show that MLR and SVR accurately model the SoH of the batteries, even registering the peaks in the SoH function that are due to the rest time of the battery. Likewise, the RF is able to model batteries 5 and 7 with similar success, but the same cannot be said about battery 6, as is evident in Figure 5. 

Figure 6. PCT for voltage range 3.9-3.95 V for all three batteries.

Conclusions
Accurate SoH estimation is essential for the safe and reliable operation of lithium-ion batteries. This paper compares SoH-estimation models based on the classical ML strategies of MLR, polynomial regression, SVR, and RF, which offer a good trade-off between applicability, light computational effort, and accuracy of results. A discussion is provided on the feature selection process and the optimal number of features.
The partial charging time proved to be a good indicator of battery aging, provided that the proper voltage limits were selected and the partial charging phase was identical at every cycle. To find the optimal features, 21 feature sets were built considering different voltage limits and the two voltage ranges of 0.1 V and 0.05 V. The best results were obtained with voltage limits between 3.8 and 4.0 V for both ranges. The quality of the features degrades significantly for a lower voltage limit below 3.7 V due to their small variance. The results showed that models based on one or two features are optimal.
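For clarity, the PCT feature can be computed as the time a cell spends between two voltage limits during the (identical) charging phase of each cycle. The sketch below assumes a monotonically rising voltage log and uses interpolation to locate the limit crossings; the function name and signature are illustrative, not the paper's implementation.

```python
# Sketch of extracting the partial charging time (PCT) feature from one
# cycle's logged charge data. Assumes voltage rises monotonically during
# the charging phase, so the crossing times of both limits can be found
# by inverse interpolation.
import numpy as np

def partial_charging_time(time_s, voltage_v, v_low=3.8, v_high=3.9):
    """Return the seconds spent charging from v_low to v_high."""
    t = np.asarray(time_s, dtype=float)
    v = np.asarray(voltage_v, dtype=float)
    if v[0] > v_low or v[-1] < v_high:
        raise ValueError("charge curve does not cover the voltage window")
    t_low = np.interp(v_low, v, t)    # time at which v_low is reached
    t_high = np.interp(v_high, v, t)  # time at which v_high is reached
    return t_high - t_low

# Example with an idealised linear 3.6 V -> 4.2 V ramp over 3600 s,
# where a 0.1 V window takes about 600 s:
t = np.linspace(0, 3600, 1000)
v = 3.6 + (4.2 - 3.6) * t / 3600
pct = partial_charging_time(t, v, 3.8, 3.9)
print(pct)
```

As the cell ages, the charge curve steepens and the PCT inside a fixed window shrinks, which is what makes it a usable aging feature.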
Furthermore, the PCT feature demonstrated a linear dependence on the capacity reduction over the number of cycles. Consequently, MLR produced very accurate results, and the use of polynomial regression was not justified. The overall best performance across all feature sets was achieved using SVR, especially when slightly lower voltage limits were considered. Finally, the RF showed the worst performance when facing the limited dataset.