On-line Remaining Charging-discharging Cycle Prediction of Lithium-ion Batteries using Cumulative Indicator

Predicting the effective number of full charging-discharging cycles is valuable for lithium-ion battery (LIB) replacement and recycling. This paper constructs a cumulative degradation indicator (CDI) that serves as a more predictable indicator and outperforms the original degradation indicator (DI) on multiple criteria. In the stage of determining the end-of-life (EoL) threshold, a relevance vector machine (RVM) is introduced to screen a small number of available samples and to reduce the prediction error. In the experimental verification stage, LIB full-life data from NASA are used to verify the early and long-term prediction performance for the remaining charging-discharging cycles (RCDC) using a small sample. The experimental results show that when the proportion of training data approaches 50%, the predicted value gradually converges to the actual value.


Introduction
With the development of the automobile industry over recent decades, millions of LIBs installed in vehicles are reaching their end-of-life (EoL) and need to be handled properly to prevent environmental impacts. The key problem is how to accurately estimate the remaining useful life (RUL) of LIBs, thereby improving battery utilization and reducing the risk of environmental harm.
To accurately describe the usable cycles of LIBs, this paper uses the remaining charging-discharging cycles (RCDC) in place of the term RUL. As shown in Eqs. (1)-(3), the indicators state-of-capacity (SoC) and state-of-health (SoH) are commonly selected to measure and evaluate the health status of LIBs [1][2][3]. In addition, the performance of LIBs varies with the number of charging-discharging cycles, temperature changes, and the solid electrolyte interface conversion. Therefore, an accurate and robust RUL prediction algorithm is essential to improve performance and optimize the charging-discharging process of LIBs. Generally, an SoH of 70%-80% is regarded as the threshold range for battery replacement [4,5].
Many studies address the determination of RCDC, but some problems remain. When time series analysis methods are used to predict RCDC, the proportion of training data available is comparatively small, whereas a regression result is usually considered guaranteed only if the training data exceed 50% of the overall data. Thus, the efficiency of early and long-term prediction may be limited. Besides, the works in [6,7] revealed that the battery capacity regeneration phenomenon (CRP) directly hinders RCDC prediction. Figure 1 shows the challenges met in the prediction.
To deal with the insufficient proportion of training data and the challenge brought by the nonlinear CRP, this paper establishes a new cumulative degradation indicator (CDI) to replace the original SoH-based degradation indicator (DI) for RCDC prediction. Then, to establish the "DI-CDI" relationship, RVM is used to obtain a small number of relevance vectors (RVs), which are then used for curve fitting to determine the EoL threshold of CDI. The proposed CDI is consistent with the original DI in terms of overall monotonicity and synchronization, but its linearity is better.

Methodology
The proposed prediction framework is shown in figure 2. It includes online degradation indicator acquisition, discrete wavelet transform (DWT) denoising, cumulative feature calculation, RV selection using RVM, EoL estimation on the CDI curve, and RCDC prediction. In this paper, the normalized DI of LIBs is given in equation 3.

CDI Calculation
Although the DI of an LIB reflects the real-time status of the capacity, it cannot reflect the influence of past states on the current state, that is, the historical relevance of the time series. Moreover, affected by CRP, the DI curve shows nonlinear characteristics, which is conducive to neither curve fitting nor RCDC prediction. To capture this historical relevance, as shown in equation 4, this paper uses cumulative degradation characteristics to reflect the trend and monotonicity of the historical degradation series. As shown in table 1, CDI is far better than DI in terms of monotonicity and linearity.
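As a minimal sketch of the cumulative idea, assuming that equation 4 amounts to a running sum of the DI series (the conclusion describes CDI as obtained by cumulative summation), the following Python snippet builds a CDI from a short DI series. The function name and the DI values are hypothetical and chosen only to show how a local regeneration dip in DI is smoothed out in CDI.

```python
from itertools import accumulate


def cumulative_degradation_indicator(di):
    """Return the running sum of a DI series (assumed form of the CDI)."""
    return list(accumulate(di))


# Hypothetical, non-negative DI values; the drop from 0.04 to 0.03
# mimics a capacity regeneration dip.
di = [0.00, 0.02, 0.04, 0.03, 0.06, 0.08]
cdi = cumulative_degradation_indicator(di)

# The CDI never decreases as long as every DI value is non-negative.
is_non_decreasing = all(b >= a for a, b in zip(cdi, cdi[1:]))
print(cdi, is_non_decreasing)
```

Because each CDI value adds a non-negative DI to the previous total, a temporary dip in DI only slows the growth of CDI rather than reversing it.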
To evaluate the monotonicity and trend of degradation characteristics, this paper introduces two indicators: the monotonicity index and the trending index [8]. The expressions of the two indicators are shown in equations 5-6. The monotonicity index ranges from 0 to 1: a value of 1 indicates the highest monotonicity, and a value of 0 shows there is no monotonicity. The trending index represents the correlation between CDI and the charging-discharging cycle number, and its value ranges between -1 and 1; the closer its magnitude is to 1, the better the correlation.
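The two indicators can be illustrated with a short Python sketch. Since equations 5-6 are not reproduced here, the definitions below are assumptions based on forms commonly used in the prognostics literature: monotonicity as the absolute difference between the counts of positive and negative increments divided by the number of increments, and trend as the Pearson correlation between the series and its cycle index. The function names and data are hypothetical.

```python
import math
from itertools import accumulate


def monotonicity(series):
    """|#positive steps - #negative steps| / #steps, in [0, 1] (assumed form)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    pos = sum(1 for d in diffs if d > 0)
    neg = sum(1 for d in diffs if d < 0)
    return abs(pos - neg) / len(diffs)


def trend(series):
    """Pearson correlation between the series and cycle index 1..n (assumed form)."""
    n = len(series)
    cycles = range(1, n + 1)
    mean_x = sum(cycles) / n
    mean_y = sum(series) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, series))
    var_x = sum((x - mean_x) ** 2 for x in cycles)
    var_y = sum((y - mean_y) ** 2 for y in series)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical DI with one regeneration dip, and its running sum as CDI.
di = [0.00, 0.02, 0.04, 0.03, 0.06, 0.08]
cdi = list(accumulate(di))

print(monotonicity(di), monotonicity(cdi))
print(trend(di), trend(cdi))
```

On this toy series the single dip lowers the monotonicity of DI, while the cumulative series has only positive increments.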

RVM Regression for RV Selection and Curve Fitting
RVM regression is a machine learning algorithm that combines Bayesian theory with SVM. Compared with SVM, the kernel function of RVM has no special restrictions, and its sparsity is better [9]. Fewer hyperparameters reduce the calculation amount of the kernel function. Given a data series $\{x_i, t_i\}_{i=1}^{N}$, RVM regression describes the link between target and input variables using the following linear model:
$t_i = y(x_i; w) + \varepsilon_i$ (7)
where $y(x; w)$ is a linear combination of kernel functions and the noise $\varepsilon_i$ follows a zero-mean normal distribution. $y(x; w)$ can be expressed as:
$y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0$ (8)
where $w_i$ is the weight and $K(\cdot, \cdot)$ is a predetermined kernel function. More details about the choice of kernel functions can be found in [9]. When the weights associated with the hyperparameters become zero, the sparsity of RVM is obtained. The inputs corresponding to the remaining nonzero weights are called RVs. As shown in figure 3(a), the red circles are the selected RVs, which are then used for polynomial fitting (as shown in figure 3(b)).
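For context, the origin of this sparsity can be sketched as follows, assuming the standard sparse Bayesian formulation of RVM. The symbols $\alpha_i$ and $N$ are introduced here for illustration and are not taken from this paper. Each weight $w_i$ is given its own zero-mean Gaussian prior whose precision $\alpha_i$ is a hyperparameter:

```latex
p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}\left(w_i \mid 0, \alpha_i^{-1}\right)
```

During training many of the $\alpha_i$ grow very large, which concentrates the corresponding weights at zero. The inputs whose weights remain nonzero are the ones retained as RVs.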
According to equation 9, this paper adopts a second-order polynomial model to fit the CDI over the time range of interest. The EoL is deemed to have been reached the first time the CDI satisfies the threshold condition. RCDC is then given by equation 10, and the RCDC error by equation 11; these equations involve the weights of the polynomial model, the predicted endpoint, and the real endpoint.
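The fitting and threshold-crossing steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: equations 9-11 are not reproduced here, so RCDC is assumed to be the predicted EoL cycle minus the last observed cycle, and the error is assumed to be the predicted endpoint minus the real endpoint. The function names, the synthetic CDI data, the threshold 2.95, and the real endpoint 50 are all hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    # Power sums s[p] = sum(x^p) and moments t[p] = sum(y * x^p).
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        acc = m[row][3] - sum(m[row][k] * coef[k] for k in range(row + 1, 3))
        coef[row] = acc / m[row][row]
    return tuple(coef)


def predict_eol_cycle(coef, start_cycle, threshold, max_cycle=10_000):
    """First cycle after start_cycle whose fitted CDI reaches the threshold."""
    a, b, c = coef
    for k in range(start_cycle + 1, max_cycle + 1):
        if a * k * k + b * k + c >= threshold:
            return k
    return None  # threshold not reached within max_cycle


# Synthetic CDI observed for the first 30 cycles (hypothetical data).
cycles = list(range(1, 31))
cdi = [0.001 * k * k + 0.01 * k for k in cycles]

coef = fit_quadratic(cycles, cdi)
eol = predict_eol_cycle(coef, start_cycle=cycles[-1], threshold=2.95)
rcdc = eol - cycles[-1]   # assumed form of equation 10
true_eol = 50             # hypothetical real endpoint
rcdc_error = eol - true_eol  # assumed form of equation 11
print(eol, rcdc, rcdc_error)
```

The normal equations are solved directly here only to keep the sketch dependency-free; a library least-squares routine would normally be preferable for longer series.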

Experimental Verification
The battery capacity degradation data used in this study come from NASA PCoE. There are four sets of batteries: B0005, B0006, B0007, and B0018, all of which are commercial 18650 LIBs. According to Eq. (4), the DI and CDI curves of the remaining three sets of LIBs are shown in figure 4. As shown in table 1, the monotonicity and trend indexes of the CDIs are better than those of the DIs for the three groups of LIBs. To facilitate the comparison of curve shapes, the amplitude of DI is scaled by a factor of ten. Affected by the CRP effect, the DI curves contain many depressions, and these local declining trends may affect the RCDC prediction performance. In contrast, the CDI curves of the three groups of LIBs all increase monotonically and show none of the local fluctuations that occur on the original DI curves.
As shown in figure 5, the gray curve reflects the "DI-CDI" relationship, which is nonlinear. In this paper, the RVM regression method is used to determine the potential RVs. Based on these RVs (e.g., as shown in figure 3(a)), a simple first-order polynomial fitting function can be established, which helps to determine the CDI threshold of the EoL location. The overall performance of the RCDC prediction for all groups of LIB data is shown in figures 6-7(a), and the prediction errors are shown in figures 6-7(b). At the prediction point where the available observation data account for 35.7% of the overall data length, the prediction error is already relatively small, and the predicted value gradually becomes consistent with the actual observed value.
As shown in figures 6-7, when the prediction point is between 60 and 80 (accounting for 35% to 47% of the total data length), the prediction error of RCDC gradually decreases. For the B0006 battery (see figure 7), the prediction error likewise converges gradually to an acceptable range. This shows that the proposed method can indeed realize the early and long-term prediction of RCDC.
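The mapping from a DI threshold to an equivalent CDI threshold through the first-order fit on the RVs can be sketched as follows. This is a minimal Python illustration under stated assumptions: the (DI, CDI) pairs stand in for RVs selected by RVM, and both these values and the DI threshold of 0.25 are hypothetical rather than taken from the NASA data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical relevance vectors: DI values and the CDI at the same cycles.
rv_di = [0.05, 0.10, 0.15, 0.20]
rv_cdi = [1.0, 2.0, 3.0, 4.0]

slope, intercept = fit_line(rv_di, rv_cdi)

# Hypothetical DI value taken as the EoL threshold on the DI curve.
di_eol = 0.25
cdi_eol = slope * di_eol + intercept  # equivalent EoL threshold on the CDI curve
print(cdi_eol)
```

Fitting only the few selected points keeps the mapping simple, at the cost of relying on how well those points represent the nonlinear DI-CDI curve near the EoL region.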

Conclusion
This paper adopts the idea of online prediction and applies cumulative summation to the original DI degradation data of lithium-ion batteries to obtain a new degradation index, CDI. CDI is significantly better than DI in monotonicity, trend, and linearity. Based on the fitting curve of DI-CDI, this paper introduces RVM to select RVs from the available DI variables. The chosen RVs are further used for polynomial fitting to determine the equivalent EoL threshold on the CDI curve, which provides a prediction basis for the RCDC estimation on the CDI curve. The experimental results show that when the proportion of training data approaches 50%, the predicted value gradually converges to the actual value. The proposed method can also be ported to commonly used portable terminal devices.