Complex exponential smoothing

Exponential smoothing has been one of the most popular forecasting methods used to support various decisions in organizations, in activities such as inventory management, scheduling, revenue management, and other areas. Although its relative simplicity and transparency have made it very attractive for research and practice, identifying the underlying trend remains challenging with significant impact on the resulting accuracy. This has resulted in the development of various modifications of trend models, introducing a model selection problem. With the aim of addressing this problem, we propose the complex exponential smoothing (CES), based on the theory of functions of complex variables. The basic CES approach involves only two parameters and does not require a model selection procedure. Despite these simplifications, CES proves to be competitive with, or even superior to existing methods. We show that CES has several advantages over conventional exponential smoothing models: it can model and forecast both stationary and non‐stationary processes, and CES can capture both level and trend cases, as defined in the conventional exponential smoothing classification. CES is evaluated on several forecasting competition datasets, demonstrating better performance than established benchmarks. We conclude that CES has desirable features for time series modeling and opens new promising avenues for research.

The simple exponential smoothing (SES) method can be written as:
$$ \hat{y}_t = \alpha y_{t-1} + (1-\alpha)\hat{y}_{t-1}, \tag{2} $$
where $\alpha$ is the smoothing parameter and $\hat{y}_t$ is the estimated value of the series.

The same method can be represented as a weighted average of the previous actual observations if we substitute $\hat{y}_{t-1}$ by the formula (2) with an index $t-1$ instead of $t$ (Brown, 1956):
$$ \hat{y}_t = \alpha \sum_{j=1}^{t-1} (1-\alpha)^{j-1} y_{t-j} + (1-\alpha)^{t-1}\hat{y}_1 . $$
The idea of this representation is to demonstrate how the weights $\alpha(1-\alpha)^{j-1}$ are distributed over time in our sample. If the smoothing parameter $\alpha \in (0, 1)$, then the weights decline exponentially with the increase of $j$. If it lies in the so-called "admissible bounds" (Brenner et al., 1968), that is $\alpha \in (0, 2)$, then the weights decline in an oscillating manner. Both the traditional and the admissible bounds thus imply a specific behaviour of the real-valued weights; CES generalises this by using a complex smoothing parameter $\alpha_0 + i\alpha_1$. Depending on the value of the complex smoothing parameter, the weights distribution will exhibit a variety of trajectories over time, including exponential, oscillating and harmonic. Finally, the result of the multiplication of two complex numbers is another complex number, so we substitute $\hat{y}_{t-j}$ with $\hat{y}_{t-j} + i\hat{e}_{t-j}$, where $\hat{e}_{t-j}$ is the proxy for the error term. The Complex Exponential Smoothing obtained as a result of this can be written as:
$$ \hat{y}_t + i\hat{e}_t = (\alpha_0 + i\alpha_1) \sum_{j=1}^{t-1} \left(1 + i - (\alpha_0 + i\alpha_1)\right)^{j-1} (y_{t-j} + i e_{t-j}) . $$
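
To illustrate how these complex weights behave, the following short R sketch (our illustration, not part of the original study; the value of the complex smoothing parameter is an arbitrary example) computes the weights and their moduli for the first twenty lags:

```r
# Complex CES weights (alpha0 + i*alpha1) * (1 + i - (alpha0 + i*alpha1))^(j-1)
# for an arbitrary example value of the complex smoothing parameter.
a <- complex(real = 1.3, imaginary = 1.0)
j <- 1:20
w <- a * (1 + 1i - a)^(j - 1)
data.frame(lag = j, weight = w, modulus = Mod(w))
# The moduli decline over j whenever Mod(1 + 1i - a) < 1.
```
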
CES introduces an interaction between the real and imaginary parts of the series. Its underlying statistical model can be written in the following state space form:
$$ \begin{aligned} y_t &= l_{t-1} + \epsilon_t \\ l_t &= l_{t-1} - (1-\alpha_1)\, c_{t-1} + (\alpha_0 - \alpha_1)\,\epsilon_t \\ c_t &= l_{t-1} + (1-\alpha_0)\, c_{t-1} + (\alpha_0 + \alpha_1)\,\epsilon_t , \end{aligned} \tag{7} $$
where $\epsilon_t$ is the white noise error term, $l_t$ is the level component and $c_t$ is the non-linear trend component, which depends not only on its own previous value $c_{t-1}$ but also on the level $l_{t-1}$. Also, note that we use $\epsilon_t$ instead of $e_t$ in (7), which means that CES has (7) as an underlying statistical model only when there is no misspecification error. In the case of the estimation of this model, $\epsilon_t$ will be substituted by $e_t$, which will then lead us to the original formulation (5).
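
As a purely illustrative sketch (ours, with arbitrarily chosen parameter values and initial states), the following R code simulates a series from the state space form (7):

```r
# Simulate data from the CES state space form (7).
# alpha0, alpha1 and the initial states are arbitrary illustrative values.
set.seed(42)
n <- 200
alpha0 <- 1.3; alpha1 <- 1.0
eps <- rnorm(n, 0, 5)        # white noise error term
l <- 100; c <- 100           # initial level and non-linear trend
y <- numeric(n)
for (t in 1:n) {
  y[t] <- l + eps[t]
  lNew <- l - (1 - alpha1) * c + (alpha0 - alpha1) * eps[t]
  c    <- l + (1 - alpha0) * c + (alpha0 + alpha1) * eps[t]
  l    <- lNew
}
plot.ts(y)
```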

This idea allows rewriting (7) in a shorter, more generic way, resembling the general Single Source of Error (SSOE) state space framework:
$$ \begin{aligned} y_t &= w' v_{t-1} + \epsilon_t \\ v_t &= F v_{t-1} + g \epsilon_t , \end{aligned} \tag{8} $$
where $v_t = (l_t, c_t)'$ is the state vector, $w = (1, 0)'$ is the measurement vector, $F = \begin{pmatrix} 1 & -(1-\alpha_1) \\ 1 & 1-\alpha_0 \end{pmatrix}$ is the transition matrix and $g = (\alpha_0 - \alpha_1, \alpha_0 + \alpha_1)'$ is the persistence vector. Assuming normality of the error term, the parameters of CES can be estimated by maximising the likelihood function:
$$ \mathcal{L}\left(\alpha_0, \alpha_1, v_0, \sigma^2 \mid Y\right) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{\epsilon_t^2}{2\sigma^2} \right), \tag{9} $$
where $v_0$ is the vector of initial states, $\sigma^2$ is the variance of the error term and $Y$ is the vector of all the in-sample observations.
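
A minimal sketch of this estimation in R (our illustration rather than the authors' implementation; it uses a concentrated error variance and arbitrary starting values for the optimiser) might look as follows:

```r
# Negative log-likelihood of CES, computed by running the recursion (7)
# with one-step-ahead errors e_t and a concentrated variance estimate.
cesNegLogLik <- function(par, y) {
  alpha0 <- par[1]; alpha1 <- par[2]
  l <- par[3]; c <- par[4]                  # initial states
  e <- numeric(length(y))
  for (t in seq_along(y)) {
    e[t] <- y[t] - l
    lNew <- l - (1 - alpha1) * c + (alpha0 - alpha1) * e[t]
    c    <- l + (1 - alpha0) * c + (alpha0 + alpha1) * e[t]
    l    <- lNew
  }
  sigma2 <- mean(e^2)
  if (!is.finite(sigma2) || sigma2 <= 0) return(1e10)   # guard against divergence
  -sum(dnorm(e, mean = 0, sd = sqrt(sigma2), log = TRUE))
}

y <- as.numeric(AirPassengers)              # any example series
fit <- optim(c(1.3, 1, y[1], y[1]), cesNegLogLik, y = y)
fit$par[1:2]                                # estimated alpha0 and alpha1
```

The estimated smoothing parameters and initial states can then be plugged into (8) to produce forecasts.
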
The stationarity of CES is determined by the eigenvalues of the transition matrix $F$, which are given by:
$$ \lambda_{1,2} = \frac{(2-\alpha_0) \pm \sqrt{(2-\alpha_0)^2 - 4(2-\alpha_0-\alpha_1)}}{2} . \tag{10} $$
If the absolute values of both roots are less than 1, then the estimated CES is stationary.

When $\alpha_1 > 1$, one of the eigenvalues will always be greater than one.

In this case both eigenvalues will be real numbers and CES produces a non-stationary trajectory. When $\alpha_1 = 1$, CES becomes equivalent to ETS(A,N,N).

Finally, the model becomes stationary when both eigenvalues lie inside the unit circle; the specific conditions on $\alpha_0$ and $\alpha_1$ are derived in Appendix C.

The other important property that arises from (7) is the stability condition for CES. Substituting $\epsilon_t = y_t - l_{t-1}$ into the transition equation, the following recursion is obtained:
$$ v_t = (F - g w')\, v_{t-1} + g\, y_t , $$
and the discount matrix $D = F - g w'$ can be written in the general form:
$$ D = \begin{pmatrix} 1 - \alpha_0 + \alpha_1 & -(1-\alpha_1) \\ 1 - \alpha_0 - \alpha_1 & 1 - \alpha_0 \end{pmatrix} . \tag{13} $$
The model is said to be stable if all the eigenvalues of (13) lie inside the unit circle. This is a more important condition than stationarity, because it ensures that the complex weights decline over time and that older observations receive smaller weights than newer ones, which is one of the main features of the conventional ETS models. The eigenvalues are given by the following formula:
$$ \lambda_{1,2} = \frac{\operatorname{tr}(D) \pm \sqrt{\operatorname{tr}(D)^2 - 4\det(D)}}{2} . $$
CES will be stable when the resulting system of inequalities on $\alpha_0$ and $\alpha_1$ is satisfied (see Appendix D).

The conditional mean of CES for $h$ steps ahead with known $l_t$ and $c_t$ can be calculated using the state space model (7):
$$ \hat{y}_{t+h} = \mathrm{E}\left(y_{t+h} \mid v_t\right) = w' F^{h-1} v_t , \tag{16} $$
where $F$ and $w$ are the matrices from (8).
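
To make these conditions concrete, the following R sketch (our illustration, with arbitrary example values of the smoothing parameters and of the last state vector) checks stationarity and stability numerically and computes the conditional mean (16) for several horizons:

```r
# Check stationarity (eigenvalues of F) and stability (eigenvalues of D)
# for example values of the smoothing parameters, then produce point forecasts.
alpha0 <- 1.5; alpha1 <- 0.9
Fmat <- matrix(c(1, 1, -(1 - alpha1), 1 - alpha0), nrow = 2)   # transition matrix
g <- c(alpha0 - alpha1, alpha0 + alpha1)                       # persistence vector
w <- c(1, 0)                                                   # measurement vector
D <- Fmat - g %*% t(w)                                         # discount matrix (13)

all(Mod(eigen(Fmat)$values) < 1)   # TRUE -> stationary
all(Mod(eigen(D)$values) < 1)      # TRUE -> stable

# Conditional mean h steps ahead for an example last state v_t = (l_t, c_t)'
vt <- c(120, 110)
sapply(1:10, function(h) {
  Fh <- diag(2)
  if (h > 1) for (k in 2:h) Fh <- Fh %*% Fmat
  drop(t(w) %*% Fh %*% vt)
})
```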

The forecasting trajectories of (16) will differ depending on the values of the complex smoothing parameter, producing a richer set of shapes than the conventional level exponential smoothing model (Gardner, 1985). It can be shown that CES can be written in the form of an ARMA(2,2) model (see Appendix B for the derivations):
$$ y_t = (2-\alpha_0)\, y_{t-1} - (2-\alpha_0-\alpha_1)\, y_{t-2} + (2\alpha_0 - \alpha_1 - 2)\,\epsilon_{t-1} + \left(\alpha_0^2 + \alpha_1^2 - 3\alpha_0 - \alpha_1 + 2\right)\epsilon_{t-2} + \epsilon_t . $$
The coefficients of the AR terms of this model are connected with the coefficients of the MA terms via the complex smoothing parameter. This connection is non-linear and imposes restrictions on the plane of AR terms. Figure 3 demonstrates how the invertibility region restricts the field of AR coefficients.

The triangle on the plane corresponds to the stationarity condition of AR(2) models, while the dark area demonstrates the invertibility region of CES. Note that exponential smoothing models are non-stationary in their nature, but it is crucial for them to be stable or at least forecastable (Ord et al.).

The additional properties of CES become more apparent after regrouping the elements of (6):
$$ \hat{y}_t = \alpha_0\, y_{t-1} + (1-\alpha_0)\,\hat{y}_{t-1} - \alpha_1\, e_{t-1} - (1-\alpha_1)\,\hat{e}_{t-1} . \tag{18} $$
When $\alpha_1$ is close to 1, the influence of $\hat{e}_{t-1}$ on $\hat{y}_t$ becomes minimal and the second smoothing parameter $\alpha_0$ in (18) behaves similarly to the smoothing parameter in SES: $\alpha_0 - 1$ in CES corresponds to $\alpha$ in SES. For example, the value $\alpha_0 = 1.24$ in CES will be equivalent to $\alpha = 0.24$ in SES.

Similar conclusions can be made by comparing ETS(A,N,N), which underlies SES, with CES in the state space form (7), assuming that $\alpha_1 = 1$, which leads to:
$$ \begin{aligned} y_t &= l_{t-1} + \epsilon_t \\ l_t &= l_{t-1} + (\alpha_0 - 1)\,\epsilon_t \\ c_t &= l_{t-1} + (1-\alpha_0)\, c_{t-1} + (\alpha_0 + 1)\,\epsilon_t . \end{aligned} \tag{19} $$
The data generated using (19) will resemble the data generated using ETS(A,N,N).
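
The following R sketch (ours; parameter values are arbitrary) illustrates this correspondence by simulating data from (19) with $\alpha_0 = 1.3$ and fitting SES to the result, which should recover a smoothing parameter close to $\alpha_0 - 1 = 0.3$:

```r
# Simulate CES with alpha1 = 1 and fit SES (HoltWinters without trend/season);
# the estimated SES smoothing parameter should be close to alpha0 - 1.
set.seed(123)
n <- 1000
alpha0 <- 1.3; alpha1 <- 1
eps <- rnorm(n, 0, 10)
l <- 100; c <- 100
y <- numeric(n)
for (t in 1:n) {
  y[t] <- l + eps[t]
  lNew <- l - (1 - alpha1) * c + (alpha0 - alpha1) * eps[t]
  c    <- l + (1 - alpha0) * c + (alpha0 + alpha1) * eps[t]
  l    <- lNew
}
fit <- HoltWinters(ts(y), beta = FALSE, gamma = FALSE)
fit$alpha    # close to 0.3
```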

The insight from this is that when the series is stationary and the optimal smoothing parameter in SES should be close to zero, the optimal $\alpha_0$ in CES will be close to one. At the same time, the real part of the complex smoothing parameter will become close to 2 when the series corresponds to a random walk process.

CES can also be adapted to seasonal data by applying the same mechanism with a lag $t - m$ instead of $t - 1$, where $m$ is the seasonal frequency. This is similar to the reduced seasonal exponential smoothing forms of Snyder and Shami (2001). In order to model both the seasonal and non-seasonal parts, we extend the original model (7) with a seasonal part, leading to the seasonal CES model (20). Model (20) can still be written in the conventional state-space form (8).

It exhibits several differences from conventional seasonal exponential smoothing models.

When it comes to the estimation of this model, it can be done using the same principles as discussed earlier, while the stability condition can be checked by making sure that all the eigenvalues of the corresponding discount matrix lie inside the unit circle. The restrictions on the parameters become more complicated than in the case of the non-seasonal model and cannot be easily analysed.

Finally, using the likelihood (9), the Akaike Information Criterion for both the seasonal and the non-seasonal CES models can be calculated based on formula (1), with $k$ being the number of estimated parameters of the respective model (a short illustrative computation of the AIC is given below).

Although we prefer to use R for data analysis, it should be noted that CES may also be implemented in Excel, with model fitting done using Solver. As the only model selection question is between the seasonal and non-seasonal forms, implementation in a spreadsheet is simpler than for ARIMA or ETS. In order to see how CES performs, we also evaluate several benchmark methods.
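
As a rough illustration (ours), the AIC can be computed from the negative log-likelihood obtained in the estimation sketch above; the parameter count used here is our assumption for the non-seasonal model (two smoothing parameters, two initial states and the error variance), not a value taken from the paper. A packaged implementation, such as the ces() function in the smooth package for R, could be used instead of this manual sketch.

```r
# Rough AIC for the non-seasonal CES, continuing the earlier estimation sketch.
# The parameter count k is our assumption (alpha0, alpha1, l_0, c_0, sigma^2).
negLogLik <- fit$value      # minimised negative log-likelihood from optim()
k <- 5
aicCES <- 2 * k + 2 * negLogLik
aicCES
```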

• MSIS - Mean Scaled Interval Score (a short computational illustration is given further below):
$$ \text{MSIS} = \frac{\frac{1}{h} \sum_{j=1}^{h} \left[ (u_{t+j} - l_{t+j}) + \frac{2}{\alpha}(l_{t+j} - y_{t+j})\, \mathbb{1}\{y_{t+j} < l_{t+j}\} + \frac{2}{\alpha}(y_{t+j} - u_{t+j})\, \mathbb{1}\{y_{t+j} > u_{t+j}\} \right]}{\frac{1}{t-m} \sum_{j=m+1}^{t} |y_j - y_{j-m}|} , $$
where $\mathbb{1}\{\cdot\}$ is the indicator function, $u_{t+j}$ is the upper bound and $l_{t+j}$ is the lower bound of the prediction interval and $\alpha$ is the significance level. We use the 95% prediction interval in our evaluation ($\alpha = 0.05$). This measure rewards narrow intervals, but penalises them whenever the actual values lie outside the bounds.

We use the post-hoc Nemenyi test (Demšar, 2006), which relies on the ranks of error measures, via the rmcb() function in the greybox package (Svetunkov, 2021a) for R. This reveals whether there is evidence that the reported differences in accuracy between forecasts are statistically significant.

The results are summarised in Table 1. An example time series with CES and ETS forecasts is shown in Figure 5. In order to determine whether the observed differences in accuracy between forecasts are statistically significant, we test them using the non-parametric Nemenyi test. The results at the 5% significance level are presented in Figure 6.

In addition, we explore the performance on the remaining seasonal time series (2053 time series). In this case, the ETS models allow for seasonality.
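
Following up on the MSIS definition above, here is a small R sketch (ours, with artificial numbers) of the calculation for a single series, assuming a seasonal lag $m$ and externally supplied interval bounds:

```r
# MSIS for one series: scaled by the in-sample seasonal naive MAE.
msis <- function(holdout, lower, upper, insample, m, alpha = 0.05) {
  scale <- mean(abs(diff(insample, lag = m)))
  score <- (upper - lower) +
    (2 / alpha) * (lower - holdout) * (holdout < lower) +
    (2 / alpha) * (holdout - upper) * (holdout > upper)
  mean(score) / scale
}

# Artificial example: monthly data, 12-step-ahead 95% interval
set.seed(1)
insample <- rnorm(48, 100, 5)
holdout  <- rnorm(12, 100, 5)
msis(holdout, lower = rep(90, 12), upper = rep(110, 12), insample, m = 12)
```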

Note that the split of this subset of series was done according to ETS, making these the optimally selected models, given any trend restrictions we have imposed. The results of this experiment are shown in Table 5.

We do not claim that CES is appropriate for every single forecasting scenario. Nonetheless, we provide evidence that it is a robust approach.

Appendix A. State space form of CES

Any complex variable can be represented as a vector or as a matrix:
$$ x + iy \;\Longleftrightarrow\; \begin{pmatrix} x \\ y \end{pmatrix} \quad \text{or} \quad x + iy \;\Longleftrightarrow\; \begin{pmatrix} x & -y \\ y & x \end{pmatrix} . \tag{A.1} $$
The general CES model (5) can be split into two parts, the measurement and the transition equations, using (A.1), which gives (A.2). Regrouping the elements of the transition equation in (A.2), and then grouping the vectors of the actual value and the level component with the complex smoothing parameter, and the level and non-linear trend components with each other, leads to (A.4). The difference between the actual value and the level in (A.4) is the forecast error: $y_t - l_{t-1} = e_t$. If we consider the model that might generate the data for CES, then we should assume that there is no misspecification error, so that $e_t = \epsilon_t$. Using this and making several transformations gives a state space model in which one of the measurement equations is not needed for forecasting and can be discarded, which results in the final state space model given in (7).

Appendix B. The underlying ARMA model of CES

The non-linear trend component can be calculated using the second equation of (7), assuming $e_t = \epsilon_t$, in the following way:
$$ c_{t-1} = \frac{l_{t-1} - l_t + (\alpha_0 - \alpha_1)\,\epsilon_t}{1-\alpha_1} . \tag{B.1} $$
Inserting (B.1) into the third equation of (7) leads to (B.2). Multiplying both parts of (B.2) by $-(1-\alpha_1)$ and taking one lag back, then opening the brackets, transferring all the level components to the left hand side and all the error terms and non-linear trend components to the right hand side and regrouping the elements, gives (B.4). Now making the substitution $l_t = y_{t+1} - \epsilon_{t+1}$ in (B.4), taking one more lag back and regrouping the error terms once again leads to (B.5). The resulting model (B.5) is ARMA(2,2):
$$ y_t = (2-\alpha_0)\, y_{t-1} - (2-\alpha_0-\alpha_1)\, y_{t-2} + \theta_{1,1}\,\epsilon_{t-1} + \theta_{1,2}\,\epsilon_{t-2} + \epsilon_t , $$
where $\theta_{1,1} = 2\alpha_0 - \alpha_1 - 2$ and $\theta_{1,2} = \alpha_0^2 + \alpha_1^2 - 3\alpha_0 - \alpha_1 + 2$. In a similar manner, using (7), it can be shown that the imaginary part of the series has an underlying ARMA model with the error term $\xi_t = \epsilon_t - c_{t-1}$ and parameters $\theta_{2,1} = 2 + \alpha_1$ and $\theta_{2,2} = \alpha_0 - \alpha_1 - 2$.
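
As a numerical sanity check of this reduced form (ours, not part of the paper; the parameter values are arbitrary), one can simulate the CES recursion (7) and the implied ARMA(2,2) process from the same error series and verify that, after a burn-in period, the two trajectories coincide:

```r
# Compare data generated by the CES recursion (7) with the implied ARMA(2,2).
set.seed(1)
alpha0 <- 1.4; alpha1 <- 0.9
n <- 500
eps <- rnorm(n)

# CES recursion
l <- 50; c <- 50
yCES <- numeric(n)
for (t in 1:n) {
  yCES[t] <- l + eps[t]
  lNew <- l - (1 - alpha1) * c + (alpha0 - alpha1) * eps[t]
  c    <- l + (1 - alpha0) * c + (alpha0 + alpha1) * eps[t]
  l    <- lNew
}

# Implied ARMA(2,2) with the coefficients derived in Appendix B
phi   <- c(2 - alpha0, -(2 - alpha0 - alpha1))
theta <- c(2 * alpha0 - alpha1 - 2, alpha0^2 + alpha1^2 - 3 * alpha0 - alpha1 + 2)
yARMA <- numeric(n)
for (t in 3:n) {
  yARMA[t] <- phi[1] * yARMA[t - 1] + phi[2] * yARMA[t - 2] +
    eps[t] + theta[1] * eps[t - 1] + theta[2] * eps[t - 2]
}

# Differences due to initialisation die out for a stationary parameter set
max(abs(yCES[300:n] - yARMA[300:n]))
```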

Appendix C. Stationarity condition for CES

The analysis of equation (10) shows that the eigenvalues can be either real or complex. In the case of real eigenvalues, they need to be less than one in absolute value, so that the corresponding forecasting trajectory is stationary and exponentially decreasing. This means that the following condition must be satisfied:
$$ \left| \frac{(2-\alpha_0) \pm \sqrt{(2-\alpha_0)^2 - 4(2-\alpha_0-\alpha_1)}}{2} \right| < 1 \quad \text{and} \quad (2-\alpha_0)^2 - 4(2-\alpha_0-\alpha_1) \geq 0 . \tag{C.1} $$
Noting that $(2-\alpha_0)^2 - 4(2-\alpha_0-\alpha_1) = \alpha_0^2 + 4\alpha_1 - 4$, the first inequality in (C.1) leads to the following system of inequalities:
$$ \begin{cases} \sqrt{\alpha_0^2 + 4\alpha_1 - 4} > \alpha_0 - 4 \\ \sqrt{\alpha_0^2 + 4\alpha_1 - 4} < \alpha_0 \\ \sqrt{\alpha_0^2 + 4\alpha_1 - 4} < 4 - \alpha_0 \\ \sqrt{\alpha_0^2 + 4\alpha_1 - 4} > -\alpha_0 \end{cases} \tag{C.2} $$
The analysis of (C.2) shows that if $\alpha_0 > 4$, then the third inequality is violated, and if $\alpha_0 < 0$, then the second inequality is violated. This means that the condition $\alpha_0 \in (0, 4)$ is crucial for the stationarity of CES. It also means that the first and the fourth inequalities in (C.2) are always satisfied.

Furthermore, the second inequality can be transformed into:
$$ \alpha_0^2 + 4\alpha_1 - 4 < \alpha_0^2 , $$
which after simple cancellations leads to:
$$ \alpha_1 < 1 . $$
The other important result follows from the third inequality in (C.2), which can be derived using the condition $\alpha_0 \in (0, 4)$:
$$ \alpha_0^2 + 4\alpha_1 - 4 < (4-\alpha_0)^2 , $$
which implies that:
$$ 2\alpha_0 + \alpha_1 < 5 . $$
Uniting all these conditions and taking into account the second inequality in (C.1), CES will produce a stationary exponential trajectory when:
$$ \begin{cases} 0 < \alpha_0 < 4 \\ \alpha_1 < 1 \\ 2\alpha_0 + \alpha_1 < 5 \\ \alpha_1 \geq \frac{(2-\alpha_0)(2+\alpha_0)}{4} \end{cases} $$
The other possible situation is when the second part of the inequality (C.1) is violated, which leads to complex eigenvalues, meaning that a harmonic forecasting trajectory is produced. CES can still be stationary if both eigenvalues in (10) have absolute values less than one, meaning that:
$$ \sqrt{\Re(\lambda)^2 + \Im(\lambda)^2} < 1 , $$
where $\Re(\lambda)$ is the real part and $\Im(\lambda)$ is the imaginary part of $\lambda$. This in turn means the satisfaction of the following condition:
$$ \left(\frac{2-\alpha_0}{2}\right)^2 + \frac{4 - \alpha_0^2 - 4\alpha_1}{4} < 1 , $$
or:
$$ 2 - \alpha_0 - \alpha_1 < 1 , $$
which, after simplification and taking into account that $2 - \alpha_0 - \alpha_1 = |\lambda|^2 \geq 0$, leads to $1 < \alpha_0 + \alpha_1 \leq 2$. The full condition that leads to the harmonic stationary trajectory of CES is:
$$ \begin{cases} \alpha_0^2 + 4\alpha_1 - 4 < 0 \\ \alpha_0 + \alpha_1 > 1 \\ \alpha_0 + \alpha_1 \leq 2 \end{cases} \tag{C.11} $$
The first inequality in (C.11) can be rewritten as a difference of squares:
$$ \alpha_1 < \frac{(2-\alpha_0)(2+\alpha_0)}{4} . \tag{C.12} $$
Comparing the right hand side of (C.12) with the right hand side of the third inequality in (C.11), written as $\alpha_1 \leq 2 - \alpha_0$, it can be shown that there is only one point where both of these inequalities lead to the same constraint: $\alpha_0 = 2$. This is because the line $\alpha_1 = 2 - \alpha_0$ is a tangent line to the function $\alpha_1 = \frac{(2-\alpha_0)(2+\alpha_0)}{4}$ at this point. In all other cases the right hand side of (C.12) is less than the right hand side of the third inequality in (C.11).
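
As an informal numerical check of these derivations (ours, not part of the paper), one can scan a grid of $(\alpha_0, \alpha_1)$ values, compute the eigenvalues of the transition matrix $F$ directly, and compare the result with the combined conditions derived above:

```r
# Compare the eigenvalue-based stationarity check with the derived conditions.
isStationaryEig <- function(alpha0, alpha1) {
  Fmat <- matrix(c(1, 1, -(1 - alpha1), 1 - alpha0), nrow = 2)
  all(Mod(eigen(Fmat)$values) < 1)
}
isStationaryCond <- function(alpha0, alpha1) {
  disc <- alpha0^2 + 4 * alpha1 - 4
  exponential <- disc >= 0 & alpha0 > 0 & alpha0 < 4 & alpha1 < 1 &
    (2 * alpha0 + alpha1) < 5
  harmonic <- disc < 0 & (alpha0 + alpha1) > 1 & (alpha0 + alpha1) <= 2
  exponential | harmonic
}
grid <- expand.grid(alpha0 = seq(-1, 5, 0.05), alpha1 = seq(-1, 3, 0.05))
agree <- mapply(function(a0, a1) {
  isStationaryEig(a0, a1) == isStationaryCond(a0, a1)
}, grid$alpha0, grid$alpha1)
mean(agree)    # should be (very close to) 1, up to boundary rounding
```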

Appendix E. The underlying SARIMA model of the seasonal CES

Observe that the lags of the non-seasonal and seasonal parts in (E.1) differ, which leads to splitting the state-space model into two parts; uniting these parts leads back to the conventional state-space form. Inserting the values of the vectors and multiplying the matrices leads to (E.7). Substituting the values of the matrices in (E.7) gives (E.8). The inverse of the first matrix in (E.8) is given in (E.9); similarly, the inverse of the second matrix is given in (E.10). Inserting (E.9) and (E.10) into (E.8), after cancellations and regrouping of elements, leads to (E.11). Unfortunately, there is no way to simplify (E.11) into a compact form, so the final model corresponds to SARIMA(2, 0, 2m + 2)(2, 0, 0)$_m$.