Uncovering Hidden Insights with Long-Memory Process Detection: An In-Depth Overview

Abstract: Long-memory models are frequently used in finance and other fields to capture long-range dependence in time series data. However, correctly identifying whether a process has long memory is crucial. This paper highlights a significant limitation in using the sample autocorrelation function (ACF) to identify long-memory processes. While the ACF establishes the theoretical definition of a long-memory process, it is not possible to determine long memory by summing the sample ACFs. Hassani's −1/2 theorem demonstrates that the sum of the sample ACF is always −1/2 for any stationary time series of any length, rendering any diagnostic or analysis procedure that includes this sum open to criticism. The paper presents several cases, based on real and simulated time series, where discrepancies between the empirical and theoretical use of a long-memory process are evident. It is critical to be aware of this limitation when developing models and forecasting. Accurately identifying long-memory processes is essential in producing reliable predictions and avoiding incorrect model specification.


Introduction
Long-memory time series are characterized by an autocorrelation function (ACF) that decays to zero at a slow polynomial rate as the lag increases. Correlations between observations at different time steps therefore persist over a long period of time, leading to persistent patterns in the data. This property is useful in various fields, as it allows researchers to model the persistence of certain patterns in the data and make better predictions based on past observations (see, for example, Doukhan et al. 2003; Beran et al. 2013; Zivot and Wang 2013). The concept of long-memory time series was first introduced by (Wei 2006) and (Tsay 2010) (for recent research, refer to Beran 1994 and Das and Bhattacharya 2021).

The fractionally differenced process is a type of long-memory time series that is characterized by the fractional difference operator $(1 - B)^{\alpha}$, where $B$ is the backshift operator and $\alpha$ is the fractional differencing parameter. The process is defined by

$$(1 - B)^{\alpha} y_t = a_t, \qquad -0.5 < \alpha < 0.5, \tag{1}$$

where $y_t$ represents the observed time series and $a_t$ is a white noise series, that is, a series of uncorrelated random variables with a mean of zero and a constant variance. The fractional difference operator $(1 - B)^{\alpha}$ removes any short-term correlations in the time series and enhances the persistence of the long-term correlations, leading to a long-memory time series. The fractional differencing parameter $\alpha$ lies between −0.5 and 0.5 and determines the degree of differencing to be applied to the time series. The properties of model (1) have been widely studied in the literature (e.g., Hosking 1981). We summarize some of these properties below.
1. When $\alpha < 0.5$, the long-memory process $y_t$ is weakly stationary. This means that the mean, variance, and autocovariances of the process are constant over time. Additionally, the process has an infinite moving average (MA) representation. In other words, the process can be represented as an infinite weighted sum of current and past error terms, where the weights decay hyperbolically (not exponentially) as the lag increases. The weakly stationary property of $y_t$ is desirable, as it simplifies the analysis and modeling of the process.
2. When $\alpha > -0.5$, the long-memory process $y_t$ is invertible. This means that it has an infinite autoregressive (AR) representation: the white noise $a_t$ can be recovered from the current and past values of $y_t$ by applying the filter $(1 - B)^{\alpha}$, whose coefficients also decay hyperbolically as the lag increases. The invertibility property is useful in practical applications, as it allows the analyst to transform the long-memory process into a white noise series, which is easier to analyze and model.
3. When $-0.5 < \alpha < 0.5$, the autocorrelation function (ACF) of the long-memory process $y_t$ decays at a polynomial rate of $h^{2\alpha - 1}$ as the lag $h$ increases, leading to persistent patterns in the data and making the process a long-memory time series. This implies that the memory of the process decays very slowly, and the process can exhibit persistent trends and cycles that extend over long time horizons. The long-memory property of $y_t$ is important, as it captures the long-range dependence in the data that is often observed in financial, economic, and environmental time series. However, accurately detecting long-memory processes can be challenging, as the sample autocorrelation function may not be a reliable measure of long-range dependence.
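The three properties above can be illustrated numerically. The following is a minimal sketch, assuming the standard closed-form recursions for the fractionally differenced process (Hosking 1981); the function names `ma_weights`, `ar_weights`, and `theoretical_acf` and the value α = 0.3 are illustrative choices, not part of the original paper.

```python
import math


def ma_weights(alpha, n):
    """Weights psi_0..psi_{n-1} of (1 - B)^(-alpha): y_t = sum_j psi_j a_{t-j}."""
    psi = [1.0]
    for j in range(1, n):
        psi.append(psi[-1] * (j - 1 + alpha) / j)
    return psi


def ar_weights(alpha, n):
    """Weights pi_0..pi_{n-1} of (1 - B)^alpha: a_t = sum_j pi_j y_{t-j}."""
    pi = [1.0]
    for j in range(1, n):
        pi.append(pi[-1] * (j - 1 - alpha) / j)
    return pi


def theoretical_acf(alpha, max_lag):
    """rho(0..max_lag) via rho(h) = rho(h-1) * (h - 1 + alpha) / (h - alpha)."""
    rho = [1.0]
    for h in range(1, max_lag + 1):
        rho.append(rho[-1] * (h - 1 + alpha) / (h - alpha))
    return rho


alpha = 0.3  # illustrative value inside (-0.5, 0.5)
psi = ma_weights(alpha, 1001)
rho = theoretical_acf(alpha, 1000)

# Constant of the power-law approximation rho(h) ~ scale * h^(2*alpha - 1).
scale = math.gamma(1 - alpha) / math.gamma(alpha)
for h in (10, 100, 1000):
    approx = scale * h ** (2 * alpha - 1)
    print(f"h={h:5d}  rho(h)={rho[h]:.5f}  power law={approx:.5f}  psi_h={psi[h]:.5f}")
```

The printed columns show the ACF tracking the power law $h^{2\alpha-1}$, while the MA weights shrink only hyperbolically, which is what distinguishes this process from a short-memory ARMA process.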
If the sample autocorrelation function (ACF) of a time series decays slowly, it can indicate that the series has long memory. In this case, the ACF will not quickly approach zero as the lag increases, but will instead show persistent patterns in the data. However, it is important to keep in mind that other factors, such as the size of the sample or the presence of outliers, may also affect the behavior of the ACF and should be considered when making this determination. Ultimately, statistical tests and model-based approaches are often used to formally test for the presence of long memory in a time series. There are several methods that can be used to estimate the fractional differencing parameter α in the fractionally differenced model defined by Equation (1).

1. Maximum likelihood method: This method involves maximizing the likelihood function of the fractionally differenced model given the observed data, and then using the resulting estimates of the parameters to estimate α.

2. Regression method with logged periodogram at lower frequencies: This method involves regressing the log of the periodogram (a measure of the spectral density of the time series) at lower frequencies on the log of the frequency, and then using the slope of the regression line to estimate α.

These methods can provide a good starting point in estimating α in practice, although the choice of method will depend on the specifics of the problem at hand. For example, the maximum likelihood method may be preferred when the data are well behaved, while the regression method may be preferred when the data are noisy or contain outliers. Ultimately, the choice of method will also depend on the specific software or package being used to analyze the data.
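The regression method can be sketched as follows. This is an illustrative implementation under stated assumptions, not the estimator of any particular package: the series is simulated with a truncated MA approximation of model (1), the periodogram is evaluated at the first $\lfloor\sqrt{T}\rfloor$ Fourier frequencies (a common but arbitrary bandwidth), and, since the spectral density behaves like $\lambda^{-2\alpha}$ near zero, α is estimated as minus one half of the fitted slope. All function names are hypothetical.

```python
import cmath
import math
import random


def simulate_fractional_noise(alpha, T, rng):
    """Approximate (1 - B)^alpha y_t = a_t by an MA filter truncated at T lags."""
    psi = [1.0]
    for j in range(1, T):
        psi.append(psi[-1] * (j - 1 + alpha) / j)
    noise = [rng.gauss(0.0, 1.0) for _ in range(2 * T)]
    return [sum(psi[j] * noise[T + t - j] for j in range(T)) for t in range(T)]


def periodogram(y, m):
    """Periodogram at the Fourier frequencies 2*pi*j/T, j = 1..m."""
    T = len(y)
    ybar = sum(y) / T
    freqs, values = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / T
        dft = sum((y[t] - ybar) * cmath.exp(-1j * lam * t) for t in range(T))
        freqs.append(lam)
        values.append(abs(dft) ** 2 / (2.0 * math.pi * T))
    return freqs, values


def log_periodogram_estimate(freqs, pgram):
    """OLS slope of log I(lambda) on log(lambda); alpha_hat = -slope / 2."""
    x = [math.log(lam) for lam in freqs]
    z = [math.log(v) for v in pgram]
    xbar = sum(x) / len(x)
    zbar = sum(z) / len(z)
    sxz = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    sxx = sum((a - xbar) ** 2 for a in x)
    return -(sxz / sxx) / 2.0


rng = random.Random(12345)
y = simulate_fractional_noise(0.3, 1024, rng)  # true alpha = 0.3 (assumed)
freqs, pgram = periodogram(y, int(math.sqrt(len(y))))
alpha_hat = log_periodogram_estimate(freqs, pgram)
print(f"estimated alpha: {alpha_hat:.3f}")
```

With only about 32 frequencies the estimate is noisy, so a single run should be read as an order-of-magnitude check rather than a precise value.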
The sum of the sample autocorrelation function ($S_{ACF}$) has often been used as a diagnostic for long memory, as long-memory processes are characterized by the nonsummability of their theoretical autocovariance function. However, this may not always be the case in practice, as the sample ACF may behave differently from the theoretical autocovariance. This is why it is important to consider the implications of using the sample sum of the ACF as a diagnostic for long memory (Hurst 1951).
This study highlights the limitations of relying solely on the sample autocorrelations to diagnose the presence of long memory in a time series. The results demonstrate that some time series processes can exhibit misleading features in their sample autocorrelations, making it difficult to accurately identify the presence of long memory. These findings have significant implications for the routine use of the sum of the sample autocorrelations in practice and emphasize the need for more robust methods to detect long memory.

The structure of this paper is as follows. In Section 2, we examine the definition of long-memory processes and the sum of the sample autocorrelations, and we discuss the characteristics and relevant definitions of long memory. Section 3 focuses on well-known long-memory processes and provides an overview of key theoretical results, along with an examination of the implications of using the sample autocorrelation function; an illustrative example is included for further clarification. Section 4 presents a comparison of various approaches to long-memory detection, examining the strengths and limitations of each approach and their effectiveness in capturing long-range dependence in time series data. Section 5 expands on the findings and implications of this comparison, covering the theoretical foundations, practical considerations, and potential areas for improvement of each approach, their suitability in real-world scenarios, and the open research questions that need to be addressed. Finally, Section 6 presents the conclusions of the paper, summarizes the main findings and contributions, and outlines potential directions for future research.

ACF and Long-Term Memory Process
Autocovariance and autocorrelation functions are two fundamental concepts in time series analysis. Autocovariance measures the linear dependence between two observations of a time series at different time lags. Autocorrelation, on the other hand, measures the linear relationship between a time series and a delayed copy of itself. In this subsection, we provide a brief introduction to autocovariance and autocorrelation functions, and we define their mathematical properties.
In practice, we do not have access to the entire population of a time series, but rather to a sample of observations from the series. Therefore, we need to estimate the autocovariance and autocorrelation functions from the available data. In this subsection, we present an estimator for the autocovariance function based on a sample of observations from a stationary time series. We also discuss an alternative estimator and its properties. Finally, we define the autocorrelation function and provide an estimator for it based on the estimated autocovariance function.
We also explore the concept of long-term memory processes, which can be defined in various ways. One of the most commonly used definitions is based on the sum of the autocorrelation function, while others rely on the hyperbolic decay of the autocovariances or the power-law decay of the spectral density function. These definitions can provide insights into the behavior of time series data and help to identify long-memory patterns. Additionally, the Wold decomposition of a process can provide an alternative definition, emphasizing the role of past shocks or innovations in influencing the process's behavior over long periods. It is worth noting that these definitions are not necessarily equivalent and can be useful in different contexts. In the following sections, we delve into each of these definitions and their implications for long-term memory processes.

Sum of the Sample Autocorrelation Function
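The autocovariance function of a wide-sense stationary process $\{Y_t\}$ at lag $h$ is

$$R(h) = E\left[(Y_t - \mu_Y)(Y_{t+h} - \mu_Y)\right], \tag{2}$$

where $E$ is the expected value operator and $\mu_Y$ is the expected value of the variable $Y$.

In practical problems, we only have a set of data $Y_T = (y_1, \ldots, y_T)$; the following estimator can be considered as an estimate of $R(h)$:

$$\hat{R}(h) = \frac{1}{T - |h|} \sum_{t=1}^{T-|h|} (y_t - \bar{y})(y_{t+|h|} - \bar{y}), \tag{3}$$

where $\bar{y} = \frac{1}{T} \sum_{t=1}^{T} y_t$ is the sample mean, which is an unbiased estimator of $\mu_Y$. There is an alternative estimate of $R(h)$:

$$\hat{\lambda}(h) = \frac{1}{T} \sum_{t=1}^{T-|h|} (y_t - \bar{y})(y_{t+|h|} - \bar{y}). \tag{4}$$

The autocovariance $\hat{\lambda}(h)$ is biased because it uses the divisor $T$ rather than $T - |h|$, and it has a larger bias than $\hat{R}(h)$. The autocorrelation function, ACF, is given by

$$\rho(h) = \frac{R(h)}{R(0)}, \tag{5}$$

and an estimate of $\rho(h)$ is

$$\hat{\rho}(h) = \frac{\hat{\lambda}(h)}{\hat{\lambda}(0)} = \frac{\sum_{t=1}^{T-|h|} (y_t - \bar{y})(y_{t+|h|} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2}. \tag{6}$$

Theorem 1. The sum of the sample ACF, $S_{ACF}$, with lag $h \geq 1$ is always $-\frac{1}{2}$ for any stationary time series with arbitrary length $T \geq 2$:

$$S_{ACF} = \sum_{h=1}^{T-1} \hat{\rho}(h) = -\frac{1}{2}. \tag{7}$$

Proof. See (Hassani 2009).

The $S_{ACF}$ has the following properties:

1. It does not depend on the time series length $T$; $S_{ACF} = -\frac{1}{2}$ for $T \geq 2$. This property is interesting because it implies that the overall level of autocorrelation in a stationary time series, as measured by the sum of the sample ACF values, is not affected by the length of the time series. Whether the time series is very long or very short, this measure of the overall degree of temporal dependence in the data remains the same. This property can be useful in comparing the overall level of temporal dependence between different time series of varying lengths.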

2. The value of $S_{ACF}$ is equal to $-\frac{1}{2}$ for any stationary time series, regardless of the process that generated it. Thus, for example, $S_{ACF}$ for an ARMA(p, q) process of any order (p, q) is equal to $S_{ACF}$ for a Gaussian white noise process, and both are equal to $-\frac{1}{2}$. This result has important implications for autoregressive model building and forecasting. If we use the sample ACF to identify the order of an autoregressive model, the order may be detected improperly, since the sample ACF values are subject to this constraint whatever the true order is.

3. The values of $\hat{\rho}(h)$ are linearly dependent:

$$\hat{\rho}(i) = -\frac{1}{2} - \sum_{\substack{j=1 \\ j \neq i}}^{T-1} \hat{\rho}(j), \qquad i = 1, \ldots, T-1. \tag{8}$$

This equation shows that the value of $\hat{\rho}(i)$ can be expressed as a linear combination of the other sample ACF values, with a constant term of $-\frac{1}{2}$. In other words, the sample ACF values are not independent of each other, but, rather, they are related to each other in a systematic way. This property is a direct consequence of Theorem 1: once the sample ACF values at all lags but one are known, the remaining value is determined by this linear relationship.

4. There is at least one negative $\hat{\rho}(h)$ for any stationary time series, even for AR(p) with a positive ACF (Hassani 2010). An AR model is a popular class of linear models for time series data, where the value of a variable at time $t$ depends linearly on its own past values, up to a certain number of lagged observations (specified by the model order p). When the data come from a stationary AR process with positive autocorrelations, the sample ACF values are typically positive for the first few lags, indicating some degree of autocorrelation in the data. Nevertheless, because the sample ACF values at lags 1 to $T - 1$ sum to $-\frac{1}{2}$, they cannot all be non-negative, so at least one of them must be negative, even when every theoretical autocorrelation of the process is positive. Negative sample ACF values at some lags are therefore a property of the estimator and do not, by themselves, indicate negative dependence in the underlying process.
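A short sketch of why Theorem 1 and the properties above hold may be helpful. It assumes that the sample ACF is computed with the common denominator $\sum_{t}(y_t - \bar{y})^2$ and that the series is not constant; the complete proof is given in (Hassani 2009).

```latex
% Sample ACF with a common denominator (assumed form of the estimator):
%   \hat{\rho}(h) = \sum_{t=1}^{T-h} (y_t - \bar{y})(y_{t+h} - \bar{y})
%                   \Big/ \sum_{t=1}^{T} (y_t - \bar{y})^2 .
% Deviations from the sample mean always sum to zero, so:
\begin{align*}
0 &= \Big( \sum_{t=1}^{T} (y_t - \bar{y}) \Big)^{2}
   = \sum_{t=1}^{T} (y_t - \bar{y})^{2}
   + 2 \sum_{h=1}^{T-1} \sum_{t=1}^{T-h} (y_t - \bar{y})(y_{t+h} - \bar{y}), \\
% Dividing both sides by \sum_t (y_t - \bar{y})^2 > 0:
0 &= 1 + 2 \sum_{h=1}^{T-1} \hat{\rho}(h)
   \quad \Longrightarrow \quad
   \sum_{h=1}^{T-1} \hat{\rho}(h) = -\frac{1}{2}.
\end{align*}
```

The argument uses only the fact that deviations from the sample mean sum to zero, which is why the value does not depend on $T$ or on the generating process, and why at least one $\hat{\rho}(h)$ must be negative.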
The property of $S_{ACF}$ being constant and equal to $-\frac{1}{2}$ for any stationary time series has important implications for time series analysis and modeling (see, for example, Silva 2015 and Hassani et al. 2021).
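The constancy of $S_{ACF}$ is easy to check numerically. The sketch below is illustrative only: it assumes the sample ACF with the common denominator $\sum_t (y_t - \bar{y})^2$, and the three test series (white noise, an AR(1) series with coefficient 0.9, and a five-point toy series) are arbitrary choices rather than data from the paper.

```python
import random


def sample_acf(y):
    """Sample ACF at lags 0..T-1 with the common denominator sum (y_t - ybar)^2."""
    T = len(y)
    ybar = sum(y) / T
    dev = [v - ybar for v in y]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(T - h)) / denom for h in range(T)]


rng = random.Random(1)

# Gaussian white noise, T = 200.
white = [rng.gauss(0.0, 1.0) for _ in range(200)]

# AR(1) with coefficient 0.9 (all theoretical autocorrelations positive), T = 500.
ar1 = [0.0]
for _ in range(499):
    ar1.append(0.9 * ar1[-1] + rng.gauss(0.0, 1.0))

# A tiny deterministic series, to show that the length is irrelevant.
short = [3.0, 1.0, 4.0, 1.0, 5.0]

examples = (("white noise", white), ("AR(1), phi=0.9", ar1), ("five points", short))
for name, series in examples:
    rho = sample_acf(series)
    s_acf = sum(rho[1:])   # lags 1..T-1
    lowest = min(rho[1:])  # property 4: at least one value is negative
    print(f"{name:16s} T={len(series):4d}  S_ACF={s_acf:+.6f}  min rho={lowest:+.4f}")
```

Each line reports a sum of −0.5 up to floating-point rounding, together with a negative minimum, including for the AR(1) series whose theoretical ACF is positive at every lag.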

Long-Term Memory Process
The concept of long-memory processes can be defined in different ways. One common definition is based on the sum of the autocorrelation function,

$$\sum_{h=-\infty}^{\infty} |\rho(h)| = \infty, \tag{9}$$

which states that the sum of the absolute values of the autocorrelation coefficients is infinite. However, there are alternative definitions that can also capture long-memory behavior, such as the hyperbolic decay of the autocovariances:

$$R(h) \sim h^{2\alpha - 1}\, l_1(h) \quad \text{as } h \to \infty. \tag{10}$$

In this case, the autocovariances decrease at a rate of $h^{2\alpha - 1}$ as $h$ approaches infinity, where $\alpha$ is the long-memory parameter and $l_1(\cdot)$ is a slowly varying function.

Another approach to characterizing long-memory processes, which defines strong dependence in the frequency domain, is through their spectral density function $f(\lambda)$:

$$f(\lambda) \sim |\lambda|^{-2\alpha}\, l_2(1/|\lambda|) \quad \text{as } \lambda \to 0, \tag{11}$$

where $\alpha$ is the long-memory parameter and $l_2(\cdot)$ is a slowly varying function. This definition describes a spectral density function that follows a power law at small frequencies and highlights the relationship between the behavior of the process and the power of the spectral density function at low frequencies.

Additionally, the Wold decomposition of a process can provide an alternative definition of long-memory behavior. This definition characterizes the process as having a slow decay in its Wold representation, which is a linear combination of past innovations with decreasing weights. It emphasizes the role of past shocks or innovations in influencing the behavior of the process over long periods of time:

$$y_t = \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}, \qquad \psi_j \sim j^{\alpha - 1}\, l_3(j) \quad \text{for } j > 0, \tag{12}$$

where $l_3(\cdot)$ is a slowly varying function. These definitions are not necessarily equivalent; see (Ding et al. 1993; Doukhan et al. 2003).
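To make the nonsummability criterion concrete, the following sketch compares partial sums of the theoretical ACF of a fractionally differenced process with those of a short-memory AR(1) process. It assumes the standard closed-form ACF recursion for the fractionally differenced process (Hosking 1981); the parameter values α = 0.3 and φ = 0.9 and the function names are illustrative, and only the one-sided sum over $h \geq 1$ is computed.

```python
def fd_acf(alpha, max_lag):
    """Theoretical ACF of (1 - B)^alpha y_t = a_t for lags 0..max_lag."""
    rho = [1.0]
    for h in range(1, max_lag + 1):
        rho.append(rho[-1] * (h - 1 + alpha) / (h - alpha))
    return rho


def partial_sums(rho, checkpoints):
    """Sum of |rho(h)| over h = 1..H for each H in checkpoints."""
    wanted = set(checkpoints)
    out, running = {}, 0.0
    for h in range(1, len(rho)):
        running += abs(rho[h])
        if h in wanted:
            out[h] = running
    return out


checkpoints = (10, 100, 1000, 10000, 100000)
long_memory = partial_sums(fd_acf(0.3, 100000), checkpoints)
short_memory = partial_sums([0.9 ** h for h in range(100001)], checkpoints)

for H in checkpoints:
    print(f"H={H:6d}  fractional (alpha=0.3): {long_memory[H]:9.2f}"
          f"   AR(1) (phi=0.9): {short_memory[H]:7.4f}")
```

The AR(1) partial sums settle at $\phi/(1-\phi) = 9$, whereas the long-memory partial sums keep growing roughly like $H^{2\alpha}$, which is the behavior that the theoretical definition in Equation (9) describes.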

Empirical versus Theoretical Results
Let us first briefly consider some widely used long-term memory models.

Selected Long-Term Memory Models
• GARMA(p,q), which stands for Generalized Autoregressive Moving Average with Conditional Heteroscedasticity, is a type of time series model that combines the features of both the ARMA and GARCH models. This model is suitable for analyzing time series data with a non-constant mean and variance. The GARMA(p,q) model includes both autoregressive and moving average components as well as a conditional heteroscedasticity term, which captures the time-varying volatility in the data. This allows for better modeling and forecasting of time series data that exhibit changes in volatility over time. GARMA(p,q) has been applied in various fields for the modeling and forecasting of time series data with changing volatility patterns. For example, it has been used to model stock market returns, exchange rates, and weather data. The GARMA(p,q) model is defined as follows:

$$g(\mu_t) = X_t^{\top} \beta + \tau_t, \tag{13}$$

$$\tau_t = \sum_{j=1}^{p} \phi_j A(y_{t-j}, X_{t-j}, \beta) + \sum_{j=1}^{q} \theta_j M(y_{t-j}, M_{t-j}), \tag{14}$$

where $\tau_t$ is the AR and MA component; $A$ is the function that represents an autoregressive form; $M$ is the function that represents a moving average form; $\phi_j$ is the autoregressive parameter at $j$; and $\theta_j$ is the moving average parameter at $j$.

The above GARMA(p,q), as defined by Equation (13), specifies a linear regression of a function $g(\mu_t)$ on a set of predictor variables $X_t$ and a set of unknown parameters $\beta$. The error term $\tau_t$ in Equation (13) is decomposed into an autoregressive (AR) component and a moving average (MA) component, which are specified in Equation (14). The function $A(y_{t-j}, X_{t-j}, \beta)$ in Equation (14) represents the autoregressive component of the model, where $y_{t-j}$ is the value of the time series at lag $j$, and $X_{t-j}$ is the corresponding vector of exogenous variables. The autoregressive parameter at lag $j$ is denoted by $\phi_j$, which represents the impact of the lagged value of the time series on the current value, conditional on the values of the exogenous variables.
The function $M(y_{t-j}, M_{t-j})$ in Equation (14) represents the moving average component of the model, where $M_{t-j}$ is the set of lagged moving average errors. The moving average parameter at lag $j$ is denoted by $\theta_j$, which represents the impact of the lagged moving average error on the current value of the time series.

• Integrated GARCH (IGARCH) is a type of time series model that is widely used to model financial and economic data. It is an extension of the GARCH model that accounts for the persistence of shocks in financial markets. In the IGARCH model, the past conditional variances of the series are included as predictors of the current conditional variance. This allows the model to capture the long-memory effect, where shocks have a persistent effect on future variance. IGARCH has been applied in various fields for the modeling and forecasting of time series data with persistence in volatility. For example, it has been used to model stock market returns, exchange rates, interest rates, and commodity prices. The IGARCH model is particularly useful for risk management, portfolio optimization, and option pricing. The IGARCH(1,1) model is given by

$$a_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \beta_1 \sigma_{t-1}^2 + (1 - \beta_1) a_{t-1}^2,$$

where $\{\varepsilon_t\}$ is a sequence of iid random variables with zero mean and unit variance and $0 < \beta_1 < 1$.

• ARCH(∞) is a type of time series model that extends the ARCH model to include an infinite number of lags in the conditional variance equation. This allows the model to capture long memory in the volatility of financial and economic time series data. The ARCH(∞) model is based on the idea that past shocks can have a persistent effect on the variance of the series over an infinite time horizon. The model can be estimated using maximum likelihood methods and has been shown to provide a better fit to financial data than finite-order ARCH models. The ARCH(∞) model has been applied in various fields for the modeling and forecasting of time series data with long memory in volatility. For example, it has been used to model stock market returns, exchange rates, and interest rates.
The model is particularly useful in finance for risk management, portfolio optimization, and option pricing. However, the estimation of the model can be computationally intensive, and the interpretation of the infinite number of parameters can be challenging. The process $\{\varepsilon_t\}$ is said to be an ARCH(∞) whenever

$$E(\varepsilon_t \mid \psi_{t-1}) = 0, \qquad \operatorname{Var}(\varepsilon_t \mid \psi_{t-1}) = \sigma_t^2,$$

with

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{\infty} \alpha_i \varepsilon_{t-i}^2,$$

where $\alpha_i \geq 0$ and $\psi_t$ represents the information set of all information up to time $t$, i.e., $\psi_t = \sigma(\varepsilon_t, \varepsilon_{t-1}, \ldots)$.

• LARCH(∞) and LARCH+(∞) are two types of time series models that extend the ARCH and GARCH models to allow for long memory in the conditional variance equation. LARCH(∞) is an extension of the ARCH model that includes an infinite number of lagged squared residuals in the variance equation. This allows the model to capture long memory in the volatility of time series data. LARCH+(∞) is an extension of the GARCH model that includes both an infinite number of lagged squared residuals and an infinite number of lagged conditional variances in the variance equation. This allows the model to capture long memory and the persistence of shocks in financial markets. Both LARCH(∞) and LARCH+(∞) have been applied in various fields for the modeling and forecasting of time series data with long memory in volatility. The models are particularly useful in finance for risk management, portfolio optimization, and option pricing. However, the estimation of these models can be computationally intensive, and the interpretation of the infinite number of parameters can be challenging. The LARCH model can be described as

$$y_t = \sigma_t \varepsilon_t, \qquad \sigma_t = a + \sum_{j=1}^{\infty} b_j y_{t-j},$$

where $\{\varepsilon_t, t \in \mathbb{Z}\}$ are iid random variables with zero mean and unit variance, and $a$ and $b_j$ are real coefficients.

• Stochastic volatility (SV) models are a type of time series model that allow the volatility of financial or economic time series data to vary over time in a random or stochastic manner. These models are based on the idea that the volatility itself is a random process that follows a certain distribution.
In an SV model, the conditional variance of the series is modeled as a function of its past values, as well as a random process that represents the stochastic component of the volatility. This allows the model to capture the time-varying nature of the volatility in the data. SV models have been widely used in finance and economics for the modeling and forecasting of time series data with changing volatility. For example, they have been used to model stock prices, exchange rates, and interest rates, and are particularly useful for pricing options and other financial derivatives. The models are also used for risk management and portfolio optimization, as they allow for the more accurate estimation of risk measures such as the value at risk (VaR) and expected shortfall (ES). However, the estimation of SV models can be computationally intensive, and the interpretation of the random component of the volatility can be challenging. The SV model is defined as follows:

$$y_t = \sigma_t \varepsilon_t, \qquad h_t = \mu + \phi (h_{t-1} - \mu) + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2),$$

where $\sigma_t^2 = \exp(h_t)$ is the volatility of $y_t$ and $\varepsilon_t$ is a noise term with zero mean and unit variance. The log volatility $h_t$ is specified by the AR(1) process with Gaussian innovation noise.

• Autoregressive Fractionally Integrated Moving Average (ARFIMA) and Generalized Autoregressive Conditional Heteroskedasticity (GARCH) are two widely used time series models in finance and economics. ARFIMA models are used to model time series data that exhibit long memory or fractional integration, meaning that the autocorrelation of the series declines very slowly. These models extend the ARIMA models by incorporating fractional differencing, which allows them to capture the long-memory effect. GARCH models, on the other hand, are used to model time series data that exhibit heteroskedasticity or volatility clustering, meaning that the variance of the series changes over time. These models extend the ARCH models by incorporating autoregressive components in the conditional variance equation, allowing them to capture the persistence of shocks in the data.

Both the ARFIMA and GARCH models have various applications in finance and economics. ARFIMA models are particularly useful in the modeling and forecasting of financial and economic time series with long memory, such as stock prices, exchange rates, and interest rates. GARCH models are widely used in risk management and portfolio optimization, as they allow for the more accurate estimation of risk measures such as the value at risk (VaR) and expected shortfall (ES). They are also used in option pricing and volatility forecasting. However, the estimation of these models can be computationally intensive, and the interpretation of the parameters can be challenging. An ARFIMA process $\{y_t\}$ may be defined by

$$\phi(B)\, y_t = \theta(B) (1 - B)^{-\alpha} \varepsilon_t,$$

where $\phi(B) = 1 + \phi_1 B + \cdots + \phi_p B^p$ and $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$ are the autoregressive and moving average operators, respectively. $(1 - B)^{-\alpha}$ is a fractional differencing operator defined by the binomial expansion

$$(1 - B)^{-\alpha} = \sum_{j=0}^{\infty} \eta_j B^j, \qquad \eta_j = \frac{\Gamma(j + \alpha)}{\Gamma(j + 1)\, \Gamma(\alpha)},$$

for $\alpha < \frac{1}{2}$, $\alpha \neq 0, -1, -2, \ldots$, and $\{\varepsilon_t\}$ is a white noise sequence with finite variance.
• A CAR(1) model, also known as a Conditional Autoregressive Model of Order 1, is a time series model that describes the dependence between observations in a series over time. In this model, each observation in the series is assumed to be a function of the previous observation and a random error term. The term "conditional" in CAR(1) refers to the fact that the current observation is conditional on the previous observation. CAR(1) models are widely applied in econometrics, finance, and engineering for the forecasting and analysis of time series data. They are particularly useful in modeling and forecasting stock prices, exchange rates, and interest rates. They can also be used in modeling natural phenomena, such as climate patterns or population growth. The CAR model explains the observations with p fixed effects and n spatial random effects:

$$y \mid \beta, s \sim N\left(X\beta + s,\ (\tau_0 I_n)^{-1}\right),$$

with the random effects $s$ having precision matrix $Q$, where $\tau_0 I_n$ and $Q = \sum_{j=1}^{F-1} \tau_j Q_j$ are precision matrices, observations $y$ and random effects $s$ are $n \times 1$, design matrix $X$ is $n \times p$, and the fixed effect regression parameter vector $\beta$ is $p \times 1$.

Table 1 presents a comparison of theoretical and empirical results for the definition of long memory based on the sum of the sample autocorrelation function. Table 1 lists different financial models for long-memory time series data. Each model has a theoretical expectation for the long-memory process, represented by either $\sum_{h=-\infty}^{\infty} |\rho(h)| \to \infty$ as $T \to \infty$ or $f(\lambda) \to \infty$ as $\lambda \to 0$. The table also lists the empirical results for each model, represented by $\sum_{h=-(T-1)}^{T-1} \hat{\rho}(h) = 0$ or $\hat{f}(0) = 0$. It is evident from the table that while the theoretical results based on the ACF or spectral density are infinite, the empirical spectral density or sum of the empirical ACF is finite and zero. This significant discrepancy between the two makes the detection of long memory misleading. The models presented in this table are examples of time series models that have been widely used in the literature (Hassani et al. 2012).

Table 2 presents the comparison between the theoretical and empirical results of long-memory process detection.
The theoretical results and the empirical results are presented in the two columns of the table. The first row shows that, according to the theoretical results, the sum of the absolute values of the autocorrelation coefficients (i.e., $\rho(h)$) approaches infinity as the number of observations ($T$) increases. However, the empirical results show that the sum of the absolute values of the estimated autocorrelation coefficients (i.e., $\hat{\rho}(h)$) has a finite upper limit as $T$ approaches infinity.

The second row presents a similar discrepancy between the theoretical and empirical results, with the theoretical results indicating that the sum of the autocorrelation coefficients approaches infinity as T approaches infinity, while the empirical results show that the sum of the estimated autocorrelation coefficients is equal to zero.
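The zero value of the two-sided sum can be verified directly, and the same identity explains why the periodogram vanishes at frequency zero. The sketch below is illustrative: it assumes the divisor-$T$ sample autocovariances and the usual periodogram normalization $|\sum_t (y_t - \bar{y}) e^{-i\lambda t}|^2 / (2\pi T)$, and it uses a simulated AR(1) series with coefficient 0.8 as arbitrary test data.

```python
import cmath
import math
import random


def sample_autocovariance(y):
    """Divisor-T sample autocovariances at lags 0..T-1."""
    T = len(y)
    ybar = sum(y) / T
    dev = [v - ybar for v in y]
    return [sum(dev[t] * dev[t + h] for t in range(T - h)) / T for h in range(T)]


def periodogram_at(y, lam):
    """Periodogram of the mean-centered series at frequency lam."""
    T = len(y)
    ybar = sum(y) / T
    dft = sum((y[t] - ybar) * cmath.exp(-1j * lam * t) for t in range(T))
    return abs(dft) ** 2 / (2.0 * math.pi * T)


rng = random.Random(7)
y = [0.0]
for _ in range(299):
    y.append(0.8 * y[-1] + rng.gauss(0.0, 1.0))

gamma = sample_autocovariance(y)
# Sum over h = -(T-1)..(T-1), using the symmetry of the autocovariances.
two_sided_sum = gamma[0] + 2.0 * sum(gamma[1:])
first_fourier = 2.0 * math.pi / len(y)

print(f"two-sided autocovariance sum : {two_sided_sum:.3e}")
print(f"periodogram at frequency 0   : {periodogram_at(y, 0.0):.3e}")
print(f"periodogram at 2*pi/T        : {periodogram_at(y, first_fourier):.3e}")
```

Both the two-sided sum and the periodogram at frequency zero are zero up to rounding error, while the periodogram at the first Fourier frequency is not, so the zero is a consequence of mean-centering and not of the dependence structure of the data.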
The third row compares the behavior of the theoretical and empirical spectral densities. The theoretical results show that the spectral density (i.e., $f(\lambda)$) approaches infinity as the frequency ($\lambda$) approaches zero, while the empirical results show that the estimated spectral density (i.e., $\hat{f}(\lambda)$) is equal to zero at zero frequency. Overall, the table shows that there are discrepancies between the theoretical and empirical results of long-memory process detection, indicating that the assumptions made in the theoretical analysis may not hold in practice.

Table 1. Some examples of long-memory time series: theoretical vs. empirical results.
No. | Model | Theoretical result | Empirical result | Reference
… | … | … | … | (… Bertail et al. 2006)
4 | I ARCH(∞) | $f(\lambda) \to \infty$ as $\lambda \to 0$ | $\hat{f}(0) = 0$ | (Teyssière and Kirman 2002)
5 | FIGARCH | $f(\lambda) \to \infty$ as $\lambda \to 0$ | $\hat{f}(0) = 0$ | (Zivot and Wang 2013)

Table 2. The theoretical and empirical results of a long-memory process.
Theoretical Results | Empirical Results
$\sum_{h=-\infty}^{\infty} |\rho(h)| \to \infty$ as $T \to \infty$ | $\sum_{h=-(T-1)}^{T-1} |\hat{\rho}(h)| < \infty$ as $T \to \infty$
$\sum_{h=-\infty}^{\infty} \rho(h) \to \infty$ as $T \to \infty$ | $\sum_{h=-(T-1)}^{T-1} \hat{\rho}(h) = 0$
$f(\lambda) \to \infty$ as $\lambda \to 0$ | $\hat{f}(0) = 0$

Table 3 provides a comparison of four popular time series models, namely ARFIMA, GARMA, IGARCH, and CAR(1), based on four important characteristics of time series data: long memory, stationarity, volatility clustering, and the autocorrelation function. ARFIMA and GARMA are both long-memory models, meaning that they can capture the long-range dependence present in time series data. However, neither of these models guarantees stationarity, which is a desirable property in many applications. On the other hand, IGARCH and CAR(1) are both stationary models, but they do not capture long-memory dependence. IGARCH is designed specifically to model volatility clustering, which is a common phenomenon in financial time series data. In contrast, CAR(1) assumes that autocorrelation decreases exponentially with the lag and does not account for volatility clustering.

Table 3 also shows that the autocorrelation functions of ARFIMA, GARMA, IGARCH, and CAR(1) all decrease as the lag increases, but at different rates. For ARFIMA and GARMA, the autocorrelation function decreases slowly to zero, which is indicative of the long-memory dependence captured by these models. In contrast, the autocorrelation functions of IGARCH and CAR(1) decrease exponentially, reflecting the short-range dependence present in these models. Understanding the properties of different time series models can help researchers and practitioners to choose the most appropriate model for their specific application and improve the accuracy of their forecasts.
Let us now consider the differences between the theoretical and empirical results of these models. Figure 1 presents 1000 realizations from the ARFIMA, GARMA, IGARCH, and CAR(1) processes and shows, for each model, the behavior of the empirical sum of the sample autocorrelation function (ACF) (see package "Hassani.SACF" in R). These models are widely used in practice to describe dependence in time series data. While the patterns of the sum of the sample ACF differ from model to model, they all ultimately converge to −1/2 as the sample size increases. This means that the sum of the sample ACF cannot serve as a measure of long memory, even though the theoretical definition of long memory is based on the ACF. This finding is significant because it suggests that relying solely on the sample ACF to identify long-memory processes can be misleading.
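The behavior of the sum of the sample ACF can be checked numerically. The following Python sketch is only an illustration and is not the "Hassani.SACF" package: the function names, the seed, the series lengths, and the truncated moving-average simulator for the fractionally differenced process are all assumptions made here. It computes the usual mean-centred sample ACF and sums it over all positive lags for a short-memory and a long-memory series.

```python
import numpy as np

def sample_acf(x):
    """Sample ACF at lags 0..n-1 (mean-centred, normalised by the lag-0 term)."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    n = d.size
    # 'full' mode covers lags -(n-1)..(n-1); index n-1 corresponds to lag 0
    acov = np.correlate(d, d, mode="full")[n - 1:]
    return acov / acov[0]

def sacf_sum(x):
    """Sum of the sample ACF over all positive lags 1..n-1."""
    return float(sample_acf(x)[1:].sum())

def simulate_fractional_noise(n, alpha, rng, burn=500):
    """Approximate (1 - B)^alpha y_t = a_t with a truncated MA(infinity) filter.

    Hypothetical helper: uses psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + alpha) / k.
    """
    m = n + burn
    k = np.arange(1, m)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + alpha) / k)))
    a = rng.standard_normal(m)          # white noise a_t
    return np.convolve(a, psi)[:m][burn:]

rng = np.random.default_rng(0)          # arbitrary seed
examples = {
    "white noise": rng.standard_normal(500),
    "fractional noise, alpha = 0.4": simulate_fractional_noise(500, 0.4, rng),
}
for name, series in examples.items():
    print(f"{name}: sum of sample ACF = {sacf_sum(series):.10f}")
```

Under these assumptions, both printed sums agree with −1/2 up to floating-point rounding, even though only the second series has long memory, which is why the sum carries no information about the memory of the process.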

Comparison of Parametric and Non-Parametric/Semi-Parametric Approaches for Long-Term Memory Time Series Detection
There are several approaches available for the detection of long-term memory in time series data. Broadly speaking, these approaches can be classified into two categories: parametric and non-parametric/semi-parametric.
Parametric approaches assume a specific functional form for the underlying process and estimate its parameters using a maximum likelihood method. These approaches are typically used when the underlying process can be well approximated by a known stochastic process, such as ARIMA, ARFIMA, or GARCH. Parametric approaches have the advantage of being computationally efficient and providing explicit statistical inference, such as hypothesis testing and confidence intervals. However, they may not be appropriate when the underlying process does not follow the assumed functional form, or when the data are contaminated with outliers or measurement errors.
Non-parametric/semi-parametric approaches, on the other hand, do not assume a specific functional form for the underlying process and estimate its properties using more flexible methods. These approaches include wavelet-based methods, periodogram-based methods, detrended fluctuation analysis (DFA), and local Whittle estimation. Non-parametric/semi-parametric approaches have the advantage of being more robust to deviations from the underlying assumptions and can capture more complex dependence structures. However, they may require more computational resources and provide less explicit statistical inference than parametric approaches. Table 4 summarizes the strengths and weaknesses of parametric and non-parametric/semi-parametric approaches for long-term memory time series detection.

Table 4. Strengths and weaknesses of parametric and non-parametric/semi-parametric approaches for long-term memory time series detection.
Columns: Parametric; Semi-Parametric; Non-Parametric
Basic assumption: Assumes specific probability distribution of errors

It is worth noting that both parametric and non-parametric/semi-parametric approaches have their own strengths and weaknesses. The choice of approach should depend on the specific characteristics of the data and the research question at hand. Researchers and practitioners should carefully evaluate the assumptions and limitations of each approach and select the most appropriate one for their application.
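To make the periodogram-based family concrete, the sketch below implements a log-periodogram regression in the style of Geweke and Porter-Hudak. It is one possible semi-parametric estimator chosen here for illustration, not a method prescribed in this paper. The function names, the bandwidth rule (the lowest n^power Fourier frequencies), the seed, and the truncated moving-average simulator are assumptions.

```python
import numpy as np

def gph_estimate(x, power=0.5):
    """Log-periodogram regression estimate of the memory parameter.

    Regresses log I(lambda_j) on -log(4 sin^2(lambda_j / 2)) over the lowest
    m = floor(n**power) Fourier frequencies; the slope estimates the parameter.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    m = int(n ** power)                      # assumed bandwidth rule
    j = np.arange(1, m + 1)
    lam = 2.0 * np.pi * j / n                # Fourier frequencies lambda_j
    dft = np.fft.rfft(x - x.mean())
    periodogram = np.abs(dft[j]) ** 2 / (2.0 * np.pi * n)
    regressor = -np.log(4.0 * np.sin(lam / 2.0) ** 2)
    slope, _intercept = np.polyfit(regressor, np.log(periodogram), 1)
    return float(slope)

def simulate_fractional_noise(n, alpha, rng, burn=500):
    """Hypothetical helper: truncated MA(infinity) form of (1 - B)^alpha y_t = a_t."""
    m = n + burn
    k = np.arange(1, m)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + alpha) / k)))
    return np.convolve(rng.standard_normal(m), psi)[:m][burn:]

rng = np.random.default_rng(0)               # arbitrary seed
short_memory = rng.standard_normal(4000)
long_memory = simulate_fractional_noise(4000, 0.4, rng)
print("white noise estimate:     ", gph_estimate(short_memory, power=0.6))
print("fractional noise estimate:", gph_estimate(long_memory, power=0.6))
```

Because the regression uses only frequencies j ≥ 1, it never touches the zero frequency, where the periodogram of a mean-centred series vanishes. The estimate is sensitive to the bandwidth choice, so the value of power should be treated as a tuning parameter rather than a recommendation.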

Discussion
The empirical results presented in Table 3 provide a comparison of four popular time series models: ARFIMA, GARMA, IGARCH, and CAR(1). These models were evaluated based on four important characteristics of time series data: long memory, stationarity, volatility clustering, and the autocorrelation function.
ARFIMA and GARMA were found to be long-memory models capable of capturing long-range dependence in time series data. However, neither of these models guarantees stationarity, which is a desirable property in many applications. On the other hand, IGARCH and CAR(1) were identified as stationary models, but they do not capture long-memory dependence. IGARCH is specifically designed to model volatility clustering, a common phenomenon observed in financial time series data. In contrast, CAR(1) assumes an exponentially decreasing autocorrelation function and does not account for volatility clustering. The theoretical and empirical results of these models were compared, as illustrated in Figure 1. The patterns of the sum of the sample ACF differed for each model, but all the models ultimately converged to −1/2 as the sample size increased. This implies that relying solely on the sum of the sample ACF as a measure of long memory can be misleading, because its behavior deviates from the theoretical definition of long memory, which is based on the ACF.
This finding has important implications as it highlights the limitation of using the sample ACF alone to identify long-memory processes. Researchers and practitioners should exercise caution when interpreting results solely based on the sum of the sample ACF, as it may not accurately capture the presence of long-range dependence.
The results of the empirical analysis raise the question of whether these findings generalize to other approaches and methods that rely on the sample autocorrelation function (ACF) for the detection of long-memory processes. The answer is affirmative, as demonstrated by Hassani's −1/2 theorem. This theorem suggests that methods relying solely on the sample ACF may fail to accurately detect long-memory processes, indicating a limitation in their theoretical foundation.
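The algebra behind the constant −1/2 is short and can be sketched as follows. The notation is assumed here for illustration: d_t denotes the mean-centred observation, ρ̂(h) the usual sample autocorrelation at lag h, and the series is taken to be non-constant so that the denominator is positive.

```latex
% Assumed notation: d_t = y_t - \bar{y}, hence \sum_{t=1}^{n} d_t = 0, and
% \hat{\rho}(h) = \sum_{t=1}^{n-h} d_t d_{t+h} \Big/ \sum_{t=1}^{n} d_t^{2}.
\begin{align*}
0 = \Bigl(\sum_{t=1}^{n} d_t\Bigr)^{2}
  &= \sum_{t=1}^{n} d_t^{2}
     + 2 \sum_{h=1}^{n-1} \sum_{t=1}^{n-h} d_t \, d_{t+h} \\
\Longrightarrow \quad
0 &= 1 + 2 \sum_{h=1}^{n-1} \hat{\rho}(h)
\quad \Longrightarrow \quad
\sum_{h=1}^{n-1} \hat{\rho}(h) = -\tfrac{1}{2}.
\end{align*}
```

The second line follows from dividing the first by the sum of squared deviations. The identity is purely algebraic: it uses only the fact that the centred observations sum to zero, so it holds whatever the dependence structure of the underlying process.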
It is important to note that many existing approaches and methods to identify long-memory processes are based on asymptotic behaviors and various assumptions. However, the empirical results presented in this study demonstrate that these assumptions may not hold in real-world scenarios, as evident from the examples provided. This suggests that relying solely on such approaches can lead to inaccurate conclusions and may not fully capture the true nature of long memory in time series data.
To address these limitations, a data-driven approach rather than a model-based approach could be a potential solution. By adopting a data-driven approach, researchers can explore the inherent patterns and structures in the data themselves, rather than relying solely on predefined models or assumptions. This approach acknowledges the complex and diverse nature of real-world data and allows for a more comprehensive investigation of long-memory processes.
Further investigation is warranted to explore and develop data-driven methods to detect long memory in time series data. Such investigations could involve the development of innovative techniques that leverage machine learning algorithms, advanced statistical methodologies, or non-parametric approaches. By incorporating the richness and complexity of the data into the analysis, these data-driven methods have the potential to provide more accurate and robust identification of long-memory processes.

Concluding Remarks
The paper discusses the issue of detecting long-range dependence in time series data and the discrepancies between the theoretical definition and empirical identification of long-memory processes. The theoretical definition of long memory is based on the autocorrelation function (ACF), which measures the correlation between observations at different time lags.
However, the paper highlights that the commonly used empirical measure of long memory, the sample autocorrelation sum, is a predetermined constant for any stationary time series, regardless of sample size. This means that it cannot identify long-memory processes in the same way as the theoretical definition.
The implications of this are significant, as it suggests that theoretical results based on the ACF may be misleading if the empirical identification of long memory is incorrect. The paper presents an analysis of various long-memory models to demonstrate this point.
The main conclusion of the paper is that alternative approaches to identifying long-memory processes are necessary, given the limitations of the sample autocorrelation sum. Researchers and practitioners who use long-memory models should consider alternative methods of detecting long-range dependence.
For further research, we plan to investigate alternative measures and approaches for the detection of long-range dependence in time series data. Specifically, we aim to explore wavelet-based methods and fractional integration techniques, which have shown promise in identifying long-memory processes. With these approaches, we hope to develop more reliable and accurate methods for the detection of long-range dependence in time series data, which could have important implications for model building and forecasting in various fields.