Time delay estimation of traffic congestion propagation due to accidents based on statistical causality

The accurate estimation of time delays is crucial in traffic congestion analysis, as this information can be used to address fundamental questions regarding the origin and propagation of traffic congestion. However, the exact measurement of time delays during congestion remains a challenge owing to the complex propagation process between roads and high uncertainty regarding future behavior. To overcome this challenge, we propose a novel time delay estimation method for the propagation of traffic congestion due to accidents using lag-specific transfer entropy (TE). The proposed method adopts Markov bootstrap techniques to quantify uncertainty in the time delay estimator. To the best of our knowledge, our proposed method is the first to estimate time delays based on causal relationships between adjacent roads. We validated the method's efficacy using simulated data, as well as real user trajectory data obtained from a major GPS navigation system in South Korea.


Introduction
Traffic congestion represents a universal problem for urban life owing to the dramatic growth in vehicle use, expansion of the economy and infrastructure, and proliferation of delivery services, among other factors. Traffic congestion frequently spreads into adjacent roads [1], resulting in greater damage to the overall traffic network.
Consequently, the accurate estimation of time delays has become crucial in addressing fundamental questions regarding the origin points and propagation of traffic congestion. Figure 1 from [2] illustrates this study's objective by showing how the impact of a traffic accident propagates to incoming roads. The black solid line represents the average speed on the road segment where the accident occurs, while the orange, green, and blue solid lines represent the average speeds on the three adjacent incoming roads. The average speed on the road segment was computed from GPS trajectory data provided by the NAVER Corporation using a map-matching process [3]. We can observe a time lag when the impact of an accident propagates to incoming roads. The time delay increases in the order of blue, green, and orange.
However, certain aspects of traffic congestion propagation pose statistical challenges to the accuracy of time delay estimation. First, the average speed on an incoming road is affected by various geographic and topological characteristics, such as road length and width, as well as road network topology. The impact of a traffic accident is distributed among all incoming roads in a complex pattern according to these characteristics. Furthermore, the duration of congestion is dynamic. As shown in the figure, the road denoted in blue exhibits a longer congestion duration than the other incoming roads. That is, the time delay in traffic congestion propagation does not simply denote a temporal pattern shift, but involves complicated temporal dynamics. Finally, data uncertainty is inherent in average road speeds. Trajectory-based average road speeds may be highly volatile depending on the availability of user data over a specific period.
To overcome the challenges outlined above, we propose a novel time delay estimation method for traffic congestion propagation between roads using lag-specific transfer entropy (TE). Our main contributions are as follows:
• We provide a model-free approach to estimate congestion propagation delays using a lag-specific TE estimator in complex urban road systems.
• We quantify uncertainty in time delay estimation using bootstrap techniques. This uncertainty quantification is employed to evaluate the reliability of time delay estimates and serves as a basis for hyperparameter optimization.
• We show that decomposition and nonlinear normalization with a sliding window are effective time series preprocessing methods for revealing causal relationships between traffic speed data.
• We validate the proposed method through numerical simulations and real user trajectory data obtained from a major GPS navigation system in South Korea.
The remainder of this paper is organized as follows: Section 2 provides an overview of related studies. Section 3 presents crucial background information pertaining to the proposed method. Section 4 outlines our proposed time delay estimation method. Sections 5 and 6 validate the proposed method using simulated and real congestion propagation data, respectively. Finally, concluding remarks are presented in Section 7.

Related Work
Various topics have been studied regarding the estimation of time delays due to traffic accidents, including travel time delay [4], incident duration [5,6], real-time crash identification [7], and incident impact quantification [2,8]. However, unlike the aforementioned studies, we focused on the propagation delay of traffic congestion caused by accidents in a road network.

Time Delay Estimation for Congestion Propagation
Although our proposed method is the first to estimate time delays for congestion propagation in road traffic networks, time delay estimation (TDE) is not a new problem. In digital signal processing, TDE refers to the task of ascertaining the differences in arrival times between signals received at a sensor array. The most widespread approach for TDE is cross-correlation [9,10]. Supposing that signals are received from two sensors, the delay between them can be estimated as the time lag that maximizes the cross-correlation between filtered versions of the received signals.
Limited attempts have been made to perform cross-correlation analyses with traffic speed data. Conventional TDE methods based on cross-correlation have been applied to real road vehicle pass-by measurements to enable traffic monitoring using passive acoustic sensors [11]. Similarly, [12] conducted a cross-correlation analysis to prove the existence of a significant relationship between the current speed value at a specific station and past speed values at upstream and downstream stations in a freeway traffic network. We refer to this method as time-lagged cross-correlation (TLCC). However, this approach is not applicable in the presence of nonstationarity.
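The TLCC baseline can be sketched as follows: the delay is taken to be the lag that maximizes the sample cross-correlation between the two standardized series. This is a minimal illustration; the function and variable names are ours, not those of [12].

```python
import numpy as np

def tlcc_delay(x, y, max_lag):
    """Estimate the delay of y behind x as the lag that maximizes
    the sample cross-correlation between x[t] and y[t + lag]."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    best_lag, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        # correlate x[t] with y[t + lag] on the overlapping segment
        n = len(x) - lag
        corr = np.mean(x[:n] * y[lag:lag + n])
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.roll(x, 7) + 0.1 * rng.normal(size=300)  # y lags x by 7 steps
print(tlcc_delay(x, y, max_lag=20))  # → 7
```

As the surrounding text notes, this estimator implicitly assumes stationarity; a trend shared by both series can dominate the correlation and mask the true lag.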
To quantify the TLCC level between two nonstationary time series at different scales, [13] proposed a time-lagged detrended cross-correlation analysis approach. This method, referred to as DCCA, divides an entire time series into overlapping boxes to handle nonstationarity [14].
The method proposed in the present study was compared with TLCC and DCCA for evaluation purposes.

Traffic Causality Analysis
Various traffic causal analysis methods have been developed to identify causal relationships among congested roads and detect congestion propagation patterns. The authors of [15] forecasted future traffic flow by ranking input variables to identify a subset of the Bayesian network as the set of cause nodes using the Pearson correlation coefficient. A two-step mining architecture has been proposed to capture the origin points of road traffic congestion [16]. TE was used to reveal the delay propagation network among multiple airports with a time series of airport delays [17]. However, the aforementioned approaches primarily focus on revealing causal relationships and do not provide information regarding estimated time delays during congestion propagation.

Bootstrap for Markov Chains
Suppose that {X_t}_{t≥1} is a stationary Markov chain with a finite state space S = {s_1, ..., s_n}, where n ∈ N. Let P = (p_ij) ∈ R^{n×n} be the transition probability matrix of the chain and denote the stationary distribution by π = (π_1, ..., π_n). Thus, for any 1 ≤ i, j ≤ n, p_ij = P(X_{t+1} = s_j | X_t = s_i) and π_i = P(X_t = s_i). Given a time series {X_1, ..., X_L} of size L from a stationary Markov chain, π_i and p_ij can be estimated as

π̂_i = (1/L) Σ_{t=1}^{L} 1(X_t = s_i),  p̂_ij = Σ_{t=1}^{L−1} 1(X_t = s_i, X_{t+1} = s_j) / Σ_{t=1}^{L−1} 1(X_t = s_i). (3.1)

The bootstrap observations X*_1, ..., X*_L can now be generated using the estimated transition matrix and marginal distribution in Eq. (3.1) [18]:
1) Generate a random variable X*_1 from the discrete distribution on S that assigns mass π̂_i to s_i, 1 ≤ i ≤ n.
2) Given X*_t = s_i, generate a random variable X*_{t+1} from the discrete distribution on S that assigns mass p̂_ij to s_j, 1 ≤ j ≤ n.
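The two generation steps above can be sketched as follows. This is a minimal illustration for a finite state space; the function and variable names are ours.

```python
import numpy as np

def markov_bootstrap(states, length, rng):
    """Generate one bootstrap series from the estimated marginal
    distribution (pi-hat) and transition matrix (p-hat) of a
    finite-state stationary Markov chain."""
    labels, encoded = np.unique(states, return_inverse=True)
    n = len(labels)
    # Estimated marginal distribution pi-hat_i
    pi_hat = np.bincount(encoded, minlength=n) / len(encoded)
    # Estimated transition matrix p-hat_ij from observed pairs
    counts = np.zeros((n, n))
    for a, b in zip(encoded[:-1], encoded[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    p_hat = np.divide(counts, row_sums, out=np.full((n, n), 1.0 / n),
                      where=row_sums > 0)
    # Step 1: draw X*_1 from pi-hat; Step 2: iterate the transitions
    sample = [rng.choice(n, p=pi_hat)]
    for _ in range(length - 1):
        sample.append(rng.choice(n, p=p_hat[sample[-1]]))
    return labels[np.array(sample)]

rng = np.random.default_rng(0)
chain = np.array([0, 0, 1, 2, 1, 0, 1, 2, 2, 1, 0, 1] * 20)
boot = markov_bootstrap(chain, length=100, rng=rng)
print(len(boot))  # → 100
```

Because each bootstrap series is regenerated from the estimated transition structure, the first-order serial dependence of the original chain is preserved, which is the property the method relies on in Section 4.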

Lag-Specific Transfer Entropy
TE is a measurement of directed information flow [19] based on the concept of Shannon entropy [20] in the field of information theory. For a discrete random variable I with probability distribution p(i), the Shannon entropy represents the average number of bits required to optimally encode independent draws, calculated as follows:

H_I = − Σ_i p(i) log p(i). (3.2)

Eq. (3.2) can be easily extended to the concept of conditional entropy using two discrete random variables I and J:

H_{I|J} = − Σ_{i,j} p(i, j) log p(i | j). (3.3)

This equation can be used to measure information flow between two discrete random variables. Consider two discrete random variables, I and J, with marginal probability distributions p(i) and p(j), and joint probability p(i, j). Suppose both variables represent stationary Markov processes of orders k and l, respectively, and write i_t^{(k)} = (i_t, ..., i_{t−k+1}) and j_t^{(l)} = (j_t, ..., j_{t−l+1}). For the order-k Markov process I, the generalized Markov property p(i_{t+1} | i_t^{(k)}) = p(i_{t+1} | i_t^{(k)}, j_t^{(l)}) holds in the absence of information flow from J to I. Analogously, the information flow from J to I is measured by quantifying the deviation from this generalized Markov property for an arbitrary source–target lag u, as follows:

T_{J→I}(t, u) = Σ p(i_t, i_{t−1}^{(k)}, j_{t−u}^{(l)}) log [ p(i_t | i_{t−1}^{(k)}, j_{t−u}^{(l)}) / p(i_t | i_{t−1}^{(k)}) ]. (3.4)
Eq. (3.4) preserves the computational interpretation of TE as an information transfer, which is the only relevant option in keeping with Wiener's principle of causality [21]. TE is known to be biased in small samples [22]. To correct this bias, [22] proposed the effective transfer entropy

ETE_{J→I}(t, u) = T_{J→I}(t, u) − T_{J_shuffled→I}(t, u),

where T_{J_shuffled→I}(t, u) indicates the transfer entropy computed using a shuffled version of the time series J. The shuffling process randomly draws values from the original time series and realigns them to generate a new time series. Thus, shuffling eliminates the serial dependencies of J, as well as the statistical dependencies between J and I. Note that T_{J_shuffled→I}(t, u) converges to zero as the sample size increases, and any nonzero value of T_{J_shuffled→I}(t, u) is a result of the small-sample effect. To ensure estimation consistency, shuffling is repeated, and the average of the shuffled transfer entropy estimates across all iterations serves as an estimator of the small-sample bias, which is subsequently subtracted from the Shannon or Rényi transfer entropy estimate to correct the bias.
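For discrete (symbolically encoded) series with k = l = 1, the lag-specific TE and its shuffle-based bias correction can be sketched as follows. This is a plug-in Shannon estimator written for illustration; the names are ours.

```python
import numpy as np
from collections import Counter

def transfer_entropy(target, source, lag):
    """Plug-in Shannon TE (in bits) from source to target at the
    given source-target lag, with k = l = 1."""
    # Build triplets (i_t, i_{t-1}, j_{t-lag})
    t0 = max(1, lag)
    triples = [(target[t], target[t - 1], source[t - lag])
               for t in range(t0, len(target))]
    n = len(triples)
    c_ijk = Counter(triples)
    c_jk = Counter((i1, j) for _, i1, j in triples)
    c_ij = Counter((i, i1) for i, i1, _ in triples)
    c_j = Counter(i1 for _, i1, _ in triples)
    te = 0.0
    for (i, i1, j), c in c_ijk.items():
        p = c / n
        # p(i|i1,j) / p(i|i1) = (c_ijk/c_jk) / (c_ij/c_j)
        te += p * np.log2((c * c_j[i1]) / (c_jk[(i1, j)] * c_ij[(i, i1)]))
    return te

def effective_te(target, source, lag, n_shuffle=20, rng=None):
    """TE minus the average TE over shuffled sources (bias correction)."""
    if rng is None:
        rng = np.random.default_rng(0)
    bias = np.mean([transfer_entropy(target, rng.permutation(source), lag)
                    for _ in range(n_shuffle)])
    return transfer_entropy(target, source, lag) - bias

rng = np.random.default_rng(1)
src = rng.integers(0, 2, 500)
tgt = np.roll(src, 3)  # target copies the source with a delay of 3
lags = [1, 2, 3, 4]
print(max(lags, key=lambda u: effective_te(tgt, src, u)))  # → 3
```

In the toy example, the effective TE peaks at the true source–target lag because the source value three steps in the past fully determines the target, while shuffling removes that dependence.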

Problem Definition
Suppose that congestion information is transferred from a source road to a destination road with a time delay of u. The objective of time delay estimation is to estimate u given previously observed traffic speed data on the two roads, denoted as {X_t}_{t=1}^{L} and {Y_t}_{t=1}^{L}, respectively. The time delay estimation task thus entails formulating a function f(·) that maps the two observed series to an estimate of the source–target lag u. The proposed time delay estimation algorithm comprises three steps. First, each time series is preprocessed and bootstrapped. Next, the transfer entropy computation provides the estimated time delay lag u* for each bootstrap sample. Finally, the distribution of the time delay estimates is used to determine the existence of a statistical causal relationship.

Preprocessing and Bootstrapping
Consider a time series of congested traffic speed data, {X t } L t=1 , which has the properties of scale dependence, nonlinearity, and nonstationarity. To effectively identify the causal relationship within such a time series, appropriate preprocessing methods are essential.
To this end, we first decompose the time series into a trend and its residual, as follows:

X_t = T_t + R_t, (4.1)

where T_t and R_t are the trend and residual components, respectively. The trend component is a moving average of order m ending at time t; its purpose is to smooth the time series and estimate the underlying trend. After extracting the underlying trend from {X_t}_{t=1}^{L}, the residual time series {R_t}_{t=1}^{L} is assumed to be a stationary Markov process. The assumption of the Markov property in traffic speed is not a novel notion; many prior traffic speed prediction and modeling studies have been conducted under this assumption [23,24,25,26,27]. Based on {R_t}_{t=1}^{L}, we can generate the bootstrap residuals {R*(b)_t}_{t=1}^{L} as explained in Section 3.1, and subsequently obtain a bootstrap time series X*(b)_t = T_t + R*(b)_t for t = 1, ..., L. Nonlinear normalization with a sliding window is then applied to the obtained time series to address the scale dependency, nonlinearity, and nonstationarity of traffic speed data. To render the data scale-independent and close to linear [28], the standard normal cumulative distribution function Φ is applied. A sliding window technique has similarly been employed to handle nonstationary time series in [29].
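The decomposition step can be sketched as follows: a trailing moving average of order m gives the trend, and subtraction gives the residual on which the Markov bootstrap operates. A minimal illustration with names of our choosing.

```python
import numpy as np

def decompose(x, m):
    """Split a series into a trailing moving-average trend of order m
    and the corresponding residual, so that X_t = T_t + R_t."""
    kernel = np.ones(m) / m
    # Trailing moving average; pad the front with the first value so
    # the trend has the same length as the input
    trend = np.convolve(np.concatenate([np.full(m - 1, x[0]), x]),
                        kernel, mode="valid")
    return trend, x - trend

x = np.array([10., 12., 11., 30., 32., 31., 12., 11.])
trend, resid = decompose(x, m=3)
print(np.allclose(trend + resid, x))  # → True
```

Reassembling a bootstrap series then amounts to adding a bootstrap resample of `resid` back onto `trend`, mirroring Eq. (4.1).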
Let X_{t,w} denote the subsequence of the bootstrap time series within a sliding window of size w ending at time t, and let F_{25,t}, F_{50,t}, and F_{75,t} be the 25th, 50th, and 75th percentiles of X_{t,w}, respectively. Note that these percentiles depend on the location of the sliding window. A normalized time series is then obtained by applying Φ to each observation after centering and scaling with these window-specific percentiles (Eq. (4.2)). To verify the effectiveness of the nonlinear normalization method expressed in Eq. (4.2), we compared its performance with that of existing normalization methods [28,29], including min-max normalization with and without a sliding window technique (see Section 5).

Time Delay Estimation
Using Eq. (3.4), the lag-specific effective TE from the source road to the destination road is computed for each candidate lag u, and the time delay estimate for each bootstrap sample is the lag u* that maximizes this quantity:

ETE_{J→I}(t, u) = T_{J→I}(t, u) − T_{J_shuffled→I}(t, u). (4.3)

In this study, we assume k = l = 1.
To compute the lag-specific ETE in Eq. (4.3), we discretize continuous data using symbolic encoding, i.e., by partitioning the data into a finite number of bins. We denote the bounds specified for the n bins by q_1, q_2, ..., q_{n−1}, where q_1 < q_2 < ... < q_{n−1}. For the normalized time series in Eq. (4.2), we obtain the encoded time series {J*(b)_t}_{t=1}^{L} by assigning J*(b)_t = m whenever the normalized value at time t falls in the m-th bin, i.e., q_{m−1} ≤ value < q_m, with the conventions q_0 = −∞ and q_n = +∞. The choice of bins depends on the distribution of the data. When tail observations are of particular interest, binning is typically based on empirical quantiles, such that the left- and right-tail observations are allocated to separate bins. In this study, we implemented symbolic encoding with n = 3 based on the 5% and 95% empirical quantiles, thereby emphasizing speed extremes caused by dynamic speed changes and traffic accidents.
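The three-bin encoding with 5% and 95% empirical quantiles can be sketched as follows (a minimal illustration; the helper name is ours):

```python
import numpy as np

def symbolic_encode(x, quantiles=(5, 95)):
    """Encode a continuous series into bins separated by the given
    empirical quantiles (n = 3 bins for two cut points)."""
    bounds = np.percentile(x, quantiles)
    # np.digitize maps x < q1 -> 0, q1 <= x < q2 -> 1, x >= q2 -> 2
    return np.digitize(x, bounds)

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
codes = symbolic_encode(x)
print([int(c) for c in sorted(set(codes))])  # → [0, 1, 2]
```

Roughly 5% of observations land in each tail bin, so the encoding isolates the extreme slowdowns that carry the congestion signal while treating ordinary fluctuations as a single middle symbol.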

Uncertainty Quantification of Time Delay Estimates
Suppose that the bootstrap observations of the time lag, u*(1), ..., u*(B), follow a distribution G, which is unknown in practice. Let μ and σ² denote the mean and variance of G, respectively, which can be estimated by

μ̂_B = (1/B) Σ_{b=1}^{B} u*(b)  and  σ̂²_B = (1/(B−1)) Σ_{b=1}^{B} (u*(b) − μ̂_B)².

Proposition 1. μ̂_B approximately follows a normal distribution, μ̂_B ∼ N(μ, (1/B) σ̂²_B).

Proof. As σ̂²_B → σ² in probability, √B (μ̂_B − μ) / σ̂_B → N(0, 1) by the Central Limit Theorem and Slutsky's Theorem [30].

Proposition 1 implies that 1) the bootstrap estimate μ̂_B is an unbiased estimate of μ, and 2) (1/B) σ̂²_B quantifies the uncertainty of μ̂_B. That is, we can evaluate the uncertainty of the bootstrap estimate μ̂_B using (1/B) σ̂²_B, which is practically useful because μ is unknown. This approach can also be applied to hyperparameter tuning: in this study, we used a grid search to determine the set of hyperparameters (length of time series L and sliding window size w) that minimizes (1/B) σ̂²_B.

To determine whether the bootstrap estimate μ̂_B is reliable, we compare σ̂²_B with a predetermined threshold σ²_T. That is, if σ̂²_B > σ²_T, we conclude that μ̂_B is not reliable, and congestion is therefore not propagated on the corresponding road segment.
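The reliability check can be sketched as follows: summarize the bootstrap lag estimates by their mean and sample variance, then accept the estimate only if the variance falls below the threshold. The lag values below are illustrative; the threshold 5.05² is the one used later in the paper.

```python
import numpy as np

def delay_estimate(boot_lags, var_threshold):
    """Summarize bootstrap lag estimates and flag reliability: the
    estimate is accepted only if the sample variance of the
    bootstrap lags falls below the threshold."""
    mu_b = np.mean(boot_lags)
    var_b = np.var(boot_lags, ddof=1)       # sigma-hat^2_B
    se = np.sqrt(var_b / len(boot_lags))    # uncertainty of mu-hat_B
    return mu_b, var_b, se, var_b < var_threshold

# Illustrative bootstrap lags (B = 8) and the paper's threshold 5.05^2
lags = [9, 10, 10, 11, 10, 9, 11, 10]
mu, var, se, ok = delay_estimate(lags, var_threshold=5.05**2)
print(ok)  # → True
```

The same variance term drives the hyperparameter grid search: among candidate (L, w) pairs, the pair minimizing var_b / B is preferred.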
To determine an appropriate value of σ²_T, we employ the concept of the tolerance interval (TI). The TI is a statistical interval in which a specified proportion γ of a population falls with a certain level of confidence (1 − α). By definition, a (γ, 1 − α)-TI of μ̂_B is given by μ̂_B ± k_{γ,α} √((1/B) σ̂²_B), where k_{γ,α} is the tolerance factor [31]. Then, σ²_T can be determined by solving k_{γ,α} √((1/B) σ²_T) = 1 (min) to yield a ±1 minute TI. With B = 100, γ = 0.9, and α = 0.01, the present study uses σ²_T = 100 / k²_{γ,α} = 5.05².

Consider the propagation of traffic congestion caused by accidents. Here, we define a propagation path as a sequence of incoming roads in the direction opposite to the traffic flow, where the kth element in the propagation path is denoted as Hop(k−1); Hop0 corresponds to the road where the accident occurred. Let μ̂_{B,k−1} and σ̂²_{B,k−1} denote the bootstrap estimate and sample variance at Hop(k−1), respectively. We consider Hop(k−1) to be statistically significant if μ̂_{B,k−1} and σ̂²_{B,k−1} satisfy the following conditions: 1) σ̂²_{B,k−1} < σ²_T and 2) μ̂_{B,k−2} < μ̂_{B,k−1}. The second condition states that the time delay between Hop0 and Hop(k−1) must exceed that between Hop0 and Hop(k−2).

Simulation Studies

The proposed method was validated using simulated data. Two time series, {X_t}_{t=1}^{120} and {Y_t}_{t=1}^{120}, were generated with noise terms ε_{x,t} ∼ N(0, 2) and ε_{y,t} ∼ N(0, 2), such that Y_t follows its baseline plus ε_{y,t} for t < 10 and Y_t = 0.5 X_{t−u_0} + 20 + ε_{y,t} for t ≥ 10. A predetermined source–target lag u_0 exists such that a significant information flow from X to Y is formed, but not vice versa. Figure 2(a) depicts the two time series with u_0 = 10, where the black solid and red dashed lines represent {X_t}_{t=1}^{120} and {Y_t}_{t=1}^{120}, respectively. This simulation represents a typical congestion propagation scenario between two adjacent roads R_X and R_Y, assuming that a traffic accident occurred on R_X at t = 10 and that the congestion was resolved at t = 95. With the time shift u_0 = 10, the congestion on R_X propagates to R_Y.
The proposed method was applied to the simulated data with m = 2 and w = 20, as described in Section 4. Figure 2(b) compares the distributions of bootstrap observations obtained from the two time series without normalization to those obtained with nonlinear normalization. The red and blue lines in the figure denote the values of (μ̂_B, σ̂²_B) in distributional form, that is, (9.54, 3.94²) and (10.30, 1.35²), respectively; a normal distribution is used for visualization purposes. We confirmed that nonlinear normalization with a sliding window improved the precision of time delay estimation, as 1.35² < 3.94².
For comparison purposes, the TLCC and DCCA methods were also applied to the simulated data. The DCCA method requires a hyperparameter n, which indicates the size of the overlapping box [14]. Here, we used multiple values of n (10, 20, 30, 40) to obtain the time delay estimation results, as shown in Figure 2(c) for the DCCA method. Furthermore, Table 1 summarizes the results of the conventional TDE methods. These results show that both the TLCC and DCCA methods generally yield reasonable time delay estimates for the simulated data; in particular, it is recommended to set n greater than 20. As shown in Table 2, the decomposition technique improved the overall performance, and nonlinear normalization with w = 20 generally outperformed all other normalization methods in terms of σ̂_B and MAE. This implies that nonlinear normalization with w = 20 produced the most precise and accurate estimates among the tested schemes.

Real Data Example: Accident-Driven Traffic Congestion Propagation
Two types of datasets from different sources were considered: a traffic dataset provided by the NAVER Corporation navigation team, and an accident dataset provided by the Korean National Police Agency. The traffic dataset encompasses trajectory-based speeds and the traffic road network of the greater Seoul metropolitan area, where nearly half of the country's population resides. The speed data are derived from GPS trajectories; a GPS trajectory consists of a series of points with latitude, longitude, and timestamp information generated during travel. To align a sequence of observed user positions with the road network, we used a map-matching process [3]. Each accident record includes the reported time, information source, category, incident description, and point of origin described by both geographical coordinates and road segment ID, as described in Table 3.
The proposed method was validated on 3,197 real traffic accidents that occurred between September 2020 and February 2021 in Seoul. Figure 3 presents the accident locations, denoted by red stars, where the blue lines indicate significant propagation paths. Let μ̂_{B,k−1} and σ̂²_{B,k−1} denote the bootstrap estimate and sample variance at Hop(k−1), respectively, for k = 2, 3, 4. In this study, we investigated up to k = 4; k = 1 was excluded because Hop0 refers to the road where the accident occurred. Table 4 summarizes the time delay estimation results for all 3,197 traffic accidents. To ensure consistency across results, the hyperparameters (w, L) = (60, 180) were used for all accidents based on the grid search. There are 5,036 roads at Hop1 associated with accidents, approximately 69.16% of which were revealed to be significant, with an average time delay of 8.95 minutes. For Hop2 and Hop3, 65.33% and 63.15% of the roads were revealed to be significant, with average time delays of 11.10 and 11.97 minutes, respectively.
We selected two representative cases among the traffic data to detail how the proposed method identifies causal relationships and estimates time lag. Case 1 represents a simple road network with few propagation paths, whereas Case 2 represents a complex road network with many propagation paths. For comparison purposes, the TLCC and DCCA methods were also applied with equivalent settings to those used in the simulation study.

Case 1: Simple Traffic Network
The accident occurred on September 8, 2020, at 06:44 AM. The blue star in Figure 4 denotes the exact location of the accident. Case 1 has one propagation path [A, B, C, D]. The black, red, blue, and green line segments in the figures indicate Hop0, Hop1, Hop2, and Hop3, respectively. Time delays were estimated using average speed data recorded at one-minute intervals, from one hour before to two hours after the time when the accident was reported. The proposed method produced a consistent estimate of time delays that increased with each hop. We therefore conclude that the congestion effect of the accident propagated along the path to Hop1, Hop2, and Hop3 at 3.60, 7.30, and 19.97 min after the accident, respectively. Unlike the proposed method, both TLCC and DCCA failed to identify the congestion propagation effect of the accident. Furthermore, the DCCA method produced inconsistent time delay estimates for varying values of n. Note that neither the TLCC nor the DCCA method provides uncertainty quantification of the time delay estimates.

Case 2: Complex Traffic Network
The accident occurred on September 4, 2020, at 10:16 PM, and affected five propagation paths. Figure 6 depicts the exact location of the accident, along with the five propagation paths. For each path, the hour before and the two hours after the accident were considered for time delay estimation. Figure 7 and Table 6 present the results of time delay estimation. In Path 1, no specific causal relationship could be found, as shown in Figure 7(a); this finding is supported by the corresponding high values of σ̂²_B (> σ²_T = 5.05²) in Table 6. In contrast, we can conclude that the congestion effect of the accident at road A propagated along the second hop of Paths 2 and 3 and the third hop of Paths 4 and 5. Moreover, the values of σ̂²_B in Table 6 indicate that the congestion effect propagated along Path 3 up to Hop2, as depicted in Figure 7(b), whereas the conventional TDE methods yielded comparable estimates only for Path 3, as seen in Table 6. From the results of both cases, it can be concluded that the proposed method identifies causal relationships and estimates time lags more accurately than the conventional TDE methods.

Conclusion
Traffic congestion spreads to inflow roads, creating a causal relationship between the accident site and adjacent roads. To identify this relationship, we developed a new method for estimating time delays in congestion propagation. The proposed method utilizes a lag-specific TE estimator together with decomposition and normalization techniques. Furthermore, we conducted extensive performance comparisons under varying experimental settings and found that the proposed decomposition and nonlinear normalization methods yield substantial performance improvements. We also confirmed that the proposed method produces more stable and robust results than conventional TDE methods. Thus, the proposed time delay estimation method supports a quantitative understanding of traffic congestion propagation.
Moreover, the bootstrap technique and the density estimation of its statistical functionals enable uncertainty quantification of the time delay estimates. This uncertainty quantification allows us to evaluate the reliability of time delay estimates and serves as a basis for optimal hyperparameter tuning. Specifically, σ̂²_B serves as a key indicator of a causal relationship between two time series. We also developed rigorous and practical guidance for decision making based on the tolerance interval.
In this study, we considered only the estimation of accurate time delays from historical traffic data. Eventually, the proposed method could be used to predict time delays in GPS navigation systems, thereby providing users with more accurate estimated arrival times from real-time traffic data. Using our proposed method as a foundation, real-time delay prediction methods can be developed in future work.