Potential of Highly Automated Vehicles for Monitoring Fatigued Drivers and Explaining Traffic Accidents on Motorway Sections

The near-future deployment of high-level automation vehicles (AVs) can render promising opportunities to solve ongoing hindrances in modern safety-related research. Monitoring fatigued drivers on any road section is one of these challenges. Vehicle trajectory big data, monitored through AVs, include key information with which to monitor fatigued drivers on roads. To mine this upcoming opportunity, a new data-driven approach which allows the direct monitoring of fatigued drivers on road segments is proposed here for the first time. A feasibility study was conducted using big vehicle trajectory data and real-life traffic accident data. The results showed that fatigued drivers on a target road section can be successfully surveyed using the driving durations from departure locations to the target road section. It was found that, with a statistical correlation of 0.90, an index for fatigued drivers has strong explanatory power with regard to the traffic accident rate. This finding indicates that the proposed method will be a promising means by which to monitor fatigued drivers at road locations in the upcoming era of autonomous vehicles. In addition, the method is immediately practicable if vehicle trajectory data are available.


Introduction
The term "driver fatigue" is defined as a state of reduced mental alertness [1], a transient phase between awake and asleep [2], or a psychological and physiological process [3], all of which, if not interrupted, impair one's ability to perform driving tasks safely. Because of this, driver fatigue is a significant factor in many vehicle crashes. Naturally, a myriad of investigations into the progression of fatigue symptoms (e.g., hypovigilance, microsleep, and sleep) have been conducted based on driving simulators using various measures which are directly or indirectly related to physiological (e.g., brain, eye, heart, and skin) activities and driving performances (e.g., reaction times, steering, and maintaining a constant speed). A notable achievement in this line of research [3-9] has been to determine the safe limits of continued driving to prevent fatigue-related vehicle crashes, as task-related fatigue, caused by prolonged and monotonous driving on a motorway, increases the risk of a vehicle crash [3,10].
Despite this public contribution, there remain ongoing challenges that should be addressed to prevent and reduce vehicle accidents. One of these challenges is to integrate drivers' task-related fatigue into the risk management of traffic accidents on actual road sections. To achieve this, it is essential to monitor fatigued drivers on road sections. However, advanced technologies to monitor physiological and driving-performance measures have several limitations [5] preventing their deployment on roads as a type of detector to monitor for driver fatigue along certain road sections. Devices to detect the activities of the brain, eye, heart, skin, and muscle are inconvenient and expensive for drivers, although eye-activity detectors based on cameras are easily installable in vehicles and are less obtrusive compared to electro-oculography devices [5]. The communication infrastructure necessary to collect the monitored data also requires a tremendous level of funding. These obstacles should be overcome in order to realize the monitoring of fatigued drivers along road sections.
Fortunately, it is expected that automated vehicles (AVs), as a new moving detector to collect vehicle trajectory big data on road networks, will provide a promising opportunity to address the aforementioned issues. In relation to this upcoming opportunity, the objective of this research is to provide an initial demonstration of the potential of AVs for monitoring fatigued drivers (in manual or partially automated vehicles along road sections) using a new method. The method proposed in this study is developed based on big data on vehicle trajectories that may be collected through the operation of AVs. A case study to verify the feasibility of the proposed method is conducted using vehicle trajectory big data and real-life traffic accident data. Based on the analysis results, selected findings and research directions related to the possible monitoring of fatigued drivers on road sections in the near-future era of AVs are presented.

Approach Concept.
There is an academic consensus that prolonged driving durations increase the level of drivers' task-related fatigue [1,2,5,11], which has been strongly implicated in numerous automobile crashes [3,9-17]. This implies that driving durations from individual departure locations to a target road section, if made available in some way, can be used directly to assess potentially fatigued drivers on a target road section. In addition, information about estimated potentially fatigued drivers on individual road sections can be used in risk management evaluations of possible vehicle crashes. To realize this in practice, one obstacle is how to monitor real-world driving durations efficiently along individual road sections without incurring a tremendous cost.
Fortunately, detailed information about the operation of AVs can serve as a clue with which to address this issue. The operation of AVs vitally relies on advanced global positioning system (GPS) and communication systems when driving. That is to say, AVs when driving can be considered as a moving detector along a road network. Detailed spatiotemporal vehicle trajectories can be monitored through GPS technology, and these trajectory data can be sent to an advanced data center via current communication technology. In this vein, it is expected that big data pertaining to individual vehicle trajectories can be procured due to AVs. Individual vehicle trajectories essentially include the actual travel times of individual AVs along a series of road sections. Therefore, the distribution of real-life driving durations from departure locations to a target road section can be directly extracted from vehicle trajectory data.
To take advantage of this upcoming opportunity, a method that directly measures fatigued drivers on actual road sections using driving durations is proposed in this research. The method is developed based on the following two concepts. The first is that the distribution of the driving durations, collected from AV trajectory data, is a direct subset of the distribution of driving durations of all vehicles or is at least strongly related to this dataset in some way. This concept is also supported by the fact that vehicle GPS traffic volumes are direct subsets of the total vehicular traffic volume for a road section [18,19]. The second concept is that a typical driver quickly reaches a state of fatigue when the driving task of the driver exceeds a safe limit for continued driving. Research has shown that "people do not differ greatly in how much fatigue they can tolerate but rather how quickly they reach a certain critical level of fatigue" [8] and that "a general conservative limit for safe driving on monotonous highways can prevent drivers from falling asleep or developing excessive fatigue" [9]. Therefore, a safe limit can serve as a threshold between nonfatigue and fatigue states on average. If these two concepts hold, fatigued drivers along any road segment can then be directly inferred using the distribution of AV driving durations and a suitable safe limit. Despite the importance of this issue, no research related to the monitoring of fatigued drivers on any road section could be found in our literature review.

Monitoring of Driver Fatigue.
It is expected that the degree of fatigue felt by a driver in an automated vehicle will differ according to the automation level (i.e., 0-5), as defined in earlier work [20]. Drivers in partially automated driving systems (i.e., levels 0-2) still engage in driving tasks (e.g., steering, accelerating, and braking) to maintain safety, whereas high-automation driving systems (i.e., levels 3-5) do not require the drivers' engagement during the driving tasks. The level 3 automation system (i.e., conditional automation) requires the drivers' engagement according to certain environmental or road conditions (e.g., heavy rainfall and road construction), but drivers are no longer required to monitor the automation processes continually.
Contrary to drivers' expectations, it was reported that drivers of future AVs may become fatigued more rapidly than those driving manual vehicles in cases of conditional automation (i.e., level 3 automation) [21-23]. This undesirable result is caused by both the passive involvement in the driving process and the monotonous forward attention required to deal with emergencies [21-23]. Despite this, level 3 automation shows the greatest differences from partial-automation systems. In this context, level 3 automation is considered as a high-automation system. Therefore, it is assumed that drivers of vehicles with low-level automation suffer from task-related fatigue at least similarly to that felt by manual drivers, whereas drivers of vehicles with high levels of automation do not experience any task-related fatigue. Based on this assumption, manual driving in its present form is considered as a type of level 0 automation for the purpose of this paper. This is reasonable given the link between task-related fatigue and the risk of vehicle crashes.
To measure fatigued drivers related to manual and low-level automated driving, the ratio of potentially fatigued drivers to all drivers (henceforth, RF, 0.0 ≤ RF ≤ 1.0) on a given road section is introduced. Figure 1 shows the RF concept, where $s_n$ and $s_f$ are segments for nonfatigued and fatigued drivers, respectively. RF relies closely on the nature of the driving durations, extracted from vehicle trajectory big data, rather than on any type of artificial understanding by a sophisticated model. RF is computed using three components: the nature of the driving duration, a threshold ($t_c$, in minutes) between nonfatigued and fatigued states, and the penetration rate of highly automated vehicles (α, 0.0 ≤ α ≤ 1.0).
Let us define the nature of the driving duration as a frequency function f(t), as shown in Figure 1. To build f(t) for a given target road section, we define the driving duration (t) as the experienced driving time from a departure location to the midpoint of the target road segment. The experienced driving time can be obtained by simply excluding nondriving activity times (at rest areas, for instance) from the total travel time between the departure location and the target location. The starting or ending location of the target road section can also be used as a representative location instead of the midpoint. In this case, the driving duration for the target road section is under- or overestimated from the standpoint of traffic accidents, which occur randomly within a full road section. In spite of this, the selection of the representative location for the target road section is left to the analyst, as the resulting differences in driving durations are acceptable in practice.
Individual driving duration values are integrated into the frequency distribution of the driving duration (i.e., f(t)) with Δt and $t_{\max}$, where Δt is a fixed interval of the driving duration and $t_{\max}$ is the maximum driving duration. This disaggregated approach guarantees a margin of error of 0.05 × Δt. This approach is also a practical way to handle real-world obstacles (e.g., computation speed, data building and management, and privacy policies) that arise with big data of this kind, which pertain to individuals.
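The construction of f(t) described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the `Trip` record, its field names, and the helpers `driving_duration` and `build_frequency` are hypothetical, and rest-area stops are assumed to have already been identified as (start, end) intervals in minutes.

```python
from dataclasses import dataclass, field


@dataclass
class Trip:
    """Hypothetical record for one trajectory that reaches the target section."""
    depart_min: float   # departure time, in minutes
    arrive_min: float   # time of reaching the section midpoint, in minutes
    rests: list = field(default_factory=list)  # (start, end) nondriving intervals


def driving_duration(trip):
    """Experienced driving time = total travel time minus nondriving time."""
    rest_total = sum(end - start for start, end in trip.rests)
    return (trip.arrive_min - trip.depart_min) - rest_total


def build_frequency(trips, dt=1.0, t_max=600.0):
    """Bin driving durations into f(t) with a fixed interval dt up to t_max."""
    n_bins = int(t_max / dt)
    f = [0] * n_bins
    for trip in trips:
        t = driving_duration(trip)
        # Durations outside [0, t_max) are ignored in this sketch.
        if 0.0 <= t < t_max:
            f[int(t // dt)] += 1
    return f


trips = [
    Trip(0.0, 45.0),                         # 45 min of driving
    Trip(0.0, 150.0, rests=[(60.0, 80.0)]),  # 150 - 20 = 130 min of driving
    Trip(10.0, 140.5),                       # 130.5 min of driving
]
f = build_frequency(trips, dt=1.0, t_max=300.0)
```

Only the binned counts need to be stored per road section, which is one way to read the remark that the aggregated form eases data management and privacy handling.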
Once f(t) (i.e., the frequency of driving duration) has been devised, RF (introduced with Figure 1) can be directly produced using a suitable $t_c$ value. RF can be calculated as follows:
(1)
A suitable α value (= $q_a/q$) can easily be monitored using the total vehicle volume ($q$) and the highly automated vehicle volume ($q_a$) on the target road section. α values monitored along different road sections may also be used due to the fact that the highly automated vehicle volume is a direct subset of the total vehicle volume [18,19]. Based on these considerations, RF excluding nonfatigued drivers in highly automated vehicles can be calculated using the following equation:
(2)
During the real-life computation process using the frequency of the driving duration with Δt = 1.0 and 0.0 ≤ t ≤ $t_{\max}$, equation (2) can also be computed as follows:
(3)
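A discrete computation of RF can be sketched as below. The form used here, RF = (1 − α) · Σ_{t ≥ t_c} f(t) / Σ_t f(t), is an assumption that follows from the verbal definitions (the share of trips above the threshold, scaled by the share of vehicles that are not highly automated); it may differ from the exact published form of equations (1)-(3). It also assumes that highly automated vehicles share the same duration distribution as other vehicles and that a duration exactly equal to $t_c$ counts as fatigued. The function name `fatigue_ratio` is hypothetical.

```python
def fatigue_ratio(f, t_c, alpha=0.0, dt=1.0):
    """RF under the assumed form (1 - alpha) * sum_{t >= t_c} f(t) / sum_t f(t)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    total = sum(f)
    if total == 0:
        return 0.0
    first_fatigued_bin = int(t_c / dt)
    fatigued = sum(f[first_fatigued_bin:])
    return (1.0 - alpha) * fatigued / total


# Toy frequency table: 40 trips of 30 min, 35 of 90 min, 25 of 150 min.
f = [0] * 200
f[30], f[90], f[150] = 40, 35, 25

rf_60 = fatigue_ratio(f, t_c=60)                  # 60 of 100 trips exceed 60 min
rf_125 = fatigue_ratio(f, t_c=125)                # 25 of 100 trips exceed 125 min
rf_125_av = fatigue_ratio(f, t_c=125, alpha=0.5)  # half of vehicles highly automated
```

With α = 0.0 the function reduces to the plain share of trips above the threshold, which matches how RF is reported for the case without highly automated vehicles.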

Data and Features.
To verify the potential of high-level AVs for monitoring fatigued drivers on a road section, an experimental study was conducted using two types of test data: the distribution of the driving duration and the accident rate (AR). These two types of test data were collected for the entire year of 2017. The test bed (Figure 2) used here is one of the main motorway lines in South Korea, and it consists of 31 road sections. The line mainly serves as an interregional motorway used by drivers who cover mid- and long distances. The road sections also meet the design standards and guidelines of motorway construction.
To compile the distribution of the driving durations, point-to-point vehicle trajectory big data were used, the number of points of which was as high as $5.91 \times 10^{11}$. The vehicle trajectory data, collected by an advanced vehicle GPS system mounted on a vehicle in this case, are most similar to the data of AVs [18]. In addition, the system is activated when the vehicle is started and is deactivated when the vehicle is turned off. The penetration rate (PR, 0.0 ≤ PR ≤ 1.0) of vehicle GPS systems for the 31 target road sections is illustrated in Figure 3, where PR = [annual average daily vehicle GPS volume (vehicle/day)]/[annual average daily vehicle volume (vehicle/day)].
The PR values show a stable trend, ranging from 0.0110 to 0.0132 with average and standard deviation values of 0.0125 and 0.0005, respectively. PRs also statistically satisfy the particular sample rate (%) requirement in this case during a full year. For instance, the recommended sample size for a population of 1,825,000 (= 5,000 vehicle/day × 365 day/year) is 16,491 (i.e., sample rate = 0.904%) at a 99.0% confidence level with a 1.0% margin of error. The distributions of the driving durations for the target road sections were extracted from the collected vehicle trajectory data. Figure 4 illustrates the distributions of the driving durations for the 31 target road sections. The distributions are highly complex, ranging from left-biased single peak cases to right-biased multipeak cases. In this manner, a road section has certain driving duration characteristics which are distinguishable from those of other sections. This indicates that the characteristics of the driving durations, if intrinsically related to real-life vehicle crashes on the associated road section, can then be effectively used to measure drivers' task-related fatigue. Moreover, the complexities wholly rely on complex space-and-time activities of vehicles, which also entirely depend on the complex mechanisms of national land use and social activities. That is, the characteristics of the driving durations are closely associated with land use and social activities.
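The sample-size figure quoted above (16,491 for a population of 1,825,000) can be reproduced with Cochran's formula and a finite-population correction. The text does not state which formula was used, so the choice of formula, the rounded critical value z = 2.58 for a 99% confidence level, and the conservative proportion p = 0.5 are assumptions; the function name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(population, z=2.58, margin=0.01, p=0.5):
    """Cochran's sample size with finite-population correction (assumed formula)."""
    n0 = (z ** 2) * p * (1.0 - p) / margin ** 2   # infinite-population size
    n = n0 / (1.0 + (n0 - 1.0) / population)      # finite-population correction
    return math.ceil(n)


population = 5_000 * 365            # vehicles per day times days per year
n = required_sample_size(population)
rate_percent = 100.0 * n / population
```

Under these assumptions the required sample rate is about 0.9%, which is below the lowest observed PR of 0.0110 (1.10%).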
On the other hand, it can be seen that a significant percentage of drivers suffer from task-related fatigue considering a conservative safe limit for continued manual driving on a motorway. The RF outcomes with $t_c$ = 60 min and α = 0.0 range from 0.15 to 0.84 with average and standard deviation values of 0.61 and 0.20, respectively. This implies that more than 60% of drivers on more than 50% of the road sections experience task-related fatigue, at least in our case. It also appears that the variety of the driving duration characteristics and this serious state of driver fatigue are closely related to the wide variation in ARs for the 31 target road sections, as shown in Figure 5, where AR is the number of automobile crashes per $10^6$ vehicle-km. The maximal AR value is nearly five times the minimal AR value, with ARs ranging from 0.054 to 0.267.

Relationship between RF and AR.
In order to calculate RF, a suitable $t_c$ value should be identified in some way, as RF is entirely dependent on $t_c$. A useful $t_c$ value has not been determined according to our literature review, despite the fact that several safe limits for continued and monotonous driving have been reported and publicly recommended. Optimal safe limits for continued driving are also debated in the academic research. For instance, 30 min [4,5], 60 min [3,6,7], and 80 min [8,9] were proposed based on different fatigue symptom countermeasures which are deeply associated with the progression of driving fatigue. Despite these achievements, it is obvious that safe limits are not useful for an analysis of actual traffic accidents, as their common purpose through public education is to prevent drivers from developing excessive fatigue or falling asleep at the wheel [9].
A numeric simulation that maximizes the coefficient of the correlation between RF and AR was conducted to identify the best value of $t_c$. The effects of $t_c$ on the coefficient of the correlation between the RF and AR values for all target road segments are shown in Figure 6. The correlation values increase, first steeply ($t_c$ = 1 → 100) and then gradually ($t_c$ = 100 → 125), to the maximum level, after which they gradually decrease with little variation as the value of $t_c$ increases. This single-peaked curve of the statistical correlation between RF and AR implies that a useful boundary between nonfatigued and fatigued states exists, regardless of whether the boundary is obvious. Remarkably, the maximum correlation value at $t_c$ = 125 minutes is as high as 0.90, and the correlation values are also close to 0.90 within $t_c$ = 125 ± 15 minutes. These findings indicate that the best or second-best $t_c$ values can be identified by taking into account actual traffic accidents. The $t_c$ value of 125 is in line with a critical driving time (120 minutes) [24], after which subject drivers began to feel some fatigue although they attempted to resist their fatigue and struggled to remain alert [24]. It is also clear that the $t_c$ value is related to the fact that a driver normally tends to underestimate the impact of fatigue, overlook feelings of drowsiness, and continue driving as they become sleepy [10,25]. In this context, a $t_c$ value of 125 was determined as the critical boundary to maximize the statistical explanatory power of RF as it pertains to AR. The critical $t_c$ value is also used to demonstrate the potential of RF through detailed analyses. In addition, the three safe limits are found to be desirable with regard to the conservative monitoring of drivers' task-related fatigue. The relationship between RF and AR at $t_c$ = 125 is illustrated in Figure 7. The value of the coefficient of determination ($R^2$) exceeds 0.85.
This means that RF (i.e., driving fatigue) statistically explains more than 85% of the variation in AR across the interregional motorway sections. That is, RF has strong explanatory power with regard to AR, at least in our case.
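The threshold search described above (choosing the $t_c$ that maximizes the correlation between RF and AR) can be sketched as a simple grid search. This is an illustration only: the frequency tables and accident rates below are synthetic and are not the study's data, α is taken as 0.0, durations are binned at 1-minute intervals, and the helper names `pearson`, `share_above`, `best_threshold`, and `table` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # correlation is undefined when one side is constant
    return sxy / math.sqrt(sxx * syy)


def share_above(f, t_c):
    """Share of trips whose duration bin is at or above t_c (alpha = 0)."""
    return sum(f[t_c:]) / sum(f)


def best_threshold(freqs, ar, candidates):
    """Return (t_c, r) maximizing the RF-AR correlation over the candidates."""
    scored = [(pearson([share_above(f, t) for f in freqs], ar), t)
              for t in candidates]
    r, t_c = max(scored)
    return t_c, r


def table(counts, bins=(20, 45, 70, 100, 150, 200), size=240):
    """Build a 1-minute frequency table from counts at a few duration bins."""
    f = [0] * size
    for b, c in zip(bins, counts):
        f[b] = c
    return f


# Synthetic road sections (not the study's data).
freqs = [
    table([50, 20, 10, 10, 5, 5]),
    table([10, 10, 30, 20, 20, 10]),
    table([20, 30, 10, 5, 15, 20]),
    table([5, 5, 10, 20, 40, 20]),
]
# Synthetic accident rates built to depend on the share above 125 minutes.
ar = [0.05 + 0.2 * share_above(f, 125) for f in freqs]

t_c, r = best_threshold(freqs, ar, candidates=[30, 60, 80, 125, 180])
```

Because the synthetic accident rates are constructed from the share of trips above 125 minutes, the search recovers that threshold; with real data the maximum would be less sharp, as the plateau around 125 ± 15 minutes suggests.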
This finding indicates that the monitored driving duration is closely related to task-related fatigue and in turn substantially affects traffic accidents. This also implies that RF can be effectively employed as a useful variable to monitor the degree of driver fatigue and then to evaluate the fatigue-related risk of vehicle accidents, at least along motorway segments. As of yet, no explanatory variable that has such explanatory power has been presented. The trend of ARs increases exponentially as RF increases. This indicates that heavy fatigue (or drowsiness) triggered by prolonged driving has more of an impact on the risk of a vehicle accident than the average level of driving fatigue.
This also signifies that a driver, even when taking short breaks periodically, should take naps or at least get enough rest to diminish fatigue [9,26] when the accumulated level of fatigue reaches its maximum such that it seriously impairs the ability to drive safely. In contrast, AR values show stationary trends with some variations when RF ≤ 0.15, and the estimated AR value with RF = 0.0 exceeds zero. This means that the impacts of fatigued driving on the risk of a vehicle crash, when the percentage of fatigued drivers is low, combine with other causal factors of accidents. Despite this, it can be seen that RF, though its impact is low, presents a promising opportunity to be used at least as a significant causal factor without ignoring the impacts of the small percentage of fatigued drivers on traffic accidents. The estimation function in Figure 7 is used for further analyses here.
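An estimation function of this kind can be sketched as a log-linear least-squares fit. The functional form AR = a · exp(b · RF) is an assumption chosen to match the description (an exponential trend with a positive AR at RF = 0.0); the form and coefficients of the function in Figure 7 are not given in the text. The data points are synthetic, and the names `fit_exponential` and `r_squared` are hypothetical.

```python
import math


def fit_exponential(rf, ar):
    """Least-squares fit of ln(AR) = ln(a) + b * RF; returns (a, b)."""
    n = len(rf)
    y = [math.log(v) for v in ar]
    mx, my = sum(rf) / n, sum(y) / n
    b = (sum((x - mx) * (v - my) for x, v in zip(rf, y))
         / sum((x - mx) ** 2 for x in rf))
    a = math.exp(my - b * mx)
    return a, b


def r_squared(rf, ar, a, b):
    """Coefficient of determination of the fitted curve on the AR scale."""
    mean_ar = sum(ar) / len(ar)
    ss_res = sum((v - a * math.exp(b * x)) ** 2 for x, v in zip(rf, ar))
    ss_tot = sum((v - mean_ar) ** 2 for v in ar)
    return 1.0 - ss_res / ss_tot


# Synthetic points generated from a = 0.06 and b = 1.5 (not the study's values).
rf = [0.1, 0.3, 0.5, 0.7, 0.84]
ar = [0.06 * math.exp(1.5 * x) for x in rf]

a, b = fit_exponential(rf, ar)
r2 = r_squared(rf, ar, a, b)
```

In this form the intercept a is the estimated AR at RF = 0.0, which corresponds to the accident rate attributable to causal factors other than fatigue.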

Effects of High-Level AV on RF and AR.
The effects of high-level AV on a reduction of AR were analyzed from the perspective of driving fatigue using the estimation function (Figure 7) and the three regimes of RF (Table 1). Figures 8 and 9 show the reductions of RF and AR, respectively, for all 31 target road sections according to the increment of the penetration rate (PR) of AV (i.e., α).
These results are summarized in Table 1, showing the distinction between the three regimes of RF by the PR of AV and the reduction of AR. Specifically, the reduction of AR in the high regime is distinguishable even when the PR of AV is 0.1. This occurs because fatigued drivers, who account for a considerable portion of RF, decrease as the PR of AV increases. In the same context, the absolute quantity of RF is reduced dramatically in Figure 8 as the PR of AV increases, and AR then also considerably decreases in Figure 9 considering that RF and AR have an exponential relationship in the high regime (Figure 7). Therefore, the field launch of high-level AVs will reduce traffic accidents on high-RF road sections more than on other road sections in terms of AR. Interestingly, it can be seen that the percentage of fatigued drivers on many road sections, even when the PR of AV exceeds 0.5, remains high. This is due to the fact that the reduction of fatigued drivers according to the increment of AV occupancy is closely related to the characteristics of the driving duration. This indicates that the percentage of fatigued drivers may not decrease in proportion to the overall increase in the PR of AV in terms of individual road sections.

Journal of Advanced Transportation

In contrast, the reduction in AR is not noticeable in the low regime, even when the PR of AV is greater than 0.5. RF values decrease slightly with an increment in the PR of AV, but the reduction of AR explained by the estimation function is low.
Thus, the effects of AV on the reduction of AR (Figure 9) cannot be significant on a road section where the degree of fatigued drivers is low (Figure 8).
It should be noted that driving fatigue has a strong and complex relationship with multiple causal factors, including driving environments (e.g., visibility, road geometry, and traffic volume) related to roads and traffic flows [27,28], physiological conditions (e.g., sleep deprivation and circadian rhythms) [26,29-34], and social features (e.g., gender and age) [31,35]. It is expected that high-level automation (i.e., levels 3-5) has a suitable level of self-capability to address these conditions successfully. In this vein, there remain opportunities related to RF in that it likely can be utilized effectively in analyses of traffic accidents when combined with associated driving environments. Therefore, further investigations into more sophisticated approaches to integrate multidimensional causal factors into the method proposed in this paper should be conducted.

Concluding Remarks
It is expected that the upcoming introduction of highly automated vehicles on real-world roads will present promising opportunities to solve many ongoing hindrances in modern safety-related research. One of these challenges is the successful monitoring of fatigued drivers on actual road sections. Fortunately, vehicle trajectory big data can be collected through autonomous vehicles, which closely rely on advanced vehicle GPS and communication systems. The vehicle trajectory data include key information with which to monitor driver fatigue on road sections, particularly the driving durations from departure locations to any road segment.
To harness this opportunity, a new concept for the direct monitoring of fatigued drivers on any road section was introduced in this study. A data-driven method which directly surveys fatigued drivers on road segments was developed based on driving durations extracted from vehicle trajectory big data. The potential of high-level automation vehicles was demonstrated using the characteristics of the driving durations and real-life vehicle accident data. It was found that the ratio of potentially fatigued drivers to all drivers on any road section can be easily and effectively monitored through driving durations as included in vehicle trajectory big data. It was also discovered that the monitored degree of fatigued drivers has strong explanatory power with regard to traffic accidents. Therefore, it is expected that the proposed approach for the direct monitoring of on-road fatigued drivers will be feasible in the upcoming era of autonomous vehicles. In fact, the proposed approach is immediately feasible when vehicle trajectory big data collected with a penetration rate of 1.0% are available. This investigation constitutes a first step in presenting a feasible solution to the direct monitoring of fatigued drivers on road segments. In addition to the effective results presented here, there are other opportunities to improve the reliability of the proposed approach in academic and practical fields related to road safety research. For instance, the temporal progression of driving fatigue, which differs between individuals according to circadian rhythms, was not considered. This is a viable area for future research.
Data Availability
The vehicle trajectory data used to support the findings of this study were provided only for academic research by the Smart Big Data Center of Korea Transportation Institute and QBICWARE.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.