Modeling of time-dependent safety performance using anonymized and aggregated smartphone-based dangerous driving event data

https://doi.org/10.1016/j.aap.2019.105286

Highlights

  • Large-scale smartphone-based dangerous driving events data were explored and associated with crash occurrence.

  • A multivariate conditional autoregressive model was used to account for both spatial and temporal dependence of crash data.

  • The relationship between dangerous driving events and crash occurrence was found to vary across different times of day.

  • Time-dependent hotspots were identified.

Abstract

Safety performance functions (SPFs) are generally used to relate exposure to the expected number of crashes aggregated over a long period (e.g., a year) while holding all other risk factors constant, and to identify hotspots that have excessive crashes regardless of time period. However, the relationships among exposure, risk factors, and crash occurrence are highly likely to vary across different times of day. This study aims to establish time-dependent SPFs for urban roads by using large-scale dangerous driving event data captured by smartphones at different times of day. Multivariate conditional autoregressive (MVCAR) models are developed to jointly account for the spatial and temporal dependence of crash observations. Results of two-sample Kolmogorov-Smirnov tests affirm the heterogeneity of the safety effects of dangerous driving events in different time periods. Time-dependent hotspots are identified using the potential for safety improvement (PSI) metric. The assumption is that, because traffic conditions and the environment change across times of day, safety hotspots for different time periods should differ from each other. According to the results of Wilcoxon signed-rank tests, hotspots identified for different times of day are mostly different from each other. The findings of this study provide insights into the temporal effects of risk factors and can support the development of time-dependent safety countermeasures. This study also shows the potential of leveraging anonymized and aggregated dangerous driving data to assess traffic safety issues.

Introduction

Safety performance functions (SPFs) are commonly used to correlate geometric, traffic, and environmental characteristics with crash outcomes. These models are frequently used to support the detection of hotspots, that is, sites that have more crashes than similar sites (Wang et al., 2014).

SPFs are primarily designed to relate exposure to the expected number of crashes, usually per year, at a location while holding all other risk factors constant (Hauer, 1995). Such aggregated functions cannot clearly account for potential variations in crash occurrence at different times of day. Factors such as traffic exposure and some roadway conditions have distinct patterns at different periods of a day (e.g., congestion at peak hours vs. free flow at night), which naturally leads to different levels of crash risk at different periods. For example, one may see more speeding events when traffic is less congested.

A number of studies have illustrated the non-uniform distribution of crashes across different times of day (e.g., Qin et al. (2006) and Pahukula et al. (2015)). From the implementation perspective, safety countermeasures should be designed accordingly. However, if conventional SPFs were used, it would be difficult to capture the appropriate association between crashes and time-sensitive risk factors. Consequently, it would be harder to reveal that a location with excessive crashes during certain periods may be relatively safe during the rest of the day. In other words, hotspots can be time-dependent. In a similar vein, Folkard (1997) proposed the term “black times” to indicate high-risk time periods, in contrast to “black spots,” which represent fixed high-risk locations.

To model time-dependent safety performance, it is essential to collect risk factors during different times of day. Such studies are rarely performed because time-dependent risk factors are difficult to obtain using traditional data collection methods. Beyond relating short-term traffic flow measures to crash occurrence (e.g., Golob and Recker (2004), Qin et al. (2006), and Pahukula et al. (2015)), it is very difficult to gather, in urban areas, the many crash precursors, such as dangerous driving events, that have been found to be positively correlated with crash or near-crash occurrence (Guo and Fang, 2013; Paleti et al., 2010).

One possible way to collect dangerous driving event data is to resort to connected vehicle (CV) technologies. CVs are vehicles that are able to communicate with each other (V2V) and with the infrastructure (V2I) through wireless communication technologies (Feng et al., 2015). The exchange of information between connected vehicles can be used to detect risk or potential collisions (Xie et al., 2018ab), and warnings can then be triggered and issued to drivers accordingly (Doecke et al., 2015). For example, if a vehicle brakes hard in front of the host vehicle, an emergency electronic brake light (EEBL) warning will be issued to the host vehicle (Howe et al., 2016). In this sense, the warnings themselves can be a good indicator for crashes.

However, because of the high implementation cost of CV devices, large-scale installation may not be feasible in the near term. In addition, due to privacy and security concerns, vehicle and driver IDs and other sensitive information that can link travel trajectories to an individual vehicle or driver will be scrubbed in future CV data (e.g., in the Connected Vehicle Pilot Deployment (CVPD) Program – New York City (NYC) DOT Pilot (Galgano et al., 2016)). These limitations motivate us to explore other data sources that can serve as a substitute for future CV data. Many crash surrogates have been proposed in the traffic safety literature. For example, time-to-collision (TTC) (Hayward, 1972), deceleration rate to avoid collision (DRAC) (Cooper and Ferguson, 1976), and post-encroachment time (PET) (Allen et al., 1978) are among the commonly used surrogates for approximating crashes. However, calculating these surrogates often requires detailed trajectory information for two consecutive vehicles in a car-following scenario, which is difficult to obtain in the real world. Thus, indicators that can be derived from an individual vehicle trajectory are more practical and are the focus of this study.

Anonymized and aggregated dangerous driving event data collected through smartphones by a private company, Zendrive (Zendrive, 2018), are used in this study. Driving event data are collected automatically and passively from smartphone sensors while the Zendrive application is active. Specifically, four types of dangerous driving events that have been shown in the literature to be good indicators for crashes (please refer to the Literature Review section) are collected: fast acceleration, hard braking, speeding, and phone use while driving. Zendrive records the location and time of each dangerous driving event, which enables the development of time-dependent safety models. Although it lacks a vehicle-to-vehicle communication component, Zendrive has very large coverage due to the ubiquitous presence of smartphones. This offers the opportunity to probe safety issues with unique data over a large area rather than at individual sites.
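To illustrate why trajectory-based surrogates need data from two vehicles at once, the following is a minimal sketch of time-to-collision for a simple car-following case. It assumes the common textbook form (gap divided by closing speed); the function name, argument names, and units are illustrative and not taken from the study.

```python
import math


def time_to_collision(gap_m: float, follower_speed_mps: float, leader_speed_mps: float) -> float:
    """Return the time-to-collision (seconds) for a follower approaching a leader.

    Assumes both vehicles keep their current speeds. If the follower is not
    closing the gap, no collision is projected and infinity is returned.
    """
    if gap_m < 0:
        raise ValueError("gap_m must be non-negative")
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


# Hypothetical example: a 20 m gap closing at 5 m/s.
example_ttc = time_to_collision(20.0, 15.0, 10.0)
```

Both the gap and the two speeds must be observed at the same instant, which is the data requirement that single-vehicle indicators such as hard braking avoid.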

This paper aims to develop time-dependent SPFs by leveraging large-scale dangerous driving event data along with geometric, traffic, and environmental characteristics. The unique dataset provided by Zendrive allows us to fully explore the safety effects of dangerous driving events during different times of day, which have rarely been studied in the literature. Manhattan, New York City is used as our study area. A multivariate conditional autoregressive (MVCAR) model that can account for both spatial and temporal dependence is developed to model the time-dependent safety performance of different areas. With the developed model, the temporal effects of risk factors are investigated and time-dependent hotspots are identified. Potential time-dependent safety countermeasures are also discussed, along with applications of the developed model.

Section snippets

Time-of-day-related safety performance

Existing studies have discussed the temporally variable association of crash occurrence with respect to risk factors on road environment, traffic characteristics, and driver behavior (Ivan et al., 2000; Lenné et al., 1997; Maze et al., 2006; Plainis and Murray, 2002). Environmental attributes are critical risk factors that have proven to affect crash rates (Balagh et al., 2014). Two commonly examined road environmental factors are weather and visibility. Maze et al. (2006) reviewed a large

Methodology

To evaluate time-dependent safety performance, each day was divided into four time periods: a.m. peak (6 a.m. to 10 a.m.), midday (10 a.m. to 4 p.m.), p.m. peak (4 p.m. to 7 p.m.), and night (7 p.m. to 6 a.m.). This temporal separation is consistent with our previous study that explored the safety impacts of truck off-hour delivery programs in New York City (NYC) (Xie et al., 2015). As for the spatial unit of analysis, census tracts (n = 282) of Manhattan were used as the basic geographical
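The four time periods described above can be sketched as a simple lookup on the hour of an event timestamp. This is an illustrative sketch only: the period labels and function names are hypothetical, and treating each boundary hour as the start of the next period (half-open intervals) is an assumption, since the text does not say how boundary times were handled.

```python
from datetime import datetime

# Daytime periods as (label, start_hour, end_hour), covering [start, end).
# Hours come from the text; the half-open boundary handling is an assumption.
DAYTIME_PERIODS = [
    ("am_peak", 6, 10),
    ("midday", 10, 16),
    ("pm_peak", 16, 19),
]


def time_period(hour: int) -> str:
    """Map an hour of day (0-23) to one of the four time periods."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    for label, start, end in DAYTIME_PERIODS:
        if start <= hour < end:
            return label
    # Everything else (7 p.m. to 6 a.m.) wraps around midnight.
    return "night"


def period_of(timestamp: datetime) -> str:
    """Map an event or crash timestamp to its time period."""
    return time_period(timestamp.hour)
```

For example, `period_of(datetime(2019, 1, 1, 17, 30))` falls in the p.m. peak under these assumptions.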

Data preparation and exploration

The two major data sources used in this study are crashes and dangerous driving events. Other data sources, including transportation data, land use data, and socio-demographic data, were also collected for analysis.

Discussion of results

The proposed MVCAR model was developed to model crash frequencies at different times of day. For comparison, the Poisson-Gamma (PG) model and the univariate conditional autoregressive (UCAR) model were also developed in this study. To obtain the most parsimonious model, a stepwise selection method based on the Akaike Information Criterion (AIC) (Huang et al., 2009; Yamashita et al., 2007) was employed to select explanatory variables using the stepAIC function in R (Ripley et al., 2013)
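As a rough illustration of the criterion behind AIC-based selection, the sketch below computes AIC = 2k − 2 ln L for two simple Poisson models and keeps the one with the lower value. It is not the stepAIC procedure or the models used in the study: the counts, the grouping variable, and all names are hypothetical, and closed-form Poisson means are used in place of a fitted regression.

```python
import math


def poisson_log_likelihood(counts, means):
    """Log-likelihood of observed counts under independent Poisson means."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y, mu in zip(counts, means))


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data: crash counts in eight zones, split into two groups.
counts = [2, 3, 1, 2, 8, 9, 7, 10]
groups = [0, 0, 0, 0, 1, 1, 1, 1]

# Model A: a single mean for all zones (1 parameter).
overall_mean = sum(counts) / len(counts)
aic_single = aic(poisson_log_likelihood(counts, [overall_mean] * len(counts)), 1)

# Model B: a separate mean per group (2 parameters).
group_means = {}
for g in set(groups):
    members = [y for y, gg in zip(counts, groups) if gg == g]
    group_means[g] = sum(members) / len(members)
aic_grouped = aic(poisson_log_likelihood(counts, [group_means[g] for g in groups]), 2)

# Keep the candidate with the lower AIC.
best_model = min([("single", aic_single), ("grouped", aic_grouped)], key=lambda item: item[1])[0]
```

A stepwise procedure repeats this kind of comparison, adding or dropping one explanatory variable at a time until no change lowers the AIC.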

Summary and conclusions

This study examined time-dependent safety performance by leveraging dangerous driving event data. Dangerous driving events were directly related to crash occurrence in different time periods. The multivariate conditional autoregressive (MVCAR) model, which can jointly account for the spatial and temporal dependence of crash observations, was found to achieve the best performance in associating risk factors with crash occurrence. Variation in estimated coefficients in models for different

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work is partially funded by the Connected Cities for Smart Mobility towards Accessible and Resilient Transportation (C2SMART) Center at New York University (NYU). The authors would like to thank the New York State Department of Transportation, New York City Department of City Planning, Metropolitan Transportation Authority, New York Metropolitan Transportation Council and U.S. Census Bureau for publicly providing data used in this study. The authors would also like to thank Zendrive Inc for

References (93)

  • Y. Feng et al. A real-time adaptive signal control in a connected vehicle environment. Transp. Res. Part C Emerg. Technol. (2015)
  • S. Folkard. Black times: temporal determinants of transport safety. Accid. Anal. Prev. (1997)
  • V. Gitelman et al. Exploring relationships between driving events identified by in-vehicle data recorders, infrastructure characteristics and road crashes. Transp. Res. Part C Emerg. Technol. (2018)
  • T.F. Golob et al. A method for relating type of crash to traffic flow characteristics on urban freeways. Transp. Res. Part A Policy Pract. (2004)
  • F. Guo et al. Individual driver risk assessment using naturalistic driving data. Accid. Anal. Prev. (2013)
  • J.N. Ivan et al. Explaining two-lane highway crash rates using land use and hourly exposure. Accid. Anal. Prev. (2000)
  • P.-F. Kuo et al. Using geographical information systems to organize police patrol routes effectively by grouping hotspots of crash and crime data. J. Transp. Geogr. (2013)
  • B. Lan et al. Validation of a full Bayes methodology for observational before-after road safety studies and application to evaluation of rural signal conversions. Accid. Anal. Prev. (2009)
  • M.G. Lenné et al. Time of day variations in driving performance. Accid. Anal. Prev. (1997)
  • D. Lord et al. The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives. Transp. Res. Part A Policy Pract. (2010)
  • J.-L. Martin. Relationship between crash rate and hourly traffic flow on interurban motorways. Accid. Anal. Prev. (2002)
  • E. Melachrinoudis et al. A mixed integer knapsack model for allocating funds to highway safety improvements. Transp. Res. Part A Policy Pract. (2002)
  • S. Mitra et al. On the nature of over-dispersion in motor vehicle crash prediction models. Accid. Anal. Prev. (2007)
  • J. Pahukula et al. A time of day analysis of crashes involving large trucks in urban areas. Accid. Anal. Prev. (2015)
  • R. Paleti et al. Examining the influence of aggressive driving behavior on driver injury severity in traffic crashes. Accid. Anal. Prev. (2010)
  • A. Pande et al. A preliminary investigation of the relationships between historical crash and naturalistic driving. Accid. Anal. Prev. (2017)
  • S.S. Pulugurtha et al. Pedestrian crash estimation models for signalized intersections. Accid. Anal. Prev. (2011)
  • X. Qin et al. Bayesian estimation of hourly exposure functions by crash type and time of day. Accid. Anal. Prev. (2006)
  • M.A. Quddus. Modelling area-wide count outcomes with spatial correlation and heterogeneity: an analysis of London crash data. Accid. Anal. Prev. (2008)
  • L. Sasidharan et al. Application of propensity scores and potential outcomes to estimate effectiveness of traffic safety countermeasures: Exploratory analysis using intersection lighting data. Accid. Anal. Prev. (2013)
  • J. Stipancic et al. Vehicle manoeuvers as surrogate safety measures: Extracting data from the gps-enabled smartphones of regular drivers. Accid. Anal. Prev. (2018)
  • T. Toledo et al. In-vehicle data recorders for monitoring and feedback on drivers’ behavior. Transp. Res. Part C Emerg. Technol. (2008)
  • K.M. White et al. Mobile phone use while driving: an investigation of the beliefs influencing drivers’ hands-free and hand-held mobile phone use. Transp. Res. Part F Traffic Psychol. Behav. (2010)
  • M. Wier et al. An area-level model of vehicle-pedestrian injury collisions with implications for land use and transportation planning. Accid. Anal. Prev. (2009)
  • K. Xie et al. Corridor-level signalized intersection safety analysis in Shanghai, China using Bayesian hierarchical models. Accid. Anal. Prev. (2013)
  • M. Abdel-Aty et al. Linking roadway geometrics and real-time traffic characteristics to model daytime freeway crashes: generalized estimating equations for correlated data. Transp. Res. Rec.: J. Transp. Res. Board (2004)
  • B.L. Allen et al. Analysis of traffic conflicts and collisions. Transp. Res. Rec.: J. Transp. Res. Board (1978)
  • A.K.G. Balagh et al. Highway accident modeling and forecasting in winter. Transp. Res. Part A Policy Pract. (2014)
  • P. Baltusis. On Board Vehicle Diagnostics (2004)
  • D.F. Cooper et al. Traffic studies at T-Junctions. 2. A conflict simulation record. Traffic Eng. Control (1976)
  • T.A. Dingus et al. The 100-Car Naturalistic Driving Study, Phase II-results of the 100-Car Field Experiment (2006)
  • T.A. Dingus et al. The 100-Car Naturalistic Driving Study. Phase 2: Results of the 100-Car Field Experiment (2006)
  • M.D. Dunlop et al. Using smartphones in cities to crowdsource dangerous road sections and give effective in-car warnings. Paper Presented at the Proceedings of the SEACHI 2016 on Smart Cities for Better Living with HCI and UX (2016)
  • M.A. Elliott et al. Drivers’ compliance with speed limits: an application of the theory of planned behavior. J. Appl. Psychol. (2003)
  • M. Fazeen et al. Safe driving using mobile phones. IEEE Trans. Intell. Transp. Syst. (2012)
  • S. Galgano et al. Connected Vehicle Pilot Deployment Program Phase 1: Performance Measurement and Evaluation Support Plan: New York City (2016)