How risky is distracted driving?

We use data on fatal crashes to quantify the risk of distracted driving. We repurpose, extend, and improve a methodology used to estimate the riskiness of drinking drivers (Levitt & Porter, 2001). Our analysis suggests that distracted drivers are three times more likely to cause a fatal crash than focused drivers. We also estimate that distracted drivers represent three to four percent of drivers on the road at any given time. Further, we find that distractions associated with cellphone use are less likely to cause a fatal crash than are distractions from other sources. The externality costs between $0.02 and $0.05 per mile driven. The insurance surcharge for a distracted driving citation that could internalize the avoidable insurance losses is approximately $577 per year. Our work extends the literature on distracted driving and traffic fatalities. We believe our results can inform policymakers on the traffic-safety and economic consequences of distracted driving.


Introduction
Vehicle crashes in the United States cause thousands of fatalities, millions of injuries, and substantial economic costs each year. In response, policymakers enact various measures aimed at improving traffic safety, such as incentivizing or mandating vehicle safety features, enacting seat belt laws, and prohibiting impaired driving. In recent years, the ubiquity of cellphones in society has thrust the issue of distracted driving to the forefront of policy discussions and actions surrounding traffic safety.

However, both policymakers and researchers are divided on the efficacy and net benefits of laws intended to mitigate distracted driving, especially laws that ban the use of handheld devices while driving.
Conflicting results in the literature exacerbate disagreements among policymakers, because both sides can find empirical results to support their positions. For example, published estimates of the relative riskiness of distracted driving range from 0, meaning no difference between distracted and focused driving (e.g., Bhargava & Pathania, 2013), to 23 times more risky (e.g., Olson et al., 2009), with most of the positive estimates falling between 2 and 8 times the risk of focused driving.
In this paper, we develop and apply a new method for estimating the risk associated with distracted driving that overcomes several of the challenges that hinder previous studies. We draw on the work of Levitt and Porter (2001) (henceforth L&P), who develop an elegant and novel strategy for estimating the risk of drinking and driving using only data on fatal crashes. More specifically, leveraging the fact that many fatal crashes involve multiple drivers, L&P show that, for two-car crashes, the relative frequency of crashes involving sober drivers and drinking drivers provides sufficient information for estimating the risk associated with drinking and driving. With this novel method, L&P use the information reported in the Fatality Analysis Reporting System (FARS) database to identify crashes involving sober and drinking drivers. They estimate that drivers who are legally drunk pose a crash risk that is 13 times greater than that of sober drivers. In this way, L&P significantly improve on prior studies, addressing several of their shortcomings, and provide one of the most complete and reliable insights into the risk of drinking and driving.
We use L&P's method as a kernel to develop a radically different strategy for estimating the risk associated with distracted driving. We analyze the uniform and detailed FARS data in five steps. First, we develop a model similar to L&P's using a different approach that allows us to estimate the relative risk consistently with fewer observations. Second, we validate our changes to the methodology by applying them to drinking drivers and comparing the results to our replication of L&P's analysis. Third, we apply the validated method to estimate the frequency and relative risk of distracted driving. Fourth, we compare the relative risks of distractions from various sources to that of cell phone use. Fifth, we estimate the externalities and associated costs of distracted driving.
Our results suggest that distracted drivers represent between 3.5% and 4% of all drivers (at any time on the road) and are between 2.5 and 3.14 times more risky than focused drivers. Though these numbers are not as high as the relative riskiness of drinking drivers (7.76 to 8.56 times riskier in the 8:00 p.m. to 5:00 a.m. window), they are consistent with a significant risk posed by distracted driving. We also find that cell phone distractions are somewhat less common and less risky than other sources of distraction. Between 23% and 35% of all distracted drivers are distracted by cellphones, and the relative riskiness of cell phone distraction is about two-thirds that of distractions from other sources (e.g., passengers, eating, grooming, etc.). Policymakers should consider this information as they deliberate the costs and benefits of anti-distraction laws.

Background and prior literature
Proponents of policies aimed at reducing distracted driving assert that distractions pose a significant risk to individuals on the road. The fact that many states outlaw certain distractions, such as holding cell phones, suggests many policymakers believe the political, economic, and societal costs of these regulatory and legislative measures are justified. In contrast, opponents of distracted driving laws argue that cellphone bans (and similar laws) may lead to increased police scrutiny, infringement of rights, or other negative externalities that are not warranted by the risk of distracted driving (Dutton, 2017). Cotter (2019) writes about one example of a regional NAACP chapter opposing such laws unless data on race are collected during each traffic stop. Lyman (2021) reports that state legislators in Alabama voiced concerns about police scrutiny and government overreach.
Another strand of literature recognizes the value of using a cell phone while driving. For example, Hahn et al. (2000) and Cohen and Graham (2003) find the potential costs of cell phone bans to exceed the benefits by tens of billions of dollars in the late 1990s, when cell phone ownership was approximately 25% as prevalent as it is today.
Both proponents and detractors of distracted driving laws lean on the academic literature to support their policy positions. Proponents reference the studies that suggest distracted driving is associated with significant negative consequences that can, to a degree, be remedied by policy action. For example, Redelmeier and Tibshirani (1997) and McEvoy et al. (2005) provide evidence that the hand-held use of a cellphone while driving increases the risk of an automobile crash by a factor of four. Other researchers, using various data sources and study designs, provide additional evidence that the use of a cellphone while driving increases the likelihood of crashing (e.g., Klauer et al., 2006, 2014; McCartt et al., 2006). Other studies provide evidence that legislation (hand-held cellphone bans) reduces cellphone use, crashes, and automobile insurance claims (e.g., Anyanwu, 2012; Braitman & McCartt, 2010; French & Gumus, 2018; Karl & Nyce, 2019, 2020; Kolko, 2009; McCartt et al., 2010; Nikolaev et al., 2010; Sampaio, 2010).
Although several studies indicate there is risk associated with distracted driving, there is not a clear consensus in the literature. Several papers discuss methodological and data issues that could bias results involving distracted driving. Other studies develop hypotheses that could explain why cell phone use may not increase crash frequency. We review both explanations in turn. McCartt et al.'s (2014) comprehensive literature review indicates that several studies find no effect of distracted driving or laws intended to limit distracted driving (Bhargava & Pathania, 2013; Trempel et al., 2011). This lack of consensus in the literature could be caused by several empirical challenges. First among such challenges is data integrity. McCartt et al. (2014) note that "...there is considerable unsettled evidence with regard to the patterns of drivers' phone use or the effects of use on crash risk. Evaluations of cellphone and texting bans also must grapple with substantial methodological and data-related challenges that many of the reviewed studies were unable to overcome." Reliable, uniform data on the effects of distracted driving on crash risk are scarce. Methods used in the literature include sampling cellphone records, observing drivers while operating a vehicle, and examining aggregate data pertaining to cellphone use and crashes. Consequently, estimates of the risk of distracted driving vary widely across studies using different data and methods. The low end of the estimates in prior studies suggests distracted driving (in the form of cell phone use) presents no additional risk for drivers (Bhargava & Pathania, 2013). In sharp contrast, the high end of the estimates suggests distracted driving could be 23 times riskier than focused driving (Olson et al., 2009).
Although the National Highway Traffic Safety Administration (NHTSA) recognizes the limitations to collecting and reporting distracted driving data, it does not appear that any states have enacted legislation aimed at improving distracted driver data collection (Bloch, 2020).
Collection of data on rates of distracted driving is left to surveys, efforts to count distracted drivers with manual observation at intersections (visible cell phone use), and police enforcement in states where such behavior is banned, most notably through high visibility enforcement (often combined with sobriety checkpoints, speed zones, or other high visibility enforcement efforts). This has led researchers to try a variety of approaches to estimate distracted driving, including using cellphone billing records (McEvoy et al., 2005; Redelmeier & Tibshirani, 1997), OnStar call records (Young & Schreiner, 2009), and video recordings (Klauer et al., 2006). By focusing on fatal crashes, we believe we have the most accurate and consistent reflection of distracted driving recorded at the time of the crash.
Even putting aside data and methodological issues, other concerns arise when attempting to evaluate the risk distractions pose to drivers. For example, issues pertaining to enforcement present a challenge in assessing the risk of distracted driving, as it is much easier to prove an individual was, for example, drinking and driving (e.g., using a breathalyzer or blood test) than it is to prove that a driver was not paying attention at a specific time on the road. 1 Another problem in assessing the risk of distracted driving is that there are many forms of driving distractions (e.g., eating, adjusting radio/infotainment settings, talking to passengers, talking/texting or using other features of smart phones) that must be considered.
Finally, a strand of literature develops three hypotheses to explain why cell phone bans might not reduce crashes. Bhargava and Pathania (2013) note that drivers who use cell phones could compensate for the risk by driving slower or in less crowded lanes, a direct analogy to the "Peltzman Effect" developed in the context of seat belt use (Peltzman, 1975). Hahn et al. (2000) posit that drivers who use their phones while driving could have an affinity for risk. In this case, they would substitute some other risky activity if cell phone use is not legal. Consistent with this hypothesis, Wilson et al. (2003) find that drivers who use a cell phone while driving incur more citations for other risky behaviors that are not related to cell phone use (e.g., drinking, seat belt use). The third hypothesis is that the effects of cell phone use while driving could be heterogeneous across drivers (Hahn et al., 2000). For example, the potential distraction of cell phone use could mitigate the effects of boredom and fatigue for some drivers, yet cause other drivers to crash in the same setting.

Distracted driving and the FARS data
How can law enforcement officers determine if a driver was distracted after a crash occurs? The task seems even more daunting if we consider the percentage of distracted drivers who, themselves, die in the crashes. After careful consideration of this question, we are confident that the accuracy of distracted driving observations in the FARS data is a primary strength of our analysis. 2 Law enforcement officers have extensive resources available to determine if a driver was distracted in a fatal crash. Moreover, the processes and techniques used are consistent across states. Officers subpoena data from mobile devices that show what was done with the device and when. They also have access to data from the vehicles. Investigators retrieve data from airbag sensing and diagnostic modules (SDMs) or from electronic control units (ECUs) in older vehicles. A vast literature assesses reaction times in various scenarios (e.g., Gao & Davis, 2017; Makishita & Matsunaga, 2008). If a crash occurs and the at-fault driver did not react within adjusted time parameters, the officer can determine the driver was distracted. Reaction times are adjusted to reflect factors including driver age, lighting, weather, and road structure. If a mobile phone was in use, it would be recorded as the cause of distraction. If the driver is clutching a cheeseburger, the cause is likely eating, and so on. Of course, if the drivers or passengers survive, they can be questioned along with any other witnesses. Gordon (2014) presents a case study of how investigators use these data.
In addition to the knowledge and expertise of the law enforcement officers reporting data to FARS, the NHTSA allocates substantial funding and resources to crash data collection. In the last year of our sample, NHTSA spent $34.7 million on crash data collection (NHTSA, 2019). The FARS database includes more than 140 coded variables describing each crash in the United States, the District of Columbia, and Puerto Rico in which at least one person dies. Data are checked for consistency and accuracy by FARS data analysts. NHTSA (2019) claims that FARS "is the most referenced motor vehicle crash data system in the world."

The model
L&P develop a methodology that measures the prevalence of drinking drivers on the road and the risk they pose relative to other drivers. Their methodology relies on five assumptions and properties of the binomial distribution to estimate both the relative risk posed by drinking drivers and the frequency of drinking and driving. Using the FARS data on two-car fatal crashes they estimate that drivers with alcohol in their blood are seven times more likely to cause a fatal crash and legally drunk drivers are thirteen times more likely. They also find that at peak drinking and driving times (weekends between 1:00 a.m. and 3:00 a.m.), up to 25 percent of drivers on the road have been drinking. They estimate that an $8,000 fine for those who are caught would internalize the cost of the externalities associated with drinking drivers.
The novel approach taken by L&P enables them to make a limited number of assumptions and yet estimate both the frequency and relative riskiness of drunk driving from the fatal two-car crash data in the FARS dataset. L&P make five assumptions: 3
1. There are only two types of drivers: drinking or sober.
2. There is equal mixing of drinking and sober drivers on the road:
(a) The number of interactions that a driver has with other cars is independent of driver type.
(b) Driver type does not affect the composition of the driver types with which that driver interacts.
3. A fatal crash results from a single driver's error.
4. The composition of driver types in one fatal crash is independent of the composition of driver types in other fatal crashes.
5. Drinking (weakly) increases the likelihood that a driver makes an error resulting in a fatal crash.
These five assumptions along with properties of the binomial distribution involved with two-car fatal crashes allow L&P to estimate a measure of relative riskiness without having to assume frequency (and vice versa). This is especially appealing in the context of distracted driving, where estimates of the frequency of distracted driving vary widely by source. Unfortunately, L&P's methodology is not a turnkey solution for measuring the risk of distracted driving. They solve for the relative riskiness of drinking drivers solely based on the observed distribution of fatal crashes. Then they back out the value of the relative exposure. Their method is constrained by the choice of optimization method. Because they solve a quadratic equation, the solution requires a minimum necessary percentage of the two-car fatal crashes to involve one drinking driver and one sober driver. This is not a binding constraint in their context because more than half of all drivers that cause a fatal crash have been drinking. However, our data show only about 4% of at-fault drivers in fatal crashes are distracted. We avoid this constraint by using a nonlinear programming (NLP) optimization method 4 to estimate the relative riskiness and relative exposure of distracted drivers. The NLP method directly maximizes the log likelihood function instead of solving a quadratic equation. This approach allows us to include one-, two-, and three-car crashes in the analysis to increase the number of observations.
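A brief sketch may help show where the quadratic constraint comes from. It assumes the two-car composition probabilities implied by the five assumptions above, writes $\theta$ for the relative riskiness and $N$ for the relative exposure of drinking drivers, and uses $A_{DD}$, $A_{DS}$, and $A_{SS}$ (labels chosen here for illustration) for the observed numbers of two-car fatal crashes with two drinking drivers, one drinking and one sober driver, and two sober drivers. It is a reconstruction under those assumptions, not a transcription of L&P's own equations.

```latex
\begin{align*}
% Two-car crash composition under equal mixing and single-driver error
P_{DD} : P_{DS} : P_{SS} &= \theta N^{2} : (\theta + 1)\,N : 1 \\
% N cancels in this ratio, leaving an equation in theta alone
R \equiv \frac{A_{DS}^{2}}{A_{DD}\,A_{SS}} &\approx \frac{P_{DS}^{2}}{P_{DD}\,P_{SS}} = \frac{(\theta + 1)^{2}}{\theta} \\
% Rearranged as a quadratic in theta
\theta^{2} - (R - 2)\,\theta + 1 &= 0 \\
\theta &= \frac{(R - 2) \pm \sqrt{R\,(R - 4)}}{2}
\end{align*}
```

Under this sketch, a real root exists only when $R \geq 4$, that is, when $A_{DS} \geq 2\sqrt{A_{DD}\,A_{SS}}$. This is one way to read the minimum share of mixed crashes that the quadratic solution needs. Maximizing the log likelihood directly imposes no such requirement.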

Model assumptions
Our model requires five assumptions very similar to those of L&P.
1. There are two types of drivers, distracted (D) and focused (F).
2. There is equal mixing of distracted and focused drivers on the road, meaning
(a) The number of interactions that a driver has with other cars is independent of the driver's type.
(b) A driver's type does not affect the composition of the driver types with which they interact.
3. A fatal car crash results from a single driver's error.
4. The composition of driver type(s) in one fatal crash is independent of the composition of driver type(s) in other fatal crashes.
5. The relative likelihood of a distracted driver causing a one-car fatal crash is equal to the relative likelihoods of a distracted driver causing a two- or three-car fatal crash.
Our first four assumptions are the same as those of L&P. The first assumption limits the number of driver categories and implies that there are no "undetermined" drivers. This is necessary because the methodology can only compare two categories at a time. However, we are able to compare distracted and focused drivers across other categories (age, gender, driving record) by limiting the sample. The second assumption requires "equal mixing" of driver types on the road. It means that there is no bunching of drivers in space or time and both types of drivers encounter each other on the road at rates similar to their rate in the population. This assumption is especially likely to hold as we conduct the analysis in smaller groups. If violated, this assumption will lead to downward bias in estimates of relative risk and upward bias in estimates of frequency.
The third assumption rules out the possibility that both drivers contribute to the occurrence of the crash. This assumption is problematic if drivers in one category are better able to avoid potential crashes than drivers in the other category. In the Appendix, we show that violations of this assumption result in downward biased estimates of relative risk.

The fourth assumption, that driver types are independent across crashes, allows us to use the multinomial distribution to characterize the joint distribution of driver types involved in fatal crashes.
Our fifth assumption is different from L&P's. Their model requires that drinking drivers are at least as likely as sober drivers to cause a fatal accident. Given the hypotheses discussed above (compensation, substitution, and heterogeneous effects), we do not assume that distracted driving is riskier than focused driving. This is not a problem in our model, because we do not use the quadratic formula to solve the likelihood function. Instead, we assume that the relative likelihood of a distracted driver causing a fatal crash is equal for one-, two-, and three-car fatal crashes. This assumption seems plausible and it allows us to use more data in the calculations.
Another restriction of L&P's method is that it cannot identify the parameters of the model using only one-car crashes. They must estimate the relative riskiness from the two-car crashes, back out the relative exposure, and then use the relative exposure to estimate the relative riskiness for one-car crashes. Again, because there is a high enough percentage of two-car crashes involving drinking drivers, this constraint is not binding for their study. However, in our distracted-driving analysis, two-car fatal crashes involving at least one distracted driver are sparse in time and place, especially when we relax Assumption 2 and shrink the geographic and temporal units of observation. Therefore, to include as much information in the estimation as possible, our method combines one-car, two-car, and three-car fatal crashes, 5 and estimates the relative riskiness and relative exposure in one step. This necessitates our Assumption 5 for the relative riskiness of distracted drivers. Assumption 5 does not require the probabilities of a distracted driver causing a one-, two-, or three-car fatal crash to be equal, only that the levels of relative riskiness are the same. 6 Moreover, L&P's results support this assumption. There is virtually no difference between the relative likelihoods of causing one-car and two-car crashes in the case of drinking or drunk drivers (Loughran & Seabury, 2007).

Developing the model
The technical exposition of our model begins with equations representing the probabilities of one-, two-, and three-car crashes. In Eqs. (4) through (23), we apply the assumptions explained above and rearrange the terms to produce a likelihood function that we can solve for the relative risk and frequency of distracted driving.
Journal of Risk and Uncertainty (2023) 66:279-312

First, we derive the probabilities that a driver is of a certain type, interacts with one car, two cars, or no other cars, and the types of drivers in those cars. In a given geographic area and time period, there are, on average, $N_D$ distracted drivers and $N_F$ focused drivers on the road at any instant. At each instant, on average, a share $p$ of all the drivers on the road experience interactions with one other car ($I_2 = 1$), during which a two-car crash is possible. Meanwhile, a share $q$ of all the drivers on the road experience interactions with two other cars ($I_3 = 1$), during which a fatal crash involving three cars can happen. Therefore, a share $1 - p - q$ of all drivers on the road do not encounter any other cars ($I_1 = 1$), hence only one-car fatal crashes can occur. Thus, at any instant, for any driver on the road, the probability that the driver is of type $i$ and does not interact with any other drivers, conditional on driving on the road ($Dr = 1$), is

$$\Pr(i, I_1 = 1 \mid Dr = 1) = (1 - p - q)\,\frac{N_i}{N_D + N_F}. \tag{1}$$

The joint distributions for the driver encountering one or two other cars with a given combination of driver types, conditional on driving on the road, are, respectively,

$$\Pr(ij, I_2 = 1 \mid Dr = 1) = p\,\frac{N_i N_j}{(N_D + N_F)^2} \tag{2}$$

and

$$\Pr(ijk, I_3 = 1 \mid Dr = 1) = q\,\frac{N_i N_j N_k}{(N_D + N_F)^3}. \tag{3}$$

Thus, if we randomly pick one car on the road, Eqs. (1), (2), and (3) give us the joint probabilities of the driver's type, the number of cars with which the chosen car will interact, and the types of the drivers in those cars.

Next, in Eqs. (4) through (6), we derive the joint probabilities that the randomly chosen driver(s) in Eqs. (1) through (3) will cause a fatal crash. Let $\theta_i$ be the probability that a driver of type $i$ causes a fatal one-car crash. Assumption 5 implies that the probability that a driver of type $i$ causes a two-car or three-car fatal crash is proportional to that of causing a one-car fatal crash. Define $\alpha$ and $\beta$ as scaling parameters so that a driver of type $i$ has probabilities of $\alpha\theta_i$ and $\beta\theta_i$ to cause a two-car and a three-car fatal crash, respectively. Assumption 3 implies that the joint probabilities of the number of interacting cars, the driver type(s), and a fatal crash ($A = 1$), conditional on driving on the road, are

$$\Pr(i, I_1 = 1, A = 1 \mid Dr = 1) = \theta_i\,(1 - p - q)\,\frac{N_i}{N_D + N_F}, \tag{4}$$

$$\Pr(ij, I_2 = 1, A = 1 \mid Dr = 1) \approx \alpha\,(\theta_i + \theta_j)\,p\,\frac{N_i N_j}{(N_D + N_F)^2}, \tag{5}$$

and

$$\Pr(ijk, I_3 = 1, A = 1 \mid Dr = 1) \approx \beta\,(\theta_i + \theta_j + \theta_k)\,q\,\frac{N_i N_j N_k}{(N_D + N_F)^3}. \tag{6}$$

We use the approximately equal symbols ( ≈ ) in Eqs. (5) and (6) to show that the chances of multiple drivers making an error at the same time are sufficiently small to be ignored for parsimony. 7 Equal symbols will be used henceforth for simplicity. At this point, we know the probabilities of driver type(s) and the number of cars involved in the crash, conditional on driving from Eqs. (4) through (6). Our next step is to use Eqs. (4) through (6) to calculate the joint probabilities of the number of cars and driver types in a crash, given that a fatal crash has occurred, as shown in Eqs. (7) through (9).
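The conditioning step can be sketched as follows, assuming it is a direct application of Bayes' rule: each probability conditional on a fatal crash is the corresponding joint probability divided by the overall probability of a fatal crash. The label $I_1$ for the no-interaction event and the equation numbers are assumptions chosen to match the references to Eqs. (7) through (9) in the text.

```latex
\begin{align}
\Pr(i, I_1 = 1 \mid A = 1) &= \frac{\Pr(i, I_1 = 1, A = 1 \mid Dr = 1)}{\Pr(A = 1 \mid Dr = 1)} \tag{7} \\
\Pr(ij, I_2 = 1 \mid A = 1) &= \frac{\Pr(ij, I_2 = 1, A = 1 \mid Dr = 1)}{\Pr(A = 1 \mid Dr = 1)} \tag{8} \\
\Pr(ijk, I_3 = 1 \mid A = 1) &= \frac{\Pr(ijk, I_3 = 1, A = 1 \mid Dr = 1)}{\Pr(A = 1 \mid Dr = 1)} \tag{9}
\end{align}
% The common denominator sums the joint crash probabilities over every
% crash size and every combination of driver types
\begin{equation*}
\Pr(A = 1 \mid Dr = 1) = \sum_{i} \Pr(i, I_1 = 1, A = 1 \mid Dr = 1)
  + \sum_{i,j} \Pr(ij, I_2 = 1, A = 1 \mid Dr = 1)
  + \sum_{i,j,k} \Pr(ijk, I_3 = 1, A = 1 \mid Dr = 1)
\end{equation*}
```

Because every term shares the same denominator, only ratios of the joint probabilities matter once a fatal crash is known to have occurred, which is why relative terms can be identified from fatal crash data alone.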
Given that a fatal crash has occurred, Eqs. (7)-(9) tell us the joint probabilities of the number of cars involved in the crash and the composition of driver types. These equations allow us to use only the fatal crash data, without knowing all the details of the entire driving population. Now we begin the process of estimating the unknown variables in Eqs. (7) through (9). Because there are more unknowns than equations, we cannot solve them directly. However, we can identify the relative terms. Denote $P_i$, $P_{ij}$, and $P_{ijk}$ as the probabilities that the composition of driver type(s) is (are) of type $i$, types $i$ and $j$, and types $i$, $j$, and $k$, respectively, given that a fatal crash occurs. Define $\theta = \theta_D / \theta_F$ and $N = N_D / N_F$, where $\theta$ is the relative likelihood that a distracted driver will cause a fatal crash compared to a focused driver, and $N$ is the average ratio of distracted drivers to focused drivers on the road at any instant in a particular geographic area and period. Equations (10) through (18) state these probabilities explicitly in terms of $\theta$ and $N$ for each accident type and driver type. Equation (10) gives the probability that a fatal crash involves only one car driven by a distracted driver. Likewise, Eq. (11) gives the probability that a fatal crash involves only one car driven by a focused driver.
Equations (12) through (14) are the probabilities that the fatal crash involves two cars, with the three possible combinations of distracted and focused drivers (two distracted, two focused, or one distracted and one focused). Equations (15) through (18) are the probabilities that the fatal crash involves three cars, with the four possible mixes of distracted and focused drivers (three distracted, two distracted and one focused, one distracted and two focused, or three focused). Let $A_i$, $A_{ij}$, and $A_{ijk}$ be the numbers of fatal crashes involving driver(s) of type $i$, types $i$ and $j$, and types $i$, $j$, and $k$, respectively, and denote $A_{total}$ as the total number of fatal crashes. From the ratios of different types of accidents, we obtain the additional expressions of our model's parameters in Eqs. (19) through (22). Equation (19) expresses the ratio of fatal crashes caused by distracted drivers over those caused by focused drivers in terms of $\theta$ and the observed numbers of fatal crashes. Expressed in the same terms, Eqs. (20)-(22) represent the ratios of single-car, two-car, and three-car fatal crashes over the total number of fatal crashes, respectively. If measurement error exists and some distracted drivers are misclassified as focused, the estimates of $N$ and $\theta$, as expressed in Eq. (19), will be affected. This is indeed the case in our later analysis, where we examine the sensitivity of our estimates to potential driver misclassification. On the other hand, Eqs. (20)-(22) are not affected by this type of measurement error, because the numerators in these three equations are the total numbers of one-, two-, and three-car crashes, and these numbers do not change with driver type(s).
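One way to see how $\theta$ and $N$ enter is to condition on the number of cars in the crash. Under Assumptions 2, 3, and 5, the interaction shares and the scaling parameters cancel within each crash size, leaving composition shares that depend only on $\theta$ and $N$. The following is a sketch derived from those assumptions. The exact layout of Eqs. (10) through (18) may differ, and the subscript labels are chosen here for illustration.

```latex
\begin{align*}
% One-car crashes: share driven by a distracted or a focused driver
\frac{P_D}{P_D + P_F} &= \frac{\theta N}{\theta N + 1},
\qquad
\frac{P_F}{P_D + P_F} = \frac{1}{\theta N + 1} \\
% Two-car crashes: the three terms sum to (theta N + 1)(N + 1)
P_{DD} : P_{DF} : P_{FF} &= \theta N^{2} : (\theta + 1)\,N : 1 \\
% Three-car crashes: the four terms sum to (theta N + 1)(N + 1)^2
P_{DDD} : P_{DDF} : P_{DFF} : P_{FFF} &= \theta N^{3} : (2\theta + 1)\,N^{2} : (\theta + 2)\,N : 1
\end{align*}
```

For example, illustrative values of $\theta = 3$ and $N = 0.04$ would imply that about 10.7% of one-car fatal crashes involve a distracted driver, since $0.12 / 1.12 \approx 0.107$.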
Because Assumption 4 ensures the independence of the composition of driver type(s) across fatal crashes, the joint distribution of driver type(s) involved in fatal crashes is given by the multinomial distribution. We can derive the likelihood function as:

$$L = \frac{A_{total}!}{\prod_{c} A_c!}\,\prod_{c} P_c^{A_c}, \qquad c \in \{D, F, DD, DF, FF, DDD, DDF, DFF, FFF\}. \tag{23}$$

Substituting Eqs. (19)-(22) into Eqs. (10)-(18) enables expressing the probabilities solely in terms of $\theta$ and the observed numbers of fatal crashes. Substituting these probabilities into Eq. (23) yields the corresponding expression of the likelihood function. Recall that L&P need a high enough percentage of drinking-sober crashes in two-car fatal crashes to generate a meaningful estimate. They note that in the cases when the percentage is low, the estimates of the relative riskiness equal one. A potential drawback is that this method will cause downward bias in the estimation because drinking drivers should, on average, impose greater risk than sober drivers. The issue becomes much more problematic in estimates of the relative riskiness of distracted drivers, because in most of the geographic and temporal units, the percentage of distracted-focused crashes in two-car fatal crashes is well below the level required for using L&P's method. We address this issue by using NLP optimization to estimate $\theta$ instead of solving quadratic equations. We estimate $\theta$ by directly maximizing the log likelihood function (the logarithm of Eq. (23)), which is solely expressed in $\theta$ after plugging in Eqs. (10)-(22). After we obtain the respective estimates of $\theta$ for different geographic and temporal observation units, we plug these values back into the equations to express the log likelihood function solely in $N$. The implied relative exposure of distracted drivers for each geographic and temporal unit can then be estimated by maximizing the log likelihood function.
In the estimation process, as long as at least one distracted driver and at least one focused driver appear in the FARS reports for a particular geographic and temporal unit, we can use the information from that unit. For some smaller geographic and temporal units, there were not any three-car fatal crashes, because this type of fatal crash is the rarest compared with one-car and two-car crashes (in some cases, there were not any two-car fatal crashes). In these cases, the respective probabilities enter the log likelihood function as zeros, and do not affect the maximization process. This minimizes the downward bias of the estimation of relative riskiness.
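The estimation idea above can be illustrated with a minimal sketch. It rests on several assumptions that are not in the text: it maximizes the multinomial log likelihood of driver-type composition within each crash size, it estimates the relative riskiness and relative exposure jointly rather than in the two steps described above, and it uses a simple grid refinement as a stand-in for an NLP solver. The function names (`composition_probs`, `log_likelihood`, `fit`, `lp_quadratic_theta`) are hypothetical, and the two-car quadratic root is included only for contrast.

```python
import math


def composition_probs(theta, n):
    """Driver-type shares within one-, two-, and three-car fatal crashes.

    theta: relative riskiness; n: relative exposure (distracted / focused).
    """
    d1 = theta * n + 1.0
    d2 = d1 * (n + 1.0)
    d3 = d2 * (n + 1.0)
    return {
        "D": theta * n / d1, "F": 1.0 / d1,
        "DD": theta * n**2 / d2, "DF": (theta + 1.0) * n / d2, "FF": 1.0 / d2,
        "DDD": theta * n**3 / d3, "DDF": (2.0 * theta + 1.0) * n**2 / d3,
        "DFF": (theta + 2.0) * n / d3, "FFF": 1.0 / d3,
    }


def log_likelihood(theta, n, counts):
    """Multinomial log likelihood without the constant factorial term.

    Crash types with zero observations contribute nothing to the sum.
    """
    probs = composition_probs(theta, n)
    return sum(c * math.log(probs[k]) for k, c in counts.items() if c > 0)


def fit(counts, rounds=12, grid=21):
    """Maximize the log likelihood by repeatedly refining a grid.

    The search runs over u = log(theta * N) and v = log(N), so both
    parameters stay positive without explicit constraints.
    """
    lo_u, hi_u, lo_v, hi_v = -8.0, 4.0, -10.0, 4.0
    best = None
    for _ in range(rounds):
        du = (hi_u - lo_u) / (grid - 1)
        dv = (hi_v - lo_v) / (grid - 1)
        for i in range(grid):
            for j in range(grid):
                u, v = lo_u + i * du, lo_v + j * dv
                ll = log_likelihood(math.exp(u - v), math.exp(v), counts)
                if best is None or ll > best[0]:
                    best = (ll, u, v)
        _, u, v = best
        # Shrink the search window around the best point found so far.
        lo_u, hi_u = u - 2 * du, u + 2 * du
        lo_v, hi_v = v - 2 * dv, v + 2 * dv
    ll, u, v = best
    return math.exp(u - v), math.exp(v), ll


def lp_quadratic_theta(a_dd, a_df, a_ff):
    """Larger root of the two-car quadratic, or None if no real root exists."""
    if a_dd == 0 or a_ff == 0:
        return None
    r = a_df**2 / (a_dd * a_ff)
    if r < 4.0:
        return None
    return ((r - 2.0) + math.sqrt(r * (r - 4.0))) / 2.0


# Illustrative input only: composition percentages treated as counts per
# 100 crashes of each size (actual sample sizes are not used here).
shares = {"D": 9.4, "F": 90.6, "DD": 0.7, "DF": 10.9, "FF": 88.4,
          "DDD": 0.4, "DDF": 1.1, "DFF": 12.5, "FFF": 86.1}
theta_hat, n_hat, _ = fit(shares)
print(f"theta = {theta_hat:.2f}, N = {n_hat:.3f}")
print("two-car quadratic root:", lp_quadratic_theta(0.7, 10.9, 88.4))
```

The design choice worth noting is that the direct search returns an estimate whenever both driver types appear in the data, whereas the quadratic root is undefined when mixed two-car crashes are too rare relative to the other two compositions.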

Comparison of estimation methods
We begin the empirical analysis by replicating L&P's analysis and comparing their results to results from their sample using our method. Table 1 presents our best approximation 8 of the results shown in L&P Table 2 alongside results from using our method to estimate the relative risk and frequency of drinking and driving between 1983 and 1993. Emulating L&P, moving from column (1) through column (8) we relax the "equal mixing" assumption (Assumption 2) to include more units of observation. Column 1 assumes that equal mixing must hold for all fatal accidents (between 8:00 p.m. and 5:00 a.m. every night for the years 1983-1993). Column 2 relaxes that assumption in that it only needs to hold for each hour between 8:00 p.m. and 5:00 a.m. (i.e., equal mixing only needs to hold for that hour). Columns 3 through 8 then reflect various combinations of time and location further relaxing the equal mixing assumption.

Table 1 notes: Degrees of freedom represent the number of units of observation that contain at least one crash (plus 2). Numbers in parentheses are standard errors. The sample period, 1983-1993, is chosen to match that of L&P.
The first row in Table 1 shows the unit of observation. The second and third rows show the relative riskiness of drinking drivers from L&P. The fourth row contains the combined fatal crash relativities for drinking drivers (combining one-, two-, and three-car fatal crashes). Our analysis uses a larger sample of units than L&P's because our method can accommodate more information. The fifth and sixth rows contain the degrees of freedom in each analysis, demonstrating that our method preserves more observations in the data.
There are two main differences in the estimates. Estimates from our method are much more consistent across columns, 9 and the downward bias issue, especially for the lower numbered columns, is much less severe. For column (8), where the geographic and temporal units are the smallest, our estimate for the relative riskiness of drinking drivers is lower than the original L&P estimates. Time-space units with "too few" fatal crashes caused by drinking drivers are dropped in L&P's analysis, which can inflate estimates of the relative riskiness of drinking drivers. The degrees of freedom rows in column (8) show that our method uses almost 2,000 (28%) more geographic and temporal units than the L&P approach. Dropping these additional units will increase estimates of the relative riskiness of drinking drivers. When these geographic and temporal units are aggregated into larger units in columns (1) through (7), the actual information aggregated into these geographic and temporal units in the L&P method is still less than that of our method. 10 Therefore, our method can address not only the downward bias issue for the lower numbered columns, but also the over-estimation issue for the higher numbered columns. 11
Table 2, composition of fatal crashes (percent)
Fatal one-car crashes with:
One distracted driver 9.4
One focused driver 90.6
Fatal two-car crashes with:
Two distracted drivers 0.7
One distracted driver and one focused driver 10.9
Two focused drivers 88.4
Fatal three-car crashes with:
Three distracted drivers 0.4
Two distracted drivers and one focused driver 1.1
One distracted driver and two focused drivers 12.5
Three focused drivers 86.1

Relative riskiness and exposure of distracted drivers and drinking drivers
Distracted driving is a much more recent phenomenon than drinking and driving, so our paper uses more recent data. We use Fatality Analysis Reporting System (FARS) data from 2000 to 2018 to estimate the relative riskiness and exposure of distracted drivers and drinking drivers. FARS records detailed information on all fatal automobile crashes that occur on public roads in the United States. Tables 2 and 3 summarize the numbers of fatal crashes and provide means for the composition of each type of fatal crash in the samples. The percentage of distracted drivers in all fatal crashes does not vary much between different hours, but the percentage of drinking drivers in all fatal crashes is much higher in the evening hours than in day-time hours. Therefore, to provide consistent and robust estimates, the analysis of distracted drivers uses fatal crashes from all 24 hours while the analysis of drinking drivers is limited to fatal crashes between 8:00 p.m. and 5:00 a.m. Table 2 shows that, although the total number of fatal crashes is quite large, the percentage of distracted drivers in the sample is small. Therefore, the numbers of two-car crashes between two distracted drivers and three-car crashes involving three distracted drivers are relatively small. Hence, as discussed earlier, L&P's method cannot provide a reliable estimate. 12 Table 3 shows that the sample size for drinking drivers is significantly smaller than that of distracted drivers because we only use fatal crashes between 8:00 p.m. and 5:00 a.m., but the presence of drinking drivers in fatal crashes is much more common.
Table 4 presents the maximum likelihood estimates of the relative fatal crash risks for distracted drivers (Table 5 for drinking drivers) in increasingly granular cuts of the data. As we relax the "equal mixing" assumption from column (1) to column (8), we observe results consistent with a decrease in the potential downward bias of the relative riskiness estimates of distracted drivers and drinking drivers. Again, as can be observed from the pattern moving from left to right in the table, the downward bias is very mild with our method, and the estimates across columns are very consistent.
10 This occurs even though the differences between the numbers of the more aggregated geographic and temporal units used in the two estimations become smaller (in columns (1) to (4), these numbers are the same).
11 We discuss the upward bias issue further with the results in Table 7.
12 As mentioned in L&P, "the lack of drinking-drinking crashes during the daytime period makes estimation difficult." They circumvent the problem by limiting their sample to nighttime hours. However, the numbers of fatal crashes involving multiple distracted drivers are low in all hours.
The fatal crash risk of distracted drivers rises monotonically from 2.49 to 3.14 times greater than that of focused drivers and the risk of drinking drivers is about eight times greater than that of sober drivers. Overall, the results imply that at every instant, about 3% to 4% of drivers on the road are distracted, while approximately 10% to 11% of drivers on the road have been drinking (between 8:00 p.m. and 5:00 a.m.). 13 The degrees of freedom suggest that when disaggregating the data from column (7) to column (8), significant information is lost because there were many time-space units with no observations of fatal crashes involving at least one distracted driver. This suggests that the level of disaggregation in column (8) is probably too fine given the low percentage of distracted drivers in the sample, which could be the reason why the implied fraction of distracted drivers does not decrease from column (7) to column (8). Importantly, we have no reason to believe a priori that distracted driving differs by day of the week. We include this category only for comparison to L&P.
Table 3, composition of fatal crashes (percent)
Drinking drivers in all fatal crashes 42.5
Fatal one-car crashes with:
One drinking driver 53.1
One sober driver 46.9
Fatal two-car crashes with:
Two drinking drivers 6.9
One drinking driver and one sober driver 46.1
Two sober drivers 47.0
Fatal three-car crashes with:
Three drinking drivers 0.9
Two drinking drivers and one sober driver 7.3
One drinking driver and two sober drivers 45.0
Three sober drivers 46.8
Table 6 presents separate estimates of the relative risk of distracted drivers for each year in our sample. There is significant variation in the coefficient estimates between 2000 and 2006. The relative riskiness of distracted drivers is quite high for the years 2007 to 2009, which coincide with the introductions of the first iPhones and the 3G wireless mobile telecommunications network. The relative riskiness then declines and becomes stable between 2010 and 2018.
Law enforcement officers undergo extensive training, and substantial technical and legal resources are allocated towards determining if a driver was distracted in the event of a fatal crash; therefore, we are confident the FARS data are reasonably accurate. However, to explore the potential effects of under-reporting, we examine the sensitivity of our estimates to four hypothetical scenarios of driver misclassification, with the frequency of distracted drivers being 25%, 50%, 75%, and 100% more than is reported in the FARS data. We estimate the potential effects of misclassification by randomly selecting the corresponding number of focused drivers and changing their classification to distracted before re-estimating the model. For example, Table 2 shows that 7.4% of the 805,256 drivers in all fatal crashes are distracted in our sample of the FARS data. We estimate the 25% misclassification by randomly selecting 0.25 × 0.074 × 805,256 = 14,897 focused drivers and assuming they are distracted drivers. Results of the sensitivity analysis appear in Table 7. If the actual number of distracted drivers is under-reported, the relative riskiness of such drivers is overstated, and the percentage of these drivers on the road is understated. 14 This is the opposite of L&P's finding that if 5% of drinking drivers are misclassified as sober, the relative riskiness increases, and the implied fraction of these drivers decreases. From the degrees of freedom used in each analysis, we can see that many more time-space units, and hence much more information, are being used in the estimation as the reported percentage of distracted drivers increases. Therefore, we can again verify that due to the way information and observations are used (or not used) in L&P's method, they over-estimate the relative riskiness in the higher numbered columns when the time-space units are smaller, and our method mitigates this upward bias.
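The reclassification step can be illustrated with a short Python sketch. It reproduces the driver count for the 25% scenario and relabels that many randomly chosen focused drivers as distracted. Only the 7.4% share and the 805,256 driver count come from Table 2; the function names, the random seed, and the small synthetic label list are assumptions made for illustration, and the re-estimation of the model itself is not shown.

```python
import random


def misclassification_count(share_distracted, n_drivers, extra_fraction):
    """Number of focused drivers to relabel so that distracted drivers are
    `extra_fraction` more frequent than reported (e.g. 0.25 for 25%)."""
    return round(extra_fraction * share_distracted * n_drivers)


def reclassify_focused(labels, n_flip, seed=0):
    """Return a copy of `labels` (True = distracted) in which `n_flip`
    randomly selected focused drivers are relabeled as distracted.
    Hypothetical helper; the seed is fixed only for reproducibility."""
    rng = random.Random(seed)
    focused_idx = [i for i, distracted in enumerate(labels) if not distracted]
    flipped = list(labels)
    for i in rng.sample(focused_idx, n_flip):
        flipped[i] = True
    return flipped


# Count for the 25% scenario, using the figures quoted from Table 2.
n_flip_25 = misclassification_count(0.074, 805_256, 0.25)

# Synthetic illustration: 1,000 drivers, 74 of them reported as distracted.
synthetic = [True] * 74 + [False] * 926
synthetic_25 = reclassify_focused(synthetic, 18)
```

The same two calls, with `extra_fraction` set to 0.50, 0.75, and 1.00, would generate the other three scenarios before the model is re-estimated on each relabeled sample.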
Table 8 presents results for other driver traits and compares them to results for drinking drivers and distracted drivers. In general, younger drivers are about 1.6 times more likely than older drivers, and male drivers are about 1.2 times more likely than female drivers, to cause fatal crashes. Table 9 reports estimates that allow for interactions between distracted driving and other risk factors such as gender, age, and driving record. Doing so allows for differential fatal crash risk estimates for young and old drivers, male and female drivers, and drivers with and without previous citations for poor driving who have or have not been distracted before fatal crashes. Focused drivers over the age of 25, female focused drivers, and focused drivers with clean driving records are used as respective baselines. Each of the three factors shows significant differences in fatal crash risk among distracted drivers.
Although females are less likely than males to drive distracted, females are associated with a higher risk of crashing when they do drive distracted. About 3.9% of female drivers are distracted at any given time, compared to 5.3% of male drivers.

The relative risk of female distracted drivers is 6.05 times that of a female focused driver and the relative risk of a distracted male driver is 2.98 times that of a female focused driver. Younger drivers (≤ 25 years) are both more likely to drive while distracted (4.7% versus 3.7%), and they are much riskier distracted drivers (6.96 versus 3.42) compared to older drivers (> 25 years). Drivers with previous citations are less likely to drive while distracted (2.1% versus 2.8%), but compared to drivers with clean records, they are much riskier distracted drivers (8.17 versus 3.33). The public policy implications of these differences are not obvious. One potential explanation is that these groups are distracted by different behaviors. Alternatively, they could drive in different situations and conditions. Given our research question, however, it is noteworthy that distracted drivers in each category are always riskier than focused drivers.

Relative riskiness and exposure of distracted drivers by source of distraction
Distracted driving is often thought to be a singular function of cellphone use. However, our data show that distracted driving fatalities are less likely to be associated with cellphone use than with other distractions including eating, passengers, insects and reptiles, and other electronics. This finding motivates our analysis of relative risk by source of distraction. We compare distracted driving by source of distraction in three steps. First, we estimate the risk of drivers distracted by cellphones relative to focused drivers. Second, we estimate the risk of drivers distracted by other sources relative to focused drivers. Finally, we compare the estimates of relative risk by source of distraction. Table 10 presents summary statistics for the samples used to compare the relative risk of cellphone distraction to that of distraction from other sources. The sample for this analysis is smaller than preceding analyses because FARS only began collecting the source of distraction in 2010. The cellphone distraction sample excludes crashes involving distraction from other sources. Likewise, the other distraction sample excludes crashes involving cellphone distraction. We estimate the riskiness of and exposure to drivers distracted by cellphones relative to drivers distracted by other factors using the same technique described above. Table 11 presents the results of estimating relative risk of cellphone distraction versus focus, the relative risk of other distraction versus focus, and the ratio of cellphone distraction risk to other distraction risk.
Results show that drivers who are distracted by cellphones are less risky than drivers who are distracted by other factors. This could be because drivers tend to manipulate cellphones more often in less demanding situations, and vice versa, which is consistent with the findings of Kidd et al. (2016), and more generally with Peltzman (1975). It does not, however, explain why the same behavior does not hold for other sources of distraction. Perhaps the focus on handheld devices in public policy and public education has (falsely) led drivers to believe that cellphones are the only dangerous source of distraction in vehicles. Although drivers who are distracted by cellphones are relatively less risky than drivers who are distracted by other sources in terms of causing a fatal crash, it does not mean that they are less risky than focused drivers. The fatal crash risk of a driver distracted by a cellphone is still 1.76 to 3.17 times greater than that of a focused driver.
Table 9 Estimates of Relative Riskiness by other Driver Characteristics (2000-2018)
Notes: Values in column 1 are maximum likelihood estimates of the fatal crash risk of the named group relative to the named baseline group (i.e., the group with a relative risk defined to be equal to one and written in bold type). All specifications assume equal mixing by state×year×day×hour and therefore are comparable to column 8 of Table 4. Estimates are based on the same sample used in Table 4.

Externalities and public policy
In this section, we consider the public policy implications of the estimates obtained in the prior section. In the spirit of L&P, we begin assessing the negative externalities of distracted driving by calculating the number of traffic fatalities that would have been avoided if a driver had not been distracted (i.e., avoidable deaths). We also estimate a conceptual "insurance surcharge" that represents the cost that distracted drivers impose on focused drivers. The surcharge results may help drivers internalize the costs they pay. First, we estimate the externalities from loss of life. Following L&P, we assume that if a distracted driver dies in a crash that she causes, then she bears the cost of her actions and the cost is not classified as an externality. In effect, this assumption implies that the distracted driver considered all associated risk when making the choice to drive distracted. In addition, it does not internalize other noneconomic costs, such as the suffering of friends and family. Similarly, we also assume that passengers who died in a distracted driver's vehicle were willing participants who also assumed the risk associated with riding with a distracted driver, meaning that they are also not considered an externality. 15 Finally, because this analysis is limited strictly to deaths in fatal crashes, it does not include other costs associated with nonfatal injuries, property damage, and other behavioral changes (e.g., focused drivers' fear of being struck by a distracted driver).
15 In Table 12, we also present alternate estimates that relax this assumption.
Table 12 displays the estimates of the negative externalities of distracted driving for each year of our sample. 16 For context, we explicitly describe the calculation of the year 2000 estimates, but the same logic applies to the estimates of all years. First, in 2000, 51 drivers died in two-vehicle crashes when both drivers were distracted. We assume that fault is evenly divided between the vehicle that caused the crash and the other vehicle and therefore categorize half of these deaths (25.5) as externalities. 17 Based on our model, with θ denoting the relative fatal crash risk of distracted drivers, (θ − 1)/θ, or 68.15%, of these deaths would have been avoided had the driver at fault not been distracted, which translates to approximately 17.4 avoidable deaths from this category of drivers. An additional 568 fatalities resulted among occupants of vehicles driven by focused drivers that were involved in a two-car crash where the other vehicle was driven by a distracted driver. Based on our model, (θ − 1)/(θ + 1), or 51.69%, of these deaths would have been avoided had the driver of the other vehicle not been distracted, which translates to approximately 293.6 additional avoidable deaths. Considering the one fatality among occupants of vehicles driven by focused drivers involved in three-car crashes with two distracted drivers and one focused driver, we estimate (2θ − 2)/(2θ + 1), or 58.79%, of that death to be avoidable. Similarly, considering the 110 fatalities among occupants of vehicles driven by focused drivers involved in three-car crashes with one distracted driver and two focused drivers, we estimate (θ − 1)/(θ + 2), or 41.63%, of those deaths to be avoidable. Fatalities of pedestrians and other people that were not occupants of the vehicles are also multiplied by the respective ratios according to the types of crash to calculate the number of deaths that are avoidable.
Summing the avoidable deaths across these categories suggests that 403 deaths classified as externalities would have been avoided in 2000 if drivers had not been distracted. This figure is reported for each year in the third column of Table 12. In total, we estimate that 5,537 deaths could have been avoided during the years 2000 through 2018 if drivers had not been distracted. Note also that the figures reported in the third column are conservative estimates, because they do not classify passenger fatalities as externalities.
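The avoidable shares used in the year 2000 calculation can be checked with a few lines of Python. In this sketch, θ stands for the relative fatal crash risk of distracted drivers, and the value θ = 3.14 is an assumption chosen because it reproduces the percentages quoted in the text; the dictionary keys and function name are hypothetical labels. The category total printed by this sketch covers vehicle occupants only, so it is smaller than the 403 avoidable deaths reported for 2000, which also include pedestrians and other non-occupants.

```python
def avoidable_fractions(theta):
    """Share of externality deaths that would have been avoided had the
    distracted driver(s) been focused, by crash category."""
    return {
        "two_car_both_distracted": (theta - 1) / theta,
        "two_car_one_distracted": (theta - 1) / (theta + 1),
        "three_car_two_distracted": (2 * theta - 2) / (2 * theta + 1),
        "three_car_one_distracted": (theta - 1) / (theta + 2),
    }


THETA = 3.14  # assumed value; it matches the percentages quoted in the text

# Year 2000 externality deaths by category, as described in the text
# (25.5 is half of the 51 drivers killed in distracted-distracted crashes).
deaths_2000 = {
    "two_car_both_distracted": 25.5,
    "two_car_one_distracted": 568,
    "three_car_two_distracted": 1,
    "three_car_one_distracted": 110,
}

fractions = avoidable_fractions(THETA)
avoidable_2000 = {k: deaths_2000[k] * fractions[k] for k in deaths_2000}
occupant_total = sum(avoidable_2000.values())

for category, share in fractions.items():
    print(f"{category}: {100 * share:.2f}% avoidable, "
          f"{avoidable_2000[category]:.1f} deaths")
```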
We present the externality calculation above for direct comparison to L&P. However, the assumption that passengers give informed consent to the driver's behavior is more realistic for drinking drivers than for distracted drivers. If we make the more realistic assumption that passengers do not assume the risk of the driver distraction, and classify these passengers as externalities, our estimate of avoidable deaths during the years 2000 through 2018 increases to 11,238 (fourth column of Table 12). 18
16 Note that, for ease of reference, the second column of the table includes the year-specific values of θ derived from our earlier estimation procedure which we use to explicitly estimate externalities.
17 In years when multi-vehicle crashes occurred involving three vehicles, all with distracted drivers, the same assumption is made. For example, in three-car crashes in which all three drivers were distracted, we take the sum of all fatalities and scale by 2/3.
18 Note that passengers of distracted drivers are in some ways less able to consent to the dangerous behavior than are passengers of intoxicated drivers, because intoxicated drivers are generally intoxicated at the beginning of a trip, but distracted drivers can be distracted intermittently. On the other hand, passengers are a common source of distraction for drivers.
Using methods developed by Viscusi (1992) and Viscusi and Masterman (2017), the U.S. Department of Transportation reports annual estimates of the value of a statistical life (VSL). 19 When we multiply the avoidable deaths in each year by the associated yearly VSL estimate, we obtain the total externality cost for a given year. Summing up all of the yearly externality costs implies that the externality associated with lives lost due to distracted driving totaled approximately $46 billion between 2000 and 2018. Note that this represents a conservative estimate, because it excludes passengers from the calculation. If we categorize passengers in vehicles driven by distracted drivers as an externality, then the total externality associated with distracted driving rises to approximately $93 billion between 2000 and 2018. Further, both figures ($46 billion and $93 billion) represent externality cost estimates of fatalities only, not including the costs of non-fatal injuries, property losses, and other noneconomic losses in our analysis.
Following L&P, we contextualize the cost of the externality by evaluating the cost of distracted driving per mile. We start with the implied fraction of distracted drivers in each year shown in Table 6. We multiply the fraction of distracted drivers by the total vehicle miles traveled collected from the Federal Highway Administration (FHA) to obtain the implied miles driven by distracted drivers, which allows us to allocate the externality cost in a given year across miles driven by distracted drivers. For example, in 2000, we estimate that approximately 2.75% of drivers were distracted, the total cost of externalities was approximately $2.8 billion, and the FHA estimates approximately 2.7 trillion vehicle miles traveled. Collectively, these figures indicate a negative externality of approximately 4 cents per mile driven by distracted drivers. Using the same process for all years in our sample period suggests an average negative externality of approximately 2 cents per mile driven by distracted drivers between 2000 and 2018. If we repeat the analysis but include passengers in vehicles driven by distracted drivers, the estimated negative externality increases to approximately 5 cents per mile driven by distracted drivers during the years 2000 through 2018.
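The per-mile figure for 2000 follows from simple arithmetic, sketched below in Python. The three inputs are the rounded values quoted in the text; the function name is hypothetical, and the sketch assumes that the share of miles driven while distracted equals the implied fraction of distracted drivers.

```python
def externality_per_distracted_mile(total_cost, share_distracted, total_vmt):
    """Externality cost (dollars) per mile driven by distracted drivers.

    total_cost       -- total externality cost for the year, in dollars
    share_distracted -- implied fraction of drivers who are distracted
    total_vmt        -- total vehicle miles traveled in the year
    """
    distracted_miles = share_distracted * total_vmt
    return total_cost / distracted_miles


# Year 2000 inputs as quoted in the text (rounded values).
cost_2000 = externality_per_distracted_mile(
    total_cost=2.8e9,        # about $2.8 billion
    share_distracted=0.0275, # about 2.75% of drivers
    total_vmt=2.7e12,        # about 2.7 trillion miles
)
print(f"{100 * cost_2000:.1f} cents per distracted mile")
```

With these rounded inputs the result is a little under 4 cents per mile, in line with the figure reported for 2000.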
Next, we estimate the implied personal automobile insurance policy surcharge necessary to recover the insurance losses attributable to distracted driving. We use data from the National Association of Insurance Commissioners to calculate the losses paid by all U.S. personal automobile insurance companies for each year from 2000 through 2018. Our calculation requires the following three assumptions:
1. The percentages of one-, two-, and three-car fatal crashes to all fatal crashes are the same as the percentages of one-, two-, and three-car crashes to all crashes;
2. The relative riskiness of distracted drivers causing fatal crashes is the same as that of those causing all crashes;
3. On average, the loss per car is the same regardless of how many distracted drivers are involved.
We believe these assumptions are valid because the behaviors associated with loss frequency (e.g., distracted driving) do not predict loss severity (FTC, 2007).
With the above assumptions, we can estimate the avoidable insurance losses by following the same process we used for avoidable fatalities. 20 In this case, we are not measuring externalities, so we consider all of the avoidable insured losses. Annual insurance losses ranged from $102 billion to $183 billion in our sample. As given in Column 4 of Table 12, avoidable automobile insurance losses attributable to distracted driving ranged between $2.7 billion and $3.9 billion per year during our sample period. Overall, we estimate that approximately $61.3 billion in personal automobile insurance losses could have been avoided if drivers were not distracted.
One conceptual method of internalizing these costs to distracted drivers is to have an insurance surcharge for drivers cited for distracted driving. 21 Using the estimates of avoidable insurance losses to estimate an implied insurance policy surcharge requires the number of potential citations given to distracted drivers in each year of our sample. Citation data pertaining to distracted driving are difficult to obtain. We use the citation rate estimate provided in Rudisill and Zhu (2016), who find that approximately 2,607 citations per 100,000 licensed drivers are issued for distracted driving violations. 22 Using this citation rate, alongside the number of licensed drivers (from FHA), our estimates of the proportion of distracted drivers, and our estimates of the avoidable insurance losses, we can calculate an implied insurance surcharge by distributing avoidable insurance losses over the expected number of distracted driving citations in a given year.
The implied personal automobile insurance surcharges for each year of our sample are given in Column 5 of Table 12. The implied surcharge ranges from approximately $486 per driver in 2018 to $692 per driver in 2001. The average implied surcharge for the sample period is approximately $577. For context, this figure implies that drivers cited for distracted driving between 2000 and 2018 should have been charged an additional $577 by their automobile insurance companies to internalize these costs for distracted drivers.
Given the relative dearth of published citation rates, and recognizing that the citation rate is a function of violation frequency and enforcement efforts, it is useful to consider a range of citation rates. For example, if the citation rate were half of the Rudisill and Zhu (2016) estimate, the implied average annual surcharge would be $1,154, and if the citation rate were double that reported by Rudisill and Zhu (2016), the implied average annual surcharge would be $289.
20 Instead of fatalities, we can estimate the crashes in each category that could have been avoided if drivers were focused. We then apply the percentage of avoidable crashes to the total insurance losses to estimate avoidable insurance losses.
21 We are not promoting this concept as a public policy change to reallocate the insurance mechanism. We are using it as a simple example of how the costs of distracted driving are broader than fatalities and how some of these costs are currently borne by all drivers in the insurance market. While some of these costs may be internalized to drivers following at-fault crashes or through insurance repricing following citations, the concept of an annual surcharge may simplify the understanding of the cost of distracted driving.
22 The citation data from Rudisill and Zhu (2016) pertain only to the years 2010 and 2013, and only to citations for handheld device distractions. We acknowledge the shortcomings associated with applying the average citation rate reported by Rudisill and Zhu (2016) to all states and all years in our study, but we are not aware of other estimates of the distracted driving citation rate. The citation rates across demographic groups found in Rudisill and Zhu (2016) are positively correlated with the rates of distracted driving we estimate in Table 9, which marginally bolsters our confidence in using their citation rate.
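The inverse relationship between the citation rate and the implied surcharge can be sketched as follows. This is a simplified illustration rather than the exact calculation behind Table 12: it spreads avoidable insurance losses over the expected number of citations and does not model how the estimated proportion of distracted drivers enters. The loss and licensed-driver figures are hypothetical placeholders, not values taken from the paper; only the rate of 2,607 citations per 100,000 licensed drivers comes from Rudisill and Zhu (2016) as cited in the text.

```python
def implied_surcharge(avoidable_losses, licensed_drivers, citations_per_100k):
    """Avoidable insurance losses (dollars) spread over the expected
    number of distracted driving citations in a year."""
    expected_citations = licensed_drivers * citations_per_100k / 100_000
    return avoidable_losses / expected_citations


CITATION_RATE = 2607            # per 100,000 licensed drivers (from the text)
HYPOTHETICAL_LOSSES = 3.2e9     # placeholder: avoidable losses in one year
HYPOTHETICAL_DRIVERS = 210e6    # placeholder: licensed drivers in that year

base = implied_surcharge(HYPOTHETICAL_LOSSES, HYPOTHETICAL_DRIVERS,
                         CITATION_RATE)
half_rate = implied_surcharge(HYPOTHETICAL_LOSSES, HYPOTHETICAL_DRIVERS,
                              CITATION_RATE / 2)
double_rate = implied_surcharge(HYPOTHETICAL_LOSSES, HYPOTHETICAL_DRIVERS,
                                CITATION_RATE * 2)

# Halving the citation rate doubles the surcharge; doubling it halves it.
print(half_rate / base, double_rate / base)
```

Because the surcharge is inversely proportional to the citation rate, the same scaling applied to the $577 average gives roughly $1,154 at half the rate and roughly $289 at double the rate.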

Conclusions
The public policy debate around distracted driving legislation suffers from a lack of credible and reliable information. It is difficult to measure distracted driving risk because reliable data pertaining to distracted driving are scarce, and most prior studies use relatively unsophisticated methods when analyzing the available data. As a result, the evidence in the literature suggests distracted driving could pose no additional risk or it could be up to 23 times riskier than focused driving. Such variance in results is untenable for policy makers attempting to make informed policy decisions relating to distracted driving.
We develop and apply a methodology that measures the risks posed by distracted driving. Our method builds on L&P's model of drinking and driving, which relies on fatal crash data from FARS. FARS data are appropriate for this application because every fatal automobile crash is investigated and recorded using a consistent framework that is not used for non-fatal crashes. Our methodology allows us to estimate the frequency and relative riskiness of distracted driving more accurately than previous studies. By relying on data from a standardized source, and employing a more sophisticated methodology, our paper alleviates some of the concerns in prior studies with respect to the collection and consistency of distracted driving data.
Our estimates indicate that distracted drivers are approximately three times more likely to cause a fatal crash than focused drivers. We also find that distracted drivers represent 3% to 4% of drivers on the road at any given time.
We also find that, although distractions associated with cellphone use increase the risk of a fatal crash, other sources of distractions are even more risky than cellphone distractions. Therefore, public policy interventions enacted to mitigate distracted driving should consider cellphone use and other sources of distraction.
Finally, we quantify some of the economic costs associated with distracted driving. We estimate that from 2000 to 2018 as many as 11,200 deaths could have been avoided if drivers had not been distracted. Moreover, approximately $61 billion of insured losses could have been avoided if drivers consistently focused on driving. For perspective, the fatal externalities imply a cost of $0.02 to $0.05 per mile and the added insurance cost could be internalized with $577 annual insurance surcharges applied to distracted driving citations.
Policymakers should consider this information as they deliberate the appropriate public policies related to distracted driving.
Assuming hypothetical values of u_f and u_d, and substituting these equations back into the likelihood function, allows us to demonstrate the potential difference in the relative riskiness. If we assume u_f = 0.2 and u_d = 0.15, our best estimate of relative riskiness increases from 3.14 (θ in Table 4 column 8) to 3.52. Hence, allowing for focused drivers to be more skilled in averting potential fatal crashes yields a larger estimated value of θ.