Explaining the Association between Driver’s Age and the Risk of Causing a Road Crash through Mediation Analysis

It has been widely reported that younger and older drivers have an excess risk of causing a road crash. Two causal hypotheses may coexist: the riskier driving behaviors and age-related mechanisms in extreme age groups (direct path) and the different environmental and vehicle circumstances (indirect path). Our aim was to quantify, through a mediation analysis, the percentage contribution of both paths. A case-control study was designed from the Spanish Register of Road Crashes with victims from 2014 to 2017. Assuming a quasi-induced exposure approach, controls were non-responsible drivers involved in clean collisions between two or more vehicles (n = 52,131). Responsible drivers for these collisions plus drivers involved in single crashes constituted the case group (n = 82,071). A logit model in which the outcome was the log (odds) of causing a road crash and the exposure was age groups was adjusted for driver, vehicle and environmental factors. The highest crash risk was observed in extreme age groups, compared to the 35–44 year old age group: the youngest (18–24 years old, odds ratio = 2.14, 95% confidence interval: 2.06–2.24) and the oldest drivers (>74 years old, odds ratio = 3.30, 95% confidence interval: 3.04–3.58). The mediation analysis identified the direct path as the main explanatory mechanism for these increases: 89% in the youngest and 93% in the oldest drivers. These data support the hypothesis that the excess crash risk observed for younger and older drivers is mainly related to their higher frequency of risky driving behaviors and age-related loss of capabilities. Preventive strategies in extreme-aged drivers should focus on decreasing these behaviors.


Literature Review
Several studies have reported increased road crash rates for both younger and older drivers (under 25 and over 65 years old, respectively), compared to middle-aged drivers. The reasons given for both increases have been widely explored [1,2]. For younger drivers, these are mainly related to inexperience and risk-taking behaviors (driving under the influence of alcohol and other drugs, speeding, etc.) [1,3–9]. This excess of riskier behaviors and higher traffic incident rates has also been suggested for younger cyclists [10]. As for older drivers, their excess risk has usually been related to their reduced ability to cope with the inherent complexities of driving, a situation causally associated with three age-related factors: (a) the physiological loss of capabilities and age-related fragility [2], (b) the pathological loss of capabilities derived from age-related illnesses such as dementia and other mental pathologies, visual and hearing defects, etc. [11,12], and (c) the frequency of driving under the influence of drugs that affect the driver's abilities [13], which has been reported to be higher in older drivers [14].
However, some researchers have proposed several alternative hypotheses to partially or even completely explain these age-related risk increases. They all share a common background: a significant part of the risk posed by each category of driver is related to the amount and type of exposure to the risk [15][16][17], which is linked to an intrinsically high crash risk, regardless of the driver's characteristics. Therefore, to compare the crash risks of, for example, different age categories of drivers, it is first necessary to adjust this risk for the amount and type of exposure of each age category. A common example of the failure to meet this requirement is the low-mileage bias [17,18]: although crash rates for different driver subgroups were estimated for a fixed amount of exposure (measured as time spent on the road or, more frequently, as distance traveled) [16], this is not a fair comparison, as distances traveled on highways or motorways, and long journeys in general, are associated with a lower crash risk than distances traveled in urban areas and short journeys.
It is well-known that older drivers, unlike younger ones, accumulate their travel distances in short (low-mileage) trips mostly in urban areas, where the risk of being involved in an accident is intrinsically higher [19]. Another example is the type of vehicle driven: if, for example, extreme age groups of drivers use older vehicles more frequently (intrinsically associated with a higher crash risk) compared to middle-aged drivers, a biased comparison between age-related crash risks would result.

Assumptions
A general formulation of all previous causal associations regarding age-related increases in crash risk implies the a priori assumption of two causal paths linking age with a high risk of causing a road crash (Figure 1):

• A direct causal path (DCP). In this path, the driver's age is associated with the risk of a crash regardless of the amount and type of exposure (the road, the time of the day, the type of vehicle driven, etc.). The reasons for this DCP would be those described in the first paragraph of this introduction for both younger and older drivers. Ultimately, all of these circumstances lead to a loss of optimal driving capabilities or to riskier driving behavior;
• An indirect causal path (ICP). In this path, the driver's age is associated with an increased crash risk because it is causally associated with a riskier driving environment or a riskier vehicle: for example, younger drivers tend to drive more frequently at night, while aged drivers tend to drive more frequently on urban roads.
In order to establish intervention priorities for the (theoretically) high-risk groups of younger and older drivers, it is relevant to know which part of this high crash risk is related to each of the two causal paths described above. Depending on the results, preventive strategies could focus either on changing the driving behaviors of extreme-aged drivers (increasing information, awareness, health advice, etc.) or identifying their loss of capabilities, if the direct path prevails, or on changing the driving environment and vehicle conditions of these drivers, if the indirect path predominates.

Hypotheses and Objectives
The research question of this work, therefore, is what percentage of the excess risk in extreme-aged groups corresponds to each of the two causal paths. Our hypothesis was that factors unrelated to the vehicle and the environment (that is, the DCP) are the main explanatory cause of excess risk in extreme-aged drivers. To our knowledge, no previous studies have pursued this aim, although investigating it would be relatively easy by applying an analytic approach known as mediation analysis. The novelty of this study lies in the use of this method in a large sample of drivers in Spain. Therefore, the objectives of the present study are:

• To confirm the excess risk of younger and older drivers of causing a crash compared to middle-aged drivers;
• If this excess risk is confirmed, to quantify which part of this higher risk is related to a DCP and which part depends on an ICP, by applying a mediation analysis based on a decomposition method.

Data Used in the Study
We designed a retrospective case-control study using data from the Spanish Register of Road Crashes from the Spanish Traffic Directorate for the years 2014 to 2017. It is a nationwide police-based register of all road crashes with victims. The characteristics of this register have been described elsewhere [20,21]. Three of the variables included are the type of crash, the type of vehicle and the commission of infractions or driving errors immediately prior to the crash by any driver involved. Taking into account this information, we defined our original study sample as that comprised by the 134,202 drivers of four-wheeled vehicles (cars, vans and all-terrain vehicles) involved in road crashes ascribed to any of the following three subgroups:
• Subgroup 1. Drivers involved in single crashes in which only one moving vehicle was involved (n = 31,290 drivers);
• Subgroup 2. Offender drivers (drivers who were at fault for the crash) involved in clean collisions, i.e., collisions between two or more moving vehicles (including frontal, front-lateral, lateral, rear or multiple collisions) in which only one of the drivers involved committed a traffic infraction or error immediately prior to the crash (n = 50,781 drivers involved in as many clean collisions);
• Subgroup 3. Non-offender drivers (drivers who were not at fault for the crash) involved in the 50,781 clean collisions described above (n = 52,131 drivers).
As several drivers presented missing values in some of the variables evaluated, the sample analyzed in this study finally consisted of 118,364 drivers with complete records for all variables.
We assumed that most drivers in subgroups 1 and 2 were responsible for the crash in which they were involved; therefore, they comprised the case group. As the definitions of the three subgroups show, only single crashes or clean collisions were considered in the study. Therefore, incidents in which two or more drivers committed an infraction were not included in the case-control study. On the other hand, most drivers included in subgroup 3 were innocent and could be considered a representative sample of moving drivers on the road; therefore, they constituted the control group. This quasi-induced exposure approach, recently validated as an appropriate way to select the reference group in traffic databases [22], has been widely used in previous studies aimed at comparing the risk of road crashes across subgroups of drivers [23][24][25]. An advantage of the quasi-induced exposure method is that it indirectly allows the strength of the association between age and the risk of causing a traffic accident to be adjusted according to the amount of exposure to driving, without the need to resort to direct estimation (e.g., using time measures or distance traveled by each driver).
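To illustrate the logic of the quasi-induced exposure design, the following minimal sketch computes a crude (unadjusted) odds ratio of being the responsible driver for one age group against the reference group, with a Woolf-type 95% confidence interval. The counts and the function name are hypothetical and chosen only for illustration; they are not taken from the register, and the study itself relies on adjusted logit models rather than crude ratios.

```python
from math import exp, log, sqrt

def crude_odds_ratio(cases_grp, controls_grp, cases_ref, controls_ref):
    """Crude OR of being a responsible driver (case) vs. a non-responsible
    driver (control) for an age group relative to the reference group,
    with a Woolf (log-scale) 95% confidence interval."""
    or_ = (cases_grp / controls_grp) / (cases_ref / controls_ref)
    se_log = sqrt(1 / cases_grp + 1 / controls_grp + 1 / cases_ref + 1 / controls_ref)
    lower = exp(log(or_) - 1.96 * se_log)
    upper = exp(log(or_) + 1.96 * se_log)
    return or_, lower, upper

# Hypothetical counts (cases, controls) per age group; NOT the study data.
counts = {"18-24": (900, 400), "35-44": (1500, 1500), ">74": (600, 200)}
ref_cases, ref_controls = counts["35-44"]

for group, (cases, controls) in counts.items():
    if group == "35-44":
        continue  # reference category, OR = 1 by definition
    or_, lower, upper = crude_odds_ratio(cases, controls, ref_cases, ref_controls)
    print(f"{group}: OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

Because the controls stand in for drivers circulating on the road, the ratio of cases to controls within an age group plays the role of a crash rate per unit of exposure, which is why no mileage or time data are needed.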

Main Variables Considered
For each driver/vehicle/environment/crash we considered the following variables, obtained from the information provided in the register:
• […] type of road (highway or motorway, conventional road, street, other), intersection (no, yes), road use (peri-urban area, ring road, residential, with special restrictions, other), traffic density (low, medium, high, very high), speed regulation (generic, specific), road surface (normal, altered), light conditions (daylight, twilight without artificial lighting, twilight with artificial lighting, darkness with artificial lighting, darkness without artificial lighting), meteorological conditions (normal, adverse);
• Crash severity (only minor injuries, major injuries, deaths). Major injuries were considered when the victim required > 24 h of hospitalization.

Analytic Strategy
The mediation analysis applied in the present study is based on the method proposed by Buis [26], a generalization of the original decomposition method developed by Erikson [27]. This method decomposes the total association between a categorical, discrete or continuous exposure and an outcome into a direct effect and an indirect effect. As our outcome (y: whether or not a driver causes a road crash) is binary, we used logistic regression to model it. According to Buis' notation [26], let x be the age of the driver (for example, x = 1 are drivers aged 18-24, and x = 0 is the reference age group, i.e., 35-44), and let z denote the set of environment- and vehicle-related mediators. According to our hypothesis (see Figure 1), the decomposition of the total effect of x upon y into a direct effect (x → y) and an indirect effect (x → z → y) can be estimated through the following equation:

$$\underbrace{\frac{O_{x=1,\,z|x=1}}{O_{x=0,\,z|x=0}}}_{\text{total}} = \underbrace{\frac{O_{x=0,\,z|x=1}}{O_{x=0,\,z|x=0}}}_{\text{indirect}} \times \underbrace{\frac{O_{x=1,\,z|x=1}}{O_{x=0,\,z|x=1}}}_{\text{direct}} \qquad (1)$$

In Equation (1), O is the odds of y = 1 (causing a road crash). The first subscript indicates the age group whose logistic regression coefficients are used and the second subscript indicates the age group whose distribution of z is used. The left part of the equation (named 'total') represents the OR that quantifies the global effect of x = 1 on y: the O of causing a road crash in drivers of the age group x = 1, divided by the O of causing a road crash in drivers of the age reference group (x = 0), given the observed values of z in each age group. The first term of the product (named 'indirect' in the equation) quantifies the indirect effect of x = 1 on y: the coefficients of the model are fixed so that they take the values of the age reference category (x = 0), while z takes the observed values in each age category. Consequently, the numerator of this term is the O of x = 0 in the counterfactual situation in which z takes the values observed for x = 1. Therefore, this term expresses the OR for x = 1 that exclusively depends on the association between x and z.
Finally, the second term of the product (named 'direct' in the equation) refers to the OR of x = 1 on y which depends exclusively on its direct effect: the denominator is the O of x = 0 in the counterfactual situation in which z takes the values observed for x = 1. Therefore, the value of this OR exclusively depends on the direct association between x and y.
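The counterfactual odds described above can be made concrete with a small sketch. It assumes that each odds is obtained by averaging the predicted probabilities of a logit model over the mediator profiles of a given age group, which is our reading of the decomposition approach and not a reproduction of the Stata implementation. The coefficients, the two binary mediators (night driving, urban road) and the mediator profiles are all invented for illustration.

```python
from math import exp

def mean_odds(intercept, slopes, z_rows):
    """Odds implied by averaging predicted probabilities of a logit model
    over a set of mediator profiles (one tuple of z values per driver)."""
    probs = []
    for z in z_rows:
        eta = intercept + sum(b * v for b, v in zip(slopes, z))
        probs.append(1.0 / (1.0 + exp(-eta)))
    p = sum(probs) / len(probs)
    return p / (1.0 - p)

# Hypothetical logit coefficients per age group: (intercept, [night, urban]).
coef = {0: (-0.2, [0.30, 0.40]),   # x = 0, reference age group
        1: (0.5, [0.35, 0.30])}    # x = 1, e.g. youngest drivers

# Hypothetical mediator profiles (night, urban) observed in each age group.
z_obs = {0: [(0, 0), (0, 1), (1, 0), (0, 0)],
         1: [(1, 0), (1, 1), (0, 1), (1, 0)]}

def odds(coef_group, z_group):
    """Odds using the coefficients of one group and the z distribution of another."""
    intercept, slopes = coef[coef_group]
    return mean_odds(intercept, slopes, z_obs[z_group])

total = odds(1, 1) / odds(0, 0)      # observed odds in both groups
indirect = odds(0, 1) / odds(0, 0)   # only the distribution of z changes
direct = odds(1, 1) / odds(0, 1)     # only the coefficients change

print(f"total OR    = {total:.3f}")
print(f"indirect OR = {indirect:.3f}")
print(f"direct OR   = {direct:.3f}")
print(f"indirect x direct = {indirect * direct:.3f}")
```

The counterfactual odds `odds(0, 1)` appears once as a numerator and once as a denominator, so it cancels and the product of the indirect and direct ORs always reproduces the total OR.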
The model was implemented in Stata (version 15.0) (StataCorp ® 2019, College Station, TX, USA), with the ldecomp command. According to the theoretical framework explained above, the equations obtained from this command produced three OR estimates for each age group: an OR for the total effect of age; an OR for the DCP and an OR for the ICP (mediated through z). The original coefficients of the model shown in equation (1) were used to express the above decomposition in additive terms and thus determine the relative percentage contribution of each path to the total association. First, a model was obtained for the entire sample including driver's age, z, and also driver's sex. In a second step, the model was obtained separately for men and women, and for crashes of low (only minor injuries), and high severity (resulting in major injuries or deaths).
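The additive expression of the decomposition can be sketched as follows, assuming that the percentage contributions are computed on the log-odds scale, where ln(total OR) = ln(direct OR) + ln(indirect OR). The function name and the OR values are hypothetical and serve only to show the arithmetic.

```python
from math import log

def path_shares(or_direct, or_indirect):
    """Percentage of the total log odds ratio attributable to the direct
    and indirect paths, given that total OR = direct OR x indirect OR."""
    ln_direct, ln_indirect = log(or_direct), log(or_indirect)
    ln_total = ln_direct + ln_indirect
    if ln_total == 0:
        raise ValueError("total OR equals 1: percentage shares are undefined")
    return 100 * ln_direct / ln_total, 100 * ln_indirect / ln_total

# Hypothetical ORs, not taken from the study tables.
direct_pct, indirect_pct = path_shares(or_direct=2.0, or_indirect=1.1)
print(f"direct: {direct_pct:.1f}%, indirect: {indirect_pct:.1f}%")

# An indirect OR below 1 yields a negative percentage contribution.
direct_pct_neg, indirect_pct_neg = path_shares(or_direct=1.5, or_indirect=0.95)
print(f"direct: {direct_pct_neg:.1f}%, indirect: {indirect_pct_neg:.1f}%")
```

Under this convention the two shares always sum to 100%, so a protective indirect path (OR below 1) produces a negative indirect share and a direct share above 100%.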
To obtain the 95% confidence intervals of the OR estimations, Buis [26] proposes the use of the bootstrap method [28]. This procedure draws multiple resamples, with replacement, from the study sample, which is treated as the population. The standard error can then be estimated as the standard deviation of the point estimates obtained from these resamples. Therefore, bootstrapping (1000 iterations) was used to obtain 95% confidence intervals for the estimated OR in each model.
Table 1 shows the distribution of the 118,364 drivers included in the final sample of the study (the one which includes complete records for all variables, and for which the decomposition model was designed). Table 2 shows, for the total sample and separately according to sex, the three OR values (total, DCP and ICP) and their corresponding 95% confidence intervals (CI) for all age groups of drivers, as well as the percentage contribution of DCP and ICP to the total OR. Regarding the model obtained for the total sample, the risk of causing a crash was higher for the extreme-aged groups and reached its lowest value for drivers aged 35-44 years old. Compared to this age group, the highest risk was observed for the oldest drivers (>74 years old, total OR = 3.32). Most of this increase (92%) was linked to the DCP. Drivers aged 65-74 years old also showed an increased risk of crash (total OR = 1.65). In this group, the ICP did not significantly contribute to this increased risk (OR = 1.00; 95% CI: 0.98-1.02). On the other hand, in the youngest age group (18-24 years old), the odds of causing a crash were 2.2 times those of the 35-44 age group. Again, the DCP accounted for the main part of this increase (89%).
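The bootstrap procedure used for the confidence intervals can be sketched as follows. This is a minimal illustration on a crude odds ratio with invented records, assuming a normal-approximation interval on the log scale built from the bootstrap standard error; the study applied bootstrapping to the decomposed ORs in Stata, and the exact interval type used there is not stated in the text.

```python
import random
from math import exp, log
from statistics import stdev

def odds_ratio(records):
    """Crude OR from (is_case, in_age_group) pairs."""
    a = sum(1 for case, grp in records if case and grp)
    b = sum(1 for case, grp in records if not case and grp)
    c = sum(1 for case, grp in records if case and not grp)
    d = sum(1 for case, grp in records if not case and not grp)
    return (a / b) / (c / d)

def bootstrap_ci(records, n_boot=1000, seed=0):
    """Point estimate and 95% CI, using the SD of bootstrap log-ORs as the SE."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(records)
    log_ors = []
    for _ in range(n_boot):
        resample = rng.choices(records, k=n)  # sampling with replacement
        log_ors.append(log(odds_ratio(resample)))
    se = stdev(log_ors)
    point = odds_ratio(records)
    return point, exp(log(point) - 1.96 * se), exp(log(point) + 1.96 * se)

# Hypothetical records, NOT the study data: (is_case, in_age_group).
records = ([(True, True)] * 90 + [(False, True)] * 40
           + [(True, False)] * 150 + [(False, False)] * 150)

point, lower, upper = bootstrap_ci(records)
print(f"OR = {point:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

A percentile interval (the 2.5th and 97.5th percentiles of the bootstrap estimates) would be an equally reasonable alternative to the normal approximation used in this sketch.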

Results
The pattern described does not change substantially when stratifying the models by sex. In the group of younger drivers (<34 years), there are no differences between men and women in either the increased risk of causing a crash or the percentage of this increase associated with the DCP. For drivers over 54 years, the increased risk associated with older age is slightly higher in women (e.g., the total OR in the over-74 group is 4.65 in women and 3.13 in men). However, the percentage of these increases attributable to the DCP is slightly higher in men. In fact, in men aged 65-74 the component attributable to the ICP had a negative weight (Table 2).
Regarding the models stratified by severity of the crash shown in Table 3, we found no remarkable differences between the two groups. In the younger drivers' groups, both the increased risk of causing a crash and the percentage contribution of the ICP were slightly higher in the subgroup of more severe crashes. For drivers aged >45 years, the ICP contribution was lower for crashes resulting in major injuries or deaths. In fact, in this subgroup, the ICP yielded an OR lower than 1 (and, consequently, a negative percent contribution to the total OR) in the age groups from 55 to 74 years.

Discussion
First, our study confirms the relationship between the excess risk of causing a crash and the extreme-aged groups of drivers (18-24 and over 74 years old), this risk being especially high for older drivers. Second, this excess risk in both groups is only partially explained by differences in the driving environment or in the vehicle driven. Therefore, we have to assume that age-related risky driving behaviors and loss of capabilities (which we have called DCP) are primarily responsible for these differences in both men and women.
We also found no substantial differences in this pattern when analyzing separately crashes with minor victims and crashes with major victims or deaths. In both groups, the DCP was also mainly responsible for the increased risks, and the ICP even showed a protective association with major-victim crashes in drivers aged 55-74.
We have not found previous studies based on a theoretical approach similar to ours. Therefore, direct comparisons of our results with previous ones are not possible. However, our results are consistent with those studies showing that drivers of extreme ages are involved in more crashes due to riskier driving behavior rather than different environmental or vehicle circumstances. According to other studies, a driver's error was the critical reason in 97% of crashes involving older drivers [14], and low-mileage bias has been reported to be insignificant in the rural context [29]. Regarding younger drivers, human factors were more influential than environmental factors in road crashes [1], especially executive function capacities and negative driving behavior [30]. In our setting, it has been proposed that adolescents in higher academic grades and living in our region (Andalusia) were less aware of road safety [6].
All these studies pointed to intrinsic human behaviors and loss of capabilities as the main cause of traffic crashes in younger and older drivers. However, it has been proposed that low-mileage bias is an important factor overestimating older drivers' risk in several studies [16,17]. These studies highlighted different environmental factors as the main reason for excess risk among older drivers, which is inconsistent with the results of our study. Our data do not rule out the existence of this bias but show that most of the excess risk was due to the DCP.
The analysis of all the riskier behaviors underlying this DCP, which cannot be addressed with a police-based database such as the one used in our study, could be relevant not only to prevent future crashes, but also to adapt safely to new automotive technologies, such as autonomous vehicles [31]. Behavioral studies may also help optimize preventive strategies in extreme-aged drivers in different contexts. This could potentially be decisive in reducing fatal crashes in developing countries, where fatalistic beliefs and risk-taking attitudes are key to road safety education [32]. In fact, as riskier behaviors are culturally determined and the age of drivers is also dependent on the distribution of the population pyramid, effective preventive strategies must be individualized for each country and context.
The DCP could also gather numerous mechanisms such as reckless behaviors, driving under the effects of alcohol, concentration disorder, delayed reactions, limited cognition, psychological loss of capabilities, dementia and other mental pathologies. Research aimed at quantifying those mechanisms in different subpopulations might also improve the individualization of preventive strategies.
There is also a promising area of future research regarding different pathologies that may be associated with age, riskier behaviors and the risk of causing a road crash. In defining the DCP, we mainly considered associated diseases and drug treatment. However, some middle-age diseases such as diabetes [33], cardiovascular disease or hypertension could have a considerable impact on driving abilities throughout life.
The results of our study suggest that human factors may explain the excess risk of having a road crash in extreme-aged drivers, especially in the elderly. It seems essential to differentiate which part of the responsibility for a crash depends on a preventable misbehavior and which part on driving in an environment intrinsically associated with a higher crash risk. This difference has not been explored in depth in previous works and might make a difference in designing more precise preventive strategies.
Our study aimed at differentiating both components according to one of the main human factors: age. It is important to note that our study does not attempt to identify which elements are intrinsically associated with a riskier driving behavior in each age subgroup (it seems evident that those factors might be completely different in younger drivers, e.g., inexperience or alcohol abuse, than in older drivers, e.g., cognitive deterioration or pathologies). On the contrary, our study aims to identify which part of their respective excess risk of causing a road crash is not associated with this riskier driving behavior.
The practical implications of individualizing and quantifying both components of the association of age with the risk of causing a road crash could be widely exemplified. For instance, if (as our results suggest) the excess risk of causing a road crash in elderly drivers depends predominantly on a deterioration in their driving abilities (directly or indirectly related to aging), strategies focused on identifying drivers with limitations in those skills and advising them to withdraw from traffic circulation might be appropriate. Interventions from Public Health institutions or Primary Health Care professionals focused on identifying potentially dangerous loss of driving abilities (ophthalmological evaluation, cognitive deterioration, prescribed drugs, etc.) and incorporating health advice on safe driving could be an excellent opportunity to address this issue. However, these strategies would not be effective if this excess risk depended on environmental circumstances alone (for example, driving in more hostile or unsafe driving environments or using damaged vehicles). In this hypothetical case, interventions could focus on informing these drivers and improving road safety in these environments.
Nevertheless, although the DCP prevails, environment-related prevention measures such as lower speed limits could well result in a substantial reduction in the frequency of serious accidents, in a possible interaction with individual cognitive impairments of older drivers.
This study has several limitations. Most of them are related to the data source: a police-based register with all the well-known drawbacks associated with this type of database [34][35][36]: under-reporting of urban and less severe crashes, uncertainties about the validity of some variables, missing values for some of them, and lack of some other variables relevant to testing our study hypothesis. For example, socio-economic factors and specific risky driving behaviors could not be collected. Several studies attempted to develop a model for testing aberrant driving behaviors, such as the one tested by Zhang et al. [37], but in a police-based database it was impossible to collect some variables such as driver anger or haste. Because the register is anonymous, we could not link the database to hospital records or clinical histories to enrich our data. We used a quasi-induced exposure approach to design our control group. Although non-responsible drivers of clean collisions have been shown to constitute a representative sample of car drivers [22], a selection bias is still possible. On the other hand, we accepted the validity of our assumption about the allocation of responsibilities based on the commission of errors or infractions in clean collisions, which could be biased.
Future studies should focus on developing effective preventive strategies in extreme-aged drivers in order to decrease riskier driving behaviors.

Conclusions
In conclusion, our results support the hypothesis that most of the excess crash risk observed for the youngest and oldest drivers is primarily related to their higher frequency of risky driving behaviors or loss of capabilities and is much less dependent on the driving environment or on the vehicles they drove. This association was no different between men and women, or between crashes with minor or major victims. These results should be considered in order to prioritize preventive strategies intended to decrease road crashes among the youngest and oldest drivers. Future studies should be designed to focus on analyzing the concrete elements of these riskier driving behaviors, the identification and control of the potential loss of capabilities and exploring the usefulness of preventive programs for extreme-aged drivers.

Acknowledgments:
The authors wish to thank the Spanish General Traffic Directorate for facilitating access to the Spanish database of road crashes, Ángela Rivera-Izquierdo and K. Shashok, for improving the use of English in the manuscript, and Pablo Lardelli-Claret for supporting the study design and data analysis.

Conflicts of Interest:
The authors declare no conflict of interest.