When age means safety: Data to assess trends and differences on rule knowledge, risk perception, aberrant and positive road behaviors, and traffic crashes of cyclists

This data article examines the association between age, knowledge of traffic rules, risk perception, risky and positive behaviors on the road, and traffic safety outcomes of cyclists. The data was collected using a structured, self-administered online questionnaire completed by a full sample of 1064 cyclists. The data contains 4 parts: descriptive statistics; graphical trends for each study variable according to age; Post-Hoc (Tukey HSD) comparisons between cyclists classified in the different age groups; and, finally, the dataset for further exploration. For further information, see the full article entitled “Explaining Self-Reported Traffic Crashes of Cyclists: An Empirical Study based on Age and Road Risky Behaviors” (Useche et al., 2019) [1].


Subject area: Psychology
More specific subject area: Road safety and health behavior, sustainable transport modes, traffic crash prevention, vulnerable road users
Type of data: Tables, graph, database
How data was acquired: Original data was collected through an international web-based survey. The data was consolidated and analyzed through the statistical software package IBM SPSS (version 23.0) for descriptive procedures and IBM SPSS AMOS (version 22.0) for structural/inferential ones
Data format

Value of the data
This data provides information on the profile (age, cycling patterns, psychosocial and behavioral issues, and traffic crashes) of a sample of 1064 cyclists from different countries.
The risky and positive behaviors of cyclists can be compared according to different user-related features, such as age, gender, educational level and occupation, variables also contained in the annex dataset.
The data could be compared with other samples/studies using the Behavioral-Questionnaire (BQ) approach to examine associations and trends on traffic behaviors and crashes among cyclists.
This data can be used by other researchers and road safety practitioners to identify riskier patterns and propose evidence-based interventions grounded on the data provided by this large international sample of bicycle users.

Data
The dataset of this article provides information on a set of demographic, behavioral and crash-related factors of the sample, which is entirely composed of active cyclists. Table 1 shows the descriptive statistics obtained for all study variables included in this data article. Fig. 1 plots the trends in self-reported aberrant cycling behaviors according to the age of cyclists, and Table 2 identifies the specific differences between cyclists of all age groups through a Post-Hoc analysis. In the same way, Fig. 2 addresses protective factors (positive cycling behaviors, rule knowledge and risk perception), and Table 3 summarizes the statistical differences between age groups for these three variables. Finally, Fig. 3 shows the trends in traffic crashes suffered by cyclists of the different age groups during the last 5 years, and Table 4 presents the significant differences in traffic crash rates found through the Post-Hoc analysis.
In addition, the Supplementary SPSS dataset (.sav) will allow researchers to perform additional tests and comparisons using the entire set of measured variables.

Participants
For this cross-sectional research, data was collected and analyzed from n = 1064 bicyclists (413 females and 651 males) from 3 different territories: Latin America (n = 831 individuals, 78.1% of the

In accordance with the pursued analyses and some previous research experiences dealing with different groups of cyclists divided by age [2,3], and considering age as a key variable for explaining road-related risks [4,5], the full sample was divided into five intervals, composed as follows: < 26 years (n = 390, composing 36.7% of the sample); 26-35 years (n = 318, composing 29.9% of the sample); 36-45 years (n = 160, composing 15.1% of the sample); 46-55 years (n = 120, composing 11.2% of the sample); and > 55 years (n = 76, composing 7.1% of the sample).
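The five-interval grouping can be sketched in a few lines of Python. This is only an illustration: it assumes that age is recorded in whole years and that the interval limits are inclusive as written (26-35 includes both 26 and 35). The function name `age_group` and the list `sample_ages` are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Group sizes as reported in the text for the five age intervals.
REPORTED_COUNTS = {"<26": 390, "26-35": 318, "36-45": 160, "46-55": 120, ">55": 76}


def age_group(age):
    """Map an age in whole years to one of the five intervals (assumed inclusive limits)."""
    if age < 26:
        return "<26"
    if age <= 35:
        return "26-35"
    if age <= 45:
        return "36-45"
    if age <= 55:
        return "46-55"
    return ">55"


# Hypothetical ages, for illustration only (not taken from the dataset).
sample_ages = [19, 25, 26, 35, 36, 45, 46, 55, 56, 70]
counts = Counter(age_group(a) for a in sample_ages)
print(dict(counts))

# The reported group sizes add up to the full sample.
print(sum(REPORTED_COUNTS.values()))
```

Applied to the age variable of the annex dataset, a function of this kind should reproduce the five group sizes listed above, provided the same limits were used.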

Questionnaire
The questionnaire was administered only in Spanish and consisted of four sections. The first part asked about individual and demographic variables, such as age, gender, region of provenance and main occupation.
In the second part, self-reported risky cycling behaviors were assessed using the raw item bank of the Cyclist Behavior Questionnaire (CBQ) [6], a self-report measure specifically designed to assess high-risk riding behaviors (errors and violations) among bicycle users, and based on the one developed by Özkan and Lajunen [7] for motor-vehicle drivers. This Likert scale is originally composed of 44 items distributed in three factors: Violations (α = 0.785), consisting of 16 items; Errors (α = 0.850), consisting of 16 items; and Positive Behaviors (α = 0.729), consisting of 12 items. The entire questionnaire used a frequency-based response scale of 5 levels: 0 = never; 1 = hardly ever; 2 = sometimes; 3 = frequently; 4 = almost always. A global score of Risky Behaviors (α = 0.895) was built as the sum of the Errors and Violations reported by respondents.
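As a rough sketch of how such scores and reliability coefficients can be obtained, the Python snippet below sums item responses per factor, adds Errors and Violations into the global Risky Behaviors score, and computes Cronbach's α with the standard formula α = k/(k-1) · (1 - Σ item variances / variance of the total score). Scoring each factor as the plain sum of its items is an assumption made here for illustration; the function names and the example responses are hypothetical and do not come from the dataset.

```python
from statistics import variance

# Number of items per CBQ factor, as described in the text.
N_ITEMS = {"violations": 16, "errors": 16, "positive": 12}


def score_cbq(violations, errors, positive):
    """Sum item responses (0-4) per factor; risky = errors + violations."""
    answers = {"violations": violations, "errors": errors, "positive": positive}
    scores = {}
    for name, items in answers.items():
        if len(items) != N_ITEMS[name]:
            raise ValueError(f"{name}: expected {N_ITEMS[name]} items")
        if any(v not in range(5) for v in items):
            raise ValueError(f"{name}: responses must be integers from 0 to 4")
        scores[name] = sum(items)  # assumption: factor score = sum of its items
    scores["risky"] = scores["errors"] + scores["violations"]
    return scores


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical respondent, for illustration only.
example = score_cbq([1] * 16, [2] * 16, [3] * 12)
print(example)

# Hypothetical two-item scale whose items move together perfectly.
toy_rows = [[0, 1], [1, 2], [2, 3], [3, 4]]
print(cronbach_alpha(toy_rows))
```

With real data, `rows` would hold the item responses of one factor for all respondents, and the result could be compared with the α values reported above.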
In the third part, risk perception and knowledge of traffic rules were measured with the Cyclist Risk Perception and Regulation Scale (RPRS) [8], a Likert scale composed of 12 items: 7 for risk perception (α = 0.657) and 5 for knowledge of the general rules of bicycle use (α = 0.722). The scale assesses the degree of risk perceived in objective risk factors and the knowledge of general road regulations, from 0 (no knowledge/risk perceived) to 4 (highest knowledge/risk perceived).
Finally, the fourth part of the questionnaire consisted of a series of questions related to the use of bikes, such as the average use of the bicycle (including mean distances traveled and length of journeys) and the reasons for using it as a means of transportation.

Statistical analysis
First, basic descriptive analyses (i.e. means and standard deviations of the study variables) were obtained, with the aim of establishing trends in aberrant (errors and violations) and positive cycling behaviors, protective factors such as the knowledge of traffic norms and risk perception, and self-reported traffic crash rates, according to the cyclists' age groups (using the five intervals described in the Participants section). Then, a set of comparative analyses (Tukey's Post-Hoc tests) was performed in order to determine significant differences between the specific age groups. Please note that the global score on risky behaviors is not included in the graphical presentation of the data, because it is the sum of two sub-scales (errors and violations) of the CBQ. Nevertheless, the full set of variables is available in the annex dataset.
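The pairwise logic behind a Tukey Post-Hoc comparison can be illustrated with a short Python sketch. It is not the SPSS procedure used for this article: it only computes the mean difference and the studentized range statistic q for each pair of groups, using the Tukey-Kramer form for unequal group sizes, q = |mean_a - mean_b| / sqrt(MSW/2 · (1/n_a + 1/n_b)), where MSW is the within-group mean square. Significance levels require the studentized range distribution, which the sketch does not implement. The group labels and scores are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def tukey_kramer_q(groups):
    """groups: dict label -> list of scores. Returns {(a, b): (mean_diff, q)}."""
    n_total = sum(len(values) for values in groups.values())
    k = len(groups)
    means = {label: mean(values) for label, values in groups.items()}
    # Within-group sum of squares and mean square (one-way ANOVA decomposition).
    ssw = sum((x - means[label]) ** 2
              for label, values in groups.items() for x in values)
    msw = ssw / (n_total - k)
    results = {}
    for a, b in combinations(groups, 2):
        diff = means[a] - means[b]
        se = sqrt(msw / 2 * (1 / len(groups[a]) + 1 / len(groups[b])))
        results[(a, b)] = (diff, abs(diff) / se)
    return results


# Hypothetical scores for three groups, for illustration only.
toy_groups = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
for pair, (diff, q) in tukey_kramer_q(toy_groups).items():
    print(pair, diff, round(q, 3))
```

Each q value is then compared with the critical value of the studentized range distribution for k groups and N - k degrees of freedom. Statistical packages do this step automatically; in Python, `scipy.stats.tukey_hsd` is one option for a complete test.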

Acknowledgments
The authors would like to thank the participants, research assistants and institutional stakeholders involved in the data collection. Specifically, thanks to the DATS-INTRAS staff and to Andrea Serge and Dr. Cristina Esteban for their collaboration during the planning and data collection phases. Thanks also to Dr. Morgane Pollet and to Runa Falzolgher for the professional editing and proofreading of the final version of the manuscript.

Transparency document. Supplementary material
Transparency data associated with this article can be found in the online version at https://doi.org/10.1016/j.dib.2018.12.066.

Table 4 Post-Hoc (Tukey HSD) analysis: mean comparisons for self-reported traffic crashes suffered by cyclists (last 5 years). Factor: age group.
Notes: *The mean difference is significant at the 0.05 level. **The mean difference is significant at the 0.01 level. ***The mean difference is significant at the 0.001 level.