Dataset of traffic accidents in motorcyclists in Bogotá, Colombia

According to the World Health Organization, in 2016 Colombia ranked tenth worldwide, third on the continent, and second in South America, with a rate of 9.7 motorcycle fatalities per 100,000 population. Between 2012 and 2021, motorcyclists accounted for 50% of all deceased and injured road users, with an annual average of 3,140 fatalities and 20,800 injuries. Bogotá, Cali, and Medellín were the cities with the most accidents. In Bogotá in 2017, motorcyclists accounted for around 32% of road-user deaths. This data article presents the dataset used to analyze and predict the severity of motorcyclist road accidents in Bogotá in the article entitled “Extraction of decision rules using genetic algorithms and simulated annealing for prediction of severity of traffic accidents by motorcyclists” [1]. The dataset was consolidated from the records of 175,245 traffic accidents and the reports of 337,828 road actors involved in crashes in Bogotá between January 2013 and February 2018. The data were compiled, processed, and enriched with additional information about infrastructure and weather conditions. The data correspond to 35,693 motorcyclist traffic accidents, represented by 28 variables classified into five categories: road actors, motorcyclists and individuals involved, weather conditions and timing, road conditions and location, and characteristics of the accident. The data make it possible to study road safety in greater depth and to compare it across Latin America, where studies on vulnerable road users are limited. By severity, 28% of the recorded motorcycle traffic accidents involved material damage, 69% involved injuries, and 3% involved fatalities.



Value of the Data
• The data can be used to predict the conditions and factors associated with a motorcyclist traffic accident in Bogotá (Colombia) according to severity (material damage, injuries, and deaths).
• The availability of data related to traffic accidents involving motorcyclists is limited in Latin America.
• The data can be analyzed comparatively with traffic crashes from other locations to contrast the behavior of motorcyclists on the road.
• The data can help identify the causes of motorcyclist traffic accidents and thus define countermeasures to prevent injuries and fatalities on the roads.
• The motorcyclist traffic accident data include information on pavement/road conditions and weather conditions related to the time and location of the crash/collision.
• The dataset is a relevant source for developing road safety studies on motorcyclists.

Data Description
The data presented in this brief article were used to predict the severity of traffic accidents involving motorcyclists in Bogotá in the study by Ospina-Mateus et al. [1]. In that study, data mining and machine learning techniques were applied to extract decision rules that predict the severity of motorcyclists' traffic accidents. The data contain traffic accidents involving motorcyclists between January 2013 and February 2018 in Bogotá, Colombia. The dataset was extracted from 175,245 traffic accidents and 337,828 reports of road actors involved in crashes. In total, 35,693 motorcyclist accidents were consolidated.
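The consolidation step described above (keeping only the accidents in which a motorcyclist appears among the reported road actors) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the field names `accident_id` and `actor_type`, the label `"motorcycle"`, and the sample records are all hypothetical and synthetic.

```python
# Hypothetical field names and labels; the source files may use different ones.
def motorcycle_accident_ids(actor_reports):
    """Return the IDs of accidents with at least one motorcyclist among the actors."""
    return {
        report["accident_id"]
        for report in actor_reports
        if report["actor_type"] == "motorcycle"
    }


def select_motorcycle_accidents(accidents, actor_reports):
    """Keep only the accident records whose ID appears in a motorcyclist report."""
    ids = motorcycle_accident_ids(actor_reports)
    return [acc for acc in accidents if acc["accident_id"] in ids]


# Tiny synthetic sample, for illustration only (not taken from the dataset).
accidents = [{"accident_id": 1}, {"accident_id": 2}, {"accident_id": 3}]
actor_reports = [
    {"accident_id": 1, "actor_type": "car"},
    {"accident_id": 1, "actor_type": "motorcycle"},
    {"accident_id": 2, "actor_type": "pedestrian"},
    {"accident_id": 3, "actor_type": "motorcycle"},
    {"accident_id": 3, "actor_type": "motorcycle"},
]
selected = select_motorcycle_accidents(accidents, actor_reports)
```

Collecting the IDs into a set first means an accident with several motorcyclists is still selected only once.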
The dataset was classified according to the accident's severity: material damage, injuries, and fatalities. In total, 28 variables were defined for each event. These variables were classified into five categories: road actors, motorcyclists and individuals involved, weather conditions and timing, location and road conditions, and accident characteristics. The data files (readable in Excel format), presented as Tables 1 and 2, are deposited in Mendeley Data. All variables in the dataset are categorical. The categories of each variable are listed and explained in Table 1. Table 2 contains the compilation of all the information.
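As a rough sense of scale for the three severity classes, the shares reported in this article (28% material damage, 69% injuries, 3% fatalities) can be applied to the 35,693 consolidated accidents. The sketch below is plain arithmetic on rounded percentages, so the results are approximations and not counts reported by the authors; the function and variable names are illustrative.

```python
# Figures taken from the article; the percentages are rounded, so the
# derived counts are only approximate.
TOTAL_ACCIDENTS = 35_693
SEVERITY_SHARES = {
    "material damage": 0.28,
    "injuries": 0.69,
    "fatalities": 0.03,
}


def approximate_counts(total, shares):
    """Estimate the number of events per class from rounded shares."""
    return {label: round(total * share) for label, share in shares.items()}


counts = approximate_counts(TOTAL_ACCIDENTS, SEVERITY_SHARES)
for label, count in counts.items():
    print(f"{label}: about {count:,} accidents")
```

The strong imbalance between classes (fatal events are roughly 3% of the records) is worth keeping in mind when training or evaluating severity classifiers on these data.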

Experimental Design, Materials and Methods
The dataset groups the variables into five categories. The "road actors" variables indicate the users (car/bus, bicycle, motorcycle, pedestrian) involved in the accident and the crash interaction. The variables related to "motorcyclists and individuals" indicate the number of people involved, their gender, and their age. The "weather conditions and timing" variables cover specific conditions such as day/date, lighting (daylight/nightlight), type of day (weekdays/weekends), month (trimester), and climatic aspects. The level of rainfall (mm) on the date of each event was obtained from the Institute of Hydrology, Meteorology and Environmental Studies of Colombia (IDEAM) [3]. The variables related to road condition and location indicate the cardinal location of each accident, the type of road, and its quality. The information on the quality of the road network was provided by the Institute of Urbanism of Bogotá (IDU-UAERMV) [2]. Finally, the last group of variables describes the characteristics of the accident, including the type of accident and the number of people involved, whether uninjured, wounded, or dead. Table 3 contains the dataset with the variables and the severity of the road event. By severity, 28% of the events involved material damage, 69% involved injuries, and 3% involved fatalities.
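The timing variables described above can be derived from the date of each event. The following minimal sketch shows one way to do so, under two assumptions that are not stated in the article: weekends are taken to be Saturday and Sunday, and a trimester is taken to be a consecutive three-month block (January to March is trimester 1). The function name and output keys are hypothetical.

```python
from datetime import date


def timing_features(event_date):
    """Derive categorical timing variables from an event date.

    Assumptions (not confirmed by the source): weekends are Saturday and
    Sunday; trimesters are consecutive three-month blocks starting in January.
    """
    is_weekend = event_date.weekday() >= 5  # Monday is 0, Sunday is 6
    return {
        "type_of_day": "weekend" if is_weekend else "weekday",
        "trimester": (event_date.month - 1) // 3 + 1,
    }


# Example dates inside the study period (January 2013 to February 2018).
saturday_example = timing_features(date(2013, 1, 5))
wednesday_example = timing_features(date(2018, 2, 28))
```

Public holidays are not handled here; if the original coding treated them as non-working days, an additional calendar lookup would be needed.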

Ethics Statements
The data and information related to motorcycle traffic accidents were received formally anonymized, guaranteeing the privacy rights of the people involved in road events. The primary information was provided by the Secretariat of Mobility and Transit of Bogotá, a Colombian government entity committed to guaranteeing ethical and legal provisions in the use of information.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Dataset of road crashes in motorcyclists in Bogotá (Original data) (MENDELEY).