A road surface image dataset with detailed annotations for driving assistance applications

Previewing the road surface state is essential for improving the safety and ride comfort of autonomous vehicles. The dataset described in this data article consists of 370151 road surface images captured under a wide range of road and weather conditions in China. The original pictures are acquired with a vehicle-mounted camera, and patches containing only the road surface area are then cropped from them. The friction level, material, and unevenness properties of each road image are annotated in detail. This large-scale dataset is useful for developing vision-based road sensing modules that improve the performance of driving assistance systems. Deep-learning researchers can also use it as a benchmark for comparing their algorithms. The dataset is available at [1].


Value of the Data
• This large-scale dataset lays the basis for identifying road surface conditions with vehicle-mounted cameras and is useful for researchers developing road sensing and driving assistance systems.
• To the best of the authors' knowledge, this is the first dataset that simultaneously annotates the friction level, unevenness, and material properties of road images. The class definition of each road property is reasonable and specific.
• Further research on road monitoring and accident prevention can also be conducted using all or part of the dataset [2, 3].
• The dataset covers as many of the available working conditions as possible. This breadth supports, at the dataset level, the robustness of algorithms developed on it.
• For deep-learning researchers, this dataset can serve as a benchmark for comparing the performance of different image classification algorithms.

Data Description
Road condition sensing with image data has been shown to be feasible for providing essential information to vehicle control systems [4, 5]. Most existing public datasets contain images collected under limited working conditions, which restricts the robustness of algorithms in practical applications. This large-scale dataset annotates the friction level, material, and unevenness properties of road images acquired under various conditions.
The friction level property contains six subclasses corresponding to different weather conditions, i.e. dry, wet, water, fresh snow, melted snow, and ice. The road material property consists of asphalt, concrete, mud, and gravel. The road unevenness is divided into smooth, slight unevenness, and severe unevenness according to the amplitude of the road undulation. The subclasses of the three road properties are combined to form the class definition of the dataset. It should be noted that the road material and unevenness are not annotated when the friction levels are fresh snow, melted snow, or ice. Also, the unevenness is not labeled for mud or gravel roads.
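The combination rules above can be checked with a short enumeration. The following is a minimal sketch, not the authors' tooling: the constant and function names are illustrative, the hyphen-joined display names only imitate the sample labels in this article, and the actual folder names in the dataset may differ. Under the stated rules the enumeration yields 27 combinations.

```python
from itertools import product

# Subclass names as listed in the text above.
MATERIAL_AWARE_FRICTION = ["dry", "wet", "water"]       # material is annotated
SNOW_ICE_FRICTION = ["fresh snow", "melted snow", "ice"]  # material/unevenness not annotated
PAVED_MATERIALS = ["asphalt", "concrete"]               # unevenness is annotated
UNPAVED_MATERIALS = ["mud", "gravel"]                   # unevenness is not annotated
UNEVENNESS = ["smooth", "slight", "severe"]


def enumerate_classes():
    """Return (friction, material, unevenness) tuples; None marks an unannotated property."""
    classes = []
    # Dry/wet/water on asphalt or concrete: all three properties are annotated.
    for friction, material, unevenness in product(
        MATERIAL_AWARE_FRICTION, PAVED_MATERIALS, UNEVENNESS
    ):
        classes.append((friction, material, unevenness))
    # Dry/wet/water on mud or gravel: unevenness is not labeled.
    for friction, material in product(MATERIAL_AWARE_FRICTION, UNPAVED_MATERIALS):
        classes.append((friction, material, None))
    # Fresh snow, melted snow, ice: only the friction level is labeled.
    for friction in SNOW_ICE_FRICTION:
        classes.append((friction, None, None))
    return classes


def display_name(cls):
    """Join the annotated parts with hyphens (an assumed, illustrative naming scheme)."""
    return "-".join(part for part in cls if part is not None)


if __name__ == "__main__":
    all_classes = enumerate_classes()
    print(len(all_classes))
    for cls in all_classes:
        print(display_name(cls))
```

The count decomposes as 3 × 2 × 3 = 18 paved-road classes, 3 × 2 = 6 mud or gravel classes, and 3 snow or ice classes.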
Based on the above classification strategies, 27 classes are defined. The directory structure of the dataset is shown in Fig. 1. There are 27 subfolders in the train-set folder, and each contains [...]
[Figure: sample images from the dataset. Recoverable panel labels: [...] water-asphalt-severe; (e) wet-asphalt-slight; (f) wet-concrete-smooth; (g) wet-concrete-severe; (h) water-concrete-slight; (i) water-mud; (j) dry-gravel; (k) melted snow; (l) ice.]

Experimental Design, Materials and Methods
The original pictures are acquired with a LI-USB30-AR023ZWDRB USB camera mounted on the vehicle bonnet, as shown in Fig. 4. Table 1 lists the parameters of the camera and the lens [6]. The camera is tilted downward at a depression angle so that the road surface area is imaged clearly. The maximal preview distance on a flat road is 20 meters. The camera is connected to the IPC with a USB cable. The IPC runs a Python script that calls the OpenCV library to capture and store the pictures [7]. The system collects five pictures per second.
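A fixed-rate capture loop of the kind described above might look like the following. This is a minimal sketch under stated assumptions, not the authors' script: the function names, file naming pattern, output directory, and device index are all hypothetical. The frame source and writer are passed in as callables that mirror `cv2.VideoCapture.read` and `cv2.imwrite`, so the timing logic can be exercised without a camera.

```python
import time
from pathlib import Path


def capture_frames(read_frame, save_frame, out_dir, rate_hz=5.0, max_frames=10,
                   clock=time.monotonic, sleep=time.sleep):
    """Grab frames at a fixed rate and store them as numbered files.

    read_frame() -> (ok, frame) and save_frame(path, frame) follow the call
    shapes of cv2.VideoCapture.read and cv2.imwrite.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    period = 1.0 / rate_hz          # 0.2 s between frames at five pictures per second
    next_due = clock()
    saved = []
    while len(saved) < max_frames:
        ok, frame = read_frame()
        if not ok:                  # stream ended or camera unavailable
            break
        path = out_dir / f"frame_{len(saved):06d}.jpg"  # hypothetical naming pattern
        save_frame(str(path), frame)
        saved.append(path)
        # Schedule against absolute deadlines so per-frame delays do not accumulate.
        next_due += period
        delay = next_due - clock()
        if delay > 0:
            sleep(delay)
    return saved


def capture_with_opencv(out_dir="captures", device_index=0, max_frames=25):
    """Example wiring to OpenCV; the device index and output folder are assumptions."""
    import cv2  # imported lazily so capture_frames stays usable without OpenCV

    cap = cv2.VideoCapture(device_index)
    try:
        return capture_frames(cap.read, cv2.imwrite, out_dir,
                              rate_hz=5.0, max_frames=max_frames)
    finally:
        cap.release()
```

Calling `capture_with_opencv()` on a machine with a connected camera would store roughly five seconds of frames; the real acquisition script may differ in file format, naming, and error handling.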
The pictures are captured on real roads accessible to the vehicle, which travels at 20-80 km/h. The experiments are conducted in Beijing from October 2021 to May 2022 and cover various conditions of weather, sunlight brightness, road service age, aggregate characteristics, and driving maneuvers, which enriches the variety of patterns in the image dataset. Some corner cases, such as debris on the road, dirt on the lens, and camera motion blur, are also included. The dataset covers as many realistic situations as possible and lays a solid foundation for practical driving assistance applications.
Because the vehicle response is mainly affected by the area over which the tires pass, we crop the original pictures into patches of 240 × 360 pixels for accurate road classification. The cropping is performed with a Python script, after which only the patches that contain nothing but road surface are retained. The retained images are then classified manually.
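The cropping step could be sketched as a simple grid tiling. The code below is an illustration under assumptions, not the authors' script: it treats 240 × 360 as width × height, uses non-overlapping tiles, and takes a hypothetical 1920 × 1080 frame size and region of interest, none of which are stated in this article. It computes only the patch boxes; with an image held as a NumPy array, a patch would then be `image[top:bottom, left:right]`.

```python
def patch_boxes(image_w, image_h, patch_w=240, patch_h=360, roi=None):
    """Tile a region with non-overlapping boxes (left, top, right, bottom).

    roi is an optional (left, top, right, bottom) rectangle, e.g. the lower
    part of the frame where the road appears; it is clamped to the image.
    Partial tiles at the right and bottom edges are dropped.
    """
    left0, top0, right0, bottom0 = roi if roi is not None else (0, 0, image_w, image_h)
    left0, top0 = max(0, left0), max(0, top0)
    right0, bottom0 = min(image_w, right0), min(image_h, bottom0)

    boxes = []
    for top in range(top0, bottom0 - patch_h + 1, patch_h):
        for left in range(left0, right0 - patch_w + 1, patch_w):
            boxes.append((left, top, left + patch_w, top + patch_h))
    return boxes


if __name__ == "__main__":
    # Hypothetical frame size and region of interest, for illustration only.
    for box in patch_boxes(1920, 1080, roi=(0, 360, 1920, 1080)):
        print(box)
```

Selecting which patches contain only road surface is a separate filtering step that this sketch does not attempt.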

Ethics Statement
This work did not involve human subjects, animal experiments, or data collected from social media platforms.