Dataset for detecting motorcyclists in pedestrian areas

This paper presents a semi-automated, scalable, and homologous methodology, oriented towards IoT and implemented in Python, for extracting and integrating images of road areas shared by pedestrians and motorcyclists in order to build a multiclass object classifier. It consists of two stages. The first stage creates a non-debugged (raw) data set by acquiring images related to this semantic context, using an embedded device connected 24/7 via Wi-Fi to a free and public CCTV service in Medellin, Colombia. Using artificial vision techniques, the device automatically performs a comparative chronological analysis to decide which images to download from the 80 cameras, which report data asynchronously. The second stage proposes two algorithms focused on debugging the previously obtained data set. The first one helps the user label the raw data set through Regions of Interest (ROI) and hotkeys. It decomposes the information of the nth image of the data set into a single dictionary and stores it in a binary Pickle file. The second one is simply a viewer of the classification performed by the user with the first algorithm, which lets the user verify that the information in the resulting Pickle file is correct.



Value of the Data
• One of the great difficulties the scientific community faces when developing artificial intelligence models is obtaining quality, structured, and publicly accessible data, especially if the semantic context involves classes such as pedestrians, motorcyclists, and their respective interactions with the road. In addition, constructing a data set requires a lot of time and order, even more so if there are no tools or methodologies that facilitate its generation and development.
• The data are useful to anyone looking for labeled data whose project has a semantic context close to intelligent traffic light systems assisted by artificial vision.
• In general, the state of the art is searching for more efficient techniques for constructing robust data sets that require as little human intervention as possible. However, their reproducibility and comprehension have become more complex, causing a knowledge gap for users looking for simpler techniques to construct clean (debugged) image data.
• This work therefore proposes a specific and straightforward methodology for obtaining the same data in the same semantic context, giving the reader the tools and steps needed to build a labeled Dataset.
• As a reference for how the scientific community can apply this type of data to their projects, see the work developed in [1], which is one example of the value and usefulness of the generated Dataset.

Experimental Design, Materials and Methods
The methodology developed for constructing the Dataset consists of two main lines of execution. The first line, image acquisition from the CCTV server, is the only one that uses a Comparative Chronological Analysis to determine whether the image observed by the nth camera at a time instant t needs to be downloaded. The second line consists of the labeling process, which uses rectangular ROIs and hotkeys, together with the verification of the labels produced.

Comparative chronological analysis for image download
Comparative Chronological Analysis is the process of comparing the current image observed by the nth camera with a previous image from the same camera and quantifying the magnitude of the observable difference between them. The operation applied to the data observed by each camera is expressed in terms of the following quantities:
• ACC (A, B): the Chronological Comparative Analysis, which returns a scalar value.
• n: the number of pixels, that is, the number of rows times the number of columns of the image matrix.
• (i, j): the index into the image's array, where i is the current row and j the current column.
• A: the current image observed by the nth camera.
• B: the previously stored image of the nth camera.
• A(i, j): the scalar value of the pixel at position (i, j) of A.
• B(i, j): the scalar value of the pixel at position (i, j) of B.
The criterion for downloading an image from the nth camera depends on the value ACC (A, B) obtained above, compared against a threshold, where:
• BD: the flag to download.
• ACC (A, B): the Chronological Comparative Analysis (scalar value).
• threshold: the value that sets the download criterion.
The Comparative Chronological Analysis for Image Download method ensures that, under the download criterion, the system does not store identical images caused by possible communication errors or by the asynchronous update behavior typical of the IP cameras connected to the CCTV. It thus builds a raw data set that does not contain repeated or too similar images (as defined by the threshold).
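The comparison and the download decision can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that ACC is the mean squared pixel difference over the n pixels (consistent with the listed terms, but an assumption), that images are 2-D sequences of grayscale values, and that the flag is raised when ACC strictly exceeds the threshold. The function names `acc` and `download_flag` are hypothetical.

```python
def acc(current, previous):
    """Assumed form of ACC(A, B): mean squared difference over all pixels.

    `current` (A) and `previous` (B) are 2-D sequences of scalar pixel
    values with identical dimensions.
    """
    rows = len(current)
    cols = len(current[0]) if rows else 0
    if rows == 0 or cols == 0:
        raise ValueError("images must be non-empty")
    if len(previous) != rows:
        raise ValueError("images must have the same number of rows")

    total = 0.0
    for row_a, row_b in zip(current, previous):
        if len(row_a) != cols or len(row_b) != cols:
            raise ValueError("images must have the same number of columns")
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2

    n = rows * cols  # number of rows times number of columns
    return total / n


def download_flag(current, previous, threshold):
    """Assumed rule for BD: download only if the images differ enough."""
    return acc(current, previous) > threshold
```

With an empirically chosen threshold, calling `download_flag(new_image, stored_image, threshold)` for each camera would skip frames that the server has not yet refreshed.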

Region of interest (ROI) and hotkeys
After obtaining the raw data set, it must be labeled according to the classes defined in the project. In an image, the classes are represented by two-dimensional data, so the user can extract the information of interest by selecting a region. A technique widely used by manual labeling tools is the rectangular ROI.
A rectangular ROI is a tool for manually selecting a region of interest in an image: the user selects the area, and the tool returns a vector containing the indices of the top starting vertex of the rectangle and the bottom ending vertex. Fig. 2 presents a sample from the data set built in this work, along with the ROI selection.
Hotkeys (fast access keys) are a procedure based on continuously sensing the state of a key while the nth image, or general sample, is displayed, so that each key is assigned a class or a task. For this work, the cases are the following:
• Assign: the path within a dictionary where the cut-out, the ROI coordinates, and the class label should be stored.
• [MMNAP, MMAP, MNAP, PCP]: the distinct classes recognized during the project.
• continue: tells the system that the general sample contains no classes or that there are no more classes to classify.
Combining both concepts makes it possible to build a refined and structured data set, as shown in Fig. 3.
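As an illustration of how ROI selection and hotkeys can feed a dictionary that is later pickled, consider the sketch below. It is not the authors' code: the key bindings ("1" to "4" and "c"), the dictionary layout (image name mapped to a list of records), and the function names are all assumptions made for the example. Only the four class names come from the text.

```python
import pickle

# Hypothetical hotkey bindings; the source does not state which keys are used.
HOTKEYS = {"1": "MMNAP", "2": "MMAP", "3": "MNAP", "4": "PCP"}
CONTINUE_KEY = "c"  # hypothetical key for the "continue" case


def assign(labels, image_name, roi, key, crop=None):
    """Store one labeled ROI under its image name.

    `roi` is (x1, y1, x2, y2): the starting and ending vertices of the
    rectangle. Returns False when the key means "continue" (nothing stored).
    """
    if key == CONTINUE_KEY:
        return False
    if key not in HOTKEYS:
        raise ValueError("unknown hotkey: {!r}".format(key))
    record = {"class": HOTKEYS[key], "roi": tuple(roi), "crop": crop}
    labels.setdefault(image_name, []).append(record)
    return True


def save_labels(labels, path):
    """Write the whole dictionary to a binary Pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(labels, fh)
```

In an interactive tool, the ROI would typically come from a rectangle-selection widget and the key from a keyboard event; both are passed in as plain values here so the storage logic can be read in isolation.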

Data source
The data come from a free CCTV comprising 80 cameras arranged publicly in Medellín [2]. They are part of the Intelligent Mobility System of Medellin (SIMM), a pioneering program in Colombia of the Medellín Mobility Secretary. It emerged in 2020 as a response to the need to optimize the city's road network.
The nature of the data makes reliable acquisition a challenge: although the data are entirely free, each camera updates its information on the server approximately every five minutes and asynchronously with respect to the other cameras.

Methodology
Fig. 4 shows a diagram of the proposed methodology. This methodology adopts characteristics from works like [3][4][5][6][7]. As can be seen, the system depends on the availability and state of the data present in the CCTV. The request for information (RGB image) is made to each camera independently through a Raspberry Pi Zero W with an internet connection.

Results
As a result, we developed four algorithms:
• Image acquisition algorithm for construction of the raw data set.
• Algorithm for the process of labeling and building the clean data set.
• Algorithm for graphically reviewing or verifying the status of the debugged data set.
• Algorithm for converting between the designed pickle format and the YOLO format.

Image acquisition algorithm for construction of the raw data set
Algorithm 1 illustrates the data acquisition process, which follows the presented methodology and implements the Comparative Chronological Analysis for Image Download. The flag-type variables are used to verify the internet connection and the CCTV server. The old and new_check variables correspond to the image pair (A, B) and are used to determine how different image A is from image B.
Algorithm 1 follows this sequential process:
√ Iterate over all the cameras available on the CCTV server and assign the currently observable information of the nth camera to a variable.
√ If the data of the nth camera are available on the CCTV server and there is no previous image from the nth camera in a local folder, store the image in .jpg format.
√ If there is a previously captured image from the nth camera, apply the Comparative Chronological Analysis to determine whether the currently available image differs from the previously stored one, discriminating according to a threshold defined empirically beforehand.
Fig. 5 illustrates the type of images acquired by the implemented acquisition system. Note that there is no control over the lighting, much less over the angle at which the camera captures the scene; that is, the system is at the mercy of the data available on the CCTV server.
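The sequential process can be sketched as a single pass over the cameras. This is a hedged outline rather than Algorithm 1 itself: `acquire` and its parameters are hypothetical names, and fetching, comparing, and storing are injected as callables so the control flow does not depend on any particular HTTP or image library.

```python
def acquire(camera_ids, fetch, previous, differs, store):
    """Run one acquisition pass over all cameras.

    fetch(cam)        -> current image, or None if the camera/server is down
    previous          -> dict mapping camera id to the last stored image
    differs(new, old) -> True if the new image should be downloaded
    store(cam, image) -> persists the image (e.g. as a .jpg file)

    Returns the list of camera ids whose image was stored in this pass.
    """
    saved = []
    for cam in camera_ids:
        image = fetch(cam)
        if image is None:
            # Data for this camera are not available right now; skip it.
            continue
        old = previous.get(cam)
        # Store unconditionally the first time, otherwise only on change.
        if old is None or differs(image, old):
            store(cam, image)
            previous[cam] = image
            saved.append(cam)
    return saved
```

Because the cameras refresh asynchronously, such a pass would be repeated periodically; frames that have not changed since the last stored one are skipped by the `differs` check.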

Algorithm for the process of labeling and building the clean or filtered data set
the nth object of interest within the nth image.
√ Wait for the user to select one of the 4 hotkeys; each hotkey relates the nth bounding box to one of the 4 classes of interest.
Algorithm 1. Algorithm to get the data from the URL.

Algorithm for graphically reviewing or verifying the status of the debugged data set
After completing the labeling process, the labels can be reviewed using Algorithm 3, which follows the data hierarchy and presents the color assigned to the label, the object's bounding box, and the text of the class type.
Algorithm 3 follows this sequential process:
√ Read and display to the user the nth image previously captured and stored in .jpg format.
√ Read the .pkl file previously created and assign its content to a dictionary-type variable.
√ Draw on the nth image the content that the dictionary associates with the image name, presenting the user with the bounding box of the nth object in the image together with the name of the class that was assigned.
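A minimal sketch of the review step is given below. It assumes, for illustration only, that the Pickle file holds a dictionary mapping each image name to a list of records with "class" and "roi" keys; that layout, the color table, and the function names are hypothetical and not taken from the source.

```python
import pickle

# Hypothetical display colors per class, as (B, G, R) tuples.
COLORS = {
    "MMNAP": (0, 0, 255),
    "MMAP": (0, 255, 0),
    "MNAP": (255, 0, 0),
    "PCP": (0, 255, 255),
}


def load_labels(path):
    """Read the .pkl file back into a dictionary."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def overlays_for(labels, image_name):
    """Return (class name, ROI, color) for every labeled object in an image.

    An image without labels yields an empty list.
    """
    return [
        (record["class"], record["roi"], COLORS[record["class"]])
        for record in labels.get(image_name, [])
    ]
```

Each returned tuple carries what is needed to draw a rectangle and a class caption on the image, for example with OpenCV's `cv2.rectangle` and `cv2.putText`.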
where X is the width and Y is the height of the nth object's bounding box in the image. This type of boundary box is one count. Therefore, it is necessary to reproduce the interaction shown in the picture to transform any bounding box to its respective YOLO format. The following algorithm facilitates the interpretation (Algorithm 4). Finally, using the last algorithm design, the Dataset available in [1] can be reproduced in raw and labeled
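The conversion to YOLO annotations can be illustrated as follows. The sketch assumes that each bounding box is stored as two corner vertices (x1, y1, x2, y2) in pixels and that class ids follow the order [MMNAP, MMAP, MNAP, PCP]; both are assumptions, as are the function names. The target layout is the usual YOLO text format: class id, then box center and size, all normalized by the image width and height.

```python
def to_yolo(box, image_width, image_height, class_id):
    """Convert a corner-based box in pixels to normalized YOLO values."""
    x1, y1, x2, y2 = box
    # Accept the two vertices in any order.
    x_min, x_max = sorted((x1, x2))
    y_min, y_max = sorted((y1, y2))

    width = x_max - x_min    # X: box width in pixels
    height = y_max - y_min   # Y: box height in pixels
    x_center = x_min + width / 2
    y_center = y_min + height / 2

    return (
        class_id,
        x_center / image_width,
        y_center / image_height,
        width / image_width,
        height / image_height,
    )


def yolo_line(box, image_width, image_height, class_id):
    """Format one annotation line for a YOLO .txt label file."""
    cid, xc, yc, w, h = to_yolo(box, image_width, image_height, class_id)
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(cid, xc, yc, w, h)
```

For example, a box with vertices (100, 50) and (300, 150) in a 400 x 200 image is centered in the frame and covers half of each dimension, so all four normalized values equal 0.5.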

Fig. 2. Example of a selected ROI for an object of a specific class.

Fig. 3. Structure of the obtained clean data set.

Fig. 5. Example of the type of images acquired.

Fig. 7. General operation of the label system.
© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Table, Image, Chart, Graph, Figure
Data collection: The raw Dataset was acquired using an embedded device (Raspberry Pi W2) connected 24/7 via WiFi to a free and public CCTV service in Medellin, Colombia, over the internet, and the raw Dataset was filtered and debugged into YOLO text annotations using the same device and the tools made and reported in this article.
Data source location: Free and public CCTV service operated by the Ministry of Mobility of Medellín in Medellin, Colombia.
Data accessibility: Repository name: https://zenodo.org/
Data identification number: 10.5281/zenodo.7935299
Direct URL to data: https://zenodo.org/record/7935299
Instructions for accessing these data: Not applicable