COVID-19 RISK ASSESSMENT IN PUBLIC TRANSPORT USING AMBIENT SENSOR DATA AND WIRELESS COMMUNICATIONS

Covid-19 has caused one of the most alarming global health and economic crises in modern times. Countries around the world have established different preventive measures to stop or control its spread. The goal of this paper is to present methods for evaluating indoor air quality in public transport in order to assess the risk of contracting Covid-19. The first part of the paper investigates the relationship between Covid-19 and various factors affecting indoor air quality. The paper then focuses on existing methods for estimating the number of occupants in public transport, since a higher occupancy rate is known to increase both the possibility of contamination and the indoor carbon dioxide concentration. Wireless data collection schemes will be defined to collect data from public transportation vehicles. The collected data are envisioned to be stored in the cloud for data analytics. We will present novel methods that analyze the collected data together with historical data and estimate the virus contagion risk level for each public transportation vehicle in service. The methodology is expected to be applicable to other airborne diseases as well. Real-time risk levels of public transportation vehicles will be available through a mobile application so that people can choose their mode of transportation accordingly.


INTRODUCTION
Covid-19, also known as coronavirus disease 2019, is a contagious respiratory illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is transmitted among human beings (Gorbalenya et al., 2020). The first case of coronavirus infection was identified in December 2019. In less than a year, more than 45 million cases and 1.2 million deaths caused by Covid-19 were identified worldwide.
Coronavirus spreads mainly through droplets and aerosols released when an infected person coughs, sneezes, sings, talks, or breathes. Transmission happens when those droplets come into contact with the mouth, nose, or eyes of other people who are in close proximity to the infected person. Moreover, the droplets can evaporate into aerosols, which can remain in the air for many hours and enable airborne transmission, usually in crowded and poorly ventilated indoor environments such as bars, restaurants, nightclubs, transportation stations, and buses. Strategies imposed by authorities to prevent the spread of Covid-19 include social distancing, wearing protective masks, hand hygiene, avoiding touching the mouth, eyes, and nose with the hands, and ventilation and air filtration in public spaces in order to expel the undesirable aerosols.
* Corresponding author: stefan.panic@pr.ac.rs
There is important research evidence suggesting that Covid-19 is transmitted through aerosols in indoor spaces. Important publications are related to i.) measurements of Covid-19 in the air, including at distances beyond those recommended for droplet transmission (Van Doremalen et al., 2020; Santarpia et al., 2020; Fears et al., 2020), ii.) physically established models of the emission of Covid-19 aerosols and of the dynamics of those aerosols (Qian et al., 2018; Liu et al., 2017; Riediker & Tsai), iii.) evidence of airborne transmission for the SARS and MERS infections (Yu et al., 2004; Xiao et al., 2018), and iv.) epidemiological evidence of possible airborne transmission, although other routes cannot be excluded (Shen et al., 2020). The available research evidence supports the use of protective measures against the transmission of Covid-19 in indoor environments as an addition to other protective strategies already used in practice, for example protective masks and hand hygiene (Morawska et al., 2020).
The economic and social effects of the Covid-19 outbreak impact all economic sectors, causing financial crises, social inequality, and insecurity, and halting or adversely affecting transportation systems. Covid-19 presents a great challenge for transportation systems all over the world in sustaining regular services to customers. Several measures can be applied in practice to reduce the risk of Covid-19 infection in public transportation, for example controlling the number of occupants in buses, trains, and stations, limiting trip duration, using face masks, applying recommended hygiene standards, and ventilating buses and trains (Tirachini et al., 2020; Wielechowski et al., 2020).
Moreover, given the close proximity to other passengers in public transportation, the ability to evaluate indoor air quality in public transit in a timely manner becomes extremely important for assessing the risk of Covid-19 contagion (Fattorini & Regol, 2020). Air quality can be characterized using several different measures, including volatile organic compounds (VOCs), carbon monoxide (CO), particulate matter (PM), other pollutants (carbon dioxide, methane, etc.), temperature, and humidity levels.
On the other hand, wireless sensor networks and wireless communications are already playing an important role in the era of Covid-19 (Saaed et al., 2020; Kamal et al., 2020; Ndiaye et al., 2020). Wireless communication technologies can be used to monitor the spread of the virus, to enable healthcare automation, and to allow virtual education and conferencing. Wireless communication systems are also envisioned to help sustain the global economy by assisting different industry sectors. Moreover, a LoRaWAN-based system of air quality sensors deployed in a city has been proposed in (Howerton et al., 2020). The generated data are used in a comparative spatial model to determine air quality before and during the Covid-19 outbreak.
In the second part of this work, we investigate methods for estimating the number of occupants in public transport based on device-based and device-free wireless methods. In the third part, a low-energy sensor node is recommended. In the final part, the general architecture of the system is provided. Wireless data collection schemes are defined to collect data from public transportation vehicles and store them in a data cloud for further processing.
To the best of the authors' knowledge, no paper addresses the use of ambient sensor data and wireless communications for the evaluation of air quality in transportation systems for Covid-19 risk assessment.

THEORETICAL PART
The occupancy rate in public transport is an essential parameter that must be considered when assessing the risk of indoor Covid-19 spread. Despite the availability of modern ticketing systems that support ticket validation during boarding, the stops at which passengers exit are typically not recorded (Kostakos et al., 2010). Therefore, actual occupancy rates are usually not available. To address the problem of occupancy estimation, various solutions are available in the literature. In this study, we focus on solutions for indoor occupancy estimation. Such solutions can be classified into two broad groups based on whether user involvement is required. Some solutions require passengers to carry a particular device (e.g., RFID tags, smart devices) or to install a certain mobile application. Such solutions are regarded as device-based solutions. Other solutions require deploying sensors or cameras to monitor either the passengers directly or their impact on the environment. Such solutions do not require passengers to carry any device and are classified as device-free solutions.
It is also of interest to consider a well-known measure from telecommunication theory, the level crossing rate (LCR). If the RF signal level at the transmitter is set adequately, which can be accomplished after a series of measurements and tests, then the LCR values obtained at the receiver could indicate the number of passengers traveling in the bus during the time between two signal receptions. Namely, the LCR defines the rate at which a random process R crosses a predetermined level r, which can be expressed mathematically as:
$$N_R(r) = \int_0^{\infty} \dot{r}\, p_{R\dot{R}}(r, \dot{r})\, d\dot{r} = \frac{\sigma_{\dot{r}|r}}{\sqrt{2\pi}}\, p_R(r)$$
where $p_R(r)$ denotes the probability density function (PDF) of the random process R, $p_{R\dot{R}}(r, \dot{r})$ is the joint PDF of R and its time derivative $\dot{r}$, and $\sigma^2_{\dot{r}|r}$ is the variance of the PDF of $\dot{r}$ conditioned on R. The closed form on the right-hand side holds when this conditional PDF is zero-mean Gaussian.
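To make the LCR idea concrete, the following minimal sketch estimates the LCR empirically from a sampled received-signal envelope by counting positive-going crossings of a threshold. The function name, the uniform sampling assumption, and the sample values are illustrative and are not taken from the paper; how the resulting rate maps to a passenger count would have to be calibrated by measurements.

```python
def level_crossing_rate(samples, level, sample_rate_hz):
    """Estimate the LCR as positive-going crossings of `level` per second.

    `samples` is a uniformly sampled signal envelope (assumption).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    # A positive-going crossing occurs when the signal moves from
    # below the level to at or above it between consecutive samples.
    upward_crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev < level <= cur
    )
    duration_s = (len(samples) - 1) / sample_rate_hz
    return upward_crossings / duration_s


if __name__ == "__main__":
    envelope = [0.0, 2.0, 0.0, 2.0, 0.0, 2.0]  # illustrative values
    print(level_crossing_rate(envelope, level=1.0, sample_rate_hz=1.0))
```

A higher sampling rate reduces the chance of missing crossings that occur between two samples.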

Device-based Solutions
Ahorrar (Jain & Madamopoulos, 2016) is an indoor localization and occupancy counting framework designed to improve energy efficiency in office buildings by controlling the heating, ventilation, and air conditioning (HVAC) and the lighting. Ahorrar exploits participatory sensing to obtain the real-time distribution of occupants. Given that some occupants carry multiple smart devices, Ahorrar employs a probabilistic and information-theoretic approach to classify the ownership of devices. The classification procedure considers device locations obtained from wireless signal strengths, the mobility states of the devices, and data traffic patterns. Data is collected with existing wireless access points. For offline occupants without smartphones, it is recommended to provide wearable hardware platforms at building entrances. Data processing is performed at central locations to avoid placing a burden on occupants' devices. The goal of the proposed solution is to achieve an accuracy of 95% in terms of occupancy numbers. Despite novelties such as ownership classification for occupants with multiple smart devices, we expect smart device ownership to be at lower levels in our case. Also, the mobility pattern of occupants in public transportation differs from that in office buildings. For instance, Ahorrar assumes that when a person walks away with a subset of her devices, the rest of her devices remain stationary (probably on her desk) until she returns. This is not a device mobility case that we expect in public transport. Also, assumptions such as distributing smart wearables at the entrance are not viable in our case.
iAbacus (Nitti et al., 2020) is an occupancy counting system proposed specifically for public bus transportation. The main idea of the system is to identify individual devices using the MAC address of the Wi-Fi network interface. Recent versions of popular mobile operating systems employ software-based randomization techniques to generate MAC addresses in order to improve user privacy, and use periodically changing random MAC addresses in successive messages. The iAbacus system requires the installation of a Wi-Fi packet sniffer to collect the probe request frames broadcast by nearby mobile devices. iAbacus checks the first three octets (six hexadecimal digits) of each obtained MAC address and evaluates whether they form a valid Organizationally Unique Identifier (OUI) assigned by IEEE. If the prefix is not listed as an OUI, the address is considered a random MAC address and the proposed de-randomization algorithm is applied before the counting process. The counting algorithm is executed in the cloud and assesses whether a device is on the bus or nearby but outside the bus. The data is transferred from the bus using a cellular connection during mobility or through Wi-Fi connections at bus stops. Experiments are conducted under static and dynamic conditions. For the static condition, experiments are executed in a university room for 15 minutes. To test dynamic conditions, the mobility of the bus is simulated by switching devices on and off according to an existing bus route and its stops. The experiments involve 8 devices, 3 of which employ random MAC addresses. The system achieves 100% accuracy under static conditions and 94% accuracy in the dynamic case. iAbacus assumes that the computed device count reflects the actual number of occupants, which does not hold for offline passengers without a smartphone or when Wi-Fi is disabled. Also, occupants with multiple smart devices can be counted multiple times. Despite contributions such as de-randomization and the counting algorithm, the proposed system is evaluated in a small-scale simulation, and the obtained results can be misleading for a real deployment.
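The OUI check described above can be sketched as follows. This is not the iAbacus implementation: the `KNOWN_OUIS` set is a hypothetical placeholder standing in for the IEEE registry, and the locally administered bit test is an additional heuristic that the description above does not mention (in IEEE 802 addresses, the second least significant bit of the first octet marks a locally administered address, which randomized addresses normally set).

```python
# Hypothetical placeholder entries; a real deployment would load
# the IEEE OUI registry instead.
KNOWN_OUIS = {"00:11:22"}


def parse_mac(mac):
    """Parse 'AA:BB:CC:DD:EE:FF' or 'AA-BB-CC-DD-EE-FF' into 6 bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return bytes(int(part, 16) for part in parts)


def oui_prefix(mac):
    """Return the first three octets as an upper-case 'AA:BB:CC' string."""
    return ":".join(f"{octet:02X}" for octet in parse_mac(mac)[:3])


def is_locally_administered(mac):
    """True if the locally administered (U/L) bit of the first octet is set."""
    return bool(parse_mac(mac)[0] & 0x02)


def looks_randomized(mac, known_ouis=KNOWN_OUIS):
    """Flag an address as likely random if the U/L bit is set
    or its prefix is not a listed OUI."""
    return is_locally_administered(mac) or oui_prefix(mac) not in known_ouis


if __name__ == "__main__":
    for address in ("00:11:22:33:44:55", "DA:A1:19:00:00:01"):
        print(address, looks_randomized(address))
```

Addresses flagged this way would then be passed to a de-randomization step before counting, which is outside the scope of this sketch.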
Another solution that employs Wi-Fi probe requests is presented in (Tang et al., 2018). The goal of this approach is to estimate indoor crowd density. The solution consists of a positioning algorithm based on received signal strength indicator (RSSI) fingerprints. The fingerprinting mechanism is claimed to be dynamic in order to minimize the inaccuracy of RSSI measurements. Since a person may have multiple smart devices, a multiple linear regression model is applied to compute the likelihood that a given Wi-Fi signal is generated by a person. The system is composed of a sniffer connected to a cloud server. Besides the MAC address, RSSI measurements are also collected from probe request messages. Since RSSI typically fluctuates even for the same device, the 3 highest-probability RSSI values are used to represent a device. The cloud server is responsible for fingerprint management, the positioning algorithm, crowd density estimation, and signal probability analysis to identify a person with multiple smart devices. One of the major drawbacks of the proposed solution is that it requires smart devices to be deployed at fixed positions. In the experiments, 1 fixed smartphone is employed per 10 square meters. Furthermore, when multiple fixed smart devices are used, they are required to be evenly distributed across the test area, "most of the fixed devices" are supposed to be unobstructed by objects, and each fixed device is expected to be deployed at a certain location. MAC randomization is not considered in this work.
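As an illustration of representing a device by its three most probable RSSI values, the sketch below uses the empirical frequency of each observed value as its probability. That interpretation, the function name, and the tie-breaking rule are assumptions made for this example, not details taken from (Tang et al., 2018).

```python
from collections import Counter


def rssi_signature(readings, k=3):
    """Return the k most frequently observed RSSI values (in dBm).

    Ties are broken in favor of the stronger (less negative) value so
    that the result is deterministic.
    """
    counts = Counter(readings)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], -item[0]))
    return [rssi for rssi, _ in ranked[:k]]


if __name__ == "__main__":
    observed = [-60, -60, -61, -62, -62, -62, -70]  # illustrative readings
    print(rssi_signature(observed))
```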
Another wireless connectivity-based occupancy estimation solution is presented by (Kostakos et al., 2010). The main goal of the proposed solution is to identify trip durations per passenger. The proposed approach exploits Bluetooth discovery requests to obtain the unique Bluetooth identifiers of the onboard passengers. Despite the prevalence of Bluetooth-enabled smart devices, the proposed solution requires the Bluetooth adapters to be switched on for device detection. The system is implemented and tested on actual routes with buses equipped with GPS. The obtained Bluetooth data is correlated with the localization data, considering the bus stop locations on the route. The device discovery time is used to identify the bus stop where the passenger boards. Likewise, the time when the device disappears indicates the bus stop where the passenger exits. The duration of the trip is also used to identify false device detections, such as people waiting at the bus stops. This approach assumes that Bluetooth is not switched off or on during the trip. Another drawback of this approach is the overcounting of passengers with multiple Bluetooth devices.
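The boarding and exit inference described above can be sketched as follows. The data layout, the nearest-stop-in-time matching rule, and the `min_trip_s` filter value are assumptions made for illustration; the original system correlates Bluetooth sightings with GPS data in its own way.

```python
def infer_trips(sightings, stops, min_trip_s=60):
    """Infer (boarding stop, exit stop, duration) per device.

    sightings: iterable of (device_id, timestamp_s)
    stops:     list of (stop_name, arrival_timestamp_s)
    Devices seen for less than `min_trip_s` seconds are treated as false
    detections (for example, people waiting at a stop) and dropped.
    """
    first_last = {}
    for device, t in sightings:
        first, last = first_last.get(device, (t, t))
        first_last[device] = (min(first, t), max(last, t))

    def nearest_stop(t):
        # Pick the stop whose arrival time is closest to t.
        return min(stops, key=lambda stop: abs(stop[1] - t))[0]

    trips = {}
    for device, (first, last) in first_last.items():
        duration = last - first
        if duration < min_trip_s:
            continue
        trips[device] = (nearest_stop(first), nearest_stop(last), duration)
    return trips


if __name__ == "__main__":
    stops = [("A", 0), ("B", 300), ("C", 600)]
    sightings = [("d1", 5), ("d1", 150), ("d1", 590), ("d2", 298), ("d2", 310)]
    print(infer_trips(sightings, stops))
```

Summing the devices whose inferred trip spans a given time yields an occupancy estimate, subject to the overcounting and switched-off-adapter limitations noted above.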
Accuracy in counting people is crucial for the evacuation of building occupants in case of an emergency. SmartEvacTrak (Ahmed et al., 2015) is a solution of this kind that provides occupancy counting and localization. The proposed system is composed of a mobile application for data collection and a server for data analysis. SmartEvacTrak assumes not only that occupants carry smartphones but also that certain sensors are available on the phones and that the mobile application is installed. The proposed approach also requires the deployment of permanent magnets at gates to detect entries and exits using the magnetometers and inertial sensors on occupants' phones. Localization is enabled using RSSI measurements obtained from occupants' WiFi adapters. SmartEvacTrak reaches 98% counting accuracy and 97% localization accuracy. Certain parameters, such as floor plans, access point locations, and the thickness of obstacles, must be present in the configuration repository. Strict hardware and software requirements on the occupants' side, deployment requirements on the building infrastructure, and the need for the configuration repository complicate the applicability of the proposed solution.
(Li et al., 2016) proposes an indoor crowd monitoring system based on RSSI measurements. The main idea of the proposed system is to deploy a wireless sensor network and collect WiFi RSSI through the sensor nodes. The collective data provided by the nodes indicate the location of the devices from which the probe requests are issued. When two or more nodes receive probe requests from the same device, the presented approach considers the maximum RSSI. (Li et al., 2016) considers the mobility of the occupants and handles cases in which an occupant leaves and returns to the coverage of a node. The system is tested in a lab and in classrooms at a university. The time interval between probe requests is varied in the experiments to assess its impact on the performance of the system.
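The maximum-RSSI rule used when several sensor nodes hear the same device can be sketched as below. The tuple layout and the function name are hypothetical and chosen for illustration only.

```python
def count_devices_per_node(observations):
    """Assign each device to the node that heard it with the highest RSSI.

    observations: iterable of (node_id, device_mac, rssi_dbm)
    Returns a dict mapping node_id to the number of devices assigned to it.
    """
    best = {}  # device_mac -> (node_id, strongest rssi seen so far)
    for node, mac, rssi in observations:
        if mac not in best or rssi > best[mac][1]:
            best[mac] = (node, rssi)

    counts = {}
    for node, _ in best.values():
        counts[node] = counts.get(node, 0) + 1
    return counts


if __name__ == "__main__":
    heard = [
        ("n1", "dev-a", -70), ("n2", "dev-a", -55),
        ("n1", "dev-b", -60), ("n2", "dev-b", -80),
        ("n2", "dev-c", -65),
    ]
    print(count_devices_per_node(heard))
```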
Wi-Counter (Li et al., 2015) is an indoor occupancy counting system that employs the received signal strength (RSS) of WiFi signals. The system consists of a mobile application that crowdsources RSS data from different locations and a server that trains a model on the collected data. The training phase is offline and is performed after the crowdsourcing phase. It filters noise and then applies a neural network to model the relationship between RSS and the occupancy count. The system is tested in classrooms at a university. Seven access points are used in a classroom of 96 square meters. During the online counting phase, the mean and standard deviation of the obtained RSS are provided to the neural network model to estimate the number of people in the classroom. The system reaches up to 93% accuracy even with random mobility.

Camera-based Solutions
Several different methods exist in the literature for counting people in indoor and outdoor applications. In recent years, with the development of image processing techniques, camera systems have been used for detecting and counting people.
The use of cameras in people counting systems has some disadvantages. Whether the method is applied indoors or outdoors determines the quality of the counting process. Indoors, it is possible to obtain better results than outdoors because the light intensity can be controlled. Outdoors, counting is more difficult because of uncontrolled factors such as the background, natural light changes, and weather conditions.
Another problem in detecting people with a camera is the number of individuals in the scene. As the number of individuals increases, counting becomes more difficult because people move collectively and block the camera's view of one another (Reis, 2014). Also, image processing techniques for counting people require expensive camera and processing hardware, and this hardware cannot be powered from a rechargeable battery.
According to (Aziz et al., 2011) and (Chan et al., 2008), people detection and counting using image processing techniques can be divided into three main groups: trajectory clustering, feature-based regression, and individual pedestrian detection.
An algorithm with a maximum average error of 10% has been developed for the detection and classification of moving objects in different outdoor environments using image processing (Sacchi et al., 2001). An RGB camera and a depth sensor can be used together to count people who cross a virtual line (Del Pizzo et al., 2016). Vision-based methods for counting people can be divided into two groups, namely neural-based crowd estimation and blob detection and tracking (Schlögl et al., 2001). The first method uses a trained neural network. The accuracy of this system depends on the data set used to train the neural network. Some research papers use neural-based crowd estimation (Regazzoni & Tesei, 1996). Blob detection and tracking are based on separating an object from the background (Kettnaker & Zabih, 1999; Haritaoglu et al., 2000). Blob detection has poor accuracy when the scene is crowded.
(Jin et al., 2015) presents an indoor occupancy detection solution based on the ambient CO2 level. The proposed system assumes that humans are the main source of CO2 production in the environment. Depending on the application area, this assumption can be misleading. Another factor that needs to be considered is the time needed for CO2 to accumulate in the area until it indicates the actual concentration level. The configuration of the room where the system will be used should also be studied. Depending on the ventilation, the rate of fresh incoming air can vary. Sensor locations and the locations of the air supply and exit vents are also critical to the obtained results. The approach proposed by (Jin et al., 2015) follows a sensing-by-proxy methodology using partial differential equations.
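To illustrate how a CO2 reading relates to occupancy, the sketch below uses a simple steady-state mass balance rather than the partial differential equation model of the sensing-by-proxy approach: at steady state, the CO2 removed by ventilation equals the CO2 generated by the occupants. The function name and the default per-person generation rate of 0.018 m³/h (a commonly quoted order of magnitude for a sedentary adult) are assumptions for this example, and the ventilation rate is assumed to be known.

```python
def estimate_occupants_steady_state(
    co2_indoor_ppm,
    co2_outdoor_ppm,
    ventilation_m3_per_h,
    generation_m3_per_h_per_person=0.018,  # assumed typical value
):
    """Estimate occupancy from a steady-state CO2 mass balance.

    n = Q * (C_in - C_out) / G, with concentrations converted from ppm
    to volume fractions. Valid only once the CO2 level has settled.
    """
    if ventilation_m3_per_h <= 0 or generation_m3_per_h_per_person <= 0:
        raise ValueError("ventilation and generation rates must be positive")
    excess_fraction = max(co2_indoor_ppm - co2_outdoor_ppm, 0.0) * 1e-6
    return ventilation_m3_per_h * excess_fraction / generation_m3_per_h_per_person


if __name__ == "__main__":
    # Illustrative numbers: 1000 ppm inside, 400 ppm outside, 300 m3/h of fresh air.
    print(round(estimate_occupants_steady_state(1000, 400, 300), 1))
```

Because CO2 needs time to accumulate, an estimate of this kind lags behind actual boarding and exiting events, which is one of the limitations discussed above.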

Sensor-based Solutions
Another device-free occupancy detection solution is presented in (Pan et al., 2016). The main goal of (Pan et al., 2016) is to estimate occupancy using vibration sensors, even when multiple people are monitored. The proposed system monitors ambient structural vibration and can detect the traffic of up to 4 people. However, the applicability of the proposed system can be low for mobile systems such as vehicles because of the varying levels of vibration during mobility. Although the system is proposed for indoor environments, the maximum number of people that can be detected by (Pan et al., 2016) can be limiting for most indoor applications.
(Shih & Rowe, 2015) employs ultrasonic chirps for indoor occupancy estimation. The main idea of the proposed solution is to transmit an ultrasonic chirp and analyze how the signal dissipates over time. The human body absorbs sound, so increased occupancy is expected to reduce the amplitude of the signal reflections. This approach requires a training phase to learn the characteristics of the room with a known number of occupants. Machine learning methods are employed for training. To adapt to slight changes in the room, the model must be retrained. The signal is transmitted in the ultrasonic frequency range for silent sensing. The proposed system is motivated by applications such as concert halls, and the maximum number of people that can be detected is reported as 50. Despite its advantages, the system is not designed for mobile environments. Therefore, its performance in public transportation can be lower because of the varying mobility of the platform.
Doorjamb (Hnat et al., 2012) is an indoor room-level tracking system that detects people passing through doorways. The proposed system employs ultrasonic range sensors located above the doors to identify people crossing. This solution can identify the walking direction, and by following the sequence of crossed doors, room-level tracking becomes possible. To identify individuals, the system checks the heights of the people and their movement sequences between adjacent rooms. A similar solution can be adopted to detect people boarding and exiting public transport. However, a layout with adjacent rooms may or may not exist in public transport, depending on the vehicle type and model.

RF-based Solutions
Device-free radio frequency (RF) based occupancy counting methods are useful since they do not require passengers to carry or use any device. Device-free RF methods can be classified into different groups depending on the type and utilization of the RF signal they use for the measurement and evaluation of the number of occupants: i.) RSSI, ii.) channel state information (CSI), and iii.) ultra-wideband (UWB) signals (Kouyoumdjieva et al., 2019).
RSSI-based solutions evaluate the signal power level at the receiver that originates from a radio transmitter, for example an access point (AP) of the WiFi infrastructure. In ideal propagation conditions, when line-of-sight (LOS) transmission is achieved and multipath fading and shadowing effects are negligible, the RSSI is expected to be constant over time. Otherwise, the signal strength varies over time. One of the causes is the presence of people, who can block or affect the RF transmission between the transmitter and the receiver. Accordingly, the number of people in an environment can be determined based on the RSSI values measured at the receiver node. In ideal circumstances, the RSSI value would result only from the signal strength received through LOS transmission. However, in an indoor environment, multipath propagation is expected to impact the RSSI value strongly. Thus, the measurements usually have to be re-evaluated in order to remove other factors.
The RSSI-based method for determining the number of occupants was first introduced in (Nakatsuka et al., 2008). It has been shown that the RSSI level decreases with an increasing number of occupants. The proposed model is capable of registering up to 30 occupants. In (Xu et al., 2012), a fingerprinting-based method applies a probabilistic model for occupant localization. In the first phase, the RSSI level is measured with no occupants. In the second phase, one occupant enters the room, visits each location in turn, stands in the middle of that location, and then moves randomly. The data are collected and sent to a centralized unit for further processing. The variation of the RSSI values between these two phases, also known as the RSSI footprint, is stored for the different locations and channel links, and a classifier is developed based on the RSSI values. This method is capable of mitigating errors caused by multipath in indoor environments and also helps to improve localization precision.
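The RSSI footprint idea can be illustrated with the sketch below, which stores the per-link change in RSSI relative to the empty-room baseline and matches a new measurement to the closest stored footprint. A nearest-neighbor rule is used here in place of the probabilistic classifier of (Xu et al., 2012), and all names and values are hypothetical.

```python
import math


def footprint(baseline_dbm, measured_dbm):
    """Per-link RSSI change relative to the empty-environment baseline."""
    if len(baseline_dbm) != len(measured_dbm):
        raise ValueError("baseline and measurement must cover the same links")
    return [m - b for b, m in zip(baseline_dbm, measured_dbm)]


def classify_location(baseline_dbm, measured_dbm, stored_footprints):
    """Return the location whose stored footprint is closest (Euclidean)."""
    observed = footprint(baseline_dbm, measured_dbm)
    return min(
        stored_footprints,
        key=lambda location: math.dist(observed, stored_footprints[location]),
    )


if __name__ == "__main__":
    baseline = [-50.0, -55.0, -60.0]  # three links, empty environment
    stored = {"front": [-6.0, 0.0, -1.0], "rear": [0.0, -1.0, -7.0]}
    print(classify_location(baseline, [-55.0, -55.0, -61.0], stored))
```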
CSI-based people counting methods are similar to RSSI-based methods. However, CSI provides additional information on the channel properties of wireless links, derived from the physical layer of the system. It describes the signal transmission between the transmitter and the receiver and captures the impact of multipath, shadowing, and power decay with distance. Furthermore, CSI can account for the environmental variations caused by moving objects more accurately than RSSI. Similar to RSSI-based methods, CSI-based methods rely on WLAN infrastructure that is usually available in indoor environments. CSI-based people counting methods are more applicable in environments with high mobility. It is important to note that counting occupants can be more difficult when they do not move, since fewer variations in the CSI values are expected. However, the available results from the literature show that CSI may be a more appropriate choice than RSSI for people counting systems in indoor environments (Kouyoumdjieva et al., 2019).
In the ultra-wideband (UWB) based technique, a UWB signal is transmitted, reflected by targets, and received in order to detect occupants within the radar range (Niu & Varshney, 2006). The received UWB signals can be reflected by every object in a particular environment. Thus, undesirable signals need to be detected and removed. The number of occupants can be determined either by detecting the signal waveform of each individual occupant or from the pattern of the waveform of the received signals (Choi et al., 2017).

Infection Risk in Public Transportation
The constant use of mass land transport increases the risk of transmission of the virus, as many people are placed close together. The main factors contributing to transmission are high passenger density, overcrowding in a confined space, insufficient ventilation, recirculation of polluted air, and increased exposure time to infected individuals (Tatem et al., 2006; Nasir et al., 2016). (Pestre et al., 2012) showed that flu can spread rapidly in a closed area where there is not enough air renewal. In their study, (Baker et al., 2010) showed that flu transmission can occur during modern commercial air travel. According to the study, the risk is concentrated among people sitting within two rows of infected passengers with symptoms, which is consistent with the transmission of other respiratory infections during flights.

SYSTEM ARCHITECTURE
We refer to the proposed solution as the Covid-19 Risk Assessment System with Occupancy Estimation (CORAS-OE). The CORAS-OE system architecture is composed of three main components, namely CORAS-OE-node, CORAS-OE-cloud, and CORAS-OE-mobile.
CORAS-OE-node is the sensor node unit deployed onboard the public transport vehicle to be monitored. CORAS-OE-node is equipped with an integrated sensor that combines multiple sensors to track events and monitor environmental conditions. The sensors employed on CORAS-OE-node include CO2, volatile organic compound (VOC), temperature, humidity, and light sensors. The mobility of the vehicle is also monitored in order to detect stops. CORAS-OE-node exploits WiFi probe requests to detect passengers with smart devices. Unless their WiFi network adapter is disabled, passengers on board the vehicle can be detected. The system is intended to distinguish people outside the bus and to avoid overcounting people with multiple smart devices. CORAS-OE-node is also equipped with a microcontroller for local computing and an SD card for local storage of the collected data.
CORAS-OE-node preprocesses the local data and sends it to CORAS-OE-cloud periodically. To minimize the amount of data that needs to be sent to the cloud, CORAS-OE-node applies various techniques, including filtering and data compression. CORAS-OE-node can assess the risk level of Covid-19 spread locally based on the normalized values obtained daily from CORAS-OE-cloud. However, certain data is sent to CORAS-OE-cloud for analysis and also to update the current risk level. The data transfer frequency is set dynamically. In the worst case, the risk level is updated at every stop. However, there will be no update unless the risk level changes by more than a given threshold, which is also set dynamically by CORAS-OE-cloud at the beginning of the daily service.
CORAS-OE-cloud is the cloud component of the system. CORAS-OE-cloud collects data from the CORAS-OE-nodes across the city. Data is stored as a time series. Historical data plays an important role in determining the normalized values, which can vary with seasonal changes. Other than the batch analysis, the main task of CORAS-OE-cloud is to feed CORAS-OE-mobile.
CORAS-OE-mobile is the mobile application that people can use to monitor risk levels on different routes and vehicles. CORAS-OE-mobile is not a crowdsourcing application for collecting data but only the front end of the system, which shares the results with passengers. A sample illustration of the proposed architecture can be found in Figure 1.
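The threshold-based update rule of the sensor node can be sketched as follows. This is only an illustration of the reporting logic: the class name, the numeric risk scale, and the way the threshold is passed in are hypothetical, and the computation of the risk level itself is not specified here.

```python
class RiskReporter:
    """Decides at each stop whether the node should send a risk update.

    The threshold is assumed to be provided by the cloud component at
    the beginning of the daily service.
    """

    def __init__(self, threshold):
        if threshold < 0:
            raise ValueError("threshold must be non-negative")
        self.threshold = threshold
        self.last_reported = None

    def at_stop(self, risk_level):
        """Return True if the risk level should be reported at this stop."""
        changed_enough = (
            self.last_reported is None
            or abs(risk_level - self.last_reported) > self.threshold
        )
        if changed_enough:
            self.last_reported = risk_level
        return changed_enough


if __name__ == "__main__":
    reporter = RiskReporter(threshold=0.1)
    for level in (0.30, 0.35, 0.45, 0.46, 0.20):  # illustrative risk levels
        print(level, reporter.at_stop(level))
```

Comparing against the last reported value, rather than the previous stop's value, prevents a slow drift in risk from going unreported.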

CONCLUSION
The paper proposes an efficient method for assessing air quality in the public transport system in order to examine the risk of contracting Covid-19. The introduction of the paper examines the relationship between Covid-19 and various factors affecting indoor air quality. The theoretical part of this work introduces techniques for estimating the number of occupants in public transport based on camera, device-based, and device-free wireless methods. The overall architecture of the system, consisting of sensor node units, wireless data collection schemes, and cloud storage systems, is proposed for the evaluation of air quality in transportation systems for Covid-19 risk assessment.