A dataset for Wi-Fi-based human activity recognition in line-of-sight and non-line-of-sight indoor environments

The aim of this paper is to present a dataset for Wi-Fi-based human activity recognition. The dataset comprises five experiments performed by 30 different subjects in three different indoor environments. The experiments performed in the first two environments are of a line-of-sight (LOS) nature, while the experiments performed in the third environment are of a non-line-of-sight (NLOS) nature. Each subject performed 20 trials of each experiment, which brings the overall number of recorded trials in the dataset to 3000 (30 subjects × 5 experiments × 20 trials). To record the data, we used the channel state information (CSI) tool [1] to capture the Wi-Fi packets exchanged between a Wi-Fi transmitter and receiver. The transmitter and receiver were fitted with the Intel 5300 network interface card, which enabled us to capture the CSI values contained in the recorded transmissions. Unlike other publicly available human activity datasets, this dataset allows researchers to test their methodologies in both LOS and NLOS environments, and on many different variations of human movement, such as walking, falling, turning, and picking up a pen from the ground.



Value of the Data
• The data were recorded for thirty subjects in three environments in both LOS and NLOS configurations.
• The dataset presented in this paper can be used to assess the performance of Wi-Fi-based human activity recognition (HAR) systems.
• The collected dataset can be utilized to provide insights into how different HAR systems operate when confronted with activities performed in an NLOS configuration.
• The dataset can also be used by medical alert systems to distinguish a falling incident from other regular daily activities.

Data Description
The collected raw signals were stored in one main directory that contains three subdirectories. These subdirectories comprise the data that were recorded in the aforementioned three different environments. In each of these subdirectories, the data acquired for 10 different subjects are available. Each subject performed five experiments and repeated each experiment 20 times. The total number of files in the subdirectory associated with each environment is 3000 files. Each file is associated with a specific trial and is stored as a comma-separated values data file (.csv). We divided each experiment into activities, as listed in Table 1. Each data file is named according to the format "Ex_Sy_Cz_Ai_Tk.csv", where the naming conventions of the data files are described in Table 2.
For example, a data file with the name "E1_S04_C3_A7_T17.csv" refers to data collected in the first environment for subject number 4 during his/her engagement in activity number 7 (turning) of the third experiment (walking between the transmitter and the receiver) and the trial number is 17.
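The naming convention can be decoded programmatically. The following is a minimal sketch, assuming that every index in the file name is a plain decimal number (possibly zero-padded, as in "S04"); the names `TrialInfo` and `parse_trial_name` are hypothetical helpers introduced for illustration and are not part of the dataset.

```python
import re
from typing import NamedTuple

# Matches names of the form "Ex_Sy_Cz_Ai_Tk.csv", e.g. "E1_S04_C3_A7_T17.csv".
# Assumption: each index is a decimal integer, optionally zero-padded.
_NAME_RE = re.compile(r"^E(\d+)_S(\d+)_C(\d+)_A(\d+)_T(\d+)\.csv$")


class TrialInfo(NamedTuple):
    environment: int  # x in "Ex"
    subject: int      # y in "Sy"
    experiment: int   # z in "Cz"
    activity: int     # i in "Ai"
    trial: int        # k in "Tk"


def parse_trial_name(filename: str) -> TrialInfo:
    """Split a data file name into its five numeric indices."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return TrialInfo(*(int(group) for group in match.groups()))


info = parse_trial_name("E1_S04_C3_A7_T17.csv")
print(info)
```

For the example file name above, the parsed fields are environment 1, subject 4, experiment 3, activity 7, and trial 17, which agrees with the description in the text.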
Each collected data file contains a vector of m packets recorded while performing a certain experiment trial. Each of these packets is stored in a row in the CSV file associated with the activity trial. The description of each row entry is provided in Table 3. Fig. 1 shows the recorded raw CSI signals for each of the five experiments performed by the first subject in the first environment.
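To illustrate the csi_x_y_z indexing described in Table 3, the sketch below enumerates the 90 CSI field names of one packet (one transmitting antenna, three receiving antennas, 30 subcarriers) and arranges the values of a packet into a 3 × 30 grid. It assumes that a row has already been read into a mapping from field name to a Python complex number; how the complex values are serialized in the CSV files is not specified here, so that parsing step is deliberately left out. The function names are hypothetical.

```python
N_TX = 1            # transmitting antennas (x is always 1)
N_RX = 3            # receiving antennas (y in 1..3)
N_SUBCARRIERS = 30  # CSI subcarriers per transmitter-receiver pair (z in 1..30)


def csi_field_names():
    """Return the 90 field names csi_x_y_z in (x, y, z) order."""
    return [
        f"csi_{x}_{y}_{z}"
        for x in range(1, N_TX + 1)
        for y in range(1, N_RX + 1)
        for z in range(1, N_SUBCARRIERS + 1)
    ]


def packet_to_grid(packet):
    """Arrange one packet's CSI values as grid[y - 1][z - 1].

    Assumption: `packet` maps each csi_x_y_z name to a complex number.
    """
    return [
        [packet[f"csi_1_{y}_{z}"] for z in range(1, N_SUBCARRIERS + 1)]
        for y in range(1, N_RX + 1)
    ]


# Synthetic packet used only to demonstrate the layout (not real data).
names = csi_field_names()
demo_packet = {name: complex(i, -i) for i, name in enumerate(names)}
grid = packet_to_grid(demo_packet)
print(len(names), len(grid), len(grid[0]))
```

Stacking the grids of all m packets in a file gives an m × 3 × 30 array, from which amplitude (`abs`) and phase can be derived per antenna and subcarrier.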

Subjects
In total, 30 healthy subjects (28 males and two females) voluntarily participated in the data collection process. The subjects had mean ± standard deviation age, height, and weight values of 22.7 ± 2.95 years, 178.37 ± 8.4 cm, and 81.9 ± 18.16 kg, respectively. All participating subjects received a comprehensive description of the experiments that they would perform. Table 4 shows the details of the participating subjects.

Table 3 Description of the fields contained within the structure of the captured Wi-Fi packets.

| Field | Description |
| --- | --- |
| timestamp_low | The packet's arrival time, extracted from the 1 MHz NIC clock [1]. |
| bfee-count | The number of beamforming measurements that had been recorded by the CSI tool. |
| Nrx | The number of receiving antennas. In our experiments, we used three receiving antennas. |
| Ntx | The number of transmitting antennas. In our experiments, we used only one antenna at the transmitter side. |
| RSSI_a | The RSSI over the first receiver antenna, measured in dB. |
| RSSI_b | The RSSI over the second receiver antenna, measured in dB. |
| RSSI_c | The RSSI over the third receiver antenna, measured in dB. |
| noise | The channel noise, measured in dB. |
| agc | The automatic gain control value that is required to transform the measured RSSI value from dB to dBm. |
| perm | The order of the received signals relative to the receiver antennas. For example, [3 1 2] indicates that the first antenna received the signal that belongs to the third RF chain, the second antenna received the signal from the first RF chain, and the third antenna received the signal from the second RF chain. |
| rate | The rate at which the packets are received, measured in packets/second. |
| CSI | The CSI values, each presented as a complex number. In our dataset, 90 CSI values are associated with each received packet. Each CSI value is indexed as csi_x_y_z, where x refers to the transmitting antenna, y refers to the receiving antenna, and z refers to the CSI subcarrier. Since we have only one transmitting antenna, the value of x is always 1. The value of y is between 1 and 3 since we have three receiving antennas, and the value of z is between 1 and 30 since each transmitter-receiver pair has 30 CSI subcarriers. |

Experimental procedure
Each subject was asked to perform five experiments. The steps required to successfully perform the experiments were explained before the beginning of the experiments. To collect several instances of the same experiment, the participating subjects were asked to repeat each experiment for 20 trials.
To ensure accurate performance of the planned experiments, five timing diagrams were designed. These timing diagrams describe how to perform the different experiments and which activities are involved in each of them. Furthermore, to inform the subject of when to start a certain activity within the current experiment, a series of programmed beep sounds was used. In particular, three beep sounds were used as follows: (1) a short beep indicated the start of the experiment trial; (2) a medium beep indicated the end of an activity and the beginning of the next activity; and (3) a long beep indicated the end of the experiment. Table 5 shows the timing diagrams associated with each of the different experiments. All the numbers on the timing diagrams are in seconds. Before performing the experiments, the timing diagrams were thoroughly explained to the subjects, and any questions the subjects had about the data collection process were answered. Each subject was asked to follow these pre-set timing sequences so that the timing was uniform across all subjects who participated in the study.
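Because the timing diagrams are expressed in seconds, activity boundaries can be mapped to nominal packet positions within a trial using the sampling rate of 320 packets/second reported for the recordings. The sketch below shows this conversion; the boundary times in the example are hypothetical placeholders and are not taken from Table 5, and the function name is introduced only for illustration.

```python
PACKETS_PER_SECOND = 320  # sampling rate reported for the recordings


def activity_packet_ranges(boundaries_s, rate=PACKETS_PER_SECOND):
    """Convert beep times (seconds) into nominal [start, stop) packet ranges.

    `boundaries_s` lists the start beep, each activity-change beep, and the
    end beep, in increasing order.
    """
    times = list(boundaries_s)
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("boundary times must be strictly increasing")
    indices = [round(t * rate) for t in times]
    return list(zip(indices[:-1], indices[1:]))


# Hypothetical timing: three activities lasting 5 s, 3 s, and 7 s.
ranges = activity_packet_ranges([0, 5, 8, 15])
print(ranges)  # [(0, 1600), (1600, 2560), (2560, 4800)]
```

These ranges are nominal only: lost packets or timing jitter can shift the true positions, so the timestamp_low field of each packet is likely a more reliable basis for precise alignment.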

Software and equipment
Two desktop computers were equipped with the Intel 5300 NIC to transmit and capture the Wi-Fi packets. The CSI tool [1] was used to capture and process the transmitted packets. Fig. 2 shows the Intel 5300 NIC. Both network cards were configured to operate in the 2.4 GHz band and to use the third channel with a channel bandwidth of 20 MHz. The sampling rate was set to 320 packets/second and the packet size was set to 1 byte. The data packets were transmitted in injection mode based on the LORCON code [2]. Both NICs were configured to operate according to the 802.11n standard, which relies on the Orthogonal Frequency Division Multiplexing (OFDM) modulation scheme that enables efficient data transmission over multiple channels [3, 4]. The transmitting NIC is equipped with a single antenna, while the receiving NIC is equipped with three antennas that capture the Wi-Fi packets. With this configuration, a Multi-Input Multi-Output (MIMO) system consisting of 1 × 3 Wi-Fi streams was used. The CSI tool used to capture the Wi-Fi signals at the receiving NIC can capture 30 CSI subcarriers for each stream. In other words, we captured 3 × 30 CSI subcarriers.
Table 5 The timing diagram for each of the five experiments. A sound icon is used to mark the locations of the added beep sounds.
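For the radio configuration described in this section (2.4 GHz band, channel 3, 20 MHz bandwidth, 30 reported subcarriers per stream), the nominal frequency of each reported subcarrier can be estimated as sketched below. The channel centre frequency follows the standard 2.4 GHz channel numbering, and 312.5 kHz is the 802.11n OFDM subcarrier spacing. The list of subcarrier indices is an assumption: it is the grouped 20 MHz index set commonly associated with CSI reports of the Intel 5300, and it is not stated in this paper.

```python
CHANNEL = 3
# 2.4 GHz band: channel n is centred at 2407 MHz + 5 MHz * n.
CENTER_HZ = 2_407_000_000 + 5_000_000 * CHANNEL
SPACING_HZ = 312_500  # 802.11n OFDM subcarrier spacing

# Assumption (not from the paper): grouped subcarrier indices for 20 MHz,
# i.e. -28, -26, ..., -2, -1, 1, 3, ..., 27, 28 (30 indices in total).
SUBCARRIER_INDICES = (
    list(range(-28, 0, 2)) + [-1] + list(range(1, 28, 2)) + [28]
)


def subcarrier_frequencies_hz():
    """Nominal frequency in Hz of each reported CSI subcarrier."""
    return [CENTER_HZ + k * SPACING_HZ for k in SUBCARRIER_INDICES]


freqs = subcarrier_frequencies_hz()
print(len(freqs), freqs[0] / 1e6, freqs[-1] / 1e6)
```

Under these assumptions, the 30 reported subcarriers span roughly 2413.25 MHz to 2430.75 MHz around the 2422 MHz channel centre.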
In the first environment, we captured the Wi-Fi signals in our research laboratory. The dimensions of the laboratory are 4.7 m × 4.7 m. The transmitter and the receiver were placed 3.7 m apart. Fig. 3a shows a sketch of the environment, while Fig. 3b shows an image of this environment. All subjects were instructed to perform the pre-explained activities midway between the transmitter and the receiver.
In the second environment, the Wi-Fi signals were captured in a university hallway of dimensions 7.95 m × 3.6 m. In this environment, the transmitter and the receiver were placed 7.6 m apart. Fig. 4a shows a sketch of this environment, while Fig. 4b shows an actual image of the environment. As before, the subjects were instructed to perform their pre-explained activities near the centre between the transmitter and the receiver.
The experiments performed in the third environment differ from those carried out in the first and second environments in that there was a barrier between the subject performing the experiment and the device capturing the Wi-Fi signals. In other words, the transmitter and the receiver were in an NLOS configuration. Specifically, in the third environment, a wooden wall with a thickness of 8 cm stood between the transmitter and the receiver. The transmitter was placed outside the room, while the receiver was placed inside the room. The distance between the transmitter and the receiver was fixed at 5.44 m. An illustration and an image of the third environment are provided in Fig. 5.

Ethics Statement
The experimental procedure was performed according to the Declaration of Helsinki and approved by the Institutional Review Board (IRB) office at the Jordan University of Science and Technology. Before performing any of the experiments, each subject was asked to sign a consent form, which informed them that their personal information would not be disclosed and that they had the right to stop participating in any of the experiments at any time if they chose to do so.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.