EyeFi: Fast Human Identification Through Vision and WiFi-based Trajectory Matching

doi:10.5281/zenodo.3882104

Published June 6, 2020 | Version 1.0.0

1. University of North Carolina Chapel Hill
2. Bosch Research and Technology Center

EyeFi Dataset

This dataset was collected as part of the EyeFi project at Bosch Research and Technology Center, Pittsburgh, PA, USA. It contains WiFi CSI values of human motion trajectories along with ground truth location information captured through a camera. The dataset is used in the paper "EyeFi: Fast Human Identification Through Vision and WiFi-based Trajectory Matching", published in the IEEE International Conference on Distributed Computing in Sensor Systems 2020 (DCOSS '20). We also published a dataset paper titled "Dataset: Person Tracking and Identification using Cameras and Wi-Fi Channel State Information (CSI) from Smartphones" in the Data: Acquisition to Analysis 2020 (DATA '20) workshop, which describes the data collection in detail. Please check it out for more information on the dataset.

Clarification/Bug report: Please note that the order of antennas and subcarriers in the .h5 files is not described clearly in the README.md file. For the 90 `csi_real` and `csi_imag` values, the order of antennas and subcarriers is: [subcarrier1-antenna1, subcarrier1-antenna2, subcarrier1-antenna3, subcarrier2-antenna1, subcarrier2-antenna2, subcarrier2-antenna3,… subcarrier30-antenna1, subcarrier30-antenna2, subcarrier30-antenna3]. Please see the description below. The newer version of the dataset contains this information in README.md. We are sorry for the inconvenience.

Data Collection Setup

In our experiments, we used an Intel 5300 WiFi Network Interface Card (NIC) installed in an Intel NUC and Linux CSI tools [1] to extract the WiFi CSI packets. The (x,y) coordinates of the subjects are collected from a Bosch Flexidome IP Panoramic 7000 camera mounted on the ceiling, and the Angles of Arrival (AoAs) are derived from the (x,y) coordinates. The WiFi card and the camera are located at the same origin coordinates but at different heights: the camera is about 2.85m above the ground and the WiFi antennas are about 1.12m above the ground.
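As a rough illustration of how an angle can be derived from an (x,y) position when the sensor sits at the origin, here is a minimal sketch. The function name `planar_angle_deg` is hypothetical, and the convention used (degrees, counterclockwise from the +x axis) is an assumption: this description does not state the axis orientation or sign convention of the dataset, so the result should be compared against the provided `cam_aoa` values before relying on it.

```python
import math


def planar_angle_deg(x, y):
    """Angle of the point (x, y) as seen from the shared origin, in degrees.

    Assumption: the angle is measured counterclockwise from the +x axis.
    The dataset's real axis orientation and sign convention may differ.
    The height difference between camera and antennas is ignored here.
    """
    return math.degrees(math.atan2(y, x))


# Example with made-up coordinates (not taken from the dataset).
print(planar_angle_deg(1.0, 1.0))
```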

The data collection environment consists of two areas. The first is a rectangular space measuring 11.8m x 8.74m, and the second is an irregularly shaped kitchen area with maximum distances of 19.74m and 14.24m between two walls. The kitchen also has numerous obstacles and materials with different RF reflection characteristics, including strong reflectors such as metal refrigerators and dishwashers.

To collect the WiFi data, we used a Google Pixel 2 XL smartphone as an access point and connected the Intel 5300 NIC to it for WiFi communication. The transmission rate is about 20-25 packets per second. The same WiFi card and phone are used in both the lab and the kitchen area.

List of Files
Here is a list of files included in the dataset:

|- 1_person
    |- 1_person_1.h5
    |- 1_person_2.h5
|- 2_people
    |- 2_people_1.h5
    |- 2_people_2.h5
    |- 2_people_3.h5
|- 3_people
    |- 3_people_1.h5
    |- 3_people_2.h5
    |- 3_people_3.h5
|- 5_people
    |- 5_people_1.h5
    |- 5_people_2.h5
    |- 5_people_3.h5
    |- 5_people_4.h5
|- 10_people
    |- 10_people_1.h5
    |- 10_people_2.h5
    |- 10_people_3.h5
|- Kitchen
    |- 1_person
        |- kitchen_1_person_1.h5
        |- kitchen_1_person_2.h5
        |- kitchen_1_person_3.h5
    |- 3_people
        |- kitchen_3_people_1.h5
|- training
    |- shuffuled_train.h5
    |- shuffuled_valid.h5
    |- shuffuled_test.h5
View-Dataset-Example.ipynb
README.md

In this dataset, the folders `1_person/`, `2_people/`, `3_people/`, `5_people/`, and `10_people/` contain data collected in the lab area, whereas the `Kitchen/` folder contains data collected in the kitchen area. To see how each file is structured, please see the section Access the data below.

The `training/` folder contains the dataset we used to train the neural network discussed in our paper. Its files are generated by shuffling all the data from the `1_person/` folder collected in the lab area (`1_person_1.h5` and `1_person_2.h5`).

Why multiple files in one folder?

Each folder contains multiple files. For example, the `1_person` folder has two files: `1_person_1.h5` and `1_person_2.h5`. Files in the same folder always have the same number of human subjects present simultaneously in the scene. However, the person holding the phone can differ between files. In addition, the data may have been collected on different days, and the data collection system sometimes had to be rebooted due to stability issues. As a result, we provide separate files (like `1_person_1.h5`, `1_person_2.h5`) to distinguish the different people holding the phone and the system reboots, which introduce different phase offsets (see below) in the system.

Special note:

`1_person_1.h5` is generated with the same person holding the phone throughout, whereas `1_person_2.h5` contains different people holding the phone, with only one person present in the area at a time. The two files were also collected on different days.

Access the data
An HDF5 library is needed to open the dataset. A free HDF5 viewer is available on the official website: https://www.hdfgroup.org/downloads/hdfview/. We also provide an example Python notebook, View-Dataset-Example.ipynb, to demonstrate how to access the data.

Each file, except those under the `training/` folder, is structured as follows:

|- csi_imag
|- csi_real
|- nPaths_1
    |- offset_00
        |- spotfi_aoa
    |- offset_11
        |- spotfi_aoa
    |- offset_12
        |- spotfi_aoa
    |- offset_21
        |- spotfi_aoa
    |- offset_22
        |- spotfi_aoa
|- nPaths_2
    |- offset_00
        |- spotfi_aoa
    |- offset_11
        |- spotfi_aoa
    |- offset_12
        |- spotfi_aoa
    |- offset_21
        |- spotfi_aoa
    |- offset_22
        |- spotfi_aoa
|- nPaths_3
    |- offset_00
        |- spotfi_aoa
    |- offset_11
        |- spotfi_aoa
    |- offset_12
        |- spotfi_aoa
    |- offset_21
        |- spotfi_aoa
    |- offset_22
        |- spotfi_aoa
|- nPaths_4
    |- offset_00
        |- spotfi_aoa
    |- offset_11
        |- spotfi_aoa
    |- offset_12
        |- spotfi_aoa
    |- offset_21
        |- spotfi_aoa
    |- offset_22
        |- spotfi_aoa
|- num_obj
|- obj_0
    |- cam_aoa
    |- coordinates
|- obj_1
    |- cam_aoa
    |- coordinates
...
|- timestamp

`csi_real` and `csi_imag` are the real and imaginary parts of the CSI measurements. For the 90 `csi_real` and `csi_imag` values, the order of antennas and subcarriers is: [subcarrier1-antenna1, subcarrier1-antenna2, subcarrier1-antenna3, subcarrier2-antenna1, subcarrier2-antenna2, subcarrier2-antenna3,… subcarrier30-antenna1, subcarrier30-antenna2, subcarrier30-antenna3]. The `nPaths_x` groups contain the WiFi Angle of Arrival (AoA) calculated by SpotFi [2], with `x` multipaths specified during the calculation. Under each `nPaths_x` group are `offset_xx` subgroups, where `xx` stands for the offset combination used to correct the phase offset during the SpotFi calculation. We measured the offsets as:

| Antennas | Offset 1 (rad) | Offset 2 (rad) |
|:--------:|:--------------:|:--------------:|
|  1 & 2   |     1.1899     |    -2.0071     |
|  1 & 3   |     1.3883     |    -1.8129     |

The measurement is based on the work in [3], where the authors state that there are two possible phase offsets between two antennas; we measured them by booting the device multiple times. The combination of offsets is used for the `offset_xx` naming. For example, `offset_12` means that offset 1 between antennas 1 & 2 and offset 2 between antennas 1 & 3 are used in the SpotFi calculation.
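The naming rule can be made concrete with a small lookup. This is only a sketch: `OFFSETS_RAD` and `decode_offset_group` are hypothetical names, the values are copied from the table above, and the digit `0` (as in `offset_00`) is not explained in this description, so the sketch returns `None` for it rather than guessing its meaning.

```python
# Measured phase offsets in radians, copied from the table above.
OFFSETS_RAD = {
    "1&2": {1: 1.1899, 2: -2.0071},
    "1&3": {1: 1.3883, 2: -1.8129},
}


def decode_offset_group(name):
    """Map a group name like 'offset_12' to (offset 1&2, offset 1&3) in radians.

    The first digit selects the offset between antennas 1 & 2, the second
    digit the offset between antennas 1 & 3. Digit 0 is not described in
    the README, so None is returned for it (an assumption of this sketch).
    """
    prefix, _, digits = name.partition("_")
    if prefix != "offset" or len(digits) != 2 or any(d not in "012" for d in digits):
        raise ValueError(f"unexpected group name: {name!r}")
    first, second = int(digits[0]), int(digits[1])
    return OFFSETS_RAD["1&2"].get(first), OFFSETS_RAD["1&3"].get(second)


print(decode_offset_group("offset_12"))
```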

The `num_obj` field stores the number of human subjects present in the scene. `obj_0` is always the subject holding the phone. Each file contains `num_obj` groups named `obj_x`. For each `obj_x`, we provide the `coordinates` reported by the camera and `cam_aoa`, the AoA estimated from the camera-reported coordinates. The (x,y) coordinates and AoAs are chronologically ordered (except in the files in the `training` folder). They reflect how the person carrying the phone moved through the space (for `obj_0`) and how everyone else walked (for the other `obj_y`, where `y` > 0).
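A minimal sketch of walking over all subjects in one file follows. The helper `load_subjects` is a hypothetical name, and the shape of `num_obj` is not stated here, so the sketch simply takes its first element. It accepts any nested mapping, so an open `h5py.File` should work the same way as the stand-in dictionary used in the example, whose values are made up.

```python
import numpy as np


def load_subjects(data):
    """Return one dict per subject, with obj_0 (the phone holder) first."""
    # Assumption: num_obj holds a single count (scalar or 1-element array).
    num_obj = int(np.asarray(data["num_obj"]).ravel()[0])
    subjects = []
    for i in range(num_obj):
        group = data[f"obj_{i}"]
        subjects.append({
            "coordinates": np.asarray(group["coordinates"]),
            "cam_aoa": np.asarray(group["cam_aoa"]),
        })
    return subjects


# Stand-in for an opened .h5 file; the numbers are illustrative only.
fake_file = {
    "num_obj": np.array([2]),
    "obj_0": {"coordinates": np.zeros((4, 2)), "cam_aoa": np.zeros(4)},
    "obj_1": {"coordinates": np.ones((4, 2)), "cam_aoa": np.ones(4)},
}
subjects = load_subjects(fake_file)
print(len(subjects), subjects[0]["coordinates"].shape)
```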

The `timestamp` field provides a time reference for each WiFi packet.
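One use of the timestamps is a sanity check of the packet rate against the 20-25 packets per second mentioned in the setup description. The sketch below is an illustration only: `mean_packet_rate` is a hypothetical helper, and because the timestamp unit is not stated here, the caller has to supply `ticks_per_second` (1.0 would mean the timestamps are already in seconds). It expects a flat, chronologically ordered sequence.

```python
def mean_packet_rate(timestamps, ticks_per_second=1.0):
    """Average packets per second over a chronologically ordered recording.

    ticks_per_second converts the timestamp unit to seconds; the unit used
    in the dataset is an assumption the caller must verify.
    """
    ts = [float(t) for t in timestamps]
    if len(ts) < 2:
        raise ValueError("need at least two timestamps")
    duration_s = (ts[-1] - ts[0]) / ticks_per_second
    if duration_s <= 0:
        raise ValueError("timestamps must span a positive duration")
    # N timestamps delimit N - 1 inter-packet intervals.
    return (len(ts) - 1) / duration_s


# Made-up timestamps spaced 50 ms apart, expressed in seconds.
print(mean_packet_rate([0.0, 0.05, 0.10, 0.15, 0.20]))
```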

To access the data (Python):

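import h5py

# Open one recording read-only; indexing with [()] loads a whole dataset into memory.
with h5py.File('3_people_3.h5', 'r') as data:
    csi_real = data['csi_real'][()]
    csi_imag = data['csi_imag'][()]

    # obj_0 is always the subject holding the phone.
    cam_aoa = data['obj_0/cam_aoa'][()]
    cam_loc = data['obj_0/coordinates'][()]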
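Given the antenna and subcarrier order described above, the real and imaginary parts can be combined and reshaped so that subcarriers and antennas become separate axes. This is a sketch under the assumption that each array has one row of 90 values per packet; `to_complex_csi` is a hypothetical helper, and the example uses synthetic numbers rather than dataset values.

```python
import numpy as np


def to_complex_csi(csi_real, csi_imag, n_subcarriers=30, n_antennas=3):
    """Combine real/imag parts into an array indexed [packet, subcarrier, antenna].

    Assumes 90 values per packet ordered subcarrier-major, with the antenna
    index changing fastest: [sc1-ant1, sc1-ant2, sc1-ant3, sc2-ant1, ...].
    """
    real = np.asarray(csi_real, dtype=np.float64)
    imag = np.asarray(csi_imag, dtype=np.float64)
    csi = real + 1j * imag
    # C-order reshape keeps the antenna index as the fastest-varying axis.
    return csi.reshape(-1, n_subcarriers, n_antennas)


# Synthetic stand-in: 2 packets whose real part is just the flat index.
fake_real = np.arange(180).reshape(2, 90)
fake_imag = np.zeros((2, 90))
csi = to_complex_csi(fake_real, fake_imag)
print(csi.shape)
```

With this layout, `csi[:, :, 0]` would hold all subcarriers of antenna 1, and `np.angle(csi)` gives the per-element phase.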

For files inside the `training/` folder:

Files inside the `training/` folder have a different data structure:


|- nPath-1
    |- aoa
    |- csi_imag
    |- csi_real
    |- spotfi
|- nPath-2
    |- aoa
    |- csi_imag
    |- csi_real
    |- spotfi
|- nPath-3
    |- aoa
    |- csi_imag
    |- csi_real
    |- spotfi
|- nPath-4
    |- aoa
    |- csi_imag
    |- csi_real
    |- spotfi

The group `nPath-x` indicates the number of multiple paths specified during the SpotFi calculation. `aoa` is the camera-generated angle of arrival (AoA), which can be considered ground truth; `csi_imag` and `csi_real` are the imaginary and real components of the CSI values; and `spotfi` contains the SpotFi-calculated AoA values. The SpotFi values are chosen based on the lowest median and mean error across `1_person_1.h5` and `1_person_2.h5`. All rows under the same `nPath-x` group are aligned (i.e., the first row of `aoa` corresponds to the first row of `csi_imag`, `csi_real`, and `spotfi`). There is no timestamp recorded, and the data are not in chronological order because they are randomly shuffled from the `1_person_1.h5` and `1_person_2.h5` files.
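Because the rows are aligned, the SpotFi estimates can be compared directly with the camera AoA. The sketch below computes the mean and median absolute error, the two statistics mentioned above. It is not the authors' selection procedure: `aoa_error_stats` is a hypothetical helper, it assumes one AoA value per row in the same unit for both inputs, and it does not handle angle wrap-around. The example numbers are made up.

```python
import numpy as np


def aoa_error_stats(cam_aoa, spotfi_aoa):
    """Return (mean, median) absolute error between aligned AoA arrays."""
    cam = np.asarray(cam_aoa, dtype=np.float64).ravel()
    est = np.asarray(spotfi_aoa, dtype=np.float64).ravel()
    if cam.shape != est.shape:
        raise ValueError("aoa and spotfi must have the same number of rows")
    abs_err = np.abs(est - cam)
    return float(abs_err.mean()), float(np.median(abs_err))


# Illustrative values only, not taken from the dataset.
mean_err, median_err = aoa_error_stats([10.0, 20.0, 30.0], [12.0, 17.0, 30.0])
print(mean_err, median_err)
```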

Citation
If you use the dataset, please cite our paper:

@inproceedings{eyefi2020,
  title={EyeFi: Fast Human Identification Through Vision and WiFi-based Trajectory Matching},
  author={Fang, Shiwei and Islam, Tamzeed and Munir, Sirajum and Nirjon, Shahriar},
  booktitle={2020 IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS)},
  year={2020},
  organization={IEEE}
}

Thanks!

References

1. Halperin, Daniel, et al. "Tool release: Gathering 802.11n traces with channel state information." ACM SIGCOMM Computer Communication Review 41.1 (2011): 53-53.

2. Kotaru, Manikanta, et al. "SpotFi: Decimeter level localization using WiFi." Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication. 2015.

3. Zhang, Dongheng, et al. "Calibrating Phase Offsets for Commodity WiFi." IEEE Systems Journal (2019).

Files

EyeFi_Dataset.zip (1.5 GB), md5:9856f245dfc157079b08e13ae368bc89
