Sleep Apnea Monitoring System Based on Commodity WiFi Devices

Abstract: To address the limitations of traditional sleep monitoring methods, which rely heavily on sleeping posture and do not consider sleep apnea, an intelligent apnea monitoring system based on commodity WiFi is designed in this paper. By utilizing linear fitting and wavelet transform, the phase error of the channel state information (CSI) at the receiving antenna is eliminated, and the noise in the signal amplitude is removed. Moreover, the short-time Fourier transform (STFT) and a sliding window method are combined to segment the received wireless signals. Finally, several important statistical characteristics are extracted, and a back propagation (BP) neural network model is built to identify the apnea state. Thus, interference caused by changes of sleeping posture is eliminated. Extensive experimental results demonstrate that the proposed system can identify the apnea state with an accuracy of over 95.6%. Furthermore, the accuracy can still reach more than 94.8% when the layout of the test environment is changed. Therefore, the proposed system can be used as a daily apnea monitoring system at home and provide users with health information.


Introduction
In recent years, researchers have proposed several methods to monitor people's health through recognizing human behaviors [1][2][3][4]. As humans spend about one-third of their life sleeping, detecting sleep quality is particularly significant. Traditional monitoring methods of sleep quality mainly focus on detecting brain waves [5] or heartbeat [6]. However, these methods do not work well, as breathing also plays a very important role in sleep. Some common commercial devices analyze sleep quality from data collected by a headset or wristband worn by the user. Alternatively, some devices use pressure sensor arrays embedded in the mattress to monitor sleeping disorders. These devices still have some shortcomings, such as being uncomfortable to wear, unsuitable for daily use, or quite expensive.
Radar has also been used for breath monitoring. One radar-based system reconstructs chest and abdomen movements by applying a phase demodulation algorithm to the phase changes of the continuous-wave signal sent by a 2.4 GHz directional antenna. Its main disadvantages are that it requires customized hardware and takes a long time to scan the experimenter. If the experimenter moves frequently during scanning, the scanning process is further affected. The Vital-Radio system proposed by Adib et al. [8] uses a bandwidth of 5.46-7.25 GHz to track the user's breathing and heart rate, which not only requires additional customized hardware but also has a high operating frequency.
Radio frequency identification (RFID) tags have the advantages of simple structure and reading equipment, so they are often used for vital sign monitoring [9]. The Tagbreathe [10] system attaches a lightweight RFID tag to the user's clothes to backscatter radio waves. By analyzing the low-level data obtained by the RFID reader, it can track the periodic movements of the chest and abdomen caused by inhalation and exhalation. In addition, it utilizes phase information to estimate the respiratory frequency. However, the security of RFID technology is relatively poor, so tag information can easily be read illegally or tampered with maliciously.
Because WiFi devices are ubiquitous and WiFi signals have almost no effect on the human body, researchers are interested in studying various WiFi-based applications, such as localization [11][12][13], target detection [14], behavior recognition [15], and industrial Internet of Things (IIoT) [16,17]. Ubibreathe, proposed by Abdelnasser et al. [18], utilizes the received signal strength indication (RSSI) to estimate breathing. The received RSSI signal is band-pass filtered with cut-off frequencies of 0.1 and 0.5 Hz. The discrete wavelet transform is used to remove the noise in the signal, and the fast Fourier transform is used to estimate the respiratory frequency. However, this system is accurate only when the WiFi device is very close to the chest. Compared with RSSI, channel state information (CSI) not only contains the amplitude of each subcarrier but also provides its phase. Since CSI has finer granularity, it offers the potential for more accurate breath detection. Liu et al. [19] developed a technique using CSI amplitude and phase difference to capture the tiny movements caused by breathing and heartbeat, from which the respiratory frequency is estimated. Phasebeat, proposed by Wang et al. [20], uses wavelet transform to reconstruct the respiratory signal and heart rate signal to monitor vital signs. TR-BREATH, designed by Chen et al. [21], projects the acquired CSI into the time-reversed resonance intensity feature space and uses Root-MUSIC and other algorithms to analyze the time-reversed resonance intensity and estimate the respiratory frequency. Mo-Sleep, introduced by Li et al. [22], eliminates the influence of phase error on breathing and uses principal component analysis (PCA) to extract breathing signals.
However, the above-mentioned detection systems have some limitations, such as requiring additional hardware equipment, offering poor privacy, being affected by light intensity, and failing to consider various situations that may occur in actual sleep. Hence, in this study, a new intelligent system is proposed for daily sleep monitoring. By setting two commercial WiFi devices as the transmitter and receiver, a WiFi-based detection system is formed to detect the sleep state of the experimenters. First, the received CSI phase information is corrected. Second, the noise, interference, and outliers of the corrected CSI information are eliminated. Next, a signal segmentation algorithm is designed, and the subcarriers with the most obvious motion changes are selected. Then the CSI signal amplitude is normalized. The time-domain and frequency-domain information is utilized to segment the signal. Finally, the segmented fragments are used to extract features and construct a classifier for detection and recognition. The proposed system does not require the experimenter to carry any equipment, nor does it need to modify the relevant hardware to detect breathing. It is suitable for daily use and features a low cost and excellent privacy.

System Overview
To simulate the breathing states that may occur during sleep, the experimenter is required to breathe normally in the test area to simulate normal breathing, hold their breath to simulate the apnea state, and get up and leave the test area to simulate the unmanned state. The proposed WiFi-based sleep-state detection process can be divided into two phases, offline and online, as shown in Fig. 2. In the offline phase, the CSI data of the unmanned state, breathing state, and apnea state are collected separately. Then, the data are preprocessed to eliminate phase errors and noise interference from the environment. After that, the optimal subcarrier is selected. Finally, the features of the three states are extracted to construct a detection classifier. In the online phase, data are collected and preprocessed, and signal features are extracted in the same way. The classifier constructed in the offline phase is then used to detect the sleep state, and the time node of each state is recorded.

Channel State Information
In IEEE 802.11n/ac, orthogonal frequency division multiplexing (OFDM) is used, and CSI is measured and analyzed at the physical layer. In the frequency domain, the amplitude and phase of the subcarriers can be used to characterize the wireless channel state. To characterize multipath propagation, wireless channels are usually modeled with the channel impulse response (CIR), which can be expressed as:
$$h(t) = \sum_{l=1}^{L} \alpha_l \, \delta(t - \tau_l) \tag{1}$$
where $L$ represents the total number of propagation paths, $\delta(t)$ represents the Dirac delta function, and $\alpha_l$ and $\tau_l$ are the amplitude attenuation and time delay of the $l$-th path, respectively. Since multipath transmission shows frequency-selective fading in the frequency domain, the channel can also be characterized by the channel frequency response (CFR), denoted as $H(f)$:
$$H(f) = \sum_{l=1}^{L} \alpha_l \, e^{-j 2 \pi f \tau_l} \tag{2}$$
where $f$ is the frequency. In the time domain, the received signal $y(t)$ is the convolution of the transmitted signal $s(t)$ and $h(t)$:
$$y(t) = s(t) * h(t) \tag{3}$$
Similarly, in the frequency domain, the received signal spectrum $Y(f)$ is the product of the transmitted signal spectrum $S(f)$ and $H(f)$:
$$Y(f) = S(f) \, H(f) \tag{4}$$
CSI is the sampling of the CFR. Assuming that there are $K$ subcarriers on one antenna and $M$ packets are received, the CSI can be expressed as a matrix:
$$\mathbf{CSI} = \begin{bmatrix} csi_{1,1} & csi_{1,2} & \cdots & csi_{1,M} \\ csi_{2,1} & csi_{2,2} & \cdots & csi_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ csi_{K,1} & csi_{K,2} & \cdots & csi_{K,M} \end{bmatrix} \tag{5}$$
where $csi_{k,m}$ represents the sum of all paths of the $k$-th subcarrier in the $m$-th data packet.
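As a minimal sketch of how one CSI sample relates to the multipath model, the following Python snippet evaluates the channel frequency response at a set of subcarrier frequencies, assuming the standard multipath form $H(f) = \sum_l \alpha_l e^{-j 2 \pi f \tau_l}$. The function name `cfr_from_paths`, the 30 subcarrier frequencies, and the three path gains and delays are illustrative assumptions, not values taken from the experiments.

```python
import numpy as np

def cfr_from_paths(alphas, taus, freqs):
    """Evaluate H(f) = sum_l alpha_l * exp(-j*2*pi*f*tau_l) at every frequency in freqs."""
    alphas = np.asarray(alphas, dtype=float)
    taus = np.asarray(taus, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Exponent matrix of shape (K, L): one row per subcarrier, one column per path.
    exponent = -2j * np.pi * freqs[:, None] * taus[None, :]
    return (alphas[None, :] * np.exp(exponent)).sum(axis=1)

# Hypothetical setup: 30 subcarriers across a 20 MHz band (baseband offsets), 3 paths.
freqs = np.linspace(-10e6, 10e6, 30)
alphas = [1.0, 0.5, 0.25]        # assumed path attenuations
taus = [10e-9, 35e-9, 60e-9]     # assumed path delays in seconds

csi_packet = cfr_from_paths(alphas, taus, freqs)  # one column of the CSI matrix
amplitude = np.abs(csi_packet)                    # per-subcarrier amplitude
phase = np.angle(csi_packet)                      # per-subcarrier phase
```

Stacking one such vector per received packet gives a K-by-M matrix of the kind described above; a change in any path delay, for example from chest movement, changes both the amplitude and the phase of each entry.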

Phase Error Elimination
In the test environment, the clocks of the sender and the receiver in the WiFi network are not synchronized, and the packet detection delay and the limited accuracy of the devices cause problems such as carrier frequency offset and sampling frequency offset. Therefore, the received CSI phase needs to be processed. The received phase consists of the true phase value and an offset value, which varies linearly with the subcarrier index $k$:
$$\hat{\varphi}_k = \varphi_k + k \, \eta + \rho \tag{6}$$
where $\hat{\varphi}_k$ and $\varphi_k$ represent the measured phase and true phase of the $k$-th subcarrier, respectively, $\eta$ represents the phase offset, and $\rho$ represents the constant error. Here, the phase error is eliminated by the linear fitting method [22], and the comparative result is shown in Fig. 3. As shown in the figure, the phase is no longer scattered and is more stable after calibration. Although the processed phase is not exactly equal to the true phase, it is close enough that the residual error can be ignored.
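The linear fitting step can be illustrated with a short Python sketch. It assumes that the phase error is a straight line across the subcarrier index (a slope plus a constant) and removes a least-squares line from the unwrapped phase; the exact fitting rule of [22] may differ, and the function name `calibrate_phase`, the index range, and the synthetic phases are hypothetical.

```python
import numpy as np

def calibrate_phase(measured_phase, subcarrier_index):
    """Remove a fitted linear trend (slope * k + intercept) from the unwrapped phase."""
    unwrapped = np.unwrap(measured_phase)  # undo 2*pi jumps across subcarriers
    slope, intercept = np.polyfit(subcarrier_index, unwrapped, 1)
    return unwrapped - (slope * subcarrier_index + intercept)

# Hypothetical data: 30 subcarrier indices and a small "true" phase profile.
k = np.arange(-15, 15)
true_phase = 0.3 * np.sin(2 * np.pi * k / 30)

# Add an assumed linear error (slope 0.4 per subcarrier, constant 1.1), then wrap to (-pi, pi].
measured = np.angle(np.exp(1j * (true_phase + 0.4 * k + 1.1)))

calibrated = calibrate_phase(measured, k)
```

Because the fit also removes any linear component that the true phase itself contains, the calibrated phase equals the true phase only up to a straight line, which matches the observation that the processed phase is close to, but not identical with, the real phase.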

Noise Cancellation
Due to the presence of noise, the useful signal often cannot be extracted directly for subsequent processing. Since the frequency of breathing is generally 0.2-0.75 Hz (breaths per second) and the received data often contain a lot of high-frequency noise, wavelet denoising is applied to the CSI amplitude. In this paper, "db3" is used as the wavelet basis to decompose the signal into five levels. The comparison results before and after denoising are shown in Figs. 4a and 4b. Before denoising, the regularity of the signal fluctuation is submerged in noise; after denoising, the waveform becomes smooth and reflects the regular fluctuation of the channel with breathing. The cumulative distribution function (CDF) of the signal frequency before and after denoising is shown in Fig. 5. Before denoising, the real signal is almost covered by high-frequency noise; after denoising, the signal energy is concentrated almost entirely at low frequencies, which indicates a good denoising effect.

Subcarrier Selection
Due to the different carrier frequencies of the subcarriers in OFDM, the variations caused by the different states also differ; therefore, the subcarrier with the most obvious variation needs to be extracted. The variance of the CSI amplitude is used to quantify the sensitivity of each subcarrier to the sleep state. Assuming the signal length (the number of data packets) is $M$, the variance $V_k$ of the CSI signal of the $k$-th subcarrier is calculated as:
$$V_k = \frac{1}{M} \sum_{m=1}^{M} \left( csi_{k,m} - \frac{1}{M} \sum_{i=1}^{M} csi_{k,i} \right)^2 \tag{7}$$
where $csi_{k,m}$ represents the value of the $k$-th subcarrier in the $m$-th data packet. By finding the maximum value of $V_k$, the subcarrier with the largest CSI amplitude variance is extracted for subsequent processing.
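A minimal Python sketch of this selection rule is given below. It assumes the CSI amplitudes are stored as a K-by-M array (subcarriers by packets) and uses the population variance; the function name `select_subcarrier` and the synthetic data, including the subcarrier that carries the breathing ripple, are hypothetical.

```python
import numpy as np

def select_subcarrier(csi_amplitude):
    """Return the index of the subcarrier with the largest amplitude variance, and all variances.

    csi_amplitude has shape (K, M): K subcarriers, M packets.
    """
    variances = csi_amplitude.var(axis=1)  # population variance over the M packets
    return int(np.argmax(variances)), variances

# Hypothetical data: 30 subcarriers, 10 s of packets at 500 packets per second.
rng = np.random.default_rng(0)
K, M = 30, 5000
t = np.arange(M) / 500.0
csi = 10 + 0.05 * rng.standard_normal((K, M))   # flat amplitude plus small noise
csi[17] += 0.8 * np.sin(2 * np.pi * 0.3 * t)    # assumed 0.3 Hz breathing ripple on subcarrier 17

best, variances = select_subcarrier(csi)
```

Whether the variance is normalized by M or by M - 1 does not change which subcarrier attains the maximum, so the choice of normalization is not critical for this step.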

Signal Segmentation
Since the collected data may contain periods in which the experimenter is in the apnea state or has left the test area, these periods need to be segmented from the data and judged accurately in order to monitor the sleep state. Therefore, a sliding window-based method [15] is used to segment the signal. As shown in Fig. 6, a variety of situations may occur in the sleep state. When the experimenter turns over or gets up and leaves the test area, the resulting frequency components are higher than the respiratory rate. Thus, the short-time Fourier transform is used to process the signal. First, the portions dominated by high-frequency components are cut out, leaving multiple clips that contain no turning over or getting up. Next, assuming that the window length is $N$ and the signal length is $M$, the variance $V_n$ of the CSI signal difference between two adjacent windows is calculated as in (8), where $n$ is the window index with value range $[1, \lfloor M/N \rfloor]$ and $\lfloor \cdot \rfloor$ denotes rounding down. Min-Max normalization is applied to $V_n$ to obtain $\tilde{V}_n$:
$$\tilde{V}_n = \frac{V_n - \min_i V_i}{\max_i V_i - \min_i V_i} \tag{9}$$
The signal is then processed according to the signal segmentation algorithm flow given in Tab. 1.

Table 1: Signal segmentation algorithm (returns the segment boundaries T_begin and T_end)
After repeated parameter tuning, the threshold σ is set to 0.65 and the weighting parameters ω and β are set to 0.85 and 3, respectively, in this paper; these values give the best segmentation of the signal.
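The sliding-window idea can be sketched in Python as follows. This is not the algorithm of Tab. 1: it assumes that the window score is the variance of the element-wise difference between two adjacent non-overlapping windows, applies Min-Max normalization, and flags runs of windows whose score exceeds the threshold 0.65. The weighting parameters ω and β are not used because their role is defined only in Tab. 1. The function names and the synthetic signal are hypothetical.

```python
import numpy as np

def window_change_scores(signal, window_len):
    """Min-Max-normalized variance of the difference between adjacent windows."""
    n_windows = len(signal) // window_len            # floor(M / N) complete windows
    windows = np.reshape(signal[: n_windows * window_len], (n_windows, window_len))
    diffs = windows[1:] - windows[:-1]               # assumption: element-wise difference
    v = diffs.var(axis=1)
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)

def flag_segments(scores, threshold=0.65):
    """Return (begin, end) index pairs, end exclusive, of runs where scores > threshold."""
    active = np.concatenate(([False], scores > threshold, [False]))
    edges = np.flatnonzero(active[1:] != active[:-1])  # alternating rising and falling edges
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))

# Hypothetical signal: 10 s at 500 samples/s, slow breathing-like ripple plus one fast burst.
fs = 500
t = np.arange(10 * fs) / fs
signal = 0.1 * np.sin(2 * np.pi * 0.3 * t)
signal[1000:1100] += np.sin(2 * np.pi * 20 * t[1000:1100])  # assumed body-movement burst

scores = window_change_scores(signal, window_len=100)
segments = flag_segments(scores)
```

In this synthetic case the burst occupies the window with index 10, so the two window pairs that touch it (score indices 9 and 10) are flagged, while the slowly varying ripple stays far below the threshold.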

Feature Extraction
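In this paper, the sleep state of the experimenter is detected from the way the phase difference between antennas differs among the breathing, apnea, and unmanned states. The phase difference $D$ between the antennas is calculated, and four of its statistics are extracted: the mean $E_D$, the variance $V_D$, the range $R_D$, and the interquartile range $Q_D$. For a segment containing the values $D_1, \dots, D_J$:
$$E_D = \frac{1}{J} \sum_{j=1}^{J} D_j, \qquad V_D = \frac{1}{J} \sum_{j=1}^{J} \left( D_j - E_D \right)^2$$
$$R_D = \max(D) - \min(D), \qquad Q_D = Q_3\big(\mathrm{sort}(D)\big) - Q_1\big(\mathrm{sort}(D)\big)$$
where $\mathrm{sort}(\cdot)$ indicates sorting from small to large, and $Q_1$ and $Q_3$ are the lower and upper quartiles of the sorted values. Assuming the number of antennas is $X$, the number of subcarriers is $K$, and the number of samples collected is $Y$, the input feature dimension of the classifier is determined by $X$, $K$, and $Y$. In this paper, the BP neural network is adopted to learn and map the feature set, and its structure is shown in Fig. 7.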
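The four statistics of the inter-antenna phase difference described in this section can be sketched in Python as follows. The sketch makes two assumptions that are not stated in the text: the phase difference is wrapped back to (-π, π], and the quartiles are taken with NumPy's default linear interpolation. The function name `phase_difference_features` and the sample phases are hypothetical.

```python
import numpy as np

def phase_difference_features(phase_a, phase_b):
    """Return [mean, variance, range, interquartile range] of the phase difference."""
    # Assumption: wrap the difference to (-pi, pi] so 2*pi jumps do not inflate the statistics.
    d = np.angle(np.exp(1j * (np.asarray(phase_a) - np.asarray(phase_b))))
    d_sorted = np.sort(d)                              # small to large
    q1, q3 = np.percentile(d_sorted, [25, 75])         # lower and upper quartiles
    value_range = d_sorted[-1] - d_sorted[0]           # max minus min
    return np.array([d.mean(), d.var(), value_range, q3 - q1])

# Hypothetical calibrated phases of one subcarrier on two receiving antennas.
phase_antenna_1 = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
phase_antenna_2 = np.zeros(5)

features = phase_difference_features(phase_antenna_1, phase_antenna_2)
```

Computing this vector for each antenna pair and each selected subcarrier, and concatenating the results, would give one input row for the classifier; how the paper arranges these values into the final feature dimension is not reproduced here.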

Experimental Environment Design
In this paper, two ProBox23 Mini hosts, each equipped with an Intel 5300 wireless network interface card (NIC), are used as the transmitter and the receiver, as shown in Fig. 8a. The Linux system is installed on the Mini hosts, and the corresponding CSI acquisition tool is built. The transmitter transmits data with a directional antenna, and the receiver has three antennas for receiving signals. Two layouts are used in the experiments: a typical conference office, shown in Fig. 8b, and a vacant room simulating a relatively empty bedroom, shown in Fig. 8c. In the office, the experimenter lies flat on the desk to imitate the sleep state. In the vacant room, the transmitter and receiver are placed on either side of the experimenter, who lies flat on a camp bed to simulate the real sleep state. Parameters are set as follows: channel 149, with a center frequency of 5.745 GHz, is chosen for data transmission and reception; the packet sending rate is set to 500 packets per second; data collection is timed with a mobile phone; and the training data duration is set to 10 s per group. Online data simulating sleep states are collected in three groups, lasting 3, 5, and 10 min. During that time, the periods in which the experimenter gets up or simulates apnea are recorded using mobile phone timing.
In [23], it is proven that the receiver, the transmitter, and the human chest and abdomen in an indoor line-of-sight (LOS) environment conform to the Fresnel zone model of radio wave propagation. When the human body is lying flat, the chest and abdomen fluctuate most obviously with breathing. Therefore, the sending and receiving distances are set to 90 cm. Over the course of a few months, data on the three states (unmanned, apnea, and breathing) are collected from several participants of different heights and weights. A total of 450 sets of data are collected as offline data. The features extracted from them constitute the training data set in the offline phase. In the online phase, features of the segments obtained after signal segmentation are extracted to form the test data set. Inevitably, there is a certain degree of packet loss in the actual collection process. Testing shows that the packet loss rate is very low and hardly affects the detection results of the algorithm. Therefore, the impact of packet loss on the experimental results can be ignored.

Experimental Results
To assess how well the extracted signal features separate the classes, two of the features are picked at random and plotted as feature distribution scatter diagrams, as shown in Fig. 9. It can be seen from Fig. 9a that, since apnea is similar to the unmanned state, there is a small amount of confusion between the first feature and the third feature. However, as shown in Fig. 9b, the first feature and the second feature are almost free of confusion and occupy clearly separated regions, which better distinguishes the presence of breathing, apnea state, and unmanned state. The confusion matrix obtained after classification in the scene pictured in Fig. 8b is shown in Fig. 10.
Figure 10: Classification confusion matrix of three states in the scene pictured in Fig. 8b
Precision and recall are used to evaluate the classifier:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \tag{11}$$
In (11), TP is defined as "predicted value is positive and the actual value is also positive," FP is defined as "predicted value is positive, whereas the actual value is negative," and FN is defined as "predicted value is negative, whereas the actual value is positive." Combining Fig. 10 and (11), the precision values for the presence of breathing, apnea state, and unmanned state are 96.12%, 98.85%, and 99.33%, respectively, and the recall values are 99.11%, 95.56%, and 99.56%, respectively. Then, the macro F1-Score, which averages the per-class F1 scores with equal weight, is used as a measurement indicator:
$$\text{Macro-}F_1 = \frac{1}{C} \sum_{c=1}^{C} \frac{2 \, P_c \, R_c}{P_c + R_c} \tag{12}$$
where $C = 3$ is the number of states, and $P_c$ and $R_c$ are the precision and recall of state $c$. According to (12), the macro F1-Score is about 0.98. The macro F1-Score combines precision and recall: the greater both values are, the higher the macro F1-Score and the quality of the classification model.
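The reported macro F1-Score can be checked from the per-state precision and recall values quoted above. The following Python sketch assumes that the macro F1-Score is the unweighted mean of the per-state F1 scores; the dictionary keys and the helper name `f1_score` are illustrative.

```python
# Precision and recall per state, as reported for the scene in Fig. 8b.
precision = {"breathing": 0.9612, "apnea": 0.9885, "unmanned": 0.9933}
recall = {"breathing": 0.9911, "apnea": 0.9556, "unmanned": 0.9956}

def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)

# Per-state F1, then the unweighted (macro) average over the three states.
per_state_f1 = {state: f1_score(precision[state], recall[state]) for state in precision}
macro_f1 = sum(per_state_f1.values()) / len(per_state_f1)

print({state: round(value, 4) for state, value in per_state_f1.items()})
print(round(macro_f1, 2))  # about 0.98
```

The apnea state has the lowest per-state F1 because its recall (95.56%) is the lowest of the six values, which is consistent with the occasional confusion between apnea and the unmanned state.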

Impact of Environmental Changes
When the detection environment moves from the environment shown in Fig. 8b to the environment shown in Fig. 8c, the detection accuracy of the apnea state is 95.1%, and the classification confusion matrix is shown in Fig. 11. After making substantial changes to the positions of the table and chair in the environment shown in Fig. 8b, the accuracy can still reach 94.8%, and the classification confusion matrix is shown in Fig. 12.
Figure 12: Classification confusion matrix of three states in the scene pictured in Fig. 8c where the position of the furniture is changed
The changes of environment tested in this study do not have a significant impact on the proposed algorithm, which demonstrates the robustness of the proposed system. The reason is that the extracted signal characteristics are based on the phase difference between the antennas under different conditions, which is a dynamic characteristic. The variation of the indoor environment and the placement of furniture only affect the absolute parameters of the channel, such as the attenuation coefficient or multipath propagation delay, which do not materially affect the experimental results of this article.

Conclusion
In this paper, an intelligent sleep apnea monitoring system is proposed to detect sleep status through commercial WiFi devices. First, a signal model is built, and the received signal is preprocessed. Then, the optimal subcarrier is selected, and time-domain and frequency-domain information is combined in a segmentation algorithm that segments the signal. Finally, the features of the segments are extracted, and a classifier is constructed to detect the presence of breathing, apnea state, and unmanned state. Experimental results show that the proposed system can achieve a recognition accuracy of more than 95.6% for identifying the presence of breathing, apnea state, and unmanned state while suppressing the phase errors and noise in the CSI. Even when the layout of the experiment environment is changed, the detection rate can still reach over 94.8%. The proposed system can be used as a daily apnea monitoring system to provide health information to users. As this system is designed for individual detection, the signal propagation path and the detection accuracy will be affected when there are multiple people in the test area or when the human body is not lying flat. In the future, the scenario with multiple people will be considered.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.