Experimental Study of Smoothing Modifications of the MUSIC Algorithm for Direction of Arrival Estimation in Indoor Environments

Nowadays, the Direction of Arrival (DoA) estimation problem attracts much attention because of emerging use cases for wireless networks. DoA is beneficial for Reconfigurable Intelligent Surfaces, indoor localization, and various navigation and sensing applications, such as gesture recognition and home monitoring. The Multiple Signal Classification (MUSIC) algorithm is very promising for DoA estimation because it provides better accuracy than the other algorithms and remains simple enough to implement in hardware. MUSIC has many modifications designed to achieve better accuracy in indoor environments by combining and smoothing several measurements. However, such modifications have been implemented in equipment with different capabilities. Consequently, the modifications have never been compared under identical conditions. The paper addresses this issue, provides a classification of existing smoothing modifications of MUSIC, and proposes new ones not considered in the literature yet. All of them are compared in real Wi-Fi networks. For that, a testbed is designed that allows automatic measurements in multiple experiments with different positions of devices. A new calibration procedure is created to achieve higher accuracy, and the testbed is validated in an anechoic chamber. Finally, the paper suggests the preferable smoothing modifications of MUSIC for finding the DoA.


I. INTRODUCTION
Nowadays, the Direction of Arrival (DoA) estimation problem attracts much attention because of new use cases for wireless networks.
DoA estimation is crucial for indoor localization and navigation. Indoor navigation is important in large crowded places such as airports and malls [1]. Recently, indoor localization has become important for various Internet of Things scenarios [2]. Moreover, DoA estimation is becoming vital for autonomous vehicles inside factories and warehouses [3].
One prominent application of DoA estimation is the configuration of Reconfigurable Intelligent Surfaces (RISs), e.g., see [4,5]. An RIS [6] is a cheap and simple planar surface composed of many passive reflecting elements, each of which independently imposes the required phase shift on the incoming signal. By adjusting these phase shifts, the elements reflect signals towards desired directions. Thus, the RIS can leverage multipath signal propagation by focusing the reflected signal power. To do so, the RIS must be configured according to the transmitter and receiver channel parameters.
DoA estimation can also be used in various sensing applications. For example, DoA is suitable for home monitoring and motion detection [7]. For that purpose, the DoA estimation algorithm estimates the direct path from the transmitter to the receiver (that is most likely static) and the plurality of reflected paths that depend on the mobile environment.
All aforementioned use cases require a simple and accurate DoA estimation algorithm. The most famous algorithm for DoA estimation is Multiple Signal Classification (MUSIC) [8]. According to many studies [9,10], MUSIC shows the best performance compared with the other DoA estimation algorithms. Therefore, many papers, such as [11][12][13], use MUSIC as a key element for localization systems and RISs. In addition, MUSIC remains simple enough to be relatively easily implemented in hardware [14].
DoA estimation is complicated in indoor scenarios where several copies of the same signal arrive at the array of sensors because of multiple signal reflections from walls, ceilings, and obstacles. As such copies are highly correlated, they result in spurious DoAs and degrade the accuracy of the algorithm. To decorrelate these signals, the authors of the papers [15][16][17][18][19] design and evaluate smoothing techniques, such as spatial, backward, or time smoothing. Although much work has been done to modify the original MUSIC algorithm and improve its performance, two important issues are still open.
First, existing prototypes operating in the indoor environment, such as those in papers [15][16][17][18][20], are implemented on different hardware with a limited amount and different kinds of available data. For example, the papers [15][16][17][20] use Intel CSI Tool, which provides Channel State Information (CSI) for only 30 subcarriers, while the CSI values for over 70 other subcarriers of a 40 MHz channel are discarded. The existing implementation [18] on software-defined radio operates only with time-domain IQ samples of the L-STF field (the first field of the Legacy preamble, which is used for setting the receiver's amplifier, frame detection, and coarse frequency offset compensation), while other frame header fields are ignored. To sum up, existing prototypes cannot investigate all possibilities of the proposed modifications. Second, these modifications use different kinds of input data and are not compared in scenarios with similar conditions. Some of them [15][16][17][20] require CSI. The others [18,19] operate with IQ samples of the signal in the time domain, and no consumer off-the-shelf wireless devices support this feature.
In this paper, we address both issues and investigate various MUSIC modifications for DoA estimation. For that, we overview and classify existing modifications of the MUSIC algorithm, propose additional ones, and compare all of them in real environments with an exhaustive amount of data corresponding to both time and frequency domains. For data extraction, we design a testbed working with real Wi-Fi data¹, as well as a novel calibration procedure for eliminating constant phase shifts between receiver channels. Our testbed is validated in an anechoic chamber and is used for modification comparison in various indoor scenarios.

¹ We select Wi-Fi because it is one of the most popular wireless technologies [21]. It operates in unlicensed bands, which simplifies both the research and deployment of localization systems. In the literature, Wi-Fi is considered a promising technology for indoor localization [22,23], sensing applications [24][25][26], and the use with RISs [27][28][29]. For these reasons, the IEEE 802.11 Working Group develops two new amendments for the Wi-Fi standard, namely, IEEE 802.11az [30] for indoor localization and positioning, and IEEE 802.11bf for sensing applications [7].
Thus, the main contribution of the paper is as follows:
• We overview and classify existing smoothing modifications of the MUSIC algorithm and propose new ones;
• We design, calibrate and validate a testbed for extracting exhaustive data from the received Wi-Fi frames;
• With the designed testbed, we compare various modifications in the indoor environment and find the most accurate one.
The organization of the paper is as follows. In Section II, we consider existing modifications of MUSIC and propose a classification for them. Section III describes experiments in the indoor environment with the devised testbed. In Section IV, we present the results of the comparison and identify the most accurate algorithm. Finally, Section V concludes the paper.

II. MUSIC MODIFICATIONS
In this section, we describe MUSIC and its existing smoothing modifications. First, in Section II-A, we provide a brief overview of the original MUSIC algorithm and its variation operating in the frequency domain, while in Section II-B, we describe various smoothing techniques, such as spatial, backward, and time smoothing. Finally, in Section II-C, we propose a classification of MUSIC modifications and describe systems presented in the literature.

A. ORIGINAL MUSIC
The MUSIC algorithm originates from paper [8]. While we encourage interested readers to look in [8,18] for details, in this section, we briefly introduce the main points needed to understand the essence of MUSIC's modifications.
Consider a device transmitting the Wi-Fi signal with carrier frequency $f_c$ and the sensor array of $M$ antennas (see Fig. 1) separated by $\Delta = \lambda/2$, where $\lambda$ is the carrier wavelength of the signal. The phase shift between two adjacent antennas equals:

$$\varphi = \frac{2\pi f_c \Delta \sin\Theta}{c},$$

where $c$ is the speed of light, and $\Theta$ is the angle between the DoA and the normal to the sensor array (see Fig. 1). Here, we consider both the transmitter and the antenna array to be placed at the same level, so $\Theta$ is the azimuth angle only.
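As an illustration of the relation between the phase shift and the angle, the following minimal sketch converts an angle into the phase shift between adjacent antennas and back. It assumes the common far-field model in which the phase shift equals 2π f_c Δ sin Θ / c; the sign convention, the carrier frequency of 5.5 GHz, and the function names are assumptions made for this example only.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def phase_shift(theta_deg, f_c, spacing):
    """Phase shift (rad) between adjacent antennas for a plane wave arriving at theta_deg."""
    return 2.0 * math.pi * f_c * spacing * math.sin(math.radians(theta_deg)) / C

def angle_from_phase(phi, f_c, spacing):
    """Invert phase_shift; unambiguous when the spacing does not exceed half a wavelength."""
    s = phi * C / (2.0 * math.pi * f_c * spacing)
    s = max(-1.0, min(1.0, s))  # clamp small numerical overshoots before asin
    return math.degrees(math.asin(s))

f_c = 5.5e9              # assumed carrier frequency in the 5 GHz band
spacing = C / f_c / 2.0  # half-wavelength antenna spacing
phi = phase_shift(30.0, f_c, spacing)        # pi * sin(30 deg), i.e. about pi / 2
theta = angle_from_phase(phi, f_c, spacing)  # recovers about 30 degrees
```

With half-wavelength spacing, the phase shift stays within [−π, π] for all angles, which is why a single phase measurement is sufficient in the absence of multipath.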
In the absence of multipath signal propagation, it is sufficient to measure this phase shift between antennas to identify DoA. However, in the indoor environment, an antenna array receives several copies of the signal with different Θ because of reflections. MUSIC is designed to find DoAs and corresponding Θ even in this case.
In the time domain, we can represent the received signal as an $M \times T$ matrix $X_t$ consisting of the consecutive time-domain IQ samples $x_m^t$ received by the $m$-th antenna at time $t$: $m \in \overline{1, M}$, $t \in \overline{1, T}$. However, time-domain Wi-Fi signals are composed of $U$ subcarriers. The frequency gap between these subcarriers is negligible compared to the carrier frequency. Thus, we can consider the subcarrier signals as $U$ independently received signals with the carrier frequency $f_c$. Correspondingly, the CSI values from these subcarriers are $U$ independently received samples, similar to time-domain IQ samples. Therefore, in the frequency domain, we can represent the received signal as an $M \times U$ matrix $X_f$, the $m$-th row of which consists of the CSI values $csi_m^u$ from the $U$ subcarriers of the Wi-Fi signal, $m \in \overline{1, M}$, $u \in \overline{1, U}$:

$$X_t = \begin{pmatrix} x_1^1 & \cdots & x_1^T \\ \vdots & \ddots & \vdots \\ x_M^1 & \cdots & x_M^T \end{pmatrix}, \qquad X_f = \begin{pmatrix} csi_1^1 & \cdots & csi_1^U \\ \vdots & \ddots & \vdots \\ csi_M^1 & \cdots & csi_M^U \end{pmatrix}. \tag{1}$$

The idea of the MUSIC algorithm is to find the array correlation matrix $R_{xx} = XX^*$ and its eigenvectors. Here, matrix $R_{xx}$ has the size of $M \times M$, and $*$ denotes the conjugate transpose, while $X$ is either $X_t$ or $X_f$ determined by the operating domain: time or frequency, respectively. Eigenvectors are used to compute a so-called pseudospectrum, the peaks of which correspond to the most likely DoAs. Examples of such pseudospectra can be found in Figs. 6 and 7, where we depict pseudospectra obtained during validation in an anechoic chamber.
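To make the pseudospectrum computation more concrete, a minimal NumPy sketch of the textbook MUSIC procedure is given below. It is not the implementation used in the testbed: the steering-vector model for a uniform linear array, the known number of sources, the angular grid, and all function names are assumptions of this example. The correlation matrix is additionally divided by the number of samples, which does not change its eigenvectors.

```python
import numpy as np

def steering_vector(theta_deg, num_antennas, spacing_wl=0.5):
    """Array response of a uniform linear array; spacing is given in wavelengths."""
    m = np.arange(num_antennas)
    phase = 2.0 * np.pi * spacing_wl * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * phase * m)

def music_pseudospectrum(R, num_sources, angles_deg, spacing_wl=0.5):
    """Pseudospectrum 1 / ||E_n^H a(theta)||^2 for an M x M correlation matrix R."""
    M = R.shape[0]
    _, eigvecs = np.linalg.eigh(R)               # eigenvalues come in ascending order
    noise_space = eigvecs[:, : M - num_sources]  # eigenvectors of the smallest eigenvalues
    spectrum = np.empty(len(angles_deg))
    for i, theta in enumerate(angles_deg):
        a = steering_vector(theta, M, spacing_wl)
        spectrum[i] = 1.0 / (np.linalg.norm(noise_space.conj().T @ a) ** 2 + 1e-12)
    return spectrum

# Synthetic check: one source at 20 degrees, M = 4 antennas, T = 200 samples.
rng = np.random.default_rng(0)
M, T = 4, 200
s = rng.standard_normal(T) + 1j * rng.standard_normal(T)
W = 0.05 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))
X = np.outer(steering_vector(20.0, M), s) + W
R = X @ X.conj().T / T
angles = np.arange(-90.0, 90.5, 0.5)
estimate = angles[np.argmax(music_pseudospectrum(R, 1, angles))]
```

In this synthetic single-path case, the pseudospectrum has one sharp peak near the true angle, which mirrors the behaviour expected in an anechoic chamber.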

B. SMOOTHING TECHNIQUES
Let us review existing smoothing techniques. All smoothing techniques are aimed at manipulating the array correlation matrix R xx to improve its estimation. As stated in paper [31], the main purpose of such smoothing is to decorrelate the signals coming from different directions after reflections because otherwise, they result in spurious peaks on the pseudospectrum and the quality of DoA estimation degrades.

1) Spatial Smoothing
The key idea of Spatial Smoothing is (i) to split the whole antenna array into $L$ subarrays (see Fig. 2), (ii) calculate array correlation matrices $R_{xx}^{(l)}$, $l \in \overline{1, L}$, for each subarray separately, and (iii) find the resulting correlation matrix $R_{xx}$ as the average of the matrices calculated for each subarray:

$$R_{xx} = \frac{1}{L} \sum_{l=1}^{L} R_{xx}^{(l)}.$$

Then, the resulting matrix is used in the MUSIC algorithm [15][16][17][18].
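The three steps of spatial smoothing can be sketched as follows. The sketch assumes overlapping subarrays of adjacent antennas, so that an array of four antennas yields two subarrays of three antennas; the function name and the toy input are hypothetical.

```python
import numpy as np

def spatial_smoothing(X, subarray_size):
    """Average the correlation matrices of all overlapping subarrays of subarray_size antennas."""
    M = X.shape[0]
    L = M - subarray_size + 1                # number of overlapping subarrays
    R = np.zeros((subarray_size, subarray_size), dtype=complex)
    for l in range(L):
        X_l = X[l : l + subarray_size, :]    # rows (antennas) of the l-th subarray
        R += X_l @ X_l.conj().T
    return R / L

X = np.arange(8, dtype=complex).reshape(4, 2)   # toy signal matrix for 4 antennas
R = spatial_smoothing(X, subarray_size=3)       # L = 2 subarrays of three antennas
```

Note that the smoothed matrix has the size of a subarray rather than of the whole array, so the subsequent MUSIC step works with a correspondingly shorter steering vector.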

2) Backward smoothing
Backward smoothing was initially proposed in paper [19] and used in paper [16]. For backward smoothing, we need to calculate the so-called forward correlation matrix and backward correlation matrix, see Fig. 3, where red arrows denote the order of the antennas. The forward correlation matrix is computed from the ordinary signal matrix $X_t$ or $X_f$ (see (1)) as described in Section II-A. To obtain the backward correlation matrix, we transform the signal matrix by performing complex conjugation of the ordinary one with the reversed order of rows, as shown in Fig. 3. After that, the resulting array correlation matrix $R_{xx}$ is calculated as the average of the forward correlation matrix $R_{xx}^{f}$ and the backward correlation matrix $R_{xx}^{b}$:

$$R_{xx} = \frac{1}{2} \left( R_{xx}^{f} + R_{xx}^{b} \right).$$

3) Time smoothing
In Wi-Fi networks, data frames are transmitted consecutively. The paper [17] suggests using this fact for smoothing the measurements as follows. Let us capture $Z$ consecutive Wi-Fi frames. First, we separately calculate the array correlation matrix $R_{xx}^{(z)}$ for the $z$-th frame ($z \in \overline{1, Z}$). Then, we obtain the resulting array correlation matrix $R_{xx}$ as their average:

$$R_{xx} = \frac{1}{Z} \sum_{z=1}^{Z} R_{xx}^{(z)}.$$

C. CLASSIFICATION OF MODIFICATIONS
The smoothing techniques described in Section II-B have been implemented in multiple modifications of the MUSIC algorithm, listed in Table 1. Note that the right part of the table summarizes numerical results discussed in Section IV. The considered MUSIC modifications differ in the operating domain: the time domain and the frequency domain (denoted as "t" or "f" in Table 1, respectively). The operating domain determines whether the MUSIC algorithm uses signal matrix X t or X f , see (1).
In both domains, various smoothing techniques can be applied independently or jointly. In Table 1, the sign "+" in columns S, B, and T indicates that the corresponding modification implements spatial smoothing, backward smoothing, and time smoothing, respectively.
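A minimal sketch of how the three smoothing techniques may be combined is given below. It assumes overlapping subarrays of three antennas for spatial smoothing, implements backward smoothing through the exchange matrix J that reverses the antenna order (the backward matrix equals J R* J, where R* here is the element-wise conjugate), and averages per-frame matrices for time smoothing. The function names and flags are hypothetical, and the order of operations is one possible choice rather than the one used in the cited papers.

```python
import numpy as np

def correlation(X, spatial=False, backward=False, subarray_size=3):
    """Correlation matrix of one frame with optional spatial (S) and backward (B) smoothing."""
    M = X.shape[0]
    size = subarray_size if spatial else M
    blocks = [X[l : l + size, :] for l in range(M - size + 1)]
    R = sum(block @ block.conj().T for block in blocks) / len(blocks)
    if backward:
        J = np.eye(size)[::-1]               # exchange matrix: reverses the antenna order
        R = 0.5 * (R + J @ R.conj() @ J)     # average of forward and backward matrices
    return R

def smoothed_correlation(frames, spatial=False, backward=False, time=False):
    """Combine S, B, and T; without time smoothing only the first frame is used."""
    used = frames if time else frames[:1]
    return sum(correlation(X, spatial, backward) for X in used) / len(used)

rng = np.random.default_rng(2)
frames = [rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8)) for _ in range(2)]
R_plain = smoothed_correlation(frames)                                        # no smoothing
R_sbt = smoothed_correlation(frames, spatial=True, backward=True, time=True)  # S, B, and T
```

The input matrices can be either time-domain IQ samples or CSI values, so the same routine covers both operating domains.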
Also, for the sake of brevity, each modification is assigned a notation that consists of the following symbols:
• "t" for time or "f" for frequency,
• "M" as we modify the MUSIC algorithm,
• "+" as we improve the algorithm with some techniques,
• "S", "B", and/or "T" for the type of smoothing.
For example, notation "tM+ST" implies that the modification of the MUSIC algorithm works in the time domain and exploits spatial and time smoothing, while notation "fM+SBT" means that the modification operates in the frequency domain and uses all kinds of the considered smoothing techniques. Note that the notation for the original MUSIC algorithm is "tM".
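The notation can be decoded mechanically, as the following small sketch shows. The function name and the returned dictionary layout are hypothetical and serve only to illustrate the naming scheme.

```python
def parse_notation(name):
    """Split a label such as "fM+SBT" into its operating domain and smoothing flags."""
    base, _, smoothing = name.partition("+")
    if len(base) != 2 or base[0] not in "tf" or base[1] != "M":
        raise ValueError(f"unexpected notation: {name!r}")
    if not set(smoothing) <= set("SBT"):
        raise ValueError(f"unknown smoothing symbol in {name!r}")
    return {
        "domain": "time" if base[0] == "t" else "frequency",
        "spatial": "S" in smoothing,
        "backward": "B" in smoothing,
        "time": "T" in smoothing,
    }

config = parse_notation("tM+ST")
# config == {"domain": "time", "spatial": True, "backward": False, "time": True}
```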
Some of the modifications listed in Table 1 have already been proposed or studied in the literature. In particular, the algorithm studied in [18] is a variant of "tM+S", while the algorithms from [15], [17] and [16] are based on "fM+S", "fM+ST" and "fM+SB", respectively. The algorithm "tM+B" is studied in paper [19], and the algorithm in paper [20] is "fM" with some additional features. The papers introducing or evaluating the considered modifications are listed in the column "Paper" in Table 1. From this column, we see that many modifications have not been investigated yet in the literature. For instance, the modification "tM+SBT" is not studied in any paper, although it uses all types of smoothing techniques and, at first sight, should provide better decorrelation of signals after reflections. In this work, we examine these modifications.
Note that in addition to the smoothing approaches, various papers propose additional methods to improve the performance of MUSIC.
However, as we focus only on smoothing approaches, we do not consider these improvements in our study.

A. DESCRIPTION OF THE TESTBED
To evaluate the accuracy of various MUSIC modifications, we have developed a testbed, the initial version of which has been presented as a demo at IEEE INFOCOM 2021 [32]. The testbed consists of a single-antenna transmitting device TX and a receiving device RX with an antenna array, see Fig. 1. TX is based on NI USRP-2944 running the 802.11 Application Framework. Using this framework, TX transmits the 80 MHz frames of IEEE 802.11ac [33].
RX uses NI USRP-2955 and extracts exhaustive information from standard Wi-Fi frames in both time and frequency domains. RX uses a single local oscillator for all RF chains to eliminate random phase shifts between signals from different antennas. However, there are constant phase shifts between these signals caused by imperfections in antennas, feedlines, and RF chains. To eliminate these phase shifts, we use a calibration procedure, the idea of which is as follows. In an anechoic chamber, we deploy the receiving antennas at the same distance from the transmitting one and measure the phase shift between signals on these antennas. During the experiments, we use this phase shift to compensate for the constant phase differences caused by hardware imperfections on the RX.
Using the calibrated RX, we gather a dataset for different DoAs in various scenarios described in Section III-C.
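The idea of the calibration can be sketched as follows. The source does not detail the exact processing, so the sketch assumes that the phase of each antenna is measured relative to the first antenna and averaged over all samples of a chamber recording; the function names and the synthetic offsets are made up for illustration.

```python
import numpy as np

def estimate_phase_offsets(X_cal):
    """Constant phase of each antenna relative to antenna 0 from a chamber recording.

    X_cal is an M x N matrix captured with all antennas equidistant from the
    transmitter, so any remaining phase difference is attributed to hardware.
    """
    reference = X_cal[0, :]
    # Average the complex products first; taking the angle last avoids phase wrapping issues.
    return np.angle(np.mean(X_cal * reference.conj(), axis=1))

def compensate(X, offsets):
    """Remove the calibrated phase offsets from a measured signal matrix."""
    return X * np.exp(-1j * offsets)[:, None]

# Synthetic check with made-up hardware offsets (radians) and a noise-free signal.
rng = np.random.default_rng(3)
true_offsets = np.array([0.0, 0.7, -1.2, 2.5])
s = rng.standard_normal(64) + 1j * rng.standard_normal(64)
X_cal = np.exp(1j * true_offsets)[:, None] * s
estimated = estimate_phase_offsets(X_cal)
X_fixed = compensate(X_cal, estimated)   # all rows now carry the same phase
```

After compensation, the signal matrix can be passed to any of the MUSIC modifications unchanged.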
For a fair comparison, we use the direct path identification method described in [18] for all the modifications, which works as follows. We denote the peak of the pseudospectrum corresponding to the direct path as a direct-path peak. Even small movements in the environment cause significant changes in the reflection-path peaks, while the direct-path peak is usually unchanged. Therefore, we build multiple pseudospectra and discover which DoA is found on all pseudospectra. This DoA is the most stable and can be considered as the direct-path DoA. In our experiments, we use six pseudospectra to identify one direct-path DoA.
We also tried using more pseudospectra, but the difference in quality was negligible. This can be explained by the fact that the peak locations do not change significantly over time, so an increase in the number of pseudospectra used does not improve the accuracy of direct path identification.
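The idea of selecting the most stable peak can be illustrated with the sketch below. It is not the exact procedure from [18]: the matching tolerance, the use of the first pseudospectrum as the source of candidates, and the averaging of matched peaks are assumptions of this example, and the peak angles are made up.

```python
def direct_path_doa(peak_sets, tolerance_deg=5.0):
    """Pick the DoA that recurs in every pseudospectrum with the smallest spread.

    peak_sets holds one list of peak angles (degrees) per pseudospectrum.
    Returns None if no candidate appears in all of them within the tolerance.
    """
    if not peak_sets or any(len(peaks) == 0 for peaks in peak_sets):
        return None
    best_angle, best_spread = None, None
    for candidate in peak_sets[0]:
        # Closest peak to the candidate in each pseudospectrum.
        matches = [min(peaks, key=lambda p: abs(p - candidate)) for peaks in peak_sets]
        spread = max(abs(m - candidate) for m in matches)
        if spread <= tolerance_deg and (best_spread is None or spread < best_spread):
            best_angle = sum(matches) / len(matches)
            best_spread = spread
    return best_angle

# Six pseudospectra: a stable peak near -10 degrees and a drifting reflection peak.
peak_sets = [[-10.2, 35.0], [-9.8, 41.0], [-10.0, 28.5],
             [-10.1, 50.0], [-9.9, 33.0], [-10.3, 44.5]]
doa = direct_path_doa(peak_sets)
```

In this toy input, only the peak near −10° appears in all six pseudospectra within the tolerance, so it is reported as the direct-path DoA.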

B. TESTBED VALIDATION
The validation is intended to confirm that our testbed, the calibration procedure, and the implemented modifications work properly (see Fig. 6). For that, we place our testbed in an anechoic chamber where the impact of multipath is drastically reduced, and there is only one path of signal propagation, which is Line of Sight (LOS). In this case, we have only one sharp peak on pseudospectra corresponding to the estimated angle Θ that is close to the real direct-path DoA Θ′. Figs. 6 and 7 show examples of such pseudospectra obtained for "tM+S" with Θ′ = −10°, and "fM+SBT" with Θ′ = 20°. As we can see, the peaks match the real angles Θ′ with small errors. This confirms that our testbed, the proposed calibration technique, and the implemented algorithms work as intended.

C. CONFIGURATION OF THE EXPERIMENT
To compare various modifications of MUSIC, we consider the indoor environment with multipath signal propagation: a classroom, an office, and a classroom with the presence of people. Fig. 8 shows their layouts. The positions of RX and TX are chosen randomly while ensuring LOS between them. We measure the direct-path DoA, which is the angle Θ ′ between LOS and the normal to the antenna array (see Fig. 1).
As mentioned above, the testbed analyses IEEE 802.11ac frames. For a fair comparison, we use the same VHT-LTF field of IEEE 802.11ac frames for both time-domain and frequency-domain modifications.
As the frames are transmitted in the 80 MHz band, they have 242 tones. The frequency-domain modifications can use CSI values from all these subcarriers, and U = 242 for the MUSIC algorithm, see Section II-A.
For spatial smoothing, L = 2 (see Fig. 2) because it is the only possible number of subarrays of three antennas in the existing antenna array of four antennas.
For time smoothing, we average K = 2 consecutive frames (see Fig. 4), i.e., Z = 2 in the notation of Section II-B, because it is enough to track the dynamics of changes in the accuracy of the modifications.
As the main performance metrics of the considered MUSIC modifications in the described scenarios, we use the mean error and the median error. The mean error demonstrates the accuracy over all possible values of Θ′, while the median error is less sensitive to high error values that occur when Θ′ is close to 90°.
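The difference between the two metrics can be seen in a small numerical example. The angles below are made up for illustration and are not measurement results; the function name is hypothetical.

```python
from statistics import mean, median

def error_metrics(estimated_deg, true_deg):
    """Mean and median absolute DoA error in degrees."""
    errors = [abs(e - t) for e, t in zip(estimated_deg, true_deg)]
    return mean(errors), median(errors)

true_angles = [0.0, 30.0, 60.0, 80.0]
estimates = [1.0, 28.0, 63.0, 50.0]   # the last estimate is a large outlier
mean_err, median_err = error_metrics(estimates, true_angles)
# Absolute errors are 1, 2, 3, and 30 degrees: mean_err = 9.0, median_err = 2.5
```

A single large error at a steep angle dominates the mean, while the median remains close to the typical error.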

IV. NUMERICAL RESULTS
The accuracy of the original MUSIC is known to degrade significantly when the signal falls parallel to the array or approaches this direction [18]. So, we start the study by evaluating how the median error of various modifications depends on the angle Θ′ between LOS and the normal to the sensor array, see Figs. 9-11. As expected, the accuracy of the direct-path DoA estimation decreases significantly as Θ′ increases. Thus, in the following experiments, we compare both (i) the accuracy obtained for Θ′ = 0°, i.e., the LOS signal falls perpendicular to the array, which is the most favorable case for MUSIC, and (ii) the average accuracy for all angles (Θ′ ∈ [0°, 90°]). The usage of all smoothing techniques in the frequency domain increases the maximum angle at which the median error is less than 10° and thus increases the range of applicability of MUSIC. For example, in the Office scenario, the maximum angle increases by 30%, while in the Classroom with People, this value increases by 50%.
Table 1 contains the median and mean error values for the three indoor scenarios described in Section III-C. We see that, in the Classroom scenario, the lowest errors are provided by modifications "tM+SB", "tM+SBT", "fM+SB", and "fM+SBT" if we take into account all possible values of the real angle (Θ′ ∈ [0°, 90°]). When Θ′ = 0°, all algorithms demonstrate low estimation error. Modifications "tM+S", "tM+ST", "fM+S", and "fM+ST" provide the highest performance, and the median error decreases by 1-2° compared to the legacy "tM" algorithm. The modifications showing the best or almost the best accuracy are highlighted in bold in Table 1.
The same modifications provide the best performance for the Office and Classroom with People scenarios in both cases: Θ′ ∈ [0°, 90°] and Θ′ = 0°. Figs. 12-14 show the CDFs for three modifications ("tM", "tM+BT", "fM+SBT") in the Classroom, the Classroom with People, and the Office scenarios. These figures and the results from Table 1 demonstrate that all modifications show low accuracy of DoA estimation for Θ′ ∈ [0°, 90°] because the performance of the MUSIC algorithm significantly degrades when Θ′ > 50°. However, modifications that include both spatial (S) and backward (B) smoothing improve the mean error by 22% in the Office, 18% in the Classroom with People, and 12% in the Classroom scenarios compared to the legacy "tM" algorithm.
Also, the results from Table 1 show that spatial and/or backward smoothing improves the accuracy of the MUSIC algorithm, while time smoothing has almost no effect on accuracy but requires significantly more computing resources. Therefore, the use of time smoothing in the considered indoor scenarios is not justified.
To summarize, we reveal that the most accurate modifications of the MUSIC algorithm are "tM+SB" and "fM+SB", which exploit spatial and backward smoothing and operate in the time or frequency domain, respectively. Their accuracy is almost the same, so either can be used for the DoA estimation. The choice of the exact modification depends on which data are available and convenient. If the device performs CSI calculation, "fM+SB" can be used. Otherwise, a simpler device can run "tM+SB" with IQ samples in the time domain without any loss in accuracy.

V. CONCLUSION
In this paper, we have studied the accuracy of various smoothing modifications of the MUSIC algorithm. These modifications were designed to improve the algorithm performance in the indoor environment with multipath propagation but have not been thoroughly compared yet. We considered many modifications found in the literature, classified them, and, based on the classification, proposed some additional ones. After that, we compared all of them in Wi-Fi networks and found the most accurate ones for DoA estimation. For that, we designed a testbed operating with real Wi-Fi frames, created a new calibration procedure, and validated the designed testbed in an anechoic chamber. Then, we ran experiments in various indoor environments with the testbed and found which modifications of MUSIC are preferable for finding the DoA of the Wi-Fi signal. We revealed that some smoothing techniques indeed significantly improve performance. Specifically, the most accurate modifications of the MUSIC algorithm exploit spatial and backward smoothing, and there are no significant differences between the time and frequency domains. These modifications increase the maximum angle at which the median error is less than 10° by 30%-50% depending on the scenario.

He is an author of more than a dozen research papers and patents. He supervises students and lectures on fundamentals of telecommunications and SDR prototyping. His professional interests are related to testbeds, software defined radios, massive machine-to-machine communication, and ultra-dense networks.
EVGENY KHOROV (Senior Member, IEEE) is currently the Head of the Wireless Networks Lab and the Deputy Director of the Institute for Information Transmission Problems of the Russian Academy of Sciences. He is also the head of the Telecommunication Systems Lab of the Higher School of Economics. He has led dozens of national and international projects sponsored by academic funds and industry. Being a voting member of IEEE 802.11, he has contributed to 802.11 standards with many proposals. He has authored more than 100 articles on 5G and beyond wireless systems, next-generation Wi-Fi, protocol design, and QoS-aware cross-layer optimization. He