Grouping-Based Channel Estimation and Tracking for Millimeter Wave Massive MIMO Systems

Although the millimeter wave (mmWave) massive multiple-input and multiple-output (MIMO) system can potentially boost the network capacity for future communications, the pilot overhead of the system in practice will greatly increase, which causes a significant decrease in system performance. In this paper, we propose a novel grouping-based channel estimation and tracking approach to reduce the pilot overhead and computational complexity while improving the estimation accuracy. Specifically, we design a low-complexity iterative channel estimation and tracking algorithm by fully exploiting the sparsity of mmWave massive MIMO channels, where the signal eigenvectors are estimated and tracked based on the received signals at the base station (BS). With the recovered signal eigenvectors, the celebrated multiple-signal classification (MUSIC) algorithm can be employed to estimate the direction of arrival (DoA) angles and the path amplitude for the user terminals (UTs). To improve the estimation accuracy and accelerate the tracking speed, we develop a closed-form solution for updating the step-size in the proposed iterative algorithm. Furthermore, a grouping method is proposed to reduce the number of sharing pilots in the scenario of multiple UTs to shorten the pilot overhead. The computational complexity of the proposed algorithm is analyzed. Simulation results are provided to verify the effectiveness of the proposed schemes in terms of the estimation accuracy, tracking speed, and overhead reduction.


Introduction
The fifth generation (5G) wireless communication system has been proposed to meet the increasing demands on spectrum efficiency (SE), energy efficiency (EE), and network efficiency (NE). Millimeter wave (mmWave) communications operating in the frequency band spanning from 30 to 300 GHz have received great attention for 5G and beyond 5G (B5G) systems, since they are readily combined with massive multiple-input multiple-output (MIMO) [1,2] techniques for dramatically improving the system SE, EE, and latency performance [3,4].
The channel state information (CSI) accuracy is very important for the design of mmWave massive MIMO systems, and it plays a key role to achieve high SE and EE [5,6]. Particularly, obtaining accurate CSI is the premise to leverage the full benefits brought by the massive MIMO tech-nique on the mmWave frequency band [7]. Conventionally, CSI is commonly obtained with the assistance of the pilot sequences. Attributed to the orthogonality of pilot sequences, the CSI estimation with high accuracy can be obtained. Nonetheless, there is a positive linear relationship between the pilot overhead and the number of antennas. Therefore, when hundreds of antennas are deployed at the BS, the pilot overhead increases linearly, which may deteriorate the system performance significantly [8] due to the pilot contamination.
To reduce the pilot overhead, proper allocation of pilot sequences is one of the main solutions. The authors in [9] have designed a pilot reuse algorithm for massive MIMO with in-phase and quadrature-phase imbalances (IQI). Moreover, a pilot reuse method has been proposed in [8], where the channel power distribution in angular domain is derived via the channel covariance matrix analysis.
Accordingly, a pilot reuse pattern has been designed to reduce pilot overhead. In [10,11], a novel dynamic pilot allocation scheme has been proposed with low complexity. A fractional pilot reuse scheme based on threshold optimization has been developed in [12], where different pilots are allocated to the users classified based on the fluid model.
In fact, the pilot reuse scheme unavoidably affects the accuracy of CSI estimation, which motivates the blind CSI estimation methods developed to reduce the estimation error without using pilot sequences, such as the subspace method (SM) and monte carlo methods (MCMs). However, these conventional methods have extremely high computational complexity due to the matrix decomposition and are not suitable to implement in practice. To reduce the computational complexity, a spatial basis expansion model (SBEM) has been proposed to model the mmWave massive MIMO channel with a uniform linear array (ULA) of antennas in [13]. Then, a unified transmission strategy for the mmWave massive MIMO systems in the spatial domain has been proposed, which includes low complex CSI estimation and user scheduling. The same channel model has been extended to the uniform rectangular array-(URA-) based massive MIMO systems in [14]. By combining two-dimensional fast Fourier transformation (2D-FFT) and spatial spectrum analysis, the CSI can be estimated with low-complexity based on 2D-FFT operation in [15].
By exploiting the sparse characteristic of mmWave channel, an open-loop channel estimator based on orthogonal matching pursuit has been proposed for mmWave hybrid MIMO system [16]. Based on the channel reciprocity characteristics in TDD systems, the Arnoldi iteration has been used to estimate the left and right singular subspaces of the mmWave MIMO channel in [17]. To reduce the computational complexity, a sequential and adaptive subspace estimation (SASE) method has been proposed to estimate the column and row subspaces of the mmWave MIMO channel in [18].
Most of the aforementioned works cannot track the variation on the CSI, which needs to be reimplemented after the channel coherence time to catch up with the varying channel coefficients. Therefore, tracking and simultaneously estimating the CSI can further reduce the computational complexity to achieve accurate CSI. In [19], a modified unscented Kalman filter has been proposed to track the DoA angles with known channel power gain as a priori information. In [20], the predetermined user motion model has been exploited to predict the change in the CSI of massive MIMO systems. The assumptions in both works hinder their practical operation. A sparse complex-valued neural network has been applied to approximate the uplink-to-downlink mapping function in [21]. However, the overhead required for the uplink channel estimation is also intolerable. In [22], a principal subspace analysis-(PSA-) based iterative method has been applied to estimate and track the CSI. However, the computational complexity of this method is still very high due to the eigenvalue decomposition (EVD), and the step size of the iterative algorithm is not optimized.
To reduce the computational complexity and the pilot overhead, we propose a novel channel estimation and track-ing algorithm for the mmWave massive MIMO systems with URA antennas at the BS. First, the channel sparsity of the mmWave frequency channel is fully exploited. Based on the principal component analysis (PCA) method, a weighted iterative method with the optimized step size is proposed to estimate and track the signal eigenvectors. Second, the obtained eigenvectors are used to estimate DoA angles and path amplitudes via the classic multiple-signal classification (MUSIC) algorithm. Then, a standard inner product is used to match the elevation DoA angles with the right azimuth DoA angles on the same path. Moreover, a user terminal (UT) grouping scheme is proposed to realize the training sequence reuse in the scenario of multiple UTs, which substantially reduces the number of required pilots and the pilot contamination. The main contributions of this work can be summarized as follows.
(1) Based on the PCA, a weighted iterative method is proposed to estimate and track the eigenvectors of the signal subspace, which are used to recover the CSI. Since the complex matrix decomposition operation is avoided, the computational complexity decreases significantly (2) To accelerate the convergence speed for the proposed iterative scheme, we develop a closed-from solution for updating the step-size in the proposed iterative algorithm (3) To reduce the pilot overhead, a UT grouping method based on the spatial separability is proposed to divide the UTs into different groups. The spatial separability implies that the UTs in the same group do not have overlapping paths so that they can share the same pilot sequence and transmit simultaneously without causing interference This paper is organized as follows. In Section 2, a mmWave massive MIMO system is described. The singular value decomposition (SVD) method and the method to estimate the CSI based on the signal eigenvectors are introduced in Section 3. In Section 4, a novel iterative method is proposed. The simulation results are demonstrated in Section 6. Finally, conclusions are drawn in Section 7.

System Model
In this section, we describe the wireless channel model of the investigated mmWave massive MIMO system. As shown in Figure 1, the BS equipped with a URA of M × N antennas serves UTs, each of which is equipped with a single antenna. Moreover, the wavelength is denoted as λ, and the interantenna spacing of both elevation and horizontal directions is denoted as d s which is set to be half of the λ.
We assume that the signal transmitted from each UT, i ∈ I ≜ f1, ⋯, Ug where U is the number of UTs, to the BS arrives along P i clusters consisting of a number of independent resolvable paths, and each path has a corresponding elevation DoA angle and azimuth DoA angle. Moreover, the mean and angle spread (AS) of the Based on (5), we can observe that the channel matrix, H p , mainly depends on βðθ, ϕÞ and Aðθ, ϕÞ, where the former is directly related to the channel attenuation and initial phase, and the latter is determined by the DoA angles. Note that if Aðθ, ϕÞ and βðθ, ϕÞ are estimated with high accuracy, the CSI matrix would be reconstructed with high quality. To recover the CSI, there are six parameters to be estimated, i.e., θ p , Δθ p , ϕ p , Δϕ p , αðθ, ϕÞ, and Φðθ, ϕÞ.
Actually, (5) can be simplified when a mmWave frequency band is used in the massive MIMO system. From [2], Δθ p and Δϕ p tend to be zero in the scenario where the massive MIMO system operates over the mmWave frequency band. Therefore, each cluster can be seen as an independent path, and each path is formed by the line-of-sight (LOS) and the first-order reflection components. Consequently, (5) can be rewritten as Then, the received signals at the BS is expressed as where k is the index of the received signal snapshot, x i,k is the transmitted signal from UT i, W is the Additive White Gaussian Noise (AWGN) matrix, and H i,p ∈ ℂ M×N is the channel matrix for UT i on the pth path. From (7), we can observe that to recover the channel matrix, only four parameters need to be estimated for each path p, which are θ i,p , ϕ i,p , αðθ i,p , ϕ i,p Þ, and Φðθ i,p , ϕ i,p Þ. Therefore, by comparing with estimating entries in the original channel matrix, the computational complexity of recovering the parameters related to the paths is significantly reduced.

PCA-Based Eigenvectors Estimation and Tracking
In this section, a PCA-based iterative method to estimate and track the signal eigenvectors is proposed.
3.1. Channel Decomposition. Conventionally, SVD operation can be applied to estimate the eigenvectors related to the channel, which can be used to recover the channel. To explain the related notations and theories clearly, we assume that there is one UT in the system with P distinguishable paths first.
According to the SVD operation, the received signal, Y k , can be written as Y k = SVD H , where S is a unitary matrix S ∈ ℂ M×M , D is a unitary matrix with D ∈ ℂ N×N , the column  Wireless Communications and Mobile Computing vectors of S and D are all referred to as singular vectors, and V is a diagonal matrix, V ∈ ℂ M×N , which is named as singular value matrix and sorted in descending order.
Besides, when using the EVD to decompose the covariance matrices, C L = EfY k Y H k g and C R = EfY H k Y k g, the column vectors of S and D are the eigenvectors of C L and C R , respectively. Based on the array signal processing knowledge, the first P column vectors of S and D are mainly dependent on elevation and azimuth DoA angles of P resolvable paths, which are denoted as S P and D P , respectively. Accordingly, the DoA angles and amplitudes of resolvable paths can be estimated after recovering the S P and D P .
Based on above investigation, by applying the EVD operation, the covariance matrix, C L , can be written as where matrix S P ∈ℂ M×P , and the column vectors of S P are eigenvectors, V s is a diagonal matrix and V s ∈ ℂ P×P with entries from λ 1 to λ P , S Z is a matrix ∈ℂ M×ðM−PÞ , and the column vectors of S Z are eigenvectors, and Similarly, applying EVD operation to C R , we can obtain where D P ∈ ℂ N×P , and the column vectors in D P are eigenvectors, V d is a diagonal matrix, V d ∈ ℂ P×P , with entries from γ 1 to γ P , D Z ∈ ℂ M×ðN−PÞ , and the column vectors of S Z are eigenvectors, V z R is a diagonal matrix, and V z R ∈ ℂ ðN−PÞ×ðN−PÞ , with entries of eigenvalues from γ P+1 to γ M . Even though S P and D P can be recovered via the SVD or EVD operation as analyzed above, each of them has very high computational complexity, which is OðM 3 Þ and OðM 3 + N 3 Þ, respectively. Therefore, they are not suitable for channel estimation in practice. Moreover, it cannot trace the change on the channel matrix caused by the mobility of UTs.

Adaptive PCA-Based Iterative Method.
In order to estimate and track the eigenvectors with low computational complexity, an adaptive PCA-based iterative method is proposed in this subsection. First, we take the estimation and tracking of S P for instance. Define a matrix T 1 = S P U ∈ ℂ M×P , where U is a unitary matrix with P × P dimension. The estimation on T 1 can be formulated as the following optimization problem which is to minimize the mean square error (MSE) of the estimation. A simple iterative gradient descent method can be used to solve the optimization problem in (10). The gradient of JðT 1 Þ with respect to T 1 is updated by where j denotes the iteration index, η > 0 is a step size, and ð∂JðT 1 ÞÞ/ð∂T 1 Þ is to represent the gradient of JðT 1 Þ with respect to T 1 . To obtain the partial derivative of JðT 1 Þ with respect to T 1 , we first calculate the differential of JðT 1 Þ with respect to T 1 via the chain rule, which is given by Based on the matrix analysis, the gradient of JðT 1 Þ is given by After substituting (13) into (11), the update rule on T 1 can be rewritten as With Equation (14), T 1 will converge to the principal subspace, S P U, which is not equal to S P due to the unitary matrix U.

Optimization Problem.
To estimate the corresponding eigenvectors, we reconstruct the optimization problem based on the weighted method proposed in [23]. The new optimization problem is formulated as where Ω is a diagonal matrix whose diagonal item is w i,i . If the diagonal items of Ω are set with different and positive values and w 1,1 > w 2,2 > ⋯>w P,P , the estimation of T 1 will converge to S P [23]. Based on (15), the gradient of JðT 1 Þ can be rewritten as Wireless Communications and Mobile Computing With (16), matrix T 1 in (14) is updated by It has been proven in [23] that T 1 converges to S P when j tends to infinity. From (17), we can observe that the covariance matrix C L is essential to update T 1 , but the accurate To obtain C L , an approximated method has been proposed in [24], which uses a sampling covariance matrixĈ L,j . TheĈ L,j is given bŷ where τ ∈ ½0, 1 is the forgetting factor, the variable k is used to indicate the kth time slot, and j is to indicate the current time slot. The initial matrixĈ L,1 is set to be equal to Y 1 Y H 1 . Moreover, it is noteworthy that the parameter τ can be adjusted to trace channel variance. If the channel is modeled as a stationary process, (18) can be regarded as the calculation on the average value of Y k Y H k , and τ is equal to 1. However, the τ is taken in the range of (0,1) due to the massive MIMO channel is not stationary in practice.
With (18), the updating rule for T 1 can be rewritten as It is noteworthy that the step size, η, is essential to obtain T 1 with high accuracy and convergence speed. However, deriving an appropriate step size η is nontrivial. On the one hand, if η is small, the convergence speed will be slow. On the other hand, the accuracy will be impaired if η is high. Therefore, in the next subsection, the optimal η is derived to accelerate the convergence speed while keeping the accuracy.

Optimal
Step Size. To find the optimal step size η opt , we consider the objection function JðT 1 Þ at the ðk + 1Þth iteration step which is given by To simplify the equation, we replace ð∂JðT 1 ÞÞ/ð∂T 1 Þ with ∇. Then, combining (11) with (20), (20) can be rewritten as Since T H 1,k+1 T 1,k+1 = U H S H P S P U = I, (21) can be written as Therefore, JðT 1,k+1 Þ has a global maximum with the optimal step size, η opt . Based on (22), we can differentiate Jð T 1,k+1 , ηÞ with respect to η, which is given by When ð∂JðT 1,k+1 ÞÞ/ð∂ηÞ = 0, JðT 1,k+1 Þ has a global maximum. As a result, the optimal step size, η opt , is the solution of ð∂JðT 1,k+1 ÞÞ/ð∂ηÞ = 0, which is given by Based on (24), we can achieve the optimal updating rule to estimate and track the signal eigenvector matrix, S P . In the same way, we can estimate and track the D P .

Estimation on the Channel Parameters
After estimating S P and D P in Section 3, the DoA angles of the resolvable paths on both elevation and azimuth directions can be recovered based on the classic MUSIC algorithm. Define u p and v p as With these two variables, the steering matrix, Aðθ, ϕÞ = aðθÞb T ðθ, ϕÞ, can be rewritten as Aðu p , v p Þ = aðu p Þb T ðv p Þ. Then, the MUSIC method is used to estimate u p and v p . Then, the DoA angles can be recovered according to (25) and (26).
where ψ i = Efx i x H i g. Then, subtracting (27) to (28), there is Since h p = βðθ p , ϕ p ÞAðu p , v p Þ, we can derive the following equation from (29) Based on matrix manipulation, from (30), we can obtain where s z,k is the kth column of matrix S z and s z,k,m is the mth entry of s z,k . Based on (31), the pseudospectrum with respect to u can be derived as where pðζÞ = ½1, ζ, ⋯, ζ ðM−1Þ T . According to (32), if ζ = e ju , the roots of the denominator would locate on the unit circle and be the estimation on u since the steering vector, að u p Þ, 1 ≤ p ≤ P, is orthogonal to the noise subspace, S z . Moreover, S z S H z can be derived based on the estimation on S P in Section 2, which is given by Then, inserting (33) into (32), u p , ∀1 ≤ i ≤ P, can be estimated by finding the peaks of the pseudospectrum in (32). By using the same method, we can estimate v p , ∀1 ≤ p ≤ P, with D P .

4.2.
Matching on u p and v p . After estimating u p and v p in the above subsection, it needs to match u p with the right v p on the same path since each path has a corresponding pair of elevation and azimuth DoA angles. To do so, a standard inner product is defined to match u p with v p , which is given by where J and K are two matrices. The inner product is to calculate the energy of the received signals on certain direction. Accordingly, with the estimation of u p and v p on the above subsection, the inner products of received signals matrix Y with the steering matrix Aðu i , v j Þ is given by where i, j = 1, 2, 3 ⋯ P. P r will be the maximum value when i = j. The received signals from P paths make the most contribution to the received energy at the BS. Then, we can match u p with v p by searching the maximum value of P r . After that, based on (25) and (26), the DoA angles of each path can be estimated by the following formulas 4.3. Estimation on βðθ, ϕÞ. Based on (7) and SVD operation, the received signals Y k is rewritten as where ρ p is the pth eigenvalue of Y k ; s p and d p are the pth eigenvectors of S P and D P , respectively. Then, we set G p = s p d H p and construct an orthonormal space as follows According to (39), the pth eigenvalue, ρ p , can be written as Based on the Cramer's Rule, the amplitude of path i can be calculated as Then, the complete algorithm flow of the proposed scheme is given in Algorithm 1.

Extension for Multiple UTs
In this section, the proposed method is extended to the multiple UT scenario. When the UTs have the same paths, their signals cannot be distinguished at the BS. To solve this problem and reduce the pilot overhead, a user grouping scheme is proposed in this section.

Initial DoA Angle
Estimation. The initial DoA angles of different UTs need to be known at the BS before dividing them into different groups. To reach this goal, the training process is necessary. Assume that there are I orthogonal training sequences, denoted as fr 1 ,⋯,r i ,⋯,r I g, where r i is an L-dimensional vector with kr i k 2 = L. During the training stage, each UT is assigned to a training sequence, and the received signals at BS can be expressed as where Y ∈ ℂ M×N×L , and H i ∈ ℂ M×N×1 is the uplink channel between the ith UT and the BS; U is the set of UTs. W ∈ ℂ M×N×L is the additive white Gaussian noise. The orthogonality of the training sequences can be used to distinguish the received signals from different UTs. First, multiply (44) with Based on (45), the transmitted signal from different UTs can be separated. With the detached signal Y i , the proposed channel estimation method can be used to estimate the initial DoA angles associated with the ith UT. However, the number of UTs may be more than the number of orthogonal training sequences, i.e., U > I. To solve this problem, the pilot sequences can be reused in a time division multiple access (TDMA) manner. UTs are divided into dU/Ie groups, and the UTs in each group can reuse the same I orthogonal pilot sequences in different time slots. Accordingly, the BS can estimate the initial DoA angles of all UTs.

Grouping Scheme.
After the initial DoA angle estimation, the UTs can be partitioned into different groups, and the UTs in the same group can use the same training sequence and transmit simultaneously. 1: Receive the kth signal snapshot in the kth time slot, Y k . Then, the BS updates the covariance matrix,Ĉ L,k , based on (18); 2: Estimate and track the eigenvector matrix of the received signal, S P , based on the PCA algorithm in (19). Recover D P in the same way; 3: Estimate u p and v p , 1 ≤ i ≤ P, based on (32); 4: Pair u p , 1 ≤ p ≤ P, with v p , 1 ≤ p ≤ P, based on (35); 5: Estimate θ p , 1 ≤ p ≤ P and ϕ p , 1 ≤ p ≤ P based on (36) and (37); 6: Estimate the amplitude corresponding to each path based on (43); 7: Recover the channel based on (6); The principle of the scheme is to guarantee that the UTs in the same group do not have overlapping paths. For example, there are two UTs, u 1 and u 2 with the DoA angles of θ u 1 , θ u 2 and ϕ u 1 , ϕ u 2 estimated in the initial stage. If the minimum difference on the DoA angles between u 1 and u 2 is bigger than a threshold denoted as λ, ðmin fjθ u 1 − θ u 2 j, jϕ u 1 − ϕ u 2 jg ≥ λÞ, they would be put into the same group. Based on this rule, the UTs can be designated into different groups, and the UTs in the same group have different distinguishable paths. As a result, the UTs in the same group can use the same training sequence, and their CSI can be estimated and tracked by Algorithm 1. : If the minimum difference on the DoA angles between any two UTs is bigger than a threshold denoted as λ, they are put into the same group; 5: The UTs in the same group will use the same pilot sequence.    : If the minimum difference on the DoA angles between any two UTs is bigger than a threshold denoted as λ, they are put into the same group; 5: The UTs in the same group will use the same pilot sequence.
With the grouping scheme proposed in Algorithm 2, the channel estimation scheme proposed in Algorithm 1 can be extended to the multi-UT scenario. First, the UT grouping scheme is executed at the BS. Then, the UTs in the same group will use the same pilot sequence and transmit simultaneously. After the BS receives signals from the UTs in the same group, it updates the covariance matrix based on (18). Then, it estimates the channel state information based on Algorithm 1 proposed in Section 4. It is noteworthy that the BS treats the UTs in the same group as one "virtual UT." Since there exists a certain gap in DoA angles among UTs in the same group, the paths associated with their channels can be distinguished at the BS. Based on the initial DoA estimation, the BS can identify to which the UT paths belong. When the BS cannot track the CSI accurately, it needs to rerun the grouping scheme in the algorithm. To guarantee the channel estimation and tracking accuracy, the BS may run Algorithm 2 after certain time slots in practice.

Numerical Results
In this section, we provide the numerical results to demonstrate the superiority of the proposed scheme. In the simula-tion setup, the number of antennas deployed at the BS on both elevation and azimuth directions is 64 (M = N = 64), and the interantenna spacing, d, is set to be λ/2. In the first scenario, we assume that there is a single UT under coverage of the BS and the received SNR is 10 dB. Moreover, we assume that the propagation from the UT to BS goes three resolvable paths. The elevation and azimuth angles corresponding to three paths are (25 ∘ , 45 ∘ , 80 ∘ ) and (20 ∘ , 30 ∘ , 70 ∘ ), respectively.
6.1. The Estimation Accuracy and Convergence Speed. First, to evaluate the estimation accuracy, a Tanimoto coefficient is defined which is given by where s i is the ith eigenvector of matrix S P , and t 1,i,k is the ith eigenvector of the estimation matrix S P at the kth iteration step. Tanimoto coefficient defined in (46) implies that if ε tends to be 0, the estimation approaches to the real S P . To demonstrate the superiority of the optimal step size derived in Section 3.4, the Backtracking Line Search (BLS) method proposed in [25] is used for comparison. From Figure 2, we can observe that ε achieved by the PCA-based algorithm and BLS algorithm converges to 0, which indicates the availability of these algorithm on estimation of eigenvector matrix. Meanwhile, during the first 400 iterations, the ε achieved by the PCA-based algorithm is less than the BLS After recovering the eigenvectors, the intermediate variables, u p and v p , are estimated and paired according to the proposed scheme. To demonstrate the match process clearly, the inner products of the received signal matrix on the steering matrix are listed in Table 1, where the steering vectors are constructed by the estimated u p and v p .
From Table 1, we can match u p with the right v p by finding the maximum value. Consequently, u 1 , u 2 , and u 3 should be paired with v 2 , v 1 , and v 3 , respectively. After pairing, the estimation of DoA angles and path amplitudes can be obtained through (36), (37), and (43), respectively, which is listed in Table 2, where the index p is to indicate the pth path.
From Table 2, we can observe that the estimations on both θ p , ϕ p , and βðθ p , ϕ p Þ are close to the real values, which verifies the effectiveness of the proposed scheme. Moreover, since the ϕ p is determined by both u p and v p , the estimation error on u p and v p would affect the estimation accuracy on ϕ p . Therefore, the estimation accuracy of the azimuth angles, ϕ p , is less than that on the elevation angles, θ p .
6.2. The Impact of Received SNR on the DoA Estimation. To reveal estimation performance with different SNRs, the normalized mean square error (MSE) on DoA angle estimation is demonstrated in this section. The MSE is defined as Figure 3 illustrates the MSE performance on DoA angle estimation. It can be observed that the estimation MSE on DoA angles reduces as the received SNR at the BS increases. Figure 3 also compares the proposed scheme with the BLS algorithm with respect to DoA angle estimation MSE. We can observe that the proposed scheme outperforms the BLS algorithm on the estimation accuracy. Since the estimation on azimuth DoA angle depends on both the estimated u p and v p , the estimation accuracy on the elevation DoA angle is higher than that on the azimuth DoA angle. 6.3. Tracking Performance. To verify the tracking ability of the proposed scheme, the SVD-based channel estimation is simulated for comparison, where the SVD operation is executed periodically to update the channel estimation. Assuming that the elevation angle changes from 45°to 53°, and azimuth angles change from 30°to 22°, with the changing rate of 0.02°/stepsize. As shown in Figure 4, the SVD-based channel estimation method cannot track the change on the channel even though it may estimate channel with high accuracy in some times. On the other hand, the channel estimation based on Algorithm 1 is very close to the perfect one all the time even when the channel changes. Therefore, this simulation result confirms the tracking ability on the channel estimation of Algorithm 1.
6.4. The Effectiveness of Grouping Scheme. For the sake of illustration on the effect of the UT grouping scheme, a multiple UT scenario is considered in this section. The elevation and azimuth angles corresponding to the distinguish paths for 10 different UTs are listed in Table 3. Assuming that there are only 2 orthogonal pilot sequences and set λ = 5°. First, the UTs are assigned to 5 groups. UTs in 5 groups will reuse the 2 orthogonal pilot sequences in a TDMA manner, and the BS can obtain the initial DoA angle estimations of UTs. Afterwards, based on the grouping scheme (GS) in Algorithm 2, UT1, UT2, UT3, UT4, and UT7 should be put into one group, and the remaining UTs should be put into another group. To better demonstrate the effectiveness of GS, another random grouping scheme is adopted for comparison, where the UT1 to UT5 are put into the same group, and the UT6 to UT10 are put into another group. Accordingly, the estimation on DoA angles is shown in Figure 5.
In Figure 5, the pentagram is used to depict the estimation on DoA angles, the red circle is used to represent the real DoA angles, and the pink dotted circle is used as an indicator. From Figure 5(a), we can observe that the BS can distinguish all paths from different UTs when the proposed GS in Algorithm 2 is adopted. Even the real DoA angles from the two UTs are very close, they can be differentiated by the BS. On the contrary, in Figure 5(b), the pink circle with the dotted line shows that two overlapped circles only contain one corresponding pentagram, which means the system is failed to distinguish the paths between UT1 and UT5. Therefore, when multiple UTs are present in the system, the GS is necessary for the BS to achieve accurate CSI via the proposed channel estimation and tracking scheme.
6.5. Multiple UTs Tracking. Due to the mobility of UTs, tracking the CSI accurately needs to run the GS at the BS periodically. Therefore, the periodicity to execute the GS at When the random grouping scheme is adopted, the overlapping paths from different UTs cannot differentiated. the BS is the key to guarantee tracking accuracy. Without loss of generality, the scenario with two UTs is simulated in this section. In the simulation setup, we assume that the signal transmission from each UT to the BS will go through the same path. Moreover, different initial DoA angles and step size are simulated for comparison.

12
Wireless Communications and Mobile Computing Accordingly, the period to execute Algorithm 2 is denoted as k time slots. To evaluate the effect of the period on the tracking accuracy, a coefficient, denoted as τ, is defined as where b θðkÞ i,j and b ϕðkÞ i,j are the estimations of elevation and azimuth angles of the ith UT in the jth time slot, respectively; J is the total running time slots, which is set to J = 600. τðkÞ implies the average estimation error on the DoA angles on both elevation and azimuth directions.
The tracking accuracy with different periods is illustrated in Figures 6 and 7, which indicates that optimal GS period is directly related to the DoA angle changing rate and the initial DoA angle difference on elevation direction associated to different UTs. We can observe that the tracking accuracy decreases when the period, k, increases. Optimal period for GS is inversely proportional to the changing rate and proportional to the initial DoA angle difference on elevation direction. Moreover, it is noteworthy that the curve with the diamond marker is stationary even if the difference in the azimuth DoA angles between two UTs is small. Actually, the difference in the azimuth angles has a little effect on the tracking accuracy when the elevation angles of UTs are distinguishable.

Conclusion
In this paper, we have proposed a CSI estimation and tracking method for millimeter wave massive MIMO system. We have illustrated the CSI estimation from two aspects: computational complexity and pilot contamination. First, to reduce the computational complexity, a PCA-based algorithm is applied to recover the eigenvectors of the received signal at the BS. Then, with estimated eigenvectors, DoA angles can be estimated by using the MUSIC algorithm and inner products operation. To reduce the pilot overhead, we have utilized the grouping method to divide UTs into different groups. Then, UTs in the same group can share the same pilot sequence and transmit simultaneously without causing interference. In order to express the convergence speed of the proposed algorithm, a BLS algorithm is used for comparison. Compared to this algorithm, the proposed algorithm has faster convergence. Meanwhile, most existing methods are onetime estimation schemes, which need to be rerun after the channel coherence time, but the proposed method possesses good tracking behavior. The numerical simulations are established to verify the superior performance of the proposed method.
Notations ðXÞ T : The transpose of a matrix X ðXÞ * : The conjugate of a matrix X ðXÞ H : The Hermitian of a matrix X

Ef·g:
The statistical expectation det ðXÞ: The determinant of matrix X trðXÞ: The trace of matrix X k·k: The Euclidian norm d·e: The maximum integer operator I: The identity matrix ℂ: The complex domain ε: The Tanimoto coefficient θ, ϕ: The DoA angle Y: The received signals H: The channel matrix C: The covariance matrix <J, K > : The standard inner product of two matrix J and K.

Data Availability
No data available.

Conflicts of Interest
The authors declare that they have no conflicts of interest.