Data-Aided Maximum Likelihood Joint Angle and Delay Estimator Over Orthogonal Frequency Division Multiplex Single-Input Multiple-Output Channels Based on New Gray Wolf Optimization Embedding Importance Sampling

In this paper, we propose a new data-aided (DA) joint angle and delay (JADE) maximum likelihood (ML) estimator. The latter consists of a substantially modified and, hence, significantly improved gray wolf optimization (GWO) technique by fully integrating and embedding within it the powerful importance sampling (IS) concept. This new approach, referred to hereafter as GWOEIS (for “GWO embedding IS”), guarantees global optimality, and offers higher resolution capabilities over orthogonal frequency division multiplex (OFDM) (i.e., multi-carrier and multi-path) single-input multiple-output (SIMO) channels. The traditional GWO randomly initializes the wolfs’ positions (angles and delays) and, hence, requires larger packs and longer hunting (iterations) to catch the prey, i.e., find the correct angles of arrival (AoAs) and time delays (TDs), thereby affecting its search efficiency, whereas GWOEIS ensures faster convergence by providing reliable initial estimates based on a simplified importance function. More importantly, and beyond simple initialization of GWO with IS (coined as IS-GWO hereafter), we modify and dynamically update the conventional simple expression for the convergence factor of the GWO algorithm that entirely drives its hunting and tracking mechanisms by accounting for new cumulative distribution functions (CDFs) derived from the IS technique. Simulations unequivocally confirm these significant benefits in terms of increased accuracy and speed Moreover, GWOEIS reaches the Cramér–Rao lower bound (CRLB), even at low SNR levels.


Introduction
JADE (For the reader's convenience, please find and refer to the full list of all abbreviations adopted in this paper, at the very end, right before the bibliography section) is a crucial operation in many digital receivers.Highly accurate and computationally inexpensive JADE is required in many fields ranging from military applications such as RADAR or SONAR systems to wireless indoor positioning [1,2] and wireless communication systems [3].Moreover, the estimation of AoAs and TDs enable the design of highly accurate localization techniques [4].Many existing works have focused on solving the JADE problem.Most fall under the subspace-based category such as Multiple Signal Classification (MUSIC) [5], ESPRIT [6], and the 2D unitary matrix pencil (UMP) [7].One of the iterative ML estimators based on the space-alternating generalized expectation maximization (SAGE) algorithm was proposed in [8].The approach in [9] mainly targets only the DOA estimation.It introduces a DOA estimation algorithm in a full-dimension MIMO system by solving the maximum likelihood estimation using expectation-maximization (EM) algorithm.Recently, a tensor-based approach for channel and target parameter estimation was proposed in [10].Also, a non-iterative ML estimator that tackles the JADE problem in a multi-carrier transmission context was developed in [11] using the IS technique.More recently, we proposed a new ML JADE solution in a non-DA (NDA) single-carrier scenario [12].Shortly after, we have tackled the same scenario by making IS initialize the differential evolution (DE) technique in [13].Referred to hereafter as IS-DE, it was a first attempt to tackle JADE by exploiting a bio-inspired optimization approach.
Under this category, GWO has notably been used to optimize and solve many engineering problems.The best-known applications are numerical simulations and stability fields [14,15], feature acquisition selection, dataset classification, neural networks training [16], etc.A multi-objective GWO was developed for cloud computing in [17], and for wireless sensor networks in [18].It was adapted to a multi-robot application in [19], and to unmanned aerial vehicles (UAVs) in [20].A GWO-based optimal channel estimation technique was proposed for large-scale MIMO in LTE networks in [21].A dragonfly-evaluated gray wolf optimization (DA-GWO) model was introduced in [22], which hybridizes the concepts of dragonfly algorithm and GWO for channel estimation in millimeter wave massive MIMO system.In [23], we find also that the GWO algorithm was used only for direction of arrival (DoA) estimation.
Nevertheless, GWO suffers from slow convergence, limited solution accuracy, and susceptibility to getting trapped in local optima.Many improvements to GWO were proposed in different applications [24][25][26][27], but none were to tackle JADE, to the best of our knowledge.
In this paper, we exploit GWO [28] to solve JADE over OFDM SIMO transmissions in multi-path environments.The main idea consists of improving the GWO by initializing the wolf positions using the IS technique instead of random positions.More importantly, and beyond simple initialization of GWO with IS (coined as IS-GWO hereafter), we modify and dynamically update the conventional simple expression for the convergence factor of the GWO algorithm that entirely drives its hunting and tracking mechanisms by accounting for new CDFs derived from the IS technique.Numerical assessments will confirm the advantages of the proposed GWOEIS over IS, GWO, IS-GWO, DE, IS-DE, and other stateof-the-art JADE solutions in terms of estimation accuracy, population or sample size (e.g., number of wolves), global convergence, and convergence speed.The remainder of this paper is organized as follows.Section 2 introduces the multi-carrier SIMO system model in multi-path environments.Section 3 addresses JADE, first by deriving the concentrated likelihood function (CLF), then the IS technique, and ultimately the main common algorithmic steps and the key variations we introduced to some that ultimately encompass GWO, IS-GWO, and the proposed GWOEIS.Section 4 discusses our computer simulations and results, whereas Section 5 concludes our work.The adopted notations are as follows.Vectors and matrices are represented in lowerand upper-case bold fonts, respectively.Moreover, {.} T and {.} H denote the conjugate and Hermitian (i.e., transpose conjugate) operators.The Euclidean norm of any vector is denoted as ||.||, and I N denotes the (N × N) identity matrix.For any matrix X, [X] l , and [X] p,l , denote its lth column and (p, l)th entry, respectively.The kronecker product of any two matrices X and Y is denoted as X ⊗ Y.For any vector x, [x] p denotes its pth entry or element, and diag{x} refers to the diagonal matrix whose elements are those of x.Moreover, |.| returns the modulus of any complex number.Finally, j is the pure imaginary number (i.e., j 2 = −1), and the notation ≜ is used for definitions.

System Model
We consider a SIMO OFDM system characterized by a single transmitting and P receiving antenna elements and K sub-carriers.At each time period, this system transmits over these sub-carriers K symbols s = [s 1 , s 2 , . . ., s K ] T , all belonging to an M-ary constellation alphabet C M , and are assumed to be known at pilot transmission periods (i.e., replaced a priori by 1) in the DA-type estimation scheme we are adopting here.The transmit data then goes through a multi-path channel consisting of different paths, whose number Q is assumed to be known.
The resulting multi-path SIMO channel is characterized by τ max can be chosen to be as large as desired; U ([v min , v max ]) denotes a uniform distribution over the interval [v min , v max ]; and All these three parameter vectors are assumed to be unknown.At the receiver side, the observation signal, x p (k), over the pth antenna and the kth sub-carrier, is given by: where n p (k) is an additive white Gaussian noise (AWGN) with zero mean and variance σ 2 , and h p (k) is the channel frequency response (CFR), defined as follows: where ∆ f is the sub-carrier spacing.By stacking, the scalar signal observation in (1) received at each kth sub-carrier into a single observation vector where n(k) = [n 1 (k), n 2 (k), . . ., n P (k)] T is an i.i.d. and spatially uncorrelated zero-mean Gaussian noise vector and h(k) is the P × 1 CFR vector over all antennas, defined as: where Ā(θ) ≜ [a(θ 1 ), a(θ 2 ), . . ., a(θ Q )] is a P × Q steering matrix, a(θ) is the P × 1 steering vector at AoA θ defined for simplicity and without loss of generality here over a uniform linear array (ULA) as: and ), with no impact at all on what follows.
For an even more compact notation, we now stack the P × 1 CFR vector h(k) over all sub-carriers to obtain the KP × 1 total CFR vector H as follows: where A(θ) is the KP × KQ steering matrix defined as: and D(τ) is the KQ × Q TDs matrix defined as: Hence, at pilot period transmissions where sub-carriers are not modulated (i.e., s k = 1), after we stack both x(k) and n(k) in the same way we did to transform h(k) into H, we obtain:

Derivation of the CLF
At any given pilot transmission period, we can derive the log-likelihood function (LLF) that depends on all three unknown parameter vectors θ, τ, and γ as follows: Hence, we can estimate γ using the least squares (LS) solution as follows: where By injecting γ LS into (10), we obtain the CLF: Hence, we can obtain the joint ML estimates of θ and τ as the optimal solution Ξ opt to the following optimization problem:

Overview/Summary of IS for ML DA JADE
We start by approximating B H B in (12) as follows: where I Q denotes the (Q × Q) identity matrix.Then, we plug ( 14) into ( 12) to obtain: Now, injecting the expressions of Ā(θ) and D k (τ) into (15), we obtain the approximate CLF: where ψ(θ, τ) is the so-called periodogram of the observation signal [11] given by: Relying on the observations made above, we summarize the IS process in [11] of generating the R realizations of the AoA-TD couples according to the following steps: Step (1): we start by evaluating the periodogram ψ(θ i , τ j ) in ( 17) at all grid points (θ i , where Λ(v min , v max , δ v ) denotes the set of points obtained over the interval [v min , v max ] with a uniform sampling step δ v .
Step (2): we evaluate the so-called joint pseudo-pdfs [11] set of values over the above AoA-TD grid as follows: where ρ 1 is a design parameter to be chosen properly later on.
Step (3): we evaluate the marginal pseudo-pdf of τ j as follows: Then, we can find the initial TD estimates that correspond to the Q maxima of ( 19) as: where argmax | Q ( f ) returns the Q maxima of the function f .
Step (4): for q = 1, 2, • • • , Q, we compute the pseudo-CDF of τ j as follows: where Step (5): . Then, we apply a linear interpolation to obtain the rth TD realization: Step (6): We evaluate the conditional pseudo-pdf of θ given τ = τ0 Then, we obtain the initial Q AoA estimates as: Step (7): similarly to Step 4, we compute the conditional pseudo-CDF as: where Step (8): Similarly to Step 5, we generate R realizations u (r) Then, we apply a linear interpolation to obtain the rth AoA realization: Thus, we are able to generate R JADE realizations as: where θ(r Q is the AoAs vector estimate, and τ(r Q is the TDs vector estimate.These realizations readily enable the direct implementation of an IS ML solution to (13) as:

GWO vs. Combining|Embedding IS (IS-GWO|GWOEIS)
GWO is inspired by the leadership hierarchy and the hunting mechanism of gray wolves in nature [28].In a pack or population of, say, R ′ individuals, we identify four types of gray wolves that emulate their leadership hierarchy: the α-type are responsible for making hunting decisions (representing the solutions with best results).The β-type help the α-type in decision making and act as their best substitute-candidates when one of them becomes old or dies (second-best solutions in the population).The δ-type have only to submit to the α-and β-types (third-best solutions).And the ω-type are of the lowest rank, and must yield to the dominant ones.Guided by this "social" hierarchy's rules, gray wolves proceed to hunt along three main and consecutive stages: (1) tracking, chasing, and approaching the prey; (2) pursuing, encircling, and harassing it once it stops moving; and (3) attacking it.
Mathematically and generally speaking, GWO translates the leadership hierarchy and the hunting mechanism summarily described above due to lack of space into an optimization by search (i.e., hunting) in any multi-dimensional space whose best solution (i.e., prey) that minimizes a given criterion (i.e., so-called "fitness function") is found in an iterative manner by mimicking the gray wolves (i.e., search agents) hunting behavior (i.e., search adaptation rules).
In the present case, Ξ opt ≜ [θ opt , τ opt ] and −F c θ, τ in (13) stand, respectively, for the prey to be hunted and the fitness function to be minimized in a 2Q-dimensional space.Hence, GWO translates as follows: Step (1): first, the wolves' positions are initialized in the 2Q space according to one of the following cases (a) or (b).

Step (1.a) [GWO]:
The conventional GWO initially places the wolves pack of R ′ = R individuals at random positions in the 2Q search space Σ where Hence, it requires larger packs and longer hunting (iterations) to catch the prey, i.e., find the correct angles of arrival (AoAs) and time delays (TDs) without guaranteeing global convergence.

Step (1.b) [IS-GWO or GWOEIS]: Instead of random initial placement, IS-GWO and GWOEIS position the wolves at Σ
, stemming from the R ′ = R realizations generated in (29).Hence, even with relatively less realizations, this still guarantees global convergence, and it would always provide good-enough rough initialization values to GWOEIS to make the latter converge much faster and more accurately with relatively less hunting iterations.
Step (2): at each iteration {t} T H t=1 over the hunt duration T H , −F c Σ is evaluated over each individual Σ (r) t−1 in the pack, and the fittest three that better minimize it are identified as Σ α t−1 , Σ β t−1 , and Σ δ t−1 , respectively.
Step (3): the so-called convergence factor a t guiding the hunt is updated according to one of the following cases (a) or (b).

Step (3.a) [GWO or IS-GWO]:
) is simply set to decrease linearly from 2 to 0 over the hunt duration T H . Therefore, the positions of the wolves to converge to local minima.

Step (3.b) [GWOEIS]:
To improve and speed up convergence, instead of a common convergence factor, each realization or individual in the pack is assigned one of its own that accounts both for the AoA and TD pseudo-CDFs calculated in Steps ( 4) and ( 7) of the IS technique (cf.Section 3.2) as follows: where u (r) where (S) i denotes the i-th element of the set S. τ0 q and θ0 q are the initial TD and AoA IS estimates obtained in ( 20) and ( 25), respectively, and j ′ = j − Q.
When using the linearly decreasing factor in Step (3.a), the positions of the wolves can converge to local minima.To mitigate this issue, we generate for each individual a specific convergence factor that decreases with the iteration index while accounting through the pseud-CDF values for the distance between the gray wolves and the prey.As long as a wolf is far away from the prey, the uniform variable will generate realizations closer to 1. Once this wolf gets closer to the prey, the realization becomes quasi-static since, the convergence factor is then mainly scaled by 1/t.
Step (4): For each lead gray wolf * = α, β, or δ, we generate two random values, b * j and d * j , in U ([0, 1]) for the calculation of two update coefficients g * j,t (or g * ,(r) j,t with respect to each realization or individual r in the pack), and c * j according to one of the following cases (a) or (b).

Step (4.a) [GWO or IS-GWO]:
In other words, the update coefficients g * j,t and c * j are assigned random values in [−a t , a t ] and [0, 2], respectively.

Wolf (r)
Mo ve

Simulation Results
In this section, we assess the performance of the proposed GWOEIS solution and other key benchmarks for comparisons in terms of the root mean square error (RMSE) or the normalized MSE (NMSE) over a total number of Monte-Carlo runs M c = 1000.Following the IEEE 802.11ac standard (see [7] and first reference therein), we consider a bandwidth B = 80 MHz with a sub-carrier spacing ∆ f = 312.5 KHz giving a total of 256 sub-carriers, among which 11 are exploited for network signaling purposes and the remainder are payload carriers (i.e., K = 122).We also consider P = 6 antennas and Q = 2 equi-powered paths with AoAs 20 • and 45 • and TDs 25 ns and 62.5 ns, respectively.Moreover, we set ρ 1 = 4, δ τ = 1.25 ns, δ θ = 0.1 • , ∆ τ = 18.75 ns, and ∆ θ = 10 • .
In Figures 3 and 4, we evaluate the RMSE/NMSE performance versus the SNR to compare the new GWOEIS solution against the Cramér-Rao lower bound (CRLB) of [6], the UMP algorithm in [7], the classic GWO in [28], the classic DE [13], the IS technique in [11], IS-DE that simply combines DE with IS in [13], and another benchmark version developed here by simply combining this time IS with GWO, referred to as IS-GWO.As shown in Figures 3 and 4, our approach outperforms all other estimation techniques, both in terms of TD and AoA estimations.Additionally, it reaches the CRLB, even at low SNR levels and even with a very low value either parameter R = 30 and T H = 30.We also observe a severe performance degradation of the original GWO and DE techniques due to the high dimension (i.e., 2Q) of the optimization problem.When combined with IS, IS-DE improves a little but is outperformed by IS-GWO, and more so by GWOEIS.
In Figures 3c and 4c, we compare the channel NMSE using the estimates of the TDs and AoAs assuming a perfect knowledge of the channel gains.Our approach remarkably outperforms all other estimation techniques and reaches the lower bound, even at very low SNR levels and even with a very low value either parameter R = 30 and T H = 30.
In Figures 5 and 6, we assess the impact of the samples size R and the hunting duration T H on channel estimation performance versus the SNR.Here, the sample size R refers to the number of realizations with IS, to the wolves pack size with GWO, IS-GWO, and GWOEIS, or the number of individuals with DE and IS-DE.The number of iterations (e.g., T H ) is defined only for iterative approaches, i.e., GWO, IS-GWO, GOWEIS, DE, and IS-DE solutions.We notice that both parameters can have a detrimental effect on the exploration abilities of these techniques.Thus, they need to be set as large as possible to cover the widest area of the multidimensional space.In Figure 5(a,i,b,i,c,i), we assess the estimation performance of the TD, the AoA, and the channel, respectively.GWOEIS, IS-GWO, and IS performs better than DE, GWO, and IS-DE.Figure 5(a,ii,b,ii,c,ii) show the minimum RMSE over all techniques for TD, AoA, and channel estimation, respectively.The RMSE decreases when R and SNR increase.In Figure 5(a,iii,b,iii,c,iii), we can see that GWOEIS achieves the best performance in terms of TD, AoA, and channel estimation over R ≥ 20 and all SNR values.IS, IS-GWO, GWO, DE, and IS-DE can not match the performance of GWOEIS, even with R = 1000, but at the expense of significant increase in computational cost.
In Figure 6, we assess the impact of T H on RMSE performance of the iterative techniques.Once again, GWOEIS outperforms all other iterative techniques in terms of TD, AoA, and channel estimation.By increasing the number of iterations, we reach the best performance for the all algorithms, as shown in Figure 6(a,ii,b,ii,c,ii).Moreover, in Figure 6(a,iii,b,iii,c,iii), we notice that a small order of 10 iterations is enough to put GWOEIS on the top of the rest, with estimation accuracy gains constantly increasing with T H , making its potential gains in computational complexity remarkably large when compared to other techniques that require a very high number of iterations.
In Figure 7, we investigate the effect of the number of paths on the estimation performance.We observe that the new approach, GWOEIS, outperforms most of the benchmarks in the high and low SNR scenarios for TD, AoA, and channel estimation when Q ≤ 4. We consider also the same configuration to evaluate the performance of all of the approaches in terms of temporal and angular separations.In Figure 8, we fix the first delay at τ 1 = 2 T, and we vary the second delay at τ 2 = τ 1 + ∆τ.It is clearly seen that our approach is still capable of achieving the CRLB, even in a challenging scenario with a very small temporal separations of ∆τ = 0.5 T. In Figure 9, we fix the first AoA at θ 1 = −20 • , and we vary the second AoA θ 2 = θ 1 + ∆θ.Here, again, we can highlight the robustness and the super-resolution capacity of our approach, and to appreciate its superiority in challenging scenarios where the paths are closely spaced in both temporal and spatial domains.We have to mention that Figure 8

Conclusions
In this work, we presented a new DA ML JADE estimator over OFDM SIMO multipath channels referred to as GWOEIS, whose key innovation lies in exploiting and embedding the powerful IS concept to avoid the random initialization issues of the traditional GWO, and to significantly improve the hunting mechanism.GWOEIS ensures faster convergence by providing initial estimates based on a simplified importance function.More importantly, and beyond simple initialization of GWO with IS, we modify and dynamically update the conventional simple expression for the convergence factor of the GWO algorithm that entirely drives its hunting and tracking mechanism by accounting for new CDFs derived from the IS technique.The latter significantly boost the estimation performance.Overall, the simulation results show more accurate estimation performance at faster convergence rates with GWOEIS.

Step ( 5 ):
Let dist( * , (r), j) = c * j (Σ * t−1 ) j − (Σ (r) t−1 ) j denote the distance between the lead wolf * and the rth individual in the pack (or search agent) across the j-th dimension.The lead wolves' positions are then updated with respect to each realization r in the pack through intermediate variables χ (r) * according to one of the following cases (a) or (b).

Figure 7 .
Figure 7. MSE vs. the SNR in dB and the number of paths Q for T H = 100 and R = 100.
(a,iii,b,iii,c,iii) show the superiority of UMP reaching the best techniques achieving minimum RMSE only when ∆ τ ≥ 10/B.

Figure 8 .
Figure 8. RMSE vs. the SNR in dB and the temporal separation ∆ τ in ns with Q = 2, R = 100, and T H = 100.

Figure 9 .
Figure 9. RMSE vs. the SNR in dB and the angular separation ∆ θ in degrees with Q = 2, R = 100, and T H = 100.