Robust Bearing-Only Localization Using Total Least Absolute Residuals Optimization

Robust techniques critically improve bearing-only target localization when the measurements are corrupted by impulsive noise. The conventional least absolute residual (LAR) method is resistant to isolated gross errors, and its estimate can be determined by linear programming once the pseudolinear equations are set up. The LAR approach, however, cannot reduce the bias attributed to the correlation between the system matrix and the noise vector. In the present study, perturbations are introduced into the elements of the system matrix and the data vector simultaneously, and the total optimization problem is formulated based on least absolute deviations. Subsequently, an equivalent form of total least absolute residuals (TLAR) is obtained, and a dual ascent algorithm is developed to calculate the robust estimate. The performance of the proposed method is verified through numerical simulations using two types of localization geometries, i.e., random and linear. The results show that the TLAR algorithm achieves significantly higher localization accuracy than the LAR method.

On the whole, the existing robust bearing-only target localization (BOTL) methods for handling outlier data fall into two categories: outlier detection [17] and M-estimation [18]. The outlier detection method first detects suspected outlier data, separates them from the original data set, and then exploits the remaining data to complete the localization task. Picard and Weiss [17] proposed a sparse representation method to detect the outlier data of time-of-arrival, time-difference-of-arrival, and direction-of-arrival, solved by linear programming. Xiong et al. [19] developed a robust expectation-maximization algorithm for distance outlier detection.
Though the outlier detection method is intuitive and effective, it does not apply to large data sets or complex application scenarios. The other important approach is the M-estimate, which estimates robust positions without preprocessing the data. The M-estimate replaces the least-squares criterion with a criterion that is more robust to impulsive noise, so that the estimator becomes less sensitive to model errors.
Such an algorithm employs the bi-square function as the cost function of the M-estimate. Panigrahi et al. [22] proposed a distributed incremental least mean square algorithm based on the Wilcoxon norm for parameter estimation in sensor networks. Wu et al. [25] proposed a robust structured total least-squares algorithm for passive localization. The algorithm adopts the optimized Danish weight function to reduce the effect of outlier data on the localization performance. Furthermore, the $L_p$-norms ($1 \le p < 2$) are useful for robust estimation since less weight is given to isolated deviations. When the $L_1$-norm is applied, the M-estimate is termed the least absolute residuals (LAR) method [26]. For outlier suppression, the $L_1$-norm appears to be markedly superior among the $L_p$-norms ($1 \le p < \infty$) [27].
From a statistical perspective, least-squares optimization equals maximum likelihood (ML) estimation when the measurements are corrupted with independent and identically distributed Gaussian noise. However, the noise distribution is altered by the presence of outliers. The Laplace distribution is a probability distribution that accommodates large residuals. The ML estimator of this distribution leads to the LAR solution [28]. Though the ML method is statistically optimal, its cost function is nonlinear and nonconvex with respect to the target location parameters. An iterative numerical search is therefore inevitable for ML, and the absolute norm has to be handled in each iteration. Such iterative algorithms tend to diverge when poorly initialized and are computationally expensive.
To remedy the defects of the ML method, [29] proposed a pseudolinear estimator (PLE) by lumping the nonlinearities into the noise term. However, the PLE is subject to severe bias [30], and the bias remains as the number of sensors increases because of the correlation between the system matrix and the measurement noise. Various algorithms have been proposed to reduce the bias, including instrumental variable (IV) [31,32] and total least-squares (TLS) [33]. The IV method [31] reduces the bias by setting up an instrumental matrix that is asymptotically uncorrelated with the noise vector. In contrast to the IV method, the TLS algorithm [33] attempts to reduce the bias by minimizing the errors in both the system matrix and the measurement vector. However, the IV and TLS estimators fail to improve the bias performance if the measured angles contain gross errors.
In brief, robust pseudolinear algorithms for BOTL aim to reduce the biases attributed to both large residuals and the correlation between the system matrix and the noise vector. The least absolute residuals (LAR) minimization can be adopted to reduce the bias attributed to outlier data. However, LAR faces a major problem: the correlation bias remains as the number of sensors increases. In the present study, these two types of bias are reduced by conducting total least absolute residuals (TLAR) optimization [34]. We first formulate the problem of BOTL subject to outlier data. Moreover, the pseudolinear measurement model for BOTL is reviewed, and the bias of PLE is analyzed. Subsequently, the TLAR algorithm is developed for BOTL with significantly reduced bias and root mean square error as compared with the LAR method.
The main contributions of the proposed method can be summarized as follows:
(i) Development of a new bias-reduced estimator based on TLAR for BOTL under bearing gross errors
(ii) Development of an algorithm for TLAR optimization using dual ascent algorithms
(iii) Demonstration of the performance improvement achieved by the TLAR estimator with respect to the LAR method
The rest of this paper is organized as follows. In Section 2, the measurement model is described. In Section 3, the pseudolinear equations from the BOTL problem are reviewed, the two types of bias for the PLE method are analyzed, and a LAR solution is presented when bearing measurements are subjected to large deviations. In Section 4, the TLAR approach is presented, and an algorithm is developed for TLAR based on dual ascent algorithms in Section 5. In Section 6, numerical examples illustrating the performance of PLE, TLS, LAR, and TLAR are presented. Lastly, conclusions are drawn in Section 7.

Model Description
A group of sensors is formed by $K_n$ normal sensors and $K_a$ abnormal sensors, with $K_n$ significantly larger than $K_a$. The total number of sensors is $K = K_n + K_a$. Each node measures a bearing, i.e., the angle between the positive horizontal direction and the straight line connecting the node and the target. The normal sensors produce valid measurements, while the abnormal sensors collect wrong observations caused by object occlusion, interference, network attack, etc. It is assumed that the bearing measurements containing outliers cannot be identified in advance.
Thus, robust localization methods should be designed to prevent the degradation of localization performance attributed to outlier data. The problem of robust BOTL is to estimate an unknown target position in $\mathbb{R}^2$ as accurately as possible from all $K$ bearing measurements. The localization geometry is illustrated in Figure 1, where $p = [p_x, p_y]^T$ denotes the target position vector, $s_k = [s_{x,k}, s_{y,k}]^T$ represents the sensor location vector for the $k$th measurement, and $\theta_k$ is the true bearing at sensor $k$, measured positive in the counterclockwise direction, $k = 1, 2, \ldots, K$. The relationship between the bearing angle, target position, and sensor location is expressed as the following nonlinear equation:
$$\theta_k = \tan^{-1}\left(\frac{p_y - s_{y,k}}{p_x - s_{x,k}}\right), \qquad (1)$$
where $\tan^{-1}$ denotes the four-quadrant arctangent and $\theta_k \in (0, 2\pi]$.
In fact, the observed bearings have errors, and the $k$th measurement can be described as
$$\tilde{\theta}_k = \theta_k + n_k, \qquad (2)$$
where $n_k = \delta \cdot b_k + e_k$ denotes the measurement error; $e_k$ represents the independent and identically distributed (i.i.d.) zero-mean Gaussian noise with variance $\sigma^2$; $b_k$ represents a bias term; and $\delta$ is a binary random variable defined as follows:
$$\delta = \begin{cases} 1, & \text{if the measurement comes from an abnormal sensor}, \\ 0, & \text{otherwise}. \end{cases}$$
Since $\delta$ is not known a priori, reliable and unreliable measurements cannot be distinguished in advance. Accordingly, it is necessary to develop a robust localization algorithm that works when both reliable and unreliable measurements are used.
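The measurement model above can be simulated in a few lines. The following Python sketch is illustrative only: the function names, the sensor layout in the usage lines, and the choice to draw the bias term $b_k$ uniformly from $[-\pi, \pi]$ (borrowed from the simulation setup reported in the experimental section) are assumptions rather than part of the model definition.

```python
import math
import random

def true_bearing(p, s):
    """Four-quadrant bearing from sensor s to target p, mapped to (0, 2*pi]."""
    theta = math.atan2(p[1] - s[1], p[0] - s[0]) % (2.0 * math.pi)
    return theta if theta > 0.0 else 2.0 * math.pi

def simulate_bearings(p, sensors, abnormal, sigma, rng):
    """Noisy bearings theta_k + delta * b_k + e_k.

    `abnormal` is the set of sensor indices for which delta = 1.
    Assumption: b_k ~ U[-pi, pi], e_k ~ N(0, sigma^2).
    """
    measured = []
    for k, s in enumerate(sensors):
        e_k = rng.gauss(0.0, sigma)
        delta = 1 if k in abnormal else 0
        b_k = rng.uniform(-math.pi, math.pi)
        measured.append(true_bearing(p, s) + delta * b_k + e_k)
    return measured

# Usage: three sensors, the last one abnormal, sigma = 1 degree.
rng = random.Random(0)
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
bearings = simulate_bearings((5.0, 5.0), sensors, {2}, math.radians(1.0), rng)
```

The estimator never sees the set `abnormal`; it is used only to generate data, which mirrors the assumption that outliers cannot be identified beforehand.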

The Least Absolute Residuals Method
To develop a robust localization method for BOTL, we first review the PLE method by converting nonlinear bearing measurements to pseudolinear equations. We then derive the least absolute residual (LAR) algorithm for BOTL. Lastly, two types of bias for PLE are analyzed.

Pseudolinear Equation.
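The measurement equation in (2) is nonlinear with respect to the unknown target location, which makes BOTL a nontrivial task. A natural option, illustrated in the following, is to form a pseudolinear equation by lumping the nonlinearities into the noise term. To this end, an orthogonal vector sum is first established between the measured angle vector and the true angle vector from Figure 1, given by
$$u_k = \tilde{u}_k + \varepsilon_k,$$
where $u_k = p - s_k$ denotes the true angle vector between $s_k$ and $p$; $\tilde{u}_k$ represents the measured angle vector, which starts from $s_k$ and makes the noisy bearing $\tilde{\theta}_k$ with the horizontal direction; and $\varepsilon_k$ indicates the error vector. By defining $\alpha_k = [\cos\tilde{\theta}_k, \sin\tilde{\theta}_k]^T$ and $\beta_k = [\sin\tilde{\theta}_k, -\cos\tilde{\theta}_k]^T$ as two orthogonal unit trigonometric vectors, $\tilde{u}_k$ and $\varepsilon_k$ are written in terms of $\alpha_k$ and $\beta_k$:
$$\tilde{u}_k = \|u_k\|_2 \cos n_k \, \alpha_k, \qquad \varepsilon_k = \|u_k\|_2 \sin n_k \, \beta_k.$$
Using the fact that $\beta_k^T \alpha_k = 0$ and $\beta_k^T \beta_k = 1$, multiplying the vector sum by $\beta_k^T$ yields
$$\beta_k^T p - \beta_k^T s_k = \xi_k,$$
where $\xi_k = \|u_k\|_2 \sin n_k$ is a nonlinearly transformed measurement error. Collecting the pseudolinear equation errors for $k = 1, 2, \ldots, K$ gives
$$\xi = Ap - h,$$
where $A = [\beta_1, \ldots, \beta_K]^T$ and $h = [\beta_1^T s_1, \ldots, \beta_K^T s_K]^T$ are the measurement matrix and vector, respectively. The PLE requires that $p$ be estimated by minimizing $\xi$ with respect to $p$ in the least-squares sense. The position estimate can be obtained by solving
$$\hat{p} = \arg\min_p \|Ap - h\|_2^2 = (A^T A)^{-1} A^T h, \qquad (9)$$
which is termed the pseudolinear estimator (PLE).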
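The construction of the pseudolinear system and the least-squares solution can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the test geometry are hypothetical, and the 2 × 2 normal equations are solved by Cramer's rule to keep the example free of external libraries.

```python
import math

def pseudolinear_system(sensors, bearings):
    """Rows of A are beta_k^T = [sin(theta_k), -cos(theta_k)]; h_k = beta_k^T s_k."""
    A = [(math.sin(t), -math.cos(t)) for t in bearings]
    h = [a[0] * s[0] + a[1] * s[1] for a, s in zip(A, sensors)]
    return A, h

def ple(A, h):
    """Least-squares solution of A p = h via the 2x2 normal equations."""
    g11 = sum(a[0] * a[0] for a in A)
    g12 = sum(a[0] * a[1] for a in A)
    g22 = sum(a[1] * a[1] for a in A)
    b1 = sum(a[0] * hk for a, hk in zip(A, h))
    b2 = sum(a[1] * hk for a, hk in zip(A, h))
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("degenerate geometry: A^T A is singular")
    return ((g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det)

# Usage: with noise-free bearings every pseudolinear residual is zero,
# so the least-squares solution coincides with the target.
target = (100.0, 100.0)
sensors = [(10.0, 20.0), (80.0, 5.0), (30.0, 90.0), (60.0, 40.0)]
bearings = [math.atan2(target[1] - s[1], target[0] - s[0]) for s in sensors]
A, h = pseudolinear_system(sensors, bearings)
p_hat = ple(A, h)
```

Because $A$ is built from the measured bearings, any bearing noise enters both $A$ and $h$, which is the source of the correlation bias discussed in the text.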

Bias Analysis.
The bias of $\hat{p}$ obtained from (9) is defined by
$$\eta = E\{\hat{p}\} - p.$$
The bias of $\hat{p}$ includes two parts. The first part is attributed to the correlation between $A$ and $\xi$. The second part is formed by large residuals. Let $A_n$ and $h_n$ be the pseudomeasurement matrix and vector obtained from the normal sensors. Without considering the abnormal measurements, the PLE solution becomes
$$\hat{p}_n = (A_n^T A_n)^{-1} A_n^T h_n.$$
If the number of normal sensors is significantly larger than that of abnormal sensors, the first type of bias can be approximated as
$$\eta_1 \approx -E\{(A_n^T A_n)^{-1} A_n^T \xi_n\}, \qquad (12)$$
where $\xi_n$ denotes the pseudomeasurement noise with its $k$th entry given by $\xi_{n,k} = \|u_k\|_2 \sin e_k$. Based on the Slutsky theorem [33], the first type of bias can be asymptotically computed by
$$\eta_1 \approx -E\left\{\frac{A_n^T A_n}{K_n}\right\}^{-1} E\left\{\frac{A_n^T \xi_n}{K_n}\right\}. \qquad (13)$$
As $K_n$ goes to infinity, (13) becomes an equality. For finite $K_n$, $\eta_1$ obtained by (13) is a good approximation to (12) [35]. After $\eta_1$ is calculated, the second type of bias is $\eta_2 = \eta - \eta_1$. An example of PLE bias is depicted in Figure 2(b), Section 6. The second type of bias appears to dominate if the standard deviation of $e_k$ is small enough. Thus, the performance of PLE degrades dramatically under impulsive noise, since $L_2$-norm optimization can be severely affected by pseudolinear errors with large residuals. To ensure that such items have less influence, we could instead minimize a cost function that gives less weight to large deviations.

Least Absolute Residual.
A common choice to alleviate the effect of gross errors is the absolute value metric denoted by $\chi(\xi) = \|\xi\|_1$, where $\|\cdot\|_1$ represents the $L_1$-norm. As such, the LAR optimization can be achieved by
$$\hat{p} = \arg\min_p \|Ap - h\|_1. \qquad (14)$$
The derivative of $\chi(\xi)$ with respect to each entry of $\xi$ is bounded by ±1 wherever it exists, demonstrating that the cost function $\chi(\xi)$ is robust to all deviations. For the PLE criterion, the derivative is not bounded, and it increases linearly with $\xi$.
In the literature, numerous algorithms have been developed to solve the minimization problem of $\chi(\xi)$ (e.g., iteratively reweighted least-squares (IRWLS) [36], the expectation-maximization (EM) procedure [37], and linear programming [38]). For IRWLS, a weight matrix is defined as
$$W = \mathrm{diag}\left(\frac{1}{|\xi_1|}, \ldots, \frac{1}{|\xi_K|}\right).$$
The WLS algorithm is explicitly given by
$$\hat{p} = (A^T W A)^{-1} A^T W h. \qquad (16)$$
The problem of (16) is that the weights become extraordinarily large for $\xi_k \approx 0$ or numerically indeterminate for $\xi_k = 0$. Benefiting from convex optimization, the LAR problem can instead be relaxed to identify the minimum bound of the absolute value:
$$\min_{p, v} \; \mathbf{1}^T v \quad \text{s.t.} \quad -v \le Ap - h \le v,$$
where $v$ denotes the upper bound of $|\xi|$ and $\mathbf{1}$ is a column vector of ones. To be specific, two nonnegative vectors are introduced, i.e., $r \ge 0$.
Note that (23) refers to a standard linear programming problem, which can be solved by using the existing CVX software [39]. When the pseudolinear errors follow an i.i.d. Laplacian distribution with zero mean, (14) is equivalent to the ML estimator. However, the LAR method implicitly assumes that only $h$ is subject to errors. This is not the case, since the system matrix $A$ is corrupted with measurement noise as well. The correlation between $A$ and $\xi$ causes the LAR estimator to be inconsistent. Though the LAR estimator is biased, it gives a reasonable estimate and provides an initial guess for other robust estimators.
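For readers without an LP solver at hand, the $L_1$ criterion can also be approximated by iteratively reweighted least squares with a floor on the residual magnitude. The sketch below is a stand-in for the linear programming route used in the paper, not a reproduction of it: the floor `eps`, the iteration count, the function names, and the test geometry are all assumptions. The floor addresses the issue noted above that the weights blow up when a residual approaches zero.

```python
import math

def pseudolinear_system(sensors, bearings):
    """Rows of A are [sin(theta_k), -cos(theta_k)]; h_k is the row times s_k."""
    A = [(math.sin(t), -math.cos(t)) for t in bearings]
    h = [a[0] * s[0] + a[1] * s[1] for a, s in zip(A, sensors)]
    return A, h

def weighted_ls(A, h, w):
    """Solve (A^T W A) p = A^T W h for a diagonal W by Cramer's rule."""
    g11 = sum(wk * a[0] * a[0] for wk, a in zip(w, A))
    g12 = sum(wk * a[0] * a[1] for wk, a in zip(w, A))
    g22 = sum(wk * a[1] * a[1] for wk, a in zip(w, A))
    b1 = sum(wk * a[0] * hk for wk, a, hk in zip(w, A, h))
    b2 = sum(wk * a[1] * hk for wk, a, hk in zip(w, A, h))
    det = g11 * g22 - g12 * g12
    return ((g22 * b1 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det)

def lar_irls(A, h, iters=50, eps=1e-8):
    """Approximate argmin ||A p - h||_1 with weights 1 / max(|residual|, eps)."""
    p = weighted_ls(A, h, [1.0] * len(h))  # first pass is the ordinary PLE
    for _ in range(iters):
        r = [a[0] * p[0] + a[1] * p[1] - hk for a, hk in zip(A, h)]
        w = [1.0 / max(abs(rk), eps) for rk in r]
        p = weighted_ls(A, h, w)
    return p

# Usage: exact bearings except for one gross error of 1 rad at sensor index 2.
target = (50.0, 60.0)
sensors = [(0.0, 0.0), (20.0, 0.0), (40.0, 0.0), (60.0, 0.0),
           (80.0, 0.0), (0.0, 40.0), (0.0, 80.0)]
bearings = [math.atan2(target[1] - s[1], target[0] - s[0]) for s in sensors]
bearings[2] += 1.0
A, h = pseudolinear_system(sensors, bearings)
p_ple = weighted_ls(A, h, [1.0] * len(h))
p_lar = lar_irls(A, h)
err_ple = math.hypot(p_ple[0] - target[0], p_ple[1] - target[1])
err_lar = math.hypot(p_lar[0] - target[0], p_lar[1] - target[1])
```

In this toy setup the outlier pulls the least-squares estimate away from the target, whereas the reweighted estimate progressively discounts the inconsistent equation.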

Total LAR Optimization
The LAR algorithm expressed in (10) implicitly indicates that only $h$ has errors and that the gross errors are reduced by weights that are all restricted to $h$. In fact, matrix $A$ is also subject to measurement errors. When both $A$ and $h$ are disturbed with noise, the LAR solution of (9) will inevitably exhibit large bias, as $A$ and $\xi$ are statistically dependent. To increase the accuracy of the LAR estimator, the idea of total least absolute residual (TLAR) can be exploited to reduce the errors in both $A$ and $h$. The concept of TLAR is that the disturbance vector $\Delta h$ is adopted to correct the data vector $h$, while the disturbance matrix $\Delta A$ is employed to correct the data matrix $A$.
In other words, we use the following equations in the TLAR problem:
$$(A + \Delta A)p = h + \Delta h \iff (M + \Sigma)x = 0,$$
where $M = [A, h]$ denotes a $K \times 3$ augmented matrix; $\Sigma = [\Delta A, \Delta h]$ represents a $K \times 3$ perturbation matrix; $\Delta A$ is a perturbation matrix of $A$; $\Delta h$ is a perturbation vector of $h$; $x = \tau \cdot [p_x, p_y, -1]^T$ is a $3 \times 1$ vector; and $\tau$ is a scaling factor. Notably, both $A$ and $h$ are corrupted with noise. Next, the error statistics for $A$ and $h$ are examined. The entries of the error matrix $\Sigma$ are given by the difference between $M$ and the noiseless augmented matrix. If $e_k$ is sufficiently small, it yields $\sin e_k \approx e_k$ and $\cos e_k \approx 1$.
Thus, the error terms $\rho_{k1}$ and $\rho_{k2}$ can be approximated to first order in $e_k$, and the mean and second-order moments of the error items follow. Note that from (23), the mean value of the errors in $M$ is nonzero as impacted by the outlier data. In addition, (24) and (25) are nonzero even if $b_k$ equals zero. It is therefore suggested that the error terms in the same row of $\Sigma$ are correlated with each other. To reduce the bias of LAR more effectively, the TLAR metric is adopted to minimize the disturbance matrix and vector simultaneously, and the TLAR problem for BOTL is formulated as
$$\min_{\Sigma, x} \|\Sigma\|_1 \quad \text{s.t.} \quad (M + \Sigma)x = 0. \qquad (26)$$
If the $L_1$-norm in (26) is replaced by the Frobenius matrix norm, (26) becomes the well-known TLS problem. Both TLAR and TLS fall into the domain of the "total approximation problem," known as the total least $p$th problem for $p \ge 1$. To solve the TLAR problem effectively, an equivalent form of (26) is explored. The following result holds.
Thus, the global minimum is not ensured. Indeed, its local minima can be computed by adopting the Lagrange multiplier formulation of (27).
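One way to see why the search over the perturbation matrix can be reduced to a problem in $x$ alone is sketched below. The derivation rests on assumptions not spelled out in the surviving text: $\|\Sigma\|_1$ is taken to be the entrywise sum of absolute values, and $m_k^T$ and $\sigma_k^T$ denote the $k$th rows of $M$ and $\Sigma$ (notation introduced here for illustration). The middle step is Hölder's inequality, $|\sigma_k^T x| \le \|\sigma_k\|_1 \|x\|_\infty$, with equality when all of $\sigma_k$ is placed on a coordinate where $|x_i|$ is largest.

```latex
\begin{aligned}
(M+\Sigma)x = 0 \;&\Longleftrightarrow\; \sigma_k^T x = -\,m_k^T x, \qquad k = 1,\dots,K, \\
\min_{\sigma_k:\; \sigma_k^T x = -m_k^T x} \|\sigma_k\|_1 \;&=\; \frac{|m_k^T x|}{\|x\|_\infty}, \\
\min_{\Sigma:\; (M+\Sigma)x = 0} \|\Sigma\|_1 \;&=\; \sum_{k=1}^{K} \frac{|m_k^T x|}{\|x\|_\infty}
\;=\; \frac{\|Mx\|_1}{\|x\|_\infty}.
\end{aligned}
```

Under these assumptions, the ratio is invariant to the scaling of $x$, so it can be minimized with $\|x\|_\infty$ fixed to 1. The ratio is not convex in $x$, which is consistent with the remark that only local minima are guaranteed.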

Algorithm Development
In the present section, an algorithm is derived to solve (27). Since the optimization problem (27) is not convex, a stationary point satisfying the first-order necessary conditions is calculated. First, (27) is transformed into an unconstrained minimization problem by the Lagrange multiplier method, with the Lagrange objective function $L(x, \lambda)$ defined in (29). The dual problem of (27) is given by
$$\max_{\lambda} G(\lambda), \qquad G(\lambda) = \min_x L(x, \lambda), \qquad (30)$$
where $\lambda$ denotes the Lagrange multiplier. Let $\lambda^*$ be the optimal value of (30). By substituting $\lambda^*$ into (29), the minimization of $L(x, \lambda^*)$ generates the primal optimal point $x^*$. Such a dual problem can be solved by a dual ascent (DS) algorithm [40] as expressed below.
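(1) $\lambda_j$ is taken as the current solution of the dual problem (30) at the $j$th step.
(2) The primal point $x_{j+1}$ is determined from $\lambda_j$ by minimizing $L(x, \lambda_j)$:
$$x_{j+1} = \arg\min_x L(x, \lambda_j).$$
(3) The dual variable is updated by
$$\lambda_{j+1} = \lambda_j + c_j \nabla G(\lambda_j),$$
where $c_j > 0$ is a step size. The $\lambda$-update is realized by gradient ascent. With a proper choice of $c_j$, $G(\lambda)$ increases at each step, i.e., $G(\lambda_{j+1}) > G(\lambda_j)$.
Next, we investigate the problem of minimizing $L(x, \lambda_j)$. Based on the Taylor series expansion of $L(x + \alpha q, \lambda_j)$ around $x$ up to the first order, it yields
$$L(x + \alpha q, \lambda_j) \approx L(x, \lambda_j) + \alpha \nabla L(x, \lambda_j)^T q.$$
Note that $\nabla L(x, \lambda_j)^T q = \|\nabla L(x, \lambda_j)\|_2 \|q\|_2 \cos\beta$, where $\beta$ is the angle between $\nabla L(x, \lambda_j)$ and $q$. Thus, the steepest descent direction is $q = -\nabla L(x, \lambda_j)$. Let $\partial\|Mx\|_1$ denote the subdifferential of $\|Mx\|_1$, where
$$\partial\|Mx\|_1 = M^T g, \qquad (34)$$
$g$ is a $K \times 1$ subgradient vector, and its $k$th element is given by
$$g_k = \begin{cases} \mathrm{sign}(m_k^T x), & m_k^T x \ne 0, \\ \text{any value in } [-1, 1], & m_k^T x = 0, \end{cases}$$
with $m_k^T$ the $k$th row of $M$. The subdifferential of $\|x\|_\infty$ at $x \ne 0$, denoted by $\partial\|x\|_\infty$, is defined by
$$\partial\|x\|_\infty = \mathrm{conv}\{\mathrm{sign}(x_i)\, o_i : |x_i| = \|x\|_\infty\}, \qquad (36)$$
where conv denotes the convex hull and $o_i$ is a vector whose $i$th element is 1 and all other elements are 0. Based on (34) and (36), $\nabla L(x, \lambda_j)$ can be written in terms of $M^T g$ and an element of $\partial\|x\|_\infty$.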
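The steps above can be sketched in Python as follows. This is a simplified illustration under explicit assumptions, not the authors' algorithm: the constrained problem is taken to be the minimization of $\|Mx\|_1$ subject to $\|x\|_\infty = 1$, the Lagrangian is assumed to be $L(x, \lambda) = \|Mx\|_1 + \lambda(1 - \|x\|_\infty)$, the inner minimization is replaced by a few normalized subgradient steps, and the step sizes `c` and `alpha0` are arbitrary. All function names are hypothetical. Since the problem is nonconvex, the routine keeps the best rescaled iterate and offers no convergence guarantee.

```python
import math

def l1_of_Mx(M, x):
    """||M x||_1 for a list-of-rows M and a 3-vector x."""
    return sum(abs(sum(m * v for m, v in zip(row, x))) for row in M)

def inf_norm(x):
    return max(abs(v) for v in x)

def sign(v):
    return (v > 0) - (v < 0)

def lagrangian_subgradient(M, x, lam):
    """One subgradient of the assumed L(x, lam) = ||Mx||_1 + lam * (1 - ||x||_inf)."""
    grad = [0.0, 0.0, 0.0]
    for row in M:
        g_k = sign(sum(m * v for m, v in zip(row, x)))  # 0 is chosen when the residual is 0
        for i in range(3):
            grad[i] += g_k * row[i]
    i_max = max(range(3), key=lambda i: abs(x[i]))  # one vertex of the convex hull
    grad[i_max] -= lam * sign(x[i_max])
    return grad

def tlar_dual_ascent(M, x0, outer=100, inner=20, c=0.1, alpha0=1e-3):
    """Return the best x with ||x||_inf = 1 found and its value ||Mx||_1."""
    scale = inf_norm(x0)
    if scale == 0.0:
        raise ValueError("x0 must be nonzero")
    x = [v / scale for v in x0]
    best_x, best_val = x[:], l1_of_Mx(M, x)
    lam = 0.0
    for j in range(outer):
        step = alpha0 / (1 + j)  # diminishing step length
        for _ in range(inner):   # approximate x-minimization
            q = lagrangian_subgradient(M, x, lam)
            qn = math.sqrt(sum(v * v for v in q))
            if qn == 0.0:
                break
            x = [v - step * qv / qn for v, qv in zip(x, q)]
        nx = inf_norm(x)
        if nx > 1e-12:
            cand = [v / nx for v in x]
            val = l1_of_Mx(M, cand)
            if val < best_val:
                best_x, best_val = cand, val
        lam += c * (1.0 - nx)    # gradient ascent on the dual variable
    return best_x, best_val

def position_from_x(x):
    """x = tau * [p_x, p_y, -1]^T, hence p = -x[:2] / x[2] (requires x[2] != 0)."""
    return (-x[0] / x[2], -x[1] / x[2])
```

A natural starting point `x0` is the LAR estimate stacked as `[p_x, p_y, -1.0]`, in line with the remark that LAR provides an initial guess for other robust estimators. Because the last column of $M$ is typically much larger than the first two, rescaling the columns or tuning the step sizes may be needed in practice.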

Experimental Results
In the present section, numerical examples are studied to compare the localization performance of the proposed TLAR algorithm with that of PLE, TLS, and LAR, both with and without abnormal sensors. In addition, two typical localization geometries are applied in the simulations. The first one uses randomly distributed sensors, and the other one uses linearly distributed sensors. In each scenario, four cases are considered. To be specific, (1) the number of abnormal sensors varies and the total number of sensors is fixed, (2) both the numbers of normal and abnormal sensors vary, (3) the numbers of normal and abnormal sensors are fixed, and (4) all sensors are normal. The bearing data obtained from abnormal sensors follow the uniform distribution $U[-\pi, \pi]$. For normal sensors, the bearing measurement errors are assumed to be i.i.d. zero-mean Gaussian with standard deviation $\sigma$. Simulation comparisons are drawn based on $M_c = 1000$ Monte-Carlo simulation runs. This study employs bias and root mean square error (RMSE) for localization performance comparison, which are written as
$$\mathrm{Bias} = \left\| \frac{1}{M_c} \sum_{m=1}^{M_c} \begin{bmatrix} \hat{p}_x^{(m)} - p_x \\ \hat{p}_y^{(m)} - p_y \end{bmatrix} \right\|_2, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{M_c} \sum_{m=1}^{M_c} \left[ \left(\hat{p}_x^{(m)} - p_x\right)^2 + \left(\hat{p}_y^{(m)} - p_y\right)^2 \right]},$$
where $\hat{p}_x^{(m)}$ and $\hat{p}_y^{(m)}$ denote the estimates of the target location parameters for the $m$th Monte-Carlo run of the simulation.
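The two performance metrics can be computed from a list of Monte-Carlo estimates as in the short sketch below. It assumes the common definitions in which the bias is the Euclidean norm of the mean estimation error and the RMSE is the square root of the mean squared position error; the function name and the sample values are illustrative.

```python
import math

def bias_and_rmse(estimates, target):
    """Bias (norm of the mean error) and RMSE over Monte-Carlo position estimates."""
    n = len(estimates)
    mean_x = sum(e[0] for e in estimates) / n
    mean_y = sum(e[1] for e in estimates) / n
    bias = math.hypot(mean_x - target[0], mean_y - target[1])
    mse = sum((e[0] - target[0]) ** 2 + (e[1] - target[1]) ** 2
              for e in estimates) / n
    return bias, math.sqrt(mse)

# Usage: two estimates placed symmetrically about the target have zero bias
# but a nonzero RMSE.
bias, rmse = bias_and_rmse([(1.0, 0.0), (-1.0, 0.0)], (0.0, 0.0))
```

The example shows why both metrics are reported: errors that cancel on average leave the bias at zero while still contributing to the RMSE.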

Randomly Distributed Sensors.
In this section, all sensors are randomly placed in a 100 × 100 m² region centered at (50, 50) m (Figure 2(a)). The unknown target is placed at (100, 100) m. In the first example, the number of abnormal sensors is increased from 2 to 6, and that of normal sensors decreases from 18 to 14. The total number of sensors is fixed at 20. σ is set to 3π/180 (3°). Figure 2(a) presents the RMSE results as the number of abnormal sensors increases. The LAR and TLAR methods are capable of suppressing the outliers, as indicated in the two plots of Figure 2(a), the blue line with "+" and the red line with "square." The RMSE of LAR is 0.618 m above that of TLAR when the number of outliers is 2. This gap increases to 1.551 m when the number of outliers is 6. The estimation bias is presented in Figure 2(b). As the number of outliers increases, the bias becomes larger.
In the next example, the number of outliers is fixed at 3, and the total number of sensors is twenty. σ ranges from π/180 to 5π/180 (1° to 5°). The three abnormal sensors are shown as red circles and the normal sensors as blue circles in Figure 3(a). The simulated biases of PLE are plotted in Figure 3(b). Under small σ, the first type of bias, attributed to the correlation between $A$ and $\xi$, stays at a low level, and the second type of bias, formed by large residuals, dominates the theoretical bias of PLE. As the measurement noise variance increases, the effect of the second type of bias becomes less significant. Figures 4(a) and 4(b) illustrate the RMSE and bias curves of the various methods in the presence of abnormal sensors. The PLE method and the TLS estimator fail to give accurate target location estimates since they are not robust to outlier data. The blue line with "+" in Figure 4(a) represents the RMSE obtained with the LAR method, and the red line with "square" represents the RMSE curve of the TLAR algorithm. They are well separated, and the gap between these two lines increases as σ becomes large, suggesting the reduction of the first type of bias. Furthermore, this phenomenon is verified in Figure 4(b), where the bias of TLAR is significantly lower than that of the LAR method.
In the third example, the number of sensors is increased from ten to thirty, five at a time, of which one is abnormal and the other four are normal. σ is set to 3π/180 (3°). Figures 5(a) and 5(b) compare the RMSE and bias of the various methods. As the number of sensors increases, the RMSE and bias are reduced gradually, since the numbers of abnormal and normal sensors increase simultaneously and proportionally. The results again show that the TLAR method exhibits the best RMSE and bias performance among the compared methods.
Subsequently, the RMSE and bias performance are determined for different bearing noise standard deviations when all sensors are normal, as presented in Figures 6(a) and 6(b). In this scenario, the TLS method exhibits the best RMSE performance. The RMSE curve of TLAR is slightly higher than that of TLS since the bearing measurement errors all follow the Gaussian distribution, in which case the $L_2$-norm criterion is optimal. Unlike the RMSE performance, the bias of TLAR is comparable with that of TLS and much lower than that of LAR or PLE. When all sensors are normal, the LAR and TLAR methods still achieve results comparable with those of PLE and TLS, thereby demonstrating the robustness of the proposed algorithm.

Linearly Distributed Sensors.
In this section, the sensors are linearly distributed (Figure 7(a)). The target is placed at (200, 40) m. In the first example, the number of abnormal sensors ranges from 2 to 6, and the total number of sensors is set to 31. σ is set to 3π/180 (3°). Figures 8(a) and 8(b) illustrate the RMSE and bias performance as the number of abnormal sensors is varied. The TLAR algorithm exhibits better RMSE and bias performance than the LAR method, and the improvement becomes more significant as the number of outliers increases.
In the next example, all sensors are working properly except for the 8th to 10th sensors. Consistent with Section 6.1, the theoretical bias of PLE is plotted in Figure 7(b). The results agree with those presented in Figure 3(b). The RMSE and bias results for this example are given in Figures 9(a) and 9(b), respectively. The RMSE of TLAR is lower than that of the LAR method for linearly distributed sensors, although the reduction is only 0.502 m for σ = 1° and 4.56 m for σ = 5°. As the bearing measurement noise variance increases, the effect of large residuals becomes less significant. The bias reduction is 5.04 m for σ = 5° if TLAR is used.
In the third example, the RMSE and bias results are shown in Figures 10(a) and 10(b). Of each group of five sensors, one is abnormal, and the other four are normal. σ is set to 3π/180 (3°). Figures 10(a) and 10(b) verify that the solution of TLAR has better RMSE and bias performance than that of PLE, TLS, and LAR. With more than 21 sensors, the RMSE and bias of LAR suddenly decrease. This is not unexpected, since the sensor observation angles are then more clearly distinguished.
In addition, the RMSE and bias curves of PLE, LAR, TLS, and TLAR without abnormal sensors are plotted. Figure 11(a) gives the RMSE results as the bearing standard deviation increases, and Figure 11

Conclusions
The present study presents a TLAR algorithm to solve the BOTL problem in the presence of outlier data. Though the conventional LAR is robust to significant deviations, it has a bias formed by the correlation between the system matrix and the noise vector, and this bias remains as the number of sensors increases. To increase the accuracy of the LAR estimator, TLAR is proposed, which adds minimal perturbations to both the system matrix and the data vector in the least absolute residuals sense, so that the perturbed system is consistent. The experimental results show that the TLAR algorithm outperforms the LAR method. As the bearing noise power increases, the improvement in RMSE and the bias reduction become more pronounced.

Data Availability
The simulation data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.