Nonlinear Transformations and Radar Detector Design

A nonlinear transformation is introduced, which can be used to compress a series of random variables. For a certain class of random variables, the compression results in the removal of unknown distributional parameters from the resultant series. Hence, the application of this transformation is investigated from a radar target detection perspective. It will be shown that it is possible to achieve the constant false alarm rate property through a simple manipulation of this transformation. Due to the effect the transformation has on the cell under test, it is necessary to couple the approach with binary integration to achieve reasonable results. This is demonstrated in an X-band maritime surveillance radar detection context.


Introduction
The fundamental problem to be examined in this chapter is the detection of targets embedded within the sea surface, from an airborne maritime surveillance radar. Artifacts of interest could be lifeboats or aircraft wreckage resulting from aviation or maritime disasters. From a military perspective, one may be interested in the detection and tracking of submarine periscopes. Another scenario may be the detection of illegal fishing vessels or small boats used for smuggling of people or contraband. An airborne maritime surveillance radar has a difficult task in the detection of such objects from high altitude, while surveying a very large surveillance volume.
Such radars operate at X-band and are high resolution, and as such are affected by backscattering from the sea surface, which is referred to as clutter. This backscattering tends to mask small targets and makes the surveillance task extremely difficult. One of the major issues with the design of radar detection schemes is the minimization of the detection of false targets, while maximizing the detection of real targets. As a statistical hypothesis test, one can apply the Neyman-Pearson Lemma to produce a decision rule that achieves these objectives. However, in many cases, such a decision rule requires clutter model parameter approximations as well as estimates of the target strength based upon sampled returns. An issue, well known within the radar community, is that small variations in the clutter power level can result in huge increases in the number of false alarms. Since clutter power is a function of the underlying clutter model's parameters, approximations of the latter will have an inevitable effect on the former. Hence a large body of research has been devoted to designing radar detection strategies that maintain a fixed level of false alarms. A detector that achieves this objective is said to have the constant false alarm rate (CFAR) property [1].
In order to maintain a fixed rate of false alarms, sliding window decision rules were examined in early studies of radar detection strategies [2][3][4][5][6]. These investigations have been extended to account for different clutter models and to address issues with earlier detector design in a number of subsequent analyses [7][8][9][10][11][12][13][14][15]. Such decision rules can be formulated as follows. Suppose that the statistic $Z$ is the return to be tested for the presence of a target. Let $Z_1, Z_2, \ldots, Z_N$ be $N$ statistics from which a measurement of the level of clutter is taken, via some function $f = f(Z_1, Z_2, \ldots, Z_N)$. Then a target is declared present in the case where $Z$ is larger than a constant times $f$. The constant is selected so that in ideal scenarios, the false alarm rate remains fixed. It is generally assumed that the clutter statistics are independent and identically distributed in ideal settings, and also independent of the statistic $Z$. This can be formulated as a statistical hypothesis test by letting $H_0$ be the hypothesis that the cell under test (CUT) statistic $Z$ does not contain a target, and $H_1$ the alternative that it contains a target embedded within clutter. Then the test is written

$$Z \underset{H_0}{\overset{H_1}{\gtrless}} \tau f(Z_1, Z_2, \ldots, Z_N), \tag{1}$$

where $\tau > 0$ is the threshold constant and the notation used in Eq. (1) means that $H_0$ is rejected when $Z > \tau f(Z_1, Z_2, \ldots, Z_N)$. The probability of false alarm is given by

$$\mathrm{Pfa} = \mathrm{P}\left(Z > \tau f(Z_1, Z_2, \ldots, Z_N) \mid H_0\right). \tag{2}$$

If $\tau$ can be determined, for a specified Pfa in Eq. (2), such that it is independent of clutter parameters, then the decision rule in Eq. (1) will be able to maintain the CFAR property in ideal scenarios. In practical radar systems, a detection scheme such as in Eq. (1) can be run across the data returns sequentially to allow binary decisions on the presence of targets to be made, which are then passed to a tracking algorithm. A comprehensive examination of such detection processes is included in [1].
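As an illustration only, a sliding window rule of the form in Eq. (1) might be sketched in Python as follows. The function name, the default choice of the sample mean for the clutter measurement f, and the symmetric placement of reference cells around the CUT are assumptions made for this sketch and are not taken from the chapter.

```python
from statistics import fmean
from typing import Callable, List, Sequence


def sliding_window_detect(
    returns: Sequence[float],
    tau: float,
    n_ref: int,
    clutter_level: Callable[[Sequence[float]], float] = fmean,
) -> List[bool]:
    """Declare a target in cell i when returns[i] > tau * f(reference cells).

    The reference cells are the (up to) n_ref // 2 cells on each side of the
    cell under test. The sample mean is only an illustrative choice of f.
    """
    half = n_ref // 2
    decisions = []
    for i, cut in enumerate(returns):
        left = returns[max(0, i - half):i]
        right = returns[i + 1:i + 1 + half]
        reference = list(left) + list(right)
        # No decision is possible without reference cells, so report no target.
        decisions.append(bool(reference) and cut > tau * clutter_level(reference))
    return decisions


# Hypothetical returns with one strong cell in the middle.
print(sliding_window_detect([1, 1, 1, 10, 1, 1, 1], tau=3.0, n_ref=4))
```

The binary decisions returned per cell are the quantities that would subsequently be passed to a tracking algorithm.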
This chapter examines an alternative approach to achieve the CFAR property, based upon a nonlinear transformation that is used to compress the original clutter sequence. The consequence of this is that the resulting transformed series of random variables will have a fixed clutter power level and so permits a CFAR detector to be proposed. It is then shown how this transformation can be used to produce a practical radar detection scheme.
The chapter is organized as follows. Section 2 introduces the nonlinear mapping and formulates a decision rule. Section 3 specializes this to the case of Pareto distributed sequences, since the Pareto model is suitable for X-band maritime surveillance radar clutter returns. Section 4 demonstrates detector performance in homogeneous clutter, while Section 5 applies the decision rules directly to synthetic target detection in real X-band radar clutter.

Mapping
In X-band maritime surveillance radar, the Pareto distribution has become of much interest as a clutter intensity model due to its validation relative to real radar clutter returns [16][17][18]. This model arises as the intensity distribution of a compound Gaussian model with inverse Gamma texture. Consequently, the Pareto distribution fits into the currently accepted radar clutter model phenomenology [19]. Hence, there have been a number of recent advances in the design of CFAR processes under a Pareto clutter model assumption [20][21][22][23][24][25].
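A random variable $X$ has a Pareto distribution [26] with shape parameter $\alpha > 0$ and scale parameter $\beta > 0$ if its cumulative distribution function (cdf) is

$$F_X(t) = \mathrm{P}(X \le t) = 1 - \left(\frac{\beta}{t}\right)^{\alpha}, \tag{3}$$

for $t \ge \beta$. The density of $X$ follows by differentiation of Eq. (3). In order to ensure the existence of the first two moments, it is usually assumed that $\alpha > 2$, which is an assumption that has been validated in fits of this model to real data [18]. This Pareto model possesses what is referred to as a duality property in Ref. [20]. To introduce this, recall that if $Y$ is an Exponential random variable with unity mean, its cdf is given by

$$F_Y(t) = \mathrm{P}(Y \le t) = 1 - e^{-t}, \tag{4}$$

for $t \ge 0$. Then it can be shown that the Pareto model in Eq. (3) can be related to Eq. (4) via the random variable relationship

$$X = \beta e^{\alpha^{-1} Y} \quad \text{and} \quad Y = \alpha \log\left(\frac{X}{\beta}\right). \tag{5}$$

Other random variables of interest in radar signal processing, such as the Weibull, can also be expressed in a form similar to Eq. (5). Hence, for the purposes of generality, suppose $\{X_j, j \in \mathbb{N} := \{0, 1, 2, \ldots\}\}$ is a sequence of homogeneous random variables with common support and that $\theta_1$ and $\theta_2$ are two fixed real constants. Define a sequence of random variables $\{Z_j, j \in \mathbb{N}\}$ by

$$Z_j = \theta_1 X_j^{\theta_2}. \tag{6}$$

The sequence produced via Eq. (6) is a generalization of the Pareto model (3), which is recovered with $X_j = e^{Y_j}$, $\theta_1 = \beta$ and $\theta_2 = \alpha^{-1}$. Next define a nonlinear mapping $\zeta : \mathbb{R}^+ \times \mathbb{R}^+ \times \mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}^+ \cup \{0\}$ by

$$\zeta(x_1, x_2, x_3, x_4) = \frac{|\log(x_1) - \log(x_2)|}{|\log(x_3) - \log(x_4)|}, \tag{7}$$

where each $x_j > 0$, $x_3 \ne x_4$ and $\mathbb{R}^+$ is the positive real numbers. Then the following result is relatively easy to prove:

Lemma 2.1. The random variable $\zeta(Z_j, Z_{j+1}, Z_{j+2}, Z_{j+3})$ does not depend on $\theta_1$ and $\theta_2$.

The proof of Lemma 2.1 is now outlined. Supposing that $Z_j$, $Z_{j+1}$, $Z_{j+2}$ and $Z_{j+3}$ are represented in the form defined via Eq. (6), it follows that

$$\zeta(Z_j, Z_{j+1}, Z_{j+2}, Z_{j+3}) = \frac{|\theta_2|\,|\log(X_j) - \log(X_{j+1})|}{|\theta_2|\,|\log(X_{j+2}) - \log(X_{j+3})|} = \frac{|\log(X_j) - \log(X_{j+1})|}{|\log(X_{j+2}) - \log(X_{j+3})|}, \tag{8}$$

where properties of the logarithmic function have been utilized. Since Eq. (8) does not depend on $\theta_1$ and $\theta_2$, the proof is completed. Lemma 2.1 suggests that if the original sequence of random variables is processed in 4-tuples, the compressed sequence's statistical structure is only dependent on the random variables $X_j$. Observe that the Lemma does not require an independence assumption.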
Thus if the sequence $\{X_j\}$ has no unknown statistical parameters, the process generated by Eq. (7) also has no unknown parameters. This suggests that processing of a data sequence in terms of 4-tuples may be an effective way in which to achieve the CFAR property. The next subsection clarifies this.
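The parameter-removal property can be checked numerically with a short sketch. It assumes that the mapping has the form $|\log(x_1/x_2)| / |\log(x_3/x_4)|$ and that the generalized model is $Z_j = \theta_1 X_j^{\theta_2}$; both forms, along with the function names `zeta` and `scale_power` and the particular parameter pairs, are assumptions of this illustration.

```python
import math
import random


def zeta(x1, x2, x3, x4):
    """Assumed form of the mapping: |log(x1/x2)| / |log(x3/x4)|, with x3 != x4."""
    return abs(math.log(x1 / x2)) / abs(math.log(x3 / x4))


def scale_power(xs, theta1, theta2):
    """Assumed generalized model: Z_j = theta1 * X_j ** theta2."""
    return [theta1 * x ** theta2 for x in xs]


rng = random.Random(0)
# One 4-tuple of underlying variables X_j = exp(Y_j), with Y_j unit-mean exponential.
xs = [math.exp(rng.expovariate(1.0)) for _ in range(4)]

baseline = zeta(*xs)
print("zeta on the X_j themselves:", baseline)

# The first pair mimics a Pareto model (theta1 = beta, theta2 = 1/alpha);
# the remaining pairs are arbitrary.
for theta1, theta2 in [(0.0446, 1 / 4.7241), (3.0, 0.5), (10.0, 2.0)]:
    print(theta1, theta2, zeta(*scale_power(xs, theta1, theta2)))
```

Up to floating point rounding, every printed value should coincide, since the scale cancels inside each logarithm of a ratio and the power cancels between numerator and denominator.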

Decision rule
In order to propose a decision rule exploiting the transformation introduced in Lemma 2.1, it is necessary to focus first on a series of four returns. Hence, suppose we have a CUT statistic $Z$, and three clutter measurements are available, denoted $Z_1$, $Z_2$ and $Z_3$. Let $H_0$ be the hypothesis that the CUT contains no target, and $H_1$ the hypothesis that it does contain a target embedded within clutter. Then, based upon Eq. (7), a linear threshold test takes the form

$$\zeta(Z, Z_1, Z_2, Z_3) \underset{H_0}{\overset{H_1}{\gtrless}} \tau, \tag{9}$$

where $\tau > 0$ is the threshold. Based upon Lemma 2.1, if the clutter is modelled by Eq. (6), then it is clear that under $H_0$, the Pfa of the test in Eq. (9) will not depend on $\theta_1$ or $\theta_2$, implying it is CFAR with respect to these parameters. Furthermore, an auxiliary motivation for defining a linear threshold detector such as Eq. (9) is that in the cases where it is assumed that one has a priori knowledge of clutter parameters, linear threshold detectors are ideal, or asymptotically optimal, and hence provide the maximum probability of detection within the class of sliding window decision rules [27].
The test in Eq. (9) can also be re-expressed in terms of the preprocessed clutter statistics. In particular, it can be shown to be equivalent to rejecting $H_0$ if

$$|\log(Z) - \log(Z_1)| > \tau\,|\log(Z_2) - \log(Z_3)|, \tag{10}$$

with the appropriate choice for $\tau$, which can be determined from the corresponding Pfa expression for Eqs. (9) or (10).
Observe that this test is not of the usual form found in the radar signal processing literature, since it compares a CUT with a measurement of clutter based upon three statistics, and not upon a sample of predetermined size. This will be discussed subsequently in terms of practical implementation of the test in Eq. (10). The next section discusses the application of Eq. (10) to the Pareto clutter case, enabling the determination of τ.
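A minimal sketch of this four-return test is given below, together with a Monte Carlo estimate of its false alarm rate under two different Pareto parameter pairs. The form of the test statistic, the helper names, the threshold value of 9, the second parameter pair and the trial count are all assumptions chosen for illustration.

```python
import math
import random


def nlm_statistic(cut, z1, z2, z3):
    """Transformed CUT, assumed to be |log(cut/z1)| / |log(z2/z3)|."""
    return abs(math.log(cut / z1)) / abs(math.log(z2 / z3))


def nlm_detect(cut, z1, z2, z3, tau):
    """Reject H0 (declare a target) when the transformed CUT exceeds tau."""
    return nlm_statistic(cut, z1, z2, z3) > tau


def pareto_sample(rng, alpha, beta):
    """Pareto draw via X = beta * exp(Y / alpha), with Y unit-mean exponential."""
    return beta * math.exp(rng.expovariate(1.0) / alpha)


def empirical_pfa(alpha, beta, tau, trials=20000, seed=1):
    """Fraction of target-free 4-tuples of Pareto clutter that trigger a detection."""
    rng = random.Random(seed)
    alarms = 0
    for _ in range(trials):
        cut, z1, z2, z3 = (pareto_sample(rng, alpha, beta) for _ in range(4))
        alarms += nlm_detect(cut, z1, z2, z3, tau)
    return alarms / trials


# Same threshold, two very different clutter parameter pairs.
pfa_spiky = empirical_pfa(alpha=4.7241, beta=0.0446, tau=9.0)
pfa_other = empirical_pfa(alpha=15.0, beta=2.0, tau=9.0)
print(pfa_spiky, pfa_other)
```

Because both runs share a seed, they are driven by the same underlying exponential draws, so the two estimates should agree closely even though the clutter parameters differ. This is the CFAR behaviour with respect to the shape and scale parameters.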

Distributions under $H_0$
Since the motivation of the work developed here is the design of radar detection schemes for maritime surveillance radar, the results of the previous section are specialized to the Pareto case. In order to apply Lemma 2.1, it is necessary to determine the distribution of the resultant sequence produced by $\zeta$ under $H_0$. The following is the key result:

Corollary 3.1. In the case where the sequence of random variables in Lemma 2.1 is Pareto distributed and independent, the cdf of the sequence processed by $\zeta$ is given by

$$F(t) = 1 - \frac{1}{1+t} = \frac{t}{1+t}, \tag{11}$$

for $t \ge 0$.
This can be recognized as a Pareto distribution, with support the nonnegative real line and shape and scale parameter unity. More specifically, if $P$ is a random variable with cdf (11), then $P = X - 1$, where $X$ has cdf (3) with $\alpha = \beta = 1$. This illustrates the cost of the nonlinear transformation approach: although the resultant series of clutter has no unknown clutter parameters, it is from a distribution with no finite moments. The independence assumption is adopted for analytical tractability and is consistent with the assumption that independent and identically distributed clutter returns are available, as in the formulation of the test in Eq. (1).
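To prove Corollary 3.1, suppose that $\eta_1$ and $\eta_2$ are two independent random variables with cdf (4). Then by analyzing the difference $\eta_1 - \eta_2$, it can be shown that it has cdf

$$F_{\eta_1 - \eta_2}(t) = \begin{cases} \tfrac{1}{2} e^{t}, & t < 0, \\ 1 - \tfrac{1}{2} e^{-t}, & t \ge 0, \end{cases} \tag{12}$$

which is that of a Laplace distribution. Then it follows that

$$\mathrm{P}(|\eta_1 - \eta_2| \le t) = F_{\eta_1 - \eta_2}(t) - F_{\eta_1 - \eta_2}(-t) = 1 - e^{-t}, \tag{13}$$

where Eq. (12) has been applied, and $t > 0$. Thus the modulus of the difference is also exponentially distributed with unit mean.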
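Supposing that $\kappa_1$ and $\kappa_2$ are two independent random variables with cdf Eq. (13), statistical conditioning on $\kappa_2$ shows that the ratio has cdf

$$\mathrm{P}\left(\frac{\kappa_1}{\kappa_2} \le t\right) = \int_0^\infty \left(1 - e^{-tx}\right) e^{-x}\, dx = 1 - \frac{1}{1+t}, \tag{14}$$

which coincides with Eq. (11) after an evaluation of the integral. This establishes the result in Corollary 3.1, as required.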
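The corollary can be sanity-checked by simulation. The sketch below draws ratios of absolute differences of independent unit-mean exponentials and compares their empirical cdf against $t/(1+t)$, the cdf of a unit shape and scale Pareto variable shifted to start at zero. The sample size, seed and evaluation points are arbitrary choices made for illustration.

```python
import random


def transformed_clutter_sample(rng):
    """One draw of |Y1 - Y2| / |Y3 - Y4| for independent unit-mean exponentials."""
    y1, y2, y3, y4 = (rng.expovariate(1.0) for _ in range(4))
    return abs(y1 - y2) / abs(y3 - y4)


def empirical_cdf(samples, t):
    """Fraction of samples that do not exceed t."""
    return sum(s <= t for s in samples) / len(samples)


rng = random.Random(2024)
samples = [transformed_clutter_sample(rng) for _ in range(50000)]

for t in (0.5, 1.0, 3.0, 9.0):
    print(f"t={t}: empirical={empirical_cdf(samples, t):.4f}, "
          f"t/(1+t)={t / (1 + t):.4f}")
```

The two columns should agree to within Monte Carlo error, which is a few parts in a thousand at this sample size.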

Thresholds and the CUT
Based upon Corollary 3.1, the univariate threshold for the Pareto case is given by

$$\tau = \mathrm{Pfa}^{-1} - 1. \tag{15}$$

The threshold in Eq. (15) illustrates the issues with the nonlinear mapping, as this threshold will be quite large for appropriate Pfa. Note that for a Pfa of $10^{-6}$, $\tau = 10^{6} - 1$. This threshold will increase as the Pfa decreases. In the Pareto setting, it is shown in Ref. [20] that an ideal detector has its threshold set via $\beta\,\mathrm{Pfa}^{-1/\alpha}$. In the case where $\alpha = 4.7241$ and $\beta = 0.0446$ (which correspond to spiky clutter returns) and with the Pfa set to $10^{-6}$, this threshold is 0.8312 by contrast. Thus the nonlinear mapping, in the process of compressing the original data series, can be used to achieve the CFAR property with Eq. (10), but detection performance may be unacceptable.
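The contrast between the two thresholds is easy to reproduce. The short sketch below evaluates the transformed-domain threshold, taken here as $1/\mathrm{Pfa} - 1$ (consistent with the value $10^6 - 1$ quoted for a Pfa of $10^{-6}$), and the ideal Pareto threshold $\beta\,\mathrm{Pfa}^{-1/\alpha}$. The function names are hypothetical.

```python
def nlm_threshold(pfa):
    """Transformed-domain threshold, tau = 1/Pfa - 1."""
    return 1.0 / pfa - 1.0


def ideal_pareto_threshold(pfa, alpha, beta):
    """Ideal (known-parameter) Pareto threshold, beta * Pfa ** (-1/alpha)."""
    return beta * pfa ** (-1.0 / alpha)


pfa = 1e-6
print("transformed-domain threshold:", nlm_threshold(pfa))
print("ideal Pareto threshold:", ideal_pareto_threshold(pfa, alpha=4.7241, beta=0.0446))
```

With the rounded parameter estimates used here, the ideal threshold evaluates to roughly 0.83, in line with the figure quoted in the text, whereas the transformed-domain threshold is six orders of magnitude larger.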
To explore this further, it is informative to examine the detection scheme in Eq. (10) when there is a target model present. Suppose $\Xi$ is the CUT statistic, in the case where a target is present in the clutter, in the pretransformed data, so that $\Xi$ is the intensity measurement of a return signal and clutter in the complex domain. Let $\hat{\Xi}$ be the CUT in the transformed domain, meaning the test statistic of the detector in Eq. (10) when there is a target present. Then by applying the lefthand expression for Pareto random variables in Eq. (5) to the clutter measurements, we can write

$$\hat{\Xi} = \frac{\left|\log(\Xi/\beta) - \alpha^{-1} E_1\right|}{\alpha^{-1}\left|E_2 - E_3\right|} = \frac{\left|\alpha \log(\Xi/\beta) - E_1\right|}{\left|E_2 - E_3\right|}, \tag{16}$$

where each $E_j$ is an independent exponentially distributed random variable with unit mean. Then with an application of results from the proof of Corollary 3.1, since $|E_2 - E_3|$ has the same exponential distribution, one can apply statistical conditioning on $E_1 = \theta$ and $|E_2 - E_3| = \varphi$ to show that the distribution function of the transformed CUT is

$$F_{\hat{\Xi}}(t) = \int_0^1 \int_0^1 \left[ F_{\Xi}\left(\beta x^{-1/\alpha} y^{-t/\alpha}\right) - F_{\Xi}\left(\beta x^{-1/\alpha} y^{t/\alpha}\right) \right] dx\, dy, \tag{17}$$

where the change of variables $x = e^{-\theta}$ and $y = e^{-\varphi}$ has been applied. Thus the distribution of the transformed CUT can be generated from that of the pretransformed CUT via Eq. (17). To examine this, Figure 1 plots Eq. (17) in the case of a Swerling 1 target model embedded within Pareto distributed clutter with $\alpha = 4.7241$ and $\beta = 0.0446$. A Swerling 1 target model is essentially a bivariate Gaussian model, which is combined with the Pareto model by embedding the latter into a compound Gaussian process with inverse Gamma texture in the complex domain, and then taking modulus squared to produce the intensity measurement [20]. The distribution function of $\Xi$ can also be found in Ref. [20] for the case of interest. Figure 1 shows the pretransformed CUT as well as Eq. (17), in the cases where the signal to clutter ratio (SCR) is 1, 10, 50, and 100 dB. For the case of a 1 dB target model, the CUT has its range of potential values increased under the transformation. The same holds for the 10 dB case. Interestingly, for the 50 dB and 100 dB cases, the situation is reversed.
Hence, as the SCR increases, the nonlinear mapping suppresses the target SCR, reducing the range of admissible values for the transformed CUT. This suggests that although the nonlinear mapping removes unknown clutter parameters, it may also impede detection due to target suppression. If the threshold is set via Eq. (15), then it is clear from Figure 1 that it will be very difficult to detect targets with a reasonably small Pfa. Hence, the new detection scheme must be combined with an integration process to rectify this.

Methodology and data
In order to examine the performance of the proposed detection scheme (9), clutter is simulated under the assumption of a Pareto clutter model, which has been found to fit Defence Science and Technology Group's (DSTG's) real X-band maritime surveillance radar data sets. Ingara is an experimental X-band imaging radar which has provided real clutter for the analysis of detector performance [28]. A trial in 2004 produced a series of clutter sets that have been analyzed from a statistical perspective in Ref. [29]. During the trial, the radar operated in a circular spotlight mode, surveying the same patch of the Southern Ocean at different azimuth and grazing angles. Additionally, the radar provided full polarimetric data. For the purposes of the numerical work to follow, focus is restricted to one particular data set. This is run 34683, at an azimuth angle of 225°, which is approximately in the up wind direction. Additionally, the numerical analysis focuses on the horizontal transmit and receive (HH) case.
For performance analysis in homogeneous clutter, the data is simulated with distributional parameters matched to those obtained from the Ingara data set. The data consists of 821 pulses with 1024 range compressed samples, from which maximum likelihood estimates of the distributional parameters can be obtained from the intensity measurements. Under the Pareto model assumption, the estimates are $\hat{\alpha} = 4.7241$ and $\hat{\beta} = 0.0446$.
As remarked previously, it is necessary to couple Eq. (10) with an integration scheme to enhance its performance. The integration scheme used for this purpose is binary integration, which is well-described in Ref. [30], and an application of it in a Pareto distributed clutter environment can be found in Ref. [31]. Such a process applies a series of $M \ge 1$ tests of Eq. (10), and then concludes that if at least $S$ out of $M$ return a detection result, then a target is likely to be present in the radar clutter [30], where $S \in \{1, 2, \ldots, M\}$. Selection of an appropriate $S$ is outlined in Ref. [31]. Essentially, it is pointed out in Ref. [32] that for a specified univariate cumulative detection probability and false alarm rate and a fixed number of maximum binary integration returns $M$, there exists an optimal $S$ which minimizes the required signal to clutter ratio, and maximizes the binary integration gain. This optimal $S$ can be found visually or numerically by plotting the minimum SCR as a function of $S$, under the assumption of a certain signal model. This approach, and the analysis in Ref. [31], shows that in the current context, the choice of $S = 3$ with $M = 8$ should provide good results. Relative to the problem addressed in this chapter, applying binary integration with a linear threshold detector in the transformed clutter domain is not computationally expensive, and thus is seen as a reasonable solution.
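If $\mathrm{Pfa}_{BI}$ denotes the Pfa for binary integration, then it can be expressed in terms of the univariate detection process's Pfa through the equation

$$\mathrm{Pfa}_{BI} = \sum_{k=S}^{M} \binom{M}{k}\, \mathrm{Pfa}^{k} \left(1 - \mathrm{Pfa}\right)^{M-k}. \tag{18}$$

The threshold $\tau$ is set via Eq. (18) coupled with the univariate Pfa from Eq. (9).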
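To indicate how the threshold might be set in practice, the sketch below evaluates the probability that at least S of M independent univariate tests raise a false alarm (the standard binomial tail, assumed here to be the relationship intended), inverts it by bisection for the univariate Pfa, and converts that to a threshold using $\tau = 1/\mathrm{Pfa} - 1$. The function names and the bisection approach are assumptions of this illustration, as is the independence of the M tests.

```python
from math import comb


def binary_integration_pfa(pfa, s, m):
    """Probability that at least s of m independent tests raise a false alarm."""
    return sum(comb(m, k) * pfa**k * (1.0 - pfa) ** (m - k) for k in range(s, m + 1))


def univariate_pfa_for(pfa_bi, s, m, iterations=200):
    """Invert binary_integration_pfa by bisection; it is increasing in pfa on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if binary_integration_pfa(mid, s, m) < pfa_bi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Overall false alarm probability of 1e-6 with 3-out-of-8 binary integration.
pfa_single = univariate_pfa_for(1e-6, s=3, m=8)
tau = 1.0 / pfa_single - 1.0  # univariate threshold in the transformed domain
print(pfa_single, tau)
```

The univariate Pfa required per test is several orders of magnitude larger than the overall target, so the resulting threshold is far smaller than the value of $10^6 - 1$ needed by a single test. This is the reason binary integration makes the transformed detector workable.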
To simulate detection performance, the probability of detection (Pd) is estimated, using $10^6$ Monte Carlo runs based upon a Swerling 1 target model assumed for the CUT. For each SCR, the binary integration process is run using $S = 3$ out of $M = 8$ binary integration. The motivation for these choices can be found in Ref. [31]. In order to assess the robustness of the detection scheme to interference, up to two interfering targets are inserted into the clutter measurements to give an indication of the performance with interference. Thus independent Swerling 1 targets, with interference to clutter ratio (ICR) of 1 dB, are applied to $Z_1$ (denoted Inter 1 in the plots), then to $Z_2$ (denoted Inter 2), and then to both $Z_1$ and $Z_2$ (denoted Inter 3) in the univariate decision rule in Eq. (9). A real spurious target may only appear in a subset of the clutter measurements and so this analysis of interference can be viewed as an upper bound on poor performance.

Receiver operating characteristic curves
Receiver operating characteristic (ROC) curves, which plot the probability of detection as a function of the false alarm probability when the target in the CUT is at a fixed SCR, are used to examine the performance. Figures 2-4 provide examples of the performance of the new detector in Eq. (10) with binary integration and compare it to the performance of some of the recently introduced detectors designed for operation in a Pareto clutter model environment.
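For a CUT $Z$ and clutter range profile $Z_1, Z_2, \ldots, Z_N$, the Geometric Mean (GM) CFAR is

$$Z \underset{H_0}{\overset{H_1}{\gtrless}} \beta^{1 - N\zeta} \prod_{j=1}^{N} Z_j^{\zeta}, \tag{19}$$

which is shown in Ref. [20] to have its threshold set via $\zeta = \mathrm{Pfa}^{-1/N} - 1$ (here $\zeta$ denotes the threshold multiplier, not the mapping in Eq. (7)). Similarly, an Order Statistic (OS)-CFAR has been analyzed in Ref. [22], which is given by

$$Z \underset{H_0}{\overset{H_1}{\gtrless}} \beta^{1 - \nu_j} Z_{(j)}^{\nu_j}, \tag{20}$$

where $Z_{(j)}$ is the $j$th order statistic of the clutter range profile. This detector has its threshold multiplier $\nu_j$ set via inversion of the Pfa equation given by

$$\mathrm{Pfa} = \frac{N!}{(N-j)!}\, \frac{\Gamma(\nu_j + N - j + 1)}{\Gamma(\nu_j + N + 1)}, \tag{21}$$

where the OS index $1 \le j \le N$ and the notation $\nu_j$ emphasizes the fact that $\nu_j$ depends on the selected OS index $j$. Observe that both these decision rules require a priori knowledge of $\beta$.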
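The threshold multipliers for the two comparison detectors can be computed with a few lines of code. The GM multiplier uses the closed form $\mathrm{Pfa}^{-1/N} - 1$ quoted above. For the OS-CFAR, the sketch assumes the Pfa expression $\frac{N!}{(N-j)!}\frac{\Gamma(\nu_j + N - j + 1)}{\Gamma(\nu_j + N + 1)}$ and inverts it numerically; that expression, the function names and the bisection scheme are assumptions of this illustration rather than details confirmed by the text.

```python
from math import exp, lgamma


def gm_threshold(pfa, n):
    """GM-CFAR threshold multiplier, Pfa ** (-1/n) - 1."""
    return pfa ** (-1.0 / n) - 1.0


def os_pfa(nu, n, j):
    """Assumed OS-CFAR Pfa: N!/(N-j)! * Gamma(nu+N-j+1) / Gamma(nu+N+1)."""
    log_pfa = (lgamma(n + 1) - lgamma(n - j + 1)
               + lgamma(nu + n - j + 1) - lgamma(nu + n + 1))
    return exp(log_pfa)


def os_threshold(pfa, n, j, iterations=200):
    """Solve os_pfa(nu, n, j) = pfa for nu; os_pfa decreases from 1 as nu grows."""
    lo, hi = 0.0, 1.0
    while os_pfa(hi, n, j) > pfa:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if os_pfa(mid, n, j) > pfa:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n, pfa = 3, 1e-3
print("GM:", gm_threshold(pfa, n))
for j, label in ((1, "MIN"), (2, "MED"), (3, "MAX")):
    print(label, os_threshold(pfa, n, j))
```

For $j = 1$ the assumed Pfa expression reduces to $N/(\nu_1 + N)$, which gives a closed form against which the numerical inversion can be checked.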
In order to provide a valid comparison with Eq. (10), these detectors have been applied with $N = 3$ and coupled with binary integration. Since $N = 3$, there are three choices available for $j$, corresponding to a minimum (denoted MIN, when $j = 1$), median (MED, $j = 2$), and maximum (MAX, $j = 3$). In Figure 2, the decision rule (10) coupled with binary integration is denoted as the nonlinear mapping (NLM). In this case, the CUT SCR is 5 dB, representing a small target. As can be observed, the new decision rule has superior performance. The same experiment is repeated in Figure 3, where the CUT SCR is 15 dB, and then it is increased to 20 dB in Figure 4. These results show that the new detection process has superior performance, while not requiring a priori knowledge of the Pareto scale parameter. These results validate the application of Eq. (10) to target detection in spiky X-band clutter with binary integration.
It is interesting to note that as $M$ is increased, there is very little gain in performance. To demonstrate this, Figure 5 repeats the same scenario in Figure 4 except $M$ has been increased to 30. Comparing Figures 4 and 5, it is clear that there is very little gain. However, the computational complexity increases dramatically as $M$ is increased. Hence, in a practical implementation of the binary integration process, it is more efficient to select $M$ small.

Effect of interference
Next the cost of interference on the new decision rule is examined, and for brevity, only this decision rule is considered. Figure 6 shows the case where the CUT has SCR of 5 dB, and the decision rule (10) coupled with binary integration is denoted BI, while the three interference cases are marked appropriately. Here we observe quite good performance that decreases with the interference. Figure 7 shows the result of increasing the SCR in the CUT to 20 dB. The result is an expected detection performance improvement as shown.

Performance in real data
As a final test of the proposed detection scheme, it was run directly on the Ingara data set under consideration, with the insertion of a synthetic Swerling 1 target and interference as for the homogeneous case. A sliding window was run across the data sequentially, and detection performance was estimated by running the 3 out of 8 detection scheme, resulting in a run length of 840,672. The Ingara data is slightly correlated from cell to cell and so the detector in Eq. (9), which has its threshold set via an independence assumption, becomes a suboptimal decision rule. Detection performance under both clutter model assumptions is plotted on the same ROC curve to compare performance on the real data more easily. The same scenario is repeated as for the analysis under homogeneous independent clutter.

Figure 8 shows detection performance with the CUT SCR of 5 dB, while Figure 9 repeats the same numerical experiment as for Figure 8, except the CUT has SCR of 20 dB. Comparing Figure 8 with Figure 6, we observe that correlation is affecting the performance in real data. The new decision rule is designed to operate in independent homogeneous clutter returns, and so there is a serious variation in performance. The same situation is observed at a larger CUT SCR (comparing Figures 9 and 7).

Conclusions
A nonlinear transformation was introduced and shown to remove clutter parameter dependence for a class of statistical models. This was used to formulate a simple linear threshold detector in the transformed clutter domain. Due to issues with the magnitude of detection thresholds, it was necessary to couple the approach with binary integration.
Analysis in simulated clutter showed good detection performance. Interference had a strong impact on performance as expected. When the detection process was applied directly to real data, similar results were observed. Nonetheless, the nonlinear transformation, coupled with binary integration, resulted in reasonable detection performance while guaranteeing the CFAR property is preserved.