Adaptive Filtering for Full-Duplex UWA Systems With Time-Varying Self-Interference Channel

To enable full-duplex (FD) in underwater acoustic (UWA) systems, a high level of self-interference (SI) cancellation (SIC) is required. This can be achieved by using a combination of SIC methods, including digital SIC. For digital SIC, adaptive filters are used. In time-invariant channels, the SI can be effectively cancelled by classical recursive least-square (RLS) adaptive filters, such as the sliding-window RLS (SRLS) or exponential-window RLS, but their SIC performance degrades in time-varying channels, e.g., in channels with a moving sea surface. Their performance can be improved by delaying the filter inputs. This delay, however, makes the mean squared error (MSE) unsuitable for measuring the SIC performance. In this article, we propose a new evaluation metric, the SIC factor (SICF), which gives better indication of the SIC performance compared to MSE. The SICF can be used to evaluate the performance of digital SIC techniques without the need of implementing a full FD system. A new SRLS adaptive filter based on parabolic approximation of the channel variation in time, named SRLS-P, is also proposed. The SIC performance of the SRLS-P adaptive filter and classical RLS algorithms (with and without the delay) is evaluated by simulation and in lake experiments. The results show that the SRLS-P adaptive filter can significantly improve the SIC performance, compared to the classical RLS adaptive filters.


I. INTRODUCTION
In recent years, full-duplex (FD) operation of terrestrial radio systems, such as communication systems, has demonstrated an ability to increase the system throughput [1]- [5]. If FD operation can be adopted in underwater acoustic (UWA) systems, e.g., in UWA communication systems, the capacity of the acoustic links can be almost doubled. Active sonar systems can also benefit from the FD operation by expanding the signal family used for transmission. Despite the benefits of FD, it is not widely considered for UWA systems mainly due to the severe self-interference (SI) introduced by the near-end transmission. Various SI cancellation (SIC) techniques have been proposed for FD terrestrial radio systems. A combination of SIC methods is used, including digital SIC.
The associate editor coordinating the review of this manuscript and approving it for publication was Qingchun Chen.
Normally, a certain amount of SI is cancelled in the analogue domain before digital cancellation to avoid the saturation in the analogue-to-digital converter (ADC) [1], [6], [7]. For FD UWA systems, due to the lower frequencies of acoustic signals, high resolution ADCs are available. Thus, digital cancellation can be considered as the main practical approach for SIC in FD UWA systems [8]- [10].
One of the major limitations of the digital cancellation performance is due to the hardware imperfection in the transmit and receive chains, among which the non-linearity introduced by the power amplifier (PA) is the dominant factor [11]. A general approach to deal with the PA non-linearity is to estimate the non-linear distortion, e.g., using the Hammerstein model and its extensions, and then compensate for it in the received signal [5], [8], [12]. To accurately model the non-linearity, high order basis functions are required. The disadvantages of this approach are the high complexity of the non-linear model and a large number of parameters to be estimated. Another approach is to use the PA output as the reference signal for SIC [9], [13], [14]. In this case, lower complexity linear adaptive filters can be used for the SIC. In [9], we show that, with the use of the PA output as the reference signal, a high level of SIC can be achieved in slowly varying UWA SI channels by using classical recursive least-squares (RLS) adaptive filters. The general block diagram of the FD system with digital cancellation using PA output as the reference signal is shown in Fig. 1. The system works at two sampling frequencies. The index of the signal sample with the high (passband) sampling rate is denoted by n, and the low (baseband) sampling rate sample index is i. The analogue (passband) signals are: the PA output s(t); the SI r(t); the noise n(t); the far-end signal f(t); and the received (hydrophone) signal x(t). The digital (passband) signals are: the PA output s(n) and the received signal x(n). The digital baseband signals are: the transmitted data symbols a(i) and the residual signal after the digital cancellation e(i). DAC is the digital-to-analogue converter. See more details in [9].
Adaptive filters operate efficiently if the power spectral density of the input signal (regressor) does not have zeros, i.e., the regressor correlation matrix is full rank. This, however, requires sampling the baseband signal at a (symbol) rate, which is lower than the Nyquist frequency. As a result, the performance of the adaptive filter is sensitive to the delay between the regressor (PA output) and the desired (hydrophone) signal. To overcome this problem and ensure robust SIC performance regardless of the delay, the digital cancellation scheme from [9] is extended in [15]; the block diagram of the scheme is shown in Fig. 2. In this scheme, two branches are used with symbol rate sampling in each branch, with odd and even samples, respectively, taken from a twice oversampled baseband signal at the PA output. In this article, the SIC performance using different adaptive filters is investigated with this digital cancellation scheme.
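The branch construction described above can be sketched as follows. This is only a minimal illustration, assuming the baseband PA output is already available at twice the symbol rate; the function name `split_branches` is hypothetical, and the combining weights from [15] are not reproduced here.

```python
def split_branches(s2x):
    """Split a twice-oversampled baseband sequence into two symbol-rate branches.

    With 1-based sample numbering, branch 1 holds the odd samples
    (1st, 3rd, ...) and branch 2 holds the even samples (2nd, 4th, ...).
    """
    s1 = list(s2x[0::2])  # odd samples (1-based numbering)
    s2 = list(s2x[1::2])  # even samples (1-based numbering)
    return s1, s2


s1, s2 = split_branches([10, 11, 12, 13, 14, 15])
print(s1, s2)
```

Each branch is then sampled at the symbol rate, so an adaptive filter can be run on each branch independently of the delay between the regressor and the desired signal.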
Another phenomenon that limits the SIC is the time-varying surface reflections [16], [17]. While a high level of SIC can be achieved for time-invariant SI channels (e.g., in a water tank [9]) using classical RLS adaptive filters, the cancellation performance is limited in experiments with a moving surface. The main limitation is the tracking ability of the classical adaptive filters. The Kalman filter is considered as a good candidate for estimation of time-varying channels [18], [19]. However, for using the Kalman filter, the channel statistics should be known, which is often not the case in practice. To improve SIC performance in fast time-varying channels, other schemes are required.
To measure the performance of an SI canceller, the best approach would seem to be to implement a whole FD system and investigate its performance. However, the FD UWA technology is not yet mature enough to make this approach practical. Cancelling the SI close to the noise level is still a problem. The signal distortions in the SI channel and their influence on the SI cancellation performance are not yet well understood. Such issues have been partly addressed separately or in some combination as the PA nonlinearity [8], [9], the nonlinearity in the preamplifier of the hydrophone [20], passband to baseband conversion [15], modelling of the near-end surface reflections [14], modelling of the self-loop interference through the mounting system [21], acoustic isolation of the projector and hydrophone [22], etc. In this article, we propose an improved technique for dealing with the fast time-variability of the SI channel in digital SI cancellers. Other problems that need to be dealt with are the acoustic cavitation [23] and nonlinear signal distortion in the projector. To build a whole FD system, all these issues should be taken into account together, which will be done in the future.
In time-varying channels, the SIC performance can be significantly improved if the input signals are delayed with respect to the time-varying estimate of the channel response as shown in Fig. 3. However, to our knowledge, this opportunity for FD systems has not been investigated yet. Introducing a delay between the channel estimate and the inputs to the adaptive filter results in a problem in measuring the cancellation performance. The residual SI power is normally used to characterise the SIC performance [5], [24], [25], which can be measured by the mean squared error (MSE) [22]. However, the MSE in an adaptive filter with a delay is unsuitable for this purpose, since, in this case, unlike the classical RLS algorithms, the same data is used for channel estimation and computation of the MSE, resulting in over-fitting. Therefore, another measure of SIC performance is required when using adaptive filters with a delay.
In this article, we propose and investigate the SIC factor (SICF) for measuring the cancellation performance and a new adaptive algorithm for FD UWA systems with time-varying SI channels. The contributions of this article are as follows.
1) The SICF is proposed for evaluation of the SIC performance in digital SI cancellers.
2) The dependence between the delay of the input signals and the SIC performance for the exponential window RLS (ERLS) and sliding window RLS (SRLS) adaptive filters is investigated.
3) The new adaptive filter (SRLS-P) is proposed, which is derived based on parabolic approximation of the channel variation in time.
4) The proposed algorithm is investigated using numerical simulations and lake experiments, and its performance is compared with that of the classical RLS adaptive algorithms.

FIGURE 2. Block diagram of the digital cancellation scheme. The PA output s(n) is down-sampled to twice the symbol rate and interleaved into two branches: s_1(i) contains odd samples and s_2(i) contains even samples; x(i) is the baseband received signal; e_1(i) and e_2(i) represent residual signals in the two branches; w_1(i) and w_2(i) are weights computed as suggested in [15].

The rest of the paper is organized as follows. In Section II, the new evaluation metric SICF is described. In Section III, the SRLS-P adaptive filter is derived. Section IV and Section V present simulation results in baseband and passband scenarios, respectively. Section VI compares the SIC performance provided by the adaptive filters using experimental data. Section VII draws conclusions.
Notations: In this article, we use capital and small bold fonts for matrices and vectors, respectively; e.g., $\mathbf{R}$ and $\mathbf{h}$. We also denote the expectation as $E\{\cdot\}$, the transpose of $\mathbf{x}$ as $\mathbf{x}^T$, and the Hermitian transpose of $\mathbf{h}$ as $\mathbf{h}^H$.

II. EVALUATION OF SIC PERFORMANCE
The mean squared error (MSE) and the mean squared deviation (MSD) are normally used for evaluating the channel estimation performance [19], [26]. In subsection II-A, we discuss if it is practical to use these two metrics to evaluate the SIC performance in FD systems. In subsection II-B, a new metric, the SICF, is proposed for evaluation of the SIC performance.

A. MSE AND MSD PERFORMANCE
Consider the SIC scheme shown in Fig. 3. In this scheme, $x(i)$ is a baseband version of the signal received by the hydrophone, and it is modelled as:

$$x(i) = \mathbf{h}^H(i)\,\mathbf{s}(i) + z(i), \tag{1}$$

where $\mathbf{h}(i)$ is the baseband SI channel response at time instant $i$, $s(i)$ is the baseband version of the PA output signal, $\mathbf{s}(i) = [s(i), \ldots, s(i-L+1)]^T$, and $L$ is the channel length. The signal $z(i)$ contains the far-end signal, as well as noise signals such as the ambient noise, ADC noise, etc. In terms of an adaptive filter operating in the identification mode, $\mathbf{s}(i)$ is the regressor and $x(i)$ is the desired signal [18], [19]. Using these signals, the adaptive filter produces an estimate $\hat{\mathbf{h}}(i+T)$ of $\mathbf{h}(i)$. Note that, in classical adaptive filters, $T = 0$ and it is assumed that the estimate $\hat{\mathbf{h}}(i)$ is obtained using the regressor and desired signal up to time instant $i-1$. In this case, the FIR filter shown in Fig. 3 is not required since it is the same as the FIR filter within the adaptive filter with the same input. However, if $T > 0$, the regressors of these FIR filters are different: they are $\mathbf{s}(i)$ for the adaptive filter and the delayed regressor $\mathbf{s}(i-T)$ for the FIR filter. Based on this channel estimate, the SI is cancelled by recovering the SI signal as $\hat{\mathbf{h}}^H(i)\,\mathbf{s}(i-T)$ and subtracting it from the received signal:

$$e(i) = x(i-T) - \hat{\mathbf{h}}^H(i)\,\mathbf{s}(i-T). \tag{2}$$

The performance of an adaptive filter is most often evaluated using the mean squared error (MSE) [18], [19]. The MSE is defined as:

$$\mathrm{MSE}(i) = E\{|e(i)|^2\}. \tag{3}$$

For a classical adaptive filter (with $T = 0$), the SIC performance can be evaluated by computing the MSE. However, by adjusting parameters of an adaptive filter with a delay (non-causal adaptive filter), it is possible to make the MSE even lower than the 'noise-plus-far-end-signal' floor, although this does not mean that the SIC performance is good. It means that not only the SI is cancelled, but also a part of the far-end signal (i.e., the signal of interest).
Essentially, the adaptive filter is over-fitted, since, due to the delay, the same data is used for training the adaptive filter and for the MSE computation. In these scenarios, the MSE becomes an unreliable metric for assessment of the SIC performance. From the interference cancellation point of view, the SIC performance can be evaluated by how much the near-end SI is cancelled. Therefore, everything apart from the near-end SI is treated as the signal of interest (including far-end signal and the noise), which should be recovered. The signal to interference ratio (SIR) at the SI canceller can be written as:

$$\mathrm{SIR}(i) = \frac{E\{|z(i)|^2\}}{E\{|\varepsilon(i)|^2\}}, \tag{4}$$

where $z(i)$ is the signal of interest that includes the far-end signal and the noise, and $\varepsilon(i)$ is the residual interference.

If the far-end signal and the error signal are not correlated, then the residual interference $\varepsilon(i)$ can be represented as:

$$\varepsilon(i) = e(i) - z(i-T), \tag{5}$$

and substituting (1) and (2) into (5), we have:

$$\varepsilon(i) = \left[\mathbf{h}(i-T) - \hat{\mathbf{h}}(i)\right]^H \mathbf{s}(i-T). \tag{6}$$

Assuming that $s(i)$ are uncorrelated for different $i$ and uncorrelated to $\hat{\mathbf{h}}(i)$, we have:

$$E\{\mathbf{s}(i)\,\mathbf{s}^H(i)\} = \sigma_s^2\,\mathbf{I}_L, \tag{7}$$

where $\sigma_s^2 = E\{|s(i)|^2\}$ is the variance of the signal $s(i)$, which is assumed stationary, and $\mathbf{I}_L$ is the $L \times L$ identity matrix. Then using (6) and (7), we obtain:

$$E\{|\varepsilon(i)|^2\} = \sigma_s^2\,\mathrm{MSD}(i),$$

where the MSD is defined as:

$$\mathrm{MSD}(i) = E\left\{\left\|\mathbf{h}(i-T) - \hat{\mathbf{h}}(i)\right\|^2\right\}.$$

Finally, we obtain:

$$\mathrm{SIR}(i) = \frac{E\{|z(i)|^2\}}{\sigma_s^2\,\mathrm{MSD}(i)}. \tag{11}$$

Thus, the MSD is a useful characteristic of an adaptive filter operating within an SI canceller. It shows how much the ratio between powers of the signal of interest (including noise) and near-end interference improves due to the accuracy of the near-end channel estimation. However, the MSD computation requires knowledge of the true channel response $\mathbf{h}(i)$, which is unavailable in most practical scenarios. Another important issue is that (11) is only applicable if the channel estimate and the regressor are uncorrelated, which may not be the case for adaptive filters with delay.
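The cancellation step with a delayed regressor and the squared deviation can be sketched in a few lines. This is an illustrative sketch only, assuming the residual is formed as e(i) = x(i − T) − ĥ^H(i) s(i − T), with ĥ(i) compared against the true response at the delayed instant; the helper names `regressor`, `residual` and `msd` are hypothetical.

```python
def regressor(s, i, L):
    """Return s(i) = [s(i), s(i-1), ..., s(i-L+1)] as a list."""
    if i - L + 1 < 0:
        raise ValueError("not enough past samples for this regressor")
    return [s[i - l] for l in range(L)]


def residual(x, s, h_hat_i, i, T):
    """Canceller output e(i) = x(i-T) - h_hat(i)^H s(i-T)."""
    reg = regressor(s, i - T, len(h_hat_i))
    si_estimate = sum(w.conjugate() * v for w, v in zip(h_hat_i, reg))
    return x[i - T] - si_estimate


def msd(h_true, h_est):
    """Squared deviation ||h - h_hat||^2 at one time instant (average over trials for the MSD)."""
    return sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))


# Toy check with a time-invariant two-tap channel and no far-end signal or noise.
h = [1 + 0j, 0.5j]
s = [1, -1, 1, 1, -1]
x = [0] + [sum(w.conjugate() * v for w, v in zip(h, regressor(s, i, 2)))
           for i in range(1, len(s))]
print(abs(residual(x, s, h, i=3, T=1)), msd(h, [1 + 0j, 0j]))
```

With a perfect channel estimate the residual vanishes, while a missing second tap gives a squared deviation equal to that tap's energy.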

B. SIC FACTOR
In [4], [5], [24], [25], the interference cancellation gain, which is defined as the ratio of the near-end SI power to the residual SI power, is used for evaluating the performance of the SI canceller. Note that the residual SI is computed as in (5) assuming that the far-end signal is not correlated with the error signal. This assumption is no longer valid when adaptive filters with delay are used. We now propose the SICF, which is shown to provide a good indication of the cancellation performance. It does not require the knowledge of the true channel response, and can be used in practice for adaptive filters with and without the delay. This SICF can be used to evaluate the SIC performance without the need of implementing a full FD system.
Here we consider the SI cancellation problem from the far-end signal detection point of view. The higher the far-end signal to residual interference ratio at the SI canceller output, the better the SIC performance. In this scenario, the far-end signal is the signal of interest, and everything else is treated as interference (including noise). Since the far-end signal level is typically higher than the receiver's noise floor, the noise is ignored in the derivation below to simplify the expression. Although the noise is ignored in our derivation, the metric SICF is applicable in the case when the noise is present; this can be seen in numerical results presented in Section IV. Fig. 4 illustrates our description below. We artificially add to the SI signal $r(i)$ a known signal $f(i)$ assumed to be a far-end signal. The level of the signal $\sigma_f^2 = E\{|f(i)|^2\}$ is chosen to guarantee a predefined input SIR:

$$\mathrm{SIR}_{\mathrm{in}} = \frac{\sigma_f^2}{\sigma_r^2}, \tag{12}$$

where $\sigma_r^2 = E\{|r(i)|^2\}$ is the SI power. The SI canceller (shown in Fig. 4) subtracts the SI estimate produced by the adaptive filter from the received signal $r(i) + f(i)$. The canceller output $e(i)$ contains the signal of interest $f(i)$ and a residual signal $\varepsilon(i)$:

$$e(i) = f(i) + \varepsilon(i), \tag{13}$$

and since both signals $e(i)$ and $f(i)$ are available after the cancellation, the residual signal $\varepsilon(i)$ can be computed as $\varepsilon(i) = e(i) - f(i)$. Here we measure the SIC performance as a factor of improvement in the SIR due to the SI cancellation and compute the SICF as:

$$\mathrm{SICF}(i) = \frac{\mathrm{SIR}_{\mathrm{out}}(i)}{\mathrm{SIR}_{\mathrm{in}}}. \tag{14}$$

By introducing the artificially added far-end signal, the SICF that we propose evaluates the SI canceller performance taking into account the loss of the far-end signal after SIC. For classical adaptive filters without delay, the signal of interest $f(i)$ and the residual $\varepsilon(i)$ are uncorrelated, thus $\mathrm{SIR}_{\mathrm{out}}(i)$ can be computed as a ratio of their variances. For adaptive filters with delay, due to the over-fitting in the adaptive filter, in general, these two signals are correlated. Therefore, in this case, we cannot use their ratio for computing $\mathrm{SIR}_{\mathrm{out}}(i)$, and another approach is required.
We now assume that the signal of interest $f(i)$ is attenuated due to the imperfection of the adaptive filter. More specifically, we rewrite (13) as:

$$e(i) = u(i) + v(i), \tag{15}$$

where the modified signal of interest is $u(i) = \alpha f(i)$ and the residual interference is $v(i) = e(i) - \alpha f(i)$. We now find the coefficient $\alpha$ that zeroes the correlation between $u(i)$ and $v(i)$:

$$E\{u^*(i)\,v(i)\} = 0. \tag{17}$$

From (17), we find $\alpha$ as:

$$\alpha = \frac{E\{f^*(i)\,e(i)\}}{E\{|f(i)|^2\}}. \tag{18}$$

After finding $\alpha$, the modified signal of interest $u(i)$ and residual interference $v(i)$ can be computed from (15), and the ratio of their variances can now be used for computation of $\mathrm{SIR}_{\mathrm{out}}(i)$.
In experiments, the mathematical expectation in (18) is replaced by the average over a time interval after convergence of the adaptive filter. The output SIR can be computed as:

$$\mathrm{SIR}_{\mathrm{out}}(i) = \frac{\sum_{p=i-P+1}^{i} |u(p)|^2}{\sum_{p=i-P+1}^{i} |v(p)|^2},$$

where $u(p)$ and $v(p)$ are obtained with $\alpha$ estimated over the same samples, and $P$ is the averaging interval. The averaging interval $P$ is preferred to be longer than the coherence time of the SI channel. Note that the far-end signal we used to compute the SICF is artificially added to the received signal, thus it is known at the receiver. The SICF is intended to be used for adjusting the parameters of the adaptive filters so that optimal SIC performance can be achieved. In practical systems with real far-end transmission, the far-end signal is unknown. In that case, the SICF can still be computed with an artificial far-end signal for parameter tuning at the training stage.
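The SICF computation described in this subsection can be sketched as below. It is a minimal sketch under stated assumptions: expectations are replaced by sums over the supplied samples, the input SIR is taken as the ratio of the far-end power to the power of the signal `r` received before the artificial far-end signal is added, and the function name `sicf_db` is hypothetical.

```python
import math


def sicf_db(e, f, r):
    """SICF in dB from the canceller output e, the known artificial far-end
    signal f, and the received SI signal r (all over the same P samples)."""
    # alpha zeroes the sample correlation between u = alpha*f and v = e - alpha*f.
    f_energy = sum(abs(fi) ** 2 for fi in f)
    alpha = sum(fi.conjugate() * ei for fi, ei in zip(f, e)) / f_energy
    u = [alpha * fi for fi in f]               # attenuated signal of interest
    v = [ei - ui for ei, ui in zip(e, u)]      # residual interference
    sir_out = sum(abs(ui) ** 2 for ui in u) / sum(abs(vi) ** 2 for vi in v)
    sir_in = f_energy / sum(abs(ri) ** 2 for ri in r)
    return 10.0 * math.log10(sir_out / sir_in)


# Toy example: the far-end signal is attenuated to 0.8 and a residual of 0.1 remains.
f = [1, -1, 1, -1]
e = [0.9, -0.7, 0.9, -0.7]
r = [10, 10, 10, 10]
print(round(sicf_db(e, f, r), 2))
```

In this toy example the output SIR is 64 and the input SIR is 0.01, so the SICF is about 38 dB. The sketch does not guard against a residual that is exactly zero.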

III. PROPOSED SRLS-P ADAPTIVE FILTER
In this section, we review the ERLS and SRLS adaptive filters, consider their delayed versions, and propose a new adaptive filter based on the SRLS algorithm and parabolic approximation of channel variation in time; we call it the SRLS-P adaptive filter.

A. CLASSICAL ERLS AND SRLS ADAPTIVE FILTERS
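At every time instant $i$, an RLS adaptive filter updates the solution vector $\hat{\mathbf{h}}(i)$ according to the normal equation:

$$\mathbf{R}(i)\,\hat{\mathbf{h}}(i) = \boldsymbol{\beta}(i),$$

where $\mathbf{R}(i)$ is an $L \times L$ autocorrelation matrix, $\boldsymbol{\beta}(i)$ is an $L \times 1$ cross-correlation vector, and $L$ is the filter length. The autocorrelation matrix and cross-correlation vector are approximated by averaging in time.

For the classical ERLS adaptive filter, $\mathbf{R}(i)$ and $\boldsymbol{\beta}(i)$ can be updated as:

$$\mathbf{R}(i) = \lambda\,\mathbf{R}(i-1) + \mathbf{s}(i)\,\mathbf{s}^H(i),$$
$$\boldsymbol{\beta}(i) = \lambda\,\boldsymbol{\beta}(i-1) + \mathbf{s}(i)\,x^*(i),$$

where $\lambda$ is the forgetting factor, $\mathbf{s}(i) = [s(i), s(i-1), \ldots, s(i-L+1)]^T$ is the regressor at the $i$th time instant, and $x(i)$ is the $i$th sample of the desired signal. The time-averaging window has the exponential weights $\lambda^{|i-p|}$, $p \le i$. For the classical SRLS adaptive filter, the update of $\mathbf{R}(i)$ and $\boldsymbol{\beta}(i)$ can be written as [27], [28]:

$$\mathbf{R}(i) = \mathbf{R}(i-1) + \mathbf{s}(i)\,\mathbf{s}^H(i) - \mathbf{s}(i-M)\,\mathbf{s}^H(i-M),$$
$$\boldsymbol{\beta}(i) = \boldsymbol{\beta}(i-1) + \mathbf{s}(i)\,x^*(i) - \mathbf{s}(i-M)\,x^*(i-M),$$

where $M$ is the length of the sliding window.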
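The two correlation updates can be illustrated with a short NumPy sketch. It assumes the desired signal follows x(i) = h^H s(i), so that the cross-correlation is accumulated as s(i) x*(i); the function names and the small diagonal loading `eps` are illustrative choices, and the normal equation is solved directly rather than recursively.

```python
import numpy as np


def regressor(s, i, L):
    """Return s(i) = [s(i), ..., s(i-L+1)]^T, with zeros before time 0."""
    idx = i - np.arange(L)
    return np.where(idx >= 0, s[np.clip(idx, 0, None)], 0)


def erls_step(R, beta, s_vec, x_i, lam):
    """Exponential-window update of R(i) and beta(i)."""
    R = lam * R + np.outer(s_vec, s_vec.conj())
    beta = lam * beta + s_vec * np.conj(x_i)
    return R, beta


def srls_step(R, beta, s_new, x_new, s_old, x_old):
    """Sliding-window update: add the newest sample, drop the one M steps back."""
    R = R + np.outer(s_new, s_new.conj()) - np.outer(s_old, s_old.conj())
    beta = beta + s_new * np.conj(x_new) - s_old * np.conj(x_old)
    return R, beta


def srls_identify(s, x, L, M, eps=1e-6):
    """Run the sliding-window updates over the data and solve R h = beta once."""
    R = eps * np.eye(L, dtype=complex)   # diagonal loading (assumed) keeps R invertible
    beta = np.zeros(L, dtype=complex)
    for i in range(len(s)):
        if i >= M:
            s_old, x_old = regressor(s, i - M, L), x[i - M]
        else:
            s_old, x_old = np.zeros(L, dtype=complex), 0.0
        R, beta = srls_step(R, beta, regressor(s, i, L), x[i], s_old, x_old)
    return np.linalg.solve(R, beta)


rng = np.random.default_rng(0)
L, M = 4, 50
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)     # time-invariant test channel
s = rng.choice([-1.0, 1.0], size=400).astype(complex)         # BPSK regressor
x = np.array([np.vdot(h, regressor(s, i, L)) for i in range(len(s))])
h_hat = srls_identify(s, x, L, M)
print(np.max(np.abs(h_hat - h)))
```

For a time-invariant, noise-free channel the estimate matches the true response up to the small bias introduced by the diagonal loading.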

B. DELAYED ERLS AND SRLS ADAPTIVE FILTERS
Since R(i) and β(i) are obtained by averaging in time, the current channel estimate ĥ(i) can be seen as an average of the true channel response over past time instants. If the SI channel is time-invariant, ĥ(i) can be an accurate estimate of h(i). However, for a time-varying channel, ĥ(i) is not an accurate estimate of h(i).
For the SRLS adaptive filter, the channel estimate ĥ(i) can be seen as an average of h(i) over the past M time instants. As shown in Fig. 5, if we assume that the channel response varies linearly in the vicinity of i, then its average over the rectangular window is equal to h(i − M/2). In such a case, ĥ(i) is a more accurate estimate of h(i − M/2) than of h(i). Therefore, using the delay T = M/2 in the scheme shown in Fig. 3 should provide an improvement in the SIC performance compared to the case T = 0. In Section IV, we demonstrate that this is indeed the case. For the ERLS adaptive filter, the time window is infinite in length, and it is more difficult to determine the optimal delay which provides the highest level of cancellation. Moreover, in Section IV, we also show that even for the same forgetting factor λ, different channel realisations require different T. Therefore, our proposed adaptive filter is based on the SRLS algorithm, for which the optimal delay is well defined. We refer to the ERLS and SRLS algorithms with delays as ERLSd and SRLSd, respectively, to distinguish them from the classical RLS algorithms.
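The argument for the delay T = M/2 can be checked numerically on a single tap. The sketch below assumes a linearly varying scalar "channel" with arbitrary illustrative coefficients; it only demonstrates the averaging property and is not an adaptive filter.

```python
def window_average(h, i, M):
    """Average of h(p) over the rectangular window p = i-M+1, ..., i."""
    return sum(h(p) for p in range(i - M + 1, i + 1)) / M


def h_lin(p):
    """Linearly varying tap (illustrative coefficients)."""
    return 2.0 + 0.01 * p


i, M = 500, 100
avg = window_average(h_lin, i, M)
err_current = abs(avg - h_lin(i))            # error if avg is used as an estimate of h(i)
err_delayed = abs(avg - h_lin(i - M // 2))   # error if used as an estimate of h(i - M/2)
print(avg, err_current, err_delayed)
```

For a discrete window the average equals the tap value at i − (M − 1)/2, so the mismatch with h(i − M/2) is half a sample of drift, which is much smaller than the mismatch with h(i).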

C. SRLS-P ADAPTIVE FILTER
Compared to the SRLS algorithm, the SRLSd adaptive filter improves the MSD performance, and, as a result, it improves the SIC performance by applying the current channel estimate found at the ith time instant to the delayed regressor s(i − M /2), corresponding to the middle of the averaging time window of length M . It changes the way the SI signal is reconstructed, but the channel estimates are computed in the same way as in the classical SRLS adaptive filter.
In fast time-varying channels, the channel estimation performance provided by the SRLSd algorithm is still limited, since the channel estimate can be viewed as simply an average of the true channel response over the past M time instants. To improve the tracking ability in fast time-varying channels, we propose the SRLS-P adaptive filter. The key idea of the algorithm is the parabolic interpolation of the channel time variation using the estimates ĥ(i) provided by the SRLS algorithm.
We assume that the time-varying channel response is a second-order algebraic polynomial within a short time interval around the time instant i, as shown in Fig. 6:  By substituting (25) into (26) for k = 0, k = M/2, and k = M, we obtain a system of equations with respect to the unknown 3L × 1 vector z = [h_0(i); h_1(i); h_2(i)]. By solving the system, we obtain an estimate ĥ_0(i) of h_0(i), which also serves as the new channel estimate of h(i).
More specifically, we have: where Similarly, we obtain: where and We now arrive at the system of equations: or, in a compact form,  (28)

The SRLS-P adaptive algorithm is summarized in Algorithm 1, where is a regularization parameter, s is the transmitted signal, and x is the desired signal.

The complexity of the SRLS-P algorithm is dominated by the complexity of solving the system of equations in (39). Directly solving the system of equations requires on the order of $L^3$ arithmetic operations. The complexity can be reduced by solving the system of equations recursively, based on the solution obtained at the previous time instant, using the dichotomous coordinate descent (DCD) algorithm [27]. In such a case, the complexity reduces to the order of $N_u L$ operations, where $N_u$ is the number of DCD updates, which is typically a small number.
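The parabolic idea can be illustrated on a single tap. The sketch below is not the authors' Algorithm 1, which solves a 3L × 3L system built from correlation quantities. It is a scalar toy that assumes each sliding-window estimate equals the plain average of the true tap over its window (roughly the case for a white regressor), and it recovers the polynomial coefficients from three such averages taken over windows ending at i, i − M/2 and i − M. The reference instant `c` and all function names are assumptions made for the illustration.

```python
import numpy as np


def window_moments(i_end, M, c):
    """Mean of (p - c)^m, m = 0, 1, 2, over the window p = i_end-M+1, ..., i_end."""
    t = np.arange(i_end - M + 1, i_end + 1) - c
    return np.array([1.0, t.mean(), (t ** 2).mean()])


def parabolic_fit(avgs, i, M, c):
    """Recover [a0, a1, a2] of h(p) = a0 + a1 (p-c) + a2 (p-c)^2 from three
    window averages over windows ending at i, i - M/2 and i - M."""
    A = np.vstack([window_moments(i - k, M, c) for k in (0, M // 2, M)])
    return np.linalg.solve(A, np.asarray(avgs))


# Toy check on a quadratic tap trajectory (illustrative coefficients).
i, M = 1000, 100
c = i - M                                   # assumed reference instant: middle of the 2M interval
true = np.array([1.0, 0.02, -1e-4])


def h_true(p):
    t = p - c
    return true[0] + true[1] * t + true[2] * t ** 2


avgs = [h_true(np.arange(i - k - M + 1, i - k + 1)).mean() for k in (0, M // 2, M)]
coef = parabolic_fit(avgs, i, M, c)
print(coef, avgs[0] - h_true(i))
```

The three averages span an interval of 2M samples, and the fit returns the tap value at the reference instant exactly for a quadratic trajectory, whereas a single window average is biased.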

IV. BASEBAND SIMULATION
In this section, we first show that the delayed RLS algorithms provide improvement in the MSD performance when identifying time-varying channels and then investigate the dependence of the performance on the delay. It will be shown that, for the SRLSd algorithm, the optimal delay is T = M /2, as discussed in Section III-B. However, for the ERLSd algorithm, there is no one-to-one relationship between the optimal delay and the forgetting factor λ.
We show that the MSE is useful for characterising the SIC performance if T = 0, i.e., for classical RLS algorithms. However, if T > 0, the MSE is not a useful characteristic for this purpose. We then show that the proposed SICF metric is suitable for characterising the SIC performance for both the cases, in particular by comparing it with the bit error rate (BER) performance of a far-end transmission.
In the simulation, we set the filter length to $L = 50$, and model the SI channel as follows. Every element $[\mathbf{h}(i)]_\ell$ of $\mathbf{h}(i)$ is a stationary random process with a power spectral density $c_\ell G(f)$, where $G(f)$ is uniform within a frequency interval $[-f_{\max}, f_{\max}]$, and $c_\ell$ is the variance of the $\ell$th channel tap. The UWA channel normally has a decaying power delay profile due to the spreading and absorption loss [29]. The power delay profile $c_\ell$ is generated using a parameter $\gamma$, which is chosen to control the ratio between the variance of the latest arrivals ($\ell = L-1$) and that of the first arrivals ($\ell = 0$). In this scenario, $\gamma$ is chosen to make this ratio equal to 80 dB. The random processes $[\mathbf{h}(i)]_\ell$ are independent for different $\ell$, and they are generated using the FFT-method [30]. We assume a sampling frequency $f_s = 1$ kHz, so that one channel tap delay is 1 ms. The parameter $f_{\max}$ determines the maximum speed of the channel variation. To model fast time-varying channels, we use $f_{\max} = 1$ Hz; for slow time-varying channels, $f_{\max} = 0.1$ Hz.
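A channel of this kind can be generated along the following lines. This is a sketch under explicit assumptions: the power delay profile is taken to be exponentially decaying (the exact profile expression is not reproduced here), the band-limited processes are obtained by filling only the in-band FFT bins with complex Gaussian values, and each tap is rescaled to the target mean power. The function name and arguments are hypothetical.

```python
import numpy as np


def generate_channel(L, N, fs, f_max, ratio_db, rng):
    """Return an N x L array h[i, l] of independent band-limited tap trajectories."""
    # ASSUMPTION: exponential power delay profile, first tap 0 dB, last tap -ratio_db dB.
    taps = np.arange(L)
    c = 10.0 ** (-ratio_db * taps / (10.0 * (L - 1)))
    # Flat spectrum inside [-f_max, f_max], zero outside.
    freqs = np.fft.fftfreq(N, d=1.0 / fs)
    in_band = np.abs(freqs) <= f_max
    n_band = int(in_band.sum())
    spec = np.zeros((N, L), dtype=complex)
    spec[in_band] = rng.standard_normal((n_band, L)) + 1j * rng.standard_normal((n_band, L))
    h = np.fft.ifft(spec, axis=0)
    # Scale every tap so that its mean power equals c[l].
    h *= np.sqrt(c / np.mean(np.abs(h) ** 2, axis=0))
    return h


rng = np.random.default_rng(1)
h = generate_channel(L=50, N=5000, fs=1000.0, f_max=1.0, ratio_db=80.0, rng=rng)
tap_power = np.mean(np.abs(h) ** 2, axis=0)
print(h.shape, 10 * np.log10(tap_power[0] / tap_power[-1]))
```

With N = 5000 samples at 1 kHz the frequency resolution is 0.2 Hz, so only a handful of bins are excited for f_max = 1 Hz; longer realisations give a smoother approximation of the uniform spectrum.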
In Fig. 7, a snapshot of the channel impulse response generated through the aforementioned process is shown.

A. MSD PERFORMANCE OF RLS ALGORITHMS WITH A DELAY
In Fig. 9, we observe that the MSD performance of the ERLS algorithm can also be improved by introducing a delay. As can be seen, in this simulation scenario, the minimum MSD is achieved for λ = 0.955 and T = 37. For λ = 0.94 and λ = 0.97, the minimum MSD is achieved at T = 31 and T = 45, respectively. For the ERLS algorithm, from Fig. 9, one can arrive at the following approximate expression for the optimal delay $T_{\mathrm{opt}}$:

$$T_{\mathrm{opt}} \approx \frac{\beta}{\sqrt{1-\lambda}}, \tag{43}$$

where β = 7.8. Note that (43) cannot provide the optimal delay precisely; it can only be used as a reference.
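The three (λ, T) pairs quoted above are consistent with an inverse-square-root dependence on 1 − λ. The check below assumes the approximate expression has the form T_opt ≈ β/√(1 − λ) with β = 7.8; this form is inferred from the quoted values and should be treated as an assumption.

```python
import math


def t_opt(lam, beta=7.8):
    """Approximate optimal delay for the ERLSd filter (assumed form beta / sqrt(1 - lam))."""
    return beta / math.sqrt(1.0 - lam)


for lam, t_reported in [(0.94, 31), (0.955, 37), (0.97, 45)]:
    print(lam, round(t_opt(lam), 1), t_reported)
```

The predicted delays are within one sample of the reported minima, in line with the remark that the expression serves only as a reference.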
To test whether the dependencies between the optimal delay and the window parameters apply generally, we ran 1000 simulation trials to find the distribution of the optimal delay for the SRLSd and ERLSd adaptive filters, with M = 100 and λ = 0.955. The results show that, for the SRLSd algorithm, the optimal delay is T = M/2 in all simulation trials. However, for the ERLSd adaptive filter, the minimum MSD is obtained at T = 37 in 91.5% of the trials, while, in the other trials, the optimal delay is T = 36 or T = 38.

The MSE, MSD and SIC performance are computed over the steady-state part of the learning curve from 1000 to 5000 samples. The averaging interval for the SIC factor computation is 4 s. These three evaluation metrics are all averaged over 20 simulation trials. We consider the case when the power of the far-end signal is significantly higher than the noise power, thus the noise is not added to the far-end signal. The far-end signal to SI ratio is set to −43 dB.
We can see that, for the SRLS algorithm (T = 0), the optimal sliding window length M found from the MSE and MSD curves is about the same (M = 60 or 70). However, for the other algorithms with T > 0, the optimal M corresponding to the minimum MSE and MSD are different.
The SRLS-P adaptive filter has a significantly improved MSD performance compared to the SRLSd algorithm, which in turn outperforms the SRLS algorithm. Note that, in the SRLS-P algorithm, there are 3L unknown parameters to be estimated. Therefore, since the estimation interval in the SRLS-P algorithm is 2M, the estimation requires the window length to be at least M = 3L/2 = 75; this explains the increase of the MSD at low M.
The results in Fig. 10 show that the MSE is lower than the far-end signal to SI ratio for the SRLSd adaptive filter with M < 80. This indicates that the far-end signal is partly cancelled, therefore the MSE is not useful as a performance measure here. In Fig. 10 (e) and (f), we show the SICF of the adaptive filters together with the inverse MSD. It is seen that the SICF and the inverse MSD for the SRLS adaptive filter are nearly the same, as expected from (11). For the adaptive filters with delay, there is some discrepancy between them for small M . We will show in the next section that the proposed SICF metric provides a better indication of performance of the SI canceller than the MSD.

C. MSD, SIC AND BER PERFORMANCE OF SRLSd AND SRLS-P ALGORITHMS
We now investigate the relationship between the MSD, SIC and BER performance provided by the SI canceller in a fast time-varying channel ($f_{\max} = 1$ Hz) when using the SRLSd and SRLS-P algorithms. Fig. 11 shows these three characteristics for different values of $M$. We run 500 simulation trials, and in each trial a new time-varying channel is generated. The length of the realization is 15 s. The received signal is generated by adding the far-end signal and noise to the SI channel output. Samples of the noise are generated as Gaussian random zero-mean numbers. The noise variance $\sigma_n^2$ is set according to the SI to noise ratio ($\mathrm{SNR}_{\mathrm{SI}}$), which is defined as

$$\mathrm{SNR}_{\mathrm{SI}} = \frac{E\{|r(i)|^2\}}{\sigma_n^2},$$

where $r(i)$ is the SI signal. We use the BPSK direct sequence spread spectrum signal as the far-end signal. The chip rate is 1 kHz, and the spreading factor is 250. The far-end channel is assumed to be a single path channel. The far-end signal level is defined by the far-end SNR as $\sigma_f^2/\sigma_n^2$. Here we set $\mathrm{SNR}_{\mathrm{SI}} = 43$ dB, and the far-end SNR varies from 10 dB to 19 dB. The SICF is computed over the steady-state period from 2 to 15 s, which is about ten times longer than the correlation time of the SI channel.
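The signal levels implied by these settings can be worked out directly. The sketch below assumes the SI to noise ratio is the ratio of the SI power to the noise variance and normalises the SI power to one; the function names are hypothetical. The despreading gain uses the standard relation 10 log10 of the spreading factor.

```python
import math


def db_to_lin(db):
    """Convert a power ratio in dB to linear scale."""
    return 10.0 ** (db / 10.0)


def signal_levels(si_power, snr_si_db, far_end_snr_db):
    """Noise variance and far-end signal variance for given SI power and SNRs."""
    noise_var = si_power / db_to_lin(snr_si_db)
    far_end_var = noise_var * db_to_lin(far_end_snr_db)
    return noise_var, far_end_var


noise_var, far_end_var = signal_levels(si_power=1.0, snr_si_db=43.0, far_end_snr_db=10.0)
far_end_to_si_db = 10.0 * math.log10(far_end_var / 1.0)
despreading_gain_db = 10.0 * math.log10(250)
print(round(far_end_to_si_db, 1), round(despreading_gain_db, 1))
```

At a far-end SNR of 10 dB, the far-end signal is therefore 33 dB below the SI before cancellation, and despreading with a factor of 250 contributes about 24 dB of processing gain.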
The performance of the SRLSd algorithm is shown in Fig. 11 (a), (c) and (e). Fig. 11 (a) shows the detection performance of the far-end data after SIC, which is an important indicator of the performance of an FD communication system. The best detection performance is achieved with M = 140 or M = 160 when the far-end SNR is lower than 16 dB. The BER slightly degrades for M = 120, and further degrades for smaller M. However, the MSD gives a different indication, as the minimum MSD is achieved with M = 100 or M = 120 when the far-end SNR is lower than 16 dB.
The SICF indicates that the best performance is achieved with M = 140 when the far-end SNR is lower than 14 dB and with M = 160 when the far-end SNR is between 14 dB and 19 dB. It is clear that the SICF provides a better indication of the optimal M for the detection performance.
More importantly, in practice, the MSD is difficult to compute since the true channel response is unknown, whereas the proposed SICF metric is computed without such knowledge as explained in Section II.
In Fig. 11(b), (d) and (f), the BER, MSD and SIC performance of the SRLS-P algorithm are shown. The far-end SNR now varies from −11 dB to −2 dB. We consider a much lower far-end signal level compared to that used for the SRLSd algorithm to generate the BER curves, as the SIC performance is significantly improved with the SRLS-P algorithm. It is seen that the optimal detection performance is achieved with M = 140. The dependence between M and the BER performance is consistent with that of the MSD and the SICF. Overall, the SRLS-P algorithm with optimal M outperforms the SRLSd adaptive filter by around 20 dB in terms of the MSD and SICF. It is observed that the BER curve with the optimal M is also shifted in the far-end SNR by about the same value.

V. PASSBAND SIMULATION RESULTS
In this section, we investigate the SIC performance of the SRLS, SRLSd and SRLS-P adaptive filters in scenarios with time-varying SI channels. We use the SIC scheme shown in Fig. 2. The SI channel has one direct path between the projector and hydrophone and one path due to reflection from a time-varying surface. The reflected path is 20 dB weaker than the direct path. The surface is modelled as a sinusoid wave of 0.5 m amplitude and 3 s period. The projector and hydrophone are vertically separated by a distance of 0.5 m, their depths are 9.5 m and 10 m, respectively. We will show that the SIC performance can be significantly improved by the SRLS-P adaptive filter which accurately models the channel variation caused by the time-varying surface reflection.
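The time variation of the reflected path in this geometry can be sketched with simple image-source reasoning. The assumptions here are not taken from the text: a nominal sound speed of 1500 m/s, transducers aligned on one vertical line, and specular reflection from the surface point directly above them. All names are hypothetical.

```python
import math

SOUND_SPEED = 1500.0  # m/s, ASSUMED nominal value; not given in the text


def surface_elevation(t, amplitude=0.5, period=3.0):
    """Sinusoidal surface displacement in metres at time t (seconds)."""
    return amplitude * math.sin(2.0 * math.pi * t / period)


def reflected_excess_delay(t, d_proj=9.5, d_hyd=10.0):
    """Delay of the surface-reflected path relative to the direct path, in seconds."""
    eta = surface_elevation(t)
    direct = abs(d_hyd - d_proj)
    reflected = (d_proj + eta) + (d_hyd + eta)   # up to the surface and back down
    return (reflected - direct) / SOUND_SPEED


for t in (0.0, 0.75, 2.25):
    print(t, round(1e3 * reflected_excess_delay(t), 2), "ms")
```

Under these assumptions the excess delay swings between about 12 ms and 13.3 ms over a wave period, i.e. roughly 12 to 14 symbol intervals at a 1 kHz symbol rate.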
In the simulation, a 10 s signal with BPSK (binary phase-shift keying) modulation at a 12 kHz carrier frequency and with 1.2 kHz signal bandwidth is transmitted. The symbol rate is f d = 1 kHz. The BPSK symbols are pulse shaped using the root-raised cosine filter with a roll-off factor of 0.2. The sampling rate of the passband signal is 96 kHz.
The received signal at the hydrophone is generated by adding the far-end signal and noise to the SI channel output. Here we set SNR_SI = 100 dB and consider the far-end SNR between 0 dB and 15 dB. Fig. 12 shows the SIC performance of the SRLS, SRLSd and SRLS-P adaptive filters. The SIC factor is computed over the time interval from 2 s to 10 s, i.e., the averaging interval for computing the SICF is 8 s. For each adaptive filter, the parameter M is adjusted to provide the highest SICF. The filter length is L = 40, which is long enough to cover both the main path and the surface reflection. For the SRLS adaptive filter, around 81 dB of SIC can be achieved at 0 dB far-end SNR (M = 60). The SICF is improved by 3 dB when the SRLSd adaptive filter (M = 110) is used, and it is further improved by 14 dB, to 98 dB, with the SRLS-P adaptive filter (M = 240).

VI. EXPERIMENTAL RESULTS
In this section, we investigate the SIC performance of the SRLS, SRLSd and SRLS-P adaptive filters in the lake experiment with the SIC scheme shown in Fig. 2. In the experiment, a Zoom F4 multitrack recorder [31] with a high-resolution 24-bit ADC is used to record the PA output and the hydrophone output. The PA output is fed to the recorder through an attenuator to avoid truncating the signal or damaging the recorder with the high voltage level.
The configuration and experimental setup are shown in Figs. 13 and 14, respectively. The lake depth at the experimental site is around 8 m. The distance between the projector and the hydrophone is around 1.3 m. The hydrophone is placed at 4 m depth. The experimental site is positioned in the middle of the lake. In Fig. 16, we show a picture of the lake surface taken during the experiment. It was observed during the experiment that the amplitude of the surface waves varied from 5 cm to 10 cm. More information on the experimental site can be found in [32].
In the experiment, we transmit a 15 s BPSK signal at the carrier frequency f_c = 14 kHz with a bandwidth of 1.2 kHz; the symbol rate is f_d = 1 kHz; the pulse shaping roll-off factor is 0.2. The sampling rate is f_s = 96 kHz. At 14 kHz, the transmit voltage response of the transducer [33] is 118 dB re µPa/V at 1 m. During the experiment, the sound pressure level at 1 m range is around 166 dB re µPa.
In Fig. 17, we show the SI channel estimates obtained with the SRLS-P adaptive filter, which provides the highest SICF among the adaptive filters we considered. It can be seen that the SI channel consists of a strong and stable direct path and multiple fast time-varying paths due to reflections from the mounting system and from the lake surface and bottom. The direct path is the one associated with the highest amplitude (at tap 12). Apart from the direct path, there are also a few relatively stable reflections from the structure we used to fix the transducer and hydrophone (shown in Fig. 13). Assuming the sound speed is 1500 m/s, the delay between the direct path and the first surface reflection should be around 3.4 ms. This is consistent with the channel estimates, as the first surface reflection arrives at the 16th tap. The rest of the multipath components are due to multiple reflections from the surface, bottom and the mounting system.
In the experiment, the SI-to-noise ratio is around 48 dB, as shown in Fig. 15. This SNR level is mostly defined by the electrical noise and the acoustic noise coming into the water from the metallic construction. The filter length is L = 80, which is long enough to cover the channel delay spread, including the direct path and multiple reflections from the surface and bottom. The SICF is computed over the time interval from 2 s to 15 s. Fig. 18 shows the SIC performance of the three adaptive filters with the optimal sliding window lengths M. For the SRLS adaptive filter, at 0 dB far-end SNR, 25.5 dB of SIC is achieved when M = 110. The SICF is improved to 29 dB when the SRLSd adaptive filter with M = 190 is used. The SRLS-P adaptive filter with M = 220 achieves 32 dB of SICF.
The experimental results demonstrate that the SRLS-P adaptive filter provides the best SIC performance among the three adaptive filters. More than 6 dB improvement in the SICF can be achieved by using the SRLS-P adaptive filter instead of the SRLS adaptive filter.
However, it is seen that even with the SRLS-P adaptive filter, the level of the residual SI is still higher than the level of the far-end signal. At 0 dB far-end SNR, with 32 dB of SICF, the residual SI is 16 dB higher than the far-end signal. At 15 dB far-end SNR, the SICF is around 29 dB, and the residual SI is 4 dB higher than the far-end signal. Nevertheless, with such a level of SI cancellation, it becomes possible to detect far-end signals with specific modulation techniques, such as the spread-spectrum modulation demonstrated in Section IV.
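The residual SI levels quoted above follow from a simple decibel budget. The sketch below assumes that the SICF can be treated as the reduction in SI power and uses the SI-to-noise ratio of about 48 dB measured in this experiment; the function name is illustrative.

```python
def residual_si_over_far_end_db(si_to_noise_db, far_end_snr_db, sicf_db):
    """Residual SI level relative to the far-end signal, in dB.

    Before cancellation the SI exceeds the far-end signal by
    (si_to_noise_db - far_end_snr_db); cancellation lowers it by sicf_db.
    """
    return si_to_noise_db - far_end_snr_db - sicf_db


SI_TO_NOISE_DB = 48.0  # measured SI-to-noise ratio in the lake experiment

print(residual_si_over_far_end_db(SI_TO_NOISE_DB, 0.0, 32.0))   # 16.0
print(residual_si_over_far_end_db(SI_TO_NOISE_DB, 15.0, 29.0))  # 4.0
```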
It can be seen that the improvement in SICF for the lake experiment is lower than that achieved in the passband simulation. The power spectral density computed for the first reflection from the lake surface (with an amplitude of about 0.4, as seen in Fig. 17) shows that f_max > 2 Hz. For the further reflections from the lake surface and bottom, as can be seen in Fig. 17, the variation speed is even higher. With M = 220, the product of the estimation window length (0.44 s) and f_max is already close to one, which is less than the Nyquist lower boundary. With such settings, one cannot expect high accuracy of estimating the SI channel due to high modelling errors [34]. Still, the SRLS-P algorithm shows an improvement of 5.5 to 6 dB over the SRLS algorithm and of 1.5 to 2.5 dB over the SRLSd algorithm.
The estimation accuracy could be improved by using a smaller M. However, for identifiability, the number of available signal samples (2M) should be higher than the number of unknown parameters (3L), i.e., M > 3L/2. For M very close to the boundary 3L/2, the algorithm performance is limited (see Fig. 10). A reduction in L allows a smaller M, but, in this case, the SIC performance will be limited by the SI arrivals being truncated by the filter.
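The two constraints on the window length discussed above can be checked numerically. This sketch assumes that the adaptive filter runs at one sample per symbol (1 kHz), which is what the quoted 0.44 s window for M = 220 implies, and it takes f_max = 2 Hz, the lower limit quoted for the first surface reflection. The helper names are illustrative.

```python
def window_fmax_product(m, f_max_hz, sample_rate_hz=1000.0):
    """Product of the estimation window duration (2M samples) and f_max."""
    return (2 * m / sample_rate_hz) * f_max_hz


def min_identifiable_m(filter_length):
    """Smallest integer M satisfying 2M > 3L."""
    return (3 * filter_length) // 2 + 1


print(window_fmax_product(220, 2.0))  # about 0.88 for the experimental setting
print(min_identifiable_m(80))         # 121 for the experimental L = 80
print(min_identifiable_m(40))         # 61 for the simulation L = 40
```

With L = 80, any M up to 120 leaves the parabolic model under-determined, so the window cannot be shortened much below the value M = 220 used here without also shortening the filter.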

VII. CONCLUSION AND FURTHER WORK
In this article, the SICF has been proposed as a practical measure of the SIC performance in FD UWA systems. The SICF has been investigated in comparison with the MSE, MSD and BER. It is shown through numerical simulation that the proposed metric provides a good indication of the SI canceller performance.
To improve the SIC performance of the RLS adaptive filters, we have considered their delayed versions, the SRLSd and ERLSd adaptive filters. The dependence of the SIC performance on the delay of the input signals for these adaptive filters has been investigated using numerical simulations. We have shown that, for the SRLSd adaptive filter, the optimal delay is half of the sliding window length. For the ERLSd adaptive filter, the relationship between the optimal delay and the forgetting factor can differ for different channel realizations, although, with an optimal delay, the ERLSd adaptive filter can provide the same level of SIC performance as the SRLSd adaptive filter.
We have proposed the SRLS-P adaptive filter, which is based on the SRLS algorithm and on modelling the channel response variation within a short time interval as a second-order algebraic polynomial. The SIC performance of the SRLS-P adaptive filter has been investigated and compared with that of the SRLS and SRLSd adaptive filters using numerical simulations and lake experiments. The SRLS-P algorithm achieves the highest SICF among these adaptive filters.
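The idea behind the second-order polynomial model can be illustrated with a toy example. The sketch below is not the recursive SRLS-P algorithm. It is a batch least-squares fit for a single real-valued tap (L = 1) over a window of 2M noise-free samples, with the time axis normalised to the window. The test signal, the coefficient values and the helper names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_parabolic_tap(x, d, m):
    """Fit d[k] = (h0 + h1*t + h2*t^2) * x[k], t = (k - m) / m, k = 0..2m-1."""
    normal = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for k in range(2 * m):
        t = (k - m) / m
        basis = [x[k], t * x[k], t * t * x[k]]
        for r in range(3):
            rhs[r] += basis[r] * d[k]
            for c in range(3):
                normal[r][c] += basis[r] * basis[c]
    return solve_linear(normal, rhs)


# Hypothetical noise-free test case: BPSK-like input, parabolic tap variation.
M = 50
TRUE_COEFFS = (0.8, 0.3, -0.2)  # h0, h1, h2
x = [1.0 if (7 * k) % 3 else -1.0 for k in range(2 * M)]
d = []
for k in range(2 * M):
    t = (k - M) / M
    d.append((TRUE_COEFFS[0] + TRUE_COEFFS[1] * t + TRUE_COEFFS[2] * t * t) * x[k])

estimate = fit_parabolic_tap(x, d, M)
# Time-invariant least-squares fit over the same window, for comparison.
constant_fit = sum(xk * dk for xk, dk in zip(x, d)) / sum(xk * xk for xk in x)
print(estimate, constant_fit)
```

In this toy case the parabolic fit recovers the three coefficients, whereas the time-invariant fit returns the window average of the tap, which differs from the tap value at the window centre (h0). The sketch uses three unknowns per tap, matching the 3L unknown parameters counted in the identifiability condition.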
Although the SRLS-P adaptive filter greatly improves the SIC performance in the simulations, the improvement in the experiment is smaller because the surface variations are too fast. As further work, we will look into modelling the time-varying channel with higher-order polynomials to improve the approximation accuracy. A full FD setup will also be considered.

APPENDIX
We now derive the representation (26) for the channel estimate ĥ(i) obtained by the SRLS algorithm.
In the SRLS adaptive filter, without the noise, the estimate at time instant i is given bŷ  m). By replacing i with i + k, this can also be rewritten as: