A Steady-State Kalman Predictor-Based Filtering Strategy for Non-Overlapping Sub-Band Spectral Estimation

This paper focuses on suppressing spectral overlap for sub-band spectral estimation, with which we can greatly decrease the computational complexity of existing spectral estimation algorithms, such as nonlinear least squares spectral analysis and non-quadratic regularized sparse representation. Firstly, our study shows that the nominal ability of the high-order analysis filter to suppress spectral overlap is greatly weakened when filtering a finite-length sequence, because many meaningless zeros are used as samples in convolution operations. Next, an extrapolation-based filtering strategy is proposed to produce a series of estimates as the substitutions of the zeros and to recover the suppression ability. Meanwhile, a steady-state Kalman predictor is applied to perform a linearly-optimal extrapolation. Finally, several typical methods for spectral analysis are applied to demonstrate the effectiveness of the proposed strategy.

biomedical signal processing, and so on. In particular, sparse representation [2][3][4] opens an exciting new vision for spectral analysis. However, such methods are usually accompanied by high computational complexity, which makes their availability somewhat limited.
Sub-band decomposition-based spectral estimation (SDSE) [5] is an important research direction in spectral estimation, because it has several advantageous features, e.g., computational complexity decrease, model order reduction, spectral density whiteness, reduction of linear prediction error for autoregressive (AR) estimation and the increment of both frequency spacing and local signal-to-noise ratio (SNR) [6]. These features have been theoretically demonstrated under the hypothesis of the ideal infinitely-sharp bandpass filter bank [7]. Subsequent studies [8][9][10] indicate that these benefits aid complex frequency estimation in sub-bands, thereby enabling better estimation performance than that achieved in full-band. In addition, the computational complexity of most algorithms for spectral analysis has a superlinear relationship with the data size, and sub-band decomposition can considerably speed up these algorithms. Independently handling each sub-band enables parallel processing, which can further improve the computational efficiency. Both advantages are crucial for reducing the computational burden, especially when analyzing multi-dimensional big data, such as polarimetric and/or interferometric synthetic aperture radar images of large scenes.
Unfortunately, the ideal infinitely-sharp bandpass filter cannot be physically realized, and non-ideal (realizable) filters introduce energy leakage and/or frequency aliasing phenomena [11]. Due to these non-ideal frequency characteristics of analysis filters, spectral overlap between any two contiguous sub-bands occurs during the sub-band decomposition. Then, the performance of SDSE severely degrades.
In the relevant literature, several methods have been proposed to mitigate spectral overlap. We classify these methods into three categories. The first category is defined as ideal frequency domain filtering with a strict box-like spectrum, such as "ideal" Hilbert transform-based half-band filters [9] and harmonic wavelet transform-based filters [12,13]. Theoretically, sub-band decomposition with these filters is immune to spectral overlap. However, discrete Fourier transform will inevitably induce spectral energy leakage, which can likewise distort sub-band decomposition. The second category is known as convolution filtering with wavelet packet filters [8], Kaiser window-based prototype cosine modulated filters, discrete cosine transform (DCT) IV filters [10] and Comb filters [6,14]. It seems that increasing the filter order can improve the filtering performance and also the spectral overlap suppression capability. However, in the context of involving a finite-length sequence and performing convolution filtering, the nominal improvement of performance will lead to spectral energy leakage and inferior filtering accuracy [10]. Considering the compromise between suppressing spectral overlap and reducing spectral energy leakage, we have to restrict the filter order. The third category is frequency-selective filtering, and a representative method is SELF-SVD (singular value decomposition-based method in a selected frequency band) [15]. Essentially, SELF-SVD attempts to attenuate the interferences of the out-of-band components by the post-multiplication with an orthogonal projection matrix. Unfortunately, the attenuation is often insufficient when the out-of-band components are much stronger than the in-band components or the SNR is relatively low. In this case, the estimation of the in-band frequencies is seriously affected.
In this paper, a new filtering strategy is proposed to suppress spectral overlap for sub-band spectral estimation. First, we discuss the formation mechanism of spectral overlap. Nominally, a high-order finite impulse response (FIR) filter usually has a powerful ability in spectral overlap suppression. However, once we perform such a filter on a finite-length sequence with the convolution operation, the non-given samples at the forward and backward sampling periods of the sequence are assumed to be zeros. A certain filtering error therefore occurs and conversely disrupts the decomposed sub-bands. As a result, sub-band spectral analysis severely suffers from the mutual overlap of adjacent sub-band spectra. Second, we propose a filtering strategy to eliminate the filtering error and recover the suppression ability. This strategy intuitively takes the place of the artificial zeros with some extrapolated samples. Toward the problem of data extrapolation, many algorithms have been proposed based on various theories, such as linear prediction [16], Gerchberg-Papoulis [17], Slepian series [18], linear canonical transform [19] and sparse representation [20]. To establish an efficient method for the extrapolation in context and to evaluate the effectiveness of the proposed strategy, we preliminarily develop a linearly-optimal extrapolation based on the classical AR model identification and the Kalman prediction [21][22][23]. Third, we derive the formulas to estimate the residual filtering error and adapt two common information criteria with adaptive penalty terms for AR order determination. Moreover, equiripple FIR filters are applied as analysis filters in coordination with the proposed filtering, because of their advantageous features [11]. Finally, the entire algorithm and the computational complexity are summarized. Some details, such as the sub-band spectrum mosaicking procedure and parameter selection, are discussed in practice.
The remainder of the paper is organized as follows. In Section 2, the formation mechanism of spectral overlap is discussed. Based on this, a steady-state Kalman predictor-based filtering strategy is developed to suppress the overlapped spectra. In Section 3, the proposed filtering strategy is discussed for SDSE. In Section 4, experimental results with several typical algorithms for spectral analysis demonstrate the effectiveness of the proposed strategy. Finally, Section 5 concludes this paper.

Signal Filtering Based on AR Model Identification and Kalman Prediction
This section focuses on signal filtering. To reduce the filtering error induced by convolution filtering, we propose an extrapolation-based filtering strategy and apply a steady-state Kalman predictor for extrapolation. Two criteria with adaptive penalty terms for order determination are developed based on the estimation of the residual filtering error.

Problem Statement of Signal Filtering
FIR filters are typical linear time-invariant (LTI) systems. According to the linear system theory, the filter can be mathematically expressed as the convolution of its impulse response with the input. Suppose that tx n u is an input sequence and th n u is the impulse response of a causal FIR filter; the filtered sequence ty n u can be derived as [11]: where˚denotes the convolution operator and N f is the filter length (i.e., the length of the impulse response; the relationship between N f and the filter order N o can be written as N f " N o`1 ). Alternatively, taking discrete-time Fourier transforms (DTFT), we can represent Equation (1) in the frequency domain as: Y`e jω˘" H`e jω˘X`ejω˘ ( 2) In addition, the filtered sequence length L, the input sequence length N and the filter length Theoretically, given a large enough stop-band attenuation, spectral overlap can be thoroughly suppressed. Moreover, the spectral estimation error in sub-bands can be neglected, as long as the width of the transition band and the ripple of the passband are sufficiently small. Nonetheless, the pursuit of excellent filtering performance substantially increases both the filter order and the length of the filtered sequence (refer to Equation (3)). Such a high order is more likely to create error in part or even all of the filtered samples. This result is contrary to our original objective, and the resultant filtering quality is undesirable.
From the perspective of a discrete-time system, the output sequence of the convolution operation is equivalent to the zero-state response of the filter system, because the initial state of every delay cell is zero prior to the excitation of the input sequence. We take the example of the direct-type FIR system [24]. The value of the output sample at any time depends on all or part of the input samples and the system state at that time. The first N f´1 output samples suffer from biases, because a part of the delay cells do not yet become input-driven states; analogously, the last N f´1 output samples are invalid, because a part of the delay cells restore the initial zero-states. Thus, the length of the valid part of the output sequence, defined as L v , satisfies: Actually, if we rewrite Equation (1) in the following matrix form: then we can find that the matrix X possesses many zero elements, which probably makes the outputs y 0 , y 1 , . . . , y No´1 ; y L´No , y L´No`1 , . . . , y L´1 not ideal. For example, y 0 " x 0 h 0 , while the ideal output should beỹ 0 " x 0 h 0`x´1 h 1`x´2 h 2`¨¨¨`x´No h No . This means that the unknown samples x´N o , x´N o`1 , ..., x´1 are assumed to be zeros. The filtering error of y 0 is y 0´ỹ0 " Likewise, the outputs y 1 , y 2 , ..., y No´1 ; y L´No , y L´No`1 , ..., y L´1 all suffer from errors under the zero assumption. Thus, we can conclude that the meaningless zeros are the error sources of the filtering. Referring to Equation (4), we note that, if the filter order is not less than the length of the input sequence, the output samples are all invalid. Thus, improving filtering performance by means of unlimited increasing filter order is meaningless.
In the next subsection, we will identify an efficient way to resolve this problem.

Filtering Procedure Based on Signal Extrapolation
The desired output of the filtering process should have two characteristics: • The original and filtered sequences should be of equal length; • During the filtering process, the states of the delay cells in the filter system should always maintain input-driven states, i.e., there are no artificial zeros, but authentic samples in X.
As shown in Equation (5), the convolution filtering assumes the unknown samples ., x N`No´1 to be zeros, which leads to the filtering error. Thus, an intuitive thought is to extrapolate the sequence tx n u pN´1q pn"0q along two sides to provide a series of estimates for the unknown samples. Taking place of the zeros in the matrix X with these estimates can mitigate the filtering error. The input sequence is extrapolated along both sides, yielding two extrapolated sequences, called Part A and Part B (see Figure 1). Suppose that L A and L B are the lengths of Part A and Part B, respectively; then, those L A`LB extrapolated samples are used to replace zeros in X. According to Equation (3), the length of the associated output sequence is L A`LB`N`No . From Equation (4), the effective length of the output can be given by L A`LB`N´No . To satisfy the requirement that the original and filtered sequence are of equal length, the extrapolated length can be derived as:  Now, we can conclude that the extrapolated length should be equal to the filter order. We define L G as the constant group delay of the filter. Between time N o and time N`N o , the output samples are valid. The output sample at time N o`n pn " 1, 2,¨¨¨, Nq corresponds to the input sample at time N o´LG`n pn " 1, 2,¨¨¨, Nq, because of the group delay. Consequently, the input sample before time N o´LG is merely used as a training sequence of the system state. Thus, we can obtain the relationships: Letx n andŷ n be the extrapolated sequence and associated filtered result, respectively. Then, they satisfy: The filtering process can be rewritten in matrix form as: where:ŷ and:X

Signal Extrapolation Based on AR Identification and Kalman Prediction
According to the linear prediction theory [25], the AR model is an all-pole model, whose output variable only linearly depends on its own previous values, that is, where q´1 denotes the unit delay, p is the model order, φ 0 , φ 1 , . . ., φ p denote the coefficients of the model and φ 0 " 1. The sequence tε n u 8 n"´8 is a white noise process, which satisfies: where E p¨q denotes the expectation operator.
We choose the forward-backward approach [1] as the coefficient estimator for AR model, for its precision and robustness. Both criteria, including the Akaike information criterion (AIC) and Bayesian information criterion (BIC) [26] can be applied to determine the model order; whereas both criteria sometimes suffer from overfitting. An alternative method of order determination will be discussed in Section 2.4.
A linearly-optimal prediction for AR sequences is derived in [21][22][23] under the minimum mean square error (MMSE) criterion. However, the prediction formula involves a polynomial long division and a coefficient polynomial recursion [23], making the calculation of the prediction somewhat inconvenient. Alternatively, the following steady-state Kalman predictor [27] provides an equivalent prediction with the MMSE predictor, while offering a simpler formula to facilitate the computation.
The AR model is regarded as a dynamic system. A specific state-space representation for a univariate AR(p) process can be written as [25]: where: and: H " The coefficient polynomials of x n and ε n are Φ pq´1q and one, respectively. Since they are relative prime polynomials (or coprime), i.e., the transfer function is irreducible, the system of the AR model is a joint controllable and observable discrete linear stochastic system [28]. Thus, there exists a steady-state Kalman predictor: #ξ n`1|n " Fξ n|n´1`K e n x n " Hξ n|n´1`en (19) Since both ε n and e n are the innovation processes of x n , they are equal [27]: By comparing Equation (15) with Equation (19), we have: Therefore, the one-step steady-state Kalman predictor can be derived as [28]: Similarly, the k-step steady-state Kalman predictor can be presented as: #ξ n`k|n " Fξ n`k´1|n´1`F k´1 Γε n x n`k|n " Hξ n`k|n (23) Here, define: Then, we can obtain: and that is,x Analogously, the k-step backward extrapolation formula can be given by: where the superscript "˚" denotes the complex conjugate operator. To guarantee reasonable and effective extrapolations, the step-size k should satisfy: In order to evaluate the residual filtering error of the proposed filtering strategy, we derive the mean square error (MSE) in Appendix A1.

Adaptive Information Criteria for AR Order Determination
Given the impulse response of an analysis filter and AR coefficients, we can directly calculate MSE by Equations (A2) and (A10). The precision of AR coefficient estimation is concerned with AR order. Consequently, the filtering error at different AR orders can be evaluated with the preceding formulas; conversely, the calculation of MSE can be used for order determination.
AIC and BIC are two common information criteria, whose purpose is to find a model with sufficient goodness of fit and a minimum number of free parameters. In terms of the maximum likelihood estimatê σ 2 p , we can denote AIC and BIC as [26]: As explained in [29], due to the lack of samples, both criteria encounter the risk of overfitting, where the selected order will be larger than the truth order. In particular, AIC has the nonzero overfitting probability as the sample number tends to infinity. Theoretically, both criteria consist of two terms: the first term involves MSE, and it decreases with the increment of the order p; the other term is a penalty that is an increasing function of p. The preferred model order is the one with the lowest AIC or BIC value. As shown in Figure 2a, the objective function curve Č S 1 P 1 E 1 reaches its minimum value at the point P 1 , which gives the correct order p. However, sometimes, both criteria may fail to determine available orders, and those failures are often related to inadequate penalties. Figure 2b illustrates a representative case. Since the change of the objective function instantly slows down as the order exceeds p, the point P 2 is the preferred point for order determination. However, the penalty strength is insufficient, so that the objective function is still falling after P 2 . To handle this situation, we propose an adaptive mechanism to adaptively adjust the penalty strength. A geometric interpretation is depicted in Figure 2b. We assume that the order interval for computation consists of the correct order. Then, the ray ÝÝÑ S 2 E 2 forms the X 2 axis, while the ray Ý ÝÝ Ñ O 2 Y 2 forms the Y 2 axis perpendicular to the ray ÝÝÑ S 2 E 2 throughout the intersection O 2 of the ray ÝÝÑ S 2 E 2 and the objective function axis. Under the new coordination system X 2 O 2 Y 2 , the minimum point P 2 of the curve Č S 2 P 2 E 2 can help to determine the correct order. Meanwhile, this modification has no impact on the case that the criterion works well (see Figure 2a). Figure 2. Geometric interpretation for adaptive Akaike information criterion (AAIC) and adaptive Bayesian information criterion (ABIC): the solid curves Č S 1 P 1 E 1 and Č S 2 P 2 E 2 draw objective function values for AIC or BIC. The preferred orders are located at the point P 1 and P 2 , respectively. (a) The case that the criterion (AIC or BIC) successfully determines the correct order, and (b) the case that the criterion fails due to the inadequate penalty strength. Under the new coordination system X 2 O 2 Y 2 , the point P 2 becomes the minimum point of the curve, and the correct order is retrieved. Therefore, the adaptive AIC (AAIC) based on MSE of the residual filtering error can be derived as: where p s and p e denote the start point and the end point of the computing order interval, respectively. If p s " 1, the adaptive parameter α can be given by: Analogously, the adaptive BIC (ABIC) can be represented as:

Implementation of SDSE
In this section, we discuss the implementation details of SDSE based on the proposed filtering strategy. In particular, equiripple FIR filters are used as the analysis filters for their advantageous features. To suppress spectral overlap and improve spectral precision in practice, we introduce a mosaicking operation for sub-band spectra and discuss the compensation of the residual error of the composite spectrum. After that, we summarize the entire algorithm and analyze the computational complexity.

Properties and Design of Equiripple FIR Filters
Besides the advantages of FIR filters, i.e., exact linear phase response and inherent stabilization, equiripple FIR filters have an explicitly specified transition width and passband/stop-band ripples (see Figure 3). As analysis filters, equiripple FIR filters can bring some important benefits, such as stop-band attenuation with a fixed maximum, the explicitly specified width of the invalid part of the sub-band spectrum (which corresponds to the transition-band spectrum) and a limited maximum deviation of the valid part of the sub-band spectrum (which corresponds to the passband spectrum). As shown in Figure 3, the specifications of a typical equiripple FIR filter consist of the passband edge ω p , stop-band edge ω s and maximum error in passband and stop-band δ p , δ s , respectively. The approximate relationship between the optimal filter length and other parameters developed by Kaiser [11] is: where ∆f denotes the width of the transition-band, The maximum passband variation and the minimum stop-band attenuation in decibels are defined as: and: A s "´20log 10 pδ s q dB respectively. When the specification of a filter is explicitly specified, we can complete the design with the Parks-McClellan (PM) algorithm [30], since it is optimal with respect to the Chebyshev norm and results in about 5 dB more attenuation than the windowed design algorithm [11].

Practical Consideration of Equiripple FIR Filters
Firstly, the equiripple low-pass FIR filter is combined with a preprocessing step-complex frequency modulation-to form a passband filter for sub-band decomposition (see Figure 4).  The magnitude response of the analysis filter is shown in Figure 5, where ω H and ω L denote the high and the low edge of the stop-band, respectively. They satisfy: As long as A s is large enough and the downsample rate M meets the condition: frequency aliasing can be practically suppressed. Secondly, due to the existence of the transition-band of each analysis filter, each sub-band spectrum contains two invalid parts. The spectral estimations of these invalid parts lead to errors. Consequently, according to [31], when mosaicking these sub-band spectral estimations into full-band, we should omit these invalid parts of spectral estimations. This procedure is illustrated in Figure 6. Thus, the composite full-band spectral estimation is practically immune to the spectral overlap. Thirdly, due to the existence of passband ripples in equiripple FIR filters, there theoretically remains a small error in sub-band spectral estimations. Generally, by adjusting the maximum passband variation, we can limit the error to an allowable range. More precise spectral estimation necessitates compensation for the residual error. Since the ripple curve for any given equiripple FIR filter can be accurately measured, the compensation can be performed by weighting sub-band spectral estimations with the measured ripple curve.
Finally, we focus on selecting appropriate filter parameters in SDSE, which can improve the performance and reduce computational cost. The filter order should at least meet: The maximum stop-band attenuation should exceed the dynamic range of the signal to be analyzed. Once the aforementioned conditions are satisfied, the shortest transition-width can be chosen by Equation (34). Moreover, specific requirement will help to set the maximum passband variation.

Computational Complexity of SDSE
Algorithm 1 Non-overlapping sub-band spectral estimation with the steady-state Kalman predictor-based filtering strategy.
Input: The sequence tx n u N´1 n"0 . Parameters: The maximum passband variation A p and the minimum stop-band attenuation A s in decibels; the sub-band number M. end for Select an order p by Equation (31) or Equation (33), with O pNq flops.

Sequence Extrapolation:
Set the step-size k by Equation (28); Calculate tg k l u p´1 l"0 by Equations (16), (17) and (24), with O´p2pq k´1¯fl ops; Implement forward and backward extrapolations by Equations (26) and (27), and obtain tx n u L G`N´1 n"L G´No , with OpN o pq flops.

Sub-Band Spectral Estimation:
Set a rational factor M 0 " "" π ωs ıı , where rrss denotes a rational approximation; Compute ω H and ω L by Equation (38) and ω H`ωL " p2i´1q π{M ; Perform pre-modulation and filtering for tx n u L G`N´1 n"L G´No by Figure 4 and Equation (10) and omit the overlapped parts at both the left and the right side of the sub-band spectrum.

end for
Mosaic the residual sub-band spectrums into an entire spectrum.
Output: The entire spectrum.
As shown in Algorithm 1, we summarize SDSE with the proposed filtering strategy and give the computational complexity of the major steps. First, the proposed strategy can greatly reduce the computational burden. We take the commonly-used amplitude and phase estimation (APES) [32] algorithm as an example. The full-band APES needs O pN 2 log Nq flops [33], while the computation U¯˙fl ops by SDSE with the proposed strategy.
Second, except the sub-band spectral estimation, the main computation requirement is induced by the AR identification and the order selection. The computational complexity of this step is generally much lower than that of the sub-band spectral estimation. In particular, if a proper order or a small enough order interval is preselected before the AR identification, the computation of this step can be negligible.

Simulations and Analysis
In this section, both the feasibility and the effectiveness of the proposed strategy are evaluated by typical numerical simulations, including FIR filtering and line spectral analysis of 1D or 2D sequences.

Filtering Analysis Using the Proposed Strategy
Suppose that the input sequence tx n u is a mixed complex exponential sequence: where the measurement noise tυ n u is a complex Gaussian process. All real parts and imaginary parts of tυ n u are independent and identically distributed (i.i.d.) zero-mean Gaussian distributions with variance σ 2 , i.e., Re pυ n q , Im pυ n q " N p0, σ 2 q. Our purpose is to non-distortedly extract the weak component s p1q n from x n or completely eliminate the strong component s p2q n . The equiripple half-band low-pass FIR filter is chosen for the extraction. The specifications of the filter are: The length of the designed filter based on the given specifications is 119. As shown in Figure 7a, the decreasing trend of the estimated residual error by the proposed strategy is consistent with the real error. When the order exceeds 57, the decrease of the estimated filtering error instantly slows down. Hence, the preferred order is 57. By comparison, due to the deficiency of the penalty strength, none of AIC and BIC can provide the right order; whereas, based on the adaptive penalty terms, both AAIC and ABIC get the right order 57 (see Figure 7b).
As shown in Figure 8b, the weak component s p1q n is completely covered by the sidelobe of the out-of-band strong component s p2q n ; thus, recognizing the existence of the weak component from the mixed spectrum is completely impossible. From the view of the magnitude response (see Figure 8a), the filter has the nominal ability of eliminating the interference of the out-of-band strong components for the in-band weak component. Due to the existence of the convolution filtering error, we still cannot find out the weak component from the convolution spectrum, as shown in Figure 8b. By contrast, once the samples contaminated by the filtering error are omitted by Equation (4) from the filtered sequence, the weak component reappears in the spectrum of the remaining samples (refer to the truncated spectrum in Figure 8b). However, the truncated spectrum has a much wider main lobe than the original spectrum, which means the spectral resolution suffers from a severe decrease. In order to simultaneously maintain the resolution and filter out the interference, we apply the proposed filtering strategy to handle the case. As shown in Figure 8c, based on the proposed strategy, the restored spectrum for the noiseless sequence closely coincides with the truth weak spectrum in shape, especially retaining the spectral resolution. In addition, even when the signal-to-noise (SNR) of s p1q n is low to´3 dB (when σ 2 " 1), the recovery is still effective (see the magnified details of Figure 8c).

Line Spectral Analysis Using 1-D Signals
A complex exponential model can be mathematically represented as: where: 15`0.05iq π and: n " 0, 1,¨¨¨, N´1; N " 128 tυ n u is a real-value sequence of i.i.d. zero-mean Gaussian random variables with variance σ 2 " 1.5811, i.e., υ n " N p0, σ 2 q. φ k , φì and φí are i.i.d. uniform random variables on the interval from zero to 2π, i.e., φ k , φ In this case, we can get each component's SNR of s p1q n : We decompose the mixed-signal x n into four sub-bands using the proposed method with the filter parameter set as: The sub-band, whose radian frequency is within r´0.125π ,`0.125πq, is used for frequency estimation. Furthermore, we estimate the frequencies of complex sinusoids of s p1q n that are contained in both mixed-signal x n and the decomposed sub-band signal, via MUSIC, ESPRIT [34,35] and SELF-SVD [15] algorithms (see Table 1). As shown in Table 1, we analyze the performance based on the Monte Carlo method. Compared with ESPRIT, SELF-SVD in full-band spectral estimation suffers from obvious performance degradations or even failures. Although SELF-SVD can theoretically attenuate the out-of-band components for the in-band frequency estimations, the ability of attenuation is not always sufficient, especially when the power of the out-of-band components are much stronger than that of the in-band components or the SNR is relatively low. Instead of performing the SVD method in the entire frequency domain as ESPRIT, SELF-SVD just performs it in the frequency interval of interest. Obviously, the remaining out-of-band interferences will be treated as in-band components, so that the frequency estimation with SELF-SVD sometimes fails. In the experiment, the power ratio of the out-of-band components to the in-band components ω 2 and ω 4 is up to 10,000 times. As a result, the corresponding frequency estimation with SELF-SVD fails to work. When we eliminate the out-of-band interferences with our method, the estimation of SELF-SVD for the residual signal exhibits similar performance as ESPRIT. In addition, MSEs of MUSIC and ESPRIT indicate that the frequency estimation in the sub-band is much more accurate than that in the full-band. Table 1. Comparison of frequency estimation in full-band and sub-bands.

Means and MSEs
Full-band Sub-band of Estimating Frequencies MUSIC ESPRIT SELF-SVD MUSIC ESPRIT SELF-SVD

Line Spectral Analysis Using 2D signals
Let C k pk " 1, 2,¨¨¨, Kq be a series of random integers with unique values generated from a uniform discrete distribution on r1, 1024ˆ1024s. We define two sets of nonnegative integers as: where t¨u rounds a number to the nearest integer toward zero, and mod p¨q is the modulo operator. The 2D signal model can be expressed as: x n 1 ,n 2 " s p1q n 1 ,n 2`s p2q n 1 ,n 2`υ n 1 ,n 2 s p1q n 1 ,n 2 " where: n 1 " 0, 1,¨¨¨, N 1´1 ; n 2 " 0, 1,¨¨¨, N 2´1 and: N 1 " N 2 " 256, K " 8, 192 tυ n 1 ,n 2 u is a real-value sequence following υ n 1 ,n 2 " N p0, σ 2 q with σ 2 " 0.005. φ k , φ p1q are uniform random variables on the interval from zero to 2π, i.e., φ k , φ p1q The spectrum of this 2D sequence is shown in Figure 9. Since the magnitude of s p2q n 1 ,n 2 is 60 dB greater than that of s p1q n 1 ,n 2 , the sidelobe of the former significantly affects the spectral estimation of the latter.
This affect is especially more severe for the components around s p1q n 1 ,n 2 . The region inside the red pane is used to verify the performance of the proposed method. Figure 9. Actual magnitude spectrum of the 2D signal (the black dot corresponds to s p1q n 1 ,n 2 : 0 dB; the rounded blue spot ‚ corresponds to s p2q n 1 ,n 2 : 60 dB; the red pane covers the region to be analyzed).
The parameters of the analysis filter are selected as: The comparison of Figure 10a and 10c indicates that the Fourier spectrum of s p1q n 1 ,n 2 is severely affected by s p2q n 1 ,n 2 . By contrast, the result shown in Figure 10b seems to be almost exactly the same as the desired result shown in Figure 10c. This decomposition result verifies the effectiveness of the proposed method.
To further testify the performance of our method, we select the APES [32] and the iterative adaptive approach (IAA) [36,37] for spectral estimation. Since the ideal frequency domain filters suffer from energy leakage and/or frequency aliasing problems, the APES result shown in Figure 10c is somewhat blurred. By contrast, the APES result of the decomposed sub-band based on the proposed strategy (see Figure 10d) is quite similar to the actual spectrum (see Figure 10e). Theoretically, the IAA is superior to the APES. However, as shown in Figure 10g, it is even more likely than the APES to suffer from out-of-band interferences. From the view of the sub-band IAA spectrum (see Figure 10h), most of the interferences are eliminated, while the remaining filtering error still has impacts on the spectrum. Thus, the spectral estimation experiment reveals that the sub-band decomposition based on the proposed method can provide relatively ideal performance; whereas the developed method for extrapolation is imperfect, so it can affect the performance of the IAA algorithm.
In addition, a simulated single-polarized SAR image of an airplane based on the physical and optical model is processed via the APES. The computation time of full-band APES (refer to Figure 11a) is 26.85 h, while the time of sub-band APES (refer to Figure 11b) Figure 10. Sub-band decomposition and spectral estimation within the analyzed region: (a) the Fourier spectrum of x n 1 ,n 2 ; (b) the Fourier spectrum of decomposed sub-band signal based on our method; (c) the Fourier spectrum of s p1q n 1 ,n 2 ; (d) the amplitude and phase estimation (APES) result of (a) corresponding to the ideal frequency domain filters-based sub-band decomposition; (e) the APES result of (b); (f) the actual spectrum; (g) the iterative adaptive approach (IAA) result of (a); (h) the IAA result of (b).

Conclusion
This paper has investigated the problem of suppressing spectral overlap in sub-band spectral estimation. The spectral overlap phenomenon is originated from the non-ideal behavior of the analysis filtering, i.e., the filtering error. The error formation in convolution filtering was therefore discussed, based on which an extrapolation-based filtering strategy was proposed to greatly suppress spectral overlap. Several classical theories, including AR identification, Kalman prediction and the equiripple FIR filtering technique, are integrated into the strategy for linearly-optimal extrapolation. To resolve the "overfitting" in order determination with AIC and BIC, we modified the penalty terms for both criteria. The improved criteria adaptively adjust the penalty strength and avoid "overfitting" to some extent. Both 1D and 2D complex exponential signals are utilized to validate the performance of the proposed method. Moreover, we employed SAR image formation for a single-polarized SAR data, simulated based on electromagnetic theory, to testify the efficiency of our method. Future research will focus on developing more sophisticated methods for the problem of extrapolation, with which we can avoid model order determination and further improve the extrapolation precision.

Author Contributions
Zenghui Li contributed to the original idea, algorithm design and paper writing. Bin Xu contributed in part to data processing. Jian Yang and Jianshe Song supervised the work and helped with editing the paper.