An Improved Blind Zone Channelization Structure and Rapid Implementation Method

The paper proposes an enhanced design for broadband digital receivers that aims to improve signal capture probability, real-time performance, and the hardware development cycle. To overcome the issue of false signals in the blind zone channelization structure, this paper introduces an improved joint-decision channelization structure that reduces channel ambiguity during signal reception. Xilinx’s high-level synthesis (HLS) tools are used for accelerated algorithm implementation, and techniques such as pipelining and loop parallelization are employed to reduce system latency. The entire system is implemented on FPGA. The simulation results demonstrate that the proposed solution effectively eliminates channel ambiguity, improves algorithm implementation speed, and meets the design requirements.


Introduction
Since the beginning of the 21st century, with the rapid development of modern electronic technology, electronic information warfare has become a major form of warfare between nations. In electronic warfare equipment, increasingly advanced technologies are being applied to jamming, anti-jamming, reconnaissance, and counter-reconnaissance, making the electromagnetic environment increasingly complex [1,2]. At the same time, the radio signals generated by these devices vary in energy and carrier frequency. To adapt to such a complex environment, electronic warfare receivers need a wide monitoring frequency range, real-time signal processing, a large dynamic range, and the ability to process signals that overlap in the time domain [3,4,5].
Traditional electronic warfare receivers mostly use analog technology design, including crystal video receivers, compression receivers, superheterodyne receivers, Bragg receivers, instantaneous frequency measurement (IFM) receivers [6], and channelized receivers. Crystal video receivers have the advantage of a simple structure but can only be used for signal detection and cannot measure signal frequency. Compression receivers use the Fourier transform to compress input signals of different frequencies into time-domain pulse signals, which has good data processing performance, but signal compression may lead to system detection errors. Superheterodyne receivers typically use multi-stage mixing to convert the input RF (radio frequency) signal to intermediate frequency (IF) for processing at a lower frequency, with a good reception dynamic range, but requiring the introduction of filters to eliminate local oscillator leakage, increasing the receiver cost and complexity. Bragg receivers convert input signals into optical signals through an optical Bragg cell [7], and use a small amount of hardware resources to realize a large number of channels, but have a small reception dynamic range and low sensitivity. IFM receivers mainly use the non-linear characteristics of crystal detectors to measure the frequency of input signals, which can cover a larger monitoring frequency range, with high frequency measurement accuracy, but poor frequency measurement accuracy for time-domain overlapping signals, making it difficult to meet the actual electronic warfare requirements.
Micromachines 2023, 4, x FOR PEER REVIEW
The proposed method not only improves the discrimination of useful signals, but also simplifies implementation on FPGAs. Additionally, the paper employs a hybrid design approach that combines HLS design [19] with traditional methods to accelerate the implementation of the front-end algorithms, increasing the hardware implementation efficiency and design flexibility of the entire receiver system.

The Original Structure
For the digital channelization filter bank, the input signal s(n) is down-converted and then evenly divided into K subbands by K filters. The output is obtained after D-fold decimation, and the process satisfies $K = F \cdot D$ (Equation (1)). When F = 1, the process is critically decimated. The original structure of the channelized receiver is shown in Figure 1. The input signal is first mixed with a carrier to shift it to baseband; it then undergoes low-pass filtering and decimation before being passed to subsequent modules. In this structure, filtering takes place before decimation, so most of the filter outputs are computed only to be discarded, and each convolution must be completed within a single input sampling period. To address these issues, the positions of the filtering and decimation operations are swapped, so that each module in the system operates at the lower, decimated rate.
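The effect of swapping filtering and decimation can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes a real-valued FIR prototype `h`, zero initial conditions, and illustrative function names. It compares the filter-then-decimate form with a polyphase form in which D short branch filters each run at 1/D of the input rate.

```python
def filter_then_decimate(x, h, D):
    """Reference form: FIR-filter at the full input rate, then keep every D-th output."""
    full = []
    for n in range(len(x)):
        acc = 0.0
        for i, c in enumerate(h):
            if n - i >= 0:
                acc += c * x[n - i]
        full.append(acc)
    return full[::D]  # D - 1 of every D computed outputs are discarded


def polyphase_decimate(x, h, D):
    """Decimate first: branch p filters x_p(m) = x(mD - p) with h_p(r) = h(rD + p)."""
    def x_p(p, m):
        idx = m * D - p
        return x[idx] if 0 <= idx < len(x) else 0.0

    n_out = (len(x) + D - 1) // D
    y = []
    for m in range(n_out):
        acc = 0.0
        for p in range(D):
            for r, c in enumerate(h[p::D]):  # coefficients of branch p
                acc += c * x_p(p, m - r)
        y.append(acc)
    return y


# Arbitrary test data (assumed values, for illustration only)
x = [((7 * n) % 11) - 5.0 for n in range(40)]
h = [0.05, 0.1, 0.15, 0.2, 0.2, 0.15, 0.1, 0.05]
a = filter_then_decimate(x, h, 4)
b = polyphase_decimate(x, h, 4)
print(max(abs(u - v) for u, v in zip(a, b)))
```

Both functions produce the same output samples up to floating-point rounding, but the polyphase form never computes the samples that decimation would discard.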

Subchannelization
Multi-rate signal processing with the polyphase filtering technique plays a crucial role in simplifying signal rate conversion and reducing system design complexity. The key step in this technique is subchannelization, in which the target monitoring frequency band is divided according to certain rules; different partitioning methods lead to different system complexities, computational loads, and resource consumption. In this design, the filter bank adopts a 50% overlapping partitioning method [20], as shown in Figure 2, which joins the passbands of adjacent subchannels, eliminating blind spots across the entire monitoring bandwidth and achieving full probability of intercept.

Improved Non-Blind Channelization Structure
The DFT-based polyphase channelization algorithm is derived from the channelization structure of the low-pass filter bank. Given the input signal $s(n)$, the output $f_k(m)$ of the $k$th subchannel is defined by Equation (2). Letting $s_p(m) = s(mD - p)$ and $h_p(m) = h(mD + p)$, Equation (2) simplifies to Equation (3). Part of the relationship in Equation (3) is then rewritten as Equation (4). The center frequency of each subchannel in the filter bank under the even-indexed arrangement is given by Equation (5). Substituting Equation (5) into Equation (4) yields Equation (6), and substituting Equation (6) into Equation (3) yields Equation (7).
In Equation (7), DFT denotes the discrete Fourier transform, which can be replaced by the fast Fourier transform (FFT) in hardware implementation to reduce hardware resource consumption and improve system operating speed. The improved structure is shown in Figure 3. It eliminates the multiplier factors used in traditional structures, which simplifies the data path and improves resource utilization.
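The equivalence between per-channel down-conversion and the polyphase structure can be illustrated with a small numerical sketch. It rests on assumptions that are not taken from the paper: critical decimation (K = D), channel centers at 2πk/K, a down-conversion factor exp(-jω_k n), and hypothetical function names. Under this sign convention the sum across the branches uses a positive exponent (an unnormalized inverse-DFT kernel); the opposite mixing sign would turn it into a forward DFT.

```python
import cmath


def direct_channelizer(s, h, K):
    """Reference: for each channel, mix to baseband, low-pass filter, decimate by K."""
    n_out = (len(s) + K - 1) // K
    out = []
    for k in range(K):
        w = 2 * cmath.pi * k / K
        mixed = [s[n] * cmath.exp(-1j * w * n) for n in range(len(s))]
        ch = []
        for m in range(n_out):
            n = m * K
            ch.append(sum(c * mixed[n - i] for i, c in enumerate(h) if n - i >= 0))
        out.append(ch)
    return out


def polyphase_dft_channelizer(s, h, K):
    """Polyphase form: K branch filters at the decimated rate, then a K-point transform."""
    def s_p(p, m):  # s_p(m) = s(mD - p), zero outside the record
        idx = m * K - p
        return s[idx] if 0 <= idx < len(s) else 0j

    n_out = (len(s) + K - 1) // K
    out = [[0j] * n_out for _ in range(K)]
    for m in range(n_out):
        # Branch outputs: convolution of s_p with h_p(r) = h(rD + p)
        v = [sum(c * s_p(p, m - r) for r, c in enumerate(h[p::K])) for p in range(K)]
        for k in range(K):
            out[k][m] = sum(v[p] * cmath.exp(2j * cmath.pi * k * p / K) for p in range(K))
    return out


# Assumed test case: a complex tone at the center of channel 3, 16-tap boxcar prototype
K = 8
s = [cmath.exp(2j * cmath.pi * 3 * n / K) for n in range(64)]
h = [1.0 / 16] * 16
direct = direct_channelizer(s, h, K)
poly = polyphase_dft_channelizer(s, h, K)
print([round(abs(poly[k][-1]), 6) for k in range(K)])
```

The two forms agree sample by sample, and once the filter transient has passed the tone appears only in channel 3.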

Improved Channel Decision Module
The proposed joint-decision process for eliminating channel ambiguity in the channelized receiver consists of "Instantaneous Feature Extraction + Auto-Correlation Threshold Amplitude Detection + Phase Differential Instantaneous Frequency Measurement". In the instantaneous feature extraction step, the coordinate rotation digital computer (CORDIC) algorithm, operating in vectoring mode, extracts the instantaneous amplitude and phase of the channelized output signal. Traditional amplitude detection methods compare the signal amplitude with a fixed threshold to determine whether a signal is present in the channel [21]. In this study, the auto-correlation amplitude detection algorithm is used instead: it compares the signal amplitude with a dynamically changing threshold that depends on the amplitudes of multiple points, which gives more accurate detection results [22]. The phase differential instantaneous frequency measurement uses the extracted instantaneous phase to measure the instantaneous frequency of the signal; the correct channel output is then determined from the relationship between the measured frequency and the channel center frequency, eliminating channel ambiguity. Figure 4 illustrates the joint-decision process, which ultimately yields the correct channel output.
The CORDIC algorithm calculates the amplitude and phase of the output signal through iterative rotations [23]. The solving process is given by Equation (8), in which $I_k(n)$ and $Q_k(n)$ represent the real and imaginary parts of the output of the $k$th subchannel, and $\alpha_k(n)$, $\phi_k(n)$, and $f_k(n)$ represent the instantaneous amplitude, phase, and frequency of the signal, respectively.
The threshold for adaptive detection is set from the amplitude of the output signal according to Equation (9), where $\beta$ is the threshold coefficient, $A_k[n]$ is the amplitude of the output signal of the $k$th subchannel, $D$ is the decimation factor under critical decimation, $V_T[n]$ is the adaptive detection threshold, and $\gamma$ is the noise floor introduced by the receiver. Because the amplitude of the channelized output varies with the input signal, the threshold determined by Equation (9) is also dynamic.
The instantaneous frequency of the output signal is obtained using the phase-difference instantaneous frequency measurement method, in which the relationship between instantaneous frequency and instantaneous phase is given by Equation (10), where $T$ is the sampling period of the signal. Because this method is sensitive to noise, the four-point phase-difference averaging method is used in this design to reduce noise-induced measurement errors.
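As an illustration of the iterative rotations, the following floating-point sketch implements CORDIC in vectoring mode. The iteration count of 16, the quadrant pre-rotation, and all names are assumptions made for this example; a hardware version would use fixed-point shifts and adds and a stored angle table.

```python
import math

N_ITER = 16  # assumed iteration count
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]  # elementary rotation angles
GAIN = 1.0
for i in range(N_ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))  # accumulated CORDIC gain


def cordic_vectoring(i_k, q_k):
    """Rotate (I, Q) onto the positive x-axis; return (amplitude, phase in radians)."""
    x, y, z = float(i_k), float(q_k), 0.0
    if x < 0:  # pre-rotate by 180 degrees so the vector lies in the right half-plane
        z = math.pi if y >= 0 else -math.pi
        x, y = -x, -y
    for i in range(N_ITER):
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        if y >= 0:
            x, y, z = x + y * shift, y - x * shift, z + ANGLES[i]
        else:
            x, y, z = x - y * shift, y + x * shift, z - ANGLES[i]
    return x / GAIN, z


amp, phase = cordic_vectoring(3.0, 4.0)
print(amp, phase)
```

The returned values approximate `math.hypot(I, Q)` and `math.atan2(Q, I)`; the phase error is bounded by the last elementary angle, atan(2^-15) for 16 iterations.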
The entire decision-making process is as follows:
1. Set the detection threshold $V_T[n]$.
2. Use the signal amplitude calculated by the CORDIC module as input. To reduce the impact of noise, a signal is considered present in the channel only if the amplitude $A_k[n]$ exceeds the detection threshold for five consecutive samples. The process then moves on to the channel decision.
3. Finally, perform phase-difference instantaneous frequency estimation for the channel. The value $\Delta f_k(n)$ is the frequency deviation of the signal from the center of the channel. If $|\Delta f_k(n)|$ is less than half of the channel bandwidth $B$, the signal is considered to lie within the current channel; otherwise, it is considered a false signal.
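The three steps above can be sketched as a small decision routine. This is an illustrative sketch only: the threshold sequence is passed in as data because it depends on Equation (9), the frequency deviation is assumed to have been measured already, and the function names and return labels are hypothetical.

```python
def signal_present(amplitudes, thresholds, n_consecutive=5):
    """Step 2: declare a signal only after n_consecutive samples exceed the threshold."""
    run = 0
    for a, v in zip(amplitudes, thresholds):
        run = run + 1 if a > v else 0  # any sample below the threshold resets the count
        if run >= n_consecutive:
            return True
    return False


def channel_decision(amplitudes, thresholds, freq_deviation, bandwidth):
    """Steps 2 and 3: amplitude detection, then the |delta f| < B/2 test."""
    if not signal_present(amplitudes, thresholds):
        return "empty"
    if abs(freq_deviation) < bandwidth / 2:
        return "signal"
    return "false signal"


# Assumed example values: threshold 0.5, channel bandwidth 5 MHz
amps = [0.1, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1]
thr = [0.5] * len(amps)
print(channel_decision(amps, thr, 1.0e6, 5.0e6))
print(channel_decision(amps, thr, 4.0e6, 5.0e6))
```

With 50% overlapping subchannels, a signal near a channel edge can exceed the threshold in two adjacent channels; the frequency test keeps only the channel whose center is within B/2 of the measured frequency.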

Receiver System Simulation
Simulation Conditions: In MATLAB, the input signal is the sum of the four signals listed in Table 1. The input signal-to-noise ratio (SNR) is 10 dB, and 4096 samples are used. The number of channels is 16. The prototype filter has an order of 192 and a stopband attenuation of 60 dB. The frequency range of each subchannel is shown in Table 2. The channelized output results are shown in Figure 5. The simulation tested the ability of the channelized receiver to receive different input signals. The parameters of the four input signals were chosen at four typical frequencies so that each signal falls in a different channel. Figure 5 shows that the receiver model correctly channelized and received the superposition of the four input signals.

Hardware Implementation
The hardware implementation of the wideband digital channelized receiver centers on large-scale, high-speed digital signal processing. The hardware platform is based on a Zynq UltraScale+ RFSoC device, specifically the ZU27DR, which integrates high-speed ADC and FPGA sections. The improved scheme of the entire receiver system is shown in Figure 8.

Data Extraction and Routing Module
In the HLS platform, when implementing the data extraction and routing module, input data can be stored in a multi-dimensional array that follows the storage format of the data stream. During the transformation, the input data must be rearranged by columns for parallel computation. Optimization directives can be added during the C Synthesis stage to reduce system latency [24]. The four loops in the data extraction and routing module have fixed bounds, and because the input signal is divided into I and Q paths, there is no dependency between the real-part and imaginary-part computations. The loops can therefore be optimized using the "Pipeline + Rewind" approach, as shown in Figure 9, to improve the parallel processing capability of the hardware.
Figure 9. "Pipeline + Rewind" optimization approach.

Polyphase Filtering Module
In HLS, the prototype filter coefficients of the polyphase filtering module are stored in an array of size (16, (filter order/16)). Polyphase filtering is implemented much like time-domain convolution; the only difference is that the FIR filtering requires zero padding at the end of the input sequence and uses a circular right shift to align the first element of the input sequence with the first coefficient. Each right shift is followed by a multiplication, and the products are stored in registers. The accumulated sum of the products is the output of the FIR filter. The polyphase filtering module consists of 16 parallel channels that share the same structure and differ only in their coefficients and input data; the implementation of one such subchannel is shown in Figure 10.

The Fully Parallel FFT Module
Because the polyphase filtering structure delivers its output data at a high rate, the FFT stage requires a module that can process its inputs in parallel [25]. However, the FFT IP core provided by Xilinx requires at least 16 clock cycles to complete a 16-point DFT computation [26], which does not meet the design requirements. In this paper, a parallel FFT structure based on the radix-2 decimation-in-time (DIT) method is designed using a pipelined approach to compute the 16-point DFT. The core of the FFT algorithm is the butterfly operation: the 16-point DFT is divided into two 8-point DFTs according to the parity of the index n of the input sequence x(n), each 8-point DFT is further divided into two 4-point DFTs, and so on, until the 16-point DFT is reduced to a set of 2-point DFT computations [27,28]. The butterfly operation flow of the 16-point parallel FFT module is shown in Figure 11.
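The even/odd decomposition described above can be written as a short recursive sketch. It is a software model of the radix-2 DIT algorithm only, with illustrative function names, and does not represent the pipelined hardware structure; the result is checked against a direct DFT.

```python
import cmath


def fft_dit(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_dit(x[0::2])  # DFT of the even-indexed samples
    odd = fft_dit(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd term
        out[k] = even[k] + t            # butterfly, upper output
        out[k + n // 2] = even[k] - t   # butterfly, lower output
    return out


def dft(x):
    """Direct O(N^2) DFT used as a reference."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]


# Assumed 16-point test vector
x16 = [complex(n % 5, (3 * n) % 7) for n in range(16)]
X_fft = fft_dit(x16)
X_ref = dft(x16)
print(max(abs(a - b) for a, b in zip(X_fft, X_ref)))
```

For 16 points the recursion unrolls into four stages of eight butterflies each, which is what a fully parallel structure lays out in hardware with registers between the stages.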

Channel Decision Module
The CORDIC algorithm has three hardware implementation architectures: serial, parallel, and parallel pipeline. All three are built from the same basic processing unit but differ in how the shift amounts and the stored rotation angles are handled. In this paper, the parallel pipeline architecture is adopted, at the cost of significantly more hardware resources than the serial architecture.
Adding pipeline registers between iterations effectively improves the processing speed of the system, reducing the critical path from N processing units in the parallel architecture to one processing unit in the pipeline architecture.
After the phase of the complex signal is obtained from the CORDIC module, the instantaneous frequency of the signal is calculated using the phase-difference-based frequency estimation algorithm, as shown in Figure 12.
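A minimal sketch of phase-difference frequency estimation follows. It assumes that the instantaneous frequency equals the phase increment divided by 2πT, and it interprets four-point averaging as the mean of four consecutive wrapped phase differences; that interpretation, like the function names, is an assumption made for illustration.

```python
import math


def wrap(d):
    """Wrap a phase difference into [-pi, pi)."""
    return (d + math.pi) % (2 * math.pi) - math.pi


def instantaneous_frequency(phases, T, n_avg=4):
    """Average n_avg consecutive phase differences and scale by 1 / (2*pi*T)."""
    diffs = [wrap(b - a) for a, b in zip(phases, phases[1:])]
    freqs = []
    for n in range(n_avg - 1, len(diffs)):
        mean_diff = sum(diffs[n - n_avg + 1:n + 1]) / n_avg
        freqs.append(mean_diff / (2 * math.pi * T))
    return freqs


# Assumed test signal: a 5 MHz complex tone sampled at 100 MHz
T = 1.0e-8
f0 = 5.0e6
theta = [2 * math.pi * f0 * n * T for n in range(32)]
phases = [math.atan2(math.sin(t), math.cos(t)) for t in theta]  # phases wrapped as from CORDIC
est = instantaneous_frequency(phases, T)
print(est[0])
```

Wrapping each difference before averaging keeps the estimate correct when the phase crosses ±π; the estimate is unambiguous only while the true phase increment per sample stays below π in magnitude.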

Simulation and Analysis
ChipScope, the integrated logic analyzer provided by Xilinx, was used to observe the signals, as described in [29]. In MATLAB, a sine wave signal with a carrier frequency of 935 MHz was generated and simulated. Since the input test signal is complex, one of the real and imaginary paths was selected for extraction and comparison; here, the I path was used. The first 10 columns of the extracted simulation results in MATLAB are shown in Figure 13. The data shown in Figure 13 are obtained after quantization. The timing simulation results of the extraction path module are shown in Figure 14. Comparison with the MATLAB simulation results confirms that the module functions correctly.
The MATLAB simulation result and the timing simulation result of the polyphase filter module are shown in Figures 15 and 16, respectively. The comparison shows that the HLS simulation results are correct, which completes the consistency verification between the polyphase filter module and the theoretical model.
The final channelized output is shown in Figure 17, where PFFT_OUT_3I is the real component of the third channel output and PFFT_OUT_3Q is the imaginary component. An IQ component output is present in the third channel, indicating that the result calculated by this module on the FPGA is correct.
In addition, a chirp signal with a center frequency of 875 MHz and a bandwidth of 10 MHz was generated for the simulation of instantaneous feature extraction. The simulation results, depicted in Figures 18 and 19, demonstrate the improved channel decision module's ability to accurately extract amplitude, phase, and frequency information from the received signal. By comparing the extracted signal information with the subchannel center frequencies defined in the receiver, false signals can be eliminated, resulting in a blind-zone-free reception in the receiver.

Conclusions
In this paper, an improved channel decision method has been proposed to address the issue of false signals in communication receivers, based on the theory of blind-spot-free reception in a polyphase filtering structure. The method combines instantaneous feature extraction, adaptive detection of signal amplitude, and phase-difference frequency measurement, turning the elimination of false signals into the extraction of amplitude, phase, and frequency information from the output signal. This approach improves the discrimination of useful signals and is easier to implement on an FPGA. Additionally, the hardware implementation of the receiver is optimized for speed and latency using high-level synthesis (HLS), which significantly shortens the development cycle compared with traditional FPGA development processes. The simulation results confirm the practical value of this method for wideband digital receivers.

Conflicts of Interest:
The authors declare no conflict of interest.