Implementation architecture of signal processing in pulse Doppler radar system based on FPGA

A high-speed, parallel hardware architecture is proposed and designed for the digital signal processing of a high-frequency pulse Doppler radar. The platform is based on one XC7K410T FPGA, two XC7K325T FPGAs, one TMS320C6678 DSP, and four sets of MT41J256M8 DDR3. The implementation details of pulse compression, moving target detection, and constant false-alarm rate detection are described. Simulation results and resource consumption are presented to demonstrate the advantages of the proposed FPGA implementation architecture.


Introduction
Pulse Doppler (PD) radar is widely used in applications such as moving target detection (MTD), range measurement, and velocity measurement [1], since it has strong anti-interference ability and its echo is unlikely to suffer velocity ambiguity in high pulse repetition frequency (HPRF) mode [2]. A PD radar system can obtain attractive time resolution and high range resolution by adopting a linear frequency modulation (LFM) waveform as the transmitted signal [3,4].
Complete radar signal processing is complex and flexible, and it involves a significant amount of data and instructions. Moreover, it needs to be implemented in real time. The field-programmable gate array (FPGA) has a large number of hardware resources such as logic gates and flip-flops. This highly integrated device also offers high parallel processing speed, and it can be reconfigured flexibly to meet the demands of different applications. Therefore, the FPGA has great advantages in radar signal processing [5].
This paper proposes a new complete hardware architecture to implement digital signal processing. The platform uses FPGAs as the core devices, together with a digital signal processor (DSP), analogue-to-digital converters (ADCs), and DDR3 memory. The FPGAs implement digital down-conversion (DDC), digital beam-forming (DBF), pulse compression, MTD, and CFAR, while the DSP implements channel calibration and azimuth angle merging. The data processed by the FPGAs are transmitted to the DSP via the RapidIO interface and then to the host computer via Ethernet.
At the end of the paper, we analyse the designed hardware architecture and the resource consumption of the FPGAs in the system. In addition, the simulation results of the signal processing are presented, followed by an analysis of the problems that appeared in the practical experiments.

Radar system
The proposed PD radar system is mainly used for surveillance, with a scope that varies between modes. At the same time, it measures the distance, speed, altitude, and azimuth of the targets. Its detection targets are low, slow, and small objects such as fixed-wing or rotor-based unmanned aerial vehicles (UAVs), sea-going boats, pedestrians, and vehicles in coastlines, airports, and other open spaces.
The radar supports three types of scanning in azimuth, i.e. circular sweep, fan sweep, and fixed-point sweep. In total, 13 beams with a beamwidth of 5 degrees, scanning from −5 degrees to 55 degrees in elevation, are formed from the signals of 16 receiver channels. Six kinds of waveforms are designed to meet the different detection requirements. Each transmitted waveform comprises two pulses to reduce the blind zone: the first is a cosine signal and the second is an LFM signal. The pulse width and pulse repetition time differ from one waveform to another.
The radar system comprises seven modules: one signal generator, one set of substrate integrated waveguide-based phased-array antennas, one set of compact integrated transceivers, a signal processing board integrated primarily with three FPGAs and one DSP, a power supply, a servo, and a computer. Fig. 1 shows the complete signal processing procedure, including the pre-processing, pulse compression, MTD, and constant false-alarm rate (CFAR) detection. The 16 radar receivers convert the 16 received echoes from radio frequency to intermediate frequency (IF) before they are passed to the ADCs, where the 16 channels of IF analogue signals are converted to digital signals. Then, the 16 IF digital signals are transferred to the FPGA, where the DDC module converts them into orthogonal baseband signals, as shown in Fig. 1.
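The DDC step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the sample rate, the IF (chosen as one quarter of the sample rate), and the moving-average low-pass filter are hypothetical choices made for the example and are not taken from the proposed system.

```python
import cmath
import math

def ddc(if_samples, if_freq, sample_rate, taps):
    """Mix a real IF signal down to complex baseband, then low-pass it.

    The low-pass filter is a plain moving average of `taps` samples,
    an assumption made to keep the sketch short.
    """
    mixed = [x * cmath.exp(-2j * math.pi * if_freq * n / sample_rate)
             for n, x in enumerate(if_samples)]
    # Output k is the average of mixed[k .. k + taps - 1]
    return [sum(mixed[k:k + taps]) / taps
            for k in range(len(mixed) - taps + 1)]

SAMPLE_RATE = 4.0e6   # hypothetical ADC rate
IF_FREQ = 1.0e6       # hypothetical IF, equal to SAMPLE_RATE / 4
PHASE = 0.7           # arbitrary echo phase in radians

if_signal = [math.cos(2 * math.pi * IF_FREQ * n / SAMPLE_RATE + PHASE)
             for n in range(32)]
baseband = ddc(if_signal, IF_FREQ, SAMPLE_RATE, taps=4)
# Real part of each output is the I sample, imaginary part the Q sample.
```

With the IF at a quarter of the sample rate, the unwanted mixing product falls at half the sample rate, so a four-sample average removes it and leaves a constant I/Q pair that carries the echo amplitude and phase.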

Pre-processing implementation on FPGA
After the above frequency conversion, the 16 channels are calibrated to eliminate the effect of phase and time deviations between them before the subsequent DBF processing. When the radar is powered on or the host computer sends the channel calibration command, the DSP processes the 16 baseband signals and obtains the channel calibration coefficients. Then, the DSP sends the obtained channel calibration coefficients together with the fixed DBF coefficients to the FPGA through the external memory interface (EMIF). The baseband signal of each channel is multiplied by its channel calibration coefficient. Then, the calibrated signals are fed into the DBF sub-module, where the 13 digital beams are formed.
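As a minimal sketch of the calibration and DBF steps above, the following processes one time sample across all channels. The function name, the four-channel example, and the weight values are hypothetical; the real system uses 16 channels, 13 beams, and coefficients supplied by the DSP.

```python
import cmath

def calibrate_and_beamform(channels, cal_coeffs, dbf_weights):
    """Apply per-channel calibration, then form one output per beam.

    channels:    one complex baseband sample per receiver channel
    cal_coeffs:  one complex calibration coefficient per channel
    dbf_weights: one weight vector (one weight per channel) per beam
    """
    calibrated = [x * c for x, c in zip(channels, cal_coeffs)]
    return [sum(w * x for w, x in zip(weights, calibrated))
            for weights in dbf_weights]

# Hypothetical example: 4 channels with known phase errors, 2 beams
phase_errors = [0.0, 0.3, -0.2, 0.5]
channels = [cmath.exp(1j * p) for p in phase_errors]
cal_coeffs = [cmath.exp(-1j * p) for p in phase_errors]  # undo the errors
dbf_weights = [[1, 1, 1, 1],     # beam steered at the source
               [1, -1, 1, -1]]   # beam with a null at the source
beams = calibrate_and_beamform(channels, cal_coeffs, dbf_weights)
```

Once the phase errors are removed, the first beam adds the four channels coherently and the second cancels them, which is the behaviour the calibration is meant to protect.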

Pulse compression implementation on FPGA
Pulse compression achieves the long-range detection performance of a long pulse without compromising range resolution, which is important in small target detection, especially for the transmitted LFM signal [6]. In its standard form, the LFM signal is expressed as
$$s(t) = A\,\mathrm{rect}\!\left(\frac{t}{\tau}\right)\cos\!\left(\omega_0 t + \frac{1}{2}\mu t^2\right)$$
where A is the signal amplitude, τ is the pulse width, ω0 is the carrier frequency of the signal, and μ is the frequency modulation rate. In general, there are two implementation methods for pulse compression. One is the time-domain convolution method, which is implemented by a finite impulse response (FIR) filter, and the other is the frequency-domain multiplication method, which is implemented by the fast Fourier transform (FFT). The proposed radar system adopts the second one in order to achieve a large compression ratio and reduce the computational complexity. Pulse compression improves the signal-to-noise ratio (SNR) for target detection, but the side lobes of large targets may overwhelm small targets. Hence, windowing is adopted to suppress the side lobes.
The implementation block diagram of pulse compression is shown in Fig. 1. First, the complex conjugate of the frequency spectrum of the transmitted LFM signal is computed in Matlab as the matched filtering coefficients. Then, the coefficients are weighted by a Hamming window with the same number of points. The results are defined as the modified matched filtering coefficients in this paper, and they are stored in the read-only memory (ROM) of the FPGA. Finally, the spectrum of the received LFM signal is multiplied by the modified matched filtering coefficients, and an inverse fast Fourier transform (IFFT) is performed to obtain the pulse compression result.
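The frequency-domain procedure above can be sketched as follows. The sketch rests on several assumptions that are not stated in the paper: a naive DFT stands in for the FFT and IFFT cores, the chirp is a complex baseband replica, the lengths and bandwidth are arbitrary, and the Hamming taper is centred on zero frequency, which is one plausible reading of how the window is applied to the coefficients.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT, standing in for the FFT/IFFT hardware cores."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n)
               for k in range(n)) for m in range(n)]
    return [v / n for v in out] if inverse else out

def lfm_replica(num_samples, bandwidth):
    """Complex baseband chirp; bandwidth is a fraction of the sample rate."""
    mu = bandwidth / num_samples
    centre = (num_samples - 1) / 2
    return [cmath.exp(1j * math.pi * mu * (n - centre) ** 2)
            for n in range(num_samples)]

def modified_matched_filter(replica, fft_len):
    """Conjugate spectrum of the zero-padded replica with a Hamming taper."""
    padded = replica + [0j] * (fft_len - len(replica))
    spectrum = dft(padded)
    # Hamming taper centred on zero frequency (assumed ordering)
    taper = [0.54 + 0.46 * math.cos(2 * math.pi * k / fft_len)
             for k in range(fft_len)]
    return [s.conjugate() * w for s, w in zip(spectrum, taper)]

def pulse_compress(echo, coeffs):
    """Multiply the echo spectrum by the coefficients, then inverse transform."""
    spectrum = dft(echo)
    return dft([s * h for s, h in zip(spectrum, coeffs)], inverse=True)

FFT_LEN, PULSE_LEN, DELAY = 64, 32, 20          # hypothetical sizes
replica = lfm_replica(PULSE_LEN, bandwidth=0.5)
coeffs = modified_matched_filter(replica, FFT_LEN)   # would be held in ROM
echo = [0j] * DELAY + replica + [0j] * (FFT_LEN - DELAY - PULSE_LEN)
compressed = pulse_compress(echo, coeffs)
peak_index = max(range(FFT_LEN), key=lambda i: abs(compressed[i]))
```

The coefficients are computed once and reused for every echo, so at run time only one forward transform, one multiplication, and one inverse transform are needed; the compressed peak lands at the sample index of the echo delay.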

MTD implementation on FPGA
Pulse compression increases the SNR so that the target can be detected from noisy signals more efficiently. However, when the target and the clutter appear in the same distance unit, the target may not be successfully detected, especially if it is small. The most efficient way to distinguish moving targets from clutter is to use the Doppler frequency shift produced by the relative motion between the radar and the targets.
Because the Doppler frequency of the target is unpredictable, a set of adjacent and partially overlapping narrow-band filters is needed to cover the entire Doppler frequency range and filter the moving target echo. Checking whether a signal is present at the output of each narrow-band filter effectively detects the existence of a moving target, and the Doppler frequency of the responding filter corresponds to the speed of the moving target. The Doppler filter bank is the key technique of MTD, and it can be realised by FIR filters in the time domain or by an FFT in the frequency domain. The latter is adopted in the proposed system to reduce the computational complexity.
The block diagram of MTD is also shown in Fig. 1. The data received from the pulse compression module are stored in DDR3. Then, the data along the slow-time dimension are read from DDR3 and multiplied by a Hamming window. The windowed data are fed to the FFT filter bank, and the modulus of the I/Q data is calculated from the outputs of the narrow-band filters.
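A minimal sketch of the MTD flow above is given here, assuming the pulse-compressed data are arranged as one list of range cells per pulse. The array sizes, the target position, and the naive DFT used in place of the FFT filter bank are assumptions made for illustration only.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft(x):
    """Naive DFT, standing in for the FFT-based Doppler filter bank."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * k * m / n) for k in range(n))
            for m in range(n)]

def mtd(pulses):
    """pulses[p][r] is the compressed sample of pulse p at range cell r.

    Returns magnitude[r][d] for range cell r and Doppler filter d.
    """
    num_pulses, num_cells = len(pulses), len(pulses[0])
    window = hamming(num_pulses)
    doppler_map = []
    for r in range(num_cells):
        # Slow-time vector for this range cell, weighted by the window
        slow_time = [pulses[p][r] * window[p] for p in range(num_pulses)]
        doppler_map.append([abs(v) for v in dft(slow_time)])
    return doppler_map

NUM_PULSES, NUM_CELLS = 16, 4        # hypothetical frame size
TARGET_CELL, TARGET_BIN = 2, 3       # hypothetical target
pulses = [[0j] * NUM_CELLS for _ in range(NUM_PULSES)]
for p in range(NUM_PULSES):
    pulses[p][TARGET_CELL] = cmath.exp(2j * math.pi * TARGET_BIN * p / NUM_PULSES)

doppler_map = mtd(pulses)
detected_bin = doppler_map[TARGET_CELL].index(max(doppler_map[TARGET_CELL]))
```

Reading the data along the slow-time dimension means gathering the same range cell from successive pulses, which is why the compressed pulses are buffered in DDR3 before this step.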

CFAR implementation on FPGA
The CFAR technique is a signal processing algorithm that provides a detection threshold for an automatic radar detection system and minimises the effect of clutter and interference on the false alarm probability [7]. Different types of background clutter distribution lead to different CFAR detection performances. The proposed system adopts the cell-averaging CFAR (CA-CFAR) detection method with seven reference cells and five average cells, which performs well when the clutter follows the Rayleigh distribution. The value of the cell under test is divided by a threshold value and then compared with the average of the five average cells. The threshold value is set by an integer command, ranging from 1 to 9, that is sent from the host computer.
The CFAR detection process is illustrated in Fig. 1. The MTD results of seven adjacent beams are first stored in DDR3. Then, six sum beams are obtained by adding each pair of adjacent beams together. Next, CFAR detection is performed on the slow-time and fast-time dimension data of the sum beams. Finally, the distance unit number and speed unit number of the cells detected by CFAR are transmitted to the DSP, together with the corresponding magnitudes.
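The decision rule described above (cell value divided by the threshold, then compared with the average of the surrounding cells) can be sketched in one dimension as follows. The window layout used here, with `num_train` averaging cells and `num_guard` guard cells on each side, is a hypothetical arrangement for illustration and does not reproduce the seven-reference, five-average cell layout of the proposed system.

```python
def ca_cfar(magnitudes, threshold, num_train=2, num_guard=1):
    """Return the indices of cells that pass a 1-D cell-averaging CFAR test.

    threshold: scale factor, e.g. the integer 1..9 sent by the host computer
    num_train: averaging cells on each side (hypothetical layout)
    num_guard: guard cells on each side (hypothetical layout)
    """
    detections = []
    for i in range(len(magnitudes)):
        lead_start = max(0, i - num_guard - num_train)
        lead_stop = max(0, i - num_guard)
        lag_start = i + num_guard + 1
        reference = (magnitudes[lead_start:lead_stop]
                     + magnitudes[lag_start:lag_start + num_train])
        if not reference:
            continue  # no neighbours available to estimate the background
        background = sum(reference) / len(reference)
        # Rule from the text: scaled cell value versus the local average
        if magnitudes[i] / threshold > background:
            detections.append(i)
    return detections

profile = [1, 1, 1, 1, 1, 1, 20, 1, 1, 1, 1, 1]   # made-up magnitudes
hits = ca_cfar(profile, threshold=4)
```

Because the background estimate follows the local average, the same host-supplied threshold integer adapts to stronger or weaker clutter without being retuned.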

Implementation platform
This radar system uses FPGAs instead of a DSP to implement the signal processing in order to cope with the huge amount of computation. The architecture of the processor platform is given in Fig. 2. Fig. 3 shows the fabricated signal processing board, on which three FPGAs are integrated. One of the FPGAs is a Xilinx Kintex-7 XC7K410T, denoted by F1, and the other two are Xilinx Kintex-7 XC7K325T FPGAs, denoted by F2. The DDC, channel calibration, DBF, and pulse compression modules are implemented on F1, while the MTD and CFAR modules are implemented on F2. When the pulse compression is completed, the data of a series of cascaded pulse repetition times (PRTs) are transmitted to the MTD module through the gigabit transceiver (GTX). In every PRT, there are 13 beams, which are transmitted serially. Due to the limited resources of each F2, the 13 beams are processed on two F2s, one of which handles the beams indexed from 0 to 6 while the other handles those indexed from 6 to 12. In order to make the programs of the two F2s identical, beam No. 6 is processed on both.
Each F2 is equipped with two sets of MT41J256M8 DDR3, whose storage capacity is 1 Gb with eight banks, and each bank includes 16k rows and 1k columns. In this radar system, all the pulses in a frame should be read from the DDR3 and processed within 1/4 of the frame time. Therefore, the DDR3 should store at least 1.25 frames of data, and the read bandwidth should be greater than the write bandwidth.
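The effect of sharing beam No. 6 can be checked with a short sketch. It combines the beam split above with the sum-beam step of the CFAR section (six sum beams from seven adjacent beams); the function and variable names are hypothetical.

```python
def sum_beam_pairs(beam_indices):
    """Pairs of adjacent beams that are added to form the sum beams."""
    return list(zip(beam_indices[:-1], beam_indices[1:]))

first_f2_beams = list(range(0, 7))     # beams 0..6
second_f2_beams = list(range(6, 13))   # beams 6..12

first_pairs = sum_beam_pairs(first_f2_beams)
second_pairs = sum_beam_pairs(second_f2_beams)
all_pairs = sum_beam_pairs(list(range(13)))
```

Each F2 receives seven beams and forms six sum beams, and the two sets together cover all twelve adjacent pairs of the 13 beams. Without the shared beam, one of the two devices would need a different beam count and therefore a different program.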
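The figures above can be checked with simple arithmetic. The sketch assumes an 8-bit word per column address (suggested by the ×8 organisation in the part name) and reads the 1.25-frame requirement as one full frame plus the data that arrive while that frame is being read out; both are interpretations rather than statements from the paper.

```python
from fractions import Fraction

BANKS = 8
ROWS_PER_BANK = 16 * 1024
COLUMNS_PER_ROW = 1024
BITS_PER_COLUMN = 8            # assumed x8 data width

capacity_bits = BANKS * ROWS_PER_BANK * COLUMNS_PER_ROW * BITS_PER_COLUMN

read_window = Fraction(1, 4)   # a frame is read out in 1/4 of the frame time
# One complete frame must be held, plus what is written during its read-out
frames_to_store = 1 + read_window
# Reading a whole frame in 1/4 of the time it took to write it
read_to_write_ratio = 1 / read_window
```

Under these assumptions the geometry gives 2^30 bits, which matches the quoted 1 Gb, the buffer must hold 5/4 of a frame, and the read rate must be four times the write rate.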

Resource consumption
The Xilinx Kintex-7 FPGA offers a large improvement over the previous generation. Tables 1 and 2 show the resource consumption of F1 and F2, respectively. It can be observed that the LUT, BRAM, IO, and DSP resources are heavily consumed. Each F2 is equipped with two sets of DDR3, which can store 1.25 frames of data and thereby reduce the resource consumption of F2.

Simulation results
In the radar system, a signal coupled directly from the transmit channel, without being sent out through the antennas, is used as the echo signal for the radar self-test. After the aforementioned signal processing, the results for a target at a distance of 2 km with a speed of 20 m/s are displayed on the host computer, as shown in Fig. 4. Fig. 4a shows the pulse compression result, in which the peak of the waveform appears at 2 km, corresponding to the distance of the simulated target. Then, MTD is performed, and the result is shown in Fig. 4b and c in the speed and distance dimensions, respectively. It can be seen that the SNR of the distance waveform is improved and the peak is still at 2 km.
Finally, Fig. 4d shows the CFAR result as a trace of moving dots when the radar works in sweep mode, with the detection results of 100 frames in one circle. Each dot corresponds to the target detected by the CFAR module in one frame.

Problems in the experiments
During the experiments, some problems appeared that caused incorrect MTD results and gave rise to wrong detection results. Fig. 5 shows the waveforms before and after MTD processing under different faulty conditions. The correct MTD result with a normal signal is shown in Fig. 5a for comparison. When the I and Q signals generated by the DDC module are not orthogonal, a signal appears at the image frequency corresponding to the signal frequency, as shown in Fig. 5b. Fig. 5c shows the result for I and Q signals generated from the DDS with discontinuous phase between adjacent frames, which leads to two peaks with very close frequencies and a widening of the main lobe. Fig. 5d shows glitches in the signal caused by a program with poor timing. It shows that even a few small glitches raise the noise floor.
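The image-frequency effect described for non-orthogonal I and Q signals can be reproduced with a short sketch. The tone frequency, the number of samples, and the 0.2 rad quadrature error are hypothetical values, and a naive DFT is used in place of the MTD FFT.

```python
import cmath
import math

def spectrum_magnitude(samples):
    """Magnitude of a naive DFT of the complex I/Q samples."""
    n = len(samples)
    return [abs(sum(samples[k] * cmath.exp(-2j * math.pi * k * m / n)
                    for k in range(n))) for m in range(n)]

def iq_tone(num_samples, tone_bin, quadrature_error):
    """I = cos(theta), Q = sin(theta + error); error = 0 means orthogonal."""
    out = []
    for n in range(num_samples):
        theta = 2 * math.pi * tone_bin * n / num_samples
        out.append(complex(math.cos(theta), math.sin(theta + quadrature_error)))
    return out

N, TONE_BIN = 32, 5
IMAGE_BIN = N - TONE_BIN   # bin of the negative (image) frequency

ideal = spectrum_magnitude(iq_tone(N, TONE_BIN, 0.0))
skewed = spectrum_magnitude(iq_tone(N, TONE_BIN, 0.2))
```

With orthogonal I and Q, the image bin is empty. With a quadrature phase error of φ, a component of relative amplitude sin(φ/2) appears at the image bin, which would show up in the MTD output as a spurious target at the opposite Doppler frequency.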

Conclusion
In this paper, a real-time architecture of the PD radar processor based on the FPGA is introduced. Then, the implementation of the signal processing algorithms is presented with detailed procedures. In addition, the resource consumption of the FPGA and the problems in experiments are analysed. The simulation results show that the radar system can detect the target and accurately obtain the distance, speed, azimuth and altitude of the target.

Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant no. 61501033.