Using of Residual Number System as a Mathematical Basis for Software Defined Radio

classes makes it possible to significantly improve the parameters of a computing device in SDR, especially in the Direct Digital Synthesizer (DDS) functional block, compared with a device built on the same physical and technological basis but using a positional number system, and also to obtain new, more advanced design and structural solutions. The experimental results show that the presented techniques offer interesting advantages for FIR filters characterized by a high dynamic range and a high number of taps, especially when full-custom multipliers are not available in the target FPGA architecture or when they must be used for other purposes. Conclusion. Thus, the proposed system offers clear performance advantages over existing systems and can be used to build modern communication systems. The proposed architecture reduces the size of the pipelined adders and multipliers, which is a very important factor in designing a fast SDR.


Introduction
Study object. In the classic view, a software-defined radio system (Software Defined Radio, SDR) [1] is a central processor equipped with receiving and transmitting units. The transmitting unit has a communication processor whose main task is to map the bits of the transmitted data onto modulation symbols and to generate from them the modulating signal of a specific communication system, which is fed to the digital-to-analog converter and then to the radio interface. The receiving unit contains an analog-to-digital converter and a communication processor that demodulates the signal and converts the demodulated symbols of the communication system into data bits. The role of the central CPU is to process user data exchange protocols. Review of sources.
Software-defined radio systems can be implemented both on general-purpose computing hardware and on modern field-programmable gate arrays (FPGA), which makes it possible to create radio systems on a chip. In this case the radio interfaces, as well as the ADC and DAC, are placed outside the FPGA. The use of an FPGA does not reduce system flexibility, because an FPGA can be completely or partially reprogrammed at any time. With modern FPGAs [2] it becomes possible to create SDR-based systems on a single chip (Fig. 1). FPGAs, unlike discrete digital signal processors, allow many different programmable processing blocks to be created on a single chip, which ultimately increases the number of simultaneously serviced radio channels. A software-defined radio system can consist of several FPGAs and serve several independent radio channels. A large number of communication processors provides simultaneous processing of multiple data streams. The communication processors themselves can be of several types, each optimized to work with a specific type of signal. Individual processor types may also be allocated for signal analysis, collecting statistics, or packet filtering. Full or partial reprogramming makes it possible to change the number and composition of communication processors depending on the current working conditions. High-speed serial transceivers, together with a large number of parallel channels, make it possible to extend the interconnect structure beyond a single chip and to merge several FPGAs into one system at low cost. The main task to be solved when developing such systems is an efficient and flexible interconnect structure [1]. The basic requirements for the system interconnect are:
1. Ability to control the transmission rate for each connection.
2. Presence of a high-speed matrix switch.
3. Presence of simple bridges for quickly combining multiple chips into a data processing system.
4. Automatic detection of device connection and disconnection, which is necessary to reallocate the resources in use.
The interconnect structures best suited to these requirements are networks on chip (Network-on-Chip). To simplify the routing of data streams, packet-based data transfer is the most convenient.
Digital receivers have revolutionized communication systems, offering remarkable benefits compared to their analog counterparts. During the last decade, Direct Digital Synthesizer (DDS) techniques have become increasingly popular in digital receiver designs, and many ASIC vendors provide semiconductor solutions for digital communication systems. These systems yield significant benefits in performance, density and cost, and provide high frequency resolution, fast and phase-continuous frequency switching, exceptional linearity, and excellent temperature and aging stability. With the advent of new field-programmable logic device families, such as the Altera APEX 20K or the Xilinx Virtex, and their increasing speed and density, many new benefits are becoming available for the design of radio-frequency digital communication systems using these devices [2]. Digital receiver chips perform down-conversion, low-pass filtering and decimation of the sampled RF signal. The resulting bandwidth and sample rate reduction makes it possible to perform real-time processing of narrowband and wideband radio signals. Traditional number systems are commonly used to build DSP systems with commercially available FPGA technology. While FPGA vendors champion their technology as a provider of system-on-a-chip (SoC) DSP solutions, engineers have historically viewed FPGAs as a prototyping technology [3]. For FPGAs to begin to compete in areas currently controlled by low-end standard-cell ICs, a means must be found to implement DSP objects more efficiently. An arithmetic system capable of surmounting these barriers is the residue number system, or RNS [4].
Taking into account the requirements for building a high-performance SDR, including those applicable to digital frequency and signal synthesis systems, the main method for increasing the speed of digital data processing is confirmed: building the structure of the computing device with maximum parallelization of arithmetic operations. This method in turn addresses a number of tasks set before the computing device:
- introduction of efficient algorithmic and hardware structures of parallel type;
- application of advanced error control;
- use of variants of computer arithmetic that are best suited for high-speed implementations of computational processes requiring large amounts of computation [4].
Using the usual binary number system to perform arithmetic operations on large amounts of data entails a number of difficulties caused by inter-bit dependencies (carry propagation). This disadvantage limits the ways in which arithmetic operations can be implemented, thereby complicating the hardware and limiting the system's performance. It is therefore expedient to use an arithmetic in which inter-bit dependencies are absent or minimized. An arithmetic with these properties is a non-positional number system, the system of residual classes (RNS). Thus, the search for ways to increase performance led to the idea of independent parallel processing of data and, consequently, to replacing the usual binary system with the system of residual classes.
In this system, numbers are represented by their remainders from division by the chosen set of bases (moduli), and all rational operations can be performed in parallel on each digit separately. However, the system of residual classes, so convenient in this respect, has a number of shortcomings in other respects: it is restricted to the field of positive integers, it is difficult to compare numbers by magnitude, it is difficult to determine whether the result of an operation has left the range, and so on. These shortcomings in turn require effective ways to overcome them. FPGA devices are organized in channels. Within these channels are found short-delay propagation paths and dedicated memory blocks with programmable address and data spaces, which are commonly used to synthesize small RAM and ROM functions. Performance suffers rapidly when carry bits and/or data have to propagate across channel boundaries. This work builds upon previous works [3][4][5][6] and previous RNS-DDS design experience [7].
To avoid the large chip area, and accordingly the large power consumption, of the classic SDR, it is proposed to build functional blocks of the SDR structure on the residual class system: filters, digital frequency synthesizer blocks and digital-to-analog converters. This work requires developing mathematical models of an SDR with filters in the system of residual classes and investigating how the chip area of the functional blocks depends on the filter order. It is also necessary to develop a method for converting samples from the residual class system into arbitrary analog signals and to assess how efficiently such a DDS uses chip area.

Research method
In the RNS, numbers are represented in a basis of pairwise coprime numbers called moduli, $\{m_1, m_2, \dots, m_n\}$, with $\gcd(m_i, m_j) = 1$ for $i \neq j$. The product of all moduli, $M = \prod_{i=1}^{n} m_i$, is called the dynamic range of the system. Any integer $0 \le X < M$ can be uniquely represented in the RNS in the form of the vector $\{x_1, x_2, \dots, x_n\}$, where $x_i = |X|_{m_i} = X \bmod m_i$ [8].
The dynamic range of the RNS is usually divided into two approximately equal parts, so that about half of the range represents positive numbers and the rest represents negative numbers. Thus any integer $X$ satisfying one of the following two relations can be represented in the RNS:

$$-\frac{M-1}{2} \le X \le \frac{M-1}{2} \quad \text{for odd } M, \qquad -\frac{M}{2} \le X \le \frac{M}{2} - 1 \quad \text{for even } M. \qquad (1)$$
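The representation above can be illustrated with a minimal Python sketch. The moduli (7, 9, 11) and the function names `to_rns` and `signed_range` are assumptions chosen only for illustration; they do not come from the paper.

```python
from math import prod

# Hypothetical pairwise coprime moduli (an assumption); M = 7 * 9 * 11 = 693.
MODULI = (7, 9, 11)

def to_rns(x, moduli=MODULI):
    """Return the residue vector (x mod m_1, ..., x mod m_n).

    Python's % always yields a non-negative result for a positive modulus,
    so a negative x is mapped into the upper half of the range [0, M).
    """
    return tuple(x % m for m in moduli)

def signed_range(moduli=MODULI):
    """Return (lowest, highest) signed integers representable for these moduli."""
    big_m = prod(moduli)
    if big_m % 2:                      # odd M: symmetric range
        half = (big_m - 1) // 2
        return -half, half
    return -big_m // 2, big_m // 2 - 1  # even M: one extra negative value

print(to_rns(100))      # residues of 100
print(to_rns(-1))       # residues of -1, i.e. of M - 1
print(signed_range())   # signed range for M = 693
```

Each of the 693 integers in $[0, M)$ receives a distinct residue vector, which is the uniqueness property stated in the text.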
The operations of addition, subtraction and multiplication in the RNS are defined by the formulas

$$Z = X \pm Y \;\Rightarrow\; z_i = |x_i \pm y_i|_{m_i}, \qquad (2)$$

$$Z = X \cdot Y \;\Rightarrow\; z_i = |x_i \cdot y_i|_{m_i}, \qquad i = 1, \dots, n. \qquad (3)$$

Equations (2)-(3) show the parallel nature of the RNS, free from carries between digits. These operations are called modular, because processing the numerical values takes only one clock cycle. To convert numbers from the binary positional number system to the RNS, we use an algorithm based on distributed arithmetic. The $K$-bit number $X$ is divided into separate formats, each of which is assigned a predetermined number $B$ of binary digits. The $K$-bit binary number can then be expressed as a combination of positional formats of $B$ bits each, where each format is assigned a specific weight $2^{j}$, $j = 0, B, 2B, \dots, MB$:

$$X = \sum_{j \in \{0, B, 2B, \dots, MB\}} 2^{j} \sum_{k=0}^{B-1} a_{j+k}\, 2^{k},$$

where $B$ is the number of digits of the selected format; $M$ is the degree of the format (in this expression only, not the dynamic range); $a_{j+k}$ is a factor of 0 or 1; $j = 0, B, 2B, \dots, MB$ is the position of the format; $k$ is the position of the digit in the format. Conversion of a number from the binary positional code into the modular code is carried out by modular summation of the remainders of the weighted formats modulo $m_i$. The number $X$ is restored from its remainders by the Chinese Remainder Theorem:

$$X = \left| \sum_{i=1}^{n} M_i \left| M_i^{-1} \right|_{m_i} x_i \right|_{M},$$

where $M$ is again the dynamic range and $M_i = M / m_i$. The element $\left| M_i^{-1} \right|_{m_i}$ denotes the multiplicative inverse of $M_i$ modulo $m_i$ [2]. The advantages of representing numbers in the RNS are as follows:
1. Since carries do not propagate between arithmetic blocks in the RNS, and large numbers are represented as small residues, data processing is accelerated.
2. When data are represented in the RNS, large numbers are encoded as a set of small residues, so the complexity of the arithmetic units in each modulus channel decreases, which simplifies the operation of the computing system.
3. The RNS is a non-positional system with no dependence between its arithmetic blocks; therefore, an error in one channel does not propagate to the others, which in turn facilitates detecting and correcting errors.
Thus, the use of RNS makes it possible to simplify and reduce the architecture of electronic computing devices, thereby increasing not only the speed, but also the energy efficiency of products.
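The forward conversion by $B$-bit formats, the channel-wise modular operations and the restoration of a number from its residues can be sketched in Python as follows. This is only an illustration under stated assumptions: the moduli (7, 9, 11), the format width `B = 4` and all function names are hypothetical, and the software loop stands in for what would be look-up tables and modular adders in hardware.

```python
from math import prod

MODULI = (7, 9, 11)  # hypothetical pairwise coprime moduli (assumption)

def to_rns_by_formats(x, moduli=MODULI, B=4):
    """Convert a non-negative integer to RNS by splitting it into B-bit formats.

    Each format at bit position j (j = 0, B, 2B, ...) is weighted by the
    constant |2^j|_m, and the weighted remainders are summed modulo m.
    """
    residues = []
    for m in moduli:
        acc, j = 0, 0
        while x >> j:
            fmt = (x >> j) & ((1 << B) - 1)   # B-bit format at position j
            weight = pow(2, j, m)             # |2^j|_m, a constant in hardware
            acc = (acc + fmt * weight) % m    # modular summation
            j += B
        residues.append(acc)
    return tuple(residues)

def rns_add(a, b, moduli=MODULI):
    """Channel-wise modular addition: no carries cross between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

def rns_mul(a, b, moduli=MODULI):
    """Channel-wise modular multiplication."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))

def from_rns(residues, moduli=MODULI):
    """Restore X in [0, M) from its residues (Chinese Remainder Theorem)."""
    big_m = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = big_m // m_i
        inv = pow(M_i, -1, m_i)   # multiplicative inverse (Python 3.8+)
        total += M_i * inv * x_i
    return total % big_m

a, b = to_rns_by_formats(123), to_rns_by_formats(456)
print(from_rns(rns_add(a, b)))   # sum restored from residues
```

The results are correct only while the true sum or product stays inside the dynamic range $[0, M)$; detecting a result outside the range is one of the difficult operations discussed next.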
However, operations such as division, comparison of two numbers, and sign detection are laborious and expensive in the RNS. Many solutions have been proposed for these problematic operations. Most of them consist in converting the residues into the binary system (the inverse transformation). In addition, choosing the right set of moduli is another important issue in building an effective RNS with a sufficient dynamic range.
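Why these operations are costly can be seen from a short Python sketch in which sign detection and comparison are carried out through the inverse transformation. The moduli and the function names are assumptions made for illustration; this is the straightforward approach, not one of the optimized solutions referred to above.

```python
from math import prod

MODULI = (7, 9, 11)  # hypothetical moduli (assumption), M = 693

def to_rns(x, moduli=MODULI):
    return tuple(x % m for m in moduli)

def from_rns(residues, moduli=MODULI):
    """Inverse transformation to a positional value in [0, M)."""
    big_m = prod(moduli)
    return sum((big_m // m) * pow(big_m // m, -1, m) * x
               for x, m in zip(residues, moduli)) % big_m

def to_signed(residues, moduli=MODULI):
    """Map the upper part of [0, M) back to negative numbers."""
    big_m = prod(moduli)
    value = from_rns(residues, moduli)
    return value - big_m if value >= (big_m + 1) // 2 else value

def is_negative(residues, moduli=MODULI):
    # The sign is not visible in any single residue: a full
    # reconstruction is performed before the sign can be read.
    return to_signed(residues, moduli) < 0

def rns_less_than(a, b, moduli=MODULI):
    # Comparison likewise needs both operands in positional form.
    return to_signed(a, moduli) < to_signed(b, moduli)

print(is_negative(to_rns(-5)), rns_less_than(to_rns(-3), to_rns(2)))
```

Every call requires a complete reconstruction involving all channels, which removes the channel independence that makes addition and multiplication fast.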

Results and analysis
Results. Summing up, the system of residual classes makes it possible to significantly improve the parameters of the computing device in a Direct Digital Synthesizer (DDS) compared with a device built on the same physical and technological basis but using a positional number system, and also to obtain new, more advanced design and structural solutions.
The essence of digital frequency synthesis is the conversion of the digital code of the number $A$ into an analog harmonic signal with the frequency

$$f = \frac{A}{M} f_{clk}, \qquad (5)$$

where $f_{clk}$ is the frequency of the clock generator and $M$ is a fixed positive integer. The method is based on the periodicity of the harmonic function, which is analogous to the property of arithmetic operations modulo $M$ in the ring of integers.
In the proposed device, the harmonic oscillation $u(t) = \cos(2\pi f t)$ is formed by obtaining its samples at the times $t_k = \Delta t \cdot k$ with the clock frequency $f_{clk} = 1/\Delta t$. Taking into account (5), the discrete samples of a harmonic oscillation with amplitude $U$ are described by the expression

$$u(k) = U \cos\left( 2\pi \frac{A k}{M} \right),$$

where $k = 0, 1, 2, \dots$. Since the cosine is a periodic function,

$$u(k) = U \cos\left( 2\pi \frac{|A k|_{M}}{M} \right).$$

Several papers have proposed synthesizer solutions in the system of residual classes [3,6,7]. Such modern DDS synthesizers can be used as basic modules of the RNS-SDR system. Analysis of the internal structure of the proposed synthesizers shows that a synthesizer can be selected either to minimize area or to increase the rate at which the synthesized signal is formed [7]. Whereas for a traditional DDS the occupied area depends exponentially on the width of the phase accumulator, for an RNS-DDS the area depends linearly on the width of the synthesizer. It is shown that when the phase word is wider than 12 bits, the area of a traditional synthesizer begins to exceed the area of the RNS-DDS synthesizer by 40% or more. If the synthesizer is instead optimized to reduce the delay of the synthesized signal, the gain is 50% or more.
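The link between the periodicity of the cosine and modular phase accumulation can be checked with a small Python sketch. It assumes, for illustration only, that the DDS modulus $M$ equals the dynamic range of a hypothetical moduli set (7, 9, 11); the function names are likewise assumptions, and the CRT step stands in for the phase-to-amplitude stage of a real synthesizer.

```python
import math
from math import prod

MODULI = (7, 9, 11)   # hypothetical moduli (assumption)
M = prod(MODULI)      # DDS modulus taken equal to the dynamic range, 693

def crt(residues):
    """Restore the phase value in [0, M) from its residues."""
    return sum((M // m) * pow(M // m, -1, m) * x
               for x, m in zip(residues, MODULI)) % M

def dds_samples(A, count, U=1.0):
    """Reference samples u(k) = U cos(2 pi |A k|_M / M)."""
    return [U * math.cos(2 * math.pi * ((A * k) % M) / M) for k in range(count)]

def rns_dds_samples(A, count, U=1.0):
    """Same samples, with the phase accumulated independently in each channel."""
    step = tuple(A % m for m in MODULI)   # frequency word in RNS form
    phase = tuple(0 for _ in MODULI)      # RNS phase accumulator
    out = []
    for _ in range(count):
        out.append(U * math.cos(2 * math.pi * crt(phase) / M))
        # One small modular adder per channel, with no carry between channels.
        phase = tuple((p + s) % m for p, s, m in zip(phase, step, MODULI))
    return out

ref = dds_samples(5, 1500)
rns = rns_dds_samples(5, 1500)
print(max(abs(a - b) for a, b in zip(ref, rns)))
```

The channel-wise accumulator reproduces the reference samples, and the sequence repeats after $M$ clock cycles, as the periodicity argument predicts.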
As a rule, the DDS synthesizer or its NCO core is present in the SDR system. These blocks take up most of the synthesizer and reducing their area or increasing performance is a significant improvement to SDR systems.
In addition to DDS, digital filters perform important tasks in SDR systems. Consider methods for optimizing digital filters using a residual class system.
The direct digital synthesizer is implemented using look-up tables (LUTs). These tables are addressed by the output of a phase accumulator and exploit quarter-wave symmetry to reduce the number of LUTs needed. The harmonic signals are then multiplied with the input signal of the receiver. The decimation filter is a programmable RNS filter: the products in the filter are calculated using a series of modulo adders, while the sums are calculated using a regular adder tree followed by a modulo stage (see Fig. 2). The receiver was implemented for RNS codes with several dynamic ranges, from 34 to 37 bits, and for filters of varying lengths, from 8 to 64 taps. The implementation results show that it is possible to increase the throughput and reduce the complexity of a receiver for certain dynamic ranges and filter lengths.
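The filter structure described above (modular products, an ordinary adder tree, and a single modulo stage per channel) can be sketched in Python. The moduli (13, 15, 16), the coefficients, the input samples and the function names are all assumptions chosen for illustration; the sketch models only the arithmetic, not the hardware of the receiver.

```python
from math import prod

MODULI = (13, 15, 16)  # hypothetical pairwise coprime moduli (assumption), M = 3120

def crt(residues, moduli=MODULI):
    big_m = prod(moduli)
    return sum((big_m // m) * pow(big_m // m, -1, m) * x
               for x, m in zip(residues, moduli)) % big_m

def fir_rns(samples, coeffs, moduli=MODULI):
    """FIR filter computed independently in every modulus channel."""
    big_m = prod(moduli)
    out = []
    for n in range(len(samples)):
        residues = []
        for m in moduli:
            acc = 0                                    # regular adder tree
            for k, c in enumerate(coeffs):
                if n - k >= 0:
                    acc += ((c % m) * (samples[n - k] % m)) % m  # modular product
            residues.append(acc % m)                   # single modulo stage
        y = crt(residues, moduli)                      # conversion at the output only
        out.append(y - big_m if y >= (big_m + 1) // 2 else y)
    return out

def fir_direct(samples, coeffs):
    """Reference filter in ordinary integer arithmetic."""
    return [sum(c * samples[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(samples))]

coeffs = [1, -2, 3, -2, 1]
samples = [5, -3, 7, 0, -8, 4, 2]
print(fir_rns(samples, coeffs))
print(fir_direct(samples, coeffs))
```

The two outputs agree as long as every output sample stays within the signed dynamic range of the chosen moduli, which is why the dynamic range must be sized to the coefficient and input word lengths.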
The authors of [9] compare error-free FIR filters with a word length of 20 bits implemented in both RNS and conventional 2's complement arithmetic. They conclude that for transposed filters with more than 8 taps the RNS implementation is faster, and with more than 40 taps it also uses less area and energy. For direct-form FIR filters the RNS implementation is slightly faster, and its area is smaller for filters with more than 16 taps. These results hold both for filters with programmable coefficients and for filters with constant coefficients. The authors also suggest that the supply voltage for the residue computations that are not on the critical path can be reduced to further lower the power consumption of RNS filters. The results show that the RNS filter has a large overhead due to the conversion to and from the RNS, but that its area and power grow more slowly as the number of taps increases.
The great advantage of the SDR, which is completely implemented in the system of residual classes, is the absence of the need for multiple conversion blocks into the binary system. Such a unit is needed only at the output of the system.

Fig. 2. RNS implementation of a FIR filter
SDR systems typically process quadrature signals. The Residue Number System allows complex numbers to be represented more efficiently. One efficient way is to use the Quadratic Residue Number System (QRNS). Its main advantage lies in complex multiplication, which requires only two multiplications, as opposed to four or three in ordinary 2's complement arithmetic. Arithmetic in the QRNS is performed as follows:

$$(z_1, z_1^*) \pm (z_2, z_2^*) = (z_1 \pm z_2,\; z_1^* \pm z_2^*), \qquad (z_1, z_1^*) \cdot (z_2, z_2^*) = (z_1 \cdot z_2,\; z_1^* \cdot z_2^*).$$

This means that addition in the QRNS is as complex as in the ordinary RNS, whereas multiplication is simpler. To convert a number from RNS to QRNS, each residue requires two constant multipliers, one adder and one subtractor. Thus, for a QRNS number with N residues, a total of 2N constant multipliers, N adders and N subtractors are needed for the conversion. In summary, the benefit of the QRNS is the ease of multiplication, while its disadvantages are the overhead of converting from ordinary RNS to QRNS and the reduced flexibility in the choice of moduli.
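A minimal Python sketch of QRNS arithmetic for a single modulus is given below. It relies on the standard construction, stated here as an assumption because the text does not spell it out: the modulus $m$ is a prime with $m \equiv 1 \pmod 4$, so that an integer $\hat{\jmath}$ with $\hat{\jmath}^2 \equiv -1 \pmod m$ exists, and a complex residue $a + ib$ is mapped to the pair $(a + \hat{\jmath} b,\ a - \hat{\jmath} b)$. The modulus 13 and the function names are illustrative.

```python
def find_j(m):
    """Find j with j*j = -1 (mod m); exists for primes m = 1 (mod 4)."""
    for j in range(2, m):
        if (j * j + 1) % m == 0:
            return j
    raise ValueError("no square root of -1 modulo %d" % m)

def to_qrns(a, b, m):
    """Map the complex residue a + i*b to the pair (z, z*)."""
    j = find_j(m)
    return ((a + j * b) % m, (a - j * b) % m)

def qrns_mul(p, q, m):
    """Complex multiplication: two independent real modular multiplications."""
    return ((p[0] * q[0]) % m, (p[1] * q[1]) % m)

def qrns_add(p, q, m):
    return ((p[0] + q[0]) % m, (p[1] + q[1]) % m)

def from_qrns(p, m):
    """Recover (a, b) from (z, z*)."""
    j = find_j(m)
    a = ((p[0] + p[1]) * pow(2, -1, m)) % m
    b = ((p[0] - p[1]) * pow(2 * j, -1, m)) % m
    return a, b

m = 13  # hypothetical modulus; 5 * 5 = 25 = -1 (mod 13)
product = qrns_mul(to_qrns(2, 3, m), to_qrns(1, 4, m), m)
print(from_qrns(product, m))  # (2 + 3i)(1 + 4i) = -10 + 11i, reduced modulo 13
```

In a multi-modulus system the same mapping is applied in every channel, which is why each modulus must admit a square root of $-1$ and the choice of moduli is restricted.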
For the operation of the SDR-RNS system, it is necessary to convert binary signals into the RNS and directly analog signals into samples in the RNS system, as well as the values of the RNS system into analog form.
To convert from the RNS to analog form, a DAC operating in the system of residual classes can be applied [10]. It is also necessary to use an analog-to-digital converter that converts the analog signal directly into samples in the residual class system.
Let us consider ways of constructing a digital frequency synthesizer with a phase accumulator in the RNS and a sine-weighted DAC. Usually, residue-to-analog (R/A) conversion is performed in two steps, with conversion to binary as an intermediate stage. This degrades the performance of the overall RNS by adding extra overhead and increasing the latency. Therefore, a direct R/A converter is sought to solve that problem and make the RNS efficient. The problem of direct R/A conversion has not yet been sufficiently investigated. The author of [10] tackled this problem and suggested a direct R/A converter based on mixed-radix conversion (MRC). The main drawback of the MRC-based converter is the sequential nature of the algorithm, which makes it slow for a large dynamic range of frequencies. We propose a direct R/A converter architecture based on the Chinese Remainder Theorem (CRT). The proposed converter eliminates the need for an intermediate binary stage and can perform even better than the conventional residue-to-binary (R/B) converter. The need for a large modulo adder is eliminated. Instead, a summing operational amplifier together with a folding circuit performs modulo addition in the analog domain. The proposed converter facilitates the implementation of the CRT when direct conversion to analog form is required, and it is well suited to applications with a large dynamic range.
In contrast to MRC, the CRT is not a sequential algorithm. The intermediate values can be generated in parallel using ROM look-up tables. A proposed architecture for direct conversion from RNS to analog representation is shown in Fig. 3.

Fig. 3. Block diagram of the RNS ROM-less DDS
The synthesizer consists of the following functional blocks: the binary code converter into the RNS system, the phase accumulator in the RNS, the RNS processor, the conversion units based on the CRT, the DAC units and the summing operational amplifier. The frequency control word (FCW) is fed to the binary code converter in the residual class system. In the phase accumulator, the phase values are accumulated for each of the residues in the RNS system. In the RNS processor, the necessary transformations of signals -phase transformation, amplitudes, modulation of the synthesized oscillation -occur. After this, the received signal in the form of its values in the RNS enters separately into conversion units based on the CRT system. The resulting values are converted into an analog form in the DAC units [10].
To realize an R/A converter based on the CRT, the intermediate values (partial sums of the CRT) generated by the ROMs must be added modulo $M$. Assume each residue has $n$ bits; the partial sums are then generated using three $(2^{n} \times 3n)$-bit ROMs. These values are converted into analog form using $3n$-bit DACs. Conventional addition is carried out by a summing operational amplifier.
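The signal path of such a converter can be modelled numerically. The Python sketch below is a behavioural model under stated assumptions, not the circuit itself: the moduli (7, 8, 9), the reference voltage, and the function names are hypothetical, the DACs and the summing amplifier are ideal, and the folding circuit is modelled as repeated subtraction of the full-scale voltage.

```python
from math import prod

MODULI = (7, 8, 9)   # hypothetical moduli (assumption): residues fit in n = 4 bits
M = prod(MODULI)     # 504, which fits in 3n = 12 bits

def to_rns(x):
    return tuple(x % m for m in MODULI)

def build_roms():
    """One ROM per channel: residue x_i -> partial sum |M_i * |M_i^-1| * x_i|_M."""
    roms = []
    for m_i in MODULI:
        M_i = M // m_i
        inv = pow(M_i, -1, m_i)
        roms.append([(M_i * inv * x) % M for x in range(m_i)])
    return roms

def residue_to_analog(residues, roms, v_ref=1.0):
    """Ideal model of the DACs, the summing amplifier and the folding circuit."""
    # Each DAC turns a ROM word into a voltage; the amplifier adds the voltages.
    v = sum(rom[x] / M * v_ref for rom, x in zip(roms, residues))
    # Folding: remove whole multiples of full scale (modulo-M addition
    # carried out in the analog domain).
    while v >= v_ref:
        v -= v_ref
    return v

roms = build_roms()
print(residue_to_analog(to_rns(126), roms))  # expected near 126 / 504 = 0.25
```

Because the three partial sums are each smaller than $M$, the summed voltage never exceeds three times full scale, so the folding stage has to remove at most two multiples of the reference voltage.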
Analysis. The resource savings obtained by using the RNS are always greater than 30% when the dynamic range of the input data is 12 bits, while for 8 bits the advantage depends on the number of taps. For FIR1 there are no savings but rather a small increase in resource usage due to the overhead of the conversion blocks, whereas savings of up to 20% are obtained in the FIR5 and FIR3 experiments. The experimental results show that the presented techniques offer interesting advantages for FIR filters characterized by a high dynamic range and a high number of taps, especially when full-custom multipliers are not available in the target FPGA architecture or when they must be used for other purposes. Table 1 presents the gain for filters up to the 10th order, which shows that a gain of about 30% is readily attainable.
For wide phase words, the size of the ROM becomes the main problem in designing a high-speed DDS. In addition, reducing the size of the ROMs reduces the area of the whole converter. In terms of power consumption, the proposed converter is expected to consume at least as much static power as the converter proposed in [10], since both contain one operational amplifier and three DACs. The proposed converter may consume slightly more static power to achieve the $3n$-bit resolution of the DACs (for $n$-bit residues). On the other hand, the dynamic power of the ROMs decreases as their size is reduced. Therefore, the proposed direct digital synthesizer can be more efficient than an RNS DDS with a conventional R/B converter and DAC, and it eliminates the need for a large digital modulo adder.

Conclusion
The paper discusses the principles of building SDR systems in the system of residual classes. The main functional blocks of such systems are proposed and analyzed. Namely: a reference quadrature generator based on a direct digital frequency synthesizer, filter blocks in the system of residual classes, digital-analog and analog-digital converters in the system of residual classes, quadrature converters, converters from the system of residual classes to the binary system and vice versa.
The key features of the proposed SDR-RNS system are:
- The ability to trade off system performance against energy consumption (chip area).
- The proposed architecture reduces the size of the pipelined adders and multipliers, which is a very important factor in designing a fast SDR.
The structure of a prospective SDR is analyzed. The potential reduction in hardware costs and methods for its improvement are analyzed. Thus, the proposed system offers clear performance advantages over existing systems and can be used to build modern communication systems.
Future work: it is necessary to investigate the dependences of the SDR chip area on the values of the number system modules. It is also necessary to compare the maximum operating frequencies of the SDR receiver in the system of residual classes with the SDR receiver in the classical number system. Such a comparative study can be done on a fully functional layout of the SDR system where both approaches are simultaneously implemented in a single technological basis.