Development and design of an FPGA-based encoder for NPN

Abstract This paper describes a hardware cryptographic protection device designed to improve the performance of data encryption and decryption and to preserve data integrity. The cryptosystem is implemented as a combined hardware-software solution in which encryption and decryption are carried out in a stand-alone FPGA device based on the non-positional polynomial number system (NPN). To encrypt data, each block of plaintext is divided into sub-blocks that are represented as separate binary polynomials; each sub-block is assigned a binary key polynomial and an irreducible polynomial (modulus). The sub-blocks are then processed in parallel to form the ciphertext. For this purpose, we developed a special algorithm that calculates the NPN parameters and checks the polynomials (moduli) for irreducibility, a program that generates direct and inverse keys, and application functionality that implements operations in the ring of polynomials with coefficients in GF(2) using an object-oriented approach. We also developed sequential and parallel (matrix) modular polynomial multipliers, on the basis of which data encryption and decryption are performed.


PUBLIC INTEREST STATEMENT
For special-purpose transceiver devices, an autonomous portable hardware and software solution for information protection is required. To meet this need, this work uses high-speed programmable logic integrated circuits suited to complex mathematical operations. The algorithm is based on the non-positional polynomial number system (NPN). The source text is divided into blocks using a random number generator, and the blocks are processed in parallel to form the ciphertext. For this purpose, a special algorithm has been developed that calculates the NPN parameters and checks the polynomials (moduli) for irreducibility, together with a program for generating direct and inverse keys. Sequential and parallel (matrix) hardware multipliers of polynomials modulo an irreducible polynomial have been developed, on the basis of which data encryption and decryption are carried out.

Introduction
As the means, methods and forms of automating the collection, storage and processing of information develop and become more complex, the vulnerability of that information increases. Hardware implementation of cryptosystems makes it possible to protect autonomous digital devices from unauthorized external access to classified information (Shan`gin, 2007). Many methods have been developed to meet the ever-increasing demand for secure communications, data storage, data transmission, etc. (Saini et al., 2020). Cryptographic protection, which transforms plaintext into ciphertext using cryptographic algorithms, is considered one of the most reliable ways to solve the problem of data security in computer systems and networks (Ryabko & Fionov, 2004; Shan`gin, 2007).
Various algorithms and methods of data encryption are known. Reliable and effective non-traditional cryptographic methods, algorithms and software for information protection, such as non-positional polynomial number systems or polynomial number systems in residue classes, can provide faster encryption and increase the cryptographic strength of encryption algorithms. Moreover, software and hardware implementations of encryption functions can be combined (Barrera et al., 2020; Aitkhozhayeva & Tynymbayev, 2014).
One of the significant advantages of hardware encryption over software encryption in modern information and communication technologies is its high performance (Gnatyuk et al., 2016). In addition, a hardware implementation of a cryptographic algorithm ensures its integrity, and encryption and key generation are performed on the encryptor board itself rather than in computer memory. A further important advantage is that the implementation of the algorithm itself is protected. These advantages have led to interest in hardware implementations of cryptosystems (Gnatyuk et al., 2016; Nedjah & de Macedo Mourelle, 2006; Sadiq & Ahmed, 2006; Tenca & Tawalbeh, 2003).
As modern security protocols become increasingly algorithm-independent, a high degree of flexibility with respect to cryptographic algorithms is desirable. A promising solution that combines high flexibility with the speed and physical security of traditional hardware is the implementation of cryptographic algorithms on reconfigurable devices such as FPGAs (Krishna et al., 2017; Liu et al., 2019; Wollinger et al., 2004; Zambreno et al., 2006).
The efficiency of the FPGA-based NPN encryption device is increased by improving the performance of the encryption and decryption algorithms.

Materials and methods
In this work, an FPGA-based data encryption and decryption algorithm was implemented. The cryptosystem algorithm, based on a multiplier of polynomials modulo irreducible polynomials (Kalimoldayev, Tynymbayev, Gnatyuk, Ibraimov et al., 2020; Kalimoldayev et al., 2019), was implemented on a Nexys 4 DDR FPGA board with parallel data computation. The Nexys 4 DDR board is a complete, ready-to-use digital circuit development platform based on the Artix-7 field-programmable gate array (FPGA) from Xilinx. With a large, high-performance FPGA (Xilinx part number XC7A100T-1CSG324C), extensive external memory and a suite of USB, Ethernet and other ports, the Nexys 4 DDR can accommodate designs ranging from introductory combinational circuits to powerful embedded processors.
The data encryption and decryption algorithm was designed and tested in the Xilinx ISE Design Suite 14.4 computer-aided design (CAD) environment using the hardware description language (HDL) Verilog; some blocks and subblocks were written in VHDL. Polynomials are multiplied modulo irreducible polynomials, with the multiplication carried out by analyzing the low-order bits of the multiplier.
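As an illustration only, multiplication driven by the low-order bits of the multiplier can be modeled in software. The Python sketch below is not the authors' Verilog: the function name gf2_mulmod and the encoding of polynomials as integer bit-vectors are assumptions made here, and the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) is used only as a convenient, well-known irreducible modulus for the check.

```python
def gf2_mulmod(a: int, b: int, p: int) -> int:
    """Return a(x) * b(x) mod p(x) over GF(2).

    Polynomials are encoded as integers: bit i is the coefficient of x**i.
    Both operands are assumed to be already reduced modulo p(x).
    """
    deg = p.bit_length() - 1                 # degree of the modulus
    assert a >> deg == 0 and b >> deg == 0, "operands must be reduced"
    result = 0
    while b:
        if b & 1:                            # inspect the lowest multiplier bit
            result ^= a                      # add (XOR) the partial product
        b >>= 1                              # move on to the next multiplier bit
        a <<= 1                              # multiply the multiplicand by x
        if a >> deg:                         # degree reached deg(p): reduce once
            a ^= p
    return result


# Worked example from FIPS 197: {57} * {83} = {c1} modulo 0x11B.
print(hex(gf2_mulmod(0x57, 0x83, 0x11B)))    # 0xc1
```

Each loop iteration roughly corresponds to one step of a sequential multiplier; a parallel (matrix) multiplier would form all partial products at once.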
The encryption and decryption algorithm has been tested in the ISE simulator (ISim). ISim provides a full-featured HDL simulator integrated into the ISE. A general block diagram of the encryption device is shown in Figure 1.
The RTL schematic of the main data encryption and decryption block is shown in Figure 2. The main block consists of three main subblocks: coder_keyboard, decoder and display. The coder_keyboard subblock forms the plaintext (source information) and encrypts the generated data. The decoder subblock decrypts the data. The display subblock visualizes the encryption and decryption of the data. The main block has 4 inputs and 5 outputs. The first input, clk, is a 100 MHz clock signal. The inputs clk_kb and data_kb are used to read the scan code from the keyboard. The input rst_btn resets the information shown on the monitor. The outputs vga_b, vga_g and vga_r carry the color signals to the monitor. The horizontal and vertical sync signals vga_hs_o and vga_vs_o synchronize the output of each pixel on the monitor. Figure 3 shows the schematic of the coder_keyboard subblock, which enters data into the buffer and encrypts data from the buffer.
The coder_keyboard subblock consists of several parts: coder, ps2_keyboard_to_ascii and buffered data generation. Data from the keyboard arrives at ps2_keyboard_to_ascii, which converts the scan code to an ASCII code. The plaintext is formed by scanning the data from the USB keyboard. The data is stored in a 128-bit (16-byte) buffer. If the keyboard data does not fill the buffer, the remainder is filled with zeros. Each byte corresponds to one keyboard character, i.e. the maximum buffer size is 16 characters. The size of the initial information can be changed.
When a keyboard character is pressed, the output (ascii_new) of ps2_keyboard_to_ascii generates a press signal. When this signal is received the data buffer adds data from the ascii_data output. Pressing Enter sends information from the buffer to the coder block to encrypt the data. The BackSpace key removes data from the buffer character by character each time it is pressed.
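The buffer behavior described above can be sketched as a small software model. This is a hedged illustration rather than the authors' HDL: the class name KeyBuffer, the Enter code 0x0D, the byte order (first character in the most significant byte) and the clearing of the buffer after Enter are all assumptions made here; only the 16-byte size, the zero fill and the BackSpace code 0x08 come from the text.

```python
BUFFER_BYTES = 16      # 128-bit buffer, one byte per character
BACKSPACE = 0x08       # code checked in the paper's Verilog snippet
ENTER = 0x0D           # assumed ASCII code for the Enter key


class KeyBuffer:
    """Software model of the plaintext buffer fed by ps2_keyboard_to_ascii."""

    def __init__(self) -> None:
        self.chars = bytearray()

    def press(self, ascii_code: int):
        """Handle one key press; return the 128-bit block on Enter, else None."""
        if ascii_code == ENTER:
            # Unused positions are filled with zeros before encryption.
            block = bytes(self.chars).ljust(BUFFER_BYTES, b"\x00")
            self.chars.clear()               # assumption: buffer restarts empty
            return int.from_bytes(block, "big")
        if ascii_code == BACKSPACE:
            if self.chars:
                self.chars.pop()             # remove the last character
        elif len(self.chars) < BUFFER_BYTES:
            self.chars.append(ascii_code)    # store the new character
        return None


kb = KeyBuffer()
for code in b"NPX":
    kb.press(code)
kb.press(BACKSPACE)                          # delete the mistyped 'X'
kb.press(ord("N"))
print(f"{kb.press(ENTER):032x}")             # 'NPN' followed by 13 zero bytes
```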
A snippet of the Verilog code that generates the raw data is given below:
    else if (ascii_code == 7'h08) info_ram <= info_ram >> 8;
The coder block takes 128 bits of information from the buffer to encrypt the data. The coder consists of 10 parallel coder16 subblocks. The maximum input data size for a coder16 subblock is 16 bits, so the input data must be divided into 10 parts. The lengths of the 10 parts are set by generating pseudorandom numbers: the pseudorandom number generator produces 10 numbers whose sum must be 128. The pseudorandom numbers are generated by a linear-feedback shift register (LFSR), which was written in Verilog.
The LFSR consists of an XOR logic element and a shift register. It generates the numbers from 1 to 15 and repeats with a period of 15; no number is repeated within one period. An example sequence is 1, 8, 4, 2, 9, 12, 6, 11, 5, 10, 13, 14, 15, 7, 3. The sum of 10 such numbers will not always be 128, however. The range of the random numbers is therefore changed by taking each number modulo 6 and adding 11. As a result, 9 numbers are generated randomly in the range 11 to 16, and the tenth is the difference between 128 and the sum of the previous nine.
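The length generation can be modeled in a few lines of Python. This sketch is an assumption-laden reconstruction, not the authors' listing: the feedback taps (XOR of the two lowest bits, shifted in at the top of a 4-bit register) were inferred here from the example sequence quoted in the text, and the seed value 1 and the function names are hypothetical.

```python
def lfsr4_step(state: int) -> int:
    """One step of a 4-bit LFSR: XOR of bits 0 and 1 is shifted in at bit 3."""
    feedback = (state ^ (state >> 1)) & 1
    return (state >> 1) | (feedback << 3)


def lfsr4_sequence(seed: int = 1, count: int = 15) -> list:
    """Return `count` successive register states, starting with the seed."""
    states, state = [], seed
    for _ in range(count):
        states.append(state)
        state = lfsr4_step(state)
    return states


def subblock_lengths(seed: int = 1, blocks: int = 10, total: int = 128) -> list:
    """Nine lengths in 11..16 from the LFSR, the tenth fills up to `total`."""
    lengths = [n % 6 + 11 for n in lfsr4_sequence(seed, blocks - 1)]
    lengths.append(total - sum(lengths))
    return lengths


print(lfsr4_sequence())      # [1, 8, 4, 2, 9, 12, 6, 11, 5, 10, 13, 14, 15, 7, 3]
print(subblock_lengths())    # [12, 13, 15, 13, 14, 11, 11, 16, 16, 7]
```

With seed 1 the model reproduces both the example sequence and the subblock lengths reported for Figure 4(a), which suggests, but does not prove, that the inferred taps match the hardware.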
The encryption keys G(x) and base systems P(x) for the different bit lengths were generated using the software part of the cryptosystem. The data encryption code was written in Verilog according to the formula L(x) = F(x)·G(x) mod P(x), where F(x) is the input data polynomial, G(x) is the key and P(x) is the base system. Encryption in the 10 subblocks is performed in parallel. After encryption, the data from the subblocks are concatenated, giving the encrypted data L(x) with a size of 160 bits.
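A minimal software sketch of this encryption step is given below, under stated assumptions. It multiplies each plaintext sub-block polynomial by a key modulo an irreducible polynomial and concatenates the ten 16-bit results into 160 bits. The key values, the use of one shared degree-16 modulus for all subblocks, the most-significant-first bit order and all function names are hypothetical; the paper generates its keys and base systems per bit length in separate software.

```python
def gf2_mod(a: int, p: int) -> int:
    """Remainder of a(x) divided by p(x) over GF(2)."""
    while a.bit_length() >= p.bit_length():
        a ^= p << (a.bit_length() - p.bit_length())
    return a


def gf2_mulmod(a: int, b: int, p: int) -> int:
    """a(x) * b(x) mod p(x): carry-less product followed by reduction."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return gf2_mod(product, p)


def is_irreducible(p: int) -> bool:
    """Trial division by every polynomial of degree 1..deg(p)//2."""
    deg = p.bit_length() - 1
    return deg >= 1 and all(gf2_mod(p, d) for d in range(2, 1 << (deg // 2 + 1)))


def first_irreducible(degree: int) -> int:
    """Smallest irreducible polynomial of the given degree (constant term 1)."""
    p = (1 << degree) | 1
    while not is_irreducible(p):
        p += 2
    return p


def split_block(block: int, lengths, total: int = 128) -> list:
    """Cut a `total`-bit integer into sub-blocks, most significant bits first."""
    parts, remaining = [], total
    for n in lengths:
        remaining -= n
        parts.append((block >> remaining) & ((1 << n) - 1))
    return parts


def encrypt_block(block: int, lengths, keys, moduli) -> int:
    """Encrypt each sub-block and concatenate the 16-bit results."""
    cipher = 0
    for f, g, p in zip(split_block(block, lengths), keys, moduli):
        cipher = (cipher << 16) | gf2_mulmod(f, g, p)
    return cipher


LENGTHS = [12, 13, 15, 13, 14, 11, 11, 16, 16, 7]    # example lengths from the text
P16 = first_irreducible(16)                           # hypothetical shared modulus
KEYS = [0x8001 + 2 * i for i in range(10)]            # hypothetical key values
plain = int.from_bytes(b"NPN on FPGA".ljust(16, b"\x00"), "big")
print(f"{encrypt_block(plain, LENGTHS, KEYS, [P16] * 10):040x}")
```

The loop over subblocks is sequential here only because this is a software model; in the device the ten coder16 subblocks operate in parallel.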
A dedicated part of the program code selects the encryption key and the base system for each subblock.
The decoder unit divides the 160 bits of encrypted data into 16-bit words and decrypts them in parallel. During decryption, the inverse keys and base systems for the different bit lengths are generated using the software part of the cryptosystem. Decryption yields ten 16-bit subblocks, from which the necessary part of each must be cut out to recover the initial 128 bits of data. For this, the decoder block has a pseudorandom number generation subblock that repeats the sequence of lengths used to divide the 128-bit input data during encryption. As a result, the initial information is obtained. The program code of the decoding block includes a function that divides the 160 bits into 16-bit words.
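For illustration, the decoding step can be modeled in Python as follows; this is a sketch under assumptions, not the authors' Verilog. The inverse key is computed here by exponentiation (g raised to 2^deg - 2, valid only for an irreducible modulus), which is a technique chosen for brevity and not necessarily the method used by the paper's key generator. The function names, the shared degree-16 modulus and the key value are hypothetical.

```python
def gf2_mod(a: int, p: int) -> int:
    """Remainder of a(x) divided by p(x) over GF(2)."""
    while a.bit_length() >= p.bit_length():
        a ^= p << (a.bit_length() - p.bit_length())
    return a


def gf2_mulmod(a: int, b: int, p: int) -> int:
    """a(x) * b(x) mod p(x): carry-less product followed by reduction."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return gf2_mod(product, p)


def first_irreducible(degree: int) -> int:
    """Smallest irreducible polynomial of the given degree, by trial division."""
    p = (1 << degree) | 1
    while not all(gf2_mod(p, d) for d in range(2, 1 << (degree // 2 + 1))):
        p += 2
    return p


def gf2_inverse(g: int, p: int) -> int:
    """Inverse key g^(2^deg - 2) mod p; requires p irreducible and g nonzero."""
    result, base, exponent = 1, g, (1 << (p.bit_length() - 1)) - 2
    while exponent:
        if exponent & 1:
            result = gf2_mulmod(result, base, p)
        base = gf2_mulmod(base, base, p)
        exponent >>= 1
    return result


def decrypt_block(cipher: int, lengths, inverse_keys, moduli) -> int:
    """Split the ciphertext into 16-bit words, decrypt, keep n bits of each."""
    plain, count = 0, len(lengths)
    for i, (n, g_inv, p) in enumerate(zip(lengths, inverse_keys, moduli)):
        word = (cipher >> (16 * (count - 1 - i))) & 0xFFFF   # i-th 16-bit word
        f = gf2_mulmod(word, g_inv, p)                       # undo the key
        plain = (plain << n) | (f & ((1 << n) - 1))          # cut out n bits
    return plain


P16 = first_irreducible(16)          # hypothetical modulus
g = 0x8001                           # hypothetical direct key
g_inv = gf2_inverse(g, P16)
print(gf2_mulmod(g, g_inv, P16))     # expected to be 1 for a valid key pair
```

The lengths passed to decrypt_block must be the same sequence used during encryption, which is why the decoder repeats the pseudorandom number generation.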

Results and discussion
The results of data encryption and decryption are shown in Figure 4 as a timing diagram. In Figure 4(a), the data to be encrypted are fed to the info input and distributed randomly into blocks with lengths 12, 13, 15, 13, 14, 11, 11, 16, 16, 7 (the lengths of the subblocks of the initial 128-bit plaintext). Unique irreducible polynomials and direct keys are also assigned to these blocks; the input data polynomials are multiplied by the keys modulo the irreducible polynomials. As can be seen in Figure 4(a), the coder_info output forms the encrypted value 0d891ef43f1a19ba127306ca00c40bc632ca0036 in hexadecimal notation.
The visualization block consists of three subblocks: generation of a synchronization signal with a given frequency, the VGA driver, and character generation for display on the screen (Figure 5).
The VGA driver block generates the synchronization signals for the horizontal and vertical positions on the monitor. The block has the inputs coder_info_in, decoder_info_in and info_in, which correspond to the encrypted, decrypted and raw data.
The encryption algorithm was implemented on a Nexys 4 DDR development board with an Artix-7 FPGA (XC7A100T-1CSG324C) from Xilinx. The device was selected because its performance is suitable for the task. Table 1 shows the number of slice registers, slice LUTs, fully used LUT-FF pairs, bonded IOBs, block RAM/FIFO, BUFG/BUFGCTRL, etc. required by the design. Table 2 gives the timing parameters, such as the maximum frequency, minimum period, minimum input arrival time before the clock and maximum output required time after the clock.
The proposed FPGA-based data encryption and decryption algorithm was compared with published data for a cryptographic algorithm implemented on FPGA hardware, presented in the article by Niraj Kumar et al. (A. Kumar et al., 2019; Niraj Kumar et al., 2021). The use of hardware resources increases as the length of the encryption block increases, except for the memory used, since the calculation is performed iteratively for n bits. A larger key size increases the number of conversion cycles for the designed encryption and decryption chip, which increases the combinational path delay and the minimum and maximum clock synchronization times. The performance of the chip is estimated from the maximum frequency supported by the FPGA hardware.
The results of this work can be used in developing an embedded encryption block for modern stand-alone digital devices.
Figure 6. (a) Timing summary from the FPGA synthesis report; (b) FPGA hardware resource utilization.
With hardware implementation of such blocks as polynomial irreducibility checking, direct and inverse key generators, and high-performance polynomial multipliers based on matrix-pipeline circuits, high-speed data encryption devices can be built.

Conclusion
The encryption device was developed and tested on an FPGA using NPN. The device was implemented using multithreaded parallel computation, with the information blocks formed on the basis of a random number generator. This encryption method not only speeds up the entire process but also increases the cryptographic strength of the ciphertext.
The developed FPGA-based device for high-speed data encryption can be embedded into the architecture of a personal computer. It is also possible to build autonomous high-performance digital encryption devices with a hardware-implemented set of irreducible polynomials, autonomous key generation and an encryption block with a pipelined organization.