A Study on the Optimization of Blockchain Hashing Algorithm Based on PRCA

State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China
School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
School of Computer Science, Fudan University, Shanghai 201203, China
Zhengzhou University, Zhengzhou 450001, China


Introduction
Blockchain is a distributed ledger technology that originated in [1]. Initially, it was mainly used in the field of cryptocurrency, the most representative examples being Bitcoin and Litecoin [2], Monero [3], and Zcash [4]. With its rapid development, blockchain technology has come to be relied on to guarantee the authenticity, security, and reliability of data, and it has been widely applied to medical data [5], personal data protection [6], and data allocation schemes [7]. As the basic unit of a blockchain, a block consists of a block header containing metadata and a block body containing transaction data. The header data link the block to the previous block and index the data through the block hash value. Each blockchain transaction is carried out through hash function interaction, which guarantees the security of the blockchain.
However, with the continuous development of blockchain, its security issues have become increasingly prominent. The lightweight hash function SHA1 used in blockchain is no longer considered able to withstand an attacker with sufficient funds and computing resources. SHA256 can replace SHA1 for information exchange and has good collision resistance, but the algorithm cannot be changed at will: to avoid breaking the chain, the hash values of all blocks after the affected block must be modified at the same time. As a result, a large amount of computation is needed, and the security of the blockchain is not guaranteed.
During execution, the PRCA (Proactive Reconfigurable Computing Architecture) generates the optimal set of computation structures through self-perception and dynamic selection. All software and hardware variants are dynamically variable. Therefore, during application processing, the architecture can select among functionally equivalent solutions with different computing efficiency according to the independent variables in the program, and thus obtain an optimal solution set [8]. Combined with blockchain, PRCA can improve the performance of the algorithm, improve transmission efficiency, and enhance the security of the hash algorithm.
This paper proposes an optimization scheme for the blockchain hashing algorithm based on PRCA. Targeting the structure of the blockchain hash algorithm, a high-performance reconfigurable hash algorithm is implemented in a fully pipelined way. At the same time, 10,000 Mbps communication is realized on the mimic computer to reduce data transmission delay, and data is read from memory by DMA, which improves transmission efficiency. In each transaction, the hash algorithm is negotiated and the mimic computer is reconfigured, so that the structure of the hash algorithm is transformed by applying lightweight hash algorithms multiple times. This scheme not only improves the efficiency of blockchain data processing but also increases its security.

Proactive Reconfigurable Computing Architecture

Definition of Proactive Reconfigurable Computation.
PRCA is an operation mechanism based on a multidimensional reconfigurable functional structure and dynamic multibody. When proactive reconfigurable computation processes data, execution structures such as computing, storage, and interconnection change dynamically with the efficiency of transaction processing, rather than the algorithm alone being improved on unchanged hardware. There are many functional equivalents in PRCA, each realized by combining a different hardware structure with the algorithm. The purpose is to achieve high computing performance, that is, to automatically perceive variables, generate the optimal computing set, and autonomously reconstruct the computation while the algorithm is being processed [9].
PRCA has variable infrastructure and algorithm, which makes it possible to obtain optimal solutions to different problems. It pursues different services and comprehensive high performance under different loads or other conditions, builds the most appropriate processing components, and forms the most appropriate architecture. Proactive reconfigurable computation combines the advantages of general computing and special computing to achieve the goal of solving problems efficiently. In terms of the general computing structure, it is characterized by its determined structures and variable algorithm and may calculate any computable problems with high efficiency. Its principle is shown in Figure 1.

Proactive Reconfigurable Computer.
The proactive reconfigurable computer is a new type of computer developed according to the principle of mimic computing to achieve high computing performance. Its computational structure can be regarded as a high-order function: when a computation is analyzed, the computational structure perceives the independent variables and generates the most efficient set of solution structures. The essence of the proactive reconfigurable computer is therefore the functionalization of computational structure. Its high performance and efficiency make it well suited to the processing and analysis of big data. Compared with a traditional computer, the energy efficiency of the proactive reconfigurable computer is improved by more than 10 times. The structure of the principle prototype of the proactive reconfigurable computer is shown in Figure 2. The purpose of the proactive reconfigurable computer is to handle intensive computing. It consists of an ATOM general-purpose microprocessor, four large-scale reconfigurable FPGAs, and DDR3 memory, which are connected in a FULL-MESH topology over the LVDS bus through the baseboard GTX, controlled by the control unit BMC, and synchronized by a clock synchronization unit. The prototype supports multiple interfaces and storage media and reconfigures the FPGA processing core, I/O interface, and on-chip interconnection network according to the application requirements, so as to achieve high-efficiency computing [10].
Proactive reconfigurable computers use dynamic randomness to build an asymmetric defense system, which expands the attack surface to weaken intrinsic attacks based on feature sniffing and state transition [11]. Based on this characteristic, 10,000 Mbps communication is realized on the FPGA to reduce data transmission delay, build a mimic hash structure, and speed up the hash calculation of blockchain data. A Merkle tree is formed to match the algorithm, which makes it difficult for attackers to determine the complexity of the target and improves the security of the system [12]. The protection provided by the computer hardware is used to expand the attack surface, increase the difficulty of attacking the blockchain, and improve its resistance to attack.

Optimization of Blockchain Hash Algorithms Based on PRCA

System Framework and Block Structure.
The proactive reconfigurable computer is configured as a node in the blockchain network. Users establish a connection with the proactive reconfigurable computer. It caches the data in DDR memory and uses an asynchronous FIFO to transfer data directly and at high speed from the network to memory, reducing the number of intermediate transmission levels. A high-performance hash algorithm is implemented as a pipeline, and the key data segments to be hashed are fetched from memory [13]. After the hash value is calculated, the result is encapsulated and transferred to the storage server to complete the storage of the blockchain. The system framework is shown in Figure 3.
The block stores all the information about transactions, including the generation time of each transaction, its record index number, its hash value, the Bitcoin expenditure address and amount, and other types of transaction data. A Merkle value is generated for the transactions. The hash node value in the transaction ensures that an address cannot be traded repeatedly or forged. To further improve the security of transactions, a proactive reconfigurable hash is added to the blockchain. It is composed of hash algorithms of various types and structures, which can be used separately or in series. The concrete structure model is shown in Figure 4.
Unit nodes in blockchains monitor network traffic to calculate transaction volume [14]. Before the transaction is generated, the hash algorithm selection step will be added, and then the appropriate hash function will be selected from the hash list. e unit node uses the selected hash function to compete to find the hash value. Once the hash value is found, the block will be propagated to another node in the blockchain for verification.
In the interaction, the sensor layer collects data on site. The sensor transmits data to the unit nodes and requests a transaction to store the data. If the unit nodes successfully complete the transaction mining, the blockchain network updates the block. After that, the blockchain network returns the field layer data to the control layer. Then block mining is started. After the block mining is finished, the blockchain network receives the transaction node and broadcasts the block and a validation request to the other nodes. The other nodes verify the block using the hash algorithm confirmed in the block header. After successful verification, they update the block and store the nodes and blocks. If the content of a transaction is data or commands to be transferred, the requested node transfers the data or command to the other layers. The block mining and updating process is shown in Figure 5.
At the same time, the random number generator selects a new hash algorithm at random intervals, and the two sides renegotiate and switch to the new hash algorithm to improve security.

Hash Algorithm
Hashing applies a hash function to data and computes a relatively unique output for input of almost any size. It allows individuals to independently take the input data, hash it, and obtain the same result, proving that the data has not changed. SHA256 is taken as an example to illustrate the optimization and implementation of the hash algorithm on proactive reconfigurable computers. The throughput of the algorithm determines its computational performance and is expressed by equation (1).
In equation (1), T is the throughput, B denotes the data block size, f is the maximum clock frequency, N is the number of pipeline stages, and d denotes the calculation delay. Throughput increases with the number of pipeline stages and with the clock frequency. To improve the throughput of the algorithm, prediction and CSA strategies are used to reduce the delay of the critical paths, and the SHA1 and SHA256 algorithms are implemented as full pipelines. The following describes the optimization of SHA256, which can be extended to SHA1.
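Equation (1) itself does not appear in the text above. A form consistent with the symbol definitions given here, and commonly used for pipelined hash cores, is sketched below; it is an assumption rather than a formula confirmed by the source, and it takes d to be measured in clock cycles.

```latex
% Assumed form of equation (1): throughput of a pipelined hash core.
% B: data block size (bits), f: maximum clock frequency (Hz),
% N: number of pipeline stages, d: calculation delay (clock cycles).
T = \frac{B \times f \times N}{d}
```

Under this form, a full pipeline in which N is close to d delivers roughly B bits per clock cycle, which matches the later statement that one data item is completed per cycle once the pipeline is full.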

SHA256.
For messages with a length of less than $2^{64}$ bits, the hash algorithm SHA256 produces a 256-bit hash value, which is called a message digest. The digest is a 32-byte array that can be represented by a hexadecimal string of length 64. The processing of the SHA256 algorithm is divided into five steps. In the first, the input data is padded with a single 1 bit followed by 0 bits until its length is congruent to 448 modulo 512, and the 64-bit message length is then appended so that the total length is a multiple of 512 bits. In the round computation, $\Sigma_1(E_t)$, $\Sigma_0(A_t)$, $\mathrm{Maj}(A_t, B_t, C_t)$, and $\mathrm{Ch}(E_t, F_t, G_t)$ are logical functions, and $W_t$ is updated according to the message schedule. From the processing of the SHA256 algorithm, it can be seen that the key is to update the values of A and E, which requires multiple addition operations and 64 rounds of iteration. Therefore, optimizing these two operands plays an important role in reducing the time consumption of the algorithm.
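To make the padding, message schedule, and A/E updates concrete, the following is a minimal software sketch of standard SHA-256 as specified in FIPS 180-4. It is illustrative only and is not the paper's FPGA implementation; the helper names (`_primes`, `_icbrt`, `pad`, `rotr`) are chosen here for the example, and the round constants are derived from prime roots instead of being typed in.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

def _primes(count):
    """Return the first `count` primes by trial division."""
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found

def _icbrt(n):
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

PRIMES = _primes(64)
# First 32 fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
# First 32 fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in PRIMES]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def pad(message):
    """Append 0x80, zero bytes up to 448 mod 512 bits, then the 64-bit length."""
    bit_len = len(message) * 8
    message += b"\x80"
    message += b"\x00" * ((56 - len(message)) % 64)
    return message + struct.pack(">Q", bit_len)

def sha256(message):
    h = list(H0)
    data = pad(message)
    for off in range(0, len(data), 64):
        # Message schedule W_0..W_63.
        w = list(struct.unpack(">16I", data[off:off + 64]))
        for t in range(16, 64):
            s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
            s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
            w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
        a, b, c, d, e, f, g, hh = h
        for t in range(64):
            sig1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (hh + sig1 + ch + K[t] + w[t]) & MASK
            sig0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (sig0 + maj) & MASK
            # Only A and E need additions; the other registers just shift.
            a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(struct.pack(">I", x) for x in h)
```

The tuple assignment in the round loop shows why A and E dominate the cost: every other register is a plain copy, while the new A and E values each depend on a chain of 32-bit additions.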

Critical Path Segmentation Optimization.
The time consumption of the SHA256 operation lies mainly in the iteration part of Step 4, and the most time-consuming part is the calculation of the A and E values. Therefore, adopting critical path segmentation and combining it with the parallelism of FPGA computing resources can effectively shorten the time consumption.
$H_t$, $K_t$, and $W_t$ in the critical path do not need additional logical operations and do not depend on other operands of the current round. Therefore, they can be summed separately in an intermediate value $S_t$, which splits the critical path. In this way, the update of the A and E values is shortened from the original $6t_{ADD}$ and $5t_{ADD}$ to $4t_{ADD}$ and $3t_{ADD}$, where $t_{ADD}$ denotes the time consumption of an addition operation.
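The split formulas are not reproduced in the text. The reconstruction below is an assumption: it starts from the standard SHA-256 round update and introduces the intermediate value $S_t$ so that the addition counts match those stated above (6 and 5 additions before the split; 2, 4, and 3 additions for $S_t$, $A_{t+1}$, and $E_{t+1}$ afterwards).

```latex
% Standard round update (6 additions for A, 5 for E):
\begin{aligned}
A_{t+1} &= H_t + \Sigma_1(E_t) + \mathrm{Ch}(E_t,F_t,G_t) + K_t + W_t
           + \Sigma_0(A_t) + \mathrm{Maj}(A_t,B_t,C_t), \\
E_{t+1} &= D_t + H_t + \Sigma_1(E_t) + \mathrm{Ch}(E_t,F_t,G_t) + K_t + W_t.
\end{aligned}

% Assumed split: S_t does not depend on the current round's logic functions.
\begin{aligned}
S_t     &= H_t + K_t + W_t, \\
A_{t+1} &= S_t + \Sigma_1(E_t) + \mathrm{Ch}(E_t,F_t,G_t)
           + \Sigma_0(A_t) + \mathrm{Maj}(A_t,B_t,C_t), \\
E_{t+1} &= S_t + \Sigma_1(E_t) + \mathrm{Ch}(E_t,F_t,G_t) + D_t.
\end{aligned}
```

Because $S_t$ uses only values that are available before the round's logical functions are evaluated, it can be computed in a separate pipeline stage, leaving 4 and 3 additions on the A and E paths.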

Minimum Addition Optimization.
FPGAs are well suited to bit operations. The Carry-Save Adder (CSA) strategy can reduce the number of addition operations, minimize the critical path length, and ensure pipeline throughput. For n-bit binary numbers a, b, and c, the CSA operation is as follows:
$$\mathrm{CSA}(a, b, c) = S(a, b, c) + Ca(a, b, c)$$
where S(a, b, c) is the sum word and Ca(a, b, c) is the carry word. After the critical paths are divided, it takes $2t_{ADD}$, $4t_{ADD}$, and $3t_{ADD}$ to calculate $S_t$, $A_{t+1}$, and $E_{t+1}$, respectively. Since addition is time-consuming on the FPGA, the CSA method is used to replace additions with bit operations so that the total time consumption is reduced. With the critical path partitioning method and the CSA strategy, formulas (4) to (6) are rewritten in terms of CSA operations, which reduces the computation of $A_{t+1}$ and $E_{t+1}$ to only $2t_{ADD}$ and thus improves the efficiency of the algorithm.
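The carry-save idea can be illustrated in software. The sketch below assumes the usual definitions of the sum word (bitwise XOR of the three operands) and the carry word (bitwise majority shifted left by one), which the text does not spell out; the function names are chosen for this example. It models 32-bit arithmetic as used in SHA256 and is not a hardware description.

```python
MASK = 0xFFFFFFFF  # SHA256 works on 32-bit words

def csa(a, b, c):
    """Reduce three operands to a (sum, carry) pair using only bit operations."""
    s = a ^ b ^ c                                         # bitwise sum without carries
    carry = (((a & b) | (a & c) | (b & c)) << 1) & MASK   # carries moved one bit left
    return s, carry

def add3(a, b, c):
    """a + b + c mod 2^32 with a single carry-propagating addition."""
    s, carry = csa(a, b, c)
    return (s + carry) & MASK

def add_many(*operands):
    """Sum any number of 32-bit operands, ending with one real addition."""
    ops = list(operands)
    while len(ops) > 2:
        s, carry = csa(ops[0], ops[1], ops[2])
        ops = [s, carry] + ops[3:]
    return sum(ops) & MASK
```

Each `csa` call replaces one carry-propagating addition with a layer of XOR/AND/OR gates, so a multi-operand sum such as the A or E update needs only one full adder at the end of the chain.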

Pipeline Optimization.
After the critical path is partitioned, the time consumption of the longest path is reduced. For serial computing, however, the total time consumption does not decrease. Therefore, it is necessary to use the parallelism of the FPGA and pipelining for optimization, so as to truly reduce the total time consumption of computing.
According to the characteristics of the SHA256 algorithm and the optimization of the critical path, the core processing of the algorithm is divided into three modules: the W module, the split S module, and the A-H update module. Pipelining reduces time consumption by increasing resource utilization. Each module therefore needs 64 computing units, for a total of 192 computing units.
During computation, the first data item is input to the $W_0$ computing unit in the first clock cycle. In the second clock cycle, the output of $W_0$ is taken as the input of $S_0$, and $W_1$ is calculated. At the same time, the second data item is input to $W_0$. In the third clock cycle, three computing units work in parallel, and so on. In the 66th clock cycle, when all 192 units are running, the output for the first data item is completed. When there is a large amount of data to be computed, one data item is completed per clock cycle, which removes the time consumed by the 64 iterations of the algorithm. Therefore, the throughput and resource utilization of the algorithm are greatly improved. The pipeline structure of the SHA256 algorithm is shown in Figure 6.
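The cycle counts described above can be checked with a small timing model. This is a sketch under stated assumptions: the pipeline latency is taken as 64 rounds plus the two extra cycles of the W, S, update chain (giving the 66 cycles mentioned in the text), and the non-pipelined baseline of one round per cycle is a hypothetical comparison point, not a figure from the paper.

```python
ROUNDS = 64    # SHA256 rounds, one computing unit per round in each module
MODULES = 3    # W module, S module, and A-H update module

def pipeline_latency(rounds=ROUNDS, modules=MODULES):
    """Cycles until the first result leaves the pipeline."""
    # The first round passes through W, S, and update units in `modules`
    # cycles; every further round adds one cycle.
    return rounds + modules - 1

def pipelined_cycles(num_blocks):
    """Total cycles when a new data item enters the pipeline every cycle."""
    if num_blocks == 0:
        return 0
    return pipeline_latency() + (num_blocks - 1)

def iterative_cycles(num_blocks, cycles_per_block=ROUNDS):
    """Hypothetical baseline: one shared round unit, one round per cycle."""
    return num_blocks * cycles_per_block
```

For 1000 data items this model gives 1065 cycles for the pipeline against 64000 for the iterative baseline, which illustrates why throughput approaches one item per clock cycle once the pipeline is full.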

Communication Optimization.
To support blockchain hash computation, the proactive reconfigurable computer is structured as shown in Figure 7. It mainly includes the Hash_Core, I_10G, CTL_DDR3_0/1, State_U, Ctl_Core, and I_1G modules. The functions of each module are as follows:
(i) Hash_Core module. The core processing module of hash computing is mainly responsible for the hash calculation of blockchain data. It is implemented in full-pipeline mode and supports hash calculation with SHA1, SHA256, and so forth.
(ii) I_10G module. The 10-gigabit data communication interface circuit mainly includes the 10-gigabit MAC interface, a data buffer, and the interface to the on-chip modules. The module is mainly responsible for the input of data to be processed and the return of calculation results.
(iii) CTL_DDR3_0 module. The DDR3-based data communication interface circuit mainly includes the DDR3 interface, a data buffer, and the interface to the on-chip modules. This module is mainly responsible for reading data from memory.
(iv) CTL_DDR3_1 module. The DDR3-based data communication interface circuit mainly includes the DDR3 interface, a data buffer, and the interface to the on-chip modules. This module is mainly responsible for writing data to memory.
(v) State_U module. This module acquires the on-chip state of each module and outputs it to Ctl_Core.
(vi) Ctl_Core module. The processor-based on-chip control core is mainly responsible for reporting the running state of the mimic computer and processing the control information.
Block data are cached to CTL_DDR3_0 via the I_10G network interface, hash values are read and calculated by Hash_Core, and the results are cached in CTL_DDR3_1 and finally sent to the network by I_10G. The host computer controls the proactive reconfigurable computer in real time through the I_1G gigabit interface and Ctl_Core according to the information reported by State_U.

10G Network.
The 10G network is implemented on the IP protocol, and the content of data transmission is controlled by external users. It uses a FIFO interface to communicate with external devices [15]. While control messages are being transmitted, if the receiver does not return an ARP response, the system issues a timeout error; if transmissions time out, the system reports the number of timed-out transmissions. If the transmission succeeds, a success message is returned; if it fails, an error message indicating a retransmission timeout is returned. If there is a timeout and no information is received, the system raises a communication channel error signal, according to which the user can take appropriate action. The whole structure is shown in Figure 8.
In Figure 8, the sending port includes two FIFOs: the sending data FIFO (ip_snd_fifo) and the sending status FIFO (ip_snd_status_fifo). The sending data FIFO is 65 bits wide, and its low 64 bits are the data interface. The highest bit indicates whether the data word is the last one of the transmission. If more than 1440 bytes of data are to be transmitted, multiple transfers are required. The sending status FIFO is used to identify whether there is an error in the data transmission. If an error such as a timeout occurs during data transmission, all subsequent contents are read out until the last one. Each data transmission corresponds to one write to the status FIFO. The receiving port has only one FIFO, the receiving data FIFO (ip_rec_fifo), which is also 65 bits wide with the low 64 bits as the data interface. The highest bit indicates whether the data is the last frame of the transmission, and the data received is identified by index number.
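The 65-bit FIFO word format described above can be modeled as follows. This is a hypothetical sketch: the text does not state the byte order, how a short final word is padded, or whether the top bit marks the last word of each transfer or only of the whole message, so big-endian packing, zero padding, and a per-transfer flag are assumptions made here for illustration.

```python
MAX_TRANSFER_BYTES = 1440   # payload limit per transfer, from the text
WORD_BYTES = 8              # low 64 bits of each FIFO word carry data
LAST_FLAG = 1 << 64         # bit 64 (the highest of 65 bits) marks the last word

def to_fifo_words(payload):
    """Split a payload into transfers, each a list of 65-bit FIFO words."""
    transfers = []
    for off in range(0, len(payload), MAX_TRANSFER_BYTES):
        chunk = payload[off:off + MAX_TRANSFER_BYTES]
        # Assumption: pad the final word of a transfer with zero bytes.
        chunk += b"\x00" * (-len(chunk) % WORD_BYTES)
        words = [int.from_bytes(chunk[i:i + WORD_BYTES], "big")
                 for i in range(0, len(chunk), WORD_BYTES)]
        words[-1] |= LAST_FLAG   # flag the last word of this transfer
        transfers.append(words)
    return transfers
```

A 3000-byte payload, for example, is split into three transfers of 180, 180, and 15 words, and only the final word of each transfer carries the flag bit.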

Memory Management.
Memory reads and writes are implemented with four groups of FIFOs in burst mode. Before each memory read or write, the memory address range is calculated from the length of the data and stored in wrrdinfo_fifo. At the same time, the data is cached in wfifo_fifo, and the read-write arbitration module uses the information in wrrdinfo_fifo to determine whether the operation is a read or a write. For a write operation, the data is written to memory through the DDR write module. Reading memory data follows a similar process: the read information and data are cached in out_rdinfo_fifo and rififo_fifo, respectively. The read-write structure of memory is shown in Figure 9, where the request information FIFOs wrrdinfo_fifo and out_rdinfo_fifo are 16 * 64 bits in size, and the read and write data FIFOs wfifo_fifo and rififo_fifo are 4096 * 64 bits in size.
When the initialization of memory is completed, that is, when phy_init_done is set to 1, the CTL_DDR3_0 and CTL_DDR3_1 modules enter the read-write state, and the jump between read and write states is made according to the wrrdinfo_q[0] identifier bit, as shown in Figure 10. When a memory read or write begins, the memory address is computed according to the length of the data until the whole transfer is completed. After the read or write operation is completed, the module jumps to the idle state and waits for the next operation.

Application of PRCA Blockchain.
Public and private keys in blockchains are a key pair generated by an algorithm. Data encrypted with the public key is decrypted with the corresponding private key. After three SHA256 computations and one RIPEMD160 computation on the public key, a public key hash is obtained, and the address is finally obtained through base58 encoding [16]. A Merkle tree is a tree structure: every transaction in the blockchain is hashed, and the final root is the Merkle root [17]. Proof-of-work (PoW) is called mining in blockchains. The CPU relies on the computational cost of the hash operation to establish PoW, searching for a hash value smaller than the specified target [18]. The block filter proposed for the blockchain is a fast search structure based on hash functions, which can quickly determine whether a retrieved value exists in the searched set [19]. The application of hash algorithms in blockchain is shown in Figure 11.
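Two of the hash applications listed above, the Merkle root and proof-of-work, can be sketched with Python's standard `hashlib`. The sketch assumes Bitcoin-style conventions (double SHA256, duplicating the last node on odd levels); the 8-byte nonce encoding and the `max_nonce` limit are arbitrary choices for this example and are not taken from the paper.

```python
import hashlib

def dsha256(data):
    """Double SHA256, as used for Bitcoin transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(transactions):
    """Hash every transaction, then pair and rehash until one root remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [dsha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2:               # odd count: duplicate the last node
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def mine(header, target, max_nonce=1_000_000):
    """Search for a nonce whose block hash is numerically below the target."""
    for nonce in range(max_nonce):
        digest = dsha256(header + nonce.to_bytes(8, "big"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    return None
```

Changing any single transaction changes the Merkle root, and lowering `target` increases the expected number of hash evaluations, which is the workload that the pipelined hash core is meant to accelerate.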
In this paper, the communication equipment and network are optimized. In a relatively safe environment, a relatively simple and lightweight hash algorithm is chosen in place of a complex one, so as to improve the running speed of the system and reduce its energy consumption. Meanwhile, multiple hash algorithms are used to mitigate length-extension attacks and to ensure the integrity and tamper resistance of information, which reflects the security performance of the blockchain.

Experimental Analysis
In this paper, a proactive reconfigurable computer is used for the experiments. The software platform is ISE, which integrates design, simulation, synthesis, routing, and bitstream generation. First, the hash algorithm is deeply optimized, and its running speed and resource utilization are compared with those of a CPU. Second, the collision resistance of proactive reconfigurable hashes is analyzed. Finally, the security of this scheme is analyzed from several aspects. The configuration information of each computing unit used in the experiment is shown in Table 1.

Performance Analysis.
The SHA256 and SHA1 algorithms are each implemented on the proactive reconfigurable computer. Their resource occupation, frequency, and throughput are shown in Table 2.
As seen from Table 2 and Figure 12, SHA256 and SHA1 implemented in a pipelined manner occupy less than 10% of the resources while achieving high throughput.
The performance of SHA256 and SHA1 on the proactive reconfigurable computer and on a CPU is compared in Table 3.
From Table 3, it can be seen that the proactive reconfigurable computer can run multiple modules in parallel and can fully meet the application requirements of hash computing in blockchain. Taking the Bitcoin three-hash computation as an example, three SHA256 units are connected in series to form a cascaded pipeline. The data can be input directly into the pipeline without waiting, and the results are output sequentially at the end, which is very efficient. By contrast, a CPU can only rely on multithreaded concurrency to improve computing performance; its execution is still essentially serial, so it is not well suited to blockchain applications that require large amounts of computing.
Meanwhile, the proactive reconfigurable computer is equipped with a 10-gigabit network whose peak data transmission rate is about 10 Gbps, which can meet the communication requirements of high-frequency blockchain transactions. Each clock cycle transmits 8 bytes of data at a clock frequency of 156.25 MHz. Since the FIFO interface on the DDR side is also 8 bytes wide at 156.25 MHz, the data transmitted by the 10G network can be synchronized through the FIFO cache and written into memory 64 bytes at a time at 300 MHz. Two memory modules are configured: one handles the write operations of the 10G network and the read operations of the hash module, and the other handles the write operations of the hash module and the read operations of the 10G network. The two memory modules work independently, which improves the efficiency of data transmission.
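The interface figures quoted above can be checked with a short calculation. The memory-side number is a peak value derived only from the stated 64-byte width and 300 MHz clock; it is not a measured result from the paper.

```latex
% Network side: 8 bytes per cycle at 156.25 MHz
8\ \text{B} \times 8\ \tfrac{\text{bit}}{\text{B}} \times 156.25 \times 10^{6}\ \text{s}^{-1}
  = 10 \times 10^{9}\ \text{bit/s} = 10\ \text{Gbps}

% Memory side: 64 bytes per cycle at 300 MHz (peak)
64\ \text{B} \times 8\ \tfrac{\text{bit}}{\text{B}} \times 300 \times 10^{6}\ \text{s}^{-1}
  = 153.6 \times 10^{9}\ \text{bit/s} = 153.6\ \text{Gbps}
```

The first line confirms that the 8-byte, 156.25 MHz interface matches the 10 Gbps line rate, and the second shows that the memory interface has peak bandwidth well above it, so the FIFO only has to bridge the two clock domains.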

Antiattack Analysis.
A hash operation is irreversible and yields different values for different contents. Any change in the input information leads to a significant change in the hash result. Moreover, a hash operation is collision resistant; that is, it is computationally infeasible to find two pieces of information with the same hash result, which effectively prevents differential attacks [20].
Assuming that the output of the hash function is uniformly distributed and the message digest has m bits, the hash value has $n = 2^m$ possible outputs. For any k ($k \le n$) random inputs, the probability of at least one collision is
$$p(n, k) = 1 - \prod_{i=1}^{k-1}\left(1 - \frac{i}{n}\right) \approx 1 - e^{-k(k-1)/(2n)}.$$
If $p(n, k) \ge 0.5$, that is, $1 - e^{-k(k-1)/(2n)} = 1/2$ at the threshold, then $\ln 2 \approx k^2/(2n)$; this means that k is on the order of $\sqrt{n}$. According to this calculation, if the hash function has an output digest of m bits, then about $k = 2^{m/2}$ attempts result in a collision with a probability of at least 50%. SHA1 and SHA256 have output spaces of $2^{160}$ and $2^{256}$ values, respectively, so their collision thresholds are on the order of $2^{80}$ and $2^{128}$ attempts. Table 4 gives the collision threshold of each hash function.
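The birthday bound above can be evaluated numerically. The sketch below is illustrative: it compares the exponential approximation with the exact product for a small output space and then estimates the 50% collision threshold for a given digest length; the function names are chosen for this example.

```python
import math

def collision_probability(n, k):
    """Approximate probability of a collision among k uniform draws from n values."""
    return -math.expm1(-k * (k - 1) / (2 * n))

def exact_collision_probability(n, k):
    """Exact probability, practical only for small k."""
    p_no_collision = 1.0
    for i in range(k):
        p_no_collision *= (n - i) / n
    return 1 - p_no_collision

def attempts_for_half(m_bits):
    """Number of attempts k at which the approximate probability reaches 1/2."""
    n = 2 ** m_bits
    return math.sqrt(2 * math.log(2) * n)
```

For example, `math.log2(attempts_for_half(160))` is about 80.2 and `math.log2(attempts_for_half(256))` is about 128.2, in line with the $2^{m/2}$ rule of thumb for SHA1 and SHA256.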
Bitcoin obtains hash data through the SHA256 algorithm and runs two iterations in block trading to mitigate length-extension attacks. The PRCA blockchain system can be described by a triple $\Omega = \{\text{Block}, \text{Hash}, \text{Num}\}$, where "Block" represents the block data, "Hash" represents the hash algorithm, and "Num" represents the number of iterations. Over time, $\Omega$ admits many different hash combination schemes and can be represented at time t by $\Omega(t) = \{\text{Block}(t), \text{Hash}(t), \text{Num}(t)\}$, which is dynamic, diverse, and random. The hash algorithm of the PRCA blockchain system $\Omega$ is dynamically reconfigurable. After negotiation, the hash algorithm can be reconfigured dynamically and partially to switch between different algorithms. In addition, "Block" changes constantly, and the content of each transaction is unpredictable and completely different. Finally, "Num" can be negotiated by both sides to improve security by increasing the number of iterations without significantly increasing the amount of computation. The blockchain based on PRCA thus not only increases the complexity of the internal hash operation but also combines hashes to increase the output length, which greatly hinders attackers from extending the blockchain and reduces the probability of collision.
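The triple {Block, Hash, Num} can be illustrated with a small software model. This is a hypothetical sketch: the candidate list, the range of iteration counts, and the use of a seeded pseudorandom generator to stand in for the negotiation step are all assumptions made for the example and do not describe the paper's protocol.

```python
import hashlib
import random

# Hypothetical list of algorithms that both sides have agreed to support.
HASH_LIST = ["sha1", "sha256"]

def negotiate(rng):
    """Stand-in for negotiation: pick Hash(t) and Num(t) pseudorandomly."""
    return rng.choice(HASH_LIST), rng.randint(2, 4)

def reconfigurable_hash(block, hash_name, num):
    """Apply the selected hash algorithm `num` times to the block data."""
    digest = block
    for _ in range(num):
        digest = hashlib.new(hash_name, digest).digest()
    return digest
```

With `hash_name="sha256"` and `num=2` this reduces to the double SHA256 used by Bitcoin; other choices of the pair give a different digest for the same block, which is the variability the scheme relies on.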

Security Performance Analysis.
Encryption of information is the key link of blockchain and mainly involves hash functions and asymmetric encryption algorithms [21]. Asymmetric encryption uses a private key to prove the ownership of the node and is implemented by digital signature. A hash algorithm transforms input of any length into a fixed-length output consisting of letters and numbers, which is irreversible and tamper-proof. From the perspective of information security, the main advantages of this scheme are as follows:
(i) Multiple hash algorithms are used jointly to ensure the integrity and nontampering of information.
(ii) The hash algorithm is selected pseudorandomly and dynamically and is updated, which increases the difficulty of attack in the time dimension.
(iii) The hardware implementation on the proactive reconfigurable computer expands the attack surface and raises the attack threshold.
The blockchain based on PRCA therefore enhances the confidentiality, authenticity, and integrity of data and enhances the overall security of blockchain transactions through its reliability, security, and tamper resistance.

Figure 11: The application of hash algorithm in blockchain.

Conclusions
To improve the efficiency and security of the blockchain hash algorithm, this paper proposes a blockchain hash algorithm optimization scheme based on PRCA. The scheme combines blockchain with the proactive reconfigurable computer to improve the performance of the blockchain hash function. In terms of security, several lightweight hash algorithms are used to exchange information and ensure its integrity and tamper resistance. The proactive reconfigurable computer hardware is used to expand the attack surface, raise the attack threshold, and ensure the security of the blockchain.
Blockchain security is the most important part of the system and covers data, smart contracts, privacy protection, and application risk. Meanwhile, blockchain data is unique: as long as the blockchain itself is secure, written data cannot be changed. Based on the security problem of data immutability, the data structure,

Data Availability
The data used to support the findings of this study are available from the corresponding authors upon request.

Additional Points
Highlights. In this paper, a proactive reconfigurable computer is used for the experiments. The software platform is ISE, which integrates design, simulation, synthesis, routing, and bitstream generation. First, the hash algorithm is deeply optimized, and its running speed and resource utilization are compared with those of a CPU. Second, the collision resistance of proactive reconfigurable hashes is analyzed. Finally, the security of this scheme is analyzed from several aspects.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.