Novel randomization and iterative based algorithms for the transactions assignment in blockchain problem

This study addresses load balancing of transactions in a blockchain. The problem is how to assign a set of transactions to the available blocks so that the workload, measured as total estimated execution time, is balanced across the blocks. The problem is NP-hard, so the challenge is to develop algorithms that solve it approximately. In this paper, nine algorithms are proposed. These algorithms are based on the dispatching-rules method, randomization approach, clustering algorithms, and iterative method. They return approximate solutions in very short running times. In addition, a novel architecture composed of blocks is proposed. This architecture adds the component “Balancer”, which is responsible for calling the best of the proposed algorithms and solving the scheduling problem in polynomial time. The proposed work also helps users handle big data concurrency. The algorithms are coded and compared on three classes of instances generated from the uniform distribution, for a total of 1350 instances. The average gap, the execution time, and the percentage of instances on which the best value is reached are used as metrics to measure the performance of the proposed algorithms. The experimental results show that the best algorithm is best-mi-transactions iterative multi-choice, which reaches the best value in 93.9% of cases with an average running time of 0.003 s.


Introduction
Nowadays, the use of blockchain technology is increasing day after day [1]. Personal use of blockchain technology has also increased remarkably since 2016. Statistics published by several sources in 2020 show that there were more than 40 million blockchain wallets that year, compared with 10 million in 2016. Blockchain is integrated into several areas. In particular, cryptocurrencies make extensive use of blockchain technology. The security of cryptocurrencies in blockchain technology is studied in [2], which presents state-of-the-art challenges and future prospects. Various categories are presented in that work to show the importance of blockchain applied to cryptocurrencies.
In the face of a huge number of transactions, good management of these transactions optimizes the use of the blocks so that all requests are answered rapidly. Each transaction is characterized by its estimated execution time. In this paper, we focus on balancing the load of the transactions across the different blocks. The problem addressed in this paper is novel: it has not been proposed or studied elsewhere in the literature. A solution is proposed for a new industrial problem, and it can be used to balance blocks in blockchain technology.
This paper proposes nine algorithms to assign a large number of transactions to the given blocks while ensuring a fair distribution of execution time between blocks. These algorithms are based on the dispatching-rules method, randomization approach, clustering algorithms, and iterative method. The proposed algorithms return approximate solutions in very short running times.
The choice to use heuristics rather than other methodologies is based on the following advantages of heuristic-based algorithms:
1. Simplicity: Heuristics are usually simpler and easier to interpret and understand. Heuristic algorithms often consist of sets of logical rules that the researcher can manipulate easily.
2. Efficiency: Heuristic algorithms can be faster in execution time and need less memory than other models.
3. Robustness: Heuristics can also be more robust than other models.
4. Domain-specific knowledge: Heuristics can incorporate domain-specific knowledge and expertise, which can be difficult to capture in another model.
A comparative study is conducted to measure the effectiveness of each developed algorithm relative to the best solution found among all proposed algorithms.
The essential contributions of this work are as follows:
• The proposal and coding of nine algorithms to solve an NP-hard problem.
• The design of a novel architecture that adds a new component called "Balancer".
• The running time to obtain a solution to the studied problem is polynomial.
• There is no dominance between the algorithms: no single algorithm is best on every instance, so combining two or more algorithms can give a new result. This gives researchers flexibility to choose combinations and improvements that extend the work and make use of the proposed algorithms.
• The proposed algorithms are evaluated over different classes of instances.
• The proposed algorithms can be used as initial solutions for meta-heuristics that solve the studied problem.
The rest of the paper is organized into six sections. The state of the art is detailed and analysed in Section 2. Section 3 is reserved for the architecture composed of blocks: it presents a novel architecture, details its components, and introduces a new component called "Balancer". Section 4 defines the problem formulation, including all variables used and the modeling of the objective function. Section 5 presents approximate solutions through the nine proposed algorithms. Section 6 reports the results and discussion, where three classes of instances are tested to compare the performance of the proposed algorithms. Finally, the paper concludes with a summary in Section 7.

State of the art
In [3], the authors give different national and international perspectives regarding the use of cryptocurrencies. Cryptocurrency innovations such as Bitcoin are discussed in [4], which examines the non-trusted blockchain technology behind Bitcoin. These authors presented three characterizations of the digital money Bitcoin. In [5], the authors give an overview and future perspectives on the use of blockchain in cryptocurrency. In addition, they analyze current cryptocurrency projects and give different examples. In the same context, the authors of [6][7][8][9] presented further literature reviews on blockchain in cryptocurrency. Blockchain technology is applied in different domains, and several works have treated its applications. In [10][11][12], the authors presented surveys of blockchain applications in different domains. Contracts that are applied or executed fully or partially automatically, without human intervention, are called smart contracts [13]. Several works treated blockchain-based smart contracts. In [14], the authors extracted 24 papers from several scientific databases to track smart contract issues and analyse the impact of their use. Different smart contract applications are illustrated and directions for future studies are provided. In [15], the authors analyze 468 research papers on smart contracts and their 20,188 references. Six major streams related to the research area are identified using factor analysis. The authors of [16] explained the components and working fundamentals of smart contracts. In addition, they identified and analyzed diverse use cases of smart contracts, with an interest in using smart contracts in blockchain across several domains. Different algorithms related to the use of logic in smart contracts with blockchain systems are illustrated in [17].
In healthcare management, several previous research works and applications that use the blockchain approach are studied in [18]. Other surveys regarding contracts with blockchain have been carried out by several researchers [19,20]. In [21], the authors explain the combination of blockchain and the IoT. Their work is based on the sharing of services and resources in a cryptographically verifiable manner. This work shows that the blockchain-IoT combination can give a new vision of real-life applications. Another domain in which blockchain technology is applied is financial services. In this context, blockchain and its effect on financial sectors are treated in [22]. Load balancing is the core of the studied problem. Load balancing algorithms are widely studied in the literature in different domains [23][24][25][26][27].
A comprehensive study of the previous research work regarding this issue is presented in [28]. Several works detailed the financial services related to blockchain [29][30][31][32]. Supply chains that use blockchain are also studied in the literature. In [33], the authors show which boundary conditions must be met before blockchain can be applied. Eighteen boundary conditions were proposed. The shift of trust in the creation of supply chains via blockchain is studied in [34]. A case study of data management in supply chains using blockchain is presented in [35]. In our work, several algorithms are proposed to solve load balancing in the blockchain. Several works in the literature have also treated load balancing, also known as fair distribution or equitable distribution [36][37][38][39][40].
The proposed algorithms can be adopted to solve the parallel machine problem studied in [47][48][49][50][51][52]. Table 1 provides a summary of the related works detailed previously. Several other works treating the blockchain problem can be considered [46,53,54].
Other scheduling algorithms treated in [55][56][57] can be exploited to deal with the studied problem.
These previous works have several limitations, detailed as follows:
• Scalability: some algorithms cannot be scaled easily;
• Overhead: several of the presented load-balancing algorithms can introduce additional overhead;
• Insufficient accuracy: some load-balancing algorithms may not always accurately forecast the load on blocks;
• Limited applicability: some algorithms may only be applicable to certain types of transactions, servers, and models.
To overcome these limitations, this paper proposes nine novel algorithms to efficiently manage the assignment of a very high number of transactions to the available blocks. These algorithms are based on the dispatching-rules method, randomization approach, clustering algorithms, and iterative method. The proposed algorithms return approximate solutions in very short running times.

Architecture composed of blocks
The studied problem is based on the assignment of transactions to different blocks. Fig 1 illustrates an overview of this problem. As shown in this figure, the "Balancer" is a novel component added to the proposed architecture. This component is responsible for running the proposed algorithms and choosing the best one to decide how to assign the given transactions to the blocks. The "Balancer" solves an NP-hard problem. For this reason, the execution time of the algorithm is crucial and must be considered as one of the metrics for judging the efficiency of an algorithm.
The proposed architecture is composed of three components detailed as follows:
• Transactions
• Balancer
• Blocks
The "Transactions" component is the engine responsible for collecting all the given transactions that must be executed. This component also collects the information on each transaction, essentially its estimated running time. The second component is the "Balancer", which is the core of the architecture. It calls several algorithms to solve the problem posed by the given transactions. Finally, the "Blocks" component is composed of the available blocks.
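The role of the "Balancer" can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' C++ implementation: the class name `Balancer`, its methods `register` and `assign`, and the helper `greedy` are hypothetical, and candidates are compared by the total workload gap defined in the problem formulation, with running time used only to break ties. Two simple sort-then-greedy rules stand in for the registered algorithms.

```python
import heapq
import time


def total_gap(loads):
    """Sum of the differences between each block load and the smallest load."""
    smallest = min(loads)
    return sum(load - smallest for load in loads)


def greedy(run_times, n_blocks, longest_first):
    """Sort the transactions, then put each one on the least-loaded block."""
    order = sorted(range(len(run_times)), key=lambda j: run_times[j],
                   reverse=longest_first)
    heap = [(0, i) for i in range(n_blocks)]  # (current load, block index)
    assignment = [None] * len(run_times)
    for j in order:
        load, i = heapq.heappop(heap)
        assignment[j] = i
        heapq.heappush(heap, (load + run_times[j], i))
    return assignment


class Balancer:
    """Hypothetical Balancer: runs every registered algorithm, keeps the best."""

    def __init__(self):
        self.algorithms = {}

    def register(self, name, algorithm):
        self.algorithms[name] = algorithm

    def assign(self, run_times, n_blocks):
        best = None
        for name, algorithm in self.algorithms.items():
            start = time.perf_counter()
            assignment = algorithm(run_times, n_blocks)
            elapsed = time.perf_counter() - start
            loads = [0] * n_blocks
            for j, i in enumerate(assignment):
                loads[i] += run_times[j]
            candidate = (total_gap(loads), elapsed, name, assignment)
            # Smaller gap wins; running time only breaks ties (an assumption).
            if best is None or candidate[:2] < best[:2]:
                best = candidate
        gap, elapsed, name, assignment = best
        return {"algorithm": name, "gap": gap, "seconds": elapsed,
                "assignment": assignment}


balancer = Balancer()
balancer.register("longest-first", lambda rt, nb: greedy(rt, nb, True))
balancer.register("smallest-first", lambda rt, nb: greedy(rt, nb, False))
result = balancer.assign([7, 5, 4, 3, 3], 2)  # illustrative running times
print(result["algorithm"], result["gap"])
```

Keeping the algorithms behind a common callable interface means further heuristics can be registered without changing the selection logic.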

Problem formulation
Let Tr be a set of n_T transactions to be assigned to n_b blocks in a blockchain. Each transaction j has certain characteristics. The estimated running time rt_j of each transaction is fixed in advance. The cumulative estimated running time when transaction j is assigned is Cr_j, ∀j ∈ {1, . . ., n_T}. The total workload of each block i after the assignment is finished is Tw_i, ∀i ∈ {1, . . ., n_b}. The maximum completion time when all transactions are completed is Tw_max and is formulated in Eq 1.
$$Tw_{max} = \max_{1 \le i \le n_b} Tw_i \qquad (1)$$
The maximum completion time Tw max can be formulated as described in Eq 9.
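The total completion time gap is denoted by Gtw and shown in Eq 3.
$$Gtw = \sum_{i=1}^{n_b} \left( Tw_i - Tw_{min} \right) \qquad (3)$$
Here Tw_min denotes the smallest total workload among the blocks. The goal is to find a solution that minimizes the sum of the gaps between the block that has the minimum execution time and each of the other blocks. The objective is to minimize Gtw.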
We define the variable x ij as detailed in Eq 4.
The constraints of the proposed problem are detailed in Eqs 5, 6, 7 and 8.
$$x_{ij} \in \{0, 1\}, \quad \forall i \in \{1, \ldots, n_b\} \text{ and } \forall j \in \{1, \ldots, n_T\} \qquad (7)$$
Example 1 The objective is to find a solution that assigns all the transactions to the two given blocks. From Fig 2, it can be deduced that the first block has a total execution time of 22 and the second block has a total execution time of 17. The gap between the total execution times of block 1 and block 2 is Tw_1 − Tw_2 = 5. The principal goal of the work reported here is to minimize this gap. Therefore, a more efficient assignment, with a gap of less than 5, needs to be found. Several indicators can be chosen to calculate the gap between the blocks. In this paper, and for the above example, the following indicator is proposed: Tw_1 − Tw_2, where Tw_1 represents Tw_max.
Example 2 Assume that we have the same instance detailed in Table 2. The principal goal of the work reported here is to minimize this gap. The schedule in this example shows that a more efficient assignment exists, with a gap of less than 5, compared with the schedule in Example 1.
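The gap computation used in these examples can be checked with a short Python sketch. The function names `block_loads` and `total_gap` are illustrative, and the three-transaction instance at the end is hypothetical, since the instance of Table 2 is not reproduced here. The gap is taken as the sum, over all blocks, of the difference between the block workload and the smallest workload, which for two blocks reduces to Tw_max minus the other workload.

```python
def block_loads(run_times, assignment, n_blocks):
    """Total workload Tw_i of each block for a given assignment."""
    loads = [0] * n_blocks
    for j, i in enumerate(assignment):
        loads[i] += run_times[j]
    return loads


def total_gap(loads):
    """Gtw: sum over blocks of (Tw_i - smallest workload)."""
    smallest = min(loads)
    return sum(load - smallest for load in loads)


# Block workloads reported for Example 1: 22 and 17.
print(total_gap([22, 17]))  # 5

# Hypothetical instance: three transactions on two blocks.
loads = block_loads([6, 4, 3], [0, 1, 1], 2)
print(loads, total_gap(loads))  # [6, 7] 1
```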

Proposed algorithms
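In this section, we detail all the algorithms developed for the studied blockchain problem with constraints. Nine algorithms are proposed: Longest Transactions Time (LTT), Smallest Transactions Time (STT), Iterative Multi-choice Longest-Transactions Time (IML), Iterative Multi-choice Smallest-Transactions Time (IMS), Block-transaction Iterative Longest-Multi-choice (BIL), Block-transaction Iterative Smallest-Multi-choice (BIS), Mi-transactions Iterative Longest-Multi-choice (MIL), Mi-transactions Iterative Smallest-Multi-choice (MIS), and Best-mi-transactions Iterative Multi-choice (BIM).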

Longest Transactions Time (LTT)
This algorithm uses the dispatching-rule method: all transactions are sorted in non-increasing order of their estimated running time. After sorting, each transaction is scheduled in turn on the block that currently has the minimum total workload. The sorting algorithm used for the dispatching rule is heapsort. The complexity of this algorithm is O(n log n) [57].
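A minimal Python sketch of this dispatching rule is given below. It is an illustration, not the authors' C++ code: the function name `ltt` and its parameters are hypothetical, Python's built-in sort is used in place of heapsort, and a binary heap keeps track of the least-loaded block. The running times in the usage line are illustrative.

```python
import heapq


def ltt(run_times, n_blocks, longest_first=True):
    """Sort transactions by running time, then assign each one to the
    block with the minimum total workload so far."""
    order = sorted(range(len(run_times)), key=lambda j: run_times[j],
                   reverse=longest_first)
    heap = [(0, i) for i in range(n_blocks)]  # (workload, block index)
    heapq.heapify(heap)
    assignment = [None] * len(run_times)
    loads = [0] * n_blocks
    for j in order:
        load, i = heapq.heappop(heap)  # least-loaded block
        assignment[j] = i
        loads[i] = load + run_times[j]
        heapq.heappush(heap, (loads[i], i))
    smallest = min(loads)
    gap = sum(load - smallest for load in loads)
    return assignment, loads, gap


assignment, loads, gap = ltt([7, 5, 4, 3, 3], 2)
print(assignment, loads, gap)  # [0, 1, 1, 0, 1] [10, 12] 2
```

Calling the same function with `longest_first=False` sorts the transactions smallest first instead.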

Smallest Transactions Time (STT)
This algorithm is the same as LTT, except that the transactions are sorted in non-decreasing order of their estimated running time.

Iterative Multi-choice Longest-Transactions Time (IML)
This algorithm is based on a randomized choice between the transaction that has the longest running time and the transaction that has the second longest running time. This means that the choice is between the first two elements in the table of transactions after sorting. The choice is made with a probability μ: with probability μ, the transaction that has the longest running time is chosen, and with probability 1 − μ, the transaction that has the second longest running time is chosen. In practice, μ = 0.4. The procedure is repeated itn = 800 times and the best solution is kept. The procedure NIO() orders the transactions given as input in non-increasing order of their running time. The procedure SHE() schedules the transactions given as input on the most available block. In addition, Tr_1 denotes the transaction that has the longest running time and Tr_2 denotes the transaction that has the second longest running time. Algorithm 1 gives the steps of IML.
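The randomized choice described above can be sketched in Python as follows. This is a hedged reading of the description, not a transcription of Algorithm 1: the function names `iml` and `least_loaded`, the `seed` parameter, and the rule of scheduling the last remaining transaction directly are assumptions. The sort plays the role of NIO() and the least-loaded-block assignment plays the role of SHE().

```python
import random


def least_loaded(loads):
    """Index of the most available block (smallest workload)."""
    return min(range(len(loads)), key=lambda i: loads[i])


def iml(run_times, n_blocks, mu=0.4, iterations=800, seed=0):
    rng = random.Random(seed)  # seeded for reproducibility (assumption)
    # NIO(): non-increasing order of running time.
    order = sorted(range(len(run_times)), key=lambda j: run_times[j],
                   reverse=True)
    best = None
    for _ in range(iterations):
        remaining = list(order)
        loads = [0] * n_blocks
        assignment = [None] * len(run_times)
        while remaining:
            # With probability mu take the longest remaining transaction,
            # otherwise the second longest (if there is one).
            if len(remaining) == 1 or rng.random() < mu:
                j = remaining.pop(0)
            else:
                j = remaining.pop(1)
            i = least_loaded(loads)  # SHE()
            assignment[j] = i
            loads[i] += run_times[j]
        smallest = min(loads)
        gap = sum(load - smallest for load in loads)
        if best is None or gap < best[0]:
            best = (gap, assignment, loads)
    return best


gap, assignment, loads = iml([7, 5, 4, 3, 3], 2)  # illustrative instance
print(gap, loads)
```

Because each pass costs one sort-free sweep over the transactions, the total work grows linearly with the number of iterations, which keeps the running time small for the instance sizes considered.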

Iterative Multi-choice Smallest-Transactions Time (IMS)
This algorithm is similar to IML. The difference is that, in the first step, the transactions are sorted in non-decreasing order of rt_j.

Block-transaction Iterative Longest-Multi-choice (BIL)
This algorithm uses the randomization method. The first step is to sort all transactions in non-increasing order of rt_j. The second step is to schedule the n_b transactions that have the longest running time on the blocks. The third step concerns the remaining transactions, which are scheduled using a probability ϑ that governs the choice of the next transaction: with probability ϑ, the transaction that has the longest running time is chosen, and with probability 1 − ϑ, the transaction that has the second longest running time is chosen. In practice, ϑ = 0.4. The procedure is repeated itn = 800 times and the best solution is picked. The set LT contains the first n_b transactions that have the longest running time. Algorithm 2 gives the steps of BIL.
Algorithm listing (fragment):
    ...
    if (μ) then
        Call SHE(Tr_1)
    else
        Call SHE(Tr_2)
    end if
    if (k = n_T − n_b) then
        if (Tr_1 is not scheduled) then
            Call SHE(Tr_1)
        else
            Call
    ...

Block-transaction Iterative Smallest-Multi-choice (BIS)
This algorithm uses the randomization method. The first step is to sort all transactions in non-increasing order of rt_j. The second step is to schedule the n_b transactions that have the longest running time on the blocks. The third step concerns the remaining transactions, which are scheduled using a probability ϑ that governs the choice of the next transaction: with probability ϑ, the transaction that has the smallest running time is chosen, and with probability 1 − ϑ, the transaction that has the second smallest running time is chosen. In practice, ϑ = 0.4. The procedure is repeated itn = 800 times and the best solution is picked. TrS_1 denotes the transaction that has the smallest rt_j and TrS_2 denotes the transaction that has the second smallest rt_j. Algorithm 3 gives the steps of BIS.

Mi-transactions Iterative Longest-Multi-choice (MIL)
The first step is to sort all transactions in non-increasing order of rt_j. The second step is to schedule the n_T/2 transactions that have the longest running time on the blocks. The third step concerns the remaining transactions, which are scheduled using a probability ϑ that governs the choice of the next transaction: with probability ϑ, the transaction that has the longest running time is chosen, and with probability 1 − ϑ, the transaction that has the second longest running time is chosen. In practice, ϑ = 0.4. The procedure is repeated itn = 800 times and the best solution is picked. SHEST(L, TwI) denotes the procedure that schedules the transactions in the list L while taking into consideration the loads Tw_i of the blocks stored in the list TwI. This means that the blocks are not initially empty. The instructions of MIL are detailed in Algorithm 4.
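The three steps above can be sketched in Python as follows. This is an illustrative reading rather than a transcription of Algorithm 4: the function name `mil`, the parameter `n_fixed`, the `seed` argument, and the use of the floor of n_T/2 are assumptions. The inner loop corresponds to SHEST(L, TwI), since it starts from the block loads left by the deterministic first half.

```python
import random


def mil(run_times, n_blocks, theta=0.4, iterations=800, seed=0,
        n_fixed=None):
    rng = random.Random(seed)
    order = sorted(range(len(run_times)), key=lambda j: run_times[j],
                   reverse=True)
    if n_fixed is None:
        n_fixed = len(order) // 2  # assumption: floor of n_T / 2

    def argmin(loads):
        return min(range(len(loads)), key=lambda i: loads[i])

    # Step 2: schedule the longest transactions deterministically.
    base_loads = [0] * n_blocks
    base_assignment = [None] * len(run_times)
    for j in order[:n_fixed]:
        i = argmin(base_loads)
        base_assignment[j] = i
        base_loads[i] += run_times[j]

    # Step 3: randomized scheduling of the rest, repeated `iterations` times.
    best = None
    for _ in range(iterations):
        loads = list(base_loads)  # blocks are not initially empty
        assignment = list(base_assignment)
        remaining = order[n_fixed:]
        while remaining:
            if len(remaining) == 1 or rng.random() < theta:
                j = remaining.pop(0)  # longest remaining
            else:
                j = remaining.pop(1)  # second longest remaining
            i = argmin(loads)
            assignment[j] = i
            loads[i] += run_times[j]
        smallest = min(loads)
        gap = sum(load - smallest for load in loads)
        if best is None or gap < best[0]:
            best = (gap, assignment, loads)
    return best


gap, assignment, loads = mil([9, 8, 7, 6, 5, 4, 3, 2], 3)  # illustrative
print(gap, loads)
```

Under the same assumptions, passing `n_fixed=n_blocks` fixes only the first n_b transactions before the randomized step.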

Mi-transactions Iterative Smallest-Multi-choice (MIS)
The procedure is the same as in the MIL algorithm. The difference concerns the remaining transactions after the n_T/2 transactions that have the longest running time have been scheduled on the blocks. In the third step, with probability ϑ, we select the transaction that has the smallest running time instead of the transaction that has the longest running time.

Best-mi-transactions Iterative Multi-choice (BIM)
First, we call the iterative multi-choice longest-transactions time algorithm and the block-transaction iterative longest-multi-choice algorithm. The best of the two solutions is selected. Thus, we have the following equation:
$$Gtw = \min(Gtw_I, Gtw_B)$$
where Gtw_I and Gtw_B denote the total completion time gap returned by IML and BIL, respectively. Algorithm 5 gives the steps of BIM.

Results and discussion
In this section, we detail the results obtained after running all proposed algorithms. The proposed algorithms were coded in C++ using Visual Studio 2019. The hardware used is an Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz 2.40 GHz with 8 GB of RAM. The operating system is Windows 10 Enterprise, 64-bit, version 22H2. The experimental results show several advantages of the proposed algorithms:
• The execution time to obtain a solution to the studied problem is polynomial.
• There is no dominance between the algorithms: no single algorithm is best on every instance, so combining two or more algorithms can give a new result. This gives researchers flexibility to choose combinations and improvements that extend the work and make use of the proposed algorithms.
• The proposed algorithms are evaluated over different classes of instances.
• The proposed algorithms can be used as initial solutions for meta-heuristics that solve the studied problem.
Despite the advantages of the proposed algorithms, the research work has some limitations:
• The work develops novel algorithms and an architecture based on the "Balancer" component. However, it does not give a lower bound for the problem against which the results can be measured. Measuring the distance to a lower bound is important to know how far the proposed algorithms are from the optimal solution.
• Some proposed algorithms give a high average gap. The average gap can be improved by applying other methods.
• The proposed algorithms are not tested on large-scale instances.
• Other distributions, such as the normal and binomial distributions, can be tested to generate the instances.
Several instances were generated to evaluate the proposed algorithms. The algorithms are compared with each other and measured using different metrics. The instances tested in this section are generated as in [38]. Three classes are selected to test the proposed algorithms. These classes are based on the uniform distribution UnD[.]. The rt_j of transaction j was generated as:
• Class CS1: rt_j ∈ UnD[5,15].
The number of transactions n_T varies in {10, 15, 20, 25, 30, 35, 40, 45, 50} and the number of blocks n_b varies in {2, 3, 4, 5, 6}. For each number of transactions, each number of blocks, and each class, 10 instances were generated. So, in total, we have 9 × 5 × 3 × 10 = 1350 instances. Three metrics are proposed to show the performance of the given algorithms. They rely on G, the gap between the best value obtained after executing all algorithms (Gtw_b) and the value obtained by the presented algorithm (Gtw), with G = 0 if Gtw = 0. These metrics are:
• GP: the average of G over a fixed number of instances.
• Time: the average running time in seconds. The symbol "+" is shown when the average running time is less than 0.001 s.
• Pge: the percentage of instances where Gtw = Gtw_b among all 1350 instances.
Table 3 gives an overview of the results for all proposed algorithms. From this table, we can see that the best algorithm is BIM, which reaches the best value in 93.9% of cases, with an average gap of 0.03 and an average running time of 0.003 s. The second best algorithm is IML, with a percentage of 89.5%, an average gap of 0.05, and an average running time of 0.002 s. Table 4 presents the GP variation for all proposed algorithms according to n_T. This table shows that, for the best algorithm BIM, the best average gap of less than 0.001 is obtained when n_T ∈ {15, 30}, whereas the highest average gap of 0.09 is obtained when n_T = 40. For the MIS algorithm, the best GP value of 0.38 is obtained when n_T = 40. For the IML algorithm, the best GP of 0.01 is reached when n_T = 15. Table 5 shows the GP variation for all proposed algorithms classified by n_b. For the best algorithm BIM, the minimum GP of less than 0.001 is obtained when n_b = 2 and the maximum GP of 0.06 is obtained when n_b = 3. For the IML algorithm, the maximum GP is 0.09. The maximum values of GP are obtained by STT.
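The gap-based metrics defined in this section can be sketched in Python as follows. The exact expression of G is not reproduced in the text, so the relative form G = (Gtw − Gtw_b) / Gtw used here is an assumption, chosen because it is consistent with the stated special case G = 0 when Gtw = 0. The function names and the small input table are hypothetical and are not results from the paper.

```python
def gap_ratio(gtw, gtw_best):
    """G for one instance (assumed form: (Gtw - Gtw_b) / Gtw, 0 if Gtw = 0)."""
    if gtw == 0:
        return 0.0
    return (gtw - gtw_best) / gtw


def summarize(results):
    """results maps an algorithm name to its list of Gtw values, one per
    instance. Returns GP (average G) and Pge (percentage of instances
    where the algorithm reaches the best value) for each algorithm."""
    names = list(results)
    n_instances = len(results[names[0]])
    best = [min(results[a][k] for a in names) for k in range(n_instances)]
    summary = {}
    for a in names:
        ratios = [gap_ratio(results[a][k], best[k])
                  for k in range(n_instances)]
        hits = sum(1 for k in range(n_instances)
                   if results[a][k] == best[k])
        summary[a] = {"GP": sum(ratios) / n_instances,
                      "Pge": 100.0 * hits / n_instances}
    return summary


# Hypothetical Gtw values for two algorithms on three instances.
example = {"A": [0, 4, 10], "B": [2, 4, 5]}
for name, metrics in summarize(example).items():
    print(name, round(metrics["GP"], 3), round(metrics["Pge"], 1))
```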
Each pair (n_T, n_b) is denoted by an ID, so each (n_T, n_b) value gives a new ID value. The first ID value is 1. For more details, Tables 6 and 7 show the average running time variation for all proposed algorithms classified by n_T and n_b, respectively. Table 6 shows that the maximum running time of 0.005 s is reached for the BIM algorithm when n_T = 50. For LTT and STT, the running time is always less than 0.001 s for all values of n_T. Table 7 shows that the maximum running time of 0.004 s is reached for the BIM algorithm when n_b = 6. For LTT and STT, the running time is always less than 0.001 s for all values of n_b. Table 8 shows the contrast estimation results for all proposed algorithms. This test estimates the contrast between the medians of the results over the instances, considering all pairwise comparisons. It gives a quantitative difference, computed through medians, between the gaps of two algorithms over all instances [58]. Focusing on the rows of Table 8, we can see the good performance of BIM: all its related estimators are negative.
The proposed algorithms perform well in running time: the highest running time reported is 0.005 s, which is consistent with their polynomial running time. The experimental results show the non-dominance of the proposed algorithms. This allows researchers to combine the proposed algorithms to create new ones. The proposed algorithms can be used in a branch and bound algorithm to develop an exact solution to the studied problem, because the execution time of the best algorithm is very short. Evaluating a node in the tree therefore costs little time and memory when the other possibilities are expanded. Other methodologies that can be used to achieve the objective of this work are:
• Applying meta-heuristics, using the proposed algorithms as heuristics and initial solutions.
• Minimizing the squared gap instead of the proposed objective given in Eq 3.
• Considering other metrics, such as the quadratic difference (RMSE) between the AGPs of all algorithms.
The four best proposed algorithms are IML, BIL, MIL, and BIM. The results illustrated in the tables and figures show the performance of the algorithms. The application of randomization has a remarkable impact on performance: these four algorithms are all based on the randomization method, which shows its efficacy on the proposed problem.

Conclusion
The application of technology in different domains is a crucial point, and optimizing the use of technological information is very important to save time and money. This paper dealt with the problem of assigning transactions to the blocks of a blockchain. This problem is NP-hard, so it is addressed by proposing approximate solutions that can be obtained in polynomial time. Nine algorithms are proposed to solve the studied problem in polynomial time. These algorithms are based on the dispatching-rules method, randomization approach, clustering algorithms, and iterative method. The results show that the best algorithm proposed in this paper is best-mi-transactions iterative multi-choice, which reaches the best value in 93.9% of cases with an average running time of 0.003 s. The proposed algorithms give several approximate solutions in very short running times. The limitation of this work is the lack of a comparison between the best proposed algorithm and a valid lower bound for the studied problem. Four directions can be pursued to extend this research. The first is to develop an exact method for the studied problem: the proposed algorithms can be used within a decision tree to obtain the exact solution. The second is to test the proposed algorithms over extended classes of instances and to identify hard classes of instances. The third is to use the proposed algorithms within several meta-heuristics to enhance the solutions. The last is to incorporate the proposed algorithms into solutions for other kinds of problems in cloud management or network routing.