Efficiency of RSA Key Factorization by Open-Source Libraries and Distributed System Architecture

The security of the RSA algorithm relies on the difficulty of factoring large numbers. However, the computational power of information technologies is increasing all the time, while open-source factoring libraries are developed at a similar pace. Therefore, the feasibility of factoring large numbers also increases. In this paper we analyze the efficiency of open-source libraries for factoring RSA numbers when they are run on a computer cluster to decrease the calculation time. To this end, we compare the efficiency of the Msieve, GGNFS and CADO-NFS libraries on the factorization of an 81 decimal digit number with a varying number of cluster nodes (cores). Using the best solution (the Msieve library with the GGNFS library integrated for the sieving method), we analyze the possibility of factoring RSA numbers of different sizes and discuss the exact conditions required to achieve it.


Introduction
Factoring of large numbers is one of the oldest branches of number theory. It gained importance and popularity after RSA (Rivest et al., 1978) was proposed in 1977. RSA is a public-key cryptosystem in which two large prime numbers are multiplied to obtain a part of the key. Opinions differ on the best way to break RSA. D. R. Brown (2005) claims that it may be as difficult as factoring, while Boneh and Venkatesan (1998) believe it may be easier than factoring. Despite this contradiction, the feasibility of factoring large numbers is still one of the main measures used to evaluate the security of an RSA key length.
Currently, the largest factored RSA number is RSA-768 (768 bits or 232 decimal digits), factored in 2009 by Kleinjung (2010). The factorization of the RSA-768 number took more than two years and 80 single core 2.2 GHz AMD Opteron processors with 2 GB RAM. It would be difficult to achieve the same result with domestic computational power; however, RSA-155 (512 bit length) is much easier to deal with. Open-source projects as well as computer clusters and distributed computing can be adopted for the task. The aim of this paper is to evaluate the efficiency of open-source factoring libraries for factoring RSA numbers in a computer cluster. This will show the level of vulnerability of RSA to modern approaches that use physical resources similar to average domestic conditions.
In this paper we pose two research questions and consequently take two different approaches: (1) Which open-source factoring library is the most suitable for factoring large numbers in a computer cluster? (2) How long would it take to factorize RSA numbers of up to 155 decimal digits using a cluster of 8 computers (32 cores)?

Factoring of Large Numbers
Algorithms for factoring large numbers can be divided into two broad categories, according to their purpose (Duta et al., 2016):
Special-purpose algorithms. Here the runtime depends on properties of the number being factored, or of its unknown factors, such as size, special form, etc. Trial Division (Bressoud, 2012), Fermat's (Lehman, 1974), Euler's (Oystein, 1976), Pollard's p-1 (Pollard, 1974), Pollard's Rho (Pollard, 1975), Williams' p+1 (Williams, 1982) and the Elliptic curve method (ECM) (Lenstra, 1985) belong to this category.
General-purpose algorithms. Here the speed does not depend on the size of the prime factors, the number of prime factors or the form of the number. Algorithms of this category are Dixon's (Dixon, 1981), continued fraction factorization (CFRAC) (Morrison, 1975), quadratic sieve (Pomerance, 1984), number field sieve (NFS) (Lenstra, 1993), special number field sieve (SNFS) (Silverman, 2007), general number field sieve (GNFS) (Pomerance, 2008) and Shanks' square forms factorization algorithm (SQUFOF) (Shanks, 1975).
While special-purpose algorithms are useful for solving specific problems, general-purpose algorithms are more suitable for factoring RSA numbers. As the results of the experiment by Duta et al. (2016) show, the number field sieve algorithms (NFS, SNFS, GNFS) are the most efficient among the general-purpose algorithms: they are more than twice as fast as the other algorithms.
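To illustrate what a special-purpose algorithm looks like in practice, the following is a minimal sketch of Pollard's Rho with Floyd cycle detection. It is a textbook formulation written for this illustration, not the implementation used by any of the libraries discussed in this paper; the polynomial x^2 + c, the start value 2 and the toy input 8051 are assumptions chosen for readability. Its expected running time grows roughly with the square root of the smallest prime factor, which is why it belongs to the special-purpose category.

```python
from math import gcd


def pollard_rho(n: int, c: int = 1) -> int:
    """Return a non-trivial factor of composite n, or n itself if this run fails.

    Textbook Pollard's Rho using f(x) = x^2 + c mod n and Floyd cycle detection.
    """
    if n % 2 == 0:
        return 2
    x = y = 2  # assumed start value; any small integer works in principle
    d = 1
    while d == 1:
        x = (x * x + c) % n  # "tortoise" advances one step
        y = (y * y + c) % n  # "hare" advances two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return d  # d == n means failure; retry with a different c


# Toy example: 8051 = 83 * 97
n = 8051
factor = pollard_rho(n)
print(factor, n // factor)
```

For RSA moduli, whose two prime factors are both large and of similar size, this approach is impractical, which is consistent with the preference for general-purpose algorithms stated above.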
Comparative information on the Msieve, GGNFS and CADO-NFS libraries is very limited. Winograd uses Msieve and GGNFS to extend CrypTool functionality and states that these libraries are suitable for the purpose, but gives no information about their efficiency. Valenta (2016) explains why CADO-NFS and Msieve are suitable for use on the Amazon Elastic Compute Cloud platform, yet again without efficiency testing. According to our review of the literature, there are no experiments that directly compare the efficiency of Msieve, GGNFS and CADO-NFS. Therefore, it is unclear which library is the most suitable for distributed factoring of large numbers.

Efficiency Comparison of Msieve, GGNFS and CADO-NFS Libraries
To evaluate the efficiency of the number field sieve libraries, two prime numbers, P of 40 decimal digits and Q of 41 decimal digits, were multiplied, yielding a number N of 81 decimal digits:
P = 6075380529345458860144577398704761614649
Q = 66610666966686667666666656664666366626661
N = 404685149136122917469742099379400401593004996531215430758097470298767813731556989
The number N was factorized on the computer cluster "Vilkas", located at Vilnius Gediminas Technical University. For the experiment, the cluster was configured to use 8 Intel® Core™ i7-860 @ 2.80 GHz computers, each with 4 cores, 4 GB of DDR3-1600 RAM and a 500 GB SATA HDD. To determine how the efficiency depends on the number of parallel processes, we analyzed six configurations with 1, 2, 4, 8, 16 and 32 processes. In each configuration, three alternatives were tested: Msieve (A1), Msieve + GGNFS (A2) and CADO-NFS (A3). GGNFS was not analyzed on its own, only in combination with Msieve, because of the complications of using all five GGNFS library methods for factoring in a cluster. In the tested solution A2, only the GGNFS sieving method was used, while the remaining four methods were performed by the Msieve library.
We recorded the elapsed time of each factoring step (see Table 1) and the overall time (see Fig. 1) for factoring the number N. The results show that Msieve is not optimized for the sieving method, so the combination of Msieve and GGNFS gives better results. The comparison between Msieve+GGNFS and CADO-NFS shows similar results. However, the combination of Msieve and GGNFS is more stable, as with CADO-NFS the factoring time increases unexpectedly for some numbers of processes. We assume this may be related to poor task distribution among the cluster nodes, as the jumps in factoring time appear when more than one computer is used.
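The construction of the test number can be checked with a few lines of Python, which handles integers of arbitrary size natively. The sketch below only confirms the digit counts and compares the product with the value of N as printed in the text; the variable names are chosen for this illustration, and the primality of P and Q is not tested here.

```python
# Values copied from the text; N is written in two pieces that are joined.
P = 6075380529345458860144577398704761614649
Q = 66610666966686667666666656664666366626661
N_REPORTED = int(
    "404685149136122917469742099379400401593004996531215430758"
    "097470298767813731556989"
)

product = P * Q

# Decimal digit counts of the two factors and of their product
print(len(str(P)), len(str(Q)), len(str(product)))

# Whether the product agrees with the N printed in the paper
print("product matches reported N:", product == N_REPORTED)
```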

Factoring of RSA Keys with Msieve and GGNFS for Sieving
In the efficiency comparison we experimented with the 81 decimal digit number N. An interesting finding is that the Msieve library, combined with GGNFS for sieving, can also be used to factor larger numbers. To experiment with larger numbers, we factored eight numbers from RSA-100 to RSA-155. Each of these numbers was factored on the computer cluster "Vilkas" with 32 cores (the same as in the previous experiments), and the execution time was estimated for each method.
The experiment results are shown in Table 2 and illustrate that the sieving method plays the key role in the factoring speed. They also show that all methods require more time as the size of the factored number increases. The overall factoring time increases exponentially with the length of the number (see Fig. 2). Importantly, an RSA number of 155 decimal digits, or 512 bits, can be factorized in less than 60 hours using a computer cluster of 8 Intel® Core™ i7-860 @ 2.80 GHz computers. This shows that RSA-512 is critically unsafe, as the key can be broken within three days. Extrapolating along the exponential curve, factoring an RSA-1024 number would take more than 15000 years. This RSA number has not yet been factorized by any other author, and it could not be factored by our computer cluster in the near future either.
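As a rough cross-check of this kind of extrapolation, one can scale a known running time by the standard heuristic GNFS complexity, exp((64/9)^(1/3) (ln N)^(1/3) (ln ln N)^(2/3)), with the o(1) term ignored. Note that this is a different technique from the curve fitted to the measurements in Table 2, and that the 60-hour input is only the upper bound reported above, so the sketch below is an order-of-magnitude illustration under those assumptions rather than a reproduction of the paper's estimate. The function names are hypothetical.

```python
import math


def gnfs_log_cost(bits: int) -> float:
    """Natural log of the heuristic GNFS cost for a modulus of the given bit length.

    Uses L_N[1/3, (64/9)^(1/3)] and drops the o(1) term (an assumption).
    """
    ln_n = bits * math.log(2)
    c = (64 / 9) ** (1 / 3)
    return c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)


def extrapolate_hours(known_hours: float, known_bits: int, target_bits: int) -> float:
    """Scale a known running time by the ratio of heuristic GNFS costs."""
    ratio = math.exp(gnfs_log_cost(target_bits) - gnfs_log_cost(known_bits))
    return known_hours * ratio


# Assumed input: the "less than 60 hours" bound for 512 bits on the same cluster
hours = extrapolate_hours(60.0, 512, 1024)
print(f"estimated years for 1024 bits: {hours / (24 * 365.25):.0f}")
```

Under these assumptions the cost ratio between 1024 and 512 bits is on the order of several million, which agrees with the conclusion that RSA-1024 is far out of reach for this cluster. The sketch ignores memory limits and the non-parallel parts of the computation.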

Conclusions
The reported cases of large RSA number factorization were carried out using number field sieve algorithms. In most cases, open-source libraries were used to factor the RSA number. As the computational capacity of modern computers increases and open-source libraries are optimized, the possibility that an individual person, rather than a scientific laboratory, can factor RSA-512 numbers is increasing rapidly.
By adapting existing open-source libraries for factoring large numbers in a computer cluster, we noticed that Msieve lacks efficiency in the sieving stage. By integrating Msieve with GGNFS for the sieving stage, the factoring time decreased by up to a factor of 20, although the difference shrinks as the number of cores increases. This shows that a computer cluster can be used to decrease the factoring time, but it does not provide a linear speed-up.
The combination of Msieve and GGNFS libraries used here, running on a 32-core computer cluster, is able to factor a 512-bit RSA number in less than 60 hours. This time is too short to consider RSA-512 safe or even acceptable for use. As the length of the number increases, the factoring time increases exponentially. RSA-1024 is safe against attacks mounted with the solution we used. The fact that RSA-1024 has not yet been factored by any other author indicates that this key length is currently sufficient.

Fig. 1. Comparison of factoring time of different libraries when different numbers of processes are used

Table 1. Time of the Msieve, Msieve+GGNFS and CADO-NFS libraries in each factoring method when different numbers of processes are used.

Table 2. Factoring time in a 32-core (8-node) computer cluster achieved by using the Msieve + GGNFS libraries