1 Introduction

A biprime is a number \(N\) of the form \(N=p\cdot q\) where \(p\) and \(q\) are primes. Such numbers are used as a component of the public key (i.e., the modulus) in the RSA cryptosystem  [33], with the factorization being a component of the secret key. A long line of research has studied methods for sampling biprimes efficiently; in the early days, the task required specialized hardware and was not considered generally practical  [31, 32]. In subsequent years, advances in computational power brought RSA into the realm of practicality, and then ubiquity. Given a security parameter \({\kappa } \), the de facto standard method for sampling RSA biprimes involves choosing random \({\kappa } \)-bit numbers and subjecting them to the Miller-Rabin primality test  [27, 30] until two primes are found; these primes are then multiplied to form a \(2{\kappa } \)-bit modulus. This method suffices when a single party wishes to generate a modulus, and is permitted to know the associated factorization.
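The single-party method described above can be sketched as follows. This is a minimal illustration rather than a production key generator: the function names, the number of Miller-Rabin rounds, and the choice to force the top bit and oddness of each candidate are assumptions made here and are not taken from the text.

```python
import secrets

def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin test; `rounds` is an illustrative choice."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = 2^r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)   # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                   # a witnesses that n is composite
    return True

def sample_prime(kappa: int) -> int:
    """Draw random kappa-bit integers until one passes Miller-Rabin."""
    while True:
        # Assumption: force the top bit (exactly kappa bits) and the low bit (odd).
        candidate = secrets.randbits(kappa) | (1 << (kappa - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

def sample_biprime(kappa: int):
    """Return (N, p, q) with N = p * q and p, q probable kappa-bit primes."""
    p, q = sample_prime(kappa), sample_prime(kappa)
    return p * q, p, q
```

Note that the party running `sample_biprime` necessarily learns \(p\) and \(q\), which is exactly what the distributed setting must avoid.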

Boneh and Franklin  [3, 4] initiated the study of distributed RSA modulus generation. [Footnote 1] This problem involves a set of parties who wish to jointly sample a biprime in such a way that no corrupt and colluding subset (below some defined threshold size) can learn the biprime’s factorization.

It is clear that applying generic multiparty computation (MPC) techniques to the standard sampling algorithm yields an impractical solution: implementing the Miller-Rabin primality test requires repeatedly computing \(a^{p-1}\bmod {p}\), where \(p\) is (in this case) secret, and so such an approach would require the generic protocol to evaluate a circuit containing many modular exponentiations over \({\kappa } \) bits each. Instead, Boneh and Franklin  [3, 4] constructed a new biprimality test that generalizes Miller-Rabin and avoids computing modular exponentiations with secret moduli. Their test carries out all exponentiations modulo the public biprime \(N\), and this allows the exponentiations to be performed locally by the parties. Furthermore, they introduced a three-phase structure for the overall sampling protocol, which subsequent works have embraced:

  1. Prime Candidate Sieving: candidate values for \(p\) and \(q\) are sampled jointly in secret-shared form, and a weak-but-cheap form of trial division sieves them, culling candidates with small factors.

  2. Modulus Reconstruction: \(N=p\cdot q\) is securely computed and revealed.

  3. Biprimality Testing: using a distributed protocol, \(N\) is tested for biprimality. If \(N\) is not a biprime, then the process is repeated.

The seminal work of Boneh and Franklin considered the semi-honest n-party setting with an honest majority of participants. Many extensions and improvements followed (as detailed in Sect. 1.3), the most notable of which (for our purposes) are two recent works that achieve malicious security against a dishonest majority. In the first, Hazay et al. [19, 20] proposed an n-party protocol in which both sieving and modulus reconstruction are achieved via additively homomorphic encryption. Specifically, they rely upon both ElGamal and Paillier encryption, and in order to achieve malicious security, they use zero-knowledge proofs for a variety of relations over the ciphertexts. Thus, their protocol represents a substantial advancement in terms of its security guarantee, but this comes at the cost of additional complexity assumptions and an intricate proof, and also at substantial concrete cost, due to the use of many custom zero-knowledge proofs.

The subsequent protocol of Frederiksen et al. [16] (the second recent work of note) relies mainly on oblivious transfer (OT), which they use to perform both sieving and, via Gilboa’s classic multiplication protocol  [17], modulus reconstruction. They achieved malicious security using the folklore technique in which a “Proof of Honesty” is evaluated as the last step and demonstrated practicality by implementing their protocol; however, it is not clear how to extend their approach to more than two parties in a straightforward way. Moreover, their approach to sieving admits selective-failure attacks, for which they account by including some leakage in the functionality. It also permits a malicious adversary to selectively and covertly induce false negatives (i.e., force the rejection of true biprimes after the sieving stage), a property that is again modeled in their functionality. In conjunction, these attributes degrade security, because the adversary can rejection-sample biprimes based on the additional leaked information, and efficiency, because ruling out malicious false-negatives involves running sufficiently many instances to make the probability of statistical failure in all instances negligible.

Thus, given the current state of the art, it remains unclear whether one can sample an RSA modulus among two parties (one being malicious) without leaking additional information or permitting covert rejection sampling, or whether one can sample an RSA modulus among many parties (all but one being malicious) without involving heavy cryptographic primitives such as additively homomorphic encryption, and their associated performance penalties. In this work, we present a protocol which efficiently achieves both tasks.

1.1 Results and Contributions

A Clean Functionality. We define \({\mathcal {F}_{\mathsf {RSAGen}}}\), a simple, natural functionality for sampling biprimes from the same well-known distribution used by prior works  [4, 16, 20], with no leakage or conflation of sampling failures with adversarial behavior.

A Modular Protocol, with Natural Assumptions. We present a protocol \({\pi _{\mathsf {RSAGen}}}\) in the (\({\mathcal {F}_{\mathsf {AugMul}}}\), \({\mathcal {F}_{\mathsf {Biprime}}}\))-hybrid model, where \({\mathcal {F}_{\mathsf {AugMul}}}\) is an augmented multiplier functionality and \({\mathcal {F}_{\mathsf {Biprime}}}\) is a biprimality-testing functionality, and prove that it UC-realizes \({\mathcal {F}_{\mathsf {RSAGen}}}\) in the malicious setting, assuming the hardness of factoring. More specifically, we prove:

Theorem 1.1

(Main Security Theorem, Informal). In the presence of a PPT malicious adversary corrupting any subset of parties, \({\mathcal {F}_{\mathsf {RSAGen}}}\) can be securely computed with abort in the (\({\mathcal {F}_{\mathsf {AugMul}}}\), \({\mathcal {F}_{\mathsf {Biprime}}}\))-hybrid model, assuming the hardness of factoring.

Additionally, because our security proof relies upon the hardness of factoring only when the adversary cheats, we find to our surprise that our protocol achieves perfect security against semi-honest adversaries.

Theorem 1.2

(Semi-Honest Security Theorem, Informal). In the presence of a computationally unbounded semi-honest adversary corrupting any subset of parties, \({\mathcal {F}_{\mathsf {RSAGen}}}\) can be computed with perfect security in the (\({\mathcal {F}_{\mathsf {AugMul}}}\), \({\mathcal {F}_{\mathsf {Biprime}}}\))-hybrid model.

Supporting Functionalities and Protocols. We define \({\mathcal {F}_{\mathsf {Biprime}}}\), a simple, natural functionality for biprimality testing, and show that it is UC-realized in the semi-honest setting by a well-known protocol of Boneh and Franklin  [4], and in the malicious setting by a derivative of the protocol of Frederiksen et al.  [16]. We believe this dramatically simplifies the composition of these two protocols, and as a consequence, leads to a simpler analysis. Either protocol can be based exclusively upon oblivious transfer.

We also define \({\mathcal {F}_{\mathsf {AugMul}}}\), a functionality for sampling and multiplying secret-shared values in a special form derived from the Chinese Remainder Theorem. In the context of \({\pi _{\mathsf {RSAGen}}}\), this functionality allows us to efficiently sample numbers in a specific range, with no small factors, and then compute their product. We prove that it can be UC-realized exclusively from oblivious transfer, using derivatives of well-known multiplication protocols  [13, 14].

Asymptotic Efficiency. We perform an asymptotic analysis of our composed protocols and find that our semi-honest protocol is a factor of \({\kappa }/\log {\kappa } \) more bandwidth-efficient than that of Frederiksen et al.  [16]. Our malicious protocol is a factor of \({\kappa }/{s} \) more efficient than theirs in the optimistic case (when parties follow the protocol), and a factor of \({\kappa } \) more efficient when parties deviate from the protocol. Recall that \({\kappa } \) is the bit-length of the primes p and q, and \({s} \) is a statistical security parameter. Frederiksen et al. claim in turn that their protocol is strictly superior to the protocol of Hazay et al.  [20] with respect to asymptotic bandwidth performance.

Concrete Efficiency. We perform a closed-form concrete analysis of our protocol (with some optimizations, including the use of random oracles), and find that in terms of communication, it outperforms the protocol of Frederiksen et al. (the most efficient prior work) by a factor of roughly five in the presence of worst-case malicious adversaries, and by a factor of eighty or more in the semi-honest setting.

1.2 Overview of Techniques

Constructive Sampling and Efficient Modulus Reconstruction. Most prior works use rejection sampling to generate a pair of candidate primes, and then multiply those primes together in a separate step. Specifically, they sample a shared value \(p\leftarrow [0,2^{\kappa })\) uniformly, and then run a trial-division protocol repeatedly, discarding both the value and the work that has gone into testing it if trial division fails. This represents a substantial amount of wasted work in expectation. Furthermore, Frederiksen et al.  [16] report that multiplication of candidates after sieving accounts for two thirds of their concrete cost.

We propose a different approach that leverages the Chinese Remainder Theorem (CRT) to constructively sample a pair of candidate primes and multiply them together efficiently. A similar sieving approach (in spirit) was initially formulated as an optimization in a different setting by Malkin et al.  [26]. The CRT implies an isomorphism between a set of values, each in a field modulo a distinct prime, and a single value in a ring modulo the product of those primes (i.e., \({\mathbb {Z}}_{m_1}\times \ldots \times {\mathbb {Z}}_{m_\ell }\simeq {\mathbb {Z}}_{m_1\cdot \ldots \cdot m_\ell }\)). We refer to the set of values as the CRT form or CRT representation of the single value to which they are isomorphic. We formulate a sampling mechanism based on this isomorphism as follows: for each of the first \(O({\kappa }/\log {\kappa })\) odd primes, the parties jointly (and efficiently) sample shares of a value that is nonzero modulo that prime. These values are the shared CRT form of a single \({\kappa } \)-bit value that is guaranteed to be indivisible by any prime in the set sampled against. For technical reasons, we sample two such candidates simultaneously.

Rather than converting pairs of candidate primes from CRT form to standard form, and then multiplying them, we instead multiply them component-wise in CRT form, and then convert the product to standard form to complete the protocol. This effectively replaces a single “full-width” multiplication of size \({\kappa } \) with \(O({\kappa }/\log {\kappa })\) individual multiplications, each of size \(O(\log {\kappa })\). We intend to perform multiplication via an OT-based protocol, and the computation and communication complexity of such protocols grows at least with the square of their input length, even in the semi-honest case  [17]. Thus in the semi-honest case, our approach yields an overall complexity of \(O({\kappa } \log {\kappa })\), as compared to \(O({\kappa } ^2)\) for a single full-width multiplication. In the malicious case, combining the best known multiplier construction  [13, 14] with the most efficient known OT extension scheme  [5] yields a complexity that also grows with the product of the input length and a statistical parameter \({s} \), and so our approach achieves an overall complexity of \(O({\kappa } \log {\kappa } + {\kappa } \cdot {s})\), as compared to \(O({\kappa } ^2+{\kappa } \cdot {s})\) for a single full-width malicious multiplication. Via closed-form analysis, we show that this asymptotic improvement is also reflected concretely.
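The idea of sampling in CRT form and multiplying component-wise can be illustrated in the clear, with no secret sharing or OT. In this sketch the moduli list, the helper names, and the use of a single party are all illustrative assumptions. It recovers the product only modulo \(M\), the product of the moduli; obtaining the full integer product requires additional steps that are not modeled here.

```python
import secrets
from math import prod

def crt_recon(moduli, residues):
    """Return the unique y in [0, M) with y = residues[j] (mod moduli[j])."""
    M = prod(moduli)
    y = 0
    for m, x in zip(moduli, residues):
        a = M // m
        b = pow(a, -1, m)   # modular inverse; needs Python 3.8+
        y += a * b * x
    return y % M

def sample_crt_candidate(moduli):
    """One nonzero residue per prime modulus, so no modulus divides the value."""
    return [1 + secrets.randbelow(m - 1) for m in moduli]

moduli = [3, 5, 7, 11, 13]   # a few small odd primes, for illustration only
M = prod(moduli)

p_crt = sample_crt_candidate(moduli)
q_crt = sample_crt_candidate(moduli)

# Component-wise product: one small multiplication per modulus,
# instead of one full-width multiplication.
n_crt = [(x * y) % m for x, y, m in zip(p_crt, q_crt, moduli)]

p = crt_recon(moduli, p_crt)
q = crt_recon(moduli, q_crt)
```

Here `crt_recon(moduli, n_crt)` equals `(p * q) % M`, and neither `p` nor `q` is divisible by any of the listed primes, which is the sieving guarantee described above.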

Achieving Security with Abort Efficiently. The fact that we sample primes in CRT form also plays a crucial role in our security analysis. Unlike the work of Frederiksen et al. [16], our protocol achieves the standard, intuitive notion of security with abort: the adversary can instruct the functionality to abort regardless of whether a biprime is successfully sampled, and the honest parties are always made aware of such adversarial aborts. There is, in other words, absolutely no conflation of sampling failures with adversarial behavior. For the sake of efficiency, our protocol permits the adversary to cheat prior to biprimality testing, and then rules out such cheats retroactively using one of two strategies. In the case that a biprime is successfully sampled, adversarial behavior is ruled out retroactively in a privacy-preserving fashion using well-known but moderately expensive techniques, which is tolerable only because it need not be done more than once. In the case that a sampled value is not a biprime, however, the inputs to the sampling protocol are revealed to all parties, and the retroactive check is carried out in the clear. Proving the latter approach secure turns out to be surprisingly subtle.

The challenge arises from the fact that the simulator must simulate the protocol transcript for the OT-multipliers on behalf of the honest parties without knowing their inputs. Later, if the sampling-protocol inputs are revealed, the simulator must “explain” how the simulated transcript is consistent with the true inputs of the honest parties. Specifically, in maliciously secure OT-multipliers of the sort we use  [13, 14], the OT receiver (Bob) uses a high-entropy encoding of his input, and the sender (Alice) can, by cheating, learn a one-bit predicate of this encoding. Before Bob’s true input is known to the simulator, it must pick an encoding at random. When Bob’s input is revealed, the simulator must find an encoding of his input which is consistent with the predicate on the random encoding that Alice has learned. This task closely resembles solving a random instance of subset sum.

We are able to overcome this difficulty because our multiplications are performed component-wise over CRT-form representations of their operands. Because each component is of size \(O(\log {\kappa })\) bits, the simulator can simply guess random encodings until it finds one that matches the required constraints. We show that this strategy succeeds in strict polynomial time, and that it induces a distribution statistically close to that of the real execution.

This form of “privacy-free” malicious security (wherein honest behavior is verified at the cost of sacrificing privacy) leads to considerable efficiency gains in our case: it is up to a multiplicative factor of \({s} \) (the statistical parameter) cheaper than the privacy-preserving check used in the case that a candidate passes the biprimality test (and the one used in prior OT-multipliers  [13, 14]). Since most candidates fail the biprimality test, using the privacy-free check to verify that they were generated honestly results in substantial savings.

Biprimality Testing as a Black Box. We specify a functionality for biprimality testing, and prove that it can be realized by a maliciously secure version of the Boneh-Franklin biprimality test. Our functionality has a clean interface and does not, for example, require its inputs to be authenticated to ensure that they were actually generated by the sampling phase of the protocol. The key insight that allows us to achieve this level of modularity is a reduction to factoring: if an adversary is able to cheat by supplying incorrect inputs to the biprimality test, relative to a candidate biprime \(N\), and the biprimality test succeeds, then we show that the adversary can be used to factor biprimes. We are careful to rely on this reduction only in the case that \(N\) is actually a biprime, and to prevent the adversary from influencing the distribution of candidates.

The Benefits of Modularity. We claim as a contribution the fact that modularity has yielded both a simpler protocol description and a reasonably simple proof of security. We believe that this approach will lead to derivatives of our work with stronger security properties or with security against stronger adversaries. As a first example, we prove that a semi-honest version of our protocol (differing only in that it omits the retroactive consistency check in the protocol’s final step) achieves perfect security. We furthermore observe that in the malicious setting, instantiating \({\mathcal {F}_{\mathsf {Biprime}}}\) and \({\mathcal {F}_{\mathsf {AugMul}}}\) with security against adaptive adversaries yields an RSA modulus sampling protocol that is adaptively secure.

Similarly, only minor adjustments to the main protocol are required to achieve security with identifiable abort  [11, 22]. If we assume that the underlying functionalities \({\mathcal {F}_{\mathsf {AugMul}}}\) and \({\mathcal {F}_{\mathsf {Biprime}}}\) are instantiated with identifiable abort, then it remains only to ensure the use of consistent inputs across these functionalities, and to detect which party has provided inconsistent inputs if an abort occurs. This can be accomplished by augmenting \({\mathcal {F}_{\mathsf {Biprime}}}\) with an additional interface for revealing the input values provided by all the parties upon global request (e.g., when the candidate \(N\) is not a biprime). Given identifiable abort, it is possible to guarantee output delivery in the presence of up to \(n-1\) corruptions via standard techniques, although the functionality must be weakened to allow the adversary to reject one biprime per corrupt party. [Footnote 2] A proof of this extension is beyond the scope of this work; we focus instead on the advancements our framework yields in the setting of security with abort.

1.3 Additional Related Work

Frankel, MacKenzie, and Yung  [15] adjusted the protocol of Boneh and Franklin  [3] to achieve security against malicious adversaries in the honest-majority setting. Their main contribution was the introduction of a method for robust distributed multiplication over the integers. Cocks [8] proposed a method for multiparty RSA key generation under heuristic assumptions, and later attacks by Coppersmith (see [9]) and Joye and Pinch  [23] suggest this method may be insecure. Poupard and Stern  [29] presented a maliciously secure two-party protocol based on oblivious transfer. Gilboa  [17] achieved improved efficiency in the semi-honest two-party model, and introduced a novel method for multiplication from oblivious transfer, from which our own multipliers ultimately derive.

Malkin, Wu, and Boneh  [26] implemented the protocol of Boneh and Franklin and introduced an optimized sieving method similar in spirit to ours. In particular, their protocol generates sharings of random values in \({\mathbb {Z}}_M^*\) (where M is a primorial modulus) during the sieving phase, instead of naïve random candidates for primes p and q. However, their method produces multiplicative sharings of p and q, which are converted into additive sharings for biprimality testing via an honest-majority, semi-honest protocol. This conversion requires rounds linear in the party count, and it is unclear how to adapt it to tolerate a malicious majority of parties without a significant performance penalty.

Algesheimer, Camenisch, and Shoup  [1] described a method to compute a distributed version of the Miller-Rabin test: they used secret-sharing conversion techniques reliant on approximations of \(1/p\) to compute exponentiations modulo a shared \(p\). However, each invocation of their Miller-Rabin test still has complexity in \(O({\kappa } ^3)\) per party, and their overall protocol has communication complexity in \(O({\kappa } ^5/\log ^2{{\kappa }})\), with \(\varTheta ({\kappa })\) rounds of interaction. Concretely, Damgård and Mikkelsen  [12] estimate that 10000 rounds are required to sample a 2000-bit biprime using this method. Damgård and Mikkelsen also extended that work to improve both its communication and round complexity by several orders of magnitude, and to achieve malicious security in the honest-majority setting. Their protocol is at least a factor of \(O({\kappa })\) better than that of Algesheimer, Camenisch, and Shoup, but it still requires hundreds of rounds. We were not able to derive an explicit complexity analysis of their approach.

1.4 Organization

Basic notation and background information are given in Sect. 2. Our ideal biprime-sampling functionality is defined in Sect. 3, and we give a protocol that realizes it in Sect. 4. In Sect. 5, we present our biprimality-testing protocol. In the full version  [7] of this work, we give an efficiency analysis, full proofs of security, and the details of our multiplication protocol.

2 Preliminaries

Notation. We use \(=\) for equality, \(:=\) for assignment, \(\leftarrow \) for sampling from a distribution, \(\equiv \) for congruence, \(\smash {\approx _\mathrm{c}}\) for computational indistinguishability, and \(\smash {\approx _\mathrm{s}}\) for statistical indistinguishability. In general, single-letter variables are set in italic font, multi-letter variables and function names are set in sans-serif font, and string literals are set in slab-serif font. We use \(\bmod {}\) to indicate the modulus operator, while \((\bmod {}\ m)\) at the end of a line indicates that all equivalence relations on that line are to be taken over the integers modulo m. By convention, we parameterize computational security by the bit-length of each prime in an RSA biprime; we denote this length by \({\kappa } \) throughout. We use \({s} \) to represent the statistical parameter. Where concrete efficiency is concerned, we introduce a second computational security parameter, \({\lambda } \), which represents the length of a symmetric key of equivalent strength to a biprime of length \(2{\kappa } \). [Footnote 3] The parameters \({\kappa } \) and \({\lambda } \) must vary together, and a recommendation for the relationship between them has been laid down by NIST  [2].

Vectors and arrays are given in bold and indexed by subscripts; thus \(\mathbf {x}_i\) is the \(i\)th element of the vector \(\mathbf {x}\), which is distinct from the scalar variable x. When we wish to select a row or column from a two-dimensional array, we place a \(*\) in the dimension along which we are not selecting. Thus \(\smash {\mathbf {y}_{*,j}}\) is the \(j\)th column of matrix \(\mathbf {y}\), and \(\smash {\mathbf {y}_{j,*}}\) is the \(j\)th row. We use \(\mathcal{P} _i\) to denote the party with index i, and when only two parties are present, we refer to them as Alice and Bob. Variables may often be subscripted with an index to indicate that they belong to a particular party. When arrays are owned by a party, the party index always comes first. We use |x| to denote the bit-length of x, and \(|\mathbf {y}|\) to denote the number of elements in the vector \(\mathbf {y}\).

Universal Composability. We prove our protocols secure in the Universal Composability (UC) framework, and use standard UC notation. We refer the reader to Canetti  [6] for further details. In functionality descriptions, we leave some standard bookkeeping elements implicit. For example, we assume that the functionality aborts if a party tries to reuse a session identifier inappropriately, send messages out of order, etc. For convenience, we provide a function \(\mathsf {GenSID} \), which takes any number of arguments and deterministically derives a unique Session ID from those arguments.

Chinese Remainder Theorem. The Chinese Remainder Theorem (CRT) defines an isomorphism between a set of residues modulo a set of respective coprime values and a single value modulo the product of the same set of coprime values. This forms the basis of our sampling procedure.

Theorem 2.1

(CRT). Let \(\mathbf {m} \) be a vector of pairwise coprime positive integers and let \(\mathbf {x}\) be a vector of numbers such that \(|\mathbf {m} |=|\mathbf {x}|=\ell \) and \(0\le \mathbf {x}_j<\mathbf {m} _j\) for all \(j\in [\ell ]\), and finally let \(M:=\prod _{j\in [\ell ]}\mathbf {m} _j\). Under these conditions there exists a unique value y such that \(0\le y<M\) and \(y\equiv \mathbf {x}_j\pmod {\mathbf {m} _j}\) for every \(j\in [\ell ]\).

We refer to \(\mathbf {x}\) as the CRT form of y with respect to \(\mathbf {m} \). For completeness, we give the \({\mathsf {CRTRecon}}\) algorithm, which finds the unique y given \(\mathbf {m} \) and \(\mathbf {x}\).

[Figure a: the \({\mathsf {CRTRecon}}\) algorithm]
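As a rough illustration of the reconstruction guaranteed by Theorem 2.1, the following sketch implements the textbook CRT construction. It is not a transcription of the paper's \({\mathsf {CRTRecon}}\) figure; the input validation and the variable names `a_j` and `b_j` are assumptions chosen to match the notation used later in the text.

```python
from math import gcd, prod

def crt_recon(m, x):
    """Find the unique y in [0, M) with y = x[j] (mod m[j]) for every j."""
    if len(m) != len(x):
        raise ValueError("m and x must have the same length")
    for i in range(len(m)):
        if not 0 <= x[i] < m[i]:
            raise ValueError("each residue must satisfy 0 <= x[j] < m[j]")
        for j in range(i + 1, len(m)):
            if gcd(m[i], m[j]) != 1:
                raise ValueError("moduli must be pairwise coprime")
    M = prod(m)
    y = 0
    for m_j, x_j in zip(m, x):
        a_j = M // m_j            # divisible by every modulus except m_j
        b_j = pow(a_j, -1, m_j)   # a_j * b_j = 1 (mod m_j); needs Python 3.8+
        y += a_j * b_j * x_j
    return y % M
```

For example, with `m = [4, 3, 5]` and `x = [3, 2, 1]`, the unique value in \([0,60)\) is 11, since \(11\equiv 3\pmod 4\), \(11\equiv 2\pmod 3\), and \(11\equiv 1\pmod 5\).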

3 Assumptions and Ideal Functionality

We begin this section by discussing the distribution of biprimes from which we sample, and thus the precise factoring assumption that we make, and then we give an efficient sampling algorithm and an ideal functionality that computes it.

3.1 Factoring Assumptions

The standard factoring experiment (Experiment 3.1) as formalized by Katz and Lindell  [24] is parametrized by an adversary \(\mathcal{A} \) and a biprime-sampling algorithm \(\mathsf {GenModulus} \). On input \(1^{\kappa } \), this algorithm returns \((N,p,q)\), where \(N=p\cdot q\), and \(p\) and \(q\) are \({\kappa } \)-bit primes. [Footnote 4]

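[Figure b: Experiment 3.1, the factoring experiment \({\mathsf {Factor}} _{\mathcal{A},\mathsf {GenModulus}}({\kappa })\)]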

In many cryptographic applications, \(\mathsf {GenModulus} (1^{\kappa })\) is defined to sample \(p\) and \(q\) uniformly from the set of primes in the range \([2^{{\kappa }-1},2^{\kappa })\)  [18], and the factoring assumption with respect to this common \(\mathsf {GenModulus}\) function states that for every PPT adversary \(\mathcal{A} \) there exists a negligible function \(\mathsf {negl}\) such that

$$\begin{aligned} {\mathrm {Pr}}\left[ {\mathsf {Factor}} _{\mathcal{A},\mathsf {GenModulus}}({\kappa })=1\right] \le \mathsf {negl}({\kappa }). \end{aligned}$$
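To make the shape of the experiment concrete, here is a toy sketch following the usual textbook formulation: the experiment samples \((N,p,q)\), hands \(N\) to the adversary, and outputs 1 exactly when the adversary returns two nontrivial factors whose product is \(N\). The function names and the trial-division adversary are illustrative assumptions, and the parameter sizes are far too small to be meaningful.

```python
import secrets

def is_prime(n):
    """Trial division; adequate only for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def gen_modulus(kappa):
    """Toy GenModulus: p, q uniform among primes in [2^(kappa-1), 2^kappa)."""
    def prime():
        while True:
            c = (1 << (kappa - 1)) + secrets.randbelow(1 << (kappa - 1))
            if is_prime(c):
                return c
    p, q = prime(), prime()
    return p * q, p, q

def factor_experiment(adversary, kappa):
    """Return 1 if the adversary outputs nontrivial factors of N, else 0."""
    N, _p, _q = gen_modulus(kappa)
    p2, q2 = adversary(N)
    return int(p2 > 1 and q2 > 1 and p2 * q2 == N)

def trial_division_adversary(N):
    """Exponential-time adversary; it wins only because kappa is tiny."""
    d = 2
    while N % d != 0:
        d += 1
    return d, N // d
```

The factoring assumption states that no efficient adversary makes `factor_experiment` return 1 with more than negligible probability as \({\kappa }\) grows; the trial-division adversary above is not efficient in that sense.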

Because efficiently sampling according to this uniform biprime distribution is difficult in a multiparty context, most prior works sample according to a different distribution, and thus using the moduli they produce requires a slightly different factoring assumption than the traditional one. In particular, several recent works use a distribution originally proposed by Boneh and Franklin  [4], which is well-adapted to multiparty sampling. Our work follows this pattern.

Boneh and Franklin’s distribution is defined by the sampling algorithm \({\mathsf {BFGM}}\), which takes as an additional parameter the number of parties n. The algorithm samples n integer shares, each in the range \([0,2^{{\kappa }-\log {n}})\), and sums these shares to arrive at a candidate prime. This does not induce a uniform distribution on the set of \({\kappa } \)-bit primes. Furthermore, \({\mathsf {BFGM}}\) only samples individual primes \(p\) or \(q\) that have \(p\equiv q\equiv 3\pmod {4}\), in order to facilitate efficient distributed primality testing, and it filters out the subset of otherwise-valid moduli \(N=p\cdot q\) that have \(p\equiv 1\pmod {q}\) or \(q\equiv 1\pmod {p}\). [Footnote 5]

[Figure c: the \({\mathsf {BFGM}}\) sampling algorithm]
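The description of \({\mathsf {BFGM}}\) above can be approximated by the following sketch. It is not the paper's algorithm: in particular, how the condition \(p\equiv q\equiv 3\pmod {4}\) is enforced across the individual shares is not stated in this section, so the sketch simply rejects sums that violate it, and it takes \(\log n\) to mean \(\lceil \log _2 n\rceil \). The names `bfgm_sketch` and `is_prime` are hypothetical.

```python
import secrets
from math import ceil, log2

def is_prime(n):
    """Trial division; adequate only for toy parameter sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def bfgm_sketch(kappa, n):
    """Sample (N, p, q) in the spirit of BFGM; an approximation only."""
    # Assumption: each share lies in [0, 2^(kappa - ceil(log2 n))).
    share_bound = 1 << (kappa - ceil(log2(n)))
    while True:
        p = sum(secrets.randbelow(share_bound) for _ in range(n))
        q = sum(secrets.randbelow(share_bound) for _ in range(n))
        if p % 4 != 3 or q % 4 != 3:
            continue              # require p = q = 3 (mod 4)
        if not (is_prime(p) and is_prime(q)):
            continue
        if p % q == 1 or q % p == 1:
            continue              # moduli that BFGM filters out
        return p * q, p, q
```

Because the loop repeats until a valid pair is found, this sampler runs in expected rather than strict polynomial time.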

Any protocol whose security depends upon the hardness of factoring moduli output by our protocol (including our protocol itself) must rely upon the assumption that for every PPT adversary \(\mathcal{A} \),

$$ {\mathrm {Pr}}\left[ {\mathsf {Factor}} _{\mathcal{A},{\mathsf {BFGM}}}({\kappa },n)=1\right] \le \mathsf {negl}({\kappa }) $$

3.2 The Distributed Biprime-Sampling Functionality

Unfortunately, our ideal modulus-sampling functionality cannot merely call \({\mathsf {BFGM}}\); we wish our functionality to run in strict polynomial time, whereas the running time of \({\mathsf {BFGM}}\) is only expected polynomial. Thus, we define a new sampling algorithm, \({\mathsf {CRTSample}}\), which might fail, but conditioned on success outputs samples statistically close to \({\mathsf {BFGM}}\). [Footnote 6] Furthermore, we give \({\mathsf {CRTSample}}\) a specific distribution of failures that is tied to the design of our protocol. As a second concession to our protocol design (and following Hazay et al.  [20]), \({\mathsf {CRTSample}}\) takes as input up to \(n-1\) integer shares of \(p\) and \(q\), arbitrarily determined by the adversary, while the remaining shares are sampled randomly. We begin with a few useful notions.

Definition 3.3

(Primorial Number). The \(i\)th primorial number is defined to be the product of the first i prime numbers.

Definition 3.4

(\(({\kappa },n)\)-Near-Primorial Vector). Let \(\ell \) be the largest number such that the \(\ell \)th primorial number is less than \(2^{{\kappa }-\log {n}-1}\), and let \(\mathbf {m} \) be a vector of length \(\ell \) such that \(\mathbf {m} _1=4\) and \(\mathbf {m} _2,\ldots ,\mathbf {m} _\ell \) are the odd factors of the \(\ell ^\text {th}\) primorial number, in ascending order. \(\mathbf {m} \) is the unique \(({\kappa },n)\)-near-primorial vector.

Definition 3.5

(\(\mathbf {m} \)-Coprimality). Let \(\mathbf {m} \) be a vector of integers. An integer x is \(\mathbf {m} \)-coprime if and only if it is not divisible by any \(\mathbf {m} _i\) for \(i\in [|\mathbf {m} |]\).
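Definitions 3.3 through 3.5 can be made concrete with a short sketch. The helper names are hypothetical, and \(\log n\) is taken to mean \(\lceil \log _2 n\rceil \), which is an assumption about the intended rounding.

```python
from math import ceil, log2

def first_primes(count):
    """The first `count` primes, by trial division (fine for small counts)."""
    primes, c = [], 2
    while len(primes) < count:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes

def primorial(i):
    """Product of the first i primes (Definition 3.3)."""
    result = 1
    for p in first_primes(i):
        result *= p
    return result

def near_primorial_vector(kappa, n):
    """The (kappa, n)-near-primorial vector (Definition 3.4)."""
    bound = 1 << (kappa - ceil(log2(n)) - 1)
    ell = 0
    while primorial(ell + 1) < bound:
        ell += 1                  # largest ell with primorial(ell) < bound
    if ell == 0:
        raise ValueError("kappa is too small for this n")
    # m_1 = 4, followed by the odd prime factors in ascending order.
    return [4] + first_primes(ell)[1:]

def is_m_coprime(x, m):
    """x is m-coprime iff no element of m divides x (Definition 3.5)."""
    return all(x % m_i != 0 for m_i in m)
```

For \({\kappa }=16\) and \(n=2\) the bound is \(2^{14}=16384\); the largest primorial below it is \(2310=2\cdot 3\cdot 5\cdot 7\cdot 11\), so the vector is `[4, 3, 5, 7, 11]`. The product of the vector is always twice the corresponding primorial, and is therefore less than \(2^{{\kappa }-\log n}\).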

[Figure d: the \({\mathsf {CRTSample}}\) sampling algorithm]

Boneh and Franklin  [4, Lemma 2.1] showed that knowledge of \(n-1\) integer shares of the factors \(p\) and \(q\) does not give the adversary any meaningful advantage in factoring biprimes from the distribution produced by \({\mathsf {BFGM}}\) and, by extension, \({\mathsf {CRTSample}}\). Hazay et al.  [20, Lemma 4.1] extended this argument to the malicious setting, wherein the adversary is allowed to choose its own shares.

Lemma 3.7

([4, 20]). Let \(n<{\kappa } \) and let \((\mathcal{A} _1,\mathcal{A} _2)\) be a pair of PPT algorithms. For \((\mathsf {state},\{(p_i,q_i)\}_{i\in [n-1]})\leftarrow \mathcal{A} _1(1^{\kappa },1^n)\), let \(N\) be a biprime sampled by running \({\mathsf {CRTSample}}\)\(({\kappa },n,\{(p_i,q_i)\}_{i\in [n-1]})\). If \(\mathcal{A} _2(\mathsf {state},N)\) outputs the factors of \(N\) with probability at least \(\smash {1/{\kappa } ^d}\), then there exists an expected-polynomial-time algorithm \(\mathcal{B} \) that succeeds with probability \(1/(2^4n^3{\kappa } ^d)\) in the experiment \({\mathsf {Factor}}\)\(_{\mathcal{B},{{\mathsf {BFGM}}}({\kappa },n)}\).

Multiparty Functionality. Our ideal functionality \({\mathcal {F}_{\mathsf {RSAGen}}}\) is a natural embedding of \({\mathsf {CRTSample}}\) in a multiparty functionality: it receives inputs \(\{(p_i,q_i)\}_{i\in {\mathbf {P}^*}}\) from the adversary and runs a single iteration of \({\mathsf {CRTSample}}\) with these inputs when invoked. It either outputs the corresponding modulus if it is valid, or indicates that a sampling failure has occurred. Running a single iteration of \({\mathsf {CRTSample}}\) per invocation of \({\mathcal {F}_{\mathsf {RSAGen}}}\) enables significant freedom in the use of \({\mathcal {F}_{\mathsf {RSAGen}}}\), because it can be composed in different ways to tune the trade-off between resource usage and execution time. It also simplifies the analysis of the protocol \({\pi _{\mathsf {RSAGen}}}\) that realizes \({\mathcal {F}_{\mathsf {RSAGen}}}\), because the analysis is made independent of the success rate of the sampling procedure.

The functionality may not deliver \(N\) to the honest parties for one of two reasons: either \({\mathsf {CRTSample}}\) failed to sample a biprime, or the adversary caused the computation to abort. In either case, the honest parties are informed of the cause of the failure, and consequently the adversary is unable to conflate the two cases. This is essentially the standard notion of security with abort, applied to the multiparty computation of the \({\mathsf {CRTSample}}\) algorithm. In both cases, the \(p\) and \(q\) output by \({\mathsf {CRTSample}}\) are given to the adversary. This leakage simplifies our proof considerably, and we consider it benign, since the honest parties never receive (and therefore cannot possibly use) \(N\).

figure e

4 The Distributed Biprime-Sampling Protocol

In this section, we present the distributed biprime-sampling protocol \({\pi _{\mathsf {RSAGen}}}\), with which we realize \({\mathcal {F}_{\mathsf {RSAGen}}}\). We begin with a high-level overview, and then in Sect. 4.2, we formally define the two ideal functionalities on which our protocol relies, after which in Sect. 4.3 we give the protocol itself. In Sect. 4.4, we present proof sketches of semi-honest and malicious security.

4.1 High-Level Overview

As described in the Introduction, our protocol derives from that of Boneh and Franklin [4], the main technical differences relative to other recent Boneh-Franklin derivatives  [16, 20] being the modularity with which it is described and proven, and the use of CRT-based sampling. Our protocol has three main phases, which we now describe in sequence.

Candidate Sieving. In the first phase of our protocol, the parties jointly sample two \({\kappa } \)-bit candidate primes \(p\) and \(q\) without any small factors, and multiply them to learn their product \(N\). Our protocol achieves these tasks in a unified, integrated way, thanks to the Chinese Remainder Theorem.

Consider a prime \(m\) and a set of shares \(x_i\) for \(i\in [n]\) over the field \({\mathbb {Z}}_m\). As in the description of \({\mathsf {CRTRecon}}\), let a and b be defined such that \(a\cdot b\equiv 1\pmod {m}\), and let M be an integer. Observe that if m divides M, then

$$\begin{aligned} \sum \limits _{i\in [n]}x_i\not \equiv 0\pmod {m}{\quad }\implies {\quad }\sum \limits _{i\in [n]}a\cdot b\cdot x_i\bmod {M}\not \equiv 0\pmod {m} \end{aligned}$$
(1)

Now consider a vector of coprime integers \(\mathbf {m} \) of length \(\ell \), and let \(M\) be their product. Let \(\mathbf {x}\) be a vector, each element of which is secret shared over the field defined by the corresponding element of \(\mathbf {m} \), and let \(\mathbf {a}\) and \(\mathbf {b}\) be defined as in \({\mathsf {CRTRecon}}\) (i.e., \(\mathbf {a}_j = M/\mathbf {m} _j\) and \(\mathbf {a}_j\cdot \mathbf {b}_j \equiv 1 \pmod {\mathbf {m} _j}\)). We can see that for any \(k,j\in [\ell ]\) such that \(k\ne j\),

$$\begin{aligned} \mathbf {a}_j \equiv 0\pmod {\mathbf {m} _{k}}{\quad }\implies {\quad }\sum \limits _{i\in [n]}\mathbf {a}_j\cdot \mathbf {b}_j\cdot \mathbf {x}_{i,j}\bmod {M}\equiv 0\pmod {\mathbf {m} _k} \end{aligned}$$
(2)

and the conjunction of Eqs. 1 and 2 gives us

$$ \sum \limits _{j\in [\ell ]}\sum \limits _{i\in [n]}\mathbf {a}_j\cdot \mathbf {b}_j\cdot \mathbf {x}_{i,j}\bmod {M}\equiv \sum \limits _{i\in [n]}\mathbf {x}_{i,k}\pmod {\mathbf {m} _k} $$

for all \(k\in [\ell ]\). Observe that this holds regardless of which order we perform the sums in, and regardless of whether the \(\bmod {\,M}\) operation is done at the end, or between the two sums, or not at all.

It follows then that we can sample n shares for an additive secret sharing over the integers of a \({\kappa } \)-bit value x (distributed between 0 and \(n\cdot M\)) by choosing \(\mathbf {m} \) to be the \(({\kappa },n)\)-near-primorial vector (per Definition 3.4), instructing each party \(\mathcal{P} _i\) for \(i\in [n]\) to pick \(\mathbf {x}_{i,j}\) locally for \(j\in [\ell ]\) such that \(0\le \mathbf {x}_{i,j}<\mathbf {m} _j\), and then instructing each party to locally reconstruct \(x_i={\mathsf {CRTRecon}} (\mathbf {m}, \mathbf {x}_{i,*})\), its share of x. It furthermore follows that if the parties can contrive to ensure that

$$\begin{aligned} \sum \limits _{i\in [n]}\mathbf {x}_{i,j}\not \equiv 0\pmod {\mathbf {m} _j} \end{aligned}$$
(3)

for \(j\in [\ell ]\), then x will not be divisible by any prime in \(\mathbf {m} \).
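The two observations above can be checked numerically. The following is a minimal sketch in the clear (no secrecy is modelled), assuming toy odd prime moduli; the helper names `crt_recon` and `sample_sieved_shares` are illustrative and do not come from the protocol description. Resampling a column until its sum is non-zero stands in for the guarantee that the protocol obtains from \({\mathcal {F}_{\mathsf {AugMul}}}\).

```python
import random
from math import prod

def crt_recon(m, residues):
    """Return the unique x in [0, prod(m)) with x = residues[j] (mod m[j])."""
    M = prod(m)
    x = 0
    for m_j, r_j in zip(m, residues):
        a_j = M // m_j              # a_j = 0 modulo every other m_k (Eq. 2)
        b_j = pow(a_j, -1, m_j)     # a_j * b_j = 1 (mod m_j)
        x += a_j * b_j * r_j
    return x % M

def sample_sieved_shares(m, n, rng):
    """Pick residues x[i][j] whose column sums are non-zero mod m[j] (Eq. 3).

    Done in the clear by resampling a column; this is an assumption of the
    sketch, not how the protocol enforces the condition.
    """
    x = [[0] * len(m) for _ in range(n)]
    for j, m_j in enumerate(m):
        while True:
            col = [rng.randrange(m_j) for _ in range(n)]
            if sum(col) % m_j != 0:
                break
        for i in range(n):
            x[i][j] = col[i]
    return x

m = [3, 5, 7, 11]                                        # toy moduli
n = 3                                                    # number of parties
residues = sample_sieved_shares(m, n, random.Random(0))
shares = [crt_recon(m, residues[i]) for i in range(n)]   # integer shares x_i
x = sum(shares)                                          # 0 <= x < n * M
```

Here `x` agrees with the sum of the residues modulo every \(\mathbf {m} _j\), and is therefore divisible by none of the primes in the toy vector.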

Observe next that if the parties sample two shared vectors \(\mathbf {p}\) and \(\mathbf {q}\) as above (corresponding to the candidate primes \(p\) and \(q\)) and compute a shared vector \(\mathbf {N}\) of identical dimension such that

$$\begin{aligned} \sum \limits _{i\in [n]}\mathbf {p}_{i,j}\cdot \sum \limits _{i\in [n]}\mathbf {q}_{i,j}\equiv \sum \limits _{i\in [n]}\mathbf {N}_{i,j}\pmod {\mathbf {m} _j} \end{aligned}$$
(4)

for all \(j\in [\ell ]\), then it follows that

$$ \sum \limits _{i\in [n]}{\mathsf {CRTRecon}} (\mathbf {m}, \mathbf {p}_{i,*})\cdot \sum \limits _{i\in [n]}{\mathsf {CRTRecon}} (\mathbf {m}, \mathbf {q}_{i,*}) =\sum \limits _{i\in [n]}{\mathsf {CRTRecon}} (\mathbf {m}, \mathbf {N}_{i,*}) $$

and from this it follows that the parties can calculate integer shares of \(N=p\cdot q\) by multiplying \(\mathbf {p}\) and \(\mathbf {q}\) together element-wise using a modular-multiplication protocol for linear secret shares, and then locally running \({\mathsf {CRTRecon}}\) on the output to reconstruct \(N\). In fact, our sampling protocol makes use of a special functionality \({\mathcal {F}_{\mathsf {AugMul}}}\), which samples \(\mathbf {p}\), \(\mathbf {q}\), and \(\mathbf {N}\) simultaneously such that the conditions in Eqs. 3 and 4 hold.

There remains one problem: our vector \(\mathbf {m} \) was chosen for sampling integer-shared values between 0 and \(n\cdot M\) (with each share no larger than \(M\)), but \(N\) might be as large as \(n^2\cdot M^2\). In order to avoid wrapping during reconstruction of \(N\), we must reconstruct with respect to a larger vector of primes (while continuing to sample with respect to a smaller one). Let \(\mathbf {m} \) now be of length \({\ell '}\), and let \(\ell \) continue to denote the length of the prefix of \(\mathbf {m} \) with respect to which sampling is performed. After sampling the initial vectors \(\mathbf {p}\), \(\mathbf {q}\), and \(\mathbf {N}\), each party \(\mathcal{P} _i\) for \(i\in [n]\) must extend \(\mathbf {p}_{i,*}\) locally to \({\ell '}\) elements, by computing

$$ \mathbf {p}_{i,j} := {\mathsf {CRTRecon}} \left( \{\mathbf {m} _{j'}\}_{j'\in [\ell ]}, \{\mathbf {p}_{i,j'}\}_{j'\in [\ell ]}\right) \bmod {\mathbf {m} _j} $$

for \(j\in [\ell +1,{\ell '}]\), and then likewise for \(\mathbf {q}_{i,*}\). Finally, the parties must use a modular-multiplication protocol to compute the appropriate extension of \(\mathbf {N}\); from this extended \(\mathbf {N}\), they can reconstruct shares of \(N=p\cdot q\). They swap these shares, and thus each party ends the Sieving phase of our protocol with a candidate biprime \(N\) and an integer share of each of its factors, \(p_i\) and \(q_i\).
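The length-extension step and the reconstruction of \(N\) can be illustrated with a small sketch, again in the clear. It rests on several assumptions: the moduli are toy values, `multiply_shares` is a hypothetical stand-in for the modular-multiplication protocol that computes the product openly and re-shares it, the non-zero condition of Eq. 3 is omitted for brevity, and the swapped shares are summed and reduced modulo the product of the full vector.

```python
import random
from math import prod

def crt_recon(m, residues):
    """Unique x in [0, prod(m)) with x = residues[j] (mod m[j])."""
    M = prod(m)
    return sum((M // mj) * pow(M // mj, -1, mj) * r
               for mj, r in zip(m, residues)) % M

def multiply_shares(p_col, q_col, mj, rng):
    """Toy stand-in for modular multiplication: fresh additive shares of
    (sum p_col) * (sum q_col) modulo mj, computed in the clear."""
    target = sum(p_col) * sum(q_col) % mj
    out = [rng.randrange(mj) for _ in p_col[:-1]]
    out.append((target - sum(out)) % mj)
    return out

def sample_factor(m, ell, n, rng):
    """Sample residues over the prefix m[:ell], then extend each row locally."""
    rows, ints = [], []
    for _ in range(n):
        row = [rng.randrange(mj) for mj in m[:ell]]
        share = crt_recon(m[:ell], row)            # integer share in [0, M)
        row += [share % mj for mj in m[ell:]]      # local length extension
        rows.append(row)
        ints.append(share)
    return rows, ints

def candidate(m, ell, n, rng):
    """Return (p, q, N), with N reconstructed from CRT-form shares."""
    p_rows, p_ints = sample_factor(m, ell, n, rng)
    q_rows, q_ints = sample_factor(m, ell, n, rng)
    N_rows = [[0] * len(m) for _ in range(n)]
    for j, mj in enumerate(m):
        col = multiply_shares([r[j] for r in p_rows],
                              [r[j] for r in q_rows], mj, rng)
        for i in range(n):
            N_rows[i][j] = col[i]
    N_ints = [crt_recon(m, row) for row in N_rows]  # each party's share of N
    N = sum(N_ints) % prod(m)                       # after swapping shares
    return sum(p_ints), sum(q_ints), N

m = [4, 3, 5, 7, 11, 13, 17]   # toy vector; the prefix of length 4 has M = 420
p, q, N = candidate(m, ell=4, n=2, rng=random.Random(1))
```

With two parties, \(p\) and \(q\) are each below \(2\cdot 420\), so their product is below the product of the full toy vector (1021020) and no wrapping occurs. Reconstructing with respect to the prefix alone would recover only \(p\cdot q\bmod 420\).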

Each party completes the first phase by performing a local trial division to check if \(N\) is divisible by any prime smaller than some bound B (which is a parameter of the protocol). The purpose of this step is to reduce the number of calls to \({\mathcal {F}_{\mathsf {Biprime}}}\) and thus improve efficiency.
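A minimal sketch of this local check is given below. The function names are illustrative, and a simple sieve of Eratosthenes is assumed for enumerating the primes below \(B\); the protocol description does not prescribe how the primes are obtained.

```python
from math import isqrt

def primes_below(bound):
    """All primes strictly smaller than bound, via a sieve of Eratosthenes."""
    sieve = [True] * max(bound, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(max(bound, 0)) + 1):
        if sieve[i]:
            for k in range(i * i, bound, i):
                sieve[k] = False
    return [i for i in range(bound) if sieve[i]]

def passes_trial_division(N, B):
    """True if no prime smaller than B divides the candidate N."""
    return all(N % r != 0 for r in primes_below(B))
```

A candidate that fails this check can be discarded locally, without invoking \({\mathcal {F}_{\mathsf {Biprime}}}\).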

Biprimality Test. The parties jointly execute a biprimality test, where every party inputs the candidate \(N\) and its shares \(p_i\) and \(q_i\), and receives back a biprimality indicator. This phase essentially comprises a single call to a functionality \({\mathcal {F}_{\mathsf {Biprime}}}\), which allows an adversary to force spurious negative results, but never returns false positive results. Though this phase is simple, much of the subtlety of our proof concentrates here: we show via a reduction to factoring that cheating parties have a negligible chance to pass the biprimality test if they provide wrong inputs. This eliminates the need to authenticate the inputs in any way.

Consistency Check. To achieve malicious security, the parties must ensure that none among them cheated during the previous stages in a way that might influence the result of the computation. This is what we have previously termed the retroactive consistency check. If the biprimality test indicated that \(N\) is not a biprime, then the parties use a special interface of \({\mathcal {F}_{\mathsf {AugMul}}}\) to reveal the shares they used during the protocol, and then they verify locally and independently that \(p\) and \(q\) are not both primes. If the biprimality test indicated that \(N\) is a biprime, then the parties run a secure test (again via a special interface of \({\mathcal {F}_{\mathsf {AugMul}}}\)) to ensure that length extensions of \(\mathbf {p}\) and \(\mathbf {q}\) were performed honestly. To achieve semi-honest security, this phase is unnecessary, and the protocol can end with the biprimality test.

4.2 Ideal Functionalities Used in the Protocol

Augmented Multiparty Multiplier. The augmented multiplier functionality \({\mathcal {F}_{\mathsf {AugMul}}}\) (Functionality 4.1) is a reactive functionality that operates in multiple phases and stores an internal state across calls. It is meant to help in manipulating CRT-form secret shares. It contains five basic interfaces.

  • The \(\texttt {sample}\) interface allows the parties to sample shares of non-zero multiplication triplets over small primes. That is, given a prime \(m\), the functionality receives a triplet \((x_i,y_i,z_i)\) from every corrupted party \(\mathcal{P} _i\), and then samples a triplet \((x_j,y_j,z_j)\leftarrow {\mathbb {Z}}^3_m\) for every honest \(\mathcal{P} _j\) conditioned on

    $$ \sum \limits _{i\in [n]}z_i\equiv \sum \limits _{i\in [n]}x_i\cdot \sum \limits _{i\in [n]}y_i\not \equiv 0\pmod {m} $$

    In the context of \({\pi _{\mathsf {RSAGen}}}\), this is used to sample CRT-shares of \(p\) and \(q\).

  • The \(\texttt {input}\) and \(\texttt {multiply}\) interfaces, taken together, allow the parties to load shares (with respect to some small prime modulus \(m\)) into the functionality’s memory, and later perform modular multiplication on two sets of shares that are associated with the same modulus. That is, given a prime \(m\), each party \(\mathcal{P} _i\) inputs \(x_i\) and, independently, \(y_i\), and when the parties request a product, with each corrupt party \(\mathcal{P} _j\) also supplying its own output share \(z_j\), the functionality samples a share of z from \({\mathbb {Z}}_m\) for each honest party subject to

    $$ \sum \limits _{i\in [n]}z_i\equiv \sum \limits _{i\in [n]}x_i\cdot \sum \limits _{i\in [n]}y_i\pmod {m} $$

    In the context of \({\pi _{\mathsf {RSAGen}}}\), this interface is used to perform length-extension on CRT-shares of \(p\) and \(q\).

  • The \(\texttt {check}\) interface allows the parties to securely compute a predicate over the set of stored values. In the context of \({\pi _{\mathsf {RSAGen}}}\), this is used to check that the CRT-share extension of \(p\) and \(q\) has been performed correctly, when \(N\) is a biprime.

  • The \(\texttt {open}\) interface allows the parties to retroactively reveal their inputs to one another. In the context of \({\pi _{\mathsf {RSAGen}}}\), this is used to verify the sampling procedure and biprimality test when \(N\) is not a biprime.

These five interfaces suffice for the malicious version of the protocol, and the first three alone suffice for the semi-honest version. We make a final adjustment, which leads to a substantial efficiency improvement in the protocol with which we realize \({\mathcal {F}_{\mathsf {AugMul}}}\) (which we describe in the full version of this paper  [7]). Specifically, we give the adversary an interface by which it can request that any stored value be leaked to itself, and by which it can (arbitrarily) determine the output of any call to the \(\texttt {sample}\) or \(\texttt {multiply}\) interfaces. However, if the adversary uses this interface, the functionality remembers, and informs the honest parties by aborting when the \(\texttt {check}\) or \(\texttt {open}\) interface is used.
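The sampling rule of the \(\texttt {sample}\) interface can be modelled by a short sketch. It is a toy model of the ideal functionality's behaviour for a single prime modulus, not of the protocol that realizes it; the function name `sample_triplets` and its argument layout are assumptions made for illustration, and the adversarial leak interface is not modelled.

```python
import random

def sample_triplets(m, corrupt, n_honest, rng):
    """Sample honest (x, y, z) triplets over Z_m, given the corrupt ones.

    The result satisfies sum(z) = sum(x) * sum(y) != 0 (mod m), where the
    sums range over all parties. Requires m prime and n_honest >= 1.
    """
    cx = sum(t[0] for t in corrupt)
    cy = sum(t[1] for t in corrupt)
    cz = sum(t[2] for t in corrupt)

    def nonzero_total(offset):
        # Rejection-sample honest shares until the overall sum is non-zero.
        while True:
            vals = [rng.randrange(m) for _ in range(n_honest)]
            if (offset + sum(vals)) % m != 0:
                return vals

    xs = nonzero_total(cx)
    ys = nonzero_total(cy)
    # Because m is prime, a product of two non-zero residues is non-zero.
    target = (cx + sum(xs)) * (cy + sum(ys)) % m
    zs = [rng.randrange(m) for _ in range(n_honest - 1)]
    zs.append((target - cz - sum(zs)) % m)
    return list(zip(xs, ys, zs))
```

The corrupt parties' shares are fixed first and the honest parties' shares are sampled around them, which mirrors the order in which the functionality receives and samples values.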

figure f

Biprimality Test. The biprimality-test functionality \({\mathcal {F}_{\mathsf {Biprime}}}\) (Functionality 4.2) abstracts the behavior of the biprimality test of Boneh and Franklin  [4]. The functionality receives from each party a candidate biprime \(N\), along with shares of its factors \(p\) and \(q\). It checks whether \(p\) and \(q\) are primes and whether \(N=p\cdot q\). The adversary is given an additional interface, by which it can ask the functionality to leak the honest parties’ inputs, but when this interface is used then the functionality reports to the honest parties that \(N\) is not a biprime, even if it is one.

figure g

Realizations. In the full version of this paper  [7], we discuss a protocol to realize \({\mathcal {F}_{\mathsf {AugMul}}}\), and in Sect. 5, we propose a protocol to realize \({\mathcal {F}_{\mathsf {Biprime}}}\). Both make use of generic MPC, but in such a way that no generic MPC is required unless \(N\) is a biprime.

4.3 The Protocol Itself

We refer the reader back to Sect. 4.1 for an overview of our protocol. We have mentioned that it requires a vector of coprime values, which is prefixed by the \(({\kappa },n)\)-near-primorial vector. We now give this vector a precise definition. Note that the efficiency of our protocol relies upon this vector, because we use its contents to sieve candidate primes. Since smaller numbers are more likely to be factors for the candidate primes, we choose the largest allowable set of the smallest sequential primes.

Definition 4.3

(\(({\kappa },n)\)-Compatible Parameter Set). Let \({\ell '}\) be the smallest number such that the \({\ell '}^\text {th}\) primorial number is greater than \(2^{2{\kappa }-1}\), and let \(\mathbf {m} \) be a vector of length \({\ell '}\) such that \(\mathbf {m} _1=4\) and \(\mathbf {m} _2,\ldots ,\mathbf {m} _{\ell '}\) are the odd factors of the \({\ell '}^\text {th}\) primorial number, in ascending order. \((\mathbf {m},{\ell '},\ell ,M)\) is the \(({\kappa },n)\)-compatible parameter set if \(\ell < {\ell '}\) and the prefix of \(\mathbf {m} \) of length \(\ell \) is the \(({\kappa }, n)\)-near-primorial vector per Definition 3.4, and if \(M\) is the product of this prefix.
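As an illustration of the first half of this definition, the sketch below computes \({\ell '}\) and \(\mathbf {m} \) for a given \({\kappa } \). The function name is hypothetical, and the prefix length \(\ell \) and the product \(M\) are deliberately left out, since they depend on Definition 3.4, which is not restated here.

```python
def compatible_moduli(kappa):
    """Return m = [4, 3, 5, 7, ...] of length ell', where ell' is the least
    index whose primorial exceeds 2^(2*kappa - 1)."""
    threshold = 1 << (2 * kappa - 1)
    primes = [2]
    primorial = 2
    while primorial <= threshold:
        c = primes[-1] + 1
        # primes holds every prime below c, so trial division decides primality
        while any(c % r == 0 for r in primes):
            c += 1
        primes.append(c)
        primorial *= c
    return [4] + primes[1:]   # m_1 = 4, followed by the odd prime factors
```

For example, with \({\kappa } =4\) the threshold is \(2^7=128\); the fourth primorial, 210, is the first to exceed it, so \({\ell '}=4\) and \(\mathbf {m} =(4,3,5,7)\).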

figure h

4.4 Security Sketches

We now informally argue that \({\pi _{\mathsf {RSAGen}}}\) realizes \({\mathcal {F}_{\mathsf {RSAGen}}}\) in the semi-honest and malicious settings. We give a full proof for the malicious setting in the full version of this paper  [7].

Theorem 4.5

\({\pi _{\mathsf {RSAGen}}}\) UC-realizes \({\mathcal {F}_{\mathsf {RSAGen}}}\) with perfect security in the (\({\mathcal {F}_{\mathsf {AugMul}}}\), \({\mathcal {F}_{\mathsf {Biprime}}}\))-hybrid model against a static, semi-honest adversary that corrupts up to \(n-1\) parties.

Proof Sketch

In lieu of arguing for the correctness of our protocol, we refer the reader to the explanation in Sect. 4.1, and focus here on the strategy of a simulator \(\mathcal{S} \) against a semi-honest adversary \(\mathcal{A} \) who corrupts the parties indexed by \({\mathbf {P}^*} \). \(\mathcal S\) forwards all messages between \(\mathcal{A} \) and the environment faithfully.

In Step 1 of \({\pi _{\mathsf {RSAGen}}}\), for each \(j\in [2,\ell ]\), \(\mathcal S\) receives the \(\texttt {sample}\) instruction with modulus \(\mathbf {m} _j\) on behalf of \({\mathcal {F}_{\mathsf {AugMul}}}\) from all parties indexed by \({\mathbf {P}^*} \). For each j it then samples \((\mathbf {p}_{i,j},\mathbf {q}_{i,j},\mathbf {N}_{i,j})\leftarrow {\mathbb {Z}}^3_{\mathbf {m} _j}\) uniformly for \(i\in {\mathbf {P}^*} \), and returns each triple to the appropriate party.

Step 2 involves no interaction on the part of the parties, but it is at this point that \(\mathcal S\) computes \(p_i\) and \(q_i\) for \(i\in {\mathbf {P}^*} \), in the same way that the parties themselves do. Note that since \(\mathbf {p}_{*,1}\) and \(\mathbf {q}_{*,1}\) are deterministically chosen, they are known to \(\mathcal S\). The simulator then sends these shares to \({\mathcal {F}_{\mathsf {RSAGen}}}\) via the functionality’s \(\texttt {adv-input}\) interface, and receives in return either a biprime \(N\), or two factors \(p\) and \(q\) such that \(N=p\cdot q\) is not a biprime. Regardless, it instructs \({\mathcal {F}_{\mathsf {RSAGen}}}\) to \(\texttt {proceed}\).

In Step 3 of \({\pi _{\mathsf {RSAGen}}}\), \(\mathcal S\) receives two \(\texttt {input}\) instructions from each corrupted party for each \(j\in [\ell +1,{\ell '}]\) on behalf of \({\mathcal {F}_{\mathsf {AugMul}}}\), and confirms receipt as \({\mathcal {F}_{\mathsf {AugMul}}}\) would. Subsequently, for each \(j\in [\ell +1,{\ell '}]\), the corrupt parties all send a \(\texttt {multiply}\) instruction, and then \(\mathcal S\) samples \(\mathbf {N}_{i,j}\leftarrow {\mathbb {Z}}_{\mathbf {m} _j}\) for \(i\in [n]\) subject to

$$\begin{aligned} \sum _{i\in [n]} \mathbf {N}_{i,j} \equiv N\pmod {\mathbf {m} _j} \end{aligned}$$

and returns each share to the matching corrupt party.

In Step 4 of \({\pi _{\mathsf {RSAGen}}}\), for every \(j\in [{\ell '}]\), every corrupt party \(\mathcal{P} _{i'}\) for \(i'\in {\mathbf {P}^*} \), and every honest party \(\mathcal{P} _i\) for \(i\in [n]\setminus {\mathbf {P}^*} \), \(\mathcal S\) sends \(\mathbf {N}_{i,j}\) to \(\mathcal{P} _{i'}\) on behalf of \(\mathcal{P} _i\), and receives \(\mathbf {N}_{i',j}\) (which it already knows) in reply.

To simulate the final steps of \({\pi _{\mathsf {RSAGen}}}\), \(\mathcal S\) tries to divide \(N\) by all primes smaller than B. If it succeeds, then the protocol is complete. Otherwise, it receives check-biprimality from all of the corrupt parties on behalf of \({\mathcal {F}_{\mathsf {Biprime}}}\), and replies with \(\texttt {biprime}\) or not-biprime as appropriate. It can be verified by inspection that the view of the environment is identically distributed in the ideal-world experiment containing \(\mathcal S\) and honest parties that interact with \({\mathcal {F}_{\mathsf {RSAGen}}}\), and the real-world experiment containing \(\mathcal A\) and parties running \({\pi _{\mathsf {RSAGen}}}\).    \(\square \)

Theorem 4.6

If factoring biprimes sampled by \({\mathsf {BFGM}}\) is hard, then \({\pi _{\mathsf {RSAGen}}}\) UC-realizes \({\mathcal {F}_{\mathsf {RSAGen}}}\) in the (\({\mathcal {F}_{\mathsf {AugMul}}}\), \({\mathcal {F}_{\mathsf {Biprime}}}\))-hybrid model against a static, malicious PPT adversary that corrupts up to \(n-1\) parties.

Proof Sketch

We observe that if the adversary simply follows the specification of the protocol and does not cheat in its inputs to \({\mathcal {F}_{\mathsf {AugMul}}}\) or \({\mathcal {F}_{\mathsf {Biprime}}}\), then the simulator can follow the same strategy as in the semi-honest case. If at any point the adversary deviates from the protocol, the simulator requests that \({\mathcal {F}_{\mathsf {RSAGen}}}\) reveal all honest parties’ shares, and thereafter the simulator uses them by effectively running the code of the honest parties. This matches the adversary’s view in the real protocol as far as the distribution of the honest parties’ shares is concerned.

It remains to be argued that any deviation from the protocol specification will also result in an abort in the real world with honest parties, and will additionally be recognized by the honest parties as an adversarially induced cheat (as opposed to a statistical sampling failure). Note that the honest parties must only detect cheating when \(N\) is truly a biprime and the adversary has sabotaged a successful candidate; if \(N\) is not a biprime and would have been rejected anyway, then cheat-detection is unimportant. We analyze all possible cases where the adversary deviates from the protocol below. Let \(N\) be defined as the value implied by parties’ sampled shares in Step 1 of \({\pi _{\mathsf {RSAGen}}}\).

Case 1: \(N\) is a non-biprime and reconstructed correctly. In this case, \({\mathcal {F}_{\mathsf {Biprime}}}\) will always reject \(N\) as there exist no satisfying inputs (i.e., there are no two prime factors \(p,q\) such that \(p\cdot q=N\)).

Case 2: \(N\) is a non-biprime and reconstructed incorrectly as \(N'\). If by fluke \(N'\) happens to be a biprime then the incorrect reconstruction will be caught by the explicit secure predicate check during the consistency-check phase. If \(N'\) is a non-biprime then the argument from the previous case applies.

Case 3: \(N\) is a biprime and reconstructed correctly. If consistent inputs are used for the biprimality test and nobody cheats, the candidate \(N\) is successfully accepted (this case essentially corresponds to the semi-honest case). Otherwise, if inconsistent inputs are used for the biprimality test, one of the following events will occur:

  • \({\mathcal {F}_{\mathsf {Biprime}}}\) rejects this candidate. In this case, all parties reveal their shares of \(p\) and \(q\) to one another (with guaranteed correctness via \({\mathcal {F}_{\mathsf {AugMul}}}\)) and locally test their primality. This will reveal that \(N\) was a biprime, and that \({\mathcal {F}_{\mathsf {Biprime}}}\) must have been supplied with inconsistent inputs, implying that some party has cheated.

  • \({\mathcal {F}_{\mathsf {Biprime}}}\) accepts this candidate. This case occurs with negligible probability (assuming factoring is hard). Because \(N\) only has two factors, there is exactly one pair of inputs that the adversary can supply to \({\mathcal {F}_{\mathsf {Biprime}}}\) to induce this scenario, apart from the pair specified by the protocol. In our full proof (see the full version  [7] of this paper) we show that finding this alternative pair of satisfying inputs implies factoring \(N\). We are careful to rely on the hardness of factoring only in this case, where by premise \(N\) is a biprime with \({\kappa } \)-bit factors (i.e., an instance of the factoring problem).

Case 4: \(N\) is a biprime and reconstructed incorrectly as \(N'\). If \(N'\) is a biprime then the incorrect reconstruction will be caught during the consistency-check phase, just as when \(N\) is a non-biprime (Case 2). If \(N'\) is a non-biprime then it will be rejected by \({\mathcal {F}_{\mathsf {Biprime}}}\), inducing all parties to reveal their shares and find that their shares do not in fact reconstruct to \(N'\), with the implication that some party has cheated.

Thus the adversary is always caught when trying to sabotage a true biprime, and it can never sneak a non-biprime past the consistency check. Because the real-world protocol always aborts in the case of cheating, it is indistinguishable from the simulation described above, assuming that factoring is hard.    \(\square \)

5 Distributed Biprimality Testing

In the semi-honest setting, \({\mathcal {F}_{\mathsf {Biprime}}}\) can be realized by the biprimality-testing protocol of Boneh and Franklin  [4]. We discuss this in the full version  [7] of this paper. The following lemma follows immediately from their work.

Lemma 5.1

The biprimality-testing protocol described by Boneh and Franklin  [4] UC-realizes \({\mathcal {F}_{\mathsf {Biprime}}}\) with statistical security in the \({\mathcal {F}_{\mathsf {ComCompute}}} \)-hybrid model against a static, semi-honest adversary who corrupts up to \(n-1\) parties.
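For intuition, one iteration of a Boneh-Franklin-style test on additive shares can be sketched as follows. This is a toy version computed in the clear, under the assumptions that \(p_1\equiv q_1\equiv 3\pmod 4\) and that all other shares are \(0\pmod 4\); the function names and the way the exponent is split among the parties are illustrative choices, and the additional GCD test is omitted.

```python
import random

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def sample_gamma(N, rng):
    """Sample gamma in Z_N with Jacobi symbol +1."""
    while True:
        g = rng.randrange(1, N)
        if jacobi(g, N) == 1:
            return g

def bf_iteration(N, p_shares, q_shares, gamma):
    """Check gamma^((N - p - q + 1) / 4) = +-1 (mod N) share-wise.

    Party 1 contributes gamma^((N - p_1 - q_1 + 1) / 4); every other party
    contributes gamma^((p_i + q_i) / 4). The test passes if the first value
    equals plus or minus the product of the others.
    """
    v1 = pow(gamma, (N - p_shares[0] - q_shares[0] + 1) // 4, N)
    rest = 1
    for p_i, q_i in zip(p_shares[1:], q_shares[1:]):
        rest = rest * pow(gamma, (p_i + q_i) // 4, N) % N
    return v1 == rest or v1 == (-rest) % N
```

For \(N=77=7\cdot 11\) with shares \(7=3+4\) and \(11=7+4\), every \(\gamma \) of Jacobi symbol 1 passes. For the non-biprime \(N=105=15\cdot 7\) with shares \(15=3+12\) and \(7=3+4\), the choice \(\gamma =2\) already fails.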

5.1 The Malicious Setting

In contrast to the semi-honest setting, we permit a malicious adversary to force a true biprime to fail our biprimality test, and we detect such behavior using independent mechanisms in the \({\pi _{\mathsf {RSAGen}}}\) protocol. However, we must ensure that a non-biprime can never pass the test with more than negligible probability. To achieve this, we use a derivative of the biprimality-testing protocol of Frederiksen et al.  [16]; relative to their protocol, ours is simpler, and we prove that it UC-realizes \({\mathcal {F}_{\mathsf {Biprime}}}\).

The protocol essentially comprises a randomized version of the semi-honest Boneh-Franklin test described previously, followed by a Schnorr-like protocol to verify that the test was performed correctly. The soundness error of the underlying biprimality test is compounded by the Schnorr-like protocol’s soundness error to yield a combined error of 3/4; this necessitates an increase in the number of iterations by a factor of \(\log _{4/3}(2)<2.5\). While this is sufficient to ensure the test itself is carried out honestly, it does not ensure the correct inputs are used. Consequently, generic MPC is used to verify the relationship between the messages involved in the Schnorr-like protocol and the true candidate given by \(N\) and shares of its factors. As a side effect, this generic computation samples \(r\leftarrow {\mathbb {Z}}_N\) and outputs \(z=r\cdot (p\,+\,q\,-\,1)\bmod {N}\) so that the GCD test can afterward be run locally by each party.
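The iteration count can be checked with a short calculation. Writing \(t\) for the number of iterations (a symbol introduced here only for illustration) and \({s} \) for the statistical parameter, a per-iteration error of 3/4 gives

$$\begin{aligned} (3/4)^{t}\le 2^{-{s}} \iff t\ge {s} \cdot \log _{4/3}(2)={s} \cdot \frac{\ln 2}{\ln (4/3)}\approx 2.41\,{s} \end{aligned}$$

so \(t=2.5{s} \) iterations suffice, since \((3/4)^{2.5}\approx 0.487<1/2\). By comparison, a per-iteration error of 1/2 would require only \({s} \) iterations.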

Our protocol makes use of a number of subfunctionalities, all of which are standard and described in the full version of this paper  [7]. Namely, we use a coin-tossing functionality \({\mathcal {F}_{\mathsf {CT}}}\) to uniformly sample an element from some set, the one-to-many commitment functionality \({\mathcal {F}_{\mathsf {Com}}}\), the generic MPC functionality over committed inputs \({\mathcal {F}_{\mathsf {ComCompute}}}\), and the integer-sharing-of-zero functionality \({\mathcal {F}_{\mathsf {Zero}}}\). In addition, the protocol uses the algorithm \({\mathsf {VerifyBiprime}}\) (Algorithm 5.3).

figure i

Below we present the algorithm \({\mathsf {VerifyBiprime}}\) that is used for the GCD test. The inputs are the candidate biprime \(N\), an integer \(M\) (the bound on the shares’ size), a bit-vector \(\mathbf {c}\) of length \(2.5{s} \), and for each \(i\in [n]\) a tuple consisting of the shares \(p_i\) and \(q_i\) with the Schnorr-like messages \(\mathbf {\tau }_{i,*}\) and \(\mathbf {\zeta }_{i,*}\) generated by \(\mathcal{P} _i\). The algorithm verifies that all input values are compatible, and returns \(z=r\cdot (p+q-1) \bmod N\) for a random r.

figure j

Theorem 5.4

\({\pi _{\mathsf {Biprime}}}\) UC-realizes \({\mathcal {F}_{\mathsf {Biprime}}}\) in the \(({\mathcal {F}_{\mathsf {Com}}}, {\mathcal {F}_{\mathsf {ComCompute}}}, {\mathcal {F}_{\mathsf {CT}}}, {\mathcal {F}_{\mathsf {Zero}}})\)-hybrid model with statistical security against a static, malicious adversary that corrupts up to \(n-1\) parties.

Proof Sketch

Our simulator \(\mathcal{S} \) for \({\mathcal {F}_{\mathsf {Biprime}}}\) receives \(N\) as common input. Let \({\mathbf {P}^*} \) and \([n]\setminus {\mathbf {P}^*} \) index the corrupt and honest parties, respectively. To simulate Steps 1 through 3 of \({\pi _{\mathsf {Biprime}}}\), \(\mathcal S\) simply behaves as \({\mathcal {F}_{\mathsf {CT}}}\), \({\mathcal {F}_{\mathsf {Zero}}}\), and \({\mathcal {F}_{\mathsf {ComCompute}}}\) would in its interactions with the corrupt parties on their behalf, remembering the values received and transmitted. Before continuing, \(\mathcal S\) submits the corrupted parties’ shares of \(p\) and \(q\) to \({\mathcal {F}_{\mathsf {Biprime}}}\) on their behalf. In response, \({\mathcal {F}_{\mathsf {Biprime}}}\) either informs \(\mathcal S\) that \(N\) is a biprime, or leaks the honest parties’ shares. In Step 4, \(\mathcal S\) again behaves exactly as \({\mathcal {F}_{\mathsf {Com}}}\) would. During the remainder of the protocol, the simulator must follow one of two different strategies, conditioned on whether or not \(N\) is a biprime. We will show that both strategies lead to a simulation that is statistically indistinguishable from the real-world experiment.

  • If \({\mathcal {F}_{\mathsf {Biprime}}}\) reported that \(N\) is a biprime, then we know by the specification of \({\mathcal {F}_{\mathsf {Biprime}}}\) that the corrupt parties committed to correct shares of \(p\) and \(q\) in Step 1 of \({\pi _{\mathsf {Biprime}}}\). Boneh and Franklin  [4] showed that the value (i.e., sign) of the right-hand side of the equality in Step 7 is predictable and related to the value of \(\mathbf {\gamma }_j\). We refer to them for a precise description and proof. If without loss of generality we take that value to be 1, then \(\mathcal S\) can simulate iteration j of Steps 6 and 7 as follows. First, \(\mathcal S\) computes \(\hat{\mathbf {\chi }}_{i,j}\) for \(i\in {\mathbf {P}^*} \) to be the corrupt parties’ ideal values of \(\mathbf {\chi }_{i,j}\) as defined in Step 4 of \({\pi _{\mathsf {Biprime}}}\). Then, \(\mathcal S\) samples \(\mathbf {\chi }_{i,j}\leftarrow {\mathbb {Z}}^*_N\) uniformly for \(i\in [n]\setminus {\mathbf {P}^*} \) subject to

    $$ \mathbf {\gamma }_j^{(N - 5) / 4} \cdot \prod \limits _{i\in [n]\setminus {\mathbf {P}^*}} \mathbf {\chi }_{i,j}\cdot \prod \limits _{i\in {\mathbf {P}^*}} \hat{\mathbf {\chi }}_{i,j} \equiv 1\pmod {N} $$

    and simulates Step 6 by releasing \(\mathbf {\chi }_{i,j}\) for \(i\in [n]\setminus {\mathbf {P}^*} \) to the corrupt parties on behalf of \({\mathcal {F}_{\mathsf {Com}}}\). These values are statistically close to their counterparts in the real protocol. Finally, \(\mathcal S\) simulates Step 7 by running the test for itself and sending the \(\texttt {cheat}\) command to \({\mathcal {F}_{\mathsf {Biprime}}}\) on failure.

    Given the information now known to \(\mathcal S\), Steps 8 through 11 of \({\pi _{\mathsf {Biprime}}}\) can be simulated in a manner similar to the simulation of a common Schnorr protocol: \(\mathcal S\) simply chooses \(\mathbf {\zeta }_{i,*}\leftarrow {\mathbb {Z}}^{2.5{s}}_{M\cdot 2^{{s} + 1}}\) uniformly for \(i\in [n]\setminus {\mathbf {P}^*} \), fixes \(\mathbf {c}\leftarrow \{0,1\}^{2.5{s}}\) ahead of time, and then works backwards via the equation in Step 11 to compute the values of \(\mathbf {\alpha }_{i,*}\) for \(i\in [n]\setminus {\mathbf {P}^*} \) that it must send on behalf of the honest parties in Step 8. These values are statistically close to their counterparts in the real protocol.

    \(\mathcal S\) finally simulates the remaining steps of \({\pi _{\mathsf {Biprime}}}\) by checking the \({\mathsf {VerifyBiprime}}\) predicate itself (since the final GCD test is purely local, no action need be taken by \(\mathcal S\)). If at any point after Step 4 the corrupt parties have cheated (i.e., sent an unexpected value or violated the \({\mathsf {VerifyBiprime}}\) predicate), then \(\mathcal S\) sends the \(\texttt {cheat}\) command to \({\mathcal {F}_{\mathsf {Biprime}}}\). Otherwise, it sends the \(\texttt {proceed}\) command to \({\mathcal {F}_{\mathsf {Biprime}}}\), completing the simulation.

  • If \({\mathcal {F}_{\mathsf {Biprime}}}\) reported that \(N\) is not a biprime (which may indicate that the corrupt parties supplied incorrect shares of \(p\) or \(q\)), then it also leaked the honest parties’ shares of \(p\) and \(q\) to \(\mathcal S\). Thus, \(\mathcal S\) can simulate Steps 4 through 13 of \({\pi _{\mathsf {Biprime}}}\) by running the honest parties’ code on their behalf. In all instances of the ideal-world experiment, the honest parties report to the environment that \(N\) is a non-biprime. Thus, we need only prove that there is no strategy by which the corrupt parties can successfully convince the honest parties that \(N\) is a biprime in the real world. In order to get away with such a real-world cheat, the adversary must cheat in every iteration j of Steps 4 through 6 for which

    $$ \mathbf {\gamma }_j^{(N-p-q+1)/4}\not \equiv \pm 1\pmod {N} $$

    Specifically, in every such iteration j, the corrupt parties must contrive to send values \(\mathbf {\chi }_{i,j}\) for \(i\in {\mathbf {P}^*} \) such that

    $$ \mathbf {\gamma }_j^{(N - 5) / 4} \cdot \prod \limits _{i\in [n]} \mathbf {\chi }_{i,j} \equiv \mathbf {\gamma }_j^{(N-p-q+1)/4 + \mathbf {\Delta }_{1,j}} \equiv \pm 1\pmod {N} $$

    for some nonzero offset value \(\mathbf {\Delta }_{1,j}\). We can define a similar offset \(\mathbf {\Delta }_{2,j}\) for the corrupt parties’ transmitted values of \(\mathbf {\alpha }_{i,j}\), relative to the values of \(\mathbf {\tau }_{i,j}\) committed in Step 1:

    $$ \mathbf {\gamma }_j^{\mathbf {\Delta }_{2,j}}\cdot \prod \limits _{i\in [n]} \mathbf {\alpha }_{i,j} \equiv \prod \limits _{i\in [n]}\mathbf {\gamma }_j^{\mathbf {\tau }_{i,j}}\pmod {N} $$

    Since we have presupposed that the protocol outputs \(\texttt {biprime}\), we know that the corrupt parties must transmit correctly calculated values of \(\mathbf {\zeta }_{i,*}\) in Step 10 of \({\pi _{\mathsf {Biprime}}}\), or else Step 12 would output non-biprime when these values are checked by the \({\mathsf {VerifyBiprime}}\) predicate. It follows from this fact and from the equation in Step 11 that \(\mathbf {\Delta }_{2,j} \equiv \mathbf {c}_j\cdot \mathbf {\Delta }_{1,j} \pmod {\varphi (N)}\), where \(\varphi (\cdot )\) is Euler’s totient function. However, both \(\mathbf {\Delta }_{1,*}\) and \(\mathbf {\Delta }_{2,*}\) are fixed before \(\mathbf {c}\) is revealed to the corrupt parties, and so the adversary can succeed in this cheat with probability at most 1/2 for any individual iteration j. Per Boneh and Franklin  [4, Lemma 4.1], a particular iteration j of Steps 4 through 6 of \({\pi _{\mathsf {Biprime}}}\) produces a false positive result with probability at most 1/2 if the adversary behaves honestly. If we assume that the adversary cheats always and only when a false positive would not have been produced by honest behavior, then the total probability of an adversary producing a positive outcome in the \(j\)th iteration of Steps 4 through 6 is upper-bounded by 3/4. The probability that an adversary succeeds over all \(2.5{s} \) iterations is therefore at most \((3/4)^{2.5{s}}<2^{-{s}}\). Thus, the adversary has a negligible chance to force the acceptance of a non-biprime in the real world, and the distribution of outcomes produced by \(\mathcal S\) is statistically indistinguishable from the real-world distribution.   \(\square \)