1 Introduction

Elliptic curve cryptosystems (ECC) are widely used in cryptographic devices such as smart cards. When ECC is implemented in a device, we must analyze not only its mathematical security but also its resistance to physical attacks such as fault attacks (FA). Many FA results against ECC have been published [1–3], especially against the elliptic curve digital signature algorithm (ECDSA) [4–6]. Among them, the lattice-based fault attack (LFA) is one of the most effective. It combines a fault attack with a lattice attack (LA): first, FA reveals some information about the nonce k used in a signature; next, LA uses the leaked information about k to disclose the private key \(d_A\). LAs against (EC)DSA-like signature algorithms fall into three types. The first type [7–9] assumes that part of each nonce k is known. The second type [10] assumes that each nonce k contains a few identical blocks. The last type [11] assumes that different nonces share some of their bits. Nevertheless, it seems that only the first type of LA has been combined with FA successfully against ECDSA-like signature algorithms in practice [4, 6, 12]; no practical FA appears to have been combined with the other two types. It is therefore worthwhile to study this kind of FA, and the corresponding countermeasures, on ECDSA-like signatures such as the SM2 signature algorithm.

The SM2 signature algorithm (hereafter SM2) is an elliptic-curve signature standard published by the Chinese government [13], and it is used extensively in cryptographic devices in finance. The first type of LA, based on knowing parts of k, was applied to SM2 at INSCRYPT 2014 [9].

Our Contributions. We present a practical LFA against SM2 based on the LA condition that some bits are shared between different nonces, and we mount the attack successfully on a smart card. To our knowledge, this is the first time that FA has been combined with the last type of LA (nonces sharing some bits) against SM2 in practice. We first use a practical laser FA to make the chip skip the instructions that write nonces into RAM, so that the nonces in SM2 share some of their bits. Next, based on the faulty results of this FA, we build the LA model proposed at SAC 2013 [11] and recover the private key \(d_A\). Finally, we propose a new countermeasure for SM2 that resists LFA by directly destroying the condition of LA, and we prove its security against LFA. Even if some information about the nonce k is leaked, the countermeasure still resists LA.

The remainder of the paper is organized as follows. Sect. 2 gives a brief introduction to SM2 and the basic theory of lattices. Sect. 3 describes the practical LFA against SM2. Sect. 4 presents the countermeasure to resist LFA. Finally, Sect. 5 concludes the paper.

2 Preliminaries

2.1 SM2 Signature Algorithm

For simplicity, we only analyze the elliptic curve \(E(a,b)\) over the prime finite field \(F_p\) defined by the Weierstrass equation \(y^2=x^3+ax+b\text { mod }p\), where \(a,b\in F_p\) and \(4a^3+27b^2\ne 0\text { mod }p\). The set of points on \(E(a,b)\) together with the point at infinity \(\mathcal {O}\) constitutes an additive group \(E(F_p)\). The scalar multiplication (SM) \(Q=kG\) is the most important operation in \(E(F_p)\), where \(G,Q\in E(F_p)\) and \(k\in \mathbb {Z}\). A detailed introduction to SM on \(E(F_p)\) can be found in [14].

In SM2, the curve parameters a, b, p and the base point \(G\in E(F_p)\) with order n are all given. The private key \(d_A\) is randomly selected in the interval \([1,n-1]\) and the corresponding public key \(P_A\) satisfies \(P_A=d_AG\).

Signature: sign message M with private key \(d_A\).

  1. Compute \(e = SHA\left( Z_A||M \right) \), where SHA(.) is the hash algorithm SM3 and \(Z_A\) is the public user information;
  2. Select \(k \in \left[ {1,n - 1} \right] \) randomly;
  3. Compute \(Q({x_1},{y_1})=kG\);
  4. Compute \(r = {e +x_1}\text { mod }n\). If \(r = 0\) or \(r+k=n\) then goto step 2;
  5. Compute \(s=(1+d_A)^{-1}(k-rd_A)\text { mod }n\). If \(s = 0\) then goto step 2;
  6. Return the result \((r,s)\).

Verification: verify \((M',r',s')\) with public key \(P_A\).

  1. If \(r'\) or \(s'\notin [1,n-1]\) then return false;
  2. Compute \(e'=SHA(Z_A||M')\);
  3. Compute \(t=r'+s'\text { mod }n\). If \(t=0\) then return false;
  4. Compute \((x'_1,y'_1)=s'G+tP_A\);
  5. Compute \(R=e'+x'_1\text { mod }n\). If \(R=r'\) then return true, else return false.
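The signing and verification steps above can be sketched in Python on a toy curve. This is only an illustration under stated assumptions, not the implementation used in the experiments: the curve \(y^2=x^3+2x+2\) over \(F_{17}\) with base point (5, 1) of order 19 is a textbook example rather than the SM2 curve, SHA-256 stands in for SM3, and the helper names (`ec_add`, `ec_mul`, `digest`, `sign`, `verify`) are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (assumption): y^2 = x^3 + 2x + 2 over F_17, NOT the SM2 curve.
P, A, B = 17, 2, 2
G = (5, 1)   # base point
N = 19       # prime order of G

def ec_add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # p2 == -p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def ec_mul(k, pt):
    """Double-and-add scalar multiplication kP."""
    acc = None
    while k > 0:
        if k & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

def digest(z_a, msg):
    # Assumption: SHA-256 replaces SM3 purely for illustration.
    return int.from_bytes(hashlib.sha256(z_a + msg).digest(), "big")

def sign(d_a, z_a, msg):
    e = digest(z_a, msg)                              # step 1
    while True:
        k = secrets.randbelow(N - 1) + 1              # step 2: k in [1, n-1]
        x1, _ = ec_mul(k, G)                          # step 3
        r = (e + x1) % N                              # step 4
        if r == 0 or r + k == N:
            continue
        s = pow(1 + d_a, -1, N) * (k - r * d_a) % N   # step 5
        if s != 0:
            return r, s                               # step 6

def verify(p_a, z_a, msg, r, s):
    if not (1 <= r < N and 1 <= s < N):
        return False
    t = (r + s) % N
    if t == 0:
        return False
    pt = ec_add(ec_mul(s, G), ec_mul(t, p_a))         # s'G + tP_A
    if pt is None:
        return False
    return (digest(z_a, msg) + pt[0]) % N == r

d_a = 7                        # toy private key; 1 + d_a must be invertible mod n
p_a = ec_mul(d_a, G)
r, s = sign(d_a, b"ZA", b"message")
print(verify(p_a, b"ZA", b"message", r, s))
```

Verification accepts an honest signature because \(s'G+tP_A=(s(1+d_A)+rd_A)G=kG\). With such a small group order the sketch offers no security; it only mirrors the control flow of the two algorithms.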

2.2 Lattice Attack Basis

Suppose that the vectors \({\varvec{b}}_1, {\varvec{b}}_2,\ldots ,{\varvec{b}}_N\in \mathbb {Z}^n\) are linearly independent. Let \( L=\{ \sum \limits _{i = 1}^N {{x_i}{{\varvec{b}}_i}|} {x_1}, \ldots ,{x_N} \in \mathbb {Z}\}\); then L is called the integer lattice generated by the \({\varvec{b}}_i\)’s, and the vector set \(B=\{{\varvec{b}}_1,\ldots ,{\varvec{b}}_N\}\) is a basis of L. Let matrix \(A=({\varvec{b}}_1,\ldots ,{\varvec{b}}_N)^T\); then for any vector \(w\in L\), there exists \({\varvec{x}}=({x_1}, \ldots ,{x_N})\in \mathbb {Z}^N\) satisfying \(w={\varvec{x}}A\).

The closest vector problem (CVP): given a basis B of L and a vector \({\varvec{u}}\in \mathbb {Z}^n\), find a lattice vector \({\varvec{v}}\in L\) satisfying \( \left\| {{\varvec{v}} - {\varvec{u}}} \right\| =\lambda \left( {L,{\varvec{u}}} \right) \), where \(\lambda \left( {L,{\varvec{u}}} \right) \) is the minimum distance between L and \({\varvec{u}}\). An approximate solution to CVP can be found in polynomial time by combining the LLL algorithm [15] with Babai’s Nearest Plane algorithm [16]. Moreover, as presented in [7, 16], it has been proved that if the unknown lattice vector \({\varvec{v}}\in L\) and a nonzero vector \({\varvec{u}}\in \mathbb {Z}^n\) satisfy \(|| {{\varvec{v}} - {\varvec{u}}} ||^2 \le c_1 c_2 \varDelta (A)^{2/N}\), then \({\varvec{v}}\) can be determined uniquely in polynomial time as a CVP. Here \(c_1 \approx 1\), \(1<c_2\le N\), and \(\varDelta (A)\) is the determinant of matrix A.

3 Lattice-Based Fault Attack on SM2

In this section, we describe the procedure of the lattice-based fault attack against SM2. First, while the smart card computes signatures repeatedly, we mount a laser fault attack (FA) on it so that different nonces share some bits. Next, based on the faulty signatures obtained from FA, we build the model of the last type of lattice attack (LA) and recover the private key \(d_A\) with known LA tools.

3.1 Experimental Condition

In the experiment, SM2 is implemented in a smart card without any countermeasures. The CPU frequency is 14 MHz and the bus width is 32 bits. SM2 is implemented in a combination of hardware and software with a key length of 256 bits. As shown in Fig. 1, the CPU executes the instructions of the SM2 algorithm stored in EEPROM, with the help of a coprocessor that supports modular addition, reduction and multiplication of big numbers. The random number generator generates random numbers and sends them to RAM. In addition, as shown in Fig. 2, we use the laser attack platform from Riscure in the experiments. Finally, the LA is performed on a computer with an Intel Core i7-3770 at 3.4 GHz.

Fig. 1. The construction of the smart card chip

Fig. 2. The laser attack platform for FA

3.2 Fault Attack Against SM2

During the execution of SM2 in the chip, an obvious peak appears in the power consumption curve at the moment when the nonces are written into RAM through the bus. As shown in Fig. 3, the part of the power consumption curve in the red box is easy to distinguish; it corresponds to two operations, namely generating the random numbers and storing them into RAM.

Fig. 3. The power consumption curve when the random numbers are generated and transferred (Color figure online)

Fig. 4. The right position in EEPROM for laser fault attack (Color figure online)

A 256-bit nonce k is stored in big-endian order and generated 32 bits at a time, so eight generations are needed. After each 32-bit random number is generated, it is transferred into RAM through the 32-bit bus, so we can mount the laser FA at that moment. We use the laser attack platform to induce faults at the time the random numbers are about to be written into RAM, so that the write instructions are skipped. As a result, the newly generated random numbers are not written into RAM, and the corresponding block of k keeps the value already stored in RAM, which is either the block of the nonce from the previous signature or the initial value in RAM. This implies that some bits are shared between different nonces, although their values remain unknown.

In addition, to determine the right position for the attack in the chip, we use the laser attack platform to scan all areas of the chip. Since we cannot judge directly whether the nonces contain the faults we want, we first import a known private key into the chip, which lets us derive the values of all the nonces from the faulty signature results. As shown in Fig. 4, according to the derived nonce values, the right position is located in the red section at the edge of the EEPROM. Given proper parameters such as glitch length, laser intensity and laser duration, the success rate for obtaining shared bits between nonces at this position is approximately 100 %. Because k is stored in big-endian order, we find that the shared bits are the most significant bits (MSBs) of the nonces and that the number l of shared bits is a multiple of 32. After that, we import an unknown private key for the real experiments. With the injection position and time determined, the fault attack can be mounted against many signatures without interruption. Finally, we obtain 50 consecutive faulty signature results \((r_i,s_i)\)(\(i=0,\ldots ,N\)).
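The effect of the skipped write instructions can be modelled with a short simulation. It is a simplified sketch rather than the card's firmware: the function `generate_nonce`, the `skipped_writes` parameter and the choice of faulting the leading (most significant) words are assumptions made for illustration.

```python
import secrets

WORDS = 8        # a 256-bit nonce is stored as eight 32-bit words
WORD_BITS = 32

def generate_nonce(ram, skipped_writes=0):
    """Refresh the nonce buffer `ram` in place, most significant word first.

    `skipped_writes` models the laser fault: the first that many store
    instructions are skipped, so those words keep their previous contents.
    """
    for idx in range(WORDS):
        word = secrets.randbits(WORD_BITS)   # RNG output for this block
        if idx < skipped_writes:
            continue                          # faulted store: RAM unchanged
        ram[idx] = word
    return int.from_bytes(b"".join(w.to_bytes(4, "big") for w in ram), "big")

ram = [0] * WORDS
k0 = generate_nonce(ram)                      # fault-free signature
k1 = generate_nonce(ram, skipped_writes=6)    # six stores skipped, l = 192
shared = 6 * WORD_BITS
print(k0 >> (256 - shared) == k1 >> (256 - shared))
```

In this model the attacker never learns the shared value itself, only that consecutive nonces agree on their top \(l=32\cdot\)`skipped_writes` bits, which is the condition the lattice model needs.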

3.3 Model of Lattice Attack Against SM2

As mentioned before, we have \(N+1\) (\(N=49\)) faulty signature results \(({{r_i},{s_i}})\) (\(i=0,\ldots ,N\)). Knowing that at least l MSBs of all the nonces are the same, we can build the LA model presented in [11]. Let \(k_{i}\) and a denote the nonce in the i-th signature and the shared l MSBs of all the nonces, respectively; then \(k_i = a{2^{m-l}}+{b_i}\) (\(i=0,\ldots ,N\)) and \(0 < a < {2^l}\). Here m (\(m=256\)) is the key length of SM2 and \({b_i}\) is the remainder of \({k_i}\), satisfying \(0 < {b_i} <2^{m-l}\). For \(i=0,\ldots ,N\), substituting \(k_i= a{2^{m-l}}+{b_i}\) into step 5 of the SM2 signature gives \(N+1\) equations. Subtracting the 0-th equation from each of the others gives the following equations.

$$\begin{aligned} (s_i+r_i-s_0-r_0)d_A-(s_0-s_i)=b_i-b_0\text { mod }n (i=1,\ldots ,N) \end{aligned}$$
(1)

Since \(0 < {b_i},{b_0} < 2^{m-l}\), apparently \(0 < |{b_i} - {b_0}| < 2^{m-l}\). Let \(\varDelta {b_i} = \left| {{b_i} - {b_0}} \right| \), \(\varDelta {t_i} = (s_i+r_i-s_0-r_0)\text { mod }n\) and \(\varDelta {u_i} =(s_0-s_i)\text { mod }n\), then the above equations are written as

$$\begin{aligned} 0 < \left| {\varDelta {t_i}{d_A} - \varDelta {u_i} + n{h_i}} \right| = \varDelta {b_i} < 2^{m-l}(i=1,\ldots ,N). \end{aligned}$$
(2)

where \({h_i} \in \mathbb Z\) is the smallest integer which makes the above equations true.

Let matrix \(A = \left( {\begin{array}{*{20}{c}} 1&{}{2^{l}\varDelta {t_1}}&{} \cdots &{}{2^{l}\varDelta {t_N}}\\ 0&{}2^{l}n&{} \cdots &{}0\\ \vdots &{}\vdots &{} \ddots &{} \vdots \\ 0&{} 0 &{} \cdots &{}2^{l}n \end{array}} \right) \); then the row vectors \({\varvec{b}}_0,\ldots ,{\varvec{b}}_N\) of A generate a lattice L, where \(A=({\varvec{b}}_0,\ldots ,{\varvec{b}}_N)^T\). For \({\varvec{x}}=(d_A,h_1,\ldots ,h_N)\in \mathbb Z^{N+1}\), \({\varvec{v}}={\varvec{x}}A=(d_A,d_A 2^{l}\varDelta {t_1}+h_12^{l}n,\ldots , d_A 2^{l}\varDelta {t_N}+h_N2^{l}n)\) is a nonzero lattice vector in L. Let vector \({\varvec{u}}=(0,2^{l}\varDelta {u_1},\ldots ,2^{l}\varDelta {u_N} )\in \mathbb Z^{N+1}\); then the above inequalities can be rewritten as \(||{\varvec{v}} - {\varvec{u}}|| \le 2^{m}\sqrt{N+1}\). As mentioned in Sect. 2, if \(2^{m}\sqrt{N+1} \le \sqrt{c_1c_2}(2^{lN}n^N)^{1/(N+1)}\) with \(2^{m-1}<n<2^{m}\), i.e., \(N > m/(l-1)\), then vector \({\varvec{v}}\) can be determined uniquely by solving CVP [7], where \(c_1=N+1\) and \(c_2=1\). The private key \(d_A\) can then be read from the first coordinate of \({\varvec{v}}\).
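The relations behind this lattice can be checked numerically without running a lattice reduction. The sketch below is a sanity check under stated assumptions, not the attack itself: the modulus \(2^{256}-189\) is a stand-in 256-bit prime rather than the SM2 group order, each \(r_i\) is drawn at random instead of being derived from a curve point, and the function name `shared_msb_instance` is hypothetical.

```python
import secrets

M_BITS = 256
MODULUS = 2**256 - 189   # assumption: stand-in 256-bit prime, NOT the SM2 order n

def centered(x, n):
    """Representative of x mod n in (-n/2, n/2]."""
    x %= n
    return x - n if x > n // 2 else x

def shared_msb_instance(l, N, n=MODULUS, m=M_BITS):
    """Simulate N+1 signatures whose nonces share l MSBs and check Eqs. (1)-(2)."""
    d_a = secrets.randbelow(n - 2) + 1            # private key in [1, n-2]
    inv = pow(1 + d_a, -1, n)
    a = secrets.randbits(l)                       # shared (unknown) MSBs
    sigs, b = [], []
    for _ in range(N + 1):
        b_i = secrets.randbits(m - l)
        k_i = (a << (m - l)) + b_i
        r_i = secrets.randbelow(n - 1) + 1        # assumption: random stand-in for r
        s_i = inv * (k_i - r_i * d_a) % n         # step 5 of the signature
        sigs.append((r_i, s_i))
        b.append(b_i)
    r0, s0 = sigs[0]
    dt = [(s + r - s0 - r0) % n for r, s in sigs[1:]]
    du = [(s0 - s) % n for r, s in sigs[1:]]
    # Eq. (1): dt_i * d_A - du_i = b_i - b_0 (mod n); the centered value is the difference
    diffs = [centered(t * d_a - u, n) for t, u in zip(dt, du)]
    eq1_holds = diffs == [b_i - b[0] for b_i in b[1:]]
    # v - u = (d_A, 2^l * diffs[0], ..., 2^l * diffs[N-1])
    dist_sq = d_a**2 + sum(((1 << l) * d) ** 2 for d in diffs)
    bound_holds = dist_sq <= (N + 1) * 2 ** (2 * m)
    # 2^m * sqrt(N+1) <= sqrt(N+1) * (2^(lN) n^N)^(1/(N+1)), written with integers
    cvp_condition = 2 ** (m * (N + 1)) <= 2 ** (l * N) * n**N
    return eq1_holds, bound_holds, cvp_condition

print(shared_msb_instance(192, 3))
```

Actually recovering \(d_A\) would additionally require a CVP solver such as LLL followed by Babai's nearest plane algorithm, which this sketch deliberately leaves out.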

3.4 Attack Results

In the attack experiments, we set the length \(l=32i\) (\(i=1,\ldots ,7\)) of the shared MSBs and the number \(N=50-t\) (\(t=0,\ldots ,49-256/l\)) of signatures, increasing the values of i and t in turn. For each setting, using the fplll-4.0 lattice reduction library [17], we perform \(51-N\) attacks, where the N signatures used in the j-th attack are the j-th to the \({(j+N-1)}\)-th signatures. If at least one of these attacks yields a key \(d_A\) satisfying \(P_A=d_AG\), we consider the lattice attack successful.

The experimental results show that the lattice attack still succeeds when \(l=192\) (\(i=6\)) and \(N =3\) (\(t=47\)), with an average time of about 32 \(\upmu \)s per attack. That is, when 192 MSBs are shared between the nonces, only 3 signatures are needed to disclose \(d_A\). In this setting, 20 of the 48 attacks succeed, a success rate of about 42 %.

4 Countermeasure to Resist Lattice-Based Fault Attack

In this section, a new countermeasure to resist LFA is proposed for SM2. It directly destroys the conditions of LA rather than merely preventing FA. Therefore, even if FA has made the nonces partially known or made them share some bits, LA still cannot be mounted.

4.1 SM2 with Countermeasure

Signature with Countermeasure: sign message M with keys \(d_A\) and \({P_A}\).

  1. Compute \(e = SHA\left( Z_A||M \right) \);
  2. Select \(k,w \in \left[ {1,n - 1} \right] \) randomly;
  3. Compute \(Q({x_1},{y_1})= k G + w{P_A}\);
  4. Compute \(r = {e +x_1}\text { mod }n\). If \(r = 0\) or \(r+k=n\) then goto step 2;
  5. Compute \(s=(1+d_A)^{-1}(k+(w-r)d_A) \text { mod }n\). If \(s = 0\) then goto step 2;
  6. Return the result \((r,s)\).

In the SM2 variant with the countermeasure above, two nonces k and w of the same length are generated, and the public key \(P_A\) is used during signing. In effect, the nonce k is masked by adding \({w}{d_A}\). Moreover, the unmodified verification algorithm still accepts the resulting signatures.
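A short check of the last claim, derived only from steps 3 and 5 of the modified signature together with \(P_A=d_AG\) and \(t=r+s\text { mod }n\), expands the point computed by the verifier:

$$\begin{aligned} sG+tP_A&=\big (s+(r+s)d_A\big )G=\big (s(1+d_A)+rd_A\big )G\\ &=\big (k+(w-r)d_A+rd_A\big )G=kG+wP_A=Q(x_1,y_1). \end{aligned}$$

Hence the verifier recovers the same \(x_1\) and obtains \(R=e+x_1\text { mod }n=r\), exactly as in the original scheme.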

4.2 Provable Security Against Lattice Attack

As mentioned above, the LA based on knowing parts of the nonce k has the strongest condition. Hence, it suffices to analyze the security of our countermeasure against this strongest LA; the analysis applies similarly to our proposed attack.

For the SM2 signature with countermeasure, assume that we obtain N (\(N > m/(l-1)\)) signature results \((r_i,s_i)\) (\(i = 1, \ldots ,N\)), and that both the l MSBs \(a_i\) of nonce \(k_i\) and the l MSBs \({c_i}\) of nonce \(w_i\) are known in the i-th signature. Here m is the key length of SM2. Let \(b_i,{d_i}\) represent the remaining unknown parts of \(k_i, w_i\) respectively; then \(k_i = a_i2^{m-l}+ b_i \) and \({w_i} = {c_i}2^{m-l}+{d_i}\), where \(a_i,{c_i} < {2^l}\) and \(b_i,{d_i} < {2^{m-l}}\). Let \(t_i=(s_i-c_i2^{m-l}+r_i)\text { mod }n\) and \(u_i=(a_i2^{m-l}-s_i)\text { mod }n\); then we have the following equations.

$$\begin{aligned} {t_i{d_A} - u_i+h_in} = b_i + {d_i}{d_A}(i = 1, \ldots ,N) \end{aligned}$$
(3)

where \(h_i\) is the smallest integer which makes the above equations true.
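Equation (3) can be checked numerically in the same spirit. The sketch below is an illustration under stated assumptions: the modulus \(2^{256}-189\) is a stand-in prime rather than the SM2 group order, \(r_i\) is drawn at random instead of being derived from a curve point, \(l=64\) is an arbitrary choice, and the function name `masked_relation` is hypothetical.

```python
import secrets

M_BITS, L_BITS = 256, 64
MODULUS = 2**256 - 189   # assumption: stand-in 256-bit prime, NOT the SM2 order n

def masked_relation(d_a, n=MODULUS, m=M_BITS, l=L_BITS):
    """Return both sides of Eq. (3) mod n and the unreduced right-hand side."""
    a_i, c_i = secrets.randbits(l), secrets.randbits(l)          # known MSBs of k_i, w_i
    b_i, d_i = secrets.randbits(m - l), secrets.randbits(m - l)  # unknown remainders
    k_i = (a_i << (m - l)) + b_i
    w_i = (c_i << (m - l)) + d_i
    r_i = secrets.randbelow(n - 1) + 1             # assumption: random stand-in for r
    s_i = pow(1 + d_a, -1, n) * (k_i + (w_i - r_i) * d_a) % n    # step 5
    t_i = (s_i - (c_i << (m - l)) + r_i) % n
    u_i = ((a_i << (m - l)) - s_i) % n
    lhs = (t_i * d_a - u_i) % n
    rhs = (b_i + d_i * d_a) % n
    return lhs, rhs, b_i + d_i * d_a

d_a = secrets.randbelow(MODULUS - 2) + 1
lhs, rhs, unreduced = masked_relation(d_a)
print(lhs == rhs, unreduced.bit_length())
```

The unreduced right-hand side \(b_i+d_id_A\) is typically far larger than n, in contrast to the short difference \(b_i-b_0\) of Eq. (1).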

Similarly, we can construct a lattice L by matrix \(A=\left( {\begin{array}{*{20}{c}} {\beta }&{}{{t_1}}&{} \cdots &{}{{t_N}}\\ 0&{}{n}&{} \cdots &{}0\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0&{}0&{} \cdots &{}{n} \end{array}} \right) ,\) where \(\beta \) is any nonzero real number. Let vector \({\varvec{u}} = \left( {0,{u_1},{u_2}, \ldots ,{u_N}} \right) \in \mathbb Z^{N+1}\) and lattice vector \({\varvec{v}} = {\varvec{x}}A=(\beta d_A, d_A {t_1}+h_1n,\ldots ,d_A {t_N}+h_Nn)\), where \({\varvec{x}}=({d_A},{{h_1},{h_2},\ldots ,{h_N}})\in {\mathbb {Z}^{N + 1}}\), then we have the following equation

$$\begin{aligned} ||{\varvec{v}} - {\varvec{u}}||^2 = {{(\beta {d_A})}^2} + \sum \limits _{i = 1}^N {{{({b_i} + {d_i}{d_A})}^2}}. \end{aligned}$$
(4)

It is known that \(d_A\) is a random number less than n (\(2^{m-1}<n<2^{m}\)). The probability that \({\log _2}{d_A}=m-d\) is \(\frac{1}{{{2^{d+1}}}}\), where d is a non-negative integer. Therefore, \({\log _2}{d_A}\) is slightly smaller than or equal to \({\log _2}n\); in other words, d is very small in general. In addition, \({\log _2}d_i\) is much greater than d in a practical FA, since otherwise \(d_i\) and \(d_A\) could be obtained directly by exhaustive search rather than by LA. Thereby, the inequality \({\log _2}n - {\log _2}d_A \ll {\log _2}d_i\) holds, namely, \(n\ll {d_i}{d_A} < {n^2}/{2^l}\). Therefore, the following inequality holds.

$$\begin{aligned} ||{\varvec{v}} - {\varvec{u}}||^2 \gg (N+1)(n^{2N}\beta ^2 d_A^2)^{1/(N+1)}> (N+1)\varDelta (A)^{2/(N+1)} \end{aligned}$$
(5)

As described in Sect. 2, the above inequality violates the condition required for solving CVP, so \({\varvec{v}}\) cannot be determined and \({d_A}\) cannot be recovered. The conditions of LA are thus destroyed completely. In addition, if the known bits of the nonces are the least significant bits or contiguous blocks in the middle, the same conclusion can be proved in a similar way.

5 Conclusion

In this paper, we introduce a lattice-based fault attack (LFA) against SM2 in a smart card. The attack is based on the lattice attack (LA) condition that some bits are shared between different nonces. First, a practical laser fault attack (FA) makes the chip skip the instructions that write nonces into RAM, so that some bits of the nonces in SM2 remain unchanged. Then we combine the results of FA with the LA model to recover the private key \(d_A\). The experimental results show that 3 faulty signatures suffice to recover \(d_A\), with an average time of 32 \(\upmu \)s and a success rate of 42 %. In addition, we propose a countermeasure for SM2 that resists LFA by destroying the condition of LA at the algorithm level, and we prove in theory that the countermeasure is sufficient to resist LFA. Moreover, a similar attack and countermeasure can also be applied to ECDSA.