1 Introduction

Over the past several years, lattices have emerged as an attractive foundation for cryptography. The most efficient (and potentially practical) lattice-based cryptosystems are related to ideal lattices, which correspond to ideals in certain families of rings, e.g., \(\mathbb {Z}[X]/(X^{2^{k}}+1)\). Representative works include [HPS98, Mic02, LMPR08, Gen09, LPR10].

More recently, a handful of cryptographic constructions have relied directly on principal ideals that have “relatively short” generators, which serve as secret keys. These include a simplified variant of Gentry’s original fully homomorphic encryption scheme [Gen09] due to Smart and Vercauteren [SV10], the closely related Soliloquy encryption scheme [CGS14], and candidate cryptographic multilinear maps [GGH13, LSS14]. Breaking these systems is no harder than solving the following problem, which we call the Short Generator of a Principal Ideal Problem (SG-PIP): given some \(\mathbb {Z}\)-basis of an ideal that is guaranteed to have a “short” generator g, find a sufficiently short generator (not necessarily g itself).

Potential attacks on SG-PIP in certain rings were sketched by Bernstein [Ber14b] and Campbell et al. [CGS14]. The basic structure of the attacks, which appears to be folklore in computational number theory, consists of two main parts:

  • First, given a \(\mathbb {Z}\)-basis of the principal ideal, find some arbitrary (not necessarily short) generator of the ideal. For this task, which is known as the Principal Ideal Problem (PIP), the state of the art is an algorithm of Biasse and Fieker [BF14, Bia14], whose running time has only a subexponential \(2^{n^{2/3+\epsilon }}\) dependence on n, the degree of the ring (over \(\mathbb {Z}\)). In addition, building on the recent work of Eisenträger et al. [EHKS14], polynomial-time quantum algorithms for PIP have recently been described in two independent works [CGS14, BS15], the latter of which provides a fully rigorous treatment.

  • Second, transform the generator found in the previous phase into a short generator, thereby recovering the secret key, or its functional equivalent. The standard approach casts this task as a closest vector problem (CVP) on the Dirichlet “log-unit” lattice.

In this work, we focus entirely on the second phase, i.e., on recovering a short generator from any generator. At first, one might suspect that this is a hard problem: in general, the fastest known algorithms for CVP (even allowing quantum) run in exponential \(2^{\varOmega (n)}\) time [MV10, ADS15], or in less time but with much weaker guarantees on the solution quality (e.g., [LLL82, Bab85, Sch87]). In addition, Bernstein [Ber14b] suggested an algebraic approach that may yield slightly subexponential running times in number fields having many subfields, but it remains to be seen if this proposal can be carried through. Regardless of the method used, it is not obvious a priori whether solving CVP on the log-unit lattice yields a sufficiently short generator; much depends on the geometry of the lattice (in the relevant norm) and the quality of the solution.

A promising observation made by several researchers [CGS14, Ber14a] is that the CVP instances arising in the second phase have some implicit structure: the existence of a “rather short” generator (by choice of the secret key) implies that the target point is “somewhat close” to the log-unit lattice; CVP with such a distance guarantee is more commonly known as bounded-distance decoding (BDD) and is sometimes easier than the general case of CVP. Indeed, Garg et al. [GGH13] gave an improved variant of the Gentry-Szydlo algorithm [GS02] which shows that in cyclotomic rings having power-of-two index, BDD on the log-unit lattice is efficiently solvable to within inverse-quasipolynomial \(n^{-\log \log n}\) distance. However, this threshold is much too small to handle the BDD instances arising in cryptosystems.

Campbell et al. [CGS14] were the first to claim an efficient solution to the second phase above. In more detail, they asserted that in cyclotomic rings having power-of-two index, the second phase can be accomplished simply by decoding the log-unit lattice using a standard algorithm such as LLL [LLL82]. However, this claim was not accompanied by a proof. Nevertheless, experiments in cryptographically relevant choices of dimension have shown that decoding is indeed practically efficient [She14, Sch15], giving strong evidence that the approach of [CGS14] works.

Contributions. Our first main contribution is a rigorous proof showing that the second phase above can be solved in polynomial time, in any cyclotomic ring of prime-power index. Our proof is based on classical ideas and results from analytic number theory, along with some techniques from probability theory, and consists of two main technical contributions. First, in Sect. 3 we use standard tools from analytic number theory, such as bounds on Dirichlet L-series, to elucidate the geometry of a standard set of generators for the group of cyclotomic units. (The cyclotomic units correspond either to the log-unit lattice itself, or to a sublattice whose index is conjectured to be quite small.) Using this geometry, in Sects. 4 and 5 we show that for a wide class of typical distributions of the secret generator (e.g., Gaussian-like distributions), the naïve “round-off” lattice-decoding algorithm [Len82, Bab85] (using the standard generators of the cyclotomic units) can be used to efficiently recover the secret short generator, given any generator of the ideal. To complement these results, in Appendix B we give concrete numerical data demonstrating that the second phase succeeds for all practical choices of dimension.

Our second main contribution concerns the questions: in an arbitrary principal ideal (of a prime-power cyclotomic), how long can a shortest generator be? And how short of a generator can we find efficiently? In Sect. 6, we show that for an overwhelming majority of principal ideals, the shortest generator is a \(2^{\tilde{\varTheta }(\sqrt{n})}\) factor longer than the shortest nonzero vector in the ideal. Moreover, one can efficiently find a generator satisfying this bound, given an arbitrary generator. The first of these facts means that the principal ideals used in the aforementioned cryptographic applications are highly atypical, because their shortest generators are also nearly shortest vectors. The second fact implies that the \(2^{\tilde{O}(\sqrt{n})}\)-approximate Shortest Vector Problem (SVP) on arbitrary principal ideals reduces to the Principal Ideal Problem.

Implications and Discussion. Combining our main contributions with known algorithms for PIP [BF14, Bia14, CGS14, BS15] (which are the computational bottleneck) yields the following two main implications:

  • First, there is a quantum polynomial-time, or classical \(2^{n^{2/3+\epsilon }}\)-time, algorithm for SG-PIP, implying a key-recovery attack for the cryptographic constructions of [SV10, GGH13, LSS14, CGS14].

  • Second, there is a quantum polynomial-time algorithm for \(2^{\tilde{O}(\sqrt{n})}\)-approximate SVP on principal ideals in any prime-power cyclotomic. (Note that we do not obtain any improvement over classical SVP algorithms, because \(2^{n^{2/3}}\) time is sufficient to solve \(2^{\tilde{O}(n^{1/3})}\)-approximate SVP on arbitrary lattices [Sch87].)

In light of these, an important open problem is to obtain faster classical PIP algorithms, perhaps also using the guarantee that a short generator exists.

A natural question is what effect, if any, these attacks have on other ring-based problems, such as NTRU [HPS98] and ring-LWE [LPR10], which are at the heart of many cryptosystems. Specifically, the theoretical foundation of the ring-LWE problem is the conjectured quantum hardness of approximate-SVP on arbitrary ideals, usually in a cyclotomic ring and for (near-)polynomial approximation factors. As far as we can tell, the above-described algorithms do not appear to affect this foundation: the first crucially relies on the existence of an “unusually short” generator, the second is inherently limited to relatively large SVP approximation factors, and both apply only to principal ideals. An important question is whether these barriers can be overcome, and if so, whether this leads to attacks on ring-LWE or NTRU themselves.

In a complementary direction, another interesting question is whether the above attacks can be extended to other families of non-cyclotomic rings, such as those suggested in [Ber14b]. For this it may suffice to find (by analysis, computation, or both) a suitably good basis of the log-unit lattice, or of a sublattice of not too large index.

2 Preliminaries

We denote column vectors by lower-case bold letters (e.g., \(\mathbf {x}\)) and matrices by upper-case bold letters (e.g., \(\mathbf {X}\)). We often adopt the nonstandard, but very useful, convention of indexing rows and columns by particular finite sets (not necessarily \({\{}{1,\ldots ,n}{\}}\)), and identify a matrix with its indexed set of column vectors. The canonical scalar product over \(\mathbb {R}^n\) and over \(\mathbb {C}^n\) is denoted \({\langle }{\cdot ,\cdot }{\rangle }\), and \({||}{\cdot }{||}\) denotes the Euclidean norm. For a complex number \(z \in \mathbb {C}\), \(\overline{z}\) denotes its complex conjugate, and \({|}{z}{|} = \sqrt{z \cdot \overline{z}}\) denotes its magnitude.

2.1 Lattices and BDD

A lattice \(\mathcal {L}\) is a discrete additive subgroup of \(\mathbb {R}^{n}\) for some positive integer n. The minimum distance of \(\mathcal {L}\) is \(\lambda _{1}(\mathcal {L}) := \min _{\mathbf {v}\in \mathcal {L}\setminus {\{}{\mathbf {0}}{\}}} {||}{\mathbf {v}}{||}\), the length of a shortest nonzero lattice vector. Every lattice is generated as the integer linear combinations of some (non-unique) \(\mathbb {R}\)-linearly independent basis vectors \(\mathbf {B}= {\{}{\mathbf {b}_{1}, \ldots , \mathbf {b}_{k}}{\}}\), as \(\mathcal {L}= \mathcal {L}(\mathbf {B}) := {\{}{\sum _{j=1}^{k} \mathbb {Z}\cdot \mathbf {b}_{j}}{\}}\), where \(k \le n\) is called the rank of the lattice.

Letting \({{\mathrm{span}}}\) denote the \(\mathbb {R}\)-linear span of a set, the dual basis \(\mathbf {B}^{\vee } = {\{}{\mathbf {b}_{1}^{\vee }, \ldots , \mathbf {b}_{k}^{\vee }}{\}} \subset {{\mathrm{span}}}(\mathbf {B})\) and dual lattice \(\mathcal {L}^{\vee } = \mathcal {L}(\mathbf {B}^{\vee })\) are defined to satisfy \({\langle }{\mathbf {b}_{j}^{\vee }, \mathbf {b}_{j'}}{\rangle } = \delta _{j,j'}\) for all \(j,j'\), where the Kronecker delta \(\delta _{j,j'} = 1\) if \(j=j'\), and is 0 otherwise. In other words, \(\mathbf {B}^{t} \cdot \mathbf {B}^{\vee } = (\mathbf {B}^{\vee })^{t} \cdot \mathbf {B}\) is the identity matrix.

In this work we deal with a computational problem on lattices called bounded-distance decoding (BDD): given a lattice basis \(\mathbf {B}\subset \mathbb {R}^{n}\) of \(\mathcal {L}=\mathcal {L}(\mathbf {B})\) and a target point \(\mathbf {t}\in {{\mathrm{span}}}(\mathcal {L})\) with the guarantee that \(\min _{\mathbf {v}\in \mathcal {L}} {||}{\mathbf {v}-\mathbf {t}}{||} \le r\) for some known \(r < \lambda _{1}(\mathcal {L})/2\), find the unique \(\mathbf {v}\in \mathcal {L}\) closest to \(\mathbf {t}\) (i.e., such that \({||}{\mathbf {v}-\mathbf {t}}{||} \le r\)). In fact, in our context \(\mathbf {B}\) and r will be fixed in advance, and \(\mathbf {t}\) is the only input that may vary.

A standard approach to solve BDD (and related problems) is the “round-off” algorithm of [Bab85], which simply returns \(\mathbf {B}\cdot {\lfloor }{(\mathbf {B}^{\vee })^{t} \cdot \mathbf {t}}{\rceil }\), where the rounding function \({\lfloor }{c}{\rceil } := {\lfloor }{c + \frac{1}{2}}{\rfloor } \in \mathbb {Z}\) is applied to each coordinate independently. (Notice that \((\mathbf {B}^{\vee })^{t} \cdot \mathbf {t}\) is the coefficient vector of \(\mathbf {t}\) with respect to basis \(\mathbf {B}\).) We recall the following standard fact about this algorithm, and include a brief proof for completeness.

Claim

Let \(\mathcal {L}\subset \mathbb {R}^{n}\) be a lattice with basis \(\mathbf {B}\), and let \(\mathbf {t}= \mathbf {v}+ \mathbf {e}\in \mathbb {R}^{n}\) for some \(\mathbf {v}\in \mathcal {L}\), \(\mathbf {e}\in \mathbb {R}^{n}\). If \({\langle }{\mathbf {b}_{j}^{\vee }, \mathbf {e}}{\rangle } \in [-\frac{1}{2}, \frac{1}{2})\) for all j, then on input \(\mathbf {t}\) and basis \(\mathbf {B}\), the round-off algorithm outputs \(\mathbf {v}\).

Proof

Because \(\mathbf {v}= \mathbf {B}\mathbf {z}\) for some integer vector \(\mathbf {z}\), we have \((\mathbf {B}^{\vee })^{t} \cdot \mathbf {t}= \mathbf {z}+ (\mathbf {B}^{\vee })^{t} \cdot \mathbf {e}\), so by hypothesis on the \({\langle }{\mathbf {b}_{j}^{\vee }, \mathbf {e}}{\rangle }\), we have \({\lfloor }{(\mathbf {B}^{\vee })^{t} \cdot \mathbf {t}}{\rceil } = \mathbf {z}\). The claim follows.
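The round-off step and the claim above can be illustrated with a small sketch. It assumes a square, full-rank basis, so that the coefficient vector \((\mathbf {B}^{\vee })^{t} \cdot \mathbf {t}\) equals \(\mathbf {B}^{-1} \cdot \mathbf {t}\); the helper names `solve` and `round_off` and the example basis are illustrative choices, not taken from the cited works.

```python
import math

def solve(M, t):
    """Solve M x = t for a square, invertible M (Gauss-Jordan with pivoting)."""
    n = len(M)
    A = [[float(x) for x in M[i]] + [float(t[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def round_off(basis, t):
    """Round-off decoding. `basis` is a list of column vectors b_1..b_n,
    assumed square and full rank, so (B^vee)^t t = B^{-1} t."""
    n = len(basis)
    # Matrix whose columns are the basis vectors.
    M = [[basis[j][i] for j in range(n)] for i in range(n)]
    coeffs = solve(M, t)
    # Round each coefficient c to floor(c + 1/2), as in the text.
    z = [math.floor(c + 0.5) for c in coeffs]
    v = [sum(z[j] * basis[j][i] for j in range(n)) for i in range(n)]
    return z, v

# Hypothetical example: v = 3*b1 - 2*b2 = (4, -6), error e = (0.3, -0.4).
# The coefficients of e in the basis are about (0.217, -0.133), inside [-1/2, 1/2).
b1, b2 = [2, 0], [1, 3]
z, v = round_off([b1, b2], [4.3, -6.4])
```

Here the hypothesis of the claim holds for the chosen error, so the sketch recovers the integer coefficients of the lattice point; with a larger error the rounding may land on a different lattice vector.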

2.2 Circulant Matrices

We recall some standard facts about circulant matrices for a finite abelian group \((G,\cdot )\), and their relationship with the characters of the group. See e.g., [Lan02] for further details and proofs.

Definition 1

(Circulant Matrix). For a vector \(\mathbf {a}= (a_{g})_{g \in G}\) indexed by G, the G-circulant matrix associated with \(\mathbf {a}\) is the G-by-G matrix whose (i, j)th entry is \(a_{i j^{-1}}\).

Note that the transpose of any G-circulant matrix (associated with \((a_{g})_{g \in G}\)) is also a G-circulant matrix (associated with \((a_{g^{-1}})_{g \in G}\)).

Definition 2

(Character Group). A character is a group morphism \(\chi :G \rightarrow {\{}{u \in \mathbb {C}: {|}{u}{|} = 1}{\}}\), i.e., \(\chi (g \cdot h) = \chi (g) \cdot \chi (h)\) for all \(g,h \in G\). The character group \((\hat{G}, \cdot )\) is the set of characters of G, with the group operation being the usual multiplication of functions, i.e., \((\chi \cdot \psi )(g) = \chi (g) \cdot \psi (g)\).

A basic fact is that \({|}{\hat{G}}{|} = {|}{G}{|}\). Notice that for a character \(\chi \in \hat{G}\), we have \(\overline{\chi (g)} = \chi (g)^{-1} = \chi (g^{-1})\). We identify \(\chi \) with the vector \((\chi (g))_{g \in G}\). Then all characters \(\chi \) have Euclidean norm \({||}{\chi }{||} = \sqrt{{|}{G}{|}}\), because

$${\langle }{\chi , \chi }{\rangle } = \sum _{g \in G} \chi (g) \cdot \overline{\chi (g)} = \sum _{g \in G} 1 = {|}{G}{|}. $$

Moreover, distinct characters \(\chi , \psi \) are orthogonal, because the values of the nontrivial character \(\chi \cdot \psi ^{-1}\) sum to zero over G:

$$\begin{aligned} {\langle }{\chi , \psi }{\rangle }&= \sum _{g \in G} \chi (g) \cdot \overline{\psi (g)} = \sum _{g \in G} (\chi \cdot \psi ^{-1})(g) = 0. \end{aligned}$$

Therefore, the complex G-by-\(\hat{G}\) matrix

$$\begin{aligned} \mathbf {P}_{G} := {|}{G}{|}^{-1/2} \cdot ({\chi (g)})_{g \in G, \chi \in \hat{G}} \end{aligned}$$

is unitary, i.e., \(\mathbf {P}_{G}^{-1} = \mathbf {P}_{G}^{*}\), the conjugate transpose of \(\mathbf {P}_{G}\).

Lemma 1

A complex matrix \(\mathbf {A}\) is G-circulant if and only if the \(\hat{G}\)-by-\(\hat{G}\) matrix \(\mathbf {P}_{G}^{-1} \cdot \mathbf {A}\cdot \mathbf {P}_{G}\) is diagonal; equivalently, the columns of \(\mathbf {P}_{G}\) are the eigenvectors of \(\mathbf {A}\). If \(\mathbf {A}\) is the G-circulant matrix associated with \(\mathbf {a}= (a_{g})_{g \in G}\), its eigenvalue corresponding to \(\chi \in \hat{G}\) is \(\lambda _{\chi } = {\langle }{\mathbf {a}, \chi }{\rangle } = \sum _{g \in G} a_{g} \cdot \overline{\chi (g)}\).

It follows that every row and column of \(\mathbf {A}\) has squared Euclidean norm

$$\begin{aligned} {||}{\mathbf {a}}{||}^{2} = {||}{\mathbf {P}_{G}^{*} \cdot \mathbf {a}}{||}^{2} = {|}{G}{|}^{-1} \cdot \sum _{\chi \in \hat{G}} {|}{\lambda _{\chi }}{|}^{2}. \end{aligned}$$

It also follows that \(\mathbf {A}^{-1}\) (when defined) is G-circulant, with eigenvalue \(\lambda _{\chi }^{-1}\) for eigenvector \(\chi \).

Proof

Suppose that \(\mathbf {A}\) is G-circulant, and let \(\chi \in \hat{G}\) be a character of G. Then

$$\begin{aligned} (\mathbf {A}\cdot \chi )_{g} = \sum _{h \in G} a_{g h^{-1}} \cdot \chi (h) = \left( {\sum _{k \in G} a_{k} \cdot \overline{\chi (k)}}\right) \cdot \chi (g), \end{aligned}$$

where in the final equality we have substituted \(k=g h^{-1}\) and used \(\chi (h) = \overline{\chi (k)} \cdot \chi (g)\). So \(\mathbf {A}\cdot \chi = \lambda _{\chi } \cdot \chi \).

For the other direction, it suffices by linearity to show that \(\mathbf {A}_{\chi } = \mathbf {P}_{G} \cdot \mathbf {D}_{\chi } \cdot \mathbf {P}_{G}^{-1}\) is G-circulant for every \(\chi \in \hat{G}\), where \(\mathbf {D}_{\chi }\) is the diagonal \(\hat{G}\)-by-\(\hat{G}\) matrix with 1 in its \((\chi ,\chi )\)th entry and zeros elsewhere. Indeed, by definition of \(\mathbf {P}_{G}\) and because \(\mathbf {P}_{G}^{-1} = \mathbf {P}_{G}^{*}\), the (i, j)th entry of \(\mathbf {A}_{\chi }\) is simply \({|}{G}{|}^{-1} \cdot \chi (i) \cdot \overline{\chi (j)} = {|}{G}{|}^{-1} \cdot \chi (i j^{-1})\), which depends only on \(i j^{-1}\) as required.
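As a numerical sanity check of Lemma 1 and the norm identity that follows it, the sketch below takes G to be the cyclic group \(\mathbb {Z}_{n}\) written additively (an assumption made for simplicity, so that \(i j^{-1}\) becomes \(i - j\)). The function names and the sample vector are illustrative only.

```python
import cmath
import math

def character(n, k):
    """k-th character of the additive cyclic group Z_n, as a vector indexed by g."""
    return [cmath.rect(1.0, 2 * math.pi * k * g / n) for g in range(n)]

def circulant(a):
    """G-circulant matrix for G = Z_n: entry (i, j) is a_{i - j}."""
    n = len(a)
    return [[a[(i - j) % n] for j in range(n)] for i in range(n)]

def eigenvalue(a, chi):
    """lambda_chi = <a, chi> = sum_g a_g * conj(chi(g))."""
    return sum(ag * cg.conjugate() for ag, cg in zip(a, chi))

a = [1.0, 2.0, -0.5, 3.0, 0.25]   # arbitrary sample vector
n = len(a)
A = circulant(a)

lams = []
max_residual = 0.0
for k in range(n):
    chi = character(n, k)
    lam = eigenvalue(a, chi)
    lams.append(lam)
    # Compare A * chi against lambda_chi * chi, coordinate by coordinate.
    A_chi = [sum(A[i][j] * chi[j] for j in range(n)) for i in range(n)]
    max_residual = max(max_residual,
                       max(abs(x - lam * c) for x, c in zip(A_chi, chi)))

# Norm identity: ||a||^2 = |G|^{-1} * sum over characters of |lambda_chi|^2.
norm_sq = sum(x * x for x in a)
spectral = sum(abs(l) ** 2 for l in lams) / n
```

Both `max_residual` and the gap between `norm_sq` and `spectral` should vanish up to floating-point error.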

2.3 Dirichlet Characters and L-Series

A Dirichlet character \(\chi \) is a character of \(\mathbb {Z}_{k}^{*}\) for some positive integer k. Note that if \(k | \ell \) then \(\chi \) induces a character of \(\mathbb {Z}_{\ell }^{*}\) via the natural morphism \(\mathbb {Z}_{\ell }^{*} \rightarrow \mathbb {Z}_{k}^{*}\), so we can equivalently view \(\chi \) as being defined modulo either k or \(\ell \). The conductor \(f_{\chi }\) of \(\chi \) is the smallest positive f such that \(\chi \) is induced by a Dirichlet character modulo f. The character is said to be even if \(\chi (-1)=1\); note that the even Dirichlet characters correspond with the characters of \(\mathbb {Z}_{k}^{*}/{\{}{\pm 1}{\}}\). The character is said to be quadratic if all its values are real (i.e., \(\pm 1\)), and it is not the constant 1 character (which is known as the principal character). Following the convention used in [Was97], we often implicitly extend \(\chi \) to a completely multiplicative function from \(\mathbb {Z}\) to \(\mathbb {C}\), by considering it as modulo its conductor \(f_{\chi }\) (i.e., as a primitive character) and letting \(\chi (a) = 0\) if \(\gcd (a,f_{\chi }) > 1\).

Definition 3

(Dirichlet L-Series). For a Dirichlet character \(\chi \), the Dirichlet L-function \(L(\cdot , \chi )\) is defined as the formal series

$$\begin{aligned} L(s, \chi ) = \sum _{k\ge 1} \frac{\chi (k)}{k^s}. \end{aligned}$$

For any Dirichlet character \(\chi \), the series \(L(s,\chi )\) is absolutely convergent for all \(s \in \mathbb {C}\) with \(\mathfrak {R}(s) > 1\). It is also known that \(L(1, \chi )\) converges and is nonzero for any non-principal Dirichlet character (i.e., \(\chi \ne 1\)). We have the following asymptotic bounds on its value; we will only use the lower bounds.

Theorem 1

There exists a \(C>0\) such that, for any non-quadratic character \(\chi \) of conductor \(f>1\),

$$\begin{aligned} \frac{1}{\ell (f)} \le {|}{L(1,\chi )}{|} \le \ell (f) \quad \text {where } \ell (f) = C \ln f . \end{aligned} \tag{1}$$

Moreover, for any quadratic character \(\chi \),

$$\begin{aligned} {|}{L(1,\chi )}{|} \ge \frac{1}{C \sqrt{f}}. \end{aligned} \tag{2}$$

Equation (1) can be traced back to Landau [Lan27], and improving the constant C is an active field of research [Lou15]. Equation (2) is also classical and follows from Dirichlet’s class number formula (see, e.g., [MV06, Sect. 4.4]). We note that under the Generalized Riemann Hypothesis, the bound in Eq. (1) can be improved to \(\ell (f) = C \ln \ln f\), and holds for both quadratic and non-quadratic characters (see, e.g., [LLS15]).

2.4 Cyclotomic Number Fields and the Log-Unit Lattice

Cyclotomic Number Fields. Let L be a field. An element \(\zeta \in L\) is a root of unity if \(\zeta ^m=1\) for some positive integer m. The order of a root of unity \(\zeta \in L\) is the order of the finite multiplicative subgroup of \(L^*\) generated by \(\zeta \). A primitive mth root of unity in L is a root of unity \(\zeta \in L\) of order m. Note that if \(\zeta \in L\) is a primitive mth root of unity, then the polynomial \(X^m-1\in L[X]\) factors as \(\prod _{i=0}^{m-1}(X-\zeta ^i)\) over L[X]. Also note that the complete set of primitive mth roots in L consists of the powers \(\zeta ^j\) for \(j \in \mathbb {Z}_{m}^{*}\).

An algebraic number field K is an extension field of the rationals \(\mathbb {Q}\) such that its dimension \([K:\mathbb {Q}]\) as a \(\mathbb {Q}\)-vector space (i.e., its degree) is finite. If \(\varOmega \supset K\) is an extension field such that \(\varOmega \) is algebraically closed over \(\mathbb {Q}\), then there are exactly \([K:\mathbb {Q}]\) field embeddings of K into \(\varOmega \). An algebraic number field is Galois if the order of its automorphism group equals its degree. A number field K is cyclotomic if \(K= \mathbb {Q}(\zeta )\) for some root of unity \(\zeta \in K\). Its degree is \(\varphi (m)\), where \(\varphi (\cdot )\) is the Euler totient function and m is the order of \(\zeta \), and its ring of integers R is monogenic, i.e., \(R=\mathbb {Z}[\zeta ]\). We let U denote the cyclic (multiplicative) subgroup of mth roots of unity, which is generated by \(\zeta \).

A cyclotomic number field is Galois. If \(K = \mathbb {Q}(\zeta )\) is a cyclotomic number field with \(\zeta \in K\) an mth primitive root of unity then each automorphism is characterized by the assignment \(\zeta \mapsto \zeta ^j\) for some \(j \in \mathbb {Z}_{m}^{*}\). As a consequence, if L is an extension field of a cyclotomic field K, then K is situated uniquely in L. For concreteness, we situate cyclotomic number fields in the complex numbers \(\mathbb {C}\). Let m be a positive integer and define \(\omega = \omega _{m} = \exp (2\pi \imath /m) \in \mathbb {C}\). Then \(\omega \) is a primitive mth root of unity and \(K=\mathbb {Q}(\omega )\) is the mth cyclotomic number field. The embeddings of K into the complex numbers (i.e., the automorphisms of K) are denoted \(\sigma _j\) for \(j \in \mathbb {Z}_{m}^{*}\), where \(\sigma _j\) sends \(\omega \) to \(\omega ^j\). The concatenation \(\sigma (a) = (\sigma _{j}(a))_{j \in \mathbb {Z}_{m}^{*}}\) of these embeddings is known as the canonical embedding, and is used to endow K with a geometry, e.g., \({||}{a}{||} := {||}{\sigma (a)}{||}\) for any \(a \in K\).

Logarithmic Embedding. The embeddings \(\sigma _j\) of K, being complex, come in conjugate pairs, i.e., \(\sigma _j(x) = \overline{\sigma _{-j}(x)}\). We will mainly be concerned with their magnitudes, so we identify the pairs by indexing over the multiplicative quotient group \(G := \mathbb {Z}_{m}^{*} / {\{}{\pm 1}{\}}\). We then have the logarithmic embedding, defined as

$$\begin{aligned} {{\mathrm{Log}}}:K&\rightarrow \mathbb {R}^{\varphi (m)/2} \\ a&\mapsto \left( {\log {|}{\sigma _{i}(a)}{|} }\right) _{i \in G}. \end{aligned}$$

The logarithmic embedding defines a group morphism, mapping the multiplicative group \(K^*\) to an additive subgroup of \(\mathbb {R}^{\varphi (m)/2}\). The kernel of \({{\mathrm{Log}}}\) restricted to \(R^*\) is \({\{}{\pm 1}{\}}\cdot U\). The Dirichlet Unit Theorem (see [Sam70, Chap. 4.4, Theorem 1]) implies that \(\varLambda = {{\mathrm{Log}}}(R^{*})\), the image of the multiplicative unit group of R under the logarithmic embedding, is a full-rank lattice in the linear subspace of \(\mathbb {R}^{\varphi (m)/2}\) orthogonal to the all-1s vector \(\mathbf {1}\). We refer to \(\varLambda \) as the log-unit lattice.
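To make the logarithmic embedding concrete, the following sketch evaluates \({{\mathrm{Log}}}\) numerically on elements of \(\mathbb {Z}[\zeta ]\) given by integer coefficient lists. The representation, the choice of representatives \(1 \le j < m/2\) for G, and all function names are assumptions made for illustration. It checks that \({{\mathrm{Log}}}\) is additive on products and that a root of unity maps to zero.

```python
import cmath
import math

def g_reps(m):
    """Representatives of G = Z_m^*/{+-1}: units j with 1 <= j < m/2 (assumes m > 2)."""
    return [j for j in range(1, (m + 1) // 2) if math.gcd(j, m) == 1]

def sigma(coeffs, m, k):
    """Image of a = sum_i coeffs[i] * zeta^i under sigma_k: zeta -> omega_m^k."""
    w = cmath.rect(1.0, 2 * math.pi * k / m)
    return sum(c * w ** i for i, c in enumerate(coeffs))

def log_embed(coeffs, m):
    """Log(a) = (log|sigma_k(a)|) for k in G; requires a != 0."""
    return [math.log(abs(sigma(coeffs, m, k))) for k in g_reps(m)]

def mul(a, b, m):
    """Multiply two elements given by coefficient lists, reducing via zeta^m = 1."""
    out = [0] * m
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[(i + j) % m] += x * y
    return out

m = 8
a, b = [1, 1], [2, -1]                    # a = 1 + zeta, b = 2 - zeta
log_ab = log_embed(mul(a, b, m), m)
log_a_plus_log_b = [x + y for x, y in zip(log_embed(a, m), log_embed(b, m))]
log_zeta3 = log_embed([0, 0, 0, 1], m)    # zeta^3 is a root of unity
```

The vectors have length \(\varphi (8)/2 = 2\); `log_ab` and `log_a_plus_log_b` agree, and `log_zeta3` is zero, up to floating-point error.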

Cyclotomic Units. Let A be the multiplicative subgroup of \(K^*\) generated by \(\pm \zeta \) and

$$\begin{aligned} z_j := \zeta ^j - 1, \quad j \in \mathbb {Z}_{m} \setminus {\{}{0}{\}}. \end{aligned}$$

Notice that \(z_{j} = -\zeta ^{j} \cdot z_{-j}\), so \(z_{j}\) and \(z_{-j}\) are equivalent modulo \(\pm U\); in particular, \({{\mathrm{Log}}}(z_{j}) = {{\mathrm{Log}}}(z_{-j})\). The group of cyclotomic units, denoted C, is defined by

$$\begin{aligned} C = A \cap R^* . \end{aligned}$$

The \(z_j\) given above are not necessarily units in R, and thus do not generate C. However, a closely related generating set, which we call the canonical generators, is given by the following lemma. Recall that \(G = \mathbb {Z}_{m}^{*} / {\{}{\pm 1}{\}}\), and identify it with some canonical set of representatives in \(\mathbb {Z}_{m}^{*}\).

Lemma 2

(Lemma 8.1 of [Was97]). Let m be a prime power, and define \(b_j := z_j / z_1 = (\zeta ^j-1)/(\zeta -1)\). The group C of cyclotomic units is generated by \(\pm \zeta \) and \(b_{j}\) for \(j \in G \setminus {\{}{1}{\}}\).
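The difference between the \(z_j\) and the canonical generators \(b_j\) can be seen numerically. The sketch below (with illustrative function names, and representatives \(1 \le j < m/2\) for G as an assumption) uses the standard fact, not stated in the text, that for \(m = p^{k}\) the product of \({|}{\omega _{m}^{a} - 1}{|}\) over \(a \in \mathbb {Z}_{m}^{*}\) equals p. Hence the coordinates of \({{\mathrm{Log}}}(z_{1})\) sum to \(\frac{1}{2} \ln p\), while those of each \({{\mathrm{Log}}}(b_{j})\) sum to zero, as required for a unit.

```python
import cmath
import math

def g_reps(m):
    """Representatives of G = Z_m^*/{+-1}: units j with 1 <= j < m/2 (assumes m > 2)."""
    return [j for j in range(1, (m + 1) // 2) if math.gcd(j, m) == 1]

def log_z(j, m):
    """Log(z_j) for z_j = zeta^j - 1, as a vector indexed by G."""
    return [math.log(abs(cmath.rect(1.0, 2 * math.pi * j * k / m) - 1))
            for k in g_reps(m)]

def log_b(j, m):
    """Log(b_j) = Log(z_j) - Log(z_1) for the canonical generator b_j = z_j / z_1."""
    return [x - y for x, y in zip(log_z(j, m), log_z(1, m))]

m = 16  # the prime power 2^4

# z_1 is not a unit: its log-embedding is not orthogonal to the all-1s vector.
sum_log_z1 = sum(log_z(1, m))

# Each b_j is a unit: its log-embedding is orthogonal to the all-1s vector.
sums_log_b = [sum(log_b(j, m)) for j in g_reps(m) if j != 1]

# Log(z_j) = Log(z_{-j}), checked here for j = 3.
sym_gap = max(abs(x - y) for x, y in zip(log_z(3, m), log_z(m - 3, m)))
```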

Notice that \({{\mathrm{Log}}}C\) is a sublattice of \(\varLambda \). As shown below, the index of \(\varLambda \) over \({{\mathrm{Log}}}C\) is finite. In fact, it is \(h^+(m)\), the class number of the real subfield \(K^+=\mathbb {Q}(\zeta +\bar{\zeta })\), defined as the index of the subgroup of principal fractional ideals in the multiplicative group of all fractional ideals (in \(K^+\)). The proof of this theorem is left as Exercise 8.5 in [Was97]. For completeness, we sketch the solution in Appendix A.

Theorem 2

For a prime power \(m>2\), the index of the log-unit lattice \(\varLambda \) over \({{\mathrm{Log}}}C\) is

$$\begin{aligned} {[}{\varLambda : {{\mathrm{Log}}}C}{]} = h^+(m). \end{aligned}$$

Some Facts and Conjectures Concerning \(h^+\). For our purposes, we need \(h^+(m)\) not to be very big. For all power-of-two m up to \(m=256\), and also for \(m=512\) under GRH, it is known that \(h^+(m)=1\) (see [Mil14]). Whether \(h^{+}(m)=1\) for all power-of-two m is known as Weber’s class number problem, and is presented in the literature as a reasonable conjecture.

In the case of odd primes, it also appears that \(h^+\) is quite small. Computations of Schoof [Sch03] and Miller [Mil15] show that \(h^+(p) \le 11\) for all primes \(p \le 241\). For powers of odd primes it has been conjectured (with support of the Cohen-Lenstra heuristic) that, for all but finitely many pairs \((p,\ell )\) where p is a prime, \(h^+(p^{\ell +1}) = h^+(p^{\ell })\) [BPR04]. A direct consequence is that \(h^+(p^{\ell })\) is bounded for a fixed p and increasing \(\ell \).

3 Geometry of the Canonical Generators

Throughout this section, let the cyclotomic index m be a prime power. Our goal here is to show that the canonical generators of the cyclotomic units, under the logarithmic embedding, are geometrically well-suited for bounded-distance decoding.

Recalling that \(G = \mathbb {Z}_{m}^{*}/{\{}{\pm 1}{\}}\) is identified with some set of canonical representatives in \(\mathbb {Z}_{m}^{*}\) and that \({{\mathrm{Log}}}(b_{j}) = {{\mathrm{Log}}}(b_{-j})\), define

$$\begin{aligned} \mathbf {b}_{j}&= {{\mathrm{Log}}}(b_{j}), \quad j \in G \setminus {\{}{1}{\}}, \end{aligned}$$

to be the log-embeddings of the canonical generators \(b_j = (\zeta ^{j} - 1)/(\zeta - 1)\) defined in Lemma 2. By Lemma 2, these \(\mathbf {b}_{j}\) form a basis of the sublattice \({{\mathrm{Log}}}C\), which by Theorem 2 has index \(h^{+}(m)\) in \(\varLambda \).

In order to apply the round-off algorithm and Claim 2.1 with this basis, we bound the norms \({||}{\mathbf {b}^\vee _j}{||}\) of the dual basis vectors. The remainder of this section is dedicated to proving the following theorem.

Theorem 3

Let \(m = p^{k}\) for a prime p, and let \({\{}{\mathbf {b}_{j}^{\vee }}{\}}_{j \in G \setminus {\{}{1}{\}}}\) denote the basis dual to \({\{}{\mathbf {b}_{j}}{\}}_{j \in G \setminus {\{}{1}{\}}}\). Then all \({||}{\mathbf {b}_{j}^{\vee }}{||}\) are equal, and

$$\begin{aligned} {||}{\mathbf {b}_{j}^{\vee }}{||}^{2} \le 2k {|}{G}{|}^{-1} \cdot (\ell (m)^2 + O(1))= O(m^{-1} \cdot \log ^3 m ) . \end{aligned}$$

To prove the theorem we start by relating the basis vectors \(\mathbf {b}_{j}\) to a certain G-circulant matrix. Recalling that \(z_{j} = \zeta ^{j}-1\) is the numerator of \(b_{j}\), define

$$\begin{aligned} \mathbf {z}_{j} := {{\mathrm{Log}}}(z_{j}) = \mathbf {b}_{j} + \mathbf {z}_{1} . \end{aligned} \tag{3}$$

Collect these vectors into a square G-by-G matrix \(\mathbf {Z}\) whose jth column is \(\mathbf {z}_{j^{-1}}\), and notice that its (i, j)th entry \(\log {|}{\omega ^{i \cdot j^{-1}} - 1}{|}\) is determined by \(i j^{-1} \in G\) alone, so \(\mathbf {Z}\) is the G-circulant matrix associated with \(\mathbf {z}_{1}\). For each eigenvector \(\chi \in \hat{G}\) of \(\mathbf {Z}\), let \(\lambda _{\chi } := {\langle }{\mathbf {z}_{1}, \chi }{\rangle }\) denote the corresponding eigenvalue.

Lemma 3

For all \(j \in G \setminus {\{}{1}{\}}\) we have

$$\begin{aligned} {||}{\mathbf {b}_{j}^{\vee }}{||}^{2} = {|}{G}{|}^{-1} \cdot \sum _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} {|}{\lambda _{\chi }}{|}^{-2}. \end{aligned} \tag{4}$$

Proof

Let \(\mathbf {z}_{j}^{\vee }\) denote the vectors dual to the \(\mathbf {z}_{j}\), i.e., the columns of \(\mathbf {Z}^{-t}\). (As shown below in the proof of Theorem 3, \(\mathbf {Z}^{-1}\) is indeed well defined because all eigenvalues \(\lambda _{\chi }\) of \(\mathbf {Z}\) are nonzero.)

We first claim that \(\mathbf {b}_{j}^{\vee }\) is simply the projection of \(\mathbf {z}_{j}^{\vee }\) orthogonal to \(\mathbf {1}\), i.e., \(\mathbf {b}_{j}^{\vee } = \mathbf {z}_{j}^{\vee } - {|}{G}{|}^{-1} \cdot {\langle }{\mathbf {z}_{j}^{\vee }, \mathbf {1}}{\rangle } \cdot \mathbf {1}\). Indeed, these vectors are all in \({{\mathrm{span}}}(\mathbf {b}_{j'})_{j'}\), the space orthogonal to \(\mathbf {1}\), and moreover, for all \(j,j' \in G \setminus {\{}{1}{\}}\) they satisfy

$$\begin{aligned} {\langle }{\mathbf {z}_{j}^{\vee } - {|}{G}{|}^{-1} \cdot {\langle }{\mathbf {z}_{j}^{\vee }, \mathbf {1}}{\rangle } \cdot \mathbf {1}, \mathbf {b}_{j'}}{\rangle } = {\langle }{\mathbf {z}_{j}^{\vee }, \mathbf {b}_{j'}}{\rangle } = {\langle }{\mathbf {z}_{j}^{\vee }, \mathbf {z}_{j'} - \mathbf {z}_{1}}{\rangle } = \delta _{j,j'} - 0. \end{aligned}$$

Now,

$$\begin{aligned} {||}{\mathbf {b}_{j}^{\vee }}{||}^{2} = {||}{\mathbf {z}_{j}^{\vee }}{||}^{2} - {|}{G}{|}^{-1} \cdot {\langle }{\mathbf {z}_{j}^{\vee }, \mathbf {1}}{\rangle }^{2}. \end{aligned}$$

Recall by Lemma 1 that \(\mathbf {Z}^{-t}\) is the G-circulant matrix associated with \(\mathbf {z}_{1}^{\vee }\), which has eigenvalue \(\lambda _{\chi }^{-1} = {\langle }{\mathbf {z}_{1}^{\vee }, \chi }{\rangle }\) for eigenvector \(\chi \in \hat{G}\). By the remarks following Lemma 1, \({||}{\mathbf {z}_{j}^{\vee }}{||}^{2} = {|}{G}{|}^{-1} \cdot \sum _{\chi \in \hat{G}} {|}{\lambda _{\chi }}{|}^{-2}\). The lemma follows by noting that \({\langle }{\mathbf {z}_{j}^{\vee }, \mathbf {1}}{\rangle } = {\langle }{\mathbf {z}_{1}^{\vee }, \mathbf {1}}{\rangle } = \lambda _{1}^{-1}\).
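Lemma 3 can be checked numerically for a small prime power. The sketch below, for \(m = 16\), computes the right-hand side of Eq. (4) from the eigenvalues \(\lambda _{\chi }\) and compares it with the squared dual norms obtained directly as the diagonal of \((\mathbf {B}^{t} \mathbf {B})^{-1}\). It assumes that G is cyclic (true for prime-power m) and finds a generator by brute force; all function names are illustrative.

```python
import cmath
import math

def g_reps(m):
    """Representatives of G = Z_m^*/{+-1}: units j with 1 <= j < m/2 (assumes m > 2)."""
    return [j for j in range(1, (m + 1) // 2) if math.gcd(j, m) == 1]

def log_z(j, m):
    """Log(z_j) for z_j = zeta^j - 1, as a vector indexed by g_reps(m)."""
    return [math.log(abs(cmath.rect(1.0, 2 * math.pi * i * j / m) - 1))
            for i in g_reps(m)]

def powers_of_generator(m):
    """List G as g^0, g^1, ... for some generator g (brute-force search)."""
    reps = g_reps(m)
    for g in reps:
        powers, x = [], 1
        for _ in reps:
            powers.append(x)
            x = (x * g) % m
            x = min(x, m - x)          # canonical representative modulo +-1
        if len(set(powers)) == len(reps):
            return powers
    raise ValueError("no generator found; G is not cyclic")

def inverse(M):
    """Inverse of a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(M)
    A = [[float(x) for x in M[i]] + [float(i == j) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def lemma3_value(m):
    """Right-hand side of Eq. (4): |G|^{-1} * sum over chi != 1 of |lambda_chi|^{-2}."""
    reps, powers = g_reps(m), powers_of_generator(m)
    n, z1 = len(reps), log_z(1, m)
    total = 0.0
    for t in range(1, n):              # t = 0 is the trivial character, skipped
        # chi_t(g^s) = exp(2 pi i t s / n); lambda = sum_g z1[g] * conj(chi_t(g)).
        lam = sum(z1[reps.index(g)] * cmath.rect(1.0, -2 * math.pi * t * s / n)
                  for s, g in enumerate(powers))
        total += abs(lam) ** -2
    return total / n

def direct_dual_norms(m):
    """Squared norms of the dual basis vectors, as the diagonal of (B^t B)^{-1}."""
    z1 = log_z(1, m)
    B = [[x - y for x, y in zip(log_z(j, m), z1)] for j in g_reps(m) if j != 1]
    gram = [[sum(x * y for x, y in zip(u, v)) for v in B] for u in B]
    inv = inverse(gram)
    return [inv[i][i] for i in range(len(B))]

m = 16
predicted = lemma3_value(m)
direct = direct_dual_norms(m)
```

All entries of `direct` should coincide with `predicted` up to floating-point error, which also reflects the statement of Theorem 3 that all \({||}{\mathbf {b}_{j}^{\vee }}{||}\) are equal.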

We now provide an upper bound on the right-hand side of Eq. (4). Our proof is similar to the proof that the cyclotomic units have finite index in the full group of units [Was97, Theorem 8.2].

Theorem 4

[Was97, Lemma 4.8 and Theorem 4.9]. Let \(\chi \) be an even Dirichlet character of conductor \(f > 1\), and let \(\omega _f = \exp (2\pi \imath /f) \in \mathbb {C}\). Then

$$\begin{aligned} \left| { \sum _{a \in \mathbb {Z}_{f}^{*}} \overline{\chi (a)} \cdot \log {|}{1 -\omega _{f}^a}{|}}\right| = \sqrt{f} \cdot {|}{L(1,\chi )}{|}. \end{aligned}$$

For completeness, we briefly explain how the finite sum on the left hand side gives rise to an L-series, and refer to [Was97] for the details. Using the Taylor expansion

$$ \log |1 - x| = - \sum _{k\ge 1} x^k/k , $$

one gets a sum over finitely many a and infinitely many k of terms \(\overline{\chi (a)} \cdot \omega _f^{ak} / k\). For a fixed k, the sum over a can easily be rewritten as \(\tau (\chi ) \cdot \chi (k)/k\), where \(\tau (\chi )\) is a Gauss sum (see [Was97, Lemma 4.7]), which makes the Dirichlet L-function apparent.

Corollary 1

Suppose \(f > 1\) divides a prime power m. For any even Dirichlet character \(\chi \) of conductor f,

$$\begin{aligned} \left| { \sum _{a \in \mathbb {Z}_{m}^{*}} \overline{\chi (a)} \cdot \log {|}{1 -\omega _{m}^a}{|}}\right| = \sqrt{f} \cdot {|}{L(1,\chi )}{|}. \end{aligned}$$

Proof

Let \(\phi :\mathbb {Z}_m^* \rightarrow \mathbb {Z}_f^*\) be the map given by reduction modulo f. We have

$$\begin{aligned} \sum _{a \in \mathbb {Z}_{m}^{*}} \overline{\chi (a)} \cdot \log {|}{1-\omega _{m}^{a}}{|}&= \sum _{a \in \mathbb {Z}_{f}^{*}} \overline{\chi (a)} \sum _{\begin{array}{c} b \in \mathbb {Z}_{m}^{*} \\ \phi (b) = a \end{array}} \log {|}{1-\omega _{m}^{b}}{|} \\&= \sum _{a \in \mathbb {Z}_{f}^{*}} \overline{\chi (a)} \cdot \log \left| {\prod _{\begin{array}{c} b \in \mathbb {Z}_{m}^{*} \\ \phi (b) =a \end{array}} (1-\omega _{m}^{b})}\right| \\&= \sum _{a \in \mathbb {Z}_{f}^{*}} \overline{\chi (a)} \cdot \log \left| {1-\omega _{f}^{a}}\right| , \end{aligned}$$

where in the last equality we have used the identity \(\prod _{i \in \mathbb {Z}_{n}} (1 - \omega _{n}^{i} Y) = 1-Y^{n}\) and \(\omega _{m}^{n} = \omega _{f}\) with \(n=m/f\). The claim follows by applying Theorem 4.
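The product identity used in the last step can be checked numerically. The sketch below does so for the illustrative choice \(m = 27\), \(f = 9\); the function names are hypothetical and the tolerance is arbitrary.

```python
import cmath
import math

def omega(n: int, e: int) -> complex:
    """Return exp(2*pi*i*e/n), the e-th power of the primitive n-th root omega_n."""
    return cmath.exp(2j * math.pi * e / n)

def fiber_product(m: int, f: int, a: int) -> complex:
    """Product of (1 - omega_m^b) over b in Z_m^* with b = a (mod f)."""
    prod = 1 + 0j
    for b in range(1, m):
        if math.gcd(b, m) == 1 and b % f == a % f:
            prod *= 1 - omega(m, b)
    return prod

m, f = 27, 9  # m = 3^3, and f = 3^2 divides m
max_err = 0.0
for a in range(1, f):
    if math.gcd(a, f) == 1:
        # Compare the product over the fiber of a with 1 - omega_f^a.
        max_err = max(max_err, abs(fiber_product(m, f, a) - (1 - omega(f, a))))
print(max_err)
```

Because m is a prime power, every lift b of a unit a modulo f is automatically a unit modulo m, so each fiber has exactly \(m/f\) elements, as the proof requires.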

We are now ready to complete the proof of the main theorem.

Proof

(Proof of Theorem 3). Recall that the characters \(\chi \in \hat{G}\) correspond to the even characters of \(\mathbb {Z}_{m}^{*}\), because \(\chi (\pm 1) = 1\). Also recall that by Lemma 1, the eigenvalues are

$$\lambda _\chi = {\langle }{\mathbf {z}_{1}, \chi }{\rangle } = \sum _{a \in G} \overline{\chi (a)} \cdot \log |1-\omega _{m}^a| = \frac{1}{2} \sum _{a \in \mathbb {Z}_{m}^{*}} \overline{\chi (\pm a)} \cdot \log |1-\omega _{m}^a|, $$

where the last equality holds because \(|1-\omega _{m}^{-a}| = |1-\omega _{m}^a|\). Therefore, using Corollary 1 we have

$$\begin{aligned} {|}{\lambda _\chi }{|} = \frac{1}{2} {\sqrt{f_{\chi }}} \cdot |L(1,\chi )| , \end{aligned} \tag{5}$$

and so by Lemma 3,

$$ {||}{\mathbf {b}_{j}^{\vee }}{||}^{2} = {|}{G}{|}^{-1} \cdot \sum _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} {|}{\lambda _{\chi }}{|}^{-2} = 4 {|}{G}{|}^{-1} \cdot \sum _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} f_{\chi }^{-1} \cdot |L(1,\chi )|^{-2} .$$
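The first equality above can be checked on a toy case. The sketch below takes \(m = 7\), where \(G\) has three elements and the basis consists of \(\mathbf {b}_{2}, \mathbf {b}_{3}\). It assumes that coordinate i of \(\mathbf {z}_{j}\) is \(\log |1-\omega _{m}^{ij}|\) and that \(\mathbf {b}_{j} = \mathbf {z}_{j} - \mathbf {z}_{1}\); all names are illustrative.

```python
import cmath
import math

M = 7
G = [1, 2, 3]  # representatives of Z_7^* / {+-1}

def z(a: int) -> float:
    """log|1 - omega_m^a| = log(2*|sin(pi*a/m)|)."""
    return math.log(2 * abs(math.sin(math.pi * a / M)))

def z_vec(j: int) -> list:
    """Assumed indexing: coordinate i of z_j is log|1 - omega_m^(i*j)|."""
    return [z(i * j) for i in G]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

z1 = z_vec(1)
basis = [[x - y for x, y in zip(z_vec(j), z1)] for j in (2, 3)]  # b_j = z_j - z_1

# The Gram matrix of the dual basis is the inverse of the Gram matrix of the basis,
# so ||b_2^vee||^2 is the (0, 0) entry of that 2x2 inverse.
g11, g12, g22 = dot(basis[0], basis[0]), dot(basis[0], basis[1]), dot(basis[1], basis[1])
dual_norm_sq = g22 / (g11 * g22 - g12 * g12)

# Non-trivial characters of the cyclic group G = <2>: chi_t(2^k) = exp(2*pi*i*t*k/3).
dlog = {1: 0, 2: 1, 3: 2}  # 2^2 = 4 = -3 (mod 7), so 3 has discrete log 2 in G
spectral = 0.0
for t in (1, 2):
    lam = sum(cmath.exp(-2j * math.pi * t * dlog[a] / 3) * z(a) for a in G)
    spectral += abs(lam) ** -2
spectral /= len(G)

print(dual_norm_sq, spectral)  # the two values should coincide
```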

We first consider the contribution to the sum coming from quadratic characters. When p is an odd prime, there is exactly one quadratic character (see [MV06, Sect. 9.3]), and it is of conductor p, hence by Eq. (2) in Theorem 1, the contribution to the sum is O(1) (assuming it is even; otherwise it does not participate in the sum). In the case \(p=2\) the contribution is also O(1) since there are at most three quadratic characters (see again [MV06, Sect. 9.3]) and their conductor is bounded from above by an absolute constant. Finally, the contribution coming from non-quadratic characters is at most

$$ \ell (m)^{2} \sum _{\chi \in \hat{G}\setminus {\{}{1}{\}}} f_{\chi }^{-1} \le \frac{k}{2} \cdot \ell (m)^2 , $$

where we used Eq. (1) in Theorem 1 and Claim 3 below.

Claim 3

Let \(m = p^{k}\) for a prime p. Then, for \(G = \mathbb {Z}_{m}^{*}/{\{}{\pm 1}{\}}\),

$$ \sum _{\chi \in \hat{G}\setminus {\{}{1}{\}}} f_\chi ^{-1} \le \frac{k}{2}. $$

Proof

Notice that there are at most f Dirichlet characters of conductor f, at most half of which are even (when \(f > 1\)), so

$$ \sum _{\chi \in \hat{G}\setminus {\{}{1}{\}}} f_\chi ^{-1} \le \sum _{\ell = 1}^k \frac{p^{\ell }}{2} \cdot \frac{1}{p^\ell } = \frac{k}{2}. $$
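The bound in the claim is far from tight, which the following sketch makes visible by computing the sum exactly. It relies on the counting fact that the even characters modulo \(p^{\ell }\) form a group of order \(\max (1, \varphi (p^{\ell })/2)\), so that those of conductor exactly \(p^{\ell }\) are the ones not induced from modulus \(p^{\ell -1}\). The function names are hypothetical.

```python
from fractions import Fraction

def euler_phi_prime_power(p: int, l: int) -> int:
    """phi(p^l) for a prime p, with phi(1) = 1."""
    return 1 if l == 0 else (p - 1) * p ** (l - 1)

def even_char_count(p: int, l: int) -> int:
    """Number of even characters modulo p^l, i.e. the order of Z_{p^l}^* / {+-1}."""
    return max(1, euler_phi_prime_power(p, l) // 2)

def conductor_sum(p: int, k: int) -> Fraction:
    """Sum of 1/f_chi over the non-trivial even characters modulo p^k."""
    total = Fraction(0)
    for l in range(1, k + 1):
        # Even characters of conductor exactly p^l: defined modulo p^l
        # but not already induced from modulus p^(l-1).
        exact = even_char_count(p, l) - even_char_count(p, l - 1)
        total += Fraction(exact, p ** l)
    return total

for p, k in [(3, 3), (5, 2), (2, 4)]:
    print(p, k, conductor_sum(p, k), Fraction(k, 2))
```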

4 Algorithmic Implications

The following is our main result about the decoding algorithm, showing that under mild restrictions on the distribution of the short generator, one can recover it from any generator that differs from it by a unit in C. Roughly speaking, the requirement on the distribution is that the ratios between its complex embeddings are not too large. We note that since the \(\mathbf {v}_i\) below are assumed to be orthogonal to the all-1 vector, the scale of the distribution (or variance in the case of Gaussians) is irrelevant: this should not come as a surprise, since, e.g., one can normalize the input generator \(g'\) to have algebraic norm 1.

Theorem 5

Let D be a distribution over \(\mathbb {Q}(\zeta )\) with the property that for any tuple of vectors \(\mathbf {v}_1,\ldots ,\mathbf {v}_{\varphi (m)/2-1} \in \mathbb {R}^{\varphi (m)/2}\) of Euclidean norm 1 that are orthogonal to the all-1 vector \(\mathbf {1}\), the probability that \({|}{{\langle }{{{\mathrm{Log}}}(g),\mathbf {v}_i}{\rangle }}{|} < c \sqrt{m} \cdot (\log m)^{-3/2}\) holds for all i is at least some \(\alpha >0\), where g is chosen from D and c is a universal constant. Then there is an efficient algorithm that given \(g'=g \cdot u\), where g is chosen from D and \(u \in C\) is a cyclotomic unit, outputs an element of the form \(\pm \zeta ^j g\) with probability at least \(\alpha \).

Proof

The algorithm applies the round-off algorithm from Claim 2.1 to \({{\mathrm{Log}}}(g')={{\mathrm{Log}}}(g) + {{\mathrm{Log}}}(u)\), using the vectors \(\mathbf {b}_{j}\) (defined and analyzed in Sect. 3) as the basis. By the assumption on D and Theorem 3, with probability at least \(\alpha \) the output is \({{\mathrm{Log}}}(u) \in {{\mathrm{Log}}}(C)\). We next find integer coefficients \(a_j\) such that \({{\mathrm{Log}}}(u) = \sum a_j \mathbf {b}_{j}\), and compute \(u'=\prod b_j^{a_j}\). Since \({{\mathrm{Log}}}(u')={{\mathrm{Log}}}(u)\) it follows that \(u'\) must be of the form \(\pm \zeta ^j u\) for some sign and some j. Therefore, \(g'/u'\) is the desired element.
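The decoding step can be illustrated on a toy instance. The sketch below works with \(m = 16\), builds the vectors \(\mathbf {b}_{j} = \mathbf {z}_{j} - \mathbf {z}_{1}\) under the assumption that coordinate i of \(\mathbf {z}_{j}\) is \(\log |1-\omega _{m}^{ij}|\), and recovers the integer coefficients of \({{\mathrm{Log}}}(u)\) by rounding the coefficients of \({{\mathrm{Log}}}(g')\). The names, the small perturbation standing in for \({{\mathrm{Log}}}(g)\), and the use of the Gram system in place of an explicit dual basis are all illustrative choices.

```python
import math

M = 16
G = [1, 3, 5, 7]  # representatives of Z_16^* / {+-1}

def z_vec(j):
    """Assumed indexing: coordinate i of z_j is log|1 - omega_m^(i*j)|."""
    return [math.log(2 * abs(math.sin(math.pi * i * j / M))) for i in G]

Z1 = z_vec(1)
BASIS = [[x - y for x, y in zip(z_vec(j), Z1)] for j in G[1:]]  # b_j = z_j - z_1

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def solve(a, rhs):
    """Solve the square system a * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def round_off(t):
    """Return the rounded coefficients round(<b_j^vee, t>) of t in the basis b_j."""
    gram = [[dot(bi, bj) for bj in BASIS] for bi in BASIS]
    coeffs = solve(gram, [dot(b, t) for b in BASIS])
    return [round(c) for c in coeffs]

# Toy instance: Log(u) is an integer combination of the b_j, Log(g) a small vector.
a_true = [2, -1, 3]
log_u = [sum(a * b[i] for a, b in zip(a_true, BASIS)) for i in range(len(G))]
log_g = [0.05, -0.02, -0.04, 0.01]
a_recovered = round_off([u + e for u, e in zip(log_u, log_g)])
print(a_recovered)
```

Rounding succeeds here because the perturbation is small relative to the dual basis vectors, which is exactly the condition that Theorem 5 imposes on D.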

In the next section we show that the condition on D in the theorem is satisfied by several natural distributions.

One possible concern with the above algorithm is that it expects as input \(g \cdot u\) for a cyclotomic unit \(u \in C\), whereas the first phase of the attack described in the introduction, i.e., a PIP algorithm, is only guaranteed to output \(g \cdot u\) for an arbitrary unit \(u \in R^*\). There are several reasons why this should not be an issue. First, as mentioned in Sect. 2, in some cases, e.g., for power-of-2 cyclotomics, it is conjectured that \(C=R^*\). More generally, the index of C in \(R^*\), which we recall is \(h^+\), the class number of the totally real subfield, is often small. In such a case, if we have a list of coset representatives of C in \(R^*\), we can enumerate over all of them and use the algorithm above to recover g, increasing the running time only by a factor of \(h^+\). In order to obtain such a list of representatives, we can use an algorithm for computing the unit group, either classical [BF14] or quantum [EHKS14]. These algorithms are no slower than the known PIP algorithms and moreover, need only be applied once for a given cyclotomic field (as opposed to once for each public key). Alternatively, by running the PIP algorithm multiple times on a basis of a principal ideal with a known short generator chosen using the secret key generation algorithm, we can recover a list of representatives for all the cosets that show up as output of the PIP algorithm with non-negligible probability; we can then enumerate over that list.

In the above statement and proof we glossed over issues of precision and assumed for simplicity, as one often does, that the input \(g'\) is given exactly. To be fully rigorous, one needs to verify that the algorithm can deal with inputs that are specified with finite precision, and still runs in time polynomial in its input size. Typically, by finite precision one means that the input is given in fixed-point representation, providing additive approximation to the true numbers. Here, however, it is more natural to assume that the input is given in (the strictly more general) floating-point representation, providing multiplicative approximation to the true numbers. Not only is this more natural, but also the known PIP algorithms [BF14, Bia14, BS15] generate an output in this format, or an output that can be easily converted to this format (see Footnote 6). Luckily, dealing with floating-point inputs is straightforward. First notice that \({{\mathrm{Log}}}(g')\) can be written in standard fixed-point representation, and so can \({{\mathrm{Log}}}(u)\). The integer coefficients \(a_j\) can be stored exactly since they are at most exponential in the input size. Finally, by using a sufficiently good multiplicative approximation of \(b_j\) (with the multiplicative error being much less than \(1/|a_j|\)), we can obtain an arbitrarily good multiplicative approximation of \(u'\). As a result we get a multiplicative approximation of the desired output \(g'/u'\) that can be made essentially as good as the multiplicative approximation of the input \(g'\).

5 Tail Bounds

In this section we show that the condition on D in Theorem 5 is satisfied by two natural distributions: the continuous Gaussian and a wide enough discrete Gaussian (over any lattice). This section is independent of the other sections in this paper, and we avoid the use of notation from algebraic number theory. Instead, we identify elements of K with vectors in \(\mathbb {R}^{\varphi (m)}\) by taking the real and the imaginary part of their \(\varphi (m)/2\) complex embeddings, i.e., a is mapped to \({(}{{\mathfrak {R}(\sigma _{j}(a)),\mathfrak {I}(\sigma _{j}(a))} }{)}_{j \in G}\). As a result, all random variables appearing here are real. The results in this section should be easy to extend to other distributions.

We start with Lemma 4, a tail bound on the sum of subexponential random variables. The proof is based on a standard Bernstein argument, and follows the proof in [Ver12] apart from some minor modifications for convenience.

Definition 4

For \(\alpha ,\beta >0\), a random variable X is \((\alpha ,\beta )\)-subexponential if

$$ \mathbb {E}[\cosh (\alpha X)] \le \beta , $$

where we recall that \(\cosh (x) := (e^x+e^{-x})/2\).

Lemma 4

(Tail bound). Let \(X_1,\ldots ,X_n\) be independent centered (i.e., expectation zero) \((\alpha ,\beta )\)-subexponential random variables. Then, for any \(\mathbf {a}= (a_1,\ldots ,a_n) \in \mathbb {R}^n\) and every \(t \ge 0\),

$$ \Pr \left[ { \left| {\sum a_i X_i}\right| \ge t}\right] \le 2 \exp \left( {-\min \left( {\frac{\alpha ^2 t^2}{8 \beta \Vert \mathbf {a}\Vert _2^2}, \frac{\alpha t}{2\Vert \mathbf {a}\Vert _\infty }}\right) }\right) . $$

Proof

By scaling, we can assume without loss of generality that \(\alpha =1\). Next, we use the inequality

$$ e^{\delta x} - \delta x - 1 \le (e^{\delta x} - \delta x - 1) + (e^{-\delta x} + \delta x - 1) = 2 (\cosh (\delta x) - 1) \le 2 \delta ^2 (\cosh ( x ) -1) $$

which holds for all \(-1 \le \delta \le 1\) and all \(x \in \mathbb {R}\), where the second inequality follows from the Taylor expansion. By applying this inequality to a \((1,\beta )\)-subexponential centered random variable X, and taking expectations we see that for all \(-1 \le \delta \le 1\),

$$\begin{aligned} \mathbb {E}{[}{\exp (\delta X)}{]}&\le 1+2 \delta ^2 \mathbb {E}{[}{\cosh (X)-1}{]} \\&\le 1+2 \delta ^2 (\beta -1) \le \exp (2 \delta ^2 \beta ). \end{aligned} \tag{6}$$

Using Markov’s inequality, we can bound the upper tail probability for any \(\lambda > 0\) as

$$\begin{aligned} \Pr \left[ { \sum a_i X_i \ge t}\right]&= \Pr \left[ { \exp \left( {\lambda \sum a_i X_i}\right) \ge \exp (\lambda t)}\right] \\&\le \exp (-\lambda t) \cdot \mathbb {E}\left[ {\exp \left( {\lambda \sum a_i X_i}\right) }\right] \\&= \exp (-\lambda t) \cdot \prod \mathbb {E}\left[ {\exp \left( {\lambda a_i X_i}\right) }\right] \\&\le \exp (-\lambda t + 2 \beta \lambda ^2 \Vert \mathbf {a}\Vert _2^2) , \end{aligned}$$

where in the second inequality we used (6) and assumed that \(\lambda \Vert \mathbf {a}\Vert _\infty \le 1\). Taking \(\lambda = \min (t/(4 \beta \Vert \mathbf {a}\Vert _2^2), 1/ \Vert \mathbf {a}\Vert _\infty )\) this bound becomes at most

$$ \exp \left( {-\min \left( {\frac{t^2}{8 \beta \Vert \mathbf {a}\Vert _2^2}, \frac{t}{2\Vert \mathbf {a}\Vert _\infty }}\right) }\right) . $$

We complete the proof by applying the same argument with \(-\mathbf {a}\).
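Two steps of the proof lend themselves to a quick numerical check: the elementary inequality behind Eq. (6), and the fact that the stated choice of \(\lambda \) yields the claimed exponent. The sketch below is only such a check (with \(\alpha = 1\) in the second function); the function names and test grids are arbitrary.

```python
import math

def tail_bound(t, a, alpha, beta):
    """Right-hand side of the tail bound in Lemma 4 for the weight vector a."""
    l2_sq = sum(x * x for x in a)
    linf = max(abs(x) for x in a)
    exponent = min(alpha ** 2 * t ** 2 / (8 * beta * l2_sq), alpha * t / (2 * linf))
    return 2 * math.exp(-exponent)

def proof_exponent(t, a, beta):
    """-lambda*t + 2*beta*lambda^2*||a||_2^2 at the lambda chosen in the proof (alpha = 1)."""
    l2_sq = sum(x * x for x in a)
    linf = max(abs(x) for x in a)
    lam = min(t / (4 * beta * l2_sq), 1 / linf)
    return -lam * t + 2 * beta * lam ** 2 * l2_sq

def bernstein_gap(delta, x):
    """2*delta^2*(cosh(x) - 1) minus (exp(delta*x) - delta*x - 1)."""
    return 2 * delta ** 2 * (math.cosh(x) - 1) - (math.exp(delta * x) - delta * x - 1)

weights = [0.5, -0.5, 0.5, -0.5]
for t in (0.1, 1.0, 5.0, 50.0):
    one_sided = math.exp(proof_exponent(t, weights, beta=1.5))
    print(t, 2 * one_sided, tail_bound(t, weights, alpha=1.0, beta=1.5))
```

For each t, twice the one-sided Chernoff value should not exceed the bound stated in the lemma, and `bernstein_gap` should be non-negative whenever \(|\delta | \le 1\).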

The next claim follows immediately from Definition 4.

Claim 5

If Y is a non-negative random variable such that both \(\mathbb {E}[Y]\) and \(\mathbb {E}[Y^{-1}]\) are finite, then \(\log Y\) is a \((1,\beta )\)-subexponential random variable for some \(\beta >0\).
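To spell out the one-line computation behind this claim (a worked step added for completeness, taking \(\alpha = 1\) in Definition 4 and assuming that Y is positive with probability 1, so that \(\log Y\) is well defined):

$$ \mathbb {E}[\cosh (\log Y)] = \mathbb {E}\left[ \frac{e^{\log Y} + e^{-\log Y}}{2}\right] = \frac{\mathbb {E}[Y] + \mathbb {E}[Y^{-1}]}{2} , $$

so any finite \(\beta \) that is at least the right-hand side works.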

The following is an immediate corollary of the tail bound. It shows that the condition in Theorem 5 holds with overwhelming probability for a continuous Gaussian distribution of any radius that is spherical in the embedding basis. Notice that the parameter r plays no role in the conclusion of the statement.

Lemma 5

Let \(X_1,\ldots ,X_n,X'_1,\ldots ,X'_n\) be i.i.d. N(0, r) variables for some \(r>0\), and let \(\hat{X}_i = (X_i^2 + X_i'^2)^{1/2}\). Then, for any vectors \(\mathbf {a}^{(1)},\ldots ,\mathbf {a}^{(\ell )} \in \mathbb {R}^n\) of Euclidean norm 1 that are orthogonal to the all-1 vector, and every \(t \ge C\) for some universal constant C,

$$ \Pr \left[ { \exists j, \left| {\sum _i a^{(j)}_i \log (\hat{X}_i)}\right| \ge t} \right] \le 2 \ell \exp ({-t/2}) . $$

Proof

By union bound, it suffices to prove the lemma for the case \(\ell =1\), and we let \(\mathbf {a}=\mathbf {a}^{(1)}\). Since \(\sum a_i = 0\), rescaling all the \(\hat{X}_i\) by a common factor does not change \(\sum _i a_i \log \hat{X}_i\), so we can assume without loss of generality that \(r=1\). Notice that \(\hat{X}_i\) has a chi distribution with 2 degrees of freedom (also known as a Rayleigh distribution) whose density function is given by \(x e^{-x^2/2}\) for \(x>0\) and zero otherwise. In particular, it is easy to see that both \(\mathbb {E}[\hat{X}_i]\) and \(\mathbb {E}[\hat{X}_i^{-1}]\) are finite (both are \(\sqrt{\pi /2}\)). Therefore, by Claim 5, \(\log \hat{X}_i\) is \((1,\beta )\)-subexponential for some constant \(\beta >0\). From this it follows that \(\tilde{X}_i := \log \hat{X}_i - \mathbb {E}[\log \hat{X}_i]\) are centered \((1,\beta ')\)-subexponential random variables for some constant \(\beta '>0\). The result now follows by applying Lemma 4 to \(\tilde{X}_1,\ldots ,\tilde{X}_n\), using the bound \(\Vert \mathbf {a}\Vert _\infty \le 1\), and the observation that \(\sum _i a_i \mathbb {E}[\log \hat{X}_i] = 0\).
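The two moments quoted in the proof are easy to confirm numerically. The following sketch integrates against the Rayleigh density with a plain midpoint rule; the cutoff and step count are arbitrary choices, and the function name is hypothetical.

```python
import math

def rayleigh_moment(power: int, upper: float = 12.0, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of E[X^power] for the density x*exp(-x^2/2) on x > 0."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** power * x * math.exp(-x * x / 2)
    return total * h

mean = rayleigh_moment(1)           # E[X]
inverse_mean = rayleigh_moment(-1)  # E[1/X]
beta = (mean + inverse_mean) / 2    # equals E[cosh(log X)]
print(mean, inverse_mean, math.sqrt(math.pi / 2))
```

With these values, \(\log \hat{X}_i\) is \((1,\beta )\)-subexponential for \(\beta = \sqrt{\pi /2}\), which makes the constant in the proof explicit.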

In the next lemma we show that small perturbations of the continuous Gaussian distribution still satisfy the condition in Theorem 5.

Lemma 6

Let \(X=(X_1,\ldots ,X_n,X'_1,\ldots ,X'_n)\) be i.i.d. N(0, r) variables for some \(r>0\), and let \(Y=(Y_1,\ldots ,Y_n,Y'_1,\ldots ,Y'_n)\) be a (not necessarily independent) random vector satisfying \(\Vert Y\Vert _2 \le u\) with probability 1 for some \(u \le r/(20 \sqrt{n})\). Let \(Z=X+Y\) and define \(\hat{X}_i\), \(\hat{Y}_i\), \(\hat{Z}_i\) as before. Then for any vectors \(\mathbf {a}^{(1)},\ldots ,\mathbf {a}^{(\ell )} \in \mathbb {R}^n\) of Euclidean norm 1 that are orthogonal to the all-1 vector, it holds with constant probability that for all j,

$$ \left| {\sum _i a^{(j)}_i \log (\hat{Z}_i)}\right| \le 1 + 10 \log \ell \;. $$

Proof

By Lemma 5 we have that with some constant probability close to 1,

$$\begin{aligned} \forall j, \left| {\sum _i a^{(j)}_i \log (\hat{X}_i)}\right| < 10 \log \ell . \end{aligned} \tag{7}$$

Moreover, since \(\hat{X}_i < r/(10\sqrt{n})\) implies that both \(|X_i|\) and \(|X'_i|\) are smaller than \(r/(10\sqrt{n})\), we see that by independence of \(X_{i}, X'_{i}\), the probability of the former event is at most c/n for some small constant c. As a result, by a union bound over i, we have that with constant probability close to 1,

$$ \forall i, \hat{X}_i > r/(10\sqrt{n}) . $$

In the following we assume that these two conditions hold (which happens with constant probability close to 1 by union bound), and bound the effect of Y. Now let \(\mathbf {a}\) be one of the vectors in the statement of the lemma. Then,

$$\begin{aligned} \left| {\sum _i a_i \log (\hat{Z}_i)}\right|&\le \left| {\sum _i a_i \log (\hat{X}_i)}\right| + \left| {\sum _i a_i \log (\hat{Z}_i/\hat{X}_i)}\right| \\&\le 10 \log \ell + \left| {\sum _i a_i \log (\hat{Z}_i/\hat{X}_i)}\right| , \end{aligned}$$

where we used Eq. (7). Notice that by the triangle inequality (for two-dimensional Euclidean space),

$$ \hat{X}_i - \hat{Y}_i \le \hat{Z}_i \le \hat{X}_i + \hat{Y}_i \;. $$

Since \(\hat{Y}_i \le \Vert Y\Vert _2 \le u \le r / (20\sqrt{n}) \le \hat{X}_i/2\), and using the inequality \(|\log (1+\delta )| \le 2|\delta |\) valid for all \(\delta \in [-1/2,1/2]\),

$$\begin{aligned} \left| {\sum _i a_i \log (\hat{Z}_i/\hat{X}_i)}\right|&\le \left( {\sum _i (\log (\hat{Z}_i/\hat{X}_i))^2}\right) ^{1/2} \\&\le \left( {\sum _i (2 \hat{Y}_i/\hat{X}_i)^2}\right) ^{1/2} \\&\le 20\sqrt{n}/r \cdot \left( {\sum _i \hat{Y}_i^2}\right) ^{1/2} \\&\le 20\sqrt{n}u/r \le 1 , \end{aligned}$$

where the first inequality follows from Cauchy-Schwarz.

Finally, we consider the spherical (in the embedding basis) discrete Gaussian distribution over an arbitrary lattice \(L \subseteq \mathbb {R}^{2n}\). Such distributions show up often in cryptographic constructions (see, e.g., [LPR13]), and often that lattice is the (embedding of the) ring of integers R. For background on the discrete Gaussian distribution and the smoothing parameter, see, e.g., [MR04]. In order to apply Lemma 6 to this distribution, take X to be the continuous Gaussian \(D_r\) for some \(r \ge 100 n \eta _\varepsilon (L)\), and Y the discrete Gaussian \(D_{L-X,s}\) over the coset \(L-X\) of parameter \(s = \eta _\varepsilon (L)\) for some negligible parameter \(\varepsilon \). Using Banaszczyk’s result [Ban93] we have that with all but exponentially small probability in n, \(\Vert Y\Vert _2 \le \sqrt{2n} \eta _\varepsilon (L) \le r / (60\sqrt{n})\). Moreover, by the lemma below, the distribution of \(Z=X+Y\) is within negligible statistical distance of the discrete Gaussian distribution \(D_{L,r'}\) for \(r'=(r^2+\eta _\varepsilon (L)^2)^{1/2}\). We therefore see that the condition in Theorem 5 holds for the discrete Gaussian distribution \(D_{L,r'}\) for any lattice L and any \(r' > 200 n \eta _\varepsilon (L)\).

Lemma 7

(Special Case of [Pei10, Theorem 3.1]). Let L be a lattice and \(r,s>0\) be such that \(s \ge \eta _\varepsilon (L)\) for some \(\varepsilon \le 1/2\). Then if we choose \(\mathbf {x}\) from the continuous Gaussian \(D_{r}\) and then choose \(\mathbf {y}\) from the discrete Gaussian \(D_{L-\mathbf {x},s}\) then \(\mathbf {x}+\mathbf {y}\) is within statistical distance \(8 \varepsilon \) of the discrete Gaussian \(D_{L,(r^2+s^2)^{1/2}}\).

6 Shortest Generators of Principal Ideals and an SVP Algorithm

In a principal ideal \(\mathcal {I}\), how long (in the Euclidean norm) can the shortest generator be, relative to its algebraic norm? In this section we provide lower and upper bounds showing that for a cyclotomic ring R of prime-power index m, the answer is \(\exp (\tilde{\varTheta }(\sqrt{m})) \cdot S(\mathcal {I})\), where \(S(\mathcal {I}) = {{\mathrm{N}}}(\mathcal {I})^{1/\varphi (m)}\) is the dimension-normalized algebraic norm of \(\mathcal {I}\), and \(\tilde{\varTheta }\) hides polylogarithmic factors. (To be precise, the lower bound is under the mild conjecture that \(h^{+}(m) = 2^{O(m)}\); see the end of Sect. 2.4.) By contrast, it is well known (see, e.g., [PR07, Lemmas 6.1 and 6.2]) that the minimum distance (i.e., the length of a shortest nonzero vector) of any ideal lies between \(\varOmega (\sqrt{m}) \cdot S(\mathcal {I})\) and \(O(m) \cdot S(\mathcal {I})\), by the arithmetic-mean/geometric-mean inequality and Minkowski’s theorem, respectively. Therefore, any algorithm that always outputs a generator when given a principal ideal (e.g., the algorithm analyzed in the previous sections) obtains no better than an \(\exp (\tilde{\varOmega }(\sqrt{m}))\) approximation factor for the Shortest Vector Problem, in the worst case.

We first show in Sect. 6.1 that upper and lower bounds on shortest generators follow directly from an analysis of the covering radius of the log-unit lattice \(\varLambda \) (and its sublattice \({{\mathrm{Log}}}C\)), in the \(\ell _{\infty }\) and \(\ell _{1}\) norms (respectively). Sections 6.2 and 6.3 then prove upper and lower bounds on these covering radii. In fact, the proofs demonstrate more: the lower bound holds for “almost all” principal ideals, and the upper bound is algorithmic in the following sense: given an arbitrary generator (which can be found using the quantum PIP algorithm of [BS15, BS16]), we can efficiently find a generator satisfying the bound, which in particular is a \(\exp (\tilde{O}(\sqrt{m}))\)-approximate shortest vector in the ideal.

Throughout this section we let \(m > 2\) be a prime power, and let \(n := {|}{G}{|} = \varphi (m)/2 = \varTheta (m)\). Let H be the subspace of \(\mathbb {R}^{n}\) spanned by \(\varLambda = {{\mathrm{Log}}}R^{*}\) (and by \({{\mathrm{Log}}}C\), the log embedding of the cyclotomic units), which is the subspace orthogonal to \(\mathbf {1}\), the all-1s vector. Define the covering radius of a lattice \(\mathcal {L}\) with respect to the \(\ell _p\) norm as

$$ \mu ^{(p)}(\mathcal {L}) = \max _{\mathbf {x}\in {{\mathrm{span}}}(\mathcal {L})} \min _{\mathbf {v}\in \mathcal {L}} {||}{\mathbf {x}-\mathbf {v}}{||}_p = \max _{\mathbf {x}\in {{\mathrm{span}}}(\mathcal {L})} \min _{\mathbf {v}\in \mathbf {x}+\mathcal {L}} {||}{\mathbf {v}}{||}_p . $$

6.1 Relation to Covering Radius

For any \(g \in R\), let \(\mathcal {I}= gR\). Also let \(\mathbf {g}= {{\mathrm{Log}}}(g)\) and write it as \(\mathbf {g}= s \mathbf {1}+ \mathbf {g}_{H}\) where \(\mathbf {g}_{H} \in H\). Observe that \(s = \log S(\mathcal {I})\), because

$$\begin{aligned} {{\mathrm{N}}}(\mathcal {I}) = {{\mathrm{N}}}(g) = \prod _{i \in \mathbb {Z}_{m}^{*}} \sigma _{i}(g) = \prod _{i \in G} {|}{\sigma _{i}(g)}{|}^{2} = \exp (2 {\langle }{\mathbf {g}, \mathbf {1}}{\rangle }) = \exp (s \cdot \varphi (m)). \end{aligned}$$
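A small numerical instance may help fix the notation. The sketch below takes \(m = 5\) and the arbitrarily chosen element \(g = 2 + \zeta \), computes \({{\mathrm{Log}}}(g)\), splits it as \(s \mathbf {1}+ \mathbf {g}_{H}\), and compares \(\exp (s \cdot \varphi (m))\) with the algebraic norm. The helper `embed` and the coefficient-list representation of g are illustrative assumptions.

```python
import cmath
import math

M = 5
G = [1, 2]  # representatives of Z_5^* / {+-1}
PHI = 4     # phi(5)

def embed(coeffs, i):
    """sigma_i(g) for g = sum_k coeffs[k] * zeta^k, i.e. zeta is sent to omega_m^i."""
    w = cmath.exp(2j * math.pi * i / M)
    return sum(c * w ** k for k, c in enumerate(coeffs))

g = [2, 1]  # the element g = 2 + zeta

log_g = [math.log(abs(embed(g, i))) for i in G]  # Log(g), one entry per i in G
s = sum(log_g) / len(G)                          # coefficient of the all-1 vector
g_H = [x - s for x in log_g]                     # component in H (coordinates sum to 0)

norm = 1 + 0j
for i in range(1, M):                            # product over all of Z_5^*
    norm *= embed(g, i)
norm = abs(norm)

print(norm, math.exp(s * PHI))
```

Here the norm can also be computed by hand as the value of the fifth cyclotomic polynomial at \(-2\), namely 11, so \(S(\mathcal {I}) = 11^{1/4}\) for \(\mathcal {I}= gR\).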

Lemma 8

Let g, \(\mathcal {I}\), s, and \(\mathbf {g}_{H}\) be as above. There exists an efficient algorithm that, given g and any \(\mathbf {h}_{H} \in \mathbf {g}_{H} + {{\mathrm{Log}}}C\), outputs a generator h of \(\mathcal {I}\) such that

$$ {||}{h}{||} \le \sqrt{\varphi (m)} \cdot \exp ({||}{\mathbf {h}_{H}}{||}_{\infty }) \cdot S(\mathcal {I}). $$

In particular, there exists a generator of Euclidean norm at most \(\sqrt{\varphi (m)} \cdot \exp (\mu ^{(\infty )}({{\mathrm{Log}}}C)) \cdot S(\mathcal {I})\).

Proof

As in the proof of Theorem 5, for simplicity we ignore issues of precision; see the discussion at the end of Sect. 4. The algorithm lets \(\mathbf {u}= \mathbf {h}_{H} - \mathbf {g}_{H} \in {{\mathrm{Log}}}C\), computes the coefficients \(a_{j} \in \mathbb {Z}\) such that \(\mathbf {u}= \sum a_{j} \mathbf {b}_{j}\), and outputs \(h = g \cdot \prod b_{j}^{a_{j}}\). Because \(\mathbf {h}:= {{\mathrm{Log}}}(h) = {{\mathrm{Log}}}(g) + \mathbf {u}= s \mathbf {1}+ \mathbf {h}_{H}\), we have

$$ {||}{h}{||}^{2} = \sum _{i \in \mathbb {Z}_{m}^{*}} {|}{\sigma _{i}(h)}{|}^{2} \le \varphi (m) \cdot \exp ({||}{\mathbf {h}}{||}_{\infty })^{2} \le \varphi (m) \cdot \exp ({||}{\mathbf {h}_{H}}{||}_{\infty })^{2} \cdot S(\mathcal {I})^{2}. $$

Lemma 9

There exists a principal ideal \(\mathcal {I}\subseteq R\) for which every generator has Euclidean norm at least \(\exp (\varOmega (\mu ^{(1)}(\varLambda )/m)) \cdot S(\mathcal {I})\).

In fact, the proof shows that a “random principal ideal,” i.e., one whose generators correspond to a uniformly random coset of the log-unit lattice, satisfies the above bound with overwhelming probability. (Formalizing this requires a bit more effort; we omit the details.)

Proof

Let \(\mathbf {x}+ \varLambda \subset H\) be a “deep hole” coset of \(\varLambda \) in the \(\ell _{1}\) norm, i.e., one for which \({||}{\mathbf {v}}{||}_{1} \ge \mu ^{(1)}(\varLambda )\) for every \(\mathbf {v}\in \mathbf {x}+\varLambda \subset H\). Because the n coordinates of any such \(\mathbf {v}\) sum to zero, the sum of the positive coordinates must be exactly \({||}{\mathbf {v}}{||}_{1}/2\), and therefore there must be a coordinate that is at least \(\mu ^{(1)}(\varLambda )/(2n) = \varOmega (\mu ^{(1)}(\varLambda )/m)\).

Next, assume for a moment that there exists \(g \in R\) for which \(\mathbf {g}_{H}=\mathbf {x}\), where as before we write \(\mathbf {g}= {{\mathrm{Log}}}(g) = s \mathbf {1}+ \mathbf {g}_{H}\). Then any generator h of the ideal \(\mathcal {I}= gR\) satisfies \({{\mathrm{Log}}}(h) \in {{\mathrm{Log}}}(g) + \varLambda = s \mathbf {1}+ \mathbf {x}+ \varLambda \), so by the observation above, it must have the claimed Euclidean norm.

To complete the proof, notice that even if there does not exist a g as above, one can find g so as to make \(\mathbf {g}_{H}\) arbitrarily close to \(\mathbf {x}\), which suffices for the above analysis. To see this, consider \(x = M \cdot {{\mathrm{Exp}}}(\mathbf {x})\), where M is a sufficiently large integer and \({{\mathrm{Exp}}}(\mathbf {x}) \in {{\mathrm{Log}}}^{-1}(\mathbf {x})\) denotes an arbitrary preimage in \(K_{\mathbb {R}} := K \otimes _{\mathbb {Q}} \mathbb {R}\) of \(\mathbf {x}\) under the log embedding (extended to \(K_{\mathbb {R}}\)). Then rounding x to a nearest \(g \in R\) yields the claim.

6.2 Covering Radius Upper Bound and an SVP Algorithm

Theorem 6

There is an efficient randomized algorithm that given any vector \(\mathbf {x}\in H\) outputs a vector \(\mathbf {v}\in {{\mathrm{Log}}}C\) such that \({||}{\mathbf {x}-\mathbf {v}}{||}_\infty = O(\sqrt{m \log m})\) with high probability.

Before giving the proof, we mention some implications of the theorem. First, using the fact that \({{\mathrm{Log}}}C\) is a sublattice of \(\varLambda \), we immediately get the following corollary regarding the covering radii of these lattices.

Corollary 2

For a prime power m, we have \(\mu ^{(\infty )}(\varLambda ) \le \mu ^{(\infty )}({{\mathrm{Log}}}C) \le O(\sqrt{m \log m})\).

We remark that this corollary can also be obtained directly from Lemma 11 below and the non-trivial result of Banaszczyk and Szarek [BS97] (see also [Ban98]). We also note that if the Komlós conjecture is true, then the \(\sqrt{\log m}\) factor in the corollary can be removed.

It follows immediately from the corollary and Lemma 8 that any principal ideal \(\mathcal {I}\) has a generator whose Euclidean norm is at most \(\exp (O(\sqrt{m \log m})) \cdot S(\mathcal {I})\). This also leads to an efficient quantum algorithm providing a non-trivial approximation to SVP in principal ideals, as described in the following theorem.

Theorem 7

There is an efficient quantum algorithm that approximates SVP on principal ideal lattices in cyclotomics of prime-power index m to within approximation factor \(2^{O(\sqrt{m \log m})}\).

Proof

Given a principal ideal \(\mathcal {I}\), first use the efficient quantum algorithm of Biasse and Song [BS15] to recover a generator g of \(\mathcal {I}\), and as above, write \({{\mathrm{Log}}}(g) = s \mathbf {1}+ \mathbf {g}_{H}\) for \(\mathbf {g}_{H} \in H\). Next, apply Theorem 6 to \(\mathbf {g}_{H}\) and let \(\mathbf {v}\in {{\mathrm{Log}}}C\) be the output. Finally, apply the algorithm from Lemma 8 with g and \(\mathbf {h}_{H} := \mathbf {g}_{H} - \mathbf {v}\) to find a generator h whose Euclidean norm is at most \(\exp (O(\sqrt{m \log m})) \cdot S(\mathcal {I})\), and output h. It is sufficiently short since, as mentioned at the start of the section, \(\lambda _1(\mathcal {I})=\varOmega (\sqrt{m}) \cdot S(\mathcal {I})\) by the arithmetic mean-geometric mean inequality.

For the proof of Theorem 6, we need a simple probabilistic lemma, as well as a bound on the norm of the \(\mathbf {b}_j\). For \(\alpha \in [0,1]\), define \(S(\alpha )\) as the unique probability distribution on support \(\{\alpha ,\alpha -1\}\) with expectation 0 (i.e., it assigns probability \(1-\alpha \) to \(\alpha \) and probability \(\alpha \) to \(\alpha -1\)).

Lemma 10

Let \(\mathbf {A}\) be an \(n \times n\) matrix all of whose rows have Euclidean norm at most \(T>0\), and let \(\alpha _1,\ldots ,\alpha _n \in [0,1]\) be arbitrary. Let \(x_1,\ldots ,x_n\) be independent with \(x_i\) distributed as \(S(\alpha _i)\), and let \(\mathbf {x}= (x_1,\ldots ,x_n)\). Then with probability \(\varOmega (1/\sqrt{n})\), both

$$\begin{aligned} \Vert \mathbf {A}\mathbf {x}\Vert _\infty \le O(T \sqrt{\log n}) \qquad \text {and} \qquad \left| {\sum x_i}\right| \le O(1) . \end{aligned}$$

Proof

Since \(S(\alpha )\) is bounded, it is a subgaussian random variable of constant subgaussian norm. (See [Ver12, Sect. 5.2.3] for the definition and properties of subgaussian random variables.) Because the sum of independent subgaussian random variables is also subgaussian (see [Ver12, Lemma 5.9]), \((\mathbf {A}\mathbf {x})_i\) has subgaussian norm O(T) for every \(i=1,\ldots ,n\). Therefore, for a large enough universal constant \(C>0\),

$$ \Pr \left[ { \left| {(\mathbf {A}\mathbf {x})_i}\right| > C T \sqrt{\log n} }\right] = O(1/n^2) , $$

and by a union bound we get

$$\begin{aligned} \Pr \left[ { {||}{\mathbf {A}\mathbf {x}}{||}_\infty > C T \sqrt{\log n} }\right] = O(1/n) . \end{aligned} \tag{8}$$

Next, by the Berry-Esseen theorem (see, e.g., [O’D14, Sect. 5.2]), since the \(x_i\) have expectation 0 and bounded second and third moments, the probability that \({|}{\sum x_i}{|} = O(1)\) is \(\varOmega (1/\sqrt{n})\). Together with Eq. (8) and the union bound, this completes the proof.

Lemma 11

Let m be a prime power. Then for all \(j \in G\), \(\Vert \mathbf {z}_j\Vert = O(\sqrt{m})\), where \(\mathbf {z}_j\) are the vectors defined in Eq. (3).

Proof

Notice that

$$\begin{aligned} \Vert \mathbf {z}_j \Vert ^2&= \sum _{i \in G} \log ^2 {|}{\omega ^{ij}-1}{|} = \sum _{i \in G} \log ^2 {|}{\omega ^{i}-1}{|}\\&= \sum _{i \in G} \log ^2{|}{2 \sin (\pi i /m)}{|} \le \sum _{i=1}^{{\lfloor }{m/2}{\rfloor }} \log ^2(2 \sin (\pi i /m))\\&= \sum _{i=1}^{{\lfloor }{m/2}{\rfloor }} f(i/m) , \end{aligned} \tag{9}$$

where \(f:[0,1/2] \rightarrow \mathbb {R}\) is given by \(f(x) = \log ^2(2\sin (\pi x))\). Since \(f(x) \le \log 2\) for \(1/6 \le x \le 1/2\) (recall that \(\sin (\pi /6)=1/2\)), the contribution to the sum in Eq. (9) coming from \(i > {\lfloor }{m/6}{\rfloor }\) is at most O(m). It therefore suffices to consider the contribution coming from \(i \in \{1,\ldots ,{\lfloor }{m/6}{\rfloor }\}\). Since \(\sin (\pi x) \ge 2x\) for \(0 \le x \le 1/2\) (as follows from the concavity of sine on \([0,\pi /2]\)), that contribution satisfies

$$\begin{aligned} \sum _{i=1}^{{\lfloor }{m/6}{\rfloor }} f(i/m) \le \sum _{i=1}^{{\lfloor }{m/6}{\rfloor }} \log ^2(4i/m) \le m \int _{0}^{1/6} \log ^2(4x) dx = O(m) \;, \end{aligned}$$

the last equality following from

$$\begin{aligned} \int _{0}^{y} \log ^2(x) dx = y (\log ^2 y - 2\log y +2). \end{aligned}$$
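The linear growth of the sum in Eq. (9) is easy to observe numerically. The sketch below evaluates that sum for a few prime powers and divides by m; the function name and the sample values of m are arbitrary.

```python
import math

def eq9_sum(m: int) -> float:
    """The final sum in Eq. (9): sum over i = 1..floor(m/2) of log^2(2*sin(pi*i/m))."""
    return sum(math.log(2 * math.sin(math.pi * i / m)) ** 2 for i in range(1, m // 2 + 1))

ratios = {m: eq9_sum(m) / m for m in (8, 27, 125, 1024, 6561)}
for m, ratio in ratios.items():
    print(m, ratio)
```

The ratios stay bounded as m grows, consistent with \(\Vert \mathbf {z}_j\Vert ^{2} = O(m)\).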

Proof

(Proof of Theorem 6). Given any \(\mathbf {y}\in H\), find real coefficients \((a_j)_{j \in G\setminus \{1\}}\) such that \( \mathbf {y}= \sum a_j \mathbf {b}_j\). For \(j \in G\setminus {\{}{1}{\}}\), let \(\alpha _j = (a_{j}\text { mod }1) \in [0,1)\) be the fractional part of \(a_{j}\), and let \(x_j\) be independent random variables distributed like \(S(\alpha _j)\). The algorithm outputs \(\mathbf {u}= \sum (a_j - x_j) \mathbf {b}_j\). Notice that \(\mathbf {u}\in {{\mathrm{Log}}}C\) as desired. To analyze the distance of \(\mathbf {u}\) from \(\mathbf {y}\), for convenience let \(x_1\) be an independent random variable distributed like S(0) (so \(x_{1}=0\) always). Recalling that \(\mathbf {b}_j = \mathbf {z}_j - \mathbf {z}_1\), write

$$ \mathbf {y}- \mathbf {u}= \sum _{j \in G} x_j (\mathbf {z}_j - \mathbf {z}_1) = \sum _{j \in G} x_j \mathbf {z}_j - \left( {\sum _{j \in G} x_j}\right) \mathbf {z}_1 , $$

and so by the triangle inequality

$$\begin{aligned} {||}{\mathbf {y}- \mathbf {u}}{||}_\infty&\le \left| \!\left| {\sum _{j \in G} x_j \mathbf {z}_j}\right| \!\right| _\infty + \left| {\sum _{j \in G} x_j}\right| \cdot {||}{\mathbf {z}_1}{||}_\infty \\&\le \left| \!\left| {\sum _{j \in G} x_j \mathbf {z}_j}\right| \!\right| _\infty + \left| {\sum _{j \in G} x_j}\right| \cdot O(\sqrt{m}) , \end{aligned}$$

where we used the trivial bound \({||}{\mathbf {z}_1}{||}_\infty \le {||}{\mathbf {z}_1}{||}_2\) and applied Lemma 11.Footnote 7 We now apply Lemma 10 to the matrix \(\mathbf {Z}\) whose columns are the \(\mathbf {z}_j\). Since \(\mathbf {Z}\) is G-circulant, the Euclidean norms of all its rows and columns are the same, and by Lemma 11 are \(O(\sqrt{m})\). We therefore obtain that with probability \(\varOmega (1/\sqrt{n})\),

$$\begin{aligned} {||}{\mathbf {y}- \mathbf {u}}{||}_\infty&\le O(\sqrt{m \log n}) + O(\sqrt{m}) = O(\sqrt{m \log m}) , \end{aligned}$$

as desired. The success probability can be amplified by repetition.
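The rounding step of this algorithm can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The function `sample_S` is a hypothetical stand-in for \(S(\alpha )\): it uses a zero-mean two-point distribution on \(\{\alpha - 1, \alpha \}\), whereas the definition given with Lemma 10 is the authoritative one. The basis vectors \(\mathbf {b}_j\) are passed in as plain lists.

```python
import math
import random

def sample_S(alpha, rng):
    """Hypothetical stand-in for S(alpha): zero-mean two-point law on {alpha - 1, alpha}.

    P[x = alpha] = 1 - alpha and P[x = alpha - 1] = alpha, so E[x] = 0,
    and S(0) is identically 0.
    """
    if alpha == 0.0:
        return 0.0
    return alpha if rng.random() < 1.0 - alpha else alpha - 1.0

def randomized_round(coeffs, rng):
    """Return the integers a_j - x_j, where x_j ~ S(a_j mod 1) independently."""
    rounded = []
    for a in coeffs:
        alpha = a - math.floor(a)      # fractional part of a_j, in [0, 1)
        x = sample_S(alpha, rng)
        rounded.append(round(a - x))   # a_j - x_j is an integer up to float error
    return rounded

def combine(int_coeffs, basis):
    """Return u = sum_j c_j * b_j for integer coefficients c_j and basis vectors b_j."""
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(int_coeffs, basis)) for i in range(dim)]

if __name__ == "__main__":
    rng = random.Random(0)
    coeffs = [2.3, -1.75, 0.0, 4.5]
    print(randomized_round(coeffs, rng))
```

Each output coefficient is either the floor or the ceiling of \(a_j\), so the resulting \(\mathbf {u}\) is an integer combination of the \(\mathbf {b}_j\). Repeating the call with fresh randomness corresponds to the amplification by repetition mentioned above.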

6.3 Covering Radius Lower Bound

Let \(h' := (h^+)^{1/(n-1)}\), which we recall is conjectured to be constant. Combined with Lemma 9, the theorem below shows that there exists a principal ideal \(\mathcal {I}\subseteq R\) for which every generator has Euclidean norm at least \(\exp (\varOmega (\sqrt{m} / (h' \log m))) \cdot S(\mathcal {I})\).

Theorem 8

For a prime power m, the log-unit lattice satisfies

$$ \mu ^{(1)}(\varLambda ) \ge \varOmega \big (m^{3/2} / (h' \log m)\big ) . $$

Proof

Using Lemma 12 below,

$$ (\det ({{\mathrm{Log}}}C))^{1/(n-1)} = \varOmega (m^{1/2} / \log m ) . $$

Since \(\det (\varLambda ) = \det ({{\mathrm{Log}}}C) / h^+\),Footnote 8

$$ (\det (\varLambda ))^{1/(n-1)} = \varOmega (m^{1/2} / (h' \log m)) \; . $$

The theorem now follows from the fact that

$$ {{\mathrm{vol}}}(B_1^{n} \cap H) \le \sqrt{n} \cdot 2^{n-1} / (n-1)! = O(1/n)^{n-1} , $$

where \(B_1^{n} := \{ \mathbf {x}\in \mathbb {R}^{n} : \Vert \mathbf {x}\Vert _1 \le 1 \}\). To prove this inequality, notice that (1) the volume of \(B_1^{n-1}\) is \(2^{n-1} / (n-1)!\), (2) the projection of \(B_1^{n} \cap H\) on the first \(n-1\) coordinates is contained in \(B_1^{n-1}\), and (3) this projection shrinks volumes by \(\sqrt{n}\), as can be seen by computing its Jacobian.
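The final step can be spelled out as follows. This is a sketch of the standard volume argument, assuming \(\mu ^{(1)}(\varLambda )\) denotes the \(\ell _1\) covering radius of \(\varLambda \) inside \(H\), abbreviated here as \(\mu \). The translates of \(\mu \cdot (B_1^{n} \cap H)\) by the points of \(\varLambda \) cover \(H\), so the volume of each translate is at least \(\det (\varLambda )\). Since \(n = \varTheta (m)\) for a prime power m, this gives

$$ \mu \ge \left( \frac{\det (\varLambda )}{{{\mathrm{vol}}}(B_1^{n} \cap H)}\right) ^{1/(n-1)} \ge \frac{\varOmega (m^{1/2} / (h' \log m))}{O(1/n)} = \varOmega \big (m^{3/2} / (h' \log m)\big ) . $$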

Lemma 12

The determinant of \({{\mathrm{Log}}}C\) satisfies

$$\begin{aligned} \det ({{\mathrm{Log}}}C)^{1/(n-1)} = \varOmega (\sqrt{m} / \log m ). \end{aligned}$$

Proof

Recall from the proof of Lemma 3 that \(\mathbf {b}_{j}^{\vee }\) is the projection of \(\mathbf {z}_{j}^{\vee }\) orthogonal to \(\mathbf {1}\). The |G|-dimensional full-rank lattice generated by \({\{}{ \mathbf {z}_{j}^{\vee } }{\}}_{j \in G}\) has determinant

$$ | \det (\mathbf {Z}^{-t}) | = \prod _{\chi \in \widehat{G}} |\lambda _\chi ^{-1}| . $$

Next, notice that the shortest vector in the intersection of this lattice with the span of \(\mathbf {1}\) is \(\mathbf {Z}^{-t} \mathbf {1}= \lambda _1^{-1} \mathbf {1}\), whose Euclidean norm is \(\lambda _1^{-1} \sqrt{|G|}\). Therefore, the dual of \({{\mathrm{Log}}}C\), which is the projection of this lattice orthogonally to \(\mathbf {1}\), has determinant

$$ |G|^{-1/2} \prod _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} |\lambda _\chi ^{-1}| , $$

and therefore

$$\begin{aligned} \det ({{\mathrm{Log}}}C)&= |G|^{1/2} \prod _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} |\lambda _\chi | \\&= |G|^{1/2} \prod _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} \left| {\frac{1}{2} \sqrt{f_\chi } \cdot L(1,\chi )}\right| , \end{aligned}$$
(10)

where we used Eq. (5). Letting \(m=p^{k}\) for a prime p, and using Theorem 1, we get that

$$ L := \prod _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} |L(1,\chi )| = \varOmega ((\log m)^{-(n-1-q)} \cdot p^{-q/2}) $$

where q denotes the number of even quadratic characters modulo m, which is at most 3 (see [MV06, Sect. 9.3]). We conclude that

$$\begin{aligned} L^{1/(n-1)} = \varOmega (1/\log m). \end{aligned}$$
(11)

Next, consider \(F = \prod _{\chi \in \widehat{G} \setminus {\{}{1}{\}}} f_\chi \). For each \(0 < j \le k\), there are exactly \(\varphi (p^j) - \varphi (p^{j-1})\) characters of conductor \(f_\chi = p^j\). Exactly half of these are even when p is odd and \(j>1\), and also when \(p=2\) and \(j>2\). When p is odd and \(j=1\) there are \(\varphi (p)/2 - 1\) even characters of conductor p, and when \(p=2\) there are no even characters of conductor 2 or 4. Assuming p is odd (the case \(p=2\) being very similar), this leads to

$$\begin{aligned} \log _{p} F&= \sum _{j=1}^k j \cdot \frac{\varphi (p^j) - \varphi (p^{j-1})}{2} - \frac{1}{2} \\&= \frac{k}{2} \cdot \varphi (p^k) - \frac{1}{2} \sum _{j=0}^{k-1} \varphi (p^j) - \frac{1}{2} \\&= kn - \frac{p-1}{2} \sum _{j=0}^{k-2} p^j -1 = kn - \frac{p^{k-1}}{2} - \frac{1}{2} , \end{aligned}$$

and, since \(m = p^k\) and \(p^{k-1} = 2n/(p-1)\), we conclude that

$$\begin{aligned} F = m^{n \left( 1 - \frac{1}{k(p-1)} - \frac{1}{2kn} \right) }&= \varOmega (m)^{n} . \end{aligned}$$
(12)

Plugging (11) and (12) into (10) completes the proof.
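The closed form for \(\log _p F\) derived above can be sanity-checked by direct enumeration for small odd prime powers. The following is a minimal sketch, not part of the original proof, and all function names are hypothetical. It assumes p is odd, so that \(\mathbb {Z}_{p^k}^*\) is cyclic of order \(\varphi (p^k)\) and its characters can be indexed by an exponent \(a \in \{0, \ldots , \varphi (p^k)-1\}\). Under this indexing, the character with index a is even exactly when a is even, and it is induced from modulus \(p^j\) exactly when \(\varphi (p^k)/\varphi (p^j)\) divides a.

```python
def phi_prime_power(p, j):
    """Euler's phi(p^j) for a prime p, with phi(1) = 1."""
    return 1 if j == 0 else (p - 1) * p ** (j - 1)

def log_p_conductor(a, p, k):
    """log_p of the conductor of the character of (Z/p^k)^* with index a (p odd)."""
    phi_m = phi_prime_power(p, k)
    for j in range(k + 1):
        # The character factors through (Z/p^j)^* iff phi(p^k)/phi(p^j) divides a.
        if a % (phi_m // phi_prime_power(p, j)) == 0:
            return j
    raise AssertionError("unreachable: j = k always satisfies the test")

def log_p_F_by_enumeration(p, k):
    """Sum of log_p(f_chi) over the nontrivial even characters (even a, a != 0)."""
    phi_m = phi_prime_power(p, k)
    return sum(log_p_conductor(a, p, k) for a in range(2, phi_m, 2))

def log_p_F_closed_form(p, k):
    """The closed form kn - p^(k-1)/2 - 1/2, with n = phi(p^k)/2."""
    n = phi_prime_power(p, k) // 2
    return k * n - (p ** (k - 1) + 1) // 2

if __name__ == "__main__":
    for p, k in [(3, 2), (5, 2), (7, 2), (3, 3)]:
        print(p, k, log_p_F_by_enumeration(p, k), log_p_F_closed_form(p, k))
```

The two columns printed for each pair (p, k) should agree; the check covers only the odd-p case treated in the text.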