Abstract
We report speed records for Falcon signature generation and verification on the ARMv8-A architecture. Our implementations are benchmarked on Apple M1 ‘Firestorm’, Raspberry Pi 4 Cortex-A72, and Jetson AGX Xavier. Our optimized signature generation is \(2\times\) slower, but our signature verification is 3–3.9\(\times\) faster, than the state-of-the-art CRYSTALS-Dilithium implementation on the same platforms. Faster signature verification may be particularly useful for the client side on constrained devices. Our Falcon implementation outperforms the previous work targeting Jetson AGX Xavier by factors of \(1.48\times\) for signing in falcon512 and falcon1024, \(1.52\times\) for verifying in falcon512, and \(1.70\times\) for verifying in falcon1024. We achieve the improvement in Falcon signature generation by supporting a larger subset of possible parameter values for FFT-related functions and by applying our compressed twiddle-factor table to reduce memory usage. We also demonstrate that the recently proposed signature scheme Hawk, which shares optimized functionality with Falcon, has \(3.3\times\) faster signature generation and 1.6–1.9\(\times\) slower signature verification when implemented on the same ARMv8 processors as Falcon.
Notes
- 8. mitigations=off https://make-linux-fast-again.com/.
Acknowledgments
This work has been partially supported by the National Science Foundation under Grant No.: CNS-1801512 and by the US Department of Commerce (NIST) under Grant No.: 70NANB18H218.
A Visualizing Complex Point Multiplication
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, D.T., Gaj, K. (2023). Fast Falcon Signature Generation and Verification Using ARMv8 NEON Instructions. In: El Mrabet, N., De Feo, L., Duquesne, S. (eds) Progress in Cryptology - AFRICACRYPT 2023. AFRICACRYPT 2023. Lecture Notes in Computer Science, vol 14064. Springer, Cham. https://doi.org/10.1007/978-3-031-37679-5_18
DOI: https://doi.org/10.1007/978-3-031-37679-5_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37678-8
Online ISBN: 978-3-031-37679-5
eBook Packages: Computer Science, Computer Science (R0)