Dieter Gollmann · Atsuko Miyaji · Hiroaki Kikuchi (Eds.)

LNCS 10355

Applied Cryptography and Network Security

15th International Conference, ACNS 2017
Kanazawa, Japan, July 10–12, 2017

Proceedings


Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board

David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Zurich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany

Dieter Gollmann · Atsuko Miyaji · Hiroaki Kikuchi (Eds.)

Applied Cryptography and Network Security

15th International Conference, ACNS 2017
Kanazawa, Japan, July 10–12, 2017

Proceedings

Editors

Dieter Gollmann
Hamburg University of Technology
Hamburg, Germany

Atsuko Miyaji
Graduate School of Engineering, Osaka University
Suita, Osaka, Japan

Hiroaki Kikuchi
Department of Frontier Media Science, Meiji University
Tokyo, Japan

ISSN 0302-9743        ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-61203-4        ISBN 978-3-319-61204-1 (eBook)
DOI 10.1007/978-3-319-61204-1

Library of Congress Control Number: 2017944358

LNCS Sublibrary: SL4 – Security and Cryptology

© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature. The registered company is Springer International Publishing AG.


The 15th International Conference on Applied Cryptography and Network Security (ACNS 2017) was held in Kanazawa, Japan, during July 10–12, 2017. The previous conferences in the ACNS series were successfully held in Kunming, China (2003), Yellow Mountain, China (2004), New York, USA (2005), Singapore (2006), Zhuhai, China (2007), New York, USA (2008), Paris, France (2009), Beijing, China (2010), Malaga, Spain (2011), Singapore (2012), Banff, Canada (2013), Lausanne, Switzerland (2014), New York, USA (2015), and London, UK (2016).

ACNS is an annual conference focusing on innovative research and current developments that advance the areas of applied cryptography, cyber security, and privacy. Academic research with high relevance to real-world problems as well as developments in industrial and technical frontiers fall within the scope of the conference.

This year we received 149 submissions from 34 different countries. Each submission was reviewed by 3.7 Program Committee members on average. Papers submitted by Program Committee members received on average 4.4 reviews. The committee decided to accept 34 regular papers. The broad range of areas covered by the high-quality papers accepted for ACNS 2017 attests to the fulfillment of the conference goals.

The program included two invited talks given by Dr. Karthikeyan Bhargavan (Inria Paris) and Prof. Doug Tygar (UC Berkeley).

The decision on the best student paper award was based on a vote among the Program Committee members. To be eligible for selection, the primary author of the paper had to be a full-time student present at the conference. The winners were Carlos Aguilar-Melchor, Martin Albrecht, and Thomas Ricosset from Université de Toulouse, Toulouse, France; Royal Holloway, University of London, UK; and Thales Communications & Security, Gennevilliers, France, for their paper "Sampling from Arbitrary Centered Discrete Gaussians for Lattice-Based Cryptography."


We would like to thank the authors for submitting their papers to the conference. The selection of the papers was a challenging and dedicated task, and we are deeply grateful to the 48 Program Committee members and the external reviewers for their reviews and discussions. We would also like to thank EasyChair for providing a user-friendly interface for us to manage all submissions and proceedings files. Finally, we would like to thank the general chair, Prof. Hiroaki Kikuchi, and the members of the local Organizing Committee.

July 2017 Dieter Gollmann


The 15th International Conference on Applied Cryptography and Network Security

Jointly organized by
Osaka University,
Japan Advanced Institute of Science and Technology (JAIST),
and Information-technology Promotion Agency (IPA)

General Chair

Hiroaki Kikuchi Meiji University, Japan

Program Co-chairs

Dieter Gollmann Hamburg University of Technology, Germany
Atsuko Miyaji Osaka University / JAIST, Japan

Program Committee

Diego Aranha University of Campinas, Brazil
Giuseppe Ateniese Stevens Institute of Technology, USA
Man Ho Au Hong Kong Polytechnic University, Hong Kong, SAR China
Carsten Baum Bar-Ilan University, Israel
Rishiraj Bhattacharyya NISER Bhubaneswar, India
Liqun Chen University of Surrey, UK
Chen-Mou Cheng Osaka University, Japan
Céline Chevalier Université Panthéon-Assas, France
Sherman S.M. Chow Chinese University of Hong Kong, Hong Kong, SAR China
Mauro Conti University of Padua, Italy
Alexandra Dmitrienko ETH Zurich, Switzerland
Michael Franz University of California, Irvine, USA
Georg Fuchsbauer ENS, France
Sebastian Gajek FUAS, Germany
Goichiro Hanaoka AIST, Japan
Swee-Huay Heng Multimedia University, Malaysia
Francisco Rodríguez-Henríquez CINVESTAV-IPN, Mexico
Xinyi Huang Fujian Normal University, China
Michael Huth Imperial College London, UK
Tibor Jager Paderborn University, Germany
Aniket Kate Purdue University, USA
Stefan Katzenbeisser TU Darmstadt, Germany
Kwangjo Kim KAIST, Korea
Kwok-yan Lam NTU, Singapore
Mark Manulis University of Surrey, UK
Tarik Moataz Brown University, USA
Ivan Martinovic University of Oxford, UK
Jörn Müller-Quade Karlsruhe Institute of Technology, Germany
David Naccache École normale supérieure, France
Michael Naehrig Microsoft Research Redmond, USA
Hamed Okhravi MIT Lincoln Laboratory, USA
Panos Papadimitratos KTH Royal Institute of Technology, Sweden
Jong Hwan Park Sangmyung University, Korea
Thomas Peyrin Nanyang Technological University, Singapore
Bertram Poettering Ruhr-Universität Bochum, Germany
Christina Pöpper NYU, United Arab Emirates
Bart Preneel KU Leuven, Belgium
Thomas Schneider TU Darmstadt, Germany
Michael Scott Dublin City University, Ireland
Vanessa Teague University of Melbourne, Australia
Somitra Kr. Sanadhya Ashoka University, India
Mehdi Tibouchi NTT Secure Platform Laboratories, Japan
Ivan Visconti University of Salerno, Italy
Bo-Yin Yang Academia Sinica, Taiwan
Kan Yasuda NTT Secure Platform Laboratories, Japan
Fangguo Zhang Sun Yat-sen University, China
Jianying Zhou SUTD, Singapore

Organizing Committee

Local Arrangements Co-chairs

Akinori Kawachi Tokushima University, Japan
Kazumasa Omote University of Tsukuba, Japan
Shoichi Hirose University of Fukui, Japan
Kenji Yasunaga Kanazawa University, Japan


Finance Co-chairs

Masaki Fujikawa Kogakuin University, Japan

Yuichi Futa JAIST, Japan

Natsume Matsuzaki University of Nagasaki, Japan
Takumi Yamamoto Mitsubishi Electric, Japan

Publicity Co-chairs

Noritaka Inagaki IPA, Japan
Masaki Hashimoto IISEC, Japan

Naoto Yanai Osaka University, Japan

Kaitai Liang Manchester Metropolitan University, UK

Liaison Co-chairs

Keita Emura NICT, Japan

Eiji Takimoto Ritsumeikan University, Japan
Toru Nakamura KDDI Research, Japan

System Co-chairs

Atsuo Inomata Tokyo Denki University/NAIST, Japan
Masaaki Shirase Future University Hakodate, Japan
Minoru Kuribayashi Okayama University, Japan
Toshihiro Yamauchi Okayama University, Japan
Shinya Okumura Osaka University, Japan

Publication Co-chairs

Takeshi Okamoto Tsukuba University of Technology, Japan
Takashi Nishide University of Tsukuba, Japan

Ryo Kikuchi NTT, Japan

Satoru Tanaka JAIST, Japan

Registration Co-chairs

Hideyuki Miyake Toshiba, Japan
Dai Watanabe Hitachi, Japan

Chunhua Su Osaka University, Japan

Additional Reviewers

Alesiani, Francesco
Aminanto, Muhamad Erza
Andaló, Fernanda

Armknecht, Frederik


Barrera, David
Chi-Domínguez, Jesús Javier
Chin, Ji-Jian
Ochoa-Jiménez, José Eduardo
Oliveira, Thomaz

Contents

Applied Cryptography

Sampling from Arbitrary Centered Discrete Gaussians

for Lattice-Based Cryptography. . . 3 Carlos Aguilar-Melchor, Martin R. Albrecht, and Thomas Ricosset

Simple Security Definitions for and Constructions of 0-RTT

Key Exchange . . . 20 Britta Hale, Tibor Jager, Sebastian Lauer, and Jörg Schwenk

TOPPSS: Cost-Minimal Password-Protected Secret Sharing Based

on Threshold OPRF . . . 39 Stanisław Jarecki, Aggelos Kiayias, Hugo Krawczyk, and Jiayu Xu

Secure and Efficient Pairing at 256-Bit Security Level . . . 59 Yutaro Kiyomura, Akiko Inoue, Yuto Kawahara, Masaya Yasuda,

Tsuyoshi Takagi, and Tetsutaro Kobayashi

Data Protection and Mobile Security

No Free Charge Theorem: A Covert Channel via USB Charging Cable

on Mobile Devices . . . 83 Riccardo Spolaor, Laila Abudahi, Veelasha Moonsamy,

Mauro Conti, and Radha Poovendran

Are You Lying: Validating the Time-Location of Outdoor Images . . . 103 Xiaopeng Li, Wenyuan Xu, Song Wang, and Xianshan Qu

Lights, Camera, Action! Exploring Effects of Visual Distractions

on Completion of Security Tasks . . . 124 Bruce Berg, Tyler Kaczmarek, Alfred Kobsa, and Gene Tsudik

A Pilot Study of Multiple Password Interference Between Text

and Map-Based Passwords . . . 145 Weizhi Meng, Wenjuan Li, Wang Hao Lee, Lijun Jiang,

and Jianying Zhou

Security Analysis

Hierarchical Key Assignment with Dynamic Read-Write Privilege Enforcement and Extended KI-Security . . . 165
Yi-Ruei Chen and Wen-Guey Tzeng

A Novel GPU-Based Implementation of the Cube Attack: Preliminary

Results Against Trivium. . . 184 Marco Cianfriglia, Stefano Guarino, Massimo Bernaschi,

Flavio Lombardi, and Marco Pedicini

Related-Key Impossible-Differential Attack on Reduced-Round SKINNY . . . 208
Ralph Ankele, Subhadeep Banik, Avik Chakraborti, Eik List, Florian Mendel, Siang Meng Sim, and Gaoli Wang

Faster Secure Multi-party Computation of AES and DES

Using Lookup Tables . . . 229 Marcel Keller, Emmanuela Orsini, Dragos Rotaru, Peter Scholl,

Eduardo Soria-Vazquez, and Srinivas Vivek

Cryptographic Primitives

An Experimental Study of the BDD Approach for the Search

LWE Problem . . . 253 Rui Xu, Sze Ling Yeo, Kazuhide Fukushima, Tsuyoshi Takagi,

Hwajung Seo, Shinsaku Kiyomoto, and Matt Henricksen

Efficiently Obfuscating Re-Encryption Program Under DDH Assumption . . . 273 Akshayaram Srinivasan and Chandrasekaran Pandu Rangan

Lattice-Based Group Signatures: Achieving Full Dynamicity with Ease . . . 293 San Ling, Khoa Nguyen, Huaxiong Wang, and Yanhong Xu

Breaking and Fixing Mobile App Authentication

with OAuth2.0-based Protocols . . . 313 Ronghai Yang, Wing Cheong Lau, and Shangcheng Shi

Adaptive Proofs Have Straightline Extractors (in the Random

Oracle Model) . . . 336 David Bernhard, Ngoc Khanh Nguyen, and Bogdan Warinschi

More Efficient Construction of Bounded KDM Secure Encryption . . . 354 Kaoru Kurosawa and Rie Habuka

Signature Schemes with Randomized Verification . . . 373 Cody Freitag, Rishab Goyal, Susan Hohenberger, Venkata Koppula,

Eysa Lee, Tatsuaki Okamoto, Jordan Tran, and Brent Waters

Side Channel Attack

Trade-Offs for S-Boxes: Cryptographic Properties and Side-Channel Resilience . . . 393
Claude Carlet, Annelie Heuser, and Stjepan Picek

A Practical Chosen Message Power Analysis Approach Against Ciphers

with the Key Whitening Layers . . . 415 Chenyang Tu, Lingchen Zhang, Zeyi Liu, Neng Gao, and Yuan Ma

Side-Channel Attacks Meet Secure Network Protocols . . . 435 Alex Biryukov, Daniel Dinu, and Yann Le Corre

Cryptographic Protocol

Lattice-Based DAPS and Generalizations: Self-enforcement

in Signature Schemes . . . 457 Dan Boneh, Sam Kim, and Valeria Nikolaenko

Forward-Secure Searchable Encryption on Labeled Bipartite Graphs . . . 478 Russell W.F. Lai and Sherman S.M. Chow

Bounds in Various Generalized Settings of the Discrete

Logarithm Problem . . . 498 Jason H.M. Ying and Noboru Kunihiro

An Enhanced Binary Characteristic Set Algorithm and Its Applications

to Algebraic Cryptanalysis . . . 518 Sze Ling Yeo, Zhen Li, Khoongming Khoo, and Yu Bin Low

SCRAPE: Scalable Randomness Attested by Public Entities . . . 537 Ignacio Cascudo and Bernardo David

cMix: Mixing with Minimal Real-Time Asymmetric Cryptographic

Operations . . . 557 David Chaum, Debajyoti Das, Farid Javani, Aniket Kate,

Anna Krasnova, Joeri De Ruiter, and Alan T. Sherman

Almost Optimal Oblivious Transfer from QA-NIZK . . . 579 Olivier Blazy, Céline Chevalier, and Paul Germouty

OnionPIR: Effective Protection of Sensitive Metadata in Online

Communication Networks . . . 599 Daniel Demmler, Marco Holz, and Thomas Schneider

Data and Server Security

Accountable Storage . . . 623
Giuseppe Ateniese, Michael T. Goodrich, Vassilios Lekakis, Charalampos Papamanthou, Evripidis Paraskevas, and Roberto Tamassia

Maliciously Secure Multi-Client ORAM . . . 645 Matteo Maffei, Giulio Malavolta, Manuel Reinert,

and Dominique Schröder

Legacy-Compliant Data Authentication for Industrial

Control System Traffic . . . 665 John Henry Castellanos, Daniele Antonioli, Nils Ole Tippenhauer,

and Martín Ochoa

Multi-client Oblivious RAM Secure Against Malicious Servers. . . 686 Erik-Oliver Blass, Travis Mayberry, and Guevara Noubir

Sampling from Arbitrary Centered Discrete Gaussians for Lattice-Based Cryptography

Carlos Aguilar-Melchor¹, Martin R. Albrecht², and Thomas Ricosset¹,³(B)

¹ INP ENSEEIHT, IRIT-CNRS, Université de Toulouse, Toulouse, France
{carlos.aguilar,thomas.ricosset}@enseeiht.fr
² Information Security Group, Royal Holloway, University of London, London, UK
martin.albrecht@royalholloway.ac.uk
³ Thales Communications & Security, Gennevilliers, France

Abstract. Non-centered discrete Gaussian sampling is a fundamental building block in many lattice-based constructions in cryptography, such as signature and identity-based encryption schemes. On the one hand, the center-dependent approaches, e.g., cumulative distribution tables (CDT), Knuth-Yao, the alias method, discrete Ziggurat, and their variants, are the fastest known algorithms for sampling from a discrete Gaussian distribution. However, they use a relatively large precomputed table for each possible real center in [0,1), making them impractical for non-centered discrete Gaussian sampling. On the other hand, rejection sampling allows sampling from a discrete Gaussian distribution for all real centers without prohibitive precomputation cost, but needs costly floating-point arithmetic and several trials per sample. In this work, we study how to reduce the number of centers for which we have to precompute tables, and propose a non-centered CDT algorithm with a practical size of precomputed tables that is as fast as its centered variant. Finally, we provide experimental results for our open-source C++ implementation indicating that our sampler increases the rate of Peikert's algorithm for sampling from arbitrary lattices (and cosets) by a factor 3 with precomputation storage up to 6.2 MB.

1 Introduction

Lattice-based cryptography has generated considerable interest in the last decade due to many attractive features, including conjectured security against quantum attacks, strong security guarantees from worst-case hardness, and constructions of fully homomorphic encryption (FHE) schemes (see the survey [33]). Moreover, lattice-based cryptographic schemes are often algorithmically simple and efficient, manipulating essentially vectors and matrices or polynomials modulo relatively small integers, and in some cases outperform traditional systems.

M.R. Albrecht—The research of this author was supported by EPSRC grant "Bit Security of Learning with Errors for Post-Quantum Cryptography and Fully Homomorphic Encryption" (EP/P009417/1) and the EPSRC grant "Multilinear Maps in Cryptography" (EP/L018543/1).


Modern lattice-based cryptosystems are built upon two main average-case problems over general lattices, Short Integer Solution (SIS) [1] and Learning With Errors (LWE) [35], and their analogues over ideal lattices, ring-SIS [29] and ring-LWE [27]. The hardness of these problems can be related to that of their worst-case counterparts, if the instances follow specific distributions and parameters are chosen appropriately [1,27,29,35].

In particular, discrete Gaussian distributions play a central role in lattice-based cryptography. A natural set of examples to illustrate the importance of Gaussian sampling are lattice-based signature and identity-based encryption (IBE) schemes [16]. The most iconic example is the signature algorithm proposed in [16] (hereafter GPV), as a secure alternative to the well-known (and broken) GGH signature scheme [18]. There, the authors use the Klein/GPV algorithm [21], a randomized variant of Babai's nearest plane algorithm [4], in which the rounding step is replaced by randomized rounding according to a discrete Gaussian distribution, so as to return a lattice point (almost) independent of a hidden basis. The GPV signature scheme has also been combined with LWE to obtain the first identity-based encryption (IBE) scheme [16] conjectured to be secure against quantum attacks. Later, a new Gaussian sampling algorithm for arbitrary lattices was presented in [32]; it is a randomized variant of Babai's rounding-off algorithm, is more efficient and parallelizable, but outputs longer vectors than Klein/GPV's algorithm.

As an alternative to the above trapdoor technique, lattice-based signatures [11,23–26] were also constructed by applying the Fiat-Shamir heuristic [14]. Note that in contrast to the algorithms outlined above, which sample from a discrete Gaussian distribution for any real center not known in advance, the schemes developed in [11,25] only need to sample from a discrete Gaussian centered at zero.

1.1 Our Contributions

We develop techniques to speed up discrete Gaussian sampling when the center is not known in advance, obtaining a flexible time-memory trade-off comparing favorably to rejection sampling. We start with the cumulative distribution table (CDT) suggested in [32] and lower the computational cost of the precomputation phase and the global memory required when sampling from a non-centered discrete Gaussian, by precomputing the CDT for a relatively small number of centers, in O(λ^3), and by computing the cdf when needed, i.e., when, for a given uniform random input, the values returned by the CDTs for the two closest precomputed centers differ. Second, we present an adaptation of the lazy technique described in [12] to compute most of the cdf in IEEE standard double precision, thus decreasing the number of precomputed CDTs. Finally, we propose a more flexible approach which takes advantage of the information already present in the precomputed CDTs. For this we use a Taylor expansion around the precomputed centers and values instead of this lazy technique, thus enabling us to reduce the number of precomputed CDTs to ω(λ).


1.2 Related Work

Many discrete Gaussian samplers over the integers have been proposed for lattice-based cryptography: rejection sampling [12,17], inversion sampling with a cumulative distribution table (CDT) [32], Knuth-Yao [13], discrete Ziggurat [7], Bernoulli sampling [11], Kahn-Karney [20], and binary arithmetic coding [36].

The optimal method will of course depend on the setting in which it is used. In this work, we focus on what can be done on a modern computer, with a comfortable amount of memory and hardwired integer and floating-point operations. This is in contrast to the works [11,13], which focus on circuits or embedded devices. We consider exploring the limits of the usual memory and hardwired operations in commodity hardware as much an interesting question as it is to consider what is feasible in more constrained settings.

Rejection Sampling and Variants. Straightforward rejection sampling [37] is a classical method to sample from any distribution by sampling from a uniform distribution and accepting the value with a probability equal to its probability in the target distribution. This method does not use precomputed data but needs floating-point arithmetic and several trials per sample. Bernoulli sampling [11] introduces an exponential bias from Bernoulli variables, which can be efficiently sampled, especially in circuits. The bias is then corrected in a rejection phase based on another Bernoulli variable. This approach is particularly suited for embedded devices because of the simplicity of the computation and the near-optimal entropy consumption. Kahn-Karney sampling is another variant of rejection sampling for a discrete Gaussian distribution which does not use floating-point arithmetic. It is based on the von Neumann algorithm to sample from the exponential distribution [31], requires no precomputed tables, and consumes a smaller amount of random bits than Bernoulli sampling, though it is slower. Currently the fastest approach in the computer setting uses straightforward rejection sampling with "lazy" floating-point computations [12], using IEEE standard double precision floating-point numbers in most cases.

Note that none of these methods requires precomputation depending on the distribution's center c. In all the alternative approaches we present hereafter, there is some center-dependent precomputation. When the center is not known in advance, this can result in prohibitive costs, and handling these costs becomes a major issue around which most of our work is focused.


time-memory trade-off. Knuth-Yao sampling [22] uses a random bit generator to traverse a binary tree formed from the bit representation of the probability of each possible sample; the terminal node is labeled by the corresponding sample. The main advantage of this method is that it consumes a near-optimal amount of random bits. A block variant and other practical improvements are suggested in [13]. This method is center-dependent but clearly designed for circuits, and in a computer setting it is surpassed by other approaches.

Our main contribution is to show how to get rid of the known-center constraint with reasonable memory usage for center-dependent approaches. As a consequence, we obtain a performance gain with respect to rejection sampling approaches. Alternatively, any of the methods discussed above could have replaced our straightforward CDT approach. This, however, would have made our algorithms, proofs, and implementations more involved. On the other hand, further performance improvements could perhaps be achieved this way. This is an interesting problem for future work.

2 Preliminaries

Throughout this work, we denote the set of real numbers by ℝ and the integers by ℤ. We extend any real function f(·) to a countable set A by defining $f(A) = \sum_{x \in A} f(x)$. We also denote by $U_I$ the uniform distribution on I.

2.1 Discrete Gaussian Distributions on ℤ

The discrete Gaussian distribution on ℤ is defined as the probability distribution whose unnormalized density function is

$$\rho : \mathbb{Z} \to (0,1],\qquad x \mapsto e^{-\pi x^2}.$$

If s ∈ ℝ⁺ and c ∈ ℝ, then we extend this definition to

$$\rho_{s,c}(x) := \rho\!\left(\frac{x-c}{s}\right)$$

and denote ρ_{s,0}(x) by ρ_s(x). For any mean c ∈ ℝ and parameter s ∈ ℝ⁺ we can now define the discrete Gaussian distribution D_{s,c} as

$$\forall x \in \mathbb{Z},\quad D_{s,c}(x) := \frac{\rho_{s,c}(x)}{\rho_{s,c}(\mathbb{Z})}.$$

Note that the standard deviation of this distribution is σ = s/√(2π). We also define cdf_{s,c} as the cumulative distribution function (cdf) of D_{s,c}:

$$\forall x \in \mathbb{Z},\quad \mathrm{cdf}_{s,c}(x) := \sum_{i=-\infty}^{x} D_{s,c}(i).$$
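To make these definitions concrete, the following is a minimal C++ sketch that tabulates D_{s,c} and its cdf over a tailcut interval, using hardware doubles as a stand-in for the λ-bit arithmetic (the paper's implementation uses MPFR); rho and build_cdt are illustrative names, not the paper's code.

#include <cmath>
#include <vector>

// Unnormalized weight rho_{s,c}(x) = exp(-pi ((x - c)/s)^2).
double rho(double s, double c, long x) {
    const double pi = std::acos(-1.0);
    const double t = (static_cast<double>(x) - c) / s;
    return std::exp(-pi * t * t);
}

// Build the CDT of D_{s,c} on the tailcut interval [c - tau*s, c + tau*s]:
// entry j holds cdf_{s,c} evaluated at the j-th integer of the interval.
std::vector<double> build_cdt(double s, double c, double tau) {
    const long lo = static_cast<long>(std::floor(c - tau * s));
    const long hi = static_cast<long>(std::ceil(c + tau * s));
    double mass = 0.0;                          // approximates rho_{s,c}(Z)
    for (long x = lo; x <= hi; ++x) mass += rho(s, c, x);
    std::vector<double> cdt;
    double acc = 0.0;
    for (long x = lo; x <= hi; ++x) {
        acc += rho(s, c, x) / mass;             // D_{s,c}(x)
        cdt.push_back(acc);                     // cdf_{s,c}(x)
    }
    return cdt;
}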


Smoothing Parameter. The smoothing parameter η_ε(Λ) quantifies the minimal discrete Gaussian parameter s required to obtain a given level of smoothness on the lattice Λ. Intuitively, if one picks a noise vector over a lattice from a discrete Gaussian distribution with radius at least as large as the smoothing parameter, and reduces this modulo the fundamental parallelepiped of the lattice, then the resulting distribution is very close to uniform (for details and a formal definition see [30]).

Gaussian Measure. An interesting property of discrete Gaussian distributions with a parameter s greater than the smoothing parameter is that the Gaussian measure, i.e., ρ_{s,c}(ℤ) for D_{s,c}, is essentially the same for all centers.

Lemma 1 (From the proof of [30, Lemma 4.4]). For any ε ∈ (0,1), s > η_ε(ℤ) and c ∈ ℝ we have

$$\Delta_{\mathrm{measure}} := \frac{\rho_{s,c}(\mathbb{Z})}{\rho_{s,0}(\mathbb{Z})} \in \left[\frac{1-\epsilon}{1+\epsilon},\, 1\right].$$

Tailcut Parameter. To deal with the infinite domain of Gaussian distributions, algorithms usually take advantage of their rapid decay to sample from a finite domain. The next lemma is useful in determining the tailcut parameter τ.

Lemma 2 ([17, Lemma 4.2]). For any ε > 0, s > η_ε(ℤ) and τ > 0, we have

$$E_{\mathrm{tailcut}} := \Pr_{X \sim D_{s,c}}\big[\,|X - c| > \tau s\,\big] < 2\,e^{-\pi\tau^2}\cdot\frac{1+\epsilon}{1-\epsilon}.$$

2.2 Floating-Point Arithmetic

We recall some facts from [12] about floating-point arithmetic (FPA) with m bits of mantissa, which we denote by FP_m. A floating-point number is a triplet x̄ = (s, e, v) where s ∈ {0,1}, e ∈ ℤ and v ∈ ℕ with v < 2^m, which represents the real number x̄ = (−1)^s · 2^{e−m} · v. Denote by ε = 2^{1−m} the floating-point precision. Every FPA operation ∘̄ ∈ {+̄, −̄, ×̄, /̄} and its respective arithmetic operation on ℝ, ∘ ∈ {+, −, ×, /}, verify

$$\forall \bar{x}, \bar{y} \in \mathrm{FP}_m,\quad \big|(\bar{x}\,\bar{\circ}\,\bar{y}) - (\bar{x} \circ \bar{y})\big| \le |\bar{x} \circ \bar{y}|\cdot\epsilon.$$

Moreover, we assume that the floating-point implementation of the exponential function verifies

$$\forall \bar{x} \in \mathrm{FP}_m,\quad \big|\,\overline{\exp}(\bar{x}) - \exp(\bar{x})\,\big| \le \epsilon.$$

2.3 Taylor Expansion


Theorem 1 (Taylor’s theorem). Let d∈Z+ and let the functionf :RR bed times differentiable in some neighborhoodU of a∈R. Then for anyx∈U

f(x) =Td,f,a(x) +Rd,f,a(x) where

Td,f,a(x) = d i=0

f(i)(a) i! (x−a)

i

and

Rd,f,a(x) = x

a

f(d+1)(t) d! (x−t)

d dt

3 Variable-Center with Polynomial Number of CDTs

We consider the case in which the mean is variable, i.e., the center is not known before the online phase, as is the case for lattice-based hash-and-sign signatures. The center can be any real number, but without loss of generality we only consider centers in [0,1). Because CDTs are center-dependent, a first naive option would be to precompute a CDT for each possible real center in [0,1) in accordance with the desired accuracy. Obviously, this first option has the same time complexity as the classical CDT algorithm, i.e., O(λ log sλ) for λ the security parameter. However, it is completely impractical with 2^λ precomputed CDTs of size O(sλ^1.5). An opposite trade-off is to compute the CDT on-the-fly, avoiding any precomputation storage, which increases the computational cost to O(sλ^3.5), assuming that the computation of the exponential function runs in O(λ^3) (see Sect. 3.2 for a justification of this assumption).

An interesting question is whether we can keep the time complexity of the classical CDT algorithm with only a polynomial number of precomputed CDTs. To answer this question, we start by fixing the number n of equally spaced centers in [0,1) and precompute the CDTs for each of these. Then, we apply the CDT algorithm for the two precomputed centers closest to the desired center, with the same uniformly drawn cumulative probability. Assuming that the number of precomputed CDTs is sufficient, the values returned by both CDTs will be equal most of the time; in this case we can conclude, thanks to a simple monotonicity argument, that the returned value would have been the same for the CDT at the desired center, and return it as a valid sample. Otherwise, the largest value will immediately follow the smallest, and we then have to compute the cdf at the smallest value for the desired center in order to know if the cumulative probability is lower or higher than this cdf. If it is lower, the smaller value is returned as the sample; otherwise the larger one is.

3.1 Twin-CDT Algorithm

Algorithms 1 and 2 describe the offline and online phases of the Twin-CDT algorithm. Algorithm 1 precomputes CDTs, up to a precision m that guarantees the λ most significant bits of each cdf, and stores them with λ bits of precision as a matrix T, where the i-th line is the CDT corresponding to the i-th precomputed center i/n. To sample from D_{s,c}, Algorithm 2 searches for the preimages under the cdf of a cumulative probability p, drawn from the uniform distribution on [0,1) ∩ FP_λ, in both CDTs corresponding to the centers ⌊n(c − ⌊c⌋)⌋/n and ⌈n(c − ⌊c⌋)⌉/n, which return values v1 and v2 respectively. If the same value is returned from both CDTs (i.e., v1 = v2), then this value plus the integer part of the desired center is a valid sample; otherwise it computes cdf_{s,c−⌊c⌋}(v1) and returns v1 + ⌊c⌋ if p < cdf_{s,c−⌊c⌋}(v1), and v2 + ⌊c⌋ otherwise.

Algorithm 1. Twin-CDT Algorithm: Offline Phase
Input: a Gaussian parameter s and a number of centers n
Output: a precomputed matrix T
1: initialize an empty matrix T ∈ FP_λ^(n × (2⌈τs⌉+3))
2: for i ← 0, ..., n−1 do
3:   for j ← 0, ..., 2⌈τs⌉+2 do
4:     T_{i,j} ← FP_m : cdf_{s,i/n}(j − ⌈τs⌉ − 1)

Algorithm 2. Twin-CDT Algorithm: Online Phase
Input: a center c and a precomputed matrix T
Output: a sample x that follows D_{s,c}
1: p ← U([0,1) ∩ FP_λ)
2: v1 ← i − ⌈τs⌉ − 1 s.t. T_{⌊n(c−⌊c⌋)⌋,i−1} ≤ p < T_{⌊n(c−⌊c⌋)⌋,i}
3: v2 ← j − ⌈τs⌉ − 1 s.t. T_{⌈n(c−⌊c⌋)⌉,j−1} ≤ p < T_{⌈n(c−⌊c⌋)⌉,j}
4: if v1 = v2 then
5:   return v1 + ⌊c⌋
6: else
7:   if p < FP_m : cdf_{s,c−⌊c⌋}(v1) then
8:     return v1 + ⌊c⌋
9:   else
10:    return v2 + ⌊c⌋
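The following is a hedged C++ sketch of this online phase, again in double precision as a stand-in for FP_λ; the rows T[i] are assumed to come from Algorithm 1, cdf_at stands for an on-demand cdf evaluation at the desired center, and boundary handling (e.g., ⌈n(c−⌊c⌋)⌉ = n) is omitted.

#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Sketch of Algorithm 2. T[i] is the CDT for center i/n; cdf_at(s, c, v)
// stands for an on-demand evaluation of cdf_{s,c}(v) at full precision.
long twin_cdt_sample(const std::vector<std::vector<double>>& T, long n,
                     double s, double tau, double c,
                     double (*cdf_at)(double, double, long),
                     std::mt19937_64& rng) {
    std::uniform_real_distribution<double> U(0.0, 1.0);
    const double p = U(rng);                      // p <- U([0,1))
    const double cf = c - std::floor(c);          // decimal part of c
    const long i0 = static_cast<long>(std::floor(n * cf));
    const long i1 = static_cast<long>(std::ceil(n * cf));
    const long off = static_cast<long>(std::ceil(tau * s)) + 1;
    // find v such that T[row][idx-1] <= p < T[row][idx], shifted by off
    auto find = [&](long row) {
        return static_cast<long>(std::upper_bound(T[row].begin(),
                                                  T[row].end(), p)
                                 - T[row].begin()) - off;
    };
    const long v1 = find(i0), v2 = find(i1);
    const long ic = static_cast<long>(std::floor(c));
    if (v1 == v2) return v1 + ic;                 // the usual case
    // rare disagreement: one cdf evaluation at the desired center decides
    return (p < cdf_at(s, cf, v1) ? v1 : v2) + ic;
}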

Correctness. We establish correctness in the lemma below.


function cdf, then cdf(X) has a uniform distribution on [0,1]. Hence the inversion method: cdf^{−1}(U([0,1])) has the same distribution as X. Finally, by noting that for all s, p ∈ ℝ, cdf_{s,c}(p) is monotonic in c, if cdf_{s,c1}^{−1}(p) = cdf_{s,c2}^{−1}(p) := v, then cdf_{s,c}^{−1}(p) = v for all c ∈ [c1, c2]; as a consequence, for all v ∈ [−⌈τs⌉ − 1, ⌈τs⌉ + 1], the probability of outputting v is equal to FP_m : cdf_{s,c}(v) − FP_m : cdf_{s,c}(v−1), which is 2^{−λ}-close to D_{s,c}(v).

The remaining issue in the correctness analysis of Algorithm 2 is to determine the error occurring during the m-precision cdf computation. Indeed, this error allows us to learn what precision m is needed to correctly compute the λ most significant bits of the cdf. This error is characterized in Lemma 4.

Lemma 4. Let m ∈ ℤ be a positive integer and ε = 2^{1−m}, and let c̄, s̄, h̄ ∈ FP_m.

Proof. We derive the following bounds using [10, Facts 6.12, 6.14, 6.22]: Δcdf_{s,c}(x) ≤ Δ…

For the sake of readability, the FPA error bound of Lemma 4 is fully simplified and is therefore not tight. For a practical implementation, one can derive a better bound using an ad-hoc approach such as done in [34].

Efficiency. On average, the evaluation of the cdf requires ⌈τs⌉ + 1.5 evaluations of the exponential function. For the sake of clarity, we assume that the exponential function is computed using a direct power series evaluation with schoolbook multiplication, so its time complexity is O(λ^3). We refer the reader to [6] for a discussion of different ways to compute the exponential function in high precision.

Lemma 5 establishes that the time complexity of Algorithm 2 is O(λ log sλ + λ^4/n), so with n = O(λ^3) it has asymptotically the same computational cost as the classical CDT algorithm.

Lemma 5. Let P_cdf be the probability of computing the cdf during the execution of Algorithm 2. Assuming that τs ≥ 10, we have P_cdf ≤ 2.2τs …


Proof.

Hence the upper bound. ⊓⊔

On the other hand, the precomputation matrix generated by Algorithm 1 takes n times the size of one CDT, hence the space complexity is O(nsλ^1.5). Note that for n sufficiently big to make the cdf computational cost negligible, the memory space required by this algorithm is about 1 GB for the parameters considered in cryptography, and thus prohibitively expensive for practical use.

3.2 Lazy-CDT Algorithm

A first idea to decrease the number of precomputed CDTs is to avoid costly cdf evaluations by using the same lazy trick as in [12] for rejection sampling. Indeed, a careful analysis of Algorithm 2 shows that most of the time many of the computed cdf bits are not used. This suggests a new strategy which consists of computing the bits of cdf_{s,c}(v1) lazily: when the values corresponding to the generated probability for the two closest centers are different, the Lazy-CDT algorithm first computes the cdf only at a precision m′ that ensures k < λ correct bits. If the comparison is decided with those k bits, it returns the sample. Otherwise, it recomputes the cdf at a precision m that ensures λ correct bits.
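A minimal C++ sketch of this lazy pattern follows, assuming cdf_low and cdf_high are stand-ins for the m′- and m-bit evaluations (e.g., hardware doubles vs. an MPFR computation) and that 2^{−k} bounds the low-precision error; the names are illustrative.

#include <cmath>

// Lazy comparison: compare p against a low-precision estimate of
// cdf_{s,c}(v1) first, and fall back to a high-precision evaluation only
// when p lands inside the low-precision error margin.
bool p_below_cdf_lazy(double p, double s, double c, long v1, int k,
                      double (*cdf_low)(double, double, long),
                      double (*cdf_high)(double, double, long)) {
    const double approx = cdf_low(s, c, v1);   // m' bits, k of them correct
    const double margin = std::ldexp(1.0, -k); // 2^{-k}
    if (p < approx - margin) return true;      // decided with k bits
    if (p > approx + margin) return false;     // decided with k bits
    return p < cdf_high(s, c, v1);             // rare: recompute at m bits
}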

Correctness. In addition to the choice of m, discussed in Sect. 3.1, to achieve λ bits of precision, the correctness of Algorithm 3 also requires knowing k, which is the number of correct bits after the floating-point computation of the cdf with m′ bits of mantissa. For this purpose, given m′, Lemma 4 provides a theoretical lower bound on k.


Algorithm 3. Lazy-CDT Algorithm: Online Phase
Input: a center c and a precomputed matrix T
Output: a sample x that follows D_{s,c}
1: p ← U([0,1) ∩ FP_λ)
2: v1 ← i − ⌈τs⌉ − 1 s.t. T_{⌊n(c−⌊c⌋)⌋,i−1} ≤ p < T_{⌊n(c−⌊c⌋)⌋,i}
3: v2 ← j − ⌈τs⌉ − 1 s.t. T_{⌈n(c−⌊c⌋)⌉,j−1} ≤ p < T_{⌈n(c−⌊c⌋)⌉,j}
4: if v1 = v2 then
5:   return v1 + ⌊c⌋
6: else
7:   if FP_k : p < FP_{m′} : cdf_{s,c−⌊c⌋}(v1) then
8:     return v1 + ⌊c⌋
9:   else
10:    if FP_k : p > FP_{m′} : cdf_{s,c−⌊c⌋}(v1) then
11:      return v2 + ⌊c⌋
12:    else
13:      if p < FP_m : cdf_{s,c−⌊c⌋}(v1) then
14:        return v1 + ⌊c⌋
15:      else
16:        return v2 + ⌊c⌋

In practice, k is determined by the number of common leading bits of cdf_{s,⌊n(c−⌊c⌋)⌋/n}(v1) and cdf_{s,⌈n(c−⌊c⌋)⌉/n}(v2). By using this lazy trick in addition to lookup tables as described in Sect. 5, with parameters considered in cryptography, we achieve a computational cost lower than that of the classical centered CDT algorithm with a memory requirement on the order of 1 megabyte.

4 A More Flexible Time-Memory Tradeoff

In view of the limitations of the lazy approach described above, a natural question is whether we can find a better solution to approximate the cdf. The major advantage of the lazy trick is that it does not require additional memory. However, in our context the CDTs are precomputed, and rather than approximating the cdf from scratch it would be interesting to reuse the information contained in these precomputations. Consider the cdf as a function of the center and note that each precomputed cdf is the zero-degree term of the Taylor expansion of the cdf around a precomputed center. Hence, we may approximate the cdf by its Taylor expansions by precomputing some higher-degree terms.


4.1 Taylor-CDT Algorithm

Our Taylor-CDT algorithm is similar to the Lazy-CDT algorithm (Algorithm 3) described above, except that the lazy computation of the cdf is replaced by the Taylor expansion of the cdf, viewed as a function of the Gaussian center, around each precomputed center, for all possible values. The zero-degree term of each of these Taylor expansions is present in the corresponding CDT element T_{i,j}, and the d higher-degree terms are stored as an element E_{i,j} of another matrix E. As for the other approaches, these precomputations shall be performed at a precision m sufficient to ensure λ correct bits. During the online phase, Algorithm 5 proceeds as follows. Draw p from the uniform distribution over [0,1) ∩ FP_λ and search for p in the CDTs of the two precomputed centers closest to the decimal part of the desired center. If the two values found are equal, add the integer part of the desired center to this value and return it as a valid sample. Otherwise, select the precomputed center closest to the decimal part of the desired center and evaluate, at this decimal part, the Taylor expansion corresponding to this center and the value found in its CDT. If p is smaller or bigger than this evaluation by more than the approximation error upper bound E_expansion, characterized in Lemma 6, add the integer part of the desired center to the corresponding value and return it as a valid sample. Otherwise, it is necessary to compute the full cdf to decide which value to return.
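The following sketch shows how such an expansion might be evaluated with a Horner scheme, assuming the zero-degree term T_{i,j} and the d higher-degree coefficients from E_{i,j} are available as doubles; the names are illustrative, not the paper's code.

#include <vector>

// Evaluate the Taylor expansion of cdf_{s,x}(v), viewed as a function of
// the center x, around a precomputed center c0 = i/n. t0 is the zero-degree
// term T_{i,j} = cdf_{s,c0}(v); coeffs holds the d higher-degree
// coefficients stored in E_{i,j} (degree 1 first).
double taylor_cdf(double t0, const std::vector<double>& coeffs,
                  double c0, double x) {
    const double dx = x - c0;
    double acc = 0.0;                 // Horner scheme on degrees d..1
    for (auto it = coeffs.rbegin(); it != coeffs.rend(); ++it)
        acc = acc * dx + *it;
    return t0 + acc * dx;
}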

Algorithm 4. Taylor-CDT Algorithm: Offline Phase
Input: a Gaussian parameter s, a number of centers n, a Taylor expansion degree d
Output: two precomputed matrices T and E
1: initialize two empty matrices T ∈ FP_λ^(n × (2⌈τs⌉+3)) and E ∈ (FP_λ^d)^(n × (2⌈τs⌉+3))
2: for i ← 0, ..., n−1 do
3:   for j ← 0, ..., 2⌈τs⌉+2 do
4:     T_{i,j} ← FP_m : cdf_{s,i/n}(j − ⌈τs⌉ − 1)
5:     E_{i,j} ← FP_m : T_{d,cdf_{s,x}(j−⌈τs⌉−1),i/n}(x) − T_{i,j}

Efficiency. Algorithm 5 performs two binary searches on CDTs in O(λ log sλ), d additions and multiplications on FP_m in O(m^2) with probability P_cdf ≈ 3λ/n (see Lemma 5), and a cdf computation on FP_m in O(sλ^3.5) with probability close to 2^{q+1} P_cdf E_expansion, where q is the number of common leading bits of cdf_{s,⌊n(c−⌊c⌋)⌋/n}(v1) and cdf_{s,⌈n(c−⌊c⌋)⌉/n}(v2), and E_expansion is the Taylor expansion approximation error bound described in Lemma 6.

Lemma 6. Let E_expansion be the maximal Euclidean distance between cdf_{s,x}(v) and T_{d,cdf_{s,x}(v),c}(x), its Taylor expansion around c, for all v ∈ [−⌈τs⌉ − 1, ⌈τs⌉ + 1], c ∈ [0,1) and x ∈ [c, c + 1/2n]. Assuming that τ ≥ 2.5 and s ≥ 4, we have


Algorithm 5. Taylor-CDT Algorithm: Online Phase
Input: a center c and two precomputed matrices T and E
Output: a sample x that follows D_{s,c}
1: p ← U([0,1) ∩ FP_λ)

Proof. From Theorem 1 we have
E_expansion = max …

5 Lookup Tables

We shall now show how to use partial lookup tables to avoid the binary search in most cases when using CDT algorithms; this technique is the CDT analogue of the Knuth-Yao algorithm improvement described in [8]. Note that this strategy is particularly fitting for discrete Gaussian distributions with relatively small expected values. The basic idea is to subdivide the uniform distribution U_{[0,1)} into ℓ uniform distributions on subsets of the same size, U_{[i/ℓ,(i+1)/ℓ)}, with ℓ a power of two. We then precompute a partial lookup table on these subsets, which makes it possible to return the sample at once when the subset considered does not include a cdf image. We note that instead of subdividing the uniform range into stripes of the same size, we could also recursively subdivide only some stripes of the previous subdivision. However, for the sake of clarity and ease of exposition, this improvement is not included in this paper, and we describe this technique for the classical centered CDT algorithm.

First, we initialize a lookup table of size ℓ = 2^l where the i-th entry corresponds to a subinterval [i/ℓ, (i+1)/ℓ) of [0,1). Second, after precomputing the CDT, we mark all the entries for which there is at least one CDT element in their corresponding subinterval [i/ℓ, (i+1)/ℓ) with ⊥, and all remaining entries with ⊤. Each entry marked with ⊤ allows us to return a sample without performing a binary search in the CDT, because only one value corresponds to this subinterval, namely the first CDT element greater than or equal to (i+1)/ℓ. (A sketch of this construction is given at the end of this section.)

Efficiency. The efficiency of this technique is directly related to the number of entries, marked with ⊤, whose subintervals do not contain a CDT element. We denote the probability of performing a binary search by P_binsrch; obviously the probability of returning the sample immediately after choosing i, which is a part of p, is 1 − P_binsrch. Lemma 7 gives an upper bound on P_binsrch.

Lemma 7. For any ℓ ≥ 2^8 and s ≥ η_{1/2}(ℤ), let P_binsrch be the probability of performing a binary search during the execution of the CDT algorithm implemented with the lookup table trick described above. We have

$$P_{\mathrm{binsrch}} < 1.2\, s \log_2 \ell \,/\, \ell.$$

Proof.

$$P_{\mathrm{binsrch}} = \sum_{i=\lfloor c-\tau s\rfloor}^{\lceil c+\tau s\rceil} \frac{\lfloor \ell\,\mathrm{cdf}_{s,c}(i)\rfloor - \lfloor \ell\,\mathrm{cdf}_{s,c}(i-1)\rfloor}{\ell}$$

From Lemma 2 we have

$$\big\lfloor \ell\,\mathrm{cdf}_{s,c}\big(c - 0.6\,s\log_2\ell\big)\big\rfloor = 0
\quad\text{and}\quad
\big\lfloor \ell\,\big(1-\mathrm{cdf}_{s,c}\big(c + 0.6\,s\log_2\ell\big)\big)\big\rfloor = 0$$
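To illustrate the guide-table construction, the following hedged C++ sketch stores, for each ⊤ stripe, the already-resolved CDT index, and −1 for ⊥ stripes; it is a toy double-precision version (build_guide and sample_with_guide are illustrative names, and the returned CDT index still has to be shifted by the tailcut offset to obtain the sample value).

#include <algorithm>
#include <random>
#include <vector>

// Entry i covers the stripe [i/L, (i+1)/L). If no cdf image falls inside
// the stripe, the sample is already determined ("top" mark) and its CDT
// index is stored; otherwise -1 signals a fallback binary search.
std::vector<long> build_guide(const std::vector<double>& cdt, std::size_t L) {
    std::vector<long> guide(L);
    for (std::size_t i = 0; i < L; ++i) {
        const double lo = static_cast<double>(i) / L;
        const double hi = static_cast<double>(i + 1) / L;
        auto a = std::lower_bound(cdt.begin(), cdt.end(), lo);
        auto b = std::lower_bound(cdt.begin(), cdt.end(), hi);
        // a == b: no CDT element in the stripe, so every p in it maps to
        // the first CDT element >= (i+1)/L, whose index is b - begin().
        guide[i] = (a == b) ? static_cast<long>(b - cdt.begin()) : -1;
    }
    return guide;
}

long sample_with_guide(const std::vector<double>& cdt,
                       const std::vector<long>& guide, std::mt19937_64& rng) {
    std::uniform_real_distribution<double> U(0.0, 1.0);
    const double p = U(rng);
    const std::size_t i = static_cast<std::size_t>(p * guide.size());
    if (guide[i] >= 0) return guide[i];          // no binary search needed
    return static_cast<long>(std::upper_bound(cdt.begin(), cdt.end(), p)
                             - cdt.begin());
}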


6 Experimental Results

In this section, we present experimental results for our C++ implementation, distributed under the terms of the GNU General Public License version 3 or later (GPLv3+), which uses the MPFR [15] and GMP [19] libraries as well as Salsa20 [5] as the pseudorandom number generator. Our non-centered discrete Gaussian sampler was implemented with a binary search executed byte by byte if ℓ = 2^8 and 2 bytes by 2 bytes if ℓ = 2^16, without recursive subdivision of U_{[0,1)}; therefore [0,1) is subdivided into ℓ intervals of the same size and cdf(x) is stored for all x ∈ [−⌈τσ⌉ − 1, ⌈τσ⌉ + 1]. The implementation of our non-centered discrete Gaussian sampler uses a fixed number of precomputed centers n = 2^8 with a lookup table of size ℓ = 2^8 and includes the lazy cdf evaluation optimization.

We tested the performance of our non-centered discrete Gaussian sampler by using it as a subroutine in Peikert's sampler [32] for sampling from D_{(g),σ′,0} with g ∈ ℤ[x]/(x^N + 1) for N a power of two. To this end, we adapted the implementation of this sampler from [3], where we swap out the sampler from

Table 1. Performance of sampling from D_{(g),σ′} as implemented in [3] and with our non-centered discrete Gaussian sampler with ℓ = n = 2^8. The column D_{(g),σ′}/s gives the number of samples returned per second, the column "memory" the maximum amount of memory consumed by the process. All timings are on an Intel(R) Xeon(R) CPU E5-2667 (strombenzin). Precomputation uses 2 cores, the online phase uses one core.

[3]

N      log σ′   precomp   time        D_{(g),σ′}/s   memory
256    38.2     0.08 s    8.46 ms     118.17         11,556 kB
512    42.0     0.17 s    16.96 ms    58.95          11,340 kB
1024   45.8     0.32 s    38.05 ms    26.28          21,424 kB
2048   49.6     0.93 s    78.17 ms    12.79          41,960 kB
4096   53.3     2.26 s    157.53 ms   6.35           86,640 kB
8192   57.0     6.08 s    337.32 ms   2.96           192,520 kB
16384  60.7     13.36 s   700.75 ms   1.43           301,200 kB

This work


the dgs library [2] (implementing rejection sampling and [11]) used in [3] with our sampler for sampling from D_{ℤ,σ,c}. Note that sampling from D_{(g),σ′,0} is more involved and thus slower than sampling from D_{ℤ^N,σ′,0}. That is, to sample from D_{(g),σ′,0}, [3] first computes an approximate square root of Σ₂ = σ′² · g^{−T} · g^{−1} − r² with r = 2·⌈√(log N)⌉. Then, given an approximation √Σ₂′ of √Σ₂, it samples a vector x ←$ ℝ^N from a standard normal distribution and interprets it as a polynomial in ℚ[X]/(x^N + 1); computes y = √Σ₂′ · x in ℚ[X]/(x^N + 1); and returns g·(⌊y⌉_r), where ⌊y⌉_r denotes sampling a vector in ℤ^N in which the i-th component follows D_{ℤ,r,y_i}. Thus, implementing Peikert's sampler requires sampling from D_{ℤ,r,y_i} for changing centers y_i and sampling from a standard normal distribution. We give experimental results in Table 1, indicating that our sampler increases the rate by a factor ≈3.
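As an illustration of the randomized-rounding step ⌊y⌉_r, the following hedged C++ sketch rounds each coordinate with a non-centered integer Gaussian sampler; sample_dgauss is a stand-in for, e.g., the Twin-CDT sampler sketched earlier, not an actual function of the dgs library.

#include <vector>

// Randomized rounding in Peikert's sampler: each coordinate y_i of y is
// rounded by drawing from D_{Z, r, y_i}.
std::vector<long> randomized_round(const std::vector<double>& y, double r,
                                   long (*sample_dgauss)(double, double)) {
    std::vector<long> out;
    out.reserve(y.size());
    for (double yi : y)
        out.push_back(sample_dgauss(r, yi));  // i-th component ~ D_{Z,r,y_i}
    return out;
}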

References

1. Ajtai, M.: Generating hard instances of lattice problems (extended abstract). In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, STOC 1996, NY, USA, pp. 99–108. ACM, New York (1996)

2. Albrecht, M.R.: dgs – discrete Gaussians over the integers (2014). https://bitbucket.org/malb/dgs

3. Albrecht, M.R., Cocis, C., Laguillaumie, F., Langlois, A.: Implementing candidate graded encoding schemes from ideal lattices. In: Iwata, T., Cheon, J.H. (eds.) ASIACRYPT 2015. LNCS, vol. 9453, pp. 752–775. Springer, Heidelberg (2015). doi:10.1007/978-3-662-48800-3 31

4. Babai, L.: On Lovász' lattice reduction and the nearest lattice point problem. In: Mehlhorn, K. (ed.) STACS 1985. LNCS, vol. 182, pp. 13–20. Springer, Heidelberg (1985). doi:10.1007/BFb0023990

5. Bernstein, D.J.: The salsa20 family of stream ciphers. In: Robshaw, M., Billet, O. (eds.) New Stream Cipher Designs: The eSTREAM Finalists, pp. 84–97. Springer, Heidelberg (2008)

6. Brent, R.P., et al.: Fast algorithms for high-precision computation of elementary functions. In: Proceedings of 7th Conference on Real Numbers and Computers (RNC 7), pp. 7–8 (2006)

7. Buchmann, J., Cabarcas, D., Göpfert, F., Hülsing, A., Weiden, P.: Discrete Ziggurat: a time-memory trade-off for sampling from a Gaussian distribution over the integers. In: Lange, T., Lauter, K., Lisoněk, P. (eds.) SAC 2013. LNCS, vol. 8282, pp. 402–417. Springer, Heidelberg (2014). doi:10.1007/978-3-662-43414-7_20

8. de Clercq, R., Roy, S.S., Vercauteren, F., Verbauwhede, I.: Efficient software implementation of ring-LWE encryption. In: 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 339–344 (2015)

9. Devroye, L.: Non-Uniform Random Variate Generation. Springer, Heidelberg (1986)

10. Ducas, L.: Lattice based signatures: attacks, analysis and optimization. Ph.D. thesis (2013)

(33)

12. Ducas, L., Nguyen, P.Q.: Faster Gaussian lattice sampling using lazy floating-point arithmetic. In: Wang, X., Sako, K. (eds.) ASIACRYPT 2012. LNCS, vol. 7658, pp. 415–432. Springer, Heidelberg (2012). doi:10.1007/978-3-642-34961-4 26

13. Dwarakanath, N.C., Galbraith, S.D.: Sampling from discrete Gaussians for lattice-based cryptography on a constrained device. Appl. Algebra Eng. Commun. Comput. 25(3), 159–180 (2014)

14. Fiat, A., Shamir, A.: How to prove yourself: practical solutions to identification and signature problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 186–194. Springer, Heidelberg (1987). doi:10.1007/3-540-47721-7_12

15. Fousse, L., Hanrot, G., Lefèvre, V., Pélissier, P., Zimmermann, P.: MPFR: a multiple-precision binary floating-point library with correct rounding. ACM Trans. Math. Softw. 33(2) (2007)

16. Gentry, C., Peikert, C., Vaikuntanathan, V.: Trapdoors for hard lattices and new cryptographic constructions. In: Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC 2008, pp. 197–206. ACM, New York (2008)

17. Gentry, C., Peikert, C., Vaikuntanathan, V.: Trapdoors for hard lattices and new cryptographic constructions. In: Ladner, R.E., Dwork, C. (eds.) 40th ACM STOC, pp. 197–206. ACM Press, Victoria, 17–20 May 2008

18. Goldreich, O., Goldwasser, S., Halevi, S.: Public-key cryptosystems from lattice reduction problems. In: Kaliski, B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 112–131. Springer, Heidelberg (1997). doi:10.1007/BFb0052231

19. Granlund, T., the GMP development team: GNU MP: The GNU Multiple Precision Arithmetic Library, 6.0.1 edn. (2015). http://gmplib.org/

20. Karney, C.F.F.: Sampling exactly from the normal distribution. ACM Trans. Math. Softw. 42(1), 3:1–3:14 (2016)

21. Klein, P.: Finding the closest lattice vector when it's unusually close. In: Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2000, pp. 937–941. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA (2000)

22. Knuth, D.E., Yao, A.C.: The complexity of nonuniform random number generation. In: Traub, J.F. (ed.) Algorithms and Complexity: New Directions and Recent Results. Academic Press, New York (1976)

23. Lyubashevsky, V.: Lattice-based identification schemes secure under active attacks. In: Cramer, R. (ed.) PKC 2008. LNCS, vol. 4939, pp. 162–179. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78440-1_10

24. Lyubashevsky, V.: Fiat-Shamir with aborts: applications to lattice and factoring-based signatures. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 598–616. Springer, Heidelberg (2009). doi:10.1007/978-3-642-10366-7_35

25. Lyubashevsky, V.: Lattice signatures without trapdoors. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 738–755. Springer, Heidelberg (2012). doi:10.1007/978-3-642-29011-4_43

26. Lyubashevsky, V., Micciancio, D.: Asymptotically efficient lattice-based digital signatures. In: Canetti, R. (ed.) TCC 2008. LNCS, vol. 4948, pp. 37–54. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78524-8_3

27. Lyubashevsky, V., Peikert, C., Regev, O.: On ideal lattices and learning with errors over rings. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 1–23. Springer, Heidelberg (2010). doi:10.1007/978-3-642-13190-5_1


29. Micciancio, D.: Generalized compact knapsacks, cyclic lattices, and efficient one-way functions. Comput. Complex. 16(4), 365–411 (2007)

30. Micciancio, D., Regev, O.: Worst-case to average-case reductions based on Gaussian measures. SIAM J. Comput. 37(1), 267–302 (2007)

31. von Neumann, J.: Various techniques used in connection with random digits. J. Res. Nat. Bur. Stand. 12, 36–38 (1951)

32. Peikert, C.: An efficient and parallel Gaussian sampler for lattices. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 80–97. Springer, Heidelberg (2010). doi:10.1007/978-3-642-14623-7_5

33. Peikert, C.: A decade of lattice cryptography. Found. Trends Theor. Comput. Sci. 10(4), 283–424 (2016)

34. Pujol, X., Stehlé, D.: Rigorous and efficient short lattice vectors enumeration. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 390–405. Springer, Heidelberg (2008). doi:10.1007/978-3-540-89255-7_24

35. Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. In: Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, STOC 2005, NY, USA, pp. 84–93. ACM, New York (2005)

36. Saarinen, M.J.O.: Arithmetic coding and blinding countermeasures for lattice signatures. J. Cryptographic Eng. 1–14 (2017)

37. Von Neumann, J.: The general and logical theory of automata. Cerebral Mech. Behav.1(41), 1–2 (1951)


Simple Security Definitions for and Constructions of 0-RTT Key Exchange

Britta Hale¹(B), Tibor Jager², Sebastian Lauer³, and Jörg Schwenk³

¹ NTNU, Norwegian University of Science and Technology, Trondheim, Norway
britta.hale@ntnu.no
² Paderborn University, Paderborn, Germany
tibor.jager@upb.de
³ Horst Görtz Institute, Ruhr-University Bochum, Bochum, Germany
{sebastian.lauer,joerg.schwenk}@rub.de

Abstract. Zero Round-Trip Time (0-RTT) key exchange protocols allow for the transmission of cryptographically protected payload data without requiring the prior exchange of messages of a cryptographic key exchange protocol. The 0-RTT KE concept was first realized by Google in the QUIC Crypto protocol, and a 0-RTT mode has been intensively discussed for inclusion in TLS 1.3.

In 0-RTT KE two keys are generated, typically using a Diffie-Hellman key exchange. The first key is a combination of an ephemeral client share and a long-lived server share. The second key is computed using an ephemeral server share and the same ephemeral client share.

In this paper, we propose simple security models which capture the intuition behind known 0-RTT KE protocols; namely, that the first (resp. second) key should remain indistinguishable from a random value, even if the second (resp. first) key is revealed. We call this property strong key independence. We also give the first constructions of 0-RTT KE which are provably secure in these models, based on the generic assumption that secure non-interactive key exchange (NIKE) exists. (This work was partially supported by an STSM Grant from COST Action IC1306.)

Keywords: Foundations · Low-latency key exchange · 0-RTT protocols · Authenticated key exchange · Non-interactive key exchange · QUIC · TLS 1.3

1 Introduction

Efficiency, in terms of messages to be exchanged before a key is established, is a growing consideration for internet protocols today. Basically, the first generation of internet key exchange protocols did not care too much about efficiency, since secure connections were considered to be the exception rather than the rule: SSL (versions 2.0 and 3.0) and TLS (versions 1.0, 1.1, and 1.2) require 2 round-trip times (RTT) for key establishment before the first cryptographically-protected payload data can be sent. With the increased use of encryption,1 efficiency is

1 For example, initiatives like Let's Encrypt (https://letsencrypt.org/).


of escalating importance for protocols like TLS. Similarly, the older IPSec IKE version v1 needs between 3 RTT (aggressive mode + quick mode) and 4.5 RTT (main mode + quick mode). This was soon realized to be problematic, and in IKEv2 the number of RTTs was reduced to 2.

The QUIC Protocol. Fundamentally, the discussion on low-latency key exchange (aka LLKE, zero-RTT or 0-RTT key exchange) was opened when Google proposed the QUIC protocol.² QUIC (cf. Fig. 1) achieves low latency by caching a signed server configuration file on the client side, which contains a medium-lived Diffie-Hellman (DH) share Y0 = g^{y0}.

When a client wishes to establish a connection with a server and possesses a valid configuration file of that server, it chooses a fresh ephemeral DH share X = g^x and computes a temporal key k1 from g^{y0·x}. Using this key k1, the client can encrypt and authenticate data to be sent to the server, together with X. In response, the server sends a fresh DH share Y = g^y and computes a session key k2 from g^{xy}, which is used for all subsequent data exchanges.
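As an illustration of the two key derivations, the following toy C++ sketch runs both Diffie-Hellman computations in a small, insecure group (a 61-bit prime, GCC/Clang's __uint128_t for the modular products), omitting the KDF, server signature, and authenticated encryption of the real protocol; all names and parameters are illustrative.

#include <cstdint>

// Toy modular exponentiation (square-and-multiply).
static uint64_t modpow(uint64_t b, uint64_t e, uint64_t p) {
    uint64_t r = 1 % p;
    for (b %= p; e != 0; e >>= 1) {
        if (e & 1) r = static_cast<uint64_t>((__uint128_t)r * b % p);
        b = static_cast<uint64_t>((__uint128_t)b * b % p);
    }
    return r;
}

int main() {
    const uint64_t p = 2305843009213693951ULL, g = 3;  // toy prime 2^61 - 1
    const uint64_t y0 = 111111, y = 222222, x = 333333; // secret exponents
    const uint64_t Y0 = modpow(g, y0, p);  // cached in the server config
    const uint64_t X  = modpow(g, x,  p);  // client's ephemeral share
    const uint64_t Y  = modpow(g, y,  p);  // server's fresh ephemeral share
    // k1 combines the client's x with the medium-lived y0; k2 combines the
    // same x with the fresh y. Both sides derive the same values.
    const uint64_t k1_client = modpow(Y0, x, p), k1_server = modpow(X, y0, p);
    const uint64_t k2_client = modpow(Y,  x, p), k2_server = modpow(X, y,  p);
    return (k1_client == k1_server && k2_client == k2_server) ? 0 : 1;
}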

Fig. 1. Google's QUIC protocol (simplified) with cached server key configuration file (Y0, σS). AE denotes a symmetric authenticated encryption algorithm (e.g., AES-GCM), (sk_S^sig, pk_S^sig) denotes the server's long-term signing keys, and π_S^t (resp. π_C^s) denotes the oracle at server S (resp. client C) executing the t-th (resp. s-th) instance of the protocol.

² See https://www.chromium.org/quic.

TLS 1.3. Early TLS 1.3 drafts, e.g., draft-ietf-tls-tls13-08 [25], contained a 0-RTT key exchange mode where a QUIC-like ServerConfiguration message is cached by the client. The current version, draft-ietf-tls-tls13-18 [26], follows a different approach, where the initial key establishment between a client and a server is never 0-RTT. Instead, it defines a method to establish a new session based on the secret key of a previous session. Even though this is also called "0-RTT" in the current TLS 1.3 specification, it is rather a "0-RTT session resumption" protocol and does not allow for 0-RTT key establishment. Most importantly, the major difference between the approach of the current TLS 1.3 draft and a "real" 0-RTT key exchange protocol is that the former requires storing secret key information on the client between sessions. In contrast, a 0-RTT key establishment protocol does not require secret information to be stored between sessions.

Facebook’s Zero Protocol. Very recently, the social network Facebook announced that it is currently experimenting with a 0-RTT KE protocol calledZero.4 Zero is very similar to QUIC, except that it uses another nonce and encryption of the ServerHello message. It is noteworthy that the main difference between Zero and QUIC was introduced in order to prevent an attack discovered by Facebook, which has been reported to Google and meanwhile been fixed in QUIC, too. We believe that this is a good example that shows the demand of simple security definitions and provably-secure constructions for such protocols.

Security Goals. 0-RTT KE protocols like QUIC have ad-hoc designs that aim at achieving three goals: (1) 0-RTT encryption, where ciphertext data can already be sent together with the first handshake message; (2) perfect forward secrecy (PFS), where all ciphertexts exchanged after the second handshake message will remain secure even after the (static or semi-static) private keys of the server have been leaked, and (3) key independence, where “knowledge” about one of the two symmetric keys generated should not endanger the security of the other key.

Strong Key Independence. Intuitively, a 0-RTT KE protocol should achieve strong key independence between k1 and k2; if any one of the two keys is leaked at any time, the other key should still be indistinguishable from a random value. In all known security models, this intuition would be formalized as follows: if the adversary A asks a Reveal query for k1, it is still allowed to ask a Test query for k2, and vice versa. If the two keys are computationally independent from each other (which also includes computations on the different protocol messages), then the adversary should have only a negligible advantage in answering the Test query correctly.

Ultimately this leads to the following research questions: Do existing examples of 0-RTT KE protocols have strong key independence? Can we describe a generic way to construct 0-RTT KE protocols that provably achieve strong key independence?


QUIC Does Not Provide Strong Key Independence. If an attacker A is allowed to learn k1 by a Reveal query, then it is able to decrypt AE(k1; Y), re-encrypt its own value Y* := g^{y*} as AE(k1; Y*), and then compute the same k2 = X^{y*} as the client oracle, and can thus distinguish between the "real" key and a "random" key chosen by the Test query. See [11] for more details on key dependency in QUIC.

Note that this theoretical attack does not imply that QUIC is insecure. It only shows that the authenticity of the server's Diffie-Hellman share, which is sent in QUIC to establish k2, depends strongly on the security of key k1. Therefore QUIC does not provide strong key independence in the sense sketched above.
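To make this attack concrete, the following toy sketch (our illustration, not code from the paper or from QUIC itself) replays it over an insecure, small Diffie-Hellman group; all parameters and variable names are ours, and the AE layer is elided because an attacker who has revealed k1 can decrypt and re-encrypt at will.

    # Toy sketch (ours): the key-dependence attack on simplified QUIC.
    p, g = 23, 5                     # insecure toy DH parameters, illustration only

    x = 6                            # client's ephemeral exponent
    X = pow(g, x, p)                 # client share X = g^x, sent in message 1
    y = 15
    Y = pow(g, y, p)                 # server share, normally sent as AE(k1; Y)

    # After Reveal(k1), the attacker decrypts AE(k1; Y) and substitutes
    # its own share Y* := g^{y*}, re-encrypted under the known key k1.
    y_star = 9
    Y_star = pow(g, y_star, p)

    k2_client = pow(Y_star, x, p)    # client derives k2 from the forged share
    k2_attacker = pow(X, y_star, p)  # attacker computes the same k2 = X^{y*}
    assert k2_client == k2_attacker  # attacker answers the Test query for k2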

Previous Work on 0-RTT Key Exchange. The concept of 0-RTT key exchange was not developed in academia, but in industry, motivated by concrete practical demands of distributed applications. All previous works on 0-RTT KE [11,23] conducted a posteriori security analyses of the QUIC protocol, with tailored models. There are no foundational constructions as yet, and the relation to other cryptographic protocols and primitives is not yet well understood.

At ACM CCS 2014, Fischlin and Günther [11] provided a formal definition of multi-stage key exchange protocols and used it to analyze the security of QUIC. Lychev et al. [23] gave an alternate analysis of QUIC, which considers both efficiency and security. They describe a security model which is bespoke to QUIC, adapting the complex, monolithic security model of [17] to the protocol's requirements. Zhao [31] considers identity-concealed 0-RTT protocols, where user privacy is protected by hiding the identities of users in a setting with mutual cryptographic authentication of both communicating parties. Günther et al. [14] extended the "puncturable encryption" approach of Green and Miers [13] to show that even 0-RTT KE with full forward secrecy is possible, by evolving the secret key after each decryption. However, their construction is currently mainly of conceptual interest, as it is not yet efficient enough to be deployed at large scale in practice.

Security Model. In this paper, we use a variant of the Canetti-Krawczyk [7] security model. This family of security models is especially suited to protocols with only two message exchanges, with one-round key exchange protocols constituting the most important subclass. Popular examples of such protocols are MQV [22], HMQV [18], SMQV [27], KEA [21,24], and NAXOS [20]. A comparison of different variants of the Canetti-Krawczyk model can be found in [9,29].


Naturally, the novel primitive of 0-RTT KE requires formal security definitions. There are different ways to create such a model. One approach is to focus on generality of the model. Fischlin and Günther [11] followed this path, by defining multi-stage key exchange protocols, a generalization of 0-RTT KE. This approach has the advantage that it lays the foundation for the study of a very general class of interesting and novel primitives. However, its drawback is that this generality inherently also brings a huge complexity to the model. Clearly, the more complex the security model, the more difficult it becomes to devise new, simple, efficient, and provably-secure constructions. Moreover, proofs in complex models tend to be error-prone and less intuitive, because central technical ideas may be concealed in formal details that are required to handle the generality of the model.

Another approach is to devise a model which is tailored to the analysis of one specific protocol. For example, the complex, monolithic ACCE security model was developed in [17] to provide an a posteriori security analysis of TLS (a more modular approach was later proposed in [4]). A similar approach was followed by Lychev et al. [23], who adapted this model for an a posteriori analysis of QUIC, by defining the so-called Q-ACCE model. The notable drawback of this approach is that such tailor-made models tend to capture only the properties achieved by existing protocols, but not necessarily all properties that we would expect from a "good" 0-RTT KE protocol. In general, such tailor-made models do not, therefore, form a useful foundation for the creation of new protocols.

In this paper, we follow a different approach. We propose novel "bare-bone" security models for 0-RTT KE, which aim at capturing all (in particular strong key independence and forward secrecy), but also only the properties intuitively expected from "good" 0-RTT KE protocols. We propose two different models. One considers the practically relevant case of server-only authentication (where the client may or may not authenticate later over the established communication channel, similar in spirit to the server-only-authenticated ACCE model of [19]). The other considers traditional mutual cryptographic authentication of a client and server. The reduced generality of our definitions, in comparison to the very general multi-stage security model of [11], is intended. A model which captures only, but also all, the properties expected from a "good" 0-RTT KE protocol allows us to devise relatively simple, foundational, and generic constructions of 0-RTT KE protocols with as-clean-as-possible security analyses.

Importance of Foundational Generic Constructions. Following [3], we use non-interactive key exchange (NIKE) [8,12] in combination with digital signatures as a main building block; a toy sketch of this idea is given after the list below. This yields the first examples of 0-RTT KE protocols with strong key independence, as well as the first constructions of 0-RTT KE from generic complexity assumptions. There are many advantages of such generic constructions:


1. Generic constructions provide a better understanding of the structure of protocols. Since the primitives we use have abstract security properties, we can see precisely which abstract security requirements are needed to implement 0-RTT KE protocols.

2. They clarify the relations and implications between different types of cryptographic primitives.

3. They can be generically instantiated with building blocks based on different complexity assumptions. For example, if “post-quantum” security is needed, one can directly obtain a concrete protocol by using only post-quantum secure building blocks in the generic construction.

Usually generic constructions tend to involve more computational overhead than ad-hoc constructions. However, we note that our 0-RTT KE protocols can be instantiated relatively efficiently, given the efficient NIKE schemes of [12], for example.
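To make the building-block idea tangible before the formal treatment, the following minimal sketch (ours; the paper's precise construction and its analysis appear in Sect. 4) lets X25519 play the role of the NIKE and Ed25519 the role of the signature scheme, using the pyca/cryptography package. All function and variable names here are our own illustration.

    # Hedged sketch (ours): 0-RTT from a signed semi-static NIKE key.
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

    def raw(pk):
        # serialize a public key into bytes so it can be signed
        return pk.public_bytes(Encoding.Raw, PublicFormat.Raw)

    # Server, offline: semi-static NIKE key, signed under the long-term SIG key,
    # in the spirit of the cached configuration (Y0, sigma_S) of Fig. 1.
    sig_sk = Ed25519PrivateKey.generate()
    sig_pk = sig_sk.public_key()
    semi_sk = X25519PrivateKey.generate()
    semi_pk = semi_sk.public_key()
    sigma_S = sig_sk.sign(raw(semi_pk))

    # Client, message 1 (0-RTT): verify the cached signed key, derive k1 at once.
    sig_pk.verify(sigma_S, raw(semi_pk))         # raises an exception on failure
    eph_sk = X25519PrivateKey.generate()
    k1_client = eph_sk.exchange(semi_pk)         # 0-RTT data encrypted under k1

    # Server, upon receiving the client's ephemeral public key:
    k1_server = semi_sk.exchange(eph_sk.public_key())
    assert k1_client == k1_server

A forward-secret second key k2 would then be established from fresh ephemeral keys on both sides; the formal construction and its proof are the subject of Sect. 4.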

Contributions. The contributions of this paper can be summarized as follows:

– Simple security models. We provide simple security models, which capture all properties that we expect from a "good" 0-RTT KE protocol, but only these properties. We consider both the "practical" setting with server-only authentication and the classical setting with mutual authentication.
– First generic constructions. We give intuitive, relatively simple, and efficient constructions of 0-RTT KE protocols in both settings.
– First non-DH instantiation. Both QUIC and TLS 1.3 are based on DH key exchange. Our generic construction yields the first 0-RTT KE protocol which is not based on Diffie-Hellman (e.g., by instantiating the generic construction with the factoring-based NIKE scheme of Freire et al. [12]).
– First 0-RTT KE with strong key independence. Our 0-RTT KE protocols are the first to achieve strong key independence in the sense described above.
– Well-established, general assumptions. The construction is based on general assumptions, namely the existence of secure NIKE and digital signature schemes. For all building blocks we require only standard security properties.
– Security in the standard model. The security analysis is completely in the standard model, i.e., it is performed without resorting to the random oracle heuristic [1] and without relying on non-standard complexity assumptions.
– Efficient instantiability. Despite the fact that our constructions are generic, the resulting protocols can be instantiated relatively efficiently.


2 Preliminaries

For our construction in Sect. 4, we need signature schemes and non-interactive key exchange (NIKE) protocols. Here we summarize the definitions of these two primitives and their security from the literature.

2.1 Digital Signatures

A digital signature scheme consists of three polynomial-time algorithms SIG = (SIG.Gen, SIG.Sign, SIG.Vfy). The key generation algorithm (sk, pk) ←$ SIG.Gen(1^λ) generates a public verification key pk and a secret signing key sk on input of security parameter λ. The signing algorithm σ ←$ SIG.Sign(sk, m) generates a signature for message m. The verification algorithm SIG.Vfy(pk, σ, m) returns 1 if σ is a valid signature for m under key pk, and 0 otherwise.

Consider the following security experiment, played between a challenger C and an adversary A.

1. The challenger generates a public/secret key pair (sk, pk) ←$ SIG.Gen(1^λ); the adversary receives pk as input.

2. The adversary may query arbitrary messages m_i to the challenger. The challenger replies to each query with a signature σ_i = SIG.Sign(sk, m_i). Here i is an index, ranging over 1 ≤ i ≤ q for some q ∈ ℕ. Queries can be made adaptively.

3. Eventually, the adversary outputs a message/signature pair (m, σ).

Definition 1. We define the advantage of an adversary A in this game as

Adv^{sEUF-CMA}_{SIG,A}(λ) := Pr[(m, σ) ←$ A^{C(λ)}(pk) : SIG.Vfy(pk, σ, m) = 1 ∧ (m, σ) ≠ (m_i, σ_i) ∀i].

SIG is strongly secure against existential forgeries under adaptive chosen-message attacks (sEUF-CMA) if Adv^{sEUF-CMA}_{SIG,A}(λ) is a negligible function in λ for all probabilistic polynomial-time adversaries A.

Remark 1. Signatures with sEUF-CMA security can be constructed generically from any EUF-CMA-secure signature scheme and chameleon hash functions [6,28].
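To illustrate, the experiment and advantage above can be phrased as a small Python harness (ours; the names and the Ed25519 instantiation are our own assumptions, using the pyca/cryptography package), where the adversary is a callable receiving pk and the signing oracle and returning its candidate forgery.

    # Sketch (ours) of the sEUF-CMA experiment, instantiated with Ed25519.
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    def seuf_cma_experiment(adversary):
        sk = Ed25519PrivateKey.generate()       # (sk, pk) <-$ SIG.Gen(1^lambda)
        pk = sk.public_key()
        answered = []                           # the oracle answers (m_i, sigma_i)

        def sign_oracle(m):                     # adaptive signing queries
            sigma = sk.sign(m)
            answered.append((m, sigma))
            return sigma

        m, sigma = adversary(pk, sign_oracle)   # the candidate forgery (m, sigma)
        try:
            pk.verify(sigma, m)                 # SIG.Vfy(pk, sigma, m) = 1 ?
        except InvalidSignature:
            return 0
        # "strong" forgery: (m, sigma) must differ from every oracle answer
        return 0 if (m, sigma) in answered else 1

    # A trivial adversary that merely replays an oracle answer never wins:
    assert seuf_cma_experiment(lambda pk, sign: (b"m", sign(b"m"))) == 0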

2.2 Secure Non-interactive Key Exchange

Definition 2. A non-interactive key exchange (NIKE) scheme consists of two deterministic algorithms (NIKE.Gen, NIKE.Key).

NIKE.Gen(1^λ, r) takes a security parameter λ and randomness r ∈ {0,1}^λ. It outputs a key pair (pk, sk). We write (pk, sk) ←$ NIKE.Gen(1^λ) to denote that NIKE.Gen(1^λ, r) is executed with uniformly random r ←$ {0,1}^λ.

NIKE.Key(sk_i, pk_j) is a deterministic algorithm which takes as input a secret key sk_i and a public key pk_j, and outputs a key k_{i,j}.

We say that a NIKE scheme is correct if, for all (pk_i, sk_i) ←$ NIKE.Gen(1^λ) and (pk_j, sk_j) ←$ NIKE.Gen(1^λ), it holds that NIKE.Key(sk_i, pk_j) = NIKE.Key(sk_j, pk_i).

A NIKE scheme is used by d parties P_1, . . . , P_d as follows. Each party P_i generates a key pair (pk_i, sk_i) ← NIKE.Gen(1^λ) and publishes pk_i. In order to compute the key shared by P_i and P_j, party P_i computes k_{i,j} = NIKE.Key(sk_i, pk_j). Similarly, party P_j computes k_{j,i} = NIKE.Key(sk_j, pk_i). Correctness of the NIKE scheme guarantees that k_{i,j} = k_{j,i}.
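As a concrete illustration (ours; for NIKE schemes treated formally, the paper points to [12]), plain X25519 key agreement fits this two-algorithm interface. The sketch below wraps the pyca/cryptography primitives as NIKE.Gen and NIKE.Key and checks the correctness condition k_{i,j} = k_{j,i}.

    # Sketch (ours): X25519 viewed as a NIKE (NIKE.Gen, NIKE.Key).
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

    def nike_gen():
        # the randomness r is drawn internally by the library
        sk = X25519PrivateKey.generate()
        return sk.public_key(), sk

    def nike_key(sk_i, pk_j):
        # deterministic shared key k_{i,j}
        return sk_i.exchange(pk_j)

    pk_i, sk_i = nike_gen()
    pk_j, sk_j = nike_gen()
    assert nike_key(sk_i, pk_j) == nike_key(sk_j, pk_i)   # correctness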

CKS-Light Security. The CKS-light security model for NIKE protocols is relatively simplistic and compact. We choose this model because other (more complex) NIKE security models like CKS, CKS-heavy, and m-CKS-heavy are polynomial-time equivalent to CKS-light. See [12] for more details.

Security of a NIKE protocol NIKE is defined by a game NIKE played between an adversary A and a challenger. The challenger takes a security parameter λ and a random bit b as input and answers all queries of A until A outputs a bit b′. The challenger answers the following queries for A:

– RegisterHonest(i). A supplies an index i. The challenger runs NIKE.Gen(1^λ) to generate a key pair (pk_i, sk_i), records the tuple (honest, pk_i, sk_i) for later, and returns pk_i to A. This query may be asked at most twice by A.
– RegisterCorrupt(pk_i). With this query A supplies a public key pk_i. The challenger records the tuple (corrupt, pk_i) for later.

– GetCorruptKey(i, j). A supplies two indices i and j, where pk_i was registered as corrupt and pk_j as honest. The challenger runs k ← NIKE.Key(sk_j, pk_i) and returns k to A.

– Test(i, j). The adversary supplies two indices i and j that were registered honestly. Now the challenger uses bit b: if b = 0, then the challenger runs k_{i,j} ← NIKE.Key(sk_j, pk_i) and returns the key k_{i,j}. If b = 1, then the challenger samples a random element from the key space, records it for later, and returns that key to A.

The game NIKE outputs 1, denoted by NIKE^A_{NIKE}(λ) = 1, if b = b′, and 0 otherwise. We say A wins the game if NIKE^A_{NIKE}(λ) = 1.
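The oracle logic of this game can be summarized in a toy harness (ours; bookkeeping is simplified, e.g., corrupt registrations and repeated Test queries are not tracked), reusing the X25519-based nike_gen/nike_key interface from the previous sketch.

    # Toy harness (ours) for the CKS-light game.
    import os
    import secrets
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

    def nike_gen():
        sk = X25519PrivateKey.generate()
        return sk.public_key(), sk

    def nike_key(sk_i, pk_j):
        return sk_i.exchange(pk_j)

    def cks_light_game(adversary, key_len=32):
        b = secrets.randbits(1)              # the challenger's hidden bit
        honest = {}                          # index i -> (pk_i, sk_i)

        def register_honest(i):              # may be asked at most twice
            honest[i] = nike_gen()
            return honest[i][0]

        def get_corrupt_key(pk_i, j):        # pk_i corrupt, index j honest
            return nike_key(honest[j][1], pk_i)

        def test(i, j):                      # both indices registered honestly
            if b == 0:
                return nike_key(honest[j][1], honest[i][0])   # real key k_{i,j}
            return os.urandom(key_len)                        # random key

        b_prime = adversary(register_honest, get_corrupt_key, test)
        return int(b_prime == b)             # 1 iff A wins, i.e., b = b'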

Definition 3. For any adversary A playing the above NIKE game against a NIKE scheme NIKE, we define the advantage of winning the game NIKE as

Adv^{CKS-light}_{NIKE,A}(λ) = |2 · Pr[NIKE^A_{NIKE}(λ) = 1] − 1|.

