Domain Extension for MACs Beyond the Birthday Barrier

From Bauman National Library
This page was last modified on 20 December 2015, at 13:02.
Domain Extension for MACs Beyond the Birthday Barrier
Authors Yevgeniy Dodis[@: 1] and
John Steinberger[@: 2]
Published 2011
Download Original paper

Abstract. Given an n-bit to n-bitMAC (e.g., a fixed key blockcipher) withMAC security ε against q queries, we design a variable-length MAC achieving MAC security against queries of total length qn. In particular, our construction is the first to break the “birthday barrier” for MAC domain extension from noncompressing primitives, since our security bound is meaningful even for (assuming ε is the best possible O(1/2n)). In contrast, the previous best construction for MAC domain extension for n-bit to n-bit primitives, due to Dodis and Steinberger, achieved MAC security of , which means that q cannot cross the “birthday bound” of .


Most primitives in symmetric-key cryptography are built from block ciphers, such as AES. In this paper, we will concentrate on the question of designing variable-input-length (VIL) message authentication codes (MACs) from block ciphers. This question is very well studied, as we survey below. However, with few exceptions, most existing constructions and their analyses make the following two assumptions: (a) Pseudorandomness: the block cipher is modeled as a pseudorandom permutation (PRP); and (b) Secrecy of Intermediate Results: the attacker only learns the input/output behavior of the corresponding VIL-MAC, but does not learn any of the intermediate results. As observed by Dodis et al., each of these assumptions might either be unnecessarily strong, or simply too unrealistic in the following two scenarios.

DOMAIN EXTENSION OF MACS. This is our main question. Since the desired MAC primitive only needs to be unpredictable, it would be highly desirable to only assume that the block cipher is unpredictable as well, as opposed to pseudorandom. Indeed, it seems that assuming the block cipher is unpredictable is a much weaker assumption than assuming full pseudorandomness: to break the latter, all one needs to do is to predict one bit of “random-looking” information about the block cipher with probability just a little over , while the former requires one to fully compute the value of the block cipher on a new point with non-trivial probability. For example, in the non-uniform model, any block cipher (in fact, even non-trivial pseudorandom generator) with an n-bit key can be very efficiently distinguished from random with advantage . To the best of our knowledge, no such lower bounds are known for breaking unpredictability, meaning that close to MAC security might be possible for such a block cipher. To put it differently, while we hope that existing block ciphers are actually PRPs, it seems quite reasonable to assume that their MAC security could be noticeably better than their PRP security. Thus, if we can design a VIL-MAC whose security is tightly related to the unpredictability of the block cipher, this VIL-MAC might be more secure than the MAC whose analysis assumes the pseudorandomness of the cipher. Of course, one might hope that existing block-cipher based VIL-MACs, such as CBC-MAC [1][2] and HMAC [3][4] (whose compression function, under the hood, also uses a block cipher), are already secure when the block cipher is unpredictable. Unfortunately, as detailed in Dodis et al. [5][6][7] ( особенно [7], this is not the case: with few exceptions mentioned shortly, standard constructions are completely insecure when instantiated with unpredictable block ciphers, — often despite having simple proofs of security when one models the block cipher as a PRP.

RESILIENCE TO SIDE-CHANNELS. Even if the block cipher is a very good PRP, in practice many cryptographic implementations fall prey to various forms of side-channel attacks [8][9][10][11][12], where the physical realization of a cryptographic primitive can leak additional information, such as the computation-time, powerconsumption, radiation/noise/heat emission etc. Thus, hardware people are paying special attention to securing block ciphers, such as AES, against such side-channel attacks. Although this might be a daunting task, it appears reasonable that specialized hardware implementations of AES might be pretty resistent to common forms of sidechannel attacks. On the other hand, when the block cipher is used in some more complicated application, such as the design of a VIL-MAC, it might be hard or impractical to design a specialized “leakage-resilient” implementation for each such application, instead of doing so for a single, fixed-length building block (such as AES). Motivated by these considerations, Dodis et al. [5][6][7] proposed the model where the internals of the block cipher implementation are assumed secure, as usual, but all the external input/output behavior of the block cipher could potentially leak to the attacker (say, via side-channel attack). To give this model a name while simultaneously making it more general, we say that a construction of a (deterministic) MAC P using some lower level keyed primitive(s) f is transparent (w.r.t. f), if (a) the key for P only consists of one of more keys for f; (b) when making a query M to P, the attacker not only gets P(M), but also gets all the input/output pairs for every call to f made during the evaluation of P(M). Since P is deterministic and all keys reside “inside” f, this indeed provides the attacker with the entire transcript of P(M), short of what is happening during the calls to f. Coming back to our setting, we are interested in building a transparent VILMAC out of a block cipher. As we will see, this question is highly non-trivial even if the block cipher is assumed pseudorandom, let alone unpredictable. Indeed, as observed by [5][6][7], most existing VIL-MACs, including CBC-MAC [1][2] and HMAC [3][4], are completely insecure when the intermediate results are leaked, even when instantiated with a PRP.

OUR MAIN RESULT. Motivated by these applications, we ask the same question as Dodis et al. [5][6][7], which simultaneously addresses both of the above concerns.

Question 1 Can one build a transparent VIL-MAC P out of a block cipher f which is only assumed unpredictable?

As already mentioned, most standard VIL-MACs built fromblock ciphers fail to address eitherMAC-preservation or transparency (even with a PRP). So we turn to the known secure approaches. As it turns out, all of them followed the principle of An and Bellare [13] of first constructing a compressing Weakly Collision Resistant (WCR)1 hash function F from m to n bits, for some fixed m > n, then iterating this fixed-length WCR F using some variant of the Merkle-Damg°ard transform, and finally composing the output with a freshly keyed block cipher. As argued by Preneel and van Oorschot [14], any construction of this kind can achieve at most birthday security. Translated to the MAC-preservation setting, even if our original MAC f cannot be forged with probability ε using q queries, the resulting VIL-MAC P cannot have security greater than , meaning that q cannot cross , even is ε is assumed to be (the best possible) .

Interestingly, even achieving birthday security turned out to be challenging when the block cipher is only assumed unpredictable. The first secure construction of Dodis and Puniya [6], based on the Feistel network, only achieved security . The bound was then improved to by Dodis, Pietrzak and Puniya [5] using the “enhanced CBC” construction. Finally, Dodis and Steinberger [7] showed (nearly) birthday security using a new analysis of the Shrimpton-Stam [15] compression function. All these constructions were also transparent.

We ask the question if it is possible to build (hopefully, transparent) VIL-MACs from block ciphers with beyond birthday security. Most ambitiously, if f cannot be forged with probability ε using q queries, we would like to build a VIL-MAC P with security close to , meaning our security is meaningful even for values of q approaching ,, provided ε is assumed to be (the best possible) . As our main result, we answer this question in the affirmative. Informally (see Section 4 for more details), Theorem 1 There exist fixed polynomials a(n) and b(n) and a construction P of a transparent VIL-MAC from an n-bit block cipher f, such that the rate2 of P is a(n) and the MAC security ε′ of P against q′ queries of total length qn is at most , where ε is the MAC-security of f against q queries. In particular, this bound is meaningful for q (and q′) approaching .

OTHER RELATED WORK. As we mentioned, the question of achieving beyond-birthday security for building VILMACs from unpredictable block ciphers was open prior to our work. In fact, the only domain extension results for MACs with beyond birthday security we obtained just recently by Yasuda [16] and Lee and Steinberger [17]. However, both results started with a shrinking MAC from strictly more than 2n to n bits. As we will see below, building such shrinking MACs (with beyond birthday security) from unpredictable block ciphers is highly nontrivial, and will be one of the key challenges we resolve on route to proving our main result. (However, we note that our result does not3 simply reduce to building a 2n to n bit MAC from an n-bit to n-bit MAC.)

Another related area is that of for building VIL pseudorandom functions (PRFs) with beyond birthday security from PRPs, or more generally, fixed-length PRFs. In particular, several such constructions were found by [18][19][20][21][22][23]. However, it is easy to see that none of themwork either for theMAC domain extension, or even for building VIL-MACs (let alone PRFs) when the intermediate computation results are leaked. For example, the corollary of our main result, giving a transparent VIL-MAC from a -secure PRP with security εprp+ ˜, appears to be new.

Perhaps the closest work to ours is a paper of Maurer and Tessaro [24], who showed how to build a variablelength random oracle from an n-to-n bit random oracle. Their construction, analyzed in the indifferentiability framework of [25][26], has fixed polynomial rate (just like our construction) and security , for any . However, although our construction borrows some ideas from that of [23], the two settings appear incomparable. On the one hand, the Maurer-Tessaro paper has to build an “indifferentiability simulator” for their setting (which required “input extraction” not required in our setting). However, they assumed a truly random function, and could use various probability calculations in deriving their result. In our setting, the block cipher is only unpredictable, and we have to make an explicit reduction to unforgeability, which makes matters substantially more delicate.

Outline of Our Construction

Our construction is quite involved, although we abstract it into several self-contained layers. As a side benefit, some of these layers (see below) are of potentially independent interest, and could be used for other purposes.

STEP 1: REDUCING TO 3n-TO-2nWCR AND 2n-TO-n MAC. First, we notice that the above mentioned birthday limitation [14] of the An-Bellare approach no longer holds provided we build a WCR hash function F from m to 2n bits (for some m > 2n, say m = 3n). Namely, “birthday on 2n bits” might still give good enough security 2n. However, even if we succeed in doing so with beyond birthday security (which will be one of our key results), we now also have to build a “final” MAC G from 2n to n bits. Thus, using known techniques but with different parameters (see Lemma 1 and Figure 1), our problem reduces to building beyond birthday WCR F from 3n to 2n bits and a beyond birthday MAC G from 2n to n bits.

STEP 2: REDUCING TO COVER-FREE FUNCTIONS. It so turns out that both of these tasks—i.e. the construction of the WCR function F and the construction of the MAC G—can be achieved from a more powerful (keyed) primitive which we introduce, called a cover-free function. Informally, a keyed function g from (recall, we will have m = 3n) to (for some parameter t), where , is called cover-free (CF) if, given oracle access to g, it is infeasible to produce a sequence of (distinct) queries in such that, for some 1 ≤ j ≤ q, zℓ(sj) ∈ {zℓ(s1), . . . , zℓ(sj−1)} for all . In other words, for each new query one of the coordinates of must be “uncovered” by previous coordinates of that index. The case t = 1 corresponds to the standard m to n bit WCR security, however better (and in particular beyond-birthday) cover-free security can be achieved with larger values of t.

First, as depicted on the left side of Figure 2, we can compose CF g with t independently keyed block ciphers , by setting , where . We show that the resulting G is easily seen to be a secure MAC from m bits to n bits. Moreover, the MAC security of G is tightly related to the CF security of g and the MAC security of f (see Lemma 2). Intuitively, a new forgery for G will give a new forgery for at least one of the fℓ’s, by the CF security of g. Since m = 3n > 2n, this already gives us the needed 2n to n bit MAC.

More interestingly, as depicted on the right side of Figure 2, we show how to compose a CF function g with 2t independently keyed block ciphers (in a variant of the “double-pipe” mode of [27]) to get a WCR function F from m bits to 2n bits. Moreover, the WCR security of F will be “roughly” , where ε′ is the CF security of g and ε is the MAC security of f (see Lemma 3). Thus, as long as we can build CF g with security ′ close to O(qε), the WCR security of F will also be such. The proof of this result critically uses the “multi-collision to forgery” technique of Dodis and Steinberger [7].

Summarizing the discussion above, our task of building a VIL-MAC P thus reduces to building a CF function g with security where ε is the MAC security of the underlying n-bit to n-bit primitive f. We also wish to build the CF function g with t as small as possible (which is relevant since the efficiency of P, including the size of key, is proportional to t). See Lemma 4.

STEP 3: BUILDING CF FUNCTIONS. This is, by far, the most involved part of our construction. The inspiration for this construction came from the afore-mentioned paper ofMaurer and Tessaro [24], who showed how to build a VIL random oracle from an n-to-n bit random oracle. As we mentioned already, the setting of [24] is incomparable to our setting, especially since we cannot assume that our block cipher is (pseudo)random. However, our actual construction of CF functions is quite similar to the corresponding “cover-free” layer of the construction of [24], although we had to make some changes (actually, simplifications) to the construction of [24], and our analyses are completely different. Our CF construction has three layers which we informally call combinatorial, cryptographic and algebraic. An impatient reader can look at Figure 3 for a concrete example (with t = 3 and other notation explained below).

STEP 3A: USING INPUT-RESTRICTING FAMILIES. This purely combinatorial step is precisely the same as in [24], and is also the most expensive step of our construction. We will use an unkeyed function E from to (here r is a parameter) called an input-restricting function family (IRFF; see Definition 1). Intuitively, IRFF has the property that after any q queries to E, the number Q of new inputs s for which the r-tuple E(s) is covered by the union of is “not much larger” than q, and this should be true even when q is almost 2n. Recall, our final goal is to ensure that it is hard to produce any such new input s. While IRFFs do not (and cannot!)5 quite get us there, they ensure that there are not that many choices for the attacker of which new inputs to “cover” by old inputs.

We discuss the known constructions of IRFFs in Section 4, but mention that the constructions of IRFFs are completely combinatorial, and closely related to constructions of certain types of highly unbalanced bipartite expander graphs. While well-studied, these types of expander graphs are not yet completely understood, and in particular the “extreme” setting of parameters relevant to our case has not been the object of much attention. Therefore, although the existence of IRFFs with “good parameters” is known (and lead to the asymptotic bound claimed in Theorem 1), the concrete constructions are pretty inefficient. Nevertheless, as these parameters and efficiency are improved by future research in computational complexity, so will our final construction.

STEPS 3B-C: ADDING CONFUSION AND MIXING. Recall, IRFFs convert our input s into an r-tuple . To get the final t-tuple for our CF function g, we can imagine repeating the following two-step precedure (steps 3b and 3c) t times, each time with a freshly keyed block cipher F (so the total number of block cipher keys for g will be t). First, we pass all r values through the block cipher F (“confusion step”), getting the values . This is the cryptographic “confusion” layer. Then we algebraically “mix” all 2r values through a fixed, degree-r multivariate polynomial p (see Equation 3). This gives us one of the t outputs values .

The intuition for these last two steps is hard to explain (and, indeed, our analysis is quite involved). At a high level, the confusion step (evaluating )) is certainly needed to make a reduction to unforgeability, while the mixing step uses the fact that low-degree polynomials have few roots, so a “non-trivial” collision on the output of p will enable one to guess one of the values we are trying to forge. Of course, the difficulty is to make a successful guess for when and where the non-trivial collision to p will happen, with probability roughly 1/Q, where Q is the guarantee given by IRFF (so Q is close to q). It turns out, there is a trivial strategy to make such a guess with “birthday” probability , even when t = 1. Of course, such probability is too low, and this is why we repeat steps 3b-c t times, for an appropriately chosen parameter t. We then show that the required guessing strategy can be reduced to the analysis of two “balls-and-bins” games. The relevance of such games to the domain extension of MACs was first introduced by Dodis and Steinberger [7]. Unfortunately, the two “ballsand- bins” games we have to analyze are significantly more complicated than the game of [7]. Nevertheless, as our most involved technical step, we show that both games can be won with probability roughly . Thus, by choosing (say, t = n), we get the desired bound .

EFFICIENCY. Our final VIL-MAC construction uses 5t keys for f, where we recall that the minimal value of . Theoretically, we can reduce key material down to a single key for f, by “keying” f via fixed, reserved input bits. Namely, to simulate (at most) 5n keys this way we need to reserve bits of input (and “truncate” the same number of bits in the output), effectively reducing the block length of the construction from n down to math>\ n' = n - \left [log_2 \left (5n \right ) \right ] ~ \, \! </math>. Due to the output truncation, we now also need to guess the missing output bits not returned by our forger, incurring an (acceptable) additional degradation of the security bound.

Our final VIL-MAC also achieves rate roughly proportional to . Achieving a low value of r (coming from the combinatorial IRFF part) is more problematic (see Section 4), although existentially one can also make r = O(n). So the best rate we can hope to achieve using our approach is . Therefore, we primarily view our result as an important feasibility result, much like the result of Maurer and Tessaro [24]. Nevertheless, our feasibility opens the door for future, potentially more efficient constructions.


A keyed function family is a map where . The strings in are the keys of f and we write for for and .

For MACs we consider the following game, where A is a halting adversary with oracle access to : Game Forge(A, f): ; ,

If , and x was not a query of A then A wins, otherwise A looses. We define the insecurity of f as a MAC to be where the maximum is taken over all adversaries A making at most q queries of total combined length at most (after padding, if any) and of “running time” at most T. The “running time” is defined to be the total running time of the experiment, including the time necessary to compute the answers to A’s queries. Moreover we “bill” the final verification query (and its length) to A (so that A must in fact make q − 1 queries if x in ; seen another way, we ask A to verify its own forgery, if it attempts one). When f has fixed input length (i.e. for some m in N) then is a function of q and it is convenient to elide the last argument, writing instead of .

The weak collision resistance or “wcr” security of a function family f is measured as the maximum advantage of an adversary in finding a collision for a randomly keyed member of f when given oracle access to this member. We write for the maximum such advantage over all adversaries A making at most q queries of running time at most T. (Here we do not measure the total query length, as we will only measure the wcr security of fixed input length constructions.) We skip a formal pseudocode-based definition of this standard notion, but mention that the adversary must make the queries verifying its collision, not merely output a colliding pair.

Given a block length n and a message x, we let be a suffix-free encoding of x of length a multiple of n bits (for example, the standard Merkle-Damg°ard padding of x, which appends the length of x as the last block6). Furthermore, given two keyed compression functions F : we define a keyed function by where and each has n bits, for all in , (see Fig. 1).

Fig. 1 Constructonp .

The proof of the following (standard) lemma is given in Appendix B[28]:

Lemma 1. Let , and consider as a function of key space . Then, for ,

Informally speaking, Lemma 1 reduces our task to building, from an n-bit to n-bit primitive f, compression functions F and G such that F has beyond-birthday wcr security and G has beyond-birthday mac security, where these securities must be based only the mac security of f (i.e., breaking the wcr security of F must imply breaking the mac security of f, and breaking the mac security of G must likewise imply breaking the mac security of f).

To the latter end we introduce in this paper the notion of a cover-free keyed function family . Here t is a parameter of the definition and we write the output of as where in for each i; later we will simply write when the dependence on a key k is understood. In the cover-free game, an adversary adaptively queries on distinct points , and wins if for some j each block of output of is “covered” by a previous output, in the sense that .

The following game formalizes this: Game Cover(A, g):


If makes distinct queries to such that , для любого , Then A wins; Otherwise, A looses.

We define the cover-free (CF) insecurity of g as where the maximum is taken over all adversaries A making at most q queries and of running time at most T, with the same conventions as above on the running time. We (informally) say that a function family is cover-free to mean it has small cover-free insecurity.

[File: fig_2.png|thumb|center|412px|Рис. 2. Left-composition . Right-parallel composition . ]]

Given a (cover-free) function family where the l-th block of is given by the function and a function family we define the composed function family by where and , and is a shorthand for the concatenation of . See Figure 2. We also define a parallel composition of f and g, defined by .

In other words, is simply the concatenation of two functions instantiated with the same g-key but independent f-keys.

Recall that our construction MD[F,G] takes as parameters keyed compression functions и . Given a cover-free function family and a function family , we will set , κ2 = κ + tκ, and define



for all . The specification of our construction is thus now reduced to defining the cover-free function family g. We note that the n-bit to n-bit function family f is a parameter of the scheme (not constructed from any lower-level primitive) whereas g must be instantiated from f, and its cover-free security reduced to the mac security of f; see the next section for details on the construction of g.

Recall that, by Lemma 1, we are interested in bounding and in terms of . Towards this goal, we give two lemmas that upper bound InSecwcr and as a function of and . The proofs of both lemmas are given in Appendix B[28].

Lemma 2. . Then .

Lemma 3. . Then .

(We note that, unlike Lemmas 1 and 2, the proof of Lemma 3 is not a triviality. In particular, it requires a “multicollision to forgery” (MTF) bin-filling game of the type used in [7].) Combining Lemmas 1, 2 and 3 we directly obtain:

Lemma 4. and let F, G be as in (1), (2). Then, if , .

Lemma 4 reduces our problem to constructing the cover-free function family g from the function family f such that can be bounded in terms of . This is the topic of the next section, and the paper’s main technical achievement.

When a keyed function is built from a smaller primitive, where the function’s key consists of a finite set of keys for the smaller primitive (which is potentially called several times with different keys), the notions of MAC, WCR and cover-free securities naturally extend to a transparent model, where the adversary receives a full transcript of the function’s computation at each query, up to calls to the primitive (namely, calls to the lower-level primitive appear as oracle calls in the transcript, so as not to reveal the primitive’s keys). In fact, all results and proofs of this paper can be (easily) interpreted and are valid in this stronger “transparent” model. However, to keep the presentation simple, we will not further remind this from here on.

Building Cover-Free Function Families from MACs

This section contains our main result, the construction of a cover-free function family based on n-bit to n-bit primitives, that achieves beyond-birthday security assuming only goodMAC security from the primitives. We note in passing that an unkeyed function cannot be cover-free against information-theoretic adversaries unless t2n ≥ 2m or unless t is as large as the desired query security, which gives values of t that are too large to be practical for most settings.

Our construction uses the notion of an input-restricting function family (IRFF), introduced by Maurer and Tessaro. The following definition is slightly modified for our purposes.

Definition 1. Let and let m > n.-IRFF is a set of functions such that (i) and for all and all , (ii) for all there exists such that , and (iii) for any subset such that we have


The constructions of input-restricting function families are discussed in Section 4[19].

Рис. 3.  : r = 2, t = 3.

Our cover-free function family is also adapted from [24]. The construction takes as parameters m ≥ n as well as integers r, t ≥ 1 and a -IRFF be n-bit to n-bit primitives (later to be instantiated as members of function family , possibly fixed-key blockciphers). The construction also uses a (concrete, unkeyed) function , описанную ниже. Пусть described below.




for 1 ≤ l ≤ t (see Figure 3). From we obtain a keyed function family of signature by instantiating each with a member of a function family ; however, we opt for the unkeyed notation (in which are implicitly keyed) when possible to reduce notational overhead.

As for the function p, it is the polynomial


where are n-bit strings treated as elements of the field . The only properties of p that matter are the two following:

I. Invertibility. For any 1 ≤ j ≤ r and any values such that are not all zero, there are few values such that , and these values are efficiently enumerable.

II. Collision Invertibility. For any 1 ≤ j, j′ ≤ r and any values such that there are few values such that



and these values are efficiently enumerable.

Both properties are easily verifiable from the fact that is a polynomial of yj of the type типа , where c does not depend on . Maurer and Tessaro use a different construction instead of p which does not obviously satisfy either property above, that requires enlarging the set of functions to a set where v ranges from 1 to ⌈m/n + 1⌉.

To state our main theorem, let be the amount of time required to list the values v and Eh(s) ∈ U for any given and set such that . We have:

Theorem 2. Let be a -IRFF, let be a function family, and consider as a keyed function family of key space by setting for any . Then


for any q ≤ K, where and

where is the time required to find all the roots of a polynomial of degree r in a field of size . In particular, when t = n and , we have

Proof. Let be an adversary for the game that runs in time T and that has success probability . It suffices to design an adversary B for the game with advantage at least

and that runs in time .

B has access to a random member fk0 of f. B chooses t random keys, and selects a random index . Then B simulates with oracle , instantianting the function Fℓ with and instantianting with , using its oracle. Moreover B proceeds to “forget” the value of , treats each of the functions as an oracle, and tries to forge any one of them (predicting their output on an unqueried input), making only one such forgery attempt during the game. Since B has chance 1/t of forging Fℓ0 if it does make a correct forgery, it suffices for B to make such a forgetful forgery with chance at least

in order for it to forge with chance at least εA. It is easier to consider a modified version of A′, which we call simply A, that directly issues F-queries; more precisely, A issues a sequence of queries ′ where q′ ≤ qr and each ; B answers the query with the tuple . We can assume A never makes the same query twice. We let and let . Note that

by the input-restricting property of . We also let for 1 and put for any (which B can compute after it answers A’s i-th query as long as ). We say A “wins the generous cover-free game” at the i-th query if there exists an such that for .

Clearly, there exists an A of same running time as A′ whose advantage in the generous game is at least as great as , since A can simply simulate A′ and ask the various F-queries needed to compute the answers to A′’s queries; by definition, A wins if A′ wins . (It is easy to check that if A′ makes (distinct) queries such that , then A wins the generous cover-free game by the time it has finished asking the queries necessary to compute the answer to the query of A′.) Thus it is sufficient to have B forge one of the F-functions with probability at least . We now view B as simply answering A’s F-queries (as opposed to computing answers to -queries) though in reality B is running the whole computation, including the simulation of A′ by A.

We view each value as a “bin” with t “slots”; the ℓ-th slot of bin s “receives a ball” or “becomes full” at query (namely, if the bin already exists at that point), if , and if either ). Once a bin receives a ball in a slot, the slot remains full. A slot cannot receive more than one ball, and bins are never removed; we note that no bins exist at the start, and that bins are added at the i-th query. Under these definitions, A wins the “generous” cover-free game precisely if some bin becomes full (i.e., all its slots become full). It is helpful to picture A and B as playing an adversarial game in which A wins if it fills a bin without B forging one of the functions , and where B wins otherwise (in fact, we may even picture A as choosing the answers to its queries, while B observes and tries to guess an answer before it is revealed).

We say that ball l of a bin is “early” if and “late” otherwise; thus a ball is early if and only if it is added to a bin at the same A-query which creates the bin. B plays one of two different forging strategies with equal probability. The first strategy is designed to prevent too many early balls from appearing in bins while the second strategy is designed to prevent A from filling a bin (the second strategy only functions well if not too many early balls appear in bins, whence the necessity of the first strategy). We name the two strategies “early prevention” and “late prevention”; despite these names, we emphasize the two strategies are not played sequentially; instead, B flips a coin at the start to decide which strategy to use.

We start by describing B’s early prevention strategy. Let ; as noted above, , so Q is an upper bound for the total number of bins created during the game. The goal of B’s early prevention strategy is to prevent A from creating, for every or more bins that each have k or more early balls in them. In other words, we only require this strategy to work (i.e. forge a function Fℓ with “good enough” probability) if there is some 1 ≤ k ≤ t such that or more bins are created with k or more early balls in them. We model the early prevention strategy via a slightly simplified balls-in-bins game described below. To connect this balls-in-bins game with the “real” game played by B and A, it is helpful to first review the process via which bins are created and early balls are added to them. Consider a query made by A. Then

and the elements of are the new bins created by this query. Each bin has t slots and the “value” of the l-th slot of s is revealed when B makes the query ; after the value is revealed, an early ball is added to the l-th slot of s according to whether there exists an such that or not (notice that is known at this point for all ). Thus, the process of filling the newly created bins with early balls consists in t “phases” (the queries , which are made sequentially by B), where the ℓ-th phase simultaneously reveals the values of the l-th slots of all the new bins, and whether these slots receive early balls or not. The following balls-in-bins game thus abstracts the process of creation of new bins and early balls:

‘EARLY PREVENTION’ BALLS-AND-BINS GAME. This game is played between two adversaries and . Parameters are integers . Rules are as follows:

  • The game proceeds in q′ rounds. At round , A announces some number of bins such that .
  • At the beginning of each round the bins are empty. Each bin has t slots. Each round consists of t phases.

At the l-th phase, A reveals which of the bins have their l-th slot “filled” by a “ball”.

  • Before each phase of each round, is allowed to secretly predict a bin that will receive a ball at that phase;

wins if it makes a correct guess, but it is only allowed to make one guess during the entire game.

  • Let be the number of bins that receive k or more balls at round i, and let where the sum is

taken over all the rounds. Then is required to fill bins such that for at least one value of k, 1 ≤ k ≤ t.

In Proposition 1 of Appendix A we exhibit a strategy for that gives it at least chance of winning the above game, regardless of ’s strategy. Thus, if or more bins each receive k or more early balls for some 1 ≤ k ≤ t, and if B uses this strategy, B has chance of correctly predicting, before the answer to some query is given, that the value returned by this query will result in slot l of some (specific) bin receiving an early ball. To guess , B further chooses a random , and solves in order to guess (since s receives an early ball in slot ℓ precisely when there exists an such that ).

To see that is really “solvable” two different cases must be considered, according to whether or not. , then was created by an earlier A-query and the value of its slots are known, in particular the value of its l-th slot is known. Let , let be the unique index such that and let . Then all the values are known to B except for the value , which it needs to guess using the equation


By condition (i) of Definition 1 so, by the ‘Invertibility’ property of p, there are few values math>\ \bar {y}_{h_0} ~ \, \! </math> that solve (5). More precisely, since is a nonzero polynomial of degree at most r in , B has to choose from the at most r roots of , where is just a constant. In the second case, and is not known (like , it is about to be revealed). Let , let be the unique index such that and let for 1 ≤ h ≤ r. Then all the values are known to B except , and B needs to solve


(this is ) for (or at least for ). But

since ; also, by the

injectivity of , so it follows by the ‘Collision Invertibility’ property of p that there are few values solving (6); in fact these are the at most r different roots of , considered as a polynomial in . The term in Theorem 2 accounts for B’s root-finding costs, which are incurred only once in the computation. Naturally, B’s further guessing of s′ and of the correct root erodes its probability of making a correct forgery even it has correctly guessed an early ball is about to be added to a bin slot, but it is easy to bound this erosion: B has chance at least of correctly guessing s′ and chance at least 1/r of correctly guessing the root. Thus, if Q1−k/t or more bins each receive k or more early balls for some 1 ≤ k ≤ t and if B is using its ‘early prevention’ strategy (which we have just finished describing), then B has chance at least

of forging. As B uses this strategy with probability 1 2 , we can therefore assume that fewer than bins receive k early balls for every 1 ≤ k ≤ t, or else B already reaches the requisite probability of success of . We now discuss B’s ‘late prevention’ strategy. Here B attempts to prevent A from filling a bin with t balls by guessing the arrival of late balls. We note that, if a query results in some late ball being placed in the l-th slot of bin s, then (by definition of ‘late’) and so the values are already known prior to the answer of the query . Moreover the fact that the query results in a late ball appearing in bin s means there is some such that (i) for some , (ii) the queries have already been made for , and (iii) (the value will become known when is answered). Let and , all of which are known to B except . Then, if B has correctly guessed a late ball is going to appear in the ℓ-th slot of bin s and has correctly guessed the value of , it can predict by solving


, for which there are at most r solutions. (This is the second (and last) place we require the ‘Invertibility’ property of p.) Given these observations, the following balls-and-bins game clearly models B’s ‘late prevention’ task, up to but not including guessing the root of (7):

‘LATE PREVENTION’ BALLS-AND-BINS GAME. This game is played between two adversaries and . Parameters are integers t, q′,Q ≥ 1. Rules are as follows:

  • The game involves “bins” with t slots each, where each slot can contain either contain a ball or not. At the

beginning of the game, there are no bins. Bins are added to the game as described below, and never removed.

  • The game proceeds in q′ rounds, each of which consists of t “phases”.
  • At the beginning of round i, A announces some number ≥ 0 such thatPj≤i ≤ Q. If = 0, the round

is skipped.

  • At phase l of round i, 1 ≤ l ≤ t, A chooses a subset (possibly empty) of the currently existing bins that do

not yet have a ball in their l-th slot, and places balls in all of their l-th slots, simultaneously. Moreover, A labels each ball placed with a number from 1 to . (Multiple balls with the same label are allowed, and not all labels are required to appear.)

  • At the end of round i, A introduces vi new bins to the game, each possibly already containing some balls.

Throughout the game, the total number of new bins introduced with k or more balls already in them must be less than for all 1 ≤ k ≤ t.

  • Before each phase of each round, B is allowed to secretly predict a bin that will receive a ball at that phase

and a label for that ball; B wins if it guesses both correctly. It is only allowed to make one guess during the game

  • A must fill some bin with t balls by the end of the game.

We note that the new bins introduced at the end of round i correspond to the elements of and that corresponds to . The “label” for a ball placed in a bin s at phase ℓ corresponds to an element such that , discussed above. (In the ‘real game’ between B and A several such elements s′ may exist, so that more accurate modelization would allow A to choose a nonempty list of labels rather than a single label for each ball; however, seeking to minimize the guessing advantage of B, A would automatically make each of these lists a singleton anyway.)

In Proposition 2 of Appendix A we exhibit a strategy for B in the ‘late prevention’ game that succeeds with probability regardless of A’s strategy. The ‘late prevention’ strategy of B consists simply of coupling the B-strategy of Proposition 2 with a guessing of the root of (7). Thus, as long as fewer than Q1−k/t bins receive k or more early balls for 1 ≤ k ≤ t, as long as A fills some bins with t balls and as long as B uses its late prevention strategy, B has chance at least

of forging. Since B uses the ‘late prevention’ strategy with probability 1 2 , this concludes the proof.


Replacing g in Lemma 4 by our cover-free function and using Theorem 2 with m = 3n, we obtain:

Theorem 3 Let be a , let as a keyed function family of key space like in Theorem 2. Define by (1), (2) with . Then


where and as long as q ≤ K, and where

In particular, when t = n and (i.e. ) and q ≤ K we have


By default we shall apply the second part of Theorem 3, choosing t = n. In order to interpret (9) we need to know what values of and K are achievable via IRFFs and to know for those IRFFs, as this term dominates Tmac. The question of instantiating the IRFF was already studied by Maurer and Tessaro, who reduced it to the construction of certain types of highly unbalanced bipartite expander graphs. While well-studied, these types of expander graphs are not yet completely understood, and in particular the setting of parameters relevant to our case has not been the object of much attention. Here we mention bounds achieved by two explicit constructions as well as those achieved by a non-explicit, probabilistic construction. In all cases we set m = 3n. We note that can always be upper bounded by by appending three functions to the IRFF that read off each block of input via the identity. Moreover, we can easily enforce condition (i) of Definition 1 as long as r ≤ 2n. Since the family sizes r in question are anyway polynomial in n, we assume these tweaks without further mention.

Existential construction. A probabilistic construction [24] achieves a with math>\ r=O \left (n \right ), \delta \approx 1 ~ \, \! </math> and . In this case . Then the right-hand side of (9) becomes


Assuming achieves query security up to . However, this construction is inexplicit.

Expanders of [29]. Expanders of Ta-Shma, Umans and Zuckerman yield an explicit with and . In this case Q = qpoly(n). The right-hand side of (9) becomes


Assuming we can then achieve query security up to . (We note this construction is strictly better from all standpoints than the one presented by Maurer and Tessaro.)

Expanders of [30]. Expanders of Guruswami, Umans and Venkatesan yield an explicit with and for any (0, 1). In this case .

We can set . For constant the right-hand side of (9) again becomes


Assuming the insecurity thus remains negligible as long as . The advantage of this construction is that it affords efficient inversion time of O() (as opposed to for the previous two constructions).

Interpretation. The assumption is only realistic as long as does not allow to do an exhaustive search over the key space of f; assuming the latter has size , this implies that our upper bounds are only meaningful if (since is dominated by ). The first two constructions, which are only known to have , therefore only give a meaningful bound for . Thus, with the current understanding of , they might become beyond birthday only if (and approach only if ). However, the last construction, having , yields beyond-birthday security even if κ = n, which is the case of AES-128. Once again, though, we stress that the current limitations of our approach are due only to the limitations in the current constructions of expander graphs, and are not related to any “cryptographic” difficulties. Needless to say, future advances in the constructions of expander graphs will not only improve our parameters, but will likely have other applications in many areas of theoretical computer science.

Heuristic Instantiation. In practice, we expect a variety of heuristic instantiations to potentially approach the IRFF parameters of the non-explicit construction, most important of which is the setting of , which directly affects the rate and the efficiency. Here is one such construction based on block ciphers. We simply implement eachmath>\ E_i : \left \{0,1 \right \}^3n \to \left \{0,1 \right \}^n ~ \, \! </math> as the XOR of three (independently keyed) fixed key block ciphers, i.e. . We note that in this case the 3r keys do not constitute key material, but rather fixed constants of the construction. We conjecture that, for a good enough block cipher, this construction might achieve the parameters , (or possibly even ) and .

Authors' contacts

Cite error: Invalid <references> tag; parameter "group" is allowed only.

Use <references />, or <references group="..." />


  1. 1.0 1.1 Bellare,M., Kilian, J., Rogaway, P.: The security of cipher block chaining. In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 341–358. Springer, Heidelberg (1994)
  2. 2.0 2.1 Petrank, E., Rackoff, C.: CBC MAC for Real-Time Data Sources. J. Cryptology 13(3), 315– 338 (2000)
  3. 3.0 3.1 Bellare, M.: New Proofs for NMAC and HMAC: Security without Collision-Resistance. In: Dwork, C. (ed.) CRYPTO 2006. LNCS, vol. 4117, pp. 602–619. Springer, Heidelberg (2006)
  4. 4.0 4.1 Bellare, M., Canetti, R., Krawczyk, H.: Keying Hash Functions for Message Authentication. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 1–15. Springer, Heidelberg (1996)
  5. 5.0 5.1 5.2 5.3 5.4 Dodis, Y., Pietrzak, K., Puniya, P.: A New Mode of Operation for Block Ciphers and Length- Preserving MACs. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 198–219. Springer, Heidelberg (2008)
  6. 6.0 6.1 6.2 6.3 6.4 Dodis, Y., Puniya, P.: Feistel Networks Made Public, and Applications. In: Naor, M. (ed.) EUROCRYPT 2007. LNCS, vol. 4515, pp. 534–554. Springer, Heidelberg (2007)
  7. 7.0 7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 Dodis, Y., Steinberger, J.: Message Authentication Codes from Unpredictable Block Ciphers. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 267–285. Springer, Heidelberg (2009), full version pdf
  8. Gandolfi, K., Mourtel, C., Olivier, F.: Electromagnetic analysis: Concrete results. In: Koc., C. .K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 251–261. Springer, Heidelberg (2001)
  9. Halderman, J.A., Schoen, S.D., Heninger, N., Clarkson,W., Paul,W., Calandrino, J.A., Feldman, A.J., Appelbaum, J., Felten, E.W.: Lest we remember: cold-boot attacks on encryption keys. Commun. ACM 52(5), 91–98 (2009)
  10. Kocher, P.C.: Timing attacks on implementations of diffie-hellman, rsa, dss, and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 104–113. Springer, Heidelberg (1996)
  11. Kocher, P.C., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)
  12. Quisquater, J.-J., Samyde, D.: ElectroMagnetic analysis (EMA): Measures and countermeasures for smart cards. In: Attali, I., Jensen, T.P. (eds.) E-smart 2001. LNCS, vol. 2140, pp. 200–210. Springer, Heidelberg (2001)
  13. An, J.H., Bellare, M.: Constructing VIL-MACs from FIL-MACs: Message Authentication under Weakened Assumptions. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 252–269. Springer, Heidelberg (1999)
  14. 14.0 14.1 Preneel, B., van Oorschot, P.C.: MDx-MAC and Building Fast MACs from Hash Functions. In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 1–14. Springer, Heidelberg (1995)
  15. Shrimpton, T., Stam, M.: Building a Collision-Resistant Compression Function from Non- Compressing Primitives. In: Aceto, L., Damg°ard, I., Goldberg, L.A., Halld.orsson, M.M., Ing.olfsd.ottir, A.,Walukiewicz, I. (eds.) ICALP 2008, Part II. LNCS, vol. 5126, pp. 643–654. Springer, Heidelberg (2008)
  16. Yasuda, K.: A double-piped mode of operation for MACs, PRFs and PROs: Security beyond the birthday barrier. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 242–259. Springer, Heidelberg (2009)
  17. Lee, J., Steinberger, J.: Multi-property-preserving domain extension using polynomial-based modes of operation. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 573–596. Springer, Heidelberg (2010)
  18. Aiello, W., Venkatesan, R.: Foiling birthday attacks in length-doubling transformations — Benes: a non-reversible alternative to Feistel. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 307–320. Springer, Heidelberg (1996)
  19. 19.0 19.1 Bellare, M., Goldreich, O., Krawczyk, H.: Stateless Evaluation of Pseudorandom Functions: Security beyond the Birthday Barrier. In:Wiener,M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 270–287. Springer, Heidelberg (1999)
  20. Maurer, U., Pietrzak, K.: The security of Many-Round Luby-Rackoff Pseudo-Random Permutations. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 544–561. Springer, Heidelberg (2003)
  21. Patarin, J.: Luby-Rackoff: 7 rounds are enough for 2n(1−ε) security. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 513–529. Springer, Heidelberg (2003)
  22. Patarin, J.: Security of Random Feistel Schemes with 5 or More Rounds. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 106–122. Springer, Heidelberg (2004)
  23. Patarin, J., Montreuil, A.: Benes and Butterfly Schemes Revisited. In: ISISC 2005 (2005)
  24. 24.0 24.1 24.2 24.3 24.4 24.5 24.6 24.7 24.8 Maurer, U., Tessaro, S.: Domain Extension of Public Random Functions: Beyond the Birthday Barrier. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 187–204. Springer, Heidelberg (2007)
  25. Coron, J.-S., Dodis, Y., Malinaud, C., Puniya, P.: Merkle-Damg°ard Revisited: How to Construct a Hash Function. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 430–448. Springer, Heidelberg (2005)
  26. Maurer, U., Renner, R., Holenstein, R.: Indifferentiability, impossibility results on reductions, and applications to the random oracle methodology. In: Naor, M. (ed.) TCC 2004. LNCS, vol. 2951, pp. 21–39. Springer, Heidelberg (2004)
  27. Lucks, S.: A failure-friendly design principle for hash functions. In: Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 474–494. Springer, Heidelberg (2005)
  28. 28.0 28.1 Dodis, Y., Steinberger, J.: Domain Extension for MACs Beyond the Birthday Barrier. In: EUROCRYPT 2011. LNCS, vol. 6632, pp. 323–342. Springer, Heidelberg (2011), full version of this paper
  29. Ta-Shma, A., Umans, C., Zuckerman, D.: Lossless Condensers, Unbalanced Expanders, And Extractors. Combinatorica 27(2), 213–240 (2007)
  30. Guruswami, V., Umans, C., Vadhan, S.P.: Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes. J. ACM 56(4) (2009) 342 Y. Dodis and J. Steinberger

Cite error: <ref> tags exist for a group named "@:", but no corresponding <references group="@:"/> tag was found, or a closing </ref> is missing