Stronger Security Proofs for RSA and Rabin Bits

R. Fischlin and C.P. Schnorr
Fachbereich Mathematik/Informatik, Universität Frankfurt
PSF 111932, 60054 Frankfurt/Main, Germany
email: fischlin, [email protected]
URL: http://www.mi.informatik.uni-frankfurt.de
January 8, 1999

Abstract

The RSA and Rabin encryption functions are respectively defined as $E_N(x) = x^e \bmod N$ and $E_N(x) = x^2 \bmod N$, where $N$ is a product of two large random primes $p, q$ and $e$ is relatively prime to $\varphi(N)$. We present a simpler and tighter proof of the result of Alexi, Chor, Goldreich and Schnorr [ACGS88] that the following problems are equivalent by probabilistic polynomial time reductions: (2) given $E_N(x)$ find $x$; (3) given $E_N(x)$ predict the least-significant bit of $x$ with success probability $\frac12 + \frac{1}{\mathrm{poly}(n)}$, where $N$ has $n$ bits. The new proof consists of a more efficient algorithm for inverting the RSA/Rabin-function with the help of an oracle that predicts the least-significant bit of $x$. It yields provable security guarantees for RSA-message bits and for the RSA-random number generator for moduli $N$ of practical size.

1 Introduction

Randomness is a fundamental computational resource and the efficient generation of provably secure pseudo-random bits is a basic problem. Yao [Y82] and Blum, Micali [BM84] have shown that perfect random number generators (RNG) exist under reasonable complexity assumptions. Some perfect RNG's are based on the RSA-function $E_N(x) = x^e \bmod N$ and the Rabin-function $E_N(x) = x^2 \bmod N$, where the $n$-bit integer $N$ is a product of two large random primes $p, q$ and $e$ is relatively prime to $\varphi(N) = (p-1)(q-1)$ and $e \neq 1 \bmod \varphi(N)$. The corresponding RNG transforms a random seed $x_0 \in [1, N)$ into a bit string $b_1, \ldots, b_m$ of arbitrary polynomial length $m = n^{O(1)}$ according to the recursion $b_i := x_i \bmod 2$, $x_i := E_N(x_{i-1})$. The security of these RNG's was established in a series of works [GMT82, BCS83, ACGS88, VV84]: the RSA/Rabin-function can be inverted in polynomial time if one is given an oracle which predicts from given $E_N(x)$ the least-significant bit of $x$ with success probability $\frac12 + \frac{1}{\mathrm{poly}(n)}$. While the ACGS-result shows that the RSA/Rabin RNG is perfect in an asymptotic sense, the practicality of this result has been questionable as the transformation of attacks against these RNG's into a full inversion of the RSA/Rabin-function (resp. the factorization of $N$) is rather slow.

The main contribution of this paper is a much simpler and stronger proof of the ACGS-result. The new proof gives a more efficient reduction from bit prediction to full inversion of the RSA/Rabin-function. It yields a security guarantee for moduli $N$ of practical size.

Notation. Let $N$ be a product of two large primes $p, q$ with $2^{n-1} < N < 2^n$. Let $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$ be the ring of integers modulo $N$ and let $\mathbb{Z}_N^*$ denote the multiplicative subgroup of invertible elements in $\mathbb{Z}_N$. We represent elements $x \in \mathbb{Z}_N$ by their least nonnegative residue in the interval $[0, N)$, i.e., $\mathbb{Z}_N = [0, N)$. We let $[ax]_N \in [0, N)$ denote the least non-negative residue of $ax \pmod N$. We use $[ax]_N$ for arithmetic expressions over $\mathbb{Z}$ while the arithmetic for $a, x \in \mathbb{Z}_N = [0, N)$ is done modulo $N$. We let $\ell(z) = z \bmod 2$ denote the least-significant bit of $z \in \mathbb{Z}_N$. Let $e$ be relatively prime to $\varphi(N) = (p-1)(q-1)$ and $e \neq 1 \bmod \varphi(N)$. The RSA cryptosystem enciphers a message $x \in \mathbb{Z}_N$ into $E_N(x) = x^e \bmod N$. Let $O_1$ be an oracle running in expected time $T$ which, given $E_N(x)$ and $e, N$, predicts the least-significant bit $\ell(x)$ of $x$ with advantage $\varepsilon$: $\Pr_{x,w}[\,O_1 E_N(x) = \ell(x)\,] \ge \frac12 + \varepsilon$, where the probability refers to random $x \in_R [0, N)$ and the internal coin tosses $w$ of the oracle. We assume that the time $T$ of the oracle also covers the time for the evaluation of the function $E_N$. Throughout the paper we assume that $\varepsilon^{-1}$, $n$ are powers of 2 and $n \ge 2^9$. We let $\lg$ denote the logarithm function with base 2. All intervals $[0, N)$, $[0, 8\varepsilon^{-1})$ etc. are over the integers. For a finite set $A$ let $b \in_R A$ denote a random element of $A$ that is uniformly distributed. All time bounds count arithmetic steps using integers with $\lg(n\varepsilon^{-1}) + O(1)$ bits. We use integers of that size for counting the votes in majority decisions.

Halving approximations via binary division. Consider the problem of computing $x \in \mathbb{Z}_N$ from $E_N(x)$ and $N$ with the help of the oracle $O_1$ for $\ell(\cdot)$, but without knowing the factorization of $N$. The new method inverts $E_N$ by iteratively halving approximations $uN$ of random multiples $[ax]_N$ with known multiplier $a$, via binary division, see figure 1.

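Figure 1: binary division. [Figure not reproduced: an interval in $[0, N)$ containing $[ax]_N$, and the half-width intervals containing $[2^{-1}ax]_N$ for the two cases $\ell(ax) = 0$ and $\ell(ax) = 1$.]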

The basic idea is that given an interval containing $[ax]_N$, an interval of half width containing $[\frac12 ax]_N$ can be computed given the least significant bit $\ell(ax)$ of $[ax]_N$. Figure 1 shows these two intervals; the half width interval is shown for the two values of $\ell(ax)$. Repeating this process for at most $n$ iterations, the interval narrows down to containing exactly one element $[2^{-n}ax]_N$. At this point, $ax$ and therefore $x$ can be found exactly. More formally we have $[\frac12 ax]_N = \frac12 [ax]_N$ for even $[ax]_N$ and $[\frac12 ax]_N = \frac12([ax]_N + N)$ for odd $[ax]_N$. Given the approximation $uN$ for $[ax]_N$ we approximate $[\frac12 ax]_N$ by $\frac12(u + \ell(ax))N$. One binary division halves the approximation error: $[\frac12 ax]_N - \frac12(u + \ell(ax))N = \frac12(\,[ax]_N - uN\,)$. The bits $\ell(ax)$ can be obtained for known $a$'s with the help of $O_1$ as in [ACGS88]. Actually we improve the relevant procedure in various ways.

Binary division has already been used by Goldwasser, Micali, Tong [GMT82] together with a perfect oracle. But we face the difficulty that the oracle is faulty with a small advantage. The subsequent works [BCS83], [ACGS88] have replaced binary division by the binary gcd-algorithm of Brent and Kung [BK83]. We reintroduce binary division, together with halving approximations, into the setup of [ACGS88]. A cornerstone is the majority decision via pairwise independent sampling as introduced in [ACGS88]. We introduce the canonical multipliers $2^{-t}a$ into this sampling method. This leads to an algorithm for RSA-inversion that is much more uniform than the ACGS-algorithm because $2^{-t}a$ does not depend on the dynamics of the algorithm. This higher uniformity is the basis of our optimizations.

Our results. In Section 2 we present the core of the new algorithm for inverting the RSA-function $E_N$. It runs in expected time $O(n^2\varepsilon^{-2}T + n^2\varepsilon^{-6})$, where $T$ is the time and $\varepsilon$ the advantage of oracle $O_1$. The expectation refers to the internal coin tosses of the oracle and of the inversion algorithm. This improves the ACGS-time bound of $O(n^3\varepsilon^{-8}T)$ for RSA-inversion using such $O_1$. The new time bound differentiates the costs $O(n^2\varepsilon^{-2}T)$ induced by the oracle calls and all other steps $O(n^2\varepsilon^{-6})$, which we call the additional overhead.
Subsequent optimizations in Sections 3 and 4 minimize the number of oracle calls and reduce the additional overhead. Our security result extends to the $j$-th least-significant message bit for arbitrary $j$. In the extension to arbitrary $j$ the additional overhead is proportional to $2^{2j}$, whereas the number of oracle calls does not depend on $j$.

In Section 3 we introduce the subsample majority rule, a trick that improves the efficiency of majority decisions via pairwise independent votes. In various applications it is computationally easy to generate pairwise independent votes as the base of a majority decision, while mutual independence is not available or too expensive. Following [ACGS88] we can easily generate with the help of the oracle pairwise independent 0,1-valued votes that each have an advantage $\varepsilon$ in predicting the target bit $\ell(a_t x)$. A large sample size $m$ is necessary in order to make the error probability $\frac{1}{m\varepsilon^2}$ of the majority decision sufficiently small. To reduce the computational costs of the large sample we only use a small random subsample of it. The votes of the random subsample are mutually independent, and so a smaller subsample suffices to maintain the error probability of a majority decision. The time for the subsample majority decision reduces to the size of the small subsample. The large sample is only mentally used for the analysis; it does not enter into the computation. Using this trick we gain a factor $\frac{n}{\lg n}$ in the number of oracle calls and in the time for RSA-inversion. The reduced number of oracle calls is optimal up to a factor $O(\lg n)$.

In Section 4 we process all possible initial guesses for the approximate locations of $[ax]_N, [bx]_N$ simultaneously. This reduces the additional overhead in the time for RSA-inversion to $O(n^2\varepsilon^{-4}\lg(n\varepsilon^{-1}))$. Section 5 contains conclusions for the security of RSA-message bits and of the RSA-random number generator for moduli $N$ of practical size. These conclusions are preliminary as the additional overhead can be further reduced. In Section 6 we extend the oracle algorithm from inverting the RSA-function to inverting the Rabin-function and we derive a security guarantee for the $x^2 \bmod N$-generator under the assumption that factoring is hard. We consider two versions of Rabin's function, improving previously known results of [ACGS88], [VV84].
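The binary division idea from the introduction can be sketched for the idealized case of a perfect oracle, as in [GMT82]. The Python sketch below is illustrative only: the toy modulus, the helper names `make_perfect_lsb_oracle` and `invert_with_perfect_oracle`, the fixed multiplier $a = 1$, the trivial start value $u_0 = \frac12$, and the simulation of the oracle via the secret exponent `d` are all assumptions made for the example. The algorithm of this paper has to cope with a faulty oracle and is not reproduced here.

```python
from fractions import Fraction
from math import floor


def make_perfect_lsb_oracle(d, N):
    """Hypothetical perfect oracle for l(x): it cheats by decrypting with d."""
    return lambda y: pow(y, d, N) & 1


def invert_with_perfect_oracle(y, e, N, oracle):
    """Recover x from y = E_N(x) by n binary divisions (multiplier a = 1)."""
    n = N.bit_length()              # 2^(n-1) <= N < 2^n
    half = pow(2, -1, N)            # 2^(-1) mod N exists because N is odd
    a_t = 1                         # a_0 = a = 1
    u = Fraction(1, 2)              # u_0 with |[a x]_N / N - u_0| <= 1/2
    for _ in range(n):
        # E_N(a_t x) = a_t^e * E_N(x) mod N, so the oracle yields l(a_t x).
        bit = oracle(pow(a_t, e, N) * y % N)
        a_t = a_t * half % N        # next multiplier a_t / 2 mod N
        u = (u + bit) / 2           # halves the approximation error
    # Now |[a_n x]_N - u N| <= N / 2^(n+1) < 1/2, so rounding is exact.
    a_n_x = floor(u * N + Fraction(1, 2))
    return a_n_x * pow(a_t, -1, N) % N


# Toy parameters (assumed for illustration): N = 61 * 53, e = 17, d = 2753.
N, e, d = 61 * 53, 17, 2753
oracle = make_perfect_lsb_oracle(d, N)
recovered = invert_with_perfect_oracle(pow(1234, e, N), e, N, oracle)
```

Exact rationals (`Fraction`) are used so that the halving of the error is not blurred by floating-point rounding; `pow(2, -1, N)` requires Python 3.8 or later.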

2 RSA-inversion by binary division

We introduce a novel method for inverting the RSA-function with the help of an oracle $O_1$ that predicts the least-significant message bit with advantage $\varepsilon$, but without knowing the factorization of $N$. The core of the new method is the algorithm RSA-inversion presented below.

High level description of RSA-inversion. We want to compute $x$ from $E_N(x)$ using oracle $O_1$ that has an $\varepsilon$-advantage for $\ell(x)$. We invert $E_N(x)$ using the method of binary division explained in the introduction. By that method we get from an approximate location $uN$ of $[ax]_N$ approximate locations $u_t N$ of $[a_t x]_N$ where the error of $u_{t+1}N$ is only half that of $u_t N$. Formally let $a_0 = a$, $u_0 = u$, $a_t = [2^{-t}a]_N$ and $u_t = \frac12(u_{t-1} + \ell(a_{t-1}x))$. The main work of stage $t$ is to determine the bit $\ell(a_t x)$ by majority decision using oracle $O_1$. For this we use a second independent multiplier $b$ and an approximate location $vN$ of $[bx]_N$. So upon initiation the algorithm picks two random multipliers $a, b \in \mathbb{Z}_N$; it guesses $\ell(ax)$, $\ell(bx)$ and approximate locations $uN$, $vN$ of $[ax]_N$, $[bx]_N$. More precisely, it guesses the closest rationals $u, v$ to $\frac1N[ax]_N$, $\frac1N[bx]_N$ so that $8\varepsilon^{-3}u$, $8\varepsilon^{-1}v$ are integers.

Our majority decision for $\ell(a_t x)$ further develops the procedure of [ACGS88]. It predicts $\ell(a_t x)$ using $O_1(E_N(c_{t,i}x))$ for multipliers $c_{t,i} =_{\mathrm{def}} a_t(1+2i) + b \in \mathbb{Z}_N$, where $i$ ranges over the set $A_m =_{\mathrm{def}} \{i : |1+2i| \le m\}$ of size $m$. Note that the equation $[c_{t,i}x]_N = [(1+2i)a_t x + bx]_N$ induces a uniquely defined integer $w_{t,i}$ with $|w_{t,i}| \le m$ satisfying

$[c_{t,i}x]_N = [a_t x]_N(1+2i) + [bx]_N - w_{t,i}N.$ (1)

Hence $\ell(c_{t,i}x) = \ell(a_t x)(1+2i) + \ell(bx) - w_{t,i}N = \ell(a_t x) + \ell(bx) + w_{t,i} \bmod 2$. If $w_{t,i}$ and $\ell(bx)$ are given then $\ell(c_{t,i}x)$ and $\ell(a_t x)$ are linearly related. Therefore a prediction of $\ell(c_{t,i}x)$ yields a prediction of $\ell(a_t x)$ with the same advantage. Importantly, the least significant bits of $[a_t x]_N(1+2i)$ and of $[a_t x]_N$ coincide because $1+2i$ is odd. An even factor $2i$ would cancel out the least significant bit of $[a_t x]_N$ in Equation 1, and thus the least significant bits of $[a_t x]_N$ and of $[(a_t(2i) + b)x]_N$ are uncorrelated. This shows that multipliers of the form $a_t(2i) + b$ are useless.

We compute $w_{t,i}$ from the approximations $u_t$ and $v$, subject to some error due to the inaccuracy of $u_t, v$. We call the computed $w_{t,i}$ correct if it coincides with the $w_{t,i}$ defined by Equation 1. The $i$-th measurement of the majority decision for $\ell(a_t x)$ takes $O_1(E_N(c_{t,i}x))$ as a guess for the left-hand side $\ell(c_{t,i}x)$, computes $\ell(bx) + w_{t,i} \bmod 2$ for the known part of the right-hand side, and guesses $\ell(a_t x)$ accordingly: "$\ell(a_t x) = 0$" iff $O_1 E_N(c_{t,i}x) = \ell(bx) + w_{t,i} \bmod 2$. This is correct if the oracle reply $O_1 E_N(c_{t,i}x)$ and $w_{t,i}$ are both correct. The majority decision for $\ell(a_t x)$ makes measurements over the $m$ points $c_{t,i}$; it samples over the oracle replies $O_1 E_N(c_{t,i}x)$ for $i \in A_m$. A main point will be to have the errors of the measurements pairwise independent for distinct $i$. This pairwise independence is induced by the pairwise independence of the multipliers $c_{t,i} = 2^{-t}a(1+2i) + b \in \mathbb{Z}_N$ for random $a, b \in \mathbb{Z}_N$ and fixed $t$.

The number of measurements. What is a good choice for the number $m_t$ of measurements at stage $t$? On the one hand we want to minimize the number of oracle calls $\sum_{t=1}^{n} m_t$ over all stages. On the other hand we need that the error probability of the guessed $\ell(a_t x)$, summed over the stages $t = 1, \ldots, n$, is bounded away from 1. That error probability depends on $m_t$ and on the error $|u_t - \frac1N[a_t x]_N|$ of stage $t$. Suppose we initially guess $u$ so that the approximation error $|u - \frac1N[ax]_N|$ is at most $\delta$. Then we have via binary division $|u_t - \frac1N[a_t x]_N| \le 2^{-t}\delta$ at stage $t$; we justify the choice $\delta = \frac{\varepsilon^3}{16}$ below. The numerical error induced into the above computation of $w_{t,i}$ is at most $2\,|1+2i|\,2^{-t}\delta$. Using $|1+2i| \le m_t$ this error is at most $2\,m_t\,2^{-t}\delta$. In order to preserve the $\varepsilon$-advantage of oracle $O_1$ we require that $2\,m_t\,2^{-t}\delta \le \frac{\varepsilon}{4}$. Under these premises we show below that the majority decision for $\ell(a_t x)$ errs with probability at most $\frac{4}{9 m_t \varepsilon^2}$. So we need that $\sum_{t=1}^{n} \frac{4}{9 m_t \varepsilon^2} < 1$. This goal can be achieved by setting $m_t = n\varepsilon^{-2}$. However this implies that $\delta \le \frac{\varepsilon^3}{8n}$, and $\delta^{-1}$ is a factor of the additional overhead which we want to be small. An alternative choice is $m_t = 2^t\varepsilon^{-2}$, $\delta = \frac{\varepsilon^3}{16}$, which yields $\sum_{t=1}^{n} \frac{4}{9 m_t \varepsilon^2} < \frac49$. Here the problem is that we cannot have an exponential number $m_t = 2^t\varepsilon^{-2}$ of oracle calls for large $t$. To cap these costs we replace $2^t\varepsilon^{-2}$ for $t \ge 1 + \lg n$ by $2n\varepsilon^{-2}$. Thus our choice is $m_t := \min\{2^t, 2n\}\,\varepsilon^{-2}$. The deviation from $2^t\varepsilon^{-2}$ for $t \ge 1 + \lg n$ adds at most $\frac29$ to $\sum_{t=1}^{n} \frac{4}{9 m_t \varepsilon^2}$, and the number of oracle calls is at most $\sum_{t=1}^{n} m_t < 2n^2\varepsilon^{-2}$, i.e. it is polynomial in $n, \varepsilon^{-1}$. Our choice of $m_t, \delta$ saves a factor $n$ in the additional overhead compared to the choice $m_t = n\varepsilon^{-2}$, $\delta = \frac{\varepsilon^3}{8n}$.
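The choice of the sample sizes can be checked numerically. The following sketch, under the assumption that the per-stage sample size is $\min\{2^t, 2n\}\varepsilon^{-2}$ and the per-stage error bound is $4/(9 m_t \varepsilon^2)$ as stated above, sums the error bounds and the oracle calls over all stages. The function names and the sample parameters $n = 2^9$, $\varepsilon = 1/16$ are chosen only for illustration.

```python
from fractions import Fraction


def stage_sample_sizes(n, inv_eps):
    """m_t = min(2^t, 2n) * eps^-2 for t = 1..n, where inv_eps = 1/eps."""
    return [min(2 ** t, 2 * n) * inv_eps ** 2 for t in range(1, n + 1)]


def error_budget(n, inv_eps):
    """Sum over all stages of the per-stage bound 4 / (9 m_t eps^2)."""
    eps_sq = Fraction(1, inv_eps ** 2)
    sizes = stage_sample_sizes(n, inv_eps)
    return sum(Fraction(4) / (9 * m * eps_sq) for m in sizes)


n, inv_eps = 2 ** 9, 16
budget = error_budget(n, inv_eps)                    # total error bound
total_calls = sum(stage_sample_sizes(n, inv_eps))    # total oracle calls
within_error_budget = budget < Fraction(2, 3)
within_call_budget = total_calls < 2 * n ** 2 * inv_eps ** 2
```

Exact fractions are used so that the comparison with $\frac23$ does not depend on floating-point rounding.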
Novelties. We introduce the canonical multipliers $a_t, c_{t,i}$ and the calculation of $u_t, w_{t,i}$ via binary division. We recursively get the approximate location $u_t N$ of $[a_t x]_N$ as $u_t := \frac12(u_{t-1} + \ell(a_{t-1}x))$. This in turn yields $w_{t,i} := \lfloor u_t(1+2i) + v \rfloor$. We may get a faulty $w_{t,i}$ when $[a_t x]_N(1+2i) + [bx]_N$ is close to an integer multiple of $N$. The corresponding error of $w_{t,i}$ will be analyzed in detail.

Comparison with the ACGS-method. The ACGS-algorithm uses binary division within the gcd-calculation, which for given $a, b$ searches for integers $k, l$ so that $[(ak + bl)x]_N = 1$. That use of binary division gives away the advantage that binary division halves the approximation error. Our canonical multipliers $c_{t,i}$ provide higher uniformity than the dynamic multipliers $ak + bl$ that reduce $[(ak + bl)x]_N$ in the gcd-algorithm. The higher uniformity of the new method opens the door to various optimizations that improve the efficiency. The oracle is only queried about the canonical points $E_N(c_{t,i}x)$; this reduces the number of oracle calls to almost its information-theoretical minimum. Furthermore the new algorithm guesses the least-significant bits and approximate locations of two message multiples $[ax]_N, [bx]_N$ whereas the gcd-method requires four such multiples.
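The relation between $w_{t,i}$ and the least-significant bits can be verified on small numbers. The sketch below is an illustration under assumptions: the helper names are hypothetical, $N$ must be odd, and the exact positions $[a_t x]_N$, $[bx]_N$ are computed from a known $x$, which the real algorithm of course does not have. It checks Equation 1 and the parity identity, and shows how $w_{t,i}$ is obtained from approximate locations.

```python
from fractions import Fraction
from math import floor


def exact_w(N, x, a_t, b, i):
    """The integer w_{t,i} defined by Equation (1)."""
    ax, bx = a_t * x % N, b * x % N
    return (ax * (1 + 2 * i) + bx) // N


def approx_w(u_t, v, i):
    """w_{t,i} as computed from approximate locations u_t, v (rationals)."""
    return floor(u_t * (1 + 2 * i) + v)


def parity_relation_holds(N, x, a_t, b, i):
    """Check Equation (1) and l(c x) = l(a_t x) + l(b x) + w mod 2 for odd N."""
    c = (a_t * (1 + 2 * i) + b) % N
    ax, bx, cx = a_t * x % N, b * x % N, c * x % N
    w = exact_w(N, x, a_t, b, i)
    equation_1 = cx == ax * (1 + 2 * i) + bx - w * N
    parity = cx % 2 == (ax + bx + w) % 2
    return equation_1 and parity
```

With exact locations $u_t = \frac1N[a_t x]_N$ and $v = \frac1N[bx]_N$ the two ways of computing $w_{t,i}$ agree; with approximate locations they can differ only when $[a_t x]_N(1+2i) + [bx]_N$ lies close to a multiple of $N$.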


RSA-inversion

1. INPUT $E_N(x), N, \varepsilon$
   Initiation: Pick random integers $a, b \in_R \mathbb{Z}_N = [0, N)$, guess rationals $u \in \frac{\varepsilon^3}{8}[0, 8\varepsilon^{-3})$, $v \in \frac{\varepsilon}{8}[0, 8\varepsilon^{-1})$ satisfying $|\frac1N[ax]_N - u| \le \frac{\varepsilon^3}{16}$, $|\frac1N[bx]_N - v| \le \frac{\varepsilon}{16}$, set $a_0 := a$, $u_0 := u$. Guess $\alpha_0, \beta \in \{0, 1\}$ such that $\alpha_0 = \ell(ax)$ and $\beta = \ell(bx)$. (The above guesses are made at random and the condition refers to what we hope to be the outcome.)
2. FOR $t = 1$ TO $n$ DO
   $a_t := \frac12 a_{t-1}$; $u_t := \frac12(u_{t-1} + \alpha_{t-1})$; $m := \min\{2^t, 2n\}\,\varepsilon^{-2}$;
   $c_{t,i} := a_t(1+2i) + b$; $w_{t,i} := \lfloor u_t(1+2i) + v \rfloor$ for all $i \in A_m = \{i : |1+2i| \le m\}$;
   $z := \#\{i \in A_m \mid O_1 E_N(c_{t,i}x) = \beta + w_{t,i} \bmod 2\}$
   Majority decision: $\alpha_t := [\,0$ if $z \ge \frac m2$ and $1$ otherwise$\,]$
3. OUTPUT $x := a_n^{-1}\lfloor u_n N + \frac12 \rfloor \bmod N$

In the following analysis we use the conditional probability that step 1 guesses correctly. Furthermore, when analyzing stage $t$, we assume that $\alpha_i = \ell(a_i x)$ for all $i < t$. We refer to that condition as the right alternative. If we are in the right alternative, $u, v$ and $u_t$ are uniquely determined by $a, b$ (we eliminate ties by requiring $[ax]_N < uN + \frac{\varepsilon^3}{16}N$ and $[bx]_N < vN + \frac{\varepsilon}{16}N$). All probabilities refer to the random pair $(a, b) \in_R (\mathbb{Z}_N)^2$ and to the coin tosses of the oracle.

Halving the approximation error. In the right alternative the approximation error of $u_t N$ to $[a_t x]_N$ halves with each iteration: $[a_t x]_N = \frac12(\,[a_{t-1}x]_N + \ell(a_{t-1}x)N\,)$. Hence

$\frac1N[a_t x]_N - u_t = \frac1N[a_t x]_N - \frac12(u_{t-1} + \ell(a_{t-1}x)) = \frac12\left(\frac1N[a_{t-1}x]_N - u_{t-1}\right).$ (2)

Correctness of output. By Equation 2 RSA-inversion succeeds in the right alternative. In this case we have $|\frac1N[a_n x]_N - u_n| = 2^{-n}\,|\frac1N[a_0 x]_N - u_0| < \frac{1}{2N}$, since $2^{-n} < \frac1N$. Thus $a_n x = \lfloor u_n N + \frac12 \rfloor \bmod N$ and the output is correct.

Correctness of $w_{t,i}$. We call $w_{t,i}$ correct if $[c_{t,i}x]_N = [a_t x]_N(1+2i) + [bx]_N - w_{t,i}N$. Correct $w_{t,i}$ satisfy the equation $\ell(c_{t,i}x) = \ell(a_t x) + \ell(bx) + w_{t,i} \bmod 2$, where we use that $-w_{t,i}N = w_{t,i} \bmod 2$ holds for odd $N$.
Majority decision replaces in the latter equation $\ell(c_{t,i}x)$ by $O_1 E_N(c_{t,i}x)$ and determines $\ell(a_t x)$ so that the equation holds for the majority of the $i \in A_m$. The error probability of this majority decision is analyzed below.

Error probability of $w_{t,i}$. We show that $\Pr_{a,b}[w_{t,i} \text{ errs}] \le \varepsilon/4$, where "$w_{t,i}$ errs" means that it is not correct. The error probability of $w_{t,i}$ depends on the numerical error $\Delta_{t,i} =_{\mathrm{def}} (u_t(1+2i) + v) - \frac1N(\,[a_t x]_N(1+2i) + [bx]_N\,)$ of the rational number $u_t(1+2i) + v$. While the "correct" value of $w_{t,i}$ is the integer part of $\frac1N(\,[a_t x]_N(1+2i) + [bx]_N\,)$, we compute $w_{t,i}$ from $u_t(1+2i) + v$. If $w_{t,i}$ errs we must have $\frac1N|c_{t,i}x|_N \le |\Delta_{t,i}|$, where $|z|_N = \min(\,[z]_N, N - [z]_N\,)$ denotes the absolute value of $z$ in $\mathbb{Z}_N$. In fact, the numerical error $\Delta_{t,i}$ can only affect the mod $N$-reduction of $[a_t x]_N(1+2i) + [bx]_N$ if that integer has distance at most $N\,|\Delta_{t,i}|$ to the nearest integer multiple of $N$.

In the right alternative iterating Equation 2 yields $\frac1N[a_t x]_N - u_t = 2^{-t}(\frac1N[ax]_N - u)$. Using $2^{-t}\varepsilon^2|1+2i| \le 1$ for $i \in A_m$ and the triangle inequality we get

$|\Delta_{t,i}| = \left|\left(u_t(1+2i) - \frac1N[a_t x]_N(1+2i)\right) + \left(v - \frac1N[bx]_N\right)\right| \le \frac{\varepsilon}{16}\left(2^{-t}\varepsilon^2|1+2i| + 1\right) \le \frac{\varepsilon}{8}.$

So far we have shown that incorrect $w_{t,i}$ imply that $\frac1N|c_{t,i}x|_N \le |\Delta_{t,i}| \le \frac{\varepsilon}{8}$. Thus, the event $\mathrm{Err}_{t,i} =_{\mathrm{def}} [\,\frac1N|c_{t,i}x|_N \le \frac{\varepsilon}{8}\,]$ covers errors of $w_{t,i}$. As $c_{t,i}$ is random in $\mathbb{Z}_N$ we get $\Pr_{a,b}[\,w_{t,i} \text{ errs}\,] \le \Pr_{a,b}[\,\mathrm{Err}_{t,i}\,] = \frac{\varepsilon}{4}$.

Pairwise independence of oracle and approximation errors. The matrix of the $\mathbb{Z}_N$-linear transformation

$\begin{pmatrix} c_{t,i} \\ c_{t,j} \end{pmatrix} = \begin{pmatrix} 1 & 2^{-t}(1+2i) \\ 1 & 2^{-t}(1+2j) \end{pmatrix} \begin{pmatrix} b \\ a \end{pmatrix}$

has determinant $2^{-t+1}(j - i) \neq 0 \bmod N$ for $i \neq j$ with $|2i| < \min(p, q)$. This shows that the multipliers $c_{t,i}$ are pairwise independent for random $(a, b)$. The oracle errors as well as the events $\mathrm{Err}_{t,i}$ are pairwise independent since these events depend only on the multipliers $c_{t,i}$.

Remark. In [ACGS88] and [FS97] the approximation errors for $w_{t,i}$ have not been treated correctly. These errors are, the way they are defined, not pairwise independent for distinct measurements. This can be corrected by enlarging the approximation error to an event, like $\mathrm{Err}_{t,i}$, that only depends on the multipliers. The correction increases the constant factors in the time bounds. We thank D. Knuth for pointing out this mistake and O. Goldreich for his help in correcting it.

Error probability of the majority decision. The $i$-th measurement (the $i$-th guess of $\ell(a_t x)$) is correct if the oracle reply and $w_{t,i}$ are both correct. We define 0,1-valued error variables $X_i$ that cover the error of the $i$-th measurement: $X_i = 1$ iff $O_1 E_N(c_{t,i}x) \neq \ell(c_{t,i}x)$ OR $\mathrm{Err}_{t,i}$. The $X_i$'s are pairwise independent as the $c_{t,i}$'s are pairwise independent, and $\mathrm{Err}_{t,i}$ depends only on $c_{t,i}x$. We have $E[X_i] \le \frac12 - \frac34\varepsilon$ since the oracle has advantage $\varepsilon$ and $\mathrm{Err}_{t,i}$ has probability at most $\frac{\varepsilon}{4}$. Moreover $\mathrm{Var}[X_i] < \frac14$. A majority decision is correct iff the majority of the $m$ measurements is correct. A majority decision errs only if $\frac1m\sum_i X_i - \mu \ge \frac34\varepsilon$, where $\mu =_{\mathrm{def}} \frac1m\sum_i E[X_i] \le \frac12 - \frac34\varepsilon$. Chebyshev's inequality for the $m$ random variables $X_i$ with $i \in A_m$ shows that

$\Pr\left[\,\left|\frac1m\sum_i X_i - \mu\right| \ge \frac34\varepsilon\,\right] \le \left(\frac34\varepsilon\right)^{-2}\mathrm{Var}\left[\frac1m\sum_i X_i\right] \le \frac{4}{9m\varepsilon^2}.$

Here, the last inequality follows from the identity $\mathrm{Var}[\frac1m\sum_i X_i] = \sum_i \mathrm{Var}[X_i]/m^2$ which holds for any $m$ pairwise independent random variables $X_i$. At this point we need that the $X_i$ are pairwise independent.

As $m = \min\{2^t, 2n\}\,\varepsilon^{-2}$ the majority decision for $\ell(a_t x)$ errs with probability at most $\frac{4}{9 \cdot 2^t}$ for $t \le 1 + \lg n$ and with probability at most $\frac{2}{9n}$ for $t \ge 1 + \lg n$. Thus, the probability that some majority decision is wrong is at most $\sum_{t \ge 1}\frac{4}{9 \cdot 2^t} + (n - \lg n)\frac{2}{9n} \le \frac49 + \frac29 = \frac23$. Therefore, if step 1 guesses correctly, RSA-inversion succeeds with probability at least $\frac13$.

Running time. We give an upper bound for the expected number of steps required to compute $x$ for given $E_N(x)$ and $N$. We separately count the oracle calls and the other steps which form the additional overhead. The oracle is queried about $E_N(c_{t,i}x)$ for $t = 1, \ldots, n$ and $i \in A_m$. The oracle calls depend on $a, b$ but not on $u, v, \alpha_0, \beta$. So we keep $a, b$ fixed while we try all relevant possibilities for $u, v, \alpha_0, \beta$. As the algorithm has success rate $\frac13$ and calls the oracle at most $m \le 2n\varepsilon^{-2}$ times per stage, the expected number of oracle calls is at most $3 \cdot 2n^2\varepsilon^{-2}$. They require $3 \cdot 2n^2\varepsilon^{-2}T$ steps. Each majority decision contributes to the additional overhead at most $2n\varepsilon^{-2}$ steps that are performed with all oracle replies given. The algorithm does not need the exact rational $u_t$ and merely computes $w_{t,i} = \lfloor u_t(1+2i) + v \rfloor$ using $\lg(n\varepsilon^{-1}) + O(1)$ precision bits from $u_t$. We see that the additional overhead is at most the product of the following factors:

1. # of quadruples $(u, v, \alpha_0, \beta)$: $8\varepsilon^{-3} \cdot 8\varepsilon^{-1} \cdot 2 \cdot 2$
2. # of stages: $n$
3. # of steps per majority decision: $2n\varepsilon^{-2}$
4. the inverse of the success rate: $3$

Hence the additional overhead is at most $3 \cdot 2^9 n^2\varepsilon^{-6}$, and thus the expected time for RSA-inversion is $3n^2\varepsilon^{-2}(2T + 2^9\varepsilon^{-4})$.

Recall that our time bounds count arithmetic steps using integers with $\lg(n\varepsilon^{-1}) + O(1)$ bits. Integers of bit length $\lg(n\varepsilon^{-1}) + 1$ are used for counting the $n\varepsilon^{-2}$ votes of a majority decision. We use $\lg(n\varepsilon^{-1}) + 3$ precision bits from $u_t$ when we compute $w_{t,i}$. While the $n$ stages of RSA-inversion are done with $(\lg(n\varepsilon^{-1}) + 3)$-bit integers, we need $n$ precision bits of $u_n$ for the computation of the output. For this we store $n$ precision bits of each $u_t$. These bits incur only minor costs because we simply shift them at each stage.
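The product of the four factors can be written out as a short calculation. The sketch below only restates the arithmetic of the running-time analysis above; the function name `rsa_inversion_cost` and the convention of passing $\varepsilon^{-1}$ as the integer `inv_eps` are assumptions made for the example.

```python
def rsa_inversion_cost(n, inv_eps, T):
    """Bounds from the running-time analysis, with inv_eps = 1/eps.

    Returns (expected oracle calls, additional overhead, expected total time).
    """
    # Choices of (u, v, alpha_0, beta): 8 eps^-3 * 8 eps^-1 * 2 * 2.
    quadruples = (8 * inv_eps ** 3) * (8 * inv_eps) * 2 * 2
    stages = n
    steps_per_decision = 2 * n * inv_eps ** 2
    inverse_success_rate = 3
    overhead = quadruples * stages * steps_per_decision * inverse_success_rate
    # The oracle calls do not depend on the quadruple, only on a, b.
    oracle_calls = inverse_success_rate * stages * steps_per_decision
    return oracle_calls, overhead, oracle_calls * T + overhead


calls, overhead, total = rsa_inversion_cost(n=2 ** 9, inv_eps=16, T=1000)
```

The separation of the two terms makes visible that the oracle time $T$ multiplies only the $n^2\varepsilon^{-2}$ term, while the $\varepsilon^{-6}$ term is pure bookkeeping.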

Using an oracle for the $j$-th least-significant message bit. The $j$-th least-significant message bit $\ell_j(x)$ is called secure if $E_N$ can be inverted in polynomial time via an oracle $O_j$ that predicts $\ell_j(x)$ for given $E_N(x)$. Suppose that oracle $O_j$ predicts $\ell_j(x)$ with advantage $\varepsilon$ in expected time $T$. RSA-inversion using oracle $O_j$ for arbitrary $j$ proceeds in a similar way as for $j = 1$. It guesses initially $u, v$ and $L_j(ax), L_j(bx) \in [0, 2^j)$, the integers that consist of the $j$ least-significant bits of $[ax]_N, [bx]_N$. A main point is that the majority decision for $\ell_j(a_t x)$ takes into account carry-overs from the $j-1$ least-significant bits. By the linearity of $L_j$ the equation $L_j(c_{t,i}x) = L_j(a_t x)(1+2i) + L_j(bx) - w_{t,i}N \bmod 2^j$ holds for correct $w_{t,i}$. This implies the equation

$L_{j-1}(c_{t,i}x) + 2^{j-1}\ell_j(c_{t,i}x) = L_{j-1}(a_t x)(1+2i) + L_j(bx) + 2^{j-1}\ell_j(a_t x) - w_{t,i}N \bmod 2^j.$

In order to predict $\ell_j(a_t x)$ we replace in the latter equation $\ell_j(c_{t,i}x)$ by $O_j E_N(c_{t,i}x)$ and we recover $L_{j-1}(c_{t,i}x)$, $L_{j-1}(a_t x)$ recursively from the initial values $L_j(ax), L_j(bx)$, the approximate locations $uN, vN$ and $N$. We choose $\ell_j(a_t x)$ so that the equation holds for the majority of $i \in A_m$. We can apply binary division since $\ell(a_t x)$ is always given via $L_j(a_t x)$.

Next we consider how the time of RSA-inversion for the case of arbitrary $j$ compares to the particular case $j = 1$. The first factor $8^2\varepsilon^{-4}2^2$ of the additional overhead increases to the number of quadruples $(u, v, L_j(ax), L_j(bx))$, which is at most $8^2\varepsilon^{-4}2^{2j}$. Now the time bound for RSA-inversion via $O_j$ is $O(n^2\varepsilon^{-2}(T + 2^{2j}\varepsilon^{-4}))$ while it is $O(2^{4j}n^3\varepsilon^{-8}T)$ for the ACGS-algorithm. There is a double advantage in the new time bound. The factor $2^{4j}$ decreases to $2^{2j}$ and it only affects the additional overhead. The number of oracle calls and the additional overhead can be further reduced by the methods in Sections 3 and 4.

Simultaneous security of RSA-message bits. The $m$ least-significant message bits $L_m(x) \in [0, 2^m)$ are by definition simultaneously secure if given $E_N(x)$ they are polynomial-time indistinguishable from a random number $y \in_R [0, 2^m)$. In section 5.1 of [ACGS88] it is shown that the $O(\lg n)$ least-significant RSA-message bits are secure or else the RSA-function can be inverted in polynomial time. Here we give a stronger proof of this result, improving the ACGS-time bound of oracle RSA-inversion. Now RSA-inversion uses a distinguishing oracle $D$ which given $E_N(x)$ distinguishes $L_m(x)$ from a random $y \in_R [0, 2^m)$ at tolerance level $\delta$: $|\Pr[D(L_m(x), E_N(x)) = 1] - \Pr[D(y, E_N(x)) = 1]| \ge \delta$.

To extend the algorithm RSA-inversion from using oracle $O_j$ to using a distinguishing oracle $D$ we follow Section 5.1 of [ACGS88]. Yao has shown that every distinguishing algorithm $D$ yields for some $j \le m$ an oracle $O_j$ that predicts $\ell_j(x)$ when given $L_{j-1}(x)$ and $E_N(x)$, see [K97, Section 3.5, Lemma P1]. The time bound $T$ of $O_j$ is essentially the time bound of $D$ and its advantage is $\varepsilon = \delta/m$. Consider RSA-inversion via oracle $O_j$. For the prediction of $\ell_j(c_{t,i}x)$ the oracle $O_j$ must be given $L_{j-1}(c_{t,i}x)$. As this value is recovered from the initial guesses $L_j(ax), L_j(bx)$ and $uN, vN$, the oracle depends on the correctness of the initial guess.
In particular $O_j$ must be tried for all $8^2\varepsilon^{-4}2^{2j}$ tuples $(u, v, L_j(ax), L_j(bx))$, which increases the number of oracle calls by that factor. We see that the RSA-function can be inverted using the oracle $D$ in expected time $O(2^{2m}n^2m^6\delta^{-6}T)$. This improves the corresponding ACGS time bound $O(2^{4m}n^3m^8\delta^{-8}T)$. As the new time bound decreases the factor $2^{4m}$ to $2^{2m}$ it doubles the number $m$ of simultaneously secure least-significant RSA-message bits compared to the ACGS-result.

Another approach is to apply the XOR-Lemma of Vazirani, Vazirani [VV84, G95]. Let $H \subseteq \{1, \ldots, m\}$ be a random non-empty subset. A distinguishing algorithm $D$ yields an oracle $O_H$ which given $E_N(x)$ predicts $\sum_{k \in H}\ell_k(x) \bmod 2$ with advantage $\varepsilon = \delta\,2^{-m}$. The time bound $T$ of $O_H$ is essentially that of $D$, see the computational XOR-Proposition in [G95]. So we have an oracle $O_j$ with $\varepsilon$-advantage for the $j$-th message bit $\ell_j(x)$ for $j = \max(H)$, provided we know the $k$-th bit for all $k \in H \setminus \{j\}$. Recall that this condition holds when we apply the oracle inversion algorithm using $O_j$. This way RSA-inversion using oracle $D$ runs in expected time $O(2^{2m}n^2\delta^{-2}(T + 2^{4m}\delta^{-4}))$. Thus, the number of oracle calls reduces from $2^{2m}n^2m^6\delta^{-6}$ to $2^{2m}n^2m^2\delta^{-2}$ at the expense of an increased additional overhead of $O(2^{6m}n^2\delta^{-6})$.
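The carry equation for the $j$-th bit can be made concrete with a small sketch. The helper names below are hypothetical, $w_{t,i}$ is assumed to be correct, and the lower bits $L_{j-1}(a_t x)$ and $L_j(bx)$ are assumed to be known, as they are in the right alternative. Given a reply for $\ell_j(c_{t,i}x)$, the function solves the equation modulo $2^j$ for $\ell_j(a_t x)$; this illustrates a single measurement, not the full majority decision.

```python
def low_bits(z, j):
    """L_j(z): the integer formed by the j least-significant bits of z."""
    return z % (1 << j)


def bit_j(z, j):
    """l_j(z): the j-th least-significant bit of z (j = 1 is the parity)."""
    return (z >> (j - 1)) & 1


def predict_bit_j(N, j, i, w, L_prev_atx, L_j_bx, oracle_bit):
    """One measurement: solve the carry equation for l_j(a_t x).

    L_prev_atx is L_{j-1}(a_t x), L_j_bx is L_j(b x), w is w_{t,i}, and
    oracle_bit is the oracle's reply for l_j(c_{t,i} x).
    """
    known = L_prev_atx * (1 + 2 * i) + L_j_bx - w * N
    L_prev_cx = known % (1 << (j - 1))          # recovers L_{j-1}(c_{t,i} x)
    lhs = L_prev_cx + (oracle_bit << (j - 1))   # L_j(c_{t,i} x) if the reply is right
    return ((lhs - known) % (1 << j)) >> (j - 1)
```

If the oracle reply and $w_{t,i}$ are both correct, the returned value equals $\ell_j(a_t x)$; the majority decision then takes the most frequent value over $i \in A_m$.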

3 Subsample majority decision

According to the ACGS method we have generated a large sample of pairwise independent points $[(a_t(1+2i) + b)x]_N$ from two random points $[a_t x]_N, [bx]_N$. By that method we get via oracle $O_1$ a large number of pairwise independent guesses for the unknown bit $\ell(a_t x)$ given sufficiently close approximations of $[a_t x]_N, [bx]_N$. Next, we reduce the computational costs of the ACGS-method while maintaining the structure of the pairwise independent sample of ACGS: we introduce the subsample majority decision, a trick that reduces the number of oracle calls for RSA-inversion by a factor $\frac1n\lg n$. Consider the error variables $X_i$ that cover the error of the $i$-th measurement in a majority decision for $\ell(a_t x)$. The error probability of a majority decision is at most $\frac{4}{9m\varepsilon^2}$, so we need a large sample size $m$ to make this error small. To reduce the computational costs of the large sample we only use a small random subsample $A'_{m'}$ of $m' \ll m$ randomly selected $i \in A_m$. The randomly selected $X_{i(k)}$ are mutually independent, even though the original $X_i$ are merely pairwise independent. While the subsample induces only a small additional error probability, the time for the subsample majority decision is reduced from $m$ to $m'$. The large sample merely appears in the mental error analysis; it does not enter into the computation. We can even fix a random multiset $A'_{m'} \subset \{i : |1+2i| \le m\}$ for all SMAJ-calls, where a multiset is a set with multiplicities. Theorem 3 uses such a fixed multiset $A'_{m'}$.

Subsample Majority Decision (SMAJ). Pick a random $(i(1), \ldots, i(m')) \in_R (A_m)^{m'}$ and let $A'_{m'} = \{i(1), \ldots, i(m')\}$ with multiplicities. Decide that "$\ell(a_t x) = 0$" iff $O_1 E_N(c_{t,i(k)}x) = \ell(bx) + w_{t,i(k)} \bmod 2$ holds for at least half of the $k = 1, \ldots, m'$.

We randomly select $i(1), \ldots, i(m')$ with repetition from $A_m$ so that the $X_{i(k)}$ are mutually independent. Consider the error variables $X_i$ of section 2 with $E[X_i] \le \frac12 - \frac34\varepsilon$ and $\mathrm{Var}[X_i] < \frac14$ and let $X =_{\mathrm{def}} \frac1m\sum_{i \in A_m} X_i$. The SMAJ-rule errs only if $\frac{1}{m'}\sum_{k=1}^{m'} X_{i(k)} \ge \frac12$, i.e., if the majority of measurements of the subsample err. Since $E[X] \le \frac12 - \frac34\varepsilon$ the SMAJ-rule errs only if either $X \ge E[X] + \frac14\varepsilon$ or if $\frac{1}{m'}\sum_{k=1}^{m'} X_{i(k)} \ge X + \frac12\varepsilon$.

For the second event we let the values $X_i$ with $i \in A_m$ be fixed while the randomization of $X_{i(k)}$ is over the random selection $i(k) \in_R A_m$ of $X_{i(1)}, \ldots, X_{i(m')}$. Then the variables $X_{i(1)}, \ldots, X_{i(m')}$ are identically distributed and mutually independent with mean value $X$. We use the Hoeffding bound [H63] as in exercise 4.7 [MR95]:

Hoeffding's Bound. For fixed $X_i$'s and random $(i(1), \ldots, i(m')) \in_R (A_m)^{m'}$:

$\Pr\left[\,\frac{1}{m'}\sum_{k=1}^{m'} X_{i(k)} \ge X + \frac12\varepsilon\,\right] < \exp\left(-2m'\left(\tfrac12\varepsilon\right)^2\right).$
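The SMAJ rule itself is short enough to sketch directly. In the sketch below, `votes` is a hypothetical callback that returns the $i$-th 0/1 guess for $\ell(a_t x)$ (in the paper this guess is derived from an oracle reply and $w_{t,i}$), and `index_set` is a list standing in for $A_m$. Both names, and the use of Python's `random` module as the source of randomness, are assumptions for illustration.

```python
import math
import random


def smaj(votes, index_set, m_prime, rng=random):
    """Subsample majority: decide 0 iff at least half of m' sampled votes say 0.

    The m' indices are drawn from index_set uniformly with repetition, so the
    sampled votes are mutually independent once the full sample is fixed.
    """
    sample = [rng.choice(index_set) for _ in range(m_prime)]
    zeros = sum(1 for i in sample if votes(i) == 0)
    return 0 if 2 * zeros >= m_prime else 1


def hoeffding_term(m_prime, eps):
    """The bound exp(-2 m' (eps/2)^2) on the subsample mean exceeding X + eps/2."""
    return math.exp(-2 * m_prime * (eps / 2) ** 2)
```

Only $m'$ votes are evaluated, so only $m'$ oracle calls are made; the full index set of size $m$ is never enumerated beyond being available for sampling.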

Proposition 1. If the errors Xi are pairwise independent and E[Xi]  21 ? 34 " then SMAJ errs with probability at most m4"2 + exp(? 12 m0 "2 ). Proof. We have seen that the SMAJ{rule errs only if either have X  E[X ] + 14 " or P 0 Maxi Var[Xi ] m 1 1 4 m0 k=1 Xi(k)  X + 2 ". The probability of the rst event is at most m ("=4)2  m "2

by Chebyshev's inequality for the pairwise independent events $X_i$. Then the variables $X_{i(1)}, \dots, X_{i(m')}$ are identically distributed and mutually independent with mean value $X$. By Hoeffding's bound the probability of the second event is at most $\exp(-\tfrac12 m' \varepsilon^2)$. $\Box$

RSA-inversion using the SMAJ-rule. We modify the stages $t \ge 4 + \lg n$ of RSA-inversion as follows. Apply the SMAJ-rule and Proposition 1 with $m = 2^4 \varepsilon^{-2} n$ (rather than $m = 2\varepsilon^{-2} n$) and $m' = 2\varepsilon^{-2} \lg n$, and use the multipliers $c_{t,i}$ with $i \in A'_{m'} \subset A_m$. At stages $t \le 3 + \lg n$ we use $m = 2^t \varepsilon^{-2}$ and no subspace sampling. The algorithm of Section 4 starts at stage $4 + \lg n$. We have $\tfrac12 m' \varepsilon^2 = \lg n > 1.4426 \ln n$. A single SMAJ-call at stage $t \ge 4 + \lg n$ fails by Proposition 1 with probability $\frac{4}{m\varepsilon^2} + n^{-1.4426} < \frac{1}{3n}$ for $n \ge 2^9$. Thus, all SMAJ-calls of stages $t \ge 4 + \lg n$ succeed except with probability $\tfrac13$. They require at most $2n \lg n \, \varepsilon^{-2}$ oracle calls. These bounds are used in Section 4.

All majority decisions at stages $t \le 3 + \lg n$ have error probability $\tfrac49$. Thus RSA-inversion succeeds at least with probability $1 - \tfrac49 - \tfrac13 = \tfrac29$. Neglecting the $\sum_{t \le 3 + \lg n} 2^t \varepsilon^{-2} = 16 n \varepsilon^{-2}$ oracle calls of stages $t \le 3 + \lg n$ we get

Theorem 2. Using an oracle $O_1$ that, given $E_N(x)$ and $N$, predicts $\ell(x)$ with advantage $\varepsilon$ in time $T$, the RSA-function $E_N$ can be inverted in expected time $9n(\lg n)\,\varepsilon^{-2} \cdot (T + 2^8 \varepsilon^{-4})$.

A main point is that the number of oracle calls for RSA-inversion is at most $9n\varepsilon^{-2} \lg n$, whereas the ACGS-algorithm requires $(64)^3 \frac{\pi^2}{3} n^3 \varepsilon^{-8}$ oracle calls, with $(64)^3 \frac{\pi^2}{3} \approx 2^{19.7}$. We can further reduce in Theorem 2 the factor 9 to 2: by guessing upon initiation closer approximations $uN, vN$ to $[ax]_N, [bx]_N$ we can raise the success rate $\tfrac29$ close to 1. This merely increases the additional overhead. On the other hand the inversion algorithm makes almost optimal use of the oracle calls, as argued below.

Oracle optimality. We want to invert $E_N(x)$ with the help of an oracle $O_1$ where $O_1 E_N(x)$ predicts $\ell(x)$ with advantage $\varepsilon$ for random $x$. We are not interested in the known subexponential time algorithms that invert $E_N$ without using the oracle. We exclude these algorithms by restricting the access to $E_N(x)$ exclusively to oracle queries $O_1 E_N(ax)$ for multipliers $a$ of the algorithm's choice. We let the multipliers $a$ be computed with unlimited computational costs. This makes the oracle replies the only possible source of information for recovering $x$. We call this computational model the oracle access model.
It covers all known oracle algorithms for RSA-inversion.
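The SMAJ parameters quoted above can be checked numerically. The following Python sketch evaluates the per-call failure bound $\frac{4}{m\varepsilon^2} + \exp(-\tfrac12 m'\varepsilon^2)$ for $m = 2^4\varepsilon^{-2}n$ and $m' = 2\varepsilon^{-2}\lg n$ and compares it with $\frac{1}{3n}$. The function names are illustrative assumptions and not part of the paper.

```python
import math

def smaj_parameters(n: int, eps: float):
    """Sample sizes used at the stages t >= 4 + lg n (as stated in the text)."""
    m = 16 * n / eps**2                  # m  = 2^4 * eps^-2 * n
    m_sub = 2 * math.log2(n) / eps**2    # m' = 2 * eps^-2 * lg n
    return m, m_sub

def smaj_failure_bound(n: int, eps: float) -> float:
    """Chebyshev term 4/(m eps^2) plus Hoeffding term exp(-m' eps^2 / 2)."""
    m, m_sub = smaj_parameters(n, eps)
    chebyshev = 4 / (m * eps**2)                 # equals 1/(4n)
    hoeffding = math.exp(-0.5 * m_sub * eps**2)  # equals exp(-lg n)
    return chebyshev + hoeffding

def smaj_oracle_calls(n: int, eps: float) -> float:
    """Upper bound 2 n lg n eps^-2: fewer than n stages with m' calls each."""
    _, m_sub = smaj_parameters(n, eps)
    return n * m_sub

if __name__ == "__main__":
    for n in (512, 1000, 5000):
        bound = smaj_failure_bound(n, 0.01)
        print(n, bound, bound < 1 / (3 * n))
```

The bound does not depend on $\varepsilon$ because both sample sizes scale with $\varepsilon^{-2}$; only the number of oracle calls does.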

Theorem 3. Inverting $E_N(x)$ in the oracle access model requires $\frac{\ln 2}{4} n \varepsilon^{-2}$ oracle calls.

Theorem 3 has been suggested by O. Goldreich. By that theorem, the number $9n\varepsilon^{-2} \lg n$ of oracle calls in Theorem 2 is minimal up to a factor $O(\lg n)$. Informally an oracle call reveals only an $\varepsilon^2$-fraction of a random bit while we must recover an $n$-bit random string $x$.

We need some concepts from information theory, see e.g. [LV97, Section 1.11]. The Shannon entropy $H(X) = -\sum_\alpha p_\alpha \lg p_\alpha$ measures the information content (or uncertainty) of a discrete random variable $X$ with probabilities $p_\alpha = \Pr[X = \alpha]$. The conditional entropy $H(X \mid Y)$ of $X$ given $Y$ is defined the same way with the $p_\alpha$'s replaced by the conditional probabilities $\Pr[X = \alpha \mid Y]$. It is well known that $H(X \mid Y) = H(X, Y) - H(Y)$, where the probability distribution of $(X, Y)$ is the joint distribution of $X$ and $Y$ with probabilities $p_{\alpha,\beta} = \Pr[X = \alpha, Y = \beta]$. Moreover, the information in $X$ about $Y$ is defined as $I(X : Y) = H(Y) - H(Y \mid X)$, and thus $I(X : Y) = H(X) + H(Y) - H(X, Y)$. By definition $I(X : Y)$ is the part of $H(Y)$ that is complementary to $H(Y \mid X)$. We clearly have $I(X : Y) = I(Y : X)$ and by that symmetry $I(X : Y)$ is called the mutual information in $X$ and $Y$. Finally, let $H_2(p) = -p \lg p - (1 - p) \lg(1 - p)$ denote the binary entropy function.

Proof. Let the random variable $X$ be uniformly distributed over $\mathbb{Z}_N^*$ representing random RSA-messages. When queried about $E_N(cX)$ oracle $O_1$ replies by the bit $O_1 E_N(cX)$. The entropy of that bit is at most 1. We study the mutual information $I(X : O_1 E_N(cX)) = H(X) - H(X \mid O_1 E_N(cX))$ of $X$ and the oracle's reply. As $O_1 E_N(cX)$ has $\varepsilon$-advantage in predicting $\ell(cX)$, this mutual information satisfies
$$H(X) - H(X \mid O_1 E_N(cX)) \ \ge\ H(\ell(cX)) - H(\ell(cX) \mid O_1 E_N(cX)) \ \ge\ H_2(\tfrac12) - H_2(\tfrac12 + \varepsilon) = 1 - H_2(\tfrac12 + \varepsilon).$$
Here we use that $I(X : O_1 E_N(cX)) \ge I(\ell(cX) : O_1 E_N(cX))$; the latter is the mutual information of $\ell(cX)$ and $O_1 E_N(cX)$. Moreover $H(\ell(cX)) = H_2(\tfrac12) = 1$, and the conditional entropy $H(\ell(cX) \mid O_1 E_N(cX))$ is $H_2(p)$ where $p = \Pr_{x,w}[O_1 E_N(cX) = \ell(cX)] \ge \tfrac12 + \varepsilon$. As $H_2(p)$ is monotonically decreasing for $p > \tfrac12$ we get the lower bound $1 - H_2(\tfrac12 + \varepsilon)$.

Moreover, if $O_1 E_N(cX)$ reveals no further information on $cX$ (information about the bits other than $\ell(cX)$) then $I(X : O_1 E_N(X)) = 1 - H_2(\tfrac12 + \varepsilon)$. In particular there exist oracles $O_1$ having $\varepsilon$-advantage for $\ell(X)$, where
$$I(X : O_1 E_N(X)) = 1 - H_2(\tfrac12 + \varepsilon) \le \frac{4}{\ln 2}\varepsilon^2 + O(\varepsilon^4).$$
Using such an oracle $O_1$ every RSA-inversion algorithm must perform at least $\frac{\ln 2}{4} n \varepsilon^{-2} (1 - O(\varepsilon^2))$ oracle calls as it must recover $n$ bits of information of $X$. This holds because $H(X) = n$ while each oracle call provides at most $I(X : O_1 E_N(cX)) \le \frac{4}{\ln 2}\varepsilon^2 + O(\varepsilon^4)$ information about $X$. $\Box$
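The counting argument of the proof can be illustrated numerically. The sketch below (helper names are illustrative, not from the paper) computes $1 - H_2(\tfrac12 + \varepsilon)$, the information that a reply which is correct with probability exactly $\tfrac12 + \varepsilon$ carries about a uniform bit, and the resulting number $n / (1 - H_2(\tfrac12 + \varepsilon))$ of replies needed to collect $n$ bits. Numerically the per-call information behaves like $\frac{2}{\ln 2}\varepsilon^2$ for small $\varepsilon$, which lies within the $\frac{4}{\ln 2}\varepsilon^2$ bound used above.

```python
import math

def binary_entropy(p: float) -> float:
    """H_2(p) = -p lg p - (1 - p) lg(1 - p), with H_2(0) = H_2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def info_per_call(eps: float) -> float:
    """1 - H_2(1/2 + eps): bits revealed by one reply of advantage eps."""
    return 1.0 - binary_entropy(0.5 + eps)

def min_oracle_calls(n: int, eps: float) -> float:
    """Replies needed to accumulate n bits at info_per_call(eps) bits each."""
    return n / info_per_call(eps)

if __name__ == "__main__":
    n, eps = 1000, 0.01
    print("information per call:", info_per_call(eps))
    print("leading term 2/ln2 * eps^2:", 2 / math.log(2) * eps**2)
    print("calls needed:", min_oracle_calls(n, eps))
    print("Theorem 3 bound:", math.log(2) / 4 * n / eps**2)
```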

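4 Processing all approximate locations simultaneously

So far RSA-inversion processes all pairs of locations $(u, v)$ separately. Processed simultaneously, these pairs can be handled much faster. We simulate for all $u, v$ the algorithm RSA-inversion using the SMAJ-rule of Section 3, where $m = 2^4 n \varepsilon^{-2}$ and $m' = 2\varepsilon^{-2} \lg n$. All SMAJ-calls are performed with the same random multiset $A'_{m'} \subset A_m$ of size $m'$. We skip the first $3 + \lg n$ stages of the algorithm of Section 3; these stages iteratively improve the precision of the approximate locations and differ from the rest. So our simulation starts at stage $t = 4 + \lg n$ and ends in stage $t = n$. It picks random $a, b \in_R \mathbb{Z}_N^*$ and precomputes all oracle replies $O_{t,i} := O_1 E_N(c_{t,i} x)$ for $i \in A'_{m'}$ and $c_{t,i} = a_t(1 + 2i) + b$. It tries for the fixed $a, b$ all $u \in B := \frac{\varepsilon^3}{2^7 n}[0, 2^7 n \varepsilon^{-3})$ and all $v \in \frac{\varepsilon}{8}[0, 8\varepsilon^{-1})$. The interval $B$ is chosen so that $|\frac1N [ax]_N - u| \le \frac{\varepsilon^3}{2^8 n}$ holds for some $u \in B$. This is the precision required at stage $t = 4 + \lg n$ of RSA-inversion, where $u_t \in \frac{\varepsilon^3}{8 \cdot 2^t}[0, 8 \cdot 2^t \varepsilon^{-3})$. At stages $t > 4 + \lg n$ we stick to the precision level of stage $4 + \lg n$: we round $u_t$ to the nearest rational in $B$. The rounding induces no extra error of $w_{t,i}$.

Recall that the SMAJ-rule decides "$\ell(a_t x) = 0$" iff Equation (3) holds for at least half of the $i \in A'_{m'}$:
$$O_{t,i} = \beta + \lfloor u_t(1 + 2i) + v \rfloor \bmod 2, \qquad (3)$$
where $\beta$ denotes the guessed bit $\ell(bx)$. The main work is to compute for all $u \in B$, $v \in \frac{\varepsilon}{8}[0, 8\varepsilon^{-1})$, all $t$ and $\beta \in \{0, 1\}$:
$$\Gamma(u, v, \beta, t) := \#\{\, i \in A'_{m'} \mid O_{t,i} = \beta + \lfloor u(1 + 2i) + v \rfloor \bmod 2 \,\}.$$
Once we are given all $\Gamma$-values we easily simulate the algorithm RSA-inversion for all $u \in B$, $v \in \frac{\varepsilon}{8}[0, 8\varepsilon^{-1})$ and all guesses $\alpha_0, \beta \in \{0, 1\}$ in time $O(n^2 \varepsilon^{-4})$ because there are $O(n \varepsilon^{-4})$ quadruples $(u, v, \alpha_0, \beta)$ to start with, and RSA-inversion decides "$\ell(a_t x) = 0$" iff $\Gamma(u_{t-1}, v, \beta, t) \ge m'/2$. So it remains to compute all $\Gamma$-values. Recall that there are $|B| \cdot 8\varepsilon^{-1} \cdot 2 \cdot (n - \lg n) < 2^{11} n^2 \varepsilon^{-4}$ such values, each depending on $m' = 2\varepsilon^{-2} \lg n$ equations. Our aim is to compute all these values in time almost linear in the number of values.

Computation of all $\Gamma$-values. Equation (3) can be written with $u_t = u$ as
$$O_{t,i} = \beta + \lfloor (s + \bar{s})\,\varepsilon/8 \rfloor \bmod 2, \qquad (4)$$
where we define $s := \lfloor (2iu \bmod 1)\, 8\varepsilon^{-1} \rfloor \in [0, 8\varepsilon^{-1})$ and $\bar{s} := \lfloor (u + v \bmod 1)\, 8\varepsilon^{-1} \rfloor \in [0, 8\varepsilon^{-1})$. Obviously Equations (3) and (4) are equivalent, except when $s + \bar{s} = -1 \bmod 8\varepsilon^{-1}$. This equivalence is due to the additivity $\lfloor (s + \bar{s})\varepsilon/8 \rfloor = \lfloor s\varepsilon/8 \rfloor + \lfloor \bar{s}\varepsilon/8 \rfloor$ which holds for all integers $s, \bar{s}$ with $s + \bar{s} \ne -1 \bmod 8\varepsilon^{-1}$. For simplicity we neglect the condition $s + \bar{s} \ne -1 \bmod 8\varepsilon^{-1}$.$^1$

As the values $\beta, \bar{s}$ in Equation (4) do not depend on $i$ we can eliminate $\beta, \bar{s}$ from the problem. We reduce the $\Gamma$-values to the following $\Delta$-values which do not depend on $\beta, \bar{s}, v$:
$$\Delta_\sigma(s, u, t) := \#\{\, i \in A'_{m'} \mid \lfloor (2iu \bmod 1)\, 8\varepsilon^{-1} \rfloor = s \bmod 8\varepsilon^{-1},\ O_{t,i} = \sigma \,\}$$
for $\sigma = 0, 1$; $s \in [0, 8\varepsilon^{-1})$; $u \in B$. By Equation (4) we have $\Gamma(u, v, \beta, t) = \sum_{(\sigma, s)} \Delta_\sigma(s, u, t)$, where the sum ranges over all pairs $(\sigma, s) \in \{0, 1\} \times [0, 8\varepsilon^{-1})$ with $\sigma = \beta + \lfloor (s + \bar{s})\varepsilon/8 \rfloor \bmod 2$ and $\bar{s} := \lfloor (u + v \bmod 1)\, 8\varepsilon^{-1} \rfloor$. Given all $\Delta$-values we first compute $\bar\Delta_\sigma(S, u, t) := \sum_{0 \le s < S} \Delta_\sigma(s, u, t)$ for all $S \in [0, 8\varepsilon^{-1})$ and all $\sigma, u$. Rewriting the above sum $\sum_{(\sigma, s)} \Delta_\sigma(s, u, t)$ we get the equation
$$\Gamma(u, v, \beta, t) = \bar\Delta_\beta(8\varepsilon^{-1} - \bar{s}, u, t) + \big( \bar\Delta_{1-\beta}(8\varepsilon^{-1}, u, t) - \bar\Delta_{1-\beta}(8\varepsilon^{-1} - \bar{s}, u, t) \big),$$
where we separately sum with $\sigma = \beta$ over the $s$ satisfying $0 \le s + \bar{s} < 8\varepsilon^{-1}$ and with $\sigma = 1 + \beta \bmod 2$ over the $s$ with $8\varepsilon^{-1} \le s + \bar{s} < 16\varepsilon^{-1}$. We need one addition for each of the $2^{11} n^2 \varepsilon^{-4}$ $\bar\Delta$-values and two additions per $\Gamma$-value.
So we get all $\Gamma$-values using $3 \cdot 2^{11} n^2 \varepsilon^{-4}$ steps. It remains to compute the $\Delta$-values.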
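The reduction from counting solutions of Equation (4) to prefix sums can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation: `K` stands for $8\varepsilon^{-1}$, `s_vals[i]` for the coordinate $s$ of measurement $i$, `replies[i]` for $O_{t,i}$, and all function names are hypothetical. For fixed $u, t$ it builds the box counts, their prefix sums, and compares the two-term formula with a brute-force count.

```python
import random
from itertools import accumulate

def gamma_bruteforce(s_vals, replies, K, beta, s_bar):
    """Count the i with O_i = beta + floor((s_i + s_bar) / K) mod 2."""
    return sum(1 for s, o in zip(s_vals, replies)
               if o == (beta + (s + s_bar) // K) % 2)

def prefix_counts(s_vals, replies, K):
    """P[sigma][S] = #{i : replies[i] == sigma and s_vals[i] < S}, 0 <= S <= K."""
    boxes = [[0] * K for _ in range(2)]
    for s, o in zip(s_vals, replies):
        boxes[o][s] += 1
    return [[0] + list(accumulate(boxes[sigma])) for sigma in range(2)]

def gamma_from_prefix(P, K, beta, s_bar):
    """Two lookups for sigma = beta (no carry) and sigma = 1 - beta (carry)."""
    no_carry = P[beta][K - s_bar]
    carry = P[1 - beta][K] - P[1 - beta][K - s_bar]
    return no_carry + carry

def demo(seed=0, K=16, count=200):
    rng = random.Random(seed)
    s_vals = [rng.randrange(K) for _ in range(count)]
    replies = [rng.randrange(2) for _ in range(count)]
    P = prefix_counts(s_vals, replies, K)
    return all(
        gamma_from_prefix(P, K, beta, s_bar)
        == gamma_bruteforce(s_vals, replies, K, beta, s_bar)
        for beta in (0, 1) for s_bar in range(K)
    )

if __name__ == "__main__":
    print(demo())
```

After the prefix sums are built once, each of the $2K$ counts costs two additions, independent of the number of measurements.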

Computation of all $\Delta$-values.

1. We must compute $2^{11} n^2 \varepsilon^{-4}$ $\Delta$-values. The $\sigma, s$ in the definition of $\Delta_\sigma(s, u, t)$ are determined by $i, u, t$ and thus for fixed $u, t$ the $m'$ $i$'s distribute into $16\varepsilon^{-1}$ boxes $(\sigma, s, u, t)$ corresponding to the pairs $(\sigma, s)$. Specifically $\Delta_\sigma(s, u, t)$ is the number of $i$'s in box $(\sigma, s, u, t)$. We partition each box into $8\varepsilon^{-1}$ subboxes containing the $i$ with $i = c \bmod 8\varepsilon^{-1}$ for $c \in [0, 8\varepsilon^{-1})$. Let $\delta_\sigma(c, s, u, t)$ denote the number of $i$ in subbox $(\sigma, c, s, u, t)$. We compute the $\delta$-values first. The point is that many subboxes coincide for distinct $s, u$ for the same $t$. E.g. if we form subboxes for the even and the odd $i$ the same subboxes appear for $u$ and $u + \tfrac12$ because $((2 i \tfrac12) \bmod 1)\, 8\varepsilon^{-1} = 0$ in the definition of $\Delta$.

In general we write $u \in B$ uniquely in the form $u = u' + j\frac{\varepsilon}{8}$ with $u' \in \frac{\varepsilon^3}{2^7 n}[0, 2^4 n \varepsilon^{-2})$ and $j \in [0, 8\varepsilon^{-1})$. Clearly $\delta_\sigma(c, s, u' + j\frac{\varepsilon}{8}, t) = \delta_\sigma(c, s - 2jc, u', t)$, where the $s$-coordinates are always taken $\bmod\ 8\varepsilon^{-1}$, and which follows from $(2cj\frac{\varepsilon}{8} \bmod 1)\, 8\varepsilon^{-1} = 2cj \bmod 8\varepsilon^{-1}$. Thus we only need to compute the $\delta_\sigma(c, s, u', t)$ for all $u' \in B \cap [0, \frac{\varepsilon}{8})$. We distribute for fixed $u' \in B \cap [0, \frac{\varepsilon}{8})$ and fixed $t$ the $m'$ $i$'s into the subboxes $(\sigma, c, s, u, t)$ and we count accordingly. This requires at most $m' \cdot |B| \cdot 2 \cdot n / 8\varepsilon^{-1} = 2^6 n^2 \varepsilon^{-4} \lg n$ steps. Note that most of the $O(n^2 \varepsilon^{-5})$ subboxes remain empty without incurring any costs.

2. Finally we compute $\Delta_\sigma(s, u, t) = \sum_c \delta_\sigma(c, s, u, t)$ for all $\sigma \in \{0, 1\}$ and all $s, u, t$, summing up over $c \in [0, 8\varepsilon^{-1})$. This is done by an FFT-like network, where subsums are used repeatedly. In depth $e$ of the network we form the sum over the $\delta_\sigma(c, s, u, t)$ for the $2^e$ $c$'s that coincide modulo $8\varepsilon^{-1}/2^e$; we do this for all residue classes modulo $8\varepsilon^{-1}/2^e$. Each of these sums is the sum of two subsums of the preceding layer in depth $e - 1$. Each subsum in depth $e$ repeats $2^e$ times as the values for $u, c$ and $\bar{u}, \bar{c}$ coincide: $\delta_\sigma(c, s, u, t) = \delta_\sigma(\bar{c}, s, \bar{u}, t)$ if $c = \bar{c} \bmod 8\varepsilon^{-1}/2^e$ and $2^{e-1} u = 2^{e-1} \bar{u} \bmod 1$. (In this case the values $(2iu \bmod 1)\, 8\varepsilon^{-1} = (2cu \bmod 1)\, 8\varepsilon^{-1}$ in the definition of $\Delta$ coincide for $c, u$ and $\bar{c}, \bar{u}$.) The FFT-network has depth $\lg(8\varepsilon^{-1})$ and after elimination of the redundancies there are $2^{10} n^2 \varepsilon^{-4}$ vertices per layer. We have reduced the number of vertices per layer by a factor 2 due to the identity $\delta_\sigma(c, s, u, t) = \delta_\sigma(c, s, u + \tfrac12, t)$. Another reduction by a factor 2 is possible since $\Delta_0(s, u, t) + \Delta_1(s, u, t)$ does not depend on the oracle calls. The resulting network has size $2^9 n^2 \varepsilon^{-4} \lg(8\varepsilon^{-1})$.

$^1$ In the case $s + \bar{s} = -1 \bmod 8\varepsilon^{-1}$ we must have $|((1 + 2i) a_t + b) x|_N \le \frac{\varepsilon}{4} N$. If the $i$ with $s + \bar{s} = -1 \bmod 8\varepsilon^{-1}$ are all counted incorrectly the error probability of $w_{t,i}$ increases at most by $\frac{\varepsilon}{4}$.
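The shift rule for subboxes in step 1 can be verified with exact rational arithmetic. In the sketch below (illustrative names; `K` stands for $8\varepsilon^{-1}$ and is assumed to be an integer) the coordinate $s = \lfloor (2iu \bmod 1) K \rfloor$ of an index $i \equiv c \pmod K$ moves by exactly $2jc \bmod K$ when $u'$ is replaced by $u' + j/K$, which is why the subbox counts for $u' + j\frac{\varepsilon}{8}$ can be read off from those for $u'$.

```python
from fractions import Fraction
from math import floor

def s_coordinate(i: int, u: Fraction, K: int) -> int:
    """floor((2*i*u mod 1) * K), computed exactly with rationals."""
    frac = (2 * i * u) % 1
    return floor(frac * K)

def check_shift(K: int, u_prime: Fraction, j: int, indices) -> bool:
    """Check s(i, u' + j/K) == s(i, u') + 2*j*c (mod K) with c = i mod K."""
    u = u_prime + Fraction(j, K)
    for i in indices:
        c = i % K
        expected = (s_coordinate(i, u_prime, K) + 2 * j * c) % K
        if s_coordinate(i, u, K) != expected:
            return False
    return True

if __name__ == "__main__":
    # Toy values: K = 16 corresponds to eps = 1/2; u' is an arbitrary small rational.
    print(check_shift(16, Fraction(3, 1000), 5, range(-50, 51)))
```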

Time bounds. The dominant part of the time bound for $\Gamma$ is the size $2^9 n^2 \varepsilon^{-4} \lg(8\varepsilon^{-1})$ of the FFT-network for step 2. In the right alternative each SMAJ-call errs by the analysis of Section 3 with probability at most $\frac{1}{3n}$. We perform less than $n$ consecutive stages starting in stage $4 + \lg n$. The inversion of the RSA-function succeeds with probability $\tfrac23$. The elimination of the first $3 + \lg n$ stages has doubled the success probability. Neglecting minor terms, $E_N$ can be inverted in expected time
$$3 n (\lg n)\, \varepsilon^{-2}\, T + 3 \cdot 2^8 n^2 \varepsilon^{-4} \lg(8 n \varepsilon^{-1}). \qquad (5)$$

Theorem 4. Processing all pairs $(u, v)$ simultaneously, the additional overhead of RSA-inversion is at most $O(n^2 \varepsilon^{-4} \lg(n \varepsilon^{-1}))$.

The additional overhead in Theorem 4 is quadratic in $n$ while in Theorem 2 it is linear in $n$. We believe that a factor $n$ can be saved in Theorem 4. In the right alternative the fraction of measurements that predict "$\ell(a_t x) = 0, 1$" is $\Gamma(u, v, l, t)/m'$. By Chebyshev's inequality this fraction must be close to $\tfrac12 \pm \varepsilon$, where $\varepsilon$ is the exact advantage of $O_1$. We can discard the pairs $(u, v)$ for which $\Gamma(u, v, l, t)/m'$ differs by more than $\frac{\varepsilon}{2}$ from both $\tfrac12 \pm \varepsilon$.

5 Security of RSA-message bits and of the RSA-RNG

An important question of practical interest is how to efficiently generate many pseudorandom bits that are provably good under weak complexity assumptions. Provable security for the RSA-RNG follows from Theorems 2 and 4. Under the assumption that there is no breakthrough in algorithms for inverting the whole RSA-function, Theorems 2 and 4 yield provable security for RSA-message bits and for the RSA-RNG for moduli $N$ of practical size, $n = 1{,}000$ and $n = 5{,}000$, respectively.

The fastest known factoring method. The fastest known algorithm for factoring $N$ or for breaking the RSA cryptoscheme requires at least $L_N[\tfrac13, 1.9]^{1 + o(1)}$ steps, where $L_N[v, c] = \exp(c \cdot (\ln N)^v (\ln \ln N)^{1 - v})$. $L_N[\tfrac13, 1.9]$ is the conjectured run time of the number field sieve method with Coppersmith's modification using several number fields [BLP93]. Factoring even a non-negligible fraction of random RSA-moduli $N$ requires $L_N[\tfrac13, 1.9]$ steps by this algorithm.

Practical security of RSA-message bits. Consider the time bound (5) for RSA-inversion with the oracle time bound $T := 3.16 \cdot 10^{13}$, $n := 1{,}000$, $\varepsilon := \frac{1}{100}$. The time bound (5) for RSA-inversion is about $10^{22}$ which is clearly less than $L_N[\tfrac13, 1.9] \approx 10^{25.5}$, for $N \approx 2^n$, the time of the fastest known algorithm for factoring $N$. This yields a security guarantee for the least-significant message bit: for given $E_N(x)$ it is impossible to predict $\ell(x)$ with advantage $\frac{1}{100}$ within one MIP-year ($3.16 \cdot 10^{13}$ instructions) or else the RSA-function $E_N$ can be inverted faster than is possible by factoring $N$ using the fastest known algorithm.

The contribution of the additional overhead to the time bound (5) is about $1.4 \cdot 10^{18}$ for $n = 1{,}000$, $\varepsilon = \frac{1}{100}$. There is an interesting consequence. Each of the 8 least-significant RSA-message bits satisfies essentially the same security guarantee as the least-significant one because the additional overhead of oracle RSA-inversion via oracle $O_j$ is proportional to $2^{2j}$, see the end of Section 2; the claim holds since $1.4 \cdot 10^{18} \cdot 2^{14} \approx 2.3 \cdot 10^{22}$.
On the other hand the ACGS-result does not give any security guarantee for moduli $N$ of bit length 1,000, not even against a one-step attacker with $T = 1$. The ACGS-time bound for inversion is $2^{19.7} \cdot 1000^3 \cdot 100^8 \approx 8.5 \cdot 10^{30} \gg 10^{25.5}$, which means that the ACGS-time for RSA-inversion does not beat the time for factoring $N$ by known algorithms.

Practical and provably secure random bit generation. Let $N = p \cdot q$ be a random RSA-modulus with primes $p, q$, let $x_0 \in_R [0, N)$ and let $e$ be a fixed RSA-exponent, i.e. $e$ is relatively prime to $\varphi(N) = (p - 1)(q - 1)$ and $e \ne 1 \bmod \varphi(N)$. Based on the Blum-Micali construction [BM84], the RSA-RNG produces from random seeds $(x_0, N)$ the bit string $b = (b_1, \dots, b_m)$ as
$$x_i = x_{i-1}^e \bmod N, \qquad b_i = x_i \bmod 2 \qquad \text{for } i = 1, \dots, m.$$
A distinguishing algorithm $D$ rejects $b$ produced as above at tolerance level $\delta$ if for random $a \in_R \{0, 1\}^m$:
$$|\Pr_b[D(b) = 1] - \Pr_a[D(a) = 1]| \ge \delta.$$
A tolerance level $\delta = \frac{1}{100}$ is considered to be sufficient for practical purposes.
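The recursion $x_i = x_{i-1}^e \bmod N$, $b_i = x_i \bmod 2$ is straightforward to express in code. The following Python sketch uses toy parameters chosen only for illustration (the primes 1019 and 1031, the exponent 7 and the seed are assumptions, far below the bit lengths discussed in this section); it shows the shape of the generator, not a secure instantiation.

```python
from math import gcd

def rsa_rng(x0: int, N: int, e: int, m: int) -> list:
    """Return b_1..b_m with x_i = x_{i-1}^e mod N and b_i = x_i mod 2."""
    bits, x = [], x0
    for _ in range(m):
        x = pow(x, e, N)   # one application of the RSA-function E_N
        bits.append(x & 1) # least-significant bit of x_i
    return bits

# Toy modulus for illustration only; real moduli have n >= 1000 bits.
p, q = 1019, 1031
N = p * q
phi = (p - 1) * (q - 1)
e = 7
assert gcd(e, phi) == 1 and e % phi != 1

if __name__ == "__main__":
    print(rsa_rng(123456, N, e, 32))
```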

Theorem 5. Let the RSA-RNG produce from random seeds $(x_0, N)$ of length $2n$ an output $b = (b_1, \dots, b_m)$ of length $m$. Every distinguishing algorithm $D$ of running time $T$ that rejects the output at tolerance level $\delta$ yields an algorithm that inverts the RSA-function $E_N$ in expected time $3n(\lg n)\, m^2 \delta^{-2}\, T + O(n^2 m^4 \delta^{-4} \lg(n m \delta^{-1}))$ for a polynomial fraction of $N$.

Proof. Suppose the bit string $b \in \{0, 1\}^m$ is rejected by some test $A$ in time $T(A)$ and tolerance level $\delta$. By Yao's argument, see e.g. [K97, Section 3.5, Lemma P1], and since the distribution of $b$ is shift-invariant ($E_N$ is a permutation), there is an oracle $O_1$ which, given $E_N(x)$ and $N$, predicts $\ell(x)$ in time $T(A) + mn^2$ with advantage $\varepsilon := \delta/m$ for a non-negligible fraction of $N$. By the time bound (5), and assuming that $T(A)$ dominates $mn^2$, we can invert $E_N$ in the claimed expected time. $\Box$

Odlyzko [O95] rates the 1995 yearly world computing power at $3 \cdot 10^8$ MIP-years, where a MIP-year corresponds to $3.16 \cdot 10^{13}$ instructions. Then $3 \cdot 10^8$ MIP-years correspond to $10^{22}$ instructions.

Corollary 6. The RSA-random generator produces for $n = 5{,}000$ from random seeds $(x_0, N)$ of bit length $10^4$ at least $m = 10^7$ pseudo-random bits that withstand all statistical tests doable with at most $10^{22}$ steps at tolerance level $\delta = \frac{1}{100}$, or else the whole RSA-function $E_N$ can be inverted in less than $L_N[\tfrac13, 1.9]$ steps for a non-negligible fraction of $N$.

Proof. We apply Theorem 4 with the $O$-constant $3 \cdot 2^8$ of the time bound (5). This inverts the RSA-function $E_N$ using about $8 \cdot 10^{47}$ steps while $L_N[\tfrac13, 1.9] > 3.7 \cdot 10^{50}$ for $N \approx 2^{5000}$. $\Box$
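The numbers quoted in this section can be reproduced from the time bound (5) and the definition of $L_N[v, c]$. The sketch below uses illustrative function names and works with $\log_{10}$ of $L_N$. For $n = 1{,}000$, $\varepsilon = \frac{1}{100}$, $T = 3.16 \cdot 10^{13}$ it gives roughly $9.4 \cdot 10^{21}$ steps, with an overhead term of about $1.5 \cdot 10^{18}$ (the same order as the quoted $1.4 \cdot 10^{18}$), against $L_N \approx 10^{25.5}$. For the setting of Corollary 6, substituting $\varepsilon = \delta/m$ gives somewhat under $9 \cdot 10^{47}$ steps against $L_N \approx 10^{50.6}$.

```python
import math

def log10_L(n_bits: int, v: float = 1 / 3, c: float = 1.9) -> float:
    """log10 of L_N[v, c] = exp(c (ln N)^v (ln ln N)^(1-v)) for N ~ 2^n_bits."""
    ln_N = n_bits * math.log(2)
    return c * ln_N**v * math.log(ln_N)**(1 - v) / math.log(10)

def bound5(n: int, eps: float, T: float):
    """Return the two terms of time bound (5): oracle part and overhead part."""
    oracle = 3 * n * math.log2(n) * eps**-2 * T
    overhead = 3 * 2**8 * n**2 * eps**-4 * math.log2(8 * n / eps)
    return oracle, overhead

if __name__ == "__main__":
    # Message-bit setting: n = 1000, eps = 1/100, T = one MIP-year.
    oracle, overhead = bound5(1000, 0.01, 3.16e13)
    print(f"{oracle + overhead:.3e} vs 10^{log10_L(1000):.2f}")
    # Corollary 6 setting: n = 5000, eps = delta/m, T = 10^22.
    oracle, overhead = bound5(5000, 1e-2 / 1e7, 1e22)
    print(f"{oracle + overhead:.3e} vs 10^{log10_L(5000):.2f}")
```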

6 The Rabin-function and the $x^2 \bmod N$-generator

We extend the results of the previous sections from the RSA-function $E_N(x) = x^e \bmod N$ to Rabin's encryption function where $e = 2$. The corresponding RNG is the $x^2 \bmod N$-generator. The least-significant bit of the Rabin-function and the $x^2 \bmod N$-generator have been proved to be secure under the assumption that factoring integers is hard [ACGS88], [VV84]. We show that this security even holds for moduli $N$ of practical size, i.e. we improve the time bound of the oracle factoring algorithm.

Throughout the section let $N$ be a Blum integer, a product of two primes $p$ and $q$ that are congruent to $3 \bmod 4$. Let $QR_N$ and $\mathbb{Z}_N^*(+1)$ be the groups of residues mod $N$ that are squares, respectively have Jacobi symbol $+1$. Then $-1$ is a quadratic non-residue modulo $N$, $-1 \in \mathbb{Z}_N^*(+1)$. We have a chain of groups $QR_N \subset \mathbb{Z}_N^*(+1) \subset \mathbb{Z}_N^*$ that increase by a factor 2 with each inclusion. For further details on the Jacobi symbol see [NZ80].

We distinguish three variants of the Rabin-function $x \mapsto x^2 \bmod N$, the uncentered, the centered and the absolute Rabin-function:
• the original, unmodified uncentered Rabin-function $E_N^u(x) = x^2 \bmod N \in (0, N)$,
• the centered Rabin-function $E_N^c(x) = x^2 \bmod N \in (-N/2, N/2)$,
• the absolute Rabin-function $E_N^a(x) = |x^2 \bmod N| \in (0, N/2)$.

The uncentered Rabin-function $E_N^u$ outputs the residue in $(0, N)$ whereas $E_N^c$ outputs $x^2 \bmod N$, the residue in the symmetrical interval $(-N/2, N/2)$. Note that $|x^2 \bmod N| = \min([x^2]_N, [N - x^2]_N) \in [0, N/2)$ is the absolute value of $x^2 \in \mathbb{Z}_N$. As $N$ is a Blum integer we have $-1 \in \mathbb{Z}_N^*(+1) \setminus QR_N$. Moreover the set $M_N := \mathbb{Z}_N^*(+1) \cap (0, N/2)$ has cardinality $|\mathbb{Z}_N^*|/4$, and is a group under the operation $a \circ b := |ab|_N$. It is important that
• the uncentered Rabin-function $E_N^u$ permutes the set $QR_N \cap (0, N)$,
• the centered Rabin-function $E_N^c$ permutes the set $QR_N \cap (-N/2, N/2)$,
• the absolute Rabin-function $E_N^a$ permutes the set $M_N = \mathbb{Z}_N^*(+1) \cap (0, N/2)$.

The whole point is that $\mathbb{Z}_N^*(+1)$ can be decided in polynomial time whereas $QR_N$ is presumably difficult to decide. So $E_N^a$ permutes a nice, poly-time recognizable set $M_N$, whereas both $E_N^c, E_N^u$ permute "complicated" sets. We note that $E_N^c(x) = \pm E_N^a(x)$, $E_N^a(x) = |E_N^c(x)|$, $E_N^u(x) \in \{E_N^c(x), E_N^c(x) + N\}$. Thus $E_N^c$ extends the output of $E_N^a$ by one bit, the sign. Previous oracle inversion algorithms for $E_N^a$ have been proposed in [ACGS88] and for $E_N^u$ in [VV84]. We improve the previous time bounds.

The $x^2 \bmod N$-generator. The $x^2 \bmod N$-generator transforms a random seed $(x_0, N)$ into a bit string $(b_1, \dots, b_m)$ as $x_i := E_N(x_{i-1})$, $b_i := \ell(x_i)$ for $i = 1, \dots, m$. We distinguish three variants of this generator, the uncentered, the centered and the absolute RNG, according to the three variants of the Rabin-function $E_N$. Specifically, the seed $x_0$ is random in the set $QR_N \cap (0, N)$, $QR_N \cap (-N/2, N/2)$, respectively $M_N$ for the uncentered, centered, respectively absolute Rabin-function. Historically the uncentered RNG has been introduced as the $x^2 \bmod N$-generator [BBS86]. However, the absolute and the centered RNG coincide and yield better results.

The absolute and the centered RNG coincide in the output. Let $x_i^a, x_i^c, x_i^u$ denote the integer $x_i$ in the $i$-th iteration with $E_N^a, E_N^c, E_N^u$ and input $x_0 = x_0^a = x_0^c = x_0^u$. Using $E_N^c(x) = \pm E_N^a(x)$ we see by induction on $i$ that $x_i^c = \pm x_i^a$ and $\ell(x_i^c) = x_i^c \bmod 2 = x_i^a \bmod 2 = \ell(x_i^a)$. On the other hand the uncentered RNG is quite different and unsymmetrical. It outputs the xor of $\ell(x_i^c)$ and the sign-bit $[x_i^c > 0]$. It comes as no surprise that we can establish better security for the absolute (and the equivalent centered) RNG than for the uncentered one.

Oracle inversion of the absolute Rabin-function. We would like to modify the oracle algorithm for RSA-inversion from the RSA-function to the permutation $E_N^a$ on $M_N$. The modified algorithm uses an oracle $O_1$ which given $E_N^a(x)$ and $N$ predicts for random $x \in_R M_N$ the bit $\ell(x)$ with advantage $\varepsilon$. The inversion algorithm uses the canonical multipliers $c_{t,i} = a_t(1 + 2i) + b$ of Section 2.

How to interpret the oracle. The difficulty is that the queries to the oracle may be of the wrong form. Namely, if we feed the oracle with $E_N^a(c_{t,i} x)$, where $c_{t,i} x \notin M_N$, then the oracle's answer does not correspond to $c_{t,i} x$ but rather to the square root of $(c_{t,i} x)^2$ that resides in $M_N$. This may happen if either $c_{t,i} \notin \mathbb{Z}_N^*(+1)$ or $[c_{t,i} x]_N > N/2$. The case $c_{t,i} \notin \mathbb{Z}_N^*(+1)$ is easy to detect; in this case we discard the multiplier $c_{t,i}$. We detect the case $[c_{t,i} x]_N > N/2$ with high probability via the approximation $w_{t,i} N$ of $[c_{t,i} x]_N$. If $c_{t,i} x \in \mathbb{Z}_N^*(+1)$ and $[c_{t,i} x]_N > N/2$ then the oracle's answer corresponds to

$-c_{t,i} x$ since $-c_{t,i} x \in M_N$ and $(-c_{t,i} x)^2 = (c_{t,i} x)^2$. In this case the oracle guess corresponds to $\ell(-c_{t,i} x) = \ell(N - [c_{t,i} x]_N) = 1 + \ell(c_{t,i} x) \bmod 2$. So if $[c_{t,i} x]_N > N/2$ we must reverse the guess $O_1 E_N^a(c_{t,i} x)$.

Figure 2: binary division via mod N

The distribution of multipliers with Jacobi symbol $+1$. On the average half of the multipliers $c_{t,i} = a_t(1 + 2i) + b$ are in $\mathbb{Z}_N^*(+1)$, which follows from $|\mathbb{Z}_N^*(+1)| = |\mathbb{Z}_N^*|/2$.

However we have to get sufficiently close to the density $\tfrac12$ for a specific subset, corresponding to the $i \in A_m$, and for all stages $t$. Otherwise we need higher accuracy for the approximate location $uN$ of $[ax]_N$ that has to be guessed upon initiation. This point affects the time analysis; it has been neglected in [ACGS88]. Peralta [P92] shows that for every prime number $P$, for distinct fixed integers $A_1, \dots, A_m \in \mathbb{Z}_P$ and random $X \in_R \mathbb{Z}_P$ the distribution of the sequence of quadratic characters of $X + A_1, \dots, X + A_m$ deviates from the uniform distribution on $\{\pm 1\}^m$ by at most $m \frac{3 + \sqrt{P}}{P}$. We apply this result to the prime factors $p, q$ of $N$ with $X = b$ and $A_i = a_t(1 + 2i)$ for the $i \in A_m$, and we use that $b \bmod p$, $b \bmod q$ are independent for random $b \in_R \mathbb{Z}_N$. Then Peralta's result shows that for random $b \in_R \mathbb{Z}_N^*$ the fraction of multipliers $c_{t,i} = a_t(1 + 2i) + b$ with $|1 + 2i| \le m$ that are in $\mathbb{Z}_N^*(+1)$ is $\tfrac12 + O(m \frac{3 + \sqrt{\bar{p}}}{\bar{p}})$, where $\bar{p} := \min(p, q)$, for every $t = 1, \dots, n$. The difference $O(m \frac{3 + \sqrt{\bar{p}}}{\bar{p}})$ of this fraction to $\tfrac12$ is so small that its effect is negligible over all $n$ stages of the inversion algorithm.

Technical details that deviate from RSA-inversion. The following changes compared to RSA-inversion accompany the smaller density $\approx \tfrac12$ of the multipliers $c_{t,i} \in \mathbb{Z}_N^*(+1)$:

• Double $m$. With high probability there are about $m$ multipliers $c_{t,i} \in \mathbb{Z}_N^*(+1)$ with $|1 + 2i| \le 2m$.
• The numerical error $\Delta_{t,i}$ of $u_t(1 + 2i) + v$ doubles as $i$ and $m$ double.
• Use an initial approximation $v$ for $\frac1N [bx]_N$ of double distance; the numerical error $\Delta_{t,i}$ doubles anyway. So initially guess the $v \in \frac{\varepsilon}{4}[0, 4\varepsilon^{-1})$ that is closest to $\frac1N [bx]_N$. The number of $v$-values halves.
• In case that $w_{t,i}$ errs, $\frac1N |c_{t,i} x|_N$ may be twice as large since the numerical error $\Delta_{t,i}$ doubles. If $w_{t,i}$ errs we have $\frac1N |c_{t,i} x|_N \le \frac{\varepsilon}{4}$.
• The new error event $\mathrm{Err}_{t,i} = [\, \frac1N |c_{t,i} x|_N \le \frac{\varepsilon}{4} \,]$ has probability at most $\frac{\varepsilon}{2}$ ($\frac{\varepsilon}{4}$ for RSA-inversion). The increased error $\mathrm{Err}_{t,i}$ will be reduced by an "additional" error.

Compute $a_t, c_{t,i}, u_t, w_{t,i}$ as for RSA-inversion. Decide that $2[c_{t,i} x]_N < N$ iff $\lfloor 2(u_t(1 + 2i) + v) \rfloor$ is even. If $2[c_{t,i} x]_N > N$ reverse the guess for $\ell(a_t x)$ by simply adding $w'_{t,i} := \lfloor 2(u_t(1 + 2i) + v) \rfloor \bmod 2$ in the equation of the $i$-th measurement. Thus, guess that "$\ell(a_t x) = 0$" iff at least half of the $i$ satisfy $O_1 E_N^a(c_{t,i} x) = \ell(bx) + w_{t,i} + w'_{t,i} \bmod 2$.

Error analysis. There is an "additional" error if the equivalence $2[c_{t,i} x]_N < N \iff \lfloor 2(u_t(1 + 2i) + v) \rfloor = 0 \bmod 2$ does not hold. From the error analysis of $w_{t,i}$ in Section 2 we see that this event occurs only if $|2 c_{t,i} x|_N \le \frac{\varepsilon}{4} N$. The latter event is part of the error event $\mathrm{Err}_{t,i} = [\, |c_{t,i} x|_N \le \frac{\varepsilon}{4} N \,]$ and it cancels out: if the parities of $w_{t,i}$ and $w'_{t,i}$ are both incorrect these errors cancel each other. Therefore the error of the $i$-th measurement is covered by the event $\frac{\varepsilon}{8} N < |c_{t,i} x|_N \le \frac{\varepsilon}{4} N$ which has probability at most $\frac{\varepsilon}{4}$, the same as for RSA-inversion.

Time bounds. We use half the number of $v$ and the same number of $u$ as for RSA-inversion and thus the additional overhead for $E_N^a$-inversion is half that for RSA-inversion, while the number of oracle calls is the same. On the other hand, the method of [ACGS88] makes $E_N^a$-inversion $4^4$-times slower than RSA-inversion. Furthermore the improvements of Sections 3 and 4 carry over to $E_N^a$. This extends Theorems 2 and 3 from the RSA-function to the absolute Rabin-function $E_N^a$ and Theorem 5 and Corollary 6 from the RSA-RNG to the absolute/centered $x^2 \bmod N$-generator. We neglect the difference between the uniform distribution on $\mathbb{Z}_N$ and $\mathbb{Z}_N^*$. Note that the time to compute two Jacobi symbols, which comes with each oracle call, is negligible provided that $T \ge n \lg n$.
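The structural facts about the three Rabin variants used in this section can be checked exhaustively on a toy Blum integer. The sketch below takes $N = 7 \cdot 11 = 77$ (an illustrative choice; all helper names are assumptions) and verifies that $M_N$ has $|\mathbb{Z}_N^*|/4$ elements, that the absolute Rabin-function permutes $M_N$, that the absolute and centered generators output the same bits, and that the uncentered output bit differs from the centered one exactly when the centered residue is negative (since $N$ is odd).

```python
from math import gcd

def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n > 0, by the standard reciprocity loop."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def centered(y: int, N: int) -> int:
    """Residue of y mod N in the symmetric interval (-N/2, N/2)."""
    y %= N
    return y - N if 2 * y > N else y

def E_u(x, N): return x * x % N               # uncentered
def E_c(x, N): return centered(x * x, N)      # centered
def E_a(x, N): return abs(centered(x * x, N)) # absolute

def generator_bits(E, x0, N, m):
    """x_i := E(x_{i-1}), b_i := x_i mod 2 (Python's % gives 0/1 for negatives)."""
    bits, x = [], x0
    for _ in range(m):
        x = E(x, N)
        bits.append(x % 2)
    return bits

N = 7 * 11  # toy Blum integer: both primes are 3 mod 4
units = [x for x in range(1, N) if gcd(x, N) == 1]
M_N = [x for x in units if jacobi(x, N) == 1 and 2 * x < N]

if __name__ == "__main__":
    print(len(units), len(M_N))
    print(sorted(E_a(x, N) for x in M_N) == M_N)
    print(generator_bits(E_a, 4, N, 12), generator_bits(E_c, 4, N, 12))
```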

Theorem 7. The assertions of Theorems 2 and 3 hold for the absolute Rabin-function $E_N^a$ in place of the RSA-function $E_N$. Theorem 5 and Corollary 6 hold for the absolute/centered $x^2 \bmod N$-generator in place of the RSA-generator.

The strength of Theorem 7 is that it proves security provided that factoring Blum integers is hard. This follows because the problems of inverting $E_N^a$ and of factoring $N$ are equivalent [R79]. Here is the standard way of factoring $N = p \cdot q$ using an inversion algorithm for $E_N^a$. Let $D_N^a$ denote the permutation on $M_N$ that is inverse to $E_N^a$. Obviously $D_N^a E_N^a$ yields a 4-1 mapping from $\mathbb{Z}_N^*$ to $M_N$. Two inputs $x, y$ collide under this mapping if and only if $x = \pm y \bmod p$ and $x = \pm y \bmod q$. Exactly one of four colliding inputs is in $M_N$. Therefore the events $p \mid (z - D_N^a E_N^a(z))$ and $q \mid (z - D_N^a E_N^a(z))$ are independent for random $z \in \mathbb{Z}_N^*$, and each event occurs with probability $\tfrac12$. This shows that $\{\gcd(z \pm D_N^a E_N^a(z), N)\} = \{p, q\}$ holds with probability $\tfrac12$ for random $z \in \mathbb{Z}_N^*$. In particular $N$ can be factored in expected time $2 \cdot 3n(\lg n)\,\varepsilon^{-2}\, T + 3 \cdot 2^8 n^2 \varepsilon^{-4} \lg(8 n \varepsilon^{-1})$ using an oracle with $\varepsilon$-advantage in predicting $\ell(x)$ from given $E_N^a(x)$. This improves the $O(n^3 \varepsilon^{-8} T)$ time bound of [ACGS88].

Comparison with the muddle square method. It is interesting to compare the centered $x^2 \bmod N$-generator with the randomized $x^2 \bmod N$-generator implied by the general result of Goldreich and Levin [GL89, L93]: iteratively square $x_i \bmod N$ and output the scalar products $b_i = \langle x_i, z \rangle \bmod 2$ for $i = 1, \dots, m$ with a fixed random bit string $z$. Knuth [K97, Section 3.5, Theorem P] shows that $N$ can be factored in expected time $O(n^2 \delta^{-2} m^2 T(A) + n^4 \delta^{-2} m^3)$ for a non-negligible fraction of the $N$ if we are given a statistical test $A$ that rejects $(b_1, \dots, b_m)$ at tolerance level $\delta$. This yields a security guarantee for the muddle square method that is similar to the one of Corollary 6.

The problem of inverting the uncentered Rabin-function. Consider the permutations $E_N^c, E_N^u$ acting on the set of quadratic residues. The problems of inverting $E_N^c$ and $E_N^u$ are equivalent as we can easily transform one output into the other using that $E_N^u(x) - E_N^c(x) \in \{0, N\}$. We would like to modify the oracle algorithm for RSA-inversion to the inversion of $E_N^u$. Let $O_1$ be an oracle which, given $E_N^u(x)$ and $N$, predicts the least-significant bit of $x \in QR_N$ with advantage $\varepsilon$, i.e. $\Pr_{x,w}[O_1(E_N^u(x)) = \ell(x)] \ge \tfrac12 + \varepsilon$ for $x \in_R QR_N$ and the coin tosses $w$ of $O_1$. We face the problem that the correct interpretation of the oracle replies $O_1 E_N^u(c_{t,i} x)$ requires the quadratic residuosity of $c_{t,i} x$. For simplicity we will only use multipliers $c_{t,i}$ that are quadratic residues. The subgroup $QR_N$ of $\mathbb{Z}_N^*$ has order $|\mathbb{Z}_N^*|/4$. To compensate for the reduced density of the usable multipliers the inversion algorithm guesses initially an approximate location $uN$ for $[ax]_N$ with $\tfrac14$ times the distance for RSA-inversion.

Deciding quadratic residuosity. Blum, Blum and Shub have shown in Lemma 2 of [BBS86] that every oracle $O_1$ as above, which given $E_N^u(x)$ predicts with $\varepsilon$-advantage the least-significant bit of $x$, can be used to predict quadratic residuosity with $\varepsilon$-advantage: for all $a \in \mathbb{Z}_N^*(+1)$ and random $z \in_R QR_N$ the following equivalence holds with probability $\tfrac12 + \varepsilon$: $a \in QR_N \iff O_1 E_N^c(az) = \ell(az)$. So we predict "$a \in QR_N$" iff $O_1 E_N^c(az) = \ell(az)$. The resulting $\varepsilon$-advantage for quadratic residuosity can easily be amplified.
Specifically, to predict quadratic residuosity with error probability at most $\frac{1}{8m}$ we pick $\bar{m} := \varepsilon^{-2} \lceil \ln m \rceil$ independent $z_1, \dots, z_{\bar{m}} \in_R QR_N$, and we decide that "$a \in QR_N$" iff $O_1 E_N(a z_i) = \ell(a z_i)$ holds for the majority of the $i$. By Hoeffding's bound this majority decision has error probability at most $\exp(-2\bar{m}\varepsilon^2) \le \exp(-2 \ln m) = m^{-2} \le 1/8m$ for $m \ge 8$.

How to interpret the oracle. If we feed the oracle with $E_N^u(c_{t,i} x)$, where $c_{t,i} x \notin QR_N$, then the oracle's answer does not correspond to $c_{t,i} x$ but rather to the square root of $(c_{t,i} x)^2$ that resides in $QR_N \cap (0, N)$. This may happen if either $c_{t,i} \in \mathbb{Z}_N^*(+1) \setminus QR_N$ or $c_{t,i} x \notin \mathbb{Z}_N^*(+1)$. The case $c_{t,i} \notin \mathbb{Z}_N^*(+1)$ is easy to detect; in this case we discard the multiplier $c_{t,i}$. If $c_{t,i} x \in \mathbb{Z}_N^*(+1)$ then the oracle's answer corresponds to either $c_{t,i} x$ or to $-c_{t,i} x$ since $(-c_{t,i} x)^2 = (c_{t,i} x)^2$. If $c_{t,i} x \notin QR_N$ then we must reverse the guess $O_1 E_N^u(c_{t,i} x)$.

Inverting the uncentered Rabin-function. We describe how the algorithm differs from the RSA-inversion. Let $m$ be as for RSA-inversion. Let us first assume that 2 is in $QR_N$. Initially pick random $a, b \in_R \mathbb{Z}_N^*$ and produce all quadratic residues of either type $c_{1,i} := \frac{a}{2}(1 + 2i) + b$ and $c'_{1,i} := a i + \frac{b}{2}$ with $|1 + 2i| \le 4m$. On the average there are about $m$ quadratic residues of either type. Guess the closest rationals $uN, vN$ to $[ax]_N, [bx]_N$ so that $32\varepsilon^{-3} u$, $8\varepsilon^{-1} v$ are integers. Also guess $\ell(ax), \ell(bx)$. At stage $t$ determine $\ell(\frac{a}{2} x)$ by majority decision using oracle $O_1$ and the multipliers $c_{1,i} = \frac{a}{2}(1 + 2i) + b \in QR_N$.
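The amplification step for quadratic residuosity is a direct application of Hoeffding's bound. As a small numerical check, the Python sketch below (function names are illustrative) computes the number $\bar{m} = \varepsilon^{-2} \lceil \ln m \rceil$ of independent votes and the resulting bound $\exp(-2\bar{m}\varepsilon^2)$ on the error of the majority decision, and compares it with the target $\frac{1}{8m}$.

```python
import math

def votes_needed(m: int, eps: float) -> int:
    """m_bar = eps^-2 * ceil(ln m), rounded up to an integer number of votes."""
    return math.ceil(math.ceil(math.log(m)) / eps**2)

def majority_error_bound(votes: int, eps: float) -> float:
    """Hoeffding bound exp(-2 * votes * eps^2) for a majority of eps-advantage votes."""
    return math.exp(-2 * votes * eps**2)

if __name__ == "__main__":
    for m in (8, 100, 10**6):
        for eps in (0.1, 0.01):
            votes = votes_needed(m, eps)
            err = majority_error_bound(votes, eps)
            print(m, eps, votes, err, err <= 1 / (8 * m))
```

The bound depends on $\varepsilon$ only through the number of votes, so a smaller advantage costs more oracle calls but not a larger error.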

Details of the majority decision of $\ell(\frac{a}{2} x)$. Compute from $u, v$ the integer $w_{1,i}$ which most likely satisfies $[c_{1,i} x]_N = [\frac{a}{2} x]_N (1 + 2i) + [bx]_N + w_{1,i} \bmod 2$. Decide that $\ell(\frac{a}{2} x) = 0$ iff the majority of the multipliers $c_{1,i}$ satisfies $O_1 E_N(c_{1,i} x) = \ell(\frac{a}{2} x) + \ell(bx) + w_{1,i} \bmod 2$. Given $\ell(\frac{a}{2} x)$ we can in the same way determine $\ell(\frac{b}{2} x)$ using the multipliers $c'_{1,i} = a i + \frac{b}{2} \in QR_N$ with $|1 + 2i| \le 4m$. Then replace $a, b$ by $\frac{a}{2}, \frac{b}{2} \in \mathbb{Z}_N^*$ and go to the next stage. The new multipliers $c_{1,i}$ and $c'_{1,i}$ are again in $QR_N$ since we divide them by the quadratic residue 2.

The case that 2 is a quadratic non-residue. In this case we separately determine the quadratic residues of both types $c_{1,i}, c'_{1,i}$ for the stages $t = 1$ and $t = 2$. We use the quadratic residues of stage 1 at the odd stages $t$ and the quadratic residues of stage 2 at the even stages $t$. This is correct since we divide the residues by a power of 4 compared to stages 1 and 2.

Time bounds. The extra work in $E_N^u$-inversion, versus RSA-inversion, is to decide which of the $c_{t,i}$, $c'_{t,i}$ and $x$ are in $QR_N$. We first discard the multipliers with Jacobi symbol $-1$. Then we are left with about $4m + 1 = O(n \varepsilon^{-2})$ quadratic residuosity decisions. We have shown above that we can decide each of these quadratic residuosities, with error probability at most $\frac{1}{8m}$, using $\varepsilon^{-2} \lceil \ln n \rceil$ calls to the oracle $O_1$. So we get all $4m + 1$ decisions using $O(n \varepsilon^{-4} \lg(n \varepsilon^{-1}))$ oracle calls. Reducing the additional overhead as in Section 4 we get the following theorem which improves the $O(n^3 \varepsilon^{-11} T)$ time bound of [VV84].

Theorem 8. Using an oracle which given $E_N^c(x)$ predicts $\ell(x)$ with advantage $\varepsilon$ in time $T$, the (un-)centered Rabin-function $E_N^c$ can be inverted in expected time $O(n \varepsilon^{-4} \lg(n \varepsilon^{-1})\, T)$.

7 Conclusion. We give stronger security proofs for the input bits of the RSA/Rabin{function given the value of the function. Assuming that there are no faster algorithms for the invertion of the RSA/Rabin{function than via the currently known factoring algorithms we prove security for RSA{message bits, for the entire RSA{Random number generator and for the centered x2 mod N generator for modules N of practical size, e.g. of bit length 1,000 and 5,000, respectively. For the rst time this yields provably secure and practical random number generators under the assumption that factoring integers is hard. Other ecient RNG's have been proved to be secure under complexity assumptions that are less widely studied, e.g. [MS91], [FS96]. Our main result shows that the asymptotic theory of perfect random number generators originating from the work of Yao [Y82] and Blum, Micali [BM84] has a practical impact. Note that the statements of this theory only hold for suciently long seeds of the random generator, whereas in applications the seeds are limited by precise bounds, e.g. by bit length 1,000 or 5,000. Our results bridge the gap between asymptotics and praxis. To close this gap we give explicit constant factors for all time bounds. In addition we use a complexity assumption for the problem of factoring integers of bit length 1,000 and 5,000. Such a 21

precise assumption is possible by the extensive work on the factoring problem done over the last years, where the results of the theoretical analysis and of practical implementations of algorithms are surprisingly close, see [LL93]. The whole argument set forth in [ACGS88] and nalized in the present paper is not at all trivial. It is all the more surprising that this has a practical impact, a well designed theory can be practical. The present paper relies heavily on its precursor [ACGS88]. Speci cally we use the method of pairwise independent sampling for majority decision of [ACGS88]. We further re ne this method by the rule of subsample majority decision (SMAJ). SMAJ may be of independent interest and may have applications beyond this paper. The other main improvement over [ACGS88] originates from the observation that binary division is more appropriate for RSA{inversion and easier to analyze than the binary gcd method used in [BCS83] and [ACGS88]. In fact we have reinvented binary decision which has already been used in [GMT82]. Binary division bears fruits beyond the present work. In a subsequent paper [S98] we introduce a novel bit representation of modulo N integers so that the RSA{ message bits in the new representation are all individually secure. In a sense we not only improve the results of [ACGS88], we present the essentially optimal reduction from RSA{bit prediction to full RSA{inversion | optimal in the number of calls of the prediction oracle. The reduction of the number of oracle calls is possible by the SMAJ rule. The reduced number of oracle calls is minimal up to a factor O(lg n) for all algorithms for RSA{inversion in a suitable model of algorithms. Our complexity analysis separates the number of oracle calls from the remaining steps which we call the additional overhead. We substantially reduce the additional overhead by simulating the algorithm RSA{inversion in parallel for all values of the unknown approximation points (u; v). 
Our parallel simulation via an FFT-network takes almost linear time in the number of values of $(u, v)$ times the number $n$ of stages ($n$ is the bit length of the RSA-message). We believe that the latter factor $n$ can still be removed.
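To make the binary-division idea concrete, the following minimal Python sketch inverts RSA in the idealized case of a perfect least-significant-bit oracle. It is an illustration under stated assumptions, not the algorithm analyzed in this paper: the oracle here never errs (it is simulated with the secret exponent `d`), so no majority decisions, no SMAJ rule and no guessing of approximation points are needed. The function names and the toy parameters are hypothetical. The sketch tracks a rational approximation `w` of $[2^{-t}x]_N$, starting from the trivial guess 0, and halves the approximation error with each oracle call.

```python
from fractions import Fraction


def make_perfect_lsb_oracle(N, d):
    """Hypothetical perfect oracle: returns lsb(x) given E_N(x).

    Assumption for illustration only: it cheats by decrypting with the
    secret exponent d. A real predictor would be correct only with
    probability slightly above 1/2.
    """
    def oracle(c):
        return pow(c, d, N) & 1
    return oracle


def invert_rsa_by_binary_division(c, e, N, lsb_oracle):
    """Recover x from c = x^e mod N (N odd) using a perfect lsb oracle."""
    n = N.bit_length()            # N < 2^n
    inv2 = (N + 1) // 2           # inverse of 2 modulo the odd number N
    inv2_e = pow(inv2, e, N)      # E_N(2^{-1})
    w = Fraction(0)               # approximates y_0 = x with error < N
    c_t = c                       # ciphertext of y_t = [2^{-t} x]_N
    steps = n + 1
    for _ in range(steps):
        b = lsb_oracle(c_t)       # b = lsb(y_{t-1})
        # y_t = (y_{t-1} + b*N) / 2, so the same update halves the error.
        w = (w + b * N) / 2
        c_t = (c_t * inv2_e) % N  # E_N(y_t) = E_N(2^{-1}) * E_N(y_{t-1})
    # After n+1 steps the error is below 1/2, so rounding gives y exactly.
    y = round(w)
    return (y * pow(2, steps, N)) % N   # x = 2^steps * y mod N


# Toy parameters (N = 61 * 53), far too small for any real use.
N, e, d = 3233, 17, 2753
oracle = make_perfect_lsb_oracle(N, d)
x = 65
assert invert_rsa_by_binary_division(pow(x, e, N), e, N, oracle) == x
```

With an imperfect oracle each bit would instead have to be determined by a majority vote over many randomized oracle queries, which is where pairwise independent sampling and the SMAJ rule enter.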

Acknowledgement. We gratefully acknowledge the comments of D.E. Knuth and of an anonymous referee. We thank O. Goldreich for his sustained and insistent suggestions for improvements. All this led to a considerably improved presentation of the material.

References

[ACGS88] W. Alexi, B. Chor, O. Goldreich and C.P. Schnorr: RSA and Rabin Functions: certain parts are as hard as the whole. SIAM J. Comp. 17 (1988), pp. 194-209.

[BBS86] L. Blum, M. Blum and M. Shub: A Simple Unpredictable Pseudo-Random Number Generator. SIAM J. Comp. 15 (1986), pp. 364-383.

[BCS83] M. Ben-Or, B. Chor and A. Shamir: On the Cryptographic Security of Single RSA-Bits. Proc. 15th ACM Symp. on Theory of Computation, April 1983, pp. 421-430.


[BLP93] J.P. Buhler, H.W. Lenstra, Jr. and C. Pomerance: Factoring Integers with the Number Field Sieve. In: The Development of the Number Field Sieve (Eds. A.K. Lenstra, H.W. Lenstra, Jr.), Springer LNM 1554 (1993), pp. 50-94.

[BK83] R.P. Brent and H.T. Kung: Systolic VLSI arrays for linear time gcd computation. VLSI 83, IFIP, F. Anceau and E.J. Aas, eds., Elsevier Science Publishers (1983), pp. 145-154.

[BM84] M. Blum and S. Micali: How to Generate Cryptographically Strong Sequences of Pseudorandom Bits. SIAM J. Comp. 13 (1984), pp. 850-864.

[FS96] J.B. Fischer and J. Stern: An Efficient Pseudo-Random Generator Provably as Secure as Syndrome Decoding. Proc. EUROCRYPT'96, Springer LNCS 1070 (1996), pp. 245-255.

[FS97] R. Fischlin and C.P. Schnorr: Stronger Security Proofs for RSA and Rabin Bits. Proc. EUROCRYPT'97, Springer LNCS 1233 (1997), pp. 267-279. This is a preliminary version of the current paper.

[G95] O. Goldreich: Three XOR-Lemmas - An Exposition. ECCC TR95-056, http://www.eccc.uni-trier.de/eccc/.

[GL89] O. Goldreich and L.A. Levin: Hard Core Bit for any One Way Function. Proc. of ACM Symp. on Theory of Computing (1989), pp. 25-32.

[GMT82] S. Goldwasser, S. Micali and P. Tong: Why and How to Establish a Private Code on a Public Network. Proc. 23rd IEEE Symp. on Foundations of Computer Science, Nov. 1982, pp. 134-144.

[HSS93] J. Håstad, A.W. Schrift and A. Shamir: The Discrete Logarithm Modulo a Composite Hides O(n) bits. J. of Computing and Systems Science 47 (1993), pp. 376-404.

[H63] W. Hoeffding: Probability inequalities for sums of bounded random variables. J. Amer. Stat. Ass. 58 (1963), pp. 13-30.

[K97] D.E. Knuth: Seminumerical Algorithms, 3rd edn. Addison-Wesley, Reading, MA (1997). Also Amendments to Volume 2, January 1997. http://www-cs-staff.Stanford.EDU/~uno/taocp.html

[L93] L.A. Levin: Randomness and Nondeterminism. J. Symbolic Logic 58 (1993), pp. 1102-1103.

[LV97] M. Li and P. Vitanyi: An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, New York, Second Ed., 1997.

[LL93] A.K. Lenstra and H.W. Lenstra, Jr. (Eds): The development of the number field sieve. Springer LNM 1554 (1993).

[MS91] S. Micali and C.P. Schnorr: Efficient, Perfect Polynomial Random Number Generators. J. Cryptology 3 (1991), pp. 157-172.

[MR95] R. Motwani and P. Raghavan: Randomized Algorithms. Cambridge University Press, Cambridge UK, 1995.

[NZ80] I. Niven and H.S. Zuckerman: An Introduction to the Theory of Numbers. John Wiley, New York (1980).

[O95] A.M. Odlyzko: The Future of Integer Factorization. CryptoBytes, RSA Laboratories, 1 (1995), pp. 5-12.

[P92] R. Peralta: On the Distribution of Quadratic Residues and Non-residues Modulo a Prime Number. Math. Comp., 58

[R79] M.O. Rabin: Digital signatures and public key functions as intractable as factorization. TM-212, Laboratory of Computer Science, MIT, 1979.

[RSA78] R.L. Rivest, A. Shamir and L. Adleman: A Method for Obtaining Digital Signatures and Public Key Cryptosystems. Comm. ACM 21 (1978), pp. 120-126.

[S98] C.P. Schnorr: Security of Arbitrary RSA and of 'All' Discrete Log Bits. Preprint, University Frankfurt (1998).

[VV84] U.V. Vazirani and V.V. Vazirani: Efficient and Secure Pseudo-Random Number Generation. In Proc. 25th Symp. on Foundations of Computing Science (1984), IEEE, pp. 458-463.

[Y82] A.C. Yao: Theory and Application of Trapdoor Functions. Proc. of IEEE Symp. on Foundations of Computer Science (1982), pp. 80-91.
