The Distribution of the Spectrum for the Discrete ... - Semantic Scholar

Report 5 Downloads 82 Views
IEICE TRANS. FUNDAMENTALS, VOL.E88–A, NO.1 JANUARY 2005

67

PAPER

Special Section on Cryptography and Information Security

The Distribution of the Spectrum for the Discrete Fourier Transform Test Included in SP800-22 Kenji HAMANO†a) , Member

SUMMARY In this paper, the problem in the distribution of the test statistic of the Discrete Fourier Transform (DFT) test included in SP80022 released by the National Institute of Standards and Technology (NIST), which causes a very high rate of rejection compared with the significance level, is considered on the basis of the distribution of the spectrum. The statistic of the DFT test, which was supposed to follow the standard normal distribution N(0, 1) according to the central limit theorem, seems to follow the normal distribution N(0.691, 0.5) approximately. The author derived the distribution function of √ the spectrum, and changed the √ threshold value from the default value of 3n to the value of 1.7308 · · ·× n, where n is the length of a random number sequence. By this modification, the test statistic becomes to follow the normal distribution N(0, 0.5) approximately. The disagreement between this variance (= 0.5) and that of the standard normal distribution (= 1) can be considered to originate in the dependence of the spectrum. The evidences of the dependence are shown. key words: discrete Fourier transform, random number, statistical test, SP800-22, NIST, central limit theorem

1.

Introduction

Random number is used in many different kinds of applications; adequate randomness is needed there. The word “random” can be defined as the result of the flips of an unbiased “fair” coin with sides that are labeled “0” or “1,” with each flip having a probability of exactly 12 of producing a “0” or “1.” All elements of a sequence that is generated by flips of this fair coin are independent, and the value of the next element cannot be predicted. The famous statistical test suites for randomness are [4]–[6]. Our research section scrutinized SP800-22 [7] released by NIST. That consists of sixteen tests; among them, the DFT test was of greatest concern to us. The DFT test detects periodic features of a random number sequence. Those features play an important role in the cryptanalysis. However, we were not able to verify the threshold value used in the DFT test, because there was no information about the distribution of the spectrum in SP800-22. To check on the DFT test, we consider the distribution of the spectrum on the basis of both theoretical and numerical examination. In a recent report [2], we arrived at the conclusion: the default threshold value of the DFT test is not correct. In addition, we gave its correct value in the report. Although we were able to revise the threshold, the numerical simulation showed that the variance of the test statistic, whose variance Manuscript received March 22, 2004. Manuscript revised June 9, 2004. Final manuscript received August 10, 2004. † The author is with the Technical Research and Development Institute, Japan Defense Agency, Tokyo, 154-8511 Japan. a) E-mail: [email protected]

was assumed to be the value of 1 in SP800-22 on the basis of the central limit theorem, was nearly the value of 0.5. The research of Kim et al., which had been carried out independently of our research, also obtained the correct threshold value [3]. In their report, they concluded that the correct value of the variance of the test statistic is exactly the value of 0.5, however, their statement that it is due to symmetry of the spectrum is patently incorrect, because half of the spectrums are discarded in the DFT test. Therefore, the correct value of the variance of the test statistic remains unsolved. At present, the reason why the variance of the test statistic is not the value of 1 is not fully explained and the correct value of the variance of the test statistic is not theoretically derived. In order to deal with this problem, we consider the distribution of a set of spectrums that is derived from one random number sequence by the use of the DFT, because such a set of spectrums is used to compute the test statistic of the DFT test. The distribution of a set of spectrums is not investigated in the report of Kim et al. [3], and we look more carefully into the distribution of the spectrum than their report. In this paper, we consider the following points. 1. The distribution of a set of real parts. 2. The distribution of the sum of a set of real parts, imaginary parts, and spectrums. Based on the above points, we show that the spectrums are mutually dependent. These points are considered to explain the disagreement of the variance of the test statistic, because the independence of the random variables is the condition for applying the central limit theorem to the random variables. We also show that the variance of the test statistic varies with changes in the threshold value. Therefore, the variance of the test statistic seems to be a function of the threshold value. Moreover, we show the practical problem when the DFT test is used without our change. Using the DFT test leads to an increase in the probability of the Type 1 error: even though a sequence has true randomness, the DFT test concludes that the sequence is non-random. This problem has a profound influence on both software and hardware developer for the random number generation. For example, even though an engineer presents an excellent random number generator (RNG), its evaluation report based on SP80022 will reject the RNG because of a flaw detected by the DFT test. In addition, when the security system using the good RNG gets checked by SP800-22 before the actual op-

c 2005 The Institute of Electronics, Information and Communication Engineers Copyright 

IEICE TRANS. FUNDAMENTALS, VOL.E88–A, NO.1 JANUARY 2005

68

eration, the generator will fail the DFT test, and the system will not be able to be operated for a long time. 2.

8. Compute a test statistic: N1 − N0 N1 − E[N1 ] . = √ d= √  (n + 1)0.95 × 0.05 Var[N1 ]

Discrete Fourier Transform Test

In this section, we explain the procedure of the DFT test. Let n be the number of bits in the original input sequence ε that is generated by the RNG and consists of zeros and ones. Let εi be the ith bit in the sequence ε. Let E[·] be the expected value of a random variable. Let Var[·] be the variance of a random variable. The procedure of the DFT test is as follows. 1. Choose the parameter α called the significance level in the range [0.001, 0.01]. It denotes the probability of the Type 1 error. 2. The zeros and ones of the input sequence ε are converted to values of −1 and 1 to produce the sequence n−1 ), where X (= {xi }i=0 xi = 2εi − 1 (0 ≤ i ≤ n − 1). 3. Apply the DFT on X to produce a sequence of complex numbers { f j }n−1 j=0 . Since the complex conjugate of f j (= f¯j ) equals fn− j , half of the complex numbers, namely { f j }n−1 j=n +1 , are discarded, where 

n =



n 2 − 1, n−1 2 ,

if n is even; if n is odd.

4. Calculate a sequence of absolute values of f j (=  {| f j |}nj=0 ). 5. Compute a threshold value (default) √ T 0.95 = 3n. 

According to SP800-22, 95% of the values {| f j |}nj=0 are supposed to be < T 0.95 under an assumption of randomness. 6. Compute N0 = 0.95(n + 1). 7. Compute N1 = #{ j | | f j | < T 0.95 , 0 ≤ j ≤ n }. We can see N1 as the sum of n + 1 Bernoulli random variables that take on the value of one with probability 0.95 and the value of zero with probability 0.05. Let Y j be a Bernoulli random variable, where Y j = 1 if | f j | < T 0.95 and Y j = 0 otherwise. Since it is supposed that Pr(| f j | < T 0.95 ) = 0.95, Y j = 1 with probability 0.95 and Y j = 0 with probability 0.05; thus, the  equation N1 = nj=0 Y j is derived. By the use of this equation, the expected value and the variance of N1 are calculated as 

E[N1 ] = (n + 1) × 0.95 = N0 ; Var[N1 ] = (n + 1) × 0.95 × 0.05. Notice that a Bernoulli process consists of repeatedly performing independent but identical Bernoulli trials.

According to the central limit theorem, d can be considered to follow the standard normal distribution N(0, 1) as n approaches ∞. 9. Compute a P-value:  ∞ 2 exp(−u2 )du √ π √|d|2  2  ∞ 1 u = du √ exp − 2 |d| 2π  2  −|d| 1 u + du. √ exp − 2 −∞ 2π The P-value is defined by the probability that a variate would assume a value greater than or equal to the observed value of |d| strictly by chance. If d follows N(0, 1), P-value follows the uniform distribution. 10. If the P-value ≥ α, then ε appears to be random. If the P-value < α, then ε appears to be non-random. SP800-22 recommends that n ≥ 1000. 3.

The Distribution Function of | f j |

In this section, we derive the distribution function of the random variable | f j | of which there is no information in SP80022, where f j is computed by the use of the following equation (DFT).   n−1  2πk j fj = xk exp i . (1) n k=0 We also give the correct value of T 0.95 by using the distribution function. Let c j , s j be the real and imaginary part of f j , respectively: f j = c j + is j , where i is the imaginary unit. 3.1 When j = 0 n−1 xk . When j = 0, Eq. (1) becomes the equation f0 = k=0 Thus, f0 is an integer number. The distribution of | f0 | is derived as follows. When f0 = r, where r is an integer number, n+r ; #{k | xk = 1, 0 ≤ k ≤ n − 1} = 2 n−r #{k | xk = −1, 0 ≤ k ≤ n − 1} = . 2 Therefore, when r > 0, Pr(| f0 | = r) = Pr( f0 = r) + Pr( f0 = −r)   1 n = n+r n−1 , 2  2 n 1 Pr(| f0 | = 0) = n n . 2 2

HAMANO: THE DISTRIBUTION OF THE SPECTRUM FOR THE DISCRETE FOURIER TRANSFORM TEST INCLUDED IN SP800-22

69

3.2 When j  0 The real part of f j (= c j ) is expressed as the following: cj =

n−1 

ak, j xk ,

k=0

 j where ak, j = cos 2πk n . The characteristic function of c j denoted by φ(t) is expressed as the following:  n−1     φ(t) = E[exp(itc j )] = E  exp(it(ak, j xk )) k=0

n−1  n−1  E exp(it(ak, j xk )) = cos(ak, j t); = k=0

lows the χ2 distribution with 2 degrees of freedom when c j and s j are mutually independent. Thus, the distribution function of | f j | is expressed as the following:   2    1 x  .  Pr(| f j | ≤ x) = 1 − exp − (4) √ 2 n/2 The positive solution of the equation:  2     1 x  = 0.95 1 − exp − √ 2 n/2

k=0

 thus, log φ(t) = n−1 k=0 log cos(ak, j t). The Taylor series of log cos(x) about a point x = 0 has the form: x6 x2 x4 − + O(x7 ). log cos(x) = − − 2 12 45  n−1 4 n 3n 2 When n−1 k=0 ak, j (= 2 ), k=0 ak, j (= 8 ) are designated by a, b, respectively, the following equation is obtained: b a log φ(t) = − t2 − t4 + O(t5 ). 2 12 When the function log φ(t) is denoted by g(t), the characteristic function φ(t) = exp(g(t)) is expressed as the following Taylor series:   a 2 1 a2 b 4 − φ(t) = 1 − t + (2) t + O(t5 ). 2 4 2 3 Let µl be the lth moment about the origin of a random variable Z:   1 dl  l (3) µl = E[Z ] = l l φ(t) . t=0 i dt and let µl be the lth moment about the mean of a random variable Z: µl = E[(Z − µ)l ], where E[Z] = µ. Since E[c j ] = 0, the moments of c j can be calculated as below by the use of the relation µl = µl and Eq. (3). µ1 = 0, µ2 =

Likewise, s j approximately follows N(0, n2 ), because the imaginary part of f j (= s j) is expressed as s j = n−1 2 n−1 2πk j k=0 bk, j xk , where bk, j = sin k=0 bk, j = n , and  n−1 4 n 3n 2 , k=0 bk, j = 8 . We are now ready to consider the distribution function   2 2  2 | fj| cj sj of | f j |. Since √n/2 = √n/2 + √n/2 and c j , s j can  2 | fj| folbe considered to follow N(0, n2 ) approximately, √n/2

n 3n(n − 1) , µ3 = 0, µ4 = . 2 4

Thus, the  skewness of c j is zero and the kurtosis of c j is  3 1 − 1n . This kurtosis converges with 3 when n approaches ∞. These are consistent with the values of the normal distribution. Therefore, c j can be considered to follow N(0, n2 ) approximately.

gives the correct threshold value: T 0.95 = 1.730818382602285 · · · ×



n.

(5)

Notice that what we have shown in this section is that c j , s j (1 ≤ j ≤ n ) follow N(0, n2 ) separately, not that a set of   real parts (= {c j }nj=1 ) and a set of imaginary parts (= {s j }nj=1 ) follow N(0, n2 ) as a whole. 4.



The Distribution of { f j } nj=1 

In this section, we shall discuss the distribution of {| f j |}nj=1 . As shown in the previous section, the distribution of | f j | when j = 0 is inconsistent with that of | f j | when 1 ≤ j ≤ n ;   thus, we use the distribution of { f j }nj=1 instead of { f j }nj=0 . Now, we would like to look closely at the distribution of   {c j }nj=1 , {s j }nj=1 . 

4.1 The Distribution of {c j }nj=1 For the sake of simplicity, let us just consider the case when n is a prime number in this subsection. Since   n n n−1    2πk j  c j = n x0 + xk cos n j=1 j=1 k=1 n−1 xk = n x0 − k=1 , 2 the following values are obtained.  n    E  c j  = n x0 ;

(6)

j=1

  2  n    n − 1  . E  c j − n x0   = 4 j=1

(7)

IEICE TRANS. FUNDAMENTALS, VOL.E88–A, NO.1 JANUARY 2005

70 

This means that a mean of a set of real parts (= {c j }nj=1 ) is 

n −n

with probability 1/2; with probability 1/2;

If we make n independent observations of the random quantity X, thereby obtaining the values X1 , X2 , . . . , Xn , the empirical distribution function Fn (x) is specified by the following.



thus, we found that {c j }nj=1 has two asymptotic distribution.  To make {c j }nj=1 have one asymptotic distribution, we propose the following conversion from the sequence ε to the sequence X.  2εi − 1, if ε0 = 1; xi = (8) 1 − 2εi , if ε0 = 0. By the use of this conversion, the value of x0 is fixed at 1. In the hereafter, we presuppose that this conversion is used. In addition, the following equations are used instead of f j , c j , and s j . f j − x0 = (c j − x0 ) + i · s j ;   n−1  2πk j xk cos c j − x0 = ; n k=1   n−1  2πk j xk sin . sj = n k=1 The characteristic function of c j − x0 is expressed as Eq. (2),  n−1 2 3n 4 ak, j = n2 −1 and b = n−1 where a = k=1 k=1 ak, j = 8 −1; thus, n c j − x0 approximately follows N(0, 2 − 1). Since n ≥ 1000, the approximate distribution of c j − x0 is also considered to be N(0, n2 ). 4.2 The Statistical Correlation We show that the correlation between c j − x0 and ck − x0 ( j  k) is not 0. The covariance of random variables Z1 , Z2 designated as Cov(Z1 , Z2 ) is E[Z1 Z2 ] − E[Z1 ]E[Z2 ], and the correlation Cov(Z1 ,Z2 ) √ coefficient ρ(Z1 , Z2 ) = √Var[Z ; thus, if E[Z1 Z2 ] = 1 ] Var[Z2 ] E[Z1 ]E[Z2 ], then ρ(Z1 , Z2 ) = 0. It means that there is no correlation between Z1 with Z2 . Notice that generally speaking, ρ(Z1 , Z2 ) = 0 does not mean Z1 and Z2 are mutually  2πi j  indepen   cos cos 2πik = dent. Since E[(c j −x0 )(ck −x0 )] = n−1 i=1 n n −1  0, c j − x0 is correlated with ck − x0 ( j  k). Therefore, On the conc j − x0 (1 ≤ j ≤ n ) are mutually   dependent.  n−1 2πi j 2πik trary E[s j sk ] = i=1 sin n sin n = 0 ( j  k); thus, s j (1 ≤ j ≤ n ) are not correlated with each other. 5.

Fn (x) =

number of X1 , X2 , . . . , Xn that are ≤ x . n

The following is a test statistic of the KS test. √ Kn+ = n max (Fn (x) − F(x)). −∞<x