Pseudorandomness for permutation and regular branching programs

Anindya De
Computer Science Division, University of California, Berkeley
Berkeley, CA 94720, USA

Abstract: In this paper, we prove the following results about the INW pseudorandom generator:
• It fools constant width permutation branching programs with error $\epsilon$ using a seed of length $O(\log n \cdot \log(1/\epsilon))$.
• It fools constant width regular branching programs with error $\epsilon$ using a seed of length $O(\log n \cdot (\log\log n + \log(1/\epsilon)))$.
These results match the recent results of Koucký et al. (STOC 2011), Braverman et al., and Brody and Verbin (FOCS 2010). However, our analysis gives a better dependence of the seed length on the width for permutation branching programs than the results of Koucký et al. (STOC 2011). Perhaps more significantly, our proof method is entirely different and linear algebraic in nature, as opposed to the group theoretic methods of [1] and the information theoretic and probabilistic methods of [2], [3]. Along the way, we also obtain pseudorandom generators for the “small biased spaces” for group products [4] with a seed length $O(\log n \cdot (\log|G| + \log(1/\epsilon)))$. Previously, it was possible to get $O(\log n \cdot (|G|^{O(1)} + \log(1/\epsilon)))$ using the pseudorandom generator of [1].

Keywords: branching programs; INW generator; expander products

I. INTRODUCTION

One of the most fundamental questions in complexity theory is whether one can save on computational resources like space and time by using randomness. While it is known that randomness is indispensable in settings like cryptography and distributed computation, a long line of research [5], [6], [7], [8] has shown that, assuming appropriate lower bounds on the circuit complexity of some functions, one can derandomize every randomized polynomial time algorithm, i.e., show P = BPP. Since it is hard to unconditionally achieve non-trivial derandomization of complexity classes like BPP, attention was directed towards derandomization of low-level complexity classes where it may be possible to achieve unconditional results. In fact, it has also been shown that any non-trivial derandomization of BPP implies circuit lower bounds [9] which seem out of reach of the present state of the art.

One of the most important problems in this line of research is to derandomize bounded space computation. The ultimate aim is to prove RL = L, i.e., to show that any computation that can be solved in randomized logspace can be simulated in deterministic logspace. Savitch [10] showed that $\mathrm{RL} \subseteq \mathrm{NL} \subseteq \mathrm{L}^2$, i.e., randomized logspace computation and, in fact, non-deterministic logspace computation can be simulated deterministically in $O(\log^2 n)$ space. Nisan [11] also showed that $\mathrm{RL} \subseteq \mathrm{L}^2$ by constructing a pseudorandom generator (PRG) which stretches a seed of length $O(\log^2 n)$ into $n$ bits that fool logspace machines. In fact, Nisan’s PRG fools read-once branching programs (which we define next) of polynomial length and width.

Definition 1.1: A read-once branching program (BP) of width $w$ and length $n$ is a directed multilayer graph with $n + 1$ layers such that each layer has $w$ nodes, with edges going from the $i$th layer to the $(i+1)$th layer ($0 \le i \le n-1$). For every node (except those in the last layer), there are exactly two edges leaving that node, with one marked 0 and the other marked 1. There is a designated start state in the first layer and a set of “accepting” states in the $(n+1)$th layer.

Remark 1.2: In this paper, whenever we refer to branching programs, we mean read-once branching programs. We note that if the read-once restriction is not imposed, then in fact width 5 branching programs capture $\mathrm{NC}^1$ [12].

A BP is said to be a permutation branching program (PBP) if in every layer the transitions labelled 0 (resp. 1) form a matching. A BP is said to be a regular branching program (RBP) if the number of edges coming into every node is either 0 or 2. A given BP accepts an input $x \in \{0,1\}^n$ if, starting from the start state and following the path specified by the input, it ends in one of the accepting states. It is not hard to see that randomized logspace computation is a uniform version of BPs with $w = n^{O(1)}$.

Coming back to PRGs for branching programs, after [11], several papers [13], [14] improved on some parameters of the construction in [11].
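Definition 1.1 and the permutation and regular restrictions can be made concrete with a small sketch. The encoding below (a layer as a pair of transition tuples, one per bit) and all function names are illustrative assumptions, not notation from the paper.

```python
# A layer is a pair (t0, t1): t_b[u] is the node reached from node u on bit b.
# This list-of-layers encoding is a hypothetical choice made for illustration.

def run(layers, start, x):
    """Follow the path labelled by the bit string x and return the final node."""
    state = start
    for (t0, t1), bit in zip(layers, x):
        state = t1[state] if bit else t0[state]
    return state

def accepts(layers, start, accepting, x):
    """True if the path labelled by x ends in an accepting node."""
    return run(layers, start, x) in accepting

def is_permutation(layers):
    """The 0-transitions and the 1-transitions of every layer are matchings."""
    return all(len(set(t)) == len(t) for layer in layers for t in layer)

def is_regular(layers):
    """Every node that receives edges from the previous layer has in-degree 0 or 2."""
    for t0, t1 in layers:
        indegree = [0] * len(t0)
        for v in list(t0) + list(t1):
            indegree[v] += 1
        if any(deg not in (0, 2) for deg in indegree):
            return False
    return True

# Width-2 parity program of length 4: bit 0 keeps the state, bit 1 swaps it.
parity = [((0, 1), (1, 0))] * 4
# A width-2 layer that is neither a permutation layer nor regular.
and_like = [((0, 0), (0, 1))]
```

In this encoding every permutation branching program passes `is_regular` as well, since each node receives exactly one 0-edge and one 1-edge.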
However, improving on the $O(\log^2 n)$ seed length remained (and continues to remain) open in the following minimal sense: it is not known how to construct a PRG with seed length $o(\log^2 n)$ and constant error for width 3 BPs. Faced with this difficulty, research focussed on solving special cases of this problem with better seed lengths ([15], [16], [17]). Attention has also been directed towards getting better seed length when there is some structural restriction on the BPs. In particular, when the branching program is regular, Braverman et al. [2] and Brody and Verbin [3] constructed pseudorandom generators with seed length $O(\log n \cdot (\log\log n + \log(1/\epsilon)))$ which fool constant width branching programs with error $\epsilon$. The dependence of the seed length on the width was better in the latter paper, which obtained $O(\log n \cdot (\log w + \log\log n + \log(1/\epsilon)))$. Koucký, Nimbhorkar and Pudlák [1] improved on this further for the case of permutation branching programs. In particular, for fooling constant width permutation branching programs with error $\epsilon$, they got a seed length of $O(\log n \cdot \log(1/\epsilon))$. In their analysis, they transform this problem into the language of group products¹ and then analyze their construction using basic properties of groups. In this vein, we should also mention that Šíma and Žák recently got a breakthrough by constructing a hitting set with seed length $O(\log n)$ for width-3 branching programs [18]. However, their techniques seem totally disjoint from other works in this line of research.

A. Our results

We present a pseudorandom generator which $\epsilon$-fools permutation branching programs of length $n$ and width $w$ using a seed of length $O(\log n \cdot (w^8 + \log(1/\epsilon)))$. For regular branching programs, we get a seed length of $O(\log n \cdot (w^8 + \log\log n + \log(1/\epsilon)))$. The PRG we use is the INW generator [13] (as in the previous works [1], [2], [3]). We remark that Koucký et al. obtained a seed length of $O(\log n \cdot (w! + \log(1/\epsilon)))$ for fooling permutation branching programs using the same generator. What we consider more significant is that our analysis is based on analyzing spectra of the stochastic matrices that arise in the transitions of the branching program. We thus view it as a more linear algebraic approach, which might be helpful in other contexts as well. This is in contrast to the result of Koucký et al.
which is based on a group theoretic approach and is thus difficult to adapt to more combinatorial settings. We remark here that ours is not the first result to use a linear algebraic analysis of the INW generator. Previously, Rozenman and Vadhan [19] used a linear algebraic analysis of the INW generator to prove SL = L (following Reingold’s seminal result [20]).

We also consider the small-bias spaces problem for group products, first considered by Meka and Zuckerman in [4]. Let $G$ be a group. The task is to construct a distribution $D$ over $G^n$ such that the following holds. Let $x_1, \ldots, x_n \in \{0,1\}$ and let $U_{G^n}$ be the uniform distribution over $G^n$. Let $g = (g_1, \ldots, g_n)$ be sampled from $D$. Then, for any $h \in G$,

\[ \left| \Pr_{g \sim D}\left[ g_1^{x_1} \cdots g_n^{x_n} = h \right] - \Pr_{g \sim U_{G^n}}\left[ g_1^{x_1} \cdots g_n^{x_n} = h \right] \right| \le \epsilon . \]

The previous best result that could be obtained using the works of [4] and [1] is $O(\log n \cdot (|G|^{O(1)} + \log(1/\epsilon)))$. We show how to get $O(\log n \cdot (\log|G| + \log(1/\epsilon)))$ with a much simpler analysis.

¹ We discuss this later.

Because of lack of space, the proof for both arbitrary regular and permutation branching programs is deferred to the appendix. Instead, we consider the much simpler case of fooling a particular kind of permutation branching program, namely abelian group products (this is different from the problem in [4]). While the problem is much simpler, it has some of the ideas that will be used in the analysis for the more general case of permutation and regular branching programs. Moreover, the complexity of the analysis in [1] seemed to stem from the fact that the group underlying the permutation branching program has proper subgroups. We show that it is rather the non-commutativity of the group that makes the analysis complicated. We next analyze the INW generator in detail and explain the main idea behind the improved analysis of the generator.

II. TECHNICAL OVERVIEW

A. Impagliazzo-Nisan-Wigderson generator

The PRG used in this paper is the construction of Impagliazzo, Nisan and Wigderson [13] (hereafter referred to as the INW generator). We now describe their construction. First, let us recall the following important fact about the construction of expander graphs [21].

Fact 2.1: For any $n$ and $\lambda > 0$, there exist graphs on $\{0,1\}^n$ with degree $d = (1/\lambda)^{\Theta(1)}$ and second eigenvalue $\lambda$ such that, given any vertex $x \in \{0,1\}^n$ and an edge label $i \in [d]$, the $i$th neighbor of $x$ is computable in $n^{\Theta(1)}$ time.

The INW generator is defined recursively as follows. Let $\Gamma_0 : \{0,1\}^t \to \{0,1\}$ be simply the function which maps a $t$-bit string to its first bit. Assume $\Gamma_{i-1} : \{0,1\}^m \to \{0,1\}^{\ell}$. Then $\Gamma_i : \{0,1\}^{m + \log d} \to \{0,1\}^{2\ell}$ is defined as follows. Let $x = y \circ z \in \{0,1\}^{m + \log d}$ such that $y$ is $m$ bits long and $z$ is $\log d$ bits long. Let $H$ be a graph on $2^m$ vertices constructed using Fact 2.1. Let $y'$ be the $z$th neighbor of $y$ in $H$. Then $\Gamma_i(x) = \Gamma_{i-1}(y) \circ \Gamma_{i-1}(y')$. Here and elsewhere, $\circ$ denotes concatenation.
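The recursion defining the INW generator can be sketched in a few lines. The function `toy_neighbor` below is a hypothetical stand-in for the neighbor function of the graph from Fact 2.1: it is a simple affine map chosen only so that the example runs, and it is not claimed to have any spectral expansion. The integer encoding of seeds is likewise an assumption made for illustration.

```python
def toy_neighbor(y, z, m):
    """Stand-in for the z-th neighbor of vertex y in a graph on 2**m vertices.

    Assumption: this affine map is only a placeholder. It is NOT an explicit
    expander and carries none of the guarantees of Fact 2.1.
    """
    return ((2 * z + 1) * y + z) % (1 << m)

def inw(i, seed, t, log_d, neighbor=toy_neighbor):
    """Return the 2**i output bits of the level-i generator on the given seed.

    seed is an integer holding t + i * log_d bits. Its low log_d bits play
    the role of z (the edge label) and the remaining bits the role of y.
    """
    if i == 0:
        # Base case: output the first (most significant) of the t seed bits.
        return [(seed >> (t - 1)) & 1]
    m = t + (i - 1) * log_d                 # seed length of level i - 1
    y, z = seed >> log_d, seed & ((1 << log_d) - 1)
    y_prime = neighbor(y, z, m)             # the z-th neighbor of y
    return (inw(i - 1, y, t, log_d, neighbor)
            + inw(i - 1, y_prime, t, log_d, neighbor))

# With t = 3 and log d = 2, a 9-bit seed is stretched to 2**3 = 8 bits.
bits = inw(3, 0b101101110, t=3, log_d=2)
```

Each level adds only `log_d` seed bits while doubling the output length, which is where the seed length of the form $t + i \log d$ comes from.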
From the above, one can easily see that $\Gamma_i : \{0,1\}^{t + i \log d} \to \{0,1\}^{2^i}$. Since $t$ can be any positive integer, taking $t = 1$ gives $\Gamma_{\log n} : \{0,1\}^{1 + \log n \cdot \log d} \to \{0,1\}^n$. As $d = (1/\lambda)^{\Theta(1)}$, the INW generator stretches a seed of length $O(\log n \cdot \log(1/\lambda))$ to $n$ bits.

Remark 2.2: In order to tackle the problem of small biased spaces for group products, it will be necessary for us to define the INW generator so that it produces elements from a bigger alphabet. The construction in this case is as follows: we assume that we want to produce elements from some set $G$. For some $t \ge \log|G|$, let $\Gamma_0 : \{0,1\}^t \to G$ be simply the function which takes the first $\log|G|$ bits of its input and interprets them as an element of $G$. Assume $\Gamma_{i-1} : \{0,1\}^m \to G^{\ell}$. Then $\Gamma_i : \{0,1\}^{m + \log d} \to G^{2\ell}$ is defined as follows. Let $x = y \circ z \in \{0,1\}^{m + \log d}$ such that $y$ is $m$ bits long and $z$ is $\log d$ bits long. Let $H$ be a graph on $2^m$ vertices constructed using Fact 2.1. Let $y'$ be the $z$th neighbor of $y$ in $H$. Then $\Gamma_i(x) = \Gamma_{i-1}(y) \circ \Gamma_{i-1}(y')$. From the above, one can easily see that $\Gamma_i : \{0,1\}^{t + i \log d} \to G^{2^i}$. As we can put $t$ to be anything as long as it is at least $\log|G|$, we get that $\Gamma_{\log n} : \{0,1\}^{\log|G| + \log n \cdot \log d} \to G^n$. As $d = (1/\lambda)^{\Theta(1)}$, the INW generator stretches a seed of length $O(\log|G| + \log n \cdot \log(1/\lambda))$ to $n$ elements of $G$.

We now get back to the analysis of the INW generator. For the purposes of this discussion, we assume that the INW generator is producing bit strings as opposed to elements of $G$.

B. Analysis of INW generator in terms of stochastic matrices

To understand the analysis of the INW generator from [13] as well as the improvements in this paper, it is helpful to look at branching programs from the following viewpoint. Assume that the branching program is of width $w$. Then the states in every layer can be numbered from 1 to $w$. Also, for $x, y \in [w]$ and $b \in \{0,1\}$, we write $(x, i, b) \to (y, i+1)$ if there is an edge labelled $b$ going from vertex $x$ in layer $i$ to vertex $y$ in layer $i+1$. Now, for every layer $0 \le i < n$, we can introduce two stochastic matrices $M_{0i}$ and $M_{1i}$ (we interchangeably call them walk matrices as well), which are defined as

\[ M_{bi}(x, y) = \begin{cases} 1 & \text{if } (x, i, b) \to (y, i+1), \\ 0 & \text{otherwise.} \end{cases} \]

Now, assume that we start with a probability distribution $x \in \mathbb{R}^w$ over the states in the 0th layer and that the string chosen is $y \in \{0,1\}^n$. Then the probability distribution on the states in the final layer is given by $x^T \prod_{i=0}^{n-1} M_{y_i i}$. Since any string is chosen with probability $1/2^n$, the final distribution is given by

\[ \sum_{y \in \{0,1\}^n} \frac{1}{2^n}\, x^T \prod_{i=0}^{n-1} M_{y_i i} \;=\; x^T \prod_{i=0}^{n-1} \frac{M_{0i} + M_{1i}}{2} . \]

If instead the $y$’s are drawn from a distribution $D$, then the distribution on the final layer will be

\[ x^T \sum_{y \in \{0,1\}^n} D(y) \prod_{i=0}^{n-1} M_{y_i i} . \]

Thus, our aim is to find a distribution $D$ which can be sampled with a few bits of randomness and satisfies

\[ \left\| \sum_{y \in \{0,1\}^n} D(y) \prod_{i=0}^{n-1} M_{y_i i} \;-\; \prod_{i=0}^{n-1} \frac{M_{0i} + M_{1i}}{2} \right\| \le \epsilon . \]
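The identity between averaging over all strings and multiplying the per-layer averages can be checked exactly on a small example. The sketch below is illustrative only: the three-layer, width-3 program and all helper names are assumptions, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product

def walk_matrix(t):
    """0/1 matrix M with M[x][y] = 1 iff the transition t sends node x to node y."""
    w = len(t)
    return [[Fraction(int(t[x] == y)) for y in range(w)] for x in range(w)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(w):
    return [[Fraction(int(i == j)) for j in range(w)] for i in range(w)]

def product_of_averages(layers):
    """Product over the layers of (M_0 + M_1) / 2."""
    acc = identity(len(layers[0][0]))
    for t0, t1 in layers:
        m0, m1 = walk_matrix(t0), walk_matrix(t1)
        avg = [[(p + q) / 2 for p, q in zip(r0, r1)] for r0, r1 in zip(m0, m1)]
        acc = matmul(acc, avg)
    return acc

def average_over_distribution(layers, dist):
    """Sum over strings y of D(y) times the product of the walk matrices along y."""
    w = len(layers[0][0])
    total = [[Fraction(0)] * w for _ in range(w)]
    for y, weight in dist.items():
        acc = identity(w)
        for (t0, t1), bit in zip(layers, y):
            acc = matmul(acc, walk_matrix(t1 if bit else t0))
        total = [[s + weight * a for s, a in zip(rs, ra)] for rs, ra in zip(total, acc)]
    return total

# A hypothetical width-3 permutation branching program of length 3.
layers = [((1, 2, 0), (0, 2, 1)), ((0, 1, 2), (2, 0, 1)), ((1, 0, 2), (1, 2, 0))]
n = len(layers)
uniform = {y: Fraction(1, 2 ** n) for y in product((0, 1), repeat=n)}
assert average_over_distribution(layers, uniform) == product_of_averages(layers)
```

Passing any other dictionary of weights as `dist` gives the matrix whose distance from `product_of_averages(layers)` is the quantity a pseudorandom distribution has to keep small.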

We are not specific about the norm in the above statement, as any reasonable norm will do. In particular, we will be using both the operator norm and the Frobenius norm for matrices. Indeed, note that for a constant sized matrix, these are within constant factors of each other. We also observe that the product of stochastic matrices is a stochastic matrix. We now define the concept of the expander product of distributions of matrices.

C. Comparison of the product and the expander product of matrices

For this subsection and the next, we assume that we will only be dealing with permutation branching programs. The analogous description for regular branching programs is slightly more complicated and is deferred to the full version. We recall that the operator norm of a permutation matrix $M$ (denoted by $\|M\|_2$) is always bounded by 1. An important property of the operator norm is that it is submultiplicative, i.e., $\|X \cdot Y\|_2 \le \|X\|_2 \|Y\|_2$. Let us assume that $\Gamma_1, \Gamma_2 : \{0,1\}^r \to \{0,1\}^n$ and $\rho_1, \rho_2 : \{0,1\}^n \to \mathbb{C}^{m \times m}$. Assume that $\|\rho_1(x)\|_2, \|\rho_2(x)\|_2 \le 1$ for all $x \in \{0,1\}^n$ (this will be the case for permutation branching programs). Consider the following two sums:

\[ A = \sum_{x \in \{0,1\}^r} \frac{1}{2^r}\, \rho_1(\Gamma_1(x)), \qquad B = \sum_{x \in \{0,1\}^r} \frac{1}{2^r}\, \rho_2(\Gamma_2(x)). \]

Then the product of $A$ and $B$ is given by

\[ A \cdot B = \sum_{x, y \in \{0,1\}^r} \frac{1}{2^{2r}}\, \rho_1(\Gamma_1(x))\, \rho_2(\Gamma_2(y)). \tag{1} \]

We now consider a $2^d$-regular graph $H$ on $\{0,1\}^r$ with second eigenvalue bounded by $\lambda$. We define the expander product as

\[ A \cdot_H B = \sum_{x \in \{0,1\}^r,\ (x,y) \in E(H)} \frac{1}{2^{r+d}}\, \rho_1(\Gamma_1(x))\, \rho_2(\Gamma_2(y)). \]
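The difference between the product in (1) and the expander product is only which pairs $(x, y)$ are averaged over: all pairs, or just the edges of $H$. The following sketch computes both on toy data. Everything here is an assumption made for illustration: the lists `P` and `Q` stand for the two families of matrices being averaged, and the 4-cycle used as $H$ is a convenient 2-regular graph, not an expander with a good second eigenvalue.

```python
from fractions import Fraction
from math import sqrt

def perm_matrix(p):
    """Walk matrix of the permutation p, where p[x] is the image of x."""
    return [[Fraction(int(p[x] == y)) for y in range(len(p))] for x in range(len(p))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def expander_product(P, Q, neighbors):
    """Average of P[x] * Q[y] over all edges (x, y) of the graph.

    P[x] is the x-th matrix in the sum defining A and Q[y] the y-th matrix in
    the sum defining B; neighbors[x] lists the 2^d neighbors of vertex x.
    """
    w = len(P[0])
    edges = [(x, y) for x in range(len(P)) for y in neighbors[x]]
    total = [[Fraction(0)] * w for _ in range(w)]
    for x, y in edges:
        term = matmul(P[x], Q[y])
        total = [[s + t / len(edges) for s, t in zip(rs, rt)]
                 for rs, rt in zip(total, term)]
    return total

def plain_product(P, Q):
    """A * B as in equation (1): the same average taken over all pairs (x, y)."""
    return expander_product(P, Q, [list(range(len(Q)))] * len(P))

def frobenius_distance(a, b):
    return sqrt(sum(float(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)))

# Toy data with r = 2 (four seeds) and 3 x 3 permutation matrices.
P = [perm_matrix(p) for p in [(0, 1, 2), (1, 2, 0), (2, 0, 1), (1, 0, 2)]]
Q = [perm_matrix(p) for p in [(0, 2, 1), (2, 1, 0), (0, 1, 2), (1, 2, 0)]]
cycle = [[(x + 1) % 4, (x - 1) % 4] for x in range(4)]  # 2-regular, so d = 1

gap = frobenius_distance(plain_product(P, Q), expander_product(P, Q, cycle))
```

The expander product uses $2^{r+d}$ terms instead of $2^{2r}$, so it can be sampled with $r + d$ random bits rather than $2r$; `gap` measures what is lost by doing so on this particular toy instance.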

We note that without specifying the functions $\rho_i$ and $\Gamma_i$, it is not possible to concretely define $A \cdot_H B$. However, specifying all the parameters makes the definitions and applications cumbersome. So, we sacrifice some accuracy for the sake of clarity. The $\rho_i$’s and $\Gamma_i$’s will be clear from the context. The relation between the above definitions and the INW generator and branching programs is direct. Let us consider a branching program of length $2^{m+1}$. Now, let $\Gamma_m : \{0,1\}^t \to \{0,1\}^{2^m}$ be the instantiation of the INW generator which stretches $t$ bits to $2^m$ bits. Let us define $\rho_1(x) =$ [...]

[...] $\|L(z_p) - L(z_t)\|_W$, where $W$ is the non-trivial subspace of $z_t$ and $x_t$. What is more concerning is that one can have a series of nodes (in the true tree), call them $x_0, \ldots, x_m$ and $y_1, \ldots, y_m$, such that $x_i$ has two children, $x_{i-1}$ and $y_i$. Also, for all $i$, the non-trivial subspace of $L(y_i)$ is properly contained in the non-trivial subspace of $L(x_0)$. The way around this situation is to do a global analysis of the error incurred by the chain as a whole, rather than trying to do it on a per node basis. The proof uses ideas from the “Key Convergence Lemma” in [1].

Claim 5.5: Let $A$ be a label of a node in the true tree of a width $w$ permutation branching program. Let $V$ be the subspace of $\mathbb{C}^m$ such that for all $x \in V$, $x \cdot A = x$. Let $V^{\perp}$ be the orthogonal space of $V$. Then, for all $x \in V^{\perp}$,

\[ \|x \cdot A\|_2 \le \left(1 - w^{-4}\right) \|x\|_2 . \]

The proof of the above claim uses the decomposition of regular representations into their irreducible components, followed by an application of Schur’s lemma. We do not elaborate on the proof here. This claim is analogous to Claim 3.7 for abelian groups. In particular, this claim allows us to prove the first case (i.e., the non-trivial subspace of $y_t$ and $z_t$ is the same as that of $x_t$) for regular branching programs in exactly the same way as the proof goes for abelian groups. We do not elaborate here on the proof of the second and the third cases.

VI. CHANGES FOR THE ANALYSIS OF REGULAR BRANCHING PROGRAMS

We now sketch the changes in the analysis for regular branching programs. The important observation, which lets us reuse some of the techniques we developed for permutation branching programs, is that by definition every node in a regular branching program has exactly 2 incoming and 2 outgoing edges. Thus, by Hall’s theorem, the edges between any two layers can be decomposed into two disjoint matchings. This means that labelings in the true tree for a regular branching program correspond to some permutation branching program. In particular, an analogue of Claim 5.5 will hold for regular branching programs as well. The major point of departure from the analysis of permutation branching programs will be the following: if $y$ was in the trivial subspace of $L(x_t)$ for a node $x_t$ in the true tree, then for a permutation branching program, $y \cdot (L(x_t) - L(x_p)) = 0$. For regular branching programs, at this stage, we can only bound $\|y \cdot (L(x_t) - L(x_p))\|_2$ by $O(\lambda \cdot \log n \cdot 2^{w^8})$. This gives us the aforementioned seed length, and it is plausible that improving this bound will immediately improve the seed length for regular branching programs. We also mention that for getting the current bound on the seed length for regular branching programs, we do not need to go through the full analysis of the third case for permutation branching programs; a much weaker and simpler analysis suffices. We skip the rest of the details here.

ACKNOWLEDGMENT

The work was supported by Luca Trevisan’s NSF grant CCF-1017403. We would like to thank Thomas Steinke and Salil Vadhan for pointing out an important error in the previous version of this work, where we had erroneously claimed an improvement over the existing result for regular branching programs. We would also like to thank James Cook, Ilias Diakonikolas, Michal Koucký, Omer Reingold, Srikanth Srinivasan, Piyush Srivastava, Luca Trevisan, Madhur Tulsiani, Tom Watson and Thomas Vidick for helpful discussions.

REFERENCES

[1] M. Koucký, P. Nimbhorkar, and P. Pudlák, “Pseudorandomness for group products,” in Proceedings of the 43rd ACM Symposium on Theory of Computing, 2011.
[2] M. Braverman, A. Rao, R. Raz, and A. Yehudayoff, “Pseudorandom generators for regular branching programs,” in Proceedings of the 51st IEEE Symposium on Foundations of Computer Science, 2010, pp. 40–47.
[3] J. Brody and E. Verbin, “The Coin Problem and pseudorandomness for Branching programs,” in Proceedings of the 51st IEEE Symposium on Foundations of Computer Science, 2010, pp. 30–39.
[4] R. Meka and D. Zuckerman, “Small-Bias Spaces for Group Products,” in Proceedings of APPROX-RANDOM, 2009, pp. 658–672.
[5] A. C. Yao, “Theory and applications of trapdoor functions,” in Proceedings of the 23rd IEEE Symposium on Foundations of Computer Science, 1982, pp. 80–91.
[6] M. Blum and S. Micali, “How to generate cryptographically strong sequences of pseudorandom bits,” SIAM Journal on Computing, vol. 13, no. 4, pp. 850–864, 1984, preliminary version in Proc. of FOCS’82.
[7] N. Nisan and A. Wigderson, “Hardness vs randomness,” Journal of Computer and System Sciences, vol. 49, pp. 149–167, 1994, preliminary version in Proc. of FOCS’88.
[8] R. Impagliazzo and A. Wigderson, “P = BPP unless E has sub-exponential circuits,” in Proceedings of the 29th ACM Symposium on Theory of Computing, 1997, pp. 220–229.
[9] V. Kabanets and R. Impagliazzo, “Derandomizing polynomial identity tests means proving circuit lower bounds,” Computational Complexity, vol. 13, no. 1-2, pp. 1–46, 2004.
[10] W. J. Savitch, “Relationships between nondeterministic and deterministic tape complexities,” Journal of Computer and System Sciences, vol. 4, no. 2, pp. 177–192, 1970.
[11] N. Nisan, “Pseudorandom generators for space bounded computation,” Combinatorica, vol. 12, no. 4, pp. 449–461, 1992, preliminary version in Proc. of STOC’90.
[12] D. A. M. Barrington, “Bounded-Width Polynomial-Size Branching Programs Recognize Exactly Those Languages in NC¹,” Journal of Computer and System Sciences, vol. 38, no. 1, pp. 150–164, 1989.
[13] R. Impagliazzo, N. Nisan, and A. Wigderson, “Pseudorandomness for network algorithms,” in Proceedings of the 26th ACM Symposium on Theory of Computing, 1994, pp. 356–364.
[14] R. Raz and O. Reingold, “On recycling randomness in space bounded computation,” in Proceedings of the 31st ACM Symposium on Theory of Computing, 1999, pp. 159–168.
[15] C.-J. Lu, “Improved Pseudorandom Generators for Combinatorial Rectangles,” Combinatorica, vol. 22, no. 3, pp. 417–434, 2002.
[16] S. Lovett, O. Reingold, L. Trevisan, and S. P. Vadhan, “Pseudorandom Bit Generators That Fool Modular Sums,” in APPROX-RANDOM, 2009, pp. 615–630.
[17] P. Gopalan, R. Meka, O. Reingold, and D. Zuckerman, “Pseudorandom Generators for Combinatorial Shapes,” Electronic Colloquium on Computational Complexity, Tech. Rep. TR10-176, 2010.
[18] J. Šíma and S. Žák, “A Polynomial time construction of Hitting set for read-once branching programs of width 3,” Electronic Colloquium on Computational Complexity, Tech. Rep. TR10-088, 2010.
[19] E. Rozenman and S. Vadhan, “Derandomized squaring of graphs,” in Proceedings of RANDOM’05. Springer-Verlag, 2005, pp. 436–447.
[20] O. Reingold, “Undirected ST-connectivity in log-space,” in Proceedings of the 37th ACM Symposium on Theory of Computing, 2005, pp. 376–385.
[21] O. Reingold, S. P. Vadhan, and A. Wigderson, “Entropy waves, the zig-zag graph product, and new constant-degree expanders and extractors,” in Proceedings of the 41st IEEE Symposium on Foundations of Computer Science, 2000.
[22] J. Kahn, G. Kalai, and N. Linial, “The Influence of Variables on Boolean Functions,” in Proceedings of the 29th IEEE Symposium on Foundations of Computer Science, 1988, pp. 68–80.