JOURNAL OF ALGORITHMS 29, 165-173 (1998), ARTICLE NO. AL980960

Generation of Well-Formed Parenthesis Strings in Constant Worst-Case Time

Timothy R. Walsh

Department of Computer Science, University of Quebec at Montreal, Montréal, Québec, Canada H3C 3P8

Received June 3, 1997; revised April 24, 1998

Proskurowski and Ruskey (J. Algorithms 11 (1990), 68-84) published a recursive algorithm for generating well-formed parenthesis strings of length $2n$ and challenged the reader to find a loop-free version of their algorithm. We present two nonrecursive versions of their algorithm, one of which generates each string in $O(n)$ worst-case time and requires space for only $O(1)$ extra integer variables, and the other generates each string in $O(1)$ worst-case time and uses $O(n)$ extra space. © 1998 Academic Press

1. INTRODUCTION

In [PR], Proskurowski and Ruskey published a recursive algorithm for generating well-formed parenthesis strings of length $2n$ and challenged the reader to find a loop-free version of their algorithm. Loop-free generation algorithms for other representations of binary trees have since appeared ([vB], [LvBR]); we present here what we believe to be the first loop-free generation algorithm for well-formed parenthesis strings, although we have received reports of unpublished algorithms. In Section 2 we describe the Proskurowski-Ruskey Gray code. In Section 3 we apply Chase's graylex order [Ch] to derive a nonrecursive version of it that generates each string in $O(n)$ worst-case time and uses $O(1)$ extra space (that is, $O(1)$ extra integer variables rather than bits). In Section 4 we apply Ehrlich's auxiliary array ([BER], [Eh]) to derive another version that generates each string in $O(1)$ worst-case time and uses $O(n)$ extra space.

0196-6774/98 \$25.00. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved.

2. THE PROSKUROWSKI-RUSKEY GRAY CODE FOR WELL-FORMED PARENTHESIS STRINGS

A well-formed parenthesis string, or Dyck word, of length $2n$ is a string of $n$ 0's and $n$ 1's such that no prefix contains more 0's than 1's. In [PR] Proskurowski and Ruskey present the following recursive description of a Gray code in which each word of length $2n$ is changed to its successor by transposing a single pair of letters. For the Dyck word $y = 1^k 0 x$, the operations flip and insert are defined as $\mathrm{flip}(y) = 1^{k-1} 0 1 x$ and $\mathrm{insert}(y) = 1^{k+1} 0 0 x$. These definitions are extended to lists, and two other operations on lists are defined: $A \circ B$ means $A$ followed by $B$, and $A^R$ means $A$ reversed. The lists $T(n,k)$, $1 \le k \le n$, each of which consists of all of the words of length $2n$ with the prefix $1^k 0$, are generated by means of the following recursion:

\[
T(n,k) =
\begin{cases}
\mathrm{flip}(T(n,2)) & \text{if } k = 1 < n, \\
\mathrm{flip}(T^R(n,k+1)) \circ \mathrm{insert}(T(n-1,k-1)) & \text{if } 1 < k < n, \\
1^n 0^n & \text{if } k = n.
\end{cases}
\tag{2.1}
\]
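As a small worked instance of the recursion (2.1), the lists for $n \le 3$ can be expanded by hand. This is only an illustrative sketch; writing a list as a parenthesized sequence of words is a display convention assumed here, not notation taken from [PR].

```latex
\[
\begin{aligned}
T(2,2) &= (1100), \qquad T(2,1) = \mathrm{flip}(T(2,2)) = (1010), \\
T(3,3) &= (111000), \\
T(3,2) &= \mathrm{flip}(T^R(3,3)) \circ \mathrm{insert}(T(2,1))
        = (110100) \circ (110010) = (110100,\ 110010), \\
T(3,1) &= \mathrm{flip}(T(3,2)) = (101100,\ 101010).
\end{aligned}
\]
```

In $T(3,2)$ and $T(3,1)$ each word differs from its successor by a single transposition, as the Gray code property requires.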

The first word in $T(n,k)$ turns out to be

\[
\mathrm{first}(T(n,k)) =
\begin{cases}
101100(10)^{n-3} & \text{if } k = 1 \text{ and } n \ge 3, \\
1^k 0 1 0^k (10)^{n-k-1} & \text{if } 1 < k < n \text{ or } (k = 1 \text{ and } n = 2), \\
1^n 0^n & \text{if } k = n
\end{cases}
\tag{2.2}
\]

(we added the conditions "and $n \ge 3$" and "or ($k = 1$ and $n = 2$)" to make the formulae in [PR] work for the trivial case in which $k = 1$ and $n = 2$). The last word in $T(n,k)$ is $1^k 0^k (10)^{n-k}$. In [PR] a recursive algorithm is given that generates $T(n,k)$ in $O(1)$ average time per word, and two Gray codes are given for the list of all of the Dyck words of length $2n$: $T(n+1,1)$ with the prefix 10 removed, and $T(n,n) \circ T(n,n-1) \circ \cdots \circ T(n,2) \circ T^R(n,1)$.
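The recursion (2.1), the closed forms for the first and last words, and the two Gray codes for all Dyck words can be checked with a short Python sketch. It is a direct transcription of the definitions above, not the recursive algorithm of [PR]; the function names (`flip`, `insert`, `T`, `first_word`, `last_word`, `all_dyck_by_prefix`, `all_dyck_by_blocks`, `one_transposition`) are chosen here for illustration, and words are represented as Python strings of the characters 0 and 1.

```python
def flip(y):
    """flip(1^k 0 x) = 1^(k-1) 0 1 x: swap the last leading 1 with the 0 after it."""
    k = y.index("0")                      # number of leading 1's
    return y[:k - 1] + "01" + y[k + 1:]


def insert(y):
    """insert(1^k 0 x) = 1^(k+1) 0 0 x: the word grows by one 1 and one 0."""
    k = y.index("0")
    return "1" * (k + 1) + "00" + y[k + 1:]


def T(n, k):
    """The list T(n, k) built literally from recursion (2.1)."""
    if k == n:
        return ["1" * n + "0" * n]
    if k == 1:
        return [flip(w) for w in T(n, 2)]
    return ([flip(w) for w in reversed(T(n, k + 1))]
            + [insert(w) for w in T(n - 1, k - 1)])


def first_word(n, k):
    """Closed form (2.2) for the first word of T(n, k)."""
    if k == n:
        return "1" * n + "0" * n
    if k == 1 and n >= 3:
        return "101100" + "10" * (n - 3)
    return "1" * k + "01" + "0" * k + "10" * (n - k - 1)


def last_word(n, k):
    """Closed form 1^k 0^k (10)^(n-k) for the last word of T(n, k)."""
    return "1" * k + "0" * k + "10" * (n - k)


def all_dyck_by_prefix(n):
    """All Dyck words of length 2n: T(n+1, 1) with the prefix 10 removed."""
    return [w[2:] for w in T(n + 1, 1)]


def all_dyck_by_blocks(n):
    """All Dyck words of length 2n: T(n,n), T(n,n-1), ..., T(n,2), then T(n,1) reversed."""
    words = []
    for k in range(n, 1, -1):
        words += T(n, k)
    return words + list(reversed(T(n, 1)))


def one_transposition(a, b):
    """True if b is obtained from a by exchanging one 0 with one 1."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return len(diff) == 2 and a[diff[0]] != a[diff[1]]
```

For example, `T(4, 2)` evaluates to the five words 11010010, 11010100, 11011000, 11001100, 11001010, and `all_dyck_by_prefix(3)` lists the five Dyck words of length 6. The recursion copies whole lists, so this sketch is meant for checking small cases rather than for efficient generation.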

3. A NONRECURSIVE GENERATION ALGORITHM THAT USES O(1) EXTRA SPACE

To obtain a nonrecursive algorithm for generating $T(n,k)$, we generated the lists $T(n,k)$ for $1 \le k \le n \le 5$ by hand from the recursion (2.1) (see Fig. 1) and observed the pattern followed by the motion of the $i$th occurrence of 1 from the left, which we abbreviate as $1_i$, in an interval of words in which $1_1, 1_2, \ldots, 1_{i-1}$ stay in one place. By definition, $1_1, 1_2, \ldots, 1_k$ are fixed in position; we call the other occurrences of 1 free. The rightmost position that $1_i$ can have is $2i - 1$; we call a free 1 that is not in its rightmost position a liberal 1.

FIG. 1. $T(n,k)$, the list of well-formed parenthesis strings of length $2n$ with prefix $1^k 0$, $1 \le k \le n \le 5$.

THEOREM 1. The set of words of $T(n,k)$ in which $1_{k+1}, 1_{k+2}, \ldots, 1_{i-1}$ stay in one place is an interval of consecutive words and is partitioned into subintervals in which $1_i$ is stationary too. As we pass from subinterval to subinterval, $1_i$ moves one position at a time, either moving rightward from its leftmost position adjacent to $1_{i-1}$ (except that $1_{k+1}$ starts in position $k + 2$ with one zero between it and $1_k$) to its rightmost position (position $2i - 1$ in the word), or else moving leftward from its rightmost position to its leftmost position. Finally, $1_i$ moves rightward if and only if the number of liberal 1's to its left is even.

Proof. For $k = n$ the statement of the theorem is vacuously true, since the one word in $T(n,n)$ has no free 1's. For $k < n$ we assume it to be true for $T(n,k+1)$ and $T(n-1,k-1)$, and prove it for $T(n,k)$. More precisely, we prove the last statement of the theorem (about the direction of motion of $1_i$), since the other statements of the theorem are easily verified.

Suppose that $k = 1$, so that, by the first line of (2.1), $T(n,1) = \mathrm{flip}(T(n,2))$. In $T(n,1) = \mathrm{flip}(T(n,2))$, $1_2$ is a free 1 in position 3, its leftmost and its rightmost position; in $T(n,2)$, $1_2$ is fixed. Each $1_i$, $i > 2$, has the same number of liberal 1's to its left in the corresponding words in $T(n,1)$ and in $T(n,2)$, and it moves in the same direction in both lists. So the last statement of the theorem holds for $T(n,1)$.

Now suppose $k > 1$, so that, by the second line of (2.1), $T(n,k) = \mathrm{flip}(T^R(n,k+1)) \circ \mathrm{insert}(T(n-1,k-1))$. We compare both the direction of motion of each free 1 and the number of liberal 1's to its left in $T(n,k)$, $T(n,k+1)$, and $T(n-1,k-1)$. The flip operator, acting on $T(n,k+1)$, converts $1_{k+1}$ from a fixed 1 to a free 1 in position $k + 2$, which is not its rightmost position $2k + 1$. For each $1_i$, $i > k + 1$, the number of liberal 1's to the left of $1_i$ is greater by 1 for a word in $\mathrm{flip}(T(n,k+1))$ than for the corresponding word in $T(n,k+1)$. Reversing the list $\mathrm{flip}(T(n,k+1))$ reverses the direction of motion of $1_i$, so that it moves in $\mathrm{flip}(T^R(n,k+1))$ in the direction opposite to that in which it moves in $T(n,k+1)$, which is consistent with the difference in the number of liberal 1's to its left. The insert operator, acting on $T(n-1,k-1)$, creates $1_k$, which is fixed, so that each free 1 has the same number of liberal 1's to its left in the corresponding words in $\mathrm{insert}(T(n-1,k-1))$ and in $T(n-1,k-1)$, and it moves in the same direction. In passing from the last word of $\mathrm{flip}(T^R(n,k+1))$, which is the first word of $\mathrm{flip}(T(n,k+1))$, to the first word of $\mathrm{insert}(T(n-1,k-1))$, $1_{k+1}$, which has no free 1's to its left, moves rightward from position $k + 2$ to position $k + 3$. So the last statement of the theorem holds for $T(n,k)$, $1 < k < n$. The result follows by induction first on $n$ and then, for each $n$, on $n - k$.

COROLLARY. Let $g_i$ be the position of $1_i$ in a word in $T(n,k)$.
Then the corresponding list of words $g_{k+1} \cdots g_n$ has the property that the set of words in which the prefix $g_{k+1} \cdots g_{i-1}$ is fixed is an interval of consecutive words and is partitioned into subintervals in which $g_i$ is fixed too, and as we pass from subinterval to subinterval, the sequence $s(g_{k+1} \cdots g_{i-1})$ of values assumed by $g_i$ is given by $s(\,) = (k+2, k+3, \ldots, 2k+1)$ for the empty prefix (that is, for $i = k + 1$) and

\[
s(g_{k+1} g_{k+2} \cdots g_{i-1}) =
\begin{cases}
(g_{i-1} + 1, \ldots, 2i - 1) & \text{if } g_j < 2j - 1 \text{ for an even number of } j,\ k + 1 \le j \le i - 1, \\
(2i - 1, \ldots, g_{i-1} + 1) & \text{if } g_j < 2j - 1 \text{ for an odd number of } j,\ k + 1 \le j \le i - 1.
\end{cases}
\]
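The quantities used in Theorem 1 and its corollary can be read off a word directly. The following Python sketch is only an illustration of the definitions: it computes the positions $g_i$, the indices of the liberal 1's, and the direction of motion predicted by the theorem. The function names are hypothetical, and words are assumed to be strings of the characters 0 and 1 with positions counted from 1.

```python
def one_positions(word):
    """Positions g_1, ..., g_n of the 1's, counting positions from 1."""
    return [pos for pos, c in enumerate(word, start=1) if c == "1"]


def liberal_indices(word, k):
    """Indices i > k for which 1_i is liberal, i.e. free and left of position 2i - 1."""
    g = one_positions(word)
    return [i for i in range(k + 1, len(g) + 1) if g[i - 1] < 2 * i - 1]


def moves_right(word, k, i):
    """Theorem 1: 1_i moves rightward iff an even number of liberal 1's lie to its left."""
    return sum(1 for j in liberal_indices(word, k) if j < i) % 2 == 0
```

For the first word 11010010 of $T(4,2)$, `one_positions` gives 1, 2, 4, 7; only $1_3$ is liberal, so $1_3$ is moving right and $1_4$ is moving left, which matches the second word 11010100 of that list.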

Since $s(p)$ is monotone for any prefix $p$, the list of words $g_{k+1} \cdots g_n$ is in graylex order [Ch]. The successor to a given word is determined from the following generic algorithm for finding the next word in any list that is in graylex order (more generally, in any list of words in which all of the words with a given prefix form an interval of adjacent words; [Wa1], [Wa2]): Search for the pivot, that is, the largest $i$ such that $g_i$ is not at its last value in $s(g_{k+1} \cdots g_{i-1})$; if there is a pivot $i$, then change $g_i$ to its next value in $s(g_{k+1} \cdots g_{i-1})$ and change each $g_j$, $i + 1 \le j \le n$, to its first value in $s(g_{k+1} \cdots g_{j-1})$; otherwise $g_{k+1} \cdots g_n$ is the last word in the list.

Rather than store the auxiliary array $g_{k+1} \cdots g_n$, we modify the generic algorithm so that it acts on each Dyck word directly. For each $i$ from $n$ down to $k + 1$, if there is an even number of liberal 1's to the left of $1_i$, then $1_i$ is moving right and its last position is $2i - 1$, and otherwise $1_i$ is moving left and its last position is adjacent to $1_{i-1}$ ($1_{k+1}$ never moves left). If all of these 1's are in their last positions, then the word is the last one. Otherwise, the pivot is the index $i$ of the first $1_i$ we find that is not in its last position; $1_i$ gets moved by one position to its right or left, depending on the direction in which it is moving, and if $i < n$, then all of the 1's to its right must be moved to their first positions. By examining Fig. 2, the reader can verify that only $1_i$ and $1_{i+1}$ actually have to be moved: each $1_j$, $j > i + 1$, is in its rightmost position, which was its last position and becomes its first position after $1_i$ and $1_{i+1}$ have moved, because the number of liberal 1's to its left always changes by 1.

While scanning a Dyck word from right to left to find the pivot, we could scan the word from left to right each time we encounter a 1 to count the liberal 1's to its left, making the worst-case time complexity $O(n^2)$.
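The generic successor rule described above can be sketched on the position array $g_{k+1} \cdots g_n$, using the sequences $s(\cdot)$ from the corollary. This is a plain transcription of the generic rule under the assumption that `g` is a Python list with `g[1..n]` holding the positions and `g[0]` unused; it recounts liberal 1's on every call, so it is the simple quadratic variant rather than the $O(n)$ algorithm of Fig. 3. The names `value_sequence`, `successor`, and `to_word` are hypothetical.

```python
def value_sequence(g, i, k):
    """The sequence s(g_{k+1} ... g_{i-1}) of values taken by g_i, in order."""
    if i == k + 1:
        return list(range(k + 2, 2 * k + 2))          # (k+2, ..., 2k+1)
    liberal = sum(1 for j in range(k + 1, i) if g[j] < 2 * j - 1)
    lo, hi = g[i - 1] + 1, 2 * i - 1
    if liberal % 2 == 0:
        return list(range(lo, hi + 1))                # increasing: 1_i moves right
    return list(range(hi, lo - 1, -1))                # decreasing: 1_i moves left


def successor(g, n, k):
    """Advance g in place to the next word; return False if g was the last word."""
    for i in range(n, k, -1):                         # search for the pivot
        s = value_sequence(g, i, k)
        at = s.index(g[i])
        if at + 1 < len(s):                           # g_i is not at its last value
            g[i] = s[at + 1]
            for j in range(i + 1, n + 1):             # reset the suffix, left to right
                g[j] = value_sequence(g, j, k)[0]
            return True
    return False


def to_word(g, n):
    """Dyck word of length 2n whose 1's are at the positions g[1..n]."""
    ones = set(g[1:])
    return "".join("1" if pos in ones else "0" for pos in range(1, 2 * n + 1))
```

Starting from `g = [None, 1, 2, 4, 7]` with `n = 4` and `k = 2`, repeated calls to `successor` pass through the position words (4,7), (4,6), (4,5), (5,6), (5,7), which correspond to the five words of $T(4,2)$.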
To make the algorithm run in $O(n)$ time, we maintain a variable Odd that is true if the total number of liberal 1's in a given word is odd. Before scanning a word, we initialize a Boolean variable Left to Odd, and we change it whenever a liberal 1 is encountered, so that as each 1 is examined, the variable Left is true if that 1 is moving leftward. We also maintain a Boolean variable Last, which becomes true if the current word turns out to be the last one in $T(n,k)$. If we are generating $T(n,k)$, then we generate $\mathrm{first}(T(n,k))$ from (2.2), we initialize Last to false, and we initialize Odd to true if $n > 2$ and $k < n$, since, by (2.2), $\mathrm{first}(T(n,k))$ has only one liberal 1 ($1_{\max(3,k+1)}$), and otherwise we set Odd to false, because then $\mathrm{first}(T(n,k))$ has no liberal 1's. Then we execute the updating algorithm Next given in Fig. 3 until Last becomes true. If we are generating all of the Dyck words of length $2n$ by generating $T(n+1,1)$ without the prefix 10, then the updating algorithm can be executed as is: $1_2$, which is free but at its rightmost position in $T(n+1,1)$, is simply renamed $1_1$, which is now fixed.

FIG. 2. The suffix beginning with $1_i$ before and after $1_i$ moves. There are four cases to consider, depending upon whether $1_i$ moves right or left, and whether or not it moves to or from its rightmost position. The arrows indicate the direction of motion of each 1.

FIG. 3. Finding the successor to a given word in $T(n,k)$ in $O(n)$ worst-case time and $O(1)$ extra space.
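The listing of Fig. 3 is not reproduced in this text, so the following Python sketch is a reconstruction from the prose description of Next, not the author's Modula-2 code, and its line structure does not correspond to the line numbers cited for Fig. 3. It assumes the word is stored as a list `p` with `p[1..2n]` (index 0 unused), keeps the flag Odd as the argument and return value `odd`, and plays the role of Left with the local variable `left`; `right_pos`, `first_word`, `next_word`, and `generate` are names introduced here.

```python
def first_word(n, k):
    """First word of T(n, k) from (2.2), as a list p with p[1..2n]; p[0] is unused."""
    if k == n:
        s = "1" * n + "0" * n
    elif k == 1 and n >= 3:
        s = "101100" + "10" * (n - 3)
    else:
        s = "1" * k + "01" + "0" * k + "10" * (n - k - 1)
    return [None] + [int(c) for c in s]


def next_word(p, n, k, odd):
    """Turn p into its successor in T(n, k); return (moved, odd)."""
    left = odd          # parity of the liberal 1's the scan has not yet passed
    j = 2 * n           # scan position, moving from right to left
    right_pos = None    # position of 1_{i+1}, already found at its last position
    for i in range(n, k, -1):
        while p[j] == 0:
            j -= 1                          # p[j] is now 1_i
        liberal = j < 2 * i - 1
        if liberal:
            left = not left                 # odd number of liberal 1's left of 1_i
        if left:                            # moving left: last spot is beside 1_{i-1}
            at_last, new_j = p[j - 1] == 1, j - 1
        else:                               # moving right: last spot is 2i - 1
            at_last, new_j = not liberal, j + 1
        if not at_last:                     # 1_i is the pivot
            p[j] = 0
            if right_pos is not None:
                p[right_pos] = 0            # lift 1_{i+1} before 1_i moves
            p[new_j] = 1
            new_liberal = new_j < 2 * i - 1
            changed = liberal != new_liberal
            if right_pos is not None:       # 1_{i+1} restarts at its first position
                first = new_j + 1 if left == new_liberal else 2 * i + 1
                p[first] = 1
                changed ^= (right_pos < 2 * i + 1) != (first < 2 * i + 1)
            return True, odd != changed
        right_pos = j
        j -= 1
    return False, odd


def generate(n, k):
    """All of T(n, k) as strings, by repeated calls to next_word."""
    p = first_word(n, k)
    odd = n > 2 and k < n                   # first word has one liberal 1 in this case
    words = ["".join(map(str, p[1:]))]
    moved, odd = next_word(p, n, k, odd)
    while moved:
        words.append("".join(map(str, p[1:])))
        moved, odd = next_word(p, n, k, odd)
    return words
```

Each call scans the word once from right to left and touches at most two 1's, so it uses $O(n)$ time and a constant number of extra variables. For instance, `generate(4, 2)` reproduces the five words of $T(4,2)$, and stripping the prefix 10 from `generate(n + 1, 1)` lists the Dyck words of length $2n$.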


To generate $T(n,n) \circ T(n,n-1) \circ \cdots \circ T(n,2) \circ T^R(n,1)$, we begin as if we were generating $T(n,n)$, set $k = n$, and then execute the updating algorithm with the following changes: change line 5 to "while i >= 1 do"; in line 7, change "if j < 2i - 1" to "if j < 2i - 1 and i > k"; insert the following code after line 22:

    if i = k then  { We pass from T(n, k) to T(n, k - 1) or to T^R(n, 1). }
      p[j + 1] := 1; k := k - 1;  { Only 1_k moves. }
      Odd := not Odd;  { The directions of motion are reversed for any k; if k > 1, then 1_{k+1} is a new liberal 1; if k = 1, then the list is reversed. }
      return
    end if;

4. FINDING THE NEXT WELL-FORMED PARENTHESIS STRING IN O(1) WORST-CASE TIME

We store three auxiliary arrays: the array $g_1 g_2 \cdots g_n$ of positions of the occurrences of 1, the array $d_1 d_2 \cdots d_n$, where $d_i = 1$ if $1_i$ is moving right ($g_i$ is increasing) and 0 otherwise, and the array $e_1 e_2 \cdots e_n$ introduced in [BER] and [Eh] to find the pivot in $O(1)$ worst-case time. Once the pivot $i$ is found, the Dyck word and all of the auxiliary arrays are also updated in $O(1)$ time. From Fig. 2 it would appear that $d_{i+2}, d_{i+3}, \ldots, d_n$ must all be changed from 1 to 0; however, each of these $d_m$ changes when it is needed (when $m$ becomes the pivot), so that the total work done to generate each new Dyck word is $O(1)$. For the first word in $T(n,k)$, the values of $g_1, g_2, \ldots, g_n$ can be obtained from (2.2), $d_i = 1$ if $i \le \max(k+1, 3)$ and 0 otherwise, and $e_i = i$ for all $i$. After generating the first word of $T(n,k)$, we execute the algorithm QuickNext of Fig. 4 until Last becomes true.

To generate all of the length-$2n$ Dyck words as the list $T(n,n) \circ T(n,n-1) \circ \cdots \circ T(n,2) \circ T^R(n,1)$, we begin as if we were generating $T(n,n)$, initialize $k$ to $n$, and then execute QuickNext with the following changes: change line 3 to "if i <= 1 then"; change line 22 to

    if i = k then  { We pass from T(n, k) to T(n, k - 1) or to T^R(n, 1). }
      p[j + 1] := 1; k := k - 1;  { Only 1_k moves. d[m], m > k, will be updated later. }
    else if i = n then

FIG. 4. Finding the successor to a given word in $T(n,k)$ in $O(1)$ worst-case time and $O(n)$ extra space.
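The listing of Fig. 4 is not reproduced in this text. As a hedged illustration of how an auxiliary pointer array of the kind introduced in [BER] and [Eh] locates the pivot without a search, the sketch below applies the idea in its original setting, the binary reflected Gray code. It is not QuickNext and does not generate Dyck words; the array name `f`, the function name `loopless_gray`, and the string output format are choices made here.

```python
def loopless_gray(n):
    """Yield the 2^n words of the binary reflected Gray code, bit 0 written first.

    f is the auxiliary pointer array: f[0] always names the next bit to change
    (the pivot), so no scan over the word is needed to find it.
    """
    a = [0] * n
    f = list(range(n + 1))        # initially f[j] = j
    while True:
        # Copying the word out costs O(n); it is done only to display the result.
        yield "".join(map(str, a))
        j = f[0]                  # pivot, found in O(1)
        f[0] = 0
        if j == n:                # no pivot: the last word has been produced
            return
        f[j] = f[j + 1]           # bit j is exhausted for now; skip over it later
        f[j + 1] = j + 1
        a[j] ^= 1                 # the single change between consecutive words
```

For `n = 2` the generator yields 00, 10, 11, 01. Apart from producing the output string, every step performs a constant number of assignments, which is the property that the array $e_1 e_2 \cdots e_n$ provides for the pivot search in this section.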

The algorithms of Figs. 3 and 4, plus algorithms for finding the position of a Dyck word in $T(n,k)$ and the Dyck word in a given position in $T(n,k)$ in $O(n)$ arithmetic operations, were programmed in Modula-2 and tested. The two generation algorithms ran in roughly the same total time, which is often the case when average-case $O(1)$-time algorithms are made loop-free. The elimination of auxiliary arrays is not a significant advantage for a computer, but does make the algorithm easier to execute by hand, so that both algorithms have their advantages. The listings for these programs and other related results are contained in a research report [Wa1] that is available from the author on request. The observant reader may notice the comments in Fig. 4 explaining the array $e_1 e_2 \cdots e_n$ introduced in [BER] and [Eh] and used there and in many other places to find the pivot in $O(1)$ worst-case time, and may want to prove, or to obtain a proof ([Wa1], [Wa2]), that it works for any list of words in which all of the words with the same proper prefix $g_1 g_2 \cdots g_{i-1}$ form an interval of adjacent words (whether or not the list is in graylex order), provided only that in each such proper subinterval of the list, $g_i$ assumes at least two distinct values.

REFERENCES

[BER] J. R. Bitner, G. Ehrlich, and E. M. Reingold, Efficient generation of the binary reflected Gray code and its applications, Comm. ACM 19 (1976), 517-521.

[Ch] P. J. Chase, Combination generation and graylex ordering, Proceedings of the 18th Manitoba Conference on Numerical Mathematics and Computing, Winnipeg, 1988, Congressus Numerantium 69 (1989), 215-242.

[Eh] G. Ehrlich, Loopless algorithms for generating permutations, combinations, and other combinatorial configurations, J. ACM 20 (1973), 500-513.

[PR] A. Proskurowski and F. Ruskey, Generating binary trees by transpositions, J. Algorithms 11 (1990), 68-84.

[LvBR] J. M. Lucas, D. R. van Baronaigien, and F. Ruskey, On rotations and the generation of binary trees, J. Algorithms 15 (1993), 343-366.

[vB] D. R. van Baronaigien, A loopless algorithm for generating binary tree sequences, Inform. Process. Lett. 39 (1991), 189-194.

[Wa1] T. R. Walsh, "A Simple Sequencing and Ranking Method That Works on Almost all Gray Codes," Research Report 243, Department of Mathematics and Computer Science, University of Quebec at Montreal, April 1995.

[Wa2] T. R. Walsh, Gray codes for involutions, submitted for publication.