Optimal Edge Ranking of Trees in Polynomial Time

Raymond Greenlaw, Department of Computer Science, University of New Hampshire, Durham, NH 03824
Pilar de la Torre, Department of Computer Science, University of New Hampshire, Durham, NH 03824
Alejandro A. Schäffer, Department of Computer Science, Rice University, Houston, TX 77251

September 29, 1993
Abstract
An edge ranking of a graph is a labeling of the edges using positive integers such that all paths between two edges with the same label contain an intermediate edge with a higher label. An edge ranking is optimal if the highest label used is as small as possible. The edge-ranking problem has applications in scheduling the manufacture of complex multi-part products; it is equivalent to finding the minimum height edge-separator tree. In this paper we give the first polynomial-time algorithm to find an optimal edge ranking of a tree, placing the problem in P. An interesting feature of the algorithm is an unusual greedy procedure that allows us to narrow an exponential search space down to a polynomial search space containing an optimal solution. An NC algorithm is presented that finds an optimal edge ranking for trees of constant degree. We also prove that a natural decision problem emerging from our sequential algorithm is P-complete.
Keywords: Edge ranking, minimum height edge-separator, node ranking, trees.

Address for Correspondence: A. A. Schäffer; Department of Computer Science; Rice University; Houston, TX 77251.
e-mail address: [email protected]; research partially supported by NSF grant CCR-9010445.
e-mail address: [email protected]; research partially supported by NSF grant CCR-9209184.
e-mail address: scha[email protected]; research partially supported by NSF grant CCR-9010534.
1 Introduction

This paper places the problem of optimally edge ranking a tree in P, answering an open question posed in [12]. The algorithm we present has several interesting features: it has two layers of greediness, as opposed to more typical greedy algorithms that have only one greedy part; it never needs to backtrack to a previous level and reassign labels; and it narrows down an exponential search space to a polynomial one in which there is still an optimal solution. We also present an NC algorithm for optimally edge ranking trees with constant degree. We provide evidence suggesting that certain aspects of our sequential algorithm may not parallelize well by proving that a natural decision problem based on one component of the algorithm is P-complete. There could still be an NC algorithm for edge ranking that uses a different approach.

To motivate the edge-ranking problem we first examine the closely related node-ranking problem. Let $T$ be an unrooted tree having $n$ nodes. A node ranking of $T$ is a labeling of the nodes with positive integers such that the path between any two nodes with the same label contains an intermediate node with a higher label. A node ranking is optimal if the largest value used is a minimum among all rankings. A node ranking with highest value $k$ is called a $k$-node ranking. A $k$-node ranking corresponds to a separator tree; at stage $i = 1, 2, \ldots, k-1$, one can remove all nodes with label $k - i + 1$, and eventually be left with only nodes having the label 1. An optimal node ranking corresponds to a separator tree of minimum height. In many parallel algorithms (see, for example, [15] and references therein) the running time depends on the height of a separator tree; therefore, it is important to find a separator tree of minimum height.

Node ranking of trees has been well studied. The original application for node ranking was as a subroutine in an algorithm to find an approximate edge ranking [11, 12].
Iyer, Ratliff, and Vijayan gave an $O(n \log n)$ algorithm to find an optimal node ranking of a tree, where $n$ is the number of nodes in the tree [11]. Schäffer refined their algorithm and its analysis to improve the running time to $O(n)$ [19]. Liang, Dhall, and Lakshmivarahan found an NC approximation algorithm that produces a ranking within a factor of two of the optimal [14]. De la Torre, Greenlaw, and Przytycka gave an NC algorithm for optimally node ranking a tree [3, 4, 18]. It is interesting to note that a tree can always be ranked using a highest value of at most $1 + \lfloor \log_2 n \rfloor$ [14].

The edge-ranking problem is defined analogously to the node-ranking problem. An edge ranking of a tree $T$ is a labeling of the edges with positive integers such that the path between any two edges with the same label contains an intermediate edge with a higher label. An edge ranking is optimal if the largest value is a minimum among all rankings. An optimal edge ranking corresponds to an edge-separator tree of minimum height. Finding an optimal edge ranking has an interesting application to scheduling the assembly steps in manufacturing a complex, multi-part product [12, 16]. In contrast to node ranking, a tree with $n$ nodes may require $n - 1$ values in order to rank it. In particular, a star has edge rank $n - 1$.

Iyer, Ratliff, and Vijayan gave an $O(n \log n)$ approximation algorithm that finds an edge ranking whose largest value is simultaneously at most two times the optimum and at most $\log_{3/2} n - 2$ more than the optimum [12]. Their approximation algorithm uses a node-ranking algorithm as a subroutine. The main open problem in their paper is to determine whether the edge-ranking problem is in P or is NP-hard. It is easy to see that the problem is in NP.

The dichotomy between graph problems based on nodes and the corresponding problems based on edges is well known. Usually, it seems that problems based on edges are more difficult.
For example, there is a general theorem proving that a variety of node-deletion problems are NP-complete [13, 20]; however, as noted in [21], there does not seem to be any similar strategy for unifying edge-deletion problems. Although nearly all natural problems defined on trees are in P (and usually NC), Perl and Zaks proved that several edge labeling problems on trees are NP-complete [17]. Based on these observations and the fact that the optimal edge-ranking problem seemed similar to some of the problems described in [17], it seemed plausible that the optimal edge-ranking problem would be NP-complete.

Our main result is a polynomial-time algorithm to find an optimal edge ranking of a tree; this settles the main open problem in [12]. The edge-ranking algorithm is more complex than the node-ranking algorithms and requires several new ideas.

The remainder of this paper is outlined below. In Section 2 we present preliminary definitions. A brief road map outlining our strategy is provided in Section 3. A result reducing the edge-ranking problem to a problem involving only local constraints is given in Section 4. Section 5 contains a sequence of lemmas involving greedy conditions that show how we can narrow down an exponential search space for edge ranking to a polynomial one. Pseudocode for the algorithm is also described in this section; we implemented the algorithm in under 500 lines of C based on this pseudocode description. A running time analysis of the algorithm is given in Section 6. The NC algorithm to optimally edge rank trees of constant degree is given in Section 7. The P-completeness result is proved in Section 8. A summary and some open problems are presented in Section 9.
2 Definitions and Notation

Suppose we have a tree that is rooted arbitrarily. Our approach is to label the edges in a bottom-up fashion. We focus on labeling the edges emanating from a single node once all the edges below it have been labeled. When visiting an internal node $v$, our algorithm assigns labels irrevocably to all the edges emanating downward to children of $v$. One of the keys to the polynomial running time bound is that edges never need to be relabeled. We call the problem of labeling the edges emanating from an internal node the labeling problem and define it formally in Section 4.

Figure 2.1: A sample ranking to illustrate the definitions. The tree is rooted at $v$, which has children $w_1, \ldots, w_4$. Numbers shown next to edges are the labels the edges receive in a supercritical ranking.
Trees and Rankings
For each node $v$ of a rooted tree $T = (V, E)$, $T_v$ denotes the subtree rooted at $v$. The set of edges emanating (down) from a node $v$ is denoted $E_v$. We will use $E$ to denote a particular subset of $E_v$. If $v$ is a node with children $w_1, \ldots, w_c$ then $E_v = \{v \to w_1, \ldots, v \to w_c\}$. Throughout the paper node $v$ is assumed to have children $w_1, \ldots, w_c$. For example, in Figure 2.1 $c$ has value 4. For completeness, we repeat the definitions of node ranking and edge ranking below.
Definition 2.1 Let $T = (V, E)$ be a rooted tree. A node ranking of $T$ is a mapping $\varphi : V \to \{1, 2, \ldots\}$ with the property that if $\varphi(u) = \varphi(v)$ for distinct nodes $u$ and $v$, then there exists a node $w$ on the path between $u$ and $v$ with $\varphi(w) > \varphi(u)$. A node ranking is optimal if the largest value it uses is as small as possible among all node rankings.
Definition 2.2 Let $T = (V, E)$ be a rooted tree. An edge ranking of $T$ is a mapping $\varphi : E \to \{1, 2, \ldots\}$ with the property that if $\varphi(e) = \varphi(f)$ for distinct edges $e$ and $f$, then there exists an edge $g$ on the path between $e$ and $f$ with $\varphi(g) > \varphi(e)$. An edge ranking is optimal if the largest value it uses is as small as possible among all edge rankings. The largest value used in an optimal ranking is denoted $\mathit{rank}(T)$.

The tree of Figure 2.1 has $\mathit{rank}(T) = 5$. From now on when we refer to a ranking, we are talking about an edge ranking unless explicitly noted otherwise.
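Definition 2.2 can be checked mechanically. The following Python sketch is an illustration only; the function name `is_edge_ranking` and the representation of a tree as a list of node pairs with a parallel list of labels are assumptions made here, not notation from the paper. It relies on the observation that two edges labeled $k$ violate the definition exactly when they lie in the same component after all edges with labels larger than $k$ are removed.

```python
def find(parent, u):
    """Union-find lookup with path halving."""
    while parent[u] != u:
        parent[u] = parent[parent[u]]
        u = parent[u]
    return u


def is_edge_ranking(edges, labels):
    """Check Definition 2.2 for a tree given as node pairs with one label per edge."""
    nodes = {u for e in edges for u in e}
    for k in sorted(set(labels)):
        # Keep only edges with label <= k; higher labels act as separators.
        parent = {u: u for u in nodes}
        for (a, b), l in zip(edges, labels):
            if l <= k:
                parent[find(parent, a)] = find(parent, b)
        # Each remaining component may contain at most one edge labeled k.
        seen = set()
        for (a, b), l in zip(edges, labels):
            if l == k:
                root = find(parent, a)
                if root in seen:
                    return False
                seen.add(root)
    return True


star = [(0, 1), (0, 2), (0, 3)]
print(is_edge_ranking(star, [1, 2, 3]))   # True
print(is_edge_ranking(star, [1, 1, 2]))   # False
```

The star example reflects the remark in the introduction that every edge of a star needs its own label. The check rebuilds the components once per distinct label, which is simple rather than fast.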
Definition 2.3 A labeling at node $v$ is a mapping from $E_v$ to $\{1, 2, \ldots\}$.

The labeling of node $v$ in Figure 2.1 assigns values 4, 1, 5, and 3 to the edges emanating from $v$. A ranking is sometimes represented in terms of its ordered pairs. For example, if $L$ is a valid labeling at a node $v$ and $\varphi_i$ is a ranking of $T_{w_i}$, then $L \cup \varphi_1 \cup \cdots \cup \varphi_c$ is a ranking of $T_v$.
Critical Lists and Optimal Rankings
We call a sequence of distinct positive integers that are sorted in decreasing order a decreasing list. Let $U$ denote the set of all decreasing lists. The symbols $>_{lex}$ and $\le_{lex}$ denote the left-to-right lexicographic ordering of lists from $U$.
Definition 2.4 Let $\varphi$ be a ranking of a tree $T = (V, E)$. The critical list of $\varphi$ at node $w$ consists of those values labeling edges in $T_w$ such that on the path from any such edge to the root $w$ there is no edge with a higher label. The values on the critical list are sorted in decreasing order. We denote the critical list for node $w$ under ranking $\varphi$ by $\mathit{crit}(w, \varphi)$; the second argument may be omitted when it is clear from the context or unimportant. We sometimes write $\mathit{crit}(v \to w)$ to denote $\mathit{crit}(w)$.

Intuitively, $\mathit{crit}(w)$ contains values that are critical in extending the ranking further up the tree. In particular, labeling the edge emanating upward from $w$ with any value from the critical list would violate the definition of ranking. For example, $\mathit{crit}(v)$ for the tree shown in Figure 2.1 is 5, 4, 3, 1.

All known algorithms for node ranking compute the same ranking. Just as in the case of node ranking, it is useful to focus on specific types of edge rankings. We introduce two notions of optimality for edge rankings defined in terms of critical lists.
Definition 2.5 Let $\varphi$ be a ranking of a tree $T = (V, E)$ rooted at $r$. The ranking $\varphi$ of $T$ is list-optimal at $r$ if $\mathit{crit}(r, \varphi) \le_{lex} \mathit{crit}(r, \psi)$ for all rankings $\psi$ of $T$. A ranking of $T$ is supercritical if it is list-optimal at every node.

It follows from the definition above that every tree has a list-optimal ranking, because this is simply an optimal ranking that has the lexicographically least critical list at the root. However, it does not follow that every tree has a supercritical ranking. It is plausible that to obtain a list-optimal ranking at the root, some subtree may require a non-optimal ranking. We will prove that every tree does in fact have a supercritical ranking. The algorithm specified in Section 5 always finds one.
Definition 2.6 Let $l$ be a positive integer and $K \in U$ be a decreasing list, where $l \notin K$. $\mathit{cover}(l, K)$ denotes the decreasing list of values on $K$ that are less than $l$. $\mathit{cover}(l, K)$ is called the cover list. We say $l$ covers every element in the cover list. If $L$ is a labeling that assigns label $l$ to edge $v \to w$, the cover list of $v \to w$ is $\mathit{cover}(l, \mathit{crit}(v \to w))$. It will also be denoted by $\mathit{cover}(l, v \to w)$ and $\mathit{cover}(L, v \to w)$.

For example, given the critical list 7, 4, 2, 1 for node $w_i$, the label 5 assigned to edge $v \to w_i$ would cover the values 4, 2, and 1. The cover list in this case is 4, 2, 1. If $\mathit{crit}(e_i) = 8, 6, 3, 1$ then $\mathit{cover}(9, e_i) = 8, 6, 3, 1$; $\mathit{cover}(7, e_i) = 6, 3, 1$; $\mathit{cover}(4, e_i) = 3, 1$; and $\mathit{cover}(1, e_i)$ would be undefined since 1 appears on $\mathit{crit}(e_i)$.
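The cover list, and the way critical lists propagate up the tree, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the names `cover` and `crit_after_labeling` are hypothetical, decreasing lists are represented as Python lists, and the child critical lists in the last call are invented for the example (the lists of Figure 2.1 are not reproduced here). The second function follows Definition 2.4: a value stays visible at $v$ only if it is the label of an edge leaving $v$ or lies above that label on the child's critical list.

```python
def cover(label, crit):
    """Cover list of Definition 2.6: the values on `crit` smaller than `label`."""
    if label in crit:
        raise ValueError("cover is undefined: the label occurs on the critical list")
    return [x for x in crit if x < label]


def crit_after_labeling(labels, crits):
    """Critical list at v, given labels[i] on edge v -> w_i and crits[i] = crit(w_i).

    Assumes the labeling is valid, so no value ends up visible twice.
    """
    visible = []
    for l, c in zip(labels, crits):
        visible.append(l)                      # the new label itself
        visible.extend(x for x in c if x > l)  # values that l does not cover
    return sorted(visible, reverse=True)


print(cover(5, [7, 4, 2, 1]))   # [4, 2, 1]
print(cover(7, [8, 6, 3, 1]))   # [6, 3, 1]
# Hypothetical child lists; the labels 4, 1, 5, 3 are those at v in Figure 2.1.
print(crit_after_labeling([4, 1, 5, 3], [[2, 1], [], [3, 1], [2]]))  # [5, 4, 3, 1]
```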
Valid and Optimal Labelings
We now define the notion of an optimal labeling that relies essentially on local constraints.

Definition 2.7 A labeling $L$ of $E_v = \{v \to w_1, \ldots, v \to w_c\}$ is valid with respect to the rankings $\varphi_1, \ldots, \varphi_c$ (not necessarily supercritical) of the subtrees $T_{w_1}, \ldots, T_{w_c}$ if $L \cup \varphi_1 \cup \cdots \cup \varphi_c$ is a ranking. A valid labeling $L$ of $E_v$ is optimal at $v$ if the ranking $\varphi = L \cup \varphi_1 \cup \cdots \cup \varphi_c$ is such that $\mathit{crit}(v, \varphi) \le_{lex} \mathit{crit}(v, \psi)$ for every ranking $\psi$ of $T_v$ that coincides with $\varphi$ on the $T_{w_i}$'s.

The proof of the lemma given below follows immediately from the previous definitions.

Lemma 2.8 A labeling $L$ of $E_v$ is valid with respect to rankings $\varphi_1, \ldots, \varphi_c$ on $S_v$ if and only if the following three conditions hold:
1. each edge $v \to w_i$, $1 \le i \le c$, gets a unique label;
2. if a label $l$ on an edge emanating from $v$ also occurs on $\mathit{crit}(v \to w_j)$ for some $j$, then $l$ must be covered by the label of $v \to w_j$;
3. any value that appears on more than one critical list is covered on all or all but one of these lists.

The lemma shows that whether a labeling of $E_v$ is valid with respect to the rankings $\varphi_1, \ldots, \varphi_c$ depends only on the critical lists $\mathit{crit}(w_1, \varphi_1), \ldots, \mathit{crit}(w_c, \varphi_c)$ of $v$'s children. Furthermore, this implies that the optimality of a labeling at $v$ depends only on the critical lists of the rankings of the subtrees.
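Because Lemma 2.8 refers only to labels and critical lists, its three conditions translate directly into a local test. The Python sketch below is illustrative; the function name `is_valid_labeling` and the list-based input format (`labels[i]` for edge $v \to w_i$, `crits[i]` for $\mathit{crit}(w_i)$) are assumptions introduced for this example.

```python
from collections import Counter


def is_valid_labeling(labels, crits):
    """Test the three conditions of Lemma 2.8 on labels[i] and crits[i] = crit(w_i)."""
    # Condition 1: every edge gets a unique label.
    if len(set(labels)) != len(labels):
        return False
    # Condition 2: a label that occurs on crit(w_j) must be covered by the
    # label of v -> w_j, that is, be strictly smaller than that label.
    for l in labels:
        for lj, cj in zip(labels, crits):
            if l in cj and not lj > l:
                return False
    # Condition 3: a value may stay uncovered on at most one critical list.
    uncovered = Counter()
    for lj, cj in zip(labels, crits):
        for x in cj:
            if x > lj:
                uncovered[x] += 1
    return all(count <= 1 for count in uncovered.values())


print(is_valid_labeling([2, 3], [[1], [1]]))   # True: 1 is covered on both lists
print(is_valid_labeling([2, 1], [[1], [1]]))   # False: label 1 sits on an uncovered 1
print(is_valid_labeling([1, 2], [[3], [3]]))   # False: 3 is uncovered twice
```

Note that the test never inspects the subtrees themselves, which is the point made after the lemma.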
3 Overall Strategy: Intuition and Road Map

We briefly provide some intuition and a road map of our overall strategy in developing the edge-ranking algorithm. The strategy consists of rooting the tree and then, proceeding in a bottom-up fashion, labeling the edges $e_1, e_2, \ldots, e_c$ emanating from each node $v$ once its subtrees have been fully labeled. For such a strategy to succeed, the local labeling of the $e_i$'s must be such that the resulting edge ranking of the subtree $T_v$ rooted at $v$ can be extended to a globally optimal ranking of any tree that happens to have $T_v$ as a subtree.

In Section 4 we will prove a local criterion for recognizing, in polynomial time, locally optimal labelings that lead to globally optimal edge rankings. This criterion reduces the problem of optimally edge ranking a tree in polynomial time to that of designing a polynomial-time algorithm capable of finding locally optimal labelings from among the exponentially many valid local labelings. In Section 5 we identify and prove the fundamental structural properties of the exponential search space from which our algorithm draws its strategy. Exploiting this structure, the algorithm narrows down its search and finds the desired locally optimal labelings in polynomial time.
4 From Local to Global Optimality

In this section we formally define the labeling problem and prove a theorem that reduces the problem of finding an optimal ranking to it. Once this is complete, we focus our attention on solving the labeling problem in the next section.
Definition 4.1 Labeling Problem
Input: A node $v$ with children $w_1, \ldots, w_c$ and their associated critical lists $\mathit{crit}(w_1), \ldots, \mathit{crit}(w_c)$ corresponding to the rankings $\varphi_1, \ldots, \varphi_c$ of $T_{w_1}, \ldots, T_{w_c}$, respectively.
Output: A valid labeling $L$ of $E_v$ that is optimal at $v$, and $\mathit{crit}(v, L \cup \varphi_1 \cup \cdots \cup \varphi_c)$.
It should be noted that any set of lists can be viewed as the input to the labeling problem. This is because any list $K = \{l_1, \ldots, l_d\}$ from $U$ can be viewed as the critical list resulting from a ranking of some tree. For example, take the tree consisting of root $w$ with children $x_1, \ldots, x_d$, where the $i$-th child $x_i$ has $l_i - 1$ children; labeling the edges below $x_i$ with $1, \ldots, l_i - 1$ and the edge $w \to x_i$ with $l_i$ gives a ranking whose critical list at $w$ is $K$.

Furthermore, the output of the labeling problem does not depend on the rankings $\varphi_i$ or the subtrees on which they are defined but only on the set of critical lists $\mathit{crit}(w_i)$. More formally stated, consider two instances $I$ and $I'$ of the labeling problem whose inputs are $v$, $w_i$, $\varphi_i$, and $T_{w_i}$, and $v$, $w_i$, $\varphi'_i$, and $T'_{w_i}$, for $1 \le i \le c$, such that $\mathit{crit}(w_i, \varphi_i) = \mathit{crit}(w_i, \varphi'_i)$. The solutions to $I$ and $I'$ are the same.

The solution of the labeling problem allows us to extend supercritical rankings. The output labeling together with the supercritical rankings of the subtrees $T_{w_i}$ produces a supercritical ranking for $T_v$, as the following theorem shows.
Theorem 4.2 Let $T$ be a tree rooted at $r$. Consider an internal node $v$ with children $w_1, \ldots, w_c$. Suppose $\varphi_1, \ldots, \varphi_c$ are rankings of $T_{w_1}, \ldots, T_{w_c}$, where $\varphi_i$ is list-optimal on $T_{w_i}$, $1 \le i \le c$. An optimal labeling $L$ of $E_v$ combined with the $\varphi_i$'s is a list-optimal ranking of $T_v$.
Proof: Let $\varphi = \varphi_1 \cup \cdots \cup \varphi_c$. To show that $L \cup \varphi$ is list-optimal, we consider an arbitrary alternative ranking $\varphi'$ and show that $\mathit{crit}(v, L \cup \varphi)$ is lexicographically less than or equal to $\mathit{crit}(v, \varphi')$. Suppose that $\varphi'$ assigns the labels $l'_i$ to the edges $v \to w_i$, for $1 \le i \le c$. Since $\varphi$ is list-optimal on each $T_{w_i}$, for $1 \le i \le c$, $\mathit{crit}(w_i, \varphi') \ge_{lex} \mathit{crit}(w_i, \varphi)$. Let $p_i$ denote the largest value on $\mathit{crit}(w_i, \varphi')$ that is not on $\mathit{crit}(w_i, \varphi)$, or 0 if the lists are equal. Define a ranking $\varphi''$ of $T_v$ as $\varphi$ combined with the following labeling: for $1 \le i \le c$, edge $v \to w_i$ gets labeled with $\max\{l'_i, p_i\}$. Our goals are to prove that $\mathit{crit}(v, L \cup \varphi) \le_{lex} \mathit{crit}(v, \varphi'')$ and $\mathit{crit}(v, \varphi'') \le_{lex} \mathit{crit}(v, \varphi')$; since $\le_{lex}$ is transitive, these two assertions would establish that $\mathit{crit}(v, L \cup \varphi) \le_{lex} \mathit{crit}(v, \varphi')$, as desired.

First we must show that $\varphi''$ is a legal ranking of $T_v$. Suppose we fix a labeling $L'$ that, when coupled with a ranking $\psi$ of $S_v$, yields a ranking of $T_v$. Suppose an edge $v \to w_i$ gets label $l$ in $L'$. It will be useful to consider the values on the critical list of $w_i$ that are not covered by the label $l$ assigned to $v \to w_i$. These are the values on $\mathit{crit}(v \to w_i, \psi) \setminus \mathit{cover}(L', v \to w_i)$ together with $l$. We call them the exposed values for edge $v \to w_i$. To show that $\varphi''$ is a ranking, it is sufficient to prove that no value is exposed for two different edges. As we can see by examining the following two cases, for $1 \le i \le c$, all exposed values in $\varphi''$ are also exposed for the same edge in $\varphi'$.

Case 1. Suppose $v \to w_i$ gets labeled by $l'_i$. By construction, all differences in the lists $\mathit{crit}(v \to w_i, \varphi'') = \mathit{crit}(v \to w_i, \varphi)$ and $\mathit{crit}(v \to w_i, \varphi')$ occur below $p_i$, which is less than or equal to $l'_i$. So, any value exposed on $\mathit{crit}(v \to w_i, \varphi'')$ is also exposed on $\mathit{crit}(v \to w_i, \varphi')$.

Case 2. Suppose $v \to w_i$ gets labeled by $p_i$. Since $p_i > l'_i$ and, above $p_i$, $\mathit{crit}(v \to w_i, \varphi'')$ and $\mathit{crit}(v \to w_i, \varphi')$ agree, the values exposed for edge $v \to w_i$ in $\varphi''$ are also exposed in $\varphi'$.

Since $L \cup \varphi$ and $\varphi'$ are legal rankings, no value is exposed twice in $\varphi''$; therefore $\varphi''$ is a ranking of $T_v$.

To see that $\mathit{crit}(v, L \cup \varphi) \le_{lex} \mathit{crit}(v, \varphi'')$, observe that for each $w_i$ the rankings $\varphi$ and $\varphi''$ agree on $T_{w_i}$ and that, for each edge emanating from $v$, the label assigned to it by $\varphi''$ is at least as large as the label assigned by $\varphi'$. From the observation made above that all exposed values in $\varphi''$ are also exposed in $\varphi'$, it follows that $\mathit{crit}(v, \varphi'') \le_{lex} \mathit{crit}(v, \varphi')$. This completes the proof.
Corollary 4.3 Let $\varphi$ be a ranking of $T_v$ that is supercritical on $T_w$ for every child $w$ of $v$. If the labeling given by the restriction of $\varphi$ to $E_v$ is optimal, then $\varphi$ is supercritical. A ranking of a rooted tree obtained by successively computing optimal labelings of each node, starting with the leaves, is supercritical.
5 Narrowing Down the Search Space

The sequence of lemmas presented in this section helps us restrict our search in solving the labeling problem. They all concern the existence or non-existence of optimal labelings obeying certain restrictions. To identify a labeling, we often list the labels and their associated edges as ordered pairs, so that $(l_i, e_i)$ means that label $l_i$ is assigned to edge $e_i$. We will usually sort a labeling by the order of its labels ($l$ values) and not by edge names. In all of the proofs where the order matters, it is convenient to have $(l_c, e_c)$ correspond to the edge with the largest label and $(l_1, e_1)$ correspond to the edge with the smallest label. This means that $e_i$ is the edge with the $i$-th smallest label, and which edge that is may differ from context to context. Thus $e_i$ is usually not the same as $v \to w_i$. The following set of assumptions pertains to all lemmas presented in this section.
Assumptions 5.1 Let $T$ be a rooted tree. Let $v$ be an internal node of $T$ with children $w_1, \ldots, w_c$. Let $\varphi$ be a partial edge ranking of $T_v$ on $T_{w_1}, \ldots, T_{w_c}$. That is, $\varphi$ labels all edges in $T_v$ except those of $E_v$.
Lemma 5.2 Let $L = (l_c, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$ be a valid labeling such that $l_c > l_{c-1} > \cdots > l_1$. There is no optimal labeling in which the highest label used is greater than $l_c$.
Proof: Consider the critical list $\mathit{crit}(v, L \cup \varphi)$ that will be passed up the tree as a result of the labeling $L$. It starts with a (possibly empty) prefix of values $a_1, a_2, \ldots, a_t$ that are all bigger than $l_c$. Since $L$ is valid and none of the $a_i$'s is covered, each $a_i$ is on the critical list of exactly one child of $v$. Set $a_{t+1}$ equal to $l_c$. The remainder of the proof is by contradiction. Suppose we have an optimal labeling $M$ in which the highest value used, $h$, is greater than $l_c$. Suppose that $a_i \ge h \ge a_{i+1}$. The values $a_1, a_2, \ldots, a_i$ must be uncovered in $M$. We cannot have $h = a_i$ or $h = a_{i+1}$ because then $M$ would be invalid (the label $h$ would also be uncovered on an input critical list). Thus $a_i > h > a_{i+1}$. This means $\mathit{crit}(v, M \cup \varphi)$ will start $a_1, a_2, \ldots, a_i, h$, which is lexicographically greater than $\mathit{crit}(v, L \cup \varphi)$, which starts $a_1, a_2, \ldots, a_i, a_{i+1}$. This contradicts the assumption that $M$ is an optimal labeling.

Lemma 5.3 Let $L = (l_c, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$ be an optimal labeling such that $l_c > l_{c-1} > \cdots > l_1$. Let $E_j = \{e_j, e_{j-1}, \ldots, e_1\}$; that is, $E_j$ is a set comprising a suffix of the edge list when sorted by labels in $L$. The labeling $L$ restricted to the edge set $E_j$ (that is, the labeling $(l_j, e_j), (l_{j-1}, e_{j-1}), \ldots, (l_1, e_1)$) is optimal for the subtree of $T$ containing only $v$, the edges in $E_j$, and the subtrees descending from those edges.
Proof: The validity of $L$ restricted to $E_j$ follows from the overall validity of $L$ and Lemma 2.8. By the contrapositive of Lemma 5.2, there is an optimal labeling $L'$ of $E_j$ that uses no value larger than $l_j$. If $L'$ were better than $L$ restricted to $E_j$, we could replace that part of $L$ with $L'$ and get a better overall labeling, contradicting the assumption that $L$ is optimal.

The following definition is crucial in narrowing down the search space, as it identifies a subclass in which there is an optimal labeling that we can find in polynomial time.
Definition 5.4 Given the assumptions highlighted at the beginning of this section, $L = (l_c, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$ is a greedy cover labeling (abbreviated to gc labeling) if for every $i$, and for every $j < i$, $\mathit{cover}(l_i, e_i) \ge_{lex} \mathit{cover}(l_i, e_j)$.

That is to say, when we choose to use the label $l_i$ on one of the edges in the suffix $E_i = \{e_i, e_{i-1}, \ldots, e_1\}$, the edge $e_i$ we choose is the edge where the value $l_i$ will cover as much as possible. It will be important in many places to assume that ties in the above definition are broken consistently; that is, if $\mathit{cover}(l_i, e_i) = \mathit{cover}(l_i, e_j)$, then some auxiliary algorithm deterministically chooses the edge to label $l_i$. We can do this by assuming that there is a left-to-right order on the edges and that the leftmost edge always wins a tie.
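The greedy cover condition can be tested directly from Definition 5.4. The Python sketch below is an illustration under stated assumptions: the name `is_gc_labeling` and the list-based input format are hypothetical, the labeling is assumed to be valid already, and the tie-breaking rule is not checked. It uses the fact that Python compares lists lexicographically from left to right, which matches the ordering on decreasing lists.

```python
def cover(label, crit):
    """Values on the decreasing list `crit` that are smaller than `label`."""
    return [x for x in crit if x < label]


def is_gc_labeling(labels, crits):
    """Check Definition 5.4 for a valid labeling.

    labels[e] is the label of edge e and crits[e] its critical list.
    Ties are accepted without checking the left-to-right rule.
    """
    # Visit edges from largest label to smallest.
    order = sorted(range(len(labels)), key=lambda e: labels[e], reverse=True)
    for pos, e in enumerate(order):
        own = cover(labels[e], crits[e])
        # No edge with a smaller label may offer a lexicographically larger cover.
        for other in order[pos + 1:]:
            if cover(labels[e], crits[other]) > own:
                return False
    return True


crits = [[2], []]
print(is_gc_labeling([3, 1], crits))   # True: label 3 covers the 2 on edge 0
print(is_gc_labeling([1, 3], crits))   # False: label 3 covers nothing on edge 1
```

In the second call the labeling is valid but wastes its largest label on an edge with an empty critical list, which is exactly what the definition rules out.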
Lemma 5.5 Let $L = (l_c, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$ be a valid labeling such that $l_c > l_{c-1} > \cdots > l_1$. Then there is a valid gc labeling whose biggest label is also $l_c$. If $L$ is optimal, then the corresponding gc labeling can be made optimal too.

Proof: The proof is by induction on the number of edges.
Base Case (1 edge). Any labeling on a single edge is a gc labeling.

Induction Step. Let $k$ be given. Assume that the lemma is true if there are strictly fewer than $k$ edges to label. Let an arbitrary instance of the labeling problem for a set $E$ with $k$ edges and a valid labeling $L$ for that instance (as in the statement of the lemma) be given. If for every $j < k$, $\mathit{cover}(l_k, e_k) \ge_{lex} \mathit{cover}(l_k, e_j)$, then the choice of labeling $(l_k, e_k)$ is consistent with the gc labeling definition. By the induction hypothesis, we can find a gc labeling on the edge set $E_{k-1} = E \setminus \{e_k\}$ using no label greater than $l_{k-1}$; combining this labeling with the label $(l_k, e_k)$ makes a gc labeling for the entire edge set with largest label $l_k$. Furthermore, if $L$ is optimal then by Lemma 5.3 the labeling of $E_{k-1}$ is optimal. By the induction hypothesis, we can find a gc labeling of $E_{k-1}$ that is optimal, and combining it with $(l_k, e_k)$ gives an optimal gc labeling for $E$.

Otherwise, there exists a $j$ such that $\mathit{cover}(l_k, e_j) >_{lex} \mathit{cover}(l_k, e_k)$. Choose the $j$ that lexicographically maximizes $\mathit{cover}(l_k, e_j)$. We construct a modification of the labeling $L$ that is still valid and, as required by the definition of gc labeling, assigns its maximum label $l_k$ to $e_j$. We can then use the induction hypothesis, exactly as above, to fill out a gc labeling for $E_{k-1}$. Since $\mathit{cover}(l_k, e_j) >_{lex} \mathit{cover}(l_k, e_k)$, there exists a largest value $m$ that is on the list $\mathit{cover}(l_k, e_j)$ but not on the list $\mathit{cover}(l_k, e_k)$. There are two cases depending on the relative values of $m$ and $l_j$.

Case 1. $m < l_j$. We modify $L$ to $L'$ by swapping the labels $l_j$ and $l_k$, so that $e_j$ gets label $l_k$ and $e_k$ gets label $l_j$. We need to prove that this has no impact on the other labels and preserves optimality, if $L$ is optimal. Observe that by the definition of $m$, the lists $\mathit{crit}(e_j)$ and $\mathit{crit}(e_k)$ agree on all values in the interval between $m$ and $l_k$ (i.e., each such value is on both critical lists or on neither). Everything below $m$ gets covered on both lists in $L$ and $L'$. Thus the exposed values from the union of $\mathit{crit}(e_j)$ and $\mathit{crit}(e_k)$ are the same in $L$ and $L'$.

Case 2. $m \ge l_j$. We modify $L$ to $L''$ by labeling $e_j$ with $l_k$ and labeling $e_k$ with $m$; all the other labels stay the same. By the construction in this case, $m$ was on $\mathit{crit}(e_j)$ in $L$ and was left exposed when we labeled $e_j$ with $l_j$. Since $L$ is valid, $m$ cannot occur as an exposed value on any other critical list of $L$ or as a label in $L$. Therefore, since in $L''$ we covered the value $m$ on $\mathit{crit}(e_j)$ with a label $l_k$ greater than $m$, we can now reuse $m$ to label $e_k$ and still have a valid labeling. As in Case 1, the definition of $m$ ensures that the values in the interval between $m$ and $l_k$ that are left exposed in $L$ are the same as the values left exposed in $L''$. All values below $m$ on $\mathit{crit}(e_j)$ and $\mathit{crit}(e_k)$ are covered in $L''$, so in terms of optimality, $L''$ is at least as good as $L$.
Corollary 5.6 There is always an optimal labeling that is also a gc labeling. Based on the previous corollary, we can restrict our search for labelings to the class of gc labelings. The next lemma tells us which gc labeling we are searching for.
Lemma 5.7 Among all valid gc labelings, the labeling that has the lexicographically smallest list of labels sorted from large to small is optimal.

Proof: Consider two arbitrary competing valid gc labelings

$L = (l_c, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$
$L' = (l'_c, e'_c), (l'_{c-1}, e'_{c-1}), \ldots, (l'_1, e'_1)$

that are both sorted in decreasing order of label values. To help with intuition, consider that in general, two labelings may label the edges in totally different orders; the fact that $L$ and $L'$ are both gc labelings restricts this freedom.

Assume without loss of generality that $L$ has a lexicographically smaller label list than $L'$; in particular, let $j$ be the highest index at which $l_j$ is less than $l'_j$. All the labels with larger index agree: $l_c = l'_c$, $l_{c-1} = l'_{c-1}$, $\ldots$, $l_{j+1} = l'_{j+1}$. Now consider the definition of gc labeling (Definition 5.4). Given that the largest label is fixed at $l_c$ (assuming $j \ne c$), the choice of which edge gets that label is deterministic, provided that ties are broken consistently. Thus, since both $L$ and $L'$ are gc labelings, we see that $e_c = e'_c, \ldots, e_{j+1} = e'_{j+1}$. Therefore, $\mathit{crit}(v, L \cup \varphi)$ and $\mathit{crit}(v, L' \cup \varphi)$ agree on all values greater than $l'_j$; each such value is either on both critical lists or on neither. Furthermore, since $l'_j$ is used as a label in $L'$ it will be on $\mathit{crit}(v, L' \cup \varphi)$ and it cannot remain exposed on any critical list $\mathit{crit}(e'_i)$ for $i \ne j$. So, the value $l'_j$ is not on the critical list induced by the labeling $L$, and $L$ is a better labeling than $L'$.

Since $L$ and $L'$ are arbitrary gc labelings, the above argument shows that any gc labeling that does not use a lexicographically minimum label list is not an optimal labeling. By Corollary 5.6, there is a gc labeling that is an optimal labeling. Therefore, the optimal gc labeling must be the gc labeling with lexicographically minimum label list.

Based on the previous lemma, our algorithm will search for a gc labeling with a lexicographically minimum label list. The good news is that given a fixed label list, there is at most one gc labeling using that list (if ties are broken consistently) and it can be found in polynomial time. The bad news is that there appear, at first glance, to be exponentially many label lists to consider.
We need a search strategy to explore the exponential search space in polynomial time. Our idea is to pin down the label list one value at a time from largest to smallest. That is, for a given prefix $l_c, l_{c-1}, \ldots, l_{j+1}$ of the label list, we will be able to determine in polynomial time whether there is a gc labeling whose label list starts with this prefix (and we actually find such a labeling if there is one). The labels below $l_{j+1}$ may not be optimal. The search strategy is described by pseudocode and proved correct in the subsequent lemmas. All but the first lemma concern the specifics of the search algorithm, so the other lemmas are presented after the pseudocode.
Lemma 5.8 Let maxrank be a positive integer. Suppose there is a valid gc labeling of $E_v$ with highest label less than or equal to maxrank and the value maxrank is not on any input critical list of $E_v$. Then there is a valid gc labeling of $E_v$ with highest label maxrank (of course, this labeling may not be optimal).
Proof: Let $L = (l_c, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$ be a valid gc labeling of $E_v$ such that its maximum value $l_c \le$ maxrank. Suppose maxrank is not on any critical list. Define $L'$ to be equal to $(\mathit{maxrank}, e_c), (l_{c-1}, e_{c-1}), \ldots, (l_1, e_1)$. Since maxrank $\ge l_c$ and it is not on any critical list, $L'$ is a valid labeling, although it may not be a gc labeling. By Lemma 5.5, there is a valid gc labeling $L''$ with highest label maxrank.
The Sequential Algorithm
We specify the pseudocode for the ranking algorithm in this section. We implemented the algorithm in C based on our pseudocode. There are six routines; each is described immediately before its pseudocode. In several routines, the parameter E is a subset of the edges emanating from node v, and each edge in the set has a critical list for the tree below it.

The first procedure SortCover(E) is not essential to the correctness of the algorithm, but it improves the running time. This procedure precomputes a table of edge preferences associated with the greedy cover criterion that speeds up the main computation. Ties for assigning labels are broken in a left-to-right manner.

SortCover(E)
    COMMENT: Let n be the number of nodes in the tree. In this routine we build a table CoverOrder[1..n] such that CoverOrder[l] is the list of edges e in E sorted in lexicographically decreasing order by cover(l, e). If we have decided to use l as the next label, then the gc labeling criterion requires that we assign it to the first unlabeled edge in CoverOrder[l].
    CoverOrder[1] contains all edges in left-to-right order;
    For i = 2 to n do
        Scan CoverOrder[i - 1] and copy each edge that has i - 1 on its critical list to the next position in CoverOrder[i];
        Scan CoverOrder[i - 1] again, this time copying, in the same order, each edge that does not have i - 1 on its critical list to the next position in CoverOrder[i];
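The table construction in SortCover is a stable partition repeated for each label value, in the manner of a radix sort. The Python sketch below mirrors the pseudocode; the name `sort_cover`, the use of edge indices 0, 1, ... in left-to-right order, and the dictionary return type are assumptions made for this illustration and are not taken from the C implementation.

```python
def sort_cover(crits, n):
    """Build CoverOrder[1..n] for edges 0..len(crits)-1 (left-to-right order).

    crits[e] is the critical list of edge e. Returns a dict mapping each
    label l to the edges sorted by cover(l, e) in decreasing lexicographic
    order, with ties kept in left-to-right order.
    """
    on_list = [set(c) for c in crits]
    cover_order = {1: list(range(len(crits)))}
    for i in range(2, n + 1):
        prev = cover_order[i - 1]
        # Edges whose critical list contains i - 1 gain a larger cover list
        # for label i, so they move to the front; both passes keep the
        # previous relative order.
        with_value = [e for e in prev if i - 1 in on_list[e]]
        without_value = [e for e in prev if i - 1 not in on_list[e]]
        cover_order[i] = with_value + without_value
    return cover_order


table = sort_cover([[1], [2, 1], []], 4)
print(table[2])   # [0, 1, 2]
print(table[3])   # [1, 0, 2]
```

For label 3 the cover lists are 1, then 2, 1, then the empty list, so edge 1 comes first, as printed.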
The procedure NextEdge(rank, $E$) finds which edge in $E$ gets label rank in a gc labeling of $E$. If no edge can be labeled rank because rank is on a critical list, NextEdge returns error.

NextEdge(rank, $E$)
  If rank is on crit($e_i$) for some $e_i \in E$ then
    Return(error);
  Else
    Let $e_j$ be the first unlabeled edge in the list CoverOrder[rank];
    Return($e_j$);

One can maintain counters of how many critical lists in $E$ contain each value, so that it is possible to test in constant time if rank is on some critical list.
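The counter idea mentioned above can be sketched as follows. This is only an illustration: the class name `EdgeSet`, its fields, and the use of `None` for the error return are assumptions made for the example, and the `cover_order` table is assumed to have been built beforehand in the manner of SortCover.

```python
from collections import Counter


class EdgeSet:
    """Hypothetical container for the edges still in E, with value counters."""

    def __init__(self, crit):
        self.crit = crit                          # crit[e]: values on edge e's list
        self.unlabeled = set(range(len(crit)))
        # counts[v] = number of critical lists in E that contain v
        self.counts = Counter(v for lst in crit for v in lst)

    def remove(self, e):
        """Take edge e out of E once it has been labeled."""
        if e in self.unlabeled:
            self.unlabeled.discard(e)
            self.counts.subtract(self.crit[e])

    def next_edge(self, rank, cover_order):
        """Return the edge that gets label rank, or None for 'error'."""
        if self.counts[rank] > 0:                 # constant-time membership test
            return None
        for e in cover_order[rank]:               # first unlabeled edge wins
            if e in self.unlabeled:
                return e
        return None
```

The scan of `cover_order[rank]` is linear in the number of edges; only the test for whether rank is on some critical list is constant time.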
The routine GreedyCover(maxrank, $E$) determines whether there is a gc labeling of $E$ in which maxrank is the highest label used. If there is such a labeling, it returns one; if not, it returns error.

GreedyCover(maxrank, $E$)
  Let duplicate be the largest value that occurs on two or more input critical lists, or 0 if there are no duplicate values;
  Let $e_c$ := NextEdge(maxrank, $E$);
  If maxrank $\le$ duplicate or $e_c$ = error then
    Return(error);
  Else
    If $E = \{e_c\}$ then
      let reclabel := nil;
    Else
      COMMENT: Try to label $e_c$ with maxrank.
      Let nextrank be the largest integer less than maxrank that is not on any critical list for the edges $E \setminus \{e_c\}$;
      If nextrank is 0 then Return(error);
      Else let reclabel := GreedyCover(nextrank, $E \setminus \{e_c\}$);
    If reclabel = error then Return(error);
    Else Return the labeling (maxrank, $e_c$) concatenated with the labeling reclabel;

Using the counters of how many critical lists in $E$ contain each value, we can search for nextrank in $O(n)$ time.

The procedure Label($E$) finds an optimal labeling for $E$ by searching for the maximum rank to use and using GreedyCover to test if a given maximum rank is possible.

Label($E$)
  If $E = \emptyset$ then Return(nil);
  $i$ := 1;
  While (labeling := GreedyCover($i, E$)) = error do
    $i$ := $i + 1$;
  Let $e_c$ be the edge that gets label $i$ in labeling;
  Return the labeling ($i, e_c$) concatenated with Label($E \setminus \{e_c\}$);

SolveLabeling($v$) is a procedure that, given a node $v$ whose children $w_1, \ldots, w_c$ have had their subtrees labeled, solves the instance of the labeling problem with input crit($w_1$), $\ldots$, crit($w_c$).

SolveLabeling($v$)
  Let $E := \{v \to w_1, \ldots, v \to w_c\}$, where $w_1, \ldots, w_c$ are $v$'s children;
  SortCover($E$);
  Label($E$);
  Compute the critical list for $v$, crit($v$);
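A direct, unoptimized transcription of GreedyCover and Label into Python may help in following the control flow. It is a sketch under stated assumptions rather than the authors' C implementation: edges are left-to-right indices, `crit[e]` is the critical list of edge `e`, `None` stands for error, NextEdge is recomputed from scratch instead of using the CoverOrder table, and `cover` is assumed to be the decreasing list of values below the rank.

```python
from collections import Counter


def cover(rank, lst):
    # Assumed meaning of cover(rank, e): values on the list that rank would cover.
    return sorted((v for v in lst if v < rank), reverse=True)


def next_edge(rank, crit, edges):
    if any(rank in crit[e] for e in edges):
        return None                                # error: rank is on a critical list
    # max() returns the first maximal item, so ties are broken left to right.
    return max(edges, key=lambda e: cover(rank, crit[e]))


def greedy_cover(maxrank, crit, edges):
    counts = Counter(v for e in edges for v in set(crit[e]))
    duplicate = max((v for v, k in counts.items() if k >= 2), default=0)
    ec = next_edge(maxrank, crit, edges)
    if maxrank <= duplicate or ec is None:
        return None
    rest = [e for e in edges if e != ec]
    if not rest:                                   # E = {ec}
        return [(maxrank, ec)]
    used = set().union(*(crit[e] for e in rest))
    nextrank = next((r for r in range(maxrank - 1, 0, -1) if r not in used), 0)
    if nextrank == 0:
        return None
    rec = greedy_cover(nextrank, crit, rest)
    return None if rec is None else [(maxrank, ec)] + rec


def label(crit, edges):
    if not edges:
        return []
    i = 1
    # The loop relies on some i succeeding; i = b + c always does.
    while (labeling := greedy_cover(i, crit, edges)) is None:
        i += 1
    ec = labeling[0][1]                            # edge that received label i
    return [(i, ec)] + label(crit, [e for e in edges if e != ec])
```

For two edges whose critical lists are both {3}, `label` assigns 4 to the left edge (covering its 3) and 1 to the right edge, which leaves a single exposed 3.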
The top-level routine Rank($T$) computes an optimal ranking of the tree $T$ rooted at $r$ by applying SolveLabeling($v$) to each tree node $v$.

Rank($T$)
  For each leaf $x$, set the critical list of $x$ to nil;
  For each node $v$ such that all of its children have had their subtrees labeled do
    SolveLabeling($v$);
The theorem proved below shows that GreedyCover returns the desired labeling.

Theorem 5.9 Consider an instance of the labeling problem at $v$ and a subset $E \subseteq E_v$. If maxrank is a positive integer that is in none of the critical lists crit($e$), $e \in E$, and there is a valid gc labeling of $E$ with highest label less than or equal to maxrank, GreedyCover(maxrank, $E$) returns a valid gc labeling of $E$ with highest label equal to maxrank.

Proof: The proof is by induction on $|E|$.

Base Case. For $|E| = 1$, any valid labeling is a valid gc labeling. Since there is only one edge, the only validity condition we need to check is that maxrank is not on any input critical list. This condition is checked in the call to NextEdge.

Induction Step. Let $k$ be given. Suppose the theorem is true for all calls to GreedyCover in which $E$ is of size $k - 1$. Now consider an arbitrary call to GreedyCover in which $|E| = k$. By assumption, there is a valid gc labeling of $E$ with highest label less than or equal to maxrank. By Lemma 5.8, there is also a valid gc labeling of $E$ with highest label exactly maxrank; call it $L$. The call to NextEdge in GreedyCover determines the unique edge $e_k$ that gets the largest label maxrank in this gc labeling. Let sec be the second largest label in $L$. As in GreedyCover, let nextrank be the largest integer less than maxrank that is not on any critical list for the edges in $E \setminus \{e_k\}$. We claim that sec $\le$ nextrank. Suppose, seeking a contradiction, that nextrank < sec < maxrank. Then, by choice of nextrank, sec must be on the critical list of some edge $e$ in $E \setminus \{e_k\}$. Since all labels that $L$ assigns to $E \setminus \{e_k\}$ are at most sec, the label that $L$ assigns $e$ does not cover sec on $e$'s critical list. Hence Condition 2 of Lemma 2.8 is violated, contradicting the assumption that $L$ is a valid labeling. We have shown that $L$'s second highest label is at most nextrank. Hence $L \setminus \{(e_k, \mathit{maxrank})\}$ is a valid labeling of $E \setminus \{e_k\}$ using no label larger than nextrank.
By inductive hypothesis the recursive call returns such a labeling, reclabel. It remains to show that reclabel $\cup\ \{(e_k, \mathit{maxrank})\}$ is valid. We show this by proving that it satisfies the three validity conditions of Lemma 2.8.

Condition 1. Since maxrank is the highest label we will use, any gc labeling of the remaining unlabeled edges must use all labels less than maxrank. The next largest label nextrank must be less than maxrank and cannot appear on the critical list of any of the unlabeled edges. GreedyCover finds the largest possible value of nextrank. Since the labels are assigned in strictly decreasing order, no label will be assigned to more than one edge.

Condition 2. When we call NextEdge, it checks that the value maxrank is not on any input critical list. Thus it is not possible that the value maxrank is assigned as a label and also remains exposed. All the remaining labels are less than maxrank and all the values below maxrank on crit($e_k$) are covered by maxrank, so it is not possible that some other value will both be assigned as a label and remain exposed on crit($e_k$).

Condition 3. Observe that every rank assigned in GreedyCover is higher than the largest value remaining exposed on two or more critical lists. Thus it is not possible for GreedyCover to return a complete labeling with two copies of a value exposed.

We are seeking a valid gc labeling starting with (maxrank, $e_k$) composed with a gc labeling of $E \setminus \{e_k\}$ using no label larger than nextrank. By the induction hypothesis, the recursive call GreedyCover(nextrank, $E \setminus \{e_k\}$) finds such a labeling.
Since k was arbitrary, the theorem follows by induction on k.
Corollary 5.10 GreedyCover(maxrank, $E$) returns a valid gc labeling of $E$ with highest label equal to maxrank if and only if there is such a labeling.

Theorem 5.11 Under the same hypotheses as Theorem 5.9, Label returns an optimal labeling.

Proof: We will prove that Label produces the valid gc labeling with the lexicographically smallest list of labels. By Lemma 5.7, this labeling is optimal. By Corollary 5.10, GreedyCover will produce a valid gc labeling when it is possible to do so. We need to show that the way in which Label calls GreedyCover and itself produces the gc labeling with lexicographically smallest label list. The proof is by induction on $|E|$. The while loop in Label searches incrementally for the smallest maximum label that can be used in a valid gc labeling. If we choose the right maximum value, the definition of gc labeling forces the choice of which edge in $E$ gets labeled with that value. NextEdge identifies that edge correctly. By Corollary 5.10, the smallest value of $i$ for which GreedyCover($i, E$) returns a (non-error) labeling is the smallest maximum label that can be used in a gc labeling. Since the choice of which edge gets label $i$ is forced, the choice of which $|E| - 1$ edges remain unlabeled is also forced. We can apply the theorem we are trying to prove inductively to the set of remaining edges. The inductive application of the theorem corresponds precisely to the recursive call to Label in the pseudocode.
Corollary 5.12 SolveLabeling solves the labeling problem.
6 Optimal Edge Ranking is in P

In this section we show that the algorithm described in the previous section to find an optimal edge ranking for an $n$-node tree runs in polynomial time. We bound the running time for each routine in the pseudocode. There are a couple of places where we state time bounds for slight variations of the pseudocode that are faster but more complicated than the pseudocode as presented; we explain the variations in the appropriate proofs.
Lemma 6.1 Let $v$ be a vertex with $c$ children $w_1, \ldots, w_c$. Let $b$ be the maximum value appearing on the critical lists crit($w_1$), $\ldots$, crit($w_c$). A call to SortCover($E_v$) can be implemented in $O(cb)$ time.
Proof: We start by building a bit-array that tells us for each edge $e$ and each value $l$ whether $l$ is on crit($e$). There are at most $c$ edges and at most $b$ possible distinct values appearing in the critical lists of $v$'s children, so building the bit-array takes $O(cb)$ time. Once it is built, all the lookups can be done in constant time. Each iteration of the main loop scans the $O(c)$-length edge list from the previous iteration twice, doing a constant amount of work per item. Once the procedure fills in CoverOrder[$b + 1$] it stops. Thus the for loop takes $O(cb)$ time. In the actual implementation, stopping at value $b$ means that we need to remember $b$, and if in NextEdge we want to look up an index of CoverOrder that is bigger than $b$, we must look up under $b$ instead.
Lemma 6.2 Let $v$ be a vertex with $c$ children $w_1, \ldots, w_c$ and let $E \subseteq E_v$. Let $\sigma$ be the maximum of $c$ and the sum of the lengths of the critical lists crit($w$) of the edges $v \to w \in E$. A call to GreedyCover(maxrank, $E$) takes $O(|E|\,\sigma)$ time.

Proof: We begin by analyzing the call GreedyCover makes to NextEdge. A call to NextEdge(rank, $E$) takes $O(\sigma)$ time as shown below. Checking whether rank is on any critical list can be done by scanning each list. Two critical lists can be compared in lexicographic order in time proportional to the length of the shorter list. Therefore, finding the lexicographic maximum of a collection of such lists of total size $\sigma$ takes $O(\sigma)$ time. Scanning the list of edges in CoverOrder[rank] takes $O(\sigma)$ time. Let $k = |E|$ denote the remaining number of children to be labeled. Let $t(\mathit{NextEdge}, \sigma)$ denote the time for a call to NextEdge under the assumptions in the lemma and let $t(\mathit{GreedyCover}, \sigma, k)$ denote the time for the call to GreedyCover. As above, scanning all the critical lists to find duplicate and nextrank can be done in $O(\sigma)$ time. Therefore, the running time of GreedyCover satisfies the following recurrence:

$$t(\mathit{GreedyCover}, \sigma, k) = t(\mathit{GreedyCover}, \sigma, k - 1) + O(t(\mathit{NextEdge}, \sigma)) + O(\sigma), \quad 0 < k \le |E|$$
$$t(\mathit{GreedyCover}, \sigma, 0) = O(1).$$

Filling in for NextEdge, this becomes

$$t(\mathit{GreedyCover}, \sigma, k) = t(\mathit{GreedyCover}, \sigma, k - 1) + O(\sigma),$$

whose solution is $O(|E|\,\sigma)$.
Theorem 6.3 Let $I$ be an instance of the labeling problem in which vertex $v$ has $c$ children $w_1, \ldots, w_c$. Let $b$ be the maximum value appearing on the critical lists crit($w_1$), $\ldots$, crit($w_c$) and let $s$ be the sum of the lengths of the $c$ critical lists. This instance of the labeling problem can be solved by the call SolveLabeling($v$) in $O(c^2 s \log(b + c) + cb)$ time.

Proof: There are four steps in the procedure SolveLabeling($v$). We analyze the complexity for each of these steps below. Setting up $E_v$ requires $O(c)$ time. The call to SortCover($E_v$) takes $O(cb)$ time by Lemma 6.1. We analyze the call to Label below. Let $k = |E|$ denote the number of remaining children to be labeled. Let $t(\mathit{Label}, s, k)$ be the running time of Label($E$) when $v$ has $k$ children and the total size of the critical lists in $E$ is at most $s$. As written in the pseudocode, the running time of Label($E$) satisfies the following recurrence:
$$t(\mathit{Label}, s, k) = t(\mathit{Label}, s, k - 1) + (b + c) \cdot O(t(\mathit{GreedyCover}, s, k)) + O(s + c), \quad 0 < k \le c$$
$$t(\mathit{Label}, s, 0) = O(1),$$

because we try up to $b + c$ candidate values to find the next edge label, and each candidate label necessitates a call to GreedyCover($E$) and $O(s + k)$ bookkeeping. We can instead search for the next label by binary search. Since we know that the largest label we need is at most $b + c$ and since $k \le |E_v| = c$, our binary search is in the range $[1, b + c]$ and we get the new recurrence:

$$t(\mathit{Label}, s, k) = t(\mathit{Label}, s, k - 1) + \log(b + c) \cdot O(t(\mathit{GreedyCover}, s, k)) + O(s + c).$$

By Lemma 6.2, this recurrence reduces to

$$t(\mathit{Label}, s, k) = t(\mathit{Label}, s, k - 1) + O(cs \log(b + c)),$$

whose solution is $O(c^2 s \log(b + c))$.
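The binary search mentioned in the proof can be sketched generically. The sketch below assumes, following Lemma 5.8, that feasibility is monotone over the candidate values that appear on no critical list, and that the largest candidate is feasible. The names `smallest_feasible`, `candidates` and `feasible` are hypothetical; in the algorithm, `feasible(i)` would stand for "GreedyCover(i, E) does not return error".

```python
def smallest_feasible(candidates, feasible):
    """Return the smallest feasible value in a sorted candidate list.

    Assumptions (not checked here): candidates is sorted increasingly,
    candidates[-1] is feasible, and feasibility is monotone over the list.
    """
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(candidates[mid]):
            hi = mid          # the answer is at mid or to its left
        else:
            lo = mid + 1      # the answer is strictly to the right
    return candidates[lo]
```

The number of calls to `feasible` is logarithmic in the number of candidates, which is where the $\log(b + c)$ factor in the recurrence comes from.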
The fourth step involves computing $v$'s critical list. To compute the new critical list we use a bit-array where the $i$-th entry is set to 1 if $i$ is a new label or $i$ is exposed on a critical list. We can compute the bit-array by scanning each critical list in $O(c + s)$ time. We can convert the bit-array into a list in $O(b + c)$ time. Summing up all the partial time bounds yields the claimed total bound.

In Section 7, we will need the following special case of the previous theorem.

Corollary 6.4 We can solve an instance of the labeling problem as given in Theorem 6.3 in which the $c$ critical lists are decreasing sequences of distinct integers from $\{1, 2, \ldots, B\}$ in $O(c^3 B \log(c + B))$ time.

Theorem 6.5 A supercritical edge ranking of an $n$-node tree can be found in $O(n^3 \log n)$ time. Thus the edge-ranking problem is in P.

Proof: The correctness of our algorithm follows from Theorem 5.11 and Corollary 4.3. Algorithm Rank($T$) consists of one call to SolveLabeling($v$) for each node $v \in V$. Let $c_v$ be the number of children of $v$ and $s_v$ be the sum of the lengths of the critical lists crit($w$) for $v \to w \in E_v$. Applying Theorem 6.3, Rank's total time complexity is $O(\sum_{v \in V} c_v^2 s_v \log(b_v + c_v))$, which is $O(n^2 \log n \sum_{v \in V} c_v)$, by observing that the maximum possible value for $b + c$ and $s$ is $n$. Finally, noting that the sum of the values of $c_v$ at each node is exactly the number of edges in the tree, which is $n - 1$, the time bound stated in the theorem follows.
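The bit-array computation of crit($v$) used in the proof of Theorem 6.3 can be sketched as follows. The sketch assumes that a value on an edge's critical list is exposed exactly when it is larger than the label given to that edge (smaller values being covered by the label); the function name and the dictionary representation of the labeling are illustrative choices, not the paper's.

```python
def new_critical_list(labels, crit, size):
    """Compute the critical list of v as a decreasing list.

    labels: dict mapping edge index -> label assigned to that edge
    crit:   crit[e] is the critical list of the subtree below edge e
    size:   an upper bound on every label and value (b + c suffices)
    """
    bits = [False] * (size + 1)            # bits[i] is True iff i is on crit(v)
    for e, lab in labels.items():
        bits[lab] = True                   # every new label is visible at v
        for value in crit[e]:
            if value > lab:                # assumed: not covered, hence exposed
                bits[value] = True
    # Convert the bit-array into a decreasing list.
    return [i for i in range(size, 0, -1) if bits[i]]
```

For the two-edge instance with critical lists {3} and {3} labeled 4 and 1, the result is the list 4, 3, 1.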
7 An NC Algorithm for Constant Degree Trees

This section shows that the problem of optimally edge ranking a tree is in the class NC for constant degree trees. There is also a parallel approximation algorithm that yields an edge ranking within a factor of two of optimal. The algorithm is a relatively straightforward parallelization of the sequential approximation algorithm given in [12]. The algorithm is fully described in [5].

Theorem 7.1 Let $T$ be an $n$-node tree with rank($T$) at most $B$ and such that the degree of any node is at most $D$. Then $T$ can be supercritically ranked in $O(\log^2 n + D^3 B \log n \log(B + D))$ time using $n 2^B / \log n$ processors on a CREW PRAM.

Proof: The following divide and conquer algorithm computes both $T$'s supercritical ranking with respect to a given root $r$ and the list crit($r$), within the claimed time and processor bounds. If $T$ consists of a single node $r$, the algorithm returns the empty ranking and $\emptyset$ as crit($r$). Otherwise, the algorithm's divide, conquer, and combine phases work as follows.

The divide phase computes a splitting path $P = \langle v_1, \ldots, v_t \rangle$, whose removal divides $T$ into subtrees of size at most $n/2$. $P$ is the unique path such that $v_t = r$, and $v$ is on $P$ if and only if $T_v$ has more than $n/2$ nodes. For the tree in Figure 7.2, for example, $P$ consists of the bold lines. This phase can be implemented in $O(\log n)$ time using $n$ processors using Euler tour techniques.

Let $X = \{w \mid w \notin P,\ w\text{'s parent} \in P\}$. In the conquer phase, the algorithm calls itself recursively to simultaneously compute, for every subtree $T_w$ such that $w \in X$, the supercritical ranking of $T_w$ and the list crit($w$).

The combine phase of the algorithm extends the ranking to all the edges emanating from path $P$ in three steps. At this point, for every $v_i \in P$ and each of its children $w \notin P$, the critical list crit($w$) is already computed. In particular, the critical list crit($w$) of all of $v_1$'s children is computed.
Let $U_B$ denote the set of all decreasing lists of distinct integers from $\{1, 2, \ldots, B\}$.
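Since a decreasing list of distinct integers is determined by the set of its elements, $U_B$ has exactly $2^B$ members (including the empty list), which is the count used in the processor bounds. A small Python sketch that enumerates $U_B$ makes this concrete; the function name `decreasing_lists` is only illustrative.

```python
from itertools import combinations


def decreasing_lists(B):
    """Enumerate U_B: all decreasing lists of distinct integers from {1, ..., B}."""
    result = []
    for k in range(B + 1):                          # k = length of the list
        for subset in combinations(range(1, B + 1), k):
            result.append(tuple(sorted(subset, reverse=True)))
    return result
```

Because the enumeration is exponential in $B$, it is only practical when $B$ is small, for example logarithmic in $n$.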
Figure 7.2: A supercritical ranking of a tree. The tree is rooted at the top node. The splitting path for this tree is illustrated by the bold lines.
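The splitting path of the divide phase can be computed sequentially in a few lines, which may help in checking small examples by hand. The sketch below is not the parallel Euler tour implementation referred to in the proof; it assumes the rooted tree is given as a hypothetical `children` dictionary and simply follows, from the root, the unique child whose subtree has more than $n/2$ nodes.

```python
def splitting_path(children, root):
    """Return the splitting path [v_1, ..., v_t] with v_t = root.

    children: dict mapping a node to the list of its children
              (nodes without children may be omitted).
    """
    # Preorder traversal, so that reversing it visits children before parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    size = {}
    for v in reversed(order):
        size[v] = 1 + sum(size[w] for w in children.get(v, []))
    n = size[root]
    # Walk down while some child subtree holds more than n/2 nodes.
    path, v = [root], root
    while True:
        heavy = [w for w in children.get(v, []) if 2 * size[w] > n]
        if not heavy:
            break
        v = heavy[0]            # at most one child can exceed n/2
        path.append(v)
    path.reverse()              # the paper lists the path bottom-up
    return path
```

Every subtree hanging off the returned path has at most $n/2$ nodes, since any child with a larger subtree would itself be on the path.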
Step 1. For each $v_i$ on path $P$ and each list $\lambda \in U_B$, apply algorithm SolveLabeling($v_i$) to solve the following instance of the labeling problem on $E_{v_i}$: The input list associated to $v_i$'s $j$-th child $w_j$ is assumed to be $\lambda_j$, where $\lambda_j = \lambda$ if $w_j$ is $v_{i-1}$ and $i > 1$; and $\lambda_j = $ crit($w_j$) otherwise. For each pair $v_i$ and $\lambda$, let $L[v_i, \lambda]$ denote the labeling of $E_{v_i}$ and $B[v_i, \lambda]$ the critical list, computed by SolveLabeling($v_i$). This step can be implemented in $O(D^3 B \log(B + D))$ time using $t|U_B| = t2^B$ CREW PRAM processors by Corollary 6.4.

Step 2. Construct an auxiliary forest $F$ whose vertex set is

$$W = \{(v_k, \lambda) \mid 1 \le k \le t,\ \lambda \in U_B\},$$

where each forest-vertex $(v_k, \lambda) \in W$ has as forest-parent

$$\mathit{parent}_F(v_k, \lambda) = \begin{cases} (v_{k+1}, B[v_k, \lambda]) & \text{if } B[v_k, \lambda] \in U_B, \\ \text{nil} & \text{otherwise,} \end{cases}$$

where $B[v_k, \lambda]$ is as computed in step 1. Since the forest $F$ has $t|U_B| = t2^B$ vertices, this step can be performed in $O(1)$ time with $t2^B$ processors.

Step 3. Find the sequence of forest-vertices, $Q = (v_1, \lambda_1), \ldots, (v_t, \lambda_t)$, that connects forest-vertex $(v_1, \text{crit}(v_1))$ to the root of the forest-tree holding it. Then, for each $i$ between 1 and $t$, the desired labeling of the edges emanating from $v_i$ consists of $L[v_i, \lambda]$, where $\lambda$ is $\lambda_{i-1}$ if $i > 1$, and of $L[v_1, \emptyset]$ for $i = 1$. The desired supercritical list, crit($r$), of $T$'s root is given by $B[v_t, \lambda_{t-1}]$ if $t > 1$. In the case $t = 1$, $r = v_1$ and crit($r$) is given by $B[v_1, \emptyset]$. This step can be implemented in $O(\log t + B)$ time using $t2^B$ processors.

The processor complexity of the algorithm is $O(n2^B)$. Since the depth of the recursion is $O(\log n)$, the overall time complexity of the algorithm is $O(\sum_{k=1}^{\log n} [\log n + D^3 B \log(B + D)])$. The overall number of operations done by the algorithm is $O(n2^B(\log n + D^3 B \log(B + D)))$, so by Brent's scheduling principle [1] we can reduce the number of processors by a factor of $\log n$. The correctness of the algorithm follows by induction on the number of nodes in $T$, based on the correctness of the combine phase of the algorithm that can be proved by induction on the length $t$ of the splitting path by invoking Theorem 5.11.

We now show that the problem of edge ranking trees whose nodes have constant degree is in NC. This fact will follow from the theorem proved above and the upper bound on the edge rank of an $n$-node tree with node degrees of at most $D$ provided by the following lemma.
Lemma 7.2 Let $T$ be an $n$-node tree with maximum degree $D$. Let $k_{node}$ denote the optimal node rank of $T$. The edge rank of $T$, rank($T$), is at most $D\,k_{node}$.

Proof: Let $\nu(u)$ be an optimal node ranking for the nodes $u$ of $T$. The result is proved by exhibiting an edge ranking whose largest value is at most $D\,k_{node}$. For each non-leaf $u$, we define a labeling $\varphi(u \to w_i)$ on the edges $u \to w_1, \ldots, u \to w_c$ ($c \le D$) emanating from $u$ by

$$\varphi(u \to w_i) = D\,\nu(u) - i \quad (1 \le i \le c).$$

We prove that the overall mapping $\varphi$ is an edge ranking of $T$. Edges emanating from the same vertex get labeled differently. Suppose $\varphi$ maps two edges $u \to x_l$ and $v \to y_m$ to the same value. Without loss of generality, assume $\nu(u) \ge \nu(v)$. Then, $0 \le D[\nu(u) - \nu(v)] = l - m < D$, and hence $\nu(u) = \nu(v)$. Since $u$ and $v$ have equal node rank and $\nu$ is a legal node ranking, the undirected path $P$ joining $u \to x_l$ and $v \to y_m$ contains a vertex $s$ that is between $u$ and $v$, such that $\nu(s) > \nu(u)$. Then, if $s \to t$ is the edge emanating from $s$ that belongs to path $P$,

$$\varphi(s \to t) \ge D\,\nu(s) - D \ge D\,\nu(u) > \varphi(u \to x_l).$$

Therefore, $\varphi$ is a legal ranking.
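The construction in the proof of Lemma 7.2 is easy to try on small trees. The Python sketch below applies the formula literally (edge $i$ out of node $u$ gets $D$ times the node rank of $u$, minus $i$) and checks the edge-ranking property with a union-find pass per label value. The function names and the `children` / `node_rank` dictionaries are assumptions made for the example; the node ranking is supplied by the caller and is only assumed to be legal.

```python
def edge_labels_from_node_ranking(children, node_rank, D):
    """Label the i-th edge (i = 1, 2, ...) leaving u with D * node_rank[u] - i."""
    return {(u, w): D * node_rank[u] - i
            for u, ws in children.items()
            for i, w in enumerate(ws, start=1)}


def is_edge_ranking(labels):
    """Check that equal-labeled edges are always separated by a higher label.

    For each label value l, the edges with label <= l form a forest; the
    property holds iff no component of that forest contains two edges labeled l.
    """
    for l in set(labels.values()):
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        for (u, w), lab in labels.items():
            if lab <= l:
                parent[find(u)] = find(w)
        roots = [find(u) for (u, w), lab in labels.items() if lab == l]
        if len(roots) != len(set(roots)):
            return False
    return True
```

On a five-node tree with a rank-3 root, two rank-1 children and two rank-2 grandchildren, the two bottom edges both receive label 1 and are separated by the larger labels on the two edges at the root.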
Corollary 7.3 An $n$-node tree having maximum degree equal to $D$ can be supercritically ranked in $O(D^4 \log^2 n\,[\log\log n + \log D])$ time using $2^D n^{D+1} / \log n$ processors on a CREW PRAM.

Proof: $T$'s node rank is at most $1 + \lfloor \log_2 n \rfloor$ [14]. Using Theorem 7.1 and Lemma 7.2, the bounds stated in the corollary follow.

The results in this section imply that there exists a poly-log time algorithm that uses a subexponential number of processors to optimally edge rank a tree whose maximum degree is poly-log. By taking $D = O(1)$ in Corollary 7.3, the theorem stated below follows.
Theorem 7.4 The problem of ranking trees of constant node degree is in NC. It can be solved in $O(\log^2 n \log\log n)$ time using $n^{D+1} / \log n$ processors on a CREW PRAM, where $n$ is the number of nodes in the tree and $D$ is the maximum degree of any node.

Theorem 7.1 can be viewed as an NC reduction from the problem of edge ranking a tree to the following problem:
Definition 7.5 Path Labeling Problem
Input: A rooted tree $T = (V, E)$. A path $P = \langle v_1, \ldots, v_t \rangle$, where $v_t$ is the root of $T$. For each edge $v_i \to w$ the critical list crit($w$) of a ranking $\rho_w$ of $T_w$.
Output:
1. An assignment, $L$, of values to all edges emanating from any vertex on $P$ such that $\rho = L \cup \bigcup_w \rho_w$ is a ranking of $T$ that induces an optimal labeling on $E_v$ for every node $v \in P$,
2. and the critical list crit($v_t, L$).

Based on the algorithm given in the proof of Theorem 7.1, we have the following corollary.
Corollary 7.6 Optimal edge ranking is NC reducible to the path labeling problem. Further remarks concerning the parallel complexity of this problem are given in Section 9.
8 FillSlots is P-complete

In this section we show that a natural decision problem emerging from a routine in our sequential algorithm is P-complete. The result suggests that any parallel algorithm incorporating this procedure (FillSlots, defined below) is unlikely to exhibit an exponential speedup over the sequential algorithm. The reader is referred to [8] for background material regarding P-completeness theory.

The P-completeness result is based on a procedure that is performed by GreedyCover and NextEdge. The result is concerned only with solving an instance of the labeling problem. The idea is described informally first. Given a set of critical lists, our new procedure finds the highest slot that is open on all critical lists. It then assigns a value to this slot on the list where this value will cover lexicographically the most other values. More precisely, suppose we are given as input an instance of the labeling problem (see Definition 4.1) and a value maxrank. The following procedure FillSlots performs a similar function to GreedyCover and NextEdge in solving the labeling problem. Note that the procedure computes a partial labeling.

Procedure FillSlots(maxrank, $E$)
  SortCover($E$);
  While maxrank $\ge 1$ and $E \ne \emptyset$ do
    Try to compute nextrank as the largest positive integer less than maxrank that is not on any critical list for the edges $E$;
    Choose the edge $e$ := NextEdge(nextrank, $E$) to label next;
    Label $e$ with nextrank covering all smaller values on $e$'s critical list;
    $E := E \setminus \{e\}$;
    maxrank := nextrank;

In our application of this procedure maxrank will be initially set to a large integer. An inspection of GreedyCover and NextEdge reveals that FillSlots performs the same computation as these procedures when attempting to find new values. The difference is that duplicate values are not taken into account in FillSlots. We define a natural decision problem based on FillSlots below.
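The loop in FillSlots can be rendered in Python as a small sketch. Several points are assumptions rather than statements from the paper: edges are left-to-right indices with `crit[e]` the critical list of edge `e`; the loop stops when no admissible nextrank exists (one reading of "try to compute"); and NextEdge is evaluated directly by comparing, for each remaining edge, the decreasing list of values that the new label would cover, instead of consulting the CoverOrder table.

```python
def fill_slots(maxrank, crit):
    """Return the partial labeling as a list of (label, edge) pairs."""
    edges = list(range(len(crit)))           # remaining edges, left to right
    labeling = []
    while maxrank >= 1 and edges:
        on_lists = set().union(*(crit[e] for e in edges))
        # Largest positive integer below maxrank on no remaining critical list.
        nextrank = next((r for r in range(maxrank - 1, 0, -1)
                         if r not in on_lists), None)
        if nextrank is None:                 # assumed: stop when the search fails
            break
        # NextEdge: lexicographically largest cover; max() keeps the leftmost tie.
        e = max(edges, key=lambda x: sorted(
            (v for v in crit[x] if v < nextrank), reverse=True))
        labeling.append((nextrank, e))       # e's smaller values are now covered
        edges.remove(e)
        maxrank = nextrank
    return labeling
```

Started with maxrank 5 on the critical lists {1}, {2, 1}, {2}, the sketch labels the middle edge with 4, the right edge with 3 and the left edge with 2.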
Definition 8.1 FillSlots Problem
Input: Given a collection of lists comprised of 0's and 1's, a designated list $l$, and two integers $s$ and $t$.
Output: Does the FillSlots procedure use value $t$ to cover values on list $l$ when started with a maxrank value of $s$?
Note the correspondence between the FillSlots problem and the labeling problem. The lists correspond to critical lists, the designated list $l$ corresponds to a specific edge, and the value $t$ corresponds to a label used to cover values on list $l$. We prove that the FillSlots problem is P-complete below, thus providing evidence that our sequential algorithm is inherently sequential.
Theorem 8.2 The FillSlots problem is P-complete under log-space reducibility.¹

Proof: It is easy to see that the FillSlots problem is in P. We prove it is log-space complete for P by giving a reduction from a variant of the Circuit Value Problem (CVP). The variant of CVP we use is the monotone, fan-in 2, fan-out 2 version with the gates assumed to be numbered in topological order (MCVP). This version is known to be P-complete [10]. We also assume that no gate receives both its inputs from the same gate. Let the inputs to the circuit be $x_1, \ldots, x_p$ and let the gates in topological order be $g_1, \ldots, g_m$; the output gate is $g_m$.

Consider an instance of MCVP. The idea is to generate lists to simulate each gate of its circuit. The lists are represented by characteristic vectors that are columns of a 0-1 matrix $M$. Each column of the matrix represents a list of numbers as a Boolean vector, where 1 means the row number is on the list of that column. The top row of the matrix represents the value $s$, and the subsequent rows represent the values $s - 1, s - 2, \ldots, 1$. In symbols, $M_{ij} = 1$ if and only if $(s - i + 1)$ is on list $j$.

We specify some values in the matrix associated with gates; any unspecified values in the matrix are 0. Each gate is associated with a set of rows and columns disjoint from those of other gates. Each row is associated with some gate; each column is associated with a gate, except for the $p$ columns at the far left of the matrix that correspond to the circuit inputs. A TRUE circuit input is denoted by a column with a single 1 in it; the placement of the 1 depends on which gate the input goes to. A FALSE circuit input is denoted by a column of zeros. An AND gate gadget is associated with four consecutive rows and three consecutive columns. An OR gate gadget is associated with three consecutive rows and two consecutive columns. The topological ordering of the gates is translated into a left-to-right ordering of the gadget columns and a top-to-bottom ordering of the gadget rows. Specifically, if an output of gate $i$ is an input of gate $j$ then the rows of $i$'s gadget will be above the rows of $j$'s gadget and the columns of $i$'s gadget will be to the left of the columns of $j$'s gadget.
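The correspondence between lists and columns of $M$ can be written down directly. The following Python sketch converts a collection of lists into the 0-1 matrix using the rule that entry $(i, j)$ is 1 exactly when $s - i + 1$ is on list $j$; the function name `lists_to_matrix` is illustrative, and rows and columns are 1-based in the text but 0-based in the returned nested list.

```python
def lists_to_matrix(lists, s):
    """Return M as a list of rows; row i (1-based) represents the value s - i + 1."""
    return [[1 if (s - i + 1) in lst else 0 for lst in lists]
            for i in range(1, s + 1)]
```

With $s = 3$ and the two lists (3, 1) and (2), the first column reads 1, 0, 1 from top to bottom and the second column reads 0, 1, 0.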
The interesting part of the gadget for an AND is depicted below. These are the values that occupy the $4 \times 3$ submatrix of the rows and columns associated with that AND gate.

  0 0 0
  0 1 1
  0 0 0
  1 1 1
In the third column there are two additional 1's below the four rows shown: one in a row where the left output of the gate is input and the other in a row where the right output of the gate is input. Dangling outputs can be placed arbitrarily below the last row of $g_m$'s gadget. The left (right) input to an AND gate is "delivered" to the left of the gadget in a column to the first (third) row of the gadget. If the input comes from another gate, the value to the left on the first or third row will be a 1. If the input is a circuit input, its value will be the value of the input itself. The interesting part of the gadget for an OR gate is the following $3 \times 2$ submatrix:

  0 1
  0 0
  1 1

Both inputs to an OR gate are "delivered" in separate columns to the second row of the gadget. As for the AND gate, the matrix entries representing the inputs are 1 if they come from other gates, and have their real value if they are circuit inputs. The outputs of the OR gate are two additional 1's occurring in the second column of the gadget. These go to the appropriate rows depending on where they serve as inputs. The topological numbering of the circuit allows us to space the gadgets out appropriately so they do not interfere with one another.

We assume the output gate $g_m$ of the circuit is an OR gate. In our instance of FillSlots we choose the designated list $l$ as the second column of the gadget corresponding to $g_m$, the value of $s$ to be the first row of the matrix we construct, and $t$ as the value corresponding to the middle row of $g_m$. It is easy to check that the matrix can be described in log-space.

To complete the proof we establish a correspondence between the evaluation of the circuit and the assignment of labels in FillSlots. Consider the sequence of pairs of values (nextrank, NextEdge(nextrank, $E$)) that FillSlots chooses. In terms of the matrix these are just (row, column) pairs, where nextrank is the next label assigned or row selected and NextEdge determines the next edge (or alternatively critical list) or column that gets labeled. We say that a column is an output column if it is the third column of an AND gadget or the second column of an OR gadget. A rank assigned to an output column is a high rank if it is above the lowest row number of that gadget. The intuition is that an output column rank is high if it covers the 1 in the lower right-hand corner of the gate gadget. In the list of (row, column) pairs that FillSlots chooses, we restrict our attention to the subsequence of pairs where the column is an output column. Simultaneously imagine evaluating the gates of the circuit in topological order.

¹ The reduction is also $NC^1$.
The invariant that shows the correspondence is: the next gate output column in which FillSlots assigns a high rank corresponds exactly to the next gate whose output is FALSE. We prove the invariant inductively. It is certainly true when no gates have been evaluated and no output columns have been assigned a rank. Suppose the invariant holds for the evaluation of gates $g_1, \ldots, g_{k-1}$ and consider the evaluation of gate $g_k$. The placement of the gate gadgets in $M$ ensures that the 1's in the gadget for $g_k$ are strictly above those in the gadgets for the later gates. Therefore, FillSlots will always choose next a column for the lowest numbered gate to which it can give a high rank. The induction step has two similar cases depending on the gate type of $g_k$.

Suppose $g_k$ is an AND gate. In the first and third rows of the $g_k$ gadget output column there are 0's. If nextrank corresponds to either of those rows, then NextEdge will select that column (critical list) for nextrank because it will cover a value in the row below it. The output column of the $g_k$ gadget is preferred to the column just to its left because the output column has two 1's further down the column. Next observe that the first and third rows will have 1's to the left of the AND gadget corresponding to the two inputs of $g_k$ if the inputs come from earlier gates. Those are the only 1's in their rows. Suppose those 1's are in output columns $c_1$ and $c_2$. If neither column $c_1$ nor $c_2$ has been previously chosen, then (by inductive hypothesis) the outputs of those gates are both TRUE and we are justifiably blocked from giving a high rank to the output column of the AND gate under consideration. On the other hand, suppose without loss of generality that column $c_1$ has been previously chosen. This means that $c_1$ is no longer in the set of remaining critical lists, and we can assign the output column of the AND gate the rank of the first gadget row. Because of the lexicographic ordering of the columns, $c_1$ must have gotten a high rank. By inductive hypothesis this implies that the output of the gate corresponding to $c_1$ is known to be FALSE. This in turn implies that the output of the current gate is FALSE, which completes the induction step in this case. If instead $c_1$ corresponds to a circuit input rather than a gate output, then it will have a 1 in the top row of the AND gate gadget for $g_k$ if and only if the circuit input is 1, so the analysis is the same. The case for $c_2$ is symmetric.

Now suppose $g_k$ is an OR gate. In the second row of the $g_k$ gadget output column there is a 0. If nextrank corresponds to this row, then NextEdge will select the output column (critical list) for nextrank because it will cover a value in the row below it. The output column of the $g_k$ gadget is preferred to the column just to its left because the output column has two 1's further down the column. If the OR gate $g_k$ takes its inputs from other gates, then the second row will have two 1's to the left of the OR gadget corresponding to the two inputs of $g_k$. Those are the only 1's in that second gadget row. Suppose those 1's are in output columns $c_1$ and $c_2$. If either column $c_1$ or $c_2$ has not been previously chosen by FillSlots, then by inductive hypothesis the output of that gate is TRUE. The output of $g_k$ should be 1 and indeed, the 1 in the second row of $g_k$'s gadget blocks us from giving a high rank to the output column of $g_k$. On the other hand, suppose that both columns $c_1$ and $c_2$ have been previously chosen. This means $c_1$ and $c_2$ are no longer in the set of remaining critical lists, and we can assign the output column of $g_k$ the rank of the second gadget row. Because of the lexicographic ordering of the columns, $c_1$ and $c_2$ must have gotten a high rank. By inductive hypothesis this implies that the outputs of the corresponding gates are both FALSE. This in turn implies that the output of $g_k$ is FALSE, which completes the induction step in this case.
If $c_1$ or $c_2$ (or both) instead corresponds to a circuit input rather than a gate output, then it will have a 1 in the second row of the OR gate gadget for $g_k$ if and only if the circuit input is 1, so the analysis is similar.

The FillSlots problem seems to be related to several other P-complete problems, for example, first fit decreasing bin packing, Gaussian elimination with partial pivoting, and iterated mod (see [8] for definitions of these problems). FillSlots may prove useful in showing that other problems are P-complete.
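The invariant in the proof above can be checked on a small example. The following Python sketch is only an abstraction of the argument: it does not build the matrix M or run FillSlots, and the names `evaluate`, `chosen_columns`, `blocks`, and the tuple encoding of gates are hypothetical choices made for illustration. It models a 1 in a gadget row as "blocking" until the column containing it has been chosen, and checks that the chosen output columns are exactly the gates whose output is FALSE.

```python
from itertools import product
from typing import List, Sequence, Set, Tuple

# Hypothetical encoding (not from the paper): a source is ("in", i) for
# circuit input i, or ("gate", j) for the output column of gate j.
Source = Tuple[str, int]
# A gate is (kind, left, right) with kind "AND" or "OR". Gates are listed in
# topological order, so each gate reads only inputs and earlier gates.
Gate = Tuple[str, Source, Source]


def evaluate(gates: Sequence[Gate], inputs: Sequence[bool]) -> List[bool]:
    """Evaluate the monotone circuit directly, gate by gate."""
    values: List[bool] = []

    def value(src: Source) -> bool:
        kind, idx = src
        return inputs[idx] if kind == "in" else values[idx]

    for kind, a, b in gates:
        if kind == "AND":
            values.append(value(a) and value(b))
        else:
            values.append(value(a) or value(b))
    return values


def chosen_columns(gates: Sequence[Gate], inputs: Sequence[bool]) -> Set[int]:
    """Abstract model of which gate output columns receive a high rank."""
    chosen: Set[int] = set()

    def blocks(src: Source) -> bool:
        kind, idx = src
        if kind == "in":
            # A circuit input contributes a 1 iff the input itself is 1.
            return bool(inputs[idx])
        # A gate output column keeps its 1 in play until it is chosen.
        return idx not in chosen

    for k, (kind, a, b) in enumerate(gates):
        if kind == "AND":
            # Two gadget rows, one per input: one unblocked row suffices.
            free = (not blocks(a)) or (not blocks(b))
        else:
            # One gadget row holds both 1's: both must be gone.
            free = (not blocks(a)) and (not blocks(b))
        if free:
            chosen.add(k)
    return chosen


# A small hypothetical circuit on three inputs.
EXAMPLE: List[Gate] = [
    ("AND", ("in", 0), ("in", 1)),
    ("OR", ("in", 0), ("gate", 0)),
    ("AND", ("gate", 1), ("in", 2)),
    ("OR", ("gate", 0), ("in", 1)),
]

# Check the invariant on every input assignment.
for bits in product([False, True], repeat=3):
    false_gates = {k for k, v in enumerate(evaluate(EXAMPLE, bits)) if not v}
    assert chosen_columns(EXAMPLE, bits) == false_gates
```

The sequential loop over gates mirrors the induction on $g_1, \ldots, g_k$: whether a column is chosen depends on the choices made for all earlier gates.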
9 Summary and Open Problems

We presented the first polynomial time algorithm for optimally edge ranking a tree, an NC algorithm to optimally rank constant degree trees, and a P-completeness result suggesting that our sequential algorithm probably does not parallelize well. Our work raises several interesting open problems:

1. Is there a faster sequential algorithm?

2. Are the labeling and edge-ranking problems in NC or are they P-complete? Our sequential ranking algorithm can be viewed as a polynomial time reduction from the path labeling problem (defined in Definition 7.5) to the labeling problem (defined in Definition 4.1). If the path labeling problem is in NC, then optimal edge ranking of trees is in NC (Corollary 7.6). Whether the path labeling problem is in NC is an open question, even under the assumption that the labeling problem is in NC. In particular, we make the following observations.
   (a) For n-node trees with node degrees $O(1)$, the labeling problem is trivially in NC. Our current NC algorithm for the path labeling problem is not trivial and makes effective use of the $O(1)$ degree condition.
   (b) For n-node trees with node degrees $O(\log^k n)$ for a constant k, the labeling problem is again trivially in NC. It is not clear whether this fact can be used to place the path labeling problem in NC.

3. What is the complexity of edge ranking for general graphs?
Addendum: After submitting our journal version and sending in the final proceedings version for the ACM-SIAM Symposium on Discrete Algorithms 1993, we were made aware of an earlier paper by Deogun and Peng. Their paper appeared in [7] and claimed to place the edge-ranking problem in P. In [6] several flaws in their algorithm are pointed out.
Acknowledgement: We thank the anonymous referees for their helpful suggestions on the exposition.
References

[1] R. P. Brent, The parallel evaluation of general arithmetic expressions. Journal of the ACM, 21(1974), 201-206.
[2] R. Cole, Parallel merge sort. SIAM Journal on Computing, 17(1988), 770-785.
[3] P. de la Torre and R. Greenlaw, Super critical tree numbering and optimal tree ranking are in NC. In Third IEEE Symposium on Parallel and Distributed Processing, pages 767-773, Dallas, Texas, 1991. IEEE Computer Society.
[4] P. de la Torre, R. Greenlaw, and T. M. Przytycka, The optimal tree ranking problem is in NC. Parallel Processing Letters, 2(1992), 31-41.
[5] P. de la Torre, R. Greenlaw, and A. A. Schäffer, Optimal edge ranking of trees in polynomial time. University of New Hampshire technical report 92-10, 1993.
[6] P. de la Torre, R. Greenlaw, and A. A. Schäffer, A note on Deogun and Peng's edge ranking algorithm. University of New Hampshire technical report 93-13, 1993.
[7] J. S. Deogun and Y. Peng, Edge ranking of trees. Congressus Numerantium, 79(1990), 19-28.
[8] R. Greenlaw, H. J. Hoover, and W. L. Ruzzo, Topics in Parallel Computation: A Guide to P-completeness Theory. Oxford University Press, Computing Science Series, editor Z. Galil, to appear.
[9] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, San Francisco, 1979.
[10] L. M. Goldschlager, R. A. Shaw, and J. Staples, The maximum flow problem is log space complete for P. Theoretical Computer Science, 21(1982), 105-111.
[11] A. V. Iyer, H. D. Ratliff, and G. Vijayan, Optimal node ranking of trees. Information Processing Letters, 28(1988), 225-229.
[12] A. V. Iyer, H. D. Ratliff, and G. Vijayan, On an edge ranking problem of trees and graphs. Discrete Applied Mathematics, 30(1991), 43-52.
[13] J. M. Lewis and M. Yannakakis, The node-deletion problem for hereditary properties is NP-complete. Journal of Computer and System Sciences, 20(1980), 219-230.
[14] Y. Liang, S. K. Dhall, and S. Lakshmivarahan, Parallel algorithms for ranking of trees. In Second Annual Symposium on Parallel and Distributed Computing, pages 26-31, Dallas, Texas, 1990. IEEE Computer Society.
[15] N. Megiddo, Applying parallel computation algorithms in the design of serial algorithms. Journal of the ACM, 30(1983), 852-865.
[16] J. Nevins and D. Whitney, editors, Concurrent Design of Products and Processes. McGraw-Hill, 1989.
[17] Y. Perl and S. Zaks, On the complexity of edge labelings for trees. Theoretical Computer Science, 19(1982), 1-16.
[18] T. M. Przytycka, The optimal tree ranking problem is in NC. Manuscript, 1991.
[19] A. A. Schäffer, Optimal node ranking of trees in linear time. Information Processing Letters, 33(1989/90), 91-96.
[20] M. Yannakakis, Node-deletion problems on bipartite graphs. SIAM Journal on Computing, 10(1981), 310-327.
[21] M. Yannakakis, Edge-deletion problems. SIAM Journal on Computing, 10(1981), 297-309.