
An Efficient Low-Degree RMST Algorithm for VLSI/ULSI Physical Design*

Yin Wang¹, Xianlong Hong¹, Tong Jing¹, Yang Yang¹, Xiaodong Hu², and Guiying Yan²

¹ Dept. of Computer Science and Technology, Tsinghua Univ., Beijing 100084, P. R. China
{wang-y01,yycs99}@mails.tsinghua.edu.cn, {hxl-dcs,jingtong}@tsinghua.edu.cn
² Institution of Applied Mathematics, Chinese Academy of Sciences, Beijing 100080, P. R. China
[email protected], [email protected]

Abstract. Motivated by very/ultra large scale integrated circuit (VLSI/ULSI) physical design applications, we study the construction of a rectilinear minimum spanning tree (RMST) subject to a constraint on its maximum vertex degree. Given a collection of n points in the plane, we first construct a graph named the bounded-degree neighborhood graph (BNG). Based on this framework, we propose an O(n log n) algorithm to construct a 4-BDRMST (an RMST with maximum vertex degree ≤ 4). This is the first 4-BDRMST algorithm with such a complexity, and experimental results show that the algorithm is significantly faster than the existing 4-BDRMST algorithms.

1 Introduction

In recent years, very/ultra large scale integrated circuit (VLSI/ULSI) technology has advanced profoundly, which enables us to design a single chip with more and more functions and many more transistors. Meanwhile, the need for small chip size propels fabrication technology into the nanometer era. The shrinking of geometries raises great concerns about interconnect effects. For high performance VLSI/ULSI physical design, interconnect optimization research should play an active role.

Rectilinear minimum spanning tree (RMST) construction is fundamental to interconnect optimization in physical design. The RMST is frequently used as a wire length and approximate delay estimate in the whole-chip floorplanning and placement phase. Since Hwang [10] proved that the RMST is a 3/2 approximation of the rectilinear Steiner minimal tree (RSMT), many Steiner tree heuristics use the RMST as a backbone for the RSMT in the process of routing. Thus, RMST construction needs highly efficient solutions.

Many physical design algorithms construct an RMST as a framework for later processing. They require an RMST with a low maximum vertex degree because their time complexity often grows exponentially with respect to the maximum vertex degree. These include the algorithm of Georgakopoulos and Papadimitriou [6] and some Steiner tree approximations [12], as well as VLSI global routing algorithms [1,17]. Thus, low-degree RMST construction is important to such applications.

This paper mainly focuses on the construction of a low-degree RMST. The remainder of this paper is organized as follows. In Section 2, we define the bounded degree rectilinear minimum spanning tree (BDRMST) and the bounded-degree neighborhood graph (BNG), and discuss their properties. In Section 3, the algorithm for constructing the BNG is described. In Section 4, a proof of the algorithm's correctness is given. Then, Section 5 shows experimental results. Finally, Section 6 concludes the paper.

* This work was supported in part by the NSFC under Grant No.60373012 and No. 60121120706, the SRFDP of China under Grant No.20020003008, and the Hi-Tech Research and Development (863) Program of China under Grant No.2002AA1Z1460.

E. Macii et al. (Eds.): PATMOS 2004, LNCS 3254, pp. 442–452, 2004. © Springer-Verlag Berlin Heidelberg 2004

2 Preliminaries

2.1 The BDRMST

Given a set of points in the plane and an integer d ≥ 2, find a minimum-cost rectilinear spanning tree with maximum vertex degree ≤ d. We call this rectilinear spanning tree a bounded degree rectilinear minimum spanning tree (d-BDRMST).

Existing spanning tree algorithms can be classified into two categories. Some of them efficiently find a minimum-cost spanning tree [2,9,11,14,18,23], but they do not guarantee a bound on the maximum vertex degree. Others do construct bounded-degree spanning trees [3,13,19], but they do not guarantee a minimum-cost connection.

Finding an RMST with bounded vertex degrees is usually hard. In fact, finding a 2-BDRMST is equivalent to solving the traveling salesman problem (TSP), which is known to be NP-hard [5]. Papadimitriou and Vazirani [16] showed that finding a Euclidean 3-BDRMST is also NP-hard. However, they noticed that a Euclidean 5-BDRMST could be found in polynomial time. Later, Robins and Salowe [20] showed that each point set in the rectilinear plane has a 4-BDRMST. They pointed out that the 4-BDRMST could be computed in polynomial time.

Shortly after, Griffith et al. [8] proposed a polynomial time algorithm to compute a 4-BDRMST as a step in their Batched 1-Steiner heuristic (B1S). The running time of their algorithm is O(n²). There are two limitations to their algorithm. First, they use a linear-time neighbor searching method, so searching the neighbors of n points takes O(n²) time. Second, in the point-adding phase, they use a linear-time dynamic minimum spanning tree (MST) maintenance algorithm, so adding n points takes O(n²) time. Even though Fredrickson's data structure [4] can reduce the time complexity of the point-adding stage to O(n log n), the neighbor searching stage will still cost O(n²) time. That data structure is also impractical due to its complicated description and large hidden constants. This scheme therefore cannot improve the running time further.

The quadratic complexity of the 4-BDRMST construction does not slow down B1S because B1S itself is an O(n³) algorithm. But physical design applications need a sub-quadratic Steiner heuristic based on a low-degree RMST, and there an O(n²) BDRMST construction becomes the bottleneck. So, we try to solve the 4-BDRMST problem in an efficient way.


In order to provide a framework for BDRMST construction, we construct a sub-graph of the Delaunay triangulation whose maximum vertex degree is ≤ 4. We call this graph the bounded-degree neighborhood graph (BNG). Based on this framework, we propose an O(n log n) algorithm to construct a 4-BDRMST. To our knowledge, this is the first 4-BDRMST algorithm proved to have such a time complexity. We have implemented the algorithm and compared it with existing algorithms. Experimental results show that our algorithm is much faster than the 4-BDRMST algorithm of Griffith et al. [8]. It is also faster than many typical algorithms that compute ordinary RMSTs.

2.2 The BNG

Since the MST problem on a weighted graph is well studied [2,7,14,18], many metric MST algorithms [9,22,23] construct a metric MST by first constructing a graph on the point set. But they cannot output MSTs with bounded degree. Griffith et al.'s 4-BDRMST algorithm does not construct a graph explicitly, but it actually utilizes an implicit graph in the neighborhood relations. Similarly, our 4-BDRMST algorithm is based on a graph named the BNG. Before we define the BNG, we need to mention the uniqueness property observed by Robins and Salowe [20].

Definition 1. Given a point p, a region R has the uniqueness property with respect to p if for every pair of points u, w ∈ R, we have ||wu|| < max (||wp||, ||up||). A partition of space into a finite set of disjoint regions is said to have the uniqueness property if each of its regions has the uniqueness property.

We call a partition with the uniqueness property a unique partition, and a region of a unique partition a unique region. Fig.1 illustrates one kind of unique partition in the rectilinear plane. This is called a diagonal partition. Note that it has 8 unique regions (4 two-dimensional “wedges” and 4 one-dimensional “half-lines”).

Fig. 1. Partition that has the uniqueness property (the diagonal partition)

Consider the MST algorithms on graphs. The cycle property of the MST states that an edge with the largest weight in any cycle can be safely deleted. So the uniqueness property implies that, if we are going to construct an MST on the point set, we need to connect each point only to its nearest neighbor in each unique region. The diagonal partition is rather strong in the sense that it ensures ||wu|| < max (||wp||, ||up||). If we use this partition, the maximum degree of the MST would be 8. In fact, ||wu|| ≤ max (||wp||, ||up||) is sufficient for the MST construction. So we define the weak uniqueness property as follows.


Definition 2. Given a point p, a region R has the weak uniqueness property with respect to p if for every pair of points u, w ∈ R, we have ||wu|| ≤ max (||wp||, ||up||). A partition of space into a finite set of disjoint regions is said to have the weak uniqueness property if each of its regions has the weak uniqueness property.

Similar to the diagonal partition, we present an exploded view of a weak diagonal partition in Fig.2(a). It can be thought of as “perturbing” each half-line of the diagonal partition clockwise into a wedge, which leaves 4 regions. We prove the following theorem, similar to Robins and Salowe's Lemma 6 [20]. The only difference is that equality can occur in our partition.
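The weak diagonal partition can be made concrete with a small sketch. The function below assigns a point q to one of four regions around a center p and then checks the weak uniqueness inequality exhaustively on a small integer grid. The region numbering, the upward y axis, and the choice that each diagonal half-line joins the wedge on its clockwise side are assumptions made for illustration; the names `weak_region` and `has_weak_uniqueness` are hypothetical and do not come from the paper.

```python
from itertools import combinations

def l1(a, b):
    """Rectilinear (L1) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def weak_region(p, q):
    """Region index (0=E, 1=S, 2=W, 3=N) of q in a weak diagonal partition at p.

    Assumption: y grows upward and each diagonal half-line is absorbed by the
    wedge on its clockwise side; the paper's figure may label regions differently.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    if dx == 0 and dy == 0:
        raise ValueError("q coincides with the center p")
    if dx > 0 and -dx < dy <= dx:
        return 0  # east wedge, including the north-east diagonal ray
    if dy < 0 and dy < dx <= -dy:
        return 1  # south wedge, including the south-east diagonal ray
    if dx < 0 and dx <= dy < -dx:
        return 2  # west wedge, including the south-west diagonal ray
    return 3      # north wedge, including the north-west diagonal ray

def has_weak_uniqueness(p, points):
    """Check ||wu|| <= max(||wp||, ||up||) for every pair sharing a region at p."""
    by_region = {}
    for q in points:
        by_region.setdefault(weak_region(p, q), []).append(q)
    return all(
        l1(u, w) <= max(l1(w, p), l1(u, p))
        for group in by_region.values()
        for u, w in combinations(group, 2)
    )

p = (0, 0)
grid = [(x, y) for x in range(-4, 5) for y in range(-4, 5) if (x, y) != p]
print(has_weak_uniqueness(p, grid))  # exhaustive check on a small integer grid
```

The pair u = (1, 1), w = (2, -1) lies in the east region and gives ||wu|| = 3 = max(3, 2), which shows why only the weak (non-strict) inequality can be expected here.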

Fig. 2. The weak diagonal partition: (a) the weak diagonal partition; (b) the weak uniqueness in a region; (c) the equality case

Theorem 1. Given a point p in the rectilinear plane, each region of the weak diagonal partition has the weak uniqueness property. (Due to the paper length, we do not give the proof here.)

Note that if two points lie in the same region of the weak diagonal partition of another point (which we call the center), the two points cannot form an angle ≥ 90 degrees with the center. This property will be used to prove several other properties later.

Now we consider a graph on a point set in the rectilinear plane. We do a weak diagonal partition at each of its vertices. If the center is connected to only the nearest neighbor in each region, we call this graph a weak unique graph. It is obvious that the maximum vertex degree in a weak unique graph is 4. Then we can define the BNG.

Definition 3. The BNG is a connected sub-graph of the Delaunay triangulation that is a weak unique graph.

Now we discuss some properties of the BNG.

Property 1. The vertex degree of the BNG is bounded by 4.

This is obvious. So we are guaranteed to find a 4-BDRMST on the BNG if it contains an RMST. We will prove in Section 4 that the BNG actually contains at least one MST. We have the following property.

Property 2. The BNG has at most 2n edges, where n is the number of vertices in the BNG.

On uniformly distributed random point sets, the number of edges is normally below 1.5n. On a point set extracted from a real circuit design, the number of edges is usually smaller. Because MST algorithms on graphs usually run in O(m log n) time, where m is the number of edges, this property leads to a significant speed-up of the MST construction. Fig.3 compares a BNG and a Delaunay triangulation on the same point set. The Delaunay triangulation usually has about 3n edges.

Fig. 3. A bounded-degree neighborhood graph (a) and the corresponding Delaunay triangulation (b) on a point set of size 830 extracted from a real circuit

Since the BNG is a sub-graph of the Delaunay triangulation, we have the following property.

Property 3. The BNG is planar. Thus an O(n) MST algorithm [2] can be applied to it to achieve great efficiency.
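The edge bound of Property 2 is stated without a derivation. One way to see it, assuming only Property 1 and the standard degree-sum identity, is the following count, where V and E denote the vertex and edge sets of the BNG and m = |E| (notation introduced here for illustration):

```latex
m \;=\; |E| \;=\; \frac{1}{2}\sum_{v \in V} \deg(v)
  \;\le\; \frac{1}{2}\cdot 4n \;=\; 2n .
```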

3 The Algorithm and Its Complexity

We construct the BNG by pruning a Delaunay triangulation. At each point p, we do a weak diagonal partition. If more than one edge adjacent to p lies in the same region, we say that there is a conflict. We then delete the longer edges, leaving the shortest one. When two edges have equal rectilinear length, we delete the edge with the larger Euclidean length. At last, some Delaunay edges remain. The resulting sub-graph is the BNG. We call this process the uniqueness pruning process.

The BNG can be constructed by putting a fully constructed Delaunay triangulation through the uniqueness pruning process, but we need not construct the Delaunay triangulation in full before we start the BNG construction. We can prune the Delaunay edges on-the-fly, as described in detail below.

We keep a two-dimensional pointer array PART of size 4n. We record the shortest edge in region r of the point p in PART[p][r]. At the beginning of the Delaunay triangulation, each element of the array is a null pointer. As soon as the Delaunay triangulation process reports an edge, we examine it with a procedure DIAGONAL_TEST. The procedure tests both endpoints of the edge. Centered at an endpoint p, we do a weak diagonal partition. If there is already an edge recorded in PART[p][r], a conflict occurs. We then delete the longer of the two edges, that is, the edge recorded in PART[p][r] and the edge we are testing, and we record the shorter one in PART[p][r]. In case of ties we delete the edge with the larger Euclidean length. Only the edges not deleted at the end are included in the BNG. Our pruning procedure DIAGONAL_TEST can be specified formally as follows.


Algorithm 1 DIAGONAL_TEST
Require: A two-dimensional array PART of size 4n. The input is a Delaunay edge e.
  for both of the endpoints s of e do
    if e is already deleted then
      return;
    end if
    r ← the region number of s in which e lies;
    if PART[s][r] = NULL then
      PART[s][r] ← e;
    else if WEIGHT(e) > WEIGHT(PART[s][r]) then
      Delete(e);
    else if WEIGHT(e) < WEIGHT(PART[s][r]) then
      Delete(PART[s][r]);
      PART[s][r] ← e;
    else
      Between e and PART[s][r], delete the edge with the longer Euclidean length and store the other in PART[s][r];
    end if
  end for

We can integrate this procedure into any L1 Delaunay triangulation algorithm [15, 21]. Due to the advantages of the plane-sweep algorithm, we choose the algorithm of Shute et al. [21]. The DIAGONAL_TEST procedure runs in constant time for each Delaunay edge and takes O(m) time as a whole, where m is the number of Delaunay edges. Since the Delaunay triangulation algorithm takes O(n log n) time, the overall time complexity is O(n log n).

We need O(n) space to record the shortest edges in the unique regions, and the Delaunay triangulation algorithm also takes O(n) space. So the overall space requirement is O(n). A careful study reveals that since our algorithm only needs an array of pointers of size 4n, and the Delaunay edges are tested before they are stored, many Delaunay edges are never stored. Since a Delaunay edge takes much more space than a pointer, we actually need less space than the Delaunay triangulation algorithm.

We can apply any RMST algorithm to the resulting BNG to get a 4-BDRMST. Since the BNG is planar, we can apply an efficient O(n) algorithm as described in [2]. Thus we get an O(n log n) time algorithm to compute the 4-BDRMST.
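As an illustration of the pruning step, here is a minimal Python sketch of DIAGONAL_TEST. It is not the authors' C implementation: the dictionary `part` stands in for the PART array, the set `deleted` stands in for edge deletion, the region numbering and clockwise tie-in of the diagonal rays are assumptions, and a tie in both rectilinear and Euclidean length is resolved arbitrarily in favor of the newer edge. The edges are supplied by hand rather than by an L1 Delaunay triangulation, which is outside the scope of this sketch.

```python
def region(s, t):
    """Region (0=E, 1=S, 2=W, 3=N) of t in an assumed weak diagonal partition at s."""
    dx, dy = t[0] - s[0], t[1] - s[1]
    if dx > 0 and -dx < dy <= dx:
        return 0
    if dy < 0 and dy < dx <= -dy:
        return 1
    if dx < 0 and dx <= dy < -dx:
        return 2
    return 3

def edge_key(e):
    """Sort key: rectilinear length first, squared Euclidean length as tie-break."""
    (x1, y1), (x2, y2) = e
    dx, dy = abs(x1 - x2), abs(y1 - y2)
    return (dx + dy, dx * dx + dy * dy)

def diagonal_test(e, part, deleted):
    """Process one edge e = (a, b), mirroring the structure of Algorithm 1."""
    a, b = e
    for s, t in ((a, b), (b, a)):
        if e in deleted:
            return
        slots = part.setdefault(s, [None] * 4)  # the four PART[s][r] entries
        r = region(s, t)
        old = slots[r]
        if old is None:
            slots[r] = e
        elif edge_key(e) > edge_key(old):
            deleted.add(e)
        else:
            # Shorter edge wins; a full tie is broken in favor of e (assumption).
            deleted.add(old)
            slots[r] = e

def prune(edges):
    """Run the uniqueness pruning over an edge stream and return the survivors."""
    part, deleted = {}, set()
    for e in edges:
        diagonal_test(e, part, deleted)
    return [e for e in edges if e not in deleted]

p, u, w = (0, 0), (2, 1), (3, -1)
print(prune([(p, u), (p, w), (u, w)]))
```

In this hand-made triangle both pu and pw fall in the east region of p, so the longer edge pw (rectilinear length 4 against 3) is pruned while pu and uw survive, and the three points stay connected.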

4 Proof of Algorithm Correctness

The main result of this section is Theorem 2, showing that the BNG contains at least one 4-BDRMST. We will derive it from several smaller facts. It is obvious that the uniqueness pruning process will not delete a potential RMST edge in a triangular cycle, but we have not yet shown that the pruning process does not accidentally delete a cut edge from the Delaunay graph.

The uniqueness pruning process can process other graphs as well, but not every graph can be pruned in this manner to be used for RMST construction. In the weak diagonal partition of Fig.2(b), without loss of generality, we assume that ||up|| ≤ ||wp||. By the weak uniqueness property we have ||wu|| ≤ ||wp||. Then the pruning process might delete the edge wp. This causes no harm if we are constructing an RMST on a complete graph, since there is always an edge between u and w. But this is not always true for other graphs. What if wp is the only edge connecting w to the other two points? If this is the case and we simply delete wp from the graph, then w might be isolated and there can be no spanning tree in the graph. We call such a configuration, in which there are only two edges between three points, a broken triangle. If there is a broken triangle in a unique region, the uniqueness pruning process may cut the graph into components. We will show that this cannot happen if we use the Delaunay graph.

Lemma 1. The uniqueness pruning process cannot isolate a point from a broken triangle of the L1 Delaunay triangulation.

Proof. In the Delaunay triangulation, broken triangles appear only when the bisectors of up and wp do not intersect at a Voronoi point (see Fig.4, in which we also depict the weak diagonal partition at p). This situation only happens on the periphery of the Delaunay triangulation. To form parallel bisectors, pu and pw must form an angle ≥ 90 degrees; this is a fundamental characteristic of bisectors in the L1 metric. Thus w and u cannot both lie in the same region of a weak diagonal partition centered at p. Hence in a Delaunay triangulation there can be no conflicts between the edges of a broken triangle, and no such edges will be deleted. So the pruning process cannot isolate a point from a broken triangle.

Fig. 4. Parallel bisectors in L1 Delaunay triangulation

Fig. 5. Two conflicts in one triangle: (a) three points in general position; (b) three points with ||ba|| = ||bc||

Next we show that the pruning process cannot isolate a point from a full triangle.

Lemma 2. The uniqueness pruning process cannot isolate a point from a triangle in any connected graph.

Proof. To isolate a point from a triangle, we must delete two edges. Now we will show that the pruning process can delete only one edge from a triangle. Consider the situation depicted in Fig.5(a). First we assume that ac and bc are both in general position, that is, they are not on a half-line. So they cannot be equal in length to ab. Without loss of generality, we assume that ac and ab lie in some weak diagonal region with respect to a, and bc and ba lie in some weak diagonal region with respect to b. There are then two conflicts in the triangle ∆abc. Note that the angle ∠acb is larger than 90 degrees and the edges ca and cb cannot both be inside any weak diagonal region. So there can be at most two conflicts, and ab must be the longest edge in the triangle ∆abc. In both conflicts, only ab will be considered for deletion by the pruning process and no points will be isolated.

Now we consider the case of ties, as in Fig.5(b). We have ||ba|| = ||bc||. If we chose to delete bc when pruning at b, the process would later delete ab when pruning at a, and b would be isolated. So we must choose to delete ba when pruning at b. Remember that in the pruning process, we delete the edge opposite the obtuse angle in case of ties (this is the edge with the larger Euclidean length). So we will choose to delete ab, and no points can be isolated.

Theorem 2. The bounded-degree neighborhood graph contains a 4-BDRMST.

Proof. Since an L1 Delaunay triangulation contains only broken triangles and full triangles, it follows from Lemmas 1–2 that an L1 Delaunay triangulation is still connected after the uniqueness pruning process. And by the cycle property of the RMST, the uniqueness pruning process preserves an RMST contained in the L1 Delaunay triangulation. Since the BNG is constructed by pruning the rectilinear Delaunay triangulation, it contains at least one RMST. Given our weak diagonal partition scheme, this RMST must be a 4-BDRMST.
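Theorem 2 means that an ordinary graph MST algorithm, run on the BNG edges with rectilinear weights, returns a 4-BDRMST. The sketch below shows that last step with Kruskal's algorithm and a small union-find. The point set and edge list are a hand-made stand-in for a BNG, not output of the authors' program, and the function names are hypothetical.

```python
def l1(a, b):
    """Rectilinear (L1) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def kruskal_rmst(points, edges):
    """Kruskal's algorithm on an explicit edge list with L1 edge weights."""
    parent = {p: p for p in points}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for a, b in sorted(edges, key=lambda e: l1(*e)):
        ra, rb = find(a), find(b)
        if ra != rb:          # the edge joins two components, so keep it
            parent[ra] = rb
            tree.append((a, b))
    return tree

def max_degree(tree):
    """Largest number of tree edges meeting at a single vertex."""
    deg = {}
    for a, b in tree:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return max(deg.values(), default=0)

# Hypothetical sparse graph standing in for a BNG on four points.
p, u, w, q = (0, 0), (2, 1), (3, -1), (0, 3)
edges = [(p, u), (u, w), (p, q), (u, q)]
tree = kruskal_rmst([p, u, w, q], edges)
print(tree, sum(l1(a, b) for a, b in tree), max_degree(tree))
```

Because the edge list has O(n) entries, the sort dominates and this step costs O(n log n); the linear-time planar MST algorithm cited as [2] would replace Kruskal's algorithm where that matters.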

5 Experimental Results and Discussions

We have implemented the algorithm in the C programming language. It uses Shute et al.'s plane-sweep Delaunay triangulation algorithm to feed the Delaunay edges to the pruning process, resulting in a BNG. We then use Kruskal's algorithm [14] to find an RMST on it. All generated RMSTs are correct and have a maximum degree bounded by 4.

5.1 Typical Existing Algorithms

We compare our algorithm to several other RMST algorithms. Among the programs, only Robins's can generate 4-BDRMSTs; the others can only output ordinary RMSTs. All programs are implemented in the C programming language, compiled with the same GCC compiler, and run on the same machine: a Sun Fire 880 workstation with 8GB RAM.

• Prim's algorithm [18]. We use the implicit complete graph on the point set as the input to Prim's algorithm. We did not use Kruskal's algorithm because it needs to sort the O(n²) edges at the beginning and is impractical for large inputs.
• An O(n log n) time algorithm that first computes octant nearest neighbors for each point using the divide-and-conquer algorithm of Guibas and Stolfi [9], then finds an RMST on this graph using Prim's algorithm.


• An RMST code by Dr. L. Scheffer, combining Prim's algorithm with on-the-fly computation of octant nearest neighbors via quad-tree based rectangular range searches.
• A 4-BDRMST algorithm implemented by Gabriel Robins for the Batched 1-Steiner Heuristic as described in [8].

We also include a version of our program that constructs an ordinary RMST directly on the Delaunay triangulation.

Table 1. Running time on random point sets

Input    Prim      Guibas  Scheffer  Robins    Delaunay  BNG
2000     0.16s     0.02s   0.05s     0.06s     0.03s     0.03s
4000     0.63s     0.05s   0.12s     0.29s     0.06s     0.06s
8000     2.73s     0.14s   0.35s     1.45s     0.13s     0.11s
10000    5.55s     0.19s   0.50s     2.81s     0.16s     0.14s
20000    33.63s    0.47s   1.46s     18.61s    0.36s     0.29s
40000    154.16s   1.16s   4.05s     80.97s    0.78s     0.64s
80000    -         2.74s   9.24s     329.43s   1.75s     1.46s
100000   -         3.59s   19.23s    518.10s   2.30s     1.91s

Table 2. Running time on point sets extracted from real circuit designs Input 337 830 1944 2437 2676 12052 22373 34728

Prim
