IEEE COMMUNICATIONS LETTERS, VOL. 14, NO. 3, MARCH 2010
Frame-Based Multicast Switching

Michael Tartre and Bill Lin

Abstract: This letter considers the multicast scheduling problem for the case where the average rates of flows are known and a fixed frame size is applicable. We present a frame-based decomposition method for computing offline a recurring schedule that can guarantee the specified flow rates with minimum internal speedup, if any. We consider both the no-splitting case, where a multicast cell must be transferred to all its destinations in a single time slot, and the fanout-splitting case, where a multicast cell may take multiple time slots to transfer to all its destinations, transferring only to a subset of its destinations each time.

Index Terms: Multicast switching, high-performance switches, rate guarantees.
I. INTRODUCTION

Switch scheduling for crossbar switches is considered a difficult problem when the traffic demands are unknown. Although the online scheduling problem has been extensively studied, existing algorithms are either too complex to implement at high speeds, not capable of providing delay guarantees, or both. On the other hand, offline scheduling is an attractive option when the traffic profile is known a priori and remains relatively static, since the complexity of online scheduling is eliminated and delay guarantees can be provided.

The Birkhoff-von Neumann (BvN) switching strategy proposed by Chang et al. [1] provides one such offline scheduling approach that can guarantee 100% throughput and deterministic delays for any known admissible traffic. The BvN approach can be used to decompose any admissible traffic matrix into a convex combination of permutation matrices that correspond to switch configurations. Although BvN switching can support any flow rates specified as non-negative real numbers, the approach has several drawbacks. First, the number of permutation matrices that a BvN decomposition can generate is $O(N^2)$ for an $N \times N$ switch, which can lead to a substantial online memory requirement when $N$ is large. Second, the BvN switching approach is not completely offline. It uses the Packetized Generalized Processor Sharing (PGPS) algorithm in [2] to schedule online the generated permutation matrices, which is non-trivial to implement.

For applications where the flow rates can be accurately specified as integers or rounded off into integers with acceptable round-off errors, frame-based offline scheduling methods have been proposed [3] that can decompose any integer rate matrix into $T$ permutation matrices, where $T$ is an integer frame size corresponding to the largest row or column sum of integer flow rates. When applicable, frame-based decomposition approaches offer several advantages over the BvN approach. First, the worst-case online memory requirement is much less when $T$ is much less than $N^2$. Second, there is no need for PGPS scheduling since the $T$ permutations can simply rotate. In this letter, we propose a new frame-based offline scheduling method that provides support for multicast switching.
Manuscript received February 5, 2009. The associate editor coordinating the review of this letter and approving it for publication was A. Sprintson. The authors are with the University of California, San Diego, La Jolla, CA 92093-0407 (e-mail: [email protected]). Digital Object Identifier 10.1109/LCOMM.2010.03.090239
Recently, Sundararajan et al. [4] extended the BvN decomposition approach to support multicast switching. This method retains the generality of the BvN approach in that it can support any flow rates specified as non-negative real numbers, but it also inherits the worst-case online memory requirement of the BvN approach and the need for online scheduling using PGPS. Our work is complementary in that it provides a simpler solution in cases where integer flow rates and a fixed frame size apply. To our knowledge, previous frame-based offline scheduling methods did not consider multicast switching.

II. FRAME-BASED OFFLINE MULTICAST SCHEDULING

In this section, we describe our frame-based offline multicast scheduling, which is formulated as a graph coloring problem. We will consider two modes of multicast operations:
1) No-splitting: A multicast cell must be transferred to all its destinations in a single time slot.
2) Fanout-splitting: A multicast cell may take multiple time slots to transfer to all its destinations, transferring only to a subset of its destinations each time.

In the fanout-splitting case, a multicast cell remains queued until it has been transferred to all its destinations. The fanout-splitting case provides the offline scheduler with more flexibility in resolving conflicts. To support partial service, we assume the multicast virtual output queueing model with requeueing described in [6], which is also assumed in [4], [5]. In this queueing model, when a multicast cell receives partial service, the cell is dequeued from its current queue and requeued in the virtual queue corresponding to the residue (the unserviced part of the multicast).

A. The No-Splitting Case

Consider the following example with both unicast and multicast flows for a $3 \times 3$ switch. Suppose the unicast flows are specified with the following rate matrix $R = (r_{ij})$:

$$R = \begin{pmatrix} 0 & 0 & 0 \\ 0.6\bar{6} & 0.3\bar{3} & 0 \\ 0 & 0 & 0.3\bar{3} \end{pmatrix}$$

Suppose there is one multicast flow as follows:

$$r_{1 \to \{1,2\}} = 0.3\bar{3}$$

More formally, a flow is defined as follows:

Definition 1 (Flow): Let $P$ be a set of switch ports. A flow $f$ is specified with a source $s(f) \in P$, a subset of destinations $D(f) \subseteq P$, and a flow rate $r(f)$. In the general case, flow rates are specified as non-negative real numbers in the range $r(f) \in [0, 1]$.

To capture the scheduling conflicts between the different flows, we use the conflict graph model proposed in [4].

Definition 2 (Conflict graph): The conflict graph for a given traffic profile is defined as a graph $G = (F, C)$, where $F$ represents the set of all flows to be served (unicast and multicast), and $C$ represents the set of conflicts between the
1089-7798/10$25.00 © 2010 IEEE
Fig. 1. (a) Conflict graph; (b) Integer conflict graph; (c) Unit conflict graph; (d) Fanout-split integer conflict graph; (e) Fanout-split unit conflict graph.
flows in $F$. $(g, h) \in C$ means flows $g$ and $h$ cannot co-exist in a valid switch configuration. For a conflict graph, two flows $g$ and $h$ are in conflict if they have the same source, $s(g) = s(h)$, or if their sets of destinations overlap, $D(g) \cap D(h) \neq \emptyset$. The conflict graph for the above example is depicted in Figure 1(a), which has 4 flows that correspond to the 4 vertices in the figure. A unicast flow is shown as a pair $(i, j)$, a multicast flow is shown in the form $i$:[$j, k, \ldots$], and the rates are shown in bold.

In our problem definition, we assume a fixed integer frame size $T$ is given and the flow rates are specified as integers. Using an integer frame size of $T = 3$, the above example can be equivalently specified as follows:

$$\hat{R} = \begin{pmatrix} 0 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \hat{r}_{1 \to \{1,2\}} = 1$$
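As a minimal sketch of Definitions 1 and 2, the conflict relation and the scaling to integer rates can be written in a few lines of Python. The names `Flow`, `in_conflict`, `conflict_edges`, and `integer_rates` are illustrative assumptions, not part of the original letter, and the recurring decimals of the example are represented as the exact fractions 2/3 and 1/3.

```python
from dataclasses import dataclass
from fractions import Fraction
from itertools import combinations


@dataclass(frozen=True)
class Flow:
    src: int          # source port s(f)
    dsts: frozenset   # destination set D(f)
    rate: Fraction    # flow rate r(f) in [0, 1]


def in_conflict(g, h):
    """Two flows conflict if they share a source or a destination."""
    return g.src == h.src or bool(g.dsts & h.dsts)


def conflict_edges(flows):
    """Edge set C of the conflict graph, as pairs of flow indices."""
    return {(i, j) for i, j in combinations(range(len(flows)), 2)
            if in_conflict(flows[i], flows[j])}


def integer_rates(flows, T):
    """Scale each rate by the frame size T; reject non-integer results."""
    scaled = [f.rate * T for f in flows]
    if any(r.denominator != 1 for r in scaled):
        raise ValueError("rates are not integer multiples of 1/T")
    return [int(r) for r in scaled]


# The 3 x 3 example: three unicast flows and one multicast flow.
flows = [
    Flow(2, frozenset({1}), Fraction(2, 3)),     # unicast (2,1)
    Flow(2, frozenset({2}), Fraction(1, 3)),     # unicast (2,2)
    Flow(3, frozenset({3}), Fraction(1, 3)),     # unicast (3,3)
    Flow(1, frozenset({1, 2}), Fraction(1, 3)),  # multicast 1:[1,2]
]
edges = conflict_edges(flows)
rates = integer_rates(flows, T=3)
print(sorted(edges), rates)
```

Under these assumptions the flows (2,1), (2,2), and 1:[1,2] are pairwise in conflict, (3,3) conflicts with nothing, and the integer rates for $T = 3$ are 2, 1, 1, and 1.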
Analogously, we define an integer conflict graph as follows:

Definition 3 (Integer conflict graph): An integer conflict graph is a conflict graph in which the flow rates are specified as non-negative integers.

The integer conflict graph for this example is shown in Figure 1(b). This integer conflict graph can be derived from the conflict graph shown in Figure 1(a) by multiplying the real number rates by $T$, i.e., $\hat{r}(f) = r(f) \times T$. Since in general flow rates can be any non-negative real number in the range $[0, 1]$, either it may not be possible to "discretize" the specification into integer flow rates, or a very large frame size would be required. In these cases, the extended BvN switching strategy proposed in [4] is more suitable.

In our approach, we further refine the input specification into a unit conflict graph under the no-splitting case.

Definition 4 (Unit conflict graph): A unit conflict graph is an integer conflict graph with all flow rates equal to 1.

Starting from an integer conflict graph, we generate a unit conflict graph by replicating each node $f$ with a rate $\hat{r}(f) > 1$ into $\hat{r}(f)$ nodes, $\mathcal{S}(f) = \{f_1, f_2, \ldots, f_{\hat{r}(f)}\}$. This is shown in Figure 1(c), with replicated nodes shown in dashed circles. Since all replicated nodes have the same unit rate, there is no need to show their rates. In the unit conflict graph generated, we say two nodes $g$ and $h$ are in conflict if they have either the same source or overlapping destinations. If two nodes $g$ and $h$ are in conflict by this definition, then an edge is added between them if there isn't one already.

Once the unit conflict graph is derived, the offline multicast scheduling problem is reduced to the classical graph coloring problem [7]. Continuing with the example shown in Figure 1(c), the unit conflict graph shown can be colored using $K = 4$ colors, for example as follows:
Color    Flows
$c_1$    (2, 1) and (3, 3)
$c_2$    1:[1, 2]
$c_3$    (2, 1)
$c_4$    (2, 2)
The corresponding switch configurations are as follows:
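$$\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} + \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$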
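The no-splitting construction for this example can be sketched end to end as follows: replicate each flow into unit-rate nodes, check that the four-color assignment listed above is a proper coloring, and expand each color into a switch configuration. The helper names (`unit_nodes`, `unit_edges`, `configurations`) and the tuple encoding of flows are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def unit_nodes(int_flows):
    """Replicate each flow (src, dsts, rate) into `rate` unit-rate nodes."""
    return [(src, frozenset(dsts))
            for src, dsts, rate in int_flows for _ in range(rate)]


def unit_edges(nodes):
    """Edge between nodes with the same source or overlapping destinations."""
    return {(i, j) for i, j in combinations(range(len(nodes)), 2)
            if nodes[i][0] == nodes[j][0] or nodes[i][1] & nodes[j][1]}


def configurations(nodes, colors, n_ports):
    """One 0/1 matrix per color; row = input port, column = output port."""
    mats = [[[0] * n_ports for _ in range(n_ports)]
            for _ in range(max(colors))]
    for (src, dsts), c in zip(nodes, colors):
        for d in dsts:
            mats[c - 1][src - 1][d - 1] = 1
    return mats


T = 3
int_flows = [(2, {1}, 2), (2, {2}, 1), (3, {3}, 1), (1, {1, 2}, 1)]
nodes = unit_nodes(int_flows)   # (2,1), (2,1), (2,2), (3,3), 1:[1,2]
edges = unit_edges(nodes)

# Color assignment from the table: c1 = (2,1),(3,3); c2 = 1:[1,2];
# c3 = (2,1); c4 = (2,2).
colors = [1, 3, 4, 1, 2]
assert all(colors[i] != colors[j] for i, j in edges)

mats = configurations(nodes, colors, n_ports=3)
speedup = Fraction(max(colors), T)
for m in mats:
    print(m)
print("speedup:", speedup)
```

In this sketch the two copies of (2,1), the node (2,2), and the node 1:[1,2] are mutually adjacent, which is why fewer than four colors cannot suffice here.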
Note that some of the switch configurations above have inputs that connect to multiple outputs, corresponding to a multicast. The graph coloring provides us with a minimum number of recurring switch configurations necessary for transferring a frame of cells with the specified flow rates with no fanout splitting. If the number of switch configurations (colors) equals the frame size, $K = T$, then no internal speedup is required. Otherwise, an internal speedup of $K/T$ is required to schedule the cell arrivals in $T$ time slots. In this example, the minimum solution requires $K = 4$ colors and an internal speedup of $K/T = 4/3$ for $T = 3$. As we shall see next, with fanout-splitting, no internal speedup is required for this example.

B. The Fanout-Splitting Case

We now consider the operation mode that permits fanout splitting, where only a partial subset of destinations for a given multicast flow may be serviced in a time slot. This is achieved by performing fanout-splitting of each multicast flow, starting from the initial integer conflict graph constructed. Consider the integer conflict graph shown in Figure 1(b). Starting from this integer conflict graph, a multicast node $f$ is split into a set of $|D(f)|$ nodes, $\mathcal{Q}(f) = \{f_1, f_2, \ldots, f_{|D(f)|}\}$. For each $d_i \in D(f)$, a separate node $f_i$ is created that only has $d_i$ as its destination. This is depicted in Figure 1(d). The multicast-split nodes are shown in double circles. Each new node $f_i$ created retains the same rate as $f$. In the fanout-split integer conflict graph generated, we say two nodes $g$ and $h$ are in conflict (a) if they do not belong to the same multicast-split set $\mathcal{Q}(f)$, and (b) if they have either the same source or overlapping destinations. If two nodes are in conflict by this definition, then an edge is added between them if there isn't one already.

Since all split-nodes in $\mathcal{Q}(f)$ correspond to the same multicast flow $f$, there is no conflict between them, which means they can be scheduled (colored) in the same switch configuration. Depending on how these nodes are colored, the solutions would correspond to different combinations of fanout-splitting.

One final step before graph coloring is to further split the fanout-split integer conflict graph into a fanout-split unit conflict graph. The unit conflict graph construction procedure described in the previous section needs to be modified to consider fanout-splitting. The procedure works as follows. After we fanout-split each multicast node $f$ into $|D(f)|$
Fig. 2. (a) Integer conflict graph; (b) Fanout-split integer conflict graph; (c) Fanout-split unit conflict graph.
nodes, $\mathcal{Q}(f) = \{f_1, f_2, \ldots, f_{|D(f)|}\}$, we further replicate $\mathcal{Q}(f)$ into $\hat{r}(f)$ sets, $\mathcal{Q}_1(f), \mathcal{Q}_2(f), \ldots, \mathcal{Q}_{\hat{r}(f)}(f)$, with each $\mathcal{Q}_k(f) = \{f_{k,1}, f_{k,2}, \ldots, f_{k,|D(f)|}\}$ containing its own instances of $|D(f)|$ nodes. In the fanout-split unit conflict graph generated, we say two nodes $g$ and $h$ are in conflict (a) if they do not belong to the same multicast-split set $\mathcal{Q}_k(f)$, and (b) if they have either the same source or overlapping destinations. If two nodes are in conflict by this definition, then an edge is added between them if there isn't one already.

Observe that in this construction, if two nodes $g$ and $h$ are a part of the same $\mathcal{Q}_k(f)$, then there is no conflict, which means they can be scheduled (colored) in the same time slot. However, if $g$ and $h$ are in two different multicast-split sets, say $\mathcal{Q}_k(f)$ and $\mathcal{Q}_l(f)$, even if they have been generated from the same original flow $f$, then there is a conflict if they have either the same source or overlapping destinations.

For the example shown in Figure 1(d), the resulting fanout-split unit conflict graph is shown in Figure 1(e). In this example, the original one multicast flow $f$ = 1:[1,2] only has a rate of 1, which generates just one multicast-split set $\mathcal{Q}(f)$ = (1:[1], 1:[2]). Given the fanout-split unit conflict graph shown in Figure 1(e), we can derive an offline multicast schedule by performing a graph coloring on it. However, unlike the no-splitting case, this fanout-split unit conflict graph can be colored using just $K = 3$ colors as follows:

Color    Flows
$c_1$    (2, 1) and (3, 3)
$c_2$    1:[1] and (2, 2)
$c_3$    1:[2] and (2, 1)
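The fanout-split unit conflict graph construction can be sketched as below. Each node is encoded as a tuple (flow index, copy index, source, destination), so that two nodes belong to the same multicast-split set exactly when their first two fields agree. This encoding and the function names are assumptions for illustration, not the authors' implementation.

```python
from itertools import combinations


def fanout_split_unit_nodes(int_flows):
    """Split each flow into single-destination nodes, one set per unit of rate.

    Each node is (flow index, copy index k, source, destination).
    """
    return [(f, k, src, d)
            for f, (src, dsts, rate) in enumerate(int_flows)
            for k in range(rate)
            for d in sorted(dsts)]


def fanout_split_edges(nodes):
    """Conflict: different multicast-split set AND (same source or same destination)."""
    edges = set()
    for i, j in combinations(range(len(nodes)), 2):
        (f, k, s, d), (g, m, t, e) = nodes[i], nodes[j]
        same_set = (f, k) == (g, m)
        if not same_set and (s == t or d == e):
            edges.add((i, j))
    return edges


# Integer flows of Figure 1(b): (2,1) rate 2, (2,2), (3,3), and 1:[1,2].
int_flows = [(2, {1}, 2), (2, {2}, 1), (3, {3}, 1), (1, {1, 2}, 1)]
nodes = fanout_split_unit_nodes(int_flows)
edges = fanout_split_edges(nodes)

# Node order: (2,1), (2,1), (2,2), (3,3), 1:[1], 1:[2].
# Colors from the table: c1 = (2,1),(3,3); c2 = 1:[1],(2,2); c3 = 1:[2],(2,1).
colors = [1, 3, 2, 1, 2, 3]
is_proper = all(colors[i] != colors[j] for i, j in edges)
print(len(nodes), len(edges), is_proper)
```

The split nodes 1:[1] and 1:[2] share a source but belong to the same multicast-split set, so no edge joins them; this is what allows them to be placed independently alongside the unicast flows.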
Since $K = T$, no internal speedup is required. The fanout-splitting method searches a larger solution space, including no-splitting solutions.

C. A Fanout-Splitting Case Requiring Internal Speedup

Although no internal speedup is required in the above example, this is not always the case. Consider the following example¹ for a $2 \times 3$ switch with 3 unicast flows, 1 multicast flow, and a frame size of $T = 3$:

$$\hat{R} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix} \quad \text{and} \quad \hat{r}_{1 \to \{1,2,3\}} = 2$$

¹ This example was given in [5] as a motivating case for network coding.
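As a quick arithmetic check on this example, the number of cells each port handles per frame can be tallied directly from the integer rates. The sketch below assumes that a multicast flow contributes its rate once at its source and once at every destination; the function name `port_loads` is hypothetical.

```python
from collections import Counter


def port_loads(int_flows):
    """Cells per frame sent by each input and received by each output.

    Each flow is (source, destinations, integer rate).
    """
    inputs, outputs = Counter(), Counter()
    for src, dsts, rate in int_flows:
        inputs[src] += rate
        for d in dsts:
            outputs[d] += rate   # every destination receives a copy
    return inputs, outputs


# Three unicast flows from input 2 and the multicast flow 1:[1,2,3] of rate 2.
flows = [(2, {1}, 1), (2, {2}, 1), (2, {3}, 1), (1, {1, 2, 3}, 2)]
inputs, outputs = port_loads(flows)
max_load = max(max(inputs.values()), max(outputs.values()))
print(dict(inputs), dict(outputs), max_load)
```

Under this assumption, input 2 and each of the three outputs carry 3 cells per frame and input 1 carries 2, so no individual port load exceeds the frame size $T = 3$.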
The integer conflict graph for this example is shown in Figure 2(a). Using the fanout-splitting rules described above in Section II-B, a fanout-split integer conflict graph and a fanout-split unit conflict graph can be derived, as shown in Figures 2(b) and 2(c), respectively. In the fanout-split unit conflict graph shown in Figure 2(c), the original one multicast flow $f$ = 1:[1,2,3] has a rate of 2, which generates two multicast-split sets, $\mathcal{Q}_1(f)$ and $\mathcal{Q}_2(f)$, each with its own instances of (1:[1], 1:[2], 1:[3]). These nodes are shown in bold dashed circles. As shown in Figure 2(c), all nodes in $\mathcal{Q}_1(f)$ conflict with nodes in $\mathcal{Q}_2(f)$, but nodes within either $\mathcal{Q}_1(f)$ or $\mathcal{Q}_2(f)$ do not conflict with each other. This fanout-split unit conflict graph can be colored using $K = 4$ colors as follows:

Color    Flows
$c_1$    (2, 1), 1:[2] and 1:[3]
$c_2$    (2, 2), 1:[1] and 1:[3]
$c_3$    (2, 3) and 1:[1]
$c_4$    1:[2]
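For a graph this small, the minimum number of colors can be confirmed by exhaustive search. The sketch below rebuilds the fanout-split unit conflict graph of Figure 2(c) and tries $k = 1, 2, \ldots$ colors with plain backtracking. This brute-force search is an illustrative substitute chosen for brevity, practical only for small instances, and is not one of the published coloring algorithms; all function names are assumptions.

```python
from itertools import combinations


def split_nodes(int_flows):
    """Nodes (flow index, copy index, source, destination) after fanout-splitting."""
    return [(f, k, src, d)
            for f, (src, dsts, rate) in enumerate(int_flows)
            for k in range(rate) for d in sorted(dsts)]


def split_edges(nodes):
    """Conflict: different split set and (same source or same destination)."""
    return {(i, j) for i, j in combinations(range(len(nodes)), 2)
            if nodes[i][:2] != nodes[j][:2]
            and (nodes[i][2] == nodes[j][2] or nodes[i][3] == nodes[j][3])}


def min_coloring(n, edges):
    """Smallest k admitting a proper coloring, found by backtracking."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    def extend(colors, v, k):
        if v == n:
            return True
        for c in range(1, k + 1):
            # Uncolored neighbours hold 0 and never block a color.
            if all(colors[u] != c for u in adj[v]):
                colors[v] = c
                if extend(colors, v + 1, k):
                    return True
        colors[v] = 0
        return False

    for k in range(1, n + 1):
        colors = [0] * n
        if extend(colors, 0, k):
            return k, colors
    return 0, []   # only reached for an empty graph


flows = [(2, {1}, 1), (2, {2}, 1), (2, {3}, 1), (1, {1, 2, 3}, 2)]
nodes = split_nodes(flows)
edges = split_edges(nodes)
K, colors = min_coloring(len(nodes), edges)
print(K, colors)
```

The search reports $K = 4$. Intuitively, the three unicast nodes need three distinct colors, the two multicast-split sets cannot share a color, and with only three colors one of the sets would be confined to a single color that some unicast node to one of its destinations already uses.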
In this example, given $T = 3$, an internal speedup of $K/T = 4/3$ is required even when fanout-splitting is considered. To distinguish the nodes in $\mathcal{Q}_1(f)$ from the nodes in $\mathcal{Q}_2(f)$ in the above coloring, note that multicast nodes sharing a color must come from the same multicast-split set: the multicast nodes colored $c_1$ and $c_3$ belong to $\mathcal{Q}_1(f)$, and those colored $c_2$ and $c_4$ belong to $\mathcal{Q}_2(f)$.

D. Offline Scheduling as Graph Coloring

As described above, the problem of offline scheduling is reduced to the graph coloring of a (fanout-split) unit conflict graph. In general, graph coloring is known to be NP-complete [7]. However, in practice, the problem can be solved efficiently using well-developed exact and heuristic algorithms, for example the efficient graph coloring algorithms proposed by Brélaz [8].

III. CONCLUSIONS

This letter described a frame-based offline scheduling method for multicast switching that provides both rate and throughput guarantees with minimum internal speedup. We considered both no-splitting and fanout-splitting operation modes. To our knowledge, previous frame-based offline scheduling methods did not consider multicast switching.

REFERENCES

[1] C. S. Chang, W. J. Chen, and H. Y. Huang, "On service guarantees for input buffered crossbar switches: a capacity decomposition approach by Birkhoff and von Neumann," IEEE IWQoS 1999.
[2] A. K. Parekh and R. G. Gallager, "A generalized processor sharing approach to flow control in integrated service networks: the single-node case," IEEE/ACM Trans. Networking, 1993.
[3] J. Hui, Switching and Traffic Theory for Integrated Broadband Networks. Boston, MA: Kluwer Academic Publishers, 1990.
[4] J. K. Sundararajan, S. Deb, and M. Medard, "Extending the Birkhoff-von Neumann switching strategy for multicast - on the use of optical splitting in switches," IEEE J. Sel. Areas Commun., 2007.
[5] J. K. Sundararajan, M. Medard, M. Kim, A. Eryilmaz, D. Shah, and R. Koetter, "Network coding in a multicast switch," IEEE INFOCOM, 2007.
[6] M. A. Marsan et al., "Multicast traffic in input-queued switches: optimal scheduling and maximum throughput," IEEE/ACM Trans. Networking, vol. 11, no. 3, June 2003.
[7] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
[8] D. Brélaz, "New methods to color the vertices of a graph," Commun. ACM, vol. 22, no. 4, pp. 251-256, Apr. 1979.