Distributed Selfish Load Balancing on Networks

Petra Berenbrink∗   Martin Hoefer†   Thomas Sauerwald‡

∗ Supported by an NSERC Discovery grant. School of Computing Science, Simon Fraser University, Burnaby, B.C., V5A 1S6, Canada, [email protected]
† Supported by DFG through UMIC Research Center at RWTH Aachen University and by grant Ho 3831/3-1. Lehrstuhl Informatik I, RWTH Aachen University, Germany, [email protected]
‡ Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany, [email protected]. Supported by a PIMS fellowship; part of this work was done while the author was at SFU, Canada.

Abstract  We study distributed load balancing in networks with selfish agents. In the simplest model considered here, there are n identical machines represented by vertices in a network and m ≥ n selfish agents that unilaterally decide to move from one vertex to another if this improves their experienced load. We present several protocols for concurrent migration that satisfy desirable properties such as being based only on local information and computation, and the absence of global coordination or cooperation of agents. Our main contribution is to show rapid convergence of the resulting migration process to states that satisfy different stability or balance criteria. In particular, the convergence time to a Nash equilibrium is only logarithmic in m and polynomial in n, where the polynomial depends on the graph structure. Using a slight modification with neutral moves, a perfectly balanced state can be reached after additional time polynomial in n. In addition, we show reduced convergence times to approximate Nash equilibria. Finally, we extend our results to networks of machines with different speeds or to agents that have different weights, and show similar results for convergence to approximate and exact Nash equilibria.

1 Introduction  Load balancing is an essential requirement in large networks to ensure efficient utilization of resources and satisfactory performance of the system. In many large computer networks load balancing becomes a challenge because of the absence of global information and coordination. When there is only local information available about the load situation and even the existence of machines, a centralized approach to load balancing is inappropriate or even impossible. Instead, one then needs to develop protocols that respect the informational and
computational restrictions of the scenario. In addition, the protocols should guarantee rapid convergence to balanced states. Some distributed algorithmic approaches for load balancing have been proposed in algorithmic game theory, see, e.g., [5, 14, 2, 17]. In this context tasks are considered as selfish agents that act unilaterally and migrate concurrently between machines without global coordination. Such an approach has two main advantages over protocols that use more centralized optimization. Firstly, concurrent migration and unilateral decision making of multiple agents controlling the tasks reduce the coordination overhead and may still allow for rapid convergence (i.e., sublinear in the number of agents). Secondly, such “task agents” have an incentive to follow the protocol. This is an advantage in modern computer networks that are influenced by a variety of economic incentives and developments. In these networks centralized coordination is often absent and user actions are made in a selfish manner. While these properties make protocols for concurrent selfish load balancing desirable, their existence and convergence properties are not well understood in many load balancing contexts. In this paper, we present protocols for selfish load balancing in a discrete network balancing model. There are n identical machines which represent vertices in an arbitrary, undirected graph G = (V, E), and m tasks that are initially assigned arbitrarily to the machines. Our protocols proceed in a round-based fashion. In each round, every task picks a neighboring machine at random and decides probabilistically whether or not to migrate to that machine. Hence, the tasks need only local information: the load of the machine they are currently assigned to and the loads of neighboring machines in the graph.
The main challenge in the design of concurrent protocols such as ours is to carefully choose appropriate migration probabilities in order to guarantee rapid convergence while avoiding oscillation effects. Our scenario represents a significant extension of the existing literature on selfish load balancing, as concurrent protocols have been considered essentially only for complete graphs [4, 16, 2, 14, 5, 6]. In our model there are several concepts of a state in which the assignment of tasks is “stable” or “balanced”.
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
The standard notion of stability is the Nash equilibrium (NE), in which no player can unilaterally improve by moving to a neighboring machine. Note that throughout this paper we restrict ourselves to pure NE and states that do not involve randomization. A weaker and more rapidly achievable condition is that of an ε-approximate Nash equilibrium (ε-apx. NE), in which no player can decrease his personal load to less than a (1 − ε)-fraction of its current value. However, in a Nash equilibrium the load difference between the least loaded and most loaded machines can still be in the order of the network diameter in our model. A natural and desirable balancing condition is an optimum state, in which each machine has ⌊m/n⌋ or ⌈m/n⌉ tasks; naturally, such a state is trickier to obtain. In this paper, we consider protocols and study the convergence times to all three concepts of balance and stability. We extend our results to networks of machines with different speeds or to agents that have different weights and show similar results for convergence to approximate and exact Nash equilibria. 1.1 Contribution and Techniques  We propose and analyze concurrent probabilistic protocols to obtain approximate and exact NE and optimum states for identical machines or machines with speeds, and for weighted tasks on identical machines. Identical machines. We present a protocol that reaches an exact NE after a number of rounds that depends only logarithmically on m. Let Δ be the maximum degree of the graph and μ2 be the second smallest eigenvalue of the Laplace matrix of G. The dynamics reach a NE in expected time O(Δ/μ2 · (ln m + ln n) + |E| · Δ/μ2). For m ≥ δn⁴ and δ > 1, the first part of this convergence time is the time needed to reach a 2/(1 + δ)-apx. NE. The second part is the time needed so that no agent wants to unilaterally deviate to a neighboring machine. The convergence time is only logarithmic in the number of agents m, but the dependency on n is polynomial and connected to the structure of the graph.
Obviously, in general a polynomial dependence on n cannot be avoided (e.g., for paths). To reach an optimum state, we present a protocol that includes neutral moves, i.e., moves between neighboring machines of the same load. While such a move is not executed by a strictly myopic rational agent, it might still turn out to be profitable, as it allows tasks to reach other, less loaded parts of the graph. In this sense, agents have an incentive to “experiment”, and reaching a completely balanced state is clearly desirable from a system point of view. Our protocol adjustment results in an additional time that is only polynomial in n to obtain an optimum state. The convergence time is O((1 + ln m) · poly(n)). Machines with speeds. In this case, each machine i ∈ V has a speed s_i ∈ N, s_i ≥ 1, and thus processes tasks at a different rate. Let S := Σ_{i∈V} s_i and δ > 1. Then, using our protocol, a set of m ≥ δ · 8 · n³ · S agents converges to a 2/(1 + δ)-apx. NE in expected time O((1 + ln m) · poly(n, s_max)) on any graph, where s_max is the maximum speed. An exact NE is reached after additional time O(poly(n, s_max)), so in total also in time O((1 + ln m) · poly(n, s_max)). Weighted tasks. Finally, we also consider the case that each task has a weight w ∈ N, w ≥ 1. In this case, the expected convergence time to a NE is O(Δ · W³ · w_max), where W is the sum of all weights and w_max is the maximum weight. Outline of the paper. For technical reasons we present our results in a different order than the one stated above. After presenting the necessary definitions and preliminaries in Section 2, we first derive in Section 3 the results on load balancing with speeds (Theorems 3.1 and 3.2), as this introduces the main technical framework. The results on networks with identical machines are discussed in Section 4. There we enhance the general approach for machines with speeds to show convergence to apx. NE (Theorems 4.1 and 4.2). In particular, we can establish a relation to μ2, the second smallest eigenvalue of the Laplace matrix (Lemma 4.2). To show convergence to an optimum load distribution (Theorem 4.3) requires slightly different arguments. We essentially show that our protocol with neutral moves quickly reaches an optimum state and remains within the set of optima for a sufficiently long time, with high probability. Finally, Section 5 contains the extension to weighted tasks (Theorem 5.1) and mostly applies our previous approach. Due to spatial constraints most proofs are missing from this extended abstract and will appear in the full version of this paper.
1.2 Related Work  Most closely related to our paper is [5], where the case of identical machines in a complete graph is studied. There, a protocol that is equivalent to ours is shown to arrive at a NE in time O(log log m + poly(n)). Note that for complete graphs the NE and optimal allocations are identical. An extension of this model to weighted tasks is studied in [6]. The authors present a protocol that converges to a NE in time polynomial in n, m, and the largest task weight. Here we extend the results of both papers significantly by studying dynamics on general graphs and machines that have speeds. This also requires different techniques that capture the connections between convergence time and graph structure. Our paper relates to a general stream of works for
selfish load balancing on complete graphs. There is a variety of issues that have been considered, starting with seminal papers on algorithms and dynamics to reach NE [13, 15]. More directly related are concurrent protocols for selfish load balancing in different contexts that allow convergence results similar to ours. Whereas some papers consider protocols that use some form of global information [14] or coordinated migration [19], others consider infinitesimal or splittable tasks [18, 4] or work without rationality assumptions [16, 2]. The machine models in these cases range from identical and uniformly related (linear with speeds) to unrelated machines. The latter also contains the case when there are access restrictions of certain agents to certain machines. In contrast, in our model players migrate over a network and can access machines only depending on their current location. This is a fundamental difference to all previous related work in this area. For an overview of work on selfish load balancing see, e.g., [30]. A slightly different approach to discrete selfish load balancing on networks is given by finite congestion games [28], for which the convergence times of sequential best-response dynamics to exact and approximate NE have been extensively studied [9, 3, 29]. A general approach for a concurrent better-response protocol is [1], which is inspired by similar results for non-atomic congestion games [17]. In this protocol, agents pick strategies only by imitation of other agents, and there is rapid convergence even for general delay functions, but only to an approximate equilibrium concept in which all agents experience a similar cost. While this represents a generally applicable approach, the obtained approximate equilibrium might be far from any (apx.) NE. A different line of research is no-regret and similar payoff-based learning dynamics [23, 22, 24, 25], but they usually converge (quickly) only in the history of play and/or to classes of mixed NE.
In contrast, we present protocols that reach pure exact and approximate NE rapidly. Our protocol is also related to a vast amount of literature on (non-selfish) load balancing over networks, where results usually concern the case of identical machines and unweighted tasks. Often there are additional restrictions on the graph structure such as regular graphs, expander graphs, tori, etc. A central measure of balance is the discrepancy, i.e., the difference between the most and least loaded machine in the network. In expectation, our protocols mimic continuous diffusion, which has been studied initially in [11, 8] and later, e.g., in [26]. This work established the connection between convergence, discrepancy, and eigenvalues of graph matrices. Closer to our paper are discrete diffusion processes – prominently studied in [27], where the authors introduce a general technique to bound the load deviations between an idealized and the actual processes. Recently, randomized extensions of the algorithm in [27] have been considered, e.g., [12, 20]. However, either machines have to communicate with their neighbors to determine the number of tasks that should move [27, 20], or the tasks perform independent random walks [12]. In the first case, machines have a strong control over their tasks, while in the second case tasks may jump from an underloaded to an overloaded machine, which clearly is undesirable in a game-theoretic context.

2 Notation and Preliminaries
We consider an arbitrary, undirected and connected graph G = (V, E) with n = |V| vertices representing machines. The degree of a vertex i ∈ V is d(i). Δ denotes the maximum degree of any vertex in V. For two vertices i, j, d(i, j) = max{d(i), d(j)} is the maximum degree of i and j. There are m tasks in the system, which are initially assigned arbitrarily to the n machines. We denote by x a state of the system, i.e., a fixed assignment of tasks to the machines. For any machine i ∈ V, we denote by x_i the set of tasks that are assigned to machine i in x. We consider a probabilistic migration process, in which the state is a random variable in each time step. In particular, let X^t be the state at (the end of) step t. Similarly, X_i^t is the subset of tasks assigned to machine i ∈ V, and it is a random variable for every t ≥ 1 due to the probabilistic nature of our migration protocols. Each task has a weight w ∈ N and, unless stated otherwise, we assume uniform tasks with w = 1. Each vertex i ∈ V is a machine with a speed s_i ∈ N, s_i ≥ 1. We define S = Σ_{i∈V} s_i. Note that we can also handle rational speeds by normalization to integers. By s_max and s_min we denote the maximal and minimal speed of a machine, respectively. If s_max = s_min, we have a network of identical machines and assume w.l.o.g. s_i = 1 for all i ∈ V. In the case of uniform tasks the load of a machine is defined as the number of tasks assigned to it, divided by the speed of the machine. In the case of weighted tasks the load is the sum of the weights of these tasks divided by the speed. In particular, by W(x_i) we denote the weight on machine i in state x, i.e., the sum of the weights of all tasks that are located on i. Similarly, W(X_i^t) is the weight at the end of step t. Let W := Σ_{i∈V} W(x_i). For a state x the load of vertex i is denoted by L(x_i) and equals L(x_i) = W(x_i)/s_i. Each task on machine i experiences a disutility of L(x_i) in state x. Naturally, for our process at time t, we obtain the random variable L(X_i^t). A task is a selfish agent that strives to minimize the experienced load. The task is only aware of the load of the machine it is currently located at and is able
to inspect the load of one of the neighboring machines. We consider a round-based process. In each iteration every task uses a protocol to decide to which of the neighbouring machines it potentially migrates. The Laplace matrix L is a standard matrix associated with undirected graphs, which is based on adjacency and degree information (e.g., [10]). Formally, L is defined as the n × n matrix where L_{i,j} is equal to d(i) if j = i, equal to −1 if {i, j} ∈ E(G), and 0 otherwise. The eigenvalues of the Laplace matrix are known to encode valuable structural information for dynamic load balancing processes, see, e.g., [21, 7]. A state x is an ε-approximate Nash equilibrium (ε-apx. NE) for any 0 ≤ ε ≤ 1 if no task can decrease its experienced load by more than a factor of (1 − ε). In such a state we have for every machine i and every neighboring machine j
(1 − ε) · W(x_i)/s_i ≤ (W(x_j) + 1)/s_j.
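The quantities introduced so far are straightforward to compute directly. The following sketch (Python with numpy; the path graph and the load vectors are hypothetical examples, not taken from the paper) builds the Laplace matrix L, extracts its second smallest eigenvalue μ2, and checks the ε-approximate Nash condition edge by edge.

```python
import numpy as np

def laplace_matrix(n, edges):
    """L[i][i] = d(i); L[i][j] = -1 if {i,j} is an edge; 0 otherwise."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, j] -= 1
        L[j, i] -= 1
        L[i, i] += 1
        L[j, j] += 1
    return L

def mu2(L):
    """Second smallest eigenvalue of the (symmetric) Laplace matrix."""
    return np.sort(np.linalg.eigvalsh(L))[1]

def is_eps_apx_ne(W, s, edges, eps):
    """(1 - eps) * W[i]/s[i] <= (W[j] + 1)/s[j] for every edge, both directions."""
    return all((1 - eps) * W[i] / s[i] <= (W[j] + 1) / s[j]
               and (1 - eps) * W[j] / s[j] <= (W[i] + 1) / s[i]
               for i, j in edges)

# Hypothetical example: a path on 4 machines with unit speeds.
edges = [(0, 1), (1, 2), (2, 3)]
L = laplace_matrix(4, edges)
print(mu2(L))                                            # mu2 of the path P4 (= 2 - sqrt(2))
print(is_eps_apx_ne([3, 3, 2, 2], [1, 1, 1, 1], edges, 0.0))  # an exact NE on the path
```

Note that [3, 3, 2, 2] passes the exact (ε = 0) test because no edge has a load gap larger than one, even though the state is not perfectly balanced.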
For ε = 0 we call such a state an (exact) Nash equilibrium (NE). A network of identical machines has reached a (social) optimum (OPT) x if
W(x_i) ∈ {⌊m/n⌋, ⌈m/n⌉} for every i ∈ V.
Our game can be expressed as an atomic congestion game, and thus the following function due to Rosenthal [28] is a potential function for our game:
Φ1(x) = Σ_{i∈V} W(x_i) · (W(x_i) + 1)/s_i = 2 · Σ_{i∈V} Σ_{k=1}^{W(x_i)} k/s_i.
Whenever a single player makes a unilateral assignment change, the change in the potential function equals (up to the constant scaling factor) the load change experienced by the player. Thus, the local optima of the potential function are exactly the NE of the game, and the potential function measures the progress towards NE from a player's point of view.

We will also use the quadratic potential function that is standard in the load balancing literature,
Φ0(x) = Σ_{i∈V} (W(x_i))²/s_i.
It measures progress towards a social optimum from a global point of view. This function will be helpful when we prove convergence to apx. NE. This leads to the following general definition.

Definition 2.1. For r ∈ {0, 1}, define
Φr(x) := Σ_{i∈V} W(x_i) · (W(x_i) + r)/s_i.

For our migration process, we often consider the change of the potential, which is ΔΦr(X^t) := Φr(X^{t−1}) − Φr(X^t). We will also use a normalized formulation of Φ0(x), denoted by Ψ(x).

Definition 2.2. The function Ψ(x) is defined as
Ψ(x) = Φ0(x) − m²/S.

The potential change is defined as ΔΨ(X^t) := Ψ(X^{t−1}) − Ψ(X^t) = ΔΦ0(X^t). Note that for identical machines it holds that
Ψ(x) = Φ0(x) − 2m · (m/n) + n · (m/n)² = Σ_{i∈V} (W(x_i) − m/n)².

As a standard convention, the term “with high probability” means with probability at least 1 − n^{−c} for some constant c > 0. Our main results are upper bounds on the expected time for our protocols to reach an (apx.) NE. We point out that one can get corresponding upper bounds which hold with high probability at the cost of a multiplicative increase by O(log n).

3 Uniform Tasks and Related Machines

In this section, we first present our results on general networks and machines with speeds, as our analysis of this case introduces our main approach. Protocol I in Figure 1 allows tasks to move to neighboring machines with a smaller load. In more detail, in each round every player randomly chooses a neighboring machine. If the anticipated load of the other machine is smaller, the player moves to it with a probability that depends on several parameters: the degrees of the two vertices, the speeds of both machines, and their load difference. Note that these are all local parameters. The factor d(i)/d(i,j) = d(i)/max{d(i), d(j)} is crucial to prevent too many tasks from low-degree vertices moving to neighbors with a much larger degree. Alternatively, we could use 1/Δ in our analysis, but this would require that all tasks know the maximum degree.

Our first result in this section shows that, for sufficiently large m, we reach an apx. NE in time logarithmic in m. The second result bounds the time to reach an exact NE.

Theorem 3.1. Let m ≥ δ · 8 · n³ · S for some δ > 1. Then Protocol I reaches a 2/(1 + δ)-approximate Nash
equilibrium in expected time
O((log m + log n + log s_max) · diam(G)² · Δ · S · s_max/s_min),
which is O((1 + ln m) · poly(n, s_max)) on any graph.

Theorem 3.2. Let m ≥ δ · 8 · n³ · S for some δ > 1. Then Protocol I reaches a Nash equilibrium in expected time
O((1 + ln m) · poly(n, s_max))
on any graph. If m ≤ δ · 8 · n³ · S, then Protocol I reaches a Nash equilibrium in expected time
O(poly(n) · poly(s_max)).

for each task ℓ in parallel do
    Let i = i(ℓ) be the current machine of task ℓ
    Choose a neighboring machine j u.a.r.
    if L(X_i^{t−1}) − L(X_j^{t−1}) > 1/s_j then
        Move task ℓ from resource i to j with probability
        (d(i)/d(i,j)) · (L(X_i^{t−1}) − L(X_j^{t−1})) / (α · (1/s_i + 1/s_j) · W(X_i^{t−1}))
    end if
end for

Figure 1: Protocol I for uniform tasks and machines with speeds. We set α := 4s_max.

In expectation, the protocol behaves like a continuous diffusion process. To avoid oscillation we need α ≥ 4s_max. This, however, implies that even though tasks can have a large incentive for migration (e.g., if they move from a full and slow to an empty and fast machine), they never migrate with a probability of more than 1/(4s_max). Thus, to reach an apx. NE where all players have only a small incentive to migrate, it might take Ω(s_max) rounds for the last players to move. Hence a convergence time that depends polynomially on s_max is unavoidable, given the way our protocol is defined.

In Section 3.1 we prove some fundamental bounds on the potential change in one step of Protocol I. Using these insights we show Theorem 3.1 and Theorem 3.2 in Section 3.2.

3.1 Potential Function Analysis  First we introduce some additional definitions. Note that throughout the paper we first estimate the expected potential decrease occurring in a single round of the process. In particular, we usually condition on the event that X^{t−1} is some arbitrary but fixed state x.

Definition 3.1. For any i, j ∈ V with {i, j} ∈ E and any given state x, the expected flow along this edge in a single round of our protocol starting from x is
f_{i,j}(x) := (L(x_i) − L(x_j)) / (α · d(i,j) · (1/s_i + 1/s_j)) if L(x_i) − L(x_j) > 1/s_j, and f_{i,j}(x) := 0 otherwise.
Note that f_{i,j}(x) is always nonnegative. Furthermore,
Ẽ(x) := {(i, j) ∈ E : L(x_i) − L(x_j) > 1/s_j}
is the set of edges over which tasks have an incentive to move when the system is in state x. We usually use the shorthand Ẽ instead of Ẽ(x). Finally, for any r ∈ {0, 1} we define
Λ^r_{i,j}(x) := (2α − 2) · d(i,j) · (1/s_i + 1/s_j) · f_{i,j}(x) + r/s_i − r/s_j.

Our aim is to prove that in one round the system makes progress towards a NE, i.e., that E[ΔΦr(X^t | X^{t−1} = x)] > 0. Let us first consider the potential change when the number of tasks transferred over any edge is exactly its expected number. Hence, we define
ΔΦ̂_r(X^t | X^{t−1} = x) := Σ_{i∈V} W(x_i) · (W(x_i) + r)/s_i − Σ_{i∈V} (E[W(X_i^t) | X^{t−1} = x])²/s_i − r · Σ_{i∈V} E[W(X_i^t) | X^{t−1} = x]/s_i.

The following lemma generalizes [7, Lemma 2] to the setting with speeds.

Lemma 3.1. For any round t ∈ N it holds that
ΔΦ̂_r(X^t | X^{t−1} = x) ≥ Σ_{(i,j)∈Ẽ(x)} f_{i,j}(x) · Λ^r_{i,j}(x).

Now we provide a bound on the real potential change, i.e., we include the deviation which occurs since the actual numbers of transferred tasks may differ from their expected values. For the proof of the bound in Lemma 3.3, we require the technical Lemma 3.2.

Lemma 3.2. For any step t and any state x,
Σ_{i∈V} Var[W(X_i^t) | X^{t−1} = x]/s_i = Σ_{(i,j)∈Ẽ} f_{i,j}(x) · (1/s_i + 1/s_j).
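Protocol I as given in Figure 1 is easy to simulate. The sketch below (Python; the path graph, the task counts, and the round budget are hypothetical choices for illustration, not from the paper) implements one synchronous round and tracks the normalized potential Ψ(x) = Φ0(x) − m²/S, which tends to decrease as the analysis predicts.

```python
import random

def psi(W, s):
    """Normalized quadratic potential: Psi(x) = Phi_0(x) - m^2 / S."""
    m, S = sum(W), sum(s)
    return sum(w * w / si for w, si in zip(W, s)) - m * m / S

def protocol_one_round(W, s, adj, alpha):
    """One synchronous round of Protocol I: every task independently samples a
    neighboring machine and migrates with the probability from Figure 1."""
    moves = []  # collect moves first so all decisions use the old loads
    for i, neigh in enumerate(adj):
        for _ in range(W[i]):                     # each task on machine i
            j = random.choice(neigh)              # neighbor chosen u.a.r.
            d_ij = max(len(neigh), len(adj[j]))
            diff = W[i] / s[i] - W[j] / s[j]      # load difference L_i - L_j
            if diff > 1 / s[j]:
                p = (len(neigh) / d_ij) * diff / (alpha * (1/s[i] + 1/s[j]) * W[i])
                if random.random() < p:
                    moves.append((i, j))
    for i, j in moves:
        W[i] -= 1
        W[j] += 1

# Hypothetical example: a path of 4 identical machines, alpha = 4 * s_max = 4.
random.seed(1)
adj = [[1], [0, 2], [1, 3], [2]]
W, s = [40, 0, 0, 0], [1, 1, 1, 1]
start = psi(W, s)
for _ in range(200):
    protocol_one_round(W, s, adj, alpha=4)
print(W, psi(W, s) < start)   # roughly balanced loads, potential has decreased
```

Collecting the moves before applying them mirrors the concurrent nature of the protocol: all migration decisions in a round are based on the loads at the end of the previous round.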
Lemma 3.3. For any step t and any state x,
E[ΔΦr(X^t | X^{t−1} = x)] ≥ Σ_{(i,j)∈Ẽ(x)} f_{i,j}(x) · (Λ^r_{i,j}(x) − 1/s_i − 1/s_j).

Proof. We obtain
E[ΔΦr(X^t | X^{t−1} = x)]
= Φr(x) − E[Φr(X^t) | X^{t−1} = x]
= Σ_{i∈V} W(x_i) · (W(x_i) + r)/s_i − Σ_{i∈V} E[(W(X_i^t))² | X^{t−1} = x]/s_i − r · Σ_{i∈V} E[W(X_i^t) | X^{t−1} = x]/s_i
= Σ_{i∈V} W(x_i) · (W(x_i) + r)/s_i − Σ_{i∈V} (E[W(X_i^t) | X^{t−1} = x])²/s_i − r · Σ_{i∈V} E[W(X_i^t) | X^{t−1} = x]/s_i − Σ_{i∈V} Var[W(X_i^t) | X^{t−1} = x]/s_i
= ΔΦ̂_r(X^t | X^{t−1} = x) − Σ_{i∈V} Var[W(X_i^t) | X^{t−1} = x]/s_i
≥ Σ_{(i,j)∈Ẽ} f_{i,j}(x) · (Λ^r_{i,j}(x) − 1/s_i − 1/s_j),
where the last inequality follows by applying Lemma 3.1 and Lemma 3.2. □

The next technical lemma builds on Lemma 3.3 and shows that every edge (i, j) ∈ Ẽ(x) with a sufficiently large load difference contributes positively to E[ΔΦr(X^t | X^{t−1} = x)].

Lemma 3.4. Set α := 4s_max ≥ 4. Then the following two statements hold for any state x:
1. If r = 1, then for any (i, j) ∈ Ẽ(x) with L(x_i) − L(x_j) ≥ 1/s_j + 1/(s_i · s_j) we have
Λ^1_{i,j}(x) − 1/s_i − 1/s_j ≥ 1/(s_i · s_j).
2. If r = 0, then for any (i, j) ∈ Ẽ(x) with L(x_i) − L(x_j) ≥ 1/s_i + 1/s_j we have
Λ^0_{i,j}(x) − 1/s_i − 1/s_j ≥ 1/(2s_max).

3.2 Convergence to Approximate and Exact Nash Equilibria  In this section, we finally prove our main theorems. We use Lemma 3.1 for the migration in a single round and consider the potential drop w.r.t. an “ideal” potential value if exactly the expected loads are realized. Then, using Lemma 3.3, we bound the difference between ideal and realized potential values by analyzing the variance of the migration process. This yields bounds on the expected drop of the potential in one round and is the main ingredient to prove the theorems. Let us define for any given state x the maximum load difference as
L_Δ(x) := max_{i∈V} |L(x_i) − m/S|.

In order to prove our first theorem, we use the function Φ0. For any given state x, we observe a simple relation between the potential value Φ0(x) and the maximum load difference L_Δ(x). Note that Φ0(x) = Σ_{i∈V} (W(x_i))²/s_i is minimized when W(x_i) = (m/S) · s_i for all i ∈ V. This implies
Φ0(x) ≥ Σ_{i∈V} (m² · s_i)/S² = m²/S.
The following lemma considers the normalized version Ψ(x).

Lemma 3.5. For any state x of the system it holds that
Ψ(x) = Φ0(x) − m²/S ≤ S · (L_Δ(x))².

Lemma 3.6. Let T be the first time step with L_Δ(X^T) ≤ 8 · diam(G) · n · Δ. Then E[T] is bounded above by
O((ln m + ln n + ln s_max) · diam(G)² · Δ · S · s_max/s_min).
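Bounds of this form follow from a multiplicative drift argument: one shows E[Ψ(X^t)] ≤ (1 − 1/γ) · Ψ(X^{t−1}) for a suitable γ, which yields a hitting time of roughly γ · ln(Ψ(X^0)/target). The following sketch (Python; γ, the initial potential, and the target are purely illustrative numbers, not from the paper) replays the expected contraction deterministically and compares the hitting time against that bound.

```python
import math

def drift_steps(psi0, gamma, target):
    """Iterate the expected one-round contraction Psi <- (1 - 1/gamma) * Psi
    and count rounds until Psi drops to the target or below."""
    psi, t = psi0, 0
    while psi > target:
        psi *= 1 - 1 / gamma
        t += 1
    return t

gamma, psi0, target = 50.0, 1e9, 1.0
t = drift_steps(psi0, gamma, target)
bound = gamma * math.log(psi0 / target)   # t is at most about gamma * ln(psi0/target)
print(t, math.ceil(bound))
```

The actual lemma needs slightly more care (the potential is a random variable and must be rounded to integer values before standard multiplicative drift theorems apply), but the deterministic iteration shows where the logarithmic dependence on m comes from: Ψ(X^0) ≤ m²/s_min, so ln(Ψ(X^0)) = O(ln m).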
Proof. [Proof of Lemma 3.6] In this proof, we consider the progress in Φ0(X^t). We assume that the assignment X^{t−1} = x is arbitrary but fixed. Consider now the protocol before the execution of the next step t. Let l ∈ V be a vertex with |L(x_l) − m/S| = L_Δ(x), and assume w.l.o.g. that L(x_l) > m/S. Then there must be another vertex k ∈ V with L(x_k) ≤ m/S. This implies that there is an edge {p, q} ∈ E on a path from l to k such that
L(x_p) − m/S ≥ L(x_q) − m/S + L_Δ(x)/diam(G),
which is equivalent to
L(x_p) − L(x_q) ≥ L_Δ(x)/diam(G).
By Lemma 3.3,
E[ΔΦ0(X^t | X^{t−1} = x)] ≥ Σ_{(i,j)∈Ẽ} f_{i,j}(x) · (Λ^0_{i,j}(x) − 1/s_i − 1/s_j).
With slight abuse of notation, let us define for any edge (i, j) ∈ Ẽ
E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)] := f_{i,j}(x) · (Λ^0_{i,j}(x) − 1/s_i − 1/s_j),
so that
E[ΔΦ0(X^t | X^{t−1} = x)] ≥ Σ_{(i,j)∈Ẽ} E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)].
Let us now group Ẽ into three disjoint groups (where we omit the argument x throughout):
Ẽ1 := {(i, j) ∈ Ẽ : L(x_i) − L(x_j) > 2/s_i + 2/s_j} \ {(p, q)},
Ẽ2 := {(i, j) ∈ Ẽ : L(x_i) − L(x_j) ≤ 2/s_i + 2/s_j} \ {(p, q)},
Ẽ3 := {(p, q)}.

Group 1: Lemma 3.4 with r = 0 shows that
E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)] = f_{i,j}(x) · (Λ^0_{i,j}(x) − 1/s_i − 1/s_j) ≥ 0.

Group 2: In this case we consider the value of Σ_{(i,j)∈Ẽ2} E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)]. Let (i, j) be an edge in Ẽ2. Then, plugging in the definitions of Λ^0_{i,j}(x) and f_{i,j}(x) yields
E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)]
= f_{i,j}(x) · Λ^0_{i,j}(x) − f_{i,j}(x) · (1/s_i + 1/s_j)
= (L(x_i) − L(x_j)) / (α · d(i,j) · (1/s_i + 1/s_j)) · ((2 − 2/α) · (L(x_i) − L(x_j)) − (1/s_i + 1/s_j))
≥ − (L(x_i) − L(x_j)) / (α · d(i,j))
≥ − 4/α,
since all speeds are at least one and hence L(x_i) − L(x_j) ≤ 2/s_i + 2/s_j ≤ 4. Summing up over all edges in Ẽ2, we obtain
Σ_{(i,j)∈Ẽ2} E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)] ≥ − 4|E|/α.

Group 3: Consider now the edge (p, q), i.e., the set Ẽ3. Then,
E[Δ_{p,q}Φ0(X^t | X^{t−1} = x)]
= f_{p,q}(x) · (Λ^0_{p,q}(x) − 1/s_p − 1/s_q)
= (L(x_p) − L(x_q)) / (α · d(p,q) · (1/s_p + 1/s_q)) · ((2 − 2/α) · (L(x_p) − L(x_q)) − (1/s_p + 1/s_q))
≥ (L(x_p) − L(x_q)) / (α · d(p,q) · (1/s_p + 1/s_q)) · (L(x_p) − L(x_q) − 2)
≥ (L_Δ(x)/diam(G)) · (L_Δ(x)/diam(G)) / (2 · α · d(p,q) · (1/s_p + 1/s_q))
= (L_Δ(x))² / (2 · diam(G)² · α · d(p,q) · (1/s_p + 1/s_q)),
where the second-to-last inequality uses α ≥ 2 and 1/s_p + 1/s_q ≤ 2, and the last inequality uses L(x_p) − L(x_q) ≥ L_Δ(x)/diam(G) together with the assumption L_Δ(x) ≥ 8 · diam(G) · n · Δ.

Now we get
E[ΔΦ0(X^t | X^{t−1} = x)]
≥ Σ_{(i,j)∈Ẽ1} E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)] + Σ_{(i,j)∈Ẽ2} E[Δ_{i,j}Φ0(X^t | X^{t−1} = x)] + E[Δ_{p,q}Φ0(X^t | X^{t−1} = x)]
≥ 0 − 4|E|/α + (L_Δ(x))² / (2 · diam(G)² · α · d(p,q) · (1/s_p + 1/s_q))
≥ (L_Δ(x))² / (4 · diam(G)² · α · d(p,q) · (1/s_p + 1/s_q)),
where the last inequality uses L_Δ(x) ≥ 8 · diam(G) · n · Δ and α > s_max.

Recall that Φ0(x) = Σ_{i∈V} (W(x_i))²/s_i is minimized when W(x_i) = (m/S) · s_i for all i, which implies Φ0(X^t) ≥ Σ_{i∈V} (m² · s_i)/S² = m²/S. Here, we continue with the normalized version of Φ0(x), i.e., with Ψ(x). By Lemma 3.5, Ψ(x) ≤ S · (L_Δ(x))². Since ΔΨ(X^t) = ΔΦ0(X^t) and α = 4s_max, we obtain (as long as L_Δ(X^t) > 8 · diam(G) · n · Δ) that
E[ΔΨ(X^t | X^{t−1} = x)] ≥ (L_Δ(x))² / (4 · diam(G)² · α · d(p,q) · (1/s_p + 1/s_q))
≥ Ψ(x) / (4 · S · diam(G)² · α · Δ · (1/s_p + 1/s_q))
≥ Ψ(x) / (32 · S · diam(G)² · (s_max/s_min) · Δ).
With γ := 32 · S · diam(G)² · (s_max/s_min) · Δ we have E[ΔΨ(X^t | X^{t−1} = x)] ≥ (1/γ) · Ψ(x), or equivalently
E[Ψ(X^t) | X^{t−1} = x] ≤ (1 − 1/γ) · Ψ(x).
Note that for any state x we have Ψ(x) ≤ m²/s_min. By rounding Ψ to take on integer values and applying standard results on multiplicative drift analysis, the lemma follows. □

Proof. [Proof of Theorem 3.1] Let T be the first time step when L_Δ(X^T) ≤ 8 · diam(G) · n · Δ holds. Using Lemma 3.6 we have E[T] = O((log m + log n + log s_max) · diam(G)² · Δ · S · s_max/s_min). This provides the bound on the convergence time. To prove the bound on the approximation ratio, we note that once L_Δ(X^T) ≤ 8 · diam(G) · n · Δ ≤ 8 · n³, it follows for every i ∈ V that
|W(X_i^T)/s_i − m/S| ≤ L_Δ(X^T) ≤ 8 · n³.
Consider now any pair i, j with {i, j} ∈ E. Then
W(X_i^T)/s_i ≤ 8 · n³ + m/S,
and similarly
(W(X_j^T) + 1)/s_j ≥ max{1/s_j, m/S − 8 · n³ + 1/s_j}.
We are looking for the smallest possible ε ∈ [0, 1) such that (1 − ε) · W(X_i^T)/s_i ≤ (W(X_j^T) + 1)/s_j. Plugging in our bounds from above, a simple calculation yields the result ε = 2/(1 + δ). □

The following lemma will be useful in proving the second theorem for convergence to exact NE.

Lemma 3.7. Consider two vertices i, j in a state x such that L(x_i) − L(x_j) > 1/s_j. If all speeds are integers, then
L(x_i) − L(x_j) ≥ 1/s_j + 1/(s_i · s_j).

Lemma 3.8. For every state x, we define
Ψ′(x) = Φ1(x) − (m²/S + n · (m/S) − n/(4 · s_min)),
which satisfies
0 ≤ Ψ′(x) ≤ S · (L_Δ(x))² + n · L_Δ(x) + n.

Proof. [Proof of Theorem 3.2] Note that Lemma 3.7 provides the minimum decrease required to apply Lemma 3.4 for Φ1. Plugging in this bound for Λ^1_{i,j} in Lemma 3.3 yields an expected (additive) decrease of at least 1/(8 · Δ · s³_max) per round. If m ≥ δ · 8 · n³ · S, we see with Lemma 3.6 that L_Δ(X^T) ≤ 8 · diam(G) · n · Δ for a step T with
E[T] = O((1 + ln m) · poly(n, s_max)).
Lemma 3.8 implies that by normalizing the potential Φ1 to Ψ′, the latter is only of size O(poly(n) · S) at that point. Standard results in probability theory then show that an exact NE is obtained after
O(poly(n) · S · Δ · s³_max) = O(poly(n) · poly(s_max))
further rounds in expectation. If initially m ≤ δ · 8 · n³ · S, then clearly L_Δ(X^T) ≤ m = O(n³ · S), and the same reasoning as above applies. □
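The closing calculation in the proof of Theorem 3.1 can be replayed with exact rational arithmetic. The snippet below (Python; the values of n and δ are arbitrary test choices) takes the two bounds W(X_i^T)/s_i ≤ m/S + 8n³ and (W(X_j^T) + 1)/s_j ≥ m/S − 8n³, evaluates them at the threshold m/S = δ · 8n³, and confirms that the smallest feasible ε collapses to 2/(1 + δ), independent of n.

```python
from fractions import Fraction

def eps_needed(delta, n):
    """Smallest eps with (1 - eps) * (m/S + 8n^3) <= m/S - 8n^3,
    evaluated at the threshold m/S = delta * 8 * n^3 of Theorem 3.1."""
    b = 8 * n ** 3                    # the deviation bound L_Delta <= 8 * n^3
    ratio = Fraction(delta) * b       # m/S at the threshold
    return 1 - (ratio - b) / (ratio + b)

# The required eps is exactly 2 / (1 + delta) for every n:
for delta in (2, 3, 10):
    print(eps_needed(delta, 5) == Fraction(2, 1 + delta))   # True
```

Algebraically, 1 − (δ − 1)/(δ + 1) = 2/(1 + δ); for m strictly larger than the threshold the required ε only shrinks, so 2/(1 + δ) covers all m ≥ δ · 8 · n³ · S.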
4 Uniform Tasks and Identical Machines

In this section we consider the case of identical machines. We use Protocol I with s_max = 1 and α = 3. For uniform tasks and identical machines, a task assigned to machine i prefers machine j if and only if W(X_i^{t−1}) − W(X_j^{t−1}) > 1. With μ2 being the second smallest eigenvalue of the Laplace matrix, we show the following results.

Theorem 4.1. Let m ≥ δ · n^4 for some δ > 1. With α = 3, Protocol I reaches a 2/(1 + δ)-approximate Nash equilibrium in expected time O((Δ/μ2) · (ln m + ln n)).

Theorem 4.2. With α = 3, Protocol I reaches a Nash equilibrium in expected time O((Δ/μ2) · (ln m + ln n) + (|E| · Δ)/μ2).

Next we consider Protocol II (see Figure 2), a slightly altered version of Protocol I. Protocol II allows neutral moves, i.e., moves from machine i to machine j when W(X_i^{t−1}) − W(X_j^{t−1}) = 1. For this protocol we show convergence towards the social optimum, i.e., a state where every machine has a load of either ⌈m/n⌉ or ⌊m/n⌋. Note that neutral moves are necessary to reach a social optimum.

Theorem 4.3. For β = n^11, Protocol II reaches a social optimum in expected time O(log(m) · poly(n)).

for each task ℓ in parallel do
    Let i = i(ℓ) be the current machine of task ℓ
    Choose a neighbouring machine j u.a.r.
    if W(X_i^{t−1}) − W(X_j^{t−1}) > 1 then
        Move task ℓ from machine i to j with probability (d(i)/d(i,j)) · (W(X_i^{t−1}) − W(X_j^{t−1})) / (2α · W(X_i^{t−1}))
    end if
    if W(X_i^{t−1}) − W(X_j^{t−1}) = 1 then
        Move task ℓ from machine i to j with probability d(i) / (d(i,j) · β · W(X_i^{t−1}))
    end if
end for

Figure 2: Protocol II, which is able to reach a social optimum. We set α = 3 and β = n^11.

In the following, we provide some insights on how to prove Theorems 4.1 and 4.2. The proof of Theorem 4.3 can be found in the full version of this paper. For a state x, we define f_{i,j}(x) as before, i.e., as the expected load that is sent from i to j in one round of Protocol I. The next lemma bounds the potential change in terms of the edges with a load difference of at least two.

Lemma 4.1. For α = 3,

E[ΔΨ(X^t) | X^{t−1} = x] ≥ Σ_{{i,j} ∈ E : |W(x_i) − W(x_j)| > 1} (W(x_i) − W(x_j))^2 / (18 · d(i,j)).

Based on Lemma 4.1 and using the combinatorial characterization of μ2, it is easy to see that Protocol I quickly reduces the potential to O(n^6).

Lemma 4.2. Protocol I reaches a time-step T with Ψ(X^T) ≤ 2|E|/μ2 = O(n^6) in expected time O((Δ/μ2) · (ln m + ln n)).

Proof. Let Ẽ := {{i,j} ∈ E : |W(x_i) − W(x_j)| > 1}. We conclude from Lemma 4.1 that

E[ΔΨ(X^t) | X^{t−1} = x] ≥ Σ_{{i,j} ∈ Ẽ} (W(x_i) − W(x_j))^2 / (18 · d(i,j)) ≥ Σ_{{i,j} ∈ Ẽ} (W(x_i) − W(x_j))^2 / (18 · Δ).

On the other hand, a classic result from standard (continuous) diffusion [7, Theorem 4] (see also [21, Proof of Theorem 1]) shows that

Σ_{{i,j} ∈ E} (W(x_i) − W(x_j))^2 / (18 · Δ) ≥ (μ2 / (18 · Δ)) · Ψ(x),

where we recall that μ2 is the second smallest eigenvalue of the Laplace matrix. By noting that (W(x_i) − W(x_j))^2 ≤ 1 for all e ∈ E \ Ẽ, we derive

Σ_{{i,j} ∈ Ẽ} (W(x_i) − W(x_j))^2 / (18 · Δ) ≥ (μ2 / (18 · Δ)) · Ψ(x) − |E| / (18 · Δ).

Hence if μ2 · Ψ(x) ≥ 2|E|, we get

Σ_{{i,j} ∈ Ẽ} (W(x_i) − W(x_j))^2 / (18 · Δ) ≥ (μ2 / (36 · Δ)) · Ψ(x).

Therefore,

E[ΔΨ(X^t) | X^{t−1} = x] ≥ (μ2 / (36 · Δ)) · Ψ(x).
This implies that as long as Ψ(X^t) ≥ 2|E|/μ2,

E[Ψ(X^t) | X^{t−1} = x] ≤ Ψ(x) − E[ΔΨ(X^t) | X^{t−1} = x] ≤ (1 − μ2/(36 · Δ)) · Ψ(x).

Moreover, Ψ(X^0) ≤ m^2. Applying standard results on multiplicative drift analysis yields the claim of the lemma. □
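To make the drift behind Lemma 4.2 concrete, here is a toy simulation of a Protocol I-style rule for uniform tasks on a cycle of identical machines. This is a sketch under assumptions: the seed, graph, and round count are our own choices, and since the definition of d(i,j) lies outside this excerpt we simply take d(i)/d(i,j) = 1 on this regular graph.

```python
import random

def protocol_round(loads, neighbors, alpha, rng):
    """One synchronous round of a Protocol I-style rule: every unit task
    picks a random neighboring machine and migrates with probability
    proportional to the load gap; all decisions are based on the loads
    from the end of the previous round."""
    moves = []
    for i, w in enumerate(loads):
        for _ in range(w):  # each task on machine i decides in parallel
            j = rng.choice(neighbors[i])
            gap = loads[i] - loads[j]
            if gap > 1:
                # Assumption: d(i)/d(i,j) = 1 on this regular cycle.
                if rng.random() < gap / (2 * alpha * loads[i]):
                    moves.append((i, j))
    for i, j in moves:  # apply all migrations at once
        loads[i] -= 1
        loads[j] += 1
    return loads

rng = random.Random(1)
n, m, alpha = 6, 60, 3
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
loads = [m] + [0] * (n - 1)  # all tasks start on one machine

def psi(ls):
    """Quadratic potential around the average load m/n."""
    return sum((w - m // n) ** 2 for w in ls)

start = psi(loads)
for _ in range(300):
    loads = protocol_round(loads, neighbors, alpha, rng)
print(loads, psi(loads))
```

After a few hundred rounds the loads on this instance are close to balanced, and the quadratic potential, which plays the role of Ψ in the analysis, has dropped far below its initial value of 3000.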
Proof. [Proof of Theorem 4.1] By Lemma 4.2, we reach in expected time O((Δ/μ2) · (ln m + ln n)) a time-step T with Ψ(X^T) ≤ 2|E|/μ2 = O(n^6). This provides the bound on the convergence time. For the approximation ratio we note that, given the bound on Ψ, the maximum load must satisfy

W_max(X^T) = max_{i ∈ V} W(X_i^T) ≤ m/n + n^3,

and the minimum load must satisfy

W_min(X^T) = min_{i ∈ V} W(X_i^T) ≥ m/n − n^3.

Recall that in an ε-approximate NE, for all neighboring vertices i, j the following inequality must hold:

(1 − ε) · W(X_i^T) ≤ W(X_j^T) + 1.

Thus, as we have W(X_i^T) ≤ m/n + n^3 and W(X_j^T) ≥ max{0, m/n − n^3} + 1 for all edges (i,j) ∈ E along which players want to move, it suffices for ε to satisfy (1 − ε) · (m/n + n^3) ≤ m/n − n^3. By solving this for ε and using m ≥ δ · n^4, we get ε ≤ 1 − (δ − 1)/(δ + 1) = 2/(δ + 1), as desired. □

Proof. [Proof of Theorem 4.2] The analysis of Theorem 4.1 shows that after expected time O((Δ/μ2) · (ln m + ln n)) we reach a step T such that Ψ(X^T) ≤ 2|E|/μ2. Let T be the first of these time steps. If in a step t ≥ T we are not in a NE, there is an edge {i,j} ∈ E with |W(X_i^{t−1}) − W(X_j^{t−1})| > 1. Then Lemma 4.1 yields E[ΔΨ(X^t)] ≥ 2/(9Δ). By standard probability arguments and using the expected potential decrease of 2/(9Δ) per round, the expected number of rounds until Ψ(X^t) is reduced from O(|E|/μ2) to 0 is O((|E| · Δ)/μ2). Note that for Ψ(X^t) = 0, X^t is a social optimum and a NE, and our process might stop even earlier. Thus, the expected time to reach a NE is at most O((Δ/μ2) · (ln m + ln n) + (|E| · Δ)/μ2). □

5 Weighted Tasks and Identical Machines

In this section we assume that every task ℓ ∈ [m] has an integral weight w_ℓ > 0. We denote the maximal and minimal weight of any task by w_max and w_min, respectively. In this case, a task ℓ with weight w_ℓ that is located at machine i in state x prefers machine j over i if and only if W(x_i) > W(x_j) + w_ℓ, which is equivalent to W(x_i) − W(x_j) > w_ℓ. We call ℓ satisfied if there is no neighboring machine j with W(x_i) > W(x_j) + w_ℓ. Our protocol for weighted tasks is given in Figure 3.

for each task ℓ in parallel do
    Let i = i(ℓ) be the current machine of task ℓ and w_ℓ the weight of task ℓ
    Choose a neighboring machine j u.a.r.
    if W(X_i^{t−1}) − W(X_j^{t−1}) > w_ℓ then
        Move task ℓ from machine i to j with probability (d(i)/d(i,j)) · (W(X_i^{t−1}) − W(X_j^{t−1})) / (2α · W(X_i^{t−1}))
    end if
end for

Figure 3: Protocol III for weighted tasks and uniform speeds.

Theorem 5.1. For α > 4 · w_max, Protocol III reaches a Nash equilibrium in expected time O(Δ · W^3 · w_max).

6 Conclusions

In this paper we initiate the study of concurrent protocols for selfish load balancing on general networks. Our protocols rely only on local information and computation and yield rapid convergence times to NE and optimum states for systems with many agents. We show a number of generalisations, e.g., to networks with uniformly related machines of different speeds or agents with weights. There are many open problems that arise from our work, such as finding improved convergence times or general lower bounds for concurrent protocols. On a more technical side, the generalization of our techniques is an interesting open problem, e.g., to more general potential functions that work for networks with both speeds and weights [13]. Another interesting problem is whether approaches that do not use potential functions (e.g., [4]) can be applied here.

References
[1] Heiner Ackermann, Petra Berenbrink, Simon Fischer, and Martin Hoefer. Concurrent imitation dynamics in congestion games. In Proc. 28th Symp. Principles of Distributed Computing (PODC), pages 63–72, 2009.
[2] Heiner Ackermann, Simon Fischer, and Martin Hoefer. Distributed algorithms for QoS load balancing. In Proc. 21st Symp. Parallelism in Algorithms and Architectures (SPAA), pages 197–203, 2009.
[3] Heiner Ackermann, Heiko Röglin, and Berthold Vöcking. On the impact of combinatorial structure on congestion games. J. ACM, 55(6), 2008.
[4] Baruch Awerbuch and Rohit Khandekar. Fast load balancing via bounded best response. In Proc. 19th Symp. Discrete Algorithms (SODA), pages 314–322, 2008.
[5] Petra Berenbrink, Tom Friedetzky, Leslie Ann Goldberg, Paul Goldberg, Zengjian Hu, and Russell Martin. Distributed selfish load balancing. SIAM J. Comput., 37(4):1163–1181, 2007.
[6] Petra Berenbrink, Tom Friedetzky, Iman Hajirasouliha, and Zengjian Hu. Convergence to equilibria in distributed, selfish reallocation processes with weighted tasks. In Proc. 15th European Symposium on Algorithms (ESA), pages 41–52, 2007.
[7] Petra Berenbrink, Tom Friedetzky, and Zengjian Hu. A new analytical method for parallel, diffusion-type load balancing. J. Parallel and Distributed Comput., 69:54–61, 2009.
[8] Jacques Boillat. Load balancing and Poisson equation in a graph. Concurrency: Pract. Exper., 2(4):289–313, 1990.
[9] Steve Chien and Alistair Sinclair. Convergence to approximate Nash equilibria in congestion games. In Proc. 18th Symp. Discrete Algorithms (SODA), pages 169–178, 2007.
[10] Fan Chung. Spectral Graph Theory. Number 92 in CBMS Regional Conference Series in Mathematics. Amer. Math. Society, 1997.
[11] George Cybenko. Load balancing for distributed memory multiprocessors. J. Parallel and Distributed Comput., 7:279–301, 1989.
[12] Robert Elsässer, Burkhard Monien, and Stefan Schamberger. Distributing unit size workload packages in heterogeneous networks. J. Graph Alg. Appl., 10(1):51–68, 2006.
[13] Eyal Even-Dar, Alexander Kesselman, and Yishay Mansour. Convergence time to Nash equilibria in load balancing. ACM Trans. Algorithms, 3(3), 2007.
[14] Eyal Even-Dar and Yishay Mansour. Fast convergence of selfish rerouting. In Proc. 16th Symp. Discrete Algorithms (SODA), pages 772–781, 2005.
[15] Rainer Feldmann, Martin Gairing, Thomas Lücking, Burkhard Monien, and Manuel Rode. Nashification and the coordination ratio for a selfish routing game. In Proc. 30th Intl. Coll. Automata, Languages and Programming (ICALP), pages 514–526, 2003.
[16] Simon Fischer, Petri Mähönen, Marcel Schöngens, and Berthold Vöcking. Load balancing for dynamic spectrum assignment with local information for secondary users. In Proc. Symp. Dynamic Spectrum Access Networks (DySPAN), 2008.
[17] Simon Fischer, Harald Räcke, and Berthold Vöcking. Fast convergence to Wardrop equilibria by adaptive sampling methods. In Proc. 38th Symp. Theory of Computing (STOC), pages 653–662, 2006.
[18] Simon Fischer and Berthold Vöcking. Adaptive routing with stale information. Theoret. Comput. Sci., 410(36):3357–3371, 2008.
[19] Dimitris Fotakis, Alexis Kaporis, and Paul Spirakis. Atomic congestion games: Fast, myopic and concurrent. Theory Comput. Syst., 47(1):38–49, 2010.
[20] Tobias Friedrich and Thomas Sauerwald. Near-perfect load balancing by randomized rounding. In Proc. 41st Symp. Theory of Computing (STOC), pages 121–130, 2009.
[21] Bhaskar Ghosh and S. Muthukrishnan. Dynamic load balancing by random matchings. J. Comput. Syst. Sci., 53(3):357–370, 1996.
[22] Robert Kleinberg, Georgios Piliouras, and Éva Tardos. Load balancing without regret in the bulletin board model. In Proc. 28th Symp. Principles of Distributed Computing (PODC), pages 56–62, 2009.
[23] Robert Kleinberg, Georgios Piliouras, and Éva Tardos. Multiplicative updates outperform generic no-regret learning in congestion games. In Proc. 41st Symp. Theory of Computing (STOC), pages 533–542, 2009.
[24] Jason Marden, Gürdal Arslan, and Jeff Shamma. Regret based dynamics: Convergence in weakly acyclic games. In Proc. 6th Conf. Autonomous Agents and Multi-Agent Systems (AAMAS), 2007.
[25] Jason Marden, Peyton Young, Gürdal Arslan, and Jeff Shamma. Payoff-based dynamics for multiplayer weakly acyclic games. SIAM J. Control Opt., 48(1):373–396, 2009.
[26] S. Muthukrishnan, Bhaskar Ghosh, and Martin Schultz. First- and second-order diffusive methods for rapid, coarse, distributed load balancing. Theory Comput. Syst., 31(4):331–354, 1998.
[27] Yuval Rabani, Alistair Sinclair, and Rolf Wanka. Local divergence of Markov chains and the analysis of iterative load balancing schemes. In Proc. 39th Symp. Foundations of Computer Science (FOCS), pages 694–705, 1998.
[28] Robert Rosenthal. A class of games possessing pure-strategy Nash equilibria. Intl. J. Game Theory, 2:65–67, 1973.
[29] Alexander Skopalik and Berthold Vöcking. Inapproximability of pure Nash equilibria. In Proc. 40th Symp. Theory of Computing (STOC), pages 355–364, 2008.
[30] Berthold Vöcking. Selfish load balancing. In Noam Nisan, Éva Tardos, Tim Roughgarden, and Vijay Vazirani, editors, Algorithmic Game Theory, chapter 20. Cambridge University Press, 2007.