
FPTAS for Mixed-Strategy Nash Equilibria in Tree Graphical Games and Their Generalizations

arXiv:1602.05237v1 [cs.GT] 16 Feb 2016

Luis E. Ortiz [email protected]
Department of Computer and Information Science, University of Michigan - Dearborn, Dearborn, MI 48128

Mohammad T. Irfan [email protected]
Department of Computer Science, Bowdoin College, Brunswick, ME 04011

Abstract

We provide the first FPTAS for computing an approximate mixed-strategy Nash equilibrium (MSNE) in graphical multi-hypermatrix games, which are generalizations of normal-form games, graphical games (GGs), graphical polymatrix games, and hypergraphical games. The computational complexity of graphical polymatrix games, or polymatrix GGs for short, has been of great interest in the computational/algorithmic game theory community. The exact-MSNE formulation of the problem for polymatrix GGs is PPAD-complete, and thus generally believed to be intractable, even for binary-action games with tree graphs. In contrast, to the best of our knowledge, we are the first to establish a corollary FPTAS (or quasi-PTAS) for tree polymatrix and normal-form GGs with the number of actions bounded by a constant (or a logarithm of the number of players, respectively).

1. Introduction

We consider the problem of computing approximate mixed-strategy Nash equilibria (MSNE) in graphical multi-hypermatrix games (GMhGs) (Ortiz, 2014). In particular, we present the algorithmic implications of the results on sparse-discretization representations that Ortiz (2014) leaves open in his research note. Later in this paper, for the sake of completeness, we present the formal definition of GMhGs given by Ortiz (2014). Roughly speaking, in a GMhG, each player's payoff is the sum of several local payoff hypermatrices, defined with respect to each individual player's local hypergraph. GMhGs generalize normal-form games (NFGs), graphical games (GGs) (Kearns, Littman, & Singh, 2001; Kearns, 2007), graphical polymatrix games (polymatrix GGs), and hypergraphical games (Papadimitriou, 2005; Papadimitriou & Roughgarden, 2008).

In this paper, we provide an FPTAS and a quasi-PTAS for GMhGs in which each player's number of actions m and the hypertree-width w of the global game hypergraph, induced by the union of the local hypergraphs of each player, are bounded. In particular, we present a CSP formulation of the problem from which our results follow immediately given old results in the AI literature, as Ortiz (2014) describes in his research note. However, for the sake of self-containment, we also provide a specific dynamic-programming (DP) algorithm for this problem in the context of polymatrix GGs with tree graphs, an important subclass of GMhGs. The specific DP algorithm we present is an FPTAS when m is bounded by a constant, [1] and a quasi-PTAS when m is bounded by a logarithmic function of the number of players n. [2] The result extends to GGs in normal-form/tabular representation (i.e., exponential in the neighborhood size), but because it requires a more complex generalization of the CSP mentioned in this paragraph, we move the details of the more complex CSP to the appendix.

2. Our Main Contribution in Context: Relevant Previous Work

In this section we provide a brief overview of previous computational complexity and algorithmic results for the problem of computing MSNE in general. The objective, however, is to provide context for the significance of our main result on the algorithmic implications of the Sparse Nash-equilibria Representation Theorem of Ortiz (2014) for computing approximate MSNE, in the absolute sense, as game theory most commonly defines it, and for the particular class of GMhGs. While Ortiz (2014) states the Sparse Nash-equilibria Representation Theorem for GMhGs, he only discusses the algorithmic implications for NFGs and (standard) GGs, leaving the broader algorithmic implications for the more general class of GMhGs as an open problem. We address precisely this open problem here.

A full account of our results for all specific sub-classes of GMhGs such as NFGs and (standard) GGs is beyond the scope of this paper, as is the discussion of (a) other recently popular types of approximations such as relative and constant; (b) other popular equilibrium-solution concepts such as pure-strategy Nash equilibria (PSNE) and correlated equilibria (CE) (Aumann, 1974, 1987); and (c) other quality guarantees of the computational solution of the respective problem, including exact MSNE and "well-supported" approximate MSNE. However, in the following, we briefly touch on the previous results most useful to achieve our aforementioned objective for this section. We refer the reader to Ortiz (2014) for a summary of previous computational complexity and algorithmic results on exact and approximate PSNE and MSNE in NFGs and GGs. In brief, the complexity status of NFGs is well-understood today.
See (Ortiz, 2014) for a brief discussion and summary and (Daskalakis, Goldberg, & Papadimitriou, 2005; Daskalakis & Papadimitriou, 2005; Chen & Deng, 2005a, 2005b, 2006; Daskalakis, Goldberg, & Papadimitriou, 2009) for a series of seminal works that ultimately culminated in the PPAD-completeness of 2-player multi-action NFGs, also called bimatrix games. Once the complexity of exact MSNE computation was established, the spotlight naturally fell on approximate MSNE computation, especially in succinctly representable games such as GGs. Chen, Deng, and Teng (2009) showed that bimatrix games do not admit an FPTAS unless PPAD ⊆ P. This result opened up the problem of computing a PTAS. There has been a series of results based on constant-factor approximations. The current best polynomial-time algorithm achieves a 0.3393-approximation for bimatrix games (Tsaknakis & Spirakis, 2008), which can be extended to three- and four-player games with approximation guarantees of 0.6022 and 0.7153, respectively. Note that sub-exponential algorithms for computing approximate MSNE were known prior to all of these results (Lipton, Markakis, & Mehta, 2003). As a result, it is unlikely that the case of a constant number of players will be PPAD-complete.

[1] Actually, the FPTAS result follows even when m log(m k) = O(log(n)), where n is the number of players and k is the largest neighborhood size of the tree game graph defining the polymatrix GGs.
[2] We note that w = 2 for the case of polymatrix GGs with tree graphs.

Along that line, Rubinstein (2014) considered the hardness of computing ε-MSNE in n-player succinctly representable games such as standard and polymatrix GGs. He showed that there exists a constant ε such that finding an ε-MSNE in a 2-action graphical polymatrix game with a bipartite structure and a maximum degree of 3 is PPAD-complete. [3] This result extends that of Chen et al. (2009). Whereas Chen et al. (2009) showed the hardness of bimatrix games for a polynomially small ε, Rubinstein (2014) showed the hardness (in this case, PPAD-completeness) of n-player polymatrix games for a constant ε.

There is some recent good news in the case of polymatrix games with an unbounded number of players. Deligkas, Fearnley, Savani, and Spirakis (2014) presented an algorithm for computing a (0.5 + δ)-MSNE of an n-player polymatrix game. Their algorithm runs in time polynomial in the input size and 1/δ. They started by formulating an ε-MSNE in terms of "regret," and designed a gradient-descent method for minimizing the regret up to the approximation guarantee. Very recently, Barman, Ligett, and Piliouras (2015) gave a quasi-polynomial time randomized algorithm for computing an ε-MSNE in tree-structured polymatrix games. They assumed that the payoffs are normalized so that the local payoff of any player i from any other player j lies in [0, 1/d_i], where d_i is the degree of i. This guarantees, in a strong way, that the total payoff of any player is in [0, 1]. In comparison, we do not make the assumption of local payoffs lying in [0, 1/d_i]. Our algorithm is a deterministic FPTAS when the number of actions is bounded by a constant.

Please note that while the complexity results stated above help place our work in context, they do not directly apply to our main interest in this paper: algorithms for computing ε-MSNE, in the absolute sense, in polymatrix GGs, and their generalizations, with tree graphs, or graphs with bounded tree-width or hypertree-width.
The computational complexity of our cases of interest is open for unbounded m. Here we provide an FPTAS for our cases of interest, assuming bounded m. Also, it is important to keep in mind the main motivation behind GGs, as originally introduced by Kearns et al. (2001): compact/succinct representations that are alternatives to standard normal-form games in game theory; said differently, representation sizes that are not exponential in n, but instead exponential in k and linear in n. As Kearns et al. (2001) stated, if k ≪ n, we obtain exponential gains in representation size. Thus, n and k are the parameters of main interest in standard GGs; the parameter m is of secondary interest. Indeed, even Kearns et al. (2001) concentrate on the case of binary actions (i.e., m = 2).

As noted in the research note of Ortiz (2014), GMhGs have the polynomial intersection property and thus a polynomial CE scheme (Papadimitriou, 2005; Papadimitriou & Roughgarden, 2008; Jiang & Leyton-Brown, 2011b). Hence, the computation of a single/sample CE of any GMhG is in P. However, while computing the CE with optimum social welfare in a standard GG is NP-hard (Papadimitriou & Roughgarden, 2005, 2008), the same computation, or for that matter that of optimizing any linear combination of functions respecting the constraint-network graph induced by the linear CE constraints for the GMhG, is in P if either the tree-width w′ of the resulting primal graph of the CSP induced by the linear CE constraints for the GMhG or the hypertree-width w of the dual hypergraph of the same CSP is bounded. The latter result follows easily from the derivation of the polynomial-time linear program for computing CE of Kakade, Kearns, Langford, and Ortiz (2003). [4] Because the tree-width for a polymatrix GG is 2, the derivation of the result of Kakade et al. (2003) also immediately implies that the same computation for polymatrix GGs with tree graphs is in P. [5]

[3] The author showed a stronger hardness result for ε-approximate MSNE, which is a more relaxed concept than ε-MSNE. Every ε-MSNE is an ε-approximate MSNE, but not vice versa.

3. Mathematical Preliminaries, Background, and Notation

This section introduces the basic technical background, notation, and concepts used in this paper. The presentation follows closely and borrows heavily from that of Ortiz (2014). [6]

3.1 Basic Notation

Denote by $a \equiv (a_1, a_2, \ldots, a_n)$ an n-dimensional vector and by $a_{-i} \equiv (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n)$ the same vector without component i. Similarly, for every set $S \subset [n] \equiv \{1, \ldots, n\}$, denote by $a_S \equiv (a_i : i \in S)$ the (sub-)vector formed from a using only components in S, such that, if $S^c \equiv [n] - S$ denotes the complement of S, $a \equiv (a_S, a_{S^c}) \equiv (a_i, a_{-i})$ for every i. If $A_1, \ldots, A_n$ are sets, denote by $A \equiv \times_{i \in [n]} A_i$, $A_{-i} \equiv \times_{j \in [n] - \{i\}} A_j$, and $A_S \equiv \times_{j \in S} A_j$. To simplify the presentation, whenever we have a difference of a set S with a singleton set {i}, we often abuse notation and denote $S - i \equiv S - \{i\}$.

If G = (V, E) is an undirected graph, then for each $i \in V$ denote by $\mathcal{N}_i \equiv \{j \mid (j, i) \in E\}$ the neighbors of node/vertex i in G, and by $N_i \equiv \mathcal{N}_i \cup \{i\}$ the neighborhood of node/vertex i in G. Note that we have $i \notin \mathcal{N}_i$ but $i \in N_i$ for all $i \in V$.

3.2 Graphical Multi-hypermatrix Game Representations

This section formally defines graphical multi-hypermatrix games (GMhGs), as Ortiz (2014) originally introduced them. GMhGs are graphical models for compact representations of classical game representations in game theory.

Definition 1. A graphical multi-hypermatrix game (GMhG) is defined by a set V of n players, and for each player $i \in V$, a set of actions, or pure strategies, $A_i$; a set $\mathcal{C}_i \subset 2^V$ of local cliques, or local hyperedges, [7] such that if $C \in \mathcal{C}_i$ then $i \in C$; and a set $\{M'_{i,C} : A_C \to \mathbb{R} \mid C \in \mathcal{C}_i\}$ of local-clique payoff matrices. For each player $i \in V$, the sets $N_i \equiv \cup_{C \in \mathcal{C}_i} C$ and $\mathcal{N}_i \equiv \{j \in V \mid i \in N_j, j \neq i\}$ are the clique of players affecting i's payoff including i (i.e., i's neighborhood) and those affected by i not including i, respectively. The local and global payoff matrices $M'_i : A_{N_i} \to \mathbb{R}$ and $M_i : A \to \mathbb{R}$ of i are (implicitly) defined as $M'_i(a_{N_i}) \equiv \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_C)$ and $M_i(a) \equiv M'_i(a_{N_i})$, respectively.

[4] Kakade et al. (2003) present the result in the context of tree GGs only. The generalization of that derivation is simple and natural given that the foundation of the derivation of their main algorithmic/computational result lies in previous work characterizing the exact computation of problems in graphical models in terms of either the tree-width or the hypertree-width. For the particular instance of inference in probabilistic graphical models, the running time of exact computation is known to be exponential in the tree-width. For constraint networks, the same computation is known to be exponential in either. Papadimitriou and Roughgarden (2005) later rediscovered the same result for standard GGs. See Papadimitriou and Roughgarden (2008) for the journal version.
[5] Jiang and Leyton-Brown (2011a) rediscovered this result for the case of optimizing social welfare.
[6] One important distinction is that we use $a_i$ instead of $x_i$ to index the pure strategies of player i.
[7] We use the terms "clique" and "hyperedge" interchangeably throughout.
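As an illustration of Definition 1, the following minimal Python sketch stores a toy polymatrix GG as nested dictionaries and evaluates a player's local payoff as the sum of its local-clique payoffs. The three-player game, the dictionary layout, and the names `table`, `local_payoffs`, `neighborhood`, and `local_payoff` are assumptions made for this example only; they do not come from the definition itself.

```python
from itertools import product


def table(f):
    """Tabulate a two-player, binary-action local-clique payoff function."""
    return {(x, y): f(x, y) for x, y in product((0, 1), repeat=2)}


# Hypothetical toy game on the path 0 - 1 - 2 (an assumption for illustration).
# local_payoffs[i][C] plays the role of M'_{i,C}; every clique C contains i,
# and the key (x, y) lists the actions of the members of C in the order of C.
local_payoffs = {
    0: {(0, 1): table(lambda a0, a1: 1.0 if a0 == a1 else 0.0)},
    1: {
        (0, 1): table(lambda a0, a1: 1.0 if a0 != a1 else 0.0),
        (1, 2): table(lambda a1, a2: 0.5 if a1 == a2 else 0.0),
    },
    2: {(1, 2): table(lambda a1, a2: 1.0 if a1 == a2 else 0.0)},
}


def neighborhood(i):
    """N_i: the union of the local cliques of player i (includes i itself)."""
    return sorted(set().union(*local_payoffs[i].keys()))


def local_payoff(i, joint_action):
    """M'_i(a_{N_i}): the sum over C in C_i of M'_{i,C}(a_C)."""
    total = 0.0
    for clique, payoff_table in local_payoffs[i].items():
        a_C = tuple(joint_action[j] for j in clique)
        total += payoff_table[a_C]
    return total


print(neighborhood(1))             # members of N_1
print(local_payoff(1, (0, 1, 1)))  # payoff of player 1 under a = (0, 1, 1)
```

Because every clique here has size two, the stored tables grow with the number of edges rather than with the full joint action space, which is the compactness that the representation is meant to capture.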

A standard GG, or more formally, a GG in local normal-form, is a GMhG with the property that each $\mathcal{C}_i$ is a singleton set corresponding exactly to the neighborhood $N_i$ of the GG graph, so that $E \equiv \{(j, i) \mid \text{for all } i \in V, \text{ and } j \in N_i\}$ is the set of edges/arcs of the game graph with vertex set V. Hence, in the case of GGs, it is convenient to eliminate the subscripts corresponding to the local hyperedges, so that we can simply denote $M'_i(a_{N_i}) \equiv M'_{i,N_i}(a_{N_i})$. A polymatrix GG is a GMhG with the property that for each player $i \in V$, each $C \in \mathcal{C}_i$ has the form {i, j} for some $j \in V$, $j \neq i$ (i.e., |C| = 2, containing i, by definition, and another player j). Technically, the definitions above are for directed versions of GGs. The original definition uses an undirected graph, so that we require the following additional condition: for all $i, j \in V$, we have $j \in N_i$ iff $i \in N_j$. We assume the undirected version of polymatrix GGs, which is the most standard definition, in the DP algorithm given in Section 6.1.

We denote by $\kappa_i \equiv |\mathcal{C}_i|$ and $\kappa \equiv \max_i \kappa_i$ the number of hyperedges of player i and the maximum number of hyperedges over all players, respectively. Similarly, we denote by $\kappa'_i \equiv \max_{C \in \mathcal{C}_i} |C|$ and $\kappa' \equiv \max_i \kappa'_i$ the size of the biggest hyperedge of player i and the size of the biggest hyperedge over all players, respectively. Also, for consistency with previous notation for standard GGs, we denote by $k_i \equiv |N_i|$ and $k \equiv \max_i k_i$ the size of the neighborhood of the primal graph induced by the local hyperedges of each player i and the maximum neighborhood size over all players, respectively. Note that $\kappa'_i \le k_i$, so that $\kappa' \le k$. We refer the reader to Ortiz (2014) for motivation and further discussion on the significance of GMhGs, including their connection to and generalization of other game representations.

Using the notation introduced in the last paragraph, the representation size of GMhGs is $O(n \kappa m^{\kappa'})$. For polymatrix GGs, we have $\kappa_i = k_i \le k$ and $\kappa'_i = 2$ for each player i, so that the expression of the representation size simplifies to $O(n k m^2)$. For normal-form GGs, we have $\kappa_i = 1$ and $\kappa'_i = k_i \le k$ for each player i, so that the expression of the representation size simplifies to $O(n m^k)$.

3.3 Solution Concepts

A joint mixed strategy $p \equiv (p_1, \ldots, p_n)$ in a game is formed from each individual mixed strategy $p_i \equiv (p_i(a_i) : a_i \in A_i)$ for player i, which is a probability distribution over the player's actions $A_i$ (i.e., $p_i(a_i) \ge 0$ for all $a_i \in A_i$ and $\sum_{a_i \in A_i} p_i(a_i) = 1$). Denote by $\mathcal{P}_i \equiv \{p_i \mid p_i(a_i) \ge 0 \text{ for all } a_i \in A_i \text{ and } \sum_{a_i \in A_i} p_i(a_i) = 1\}$ the set of all possible mixed strategies of player i (i.e., all possible probability distributions over $A_i$). Similar to the vector notation introduced above, for all i and any clique/set $S \subset V$, denote by $p_{-i}$ and $p_S$ the mixed strategies corresponding to all the players except i and all the players in clique S, respectively, so that $p \equiv (p_i, p_{-i}) \equiv (p_S, p_{V-S})$. A joint mixed strategy p induces a joint (product) probability distribution over the joint action space A, such that, for all $a \in A$, $p(a) \equiv \prod_{i \in V} p_i(a_i)$ is the probability, with respect to joint mixed strategy p, that joint action a is played. The expected payoff of player i with respect to joint mixed strategy p is denoted by $M_i(p) \equiv \sum_{a \in A} p(a) M_i(a)$.
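The expected payoff under the product distribution can be evaluated directly from its definition. The sketch below is a brute-force illustration that enumerates the whole joint action space, so it is exponential in the number of players and only meant for tiny examples; the function name `expected_payoff`, the list-of-lists encoding of mixed strategies, and the two-player example are assumptions made here.

```python
from itertools import product
from math import prod


def expected_payoff(payoff, mixed):
    """M_i(p) = sum over joint actions a of p(a) * M_i(a),
    where p(a) is the product of the individual probabilities p_j(a_j)."""
    action_sets = [range(len(p_j)) for p_j in mixed]
    total = 0.0
    for a in product(*action_sets):
        prob = prod(p_j[a_j] for p_j, a_j in zip(mixed, a))
        total += prob * payoff(a)
    return total


def row_payoff(a):
    """Hypothetical payoff: the row player earns 1 when both actions match."""
    return 1.0 if a[0] == a[1] else 0.0


p = [[0.5, 0.5], [0.25, 0.75]]  # one mixed strategy per player
value = expected_payoff(row_payoff, p)
print(value)
```

In a GMhG the same quantity decomposes over the local cliques, so in practice one would sum clique-wise expectations instead of enumerating all of A.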


Definition 2. For any ε ≥ 0, a joint mixed strategy $p^*$ is called an ε mixed-strategy Nash equilibrium (ε-MSNE) if for every player i, and for all $a_i \in A_i$, $M_i(p^*_i, p^*_{-i}) \ge M_i(a_i, p^*_{-i}) - \epsilon$. That is, no player can increase its expected payoff by more than ε by unilaterally deviating from its mixed strategy part $p^*_i$ in the equilibrium, assuming the others play according to their respective parts $p^*_{-i}$. A mixed-strategy Nash equilibrium (MSNE) is then a 0-MSNE.

Note that, for all $p_{-i} \in \mathcal{P}_{-i}$, $\max_{p_i \in \mathcal{P}_i} M_i(p_i, p_{-i}) = \max_{a'_i \in A_i} M_i(a'_i, p_{-i}) \ge M_i(a_i, p_{-i})$, for all $a_i \in A_i$.

3.4 Normalizing the Payoff Scale

Note that the equilibrium conditions are invariant to positive affine transformations of each player's payoffs. In the case of GGs with local payoff matrices represented in tabular/matrix/normal-form, it is conventional to assume, without loss of generality, that the payoff values are such that, for each player $i \in V$, we have $\min_a M_i(a) = \min_{a_{N_i}} M'_i(a_{N_i}) = 0$ and $\max_a M_i(a) = \max_{a_{N_i}} M'_i(a_{N_i}) = 1$. Note that in the case of GGs using such "tabular" representations, we do not lose generality by assuming the minimum and maximum local payoff values of each player are 0 and 1, respectively, because we can compute them both efficiently. While this will not be the case for GG generalizations, in the worst case, it is also computationally efficient for GMhGs whose local hypergraphs have bounded hypertree-widths. For instance, normalizing the payoffs of a polymatrix GG is in P. Normalizing the payoff of a GG in standard local strategic/normal-form takes linear time in the representation size of the game, because we can find the minimum and maximum local payoff values for each local payoff hypermatrix (which, for each player, is exponential in the size of the player's neighborhood) simply by going over each payoff value in the hypermatrix in sequence. However, such an approach is intractable in GMhGs in general.
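For a GG in tabular form, the normalization just described amounts to one pass over each local payoff table. A minimal Python sketch follows, assuming a table is stored as a dictionary from joint actions to payoffs; the function name `normalize` and the treatment of a constant table (mapped to all zeros) are choices made for this illustration.

```python
def normalize(payoff_table):
    """Affinely rescale a tabular local payoff so its minimum is 0 and maximum is 1."""
    lo = min(payoff_table.values())
    hi = max(payoff_table.values())
    if hi == lo:
        # Assumption: a constant payoff carries no strategic information,
        # so any constant would do; 0 is used here.
        return {a: 0.0 for a in payoff_table}
    return {a: (v - lo) / (hi - lo) for a, v in payoff_table.items()}


raw = {(0, 0): -2.0, (0, 1): 0.0, (1, 0): 2.0, (1, 1): 6.0}
print(normalize(raw))
```

The scan touches each entry a constant number of times, which matches the linear-time claim for the tabular case; it says nothing about GMhGs in general, where the sum over cliques is never materialized as one table.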
Denote the maximum and minimum payoff values for each player $i \in V$ by $u_i \equiv \max_{a_{N_i}} \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_C)$ and $l_i \equiv \min_{a_{N_i}} \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_C)$, respectively. Computing both $u_i$ and $l_i$ is NP-hard in general. To see this, first note that $u_i$ and $l_i$ are the result of a max and a min operation over an additive function of the set of the player's hyperedges and its possible joint actions. It is easy to reduce the problem of finding a solution to an arbitrary constraint network to that of computing both $u_i$ and $l_i$ for each player i. Notable exceptions are cases in which the hypertree-width of the local hypergraph of each player is bounded by either a constant or a logarithm of the total number of players n. A polymatrix GG is an example. In that case, we have $u_i = \max_{a_i} \sum_{j \in \mathcal{N}_i} \max_{a_j} M'_{i,j}(a_i, a_j)$ and $l_i = \min_{a_i} \sum_{j \in \mathcal{N}_i} \min_{a_j} M'_{i,j}(a_i, a_j)$. It is evident from the last expression that we can efficiently compute each of those values for each i via DP in time $|A_i|(|N_i| + |A_i|) = O(mk + m^2)$, and compute all the values for all i in time $\sum_i |A_i|(|N_i| + |A_i|) = O(m|E| + n m^2)$.

Despite those exceptions, in general, we do not have much of a choice but to assume that the payoffs of all players are in the same scale, so that using a global approximation-quality value ε is meaningful; and to compute the individual maximum and minimum values of each hypermatrix payoff of each player as a way to set up the sparse uniform-discretizations in the next section.

Some additional notation is necessary before stating the theorem. Denote by $u_{i,C} \equiv \max_{x_C \in A_C} M'_{i,C}(x_C)$ and $l_{i,C} \equiv \min_{x_C \in A_C} M'_{i,C}(x_C)$ the largest and smallest payoff values achieved by the local-clique payoff hypermatrix $M'_{i,C}$, respectively; and by $R_{i,C} \equiv u_{i,C} - l_{i,C}$ its largest range of values.
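For a polymatrix GG, the bounds on a player's total payoff and the per-clique ranges reduce to simple maxima and minima over pairwise tables. The following sketch illustrates this under stated assumptions: `pair_tables[j][(a_i, a_j)]` is a hypothetical encoding of the pairwise payoff of player i against neighbor j, and the function names are invented for this example.

```python
def polymatrix_bounds(pair_tables, actions_i):
    """Return (u_i, l_i) for one player of a polymatrix GG.

    u_i = max over a_i of the sum over neighbors j of max over a_j of M'_{i,j}(a_i, a_j),
    and l_i is the same expression with min in place of max.
    """
    totals_max, totals_min = [], []
    for a_i in actions_i:
        hi = lo = 0.0
        for payoff_table in pair_tables.values():
            row = [v for (x, _), v in payoff_table.items() if x == a_i]
            hi += max(row)
            lo += min(row)
        totals_max.append(hi)
        totals_min.append(lo)
    return max(totals_max), min(totals_min)


def clique_ranges(pair_tables):
    """R_{i,C} = u_{i,C} - l_{i,C} for each pairwise clique C = {i, j}."""
    return {j: max(t.values()) - min(t.values()) for j, t in pair_tables.items()}


# Hypothetical player with two neighbors (labelled 1 and 2) and binary actions.
tables = {
    1: {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0},
    2: {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.25, (1, 1): 0.25},
}
print(polymatrix_bounds(tables, actions_i=(0, 1)))
print(clique_ranges(tables))
```

The inner maximization separates across neighbors because each pairwise table depends on a different neighbor's action, which is what makes this case tractable while the general GMhG case is not.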

4. Discretization Schemes

The discretization scheme considered here for the space of mixed strategies of each player is similar to that of Kearns et al. (2001), except that we allow for the possibility of different discretization sizes for the mixed strategies of players.

Definition 3. In an (individually-uniform) mixed-strategy discretization scheme, the uncountable set I = [0, 1] of possible value assignments to the probability $p_i(a_i)$ of each action $a_i$ of each player i is approximated by a finite grid defined by the set $\tilde{I}_i = \{0, \tau_i, 2\tau_i, \ldots, (s_i - 1)\tau_i, 1\}$ of values separated by the same distance $\tau_i = 1/s_i$ for some integer $s_i$. Thus the mixed-strategy-discretization size is $|\tilde{I}_i| = s_i + 1$. Then, we would only consider mixed strategies $q_i$ such that $q_i(a_i) \in \tilde{I}_i$ for all $a_i$, and $\sum_{a_i} q_i(a_i) = 1$. The induced mixed-strategy discretized space of joint mixed strategies is $\tilde{I} \equiv \times_{i \in V} \tilde{I}_i^{|A_i|}$, subject to the individual normalization constraints.
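To make Definition 3 concrete, the sketch below enumerates every discretized mixed strategy of a single player: all probability vectors whose entries are multiples of 1/s and sum to one. The function name `discretized_strategies` and the use of exact `Fraction` arithmetic are assumptions of this illustration, chosen so that the normalization constraint holds exactly rather than up to rounding.

```python
from fractions import Fraction
from math import comb


def discretized_strategies(num_actions, s):
    """Yield every q with q(a) in {0, 1/s, ..., 1} for all a and sum_a q(a) = 1."""

    def compositions(remaining, slots):
        # All ways to split `remaining` grid units among `slots` actions.
        if slots == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in compositions(remaining - first, slots - 1):
                yield (first,) + rest

    for counts in compositions(s, num_actions):
        yield tuple(Fraction(c, s) for c in counts)


grid = list(discretized_strategies(num_actions=3, s=4))
print(len(grid), comb(4 + 3 - 1, 3 - 1))  # both counts should agree
```

The number of grid strategies is the number of ways to distribute s units among the player's actions, so it grows polynomially in s for a fixed number of actions; this is the reason the bound on m matters in the running times stated later.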

Here we also introduce a one-dimensional discretization of the expected payoffs of each player. This will turn out to be crucial to our results.

Definition 4. In an (individually-uniform) expected-payoff discretization scheme, the uncountable set I = [0, 1] of possible expected payoff values that each player i can receive for each local-clique payoff matrix $M'_{i,C}(p_C)$, induced by each possible local-clique joint mixed strategy $p_C \in \tilde{I}_C$ (i.e., in the grid induced by the individually-uniform mixed-strategy discretization scheme), is approximated by a finite grid defined by the set $\tilde{I}'_i = \{0, \tau'_i, 2\tau'_i, \ldots, (s'_i - 1)\tau'_i, 1\}$ of values separated by the same distance $\tau'_i = 1/s'_i$ for some integer $s'_i$. Thus the expected-payoff-discretization size is $|\tilde{I}'_i| = s'_i + 1$. Then, for any $B \subset C \in \mathcal{C}_i$, we would only consider expected payoffs

\[ \tilde{M}'_{i,C}(a_B, q_{C-B}) = \arg\min_{r \in \tilde{I}'_i} \left| r - M'_{i,C}(a_B, q_{C-B}) \right| \equiv \mathrm{Proj}\left( M'_{i,C}(a_B, q_{C-B}) \right), \]

that are closest to the exact local-clique expected payoff $M'_{i,C}(a_B, q_{C-B})$; [8] said differently, for any subset B of each local-clique $C \in \mathcal{C}_i$ and $q_{C-B}(a_{C-B}) \in \tilde{I}_{C-B}$ for all $a_{C-B} \in A_{C-B}$, we have $|\tilde{M}'_{i,C}(a_B, q_{C-B}) - M'_{i,C}(a_B, q_{C-B})| \le \tau'_i/2$, and, for all $j \in C - i$, $\sum_{a_j \in A_j} q_j(a_j) = 1$. The induced expected-payoff discretized space over all local-cliques of all players is $\tilde{I}' \equiv \times_{i \in V} (\tilde{I}'_i)^{|\mathcal{C}_i|}$.
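The projection onto the expected-payoff grid can be sketched in a few lines. The function below rounds a value in [0, 1] to the nearest multiple of 1/s′ and resolves an exact tie in favor of the larger grid point. The name `proj`, the parameter `s_prime`, and the use of exact `Fraction` inputs are assumptions made for this illustration.

```python
from fractions import Fraction
from math import floor


def proj(value, s_prime):
    """Nearest point of {0, 1/s', 2/s', ..., 1} to `value`, ties going up.

    floor(x + 1/2) implements round-half-up on the rescaled value x = value * s'.
    """
    k = floor(value * s_prime + Fraction(1, 2))
    k = max(0, min(s_prime, k))  # clamp in case value lies slightly outside [0, 1]
    return Fraction(k, s_prime)


print(proj(Fraction(3, 10), 4))  # closest grid point to 0.3 on a grid of step 1/4
print(proj(Fraction(3, 8), 4))   # exact tie between 1/4 and 1/2
```

By construction the rounding error is at most half a grid step, which is the $\tau'_i/2$ bound that the definition relies on.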

Chan and Ortiz (2015) use a similar idea in the setting of interdependent defense (IDD) games, where each of n sites has a binary pure-strategy set, and a specific instance of the general setting in which the attacker has n + 1 pure strategies. We refer the reader to Chan and Ortiz (2015) for formal definitions and further details. The reason why the attacker has n + 1 pure strategies is that, in the particular instance of IDD games that Chan and Ortiz (2015) consider, the attacker can attack at most one site at a time. Said differently, in that specific class of games the number of pure strategies for the attacker is linear in the number of players n in the game, but by the parametric nature of the attacker's utility/payoff, which corresponds to the "social-welfare" cost of the sites, the representation size of the attacker's payoff function is linear in the number of edges of the network of sites, which is at most quadratic in the number of site players. In contrast, the potential multiplicity of actions of all players poses one of the main challenges in our case, particularly because of the non-tabular/non-normal-form representation of the general GMhGs, which is exponential in the size of the largest hyperedge over all players' neighborhood hypergraphs.

[8] Note that we are using = instead of ∈ in the definition of Proj(.) above; said differently, we are treating Proj(.) as a function instead of a correspondence. This is because most of the time the arg min returns a singleton set, whose element is the unique min, except when there is a tie between the two nearby points in the expected-payoff discretization. In case of a tie, the arg min would return a set with the two nearby points. In such cases, we assume we break the tie in favor of the larger of the two values.

5. A GMhG-induced CSP

Consider the following CSP induced by the GMhG representation and the MSNE conditions:

• Variables: for all i, $a_i$, a variable $p_{i,a_i}$ corresponding to the mixed-strategy/probability that player i plays pure strategy $a_i$ and, for all $C \in \mathcal{C}_i$, a variable $S_{i,C,a_i}$ corresponding to some partial sum of the expected payoff of player i based on an ordering of the local-clique/hyperedge elements of $\mathcal{C}_i$; that is, formally, if $P_i \equiv \bigcup_{a_i} \{p_{i,a_i}\}$ and $S_{i,C} \equiv \bigcup_{a_i} \{S_{i,C,a_i}\}$, then the set of all variables is $\bigcup_i \left( P_i \cup \bigcup_{C \in \mathcal{C}_i} S_{i,C} \right)$.

• Domains: the domain of each variable $p_{i,a_i}$ is $\tilde{I}_i$, while that of each partial-sum variable $S_{i,C,a_i}$ is $\tilde{I}'_i$.

• Constraints: for each i,

1. Best-response and partial-sum expected local-clique payoff: compute a hypertree decomposition of the local hypergraph induced by hyperedges $\mathcal{C}_i$; then order the set of local-cliques $\mathcal{C}_i$ of each player i such that $\mathcal{C}_i \equiv \{C_i^1, C_i^2, \ldots, C_i^{\kappa_i}\}$, where the superscript denotes the corresponding order of the local-cliques of player i, and the order is consistent with the hypertree decomposition of the local hypergraph, in the standard (non-serial) DP-sense used in constraint and probabilistic graphical models (Dechter, 2003; Koller & Friedman, 2009); and for each $a_i$,

(a) \[ \sum_{a'_i} p_{i,a'_i} \, S_{i,C_i^{\kappa_i},a'_i} \ge S_{i,C_i^{\kappa_i},a_i} - \epsilon/2 \]

(b) $S_{i,C_i^1,a_i} = \tilde{M}'_{i,C_i^1}(a_i, p_{C_i^1 - i})$, and for $l = 2, \ldots, \kappa_i$,
\[ S_{i,C_i^l,a_i} = \tilde{M}'_{i,C_i^l}(a_i, p_{C_i^l - i}) + S_{i,C_i^{l-1},a_i} \]

2. Normalization: \[ \sum_{a_i} p_{i,a_i} = 1 \]
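The constraints above can be checked mechanically for a candidate assignment. The sketch below does so for one player of a polymatrix GG, where each local clique is a pair and the clique order is simply the order in which the neighbors are listed. It is an illustration under stated assumptions rather than the construction used in the proofs: the helper names (`proj`, `partial_sums`, `best_response_ok`), the table encoding, and the example numbers are all invented here, and payoffs are assumed to be already scaled to [0, 1] per clique.

```python
from fractions import Fraction
from math import floor


def proj(value, s_prime):
    """Round value in [0, 1] to the grid {0, 1/s', ..., 1}; ties go up."""
    k = floor(value * s_prime + Fraction(1, 2))
    return Fraction(max(0, min(s_prime, k)), s_prime)


def partial_sums(tables, neighbor_strategies, num_actions_i, s_prime):
    """Constraint 1(b): running sums S_{i,C^l,a_i} over the ordered cliques."""
    running = [Fraction(0)] * num_actions_i
    sums = []
    for pair_table, q in zip(tables, neighbor_strategies):
        for a in range(num_actions_i):
            exact = sum(q[b] * pair_table[(a, b)] for b in range(len(q)))
            running[a] += proj(exact, s_prime)  # discretized clique payoff
        sums.append(list(running))
    return sums


def best_response_ok(p_i, last_sums, eps):
    """Constraint 1(a), checked for every pure action a_i."""
    value = sum(p * s for p, s in zip(p_i, last_sums))
    return all(value >= s - eps / 2 for s in last_sums)


half = Fraction(1, 2)
tables = [  # hypothetical pairwise payoffs of player i against two neighbors
    {(0, 0): Fraction(1), (0, 1): Fraction(0), (1, 0): Fraction(0), (1, 1): Fraction(1)},
    {(0, 0): Fraction(1, 4), (0, 1): Fraction(0), (1, 0): Fraction(1, 4), (1, 1): Fraction(1)},
]
neighbor_strategies = [(half, half), (Fraction(1), Fraction(0))]
sums = partial_sums(tables, neighbor_strategies, num_actions_i=2, s_prime=4)
print(sums[-1], best_response_ok((half, half), sums[-1], Fraction(1, 10)))
```

Only the last partial sum enters the best-response check; the intermediate sums exist so that each constraint involves a single clique plus one carried value, which keeps the scope of every constraint small.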

The number of variables of the CSP is $O(n m \kappa)$. The size of each domain $\tilde{I}_i$ is $O(s)$, where $s \equiv \max_i s_i$. The size of each domain $\tilde{I}'_i$ is $O(s')$, where $s' \equiv \max_i s'_i$. The computation of each $\tilde{M}'_{i,C_i^l}(a_i, p_{C_i^l - i})$ in 1(b) above, which takes time $O(s^{\kappa' - 1})$, dominates the running time to build the constraint set. The total number of constraints is $O(n m \kappa)$. The maximum number of variables in any constraint is $O(m \kappa')$. Given a hypertree decomposition, the amount of time to build the constraint set using a tabular representation is $O(n m \kappa \, s^{m \kappa'} (s')^m)$. In summary, the representation size of the GMhG-induced CSP presented above, using a tabular representation, is $O(n m \, s^{m \kappa'} (s')^m)$.

5.1 The GMhG-induced CSP is Correct

We first state the Sparse Nash-equilibria Representation Theorem of Ortiz (2014) as a lemma here for convenience.

Lemma 1. (Ortiz, 2014) (Sparse MSNE Representation Theorem) For any GMhG and any ε such that

\[ 0 < \epsilon \le 2 \min_{i \in V} \frac{\sum_{C \in \mathcal{C}_i} R_{i,C} (|C| - 1)}{\max_{C' \in \mathcal{C}_i} |C'| - 1}, \]

a (uniform) discretization with

\[ s_i = \left\lceil \frac{2 |A_i| \max_{j \in \mathcal{N}_i} \sum_{C \in \mathcal{C}_j} R_{j,C} (|C| - 1)}{\epsilon} \right\rceil = O\left( \frac{m \kappa \kappa'}{\epsilon} \right) \]

for each player i is sufficient to guarantee that for every MSNE of the game, its closest (in $\ell_\infty$ distance) joint mixed strategy in the induced discretized space is also an ε-MSNE of the game.

We next present our main sparse-representation theorem, now including the discretization over the partial sums of expected local-clique payoffs.

Theorem 1. (Sparse Joint MSNE and Expected-Payoff Representation Theorem) Consider any GMhG and any ε such that

\[ 0 < \epsilon \le 2 \min_{i \in V} \frac{\sum_{C \in \mathcal{C}_i} R_{i,C} (|C| - 1)}{\max_{C' \in \mathcal{C}_i} |C'| - 1}. \]

Setting, for all players i, the pair $(\tau_i, \tau'_i)$ defining the joint (individually-uniform) mixed-strategy and expected-payoff discretization of player i such that

\[ \tau_i = \frac{\epsilon}{8 |A_i| \max_{j \in \mathcal{N}_i} \sum_{C \in \mathcal{C}_j} R_{j,C} (|C| - 1)} \quad \text{and} \quad \tau'_i = \frac{\epsilon}{4 \kappa_i}, \]

so that the discretization sizes

\[ s_i = \left\lceil \frac{8 |A_i| \max_{j \in \mathcal{N}_i} \sum_{C \in \mathcal{C}_j} R_{j,C} (|C| - 1)}{\epsilon} \right\rceil = O\left( \frac{m \kappa \kappa'}{\epsilon} \right) \]

and

\[ s'_i = \left\lceil \frac{4 \kappa_i}{\epsilon} \right\rceil = O\left( \frac{\kappa}{\epsilon} \right) \]

for each mixed-strategy probability and expected payoff value, respectively, is sufficient to guarantee that for every MSNE of the game, its closest (in $\ell_\infty$ distance) joint mixed strategy in the induced discretized space is a solution of the GMhG-induced CSP, and that any solution to the GMhG-induced CSP (in discretized probability and payoff space) is an ε-MSNE of the game.

Proof. Let $p'$ be an MSNE of the GMhG. Let p be the mixed strategy closest, in $\ell_\infty$, to $p'$ in the grid induced by the combination of the discretizations that each $\tau_i$ generates. For all i and $a_i$, set $p^*_{i,a_i} = p_i(a_i)$; and for all i and $a_i$, first set $S^*_{i,C_i^1,a_i} = \tilde{M}'_{i,C_i^1}(a_i, p^*_{C_i^1 - i})$, and then recursively for $l = 2, \ldots, \kappa_i$, set $S^*_{i,C_i^l,a_i} = \tilde{M}'_{i,C_i^l}(a_i, p^*_{C_i^l - i}) + S^*_{i,C_i^{l-1},a_i}$. The resulting assignment satisfies the normalization constraint of the CSP, by the definition of a mixed strategy. The assignment also satisfies the partial-sum expected local-clique payoffs by construction. By the setting of $\tau_i$ and Lemma 1, we have that p is an (ε/4)-MSNE, and thus also an ε-MSNE. In addition, for all i and $a_i$, we have the following sequence of inequalities:

\[ \sum_{a'_i} p_i(a'_i) \sum_{C \in \mathcal{C}_i} M'_{i,C}(a'_i, p_{C-i}) \ge \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_i, p_{C-i}) - \epsilon/4 \]

\[ \sum_{a'_i} p^*_{i,a'_i} \sum_{l=1}^{\kappa_i} M'_{i,C_i^l}(a'_i, p^*_{C_i^l - i}) \ge \sum_{l=1}^{\kappa_i} M'_{i,C_i^l}(a_i, p^*_{C_i^l - i}) - \epsilon/4 \]

By the definition of $\tilde{M}'_{i,C}$, for all i and $C \in \mathcal{C}_i$, we have that for all $a'_i$ and $l = 1, \ldots, \kappa_i$,

\[ \tilde{M}'_{i,C_i^l}(a'_i, p^*_{C_i^l - i}) - \tau'_i/2 \le M'_{i,C_i^l}(a'_i, p^*_{C_i^l - i}) \le \tilde{M}'_{i,C_i^l}(a'_i, p^*_{C_i^l - i}) + \tau'_i/2 . \]

Applying the last inequality to the last inequality of the previous sequence of inequalities, and by unravelling the construction of the CSP assignment, we have

\[ \sum_{a'_i} p^*_{i,a'_i} \sum_{l=1}^{\kappa_i} \left( \tilde{M}'_{i,C_i^l}(a'_i, p^*_{C_i^l - i}) + \tau'_i/2 \right) \ge \sum_{l=1}^{\kappa_i} \left( \tilde{M}'_{i,C_i^l}(a_i, p^*_{C_i^l - i}) - \tau'_i/2 \right) - \epsilon/4 \]

and

\[ \sum_{a'_i} p^*_{i,a'_i} \, S^*_{i,C_i^{\kappa_i},a'_i} \ge S^*_{i,C_i^{\kappa_i},a_i} - \kappa_i \tau'_i - \epsilon/4 . \]

By the setting of $\tau'_i$, we have that $\kappa_i \tau'_i = \epsilon/4$ and that

\[ \sum_{a'_i} p^*_{i,a'_i} \, S^*_{i,C_i^{\kappa_i},a'_i} \ge S^*_{i,C_i^{\kappa_i},a_i} - \epsilon/2 . \]

Hence, the assignment also satisfies the best-response constraints. Putting everything together we have that the assignment $(p^*, S^*)$ is a solution of the GMhG-induced CSP.


Now, for the second part of the theorem, suppose (p∗ , S ∗ ) is a solution of the GMhGinduced CSP. Then, by the combination of the best-response and partial-sum expected local-clique payoff constraints, we have that, for all i and ai , X ∗ ∗ κ κi ′ ≥ S − ǫ/2 , p∗i,a′ Si,C i,C i ,a ,a i

a′i

i

i

i

i

∗ ∗ f Si,C ), 1 ,a = Mi,C 1 (ai , p |Ci | i

and

Ci

i

i

−i

∗ ∗ f . Si,C l ,a = Mi,C l (ai , pC l −i ) + S i,C l−1 −i,a i i

i

i

i

i

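This in turn implies that for all $i$ and $a_i$, we can obtain the following sequence of inequalities:

$$\begin{aligned}
\sum_{a'_i} p^*_{i,a'_i} \sum_{l=1}^{\kappa_i} \widetilde{M}_{i,C_i^l}(a'_i, p^*_{C_i^l - i}) &\ge \sum_{l=1}^{\kappa_i} \widetilde{M}_{i,C_i^l}(a_i, p^*_{C_i^l - i}) - \epsilon/2 \\
\sum_{a'_i} p^*_{i,a'_i} \sum_{C \in \mathcal{C}_i} \widetilde{M}_{i,C}(a'_i, p^*_{C-i}) &\ge \sum_{C \in \mathcal{C}_i} \widetilde{M}_{i,C}(a_i, p^*_{C-i}) - \epsilon/2 \\
\sum_{a'_i} p^*_{i,a'_i} \sum_{C \in \mathcal{C}_i} \left( M'_{i,C}(a'_i, p^*_{C-i}) + \tau'_i/2 \right) &\ge \sum_{C \in \mathcal{C}_i} \left( M'_{i,C}(a_i, p^*_{C-i}) - \tau'_i/2 \right) - \epsilon/2 \\
\sum_{a'_i} p^*_{i,a'_i} \sum_{C \in \mathcal{C}_i} M'_{i,C}(a'_i, p^*_{C-i}) &\ge \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_i, p^*_{C-i}) - \kappa_i \tau'_i - \epsilon/2 \\
\sum_{a'_i} p^*_{i,a'_i} \sum_{C \in \mathcal{C}_i} M'_{i,C}(a'_i, p^*_{C-i}) &\ge \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_i, p^*_{C-i}) - \epsilon/2 - \epsilon/2 \\
\sum_{a'_i} p^*_{i,a'_i} \sum_{C \in \mathcal{C}_i} M'_{i,C}(a'_i, p^*_{C-i}) &\ge \sum_{C \in \mathcal{C}_i} M'_{i,C}(a_i, p^*_{C-i}) - \epsilon .
\end{aligned}$$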

Hence, the corresponding joint mixed-strategy that such a solution to the CSP defines via p∗ is an ǫ-MSNE of the GMhG.
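The final inequality of the proof is the ε-MSNE condition itself: no pure action may beat the player's mixed strategy by more than ε. As a minimal sketch of that condition for the polymatrix special case (each hyperedge is a pair), the following checker is illustrative only; the function name `is_eps_msne` and the dictionary layout of `payoff`, `neighbors`, and `p` are assumptions made for this example and do not come from the paper.

```python
def is_eps_msne(p, payoff, neighbors, eps):
    """Check the eps-MSNE condition in a polymatrix game (illustrative sketch).

    p[i]                    : list of probabilities over player i's actions
    payoff[(i, j)][a_i][a_j]: player i's local payoff against neighbor j
    neighbors[i]            : list of players adjacent to i
    """
    for i, p_i in p.items():
        # Expected payoff of each pure action of i against the neighbors' mixtures.
        u = [sum(p[j][b] * payoff[(i, j)][a][b]
                 for j in neighbors[i] for b in range(len(p[j])))
             for a in range(len(p_i))]
        mixed = sum(p_i[a] * u[a] for a in range(len(p_i)))
        # The mixed strategy must be within eps of every pure deviation.
        if mixed < max(u) - eps:
            return False
    return True


# Hypothetical two-player matching-pennies instance.
pennies = {(0, 1): [[1, -1], [-1, 1]], (1, 0): [[-1, 1], [1, -1]]}
adjacent = {0: [1], 1: [0]}
uniform_ok = is_eps_msne({0: [0.5, 0.5], 1: [0.5, 0.5]}, pennies, adjacent, 0.01)
pure_ok = is_eps_msne({0: [1.0, 0.0], 1: [1.0, 0.0]}, pennies, adjacent, 0.01)
```

In this toy instance the uniform profile passes the check, while the pure profile fails because the second player gains 2 by deviating.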

6. Some Computational Results

The following results extend those summarizing the algorithmic implications that Ortiz (2014) states, based on the special characteristics of the GMhG-induced CSP presented in the previous section. Because of the particular significance of polymatrix GGs, we also state specific corollaries of our main technical results for that class of games.

Theorem 2. There exists an algorithm that, given as input a number $\epsilon > 0$ and an $n$-player GMhG with maximum local-hyperedge-set size $\kappa$ and maximum number of actions $m$, and whose corresponding CSP has a hypergraph with hypertree-width $w$, computes an $\epsilon$-MSNE of the GMhG in time $[n\,(m \kappa \kappa'/\epsilon)^{m \kappa'}]^{O(w)}$.

The following corollary characterizes the computational complexity of the approximation schemes resulting from instances of the last theorem.


Corollary 1. There exists an algorithm that, given as input a GMhG with bounded $w$, outputs an $\epsilon$-MSNE in time polynomial in the size of the input and $1/\epsilon$, for any $\epsilon > 0$; hence, the algorithm is an FPTAS. If, instead, we have $w = O(\mathrm{polylog}(n))$, then the algorithm is a quasi-PTAS.

Corollary 2. There exists an algorithm that, given as input a number $\epsilon > 0$ and an $n$-player polymatrix GG with a tree graph, maximum neighborhood size $k$, and maximum number of actions $m$, computes an $\epsilon$-MSNE of the polymatrix GG in time $[n\,(mk/\epsilon)]^{O(m)}$. If $m$ is bounded by a constant, then the algorithm is an FPTAS. If, instead, $m = O(\mathrm{polylog}(n))$, then the algorithm is a quasi-PTAS.

Theorem 3. There exists an algorithm that, given as input a number $\epsilon > 0$ and an $n$-player GMhG with maximum number of actions $m$, primal-graph treewidth $w'$ of the corresponding CSP, maximum local-hyperedge-set size $\kappa$, and maximum local-hyperedge size $\kappa'$, computes an $\epsilon$-MSNE of the game in time $2^{O(w')}\, n \log(n) + n\,[(m \kappa \kappa'/\epsilon)^{m}]^{O(w')}$.

Corollary 3. There exists an FPTAS for computing an approximate MSNE in $n$-player GMhGs with corresponding $m$, $\kappa$, and $\kappa'$ all bounded by constants, independent of $n$, and primal-graph treewidth $w' = O(\log(n))$.

Corollary 4. There exists an algorithm that, given as input an $n$-player polymatrix GG with a tree graph, maximum neighborhood size $k$, and maximum number of actions $m$, computes an $\epsilon$-MSNE of the polymatrix GG in time $2^{O(m)}\, n \log(n) + n\,(mk/\epsilon)^{O(m)}$. If $m$ is bounded by a constant, then the algorithm is an FPTAS. If, instead, $m = O(\mathrm{polylog}(n))$, then the algorithm is a quasi-PTAS.

6.1 Sparse-discretization-based DP for Polymatrix GGs with Tree Graphs

Recall that in our approach, we discretize both the set of mixed strategies of each player and the set of expected-payoff values for each action of each player and each possible mixed strategy of its neighboring players, in the sparse-discretization-induced grid over each individual neighbor's mixed strategies.
In this section we present a specific DP algorithm in the context of the special, but still important, class of graphical polymatrix games with tree graphs. This is for simplicity and, as we further discuss in the remarks of the next section, is without loss of generality: the idea behind our DP algorithm extends easily to the broader class of graphical multi-hypermatrix games by using the concept of hypertree decompositions. As a brief preview of that discussion, the running time will then depend exponentially on the hypertree width of the resulting hypertree decomposition (based on the union of the local hypergraphs defining the local payoff matrices of each player), as is typical of results for exact computation based on DP-style algorithms in general graphical models (i.e., including constraint and probabilistic models). In the description below, we assume that we have designated a root node for the tree defining the game graph, and defined the notions of parent and child nodes accordingly, with respect to the designated root. Said differently, once we designate a root, it induces an implicit directed tree with the root as the sole source and the leaves as the sinks. Using the induced directed tree graph, for any node/player i, we denote


by pa(i) the single parent of i when i is not the root (for the designated root, pa(i) is undefined), and by Ch(i) the children of node i in the root-induced directed tree. If i is a leaf of the game tree, relative to the designated root, then $Ch(i) = \emptyset$.

The nature of the algorithm is the same as that of TreeNash (Kearns et al., 2001) and NashProp (Ortiz & Kearns, 2003) used for standard GGs, except that (1) the encoding of the “table messages” passed from the “child node/player” to the “parent node/player” is in terms of the set $\{-\infty, 0\}$, instead of bits $\{0, 1\}$, where in each case the first and second elements of the set encode infeasibility and feasibility, respectively; and (2) more distinctively, the DP algorithm here implicitly passes messages corresponding to the partial sum of expected local-pair payoff values, restricted to a finite grid of possible values, across the children of a node/player in a fixed, but arbitrary, order.

Collection Pass. Recursively, for each node $i$ in the induced directed tree, relative to the root, let $j = pa(i)$. Order $Ch(i)$ and denote the resulting node order by $o_1, \ldots, o_{|Ch(i)|}$. Apply the following DP from leaves to root: for each arc $(j, i)$ in the designated-root-induced directed tree, and $(p_i, p_j)$ a mixed-strategy pair in the induced grid,

$$T_{i \to j}(p_i, p_j) = \max_{S_{o_{|Ch(i)|}}} \; B_i(p_i, p_j, S_{o_{|Ch(i)|}}) + R_{o_{|Ch(i)|}}(p_i, S_{o_{|Ch(i)|}})$$

$$W_{i \to j}(p_i, p_j) = \arg\max_{S_{o_{|Ch(i)|}}} \; B_i(p_i, p_j, S_{o_{|Ch(i)|}}) + R_{o_{|Ch(i)|}}(p_i, S_{o_{|Ch(i)|}})$$

where

$$B_i(p_i, p_j, S_{o_{|Ch(i)|}}) = \sum_{a_i} \log \mathbb{1}\!\left[ \sum_{a'_i} p_i(a'_i) \left( \widetilde{M}_{i,j}(a'_i, p_j) + S_{o_{|Ch(i)|}}(a'_i) \right) \ge \widetilde{M}_{i,j}(a_i, p_j) + S_{o_{|Ch(i)|}}(a_i) - \epsilon \right]$$

and, for $l = 1, \ldots, |Ch(i)|$,

$$V_{o_l}(S_{o_l}, p_{o_l}, S_{o_{l-1}}) = \sum_{a_i} \log \mathbb{1}\!\left[ S_{o_l}(a_i) = \widetilde{M}_{i,o_l}(a_i, p_{o_l}) + S_{o_{l-1}}(a_i) \right]$$

$$F_{o_l}(p_i, S_{o_l}, p_{o_l}, S_{o_{l-1}}) = T_{o_l \to i}(p_{o_l}, p_i) + R_{o_{l-1}}(p_i, S_{o_{l-1}}) + V_{o_l}(S_{o_l}, p_{o_l}, S_{o_{l-1}})$$

$$R_{o_l}(p_i, S_{o_l}) = \max_{p_{o_l},\, S_{o_{l-1}}} F_{o_l}(p_i, S_{o_l}, p_{o_l}, S_{o_{l-1}})$$

$$W_{o_l}(p_i, S_{o_l}) = \arg\max_{p_{o_l},\, S_{o_{l-1}}} F_{o_l}(p_i, S_{o_l}, p_{o_l}, S_{o_{l-1}}) .$$

Note that we are using the following boundary conditions for simplicity of presentation: $R_{o_0} \equiv 0$ and $S_{o_0} \equiv 0$, so that $F_{o_1}(p_i, S_{o_1}, p_{o_1}, S_{o_0}) \equiv F_{o_1}(p_i, S_{o_1}, p_{o_1}) = T_{o_1 \to i}(p_{o_1}, p_i) + \sum_{a_i} \log \mathbb{1}\!\left[ S_{o_1}(a_i) = \widetilde{M}_{i,o_1}(a_i, p_{o_1}) \right]$. If $i$ is the designated root, then, because there is no corresponding parent $j$, we have $T_{i \to j}(p_i, p_j) \equiv T_i(p_i)$ and $W_{i \to j}(p_i, p_j) \equiv W_i(p_i)$.

Assignment Pass. For the root $i$, set $p^*_i \in \arg\max_{p_i} T_i(p_i)$ and $S^*_{o_{|Ch(i)|}} \in W_i(p^*_i)$, where $o_{|Ch(i)|}$ is the last node in the order of the root's children $Ch(i)$. Then recursively apply the following assignment process starting at $o_{|Ch(i)|}$: for $l = |Ch(i)|, \ldots, 1$, set $(p^*_{o_l}, S^*_{o_{l-1}}) \in W_{o_l}(p^*_i, S^*_{o_l})$.
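The two passes can be sketched in code for a very small setting. The sketch below is a simplified, boolean-feasibility variant written for illustration and is not the authors' implementation: it assumes two actions per player, represents a mixed strategy by the probability of action 0 on a uniform grid, replaces the $\{-\infty, 0\}$ tables by sets of feasible pairs, tracks reachable partial sums in a dictionary rather than a table over the full payoff grid, and uses rounding to multiples of `tau` as a stand-in for the payoff-grid projection. All names (`solve_tree_polymatrix`, `s`, `tau`, `witness`) are hypothetical. Because of the rounding, the ε guarantee is exact only when the expected payoffs already lie on the `tau` grid, as in the toy instance used here.

```python
def solve_tree_polymatrix(n, edges, M, root=0, s=4, tau=0.25, eps=0.5):
    """Feasibility DP for a two-action tree polymatrix game (illustrative only).

    M[(i, j)][a_i][a_j] is player i's local payoff against neighbor j.
    A mixed strategy is p = Pr[action 0], restricted to the grid {0, 1/s, ..., 1}.
    Returns {player: p}, or None if no grid point passes the checks.
    """
    grid = [k / s for k in range(s + 1)]
    nbrs = {i: [] for i in range(n)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    witness = {}  # (i, p_i, p_parent) -> child strategies making the pair feasible

    def q(i, j, pj):
        # Expected local payoff of each action of i against p_j, rounded to
        # integer multiples of tau (stand-in for the payoff-grid projection).
        return tuple(round((pj * M[(i, j)][a][0] + (1 - pj) * M[(i, j)][a][1]) / tau)
                     for a in (0, 1))

    def best_response_ok(pi, total):
        # total is in units of tau, so eps is rescaled to the same units.
        mixed = pi * total[0] + (1 - pi) * total[1]
        return mixed >= max(total) - eps / tau - 1e-9

    def collect(i, parent):
        children = [c for c in nbrs[i] if c != parent]
        tables = [collect(c, i) for c in children]    # leaves are processed first
        feasible = set()
        for pi in grid:
            reach = {(0, 0): ()}                      # boundary condition: S_{o_0} = 0
            for c, T in zip(children, tables):        # partial sums, one child at a time
                nxt = {}
                for S, chosen in reach.items():
                    for pc in grid:
                        if (pc, pi) in T:
                            d = q(i, c, pc)
                            nxt.setdefault((S[0] + d[0], S[1] + d[1]), chosen + (pc,))
                reach = nxt
            for pj in (grid if parent is not None else [None]):
                base = q(i, parent, pj) if parent is not None else (0, 0)
                for S, chosen in reach.items():
                    if best_response_ok(pi, (S[0] + base[0], S[1] + base[1])):
                        feasible.add((pi, pj))
                        witness[(i, pi, pj)] = chosen
                        break
        return feasible

    def assign(i, parent, pi, pj, out):
        out[i] = pi
        children = [c for c in nbrs[i] if c != parent]
        for c, pc in zip(children, witness[(i, pi, pj)]):
            assign(c, i, pc, pi, out)
        return out

    root_table = collect(root, None)
    if not root_table:
        return None
    p_root = min(p for p, _ in root_table)            # any feasible root strategy works
    return assign(root, None, p_root, None, {})


coord = [[1, 0], [0, 1]]                  # hypothetical coordination payoffs
edges = [(0, 1), (1, 2)]                  # path 0 - 1 - 2
M = {(i, j): coord for (u, v) in edges for (i, j) in ((u, v), (v, u))}
profile = solve_tree_polymatrix(3, edges, M)
```

The recursion in `collect` mirrors the leaves-to-root order of the collection pass, and `assign` follows the stored witnesses from the root downward, as in the assignment pass.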


6.2 The Running Time of the DP Algorithm

A running-time analysis of the DP algorithm presented above yields the following theorem, which is one of the main algorithmic results of this paper.

Theorem 4. The DP algorithm computes an $\epsilon$-MSNE in a polymatrix GG with a tree graph in time $n \left( \frac{mk}{\epsilon} \right)^{O(m)}$.

Corollary 5. The DP algorithm is an FPTAS to compute an ǫ-MSNE in an n-player polymatrix GG with a tree graph and a bounded number of actions m. If m = O(polylog(n)), then the DP algorithm is a quasi-PTAS.

7. Concluding Remarks

Using the concept of hypertree decomposition, as is typical in the literature on probabilistic (Koller & Friedman, 2009) and constraint (Dechter, 2003) graphical models, yields a straightforward, but tedious to derive, extension of the results presented for graphical polymatrix games with tree graphs to more general graphical multi-hypermatrix games, both for computing a single/sample $\epsilon$-MSNE and for computing an absolute-approximate optimum of the best and worst social welfare among the set of $\epsilon$-MSNE (or the optimum of any linear combination of the players' payoffs, or other related quantities like minimax or maximin).⁹ The following bullet points summarize this discussion.

• There exists an FPTAS to compute an $\epsilon$-approximation, in absolute terms, of the optimum of any linear combination of the players' payoffs over the $\epsilon$-MSNE in the induced grid in a polymatrix GG with a tree graph and bounded $m$. Note that this result includes approximately computing the best and worst social welfare, among other related quantities of interest. The existence result becomes that of a quasi-PTAS when $m = O(\mathrm{polylog}(n))$.

• The results stated in the last bullet point extend to GMhGs with bounded (induced) hypertree-width $w$. For example, the quasi-PTAS result requires that $m$ and $w$ be $O(\mathrm{polylog}(n))$.

NashProp-style (Ortiz & Kearns, 2003) heuristic adaptations of the DP algorithm for tree polymatrix GGs to loopy game graphs are possible (details omitted).

We present a more refined alternative to the GMhG-induced CSP that reduces the dependency on $\kappa'$. The main idea is to evaluate the expressions involving the expected local payoff hypermatrices, such as $\widetilde{M}_{i,C}(a_i, p_{C-i})$, in a smart way, by decomposing the sum involving the expectation, considering one player's mixed strategy at a time, and projecting to the discretized utility space after evaluating each term in the sum.

Using this approach we believe we can obtain an FPTAS for standard GGs (i.e., GGs in normal-form) with tree graphs and a bounded number of actions, for which the best known approximation result to date is a quasi-PTAS. Unfortunately, the resulting alternative CSP is considerably more complex than the one presented in Section 5; hence, we refer the reader to the appendix for the details. While we can obtain results analogous to those in Section 6 by using the refined version of the GMhG-induced CSP we present in the appendix, we omit the statements of those general results to keep the presentation simple. Instead, we present the results concerning the specific DP algorithm that follows from the refined CSP presented in the appendix.

Theorem 5. There exists a DP algorithm that computes an $\epsilon$-MSNE in a GG in normal-form with a tree graph in time $O\!\left( n \left( \frac{mk}{\epsilon} \right)^{3m+2} m^{2k} \right)$. This implies that, if the number of actions $m$ is bounded, then the running time is $\mathrm{poly}\!\left( n\, m^{k}, \frac{1}{\epsilon} \right)$.

9. See Ortiz (2014) for a brief discussion of the role of the AI literature on probabilistic graphical models and constraint networks as it relates to GGs, as well as how GMhGs relate to other classes of games such as polymatrix GGs.

Corollary 6. There exists a DP algorithm that is an FPTAS to compute an $\epsilon$-MSNE in an $n$-player GG in normal-form with a tree graph and a bounded number of actions $m$. If $m = O(\mathrm{polylog}(n))$, then the DP algorithm is a quasi-PTAS.

Finally, even the refined-CSP approach described in this section does not seem to lead to reductions in the exponential dependency of our algorithms on $m$. It appears that we would need a different, sparser type of discretization scheme over the simplex defining the mixed strategy of each player, one that prevents us from having to consider a number of mixed strategies in the discretized space that is exponential in $m$. We leave for future work the pursuit of research along this line, which may lead to a substantial reduction in the current exponential dependency on $m$, or perhaps instead to complexity results suggesting that such dependency is inevitable.

References

Aumann, R. (1974). Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1.

Aumann, R. (1987). Correlated equilibrium as an expression of Bayesian rationality. Econometrica, 55.

Barman, S., Ligett, K., & Piliouras, G. (2015). Approximating Nash equilibria in tree polymatrix games. In 8th International Symposium on Algorithmic Game Theory (SAGT).

Chan, H., & Ortiz, L. (2015). Computing Nash equilibrium in interdependent defense games. In AAAI Conference on Artificial Intelligence.

Chen, X., & Deng, X. (2005a). 3-NASH is PPAD-complete. Tech. rep. 134, Electronic Colloquium on Computational Complexity (ECCC). http://eccc.hpi-web.de/eccc-reports/2005/TR05-134/index.html.

Chen, X., & Deng, X. (2005b). Settling the complexity of 2-player Nash-equilibrium. Tech. rep. 140, Electronic Colloquium on Computational Complexity (ECCC). http://eccc.hpi-web.de/eccc-reports/2005/TR05-140/index.html.

Chen, X., & Deng, X. (2006). Settling the complexity of two-player Nash equilibrium. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

Chen, X., Deng, X., & Teng, S.-H. (2009). Settling the complexity of computing two-player Nash equilibria. J. ACM, 56 (3), 14:1–14:57.

Daskalakis, C., Goldberg, P. W., & Papadimitriou, C. H. (2009). The complexity of computing a Nash equilibrium. Commun. ACM, 52 (2), 89–97.

Daskalakis, K., Goldberg, P. W., & Papadimitriou, C. H. (2005). The complexity of computing a Nash equilibrium. Electronic Colloquium on Computational Complexity (ECCC).

Daskalakis, K., & Papadimitriou, C. H. (2005). Three-player games are hard. Electronic Colloquium on Computational Complexity (ECCC).

Dechter, R. (2003). Constraint Processing. Morgan Kaufmann.

Deligkas, A., Fearnley, J., Savani, R., & Spirakis, P. (2014). Computing approximate Nash equilibria in polymatrix games. ArXiv e-prints.

Jiang, A. X., & Leyton-Brown, K. (2011a). A general framework for computing optimal correlated equilibria in compact games. In Seventh Workshop on Internet and Network Economics (WINE).

Jiang, A. X., & Leyton-Brown, K. (2011b). Polynomial computation of exact correlated equilibrium in compact games. SIGecom Exchanges, 10 (1), 6–8.

Kakade, S., Kearns, M., Langford, J., & Ortiz, L. (2003). Correlated equilibria in graphical games. In EC '03: Proceedings of the 4th ACM conference on Electronic commerce, pp. 42–47, New York, NY, USA. ACM.

Kearns, M. (2007). Graphical games. In Nisan, N., Roughgarden, T., Tardos, É., & Vazirani, V. V. (Eds.), Algorithmic Game Theory, chap. 7, pp. 159–180. Cambridge University Press.

Kearns, M. J., Littman, M. L., & Singh, S. P. (2001). Graphical models for game theory. In UAI '01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, pp. 253–260, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

Koller, D., & Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.

Lipton, R. J., Markakis, E., & Mehta, A. (2003). Playing large games using simple strategies. In EC '03: Proceedings of the 4th ACM conference on Electronic commerce, pp. 36–41, New York, NY, USA. ACM.

Ortiz, L. E. (2014). On sparse discretization for graphical games. CoRR, abs/1411.3320.

Ortiz, L. E., & Kearns, M. (2003). Nash propagation for loopy graphical games. In Becker, S. B., Thrun, S. T., & Obermayer, K. (Eds.), Advances in Neural Information Processing Systems 15, pp. 817–824.

Papadimitriou, C. H. (2005). Computing correlated equilibria in multi-player games. In STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pp. 49–56.

Papadimitriou, C. H., & Roughgarden, T. (2005). Computing equilibria in multi-player games. In SODA '05: Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 82–91.

Papadimitriou, C. H., & Roughgarden, T. (2008). Computing correlated equilibria in multiplayer games. J. ACM, 55 (3), 14:1–14:29.

Rubinstein, A. (2014). Inapproximability of Nash equilibrium. CoRR, abs/1405.3322.

Tsaknakis, H., & Spirakis, P. G. (2008). An optimization approach for approximate Nash equilibria. Internet Math., 5 (4), 365–382.

Appendix A. GMhG-induced CSP: Refined Version

This appendix presents a more complex generalization of the CSP presented in Section 5 in the main body.

• Variables: for all $i$ and $a_i$, a variable $p_{i,a_i}$ corresponding to the probability that player $i$ plays pure strategy $a_i$ under its mixed strategy; for all $C \in \mathcal{C}_i$, a variable $S_{i,C,a_i}$ corresponding to some partial sum of the expected payoff of player $i$ based on an ordering of the local-clique/hyperedge elements of $\mathcal{C}_i$; and, given an ordering $C - i = o_1, o_2, \ldots, o_{|C|-1}$, a variable $E_{i,C,a_{C-[o_t]}}$ corresponding to some partial conditional expected payoff of player $i$. Formally, if $P_i \equiv \bigcup_{a_i} \{p_{i,a_i}\}$, $S_{i,C} \equiv \bigcup_{a_i} \{S_{i,C,a_i}\}$, and $E_{i,C} \equiv \bigcup_{t=1}^{|C|-1} \bigcup_{a_{C-[o_t]}} \{E_{i,C,a_{C-[o_t]}}\}$, then the set of all variables is $\bigcup_i \left( P_i \cup \bigcup_{C \in \mathcal{C}_i} \left( S_{i,C} \cup E_{i,C} \right) \right)$.

• Domains: the domain of each variable $p_{i,a_i}$ is $\widetilde{I}_i$, while that of each partial-sum variable $S_{i,C,a_i}$ and each partial-conditional-expectation variable $E_{i,C,a_{C-[o_t]}}$ is $\widetilde{I}'_i$.

• Constraints: for each $i$,

1. Best-response and partial-sum expected local-clique payoff: compute a hypertree decomposition of the local hypergraph induced by the hyperedges $\mathcal{C}_i$; then order the set of local-cliques $\mathcal{C}_i$ of each player $i$ such that $\mathcal{C}_i \equiv \{C_i^1, C_i^2, \ldots, C_i^{\kappa_i}\}$, where the superscript denotes the corresponding order of the local-cliques of player $i$, and the order is consistent with the hypertree decomposition of the local hypergraph, in the standard (non-serial) DP sense used in constraint and probabilistic graphical models (Dechter, 2003; Koller & Friedman, 2009); and for each $a_i$,

(a) $\sum_{a'_i} p_{i,a'_i} S_{i,C_i^{\kappa_i},a'_i} \ge S_{i,C_i^{\kappa_i},a_i} - \epsilon/2$

(b) $S_{i,C_i^1,a_i} = E_{i,C_i^1,a_i}$, and for $l = 2, \ldots, \kappa_i$, $S_{i,C_i^l,a_i} = E_{i,C_i^l,a_i} + S_{i,C_i^{l-1},a_i}$

(c) for each set $C \in \mathcal{C}_i$, order the elements of $C - i$ such that $C - i = o_1, o_2, \ldots, o_{|C|-1}$, then set $E_{i,C,a_C} \equiv \widetilde{M}_{i,C}(a_C)$ and, for $t = 2, \ldots, |C| - 1$,

$$E_{i,C,a_{C-[o_t]}} = \mathrm{Proj}\!\left( \sum_{a_{o_t}} p_{o_t,a_{o_t}} \, E_{i,C,a_{C-[o_{t-1}]}} \right)$$

2. Normalization: $\sum_{a_i} p_{i,a_i} = 1$

The number of variables of the CSP is $O(n\, \kappa\, m^{\kappa'})$, which is larger than in the version of the CSP presented in the main body, but exactly the worst-case representation size of the GMhG. The computation of each $\widetilde{M}_{i,C}(a_C)$ in 1(c) above takes constant time. The total number of constraints is $O(n\, \kappa\, m^{\kappa'})$, which is also larger than in the version of the CSP presented in the main body, but exactly the worst-case representation size of the GMhG. The maximum number of variables in any constraint is $O(m)$, which is smaller than in the version of the CSP presented in the main body by a factor of $\kappa'$. Given a hypertree decomposition, the amount of time to build the constraint set using a tabular representation is $O(n\, \kappa\, m^{\kappa'+1} s^{m} (s')^{m+1})$.

In summary, the representation size of the GMhG-induced CSP presented above, using a tabular representation, is $O(n\, \kappa\, m^{\kappa'+1} s^{m} (s')^{m+1})$. Note the key reduction in the dependence on $\kappa'$ from the analogous expression given for the CSP in Section 5: the parameter $\kappa'$ only appears in the exponent of $m$, as it also does in the representation size of the GMhG, and not in the exponent of $s$.
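The idea behind constraint 1(c), taking the expectation over one player's mixed strategy at a time and projecting after each step, can be illustrated with a short sketch. The code below is an assumption-laden illustration and does not reproduce the paper's construction: `proj` rounds to the nearest multiple of a hypothetical spacing `tau` as a stand-in for Proj, payoffs are stored in a dictionary keyed by joint actions `(a_i, a_o1, ..., a_oK)`, and the sketch eliminates every other player in the order $o_1, o_2, \ldots$ until only $a_i$ remains.

```python
import itertools


def proj(x, tau):
    """Round x to the nearest multiple of tau (stand-in for the paper's Proj)."""
    return round(x / tau) * tau


def partial_expectations(M, mixed, tau):
    """Eliminate the other players one at a time, projecting after each step.

    M     : dict mapping joint actions (a_i, a_o1, ..., a_oK) to payoffs
    mixed : list of K mixed strategies, one per other player o_1, ..., o_K
    Returns a dict mapping a_i to the projected expected payoff.
    """
    E = dict(M)
    for p in mixed:
        nxt = {}
        for a, v in E.items():
            # Position 1 always holds the action of the player being eliminated.
            key = (a[0],) + a[2:]
            nxt[key] = nxt.get(key, 0.0) + p[a[1]] * v
        E = {k: proj(v, tau) for k, v in nxt.items()}
    return {k[0]: v for k, v in E.items()}


# Hypothetical three-player hyperedge: payoff 1 when all actions agree, else 0.
M_all_equal = {a: (1.0 if len(set(a)) == 1 else 0.0)
               for a in itertools.product((0, 1), repeat=3)}
expected = partial_expectations(M_all_equal, [[0.5, 0.5], [1.0, 0.0]], tau=0.25)
```

Each intermediate table has one fewer player in its key than the previous one, which is why no constraint ever needs to range over all joint mixed strategies of the hyperedge at once.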

Appendix B. Sparse-discretization-based DP for GGs in Normal-form with Tree Graphs

This appendix is analogous to Section 6.1 in the main body, but deals with normal-form GGs instead of polymatrix GGs. We refer the reader to the introduction to Section 6.1 for general context and notation.

Collection Pass. Recursively, for each node $i$ in the induced directed tree, relative to the root, let $j = pa(i)$. Order $Ch(i)$ and denote the resulting node order by $o_1, \ldots, o_{|Ch(i)|}$. Apply the following DP from leaves to root: for each arc $(j, i)$ in the designated-root-induced directed tree, and $(p_i, p_j)$ a mixed-strategy pair in the induced grid,

$$T_{i \to j}(p_i, p_j) = \max_{E_{o_{|Ch(i)|}}} \; B_i(p_i, p_j, E_{o_{|Ch(i)|}}) + R_{o_{|Ch(i)|}}(p_i, E_{o_{|Ch(i)|}})$$

$$W_{i \to j}(p_i, p_j) = \arg\max_{E_{o_{|Ch(i)|}}} \; B_i(p_i, p_j, E_{o_{|Ch(i)|}}) + R_{o_{|Ch(i)|}}(p_i, E_{o_{|Ch(i)|}})$$

where

$$B_i(p_i, p_j, E_{o_{|Ch(i)|}}) = \sum_{a_i} \log \mathbb{1}\!\left[ \sum_{a'_i, a'_j} p_i(a'_i)\, p_j(a'_j)\, E_{o_{|Ch(i)|}}(a'_i, a'_j) \ge \sum_{a'_j} p_j(a'_j)\, E_{o_{|Ch(i)|}}(a_i, a'_j) - \epsilon \right]$$

and, for $l = 1, \ldots, |Ch(i)|$,

$$V_{o_l}(E_{o_l}, p_{o_l}, E_{o_{l-1}}) = \sum_{a_{N_i - [o_l]}} \log \mathbb{1}\!\left[ E_{o_l}(a_{N_i - [o_l]}) = \mathrm{Proj}\!\left( \sum_{a_{o_l}} p_{o_l}(a_{o_l})\, E_{o_{l-1}}(a_{N_i - [o_{l-1}]}) \right) \right]$$

$$F_{o_l}(p_i, E_{o_l}, p_{o_l}, E_{o_{l-1}}) = T_{o_l \to i}(p_{o_l}, p_i) + R_{o_{l-1}}(p_i, E_{o_{l-1}}) + V_{o_l}(E_{o_l}, p_{o_l}, E_{o_{l-1}})$$

$$R_{o_l}(p_i, E_{o_l}) = \max_{p_{o_l},\, E_{o_{l-1}}} F_{o_l}(p_i, E_{o_l}, p_{o_l}, E_{o_{l-1}})$$

$$W_{o_l}(p_i, E_{o_l}) = \arg\max_{p_{o_l},\, E_{o_{l-1}}} F_{o_l}(p_i, E_{o_l}, p_{o_l}, E_{o_{l-1}}) .$$

Note that we are using the following boundary conditions for simplicity of presentation: $R_{o_0} \equiv 0$ and, for all $a_{N_i}$, $E_{o_0}(a_{N_i}) \equiv \widetilde{M}_i(a_{N_i})$, so that

$$F_{o_1}(p_i, E_{o_1}, p_{o_1}, E_{o_0}) \equiv F_{o_1}(p_i, E_{o_1}, p_{o_1}) = T_{o_1 \to i}(p_{o_1}, p_i) + \sum_{a_{N_i - o_1}} \log \mathbb{1}\!\left[ E_{o_1}(a_{N_i - o_1}) = \mathrm{Proj}\!\left( \sum_{a_{o_1}} p_{o_1}(a_{o_1})\, E_{o_0}(a_{N_i}) \right) \right] .$$

If $i$ is the designated root, then, because there is no corresponding parent $j$, we have $T_{i \to j}(p_i, p_j) \equiv T_i(p_i)$ and $W_{i \to j}(p_i, p_j) \equiv W_i(p_i)$.

Assignment Pass. For the root $i$, set $p^*_i \in \arg\max_{p_i} T_i(p_i)$ and $E^*_{o_{|Ch(i)|}} \in W_i(p^*_i)$, where $o_{|Ch(i)|}$ is the last node in the order of the root's children $Ch(i)$. Then recursively apply the following assignment process starting at $o_{|Ch(i)|}$: for $l = |Ch(i)|, \ldots, 1$, set $(p^*_{o_l}, E^*_{o_{l-1}}) \in W_{o_l}(p^*_i, E^*_{o_l})$.

Note that for the case of polymatrix GGs the DP above would be essentially the same as that presented in Section 6.1 in the main body.

B.1 The Running Time of the DP Algorithm for Tree GGs in Normal-form

The worst-case running time for message passing at each node $i$ is

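$$\begin{aligned}
\left( \frac{mk}{\epsilon} \right)^{2m} \times \sum_{l=1}^{|Ch(i)|} O\!\left( \left( \frac{mk}{\epsilon} \right)^{m+2} \left( m^{2} \right)^{|Ch(i)| - l} \right)
&= \left( \frac{mk}{\epsilon} \right)^{3m+2} \times \sum_{r=0}^{|Ch(i)|-1} O\!\left( \left( m^{2} \right)^{r} \right) \\
&= O\!\left( \frac{(m^{2})^{|Ch(i)|} - 1}{m^{2} - 1} \left( \frac{mk}{\epsilon} \right)^{3m+2} \right) \\
&= O\!\left( m^{2((k_i - 2) - 1)} \left( \frac{mk}{\epsilon} \right)^{3m+2} \right) \\
&= O\!\left( m^{2 k_i - 6} \left( \frac{mk}{\epsilon} \right)^{3m+2} \right),
\end{aligned}$$

from which the running-time result of Theorem 5 in the concluding remarks follows.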
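The only non-mechanical step in this bound is the geometric sum over the children. As a worked sketch, assuming $m \ge 2$ and writing $c$ for $|Ch(i)|$ (taken to be $k_i - 2$, as the exponent $(k_i - 2) - 1$ in the bound suggests), the step reads:

```latex
\sum_{r=0}^{c-1} \left(m^{2}\right)^{r}
  = \frac{\left(m^{2}\right)^{c} - 1}{m^{2} - 1}
  \le \frac{m^{2c}}{m^{2} - 1}
  \le \frac{4}{3}\, m^{2(c-1)}
  = O\!\left(m^{2(c-1)}\right)
  = O\!\left(m^{2 k_i - 6}\right),
\qquad \text{using } m^{2} - 1 \ge \tfrac{3}{4}\, m^{2} \text{ for } m \ge 2 .
```

The symbol $c$ and the constant $4/3$ are introduced here only for illustration; they do not appear in the original analysis.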
