Combinatorial Algorithms for the Maximum k-plex Problem

Benjamin McClosky, Illya V. Hicks

Department of Computational and Applied Mathematics, Rice University, 6100 Main St - MS 134, Houston, Texas 77005-1892, USA, {[email protected], [email protected]}

The maximum clique problem provides a classic framework for detecting cohesive subgraphs. However, this approach can fail to detect much of the cohesive structure in a graph. To address this issue, Seidman and Foster introduced k-plexes as a degree-based relaxation of graph completeness. More recently, Balasundaram et al. formulated the maximum k-plex problem as an integer program and designed a branch-and-cut algorithm. This paper derives a new upper bound on the cardinality of k-plexes and adapts combinatorial clique algorithms to find maximum k-plexes.

1. Introduction

All graphs in this paper are finite, simple, and undirected. A complete graph consists of pairwise adjacent vertices. A maximal complete subgraph defines a clique. The problem of finding maximum cardinality cliques is a classic NP-complete problem and is of fundamental importance in combinatorial optimization. The maximum clique problem has applications in ad hoc wireless networks (Chen et al. 2004), data mining (Washio and Motoda 2003), and biochemistry and genomics (Butenko and Wilhelm 2006). Cliques also provide an intuitive approach for detecting cohesion in social networks (Festinger 1949, Luce and Perry 1949).

A social network is a graph whose vertices consist of a set of actors and whose edges indicate relationships between actors. Cohesive subgroups are subsets of actors among whom there are relatively strong, direct, intense, frequent, or positive ties (Wasserman and Faust 1994). These subgroups are interesting because they facilitate the emergence of consensus among the actors (Friedkin 1984). In other words, members within a cohesive subgroup tend to exhibit homogeneity. Despite the intuitive appeal of using cliques to model cohesion in social networks, cliques are in fact rarely appropriate for analysis of real data because the definition is too strict (Wasserman and Faust 1994). This motivates the study of clique relaxations. Researchers have relaxed a variety of clique properties including familiarity, reachability, and robustness (Balasundaram et al. 2007). In graph theoretic terms, these properties respectively correspond to vertex degree, path length, and connectivity. This paper focuses on a degree-based relaxation known as a k-plex (Seidman and Foster 1978). In order to define k-plexes, consider a graph G = (V, E) and vertex v ∈ V. Define

N_G(v) := {u ∈ V : uv ∈ E}, deg_G(v) := |N_G(v)|, ∆(G) := max_{v∈V} deg_G(v), and δ(G) := min_{v∈V} deg_G(v). Let G[K] denote the subgraph induced by K ⊆ V. Define Ḡ = (V, Ē) to be the complement of G, where e ∈ Ē ⇔ e ∉ E. In this paper, k ≥ 1 is a positive integer.

Definition 1 (Seidman and Foster 1978). K ⊆ V induces a k-plex if δ(G[K]) ≥ |K| − k.

The term k-plex refers to both the set K and the subgraph G[K]. Notice that 1-plexes are complete subgraphs. Let ω_k(G) denote the cardinality of a largest k-plex in G.

Definition 2 (Seidman and Foster 1978). C ⊆ V induces a co-k-plex if ∆(G[C]) ≤ k − 1.

A co-k-plex in G is a k-plex in Ḡ. Notice that co-1-plexes consist of pairwise nonadjacent vertices, or stable sets.

Seidman and Foster (1978) introduced k-plexes in the context of social networks. Balasundaram et al. (2006) provided an integer programming formulation for the maximum k-plex problem, derived inequalities for the k-plex polytope, and established the NP-completeness of the k-plex decision problem. McClosky and Hicks (2007) studied the co-k-plex polytope. This paper develops combinatorial algorithms for finding maximum k-plexes. Section 2 describes heuristics for finding upper bounds on ω_k(G). Section 3 describes a heuristic for finding lower bounds on ω_k(G). Section 4 contains exact k-plex algorithms. Section 5 summarizes and discusses future research.

Computational Results. Sections 2-4 contain computational results obtained by conducting experiments on two classes of graphs. The first class of graphs consists entirely of DIMACS (2007) instances. The DIMACS graphs are widely considered the standard testbed for clique algorithms. The second class of graphs consists of real-life social networks. The COMP-GEOM-t instances are collaboration networks for computational geometers (Beebe 2002, Jones 2002). Here, two authors are adjacent if they have jointly published more than t articles. The ERDOS-x-y graphs are collaboration networks of all authors with an Erdős number at most y as of year x (Grossman et al. 1995). Data for both collaboration networks can be downloaded from the Pajek data website (Batagelj and Mrvar 2006).
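Definitions 1 and 2 can be checked directly from induced degrees. The following is a minimal sketch, assuming a graph stored as a dictionary that maps each vertex to its set of neighbours; the names `Graph`, `is_kplex`, and `is_co_kplex` are illustrative and do not come from the paper.

```python
from typing import Dict, Iterable, Set

Graph = Dict[int, Set[int]]  # assumed representation: vertex -> set of neighbours


def is_kplex(adj: Graph, subset: Iterable[int], k: int) -> bool:
    """Definition 1: delta(G[K]) >= |K| - k."""
    members = set(subset)
    return all(len(adj[v] & members) >= len(members) - k for v in members)


def is_co_kplex(adj: Graph, subset: Iterable[int], k: int) -> bool:
    """Definition 2: Delta(G[C]) <= k - 1."""
    members = set(subset)
    return all(len(adj[v] & members) <= k - 1 for v in members)


# Illustrative graph: the 4-cycle 0-1-2-3-0.
cycle4: Graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

print(is_kplex(cycle4, {0, 1, 2, 3}, 1))   # False: the 4-cycle is not complete
print(is_kplex(cycle4, {0, 1, 2, 3}, 2))   # True: every degree is 2 >= 4 - 2
print(is_co_kplex(cycle4, {0, 2}, 1))      # True: {0, 2} is a stable set
```

Both checks recompute induced degrees from scratch, which is adequate for illustration; an incremental degree count would be preferable inside a search loop.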

Table 1: Test Instances

G             |V|   |E|     ω2(G)      ω3(G)
brock200-1    200   14834   [25,53]    –
brock200-2    200   9876    [13,24]    –
brock200-4    200   13089   [19,41]    –
brock400-2    400   59786   [27,133]   –
brock400-4    400   59765   [27,133]   –
brock800-2    800   208166  [23,253]   –
brock800-4    800   207643  [23,252]   –
c-fat200-1    200   1534    12         –
c-fat200-2    200   3235    24         –
c-fat200-5    200   8473    58         –
c-fat500-1    500   4459    14         –
c-fat500-2    500   9139    26         –
c-fat500-5    500   23191   64         –
c-fat500-10   500   46627   126        –
hamming6-2    64    1824    32         –
hamming6-4    64    704     6          –
hamming8-2    256   31616   128        –
hamming8-4    256   20864   16         –
hamming10-2   1024  518656  [512,530]  –
hamming10-4   1024  434176  [45,153]   –
johnson8-2-4  28    210     5          –
johnson8-4-4  70    1855    14         –
keller4       171   9435    15         –
MANN-a9       45    918     26         –
MANN-a27      378   70551   236        –
MANN-a45      1035  533115  [662,668]  –
p-hat300-1    300   10933   [9,66]     –
p-hat300-2    300   21928   [28,85]    –
p-hat300-3    300   33390   [43,108]   –
p-hat700-1    700   60999   [10,291]   –
p-hat700-2    700   121728  [50,298]   –
p-hat700-3    700   183010  [73,311]   –
san200-0.7-2  200   13930   –          –
san200-0.9-1  200   17910   –          –
COMP-GEOM-0   7343  11898   22         22
COMP-GEOM-1   7343  3939    10         11
COMP-GEOM-2   7343  1976    8          10
ERDOS-97-1    472   1314    8          9
ERDOS-98-1    485   1381    8          9
ERDOS-99-1    492   1417    8          9
ERDOS-97-2    5488  8972    8          9
ERDOS-98-2    5822  9505    8          9
ERDOS-99-2    6100  9939    8          9

Table 1 contains information corresponding to the testbed. The columns ω2(G) and ω3(G) show the cardinality of the largest known k-plex in G. These data are due to Balasundaram et al. (2007). All values not specified within a range are optimal. There appear to be no known values of ω4(G) for these graphs. All implementations were tested using a 2.2 GHz Dual-Core AMD Opteron processor with 3 GB of memory.

2. Co-k-plex Coloring

This section develops an upper bound on ω_k(G) by generalizing the concept of graph coloring. A coloring of G is a function c_m : V → {1, ..., m} such that c_m(u) ≠ c_m(v) whenever uv ∈ E. The chromatic number, χ(G), of G is the smallest integer m for which there exists a valid coloring c_m. Notice that c_m(u) ≠ c_m(v) for all u, v ∈ K whenever K ⊆ V induces a complete subgraph. It follows that the chromatic number is an upper bound for ω(G). That is, ω(G) ≤ χ(G).

The goal is to find a k-plex analogue of this bound. To that end, suppose the co-k-plexes C_1, ..., C_m partition the vertex set, and let K be a maximum k-plex in G. The sets C_1, ..., C_m define a co-k-plex coloring of G. Observe that

$$\omega_k(G) = |K| = \sum_{i=1}^{m} |K \cap C_i| \le \sum_{i=1}^{m} \omega_k(G[C_i]), \tag{1}$$

where the inequality holds because k-plexes are closed under set inclusion (Seidman and Foster 1978). Let Π be the set of all co-k-plex colorings of G and define the graph invariant

$$\chi_k(G) := \min\Big\{ \sum_{C \in \mathcal{C}} \omega_k(G[C]) : \mathcal{C} \in \Pi \Big\}. \tag{2}$$

χ_k(G) is the co-k-plex chromatic number of G. Notice that (1) reduces to ω(G) ≤ χ(G) when k = 1 and C_1, ..., C_m is an optimal coloring. It follows that χ_1(G) = χ(G). Moreover, (1) and (2) together imply the bound

$$\omega_k(G) \le \chi_k(G). \tag{3}$$
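As a small illustration of bounds (1) and (3), consider the 5-cycle with k = 2. The instance and the particular co-2-plex coloring below are chosen here for illustration only and do not appear in the paper; the vertices are v_1, ..., v_5 with edges v_i v_{i+1} (indices taken modulo 5).

```latex
% Worked instance of bounds (1) and (3): G is the 5-cycle, k = 2.
\begin{align*}
C_1 &= \{v_1, v_2, v_4\}, & \Delta(G[C_1]) &= 1 \le k - 1, & \omega_2(G[C_1]) &= 2,\\
C_2 &= \{v_3, v_5\},      & \Delta(G[C_2]) &= 0 \le k - 1, & \omega_2(G[C_2]) &= 2,
\end{align*}
\[
\omega_2(G) = 3 \;\le\; \chi_2(G) \;\le\; \omega_2(G[C_1]) + \omega_2(G[C_2]) = 4 .
\]
```

Here ω_2(G) = 3 because three consecutive vertices induce a path with minimum degree 1 = 3 − 2, while any four vertices induce a path with minimum degree 1 < 4 − 2.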

In practice, determining the exact value of χ_k(G) can be computationally prohibitive, so the remainder of this section is devoted to heuristics for approximating χ_k(G). These heuristics fall into two categories: integral and fractional. To see the distinction, let I be the set of all co-k-plexes in G, and let I_v denote the set of co-k-plexes containing v. Define x(A) := Σ_{a∈A} x_a. Consider the following dual pair of integer programs:

$$\max\{\, x(V) : x \in \{0,1\}^{V},\ x(C) \le \omega_k(G[C]) \text{ for all } C \in I \,\} \tag{4}$$

$$\min\Big\{ \sum_{C \in I} \omega_k(G[C])\, y_C : y \in \{0,1\}^{I},\ y(I_v) \ge 1 \text{ for all } v \in V \Big\}. \tag{5}$$

The optimal objective values for (4) and (5) are ω_k(G) and χ_k(G), respectively. Moreover, by strong duality, the optimal objective values for the linear relaxations are equal. It follows that any feasible solution to the linear relaxation of (5) produces an upper bound on ω_k(G). Integer Co-k-plex Coloring Heuristics (ICCH) find feasible solutions to (5). Fractional Co-k-plex Coloring Heuristics (FCCH) find feasible solutions to the linear relaxation of (5). Both heuristics make use of the following lemmas which provide analytic bounds on ω_k(G).

Lemma 1. Every graph G satisfies ω_k(G) ≤ ∆(G) + k.

Proof. If there exists a k-plex K in G such that |K| > ∆(G) + k, then deg_{G[K]}(v) ≥ |K| − k > ∆(G) for all v ∈ K, a contradiction.

Lemma 2. Given a graph G and an integer m ≥ 1, let a_m denote the number of vertices v ∈ V such that deg_G(v) ≥ m. If j := max{m : a_m ≥ k + m}, then ω_k(G) ≤ k + j.

Proof. Suppose G contains a k-plex K such that |K| ≥ k + j + 1. By definition of k-plex, deg_{G[K]}(v) ≥ |K| − k ≥ j + 1 for all v ∈ K. In other words, K contains at least k + j + 1 vertices v such that deg_{G[K]}(v) ≥ j + 1. It follows that a_{j+1} ≥ k + j + 1, contradicting the definition of j.
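The bounds of Lemmas 1 and 2 depend only on the degree sequence, so they can be evaluated as in the sketch below. The function names are illustrative. One assumption is made explicit in the code: when no m ≥ 1 satisfies a_m ≥ k + m, the sketch takes j = 0, which still gives a valid bound because any k-plex with more than k vertices forces at least k + 1 vertices of degree at least 1.

```python
from typing import Sequence


def lemma1_bound(degrees: Sequence[int], k: int) -> int:
    """Lemma 1: omega_k(G) <= Delta(G) + k."""
    return max(degrees, default=0) + k


def lemma2_bound(degrees: Sequence[int], k: int) -> int:
    """Lemma 2: omega_k(G) <= k + j with j = max{m : a_m >= k + m}.

    Assumption of this sketch: j = 0 when no m >= 1 qualifies.
    """
    j = 0
    for m in range(1, max(degrees, default=0) + 1):
        a_m = sum(1 for d in degrees if d >= m)  # vertices of degree >= m
        if a_m >= k + m:
            j = m
    return k + j


# Illustrative degree sequences: a path on four vertices and the star K_{1,5}.
path4 = [1, 2, 2, 1]
star5 = [5, 1, 1, 1, 1, 1]
print(lemma1_bound(path4, 2), lemma2_bound(path4, 2))
print(lemma1_bound(star5, 1), lemma2_bound(star5, 1))
```

On the star, Lemma 1 is dominated by the single high-degree vertex while Lemma 2 is not, which suggests why both quantities appear in the minimum taken by the coloring heuristics.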

Lemma 3 (Balasundaram et al. 2006). If G is a co-k-plex, then ω_k(G) ≤ 2k − 2 + k mod 2.

Figure 1 shows an ICCH. Line 4 stores the degree of every vertex in C_m. Line 7 uses Lemmas 1, 2, and 3 to bound ω_k(G[C_i]). Lines 3, 4, 6, and 7 can each be accomplished in linear time using an adjacency matrix. It follows that this ICCH is an O(|V|^2) algorithm. Table 2 contains computational results obtained by running the function ICCH on the benchmark graphs with an arbitrary initial vertex ordering.

The FCCH adapts the fractional coloring procedure of Balas and Xue (1996). More precisely, the FCCH constructs a set of co-k-plexes C_1, ..., C_h ⊆ V such that, after p iterations, every vertex belongs to exactly p distinct co-k-plex sets. These co-k-plexes combine to form a feasible solution to the linear relaxation of (5) as follows:

$$\bar{y} := \frac{1}{p} \sum_{i=1}^{h} y_{C_i}.$$

The feasibility of ȳ implies that

$$\omega_k(G) \le \frac{1}{p} \sum_{i=1}^{h} \omega_k(G[C_i])\, \bar{y}_{C_i}.$$

Table 2: ICCH Results

G             χ2(G)  seconds  χ3(G)  seconds  χ4(G)  seconds
brock200-1    93     0.0      151    0.0      169    0.0
brock200-2    55     0.0      95     0.0      118    0.0
brock200-4    78     0.0      131    0.0      151    0.0
brock400-2    172    0.1      285    0.1      332    0.1
brock400-4    168    0.1      286    0.1      330    0.1
brock800-2    248    0.3      442    0.3      570    0.3
brock800-4    247    0.3      440    0.3      557    0.3
c-fat200-1    18     0.0      20     0.0      21     0.0
c-fat200-2    34     0.0      37     0.0      38     0.0
c-fat200-5    82     0.0      89     0.0      90     0.0
c-fat500-1    19     0.0      23     0.0      24     0.0
c-fat500-2    36     0.0      41     0.0      42     0.0
c-fat500-5    85     0.1      98     0.0      99     0.0
c-fat500-10   172    0.1      191    0.1      192    0.1
hamming6-2    32*    0.0      48     0.0      56     0.0
hamming6-4    8      0.0      12     0.0      16     0.0
hamming8-2    128*   0.0      192    0.0      224    0.0
hamming8-4    32     0.0      48     0.0      64     0.0
hamming10-2   512*   0.7      768    0.7      896    0.7
hamming10-4   128    0.5      192    0.5      256    0.5
johnson8-2-4  12     0.0      16     0.0      19     0.0
johnson8-4-4  28     0.0      42     0.0      48     0.0
keller4       54     0.0      100    0.0      128    0.0
MANN-a9       38     0.0      44     0.0      45     0.0
MANN-a27      324    0.1      369    0.1      378    0.1
MANN-a45      833    1.0      1032   0.8      1035   0.8
p-hat300-1    39     0.0      69     0.0      90     0.0
p-hat300-2    76     0.0      135    0.0      173    0.0
p-hat300-3    129    0.0      210    0.0      245    0.0
p-hat700-1    76     0.1      135    0.1      184    0.1
p-hat700-2    165    0.2      289    0.1      375    0.2
p-hat700-3    267    0.3      451    0.2      556    0.2
san200-0.7-2  105    0.0      147    0.0      159    0.0
san200-0.9-1  133    0.0      184    0.0      193    0.0
COMP-GEOM-0   29     10       48     10       50     11
COMP-GEOM-1   22     9.8      37     10       37     11
COMP-GEOM-2   16     10       28     11       27     10
ERDOS-97-1    20     0.1      32     0.1      31     0.1
ERDOS-97-2    19     5.9      36     4.9      36     5.1
ERDOS-98-1    18     0.1      33     0.1      32     0.1
ERDOS-98-2    20     5.8      36     5.7      37     5.9
ERDOS-99-1    18     0.0      34     0.1      34     0.1
ERDOS-99-2    20     6.3      38     6.5      38     6.6
* optimal

Figure 2 shows the FCCH. The set C consists of the co-k-plexes C_1, ..., C_h. At every iteration, each vertex is either added to an existing C_i ∈ C in Line 7 or to a new partition set in Line 10. By Line 12, every vertex belongs to exactly p partition sets, so t_new is a valid upper bound on ω_k(G). The FCCH termination condition can be inefficient, so the number of iterations and the number of partition sets in C are both required to be O(|V|).

Theorem 1. If the number of iterations and the number of partition sets are O(|V|), then FCCH can be executed in O(|V|^4) time.

Proof. At every iteration, for each vertex v ∈ V, the algorithm tests if C_i ∪ {v} is a co-k-plex. This can be done by counting the number of u ∈ N_G(v) ∩ C_i, which requires O(|V|) time. Since there are O(|V|) partition sets, there can be O(|V|^2) possible pairs (C_i, v). Thus, after O(|V|) iterations, this step has complexity O(|V|^4). Lines 2 and 10 execute the O(|V|^2) ICCH algorithm. Since there are at most O(|V|) iterations, these steps have complexity O(|V|^3). All other operations contribute O(|V|^2) to the complexity. Therefore, the overall complexity of FCCH is O(|V|^4).

function ICCH(V)
 1. C_i = ∅ for 1 ≤ i ≤ |V|
 2. for all u ∈ V
 3.    m = min{i : C_i ∪ {u} is a co-k-plex in G}
 4.    C_m = C_m ∪ {u}
 5. end
 6. Compute j_i := max{m : a_m ≥ k + m} for each C_i
 7. bound = Σ_{i=1}^{|V|} min{2k − 2 + k mod 2, k + j_i, ∆(G[C_i]) + k, |C_i|}
 8. return bound
Figure 1: An Integer Co-k-plex Coloring Heuristic
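The greedy heuristic of Figure 1 can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' implementation: the adjacency-set representation, the helper `lemma2_j`, the convention j_i = 0 when no m qualifies, and the choice to return the classes alongside the bound are all assumptions of this sketch.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Graph = Dict[int, Set[int]]  # assumed representation: vertex -> set of neighbours


def lemma2_j(degrees: Sequence[int], k: int) -> int:
    """j = max{m : a_m >= k + m}; taken to be 0 when no m >= 1 qualifies."""
    j = 0
    for m in range(1, max(degrees, default=0) + 1):
        if sum(1 for d in degrees if d >= m) >= k + m:
            j = m
    return j


def icch(adj: Graph, k: int,
         order: Optional[Sequence[int]] = None) -> Tuple[int, List[Set[int]]]:
    """Greedy co-k-plex coloring; returns (upper bound on omega_k, color classes)."""
    order = list(adj) if order is None else list(order)
    classes: List[Set[int]] = []
    inner: Dict[int, int] = {}  # degree of each vertex inside its own class
    for u in order:
        for cls in classes:
            nbrs = adj[u] & cls
            # u and each of its neighbours in cls must keep degree <= k - 1
            if len(nbrs) <= k - 1 and all(inner[w] + 1 <= k - 1 for w in nbrs):
                break
        else:  # no existing class accepts u, so open a new one
            cls, nbrs = set(), set()
            classes.append(cls)
        cls.add(u)
        inner[u] = len(nbrs)
        for w in nbrs:
            inner[w] += 1
    bound = 0
    for cls in classes:
        degs = [inner[v] for v in cls]
        bound += min(2 * k - 2 + k % 2,       # Lemma 3
                     k + lemma2_j(degs, k),   # Lemma 2
                     max(degs) + k,           # Lemma 1
                     len(cls))
    return bound, classes


# Illustrative instance: the 5-cycle.
cycle5: Graph = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
print(icch(cycle5, 1)[0])  # 3, the number of colors used by the greedy coloring
print(icch(cycle5, 2)[0])  # 4
```

Storing the in-class degree of every vertex, as Line 4 of Figure 1 does, is what keeps the membership test for each class cheap in this sketch.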

Table 3: FCCH Results

G             χ2(G)  seconds  χ3(G)  seconds  χ4(G)  seconds
brock200-1    82     0.0      139    0.1      164    0.0
brock200-2    48     0.0      89     0.0      118    0.0
brock200-4    68     0.0      122    0.0      151    0.0
brock400-2    151    0.2      267    0.2      320    0.2
brock400-4    151    0.2      265    0.3      320    0.1
brock800-2    221    1.4      401    1.7      535    1.3
brock800-4    223    2.4      410    0.8      537    1.2
c-fat200-1    16     0.0      20     0.0      21     0.0
c-fat200-2    30     0.0      37     0.0      38     0.0
c-fat200-5    76     0.0      89     0.0      90     0.0
c-fat500-1    18     0.1      23     0.0      24     0.0
c-fat500-2    33     0.1      41     0.0      42     0.0
c-fat500-5    82     0.2      98     0.1      99     0.1
c-fat500-10   166    0.4      191    0.1      192    0.2
hamming6-2    32*    0.0      48     0.0      56     0.0
hamming6-4    8      0.0      12     0.0      16     0.0
hamming8-2    128*   0.0      192    0.0      224    0.1
hamming8-4    32     0.0      48     0.0      64     0.0
hamming10-2   512*   1.3      768    1.3      896    2.0
hamming10-4   128    0.7      192    0.6      256    0.6
johnson8-2-4  10     0.0      16     0.0      18     0.0
johnson8-4-4  28     0.0      42     0.0      46     0.0
keller4       45     0.0      88     0.0      113    0.0
MANN-a9       36     0.0      44     0.0      45     0.0
MANN-a27      321    0.2      366    0.1      378    0.1
MANN-a45      803    5.9      1028   1.8      1035   1.1
p-hat300-1    34     0.0      62     0.0      85     0.0
p-hat300-2    71     0.1      129    0.0      161    0.1
p-hat300-3    115    0.2      201    0.1      240    0.1
p-hat700-1    68     0.3      123    0.3      168    0.3
p-hat700-2    146    0.7      272    0.3      349    0.5
p-hat700-3    243    1.4      428    0.8      532    0.6
san200-0.7-2  79     0.0      140    0.0      159    0.0
san200-0.9-1  127    0.0      177    0.1      191    0.1
COMP-GEOM-0   23     18       44     18       43     18
COMP-GEOM-1   15     17       28     17       30     16
COMP-GEOM-2   12     17       22     17       23     17
ERDOS-97-1    14     0.0      25     0.0      29     0.1
ERDOS-97-2    14     9.6      28     8.7      30     8.8
ERDOS-98-1    12     0.1      25     0.1      28     0.1
ERDOS-98-2    14     9.9      30     9.9      29     10
ERDOS-99-1    12     0.1      25     0.0      28     0.1
ERDOS-99-2    14     11       28     11       30     11
* optimal

function FCCH(V)
 1. t_old = ∞; p = 1
 2. t_new = ICCH(V); store the partition sets in C
 3. while t_new < t_old
 4.    U = V; t_old = t_new; p = p + 1
 5.    for all v ∈ U
 6.       if ∃ C_i ∈ C such that v ∉ C_i and C_i ∪ {v} is a co-k-plex
 7.          C_i = C_i ∪ {v}; U = U \ {v}
 8.       end
 9.    end
10.    ICCH(U); append new partition sets to C
11.    Compute j_i := max{m : a_m ≥ k + m} for each C_i ∈ C
12.    t_new = (1/p) · Σ_{C_i ∈ C} min{2k − 2 + k mod 2, k + j_i, ∆(G[C_i]) + k, |C_i|}
13. end
14. return t_old
Figure 2: A Fractional Co-k-plex Coloring Heuristic

Table 3 contains computational results obtained by running the function FCCH on the benchmark graphs with an arbitrary initial vertex ordering. The bound on the number of iterations was set to 5 · |V|.
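The loop of Figure 2 can be sketched in Python as below. This is a hedged illustration rather than the authors' code: the helper names, the adjacency-set representation, the convention j_i = 0 when no m qualifies, and the decision to return min(t_old, t_new) when the iteration cap is reached (both values are valid bounds) are assumptions of this sketch. The default cap of 5 · |V| iterations follows the setting reported for Table 3.

```python
import math
from typing import Dict, List, Optional, Sequence, Set

Graph = Dict[int, Set[int]]  # assumed representation: vertex -> set of neighbours


def can_add(adj: Graph, cls: Set[int], v: int, k: int) -> bool:
    """True if cls + {v} still has maximum induced degree <= k - 1."""
    nbrs = adj[v] & cls
    return len(nbrs) <= k - 1 and all(len(adj[w] & cls) + 1 <= k - 1 for w in nbrs)


def greedy_classes(adj: Graph, vertices: Sequence[int], k: int) -> List[Set[int]]:
    """Greedy co-k-plex coloring of the given vertices (the role ICCH plays in Figure 2)."""
    classes: List[Set[int]] = []
    for u in vertices:
        for cls in classes:
            if can_add(adj, cls, u, k):
                cls.add(u)
                break
        else:
            classes.append({u})
    return classes


def class_bound(adj: Graph, cls: Set[int], k: int) -> int:
    """min of the Lemma 1-3 bounds and |C| for one co-k-plex class."""
    degs = [len(adj[v] & cls) for v in cls]
    j = 0  # assumption: j = 0 when no m >= 1 satisfies a_m >= k + m
    for m in range(1, max(degs) + 1):
        if sum(1 for d in degs if d >= m) >= k + m:
            j = m
    return min(2 * k - 2 + k % 2, k + j, max(degs) + k, len(cls))


def fcch(adj: Graph, k: int, max_iter: Optional[int] = None) -> float:
    vertices = list(adj)
    max_iter = 5 * len(vertices) if max_iter is None else max_iter
    classes = greedy_classes(adj, vertices, k)
    p = 1
    t_old, t_new = math.inf, float(sum(class_bound(adj, c, k) for c in classes))
    while t_new < t_old and p <= max_iter:
        t_old, p = t_new, p + 1
        leftover = []
        for v in vertices:
            for cls in classes:
                if v not in cls and can_add(adj, cls, v, k):
                    cls.add(v)  # v now lies in one additional class
                    break
            else:
                leftover.append(v)
        classes.extend(greedy_classes(adj, leftover, k))  # new classes for the rest
        t_new = sum(class_bound(adj, c, k) for c in classes) / p
    return min(t_old, t_new)


cycle5: Graph = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
print(fcch(cycle5, 1), fcch(cycle5, 2))
```

After every pass each vertex belongs to exactly p classes, which is the invariant that makes the averaged sum a valid upper bound on ω_k(G).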

3. A k-plex Heuristic

This section describes a heuristic for finding large k-plexes. Feasible k-plexes provide lower bounds on ω_k(G). The heuristic indirectly searches for cohesive subgraphs in G and extends them to maximal k-plexes. There has been extensive research on heuristics for finding large complete subgraphs (Busygin et al. 2002, Feo and Resende 2005, Gendreau 1993, Marchiori 2002). A typical combinatorial heuristic systematically searches a set of neighborhoods in the feasible solution space for local optima (Hansen et al. 2004). When a local optimum is obtained, it is compared to the incumbent solution and stored if necessary. The heuristic then continues searching in other neighborhoods. The solution quality depends heavily on both the choice of neighborhoods and the local search method.

Recall that if I_G denotes the set of all complete subgraphs in G, then I_G also denotes the set of all stable sets in Ḡ. The remainder of this section focuses on finding stable sets in Ḡ which are extended to maximal k-plexes in G. This approach is valid because every element in I_G is extendible to a maximal k-plex in G. Without loss of generality, assume G is connected. For if not, simply run the heuristic on each component.

Figure 3: H̄ with root s.

For u, v ∈ V, let d(u, v) be the length of a shortest path from u to v in G. The concept of neighborhood is based on the parity of shortest path lengths from some root node s. Given a root s ∈ V, define the following sets: K_0 := {v ∈ V | d(s, v) even} and K_1 := {v ∈ V | d(s, v) odd}.

For example, consider the search for k-plexes in some graph H, and suppose that H̄ is shown in Figure 3. The vertex set V(H) partitions into the sets K_0 = {s, 5, 6, 7, 8, 12, 13} and K_1 = {1, 2, 3, 4, 9, 10, 11}. For i ∈ {0, 1}, notice that u, v ∈ K_i and uv ∈ E(H̄) together imply d(u, s) = d(v, s). Otherwise, d(u, s) and d(v, s) would differ by exactly one and therefore have different parities. Therefore, for every v ∈ K_i, N_H̄(v) ∩ {u ∈ K_i \ {v} : d(u, s) ≠ d(v, s)} = ∅. Hopefully, this property causes K_i to contain large stable sets.

Now K_i ∉ I_H in general, but there will typically exist many subsets K_i' ⊆ K_i such that K_i' ∈ I_H. In order to examine a variety of these subsets, construct elements in I_H from K_i by removing one end of every edge in H̄[K_i]. To determine which end of edge uv ∈ E(H̄[K_i]) to remove, always apply exactly one of the following rules:

Rule 1. If deg_{H̄[K_i]}(v) ≤ deg_{H̄[K_i]}(u), remove u. Otherwise, remove v.
Rule 2. If deg_H̄(v) ≤ deg_H̄(u), remove u. Otherwise, remove v.
Rule 3. Always remove v.
Rule 4. Always remove u.

Let K_i^j be the subset obtained from K_i by applying Rule j to every edge in E(H̄[K_i]). Rules 1 and 2 are greedy metrics. Rules 3 and 4 are included to diversify the search space. Now extend each set K_i^j to a maximal k-plex in H. All k-plexes that can be constructed from a set K_i in this way constitute a neighborhood. Therefore, the search space is essentially
a function of the root nodes, and specifying a set of neighborhoods is equivalent to specifying a set of root nodes R. The k-plex heuristic is shown in Figure 4. The incumbent solution I is initially empty and stored as a global variable.

function lbound(R)
 1. for all s ∈ R
 2.    define K_0 and K_1 with respect to root s
 3.    construct sets K_i^j ⊆ K_i
 4.    extend sets K_i^j to maximal k-plexes in H
 5.    for all j and i
 6.       kick(K_i^j)
 7.    end
 8.    update incumbent I if necessary
 9. end
function kick(K)
10. construct set S := {v ∈ V \ K : |N_H̄(v) ∩ K| ≤ 1}
11. let K = K ∪ S
12. construct sets K^j ⊆ K
13. extend sets K^j to maximal k-plexes in H
14. end
Figure 4: k-plex heuristic.

To find a k-plex in H, arbitrarily choose a set of vertices to define R. Line 2 builds a breadth-first-search tree in H̄ rooted at s to determine d(v, s) for all v ∈ V. The breadth-first-search tree is also used to define deg_{H̄[K_i]}(v) for all v. Line 3 applies Rules 1-4, and Line 4 uses a greedy heuristic. Line 6 passes the sets K_i^j to the new function kick. Its purpose is to help the heuristic escape local optima. Line 12 uses Rules 1-4 for each input set K. Figure 4 is a basic k-plex heuristic. Table 4 contains computational results obtained by running lbound on the benchmark graphs. LB1 corresponds to choosing an arbitrary set of |V|/40 vertices to define R. LB2 corresponds to setting R = V. The heuristics terminate after a one-hour time limit.
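Lines 2 and 3 of Figure 4 (the parity classes K_0, K_1 and the sets K_i^j) can be sketched as follows. Several details are assumptions of this sketch rather than statements from the paper: the complement graph is given as adjacency sets, each edge uv is oriented so that v is the endpoint with the smaller label, and an edge is skipped once one of its ends has already been removed. The greedy extension to a maximal k-plex (Line 4) is not specified in the text and is omitted.

```python
from collections import deque
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]  # assumed: adjacency sets of the complement graph


def parity_classes(adj_bar: Graph, root: int) -> Tuple[Set[int], Set[int]]:
    """Breadth-first search from root; split vertices by parity of distance."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj_bar[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    k0 = {v for v, d in dist.items() if d % 2 == 0}
    k1 = {v for v, d in dist.items() if d % 2 == 1}
    return k0, k1


def apply_rule(adj_bar: Graph, k_i: Set[int], rule: int) -> Set[int]:
    """Remove one end of every edge inside k_i according to Rule 1, 2, 3, or 4."""
    kept = set(k_i)
    inner = {v: len(adj_bar[v] & k_i) for v in k_i}  # degree within the class
    outer = {v: len(adj_bar[v]) for v in k_i}        # degree in the whole graph
    for v in sorted(k_i):
        for u in sorted(adj_bar[v] & k_i):
            if u <= v or v not in kept or u not in kept:
                continue  # edge already handled or already broken
            if rule == 1:
                drop = u if inner[v] <= inner[u] else v
            elif rule == 2:
                drop = u if outer[v] <= outer[u] else v
            elif rule == 3:
                drop = v
            else:
                drop = u
            kept.discard(drop)
    return kept


# Illustrative complement graph with one edge (1-2) inside the odd class.
h_bar: Graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 4}, 3: {1}, 4: {2}}
k0, k1 = parity_classes(h_bar, 0)
print(sorted(k0), sorted(k1))                 # [0, 3, 4] [1, 2]
print(apply_rule(h_bar, k1, 1), apply_rule(h_bar, k1, 3))
```

Each returned set is stable in the complement graph, so it induces a complete subgraph in the original graph and can serve as a seed for the extension step.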

4. Exact Algorithms

This section describes exact algorithms for finding maximum k-plexes in a graph G = (V, E). The first type is based on a standard clique algorithm (Applegate and Johnson 1993, Carraghan and Pardalos 1990). The second type adapts an algorithm of Östergård (2002).

Table 4: lbound Results

G             LB1 ω2  sec.  LB2 ω2  sec.  LB1 ω3  sec.  LB2 ω3  sec.  LB1 ω4  sec.  LB2 ω4  sec.
brock200-1    25      1     25      3     27      1     28      3     31      1     32      3
brock200-2    12      1     13*     5     15      1     15      6     17      1     17      5
brock200-4    18      1     19      4     22      1     23      4     24      1     25      4
brock400-2    27      2     28      23    31      2     32      23    35      2     36      23
brock400-4    33      2     33      23    33      2     33      23    36      2     37      23
brock800-2    22      15    23      299   26      15    26      298   29      15    30      299
brock800-4    22      15    23      301   25      15    26      301   29      15    30      301
c-fat200-1    12*     2     12*     10    12*     2     12*     10    12*     2     12*     10
c-fat200-2    24*     2     24*     9     24*     2     24*     9     24*     2     24*     10
c-fat200-5    58*     2     58*     7     58*     2     58*     7     58*     2     58*     7
c-fat500-1    14*     20    14*     225   14*     19    14*     226   14*     19    14*     226
c-fat500-2    26*     19    26*     210   26*     18    26*     210   26*     19    26*     212
c-fat500-5    64*     16    64*     191   64*     16    64*     185   64*     16    64*     194
c-fat500-10   126*    13    126*    146   126*    13    126*    150   126*    13    126*    152
hamming6-2    32*     0     32*     0     32*     0     32*     0     32      0     32      0
hamming6-4    4       0     4       0     8*      0     8*      0     8       0     8       0
hamming8-2    128*    0     128*    2     128*    0     128*    2     128     0     128     2
hamming8-4    16*     1     16*     8     16      1     16      8     16      1     16      8
hamming10-2   512*    3     512*    65    512     3     512     67    512     3     512     74
hamming10-4   43      12    43      281   64      12    64      288   63      12    64      297
johnson8-2-4  4       0     4       0     8*      0     8*      0     9*      0     9*      0
johnson8-4-4  14      0     14      0     14      0     14      0     14      0     14      0
keller4       15*     1     15*     3     18      1     18      2     20      1     20      2
MANN-a9       22      0     22      0     30      0     30      0     36*     0     36*     0
MANN-a27      218     2     218     14    258     3     260     29    250     2     257     17
MANN-a45      646     35    646     859   762     76    762     1748  756     21    756     540
p-hat300-1    9       4     9       28    11      4     11      28    12      4     13      28
p-hat300-2    30      3     30      20    34      3     34      19    39      3     39      20
p-hat300-3    42      1     43      10    49      1     49      10    53      2     55      11
p-hat700-1    10      33    12      537   13      33    14      555   16      32    16      529
p-hat700-2    50      19    51      316   58      19    58      320   65      19    66      321
p-hat700-3    70      8     71      140   82      9     84      140   92      9     95      141
san200-0.7-2  26      1     26      3     36      1     36      4     48      1     48      4
san200-0.9-1  90      0     90      2     125*    0     125*    2     125     0     125     2
COMP-GEOM-0   16      1439  22*     3600  16      1551  22*     3600  16      1545  22      3600
COMP-GEOM-1   8       1489  10*     3600  8       1597  11*     3600  8       1531  11      3600
COMP-GEOM-2   8       1529  8*      3600  8       1543  8       3600  8       1468  8       3600
ERDOS-97-1    8*      18    8*      3600  9*      17    9*      3600  10      17    10      3600
ERDOS-97-2    8*      2506  8*      3506  9*      2433  9*      3600  10      2483  10      3600
ERDOS-98-1    8*      17    8*      3600  9*      17    9*      3600  10      17    10      3600
ERDOS-98-2    8*      2473  8*      3600  9*      2698  9*      3600  10      2632  10      3600
ERDOS-99-1    8*      18    8*      3600  9*      18    9*      3600  10      18    10      3600
ERDOS-99-2    8*      2103  8*      3600  9*      2319  9*      3600  10      2746  10      3600
* optimal

function basicClique(U, K)
 1. while U ≠ ∅
 2.    if |K| + |U| ≤ max
 3.       return
 4.    end
 5.    U = U \ {v} for some v ∈ U
 6.    basicClique(U ∩ N_G(v), K ∪ {v})
 7. end
 8. if |K| > max
 9.    max = |K|
10. end
11. return
Figure 5: Basic Clique Algorithm
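For readers who prefer executable code, Figure 5 translates almost line by line into Python. The sketch below is illustrative: it assumes an adjacency-set representation, tracks only the size of K, and replaces the global variable max with a closure variable named `best`.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # assumed representation: vertex -> set of neighbours


def max_clique_size(adj: Graph) -> int:
    """Branch and bound following Figure 5; returns omega(G)."""
    best = 0  # plays the role of the global variable max

    def search(candidates: Set[int], size: int) -> None:
        nonlocal best
        candidates = set(candidates)  # local copy, as U is modified in place
        while candidates:
            if size + len(candidates) <= best:    # Lines 2-4: prune
                return
            v = candidates.pop()                  # Line 5
            search(candidates & adj[v], size + 1)  # Line 6
        if size > best:                           # Lines 8-10
            best = size

    search(set(adj), 0)
    return best


# Illustrative graphs: the 5-cycle and K4 with a pendant vertex.
cycle5: Graph = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
k4_plus: Graph = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4}, 4: {3}}
print(max_clique_size(cycle5), max_clique_size(k4_plus))  # 2 4
```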

4.1. Algorithm Type 1

Consider the standard clique algorithm shown in Figure 5. At any point, the algorithm is constructing a complete graph K. The candidate set, U ⊆ V \ K, contains all vertices v such that K ∪ {v} is complete. In other words, U := ⋂_{v∈K} N_G(v). The global variable max stores the cardinality of the largest clique found. To find a maximum clique, initialize max = 0 and make the function call basicClique(V, ∅).

This clique algorithm generalizes to find maximum k-plexes. The main difference is that the candidate set U for a k-plex K is no longer ⋂_{v∈K} N_G(v). It is now defined as U := {v ∈ V \ K : K ∪ {v} is a k-plex}.

Figure 6 shows the basic k-plex algorithm. To find a maximum k-plex, initialize max = 0 and make the function call basicPlex(V, ∅). Table 5 contains computational results obtained by running basicPlex on the benchmark graphs with a one hour time limit.

Table 5: basicPlex Results

G             ω2(G) seconds BBN        ω3(G) seconds BBN        ω4(G) seconds BBN
brock200-1    25    ≥3600   172822699  28    ≥3600   182056437  31    ≥3600   180633250
brock200-2    13*   166     9759381    15    ≥3600   199178608  17    ≥3600   192281927
brock200-4    20    ≥3600   193074734  22    ≥3600   199289654  25    ≥3600   193677120
brock400-2    27    ≥3600   169761253  31    ≥3600   162264447  34    ≥3600   155153487
brock400-4    27    ≥3600   160979618  32    ≥3600   146807899  36    ≥3600   145283598
brock800-2    23    ≥3600   134201916  25    ≥3600   139748190  28    ≥3600   124918468
brock800-4    23    ≥3600   133857528  24    ≥3600   144247348  27    ≥3600   127834010
c-fat200-1    12*   0       975        12*   4       58324      12*   170     2025883
c-fat200-2    24*   0       7308       24*   5       115832     24*   226     3827925
c-fat200-5    58*   3112    86024721   58    ≥3600   104935293  58    ≥3600   108945815
c-fat500-1    14*   1       2712       14*   115     364617     14    ≥3600   6098258
c-fat500-2    26*   1       31068      26*   126     818322     26    ≥3600   14272751
c-fat500-5    64    ≥3600   84968699   64    ≥3600   94102915   64    ≥3600   92359122
c-fat500-10   126   ≥3600   39813170   126   ≥3600   45937780   126   ≥3600   45046647
hamming6-2    32*   506     26461612   32    ≥3600   244753572  37    ≥3600   261840105
hamming6-4    6*    0       4709       8*    1       71069      10*   9       849851
hamming8-2    128   ≥3600   39716014   128   ≥3600   43138327   128   ≥3600   50079738
hamming8-4    16    ≥3600   237558610  18    ≥3600   222683938  22    ≥3600   230542048
hamming10-2   512   ≥3600   3595516    512   ≥3600   3790553    512   ≥3600   3877853
hamming10-4   32    ≥3600   146893539  43    ≥3600   132802297  64    ≥3600   54961531
johnson8-2-4  5*    0       1666       8*    0       12837      9*    0       104984
johnson8-4-4  14*   110     11542436   18    ≥3600   350491163  22    ≥3600   342248079
keller4       15    ≥3600   247583422  20    ≥3600   207711375  22    ≥3600   258859895
MANN-a9       26*   66      5585820    36*   2       106834     36*   278     25470013
MANN-a27      234   ≥3600   79044110   351   ≥3600   7146812    351   ≥3600   10158168
MANN-a45      660   ≥3600   19339018   990   ≥3600   1022834    990   ≥3600   1283088
p-hat300-1    10*   14      665249     12*   1111    40704167   14    ≥3600   128042727
p-hat300-2    29    ≥3600   167764775  35    ≥3600   162883168  41    ≥3600   154569677
p-hat300-3    42    ≥3600   145501695  51    ≥3600   145528614  57    ≥3600   139965092
p-hat700-1    13*   1887    55769755   14    ≥3600   110462323  16    ≥3600   98797915
p-hat700-2    49    ≥3600   116066244  58    ≥3600   116785628  65    ≥3600   117405227
p-hat700-3    69    ≥3600   105454118  81    ≥3600   105553105  93    ≥3600   97553599
san200-0.7-2  24    ≥3600   441219398  36    ≥3600   395072520  48    ≥3600   330336652
san200-0.9-1  90    ≥3600   107493877  125   ≥3600   35590748   125   ≥3600   39843163
COMP-GEOM-0   21    ≥3600   8346       22    ≥3600   4672       22    ≥3600   6269
COMP-GEOM-1   10    ≥3600   754265     11    ≥3600   313457     12    ≥3600   21132
COMP-GEOM-2   8     ≥3600   647635     10    ≥3600   423436     11    ≥3600   8674
ERDOS-97-1    8*    10      1390       9*    1122    198148     4     ≥3600   339906
ERDOS-97-2    3     ≥3600   737        3     ≥3600   480        4     ≥3600   470
ERDOS-98-1    8*    9       1424       9*    1250    204612     4     ≥3600   303239
ERDOS-98-2    3     ≥3600   650        3     ≥3600   420        4     ≥3600   410
ERDOS-99-1    8*    10      1463       9*    1323    211967     4     ≥3600   292341
ERDOS-99-2    3     ≥3600   591        3     ≥3600   376        4     ≥3600   369
* optimal

Without Lines 2-4, the clique algorithm examines every clique in G. Recall that G can contain an exponential, with respect to |V|, number of cliques (Moon and Moser 1965). Lines 2-4 attempt to avoid enumeration of an exponential number of subgraphs. This is known as pruning the search tree. Although there may exist graphs which require the enumeration of an exponential number of cliques, pruning can reduce the runtime.

The basic clique algorithm has many variants (Régin 2003, Sewell 1998, Tomita and Seki 2003, Wood 1997). Many researchers focus on improving the pruning strategy using the coloring bound. A coloring heuristic provides an upper bound on ω(G[U]) and has the potential to prune a larger portion of the search tree because χ(G[U]) ≤ |U|. Figure 7 shows an algorithm which uses co-k-plex coloring to prune the search tree.

Let k-plex1a and k-plex1b denote the functions obtained by using ICCH and FCCH, respectively, to execute Line 2 of k-plex1. ICCH and FCCH are discussed in Section 2. To find a maximum k-plex in G, run LB1 to find an initial value for max and make the function call to k-plex1a(V, ∅) or k-plex1b(V, ∅). Tables 6 and 7 contain computational results obtained by running these algorithms on the benchmark graphs with a one hour time limit.

4.2. Algorithm Type 2

The algorithm in this subsection is based on the following idea of Östergård (2002). Let V = {v_1, ..., v_n} and S_i = {v_i, ..., v_n}. The algorithm in Figure 5 searches for the largest clique in S_1 containing v_1, the largest clique in S_2 containing v_2, and so on. Östergård suggests reversing this order. That is, search S_n for the largest clique containing v_n, S_{n−1} for the largest clique containing v_{n−1}, and so on. Let c(i) be the size of the largest clique in S_i. Clearly, c(n) = 1 and c(1) = ω(G). Moreover, c(i) ∈ {c(i + 1), c(i + 1) + 1} for i = 1, ..., n − 1.

Figure 8 shows Östergård's clique algorithm. The search order allows for the following pruning strategy. Given candidate set U, let i = min{j : v_j ∈ U}. Notice that U ⊆ S_i and hence ω(G[U]) ≤ c(i). This new bound is used in Line 10 in Figure 8.

Östergård's algorithm adapts to find maximum k-plexes with two modifications. First,

function basicPlex(U, K)
 1. while U ≠ ∅
 2.    if |K| + |U| ≤ max
 3.       return
 4.    end
 5.    U = U \ {v} for some v ∈ U
 6.    U' := {u ∈ U : K ∪ {v, u} is a k-plex}
 7.    basicPlex(U', K ∪ {v})
 8. end
 9. if |K| > max
10.    max = |K|
11. end
12. return
Figure 6: Basic k-plex Algorithm
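The basic k-plex search can be sketched in Python along the same lines as the clique algorithm. This is an illustrative reading under stated assumptions: graphs are adjacency sets, the global variable max becomes the closure variable `best`, the chosen vertex is added only to the set passed to the recursive call, and candidate filtering recomputes the k-plex test from scratch rather than maintaining degrees incrementally.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # assumed representation: vertex -> set of neighbours


def is_kplex(adj: Graph, members: Set[int], k: int) -> bool:
    """delta(G[K]) >= |K| - k."""
    return all(len(adj[v] & members) >= len(members) - k for v in members)


def max_kplex_size(adj: Graph, k: int) -> int:
    """Branch and bound in the style of the basic k-plex algorithm; returns omega_k(G)."""
    best = 0  # plays the role of the global variable max

    def search(candidates: Set[int], plex: Set[int]) -> None:
        nonlocal best
        candidates = set(candidates)
        while candidates:
            if len(plex) + len(candidates) <= best:  # prune
                return
            v = candidates.pop()
            grown = plex | {v}
            # keep only candidates that still extend the enlarged k-plex
            nxt = {u for u in candidates if is_kplex(adj, grown | {u}, k)}
            search(nxt, grown)
        if len(plex) > best:
            best = len(plex)

    search(set(adj), set())  # every single vertex is a k-plex for k >= 1
    return best


# Illustrative graphs: the 5-cycle and the 4-cycle.
cycle5: Graph = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
cycle4: Graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(max_kplex_size(cycle5, 1), max_kplex_size(cycle5, 2))  # 2 3
print(max_kplex_size(cycle4, 2))                             # 4
```

With k = 1 the candidate filter reduces to intersecting with N_G(v), so the sketch coincides with the clique search in that case.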

Table 6: k-plex1a Results

G             ω2(G) seconds BBN        ω3(G) seconds BBN        ω4(G) seconds BBN
brock200-1    25    ≥3600   86030174   28    ≥3600   95147581   31    ≥3600   97664910
brock200-2    13*   289     8663613    15    ≥3600   100542983  17    ≥3600   90143057
brock200-4    19    ≥3600   102282008  22    ≥3600   106907189  25    ≥3600   98966956
brock400-2    27    ≥3600   95110220   31    ≥3600   88619566   35    ≥3600   81344425
brock400-4    33    ≥3600   51472394   33    ≥3600   74724985   36    ≥3600   80938337
brock800-2    23    ≥3600   94857693   26    ≥3600   81559009   29    ≥3600   71154628
brock800-4    23    ≥3600   96369941   25    ≥3600   90097273   29    ≥3600   72735746
c-fat200-1    12*   2       873        12*   27      57730      12*   922     1960167
c-fat200-2    24*   3       5269       24*   28      109070     24*   909     3385277
c-fat200-5    58    ≥3600   19298000   58    ≥3600   24247513   58    ≥3600   28799458
c-fat500-1    14*   32      2293       14*   1580    357420     14    ≥3600   545917
c-fat500-2    26*   33      20601      26*   1617    787649     26    ≥3600   1224327
c-fat500-5    64    ≥3600   14631858   64    ≥3600   21078075   64    ≥3600   24930408
c-fat500-10   126   ≥3600   5104395    126   ≥3600   7643111    126   ≥3600   9068983
hamming6-2    32*   0       0          32    ≥3600   92535097   37    ≥3600   119263227
hamming6-4    6*    0       3668       8*    1       59533      10*   15      701425
hamming8-2    128*  0       0          128   ≥3600   14422543   128   ≥3600   20007766
hamming8-4    16    ≥3600   158903409  18    ≥3600   149846604  22    ≥3600   157055208
hamming10-2   512*  1       0          512   ≥3600   434296     512   ≥3600   456014
hamming10-4   43    ≥3600   44661342   64    ≥3600   21254123   64    ≥3600   33084666
johnson8-2-4  5*    0       1585       8*    0       12378      9*    1       104804
johnson8-4-4  14*   138     7755953    18    ≥3600   191111049  22    ≥3600   172931195
keller4       15    ≥3600   147002319  20    ≥3600   128108327  22    ≥3600   169102280
MANN-a9       26*   123     4111457    36*   6       102896     36*   739     25470013
MANN-a27      234   ≥3600   78820556   351   ≥3600   383569     351   ≥3600   781334
MANN-a45      660   ≥3600   18866263   990   ≥3600   18029      990   ≥3600   53068
p-hat300-1    10*   33      562727     12*   2827    39631513   14    ≥3600   54618899
p-hat300-2    30    ≥3600   77146967   35    ≥3600   84162645   41    ≥3600   84779804
p-hat300-3    42    ≥3600   73906134   50    ≥3600   70787196   57    ≥3600   70408792
p-hat700-1    13*   3186    46290951   14    ≥3600   56967864   16    ≥3600   43437614
p-hat700-2    50    ≥3600   51487369   58    ≥3600   56760785   65    ≥3600   61159187
p-hat700-3    70    ≥3600   42921661   82    ≥3600   46670755   92    ≥3600   48056462
san200-0.7-2  26    ≥3600   247380139  36    ≥3600   368232192  48    ≥3600   301494773
san200-0.9-1  90    ≥3600   64534714   125   ≥3600   5514998    125   ≥3600   6899702
COMP-GEOM-0   22*   1218    3301       22    ≥3600   18785      22    ≥3600   173911
COMP-GEOM-1   10*   1205    2366       11    ≥3600   110109     11    ≥3600   122167
COMP-GEOM-2   8*    1220    1211       8     ≥3600   14166      8     ≥3600   6208
ERDOS-97-1    8*    313     954        9     ≥3600   5742       10    ≥3600   5571
ERDOS-97-2    8     ≥3600   2698       9     ≥3600   1494       10    ≥3600   1685
ERDOS-98-1    8*    427     1016       9     ≥3600   6008       10    ≥3600   5876
ERDOS-98-2    8     ≥3600   2359       9     ≥3600   1086       10    ≥3600   1771
ERDOS-99-1    8*    406     472        9     ≥3600   6248       10    ≥3600   5135
ERDOS-99-2    8     ≥3600   2913       9     ≥3600   943        10    ≥3600   1544
* optimal

Table 7: k-plex1b Results

| G | ω_2(G) | seconds | BBN | ω_3(G) | seconds | BBN | ω_4(G) | seconds | BBN |
|---|---|---|---|---|---|---|---|---|---|
| brock200-1 | 25 | ≥3600 | 10054924 | 28 | ≥3600 | 9610060 | 31 | ≥3600 | 11595841 |
| brock200-2 | 13∗ | 1778 | 7722362 | 15 | ≥3600 | 15728716 | 17 | ≥3600 | 15883075 |
| brock200-4 | 19 | ≥3600 | 12056663 | 22 | ≥3600 | 11842251 | 24 | ≥3600 | 13308843 |
| brock400-2 | 27 | ≥3600 | 3298972 | 31 | ≥3600 | 2506819 | 35 | ≥3600 | 3084792 |
| brock400-4 | 33 | ≥3600 | 2879018 | 33 | ≥3600 | 2769768 | 36 | ≥3600 | 3339363 |
| brock800-2 | 22 | ≥3600 | 847946 | 26 | ≥3600 | 687075 | 29 | ≥3600 | 719475 |
| brock800-4 | 22 | ≥3600 | 846155 | 25 | ≥3600 | 708185 | 29 | ≥3600 | 771036 |
| c-fat200-1 | 12∗ | 3 | 802 | 12∗ | 41 | 57612 | 12∗ | 1397 | 1958425 |
| c-fat200-2 | 24∗ | 3 | 2050 | 24∗ | 60 | 96754 | 24 | ≥3600 | 1586893 |
| c-fat200-5 | 58∗ | 836 | 444241 | 58 | ≥3600 | 3960477 | 58 | ≥3600 | 5319030 |
| c-fat500-1 | 14∗ | 34 | 2060 | 14∗ | 1796 | 356744 | 14 | ≥3600 | 427000 |
| c-fat500-2 | 26∗ | 39 | 10177 | 26∗ | 2368 | 754350 | 26 | ≥3600 | 797974 |
| c-fat500-5 | 64∗ | 1579 | 557081 | 64 | ≥3600 | 2800459 | 64 | ≥3600 | 3453142 |
| c-fat500-10 | 126 | ≥3600 | 327032 | 126 | ≥3600 | 974543 | 126 | ≥3600 | 1193712 |
| hamming6-2 | 32∗ | 0 | 0 | 32 | ≥3600 | 6160422 | 36 | ≥3600 | 44987310 |
| hamming6-4 | 6∗ | 0 | 3380 | 8∗ | 3 | 58663 | 10∗ | 33 | 693982 |
| hamming8-2 | 128∗ | 0 | 0 | 128 | ≥3600 | 636791 | 128 | ≥3600 | 2835717 |
| hamming8-4 | 16 | ≥3600 | 9945892 | 18 | ≥3600 | 9748498 | 18 | ≥3600 | 10199713 |
| hamming10-2 | 512∗ | 1 | 0 | 512 | ≥3600 | 7508 | 512 | ≥3600 | 29602 |
| hamming10-4 | 43 | ≥3600 | 323816 | 64 | ≥3600 | 370717 | 64 | ≥3600 | 236250 |
| johnson8-2-4 | 5∗ | 0 | 1585 | 8∗ | 0 | 12337 | 9∗ | 2 | 104679 |
| johnson8-4-4 | 14∗ | 475 | 6389736 | 18 | ≥3600 | 48544486 | 22 | ≥3600 | 52307238 |
| keller4 | 15 | ≥3600 | 18263136 | 20 | ≥3600 | 17714412 | 22 | ≥3600 | 20281149 |
| MANN-a9 | 26∗ | 395 | 3240597 | 36∗ | 15 | 100969 | 36∗ | 1733 | 25470013 |
| MANN-a27 | 234 | ≥3600 | 3053292 | 351 | ≥3600 | 78758 | 351 | ≥3600 | 253074 |
| MANN-a45 | 660 | ≥3600 | 28446 | 990 | ≥3600 | 7035 | 990 | ≥3600 | 20406 |
| p-hat300-1 | 10∗ | 139 | 500766 | 12 | ≥3600 | 12816637 | 14 | ≥3600 | 12577276 |
| p-hat300-2 | 30 | ≥3600 | 7335988 | 35 | ≥3600 | 6439794 | 40 | ≥3600 | 8083792 |
| p-hat300-3 | 42 | ≥3600 | 5279160 | 50 | ≥3600 | 5693325 | 57 | ≥3600 | 5957872 |
| p-hat700-1 | 12 | ≥3600 | 2617164 | 13 | ≥3600 | 2557821 | 16 | ≥3600 | 2516929 |
| p-hat700-2 | 50 | ≥3600 | 1176955 | 58 | ≥3600 | 1407568 | 65 | ≥3600 | 1459008 |
| p-hat700-3 | 70 | ≥3600 | 750652 | 82 | ≥3600 | 1007772 | 92 | ≥3600 | 1001573 |
| san200-0.7-2 | 26 | ≥3600 | 12473403 | 36 | ≥3600 | 16047112 | 48 | ≥3600 | 16009322 |
| san200-0.9-1 | 90 | ≥3600 | 9748246 | 125 | ≥3600 | 970390 | 125 | ≥3600 | 1705032 |
| COMP-GEOM-0 | 22∗ | 2202 | 2797 | 22 | ≥3600 | 15368 | 22 | ≥3600 | 71325 |
| COMP-GEOM-1 | 10∗ | 2075 | 2013 | 11 | ≥3600 | 10073 | 11 | ≥3600 | 20237 |
| COMP-GEOM-2 | 8∗ | 2731 | 1193 | 8 | ≥3600 | 4102 | 8 | ≥3600 | 6028 |
| ERDOS-97-1 | 8∗ | 370 | 869 | 9 | ≥3600 | 4928 | 10 | ≥3600 | 4961 |
| ERDOS-97-2 | 8 | ≥3600 | 574 | 9 | ≥3600 | 702 | 10 | ≥3600 | 671 |
| ERDOS-98-1 | 8∗ | 414 | 953 | 9 | ≥3600 | 4497 | 10 | ≥3600 | 4506 |
| ERDOS-98-2 | 8 | ≥3600 | 449 | 9 | ≥3600 | 492 | 10 | ≥3600 | 487 |
| ERDOS-99-1 | 8∗ | 438 | 989 | 9 | ≥3600 | 4379 | 10 | ≥3600 | 4376 |
| ERDOS-99-2 | 8 | ≥3600 | 592 | 9 | ≥3600 | 633 | 10 | ≥3600 | 615 |

∗ optimal

function k-plex1(U, K)
    while U ≠ ∅
        Compute χ̃_k(G[U]) ≥ χ_k(G[U])
        if |K| + χ̃_k(G[U]) ≤ max
            return
        end
        K = K ∪ {v}; U = U \ {v} for some v ∈ U
        U' := {u ∈ U : K ∪ {u} is a k-plex}
        k-plex1(U', K)
    end
    if |K| > max
        max = |K|
    end
    return

Figure 7: k-plex Algorithm

define c_k(i) = ω_k(G[S_i]). Second, the candidate set with respect to K is defined as U := {v ∈ V \ K : K ∪ {v} is a k-plex}. Figure 9 shows k-plex2. Table 8 contains computational results obtained by running k-plex2 on the benchmark graphs with a one-hour time limit.
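As a hedged illustration of the candidate set defined above and of the branch-and-bound scheme in Figure 7, the following Python sketch may be useful. It assumes the standard definition that K is a k-plex when every vertex of K has at least |K| − k neighbors in K, stores the graph as a dictionary of neighbor sets, and substitutes the trivial bound |U| for the coloring bound χ̃_k. The names `is_kplex`, `candidate_set`, `trivial_bound`, and `max_kplex` are hypothetical and are not taken from the authors' implementation. The sketch also passes K ∪ {v} to the recursive call instead of updating K in place, which is one reading of the pseudocode rather than a confirmed detail.

```python
from typing import Callable, Dict, Set

Graph = Dict[int, Set[int]]  # vertex -> set of neighbours (assumed representation)


def is_kplex(adj: Graph, K: Set[int], k: int) -> bool:
    """Every vertex of K has at least |K| - k neighbours inside K."""
    return all(len(adj[v] & K) >= len(K) - k for v in K)


def candidate_set(adj: Graph, K: Set[int], pool: Set[int], k: int) -> Set[int]:
    """Vertices u of pool (outside K) such that K plus u is still a k-plex."""
    return {u for u in pool - K if is_kplex(adj, K | {u}, k)}


def trivial_bound(adj: Graph, U: Set[int], k: int) -> int:
    """Placeholder for the coloring bound: a k-plex in G[U] has at most |U| vertices."""
    return len(U)


def max_kplex(adj: Graph, k: int,
              bound: Callable[[Graph, Set[int], int], int] = trivial_bound) -> Set[int]:
    best: Set[int] = set()  # incumbent, plays the role of "max" in Figure 7

    def expand(U: Set[int], K: Set[int]) -> None:
        nonlocal best
        U = set(U)  # local copy, so the caller's set is left untouched
        while U:
            if len(K) + bound(adj, U, k) <= len(best):
                return  # no extension of K can beat the incumbent
            v = min(U)  # "some v in U"; smallest label chosen for determinism
            U.remove(v)
            expand(candidate_set(adj, K | {v}, U, k), K | {v})
        if len(K) > len(best):
            best = set(K)

    expand(set(adj), set())
    return best


# 5-cycle: the largest clique has 2 vertices, the largest 2-plex has 3.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
clique = max_kplex(c5, 1)
two_plex = max_kplex(c5, 2)
```

Any function with the signature of `trivial_bound` that returns a valid upper bound on the k-plex number of G[U] can be passed as `bound`, which is where a coloring heuristic would plug in.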

5. Conclusions and Future Work

This paper describes combinatorial algorithms for finding maximum k-plexes in a graph. Section 2 focuses on co-k-plex coloring heuristics, which provide an upper bound on the k-plex number. Section 3 discusses a heuristic for finding maximum k-plexes, which provides a lower bound on the k-plex number. Section 4 describes exact algorithms for finding maximum k-plexes. Table 9 summarizes the number of instances solved to optimality by each exact algorithm. The first three exact algorithms perform similarly within the one-hour time limit. Although algorithms of this type appear to benefit from the upper and lower bound heuristics, they solve a relatively small number of instances to optimality. This suggests that the coloring heuristics might not produce tight upper bounds, so an interesting avenue for future work would be to develop stronger coloring heuristics.
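To make the idea of a coloring-based upper bound concrete, here is a minimal Python sketch under stated assumptions. It takes a co-k-plex to be a vertex set whose induced subgraph has maximum degree at most k − 1, and it partitions the vertices by a plain first-fit rule, which is not claimed to be ICCH or FCCH. The cap 2k − 2 + (k mod 2) on the overlap between a k-plex and a co-k-plex is derived here by degree counting: a set S that is both has every degree between |S| − k and k − 1, so |S| ≤ 2k − 1, and the degree sum (2k − 1)(k − 1) is odd when k is even, which rules out |S| = 2k − 1 in that case. Whether this matches the exact bound of Section 2 is an assumption. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set

Graph = Dict[int, Set[int]]  # vertex -> set of neighbours (assumed representation)


def is_co_kplex(adj: Graph, S: Set[int], k: int) -> bool:
    """The subgraph induced by S has maximum degree at most k - 1."""
    return all(len(adj[v] & S) <= k - 1 for v in S)


def first_fit_partition(adj: Graph, k: int,
                        order: Optional[Sequence[int]] = None) -> List[Set[int]]:
    """Place each vertex into the first class that stays a co-k-plex."""
    classes: List[Set[int]] = []
    for v in (sorted(adj) if order is None else order):
        for C in classes:
            if is_co_kplex(adj, C | {v}, k):
                C.add(v)
                break
        else:
            classes.append({v})  # no existing class accepts v
    return classes


def partition_bound(classes: List[Set[int]], k: int) -> int:
    """Upper bound on the k-plex number implied by a co-k-plex partition."""
    cap = 2 * k - 2 + (k % 2)  # largest set that is both a k-plex and a co-k-plex
    return sum(min(len(C), cap) for C in classes)


# 5-cycle example: the bound is 3 for k = 1 and 4 for k = 2.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
classes_k1 = first_fit_partition(c5, 1)
classes_k2 = first_fit_partition(c5, 2)
bound_k1 = partition_bound(classes_k1, 1)
bound_k2 = partition_bound(classes_k2, 2)
```

For k = 1 the cap equals 1 and the bound reduces to the number of color classes of an ordinary coloring. On the 5-cycle the true values are 2 and 3, so both bounds are valid but not tight, which is the kind of gap the paragraph above refers to.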

function OsterClique(U, K)
    if |U| = 0
        if |K| > max
            max = |K|
            found = true
        end
        return
    end
    while U ≠ ∅
        if |K| + |U| ≤ max
            return
        end
        i = min{j : v_j ∈ U}
        if |K| + c(i) ≤ max
            return
        end
        U = U \ {v_i}
        OsterClique(U ∩ N_G(v_i), K ∪ {v_i})
        if found = true
            return
        end
    end
    return

function findClique
    max = 0
    for i = n down to 1
        found = false
        OsterClique(S_i ∩ N_G(v_i), {v_i})
        c(i) = max
    end
    return

Figure 8: Östergård's Clique Algorithm
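The scheme of Figure 8 can be sketched in Python roughly as follows. This is a minimal sketch, assuming the graph is a dictionary of neighbor sets and that `order` lists the vertices as v_1, ..., v_n, so that S_i corresponds to `order[i:]` with 0-based indices. The function name `ostergard_clique_number` is hypothetical, and the sketch tracks only the clique size rather than the clique itself.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]  # vertex -> set of neighbours (assumed representation)


def ostergard_clique_number(adj: Graph, order: List[int]) -> Tuple[int, List[int]]:
    """Return (clique number, c) where c[i] is the clique number of G[order[i:]]."""
    n = len(order)
    pos = {v: i for i, v in enumerate(order)}
    c = [0] * n
    best = 0       # "max" in Figure 8
    found = False

    def expand(U: List[int], size: int) -> None:
        # U is kept sorted by position in `order`; size is |K|.
        nonlocal best, found
        if not U:
            if size > best:
                best = size
                found = True
            return
        while U:
            if size + len(U) <= best:
                return
            v = U[0]                      # vertex of smallest index in U
            if size + c[pos[v]] <= best:  # bound from an earlier, smaller subproblem
                return
            U = U[1:]
            expand([u for u in U if u in adj[v]], size + 1)
            if found:
                return

    for i in range(n - 1, -1, -1):
        found = False
        v = order[i]
        expand([u for u in order[i + 1:] if u in adj[v]], 1)
        c[i] = best  # recorded inside the loop so later iterations can use it
    return best, c


# Small checks: the 5-cycle and the complete graph on four vertices.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
result_c5 = ostergard_clique_number(c5, [0, 1, 2, 3, 4])
result_k4 = ostergard_clique_number(k4, [0, 1, 2, 3])
```

The early exit on `found` relies on the fact that c(i) can exceed c(i + 1) by at most one, so a single improvement settles the subproblem for S_i.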


function OsterPlex(U, K)
    if |U| = 0
        if |K| > max
            max = |K|
            found = true
        end
        return
    end
    while U ≠ ∅
        if |K| + |U| ≤ max
            return
        end
        i = min{j : v_j ∈ U}
        if |K| + c_k(i) ≤ max
            return
        end
        K = K ∪ {v_i}; U = U \ {v_i}
        U' := {u ∈ U : K ∪ {u} is a k-plex}
        OsterPlex(U', K)
        if found = true
            return
        end
    end
    return

function k-plex2
    max = 0
    for i = n down to 1
        found = false
        OsterPlex(S_i, {v_i})
        c_k(i) = max
    end
    return

Figure 9: Östergård's Algorithm Adapted for k-plexes

A natural approach for improving the co-k-plex coloring heuristics would be to generalize the DSATUR graph coloring heuristic (Brélaz 1979). The DSATUR heuristic dynamically determines the order in which vertices are colored. More precisely, it maintains the number of distinct colors adjacent to each uncolored vertex and always colors the vertex with the highest number of adjacent color classes. The number of adjacent color classes defines a vertex's saturation degree. This idea generalizes to co-k-plex coloring. The saturation degree of v is redefined as the number of distinct partition sets C_i such that C_i ∪ {v} is not a

Table 8: k-plex2 Results

| G | ω_2(G) | seconds | BBN | ω_3(G) | seconds | BBN | ω_4(G) | seconds | BBN |
|---|---|---|---|---|---|---|---|---|---|
| brock200-1 | 23 | ≥3600 | 983266826 | 24 | ≥3600 | 1061715139 | 27 | ≥3600 | 1090413176 |
| brock200-2 | 13∗ | 74 | 19636408 | 15 | ≥3600 | 973470713 | 17 | ≥3600 | 1118582507 |
| brock200-4 | 19 | ≥3600 | 917681365 | 20 | ≥3600 | 1078546865 | 21 | ≥3600 | 1158926695 |
| brock400-2 | 23 | ≥3600 | 996101972 | 27 | ≥3600 | 1112675498 | 29 | ≥3600 | 1152442843 |
| brock400-4 | 23 | ≥3600 | 989222371 | 22 | ≥3600 | 1132833870 | 30 | ≥3600 | 1159027166 |
| brock800-2 | 21 | ≥3600 | 926436143 | 20 | ≥3600 | 1077097050 | 23 | ≥3600 | 1142792112 |
| brock800-4 | 20 | ≥3600 | 919220732 | 22 | ≥3600 | 1044131631 | 24 | ≥3600 | 1109841414 |
| c-fat200-1 | 12∗ | 0 | 3677 | 12∗ | 0 | 123687 | 12∗ | 19 | 4452622 |
| c-fat200-2 | 24∗ | 0 | 1895 | 24∗ | 0 | 30227 | 24∗ | 2 | 759423 |
| c-fat200-5 | 58∗ | 0 | 760 | 58∗ | 0 | 6566 | 58∗ | 0 | 88301 |
| c-fat500-1 | 14∗ | 0 | 19733 | 14∗ | 8 | 1183127 | 14∗ | 1416 | 134845416 |
| c-fat500-2 | 26∗ | 0 | 10081 | 26∗ | 2 | 266957 | 26∗ | 91 | 12812559 |
| c-fat500-5 | 64∗ | 0 | 4195 | 64∗ | 0 | 52676 | 64∗ | 6 | 956345 |
| c-fat500-10 | 126∗ | 0 | 2141 | 126∗ | 0 | 18391 | 126∗ | 2 | 221427 |
| hamming6-2 | 32∗ | 0 | 396 | 32∗ | 1 | 195054 | 40∗ | 872 | 226208834 |
| hamming6-4 | 6∗ | 0 | 4526 | 8∗ | 0 | 37113 | 10∗ | 1 | 395729 |
| hamming8-2 | 128∗ | 0 | 7448 | 128∗ | 2744 | 182864232 | 112 | ≥3600 | 808066329 |
| hamming8-4 | 16∗ | 47 | 7982728 | 17 | ≥3600 | 1053193592 | 19 | ≥3600 | 1045849871 |
| hamming10-2 | 512∗ | 33 | 147919 | 448 | ≥3600 | 208033053 | 430 | ≥3600 | 799045131 |
| hamming10-4 | 41 | ≥3600 | 595181620 | 46 | ≥3600 | 1058346589 | 51 | ≥3600 | 1042679457 |
| johnson8-2-4 | 5∗ | 0 | 2621 | 8∗ | 0 | 10472 | 9∗ | 0 | 151028 |
| johnson8-4-4 | 14∗ | 0 | 40896 | 18∗ | 39 | 14014988 | 21 | ≥3600 | 1160638581 |
| keller4 | 15∗ | 1000 | 284120617 | 21 | ≥3600 | 909167108 | 16 | ≥3600 | 1183954679 |
| MANN-a9 | 26∗ | 0 | 53102 | 36∗ | 1 | 375502 | 36∗ | 129 | 36511315 |
| MANN-a27 | 235 | ≥3600 | 33275514 | 351 | ≥3600 | 23112644 | 351 | ≥3600 | 93317993 |
| MANN-a45 | 661 | ≥3600 | 3326395 | 990 | ≥3600 | 2840051 | 990 | ≥3600 | 16526869 |
| p-hat300-1 | 10∗ | 6 | 1561119 | 12∗ | 552 | 128854637 | 13 | ≥3600 | 1001501766 |
| p-hat300-2 | 29 | ≥3600 | 704387512 | 29 | ≥3600 | 909116307 | 33 | ≥3600 | 987023800 |
| p-hat300-3 | 34 | ≥3600 | 850772303 | 37 | ≥3600 | 956121775 | 36 | ≥3600 | 1050444550 |
| p-hat700-1 | 13∗ | 667 | 90760266 | 13 | ≥3600 | 687249461 | 13 | ≥3600 | 964257683 |
| p-hat700-2 | 37 | ≥3600 | 655931022 | 39 | ≥3600 | 804657901 | 42 | ≥3600 | 937295855 |
| p-hat700-3 | 51 | ≥3600 | 820473217 | 43 | ≥3600 | 991432587 | 44 | ≥3600 | 1063865458 |
| san200-0.7-2 | 24 | ≥3600 | 1079696544 | 36 | ≥3600 | 940350668 | 48 | ≥3600 | 862621323 |
| san200-0.9-1 | 90 | ≥3600 | 485362621 | 125∗ | 253 | 64534321 | 50 | ≥3600 | 1006342152 |
| COMP-GEOM-0 | 22∗ | 397 | 2150878 | 5 | ≥3600 | 156104820 | 4 | ≥3600 | 265372393 |
| COMP-GEOM-1 | 10∗ | 1118 | 4875730 | 4 | ≥3600 | 172334737 | 4 | ≥3600 | 267076112 |
| COMP-GEOM-2 | 8∗ | 1145 | 5034335 | 4 | ≥3600 | 162641029 | 4 | ≥3600 | 266562165 |
| ERDOS-97-1 | 8∗ | 0 | 25770 | 9∗ | 19 | 2391684 | 11∗ | 1897 | 256331821 |
| ERDOS-97-2 | 8∗ | 1253 | 12602187 | 3 | ≥3600 | 216016439 | 4 | ≥3600 | 331758265 |
| ERDOS-98-1 | 8∗ | 0 | 27246 | 9∗ | 20 | 2552046 | 11∗ | 1675 | 212106816 |
| ERDOS-98-2 | 8∗ | 1514 | 14265207 | 3 | ≥3600 | 209965048 | 4 | ≥3600 | 318616472 |
| ERDOS-99-1 | 8∗ | 0 | 28035 | 9∗ | 21 | 2632545 | 11∗ | 1783 | 223030325 |
| ERDOS-99-2 | 8∗ | 1757 | 15748963 | 3 | ≥3600 | 205078469 | 4 | ≥3600 | 308498817 |

∗ optimal

co-k-plex. This general version of DSATUR was implemented and tested (McClosky 2008). However, the results did not improve upon ICCH and FCCH.

The final exact algorithm, k-plex2, dominates the Type 1 algorithms with respect to the number of instances solved to optimality. Moreover, k-plex2 converges quickly when it converges at all. However, the final solution can be far from optimal when k-plex2 fails to converge. The algorithm demonstrates this phenomenon on the collaboration networks for k = 3, 4. The reason for this behavior is that the algorithm can spend much of its time optimizing over a subset of V. Consequently, if vertices at the end of the vertex ordering are needed to make large k-plexes, good solutions cannot be found until the entire vertex set is

Table 9: Results Summary

| Algorithm | k=2 | k=3 | k=4 | Total |
|---|---|---|---|---|
| basicPlex | 16 | 11 | 5 | 32 |
| k-plex1a | 20 | 8 | 5 | 33 |
| k-plex1b | 21 | 7 | 4 | 32 |
| k-plex2 | 28 | 18 | 14 | 60 |

processed. This illustrates the importance of vertex ordering for this type of algorithm. Type 1 algorithms spend time at each branch-and-bound node to approximate χ_k(G[U]) for the candidate set U. Unfortunately, χ_k(G) could be an inaccurate bound on ω_k(G) in general. The k-plex2 algorithm spends no time estimating χ_k(G) but benefits from the bound obtained using the c_k array. In any case, the purpose of these bounds is to prune the candidate set. The requirement for membership in the candidate set becomes less stringent as k increases, so the set becomes harder to prune. This contributes to the increase in runtime as k grows.

When comparing these results to those found in Balasundaram et al. (2006), the combinatorial algorithms outperform branch-and-cut on the DIMACS graphs. On the other hand, branch-and-cut works better on the larger social network graphs. This suggests that combinatorial methods are superior for graphs on a few hundred vertices, but branch-and-cut becomes the preferred method as the size of the graphs grows.

Another area for future research is an exact co-k-plex coloring algorithm. The co-k-plex chromatic number is unknown for most of the benchmark graphs. Therefore, it is difficult to evaluate the performance of the co-k-plex coloring heuristics in Section 2. It would also be beneficial to study additional heuristics for both the upper and lower bounds on ω_k(G).
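The way k-plex2 obtains its bound from the c_k array, as discussed above, can be sketched in Python along the following lines. This is a sketch under assumptions, not the authors' code: the graph is a dictionary of neighbor sets, `order` plays the role of v_1, ..., v_n with S_i taken as `order[i:]` (0-based), a set K is treated as a k-plex when every vertex of K has at least |K| − k neighbors in K, and the initial candidate list for each v_i is filtered from the later vertices. The names `is_kplex` and `kplex2` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]  # vertex -> set of neighbours (assumed representation)


def is_kplex(adj: Graph, K: Set[int], k: int) -> bool:
    """Every vertex of K has at least |K| - k neighbours inside K."""
    return all(len(adj[v] & K) >= len(K) - k for v in K)


def kplex2(adj: Graph, order: List[int], k: int) -> Tuple[int, List[int]]:
    """Return (k-plex number, c) where c[i] is the k-plex number of G[order[i:]]."""
    n = len(order)
    pos = {v: i for i, v in enumerate(order)}
    c = [0] * n
    best = 0       # "max" in Figure 9
    found = False

    def expand(U: List[int], K: Set[int]) -> None:
        # U is kept sorted by position in `order`.
        nonlocal best, found
        if not U:
            if len(K) > best:
                best = len(K)
                found = True
            return
        while U:
            if len(K) + len(U) <= best:
                return
            v = U[0]
            # Any extension of K uses only vertices of order[pos[v]:],
            # and those vertices form a k-plex of size at most c[pos[v]].
            if len(K) + c[pos[v]] <= best:
                return
            U = U[1:]
            K2 = K | {v}
            expand([u for u in U if is_kplex(adj, K2 | {u}, k)], K2)
            if found:
                return

    for i in range(n - 1, -1, -1):
        found = False
        v = order[i]
        expand([u for u in order[i + 1:] if is_kplex(adj, {v, u}, k)], {v})
        c[i] = best
    return best, c


# 5-cycle: clique number 2, 2-plex number 3.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
plex1 = kplex2(c5, [0, 1, 2, 3, 4], 1)
plex2 = kplex2(c5, [0, 1, 2, 3, 4], 2)
```

Because the candidate filter only requires that K ∪ {u} remain a k-plex, the lists passed to `expand` grow as k increases, which is consistent with the observation above that pruning becomes harder for larger k.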

Acknowledgments

This research was partially supported by NSF grants DMI-0521209 and DMS-0729251. Svyätoslav Trukhanov has independently generalized Östergård's clique algorithm to find maximum k-plexes (Trukhanov 2008).

References

Applegate, D. and Johnson, D. 1993. dfmax.c. ftp://dimacs.rutgers.edu/pub/challange/graph/solvers/.

Atamtürk, A., Nemhauser, G.L., and Savelsbergh, M.W.P. 2000. Conflict graphs in solving integer programming problems. European Journal of Operational Research 121 40–55.

Balas, E. and Xue, J. 1996. Weighted and unweighted maximum clique algorithms with upper bounds from fractional coloring. Algorithmica 15 397–412.

Balasundaram, B., Butenko, S., Hicks, I.V. and Sachdeva, S. 2006. Clique relaxations in social network analysis: The maximum k-plex problem. Submitted.

Batagelj, V. and Mrvar, A. 2006. Pajek datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/. Accessed 2006.

Beebe, N.H.F. 2002. Nelson H.F. Beebe's Bibliographies Page.

Butenko, S. and Wilhelm, W. 2006. Clique-detection models in computational biochemistry and genomics. European Journal of Operational Research 173 1–17.

Carraghan, R. and Pardalos, P.M. 1990. An exact algorithm for the maximum clique problem. Oper. Res. Lett. 9 375–382.

Chen, Y.P., Liestman, A.L., and Liu, J. 2004. Clustering algorithms for ad hoc wireless networks. Ad Hoc and Sensor Networks (Y. Pan and Y. Xiao eds.), Nova Science Publishers.

DIMACS. 1995. Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge. http://dimacs.rutgers.edu/Challenges/. Accessed 2006.

Festinger, L. 1949. The analysis of sociograms using matrix algebra. Human Relations 10 153–58.

Friedkin, N.E. 1984. Structural cohesion and equivalence explanations of social homogeneity. Sociological Methods & Research 12 235–261.

Jones, B. 2002. Computational Geometry Database, February 2002; FTP / HTTP.

Luce, R. and Perry, A. 1949. A method of matrix analysis of group structure. Psychometrika 14 95–116.

McClosky, B. and Hicks, I.V. 2007. The co-2-plex polytope and integral systems. SIAM Journal on Discrete Mathematics, to appear.

McClosky, B. 2008. Independence Systems and Stable Set Relaxations. Ph.D. thesis, Computational and Applied Mathematics Department, Rice University, Houston, TX.

Moon, J.W. and Moser, L. 1965. On cliques in graphs. Israel J. Math. 3 23–28.

Östergård, P.R.J. 2002. A fast algorithm for the maximum clique problem. Discrete Appl. Math. 120 197–207.

Régin, J.C. 2003. Solving the maximum clique problem with constraint programming. Fifth International Workshop on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems. 166–179.

Seidman, S.B. and Foster, B.L. 1978. A graph theoretic generalization of the clique concept. Journal of Mathematical Sociology 6 139–154.

Sewell, E.C. 1998. A branch and bound algorithm for the stability number of a sparse graph. INFORMS J. Comput. 10 438–447.

Tomita, E. and Seki, T. 2003. An efficient branch-and-bound algorithm for finding a maximum clique. Lecture Notes in Computer Science Series 2731 278–289.

Trukhanov, S. 2008. Personal communication.

Washio, T. and Motoda, H. 2003. State of the art of graph-based data mining. SIGKDD Explor. Newsl. 5(1) 59–68.

Wasserman, S. and Faust, K. 1994. Social Network Analysis. Cambridge University Press.

Wood, D.R. 1997. An algorithm for finding a maximum clique in a graph. Oper. Res. Lett. 21 211–217.