Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

Jiaming Xu
Department of Statistics, The Wharton School, University of Pennsylvania

Joint work with Bruce Hajek (Illinois) and Yihong Wu (Illinois)

June 17, 2015

Community detection in networks

• Networks with community structures arise in many applications
  [Figure: Santa Fe Institute collaboration network, Girvan-Newman ’02]
• Task: Discover underlying communities based on the network topology

Jiaming Xu (Wharton)

Optimal Cluster Recovery via SDP

Stochastic block model [Holland et al. ’83]
Planted partition model [Condon-Karp ’01]

[Figure: example with n = 40, K = 10, r = 3, p = 0.9, q = 0.1]

Exact recovery

$C^* \longrightarrow A \longrightarrow \hat{C}$

• Goal: exact recovery (strong consistency): $P\{\hat{C} = C^*\} \to 1$ as $n \to \infty$
• Alternatives
  - almost exact recovery (weak consistency): [Mossel-Neeman-Sly ’14, Abbe-Sandon ’15, Montanari ’15]...
  - correlated recovery: [Decelle-Krzakala-Moore-Zdeborova ’11, Mossel-Neeman-Sly ’12 ’13, Massoulie ’13]...

Objectives of this talk

• Information limit: When is exact recovery possible (impossible)?
• Is the information limit achievable in polynomial time, e.g., via semidefinite programming?

Remainder of the talk

1. Two equal-sized communities
2. A single community of linear size
3. Extensions and open problems

Two equal-sized communities: Binary symmetric SBM

Model:
• n nodes partitioned into two communities of size $n/2$ ($\sigma_i^* = \pm 1$).
• $i \sim j$ independently w.p.
  $p = a \log n / n$ if $\sigma_i^* = \sigma_j^*$,
  $q = b \log n / n$ if $\sigma_i^* \neq \sigma_j^*$.

Remarks
• $a + b > 2$ is the connectivity threshold and necessary for exact recovery
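A minimal sketch of sampling from the binary symmetric SBM defined above, using only the Python standard library. The function name `sample_binary_sbm`, the `seed` parameter, and the choice to label the first n/2 nodes +1 are assumptions made for illustration; they are not part of the talk.

```python
import math
import random

def sample_binary_sbm(n, a, b, seed=0):
    """Sample (sigma, A) from a binary symmetric SBM.

    Assumes n is even. Edges appear independently with probability
    p = a*log(n)/n inside a community and q = b*log(n)/n across.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    rng = random.Random(seed)
    p = min(1.0, a * math.log(n) / n)
    q = min(1.0, b * math.log(n) / n)
    # Illustrative labeling: first half +1, second half -1.
    sigma = [1] * (n // 2) + [-1] * (n // 2)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if sigma[i] == sigma[j] else q
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1  # symmetric, zero diagonal
    return sigma, A
```

For example, `sample_binary_sbm(200, 9.0, 1.0)` returns a label vector that sums to zero and a symmetric 0/1 adjacency matrix.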

Two equal-sized communities: MLE ⇒ SDP relaxation

• Maximum likelihood estimator (MLE): Assume $p \ge q$

  $\max_{\sigma} \langle A, \sigma\sigma^\top \rangle$   (→ # of in-cluster edges)
  s.t. $\sigma_i \in \{\pm 1\}$, $i \in [n]$
       $\sigma^\top \mathbf{1} = 0$

• Lift $Y = \sigma\sigma^\top$:

  $\max_{Y} \langle A, Y \rangle$
  s.t. $\mathrm{rank}(Y) = 1$
       $Y_{ii} = 1$, $i \in [n]$
       $\langle J, Y \rangle = 0$

• SDP relaxation: the rank-one constraint is replaced by $Y \succeq 0$:

  $\max_{Y} \langle A, Y \rangle$
  s.t. $Y \succeq 0$
       $Y_{ii} = 1$, $i \in [n]$
       $\langle J, Y \rangle = 0$

• Goal: $P\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \right\} \to 1$
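To make the MLE and the lifting step concrete, here is a small brute-force sketch in plain Python. It enumerates balanced ±1 labelings of a tiny graph, which takes exponential time and is only meant as an illustration; it is not the SDP. The names `brute_force_mle` and `lift` and the toy graph are assumptions introduced here.

```python
from itertools import combinations

def brute_force_mle(A):
    """Maximize <A, sigma sigma^T> over balanced sigma in {+1,-1}^n (n even).

    Exponential time; intended for tiny n only.
    """
    n = len(A)
    best_val, best_sigma = None, None
    # Pin node 0 to +1 to remove the global sign symmetry.
    for rest in combinations(range(1, n), n // 2 - 1):
        plus = {0, *rest}
        sigma = [1 if i in plus else -1 for i in range(n)]
        val = sum(A[i][j] * sigma[i] * sigma[j]
                  for i in range(n) for j in range(n))
        if best_val is None or val > best_val:
            best_val, best_sigma = val, sigma
    return best_val, best_sigma

def lift(sigma):
    """Return Y = sigma sigma^T as a list of lists."""
    return [[si * sj for sj in sigma] for si in sigma]

# Toy graph: triangles {0,1,2} and {3,4,5} joined by the single edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

value, sigma = brute_force_mle(A)
Y = lift(sigma)
# The lifted matrix satisfies the SDP constraints Y_ii = 1 and <J, Y> = 0.
diag_ok = all(Y[i][i] == 1 for i in range(6))
sum_ok = sum(map(sum, Y)) == 0
```

On this toy graph the maximizer separates the two triangles, and the lifted Y has unit diagonal and entries summing to zero because $\langle J, Y \rangle = (\sigma^\top \mathbf{1})^2$.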

Two equal-sized communities: Optimal recovery via SDP

Theorem (Abbe-Bandeira-Hall ’14, Mossel-Neeman-Sly ’14)
• If $(\sqrt{a} - \sqrt{b})^2 > 2$, recovery is achievable in polynomial time.
• If $(\sqrt{a} - \sqrt{b})^2 < 2$, recovery is impossible.

Theorem (Hajek-Wu-X. ’14)
SDP achieves the optimal recovery threshold $(\sqrt{a} - \sqrt{b})^2 > 2$.

Remarks
• originally conjectured in [Abbe-Bandeira-Hall ’14]
• independently proved by [Bandeira ’15]
• $P\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \right\} = 1 - n^{-\Omega(1)}$
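The threshold in the theorems above is easy to evaluate numerically. The helper names below (`threshold_gap`, `above_threshold`) are hypothetical and only illustrate the condition $(\sqrt{a} - \sqrt{b})^2 > 2$; the boundary case of equality is not addressed by the statements on the slide.

```python
import math

def threshold_gap(a, b):
    """Return (sqrt(a) - sqrt(b))^2 for edge parameters a, b >= 0."""
    return (math.sqrt(a) - math.sqrt(b)) ** 2

def above_threshold(a, b):
    """True when (sqrt(a) - sqrt(b))^2 > 2, the regime where exact
    recovery is achievable according to the theorems above."""
    return threshold_gap(a, b) > 2

# a = 9, b = 1 gives (3 - 1)^2 = 4 > 2; a = 4, b = 1 gives (2 - 1)^2 = 1 < 2.
examples = [above_threshold(9, 1), above_threshold(4, 1)]
```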

Two equal-sized communities: Dual certificate argument

Goal: $P\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \right\} \to 1$

$\hat{Y}_{\mathrm{SDP}} = \arg\max_{Y} \langle A, Y \rangle$        dual variables
s.t. $Y \succeq 0$                     $S \succeq 0$
     $Y_{ii} = 1$                      $D = \mathrm{diag}\{d_i\}$
     $\langle J, Y \rangle = 0$        $\lambda \in \mathbb{R}$

• $d_i$ = (# of nbrs in own cluster) − (# of nbrs in other cluster) $\sim \mathrm{Binom}(n/2 - 1, p) - \mathrm{Binom}(n/2, q)$
• $S = D - A + \lambda J \succeq 0$ if $\lambda \ge (p + q)/2$ and $\min_i d_i \ge \|A - E[A]\|$
• $\min_i d_i = \Omega_P(\log n)$ if $\sqrt{a} - \sqrt{b} > \sqrt{2}$
• $\|A - E[A]\| = O_P(\sqrt{\log n})$: 2nd-order stochastic dominance [Tomozei-Massoulié ’14] + result for iid matrix [Seginer ’00]
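The step from a feasible dual matrix S to optimality of the ground truth can be spelled out in one line of weak duality. The sketch below uses the slide's definitions; the symbol $Y^* = \sigma^*(\sigma^*)^\top$ is notation introduced here, and the uniqueness remark relies on the standard additional condition $\lambda_2(S) > 0$.

```latex
% Y^* := \sigma^* (\sigma^*)^\top is notation introduced for this sketch.
% For any feasible Y (Y \succeq 0, Y_{ii} = 1, \langle J, Y \rangle = 0)
% and S = D - A + \lambda J \succeq 0:
\begin{align*}
\langle A, Y \rangle
  &= \langle D + \lambda J - S,\; Y \rangle
   = \sum_i d_i Y_{ii} + \lambda \langle J, Y \rangle - \langle S, Y \rangle \\
  &= \sum_i d_i - \langle S, Y \rangle
   \;\le\; \sum_i d_i
   = (\sigma^*)^\top A \sigma^*
   = \langle A, Y^* \rangle .
\end{align*}
% The inequality uses \langle S, Y \rangle \ge 0 for S, Y \succeq 0.
% The next-to-last equality uses (A \sigma^*)_i = d_i \sigma^*_i,
% which also gives S \sigma^* = 0 because J \sigma^* = 0.
```

If in addition the second-smallest eigenvalue of S is positive, then $\langle S, Y \rangle = 0$ forces Y to be a multiple of $Y^*$, and $Y_{ii} = 1$ then gives $Y = Y^*$, so the SDP solution is unique.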

A single community: Planted dense subgraph model

• One cluster of size K plus n − K outliers
  [Figure: block matrix with edge probability p inside the cluster block and q elsewhere]
• Connectivity p within cluster and q otherwise
• Linear community size: $K = \rho n$
• Relatively sparse graph: $p = a \log n / n$ and $q = b \log n / n$

A single community: Optimal recovery via SDP

Theorem (Hajek-Wu-X. ’14)
• If $\rho f(a, b) > 1$, recovery is achievable via SDP in polynomial time.
• If $\rho f(a, b) < 1$, recovery is impossible.

Remarks
• $f(a, b) = a - \tau^* \log \frac{e a}{\tau^*}$ with $\tau^* = \frac{a - b}{\log a - \log b}$
• Sufficiency: dual certificate argument
• Necessity: show MLE fails by swapping an in-cluster node with an out-cluster node
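A short sketch that evaluates the single-community threshold $\rho f(a, b) > 1$ from the formulas above. The function names are hypothetical, and the code assumes $a > b > 0$ so that both logarithms and $\tau^*$ are well defined.

```python
import math

def tau_star(a, b):
    """tau* = (a - b) / (log a - log b); assumes a > b > 0."""
    return (a - b) / (math.log(a) - math.log(b))

def f(a, b):
    """f(a, b) = a - tau* * log(e * a / tau*)."""
    t = tau_star(a, b)
    return a - t * math.log(math.e * a / t)

def single_community_recoverable(rho, a, b):
    """True when rho * f(a, b) > 1, the achievable side of the theorem."""
    return rho * f(a, b) > 1

# With a = 9 and b = 1, f is a little above 2, so rho = 0.5 lies on the
# achievable side while rho = 0.25 lies on the impossible side.
checks = [single_community_recoverable(0.5, 9.0, 1.0),
          single_community_recoverable(0.25, 9.0, 1.0)]
```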

Extensions and open problems

SDP achieves sharp threshold:
• Two unequal-sized clusters $(\rho n, (1 - \rho) n)$: $\eta(\rho, a, b) > 1$
• Two clusters with unknown sizes: $\sqrt{a} - \sqrt{b} > \sqrt{2}$
• r equal-sized clusters: $\sqrt{a} - \sqrt{b} > \sqrt{r}$

General SBM:
• Optimality of SDP relaxation remains open (but within a factor of 4)
• Sharp threshold is found in [Abbe-Sandon ’15] via a two-stage procedure

Concluding remarks

• If community sizes are linear, information limit is attainable in polynomial time via SDP
• If community sizes scale as $n^\beta$ for $\beta < 1$, information limit might not be achievable in polynomial time [Hajek-Wu-X. ’14]

References
• B. Hajek, Y. Wu & J. X. (2014). Achieving exact cluster recovery threshold via semidefinite programming. arXiv:1412.6156 (ISIT ’15)
• B. Hajek, Y. Wu & J. X. (2015). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. arXiv:1502.07738
• B. Hajek, Y. Wu & J. X. (2014). Computational lower bounds for community detection on random graphs. arXiv:1406.6625 (COLT ’15)

Spectral concentration

Theorem
Let A denote a symmetric and zero-diagonal random matrix, where the entries $\{A_{ij} : i < j\}$ are independent and [0, 1]-valued. Assume that $E[A_{ij}] \le p$, where $c_0 \log n / n \le p \le 1 - c_1$ for arbitrary constants $c_0 > 0$ and $c_1 > 0$. Then for any $c > 0$, there exists $c' > 0$ such that for any $n \ge 1$,
$P\left\{ \|A - E[A]\|_2 \le c' \sqrt{np} \right\} \ge 1 - n^{-c}$.

A single community: MLE ⇔ Densest K-subgraph

Assuming $p > q$ and $\xi$ = cluster indicator
• Maximum likelihood estimator (MLE)

  $\max_{\xi} \sum_{i,j} A_{ij} \xi_i \xi_j$
  s.t. $\xi \in \{0, 1\}^n$
       $\xi^\top \mathbf{1} = K$

• Lift $Z = \xi\xi^\top$ (equivalent formulation):

  $\max_{Z} \langle A, Z \rangle$
  s.t. $\mathrm{rank}(Z) = 1$
       $Z_{ii} \le 1$, $\forall i \in [n]$
       $Z_{ij} \ge 0$, $\forall i, j \in [n]$
       $\langle I, Z \rangle = K$
       $\langle J, Z \rangle = K^2$
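As an illustration of the MLE above, the following brute-force sketch finds the densest K-subgraph of a tiny graph by enumeration and checks the trace and sum constraints of the lifted matrix. It takes exponential time and is not the SDP relaxation; the function name and the toy graph are assumptions made for this example.

```python
from itertools import combinations

def densest_k_subgraph(A, K):
    """Return (edge_count, node_tuple) of a K-subset with the most edges.

    Maximizing the edge count is equivalent to maximizing
    sum_{i,j} A_ij xi_i xi_j, which counts every edge twice.
    """
    n = len(A)
    best_edges, best_set = -1, None
    for subset in combinations(range(n), K):
        edges = sum(A[i][j] for i, j in combinations(subset, 2))
        if edges > best_edges:
            best_edges, best_set = edges, subset
    return best_edges, best_set

# Toy graph: triangle {0,1,2} plus the path 2-3-4.
edge_list = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
A = [[0] * 5 for _ in range(5)]
for i, j in edge_list:
    A[i][j] = A[j][i] = 1

K = 3
edges, nodes = densest_k_subgraph(A, K)
xi = [1 if i in nodes else 0 for i in range(5)]
Z = [[xi[i] * xi[j] for j in range(5)] for i in range(5)]
trace_Z = sum(Z[i][i] for i in range(5))   # equals <I, Z>
total_Z = sum(map(sum, Z))                 # equals <J, Z>
```

Here the lifted matrix satisfies $\langle I, Z \rangle = K$ and $\langle J, Z \rangle = K^2$, matching the constraints of the lifted problem.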

A single community: SDP relaxation

• Semidefinite programming (SDP) relaxation of MLE

  $\hat{Z}_{\mathrm{SDP}} = \arg\max_{Z} \langle A, Z \rangle$
  s.t. $Z \succeq 0$
       $Z_{ii} \le 1$, $\forall i \in [n]$
       $Z_{ij} \ge 0$, $\forall i, j \in [n]$
       $\langle I, Z \rangle = K$
       $\langle J, Z \rangle = K^2$

• Goal: $P\left\{ \hat{Z}_{\mathrm{SDP}} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \right\} \to 1$

A single community: Dual certificate

$\max_{Z} \langle A, Z \rangle$             dual variables
s.t. $Z \succeq 0$                     $S \succeq 0$
     $Z_{ii} \le 1$                    $D = \mathrm{diag}\{d_i\}$
     $Z_{ij} \ge 0$                    $B \ge 0$
     $\langle I, Z \rangle = K$        $\eta \in \mathbb{R}$
     $\langle J, Z \rangle = K^2$      $\lambda \in \mathbb{R}$

• $S \xi^* = 0 \Rightarrow d_i = e(i, C^*) - \lambda K - \eta$ if $i \in C^*$; $d_i = 0$ otherwise.
• $B = \begin{pmatrix} 0 & \mathbf{1} b^\top \\ b \mathbf{1}^\top & 0 \end{pmatrix}$ (nodes of $C^*$ listed first), with $b_i = \lambda - e(i, C^*)/K$ for $i \notin C^*$
• Set $\lambda = \tau^* \log n / n$ so that $\min_{i \notin C^*} b_i \ge 0$
• Set $\eta = \|A - E[A]\|$ such that $\lambda_2(S) > 0$ if $d_i \ge 0$
• $\min_{i \in C^*} e(i, C^*) - \lambda K = \Omega_P(\log n)$ and $\eta = O_P(\sqrt{\log n})$
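The formulas for $d_i$ and $b_i$ follow from the condition $S \xi^* = 0$ once a form for S is fixed. The slide does not show that form; the sketch below assumes $S = D - B - A + \eta I + \lambda J$, which is the choice consistent with the expressions listed above, and writes $e(i, C^*) = \sum_{j \in C^*} A_{ij}$.

```latex
% Assumed form (not shown on the slide): S = D - B - A + \eta I + \lambda J.
% \xi^* is the indicator of C^*, with |C^*| = K, and B vanishes on C^* \times C^*.
\begin{align*}
i \in C^*:\quad
(S\xi^*)_i &= d_i - \underbrace{\sum_{j \in C^*} B_{ij}}_{=\,0}
              - e(i, C^*) + \eta + \lambda K = 0
  \;\Longrightarrow\; d_i = e(i, C^*) - \lambda K - \eta, \\
i \notin C^*:\quad
(S\xi^*)_i &= - \sum_{j \in C^*} B_{ij} - e(i, C^*) + \lambda K
            = - K b_i - e(i, C^*) + \lambda K = 0
  \;\Longrightarrow\; b_i = \lambda - \frac{e(i, C^*)}{K}.
\end{align*}
% For i \notin C^* the terms d_i \xi^*_i and \eta \xi^*_i vanish since \xi^*_i = 0,
% and B_{ij} = b_i for every j \in C^*.
```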

A single community: Necessity

$\rho f(a, b) < 1 \;\Rightarrow\; \min_{i \in C^*} \underbrace{e(i, C^*)}_{K \text{ almost ind. } \mathrm{Binom}(K - 1,\, p)}$