Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
Jiaming Xu
Department of Statistics, The Wharton School, University of Pennsylvania
[email protected]
Joint work with Bruce Hajek (Illinois) and Yihong Wu (Illinois)
June 17, 2015
Community detection in networks
• Networks with community structures arise in many applications
[Figure: Santa Fe Institute collaboration network [Girvan-Newman ’02]]
Jiaming Xu (Wharton)
Optimal Cluster Recovery via SDP
2
• Task: Discover underlying communities based on the network topology
Stochastic block model [Holland et al. ’83]
Planted partition model [Condon-Karp ’01]
[Figure: example network with n = 40, K = 10, r = 3, shown with edge probabilities p = 0.9 and q = 0.1]
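A minimal sketch of how such an example network could be sampled. It assumes that K is the cluster size, r the number of clusters, that the remaining n − rK nodes are outliers belonging to no cluster, and that p applies within a cluster and q to every other pair. The function name `sample_planted_partition` and the `seed` parameter are illustrative and do not come from the talk.

```python
import random

def sample_planted_partition(n, K, r, p, q, seed=0):
    """Sample a symmetric 0/1 adjacency matrix with r planted clusters of size K.

    Assumption: nodes 0..r*K-1 form r consecutive clusters; the remaining
    n - r*K nodes are outliers (label -1) that belong to no cluster.
    """
    if r * K > n:
        raise ValueError("r clusters of size K do not fit into n nodes")
    rng = random.Random(seed)
    labels = [i // K if i < r * K else -1 for i in range(n)]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # p inside a common cluster, q for every other pair
            same = labels[i] != -1 and labels[i] == labels[j]
            if rng.random() < (p if same else q):
                A[i][j] = A[j][i] = 1
    return A, labels

# Parameters taken from the slide: n = 40, K = 10, r = 3, p = 0.9, q = 0.1
A, labels = sample_planted_partition(n=40, K=10, r=3, p=0.9, q=0.1)
```

The matrix `A` is symmetric with a zero diagonal, and `labels` records the planted assignment that a recovery procedure would try to reproduce.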
Exact recovery
$C^* \longrightarrow A \longrightarrow \hat{C}$
• Goal: exact recovery (strong consistency): $\mathbb{P}\{\hat{C} = C^*\} \to 1$ as $n \to \infty$
• Alternatives
  - almost exact recovery (weak consistency): [Mossel-Neeman-Sly ’14, Abbe-Sandon ’15, Montanari ’15]...
  - correlated recovery: [Decelle-Krzakala-Moore-Zdeborova ’11, Mossel-Neeman-Sly ’12 ’13, Massoulie ’13]...
Objectives of this talk
• Information limit: When is exact recovery possible (impossible)?
• Is the information limit achievable in polynomial time, e.g., via semidefinite programming?
Remainder of the talk
1. Two equal-sized communities
2. A single community of linear size
3. Extensions and open problems
Two equal-sized communities: Binary symmetric SBM
Model:
• $n$ nodes partitioned into two communities of size $n/2$ ($\sigma_i^* = \pm 1$).
• $i \sim j$ independently w.p. $p = \frac{a \log n}{n}$ if $\sigma_i^* = \sigma_j^*$, and w.p. $q = \frac{b \log n}{n}$ if $\sigma_i^* \neq \sigma_j^*$.
Remarks
• $a + b > 2$ is the connectivity threshold and necessary for exact recovery
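The connectivity remark can be motivated by a heuristic first-moment count of isolated nodes. This is only a sketch under the model above, not the argument from the talk; the $o(1)$ terms are left implicit.

```latex
% Heuristic: a node is isolated if it has no neighbor in either community.
\mathbb{P}\{\text{node } i \text{ is isolated}\}
  = (1-p)^{n/2-1} (1-q)^{n/2}
  = \exp\!\Bigl( -\tfrac{a+b}{2} \log n \, (1+o(1)) \Bigr)
  = n^{-\frac{a+b}{2} + o(1)} ,
\qquad
\mathbb{E}[\#\text{isolated nodes}] = n^{\,1 - \frac{a+b}{2} + o(1)} .
```

When $a + b < 2$ the expected count grows with $n$, and an isolated node carries no information about its own label, which is why no estimator can then recover every label.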
Two equal-sized communities: MLE ⇒ SDP relaxation
• Maximum likelihood estimator (MLE): Assume $p \ge q$
$\max_{\sigma} \ \langle A, \sigma\sigma^\top \rangle$ (→ # of in-cluster edges)
s.t. $\sigma_i \in \{\pm 1\}$, $i \in [n]$; $\sigma^\top \mathbf{1} = 0$
• Lift $Y = \sigma\sigma^\top$:
$\max_{Y} \ \langle A, Y \rangle$
s.t. $\mathrm{rank}(Y) = 1$; $Y_{ii} = 1$, $i \in [n]$; $\langle J, Y \rangle = 0$
• SDP relaxation, with the rank constraint replaced by $Y \succeq 0$:
$\max_{Y} \ \langle A, Y \rangle$
s.t. $Y \succeq 0$; $Y_{ii} = 1$, $i \in [n]$; $\langle J, Y \rangle = 0$
• Goal: $\mathbb{P}\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} \mathbf{1} & -\mathbf{1} \\ -\mathbf{1} & \mathbf{1} \end{pmatrix} \right\} \to 1$
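To make the MLE objective concrete, here is a brute-force sketch on a toy graph. It enumerates all balanced ±1 labelings, which is only feasible for very small n, and checks the identity ⟨A, σσᵀ⟩ = 2(2·e_in − e_total) that links the objective to the number of in-cluster edges. The toy graph and all function names are illustrative assumptions, not part of the talk.

```python
from itertools import combinations

def inner_product(A, sigma):
    """Compute <A, sigma sigma^T> = sum over all (i, j) of A[i][j]*sigma[i]*sigma[j]."""
    n = len(A)
    return sum(A[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))

def in_cluster_edges(A, sigma):
    """Count edges whose two endpoints carry the same label."""
    n = len(A)
    return sum(A[i][j] for i in range(n) for j in range(i + 1, n)
               if sigma[i] == sigma[j])

def brute_force_mle(A):
    """Enumerate balanced +/-1 labelings; return one maximizing <A, sigma sigma^T>."""
    n = len(A)
    best_sigma, best_value = None, None
    for plus in combinations(range(n), n // 2):
        if 0 not in plus:
            continue  # fix node 0 to +1, since sigma and -sigma give the same partition
        members = set(plus)
        sigma = [1 if i in members else -1 for i in range(n)]
        value = inner_product(A, sigma)
        if best_value is None or value > best_value:
            best_sigma, best_value = sigma, value
    return best_sigma, best_value

# Toy graph (assumption): triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

sigma_hat, value = brute_force_mle(A)
```

The enumeration grows exponentially in n, which is the motivation for replacing the combinatorial constraint with the convex relaxation on the slide.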
Two equal-sized communities: Optimal recovery via SDP
Theorem (Abbe-Bandeira-Hall ’14, Mossel-Neeman-Sly ’14)
• If $(\sqrt{a} - \sqrt{b})^2 > 2$, recovery is achievable in polynomial time.
• If $(\sqrt{a} - \sqrt{b})^2 < 2$, recovery is impossible.
Theorem (Hajek-Wu-X. ’14)
SDP achieves the optimal recovery threshold $(\sqrt{a} - \sqrt{b})^2 > 2$.
Remarks
• originally conjectured in [Abbe-Bandeira-Hall ’14]
• independently proved by [Bandeira ’15]
• $\mathbb{P}\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} \mathbf{1} & -\mathbf{1} \\ -\mathbf{1} & \mathbf{1} \end{pmatrix} \right\} = 1 - n^{-\Omega(1)}$
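A small sketch for checking on which side of the threshold a pair (a, b) falls. The function names are illustrative, and the boundary case is reported separately because the theorem as stated does not cover equality.

```python
import math

def exact_recovery_margin(a, b):
    """Return (sqrt(a) - sqrt(b))^2 - 2; a positive value is above the threshold."""
    return (math.sqrt(a) - math.sqrt(b)) ** 2 - 2.0

def regime(a, b):
    """Classify (a, b) according to the threshold (sqrt(a) - sqrt(b))^2 vs 2."""
    margin = exact_recovery_margin(a, b)
    if margin > 0:
        return "achievable"
    if margin < 0:
        return "impossible"
    return "boundary"  # equality is not covered by the stated theorem

print(regime(9, 1))  # (3 - 1)^2 = 4 > 2
print(regime(4, 1))  # (2 - 1)^2 = 1 < 2, although a + b = 5 > 2
```

The second call illustrates that connectivity alone is not sufficient: the pair (4, 1) satisfies a + b > 2 yet lies below the exact recovery threshold.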
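Two equal-sized communities: Dual certificate argument
Goal: $\mathbb{P}\left\{ \hat{Y}_{\mathrm{SDP}} = \begin{pmatrix} \mathbf{1} & -\mathbf{1} \\ -\mathbf{1} & \mathbf{1} \end{pmatrix} \right\} \to 1$
$\hat{Y}_{\mathrm{SDP}} = \arg\max_{Y} \ \langle A, Y \rangle$, with the dual variable of each constraint in parentheses:
s.t. $Y \succeq 0$ (dual variable $S \succeq 0$)
$Y_{ii} = 1$ (dual variable $D = \mathrm{diag}\{d_i\}$)
$\langle J, Y \rangle = 0$ (dual variable $\lambda \in \mathbb{R}$)
• $d_i$ = (# of nbrs in own cluster) − (# of nbrs in other cluster) $\sim \mathrm{Binom}(n/2 - 1, p) - \mathrm{Binom}(n/2, q)$
• $S = D - A + \lambda J \succeq 0$ if $\lambda \ge (p+q)/2$ and $\min_i d_i \ge \|A - \mathbb{E}[A]\|$
• $\min_i d_i = \Omega_P(\log n)$ if $\sqrt{a} - \sqrt{b} > \sqrt{2}$
• $\|A - \mathbb{E}[A]\| = O_P(\sqrt{\log n})$: 2nd-order stochastic dominance [Tomozei-Massoulié ’14] + result for iid matrix [Seginer ’00]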
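The bullets above can be connected by a short derivation. This is a sketch of the standard weak-duality reasoning rather than a reproduction of the proof in the talk; $\sigma^*$ denotes the true labeling and $Y^* = \sigma^* (\sigma^*)^\top$, notation assumed here for convenience.

```latex
% Weak duality: for any feasible Y and any S = D - A + \lambda J \succeq 0,
\langle A, Y \rangle
  = \langle D, Y \rangle + \lambda \langle J, Y \rangle - \langle S, Y \rangle
  = \sum_i d_i - \langle S, Y \rangle
  \le \sum_i d_i .
% Complementary slackness: S \sigma^* = 0 together with J \sigma^* = 0 forces
d_i = \sigma_i^* \sum_j A_{ij} \sigma_j^* ,
% so that \langle S, Y^* \rangle = 0 and Y^* attains the bound \sum_i d_i.
% Positive semidefiniteness: using
\mathbb{E}[A] = \tfrac{p+q}{2} J + \tfrac{p-q}{2} Y^* - p I ,
% every x \perp \sigma^* satisfies
x^\top S x \ge \bigl( \min_i d_i - \|A - \mathbb{E}[A]\| + p \bigr) \|x\|^2
  + \bigl( \lambda - \tfrac{p+q}{2} \bigr) (\mathbf{1}^\top x)^2 .
```

The expression for $d_i$ is exactly the difference between own-cluster and other-cluster neighbor counts. If the last bound is strictly positive for every nonzero $x \perp \sigma^*$, the null space of $S$ is spanned by $\sigma^*$, so any optimal $Y$ is a multiple of $Y^*$ and the constraint $Y_{ii} = 1$ fixes that multiple to one.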
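A single community: Planted dense subgraph model
• One cluster of size $K$ plus $n - K$ outliers
[Figure: block structure of edge probabilities, $p$ in the cluster block and $q$ in the other three blocks]
• Connectivity $p$ within cluster and $q$ otherwise
• Linear community size: $K = \rho n$
• Relatively sparse graph: $p = \frac{a \log n}{n}$ and $q = \frac{b \log n}{n}$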
A single community: Optimal recovery via SDP
Theorem (Hajek-Wu-X. ’14)
• If $\rho f(a, b) > 1$, recovery is achievable via SDP in polynomial time.
• If $\rho f(a, b) < 1$, recovery is impossible.
Remarks
• $f(a, b) = a - \tau^* \log \frac{e a}{\tau^*}$ with $\tau^* = \frac{a - b}{\log a - \log b}$
• Sufficiency: dual certificate argument
• Necessity: show MLE fails by swapping an in-cluster node with an out-cluster node
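The threshold function can be evaluated numerically as follows. This is a sketch assuming a, b > 0 with a ≠ b, since τ* is undefined when a = b; the function names are illustrative.

```python
import math

def tau_star(a, b):
    """tau* = (a - b) / (log a - log b); requires a, b > 0 and a != b."""
    return (a - b) / (math.log(a) - math.log(b))

def f(a, b):
    """f(a, b) = a - tau* * log(e * a / tau*)."""
    t = tau_star(a, b)
    return a - t * math.log(math.e * a / t)

def above_single_community_threshold(rho, a, b):
    """True when rho * f(a, b) > 1, the sufficient condition in the theorem."""
    return rho * f(a, b) > 1

# Example values (assumptions chosen for illustration only)
print(above_single_community_threshold(0.5, 20.0, 2.0))
print(above_single_community_threshold(0.1, 20.0, 2.0))
```

Substituting the definition of τ* shows that f(a, b) = f(b, a), so the function depends on the two rates symmetrically even though the formula is written in terms of a.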
Extensions and open problems
SDP achieves sharp threshold:
• Two unequal-sized clusters $(\rho n, (1 - \rho) n)$: $\eta(\rho, a, b) > 1$
• Two clusters with unknown sizes: $\sqrt{a} - \sqrt{b} > \sqrt{2}$
• $r$ equal-sized clusters: $\sqrt{a} - \sqrt{b} > \sqrt{r}$
General SBM:
• Optimality of SDP relaxation remains open (but within a factor of 4)
• Sharp threshold is found in [Abbe-Sandon ’15] via a two-stage procedure
Concluding remarks
• If community sizes are linear, information limit is attainable in polynomial time via SDP
• If community sizes scale as $n^{\beta}$ for $\beta < 1$, information limit might not be achievable in polynomial time [Hajek-Wu-X. ’14]
References
• B. Hajek, Y. Wu & J. X. (2014). Achieving exact cluster recovery threshold via semidefinite programming. arXiv:1412.6156 (ISIT ’15)
• B. Hajek, Y. Wu & J. X. (2015). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. arXiv:1502.07738
• B. Hajek, Y. Wu & J. X. (2014). Computational lower bounds for community detection on random graphs. arXiv:1406.6625 (COLT ’15)
Spectral concentration
Theorem
Let $A$ denote a symmetric and zero-diagonal random matrix, where the entries $\{A_{ij} : i < j\}$ are independent and $[0, 1]$-valued. Assume that $\mathbb{E}[A_{ij}] \le p$, where $c_0 \log n / n \le p \le 1 - c_1$ for arbitrary constants $c_0 > 0$ and $c_1 > 0$. Then for any $c > 0$, there exists $c' > 0$ such that for any $n \ge 1$,
$\mathbb{P}\left\{ \|A - \mathbb{E}[A]\|_2 \le c' \sqrt{np} \right\} \ge 1 - n^{-c}$.
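A single community: MLE ⇔ Densest K-subgraph
Assuming $p > q$ and $\xi$ = cluster indicator
• Maximum likelihood estimator (MLE)
$\max_{\xi} \ \sum_{i,j} A_{ij} \xi_i \xi_j$
s.t. $\xi \in \{0, 1\}^n$; $\xi^\top \mathbf{1} = K$
• Lift $Z = \xi\xi^\top$ (equivalent formulation):
$\max_{Z} \ \langle A, Z \rangle$
s.t. $\mathrm{rank}(Z) = 1$
$Z_{ii} \le 1$, $\forall i \in [n]$
$Z_{ij} \ge 0$, $\forall i, j \in [n]$
$\langle I, Z \rangle = K$
$\langle J, Z \rangle = K^2$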
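A single community: SDP relaxation
• Semidefinite programming (SDP) relaxation of MLE
$\hat{Z}_{\mathrm{SDP}} = \arg\max_{Z} \ \langle A, Z \rangle$
s.t. $Z \succeq 0$
$Z_{ii} \le 1$, $\forall i \in [n]$
$Z_{ij} \ge 0$, $\forall i, j \in [n]$
$\langle I, Z \rangle = K$
$\langle J, Z \rangle = K^2$
• Goal: $\mathbb{P}\left\{ \hat{Z}_{\mathrm{SDP}} = \begin{pmatrix} \mathbf{1} & 0 \\ 0 & 0 \end{pmatrix} \right\} \to 1$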
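A single community: Dual certificate
$\max_{Z} \ \langle A, Z \rangle$, with the dual variable of each constraint in parentheses:
s.t. $Z \succeq 0$ (dual variable $S \succeq 0$)
$Z_{ii} \le 1$ (dual variable $D = \mathrm{diag}\{d_i\}$)
$Z_{ij} \ge 0$ (dual variable $B \ge 0$)
$\langle I, Z \rangle = K$ (dual variable $\eta \in \mathbb{R}$)
$\langle J, Z \rangle = K^2$ (dual variable $\lambda \in \mathbb{R}$)
• $S \xi^* = 0 \Rightarrow d_i = e(i, C^*) - \lambda K - \eta$ if $i \in C^*$; $d_i = 0$ otherwise.
• $B = \begin{pmatrix} 0 & \mathbf{1} b^\top \\ b \mathbf{1}^\top & 0 \end{pmatrix}$ (blocks ordered as $C^*$, then its complement), with $b_i = \lambda - e(i, C^*)/K$ for $i \notin C^*$
• Set $\lambda = \tau^* \log n / n$ so that $\min_{i \notin C^*} b_i \ge 0$
• Set $\eta = \|A - \mathbb{E}[A]\|$ such that $\lambda_2(S) > 0$ if $d_i \ge 0$
• $\min_{i \in C^*} e(i, C^*) - \lambda K = \Omega_P(\log n)$ and $\eta = O_P(\sqrt{\log n})$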
A single community: Necessity
$\rho f(a, b) < 1 \;\Rightarrow\; \underbrace{\min_{i \in C^*} e(i, C^*)}_{K \text{ almost ind. } \mathrm{Binom}(K-1,\, p)}$