Independence and chromatic number (and random k-SAT): Sparse Case

Dimitris Achlioptas (Microsoft)

Random graphs

W.h.p.: with probability that tends to 1 as n → ∞.

Hamiltonian cycle

- Let τ2 be the moment at which all vertices have degree ≥ 2.
- W.h.p. G(n, m = τ2) has a Hamiltonian cycle.
- W.h.p. it can be found in time O(n³ log n). [Ajtai, Komlós, Szemerédi 85] [Bollobás, Fenner, Frieze 87]
- In G(n, 1/2), Hamiltonicity can be decided in O(n) expected time. [Gurevich, Shelah 84]

Cliques in random graphs

- The largest clique in G(n, 1/2) has size 2 log₂ n − 2 log₂ log₂ n ± 1. [Bollobás, Erdős 75] [Matula 76]
- W.h.p. there is no maximal clique of size < log₂ n.
- Can we find a clique of size (1 + ε) log₂ n? [Karp 76] (A greedy sketch for this question follows below.)
- What if we "hide" a clique of size n^(1/2 − ε)?
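For Karp's question, the natural benchmark is the greedy procedure, which w.h.p. builds a clique of size about log₂ n in G(n, 1/2), i.e. half the true maximum. A minimal Python sketch (the helper names and the G(n, p) generator are illustrative assumptions, not from the talk):

```python
import random

def random_graph(n, p=0.5, seed=0):
    """Adjacency matrix of G(n, p)."""
    rng = random.Random(seed)
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = rng.random() < p
    return adj

def greedy_clique(adj):
    """Scan vertices once, keeping each vertex adjacent to everything kept so far."""
    clique = []
    for v in range(len(adj)):
        if all(adj[v][u] for u in clique):
            clique.append(v)
    return clique

g = random_graph(2000)
print(len(greedy_clique(g)))   # typically close to log2(2000) ≈ 11
```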

Two problems for which we know much less

- Chromatic number of sparse random graphs
- Random k-SAT

- Canonical for random constraint satisfaction:
  – binary constraints over a k-ary domain
  – k-ary constraints over a binary domain
- Studied in: AI, Math, Optimization, Physics, …

A factor-graph representation of k-coloring

- Each vertex is a variable with domain {1, 2, …, k}.
- Each edge is a constraint on two variables.
- All constraints are "not-equal".
- Random graph = each constraint picks two variables at random.

[Figure: bipartite factor graph with variable nodes v1, v2, … (vertices) on one side and constraint nodes e1, e2, … (edges) on the other.]

SAT via factor-graphs

(x12 ∨ x5 ∨ x9) ∧ (x34 ∨ x21 ∨ x5) ∧ ⋯ ∧ (x21 ∨ x9 ∨ x13)

- Variable nodes on one side, clause nodes on the other; there is an edge between x and c iff x occurs in clause c.
- Edges are labeled +/− to indicate whether the literal is negated.
- Constraints are "at least one literal must be satisfied".
- Random k-SAT = constraints pick k literals at random.

[Figure: factor graph with variable nodes x, … and clause nodes c, ….]

Diluted mean-field spin glasses

- Conflicting, fixed constraints: quenched disorder.
- Random bipartite graph: lack of geometry, mean field.
- Sparse: diluted.
- Small, discrete domains: spins.
- Also: hypergraph coloring, random XOR-SAT, error-correcting codes, …

[Figure: factor graph with variables x, … and constraints c, ….]

Random graph coloring: Background

A trivial lower bound

- For any graph, the chromatic number is at least
  (number of vertices) / (size of maximum independent set).
- For random graphs, use the first-moment upper bound on the size of the largest independent set:
  $$\binom{n}{s}\,(1-p)^{\binom{s}{2}} \to 0.$$
  (A worked instance for p = 1/2 follows below.)
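As a quick illustration (not spelled out on the slide), for p = 1/2 the union bound above pins down the independence number and hence the trivial lower bound:
$$\mathbb{E}\big[\#\{\text{independent sets of size } s\}\big] = \binom{n}{s}\,2^{-\binom{s}{2}} \le \Big(n\,2^{-(s-1)/2}\Big)^{s} \longrightarrow 0 \qquad \text{for } s = (2+\epsilon)\log_2 n,$$
so w.h.p. α(G(n, 1/2)) ≤ (2 + o(1)) log₂ n and therefore χ(G(n, 1/2)) ≥ n / ((2 + o(1)) log₂ n).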

An algorithmic upper bound

- Repeat:
  – Pick a random uncolored vertex.
  – Assign it the lowest allowed number (color).
- Uses 2 × (trivial lower bound) colors.
- No algorithm is known to do better. (A minimal sketch of this greedy coloring follows below.)
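A minimal sketch of the greedy coloring above (illustrative Python, assuming an adjacency matrix such as the random_graph helper sketched earlier; not the talk's code):

```python
import random

def greedy_coloring(adj, seed=0):
    """Color vertices in random order; give each the smallest color unused by its colored neighbors."""
    n = len(adj)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    color = [None] * n
    for v in order:
        taken = {color[u] for u in range(n) if adj[v][u] and color[u] is not None}
        c = 0
        while c in taken:
            c += 1
        color[v] = c
    return color

# On G(n, 1/2) this typically uses about n / log2(n) colors,
# roughly twice the trivial lower bound n / (2 log2 n).
```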

The lower bound is asymptotically tight

As d grows, G(n, d/n) can be colored using independent sets of essentially maximum size. [Bollobás 89] [Łuczak 91]

Only two possible values

Theorem. For every d > 0, there exists an integer k = k(d) such that w.h.p. the chromatic number of G(n, p = d/n) is either k or k + 1. [Łuczak 91]

"The Values" Theorem. The same holds with k = the smallest integer such that d < 2k log k.

Examples

- If d = 7, w.h.p. the chromatic number is 4 or 5.
- If d = 10^60, w.h.p. the chromatic number is
  3771455490672260758090142394938336005516126417647650681575 or
  3771455490672260758090142394938336005516126417647650681576.

One value

Theorem. If (2k − 1) ln k < d < 2k ln k then w.h.p. the chromatic number of G(n, d/n) is k + 1.

- If d = 10^100, then w.h.p. the chromatic number is a single specific value (the k + 1 given by the theorem).

Random k-SAT: Background

Random k-SAT

- Fix a set of n variables X = {x1, x2, …, xn}.
- Among all $2^k \binom{n}{k}$ possible k-clauses, select m uniformly and independently. Typically m = rn. (A minimal sampler is sketched below.)
- Example (k = 3):
  (x12 ∨ x5 ∨ x9) ∧ (x34 ∨ x21 ∨ x5) ∧ ⋯ ∧ (x21 ∨ x9 ∨ x13)
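A minimal sampler for the model above (illustrative Python; clause representation and names are assumptions, not from the talk):

```python
import random

def random_ksat(n, r, k=3, seed=0):
    """Sample F_k(n, m = rn): m clauses, each on k distinct variables with random signs,
    drawn uniformly and independently (with replacement).
    Clauses are DIMACS-style lists of nonzero ints: +i is x_i, -i is its negation."""
    rng = random.Random(seed)
    m = int(round(r * n))
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula

# Example: a 3-SAT instance near the empirically hard density r ≈ 4.2
# F = random_ksat(n=200, r=4.2)
```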

Generating hard 3-SAT instances [Mitchell, Selman, Levesque 92]

- The critical point appears to be around r ≈ 4.2.

The satisfiability threshold conjecture

- For every k ≥ 3, there is a constant r_k such that
  $$\lim_{n\to\infty} \Pr[F_k(n, rn) \text{ is satisfiable}] = \begin{cases} 1 & \text{if } r = r_k - \epsilon \\ 0 & \text{if } r = r_k + \epsilon. \end{cases}$$
- For every k ≥ 3,
  $$\frac{2^k}{k} < r_k < 2^k \ln 2.$$

Unit-clause propagation (UC)

- Repeat:
  – Pick a random unset variable and set it to 1.
  – While there are unit clauses, satisfy them.
  – If a 0-clause is generated, fail.
- UC finds a satisfying truth assignment for clause densities r up to order 2^k / k. (A minimal sketch follows below.)
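A minimal sketch of the UC heuristic above (illustrative Python, assuming the DIMACS-style clause lists of the random_ksat sampler sketched earlier; not the talk's code):

```python
import random

def unit_clause_propagation(n, formula, seed=0):
    """Run the UC heuristic; return an assignment dict (variable -> bool) or None on failure."""
    rng = random.Random(seed)
    assign = {}
    clauses = [list(c) for c in formula]

    def set_literal(lit):
        """Make lit true and simplify; return False if a 0-clause is generated."""
        nonlocal clauses
        assign[abs(lit)] = lit > 0
        new = []
        for c in clauses:
            if lit in c:
                continue                       # clause satisfied, drop it
            c = [l for l in c if l != -lit]
            if not c:
                return False                   # 0-clause: fail
            new.append(c)
        clauses = new
        return True

    while len(assign) < n:
        free = [v for v in range(1, n + 1) if v not in assign]
        if not set_literal(rng.choice(free)):  # set a random unset variable to 1 (True)
            return None
        while True:                            # while there are unit clauses, satisfy them
            unit = next((c[0] for c in clauses if len(c) == 1), None)
            if unit is None:
                break
            if not set_literal(unit):
                return None
    return assign
```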
The second moment method

For any non-negative random variable X,
$$\Pr[X > 0] \ge \frac{\mathbb{E}[X]^2}{\mathbb{E}[X^2]}.$$
Proof: Let Y = 1 if X > 0, and Y = 0 otherwise. By Cauchy-Schwarz,
$$\mathbb{E}[X]^2 = \mathbb{E}[XY]^2 \le \mathbb{E}[X^2]\,\mathbb{E}[Y^2] = \mathbb{E}[X^2]\,\Pr[X > 0].$$

Ideal for sums

If X = X₁ + X₂ + ⋯ then
$$\mathbb{E}[X]^2 = \sum_{i,j} \mathbb{E}[X_i]\,\mathbb{E}[X_j], \qquad \mathbb{E}[X^2] = \sum_{i,j} \mathbb{E}[X_i X_j].$$

Example: the X_i correspond to the $\binom{n}{q}$ potential q-cliques in G(n, 1/2); the dominant contribution comes from non-overlapping cliques.

General observations

- The method works well when the X_i are like "needles in a haystack".
- Lack of correlations ⟹ rapid drop in influence around solutions.
- Algorithms get no "hints".

The second moment method for random k-SAT

- Let X be the number of satisfying truth assignments.
- For every clause density r > 0, there is β = β(r) > 0 such that
  $$\mathbb{E}[X]^2 < (1 - \beta)^n\, \mathbb{E}[X^2].$$
- The number of satisfying truth assignments has huge variance.
- The satisfying truth assignments do not form a "uniformly random mist" in σ ∈ {0, 1}^n.

To prove r_k > 2^k ln 2 − k/2 − 1

- Let H(σ, F) be the number of satisfied literal occurrences in F under σ.
- Let X = X(F) be defined as
  $$X(F) = \sum_{\sigma} \mathbf{1}_{\sigma \models F}\; \gamma^{H(\sigma,F)} = \sum_{\sigma} \prod_{c} \mathbf{1}_{\sigma \models c}\; \gamma^{H(\sigma,c)},$$
  where γ < 1 satisfies $(1 + \gamma^2)^{k-1}(1 - \gamma^2) = 1$.
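The weighting constant γ is defined only implicitly; a minimal numerical sketch (not from the talk) that recovers it by bisection from the equation as stated above:

```python
def gamma_for_k(k, tol=1e-12):
    """Solve (1 + g^2)**(k - 1) * (1 - g^2) = 1 for g in (0, 1) by bisection.
    The left side exceeds 1 for small g (k >= 3) and tends to 0 as g -> 1,
    so there is a unique crossing to bracket."""
    f = lambda g: (1 + g * g) ** (k - 1) * (1 - g * g) - 1
    lo, hi = 1e-9, 1 - 1e-9          # f(lo) > 0, f(hi) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example: gamma_for_k(5)
```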

General functions

- Given any truth assignment σ and any k-clause c, let v = v(σ, c) ∈ {−1, +1}^k be the values of the literals in c under σ.
- We will study random variables of the form
  $$X = \sum_{\sigma} \prod_{c} f(v(\sigma, c)),$$
  where f : {−1, +1}^k → ℝ is an arbitrary function.

Examples:

- f(v) = 1 for all v  ⟹  X = 2^n.
- f(v) = 0 if v = (−1, −1, …, −1), and 1 otherwise  ⟹  X = the number of satisfying truth assignments.
- f(v) = 0 if v = (−1, −1, …, −1) or v = (+1, +1, …, +1), and 1 otherwise  ⟹  X = the number of "Not All Equal" (NAE) truth assignments, i.e. satisfying truth assignments whose complement is also satisfying.

Overlap parameter = distance

- The overlap parameter is Hamming distance.
- For any f, if σ, τ agree on z = n/2 variables, then
  $$\mathbb{E}\big[f(v(\sigma, c))\, f(v(\tau, c))\big] = \mathbb{E}\big[f(v(\sigma, c))\big]\; \mathbb{E}\big[f(v(\tau, c))\big].$$
- For any f, if σ, τ agree on z variables, let
  $$C_f(z/n) \equiv \mathbb{E}\big[f(v(\sigma, c))\, f(v(\tau, c))\big].$$
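For concreteness (a standard computation, not spelled out on the slide): for the plain SAT indicator f above, with a clause whose k literals get independent uniformly random variables and signs (ignoring the O(1/n) correction from requiring distinct variables), if σ and τ agree on a fraction α of the variables then
$$C_{\mathrm{SAT}}(\alpha) = 1 - 2\cdot 2^{-k} + \left(\frac{\alpha}{2}\right)^{\!k},$$
and at α = 1/2 this factors as (1 − 2^{−k})², matching the independence statement above.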

Entropy vs. correlation

For every function f:
$$\mathbb{E}[X]^2 = 2^n \sum_{z=0}^{n} \binom{n}{z}\, C_f(1/2)^m, \qquad \mathbb{E}[X^2] = 2^n \sum_{z=0}^{n} \binom{n}{z}\, C_f(z/n)^m.$$

Recall:
$$\binom{n}{\alpha n} = \left(\frac{1}{\alpha^{\alpha}(1-\alpha)^{1-\alpha}}\right)^{\!n} \times \mathrm{poly}(n).$$

[Figure: contribution to E[X²] according to distance, plotted against α = z/n; independence corresponds to α = 1/2.]

The importance of being balanced

- An analytic condition: $C_f'(1/2) \neq 0 \implies$ the second moment method fails.

[Figure: contribution vs. α = z/n for NAE 5-SAT at densities 7 < r < 11.]

- A geometric criterion:
  $$C_f'(1/2) = 0 \iff \sum_{v \in \{-1,+1\}^k} f(v)\, v = 0.$$

[Figure: the vectors f(v)·v on the cube between (−1, −1, …, −1) and (+1, +1, …, +1), for the Constant, SAT, and Complementary functions.]
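A tiny check of the geometric criterion (illustrative code, not from the talk): the plain SAT indicator fails it, while the NAE indicator, which is invariant under complementing σ, passes.

```python
from itertools import product

def balance_defect(f, k):
    """Return the vector sum of f(v) * v over v in {-1,+1}^k; it is zero iff f is balanced."""
    total = [0] * k
    for v in product((-1, 1), repeat=k):
        w = f(v)
        for i in range(k):
            total[i] += w * v[i]
    return total

k = 4
sat = lambda v: 0 if all(x == -1 for x in v) else 1
nae = lambda v: 0 if len(set(v)) == 1 else 1     # kills both all -1 and all +1

print(balance_defect(sat, k))   # [1, 1, 1, 1]: the SAT indicator is not balanced
print(balance_defect(nae, k))   # [0, 0, 0, 0]: the NAE indicator satisfies the criterion
```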

Balance & Information Theory

- Want to balance the vectors in an "optimal" way.
- Information theory ⟹ maximize the entropy of the f(v) subject to
  $$f(-1, -1, \dots, -1) = 0 \qquad\text{and}\qquad \sum_{v \in \{-1,+1\}^k} f(v)\, v = 0.$$
- Lagrange multipliers ⟹ the optimal f is
  $$f(v) = \gamma^{\#\text{ of } +1\text{s in } v}$$
  for the unique γ that satisfies the constraints.

[Figure: the resulting "Heroic" weighting on the cube between (−1, −1, …, −1) and (+1, +1, …, +1).]

Random graph coloring

Threshold formulation

Theorem. A random graph with n vertices and m = cn edges is w.h.p. k-colorable if
$$c \le k \log k - \log k - 1,$$
and w.h.p. non-k-colorable if
$$c \ge k \log k - \tfrac{1}{2} \log k.$$

Main points

- Non-k-colorability. Proof: the probability that there exists any k-coloring is at most
  $$k^n \left(1 - \frac{1}{k}\right)^{cn} \to 0.$$
  (A quick check of the resulting density follows below.)
- k-colorability. Proof: apply the second moment method to the number of balanced k-colorings of G(n, m).
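A quick check (not spelled out on the slide) that this first-moment bound matches the stated non-colorability density:
$$k^n\Big(1 - \tfrac{1}{k}\Big)^{cn} \to 0
\;\iff\; \ln k + c \ln\!\Big(1 - \tfrac{1}{k}\Big) < 0
\;\iff\; c > \frac{\ln k}{-\ln(1 - 1/k)} = \frac{\ln k}{\tfrac{1}{k} + \tfrac{1}{2k^2} + \cdots} = k\ln k - \frac{\ln k}{2} + O\!\Big(\frac{\ln k}{k}\Big),$$
which is the c ≥ k log k − ½ log k condition in the threshold formulation.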

Setup

- Let X_σ be the indicator that the balanced k-partition σ is a proper k-coloring.
- We will prove that if c ≤ k log k − log k − 1, then for X = Σ_σ X_σ there is a constant D = D(k) > 0 such that
  $$\mathbb{E}[X^2] < D\, \mathbb{E}[X]^2.$$
- This implies that G(n, cn) is k-colorable w.h.p.

Setup (continued)

- E[X²] is the sum over all pairs σ, τ of E[X_σ X_τ].
- For any pair of balanced k-partitions σ, τ, let a_{ij} n be the number of vertices having color i in σ and color j in τ. Then
  $$\Pr[\sigma \text{ and } \tau \text{ are both proper}] = \left(1 - \frac{2}{k} + \sum_{ij} a_{ij}^2\right)^{\!cn}.$$

Examples

- Balance ⟹ the rescaled overlap matrix A = (k·a_{ij}) is doubly stochastic.
- When σ, τ are uncorrelated, A is the flat matrix with every entry equal to 1/k.
- As σ, τ align, A tends to the identity matrix I.

A matrix-valued overlap

So, writing the sum over the (doubly stochastic) overlap matrices A that can arise,
$$\mathbb{E}[X^2] = \sum_{A \in \mathcal{B}_k} \binom{n}{An} \left(1 - \frac{2}{k} + \frac{1}{k^2}\sum a_{ij}^2\right)^{\!cn},$$
which is controlled by the maximizer of
$$-\sum a_{ij}\log a_{ij} + c\,\log\!\left(1 - \frac{2}{k} + \frac{1}{k^2}\sum a_{ij}^2\right)$$
over k × k doubly stochastic matrices A = (a_{ij}).

The trade-off, schematically, is between the entropy term and the correlation term in
$$-\sum_{i,j} a_{ij}\log a_{ij} \;+\; c\cdot\frac{1}{k^2}\sum_{i,j} a_{ij}^2 .$$

- Entropy decreases away from the flat 1/k matrix.
  – For small c, this loss overwhelms the sum-of-squares gain.
  – But for large enough c, ….
- The maximizer jumps instantaneously from flat to a matrix in which k entries capture the majority of the mass.
- This jump happens only for c > k log k − log k − 1. (A small numerical illustration follows below.)
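A small numerical slice of this trade-off (illustrative Python with assumed names; the actual proof, outlined next, bounds all doubly stochastic matrices, not just one path): evaluate the exponential rate of the E[X²] term along the straight path from the flat matrix to the identity and compare it with the flat value.

```python
import math

def rate(A, c, k):
    """Per-vertex exponential rate of the E[X^2] term for a doubly stochastic overlap matrix A:
    (entropy of the pair count) + c * log(probability both colorings are proper for one edge)."""
    ent = math.log(k) - sum(a * math.log(a) for row in A for a in row if a > 0) / k
    sq = sum(a * a for row in A for a in row) / (k * k)
    return ent + c * math.log(1 - 2 / k + sq)

def path(t, k):
    """Doubly stochastic matrices interpolating from flat (t = 0) to the identity (t = 1)."""
    return [[(1 - t) / k + (t if i == j else 0.0) for j in range(k)] for i in range(k)]

k = 5   # the slide's colorability bound k*log(k) - log(k) - 1 is about 5.44 here
for c in (5.0, 7.0):
    flat = rate(path(0.0, k), c, k)
    gap = max(rate(path(t / 20, k), c, k) - flat for t in range(1, 21))
    # negative gap: the flat matrix dominates this slice; positive: some correlated matrix beats it
    print(c, round(gap, 3))
```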

Proof overview

Proof. Compare the value at the flat matrix with an upper bound for everywhere else, derived by:
1. Relax to singly stochastic matrices.
2. Prescribe the L2 norm ρ_i of each row.
3. Find the maximum entropy, f(ρ_i), of each row given ρ_i.
4. Prove that f''' > 0.
5. Use (4) to determine the optimal distribution of the ρ_i given their total ρ.
6. Optimize over ρ.

Random regular graphs

Theorem. For every integer d > 0, w.h.p. the chromatic number of a random d-regular graph is k, k + 1, or k + 2, where k is the smallest integer such that d < 2k log k.

A vector analogue (optimizing a single row)

Maximize
$$-\sum_{i=1}^{k} a_i \log a_i$$
subject to
$$\sum_{i=1}^{k} a_i = 1, \qquad \sum_{i=1}^{k} a_i^2 = \rho,$$
for some 1/k < ρ < 1.

- For k = 3 the maximizer is (x, y, y) where x > y.
- For k > 3 the maximizer is (x, y, …, y). (A brute-force check for k = 3 is sketched below.)
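A brute-force sanity check of the k = 3 claim (illustrative code with assumed names, not part of the proof): fix ρ and scan the simplex for the entropy maximizer subject to the sum-of-squares constraint.

```python
import math

def max_entropy_point(rho, step=0.002, tol=0.004):
    """Scan the 3-point simplex for the entropy maximizer with sum(a_i^2) within tol of rho."""
    best, best_h = None, -1.0
    m = int(1 / step)
    for i in range(1, m):
        for j in range(1, m - i):
            a = (i * step, j * step, 1 - (i + j) * step)
            if abs(sum(x * x for x in a) - rho) > tol:
                continue
            h = -sum(x * math.log(x) for x in a)
            if h > best_h:
                best_h, best = h, a
    return best

print(max_entropy_point(0.5))   # about (0.66, 0.17, 0.17) up to permutation: the (x, y, y) shape with x > y
```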

Maximum entropy image restoration

- Create a composite image of an object that:
  – minimizes "empirical error" (typically, least-squares error over luminance);
  – maximizes "plausibility" (typically, maximum entropy).
- The structure of the maximizer helps detect stars in astronomy.

The End