Independence and chromatic number (and random k-SAT): Sparse Case
Dimitris Achlioptas (Microsoft)
Random graphs
W.h.p.: with probability that tends to 1 as n → ∞.
Hamiltonian cycle
- Let τ₂ be the moment all vertices have degree ≥ 2.
- W.h.p. G(n, m = τ₂) has a Hamiltonian cycle.
- W.h.p. it can be found in time O(n³ log n). [Ajtai, Komlós, Szemerédi 85] [Bollobás, Fenner, Frieze 87]
- In G(n, 1/2), Hamiltonicity can be decided in O(n) expected time. [Gurevich, Shelah 84]
Cliques in random graphs
- The largest clique in G(n, 1/2) has size 2 log₂ n − 2 log₂ log₂ n ± 1. [Bollobás, Erdős 75] [Matula 76]
- W.h.p. there is no maximal clique of size < log₂ n.
- Can we find a clique of size (1 + ε) log₂ n? [Karp 76]
- What if we "hide" a clique of size n^(1/2 − ε)?
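To see why log₂ n is the natural algorithmic barrier here, consider the obvious greedy procedure: each vertex added to the clique roughly halves the pool of common neighbors, so it stops after about log₂ n steps. A minimal sketch (illustrative only; the parameters are arbitrary choices):

import itertools
import random

def greedy_clique(n, p=0.5, seed=0):
    """Grow a clique greedily in G(n, p): keep adding a vertex adjacent to
    everything chosen so far. In G(n, 1/2) each step roughly halves the
    candidate pool, so the clique found has size about log2(n)."""
    rng = random.Random(seed)
    adj = [[False] * n for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        adj[u][v] = adj[v][u] = rng.random() < p
    clique, candidates = [], list(range(n))
    while candidates:
        v = candidates.pop()                      # any candidate extends the clique
        clique.append(v)
        candidates = [u for u in candidates if adj[u][v]]
    return clique

print(len(greedy_clique(2048)))  # typically close to log2(2048) = 11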
Two problems for which we know much less
- Chromatic number of sparse random graphs
- Random k-SAT
- Canonical for random constraint satisfaction:
  - binary constraints over a k-ary domain
  - k-ary constraints over a binary domain
- Studied in: AI, Math, Optimization, Physics,…
A factor-graph representation of k-coloring
- Each vertex is a variable with domain {1, 2, …, k}.
- Each edge is a constraint on two variables.
- All constraints are "not-equal".
- Random graph = each constraint picks two variables at random.
[Figure: bipartite factor graph; variable nodes v1, v2, … on one side, edge-constraint nodes e1, e2, … on the other.]
SAT via factor-graphs
(x12 ∨ x5 ∨ x9) ∧ (x34 ∨ x21 ∨ x5) ∧ ⋯ ∧ (x21 ∨ x9 ∨ x13)
- There is an edge between variable node x and clause node c iff x occurs in clause c.
- Edges are labeled +/− to indicate whether the literal is negated.
- Constraints are "at least one literal must be satisfied".
- Random k-SAT = constraints pick k literals at random.
[Figure: bipartite factor graph with variable nodes and clause nodes.]
Diluted mean-field spin glasses
- Small, discrete domains: spins
- Conflicting, fixed constraints: quenched disorder
- Random bipartite graph: lack of geometry, mean field
- Sparse: diluted
- Same family: hypergraph coloring, random XOR-SAT, error-correcting codes, …
[Figure: factor graph with variable nodes and constraint nodes.]
Random graph coloring: Background
A trivial lower bound
- For any graph, the chromatic number is at least
  (number of vertices) / (size of maximum independent set).
- For random graphs, use the first-moment upper bound on the size s of the largest independent set:
  $\binom{n}{s}(1-p)^{\binom{s}{2}} \to 0$
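Evaluating this bound is a one-liner. A small sketch for G(n, 1/2), where the bound reads C(n, s)·2^(−C(s,2)); the answer is close to 2 log₂ n:

from math import comb, log2

def first_moment_bound(n):
    """Smallest s with E[# independent s-sets in G(n, 1/2)] < 1, where
    E = C(n, s) * 2^(-C(s, 2)). W.h.p. no independent set is this large,
    so the chromatic number is at least roughly n / (2 log2 n)."""
    s = 1
    while log2(comb(n, s)) - comb(s, 2) >= 0:  # log2 of the expectation
        s += 1
    return s

for n in (10**3, 10**4, 10**5):
    print(n, first_moment_bound(n), 2 * log2(n))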
An algorithmic upper bound
- Repeat:
  - Pick a random uncolored vertex.
  - Assign it the lowest allowed number (color).
- Uses 2 × the trivial lower bound many colors.
- No algorithm is known to do better.
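Rendered as code, the procedure is just a few lines; a minimal sketch (the density d = 20 below is an arbitrary choice):

import random

def greedy_color(n, d, seed=0):
    """Greedy coloring of G(n, d/n): visit vertices in random order and give
    each the lowest color not used by its already-colored neighbors."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < d / n:
                adj[u].add(v)
                adj[v].add(u)
    color = {}
    for v in rng.sample(range(n), n):        # random uncolored vertex each time
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:                     # lowest allowed color
            c += 1
        color[v] = c
    return 1 + max(color.values())

print(greedy_color(3000, 20))  # about twice the chromatic number of G(n, 20/n)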
The lower bound is asymptotically tight
As d grows, G(n, d/n) can be colored using independent sets of essentially maximum size. [Bollobás 89] [Łuczak 91]
Only two possible values
Theorem. For every d > 0, there exists an integer k = k(d) such that w.h.p. the chromatic number of G(n, p = d/n) is either k or k + 1. [Łuczak 91]

“The Values”
Theorem. For every d > 0, there exists an integer k = k(d) such that w.h.p. the chromatic number of G(n, p = d/n) is either k or k + 1, where k is the smallest integer s.t. d < 2k log k.
Examples
- If d = 7, w.h.p. the chromatic number is 4 or 5.
- If d = 10^60, w.h.p. the chromatic number is
  3771455490672260758090142394938336005516126417647650681575
  or
  3771455490672260758090142394938336005516126417647650681576.
One value
Theorem. If (2k − 1) ln k < d < 2k ln k then w.h.p. the chromatic number of G(n, d/n) is k + 1.
- If d = 10^100, then w.h.p. the chromatic number is …
Random k-SAT: Background
Random k-SAT
- Fix a set of n variables X = {x1, x2, …, xn}.
- Among all $2^k \binom{n}{k}$ possible k-clauses, select m uniformly and independently. Typically m = rn.
- Example (k = 3):
  (x12 ∨ x5 ∨ x9) ∧ (x34 ∨ x21 ∨ x5) ∧ ⋯ ∧ (x21 ∨ x9 ∨ x13)
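A minimal sampler for this distribution (clause variables drawn without replacement, each literal negated with probability 1/2):

import random

def random_ksat(n, r, k=3, seed=0):
    """Sample F_k(n, m = rn): m clauses, each over k distinct variables with
    uniformly random signs. Literal +v / -v stands for x_v / not-x_v."""
    rng = random.Random(seed)
    return [
        [v if rng.random() < 0.5 else -v
         for v in rng.sample(range(1, n + 1), k)]
        for _ in range(int(r * n))
    ]

F = random_ksat(n=200, r=4.2)
print(len(F), F[0])  # 840 clauses; densities near r = 4.2 give the hardest 3-SAT instances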
Generating hard 3-SAT instances [Mitchell, Selman, Levesque 92]
- The critical point appears to be around r ≈ 4.2.
The satisfiability threshold conjecture
- For every k ≥ 3, there is a constant r_k such that
  lim_{n→∞} Pr[F_k(n, rn) is satisfiable] = 1 if r = r_k − ε, and 0 if r = r_k + ε.
- For every k ≥ 3,
  2^k / k < r_k < 2^k ln 2.
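The upper bound is the standard first-moment calculation; a sketch of the derivation:

$$\mathbb{E}[\#\,\text{satisfying assignments of } F_k(n, rn)] \;=\; 2^n \left(1 - 2^{-k}\right)^{rn} \;\longrightarrow\; 0 \quad \text{for } r > \frac{\ln 2}{-\ln\!\left(1 - 2^{-k}\right)},$$

and since $-\ln(1 - 2^{-k}) > 2^{-k}$, the expectation already vanishes for $r \ge 2^k \ln 2$; by Markov's inequality such formulas are then w.h.p. unsatisfiable. The lower bound comes from analyzing simple algorithms such as the unit-clause propagation on the next slide.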
Unit-clause propagation (UC)
Repeat:
- Pick a random unset variable and set it to 1.
- While there are unit-clauses, satisfy them.
- If a 0-clause is generated, fail.

UC finds a satisfying truth assignment if r < 2^k / k.
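As code (a minimal sketch; the instance format matches the sampler on the Random k-SAT slide, and the function returns None on failure):

import random

def uc(clauses, n, seed=0):
    """Unit-clause propagation: while a unit clause exists, satisfy it;
    otherwise set a random unset variable to 1 (True). Fail on a 0-clause."""
    rng = random.Random(seed)
    clauses = [list(c) for c in clauses]
    assignment = {}
    while len(assignment) < n:
        lit = next((c[0] for c in clauses if len(c) == 1), None)
        if lit is None:  # no unit clause: free step, set a random variable to True
            lit = rng.choice([v for v in range(1, n + 1) if v not in assignment])
        assignment[abs(lit)] = lit > 0
        reduced = []
        for c in clauses:
            if lit in c:
                continue                    # clause satisfied, drop it
            c = [l for l in c if l != -lit]
            if not c:
                return None                 # 0-clause generated: fail
            reduced.append(c)
        clauses = reduced
    return assignment

# Example (with the sampler from the Random k-SAT slide):
#   uc(random_ksat(n=200, r=2.0), n=200)   # often succeeds at r = 2.0,
#                                          # almost always fails near r = 4.2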
The second moment method
For any non-negative random variable X,
  Pr[X > 0] ≥ E[X]² / E[X²].
Proof: Let Y = 1 if X > 0, and Y = 0 otherwise. By Cauchy–Schwarz,
  E[X]² = E[XY]² ≤ E[X²] E[Y²] = E[X²] Pr[X > 0].
Ideal for sums
If X = X1 + X2 + ⋯ then
  E[X]² = Σ_{i,j} E[X_i] E[X_j]
  E[X²] = Σ_{i,j} E[X_i X_j]
Example: the X_i correspond to the $\binom{n}{q}$ potential q-cliques in G(n, 1/2); the dominant contribution comes from non-overlapping cliques.
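Both moments of the q-clique count can be computed exactly by grouping pairs of q-sets according to their intersection: two q-cliques sharing t vertices share C(t, 2) of their edges. A small sketch:

from math import comb

def ratio(n, q):
    """E[X]^2 / E[X^2] for X = # of q-cliques in G(n, 1/2), computed exactly
    by summing over the intersection size t of the two vertex sets."""
    EX = comb(n, q) * 2.0 ** (-comb(q, 2))
    EX2 = sum(comb(n, q) * comb(q, t) * comb(n - q, q - t)
              * 2.0 ** (comb(t, 2) - 2 * comb(q, 2))
              for t in range(q + 1))
    return EX ** 2 / EX2

print(ratio(1000, 10))  # ~1: non-overlapping pairs (small t) dominate E[X^2]
print(ratio(1000, 14))  # still Omega(1) near the clique number of G(1000, 1/2)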
General observations
- The method works well when the X_i are like "needles in a haystack".
- Lack of correlations ⟹ rapid drop in influence around solutions.
- Algorithms get no "hints".
The second moment method for random k-SAT
- Let X be the number of satisfying truth assignments.
- For every clause-density r > 0, there is β = β(r) > 0 such that
  E[X]² < (1 − β)^n E[X²].
- The number of satisfying truth assignments has huge variance.
- The satisfying truth assignments do not form a "uniformly random mist" in σ ∈ {0, 1}^n.
To prove r_k ≥ 2^k ln 2 − k/2 − 1
- Let H(σ, F) be the number of satisfied literal occurrences in F under σ.
- Let X = X(F) be defined as
  X(F) = Σ_σ 1_{σ ⊨ F} γ^{H(σ, F)} = Σ_σ Π_c 1_{σ ⊨ c} γ^{H(σ, c)}
  where γ < 1 satisfies (1 + γ)^{k−1} (1 − γ) = 1.
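The defining equation pins γ down numerically with plain bisection: g(γ) = (1 + γ)^{k−1}(1 − γ) equals 1 at γ = 0, rises to its maximum at γ = (k − 2)/k, then falls to 0 at γ = 1, so the nontrivial root lies in ((k − 2)/k, 1). A short sketch:

def gamma_for(k, tol=1e-12):
    """Unique gamma in (0, 1) with (1 + g)^(k-1) * (1 - g) = 1; the function
    is decreasing on [(k-2)/k, 1], so bisect there."""
    lo, hi = (k - 2) / k, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (1 + mid) ** (k - 1) * (1 - mid) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(gamma_for(3))   # (sqrt(5) - 1) / 2 = 0.618..., since gamma + gamma^2 = 1 here
print(gamma_for(10))  # approaches 1 as k grows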
General functions
- Given any truth assignment σ and any k-clause c, let v = v(σ, c) ∈ {−1, +1}^k be the values of the literals in c under σ.
- We will study random variables of the form
  X = Σ_σ Π_c f(v(σ, c))
  where f : {−1, +1}^k → R is an arbitrary function.
X = Σ_σ Π_c f(v(σ, c))
- f(v) = 1 for all v  ⟹  X = 2^n.
- f(v) = 0 if v = (−1, −1, …, −1), and 1 otherwise  ⟹  X = # of satisfying truth assignments.
- f(v) = 0 if v = (−1, −1, …, −1) or v = (+1, +1, …, +1), and 1 otherwise  ⟹  X = # of "Not All Equal" (NAE) truth assignments, i.e. satisfying truth assignments whose complement is also satisfying.
Overlap parameter = distance
- The overlap parameter is Hamming distance.
- For any f, if σ, τ agree on z = n/2 variables,
  E[ f(v(σ, c)) f(v(τ, c)) ] = E[ f(v(σ, c)) ] E[ f(v(τ, c)) ].
- For any f, if σ, τ agree on z variables, let
  C_f(z/n) ≡ E[ f(v(σ, c)) f(v(τ, c)) ].
Entropy vs. correlation
For every function f:
  $E[X]^2 = 2^n \sum_{z=0}^{n} \binom{n}{z}\, C_f(1/2)^m$
  $E[X^2] = 2^n \sum_{z=0}^{n} \binom{n}{z}\, C_f(z/n)^m$
Recall: $\binom{n}{\alpha n} = \left(\frac{1}{\alpha^{\alpha}(1-\alpha)^{1-\alpha}}\right)^{n} \times \mathrm{poly}(n)$.
[Figure: contribution according to distance, plotted against α = z/n; independence corresponds to α = 1/2.]
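These formulas turn the method into one-variable calculus: the second moment helps iff the exponent H(α) + r·ln C_f(α), with H the entropy function above, is maximized at α = 1/2. A numerical sketch (the two clause-correlation functions below are elementary to derive for the SAT and NAE indicator f's from the previous slides; r = 9 sits inside the 7 < r < 11 window of the NAE 5-SAT picture that follows):

from math import log

def H(a):
    """Entropy function: exponential rate of C(n, a*n)."""
    return -a * log(a) - (1 - a) * log(1 - a)

def C_sat(a, k):
    """C_f(a) for f = 'clause satisfied': sigma and tau both violate the clause
    iff all k literals are false under sigma (prob 2^-k) and every clause
    variable lies in the agreement set (prob a^k)."""
    return 1 - 2 ** (1 - k) + 2 ** (-k) * a ** k

def C_nae(a, k):
    """C_f(a) for f = 'not all literals equal' (NAE)."""
    return 1 - 2 ** (2 - k) + 2 ** (1 - k) * (a ** k + (1 - a) ** k)

def argmax_alpha(C, k, r, steps=10_000):
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda a: H(a) + r * log(C(a, k)))

print(argmax_alpha(C_sat, k=5, r=9.0))  # > 1/2: the maximum shifts right, s.m.m. fails
print(argmax_alpha(C_nae, k=5, r=9.0))  # = 1/2: balanced, s.m.m. succeeds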
[Figure: contribution vs. α = z/n for NAE 5-SAT with 7 < r < 11; the maximum is attained at α = 1/2.]

The importance of being balanced
- An analytic condition:
  C_f′(1/2) ≠ 0 ⟹ the s.m.m. fails.
- A geometric criterion:
  C_f′(1/2) = 0 ⟺ Σ_{v ∈ {−1,+1}^k} f(v) v = 0.
[Figure: the cube {−1, +1}^k between the corners (−1, −1, …, −1) and (+1, +1, …, +1), with the "Constant", "SAT", and "Complementary" weightings f.]
Balance & Information Theory
- Want to balance the vectors in an "optimal" way.
- Information theory ⟹ maximize the entropy of the f(v) subject to
  f(−1, −1, …, −1) = 0 and Σ_{v ∈ {−1,+1}^k} f(v) v = 0.
- Lagrange multipliers ⟹ the optimal f is
  f(v) = γ^{# of +1s in v}
  for the unique γ that satisfies the constraints.
[Figure: the "Heroic" weighting, between the corners (−1, −1, …, −1) and (+1, +1, …, +1).]
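One can check directly that this f satisfies the geometric criterion at the γ defined earlier: writing s(v) for the number of +1s in v, the i-th coordinate of Σ_v f(v)·v is

$$\sum_{v:\,v_i=+1} \gamma^{s(v)} \;-\!\!\sum_{\substack{v:\,v_i=-1 \\ v \neq (-1,\dots,-1)}}\!\! \gamma^{s(v)} \;=\; \gamma(1+\gamma)^{k-1} - \left[(1+\gamma)^{k-1} - 1\right],$$

which vanishes precisely when (1 + γ)^{k−1}(1 − γ) = 1, the equation defining γ on the earlier slide.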
Random graph coloring
Threshold formulation
Theorem. A random graph with n vertices and m = cn edges is w.h.p. k-colorable if
  c ≤ k log k − log k − 1,
and w.h.p. non-k-colorable if
  c ≥ k log k − (1/2) log k.
Main points
- Non-k-colorability:
  Proof. The probability that there exists any k-coloring is at most
  k^n (1 − 1/k)^{cn} → 0.
- k-colorability:
  Proof. Apply the second moment method to the number of balanced k-colorings of G(n, m).
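Solving the first-moment inequality shows where the non-k-colorability bound comes from:

$$k^n\left(1 - \tfrac{1}{k}\right)^{cn} \to 0 \iff c > \frac{\ln k}{-\ln(1 - 1/k)} = k\ln k - \tfrac{1}{2}\ln k - O\!\left(\tfrac{\ln k}{k}\right),$$

using −ln(1 − 1/k) = 1/k + 1/(2k²) + O(k⁻³); this matches the theorem's c ≥ k log k − (1/2) log k.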
Setup
- Let X_σ be the indicator that the balanced k-partition σ is a proper k-coloring.
- We will prove that if c ≤ k log k − log k − 1 then for X = Σ_σ X_σ there is a constant D = D(k) > 0 such that
  E[X²] < D E[X]².
- This implies that G(n, cn) is k-colorable w.h.p.
Setup
- E[X²] = sum over all σ, τ of E[X_σ X_τ].
- For any pair of balanced k-partitions σ, τ, let a_ij n/k be the number of vertices having color i in σ and color j in τ. Then
  Pr[σ and τ are proper] = (1 − 2/k + (1/k²) Σ_ij a_ij²)^{cn}.
Examples
- Balance ⟹ A = (a_ij) is doubly stochastic.
- When σ, τ are uncorrelated, A is the flat matrix with all entries 1/k.
- As σ, τ align, A tends to the identity matrix I.
A matrix-valued overlap
So,
  $E[X^2] = \sum_{A \in \mathcal{B}_k} \binom{n}{An} \left(1 - \frac{2}{k} + \frac{1}{k^2}\sum a_{ij}^2\right)^{cn}$
which is controlled by the maximizer of
  $-\sum a_{ij} \log a_{ij} + c \log\!\left(1 - \frac{2}{k} + \frac{1}{k^2}\sum a_{ij}^2\right)$
over k × k doubly stochastic matrices A = (a_ij).
  $-\sum_{i,j} a_{ij} \log a_{ij} + c \cdot \frac{1}{k^2}\sum_{i,j} a_{ij}^2$
- Entropy decreases away from the flat 1/k matrix.
  - For small c, this loss overwhelms the sum-of-squares gain.
  - But for large enough c, ….
- The maximizer jumps instantaneously from flat to a matrix where k entries capture the majority of mass.
- Proved that this jump happens only for c > k log k − log k − 1.
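One way to watch the jump numerically: restrict attention to the one-parameter family A_δ = (1 − δ)·I + δ·(flat), which matches the shape of the claimed maximizer (k dominant entries) and interpolates between aligned pairs (δ = 0) and uncorrelated pairs (δ = 1). A sketch comparing the exact pair-exponent along this family with its value at flat (k = 5; the sample densities are arbitrary choices):

from math import log

def pair_rate(k, c, delta):
    """Exponential rate of the E[X^2] contribution from pairs with overlap
    matrix A = (1 - delta) * I + delta * flat: entropy of the k^2 cell
    probabilities a_ij / k, plus c * log Pr[a random edge is proper in both]."""
    diag = (1 - delta) + delta / k         # the k diagonal entries of A
    off = delta / k                        # the k^2 - k off-diagonal entries
    cells = [diag / k] * k + [off / k] * (k * k - k)
    entropy = -sum(p * log(p) for p in cells if p > 0)
    sum_sq = k * diag ** 2 + (k * k - k) * off ** 2
    return entropy + c * log(1 - 2 / k + sum_sq / k ** 2)

k = 5                                      # k log k - log k - 1 = 5.43... here
for c in (4.0, 6.5, 8.0):
    flat = pair_rate(k, c, 1.0)            # delta = 1 recovers the E[X]^2 rate
    best = max((i / 100 for i in range(1, 101)), key=lambda d: pair_rate(k, c, d))
    print(c, best, round(pair_rate(k, c, best) - flat, 4))
# Small c: the flat matrix wins (best = 1.0). Larger c: an aligned matrix takes over.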
Proof overview
Proof. Compare the value at the flat matrix with an upper bound for everywhere else, derived by:
1. Relax to singly stochastic matrices.
2. Prescribe the L2 norm ρ_i of each row.
3. Find the max-entropy value, f(ρ_i), of each row given ρ_i.
4. Prove that f''' > 0.
5. Use (4) to determine the optimal distribution of the ρ_i given their total ρ.
6. Optimize over ρ.
Random regular graphs
Theorem. For every integer d > 0, w.h.p. the chromatic number of a random d-regular graph is either k, k + 1, or k + 2, where k is the smallest integer s.t. d < 2k log k.
A vector analogue (optimizing a single row)
Maximize
  −Σ_{i=1}^{k} a_i log a_i
subject to
  Σ_{i=1}^{k} a_i = 1 and Σ_{i=1}^{k} a_i² = ρ,
for some 1/k < ρ < 1.
- For k = 3 the maximizer is (x, y, y) where x > y.
- For k > 3 the maximizer is (x, y, …, y).
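For k = 3 the candidate critical points have two equal coordinates, with the odd coordinate out either the largest or the smallest entry, and both branches can be written in closed form; comparing their entropies confirms the (x, y, y), x > y picture. A quick check:

from math import log, sqrt

def entropy(v):
    return -sum(a * log(a) for a in v if a > 0)

def branches(rho):
    """Critical points for k = 3 under sum a_i = 1, sum a_i^2 = rho. Both have
    two equal coordinates; the odd one out is either the largest or the
    smallest. (The small-singleton branch is nonnegative only for rho < 1/2.)"""
    s = sqrt(6 * rho - 2)                                   # real for rho > 1/3
    large = ((1 + s) / 3, (2 - s) / 6, (2 - s) / 6)         # (x, y, y), x > y
    small = ((1 - s) / 3, (2 + s) / 6, (2 + s) / 6)         # (x, y, y), x < y
    return large, small

for rho in (0.36, 0.40, 0.45):
    large, small = branches(rho)
    print(rho, entropy(large) > entropy(small))             # True: x > y wins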
Maximum entropy image restoration
- Create a composite image of an object that:
  - minimizes "empirical error" (typically, least-squares error over luminance);
  - maximizes "plausibility" (typically, maximum entropy).
- The structure of the maximizer helps detect stars in astronomy.
The End