The skew spectrum of graphs
Risi Kondor, Gatsby Unit, UCL
with Karsten Borgwardt, University of Cambridge
[Figure: example graph on seven vertices, labeled 1 to 7]
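$$A = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}$$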
q(A) is a graph invariant if its value is unchanged by relabeling the vertices.
poly(n)-time computable complete set of invariants → graph isomorphism problem
efficiently computable set of invariant features → graph kernels, etc.
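As an illustration of what "invariant to relabeling" means, here is a minimal sketch using the 7-vertex adjacency matrix from the example. The helper names (`relabel`, `degree_sequence`, `first_vertex_degree`) are hypothetical and not part of the talk; the sorted degree sequence is just one simple invariant chosen for the demonstration, and the degree of a fixed label is shown as a non-invariant for contrast.

```python
import numpy as np

# Adjacency matrix of the 7-vertex example graph from the slides.
A = np.array([
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 0],
])

def relabel(A, perm):
    """Return A' with A'[perm[i], perm[j]] = A[i, j] (vertex i renamed perm[i])."""
    A_new = np.zeros_like(A)
    A_new[np.ix_(perm, perm)] = A
    return A_new

def degree_sequence(A):
    """Sorted vertex degrees: unchanged by any relabeling."""
    return sorted(A.sum(axis=1).tolist())

def first_vertex_degree(A):
    """Degree of the vertex labeled 0: depends on the labeling."""
    return int(A[0].sum())

perm = [3, 1, 2, 0, 4, 5, 6]   # swap the labels of vertices 0 and 3 (0-indexed)
A2 = relabel(A, perm)

print(degree_sequence(A) == degree_sequence(A2))          # an invariant
print(first_vertex_degree(A), first_vertex_degree(A2))    # not an invariant
```

The degree sequence is invariant but far from complete: many non-isomorphic graphs share it.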
$f\colon S_n \to \mathbb{R}, \qquad f(\sigma) = [A]_{\sigma(n),\sigma(n-1)}$
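With n = 7, the slot holding 7 marks σ(7), the slot holding 6 marks σ(6), and ? stands for any remaining value:
f(7 6 ? ? ? ? ?) = $[A]_{1,2}$
f(7 ? 6 ? ? ? ?) = $[A]_{1,3}$
f(7 ? ? 6 ? ? ?) = $[A]_{1,4}$
...
f(? 7 6 ? ? ? ?) = $[A]_{2,3}$
f(? 7 ? 6 ? ? ?) = $[A]_{2,4}$
...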
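The definition f(σ) = [A]σ(n),σ(n−1) can be made concrete with a short brute-force sketch over all of S7. Permutations are stored as 0-indexed tuples (an assumption of this sketch, so `sigma[n - 1]` plays the role of σ(n)), and the function name `f` simply mirrors the slide.

```python
from itertools import permutations
from math import factorial
import numpy as np

# Adjacency matrix of the 7-vertex example graph from the slides.
A = np.array([
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 0],
])
n = A.shape[0]

def f(sigma):
    """f(sigma) = [A]_{sigma(n), sigma(n-1)}; sigma[i] is the 0-indexed image of i."""
    return int(A[sigma[n - 1], sigma[n - 2]])

# In the slides' 1-indexed notation: sigma(7) = 1 and sigma(6) = 4.
sigma = (1, 2, 4, 5, 6, 3, 0)
value = f(sigma)               # equals [A]_{1,4}

# Summing f over the whole group: each ordered vertex pair is hit (n-2)! times.
total = sum(f(s) for s in permutations(range(n)))
print(value, total, int(A.sum()) * factorial(n - 2))
```

The sum over the group is the Fourier component of f at the trivial representation, which already shows why f carries a lot of redundancy: it takes only n(n−1) distinct values on n! permutations.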
Now if we permute the vertices by $i \mapsto \pi(i)$, the adjacency matrix becomes $A'$ with
$$[A']_{\pi(i),\pi(j)} = [A]_{i,j}$$
$f'$(7 in slot π(i), 6 in slot π(j), ? elsewhere) = $f$(7 in slot i, 6 in slot j, ? elsewhere)
... in other words $f'(\pi\sigma) = f(\sigma)$,
... or $f' = f^\pi$, where $f^\pi(\sigma) = f(\pi^{-1}\sigma)$ is the translate of f by π.
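The translation property can be checked by brute force on a small graph. This is a sketch under stated assumptions: a random graph on 5 vertices (seeded, purely illustrative), 0-indexed permutation tuples, and the hypothetical helpers `make_f` and `compose`, where `compose(p, q)` applies q first and then p.

```python
from itertools import permutations
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random undirected graph without self-loops (illustrative data only).
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
A = upper + upper.T

# Relabel the vertices by pi: [A']_{pi(i), pi(j)} = [A]_{i, j}.
pi = (2, 0, 3, 4, 1)
A_prime = np.zeros_like(A)
A_prime[np.ix_(pi, pi)] = A

def make_f(M):
    """Build f(sigma) = [M]_{sigma(n), sigma(n-1)} for 0-indexed tuples."""
    return lambda sigma: int(M[sigma[n - 1], sigma[n - 2]])

def compose(p, q):
    """(p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(n))

f, f_prime = make_f(A), make_f(A_prime)

# f'(pi sigma) = f(sigma) for every sigma in S_n.
translation_holds = all(
    f_prime(compose(pi, s)) == f(s) for s in permutations(range(n))
)
print(translation_holds)
```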
2. Non-commutative harmonic analysis and invariants
G is a group if for any x, y, z ∈ G
1. xy ∈ G,
2. x(yz) = (xy)z,
3. there is an e ∈ G such that ex = xe = x,
4. there is an x⁻¹ ∈ G such that xx⁻¹ = x⁻¹x = e.
Permutations σ : {1, 2, ..., n} → {1, 2, ..., n} form a group called the symmetric group, denoted $S_n$.
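The four axioms can be verified exhaustively for a small symmetric group. The sketch below does this for S4 with 0-indexed tuples; `compose` and `inverse` are hypothetical helper names, and the last check shows that the group is not commutative, which is why the harmonic analysis that follows is non-commutative.

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))     # all 4! = 24 elements of S_4
G_set = set(G)
e = tuple(range(n))                  # identity permutation

def compose(p, q):
    """Group product: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(n))

def inverse(p):
    """Inverse permutation: inverse(p)[p[i]] = i."""
    inv = [0] * n
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

closure = all(compose(x, y) in G_set for x in G for y in G)
associative = all(
    compose(x, compose(y, z)) == compose(compose(x, y), z)
    for x in G for y in G for z in G
)
has_identity = all(compose(e, x) == x == compose(x, e) for x in G)
has_inverses = all(compose(x, inverse(x)) == e == compose(inverse(x), x) for x in G)
commutative = all(compose(x, y) == compose(y, x) for x in G for y in G)

print(closure, associative, has_identity, has_inverses, commutative)
```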
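The classical Fourier transform
$$\hat f(k) = \sum_{x=0}^{n-1} e^{-ikx} f(x)$$
generalizes to a group G as
$$\hat f(\rho) = \sum_{x \in G} \rho(x)\, f(x),$$
where $\rho\colon G \to \mathbb{C}^{d \times d}$ satisfying $\rho(x)\,\rho(y) = \rho(xy)$ is called a representation of G.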
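Example: $S_3$
$$\rho(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\rho((12)) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
\rho((123)) = \begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix}$$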
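To see that the three matrices of the S3 example really define a representation, one can extend them to all six group elements and test ρ(x)ρ(y) = ρ(xy) on every pair. The sketch assumes 0-indexed permutation tuples with `compose(p, q)` applying q first; the extension from the generators (12) and (123) to the whole group is done by a simple search and is an illustrative construction, not a method from the talk.

```python
import numpy as np

s = (1, 0, 2)    # the transposition (12), 0-indexed
c = (1, 2, 0)    # the 3-cycle (123), 0-indexed
rho_s = np.array([[1.0, 0.0], [0.0, -1.0]])
rho_c = np.array([[-0.5, -np.sqrt(3) / 2], [np.sqrt(3) / 2, -0.5]])

def compose(p, q):
    """Group product: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

# Extend rho from the generators to all of S_3 via rho(g h) = rho(g) rho(h).
e = (0, 1, 2)
rho = {e: np.eye(2)}
frontier = [e]
while frontier:
    g = frontier.pop()
    for h, rho_h in ((s, rho_s), (c, rho_c)):
        gh = compose(g, h)
        if gh not in rho:
            rho[gh] = rho[g] @ rho_h
            frontier.append(gh)

is_homomorphism = all(
    np.allclose(rho[compose(x, y)], rho[x] @ rho[y]) for x in rho for y in rho
)
is_unitary = all(np.allclose(m.T @ m, np.eye(2)) for m in rho.values())
print(len(rho), is_homomorphism, is_unitary)
```

The matrices are real orthogonal, so this representation is unitary, as the later invariance arguments require.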
Equivalence: $\rho_1(x) = T^{-1} \rho_2(x)\, T$
Reducibility: $T^{-1} \rho(x)\, T = \begin{pmatrix} \rho_1(x) & 0 \\ 0 & \rho_2(x) \end{pmatrix}$
We denote a complete set of inequivalent irreducible unitary representations by $\mathcal{R}$.
The Fourier transform on a group is
$$\hat f(\rho) = \sum_{x \in G} f(x)\, \rho(x), \qquad \rho \in \mathcal{R}.$$
• Diaconis: Group representations in probability and statistics (1988)
• Clausen, Maslen, Rockmore, Healy, ...: FFTs
• Kondor, Howard and Jebara: Multi-object tracking with representations of the symmetric group (AISTATS, 2007)
• Huang, Guestrin and Guibas: Efficient inference for distributions on permutations (NIPS, 2007)
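Under translation, $\widehat{f^t}(\rho) = \rho(t)\, \hat f(\rho)$.
The power spectrum of f is the set of invariant matrices
$$\hat a(\rho) = \hat f(\rho)^\dagger \cdot \hat f(\rho)$$
since, with $\rho(t)$ unitary,
$$\hat a^t(\rho) = (\rho(t) \hat f(\rho))^\dagger \cdot (\rho(t) \hat f(\rho)) = \hat f(\rho)^\dagger \cdot \hat f(\rho) = \hat a(\rho)$$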
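The invariance of the power spectrum is easiest to see in the commutative special case of the cyclic group, where every irreducible representation is one-dimensional and the Fourier transform is the ordinary DFT. The sketch below uses that special case as an assumption (with the DFT normalization $e^{-2\pi i k x / n}$ of `numpy.fft.fft`) and an arbitrary test signal. It also shows what the power spectrum loses: a reflected signal has the same power spectrum although it is not a translate, which motivates the richer invariants that follow.

```python
import numpy as np

f = np.array([1.0, 2.0, 4.0, 0.0, 0.0])   # arbitrary function on Z_5

def power_spectrum(h):
    """|f_hat(k)|^2 for every frequency k (1x1 case of f_hat^dagger f_hat)."""
    return np.abs(np.fft.fft(h)) ** 2

# All translates f^t(x) = f(x - t).
translates = [np.roll(f, t) for t in range(len(f))]
invariant = all(
    np.allclose(power_spectrum(g), power_spectrum(f)) for g in translates
)

# Reflection g(x) = f(-x mod n): same power spectrum, but not a translate of f.
g = np.roll(f[::-1], 1)
same_power = np.allclose(power_spectrum(g), power_spectrum(f))
is_translate = any(np.array_equal(g, h) for h in translates)

print(invariant, same_power, is_translate)
```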
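Kakarala’s non-commutative bispectrum is
$$\hat b(\rho_1, \rho_2) = C^\dagger \left( \hat f(\rho_1) \otimes \hat f(\rho_2) \right)^\dagger C \bigoplus_\rho \hat f(\rho)$$
where
$$\rho_1(z) \otimes \rho_2(z) = C \left[ \bigoplus_\rho \rho(z) \right] C^\dagger$$
is the Clebsch-Gordan decomposition. [Kakarala, 1992]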
The skew spectrum is the unitarily equivalent, but easier to compute, set of matrices
$$\hat q_z(\rho) = \hat r_z(\rho)^\dagger \cdot \hat f(\rho)$$
where
$$r_z(x) = f(xz)\, f(x)$$
[Kondor, 2007]
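A brute-force sketch of the skew spectrum on S3 makes the invariance tangible. It assumes the three irreducible representations of S3 (trivial, sign, and the 2-dimensional one from the earlier example), 0-indexed permutation tuples, a random test function, and left translation f^t(x) = f(t⁻¹x). All helper names are hypothetical, and the direct sum over the group is only feasible because the group is tiny.

```python
import numpy as np

s, c = (1, 0, 2), (1, 2, 0)   # generators (12) and (123), 0-indexed
gens = {
    s: [np.eye(1), -np.eye(1), np.array([[1.0, 0.0], [0.0, -1.0]])],
    c: [np.eye(1), np.eye(1),
        np.array([[-0.5, -np.sqrt(3) / 2], [np.sqrt(3) / 2, -0.5]])],
}

def compose(p, q):
    """Group product: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    """Inverse permutation."""
    return tuple(sorted(range(3), key=lambda i: p[i]))

# Extend the three irreducible representations to all of S_3.
e = (0, 1, 2)
reps = {e: [np.eye(1), np.eye(1), np.eye(2)]}
frontier = [e]
while frontier:
    g = frontier.pop()
    for h, mats in gens.items():
        gh = compose(g, h)
        if gh not in reps:
            reps[gh] = [a @ b for a, b in zip(reps[g], mats)]
            frontier.append(gh)
G = list(reps)

def fourier(func):
    """f_hat(rho) = sum_x f(x) rho(x) for each of the three irreps."""
    return [sum(func[x] * reps[x][k] for x in G) for k in range(3)]

def skew_spectrum(func):
    """q_hat_z(rho) = r_hat_z(rho)^dagger f_hat(rho), r_z(x) = f(xz) f(x)."""
    f_hat = fourier(func)
    spectrum = {}
    for z in G:
        r_z = {x: func[compose(x, z)] * func[x] for x in G}
        r_hat = fourier(r_z)
        spectrum[z] = [r.conj().T @ fh for r, fh in zip(r_hat, f_hat)]
    return spectrum

rng = np.random.default_rng(0)
f = {x: rng.normal() for x in G}
t = (2, 0, 1)
f_t = {x: f[compose(inverse(t), x)] for x in G}    # f^t(x) = f(t^{-1} x)

q, q_t = skew_spectrum(f), skew_spectrum(f_t)
invariant = all(np.allclose(a, b) for z in G for a, b in zip(q[z], q_t[z]))
print(invariant)
```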
3. Back to graphs...
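What we have so far:
1. $f(\sigma) = [A]_{\sigma(n),\sigma(n-1)}$
2. Under permuting the vertices, $f' = f^\pi$
3. Our favorite invariant is the skew spectrum
$$\hat q_\nu(\rho) = \hat r_\nu(\rho)^\dagger \cdot \hat f(\rho)$$
where
$$r_\nu(\sigma) = f(\sigma\nu)\, f(\sigma), \qquad \hat f(\rho) = \sum_{\sigma \in S_n} \rho(\sigma)\, f(\sigma)$$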
Far too expensive in this form!
But f(σ) depends only on σ(n) and σ(n−1), so f is constant on each left coset $\sigma S_{n-2}$.
1. The ν index only has to extend over one representative from each $S_{n-2}\,\sigma\,S_{n-2}$ double coset.
2. The $\hat f$ and $\hat r_\nu$ Fourier transforms are very sparse.
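How many representatives point 1 leaves can be checked by brute force for small n. The sketch below takes Sn−2 to be the subgroup fixing the last two points (an assumption matching f(σ) = [A]σ(n),σ(n−1)) and counts the double cosets directly; `count_double_cosets` is a hypothetical helper, not part of any library.

```python
from itertools import permutations

def count_double_cosets(n):
    """Count the double cosets H g H in S_n, where H = S_{n-2} fixes the last two points."""
    G = list(permutations(range(n)))
    H = [g for g in G if g[n - 2] == n - 2 and g[n - 1] == n - 1]

    def compose(p, q):
        """Group product: (p q)(i) = p(q(i))."""
        return tuple(p[q[i]] for i in range(n))

    seen, count = set(), 0
    for g in G:
        if g in seen:
            continue
        count += 1                       # g is a new double coset representative
        for h1 in H:
            h1g = compose(h1, g)
            for h2 in H:
                seen.add(compose(h1g, h2))
    return count

for n in (4, 5, 6):
    print(n, count_double_cosets(n))
```

For these values of n the count does not grow with n, so the number of ν indices to consider stays constant.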
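[Figure: Fourier components $\hat f(\rho)$ and blocks $\hat r_\nu(\rho)^\dagger \cdot \hat f(\rho)$ for the irreducible representations of dimension d = 1, n − 1, n(n − 3)/2, (n − 1)(n − 2)/2 and n(n − 1)(n − 5)/6; the block sizes shown are one 2·2 and three 1·1, 7 scalars in total.]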
The answer is 49 (and it’s computable in $O(n^3)$ time).
$S_n$ Bratteli diagram
http://www.cs.columbia.edu/~risi/SnOB
4. Experiments
• For n up to about 300, the skew spectrum can be computed in fractions of a second.
• For small graphs (n ~ 5) it’s complete!
• For n ~ 100 it is good for learning tasks.
Dataset   Number of instances/classes   Max. number of nodes   Reduced skew spectrum   Random walk kernel   Shortest path kernel
MUTAG     188/2                         28                     88.61 (0.21)            71.89 (0.66)         81.28 (0.45)
ENZYME    600/6                         126                    25.83 (0.34)            14.97 (0.28)         27.53 (0.29)
NCI1      4110/2                        111                    62.72 (0.05)            51.30 (0.23)         61.66 (0.10)
NCI109    4127/2                        111                    62.62 (0.03)            53.11 (0.11)         62.35 (0.13)
Conclusions
• Reduced the problem of representing graphs to an abstract algebraic problem.
• Being restricted to a homogeneous space makes it easy to compute the skew spectrum but also collapses its size.
• Surprisingly, just 49 scalar invariants seem to be enough to do the job (compressed sensing).
• Natural question: what about labeled graphs?