Spherical Embedding and Classification
Richard C. Wilson, Edwin R. Hancock
Dept. of Computer Science, University of York
Background
Dissimilarities are a common starting point in pattern recognition
• Dissimilarity matrix D
• Find the similarity matrix S = −½ (I − J/n) D (I − J/n), where J is the all-ones matrix
If S is PSD then
– S is a kernel matrix K
– Can use a kernel machine
– Or embed points in a Euclidean space:
  K = X Xᵀ = U Λ Uᵀ,  X = U Λ^(1/2)
– And use standard vector-based pattern recognition
We can identify D with the squared (Euclidean) distances between points:
  D_ij = d²(x_i, x_j) = ⟨x_i − x_j, x_i − x_j⟩ = ⟨x_i, x_i⟩ + ⟨x_j, x_j⟩ − 2⟨x_i, x_j⟩
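The double-centering construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; the function name and the unit-square test data are our own.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed a squared-distance matrix D via double centering.

    S = -1/2 (I - J/n) D (I - J/n); if S is PSD it is a kernel matrix
    and X = U Lambda^(1/2) recovers Euclidean coordinates.
    """
    n = D.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n        # centering matrix I - J/n
    S = -0.5 * C @ D @ C
    evals, evecs = np.linalg.eigh(S)
    idx = np.argsort(evals)[::-1][:k]          # k largest eigenvalues
    return evecs[:, idx] * np.sqrt(np.maximum(evals[idx], 0.0))

# four corners of a unit square: squared pairwise distances
pts = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
D = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
X = classical_mds(D, k=2)
# the recovered configuration reproduces the squared distances
D_rec = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
```

Here D is Euclidean, so S is PSD and the embedding is exact up to rotation.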
Background
Commonly, when comparing structural representations, we perform some kind of alignment. The similarity matrix is then usually indefinite (it has negative eigenvalues), so we do not obtain a kernel.
K = U Λ Uᵀ, but X = U Λ^(1/2) does not exist if any λ_i < 0
There is no representation as points in a Euclidean space. Two basic approaches:
– Modify the dissimilarities to make them Euclidean
– Use a non-Euclidean space
Ideally, any non-Euclidean space should be metric and feasible to compute distances in
Riemannian space
We want to embed the objects in a space so we can compute statistics of them
• Need a metric distance measure
• Space cannot be Euclidean (normal, flat space)
Riemannian space fulfils all these requirements
• Space is curved – distances are not Euclidean, similarities are indefinite
• Metric space
• Distances are measured by geodesics (the shortest curve joining two points in the space)
– Can be difficult to compute the geodesics
– Need to use a space where geodesics are easy
• Obvious choice is space of constant curvature everywhere
Spherical space
• The elliptic manifold has constant positive curvature everywhere
– Can visualise as the surface of a hypersphere embedded in Euclidean space
– Embedding equation: ⟨x_i, x_i⟩ = r² for all i
– Has a positive definite metric tensor, so it is Riemannian
– Sectional curvature is K = 1/r²
Well-known example: the sphere
x² + y² + z² = r²
P(x, y, z) = (r sin u sin v, r cos u sin v, r cos v)
ds² = r² sin²v du² + r² dv²
⟨x, y⟩ = xᵀy
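The parametrisation and metric above can be checked numerically. A small sketch (the helper name `P` and the chosen sample point are ours): a parametrised point should lie on the sphere, and a finite-difference step length should match ds² = r² sin²v du² + r² dv².

```python
import numpy as np

r = 2.0

def P(u, v):
    """Slide parametrisation of the sphere of radius r."""
    return np.array([r * np.sin(u) * np.sin(v),
                     r * np.cos(u) * np.sin(v),
                     r * np.cos(v)])

u, v = 0.7, 1.1
point_norm = np.linalg.norm(P(u, v))          # should equal r

# finite-difference check of the line element
du, dv = 1e-6, 1e-6
ds = np.linalg.norm(P(u + du, v + dv) - P(u, v))
ds_metric = np.sqrt(r**2 * np.sin(v)**2 * du**2 + r**2 * dv**2)
```

The cross term vanishes because the coordinate tangent vectors P_u and P_v are orthogonal, which is why only du² and dv² appear in the metric.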
Non-Euclidean geometry
Previous work
• Lindman and Caelli (1978)
– Embedding of psychological similarity data in elliptic and hyperbolic (negatively curved) space
– Optimisation method suitable for small datasets
• Cox and Cox (1991) – Define the stress of a configuration on the sphere and minimise – Difficult optimisation problem – Not practical on large datasets
• Shavitt and Tankel (2008)
– Embedding of internet connectivity into hyperbolic space
– Physics-based simulation of particle dynamics
In this paper we use the exponential map to develop an efficient solution for large datasets
Geodesics on the sphere • A geodesic curve on a manifold is the curve of shortest length joining two points – The geodesic distance between two points is the length of the geodesic joining them
• Geodesics are ‘great circles’ of the hypersphere
• Distance between points depends on the angle and the curvature
d_ij = r θ_ij
⟨x_i, x_j⟩ = r² cos θ_ij
d_ij = r arccos( ⟨x_i, x_j⟩ / r² )
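The geodesic-distance formula can be checked numerically. A small sketch (the helper name is ours, not from the slides):

```python
import numpy as np

def spherical_geodesic(xi, xj, r=1.0):
    """Geodesic distance d_ij = r * arccos(<x_i, x_j> / r^2)."""
    c = np.clip(np.dot(xi, xj) / r**2, -1.0, 1.0)  # guard against rounding
    return r * np.arccos(c)

# two points a quarter great-circle apart on a sphere of radius 2
r = 2.0
xi = np.array([r, 0.0, 0.0])
xj = np.array([0.0, r, 0.0])
d = spherical_geodesic(xi, xj, r)   # angle pi/2 on radius 2 -> distance pi
```

The clipping step matters in practice: accumulated floating-point error can push the cosine fractionally outside [−1, 1], where arccos is undefined.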
Elliptic space embedding
Problem: Find points on the surface of a hypersphere such that the geodesic distances are given by D
min_{X, r} Σ_ij ( d_ij² − d*_ij² )²   subject to |x_i| = r,
where d_ij = r arccos( ⟨x_i, x_j⟩ / r² )
Non-linear constrained optimisation problem Computationally expensive for large datasets Our strategy is to update each point position separately
Exponential Map
One of the difficulties of the embedding is that it is constrained to a hypersphere. The exponential map is a tool from differential geometry which allows us to map between a manifold and the tangent space at a point
• The tangent plane is a Euclidean subspace
• Log part: from the manifold to the tangent plane, X = Log_M Y
• The Exp part goes in the opposite direction, Y = Exp_M X
• The map is defined relative to a centre M
[Figure: tangent plane T_M touching the manifold at M; Exp maps X on T_M to Y on the manifold, Log maps Y back to X]
Exponential map for sphere
Map points on the sphere onto the tangent plane
– The map has an origin where the sphere touches the tangent plane (O)
– Distances to the origin are preserved (OX = OX′)
Optimise on the tangent plane
Exponential map for the sphere
Given a centre m, a point x on the sphere and a point x′ on the tangent plane:
  x′ = (θ / sin θ)(x − m cos θ)   (to tangent plane)
  x = m cos θ + (sin θ / θ) x′   (to sphere)
where θ is the geodesic angle between x and m. The tangent plane is flat, so distances are Euclidean:
d_ij² = (x_j − x_i)ᵀ(x_j − x_i)
When x_i is the centre of the map (x_i = m) the distances to x_i are exact; distortion grows as points move away from the centre
• Project current positions onto the tangent plane using x_i as centre
• Compute the gradient of the embedding error on the tangent plane
• Update position x_i
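The two map formulas can be sketched directly in NumPy. This is a minimal illustration (function names are ours); the round trip Exp ∘ Log should return the original point, and the norm of the tangent image equals the geodesic distance rθ.

```python
import numpy as np

def log_map(m, x, r=1.0):
    """Sphere -> tangent plane at m: x' = (theta/sin theta)(x - m cos theta)."""
    theta = np.arccos(np.clip(np.dot(m, x) / r**2, -1.0, 1.0))
    if theta < 1e-12:                      # x coincides with the centre
        return np.zeros_like(x)
    return (theta / np.sin(theta)) * (x - m * np.cos(theta))

def exp_map(m, xp, r=1.0):
    """Tangent plane at m -> sphere: x = m cos theta + (sin theta/theta) x'."""
    theta = np.linalg.norm(xp) / r
    if theta < 1e-12:
        return m.copy()
    return m * np.cos(theta) + (np.sin(theta) / theta) * xp

# round trip on the unit sphere: m and x a quarter circle apart
m = np.array([1.0, 0.0, 0.0])
x = np.array([0.0, 1.0, 0.0])
xp = log_map(m, x)           # in the tangent plane; |xp| = geodesic distance
x_back = exp_map(m, xp)
```

Note that x − m cos θ is orthogonal to m, so Log lands in the tangent plane at m, and Exp maps any tangent vector back onto the sphere of radius r.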
Updating procedure
E = Σ_{i,j} ( d_ij² − d*_ij² )²
∇E_i = 4 Σ_j ( d_ij² − d*_ij² )( x_i − x_j )
x_i^(k+1) = x_i^(k) − η ∇E_i
For large datasets, computation of second derivatives is expensive, so we use simple gradient descent. We can however choose an optimal step size η as the smallest positive root of the cubic:
  n |∇E_i|⁴ η³ − 3 |∇E_i|² (Σ_j β_j) η² + ( |∇E_i|² Σ_j α_j + 2 Σ_j β_j² ) η − Σ_j α_j β_j = 0
with α_j = d_ij² − d*_ij² and β_j = ∇E_iᵀ(x_i − x_j)
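The per-point tangent-plane update can be sketched as follows. This is our own illustrative implementation, not the authors' code: `optimal_step` and the toy data are assumptions, the cubic coefficients follow from substituting x_i − η∇E_i into E and differentiating in η, and we fall back to a small fixed step if no positive root is found.

```python
import numpy as np

def optimal_step(xi, others, dstar2):
    """One tangent-plane update of x_i using the optimal step size.

    The step is the smallest positive root of
      n|g|^4 t^3 - 3|g|^2 (sum b_j) t^2 + (|g|^2 sum a_j + 2 sum b_j^2) t
        - sum a_j b_j = 0
    with a_j = d_ij^2 - d*_ij^2 and b_j = g^T (x_i - x_j), g = grad E_i.
    """
    diffs = xi - others                          # x_i - x_j
    a = (diffs ** 2).sum(axis=1) - dstar2        # a_j
    g = 4.0 * (a[:, None] * diffs).sum(axis=0)   # gradient of E_i
    gn = g @ g                                   # |g|^2
    b = diffs @ g                                # b_j
    n = len(others)
    coeffs = [n * gn**2,
              -3.0 * gn * b.sum(),
              gn * a.sum() + 2.0 * (b**2).sum(),
              -(a * b).sum()]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    t = min((t for t in real if t > 0), default=1e-3)  # fallback small step
    return xi - t * g

# toy example: one point, two neighbours, target squared distances of 1
others = np.array([[0.0, 0.0], [2.0, 0.0]])
dstar2 = np.array([1.0, 1.0])
xi = np.array([0.5, 1.0])

def embed_err(x):
    a = ((x - others) ** 2).sum(axis=1) - dstar2
    return (a ** 2).sum()

E_old = embed_err(xi)
xi_new = optimal_step(xi, others, dstar2)
E_new = embed_err(xi_new)
```

Since dE/dη at η = 0 equals −|∇E_i|² < 0, the smallest positive stationary point is a minimum along the descent direction, so the error is guaranteed to decrease.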
Initialisation
• Need a good initialisation • Method presented in CVPR10
Z(r)_ij = r² cos( d*_ij / r )  (should equal ⟨x_i, x_j⟩ = (X Xᵀ)_ij)
r* = argmin_r |λ₀(Z(r))|
X = U_Z Λ_Z^(1/2)
• If λ₀ = 0 the result is exact
• Otherwise, this is a good starting point for our optimisation
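A sketch of this initialisation, under our reading of the slide: build Z(r), pick r by minimising the magnitude of the smallest eigenvalue (here by a crude grid search), and embed from the eigendecomposition. Function names and test data are ours.

```python
import numpy as np

def Z_matrix(Dstar, r):
    """Z(r)_ij = r^2 cos(d*_ij / r), the candidate Gram matrix."""
    return r**2 * np.cos(Dstar / r)

def smallest_eig(Dstar, r):
    """|lambda_0(Z(r))|: zero when the data fit a sphere of radius r."""
    return abs(np.linalg.eigvalsh(Z_matrix(Dstar, r))[0])

def initial_embedding(Dstar, r):
    """X = U_Z Lambda_Z^(1/2), clamping tiny negative eigenvalues to zero."""
    evals, evecs = np.linalg.eigh(Z_matrix(Dstar, r))
    return evecs * np.sqrt(np.maximum(evals, 0.0))

# geodesic distances of four points on the unit sphere: +/-e1, e2, e3
pts = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
Dstar = np.arccos(np.clip(pts @ pts.T, -1.0, 1.0))

# crude grid search for r*; the true radius is 1
rs = np.linspace(0.5, 2.0, 151)
r_star = rs[np.argmin([smallest_eig(Dstar, r) for r in rs])]
X0 = initial_embedding(Dstar, r_star)
```

Because these distances really do come from a unit sphere, λ₀(Z(1)) = 0 and the recovered Gram matrix X₀X₀ᵀ matches Z(1) exactly.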
Algorithm
1. Minimise |λ₀(Z(r))| to find the initial embedding:
   r* = argmin_r |λ₀(Z(r))|,  X = U_Z Λ_Z^(1/2)
2. Iterate, for each point x_i:
   – Map points onto the tangent plane at x_i:  x′ = (θ / sin θ)(x − m cos θ)
   – Optimise on the tangent plane:
     ∇E_i = 4 Σ_j ( d_ij² − d*_ij² )( x_i − x_j ),  x_i^(k+1) = x_i^(k) − η ∇E_i
   – Map points back onto the manifold:  x = m cos θ + (sin θ / θ) x′
Classifiers in Elliptic space In practical applications, we want to do some kind of learning on the data, for example classification • NN classifier is straightforward, as we can compute distances • In principle, we can implement any geometric classifier as we have a smooth metric space – But the classifier must respect the geometry of the manifold
• A simple classifier we can use is the nearest mean classifier (NMC) – Can compute the generalised mean on the manifold for each class
Generalised mean:
  x̄ = argmin_x Σ_i d²(x, x_i)
  x̄^(k+1) = Exp_{x̄^(k)}( (1/n) Σ_i Log_{x̄^(k)} x_i )
(iterative process)
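The iteration can be sketched directly from the Exp/Log formulas for the sphere. A minimal illustration (the function name and the symmetric test points are ours): average the Log images in the tangent plane at the current estimate, then Exp back onto the sphere.

```python
import numpy as np

def spherical_mean(points, iters=50, r=1.0):
    """Generalised (Karcher) mean on the sphere of radius r:
    x^(k+1) = Exp_{x^(k)}( (1/n) sum_i Log_{x^(k)} x_i )."""
    m = points[0] / np.linalg.norm(points[0]) * r   # initialise at a data point
    for _ in range(iters):
        tang = []
        for x in points:                            # Log map each point
            theta = np.arccos(np.clip(np.dot(m, x) / r**2, -1.0, 1.0))
            if theta < 1e-12:
                tang.append(np.zeros_like(x))
            else:
                tang.append((theta / np.sin(theta)) * (x - m * np.cos(theta)))
        v = np.mean(tang, axis=0)                   # average in the tangent plane
        theta = np.linalg.norm(v) / r
        if theta < 1e-12:                           # converged
            break
        m = m * np.cos(theta) + (np.sin(theta) / theta) * v   # Exp map back
    return m

# four unit vectors arranged symmetrically around e3: mean should be e3
s = 1 / np.sqrt(2)
pts = np.array([[s, 0, s], [-s, 0, s], [0, s, s], [0, -s, s]])
m_hat = spherical_mean(pts)
```

The update respects the manifold at every step: the estimate stays exactly on the sphere, unlike a naive Euclidean average followed by renormalisation.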
Some examples
* Reproduced from “Classification of silhouettes using contour fragments”, Daliri and Torre, CVIU 113(9), 2009
Learning: classification
Conclusions
• We can use Riemannian spaces to represent data from dissimilarity measures when they cannot be represented in a Euclidean space
• Showed an efficient method for embedding in elliptic space which works on large datasets
– Produces embeddings of low distortion
• Can define simple classifiers which respect the manifold – NN, NMC
• Need to extend to more sophisticated geometric classifiers