Rotation and Scale Invariant Texture Analysis with Tunable Gabor Filter Banks
Xinqi Chu, Kap Luk Chan
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Email: {chux0001, eklchan}@ntu.edu.sg
Jan. 14, 2009
Electrical and Electronic Engineering, NTU
The Invariance Issue on Textures Why Invariance?
Figure: Examples illustrating the invariance problem. Textures shown: D001, D004, D021, D040. Versions shown: original, rotation, scaling, concurrent rotation and scaling.
The Invariance Issue on Textures Classification of Existing Invariant Texture Recognition Approaches
Same size, same orientation: B.S. Manjunath & W.Y. Ma, on 112 textures
Rotation invariance (max. 40 textures): Porter & Canagarajah (discrete wavelet decomposition); Kim and Udpa (rotated wavelet filters); Fountain and Tan (multichannel filtering); Kashyap and Khotanzad (circular autoregressive model)
Rotation or scale invariance: Kai-Kuang Ma and Ju Han, on 16 textures
Concurrent rotation & scale invariance: C.M. Pun & M.C. Lee, on 25 textures; Cohen and Patel, on 9 textures; Chen and Kundu, on 10 textures; Wu and Wei, on 10 textures
Our method is based on a spectral shift measure with a fast search strategy, which gives the same recognition speed as when rotation and scale are absent. The method is tested on 112 textures.
Concurrent Rotation and Scale Texture Recognition
The most desirable methods are expected to have the following properties:
  Concurrent rotation and scaling invariance
  Good recognition performance when a large number of texture classes is present
We chose to build on top of Manjunath's (non-invariant) Gabor filter banks because:
  They show excellent performance on the entire Brodatz database, which consists of 112 classes
  They mimic receptive fields in the visual cortex
Therefore our aim is to achieve "concurrent rotation and scaling invariance".
The 2-D Gabor Function The Band-pass Filter
Figure: 2-D Gabor function
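The figure itself is not recoverable from the text. For orientation, the form below is the 2-D Gabor function and its Fourier transform as commonly written for Manjunath and Ma's filter bank. The symbols (modulation frequency W, spreads sigma_x, sigma_y, sigma_u, sigma_v) come from that formulation and are an assumption about what the slide showed.

```latex
g(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}
          \exp\!\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right) + 2\pi j W x\right]

G(u, v) = \exp\!\left\{-\frac{1}{2}\left[\frac{(u - W)^2}{\sigma_u^2} + \frac{v^2}{\sigma_v^2}\right]\right\},
\qquad \sigma_u = \frac{1}{2\pi\sigma_x},\quad \sigma_v = \frac{1}{2\pi\sigma_y}
```

In the frequency domain the filter is a Gaussian centred at (W, 0), which is why it acts as a band-pass filter.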
The 2-D Gabor filter bank
Figure: The 2-D Gabor filter bank
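A minimal sketch of a Gabor filter bank and its texture feature follows, assuming the setup usually associated with Manjunath and Ma: 4 scales, 6 orientations, centre frequencies between 0.05 and 0.4 cycles per pixel, and the mean and standard deviation of each filtered magnitude as features (24 channels x 2 = 48 values). The function name `gabor_bank_features`, the parameter defaults, and the simplified formula for the angular bandwidth `sigma_v` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gabor_bank_features(image, n_scales=4, n_orients=6, u_low=0.05, u_high=0.4):
    """Return [mean, std] of the filtered magnitude for every channel of the bank."""
    rows, cols = image.shape
    # Frequency grid in cycles per pixel, laid out like the output of np.fft.fft2.
    v, u = np.meshgrid(np.fft.fftfreq(rows), np.fft.fftfreq(cols), indexing="ij")
    spectrum = np.fft.fft2(image - image.mean())
    a = (u_high / u_low) ** (1.0 / (n_scales - 1))   # frequency ratio between scales
    half_peak = np.sqrt(2.0 * np.log(2.0))
    features = []
    for m in range(n_scales):
        w = u_low * a ** m                           # centre frequency of scale m
        sigma_u = (a - 1.0) * w / ((a + 1.0) * half_peak)
        # Assumption: simplified angular spread so neighbouring orientations meet
        # near their half-peak contours (the published formula is more involved).
        sigma_v = np.tan(np.pi / (2 * n_orients)) * w / half_peak
        for n in range(n_orients):
            theta = n * np.pi / n_orients
            # Rotate the frequency plane so the Gaussian sits along direction theta.
            u_rot = u * np.cos(theta) + v * np.sin(theta)
            v_rot = -u * np.sin(theta) + v * np.cos(theta)
            gabor = np.exp(-0.5 * (((u_rot - w) / sigma_u) ** 2
                                   + (v_rot / sigma_v) ** 2))
            response = np.abs(np.fft.ifft2(spectrum * gabor))
            features.extend([response.mean(), response.std()])
    return np.array(features)
```

Filtering is done by multiplication in the frequency domain, so the cost per channel is one inverse FFT. With the defaults the returned vector has 48 entries, ordered scale by scale and, within a scale, orientation by orientation.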
The 2-D Gabor filter bank An example from the Brodatz Database original128x128/d004.bmp
The Effects of Rotation and Scale How to estimate rotation and scaling
Problem to solve: classifying an input texture image with
  unknown class ID
  unknown rotation (relative to the true reference)
  unknown scale (relative to the true reference)
We first assume that the class ID of the input texture is already known, and focus on estimating rotation and scale. We'll tackle the entire problem after the estimation method is introduced.
The Effects of Rotation and Scale How to estimate rotation and scaling when the CLASS ID IS KNOWN?
Figure: Texture 21 in four versions, each shown with its spectrum in the Cartesian frequency domain and its dominant peak position in polar coordinates.

Version                                        Dominant peak (theta in rad)   Estimation                  Error
Original                                       rho=35, theta=1.57             N/A (reference)             N/A
20 degree counter-clockwise                    rho=34.78, theta=1.89          s=1, r=18.43 degree         7.8% in r
1.5 up-scaled                                  rho=24, theta=1.57             s=1.46, r=0 degree          2.6% in s
1.5 up-scaled, 20 degree counter-clockwise     rho=23.14, theta=1.89          s=1.5125, r=18.43 degree    0.8% in s, 7.8% in r
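The estimation step can be sketched as follows, assuming the dominant peak is taken as the strongest non-DC bin of the FFT magnitude and that the image is square. Up-scaling an image by s pulls its spectral peak towards the origin by the same factor, so the scale estimate is the ratio of reference radius to input radius, and the rotation estimate is the difference of the peak angles. The function names and the angle-folding convention are illustrative assumptions; the sign of the angle depends on the image axis convention.

```python
import math
import numpy as np

def dominant_peak_polar(image):
    """Return (rho, theta) of the strongest non-DC spectral peak.

    rho is measured in frequency bins; theta is in radians, folded into [0, pi)
    because the spectrum of a real image is point-symmetric.
    """
    rows, cols = image.shape
    mag = np.abs(np.fft.fftshift(np.fft.fft2(image - image.mean())))
    cy, cx = rows // 2, cols // 2
    mag[cy, cx] = 0.0                        # ignore any residual DC term
    iy, ix = np.unravel_index(np.argmax(mag), mag.shape)
    fx, fy = ix - cx, iy - cy
    rho = math.hypot(fx, fy)
    theta = math.atan2(fy, fx) % math.pi
    return rho, theta

def estimate_rotation_scale(ref_peak, in_peak):
    """Return (scale, rotation in degrees) of the input relative to the reference."""
    rho_ref, theta_ref = ref_peak
    rho_in, theta_in = in_peak
    scale = rho_ref / rho_in                 # larger texture -> lower frequency
    rotation = math.degrees(theta_in - theta_ref)
    rotation = (rotation + 90.0) % 180.0 - 90.0   # wrap into [-90, 90)
    return scale, rotation
```

With the peak positions listed for Texture 21, `estimate_rotation_scale((35, 1.57), (24, 1.57))` gives a scale of 35/24, about 1.46, and zero rotation. Using the rounded angles 1.57 and 1.89 gives a rotation of about 18.3 degrees, close to the 18.43 degrees reported from the unrounded peak position.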
The Idea and the Paradox
We can measure the scale and rotation with respect to the true reference texture by a spectral shift measure in the polar frequency domain if the CLASS ID IS KNOWN.
The chicken and egg problem: we have to know the correct class ID in order to get the correct rotation and scale parameters. But if we already know the class ID of the input, the classification job is done, so why bother estimating these parameters?
Figure: The chicken and egg problem (for an input texture, classification and rotation and scale estimation each depend on the other)
The Solution to the Paradox The Algorithm Chart
Training:
  112 training images, 1 for each class
  Extract feature vectors for the 112 reference images, 1 for each class
  112 spectra, 1 for each class: locate the dominant frequency
Classification:
  Input texture
  Estimate the position of the dominant frequency
  Extract features w.r.t. the 112 Gabor banks (each feature is 48-D)
  Compare with the 112 reference texture feature vectors
  Output: classification, and rotation and scale estimation
Figure: The solution to the chicken and egg problem
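One way to read the chart is as a loop over class hypotheses: for each class, assume the input belongs to it, derive the rotation and scale from the peak shift against that class's reference, extract features with a Gabor bank tuned to that hypothesis, and keep the class whose reference feature is closest. The sketch below follows that reading. The callable `extract_tuned_features` (which would apply a rotated and scaled Gabor bank to the input image) and the Euclidean distance are hypothetical placeholders, not details given on the slides.

```python
import math
import numpy as np

def classify(input_peak, ref_peaks, ref_features, extract_tuned_features):
    """Return (class_id, scale, rotation_degrees) of the best class hypothesis.

    input_peak:  (rho, theta) of the input's dominant frequency
    ref_peaks:   list of (rho, theta), one per class reference
    ref_features: array of shape (n_classes, n_features)
    extract_tuned_features: hypothetical callable (scale, rotation_rad) -> feature
    """
    rho_in, theta_in = input_peak
    best = None
    for class_id, (rho_ref, theta_ref) in enumerate(ref_peaks):
        # Rotation and scale implied by assuming the input is of this class.
        scale = rho_ref / rho_in
        rotation = theta_in - theta_ref
        feature = extract_tuned_features(scale, rotation)
        # Assumption: plain Euclidean distance between feature vectors.
        distance = float(np.linalg.norm(feature - ref_features[class_id]))
        if best is None or distance < best[0]:
            best = (distance, class_id, scale, math.degrees(rotation))
    return best[1], best[2], best[3]
```

Under this reading, the winning hypothesis yields the class label together with its rotation and scale estimate, which is how classification and parameter estimation are obtained at the same time.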
The Algorithm
The advantages of this algorithm:
  Robustness: the only method so far that has been applied to all 112 classes.
  While the input texture is classified, its rotation and scaling are also quantitatively evaluated, a property that most methods using invariant representations do not have.
  Recognition speed is a bit slower than the non-invariant algorithm (if implemented in Matlab).
The disadvantages of this algorithm:
  Memory requirement increases to 128 times that of the original method.
  Inherent system errors.
The Evaluation of the Method The texture taxonomy
Textures
  Homogeneous
    Periodical
    Directional
    Random
  Inhomogeneous
Figure: For each category, an example texture, its spectrum, and other examples in the same category
The Evaluation of the Method (cont’d) The experimental setup
Scale: 0.7 to 1.4, step = 0.1, 8 scales
Rotation: 0° to 180°, step = 20°, 10 rotations
In total, 112 × 8 × 10 = 8960 testing images. There are 80 images for each class, of which 63 are concurrently rotated and scaled.
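The counts above can be checked by enumerating the test conditions. This is a small sketch; the variable names are illustrative, and "concurrently rotated and scaled" is taken to mean a scale other than 1.0 combined with a rotation other than 0 degrees.

```python
# Scales 0.7, 0.8, ..., 1.4 and rotations 0, 20, ..., 180 degrees.
scales = [round(0.7 + 0.1 * i, 1) for i in range(8)]
rotations = list(range(0, 181, 20))

# Every (scale, rotation) pair is one test image per class.
conditions = [(s, r) for s in scales for r in rotations]

# Pairs where the image is both scaled and rotated.
concurrent = [(s, r) for s, r in conditions if s != 1.0 and r != 0]

n_classes = 112
total_images = n_classes * len(conditions)

print(len(conditions), len(concurrent), total_images)  # 80 63 8960
```

The 63 concurrent cases are the 7 non-unit scales combined with the 9 non-zero rotations.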
Figure: Each column shows two images of the same texture (from left to right: D18, D26, D27, D87, D112) that differ in scale by a factor of 4. The textures look entirely different, although the upper row is just a 4 times up-scaled version of the lower row.
The Evaluation of the Method (cont’d) The experimental results
Figure: Three plots, titled "Conventional method, no-rotation/scale", "Conventional method, rotated & scaled" and "Proposed method, rotated & scaled". Vertical axes run from 0 to 1, horizontal axes from 0 to 120. Further plot titles: "Manjunath's applied on original dataset (128x128, 112x16)" and "Manjunath's applied on rotated dataset (90x90, 112x16x9)".
Table: Results by dataset and method (values in %)

Dataset       Conventional        Conventional       Our method         Rotation/scale effects   Rate increase
              no-rotation/scale   rotated & scaled   rotated & scaled   on conventional          due to tuning
Inhomo.       44.9                19.7               50.2               -25.2                    +30.4
Periodic      97.9                19.5               79.7               -78.4                    +60
Directional   87.6                19.1               71.7               -68.5                    +51.6
Rand.         61.2                19.7               56.7               -40.5                    +37.0
Overall       74.7                19.4               63.7               -55.3                    +44.2
Summary and the way ahead
Future work:
  Are mean and variance the proper statistics for the filtered outputs?
  The dominant peak shift is probably not an accurate and safe measure of the rotation; taking multiple peaks into consideration could give a more robust estimate.
  Inhomogeneous textures can be taken into account.
Thank you!