Texture Segmentation using Gabor Filter and MultiLayer Perceptron*

Javad Alirezaie

Nezamoddin N. Kachouie

Department of Electrical and Computer Eng. Ryerson University Toronto, Canada [email protected]

Department of Systems Design Eng. University of Waterloo Waterloo, Canada

Abstract - Previous approaches to texture analysis and segmentation perform multi-channel filtering by applying a set of filters in the frequency domain or a set of masks in the spatial domain. In this paper we describe a texture segmentation algorithm based on multi-channel filtering in conjunction with neural networks for feature extraction and segmentation. The features extracted by Gabor filters have been applied for image segmentation and analysis. There are some important considerations about filter parameters and the reduction of feature dimensions. Here we introduce a method to extract optimal feature dimensions using competitive networks and multilayer perceptrons. We present the segmentation results using different bandwidths. A comparison of the segmentation results generated using our method with those of previous research using learning vector quantization (LVQ) is presented.

Keywords: Texture, Gabor filter, neural networks, Competitive, MLP, LVQ, feature extraction, segmentation

* 0-7803-7952-7/03/$17.00 © 2003 IEEE.

1 Introduction

Texture segmentation and analysis is an important aspect of pattern recognition and digital imaging. In image analysis, textures have been used to perform scene segmentation for object and region recognition, surface classification and shape recognition. Texture analysis needs to identify attributes which are useful to discriminate, recognize and segment different texture types. Texture segmentation involves accurately partitioning an image into sections according to the textured regions or by recognizing the borders between different textures in the scene or image. Much research in this field has proposed texture segmentation and analysis methods based on a filter bank model, which is motivated by the human vision system's (HVS) unique capabilities for texture segmentation. In this model, a set of filters in the frequency domain (or a set of masks in the spatial domain) is applied in parallel to an input image and decomposes it into a set of filtered images accordingly. The filtered images are used directly as feature images, or they are processed and/or combined to extract the features. Eventually the extracted features are used for the segmentation of the input image.


There are several applicable choices of filter banks for textured images. From a practical point of view, some filters may be more useful for specific segmentation tasks but not for others. Gaussian filters modulated by exponential or sinusoidal functions, known as Gabor filters, have proved to be very useful for texture analysis of images containing specific frequency and orientation characteristics [1,4,12,13]. While different filter banks can perform joint spatial/spatial-frequency decomposition, a Gabor filter bank using a Gabor base function is one of the most attractive ones. This set of filters has optimal localization in the joint spatial/spatial-frequency domain according to the uncertainty principle [1,4]. These selective band-pass filters with different radial spatial frequencies and orientations have optimum resolution in the time and frequency domains and resemble the characteristics of simple visual cortical cells.

Considerable research has been performed using Gabor filters for texture segmentation and analysis. The research involving the use of Gabor filters for texture segmentation can be divided into three major disciplines: 1- Investigating the best frequencies and orientations (bandwidth) according to the characteristics of a specific textured image or considering the general texture analysis problem. 2- Inventing robust feature extraction and feature reduction methods. 3- Employing the best classification and segmentation methods to apply to the optimal extracted features.

Bovik and Clark [1] have proposed a computational approach for analyzing visible textures. In their method they detect boundaries between textures by comparing the channel amplitude responses, and detect discontinuities in the texture phase by locating large variations in the channel phase responses. Dunn and Higgins have investigated the design of optimal Gabor filters [4]; they argued that Gabor filter outputs can be represented by a Rician model and developed an algorithm to select optimal filter parameters to discriminate texture pairs. Jain and Farrokhnia proposed a filter selection method [6] based on reconstruction of the input image from the filtered images. They also proposed the optimum radial frequencies and orientations for different channels that have been used widely by many researchers since. Unser and Eden described an unsupervised texture segmentation method [12] using the Karhunen-Loeve transform on the resulting features of Gabor filters to reduce the feature vector dimension. Weldon et al. presented a method [13] to design a single Gabor filter for multi-textured image segmentation. Clausi and Jernigan investigated and compared different techniques [3] used to extract texture features, and Jain and Karu proposed a neural network texture classification method [7] as a generalization of the multi-channel filtering method.

This paper describes a texture segmentation method that follows the general multi-channel decomposition approach and employs neural networks both for feature reduction and classification. By applying a Gabor filter bank to the input image with suggested radial frequencies and orientations, a set of filtered images is generated. The multi-channel decomposition is accomplished by estimating the local energy in the filter outputs. Having 5 radial frequencies and 6 orientations generates thirty feature images. To reduce the features and obtain an optimal feature dimension we train a competitive network with an unsupervised learning method. The weight vectors of the trained network are used to reduce the extracted feature dimension. Eventually the resultant reduced features are used to train a multi-layer perceptron with a supervised learning method to segment the input image.

In section 2 we review the multi-channel decomposition by Gabor filters. Section 3 describes the proposed method using a Gabor Filter bank in conjunction with a Competitive layer and a Multi-Layer Perceptron (GF-CMLP). In sections 4 and 5 the results and conclusion are presented respectively.

2 Gabor filter bank

A Gabor base function is a Gaussian function modulated with an exponential or sinusoidal function; it is defined in terms of the product of a Gaussian and an exponential. Two-dimensional Gabor functions h(x,y) can be written as:

and its frequency response H(u,v) is:

$$H(u,v) = G(u-U_0, v) = \exp\{-2\pi^2[(u-U_0)^2\sigma_x^2 + v^2\sigma_y^2]\} \qquad (3)$$
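As a point of reference, one spatial-domain form that is consistent with equation (3) is sketched here. It assumes the Fourier convention $G(u,v)=\iint g(x,y)\,e^{-j2\pi(ux+vy)}\,dx\,dy$, a unit-area Gaussian envelope and modulation along the x axis. The normalization and the numbering of the original expressions are assumptions, so this should be read as a compatible form rather than the authors' exact equations:

$$g(x,y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left\{-\frac{1}{2}\left[\frac{x^2}{\sigma_x^2}+\frac{y^2}{\sigma_y^2}\right]\right\}$$

$$h(x,y) = g(x,y)\,\exp(j2\pi U_0 x)$$

Under these assumptions the transform of the envelope is $G(u,v)=\exp\{-2\pi^2[u^2\sigma_x^2+v^2\sigma_y^2]\}$, and the complex modulation shifts it to $(U_0, 0)$, which gives $H(u,v)=G(u-U_0,v)$ as in equation (3).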

Gabor functions are bandpass filters which are Gaussians centered on (U0, θ) in the spatial-frequency domain. The parameters U0, σx and σy determine the subband Gabor filter: U0 and θ locate the center frequency, and σx and σy are the bandwidths of the filter. Equation (2) defines a complete Gabor function consisting of both real and imaginary (or even and odd) components. Rotation by θ in the spatial domain (x-y plane) or in the spatial-frequency domain (u-v plane) provides an arbitrary selective orientation for different channels. A Gabor filter bank can be implemented using only the even-symmetric (real) components, as suggested by Jain and Farrokhnia [6], whose frequency response can be represented by:

$$H(u,v) = G(u-U_0, v) + G(u+U_0, v) \qquad (5)$$

This response is composed of two Gaussians in the spatial-frequency domain, compared with one Gaussian in the complex version.

General multi-channel filtering methods, as shown in Fig. 1, consist of 3 major stages: applying the filter bank to generate filtered images, local energy estimation for feature extraction, and classification of the extracted features into different regions for segmentation. In the first step a textured input image is decomposed into filtered images. Typically, in the second stage, a local energy function consisting of a nonlinearity and smoothing is applied to the filtered images (the output of the Gabor filter bank) for feature extraction [1,3,6,13]. The most well known nonlinearity functions are: 1- sigmoid function, 2- rectifying, 3- square function, 4- magnitude response and 5- real part. In this step each channel, corresponding to a different filter, is tuned to a different radial frequency or orientation to capture local characteristics of the different textures in the input image, such as spatial frequency, edge intensity and direction. After the second stage, a set of feature images, or equivalently a feature vector for each pixel in the input image, is generated. The dimension of the feature vectors is equal to (or an integer multiple of) the number of filters in the filter bank.
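The first two stages described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: filtering is done in the frequency domain with the even-symmetric response of equation (5), frequencies are expressed in cycles per pixel, the square function is used as the nonlinearity, and the values chosen for `sigma_x`, `sigma_y` and `smooth_sigma` as well as all function names are hypothetical.

```python
import numpy as np

def gaussian_tf(u, v, sigma_x, sigma_y):
    # G(u, v) = exp{-2 pi^2 [sigma_x^2 u^2 + sigma_y^2 v^2]}, the Gaussian of equation (3)
    return np.exp(-2.0 * np.pi ** 2 * ((sigma_x * u) ** 2 + (sigma_y * v) ** 2))

def frequency_grid(shape):
    # Frequencies in cycles/pixel; v runs along rows, u along columns.
    rows, cols = shape
    return np.meshgrid(np.fft.fftfreq(rows), np.fft.fftfreq(cols), indexing="ij")

def even_gabor_tf(shape, u0, theta, sigma_x, sigma_y):
    # Equation (5), H = G(u - U0, v) + G(u + U0, v), evaluated on axes rotated by theta.
    v, u = frequency_grid(shape)
    ur = u * np.cos(theta) + v * np.sin(theta)
    vr = -u * np.sin(theta) + v * np.cos(theta)
    return (gaussian_tf(ur - u0, vr, sigma_x, sigma_y)
            + gaussian_tf(ur + u0, vr, sigma_x, sigma_y))

def local_energy_features(image, bank, smooth_sigma):
    # Filter, square (nonlinearity), then smooth with a Gaussian low-pass filter.
    v, u = frequency_grid(image.shape)
    lowpass = gaussian_tf(u, v, smooth_sigma, smooth_sigma)
    spectrum = np.fft.fft2(image)
    features = []
    for H in bank:
        filtered = np.real(np.fft.ifft2(spectrum * H))
        energy = np.real(np.fft.ifft2(np.fft.fft2(filtered ** 2) * lowpass))
        features.append(energy)
    return np.stack(features, axis=-1)  # one feature vector per pixel

# Toy input: a sinusoidal "texture" varying along the columns at 0.25 cycles/pixel.
x = np.arange(64)
image = np.tile(np.cos(2 * np.pi * 0.25 * x), (64, 1))
thetas = np.deg2rad([0, 30, 60, 90, 120, 150])  # six orientations, 30 degrees apart
bank = [even_gabor_tf(image.shape, 0.25, t, 4.0, 4.0) for t in thetas]
features = local_energy_features(image, bank, smooth_sigma=4.0)
```

In this toy case the channel tuned to 0° carries almost all of the energy and the 90° channel almost none, which is the kind of orientation selectivity the feature images are meant to capture.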

Fig. 1. Multi-channel decomposition diagram

Eventually the feature vectors should be classified and assigned to different textures. There are several classification methods to accomplish the segmentation task, such as: Bayesian classifier, nearest neighbor classifier, Multi-Layer Perceptron (MLP), Fisher Linear Discriminant (FLD) and Learning Vector Quantization (LVQ).

3 Proposed method (GF-CMLP)

Our proposed method (GF-CMLP), as shown in Fig. 2, consists of three main stages:

Fig. 2. GF-CMLP method

3.1 Multichannel decomposition

This stage consists of filtering by a Gabor filter bank, applying a nonlinearity function and smoothing the results. We use two sets of Gabor filters composed of 20 and 30 filters. For both sets the same 5 radial frequencies suggested by Jain and Farrokhnia [6] are used: 4√2, 8√2, 16√2, 32√2 and 64√2.

The radial frequency bandwidth is one octave, thus the frequency difference between U1 and U2, given by log2(U2/U1), is equal to 1. For each radial frequency, 4 orientations 0°, 45°, 90° and 135° are used in the first bank, which generates a total of 20 channels, and in the second bank 6 orientations 0°, 30°, 60°, 90°, 120° and 150° are used, which gives a bank of 30 filters.

A square function is used as the nonlinearity; it causes the sinusoidal modulations in the output of the filter bank to be transformed to a square modulation. To smooth out the fluctuations in the specific texture or noise in the image im(x,y), a Gaussian low-pass filter is applied to the output of the filter bank. The size of the smoothing function is determined according to the size of the Gabor band-pass filter: the impulse response of the Gaussian filter depends on U0, the radial frequency of the band-pass Gabor filter.

3.2 Feature vector reduction

To reduce the feature vector dimension we use a competitive network. Section 3.2.1 presents an overview of competitive networks and section 3.2.2 describes feature vector reduction in the GF-CMLP method.

3.2.1 Competitive networks

In a competitive layer the neurons are distributed in order to recognize frequently presented input vectors in an unsupervised manner. The competitive transfer function accepts a net input for each neuron in the layer and returns neuron outputs of 0 for all neurons except for the winner, which outputs 1. Each neuron competes to respond to an input vector p: the neuron whose weight vector is closest to p gets the highest net input and therefore wins the competition and outputs 1, while all other neurons output 0. In the training phase, the weights of the winning neuron are adjusted with the learning rule shown in equation (7), so as to move the winner closer to the input [8].

$$\Delta w = \lambda (p - w) \qquad (7)$$

where p, w and λ are an input vector, an input weight vector and a learning rate respectively. Since this learning rule allows the weights to learn an input vector, it is useful in recognition applications. The neuron whose weight vector is closest to the input vector is updated to be even closer. As a result the winning neuron is more likely to win the competition when a similar vector is presented, and less likely to win when a very different input vector is presented. During training, as more inputs are presented, each neuron in the layer that is closest to a group of input vectors adjusts its weights to more closely resemble those inputs. Eventually, every cluster of similar input vectors will have a neuron that responds when a presented vector belongs to the cluster, and the competitive network will categorize the input vectors.

3.2.2 Feature reduction using weight vectors of C-N

To reduce the feature dimension in the GF-CMLP method, a competitive network (C-N) as depicted in Fig. 3 is used. The motivation to use competitive networks stems from previous research using LVQ for classification [8,11]. LVQ consists of one competitive layer and one linear layer, and it is a supervised learning algorithm. LVQ describes the class borders by the nearest neighbor rule and its main applications are statistical pattern recognition and classification [8]. Despite using a hidden competitive layer, LVQ is restricted to learning only linear relationships between the quantized vectors and the desired output vectors. If the network could learn nonlinear relationships between the reduced dimension vectors and the desired outputs, it would be possible to obtain improved segmentation results.

On the other hand, applying the resultant weight vectors of a competitive layer to the features prior to feeding the classifier reduces the dimensionality of the feature vectors, as depicted in Fig. 3, and results in faster segmentation. In order to train the C-N, sample vectors are selected at random among the extracted feature vectors obtained by the Gabor filter bank in the previous step. After the C-N is trained, the weight vectors of the layer are regarded as quantizing mask coefficients. These coefficients are applied to the feature vector extracted by the Gabor filter bank for each pixel in the input image. For images with different numbers of textures we used different numbers of neurons, from 5 to 20, in our C-N. According to two criteria, the classification error and the dimension of the quantized feature vectors, we selected 5, 8 and 10 neurons for the C-N as the optimum numbers of neurons for 2, 4 and 5 textures in the image respectively.

Fig. 3. Feature reduction and classification by the GF-CMLP method (30×1 training feature vectors, competitive network with 10×30 weight vectors, 10×1 quantized feature vectors, segmented image)
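The competitive learning rule and the subsequent feature reduction can be sketched as below. This is only an illustration under assumptions that the text does not confirm: the winner is chosen by smallest Euclidean distance, the weights are initialized from randomly chosen samples, and "applying" the weight vectors is read as the matrix product suggested by the dimensions in Fig. 3 (a 10×30 weight matrix times a 30×1 feature vector). The learning rate, the number of epochs and the synthetic data are hypothetical.

```python
import numpy as np

def competitive_step(weights, p, lr):
    # The neuron whose weight vector is closest to p wins; only the winner is updated.
    winner = int(np.argmin(np.linalg.norm(weights - p, axis=1)))
    weights[winner] += lr * (p - weights[winner])  # learning rule of equation (7)
    return winner

def train_competitive(samples, n_neurons, lr=0.1, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    # Assumption: initialize each weight vector from a randomly chosen sample.
    weights = samples[rng.choice(len(samples), n_neurons, replace=False)].copy()
    for _ in range(epochs):
        for p in samples[rng.permutation(len(samples))]:
            competitive_step(weights, p, lr)
    return weights

def reduce_features(features, weights):
    # (n_pixels x 30) times (30 x n_neurons): one reduced vector per pixel.
    return features @ weights.T

# Synthetic stand-in for 30-dimensional Gabor feature vectors from two textures.
rng = np.random.default_rng(1)
features = np.vstack([rng.normal(0.2, 0.05, (100, 30)),
                      rng.normal(0.8, 0.05, (100, 30))])
weights = train_competitive(features, n_neurons=5)
reduced = reduce_features(features, weights)  # 30 dimensions reduced to 5
```

With 5 neurons the 30-dimensional vectors become 5-dimensional, matching the dimension reported for two-texture images.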

3.3 Classification by Multi-Layer Perceptron

To learn nonlinear relationships between input and output vectors, a Multi-Layer Perceptron (MLP) has multiple layers with nonlinear transfer functions. Feedforward networks often have one or more hidden layers of sigmoid neurons followed by an output layer [5]. The MLP is trained by adjusting the weights using Least Square Error (LSE), which minimizes the mean square error as shown by equation (8).

The total square error between the desired class and the actual output in output layer K is calculated. To train the neural network, the gradient is determined by using a backpropagation technique, which involves performing computations backwards through the network. After the backpropagation network is trained properly, it typically provides reasonable answers when presented with inputs that it has never seen. A 3-layer perceptron, which is used to accomplish the segmentation task, is depicted in Fig. 3. Our MLP uses the sigmoid transfer function in all three layers. During training, randomly selected quantized feature vectors are assigned to their proper classes. Although the extracted feature dimension is 30 for input images with different numbers of textures, the quantized feature dimension is 5, 8 and 10 for images with 2, 4 and 5 textures respectively. After the MLP is trained, the quantized feature vector corresponding to each pixel of the image is classified to its proper region.
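A small backpropagation sketch of such a classifier is given below. It assumes a total square error of the form E = ½ Σ (d − y)² over the output layer, two hidden layers plus an output layer (one possible reading of "3-layer"), full-batch gradient descent and one-hot class targets. The layer sizes, learning rate, iteration count and synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_mlp(sizes, seed=0):
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 1.0, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    # Returns the activations of every layer, input included.
    acts = [x]
    for W, b in params:
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def train_step(params, x, d, lr):
    acts = forward(params, x)
    y = acts[-1]
    error = 0.5 * np.sum((d - y) ** 2)   # assumed total square error
    delta = (y - d) * y * (1.0 - y)      # error signal at the output layer
    new_params = list(params)
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Propagate the error backwards through the old weights.
            delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
        new_params[i] = (W - lr * grad_W / len(x), b - lr * grad_b / len(x))
    return new_params, error

# Synthetic stand-in for 5-dimensional quantized feature vectors of two textures.
rng = np.random.default_rng(2)
x = np.vstack([rng.normal(0.2, 0.05, (100, 5)), rng.normal(0.8, 0.05, (100, 5))])
labels = np.repeat([0, 1], 100)
d = np.eye(2)[labels]                    # one-hot desired outputs

params = init_mlp([5, 8, 8, 2])
errors = []
for _ in range(3000):
    params, err = train_step(params, x, d, lr=1.0)
    errors.append(err)
predicted = np.argmax(forward(params, x)[-1], axis=1)
accuracy = np.mean(predicted == labels)
```

At segmentation time the same `forward` pass would be applied to the quantized feature vector of every pixel, and the index of the largest output taken as the region label.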

4 Results

In the presented approach, textured images from the Brodatz album [2], the MIT vision and modeling database [10] and the MeasTex image texture database [9] are used to derive the input data set. The sample images consist of D77, D84, D55, D53 and D24 selected from the Brodatz album, Fabric.0000, Fabric.0017, Flowers.0002, Leaves.0006 and Leaves.0013 selected from [10], and Grass.0002, Misc.0002 and Rock.0005 selected from [9]. We used combinations of 2, 3, 4 and 5 textures as test images. In several experiments we used two sets of filters consisting of 20 and 30 filters, with orientation bandwidths of 45° and 30° respectively. The segmentation results obtained with the second set (30 filters with a bandwidth of 30°) are better than those of the first set, and are the ones presented. The increase in the feature dimension using 30 filters, in comparison with 20 filters in the first set, is compensated by feature quantization using the competitive layer. The dimension of the quantized feature vectors is from 5 to 10 and depends on the number of different textures in the textured image. To train the networks, less than 6 percent of the pixels are selected at random.
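The random selection of training pixels can be sketched as follows. The function name, the default fraction and the placeholder arrays are hypothetical. The sketch also assumes that a ground-truth label map is available for the sampled pixels, which supervised training of the MLP requires.

```python
import numpy as np

def sample_training_pixels(features, labels, fraction=0.05, seed=0):
    # features: (height, width, dim) feature images; labels: (height, width) class map.
    rng = np.random.default_rng(seed)
    h, w, dim = features.shape
    n = int(fraction * h * w)
    idx = rng.choice(h * w, size=n, replace=False)  # sample without replacement
    return features.reshape(-1, dim)[idx], labels.reshape(-1)[idx]

# Placeholder data standing in for 30 feature images of a 50x50 five-texture image.
rng = np.random.default_rng(3)
features = rng.random((50, 50, 30))
labels = rng.integers(0, 5, (50, 50))
x_train, y_train = sample_training_pixels(features, labels)
```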

Fig. 4. (a) Textured image, (b) segmentation results by the GF-CMLP method, (c) GF-CMLP method after applying a median filter

The image in Fig. 4a is a textured image consisting of Fabric.0000 from [10], and Misc.0002, Grass.0002 and Rock.0005 selected from [9]. Fig. 4b shows the segmented result obtained by our method, and Fig. 4c shows the segmented result after applying a median filter. The image in Fig. 5a is a textured image consisting of Fabric.0000, Fabric.0017, Flowers.0002, Leaves.0006 and Leaves.0013 selected from [10]. Fig. 5b shows the segmented result obtained by our GF-CMLP method, and Fig. 5c shows the segmented result after applying a median filter. Table 1 compares the classification errors of our proposed method with the segmentation results reported in [11] using Gabor filters, the Discrete Cosine Transform (DCT) and Laws filters in conjunction with LVQ for the textured image of Fig. 5. According to our segmented results and the previous research shown in Table 1, our proposed method has better classification performance than the other three methods, which use LVQ for classification. In all experiments about 6 percent of the pixels are randomly selected to train the networks.
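The median-filter post-processing and the classification error measure can be illustrated with the NumPy sketch below. The 3×3 window, the edge padding and the definition of the error as the percentage of mislabelled pixels are assumptions, since the text does not state the window size or how the error is computed.

```python
import numpy as np

def median_filter_labels(labels, size=3):
    # Median over a size x size neighbourhood, with edge pixels replicated at the border.
    r = size // 2
    padded = np.pad(labels, r, mode="edge")
    h, w = labels.shape
    windows = [padded[i:i + h, j:j + w] for i in range(size) for j in range(size)]
    return np.median(np.stack(windows), axis=0).astype(labels.dtype)

def classification_error(predicted, truth):
    # Percentage of pixels whose label differs from the ground truth.
    return 100.0 * np.mean(predicted != truth)

# Two-region ground truth with one isolated misclassified pixel.
truth = np.zeros((8, 8), dtype=int)
truth[:, 4:] = 1
segmented = truth.copy()
segmented[2, 1] = 1
error_before = classification_error(segmented, truth)
error_after = classification_error(median_filter_labels(segmented), truth)
```

In this example the isolated pixel is removed and the straight region boundary is preserved. For label maps with more than two classes, the numeric median is only one possible smoothing choice.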

Fig. 5. (a) Textured image, (b) segmentation results by the GF-CMLP method, (c) GF-CMLP method after applying a median filter

Table 1. Classification errors for Fig. 5 (columns: Method; Error (percent)). Entries: GF-CMLP, 21.62, DCT and LVQ, Laws and LVQ.

The reported filter parameters for the filters used in [11] for image segmentation are shown in Table 2. As depicted in Table 2, the numbers of filters used in [11] for Gabor, DCT and Laws are equal to the feature dimensions and are 20, 8 and 25 respectively. In GF-CMLP, despite having 30 filters in the filter bank, the feature dimension is reduced to 10. The smaller feature dimension results in faster segmentation.

Table 2. Filter parameters.

Method     No. of filters   No. of features
GF-CMLP    30               10
Gabor      20               20
DCT        8                8
Laws       25               25

References

[4] D. Dunn and W. E. Higgins, "Optimal Gabor filters for texture segmentation", IEEE Transactions on Image Processing, Vol. 4, No. 7, July 1995.

[5] J. A. Freeman and D. M. Skapura, Neural networks: algorithms, applications, and programming techniques, Addison-Wesley, Massachusetts, 1992.

[6] A. K. Jain and F. Farrokhnia, "Unsupervised texture segmentation using Gabor filters", Pattern Recognition, Vol. 24, No. 12, pp. 1167-1186, 1991.

[7] A. K. Jain and K. Karu, "Learning texture discrimination masks", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 2, pp. 195-205, February 1996.

[8] T. Kohonen, T. S. Huang and M. R. Schroder, Self-organizing maps, Springer-Verlag, Berlin, 1997.

[9] MeasTex image texture database, http://www.cssip.elec.uq.edu.au/~guy/meastex/meastex.html, 1998.

[10] MIT vision and modeling group, http://www.media.mit.edu/vismod/, 1998.

[11] T. Randen and H. Husoy, "Filtering for texture classification: A comparative study", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 4, 1999.

[12] M. Unser and M. Eden, "Nonlinear operators for improving texture segmentation based on features extracted by spatial filtering", IEEE Transactions on Systems, Man, and Cybernetics, Vol. 20, No. 4, 1990.

[13] T. P. Weldon, W. E. Higgins and D. F. Dunn, "Gabor filter design for multiple texture segmentation", SPIE Journal of Opt. Eng., Vol. 35, No. 10, 1996.

[14] T. P. Weldon and W. E. Higgins, "Design of multiple Gabor filters for texture segmentation", Proc. Int'l Conf. Acoustic Speech, Signal Proc., Atlanta, Ga., pp. 2243-2246, May 1996.
