A New Wavelet-Based Texture Descriptor for Image Retrieval

E. de Ves1, A. Ruedin2, D. Acevedo2, X. Benavent3, and L. Seijas2

1 Computer Science Department, University of Valencia
2 Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires
3 Robotics Institute, University of Valencia
{Esther.Deves,Xaro.Benavent}@uv.es

Abstract. This paper presents a novel texture descriptor based on the wavelet transform. First, we consider vertical and horizontal coefficients at the same position as the components of a bivariate random vector. The magnitude and angle of these vectors are computed and their histograms are analyzed. The empirical magnitude histogram is modelled using a gamma distribution. As a result, the feature extraction step consists of estimating the gamma parameters using the maximum likelihood estimator and computing the circular histograms of angles. The similarity measurement step uses the well-known Kullback-Leibler divergence. Finally, retrieval experiments on the Brodatz texture collection show good performance of this new texture descriptor. We compare two wavelet transforms, with and without downsampling, and show the advantage of the second one, which is translation invariant, for the construction of our texture descriptor.

Keywords: Texture descriptor, Wavelet Transform, Image retrieval.

1 Introduction

The increasing amount of information available in today’s world raises the need to retrieve relevant data efficiently. Unlike text-based retrieval, where keywords are successfully used to index documents, content-based image retrieval poses up-front the fundamental questions of how to extract useful image features and how to use them for intuitive retrieval [11] [5]. Interest in content-based image retrieval (CBIR) systems has been growing in the last few years. This interest has been motivated by the increasing number of image databases which need effective and efficient techniques for retrieving multimedia information. Attempts have been made to develop general purpose image retrieval systems based on multiple features (e.g. color, shape and texture), which describe the image content [10]. The new visual information retrieval systems extract visual image features usually related to color, texture and shape from each image stored in the database, and use this representation to compare images by means of a similarity measure.

W.G. Kropatsch, M. Kampel, and A. Hanbury (Eds.): CAIP 2007, LNCS 4673, pp. 895–902, 2007. © Springer-Verlag Berlin Heidelberg 2007

The features extracted from an image can be classified into low-level and high-level features; the latter are normally obtained by combining low-level features with a reasonable predefined model. The low-level features are obtained by preprocessing each image in the database. Among these characteristics we can mention those with chromatic information, those related to textures present in the image and those related to the shape of objects in the image.

Generally speaking, a CBIR system is composed of two main modules: the feature extraction module and the similarity measurement module. In the latter, a distance between the query image and each image in the database is computed by making use of the extracted features. Each image obtains a relevance score related to the query image, which is used to rank the database.

Texture analysis plays an important role in many image processing tasks. There are a great number of references for texture analysis and particularly texture classification. A central point in texture analysis is the definition of good features to characterize textures, which can be useful for content-based retrieval. Gabor features for texture classification followed by a linear discriminant analysis were used in [2]. Ayala proposes in [1] new descriptors for binary and gray-scale textures based on spatial size distributions (SSD). In [7] a functional descriptor of the multivariate random closed set defined from the texture is proposed as a feature to describe grayscale textures. Randen presents a comparative study of different filtering approaches where the features used for texture classification are obtained from signal processing techniques like Laws and Gabor filters, wavelet transforms or the discrete cosine transform [12].

Wavelets have been successfully applied to different image processing applications.
When transformed, an image is represented as the sum of its details at different scales and orientations, plus a coarse approximation of the image. This naturally led to considering wavelets as a possible tool for texture classification. Some traditional approaches use the energy of the wavelet coefficients in each subband as texture descriptors, under the assumption that the energy of the distribution in the frequency domain identifies textures. A natural extension of this method consists in characterizing every texture by means of marginal densities of the wavelet coefficients in every subband. This is justified by psychological studies that suggest that two homogeneous textures are difficult to distinguish if they produce similar marginal distributions as response to a bank of filters ([8]). A number of authors have observed that wavelet subband coefficients have highly non-Gaussian statistics [3], [14]. Numerous tests show that the normalized histograms of wavelet coefficients in each detail subband can be well approximated with a Generalized Gaussian Distribution (GGD). This model has been used with the Kullback-Leibler divergence as a similarity measure among distributions [6].

In this work, by applying a wavelet transform and considering at each scale the pair (horizontal detail, vertical detail) associated with each position, we extract information on the importance (contrast) and orientation of the edges in the image. We present a novel texture descriptor, used for content-based retrieval, based on the wavelet transform. The basic idea consists in modeling the distributions of moduli of wavelet detail coefficients in each decomposition level, and computing their empirical angle histogram at each level (assuming vertical and horizontal coefficients at the same position are the components of a bivariate random vector). Our tests indicate that the gamma probability density function (pdf), described by two parameters, is a reasonable model for the histogram of the coefficient moduli. Thus, we propose to characterize each texture in the database by information extracted both from the moduli and the orientation of the coefficients.

Section 2 introduces an illustrative example of our work, and explains our approach in detail. In section 3 we present experimental results which evaluate the performance of our texture descriptors. Finally, section 4 presents concluding remarks.

2 Modelling Coefficients Distribution of Wavelet Transform

2.1 Analyzing the Joint Histograms of Wavelet Coefficients

For our tests we have chosen the orthogonal Daubechies 4 wavelet. In the traditional wavelet transform ([4] [9]), coefficients are calculated via convolutions with 2 filters (lowpass and highpass). This is followed by downsampling operations, which prevent the transform from being invariant to translations. For comparison, in our tests we have included another wavelet transform that is translation invariant, proposed by Mallat [9]. It is calculated with the so-called à trous algorithm, via convolutions with 2 filters, but has no downsampling operations, so that it gives a redundant representation of an image. To capture lower frequencies at each step of the transform, the filters are upsampled.

The Daubechies 4 filters are lowpass $[h_3\ h_2\ h_1\ h_0] = \frac{1}{4\sqrt{2}}[1-e,\ 3-e,\ 3+e,\ 1+e]$, with $e = \sqrt{3}$, and highpass $[-h_0\ \ h_1\ \ -h_2\ \ h_3]$. The filters for Mallat’s translation invariant transform correspond to a biorthogonal wavelet: lowpass $\sqrt{2}\,[a\ b\ b\ a]$ and highpass $[c\ \ -c]$, with $a = 0.125$, $b = 0.375$ and $c = \sqrt{2}/2$.

Before proposing a new model for the joint histogram of wavelet coefficients at each detail level, we study this distribution for some simple images. The image in figure 1(a), chosen for the analysis, is a natural image from the Brodatz collection. It represents a texture with a privileged orientation. Both wavelet transforms have been applied to this image up to three levels of decomposition. In each level, four subbands are obtained: the approximation subband, and the horizontal, vertical and diagonal details. In our study, we only use the vertical and horizontal detail subbands. It is worth mentioning that for the traditional wavelet transform, the subbands for the first level are a fourth (in size) of the image; when there is no downsampling step, they have the same size as the original image.
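As an illustration of the filters described above, the following minimal Python sketch builds the Daubechies 4 and Mallat filter taps and upsamples a filter for the à trous scheme. It assumes the standard Daubechies 4 coefficients $h_0,\dots,h_3$ and the usual à trous convention of inserting $2^{\text{level}-1}-1$ zeros between taps; the function names are hypothetical and are not taken from the paper.

```python
import math

def daubechies4_filters():
    """Return (lowpass, highpass) taps laid out as [h3 h2 h1 h0] and [-h0 h1 -h2 h3]."""
    e = math.sqrt(3.0)
    norm = 4.0 * math.sqrt(2.0)
    h0, h1, h2, h3 = (1 + e) / norm, (3 + e) / norm, (3 - e) / norm, (1 - e) / norm
    return [h3, h2, h1, h0], [-h0, h1, -h2, h3]

def mallat_filters():
    """Return (lowpass, highpass) taps sqrt(2)*[a b b a] and [c -c]."""
    a, b, c = 0.125, 0.375, math.sqrt(2.0) / 2.0
    r2 = math.sqrt(2.0)
    return [r2 * a, r2 * b, r2 * b, r2 * a], [c, -c]

def upsample_filter(taps, level):
    """Insert 2**(level-1) - 1 zeros between taps (assumed a trous convention).

    Level 1 leaves the filter unchanged.
    """
    gap = 2 ** (level - 1) - 1
    out = []
    for i, t in enumerate(taps):
        if i > 0:
            out.extend([0.0] * gap)
        out.append(t)
    return out

low, high = daubechies4_filters()
# The lowpass taps sum to sqrt(2) and have unit energy (up to rounding).
print(sum(low), sum(t * t for t in low))
# Lowpass and highpass taps are orthogonal (inner product close to 0).
print(sum(l * g for l, g in zip(low, high)))
# Filter used at the third level of the undecimated transform.
print(upsample_filter(mallat_filters()[0], 3))
```

Convolving the rows and columns of the image with these upsampled filters, without discarding any samples, would give detail subbands of the same size as the original image at every level.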
In figure 1(a) we have an original texture, to which we have applied 3 levels of the biorthogonal wavelet transform without downsampling. In figure 1(b) the joint empirical distributions of the detail coefficients are plotted for 3 levels, assuming that horizontal and vertical detail coefficients at the same position are the components of a bivariate random vector. From these joint distributions it can be inferred that a correlation exists between the horizontal and vertical wavelet subbands. This correlation is noticeable in the second and third levels of the transform, whereas the finer detail coefficients do not give much information: they are dominated by noise. The magnitude and angle of the random vector samples can be computed and analyzed. The empirical magnitude histogram (figure 1(d)) and the circular histogram (figure 1(f)) are shown for this wavelet. The circular histogram clearly indicates a privileged orientation of the edges in the texture. We have similar results for images with clearly oriented edges. The same study has been done applying the traditional Daubechies 4 wavelet transform to the same image. The joint empirical distributions of detail coefficients are shown in figure 1(c). It is evident that in this case there is no noticeable correlation between horizontal and vertical subbands. The histograms behave differently.

[Figure 1: panels (a) to (f); the circular histograms in (f) are labelled Level 1, Level 2 and Level 3]

Fig. 1. Empirical distributions for image (a). First row (b): joint distribution of Mallat’s à trous wavelet coefficients, second row (c): joint distribution of Daubechies 4 detail coefficients, third row (d): magnitude histogram of Mallat’s à trous coefficients, fourth row (e): magnitude histogram of Daubechies 4 transform coefficients, (f): circular histogram of Mallat’s à trous coefficients.

2.2 The Proposed Model for Wavelet-Coefficients Distributions

The previous section shows the importance of treating horizontal and vertical details jointly. However, it is a very difficult task to find a model to which the empirical joint histogram will fit reasonably well for any kind of texture. Some papers follow this approach, such as [13], where the distribution of the subband coefficients is modeled using a joint alpha-stable sub-Gaussian distribution. Our approach is different. We do want to make use of the existing relation between the vertical and horizontal coefficients at a certain position, but we also want a model which is simple to fit and capable of characterizing different types of textures. We want a model for the moduli histograms. Observe that for both wavelet transforms, the shape of the empirical distribution changes for each level, in such a way that the peak of the histogram is shifted to the right. This behavior may be modelled by means of the gamma probability density function (pdf). This distribution is defined by two parameters, $k$ and $\theta$:

$$f(x; k, \theta) = x^{k-1}\,\frac{e^{-x/\theta}}{\theta^{k}\,\Gamma(k)} \tag{1}$$

where k > 0 is a shape parameter, and θ > 0 is related to the scale of the distribution (if θ is large, then the distribution will be more spread out). The gamma distribution is related to many other distributions. The chi-square and exponential distributions, which are special cases of the gamma distribution, are one-parameter distributions that fix one of the two gamma parameters. The idea is to fit our empirical moduli histograms to a gamma distribution using the maximum-likelihood estimator. A goodness-of-fit Kolmogorov-Smirnov test has been applied to the different samples in order to justify the use of this model, giving p-values above the significance level α = 0.05 for most of the samples analyzed, for both wavelet transforms considered. The gamma distribution therefore appears to be a reasonable model for the moduli of wavelet coefficients.

As seen in the previous section, the circular histogram gives valuable information about privileged orientations in textures. Thus, these histograms can also be used to describe textures. An image with a privileged orientation will present a bimodal circular histogram whose modes are separated by 180 degrees, whereas a random texture, without privileged orientations, will present a uniform circular histogram. Thus, we propose to characterize each texture in the database by:

– Information from the moduli: parameters $k_n$, $\theta_n$ of the gamma pdf for n = 1 . . . 3, where n is the wavelet decomposition level.
– Information from the angles given by the empirical circular histogram. Angles are quantized by dividing the interval [0, 2π] into 40 bins. $P_n(r)$ is the observed frequency of the angles in the rth bin for level n.

The retrieval experiments in section 3 have been performed using the gamma distribution parameters and the circular histograms.
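The feature extraction described above can be sketched in a few lines of Python for one decomposition level. This is a minimal sketch under stated assumptions, not the authors' implementation: the angle is taken as atan2(vertical, horizontal), which the paper does not specify, and the shape parameter uses a well-known closed-form approximation to the maximum-likelihood estimate rather than an exact solver. All function names are hypothetical.

```python
import math
import random

def modulus_and_angle(horizontal, vertical):
    """Moduli and angles in [0, 2*pi) of the (horizontal, vertical) detail pairs."""
    moduli = [math.hypot(h, v) for h, v in zip(horizontal, vertical)]
    # Assumed convention: angle measured from the horizontal-detail axis.
    angles = [math.atan2(v, h) % (2.0 * math.pi) for h, v in zip(horizontal, vertical)]
    return moduli, angles

def fit_gamma(moduli):
    """Approximate maximum-likelihood estimates (k, theta) of a gamma pdf.

    The exact MLE solves log(k) - digamma(k) = s; here k is obtained from a
    standard closed-form approximation, and theta = mean / k.
    Zero moduli are skipped because their logarithm is undefined.
    """
    xs = [x for x in moduli if x > 0.0]
    mean = sum(xs) / len(xs)
    s = math.log(mean) - sum(math.log(x) for x in xs) / len(xs)
    k = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    return k, mean / k

def circular_histogram(angles, bins=40):
    """Relative frequency of the angles in each of `bins` equal sectors."""
    counts = [0] * bins
    width = 2.0 * math.pi / bins
    for a in angles:
        r = min(int(a / width), bins - 1)  # guard against rounding up to 2*pi
        counts[r] += 1
    return [c / len(angles) for c in counts]

def level_features(horizontal, vertical, bins=40):
    """Feature triple (k_n, theta_n, P_n) for one decomposition level."""
    moduli, angles = modulus_and_angle(horizontal, vertical)
    k, theta = fit_gamma(moduli)
    return k, theta, circular_histogram(angles, bins)

# Synthetic stand-in for the two detail subbands of one level.
random.seed(0)
h = [random.gauss(0.0, 1.0) for _ in range(5000)]
v = [random.gauss(0.0, 3.0) for _ in range(5000)]
k, theta, hist = level_features(h, v)
print(k, theta, len(hist))
```

Applying `level_features` to the flattened horizontal and vertical subbands of levels 1 to 3 would give the 6 gamma parameters and 3 histograms of 40 bins that make up the descriptor.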

3 Experimental Results

The objective of this section is to test the new wavelet-based texture descriptor. Images from the well-known Brodatz image database have been used for the experiments; this image database is composed of 105 images representing different kinds of textures. Thirty-six images from this collection were selected and each original image of size 512 × 512 was partitioned into sixteen 128 × 128 subimages by randomly choosing the left corner, thus creating a test database of N = 36 × 16 = 576 images. The texture descriptor was computed for each image in the test database. In our experiments, every image in the test database was used in turn as a simulated query. The relevant images for each query were defined as the other 15 subimages from the same original Brodatz image. We evaluated the performance in terms of the average rate of retrieval of relevant images.

We have done the same experiment three times: with the Daubechies 4 filters and the traditional wavelet transform, with the Daubechies 4 filters without downsampling, and with Mallat’s biorthogonal wavelet without downsampling. The Kullback-Leibler divergence (KLD), commonly used to compare 2 distributions, was the chosen similarity measure between the query image $I_q$ and each image in the database $\{I_j,\ j = 1 \ldots N\}$: we retrieved the top 16 closest images to the query, and counted how many corresponded to the same texture as the query image. The score used to measure the similarity between image $I_j$ and $I_q$ was computed as

$$S(j, q) = S_1(j, q) + S_2(j, q), \quad \text{where} \quad S_\ell(j, q) = \sum_{n=1}^{3} S_\ell(j, q, n), \quad \ell = 1, 2, \tag{2}$$

$$S_1(j, q, n) = \int f(x; k_n^j, \theta_n^j)\, \log \frac{f(x; k_n^j, \theta_n^j)}{f(x; k_n^q, \theta_n^q)}\, dx, \qquad S_2(j, q, n) = \sum_r P_n^j(r)\, \log \frac{P_n^j(r)}{P_n^q(r)}. \tag{3}$$

The term $S_1(j, q, n)$ is the KLD (or relative entropy) of the moduli pdf at level n for images j and q. The term $S_2(j, q, n)$ is the KLD of the empirical distribution of the quantized angles at level n for images j and q. The final score was the sum of both scores (modulus and angle KLD scores) at the first 3 levels.
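The score of equations (2) and (3) can be sketched as follows. The paper does not say how the integral in $S_1$ was evaluated; this sketch assumes the known closed form of the KLD between two gamma densities, $(k_p - k_q)\psi(k_p) - \ln\Gamma(k_p) + \ln\Gamma(k_q) + k_q \ln(\theta_q/\theta_p) + k_p(\theta_p - \theta_q)/\theta_q$, where $\psi$ is the digamma function. The small constant `eps` that protects empty histogram bins, the layout of the feature tuples, and all function names are assumptions made for illustration.

```python
import math

def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def kld_gamma(kp, tp, kq, tq):
    """Closed-form KLD between gamma pdfs f(x; kp, tp) and f(x; kq, tq)."""
    return ((kp - kq) * digamma(kp) - math.lgamma(kp) + math.lgamma(kq)
            + kq * (math.log(tq) - math.log(tp)) + kp * (tp - tq) / tq)

def kld_discrete(p, q, eps=1e-10):
    """KLD between two histograms; eps (an assumption) avoids division by zero."""
    total = 0.0
    for pr, qr in zip(p, q):
        if pr > 0.0:
            total += pr * math.log(pr / max(qr, eps))
    return total

def score(features_j, features_q):
    """S(j, q) = S1 + S2, summed over levels.

    Each argument is a list with one (k, theta, histogram) tuple per level.
    """
    s1 = sum(kld_gamma(kj, tj, kq, tq)
             for (kj, tj, _), (kq, tq, _) in zip(features_j, features_q))
    s2 = sum(kld_discrete(pj, pq)
             for (_, _, pj), (_, _, pq) in zip(features_j, features_q))
    return s1 + s2

# Two toy descriptors with three levels and four angle bins each.
fj = [(2.0, 3.0, [0.4, 0.1, 0.4, 0.1])] * 3
fq = [(2.5, 2.0, [0.25, 0.25, 0.25, 0.25])] * 3
print(score(fj, fq), score(fj, fj))
```

A lower score means a closer match, and the score of a descriptor against itself is zero. Note that the KLD is not symmetric, so `score(fj, fq)` and `score(fq, fj)` generally differ.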

Table 1. Average retrieval rate in the top 16

Wavelet Transform                                Extracted Features
Wavelet filters          Algorithm               S1       S1 + S2
Daubechies 4             with downsampling       0.75     0.84
Daubechies 4             no downsampling         0.75     0.89
Mallat’s biorthogonal    no downsampling         0.79     0.89

Table 1 shows the fraction of relevant images retrieved in the top 16 matches. The maximum and minimum rates were 89% and 75% respectively. The main points that we can observe from these results are that both the downsampling operation of the wavelet transform and the features used have a significant effect on retrieval performance. Omitting the downsampling steps in the wavelet transform raises the rate from 84% to 89% for the Daubechies 4 filters when both feature sets are used. This is because the resulting transform is translation invariant. Our results in the first column of the table, which do not take into account the angle information and are based only on the moduli information, are similar to the ones obtained by modelling the marginal histograms (vertical and horizontal independently) with the GGD. Moreover, our proposed descriptor needs only half the number of parameters. The inclusion of the orientation information in the features improves the average retrieval rate by roughly 10 percentage points (between 9 and 14 in Table 1), whichever wavelet is considered and whether or not the downsampling operation is present. This means that the orientation information is a valuable feature to discriminate textures.
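The evaluation measure reported in Table 1 can be sketched as follows. This minimal sketch assumes that the query itself is kept in the ranked list (so a perfect descriptor retrieves all 16 subimages of a texture in the top 16), which the paper does not state explicitly; `scores` and `labels` are hypothetical inputs.

```python
def average_retrieval_rate(scores, labels, top=16):
    """Average fraction of the `top` closest images that share the query's texture.

    scores[q][j] is the dissimilarity of database image j to query q
    (lower means closer); labels[j] identifies the original texture of image j.
    """
    n = len(labels)
    total = 0.0
    for q in range(n):
        ranked = sorted(range(n), key=lambda j: scores[q][j])[:top]
        hits = sum(1 for j in ranked if labels[j] == labels[q])
        total += hits / top
    return total / n

# Toy database: two textures with two subimages each, so top=2 plays the role of 16.
labels = [0, 0, 1, 1]
scores = [[0, 5, 1, 6],   # query 0 wrongly ranks image 2 above image 1
          [1, 0, 5, 6],
          [5, 6, 0, 1],
          [6, 5, 1, 0]]
print(average_retrieval_rate(scores, labels, top=2))
```

In the setting of the paper, `labels` would hold 36 distinct values with 16 images each, and `scores[q][j]` would be the score S(j, q).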

4 Conclusions

We have presented a novel wavelet-based texture descriptor for visual information retrieval. The basic idea to characterize each texture in the image database is to use the information from the moduli and orientations of wavelet coefficients, assuming that vertical and horizontal coefficients at the same position are the components of a bivariate random vector. The Kullback-Leibler divergence has been chosen as the measure of similarity. The good performance of this new descriptor is shown by retrieval experiments using the Brodatz database.

Acknowledgement. This work has been partially supported by grants GV04177 (for research stays), MCYT TIN2006-10134, UBACYT X166 and BID 1728/ OC-AR-PICT 26001.

References

1. Ayala, G., Domingo, J.: Spatial size distributions: Applications to shape and texture analysis. IEEE Trans Image Processing 23, 1430–1442 (2001)
2. Azencott, R., Wang, J.P., Younes, L.: Texture classification using windowed Fourier filters. IEEE Trans Pattern Analysis and Machine Intelligence 19(3), 148–153 (1997)
3. Buccigrossi, R., Simoncelli, E.: Image compression via joint statistical characterization in the wavelet domain. IEEE Trans Signal Proc 8, 1688–1701 (1999)
4. Daubechies, I.: Ten lectures on wavelets. Society for Industrial and Applied Mathematics (1992)
5. Del Bimbo, A.: Visual Information Retrieval. Morgan Kaufmann, San Francisco (1999)
6. Do, M.N., Vetterli, M.: Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance. IEEE Trans Image Processing 11, 2 (2002)
7. Epifanio, I., Ayala, G.: A random set view of texture classification. IEEE Transactions on Image Processing 11, 859–867 (2002)
8. Heeger, D., Bergen, J.R.: Pyramid-based texture analysis/synthesis. In: Proc. ACM SIGGRAPH (1995)
9. Mallat, S.: A Wavelet Tour of Signal Processing. Academic Press, London (1999)
10. Pentland, A., Picard, R.W., Sclaroff, S.: Photobook: Content-based manipulation of image databases. International Journal of Computer Vision 18(3), 233–254 (1996)
11. Smeulders, A.W.M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(12), 1349–1379 (2000)
12. Husoy, J., Randen, T.: Filtering for texture classification: A comparative study. IEEE Trans Pattern Analysis and Machine Intelligence 20, 115–122 (1999)
13. Tzagkarakis, G., Beferull-Lozano, B., Tsakalides, P.: Rotation-invariant texture retrieval with gaussianized steerable pyramids. IEEE Trans Image Processing 15 (2006)
14. Wouwer, G.V., Scheunders, P., Dyck, D.V.: Statistical texture characterization from discrete wavelet representation. IEEE Trans Image Processing 8, 592–598 (1999)