Using Covariance Matrices for Unsupervised Texture Segmentation

Michael Donoser and Horst Bischof
Institute for Computer Graphics and Vision, Graz University of Technology, Austria
{donoser,bischof}@icg.tugraz.at

Abstract

In this paper we propose an efficient unsupervised texture segmentation method. We extend a state-of-the-art segmentation algorithm, which is exclusively based on color cues, by incorporating texture information. We further show how to use covariance matrices of low-level features for texture description, which can be calculated efficiently using integral images. Furthermore, a multi-scale extension provides accurate texture segmentation results. An experimental evaluation on a synthetic texture database and on images of the Berkeley image database demonstrates the improved performance of the algorithm.

1. Introduction

The problem of segmentation is to partition an image into a set of non-overlapping regions covering the entire image. Traditionally, segmentation is formulated as a bottom-up process, where no high-level knowledge about the image scene is incorporated into the algorithm. Bottom-up approaches identify regions in the input image based only on low-level cues, like color or texture, and do not rely on any a priori knowledge such as an object database.

Automatic segmentation is one of the most popular topics in computer vision and many research groups are working on this issue. A general overview of different color segmentation algorithms can be found, e.g., in [2, 8]. One of the main directions of current research in this field is to define segmentation as a minimum cut or maximum flow problem through a graph, as done by [20] in their normalized cut framework. The normalized cut algorithm has been used extensively as a framework for different segmentation algorithms, e.g. by [1, 24]. Another direction of research is the segmentation of images by evolving boundary contours, as in the popular level set framework [14]. Such an approach allows accurate results and easy integration of shape constraints, as done by [19, 18, 4]. Finally, there is the traditional field of local, appearance-based methods like the mean shift [3], which is still one of the state-of-the-art algorithms for color segmentation.

Unsupervised segmentation results often deviate from human segmentations. Therefore, there has recently been much interest in top-down algorithms, e.g. by [23], or in the simultaneous combination of top-down and bottom-up approaches, e.g. by [9]. The integration of top-down information into the segmentation problem leads to improved results, but there are still many applications where a priori information is hard to obtain or is not available at all. These applications have to rely on efficient bottom-up segmentation that is as accurate as possible.

While many bottom-up segmentation methods are exclusively based on analyzing color cues, several papers, e.g. [7], have shown that including texture information can substantially improve segmentation results. In general, methods for integrating texture information into segmentation fall into two main groups: statistical modeling and filtering methods. Statistical methods focus on analyzing the stationary statistics of textures [22], while filtering methods mostly apply filter banks, where Gabor filtering is most prominent [15].

In this paper we focus on unsupervised bottom-up texture segmentation. We introduce an extension of a state-of-the-art unsupervised color segmentation method proposed by Donoser and Bischof [5]. This method, named ROI-SEG, achieved state-of-the-art segmentation results on the Berkeley image segmentation database exclusively based on color cues. We describe a method to incorporate texture information by means of efficiently calculated covariance matrices and show that segmentation results can be improved, especially for highly textured areas.

The outline of the paper is as follows. Section 2 describes the extension of ROI-SEG for including texture information in detail. Section 3 demonstrates the improved performance of the algorithm on images of a texture segmentation database and on images of the Berkeley segmentation database. Finally, Section 4 draws some conclusions.

2. Unsupervised texture segmentation

The main idea of this paper is to extend a state-of-the-art unsupervised segmentation algorithm, which is exclusively based on color cues, with texture information. For that purpose we propose to use covariance matrices of low-level feature vectors as texture descriptors, which can be calculated efficiently using integral images. Section 2.1 summarizes the unsupervised color segmentation method. Section 2.2 describes the covariance matrix calculation for texture description and shows how the descriptor can be integrated into the segmentation framework. Finally, Section 2.3 outlines a multi-scale extension of the method, which yields accurate texture segmentation results in short computation time.

2.1. ROI-SEG

The unsupervised color segmentation algorithm ROI-SEG was proposed by Donoser and Bischof [5]. This method is based on the underlying idea of combining a set of differently focused sub-segmentations into the final result. ROI-SEG can roughly be divided into three subsequent steps. First, a set of regions-of-interest (ROIs) is automatically detected in the input image. Second, each of these ROIs is passed to a semi-automatic color segmentation method. Third, all calculated sub-segmentations are combined into the final result.

The method as described in [5] achieved state-of-the-art segmentation results on the Berkeley image segmentation database and is exclusively based on analyzing color cues. We propose an extension that includes texture information in the framework and thereby improves segmentation results. The incorporation of texture information only modifies the second part of ROI-SEG, the semi-automatic segmentation algorithm, while the first and the third part stay exactly the same. Therefore, we next briefly summarize the second part; for a detailed explanation of the automatic detection of the ROIs and the combination of the individual sub-segmentations please refer to [5].

The semi-automatic part of ROI-SEG uses each of the automatically detected ROIs independently to provide a set of sub-segmentations. The first step is to calculate a likelihood value p(C|xi) for each pixel xi of the input image, which measures the color similarity between a local neighborhood of size x × y and the input ROI C. This likelihood calculation is exclusively based on color cues by analyzing Bhattacharyya distances between Gaussian Mixture Model (GMM) fits to the RGB values of the ROI and the local neighborhood. As a second step, the color likelihood map is passed as input to a Maximally Stable Extremal Region (MSER) detector [12], which provides a set of connected regions as the sub-segmentation result for every input ROI.

2.2. Texture descriptor

We use covariance matrices of low-level features as a strong texture descriptor. The use of covariance matrices as descriptors was proposed by Porikli et al. [17], mainly for tracking purposes. In [21] applications for texture classification were also shown, but without addressing segmentation. The main advantage of using the covariance matrix is that it enables efficient fusion of different types of low-level features and modalities, and that its dimensionality is small. Furthermore, covariance matrices can be calculated efficiently based on integral images. We use a seven-dimensional feature vector f for constructing the covariance matrices, which is defined by

$$ f = [\, R \;\; G \;\; B \;\; I_x \;\; I_y \;\; I_{xx} \;\; I_{yy} \,], \tag{1} $$

where R, G and B are the RGB color values of the pixel and $I_x$, $I_y$, $I_{xx}$ and $I_{yy}$ are the corresponding first- and second-order derivatives.

The space of covariance matrices is not a vector space, and therefore a standard arithmetic difference does not measure the difference between the matrices. But covariance matrices are symmetric and positive semidefinite and can therefore be formulated as a connected Riemannian manifold, which is a topological space that is locally Euclidean [16]. In this manifold space various distance metrics exist, and we use the one from Foerstner and Moonen [6], which is defined as

$$ \Delta(\tilde{\Sigma}_1, \tilde{\Sigma}_2) = \sqrt{\sum_{d=1}^{D} \ln^2 \lambda_d(\tilde{\Sigma}_1, \tilde{\Sigma}_2)}, \tag{2} $$

where $\tilde{\Sigma}_1$ and $\tilde{\Sigma}_2$ are the two input covariance matrices, $\lambda_d$ are the generalized eigenvalues of $\tilde{\Sigma}_1$ and $\tilde{\Sigma}_2$, and D is the dimensionality of the feature vector, i.e. seven in our case. This distance measure ∆ fulfills positivity, symmetry and the triangle inequality, and therefore all metric axioms.

We now propose to replace the calculation of the Bhattacharyya distances described in Section 2.1 with the corresponding covariance matrix distances. For that purpose, we set p(C|xi) to $\Delta(\tilde{\Sigma}_C, \tilde{\Sigma}_N)$, where $\tilde{\Sigma}_C$ is the feature covariance matrix of the ROI C and $\tilde{\Sigma}_N$ is the covariance matrix of the local neighborhood of the analyzed pixel xi. The rest of the algorithm stays the same. Therefore, the covariance matrices of local windows of size x × y are repeatedly compared to the reference covariance matrix of the input ROI. Please note that integral images can be used to calculate the covariance matrices for the local subwindows efficiently, as shown in [17, 5].

Figure 1 compares the segmentation results and the calculated likelihoods of ROI-SEG and the proposed method for a selected texture example. As can be seen, the integration of texture information significantly improves the results.
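The descriptor and distance of this section can be illustrated with a minimal NumPy sketch. It is not the authors' Matlab-Mex implementation: the function names, the (H, W, D) feature layout and the small diagonal regularisation `eps` are assumptions made for illustration, and the computation of the derivative features is left out. The sketch builds integral images of the features and of their pairwise products, derives the covariance of any rectangular window from four table lookups, and evaluates the distance of Equation (2) via generalized eigenvalues.

```python
import numpy as np


def integral_tables(features):
    """Build integral images of the features and of their pairwise products.

    features: array of shape (H, W, D), one D-dimensional vector per pixel.
    Returns (P, Q) with shapes (H+1, W+1, D) and (H+1, W+1, D, D); the extra
    leading row and column of zeros simplify the window sums.
    """
    h, w, d = features.shape
    f = features.astype(np.float64)
    P = np.zeros((h + 1, w + 1, d))
    Q = np.zeros((h + 1, w + 1, d, d))
    P[1:, 1:] = f.cumsum(axis=0).cumsum(axis=1)
    products = f[:, :, :, None] * f[:, :, None, :]
    Q[1:, 1:] = products.cumsum(axis=0).cumsum(axis=1)
    return P, Q


def window_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 from 4 lookups."""
    return (table[bottom, right] - table[top, right]
            - table[bottom, left] + table[top, left])


def region_covariance(P, Q, top, left, bottom, right):
    """Sample covariance of the feature vectors inside a rectangular window."""
    n = (bottom - top) * (right - left)
    s = window_sum(P, top, left, bottom, right)   # sum of f
    ss = window_sum(Q, top, left, bottom, right)  # sum of f f^T
    return (ss - np.outer(s, s) / n) / (n - 1)


def covariance_distance(c1, c2, eps=1e-6):
    """Distance of Equation (2): sqrt(sum_d ln^2 lambda_d(c1, c2))."""
    d = c1.shape[0]
    # Assumption: a small diagonal term keeps both matrices positive definite.
    a = c1 + eps * np.eye(d)
    b = c2 + eps * np.eye(d)
    # With b = L L^T, the generalized eigenvalues of (a, b) are the
    # eigenvalues of the symmetric matrix L^-1 a L^-T.
    L = np.linalg.cholesky(b)
    L_inv = np.linalg.inv(L)
    m = L_inv @ a @ L_inv.T
    lam = np.linalg.eigvalsh((m + m.T) / 2)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

In this sketch, a likelihood map would be obtained by calling `region_covariance` once for the ROI and once per local window, then passing each pair to `covariance_distance`. The tables are built once per image, so the cost per window does not depend on the window size.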

Figure 1: Comparison between likelihood maps and color segmentation results of ROI-SEG and the proposed method, which incorporates texture information. Both methods are initialized by a manually drawn bounding box in the upper left texture area. Panels: (a) ROI-SEG likelihood, (b) proposed texture likelihood, (c) ROI-SEG result, (d) result of the proposed method.

2.3. Multi-scale analysis

One of the most important parameters of the unsupervised segmentation method is the size x × y of the local windows, which defines the local neighborhood. The larger the window, the better the textures can be described, but the more boundary accuracy is lost. We propose to use a multi-scale approach [10] for calculating the final texture segmentation result. We start at a large scale and calculate initial segmentation results. After combining the sub-segmentations, some areas might still be unassigned, especially close to texture transitions. In these areas we repeat the proposed segmentation method with a smaller window size.

3. Experiments

We implemented the proposed unsupervised texture segmentation method in Matlab-Mex, which performs a sub-segmentation within 200 ms and the entire segmentation in a few seconds, depending on the number of ROIs.

We first demonstrate the performance of the proposed texture segmentation method on images of the Berkeley database [11]. Figure 2 shows two selected examples where the integration of texture information improves the segmentation performance compared to the ROI-SEG algorithm.

We further evaluated the proposed method on images of the Prague texture segmentation benchmark [13], which allows automatic generation of texture mosaics built from 114 color textures of 10 thematic classes. Figure 3 shows some images highlighting the unsupervised texture segmentation results.

Figure 2: Comparison between ROI-SEG exclusively based on color cues and the proposed texture segmenter. Panels: (a), (b) ROI-SEG; (c), (d) Texture.

Figure 3: Unsupervised texture segmentation results of images from the Prague texture database. Panels: (a) to (d).

4. Conclusion

In this paper we proposed an extension of a state-of-the-art color segmentation method based on the incorporation of texture information. We showed how covariance matrices can be used as a strong texture descriptor in this framework and how a multi-scale modification ensures accurate segmentation boundaries. Experimental evaluation on a synthetic texture database and on images of the Berkeley image segmentation database showed that segmentation results can be improved, especially for highly textured areas.

References

[1] Y. Y. Boykov and M. P. Jolly. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In Proceedings of International Conference on Computer Vision (ICCV), volume 1, pages 105–112, 2001.
[2] H. Cheng, X. H. Jiang, Y. Sun, and J. Wang. Color image segmentation: advances and prospects. Pattern Recognition, 34(12):2259–2281, 2001.
[3] D. Comaniciu and P. Meer. Robust analysis of feature spaces: Color image segmentation. In Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), pages 750–755, 1997.
[4] D. Cremers, M. Rousson, and R. Deriche. A review of statistical approaches to level set segmentation: integrating color, texture, motion and shape. International Journal of Computer Vision, 72(2), April 2007.
[5] M. Donoser and H. Bischof. ROI-SEG: Unsupervised color segmentation by combining differently focused sub results. In Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), 2007.
[6] W. Foerstner and B. Moonen. A metric for covariance matrices. Technical report, Department of Geodesy and Geoinformatics, Stuttgart University, 1999.
[7] C. Fowlkes, D. Martin, and J. Malik. Learning affinity functions for image segmentation: combining patch-based and gradient-based approaches. In Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), pages 54–61, 2003.
[8] J. Freixenet, X. Munoz, D. Raba, J. Marti, and X. Cufi. Yet another survey on image segmentation: Region and boundary information integration. In Proceedings of European Conference on Computer Vision (ECCV), pages 408–422, 2002.
[9] A. Levin and Y. Weiss. Learning to combine bottom-up and top-down segmentation. In Proceedings of European Conference on Computer Vision (ECCV), pages 581–594, 2006.
[10] T. Lindeberg. Scale-space for discrete signals. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 12(3):234–254, 1990.
[11] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of International Conference on Computer Vision (ICCV), pages 416–423, 2001.
[12] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide baseline stereo from maximally stable extremal regions. In Proceedings of British Machine Vision Conference (BMVC), pages 384–393, 2002.
[13] S. Mikeš and M. Haindl. Prague texture segmentation data generator and benchmark. ERCIM News, (64):67–68, 2006.
[14] S. Osher and N. Paragios. Geometric Level Set Methods in Imaging, Vision, and Graphics. Springer New York, Inc., 2003.
[15] N. Paragios and R. Deriche. Geodesic active contours for supervised texture segmentation. In Proceedings of Computer Vision and Pattern Recognition (CVPR), volume 2, 1999.
[16] X. Pennec, P. Fillard, and N. Ayache. A Riemannian framework for tensor computing. International Journal of Computer Vision (IJCV), 66:41–66, 2006.
[17] F. Porikli, O. Tuzel, and P. Meer. Covariance tracking using model update based on Lie algebra. In Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 728–735, 2006.
[18] T. Riklin-Raviv, N. Kiryati, and N. Sochen. Prior-based segmentation by projective registration and level sets. In Proceedings of International Conference on Computer Vision (ICCV), volume 1, pages 204–211, 2005.
[19] M. Rousson and N. Paragios. Shape priors for level set representations. In Proceedings of European Conference on Computer Vision (ECCV), pages 78–92, 2002.
[20] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
[21] O. Tuzel, F. Porikli, and P. Meer. Region covariance: A fast descriptor for detection and classification. In Proceedings of European Conference on Computer Vision (ECCV), pages 589–600, 2006.
[22] M. Unser. Texture classification and segmentation using wavelet frames. IEEE Transactions on Image Processing, 4(11):1549–1560, 1995.
[23] M. Vasconcelos, N. Vasconcelos, and G. Carneiro. Weakly supervised top-down image segmentation. In Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), pages 1001–1006, 2006.
[24] R. Zabih and V. Kolmogorov. Spatially coherent clustering using graph cuts. In Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages 437–444, 2004.