Colour Texture Segmentation Using Modelling Approach
Michal Haindl and Stanislav Mikeš
Institute of Information Theory and Automation, Academy of Sciences CR, 182 08 Prague, Czech Republic
{haindl, xaos}@utia.cas.cz
Abstract. A fast and robust unsupervised multispectral texture segmentation method with an unknown number of classes is presented. Single decorrelated monospectral texture factors are represented by four local autoregressive random field models recursively evaluated for each pixel and for each spectral band. The segmentation algorithm is based on the underlying Gaussian mixture model and starts with an over-segmented initial estimation which is adaptively modified until the optimal number of homogeneous texture segments is reached. The performance of the presented method is extensively tested on the Prague segmentation benchmark using nineteen of the most frequently used segmentation criteria.
1 Introduction
Segmentation is a fundamental process affecting the overall performance of an automated image analysis system. Image regions, homogeneous with respect to some usually textural measure, which result from a segmentation algorithm are analysed in subsequent interpretation steps. Texture-based image segmentation has been an area of intense research activity in recent years, and many algorithms have been published as a consequence of this effort. These methods are usually categorized [1] as region-based, boundary-based, or as a hybrid of the two. Different published methods are difficult to compare because of the lack of a comprehensive analysis together with accessible experimental data; however, available results indicate that the texture segmentation problem is still far from being solved. Spatial interaction models, and especially Markov random field-based models, are increasingly popular for texture representation [2], [1], [3], etc. Several researchers have dealt with the difficult problem of unsupervised segmentation using these models, see for example [4], [5], [6], [7] or [8]; this problem is also addressed in this paper.
2 Texture Representation
Static smooth multispectral textures require three-dimensional models for adequate representation. However, if we slightly compromise the description of spatial-spectral correlations, these textures can be represented by a set of simpler 2D data models with fewer parameters per model. The natural texture data space can be decorrelated only approximately, thus the independent spectral component representation suffers some loss of image information. Because segmentation is a less demanding application than texture synthesis, it is sufficient if such a representation maintains the discriminative power of the full model even if its visual modelling strength is imperceptibly compromised.

Spectral factorization using the Karhunen-Loeve expansion transforms the original centered data space θ indexed on the rectangular $M \times N$ finite lattice $I$ into a new data space with K-L coordinate axes $\bar{Y}$. The new basis vectors are the eigenvectors of the second-order statistical moments matrix $\Phi = E\{\tilde{Y}_r \tilde{Y}_r^T\}$, where the multiindex $r$ has two components $r = [r_1, r_2]$, the first component being the row index and the second one the column index. Components of the transformed vector $\bar{Y}_r$ are mutually uncorrelated. If we further assume a Gaussian distribution of the vectors $\bar{Y}_r$, then they are also independent, i.e.,

$$p(\bar{Y}_r) = \prod_{k=1}^{n} p(\bar{Y}_{r,k})$$

and single monospectral random fields can be represented independently.

S. Singh et al. (Eds.): ICAPR 2005, LNCS 3687, pp. 484–491, 2005. © Springer-Verlag Berlin Heidelberg 2005

2.1 Texture Factor Model
We assume that single monospectral texture factors ($Y_r = \bar{Y}_{r,k}$) can be modelled using a causal autoregressive random field model (CAR). The 2D CAR model can be expressed as a stationary causal uncorrelated noise driven 2D autoregressive process [9]:

$$Y_r = \gamma X_r + e_r , \tag{1}$$

where $\gamma = [a_1, \ldots, a_\eta]$ is the parameter vector, $I_r^c$ is a causal neighbourhood index set with $\eta = \mathrm{card}(I_r^c)$, $e_r$ is white Gaussian noise with zero mean and a constant but unknown variance $\sigma^2$, $X_r$ is the corresponding vector of the contextual neighbours $Y_{r-s}$, and $r, r-1, \ldots$ is a chosen direction of movement on the image index lattice $I$. The selection of an appropriate CAR model support ($I_r^c$) is important to obtain a good texture representation. An optimal neighbourhood can be found analytically using the Bayesian approach [9]. The Bayesian parameter estimation of a CAR model can also be found analytically under a few additional and acceptable assumptions. The recursive Bayesian parameter estimate of the causal AR model with the normal-gamma parameter prior, which maximizes the posterior density, is [9]:

$$\hat{\gamma}_{r-1}^T = \hat{\gamma}_{r-2}^T + \frac{V_{x(r-2)}^{-1} X_{r-1} (Y_{r-1} - \hat{\gamma}_{r-2} X_{r-1})^T}{(1 + X_{r-1}^T V_{x(r-2)}^{-1} X_{r-1})} , \tag{2}$$

where

$$V_{x(r-1)} = \sum_{k=1}^{r-1} X_k X_k^T + V_{x(0)} .$$
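The recursion (2) can be illustrated with a minimal NumPy sketch for a single scalar texture factor. The function names `car_update` and `estimate_car`, the prior scale `v0` (standing in for $V_{x(0)} = v_0 I$), and the use of the Sherman-Morrison identity to keep the inverse of $V_x$ up to date are assumptions made for this illustration; they are not taken from [9].

```python
import numpy as np

def car_update(gamma, V_inv, x, y):
    """One recursive step of Eq. (2) for a scalar observation y.

    gamma : (eta,) current parameter estimate
    V_inv : (eta, eta) inverse of the data accumulation matrix V_x
    x     : (eta,) vector of contextual neighbours X
    y     : scalar pixel value Y
    """
    Vx = V_inv @ x                      # V^{-1} X
    denom = 1.0 + x @ Vx                # 1 + X^T V^{-1} X
    gamma_new = gamma + Vx * (y - gamma @ x) / denom
    # Sherman-Morrison: inverse of (V + X X^T), valid because V is symmetric.
    V_inv_new = V_inv - np.outer(Vx, Vx) / denom
    return gamma_new, V_inv_new

def estimate_car(X, Y, v0=1.0):
    """Run the recursion over all (neighbour vector, pixel value) pairs.

    X : (num_pixels, eta) rows are neighbour vectors in scan order
    Y : (num_pixels,) pixel values
    v0: hypothetical prior scale, V_x(0) = v0 * I
    """
    eta = X.shape[1]
    gamma = np.zeros(eta)
    V_inv = np.eye(eta) / v0
    for x, y in zip(X, Y):
        gamma, V_inv = car_update(gamma, V_inv, x, y)
    return gamma
```

Because each step costs only a few matrix-vector products, the estimate can be refreshed at every pixel, which is what makes a per-pixel local parameter vector affordable.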
Local texture for each pixel is represented by four parametric vectors per spectral band ($r_3 = 1, 2, \ldots, n$; for colour textures $n = 3$). Each parameter vector contains local estimations of the CAR model parameters. These models have an identical contextual neighbourhood $I_r^c$ but they differ in their major movement direction (top-down, bottom-up, rightward, leftward), i.e.,

$$\tilde{\gamma}_{r,r_3}^T = \{\tilde{\gamma}_{r,r_3}^t, \tilde{\gamma}_{r,r_3}^b, \tilde{\gamma}_{r,r_3}^r, \tilde{\gamma}_{r,r_3}^l\}^T .$$

The parametric space $\tilde{\Theta}$ is subsequently smoothed and its dimensionality is reduced using the Karhunen-Loeve feature extraction (analogously to the spectral space decorrelation). Finally, we add the average local spectral values $\zeta_{r,i}$ to the resulting feature vector ($\Theta_r$).
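Both the spectral decorrelation and the feature-space reduction described in this section rely on the Karhunen-Loeve expansion. A minimal NumPy sketch is given below; the function name `kl_transform`, the row-per-pixel data layout and the optional `n_components` argument are assumptions made for illustration only.

```python
import numpy as np

def kl_transform(data, n_components=None):
    """Karhunen-Loeve (principal component) transform.

    data : (num_pixels, n) array, one row per lattice site
    Returns the decorrelated data and the basis (eigenvectors as columns).
    """
    centered = data - data.mean(axis=0)
    # Second-order moments matrix of the centered data.
    phi = centered.T @ centered / centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(phi)      # phi is symmetric
    order = np.argsort(eigvals)[::-1]           # largest variance first
    basis = eigvecs[:, order[:n_components]]    # keep all axes if None
    return centered @ basis, basis
```

For a colour image the rows would be the RGB vectors of all pixels; passing a smaller `n_components` gives the dimensionality reduction used for the parametric space.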
3 Mixture Model Based Segmentation
Multi-spectral texture segmentation is done by clustering in the CAR parameter space $\Theta$ defined on the lattice $I$, where

$$\Theta_r = [\gamma_{r,1}, \zeta_{r,1}, \gamma_{r,2}, \zeta_{r,2}, \ldots, \gamma_{r,n}, \zeta_{r,n}]^T .$$

$\gamma_{r,i}$ is the parameter vector (2) computed for the $i$-th transformed spectral band for the lattice location $r$. We assume that this parametric space can be represented using the Gaussian mixture model (GM) with diagonal covariance matrices due to the CAR parametric space decorrelation. The Gaussian mixture model for CAR parametric representation is as follows:

$$p(\Theta_r) = \sum_{i=1}^{K} p_i \, p(\Theta_r \mid \nu_i, \Sigma_i) , \tag{3}$$

$$p(\Theta_r \mid \nu_i, \Sigma_i) = \frac{|\Sigma_i|^{-\frac{1}{2}}}{(2\pi)^{\frac{d}{2}}} \, e^{-\frac{(\Theta_r - \nu_i) \Sigma_i^{-1} (\Theta_r - \nu_i)^T}{2}} . \tag{4}$$
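As an illustration of (3) and (4) with diagonal covariance matrices, the following NumPy sketch evaluates the mixture density in the log domain. The function names and the array layout (one row per pixel, variances stored as vectors) are assumptions of this sketch; the log-sum-exp step is a standard numerical safeguard, not something prescribed by the method.

```python
import numpy as np

def log_gauss_diag(theta, nu, var):
    """Log of Eq. (4) for diagonal covariances.

    theta : (P, d) feature vectors
    nu    : (K, d) component means
    var   : (K, d) diagonal entries of each Sigma_i
    Returns a (P, K) array of log densities.
    """
    d = theta.shape[1]
    diff = theta[:, None, :] - nu[None, :, :]
    maha = np.sum(diff ** 2 / var[None, :, :], axis=2)
    log_det = np.sum(np.log(var), axis=1)       # log |Sigma_i|
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det[None, :] + maha)

def mixture_log_density(theta, weights, nu, var):
    """Log of Eq. (3), computed with the log-sum-exp trick."""
    log_terms = np.log(weights)[None, :] + log_gauss_diag(theta, nu, var)
    m = log_terms.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_terms - m).sum(axis=1, keepdims=True)))[:, 0]
```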
The mixture model equations (3), (4) are solved using a modified EM algorithm. The algorithm is initialized using $\nu_i, \Sigma_i$ statistics estimated from the corresponding rectangular subimages obtained by regular division of the input texture mosaic. An alternative initialization can be a random choice of these statistics. For each possible couple of rectangles the Kullback-Leibler divergence

$$D\left(p(\Theta_r \mid \nu_i, \Sigma_i) \,\|\, p(\Theta_r \mid \nu_j, \Sigma_j)\right) = \int_{\Omega} p(\Theta_r \mid \nu_i, \Sigma_i) \log \left( \frac{p(\Theta_r \mid \nu_i, \Sigma_i)}{p(\Theta_r \mid \nu_j, \Sigma_j)} \right) d\Theta_r \tag{5}$$

is evaluated and the most similar rectangles, i.e.,

$$\{i, j\} = \arg\min_{k,l} D\left(p(\Theta_r \mid \nu_l, \Sigma_l) \,\|\, p(\Theta_r \mid \nu_k, \Sigma_k)\right) ,$$

are merged together in each step.
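For Gaussian components the integral (5) has a well-known closed form, which the sketch below uses for the diagonal-covariance case together with an exhaustive search for the most similar pair. Whether the original implementation evaluates (5) in closed form is not stated in the text, so this is an assumption; the function names are also hypothetical.

```python
import numpy as np

def kl_diag_gauss(nu_i, var_i, nu_j, var_j):
    """Closed-form D(N(nu_i, diag var_i) || N(nu_j, diag var_j))."""
    return 0.5 * np.sum(
        var_i / var_j + (nu_j - nu_i) ** 2 / var_j - 1.0 + np.log(var_j / var_i)
    )

def most_similar_pair(nus, variances):
    """Return the ordered pair (l, k), l != k, with minimal divergence."""
    K = len(nus)
    best, best_pair = np.inf, None
    for l in range(K):
        for k in range(K):
            if k == l:
                continue
            d = kl_diag_gauss(nus[l], variances[l], nus[k], variances[k])
            if d < best:                 # ties keep the first pair found
                best, best_pair = d, (l, k)
    return best_pair, best
```

The divergence is not symmetric, so the search runs over ordered pairs; a symmetrized variant would be an equally plausible reading of the merging rule.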
are merged together in each step. This initialization results in Kini subimages and recomputed statistics νi , Σi . Kini > K where K is the optimal number of textured segments to be found by the algorithm. Two steps of the EM algorithm are repeating after initialization. The components with smaller weights ) are eliminated. For every pair of compothan a fixed threshold (pj < K0.1 ini nents we estimate their Kullback Leibler divergence (5). From the most similar couple, the component with the weight smaller than the threshold is merged to its stronger partner and all statistics are actualized using the EM algorithm. The algorithm stops when either the likelihood function has negligible increase (Lt − Lt−1 < 0.05) or the maximum iteration number threshold is reached. The parametric vectors representing texture mosaic pixels are assigned to the clusters according to the highest component probabilities, i.e., Yr is assigned to the cluster ωj if πr,j = maxj ws p(Θr−s | νj , Σj ) , s∈Ir
where ws are fixed distance-based weights, Ir is a rectangular neighbourhood and πr,j > πthre (otherwise the pixel is unclassified). The area of single cluster blobs is evaluated in the post-processing thematic map filtration step. Thematic map blobs with area smaller than a given threshold are attached to its neighbour with the highest similarity value. If there is no similar neighbour the blob is eliminated.
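The pixel assignment rule above can be sketched as a weighted sum of per-component likelihood maps over a rectangular window, followed by an arg max and a threshold test. In this sketch the function name `assign_labels`, zero padding at the image border, odd window sizes and the label -1 for unclassified pixels are all assumptions; the actual weights $w_s$ are not specified in the text and are passed in by the caller.

```python
import numpy as np

def assign_labels(lik, weights, pi_thre):
    """Neighbourhood-weighted cluster assignment.

    lik     : (K, M, N) component likelihoods p(Theta_r | nu_j, Sigma_j)
    weights : (h, w) window of weights w_s, h and w odd; the centre entry
              corresponds to s = 0
    pi_thre : pixels whose best score does not exceed this stay unclassified
    Returns (labels, score); labels are in 0..K-1 or -1 (unclassified).
    """
    K, M, N = lik.shape
    h, w = weights.shape
    ph, pw = h // 2, w // 2
    padded = np.pad(lik, ((0, 0), (ph, ph), (pw, pw)))   # zero padding
    score = np.zeros_like(lik, dtype=float)
    for a in range(h):
        for b in range(w):
            # The window is flipped so that weight w_s multiplies the
            # likelihood at site r - s.
            score += weights[h - 1 - a, w - 1 - b] * padded[:, a:a + M, b:b + N]
    labels = score.argmax(axis=0)
    labels[score.max(axis=0) <= pi_thre] = -1
    return labels, score
```

The resulting label map is what the subsequent thematic map filtration would operate on.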
4 Experimental Results
The algorithm was tested on natural colour texture mosaics from the Prague Texture Segmentation Data-Generator and Benchmark [10]. The benchmark test mosaic layouts and each cell's texture membership are randomly generated and filled with colour textures from our large colour texture database (more than 1000 high-resolution colour textures). The benchmark ranks segmentation algorithms according to a chosen criterion. We have implemented three groups of criteria: region-based [11], pixel-wise [12], [13] and consistency measures [14].

The region-based [11] performance criteria mutually compare ground truth (GT) image regions with the corresponding machine segmented (MS) regions. They are the correct, oversegmentation, undersegmentation, missed and noise criteria, i.e., correct (more than 75% of GT region pixels are correctly assigned), oversegmentation (more than 75% of GT pixels are assigned to a union of regions), undersegmentation (more than 75% of pixels from a classified region belong to a union of GT regions), missed (GT in none of the previous categories) and noise (MS in none of the previous categories). Our pixel-wise criteria group contains the most frequently used classification criteria such as the omission and commission errors, class accuracy, recall, precision, etc. Finally, the last criteria set incorporates the global and local consistency errors [14].

Table 1. Benchmark criteria: CS = correct segmentation; OS = over-segmentation; US = under-segmentation; ME = missed error; NE = noise error; O = omission error; C = commission error; CA = class accuracy; CO = recall – correct assignment; CC = precision – object accuracy; I. = type I error; II. = type II error; EA = mean class accuracy estimate; OA = overall accuracy; MS = mapping score; RM = root mean square proportion estimation error; CI = comparison index; GCE = Global Consistency Error; LCE = Local Consistency Error.

Prague Segmentation Benchmark – Colour
        presented method   GMRF method [8]   Blobworld [15]   EDISON [16]
CS           46.24              32.43             15.73           12.68
OS           76.21              50.76              1.16           86.93
US            3.81              14.23             10.25            0.00
ME            7.66              13.19             67.95            2.48
NE            9.59              16.19             71.58            4.65
O             7.03               8.76              9.36           14.83
C             0.83               3.22              7.03            0.17
CA           27.04              23.87             21.10           16.05
CO           68.54              62.23             54.00           31.55
CC           96.05              87.47             70.64           98.09
I.           31.46              37.77             46.00           68.44
II.           1.11               4.38              9.69            0.24
EA           76.10              66.10             56.29           41.29
OA           68.54              62.23             54.00           31.55
MS           66.07              54.91             35.37           31.13
RM            3.56               5.45              8.17            3.21
CI           78.83              69.64             59.00           50.29
GCE           8.52              16.56             38.29            3.54
LCE           5.55               8.69             27.28            3.43

Tab. 1 compares the overall benchmark performance of the proposed algorithm (segmentation time 11 min/img on the Athlon 2GHz processor) with the Blobworld [15] (30 min/img), Edison [16] (10 s/img) and our previously published method [8] (55 min/img), respectively. These results demonstrate very good pixel-wise, correct region segmentation and low undersegmentation properties of our method, while the oversegmentation results are only average. For all the pixel-wise criteria or the consistency measures our method is either the best one or the next best with a marginal difference from the best one.

Fig. 1 shows four selected 512×512 experimental benchmark mosaics created from five to eleven natural colour textures. The last three columns demonstrate comparative results from three alternative algorithms. Hard natural textures were chosen rather than synthesized ones (for example using Markov random field models) because they are expected to be more difficult for the underlying segmentation model. The second column demonstrates robust behaviour of our
algorithm but also infrequent algorithm failures producing an oversegmented thematic map for some textures. Such failures can be corrected by a more elaborate postprocessing step. The Blobworld [15] and Edison [16] algorithms performed steadily worse on these data, as can be seen in the last two columns of Fig. 1; some areas are undersegmented while other parts of the mosaics are oversegmented. The resulting segmentation results are promising; however, comparison with other algorithms is difficult because of the lack of sound experimental evaluation results in the field of texture segmentation algorithms. The overall accuracy of pixel-wise correct segmentation for this example is 69%. This result can be further improved by appropriate postprocessing.
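To make the region-based "correct" criterion from this section concrete, the sketch below counts a ground-truth region as correctly segmented when a single machine-segmented region covers more than 75% of it and, in turn, more than 75% of that machine region lies inside the ground-truth region. The mutual-overlap reading, the reporting of the result as a percentage of GT regions, the label -1 for unclassified pixels and the function name are assumptions of this sketch; the benchmark's own implementation [10] may differ.

```python
import numpy as np

def correct_segmentation(gt, ms, thr=0.75):
    """Percentage of ground-truth regions that are correctly segmented.

    gt : (M, N) integer ground-truth label map
    ms : (M, N) integer machine segmentation, -1 marks unclassified pixels
    """
    gt_labels = np.unique(gt)
    n_correct = 0
    for g in gt_labels:
        g_mask = gt == g
        inside = ms[g_mask]
        inside = inside[inside >= 0]            # ignore unclassified pixels
        if inside.size == 0:
            continue
        labels, counts = np.unique(inside, return_counts=True)
        best = counts.argmax()                  # dominant MS region inside g
        overlap = counts[best]
        ms_size = np.count_nonzero(ms == labels[best])
        if overlap > thr * g_mask.sum() and overlap > thr * ms_size:
            n_correct += 1
    return 100.0 * n_correct / gt_labels.size
```

A perfect segmentation scores 100, while a map that merges two equally sized ground-truth regions into one scores 0 under this reading, since the merged region is only half covered by either ground-truth region.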
Fig. 1. Selected experimental texture mosaics from the benchmark, our segmentation results (2nd column), GMRF method [8] (3rd column), Blobworld [15] (4th column), and Edison [16] segmentation results (rightmost column), respectively.
5 Conclusions
We proposed a novel, efficient and robust method for unsupervised texture segmentation with an unknown number of classes based on the underlying CAR and GM texture models. Although the algorithm uses a random field type model, it is extremely fast because it uses efficient recursive parameter estimation of the model and is therefore much faster than the usual Markov chain Monte Carlo estimation approach. A usual handicap of segmentation methods is the large number of application-dependent parameters which have to be experimentally estimated. Our method requires only a contextual neighbourhood selection and two additional thresholds. The algorithm's performance is demonstrated on extensive benchmark tests on natural texture mosaics. It performs favourably compared with three alternative segmentation algorithms and it is faster than our previously published GMRF method. These test results are encouraging and we are proceeding with more elaborate postprocessing and some alternative texture representation models such as an alternative 3D CAR random field model.
Acknowledgements
This research was supported by the EC projects no. IST-2001-34744, FP6-507752, and partially by the GAAV grants no. A2075302, T400750407 and MŠMT project 1M6798555601 DAR.
References
1. Reed, T.R., du Buf, J.M.H.: A review of recent texture segmentation and feature extraction techniques. CVGIP–Image Understanding 57 (1993) 359–372
2. Kashyap, R.: Image models. In T.Y. Young, K.F., ed.: Handbook of Pattern Recognition and Image Processing. Academic Press, New York (1986)
3. Haindl, M.: Texture synthesis. CWI Quarterly 4 (1991) 305–331
4. Panjwani, D., Healey, G.: Markov random field models for unsupervised segmentation of textured color images. IEEE Transactions on Pattern Analysis and Machine Intelligence 17 (1995) 939–954
5. Manjunath, B., Chellappa, R.: Unsupervised texture segmentation using Markov random field models. IEEE Transactions on Pattern Analysis and Machine Intelligence 13 (1991) 478–482
6. Andrey, P., Tarroux, P.: Unsupervised segmentation of Markov random field modeled textured images using selectionist relaxation. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998) 252–262
7. Haindl, M.: Texture segmentation using recursive Markov random field parameter estimation. In Bjarne, K., Peter, J., eds.: Proceedings of the 11th Scandinavian Conference on Image Analysis, Lyngby, Denmark, Pattern Recognition Society of Denmark (1999) 771–776
8. Haindl, M., Mikeš, S.: Model-based texture segmentation. In Campilho, A., Kamel, M., eds.: Image Analysis and Recognition. Lecture Notes in Computer Science 3212, Berlin, Springer-Verlag (2004) 306–313
9. Haindl, M., Šimberová, S.: A Multispectral Image Line Reconstruction Method. In: Theory & Applications of Image Analysis. World Scientific Publishing Co., Singapore (1992) 306–315
10. Prague texture segmentation data-generator and benchmark. http://mosaic.utia.cas.cz (2004)
11. Hoover, A., Jean-Baptiste, G., Jiang, X., Flynn, P.J., Bunke, H., Goldgof, D.B., Bowyer, K., Eggert, D.W., Fitzgibbon, A., Fisher, R.B.: An experimental comparison of range image segmentation algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 18 (1996) 673–689
12. Rosenfield, G.: Analysis of thematic map classification error matrices. Photogrammetric Engineering and Remote Sensing 52 (1986) 681–686
13. Martin, D., Fowlkes, C., Malik, J.: Learning to detect natural image boundaries using brightness and texture. IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004) 1–19
14. Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proc. 8th Int'l Conf. Computer Vision. Volume 2. (2001) 416–423
15. Carson, C., Thomas, M., Belongie, S., Hellerstein, J.M., Malik, J.: Blobworld: A system for region-based image indexing and retrieval. In: Third International Conference on Visual Information Systems, Springer (1999)
16. Christoudias, C., Georgescu, B., Meer, P.: Synergism in low level vision. In Kasturi, R., Laurendeau, D., Suen, C., eds.: Proceedings of the 16th International Conference on Pattern Recognition. Volume 4., Los Alamitos, IEEE Computer Society (2002) 150–155