A Probabilistic Multi-Phase Model for Variational Image Segmentation ? Thomas Pock and Horst Bischof Institute for Computer Graphics and Vision, Graz University of Technology Infeldgasse 16/2, A-8010 Graz, Austria {pock, bischof}@icg.tugraz.at http://www.icg.tugraz.at
Abstract. Recently, the Phase Field Method has shown to be a powerful tool for variational image segmentation. In this paper, we present a novel multi-phase model for probability based image segmentation. By interpreting the phase fields as probabilities of pixels belonging to a certain phase, we obtain the model formulation by maximizing the mutual information between image features and the phase fields. For optimizing the model, we derive the Euler Lagrange equations and present their efficient implementation by using a narrow band scheme. We present experimental results on segmenting synthetic, medical and natural images.
1
Introduction
Image segmentation is one of the best studied but still unsolved low level vision problems. Its primal goal is to partition a given image domain Ω into K nonoverlapping regions Ωi , i = 1 . . . K, so that the objects covered by each region share some specific properties. The active contour or snake model, introduced by Kass Witkin and Terzopoulos [1], has shown to be a powerful tool for image segmentation. The major (technical) difficulty of active contours is to properly represent the interface of the contour. Basically, there are two concepts for interface representation. In an explicit representation, the points belonging to the interface are spatially sampled. Although this representation is very efficient in both memory and computational terms, it has drawbacks with respect to topological changes and strong curvatures. In an implicit representation, the interface is defined as the isocontour of a higher-dimensional function. Hence, topological changes can be handled in a very natural way [2]. The Level Set Method, introduced by Osher and Sethian [3] is based on such an implicit representation. The moving front (interface) is defined by the roots of a continuous function φ. For computational simplicity, the level set function is chosen as a signed distance function. For the task of image segmentation, numerous variants, involving image gradients (e.g. [4]) and region based forces ?
This work was supported by the Austrian Science Fund (FWF) under the grant P17066-N04.
(e.g. [5]) have been proposed to drive the front to the desired image boundaries. Although the level set method has been proven to be a robust numerical tool for a great number of applications, it also has some significant disadvantages [2]: – The numerical solution of the level set equation is obtained by a forward Euler method with a global time step restriction and thus converges slowly. – The interface of the level set function has to be tracked explicitly. – For numerical stability, the level set function should periodically be reinitialized to a signed distance function. – Multiple regions and open curves (curves having ends) cannot be handled in a straight-forward manner. The Variational approach, which is closely related to the level set method, is based on defining an appropriate energy functional whose minimizer is an optimal solution to a given image. The celebrated Mumford-Shah segmentation model [6] was one of the first models for variational image segmentation. In its original setting, the Mumford-Shah functional is hard to minimize, due to the lacking convexity and regularity of the edge-length term. A solution was presented by Ambrosio and Tortorelli [7] via Γ -convergence, where the edge set is represented by means of a phase field z ∈ [0, 1]. Suppose, z = 1 almost everywhere and sharply drops down to 0 in a -neighborhood around the edges. Furthermore, consider the following energy: Z Z (1 − z)2 2 Lz, = |∇z| dx + dx , (1) 4 Ω Ω where is the phase field parameter which controls the size of the transition band. The remarkable result, associated with this energy is that Lz, converges to the edge length, as → 0+ . It should also be noted, that this type of energy is not limited to closed curves, as it is the case for level set functions. Another possibility is to represent the edges as the transition of a phase field w, having two distinct phases [8], [9]. The Energy associated with this two-phase model is Z Z (w(1 − w))2 dx (2) Lw, = |∇w|2 dx + Ω Ω The difference between (1) and (2) mainly consists of the type of function used in the second term. The former one uses a single well potential with a well at 1, the latter one uses a double well potential with equidepth wells at 0 and 1. Deriving the Euler Lagrange equation of (2), one obtains the so-called AllenCahn equation [10] ∂Lw, ∂w 2 ≡ = −2∆w + w(1 − w)(1 − 2w) . ∂w ∂t
(3)
In has been shown that this equation approximates the motion of an interface by its mean curvature as → 0+ , which is important to get smooth boundaries [11]. The approximative formulation (3), offers some potential advantages:
– The explicit tracking of the moving interface (as it is the case for the level set method) is completely avoided. – Since the phase fields energies (e.g. (2)) are quadratic, they can be solved by fast state-of-the-art solvers. – Curves having edges or ends can easily be represented. – The phase field parameter can also be used for multiscale analysis (e.g. [12]). In this paper we utilize the phase field approach to derive a very general segmentation model which can easily handle multiple regions. In contrast to the level set method, where the level set function defines clear cut boundaries between the regions, we interpret the phase fields as probabilities of pixels belonging to regions (inspired by [13]). This model is very general, since it allows pixel to belong to several regions. In Section 2 we derive the model formulation based on maximizing the mutual information between the phase fields and a probability function based on image features. Although we aim for being very general, we demonstrate that even using simple Gaussian functions as a prior for the image features is sufficient for a large number of segmentation problems while having the advantage of being very cheap in terms of computational costs. For optimization, we derive the Euler Lagrange equations of the model and show how they can be solved via alternating minimization. In Section 3 we present a fast and robust narrow band scheme to compute the model. In Section 4, the model is validated by different numerical experiments including natural and medical images. As a byproduct, we show that the pixel probabilities can also be used for border matting of objects having fuzzy boundaries. In the last Section we give some concluding remarks and present some ideas for future directions.
2
Segmentation Model
Consider a random variable P ∈ {1, . . . , K} of region labels and a vector P(x) = (p1 (x), . . . , pK (x))T ∈ RK , describing the probability that a pixel x ∈ Ω ⊆ Rn belongs to region Ω` , where Ω = ∪` Ω` and ∩` Ω` = ∅ for ` ∈ [1, . . . , K]. This leads to the following partition of the image domain Ω: Ω(x) =
K X
p` (x) ≡ 1 ,
p` ≥ 0
(4)
`=1
where K is the number of image regions. Furthermore, let F ⊆ Rm be a random vector describing image features e.g. RGB channels of a color image. A segmentation is said to be optimal with respect to F, if the information of F which is covered by P is maximal. This can be achieved by maximizing the mutual information between F and P. I(F; P) = H(F) − H(F | P) ,
(5)
where H(·) and H(· | ·) denote the entropy and the conditional entropy. This criterion was also used by the authors of [14] and [15] but it is important to
note, that our model is more general since it allows each pixel to belong to several regions with a certain probability. Since H(F) is independent of the region labels, maximizing (5) can be turned into minimizing H(F | P). In order to obtain smooth boundaries, we additionally penalize the objective function by the phase field energy (2) which has, in a slightly different form, also been used in [13] for “Soft-Mumford-Shah” segmentation. Hence, the objective function we wish to minimize is E = H(F | P) + αLP, , where LP, =
K Z X
|∇p` |2 +
Ω
`=1
(p` (1 − p` ))2 dx ,
(6)
(7)
with the constraints that K X
p` = 1
and p` ≥ 0
(8)
`=1
The parameter α is used to control the influence of the phase field energy term. By the weak law of large numbers, the entropy term H(F | P) can be approximated by (up to negligible constants) K Z X ˜ H(F | P) = − p` log (Prob(F | P = `)) dx . (9) `=1
2.1
Ω
Parametric versus Non-parametric Density Estimation
Evaluating (9) requires the estimation of Prob(F | P = `). This can either be done by assuming a certain probability density function or by a non-parametric density estimator. Although non-parametric density estimation would be preferable, we decided to use parametric models. The main reason is the reduced computational complexity which becomes significant for 3D segmentation problems. Moreover, we found that even using single Gaussians for each region is sufficient for a great number of segmentation tasks. However, we emphasize that our image model easily generalizes to more complex models such as Gaussian Mixture Models [16] or non-parametric density estimators [17]. For this paper, as mentioned above, we proceed by assuming that each component of F follows a Gaussian probability density functions (PDFs), each associated with region Ω` . 1 1 −1 T exp − (s − µ ) Σ (s − µ ) , (10) G` (s, µ` , Σ ` ) = ` ` ` 2 (2π)m/2 |Σ ` |1/2 where µ` , Σ` are the m × 1 mean vector and the m × m covariance matrix. With this (9) becomes (up to negligible constants) K Z X ˜ H(F | P) = p` (s(x) − µ` )T Σ −1 (11) ` (s(x) − µ` ) + log(|Σ ` |) dx , `=1
Ω
where s(x) is the feature vector at pixel position x. By substituting (11) into (6) we have the complete energy functional of our segmentation model: ˜ E = H(F | P) + αLP, , 2.2
(12)
Optimization of the Constraint Energy Functional
The minimization of (12) requires the estimation of the parameters Θ = {µ` , Σ ` }, ` ∈ [1, . . . , K], describing the Gaussian PDFs and the estimation of the optimal pixel ownership P(x). This multivariate optimization problem is solved via the alternating minimization algorithm (AM), which is a well established technique. Since the AM algorithm works iteratively, the estimation of the parameters and the pixel ownerships at the n-th iteration are denoted as Θ(n) and P(n) . Given a pixel ownership P(n) (x), the currently optimal parameters Θ(n) of the Gaussian PDFs are estimated by solving Θ(n) = argminΘ {E(Θ | P(n) , F)}.
(13)
Deriving the Euler Lagrange equations of (12) with respect to Θ = {µ` , Σ ` }, one arrives at the following equations: Z 1 (n) (n) p` (x) s(x) dx (14) µ` = R (n) Ω p (x) dx ` Ω and (n)
Σ`
=R
Z
1 (n) p (x) dx Ω `
(n)
Ω
p`
(n)
(n)
(x)(s(x) − µ` ) (s(x) − µ` )T dx
(15)
Subsequently, based on the optimal estimation of Θ, the optimal pixel ownership for the next iteration is obtained by P(n+1) = argminP {E(P | Θ(n) , F)}.
(16)
Since the different pixel ownerships p` are related to each other by the P constraint K (8) they have to cooperate with each other. To satisfy the constraint `=1 p` = 1, we introduce a Lagrange multiplier λ and proceed by solving the following equations ( !) K X ∂ E+λ p` − 1 =0. (17) ∂p` `=1
1 −K
PK
∂E `=1 ∂p` .
As a result, we get λ = By substituting λ into (17) we obtain the gradient of the constrained energy functional K ∂E ∂E 1 X ∂E = − =0. ∂p` ∂p` K ∂p` `=1
(18)
∂E The expressions ∂p are given by the gradients of the unconstrained energy ` functional which yield ∂E 2 = h` + α −2∆p` + p` (1 − p` )(1 − 2p` ) , (19) ∂p` where h` = (s(x) − µ` )T Σ −1 ` (s(x) − µ` ) + log(|Σ ` |) . So far, we have not introduced the second constraint, namely p` ≥ 0. We found that it is easier to explicitely deal with this during the optimization procedure rather than introducing an additional Lagrange multiplier.
3
Fast and Robust Implementation
Before we start to implement our model, we briefly recall the notation used for image discretization, which will be used throughout the rest of the paper. We are using a 2D Cartesian grid which is defined as {(xi , yj ) | 1 ≤ i ≤ M, 1 ≤ j ≤ N }. For convenience, we are using a uniform grid, for which all the subintervalls 4x = [xi+1 − xi ] and 4y = [yj+1 − yj ] are equal in size. By definition, the use of a Cartesian grid implies a rectangular domain Ω = [x1 , xM ] × [y1 , yN ]. 3.1
Discretization of the Euler Lagrange Equations
The implementation of (14) and (15) is straight-forward and does not require further explanations. For a solution of (18), we are using standard Gauss-Seidel iterations of the following approximated Newton-type descent scheme: (k+1) (p` )i,j
=
(k) (p` )i,j
1 − (τ` )i,j
∂E ∂p`
(k) .
(20)
i,j
Due to (18), we only have to discretize (19) which is given by 2 ∂E 2 3 = (h` )i,j + α −2(∆p` )i,j + ((p` )i,j − 3(p` )i,j + 2(p` )i,j ) , (21) ∂p` i,j where (∆p` )i,j = ((p` )i+1,j + (p` )i−1,j + (p` )i,j+1 + (p` )i,j−1 − 4(p` )i,j ) is a finite difference approximation of the Laplace operator. The parameters (τ` )i,j are approximations of the second order derivatives of the energy functional and are given by ∂ ∂E 2 2 1 − 6(p` )i,j + 6(p` )i,j (22) (τ` )i,j = = α 8 + ∂(p` )i,j ∂p` i,j As mentioned in the previous section, we need to ensure p` ≥ 0 during the Gauss-Seidel iterations. For this purpose, we use the following rule: First, we (k+1) compute the values (p` )i,j , according to (20). Second, if the new values do (k) (k+1) not satisfy the constraint (p` )i,j ≥ 0, we set (τ` )i,j = (p` )i,j / ∂E/∂p` i,j .
Finally, we choose τi,j = min{(τ` )i,j } and set (τ` )i,j = τi,j for all ` ∈ [i, . . . , K]. Concerning the stability of the scheme, we can easily derive a lower bound of the phase field parameter . Since p` ∈ [0, 1], we have inf p` {τ` } = 0.5. If we require τ` > 0, we get 2 > 1/8. We also found that just a few iterations (5-10) are sufficient in order to approximately solve (18) for one step of the AM algorithm. 3.2
Fast Narrow Band Implementation
The AM algorithm described above has a computational complexity of O(N 2 ), where N is the size along one dimension of the image. If we restrict the computations to a narrow band, located around the transition of the phase fields, the computational complexity could be reduced to O(kN ), where k is the width of the narrow band. This concept has also been used in the context of level sets [18] and has shown to substantially reduce the computational costs, especially in the context of 3D applications. We propose the following algorithm to generate the narrow band of the phase field. 1. Choose all potential transition points: N0 = {(xi , yj ) | (∇p` )2i,j < γ }, where γ is a small fixed threshold (e.g. γ = 0.05). 2. The narrow band N is generated by extending N0 to a predefined width by solving the Eikonal equation [18] up to a fixed stopping value ( e.g. tmax = 5). The narrow band variants of (14) and (15) are straight-forward and thus omitted. To efficiently update the covariance we use the well known property of the covariance matrix, Σ = E[ssT ] − µµT . The narrow band variant of (18) is obtained by simply restricting (20) to the points of the narrow band. Hence, we have reduced the computational complexity to O(kN ) for each iteration of the AM algorithm. After a few iterations of the AM algorithm, the narrow band has to be updated, which is done by the same procedure.
4
Experimental Results
In order to show the wide applicability of the proposed segmentation model, we have tested our algorithm on several images. For all experiments we used = 2.0. Fig. 1 shows the two-phase segmentation of the corpus callosum from a brain MRI. As a single feature we used just the image intensity. The parameter α was set to 15. This example shows, that using simple Gaussian PDFs is sufficient to distinguish the object from a complex background. Moreover, due to our narrow band formulation, only pixels contained in the transition between the phases are taken into account, which essentially increases the robustness of our algorithm. Fig. 2 shows the segmentation of a person from a color image. We used a threedimensional feature vector composed of the RGB channels. The parameter α was set to 20. This shows, that our model easily generalizes to higher dimensional feature vectors. Fig. 3 shows the segmentation of two textured image with 5 phases. The parameter α was set to 20. For the first image, the feature vector was based only on pixel intensities. For the second, more challenging image, we
extended the feature vector by using the coefficients of the first level of a Haarwavelet transform. Fig. 4 shows a three-phase segmentation of a natural image. The parameter α was set to 20. The objects koala, trunk and leaves are fairly good extracted. 4.1
Automatic Border Matting
Fig. 5 shows the segmentation of a fox sitting in the grass. The parameter α was set to 20. Since the fox apparently does not have clear-cut boundaries, one can not define a hard transition between fox and grass. As a byproduct of our method, we directly obtain matted boundaries of the object by using the pixel probabilities of the phase field. Moreover, the border matting of our model works adaptively, which means that the transition width of the phase field is automatically adapted to the appearance of the object boundary. This is in contrast to other methods such as [19], where border matting has to be performed in an extra step.
5
Conclusion
A probability based multi-phase model for variational image segmentation was presented in this paper. The model deals with phase fields, which are interpreted as the probabilities of pixels belonging to regions. The model formulation is derived by maximizing the mutual information between the phase fields and a PDF based on image features. We demonstrated, that this model is very general and can easily deal with multiple regions. For optimization, we derived its Euler-Lagrange equations and presented a fast and robust narrow band implementation of them. In order to show the wide applicability of the model, the algorithm was tested on several images with the same parameter settings. Additionally, we showed that the phase fields can easily be utilized to automatically obtain matted boundaries of fuzzy objects. As discussed in Section 2.1 we use simple multivariate Gaussians for density estimation, but it is clear that in order to handle more complex cases (illumination changes, texture) more sophisticated models are required. Additionally, when dealing with PDFs, the spatial dependence of the features inside the regions is completely ignored. So far, the number of regions are user-specified, but it can be automated using mulitiscale patch statistics [13]. Future work will mainly concentrate on this issues.
References 1. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: Active contour models. Int’l J. of Comp. Vision 1 (1988) 321–331 2. Osher, S., Fedkiw, R.: Level Set Methods and Dynamic Implicit Surfaces. Springer (2003) 3. Osher, S., Sethian, J.: Fronts propagating with curvature dependent speed: Algorithms based on hamilton-jacobi formulations. J. Comput. Phys. 79 (1988) 12–49
(a)
(b)
(c)
(d)
Fig. 1. Segmentation of the corpus callosum from a brain MRI. (a) Initial, (b) intermediate (30 iterations) and (c) final segmentation (100 iterations). (d) Phase field of the final segmentation.
(a)
(b)
(c)
Fig. 2. Segmentation of a person from a color image. (a) Original image, (b) final phase field (60 iterations) and (c) extracted person using pixel probabilities of the phase field.
(a)
(b)
Fig. 3. Segmentation of textured images using 5 phases. The original images are shown with superimposed boundaries of the segmentation result.
4. Caselles, V., Kimmel, R., Sapiro, G.: Geodesic active contours. In: Int’l. Conf. Computer Vision ’95, Boston (1995) 694–699 5. Chan, T., Vese, L.: Active contours without edges. IEEE Trans. Image Processing 10(2) (2001) 266–277 6. Mumford, D., Shah, J.: Optimal approximation by piecewise smooth functions and associated variational problems. Comm. Pure Appl. Math. 42 (1989) 577–685 7. Ambrosio, L., Tortorelli, V.: Approximation of functionals depending on jumps by elliptic functionals via Γ -convergence. Comm. Pure. Appl. Math. 43 (1990) 999–1036 8. Esedoglu, S., Tsai, Y.H.R.: Threshold dynamics for the piecewise constant MufordShah functional. Technical report, CAM report (04-63) (2004) 9. Shen, J.: Γ -convergence approximation to piecewise constant Mumford-Shah segmentation. In: Int’l Conf. Advanced Concepts Intell. Vision Systems. (2005) 499– 506
(a)
(b)
(c)
(d)
Fig. 4. Three-phase segmentation of natural image. (a) Original image. (b)-(d) Extracted objects: koala, trunk and leaves.
(a)
(b)
(c)
(d)
Fig. 5. Segmentation of an object having fuzzy boundaries. (a) Original image, (b) final phase field, (c) extracted object by using phase field probabilities and (d) detail view of (c).
10. Allen, S., Cahn., J.: A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metall. Mater. (1979) 1085–1095 11. Rubinstein, J., Sternberg, P., Keller, J.B.: Fast reaction, slow diffusion, and curve shortening. SIAM J. Appl. Math. 49(1) (1989) 116–133 12. Droske, M., Rumpf, M.: Multi scale joint segmentation and registration of image morphology. IEEE Transaction on Pattern Recognition and Machine Intelligence (2005) submitted. 13. Shen, J.: A stochastic-variational model for soft Mumford-Shah segmentation. Int’l J. Biomedical Imaging (2006) 14. Awate, S.P., Tasdizen, T., Whitaker, R.T.: Unsupervised texture segmentation with nonparametric neighborhood statistics. In: ECCV 2006, Graz (2006) to appear. 15. Kim, J., Fisher, J.W., Yezzi, A., Cetin, M., Willsky, A.S.: Nonparametric methods for image segmentation using information theory and curve evolution. In: IEEE Int’l Conf. on Image Processing, Rochester (2002) 16. Redner, R.A., Walker, H.F.: Mixture densities, maximum likelihood and the EM algorithm. SIAM Review (1984) 17. Parzen, E.: On estimation of a probability density function and mode. Annals of Mathematical Statistics 33(3) (1962) 1065–1076 18. Sethian, J.: Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry. 2nd edn. Cambridge University Press (1999) 19. Rother, C., Kolmogorov, V., Blake, A.: Grabcut - interactive foreground extraction using iterated graph cuts. In: ACM Siggraph. (2004)