SEGMENTATION OF TEXTURED IMAGES USING A MULTIRESOLUTION GAUSSIAN AUTOREGRESSIVE MODEL

Mary L. Comer and Edward J. Delp
Computer Vision and Image Processing Laboratory
School of Electrical Engineering
Purdue University
West Lafayette, Indiana

Corresponding Author:
Professor Edward J. Delp
School of Electrical Engineering
1285 Electrical Engineering Building
Purdue University
West Lafayette, IN 47907-1285
Telephone: (317) 494-1740
Fax: (317) 494-6440
[email protected]

Abstract: We present a new algorithm for segmentation of textured images using a multiresolution Bayesian approach. The new algorithm uses a multiresolution Gaussian autoregressive (MGAR) model for the pyramid representation of the observed image, and assumes a multiscale Markov random field model for the class label pyramid. Unlike previously proposed Bayesian multiresolution segmentation approaches, which have either used a single-resolution representation of the observed image or implicitly assumed independence between different levels of a multiresolution representation of the observed image, the models used in this paper incorporate correlations between different levels of both the observed image pyramid and the class label pyramid. The criterion used for segmentation is the minimization of the expected value of the number of misclassified nodes in the multiresolution lattice. The estimate which satisfies this criterion is referred to as the "multiresolution maximization of the posterior marginals" (MMPM) estimate, and is a natural extension of the single-resolution "maximization of the posterior marginals" (MPM) estimate. Previous multiresolution segmentation techniques have been based on the maximum a posteriori (MAP) estimation criterion, which has been shown to be less appropriate for segmentation than the MPM criterion. It is assumed that the number of distinct textures in the observed image is known. The parameters of the MGAR model (the means, prediction coefficients, and prediction error variances of the different textures) are unknown. A modified version of the expectation-maximization (EM) algorithm is used to estimate these parameters. The parameters of the Gibbs distribution for the label pyramid are assumed to be known. Experimental results demonstrating the performance of the algorithm are presented.
EDICS: IP
This work was partially supported by a National Science Foundation Graduate Fellowship.
1 INTRODUCTION

In this paper we present a new multiresolution textured-image segmentation algorithm which uses a doubly-stochastic model. For single-resolution schemes the doubly-stochastic model can be described as follows: Given an observed image containing a number of regions corresponding to different objects, or textures, there is an unobserved image, referred to as the label field, which contains the classifications of all pixels in the image. The observed image and the label field are discrete-parameter random fields, and are modeled using stochastic models. In the doubly-stochastic framework for the multiresolution case, the observed data is a multiresolution representation of the observed image. The corresponding label pyramid contains the classifications of all nodes in the multiresolution lattice on which the observed image pyramid is defined. The classifications of the nodes in the multiresolution pyramid will be referred to as the class labels.

The importance of utilizing information at various scales for texture analysis has led to the development of several multiresolution approaches to textured image segmentation [1, 2, 3, 4, 5, 6, 7]. Exploiting information at multiple resolutions offers several advantages over single-resolution approaches. First, computational complexity is reduced, since much of the work can be done at coarse resolutions, where there are significantly fewer pixels to process. Second, algorithms which determine the classification of a given pixel based on local characteristics in the region containing the pixel can effectively make decisions based on larger neighborhoods by examining the image at coarser resolutions, without the increase in computational complexity which would result from simply using larger neighborhoods at the original image resolution. Also, processing an image at multiple resolutions obviates the need for a priori selection of one optimal resolution for processing.

In this paper we propose a new segmentation algorithm which uses an MGAR model for
the observed image pyramid. Our approach is different from previously proposed approaches in that we obtain a multiresolution representation of the observed image and model this representation as a stochastic process indexed by the nodes of a multiresolution lattice. Previous approaches have used multiresolution models for the pixel labels, but have used either single-resolution representations of the observed image [1, 2, 4] or multiresolution representations of the observed image with the implicit assumption that the random variables at a given level of the observed image pyramid are independent of the random variables at other levels [3, 5, 6]. In [2] the MAP estimate of the label field at the coarsest resolution is approximated first, using iterated conditional modes (ICM), and the result is then propagated to the next-finer resolution, where the MAP estimate of the label field at that resolution is approximated using ICM. This process is continued until the finest resolution is reached. In [3] the image is first segmented at the coarsest resolution using a Bayes decision rule, and then a coarse-to-fine procedure is used to refine the segmented image. Varying-resolution simultaneous autoregressive (SAR) models are fit to the observed image in [5], and features from different resolutions are combined to segment the image using a K-means clustering algorithm. In [6] the observed image is modeled at each resolution as a Gauss-Markov random field (GMRF). It is assumed that the GMRF parameters at the finest resolution are known, and the GMRF parameters at coarser resolutions are estimated using the values of these finest-resolution parameters. The MAP estimate is approximated at each resolution using ICM, first at the coarsest levels, then at finer levels, as in [2]. The problem of texture discrimination, in which, given a set of known textures and a noisy observed image, the texture from the pre-defined set corresponding most closely to the observed data must be selected, is addressed using a multiscale stochastic model in [8]. The texture discrimination problem is different from the texture segmentation problem considered in this paper, in which there is no pre-defined set of textures from which to select.
The MGAR model used in this paper was proposed for segmentation in [7]. The work described in this paper is different from [7] in that the model used for the class label pyramid in [7] is different from the model used here. In [7] correlations between different resolution levels in the label pyramid were incorporated simply by using the final segmentation at a given resolution as the initial segmentation at the next-finer resolution. In this paper a less ad hoc method is used: inter-level correlations in the label pyramid are incorporated into the label pyramid model by using a multiscale Markov random field (MMRF) model. Also, in [7] the prediction coefficients and prediction error variances of the MGAR model were assumed to be known a priori. In this paper they are estimated as the segmentation is performed.

Our approach is also different from previously proposed multiresolution approaches in that we approximate the "multiresolution maximization of the posterior marginals" (MMPM) estimate of the label pyramid, which is a natural extension of the single-resolution "maximization of the posterior marginals" (MPM) estimate [9]. Previous approaches have been based on MAP estimation. Using the MPM approach has two possible advantages. First, the cost function which the MPM estimate minimizes is more appropriate for image segmentation than the cost function which the MAP estimate minimizes [9]. This is because the MAP estimate assigns the same cost to every incorrect segmentation, regardless of the number of pixels at which the incorrect segmentation differs from the true segmentation, whereas the MPM estimate assigns a cost to an incorrect segmentation based on the number of incorrectly classified pixels in that segmentation. Second, using the MPM criterion facilitates the use of the EM algorithm to estimate unknown parameters of the MGAR model [10, 11, 12].

The MMPM estimate of the class label pyramid is obtained by maximizing the marginal conditional probability mass functions of the class labels given the multiresolution representation
of the observed image. The values of these conditional probability mass functions are estimated by sampling from the conditional distribution of the class label pyramid given the observed image pyramid using a Gibbs sampler with constant temperature, following the method used in the single-resolution MPM algorithm [9]. The initial estimate of the class label pyramid is obtained by performing the single-resolution MPM algorithm at the coarsest level of the pyramid, and then propagating the resulting segmentation down to the other levels of the pyramid. This significantly increases the speed of convergence of the algorithm. A method similar to that proposed in [10] for single-resolution processing is used to simultaneously segment the image and estimate parameters of the MGAR model.

In Section 2 we describe the statistical models used for the observed image pyramid and the label pyramid. In Section 3 we present the segmentation algorithm for the case when all the model parameters are known, and in Section 4 a method for estimating the parameters of the MGAR model is proposed. Section 5 describes the new multiresolution algorithm proposed for simultaneous segmentation and parameter estimation, combining the techniques discussed in Sections 3 and 4. Experimental results demonstrating the performance of the algorithm are presented in Section 6.
2 STATISTICAL MODELS

In our problem formulation the observed data Y is a multiresolution representation of the observed image, obtained using a Gaussian pyramid decomposition [13]. Thus, Y is a stochastic process indexed by the nodes of a multiresolution lattice, such as the one shown in Figure 1. The class label pyramid X is defined on the same multiresolution lattice as Y, and contains the classifications of the nodes in the lattice. The set of nodes in the lattice is denoted $S$. Each level in the lattice corresponds to a different spatial resolution, where level 0 represents the finest spatial resolution and level $M-1$ the coarsest spatial resolution. The set of lattice points at level $n$ will be denoted $S^{(n)}$, and the random fields containing the observed image at resolution $n$ and the classifications of the nodes at resolution $n$ will be denoted $Y^{(n)}$ and $X^{(n)}$, respectively. Each node in $S^{(n)}$ corresponds to $4^n$ pixels at the original image spatial resolution. Hence, each random variable in $X^{(n)}$ represents the classification of a block of $4^n$ pixels in the original image. The random variable in $Y^{(n)}$ at node $(i,j) \in S^{(n)}$ will be denoted $Y_{i,j}^{(n)}$, for $i = 0, \ldots, (I/2^n)-1$; $j = 0, \ldots, (J/2^n)-1$, where $I$ and $J$ are the number of rows and columns, respectively, in the original image. This notation will also be used for the random variables in $X^{(n)}$. To simplify the notation, the random variables in Y and X may also be indexed by a single letter, as in a lexicographical ordering, in which case the $s$th random variable in Y will be denoted by $Y_s$, and the $s$th random variable in X will be $X_s$. Throughout the paper, $y = (y_1, y_2, \ldots, y_N)$ and $x = (x_1, x_2, \ldots, x_N)$, where $N$ is the total number of nodes in $S$, will represent sample realizations of $Y = (Y_1, Y_2, \ldots, Y_N)$ and $X = (X_1, X_2, \ldots, X_N)$.

For every node $s \in S$, the set of values which the random variable $X_s$ can take is $\{1, 2, \ldots, L\}$, where $L$ is the number of different classes, or textures, in the image. We shall assume that $L$ is known. To segment the observed image, two models will be needed: a model for the conditional probability density function of Y given the classifications of all nodes in the multiresolution lattice, and a model for the probability mass function of the label pyramid X.
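As an illustration, the Gaussian pyramid decomposition mentioned at the beginning of this section could be computed as follows. This is a minimal sketch: the 5-tap kernel shown is the generating kernel commonly used with the decomposition of [13] (parameter a = 0.4), while the function name build_pyramid and the boundary handling are our own choices, not taken from the paper.

    import numpy as np
    from scipy.ndimage import convolve1d

    def build_pyramid(image, num_levels=3):
        # Gaussian pyramid: level 0 is the finest resolution, level M-1 the
        # coarsest; each coarser level is obtained by separable low-pass
        # filtering followed by subsampling by 2 in each direction.
        w = np.array([0.05, 0.25, 0.4, 0.25, 0.05])  # weights sum to 1
        pyramid = [np.asarray(image, dtype=float)]
        for n in range(1, num_levels):
            prev = pyramid[-1]
            blurred = convolve1d(convolve1d(prev, w, axis=0, mode='reflect'),
                                 w, axis=1, mode='reflect')
            # node (i, j) at level n covers a 2**n x 2**n block of pixels
            pyramid.append(blurred[::2, ::2])
        return pyramid

    # Example: a 256 x 256 image yields levels of size 256 x 256, 128 x 128,
    # and 64 x 64, i.e., M = 3 as in Section 6.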
2.1 MGAR Model

In this section we describe the MGAR model used for Y. To simplify the discussion we will first assume that Y is indexed by the nodes in a binary tree, as shown in Figure 2. Extension to the quadtree lattice shown in Figure 1 will then be described.

In order to define the model for Y, we associate with the binary tree the ordering of the nodes shown in Figure 2. The nodes at level $n$ are indexed from $2^{M-n-1}$ to $2^{M-n}-1$, where $M$ is the number of levels in the tree.

The model which will be used is a causal MGAR model, where the notion of causality for our model is defined by the ordering of the nodes of the tree defined above. In this model the random variables $Y_1, \ldots, Y_N$ are modeled as jointly Gaussian random variables conditioned on the classification of the tree nodes. The model can be described as follows: the value of the random variable $Y_s$ can be predicted as a linear combination of the values of the random variables at previous nodes. The prediction errors $\tilde{Y}_1, \ldots, \tilde{Y}_N$ form a sequence of independent random variables. The parameters used for the prediction of $Y_s$ depend on the class to which node $s$ belongs, i.e., $x_s$. The variance of the prediction error $\tilde{Y}_s$ also depends on $x_s$. The prediction errors can be written as

    \tilde{Y}_s = Y_s - \left[ \mu_{x_s}^{(n_s)} + \sum_{r>0} a_{x_s,r}^{(n_s)} \left( Y_{s-r} - \mu_{x_s}^{(n_s)} \right) \right]    (1)

where $n_s$ is the level of node $s$ in the tree, $\mu_{x_s}^{(n_s)}$ is the mean for class $x_s$ at level $n_s$, $\{a_{x_s,r}^{(n_s)}\}$ are the prediction coefficients at level $n_s$ for class $x_s$, and $\tilde{Y}_1, \ldots, \tilde{Y}_N$ is a sequence of independent, Gaussian random variables. The variance of $\tilde{Y}_s$ is denoted $(\sigma_{x_s}^{(n_s)})^2$. The MGAR model for class $k$ at level $n$ is said to be of order $R$ if there are $R$ values of $r$ for which $a_{k,r}^{(n)}$ is non-zero.

Since the random variables $\tilde{Y}_1, \ldots, \tilde{Y}_N$ are independent Gaussian random variables, the form of their joint conditional probability density function given the classification of all tree nodes is known. However, we need the joint conditional probability density function of $Y_1, \ldots, Y_N$ given the classification of all tree nodes. By considering the sequences $Y_1, \ldots, Y_N$ and $\tilde{Y}_1, \ldots, \tilde{Y}_N$ as vectors, i.e., $Y = [Y_1, \ldots, Y_N]^T$ and $\tilde{Y} = [\tilde{Y}_1, \ldots, \tilde{Y}_N]^T$, $\tilde{Y}$ can be written as

    \tilde{Y} = C + A Y    (2)

where $C$ is a constant vector and $A$ can be determined from Equation 1. Since $\tilde{Y}_s$ is a linear combination of values of Y at node $s$ and nodes which precede node $s$, but not at future nodes, the matrix $A$ is lower triangular. Also, since the coefficient of the term $Y_s$ in the expression for $\tilde{Y}_s$ is 1, all diagonal elements of $A$ are 1. Thus the Jacobian of the transformation from $Y$ to $\tilde{Y}$ is 1, and the conditional probability density function of Y given X is [2]

    f_{Y|X}(y|x; \theta) = \prod_{s=1}^{N} \frac{1}{\sqrt{2\pi (\sigma_{x_s}^{(n_s)})^2}} \exp\!\left( - \frac{\tilde{y}_s^2}{2 (\sigma_{x_s}^{(n_s)})^2} \right)    (3)

where

    \tilde{y}_s = y_s - \left[ \mu_{x_s}^{(n_s)} + \sum_{r>0} a_{x_s,r}^{(n_s)} \left( y_{s-r} - \mu_{x_s}^{(n_s)} \right) \right]    (4)

and the vector $\theta$ contains the MGAR model parameters. The elements of $\theta$ are $\mu_k^{(n)}$, $a_{k,r}^{(n)}$, and $(\sigma_k^{(n)})^2$, for $k = 1, \ldots, L$; $n = 0, \ldots, M-1$; and $r = 1, \ldots, R$. Since the filter coefficients used to obtain the Gaussian pyramid representation of the observed image sum to 1, the class means do not depend on the resolution level $n$, so that $\mu_k^{(n)} = \mu_k^{(m)} \equiv \mu_k$ for all $n, m$.
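To make Equations 1 through 4 concrete, here is a small sketch of the per-node prediction error and the conditional log-likelihood for a given labeling. The function names are ours, the parameters are indexed per class only (a simplification of the per-level parameters $\mu_k$, $a_{k,r}^{(n)}$, $(\sigma_k^{(n)})^2$), and out-of-range window terms are simply dropped, an assumption the text does not address.

    import numpy as np

    def prediction_error(y, s, mu, a):
        # Equation 4: residual at node s for a class with mean mu and
        # prediction coefficients a, where a maps each causal offset r > 0
        # (in the node ordering) to the coefficient a_{k,r}.
        pred = mu + sum(a[r] * (y[s - r] - mu) for r in a if s - r >= 0)
        return y[s] - pred

    def conditional_log_likelihood(y, labels, params):
        # log f_{Y|X}(y | x; theta) from Equation 3, summed over all nodes;
        # params[k] = (mu_k, a_k, var_k) holds the parameters of class k and
        # y is a 1-D array of pyramid values in the causal node ordering.
        ll = 0.0
        for s in range(len(y)):
            mu, a, var = params[labels[s]]
            e = prediction_error(y, s, mu, a)
            ll += -0.5 * np.log(2.0 * np.pi * var) - e * e / (2.0 * var)
        return ll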
2.2 Model for Class Label Pyramid

A multiscale Markov random field model is used for the class label pyramid X [4]. This model is a natural extension of the Markov random field (MRF) model previously used for image segmentation in single-resolution algorithms [14, 15, 16]. We first describe the general form of the multiscale Markov random field model, and then define the specific model used in this paper.

It is first necessary to define the concept of a neighborhood system for the multiresolution lattice $S$. If $S$ is written as $S = \{s_1, s_2, \ldots, s_N\}$ then $G = \{G_s \subseteq S,\ s \in S\}$ is a neighborhood system for $S$ if, for every node $s \in S$, $s \notin G_s$ and $s \in G_r \iff r \in G_s$, for any $r \in S$. The elements of the set $G_s$ are the neighbors of node $s$. If, for every node $s \in S$, $X_s$ is a discrete random variable, then X is a multiscale Markov random field with neighborhood system $G$ if

    P(X_s = x_s \mid X_r = x_r,\ r \neq s) = P(X_s = x_s \mid X_r = x_r,\ \forall r \in G_s), \quad \forall s \in S    (5)

where $P(A|B)$ is the conditional probability of the event $A$ given the event $B$. As in the single-resolution MRF case, specification of the multiscale Markov random field in terms of the conditional probabilities in Equation 5 is very difficult. This problem can be addressed by specifying a Gibbs distribution for the joint probability mass function $p_X(x) = P(X = x) = P(X_1 = x_1, \ldots, X_N = x_N)$.

The specification of a Gibbs distribution for X depends on the concept of cliques. A set of nodes $C \subseteq S$ is a clique if, for any nodes $s, r \in C$, $s \in G_r$. Thus, the collection of all cliques, which we shall denote as $\mathcal{C}$, is induced by the neighborhood system. We shall assume that if $P(X_i = x_i) > 0\ \forall i$ then $P(X = x) > 0$, for any $x$. This is referred to as the positivity condition. Assuming this is true, it can be shown that if $p_X(x)$ can be expressed in the form

    p_X(x) = \frac{1}{z} \exp\!\left( - \frac{1}{T} \sum_{C \in \mathcal{C}} V_C(x) \right)    (6)

where $V_C(x)$ is a function which depends only on the values of $x$ at nodes in the clique $C$, $z$ is a normalizing constant, and $T$ is a constant which represents "temperature," then X is a multiscale Markov random field. A probability mass function which can be written in this form is referred to as a Gibbs distribution. A multiscale MRF can be specified in terms of the functions $V_C(x)$, $C \in \mathcal{C}$, instead of the conditional probabilities of Equation 5 [17]. This greatly simplifies the task of specifying a multiscale MRF.

For the multiscale MRF model used in this paper, the collection of cliques $\mathcal{C}$ consists of all pairs of nodes at the same resolution which are spatially horizontally or vertically adjacent to each other (type-1 cliques), all pairs of nodes having a parent-child relationship to each other (type-2 cliques), and all single nodes (type-3 cliques). Type-1 cliques and type-2 cliques are illustrated in Figures 3 and 4, respectively. The neighborhood system corresponding to this collection of cliques is shown in Figure 5.

The probability mass function of X can now be defined using the collection of cliques described above. Let $C_1$ be the set of all cliques of type 1, $C_2$ the set of all cliques of type 2, and $C_3$ the set of all cliques of type 3. Then the probability mass function of X is defined to be

    p_X(x) = \frac{1}{z} \exp\!\left( - \sum_{\{r,s\} \in C_1} \beta_1^{(n_r)} t(x_r, x_s) - \sum_{\{r,s\} \in C_2} \beta_2^{(n_r)} t(x_r, x_s) - \sum_{\{r\} \in C_3} \gamma_{x_r} \right)    (7)

where $\beta_1^{(n)}$ and $\beta_2^{(n)}$ are spatial interaction parameters at level $n$, node $r$ is in level $n_r$, node $r$ is assumed without loss of generality to be the parent node for each pair of nodes in the summation representing type-2 cliques, and

    t(x_r, x_s) = \begin{cases} 0 & \text{if } x_r = x_s \\ 1 & \text{if } x_r \neq x_s \end{cases}    (8)

The parameter $\gamma_k$ can be viewed as a cost parameter for class $k$. If, for a given $k$, $\gamma_k$ is high, then class $k$ is less likely to occur than classes with lower costs. For applications in which there is a priori information about the relative sizes of the various classes, the parameters $\{\gamma_k\}$ can be selected to incorporate this information into the label field model. In the absence of such a priori information, $\gamma_k$ will be assumed to be zero for every $k$.
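As a concrete reading of Equation 7, the following sketch evaluates its exponent for a small label pyramid, represented as a list of 2-D integer arrays with level 0 (finest) first. Two conventions here are ours, since the text leaves them implicit: the parent of node (i, j) at level n is taken to be node (i // 2, j // 2) at level n + 1, and the type-2 parameter is indexed by the child's level; classes are 0-based in the code.

    import numpy as np

    def label_energy(labels, beta1, beta2, gamma):
        # Negative exponent of Equation 7 (the normalizing constant z is
        # irrelevant for comparing labelings). t(x_r, x_s) is (x_r != x_s).
        energy = 0.0
        for n, lab in enumerate(labels):
            # type-1 cliques: horizontally and vertically adjacent pairs at level n
            energy += beta1[n] * np.sum(lab[:, 1:] != lab[:, :-1])
            energy += beta1[n] * np.sum(lab[1:, :] != lab[:-1, :])
            # type-2 cliques: each node at level n paired with its parent at level n+1
            if n + 1 < len(labels):
                parent = labels[n + 1]
                ii, jj = np.mgrid[0:lab.shape[0], 0:lab.shape[1]]
                energy += beta2[n] * np.sum(lab != parent[ii // 2, jj // 2])
            # type-3 cliques: per-node class cost gamma[k]
            energy += np.sum(gamma[lab])
        return energy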
To segment the observed image, the form of the conditional probability mass function of X given Y is needed. Using Bayes' rule and Equations 3 and 7, we have

    p_{X|Y}(x|y; \theta) = \frac{f_{Y|X}(y|x; \theta)\, p_X(x)}{f_Y(y; \theta)}
                         = \frac{1}{z f_Y(y; \theta)} \left[ \prod_{s=1}^{N} \frac{1}{\sqrt{2\pi (\sigma_{x_s}^{(n_s)})^2}} \right]
                           \exp\!\left( - \sum_{s=1}^{N} \frac{\tilde{y}_s^2}{2 (\sigma_{x_s}^{(n_s)})^2}
                           - \sum_{\{r,s\} \in C_1} \beta_1^{(n_r)} t(x_r, x_s)
                           - \sum_{\{r,s\} \in C_2} \beta_2^{(n_r)} t(x_r, x_s)
                           - \sum_{\{r\} \in C_3} \gamma_{x_r} \right)    (9)

Since $f_Y(y; \theta)$ does not depend on $x$, it will not need to be considered in the optimization. It should be noted that $p_{X|Y}(x|y; \theta)$ is a Gibbs distribution.
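In particular, the local conditional probability mass function that the Gibbs sampler of Section 3 draws from follows from Equation 9 by keeping only the factors that involve $x_s$. Writing out this step, which the derivation otherwise leaves implicit, gives, up to normalization over $k$,

    p_{X_s \mid Y, X_r, r \in G_s}(k \mid y, x_r, r \in G_s; \theta) \propto \frac{1}{\sqrt{2\pi (\sigma_k^{(n_s)})^2}} \exp\!\left( - \frac{(\tilde{y}_s^k)^2}{2 (\sigma_k^{(n_s)})^2} - \sum_{r \in G_s} \beta^{(\cdot)} t(k, x_r) - \gamma_k \right)

where $\tilde{y}_s^k$ denotes the residual of Equation 4 evaluated with the class-$k$ parameters in place of those of class $x_s$, and $\beta^{(\cdot)}$ is $\beta_1$ or $\beta_2$ at the appropriate level according to whether $\{s, r\}$ is a type-1 or type-2 clique. Only the $s$th Gaussian factor appears because each residual $\tilde{y}_t$ is computed with the parameters of class $x_t$, so no other factor of Equation 9 depends on $x_s$.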
3 SEGMENTATION ALGORITHM

In this section we assume that $\theta$ is known and describe the multiresolution segmentation algorithm. The MMPM algorithm follows the single-resolution MPM algorithm [9] very closely. The segmentation problem is formulated as an optimization problem. The optimization criterion which is used is the minimization of the expected value of the number of misclassified nodes in the multiresolution lattice.

The optimization used to estimate X can be viewed as the minimization of the conditional expected value of a cost functional $R(X, x)$, given the observed image pyramid Y, over all possible realizations of the label pyramid X. The cost functional is given by

    R(X, x) = \sum_{s=1}^{N} t(X_s, x_s)    (10)

where $t(\cdot, \cdot)$ is the function defined in Equation 8. Thus, $R(X, x)$ is the number of nodes at which X and $x$ are not equal, i.e., the number of misclassified nodes in $S$. The value of $x$ which minimizes the conditional expectation of this cost functional will be denoted $\hat{x}$. Thus,

    E[R(X, \hat{x}) \mid Y = y] \leq E[R(X, x) \mid Y = y] \quad \forall x    (11)

Using Equations 8 and 10, we have

    E[R(X, x) \mid Y = y] = E\!\left[ \sum_{s=1}^{N} t(X_s, x_s) \mid Y = y \right]
                          = \sum_{s=1}^{N} E[t(X_s, x_s) \mid Y = y]
                          = \sum_{s=1}^{N} P(X_s \neq x_s \mid Y = y)
                          = \sum_{s=1}^{N} \left( 1 - P(X_s = x_s \mid Y = y) \right)    (12)

We can minimize this sum by choosing for each node $s \in S$ the value of $x_s$ from the set $\{1, 2, \ldots, L\}$ which minimizes $1 - P(X_s = x_s \mid Y = y)$ or, equivalently, the value which maximizes $P(X_s = x_s \mid Y = y)$. The estimate of the label pyramid X which is derived by independently maximizing this marginal probability for each $s \in S$ is the MMPM estimate of X.

To find the MMPM estimate of X it is necessary to find for each $s \in S$ the value of $k$ which maximizes

    P(X_s = k \mid Y = y) = p_{X_s|Y}(k|y; \theta) = \sum_{x : x_s = k} p_{X|Y}(x|y; \theta)    (13)

where $k \in \{1, 2, \ldots, L\}$. Exact computation of these marginal probabilities as in Equation 13 is computationally infeasible. Marroquin et al. presented an algorithm to solve this problem when X is a single-resolution MRF and Y is a single-resolution image [9]. The same method can be used here to approximate $P(X_s = k \mid Y = y)$ for each $s \in S$ and $k \in \{1, 2, \ldots, L\}$ as follows: use the Gibbs sampler [18] with constant temperature (no annealing) to generate a Markov chain X(t) which converges in distribution to a random field with probability mass function $p_{X|Y}(x|y; \theta)$, given by Equation 9. The marginal conditional probabilities $p_{X_s|Y}(k|y; \theta)$, which are to be maximized, are then approximated as the fraction of time the Markov chain spends in state $k$ at node $s$, for each $k$ and $s$.

For the Markov chain X(t) generated using the Gibbs sampler there are $L^N$ possible states, corresponding to the $L^N$ possible realizations of X. At each step only one node is visited, so that $X(t-1)$ and $X(t)$ can differ at no more than one node. At time $t$ the state of X(t) at node $s$ is a random variable $X_s(t)$. Let $q_t \in S$ be the node visited at time $t$. Then the state of $X_{q_t}(t)$ is determined by sampling from the conditional probability mass function $p_{X_{q_t}|Y, X_r, r \in G_{q_t}}(k \mid y, x_r(t-1), r \in G_{q_t}; \theta)$. If the sequence $\{q_1, q_2, q_3, \ldots\}$ contains every node $s \in S$ infinitely often, then it can be shown that for any initial configuration $x(0)$,

    \lim_{t \to \infty} P(X(t) = x \mid Y = y, X(0) = x(0)) = p_{X|Y}(x|y; \theta)    (14)

for every $x$ [18]. Thus, the Markov chain converges in distribution to a multiresolution process with probability mass function $p_{X|Y}(x|y; \theta)$.

To describe the approximation of the marginal conditional probability mass function at each node using the Markov chain X(t), we first define the function

    u_{k,s}(t) = \begin{cases} 1 & \text{if } X_s(t) = k \\ 0 & \text{if } X_s(t) \neq k \end{cases}    (15)

Then, if $N_0$ is the number of iterations (complete passes through the pyramid) of the Gibbs sampler, the approximations

    p_{X_s|Y}(k|y; \theta) \approx \frac{1}{N_0} \sum_{t=1}^{N_0} u_{k,s}(t) \quad \forall k, s    (16)

provide estimates of the values needed to obtain the MMPM estimate of X.
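Below is a compact sketch of this constant-temperature Gibbs sampler and the marginal estimates of Equation 16. To keep it short it is written for a single level with only the type-1 interaction; extending it to the full pyramid means adding the type-2 (parent-child) terms and the level-dependent parameters. The names are ours, and log_lik(s, k) stands for the data term of Equation 9 at node s under class k.

    import numpy as np

    rng = np.random.default_rng(0)

    def mpm_marginals(labels, L, beta1, log_lik, neighbors, num_sweeps):
        # Approximate P(X_s = k | Y = y) as the fraction of time the chain
        # spends in state k at node s (Equation 16); labels is a 1-D integer
        # array updated in place, and neighbors[s] lists the indices in G_s.
        N = len(labels)
        counts = np.zeros((N, L))
        for _ in range(num_sweeps):
            for s in range(N):  # one complete pass through the lattice
                # log of the local conditional at node s, up to an additive constant
                logp = np.array([log_lik(s, k)
                                 - beta1 * sum(labels[r] != k for r in neighbors[s])
                                 for k in range(L)])
                p = np.exp(logp - logp.max())
                labels[s] = rng.choice(L, p=p / p.sum())
            counts[np.arange(N), labels] += 1.0  # accumulate u_{k,s}(t)
        return counts / num_sweeps  # argmax over k gives the MMPM labels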
4 PARAMETER ESTIMATION

In order to implement the Gibbs sampler, we must estimate the value of $\theta$. We will use a modified version of the EM algorithm to estimate $\theta$. In this section we describe the EM algorithm for the case when the values of the marginal conditional probabilities of the class label pyramid given the observed image pyramid are known. These values are needed to perform the EM algorithm for our formulation.

The EM algorithm has been widely used for the estimation of parameters for incomplete-data problems [19, 20, 21, 22]. In an incomplete-data problem the observed data represent only a subset of the complete set of data. There also is a set of data which is unobserved, or hidden. For example, in our formulation the observed image pyramid represents the observed data, and the class label pyramid represents the hidden data. The EM algorithm is an iterative procedure for approximating maximum-likelihood estimates. At each iteration two steps are performed: the expectation step and the maximization step. In general, if $\theta(p)$ is the estimate of $\theta$ at the $p$th iteration, then in the expectation step at iteration $p+1$ the function

    Q(\theta; \theta(p)) = E[\log f_{Y|X}(y|X; \theta) \mid Y = y, \theta(p)] + E[\log p_X(X|\theta) \mid Y = y, \theta(p)]    (17)

is computed. Since in our formulation the probability mass function of X does not depend on $\theta$, we only use the first term of Equation 17. The estimate $\theta(p+1)$ is obtained in the maximization step as the value of $\theta$ which maximizes $Q(\theta; \theta(p))$, i.e., $\theta(p+1)$ satisfies

    Q(\theta(p+1); \theta(p)) \geq Q(\theta; \theta(p)) \quad \forall \theta    (18)

Substituting the expression for $f_{Y|X}(y|x; \theta)$ into Equation 17 (and dropping the second term) gives

    Q(\theta; \theta(p)) = E\!\left[ \log \prod_{s=1}^{N} \frac{1}{\sqrt{2\pi (\sigma_{x_s}^{(n_s)})^2}} \exp\!\left( - \frac{\tilde{y}_s^2}{2 (\sigma_{x_s}^{(n_s)})^2} \right) \,\Big|\, Y = y, \theta(p) \right]
                        = \sum_{s \in S} E\!\left[ \log \frac{1}{\sqrt{2\pi (\sigma_{x_s}^{(n_s)})^2}} \,\Big|\, Y = y, \theta(p) \right] - \frac{1}{2} \sum_{s \in S} E\!\left[ \frac{\tilde{y}_s^2}{(\sigma_{x_s}^{(n_s)})^2} \,\Big|\, Y = y, \theta(p) \right]
                        = \sum_{s \in S} \sum_{k=1}^{L} \log\!\left( \frac{1}{\sqrt{2\pi (\sigma_k^{(n_s)})^2}} \right) P(X_s = k \mid Y = y, \theta(p))
                          - \frac{1}{2} \sum_{s \in S} \sum_{k=1}^{L} \frac{(\tilde{y}_s^k)^2}{(\sigma_k^{(n_s)})^2} P(X_s = k \mid Y = y, \theta(p))    (19)

where

    \tilde{y}_s^k = y_s - \left[ \mu_k + \sum_{r>0} a_{k,r}^{(n_s)} \left( y_{s-r} - \mu_k \right) \right]    (20)

In the M-step at iteration $p+1$, $Q(\theta; \theta(p))$ must be maximized with respect to $\theta$. Let $\hat{\theta}$ be the value of $\theta$ which maximizes $Q(\theta; \theta(p))$. By differentiating and setting to zero, it can be shown that $\hat{\theta}$, with elements $\hat{\mu}_k$, $(\hat{\sigma}_k^{(n)})^2$, and $\hat{a}_{k,r}^{(n)}$, must satisfy the set of equations

    \sum_{s \in S} \left[ y_s - \hat{\mu}_k - \sum_{q>0} \hat{a}_{k,q}^{(n_s)} \left( y_{s-q} - \hat{\mu}_k \right) \right] P(X_s = k \mid Y = y, \theta(p)) = 0    (21)

    \sum_{s \in S^{(n)}} \left[ y_s - \hat{\mu}_k - \sum_{q>0} \hat{a}_{k,q}^{(n)} \left( y_{s-q} - \hat{\mu}_k \right) \right] \left( y_{s-r} - \hat{\mu}_k \right) P(X_s = k \mid Y = y, \theta(p)) = 0    (22)

    \sum_{s \in S^{(n)}} \left[ y_s - \hat{\mu}_k - \sum_{q>0} \hat{a}_{k,q}^{(n)} \left( y_{s-q} - \hat{\mu}_k \right) \right]^2 P(X_s = k \mid Y = y, \theta(p)) - (\hat{\sigma}_k^{(n)})^2 \sum_{s \in S^{(n)}} P(X_s = k \mid Y = y, \theta(p)) = 0    (23)

for $k = 1, \ldots, L$; $n = 0, \ldots, M-1$; and $r = 1, \ldots, R$.

The value of $\theta(p+1)$ can be obtained by solving the above system of equations for $\hat{\mu}_k$, $(\hat{\sigma}_k^{(n)})^2$, and $\hat{a}_{k,r}^{(n)}$ and setting $\theta(p+1) = \hat{\theta}$. However, Equation 22 is a nonlinear function of $\hat{\mu}_k$ and $\hat{a}_{k,r}^{(n)}$, which makes solving the system of equations difficult. To address this problem, we divide the M-step into two substeps. First the class means are estimated by using the equations

    \hat{\mu}_k = \frac{1}{N_k} \sum_{s=1}^{N} y_s P(X_s = k \mid Y = y, \theta(p))    (24)

where

    N_k = \sum_{s=1}^{N} P(X_s = k \mid Y = y, \theta(p))    (25)

This equation for $\hat{\mu}_k$ is the equation which results if the random variables in the observed image pyramid are assumed to be conditionally independent given the class label pyramid. After the means have been estimated using Equation 24, the remaining MGAR model parameters can be estimated easily using Equations 22 and 23.
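With the means fixed by Equations 24 and 25, Equations 22 and 23 become linear in the remaining unknowns, so the second substep reduces to a weighted least-squares problem for each class and level. A sketch under our own indexing follows: probs_k[s] stands for $P(X_s = k \mid Y = y, \theta(p))$, and each row of lagged holds the window values $y_{s-q}$ for node $s$.

    import numpy as np

    def m_step_class_level(y_nodes, lagged, probs_k, mu_k):
        # Second M-step substep for one class k at one level n.
        d = y_nodes - mu_k            # centered observations y_s - mu_k
        Z = lagged - mu_k             # centered predictors y_{s-q} - mu_k
        w = probs_k
        # Equation 22 as weighted normal equations: (Z^T W Z) a = Z^T W d
        a_hat = np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * d))
        resid = d - Z @ a_hat         # prediction errors for class k
        var_hat = np.sum(w * resid ** 2) / np.sum(w)   # Equation 23
        return a_hat, var_hat

The mean substep itself is simply the probability-weighted average of Equation 24 taken over all nodes.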
5 SEGMENTATION WITH SIMULTANEOUS PARAMETER ESTIMATION

The EM/MMPM algorithm proposed in this paper combines the techniques described in Sections 3 and 4. First, the MMPM algorithm is performed using an initial estimate of $\theta$, say $\hat{\theta}(0)$. After a certain number of iterations of the MMPM algorithm, the resulting estimates of $p_{X_s|Y}(k|y; \hat{\theta}(0))$ are used in Equations 22 through 25 to obtain an updated estimate of $\theta$, say $\hat{\theta}(1)$. This new estimate of $\theta$ is then used in the MMPM algorithm to find estimates of $p_{X_s|Y}(k|y; \hat{\theta}(1))$, which are then used to update the estimate of $\theta$. This process is continued until some suitable stopping point is reached.

The EM/MMPM algorithm generates a (finite) collection of Markov chains $X(1,t), X(2,t), \ldots, X(P+1,t)$, for some $P \geq 1$. Generation of $X(p,t)$ is referred to as stage $p$ of the algorithm. The estimate of $\theta$ obtained during stage $p$ is denoted by the random variable $\Theta(p)$. The elements of $\Theta(p)$ are the random variables $M_k(p)$, $A_{k,r}^{(n)}(p)$, and $S_k^{(n)}(p)$ for $k = 1, \ldots, L$; $n = 0, \ldots, M-1$; and $r = 1, \ldots, R$.

The algorithm begins with the estimate $\Theta(0) = \hat{\theta}(0)$ for some $\hat{\theta}(0) \in \Omega$. The Markov chain $X(1,t)$ is generated using the procedure described in Section 3. The state of $X_{q_t}(1,t)$ is determined by sampling from the conditional probability mass function $p_{X_{q_t}|Y, X_r, r \in G_{q_t}; \Theta(0)}(k \mid y, x_r(1, t-1), r \in G_{q_t}; \hat{\theta}(0))$, where $p_{X|Y; \Theta(p)}(x|y; \hat{\theta}(p))$, for any $p \geq 1$, has the same form as $p_{X|Y}(x|y; \theta)$ given by Equation 9, with $\Theta(p)$ random, unlike $\theta$, which is deterministic.

After each pyramid node has been visited $T_1$ times, for some $T_1 \geq 1$, the estimate $\Theta(1)$ is computed. Using the estimates $v_{k,s}(1,t)$ defined by

    v_{k,s}(1,t) = \frac{1}{t} \sum_{i=1}^{t} u_{k,s}(1,i)    (26)

where

    u_{k,s}(1,t) = \begin{cases} 1 & \text{if } X_s(1,t) = k \\ 0 & \text{if } X_s(1,t) \neq k \end{cases}    (27)

the estimate $\Theta(1)$ is computed using

    M_k(1) = \frac{\sum_{s=1}^{N} y_s v_{k,s}(1, T_1)}{\sum_{s=1}^{N} v_{k,s}(1, T_1)}    (28)

    \sum_{s \in S^{(n)}} \left[ y_s - M_k(1) - \sum_{q>0} A_{k,q}^{(n)}(1) \left( y_{s-q} - M_k(1) \right) \right] \left( y_{s-r} - M_k(1) \right) v_{k,s}(1, T_1) = 0    (29)

    \sum_{s \in S^{(n)}} \left[ y_s - M_k(1) - \sum_{q>0} A_{k,q}^{(n)}(1) \left( y_{s-q} - M_k(1) \right) \right]^2 v_{k,s}(1, T_1) - (S_k^{(n)}(1))^2 \sum_{s \in S^{(n)}} v_{k,s}(1, T_1) = 0    (30)

Note that these equations have the same form as the EM update equations (Equations 22 through 25), with $p_{X_s|Y}(k|y; \theta)$ replaced by the estimates $v_{k,s}(1, T_1)$.

In general, the Markov chain $X(p,t)$ is generated using the Gibbs sampler and sampling from the distribution $p_{X|Y; \Theta(p-1)}(x|y; \hat{\theta}(p-1))$, and the estimate $\Theta(p)$ is computed using the equations

    M_k(p) = \frac{\sum_{s=1}^{N} y_s v_{k,s}(p, T_p)}{\sum_{s=1}^{N} v_{k,s}(p, T_p)}    (31)

    \sum_{s \in S^{(n)}} \left[ y_s - M_k(p) - \sum_{q>0} A_{k,q}^{(n)}(p) \left( y_{s-q} - M_k(p) \right) \right] \left( y_{s-r} - M_k(p) \right) v_{k,s}(p, T_p) = 0    (32)

    \sum_{s \in S^{(n)}} \left[ y_s - M_k(p) - \sum_{q>0} A_{k,q}^{(n)}(p) \left( y_{s-q} - M_k(p) \right) \right]^2 v_{k,s}(p, T_p) - (S_k^{(n)}(p))^2 \sum_{s \in S^{(n)}} v_{k,s}(p, T_p) = 0    (33)

The final estimate of $\theta$ is $\Theta(P)$. The final segmentation is obtained by maximizing over all $k$ the value $v_{k,s}(P+1, T_{P+1})$ for every $s \in S$, for some $T_{P+1} \geq 1$.

The algorithm can be summarized as follows. First, initial estimates $\hat{\theta}(0)$ and $X(1,0)$ of $\theta$ and X, respectively, are selected. Then, for $p = 1, \ldots, P$, stage $p$ of the algorithm consists of two steps:

1. Perform $T_p$ iterations of the MMPM algorithm using $\hat{\theta}(p-1)$ as the value of $\theta$.

2. Use the modified EM update equations for $\theta$ to obtain $\hat{\theta}(p)$, using the values $v_{k,s}(p, T_p)$ as estimates of $p_{X_s|Y}(k|y; \hat{\theta}(p-1))$.

After $\hat{\theta}(P)$ has been obtained as the final estimate of $\theta$, $T_{P+1}$ iterations of the MMPM algorithm are performed, using $\hat{\theta}(P)$. The final segmentation is $X(P+1, T_{P+1})$.
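To show how the stages fit together, here is a self-contained toy version of the stage loop on a 1-D signal. To stay short it drops the MGAR prediction terms (each class reduces to an i.i.d. Gaussian) and the pyramid, keeping only a Potts-style smoothness term, the constant-temperature Gibbs sweeps, and the modified EM updates with $v_{k,s}$ in place of the exact marginals; it is an illustration of the control flow, not of the full model.

    import numpy as np

    rng = np.random.default_rng(0)

    def em_mpm_1d(y, L=2, beta=1.0, num_stages=20, sweeps=3):
        N = len(y)
        mu = np.quantile(y, np.linspace(0.25, 0.75, L))  # crude initial means
        var = np.full(L, y.var())
        labels = rng.integers(L, size=N)
        for p in range(num_stages + 1):        # stages 1..P, plus the final stage
            counts = np.zeros((N, L))
            for _ in range(sweeps):            # T_p sweeps of the Gibbs sampler
                for s in range(N):
                    logp = -0.5 * np.log(var) - (y[s] - mu) ** 2 / (2 * var)
                    for r in (s - 1, s + 1):   # nearest-neighbor smoothness term
                        if 0 <= r < N:
                            logp = logp - beta * (np.arange(L) != labels[r])
                    prob = np.exp(logp - logp.max())
                    labels[s] = rng.choice(L, p=prob / prob.sum())
                counts[np.arange(N), labels] += 1.0
            v = counts / sweeps                # v_{k,s}(p, T_p), Equation 26
            if p < num_stages:                 # modified EM update (cf. Eqs. 31, 33)
                w = v.sum(axis=0)
                mu = (v * y[:, None]).sum(axis=0) / w
                var = (v * (y[:, None] - mu) ** 2).sum(axis=0) / w
        return v.argmax(axis=1), mu, var       # final MMPM-style segmentation

    # Example: a two-class piecewise-constant signal in noise
    y = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
    seg, mu, var = em_mpm_1d(y)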
6 EXPERIMENTAL RESULTS

For the results presented in this section, the Gaussian pyramid decomposition described in [13], with the same filter weights as those used in [13], was used to obtain the multiresolution representation of the observed image, although the algorithm could also be used with other multiresolution decomposition schemes, e.g., averaging over 2x2 blocks and subsampling to obtain coarse-resolution data from fine-resolution data. For all results presented in this section, three pyramid levels were used (i.e., $M = 3$). This means that the original image was examined at three different resolutions. It was found that for some images the use of more than three pyramid levels resulted in an insufficient number of pixels at the coarsest resolution for the algorithm to perform well.

The values $\beta_1^{(n)} = 2.4\ \forall n$, $\beta_2^{(0)} = \beta_2^{(1)} = 0.3$, and $T_p = 3$ for every $p = 1, \ldots, P+1$, were used. The class cost parameter $\gamma_k$ was assumed to be zero for every $k$. For all experiments described here, 70 stages of the single-resolution EM/MPM algorithm [10, 11, 12] were first performed at the coarsest level of the pyramid, and the resulting segmentation propagated to other levels, to obtain an initial estimate of the label pyramid X. Also, to obtain initial estimates of the MGAR model parameters, the estimates of the conditional probabilities of the class labels at the coarsest resolution, obtained using the EM/MPM algorithm at the coarsest resolution, were propagated to higher resolutions. These estimates were then used in Equations 24, 22, and 23 to obtain the initial estimates of the MGAR parameters.

The MGAR prediction window used at node $(i,j) \in S^{(n)}$ contains nodes $(i-1, j-1)$, $(i-1, j)$, and $(i, j-1)$ in $S^{(n)}$ and nodes $(i/2, j/2)$, $(i/2, j/2+1)$, $(i/2+1, j/2)$, and $(i/2+1, j/2+1)$ in $S^{(n+1)}$ if $n$ is not the coarsest resolution, and nodes $(i-1, j-1)$, $(i-1, j)$, $(i-1, j+1)$, and $(i, j-1)$ in $S^{(n)}$ if $n$ is the coarsest resolution.
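A small helper enumerating this causal window could look as follows; integer division realizes the (i/2, j/2) indexing, and boundary nodes, for which some neighbors fall outside the lattice, are handled here by dropping the missing terms, which is one plausible reading since the text does not spell out its boundary treatment.

    def prediction_window(i, j, n, shape, coarsest):
        # Causal MGAR prediction window at node (i, j) of level n (Section 6);
        # shape = (rows, cols) of level n. Returns (level, row, col) triples.
        if n == coarsest:
            window = [(n, i - 1, j - 1), (n, i - 1, j),
                      (n, i - 1, j + 1), (n, i, j - 1)]
        else:
            window = [(n, i - 1, j - 1), (n, i - 1, j), (n, i, j - 1),
                      (n + 1, i // 2, j // 2), (n + 1, i // 2, j // 2 + 1),
                      (n + 1, i // 2 + 1, j // 2), (n + 1, i // 2 + 1, j // 2 + 1)]
        rows, cols = shape
        def in_bounds(node):
            lvl, a, b = node
            r, c = (rows, cols) if lvl == n else (rows // 2, cols // 2)
            return 0 <= a < r and 0 <= b < c
        return [node for node in window if in_bounds(node)]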
Figures 6 through 9 show results from applying the multiresolution EM/MMPM algorithm to synthetic texture images, as well as the results from the single-resolution EM/MPM algorithm for comparison. In each of these figures, the original image is shown in (a), the result obtained using the EM/MPM algorithm is shown in (b), and the result obtained from applying 20 stages of the multiresolution EM/MMPM algorithm is shown in (c). For the images shown in Figures 7, 8, and 9, the EM/MMPM algorithm performed significantly better than the EM/MPM algorithm. For the image shown in Figure 6, the two algorithms performed similarly. Table 1 shows the percentage of pixels which were misclassified by the EM/MPM and EM/MMPM algorithms for the four synthetic test images. In general, the multiresolution algorithm provides much better performance in terms of the number of misclassified pixels.

Table 1: Percentage of Misclassified Pixels

    Image        EM/MPM    EM/MMPM
    Figure 6(a)    2.4       3.1
    Figure 7(a)    4.8       0.74
    Figure 8(a)   39.2      12.3
    Figure 9(a)   30.0       1.2

Figures 10 through 14 show the results from applying 20 stages of the multiresolution EM/MMPM algorithm to infrared test images. In each of these figures the image shown in (b) is the segmentation resulting from the EM/MPM algorithm with L = 2, the image shown in (c) is the result obtained using the EM/MMPM algorithm with L = 2, (d) shows the results from EM/MPM with L = 3, and (e) shows the result from EM/MMPM with L = 3. In some cases, the EM/MMPM algorithm did not find 3 statistically distinct classes. The multiresolution algorithm performs better than the single-resolution algorithm for the images in Figures 10, 11, 12 (for L = 3), and 13, whereas the single-resolution algorithm does a better job of distinguishing the road and the vehicle from the background in Figure 14.

The final issue that we discuss in this section is a comparison of the amount of computation required to achieve convergence for two of the test images using the EM/MPM and EM/MMPM algorithms. We consider the convergence of the parameter estimates to do this, although the amount of computation could also be measured by studying convergence of the MPM algorithm at each stage of the EM/MPM and EM/MMPM algorithms. We use the number of visits per pixel as a measure of computational complexity. Table 2 shows the number of visits to each pixel required for convergence of the parameter estimates obtained by the two algorithms for the two test images in Figures 6(a) and 7(a). The EM/MMPM algorithm converges more quickly than the EM/MPM algorithm for both images. This is not a surprising result, since much of the segmentation and parameter estimation can be done at the coarsest resolution, where there are fewer pixels to process.

Table 2: Visits Per Pixel

    Image        EM/MPM    EM/MMPM
    Figure 6(a)    1250       550
    Figure 7(a)    3000       300
7 CONCLUSION

We have presented a multiresolution extension of the EM/MPM algorithm. The EM/MMPM algorithm effectively uses larger neighborhoods to perform the segmentation than the single-resolution algorithm, giving the multiresolution approach the ability to model large-scale properties of textures more effectively. A PostScript version of this paper and the image data sets are available via anonymous ftp to skynet.ecn.purdue.edu in the directory /pub/dist/delp/mres_segment.
REFERENCES

[1] C. Bouman and M. Shapiro, "A multiscale random field model for Bayesian image segmentation," IEEE Transactions on Image Processing, vol. 3, no. 2, pp. 162-177, March 1994.

[2] C. Bouman and B. Liu, "Multiple resolution segmentation of textured images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 2, pp. 99-113, February 1991.

[3] I. Ng, J. Kittler, and J. Illingworth, "Supervised segmentation using a multiresolution data representation," Signal Processing, vol. 31, no. 2, pp. 133-163, March 1993.

[4] Z. Kato, M. Berthod, and J. Zerubia, "Multiscale Markov random field models for parallel image classification," Proceedings of the Fourth International Conference on Computer Vision, May 1993, Berlin, Germany.

[5] J. Mao and A. K. Jain, "Texture classification and segmentation using multiresolution simultaneous autoregressive models," Pattern Recognition, vol. 25, no. 2, pp. 173-188, February 1992.

[6] S. Krishnamachari and R. Chellappa, "Multiresolution Gauss Markov random field models," tech. rep., University of Maryland, College Park, Maryland, January 1995.

[7] M. L. Comer and E. J. Delp, "Multiresolution image segmentation," Proceedings of the 1995 IEEE International Conference on Acoustics, Speech, and Signal Processing, May 1995, Detroit, Michigan, pp. 2415-2418.

[8] M. Luettgen and A. Willsky, "Likelihood calculation for a class of multiscale stochastic models, with application to texture discrimination," IEEE Transactions on Image Processing, vol. 4, no. 2, pp. 194-207, February 1995.

[9] J. Marroquin, S. Mitter, and T. Poggio, "Probabilistic solution of ill-posed problems in computational vision," Journal of the American Statistical Association, vol. 82, no. 397, pp. 76-89, March 1987.

[10] M. L. Comer and E. J. Delp, "Parameter estimation and segmentation of noisy or textured images using the EM algorithm and MPM estimation," Proceedings of the 1994 IEEE International Conference on Image Processing, November 1994, Austin, Texas, pp. 650-654.

[11] M. L. Comer and E. J. Delp, "The EM/MPM algorithm for segmentation of textured images: Analysis and further experimental results," submitted to IEEE Transactions on Image Processing.

[12] M. L. Comer and E. J. Delp, "The EM/MPM algorithm for segmentation of textured images: Analysis and further experimental results," Proceedings of the 1996 IEEE International Conference on Image Processing, September 1996, Lausanne, Switzerland.

[13] P. J. Burt and E. H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Transactions on Communications, vol. 31, no. 4, pp. 532-540, April 1983.

[14] H. Derin and H. Elliott, "Modeling and segmentation of noisy and textured images using Gibbs random fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 39-55, January 1987.

[15] P. A. Kelly, H. Derin, and K. D. Hart, "Adaptive segmentation of speckled images using a hierarchical random field model," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, pp. 1626-1641, October 1988.

[16] S. Lakshmanan and H. Derin, "Simultaneous parameter estimation and segmentation of Gibbs random fields using simulated annealing," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 8, pp. 799-813, August 1989.

[17] J. Besag, "Spatial interaction and the statistical analysis of lattice systems," Journal of the Royal Statistical Society B, vol. 36, pp. 192-236, 1974.

[18] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-6, no. 6, pp. 721-741, November 1984.

[19] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society B, vol. 39, pp. 1-38, 1977.

[20] C. F. J. Wu, "On the convergence properties of the EM algorithm," The Annals of Statistics, vol. 11, no. 1, pp. 95-103, 1983.

[21] R. A. Redner and H. F. Walker, "Mixture densities, maximum likelihood and the EM algorithm," SIAM Review, vol. 26, no. 2, pp. 195-239, April 1984.

[22] M. Aitkin and D. B. Rubin, "Estimation and hypothesis testing in finite mixture models," Journal of the Royal Statistical Society B, vol. 47, no. 1, pp. 67-75, 1985.
Figure 1: Multiresolution lattice (levels n = 0, 1, 2 shown).

Figure 2: Binary tree. Node 1 (the root) is at level n = 3; nodes 8 through 15, at level n = 0, form the finest level. The node numbering gives the causal ordering used in Section 2.1.

Figure 3: Examples of cliques of type 1. Pairs of nodes marked with dots form cliques.

Figure 4: Examples of cliques of type 2. Pairs of nodes marked with dots form cliques.

Figure 5: Neighborhood system for multiscale MRF. Nodes marked with dots are neighbors of the node marked with X.

Figure 6: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm. (c): Segmented image obtained using EM/MMPM algorithm.

Figure 7: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm. (c): Segmented image obtained using EM/MMPM algorithm.

Figure 8: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm. (c): Segmented image obtained using EM/MMPM algorithm.

Figure 9: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm. (c): Segmented image obtained using EM/MMPM algorithm.

Figure 10: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm with 2 classes. (c): Segmented image obtained using EM/MMPM algorithm with 2 classes. (d): Segmented image obtained using EM/MPM algorithm with 3 classes. (e): Segmented image obtained using EM/MMPM algorithm with 3 classes.

Figure 11: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm with 2 classes. (c): Segmented image obtained using EM/MMPM algorithm with 2 classes. (d): Segmented image obtained using EM/MPM algorithm with 3 classes. (e): Segmented image obtained using EM/MMPM algorithm with 3 classes.

Figure 12: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm with 2 classes. (c): Segmented image obtained using EM/MMPM algorithm with 2 classes. (d): Segmented image obtained using EM/MPM algorithm with 3 classes. (e): Segmented image obtained using EM/MMPM algorithm with 3 classes.

Figure 13: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm with 2 classes. (c): Segmented image obtained using EM/MMPM algorithm with 2 classes. (d): Segmented image obtained using EM/MPM algorithm with 3 classes. (e): Segmented image obtained using EM/MMPM algorithm with 3 classes.

Figure 14: (a): Original image. (b): Segmented image obtained using EM/MPM algorithm with 2 classes. (c): Segmented image obtained using EM/MMPM algorithm with 2 classes. (d): Segmented image obtained using EM/MPM algorithm with 3 classes. (e): Segmented image obtained using EM/MMPM algorithm with 3 classes.