Manuscript submitted to AIMS’ Journals Volume X, Number 0X, XX 200X
Website: http://AIMsciences.org pp. X–XX
SPATIO-TEMPORAL SPECKLE REDUCTION IN ULTRASOUND SEQUENCES
Noura Azzabou Laboratoire MAS, Ecole Centrale de Paris Grande Voie des Vignes 92 295 Chatenay-Malabry, FRANCE
Nikos Paragios Laboratoire MAS, Ecole Centrale de Paris and GALEN Group INRIA Saclay Grande Voie des Vignes 92 295 Chatenay-Malabry, FRANCE
Abstract. In this paper we will propose a novel variational framework for speckle removal in ultrasound images. Our method combines efficiently a fidelity to data term adapted to the Rayleigh distribution of the speckle and a novel spatio- temporal smoothness constraint. The regularization relies on a non parametric image model that describes the observed image structure and express inter-dependencies between pixels in space and time. The interaction between pixels is determined through the definition of new measure of similarity between them to better reflect image content. To compute this similarity measure, we take into consideration the spatial aspect as well as the temporal one. Experiments were carried on both synthetic and real data and the results show the potential of our method.
1. Introduction. Ultrasound imaging is a popular non invasive and low cost technique to observe the dynamical behavior of organs. The produced images inherit a multiplicative component that is the speckle. This signal may contain information useful to medical experts [17]. Nevertheless, for some applications like segmentation or registration the speckle affects the quality of images and complicates the diagnosis task. Therefore, speckle removal could be a tool to obtain better images while preserving details. In the literature, numerous methods for speckle suppression were proposed. Some of them transform the multiplicative noise problem into an additive one by considering the logarithm of the image and assume that the noise is Gaussian [2, 12]. The limitation of such a method lies in the fact that the logarithm function reduces the contrast in the image and makes the task of denoising more complex. Other techniques based on total variation minimization in the wavelet domain [14] fail to preserve image details since they rely on the piecewise smoothness assumption. The anisotropic diffusion [18] relies on the gradient information which is not robust when dealing with high noise levels. We want to point out also that these methods, consider only fixed images and not sequences. The temporal aspect was addressed in [1] using temporal averaging. This technique is efficient tool of speckle removal but due to motion, fine details can be blurred unless an appropriate 2000 Mathematics Subject Classification. Primary: 58F15, 58F17; Secondary: 53C35. Key words and phrases. Speckle Removal, Rayleigh distribution, Regularization, UltraSound sequence. .
1
2
NOURA AZZABOU AND NIKOS PARAGIOS
weight definition is done. Techniques that aim to perform Speckle removal through motion correction were also investigated [13]. The use of motion information seems to be a rather elegant choice, however recovering reliable displacement vectors is an ill-posed problem amplified by the speckle presence. Analysis in the transform domain like wavelet was also considered toward speckle reduction [3, 15], but the transform selection is crucial for the performance of these methods. In this paper, we will address the speckle removal problem in ultrasound images. Inspired from total variation techniques, our approach entails a functional minimization process. Such a functional is composed of two terms : the regularization term that is able to encode complex image structure and the data term that encodes the speckle model. In addition to the image structure, the regularization term is designed to take implicitly into consideration the temporal aspect. In this context, the interactions between pixels are determined by weights that reflect the similarity in the temporal and the spatial domain with implicit constraints imposing motion consistency. As far as fidelity to the data term is considered, its formulation is based on the speckle model which is assumed to be a Rayleigh distribution. The paper is organized as follows, section 2 is devoted to the presentation of the method, we present in section 3 the definition of weights (or interaction between pixels), next we will focus on the experimental validation of our approach in section 4, and finally we conclude the paper in section 5. 2. Problem Statement. Let us consider two image sequences U and I defined on T × Ω where T is the length of the sequence and Ω refers to a frame spatial domain. These two sequences are related according to I = U ∗ n with n being a multiplicative noise sequence that follows a Rayleigh distribution [17] defined as : r(n) =
n2 n exp(− ) σn2 2σn2
(1)
In order to recover the noise free sequence U we have to minimize a cost function that insures smoothness, preserves boundaries, while imposing a fidelity constraint, to the observed sequence. If we consider the Maximum a Posteriori (MAP) estimation framework, we recover the speckle-free image by maximizing the posterior probability P (U |I) = P (I|U )P (U ) . This amounts to minimizing the following: P (I) − log (P (U |I)) = −log (P (I|U )) − log (P (U )) + log (P (I)))
(2)
The last term is constant, hence we could minimize the following cost function: E(U ) = −log (P (I|U )) − log (P (U ))
(3)
The first term is the fidelity to data term that is dependent on the noise model. The second one is the regularization term that contains a prior knowledge about the image model. In the discrete case and for a frame Ut the observed noise likelihood is equivalent to the data term that can be expressed as Edata (Ut ) = −log(P (It |Ut )) = −log(P (nt ))
(4)
Under the assumption that (i) the noise observations are independent (ii) identically distributed, and (iii) U and I have non zero intensity for each pixel in the sequence,
SPATIO-TEMPORAL SPECKLE REDUCTION IN ULTRASOUND SEQUENCES
3
we can write: Edata (Ut )
= − log
|Ω| Y
i=1
r(nt (i)) = −
|Ω| X
log (r(nt (i)))
i=1
=
|Ω| X nt (i)2 nt (i) + −log σn2 2σn2 i=1
=
|Ω| X It (i) It2 (i) − log + σn2 Ut (i) 2σ 2 Ut2 (i) i=1
Note that the assumption of independence between the noise observations is too simple because the speckle is correlated. However such an assumption yields a simpler energy and a computationally efficient formulation. The continuous formulation of the data term can be derived by analogy from the equivalence between the data term in the total variation framework and the observed noise likelihood in the MAP estimation framework. The equivalence between the two formulation was also adopted by Aujol et al in [4] using a different noise model. For our problem and within a finite domain, we suggest the following data term in a continuous framework. Z It (x) It2 (x) Edata (Ut ) = − log + 2 2 dx (5) σ 2 Ut (x) 2σ Ut (x) Ω Z I 2 (x) dx (6) = log (Ut (x)) + 2t 2 2σn Ut (x) Ω To study the convexity of the regularization term with respect to Ut we will consider the function 1 I 2 (x) f (y) = log(y) + 2 t 2 (7) 2σn y it can be easily shown that the second derivative of f is positive for √ 6It (x) y≤ . (8) 2σn2 Hence, if we assume that the intensity y ranges between ]0, 255], the inequality is verified if √ 6It (x) ≥ 255 for any intensity value It (x) ∈ ]0, 255] (9) 2σn2 hence the cost function is convex for particular σn values that should verify √ 6 × Infx∈Ω (It ) 2 σn ≤ (10) 2 × 255 The fidelity to data term is convex if the inequality (10) is verified. In practice it is true when (σn ≤ 0.1) which is equivalent to small noise amount. Hence for small value of σn , the data term Edata is a convex function. The convexity hypothesis is not valid in practice, and using a gradient descent based method could result in a local minimum solution. Regarding the regularization term total variation minimization [16] is the most used and well studied model. This formulation was used as an image model in case of additive noise as well as multiplicative one [4, 11]. Such a model assumes that the image is piecewise constant where intensity difference between neighboring
4
NOURA AZZABOU AND NIKOS PARAGIOS
pixels must be reduced. Total variation minimization is an efficient technique with respect to edge preserving, however it fails to preserve fine texture details. Several models was proposed to overcome this limitation, and weights were introduced to extend interactions to larger neighborhood and select only pixels with similar content [10, 9, 7]. In this paper, we selected the model introduced in [5] that is a generalized linear interpolation model where each pixel is expressed as a weighted mean of the remaining pixels of the sequence. Such a formulation involves a data driven image model were the interaction between pixels is defined according to the similarity between them in order to preserve image content. The underlying image model is no longer piecewise smooth like the classical TV model but it is a linear model that accounts more for the image structure as was showed in[5]. For this application, we will consider the temporal extension of this formulation in order to account for the time dimension and to perform a temporal filtering as well as a spatial one. In this case Ereg is defined as !2 RR ZZ w(x, t, y, t1 )Ut1 (y)dydt1 T ×Ω RR − Ut (x) dxdt (11) Ereg (U ) = w(x, t, y, t1 )dydt1 T ×Ω T ×Ω where w(x, t, y, t1 ) is a similarity measure between two pixels x and y observed on the frame t and t1 . The definition of the weights will be discussed in the next section. Finally, we define the global cost function that is minimized to estimate the restored image as Z T E(u) = Ereg (U ) + λ Edata (Ut )dt (12) 0
where λ is a trade-off parameter between the regularization and the fidelity to data term. The estimation of the noise free ultrasound sequence is obtained through minimizing the cost function given in equation (12). The minimization is done through a gradient descent scheme. Starting from the observation U 0 = I, the speckle-free sequence is obtained at the convergence of the sequence Utk ∂Ereg 1 1 It (x) Utk+1 (x) = Utk (x) − dt +λ − 2 (13) ∂Ut (x) Ut (x) σn Ut (x)3 In order to decrease the computation cost we will restrict the interaction between pixels to a local neighborhood domain and not the whole image. Let Πx be a spatial neighborhood centered on a pixel x and Tw the size of the temporal window (the number of frames that interact directly with a given frame). In this case, the derivative of the regularization term is equal to Z Z t+Tw w(x, t, y, t1 ) ∂Ereg = 2 Ut (x) − Ut1 (y)dydt1 ∂Ut (x) Z(x, t) Πx t−Tw Z Z t+Tw Z Z t1 +Tw w(z, t1 , x, t) w(z, t1 , y, t2 ) Ut2 (y)dydt2 dzdt1 +2 Ut1 (z) − Z(z, t1 ) Z(z, t1 ) Πx t−Tw Πz t1 −Tw (14)
R
R t+Tw
with Z(x, t) = Πx t−Tw w(x, t, y, t1 )dydt1 being the normalization coefficient. Note that such a derivative formulation assumes that the weights are independent from the image U . In the proposed formulation, weights are calculated in the beginning of the process on the initial observations because we assume that observed image has the same structure that the speckle-free one.
SPATIO-TEMPORAL SPECKLE REDUCTION IN ULTRASOUND SEQUENCES
5
3. Weights Computation. In our algorithm we combine spatial and temporal filtering. Therefore, a robust weight definition is crucial to avoid details blurring due to combination of boundaries and non-boundaries pixels or due to motion. Furthermore, an appropriate definition of the temporal weights can substitute the necessity to estimate the motion between successive frames of the sequence. In this work, we will suggest a suitable definition of the weights in the case of multiplicative Rayleigh noise and we will consider a distance based on the spatial information as well as the temporal one. To this end, we use patch based similarity which is more robust to noise than the similarity between intensities at a pixel level [6]. We recall that this measure relies on the assumption that an image is stationary (at least at local scale), and redundant which means that a given patch has several copies in the entire ultrasound sequence. Let us characterize a pixel x at the frame Ut by the set of neighboring pixels in the same frame. Let us consider now the L2 distance between two patches of size (2p + 1) × (2p + 1) centered on x and y in two different frames Ut1 and Ut2 in the sequence. Z pZ p 2 ds (x, t1 , y, t2 ) = (It1 (x + z) − It2 (y + z)) dz −p p
Z
−p p
Z
2
(Ut1 (x + z)nt1 (x + z) − Ut2 (y + z)nt2 (y + z)) dz
= −p
−p
In case the observed patches are derived from the same speckle free patch, we can write: Z pZ p 2 ds (x, t1 , y, t2 ) = Ut21 (x + z) (nt1 (x + z) − nt2 (y + z)) dz −p
−p
This equation shows that contrarily to additive noise, the L2 distance is not only dependent on noise distribution but also on the image intensity. To overcome this dependence one can compute Z pZ p (It1 (x + z) − It2 (y + z))2 dz ds (x, t1 , y, t2 ) = Ut21 (x + z) −p −p To determine the distance ds (x, t1 , y, t2 ), one needs an estimation of Ut1 (x). A simple way to do that is to compute the average of It1 over the patch around x. Furthermore, in order to obtain a symmetric distance with respect to t1 and t2 we will consider Ut21 = Ut1 ∗ Ut2 . Thus, we obtain the following distance definition Z pZ p (It1 (x + z) − It2 (y + z))2 ds (x, t1 , y, t2 ) = dz ˜t (x + z)U ˜t (y + z) U −p −p 1 2 Z pZ p It1 (x + z) ˜ Ut1 (x) = dz 2 −p −p (2p + 1) Now if we consider the temporal aspect of the sequence, two similar pixels that belong to the same structure have similar displacement vectors. Thus, in addition to the distance between patches, the variation of pixel intensity over time should be taken into account. To this end, we consider temporal windows of size (2tc + 1) and similarly to the spatial distance, we define Z tc 2 [It1 +k (x) − It2 +k (y)] dk (15) dt (x, t1 , y, t2 ) = ˜t +k (x)U ˜t +k (y) U −tc 1 2
6
NOURA AZZABOU AND NIKOS PARAGIOS
Then the similarity measure between x and y belonging to two different frames is defined as a decreasing function of the distances ds (x, t1 , y, t2 ) and dt (x, t1 , y, t2 ): ds (x, t1 , y, t2 ) dt (x, t1 , y, t2 ) w(x, t1 , y, t2 ) = exp − exp − (16) 2h2s 2h2t hs and ht are parameters that have an impact on the selection degree of pixels that interact together. They have an influence on the smoothness of the final result. To summarize, the proposed weights are designed to take into account the noise properties, mainly the fact that it is multiplicative. With such definition, we restrict interactions between pixels only to similar ones in order to preserve details in each frame. We take also into account motion consistency by computing similarity measure on temporal windows. The main advantage of using such a weight definition is its robustness to motion [8]. On the proposed framework, there is no need to estimate the motion because pixels belonging to different frames will interact only if they have similar content which will avoid blurring. It is important to point out that we consider an estimate of the image U , obtained using a simple spatial mean filter to compute the different weights. The quality of this estimation is not accurate but it preserves the main structures being present in the original image. However, in the presence of fine texture with a small scale (like the noise) the mean is not a good estimate. Considering U instead of its estimate will make the minimization of the functional E more complicated because the regularization term will not be convex in this case. We can can also consider an iterative process where we use our model using a rough estimation of U then consider the result to compute new weights and then re estimate U by minimizing the energy using the new weights. 4. Experimental Results. To assess the performance of the proposed method we used both synthetic and real data. In the first experiment, we used synthetic images with artificial speckle. The speckle is simulated by low pass filtering a complex Gaussian random field and computing its magnitude [2]. We compared our algorithm to the anisotropic diffusion algorithm [18] and a wavelet based technique [15] using the Matlab implementation provided by the authors. Regarding anisotropic diffusion the free parameter refers to the number of iterations. For the wavelet based algorithm the parameter are the thresholding coefficient and the neighborhood size were local energy was computed. For the first method we tuned the iteration number to obtain the better possible results (PSNR and visually). The optimal parameters range of the second method was provided by the author in their paper but we slightly modified them to improve their performance with respect to our data. Regarding our algorithm we considered 11×11 window size for Πx while λ = 0.05. For weight computation the parameters setting is: hs = σ5n and a 3×3 patch size. We did not consider the temporal aspect for this experiment because the synthetic data are single images and not sequences. For an evaluation we considered ˆ that is defined as the PSNR of the reconstructed images U T
P SN R = 10log10
XX 2552 ˆt (x))2 With M SE = (Ut (x) − U M SE t=0
(17)
x∈Ω
In table (1) we reported the PSNR value for the different methods and with different noise variance. This results show that our method achieve high performance when the noise variance is important. Such a behavior is justified by the fact that we
SPATIO-TEMPORAL SPECKLE REDUCTION IN ULTRASOUND SEQUENCES
7
Sythetic1 [15] Synthetic2 σn 1 0.5 0.25 1 0.5 0.25 Corrupted image 22.20 27.93 33.95 20.44 26.47 32.62 OurMethod 29.16 37.32 38.01 33.88 38.89 43.53 24.88 33.09 40.95 31.55 38.36 46.15 SRAD[18] wav [15] 29.14 35.48 41.69 31.06 35.85 39.75 Table 1. PSNR values for denoised image corrupted by Speckle
consider large spatial neighborhood (Πx ) for filtering which reduces considerably the variance inside each region. The anisotropic diffusion algorithm (SRAD)[18] is more local and its numerical scheme is based on interactions between pixels at local scale when compared to our algorithm. Besides, the gradient information is not reliable in case of high level noise. The approach presented in [15] and the choice of the parameter of threshold used in this algorithm to compute the noise free wavelet coefficients is critical to insure the balance between edge preserving and noise suppression. In figure [Fig.(1)] one can see the restoration results for the synthetic image using the different methods and different variance. The method based on the wavelet transform provides images with sharp contours but the noise remains in homogeneous regions. The anisotropic diffusion results in blurry edges while smoothing the homogeneous areas. Our algorithm reaches an optimal balance where the obtained images are completely smoothed inside each region while the edges being sharp. As we stated before we consider large neighborhood size to remove the speckle, while encoding the image structure in our weight definition to preserve image discontinuities and details. This figure shows that our method preserves the image discontinuities and contrast between regions better than the two other algorithms. It is important to point out that, for the synthetic data and for the case of high noise variance (σn = 0.5 and σn = 1) the condition (10) is not verified (minimum intensity value for both image is 60). The function that we are minimizing is not convex and we converged to a local minimum. The obtained results show that the algorithm is still efficient. But, the minimization technique could be improved to a more global scheme such as the simulated annealing algorithm. But this will result in a time consuming solutions. As far as real data are concerned we considered an ultrasound sequence of the left ventricle on which we applied our speckle removal algorithm. The parameters used for this experiment are: a 7×7 window size for Πx , λ = 0.5 and Tw = 5. The weight computation was performed using 3×3 patch size, a temporal window for comparison of size tc = 2 and hs = ht = 0.05. In figure [Fig.(3)] we reported the restoration result on one frame of an ultrasound sequence. To evaluate the quality of restoration, we extracted the boundaries of the ventricle, using a level set based technique. The contours obtained using our method are smooth and the structure being present in the image were preserved. The contour extracted in the frame restored using our approach is similar to the one associated to the anisotropic diffusion, but this method yields piecewise smooth images. Regarding the wavelet based technique, it produces image with sharp details but the contours are not smooth.
8
NOURA AZZABOU AND NIKOS PARAGIOS
(a)
(b)
(c)
(d)
Figure 1. (a) Image (Sythetic1) corrupted by the speckle σn = 0.5 (b) Result using the anisotropic diffusion [18] (c) Result using the wavelet based technique [15] (d) Result using our algorithm To evaluate the impact of the time component we compared the algorithm performance by processing the whole sequence using (Tw = 2) and by processing it frame by frame (Tw = 0). The obtained results are shown in figure [Fig.(4)]. One can see that using a temporal window is more efficient for speckle reduction and this is illustrated by the residual images [Fig.(4-c),Fig.(4-e)](difference between the filtered image and the observed one). Besides, we notice that considering the time component does not introduce a blur on the images because our weight definition is robust enough to compensate the effect of motion. To conclude we can say that the experimental results show that our method can deal with correlated speckle even though we make the assumption of independence between pixels. We have to point out that the proposed speckle removal approach is flexible with the selection of the scale of interaction between pixels contrarily to the PDE based technique presented in [18] 5. Discussion. In this paper we proposed a new spatio-temporal speckle removal technique for ultrasound sequences. Our main contributions are : (i) the definition of an image model that is robust to motion and ensures better structure reconstruction. This model is based on limiting the interactions between pixels to only those
SPATIO-TEMPORAL SPECKLE REDUCTION IN ULTRASOUND SEQUENCES
(a)
(b)
(c)
(d)
9
Figure 2. (a) Image (Sythetic2) corrupted by the speckle σn = 1 (b) Result using the anisotropic diffusion [18] (c) Result using the wavelet based technique [15] (d) Result using our algorithm having consistent motion and belonging to the same structure. (ii) We introduced also a data term that is coherent with the speckle distribution. A future direction to improve our work will be considering anisotropic neighborhood shapes as well as variable size ones. This will ensure a better interaction between pixels lying on the same structure. Considering the fact that speckle is a correlated signal is also a direction that could bring further improvement to the method. REFERENCES [1] J. G. Abbott and F. L. Thurstone. Acoustic speckle: Theory and experimental analysis. Ultrason. Imag, 1:303–324, 1979. [2] A. Achim, A. Bezerianos, and P. Tsakalides. Novel bayesian multiscale method for speckle removal in medical ultrasound images. IEEE trans. on Medical Imaging, 20:772–783, August 2001. [3] E.D. Angelini, A.F. Laine, S. Takuma, J. Holmos, and S. Homma. Lv volume quantification via spatiotemporal analysis of real-time 3-d echocardiography. IEEE trans. on Medical Imaging, 20(6):457–469, 2001. [4] J.F. Aujol and G. Aubert. A variational approach to remove multiplicative noise. SIAM Journal on Applied Mathematics, 68:925–946, 2008.
10
NOURA AZZABOU AND NIKOS PARAGIOS
(a)
(b)
(c)
(d)
Figure 3. Results of restoration on real ultrasound frames. (a) Observed image (b) Result using algorithm [15] (c) Anisotropic diffusion [18] (d) Results of our algorithm
[5] N. Azzabou, N. Paragios, F. Guichard, and F. Cao. Variable bandwidth image denoising using image-based noise models. In CVPR, pages 1–7, 2007. [6] A. Buades, B. Coll, and J-M. Morel. A non-local algorithm for image denoising. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pages 60–65, 2005. [7] A. Buades, B. Coll, and J-M Morel. Image enhancement by non-local reverse heat equation. Technical Report 2006-22, CMLA Preprint, 2006.
SPATIO-TEMPORAL SPECKLE REDUCTION IN ULTRASOUND SEQUENCES
11
[8] A. Buades, B. Coll, and J.M Morel. Denoising image sequences does not require motion estimation. Technical Report 2005-18, CMLA Preprint, 2005. [9] G. Gilboa, J. Darbon, S. Osher, and T. Chan. Nonlocal convex functionals for image regularization. Technical Report 06-57, UCLA CAM Report, 2006. [10] G. Gilboa and S. Osher. Nonlocal linear image regularization and supervised segmentation. SIAM Multiscale Modeling and Simulation, 6:595–630, 2007. [11] Y. Huang, M. Ng, and Y Wen. A new total variation method for multiplicative noise removal. SIAM Journal on Imaging Sciences, 2:20–40, 2009. [12] A.K. Jain. Fundamentals of digital image processing. Prentice-Hall, Inc., 1989. [13] Lee Markosien. A motion-compensated filter for ultrasound image sequences. Technical report, Providence, RI, USA, 1996. [14] A. Ogier, P. Hellier, and C. Barillot. Speckle reduction on ultrasound image by variational methods and adaptive Lagrangian multipliers. In ISBI, pages 547–550, 2004. [15] A. Pizurica, W. Philips, I. Lemahieu, and M. Acheroy. A versatile wavelet domain noise filtration technique for medical imaging. IEEE trans. on Medical Imaging, 22(3):323–331, 2003. [16] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal. Physica D, 60:259–268, 1992. [17] R.F. Wagner, S.W. Smith, and H. Sandrik, J.M.and Lopez. Statistics of speckle in ultrasound b-scans. IEEE Trans on Sonics and Ultrasonics, 30(3):156–163, 1983. [18] Y. Yu and S.T. Acton. Speckle reducing anisotropic diffusion. IEEE trans. on Image Processing, 11(11):1260–1270, 2002. E-mail address:
[email protected] E-mail address:
[email protected] 12
NOURA AZZABOU AND NIKOS PARAGIOS
(a)
(b)
(c)
(d)
(e)
Figure 4. Results of filtering real ultrasound frames (a) Observed image (b) Result of our algorithm without temporal component (Tw = 0) (frame by frame filtering) (c) Result of our algorithm using temporal filtering (Tw = 2) (d) Residual obtained with our algorithm without temporal component (Tw = 0) (frame by frame filtering)(e) Residual obtained with our algorithm using temporal filtering (Tw = 2)