
MATTING THROUGH VARIATIONAL INPAINTING

Kangyu Ni, Sheshadri Thiruvenkadam, and Tony Chan
Department of Mathematics, University of California
405 Hilgard Avenue, Los Angeles, CA 90095, USA
email: {kni66, sheshad, chan}@math.ucla.edu

ABSTRACT

While current matting algorithms work very well for some natural images, their performance is questionable in the presence of sharp discontinuities in the foreground and background regions. To counter the above problem, we propose to use variational PDE based inpainting techniques within the matting problem, that are largely successful in inpainting geometric features into unknown regions.

KEY WORDS

matting, variational inpainting.

1 Introduction

Digital image composition is commonly used by the graphics community to create various scenarios for objects by extracting them from their original background scene and pasting them realistically into new background scenes. This technique has been popularly used by the movie industry to transport images of actors captured in a controlled studio environment to novel locations. A crucial step that precedes compositing images is extracting the object in question from its original background scene realistically (i.e. preserving the fractional nature of the object boundary). This step is referred to as image matting and, if the background is unknown as in our work, as natural image matting. Given an image I : Ω → ℝ containing the object of interest, we wish to recover (α, F, B) using the commonly used matting equation:

$$I(x) = \alpha(x) F(x) + (1 - \alpha(x)) B(x).$$

Here F and B are the foreground and background intensities, and α is the soft segmentation (i.e. α-matte) of the object. Previous works attempt to solve the above ill-posed problem by imposing statistical priors on (F, B) [4, 9], followed by regularity constraints [11] on α. The problem is further simplified by a user-specified trimap that partitions the image domain into three regions: definitely foreground, definitely background, and an unknown transition region (say D). Thus the problem of recovering (α, F, B) is restricted to D. With this simplification, the quality of the solution depends on the accuracy of the initial trimap, which results in a time-consuming hand-drawing process for the user for many natural images. Recent works [5, 6, 12] attempt to counter this problem by adding a prior region-growing step that computes a reasonable trimap from a rough guess given by the user.

While current algorithms work very well for some natural images, their performance is questionable in the presence of sharp discontinuities in the foreground and background regions. The reason is that these algorithms model the background and foreground intensities of pixels in the unknown region D using statistical estimates of nearby pixel regions (usually in the nearest neighbor sense). Thus they do not take into account the inherent geometry of the foreground and background regions. For example, in Fig.1, (a) is an image occluded by an object (white region), (b) shows the desired inpainting result, and (c) is the result of nearest neighbor interpolation, variants of which are used in current matting algorithms such as Poisson matting [11]. We see from (c) that such methods are not able to achieve the correct inpainting result. Hence, the resulting matte may be erroneous even for images with simple geometric structures.

Figure 1. Example of interpolating the missing region of an image by using the nearest known pixel values. Panels: (a) given image, (b) desired result, (c) by nearest values.

In response to the above problem, we focus on the class of non-texture images and propose to estimate the foreground and background by adapting existing non-texture variational inpainting methods, such as the total variation (TV) inpainting [3] and the Euler’s elastica inpainting [1]. The former is the first to take the Bayesian approach for inpainting and adapts the ROF image model [10]. The latter improves the TV inpainting and is a curvature prior model. Both inpainting methods perform better in narrow inpainting regions than in thick ones, see [2].

Figure 3. Comparison of a level line to be completed in the unknown region by TV and elastica inpainting

Figure 2. Indications of subregions ΩF , D, and ΩB of the image domain Ω

Typically, a matting algorithm consists of three iterative steps. We summarize our algorithm vis-a-vis these steps.

• Trimap refinement: A prior segmentation step is added to refine the approximate user-defined trimap to conform to the actual α-transition region. We use an iterative scheme similar to Poisson matting [11] to refine the trimap using the current iteration’s α-estimate.
• Extrapolating F and B: The foreground and background intensities, F and B, have to be extrapolated (i.e. inpainted) into the transition region. Our inpainting technique is PDE based (TV, elastica) and is successful in inpainting geometric features from the known regions.
• Solving for α: α is solved for in the transition region using the matting equation subject to suitable priors. Here, similar to Poisson matting, we search for α in the Sobolev space H¹(Ω).
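The three steps can be sketched as the outer loop below. This is a minimal illustration rather than the authors' implementation: the thresholds 0.97 and 0.03 are the ones given later in the Numerical Method section, while `matting_loop`, `refine_trimap`, and the two stand-in callables `mean_fill` and `pointwise_alpha` are hypothetical names introduced here. The stand-ins are deliberately crude (a constant fill and a pointwise inversion of the matting equation) so that the loop runs end to end; in the proposed method they would be replaced by TV or elastica inpainting and by the variational α solve.

```python
import numpy as np

def refine_trimap(alpha, hi=0.97, lo=0.03):
    """Step 1: threshold the current alpha into definite fg / bg masks."""
    fg = alpha > hi
    bg = alpha < lo
    return fg, bg

def mean_fill(image, known):
    """Placeholder inpainting (assumption): copy known data, fill the rest
    with the mean of the known pixels. Not TV or elastica inpainting."""
    return np.where(known, image, image[known].mean())

def pointwise_alpha(image, F, B, fg, bg):
    """Placeholder alpha solve (assumption): invert I = aF + (1-a)B per pixel,
    with no smoothness prior."""
    denom = F - B
    safe = np.where(np.abs(denom) > 1e-8, denom, 1.0)
    alpha = np.clip((image - B) / safe, 0.0, 1.0)
    alpha[fg] = 1.0
    alpha[bg] = 0.0
    return alpha

def matting_loop(image, alpha0, inpaint, solve_alpha, tol=1e-3, max_iter=20):
    """Iterate steps 1-3 until the mean absolute change in alpha is below tol."""
    alpha = alpha0.astype(float)
    for _ in range(max_iter):
        fg, bg = refine_trimap(alpha)                  # step 1
        F = inpaint(image, fg)                         # step 2: foreground
        B = inpaint(image, bg)                         # step 2: background
        new_alpha = solve_alpha(image, F, B, fg, bg)   # step 3
        change = np.abs(new_alpha - alpha).mean()      # l1-type distance
        alpha = new_alpha
        if change < tol:
            break
    return alpha, F, B
```

For example, `matting_loop(img, trimap, mean_fill, pointwise_alpha)` takes a grayscale image in [0, 1] and a trimap encoded as 1, 0, and 0.5 for foreground, background, and unknown pixels.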

2 Proposed Models

Let I : Ω → [0, 1] be the given image. A trimap is provided by the user, indicating the definite background region ΩB , the definite foreground region ΩF and the unknown region D, see figure 2. The first two subsections describe the methods of extrapolating background B : Ω → [0, 1] and foreground F : Ω → [0, 1] through the total variation and Euler’s elastica inpainting, respectively. The background is obtained by inpainting the data on the definite background ΩB into the unknown region D. Similarly, the foreground is obtained by inpainting the data on the definite foreground ΩF into D. The third subsection describes the model for extracting the matte.

2.1 Total Variation Inpainting

The TV inpainting is a PDE-based variational model, adapted from the ROF denoising model [10]. It is based on the observation that edges play an important role in the geometry of an image. The TV inpainting interpolates images across the missing regions, while preserving sharp edges. We propose to utilize this technique to extrapolate the background and foreground within the matting problem. The extrapolated background through the TV inpainting is obtained by the following energy minimization:

$$\min_{B \in BV(D \cup \Omega_B)} E_{tv}[B] = \int_{D \cup \Omega_B} |\nabla B|, \tag{1}$$

with constraint B|ΩB = I|ΩB. Minimizing this energy functional is equivalent to connecting sharp edges according to the level sets in the known region. This can be seen by the coarea formula:

$$\int |\nabla B| \, dx = \int_0^1 \int_{\Gamma_\lambda} ds \, d\lambda,$$

where Γλ = {x : B(x) = λ} is the level set and ds is the arc length of the level sets. The gradient descent of the Euler-Lagrange equation of (1) is

$$\frac{\partial B}{\partial t} = 1_D \, \nabla \cdot \left( \frac{\nabla B}{|\nabla B|} \right), \tag{2}$$

with condition B|ΩB = I|ΩB. The boundary condition along the boundary between D and ΩF is

$$\frac{\partial B}{\partial \vec{\nu}} = 0.$$

The formulation for estimating the foreground is similar.
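As a small numerical illustration of the coarea formula, the sketch below evaluates a discrete total variation for two images whose level lines are all vertical segments of the same length: a sharp step and a linear ramp. Both give the same value, which is one way to see why the TV energy does not penalize a sharp edge more than a smooth transition. The function `total_variation` and its forward-difference discretization are assumptions made for this example, not part of the paper.

```python
import numpy as np

def total_variation(u):
    """Isotropic discrete TV using forward differences.
    The last row/column is replicated, so the difference there is zero."""
    dx = np.diff(u, axis=1, append=u[:, -1:])
    dy = np.diff(u, axis=0, append=u[-1:, :])
    return np.sqrt(dx ** 2 + dy ** 2).sum()

h, w = 16, 20
step = np.zeros((h, w))
step[:, w // 2:] = 1.0                            # sharp vertical edge
ramp = np.tile(np.linspace(0.0, 1.0, w), (h, 1))  # smooth transition from 0 to 1

tv_step = total_variation(step)
tv_ramp = total_variation(ramp)
```

In both images every level line is a vertical segment of length h, and the levels range over [0, 1], so the right-hand side of the coarea formula equals h in each case.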

2.2 Euler’s Elastica Inpainting

For large-scale inpainting regions, the TV inpainting often fails to connect edges correctly, due to its nature of penalizing the total length of the level lines. The level lines may form corners at the boundary of the inpainting region in order to achieve the smallest length. This usually does not agree with visual perception. The geometry of the inpainting regions is crucial for the TV inpainting result. The Euler’s elastica inpainting improves the TV inpainting and additionally penalizes the curvature. As a result, the level lines extend properly into the inpainting region. Figure 3 is an example of completing a level line inside the unknown region according to the TV and elastica inpainting, respectively. The estimated background through Euler’s elastica inpainting is obtained by the following:

$$\min_{B \in BV(D \cup \Omega_B)} E_{elas}[B] = \int_{D \cup \Omega_B} (a + b\kappa^2) |\nabla B| \, dx, \tag{3}$$

with condition B|ΩB = I|ΩB.

In the functional, a and b are positive constants and the curvature is, formally,

$$\kappa = \nabla \cdot \left( \frac{\nabla B}{|\nabla B|} \right).$$

The weak absolute curvature is defined for arbitrary BV functions in [1]. Minimizing this energy functional is equivalent to connecting sharp edges according to the curvature of the level sets in the known region. This again can be seen by the coarea formula:

$$\int (a + b\kappa^2) |\nabla B| \, dx = \int_0^1 \int_{\Gamma_\lambda} (a + b\kappa^2) \, ds \, d\lambda.$$

The gradient descent of the Euler-Lagrange equation of (3) is

$$\frac{\partial B}{\partial t} = 1_D \, \nabla \cdot \vec{V} \tag{4}$$

with B|ΩB = I|ΩB, where

$$\vec{V} = \phi(\kappa)\,\vec{n} - \frac{\vec{t}}{|\nabla B|} \, \frac{\partial \left( \phi'(\kappa) |\nabla B| \right)}{\partial \vec{t}} \tag{5}$$

and

$$\vec{n} = \frac{\nabla B}{|\nabla B|}, \qquad \vec{t} = \vec{n}^{\perp}, \qquad \frac{\partial}{\partial \vec{t}} = \vec{t} \cdot \nabla.$$

Here φ(κ) = a + bκ² is the curvature weight appearing in (3), as in [1]. The boundary conditions along the boundary between D and ΩF are

$$\frac{\partial B}{\partial \vec{\nu}} = 0 \quad \text{and} \quad \frac{\partial \left( \phi'(\kappa) |\nabla B| \right)}{\partial \vec{\nu}} = 0. \tag{6}$$

2.3 Matte Extracting

The α matte according to the estimated foreground and background is extracted by minimizing the following variational problem:

$$\min_{\alpha} \int (\alpha F + (1 - \alpha) B - I)^2 + \frac{\lambda}{2} |\nabla \alpha|^2 \, dx, \tag{7}$$

with α|ΩF = 1 and α|ΩB = 0. This imposes smoothness on α while ensuring that the composition is close to the given image.

3 Numerical Method

In this section, we describe the numerical method of our iterative scheme, n = 1, 2, ... .

1. Trimap refinement:

$$\Omega_F^{(n)} = \{ x \in \Omega : \alpha^{(n-1)}(x) > 0.97 \}, \qquad \Omega_B^{(n)} = \{ x \in \Omega : \alpha^{(n-1)}(x) < 0.03 \}$$

2. Extrapolating F and B:

$$B^{(n)} = \arg\min_{B} E[B], \quad \text{with } B|_{\Omega_B^{(n)}} = I|_{\Omega_B^{(n)}}$$

$$F^{(n)} = \arg\min_{F} E[F], \quad \text{with } F|_{\Omega_F^{(n)}} = I|_{\Omega_F^{(n)}}$$

3. Solving for α, using the energy (7) with the current estimates of F and B:

$$\alpha^{(n)} = \arg\min_{\alpha} \int (\alpha F^{(n)} + (1 - \alpha) B^{(n)} - I)^2 + \frac{\lambda}{2} |\nabla \alpha|^2 \, dx,$$

with constraint α = 1 on ΩF^(n) and α = 0 on ΩB^(n).

E[·] can be either Etv[·] or Eelas[·]. Initially, α^(0) is the user-defined trimap. Repeat steps 1-3 until α converges, i.e. d(α^(n), α^(n+1)) < ε, where d is, for example, the l1 norm and ε is small.

In the second step, the total variation inpainting is implemented by the lagged diffusivity and fixed point iteration described in [3] for (2). The numerical scheme is

$$B^{n+1}_{i,j} = \left( \frac{B^n_{i+1,j}}{|\nabla B^n_{i+\frac{1}{2},j}|} + \frac{B^n_{i-1,j}}{|\nabla B^n_{i-\frac{1}{2},j}|} + \frac{B^n_{i,j+1}}{|\nabla B^n_{i,j+\frac{1}{2}}|} + \frac{B^n_{i,j-1}}{|\nabla B^n_{i,j-\frac{1}{2}}|} \right) \Big/ A,$$

where

$$A = \frac{1}{|\nabla B^n_{i+\frac{1}{2},j}|} + \frac{1}{|\nabla B^n_{i-\frac{1}{2},j}|} + \frac{1}{|\nabla B^n_{i,j+\frac{1}{2}}|} + \frac{1}{|\nabla B^n_{i,j-\frac{1}{2}}|}$$

and the denominators are the gradient magnitudes of the current iterate at the four half-points.
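The lagged-diffusivity fixed-point iteration for TV inpainting can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the implementation of [3]: the weights are taken to be reciprocals of the gradient magnitude at the four half-points, the small constant `eps` is added to avoid division by zero, the transverse derivative at a half-point is an average of central differences, all pixels are updated simultaneously (Jacobi style), and the image border is replicated. The name `tv_inpaint` and its parameters are hypothetical.

```python
import numpy as np

def tv_inpaint(image, known, n_iter=500, eps=1e-2):
    """Fill pixels where `known` is False; pixels where it is True stay fixed."""
    H, W = image.shape
    B = np.where(known, image, image[known].mean()).astype(float)
    for _ in range(n_iter):
        P = np.pad(B, 1, mode="edge")  # replicated border: zero normal derivative

        def sh(di, dj):
            # Shifted view of the current iterate: sh(1, 0)[i, j] == B[i+1, j]
            return P[1 + di:H + 1 + di, 1 + dj:W + 1 + dj]

        c, s, n = sh(0, 0), sh(1, 0), sh(-1, 0)
        e, w = sh(0, 1), sh(0, -1)
        se, sw, ne, nw = sh(1, 1), sh(1, -1), sh(-1, 1), sh(-1, -1)

        # Regularized gradient magnitudes at the four half-points (lagged: from B^n).
        g_s = np.sqrt((s - c) ** 2 + ((e + se - w - sw) / 4) ** 2 + eps ** 2)
        g_n = np.sqrt((c - n) ** 2 + ((e + ne - w - nw) / 4) ** 2 + eps ** 2)
        g_e = np.sqrt((e - c) ** 2 + ((s + se - n - ne) / 4) ** 2 + eps ** 2)
        g_w = np.sqrt((c - w) ** 2 + ((s + sw - n - nw) / 4) ** 2 + eps ** 2)

        k_s, k_n, k_e, k_w = 1 / g_s, 1 / g_n, 1 / g_e, 1 / g_w
        A = k_s + k_n + k_e + k_w
        update = (k_s * s + k_n * n + k_e * e + k_w * w) / A

        # Only the unknown pixels move; known data acts as the constraint.
        B = np.where(known, image, update)
    return B
```

To extrapolate the background, `known` would be the mask of ΩB and the result would be used only inside D. A smaller `eps` approximates |∇B| more closely but typically needs more iterations, because nearly flat regions then move very slowly.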

For elastica inpainting, to avoid modeling boundary conditions (6), we extend the inpainting region of B from D to D ∪ ΩF and extend the inpainting region of F from D to D ∪ ΩB. The inpainting result of B (resp. F) on ΩF (resp. ΩB) is less important since it is not utilized in optimizing α. We use the numerical scheme described in [1], which is an explicit scheme for

$$\frac{\partial B}{\partial t} = 1_{D \cup \Omega_F} \, |\nabla B| \, \nabla \cdot \vec{V}. \tag{8}$$

As suggested in [8], the factor |∇B| accelerates the original time marching equation (4) and is discretized by central differencing. The term ∇·V is discretized by half-point central differencing,

$$\nabla \cdot \vec{V}_{i,j} = \left( V^1_{i+\frac{1}{2},j} - V^1_{i-\frac{1}{2},j} \right) + \left( V^2_{i,j+\frac{1}{2}} - V^2_{i,j-\frac{1}{2}} \right),$$

where (V¹, V²) = V in (5). In (5), the discretizations of Dx and Dy at the x-half-point are

$$D_x B_{i+\frac{1}{2},j} = \frac{1}{2} (B_{i+1,j} - B_{i,j})$$

and

$$D_y B_{i+\frac{1}{2},j} = \mathrm{minmod}\left( \frac{1}{2} (B_{i+1,j+1} - B_{i+1,j-1}), \; \frac{1}{2} (B_{i,j+1} - B_{i,j-1}) \right),$$

where

$$\mathrm{minmod}(a, b) = \frac{\mathrm{sgn}(a) + \mathrm{sgn}(b)}{2} \min(|a|, |b|).$$

The discretizations at other half-points are similar. The reader may consult the details in [1].

The α matte is obtained by the steepest descent of the Euler-Lagrange equation for (7),

$$\alpha^{n+1} = \alpha^n - \delta t \left( [\alpha^n F + (1 - \alpha^n) B - I](F - B) - \lambda \Delta \alpha^n \right),$$

where constant factors arising from differentiating (7) are absorbed into δt and λ, and

$$\Delta \alpha_{i,j} = -4\alpha_{i,j} + \alpha_{i+1,j} + \alpha_{i-1,j} + \alpha_{i,j+1} + \alpha_{i,j-1}$$

is the usual five-point discrete Laplacian.
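The steepest-descent update for α can be sketched as below. This is an illustrative sketch, not the authors' code: the function names, the default values of `lam`, `dt`, and `n_iter`, the replicated image border in the Laplacian, and the initialization of the unknown region at 0.5 are all assumptions. The update itself follows the descent step for (7) as printed in the text, and the constraints α = 1 on the definite foreground and α = 0 on the definite background are enforced by updating only the unknown pixels.

```python
import numpy as np

def laplacian5(a):
    """Five-point discrete Laplacian with a replicated (zero-flux) border."""
    P = np.pad(a, 1, mode="edge")
    return P[:-2, 1:-1] + P[2:, 1:-1] + P[1:-1, :-2] + P[1:-1, 2:] - 4.0 * a

def solve_alpha(I, F, B, fg, bg, lam=0.1, dt=0.2, n_iter=500):
    """Explicit descent on the matte energy over the unknown region only."""
    alpha = np.full(I.shape, 0.5)
    alpha[fg] = 1.0
    alpha[bg] = 0.0
    free = ~(fg | bg)
    for _ in range(n_iter):
        residual = alpha * F + (1.0 - alpha) * B - I      # composition error
        grad = residual * (F - B) - lam * laplacian5(alpha)
        alpha[free] -= dt * grad[free]
    return alpha
```

Because the scheme is explicit, `dt` must be small relative to (F − B)² + 8·`lam` for the iteration to remain stable; the defaults above satisfy this for intensities in [0, 1].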

4 Experimental Results

In this section, we present matting results of our algorithm on synthetic and real images, and compare with a method that uses nearest neighbor inpainting. Our experiments indicate that our method outperforms such nearest neighbor based matting methods, especially for images with sharp edges. Specifically, the comparison method we use here is similar to Poisson matting: it models α in H¹(Ω), iteratively refines the trimap region, and uses nearest neighbor interpolation to fill F and B values into the unknown region.

The first example is shown in Fig.4. The foreground of the given image (a) is a constant and the background (d) is a bar. The given trimap (b) indicates the definite foreground, background, and unknown regions by α = 1, α = 0, and α = 0.5, respectively. The first column shows the ground truth matte (c) and the mattes extracted by elastica inpainting (e), TV inpainting (g), and the comparison method (i). The respective backgrounds are (d), (f), (h), and (j). The background (f) obtained by the elastica inpainting is very close to the ground truth background (d). As a result, the corresponding matte (e) is also very close to the ground truth matte (c). The performance of the TV inpainting strongly depends on the geometry of the inpainting region and the image data. As expected, the TV inpainting is able to recover the underlying geometry where the region is narrow and fails where the region is wide. The corresponding matte (g) also reflects the accuracy of the extrapolated background (h). The comparison method is not able to recover the underlying geometry (j) and the extracted matte (i) is erroneous.

The second example is shown in Fig.5. The given image (a) is a starfish, as the foreground, and the background has stripes occluded by the starfish. The second row shows that the elastica inpainting method (c) is able to extract the α matte accurately, while the result of the comparison method (d) is erroneous near sharp gradients of the background. The third row shows the resulting foreground by the elastica inpainting (e) and the comparison method (f). The unknown region lies between the red curves. The last row shows the resulting background by the elastica inpainting (g) and the comparison method (h). The elastica inpainting outperforms the comparison method and thus successfully contributes to matte extraction.

The third example is shown in Fig.6. The given image (a) is a bear, as foreground, and the background has a bar occluded by the bear. The second and third rows show the extracted mattes and extracted foreground, respectively, by the elastica inpainting and the comparison method. Observe that near the boundary of the bear and bar, the comparison method (f) is erroneous. This can be seen clearly in the fourth row, which shows the selected local regions of elastica inpainting (g), the comparison method (h), and the original image (i). The last two rows show the resulting foreground and background, respectively. The unknown region lies between the red curves.

The above three experimental results demonstrate that when the background has strong geometric features occluded by the foreground, a curvature-prior model, such as the elastica inpainting, is able to model the background correctly. As a result, the extracted matte is more accurate. In our experiments, the extracted mattes from the iterative scheme do not change significantly after about 3 iterations. The numerical error converges in about 30 iterations.

5 Conclusion and Future Work

In this paper, we propose to employ variational PDE-based inpainting methods for the matting problem. Our experimental results show that the elastica inpainting is effective for non-texture images. This leads to a promising direction for future work. First, we would like to speed up the algorithm. In particular, the PDE solver of the elastica inpainting is fourth-order and stiff. We may employ the constrained elastica inpainting method, as in [7], which is modeled in the discrete setting and converges fast. We will also justify rigorously the convergence of the iterative scheme.

6 Acknowledgements

This research is supported by ONR grant N00014-06-10345 and NSF grant DMS-0610079.

References

[1] T. Chan, S. H. Kang and J. Shen, Euler’s Elastica and Curvature Based Inpaintings, SIAM J. Appl. Math., 63:564-594, 2002.
[2] T. Chan and S. H. Kang, An Error Analysis on Image Inpainting Problems, J. Math. Imag. Vision, to appear.
[3] T. Chan and J. Shen, Mathematical Models for Local Nontexture Inpaintings, SIAM J. Visual Comm. Image Rep., 12:436-449, 2001.
[4] Y. Y. Chuang, B. Curless, D. H. Salesin and R. Szeliski, A Bayesian Approach to Digital Matting, Proceeding of CVPR 2001, Vol. II, 264-271.
[5] Y. Guan, W. Chen, X. Liang, Z. Ding, and Q. Peng, Easy Matting - A Stroke Based Approach for Continuous Image Matting, Eurographics 2006, Vol. 25 (2006), Number 3.
[6] O. Juan and R. Keriven, Trimap Segmentation for Fast and User-Friendly Alpha Matting, VLSM 2005, pp. 186-197.
[7] K. Ni, D. Roble and T. Chan, A Texture Synthesis Approach to Elastica Inpainting, to appear in SIGGRAPH sketch 2007.
[8] A. Marquina and S. Osher, Lecture Notes in Computer Science, volume 1682, chapter "A new time dependent model based on level set motion for nonlinear deblurring and noise removal", pp. 429-434, 1999.
[9] M. A. Ruzon and C. Tomasi, Alpha Estimation in Natural Images, Proceeding of CVPR 2000, 18-25.
[10] L. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Phys. D, 60:259-268, 1992.
[11] J. Sun, H. Jia, C. Tang and H. Shum, Poisson Matting, Proc. of ACM SIGGRAPH, pp. 315-321, 2004.
[12] J. Wang, M. F. Cohen, An Iterative Optimization Approach for unified image segmentation and matting, In Proceedings of International Conference on Computer Vision (2005), pp. 936-943.

Figure 4. Comparison of the Euler’s elastica inpainting, TV inpainting, and nearest values for extrapolating the background and the corresponding extracted mattes. Panels: (a) given image, (b) given trimap, (c) ground truth α, (d) ground truth B, (e) α by elastica inpaint, (f) B by elastica inpaint, (g) α by TV inpaint, (h) B by TV inpaint, (i) α by nearest values, (j) B by nearest values.

Figure 5. The proposed method by utilizing the Euler’s elastica inpainting produces a satisfying matte of the starfish. Panels: (a) given image, (b) given trimap, (c) α by elastica inpaint, (d) α by nearest values, (e) F by elastica inpaint, (f) F by nearest values, (g) B by elastica inpaint, (h) B by nearest values.

Figure 6. The proposed method by utilizing the Euler’s elastica inpainting produces a satisfying matte of the bear. Panels: (a) given image, (b) given trimap, (c) α by elastica inpaint, (d) α by nearest values, (e) αF by elastica inpaint, (f) αF by nearest values, (g) elastica, (h) nearest, (i) original, (j) F by elastica inpaint, (k) F by nearest values, (l) B by elastica inpaint, (m) B by nearest values.
