A Scale-Space Approach to Nonlocal Optical Flow ... - Semantic Scholar

Report 2 Downloads 56 Views
A Scale-Space Approach to Nonlocal Optical Flow Calculations Luis Alvarez1 , Joachim Weickert2, and Javier Sanchez1 Departamento de Informatica y Sistemas, Universidad de Las Palmas, Campus de Ta ra, SP-35017 Las Palmas, Spain. E-Mail: flalvarez,[email protected] WWW: http://serdis.dis.ulpgc.es/~lalvarez 1

2

Computer Vision, Graphics and Pattern Recognition Group, Department of Mathematics and Computer Science, University of Mannheim, D-68131 Mannheim, Germany. E-Mail: [email protected] WWW: http://www.ti.uni-mannheim.de/~bmg/weickert/

Abstract. We interpret a classical method for optical ow computation due to Nagel and Enkelmann (1986) as a tensor-driven anisotropic di usion approach in digital image analysis. We introduce an improvement into the model formulation, and we establish well-posedness results for the resulting system of parabolic partial di erential equations. Since our method avoids linearizations in the optical ow constraint, it can recover displacement elds which are far beyond the typical one-pixel limits that are characteristic for di erential methods for optical ow recovery. A robust numerical scheme is presented in detail. We avoid convergence to irrelevant local minima by embedding our method into a linear scalespace framework and using a focusing strategy from coarse to ne scales. We illustrate the high accuracy of the proposed method by means of a synthetic and a real-world image sequence.

1 Introduction Optical ow computation consists of nding the apparent motion of objects in a sequence of images. It is a key problem in arti cial vision and much research has been devoted to this eld; for a survey see e.g. [20]. In the present paper we shall consider two images I1 (x; y) and I2 (x; y) (de ned on R 2 to simplify the discussion) which represent two consecutive views in a sequence of images. Under the assumption that corresponding pixels have equal grey values, the determination of the optical ow from I1 to I2 comes down to nding a function h(x; y) = (u(x; y); v(x; y)) such that I1 (x; y )

= I2 (x + u(x; y); y + v(x; y)),

8(x; y) 2 R 2 :

(1)

To compute h(x; y) the preceding equality is usually linearized yielding the so-called "optical ow constraint" I1 (x) ? I2 (x)  hrI2 (x); h(x)i

8x 2 R 2 ;

where x := (x; y). The linearized optical ow constraint assumes that the object displacements h(x) are small. In other cases, this linearization is no longer valid. Frequently, instead of equation (1), researchers use the alternative equality I1 (x ? u(x; y ); y ? v (x; y ))

8(x; y) 2 R 2 :

= I2 (x; y),

(2)

In this case the displacement h(x; y) is centered in the image I2 (x; y): The determination of optical ow is a classic ill-posed problem in computer vision [6], and it requires to be supplemented with additional regularizing assumptions. The regularization by Horn and Schunck [13] assumes that the optical ow eld is smooth. Much research has been done to modify the Horn and Schunck approach in order to permit discontinuous ow elds; see [7, 8, 18, 21{24, 28] and the references therein. An important improvement in this direction has been achieved by Nagel and Enkelmann [21] in 1986. They consider the following minimization problem:

Z

ENE (h) =

R2

(I1 (x ? u(x; y); y ? v(x; y)) ? I2 (x; y))2 dx

Z

+C

R



(rh)T D (rI1 ) (rh)

trace 2



(3)

dx

where C is a positive constant and D (rI1 ) is a regularized projection matrix in the direction perpendicular of rI1 :

80 1 0 1T 9 > > @I @I < = 1 @y @y 2 @ A @ A D (rI1 ) = +  Id : > jrI1 j2 + 22 > : ? @I@x ? @I@x ; 1

1

1

1

In this formulation, Id denotes the identity matrix. The advantage of this method is that it inhibits blurring of the ow across boundaries of I1 where jrI1 j >> . This model, however, uses an optical ow constraint which is centered in I2 , while the projection matrix D in the smoothness term depends on I1 . This inconsistency may lead to erroneous results for large displacement elds. In order to avoid this problem, we consider a modi ed energy functional where both the optical ow constraint and the smoothness constraint are related to I1 : E (h) =

Z

R2

+C

(I1 (x; y) ? I2 (x + u(x; y); y + v(x; y)))2 dx

Z

R

trace 2



(rh)T D (rI1 ) (rh)



dx:

(4)

The associated Euler-Lagrange equations are given by the PDE system C div (D (rI1 ) C div (D (rI1 )

2 (x + h(x)) = 0; ru) + (I1 (x) ? I2 (x + h(x))) @I @x 2 (x + h(x)) = 0: rv) + (I1 (x) ? I2 (x + h(x))) @I @y

(5) (6)

In this paper, we are interested in solutions of the equations (5)-(6) in the case of large displacement elds. Therefore, we do not introduce any linearization in the above system. We obtain the solutions by calculating the asymptotic state (t ! 1) of the parabolic system @u @t @v @t

2 (x + h(x)); = C div (D (rI1 ) ru) + (I1 (x) ? I2 (x + h(x))) @I @x 2 (x + h(x)): = C div (D (rI1 ) rv) + (I1 (x) ? I2 (x + h(x))) @I @y

(7) (8)

Interestingly, this coupled system of di usion{reaction equations reveals a di usion tensor which resembles the one used for edge-enhancing anisotropic di usion ltering. Indeed, D(rI1 ) has the eigenvectors v1 := rI1 and v2 := rI1? . The corresponding eigenvalues are given by 1 (jrI1 j) =

2

jrI1 j2 + 22 ; jrI1 j2 + 2 : 2 (jrI1 j) = jrI1 j2 + 22

(9) (10)

We observe, that 1 + 2 = 1 holds independently of rI1 . In the interior of objects we have jrI1 j ! 0, and therefore 1 ! 1=2 and 2 ! 1=2. At ideal edges where jrI1 j ! 1, we obtain 1 ! 0 and 2 ! 1. Thus, we have isotropic behaviour within regions, and at image boundaries the process smoothes anisotropically along the edge. This behaviour is very similar to edgeenhancing anisotropic di usion ltering [26], and it is also close in spirit to the modi ed mean-curvature motion considered in [3]. In this sense, one may regard the Nagel{Enkelmann method as an early predecessor of modern PDE techniques for image restoration. For a detailed treatment of anisotropic di usion ltering we refer to [27], and an axiomatic classi cation of mean-curvature motion and related morphological PDEs for image analysis is presented in [2]. In general, we cannot expect the uniqueness of solutions of the elliptic system (5)-(6), and the asymptotic state of the above parabolic system depends on the initial data for the ow u and v. In order to encourage convergence to the physically correct solution in case of large displacement ow, we will design a linear scale-space focusing procedure for the optical ow constraint. One may expect that this leads to a signi cant improvement, since methods which consider the optical ow constraint under Gaussian convolution have demonstrated their advantages already without nonlinear smoothness terms; see e.g. Florack et al.

[11]. Using a scale-space approach enables us also to perform a ner and more reliable scale focusing as it would be the case for related pyramid [4] or multigrid approaches [9]. The paper is organized as follows: In Section 2 we show the existence and uniqueness of solutions of the nonlinear parabolic system (7)-(8), under suitable conditions on the data. In Section 3 we apply a linear scale-space focusing which enables us to achieve convergence to realistic solutions for large displacement vectors. Section 4 describes a numerical discretization of the parabolic system (7)-(8) based on an explicit nite di erence scheme. In Section 5 we present experimental results on a synthetic and a real-world image sequence. Finally, in Section 6 we conclude with a summary.

2 Existence and Uniqueness of the Parabolic System We will study the solutions of the parabolic system of nonlinear partial di erential equations (7)-(8). In [1], the authors develop a theoretical framework to study the existence and uniqueness of solutions of a similar parabolic system, but with a di erent regularization term. The main techniques used in [1] can be applied in order to obtain the existence and uniqueness of the solutions of the system (7)-(8). First, we notice that the di usion tensor D(rI1 ) is uniformly bounded. Moreover, if we assume that I1 2 C 1 (R 2 ), then the di usion tensor is a continuous function and, on the other hand, the eigenvalues of the di usion tensor are uniformly bounded from below by a positive constant. Then, by classical results, we obtain that the homogeneous problem @u @t @v @t

= C div (D(rI1 ) ru) = C div (D(rI1 ) rv)

with C > 0 has a unique solution for any initial datum h0 = (u0 ; v0 ) 2 L2 (R 2 )  L2 (R 2 ). Using the semigroup standard notation we denote by S (t)h0 the solution of the above homogeneous system for the initial data h0 : Let us also denote by F : L2 (R 2 )  L2 (R 2 ) ! L2 (R 2 )  L2 (R 2 ) the function F (h) = (I1 ? I2 (Id + h)) rI2 (Id + h):

Then the following result holds. Theorem21. Let I22 2 C 2(R 2 ) and I1 2 C 1(R 2 ). Then, for all initial ows h0 2 L2 (R )  L2 (R ), there exists a unique generalized solution h(:; t) 2 C

?[0; 1); L2(R 2)  L2(R 2)

of the nonlinear PDE system (7){(8) .

Proof. We will only sketch the proof. For more details we refer the reader to [1] where all mathematical details are presented. First, we note that, under the conditions I2 2 C 2 (R 2 ) and I1 2 C 1 (R 2 ), the function F (:) is Lipschitz. On the other hand, using standard mathematical techniques we can show that the solutions of (7)-(8) can be expressed as a xed point of the function (h)(t) = S (t)h0 +

Zt 0

S (t ? s)F (h(s))ds:

Therefore, the solutions of (7)-(8) satisfy (h)(t)

= h(t)

and nally, since F (:) is Lipschitz, we can show that  is a contraction, and by the Banach xed point theorem there exists a unique solution of the problem.

3 A Linear Scale-Space Approach to Recover Large Displacements In general, the Euler-Lagrange equations (5)-(6) will have multiple solutions. As a consequence, the asymptotic state of the parabolic system (7)-(8), which we use for approximating the optical ow, will depend on the initial data. Typically, the convergence is the better, the closer the initial data is to the asymptotic state. When we expect small displacements in the scene, the natural choice is to take u  v  0 as initialization of the ow. For large displacement elds, however, this will not work, and we need better initial data. To this end, we embed our method into a linear scale-space framework [14, 29]. Considering the problem at a coarse scale avoids that the algorithm gets trapped in physically irrelevant local minima. The coarse-scale solution serves then as initial data for solving the problem at a ner scale. Scale focusing has a long tradition in linear scalespace theory (see e.g. Bergholm [5] for an early approach. Detailed descriptions of linear scale-space theory can be found in [10, 12, 15, 16, 19, 25]. We proceed as follows. First, we introduce a linear scale factor in the parabolic PDE system in order to end up with @u @t

= C div (D(rI1 ) ru ) + +

@v @t



G

 I1 (x) ? G  I2 (x + h (x))

= C div (D(rI1 ) rv ) + +



G

 I1 (x) ? G  I2 (x + h (x))

 @ (G  I2 )

(x + h (x));

(11)

 @ (G  I2 )

(x + h (x));

(12)

@x

@y

where G  I represents the convolution of I with a Gaussian of standard deviation :

The convolution with a Gaussian blends the information in the images and allows us to recover a connection between the objects in I1 and I2 : We start with a large initial scale 0 . Then we compute the optical ow (u0 ; v0 ) at scale 0 as the asymptotic state of the solution of the above PDE system using as initial data u  v  0: Next, we choose a number of scales n < n?1 < :::: < 0 , and for each scale i we compute the optical ow (u ; v ) as the asymptotic state of the above PDE system with initial data (u ?1 ; v ?1 ). The nal computed

ow corresponds to the smallest scale n . In accordance with the logarithmic sampling strategy in linear scale-space theory [17], we choose i := i 0 with some decay rate  2 (0; 1). i

i

i

i

4 Numerical Scheme We discretize the parabolic system (11){(12) by nite di erences. Derivatives in x and y are approximated by central di erences, and for the discretization in t direction we use an explicit (Euler forward) scheme. Gaussian convolution was performed in the spatial domain with renormalizedGaussians,  which where a b truncated at 5 times their standard deviation. Let D = b c . Then our explicit scheme has the structure uki;j+1 ? uki;j 

+ ai;j uki+1;j ? uki;j + ai?1;j + ai;j uki?1;j ? uki;j + 2 h21 2 h21 uk ? uk uk ? uk + ci;j+12+ ci;j i;j+1h2 i;j + ci;j?12+ ci;j i;j?1h2 i;j + 2 2 bi?1;j ?1 + bi;j uki?1;j ?1 ? uki;j bi+1;j +1 + bi;j uki+1;j +1 ? uki;j + + 2 2h1h2 2 2h1h2 !? uk ? uk uk ? uk ? bi+1;j?21 + bi;j i+1;j2h?1h i;j ? bi?1;j+12 + bi;j i?12;jh+1h i;j + 1 2 1 2 =C

ai+1;j









+ I1; (xi;j ? hk;i;j ) ? I2; (xi;j ) I1;x; (xi;j ? hk;i;j ); (13) k+1 ? v k k k k k vi;j i;j = C ai+1;j + ai;j vi+1;j ? vi;j + ai?1;j + ai;j vi?1;j ? vi;j +  2 h21 2 h21 vk ? vk vk ? vk + ci;j+12+ ci;j i;j+1h2 i;j + ci;j?12+ ci;j i;j?1h2 i;j + 2 2 k k bi?1;j ?1 + bi;j vik?1;j ?1 ? vi;j bi+1;j +1 + bi;j vik+1;j +1 ? vi;j + + 2 2h1 h2 2 2h1 h2 !? k k k k ? bi+1;j?21 + bi;j vi+1;j2h?1h? vi;j ? bi?1;j+12 + bi;j vi?12;jh+1h? vi;j + 1 2 1 2 + I1; (xi;j ? hk;i;j ) ? I2; (xi;j ) I1;y; (xi;j ? hk;i;j ):

(14)

The notations are almost selfexplaining: for instance,  is the time step size, h1 and h2 denote the pixel size in x and y direction, respectively, uki;j approximates 1 u in some grid point xi;j at time k , and I1;x; is an approximation to G  @I @x . k We calculate values of type I1; (xi;j ? hi;j ) by linear interpolation, and we use the time step size 0:5 (15)  = 4C + max (jI (x )j2 ; jI1;y; (xi;j )j2 ) : i;j 1;x; i;j This step size can be motivated from a stability analysis in the maximum norm when considering a corresponding isotropic problem with a linearized optical

ow constraint.

5 Experimental Results The complete algorithm for computing the optical ow depends on a number of parameters which have an intuitive meaning: { The regularization parameter C speci es the balance between the smoothing term and the optical ow constraint. Larger values lead to smoother ow elds. { The constant  in the smoothing term serves as a contrast parameter: locations where the image gradient magnitude is larger than  are regarded as edges. The di usion process smoothes anisotropically along these edges. In our experiments we used  := 1. The results were not very sensitive to underestimations of . { The scale 0 denotes the standard deviation of the largest Gaussian. In general, 0 is chosen according to the maximum displacement expected. In our case we used 0 := 10. { The decay rate  2 (0; 1) for the computation of the scales m := m 0. We may expect a good focusing if  is close to 1. We have chosen  := 0:95. { The smallest scale is given by n . It should be close to the inner scale of the image in order to achieve optimal ow localization. { The stopping time T for solving the system (11){(12) at each scale m . We observed that T := 50 gives results which are suciently close to the asymptotic state. In our rst experiment we use a synthetic image composed of four black squares on a white background. Each square moves in a di erent direction and with a di erent displacement magnitude: the left square on the top moves with (u; v) = (10; 5), the right square on the top is discplaced with (u; v) = (?10; 0), the left square on the bottom is shifted by (u; v) = (0; ?5), and the right square on the bottom undergoes a translation by (?10; ?10). Figure 1 illustrates the improvements due to scale-space focusing. In order to visualize the ow eld (u; v) we use two grey level images (ugl ; vgl ) de ned by ugl := 128 + 8u and vgl := 128 + 8v . We use the regularization parameter C = 15000, start with

initial scale 0 = 10 and show the results for focusing to the scales 10 = 5:99, 20 = 3:58, 30 = 2:15, 40 = 1:29, 50 = 0:77, 60 = 0:46, and 70 = 0:28, respectively. It can be observed that that this leads to a signi cantly improved

ow localization.

Fig. 1. From top to bottom and from left to right: the original pair of images I1

and I2 ; and the ow components (un ; vn ) resulting from focusing to the scales 10 = 5:99, 20 = 3:58, 30 = 2:15, 40 = 1:29, 50 = 0:77, 60 = 0:46, and 70 = 0:28, respectively.

We notice that the computed ow is a good approximation of the expected

ow. In fact, not only the orientation of the ow is correct, but also the ow magnitude is surprisingly accurate: the maximum of the computed optic ow magnitude is 14:13, which is a very good approximation of the ground truth

Fig. 2. Optic ow obtained without the scalespace approach (T = 800).

p

maximum 10 2  14:14. It results from the square which moves in (?10; ?10) direction. This indicates that { under speci c circumstances { our method may even lead to optic ow results with subpixel accuracy. In a second experiment, we illustrate the necessity of the scale-space approach. Figure 2 depicts the optical ow that we obtain without scale-space focusing, i.e. with 0 = 0. As can be expected, the algorithm gets trapped in a physically irrelevant local minimum. In the third experiment, we use the classical taxi sequence, but instead of taking two consecutive frames { as is usually done { we consider the frames 15 and 19. The dark car at the left creates a largest displacement magnitude of approximately 12 pixels. In Figure 3 we present the computed ow using the regularization parameter C = 500 and focusing from 0 = 10 to 70 = 0:28. The computed maximal ow magnitude is 11:68, which is a good approximation of the actual displacement of the dark car. Figure 4 shows a vector plot of the computed ow eld.

6 Conclusions Usually, when computer vision researchers deal with variational methods for optical ow calculations, they linearize the optical ow constraint. However, such a linearization works only for small displacements. In this paper we investigate an improved formulation of a classical method by Nagel and Enkelmann where no linearization is used. We identify this method with two coupled anisotropic

Fig. 3. Optic ow computation of the taxi sequence using frames 15 and 19: di usion lters with a nonlinear reaction term. We showed that this parabolic system is well-posed from a mathematical viewpoint, and we presented a nite di erence scheme for its numerical solution. In order to avoid that the algorithms converges to physically irrelevant local minima, we embedded it into a linear scale-space approach for focusing the solution from a coarse to a ne scale. The numerical results that we have presented for a synthetic and a real-world sequence are very encouraging: it was possible to recover displacements of more than 10 pixels with high accuracy. It is our hope that this successful blend of nonlinear anisotropic PDEs and linear scale-space techniques may serve as a motivation to study other combinations of linear and nonlinear scale-space approaches in the future. Acknowledgements. This work has been supported by the European TMR network "Viscosity Solutions and their Applications".

References 1. L. Alvarez, J. Esclarin, M. Lefebure and J. Sanchez, A PDE model for computing the optical ow, Proc. C.E.D.Y.A. XVI, Universidad de Las Palmas de Gran Canaria, 1999.

Fig. 4. Vector plot of the optic ow between the frames 15 and 19 of the taxi sequence. 2. L. Alvarez, F. Guichard, P.-L. Lions, J.-M. Morel, Axioms and fundamental equations in image processing, Arch. Rational Mech. Anal., Vol. 123, 199{257, 1993. 3. L. Alvarez, P.-L. Lions, J.-M. Morel, Image selective smoothing and edge detection by nonlinear di usion. II, SIAM J. Numer. Anal., Vol. 29, 845{866, 1992. 4. P. Anandan, A computational framework and an algorithm for the measurement of visual motion, Int. J. Comput. Vision, Vol. 2, 283{310, 1989. 5. F. Bergholm, Edge focusing, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 9, 726{ 741, 1987. 6. M. Bertero, T.A. Poggio, V. Torre, Ill-posed problems in early vision, Proc. IEEE, Vol. 76, 869{889, 1988. 7. I. Cohen, Nonlinear variational method for optical ow computation, Proc. Eighth Scandinavian Conf. on Image Analysis (SCIA '93, Troms, May 25{28, 1993), Vol. 1, 523{530, 1993. 8. R. Deriche, P. Kornprobst, G. Aubert, Optical- ow estimation while preserving its discontinuities: A variational approach, Proc. Second Asian Conf. Computer Vision (ACCV '95, Singapore, December 5{8, 1995), Vol. 2, 290{295, 1995. 9. W. Enkelmann, Investigation of multigrid for algorithms for the estimation of optical ow elds in image sequences, Computer Vision, Graphics and Image Processing, Vol. 43, 150-177, 1988. 10. L. Florack, Image structure, Kluwer, Dordrecht, 1997. 11. L.M.J. Florack, W.J. Niessen, M. Nielsen, The intrinsic structure of the optic ow incorporating measurement duality, Int. J. Comput. Vision, Vol. 27, 263{286, 1998.

12. B. ter Haar Romeny, L. Florack, J. Koenderink, M. Viergever (Eds.), Scale-space theory in computer vision, Lecture Notes in Computer Science, Vol. 1252, Springer, Berlin, 1997. 13. B. Horn, B. Schunck, Determining optical ow, Artif. Intell., Vol. 17, 185{203, 1981. 14. T. Iijima, Basic theory on normalization of pattern (in case of typical onedimensional pattern), Bulletin of the Electrotechnical Laboratory, Vol. 26, 368{388, 1962 (in Japanese). 15. T. Iijima, Pattern recognition, Corona-sha, 1973 (in Japanese). 16. T. Iijima, Theory of pattern recognition, Series of Basic Information Technology, Vol. 6, Morishita Publishing, 1989 (in Japanese). 17. J.J. Koenderink, The structure of images, Biological Cybernetics, Vol. 50, 363{370, 1984. 18. A. Kumar, A.R. Tannenbaum, G.J. Balas, Optic ow: a curve evolution approach, IEEE Trans. Image Proc., Vol. 5, 598{610, 1996. 19. T. Lindeberg, Scale-space theory in computer vision, Kluwer, Boston, 1994. 20. A. Mitiche, P. Bouthemy, Computation and analysis of image motion: a synopsis of current problems and methods, Int. J. Comput. Vision, Vol. 19, 29{55, 1996. 21. H.H. Nagel, W. Enkelmann, An investigation of smoothness constraints for the estimation of displacement vector elds from images sequences, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 8, 565{593, 1986. 22. P. Nesi, Variational approach to optical ow estimation managing discontinuities, Image Vision Comput., Vol. 11, 419{439, 1993. 23. M. Proesmans, L. Van Gool, E. Pauwels, A. Oosterlinck, Determination of optical ow and its discontinuities using non-linear di usion, J.-O. Eklundh (Ed.), Computer vision { ECCV '94, Lecture Notes in Computer Science, Vol. 801, Springer, Berlin, 295{304, 1994. 24. C. Schnorr, Segmentation of visual motion by minimizing convex non-quadratic functionals, Proc. 12th Int. Conf. Pattern Recognition (ICPR 12, Jerusalem, Oct. 9{13, 1994), Vol. A, IEEE Computer Society Press, Los Alamitos, 661{663, 1994. 25. J. Sporring, M. Nielsen, L. Florack, P. Johansen (Eds.), Gaussian scale-space theory, Kluwer, Dordrecht, 1997. 26. J. Weickert, Theoretical foundations of anisotropic di usion in image processing, Computing, Suppl. 11, 221{236, 1996. 27. J. Weickert, Anisotropic di usion in image processing, Teubner-Verlag, Stuttgart, 1998. 28. J. Weickert, On discontinuity-preserving optic ow, S. Orphanoudakis, P. Trahanias, J. Crowley, N. Katevas (Eds.), Proc. Computer Vision and Mobile Robotics Workshop (CVMR '98, Santorini, Sept. 17{18, 1998), 115{122, 1998. 29. J. Weickert, S. Ishikawa, A. Imiya, Linear scale-space has rst been proposed in Japan, J. Math. Imag. Vision, Vol. 10, May 1999.