New Algorithms for Two-Frame Structure from Motion
J. Oliensis, NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
Yakup Genc, Department of Computer Science, University of Illinois, Urbana, IL 61801
Abstract
We describe two new algorithms for two-frame structure from motion from tracked point features. One is the first fast algorithm for computing an exact least-squares estimate. It exploits our observation that the rotationally invariant least-squares error can be written in a simple form that depends just on the motion. The other is essentially as accurate as the least-squares estimate and is more efficient, probably faster, and potentially more robust than previous algorithms of comparable accuracy. We also analyze theoretically the accuracy of the optical-flow approximation to the least-squares error.
1 Introduction
The most accurate current structure-from-motion (SFM) algorithms minimize the least-squares image error in the structure as well as the motion. This makes them slow, since the minimization is in many variables. There exists a faster but non-optimal two-frame algorithm that minimizes the coplanarity error in the five motion unknowns. Recently, [7][15] have shown that a standard version of this second approach, which we refer to as the weighted-coplanarity algorithm or WC, gives results that are almost as accurate as those of the full minimization. WC typically gives results for the translation direction differing by fractions of a degree from those of the full minimization. Also, [7][6][11] have shown that two-frame algorithms have less of a problem with local minima and give better results than previously believed. Combined, the results of [7][6][15][11] show that two-frame algorithms can be at once fast, robust, and accurate, and should be considered important tools for SFM.

This paper describes two new two-frame algorithms. The first, A1, is based on our observation that the least-squares, rotationally invariant error can be written in a simple form that depends on the motion alone. By minimizing this error in the motion unknowns, we obtain the first fast algorithm that computes a true least-squares reconstruction.
Our second algorithm, A2, is targeted for increased speed, at a small cost in accuracy. It is based on the optical-flow approximation to the least-squares error. We analyze this approximation theoretically and show that it is typically a good one. Experimentally, we show that A2 is as accurate as the weighted-coplanarity approach [15][7] and thus nearly as accurate as the least-squares estimate. It is also more efficient and probably faster¹ than either A1 or WC, and it is potentially more robust.

A2 minimizes over the motion unknowns, but it reorganizes the minimization to make it more efficient. Like some previous approaches, it alternates between solving for the rotation and the translation direction $\hat{T} \equiv T/|T|$. But, in contrast with these, its computation of $\hat{T}$ is insensitive to first-order rotation errors, so that it is guaranteed to converge to the correct reconstruction when it starts nearby. The algorithm gives accurate results for $\hat{T}$ even when the rotation still has a large error, and typically it requires just a few cycles of rotation/translation recovery to compute $\hat{T}$ accurately. Also, its rotation-recovery step is purely linear and thus fast, while its translation recovery is also relatively fast, since it involves a nonlinear minimization in just the two unknowns corresponding to the translation direction. In fact, much of the translation-recovery step can be reduced to a minimization in a single variable, and for roughly forward translations one can do the full minimization exactly with little computational cost [11]. Since the core of A2 consists of a minimization in one or two unknowns, it should be significantly faster than WC or A1. Also, since [6] suggests that local-minimum problems occur mainly when the focus of expansion (FOE) lies within the image, and since for these cases A2 can find the global-minimum solution for the translation [11], A2 should also be more robust than WC or even a full minimization as in A1.

¹We have not compared explicit timings for the algorithms, since this depends crucially on the implementation and our current implementations are non-optimal.
One disadvantage of A2 is its lack of generality: it cannot deal with arbitrarily large translations. But theory and experiment confirm that this is a mild restriction. Our experiments show that even very large translations do not cause problems.
2 Algorithm A1
Let the unit vectors $\hat{p}_{0i}$ and $\hat{p}_{1i}$ represent the image points in the first and second image corresponding to the 3D point $P_i$, and let $R$ and $T$ be the rotation and translation. We represent the structure $P_i$ in the coordinate system of the first image. Consider the rotationally invariant least-squares error

$$E_{LS,i} \equiv \left|\frac{P_i}{|P_i|} - \hat{p}_{0i}\right|^2 + \left|\frac{R(P_i - T)}{|R(P_i - T)|} - \hat{p}_{1i}\right|^2. \quad (1)$$

The least-squares error in the image plane (Section 3) is more standard. But the two errors are almost equal for a moderate field of view (FOV), and it is not clear which is more realistic. $E_{LS}$ seems to accord better with the physics of lenses and with an assumption that the scene is independent of viewing angle.

Assume $R$ and $T$ are given. Rewrite the second term in (1) as $\left|(P_i - T)/|P_i - T| - R^{-1}\hat{p}_{1i}\right|^2$. The unit vectors $P_i/|P_i|$, $(P_i - T)/|P_i - T|$, and $\hat{T}$ lie on a great circle, which we characterize by its normal $\hat{n}$. Neglecting the positive-depth constraints as usual, one can select $P_i/|P_i|$ and $(P_i - T)/|P_i - T|$ anywhere on this great circle. Thus

$$E'_{LS,i} \equiv \min_{P_i} E_{LS,i} = \min_{\hat{n}\perp T}\left(\left|\hat{n}\cdot\hat{p}_{0i}\right|^2 + \left|\hat{n}\cdot R^{-1}\hat{p}_{1i}\right|^2\right) = \min_{\hat{n}\perp T}\; \hat{n}^T S\, \hat{n}, \qquad S \equiv \hat{p}_{0i}\hat{p}_{0i}^T + R^{-1}\hat{p}_{1i}\hat{p}_{1i}^T R.$$

Since $\hat{n}\perp T$, one can compute $E'_{LS,i}$ explicitly as the least eigenvalue of a $2\times 2$ matrix:

$$E'_{LS,i} = A/2 - \sqrt{A^2/4 - B}, \quad (2)$$

where $A \equiv \hat{p}_{0i}^T\left(1 - \hat{T}\hat{T}^T\right)\hat{p}_{0i} + \hat{p}_{1i}^T R\left(1 - \hat{T}\hat{T}^T\right)R^{-1}\hat{p}_{1i}$ and $B \equiv \left|\hat{T}\cdot\left(\hat{p}_{0i}\times R^{-1}\hat{p}_{1i}\right)\right|^2$.

Algorithm A1 minimizes $E_{LS}(T,R) \equiv \sum_i E'_{LS,i}$ over $T$ and $R$ using the simple expression (2). $E_{LS}$ is the correct least-squares error under the rotationally symmetric error model, except for our neglect of the positive-depth constraints.² The main advantage of the exact form (2) over the approximate errors of WC or A2 is that it gives better results when $T$ is close to an image point; see below and [15].

²We know of no work to date that properly incorporates the positive-depth constraint into the least-squares error. It may not be difficult to incorporate these constraints into (2), due to its simplicity.
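For concreteness, the following is a minimal numpy sketch of evaluating (2) summed over the points; the function name and interface are ours, and, as in the text, the positive-depth constraints are neglected. A1 itself would hand such a function to a nonlinear minimizer over the five motion unknowns.

```python
import numpy as np

def a1_error(p0, p1, R, T):
    """Exact least-squares error of Eq. (2), summed over points.

    p0, p1 : (N, 3) unit rays in the first/second image; R : (3, 3) rotation;
    T : (3,) translation.  Positive-depth constraints are neglected."""
    That = T / np.linalg.norm(T)
    Proj = np.eye(3) - np.outer(That, That)        # 1 - T T^T, projector normal to T
    q1 = p1 @ R                                    # rows are R^{-1} p1_i (R orthogonal)
    A = (np.einsum('ij,jk,ik->i', p0, Proj, p0)
         + np.einsum('ij,jk,ik->i', q1, Proj, q1))
    B = (np.cross(p0, q1) @ That) ** 2             # (T-hat . (p0_i x R^{-1} p1_i))^2
    disc = np.maximum(A**2 / 4.0 - B, 0.0)         # guard against round-off
    return np.sum(A / 2.0 - np.sqrt(disc))
```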
3 Algorithm A2
Algorithm A2 minimizes the optical-flow approximation to the least-squares error. We first analyze this approximation. The standard image-plane least-squares error is

$$E_{LS,i} \equiv \left|\frac{[P_i]_2}{Z_i} - p_{0i}\right|^2 + \left|\frac{[R(P_i - T)]_2}{[R(P_i - T)]_z} - p_{1i}\right|^2,$$
where now $p_{0i}$ and $p_{1i}$ are 2D image points. We use the notation $[V]_2$ to denote the 2D vector consisting of the first two components of the vector $V$.

First assume zero rotation. As before, one can compute $E'_{LS,i} \equiv \min_{P_i} E_{LS,i}$ neglecting the positive-depth constraints:

$$E'_{LS,i} = A/2 - \sqrt{A^2/4 - B}, \qquad A \equiv |p_{0i} - e|^2 + |p_{1i} - e|^2, \qquad B \equiv \left|(p_{0i} - e)\times(p_{1i} - p_{0i})\right|^2.$$

Here the epipole $e \equiv [T]_2/T_z$, and for 2D vectors $V$ and $V'$ the notation $V\times V'$ signifies $V_x V'_y - V_y V'_x$. Expanding $E'_{LS,i}$ in the parameter $\epsilon_i \equiv 4B/A^2$ yields

$$E'_{LS,i} = E_{0,i}\left(1 + \epsilon_i/4 + o\left(\epsilon_i^2\right)\right), \qquad E_{0,i} \equiv B/A.$$

We have $E_{0,i}/E'_{LS,i} = (\epsilon/2)/\left(1 - \sqrt{1-\epsilon}\right)$, so $1 \geq E_{0,i}/E'_{LS,i} \geq 1/2$, and $E_{0,i}/E'_{LS,i}$ is a slowly decreasing function of $\epsilon$. Explicitly, $\epsilon_i = 4\gamma_i\sin^2(\theta_i)/(1+\gamma_i)^2$, where $\theta_i$ is the angle between $(p_{0i} - e)$ and $(p_{1i} - e)$, and $\gamma_i \equiv |p_{1i} - e|^2/|p_{0i} - e|^2$. $\epsilon_i$ is small, and thus $E_{0,i}$ is an excellent approximation to $E'_{LS,i}$, except very near the special configuration where $p_{0i}$, $e$, $p_{1i}$ form an approximate isosceles right triangle. For a general image, no matter where $e$ is, this configuration occurs for at most a small fraction of the image points. For moderate translations, it occurs only for $e \approx p_{0i}$. Thus $E_0 \equiv \sum_i E_{0,i}$ gives an excellent approximation to $E'_{LS} \equiv \sum_i E'_{LS,i}$, especially for moderate translations.

For zero rotation, the optical-flow error corresponds to

$$E'_{F,i} \equiv B/\left(2|p_{0i} - e|^2\right).$$

$E'_{F,i}$ gives a good approximation to $E_{0,i}$ whenever $|e - p_{0,1i}| \gg |p_{0i} - p_{1i}|$. Since $E_{0,i}$ gives a good approximation to $E'_{LS,i}$ under these circumstances, so does $E'_{F,i}$. For example, if $|e - p_{1i}| > 6|p_{0i} - p_{1i}|$, the maximum of $\left|E_{0,i} - E'_{LS,i}\right|/E'_{LS,i}$ is about 0.007 and that of $\left|E'_{F,i} - E'_{LS,i}\right|/E'_{LS,i}$ is about 0.18. Since $E'_{F,i}$ can give a poor approximation to $E'_{LS,i}$ only for image points near $e$, the sum $E_F \equiv \sum_i E'_{F,i}$ typically gives a good approximation to $E'_{LS}$.
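Before extending this to nonzero rotations, here is a sketch computing the three per-point errors just defined, which makes it easy to check the approximation bounds numerically; the function name is ours.

```python
import numpy as np

def zero_rotation_errors(p0, p1, e):
    """Per-point E'_LS, E_0, and E'_F for zero rotation.

    p0, p1 : (N, 2) image points; e : (2,) epipole [T]_2 / T_z."""
    d0, d1 = p0 - e, p1 - e
    cross = d0[:, 0] * (p1 - p0)[:, 1] - d0[:, 1] * (p1 - p0)[:, 0]
    A = np.sum(d0**2, axis=1) + np.sum(d1**2, axis=1)
    B = cross**2                                   # |(p0 - e) x (p1 - p0)|^2
    E_ls = A / 2.0 - np.sqrt(np.maximum(A**2 / 4.0 - B, 0.0))
    E_0 = B / A                                    # leading term of the expansion
    E_f = B / (2.0 * np.sum(d0**2, axis=1))        # optical-flow approximation
    return E_ls, E_0, E_f
```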
We sketch how this result extends to small, nonzero rotations. Assuming $N_p$ image points, consider the length-$N_p$ vector $\chi$ with elements

$$\chi_i \equiv (p_{0i} - e)\times(p_{1i} - p_{0i})/\left|p_{0i} - e\right|.$$

The optical-flow error for nonzero rotations, which is the one used by A2, is

$$E_F(T) = \chi^T P_3\, \chi/2, \quad (3)$$

where $P_3$ is an $N_p\times N_p$ projection matrix defined to annihilate the first-order rotation contribution to $\chi_i$. (That is, $P_3$ annihilates all vectors with elements of the form $(p_{0i} - e)\times r_i/|p_{0i} - e|$, where $r_i$ is a rotational flow.) $P_3$ causes $E_F(T)$ to be rotation invariant to first order. For moderate FOV and small rotations,

$$E_{LS,i} \approx \tilde{E}_{LS,i} \equiv \left|\frac{[P_i]_2}{Z_i} - p_{0i}\right|^2 + \left|\frac{[P_i - T]_2}{(P_i - T)_z} - R^{-1}p_{1i}\right|^2,$$

and $\min_R E_{LS}(T,R) \approx \min_R \tilde{E}_{LS}(T,R) \equiv \sum_i \tilde{E}_{LS,i}(T, R_{\min})$. Our previous discussion for zero rotation, plus the fact that $E_F(T)$ is first-order rotation invariant, implies that the optical-flow error continues to be a good approximation for small rotations: $E_F(T) \approx \min_R E_{LS}(T,R)$.
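A minimal sketch of evaluating (3) follows. It realizes $P_3$ implicitly as the orthogonal projector onto the complement of the span of the three first-order rotational-flow vectors described above; the names are ours, not the authors' implementation.

```python
import numpy as np

def flow_error(p0, p1, e):
    """Optical-flow error E_F(T) = chi^T P_3 chi / 2 of Eq. (3); e = [T]_2 / T_z."""
    x, y = p0[:, 0], p0[:, 1]
    flows = [np.stack([-x * y, -(1 + y**2)], axis=1),   # first-order rotational
             np.stack([1 + x**2, x * y], axis=1),       # flows r_1, r_2, r_3
             np.stack([-y, x], axis=1)]
    d0 = p0 - e
    nrm = np.linalg.norm(d0, axis=1)
    cross2 = lambda a, b: a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]
    chi = cross2(d0, p1 - p0) / nrm
    V = np.stack([cross2(d0, r) / nrm for r in flows], axis=1)   # (N_p, 3)
    chi_perp = chi - V @ np.linalg.lstsq(V, chi, rcond=None)[0]  # P_3 chi
    return 0.5 * chi @ chi_perp
```

Evaluating $E_F$ thus costs only one small least-squares solve per candidate epipole, which is why A2's translation step, a minimization of this function over the two epipole coordinates, is cheap.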
3.1 Algorithm A2: Description
A2 cycles between recovering the translation and the rotation. In the translation-recovery step, we assume that the rotation has already been recovered and compensated for, up to a small error. Our current technique for recovering $\hat{T}$, described in [5] and briefly below, is based on a steepest-descent minimization of (3). It can be supplemented by the global method of [11].
3.2 Rotation Recovery
We assume that the previous step of the algorithm has recovered $\hat{T}$ up to small corrections and describe how to recover the rotation. With focal length 1, the image displacements between the two images, neglecting noise, are

$$d_i \equiv p_{1i} - p_{0i} = \frac{Z_i^{-1}\left(T_z\, p_{0i} - [T]_2\right)}{1 - Z_i^{-1} T_z} + f_i(R,\, p_{1i}), \quad (4)$$

where $f_i$ is the rotational displacement. (4) is exact, with no optical-flow approximation. Let $p_{0i} \equiv (x_i,\, y_i)^T$, and let

$$r_{1i} \equiv \left(-x_i y_i,\; -\left(1 + y_i^2\right)\right)^T, \qquad r_{2i} \equiv \left(1 + x_i^2,\; x_i y_i\right)^T, \qquad r_{3i} \equiv (-y_i,\; x_i)^T$$

denote the first-order rotational flows. For a small rotation $\omega$, $f_i$ is approximately given by

$$f_i \approx \sum_{b=1\ldots 3} \omega_b\, r_{bi} + o\left(\omega^2\right). \quad (5)$$

Define three length-$N_p$ vectors $V^{(b)}$, $b \in \{1, 2, 3\}$, where $V^{(b)}$ has elements

$$V_i^{(b)} \equiv \frac{(p_{0i} - e)\times r_{bi}}{|p_{0i} - e|}, \quad (6)$$

and let $V \equiv \left[V^{(1)},\, V^{(2)},\, V^{(3)}\right]$. From (4) and (5),

$$\chi_c = V_c\, \omega + o\left(\omega^2;\; \Delta\hat{T}\, Z^{-1}|T|\right), \quad (7)$$

where the subscript $c$ indicates that the quantities are calculated for the current $\hat{T}$ estimate, and $\Delta\hat{T}$ denotes the error in the recovered $\hat{T}$. For a moderate field of view (FOV) with $F < 90°$, solving $\chi_c \approx V_c\, \omega$ directly for $\omega$ would give a result biased toward $\omega \propto \hat{z}$, since the third column of $V_c$ is $o(x, y)$ while the other columns are $o(1)$. Thus we actually compute $\omega$ by solving $\chi_c = V'_c\, \omega'$ in the least-squares sense for $\omega'$, where

$$V'_c \equiv \left[V_c^{(1)},\; V_c^{(2)},\; \zeta^{-1} V_c^{(3)}\right], \qquad \omega'^T \equiv \left(\omega_1,\; \omega_2,\; \zeta\omega_3\right), \qquad \zeta \equiv \left(N_p^{-1}\sum_{i=1}^{N_p}\left(x_i^2 + y_i^2\right)\right)^{1/2}.$$

Once we have computed $\omega$ in this way, we rotate the second image to compensate for this recovered rotation and then reapply the translation-recovery step.
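A minimal sketch of this linear rotation step, with our naming and with $\zeta$ and the column rescaling as defined above:

```python
import numpy as np

def recover_rotation(p0, p1, e):
    """One linear rotation step of A2 (Sec. 3.2): solve chi_c = V'_c omega'
    in least squares, rescaling the third column by 1/zeta to remove the
    bias toward omega ~ z-hat described in the text."""
    x, y = p0[:, 0], p0[:, 1]
    flows = [np.stack([-x * y, -(1 + y**2)], axis=1),
             np.stack([1 + x**2, x * y], axis=1),
             np.stack([-y, x], axis=1)]
    d0 = p0 - e
    nrm = np.linalg.norm(d0, axis=1)
    cross2 = lambda a, b: a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]
    chi = cross2(d0, p1 - p0) / nrm                # chi_c of Eq. (7)
    V = np.stack([cross2(d0, r) / nrm for r in flows], axis=1)
    zeta = np.sqrt(np.mean(x**2 + y**2))           # zeta of Sec. 3.2
    Vp = V * np.array([1.0, 1.0, 1.0 / zeta])      # V'_c
    wp = np.linalg.lstsq(Vp, chi, rcond=None)[0]   # omega' = (w1, w2, zeta*w3)
    return np.array([wp[0], wp[1], wp[2] / zeta])  # omega
```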
3.3 Initializing the Iteration
To start the iteration described above, we have considered two methods for providing initial estimates of the translation and rotation. The first is the standard linear "8-point" algorithm [4] as improved by Hartley [2]. It can deal with motions of any size but works best for large motions [10]. The second, an improved version [5] of the original approach of [3], also requires just linear-algebra computations. It assumes that the translational motion is not very large. We have arbitrarily used the second method in many of our experiments. But it cannot deal with planar scenes, and, somewhat surprisingly, our experiments indicate that the "8-point" algorithm does essentially as well as the second technique even for small translations. Thus, for practical implementations of our algorithm, one should probably initialize using the "8-point" algorithm.
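For reference, here is a standard sketch of the Hartley-normalized "8-point" initialization: the textbook method of [4] with the normalization of [2], not the authors' implementation. For calibrated points the recovered matrix is the essential matrix, from which initial estimates of R and $\hat{T}$ can be extracted.

```python
import numpy as np

def eight_point(p0, p1):
    """Hartley-normalized 8-point estimate from (N, 2) point arrays, N >= 8."""
    def normalize(p):
        c = p.mean(axis=0)
        s = np.sqrt(2.0) / np.mean(np.linalg.norm(p - c, axis=1))
        T = np.array([[s, 0.0, -s * c[0]],
                      [0.0, s, -s * c[1]],
                      [0.0, 0.0, 1.0]])
        return np.column_stack([p, np.ones(len(p))]) @ T.T, T
    x0, T0 = normalize(p0)
    x1, T1 = normalize(p1)
    A = np.einsum('ni,nj->nij', x1, x0).reshape(len(p0), 9)  # x1^T E x0 = 0
    E = np.linalg.svd(A)[2][-1].reshape(3, 3)      # least singular vector of A
    U, s, Vt = np.linalg.svd(E)
    E = U @ np.diag([s[0], s[1], 0.0]) @ Vt        # enforce rank 2
    return T1.T @ E @ T0                           # undo the normalization
```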
3.4 Details of the Translation Recovery

Due to the bas-relief ambiguity [1][8][12], the initial linear estimate for T usually determines the $T$-$\hat{z}$ plane accurately, where $\hat{z}$ is the viewing direction, but it does not determine $\hat{T}$ reliably within this plane. Thus, as described in [5], after this initial estimate we first minimize $E_F(T)$ just within the plane of $\hat{z}$ and the initial estimate for $\hat{T}$. This typically accomplishes most of the work of recovering $\hat{T}$, and, since it is a minimization in a single variable, it is faster than doing the full minimization in two variables. The two-frame error function has a characteristic local minimum which is intrinsic to Euclidean SFM [6]. We also use the stage of single-variable minimization to avoid this local minimum, by minimizing in $\hat{T}$ separately on both sides of the viewing direction [5]. Following the single-variable minimization, we use the full two-variable minimization of $E_F(T)$ to refine the $\hat{T}$ estimate; a sketch of this two-stage search appears below. This iterative-minimization technique is important mainly for an FOE lying outside the image. For an FOE within or near the image, one can compute the exact-minimum solution for $\hat{T}$ as in [11]. We used the iterative method in all the experiments below.
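The sketch below reuses flow_error from the sketch after Eq. (3). Generic scipy optimizers stand in for the steepest-descent search of [5]; this is our illustration under those assumptions, not the authors' code.

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

def recover_translation(p0, p1, T_init):
    """Translation step of A2: a 1D search for T-hat in the plane of z-hat and
    the initial estimate, on both sides of the viewing direction, followed by
    a 2D refinement of E_F over the epipole."""
    zhat = np.array([0.0, 0.0, 1.0])
    t = T_init / np.linalg.norm(T_init)
    perp = t - zhat * t[2]                    # in-plane direction normal to z-hat
    n = np.linalg.norm(perp)
    perp = perp / n if n > 1e-8 else np.array([1.0, 0.0, 0.0])

    def ef(alpha):                            # E_F along T(alpha) in the plane
        T = np.cos(alpha) * zhat + np.sin(alpha) * perp
        return flow_error(p0, p1, T[:2] / T[2])

    # minimize separately on both sides of z-hat, stepping around the
    # characteristic local minimum discussed in [6]
    eps = 1e-3
    sides = [minimize_scalar(ef, bounds=b, method='bounded')
             for b in ((-np.pi / 2 + eps, 0.0), (0.0, np.pi / 2 - eps))]
    best = min(sides, key=lambda r: r.fun)
    T = np.cos(best.x) * zhat + np.sin(best.x) * perp
    # full two-variable refinement over the epipole e = [T]_2 / T_z
    e = minimize(lambda e: flow_error(p0, p1, e), T[:2] / T[2]).x
    T = np.array([e[0], e[1], 1.0])
    return T / np.linalg.norm(T)
```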
3.5 Experiments

3.5.1 Synthetic Sequences

In the following experiments, the rotations were chosen randomly up to a maximum of about 22°. The structure consisted of 30 randomly selected points, and the noise was 1-pixel Gaussian, assuming a 512×512 image and the specified FOV. For each translation tested, we created 30 sequences, with different structures, rotations, and noise for each sequence. The error reported for each T represents the average result over these 30 sequences. We compared our approach A2 to WC, which gives close-to-optimal results [15][7]. (A sketch of one synthetic trial appears below.)

Figure 1 shows results for a FOV of 60° and 3D points with 20 ≤ Z ≤ 100. For five different directions of the true T, we plot the average angular error in the recovered $\hat{T}$ as a function of the magnitude of $T_{true}$ as $|T_{true}|$ varies from 0.75 to 16 units. For the most part, the results of A2 and WC are indistinguishable. A2 appears to do slightly better for the more difficult small-translation trials, at least when the translation direction is near $\hat{z}$. Both algorithms do poorly in these trials, however. The performance of A2 could have been improved by using the exact method of [11] to avoid local minima.

Figure 2 shows how the algorithms' performance varies as a function of the translation direction. The FOV and depth range were as before. Again, A2 does slightly better than WC when $|T|$ is small and $\hat{T} \approx \hat{z}$, but otherwise gives identical results.
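The following sketch generates one synthetic trial in the spirit of this setup. The sampling details (uniform depths and image positions, random rotation axis) and the function make_pair are our assumptions where the text is silent.

```python
import numpy as np

def make_pair(n_pts=30, fov_deg=60.0, T=np.array([0.0, 0.0, 1.0]),
              max_rot_deg=22.0, noise_pix=1.0, im_size=512, seed=None):
    """Generate one synthetic trial: 3D points, a random rotation, projections."""
    rng = np.random.default_rng(seed)
    half = np.tan(np.radians(fov_deg) / 2.0)       # half-width of image plane
    f_pix = (im_size / 2.0) / half                 # focal length in pixels
    Z = rng.uniform(20.0, 100.0, n_pts)            # depths 20 <= Z <= 100
    xy = rng.uniform(-half, half, (n_pts, 2))      # positions inside the FOV
    P = np.column_stack([xy * Z[:, None], Z])
    axis = rng.normal(size=3)                      # random rotation (Rodrigues)
    axis /= np.linalg.norm(axis)
    a = np.radians(max_rot_deg) * rng.uniform()
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    R = np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)
    Q = (P - T) @ R.T                              # second camera sees R (P - T)
    sigma = noise_pix / f_pix                      # 1-pixel noise in image units
    p0 = P[:, :2] / P[:, 2:] + rng.normal(0.0, sigma, (n_pts, 2))
    p1 = Q[:, :2] / Q[:, 2:] + rng.normal(0.0, sigma, (n_pts, 2))
    return p0, p1, R, T
```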
Figure 1: Angular error in the recovered translation direction for varying magnitudes at fixed directions ($\hat{T} = (\cos\theta, 0, \sin\theta)^T$ with θ = 0°, 5°, 45°, 85°, and 90°). [Plots: average angular error (deg) vs. translation magnitude, one panel per direction, comparing Weighted-Coplanarity and the new algorithm.]
We also studied the convergence behavior of our algorithm. For the experiments in Figure 1, the bottom plot in Figure 4 shows how many cycles of rotation/translation recovery our algorithm needed to converge. Convergence typically takes fewer than 4 cycles even for large translations. For an additional set of experiments, the top plot in Figure 4 shows how the error in the recovered $\hat{T}$ decreases with the number of iterations. The error shown is the average result over 300 trials with random translations varying in slant between 0° and 90° and in magnitude between 0.75 and 16. The FOV and depth range were as before. Typically, just 2-3 iterations are enough to give a good estimate of the translation direction.

We also tested our initial linear estimator from [5] against the "8-point" algorithm. Figure 5 shows the results for a variety of translation directions and magnitudes. The FOV and depth range were as before. As expected, the "8-point" algorithm does better at large translations. But, unexpectedly, it also does as well as our linear estimator even for small translations, except for $\hat{T} \approx \hat{z}$. We also compared an iterative version of our linear estimator to the "8-point" algorithm. For this version, we repeat a two-step cycle of linear rotation recovery followed by linear translation recovery, until convergence. Figure 6 shows that this iterated approach again does slightly better than (a single run of) the "8-point" algorithm for $\hat{T} \approx \hat{z}$ but otherwise performs nearly identically.
Figure 2: Angular error in translation direction recovery for varying translation directions ($\hat{T} = (\cos\theta, 0, \sin\theta)^T$, θ ∈ [0°, 90°]) at fixed translation magnitudes of 2, 4, and 8. [Plots: average angular error (deg) vs. translation direction θ, for Weighted-Coplanarity and the new algorithm.]

Figure 4: Number of iterations for convergence in the new algorithm. [Top: translation direction error (deg) vs. number of iterations; bottom: average number of iterations to converge for varying translation magnitude.]

Figure 5: Angular error in translation direction recovery with linear algorithms for varying translation magnitudes at fixed translation directions ($\hat{T} = (\cos\theta, 0, \sin\theta)^T$ with θ = 0°, 45°, and 90°). [Plots: 8-point algorithm vs. the linear part of the new algorithm.]

Figure 6: Angular error in translation direction recovery with linear algorithms (iterative for the new approach) for varying translation magnitudes at the same fixed translation directions. [Plots: 8-point algorithm vs. the iterated linear part of the new algorithm.]
We tested the two algorithms on planar scenes. The planes were chosen with random tilts, with random slants between 20° and 75°, and so that the minimum depth of the 3D points was 20. The FOV was 60°. Figure 7 shows the results.³ When $\hat{T} \approx \hat{z}$, A2 again does slightly better than WC for small $|T|$, but it now appears to do slightly worse for large $|T|$. For other $\hat{T}$ directions, its performance is nearly the same. Surprisingly, our algorithm achieved the same converged results whether started with the "8-point" algorithm or our initial linear estimator. Since our initial estimator gives poor results for planar scenes, this indicates that our approach converges quite stably for these scenes.

We also tested the algorithms with very large motion to the side of the scene. For each image pair, we chose the depths of the 3D points in the range $d - 40$ to $d + 40$, where $d$ varied from 50 to 70 for different pairs. The translation was $\left(4\left(d + W_{\max}\right),\, 0,\, -\left(d + W_{\max}\right)/2\right)^T$, where $W_{\max}$ characterizes the width of the structure. The rotations were in the range 60°-90°. Figure 8 shows the results for 1000 image pairs created in this way.

³To account for the well-known two-fold ambiguity in reconstructing planar scenes, the error plotted in this figure is computed as the minimum of the two errors between the recovered $\hat{T}$ and the two valid possibilities for the ground-truth $\hat{T}$.
A2 gives results close to those of WC even for this large translational motion. Surprisingly, it again converged to the same results when started from either initial linear estimator, indicating its stability.

Finally, we conducted experiments with varying FOV (Figure 9) and scene depth (Figure 10). For the FOV test, we created 1000 image pairs in the same way as in the first two experiments, except that the FOV varied randomly in the range 50°-80°. We also chose the translations randomly, with magnitudes of 2-6 units. For the varying scene-depth test, we kept the field of view fixed at 60°, again varied the translations in the range 2 ≤ |T| ≤ 6, and randomly chose $d$ in the range 50-80 units. (As before, for any given sequence we chose the 3D depths in the range $d - 40$ to $d + 40$.) Note that the average error does not decrease with the FOV, though a larger FOV makes it easier to distinguish rotations from translations. This is because we scaled the size of the image noise with the FOV [13].
3.5.2 Experiments on Real Images

We have also run A2 and WC on the CASTLE data set (available from CMU). This sequence consists of 11 images with 28 feature points tracked over the sequence. From the provided calibration, we calculated that the FOV was 9.2°. Based on a multi-frame reconstruction and the provided ground truth, the 3D points vary in depth from 90 to 104, in units where the maximum translation from the first camera position was 4 units. Figure 11 shows one of the images with the tracked feature points marked.
Figure 12 shows the results⁴ obtained by WC and A2 for the angular errors in the recovered $\hat{T}$ for all 55 distinct image pairs. The two algorithms perform essentially identically. Over all pairs, the average error in $\hat{T}$ is 1.35° and the standard deviation is 1.44°.

⁴Since the motion in this sequence is almost purely translational, for our algorithm we did not compensate for the rotation. Compensating for the rotations gives similar results; the difference in the average error is just 0.08°.

Figure 7: Angular error in translation direction recovery for planar scenes. The translation directions are given by $\hat{T} = (\cos\theta, 0, \sin\theta)^T$, with θ = 0°, 45°, and 90°.

Figure 8: Angular error in translation direction recovery for a motion to the side of the scene. [Histogram of angular errors; Weighted Coplanarity: mean 1.13°, New Algorithm: mean 1.22°.]

Figure 9: Angular error in translation direction recovery for varying field of view.

Figure 10: Angular error in translation direction recovery for varying scene depth.

Figure 11: A frame in the CASTLE data with overlaid feature points.

Figure 12: Performance of the approximate and the full minimization algorithms on the CASTLE data. [Per-pair angular error in the translation direction; Weighted-Coplanarity: mean 1.35°, s.d. 1.44°; New Algorithm: mean 1.35°, s.d. 1.44°.]
4 Conclusion
We presented two new algorithms for two-frame structure from motion and experimentally evaluated the second. The first, A1, computes a true least-squares reconstruction by minimizing an error in the motion alone. The second, A2, is approximate but more efficient. Our experiments show that the efficiency of A2 comes at little cost in accuracy: it usually gives the same results as WC and in some cases does better. Applying the exact method of [11] would have improved A2's robustness.

A2 may be more robust than WC or even than a full minimization of the least-squares error such as in A1. This is because A2 minimizes a simpler error function, whose properties are easier to analyze. It is the simplicity of this error function that makes possible the fast global search technique of [11], using which A2 can avoid all local minima for forward T. The simplicity also makes possible the analysis of [6], which shows how to avoid local minima for non-forward T.

Though A2 starts from a small-translation assumption, it can give accurate results even for very large translations when initialized using the "8-point" algorithm. As shown in [10], the "8-point" algorithm gives accurate and reliable results when the translation is large and the depth range of the scene is not too small. The translation-recovery step in A2 can deal with arbitrarily large translations as long as the rotation is known accurately enough: if the rotation is zero, minimizing $E_F(T)$ gives T exactly up to noise, even for large translations. Thus, when the translation is large, typically the "8-point" algorithm will compute the rotation accurately, and A2 will accurately compute the translation after compensating for this rotation.

We have not yet studied how to optimize the convergence schedule for A2. We expect it to be faster than WC or A1, at least for the initial convergence toward the correct reconstruction. At a later stage of refining the reconstruction, A1 and A2 might be equally fast.
References
[1] P. Belhumeur, D. Kriegman, and A. Yuille, "The Bas-Relief Ambiguity," CVPR, 1060-1066, 1997.
[2] R. I. Hartley, "In Defense of the Eight-Point Algorithm," PAMI 19:580-593, 1997.
[3] A. D. Jepson and D. J. Heeger, "Linear subspace methods for recovering translational direction," in Spatial Vision in Humans and Robots, Cambridge University Press, 39-62, 1993.
[4] H. C. Longuet-Higgins, "A computer algorithm for reconstructing a scene from two projections," Nature 293:133-135, 1981.
[5] J. Oliensis, "Computing the Camera Heading from Multiple Frames," CVPR, 203-210, 1998.
[6] J. Oliensis, "A New Structure from Motion Ambiguity," CVPR, 185-191, 1999.
[7] J. Oliensis, "A Multi-frame Structure from Motion Algorithm under Perspective Projection," IJCV, to appear.
[8] J. Oliensis, "Structure from Linear and Planar Motions," CVPR, 335-342, 1996.
[9] J. Oliensis, "A Critique of Structure from Motion Algorithms," NECI TR, 1997.
[10] J. Oliensis, "Rigorous Bounds for Two-Frame Structure from Motion," ECCV, 184-195, 1996.
[11] S. Srinivasan, "Extracting Structure from Optical Flow Using the Fast Error Search Technique," University of Maryland CAR-TR-893, 1998.
[12] R. Szeliski and S. B. Kang, "Shape ambiguities in structure from motion," PAMI 19:506-512, 1997.
[13] T. Y. Tian, C. Tomasi, and D. J. Heeger, "Comparison of Approaches to Egomotion Computation," CVPR, 315-320, 1996.
[14] C. Tomasi and T. Kanade, "Shape and motion from image streams under orthography: A factorization method," IJCV 9:137-154, 1992.
[15] Z. Zhang, "On the optimization criteria for two-frame structure from motion," PAMI 20:717-729, 1998.