AUTOMATIC LENS DISTORTION ESTIMATION FOR AN ACTIVE CAMERA∗ Oswald Lanz ITC-irst, 38100 POVO (TN), ITALY
[email protected] Abstract
This paper proposes a new algorithm for estimating the lens distortion of a pan-tilt-zoom camera. No calibration pattern is needed: detecting a single feature point at different camera configurations generates a virtual calibration pattern composed of points that, in an undistorted image, are guaranteed to lie on straight lines. An iterative algorithm estimates lens distortion by minimizing a non-straightness error metric through two linear subproblems. Experiments on synthetic and real images show the validity of the proposed method.
Keywords:
Radial Lens Distortion, Active Camera Control
Introduction

Visual sensors are becoming very attractive in applications where electronic systems have to interact with their environment. They deliver observations that contain a huge amount of information, they can be controlled in many ways, and they are inexpensive. However, low-price cameras do not always behave like an ideal sensor: complex mathematical models are needed and have to be calibrated. This paper proposes a new method for estimating the lens distortion of a pan-tilt-zoom controllable camera based on straight-line rectification. Classical approaches attempt to rectify a geometrically accurate calibration grid using a nonlinear minimization approach (e.g. Tsai, 1987; Heikkilä and Silvén, 1997). Such methods can provide precise results, but they are slow, very sensitive to the initial guess, and require manual pattern placement. Features from static images are used in Devernay and Faugeras, 2001: edge detection with sub-pixel accuracy is applied to find line segments in a distorted image. Due to their linear approximation, detected segments are usually short and often split, which can make distortion recovery difficult. A geometric error metric similar to the one proposed in this paper is used to rectify the segments, a task performed by a Levenberg-Marquardt nonlinear minimization procedure.

The proposed method is innovative in two respects: no complex single-image feature extraction has to be performed, and no calibration pattern is needed. Instead, we exploit the camera's control capabilities to create a geometrically perfect, virtual calibration pattern. This task is described in Section 1. The generated pattern allows radial lens distortion to be estimated with an iterative, two-step linear algorithm presented in Section 2. In Section 3 we demonstrate the validity of the approach on synthetic data and real images.

∗ Research partially funded by Provincia Autonoma di Trento under project PEACH.

Figure 1. The virtual calibration patterns generated by the detection of a single feature point: (a) principal point pattern from different zoom settings, (b) distortion pattern from different camera orientations.
1. Virtual Pattern Generation
We exploit the camera's control capabilities to generate two calibration patterns: one will be used to find the principal point – an approximation of the distortion center – and the second to estimate radial lens distortion. Both patterns are constructed by assembling the locations of a single feature point detected at different, controlled camera orientations and zoom levels. The camera's principal point is assumed to be invariant to zoom changes. To estimate it we exploit the fact that the optical flow induced by zooming is radial, so single-point trajectories all meet at a near-central position, the principal point. The corresponding calibration pattern is composed of point samples of such radial lines: at maximum zoom level the camera gazes at the feature point near its image border; by successively decreasing the zoom level, the same feature point approaches the principal point, and its trajectory defines one radial line. Figure 1a shows a typical principal point calibration pattern. The second pattern is again composed of several point sets, each representing samples from an image line. For each line, a sequence of images is captured at different viewing orientations. The camera is controlled in such a way that the generating feature point is always imaged onto the chosen line. The pan-tilt parameters that realize this scan are computed from the camera control model. Most cameras follow a pan-first strategy on orthogonal concentric axes: a space point P = (x y z)^t, after a pan rotation p and a tilt rotation t, is mapped to
the point P′ = (x′ y′ z′)^t according to:

$$
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{pmatrix}
\begin{pmatrix} \cos p & 0 & \sin p \\ 0 & 1 & 0 \\ -\sin p & 0 & \cos p \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix},
$$

or P′ = (R_t R_p)P = R_pt P. Under a pinhole camera model, the 3D locations of points that are imaged to a line are described by a plane through the camera center (see Hartley and Zisserman, 2002). Any such plane can be parameterized by a 3D vector B = (b1 b2 b3)^t. To obtain the desired pattern we therefore constrain the feature point P to lie on a given camera-frame plane B′ according to B′ · P′ = B′ · R_pt P = 0. Since this equation is linear in P, the feature point location needs to be known only up to scale, i.e. only its ray parametrization is required. It can be recovered from the camera's pan-tilt readings when the point is imaged to the principal point, according to P ∝ R_pt^{-1} (0 0 1)^t. This is a suitable measurement when the principal point is close to the center of distortion, a hypothesis that has always proven acceptable for real cameras. A scan control sequence for a plane B′ is then computed by solving, at each scan step i, the above nonlinear control equation for the pan-tilt angles (p, t)_i. An additional constraint is required: we impose a constant angular scan step ∆pt = ‖(p, t)_i − (p, t)_{i−1}‖. Consistent rotation orientation is guaranteed by providing the nonlinear solver with a good initial guess, e.g. a linear prediction. Figure 1b shows a distortion calibration pattern generated using horizontal lines, i.e. b′3 = 0.
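For the pan-first model, fixing the pan angle reduces the scan control equation B′ · R_t R_p P = 0 to the form A cos t + B sin t + C = 0 in the tilt angle, which can be solved in closed form; this can seed (or replace) the nonlinear solver mentioned above. A minimal sketch in NumPy (function names are ours; a real scan loop would pick the solution branch closest to the previous step, as the text suggests, to keep the rotation orientation consistent):

```python
import numpy as np

def R_pan(p):
    """Pan: rotation about the vertical (y) axis."""
    return np.array([[ np.cos(p), 0.0, np.sin(p)],
                     [ 0.0,       1.0, 0.0      ],
                     [-np.sin(p), 0.0, np.cos(p)]])

def R_tilt(t):
    """Tilt: rotation about the horizontal (x) axis."""
    return np.array([[1.0, 0.0,        0.0       ],
                     [0.0, np.cos(t), -np.sin(t)],
                     [0.0, np.sin(t),  np.cos(t)]])

def solve_tilt(B0, P, p):
    """For a fixed pan p, return a tilt t with  B0 . (R_tilt(t) R_pan(p) P) = 0.
    Writing Q = R_pan(p) P, the constraint expands to A cos t + B sin t + C = 0
    with A = b2*Qy + b3*Qz,  B = b3*Qy - b2*Qz,  C = b1*Qx."""
    b1, b2, b3 = B0
    Qx, Qy, Qz = R_pan(p) @ P
    A = b2 * Qy + b3 * Qz
    B = b3 * Qy - b2 * Qz
    C = b1 * Qx
    R = np.hypot(A, B)               # A cos t + B sin t = R sin(t + phi)
    phi = np.arctan2(A, B)
    return float(np.arcsin(np.clip(-C / R, -1.0, 1.0)) - phi)
```

Sweeping p over equal angular steps and solving for t at each step yields one scan sequence for the plane B′.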
2. Distortion Removal Estimation
Accounting for radial lens distortion, the actual projection (p_x, p_y) is related to its ideal, undistorted point (p̂_x, p̂_y) by a radial displacement:

$$
\begin{pmatrix} \hat p_x \\ \hat p_y \end{pmatrix}
=
\begin{pmatrix} c_x \\ c_y \end{pmatrix}
+ L(r) \begin{pmatrix} p_x - c_x \\ p_y - c_y \end{pmatrix},
\qquad
r = \sqrt{(p_x - c_x)^2 + (p_y - c_y)^2}.
$$
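Given an estimated center c and Taylor coefficients of L, this relation can be applied directly to measured points. A sketch in NumPy (function name is ours):

```python
import numpy as np

def correct_points(pts, c, l):
    """Map distorted points to their undistorted positions:
    p_hat = c + L(r) (p - c),  L(r) = l[0] + l[1] r + l[2] r^2 + ...
    pts: (N, 2) array of distorted points; c: distortion center."""
    d = pts - c
    r = np.linalg.norm(d, axis=1)
    L = np.polyval(list(l)[::-1], r)   # polyval wants highest degree first
    return c + L[:, None] * d
```

With l = [1] (no distortion) the map is the identity, and the center itself is always a fixed point of the correction.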
We will specify this inverse distortion model by estimating the center of distortion (c_x, c_y) and the Taylor expansion coefficients of the distortion correction function L up to a predefined order. A distorted image can then be rectified by remapping its pixels according to the above relation. The proposed approach is posed as an optimization problem whose purpose is to straighten image lines, provided by – but not limited to – a pattern generated with the technique described in the previous section. Consider the Euclidean distance of an image point p from a line b:

$$
d(p, b) = \frac{p_x b_x + p_y b_y + b_z}{\sqrt{b_x^2 + b_y^2}},
$$
or, equivalently, d(p, b) = p · b, with p in homogeneous notation and b normalized with respect to its x-y components. Since p is a measured sample of the undistorted line b̂, its (L, c)-corrected distance is given by d(p, b̂) = b̂ · c + L(r) b̂ · (p − c), with p and c in homogeneous notation. (L, c) are estimated by minimizing this nonlinear error metric over all pattern points in the least-squares sense, i.e. with the objective function

$$
\mathrm{obj}(L, c) = \sum_{ij} d^2(p_{ij}, \hat b_i).
$$

Care must be taken to avoid the trivial solution L ≡ 0, i.e. the collapse of the image onto the center. To do so we impose the physically plausible hypothesis of no distortion at the center, that is, L(0) = 1. In the above objective function the b̂_i are unknown too, so a standard nonlinear minimization approach becomes impractical. Note that, under the assumption of no knowledge of the camera's internal calibration, this holds even if the pattern has been generated by sampling points from a known 3D plane, as proposed in the previous section. This paper proposes to perform the minimization with an iterative two-step linear algorithm that, at each iteration, first estimates the line parameters b̂ and then updates the correction L and, eventually, c. Let us assume for now that the center of distortion c is known in advance or approximated by the principal point, a hypothesis that will be removed later. At each iteration, the first step estimates the b̂_i from the pattern rectified with the correction obtained at the previous iteration: a linear least-squares fit to the corresponding line points can be computed via Singular Value Decomposition (SVD, see Golub and Van Loan, 1990). Once the b̂_i are known, the minimization becomes a linear least-squares problem in the Taylor expansion coefficients l_k of L. Note that the estimated b̂_i already fix the pattern scale, so l_0 can be included in the optimization and L(0) = 1 imposed a posteriori by normalizing l with respect to l_0. The associated overdetermined system of linear equations

$$
(l_0 + l_1 r_{ij} + l_2 r_{ij}^2 + \dots + l_k r_{ij}^k)\, \hat b_i \cdot (p_{ij} - c) = -\hat b_i \cdot c
$$
can be solved for the l_k, again using SVD. The normalization of l concludes one iteration. The algorithm can be interpreted as a fixed-point method, so a stopping criterion based on the innovation of l can be used. A more intuitive alternative is to stop when the point-distance variances fall below a given threshold. As initial condition, the undistorted hypothesis l_0 = 1, l_{k≥1} = 0 has consistently shown good convergence properties for real lens problems. Several experiments have shown that, thanks to the additional degree of freedom, including l_0 in the minimization accelerates convergence by two orders of magnitude.
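One pass of the iteration thus fits a line to each rectified point set and then solves the linear system for the coefficients l_k. A compact sketch under stated assumptions (NumPy; all names are ours, `lstsq` – which itself uses SVD – stands in for the SVD solve of the overdetermined system, and the distortion center c is assumed known, as in the text):

```python
import numpy as np

def fit_line(pts):
    """Total least-squares line b = (bx, by, bz) with bx*x + by*y + bz = 0,
    normalized so that bx^2 + by^2 = 1 (SVD null-space fit)."""
    A = np.column_stack([pts, np.ones(len(pts))])
    b = np.linalg.svd(A)[2][-1]
    return b / np.hypot(b[0], b[1])

def correct_points(pts, c, l):
    """p_hat = c + L(r) (p - c), with L given by Taylor coefficients l."""
    d = pts - c
    r = np.linalg.norm(d, axis=1)
    return c + np.polyval(list(l)[::-1], r)[:, None] * d

def estimate_L(lines, c, order=3, iters=25):
    """Iterative two-step linear estimation of L for a known center c.
    lines: list of (N_i, 2) arrays, each sampling one (distorted) straight line."""
    l = np.zeros(order + 1)
    l[0] = 1.0                                   # undistorted initial guess
    for _ in range(iters):
        rows, rhs = [], []
        for P in lines:
            # step 1: line fit on the currently rectified points
            b = fit_line(correct_points(P, c, l))
            d = P - c
            r = np.linalg.norm(d, axis=1)
            s = d @ b[:2]                        # b . (p - c)
            # step 2 equations:  (sum_k l_k r^k) b.(p - c) = -b.c
            rows.append(s[:, None] * r[:, None] ** np.arange(order + 1))
            rhs.append(np.full(len(P), -(b[:2] @ c + b[2])))
        l = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)[0]
        l /= l[0]                                # impose L(0) = 1 a posteriori
    return l
```

After convergence, applying `correct_points` with the estimated l straightens the pattern lines.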
Figure 2. Distortion removal algorithm on synthetic data. The left image shows pattern point and distortion center trajectories over several iterations. The center image shows a region of the distorted pattern and its correction after convergence: corrected points lie tightly on their estimated, undistorted lines. The right plot (maximum variance, mean variance and standard deviation vs. iterations) shows that convergence is achieved after 9 iterations.
The hypothesis of a known center of distortion can be removed by introducing an additional step into the iterative algorithm: before updating the l_k, the center c is estimated according to the same objective function. A linear approximation (with the r_ij taken from the previous iteration) has shown poor convergence properties, so a nonlinear approach proved more convenient. Only a few iterations are usually required to obtain a stable estimate, and a good initial guess is given by the principal point. The principal point can be estimated from its pattern in two steps: an image line is fitted to each of the radial line point sets; the principal point is then computed as their least-squares intersection from the resulting overdetermined system of linear equations.
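The two-step principal point estimate (fit a line to each radial trajectory, then intersect all lines in the least-squares sense) amounts to a pair of small linear problems. A sketch in NumPy (names are ours):

```python
import numpy as np

def principal_point(trajectories):
    """Estimate the principal point as the least-squares intersection of the
    radial lines traced by a feature point under zoom changes.
    trajectories: list of (N_i, 2) arrays, one per radial line."""
    normals, offsets = [], []
    for P in trajectories:
        # SVD line fit: bx*x + by*y + bz = 0
        M = np.column_stack([P, np.ones(len(P))])
        b = np.linalg.svd(M)[2][-1]
        normals.append(b[:2])
        offsets.append(-b[2])
    # stack one equation bx*x + by*y = -bz per line and solve for (x, y)
    return np.linalg.lstsq(np.array(normals), np.array(offsets), rcond=None)[0]
```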
3. Results and Conclusions
The first experiment is performed on synthetic data, where ground truth is available. The distortion pattern has been generated following the proposed procedure; it is composed of 496 points sampled from 14 image lines at different distances and orientations. The renderer, implemented in OpenGL, simulates an ideal pinhole camera controlled with a pan-first model as described in Section 1. Distortion is applied a posteriori according to D(r̂) = 1 + 0.0003125 r̂. Both the expansion coefficients up to 3rd order and the distortion center have been estimated. Initial guesses for the iterative optimization are no distortion and a center point that is 18 pixels off from its true value. Distorted pattern points have a distance variance from their line fits of up to 7.48 pixels; the mean variance is 5.71, with standard deviation 1.43. Figure 2 shows the algorithm at work. Convergence is achieved after 9 iterations (excluding l_0 from the minimization, more than 2000 iterations are required!), giving L(r) = 1 − 0.0977 r + 0.0160 r² − 0.00175 r³ with a center as close as 0.5 pixels to its true value. The corrected points have a maximum variance of 0.267 pixels, with mean variance 0.233 and standard deviation 0.0257. Ground truth allows accuracy to be assessed over the entire image: after correction, the displacement from ground truth is bounded by max ‖r̂ − r L(r)‖ = 0.1473, showing closeness to the analytical solution.

In the second experiment we calibrate the lens distortion of a Sony SNC-RZ30P network camera¹. The principal point is estimated from the pattern shown in Figure 1. Lines have been sampled by detecting a small bright ball at different zoom levels. The same feature point has been used to create a distortion pattern composed of 1879 points sampled from 48 vertical and horizontal lines. Convergence is achieved after 6 iterations, dropping point variances from 3.92 pixels down to 1.62, which can be considered detection noise. The center of distortion is shifted by 5 pixels from its initial guess, the principal point. Figure 3 shows the distorted image of a checkerboard and its correction.

Figure 3. Distortion removal on a Sony SNC-RZ30P network camera. The left image shows a checkerboard that is distorted by the camera lens. The right image shows its correction, obtained using a virtual calibration pattern.

Experiments have demonstrated the validity of the proposed method. Virtual pattern generation is fully automatic and flexible. Distortion removal estimation is fast and accurate. The algorithm has an intuitive interpretation and is transparent: its interpretation as a fixed-point method solving two linear subproblems may allow theoretical convergence properties to be assessed. Smart feature point sampling could also improve the quality of the solution and would be supported by the proposed approach. An extension to a complete internal calibration method is desirable. These topics are subjects of future research.
References

Devernay, F. and Faugeras, O. (2001). Automatic calibration and removal of distortion from scenes of structured environments. Machine Vision and Applications.

Golub, G.H. and Van Loan, C.F. (1990). Matrix Computations. Johns Hopkins University Press.

Hartley, R. and Zisserman, A. (2002). Multiple View Geometry in Computer Vision. Cambridge University Press.

Heikkilä, J. and Silvén, O. (1997). A four-step camera calibration procedure with implicit image correction. In International Conference on Computer Vision and Pattern Recognition.

Tsai, R. (1987). A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE Journal of Robotics and Automation.
¹ Acknowledgements go to P. Chippendale, S. Rizzo and S. Ziller for kindly providing experimental data.