Student project

Epipolar image rectification using cylinder geometry

Institute of Computer Science, Faculty of Engineering, Christian-Albrechts-University of Kiel
Multimedia Information Processing
Prof. Dr.-Ing. Reinhard Koch

Done by: Ariane Nouidui-Tchagou
Tutor: Dipl.-Ing. Bogumil Bartczak

Kiel, 16.11.2006


Assignment

Name: Nouidui Tchagou, Ariane
Matriculation number: 420347
Degree program: Computer Engineering (Ingenieur Informatik)

Supervising professor: Prof. Dr.-Ing. Reinhard Koch
Advisor: Dipl.-Ing. Bogumil Bartczak
Institute: Institut für Informatik (Institute of Computer Science)
Working group: Multimedia Information Processing

Start date: 01.08.2006
Due date: 16.11.2006

Topic: Cylindrical camera rectification (Zylindrische Kamera Rektifizierung)


Declaration of Authorship

I hereby declare that I have produced this work independently and using only the cited literature and resources.

Ariane Nouidui-Tchagou


Contents

1 Introduction
2 Basics
  2.1 Camera model
    2.1.1 Pinhole camera model
    2.1.2 A general camera model
    2.1.3 The projection matrix
  2.2 Stereo vision
    2.2.1 Epipolar geometry
  2.3 Rectification
    2.3.1 Homography rectification
    2.3.2 Polar rectification
3 Cylindrical Rectification
  3.1 Cylinder points and the cylinder coordinate system
  3.2 Determining the rectification rotation
  3.3 Scaling the epipolar lines
  3.4 Implementation
    3.4.1 Determining the common interval box
    3.4.2 Computing the rectified image
    3.4.3 Regaining the original image
  3.5 Advantages and disadvantages of cylindrical rectification
4 Experiments

List of Figures

2.1 Pinhole camera model and image formation
2.2 Perspective projection
2.3 A more general camera model
2.4 Camera coordinate system
2.5 Changing coordinates in the image plane
2.6 3-D vision problem
2.7 Epipolar geometry
2.8 Rectified epipolar geometry
2.9 Planar rectification
2.10 Planar rectification in the case of forward motion
3.1 Cylindrical rectification of a horizontal camera motion
3.2 Camera and cylinder coordinate systems
3.3 Rotating the epipolar lines within the epipolar plane
3.4 Calculating the rotation matrix
3.5 Enforcing the position of the epipolar lines
3.6 Projecting the epipolar lines onto a unit-radius cylinder
3.7 Image resolution of the rectified image
4.1 Left-right camera motion rectification
4.2 Principle of the correlation technique
4.3 Relation between the disparity and the depth
4.4 Disparity map (left) and depth map (right)
4.5 Forward motion rectification
4.6 Disparity and depth map of the forward motion rectification

1 Introduction

In this work, the cylindrical rectification of an image pair is implemented. A rectification transforms a given image pair so that the epipolar lines are aligned horizontally and used as image rows; corresponding epipolar lines thereby receive the same row number in the resulting images. The cylindrical rectification, proposed by Roy, Meunier and Cox [RMC97], uses a cylinder to transform the given images.

Besides the cylindrical rectification there exist other rectification methods: the homography rectification and the polar rectification. The homography rectification uses a common plane to rectify the images, and the polar rectification parametrizes the image pair in a common (r, θ)-space. Unlike the homography rectification, the cylindrical rectification can deal with forward camera motion. In addition, the method works in three-dimensional space, in contrast to the polar rectification, and as a result it may be used for any camera geometry.

The main application area of cylindrical rectification is stereo vision, a technique dealing with the reconstruction of three-dimensional objects. Recording three-dimensional scenes and objects leads to a loss of depth information. By finding corresponding points in both images, the depth of the underlying physical point can be estimated. Using the epipolar constraint, which restricts the search space, and applying the cylindrical rectification, corresponding points can be determined more easily in that they can be searched for along image rows.

In chapter 2, the basics of cylindrical rectification are presented. First, the simplest camera model, the pinhole camera model, is described. Second, a more general model, which allows modeling the position and orientation of cameras, is presented. Further, the principle of stereo vision is introduced. Then the epipolar geometry, the epipolar constraint and their relation to the rectification are presented. Finally, the homography and polar rectification are described.

In chapter 3, the cylindrical rectification procedure is described. First, each single transformation applied in order to rectify the images is presented. Second, the basic implementation steps are elucidated: determining the common region, choosing an image resolution for the rectified image, etc. Further, the way from an original pixel to a rectified image pixel and vice versa is explicated. Finally, the advantages and disadvantages of cylindrical rectification are given.

In chapter 4, some experiments are made to test the implementation.


2 Basics

Image pair rectification is a process which requires some basic knowledge, presented in this chapter. On the one hand, the way an image is formed in a camera is explained. This explanation requires the modeling of cameras. For that reason, the simplest camera model, the pinhole camera model, is defined. This model is then expanded, since modeling real cameras means considering their orientation and position in the world. In addition, the position and orientation of the recorded scenes or objects must also be taken into account to describe the formed images.

The use of so-called coordinate systems allows the representation of objects in three-dimensional space. This representation is important in that it permits describing the relation between a scene or object point and its projection into the image plane. The three most important coordinate systems in camera modeling and image rectification are the world, camera and image coordinate systems. Given one coordinate system, it is possible to change into another. This is particularly fundamental since recording an image with a camera can be considered as a coordinate-changing transformation.

On the other hand, there is an interest in regaining the three-dimensional coordinates of objects from two pictures of them. Thus, techniques have been designed to find corresponding points between images and to use them for reconstructing the original 3D-space points. The epipolar geometry, a geometry used to facilitate the search for corresponding points, is explained in this chapter. This geometry allows reducing the search space from two dimensions to one; the search is accomplished on so-called epipolar lines. The rectification transforms the epipolar lines so that they are parallel, horizontal and used as image rows. In this case, the corresponding points are searched for in the image rows.

The following sections give a more detailed explanation of the concepts mentioned above, which are needed to understand the cylindrical rectification implementation.

2.1 Camera model

The sections below describe geometric models for cameras and introduce the required coordinate systems.

2.1.1 Pinhole camera model

The pinhole camera is the simplest camera model that can be defined. It consists of a pinhole and an image plane, also called the retinal plane. Light rays emitted or reflected by an object pass through the hole and meet the retinal plane at different points to form an image. The distance f between the image plane and the pinhole is called the camera constant or focal length. The pinhole is geometrically represented by a point C, called the optical center. The line going through the optical center and perpendicular to the image plane is the optical axis. The optical axis intersects the image plane in a point called the principal point O. Image formation in the retinal plane results from an operation called the perspective projection. This projection intersects the line < C, M >, called the optical ray of M, with the image plane to form a point m, where M is an object point. Figure 2.1 illustrates the image formation.

Figure 2.1: Pinhole camera model and image formation. The optical center C represents the projection center; m is the perspective projection of M onto the image plane.

For the pinhole camera model, the optical center is placed at the origin of the world coordinate system and the image plane is taken to be the plane Z = 1, i.e. f = 1. In this case, the perspective projection of the world point M(x, y, z) into the image space (U, V) is given by the point m(u, v) with:

u = x / z    (2.1)
v = y / z    (2.2)

The division by z is called the perspective division. As can be seen, projecting a world point into the image plane loses the depth information z. Figure 2.2 illustrates the perspective projection.
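As a minimal illustration of equations (2.1) and (2.2), the following sketch projects a world point for a pinhole camera at the origin. It assumes the Eigen linear algebra library, which is not part of the original implementation.

#include <Eigen/Dense>

// Perspective projection for the pinhole model with the optical
// center at the origin and image plane Z = 1 (f = 1), following
// equations (2.1) and (2.2).
Eigen::Vector2d projectPinhole(const Eigen::Vector3d& M) {
    // Perspective division: the depth M.z() is lost here.
    return Eigen::Vector2d(M.x() / M.z(), M.y() / M.z());
}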



Figure 2.2: Perspective projection. The world point M is projected into the image space (U, V). The above equations are obtained by applying the intercept theorem to the similar triangles (C, m, L) and (C, M, T).

2.1.2 A general camera model

Up to now, the assumption was that the camera is located at the origin of the world coordinate system. When considering a camera that is not positioned at that origin, a "local world coordinate system" should be defined at the new position of the optical center, in order to preserve the properties of the pinhole model. This expanded camera model is characterized by its intrinsic and extrinsic parameters. The extrinsic parameters describe the position and orientation of the camera with respect to the world coordinate system; they also allow determining the camera coordinates of an object or scene point given in world coordinates. The intrinsic parameters describe how a point in the image plane is addressed. In the following, the extrinsic and intrinsic parameters are defined.

Extrinsic parameters

The basis of the camera system consists of four vectors (C, Ho, Vo, A), given in world coordinates. The vectors Ho and Vo span the image plane, and the vector A represents the optical axis and is orthogonal to both Ho and Vo. The basis B := (C, Ho, Vo, A) can be specified with the matrix:

B = \begin{pmatrix} H_o & V_o & A & C \\ 0 & 0 & 0 & 1 \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 & 0 & C_x \\ 0 & 1 & 0 & C_y \\ 0 & 0 & 1 & C_z \\ 0 & 0 & 0 & 1 \end{pmatrix}}_{=:T}
    \underbrace{\begin{pmatrix} H_o & V_o & A & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}}_{=:R}    (2.3)

This matrix models the rigid transformation that the optical center and the image plane undergo to reach their new position and orientation. R is a rotation and specifies the orientation of the camera; T is the translation from the world coordinate center to the camera system center. It is possible to change from the camera coordinate system to the world coordinate system and vice versa. Given a point M'(x', y', z', 1) in the camera coordinate basis, it can be expressed as a point M in the world coordinate basis:

M = x' H_o + y' V_o + z' A + C = B M'    (2.4)

Since the vectors Ho, Vo, A, C are given in world coordinates, the point M is also in world coordinates. Conversely, a point M(x, y, z, 1) given in the world coordinate system can be expressed as a point M'(x', y', z', 1) in the local world coordinate system, i.e. the camera coordinate system. The camera coordinates M' are obtained by inverting the matrix B:

M' = B^{-1} M = (TR)^{-1} M = R^T T^{-1} M    (2.5)

This relation can also be written as:

M' = \begin{pmatrix} H_o^T & 0 \\ V_o^T & 0 \\ A^T & 0 \\ 0\;0\;0 & 1 \end{pmatrix}
     \begin{pmatrix} 1 & 0 & 0 & -C_x \\ 0 & 1 & 0 & -C_y \\ 0 & 0 & 1 & -C_z \\ 0 & 0 & 0 & 1 \end{pmatrix} M    (2.6)
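The coordinate changes of equations (2.4) and (2.5) can be sketched as follows; this is a minimal illustration assuming Eigen, with the orientation stored as a 3x3 matrix whose columns are Ho, Vo and A.

#include <Eigen/Dense>

// Change between world and camera coordinates, following equations
// (2.4) and (2.5). R holds the basis vectors (Ho, Vo, A) as columns;
// C is the optical center, both given in world coordinates.
struct CameraPose {
    Eigen::Matrix3d R; // orientation (orthonormal)
    Eigen::Vector3d C; // optical center

    // camera -> world: M = R M' + C (equation 2.4)
    Eigen::Vector3d toWorld(const Eigen::Vector3d& Mc) const {
        return R * Mc + C;
    }
    // world -> camera: M' = R^T (M - C) (equations 2.5 and 2.6)
    Eigen::Vector3d toCamera(const Eigen::Vector3d& Mw) const {
        return R.transpose() * (Mw - C);
    }
};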

Figure 2.3 illustrates the expanded camera model; figure 2.4 depicts the camera coordinate system.

Intrinsic parameters

The intrinsic parameters describe how a point is addressed in the resulting image. In order to describe this addressing, a transformation from the camera coordinate system to a new one, called the image coordinate system, is introduced. The image coordinate system uses two basis vectors Un and Vn, which differ from Ho and Vo only by scale factors ku and kv:

U_n = k_u H_o    (2.7)
V_n = k_v V_o    (2.8)


Figure 2.3: A more general camera model. R and T are respectively the rotation and the translation that the camera center and the image plane of the world system undergo to take their new position and orientation.

Figure 2.4: Camera coordinate system. The affine basis (Ho, Vo, A, C) is presented. The optical axis intersects the image plane at the principal point O.


In addition, the principal point O receives different coordinates in the new system, in order to avoid negative image coordinates. The new coordinates of O are specified by a point On = (uo, vo). Due to these coordinates, a translation by the vector On is applied to every point projected onto the image plane. Considering the new basis vectors, the principal point On and the focal length, the relation between a point m(u', v', 1) projected onto the image plane and m(u, v, 1), representing its coordinates in the resulting image, can be described by the following equation:

\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
 = \underbrace{\begin{pmatrix} -\frac{f}{k_u} & 0 & u_o \\ 0 & -\frac{f}{k_v} & v_o \\ 0 & 0 & 1 \end{pmatrix}}_{=:K}
   \begin{pmatrix} u' \\ v' \\ 1 \end{pmatrix}    (2.9)

The above matrix is referred to as the internal calibration matrix K. Figure 2.5 shows the image coordinate system.

Figure 2.5: Changing coordinates in the image plane. As can be seen, the basis vectors differ by a scale factor. The (Ho, Vo)-space has the principal point O; in the new space (Un, Vn) the principal point has new coordinates.

By combining the extrinsic parameters, the intrinsic parameters and the perspective projection into one matrix, called the projection matrix, a relation can be established between a 3D object point and its coordinates within the recorded image.



2.1.3 The projection matrix

After the extrinsic and intrinsic parameters have been determined, the so-called projection matrix of the camera can be computed. This matrix transforms an object point M(x, y, z, 1) given in world coordinates into a point m in image coordinates. First, the extrinsic parameters give the camera coordinates M':

M' = R^T T^{-1} M    (2.10)

Applying the perspective projection of the point onto the image plane leads to a dimension reduction. The matrix of the perspective projection is given by: 

P_o = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{z} & 0 \end{pmatrix}    (2.11)

The factor 1/z represents the perspective division. The intrinsic parameters are then used to calculate the image coordinates:

m = K P_o M'    (2.12)

The whole transformation is then:

m = K P_o R^T T^{-1} M    (2.13)

The equation can also be written in the form:

m = \underbrace{K R^T \left( I_{3\times3} \mid -C_3 \right)}_{P} M    (2.14)

In this equation the perspective projection without the perspective division is represented by the identity matrix I_{3×3}; the perspective division can be applied after the matrix P has been used. P ∈ R^{3×4} is the projection matrix of the camera:

P = K R^T \left( I_{3\times3} \mid -C_3 \right)    (2.15)

Now that the projection matrix is defined, it is possible to compute the pixel values, i.e. the image coordinates, of a recorded three-dimensional scene or object. However, the projection of a scene loses the depth information of the three-dimensional points. Regaining this information requires stereo vision, which is an application area of the cylindrical rectification.
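A compact sketch of the whole projection of equations (2.14) and (2.15), again assuming Eigen; K, R and C are the calibration matrix, orientation and camera center defined above.

#include <Eigen/Dense>

// Projection m ~ K R^T (I | -C) M from equation (2.15), followed by
// the perspective division.
Eigen::Vector2d project(const Eigen::Matrix3d& K, const Eigen::Matrix3d& R,
                        const Eigen::Vector3d& C, const Eigen::Vector3d& M) {
    Eigen::Matrix<double, 3, 4> ext;          // (I | -C), a 3x4 block matrix
    ext.leftCols<3>() = Eigen::Matrix3d::Identity();
    ext.rightCols<1>() = -C;
    Eigen::Matrix<double, 3, 4> P = K * R.transpose() * ext;
    Eigen::Vector3d m = P * M.homogeneous(); // homogeneous image point
    return m.hnormalized();                  // perspective division
}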



2.2 Stereo vision

Stereo vision [Fau93] is a technique dealing with the reconstruction of three-dimensional objects, given two or more images of the objects. The images are obtained by using cameras with known orientation and position. In this work, only two images of the scene or object are assumed to be given. In order to perform the reconstruction, matching points are searched for within the images. This search can be based on the color intensity of the pixel values. Given a point in one image, its corresponding point is found in the other image; to correspond means here that both points are the image of the same physical point in the original scene. After matching points have been determined, their optical rays through the two camera centers are intersected to acquire the depth of the three-dimensional point (see figure 2.6). For a given point in one image, the corresponding point could a priori be any point in the second image. In order to reduce the search space, the epipolar geometry and its constraint are used. In the following, the epipolar geometry and constraint are presented; furthermore, the rectification is explained.

Figure 2.6: 3-D vision problem

2.2.1 Epipolar geometry

The epipolar geometry describes the relation between two images. In this section, two cameras and their image planes are given. The cameras have C1 and C2 as optical centers. The line joining the two camera centers is called the baseline. The epipolar geometry is the geometry of the intersection of the image planes with the pencil of planes having the baseline as axis. Further, two image points p1 and p2 are given as the projections of a three-dimensional object or scene point M. M, C1 and C2 are coplanar; the considered plane is an epipolar plane. The intersection of the baseline with an image plane is called an epipole. The optical ray of M going through the image plane I1 has its image, called an epipolar line, in the image plane I2. Since any point of this optical ray may have been matched to p1, the corresponding point of p1 lies on the considered epipolar line in image plane I2. An epipolar line passes through the projection p2 of M in I2 and through the epipole of the considered image plane. Figure 2.7 illustrates the epipolar geometry.

Figure 2.7: Epipolar geometry. C1 and C2 are the given optical centers; the real position of P is not known. The optical ray of P going through the image plane I1 has its image in I2 as the epipolar line < e2, p2 >, and the optical ray of P in the image plane I2 has its image in I1 as the epipolar line < e1, p1 >. This means that all possible matches for p1 lie on the epipolar line < e2, p2 > and all possible matches for p2 lie on the epipolar line < e1, p1 >.

Considering this geometry, the epipolar constraint is given: the point p1 has all its possible matches in the image plane of the second camera lying on an epipolar line. Since without the epipolar constraint a corresponding point of p1 would have to be searched for in the whole image plane I2, the dimension of the search space is reduced from two to one. Using the epipolar constraint and aligning the epipolar lines horizontally, in order to use them as image rows, reduces the search for correspondences to the image rows. This transformation of the images is denoted rectification. The geometry resulting from the rectification is called the rectified epipolar geometry and is illustrated in figure 2.8. In that geometry, the epipolar lines are parallel to the baseline and the image plane normal is orthogonal to the baseline. In addition, the corresponding points are only translated from each other along the baseline, so that they have the same row number within the rectified images.

2.3 Rectification

The aim of the rectification procedure is to ensure a rectified epipolar geometry. There exist different rectification techniques. The oldest one, called homography rectification, projects the given images onto a common plane parallel to the baseline.

Figure 2.8: Rectified epipolar geometry. The epipoles are at infinity, the epipolar lines are parallel and horizontal, and the normal to the image plane is orthogonal to the baseline.

Another method, called polar rectification, parametrizes the epipolar lines within the image and around the epipoles using polar coordinates; each epipolar line corresponds to an angle. This technique is similar to the cylindrical rectification, which transforms the images onto a common cylinder surface. In this last method, the epipolar lines are made parallel to the cylinder axis. The next two sections describe the homography and polar rectification. Chapter 3 gives a more detailed description of the cylindrical rectification.

2.3.1 Homography rectification

The homography rectification method is a standard method, which is relatively simple. The idea is to reproject both images of the given cameras onto a common plane. The chosen plane is parallel to the baseline and considered as the image plane of a new camera having an orientation matrix R and a calibration matrix K. By choosing a common region on the rectification plane, where the images are mapped to, it is ensured that corresponding epipolar lines get the same row number in the rectified images. Figure 2.9 illustrates the homography rectification. Given the intrinsic matrices K1, K2 (with K1 = K2 = K) and the orientation matrices R1, R2 of the cameras, a homography can be defined between a point in the old image plane and a point in the common plane. The rectification for the first camera can be defined as:

H_1 = K R^T R_1 K_1^{-1}    (2.16)


Figure 2.9: Planar rectification. The rectification plane is chosen so that it is parallel to the baseline. This common plane is considered as the image plane of a new camera, and its orientation R is taken as the rotation matrix used in the extrinsic parameters of that camera.

The rectification for the second camera is defined as:

H_2 = K R^T R_2 K_2^{-1}    (2.17)

The transformation R^T Ri is the relative rotation between the camera of the new image plane and the camera with orientation matrix Ri. Since the cameras are only rotated and do not change their centers, it is sufficient to consider the relative rotation. The main advantages of the homography rectification method are its simplicity and its runtime efficiency: a single homography is calculated and used for all the epipolar lines. Additionally, the homography rectification preserves straight lines, since the transformation is linear and the same for all image points. This preservation of straight lines is particularly useful when rectification and stereo matching have to be applied to edges or lines. The main disadvantage of homography rectification is that it fails for any camera motion with a large forward component¹: in this case, unbounded images can result. In fact, no point lying on the baseline has a projection on the common rectification plane, and a point near the baseline requires a very large image to be projected. Figure 2.10 illustrates the case of forward motion.

¹ A forward component represents a situation where one of the cameras is positioned behind the other camera.
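A minimal sketch of applying equations (2.16)/(2.17) to a single pixel, assuming Eigen; homogeneous() and hnormalized() perform the conversion to and from homogeneous pixel coordinates.

#include <Eigen/Dense>

// Homography rectification of one pixel of camera i, equations
// (2.16) and (2.17): H_i = K R^T R_i K_i^{-1} maps the pixel into
// the common rectified plane. K and R belong to the new (virtual)
// camera whose image plane is the rectification plane.
Eigen::Vector2d rectifyPixel(const Eigen::Matrix3d& K, const Eigen::Matrix3d& R,
                             const Eigen::Matrix3d& Ki, const Eigen::Matrix3d& Ri,
                             const Eigen::Vector2d& pixel) {
    Eigen::Matrix3d H = K * R.transpose() * Ri * Ki.inverse();
    return (H * pixel.homogeneous()).hnormalized();
}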



Figure 2.10: Planar rectification in the case of forward motion. The cameras are specified by their image planes, optical axes and centers. Points lying on the baseline have no image in the common rectifying plane; for the points next to the baseline an image distortion is introduced.

2.3.2 Polar rectification

The polar rectification method [PKG99] parametrizes a given image pair using polar coordinates. The process starts from the extreme epipolar lines, which pass through the image corners. The distance between the epipolar lines is chosen so that no pixel information is lost. The epipole is taken to be the origin of a polar coordinate system: each epipolar line gets an angle and a radius (r, θ) around the epipole, and this information is registered in a (r, θ)-space. The epipolar line thereby becomes horizontal and is set as an image row in the rectified image. By choosing a common interval region for both images, corresponding epipolar lines are referred to the same angle. This rectification method is not linear and allows a rectification with a minimal image size. It deals with epipoles lying in the image, and thus with forward motion. In addition, it is similar to the cylindrical rectification because each image pixel is transformed into a (r, θ)-space. However, the polar rectification works within the image plane and therefore cannot necessarily be used for every camera geometry.


3 Cylindrical Rectification

The cylindrical rectification is more general than the homography and polar rectification: it deals with forward camera motion, is performed in three-dimensional space, and can potentially be used for any camera geometry. The goal of cylindrical rectification is a transformation of the given images into a common (r, θ)-space; hence a mapping of the images onto a cylinder surface is realized. The values (r, θ) are used as (x, y) pixel or image coordinates in the rectified images. The cylinder is selected so that its axis is parallel to the baseline. The straight lines of the cylinder are used as epipolar lines in the rectified image because they are parallel to the baseline. To perform the rectification, each epipolar line is rotated within its epipolar plane and expressed in a common (r, θ)-space. By choosing a common region to which both given images are transformed, by rotating the epipolar lines so that they remain in their epipolar planes, and by aligning the epipolar lines parallel to the baseline, the rectified geometry is obtained: corresponding epipolar lines are referred to the same angle, and corresponding points only differ along the cylinder axis. Figure 3.1 shows the cylindrical rectification of an image pair.

Figure 3.1: Cylindrical rectification of a horizontal camera motion. Both images have been transformed onto a common cylinder region.


Rectifying an image with the cylindrical method requires three important steps. Given an epipolar line in 3D, the first operation is to rotate it until it is parallel to the cylinder axis, and thus parallel to the baseline; each point on the epipolar line is taken to be in the camera coordinate system, and Rcyl denotes this rotation matrix. After the rotation is done, the epipolar line is transformed into a (r, θ)-space by using a Euclidean cylinder coordinate system; Tcyl specifies the basis vectors of the cylinder system and denotes the transformation from camera to cylinder coordinates. Finally, the epipolar line is projected onto a cylinder of unit radius; Scyl denotes this scaling transformation. Each step is made in three-dimensional space. The whole transformation can be expressed as a matrix L := Scyl · Tcyl · Rcyl, and each epipolar line undergoes a different one. Since the epipolar lines are finally expressed in the cylinder system, each of them corresponds to an angle θ around the cylinder axis. An image pixel is referred to by the epipolar line on which it lies, thus by the angle θ, and by its position r on the line. The following sections describe how each matrix of the above transformations is obtained.
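Once L has been built for an epipolar line, reading off the rectified coordinates is a matter of applying it to an optical ray; the sketch below (assuming Eigen) mirrors the (r, θ) addressing used in sections 3.4.2 and 3.4.3.

#include <Eigen/Dense>
#include <cmath>

// Apply the per-line transform L = Scyl * Tcyl * Rcyl to an optical
// ray p (camera coordinates). The result lies on the unit cylinder:
// its x component is the axial position r, and atan2(y, z) is the
// angle theta of the epipolar line.
struct CylinderCoords { double r, theta; };

CylinderCoords rectifyRay(const Eigen::Matrix3d& L, const Eigen::Vector3d& p) {
    Eigen::Vector3d q = L * p;
    return { q.x(), std::atan2(q.y(), q.z()) };
}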

3.1 Cylinder points and the cylinder coordinate system

Since the aim of the cylinder rectification is to transform a given image pair onto a cylinder, it must be possible to handle and address cylinder points. For that reason, a Euclidean cylinder coordinate system is defined. This coordinate system specifies the cylinder orientation and allows describing the cylinder points. Every point that should be transformed onto the cylinder is assumed to be given in the camera coordinate system; having the images, the intrinsic parameters are used to get the camera coordinates of any optical ray going through the image plane. The transformation Tcyl changes from the camera system to the cylinder system. The basis of the cylinder coordinate system is specified by four vectors C, Xcyl, Ycyl, Zcyl, given in the camera coordinate system, where C is one of the two camera centers C1 and C2, and (Xcyl, Ycyl, Zcyl) is an orthogonal vector basis. Since the cylinder axis should be parallel to the baseline, the axis Xcyl is defined from the baseline vector expressed in camera coordinates. Using the orientation matrix R1 of the first camera and its camera center C1, the cylinder coordinate system is fixed at the origin C1 for the first camera and C2 for the second camera. The first basis vector is:

X_{cyl} = R_1^T (C_2 - C_1)    (3.1)

Ycyl, the second basis vector, is computed as the cross product of the vector z = (0, 0, 1)^T and the vector Xcyl. The third vector Zcyl is then the cross product of the first two vectors. In this way an orthogonal right-handed coordinate system is obtained. If the baseline is parallel to or equals z, the camera coordinate system basis can be used as basis of the cylinder system. Depending on whether the cylinder basis vectors are placed at the first or at the second camera origin, the used transformation Tcyl differs. Tcyla is used for the first camera and is defined as follows:

T_{cyl_a} = \begin{pmatrix} n(X_{cyl}) \\ n(Y_{cyl}) \\ n(Z_{cyl}) \end{pmatrix}
          = \begin{pmatrix} n(X_{cyl}) \\ n(z \times X_{cyl}) \\ n(X_{cyl} \times (z \times X_{cyl})) \end{pmatrix}    (3.2)

where n(·) denotes normalization and the vectors are stacked as rows. In order to use the same cylinder orientation and the same addressing for both images, the cylinder basis vectors of the second camera are defined depending on the cylinder basis of the first camera. For that reason, the relative rotation R12 of both cameras is calculated. R12 converts a point from the second camera coordinate system into the first camera coordinate system and is defined by:

R_{12} = R_1^T R_2    (3.3)

where R2 is the orientation matrix of the second camera. Tcylb then defines the transformation matrix for all image points lying on the second camera's image plane and adopts the common cylinder orientation given by Tcyla:

T_{cyl_b} = T_{cyl_a} R_{12}    (3.4)
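The basis construction of equations (3.1)-(3.4) can be sketched as follows (assuming Eigen; the degenerate-case handling is a simplification of the fallback described above).

#include <Eigen/Dense>

// Cylinder coordinate bases of section 3.1. Tcyla stacks the
// normalized basis vectors as rows (eq. 3.2); Tcylb reuses the same
// cylinder orientation for the second camera (eqs. 3.3 and 3.4).
void cylinderBases(const Eigen::Matrix3d& R1, const Eigen::Matrix3d& R2,
                   const Eigen::Vector3d& C1, const Eigen::Vector3d& C2,
                   Eigen::Matrix3d& Tcyla, Eigen::Matrix3d& Tcylb) {
    const Eigen::Vector3d z(0, 0, 1);
    Eigen::Vector3d Xcyl = R1.transpose() * (C2 - C1); // eq. (3.1)
    Eigen::Vector3d Ycyl = z.cross(Xcyl);              // second basis vector
    Eigen::Vector3d Zcyl = Xcyl.cross(Ycyl);           // third basis vector
    if (Ycyl.norm() < 1e-12) {
        // Baseline parallel to z: fall back to the camera basis.
        Tcyla.setIdentity();
    } else {
        Tcyla.row(0) = Xcyl.normalized();
        Tcyla.row(1) = Ycyl.normalized();
        Tcyla.row(2) = Zcyl.normalized();
    }
    Eigen::Matrix3d R12 = R1.transpose() * R2;         // eq. (3.3)
    Tcylb = Tcyla * R12;                               // eq. (3.4)
}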

Figure 3.2 illustrates the cylinder basis expressed in camera coordinates.

Figure 3.2: Camera and cylinder coordinate systems. The basis vectors of the Euclidean cylinder coordinate system are given in camera coordinates.

Now that the cylinder has been specified, the given images can be parametrized on it. In order to align the epipolar lines parallel to the baseline, they must first be rotated so that they are parallel to the cylinder axis; only then can they be projected onto the cylinder. Therefore, a rotation of the epipolar lines is necessary before they are expressed in cylinder coordinates.

3.2 Determining the rectification rotation

In order to make an epipolar line parallel to the cylinder axis, a rotation angle and an axis of rotation are needed. Additionally, the rotation should be accomplished within the epipolar plane, so that the rotated epipolar line remains in that plane. As a result, the rotation axis is chosen as the normal to the epipolar plane and is computed as the cross product of the baseline and a given epipolar line point pxyz. Considering the fact that in a rectified epipolar geometry the image plane normal is perpendicular to the baseline, the rotation angle is the angle needed to rotate this normal until it is orthogonal to the baseline (see figure 3.3).

Figure 3.3: Rotating the epipolar lines within the epipolar plane. View of the cylinder within the epipolar plane: the rotation axis is perpendicular to the epipolar plane, and an epipolar line is rotated until it is parallel to the baseline.

This rotation can be computed directly using two vectors, z' and p', which reflect the rotation angle. The first vector z' is the projection of the image plane normal z onto the epipolar plane and is computed as:

z' = axis × (z × axis)    (3.5)

The second vector p' is the orthogonal projection of pxyz into the cylinder plane that is normal to the baseline and contains the camera origin; it is computed as follows:

p' = T_{cyl}^T B T_{cyl} \, p_{xyz}    (3.6)

3.2 Determining the rectification rotation whereB is the matrix of the orthogonal projection:  0 0 0 B =  0 1 0 . Figure 3.4 illustrates the vectors used to compute the rotation 0 0 1 directly.

Figure 3.4: Calculating the rotation matrix The epipolar line is projected onto the cylinder plane orthogonal to the baseline and containing the origin. Therefore, the projection of pxyz is orthogonal to the baseline. The normal to the image plane is projected into the epipolar plane. The rotation matrix is then directly computed between p’ and z’ p’ is the projected point expressed in camera coordinate system. However, in the case that the epipole lies in the image, it should be ensured that all points lying on an epipolar line are rotated onto the same epipolar line. To enforce this, the half cylinder space, on which the point p00x,y,z := BTcyl pxyz is taken into account. In that case, the baseline divides the cylinder in two space: the space with equation Zcyl < 0 and the plane with equation Zcyl > 0. Figure 3.3 illustrates how the position of p” is considered. Having the vectors p’ and z’, the rotation matrix Rcyl , which is the rotation from p’ to z’, is given by: T   n(p0 ) n(z 0 )    n(p0 × z 0 ) n(p0 × z 0 ) = 0 0 0 0 0 0 n((p × z ) × p ) n((p × z ) × z ) 

Rf oe

(3.7)
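A sketch of this construction in code (assuming Eigen); the frame-based product mirrors equation (3.7), and axis, zp and pp correspond to the rotation axis, z' and p'.

#include <Eigen/Dense>

// Rectifying rotation of section 3.2: rotate within the epipolar
// plane so that the epipolar line through pxyz becomes parallel to
// the baseline (equations 3.5 - 3.7).
Eigen::Matrix3d rectifyingRotation(const Eigen::Vector3d& baseline,
                                   const Eigen::Matrix3d& Tcyl,
                                   const Eigen::Vector3d& pxyz) {
    const Eigen::Vector3d z(0, 0, 1);
    const Eigen::Matrix3d B = Eigen::Vector3d(0, 1, 1).asDiagonal();
    Eigen::Vector3d axis = baseline.cross(pxyz);             // epipolar plane normal
    Eigen::Vector3d zp = axis.cross(z.cross(axis));          // eq. (3.5)
    Eigen::Vector3d pp = Tcyl.transpose() * B * Tcyl * pxyz; // eq. (3.6)

    // Build orthonormal frames sharing the axis w = n(p' x z') and
    // compose them into the rotation taking n(p') onto n(z'), eq. (3.7).
    Eigen::Vector3d w = pp.cross(zp).normalized();
    Eigen::Matrix3d Fp, Fz;
    Fp << pp.normalized(), w, w.cross(pp.normalized());
    Fz << zp.normalized(), w, w.cross(zp.normalized());
    return Fz * Fp.transpose();
}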



Figure 3.5: Enforcing the position of the epipolar lines. Any point in the half space Zcyl < 0 (left image) would be rotated with a different angle than a point lying in the half space Zcyl > 0; the consequence would be a disruption of the epipolar line. It must be ensured that every point of an epipolar line undergoes the same rotation (right image). Thus the coordinates of the projected point p'' may be multiplied by −1 if it lies in the wrong half cylinder space.

3.3 Scaling the epipolar lines

The third step of the cylindrical rectification projects the epipolar lines onto a cylinder of unit radius. This scaling is performed so that every epipolar line has the same distance to the cylinder axis; figure 3.6 illustrates it. The scale factor k is computed for a known point pxyz. If q_cyl = (r, u, v)^T is a point on the cylinder, the distance of this point to the cylinder axis must be one. To enforce this, the point is considered in the cylinder plane orthogonal to the baseline, and the vector W := (u, v)^T must have norm one:

\left\| \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} r \\ u \\ v \end{pmatrix} \right\|
 = \left\| \begin{pmatrix} 0 \\ u \\ v \end{pmatrix} \right\| = 1    (3.8)

Since the point q_cyl is obtained from the point p_xyz through the equation q_cyl = S_cyl T_cyl R_cyl p_xyz, it can be substituted into equation (3.8) in order to compute the scaling factor:

\left\| \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
 \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{k} & 0 \\ 0 & 0 & \frac{1}{k} \end{pmatrix} T_{cyl} R_{cyl} \, p_{xyz} \right\| = 1    (3.9)

This equation yields the solution:

k = \| B \, T_{cyl} R_{cyl} \, p_{xyz} \|    (3.10)


Figure 3.6: Projecting the epipolar lines onto a unit-radius cylinder. The epipolar line l1 has a distance d from the image plane; rotating it preserves this distance, and analogously for the second line l2. The rotated lines thus lie at different distances from the cylinder surface; the scaling matrix Scyl projects them all onto the cylinder surface.

If the inverse transformation of L is required, i.e. if a cylinder point is given and the corresponding optical ray is searched for, the scaling factor has to be computed in another way. It must be ensured that the z camera coordinate of the optical ray vector equals 1, so that the point lies on the image plane:

\begin{pmatrix} 0 & 0 & 1 \end{pmatrix} p_{xyz} = 1    (3.11)

Using this relation and the fact that p_xyz = (R_{cyl}^T T_{cyl}^T S_{cyl}^{-1}) \, q_{cyl}, the scaling factor can be calculated:

k = \frac{1 - (T_{cyl} c_3) \cdot (A \, q_{cyl})}{(T_{cyl} c_3) \cdot (B \, q_{cyl})}    (3.12)

where c_3 is the third column vector of the rotation matrix R_{cyl}, A = diag(1, 0, 0) selects the unscaled axial component and B = diag(0, 1, 1) the scaled radial components, so that S_{cyl}^{-1} q_{cyl} = A q_{cyl} + k B q_{cyl}.
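Both scale factors can be sketched compactly; this assumes Eigen and the form S_cyl = diag(1, 1/k, 1/k) used above.

#include <Eigen/Dense>

// Forward scaling factor (eq. 3.10): distance of the rotated point
// from the cylinder axis; dividing the radial components by it lands
// the point on the unit-radius cylinder.
double forwardScale(const Eigen::Matrix3d& Tcyl, const Eigen::Matrix3d& Rcyl,
                    const Eigen::Vector3d& pxyz) {
    const Eigen::Matrix3d B = Eigen::Vector3d(0, 1, 1).asDiagonal();
    return (B * Tcyl * Rcyl * pxyz).norm();
}

// Inverse scaling factor (eq. 3.12): given a cylinder point qcyl,
// recover the factor that makes the z camera coordinate of the
// corresponding optical ray equal to 1.
double inverseScale(const Eigen::Matrix3d& Tcyl, const Eigen::Matrix3d& Rcyl,
                    const Eigen::Vector3d& qcyl) {
    const Eigen::Matrix3d A = Eigen::Vector3d(1, 0, 0).asDiagonal();
    const Eigen::Matrix3d B = Eigen::Vector3d(0, 1, 1).asDiagonal();
    const Eigen::Vector3d t = Tcyl * Rcyl.col(2); // Tcyl * c3
    return (1.0 - t.dot(A * qcyl)) / t.dot(B * qcyl);
}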

3.4 Implementation

Starting with an empty sink image, the rectified image is computed pixel by pixel; the color value of each pixel is obtained from the original image. First, the image resolution of the rectified image has to be specified. In this work, the worst case is assumed: the height of the rectified image is chosen as N := 2(W + H) and the width as M := \sqrt{W^2 + H^2}, where (W, H) is the original image resolution (see figure 3.7). Second, the region or bounding box of the rectified image on the cylinder is determined; this region represents the angle interval of the epipolar lines and their position interval on the cylinder axis. By choosing a common angle and axis position interval, the same image resolution for both images, and by performing the same equidistant sampling to build the rectified images, the corresponding points get the same θ, i.e. the same row y, and the images are then effectively rectified. This sampling is done by using a grid having the assumed resolution; the obtained color values are interpolated and set at the current pixel. A similar process is applied to reconstruct the original image from the rectified image. In the following, a more detailed description of the above steps is given.

Figure 3.7: Image resolution of the rectified image. The width (right) is given by the longest epipolar line in the image, which is the epipolar line going through the diagonal. The height is given by the contour of the original image.
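As a worked example of this worst case, with the original resolution used in chapter 4:

(W, H) = (640, 480): \quad M = \sqrt{640^2 + 480^2} = 800, \qquad N = 2\,(640 + 480) = 2240

The height actually used in chapter 4 (1200) is smaller, since the rectified image size can be set by the user.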

3.4.1 Determining the common interval box

Considering the fact that the extreme epipolar lines always pass through the image corners [Adv], and because it is ensured that every epipolar line is transformed to the same half cylinder space during the rectification, the bounding box of the rectified image can be computed by using the four original image corners. The following steps are made: the optical ray of each image corner is calculated, and these rays are intersected with the image plane at Z = 1. Now the transformation L can be computed depending on the ray coordinates and applied to each of them. This yields four cylinder points, each of which has an angle and a cylinder axis position. By determining the minimum and maximum angle and cylinder axis position, the cylinder region of the rectified image is determined. If this region differs for the two images, the regions are intersected, so that corresponding epipolar lines get the same angle on the cylinder and have the same length within the rectified images.
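The corner-based bounding box can be sketched as follows (assuming Eigen; the callable LFor, which builds the per-line transform L of sections 3.1-3.3 for a given ray, is a hypothetical helper):

#include <Eigen/Dense>
#include <algorithm>
#include <cmath>

struct Box { double rMin = 1e300, rMax = -1e300, thMin = 1e300, thMax = -1e300; };

// rays: optical rays of the four image corners, intersected with the
// image plane Z = 1. Track the extremal axial position r and angle
// theta of their cylinder points (section 3.4.1).
template <class LBuilder>
Box boundingBox(const Eigen::Vector3d rays[4], LBuilder LFor) {
    Box b;
    for (int i = 0; i < 4; ++i) {
        Eigen::Vector3d q = LFor(rays[i]) * rays[i]; // cylinder point
        double r = q.x(), th = std::atan2(q.y(), q.z());
        b.rMin = std::min(b.rMin, r);    b.rMax = std::max(b.rMax, r);
        b.thMin = std::min(b.thMin, th); b.thMax = std::max(b.thMax, th);
    }
    return b;
}

The boxes of both images are then intersected to obtain the common interval box.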

3.4.2 Computing the rectified image

For a pixel (u, v) in the rectified image (initialized as an empty image), the corresponding pixel (u', v') can be computed in the original image. First the associated cylinder point has to be computed; let q(x, y, z) denote this point. If (r_min, r_max, θ_min, θ_max) represents the common region or bounding box of both rectified images, the angle θ and the position r of q on the cylinder are given by:

r = r_min + u · δr    (3.13)
θ = θ_min + v · δθ    (3.14)

where δr and δθ are increments used to sample the cylinder equidistantly:

δr = (r_max − r_min) / M    (3.15)
δθ = (θ_max − θ_min) / N    (3.16)

It is now possible to calculate the coordinates of q:

x = r    (3.17)
y = sin(θ)    (3.18)
z = cos(θ)    (3.19)

The point q allows computing the transformation L^{-1} associated with it. Applying L^{-1} to q yields an optical ray p, which is then intersected with the image plane of the current camera. If the resulting point W(w_u, w_v) is a valid point in the original image, its color intensity is read there and set at the current pixel.
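The per-pixel loop can be sketched as follows. This is an illustration only, assuming Eigen and two hypothetical helpers: Linv(theta) builds the inverse transform L^{-1} of the epipolar line with angle theta, and sample(u', v') returns an interpolated intensity from the original image (or a sentinel outside it).

#include <Eigen/Dense>
#include <cmath>
#include <functional>
#include <vector>

// Per-pixel construction of the rectified image (section 3.4.2).
void computeRectified(int M, int N, double rMin, double rMax,
                      double thMin, double thMax,
                      const std::function<Eigen::Matrix3d(double)>& Linv,
                      const std::function<float(double, double)>& sample,
                      std::vector<float>& rectified) {
    const double dR  = (rMax - rMin) / M;   // eq. (3.15)
    const double dTh = (thMax - thMin) / N; // eq. (3.16)
    rectified.assign(static_cast<std::size_t>(M) * N, 0.f);
    for (int v = 0; v < N; ++v) {           // one row per angle theta
        const double theta = thMin + v * dTh;                     // eq. (3.14)
        for (int u = 0; u < M; ++u) {
            const double r = rMin + u * dR;                       // eq. (3.13)
            Eigen::Vector3d q(r, std::sin(theta), std::cos(theta)); // eqs. (3.17)-(3.19)
            Eigen::Vector3d p = Linv(theta) * q;                  // optical ray
            // Intersect with the image plane Z = 1 and sample there.
            rectified[static_cast<std::size_t>(v) * M + u] =
                sample(p.x() / p.z(), p.y() / p.z());
        }
    }
}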

3.4.3 Regaining the original image

Given a rectified image, it is possible to regain the original image. Since the resolution of the original image is known, an empty image can be created with this resolution; each pixel of the empty image is passed through and the corresponding coordinates are computed. If (u', v') is a pixel in the image to be reconstructed, its rectified coordinates have to be found. Using the projection matrix of the current camera, the ray of this pixel can be computed as a vector pxyz. Further, the coordinates of pxyz are divided so that its z component equals 1, ensuring that the considered point really lies on the image plane. The matrix L is then computed and applied to pxyz. The cylinder coordinates of the resulting point and the grid position on the rectified cylinder are then computed, so that the pixel in the rectified image can be determined. If v(x, y, z) are the coordinates of the point obtained by transforming the back projection of pixel (u', v'), the angle θ and the position r of v on the cylinder are given by:

r = x    (3.20)
θ = atan(y / z)    (3.21)

The values r and θ are then used to locate the pixel position (u, v) in the rectified image:

u = (r − r_min) / δr    (3.22)
v = (θ − θ_min) / δθ    (3.23)

where δr and δθ are defined as in formulas (3.15) and (3.16). If the pixel coordinates are valid in the rectified image, they are used for the current pixel.
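A small sketch of this lookup (assuming Eigen), with vcyl the cylinder point L · pxyz:

#include <Eigen/Dense>
#include <cmath>

// Locate the rectified pixel of an original pixel (section 3.4.3):
// read r and theta off the cylinder point and quantize them with
// the common interval box, equations (3.20)-(3.23).
Eigen::Vector2i rectifiedPixel(const Eigen::Vector3d& vcyl,
                               double rMin, double thMin,
                               double dR, double dTh) {
    double r  = vcyl.x();                         // eq. (3.20)
    double th = std::atan2(vcyl.y(), vcyl.z());   // eq. (3.21)
    int u = static_cast<int>((r - rMin) / dR);    // eq. (3.22)
    int v = static_cast<int>((th - thMin) / dTh); // eq. (3.23)
    return Eigen::Vector2i(u, v);
}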

3.5 Advantages and disadvantages of cylindrical rectification

A particular property of the cylindrical rectification, and one of its main advantages, is that it guarantees a bounded rectified image. Since the images to be rectified have a finite size, and thus a finite set of epipolar lines, and because each epipolar line is rectified and then used to construct the new image, the resulting image cannot be unbounded. Due to this fact, the extreme image distortion of the planar method is avoided by using the cylindrical rectification. In addition, in opposition to the polar rectification, the cylindrical rectification works with optical rays; therefore it may be used for any camera geometry. However, the method is complex and has a higher runtime than the planar rectification, since for each epipolar line several computations are needed to build the rectification matrix.


4 Experiments

The previous chapters present the cylindrical rectification method, in which the images to rectify are mapped onto a cylinder surface. In this chapter, the results of the implemented method are presented. The used images are synthetic and thus do not exhibit any calibration errors. Figure 4.1 depicts the cylindrical rectification of a horizontal camera motion. The original images have a resolution of 640 × 480; the rectified image size is chosen as 800 × 1200, and the rectified images have been used to build a disparity map.

Figure 4.1: Left-right camera motion rectification


The disparity is the difference u1 − u2 between two corresponding points (u1, v1) and (u2, v2). By using the rectified images, the corresponding points are determined more efficiently, in that they are searched for within the rows of the rectified images. The disparity map presented in figure 4.4 results from a normalized cross correlation [Fau93] with a 9 × 9 window: in order to find the pixel in image 2 that matches the pixel (u1, v1) in image 1, a square window of size 9 × 9 centered at (u1, v1) is considered, and a correlation with the second intensity image is computed along the row v2 = v1, as sketched below. Figure 4.2 illustrates the principle of the correlation technique.
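A sketch of this row search (not the original implementation; grayscale images are assumed to be stored row-major as flat arrays):

#include <cmath>
#include <vector>

// 9x9 normalized cross correlation along a rectified image row.
// Returns the disparity d maximizing the correlation for pixel
// (u1, v1) of image 1, searching u2 = u1 + d in image 2.
int bestDisparity(const std::vector<float>& im1, const std::vector<float>& im2,
                  int W, int H, int u1, int v1, int dMax) {
    const int hw = 4; // half window -> 9x9
    auto corr = [&](int u2) {
        double m1 = 0, m2 = 0, n = (2 * hw + 1) * (2 * hw + 1);
        for (int dy = -hw; dy <= hw; ++dy)
            for (int dx = -hw; dx <= hw; ++dx) {
                m1 += im1[(v1 + dy) * W + u1 + dx];
                m2 += im2[(v1 + dy) * W + u2 + dx];
            }
        m1 /= n; m2 /= n;
        double num = 0, d1 = 0, d2 = 0;
        for (int dy = -hw; dy <= hw; ++dy)
            for (int dx = -hw; dx <= hw; ++dx) {
                double a = im1[(v1 + dy) * W + u1 + dx] - m1;
                double b = im2[(v1 + dy) * W + u2 + dx] - m2;
                num += a * b; d1 += a * a; d2 += b * b;
            }
        return num / std::sqrt(d1 * d2 + 1e-12);
    };
    int best = 0; double bestC = -2;
    for (int d = -dMax; d <= dMax; ++d) {
        int u2 = u1 + d;
        if (u1 - hw < 0 || u1 + hw >= W || u2 - hw < 0 || u2 + hw >= W ||
            v1 - hw < 0 || v1 + hw >= H) continue;
        double c = corr(u2);
        if (c > bestC) { bestC = c; best = d; }
    }
    return best;
}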

Figure 4.2: Principle of the correlation technique. C is an appropriate correlation function; the maximum is reached for a value δo and indicates that a corresponding point has been found for (u1, v1). The disparity of (u1, v1) is then taken to be δo, and u2 = u1 + δo.

There is a simple relationship between the disparity, the depth z of a 3D point M and the distance between the two optical centers (the baseline length) d12 (see figure 4.3):

d = u_1 - u_2 = d_{12} \, f / z    (4.1)

Therefore the disparity map allows constructing a depth map of the original scene (see figure 4.4). Figure 4.5 shows the rectification of two cameras with a pure forward motion. As in the example above, the original images have the size 640 × 480 and the rectified images are 800 × 1200 large. Figure 4.6 shows the disparity and depth map of the forward camera motion scene.
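Equation (4.1) inverts directly to z = d12 · f / d; as a small sketch, assuming the baseline length and focal length are known from the calibration:

#include <vector>

// Depth from disparity via equation (4.1): z = d12 * f / d.
// Non-positive disparities are marked invalid (depth 0).
std::vector<float> depthMap(const std::vector<float>& disparity,
                            float baseline, float focal) {
    std::vector<float> depth(disparity.size(), 0.f);
    for (std::size_t i = 0; i < disparity.size(); ++i)
        depth[i] = disparity[i] > 0.f ? baseline * focal / disparity[i] : 0.f;
    return depth;
}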


Figure 4.3: Relation between the disparity and the depth. Formula (4.1) is obtained by noticing that the triangles (p1, o1, C1) and (p2, o2, C2) are similar.

Figure 4.4: Disparity map (left) and depth map (right). Regions which are not common to both original images receive nondeterministic values in the disparity map and in the depth map. Such regions appear at the image border and, for example, at the pier in the first image, which is partially hidden in the second original image.



Figure 4.5: Forward motion rectification. The epipole is depicted by the red point in the original image (upper left). The rectified image (bottom left) illustrates how the epipole (red vertical line) is distributed over the horizontally aligned epipolar lines. The image center (turquoise point) remains at its position. The green feature point is only horizontally translated from its corresponding point.


Figure 4.6: Disparity map (left) and depth map (right) of the forward motion rectification.


Conclusion

In this work, a cylindrical rectification method was presented. This method parametrizes a given image pair in a (r, θ)-space by transforming the images onto a common cylinder surface. In chapter 3, the basic steps of the cylindrical rectification were outlined and the implementation was presented. Since the aim of rectification is the rectified epipolar geometry, the cylinder is chosen so that its axis is parallel to the line going through both camera centers. Each epipolar line is rotated until it becomes parallel to the baseline and is projected onto a cylinder of unit radius. Since the cylinder orientation is the same for both given images, since the epipolar lines are made parallel to the cylinder axis and thus to the baseline, since a common cylinder region and the same image resolution have been chosen for the resulting images, and because this region has been sampled with the same equidistant rate for both images, it is ensured that the obtained images are rectified. The experiments realized in chapter 4 confirm this.

The cylindrical rectification works with optical rays and is performed in three-dimensional space; hence, the method can be used for any camera geometry. While the cylindrical rectification does not preserve straight lines, it always ensures a bounded image.

The cylindrical rectification was implemented in C++, and some experiments are presented in chapter 4. The method has been tested and succeeds for both normal camera motion and forward camera motion. The image size of the rectified image is independent of the camera motion and can be set by the user. The rectified images have been used to construct a disparity map, and from the disparity map a depth map of the original scene has been constructed.


Bibliography

[Adv] Daniel Oram. Rectification for any epipolar geometry.

[Fau93] Olivier Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. MIT Press, 1993.

[PKG99] Marc Pollefeys, Reinhard Koch, and Luc J. Van Gool. A simple and efficient rectification method for general motion. In ICCV (1), pages 496-501, 1999.

[RMC97] S. Roy, J. Meunier, and I. Cox. Cylindrical rectification to minimize epipolar distortion, 1997.
